Article

A Dual-Branch Coupled Fourier Neural Operator for High-Resolution Multi-Phase Flow Modeling in Porous Media

1
Department of Computational Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
2
Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
*
Author to whom correspondence should be addressed.
Water 2025, 17(23), 3351; https://doi.org/10.3390/w17233351
Submission received: 21 October 2025 / Revised: 16 November 2025 / Accepted: 19 November 2025 / Published: 23 November 2025

Abstract

This paper investigates a physics-informed surrogate modeling framework for multi-phase flow in porous media based on the Fourier Neural Operator. Traditional numerical simulators, though accurate, suffer from severe computational bottlenecks due to fine-grid discretizations and the iterative solution of highly nonlinear partial differential equations. By parameterizing the kernel integral directly in Fourier space, the operator provides a discretization-invariant mapping between function spaces, enabling efficient spectral convolutions. We introduce a Dual-Branch Adaptive Fourier Neural Operator with a shared Fourier encoder and two decoders: a saturation branch that uses an inverse Fourier transform followed by a multilayer perceptron and a pressure branch that uses a convolutional decoder. Temporal information is injected via Time2Vec embeddings and a causal temporal transformer, conditioning each forward pass on step index and time step to maintain consistent dynamics across horizons. Physics-informed losses couple data fidelity with residuals from mass conservation and Darcy pressure, enforcing the governing constraints in Fourier space; truncated spectral kernels promote generalization across meshes without retraining. On SPE10-style heterogeneities, the model shifts the infinity-norm error mass into the $10^{-2}$ to $10^{-1}$ band during early transients and sustains lower errors during pseudo-steady state. In zero-shot three-dimensional coarse-to-fine upscaling from $30 \times 110 \times 5$ to $60 \times 220 \times 5$, it attains $R^2 = 0.90$, $\mathrm{RMSE} = 4.4 \times 10^{-2}$, and $\mathrm{MAE} = 3.2 \times 10^{-2}$, with more than 90% of voxels below five percent absolute error across five unseen layers, while the end-to-end pipeline runs about three times faster than a full-order fine-grid solve and preserves water-flood fronts and channel connectivity. Benchmarking against established baselines indicates a scalable, high-fidelity alternative for high-resolution multi-phase flow simulation in porous media.

1. Introduction

Multi-phase flow in porous media underpins critical applications in subsurface engineering—from groundwater remediation to carbon sequestration. The governing equations, nonlinear partial differential equations (PDEs) that capture the intricate interactions among fluid phases and the heterogeneous porous structure, are notoriously difficult to solve numerically. Traditional solvers, whether implemented in commercial software or open-source tools, require extremely fine discretizations to capture the essential dynamics, which in turn leads to prohibitive computational costs and long turnaround times, especially when conducting multi-parameter studies or producing high-resolution predictions.
While deep learning has revolutionized fields such as imaging, speech recognition, and natural language processing [1], the application of machine learning techniques that explicitly incorporate physical laws is relatively new. Early contributions such as the physics-informed neural networks (PINNs) introduced in [2] and later refined in [3,4] have demonstrated that embedding governing equations into the training process can guide the solution toward physically consistent behavior. Yet, these methods have often yielded modest gains when compared with the dramatic successes observed in other areas of deep learning.
A particularly transformative approach comes from the emerging field of operator learning, which seeks to approximate mappings between infinite-dimensional function spaces. In this context, the Fourier Neural Operator (FNO), introduced in [5] and extended in [6], represents a paradigm shift. By reformulating the kernel integral in Fourier space and truncating the high-frequency components, FNO achieves a spectral convolution that is both computationally efficient and well suited to enforcing physical constraints. Crucially, its invariance with respect to discretization allows the learned operator to generalize across grids without retraining, a property of considerable value for high-resolution simulations.
In porous media flow, where multi-scale heterogeneities and strong nonlinearities prevail, integrating physics-informed loss terms into the FNO framework (as demonstrated in physics-informed neural operators [7,8]) has proven essential. This integration ensures that the surrogate model rigorously adheres to the underlying PDEs, thereby reducing the computational effort needed to achieve accurate predictions. For example, implementations on platforms such as NVIDIA Modulus have shown that FNO-based approaches can reduce simulation times by up to three orders of magnitude while preserving accuracy across different discretizations [9]. More recently, several studies have adapted FNO-style architectures specifically to multiphase flow in porous media and related subsurface applications, including enhanced multiphase FNO variants, residual-corrected neural operators for nonlinear PDE inverse problems, and operator-based surrogates for field-scale reservoir simulations [10,11,12,13].
However, existing applications of Fourier Neural Operators to porous-media flow still face important limitations. Most studies focus on single-phase or weakly coupled settings and treat only one state variable (typically pressure or saturation), thereby neglecting the strong two-way coupling that characterizes multiphase displacement. Purely data-driven FNOs also tend to oversmooth sharp saturation fronts and underestimate channelized flow in highly heterogeneous formations such as SPE10, particularly during early transients. In many implementations, time is handled as a simple scalar feature or as an additional spatial coordinate, which can degrade accuracy over long horizons and lead to drift between transient and pseudo-steady regimes. Moreover, existing FNO surrogates rarely enforce PDE residuals during training, so saturation and pressure predictions may violate mass balance and Darcy’s law, reducing robustness when extrapolating across injection scenarios.
This work presents an advanced framework for applying FNO to the physics-informed simulation of multi-phase flow in porous media. Building on a dual-branch formulation (DB-AFNO), the physics-based loss functions are integrated with different temporal encodings (e.g., Time2Vec, TFT/TST) to honor the governing PDE constraints in both saturation and pressure fields. Our computational analysis demonstrates that DB-AFNO outperforms conventional, purely data-driven FNO approaches across diverse tracer transport configurations, reducing errors during early transient phases and maintaining accuracy in later pseudo-steady states. In particular, DB-AFNO with Time2Vec and TFT/TST concentrates the majority of saturation predictions in the lower error ranges, preserving critical flow structures that a conventional FNO often oversmooths. Blind testing on the top and bottom layers of the 10th SPE Comparative Solution Project [14] highlights the framework's potential for scalable, high-resolution simulations without incurring the computational costs of traditional full-order solvers.
The uniqueness of this approach lies in explicitly coupling the two PDE variables using a multitask, time-aware architecture. While conventional FNOs typically address a single field, our model retains the core spectral convolution operations and extends them via key modifications. First, a shared spectral encoder predicts both pressure and saturation, extracting spatial features that naturally encode interdependencies. Second, two distinct time encoders—one for saturation and one for pressure—transform the scalar time t into high-dimensional representations that capture differing temporal sensitivities while preserving mutual spatial information. Finally, after decoding via separate branches, the outputs are fused through an additional convolutional layer that enforces consistency between the predictions through cross-information exchange.

2. Methods

This section presents a technical description of the problem using the dynamical state-space framework. The formulation focuses on the incompressible and immiscible displacement of two phases within a porous medium. More recently, the same principles have been extended to carbon sequestration efforts, specifically the displacement of brine by CO$_2$ [15]. In this work, we adopt a Darcy-scale, incompressible and immiscible two-phase formulation on a fixed Cartesian grid. The two phases are denoted wetting and non-wetting, with saturations $S_w$ and $S_{nw}$ and corresponding pressures $P_w$ and $P_{nw}$. The physical domain $\Omega$ is discretized into $N$ control volumes on a structured grid, and the primary unknowns collected in the state vector are the cell-wise saturation and pressure fields. The fluid system is isothermal, with constant viscosities and densities, and rock properties (porosity and absolute permeability) are time-invariant and defined per grid cell. Flow is governed by Darcy's law under a single-continuum representation, without explicit geomechanical deformation, fracture propagation, or matrix–fracture transfer terms, and without capillary-pressure hysteresis. Gravity is included through a constant acceleration vector, while molecular diffusion and dispersion are neglected so that transport is driven by advection and sources/sinks only. Under these assumptions, the saturation constraint reduces to $S_w + S_{nw} = 1$, and a monotone capillary-pressure law $P_c(S_w)$ together with Corey-type relative permeability functions $k_{r\psi}(S_\psi)$ closes the system. The governing equations fully describe the displacement process considered here. Gas-phase transport, solution-gas liberation, and evaporation/condensation processes are neglected; consequently, no gas saturation or gas pressure variables are introduced into the state vector, and all volumetric source terms represent injection and production of the two liquid phases only.
A concise summary of the formulation is provided in Table 1, Table 2 and Table 3.
The above formulation was coded in Python using PyTorch and CUDA to allow for interoperable transfer between the simulator and the neural operator architecture. In the following sections, we present the numerical methods used for the simulation along with the neural operator architectures.

2.1. Model Description

In this study, we used the SPE10 Model 2 dataset [14], an open-source benchmark widely used to evaluate upscaling methods for aquifer flow and tracer transport. Here it serves as an analogue of an underground water aquifer for tracer transport. The dataset is composed of a Brent sequence mapped on a Cartesian grid with dimensions of $60 \times 220 \times 85$, resulting in 1,122,000 cells. The model encompasses two distinct formations: the shallow-marine Tarbert formation occupying the upper 35 layers, characterized by relatively smooth permeability, and the fluvial Upper-Ness formation constituting the lower 50 layers, noted for its significant permeability variations spanning 8 to 12 orders of magnitude. These formations present markedly different permeability structures, as highlighted in Figure 1, where the model is inverted to emphasize the heterogeneity within the Upper-Ness formation. This complex permeability and porosity distribution poses a rigorous challenge, making it a demanding test case for operator learning.

2.2. Numerical Methods

We have implemented a framework that incorporates an adaptive trapezoidal integration scheme. The Generalized Conjugate Residual (GCR) method serves as the solver for the linear systems within each Newton iteration, handling sparse matrix structures efficiently and improving the convergence of the overall method. Newton's method, combined with the GCR solver, exploits fast convergence near the solution and avoids much of the computational overhead of direct methods such as LU decomposition, while still handling nonlinear systems effectively. Combining the adaptive trapezoidal method with iterative linear solvers is central to our simulations, balancing computational efficiency against the accuracy of transient responses in complex multiphase flows.
We encountered two main challenges in this work: (1) building a stable model to act as a base for our neural operator implementation and (2) ensuring that the model is computationally efficient.
To circumvent these challenges, we have implemented several advanced numerical techniques and optimization methods that enhance both the stability and efficiency of the model. These techniques include the adaptive dynamic stepping scheme, the use of GCR with a banded LU preconditioner, and the employment of automatic differentiation for efficient Jacobian computation.

2.2.1. Preconditioned Iterative Solvers

Iterative methods were deemed more suitable for our approach because the accuracy needed for the phase saturations can be set in advance through the convergence tolerance. However, iterative methods, and GCR specifically, may converge slowly on sparse systems without preconditioning. A preconditioner is therefore important to achieve fast convergence while retaining control over the desired accuracy.
Banded LU/ILU preconditioners are particularly suitable for systems where the matrix A exhibits a banded structure [16]. This approach is advantageous for large sparse systems with a banded pattern, as it significantly reduces the fill-in compared to a full LU decomposition, thereby preserving the computational efficiency.
$$\kappa\left(U^{-1} L^{-1} A\right) < \kappa(A), \quad \text{for banded } A,$$
where $\kappa(\cdot)$ denotes the condition number and $L$, $U$ are the banded factors.
Alternatively, when the Jacobian matrix A is symmetric and positive-definite, a Cholesky decomposition can be used for preconditioning [17]. The preconditioned system is solved via
$$L^{-T} L^{-1} A x = L^{-T} L^{-1} b,$$
where $L$ is the Cholesky factor. Multigrid preconditioners are also common for fluid-flow problems; however, we did not find a GPU-native Python implementation comparable to the CPU-based PyAMG, so multigrid was not included in our benchmark. The choice between banded LU and Cholesky preconditioning depends on the structure and properties of the Jacobian matrix, with Cholesky preferred for symmetric positive-definite systems because of its computational efficiency. In our model, we found ILU to be more stable for the pressure solver with Neumann boundary conditions.
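The interplay between GCR and a banded preconditioner can be illustrated with a minimal pure-Python sketch. It is not the solver used in this work: the dense list storage, the helper names (`matvec`, `banded_solve`, `gcr`), the small test matrix, and the use of an exact tridiagonal solve as a stand-in for the banded LU/ILU factorization are all assumptions made for illustration.

```python
# Illustrative sketch only: dense lists, a tiny hypothetical system,
# and a tridiagonal (Thomas) solve standing in for a banded LU preconditioner.

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def banded_solve(A, rhs):
    """Solve using only the tridiagonal band of A (Thomas algorithm)."""
    n = len(rhs)
    diag = [A[i][i] for i in range(n)]
    rhs = list(rhs)
    for i in range(1, n):                       # forward elimination
        m = A[i][i - 1] / diag[i - 1]
        diag[i] -= m * A[i - 1][i]
        rhs[i] -= m * rhs[i - 1]
    x = [0.0] * n
    x[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):              # back substitution
        x[i] = (rhs[i] - A[i][i + 1] * x[i + 1]) / diag[i]
    return x

def gcr(A, b, apply_prec=lambda r: list(r), tol=1e-10, max_iter=50):
    """Right-preconditioned GCR; apply_prec(r) approximates A^{-1} r."""
    x = [0.0] * len(b)
    r = list(b)                                 # residual for a zero initial guess
    dirs, a_dirs = [], []                       # search directions p_j and A p_j
    iters = 0
    while dot(r, r) ** 0.5 > tol and iters < max_iter:
        p = apply_prec(r)
        ap = matvec(A, p)
        for pj, apj in zip(dirs, a_dirs):       # orthogonalise A p against earlier A p_j
            beta = dot(ap, apj) / dot(apj, apj)
            p = [pi - beta * pji for pi, pji in zip(p, pj)]
            ap = [ai - beta * aji for ai, aji in zip(ap, apj)]
        alpha = dot(r, ap) / dot(ap, ap)        # minimises the residual norm along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, ap)]
        dirs.append(p)
        a_dirs.append(ap)
        iters += 1
    return x, iters

# Hypothetical test system: tridiagonal band plus two out-of-band entries.
n = 8
A = [[4.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
A[0][n - 1] = A[n - 1][0] = 0.5
b = [float(i + 1) for i in range(n)]

x_plain, it_plain = gcr(A, b)
x_prec, it_prec = gcr(A, b, apply_prec=lambda r: banded_solve(A, r))
```

Because the preconditioner captures the whole band, the preconditioned operator differs from the identity only through the two out-of-band entries, so the preconditioned run would be expected to need no more iterations than the plain one.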

2.2.2. Adaptive Dynamic Stepping

In the adaptive time-stepping scheme implemented within the simulation framework, the trapezoidal method is used to preserve numerical stability without imposing a CFL-type step restriction, which is particularly important for the stiff systems encountered in multiphase flow simulations. The integration begins with a prediction step that uses the explicit Forward Euler method to approximate the saturations at the subsequent timestep, denoted $S_{\text{predictor}}$:
$$S_{\text{predictor}} = S_{\text{current}} + \Delta t \cdot \left.\frac{dS}{dt}\right|_{\text{current}}$$
where $S_{\text{current}}$ represents the current saturation state, $\Delta t$ is the current timestep, and $\left.\frac{dS}{dt}\right|_{\text{current}}$ is the rate of change in saturation evaluated at the current state. This predictor step serves as an initial approximation for the subsequent implicit corrector step, which refines the estimate to enhance stability and accuracy. The corrector phase uses an implicit trapezoidal integration, where the corrected saturation $S_{\text{next}}$ is determined by solving the nonlinear equation:
$$S_{\text{next}} - S_{\text{current}} - \frac{\Delta t}{2}\left(\left.\frac{dS}{dt}\right|_{\text{current}} + \left.\frac{dS}{dt}\right|_{\text{next}}\right) = 0$$
Here, $\left.\frac{dS}{dt}\right|_{\text{next}}$ denotes the rate of saturation change evaluated at the next state. This equation is solved iteratively using the Newton–Raphson method.
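The predictor–corrector step described above can be sketched on a scalar model problem. The logistic right-hand side $dS/dt = S(1 - S)$, the function names, and the tolerances below are illustrative assumptions, not the saturation equation or settings of the simulator.

```python
import math

# Hypothetical scalar stand-in for the saturation rate dS/dt and its derivative.
def rate(s):
    return s * (1.0 - s)

def rate_derivative(s):
    return 1.0 - 2.0 * s

def trapezoidal_step(s_cur, dt, tol=1e-12, max_newton=20):
    """One Forward Euler predictor followed by an implicit trapezoidal corrector."""
    s_pred = s_cur + dt * rate(s_cur)            # explicit predictor
    s_next = s_pred                              # Newton initial guess
    for _ in range(max_newton):
        # Residual of the trapezoidal rule: S_next - S_cur - dt/2 (f_cur + f_next)
        g = s_next - s_cur - 0.5 * dt * (rate(s_cur) + rate(s_next))
        if abs(g) < tol:
            break
        s_next -= g / (1.0 - 0.5 * dt * rate_derivative(s_next))
    return s_pred, s_next

# Integrate from S(0) = 0.1 to t = 1 with ten steps of size 0.1.
s, dt = 0.1, 0.1
for _ in range(10):
    s_pred, s = trapezoidal_step(s, dt)

exact = 1.0 / (1.0 + 9.0 * math.exp(-1.0))       # closed-form logistic solution at t = 1
```

The gap between `s_pred` and the corrected value is the quantity a step-size controller would compare against its tolerance.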
To dynamically adjust the timestep $\Delta t$, an error estimation mechanism compares the norm of the discrepancy between the predictor and corrector solutions against a predefined tolerance. This adaptive time-step control manages computational efficiency and accuracy as the dynamics of the modeled system vary. If the error estimate exceeds the tolerance threshold, $\Delta t$ is reduced to enhance resolution; conversely, if the error is significantly below the tolerance, $\Delta t$ is increased to expedite computations without sacrificing significant accuracy.
When the step must be reduced, the reduction factor is defined by an exponential function, which provides a smooth and continuous regulation mechanism. The reduction factor $R$ for the time step $\Delta t$ is given by
$$R(\Delta t) = \alpha + (\beta - \alpha) \exp(-\gamma \Delta t)$$
where $\alpha$ is the minimum reduction factor, $\beta$ is the maximum reduction factor, and $\gamma$ is a scaling parameter controlling the steepness of the exponential decay.
In this formulation, as $\Delta t$ increases, the reduction factor $R$ approaches the lower bound $\alpha$, whereas as $\Delta t$ decreases, $R$ approaches the upper bound $\beta$. The new time step $\Delta t_{\text{new}}$ is defined as
$$\Delta t_{\text{new}} = \max\left(\Delta t \times R(\Delta t),\ \Delta t_{\min}\right),$$
where $\Delta t_{\min}$ is the minimum allowable step size, preventing excessively small steps that would be computationally inefficient. The stability constraints for $\Delta t$ are derived from the spectral properties of the system's Jacobian $J$, using the inverse of its maximum eigenvalue, $1/\lambda_{\max}(J)$, and a more conservative bound, $0.1/\lambda_{\max}(J)$. These limits ensure that timestep updates remain within a range that preserves the numerical stability of the integration method for multiphase flow. All pseudocode listings for the above algorithms, Algorithms A1–A4, are provided in Appendix A.
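A minimal sketch of the step-reduction rule follows. The values chosen for $\alpha$, $\beta$, $\gamma$, and $\Delta t_{\min}$ are placeholders for illustration only; the source does not report the values used.

```python
import math

def reduction_factor(dt, alpha=0.5, beta=0.9, gamma=1.0):
    """R(dt) = alpha + (beta - alpha) * exp(-gamma * dt); parameter values are assumed."""
    return alpha + (beta - alpha) * math.exp(-gamma * dt)

def shrink_step(dt, dt_min=1e-4, **params):
    """dt_new = max(dt * R(dt), dt_min)."""
    return max(dt * reduction_factor(dt, **params), dt_min)

def stability_bounds(lambda_max):
    """Conservative and nominal step limits, 0.1/lambda_max and 1/lambda_max."""
    return 0.1 / lambda_max, 1.0 / lambda_max

small = shrink_step(1e-4)      # clamped at dt_min
large = shrink_step(1.0)       # reduced by R(1.0), which lies between alpha and beta
```

Large steps are cut more aggressively (R near $\alpha$) while small steps are reduced only mildly (R near $\beta$), which is the behavior stated in the text.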

2.3. Neural Operator Architecture

We use a DB-AFNO neural operator that fuses spectral, spatial, and physics-based information to predict the mean saturation $\hat{S} \in \mathbb{R}^{N_x \times N_y}$ and the variance $\sigma^2 \in \mathbb{R}^{N_x \times N_y}$. Static geological properties (porosity $\phi(x)$ and permeability $k(x)$), initial states $P_{\text{initial}}$ and $S_{\text{initial}}$, and phase viscosities $\mu_w$ and $\mu_{nw}$ are combined with a temporal encoding via Time2Vec and a causal transformer that conditions each step on the time index and $\Delta t$. Fourier layers capture the coupled space–time response, and training minimizes a composite loss that blends data misfit with PDE residuals enforcing mass conservation and Darcy flow. In this way, the network advances the state in time and returns high-resolution, physically consistent pressure $P$ and saturation $S$ fields at arbitrary query times, as summarized in Figure 2.

2.3.1. Data Preparation

The data preparation pipeline for the physics-informed Fourier Neural Operator framework integrates simulation outputs—including grid parameters, K fields, well configurations, initial saturation distributions, and fluid properties—stored in serialized files. The raw data is reformatted for a consistent spatial representation; for example, the K tensor is converted to an effective scalar field via magnitude computation. Temporal dynamics are captured using a multi-frequency positional encoding with sine and cosine functions over logarithmically spaced scales, which is then spatially broadcast and concatenated with static input channels to form a comprehensive feature tensor. Global statistical measures (mean and standard deviation) are computed over the entire dataset and applied to both input and target fields to standardize the data, reducing numerical variability and aiding convergence during training. Early time steps are replicated to better represent rapid transients. Over 1000 simulation runs at half fidelity were performed, with the data partitioned 80% for training and 20% for validation, plus additional cases reserved for blind testing. These runs span a wide range of heterogeneous SPE10 Model 2 realizations, as well as idealized cases with spatially uniform permeability and porosity that are included explicitly in the training set. Across all runs, we vary initial water saturations, viscosity ratios, well locations, and injection schedules, resulting in substantial variability in flood-front shapes, breakthrough times, and recovery factors. A statistical overview of the key petrophysical variables is provided in Figure 3, and descriptive statistics are listed in Table 4, demonstrating that the training set covers a broad portion of the oil–water displacement behavior within this benchmark setting. 
This process is implemented using a modular Python framework built on PyTorch (version 2.9.1, CUDA 12.6), supporting efficient file handling, dynamic sample generation, and mini-batch processing for large-scale data, with the simulations executed on hardware platforms equipped with RTX Titan GPUs.

2.3.2. Pure Data-Driven Approach

The model processes inputs $X$ via:
$$Z_0 = \mathrm{InputProj}\left(X \oplus \mathcal{F}^{-1}\left(W \odot \mathcal{F}(G)\right)\right),$$
where $\mathcal{F}$ and $\mathcal{F}^{-1}$ denote the Fourier and inverse Fourier transforms, $W$ are learnable spectral weights, and $G$ represents learned Fourier features. Subsequent layers combine spectral, spatial, and physics-guided paths:
$$Z_{l+1} = \mathrm{GELU}\left(\sum_{k=1}^{2} w_k^{(l)} P_k(Z_l) + Z_l\right), \quad l \in \{1, \ldots, L\},$$
where $P_1$ and $P_2$ correspond to spectral convolution and spatial depthwise convolution, respectively.
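The spectral path can be illustrated in one dimension with a naive discrete Fourier transform. This is a sketch of a truncated spectral convolution only, not the DB-AFNO layer: the function names, the one-dimensional setting, and the fixed weights (which would be learned in practice) are assumptions.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * k * j / n) for j in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * j / n) for k in range(n)) / n
            for j in range(n)]

def truncated_spectral_conv(x, weights):
    """Scale the lowest len(weights) modes by `weights` and zero all higher modes.

    Assumes a real input and len(weights) <= len(x) // 2.
    """
    n = len(x)
    X = dft(x)
    Y = [0j] * n
    for k, w in enumerate(weights):
        Y[k] = w * X[k]
        if 0 < k < n - k:                        # mirror mode keeps the output real
            Y[n - k] = w.conjugate() * X[n - k]
    return [y.real for y in idft(Y)]

def signal(n):
    """Mode-1 cosine plus a mode-3 component that the truncation should remove."""
    return [math.cos(2 * math.pi * j / n) + 0.5 * math.cos(2 * math.pi * 3 * j / n)
            for j in range(n)]

weights = [1.0, 1.0]                             # keep modes 0 and 1 (placeholder values)
y16 = truncated_spectral_conv(signal(16), weights)
y32 = truncated_spectral_conv(signal(32), weights)
```

The same two weights act on both the 16-point and the 32-point grid and return samples of the same underlying function, which is the discretization-invariance property that the truncated Fourier parameterization relies on.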

2.3.3. Dual Output Approach

In this physics-informed multi-task framework, the model jointly predicts transient pressure ($P$) and saturation ($S$) fields governed by two-phase flow dynamics:
$$\left(K,\ \text{wells}(q),\ S_i,\ P_i,\ \mu_w,\ \mu_{nw},\ \phi,\ t\right) \mapsto \left(\hat{S}(x,t),\ \hat{P}(x,t)\right).$$
The composite loss function is designed to enforce both data fidelity and the underlying physical constraints through a dual-task formulation. It is defined as
$$\mathcal{L} = \underbrace{\|\hat{S} - S_{\text{true}}\|^2 + \|\hat{P} - P_{\text{true}}\|^2}_{\text{Data terms}} + \alpha \underbrace{\left(\|R_S\|^2 + \|R_P\|^2\right)}_{\text{PDE residuals}},$$
where $\alpha$ is a weighting hyperparameter. The PDE residuals are expressed as
$$R_S = \frac{\partial \hat{S}}{\partial t} + \nabla \cdot \left(f(\hat{S})\, \hat{v}\right),$$
$$R_P = -\nabla \cdot \left(\lambda_t(\hat{S})\, K\, \nabla \hat{P}\right) - q,$$
$$\hat{v} = -\frac{\lambda_t(\hat{S})\, K}{\phi}\, \nabla \hat{P}.$$
Here $f$ denotes the fractional-flow function and $\lambda_t$ the total mobility. In this formulation, pressure gradients drive the propagation of the saturation front via the nonlinear interaction expressed by the product $f(S)\,\lambda_t(S)$, while porosity $\phi$ modulates the effective velocity in the aquifer. This dual-output approach, therefore, not only minimizes discrepancies between the predicted and true fields but also enforces the physical laws governing two-phase flow in porous media.
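How a pressure residual and the composite loss might be evaluated on a grid can be sketched in one dimension. The sign convention $R_P = -\partial_x(\lambda_t K\, \partial_x P) - q$, the arithmetic face averaging, the use of mean squares as the discrete norm, and all names are assumptions for illustration; the source does not specify its discretization of the residuals.

```python
def pressure_residual(p, mobility, q, dx):
    """1-D residual -d/dx(mobility * dP/dx) - q at interior nodes.

    `mobility` stands for lambda_t * K per node; faces use arithmetic averages (assumed).
    """
    res = []
    for i in range(1, len(p) - 1):
        m_right = 0.5 * (mobility[i] + mobility[i + 1])
        m_left = 0.5 * (mobility[i] + mobility[i - 1])
        flux_div = (m_right * (p[i + 1] - p[i]) - m_left * (p[i] - p[i - 1])) / dx ** 2
        res.append(-flux_div - q[i])
    return res

def mean_square(values):
    return sum(v * v for v in values) / len(values)

def composite_loss(s_hat, s_true, p_hat, p_true, r_s, r_p, alpha):
    """Data misfit on both fields plus alpha-weighted PDE residual penalties."""
    data = (mean_square([a - b for a, b in zip(s_hat, s_true)])
            + mean_square([a - b for a, b in zip(p_hat, p_true)]))
    physics = mean_square(r_s) + mean_square(r_p)
    return data + alpha * physics

# Manufactured check: P = x^2 with unit mobility satisfies -P'' = -2, so q = -2.
dx = 0.1
xs = [i * dx for i in range(11)]
p_exact = [x * x for x in xs]
r_p = pressure_residual(p_exact, [1.0] * 11, [-2.0] * 11, dx)
loss = composite_loss([0.2, 0.4], [0.2, 0.5], [1.0, 2.0], [1.0, 2.0], [0.0], r_p, alpha=0.1)
```

For the manufactured solution the residual vanishes up to round-off, so the loss reduces to the saturation misfit alone.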
For this coupling, handling the temporal part becomes a necessity. The temporal part of the PDE was tackled by incorporating a temporal representation into the input features with the goal of capturing both short-term variations and long-term trends. In our work, we consider three complementary approaches:
Temporal Sinusoidal Encoding
Temporal sinusoidal encoding relies on fixed periodic functions to map a scalar time t into a high-dimensional vector. At time step t, the encoding is defined as
$$P_t = \left[\sin\left(\frac{t}{\lambda_1}\right),\ \cos\left(\frac{t}{\lambda_1}\right),\ \ldots,\ \sin\left(\frac{t}{\lambda_{d/2}}\right),\ \cos\left(\frac{t}{\lambda_{d/2}}\right)\right],$$
where $\lambda_i = 10^{4i/d}$ and $d$ is the embedding dimension. This encoding is then concatenated with the static input features $x_t$ (e.g., permeability, well locations) and the previous saturation $S_{t-1}$ to form
$$z_t = \left[S_{t-1},\ x_t,\ P_t\right].$$
This approach, introduced in the Transformer architecture [18], remains a popular and effective means to provide temporal context.
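As a minimal sketch, the fixed encoding above can be computed directly from its definition; the function name and the choice of $d = 8$ in the example are illustrative assumptions.

```python
import math

def sinusoidal_encoding(t, d):
    """Return [sin(t/l_1), cos(t/l_1), ..., sin(t/l_{d/2}), cos(t/l_{d/2})], l_i = 10^(4i/d)."""
    encoding = []
    for i in range(1, d // 2 + 1):
        wavelength = 10.0 ** (4.0 * i / d)
        encoding.append(math.sin(t / wavelength))
        encoding.append(math.cos(t / wavelength))
    return encoding

enc_zero = sinusoidal_encoding(0.0, 8)    # sines are 0 and cosines are 1 at t = 0
enc_late = sinusoidal_encoding(25.0, 8)   # a later step produces a distinct vector
```

In a full pipeline this vector would be broadcast over the spatial grid and concatenated with the static channels and the previous saturation.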
Time2Vec
Time2Vec is a learnable temporal encoding method that transforms the scalar time t into a vector composed of one linear component and several periodic components. The encoding is given by
$$T(t) = \left[a_0 t + b_0,\ \sin(a_1 t + b_1),\ \ldots,\ \sin(a_{d-1} t + b_{d-1})\right],$$
where $a_i$ and $b_i$ are learnable parameters. By learning these parameters from data, Time2Vec is capable of capturing complex, nonlinear temporal patterns. This method has demonstrated advantages over fixed encodings in various time-series applications [19].
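The Time2Vec map can be sketched as a plain function of its parameters. In training, the frequencies and phases would be learned; the fixed values below are placeholders chosen only to make the example concrete.

```python
import math

def time2vec(t, a, b):
    """One linear component followed by sinusoidal components with parameters (a_i, b_i)."""
    linear = a[0] * t + b[0]
    periodic = [math.sin(ai * t + bi) for ai, bi in zip(a[1:], b[1:])]
    return [linear] + periodic

# Placeholder parameters (would be learnable in the actual model).
freqs = [0.5, 1.0, 2.0]
phases = [0.1, 0.0, 0.0]
embedding = time2vec(2.0, freqs, phases)
```

The linear component lets the embedding represent non-periodic trends, while the sine terms capture recurring patterns at learned frequencies.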
Transformer-Based Temporal Encoding
Transformer-based temporal encoding leverages self-attention mechanisms to generate context-aware representations of time. Dedicated time-series transformer architectures—such as the Temporal Fusion Transformer (TFT) [20] and the Time-series-specific Transformer family [21,22,23]—often incorporate gating mechanisms and variable selection networks to enhance stability and interpretability. In this approach, the scalar-time input is first projected into a high-dimensional space and then processed as a sequence via a Transformer encoder. The resulting outputs are aggregated (e.g., by mean pooling) to yield a dynamic embedding that effectively captures both local and global temporal dependencies.
The model predicts both saturation ($\hat{S}$) and pressure ($\hat{P}$) fields via separate decoding branches. Recognizing that the pressure signal is typically lower in magnitude, we apply a learnable amplification factor $\gamma_P$ (initialized at 2.0) to the pressure output:
$$\hat{P}_{\text{amp}} = \gamma_P\, \hat{P}.$$
These outputs are then fused using a cross-gated mechanism that implements bidirectional modulation: the pressure branch provides context for refining saturation, and the saturation branch provides context for refining pressure. Concretely, we first construct gating maps by passing each branch through a shallow convolutional block followed by a sigmoid nonlinearity, so that the gates lie in $[0, 1]$ and can act as spatially and temporally varying masks. The fused outputs are computed as
$$\hat{S}_{\text{fused}} = \hat{S} + \alpha\, \sigma\left(C_1(\hat{P}_{\text{amp}})\right) \odot \hat{P}_{\text{amp}},$$
$$\hat{P}_{\text{fused}} = \hat{P}_{\text{amp}} + \beta\, \sigma\left(C_2(\hat{S})\right) \odot \hat{S},$$
where $\alpha$ and $\beta$ are learnable scalar (or channel-wise) parameters that control the strength of cross-branch contributions, $C_1$ and $C_2$ are the shallow convolutional blocks, $\sigma(\cdot)$ denotes the sigmoid function, and $\odot$ is element-wise multiplication. In this formulation, $\sigma(C_1(\hat{P}_{\text{amp}}))$ acts as a data-dependent gate that selectively injects pressure information into the saturation field, while $\sigma(C_2(\hat{S}))$ plays the analogous role in the pressure branch. This cross-gated fusion allows the network to dynamically downweight noisy or less informative regions and to emphasize locations where pressure and saturation must remain tightly coupled (e.g., near displacement fronts and high-permeability channels). As a result, the model can better balance the relative contributions of pressure and saturation, leading to final predictions that are more consistent with the underlying two-phase flow physics and less prone to spurious decoupling between the two fields. Figure 4 shows a dual-branch, time-aware FNO architecture that splits shared spectral features into separate saturation and pressure paths.
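The cross-gated fusion can be sketched element-wise. As a simplifying assumption, each shallow convolutional block is replaced here by a pointwise affine map (a weight and a bias), and the values of $\alpha$ and $\beta$ are placeholders; only the initial value $\gamma_P = 2.0$ comes from the text.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def cross_gated_fusion(s_hat, p_hat, gamma_p=2.0, alpha=0.1, beta=0.1,
                       c1=(1.0, 0.0), c2=(1.0, 0.0)):
    """Fuse saturation and pressure predictions with sigmoid gates.

    c1 and c2 are (weight, bias) pairs standing in for the convolutional blocks (assumed).
    """
    p_amp = [gamma_p * p for p in p_hat]
    s_fused = [s + alpha * sigmoid(c1[0] * pa + c1[1]) * pa
               for s, pa in zip(s_hat, p_amp)]
    p_fused = [pa + beta * sigmoid(c2[0] * s + c2[1]) * s
               for s, pa in zip(s_hat, p_amp)]
    return s_fused, p_fused

s_fused, p_fused = cross_gated_fusion([0.0, 0.3], [0.5, 0.2])
s_off, p_off = cross_gated_fusion([0.0, 0.3], [0.5, 0.2], alpha=0.0, beta=0.0)
```

Setting both strengths to zero recovers the uncoupled branch outputs (with the pressure amplification still applied), which makes the contribution of each gate easy to isolate.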

2.4. Training and Optimization

All models were implemented in PyTorch and trained on a single NVIDIA GPU. Inputs were standardized per channel using statistics computed once on the training set; saturation and pressure targets were standardized separately. Losses were computed in standardized space, and predictions were de-standardized for reporting. The network uses per-channel normalization in the spectral trunk and BatchNorm in the convolutional decoder; dropout is applied only in the pressure decoder ($p = 0.2$). A learned 64-dimensional temporal/positional embedding of the scalar time/step is concatenated to the static inputs and broadcast over the spatial grid. No geometric augmentation was used; to emphasize early transients we oversampled early steps ($t \le 10$) by a factor of 2.
Optimization used AdamW (initial learning rate $5 \times 10^{-3}$, weight decay $10^{-2}$). A cosine-annealing schedule with warm restarts (CosineAnnealingWarmRestarts; $T_0 = 100$, $T_{\text{mult}} = 2$, $\eta_{\min} = 10^{-8}$) was stepped per optimizer update and not reset across curriculum phases. Mini-batch size was 64, with global-norm gradient clipping at 2. The data term was Smooth-$L_1$ (Huber) with $\beta = 0.01$, applied to saturation and pressure with equal weights. A physics-residual term (conservation and Darcy consistency) was weighted by $\lambda_{\text{phys}}(e) = \exp(e/25)$, where $e$ is the global epoch index; a total-variation penalty on the pressure head was available but set to zero in the main runs.
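The learning-rate schedule and the data term can be written out in closed form. The sketch below re-implements the standard cosine-annealing-with-warm-restarts formula and the Smooth-$L_1$ definition in plain Python for illustration; it is not the training code, and the function names are assumptions.

```python
import math

def cosine_warm_restart_lr(step, lr_max=5e-3, lr_min=1e-8, t0=100, t_mult=2):
    """Learning rate after `step` updates; cycle lengths are t0, t0*t_mult, ..."""
    cycle_len, t_cur = t0, step
    while t_cur >= cycle_len:                    # locate the current cycle
        t_cur -= cycle_len
        cycle_len *= t_mult
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * t_cur / cycle_len))

def smooth_l1(pred, target, beta=0.01):
    """Mean Smooth-L1 loss: quadratic below beta, linear above."""
    total = 0.0
    for p, y in zip(pred, target):
        d = abs(p - y)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)

lr_start = cosine_warm_restart_lr(0)       # start of the first cycle
lr_restart = cosine_warm_restart_lr(100)   # first warm restart
lr_mid = cosine_warm_restart_lr(200)       # halfway through the second, 200-step cycle
```

With $\beta = 0.01$ the loss is quadratic only for very small standardized errors, so it behaves almost like an $L_1$ loss over most of the error range.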
Training followed a staged curriculum over the simulation horizon. In the first phase, the network was trained only on the earliest portion of each trajectory (up to time index 25), and in subsequent phases the maximum time index was progressively increased to 100, 500, and finally the full trajectory, using 50, 50, 50, and 500 epochs per phase, respectively. This schedule forces the model to first capture early-time transients and sharp displacement fronts before fitting the later-time, quasi-steady-state behavior over the entire horizon. A concise summary of the training and configuration settings is provided in Table 5.
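The staged curriculum amounts to a lookup from the global epoch index to the largest time index admitted into training. A minimal sketch follows; the names are illustrative, and `None` is used to mean "full trajectory" because the trajectory length is not stated here.

```python
# (epochs in phase, maximum time index); None denotes the full trajectory.
CURRICULUM = [(50, 25), (50, 100), (50, 500), (500, None)]

def max_time_index(epoch):
    """Return the time-index cap for a global epoch index (0-based)."""
    phase_start = 0
    for n_epochs, cap in CURRICULUM:
        if epoch < phase_start + n_epochs:
            return cap
        phase_start += n_epochs
    return None                                  # past the schedule: keep the full trajectory

def filter_samples(time_indices, epoch):
    """Keep only samples whose time index is admitted at this epoch."""
    cap = max_time_index(epoch)
    return [t for t in time_indices if cap is None or t <= cap]

total_epochs = sum(n for n, _ in CURRICULUM)
early = filter_samples([1, 20, 60, 400, 900], epoch=10)
late = filter_samples([1, 20, 60, 400, 900], epoch=200)
```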

3. Results

3.1. Numerical Validation

For numerical validation, we examine both single-phase and multi-phase flow scenarios. In the single-phase case, the formation is modeled as a homogeneous $64 \times 64 \times 1$ grid with unit porosity and viscosities, and irreducible saturations $s_w$ and $s_{nw}$ set to zero, allowing full pore space utilization. Injection occurs at one corner and production at the opposite. Unlike Aarnes et al. [24], who normalize the injection rate to one pore volume of water, our non-normalized simulation reproduces the expected physics, as shown in Figure 5 (A vs. B). For the multi-phase case, we use SPE10 Model 2 with an incompressible two-phase system ($\mu_w = 0.3$ cP, $\mu_{nw} = 3.0$ cP, $s_{wc} = s_{nwr} = 0.2$). Figure 5 presents breakthrough saturation distributions for the top (A1 vs. B1) and bottom layers (A2 vs. B2). Our model matches the reference, with minor discrepancies attributable to numerical smoothing differences.

3.2. 2D FNO Results

We compare (i) a purely data-driven FNO baseline and (ii) a DB-AFNO with different temporal encodings (Positional Encoding, Time2Vec, TFT/TST). Numerical simulations serve as ground truth. We evaluate absolute error fields and spatio-temporal accuracy over time, including blind tests on the top and bottom layers of SPE10 (Figure 6).
As shown in Figure 6, DB-AFNO with Time2Vec and TFT/TST yields lower early-time errors ($t^* < 0.25$), with more samples in the $10^{-2}$ to $10^{-1}$ range. For later times ($t^* \ge 0.25$), Time2Vec and TFT/TST remain close; Positional Encoding improves over the purely data-driven FNO but trails the former two. In Figure 7, each column corresponds to a different injection time, and each row compares the ground-truth simulator, the conventional FNO baseline, and DB-AFNO. The baseline FNO visibly diffuses the displacement front, particularly along high-permeability channels, and in several time slices the advancing water front becomes disconnected or breaks into isolated patches. In contrast, DB-AFNO preserves the continuity of the main displacement front and better reproduces the fingering patterns and channelized advance observed in the simulator. This is most evident near breakthrough, where DB-AFNO aligns both the front position and the saturation contrast across the front, whereas the baseline front arrives too early and exhibits a smeared transition zone. These qualitative differences are consistent with the error distributions in Figure 6, where DB-AFNO has a higher proportion of predictions in the $10^{-2}$ to $10^{-1}$ error band at both early and late times.

3.3. Zero-Shot Super Resolution

DB-AFNO is trained once, as an upscaler, on a single coarse/fine pair from the upper five layers of SPE10: input lattice 30 × 110 × 5, target lattice 60 × 220 × 5. After convergence, weights are frozen. At deployment, the coarse simulator supplies $S_{\text{coarse}}$ and $P_{\text{coarse}}$, which the operator upsamples:
$$\mathcal{G}_\theta : \left(S_{\text{coarse}}, P_{\text{coarse}}, K, \phi, \mu_w, \mu_{nw}, q, t\right) \mapsto \left(\hat{S}, \hat{P}\right)\Big|_{\Omega_{60 \times 220 \times 5}}.$$
The FNO backbone includes a Cartesian coordinate-grid embedding (normalized $(x, y, z)$ appended per voxel) for absolute spatial context on any mesh. Training uses an incremental multi-resolution curriculum: start with 2 × 2 × 2 retained Fourier modes on a 2× subsampled grid, then raise spectral/spatial resolutions $(2, 2, 2) \rightarrow (6, 6, 3) \rightarrow (18, 18, 8)$ once validation-loss gradients plateau. In this zero-shot setup the full model (including gating and temporal channels) converges in 100 epochs, with ≈1.5 GPU-min per case on a single RTX Titan.
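The coordinate-grid embedding can be sketched in a few lines. This is an illustrative version only: it assumes channels-last arrays and coordinates normalized to $[0, 1]$ along each axis, neither of which is specified in the text, and the function names are hypothetical.

```python
import numpy as np


def coordinate_grid(nx, ny, nz):
    """Normalized (x, y, z) coordinates in [0, 1], shaped (nx, ny, nz, 3)."""
    x = np.linspace(0.0, 1.0, nx)
    y = np.linspace(0.0, 1.0, ny)
    z = np.linspace(0.0, 1.0, nz)
    gx, gy, gz = np.meshgrid(x, y, z, indexing="ij")
    return np.stack([gx, gy, gz], axis=-1)


def append_coordinates(fields):
    """Append three coordinate channels: (nx, ny, nz, c) -> (nx, ny, nz, c + 3)."""
    nx, ny, nz, _ = fields.shape
    return np.concatenate([fields, coordinate_grid(nx, ny, nz)], axis=-1)
```

Because the coordinates are normalized, the same physical location receives the same embedding on the 30 × 110 × 5 and 60 × 220 × 5 lattices, which is what gives the operator absolute spatial context independent of the mesh.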
On the five unseen layers, >90% of voxels fall below a 5% absolute-error threshold. Figure 8 shows 3-D renderings of $\hat{S}$ and $|S - \hat{S}|$; fronts, channel connectivity, and high-saturation ridges are preserved at 8× volumetric up-resolution. Figure 9 provides layer-wise scatter plots at the final time step; similar behavior holds at other times. Results remain sensitive to the coarse grid: using 15 × 55 × 5 did not reproduce the same accuracy.
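For reference, the voxel-wise metrics used in this comparison ($R^2$, RMSE, MAE, and the share of voxels under an absolute-error threshold) can be computed as in the sketch below. It assumes the 5% threshold means $|S - \hat{S}| < 0.05$ in saturation units, which is an interpretation rather than a statement from the text; the function name is hypothetical.

```python
import numpy as np


def regression_metrics(pred, true, threshold=0.05):
    """Return R^2, RMSE, MAE and the fraction of voxels with |error| < threshold."""
    pred = np.asarray(pred, dtype=float).ravel()
    true = np.asarray(true, dtype=float).ravel()
    err = pred - true
    ss_res = np.sum(err**2)                      # residual sum of squares
    ss_tot = np.sum((true - true.mean())**2)     # total sum of squares
    return {
        "r2": float(1.0 - ss_res / ss_tot),
        "rmse": float(np.sqrt(np.mean(err**2))),
        "mae": float(np.mean(np.abs(err))),
        "frac_below": float(np.mean(np.abs(err) < threshold)),
    }
```

Applying the function per layer and averaging would give layer-mean values of the kind reported with an overbar in the Discussion.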

4. Discussion

Across the 2-D benchmarks, DB-AFNO consistently outperforms a purely data-driven FNO. Error distributions in the infinity norm show that Time2Vec- and TFT/TST-encoded variants concentrate early-time errors ($t^* < 0.25$) in the $10^{-2}$ to $10^{-1}$ band, whereas Positional Encoding and the conventional FNO exhibit heavier tails (Figure 6). Near pseudo-steady state ($t^* \geq 0.25$), Time2Vec and TFT/TST remain closely matched and continue to dominate Positional Encoding. Visual comparisons corroborate these statistics: the baseline FNO produces over-smoothed fields with disconnected saturation artifacts, while DB-AFNO retains sharp, physically plausible fronts and spatial connectivity (Figure 7). These gains arise from (i) the explicit coupling of pressure and saturation through dual branches, (ii) temporal conditioning that disambiguates early transients, and (iii) physics-informed losses that penalize mass-balance and Darcy residuals.
For zero-shot super-resolution, a train-low, infer-high regimen enables coarse-to-fine mapping without retraining. A single coarse/fine pair, 30 × 110 × 5 → 60 × 220 × 5, is sufficient to learn mesh-agnostic upscaling, aided by a Cartesian coordinate-grid embedding for absolute spatial context and an incremental multi-resolution curriculum that increases retained Fourier modes and spatial resolution only after validation plateaus. This curriculum halves wall-clock training and reduces GPU memory by ∼40% relative to single-shot high-resolution training, while the deployed pipeline (coarse TPFA + one forward pass) achieves ∼3× speedup over full-order fine-grid simulation (Table 6). On five unseen layers, we obtain $\overline{R^2} = 0.90$, $\overline{\text{RMSE}} = 4.4 \times 10^{-2}$, and $\overline{\text{MAE}} = 3.2 \times 10^{-2}$, with >90% of voxels below 5% absolute error. Three-dimensional renderings confirm preservation of water-flood fronts, channel continuity, and high-saturation ridges at 8× volumetric up-resolution (Figure 8), and layer-wise scatter plots indicate tight agreement at the final time step with similar behavior at other times (Figure 9).
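One way to picture the plateau-triggered curriculum is the small scheduler below. The three mode stages come from the text; the plateau test (no improvement larger than `min_delta` for `patience` consecutive epochs) and the class name are assumptions made for illustration, since the exact criterion is not given.

```python
MODE_STAGES = [(2, 2, 2), (6, 6, 3), (18, 18, 8)]  # retained Fourier modes per stage


class ModeCurriculum:
    """Advance to the next mode stage when the validation loss stops improving."""

    def __init__(self, stages=MODE_STAGES, patience=5, min_delta=1e-4):
        self.stages = list(stages)
        self.patience = patience      # hypothetical plateau length, in epochs
        self.min_delta = min_delta    # hypothetical minimum improvement
        self.stage = 0
        self.best = float("inf")
        self.stale = 0

    @property
    def modes(self):
        return self.stages[self.stage]

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if the stage advanced."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.stale = 0
            return False
        self.stale += 1
        if self.stale >= self.patience and self.stage < len(self.stages) - 1:
            self.stage += 1
            self.best = float("inf")  # reset tracking for the new stage
            self.stale = 0
            return True
        return False
```

In a training loop, `curriculum.step(val_loss)` would be called once per epoch and the spectral layers rebuilt or unmasked to `curriculum.modes` whenever it returns `True`.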
A key technical limitation of the zero-shot path is its dependence on the fidelity of the coarse solution: the mapping presumes that the coarse grid resolves the dominant transport features. When the input was degraded to 15 × 55 × 5, the same accuracy was not recovered, indicating a lower bound on coarse resolution for reliable upscale inference. Minor discrepancies in the validation figures are consistent with numerical smoothing choices rather than architectural deficiencies.
Overall, the results indicate that (i) enforcing coupled physics with dual-branch decoding and temporal encodings improves early-time accuracy and suppresses over-smoothing; (ii) the operator generalizes to unseen layers within SPE10 when conditioned on physically meaningful inputs; and (iii) zero-shot super-resolution provides accurate fine-grid fields at a fraction of the computational cost, provided the coarse simulation captures the essential flow physics.
The DB-AFNO surrogates presented here are trained entirely on incompressible, immiscible two-phase displacements, and thus are most reliable for formations and operating conditions with similar heterogeneity statistics, viscosity ratios, and flow regimes; all results in this study pertain to such systems with no explicit gas phase or compositional effects. When the underlying physics changes significantly—e.g., strong capillary hysteresis, gas evolution or gas caps, or fracture-dominated flow—retraining or transfer learning on simulations from the new regime would be required. While the dual-branch architecture is, in principle, extensible to three-phase or compositional flow (e.g., via an additional output branch and corresponding physics residuals), this lies beyond the scope of the present work and is left for future investigation.

5. Conclusions

  • The proposed DB-AFNO framework achieves a significant reduction in computational cost compared to traditional full-order solvers while preserving high-resolution prediction accuracy, as demonstrated on the SPE10 benchmark.
  • The dual-branch architecture effectively couples pressure and saturation fields, ensuring that the predicted outputs remain physically consistent and adhere closely to the governing PDE constraints.
  • The incorporation of advanced temporal encoding methods (e.g., Time2Vec, TFT/TST) enhances the model’s ability to capture dynamic transient behaviors, resulting in lower validation errors during both early transient and pseudo-steady states.
  • By integrating physics-informed loss functions, the model not only minimizes data-driven prediction errors but also enforces adherence to the underlying flow physics, thereby improving overall reliability.
  • The overall architecture demonstrates scalability to higher-resolution simulations, provided that it is trained on similar lower-fidelity cases.
  • Despite its high-fidelity performance within the training distribution, DB-AFNO-based approaches may degrade when the underlying physics deviates from the regimes encountered during training, because such cases fall outside the training distribution.
  • The scalability of DB-AFNO was found to be similar to that of FNO-based models. It is limited in scenarios involving sudden or drastic changes in governing physical processes, indicating the need for adaptive re-training or architectural modifications when the physics evolve beyond the original training regime.
  • Zero-shot super-resolution cuts the end-to-end turnaround for a 60 × 220 × 5 grid from ∼15 min (full TPFA) to 4.5 min by upscaling a coarse 30 × 110 × 5 run with a single forward pass, achieving $\overline{R^2} = 0.90$ and $\overline{\text{RMSE}} = 4.4 \times 10^{-2}$ without additional fine-grid training.

Author Contributions

Conceptualization, H.A.H. and J.W.; methodology, H.A.H.; software, H.A.H. and O.E.; validation, H.A.H., O.E. and J.W.; formal analysis, H.A.H.; investigation, H.A.H. and O.E.; resources, H.A.H. and O.E.; data curation, O.E.; writing—original draft preparation, H.A.H.; writing—review and editing, H.A.H., O.E. and J.W.; visualization, H.A.H. and O.E.; supervision, J.W.; project administration, J.W.; funding acquisition, J.W. All authors have read and agreed to the published version of the manuscript.

Funding

The authors did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors for the research, authorship, or publication of this work.

Data Availability Statement

The data used in this study are openly available in the SPE Comparative Solution Project (Dataset 2, Model 2) at https://www.spe.org/web/csp/datasets/set02.htm (accessed on 14 November 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

2-D             Two-dimensional
3-D             Three-dimensional
AdamW           Adam optimizer with decoupled weight decay
AFNO            Adaptive Fourier Neural Operator
DB-AFNO         Dual-Branch Adaptive Fourier Neural Operator
FNO             Fourier Neural Operator
PINNs           Physics-Informed Neural Networks
Time2Vec        Learnable temporal encoding method
TFT             Temporal Fusion Transformer
TST             Time-series Transformer (family)
GCR             Generalized Conjugate Residual
BiCGStab        BiConjugate Gradient Stabilized
JF-BiCGStab     Jacobian-Free BiCGStab
LU              Lower–Upper factorization
ILU             Incomplete LU
TPFA            Two-Point Flux Approximation
CFL             Courant–Friedrichs–Lewy condition
TV              Total Variation
PDE             Partial Differential Equation
PV              Pore Volume
SPE10           10th SPE Comparative Solution Project
RMSE            Root-Mean-Square Error
MAE             Mean Absolute Error
R²              Coefficient of determination
GPU             Graphics Processing Unit
CPU             Central Processing Unit
CUDA            Compute Unified Device Architecture
PyTorch         Python-based deep learning framework
GELU            Gaussian Error Linear Unit
cP              Centipoise
mD              Millidarcy
RL              Reinforcement Learning

Appendix A. Algorithms and Pseudocode

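Algorithm A1 Generalized Conjugate Residual (GCR) [16]
Require: $A \in \mathbb{R}^{n \times n}$, $b$, initial $x_0$, tolerance $\varepsilon$
$r_0 \leftarrow b - A x_0$, $k \leftarrow 0$
while $\|r_k\| > \varepsilon$ do
  $\tilde{p} \leftarrow r_k$
  for $i = 0$ to $k - 1$ do
    $\beta_{ik} \leftarrow -\dfrac{(A\tilde{p}, A p_i)}{(A p_i, A p_i)}$,  $\tilde{p} \leftarrow \tilde{p} + \beta_{ik} p_i$
  $p_k \leftarrow \tilde{p} / \|A\tilde{p}\|$
  $\alpha_k \leftarrow \dfrac{(r_k, A p_k)}{(A p_k, A p_k)}$
  $x_{k+1} \leftarrow x_k + \alpha_k p_k$,  $r_{k+1} \leftarrow r_k - \alpha_k A p_k$
  $k \leftarrow k + 1$
return $x_k$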
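A compact NumPy rendering of Algorithm A1 is sketched below. It is an illustrative dense-matrix version, not the authors' implementation: the products $A p_i$ are cached so that each iteration needs a single matrix-vector product, and the iteration cap and breakdown guard are additions for safety.

```python
import numpy as np


def gcr(A, b, x0=None, tol=1e-10, max_iter=None):
    """Generalized Conjugate Residual for A x = b (dense sketch of Algorithm A1)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n = b.size
    x = np.zeros(n) if x0 is None else np.array(x0, dtype=float)
    max_iter = 10 * n if max_iter is None else max_iter
    r = b - A @ x
    P, AP = [], []                       # stored directions p_i and products A p_i
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol:
            break
        p = r.copy()
        Ap = A @ p
        # Make A p orthogonal to all previous A p_i (modified Gram-Schmidt).
        for p_i, Ap_i in zip(P, AP):
            beta = -(Ap @ Ap_i) / (Ap_i @ Ap_i)
            p += beta * p_i
            Ap += beta * Ap_i
        norm_Ap = np.linalg.norm(Ap)
        if norm_Ap == 0.0:               # breakdown guard (not part of Algorithm A1)
            break
        p /= norm_Ap
        Ap /= norm_Ap
        alpha = r @ Ap                   # (A p_k, A p_k) = 1 after normalization
        x += alpha * p
        r -= alpha * Ap
        P.append(p)
        AP.append(Ap)
    return x
```

For a matrix whose symmetric part is positive definite, the residual norm decreases monotonically and, in exact arithmetic, the iteration terminates in at most $n$ steps.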
Algorithm A2 Newton–Krylov with GCR linear step [25]
Require: Nonlinear residual $F(x)$, Jacobian $J_F(x)$, initial $x_0$, tolerance $tol_f$, max iters $k_{\max}$
$k \leftarrow 0$, $x \leftarrow x_0$
repeat
  $f \leftarrow F(x)$
  $J \leftarrow J_F(x)$
  Solve $J \Delta x = -f$ with GCR $\rightarrow \Delta x$
  $x \leftarrow x + \Delta x$,  $k \leftarrow k + 1$
until $\|f\| \leq tol_f$ or $k \geq k_{\max}$
return $x$
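The outer Newton loop of Algorithm A2 can be sketched as follows. To keep the example self-contained, the linear step defaults to a dense direct solve (`numpy.linalg.solve`) standing in for GCR; any solver with the same `(J, rhs)` calling convention could be passed instead. The function name and the small test system are illustrative only.

```python
import numpy as np


def newton(F, J, x0, tol_f=1e-10, k_max=50, linear_solve=np.linalg.solve):
    """Newton iteration: solve J(x) dx = -F(x), then update x <- x + dx."""
    x = np.array(x0, dtype=float)
    for k in range(k_max):
        f = F(x)
        if np.linalg.norm(f) <= tol_f:   # converged on the nonlinear residual
            return x, k
        dx = linear_solve(J(x), -f)      # GCR would be used here in Algorithm A2
        x = x + dx
    return x, k_max


# Illustrative system: x0^2 + x1^2 = 4 and x0 = x1, with root (sqrt(2), sqrt(2)).
def F_demo(x):
    return np.array([x[0]**2 + x[1]**2 - 4.0, x[0] - x[1]])


def J_demo(x):
    return np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])


root, iterations = newton(F_demo, J_demo, [1.0, 2.0])
```

Unlike the repeat/until form in Algorithm A2, this version tests the residual before each linear solve, which avoids one unnecessary solve once the iterate has converged.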
Algorithm A3 Adaptive trapezoidal time integration [26]
Require: $\dot{x} = f(x, p, u)$, $x(t_s) = x_0$, $t_s < t_f$, initial $\Delta t$, tolerances
$t \leftarrow t_s$, $x \leftarrow x_0$
while $t < t_f$ do
  Predict: $x_{\text{pred}} \leftarrow x + \Delta t\, f(x, p, u)$
  Correct: solve for $x_{\text{new}}$ s.t. $x_{\text{new}} - x - \frac{\Delta t}{2}\left[f(x) + f(x_{\text{new}})\right] = 0$ (Newton)
  Estimate error: $e \leftarrow \|x_{\text{new}} - x_{\text{pred}}\|$
  if $e$ small then
    $t \leftarrow t + \Delta t$, $x \leftarrow x_{\text{new}}$
  Update $\Delta t$ using controller (e.g., PI or exponential rule); enforce bounds
return trajectory $\{x(t)\}$
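A minimal sketch of Algorithm A3 follows, under several stated assumptions: the right-hand side is written as `f(t, x)` with parameters and controls folded in, the caller supplies the Jacobian `jac(t, x)`, and the step-size controller is a simple square-root rule with a 0.9 safety factor rather than the PI or exponential rules mentioned in the algorithm. All names and default values are hypothetical.

```python
import numpy as np


def adaptive_trapezoid(f, jac, x0, t_s, t_f, dt=1e-2, tol=1e-4,
                       dt_min=1e-8, dt_max=1.0):
    """Euler predictor, trapezoidal corrector (Newton), error-based step control."""
    t = float(t_s)
    x = np.atleast_1d(np.array(x0, dtype=float))
    ts, xs = [t], [x.copy()]
    eye = np.eye(x.size)
    while t_f - t > 1e-12:
        dt = min(dt, t_f - t)                       # do not step past t_f
        fx = f(t, x)
        x_pred = x + dt * fx                        # explicit Euler predictor
        x_new = x_pred.copy()
        for _ in range(20):                         # Newton on the trapezoidal residual
            g = x_new - x - 0.5 * dt * (fx + f(t + dt, x_new))
            if np.linalg.norm(g) < 1e-12:
                break
            G = eye - 0.5 * dt * jac(t + dt, x_new)
            x_new = x_new - np.linalg.solve(G, g)
        err = float(np.linalg.norm(x_new - x_pred))
        if err <= tol or dt <= dt_min:              # accept the step
            t, x = t + dt, x_new
            ts.append(t)
            xs.append(x.copy())
        # Assumed controller: predictor-corrector gap scales like dt^2.
        factor = 0.9 * np.sqrt(tol / max(err, 1e-16))
        dt = float(np.clip(dt * min(max(factor, 0.2), 2.0), dt_min, dt_max))
    return np.array(ts), np.array(xs)
```

For the scalar decay problem $\dot{x} = -x$, $x(0) = 1$, the returned trajectory can be compared against $e^{-t}$ to check both the corrector and the step-size adaptation.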
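Algorithm A4 Jacobian-free BiCGStab [25,27]
Require: Residual $F(x)$, current $x$, right-hand side $b$, finite-diff. step $\varepsilon$, tolerance $tol$
function matvec($v$): return $\left[F(x + \varepsilon v) - F(x)\right] / \varepsilon$
$\delta \leftarrow 0$, $r \leftarrow b - \text{matvec}(\delta)$, $\hat{r} \leftarrow r$
$\rho \leftarrow 1$, $\alpha \leftarrow 1$, $\omega \leftarrow 1$, $v \leftarrow 0$, $p \leftarrow 0$
for $k = 0$ to $k_{\max} - 1$ do
  $\rho_{\text{new}} \leftarrow \hat{r}^\top r$
  if $|\rho_{\text{new}}| < 10^{-14}$ then break
  if $k = 0$ then
    $p \leftarrow r$
  else
    $\beta \leftarrow (\rho_{\text{new}} / \rho)(\alpha / \omega)$;  $p \leftarrow r + \beta (p - \omega v)$
  $v \leftarrow \text{matvec}(p)$;  $\alpha \leftarrow \rho_{\text{new}} / (\hat{r}^\top v)$
  $s \leftarrow r - \alpha v$
  if $\|s\| < tol$ then
    $\delta \leftarrow \delta + \alpha p$; return $\delta$
  $t \leftarrow \text{matvec}(s)$;  $\omega \leftarrow (t^\top s) / (t^\top t)$
  $\delta \leftarrow \delta + \alpha p + \omega s$;  $r \leftarrow s - \omega t$
  if $\|r\| < tol$ then return $\delta$
  $\rho \leftarrow \rho_{\text{new}}$
return $\delta$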
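Algorithm A4 translates almost line by line into NumPy, as sketched below. The default step `eps` and tolerance `tol` are assumptions; in practice the tolerance should stay well above the finite-difference noise of the matrix-vector product (roughly machine precision divided by `eps`), otherwise the iteration ends up operating on rounding error.

```python
import numpy as np


def jf_bicgstab(F, x, b, eps=1e-6, tol=1e-6, k_max=200):
    """Solve J(x) delta = b with BiCGStab, using finite-difference Jacobian products."""
    x = np.asarray(x, dtype=float)
    Fx = F(x)

    def matvec(v):
        # Directional derivative of F at x along v (forward difference).
        return (F(x + eps * v) - Fx) / eps

    delta = np.zeros_like(b, dtype=float)
    r = b - matvec(delta)
    r_hat = r.copy()
    rho = alpha = omega = 1.0
    v = np.zeros_like(r)
    p = np.zeros_like(r)
    for k in range(k_max):
        rho_new = r_hat @ r
        if abs(rho_new) < 1e-14:          # breakdown check from Algorithm A4
            break
        if k == 0:
            p = r.copy()
        else:
            beta = (rho_new / rho) * (alpha / omega)
            p = r + beta * (p - omega * v)
        v = matvec(p)
        alpha = rho_new / (r_hat @ v)
        s = r - alpha * v
        if np.linalg.norm(s) < tol:
            return delta + alpha * p
        t = matvec(s)
        omega = (t @ s) / (t @ t)
        delta = delta + alpha * p + omega * s
        r = s - omega * t
        if np.linalg.norm(r) < tol:
            return delta
        rho = rho_new
    return delta
```

Passing `b = -F(x)` yields the Newton correction at `x`, so the routine can replace the linear step of Algorithm A2 without ever forming the Jacobian.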

References

  1. Sejnowski, T.J. The unreasonable effectiveness of deep learning in artificial intelligence. Proc. Natl. Acad. Sci. USA 2020, 117, 30033–30038.
  2. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707.
  3. Karniadakis, G.E.; Kevrekidis, I.G.; Lu, L.; Perdikaris, P.; Wang, S.; Yang, L. Physics-informed machine learning. Nat. Rev. Phys. 2021, 3, 422–440.
  4. Lu, L.; Meng, X.; Yang, L.; Karniadakis, G.E. DeepXDE: A deep learning library for solving differential equations. SIAM Rev. 2021, 63, 208–228.
  5. Li, Z.; Kovachki, N.; Azizzadenesheli, K.; Liu, B.; Bhattacharya, K.; Stuart, A.; Anandkumar, A. Fourier neural operator for parametric partial differential equations. arXiv 2020, arXiv:2010.08895.
  6. Kovachki, N.; Li, Z.; Liu, B.; Azizzadenesheli, K.; Bhattacharya, K.; Stuart, A.; Anandkumar, A. Neural operator: Learning maps between function spaces. J. Mach. Learn. Res. 2021, 24, 4061–4157.
  7. Li, Z.; Zheng, H.; Kovachki, N.; Jin, D.; Chen, H.; Liu, B.; Azizzadenesheli, K.; Anandkumar, A. Physics-informed neural operator for learning partial differential equations. arXiv 2021, arXiv:2111.03794.
  8. Patel, R.G.; Trask, N.A.; Wood, M.A.; Cyr, E.C. A physics-informed operator regression framework for extracting data-driven continuum models. Comput. Methods Appl. Mech. Eng. 2021, 373, 113500.
  9. NVIDIA Modulus Team. Darcy Flow with Fourier Neural Operator. 2024. Available online: https://docs.nvidia.com/physicsnemo/25.08/physicsnemo-sym/user_guide/neural_operators/darcy_fno.html (accessed on 18 July 2024).
  10. Wen, G.; Li, Z.; Azizzadenesheli, K.; Anandkumar, A.; Benson, S.M. U-FNO—An enhanced Fourier neural operator-based deep-learning model for multiphase flow. Adv. Water Resour. 2022, 163, 104180.
  11. Cao, L.; O’Leary-Roseberry, T.; Jha, P.K.; Oden, J.T.; Ghattas, O. Residual-based error correction for neural operator accelerated infinite-dimensional Bayesian inverse problems. J. Comput. Phys. 2023, 486, 112104.
  12. Ma, X.; Zhong, R.; Zhan, J.; Zhou, D. Enhancing subsurface multiphase flow simulation with Fourier neural operator. Heliyon 2024, 10, e38103.
  13. Jain, N.; Roy, S.; Kodamana, H.; Nair, P. Scaling the predictions of multiphase flow through porous media using operator learning. Chem. Eng. J. 2025, 503, 157671.
  14. Society of Petroleum Engineers. 10th SPE Comparative Solution Project, Model 2. 2024. Available online: https://www.spe.org/web/csp/datasets/set02.htm (accessed on 18 July 2024).
  15. Buckley, S.E.; Leverett, M.C. Mechanism of fluid displacement in sands. Trans. AIME 1941, 146, 107–116.
  16. Demmel, J.W. Applied Numerical Linear Algebra; SIAM: Philadelphia, PA, USA, 1997.
  17. Golub, G.H.; Van Loan, C.F. Matrix Computations; Johns Hopkins Studies in the Mathematical Sciences; Johns Hopkins University Press: Baltimore, MD, USA, 2013.
  18. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Proceedings of the Advances in Neural Information Processing Systems 30 (NeurIPS), Long Beach, CA, USA, 4–9 December 2017.
  19. Kazemi, S.M. Time2Vec: Learning a vector representation of time. arXiv 2019, arXiv:1907.05321.
  20. Lim, B.; Zohdy, M.; Phan, H.; Lee, S.J. Temporal fusion transformers for interpretable multi-horizon time series forecasting. Int. J. Forecast. 2019, 36, 628–645.
  21. Zhou, H.; Zhang, S.; Peng, J.; Zhang, S.; Li, J.; Liu, R.; Shen, J.; Pan, G. Informer: Beyond efficient transformer for long sequence time-series forecasting. arXiv 2021, arXiv:2012.07436.
  22. Xu, H.; Wu, W.; Jiang, P.; Ding, Y.; Lu, Z.; Li, X. Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy. In Proceedings of the Tenth International Conference on Learning Representations (ICLR 2022), Virtual, 25–29 April 2022.
  23. Nie, Y.; Chi, X.; Xue, Y.; Liu, Y.; Zhang, T.; Pei, Y.; Wang, C.; Li, Y. PatchTST: Training back-bone is all you need for time-series forecasting. arXiv 2022, arXiv:2211.14730.
  24. Aarnes, J.E.; Gimse, T.; Lie, K.A. An Introduction to the Numerics of Flow in Porous Media Using MATLAB; Springer: Berlin, Germany, 2007.
  25. Knoll, D.A.; Keyes, D.E. Jacobian-free Newton–Krylov methods: A survey of approaches and applications. J. Comput. Phys. 2004, 193, 357–397.
  26. Hairer, E.; Wanner, G. Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems; Springer Series in Computational Mathematics; Springer: Berlin, Germany, 1996.
  27. Kelley, C.T. Solving Nonlinear Equations with Newton’s Method; SIAM: Philadelphia, PA, USA, 2003.
Figure 1. Permeability distributions in the SPE10 Model 2. (A) The left panel shows $\log(K_h)$ for the static model. (B) The right panel shows $\log(K_v)$ for the static model. The color scale indicates the permeability values, with red representing high permeability and blue representing low permeability.
Figure 2. Physics-informed Fourier neural-operator framework integrating static geological inputs, dynamic well controls, and temporal encoders with Fourier operator blocks to predict subsurface pressure and saturation fields. Solid boxes denote trainable neural-network modules (encoders, Fourier blocks, and decoder), while dashed boxes represent physical input and output fields.
Figure 3. Statistical analysis of key training parameters. Top row (left to right): permeability distribution, $\log(K_x)$, in log(mD); porosity, $\phi$ (fraction); and initial water saturation, $S_w$ (−). Middle row: saturation evolution over normalized time, $t / t_{\text{total}}$ (blue curves: individual realizations; thick red curve: ensemble mean $\bar{S}_w$); normalized well locations with $X, Y \in [0, 1]$ (orange markers: wells; blue frame: domain); and wetting-phase viscosity, $\mu$ (cP) (bars: histogram; dashed green line: smoothed distribution). Bottom row: non-wetting-phase viscosity, $\mu$ (cP) (bars: histogram; dashed orange line: smoothed distribution).
Figure 4. Schematic of the proposed multi-task, time-aware FNO architecture.
Figure 5. Single- and multiphase saturation fields used for model validation. The saturation colormap spans from 0 (blue) to 1 (red). Panels (A) and (B) compare single-phase simulations in a homogeneous domain for our model and the reference solution from [24], respectively. Panels (A1,B1) show the top layer of SPE10 Model 2 at breakthrough for our model and the reference, while panels (A2,B2) show the corresponding bottom-layer saturation fields, demonstrating that our model reproduces the main displacement fronts and flow patterns.
Figure 6. Distribution of errors, $\|\hat{S} - S\|_\infty$, across normalized time $t^*$ for early ($t^* < 0.25$) and late ($t^* \geq 0.25$) phases.
Figure 7. Saturation fields at multiple injection times. (Top): ground truth. (Middle): conventional FNO. (Bottom): DB-AFNO. Red dashed boxes and circles in the middle row highlight oversmoothed regions and spurious saturation artifacts produced by the conventional FNO, which are not present in the DB-AFNO predictions. Per-panel error metrics corroborate DB-AFNO accuracy.
Figure 8. Zero-shot super-resolution on SPE10: saturation $\hat{S}$ and absolute error $|S - \hat{S}|$ at four times. Total coarse → fine pipeline: 4.5 min.
Figure 9. Layer-wise comparison for $L = 1{:}5$ at 60 × 220 resolution. Scatter plots report $R^2$, RMSE, and MAE; red dashed line denotes perfect agreement.
Table 1. State-Space Variables for Multiphase Flow in Porous Media.
Component                  Description/Equation
State vector $x$           $x = [S_{w,1}, S_{nw,1}, \ldots, S_{w,N}, S_{nw,N}]$
Parameter vector $p$       $p = [\phi, K, \mu_w, \mu_{nw}, \rho_w, \rho_{nw}, P_{cap}, M]$
Excitation vector $u(t)$   $u(t) = [q_{w,1}(t), q_{nw,1}(t), \ldots, q_{w,N}(t), q_{nw,N}(t)]$
Output vector $y$          $y = [S_{w,1}, S_{nw,1}, P_1, \ldots, S_{w,N}, S_{nw,N}, P_N]$
Table 2. Governing Equations for Multiphase Flow in Porous Media.
Component                  Description/Equation
Mass conservation          $\frac{\partial (\phi S_\psi)}{\partial t} + \nabla \cdot (v_\psi) = q_\psi$,  $\psi \in \{w, nw\}$
Darcy’s law                $v_\psi = -\frac{K k_{r\psi}}{\mu_\psi}\left(\nabla P_\psi - \rho_\psi g\right)$,  $\psi \in \{w, nw\}$
Saturation constraint      $S_w + S_{nw} = 1$
Capillary pressure         $P_{c\psi} = P_\psi - P_w$,  $\psi \in \{nw\}$
Relative permeabilities    $\lambda_w(s) = \frac{(s^*)^2}{\mu_w}$,  $\lambda_{nw}(s) = \frac{(1 - s^*)^2}{\mu_{nw}}$,  $s^* = \frac{s - s_{wc}}{1 - s_{nwr} - s_{wc}}$
Effective mobility         $k_{r\psi} = f(S_\psi)$,  $\psi \in \{w, nw\}$
Table 3. Discrete Saturation Dynamics and Coupling Terms.
Component                  Description/Equation
Saturation Dynamics        $\frac{\partial S_\psi}{\partial t} = \frac{1}{\phi}\left[\nabla \cdot \left(\frac{K k_{r\psi}}{\mu_\psi}\left(\nabla P_\psi - \rho_\psi g\right)\right) + q_\psi\right]$
                           or $\phi \frac{\partial S_\psi}{\partial t} + \nabla \cdot \left(f_\psi(s)\, v\right) = \frac{q_\psi}{\rho_\psi}$
Transmissibility, $T$      $T_{ij} = 2 |A_{ij}| \left(\frac{\Delta x_i}{\lambda_{i,ij}} + \frac{\Delta x_j}{\lambda_{j,ij}}\right)^{-1}$
Pressure Equation          Solve $T P = q$ (Two-Point Flux Approx.)
Coupling                   $\frac{\partial S_\psi}{\partial t} = \mathrm{diag}(f_{w,1}, \ldots, f_{w,N})\, A + [f_{i,1}, \ldots, f_{i,N}]$,
                           where $A$ captures velocity interactions and source scaling by local $PV$.
Notes: In the above tables, $S_{w,i}$ and $S_{nw,i}$ denote the saturations/concentration of the wetting and non-wetting phases in the $i$-th grid block, respectively; $\phi$ is the porosity; $K$ represents permeability; $\mu_w$ and $\mu_{nw}$ are the viscosities of the wetting and non-wetting phases; $\rho_w$ and $\rho_{nw}$ are their densities; $P_{cap}$ is the capillary pressure; and $M$ includes any additional parameters. The excitation vector $u(t)$ consists of injection/production rates $q_{w,i}(t)$ and $q_{nw,i}(t)$. The output vector $y$ pairs saturation and pressure measurements. Mass conservation (with source $q_\psi$) and Darcy’s law (with velocity $v_\psi$) govern flow, while the saturation constraint and capillary pressure ensure physical consistency. Relative permeabilities $\lambda_w(s)$ and $\lambda_{nw}(s)$ (with normalized saturation $s^*$) and effective mobility $k_{r\psi} = f(S_\psi)$ model fluid transport properties. The transmissibility matrix $T$ (computed via TPFA) couples the pressure field $P$ to the source $q$, and the pressure-to-saturation coupling (via matrix $A$) reflects the influence of the velocity field and local pore volume ($PV$) on saturation evolution.
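To make the transmissibility and pressure rows of Table 3 concrete, the sketch below assembles and solves the TPFA system for a one-dimensional row of cells with no-flow boundaries. It is a toy illustration under stated assumptions: unit face area by default, a dense matrix, and the pressure level pinned by adding 1 to the first diagonal entry (the pure no-flow system is otherwise singular). Function names are hypothetical.

```python
import numpy as np


def tpfa_transmissibilities(lam, dx, area=1.0):
    """Interior-face transmissibilities T_ij = 2|A_ij| / (dx_i/lam_i + dx_j/lam_j)."""
    lam = np.asarray(lam, dtype=float)
    dx = np.broadcast_to(np.asarray(dx, dtype=float), lam.shape)
    return 2.0 * area / (dx[:-1] / lam[:-1] + dx[1:] / lam[1:])


def solve_pressure(lam, dx, q, area=1.0):
    """Assemble the TPFA matrix for a 1-D row of cells and solve T P = q."""
    T = tpfa_transmissibilities(lam, dx, area)
    n = len(q)
    M = np.zeros((n, n))
    for i, t in enumerate(T):            # face between cells i and i + 1
        M[i, i] += t
        M[i + 1, i + 1] += t
        M[i, i + 1] -= t
        M[i + 1, i] -= t
    M[0, 0] += 1.0                       # pin the pressure level (assumed convention)
    return np.linalg.solve(M, np.asarray(q, dtype=float))
```

With unit mobilities, unit spacing, an injector in the first cell and a producer in the last, each face carries unit flux and the pressure drops by one unit per cell, which is a convenient sanity check before moving to heterogeneous $\lambda$.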
Table 4. Statistical summary of petrophysical properties and fluid parameters. For each parameter, $N$ is the number of samples, $\mu$ is the mean, $\sigma$ is the standard deviation, and Min/Max denote the range of values.
Parameter            Unit   $N$                     $\mu$                    $\sigma$                 Min/Max
$K_x$                mD     $2.89 \times 10^{6}$    $3.86 \times 10^{2}$     $1.16 \times 10^{3}$     $5.08 \times 10^{-3}$ / $3.98 \times 10^{4}$
$\phi$               -      $2.89 \times 10^{6}$    $1.92 \times 10^{-1}$    $8.60 \times 10^{-2}$    $1.00 \times 10^{-1}$ / $5.00 \times 10^{-1}$
$S_{w,\text{init}}$  -      $2.89 \times 10^{6}$    $1.97 \times 10^{-1}$    $5.81 \times 10^{-2}$    $1.00 \times 10^{-1}$ / $3.00 \times 10^{-1}$
$\mu_w$              cP     $8.75 \times 10^{2}$    $3.48 \times 10^{-1}$    $8.61 \times 10^{-2}$    $2.00 \times 10^{-1}$ / $5.00 \times 10^{-1}$
$\mu_{nw}$           cP     $8.75 \times 10^{2}$    $2.97$                   $5.62 \times 10^{-1}$    $2.00$ / $4.00$
Table 5. Training and configuration settings.
Block      Item                      Setting
Data       Normalization             z-score per channel; separate $S$, $P$
Model      Temporal/Coord emb.       64-D time/step; grid $(x, y, (z))$
FNO        Width/Layers              $W = 64$, $L = 8$
FNO        Activation/Residual       GELU; skips
FNO        Paths                     Spectral conv + depthwise conv
FNO        Fourier modes (curric.)   $(2, 2, 2) \rightarrow (6, 6, 3) \rightarrow (18, 18, 8)$
Decoders   S/P                       S: iFFT→MLP; P: Conv, dropout $p = 0.2$
Fusion     Gating/Amp.               Cross-gated conv + $\sigma$; $\gamma_P = 2.0$
Optim.     Batch/Optimizer           64; AdamW ($\text{lr} = 5 \times 10^{-3}$, wd $10^{-2}$)
Optim.     LR schedule               Cosine restarts ($T_0 = 100$, $T_{\text{mult}} = 2$, $\eta_{\min} = 10^{-8}$)
Optim.     Grad clip                 Global norm = 2
Losses     Data/Physics              Huber ($\beta = 0.01$) on $S$, $P$; $\lambda_{\text{phys}}(e) = \exp(e / 25)$
Protocol   Sampling/Split            Early t 10 × 2; 80/20 train/val
Protocol   Curriculum/CKPT           $t_{\max}$ { 25, 100, 500, }; epochs { 50, 50, 50, 500 }; best val
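The cosine-restart schedule in Table 5 can be written out explicitly. The sketch below uses the standard warm-restart (SGDR) formula with the listed $T_0 = 100$, $T_{\text{mult}} = 2$ and $\eta_{\min} = 10^{-8}$, reading the base learning rate as $5 \times 10^{-3}$; the function name is hypothetical and the training framework's own scheduler would normally be used instead.

```python
import math


def cosine_restart_lr(epoch, lr_max=5e-3, eta_min=1e-8, t0=100, t_mult=2):
    """Learning rate at an integer epoch under cosine annealing with warm restarts."""
    t_i, t_cur = t0, epoch
    while t_cur >= t_i:          # locate the current cycle and the position inside it
        t_cur -= t_i
        t_i *= t_mult            # each cycle is t_mult times longer than the last
    cosine = 1.0 + math.cos(math.pi * t_cur / t_i)
    return eta_min + 0.5 * (lr_max - eta_min) * cosine
```

With these settings the rate restarts at epochs 100, 300 and 700, so successive cycles last 100, 200 and 400 epochs.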
Table 6. Runtime benchmark for zero-shot upscaling vs. full-order TPFA on five unseen layers.
Step                               Wall-Time   Comment
Full-order TPFA on 60 × 220 × 5    ≈15 min     Ground truth
Coarse TPFA on 30 × 110 × 5        3 min       Generates inputs
DB-AFNO zero-shot upscale          1.5 min     Single forward pass
Total                              4.5 min     ∼3× faster
Note: All runs on a single RTX Titan; identical simulator settings except grid resolution.
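The speedup figure follows from simple arithmetic on the wall times listed in Table 6; the short check below uses only those three numbers (variable names are illustrative).

```python
full_order_min = 15.0   # full-order TPFA on the fine grid (approximate)
coarse_min = 3.0        # coarse TPFA run that generates the inputs
upscale_min = 1.5       # DB-AFNO zero-shot upscale

pipeline_min = coarse_min + upscale_min
speedup = full_order_min / pipeline_min
print(f"pipeline: {pipeline_min} min, speedup: {speedup:.2f}x")
```

The ratio is about 3.3, which the table rounds to ∼3×.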
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Al Hashim, H.; Elyas, O.; Williams, J. A Dual-Branch Coupled Fourier Neural Operator for High-Resolution Multi-Phase Flow Modeling in Porous Media. Water 2025, 17, 3351. https://doi.org/10.3390/w17233351
