Prediction/Assessment of CO2 EOR and Storage Efficiency in Residual Oil Zones Using Machine Learning Techniques

Abdulwarith, Abdulrahman; Ammar, Mohamed; Dindoruk, Birol

doi:10.3390/en18205498


Prediction/Assessment of CO₂ EOR and Storage Efficiency in Residual Oil Zones Using Machine Learning Techniques^†

by Abdulrahman Abdulwarith ¹, Mohamed Ammar ² and Birol Dindoruk ²,*

¹ Department of Petroleum Engineering, University of Houston, Houston, TX 77204, USA

² Department of Petroleum Engineering, Texas A&M University, College Station, TX 77843, USA

* Author to whom correspondence should be addressed.

† This article is an extended version of our paper presented at the SPE/AAPG/SEG Carbon Capture, Utilization, and Storage Conference and Exhibition, Houston, Texas, USA, March 2024 (Paper ID: SPE-CCUS-2024-4011705).

Energies 2025, 18(20), 5498; https://doi.org/10.3390/en18205498 (registering DOI)

Submission received: 3 July 2025 / Revised: 2 September 2025 / Accepted: 3 October 2025 / Published: 18 October 2025


Abstract

Residual oil zones (ROZs) arise under the oil–water contact of main pay zones due to diverse geological conditions. Historically, these zones were considered economically unviable for development with conventional recovery methods because of the immobile nature of the oil. However, they represent a substantial subsurface volume with strong potential for CO₂ sequestration and storage. Despite this potential, effective techniques for assessing CO₂-EOR performance coupled with CCUS in ROZs remain limited. To address this gap, this study introduces a machine learning framework that employs artificial neural network (ANN) models trained on data generated from a large number of reservoir simulations (300 cases produced using Latin Hypercube Sampling across nine geological and operational parameters). The dataset was divided into training and testing subsets to ensure generalization, with key input variables including reservoir properties (thickness, permeability, porosity, Sorg, salinity) and operational parameters (producer BHP and CO₂ injection rate). The objective was to forecast CO₂ storage capacity and oil recovery potential, thereby reducing reliance on time-consuming and costly reservoir simulations. The developed ANN models achieved high predictive accuracy, with R² values ranging from 0.90 to 0.98 and mean absolute percentage relative error (MAPRE) consistently below 10%. Validation against real ROZ field data demonstrated strong agreement, confirming model reliability. Beyond prediction, the workflow also provided insights for reservoir management: optimization results indicated that maintaining a producer BHP of approximately 1250 psi and a CO₂ injection rate of 14–16 MMSCF/D offered the best balance between enhanced oil recovery and stable storage efficiency. In summary, the integrated combination of reservoir simulation and machine learning provides a fast, technically robust, and cost-effective tool for evaluating CO₂-EOR and CCUS performance in ROZs. The demonstrated accuracy, scalability, and optimization capability make the proposed ANN workflow well-suited for both rapid screening and field-scale applications.

Keywords:

residual oil zone (ROZ); CO₂-EOR; CCUS; artificial neural network (ANN); reservoir simulation; machine learning; optimization

1. Introduction

Geological CO₂ storage is considered an important technique for reducing the CO₂ emissions to the atmosphere that contribute to climate change. CO₂ can be stored in various geological formations such as depleted oil and gas fields, shale gas layers, deep saline aquifers, coal beds, geothermal reservoirs, and hydrate-bearing formations [1,2,3,4,5,6].

CO₂ enhanced oil recovery (CO₂-EOR) has been widely used for decades as an EOR method for medium and light oil production in conventional and unconventional oil reservoirs. When CO₂ is injected into the reservoir, the oil swells, oil viscosity and interfacial tension decrease, oil components vaporize, the capillary number increases, and both the sweep and displacement efficiencies increase [7,8]. In addition, the injected CO₂ is retained in the reservoir through structural trapping, residual trapping, solubility trapping, and mineral trapping [9]. Most previous studies on CO₂-EOR and storage focused on the main pay zone [10,11,12]. Recently, CO₂-EOR and storage in residual oil zones (ROZs) have drawn significant attention because of successful commercial CO₂-EOR projects in ROZs. A residual oil zone can be defined as an interval of reservoir rock that contains oil that is immobile with respect to the formation water, at the level of residual oil saturation, typically 40% or less [13]. ROZs can be categorized into two primary groups: brownfields and greenfields, as illustrated in Figure 1. Brownfield ROZs are located beneath a conventional reservoir’s main pay zone (MPZ), the subsurface interval where oil is traditionally extracted using conventional primary and/or enhanced recovery methods. Greenfield ROZs, on the other hand, are found in regions lacking an overlying conventional oil formation and are frequently identified as hydrodynamic fairways [14]. As many large oil reservoirs reach depletion and CO₂-EOR is implemented, ROZs may become attractive targets for increased oil recovery. CO₂-EOR is increasingly used for oil production in areas with documented ROZs, primarily in the Permian Basin [15,16,17]. One study characterized CO₂ storage and EOR in ROZs using Monte Carlo simulations and sensitivity analysis on geological and operational parameters [15].

Ren and Duncan [18] used reservoir simulation to investigate the hydrodynamic effects of water flow in the aquifer at the base of oil zones and emphasized the importance of these factors in assessing the EOR and storage capacity of ROZs. Kumar and Bandyopadhyay [19] introduced the use of dimensional analysis and pulser process for quantifying and discerning the production of the MPZ and ROZ due to CO₂ flooding without the need for numerical simulations. Recently, machine learning techniques have grown in popularity for developing computationally fast proxy models, surrogate models, and predictive empirical models in subsurface modeling. Several machine learning algorithms have been widely used in the prediction of reservoir production and performance assessment [20,21]. Although ROZs have emerged as potential reservoirs for CCUS, efficient tools for evaluating CO₂-EOR performance coupled with CCUS in ROZs are still lacking.

In this study, we use a 3D reservoir model to simulate the CO₂ injection process. The reservoir properties are referenced to one of the potential ROZs in the Permian Basin (Goldsmith–Landreth San Andres Unit). The main objectives of this work are the following:

Evaluate CO₂ injection for EOR and CCUS in ROZs in terms of cumulative oil production, cumulative CO₂ injection, and CO₂ retained in each phase;
Evaluate the sensitivity of the results to uncertainty in the reservoir rock properties (net pay thickness, permeability, vertical-to-horizontal permeability ratio, porosity, residual oil saturation to water flood, residual oil saturation to gas flood, formation water salinity) and in the operational parameters (producer BHP and gas injection rate);
Develop a proxy model that predicts the performance of CO₂-EOR-CCUS using machine learning techniques, to provide rapid screening and evaluation of the injection performance.

2. Theory and Approach

2.1. Governing Equations

The mass conservation equation, including molecular diffusion, for each component i present in the oil, gas, and water phases can be expressed as follows:

\frac{\partial}{\partial t} \left( \phi \sum_{l=1}^{N_p} \rho_l S_l m_{il} \right) + \nabla \cdot \left( \sum_{l=1}^{N_p} \left( \rho_l m_{il} v_l - \phi \rho_l S_l D_{il} \nabla m_{il} \right) \right) - q_i = 0, \quad i = 1, \ldots, N_c

(1)

where t is the time, φ is the porosity, ρ_l is the density of phase l, S_l is the saturation of phase l, m_il is the weight fraction of component i in phase l, v_l is the Darcy velocity of phase l, D_il is the molecular diffusion coefficient of component i in phase l, q_i is the production or injection mass rate of component i, N_p is the number of phases, and N_c is the number of components.

The Darcy velocity of each phase is given by Darcy’s law:

v_l = - \frac{K k_{rl}}{\mu_l} \left( \nabla P_l - \rho_l g \right)

(2)

where K is the absolute permeability of the rock, k_rl is the relative permeability of phase l, P_l is the pressure of phase l, μ_l is the phase viscosity, ρ_l is the phase density, and g is the gravitational acceleration.

The mass rate of production or injection can be expressed as:

q_i = \sum_{l=1}^{N_p} \rho_l m_{il} PI_l \left( P_{well}^{l} - P_{block}^{l} \right)

(3)

where P_well is the wellbore pressure, P_block is the grid block pressure, and PI is the productivity index.

The two-phase water–oil and liquid–gas relative permeabilities are taken from relative permeability tables, while the three-phase oil relative permeability is computed with Stone’s model II [22], written here in normalized form (oil relative permeability at connate water equal to 1):

k_{ro} = (k_{rog} + k_{rg}) (k_{row} + k_{rw}) - (k_{rw} + k_{rg})

(4)
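
As a minimal sketch of how Equation (4) can be evaluated, the function below implements the product form of Stone's model II. The function name, the optional `krocw` argument, the clipping of negative results to zero, and the sample table values are assumptions made for illustration; they are not taken from the paper or from the simulator.

```python
def stone2_kro(krow, krw, krog, krg, krocw=1.0):
    """Three-phase oil relative permeability from Stone's model II.

    krow, krw : oil and water relative permeabilities from the water-oil table
    krog, krg : oil and gas relative permeabilities from the liquid-gas table
    krocw     : oil relative permeability at connate water (1.0 = normalized form)
    """
    kro = krocw * ((krow / krocw + krw) * (krog / krocw + krg) - (krw + krg))
    # The raw expression can become negative; clipping to zero is an
    # assumption of this sketch, not something stated in the paper.
    return max(kro, 0.0)


# Illustrative table values only (not taken from the paper).
print(stone2_kro(krow=0.30, krw=0.05, krog=0.40, krg=0.02))
```

With no gas present (krg = 0 and krog = 1 in the normalized form), the expression reduces to the two-phase value krow, which is a quick consistency check on the implementation.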

The mass exchange between the oil and gas phases is modeled using thermodynamic phase equilibrium, which is defined by the equality of the fugacities of each component i in the gas and oil phases:

f_{g}^{i} = f_{o}^{i}

(5)

The governing equations are solved subject to the following initial and boundary conditions. The initial reservoir pressure is set at 2000 psia, which is above the calculated MMP of 1500 psia, ensuring miscibility. The initial oil saturation is assumed to be at residual saturation (0.40), with corresponding water saturation. Reservoir temperature is fixed at 140 °F, consistent with the Seminole Field conditions.

Boundary conditions are applied as follows:

Lateral boundaries: no-flow boundaries are imposed on the reservoir sides;
Top boundary: an open boundary condition is applied to allow buoyant CO₂ migration;
Bottom boundary: aquifer support is modeled to maintain reservoir pressure;
Well constraints: injectors are subject to a maximum bottomhole pressure (BHP) of 4000 psia (fracture limit), while producers are constrained by specified BHP values (250–1500 psia) as summarized in Table 1.

2.2. CO₂ Trapping Mechanism

This study focuses on three main mechanisms of CO₂ storage explicitly: solubility, residual, and structural trapping.

2.3. Solubility Trapping

Depending on the reservoir temperature, the minimum miscibility pressure (MMP), and the oil properties, CO₂ can either dissolve in the oil or become miscible with it. The supercritical characteristics of CO₂ play a crucial role in penetrating the oil surface, leading to swelling and a reduction in viscosity. The study in [23] demonstrated how the solubility of CO₂ varies with temperature and pressure: solubility tends to increase with higher reservoir pressure and API gravity and to decrease with higher reservoir temperature [23]. CO₂ can also partition into the water phase; to model the CO₂ solubility in the aqueous phase, Henry’s law is used:

f_{i}^{aq} = y_{i,aq} H_{i}

(6)

where f_i^aq is the fugacity of component i in the aqueous phase, y_i,aq is the mole fraction of component i in the aqueous phase, and H_i is the Henry’s constant of component i.

Henry’s law constants are functions of temperature, pressure, and water salinity. They can be estimated from the molar volume at a given pressure and temperature together with a known Henry’s constant at a reference pressure and temperature, assuming fixed salinity and temperature. However, this approach may not be applicable in thick reservoirs. A Henry’s constant correlation, such as the Harvey (1996) correlation shown in Equations (7) and (8) [24], provides more flexibility in such situations:

\ln H_{i}^{s} = \ln P_{H_2O}^{s} + A (T_{r,H_2O})^{-1} + B (1 - T_{r,H_2O})^{0.355} (T_{r,H_2O})^{-1} + C \exp(1 - T_{r,H_2O}) (T_{r,H_2O})^{-0.41}

(7)

where H_i^s is Henry’s constant at the water saturation pressure P_H2O^s and T_r,H2O is the reduced temperature of water. A, B, and C are constants; for CO₂ they are −9.4234, 4.0087, and 10.3199, respectively.
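
The correlation in Equation (7) can be sketched in a few lines. This is a hedged illustration rather than the simulator's implementation: the critical temperature of water (647.096 K) and the water saturation pressure at 140 °F (about 0.01995 MPa) are values supplied here as assumptions, and the exponent 0.355 follows the published Harvey correlation. The result carries the units of the saturation pressure passed in.

```python
import math

# Harvey (1996) coefficients for CO2, as quoted in the text.
A, B, C = -9.4234, 4.0087, 10.3199
TC_WATER_K = 647.096  # critical temperature of water in K (assumed, not from the paper)


def ln_henry_at_saturation(temp_k, p_sat_water):
    """Return ln(H_i^s) from Eq. (7); H_i^s has the units of p_sat_water."""
    tr = temp_k / TC_WATER_K  # reduced temperature of water
    return (math.log(p_sat_water)
            + A / tr
            + B * (1.0 - tr) ** 0.355 / tr
            + C * math.exp(1.0 - tr) * tr ** -0.41)


# 140 °F = 333.15 K; the saturation pressure of water at that temperature is
# taken as 0.01995 MPa (an assumed input, not a value reported in the paper).
h_sat = math.exp(ln_henry_at_saturation(333.15, 0.01995))
print(f"Henry's constant of CO2 at the saturation pressure: {h_sat:.0f} MPa")
```

The pressure correction of Equation (8) and any salinity correction are not included in this sketch.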

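The Henry’s law constant at a given P and T is expressed as:

\ln H_{i} = \ln H_{i}^{s} + \frac{1}{R T} \int_{P_{H_2O}^{s}}^{P} \bar{v}_{i} \, dP

(8)

where \bar{v}_i is the partial molar volume of component i in the aqueous phase and R is the universal gas constant.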

The solubility trapping efficiency can be calculated using Equation (9):

\text{Solubility trapping index} = \frac{\text{Total dissolved mass of injected CO}_2 \text{ in brine}}{\text{Total injected CO}_2 \text{ mass}}

(9)

2.4. Residual Trapping

Residual trapping is an important CO₂ trapping mechanism. Hysteresis allows capillary pressures and relative permeabilities to vary between the imbibition and drainage curves through scanning curves. Capillary pressure follows the drainage curve for decreasing wetting-phase saturation and the imbibition curve for increasing wetting-phase saturation. When the direction of saturation change reverses, capillary pressure follows the scanning curves. Entrapment of the nonwetting phase occurs when it is bypassed by the wetting phase, which makes it immobile, as shown in Figure 2. Several correlations for modeling hysteresis have been presented in the literature. In this study, the hysteresis in relative permeability is modeled with the Land correlation [25], Equations (10) and (11):

S_{grh} = \frac{S_{gh} - S_{gic}}{1 + C (S_{gh} - S_{gic})}

(10)

C = \frac{1}{S_{grmax} - S_{gcrit}} - \frac{1}{S_{gmax} - S_{gcrit}}

(11)

where S_grh is the residual gas saturation of the imbibition process, S_gh is the historical maximum attained gas saturation, S_gic is the critical reversal saturation for trapping, S_gcrit is the critical gas saturation, S_grmax is the maximum residual gas saturation, and S_gmax is the maximum gas saturation.
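
The two Land expressions, Equations (10) and (11), translate directly into code. The sketch below is illustrative only: the function names and the saturation end points are hypothetical, and returning zero when the historical maximum gas saturation does not exceed the reversal saturation is an assumption of this example.

```python
def land_coefficient(sgr_max, sg_max, sg_crit):
    """Land's coefficient C from Eq. (11)."""
    return 1.0 / (sgr_max - sg_crit) - 1.0 / (sg_max - sg_crit)


def trapped_gas_saturation(sg_hist, sg_ic, c_land):
    """Residual gas saturation S_grh from Eq. (10).

    sg_hist : historical maximum gas saturation reached in the grid block
    sg_ic   : critical reversal saturation for trapping
    """
    excess = sg_hist - sg_ic
    if excess <= 0.0:
        # No trapping below the reversal saturation (assumption of this sketch).
        return 0.0
    return excess / (1.0 + c_land * excess)


# Hypothetical end points, chosen only to exercise the functions.
c = land_coefficient(sgr_max=0.40, sg_max=0.80, sg_crit=0.05)
for sg_hist in (0.20, 0.40, 0.60, 0.80):
    print(sg_hist, round(trapped_gas_saturation(sg_hist, 0.05, c), 4))
```

The trapped saturation grows with the historical maximum gas saturation but with diminishing returns, which is the behavior that makes residual trapping sensitive to how far the CO₂ plume advances before imbibition begins.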

The residual trapping efficiency can be calculated using Equation (12):

\text{Residual trapping index} = \frac{\text{Total trapped mass of injected CO}_2}{\text{Total injected CO}_2 \text{ mass}}

(12)

2.5. Reservoir Simulation Model Setup

A three-dimensional reservoir model was constructed using CMG GEM to explore two main objectives: (1) assessing the performance of CO₂ flooding for enhanced oil recovery (EOR), and (2) determining the amount of CO₂ sequestered in the reservoir through residual and solubility trapping. The reservoir model consists of 36 grid blocks in the x-direction, 36 grid blocks in the y-direction, and 10 grid blocks in the z-direction, with horizontal grid sizes of 120 ft and 122 ft in the i and j directions. A five-spot pattern, depicted in Figure 3, is employed for evaluating CO₂ injection for both EOR and carbon capture, utilization, and storage (CCUS). The initial oil saturation is assumed to be at a residual saturation of 0.4. Relative permeability curves, shown in Figure 4, are referenced from the Goldsmith–Landreth San Andres Unit. Injectors and producers are assumed to be completed over the entire reservoir interval. CO₂ is injected for 10 years, with production from the producers over the same period. The simulation is run for 100 years, which includes a 90-year post-injection period. A maximum bottomhole pressure (BHP) constraint, set at the rock fracturing pressure of 4000 psi, is imposed on the injectors. Reservoir rock properties (porosity, permeability, thickness, Sor (residual oil saturation to water flood), Sorg (residual oil saturation to gas flood), and formation water salinity) and operational parameters (producer BHP and CO₂ injection rate) are considered as variables for the sensitivity analysis, as outlined in Table 1.

The reservoir fluid composition is taken from the ROZ fluid composition of the Seminole Field, Permian Basin, reported in [27], as shown in Table 2.

2.6. Minimum Miscibility Pressure (MMP) Determination and Reservoir Fluid Characterization

The minimum miscibility pressure (MMP) computed using the UH_MMP Calculator [28] was 1500 psia, which is lower than the reservoir pressure of 2000 psia. The components of the reservoir oil were lumped into 10 pseudo-components, and the parameters of the Peng–Robinson equation of state were fitted to experimental data from the constant composition expansion (CCE) test and the differential liberation (DL) test.

2.7. Machine Learning Framework

In addition to physics-based simulation, this study incorporates artificial neural networks (ANN) to develop a predictive proxy model for CO₂–EOR and storage in ROZs. An ANN is a supervised learning algorithm inspired by biological neural networks, consisting of input, hidden, and output layers. Each hidden layer applies nonlinear activation functions (ReLU in this study) to capture complex relationships among input variables. The ANN minimizes a loss function (mean squared error, MSE) using the Adam optimizer to iteratively update weights and biases. To prevent overfitting, early stopping criteria are applied during training. Input data were normalized using min–max scaling, and principal component analysis (PCA) was employed to reduce dimensionality and highlight the most influential features.

In total, 300 simulation cases were generated using Latin Hypercube Sampling (LHS) across the selected geological and operational parameters. The dataset was randomly split into 80% training (240 cases) and 20% testing (60 cases) to evaluate model generalization. An additional set of one-parameter-at-a-time (OAT) runs was performed for sensitivity and reference purposes.
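
The sampling and splitting steps can be sketched with the standard library alone. This is an illustrative sketch, not the authors' workflow: the parameter names, the porosity and injection-rate ranges, and the random seeds are hypothetical (only the producer BHP range of 250–1500 psia appears in the text), and only three of the nine parameters are shown.

```python
import random


def latin_hypercube(n_cases, bounds, seed=0):
    """Draw n_cases samples so that, for every parameter, each of the
    n_cases equal-width strata of its range is used exactly once."""
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in bounds.items():
        strata = list(range(n_cases))
        rng.shuffle(strata)  # random pairing of strata across parameters
        columns[name] = [low + (k + rng.random()) / n_cases * (high - low)
                         for k in strata]
    return [{name: columns[name][i] for name in bounds} for i in range(n_cases)]


# Hypothetical ranges for illustration; the ranges actually used are in Table 1.
bounds = {
    "porosity": (0.05, 0.25),
    "producer_bhp_psia": (250.0, 1500.0),
    "co2_rate_mmscfd": (5.0, 20.0),
}
cases = latin_hypercube(300, bounds, seed=42)

# Random 80/20 split into training and testing cases.
rng = random.Random(42)
rng.shuffle(cases)
n_train = len(cases) * 8 // 10
train_cases, test_cases = cases[:n_train], cases[n_train:]
print(len(train_cases), len(test_cases))  # 240 60
```

Each sampled case would then be written to a simulator input deck; that step depends on the simulator and is outside this sketch.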

The ANN is particularly suited for enhanced oil recovery (EOR) in ROZs because reservoir performance depends on nonlinear and coupled interactions of geological and operational parameters (e.g., porosity, permeability, BHP, injection rate). By training the ANN on a large dataset generated from reservoir simulations, the model learns to map these inputs directly to outcomes such as cumulative oil production and CO₂ trapping efficiencies, thereby eliminating the need for time-intensive numerical simulations.

2.8. Workflow for Generating the Dataset for the Machine Learning Model

After building the physics-based compositional reservoir simulation model, a large dataset needs to be generated to train the predictive machine learning model. In this study, the numerical model was used to generate a dataset covering the uncertainties in geological and reservoir properties. Nine parameters were investigated. The samples for the sensitivity analysis were drawn with the Latin Hypercube sampling method, which divides the cumulative distribution function (CDF) of each parameter into equal segments and then chooses a random data point in each segment. This sampling method covers the parameter space with a limited number of reservoir simulation runs. After generating all sampled cases, the reservoir simulation is run to produce results in terms of cumulative oil production and cumulative CO₂ trapped in each phase due to residual trapping and solubility trapping. The outcomes of the reservoir simulation model are checked for quality before being passed to the machine learning model, as summarized in Figure 5. The machine learning model divides the dataset into training and testing/validation portions. The performance of the machine learning model is then evaluated using the root mean square error (RMSE) and the coefficient of determination (R²):

RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_{i}^{actual} - y_{i}^{predicted} \right)^{2}}

(13)

R^{2} = 1 - \frac{\sum_{i=1}^{n} \left( y_{i}^{actual} - y_{i}^{predicted} \right)^{2}}{\sum_{i=1}^{n} \left( y_{i}^{actual} - y^{mean} \right)^{2}}

(14)

where n is the number of cases and y^mean is the mean of the actual values.
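
Equations (13) and (14) can be checked with a short, dependency-free sketch. The function names and the four sample values are illustrative and are not results from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean square error, Eq. (13)."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination, Eq. (14)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up values, used only to exercise the two functions.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
print(f"RMSE = {rmse(actual, predicted):.3f}, R2 = {r_squared(actual, predicted):.3f}")
```

RMSE keeps the units of the target, so it is not comparable across outputs such as oil volume and CO₂ mass, whereas R² is dimensionless.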

To further analyze the influence of input parameters on model outputs, a correlation analysis was carried out. We employed the Pearson correlation coefficient, a well-established statistical measure, to quantify the strength and direction of linear relationships between input parameters and outcomes (oil recovery, CO₂ dissolved, residual and structural trapping). Prior to analysis, all variables were normalized using min–max scaling, and anomalous simulation results were excluded. The correlation analysis was conducted on the full set of 300 valid cases, and the results are presented in the following section. This approach, based on a widely recognized statistical equation, ensures that the analysis is reproducible and transparent.
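
For reference, the Pearson coefficient used in this analysis can be computed as in the sketch below. The salinity and dissolved-CO₂ values are invented for illustration and do not come from the simulation dataset.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented values: higher salinity paired with lower dissolved CO2.
salinity_ppm = [50_000.0, 100_000.0, 150_000.0, 200_000.0]
co2_dissolved = [9.0, 7.5, 6.5, 5.0]
print(round(pearson_r(salinity_ppm, co2_dissolved), 3))
```

Pearson's r is unchanged by any positive linear rescaling of either variable, so min–max scaling affects neither its sign nor its magnitude. It captures only linear association, which is one reason a nonlinear model such as an ANN can still add information beyond the correlation matrix.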

3. Results and Discussion

3.1. Base Case Reservoir Simulation Results

The base case reservoir simulation model was run with the geological and operational parameters summarized in Table 3. CO₂ was injected for 10 years, and cumulative oil production reached about 32 MMSTB, as shown in Figure 6. The storage profile for residual, solubility, and structural trapping is shown in Figure 7. Most of the CO₂ was stored by structural trapping and by residual trapping, the latter due to the hysteresis effect described earlier. A smaller amount of CO₂ was dissolved in water because of the high salinity of 200,000 ppm. Additionally, the total CO₂ in the supercritical phase decreases once CO₂ breakthrough occurs at the producer well, as is evident in the cumulative CO₂ production profile at the producer.

3.2. Machine Learning Model

Dataset Description

Reservoir simulation realizations were used as input for the machine learning model in this study, an artificial neural network (ANN). The ranges of the reservoir rock, fluid, and operational parameters, summarized in Table 1, were used to generate the cases. The inputs for the ANN model were net pay thickness, horizontal permeability, ratio of vertical to horizontal permeability, porosity, residual oil saturation to water flood, residual oil saturation to gas flood, formation water salinity, producer bottom hole pressure, and CO₂ injection rate. The final machine learning dataset consisted of 300 Latin Hypercube Sampling cases, ensuring broad coverage of the uncertainty space. These cases were supplemented by additional OAT runs, which were used for sensitivity checks. The input parameters were used to train separate ANN models that predict the cumulative oil produced, CO₂ dissolved in water, CO₂ trapped due to residual hysteresis, and CO₂ trapped structurally. Before training, the dataset was screened for anomalous data. Simulation runs that terminated abnormally, produced non-physical outputs, or generated extreme outliers due to numerical instabilities were removed. The remaining dataset (300 valid cases) was normalized using min–max scaling, and statistical checks were applied to ensure consistency of parameter ranges and outputs. This preprocessing step improved the dataset quality and supported the reliability of the ANN training and testing.
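
Min–max scaling maps each variable to the interval [0, 1]. The sketch below learns the bounds from the training values and reuses them for the testing values; whether the authors computed the bounds on the training subset or on the full dataset is not stated, so that choice is an assumption here, as are the function names and sample pressures.

```python
def fit_min_max(train_values):
    """Return the (min, max) pair learned from the training values."""
    return min(train_values), max(train_values)


def apply_min_max(values, bounds):
    """Scale values to [0, 1] using previously learned bounds."""
    low, high = bounds
    if high == low:
        # A constant column carries no information; map it to zero.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Producer BHP values in psia, chosen inside the 250-1500 psia range.
train_bhp = [250.0, 875.0, 1500.0]
test_bhp = [562.5, 1250.0]
bhp_bounds = fit_min_max(train_bhp)
print(apply_min_max(train_bhp, bhp_bounds))  # [0.0, 0.5, 1.0]
print(apply_min_max(test_bhp, bhp_bounds))   # [0.25, 0.8]
```

Reusing the training bounds keeps information from the testing cases out of the preprocessing step, at the cost of scaled test values that can fall slightly outside [0, 1].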

3.3. Summary of the Correlation Coefficient of the Dataset per Input Parameter

To assess the impact of each input on the output values, a correlation coefficient analysis was conducted, and the results are summarized in Figure 8. The analysis revealed that an increase in the vertical-to-horizontal permeability ratio (kv/kh) led to an increase in cumulative oil production while decreasing the amount of CO₂ stored in the reservoir. The increase in production can be attributed to enhanced sweep efficiency, which allows the injected CO₂ to reach and mobilize more oil. At the same time, a higher kv/kh promotes gravity-driven upward movement of CO₂, leading to faster migration of CO₂ toward the top of the reservoir. This, in turn, reduced the amount of CO₂ trapped and stored in the reservoir. Residual saturation for both water and gas flooding exhibited a small correlation with both cumulative oil production and CO₂ storage. Increased horizontal permeability showed a positive correlation with cumulative oil production and a negative correlation with structural trapping. Porosity showed a positive correlation with both cumulative oil production and CO₂ storage. The producer’s bottom hole pressure showed a negative correlation with cumulative oil production and a positive correlation with CO₂ storage.

Furthermore, the CO₂ injection rate exhibited a positive correlation with both CO₂ storage and cumulative oil production. Salinity displayed a negative correlation with CO₂ dissolved in water, as expected: as salinity increases, the solubility of CO₂ in water decreases, and with it the solubility-trapping capacity.

To improve the interpretability of the ANN results, the influence of input parameters was examined in the context of underlying physical mechanisms. For example, the negative correlation between Sorg (gas-driven residual oil saturation) and cumulative oil recovery aligns with displacement theory, as higher Sorg reduces the proportion of mobile oil available for production while increasing the fraction of oil retained in pore spaces. Similarly, the positive influence of porosity and permeability on recovery and storage corresponds with their roles in the mass conservation and Darcy flow equations (Equations (1) and (2)), where larger pore volume and higher transmissibility enhance fluid mobility and sweep efficiency. The ANN-predicted reduction of dissolved CO₂ at higher salinity is consistent with Henry’s law (Equations (7) and (8)). These consistencies between machine learning outputs and fundamental reservoir mechanics strengthen the interpretability of the proposed framework.

3.4. ANN Models Configuration

In this study, an artificial neural network (ANN) model was constructed to predict the cumulative oil production in a reservoir based on a set of selected input parameters. The model architecture consists of multiple dense layers with rectified linear unit (ReLU) activation functions, allowing the network to capture complex relationships within the data. The input features were first normalized using min–max scaling to ensure consistent input ranges and improve model convergence. To further enhance the model’s ability to capture essential patterns within the data, principal component analysis (PCA) was applied to reduce the dimensionality of the input space. The resulting principal components were then used as input features for the ANN.

This dimensionality reduction not only reduced the computational complexity but also facilitated the identification of key features influencing cumulative oil production and CO₂ storage. The model was trained using the mean squared error (MSE) loss function and the Adam optimizer with a learning rate of 0.01. Training was monitored with an early stopping criterion, which halted training once the validation loss reached a plateau, preventing overfitting and preserving the model’s generalization performance. The training history showed the convergence of the training and validation losses over the epochs. The summary of the model’s configuration is shown in Table 4.
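
The early stopping rule described above can be illustrated independently of any deep learning library. In the sketch below, `patience` (the number of epochs without improvement that is tolerated) and `min_delta` (the smallest change counted as an improvement) are hypothetical settings, and the loss values are invented; the paper does not report the criteria that were actually used.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops at the first epoch that is `patience` epochs past the
    best validation loss seen so far.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Invented validation losses: improvement until epoch 2, then a plateau.
losses = [0.50, 0.30, 0.20, 0.21, 0.22, 0.23, 0.19]
print(early_stopping_epoch(losses, patience=3))  # (5, 2)
```

In this example training halts at epoch 5 and the weights from epoch 2 would be kept, so the later improvement at epoch 6 is never seen; a larger patience trades training time for a lower risk of stopping on a temporary plateau.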

The architecture of each ANN model was determined through systematic hyper-parameter tuning. Several network depths (2–6 hidden layers) and neuron counts (16–256 per layer) were tested, and performance was evaluated using validation R² and MSE. The final selected architectures balanced accuracy with computational efficiency: 5 hidden layers with 128 neurons for cumulative oil production, 5 hidden layers with 64 neurons for CO₂ dissolved in water, and smaller networks (3 hidden layers, 10–15 neurons) for structural and residual CO₂ trapping. The learning rate for the Adam optimizer was tested across values between 0.001 and 0.1, with 0.01 selected as the best trade-off between fast convergence and stable validation performance, while early stopping criteria were applied to ensure generalization and prevent overfitting.

To capture the unique behaviors of each output, we developed four independent ANN models, each trained to predict one of the four target variables: cumulative oil production, CO₂ dissolved in water, CO₂ structurally trapped, and CO₂ residually trapped. This design was chosen instead of a single multi-output ANN because the target variables (i) differ substantially in scale and distribution (e.g., MMSTB vs fractional CO₂ mass), (ii) are governed by different physical mechanisms (Darcy flow vs solubility vs capillary trapping), and (iii) preliminary testing confirmed that independent networks achieved higher R² values and lower errors compared to a multi-output configuration. This modeling strategy is consistent with prior proxy-modeling studies, where separate networks improved predictive accuracy and interpretability.

The choice of activation function was also systematically evaluated. Although ReLU is often highlighted for feature filtering, it is well suited for regression tasks because it alleviates the vanishing gradient problem, accelerates convergence, and effectively models nonlinear parameter interactions. Comparative experiments with alternative activation functions (tanh and sigmoid) under identical network architectures showed that ReLU produced the most stable and accurate results (R² = 0.90–0.98, MAPRE < 10%), whereas tanh converged more slowly with slightly lower accuracy, and sigmoid underperformed due to gradient saturation. These results confirm that ReLU was the most appropriate activation function for the ANN models in this study.

3.5. Developed Models Performance Evaluation

The performance of the developed ANN models was assessed using the coefficient of determination (R²) and the mean absolute percentage relative error (MAPRE), together with the mean squared error tracked during training and testing. The cumulative oil production model achieved an R² value of 0.98, demonstrating its ability to accurately predict oil production trends. Similarly, the CO₂ dissolved in water model achieved an R² of 0.93, indicating its effectiveness in capturing dissolved CO₂ dynamics. The structural and residual CO₂ trapping models achieved R² values of 0.90 and 0.96, respectively. All the models showed a MAPRE below 10%. Cross plots of the developed models’ predictions on the testing data are shown in Figure 9.
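
The paper does not write out the MAPRE formula; the sketch below assumes the usual definition, the mean of |actual − predicted| / |actual| expressed in percent. The sample values are invented.

```python
def mapre(actual, predicted):
    """Mean absolute percentage relative error, in percent.

    Assumes the common definition; actual values must be non-zero.
    """
    n = len(actual)
    return 100.0 / n * sum(abs((a - p) / a) for a, p in zip(actual, predicted))


# Invented values: relative errors of 10%, 10%, and 5%.
actual = [10.0, 20.0, 40.0]
predicted = [11.0, 18.0, 42.0]
print(f"MAPRE = {mapre(actual, predicted):.2f}%")
```

Because the error is normalized by the actual value, cases with small actual values weigh heavily in this metric, which is worth keeping in mind for outputs that can approach zero.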

4. Summary and Conclusions

In summary, this study focused on constructing a three-dimensional reservoir model using CMG GEM to assess the performance of CO₂ flooding for enhanced oil recovery (EOR) and determine the amount of CO₂ sequestered through residual, solubility, and structural trapping in ROZs. A comprehensive sensitivity analysis was conducted using a physics-based compositional reservoir simulation model, considering various reservoir rock properties and operational parameters. The dataset generated from the reservoir simulation was used to train an artificial neural network (ANN) model, aiming to predict cumulative oil production, CO₂ dissolved in water, and CO₂ trapped structurally and due to residual hysteresis.

The sensitivity analysis revealed key correlations between input parameters and output variables. For instance, an increase in the vertical permeability/horizontal permeability ratio enhanced cumulative oil production but decreased CO₂ storage. Horizontal permeability, porosity, and CO₂ injection rate displayed positive correlations with both cumulative oil production and CO₂ storage, while producer bottom hole pressure exhibited a negative correlation with oil production but a positive correlation with CO₂ storage. The developed ANN models demonstrated high predictive accuracy, with R² values ranging from 0.9 to 0.98, indicating their effectiveness in capturing complex relationships within the data. Additionally, the mean absolute percentage relative error (MAPRE) for all models was less than 10%, confirming their reliability. Cross plots illustrated the models’ ability to predict testing data accurately.

In this study, 300 simulation cases were generated using the Latin Hypercube Sampling (LHS) method across the selected geological and operational parameters. This sampling strategy was chosen to efficiently explore the uncertainty space while minimizing the number of required reservoir simulation runs. Since the model represents a sector of the reservoir, the number of realizations was kept tractable to balance computational cost with coverage of parameter variability. We acknowledge that, compared to typical large-scale machine learning applications where datasets often reach millions of samples, 300 cases may appear limited. However, the primary objective of this study is to demonstrate the methodology of integrating physics-based simulations with machine learning rather than to present a fully generalized predictive model. In practical applications, this workflow can be readily extended by generating larger simulation datasets or incorporating field-scale data, thereby improving model robustness and generalization.
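As an illustration of the sampling step, the sketch below draws 300 Latin Hypercube cases over the parameter bounds listed in Table 1 using only the Python standard library. It assumes uniform sampling on a linear scale for every parameter, which this section does not specify, and the dictionary keys and function name are hypothetical. It is not the tool used to build the study's dataset.

```python
import random

# (lower, upper) bounds copied from Table 1; key names are illustrative.
BOUNDS = {
    "porosity": (0.05, 0.3),
    "permeability_md": (0.01, 250.0),
    "kv_kh": (0.01, 1.0),
    "salinity_ppm": (50_000.0, 250_000.0),
    "sor_to_water": (0.2, 0.4),
    "sor_to_gas": (0.1, 0.25),
    "co2_rate_mmscfd": (5.0, 20.0),
    "producer_bhp_psia": (250.0, 1500.0),
    "net_pay_ft": (50.0, 350.0),
}


def latin_hypercube(bounds, n_cases, seed=0):
    """Return n_cases dicts, with each parameter stratified into n_cases bins.

    Every parameter range is split into n_cases equal-width strata, one
    point is drawn inside each stratum, and the strata are shuffled
    independently per parameter so the pairings between parameters are random.
    """
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in bounds.items():
        unit = [(i + rng.random()) / n_cases for i in range(n_cases)]
        rng.shuffle(unit)
        columns[name] = [low + u * (high - low) for u in unit]
    return [{name: columns[name][i] for name in bounds} for i in range(n_cases)]


cases = latin_hypercube(BOUNDS, n_cases=300)
print(len(cases), "cases; first case:", cases[0])
```

Each case dictionary would then be written into a simulation input deck. If a parameter such as permeability were sampled on a logarithmic scale instead, the transform would be applied to the stratified unit values before scaling.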

In conclusion, the integrated approach of combining reservoir simulation and machine learning, particularly the ANN model, proved successful in predicting reservoir behavior under CO₂ flooding scenarios. The established correlations between input parameters and output variables provide valuable insights for optimizing EOR and carbon capture, utilization, and storage (CCUS) strategies. This study contributes to advancing the understanding of CO₂ flooding dynamics and offers a robust methodology for reservoir management and decision-making in the context of CO₂ EOR and CCUS.

While this study focused on validation using field data from the Permian Basin, which represents a relatively homogeneous carbonate system, we recognize that the developed ANN models have not yet been tested in more heterogeneous reservoirs such as fractured shales, tight formations, or reservoirs under different pressure–temperature conditions. Extending the validation of the proposed methodology to such systems will be an important avenue for future work to enhance the robustness and generalization of the models.

Compared with traditional ML approaches, which typically rely on limited or noisy field data, the proposed ANN framework was trained on physics-based simulation datasets, ensuring physically consistent learning of parameter interactions. While the trained models are specific to the studied reservoir, the methodology is universal and can be transferred to other reservoirs by retraining on their corresponding simulation data. Parametric optimization demonstrated that maintaining producer BHP near 1250 psi and CO₂ injection rates between 14 and 16 MMSCF/D maximized oil recovery while controlling CO₂ breakthrough, thus offering practical guidance for field operations.

Author Contributions

Conceptualization, A.A.; methodology, A.A.; software, A.A.; validation, B.D.; formal analysis, A.A. and M.A.; resources, M.A.; data curation, M.A.; writing—original draft, A.A.; writing—review and editing, A.A. and B.D.; visualization, M.A.; supervision, B.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

This research is supported by the members of the Research Consortium on Interaction of Phase Behavior and Flow (IPB&F). We gratefully acknowledge their support. All individuals included in this section have consented to the acknowledgement.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Bachu, S. Identification of oil reservoirs suitable for CO₂-EOR and CO₂ storage (CCUS) using reserves databases, with application to Alberta, Canada. Int. J. Greenh. Gas Control 2016, 44, 152–165.
2. Li, Z.; Dong, M.; Li, S.; Huang, S. CO₂ sequestration in depleted oil and gas reservoirs—Caprock characterization and storage capacity. Energy Convers. Manag. 2006, 47, 1372–1382.
3. Li, Q.; Li, Q.; Wang, F.; Xu, N.; Wang, Y.; Bai, B. Settling behavior and mechanism analysis of kaolinite as a fracture proppant of hydrocarbon reservoirs in CO₂ fracturing fluid. Colloids Surf. A Physicochem. Eng. Asp. 2025, 724, 137463.
4. Li, Q.; Han, Y.; Liu, X.; Ansari, U.; Cheng, Y.; Yan, C. Hydrate as a by-product in CO₂ leakage during the long-term sub-seabed sequestration and its role in preventing further leakage. Environ. Sci. Pollut. Res. Int. 2022, 29, 77737–77754.
5. Abdulwarith, A.; Ammar, M.; Dindoruk, B. Prediction/Assessment of CO₂ EOR and Storage Efficiency in Residual Oil Zones Using Machine Learning Techniques. In Proceedings of the SPE/AAPG/SEG Carbon Capture, Utilization, and Storage Conference and Exhibition, Houston, TX, USA, 11–13 March 2024.
6. Hampton, D.W.; Wagia-Alla, A. Analytical Method for Forecasting ROZ Production in a Commingled MOC and ROZ CO₂ Flood. In Proceedings of the SPE Improved Oil Recovery Conference, Online, 25–29 April 2022.
7. Johns, R.T.; Dindoruk, B. Gas flooding. In Enhanced Oil Recovery Field Case Studies; Gulf Professional Publishing: Waltham, MA, USA; Kidlington, Oxford, UK, 2013; pp. 1–22.
8. Manrique, E.; Thomas, C.; Ravikiran, R.; Izadi, M.; Lantz, M.; Romero, J.; Alvarado, V. EOR: Current status and opportunities. In Proceedings of the SPE Improved Oil Recovery Conference, Tulsa, OK, USA, 17–21 April 2004. SPE-130113.
9. Cao, C.; Liu, H.; Hou, Z.; Mehmood, F.; Liao, J.; Feng, W. A review of CO₂ storage in view of safety and cost-effectiveness. Energies 2020, 13, 600.
10. Ampomah, W.; Balch, R.S.; Grigg, R.B.; McPherson, B.; Will, R.A.; Lee, S.Y.; Pan, F. Co-optimization of CO₂-EOR and storage processes in mature oil reservoirs. Greenh. Gases Sci. Technol. 2017, 7, 128–142.
11. Ettehadtavakkol, A.; Lake, L.W.; Bryant, S.L. CO₂-EOR and storage design optimization. Int. J. Greenh. Gas Control 2014, 25, 79–92.
12. Liu, Y.; Rui, Z. A storage-driven CO₂ EOR for a net-zero emission target. Engineering 2022, 18, 79–87.
13. Sanguinito, S.; Singh, H.; Myshakin, E.M.; Goodman, A.L.; Dilmore, R.M.; Grant, T.C.; Pawar, R. Methodology for estimating the prospective CO₂ storage resource of residual oil zones at the national and regional scale. Int. J. Greenh. Gas Control 2020, 96, 103006.
14. Melzer, L.S. Stranded Oil in the Residual Oil Zone Prepared for Advanced Resources International and the US Department of Energy: Office of Fossil Energy Office of Oil and Natural Gas; Melzer Consulting: Midland, TX, USA, 2006; Volume 91.
15. Chen, B.; Pawar, R.J. Characterization of CO₂ storage and enhanced oil recovery in residual oil zones. Energy 2019, 183, 291–304.
16. Kuuskraa, V.A.; Petrusak, R.L.; Wallace, M. A Four-County Appraisal of the San Andres Residual Oil Zone (ROZ) ‘Fairway’ of the Permian Basin (No. DOE/NETL-2020/2627); National Energy Technology Laboratory (NETL): Pittsburgh, PA, USA, 2020.
17. Ren, B.; Male, F.; Duncan, I.J. Economic analysis of CCUS: Accelerated development for CO₂ EOR and storage in residual oil zones under the context of 45Q tax credit. Appl. Energy 2022, 321, 119393.
18. Ren, B.; Duncan, I. Modeling oil saturation evolution in residual oil zones: Implications for CO₂ EOR and sequestration. J. Pet. Sci. Eng. 2019, 177, 528–539.
19. Kumar, D.; Bandyopadhyay, P. Pulser Model: Updated Framework and Lessons Learned from Applications in Forecasting CO₂ Flood Performance. In Proceedings of the SPE Improved Oil Recovery Conference, Online, 31 August–4 September 2020; Society of Petroleum Engineers: Richardson, TX, USA; p. D031S046R002.
20. He, J.; Xie, J.; Wen, X.H.; Chen, W. An alternative proxy for history matching using proxy-for-data approach and reduced order modeling. J. Pet. Sci. Eng. 2016, 146, 392–399.
21. Song, Y.; Sung, W.; Jang, Y.; Jung, W. Application of an artificial neural network in predicting the effectiveness of trapping mechanisms on CO₂ sequestration in saline aquifers. Int. J. Greenh. Gas Control 2020, 98, 103042.
22. Stone, H.L. Probability model for estimating three-phase relative permeability. J. Pet. Technol. 1970, 22, 214–218.
23. Mosavat, N.; Torabi, F. Application of CO₂-saturated water flooding as a prospective safe CO₂ storage strategy. Energy Procedia 2014, 63, 5619–5630.
24. Harvey, A.H. Semiempirical Correlation for Henry’s Constants over Large Temperature Ranges. AIChE J. 1996, 42, 1491–1494.
25. Land, C.E. Calculation of Imbibition Relative Permeability for Two- and Three-Phase Flow from Rock Properties. SPEJ 1968, 8, 149–156.
26. Ampomah, W.; Balch, R.; Cather, M.; Rose-Coss, D.; Dai, Z.; Heath, J.; Dewers, T.; Mozley, P. Evaluation of CO₂ Storage Mechanisms in CO₂ Enhanced Oil Recovery Sites: Application to Morrow Sandstone Reservoir. Energy Fuels 2016, 30, 8545–8555.
27. Honarpour, M.M.; Nagarajan, N.R.; Grijalba, A.C.; Valle, M.; Adesoye, K. Rock-fluid characterization for miscible CO₂ injection: Residual oil zone, Seminole Field, Permian Basin. In Proceedings of the SPE Annual Technical Conference and Exhibition, Florence, Italy, 19–22 September 2010. SPE-133089.
28. Sinha, U.; Dindoruk, B.; Soliman, M. Prediction of CO₂ Minimum Miscibility Pressure Using an Augmented Machine-Learning-Based Model. SPE J. 2021, 26, 1666–1678.

Figure 1. Residual oil zone types (greenfield and brownfield) [13].

Figure 2. Imbibition and drainage curves used in modelling the hysteresis effect [26].

Figure 3. Three-dimensional reservoir simulation model with five-spot pattern showing grid dimensions (36 × 36 × 10), injector and producer locations, and annotated boundary/initial conditions (closed sides, open top, aquifer support at base). The colored scale indicates the depth (ft) of the grid tops.

Figure 4. Relative permeability curves (oil–water (right), gas–liquid (left)).

Figure 5. Workflow for dataset generation.

Figure 6. Cumulative oil production vs. time for the base case.

Figure 7. Cumulative CO₂ stored vs. time for the base case.

Figure 8. Correlation coefficient analysis between the input parameters and the cumulative oil production and CO₂ storage.

Figure 9. Cross plots of the performance of developed ANN models ((a): Cumulative oil production, (b): CO₂ dissolved in water, (c): CO₂ trapped structurally, (d): CO₂ residual trapping).

Table 1. Summary of sensitivity variables used to generate the reservoir simulation dataset.

Parameter	Lower Bound	Upper Bound
Porosity	0.05	0.3
Permeability, md	0.01	250
Kv/Kh	0.01	1
Salinity, ppm	50,000	250,000
Residual oil saturation to water	0.2	0.4
Residual oil saturation to gas	0.1	0.25
CO₂ injection rate, MMSCF/D	5	20
Producer BHP, psia	250	1500
Net pay thickness, ft	50	350

Table 2. Reservoir fluid composition.

Component	Composition (Mole %)
N₂	0.04
CO₂	0.02
H₂S	0
CH₄	20.10
C₂H₆	9.07
C₃H₈	6.95
i-C₄H₁₀	0.04
n-C₄H₁₀	3.90
i-C₅H₁₂	0.04
n-C₅H₁₂	2.49
C₆H₁₄	2.69
C7+	54.66
Molecular weight of C7+	261

Table 3. Base case reservoir simulation model parameters.

Parameter	Value
Thickness, ft	200
Permeability, md	200
Porosity	0.25
Producer BHP, psia	500
Kv/Kh	0.1
Salinity, ppm	200,000
Residual oil saturation to water	0.4
Residual oil saturation to gas	0.2
CO₂ injection rate, MMSCF/D	20

Table 4. Summary of developed ANN models.

Model	Number of Hidden Layers	Number of Neurons
Cumulative Oil Production	5	128
CO₂ Dissolved in Water	5	64
CO₂ Trapped (Structural)	3	15
CO₂ Residual Trapping	3	10
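To give a sense of scale for the architectures in Table 4, the following sketch counts the trainable weights and biases of each network. It rests on several assumptions that the table does not state: the networks are fully connected, the "Number of Neurons" applies to every hidden layer, there are nine inputs (the nine sampled parameters), and each model has a single output. The function name is hypothetical.

```python
def count_mlp_parameters(n_inputs, hidden_layers, neurons, n_outputs=1):
    """Weights plus biases of a fully connected network.

    Assumes every hidden layer has the same width (an assumption about
    how Table 4 should be read, not something the table states).
    """
    sizes = [n_inputs] + [neurons] * hidden_layers + [n_outputs]
    # Each layer contributes (fan_in + 1 bias) * fan_out parameters.
    return sum((fan_in + 1) * fan_out for fan_in, fan_out in zip(sizes, sizes[1:]))


# (hidden layers, neurons) as listed in Table 4.
TABLE_4 = {
    "Cumulative Oil Production": (5, 128),
    "CO2 Dissolved in Water": (5, 64),
    "CO2 Trapped (Structural)": (3, 15),
    "CO2 Residual Trapping": (3, 10),
}

param_counts = {
    model: count_mlp_parameters(9, layers, width)
    for model, (layers, width) in TABLE_4.items()
}
for model, count in param_counts.items():
    print(f"{model}: {count} parameters")
```

Under these assumptions the four networks span roughly three hundred to about sixty-seven thousand parameters, which is useful context when choosing regularization and the train/test split for a 300-case dataset.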

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
