Machine-Learning-Based Approach to Optimize CO2-WAG Flooding in Low Permeability Oil Reservoirs

Abstract: One of the main industrial applications of carbon capture, utilization, and storage (CCUS) technology is carbon-dioxide-enhanced oil recovery (CO2-EOR). However, accurately and rapidly assessing its application potential remains a major challenge. In this study, a numerical model of the CO2-WAG technique was developed using the reservoir numerical simulation software CMG (Version 2021), which is widely used in the field of reservoir engineering. Then, 10,000 different reservoir models were randomly generated using the Monte Carlo method for numerical simulation, each with different formation physical parameters, fluid parameters, initial conditions, and injection and production parameters. Of these, 70% were used as the training set and 30% as the test set. Eight machine learning regression methods were trained and evaluated on this dataset. The XGBoost algorithm emerged as the top-performing method and was selected for prediction and optimization. By integrating the production prediction model with a particle swarm optimizer (PSO), a workflow for optimizing the CO2-EOR parameters was developed. This process enables the rapid optimization of the CO2-EOR parameters and the prediction of the production for each period based on cumulative production under different geological conditions. The proposed XGBoost-PSO proxy model predicts production accurately, reliably, and efficiently, making it an important tool for optimizing CO2-EOR design.


Introduction
Oil has long been a primary source of fossil energy for meeting global energy demand. However, extracting the remaining oil from complex reservoir formations remains a challenge, which makes enhancing extraction efficiency increasingly important [1]. In addition, the use of fossil fuels emits large amounts of CO2, a greenhouse gas, and the resulting global climate change has become a major challenge facing the world today [2]. Carbon capture, utilization, and storage (CCUS) technology is considered an important approach to mitigating global climate change [3]. In response to these challenges, CO2-enhanced oil recovery (CO2-EOR) is a CCUS method that enhances oil recovery by using CO2 in a miscible displacement process while effectively sequestering CO2 in the lower portion of the reservoir. This technology has relatively low requirements for CO2 purity and allows CO2 to be recycled, thereby reducing process costs [4]. With its potential to significantly enhance oil recovery while achieving carbon capture, utilization, and storage, CO2-EOR offers both societal and economic benefits.
Energies 2023, 16, 6149

CO2 injection is applied in enhanced oil recovery (EOR) because of its superior ability to improve fluid properties under reservoir conditions. The fundamental mechanisms of CO2-EOR include reducing interfacial tension (IFT), lowering oil viscosity, oil swelling, and light hydrocarbon extraction. These mechanisms contribute to the enhanced recovery of the oil in reservoirs [5][6][7]. Compared to other gases such as natural gas, air, and nitrogen (N2), carbon dioxide (CO2) has a lower minimum miscibility pressure (MMP) with oil. Therefore, selecting CO2 as the injection gas for displacing oil can achieve better miscibility and recover the oil more effectively [8]. In addition to continuous CO2 injection for miscible displacement, the CO2-WAG (water alternating gas) technique has been proposed to improve the flow behavior of CO2 in the reservoir and to prevent CO2 fingering. This technique involves alternating injections of CO2 and water, which enhance the efficiency of CO2 propagation and oil displacement [9,10]. The WAG technique was first used by Mobil Corporation in 1957 in a sandstone reservoir in Alberta, Canada. The CO2-WAG technique also alleviates the issue of rapid CO2 breakthrough by increasing the resistance to gas phase flow, which improves the mobility ratio [11].

According to surveys, the WAG technique has achieved significant success and has been employed in 80% of the oilfield projects in the United States, thereby demonstrating its value in oilfield development and in improving oil recovery [12]. Christensen et al. [13] studied 59 WAG fields and found that, across all WAG cases, the average oil recovery increased by 10%. Al-Bayati et al. [14] investigated the impact of core-scale heterogeneity on the oil recovery efficiency of CO2-WAG injection. Their findings indicated that CO2-WAG injection performed better in homogeneous, layered, and composite samples. Sun et al. [15] investigated the feasibility of CO2 phase flow through porous media in WAG injection scenarios and increased the oil recovery factor (RF) by approximately 46%. The gas-to-water injection ratio was identified as a crucial parameter affecting the efficiency of water-gas alternating injection [16,17]. Khather et al. [18] investigated the impact of CO2-carbonate interaction on oil and gas recovery in three heterogeneous carbonate core samples (low and moderate permeability) with different initial oil saturations. Overall, CO2-WAG injection after water flooding increased the recovery factor by over 30% for the three cores. Ren et al. [19] conducted experiments on CO2-EOR and storage in oilfields in the Ordos Basin in China using two CO2 injection schemes: continuous injection (CI) and water alternating gas (WAG) injection. The results showed that the equal injection of CO2 and WAG significantly increased the crude oil production.
Currently, the optimization of CO2-EOR technology is a focal point for many oilfield and reservoir engineers. Rodrigues et al. [20] utilized the CMG reservoir numerical simulation software (Version 2021) to optimize the application of WAG in a sub-salt offshore oilfield in Brazil. They proposed a CO2-WAG operational design method suitable for carbonate reservoirs, with a focus on economic viability, CO2 recycling efficiency, and project risks. However, traditional parameter optimization methods are time-consuming and labor-intensive, and they tend to overlook complex nonlinear relationships and the underlying influencing factors. Moreover, these methods are often based on specific models and algorithms, so they lack the flexibility to adapt to different oilfield situations and variations. Introducing more advanced optimization techniques, such as machine learning and metaheuristic algorithms, can therefore better address the complexity and uncertainty of oilfields and enhance optimization efficiency and accuracy.
Rapidly evolving intelligent algorithms, such as machine learning, have found significant applications in petroleum exploration and development. Sen et al. [21] employed a Specialized RNN Unit (SRU) model, which is a type of recurrent neural network (RNN), to optimize the parameters and predict the production in actual CO2-EOR projects. The injection rate, injection pressure, and cumulative injection volume of the injection wells and the bottom hole flowing pressure of the production wells were used as inputs for the SRU model, while the fluid production of the production wells served as the output. Li et al. [22] utilized the random forest (RF) regression algorithm to predict the performance of the CO2-WAG technique, including oil well production, CO2 storage volume, and CO2 storage efficiency. The CO2-WAG cycle, CO2 injection rate, and water-gas ratio were identified as the main injection parameters. The predicted values closely approximated the actual values in the test set: the average absolute prediction deviations for cumulative oil production, CO2 storage volume, and CO2 storage efficiency were 1.10%, 3.04%, and 2.24%, respectively. He et al. [23] proposed an optimization workflow for CO2-EOR operations based on machine learning methods and heuristic optimization algorithms. Their workflow included a power consumption prediction model based on Gaussian process regression (GPR), an oil production prediction model based on a nonlinear autoregressive neural network with external inputs (NARX), and an operational optimization model. The optimization results were significant; the optimization parameters included the duration of the water/gas alternating injection cycles, the bottom hole pressure of the production wells, and the water injection rate.

To explore the solution space swiftly and find the global optimum, some researchers have combined metaheuristic algorithms with machine learning. By using the predictive capability of machine learning to guide the search process of the metaheuristic algorithm, they can quickly identify the optimal solution and achieve better results in parameter optimization. In 2018, Mohagheghian et al. [24] utilized robust evolutionary algorithms to automatically optimize the performance of the hydrocarbon WAG technique used in the E segment of the Norne oilfield. They employed the net present value (NPV) as the objective function and two global semi-random search strategies, namely, the genetic algorithm (GA) and particle swarm optimization (PSO). Parameters such as the water injection volume, gas injection volume, bottom hole pressure of producing wells, cycle ratio, cycle duration, injected hydrocarbon gas fraction, and total WAG cycle were optimized. You et al. [25] constructed a surrogate model using support vector regression with a Gaussian kernel (Gaussian-SVR), and the hyperparameters of the surrogate model were tuned using Bayesian optimization. The trained surrogate model was then coupled with a multi-objective particle swarm optimization (MOPSO) protocol. This approach was used to optimize the complex CO2-WAG process, which involves many control parameters. The optimization parameters included operational variables for controlling the CO2-WAG process, such as the duration of the water/gas alternating injection cycle, the bottom hole pressure control, and the injection rates for each well. Jaber [26] utilized a surrogate-model-based genetic algorithm (GA) to optimize the most influential parameters of the CO2-WAG process in the Subba-Nahr Umr reservoir. Four operational variables were considered for optimizing the CO2-WAG displacement: the CO2-to-water slug size ratio (WAG), cyclic length (CL), bottom hole pressure (BHP), and CO2 slug size (SZ). The results demonstrated that the highest incremental oil recovery (∆FOE) of 9.7% in the Subba-Nahr Umr reservoir could be achieved with a WAG ratio of 1.5, a cyclic length of 3 months, a bottom hole pressure of 2221 psi, and a CO2 slug size of 0.91.

From the above, it can be observed that, in most cases of CO2-EOR parameter optimization, the dataset is relatively small, and the optimization objective functions often include production only at specific time points, which cannot form a complete production curve. The results are therefore limited in scope and specific to individual cases. Without complete production curves and large-scale time series data, machine learning and other prediction methods may not fully leverage their advantages and may struggle to achieve global optimization results.
This study proposes a comprehensive workflow for optimizing CO2-EOR (WAG) parameters by combining reservoir numerical simulation with machine learning. Section 2 describes the machine learning methods used in this study, along with the workflow. Section 3 establishes the geological and numerical models of the reservoir. Section 4 presents a correlation analysis of the geological and operational parameters, evaluates the performance of production prediction models based on different machine learning methods, and selects the best model. Section 5 uses the selected machine learning model, combined with particle swarm optimization (PSO), for capacity prediction and parameter optimization. Discussions and conclusions are presented in Section 6.

Methods
This section describes the methodological principles and workflow of the main algorithms used in this study. Eight machine learning methods were employed to build the prediction models, namely, linear regression [27,28], ridge regression [29], decision tree (DT) [30], random forest (RF) [31,32], gradient boosting decision tree (GBDT) [33], extreme gradient boosting (XGBoost) [34], K-nearest neighbors (KNN) [35], and neural network (NN) [36]. This study proposes a coupled model of the machine learning algorithm XGBoost and particle swarm optimization (PSO) [37] to address the optimization problem. Therefore, this section focuses on the XGBoost algorithm and the PSO algorithm.

XGBoost Algorithm
XGBoost is a scalable tree boosting system proposed by Chen et al. [34]. It is an improved version of the gradient boosting decision tree (GBDT) algorithm [38] and is widely used in classification and regression tasks. The basic idea of XGBoost is similar to that of GBDT, but it incorporates several optimizations, which include the following:
1. Optimizing the loss function by employing a second-order Taylor expansion to enhance computational accuracy.
2. Adding a regularization term to the objective function to control model complexity.
3. Utilizing a block storage structure to enable parallel computing and improve efficiency.
The structure of the XGBoost algorithm is illustrated in Figure 1, and the model details are described below. Given a training dataset $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, a loss function $l(y_i, \hat{y}_i)$, and a regularization term $\Omega(f_k)$, the objective function can be expressed as follows:

$$L(\phi) = \sum_{i} l(y_i, \hat{y}_i) + \sum_{k} \Omega(f_k) \tag{1}$$

where $L(\phi)$ is the representation in the linear space, $i$ denotes the $i$-th sample, $k$ represents the $k$-th tree, $\hat{y}_i$ is the predicted value of the $i$-th sample $x_i$, and $\sum_{k} \Omega(f_k)$ represents the complexity of the trees.
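Because GBDT builds the model additively, the prediction at the $t$-th iteration can be rewritten as follows:

$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i) \tag{2}$$

In this case, the expression of $L(\phi)$ can be transformed into the following form:

$$L^{(t)} = \sum_{i} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t) \tag{3}$$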
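To make the role of the second-order expansion and the regularization term more concrete, the following minimal sketch computes the first- and second-order derivatives of a squared-error loss and the resulting leaf weight for a single tree leaf. It assumes the loss $l = \frac{1}{2}(y - \hat{y})^2$ and an L2 penalty with coefficient `reg_lambda`; the function names are illustrative and are not taken from the study or from the XGBoost library.

```python
def squared_error_grad_hess(y_true, y_pred):
    """Derivatives of l = 0.5 * (y - y_hat)^2 with respect to y_hat."""
    grads = [p - t for t, p in zip(y_true, y_pred)]  # first-order terms g_i
    hess = [1.0 for _ in y_true]                     # second-order terms h_i
    return grads, hess


def optimal_leaf_weight(grads, hess, reg_lambda=1.0):
    """Weight minimizing the second-order approximation of the loss for the
    samples in one leaf, with an assumed L2 penalty reg_lambda on the weight."""
    return -sum(grads) / (sum(hess) + reg_lambda)


# Three samples that all fall into the same leaf (illustrative numbers).
y_true = [3.0, 4.0, 5.0]
y_prev = [2.0, 2.0, 2.0]  # predictions from the previous boosting round
g, h = squared_error_grad_hess(y_true, y_prev)

w_unregularized = optimal_leaf_weight(g, h, reg_lambda=0.0)  # mean residual
w_regularized = optimal_leaf_weight(g, h, reg_lambda=3.0)    # shrunk toward 0
```

With no penalty, the leaf weight equals the mean residual of the samples in the leaf; increasing `reg_lambda` shrinks it toward zero, which is how the regularization term limits model complexity in this simplified setting.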

Particle Swarm Optimization (PSO)
Particle swarm optimization (PSO) is an evolutionary computation technique that was first introduced by Eberhart and Kennedy in 1995 [37]. The basic concept of PSO originates from the study of foraging behavior in bird flocks: the regular patterns observed in the movements of foraging flocks inspired a simplified model of swarm intelligence. PSO utilizes collaboration and information sharing among individuals within a swarm to search for the optimal solution [41].
Figure 2 shows the flow of the PSO algorithm, in which each particle individually searches for the optimal solution in the search space. The best solution found by a particle is recorded as its current individual extremum and shared with the other particles in the population. The particles move at a certain speed in the search space and dynamically adjust their speed and position according to their own flight experience and the flight experience of other particles [42].
The equation to update particle velocity in the PSO algorithm is as follows:

$$V_{id} = \omega V_{id} + C_1 \,\mathrm{random}(0,1)\,(P_{id} - X_{id}) + C_2 \,\mathrm{random}(0,1)\,(P_{gd} - X_{id})$$

where $V_{id}$ is the current velocity of the particle; $\omega$ is the inertia factor (a moving particle carries motion inertia); $\mathrm{random}(0,1)$ is the random number generation function that generates random numbers between 0 and 1; $X_{id}$ is the current position of the particle; $P_{id}$ is the best position found so far by this particle; $P_{gd}$ represents the current best position among all particles in the population; and $C_1$ and $C_2$ denote the learning factors, which learn from the best position in the history of this particle and the best position in the population, respectively.
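The velocity update can be illustrated with a minimal, self-contained sketch. The default swarm size, iteration count, inertia weight, and learning factors mirror the values reported for this study (15, 50, 0.8, 2, 2); everything else is an assumption made for illustration: the function name `pso_minimize`, the velocity clamp `v_max`, the clipping of positions to the bounds, the standard position update (position plus velocity), and the toy objective standing in for a trained surrogate model.

```python
import random


def pso_minimize(objective, bounds, n_particles=15, n_iter=50,
                 w=0.8, c1=2.0, c2=2.0, seed=0):
    """Minimal PSO sketch. Returns (best_position, best_value, history)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Assumption: clamp velocities to 20% of each variable's range.
    v_max = [0.2 * (hi - lo) for lo, hi in bounds]
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                lo, hi = bounds[d]
                # Inertia term + pull toward personal best + pull toward global best.
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                v = max(-v_max[d], min(v_max[d], v))
                vel[i][d] = v
                # Position update, clipped to the allowed range.
                pos[i][d] = max(lo, min(hi, pos[i][d] + v))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


def toy_production(p):
    """Hypothetical stand-in for a surrogate model; peaks at (1.0, 2.0)."""
    return 10.0 - (p[0] - 1.0) ** 2 - (p[1] - 2.0) ** 2


# To maximize production, minimize its negative.
best_pos, best_neg_val, trace = pso_minimize(
    lambda p: -toy_production(p), bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

In this sketch the global best is updated as soon as any particle improves on it, which is one common variant; updating it once per iteration is an equally valid choice.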

Workflow
As shown in Figure 3, the process of prediction and parameter optimization for CO2-EOR can be divided into three steps:
Step 1: Numerical Model and Database Establishment. Extensive literature research was conducted to gather knowledge on optimizing CO2-EOR parameters and production profiles. The reservoir numerical simulation software CMG (Version 2021) was utilized to build the CO2-EOR numerical model. By employing the Monte Carlo method, 10,000 sets of different reservoir models were randomly generated and simulated to obtain the corresponding production curves for various geological parameters, fluid parameters, relative permeability parameters, and injection/production parameters.

Step 2: Machine Learning Model Selection. First, a correlation analysis was conducted to assess the relationships between different CO2-EOR parameters. Then, the dataset generated in the first step, which consisted of 10,000 sets of diverse parameters and corresponding production curves, was used to train and evaluate eight different machine learning models and to compare their performance in predicting CO2-EOR production. Based on this comparison, the XGBoost algorithm was selected as the best-performing machine learning method for this study.
Step 3: CO2-EOR Production Prediction and Parameter Optimization. The XGBoost-PSO proxy model was employed to predict CO2-EOR production and optimize the CO2-EOR parameters.

Establishment of CO2-EOR Numerical Model
First, based on the actual geological parameters of the oilfield, a characteristic model (hereafter, the feature model) was established, which took into account factors such as well spacing, fluid properties, and heterogeneity. The numerical model consisted of a grid with dimensions of 21 × 21 × 5, with a grid spacing of 10 m in the I direction, 10 m in the J direction, and 5 m in the K direction. Therefore, the feature model had dimensions of 210 m in length, 210 m in width, and 25 m in depth. The well pattern was deployed as a 1/4 five-spot pattern, with one injector well and one producer well per pattern, as shown in Figure 4. The basic parameters of the feature model are described in Table 1.

Establishing the Database
After building the geological model, a large dataset needs to be generated to train the machine learning prediction model. In this study, the numerical model was used to generate cumulative production data for 10,000 randomly sampled sets of geological and operational parameters. A total of 24 parameters were investigated, covering geological parameters, fluid parameters, phase saturation parameters, and injection/production parameters. The geological parameters included the initial pressure, porosity, permeability, temperature, and spacing in the I, J, and K directions. The fluid parameters included the oil density, gas specific gravity, residual oil saturation index, water saturation, oil saturation, oil viscosity, and phase mixing parameter. The phase saturation parameters included the residual water saturation, residual oil saturation in the oil-water system, residual oil saturation in the gas-liquid system, and residual gas saturation. The injection/production parameters included the gas injection well bottom-hole flowing pressure, water injection well bottom-hole flowing pressure, production well bottom-hole flowing pressure, water flooding (WF) ending time, WAG gas injection rate, and WAG water injection rate. The range of values for each parameter is shown in Table 2, and the distribution of each parameter is illustrated in Figure 5. The applicable range for each parameter in the table was primarily based on the actual conditions of CO2-flooded oil reservoirs in China.

The objective function used in this study was the cumulative oil production, which was obtained by simulating the monthly production for each parameter combination with the numerical simulation model. Figures 6 and 7 illustrate the cumulative oil production curves and the distribution of the cumulative oil production, respectively. The minimum cumulative oil production was 10^4 m3, while the maximum was 7.2 × 10^5 m3. The majority of the distribution fell within the range of 0 to 10^5 m3 of oil production.
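The random generation of parameter sets described above can be sketched as follows. This is only an illustration under stated assumptions: the three parameter names and their ranges are hypothetical placeholders (the actual ranges are those of Table 2), each parameter is drawn independently from a uniform distribution, and `sample_cases` is an illustrative helper rather than part of CMG or of the study's code.

```python
import random

# Hypothetical ranges for illustration only; not the values from Table 2.
PARAM_RANGES = {
    "porosity": (0.05, 0.20),
    "permeability_mD": (0.1, 50.0),
    "wag_gas_injection_rate": (1000.0, 10000.0),
}


def sample_cases(ranges, n_cases, seed=0):
    """Draw n_cases parameter sets, each value uniform within its range."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
        for _ in range(n_cases)
    ]


# Each dictionary would define one simulation run of the numerical model.
cases = sample_cases(PARAM_RANGES, n_cases=10_000, seed=42)
```

Fixing the seed makes the sampled design reproducible, which is convenient when simulation runs have to be repeated or resumed.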


Correlation Analysis
The correlation analysis in Figure 8 shows that the strength of the linear correlation between cumulative oil production in CO2-EOR and the input parameters varies considerably across the geological parameters, fluid parameters, phase saturation parameters, and injection/production parameters. Identifying these correlations is significant for gaining a deeper understanding of reservoir characteristics and for optimizing the CO2-EOR process.
From the perspective of the geological parameters (Figure 8a), there was a positive correlation between cumulative oil production in CO2-EOR and certain factors. Notably, there were strong linear correlations between the cumulative oil production and the spacing in the I, J, and K directions, which had correlation coefficients of 0.748, 0.748, and 0.327, respectively. This indicates that the spacing in these directions significantly influences the oil production during the CO2-EOR process. Additionally, the porosity and permeability showed weaker positive correlations with the cumulative oil production in CO2-EOR, with correlation coefficients of 0.258 and 0.211, respectively. This suggests that, as porosity and permeability increase, the cumulative oil production in CO2-EOR also tends to increase. Porosity represents the void space in the reservoir, while permeability reflects the capacity for fluid flow within the reservoir. Higher porosity and permeability values indicate larger effective storage capacity and better fluid migration capability, thereby enabling CO2 to interact more fully with crude oil, which increases the cumulative oil production.
From the perspective of the fluid parameters and phase saturation parameters (Figure 8b,c), there was only a weak correlation with the cumulative oil production in CO2-EOR. For instance, among the fluid parameters, the correlation coefficients for the residual oil saturation index, gas specific gravity, and phase mixing parameter were 0.011, −0.007, and −0.008, respectively. Similarly, among the phase saturation parameters, the correlation coefficients for the residual gas saturation, residual oil saturation in the oil-water system, and residual oil saturation in the gas-liquid system were 0.006, −0.004, and −0.012, respectively. These coefficients are close to zero, which indicates that the linear relationship between these parameters and the cumulative oil production in CO2-EOR is weak.
Furthermore, from the perspective of the injection/production parameters (Figure 8d), there was a positive linear correlation between the CO2-EOR cumulative oil production and the CO2-WAG injection volume, with a correlation coefficient of 0.187. Although modest, this correlation suggests that increasing the CO2-WAG injection volume can enhance the displacement efficiency of the CO2 and increase oil production in the reservoir. Optimizing these parameters can lead to more efficient oil recovery in the CO2-EOR process.
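The linear correlation coefficients discussed in this section can be computed as in the sketch below. It assumes that the Pearson correlation coefficient is the measure used, which the text does not state explicitly; the sample values are made up for illustration and are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Illustrative values only: one input parameter and the simulated output.
porosity = [0.08, 0.10, 0.12, 0.15, 0.18]
cumulative_oil = [2.1e4, 3.0e4, 2.6e4, 4.4e4, 4.0e4]
r = pearson_r(porosity, cumulative_oil)
```

A coefficient near zero only rules out a linear relationship; a parameter can still matter through nonlinear effects or interactions, which is one reason tree-based models are evaluated alongside linear regression.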

Machine Learning Model Building
The dataset was split into 70% for training and 30% for testing. Eight machine learning models were established, namely, linear regression, ridge regression, decision tree, random forest, gradient boosting decision tree, extreme gradient boosting, K-nearest neighbors, and a neural network. By comparing their accuracies, the model with the highest accuracy was selected as the optimal model.
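As a rough illustration of this evaluation procedure, the sketch below performs a random 70/30 split and scores predictions with the coefficient of determination, $R^2 = 1 - \sum_i (y_i - \hat{y}_i)^2 / \sum_i (y_i - \bar{y})^2$. The helper names and the use of a shuffled random split are assumptions for illustration; the study's actual tooling is not specified.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle row indices and split them into training and test subsets."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    cut = int(round(train_fraction * len(rows)))
    train = [rows[i] for i in indices[:cut]]
    test = [rows[i] for i in indices[cut:]]
    return train, test


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# 10,000 simulated cases would yield 7000 training and 3000 test samples.
train_rows, test_rows = train_test_split(list(range(10_000)))
```

Scoring every candidate model on the same held-out test subset keeps the comparison fair, since a model that merely memorizes the training data will show a visibly lower test score.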
To evaluate the prediction accuracy of the machine learning models, the coefficient of determination (R2) was selected as the metric [43]. The R2 value typically ranges from 0 to 1, with a higher value indicating a better fit of the model. The specific formula to calculate R2 is as follows:

$$R^2 = 1 - \frac{\sum_{i} (y_i - \hat{y}_i)^2}{\sum_{i} (y_i - \bar{y})^2}$$

where $\hat{y}_i$ is the predicted value of the $i$-th sample and $\bar{y}$ is the mean value of $y_i$.
Figure 9 presents the scatter plots of the predicted results versus the true results for the eight predictive models investigated in this study. The corresponding coefficient of determination (R2) values for the selected predictive models are shown in Table 3. Among them, the linear regression, ridge regression, K-nearest neighbors, and decision tree models showed points scattered around the 45-degree line, thereby indicating relatively poor predictive performance, with test R2 values of 0.95, 0.81, 0.79, and 0.91, respectively. In contrast, the extreme gradient boosting (XGBoost) model showed points concentrated along the 45-degree line, with a high test R2 value of 0.98, thereby indicating a low error and good predictive performance.
Particle swarm optimization (PSO) is a widely recognized metaheuristic algorithm that is known for its ability to explore the solution space effectively and find global optima by simulating the collective behavior of a swarm of particles. In this study, PSO was used to search for the combination of CO2-WAG parameters that maximizes the cumulative oil production. When applying the PSO algorithm, a trained XGBoost model was used to evaluate the fitness of a large number of candidate project design parameters. With the aid of this surrogate model, the computational burden of the optimization procedure was significantly reduced, thereby allowing for more iterations of the PSO algorithm. The parameters of the final PSO model, along with the XGBoost parameters, are provided in Table 5, and the optimization process is depicted in Figure 10.


| Parameters in PSO Algorithm | Value |
| --- | --- |
| Population size | 15 |
| Maximum number of iterations | 50 |
| Inertia weight (ω) | 0.8 |
| Learning factor (c1) | 2 |
| Learning factor (c2) | 2 |

Production Prediction and Parameter Optimization
When reservoirs are exploited using CO2-EOR technology, various operational and injection/production parameters affect the cumulative oil production. Therefore, the XGBoost-PSO optimization model was employed to optimize the operational parameters, with the aim of enhancing the cumulative oil production and recovery factor of the reservoir. During the optimization process, a set of key parameters was considered, including the water injection well bottom-hole flowing pressure, gas injection well bottom-hole flowing pressure, production well bottom-hole flowing pressure, WAG gas injection, and WAG water injection.
The final optimization results obtained by optimizing these parameters are shown in Table 6. Figures 11 and 12 illustrate the optimized cumulative oil production and daily oil production, respectively. The figures show that the cumulative oil production and daily oil production under the CO2-WAG method were significantly higher than under the WF method, which indicates the considerable potential of CO2-EOR technology for improving oil recovery. After optimization, the cumulative oil production increased from 425,916 m3 to 475,047 m3, a gain of about 11.5%. This implies that, by optimizing the operational parameters, the oil production potential of the reservoir can be further enhanced.

Discussion and Conclusions
Machine learning is a widely used approach to data processing, and the predictive model used in this study still has significant room for improvement. The XGBoost model exposes hyperparameters such as the number of trees, the maximum tree depth, and the learning rate, which can be adjusted to improve the model's predictive accuracy. Optimization techniques such as PSO or a genetic algorithm (GA) can also be introduced to tune these hyperparameters. Furthermore, the evaluation and optimization in this study considered only the cumulative production, daily production, and recovery factor as objective functions. Future research could incorporate other metrics, such as net present value (NPV), as objective functions.
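
The study does not define an NPV objective. As one possible form, a commonly used discounted cash-flow expression is given below. All symbols are assumptions introduced here for illustration: R_t and C_t are the revenue and operating cost in period t, r is the discount rate per period, C_0 is the initial investment, and T is the number of periods.

```latex
\mathrm{NPV} = \sum_{t=1}^{T} \frac{R_t - C_t}{(1 + r)^{t}} - C_0
```

With such an objective, the optimizer would trade off incremental oil revenue against CO2 and water injection costs rather than maximizing production alone.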
This study used the XGBoost machine learning algorithm to establish a workflow for evaluating the cumulative oil production in CO2-EOR modeling. The workflow was used for production prediction and parameter optimization in CO2-EOR. The following conclusions were drawn:
(1) Compared to traditional simulation and prediction methods, machine learning approaches can effectively handle reservoir data and address non-linear problems. By incorporating multiple factors such as geology and operations, they significantly improve the efficiency and accuracy of the models.
(2) The correlation analysis between the various factors and the cumulative oil production shows that, from a geological perspective, porosity and permeability are strongly linearly correlated with the CO2-EOR cumulative oil production. From an injection/production perspective, the CO2-WAG gas injection rate is strongly linearly correlated with the CO2-EOR cumulative oil production.
(3) Different machine learning models performed differently in predicting production. A comparison of eight production prediction models shows that the extreme gradient boosting (XGBoost) model outperforms the other machine learning models in predictive performance. The XGBoost model achieved an R² score of 0.99 on the training set and 0.98 on the testing set.
(4) The cumulative oil production, daily oil production, and recovery factor under the CO2-WAG method were significantly higher than those under the WF method. This finding suggests that CO2-EOR technology has great potential for improving the recovery factor of oil reservoirs.
(5) During the optimization of the CO2-EOR parameters, PSO was coupled with the trained XGBoost model. PSO efficiently searches the parameter space for the CO2-EOR parameters that maximize the cumulative oil production, thus saving computational cost. The optimized parameters resulted in a higher cumulative oil production and recovery factor than the previous results.
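
The two statistics behind conclusions (2) and (3), the linear correlation coefficient and the R² score, can be computed as in the sketch below. The study does not state which definitions were used, so this assumes the usual Pearson correlation coefficient and the usual coefficient of determination. The function names and the sample values are hypothetical and serve only as illustration.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical values, for illustration only (not data from the study).
porosity = [0.10, 0.12, 0.15, 0.18, 0.20]
cum_oil_simulated = [310.0, 352.0, 401.0, 455.0, 498.0]
cum_oil_predicted = [315.0, 349.0, 405.0, 450.0, 500.0]

print("Pearson r:", pearson_r(porosity, cum_oil_simulated))
print("R2:", r2_score(cum_oil_simulated, cum_oil_predicted))
```

An R² close to 1 on both the training and testing sets, as reported for XGBoost, indicates that the proxy reproduces the simulated production with little unexplained variance and without a large gap between seen and unseen cases.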

Figure 3. Workflow diagram.

3. Establishment of Numerical Model and Database

3.1. Establishment of CO2-EOR Numerical Model
First, based on the actual geological parameters of the oilfield, a characterization model was established, which took into account factors such as well spacing, fluid properties, and heterogeneity. The numerical model consisted of a grid with dimensions of 21 × 21 × 5, with a grid spacing of 10 m in the I direction, 10 m in the J direction, and 5 m in the K direction. The feature model therefore had dimensions of 210 m in length, 210 m in width, and 25 m in depth. The well pattern was deployed as a 1/4 five-spot pattern, with one injector well and one producer well per pattern, as shown in Figure 4. The basic parameters of the feature model are described in Table 1.

Figure 4. Schematic of the reservoir model in the feature model.

Figure 5. Distribution of each parameter. The red color represents geological parameters, the orange color represents fluid parameters, the green color represents phase saturation parameters, and the blue color represents injection/production parameters.

Figure 8. Correlation analysis of parameters with cumulative oil production.

Figure 11. Comparison of cumulative oil production before and after optimization.

Figure 12. Comparison of daily oil production before and after optimization.

Table 1. Basic parameters of the feature model.

Table 2. Range of values for each parameter.

Table 3. Comparison of predictive performance of the models.

Table 5. Hyperparameters of PSO algorithm and XGBoost model.

Table 6. Optimization of basic parameters of CO2-EOR.

By establishing the XGBoost-PSO optimization model and optimizing the operational parameters, this process can provide reliable guidance and decision support for reservoir development. The optimization model not only improves the cumulative oil production and recovery factor but also supports the long-term sustainable development of the oilfield. Further research on and refinement of this model are therefore warranted to improve the efficiency and benefits of reservoir development.