A Comprehensive Summary of the Application of Machine Learning Techniques for CO2-Enhanced Oil Recovery Projects

Abstract: This paper focuses on the current application of machine learning (ML) in enhanced oil recovery (EOR) through CO2 injection, which exhibits promising economic and environmental benefits for climate-change mitigation strategies. Our comprehensive review explores the diverse use cases of ML techniques in CO2-EOR, including aspects such as minimum miscibility pressure (MMP) prediction, well location optimization, oil production and recovery factor prediction, multi-objective optimization, Pressure–Volume–Temperature (PVT) property estimation, Water Alternating Gas (WAG) analysis


Introduction
There is a strong correlation between energy consumption and economic growth. Liquid fossil fuels are a key component of the energy mix, contributing up to about 35% of worldwide energy usage. While energy sources are diversifying, liquid fossil fuels are still a key energy source in developing countries such as India and China. The rapid development of these economies will most likely intensify energy generation from fossil fuels, which in turn will increase CO2 emissions that have already been rising worldwide. The IPCC report on "Global Warming of 1.5 °C" raised a major concern: unless CO2 emissions are reduced by 50% by the year 2030, major changes will occur in the ocean and on land, and they may be permanent.
Time is of the essence in the global transition to new energy systems. Bloomberg News notes that "Climate change is not a problem with a single solution. And it is not a challenge that any one group - governments, companies, scientists or individual citizens - can solve alone". Working together, we can build a healthier and more sustainable future for generations to come. With a variety of technologies (e.g., solar, wind, geothermal, nuclear, extended batteries, and hydrogen), strong government support, and the commitment of companies, universities, research centers, regulatory agencies, and others, we have a great opportunity to solve the problem.
We can distinguish two main strategies for reducing atmospheric concentrations of CO2. The first strategy includes reducing the emissions of CO2 to the atmosphere by increasing energy efficiency and switching to low-carbon fuel sources, utilizing proven and existing technologies, e.g., solar, wind, and nuclear, at a large scale and fast pace. The second strategy includes the deployment of negative emission technologies to remove carbon from the atmosphere and sequester it reliably. Some examples of this strategy may include DAC (direct air capture), CCS, and CCUS (e.g., CO2-EOR). The potential impact of these technologies on reducing CO2 emissions is immense and should not be underestimated.
Our knowledge of the reservoir management of an oil and gas field from primary to tertiary recovery phases yields an understanding of its key properties. Hence, the use of mature or declining oil and gas reservoirs to store CO2 significantly reduces subsurface uncertainties. CO2 injection is a well-documented method for improving hydrocarbon production rates and increasing recovery. Thus, in light of climate concerns, using CO2 injection for the dual objectives of enhancing oil recovery and carbon storage is a powerful choice.
Petroleum resources have been the principal source of fossil-fuel-based energy to meet the world's energy demands since the early 20th century. The importance of enhancing oil reservoir extraction efficiency has grown due to the restricted supply of reserves. Over two-thirds of the original oil in place (OOIP) remains trapped after primary and secondary recovery processes. Furthermore, extracting the remaining oil from mature reservoirs in complicated geological formations is more challenging. EOR methods are initiated to recover the remaining oil from reservoirs after both primary and secondary recovery methods have been exhausted. Common EOR approaches include surfactant flooding, chemical flooding, polymer flooding, steam stimulation, microbial flooding, gas injection, and so forth [1,2]. Carbon dioxide (CO2) injection is very successful because it increases oil production by increasing oil mobility and reducing oil viscosity and remaining oil saturation, and it works well with both conventional and some unconventional formations. CO2-EOR is one of the most popular techniques, accounting for around 20% of 1120 EOR projects worldwide (Figure 1). It may recover 15% to 25% of the OOIP of the light or medium oil fields that are close to depletion due to flooding [3].
Mach. Learn. Knowl. Extr. 2024, 6
The utilization of CO2 in EOR can significantly improve oil recovery; at the same time, it plays an essential role in environmental preservation. The importance of CO2-EOR as part of carbon capture, use, and storage (CCUS) schemes becomes more vital as the petroleum industry works toward decarbonization to mitigate greenhouse gas emissions. If reinjection is not considered, approximately 60% of injected CO2 can be trapped in the reservoir at the CO2 breakthrough [5]. This approach, efficiently utilizing CO2 in oil recovery, aligns with an environmentally friendly protocol while simultaneously enhancing resource efficiency and contributing substantially to sustainability goals [6].
Machine learning (ML) approaches have drawn considerable interest as emerging technologies in the oil and gas industry over the past 20 years. Applying ML approaches to examine issues in the oilfield development process has acquired new life with the advent of intelligent oilfields and big data technology. Indeed, ML has shown that it can offer a more straightforward and quicker method than rigorous and numerous simulations or experiments. Many ML correlations have emerged with the development of computer tools, particularly in reservoir characterization, CO2 storage, production, and drilling operations [7–10].
Many literature reviews have summarized the application of ML in the oil and gas industry [11]. However, no study of global research trends has analyzed the dominant input parameters and evaluated the research work on CO2-EOR projects. Such evaluations could help researchers get a preliminary idea of the current research trends in CO2-EOR and of whether their recent research has an impact on a particular field. Furthermore, few studies have systematically summarized and examined the literature on ML for CO2-EOR, and few reviews identify the most critical topics, objectives, input parameters, evaluations, and research gaps. This study aims to offer insight into current trends and technological development indicators, which will help identify directions and prospects for future research. To that end, data extraction analysis was carried out to ascertain the research advancement and trends in ML for CO2-EOR, and a systematic review was used to address the research gaps on this subject.
This paper aims to summarize and evaluate the various ML models in CO2-EOR and provide insightful analysis based on 101 reviewed papers. The rest of the paper is organized as follows: Section 2 describes the mechanisms and processes of CO2-EOR; Section 3 provides the most popular ML and optimization methods employed in the literature; Section 4 summarizes the work that has applied ML in the CO2-EOR process, including MMP prediction, WAG, well placement optimization, oil production or recovery factor prediction, multi-objective optimization, PVT properties estimation, and CO2-foam; and Section 5 outlines the benefits and limitations of the application of ML in the CO2-EOR process, before ending this survey paper with concluding remarks.

Mechanisms and Processes of CO2-EOR
CO2 is generally injected into the reservoir under the following conditions: (a) miscible injection; (b) immiscible front displacement after water flooding; (c) water alternating gas (WAG) displacement; and (d) CO2 dissolved in brine flooding, also referred to as carbonated water injection (CWI) [12]. Miscible displacement has been successful over the years. It occurs at pressures above the minimum miscibility pressure (MMP) of the oil, where the injected gas and the hydrocarbons are entirely miscible and form a single-phase fluid. The main advantages of miscible displacement are that it can promote oil swelling, reduce fluid viscosity, increase mobility, reduce remaining oil saturation, and improve oil production.
CO2 has been historically favored over other gases due to its low MMP. Furthermore, CO2 gas injection can potentially mitigate greenhouse gas emissions while improving oil recovery. CO2-miscible flooding, whether initiated upon first contact or multiple contacts, results in the remaining oil and CO2 becoming miscible, which leads to near-zero interfacial tension (IFT), no capillary pressure, and improved volumetric sweep (Ev) and displacement efficiency (Ed) [13]. Conversely, in the case of CO2-immiscible flooding, the IFT is not near zero, maintaining the capillary pressure and causing some residual oil saturation. The oil recovery efficacy is contingent upon the efficiency of fluid displacement, volumetric sweep, and CO2 solubility in the oleic phase, which increases oil mobility. These characteristics are influenced by various factors, including gravity, rock wettability, reservoir heterogeneity, crude oil phase behavior, and phenomena such as viscous fingering [12,14].

Summary of Machine Learning Approaches
Machine learning (ML) involves the development of computational models and algorithms capable of learning patterns and making data-driven predictions or decisions without being explicitly programmed. ML algorithms employ data to automatically identify and generalize patterns, which may be applied for classification, regression, clustering, and other tasks. ML can be categorized into four main types: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. Figure 2 provides some examples of different ML algorithms. Among these various algorithms, supervised learning is most applied in the oil and gas industry [11].

For instance, ANNs have demonstrated remarkable efficacy in providing user-friendly, cost-effective, reliable, and expedited solutions to a variety of complex challenges encountered in the oil and gas industry. This is primarily attributed to the inherent complexity and non-linear nature of oil and gas datasets, which often have intricate relationships between input variables and output parameters. ANNs excel in capturing these complex relationships by effectively modeling non-linear functions. Moreover, oil and gas data are frequently characterized by noise, incompleteness, and heterogeneity. ANNs exhibit superior capability in handling such diverse data types and can adapt to varying data distributions, thereby making them highly versatile for addressing various tasks across different domains within the industry.
Furthermore, the enhancement of the ML process involves optimization techniques to determine optimal values for control parameters, including the spreading coefficient, number of neurons, biases, and weights. Several optimization methods, such as the Levenberg-Marquardt (LM) algorithm, genetic algorithm (GA), and smart nature-inspired swarm algorithms like particle swarm optimization (PSO), grey wolf optimization (GWO), and ant colony optimization (ACO), have demonstrated their efficacy in achieving significant improvements in these tasks. There are two categories of intelligent optimization algorithms: single-objective optimization and multi-objective optimization (Figure 3).
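To make the role of a swarm optimizer more concrete, the following is a minimal sketch of a global-best PSO in plain Python. It is not taken from any of the reviewed papers: the inertia and acceleration coefficients are commonly used textbook values, and `toy_error` is a hypothetical stand-in for a model's validation error as a function of two tuning parameters.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Minimize `objective` over a box using a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # personal bests
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]     # global best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_cognitive * r1 * (best_pos[i][d] - pos[i][d])
                             + c_social * r2 * (g_pos[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val

# Hypothetical error surface (an assumption for illustration only);
# its minimum is 0 at (3.0, -1.0).
def toy_error(params):
    x, y = params
    return (x - 3.0) ** 2 + (y + 1.0) ** 2

best_params, best_error = pso_minimize(toy_error, bounds=[(-10, 10), (-10, 10)])
```

In a real workflow the objective would be the validation error of an ML model for a given set of control parameters, which is far more expensive to evaluate than this toy function.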


Application of ML in CO2-EOR

Minimum Miscibility Pressure (MMP)
In miscible gas injection, MMP is one of the most important parameters in the design of miscible CO2 flooding. Traditionally, MMP is defined as the pressure at which 80% of the OOIP is extracted from the reservoir upon the breakthrough of CO2 [16]. Because CO2 flooding is more expensive than waterflooding, an accurate estimation of MMP can help better design miscible CO2 flooding, ultimately leading to cost savings. In the literature, researchers have proposed various MMP estimation approaches, including the following: (a) experimental methods such as slim-tube tests [17], rising-bubble apparatus [18], and vanishing interfacial tension [19]; and (b) empirical correlations [17,20–22] and computational techniques such as single mixing-cell and multiple mixing-cell approaches [23].
However, though accurate and reliable, experimental methods are time-consuming and expensive. Most empirical correlations and computational techniques do not consider different thermodynamic and reservoir properties. Moreover, they exhibit limitations in accurately estimating the trend of MMP with respect to their input parameters [24]. In contrast, the advent of ML has provided various robust algorithms for regression and classification problems. Consequently, a considerable number of studies have been dedicated to the precise modeling of MMP, and the successful application of ML in this domain is well documented.
The earliest application of ML to CO2-EOR MMP can be traced back to 2003, when Huang et al. [25] first introduced ANN into this field. Subsequently, Emera and Sarma [26] employed the GA to optimize the MMP prediction process. Since 2010, there has been a gradual increase in the adoption of ML algorithms and optimization techniques, accompanied by a significant expansion of the available dataset. Nowadays, the application of ML in predicting MMP has evolved into a more mature state. A comprehensive survey of the literature on CO2-oil MMP estimation using ML, spanning the period from 2003 to the present, is summarized in Table 1. Each reviewed paper is scrutinized and synthesized with respect to the employed algorithms, dataset size, data splitting methods, input variables, outcomes, our assessment, and a rating score. A paper deserving a high rating ought to exhibit certain characteristics, such as the following: a substantial dataset, typically comprising no fewer than 100 data points; a demonstration of effective model generalization without signs of overfitting, where the training dataset constitutes a maximum of 80% of the total data; and validation through empirical evidence derived from experimental and/or field data. Furthermore, a high-rated paper should demonstrate depth in result analysis, including a thorough examination of the outcomes in comparison to other existing models.
Figure 4 presents a statistical analysis of 56 research papers. It reveals a remarkable surge in the adoption of ML methodologies within this domain. ANN and GA have emerged as the most favored choices among the many ML and optimization algorithms. ANNs, particularly RBFNN and MLP, are prominently employed. We have provided a separate categorization for RBFNN and MLP to afford a more detailed perspective on their individual utilization patterns.
Furthermore, an essential factor impacting the efficacy of ML models in MMP predictions is the size of the dataset. It is widely recognized that an inadequately sized dataset can lead to overfitting, potentially compromising the model's generalizability. A substantial proportion of the examined papers (64%) have datasets with fewer than 200 data points, with a noteworthy subset (21%) relying on datasets with fewer than 100 data points. This prevalence of small datasets necessitates critically examining the quality and robustness of models trained on such limited data. Therefore, it becomes paramount to consider the trade-offs between the advantages of ML applications and the constraints posed by data scarcity in the context of MMP prediction.
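The data-splitting and overfitting concerns above can be illustrated with a small sketch. It caps the training share at 80%, fits a deliberately simple one-variable least-squares line, and compares the coefficient of determination on the training and held-out sets. The data are synthetic (temperature, MMP) pairs invented for illustration, and the helper names are hypothetical, not drawn from the reviewed studies.

```python
import random

def train_test_split(rows, train_fraction=0.8, seed=0):
    """Shuffle rows and split them; the training share is capped at 80%."""
    if not 0.0 < train_fraction <= 0.8:
        raise ValueError("keep the training share at or below 80%")
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def fit_line(rows):
    """Ordinary least squares for y = a*x + b on (x, y) pairs."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    a = sxy / sxx
    return a, my - a * mx

def r_squared(rows, a, b):
    """Coefficient of determination of the line on the given rows."""
    my = sum(y for _, y in rows) / len(rows)
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in rows)
    ss_tot = sum((y - my) ** 2 for _, y in rows)
    return 1.0 - ss_res / ss_tot

# Synthetic (temperature, MMP) pairs, invented purely for illustration.
rng = random.Random(1)
data = [(t, 0.12 * t + 5.0 + rng.gauss(0.0, 0.5)) for t in range(40, 160)]

train, test = train_test_split(data, train_fraction=0.8)
a, b = fit_line(train)
r2_train, r2_test = r_squared(train, a, b), r_squared(test, a, b)
# A large positive gap would suggest the model memorized the training set.
gap = r2_train - r2_test
```

With a flexible model such as an ANN and only a few dozen points, the same comparison would typically show a much larger gap between training and test scores, which is the symptom the rating criteria are meant to catch.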
As summarized in Table 1, the most dominant parameters affecting pure CO2 MMP are reservoir temperature, the molecular weight of C5+ or C7+, the mole fraction of volatile oil elements, and the mole fraction of intermediate oil elements. Meanwhile, for impure CO2 MMP, additional parameters such as the mole fraction of gas, including C1 to C4, CO2, N2, and H2S, are also considered. Some studies included volatile oil components (C1 and N2) as well.
A more rigorous way to investigate the impact of each input variable involves conducting sensitivity analysis, a widely employed way to analyze the effect of each input parameter on model predictions. It plays a crucial role in model interpretation, validation, and feature selection, ultimately improving the trustworthiness and transparency of machine learning models. Methods like SHAP (Shapley Additive exPlanations) and relevancy factors are commonly used for sensitivity analysis. Nevertheless, few existing studies [27,28] have performed a sensitivity analysis, while the majority of research only compares their models with experimental and/or empirical results. Future research endeavors should allocate attention toward sensitivity analysis, thereby enhancing the completeness and credibility of machine learning studies.
*: On a scale of 1 to 10, a higher score indicates higher quality of the article.
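As a rough illustration of the relevancy-factor idea, the sketch below assumes the common definition in which the relevancy factor of an input is its Pearson correlation with the output; individual studies may define it differently. The inputs, the response, and all names are synthetic and hypothetical.

```python
import math
import random

def relevancy_factor(inputs, outputs):
    """Pearson correlation between one input column and the model output."""
    n = len(inputs)
    mean_in = sum(inputs) / n
    mean_out = sum(outputs) / n
    cov = sum((x - mean_in) * (y - mean_out) for x, y in zip(inputs, outputs))
    var_in = sum((x - mean_in) ** 2 for x in inputs)
    var_out = sum((y - mean_out) ** 2 for y in outputs)
    return cov / math.sqrt(var_in * var_out)

# Synthetic inputs and a made-up response; none of this is measured data.
rng = random.Random(0)
temperature = [rng.uniform(300.0, 400.0) for _ in range(200)]
mw_c5_plus = [rng.uniform(150.0, 250.0) for _ in range(200)]
unrelated = [rng.uniform(0.0, 1.0) for _ in range(200)]
mmp = [0.10 * t + 0.03 * m + rng.gauss(0.0, 0.5)
       for t, m in zip(temperature, mw_c5_plus)]

factors = {
    "temperature": relevancy_factor(temperature, mmp),
    "mw_c5_plus": relevancy_factor(mw_c5_plus, mmp),
    "unrelated": relevancy_factor(unrelated, mmp),
}
# Rank inputs by the magnitude of their relevancy factor.
ranking = sorted(factors, key=lambda name: abs(factors[name]), reverse=True)
```

Because this measure only captures linear association, a SHAP-style attribution may be more informative when the model response is strongly non-linear.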

Water-Alternating-Gas (WAG)
WAG injection, a widely adopted EOR technique, cyclically injects water and gas, typically CO2 or CO2-hydrocarbon blends, to increase sweep efficiency and maximize oil recovery. Optimizing parameters such as the WAG ratio and the duration of each cycle, in light of the reservoir properties, is pivotal for achieving favorable economic outcomes. ML methods have been applied to WAG only relatively recently. The earliest application dates to 2016, when Hosseinzadeh Helaleh and Alizadeh [76] employed SVM together with three optimization methods (ACO, PSO, and GA) to predict fractional oil recovery. In 2018, Nait Amar et al. [77] used a time-dependent multi-ANN to predict the total field oil production. Later, Nait Amar and Zeraibi [78] successfully applied SVR to construct a dynamic proxy of a field in Algeria, complemented by a genetic algorithm (GA) for optimizing water-alternating-CO2 parameters. A more detailed summary is listed in Table 2. Figure 5 provides a statistical analysis based on 26 papers. As with MMP, the most popular ML algorithm is ANN, and the most preferred optimization method is GA.

Well Placement Optimization (WPO)
WPO plays an essential role in reservoir management and development. Because drilling and maintaining wells is expensive, good well placement helps maximize oil recovery and improve project economics. However, it is considered one of the most challenging tasks because numerous scenarios must be evaluated computationally to identify the optimal well locations and achieve maximum production. The complexity of geological heterogeneities, such as variations in permeability and porosity, the existence of multiple facies, and stratigraphic and structural boundary conditions, requires extensive computational effort. Furthermore, small changes in well locations can lead to significant changes in predicted oil recovery, making the optimization more challenging. Simulations for hundreds or thousands of scenarios need to be run to make the best decision.
In recent years, the integration of ML approaches has been proposed in the literature as a potential solution. ML can accelerate computation, enabling promising scenarios to be identified more quickly within numerical simulation workflows. Despite the recognized importance of optimizing well placement, investigations of CO2 injector locations for optimal oil recovery and storage are relatively infrequent (Table 3). Most research is focused on waterflood injector selection [102].

Oil Production/Recovery Factor
The recovery factor, defined as the ratio of produced oil to OOIP, is one of the most crucial success metrics for evaluating all EOR projects, as it determines how much incremental or ultimate oil is produced. Accurately predicting the recovery factor is challenging because it depends on diverse factors, including reservoir characteristics and heterogeneity, fluid properties, well design, injection conditions, and the composition of the injected fluid. Reservoir simulations, together with laboratory experiments at reservoir conditions, can help predict the recovery factor. After that, a small-scale pilot test is conducted before undertaking larger-scale operations [105]. Although this approach may provide solutions to numerous problems, it is costly and time-consuming. Therefore, ML methods have emerged as practical, affordable, and rapid alternatives that can retain good accuracy.
Accordingly, ML methods have gained popularity in predicting oil recovery. For example, Ahmadi et al. [106] applied LSSVM to predict the ultimate oil recovery factor of miscible CO2-EOR injection operations at different rock, fluid, and process conditions. Karacan [107] employed fuzzy logic to predict the recovery factors of the major past and existing U.S. field applications of miscible CO2-EOR. Table 4 provides further information on ML applications to CO2-EOR recovery factor prediction.
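For reference, the recovery factor defined in this section (produced oil divided by OOIP) can be written as a short function. The volumes below are hypothetical and chosen only to be consistent with the ranges quoted earlier in this paper; the function and variable names are illustrative assumptions.

```python
def recovery_factor(produced_oil, ooip):
    """Return produced oil as a fraction of original oil in place (OOIP)."""
    if ooip <= 0:
        raise ValueError("OOIP must be positive")
    if not 0 <= produced_oil <= ooip:
        raise ValueError("produced oil must lie between 0 and OOIP")
    return produced_oil / ooip

# Hypothetical volumes in the same unit (e.g., million stock-tank barrels).
ooip = 100.0
after_secondary = 30.0      # cumulative oil after primary + secondary recovery
after_co2_eor = 48.0        # cumulative oil after a CO2 flood

rf_before = recovery_factor(after_secondary, ooip)
rf_after = recovery_factor(after_co2_eor, ooip)
# Incremental recovery attributable to the CO2 flood in this toy case.
incremental_rf = rf_after - rf_before
```

An ML model for this task would predict `rf_after` (or the incremental value) from rock, fluid, and process inputs instead of computing it from measured volumes.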

Multi-Objective Optimization
As the name indicates, multi-objective optimization optimizes multiple objectives simultaneously, such as the oil recovery factor or cumulative oil production, CO2 storage, and net present value (NPV). For each objective, running high-fidelity numerical models provides candidate solutions from which the optimum can be identified. However, finding solutions that are optimal for all objectives simultaneously is not always possible, since objectives can compete with each other. For example, maximizing oil recovery may require injecting more CO2, which raises oil production but can also increase the project's cost and thereby adversely affect the project NPV [110]. Balancing these objectives under all the constraints of the problem requires sophisticated optimization techniques. ML techniques therefore offer an effective, reliable, and stable workflow to co-optimize crude oil recovery, CO2 sequestration, NPV, and related factors.
Given the complexity of multi-objective optimization, the application of ML to CO2-EOR remains very limited (Table 5) and is strongly restricted by the geological model. Once the reservoir characteristics change, the model must be rebuilt and retrained. Developing the ML and optimization workflow is challenging and requires additional effort for each oil and gas field.
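The notion of competing objectives in this section can be illustrated with a Pareto (non-dominated) filter. This is a generic sketch and not the workflow of any reviewed study; the candidate scenarios and their values are invented, and all three objectives are assumed to be maximized.

```python
from typing import NamedTuple

class Scenario(NamedTuple):
    name: str
    oil_recovery: float   # fraction of OOIP, higher is better
    co2_stored: float     # e.g., million tonnes, higher is better
    npv: float            # e.g., million USD, higher is better

def dominates(a, b):
    """True if `a` is at least as good as `b` everywhere and better somewhere."""
    a_vals, b_vals = a[1:], b[1:]
    return (all(x >= y for x, y in zip(a_vals, b_vals))
            and any(x > y for x, y in zip(a_vals, b_vals)))

def pareto_front(scenarios):
    """Keep the scenarios that no other scenario dominates."""
    return [s for s in scenarios
            if not any(dominates(other, s) for other in scenarios)]

# Invented candidate injection designs; the numbers carry no field meaning.
candidates = [
    Scenario("A", 0.45, 1.0, 120.0),
    Scenario("B", 0.50, 1.4, 100.0),
    Scenario("C", 0.42, 0.9, 110.0),   # worse than A on every objective
    Scenario("D", 0.55, 1.8, 60.0),
]
front = [s.name for s in pareto_front(candidates)]
```

Here scenario C is discarded because A beats it on every objective, while A, B, and D remain because each trades one objective against another. In practice, an ML proxy would supply the objective values for many more candidates than a full simulator could evaluate.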

PVT Properties
For any CO2-flooding project, it is imperative to comprehend the intricate physical and chemical interactions between CO2 and the reservoir oil, even when primarily exploring recovery potential. Laboratory investigations and the utilization of available modeling or correlation packages serve as viable methods for analyzing the influence of CO2 on the physical properties of oil. Nonetheless, conducting a comprehensive laboratory study to obtain an extensive dataset is costly and time-consuming. Furthermore, the available correlation packages are limited in their applicability, rendering them unsuitable for many scenarios.
ML is being increasingly harnessed for tasks such as predicting CO2 solubility and interfacial tension (IFT), as briefly presented in Table 6. Notably, a majority of the studies used the same dataset sourced from Emera and Sarma [115]. Given the relatively small dataset size of only 106 data points, the risk of overfitting looms large, casting doubt on the accuracy and generalizability of these ML models. A larger and more diverse dataset is required to facilitate a deeper understanding of the performance of ML techniques in this context.


CO2-Foam Flooding
The implementation of CO2 injection in EOR demonstrates significant potential, but it is accompanied by inherent limitations, including suboptimal sweep efficiency, asphaltene precipitation, and the corrosion of well infrastructure. In response to these challenges, the utilization of CO2 foam has emerged as a promising strategy to enhance the effectiveness of CO2-EOR flooding. Foams offer distinct advantages, primarily due to their elevated viscosities compared to pure gases, a property that equips foams with the capability to displace oil from reservoir formations more efficiently [119]. Furthermore, by obstructing highly permeable pore pathways, foams redirect displaced fluids toward unswept reservoir regions, thereby improving both the sweep efficiency and the storage capacity of CO2 within the reservoir matrix. While ML models have found extensive applications in EOR research, their application in the context of CO2 foam is still in its nascent stages, and the existing body of literature on this subject remains limited, as evidenced in Table 7.

Benefits and Limitations of ML
ML exhibits high efficiency when compared with conventional reservoir simulators. Typically, these simulators are run on 3D grids comprising one million to several billion cells. Computations tend to be time-consuming, which constrains the feasibility of conducting multiple iterations and, consequently, reduces the optimization potential for meticulous field development planning. A pivotal role of ML techniques is their capacity to speed up reservoir modeling computations. These models can predict time-dependent variables 100 to 1000 times faster than traditional simulators while maintaining an equivalent level of functionality [11].
Furthermore, extensive research findings have demonstrated the impressive performance of ML methods, which consistently yield accuracy levels exceeding 90% based on statistical quality assessments. This high degree of accuracy supports confidence in ML's reliability and portends a promising future within the oil and gas industry.
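As an illustration of what such a statistical quality assessment can look like, the following minimal Python sketch computes the coefficient of determination (R²) and the root-mean-square error (RMSE) for a set of predictions. It assumes these two metrics for illustration only; the function names are hypothetical and the measured and predicted values are invented, not taken from any reviewed paper.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error between measurements and predictions."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


# Invented example values (e.g., MMP in MPa); for illustration only.
measured = [10.0, 12.0, 14.0, 16.0, 18.0]
predicted = [10.5, 11.5, 14.0, 16.5, 17.5]

print(f"R^2  = {r_squared(measured, predicted):.3f}")
print(f"RMSE = {rmse(measured, predicted):.3f}")
```

An R² above 0.9 on held-out data is one way a reported accuracy above 90% could be expressed, although a high score on a small test set says little about generalization.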
While the advantages of employing ML are widely acknowledged, it is imperative to recognize the limitations inherent in ML-based methodologies. A central challenge confronting researchers is obtaining authentic data from experimental and/or field sources. The limited availability of large datasets is also a concern, as it affects both the training accuracy and the overall efficacy of the ML models. When faced with restricted data, researchers often use single-shot learning strategies, wherein models are pre-trained on similar datasets and subsequently refined through experience.
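The pre-train-then-refine idea can be sketched with a deliberately simple model. The Python example below is a minimal illustration under stated assumptions: a one-dimensional linear model, synthetic noise-free data, and hypothetical function and variable names. It is not the procedure of any reviewed study. The model is first fitted on a larger "source" dataset and then refined with a few gradient steps on a small "target" dataset, and the result is compared with training on the target data from scratch for the same number of steps.

```python
def fit_linear(xs, ys, w=0.0, b=0.0, lr=0.05, epochs=200):
    """Fit y = w*x + b by batch gradient descent on the mean squared error,
    starting from the supplied (w, b)."""
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mse(xs, ys, w, b):
    """Mean squared error of the linear model on (xs, ys)."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic "source" data: plentiful, similar to the target relationship.
src_x = [i / 49 for i in range(50)]
src_y = [2.0 * x + 1.0 for x in src_x]

# Synthetic "target" data: only four points, slightly different relationship.
tgt_x = [0.1, 0.4, 0.6, 0.9]
tgt_y = [2.2 * x + 1.1 for x in tgt_x]

# Pre-train on the source data, then refine briefly on the target data.
w_pre, b_pre = fit_linear(src_x, src_y, epochs=2000)
w_ft, b_ft = fit_linear(tgt_x, tgt_y, w=w_pre, b=b_pre, epochs=20)

# Baseline: the same short training budget, but starting from scratch.
w_sc, b_sc = fit_linear(tgt_x, tgt_y, epochs=20)

mse_finetuned = mse(tgt_x, tgt_y, w_ft, b_ft)
mse_scratch = mse(tgt_x, tgt_y, w_sc, b_sc)
print(f"fine-tuned MSE: {mse_finetuned:.5f}, from-scratch MSE: {mse_scratch:.5f}")
```

In this toy setting the pre-trained start gives a lower target error for the same training budget, but only because the source and target relationships were chosen to be similar.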
Overfitting is a prevalent issue in ML applications, primarily driven by insufficient training data and the absence of well-defined stopping criteria during training. In total, 12% of the reviewed research papers contain datasets with fewer than 100 data points, heightening the risk of overfitting. Addressing this problem may involve adjusting the model's structure, including weight modifications. However, it is important to recognize that such alterations can increase model complexity, potentially limiting its generalization beyond the specific dataset.
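One common form of a well-defined stopping criterion is patience-based early stopping on a held-out validation loss. The Python sketch below is a minimal illustration of that idea, not a criterion prescribed by the reviewed papers. The class name, its parameters, and the validation-loss values are all hypothetical.

```python
class EarlyStopping:
    """Signal a stop once the validation loss has not improved by more than
    `min_delta` for `patience` consecutive epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.best_epoch = -1
        self.bad_epochs = 0

    def step(self, epoch, val_loss):
        """Record one validation loss; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Invented validation-loss curve: it improves, then rises as the model overfits.
val_losses = [0.90, 0.60, 0.45, 0.40, 0.42, 0.47, 0.55, 0.70]

stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(epoch, loss):
        stopped_at = epoch
        break

print(f"stopped at epoch {stopped_at}; best epoch was {stopper.best_epoch}")
```

In practice, the model weights saved at the best epoch would be restored after stopping. With very small datasets the validation split itself becomes noisy, so a criterion of this kind reduces the risk of overfitting but does not remove it.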
More efforts are needed to advance ML applications within the oil and gas industry. For instance, integrating knowledge from multiple disciplines, such as geology, reservoir engineering, and petrophysics, into ML models could enhance model accuracy and interpretability. Future endeavors may involve the development of hybrid models that combine ML techniques with physics-based methodologies. Another improvement is mitigating data scarcity and heterogeneity, which requires concerted efforts to address bias and model generalization. Researchers could focus on deploying data augmentation techniques, employing transfer learning methodologies, and refining ML algorithms to handle sparse and noisy data effectively.

Conclusions
In this work, we have investigated and summarized the use of ML methods in CO2-EOR across several areas: MMP, WAG, well location placement, oil production/recovery factor, multi-objective optimization, PVT properties, and CO2 foam. We have listed the input parameters, objectives, data sources, results, evaluation, and rating for each area based on the data quality, ML process, and results analysis. The important highlights of this work are summarized by seven key points as follows:
• Most reports on model performance indicators are limited by the size of the data bank, with 12% of the investigated papers having a database of fewer than 100 data points, making it difficult to accurately assess the quality of the model over time or track its drift with new data;
• Regarding validation and verification, CO2-EOR has many reliable, dependable, and well-established techniques for the verification and validation of ML models; nevertheless, the research highlights several issues with current ML models, including scalability, validation and verification deficiencies, and an absence of published data regarding the establishment costs of ML models;
• Most CO2-EOR research focused on MMP predictions and WAG design, with 56 out of 101 papers devoted to MMP prediction and 26 out of 101 papers to WAG design; the applications in the recovery factor, well placement optimization, and PVT properties are limited;
• ANN is the most employed ML algorithm, and GA is the most popular optimization algorithm based on the 101 reviewed papers. ANN has proven flexible enough to be used to build intelligent proxies; while oil and gas data are frequently characterized by noise, incompleteness, heterogeneity, and nonlinearity, ANNs exhibit superior capability in handling such diverse data types and can adapt to varying data distributions;
• ML algorithms have the potential to greatly reduce the computational cost and time needed to perform compositional simulation runs; however, ML applications for well placement and multi-objective optimizations in CO2-EOR are very limited given the complexity of the problem. Furthermore, the reliability of coupled ML-metaheuristic paradigms based on reservoir simulation results needs further investigation;
• The application of ML in the oil and gas industry still requires further exploration and development. Future work can focus on integrating knowledge from multiple disciplines, such as geology, reservoir engineering, and petrophysics, with ML models to enhance accuracy and interpretability; another focus area could be the development of hybrid models that implement ML techniques alongside physics-based methodologies, providing robust and reliable support;
• In summary, this study provides a comprehensive overview of the application of ML and optimization techniques in CO2-EOR projects; our work contributes to the advancement of knowledge in the field by providing a synthesis of the latest research; these methods have demonstrated their ability to improve the efficiency, production forecast, and economic viability of CO2-EOR operations; the insights gained from this study provide valuable guidance for the future direction of ML applications in CO2-EOR R&D (research and development) and deployment.

Figure 2. Examples of different machine learning algorithms.

Figure 4. (a) Rise of ML application papers in MMP prediction; (b) occurrence of different ML algorithms; (c) distribution of dataset size.

Figure 5. Occurrence of ML algorithms in WAG.

Table 1. Summary of ML application on CO2-EOR MMP.

Table 2. Summary for ML applications on WAG.

Table 3. Summary of ML applications in well location optimization.

Table 4. Summary of ML applications on oil production/recovery factor.

Table 5. Summary of ML applications on multi-objective optimizations.

Table 6. Summary of ML application on PVT properties.

Table 7. Summary of ML application on CO2-foam EOR.