Practical CO2-WAG Field Operational Designs Using Hybrid Numerical-Machine-Learning Approaches

Machine-learning technologies have exhibited robust competence in solving many petroleum engineering problems. Their accurate predictions and fast computational speed accelerate many time-consuming engineering processes such as history-matching and field development optimization. The Southwest Regional Partnership on Carbon Sequestration (SWP) project requires rigorous history-matching and multi-objective optimization processes, which fit the strengths of machine-learning approaches. Although the machine-learning proxy models are trained and validated before being applied to practical problems, their error margins inevitably introduce uncertainties into the results. In this paper, a hybrid numerical machine-learning workflow for solving various optimization problems is presented. By coupling the expert machine-learning proxies with a global optimizer, the workflow successfully solves the history-matching and CO2 water-alternating-gas (WAG) design problems with low computational overhead. The history-matching work considers the heterogeneities of multiphase relative permeability characteristics, and the CO2-WAG injection design takes multiple techno-economic objective functions into account. This work trained an expert response surface, a support vector machine, and a multi-layer neural network as proxy models to effectively learn the high-dimensional nonlinear data structure. The proposed workflow suggests revisiting the high-fidelity numerical simulator for validation purposes. The experience gained from this work provides valuable guiding insights for similar CO2 enhanced oil recovery (EOR) projects.


Introduction
The Southwest Regional Partnership on Carbon Sequestration (SWP) project focuses on the design and monitoring of a field-scale CO2 enhanced oil recovery (EOR) process in the Farnsworth Unit (FWU) located in the Anadarko Basin, Texas. From 2010 to 2014, 16.82 billion standard cubic feet of anthropogenic CO2 was injected into the Morrow-B sand [1]. The CO2 utilized in this project is captured by the Arkalon Ethanol Plant and the Agrium Fertilizer Plant, located in Liberal, Kansas, and Borger, Texas, respectively [2]. According to Munson [3], the original oil in place (OOIP) and original gas in place (OGIP) are approximately 120 million (MM) barrels and 41.48 billion standard cubic feet (SCF), respectively. The field development was initiated in 1955 and the waterflood started in 1963. Since the end of 2010, the FWU field has been undergoing a water-alternating-CO2 injection process to extract the residual oil in place.
The Morrow B sandstone formation is located at a depth interval between 7550 ft and 7950 ft. The formation has an average dip angle of less than 1° [4] and was deposited in the late Pennsylvanian by a fluvial system in an incised valley [5].

The machine-learning proxies are coupled in the workflows to generate fast and high-quality reservoir engineering analyses. The discussion in this paper starts with the description of the reservoir model. Then, the machine-learning proxies employed in this work are summarized. Afterwards, the optimization algorithms and the treatment of the multi-objective problem are presented. By coupling the machine-learning proxies with the optimization protocols, two workflows are structured to solve the history-matching and CO2-WAG design optimization problems. Finally, we present case studies imposing the proposed workflow on the FWU field.

Reservoir Modeling
A reservoir model is the most fundamental element of the proposed workflow. The establishment of the numerical simulation model assembles geological, petrophysical, and field historical data. Reservoir engineers have structured many versions of numerical simulation models for the FWU field considering hydrodynamic, geo-mechanical, and geochemical mechanisms of the fluid transportation in Morrow-B formation. The field scale numerical models are validated via rigorous history-matching processes using the historical injection and production data and employed to build forecasting scenarios. In addition, sector models around the well 13-10A pattern are utilized to validate the optimization workflows.

Hydraulic Flow Unit
The geological model takes advantage of the petrophysical properties interpolated from laboratory measurements using the hydraulic flow unit (HFU). Morrow-B sandstone exhibits strong heterogeneity arising from diagenetic processes, which leads to varied pore structures, multiphase flow characteristics, and wettability. Moreover, rock and fluid property variations could evolve progressively after CO2-WAG injection starts. The porosity-permeability relationship can be identified using the HFU to classify the sedimentologic and diagenetic heterogeneity of the Morrow-B formation.
Morrow B sandstone was first classified into five porosity facies and eight subfacies, and eight HFUs were identified to characterize hydrogeologic heterogeneities using the well log and petrophysical data collected from various scientific wells [13]. Follow-up research suggested lumping HFUs 5 to 8 into a single group due to the similar flow features observed in core-flooding experiments [27].
The geological model continuously utilizes the permeability-porosity relationship developed from the HFUs. When a reservoir simulation model is structured, grids are assigned with different relative permeability curves based on the HFU characteristics. In Figure 1, the HFU distributions at different simulation layers of the reservoir model are displayed.

Updated Geological Model
As the reservoir characterization of the FWU field has advanced, the original simulation model was restructured using an updated outer boundary and property population protocol. As shown in Figure 2, the permeability and porosity distributions of the updated geological model are displayed in Figure 2a,c using a 100 ft by 100 ft mesh grid system. To implement the updated geological model, the history-matching process must be revisited from the primary and secondary stages. Due to the long production period (from 1956 to 2010), completing one simulation case takes more than 12,000 s of central processing unit (CPU) time, which makes the computational cost of the history-matching process prohibitively expensive. At such a running speed, even preparing the dataset to train the proxy model becomes unrealistic. Thus, upscaling the original model became necessary. Figure 2b,d show the upscaled permeability and porosity distributions using a 200 ft by 200 ft mesh grid system. Figure 3 compares the simulation results for oil and water production using the original and upscaled reservoir models, with an average disparity of less than 0.8%. The upscaled model significantly reduces the computational time to less than 300 s of CPU time, which is 40 times faster than the original model. With the help of upscaling, the history-matching of the primary and secondary recovery periods is done using the coarser grid first and then validated on the finer grid.


Injection Pattern Model
A sector model focusing on the Well 13-10A pattern was structured to simulate the smaller area. As shown in Figure 4, the sector model is discretized using a 60 by 60 Cartesian grid system (mesh size is 26.7 ft by 26.7 ft) and simulates 1/8 of the five-spot injection pattern. The property distribution is established using the data collected from 13-A well. This work employs the sector model to generate datasets for the testing purposes of the optimization workflow, since the computational cost of completing one simulation is low. Moreover, the injection pattern model is suitable to investigate multiple mechanisms near the scientific well.

The aforementioned simulation models played a crucial role in this work in terms of comprehending the fluid flow dynamics in the Morrow-B sands, and more importantly, generating the desired data to develop the machine-learning proxy models. It is worth stressing that the interaction between the high-fidelity numerical and machine-learning protocols is critical for the long-term development and updating of the workflow.


Machine-Learning Proxies
In this work, the history-matching and CO2 injection design processes employed many machine-learning technologies such as response surface models (RSM), multi-layer neural networks (MLNN), and support vector machines (SVM). These expert machine-learning models (also called "proxies") are developed to learn from the numerical simulation models and investigate the fluid transportation mechanisms from a data-driven perspective. Due to their high computational efficiency and robust generalization competence, machine-learning models have been successfully developed and act as regression tools to evaluate the field performance in terms of fluid recovery, injection performance, pressure responses, and economic assessments.
In the history-matching applications, the proxy models correlate the uncertain hydrodynamical properties with the fluid production/injection and pressure data, which assists the evaluation of the fitting error during the tuning process. There have been many generations of reservoir simulation models, evolving as more geological, petrophysical, and field operational data became available. The initial history-matching work could be done without the help of machine-learning technologies because the earlier versions of the reservoir model were simpler in terms of reservoir characteristics. With the new geological model, the permeability distributions are updated and, more importantly, various relative permeability curves are assigned to the grids based on the HFU indices. Consequently, the number of uncertain parameters becomes considerably larger, and so does the required volume of simulation realizations. Therefore, the use of machine-learning technologies becomes necessary to history-match the current version of the simulation model.
After a history-matched model was structured, another class of proxy model was developed to learn the data structure presented between the CO2-WAG injection strategies and the forecasted project's techno-economic responses. The development of the proxy models includes the training process and a rigorous blind testing protocol to ensure the prediction accuracy. The following discussions summarize the machine-learning models developed in the history-matching and optimization studies of the SWP CO2-WAG projects.

Response Surface Models
The response surface model (RSM) is one of the most classic nonlinear regression algorithms. The prediction model can be expressed via Equation (1):

y = b0 + Σi bi xi + Σi bii xi^2 + Σi Σj>i bij xi xj    (1)

where b0, bi, bii, and bij are the regression coefficients; xi is the ith input feature; m is the total number of input features; and y is the output feature. To avoid overfitting, terms of higher than quadratic order are usually omitted. Gradient descent and least-squares methods [28] can be used to determine the regression coefficients. The RSM was the earliest machine-learning model used in the CO2-WAG optimization study of the SWP project. Four RSM proxies were developed to predict the oil and water production, CO2 in place, and production volumes in the CO2-WAG injection period of the project [6]. The R^2 value observed by comparing the results predicted by the proxy and high-fidelity numerical models is close to one, with relative errors of less than 0.1%, which confirms the validity of the RSM proxies.
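The quadratic response surface of Equation (1) can be fitted by ordinary least squares. The following is a minimal NumPy sketch (the data and coefficients here are synthetic illustrations, not the SWP proxies):

```python
import numpy as np

def rsm_design_matrix(X):
    """Build the full quadratic design matrix of Equation (1):
    a constant column, linear terms x_i, pure quadratic terms x_i^2,
    and interaction terms x_i * x_j (i < j)."""
    n, m = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(m)]
    cols += [X[:, i] ** 2 for i in range(m)]
    cols += [X[:, i] * X[:, j] for i in range(m) for j in range(i + 1, m)]
    return np.column_stack(cols)

def fit_rsm(X, y):
    """Least-squares estimate of the regression coefficients b."""
    A = rsm_design_matrix(X)
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    return b

def predict_rsm(b, X):
    return rsm_design_matrix(X) @ b

# Toy example: recover a known quadratic surface from sampled data
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y_true = 2.0 + 1.5 * X[:, 0] - 0.5 * X[:, 1] ** 2 + 0.8 * X[:, 0] * X[:, 1]
b = fit_rsm(X, y_true)
err = np.max(np.abs(predict_rsm(b, X) - y_true))
```

Because the target is itself quadratic, the fit is exact up to numerical precision; for simulator outputs the residual measures how well the RSM captures the response.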

Multi-Layer Neural Networks
Multi-layer neural networks (MLNN) are a class of artificial neural network (ANN) models inspired by the signal transportation process of biological neural units. Typically, an MLNN model is composed of an input layer, an output layer, and several hidden layers. Artificial neurons are included in these layers to store, transform, and transport data signals.
The numerical values connecting the neurons in adjacent layers are "weights." An MLNN model transfers the signal via Equation (2):

ai^n = f^n( Σj wi,j^n aj^(n−1) + bi^n )    (2)

where the superscript n indexes the current layer (n − 1 is the previous layer), a is the artificial neuron, and w and b are the weight and the bias, respectively. The subscripts i and j index the neurons in the current and previous layers, respectively. Notably, f^n is a transfer (activation) function. The training process minimizes an error function between the network predictions and the training targets, depicted in Equation (3):

E(w) = (1/2) Σk (ok − tk)^2    (3)

where w refers to the weight vector, and o and t are the prediction and the training target, respectively. Thus, each training epoch of the MLNN iteratively updates w to minimize the error function until a prescribed stopping criterion is achieved. Many training algorithms, such as the scaled conjugate gradient, Levenberg-Marquardt, and Bayesian regularization [29] methods, are widely used in training MLNN models with complex topologies.
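The forward propagation of Equation (2) can be sketched in a few lines of NumPy. The layer sizes, random weights, and tanh activation below are illustrative assumptions, not the SWP proxy architecture:

```python
import numpy as np

def forward(layers, x):
    """Propagate x through an MLNN; layers is a list of (W, b, f) tuples,
    each implementing a^n = f^n(W^n a^(n-1) + b^n) as in Equation (2)."""
    a = x
    for W, b, f in layers:
        a = f(W @ a + b)
    return a

tanh = np.tanh
identity = lambda z: z   # linear output layer

rng = np.random.default_rng(1)
layers = [
    (rng.normal(size=(8, 3)), np.zeros(8), tanh),      # hidden layer, 3 inputs
    (rng.normal(size=(1, 8)), np.zeros(1), identity),  # single regression output
]
y = forward(layers, np.array([0.2, -0.1, 0.5]))
```

A training loop would wrap this forward pass, evaluate Equation (3) on the output, and update the weights by one of the algorithms named above.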
There are many MLNN applications in the SWP project. In the machine-learning-assisted history-matching process, MLNN models are trained to predict the field production responses with varying reservoir properties. Two representative blind test cases are shown in Figure 5. The proxy model is competent to predict the oil and water productions when the hydrodynamical properties are changed, with overall blind testing errors of 0.5% and 9.87%, respectively. As shown in Figure 5, the satisfactory blind testing performance indicates that the ANN models have been well trained and can be used to evaluate the history-matching error in the proposed workflow. Notably, such results have not been history-matched yet. The other class of MLNN models is trained to forecast the long-term field development responses in terms of hydrocarbon production, CO2 sequestration volume, and project economic assessments, considering the CO2-WAG design parameters as inputs [18]. The CO2-WAG operational parameters include the durations of the CO2 and water injection cycles, the water injection rate, and the production well specifications. In this work, blind testing was conducted for an injection pattern case and a field case, with overall mean errors of 0.71% and 1.00%, respectively. Therefore, the MLNN model can be employed as a surrogate model to evaluate the techno-economic objective functions considered in the design of CO2-WAG projects.
In the SWP project, MLNN models have been successfully trained and act as robust regression tools. The high computational speed enables many time-consuming studies such as global sensitivity analysis, history-matching, multi-objective optimization, etc.


Support Vector Machines
The support vector machine (SVM) is a robust regression model employed in this work. A linear SVM regression model is an extension of a linear regression model that aims at finding a prediction function as flat as possible [30]:

f(x) = <x, ω> + b    (4)

minimize (1/2)||ω||^2 subject to |yi − <xi, ω> − b| ≤ ε, i = 1, ..., m    (5)

The model is suitable for problems with n inputs and one output, where x ∈ R^n and y ∈ R. Note that in Equations (4) and (5), ω is the coefficient vector and b is the intercept. The flatness of the model measures how small the norm of ω can be, and ε is the error tolerance of the regression problem. The operator <x, ω> indicates the dot product of vectors x and ω. To extend the applicability of the SVM model to real problems, a soft margin is introduced via slack variables ξ and ξ*, leading to Equation (6):

minimize (1/2)||ω||^2 + C Σi (ξi + ξi*)
subject to yi − <xi, ω> − b ≤ ε + ξi, <xi, ω> + b − yi ≤ ε + ξi*, ξi, ξi* ≥ 0, i = 1, . . ., m    (6)

where C is a constant balancing the flatness against the tolerated deviations larger than ε. The ε-insensitive loss function depicted in Equation (7) is used:

|ξ|ε = 0 if |ξ| ≤ ε, and |ξ| − ε otherwise    (7)

Modern SVM models are typically written in a dual optimization form, in which the prediction can be expressed as Equation (8):

f(x) = Σi (αi − αi*) <xi, x> + b    (8)

where α and α* are Lagrange multipliers. For nonlinear problems, kernel functions are employed to map the original training patterns into an implicit feature space so that linear SVM regression can be used. In this case, Equation (8) is modified as Equation (9):

f(x) = Σi (αi − αi*) k(xi, x) + b    (9)

where k(xi, xj) is a kernel. In this work, the Gaussian kernel written in Equation (10) was employed:

k(xi, xj) = exp(−||xi − xj||^2 / (2δ^2))    (10)

Similar to the MLNN model, training an SVM requires identifying three hyperparameters: the constant C, the magnitude of ε, and the kernel scale factor δ. In this work, the SVM models were developed as proxy models to predict the oil production (Proxy-Oil) and the CO2 storage volume (Proxy-CO2). The cross-validation results indicate that the relative errors of Proxy-Oil and Proxy-CO2, compared to the high-fidelity numerical simulator, are 0.06% and 0.01%, respectively [31].
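The Gaussian kernel of Equation (10), the ε-insensitive loss of Equation (7), and the dual-form prediction of Equation (9) can be sketched as follows; the support vectors and dual coefficients here are hand-set hypothetical values, not a trained model:

```python
import numpy as np

def gaussian_kernel(xi, xj, delta=1.0):
    """Gaussian (RBF) kernel of Equation (10) with scale factor delta."""
    d = np.asarray(xi, dtype=float) - np.asarray(xj, dtype=float)
    return float(np.exp(-np.dot(d, d) / (2.0 * delta ** 2)))

def eps_insensitive_loss(pred, target, eps=0.1):
    """Equation (7): deviations within the eps tube cost nothing."""
    return np.maximum(0.0, np.abs(pred - target) - eps)

def svr_predict(x, support_x, coef, b, delta=1.0):
    """Dual-form prediction, Equation (9): f(x) = sum_i (a_i - a_i*) k(x_i, x) + b,
    where coef holds the differences of the Lagrange multipliers."""
    return sum(c * gaussian_kernel(sx, x, delta) for c, sx in zip(coef, support_x)) + b

# Tiny illustration with two hypothetical support vectors
support_x = [np.array([0.0]), np.array([1.0])]
coef = [0.5, -0.25]
y0 = svr_predict(np.array([0.0]), support_x, coef, b=0.1)
loss = eps_insensitive_loss(np.array([1.05]), np.array([1.0]))
```

In practice the coefficients would come from solving the dual problem of Equation (6); libraries such as scikit-learn's SVR do this directly.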
By developing the machine-learning proxies, the following experiences can be summarized:
1. The RSM is more suitable for problems with a smaller input dimension and a single output parameter. Compared to the MLNN and SVM models, its training overhead is much lighter.
2. The MLNN model exhibits robust generalization capability for problems with large input and output dimensions. However, it has more hyperparameter degrees of freedom than the RSM and SVM models; thus, more computational cost is required to obtain a model with optimum prediction performance.
3. The SVM model is more suitable for problems with strong nonlinearity and a large input dimension. The number of hyperparameters to be tuned is smaller than in the MLNN model. However, it cannot predict more than one output variable.
All of the machine-learning proxies play vital roles in the optimization study by enabling various computationally expensive processes, which require enormous numbers of objective function evaluations.

Optimization Protocols
In this work, metaheuristic and stochastic optimization protocols such as particle swarm optimization and genetic algorithms were employed to optimize various technical and economic objective functions. Compared to conventional gradient-based optimization methods, such protocols are not constrained by the continuity and differentiability of the objective function. Therefore, they are more suitable for solving problems with complex and implicit objective functions. However, the global optimization technologies used in this work demand populations with a certain volume of samples; thus, relying entirely on the high-fidelity numerical simulators would make the computational overhead prohibitively heavy. This section briefly reviews the critical strategies of the global optimization technologies included in this work.

Objective Functions and Constraints

Economic Objective Functions
The project net present value (NPV) is a major economic consideration in the optimization study. A general definition of the NPV can be depicted as Equation (11):

NPV = Σi (q_oi · r_oil − Ci) / (1 + r)^i − CAPEX    (11)

In Equation (11), CAPEX is the project capital expenditure, m is the total number of counted timesteps of the project, q_oi is the cumulative oil production of the ith timestep, Ci is the operational cost, r_oil is the oil price, and r is the interest rate. In this work, the NPV definition was modified to adapt to the CO2-WAG field operation by adding a CO2 storage credit term, as Equation (12):

NPV = Σi (q_oi · r_oil + q_co2,st · r_co2,credit − Ci) / (1 + r)^i − CAPEX    (12)

where CAPEX = 10 million dollars (USD), the oil price is 50 USD/bbl, the interest rate is 5%, the CO2 storage credit (r_co2,credit) is 45 USD/metric ton, and q_co2,st is the CO2 volume stored in the ith timestep. The cost term Ci includes the water and CO2 injection costs, the CO2 purchase cost, and the produced-water treatment cost, which can be expressed by Equation (13):

Ci = r_w · q_w,inj + r_w,pro · q_w,pro + r_co2,pur · q_co2,p + r_co2,inj · q_inj,co2    (13)

where the water injection cost (r_w) is 1.03 USD/STB, the produced-water treatment cost (r_w,pro) is 0.64 USD/STB, the CO2 purchase cost (r_co2,pur) is 1.72 USD/thousand (M) SCF, and the CO2 injection cost (r_co2,inj) is 0.85 USD/MSCF. The variables q_w,inj and q_w,pro are the water injection and production rates in STB/day, respectively; q_inj,co2 and q_co2,p are the CO2 injection and purchase rates in MSCF/day, respectively. Evaluating the NPV requires the oil/CO2/water production and CO2/water injection data reported by the numerical simulation model, which are determined by the CO2-WAG design strategy. When the NPV is considered one of the objective functions in the optimization workflow, a large volume of realizations is required to find the optimum CO2-WAG injection protocol that maximizes the NPV. In this work, the NPV was calculated by the proxy models at a low computational cost.
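Equations (11) through (13) can be sketched as a single cash-flow function using the paper's stated cost assumptions. The yearly volumes below are made-up illustrative inputs, and the per-period accounting (one discounting step per year) is an assumption of this sketch:

```python
def npv_wag(periods, capex=10e6, oil_price=50.0, rate=0.05, co2_credit=45.0,
            r_w=1.03, r_w_pro=0.64, r_co2_pur=1.72, r_co2_inj=0.85):
    """NPV per Equations (11)-(13). Each period is a dict of volumes:
    oil (STB), co2_stored (metric tons), w_inj/w_pro (STB),
    co2_pur/co2_inj (MSCF)."""
    npv = -capex
    for i, p in enumerate(periods, start=1):
        # Equation (13): operational cost of the period
        cost = (r_w * p["w_inj"] + r_w_pro * p["w_pro"]
                + r_co2_pur * p["co2_pur"] + r_co2_inj * p["co2_inj"])
        # Equation (12): oil revenue plus CO2 storage credit
        revenue = oil_price * p["oil"] + co2_credit * p["co2_stored"]
        npv += (revenue - cost) / (1.0 + rate) ** i
    return npv

# Hypothetical yearly volumes, repeated for a 5-year WAG forecast
year = {"oil": 200_000, "co2_stored": 50_000, "w_inj": 300_000,
        "w_pro": 250_000, "co2_pur": 400_000, "co2_inj": 500_000}
value = npv_wag([year] * 5)
```

In the workflow, these volume inputs come from the proxy predictions rather than the simulator, which is what makes NPV evaluation cheap inside the optimizer.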

Technical Objective Functions
History-matching error: The history-matching error is a significant objective function measuring the misfit between the numerical model results and the field historical data. In this work, the squared error was used to account for the differences between the results generated by the simulator and the field historical measurements for the oil, water, and gas production data, the water and gas injection data, and the pressure data, via Equations (14) through (19), respectively, each of the form:

E = Σi (Oi − Ti)^2    (14)-(19)
where O and T represent the model prediction and the historical data, respectively, and m is the total number of timesteps of the available field data. Notably, in this work, O was generated by the machine-learning proxy models to reduce the computational overhead.

CO2 storage volume: For a CCUS project, the CO2 storage volume is a critical technical objective function considered in the optimization study, which can be defined as [6]:

V_co2,storage = V_co2,injected − V_co2,produced    (20)

The CO2 storage volume plays a crucial role in the project's economics by bringing a considerable tax allowance to the project. Since the SWP project has rigorous plans to strategically purchase a certain volume of CO2 based on the surface operational facility capacities, maximizing the CO2 storage volume is essentially minimizing the CO2 production volume in the CO2-WAG process.
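The per-data-type squared errors of Equations (14) through (19) can be sketched as follows; the historical and simulated series below are tiny hypothetical examples:

```python
import numpy as np

def misfit(observed_by_type, simulated_by_type):
    """Squared-error misfit of Equations (14)-(19): for each data type
    (oil/water/gas production, water/gas injection, pressure),
    sum (O_i - T_i)^2 over the m historical timesteps."""
    errors = {}
    for key, target in observed_by_type.items():
        O = np.asarray(simulated_by_type[key], dtype=float)  # model prediction
        T = np.asarray(target, dtype=float)                  # historical data
        errors[key] = float(np.sum((O - T) ** 2))
    return errors

# Hypothetical three-timestep series for two of the six data types
hist = {"oil": [100, 110, 120], "pressure": [4000, 3950, 3900]}
sim  = {"oil": [98, 112, 119],  "pressure": [4010, 3945, 3905]}
e = misfit(hist, sim)
```

In the workflow, the simulated series come from the proxy models, so this misfit can be evaluated thousands of times inside the global optimizer.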
Oil recovery: The cumulative oil production (which can be normalized by the residual oil in place as the recovery factor) in the forecasting period is another technical objective function to be maximized in the design of the CO2-WAG project, and it also determines the major income term of the project NPV.

Physical and Engineering Constraints
The optimization protocols design the CO2-WAG operational criteria from an algorithmic perspective. Without proper physical and engineering constraints, the optimization results would be unrealistic. In this work, the following constraints were imposed on the optimization workflow to ensure engineering practicability:
1. The history-matching study specifies the oil production and the CO2 and water injection rates, and considers the CO2 and water production the primary objective functions. The constraint imposed on the history-matching work is that the average reservoir pressure must remain below 5400 psi.
2. The CO2-WAG optimization is constrained by an average reservoir pressure range of [3700, 5400] psi to maintain the miscibility of the sweeping front.
In the optimization process, either for history-matching or project design purposes, the solutions in the population pool are screened by the constraints to sustain engineering applicability. To meet this requirement, proxy models need to be developed to predict the average reservoir pressure under the various history-matching and CO2-WAG design scenarios.

Treatment of Multiple-Objective Optimizations
The optimization studies in this work considered more than one objective function; such problems are called multi-objective optimization (MOO) problems. For instance, the history-matching work needs to minimize the error functions defined by Equations (14) through (19) simultaneously, and the CO2-WAG design problem aims at maximizing the NPV, oil recovery, and CO2 storage volume at the same time. Two different treatments of the MOO problems were employed in this work:

Aggregated method: An aggregated objective function can be defined by Equation (21):

f_a = Σi wi · fi    (21)

where fi refers to the various objective functions, wi is the associated weight applied to each function, and f_a is the aggregated objective function. In this way, the MOO problem is converted to a single-objective optimization study. In this work, the weighted method was used to solve the history-matching problem by defining the history-matching error as Equation (22), in which the six error terms of Equations (14) through (19) are summed with identical weights:

E_HM = E_oil + E_water + E_gas + E_w,inj + E_g,inj + E_pressure    (22)

In addition, the optimization of the CO2-WAG design uses an aggregated objective function of the form of Equation (21), depicted as Equation (23) [6], whose weight factors can be adjusted based on operational preference.

Pareto optimum theory: An alternative approach to treating the MOO problem is to generate a Pareto-front solution by employing the Pareto optimum theory, which is suitable for establishing a solution repository considering the tradeoff relationships amongst multiple objective functions [32].
The Pareto optimum theory considers a solution vector x = [x1, x2, x3, ..., xn] with n variables and its m objective function values. Given two objective function vectors u and v, u is said to dominate v if ∀i ∈ {1, 2, 3, ..., m}, ui ≤ vi and ∃i ∈ {1, 2, 3, ..., m}, ui < vi. The Pareto optimum set collects the solution vectors that are not dominated by any other solution, and the corresponding ensemble of objective function values is defined as the Pareto front. The Pareto optimum theory can be illustrated by Figure 6.

It should be emphasized that regardless of the optimization algorithm, applying the Pareto front theory is by itself a computationally expensive process, since it ranks a large volume of solutions with respect to multiple objective functions. The CO2-EOR project is a good candidate upon which to impose the Pareto front theory, since incremental oil production yields more produced CO2, which reduces the volume of carbon sequestration. Moreover, when considering the project NPV, the impacts of the tax allowance and the oil production benefits need to be comprehensively investigated. The Pareto-front solution also gives the operators more flexibility to design the CO2-WAG processes under various techno-economic conditions. However, such an optimization protocol is not suitable for history-matching problems, because history-matching seeks a solution that minimizes all the objective functions simultaneously. A solution fitting the oil production history with a large water production error could be a dominating solution in the Pareto front, but it cannot be accepted in the history-matching study. Therefore, the majority of the Pareto optimum set would be abandoned.
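The two MOO treatments above can be sketched as follows. The objective values are hypothetical illustrations (for the Pareto case, two minimization objectives, e.g. negative NPV and produced CO2):

```python
def aggregate(objectives, weights):
    """Aggregated method, Equation (21): f_a = sum_i w_i * f_i."""
    return sum(w * f for w, f in zip(weights, objectives))

def dominates(u, v):
    """Pareto dominance (minimization): u_i <= v_i everywhere,
    with strict inequality in at least one objective."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

def pareto_front(points):
    """Return the non-dominated subset of a list of objective vectors."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

# Equal-weight history-matching error in the spirit of Equation (22)
f_hm = aggregate([0.9, 1.2, 0.7, 1.1, 0.8, 1.0], [1.0] * 6)

# Keep only the non-dominated tradeoff designs
front = pareto_front([(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)])
```

The `pareto_front` filter is quadratic in the population size, which illustrates why ranking a large solution pool is itself expensive, as noted above.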


Optimization Algorithms
Genetic algorithm (GA): The GA refers to a family of computational models derived from Darwin's theory of biological evolution. The core idea is to apply the organizing principle of natural selection to optimize the individuals of a population. Problems are solved by an evolutionary process that yields a best (fittest) solution. GAs do not search via gradients; the search is performed by sampling with stochastic operators rather than by deterministic rules. Figure 7 briefly illustrates the workflow of a GA optimizer. The application of a GA includes the following operators:

Encoding: The very first step is to convert the solution of a problem, typically composed of vectors of control parameters, into a chromosome-type data structure.
Selection: The selection operator is employed to find individuals that exhibit good fitness and become candidates as parents to generate offspring. The most commonly used selection operators include tournament and roulette wheel selection.
Reproducing: Once the parents are selected, the next step is to make them reproduce offspring via crossover. Crossover is a GA operator that generates new chromosomes from the individuals of previous generations. A crossover rate must be specified so that a corresponding fraction of the selected individuals is used to generate offspring; typically, the crossover rate ranges from 60% to 95%.
Mutation: Mutation is a GA operator that widens the search range by randomly changing a gene of a chromosome. In a genetic algorithm, the mutation rate is kept extremely low, typically less than 1%.
Decoding: At the end of a genetic algorithm iteration, the chromosomes need to be converted back to real number solutions.
The GA was one of the first algorithms trialed in the SWP project, where it optimized multiple techno-economic objective functions for the CO2-WAG injection process [6].
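For illustration, the following minimal sketch (our own, not the SWP implementation; the population size, rates, and test objective are illustrative assumptions) shows how the encoding, selection, crossover, and mutation operators fit together in a real-coded GA:

```python
import random

def genetic_algorithm(objective, bounds, pop_size=30, generations=60,
                      crossover_rate=0.8, mutation_rate=0.01, seed=0):
    """Minimal real-coded GA: tournament selection, uniform crossover,
    and random-reset mutation (all settings are illustrative)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Encoding: each "chromosome" is a vector of control parameters within bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = min(pop, key=objective)
    for _ in range(generations):
        # Selection: a tournament of size 2 picks the parent with better fitness.
        def select():
            a, b = rng.sample(pop, 2)
            return a if objective(a) < objective(b) else b
        children = []
        while len(children) < pop_size:
            p1, p2 = select(), select()
            # Reproducing: uniform crossover applied at the prescribed crossover rate.
            if rng.random() < crossover_rate:
                child = [p1[i] if rng.random() < 0.5 else p2[i] for i in range(dim)]
            else:
                child = list(p1)
            # Mutation: with low probability, reset a gene anywhere in its range.
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] = rng.uniform(lo, hi)
            children.append(child)
        pop = children
        gen_best = min(pop, key=objective)
        if objective(gen_best) < objective(best):
            best = gen_best
    # Decoding is trivial here, since the chromosomes are already real-valued solutions.
    return best
```

For example, minimizing the sphere function `sum(v * v for v in x)` over `[(-5, 5)] * 2` drives the best individual toward the origin within a few dozen generations.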
Particle swarm optimization (PSO): PSO is a nature-inspired evolutionary and stochastic optimization technique for solving computationally hard optimization problems. It is a fast technique based on the movement and intelligence of swarms, built by mimicking the working mechanisms of biological swarm migration. In a PSO application, particles communicate directly or indirectly with one another via searching directions. During the iteration process, the particles update their positions according to their previous experience and the experience of their neighbors. The process of PSO optimization includes the following three steps [33]:

1. Evaluate the fitness by the proxy model.
2. Calculate the velocity term v using Equation (24), where k is the iteration level, x is the local best, and g is the global best.
3. Update the particle position via Equation (25).

Compared to the GA, the PSO optimization process does not include any encoding/decoding procedure, which accelerates convergence. However, PSO can easily be trapped in local minima when a complex objective function is considered. Figure 8 shows the optimization workflow using a PSO algorithm.
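Equations (24) and (25) are not reproduced above; as a hedged illustration, the sketch below uses the standard inertia-weight forms of the velocity and position updates (the inertia weight w and acceleration coefficients c1 and c2 are illustrative assumptions, not the paper's settings):

```python
import random

def pso(objective, bounds, n_particles=30, iters=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal single-objective PSO: velocity updates pull each particle toward
    its personal best (pbest) and the swarm's global best (gbest)."""
    rng = random.Random(seed)
    dim = len(bounds)
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [list(p) for p in x]
    pbest_f = [objective(p) for p in x]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = list(pbest[g]), pbest_f[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity term (cf. Equation (24)): inertia + cognitive + social pulls.
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                # Position update (cf. Equation (25)), clamped to the search bounds.
                x[i][d] = min(max(x[i][d] + v[i][d], bounds[d][0]), bounds[d][1])
            f = objective(x[i])  # fitness evaluated by the (proxy) model
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = list(x[i]), f
                if f < gbest_f:
                    gbest, gbest_f = list(x[i]), f
    return gbest, gbest_f
```

Note that, as stated above, no encoding or decoding step appears: the particle positions are the real-valued control parameters themselves.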

Figure 8. Optimization workflow using particle swarm optimization (PSO).
Multi-objective particle swarm optimization (MOPSO): Compared to the single-objective PSO, MOPSO introduces a "repository" (REP) to store the nondominated solutions of each iteration. The size of the repository can be prescribed from 100 to 250 particles. When the repository is full, dominated solutions in the REP are replaced by the nondominated solutions appearing in the next generation.
A "hypercube" is introduced in MOPSO to quantify the "neighborhood" (also known as the searching domain) of the particles. Hypercubes are subdivisions of the objective function domain separated by hyperlines that are uniformly distributed in each objective function domain. We illustrate a hypercube (Figure 9) for a problem with two objective functions, project NPV and oil recovery; the dots represent the particles of a certain MOPSO iteration. It is worth emphasizing that the hyperlines need to be adjusted after the fitness of the population (POP) is updated in each iteration, because the upper and lower limits of the objective function domains may change.
Similar to the single-objective PSO, the MOPSO iteration also includes three stages [34]:
Stage 1: Initialization: The initial population (POP) is generated and the velocity (VEL) of each particle is set to 0. The objective functions of all the particles are computed and their fitness is assessed. Then, the initial solution repository (REP) is structured using the nondominated solutions in the POP, and the initial personal best (PBEST) of each particle is set to its initial fitness.
Stage 2: Velocity calculation: Compute the movement velocity of the particles using Equation (26), where C1 and C2 are two weight factors within [0, 1]. Based on the hypercube structure in the solution repository, a global best REP[h] is randomly picked from the grid where each particle is located.
Stage 3: Update: The particles in the population are updated via Equation (27). Then, the fitness of the particles is re-evaluated based on the updated population. At this stage, the REP is updated by removing the dominated solutions; the hypercube structure and the PBEST also change accordingly.
Current research still focuses on quantifying the convergence criteria of MOPSO. One of the most broadly accepted criteria is that the iteration can be terminated when none of the new particles can dominate any of the particles in the REP. Figure 10 shows the workflow of the MOPSO process. MOPSO has proved to be a robust algorithm whose repository approaches the true Pareto front after many searches.
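The repository (REP) bookkeeping described above can be sketched as follows; this is a simplified illustration that omits the hypercube-based crowding and leader selection:

```python
def dominates(a, b):
    """True if objective vector a dominates b (minimization):
    a is no worse in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def update_repository(rep, candidates, max_size=100):
    """Insert nondominated candidates into the repository (REP),
    evicting any archived solutions they dominate."""
    for c in candidates:
        if any(dominates(r, c) for r in rep):
            continue  # c is dominated by an archived solution; discard it
        rep = [r for r in rep if not dominates(c, r)]  # evict solutions c dominates
        rep.append(c)
    return rep[:max_size]  # crude truncation; MOPSO instead prunes crowded hypercubes
```

A termination check matching the convergence criterion above is then simply: stop when `update_repository` leaves the REP unchanged for a new generation of particles.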

Structuring the Hybrid Numerical Machine-Learning Workflow
In the previous sections, we introduced the machine-learning proxy models and optimization protocols. Coupling them builds robust machine-learning assisted optimization protocols for various engineering purposes in the CO2-WAG injection project. In this paper, a machine-learning assisted history-matching workflow and a multi-objective optimization workflow are presented.

History-Matching Workflow
Due to the high dimensionality and considerable computational overhead, history-matching cannot be completed by relying entirely on the high-fidelity numerical model. Therefore, machine-learning models are needed to assist the history-matching work. Using the prepared numerical simulation runs as the knowledge base, a series of ANN models can be trained to predict the oil, water, and gas production with the uncertain reservoir properties as inputs. The importance of the proxy models is that one can evaluate a large volume of realizations at little computational cost. The trained proxy models are coupled with PSO to minimize the history-matching error function, which is defined as the summation of the squared differences between the field historical data and the model predictions.
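As a hedged sketch of this error function (the paper's Equation (22) may include weighting or normalization not shown here), the summed squared differences over the matched production series can be written as:

```python
def history_matching_error(history, prediction):
    """History-matching error: summed squared differences between the field
    historical data and the model (proxy) prediction, accumulated over all
    matched series (e.g., oil, water, and gas production rates)."""
    err = 0.0
    for series, observed in history.items():
        predicted = prediction[series]
        # Squared misfit summed over every reporting time step of this series.
        err += sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return err
```

In the workflow, `prediction` comes from the ANN proxies rather than the simulator, which is what makes the PSO minimization of this function computationally affordable.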

In the history-matching process, the oil, gas, and water production rates were considered the primary unknowns for calculating the error function. Considering the extensive number of tuning parameters, the results of the history-matching model could exhibit strong non-uniqueness, meaning that more than one combination of the 62 parameters could generate a similar level of history-matching error. After an optimal solution is obtained, the history-matching solution found by the machine-learning workflow is revisited by the high-fidelity numerical reservoir simulator, which confirms the matching quality. Figure 11 displays the machine-learning assisted history-matching workflow, which was successfully applied to develop history-matched reservoir models for the primary/secondary and CO2-WAG periods. Figure 12 illustrates the machine-learning assisted multi-objective optimization workflow coupling the numerical reservoir simulation model, proxy models, and global optimization algorithm.

Multi-Objective Optimization Workflow
Data preparation: The original high-fidelity numerical model is utilized to prepare a certain volume of numerical realizations, which form the knowledge base used to train the proxy model. To achieve a blind testing error margin of <5%, training an injection-pattern-based model needs at least 100 simulation runs owing to the smaller model size and dimensions, whereas training the field-scale proxy model may require more than 500 simulation runs.
Design of hyperparameters: It is known that the hyperparameters of the machine-learning models have significant impacts on the prediction performance. The hyperparameters of the SVM are optimized using Bayesian optimization, and the MLNN topology is designed with a self-adaptive training protocol [35].
Structuring the Pareto front: MOPSO employs the proxy models to generate the Pareto front by considering the various objective functions. The two- and three-objective Pareto fronts can be visualized as a two-dimensional curve and a surface, respectively.

Validation of the result: The Pareto front solutions must be validated before advising any operational decisions. The input parameters of the Pareto front solutions are fed back into the numerical model, and the proxy and simulation results are compared. If large disparities are observed, new samples are added to the training database to re-train the proxy. This loop can be repeated several times until the error between the simulation results and the proxy predictions is lower than a prescribed error tolerance.
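This validation loop can be sketched as follows; `proxy_train` and `simulator` are hypothetical placeholders for the MLNN training routine and the high-fidelity simulator, and the 5% default tolerance simply echoes the blind-testing margin mentioned earlier:

```python
def validate_and_retrain(proxy_train, simulator, solutions, database,
                         tol=0.05, max_rounds=5):
    """Validation loop: revisit the simulator for the Pareto-front solutions;
    if the proxy's worst relative error exceeds tol, add the newly simulated
    samples to the knowledge base and retrain the proxy."""
    proxy = proxy_train(database)
    for _ in range(max_rounds):
        # Re-run the high-fidelity simulator on the candidate solutions.
        new_samples = [(x, simulator(x)) for x in solutions]
        worst = max(abs(proxy(x) - y) / max(abs(y), 1e-12) for x, y in new_samples)
        if worst <= tol:
            return proxy, worst  # proxy agrees with the simulator within tolerance
        database = database + new_samples  # augment the knowledge base
        proxy = proxy_train(database)      # retrain on the enlarged database
    return proxy, worst
```

The returned proxy is only trusted once the simulator-confirmed error falls below the prescribed tolerance, matching the "revisit the high-fidelity simulator" step emphasized throughout the paper.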

A History-Matching Application
A history-matching application using the proposed machine-learning assisted workflow is presented in this section. The objective of this study was to tune the reservoir hydrodynamic properties, including the permeability along the x-, y-, and z-directions and the Corey relative permeability coefficients. The permeability distributions were tuned by imposing anisotropic multipliers. The reservoir model assigned various three-phase relative permeability curves based on the HFU characterization. Five different relative permeability sets were considered in this work; notably, HFUs 5, 6, 7, and 8 were lumped into one group and shared identical relative permeability data. Considering the permeability multipliers and the five sets of Corey coefficients, 62 parameters were involved in the history-matching process.
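For context on the tuned Corey coefficients, a generic Corey-type water relative permeability curve has the following form (an illustrative two-parameter sketch; the endpoint values and exponent below are assumptions, not the FWU-tuned values, and the paper's three-phase formulation is not reproduced):

```python
def corey_krw(sw, swc=0.2, sor=0.3, krw_max=0.4, nw=2.0):
    """Corey-type water relative permeability: krw = krw_max * Se**nw, where
    Se = (sw - swc) / (1 - swc - sor) is the effective (normalized) saturation.
    swc: connate water saturation; sor: residual oil saturation;
    krw_max: endpoint relative permeability; nw: Corey exponent.
    All default values are illustrative assumptions."""
    se = (sw - swc) / (1.0 - swc - sor)
    se = min(max(se, 0.0), 1.0)  # clamp outside the mobile saturation range
    return krw_max * se ** nw
```

In the history-matching study, exponents and endpoints of curves like this one (for each HFU group and each phase) are among the 62 tuned parameters.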
The machine-learning assisted tuning process was carried out by the workflow discussed in the previous sections. The Latin hypercube sampling protocol was used to prepare 100 simulation cases by varying the uncertain hydrodynamic parameters. The dataset was used to train an expert MLNN proxy to predict the field responses given different hydrodynamic properties. In the machine-learning assisted workflow, the proxy model substituted the high-fidelity numerical simulator to evaluate the history-matching error defined by Equation (22). Figure 13 shows the history-matching results obtained by the workflow coupling the ANN proxy models and the PSO optimizer; it illustrates the fluid production matching quality predicted by the ANN models when the history-matching error was minimized. The average relative matching errors were 0.9%, 42.9%, and 17.2% for the oil, water, and gas production rates compared to the real-field historical data, respectively. Although the overall history-matching results obtained by the PSO algorithm indicated a 20.2% relative error, water production exhibited much worse performance than oil and gas. Thus, a confirmation by revisiting the high-fidelity numerical simulator was necessary.
Figure 13. History-matching results obtained by the proxy model for (a) oil production, (b) water production, and (c) gas production data.
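The Latin hypercube sampling protocol used to prepare the simulation cases can be sketched as follows (a basic stratified implementation; the actual sampling tool used in the study is not specified):

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Latin hypercube sampling: each parameter range is split into n_samples
    equal strata, each stratum is used exactly once, and the strata of different
    parameters are paired via independent random permutations."""
    rng = random.Random(seed)
    dim = len(bounds)
    samples = [[0.0] * dim for _ in range(n_samples)]
    for d, (lo, hi) in enumerate(bounds):
        strata = list(range(n_samples))
        rng.shuffle(strata)  # decouple this parameter's strata from the others'
        for i in range(n_samples):
            u = (strata[i] + rng.random()) / n_samples  # random point in the stratum
            samples[i][d] = lo + u * (hi - lo)
    return samples
```

Compared to plain random sampling, this stratification spreads the 100 training cases evenly across each uncertain parameter's range, which helps the proxy generalize from a small simulation budget.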
The major objective of the machine-learning assisted history-matching workflow was to find the combination of reservoir hydrodynamic properties that makes the reservoir simulation model predictions agree with the field historical data. While the PSO optimizer minimizes the history-matching error during the iterative process, the ANN proxies are employed to predict the field responses. Notably, a set of reservoir properties is obtained once the PSO iteration converges, which is considered the solution of the history-matching process. The history-matching solution must revisit the high-fidelity model for the following reasons:
1. Although the proxy models were well trained, potential error margins existed. A history-matching solution must be fed into the high-fidelity simulator to confirm the matching quality.
2. To structure the forecasting scenarios, a base-case numerical simulation model needed to be established by re-running the high-fidelity simulator using the history-matching solution suggested by the machine-learning assisted workflow.
Figure 14 shows the history-matching quality confirmed by the high-fidelity simulator using the solution found by the machine-learning assisted workflow. Good agreements in terms of the oil/gas/water production and the gas injection rate were observed, which indicates that the numerical model was well tuned and effectively characterized the underground hydrodynamic environment. More importantly, the successful history-matching study stressed the robustness of the proxy model in learning the data structure presented by the dataset. The computational cost was significantly reduced by the machine-learning assisted workflow: preparing the 100 simulation runs took 80 h of CPU time, whereas, with the help of the proxy model, the workflow completed more than 600 PSO iterations using a population size of 100 (60,000 realizations in total) within 300 s.
Figure 14. History-matching results confirmed using the high-fidelity numerical model for (a) the oil production rate, (b) the gas production rate, (c) the water production rate, and (d) the gas injection rate.

A Multi-Objective Optimization Application
Another application of the proposed workflow successfully designed the CO2-WAG injection in the FWU field for the period from January 2020 to January 2038. The CO2 injection includes the purchased CO2 and the CO2 recycled from the produced gas. According to the initial analysis of the simulation results and previous studies [31], as shown in Figure 15, the purchased CO2 rate varies from 2020 to 2033; at the end of 2033, no more CO2 is purchased.

Figure 15. CO2 purchasing rate.
A base-case forecasting scenario was established by dividing the CO2-WAG injection patterns into four groups (WAG A, B, C, and D). The entire project timeline was split into eight time periods, as shown in Table 1; more wells are added to the groups as the project proceeds. The CO2-WAG injection parameters included the water and gas injection durations, water injection rates, and production well specifications, which vary for each group and time period. Thus, 37 design specifications were considered in the optimization study. The NPV, oil recovery, and CO2 storage volume were the objective functions. A base-case model was structured using the default design parameters suggested by the field operator [2]. The incremental oil production, NPV, and CO2 injection/production volumes of the base-case model are displayed in Figure 16, and its objective functions are summarized in Table 2. Note that the CO2 storage efficacy is the ratio of the CO2 sequestered to the total purchased volume.
The cumulative oil recovery was 16.8 MM bbl and the total CO2 storage was 2.36 million metric tons, which was 81.2% of the total purchased CO2. The project NPV was USD 183 million. It is worth stressing that an average reservoir pressure of ≥4000 psi was imposed as a physical constraint on the multi-objective optimization to sustain a miscible flooding process [6]. The proposed optimization workflow aims at improving all the objective functions simultaneously.
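As a hedged illustration of these objective functions (the paper's NPV equation with taxes and the CO2 tax credit is not reproduced; the definitions below are generic), the 81.2% base-case storage efficacy corresponds to a simple mass-balance ratio:

```python
def project_npv(cash_flows, discount_rate):
    """Generic discounted NPV of yearly net cash flows; the paper's full NPV
    definition (taxes, CO2 tax credit, operating costs) is not reproduced here."""
    return sum(cf / (1.0 + discount_rate) ** t for t, cf in enumerate(cash_flows))

def co2_storage_efficacy(purchased, produced):
    """Storage efficacy: fraction of the purchased CO2 that remains sequestered,
    under the simplifying assumption that stored = purchased - (net) produced."""
    return (purchased - produced) / purchased
```

For example, with 100 units of CO2 purchased and a net 18.8 units ultimately produced (not recycled back), the efficacy is 0.812, matching the base-case figure reported above.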
As shown in Figure 17, the Pareto front solution considering three objective functions is illustrated as a space curve in the domain of f1 (oil recovery), f2 (CO2 storage), and f3 (NPV). The dominating solutions sitting on the Pareto front were validated by the numerical simulator, and the good agreements observed in Figure 17 confirm the validity of the proxy model and the MOPSO optimization results. Another interesting observation was drawn by plotting the projections of the three-objective Pareto front onto the three orthogonal surfaces. In Figure 18, a strong tradeoff can be observed between the oil recovery and NPV, as well as between the oil recovery and CO2 storage volume, meaning that improving the oil recovery sacrifices the NPV and CO2 storage volume. Therefore, the field operator has a flexible range of choices to design the CO2-WAG process based on the primary desire of the project. However, due to the tax credit brought by the CO2 sequestration, the project NPV increases monotonically with the CO2 storage volume.
Therefore, the field operator has a flexible range of choices to design the CO2-WAG process based on the primary objective of the project. However, due to the tax credit brought by the CO2 sequestration, the project NPV increases monotonically with the CO2 storage volume.

Figure 18. Projection view of the comparison between the Pareto front and numerical simulation results using some of the optimized input parameters: three-objective.

Table 3 summarizes the optimization results obtained with the proposed workflow against the base case values. The base case oil recovery fell outside the range of the Pareto front, which means that the base case design scenario was a dominated solution. Notably, the range of oil recovery included in the Pareto front was quite narrow (13.3-15.7 MM STB), and the improvement over the base case was 16.0%. The CO2-WAG design made a considerable difference to the CO2 storage volume and NPV. Therefore, the solution yielding the highest NPV and CO2 storage volume, even with the lowest oil recovery, should be considered with priority.
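The distinction between dominated and non-dominated designs discussed above can be sketched with a small Pareto filter. The objective vectors below are illustrative toy values (oil recovery in MM STB, CO2 storage in million metric tons, NPV in USD million), not results from the study.

```python
# Minimal Pareto-dominance sketch for three maximized objectives.
# The candidate vectors are illustrative, not the study's results.

def dominates(a, b):
    """a dominates b if a is at least as good in every objective
    and strictly better in at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Return the non-dominated subset of a list of objective vectors."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# Toy candidates: (oil recovery, CO2 storage, NPV).
candidates = [(13.3, 2.9, 240), (15.7, 2.5, 200), (14.5, 2.7, 220),
              (14.0, 2.6, 210)]  # last one is dominated by (14.5, 2.7, 220)
print(pareto_front(candidates))
```

A design like the base case, sitting outside this non-dominated set, would be filtered out here exactly as it was excluded from the Pareto front in Table 3.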
However, that assessment could change if the oil price increases or the tax allowance for CO2 utilization becomes lower.

The field operator collaborating with the SWP project aims to extract hydrocarbons from the Morrow-B formation using CO2-EOR technologies. The primary goal of the operation is to produce residual oil, and the sequestration of CO2 during the EOR stage helps improve the project NPV via tax credits. This is why the oil recovery, NPV, and CO2 storage volume were selected as the objective functions. The key to maximizing the CO2 storage (i.e., minimizing the CO2 production) during the EOR process lies in the injection specifications. Miscible flooding processes inject CO2 in a continuous or cyclic manner. For a multiphase flow system, the mobility of the gaseous phase depends on the gas saturation. As gas injection proceeds, a high-gas-saturation channel could form through which the injected gas breaks through; such an issue typically occurs in the continuous CO2 injection process. In a practical CO2 injection process, the water and CO2 can instead be injected in a cyclic manner to avoid a persistent buildup of gas saturation. More importantly, since the pressure and saturation distributions of the system exhibit strong heterogeneity, the CO2-WAG design should vary for different injectors. For instance, injectors in high-water-saturation regions could take a higher gas injection volume (a longer gas injection cycle), and vice versa. Thus, an accurate characterization of the pressure and saturation distributions is also critical for a successful CO2-WAG design. The deployment of CO2 foam [36] offers more robust mobility control capabilities. Although these technologies are not used in the Morrow-B CO2-EOR project, coupling a chemical slug with the CO2 gas would be beneficial for CO2 storage and oil recovery under feasible project economic criteria.

Concluding Remarks
This paper presents a comprehensive summary of machine-learning-assisted workflows for solving various engineering problems associated with the CO2-WAG design. By coupling the proxy models with advanced optimization protocols, the computational overheads of the history-matching and multi-objective optimization processes are significantly reduced. Besides the successful experiences obtained from this work, the following limitations cannot be ignored:
1. The proxy model developed in this work is a field-specific model that only works for the Morrow-B formation. Its application to other fields requires a new reservoir simulation model and a rerun of the proposed workflow.