Ensemble-Based Data Assimilation in Reservoir Characterization: A Review

: This paper presents a review of ensemble-based data assimilation for strongly nonlinear problems on the characterization of heterogeneous reservoirs with different production histories. It concentrates on ensemble Kalman ﬁlter (EnKF) and ensemble smoother (ES) as representative frameworks, discusses their pros and cons, and investigates recent progress to overcome their drawbacks. The typical weaknesses of ensemble-based methods are non-Gaussian parameters, improper prior ensembles and ﬁnite population size. Three categorized approaches, to mitigate these limitations, are reviewed with recent accomplishments; improvement of Kalman gains, add-on of transformation functions, and independent evaluation of observed data. The data assimilation in heterogeneous reservoirs, applying the improved ensemble methods, is discussed on predicting unknown dynamic data in reservoir characterization.


Introduction
Data assimilation, as a methodology for integrating various kinds of data, is defined as an analysis technique in which the observed information is accumulated into the model state, by taking advantage of consistency constraints with laws of time evolution and physical properties. The technique uses measured data and the theoretical information of the system to improve knowledge of the past, present, or future system states. The framework of typical data assimilation is a recursive method by updating the model state, i.e., ensembles, since the new estimate is a function of the previous estimate, and thereby it updates the ensembles to match the observed data [1,2].
Data assimilation techniques can be categorized as variational schemes, statistical schemes, or hybrid schemes [3,4]. The variational schemes are based on least squares estimation for maximum likelihood, while the statistical methods estimate minimum variance for least uncertainty. The variational schemes, originating from optimal control theory, minimize some cost function that expresses the distance between observations and corresponding model values, using the model equations as constraints [5]. Typical variational methods are 3D-Var, 4D-Var [6], and Nudging [7]. The statistical schemes emanate from estimation theory, and use the error covariances of the observations and of the model predictions to find the most likely linear combination of the two [5]. The classic approach is the Kalman filter for linear problems, whereby for strongly nonlinear problems, it is extended to the ensemble Kalman filter (EnKF), ensemble smoother (ES), ensemble Kalman smoother (EnKS), The statistical assimilation for high nonlinear problems in reservoir characterization is fundamentally based on mathematical modeling with linear algebra, and thereby its assumptions cause a few limitations as follows: non-Gaussian model parameters, improper prior ensembles, and finite ensemble size (the number of ensemble members). First, an assimilation equation assumes the model parameters follow a Gaussian distribution. One of the strengths of ensemble-based data assimilation is a wide range of applications, since there are few applicable constraints of model parameters in state vectors, e.g., permeability, facies, aquifer size, and relative permeability. However, most properties in reservoir engineering do not satisfy this Gaussian condition, e.g., some facies models in channelized reservoirs follow bimodal distributions. To make this matter worse, the updated model parameters tend to follow a normal distribution (see Figure 1a). This violates preservation of geological realism, and static information from well logging, core, and seismic data. The second weakness is the improper ensembles: the improper prior ensembles might induce unreliable estimation of posterior solutions. The mean value of initial ensembles is assumed to be true, in spite of inevitable error. A misfit between individual geomodel and true reservoir calculates an estimate error covariance, but a lot of actual fields are lacking in available data for making suitable prior datasets. To make this matter worse, the severe reservoir heterogeneity hinders the proper design of prior models. The uncertainty range of prior ensembles cannot even include observed data (see Figure 1b). Initial geomodels that are far off from the actual reservoir result in false convergence of updated models. The last is the problem of setting the proper size of ensemble. As regards the computation cost, the number of ensemble members must be small enough to be applicable in real fields, but it is difficult to confirm that the fixed ranges contain the true. When the ensemble size is too small, a cross-covariance can be mistakenly assessed [11,12]. Also, an ensemble collapse problem may occur due to small variance [13]. Therefore, the set of appropriate ensembles is essential. It is generally known that more than two hundred ensemble members are required for stable assimilation [14]. If geological uncertainty is too wide, or the inversion problem is very complex to assimilate recursively different-scaled dynamic data, more than four hundred ensemble members are needed [14].
For another research topic, the applicability of ensemble-based history matching has been expanded to various types of reservoirs. The principle of ensemble-based history matching is identical for all types of reservoirs. However, for each reservoir, detailed design and implementation of the method have evolved in different directions, reflecting their individual characteristics. It should be noted that different design and implementation of ensemble-based history matching, such as construction of a state vector, generation of initial ensembles, ensemble size, and supplementary techniques, are considered according to the reservoir type.
This paper introduces state-of-the-art ensemble-based methods with nonlinear problems, concentrates on EnKF and ES as frameworks in reservoir characterization, and also analyzes the aspect of overcoming weaknesses and developing schemes. The structure of this review is as follows: First, the mathematical frameworks of EnKF and ES are presented, together with their pros and cons.
To mitigate the problems, recent progress is explained for strongly nonlinear problems. The case studies implementing ensemble-based data assimilation, categorized as reservoir characteristics like naturally fractured reservoir, channelized reservoir, and tight reservoir with hydraulic fracturing, are reviewed. In the conclusions, the recent progress of ensemble-based data assimilation is summarized, and future works are discussed.
Energies 2018, 11, 445 3 of 24 First, the mathematical frameworks of EnKF and ES are presented, together with their pros and cons.
To mitigate the problems, recent progress is explained for strongly nonlinear problems. The case studies implementing ensemble-based data assimilation, categorized as reservoir characteristics like naturally fractured reservoir, channelized reservoir, and tight reservoir with hydraulic fracturing, are reviewed. In the conclusions, the recent progress of ensemble-based data assimilation is summarized, and future works are discussed.
(a) (b) Figure 1. Examples of unreliable results after assimilating the dynamic data through ensemble-based data assimilation: (a) The reservoir properties of the true case follow a non-Gaussian distribution (bimodal distribution), but the assimilated result shows a Gaussian distribution; (b) When prior models (grey lines) contain the true performance (red line), ensemble-based methods estimate the reliable assimilation for their mean values to converge the true profile. Examples of unreliable results after assimilating the dynamic data through ensemble-based data assimilation: (a) The reservoir properties of the true case follow a non-Gaussian distribution (bimodal distribution), but the assimilated result shows a Gaussian distribution; (b) When prior models (grey lines) contain the true performance (red line), ensemble-based methods estimate the reliable assimilation for their mean values to converge the true profile.

Mathematical Formulation
Ensemble-based methods consist of prediction and assimilation steps. After generating initial (or prior) values using the static data given, they are implemented by reservoir simulator to predict reservoir performances at observation time of dynamic data as Equation (1): where, the function f represents a forward model, i.e., reservoir simulation. The input parameter, m s i , stands for the i-th model parameters, which will be calibrated through ensemble-based methods, e.g., permeability, porosity, and facies. As initial condition, m d i means dynamic conditions, such as pressure and saturation, for each grid, and d denotes dynamic data as the results of simulation, which will be compared with the observed data. Subscripts i and t mean indication of ensemble member and observed time step, respectively. Therefore, N e indicates the number of total ensemble, and N o represents the number of observation steps.
In this assimilation step, all equations are derived with the concept of state vector: This state vector is updated by the product of the misfit between the observed data and the simulated data and the Kalman gain, K: where, the superscripts a and p denote the assimilated and the prior state vector, respectively. H means the measurement matrix operator, which extracts simulation information from the state vector to compare with the observed data. The Kalman gain is constant for all ensembles, even though the observed data are also perturbed by the measurement error covariance, C D . Equation (4) for Kalman gain is derived by minimizing the posterior estimate error covariance, C a y t . For the purpose, the prior estimate error covariance, C p y t , is calculated by Equation (5). In the equation, the estimate error covariance is defined with the estimate error, e, which is the difference between the individual ensemble member and the true value. In ensemble-based methods, the mean of the state vector becomes the true vector,ŷ t , due to their inherent assumption. Therefore, initial ensemble design is one of the key factors for successful application of ensemble-based history matching: where, superscripts T and −1 mean the transpose and the inverse of the matrix, respectively. Basically, the above procedures are used for EnKF and ES in common, but they can be distinguished in terms of the number of updating the state vector and composition of the state vector. As described in Figure 2, prior models for EnKF are simulated and assimilated repeatedly until each measurement time, while prediction and assimilation for ES are conducted once until the last observation time, according to its global update strategy. In EnKF, after the updated state vector is obtained by Equations (3) and (4), they replace the prior state vector in the prediction step of Equation (1) to predict the reservoir

Characteristics of EnKF and ES
Conceptually, smoothing, filtering, and predicting can be explained in the time domain as below. These are determined depending on the relationship between the assimilation time, t and observation time, tn. If the assimilation time is larger than the observation time, it is a data smoothing problem; whereas, if assimilation is conducted at observation time, it is considered a filtering problem. The other case is a prediction problem, when the assimilation time is less than the observation time [4]: Both of the ensemble-based methods have strengths in common over other data assimilation methods, as below:


Easy-coupling with forward models  Various applications for model parameters  Uncertainty analysis  Well-established in mathematics Table 1 summarizes the relative strengths and drawbacks of EnKF and ES. The only difference between EnKF and ES is whether the method is optimized considering the time dimension. EnKF repeats covariance minimization at each assimilation time, whereas ES includes the time dimension for optimization at a specific time. It is generally known that EnKF is superior to ES for nonlinear  An open circle means a prediction step, while a dark circle is an assimilation stage.

Characteristics of EnKF and ES
Conceptually, smoothing, filtering, and predicting can be explained in the time domain as below. These are determined depending on the relationship between the assimilation time, t and observation time, t n . If the assimilation time is larger than the observation time, it is a data smoothing problem; whereas, if assimilation is conducted at observation time, it is considered a filtering problem. The other case is a prediction problem, when the assimilation time is less than the observation time [4]: Both of the ensemble-based methods have strengths in common over other data assimilation methods, as below: • Easy-coupling with forward models • Various applications for model parameters • Uncertainty analysis • Well-established in mathematics Table 1 summarizes the relative strengths and drawbacks of EnKF and ES. The only difference between EnKF and ES is whether the method is optimized considering the time dimension. EnKF repeats covariance minimization at each assimilation time, whereas ES includes the time dimension for optimization at a specific time. It is generally known that EnKF is superior to ES for nonlinear dynamic models. That is why EnKF has been widely used in the data assimilation of reservoir models, rather than ES [15]. In terms of simulation time, EnKF requires the number of N e × N o times of reservoir simulations, while ES needs N e times of them. The difference in the number of reservoir simulations is very large, because normally the number of ensembles is around two hundred, and observations are available on a monthly basis. In the case of ES, the size of the state vector is larger than that of EnKF, due to the simulated data in Equation (1). However, this does not significantly affect the assimilation calculation, because the number of static and dynamic parameters in the state vector is much larger than the simulated data [16].

Methods to Overcome the Limitations of Ensemble-Based History Matching
Many researches have been conducted to solve the aforementioned limitations, e.g., non-Gaussian model parameters, improper prior ensembles, and finite ensemble size. The methodological efforts to improve the ensemble-based method are divided into three categories: (1) Kalman gain; (2) model parameters; and (3) observed dynamic data (see Figure 3). Figure 3 shows that the performance of ensemble-based method can be improved effectively through refinement of these key factors.

Importance of the Kalman Gain
Kalman gain is one of the most important factors in the ensemble-based method, because the assimilation equation is induced to minimize a posterior estimate error covariance. The Kalman gain is a weight of how much to reflect the difference between simulated and observed data to updated model parameters (see Equation (3)). Improvements of Kalman gain are covered as follows: localization, multiple Kalman gains, and regeneration of ensembles.
The goal of localization is to handle a spurious correlation between model parameters and observed data. In other words, localization is to identify the region that might be affected by observed data, rather than the whole reservoir. Model parameters near observation wells are properly correlated with the observed values, but parameters at distant locations are ambiguous to establish the relationship. Localization is a kind of weight function for an estimated error covariance. It alleviates the filter divergence problem, because it increases the degrees of freedom for observed data [17]. It is also one of the effective solutions when the ensemble size is small [11,18]. As the number of ensembles decreases, a cut-off radius decreases. The concept of localization can be divided into three categories, according to the following criteria: Streamline or tracer simulation Drainage area A covariance localization, as a solution to reduce ensemble size at the first suggested time, simply used a cut-off criterion based on the distance from the location of observed data to the location of a given state variable [11]. For example, the model parameters outside a certain area were excluded from data assimilation. Later, Schur product was applied to calculate a localized Kalman gain [18], which was an elementwise product between a correlation function and an estimate error covariance matrix. This concept gave more smoothed posterior parameters, than those from a cut-off localization. At gas reservoirs with aquifer, the exponential distance as a correlation function was investigated as the more suitable equation for localization [19].

Importance of the Kalman Gain
Kalman gain is one of the most important factors in the ensemble-based method, because the assimilation equation is induced to minimize a posterior estimate error covariance. The Kalman gain is a weight of how much to reflect the difference between simulated and observed data to updated model parameters (see Equation (3)). Improvements of Kalman gain are covered as follows: localization, multiple Kalman gains, and regeneration of ensembles.
The goal of localization is to handle a spurious correlation between model parameters and observed data. In other words, localization is to identify the region that might be affected by observed data, rather than the whole reservoir. Model parameters near observation wells are properly correlated with the observed values, but parameters at distant locations are ambiguous to establish the relationship. Localization is a kind of weight function for an estimated error covariance. It alleviates the filter divergence problem, because it increases the degrees of freedom for observed data [17]. It is also one of the effective solutions when the ensemble size is small [11,18]. As the number of ensembles decreases, a cut-off radius decreases. The concept of localization can be divided into three categories, according to the following criteria: Streamline-based covariance localization was implemented as an effective method to maintain geological realism, which used streamline trajectories to define the specific area influenced by a dynamic dataset [12,13,20]. It was able to reduce computation cost with small ensemble size (<100), by eliminating an area unrelated to the observed data [12]. The localization based on streamlines was more significant than a weight based on distance, because the streamlines were relevant to production data. Watanabe and Datta-Gupta expanded the concept of streamline localization according to fluid phase, and defined a correlation function from a phase streamline for each production data [13]. For example, water-cut (WCT) and gas-oil ratio (GOR) were coupled with water and gas streamlines, respectively. This flow-relevant localization could maintain geological realism, and gave a reliable uncertainty range without ensemble collapse. Jung and Choe distinguished the production-influenced area efficiently by comparing time-of-flight (TOF) from a phase velocity with the observation time of dynamic data [20]. As a limitation of steamline-based covariance localization, when a prior model covariance had a long correlation length, it would cause large local changes with loss of geological realism [17,21]. The requirement of different localization matrices came to the fore as a covariance comparison, i.e., covariance between the observed data and model parameters, and that between the observed data and primary variables [21]. In addition, the consideration of critical length was important at the high-order correlation function, e.g., fifth-order compact correlation function [22], and had to be determined including the degree of sensitivity of each observed data, and the correlation length of the geological model [17].  WOPR is well oil production rate, and WWPR stands for well water production rate. P represents 'the production well', and the subscripts 1, 2, and n mean the indication number for each production well.
Another idea for Kalman gain is to utilize multiple Kalman gains, instead of a unified value [26][27][28][29][30]. It is reported that the role of Kalman gain is similar to that of a sensitivity matrix in the maximum posterior and randomized maximum likelihood [26,27]. However, the unified Kalman gain in the previous ensemble-based method is applied to the whole ensembles, although each ensemble member has a different sensitivity matrix. Multiple Kalman gains were introduced to EnKF WOPR is well oil production rate, and WWPR stands for well water production rate. P represents 'the production well', and the subscripts 1, 2, and n mean the indication number for each production well. The covariance localization based on drainage area, determined by the pseudo-tracer of each well, was proposed [23][24][25]. The estimation of drainage area with drainage radius was shaped into a circle, but the reservoir heterogeneity could make for irregular shape [24]. Some schemes were studied to fix this matter, e.g., different localization concepts allocated at individual layers [24], and oil velocity field [25]. Figure 4 describes the concept of correlation function using hundreds of ensemble members [25]. Figure 4a shows a total N e of 2D models with production well P1 in the lower left corner. When the drainage area is configured by a phase velocity from reservoir simulation, each model has a different drainage area, as shown in Figure 4. Although each model has a discrete drainage area, their mean has continuous values between 0 and 1. The vector of the unified drainage area for each production well in Figure 4b becomes a correlation function for localization. Here, the procedure for converting 2D models to column vectors, and for converting the vectors to the correlation matrix is exactly the same as the generation of a state vector and the estimated error covariance matrix in Equations (2) and (5).
Another idea for Kalman gain is to utilize multiple Kalman gains, instead of a unified value [26][27][28][29][30]. It is reported that the role of Kalman gain is similar to that of a sensitivity matrix in the maximum posterior and randomized maximum likelihood [26,27]. However, the unified Kalman gain in the previous ensemble-based method is applied to the whole ensembles, although each ensemble member has a different sensitivity matrix. Multiple Kalman gains were introduced to EnKF [28] and ES [29,30], to apply more reliable Kalman gain for each ensemble member in Equation (3). reservoir, because its heterogeneity is very high, and the difference in reservoir parameters between sand and shale facies is large. Also, channelized reservoirs can be transformed into a discrete model based on facies, so it is easy to apply image recognition algorithms. In previous research, the Hausdorff distance was successfully applied to 2D and 3D channel models. The larger the Hausdorff distance between the two models, the greater the difference between the two models. After calculating the distance, unsupervised learning algorithms, such as k-means or self-organizing map, can classify similar models based on the distance. However, this approach requires additional calculation for clustering, and the result of assimilation is sensitive for the clustering. The last method for Kalman gain is to regenerate the ensembles whenever the covariance becomes smaller than a predetermined criterion [31,32]. In the process of data assimilation using EnKF progresses, an estimated error covariance becomes small, although the average of ensembles approaches the true field [31]. If this phenomenon gets worse, there will be a filter divergence problem where the ensembles become too similar. The estimated error in Equation (5) becomes very small, which results in small Kalman gain. Consequently, the state vector in Equation (3) cannot be calibrated, because the Kalman gain is small, even if the difference between the observed data and the simulated data is large. That is, there is no correction effect, even if there are additional observation data. This concept assumes that reservoir models can be grouped by their similarity. To satisfy the assumption, distance-based clustering was utilized. After clustering, the reservoir models within the same cluster had more similar sensitivity matrices than those of other clusters (see Figure 5). Kalman gain is calculated for each cluster separately. This idea is especially useful for the channelized reservoir, because its heterogeneity is very high, and the difference in reservoir parameters between sand and shale facies is large. Also, channelized reservoirs can be transformed into a discrete model based on facies, so it is easy to apply image recognition algorithms. In previous research, the Hausdorff distance was successfully applied to 2D and 3D channel models. The larger the Hausdorff distance between the two models, the greater the difference between the two models. After calculating the distance, unsupervised learning algorithms, such as k-means or self-organizing map, can classify similar models based on the distance. However, this approach requires additional calculation for clustering, and the result of assimilation is sensitive for the clustering.
The last method for Kalman gain is to regenerate the ensembles whenever the covariance becomes smaller than a predetermined criterion [31,32]. In the process of data assimilation using EnKF progresses, an estimated error covariance becomes small, although the average of ensembles approaches the true field [31]. If this phenomenon gets worse, there will be a filter divergence problem where the ensembles become too similar. The estimated error in Equation (5) becomes very small, which results in small Kalman gain. Consequently, the state vector in Equation (3) cannot be calibrated, because the Kalman gain is small, even if the difference between the observed data and the simulated data is large. That is, there is no correction effect, even if there are additional observation data.
To solve this problem, a re-sampling method to mitigate filer divergence was developed, re-generating a set of ensembles, using both hard data and pseudo-hard data [32]. Here, hard data means given static well data, which data were used for building prior models by geostatistics. Pseudo-hard data can be generated from the updated ensemble member at the current time step. If the uncertainty of the value of the specific grid is small in the calibrated model, the value of the grid is used as input data to the geostatistics. Newly generated models replace updated models at the current time step, and the next prediction and assimilation are implemented with the new models. These procedures are repeated during recursive updates, whenever the covariance became smaller than a predetermined criterion.

Modification of Model Parameters
The typical assumptions of ensemble-based methods cause constraints on model parameters: Model parameters should follow Gaussian distribution, the mean of which is assumed as the true value of the given parameter, even though there is inevitable error. Previous researches have solved the concept of transformation and initial ensemble design. The most common conversion of model parameters is a log-normal distribution transformation. When permeability was a concerned model parameter in state vector, it was converted to a normal distribution by the transformation, because permeability is known to follow a log-normal distribution. Updated logarithmic permeability was transformed inversely to the original distribution on the prediction step (see Equation (1)). Recently, a lot of transformation or feature extraction techniques have been used for the conversion of model parameters, as follows: To assess the geological uncertainty, NST was used to convert non-parametric permeability to Gaussian distribution. The empirical cumulative distribution function (CDF) of model parameter was transformed into the corresponding quantile in CDF of standard normal distribution. This approach is useful, because any distribution could be converted to a standard normal distribution, which satisfies the Gaussian assumption in ensemble-based methods [33,34]. DCT is a method of representing audio or image information as a sum of cosine functions of different frequencies. The coefficient of the cosine function has overall information at a low frequency, and stores detailed information at a high frequency. It has been known to be an efficient tool to extract some significant features similar to principal component analysis (PCA), with respect to computational cost and prior information of error covariance [35][36][37][38]. Reservoir characterization with geological realism implemented DCT in channelized reservoir [34], and heavy oil reservoir [38]. Figure 6 describes the workflow of DCT by updating coefficients to update model parameters, instead of using the original model parameters [38]. LSF has been applied to separate a certain domain or shape into subdomains, which are set positive or negative signs with a zero-level set boundary. In reservoir characterization, it has been performed to parameterize facies models, as an effective scheme to convert a Gaussian distribution with facies preservation [39][40][41]. The concept of preservation of facies ratio was utilized for the transformation of updated parameters, not prior parameters [42]. After the updated parameter for each grid was arranged in descending order, facies were assigned to each grid based on the facies ratio. This method could not only preserve the facies ratio, but also mitigate the overshooting problem. These transformation methods could satisfy the Gaussian assumption in the assimilation step, but they had a problem in applying to discrete model parameters, such as facies, due to continuous values after assimilation. Also, the cut-off for determination of low frequency was not clear for transformation. Many reports have underlined the importance of ensemble design to the ensemble-based method [14,15,[43][44][45]. Some emphasized posterior ensembles obtained by combining prior ensembles [15,43]. When initial models are built by different geostatistical parameters, e.g., training image (TI) from the reference field, updated ensembles cannot provide reliable uncertainty ranges [14]. As an assimilation progresses, ensembles gradually converge to the mean of ensembles different from the reference model, which can require a sufficiently large number of ensembles to ensure reliable performance. The easiest way to include the true model in the range of initial models is to generate lots of ensembles that reflect geological uncertainties. Previous researches into uncertainty quantification built initial models through wide ranges of geostatistical parameters. In two-point simulation, correlation length and anisotropy parameters were utilized to design ensemble models [44,46]. In multiple-point simulation, lots of TIs were used to make hundreds of facies models [47,48]. However, the larger the geological uncertainty, the greater the ensemble size needed. This caused a burden of simulation cost. Researches on ensemble design are categorized as follows: Assimilation of uncertain geological factors The use of a sampling scheme was to represent the characteristics of initial ensembles with low computational costs, reducing the reservoir simulations. The representative methods were singular value decomposition (SVD) [49], and distance-based clustering [50,51]. The selected models using these sampling methods were simulated with time-consuming reservoir simulators, instead of all models in the population set [51].
Geostatistical approaches, e.g., variogram and TI, have dealt with geological uncertainty, but the variables used in these methods are uncertain. An assimilation of variogram variables using EnKF failed to search the proper geomodels, due to the high nonlinearity between variogram variables and flow responses [44]. A lot of equiprobable models could make this matter worse, by increasing Many reports have underlined the importance of ensemble design to the ensemble-based method [14,15,[43][44][45]. Some emphasized posterior ensembles obtained by combining prior ensembles [15,43]. When initial models are built by different geostatistical parameters, e.g., training image (TI) from the reference field, updated ensembles cannot provide reliable uncertainty ranges [14]. As an assimilation progresses, ensembles gradually converge to the mean of ensembles different from the reference model, which can require a sufficiently large number of ensembles to ensure reliable performance. The easiest way to include the true model in the range of initial models is to generate lots of ensembles that reflect geological uncertainties. Previous researches into uncertainty quantification built initial models through wide ranges of geostatistical parameters. In two-point simulation, correlation length and anisotropy parameters were utilized to design ensemble models [44,46]. In multiple-point simulation, lots of TIs were used to make hundreds of facies models [47,48]. However, the larger the geological uncertainty, the greater the ensemble size needed. This caused a burden of simulation cost. Researches on ensemble design are categorized as follows: • Sampling • Assimilation of uncertain geological factors The use of a sampling scheme was to represent the characteristics of initial ensembles with low computational costs, reducing the reservoir simulations. The representative methods were singular value decomposition (SVD) [49], and distance-based clustering [50,51]. The selected models using these sampling methods were simulated with time-consuming reservoir simulators, instead of all models in the population set [51].
Geostatistical approaches, e.g., variogram and TI, have dealt with geological uncertainty, but the variables used in these methods are uncertain. An assimilation of variogram variables using EnKF failed to search the proper geomodels, due to the high nonlinearity between variogram variables and flow responses [44]. A lot of equiprobable models could make this matter worse, by increasing computing time. Some works calibrated the grid properties, e.g., grid permeability, facies ratio, or mean value of permeability allocated to each facies [33,34]. However, it was hard to directly modify parameters for geostatistics from dynamic data, due to high nonlinearity. Also, it was difficult to get a converged model, because during model generation, there were still lots of equiprobable models using updated geostatistical parameters.

Adjustment of Observed Dynamic Data
The last controllable parameter in an assimilation equation (refer to Equation (2)) is related to observed dynamic data. The topics of previous researches can be grouped as selective usage of observed data, and measurement error. It was recommended to use all available static data due to lack of data, while the dynamic data were preferable to utilize only essential observed data from sensitivity analysis [30,31]. A reliable data-analysis should exclude meaningless, erroneous, or inconsistent data. Figure 7 shows the unexpected effects of production performances in waterflooding at the channelized reservoir. The waterfront significantly changed the oil production and watercut before and after water breakthrough. This abrupt change of production performances regardless of spatial characteristics with static data could cause failure of data assimilation, and thereby result in unrealistic history matching. To solve this problem, some researchers used a few chosen performances, instead of all available dynamic data, e.g., exclusion of BHP during the shut-in period [30], or well flowing bottom hole pressure reaching the operational constraints [52].

Adjustment of Observed Dynamic Data
The last controllable parameter in an assimilation equation (refer to Equation (2)) is related to observed dynamic data. The topics of previous researches can be grouped as selective usage of observed data, and measurement error. It was recommended to use all available static data due to lack of data, while the dynamic data were preferable to utilize only essential observed data from sensitivity analysis [30,31]. A reliable data-analysis should exclude meaningless, erroneous, or inconsistent data. Figure 7 shows the unexpected effects of production performances in waterflooding at the channelized reservoir. The waterfront significantly changed the oil production and watercut before and after water breakthrough. This abrupt change of production performances regardless of spatial characteristics with static data could cause failure of data assimilation, and thereby result in unrealistic history matching. To solve this problem, some researchers used a few chosen performances, instead of all available dynamic data, e.g., exclusion of BHP during the shut-in period [30], or well flowing bottom hole pressure reaching the operational constraints [52]. Figure 7. Example of selective usage of different observed data: As water breakthrough occurs, distinct separation of oil production rates from watercut is observed. Before the breakthrough, the oil production rates are the target of data assimilation; but after the water volume produced is significantly increased, the data assimilation should use watercut data, to obtain the reliability of ensemble-based methods.
One of the advantages of the ensemble-based method is that various observed data are used together efficiently. In other words, the observed data in Equation (3) consist of various measurement data, such as pressure, rate, and dimensionless-parameters, and they are not required pre-processing like normalization. Proper measurement error has to be set depending on the type of observed data, because the common range or magnitude of the observed data is different for each type. Figure 7. Example of selective usage of different observed data: As water breakthrough occurs, distinct separation of oil production rates from watercut is observed. Before the breakthrough, the oil production rates are the target of data assimilation; but after the water volume produced is significantly increased, the data assimilation should use watercut data, to obtain the reliability of ensemble-based methods.
One of the advantages of the ensemble-based method is that various observed data are used together efficiently. In other words, the observed data in Equation (3) consist of various measurement data, such as pressure, rate, and dimensionless-parameters, and they are not required pre-processing like normalization. Proper measurement error has to be set depending on the type of observed data, because the common range or magnitude of the observed data is different for each type.

Applications of Ensemble-Based History Matching in Reservoir Simulation
Early studies on assisted history matching using the ensemble based method were concentrated on synthetic fields, such as PUNQ-S3 [26,30,[52][53][54][55], Brugge [56][57][58], or other synthetic fields customized for research objectives [59][60][61]. They tried to represent the advantages of EnKF and ES over gradient-based methods, such as genetic algorithm, simulated annealing, or other conventional methods. After the methods were verified in the history-matching problem, many researches have been targeted on resolving typical problems of ensemble-based history matching, as investigated in the previous section. At the same time, other efforts have been made to improve its applicability to various reservoir types.
Initial applications of ensemble-based history matching were limited to typical heterogeneous reservoirs, which have porous formation, with conventional completion and production mechanisms. However, many researches have been expanded to various types of reservoirs, such as naturally fractured reservoirs [60,62,63], channelized reservoirs [28,29,34,36,64,65], unconventional reservoirs [61,66,67], gas reservoirs [19,37,50], and others. Depending on the type of reservoir, the particular parameters, in addition to rock properties, can be included in the state vector. Facies ratio for channelized reservoir, fracture half-length for unconventional reservoir, and aquifer strength for gas reservoir are good examples of this. In addition, 4D seismic data, acoustic impedance, or Poisson's ratio have been used for assimilation, as well as production data [68][69][70]. Recently, history-matching with production data has been successfully conducted in a large field with 60 million active gridblocks [71]. The full-field model was decomposed into six sector models by streamline maps, and each individual sector model was assimilated with local observations. Then, the assembled full-field model was assimilated again with observations from the full-field. This incurred tremendous computing costs, but computing time could be reduced significantly by decomposition and parallel computing scheme with ES.
Application examples of three reservoir types, naturally fractured reservoir, channelized reservoir, and tight reservoir with hydraulic fracturing, are presented. Reservoir parameterization and initial ensemble generation according to the reservoir type will be demonstrated in detail.

Naturally Fractured Reservoirs
Numerical simulation for the naturally fractured reservoir was usually conducted by the dual porosity-dual permeability model [72]. This can mimic production behaviors affected by the fracture network in the reservoir. It assumed that the matrix acts as hydrocarbon storage toward fracture, and fracture takes the role of conduits for fluid flow. Static data for the dual-porosity model consists of permeability and porosity for fracture, porosity for the matrix, and other parameters. The dual-porosity model has two additional parameters, interporosity flow coefficient and storativity ratio, which explain the phenomena in naturally fractured reservoirs. The interporosity flow coefficient can describe how quickly hydrocarbon fluid can flow from matrix to fracture through different porosities. As the coefficient is decreased, the transition from matrix to fracture is delayed. The other parameter, storativity ratio, represents the ratio of the reserves inside the fractures, to all the reserves. For history matching representing complex production behavior in naturally fractured reservoirs, more static data, as well as these additional parameters, are added into the state vector in Equation (1). The large number of unknown parameters increases the degree of freedom; thus, generation of the initial ensemble honoring static data is the most important factor for reliable history-matching reducing the uncertainty. method of the initial ensemble. When considering anisotropy using the discrete fracture network (DFN), the state vector includes the permeability tensor. Sigma factor, which is one of the parameters of dual porosity model for the fractured reservoir, can optionally be included in a state vector, as considered in [62]. Covariance localization [60], or refinement with velocity [63], can ensure numerically stable results, in spite of the small size of ensemble of around fifty. An ensemble-based history matching was applied in naturally fractured tight reservoirs. The objective of the study was to prove the applicability of the proposed method to naturally fractured reservoirs. The authors created a simple synthetic reservoir, assuming that the orientation and location of the fractures were known approximately from seismic, microseismic, and core samples. The reservoir has a simple fracture network with several perpendicular fractures, and four production wells. Two of them share the fracture network connected to each other, one is completely in the isolated fracture, and the other is located in the matrix.
First of all, the dual porosity-dual permeability (DPDP) model was proposed as a simulation model for the naturally fractured reservoirs. This made a reservoir simulation simpler, compared to the discrete fracture network model using local raid refinement (LGR). Since the orientation and location of fractures were assumed known, the next parameters to affect production behavior were the matrix permeability, matrix-fracture transmissibility, and fracture permeability. Thus, the state vector for EnKF was composed of matrix permeability, fracture permeability, fracture porosity, and matrix-permeability transmissibility (sigma factor).
This study demonstrated that history matching with EnKF has the capability for fracture characterization and reserve estimation. When it estimated future production after history matching with only 20% of total production life, the production forecast showed good agreement with the true production behavior. Even though the application was simple, it validated the applicability of EnKF for history matching in naturally fractured reservoirs.
An application of EnKF to history matching considering the uncertainty of a naturally fractured reservoir [60] was published. Figure 8 shows its workflow that integrates fracture properties. Available static data that can be gathered in naturally fractured reservoirs include statistical data about the fracture network, such as fracture density, half-length, orientation, and aperture size. These data are just distributions in a certain region, and do not contain geographic reservoir properties of the whole region. That is why a geographic history-matching calibrating fracture distribution is needed.
EnKF with LGU accomplished the history matching with low error between the estimated and actually observed dynamic parameters. However, the velocity and water saturation fields showed different shapes with smearing edges, compared to the reference field with fine-grid system. This weakness was improved by additional implementation of RVF. It was concluded that RVF can calibrate naturally fractured reservoir models more accurately, reflecting contrast characteristics of flow behavior between matrix and fracture.  In the research, an initial ensemble was constructed by DFN model from fracture statistical data, and converted into an equivalent model to the porous reservoir by the Oda method [73]. The converted model, which consists of permeability tensor, porosity for matrix and fracture, and interporosity flow coefficient, can be simulated, just like conventional reservoirs. These equi-probable reservoir models for initial ensemble honor statistical fracture data from the aspect of regional average properties, but heterogeneity within the region is not reflected at all. Thus, history-matching using dynamic data should be needed. Covariance localization was selected to overcome the typical limitations of EnKF. The concept is a selective assimilation that each type of observation identifies each influenced region by the time of flight (TOF) of the streamline simulation, and then assimilates reservoir properties within the influenced region through EnKF. The correlation function is calculated by averaging the influence region of ensemble members, as shown in Figure 4.
The method was applied to a producing inverted 9-spot reservoir with one injector and nine producers. The research verified the proposed scheme integrating DFN and EnKF for naturally fractured reservoirs. Covariance localization by streamline simulation can make EnKF more stable and reliable against overshooting or the filter divergence problem, notwithstanding the small ensemble size of 40. It could reproduce a reservoir model that is consistent with the true reservoir, and accurately estimate reserves of the field with an uncertainty assessment. As the results of production were forecast for 880 days after history-matching with production data for the initial 120 days, the coefficient of variant for reserves estimated of the history-matched ensemble member was reduced to 32%, compared to those of the initial ensemble member.
Another application for naturally fractured reservoirs is a study to estimate fracture effective permeability by upscaling using EnKF [63]. Its assumptions are quite different from the previously mentioned study. The authors assumed that the information of the fracture network is already known, and tried to characterize the permeability distribution of the matrix. The number of ensemble members was only 40, and each initial ensemble was generated by overlap of the assumed fracture network on the realization of matrix permeability using sequential Gaussian simulation (SGS). The integration was conducted by the local-global upscaling (LGU) method, which calculates coarse scale permeability using local boundary conditions determined from global coarse-scale flow solutions. History matching using production data was accomplished for the coarse-scale field by EnKF. Additionally, to overcome the limitation of the typical weakness of EnKF, the effective permeability distribution was refined with the velocity field after history matching.
This scheme was applied to a synthetic field, which is a 2-dimensional model of 1000 ft by 1000 ft. The reservoir produced oil with an inverted 5-spot system. Thus, the oil rate from four producers and bottomhole pressure from an injector were used for history matching as observation data. This study was designed to investigate the effects of LGU, EnKF, and refinement with velocity field (RVF). EnKF with LGU accomplished the history matching with low error between the estimated and actually observed dynamic parameters. However, the velocity and water saturation fields showed different shapes with smearing edges, compared to the reference field with fine-grid system. This weakness was improved by additional implementation of RVF. It was concluded that RVF can calibrate naturally fractured reservoir models more accurately, reflecting contrast characteristics of flow behavior between matrix and fracture.

Channelized Reservoir
A channelized reservoir consists of two kinds of deposits; one is high permeable sand with a longitudinally propagated narrow band, and the other is less permeable shale background. The permeability of this kind of reservoir is a bimodal distribution. The characteristics of the production behavior depend on the connectivity of the sand channel stream. Only when a producer is connected to a water injector, can it be expected that pressure support and oil are incremental with the results of water injection. Due to these geologically complex characteristics of the channelized reservoir, application of ensemble-based history-matching for these reservoirs has been a challenging topic. Many technical approaches linked with EnKF or ES have been adopted to resolve this problem; DCT, discrete wavelet transform (DWT), and other geostatistical methods reflecting the properties of channelized reservoirs.
Meanwhile, Table 3 summarizes other considerations for channelized reservoir that are already customized. The state vector in the majority of researches for channelized reservoir consisted of only permeability. This is because parametrization for the channelized reservoir is simple, and permeability is the best parameter to distinguish permeable sand from shale background. SNEsim (single normal equation simulation), which is one of the most representative algorithms in multiple point simulation, was commonly used for generating initial ensembles. Recently, new schemes of assisted history-matching, such as iterative adaptive Gaussian mixture filter (IAGM) [64], or ensemble smoother with clustered covariance (ESC) [65], have been conducted for efficient and reliable results.
Jafarpour and McLaughlin derived important implications for proper ensemble design, when EnKF was applied for history matching in channelized reservoirs [14]. The authors designed several experiments to quantify the effects of the ensemble generation method and number of ensemble members. Generally, initial ensembles for channelized reservoirs are generated by the multipoint geostatistical simulation method, and its accuracy is dependent on training images, which are used for the geostatistical method. When it is applied for EnKF, the training image to generate initial ensemble members should include uncertainties of width, tortuosity, connectedness, and complexity. If the initial ensemble did not include properties of the true field, the results of history matching were not reliable.
In the same context, the ensemble size should be designed to be large enough for each ensemble member to cover geological uncertainty. As the ensemble size increases, the initial ensembles can provide wider geological uncertainty. That is why ensemble size can affect the robustness and accuracy of EnKF. The study suggested that an ensemble size of 100 was too small for reliable results, while an ensemble size of 300 was sufficient for the case.
To complement a diversity of ensemble members, a probability weighted re-sampling coupled with EnKF was suggested [32]. The main idea of the study was to generate new ensemble members that reflect both the geological characteristics, and the early production data. The procedures were composed of generating the initial ensemble by SNESim, updating with EnKF, ensemble re-sampling with probability weighting, selecting re-sampling points and generating new ensemble members, and updating with EnKF for the next time step, repeatedly. The method was implemented to a synthetic channelized reservoir with two distinct facies: sand and shale. An injector and a producer, drilled horizontally and completed with inflow control valves (ICV), were applied for a production scheme. History matching was conducted every month for the initial 9 months. Because ensemble variance diminished drastically after the 5th update, 20 ensemble members by re-sampling with the observed petrophysical properties were added. Re-sampling coupled with EnKF resulted in the reproduction of spatial continuity, and facies ratio consistent with the true field property.
One of the recent researches for channelized reservoirs is the ensemble smoother with clustered covariance as an assisted history-matching, proposed by Ref. [65]. The authors maintained that the proposed method could produce good performances for history-matching and production forecasts, by comparison with other methods, EnKF and ES. It differentiated assimilation procedures by clustering ensemble members. Clustered covariance can provide reliable Kalman gains, as shown in Equation (3).
The procedures of the method were initial facies modeling, clustering, and dynamic data assimilation using ES, as schematized in Figure 9. Initial facies models are generated with TIs and core data by SNESim, as mentioned before. The next process of the method is clustering. This consists of distance definition, dimension reduction, and clustering with K-means. Initial ensemble members are classified into several groups, and the criterion for classification is Hausdorff distance. The Hausdorff distance matrix was converted into an orthogonal coordinate system by multi-dimensional scaling. According to the converted distance in the orthogonal coordinate system, the ensemble members were clustered by K-means clustering. The main concept of the method was that each group of clustered models was assimilated by its own Kalman gain calculated from each group.
The research reported in [65] attempted to characterize the reference field on the assumption of the initial facies ratio. An initial ensemble of 200 members was generated, and then clustered into 10 groups. While the standard ES yielded an overshooting problem, the standard EnKF and the proposed ESC produced reliable results, representing the channel connectivity of the reference field without any numerical problem. However, the standard EnKF failed to conserve a bimodal distribution for permeability, and to accurately predict future production behavior. In spite of the initial assumption of lower facies ratio for sand over the reference field, the proposed ESC characterized the channelized reservoir with accurate facies ratio. Moreover, it represented production forecasts that were well-matched to the reference field, and efficient computation time, only 4.2% of the standard EnKF. of distance definition, dimension reduction, and clustering with K-means. Initial ensemble members are classified into several groups, and the criterion for classification is Hausdorff distance. The Hausdorff distance matrix was converted into an orthogonal coordinate system by multi-dimensional scaling. According to the converted distance in the orthogonal coordinate system, the ensemble members were clustered by K-means clustering. The main concept of the method was that each group of clustered models was assimilated by its own Kalman gain calculated from each group. Figure 9. Typical workflow to forecast production performances integrating data clustering with ensemble-based methods at channelized reservoirs [65].
The research reported in [65] attempted to characterize the reference field on the assumption of the initial facies ratio. An initial ensemble of 200 members was generated, and then clustered into 10 groups. While the standard ES yielded an overshooting problem, the standard EnKF and the proposed ESC produced reliable results, representing the channel connectivity of the reference field without any numerical problem. However, the standard EnKF failed to conserve a bimodal distribution for permeability, and to accurately predict future production behavior. In spite of the initial assumption of lower facies ratio for sand over the reference field, the proposed ESC characterized the channelized reservoir with accurate facies ratio. Moreover, it represented

Tight Reservoir with Hydraulic Fracturing
In most of the tight reservoirs, hydraulic fracturing is a key factor for a commercially successful project. Optimization for a multi-stage fracturing process, such as design, implementation, and monitoring, is essential. Post-evaluation of hydraulic fracturing can be a most reliable basis to optimize the process of hydraulic fracturing. Despite its importance, there are not many researches on post-evaluation through history-matching in unconventional reservoirs. However, the environment of low oil price has yielded several researches on post-evaluation in multi-stage fractured reservoirs. Table 4 summarizes three researches on ensemble-based history-matching for unconventional reservoirs. It was commonly assumed that reservoir properties are homogeneous, and that the fracture location for each stage is known. The reservoir model was calibrated by only the fracture half-length and permeability. A state vector can be constructed with the fracture half-length and permeability. In the case of Ref. [61], the fracture half-length can be defined by calibrated permeability, thus only the permeability was included in a state vector. However, observations were different, depending on the data gathering method. The methods of the three examples are different from each other: distributed temperature sensing (DTS), tracer test, and production logging tool (PLT). The observation parameter of each method corresponds to temperature, tracer concentration, and production rate, respectively.
Tracer test is widely conducted in the hydraulic fractured well, due to the convenience. It can diagnose the well performance in a very early stage of the production period, with relatively low cost. History matching with tracer test data for a hydraulic fractured reservoir was published in Ref. [67]. The reservoir model for this study assumed 4 stages of hydraulic fracturing.
A sensitivity study of fracture half-length and fracture permeability was performed, and the following conclusions made: the longer the fracture half-length, the larger the stimulated reservoir volume. Because the production data is dependent on SRV, the production data can be sensitive to the fracture half-length. Meanwhile, the fracture permeability did not affect the stimulated reservoir volume, but only the early production time of tracers. Thus, the tracer data is sensitive to fracture permeability.
The study quantified the effects of tracer and production data for ensemble-based history matching. The authors conducted three numerical experiments of history matching; with tracer data, production data, and both of them. History matching with tracer data characterized the fracture permeability accurately, but not the fracture half-length at all. In contrast, the production data characterized the fracture half-length accurately, but not the fracture permeability. The results of history matching with tracer and production data showed that the predicted fracture half-length and permeability showed good agreement with the true values. From the experiments, it was concluded that the tracer data during flowback and production data can be mutually complementary for characterization of hydraulic fractures by EnKF. The applicability of EnKF was expanded to hydraulic fracture reservoir characterization using DTS [66]. DTS can be installed at wellbore, and provide temperature profile in real time during treatment, flow-back, and production period. In the case of DTS, temperature observation can be used in ensemble-based history-matching in unconventional reservoirs. In the research, temperature was observed at various locations of hydraulic fractures, and used as a history matching parameter. That is why they performed non-isothermal reservoir simulation by ECLIPSE 300 (Schlumberger, Houston, TX, USA). A sensitivity analysis was conducted to investigate the impact of DTS data with regard to reservoir and fracture parameters. The results of the analysis revealed that the impacts of the parameters in order are the number of fractures, reservoir permeability, fracture half-length, height, width, and fracture permeability, respectively. The purpose of the study was to estimate the hydraulic fracture geometry by integrating DTS observations. The authors selected two fracture parameters; one was the fracture permeability with small sensitivity, and the other was the fracture half-length with medium sensitivity. They organized an experiment to characterize the two parameters simultaneously with EnKF. Two parameters were assumed to be unknown, and the others constant.
The reservoir size and grid configuration of the example field were 3100 ft by 600 ft by 150 ft, and 100 by 40 by 30, respectively. The LGR was applied to neighboring grid blocks with hydraulic fractures for numerical stability. The total number of fracture stages was assumed to be eight, and the porosity and permeability of the reservoir were 30% and 0.2 md, respectively. The authors constructed an inverse model with fracture half-length and permeability as unknown parameters, and temperature as observation. The fracture half-length for the initial ensemble was generated randomly from a uniform distribution from (5 to 300) ft. The other unknown parameter, fracture permeability, was also generated randomly between (100 and 5000) md. The initial ensemble was calibrated by temperature profiles at seven time steps using EnKF. A wide range of uniform distributions with wide range at the initial stage turned into true values of fracture half-length and permeability as recursive updates. The study showed that the proposed scheme could make accurate estimates of fracture properties within only five updates during 100 days.
The last is an example using PLT data for ensemble-based history matching [61]. PLT has the advantage that it can gather production data of each fracture stage at bottomhole, not combined at the wellhead. However, PLT survey cannot be frequently conducted, due to the cost. A practical method was proposed to characterize each stage of hydraulic fractures, by integrating PLT data and production data with ES.
The key point of the study is how often PLT surveys are conducted. Frequent PLT data can improve the accuracy of history matching with ES. The author tried to seek a cost effective scheme to be applied in the industry. A controlled experiment was designed to quantify the effects of the number of PLTs on history matching with ES. In the study, there were three cases of data assimilation; with three times of production data, three times of PLT surveys, and one PLT survey and two production data. The results showed that only single PLT could significantly improve characterization of the fracture properties. Moreover, its estimate error was a similar level, compared to the case using two more PLT data.

Conclusions
This paper reviewed the historical development, as well as the recent progress, of ensemble-based history matching methods with data assimilation to solve nonlinear problems of static and dynamic data. The typical strengths of these methods are the easy integration of various data and mathematical clearness to reduce error covariance, while the weaknesses are the influences of the initial ensembles, e.g., non-Gaussian model parameters, improper prior ensembles, and finite ensemble size. To overcome these drawbacks, the studies have been to improve Kalman gains, the model parameters, and the observed dynamic data. Recent research trends can be categorized as two different directions: One trend is to improve the accuracy by mathematically resolving the inherent problems. The other trend is to enhance the field applicability by finding a proper combination of model parameterization, initial ensemble generation, and supplementary techniques, according to various reservoir types.
As regards representative challengeable works, new approaches to enhance the computation efficiency and to secure field applications are suggested. To improve accuracy, some methods have a tendency to sacrifice computational efficiency, by increasing the number of assimilation and ensemble member. As ES, compared with the computational efforts of EnKF, was developed to dramatically reduce the assimilation number, the paradigm shift to reduce simulation number can become a major research theme for ensemble-based techniques. A more reliable assisted-history matching tool, applicable to heterogeneous facies models, is required as well. To increase the applicability of heterogeneous reservoirs, the nonlinear relationships among static properties, dynamic variables, and observed scale-different data should be considered. Some unsolved problems should be included in current ensemble-based data assimilation, e.g., the preservation of geological realism, the reliable correlations between geological scenarios and reservoir properties, and uncertainty quantification with the results of many-objective history matching.