1. Introduction
Reservoir fluid pressure-volume-temperature (PVT) properties such as bubble point pressure, gas solubility, and oil and gas formation volume factors and viscosities are critical in reservoir engineering management and computations. These PVT properties are required to determine the initial hydrocarbons in place, optimum production schemes, and ultimate hydrocarbon recovery, as well as to design fluid handling equipment and prepare reservoir volumetric estimates. Bubble point pressure (Pb) and gas solubility (Rs) are two of the most critical quantities used to characterize an oil reservoir. Therefore, the accurate determination of these properties is one of the main challenges in reservoir development and management. There are also other factors that affect reservoir management, such as permeability. Jia et al. [1] illustrated that for shale reservoirs with a permeability of 0.01 mD, continuous gas injection is preferred, while for ultra-low permeability reservoirs, CO2 huff-n-puff is recommended. For CO2 huff-n-puff injection in oil shale reservoirs, reservoir heterogeneity is unfavorable during the primary production period, whereas the fracture length plays a key role in oil production [2].
Conventionally, PVT properties are determined by laboratory measurements. However, these experiments are costly, time-consuming, and highly dependent on the quality and quantity of collected samples [3,4,5]. Therefore, several empirical correlations, such as equations of state (EOS) as well as linear, non-linear, and multiple regression correlations, have been introduced to predict PVT properties [6,7,8,9]. However, the accuracy of these correlations is highly dependent on fluid types and the chosen equation [10,11,12].
Recently, artificial intelligence (AI) techniques have been extensively applied in the petroleum industry, especially in predicting well/field performance. Alajmi et al. [13] predicted choke performance using an artificial neural network (ANN). Alarifi et al. [14] estimated the productivity index for horizontal oil wells using ANN, functional networks, and fuzzy logic. Chen et al. [15] applied neural networks and fuzzy logic to evaluate the performance of an inflow control device (ICD) in a horizontal well. Moussa et al. [16] used an optimized ANN model to predict average reservoir permeability from well-log data. Van and Chon [17] evaluated the performance of CO2 flooding using ANN. Elkatatny et al. [18] applied ANN to estimate the rheological properties of drilling fluids based on real-time measurements.
Accordingly, several AI approaches and data-driven models have been introduced to predict PVT properties and overcome the challenges associated with laboratory measurements and analytical correlations. Abedini et al. [19] used ANN and fuzzy logic approaches to predict the oil viscosity of undersaturated oil reservoirs. Two models were introduced and showed accurate prediction of oil viscosity compared to the values measured in the laboratory. The input parameters of their models were oil gravity, reservoir temperature, gas-oil ratio (GOR), and bubble point pressure. Moghadasi et al. [20] used ANN to estimate the values of Pb for Iranian oil fields. The input parameters utilized in their model were reservoir temperature, GOR, and oil and gas gravities. They compared their predictions with previous models and showed that ANN yielded the highest accuracy. Al-Marhoun et al. [21] used ANN to determine Pb from the oil composition as well as the GOR, oil and gas gravities, and reservoir temperature. They compared the developed ANN model with equations of state (EOS) and other available models in the literature and concluded that ANN yielded very accurate predictions compared to the previous methods. Tatar et al. [22] used ANN models to estimate the water density in oil and gas reservoirs. Water density is necessary in reservoir simulation and material balance calculations. Their model predicted the formation water density with a correlation coefficient (CC) close to unity and an error close to zero. They used reservoir pressure, temperature, and sodium chloride concentration as inputs to predict the water density. Ahmadi and Bahadori [23] used AI tools such as fuzzy logic to evaluate enhanced oil recovery (EOR) processes. They coupled the fuzzy approach with commercial reservoir simulators to enhance the accuracy of selecting and ranking the appropriate EOR method for the specified oil reservoirs. Choubineh et al. [24] used 693 data points to develop an ANN model to predict the natural gas density for different temperature and pressure ranges. Their model showed that the gas pressure and temperature have a great effect on the natural gas density. Their model can be used in a temperature range of 250 to 450 K and a pressure range of 15 to 65 MPa. The model accuracy was high compared with previous models; the regression coefficient was more than 0.99 and the average absolute error was less than 0.5%.
Although the data-driven models developed by different AI approaches have shown good accuracy compared to laboratory measurements and have outperformed analytical correlations, the input parameters of these models still require expensive laboratory experiments. For example, in order to estimate bubble point pressure (Pb), gas solubility (Rs) or oil composition is required as an input parameter. In other words, the abovementioned models did not eliminate the requirement of the expensive and time-consuming laboratory experiments. Therefore, the objective of this paper is to introduce two data-driven models to: (1) predict the Pb of crude oil samples based on three input parameters—reservoir temperature and oil and gas gravities; and (2) predict the Rs using the three input parameters as well as the predicted Pb as the fourth input parameter.
The proposed methods require no expensive laboratory experiments. Hence, they are a step toward minimizing PVT laboratory experiments. The proposed data-driven models are developed using a modified self-adaptive differential evolution algorithm (MSaDE) [25] combined with ANN. In the subsequent sections of this paper, the proposed hybrid algorithm is referred to as SaDE-ANN.
3. Methodology
ANN has several control parameters, such as the number of hidden layers, the number of neurons in each layer, the training and transfer functions, and the ratio of testing to training data. Conventionally, the values of these control parameters are assigned through a series of sensitivity trials. In each trial, different values of one parameter are tested while the other parameters are kept constant. The value that achieves the minimum error between the measured (real) and predicted output is then selected. The same process is applied to the remaining parameters to find their best values. However, because these control parameters are interdependent, this one-at-a-time "trial" approach does not guarantee optimum results.
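The effect of interdependency can be illustrated with a small toy problem. The sketch below is not taken from the paper: `toy_error` is a hypothetical two-parameter error surface with a deliberate interaction term, and the integer grid stands in for candidate settings of two ANN control parameters. It compares tuning one parameter at a time against searching both jointly.

```python
from itertools import product

GRID = range(6)  # hypothetical candidate values 0..5 for each control parameter


def toy_error(a: int, b: int) -> int:
    """Hypothetical error surface; the (a - b) term couples the two parameters."""
    return 10 * (a - b) ** 2 + (a + b - 8) ** 2


def one_at_a_time(b_start: int = 0) -> tuple[int, int]:
    """Tune a with b held at b_start, then tune b with the chosen a held fixed."""
    a_best = min(GRID, key=lambda a: toy_error(a, b_start))
    b_best = min(GRID, key=lambda b: toy_error(a_best, b))
    return a_best, b_best


def joint_search() -> tuple[int, int]:
    """Evaluate every (a, b) pair and keep the one with the lowest error."""
    return min(product(GRID, GRID), key=lambda ab: toy_error(*ab))


ofat = one_at_a_time()
joint = joint_search()
print("one-at-a-time:", ofat, "error =", toy_error(*ofat))
print("joint search: ", joint, "error =", toy_error(*joint))
```

On this toy surface the one-at-a-time pass stops at (1, 2) with an error of 35, whereas the joint search finds (4, 4) with an error of 0. An exhaustive joint grid is only practical for very small search spaces, which is one reason a stochastic optimizer is attractive for the real problem.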
Therefore, the methodology adopted in this paper involves the simultaneous optimization of these parameters to achieve the minimum average absolute percentage error (AAPE) and the maximum CC. The definitions of AAPE and CC are given in Appendix A. The stochastic optimization method used in this paper is the modified self-adaptive differential evolution (MSaDE) algorithm [25]. In MSaDE, the control variables of the differential evolution algorithm, such as the scale factor, crossover rate, and mutation strategy, are self-adapted during each iteration. In this paper, MSaDE is integrated with ANN to optimize the control parameters of ANN.
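For reference, the two performance measures can be computed as in the following minimal sketch. It assumes the commonly used definitions (AAPE as the mean of absolute relative errors expressed in percent, and CC as the Pearson correlation coefficient); the definitions in Appendix A should take precedence if they differ. The function names and sample values are illustrative only.

```python
import math


def aape(measured, predicted):
    """Average absolute percentage error, in percent (assumed definition)."""
    ratios = [abs((m - p) / m) for m, p in zip(measured, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def cc(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up bubble point pressures, purely to exercise the functions.
measured_pb = [1000.0, 2000.0, 4000.0]
predicted_pb = [1100.0, 1900.0, 4000.0]
print(f"AAPE = {aape(measured_pb, predicted_pb):.2f}%")
print(f"CC   = {cc(measured_pb, predicted_pb):.4f}")
```

Note that AAPE is undefined when a measured value is zero, so such points would need to be excluded before evaluation.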
The input parameters to the ANN are reservoir temperature (T), oil gravity (American Petroleum Institute (API)), and gas specific gravity (GG). The outputs are bubble point pressure (Pb) and solution gas-oil ratio (Rs). As mentioned earlier, ANN consists of two phases: training and testing. In the training phase, the optimization process of SaDE-ANN continues until one of two conditions is met: (1) the AAPE is less than 5%, or (2) the maximum number of function evaluations (1000) is reached. The optimized SaDE-ANN model is then validated on unseen testing datasets to predict the values of Pb and Rs using the input parameters T, API, and GG.
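The optimization loop and its two stopping conditions can be sketched as follows. This is not the MSaDE algorithm of [25]: it is a simplified stand-in that uses a fixed rand/1/bin mutation strategy with jDE-style self-adaptation of the scale factor and crossover rate only. The objective `toy_aape` is a hypothetical placeholder for "train an ANN with these control parameters and return its AAPE", and the population size, bounds, and adaptation probabilities are assumptions.

```python
import random


def self_adaptive_de(objective, bounds, pop_size=20, target=5.0,
                     max_evals=1000, seed=0):
    """Minimize objective; stop when best cost < target or max_evals is reached."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scale = [0.5] * pop_size   # per-individual scale factor F
    cross = [0.9] * pop_size   # per-individual crossover rate CR
    cost = [objective(x) for x in pop]
    evals = pop_size
    i = 0
    while min(cost) >= target and evals < max_evals:
        # Self-adaptation: occasionally resample F and CR for this trial.
        f_i = rng.uniform(0.1, 1.0) if rng.random() < 0.1 else scale[i]
        cr_i = rng.random() if rng.random() < 0.1 else cross[i]
        r1, r2, r3 = rng.sample([j for j in range(pop_size) if j != i], 3)
        j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
        trial = []
        for d, (lo, hi) in enumerate(bounds):
            if rng.random() < cr_i or d == j_rand:
                value = pop[r1][d] + f_i * (pop[r2][d] - pop[r3][d])
                value = min(max(value, lo), hi)  # clip to the search bounds
            else:
                value = pop[i][d]
            trial.append(value)
        trial_cost = objective(trial)
        evals += 1
        # Greedy selection: F and CR survive only if the trial survives.
        if trial_cost <= cost[i]:
            pop[i], cost[i] = trial, trial_cost
            scale[i], cross[i] = f_i, cr_i
        i = (i + 1) % pop_size
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best], evals


def toy_aape(x):
    """Hypothetical stand-in for the AAPE of an ANN built from parameters x."""
    return 100.0 * sum((v - 0.3) ** 2 for v in x)


best_x, best_cost, n_evals = self_adaptive_de(toy_aape, [(0.0, 1.0)] * 3)
print(best_x, best_cost, n_evals)
```

In a real application, discrete control parameters such as the number of neurons would need to be decoded from the continuous vector (for example by rounding) inside the objective function.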
Data Analysis and Acquisition
The data points utilized in this paper were collected from the literature [7,8,9,33,34,35]. The data include different oil sources with different concentrations. Data from the Middle East (Al-Marhoun) [8], Malaysian crudes (Omar and Todd) [34], the North Sea (Glasø) [7], fields all over the world (Vazquez and Beggs) [9], and the Mediterranean Basin, Africa, the Persian Gulf, and the North Sea (De Ghetto) [35] were employed. Each data point contains input parameters (reservoir temperature (T), oil gravity (API), and GG) and output parameters (solution gas-oil ratio (Rs) and bubble point pressure (Pb)).
Table 1 shows the statistical parameters of the 460 datasets studied after outlier removal using the mean-standard deviation method, in which a dataset ($x_{ij}$) is considered an outlier if the condition shown in Equation (1) is satisfied.

$$\left| x_{ij} - \mu_i \right| > k\,\sigma_i \qquad (1)$$

where $x_i$ is the data vector for the $i$-th parameter, $i = 1, \dots, n$, $n$ is the total number of input parameters (in this case, $n = 3$), $\mu_i$ is the mean of the $i$-th parameter, $j = 1, \dots, m$, $m$ is the total number of datasets, $\sigma_i$ is the standard deviation of the $i$-th parameter, and $k$ is the cutoff multiplier of the mean-standard deviation method.
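The mean-standard deviation filter can be sketched as follows. The cutoff multiplier `k`, the use of the population standard deviation, and the sample rows are all assumptions made for illustration; none of them are taken from the paper.

```python
import statistics


def remove_outliers(rows, k):
    """Keep rows whose every column lies within k standard deviations of its mean.

    rows: list of equal-length tuples, one tuple per dataset.
    k: cutoff multiplier (assumed; choose to suit the data).
    """
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]  # population std (assumed)
    return [
        row for row in rows
        if all(abs(v - mu) <= k * sd for v, mu, sd in zip(row, means, stds))
    ]


# Made-up (T, API, GG) rows; the last row has an implausibly high temperature.
samples = [
    (150.0, 30.0, 0.8), (160.0, 32.0, 0.9), (170.0, 34.0, 1.0),
    (180.0, 36.0, 1.1), (190.0, 38.0, 1.2), (200.0, 30.0, 0.8),
    (210.0, 32.0, 0.9), (220.0, 34.0, 1.0), (230.0, 36.0, 1.1),
    (900.0, 38.0, 1.2),
]
cleaned = remove_outliers(samples, k=2.0)
print(len(samples), "->", len(cleaned))
```

Because the mean and standard deviation are themselves inflated by extreme values, a single pass can miss milder outliers; whether to repeat the filter is a design choice the sketch leaves open.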
The CCs of the input parameters (T, API, and GG) with the output parameters (Pb and Rs) are shown in Figure 1. In this paper, a combined correlation coefficient (cCC) parameter is introduced to indicate the combined CC of T, API, and GG with Pb and Rs. cCC is the arithmetic mean of the CCs of the three input parameters, calculated by Equation (2). cCC is estimated for Pb and Rs to determine which output should be estimated first.

$$\mathrm{cCC} = \frac{CC_{GG} + CC_{API} + CC_{T}}{3} \qquad (2)$$

where $CC_{GG}$, $CC_{API}$, and $CC_{T}$ are the correlation coefficients between the output parameter and GG, oil gravity, and reservoir temperature, respectively.
Figure 1 shows that Pb has a higher cCC with the input parameters (0.37) compared to Rs (0.32). Therefore, it is more convenient to estimate Pb first, and then use the estimated Pb with the three input parameters to predict Rs.
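The ordering decision can be sketched in a few lines. The example below assumes that CC is the Pearson correlation coefficient and that cCC is the plain arithmetic mean of the three CCs, as described in the text. The data values are made up and do not reproduce the 0.37 and 0.32 reported for Figure 1.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def combined_cc(inputs, output):
    """Arithmetic mean of the CCs between each input column and the output."""
    return sum(pearson(col, output) for col in inputs.values()) / len(inputs)


# Made-up data, for illustration only.
inputs = {
    "T": [150.0, 175.0, 200.0, 225.0],
    "API": [30.0, 32.0, 34.0, 36.0],
    "GG": [0.8, 0.9, 1.0, 1.1],
}
outputs = {
    "Pb": [1000.0, 1500.0, 2000.0, 2500.0],
    "Rs": [300.0, 500.0, 400.0, 600.0],
}

ccc = {name: combined_cc(inputs, values) for name, values in outputs.items()}
first = max(ccc, key=ccc.get)  # the output with the higher cCC is predicted first
print(ccc, "-> predict", first, "first")
```

The output predicted first can then be appended to the input set of the second model, which is the cascade described above for Pb and Rs.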
5. Conclusions
Bubble point pressure (Pb) and gas solubility (Rs) have a significant effect on the accuracy of modeling fluid flow in porous media. This paper introduced two data-driven correlations to predict Pb and Rs using reservoir temperature and oil and gas gravities. These empirical correlations were developed using a self-adaptive artificial neural network (SaDE-ANN). SaDE-ANN is a hybrid ANN integrated with a modified self-adaptive differential evolution (MSaDE) algorithm. The proposed correlations by SaDE-ANN were validated using previous experimental data reported in the literature (760 data points).
The developed empirical correlation for Pb predicted the Pb with a CC of 0.99 and an average absolute percentage error (AAPE) of 6%. Similar results were obtained for Rs, where the new empirical correlation predicted the Rs with a coefficient of determination (R²) of 0.99 and an AAPE of less than 6%.
The proposed correlations outperformed the previously reported empirical correlations against which they were compared, obtaining the highest CC (0.992) and the lowest AAPE (5.42%) between measured and predicted values. Because the correlations introduced in this paper use only reservoir temperature and oil and gas gravities as input parameters to predict Pb and Rs, they reduce the need for the expensive and time-consuming PVT laboratory experiments commonly used to determine Pb and Rs.