## 1. Introduction

Young’s modulus is a measure of the stiffness of a sample subjected to a uniaxial load [1]. Static Young’s modulus (E_{static}) is an essential parameter for developing the geomechanical earth model [2], which is required for fracture mapping and design [3]. A complete description of the in situ stresses, which requires assessment of different petrophysical and mechanical parameters, is also needed during drilling operations to ensure wellbore stability [4]. Several previous studies confirmed the impact of E_{static} on both fracture design and wellbore stability [1,5].

Lithology is one of the main factors affecting E_{static}. According to Howard and Fast [6] and Fjaer et al. [1], E_{static} ranges from 0.1 to 1.0 MPsi for shale, from 2 to 10 MPsi for sandstone, and from 8 to 12 MPsi for limestone [6]. These ranges confirm the very large variation in E_{static} between formations, as well as the wide range within the same lithology type. These facts indicate the necessity of estimating E_{static} along the different sections of the drilled well.

Currently, two methods are available for assessing the elastic parameters of rocks: (1) laboratory measurements and (2) empirical correlations. The elastic properties of a rock sample can be measured in the laboratory using either the dynamic or the static method. The dynamic method estimates the modulus from measurements of density and of compressional and shear wave velocities, while the static method directly measures the deformation caused by subjecting a sample to a uniaxial or triaxial load [7]. In oil and gas fields, the shear and compressional wave velocities are measured by wireline logging [8]. The measured acoustic velocities are then used to calculate the dynamic Young’s modulus (E_{dynamic}), Equation (1):

$$E_{dynamic} = \frac{\rho\, V_{S}^{2}\,\left(3V_{P}^{2} - 4V_{S}^{2}\right)}{V_{P}^{2} - V_{S}^{2}} \tag{1}$$

where ρ represents the bulk formation density in g/cm^{3}, V_{S} and V_{P} denote the shear and compressional wave velocities in km/s, and E_{dynamic} is the dynamic Young’s modulus in GPa.
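As a minimal sketch of how this calculation might be carried out from log readings, the snippet below applies the standard isotropic-elasticity relation for the dynamic Young’s modulus with the units stated above. The function names and the sample readings are illustrative assumptions, not values from this study; the transit-time conversion relies only on 1 ft = 0.3048 m.

```python
def transit_time_to_velocity(dt_us_per_ft):
    """Convert a sonic transit time in microseconds per foot to velocity in km/s."""
    # 1 ft = 0.3048 m, so v [km/s] = 304.8 / dt [us/ft]
    return 304.8 / dt_us_per_ft


def dynamic_youngs_modulus(rho, v_p, v_s):
    """Dynamic Young's modulus in GPa from density (g/cm^3) and velocities (km/s)."""
    if v_p <= v_s:
        raise ValueError("compressional velocity must exceed shear velocity")
    return rho * v_s**2 * (3 * v_p**2 - 4 * v_s**2) / (v_p**2 - v_s**2)


# Illustrative (made-up) log readings, not data from the paper
rho_b = 2.5                             # bulk density, g/cm^3
v_p = transit_time_to_velocity(60.0)    # compressional velocity, km/s
v_s = transit_time_to_velocity(100.0)   # shear velocity, km/s
e_dyn = dynamic_youngs_modulus(rho_b, v_p, v_s)
print(f"Vp = {v_p:.3f} km/s, Vs = {v_s:.3f} km/s, E_dynamic = {e_dyn:.1f} GPa")
```

With density in g/cm^{3} and velocities in km/s, the product ρV^{2} comes out directly in GPa, so no further unit conversion is needed.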

For the same rock, the laboratory-measured E_{dynamic} is usually significantly greater than E_{static} [9,10,11]. E_{dynamic} can be 1.5–3 times greater than E_{static} [12,13], and in some cases up to ten times greater [14,15,16]. The difference is attributed to the difference in strain amplitude between the two testing techniques, and it decreases as the strength of the rock increases [17].

The reservoir in situ stress-strain conditions are truly represented by the static elastic parameters [18]. Determining these parameters requires retrieving real core samples along the reservoir section, which is a costly and time-consuming process [13,19]. To minimize the high cost of retrieving core samples and performing laboratory tests, usually only a few core samples are collected from the targeted (reservoir) interval, and their laboratory-evaluated properties are used to develop empirical correlations based on the well log data. The dynamic elastic modulus can then be calibrated using these log-based correlations to predict the static modulus throughout the reservoir depths [3]. The applicability of log-derived correlations is restricted to the formations used to develop them; thus, because of the complexity of heterogeneous formations, log-derived correlations cannot capture the trend of the changes in the static parameters. To overcome this limitation, different empirical correlations were developed to estimate E_{static} from E_{dynamic}, each restricted to a specific formation type.

Eissa and Kazi [20] developed a generalized empirical equation to predict E_{static} as a function of both E_{dynamic} and formation density. The authors developed this correlation (Equation (2)) from a regression analysis of 76 tests, with data collected from different sources, and found that considering the formation density considerably improved the predictability of E_{static}:

where E_{static} and E_{dynamic} are in GPa, and γ is the formation density in g/cm^{3}.

Canady [21] developed another generalized empirical correlation (Equation (3)), which can also be used to estimate E_{static} for any rock type. This correlation enables prediction of E_{static} when only E_{dynamic} is known. The E_{static} values predicted with Equation (3) were compared with previously available correlations and found to correlate well with those models:

where E_{static} and E_{dynamic} are in GPa.

Najibi et al. [22] developed another simple correlation (Equation (4)) to evaluate E_{static} for the Sarvak and Asmari limestones based only on the compressional velocity (V_{P}). This model is very useful when the shear velocity (V_{S}) is not available:

where E_{static} is in GPa, and V_{P} is in km/s.

Recently, Fei et al. [23] developed an empirical correlation to predict E_{static} from E_{dynamic} specifically for sandstone formations. The developed equation (Equation (5)) is based on triaxial tests conducted on 22 sandstone core samples:

where E_{static} and E_{dynamic} are in GPa.

Mahmoud et al. [24] developed empirical correlations for E_{static} estimation for different rock types. These correlations do not require knowledge of E_{dynamic}; they evaluate E_{static} directly from the bulk density and the shear and compressional transit time data.

It is clear from the literature that obtaining E_{static} requires retrieving cores from specific depths of the well and performing laboratory analysis, which is costly and time-consuming. In addition, the analysis is performed for a specific well and cannot easily be generalized to the entire field, while the available empirical correlations have their own limitations, such as core type, data range, and accuracy. The main objective of this study is to develop an ANN model to predict E_{static} from well logs using real field data (592 core and log data points) collected from the sandstone field. Furthermore, a new empirical correlation will be developed for estimating E_{static} for sandstone reservoirs, based on the extracted weights and biases of the optimized ANN model.

## 2. Uses of Artificial Intelligence in Estimating Rock Mechanical Parameters

The use of artificial intelligence (AI) techniques in many scientific fields, including the petroleum industry, started in the early 1990s. Since then many publications have treated various areas of petroleum engineering, including the prediction of the bubble point pressure, evaluation of drilling mud, interpretation of the well log data, reservoir characterization, recovery factor estimation, optimization of rate of penetration, and many more.

Recent publications (2016–2018) reported several studies that used AI to estimate rock mechanical parameters. These studies used various AI techniques to: predict failure parameters for carbonates [25]; compare ANN, ANFIS, and SVM in predicting the static Poisson’s ratio [26]; develop an empirical correlation for the static Young’s modulus [27]; develop an ANN-based correlation to predict sonic transit time [28]; estimate the unconfined compressive strength (UCS) using ANN [29]; and estimate Young’s modulus, Poisson’s ratio, and UCS from log data using ANN [30].

## 3. Methodology

An artificial neural network (ANN) is an artificial intelligence technique developed to enable estimation, classification, identification, and decision making by a machine program under various conditions. Different ANN structures are currently available; the simplest is the multi-layered perceptron (MLP), which is used in this study. The MLP consists of a single input layer, one or several hidden layers (mid-layers), and one output layer.

The performance of the ANN depends on many design parameters, such as the training/testing dataset ratio, the number of hidden (training) layers, the number of neurons in each hidden layer, and the training and transfer functions. Optimizing the different combinations of these design parameters requires a long computational time.

Differential evolution (DE) is an accurate, reliable, fast, and robust optimization technique that has been used effectively to solve different numerical optimization problems. The main limitation of DE is the need to set the values of its control parameters, which are problem-dependent; thus, parameter tuning is time-consuming. Omran et al. [31] developed the self-adaptive differential evolution (SaDE) algorithm, which does not require parameter tuning.

In this study, the SaDE optimization algorithm will be used to speed up the optimization process to select the different design parameters of the ANN model to predict E_{static}. A new empirical correlation for estimating E_{static} for sandstone reservoirs will be developed based on the extracted weights and biases of the optimized ANN model.
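To make the optimization idea concrete, the following is a minimal sketch of the classic DE/rand/1/bin scheme with fixed control parameters F and CR. It is not the SaDE variant used in this study, which adapts those parameters during the run. The function name, the default settings, and the toy `sphere` objective (standing in for an ANN error measure) are all illustrative assumptions.

```python
import numpy as np


def differential_evolution(objective, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=200, seed=0):
    """Minimise `objective` with the classic DE/rand/1/bin scheme (fixed F and CR)."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(bounds, dtype=float).T
    dim = lower.size
    pop = rng.uniform(lower, upper, size=(pop_size, dim))
    cost = np.array([objective(x) for x in pop])

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct members other than the target vector i
            candidates = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(candidates, size=3, replace=False)]
            mutant = np.clip(a + f * (b - c), lower, upper)
            # Binomial crossover; force at least one gene from the mutant
            mask = rng.random(dim) < cr
            mask[rng.integers(dim)] = True
            trial = np.where(mask, mutant, pop[i])
            trial_cost = objective(trial)
            # Greedy selection: keep the trial only if it is no worse
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost

    best = int(np.argmin(cost))
    return pop[best], float(cost[best])


def sphere(x):
    """Toy objective standing in for the ANN prediction error."""
    return float(np.sum(x**2))


best_x, best_cost = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
print(best_x, best_cost)
```

In an ANN design search, each vector would instead encode choices such as the number of neurons or the training function, and the objective would train a network and return its error.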

#### 3.1. Data Preparation

In this study, the ANN model was trained using well log data of bulk density, compressional time, and shear time as inputs to predict the core-derived E_{static} as an output. The input well log data were selected based on their correlation coefficients with E_{static}. The importance of these input parameters in estimating E_{static} has been reported in several previous studies. Eissa and Kazi [20] confirmed that incorporating the formation density improves E_{static} prediction, as well as the necessity of E_{dynamic}, which depends on the compressional and shear transit times, as reported by several previous studies [21,22,23,24].

Data collected in this study are from four wells: 598 data points from Well-A; 34 data points from Well-B; six data points from Well-C; and 11 data points from Well-D. The majority of the data belong to Well-A; therefore, they will be used to build and test the ANN model, which will then be used to develop an ANN-based correlation. The remaining unseen data, collected from Well-B, Well-C, and Well-D, will be used to validate the developed ANN-based correlation. All the study data were collected from sandstone formations in the Middle East.

Data preparation and preprocessing are the most important steps to ensure highly accurate prediction of the target property using any AI technique [32]. As stated earlier, the input variables are log-derived and will be used to predict a core-derived output. Thus, the first step in this study was to depth-match the core-derived E_{static} with the log data; the gamma ray log was used for this matching. Then, statistical analysis was performed on the input and output parameters to remove outliers. For this purpose, any parameter value outside a range of ±3.0 standard deviations was considered an outlier and excluded from the development of the ANN model. Six data points (outliers) from Well-A were removed in this process.
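The ±3.0 standard deviation filter can be sketched as follows. This is a minimal illustration that assumes the range is centred on each parameter's mean and that a row is dropped if any of its parameters falls outside that range; the function name and the synthetic table are hypothetical stand-ins, not the study data.

```python
import numpy as np


def remove_outliers(data, n_std=3.0):
    """Drop rows where any column lies more than n_std standard deviations from its mean."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    std = data.std(axis=0, ddof=1)   # sample standard deviation per column
    keep = np.all(np.abs(data - mean) <= n_std * std, axis=1)
    return data[keep], keep


# Synthetic stand-in for a [rho_b, dT_C, dT_S, E_static] table (not real data)
rng = np.random.default_rng(1)
table = rng.normal(loc=[2.6, 60.0, 100.0, 40.0],
                   scale=[0.1, 5.0, 10.0, 10.0], size=(200, 4))
table[0, 3] = 400.0   # inject one obviously bad E_static value
clean, keep = remove_outliers(table)
print(f"removed {np.count_nonzero(~keep)} of {len(table)} rows")
```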

#### 3.2. Training the ANN Model

The 592 data points of Well-A (log data and their corresponding core-derived E_{static}) were considered valid data to build the ANN model. Sixty-nine percent (409 data points) of the Well-A data were randomly selected to train the ANN model.
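A random split of this size might be drawn as in the sketch below, which shuffles row indices and takes the first 409 for training. The seed and variable names are assumptions chosen only to make the example reproducible.

```python
import numpy as np

n_points = 592   # valid Well-A data points after outlier removal
n_train = 409    # about 69% of the data, as stated above

rng = np.random.default_rng(42)   # arbitrary seed, for reproducibility only
shuffled = rng.permutation(n_points)
train_idx, test_idx = shuffled[:n_train], shuffled[n_train:]
print(len(train_idx), len(test_idx))
```

The remaining indices form the held-out portion of Well-A.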

Table 1 summarizes the statistical analysis of the training dataset. The analysis shows that the bulk density (ρ_{b}) of the input dataset ranges from 2.312 to 2.968 g/cm^{3}; the compressional time (ΔT_{C}) ranges between 44.3 and 77.8 μsec/ft; the shear time (ΔT_{S}) is between 73.2 and 136.1 μsec/ft; and E_{static} ranges from 7.5 to 92.8 GPa.

The relative importance of the input parameters is shown in Figure 1. E_{static} is a strong function of the bulk density and compressional time, with correlation coefficients of 0.724 and −0.815, respectively, while its dependence on the shear time is moderate, with a correlation coefficient of 0.439.
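Correlation coefficients of this kind can be computed as in the sketch below, assuming the Pearson (linear) coefficient is the one reported. The data are synthetic and only illustrate a negative dependence similar in sign to that of the compressional time; they do not reproduce the values in Figure 1.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc**2) * np.sum(yc**2)))


# Synthetic example: a modulus that decreases with compressional transit time
rng = np.random.default_rng(0)
dt_c = rng.uniform(44.3, 77.8, size=300)                         # usec/ft
e_static = 120.0 - 1.3 * dt_c + rng.normal(0.0, 5.0, size=300)   # GPa, made up
r = pearson_r(dt_c, e_static)
print(f"R = {r:.3f}")
```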

The design parameters of the ANN model were optimized using the SaDE algorithm. The best combination of design parameters is the one that predicts E_{static} with the lowest average absolute percentage error (AAPE) and the highest correlation coefficient (R) and coefficient of determination (R^{2}). During the optimization process, we evaluated the effect of the following on E_{static} estimation: different training functions, such as Levenberg–Marquardt backpropagation (trainlm), gradient descent with momentum backpropagation (traingdm), gradient descent with adaptive learning rate backpropagation (traingda), Bayesian regularization backpropagation (trainbr), and conjugate gradient (traincgf); different transfer functions, such as tan-sigmoid (tansig), log-sigmoid (logsig), and pure linear (purelin); one to three hidden layers; and 5 to 30 neurons per hidden layer.

The SaDE algorithm was applied using MATLAB software developed by MathWorks (Natick, Massachusetts, U.S.A.) to select the optimum combination of the ANN design parameters. Based on the optimization process, the combination of parameters summarized in Table 2 was found to optimize the ANN performance for E_{static} prediction. As listed in Table 2, trainbr is the training function that optimizes the E_{static} predictability of the ANN model. trainbr is a network training function that updates the weight and bias values of the ANN model based on Levenberg–Marquardt optimization; it minimizes a combination of squared errors and weights and then determines the combination that produces a network that generalizes well, a process called Bayesian regularization [33]. logsig is the optimum transfer function. The use of a single hidden (training) layer with 20 neurons also optimized the predictability of the ANN model for E_{static}.

Figure 2 shows the structure of the suggested ANN model for E_{static} prediction.
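A network of this shape (three inputs, one hidden layer of 20 logsig neurons, one output) can be written out as an explicit formula, which is the idea behind extracting a correlation from the trained weights and biases. The sketch below assumes a linear output layer and already-normalised inputs, neither of which is stated in the text above, and uses random placeholder weights rather than the trained values.

```python
import numpy as np


def logsig(x):
    """Log-sigmoid transfer function, 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def mlp_predict(x, w1, b1, w2, b2):
    """Forward pass: x (n, 3), w1 (20, 3), b1 (20,), w2 (20,), b2 scalar."""
    hidden = logsig(x @ w1.T + b1)   # hidden-layer activations, shape (n, 20)
    return hidden @ w2 + b2          # assumed linear output layer, shape (n,)


# Placeholder weights and inputs (random, NOT the trained model)
rng = np.random.default_rng(0)
w1 = rng.normal(size=(20, 3))
b1 = rng.normal(size=20)
w2 = rng.normal(size=20)
b2 = 0.1
x = rng.uniform(-1.0, 1.0, size=(5, 3))   # five normalised [rho_b, dT_C, dT_S] rows
y = mlp_predict(x, w1, b1, w2, b2)
print(y)
```

Written per sample, the same computation is b2 plus the sum over the 20 neurons of w2_{i} / (1 + exp(−(w1_{i,1} ρ_{b} + w1_{i,2} ΔT_{C} + w1_{i,3} ΔT_{S} + b1_{i}))), which is the closed form a weight-based correlation would take under these assumptions.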

#### 3.3. Evaluation and Validation of the Developed ANN-Based Empirical Correlation

The remaining 31% of the Well-A dataset, comprising 183 data points, is used to evaluate the developed ANN-based empirical correlation. Model validation is an important step that is preferably performed on unseen data. Data from the three other wells (Well-B: 38 data points, Well-C: six data points, and Well-D: 11 data points) are used for model validation. All testing and validation data are within the range of the training data used to develop the model, to ensure high accuracy in predicting E_{static}. The ability of the developed ANN-based empirical correlation to evaluate E_{static} for the validation data collected from Well-B will be compared with four of the available correlations, namely those of Eissa and Kazi [20], Canady [21], Najibi et al. [22], and Fei et al. [23], presented earlier as Equations (2)–(5).

#### 3.4. Evaluation Criterion

The predictive power of the developed ANN-based empirical correlation in estimating E_{static} will be evaluated based on the AAPE, R^{2}, R, and visualization check.
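The three numerical criteria can be sketched as below. The definitions are the commonly used ones and are assumptions here, since the text does not spell them out: AAPE as the mean absolute relative error in percent, R as the Pearson coefficient, and R^{2} as one minus the ratio of residual to total sum of squares (some works instead report the square of R). The sample values are made up.

```python
import numpy as np


def aape(actual, predicted):
    """Average absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(100.0 * np.mean(np.abs((actual - predicted) / actual)))


def correlation_coefficient(actual, predicted):
    """Pearson correlation coefficient R between actual and predicted values."""
    return float(np.corrcoef(actual, predicted)[0, 1])


def coefficient_of_determination(actual, predicted):
    """R^2 defined as 1 - SS_res / SS_tot (an assumed definition)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Made-up E_static values in GPa, for illustration only
actual = np.array([10.0, 20.0, 40.0, 80.0])
predicted = np.array([11.0, 18.0, 42.0, 76.0])
print(aape(actual, predicted),
      correlation_coefficient(actual, predicted),
      coefficient_of_determination(actual, predicted))
```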