Productivity Prediction of Fractured Horizontal Well in Shale Gas Reservoirs with Machine Learning Algorithms

Abstract: Predicting shale gas production under different geological and fracturing conditions in fractured shale gas reservoirs is the foundation for optimizing fracturing parameters, which is crucial for exploiting shale gas effectively. We present a multi-layer perceptron (MLP) network and a long short-term memory (LSTM) network to predict shale gas production, both of which can quickly and accurately forecast gas production. The prediction performances of the networks are comprehensively evaluated and compared. The results show that the MLP network can predict shale gas production from geological and fracturing reservoir parameters. The average relative error of the MLP neural network is 2.85%, and the maximum relative error is 12.9%, which meets the needs of engineering shale gas productivity prediction. The LSTM network can predict shale gas production from historical production under the constraints of geological and fracturing reservoir parameters. The average relative error of the LSTM neural network is 0.68%, and the maximum relative error is 3.08%, so it can reliably predict shale gas production. There is a slight deviation between the predicted results of the MLP model and the true values in the first 10 days. This is because the daily production decreases rapidly during the early production stage, and the production data change greatly. The largest relative errors of LSTM in this work on the 10th, 100th, and 1000th day are 0.95%, 0.73%, and 1.85%, respectively, which are far lower than the relative errors of the MLP predictions. The research results can provide a fast and effective means for shale gas productivity prediction.


Introduction
Shale gas is an unconventional and promising alternative energy resource [1][2][3]. Hydraulic fracturing is a main technology of shale gas development, which determines shale gas production [4,5]. Predicting shale gas production under different geological and fracturing reservoir conditions in the fractured shale gas reservoirs is the foundation of optimizing the fracturing parameters, which is crucial to effectively exploit shale gas.
Because of the many complicated and interdependent factors involved, such as geological and fracturing reservoir parameters, predicting gas production in shale gas reservoirs is a long-standing challenge [6,7]. Numerical simulation is the conventional method for predicting shale gas production [8]. However, a numerical model needs detailed information about the specific reservoir, which relies on a large amount of geological data [9]. The geological structure is complex, production involves large nonlinear dynamic problems, and the computational cost is high, so numerical simulation of shale gas production is time-consuming. Recently, machine learning (ML), especially deep learning, has developed rapidly, providing an effective means of forecasting shale gas production [10,11]. ML handles nonlinear problems well, and it is usually faster than numerical analysis. Taking advantage of this requires a deeper understanding of productivity prediction in shale gas reservoirs based on machine learning algorithms.
An artificial neural network can be used to predict shale gas production because it can learn the relationship between geological and fracturing reservoir parameters and gas production. Moreover, several specialized artificial neural networks have been trained to describe the relationships within production data. Chakra et al. [12] predicted oil production with a higher-order neural network in a reservoir in India. The network forecasts well even when insufficient field data are available. Sheremetov et al. [13] used a nonlinear autoregressive neural network to predict oil production, which showed competitive accuracy in a naturally fractured oilfield. They also found that preliminary clustering is an effective means of improving the prediction accuracy. Aizenberg et al. [14] predicted oil production with a multilayer neural network. They evaluated the algorithm using a real field dataset and found that the model can predict univariate and multivariate reservoir dynamics. Cao et al. [15] applied an ANN to predict production using geological maps, production data, and pressure and operational constraints. Sun et al. [16] compared an RNN production prediction model with decline curve analysis (DCA) for single and multiple wells. They found that LSTM models can describe the overall trend, whereas DCA can only generate smooth curves. Previous research has shown that machine learning predicts well production efficiently. However, the constraints of geological and fracturing parameters have been ignored. Thus, a shale gas productivity forecasting model that combines historical production data with the constraints of geological and fracturing reservoir parameters urgently needs to be established.
In this study, we investigate the feasibility of production forecasting based on machine learning algorithms, and use MLP and LSTM neural networks to predict shale gas production. We first calculate a large amount of production data with different geological and fracturing parameters, based on our previous numerical production model, as the machine learning dataset. It is necessary to preprocess the collected raw field data before using it in the machine learning process. To reduce manual processing by the user and to fill in empty values, an automated preprocessing method must be adopted. Then, we train the MLP and LSTM neural networks on the dataset. The MLP neural network can forecast shale gas production from geological and fracturing reservoir parameters, and the LSTM neural network can predict shale gas production from historical production under the constraints of geological and fracturing reservoir parameters. Lastly, we analyze the prediction results and compare the prediction performances of the MLP and LSTM neural networks. This study provides an accurate and efficient method for predicting shale gas production.

Neural Network Description
The MLP network, which is composed of several neurons, is a basic artificial neural network [17]. MLP is a fully connected neural network with three or more layers (an input layer, an output layer, and one or more hidden layers) of nonlinearly activating nodes. The nonlinear relationship between the input and output can be obtained by using a nonlinear activation function. In this paper, the nonlinear relationship between shale gas production and reservoir properties and fracturing parameters is obtained using MLP. However, the MLP cannot deal with sequential data because it lacks memory functions.
A recurrent neural network (RNN) allows information to be transferred from one step to the next [18]. An RNN can be regarded as multiple copies of the same neural network unit, each of which passes a message to the next. This chain-like structure enables the RNN to connect previously stored information with the current information. Thus, an RNN can infer subsequent events from previous events, which makes it suitable for time series problems. However, because of vanishing and exploding gradients, a plain RNN has difficulty learning long-term dependencies in time series forecasting [19].
The LSTM network is a special kind of RNN, which was proposed by Hochreiter and Schmidhuber [20]. LSTM's design was inspired by the logic gates of a computer. LSTM introduces a memory cell (or cell for short) that has the same shape as the hidden state (some papers consider the memory cell as a special type of the hidden state), engineered to record additional information. To control the memory cell, we need a number of gates. One gate is needed to read out the entries from the cell. We will refer to this as the output gate. A second gate is needed to decide when to read data into the cell. We refer to this as the input gate. Lastly, we need a mechanism to reset the content of the cell, governed by a forget gate. This is more practical than the ordinary recurrent neural network because of its ability to process the sequential data [10].
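The forget gate is expressed as:
$$f_t = \sigma\left(w_f \cdot [h_{t-1}, x_t] + b_f\right)$$
where $f_t$ is the sigmoid gate, $\sigma$ is the sigmoid activation function, $h_{t-1}$ is the hidden state from the previous time step $t-1$, $x_t$ is the input at time step $t$, $w$ represents the weight, and $b$ is the bias.
The input gate is calculated by:
$$i_t = \sigma\left(w_i \cdot [h_{t-1}, x_t] + b_i\right)$$
$$C_t^{*} = \tanh\left(w_C \cdot [h_{t-1}, x_t] + b_C\right)$$
where $i_t$ is the sigmoid layer, $C_t^{*}$ is the tanh layer, and $\tanh$ is the tanh activation function. The memory cell is then updated by combining the retained previous content with the new candidate content:
$$C_t = f_t \odot C_{t-1} + i_t \odot C_t^{*}$$
The output gate is written as:
$$o_t = \sigma\left(w_o \cdot [h_{t-1}, x_t] + b_o\right)$$
$$h_t = o_t \odot \tanh\left(C_t\right)$$
where $o_t$ is the sigmoid layer and $h_t$ is the tanh layer.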
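The gate mechanism described above can be illustrated with a minimal sketch of one LSTM time step. It assumes the standard LSTM formulation with a scalar input and a scalar hidden state; the function name `lstm_step` and the dictionary layout of the weights are hypothetical choices made for readability, not the implementation used in this work.

```python
import math


def sigmoid(z):
    """Logistic function used by the forget, input, and output gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, w, b):
    """Advance a scalar LSTM cell by one time step (illustrative only).

    w[name] is a pair (weight on h_prev, weight on x_t) and b[name] is a bias,
    for name in {"f", "i", "c", "o"}. This layout is an assumption.
    """
    def pre(name):
        w_h, w_x = w[name]
        return w_h * h_prev + w_x * x_t + b[name]

    f_t = sigmoid(pre("f"))        # forget gate: how much old cell content to keep
    i_t = sigmoid(pre("i"))        # input gate: how much new content to write
    c_cand = math.tanh(pre("c"))   # candidate cell content (the tanh layer)
    o_t = sigmoid(pre("o"))        # output gate: how much of the cell to expose
    c_t = f_t * c_prev + i_t * c_cand
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t


# With all weights and biases set to zero, every gate equals 0.5 and the
# candidate content is 0, so the cell keeps half of its previous content.
zero_w = {name: (0.0, 0.0) for name in "fico"}
zero_b = {name: 0.0 for name in "fico"}
h1, c1 = lstm_step(x_t=1.0, h_prev=0.0, c_prev=2.0, w=zero_w, b=zero_b)
```

In a real network the scalars become vectors and the weights become matrices, but the order of operations is the same.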

Data Preparation and Preprocessing
Establishing the networks requires determining the weights and thresholds from the dataset [21]. We calculate a large amount of production data with different geological and fracturing parameters, based on our previous numerical production model [22,23], as the dataset. Detailed information on the numerical model can be found in our previous study [8]. Shale gas production is affected by many geological and fracturing parameters. To simplify the preparation of the learning dataset, four parameters that have major effects on shale gas production are chosen: number of fracturing clusters, half-length of fractures, fracture conductivity, and reservoir permeability, as shown in Table 1. Each parameter has four values, so we have 4^4 = 256 numerical simulation cases in total. Every case includes 1000 days of production data. Thus, we have 256,000 data points. The first and last 5 lines are listed in Table 2. The shale gas productions with different geological and fracturing parameters form the machine learning dataset. Because the numerical values of the geological and fracturing reservoir parameters differ significantly in magnitude, we normalize the input variables so that the contribution of features with smaller values is not drowned out. A total of 80% of the data is chosen to train the model and 20% of the data is used to test the model.
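The construction of the parameter grid, the normalization, and the 80/20 split can be sketched as follows. This is a sketch under stated assumptions: the candidate values marked "assumed" are placeholders for entries of Table 1 that do not appear in the text, min-max scaling stands in for the unspecified normalization, and the split is done per simulation case for brevity rather than per data point.

```python
import itertools
import random

# Four values per parameter. Entries marked "assumed" are placeholders.
PARAMETER_GRID = {
    "clusters": [3, 5, 7, 9],
    "half_length": [60, 80, 100, 120],      # 80 is assumed
    "conductivity": [100, 200, 300, 400],   # 200 is assumed
    "permeability": [100, 200, 300, 400],   # 200 is assumed
}


def build_cases(grid):
    """Enumerate every combination of parameter values (full factorial design)."""
    names = list(grid)
    return [dict(zip(names, combo)) for combo in itertools.product(*grid.values())]


def min_max_normalize(cases, grid):
    """Scale each parameter to [0, 1] using the grid's own minimum and maximum."""
    scaled = []
    for case in cases:
        row = []
        for name, values in grid.items():
            lo, hi = min(values), max(values)
            row.append((case[name] - lo) / (hi - lo))
        scaled.append(row)
    return scaled


def train_test_split(rows, train_fraction=0.8, seed=0):
    """Shuffle reproducibly, then cut the rows into training and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = int(train_fraction * len(rows))
    return [rows[i] for i in indices[:cut]], [rows[i] for i in indices[cut:]]


cases = build_cases(PARAMETER_GRID)
features = min_max_normalize(cases, PARAMETER_GRID)
train_rows, test_rows = train_test_split(features)
print(len(cases), len(train_rows), len(test_rows))  # 256 204 52
```

After scaling, a cluster count of 9 and a permeability of 400 both map to 1.0, so neither parameter dominates the other purely because of its units.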

Prediction Accuracy Evaluation
In the process of training, we used the mean absolute error (MAE), which is a commonly used index for evaluating the accuracy of regression prediction models [24], to evaluate the shale gas prediction model:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i^{\mathrm{pred}} - y_i^{\mathrm{act}}\right|$$
where $n$ is the sample quantity, $y_i^{\mathrm{pred}}$ is the predictive value, and $y_i^{\mathrm{act}}$ is the true value.
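As a small worked example, the MAE can be computed as below. The production values are made up purely for illustration and are not taken from the dataset.

```python
def mean_absolute_error(y_pred, y_act):
    """Average absolute difference between predicted and true values."""
    if len(y_pred) != len(y_act) or not y_pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(y_pred, y_act)) / len(y_pred)


# Illustrative daily productions (made-up numbers).
predicted = [980.0, 510.0, 255.0]
actual = [1000.0, 500.0, 250.0]
mae = mean_absolute_error(predicted, actual)  # (20 + 10 + 5) / 3
```

Because the MAE is expressed in the same unit as the production rate, it weights errors on high-rate days more heavily than a relative error would.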

MLP Network
The MLP network is trained on the training dataset to determine the structure parameters. The optimal number of neural network layers is 3 and the number of neurons in each layer is 128. The batch size and number of epochs are 5000 and 100, respectively. The ReLU activation function and Adam optimizer are used in the training process. The network is implemented based on the TensorFlow framework [25]. The MAE between prediction values and true values is 50.12 m²/d.
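To make the architecture concrete, the following sketch builds a forward pass with three hidden layers of 128 ReLU neurons in plain Python. It is not the TensorFlow implementation used in this work: the weights are random (training with Adam is not shown), and the input size of five (four normalized parameters plus a normalized production day) is an assumption, as is reading "3 layers" as three hidden layers.

```python
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights; no training here."""
    scale = (2.0 / n_in) ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def relu(v):
    return v if v > 0.0 else 0.0


def dense(x, layer, activation=None):
    """Fully connected layer: weighted sum plus bias, then optional activation."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def build_mlp(n_inputs=5, hidden=(128, 128, 128), seed=0):
    rng = random.Random(seed)
    sizes = [n_inputs, *hidden, 1]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def predict(mlp, x):
    """Hidden layers use ReLU; the single output neuron is linear."""
    for layer in mlp[:-1]:
        x = dense(x, layer, relu)
    return dense(x, mlp[-1])[0]


def count_parameters(mlp):
    return sum(len(w) * len(w[0]) + len(b) for w, b in mlp)


mlp = build_mlp()
sample = [1.0, 1.0, 1.0, 1.0, 0.01]   # normalized parameters + day (made-up)
daily_production = predict(mlp, sample)
n_params = count_parameters(mlp)       # 33921 under the assumptions above
```

Under these assumptions the network has 33,921 trainable parameters, which is small compared with the 256,000 data points available for training.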
To validate the shale gas productivity prediction model based on MLP, we select 100 groups of different geological and fracturing reservoir parameters and compare the predictions of the model with the true values. Figure 1 compares the true daily shale gas production with the values predicted by the MLP network and shows the relative error between them. The black broken line is the true daily production, and the red circles are the model predictions. The red points all lie on the black broken line, which shows that the model can accurately and stably forecast gas production. The average relative error is 2.85% and the maximum relative error is 12.9%. Only three data points have relative errors greater than 8%, and the corresponding daily productions are small, so the absolute errors of these predictions are also small. Furthermore, we analyze the prediction performance of the MLP network for shale gas production. Figure 2 depicts the prediction performance for the daily shale gas production under three groups of geological and fracturing reservoir parameters.
The geological and fracturing reservoir parameters of the three cases are as follows: in case 1, the number of perforation clusters is 9, the half-length of the hydraulic fracture is 120 m, the hydraulic fracture conductivity is 400 mD·m, and the matrix permeability is 400 nD; in case 2, the number of perforation clusters is 5, the half-length of the hydraulic fracture is 100 m, the hydraulic fracture conductivity is 300 mD·m, and the matrix permeability is 300 nD; in case 3, the number of perforation clusters is 3, the half-length of the hydraulic fracture is 60 m, the hydraulic fracture conductivity is 100 mD·m, and the matrix permeability is 100 nD. The parameters in case 1 are the maximum values in the dataset, and the parameters in case 3 are the minimum values in the dataset. The predictions for the three cases are in good agreement with the actual daily production, which further validates the prediction model. There is a slight deviation between the predicted results of the model and the true values in the first 10 days. This is because the daily production decreases rapidly during the early production stage, and the production data change greatly, whereas the neural network model focuses on the overall change of production with time. Therefore, in the early stage, the predicted results of the MLP network are slightly different from the true values.
The relative errors between the predictions and the true values of shale gas production on the 10th, 100th, and 1000th day are shown in Figure 3. On the 100th day, the relative errors of the three cases are the smallest: 1.49%, 1.55%, and 1.99%, respectively. On the 10th day, the relative errors of the three cases are 5.61%, 6.55%, and 7.81%, respectively. On the 1000th day, the relative errors of case 1 and case 3 are significantly larger than that of case 2. In conclusion, the shale gas productivity prediction model based on MLP can predict shale gas production from geological and fracturing reservoir parameters.
Appl. Sci. 2021, 11, x FOR PEER REVIEW

LSTM Network
The LSTM network is trained on the training dataset to determine the structure parameters. The optimal number of neural network layers is 2 and the number of neurons in each layer is 32. The batch size and number of epochs are 256 and 150, respectively. The tanh activation function and Adam optimizer are used in the training process. To validate the shale gas productivity prediction model based on LSTM, we again select 100 groups of different geological and fracturing reservoir parameters and compare the predictions of the model with the true values. Figure 4 compares the true daily shale gas production with the values predicted by the LSTM network and shows the relative error between them. The predictions agree with the true values, which indicates that the model predicts well. Figure 4 also shows the relative error between the predicted values and the true values of the neural network under the constraint of geological and fracturing reservoir parameters. The average relative error is 0.68% and the maximum relative error is 3.08%, both far lower than the corresponding errors of the MLP network (2.85% and 12.9%).
In this section, we analyze the LSTM prediction results for shale gas production. The prediction performance for the daily shale gas production under three groups of geological and fracturing reservoir parameters is shown in Figure 5. The geological and fracturing reservoir parameters in case 1 are the maximum values in the dataset, and the parameters in case 3 are the minimum values in the dataset. The comparison between the LSTM predictions and the true values under different constraint parameters is shown in Figure 6. The data of the first 6 days are input data, and the prediction results start from the 7th day. It can be seen that the LSTM predictions agree satisfactorily with the true values. The relative errors between the LSTM predictions and the true values of shale gas production on the 10th, 100th, and 1000th day are shown in Figure 6. The relative errors of case 2 are the smallest: 0.04%, 0.13%, and 1.18% on the 10th, 100th, and 1000th day, respectively. The relative errors of case 3 are the largest: 0.95%, 0.73%, and 1.85% on the 10th, 100th, and 1000th day, respectively. However, they are far lower than the relative errors between the MLP predictions and the true values.
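The statement that the first 6 days serve as input and predictions start from the 7th day suggests a sliding-window formulation, which can be sketched as follows. How the geological and fracturing constraints are fed to the network is not specified in the text; appending the (constant) normalized parameters to every time step is one common choice and is an assumption here, as are the function name `make_windows` and the sample values.

```python
def make_windows(production, static_params, window=6):
    """Build supervised samples from one well's daily production series.

    Each input holds `window` consecutive time steps; every step carries the
    production value followed by the constant static parameters. The target
    is the production on the day immediately after the window.
    """
    inputs, targets = [], []
    for start in range(len(production) - window):
        steps = [[production[start + k], *static_params] for k in range(window)]
        inputs.append(steps)
        targets.append(production[start + window])
    return inputs, targets


# Made-up series of 8 days and made-up normalized parameters.
series = [100.0, 90.0, 82.0, 76.0, 71.0, 67.0, 64.0, 61.5]
params = [0.0, 0.5, 1.0, 1.0]
X, y = make_windows(series, params)
# X has 2 samples of shape (6 steps, 5 features); y == [64.0, 61.5]
```

With 1000 days per case, this windowing would yield 994 samples per case, the first of which predicts day 7 from days 1 to 6.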

Comparisons of the Different Networks
We then compared the predictions of the MLP and LSTM networks under two new constraint conditions, as shown in Figure 7. The geological and fracturing reservoir parameters of the two cases are as follows: in case 1, the number of perforation clusters is 7, the half-length of the hydraulic fracture is 120 m, the hydraulic fracture conductivity is 400 mD·m, and the matrix permeability is 400 nD; in case 2, the number of perforation clusters is 5, the half-length of the hydraulic fracture is 60 m, the hydraulic fracture conductivity is 100 mD·m, and the matrix permeability is 100 nD. Figure 8 shows the relative errors between the predicted and true values for the MLP and LSTM networks. The relative errors of the LSTM predictions in this work are far lower than those of the MLP predictions.


Relative Errors at an Early Production Time
The relative errors between the predictions and the true values of shale gas production during the first 100 days are shown in Figure 9. For the MLP network, the relative errors of the three cases are 3.98%, 6.18%, and 6.46%, respectively, with an average of 5.54%. The maximum relative errors of the three cases occur on the 1st day, 3rd day, and 3rd day, respectively. After that, the relative errors decrease with production time. The relative errors of case 1 and case 3 increase slightly between day 90 and day 100. For the LSTM network, the relative errors of the three cases are 0.16%, 0.08%, and 0.55%, respectively, with an average of 0.26%. The maximum relative errors of the three cases occur on the 100th day, 8th day, and 9th day, respectively.
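The quantities reported above (relative error per day, its average over a period, and the day on which the maximum occurs) can be computed as in the sketch below. The relative error is assumed to be |predicted − true| / true expressed in percent, since the exact definition is not given in the text, and the numbers are made up for illustration.

```python
def relative_errors(y_pred, y_act):
    """Relative error in percent for each day (assumed definition)."""
    return [abs(p - a) / abs(a) * 100.0 for p, a in zip(y_pred, y_act)]


def summarize(y_pred, y_act):
    """Mean error, maximum error, and the 1-based day of the maximum."""
    errors = relative_errors(y_pred, y_act)
    worst_index = max(range(len(errors)), key=errors.__getitem__)
    return {
        "mean_percent": sum(errors) / len(errors),
        "max_percent": errors[worst_index],
        "max_day": worst_index + 1,  # days are counted from 1
    }


# Made-up early-time production: steep decline, largest miss on day 1.
actual = [1000.0, 800.0, 500.0, 400.0]
predicted = [900.0, 780.0, 495.0, 398.0]
summary = summarize(predicted, actual)
```

In this toy series the errors are 10%, 2.5%, 1%, and 0.5%, so the maximum falls on the first day, mirroring the early-time behavior described for the MLP network.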


Conclusions
We present two machine learning algorithms for predicting shale gas production, both of which can reliably forecast shale gas production. The MLP neural network can forecast shale gas production from geological and fracturing reservoir parameters, and the LSTM neural network can predict shale gas production from historical production under the constraints of geological and fracturing reservoir parameters. The average relative error of the MLP neural network is 2.85%, and the maximum relative error is 12.9%, which meets the needs of engineering shale gas productivity prediction. The average relative error of the LSTM neural network is 0.68%, and the maximum relative error is 3.08%, so it can reliably predict shale gas production. There is a slight deviation between the predicted results of the MLP model and the true values in the first 10 days. This is because the daily production decreases rapidly during the early production stage, and the production data change greatly. The largest relative errors of LSTM in this work on the 10th, 100th, and 1000th day are 0.95%, 0.73%, and 1.85%, respectively, which are far lower than the relative errors of the MLP predictions. Although the neural network models in this study are trained on simulation data, the research results can provide a theoretical basis for shale gas productivity prediction and for the fine design of fracturing and completion parameters. In future work, we will train a new model based on field data.