Article

Optimization of Oil Well Production Prediction Model Based on Inter-Attention and BiLSTM

1 School of Electrical Information, Southwest Petroleum University, Chengdu 610500, China
2 School of Computer, University of Electronic Science and Technology of China, Chengdu 611731, China
* Authors to whom correspondence should be addressed.
Electronics 2025, 14(5), 1004; https://doi.org/10.3390/electronics14051004
Submission received: 22 January 2025 / Revised: 21 February 2025 / Accepted: 26 February 2025 / Published: 2 March 2025

Abstract

Accurate prediction of future oil production is critical for decision-making in oil well operations. However, existing prediction models often lack precision due to the vast and complex nature of oil well data. This study proposes an oil well production prediction model based on the Inter-Attention Mechanism (IAM) and Bidirectional Long Short-Term Memory Network (BiLSTM), optimized using a Comprehensive Search Algorithm (CSA). By incorporating the Inter-Attention Mechanism, the model enhances its capacity to model complex time-series data. The CSA, combined with Sequential Quadratic Programming (SQP) and Monotone Basin Hopping (MBH) algorithms, ensures both global and local parameter optimization. Using historical data from an oil well in Sichuan, the feasibility of the proposed model was validated, demonstrating superior accuracy and robustness compared to other prediction models and optimization algorithms.

1. Introduction

In oil production operations, well production forecasting is typically used to save the time and resources required to measure output directly, and to support comprehensive evaluation of and preparation for well development [1]. However, current oil well production predictions face challenges such as suboptimal performance and insufficient model generalization [2], making efficient and accurate prediction difficult and undermining the soundness of production decisions and optimal resource allocation.
With the development of oilfield technology and artificial intelligence, scholars worldwide have conducted extensive research on oil well production prediction [3]. Traditional prediction methods, such as empirical formula methods [4], curve fitting methods [5], and numerical simulation methods [6], suffer from excessive reliance on experience, complex computations, and poor adaptability. In the field of oil well production prediction, traditional methods rely primarily on physics-based models and empirical formulas [7]. These methods typically analyze key parameters such as reservoir fluid flow, reservoir pressure, and geological characteristics, combined with historical production data, to establish mathematical models for prediction [8]. Common models include capacity equations, pressure recovery analysis, and material balance methods. For example, Van Everdingen explored the use of the Material Balance Equation (MBE) as a predictive tool for estimating future production [9], while Xu reviewed the development of reservoir simulation methods [10], highlighting their importance in predicting oil well performance. The advantage of these methods lies in their solid physical foundation, which makes them particularly suitable for cases with limited early data.
In the field of reservoir engineering, artificial-intelligence-based methods for oil well production prediction, mainly machine learning and deep learning techniques, have been widely applied in recent years. These methods train on large amounts of historical production data, automatically mining latent nonlinear relationships, and can effectively capture dynamic changes under complex working conditions and multi-factor coupling. Common algorithms include Convolutional Neural Networks (CNN), Random Forest (RF) [11], Long Short-Term Memory networks (LSTM) [12], and ensemble models [13]. These methods do not rely on detailed physical models and, when sufficient data are available, can significantly improve prediction accuracy, although they place higher demands on data quality and quantity.
Deep learning methods have demonstrated especially strong performance in handling large-scale data and intricate nonlinear relationships. For example, research based on the GRU-KAN model improved prediction accuracy and efficiency by combining Gated Recurrent Units (GRU) and Kernel Attention Networks (KAN) [14]; probabilistic machine learning methods have been used to quantify uncertainty in the prediction process, thereby improving the profitability of oil and gas wells [15]; prediction models combining Principal Component Analysis (PCA) and GRU have improved the accuracy of oil well production predictions [16]; and data-driven regression models built from sequential convolution and Long Short-Term Memory (LSTM) units have been applied to oil production forecasting with high prediction accuracy [17]. However, most of these studies do not optimize the prediction models themselves, leading to deficiencies in accuracy, efficiency, and adaptability; in particular, training tends to fall into local optima, which degrades predictive performance.
Traditional methods (such as empirical formula and numerical simulation methods) struggle to adapt to dynamic changes in the reservoir environment because of their complex calculations and heavy data dependence. Artificial intelligence methods, especially deep learning, have gradually become an important means of oil well production prediction owing to their powerful data processing capabilities. However, current deep learning models still fall short in prediction accuracy, generalization ability, and avoidance of local optima. Therefore, this paper proposes an oil well production prediction model based on IAM-BiLSTM and optimizes it with a Comprehensive Search Algorithm (CSA) to improve prediction accuracy and stability. First, by combining the Inter-Attention Mechanism with the BiLSTM structure, the IAM-BiLSTM model is designed to filter and retain key features more effectively, thereby improving prediction accuracy. Second, the paper introduces the CSA, which combines the Monotone Basin Hopping (MBH) and Sequential Quadratic Programming (SQP) algorithms, optimizing the model parameters through the global search ability of MBH and the local optimization ability of SQP. Finally, experimental results demonstrate that the CSA-optimized IAM-BiLSTM model significantly outperforms traditional methods in prediction performance.

2. Methodology

This section introduces the proposed IAM-BiLSTM model and the CSA optimization algorithm. First, it presents a BiLSTM-based prediction model incorporating an Inter-Attention Mechanism and explains its working principles. Then, it describes the CSA, which combines MBH and SQP, along with its improvement strategies.

2.1. IAM-BiLSTM Model

2.1.1. Inter-Attention Mechanism

The Inter-Attention Mechanism is a data processing method in machine learning [18], widely applied in tasks such as natural language processing [19], image recognition, and speech recognition [20]. In an attention mechanism, the degree of focus on different pieces of information is reflected through weights [21]. Attention can be described in terms of queries (Q), keys (K), and values (V): a scoring function, which may be implemented as a multilayer perceptron (MLP), compares queries with keys, and the resulting weights produce a weighted average of the values.
The Inter-Attention Mechanism is a type of attention mechanism in deep learning used to capture the correlations and interactions between two different sequences or datasets [22]. Its working principle is similar to that of the standard attention mechanism, with the difference being that one of the sequences is used as the Q, while the other sequence serves as the K and V. The corresponding attention weights and attention values are then calculated. The Inter-Attention Mechanism can effectively model the dependencies between two distinct sequences, thereby enhancing the model’s understanding of the input data.
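As a concrete illustration, the minimal PyTorch sketch below uses one sequence for Q and the counterpart sequence for K and V. Scaled dot-product scoring is assumed here for brevity, whereas the model in Section 2.1.3 uses a trainable feedforward score function.

```python
import torch
import torch.nn.functional as F

def inter_attention(query_seq, context_seq):
    """Inter-attention between two sequences: one supplies the queries (Q),
    the counterpart supplies the keys (K) and values (V)."""
    d = query_seq.size(-1)
    # Similarity score between every query step and every context step
    scores = torch.matmul(query_seq, context_seq.transpose(1, 2)) / d ** 0.5
    weights = F.softmax(scores, dim=-1)         # attention weights
    return torch.matmul(weights, context_seq)   # weighted average of the values

# Example: two 30-step sequences of 64-dimensional hidden states
a = torch.randn(8, 30, 64)
b = torch.randn(8, 30, 64)
print(inter_attention(a, b).shape)              # torch.Size([8, 30, 64])
```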

2.1.2. BiLSTM Model

Standard Recurrent Neural Networks (RNNs) capture only limited contextual information in sequential tasks [23]. To address this, Long Short-Term Memory (LSTM) introduces cell states and gating mechanisms—forget gate, input gate, and output gate—each controlled by the sigmoid function [24]. However, conventional LSTMs may still ignore future context. Bidirectional LSTMs (BiLSTMs) [25] solve this issue by running two LSTMs in opposite directions and combining their outputs, thus incorporating both past and future information.
Concretely, for a BiLSTM network, the output at time $t$, $y_t$, is determined by the forward state $\overrightarrow{h}_t$, the backward state $\overleftarrow{h}_t$, and the input $x_t$. The update formulas are:

$$\overrightarrow{h}_t = z_1\left(\omega_1 x_t + \omega_5 \overrightarrow{h}_{t-1}\right), \quad \overleftarrow{h}_t = z_2\left(\omega_2 x_t + \omega_6 \overleftarrow{h}_{t+1}\right), \quad y_t = z_3\left(\omega_3 \overrightarrow{h}_t + \omega_4 \overleftarrow{h}_t\right) \tag{1}$$

where $\omega_i$ ($i = 1, 2, \ldots, 6$) is the weight of the corresponding layer and $z_i$ ($i = 1, 2, 3$) is the activation function of each layer.
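In practice, these recurrences are provided by off-the-shelf layers. A minimal PyTorch sketch of a bidirectional LSTM regressor follows; the layer sizes are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class SimpleBiLSTM(nn.Module):
    """Bidirectional LSTM regressor: the forward and backward hidden states
    are concatenated at every step and mapped to the output y_t."""
    def __init__(self, n_features, hidden_size=64):
        super().__init__()
        self.bilstm = nn.LSTM(n_features, hidden_size,
                              batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden_size, 1)  # combines both directions

    def forward(self, x):          # x: (batch, T, n_features)
        h, _ = self.bilstm(x)      # h: (batch, T, 2 * hidden_size)
        return self.head(h)        # y: (batch, T, 1)

model = SimpleBiLSTM(n_features=5)
print(model(torch.randn(8, 30, 5)).shape)  # torch.Size([8, 30, 1])
```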

2.1.3. IAM-BiLSTM Prediction Model

By introducing an Inter-Attention Mechanism between the forward and backward LSTMs, the two directions can attend to each other's hidden states at each time step, better capturing the bidirectional dependencies within the sequence. A trainable feedforward network computes the similarity scores, enhancing the model's representational capacity, and the context vector $c_t$, which summarizes the global information from the counterpart LSTM, supports more accurate predictions at the current time step. Table 1 outlines the implementation steps of the IAM-BiLSTM prediction model, and Figure 1 illustrates its structure.
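A minimal PyTorch sketch of this idea follows. The hidden sizes and the final-step readout are illustrative assumptions; the paper's procedure is given in Table 1 and its parameter settings in Table 8.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IAMBiLSTM(nn.Module):
    """Sketch of IAM-BiLSTM: forward and backward LSTMs run separately, a
    trainable feedforward network scores every pair of forward/backward
    states (Table 1, Step 2), and each forward state is concatenated with
    its context vector (Steps 3-4)."""
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.fwd = nn.LSTM(n_features, hidden, batch_first=True)
        self.bwd = nn.LSTM(n_features, hidden, batch_first=True)
        # additive (feedforward) compatibility function score(h_fwd, h_bwd)
        self.W = nn.Linear(2 * hidden, hidden)
        self.v = nn.Linear(hidden, 1, bias=False)
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, x):                                  # x: (B, T, F)
        h_f, _ = self.fwd(x)                               # forward states
        h_b_rev, _ = self.bwd(torch.flip(x, dims=[1]))
        h_b = torch.flip(h_b_rev, dims=[1])                # backward states, time-aligned
        B, T, H = h_f.shape
        # score every (t, k) pair of forward/backward states
        f = h_f.unsqueeze(2).expand(B, T, T, H)
        b = h_b.unsqueeze(1).expand(B, T, T, H)
        e = self.v(torch.tanh(self.W(torch.cat([f, b], dim=-1)))).squeeze(-1)
        alpha = F.softmax(e, dim=-1)                       # attention weights (B, T, T)
        c = torch.bmm(alpha, h_b)                          # context vectors (Step 3)
        enhanced = torch.cat([h_f, c], dim=-1)             # Concat(h_t, c_t) (Step 4)
        return self.head(enhanced[:, -1])                  # predict from the final step

model = IAMBiLSTM(n_features=5)
print(model(torch.randn(8, 30, 5)).shape)                  # torch.Size([8, 1])
```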

2.2. Comprehensive Search Algorithm

In oil well production prediction based on artificial intelligence methods, incorporating optimization algorithms can effectively enhance the accuracy of prediction models, and this interdisciplinary approach has improved prediction accuracy and efficiency across various fields. Azevedo et al. conducted a comprehensive review of hybrid methods combining optimization and machine learning techniques [26], including their applications in clustering and classification tasks. Krzywanski et al. explored the applications of artificial intelligence and other advanced computational methods in energy systems [27], including material production and optimization. These studies collectively demonstrate that optimization algorithms can significantly improve the performance of prediction models.

2.2.1. Sequential Quadratic Programming (SQP)

The Sequential Quadratic Programming (SQP) algorithm is a nonlinear optimization method that iteratively refines the solution by solving Quadratic Programming (QP) subproblems. In this study, SQP is integrated with the Monotone Basin Hopping (MBH) algorithm to enhance global optimization capability, addressing the limitations of conventional deep learning parameter tuning. SQP applies a quasi-Newton method to the Karush–Kuhn–Tucker conditions, where the Lagrangian function is given by Formula (2):
$$L(x, \lambda, \mu) = f(x) + \sum_{i=1}^{m} \lambda_i h_i(x) + \sum_{j=m+1}^{n} \mu_j g_j(x) \tag{2}$$
where $x$ denotes the decision variable vector, $f(x)$ is the objective function, $h_i(x)$ and $g_j(x)$ represent the equality and inequality constraints, $\lambda_i$ and $\mu_j$ are the corresponding Lagrange multipliers, $m$ is the number of equality constraints, and $n$ is the total number of constraints. In each iteration, the search direction $d$ is determined by solving the following QP subproblem:
$$\min_{d} \; \frac{1}{2} d^{T} H_k d + \nabla f(x_k)^{T} d \quad \text{s.t.} \quad \nabla h_i(x_k)^{T} d + h_i(x_k) = 0, \quad \nabla g_j(x_k)^{T} d + g_j(x_k) \le 0 \tag{3}$$
The new solution is then updated by $x_{k+1} = x_k + \alpha_k d_k$, where the step size $\alpha_k$ is obtained via a line search. The Hessian approximation $H_k$ is updated using the DFP formula:
$$H_{k+1} = H_k - \frac{H_k y_k y_k^{T} H_k}{y_k^{T} H_k y_k} + \frac{s_k s_k^{T}}{y_k^{T} s_k} \tag{4}$$
where $y_k$ is the difference between successive gradients of the Lagrangian and $s_k$ is the difference between successive solution estimates.
The steps of the SQP algorithm are shown in Table 2.
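SciPy's SLSQP routine is one readily available SQP implementation; a minimal usage example follows, with a toy constrained problem that is illustrative and not from the paper.

```python
import numpy as np
from scipy.optimize import minimize

# minimize f(x) = (x0 - 1)^2 + (x1 - 2.5)^2
# subject to  h(x) = x0 + x1 - 3 = 0  (equality)  and  x >= 0  (inequality)
f = lambda x: (x[0] - 1) ** 2 + (x[1] - 2.5) ** 2
cons = (
    {"type": "eq",   "fun": lambda x: x[0] + x[1] - 3.0},
    {"type": "ineq", "fun": lambda x: x},  # vector constraint: every x_i >= 0
)
res = minimize(f, x0=np.array([2.0, 0.0]), method="SLSQP", constraints=cons)
print(res.x, res.fun)  # converges to the constrained minimum [0.75, 2.25]
```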

2.2.2. Monotone Basin Hopping Algorithm (MBH)

The Monotone Basin Hopping Algorithm (MBH) is designed to find global optima in problems with many local minima [28]. It extends the Basin Hopping Algorithm (BH) by introducing a monotonicity constraint: if a new point has a lower objective value but violates the monotonicity condition (i.e., one of its variables is larger than that in the current best solution), it is rejected. MBH was originally developed for molecular conformation problems in computational chemistry [29], combining monotonic search and basin hopping to achieve global optimization.
MBH exhibits strong global search capabilities [30]. Unlike traditional local search algorithms that easily get trapped in local optima, MBH uses a basin hopping strategy to explore the solution space globally, thus improving the probability of finding the global optimum. Additionally, randomness helps discover potentially better solutions. The MBH steps are summarized in Table 3.
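A compact sketch of the core MBH loop follows. It keeps only the objective-improvement acceptance rule and omits the variable-wise monotonicity test described above; the Rastrigin test function and step size are illustrative choices, not from the paper.

```python
import numpy as np
from scipy.optimize import minimize

def mbh(f, x0, step=0.5, max_rejects=20, seed=0):
    """Monotone basin hopping sketch: perturb the current local minimum,
    re-minimize locally, accept only strict improvements, and stop after
    max_rejects consecutive rejections (cf. Table 3)."""
    rng = np.random.default_rng(seed)
    x = minimize(f, x0).x                  # start from a local minimum
    rejects = 0
    while rejects <= max_rejects:
        y = minimize(f, x + rng.uniform(-step, step, size=x.shape)).x
        if f(y) < f(x):                    # monotone acceptance rule
            x, rejects = y, 0
        else:
            rejects += 1
    return x

# Rastrigin function: many local minima, global minimum at the origin
rastrigin = lambda x: 10 * x.size + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))
print(mbh(rastrigin, np.zeros(2) + 2.0))
```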

2.2.3. Comprehensive Search Algorithm (CSA)

Although the MBH algorithm can locate the global optimum or near-optimal solutions, it still risks becoming stuck in local optima, so it is combined with the SQP algorithm. SQP has weak global search capability but strong local search capability, converging to local extrema quickly; combining the two lets their strengths complement each other. The integration of SQP and MBH yields a comprehensive optimization method with both global and local search capabilities, termed the Comprehensive Search Algorithm (CSA), as shown in Figure 2. Its detailed steps are outlined in Table 4.
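One way to approximate this combination with standard tooling is to run basin hopping with SQP as the embedded local minimizer. Note that SciPy's basinhopping uses a Metropolis acceptance test rather than the strictly monotone rule of Table 4, so this is a sketch of the idea rather than the paper's exact procedure; the test function and bounds are illustrative.

```python
import numpy as np
from scipy.optimize import basinhopping

# CSA-style search (sketch): global hopping with SQP (SLSQP) as the local refiner
rastrigin = lambda x: 10 * x.size + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

res = basinhopping(
    rastrigin,
    x0=np.array([3.0, -2.0]),
    niter=100,
    stepsize=0.7,
    minimizer_kwargs={"method": "SLSQP",
                      "bounds": [(-5.12, 5.12)] * 2},  # SQP local step
    seed=1,
)
print(res.x, res.fun)  # should land at or near the global minimum [0, 0]
```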

2.3. CSA Optimization IAM-BiLSTM Prediction Model

This study uses the CSA to optimize the IAM-BiLSTM prediction model. First, historical data from a specific oil well site in Sichuan are collected and the IAM-BiLSTM prediction model is constructed. The historical data include the production of each well, well pressure, wellhead temperature, separator pressure, separator temperature, and manifold pressure. The CSA is then used to optimize the parameters of the prediction model. Finally, the optimized IAM-BiLSTM model is applied to predict oil well production. Comparing its results with those of other prediction algorithms shows that the optimized IAM-BiLSTM model offers superior accuracy and feasibility for oil well production prediction.
For the parameters of the proposed IAM-BiLSTM model that may influence prediction accuracy, this section employs the CSA to search for their optimal values to improve the model’s predictive performance. The optimization process for the IAM-BiLSTM model is illustrated in Figure 3, and the steps are detailed in Table 5.
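To make the coupling concrete, the sketch below wraps a short training run as the fitness function that CSA would minimize. The tiny network, synthetic tensors, and parameter decoding are hypothetical stand-ins for IAM-BiLSTM and the well dataset; a real run would train the full model and return its validation MSE.

```python
import torch
import torch.nn as nn

class TinyBiLSTM(nn.Module):
    """Stand-in for IAM-BiLSTM, used only to keep the sketch self-contained."""
    def __init__(self, hidden):
        super().__init__()
        self.rnn = nn.LSTM(5, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, 1)
    def forward(self, x):
        h, _ = self.rnn(x)
        return self.out(h[:, -1])

def validation_mse(params):
    """Fitness function for the optimizer: params = (learning rate, hidden size).
    Trains briefly on synthetic data and returns the final loss."""
    lr, hidden = float(params[0]), max(1, int(round(params[1])))
    torch.manual_seed(0)
    x, y = torch.randn(64, 30, 5), torch.randn(64, 1)  # stand-in well data
    model = TinyBiLSTM(hidden)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(20):                      # a few quick epochs
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
    return float(loss.detach())

# CSA would search this objective within, e.g., lr in [0.001, 0.01] and
# hidden in [1, 512] (the bounds listed in Table 10).
print(validation_mse([0.005, 64.0]))
```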

3. Results and Discussion

In this section, we describe the experimental setup and compare the prediction results. In the experiments, the IAM-BiLSTM model is compared with the Back Propagation neural network (BP), Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), BiLSTM, and AM-BiLSTM (Attention Mechanism-Bidirectional Long Short-Term Memory) for production prediction, and the comparative results are presented. Additionally, the results of optimizing the IAM-BiLSTM model using the Grey Wolf Optimizer (GWO) [31], Particle Swarm Optimizer (PSO) [32], and Whale Optimization Algorithm (WOA) [33] are compared with those obtained using the CSA-optimized model. The models used in the experiments were primarily implemented in Python 3.10.

3.1. Data Types

Based on the operating conditions of the ground and production equipment at a specific oil well site in Sichuan, the experimental data consist of historical dynamic liquid level data collected on-site from three oil wells during 2021–2022. To enhance the generalization ability of the model, 1000 data points were randomly selected from each well to construct the dataset; each dataset corresponds to dynamic liquid level data measured over a month for a specific well. The production variations of the three wells are shown in Figure 4 [34]. The standard deviation and variance reflect the fluctuation of a dataset, while the mean, minimum, and maximum values indicate the differences in production among the wells [35]. The mean, standard deviation, variance, and maximum values of the oil well production data are listed in Table 6. This dataset can be used to validate the proposed model's performance in predicting wells with significant production disparities.

3.2. Evaluation Criteria

This study uses the Mean Squared Error (MSE) and Mean Absolute Error (MAE) as evaluation metrics to comprehensively assess the predictive performance of the model. The smaller the values of these metrics, the closer the predicted values are to the actual values, indicating better prediction performance.
$$MSE = \frac{1}{m} \sum_{i=1}^{m} \left( y_i - \hat{y}_i \right)^2 \tag{5}$$
$$MAE = \frac{1}{m} \sum_{i=1}^{m} \left| y_i - \hat{y}_i \right| \tag{6}$$
where $y_i$ is the actual oil production value, $\hat{y}_i$ is the predicted oil production value, and $m$ is the total number of samples in the test dataset.
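Both metrics translate directly into code, for example in NumPy:

```python
import numpy as np

def mse(y, y_hat):
    """Mean squared error over the test set."""
    return np.mean((y - y_hat) ** 2)

def mae(y, y_hat):
    """Mean absolute error over the test set."""
    return np.mean(np.abs(y - y_hat))

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.1, 1.9, 3.3])
print(mse(y_true, y_pred), mae(y_true, y_pred))  # ~0.0367, ~0.1667
```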

3.3. Verifying the Effectiveness of the IAM-BiLSTM Prediction Model

3.3.1. Feature Engineering and Model Tuning

The proposed IAM-BiLSTM prediction model is designed for time series data. In addition to gas production, the dataset also includes well pressure, wellhead temperature, separator pressure, separator temperature, and manifold pressure. Data other than gas production are used as input features, and gas production is used as the target feature for prediction. The input features have been normalized to ensure consistency. We use the sliding window technique to convert the sequence data into feature-target pairs, making them suitable for training BP, SVM, and XGBoost models. The dataset is divided into 80% training and 20% testing to evaluate the model performance. In order to verify the improvement in prediction performance compared with machine learning methods, prediction models based on BP network, SVM, and XGBoost are established for comparison. The data collected from the three wellheads have the same characteristics, and all datasets are readings from field equipment sensors. The data characteristics are shown in Table 7.
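A typical sliding-window conversion looks like the following sketch; the 30-step window length is an assumed value, as the paper does not state the window size.

```python
import numpy as np

def sliding_windows(features, target, window):
    """Converts aligned time series into (X, y) pairs: each sample is a
    window of past feature readings and the target value that follows it."""
    X, y = [], []
    for i in range(len(target) - window):
        X.append(features[i : i + window])
        y.append(target[i + window])
    return np.array(X), np.array(y)

# 1000 steps of 5 features (pressures/temperatures) and gas production
feats = np.random.rand(1000, 5)
prod = np.random.rand(1000)
X, y = sliding_windows(feats, prod, window=30)
print(X.shape, y.shape)            # (970, 30, 5) (970,)
# Flatten each window for BP/SVM/XGBoost; keep the 3-D shape for (Bi)LSTM models
X_flat = X.reshape(len(X), -1)     # (970, 150)
```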
Additionally, to evaluate the impact of the attention mechanism on the prediction model, BiLSTM and AM-BiLSTM models were also constructed for comparison. The network parameters of the prediction models involved in this study are shown in Table 8, where $I_{input}$ and $I_{output}$ denote the input and output for SVM, XGBoost, and the BP neural network, $C$ is the SVM hyperparameter (other parameters at default values), $gamma$ is the XGBoost optimization parameter (other parameters also at default values), and $c_i$ denotes the number of neurons in the $i$-th hidden layer. For BiLSTM, $L_{input}$ and $L_{output}$ denote the input and output, respectively. To validate the model's accuracy, the training batch size $S_b$, learning rate $l_r$, loss function, and optimizer are kept consistent across models.

3.3.2. Verification Results

To verify the effectiveness of the proposed IAM-BiLSTM prediction model in the field of oil well production prediction, as well as the improvement in prediction accuracy brought by the Inter-Attention Mechanism to the BiLSTM model, production prediction models were established based on the aforementioned dataset and model parameters. The results of predicting oil well production using the IAM-BiLSTM prediction model are shown in Figure 5.
After comparing the prediction results with several machine learning and deep learning methods, Figure 6 was obtained, and the prediction accuracy calculation results are shown in Table 9. The results can be summarized as follows:
  • The proposed IAM-BiLSTM prediction model significantly improves prediction accuracy. The MSE and MAE of the IAM-BiLSTM prediction model are 0.0152 and 0.0621, respectively, which represent reductions of 61.14% and 34.3% compared to the BP network, 73.2% and 42.3% compared to SVM, and 74.94% and 38.99% compared to XGBoost. This demonstrates that deep learning methods, compared to traditional machine learning methods, can significantly enhance the accuracy of prediction models.
  • The MSE and MAE of the IAM-BiLSTM prediction model, at 0.0152 and 0.0621, respectively, are reduced by 39.44% and 35.68% compared to BiLSTM, and by 46.08% and 31.06% compared to AM-BiLSTM. This indicates that the Inter-Attention Mechanism enhances the weight of valuable information when extracting feature information using BiLSTM, thereby further improving the model’s prediction accuracy.
In Figure 6, the black curve (actual values) shows that the true data fluctuate frequently across the 200 samples. The colored curves represent predictions from the various models (BP, SVM, XGBoost, BiLSTM, AM-BiLSTM, and IAM-BiLSTM), all of which generally follow the data's oscillatory pattern. However, the IAM-BiLSTM model captures sharp peaks and troughs more accurately, demonstrating superior predictive performance. Although AM-BiLSTM also tracks rapid changes relatively well, methods such as BP, SVM, and XGBoost deviate more noticeably from the actual values.

3.4. CSA Optimization Performance Analysis

The above experimental results indicate that the IAM-BiLSTM prediction model demonstrates good prediction accuracy. However, determining reasonable learning rates and feature dimensions is essential for further improving the accuracy of the prediction model. Common methods for optimizing model parameters include GWO, PSO, and WOA.
Existing optimization algorithms such as PSO, WOA, and GWO have limitations on complex, high-dimensional problems. For example, PSO converges quickly in the early stage but is prone to falling into local optima later, and although WOA has strong global search capability, its optimization efficiency is low. In contrast, the Comprehensive Search Algorithm (CSA) combines the global search capability of MBH with the local optimization capability of SQP, effectively improving the convergence speed and accuracy of model parameter optimization. The experimental results show that the CSA-optimized IAM-BiLSTM model converges faster than the other optimization methods (see Figure 7). As the figure shows, the fitness value of the IAM-BiLSTM model, i.e., the overall average error, trends downward under all four algorithms. However, compared with GWO, PSO, and WOA, CSA achieves a shorter average convergence time and the smallest average error, and its error convergence curve is smoother, indicating better stability and convergence. Therefore, this paper selects CSA as the optimizer of the IAM-BiLSTM model to improve the reliability and generalization ability of the prediction model.

3.5. Performance Analysis and Evaluation of CSA Optimized IAM-BiLSTM Prediction Model

To further validate the optimization capability of CSA for the IAM-BiLSTM prediction model, the CSA-optimized IAM-BiLSTM prediction model was established using the aforementioned production historical data as the dataset. CSA was utilized to optimize the learning rate and feature dimensions of the IAM-BiLSTM prediction model, with specific parameters shown in Table 10. The prediction results are illustrated in Figure 8, and the prediction accuracy comparison results are presented in Table 11. As shown in Table 11, the MAE and MSE of the CSA-optimized IAM-BiLSTM prediction model are 0.0795 and 0.0123, respectively. Compared to IAM-BiLSTM, prediction accuracy has been improved. It can be concluded that optimizing the learning rate, batch size, and the number of neurons in the IAM-BiLSTM prediction model using CSA can effectively enhance the model’s prediction accuracy.

4. Conclusions

The high accuracy of the proposed CSA-optimized IAM-BiLSTM model in oil well production prediction makes it valuable for oilfield management and commercial applications. First, the model can help oilfield managers optimize production plans, reduce unnecessary equipment maintenance, and improve resource utilization, thereby lowering operating costs. Second, accurate predictions can reduce production fluctuations and improve economic benefits; for example, reducing prediction errors can limit the capacity losses that oil fields incur through misjudgment. Finally, the model's efficiency can also reduce energy waste in oilfield production, optimize carbon emission management, and support sustainable development. This study therefore provides both an innovative prediction method and a useful reference for intelligent production in the oil and gas industry.
This study validated the IAM-BiLSTM model using a dataset constructed from oilfield production data and compared the prediction performance with various baseline models. By applying multiple optimization algorithms to the parameter optimization of the IAM-BiLSTM model, the following conclusions were drawn:
  • A BiLSTM model based on the Inter-Attention Mechanism (IAM-BiLSTM) was proposed, which assigns different weights to hidden states, enhancing the influence of critical information. Experimental results show that the IAM-BiLSTM model outperforms traditional BiLSTM and AM-BiLSTM models in prediction accuracy.
  • Compared with optimization algorithms such as GWO, PSO, and WOA, the CSA-optimized IAM-BiLSTM model achieves faster and more stable convergence of its error iteration curve. Experimental results demonstrate that the CSA algorithm has excellent stability and convergence properties, improving the prediction accuracy of the IAM-BiLSTM model.
  • The proposed CSA-optimized IAM-BiLSTM model has been successfully applied to production prediction at a specific oil well site in Sichuan, verifying the method’s robustness and generalization ability.

Author Contributions

Conceptualization, Z.H. and H.D.; methodology, Z.H. and H.D.; software, X.M. and X.L.; validation, X.M. and X.L.; investigation, X.M. and X.L.; resources, M.W.; data curation, X.M.; writing—original draft preparation, X.M.; writing—review and editing, H.D.; visualization, X.M.; supervision, H.D.; project administration, M.W.; funding acquisition, M.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 62006200.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Liu, W.; Liu, W.D.; Gu, J. Forecasting oil production using ensemble empirical model decomposition based Long Short-Term Memory neural network. J. Pet. Sci. Eng. 2020, 189, 107013. [Google Scholar] [CrossRef]
  2. Lawal, A.; Yang, Y.; He, H.; Baisa, N.L. Machine Learning in Oil and Gas Exploration—A Review. IEEE Access 2024, 12, 19035–19058. [Google Scholar] [CrossRef]
  3. Kuang, L.; Liu, H.; Ren, Y.; Luo, K.; Shi, M.; Su, J.; Li, X. Application and development trend of artificial intelligence in petroleum exploration and development. Pet. Explor. Dev. 2021, 48, 1–14. [Google Scholar] [CrossRef]
  4. Espinoza, R. Digital oil field powered with new empirical equations for oil rate prediction. In SPE Middle East Intelligent Oil and Gas Symposium; SPE: Richardson, TX, USA, 2015; p. D021S007R003. [Google Scholar]
  5. Nguyen, H.H.; Chan, C.W. Applications of data analysis techniques for oil production prediction. Eng. Appl. Artif. Intell. 2005, 18, 549–558. [Google Scholar] [CrossRef]
  6. Jalilinasrabady, S.; Tanaka, T.; Itoi, R.; Goto, H. Numerical simulation and production prediction assessment of Takigami geothermal reservoir. Energy 2021, 236, 121503. [Google Scholar] [CrossRef]
  7. Cao, X.; Liu, Z.; Hu, C.; Song, X.; Quaye, J.A.; Lu, N. Three-Dimensional Geological Modelling in Earth Science Research: An In-Depth Review and Perspective Analysis. Minerals 2024, 14, 686. [Google Scholar] [CrossRef]
  8. Van Everdingen, A.F.; Timmerman, E.H.; McMahon, J.J. Application of the Material Balance Equation to a Partial Water-Drive Reservoir. J. Pet. Technol. 1953, 5, 51–60. [Google Scholar] [CrossRef]
  9. Xu, J.; Chen, Z.; Hu, J.; Wu, K.; Zhou, D. Advances in reservoir modelling and simulation. Front. Earth Sci. 2022, 10, 1106622. [Google Scholar] [CrossRef]
  10. Ibrahim, N.M.; Alharbi, A.A.; Alzahrani, T.A.; Abdulkarim, A.M.; Alessa, I.A.; Hameed, A.M.; Albabtain, A.S.; Alqahtani, D.A.; Alsawwaf, M.K.; Almuqhim, A.A. Well performance classification and prediction: Deep learning and machine learning long term regression experiments on oil, gas, and water production. Sensors 2022, 22, 5326. [Google Scholar] [CrossRef] [PubMed]
  11. Liu, J.; Wang, F.; Zhang, C.; Zhang, Y.; Li, T. Reservoir production capacity prediction of Zananor field based on LSTM neural network. Acta Geophys. 2024, 73, 295–310. [Google Scholar] [CrossRef]
  12. Nguyen, H.H.; Chan, C.W.; Wilson, M. Prediction of oil well production: A multiple-neural-network approach. Intell. Data Anal. 2004, 8, 183–196. [Google Scholar] [CrossRef]
  13. Azevedo, B.F.; Rocha, A.M.A.C.; Pereira, A.I. Hybrid approaches to optimization and machine learning methods: A systematic literature review. Mach. Learn. 2024, 113, 4055–4097. [Google Scholar] [CrossRef]
  14. Andrais, R. Probabilistic Oil and Gas Production Forecasting Using Machine Learning; Massachusetts Institute of Technology: Cambridge, MA, USA, 2021. [Google Scholar]
  15. Hu, H.; Feng, J.; Guan, X. A Method of Oil Well Production Prediction Based on PCA-GRU. In 2019 IEEE 10th International Conference on Software Engineering and Service Science (ICSESS); IEEE: Piscataway, NJ, USA, 2019; pp. 710–713. [Google Scholar]
  16. Hosseini, S.; Thangarajah, A. Advanced deep regression models for forecasting time series oil production. arXiv 2023, arXiv:2308.16105. [Google Scholar]
  17. Panja, P.; Jia, W.; McPherson, B. Prediction of well performance in SACROC field using stacked Long Short-Term Memory (LSTM) network. Expert Syst. Appl. 2022, 205, 117670. [Google Scholar] [CrossRef]
  18. Otter, D.W.; Medina, J.R.; Kalita, J.K. A survey of the usages of deep learning for natural language processing. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 604–624. [Google Scholar] [CrossRef]
  19. Lieskovská, E.; Jakubec, M.; Jarina, R.; Chmulík, M. A review on speech emotion recognition using deep learning and attention mechanism. Electronics 2021, 10, 1163. [Google Scholar] [CrossRef]
  20. Jain, S.; Wallace, B.C. Attention is not explanation. arXiv 2019, arXiv:1902.10186. [Google Scholar]
  21. Luo, Z.; Li, J.; Zhu, Y. A deep feature fusion network based on multiple attention mechanisms for joint iris-periocular biometric recognition. IEEE Signal Process. Lett. 2021, 28, 1060–1064. [Google Scholar] [CrossRef]
  22. Sherstinsky, A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306. [Google Scholar] [CrossRef]
  23. Mao, Y.; Zhang, Y.; Jiao, L.; Zhang, H. Document-level sentiment analysis using attention-based bi-directional long short-term memory network and two-dimensional convolutional neural network. Electronics 2022, 11, 1906. [Google Scholar] [CrossRef]
  24. Sabah, R.; Lam, M.C.; Qamar, F.; Zaidan, B.B. A BiLSTM-Based Feature Fusion with CNN Model: Integrating Smartphone Sensor Data for Pedestrian Activity Recognition. IEEE Access 2024, 12, 142957–142978. [Google Scholar] [CrossRef]
  25. Landi, F.; Baraldi, L.; Cornia, M.; Cucchiara, R. Working memory connections for LSTM. Neural Netw. 2021, 144, 334–341. [Google Scholar] [CrossRef]
  26. Krzywanski, J.; Sosnowski, M.; Grabowska, K.; Zylka, A.; Lasek, L. Advanced computational methods for modeling, prediction and optimization—A review. Materials 2024, 17, 3521. [Google Scholar] [CrossRef] [PubMed]
  27. Niu, Z.; Zhong, G.; Yu, H. A review on the attention mechanism of deep learning. Neurocomputing 2021, 452, 48–62. [Google Scholar] [CrossRef]
  28. Salih, S.Q.; Alsewari, A.A.; Wahab, H.A.; Mohammed, M.K.A.; Rashid, T.A.; Das, D.; Basurra, S.S. Multi-population Black Hole Algorithm for the problem of data clustering. PLoS ONE 2023, 18, e0288044. [Google Scholar] [CrossRef]
  29. Seumer, J.; Kirschner Solberg Hansen, J.; Brøndsted Nielsen, M.; Jensen, J.H. Computational evolution of new catalysts for the Morita–Baylis–Hillman reaction. Angew. Chem. Int. Ed. 2023, 62, e202218565. [Google Scholar] [CrossRef]
  30. Baioletti, M.; Santucci, V.; Tomassini, M. A performance analysis of Basin hopping compared to established metaheuristics for global optimization. J. Glob. Optim. 2024, 89, 803–832. [Google Scholar] [CrossRef]
  31. Grosso, A.; Locatelli, M.; Schoen, F. Solving molecular distance geometry problems by global optimization algorithms. Comput. Optim. Appl. 2009, 43, 23–37. [Google Scholar] [CrossRef]
  32. Wang, D.; Tan, D.; Liu, L. Particle swarm optimization algorithm: An overview. Soft Comput. 2018, 22, 387–408. [Google Scholar] [CrossRef]
  33. Rana, N.; Latiff, M.S.A.; Abdulhamid, S.I.M.; Chiroma, H. Whale optimization algorithm: A systematic review of contemporary applications, modifications and developments. Neural Comput. Appl. 2020, 32, 16245–16277. [Google Scholar] [CrossRef]
  34. Yadav, A.; Roy, S.M. An artificial neural network-particle swarm optimization (ANN-PSO) approach to predict the aeration efficiency of venturi aeration system. Smart Agric. Technol. 2023, 4, 100230. [Google Scholar] [CrossRef]
  35. Montgomery, D.C.; Runger, G.C. Applied Statistics and Probability for Engineers; John Wiley & Sons: Hoboken, NJ, USA, 2010. [Google Scholar]
Figure 1. Schematic Diagram of the IAM-BiLSTM Prediction Model.
Figure 2. Comprehensive Search Algorithm (CSA) Flow Chart.
Figure 3. CSA-Optimized IAM-BiLSTM Prediction Model.
Figure 4. Production Changes of Three Oil Wells.
Figure 5. IAM-BiLSTM Model Prediction Results.
Figure 6. Prediction Results.
Figure 7. Average Error Convergence Curve.
Figure 8. Comparison of Prediction Results of IAM-BiLSTM Optimized by CSA.
Table 1. IAM-BiLSTM Prediction Model Implementation Steps.
Step 1: Given the input sequence $x_1, x_2, \ldots, x_T$, the forward LSTM computes the hidden states $\overrightarrow{h}_1, \overrightarrow{h}_2, \ldots, \overrightarrow{h}_T$ from $t = 1$ to $T$, and the backward LSTM computes the hidden states $\overleftarrow{h}_1, \overleftarrow{h}_2, \ldots, \overleftarrow{h}_T$ from $t = T$ to 1.
Step 2: Calculate the attention weights. For each time step $t$, compute the attention weight between the forward hidden state $\overrightarrow{h}_t$ and all backward hidden states $\overleftarrow{h}_1, \ldots, \overleftarrow{h}_T$, using a compatibility function to calculate the similarity score $e_{t,k} = \mathrm{score}(\overrightarrow{h}_t, \overleftarrow{h}_k)$, and apply Softmax to the scores.
Step 3: Compute the context vector. Use the attention weights to calculate the context vector $c_t$, the weighted sum of the backward hidden states: $c_t = \sum_{k=1}^{T} \alpha_{t,k} \overleftarrow{h}_k$.
Step 4: Combine the hidden state and context vector. Concatenate the forward hidden state $\overrightarrow{h}_t$ and the context vector $c_t$ to form an enhanced representation: $h_t^{\mathrm{enhanced}} = \mathrm{Concat}(\overrightarrow{h}_t, c_t)$.
Step 5: Use the enhanced representation $h_t^{\mathrm{enhanced}}$ for subsequent tasks.
Table 2. SQP Algorithm Steps.
Step 1: Choose a feasible initial point $x_0$; set $k = 0$.
Step 2: Iteration step (for each $k$): (1) linearize the constraints and form the quadratic approximation of the Lagrangian; (2) solve the QP subproblem for $d_k$; (3) determine $\alpha_k$ and update $x_{k+1} = x_k + \alpha_k d_k$; (4) update the Lagrange multipliers and the Hessian approximation $H_{k+1}$.
Step 3: If the convergence criteria are met, stop; otherwise, increment $k$ and repeat.
Table 3. MBH Algorithm Steps.
Step 1: Generate a randomly started local minimum $x_0$ and set the consecutive rejection counter $R$ to 0.
Step 2: From the current local minimum $x_k$, add a random perturbation $S$ to obtain $x_k + S$, then find a new local minimum $Y$ by local minimization starting from this perturbed point.
If $E(Y) < E(x_k)$, accept $x_{k+1} = Y$, reset the consecutive rejection counter $R$ to 0, and return to Step 2.
Otherwise, reject $Y$ and increment $R$: if $R \le R_{max}$, retain $x_k$ and return to Step 2; if $R > R_{max}$, go to Step 3.
Step 3: Use the current $x_k$ as the bottom of the funnel and terminate the sequence.
Table 4. CSA Algorithm Steps.
Step 1: Randomly generate an initial solution $x_0$; set the consecutive rejection counter $R = 0$, the maximum number of consecutive rejections $R_{max}$, and the allowable error $\epsilon$.
Step 2: For the current local minimum $x_k$, generate a random perturbation vector $S = \alpha \cdot U$, $S \in \mathbb{R}^{3N}$, so that the new starting point is $X_{new} = x_k + S$.
Step 3: Starting from the new candidate point $X_{new}$, run a local search with the Sequential Quadratic Programming (SQP) algorithm to obtain a new local minimum $Y$.
Step 4: Determine whether the new solution $Y$ is better than the current solution $x_k$. If $E(Y) < E(x_k)$, accept $Y$ as the better solution, i.e., $x_{k+1} = Y$; reset the consecutive rejection counter $R$ to 0 and return to Step 2.
Step 5: If $E(Y) \ge E(x_k)$, reject $Y$, i.e., $x_{k+1} = x_k$, and increment the consecutive rejection counter $R$ by 1.
Step 6: Check whether $R$ has reached the maximum allowed value $R_{max}$. If $R \le R_{max}$, return to Step 2 and perform a new random perturbation and local search. If $R > R_{max}$, the current solution is considered unimprovable, and the loop exits to the termination step.
Table 5. CSA Optimization IAM-BiLSTM Steps.
Step 1: Time series features highly correlated with production, obtained at the well site, are used as input variables of the prediction model. To prevent large differences in magnitude between variables from affecting model accuracy, the data samples are normalized: $X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}} \in [0, 1]$.
Step 2: Initialize the model parameters to be optimized, $\theta = (l_r, S_b, E_p, L_{out}, c_1, c_2, c_3)$. Set the maximum number of rejections $R_x = 20$ and the random perturbation amplitude of the model parameters.
Step 3: Given the training samples and model parameters, establish the oil well production prediction model using the IAM-BiLSTM prediction model described above.
Step 4: Define the mean squared error of the IAM-BiLSTM prediction results as the loss function.
Step 5: Train the model with the newly perturbed parameters and calculate the resulting model loss $L_{new}$.
Step 6: Perform local optimization on the perturbed parameters, retrain the model with the optimized parameters, and recalculate the model loss $L_{new}$.
Step 7: Decide whether to accept the new solution. If $L_{new} \le L_{best}$, accept it, update the optimal parameters and loss function, and exit the optimization search. If $L_{new} > L_{best}$, reject it and search for the best solution again.
Table 6. Production of Three Wells in a Well Field.
Well Number | Mean | Standard Deviation | Variance | Maximum
1 | 122,525.45 | 90,270.89 | 8,148,835,259.10 | 435,133.1
2 | 76,886.25 | 73,732.17 | 5,436,433,284.58 | 346,582.5
3 | 220,579.66 | 71,483.74 | 5,109,925,865.95 | 591,495.4
Table 7. Oil Well Basic Statistics of Features.
Oil Well Number | Data Characteristic | Average Value | Standard Deviation | Unit
Well 1 | Well pressure | 137.2994 | 97.7015 | MPa
Well 1 | Wellhead temperature | 29.0783 | 5.5435 | °C
Well 1 | Manifold pressure | 33.3678 | 6.3404 | MPa
Well 1 | Separator pressure | 3.3701 | 0.8442 | MPa
Well 1 | Separator temperature | 81.7237 | 12.6625 | °C
Well 1 | Gas production | 122,417.8197 | 90,081.2698 | m³
Well 2 | Well pressure | 104.0779 | 75.2034 | MPa
Well 2 | Wellhead temperature | 39.8731 | 11.9762 | °C
Well 2 | Manifold pressure | 23.6309 | 15.1888 | MPa
Well 2 | Separator pressure | 3.3957 | 0.9003 | MPa
Well 2 | Separator temperature | 65.6631 | 14.1295 | °C
Well 2 | Gas production | 100,100.3016 | 60,535.3192 | m³
Well 3 | Well pressure | 107.7074 | 67.1752 | MPa
Well 3 | Wellhead temperature | 34.0275 | 3.4423 | °C
Well 3 | Manifold pressure | 28.9450 | 7.8847 | MPa
Well 3 | Separator pressure | 3.7279 | 0.8624 | MPa
Well 3 | Separator temperature | 86.5867 | 9.3566 | °C
Well 3 | Gas production | 220,666.6295 | 71,503.8883 | m³
Table 8. Parameter Settings of the Comparative Prediction Models.
Prediction Model | Parameter Settings
BP | $I_{input}$ = 438 × 1, $C$ = 3, $I_{output}$ = 1
SVM | $I_{input}$ = 438 × 1, $gamma$ = 0.1, $I_{output}$ = 1
XGBoost | $I_{input}$ = 438 × 1, $c_1$ = 512, $c_2$ = 128, $c_3$ = 32, $c_4$ = 8, $I_{output}$ = 1
BiLSTM | $S_b$ = 128, $l_r$ = 0.001, loss function: MSE, optimizer: Adam, $L_{input}$ = 216 × 2, $L_{output}$ = 32 × 1, $c_1$ = 128, $c_2$ = 128, $c_3$ = 64, $c_4$ = 8
AM-BiLSTM | $S_b$ = 128, $l_r$ = 0.001, $E_p$ = 1000, loss function: MSE, optimizer: Adam, $L_{input}$ = 216 × 2, $L_{output}$ = 32 × 1, $c_1$ = 128, $c_2$ = 128, $c_3$ = 64, $c_4$ = 8
Table 9. Error Results of Different Prediction Methods.
Prediction Method | MSE | MAE
BP | 0.0391 | 0.0945
SVM | 0.0567 | 0.1077
XGBoost | 0.0607 | 0.1018
BiLSTM | 0.0251 | 0.0965
AM-BiLSTM | 0.0281 | 0.0902
IAM-BiLSTM | 0.0152 | 0.0621
Table 10. Parameter Settings of the IAM-BiLSTM Prediction Model Optimized by CSA.
Symbol | Meaning | Value
$R_x$ | Maximum rejection times | 20
$N$ | Random perturbation amplitude | 10%
$L_b$ | Optimization range lower limit | [0.001, 1, 1]
$U_b$ | Optimization range upper limit | [0.01, 512, 512]
Table 11. Comparison of Prediction Accuracy.
Prediction Method | MSE | MAE
CSA-Optimized IAM-BiLSTM | 0.0123 | 0.0795
IAM-BiLSTM | 0.0385 | 0.1561
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
