A Method for Predicting Bottomhole Pressure Based on Data Augmentation and Hyperparameter Optimisation

Xin, Xiankang; Jiang, Xuecheng; Liu, Saijun; Yu, Gaoming; Jiang, Xujian

doi:10.3390/pr14081194

Open Access Article


by Xiankang Xin ¹,²,³,⁴, Xuecheng Jiang ¹, Saijun Liu ¹,²,³,⁴,*, Gaoming Yu ¹,²,³,⁴,* and Xujian Jiang ⁵

¹ School of Petroleum Engineering, Yangtze University, Wuhan 430100, China

² State Key Laboratory of Low Carbon Catalysis and Carbon Dioxide Utilization, Yangtze University, Wuhan 430100, China

³ Hubei Key Laboratory of Oil and Gas Drilling and Production Engineering, Yangtze University, Wuhan 430100, China

⁴ National Engineering Research Center for Oil & Gas Drilling and Completion Technology, School of Petroleum Engineering, Yangtze University, Wuhan 430100, China

⁵ Research Institute of Petroleum Exploration and Development, Tarim Oilfield Company, PetroChina, Korla 841000, China

* Authors to whom correspondence should be addressed.

Processes 2026, 14(8), 1194; https://doi.org/10.3390/pr14081194

Submission received: 12 March 2026 / Revised: 4 April 2026 / Accepted: 6 April 2026 / Published: 8 April 2026

(This article belongs to the Special Issue New Advances in Low-Energy Processes for Geo-Energy Development: 2nd Edition)


Abstract

Bottomhole pressure prediction strongly affects oil production and recovery and has become a key research direction in the oil and gas field. To improve the accuracy and robustness of bottomhole pressure prediction under transient and variable operating conditions, this paper proposes a method based on data augmentation and hyperparameter optimization. To address the limited data volume and significant disturbances found in actual oilfield production, a data augmentation strategy incorporating noise perturbation and sliding windows was introduced to expand the training samples and improve model generalization. For the model architecture, a deep network integrating CNN, BiGRU, and Multi-Head Attention mechanisms was proposed, referred to as the CNN-BiGRU-Multi-Head Attention model. Bayesian optimization was introduced for automatic hyperparameter search, which further enhanced the performance of the temporal model and enabled efficient extraction and dynamic focusing of wellbore pressure temporal features. Prediction results demonstrated that the proposed method outperforms existing mainstream forecasting models in metrics such as Mean Absolute Error (MAE) and Coefficient of Determination (R²), with R² reaching 0.9831, which supports its generalization capability and engineering applicability. This study provides a novel prediction method and practical guidance for intelligent oilfield production management and bottomhole pressure forecasting, which is important for extending well life and stabilizing hydrocarbon production.

Keywords:

bottomhole pressure; predictive model; deep learning; data augmentation; fusion time-series model

1. Introduction

Bottomhole Pressure (BHP) serves as a crucial indicator of fluid dynamics within oil and gas reservoirs, playing an irreplaceable role in reservoir development [1,2]. Accurate BHP monitoring and prediction are essential for effectively guiding production control as well as for revealing internal reservoir pressure distributions and fluid flow patterns, both of which are crucial for ensuring operational continuity and stability [3,4]. Particularly during the mid-to-late development stages of oil and gas fields, where non-steady-state flow and complex, variable operating conditions prevail, the spatiotemporal variations in BHP become increasingly intricate [5]. Traditional measurement and prediction methods struggle to meet the demands of practical production management [6,7].

Currently, two methods are employed to obtain bottomhole pressure data in field operations. One involves direct measurement using downhole pressure gauges. While this method captures pressure data, it suffers from issues such as high equipment costs, complex maintenance, and short instrument lifespans. Additionally, the data obtained is lagging, suitable only for monitoring and unable to support subsequent production decisions [8]. The other method employs theoretical calculations combining existing data with empirical formulas. This approach calculates bottomhole pressure using previously constructed multiphase flow models of the wellbore [9,10]. While less costly than direct measurement, it suffers from numerous model assumptions, limited applicability, difficulties in accurately obtaining key parameters during actual operation, and insufficient prediction accuracy [11,12,13]. It struggles to meet the demands of modern intelligent oilfields for high-precision, real-time predictions.

In recent years, with the advancement of machine learning and deep learning technologies [14,15,16,17,18], data-driven bottomhole pressure prediction methods have emerged as a research hotspot [9,19,20,21,22,23]. These approaches can automatically extract complex temporal features, enhancing model predictive capability and generalization performance [24,25,26,27]. However, existing methods often overlook practical challenges in oil and gas fields, such as limited data volume, strong disturbances, and highly variable operating conditions, leading to insufficient model robustness in non-steady-state and complex environments, which ultimately limits their widespread engineering application.

In response to the aforementioned challenges, Ternyik et al. [28] employed an artificial neural network (ANN) model to predict pipeline pressure drop under various operating conditions, representing an early study utilizing artificial intelligence methods for pressure prediction. After that, Osman et al. [29] developed a vertical multiphase flow ANN model using a multilayer feedforward network approach, which was validated using field data encompassing a wide range of variables. Sami et al. [30] proposed employing three machine learning techniques to predict multiphase flow bottomhole pressure. Results indicate that artificial neural networks outperform other machine learning methods in terms of error control. These studies effectively validate the feasibility of machine learning in bottomhole pressure prediction tasks.

With the advancement of technology, backpropagation-based models are gradually being applied in the petroleum sector. Mohammadpoor et al. [10] developed an ANN model based on the backpropagation learning algorithm to investigate correlations under different flow conditions. Al-Shammari et al. [31] combined the backpropagation learning algorithm with least-squares function theory to propose an adaptive neural-fuzzy inference system model suitable for two-phase flow systems. Li et al. [20] enhanced the backpropagation neural network model to accommodate segmented computational workflows, achieving improved performance under steady-state flow conditions. These studies further validate the advantages of backpropagation-based models.

The introduction of fusion models represents a major advancement in foundational model research. Sun et al. [32] proposed a GA-XGBoost model for predicting bottomhole pressure in karstic reservoirs, achieving an R² of 0.84 on the test set. Zhang et al. [6] proposed a fusion model combining CNN and GRU with Bayesian hyperparameter optimization for bottomhole pressure prediction, demonstrating the superiority of CNN-based fusion models in this task. While these studies further validate the advantages of fusion models in bottomhole pressure prediction, challenges remain, including suboptimal prediction performance and limited data availability.

To capture the inherent relationship between long-term dependencies and key features, Li et al. [33] proposed applying the CNN-BiLSTM-Multi-Head Attention model for coalbed methane production forecasting, achieving remarkable results that demonstrated the effectiveness of bidirectional models and multi-head attention mechanisms in enhancing the stability and accuracy of time-series predictions. Pang et al. [34] employed Bayesian optimization to tune model hyperparameters, with empirical tests confirming the necessity of Bayesian hyperparameter optimization.

However, these methods still exhibit certain limitations. Most notably, they often inadequately address limited data and strong noise interference, which leads to insufficient robustness in complex operating environments. Furthermore, even studies that employ attention mechanisms or hyperparameter optimization generally lack multiscale joint modeling designed to capture short-term dynamics and long-term trends simultaneously.

This paper proposes a deep learning-based bottomhole pressure prediction method that integrates data augmentation strategies with hyperparameter optimization.

By introducing noise perturbation and sliding window techniques to expand the training data, coupled with a multi-scale temporal input mechanism, the model was able to capture both short-term dynamic variations and long-term trends in bottomhole pressure. A composite architecture combining Convolutional Neural Networks (CNN), Bidirectional Gated Recurrent Units (BiGRU), and Multi-Head Attention achieved efficient extraction and dynamic focusing of temporal features. Simultaneously, a Bayesian optimization algorithm was employed to automatically tune the model hyperparameters, further enhancing prediction accuracy and stability. The proposed method was subsequently validated using actual production data from Oilfield B and compared against multiple conventional machine learning models. The results demonstrated that the proposed approach significantly outperformed the baseline models in both prediction accuracy and stability, accurately capturing the temporal evolution characteristics of bottomhole pressure. This study provided a novel technical pathway for accurate bottomhole pressure prediction and offered important references for production optimization and safety management in intelligent oil and gas fields.

The core innovations of this study are summarized as follows: (1) A targeted data augmentation strategy was proposed to alleviate the problem of insufficient oilfield data samples. (2) A multi-scale time series input design was constructed to capture the dynamic temporal characteristics of bottomhole pressure. These two improvements effectively enhanced the prediction performance of the model.

2. Materials and Methods

2.1. Framework for Deep Learning Models

The deep learning framework developed in this study was structured around two main components: data preprocessing and model development (Figure 1). The data preprocessing stage consisted of several sequential steps, including data collection and analysis, data cleaning, feature selection, and dataset partitioning. Model development was the core focus of this research. Performance evaluation metrics including MAE, MSE, SMAPE, and R² were employed to assess common deep learning models alongside the proposed CNN-BiGRU-Multi-Head Attention model. Finally, visualization techniques were employed to display and compare the prediction results of each model.

2.2. Model Construction and Design

2.2.1. CNN Model

In this study, the convolutional neural network (CNN) extracts local temporal features from the input sequence through two convolutional layers. By scanning the data using convolutional kernels (filters), it effectively captures local temporal dependencies, providing crucial foundational data for subsequent sequence modeling.

f_{t} = \mathrm{ReLU}(W_{conv} * X_{t} + b), \qquad (1)

In the equation, f_{t} represents the feature map output at time step t, which is the result processed by the convolutional layer; W_{conv} denotes the convolutional weights, serving as trainable parameters for feature extraction; * signifies the convolution operator; X_{t} denotes the input data at time step t; b represents the bias term, a learnable scalar or vector used to shift the convolution result; and ReLU denotes the rectified linear unit activation function.
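Equation (1) can be illustrated with a minimal sketch of a single one-dimensional convolution filter followed by ReLU. The function names, the example signal, and the kernel values are illustrative assumptions and are not taken from the paper; as in most deep learning frameworks, the kernel is applied without flipping.

```python
def relu(v):
    """Rectified linear unit: max(0, v)."""
    return v if v > 0.0 else 0.0


def conv1d_relu(x, kernel, bias=0.0):
    """Slide `kernel` over the sequence `x` (no padding, stride 1) and
    apply ReLU to each result, i.e. f_t = ReLU(W_conv * X_t + b)."""
    k = len(kernel)
    out = []
    for t in range(len(x) - k + 1):
        window = x[t:t + k]
        s = sum(w * v for w, v in zip(kernel, window)) + bias
        out.append(relu(s))
    return out


# Hypothetical normalised readings and a difference-like kernel
# that responds only to rising segments.
signal = [0.0, 1.0, 3.0, 2.0, 0.0]
features = conv1d_relu(signal, kernel=[-1.0, 1.0])
print(features)  # [1.0, 2.0, 0.0, 0.0]
```

With this kernel the filter responds to increases between neighbouring samples, and ReLU zeroes the responses to decreases, which is one way a convolutional layer can pick out a local temporal pattern.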

2.2.2. BiGRU Model

The output feature map f_{t} from the convolutional layer is fed into a bidirectional GRU (BiGRU). The BiGRU layer captures long-term dependencies in the time series and combines them with the local features extracted by the CNN.

h_{t}^{(f)} = \mathrm{GRU}(x_{t}, h_{t-1}^{(f)}), \qquad (2)

h_{t}^{(b)} = \mathrm{GRU}(x_{t}, h_{t+1}^{(b)}), \qquad (3)

h_{t} = [h_{t}^{(f)}, h_{t}^{(b)}], \qquad (4)

In the equations, h_{t}^{(f)} is the output of the forward GRU unit at time step t (i.e., the forward hidden state); x_{t} is the input data at time step t (here, the CNN feature map f_{t}); h_{t-1}^{(f)} is the output of the forward GRU unit at time step t - 1; h_{t}^{(b)} is the output of the backward GRU unit at time step t (i.e., the backward hidden state); h_{t+1}^{(b)} is the output of the backward GRU unit at time step t + 1, which the backward pass has already computed because it reads the sequence from the last time step to the first; GRU is the operation of the gated recurrent unit; and h_{t} is the output of the bidirectional GRU, which is the concatenation of the forward and backward GRU outputs.
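The bidirectional recurrence in Equations (2)-(4) can be sketched with a scalar GRU cell (hidden size 1). This is only an illustration under stated assumptions: the gate equations follow one common GRU convention, the parameter names and values are hypothetical, and a real model would use vector-valued states from a deep learning framework.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x, h_prev, p):
    """One step of a scalar GRU cell with parameters in dict `p`."""
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])  # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])
    return (1.0 - z) * h_prev + z * h_cand


def bigru(xs, p_fwd, p_bwd):
    """Return [h_t^(f), h_t^(b)] for every time step t."""
    h, fwd = 0.0, []
    for x in xs:                  # forward pass: first step to last
        h = gru_step(x, h, p_fwd)
        fwd.append(h)
    h, bwd = 0.0, []
    for x in reversed(xs):        # backward pass: last step to first
        h = gru_step(x, h, p_bwd)
        bwd.append(h)
    bwd.reverse()                 # realign backward states with index t
    return [[f, b] for f, b in zip(fwd, bwd)]


# Hypothetical parameters shared by both directions for the demo.
params = {"wz": 0.5, "uz": 0.5, "bz": 0.0,
          "wr": 0.5, "ur": 0.5, "br": 0.0,
          "wh": 1.0, "uh": 1.0, "bh": 0.0}
states = bigru([0.1, 0.5, 0.1], params, params)
print(len(states), len(states[0]))  # 3 2
```

Each element of `states` is the concatenated pair of Equation (4), so every time step carries information from both the past and the future of the sequence.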

2.2.3. Multi-Head Attention

Self-attention is a key technique in deep learning that dynamically assigns attention weights to different parts of an input sequence, enabling the model to capture correlations between data at various positions. The multi-head attention mechanism not only focuses on a finite set of features within the input sequence but also comprehensively captures information across different positions within the sequence. The framework of the multi-head attention mechanism is shown in Figure 2.

Multi-Head Attention applies Softmax and linear layer transformations to BiGRU outputs, enabling weighted attention across different time steps within the model. This mechanism empowers the model to capture long-term dependencies and critical temporal information.

α_{t}^{(i)} = \frac{\exp((W^{(i)})^{\top} h_{t})}{\sum_{j=1}^{T} \exp((W^{(i)})^{\top} h_{j})}, \qquad (5)

c^{(i)} = \sum_{t=1}^{T} α_{t}^{(i)} h_{t}, \qquad (6)

In the equations, α_{t}^{(i)} represents the attention weight of the i-th attention head at time step t, reflecting the model’s focus on information; W^{(i)} denotes the weight matrix of the i-th attention head; h_{t} is the hidden state output by the bidirectional GRU at time step t, representing the feature representation at the current time step; T is the number of time steps in the sequence; exp is the exponential function, used to amplify larger values and enhance the effectiveness of the Softmax operation; and c^{(i)} is the context vector for the i-th attention head, containing key information relevant to prediction.
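Equations (5) and (6) can be sketched for a small sequence as follows. The sketch assumes that each head scores a hidden state with a single weight vector, so that the score is a scalar; the function names and example values are illustrative and not taken from the paper.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(hidden, w):
    """Score each h_t with w, normalise with softmax (Eq. 5) and
    return the weights and the weighted sum of states (Eq. 6)."""
    scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in hidden]
    alpha = softmax(scores)
    dim = len(hidden[0])
    context = [sum(a * h[d] for a, h in zip(alpha, hidden))
               for d in range(dim)]
    return alpha, context


def multi_head(hidden, head_weights):
    """Return one context vector c^(i) per head."""
    return [attention_head(hidden, w)[1] for w in head_weights]


# Two time steps with two-dimensional hidden states (hypothetical).
hidden = [[1.0, 0.0], [0.0, 1.0]]
alpha, context = attention_head(hidden, [0.0, 0.0])
print(alpha, context)  # [0.5, 0.5] [0.5, 0.5]
```

With an all-zero weight vector every time step receives the same weight; a trained head would instead concentrate weight on the time steps most relevant to the prediction.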

2.2.4. Bayesian Hyperparameter Optimization

Machine learning models typically set fixed hyperparameters before training. Default parameters can easily lead to models being overly complex or overly simplistic. Improper regularization parameters may cause overfitting or underfitting, potentially increasing sensitivity to noise and outliers. Hyperparameter optimization enhances the performance of the CNN-BiGRU-Multi-Head Attention model by systematically searching the hyperparameter space to identify an optimal set of hyperparameters.

λ = [k_{CNN}, d_{BiGRU}, H_{att}, η, B]^{\top}, \qquad (7)

In the equation, λ is the hyperparameter vector to be optimized in this study; k_{CNN} denotes the size of the convolutional kernel in the CNN section; d_{BiGRU} is the dimension of the BiGRU hidden layer; H_{att} represents the number of heads in the multi-head attention component; η is the learning rate; and B is the batch size.
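The hyperparameter vector of Equation (7) can be written down as a search space. The sketch below is deliberately simplified: it uses plain random search as a stand-in for Bayesian optimization, the candidate values and learning-rate range are hypothetical (the paper does not list its search ranges), and `toy_objective` is a placeholder for training the model and returning a validation loss.

```python
import math
import random

# Hypothetical candidate values for the discrete entries of lambda.
SEARCH_SPACE = {
    "k_cnn": [2, 3, 5],          # convolution kernel size
    "d_bigru": [32, 64, 128],    # BiGRU hidden dimension
    "h_att": [2, 4, 8],          # number of attention heads
    "batch_size": [16, 32, 64],  # B
}
LR_RANGE = (1e-4, 1e-2)          # learning rate eta, sampled log-uniformly


def sample_lambda(rng):
    lam = {name: rng.choice(vals) for name, vals in SEARCH_SPACE.items()}
    lo, hi = LR_RANGE
    lam["eta"] = 10 ** rng.uniform(math.log10(lo), math.log10(hi))
    return lam


def search(objective, n_trials, seed=0):
    """Random search: keep the sampled lambda with the lowest loss."""
    rng = random.Random(seed)
    best_lam, best_loss = None, float("inf")
    for _ in range(n_trials):
        lam = sample_lambda(rng)
        loss = objective(lam)
        if loss < best_loss:
            best_lam, best_loss = lam, loss
    return best_lam, best_loss


def toy_objective(lam):
    # Placeholder for "train with lam and return the validation loss".
    return abs(lam["k_cnn"] - 3) + abs(math.log10(lam["eta"]) + 3.0)


best, best_loss = search(toy_objective, n_trials=30)
print(best, best_loss)
```

A Bayesian optimizer differs in how it proposes the next candidate: it fits a surrogate model to the losses observed so far and samples where improvement is most likely, instead of sampling independently as done here.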

2.2.5. Predicted Value Calculation

\hat{y} = W_{fc} \, \mathrm{Pool}(\{c^{(1)}, c^{(2)}, \dots, c^{(H_{att})}\}) + b_{fc}, \qquad (8)

In the equation, \hat{y} is the final predicted value; W_{fc} and b_{fc} represent the weights and biases of the fully connected layer, respectively; H_{att} represents the number of heads in the multi-head attention component; and Pool is the pooling operation.
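As a small worked instance of Equation (8), the sketch below pools the per-head context vectors and applies a linear layer. The paper does not state which pooling operation is used, so mean pooling over heads is an assumption here, and the numbers are illustrative.

```python
def predict(contexts, w_fc, b_fc):
    """y_hat = W_fc . Pool({c^(1), ..., c^(H)}) + b_fc, with Pool taken
    to be the element-wise mean over heads (an assumption)."""
    n_heads = len(contexts)
    dim = len(contexts[0])
    pooled = [sum(c[d] for c in contexts) / n_heads for d in range(dim)]
    return sum(w * p for w, p in zip(w_fc, pooled)) + b_fc


# Two heads with two-dimensional context vectors (hypothetical values).
y_hat = predict([[1.0, 2.0], [3.0, 4.0]], w_fc=[0.5, 1.0], b_fc=0.25)
print(y_hat)  # 4.25
```

Here the pooled vector is [2.0, 3.0], so the prediction is 0.5 × 2.0 + 1.0 × 3.0 + 0.25 = 4.25.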

2.3. Data Processing

2.3.1. Data Sources and Feature Descriptions

The data used in this study were derived from single-well production records of Oilfield B, comprising a total of 2027 daily sampling entries.

Fifteen original variables were utilized in this paper: fourteen input features and one prediction target. The input features consisted of daily process parameters (Daily liquid production, Daily oil production, Daily water production, Daily gas production, Water cut, Oil pressure, Tubing pressure, Wellhead back pressure, Wellhead temperature, Post-choke temperature) and cumulative production metrics (Total gas production, Total oil production, Total water production, Gas-oil ratio). The “Bottomhole pressure” is the target for prediction.

The predictive performance of statistical analysis and machine learning models is highly dependent on data quality. Data analysis can reveal patterns and relationships within the data, helping to understand its structure and thereby optimize model selection and tuning. Through preprocessing, noise removal, and missing value imputation, model stability and accuracy can be enhanced. Simultaneously, data analysis aids in identifying outliers, preventing them from negatively impacting model results. Sound data analysis not only improves model effectiveness but also provides robust support for research.

2.3.2. Data Description

This study conducted meticulous analysis and preprocessing of the data, and preliminary statistical results are shown in Table 1. Skewness is prevalent across all variables, with some variables—such as Daily water production and Wellhead back pressure—exhibiting anomalies or extreme values. To enhance modeling accuracy, the raw data require analysis and appropriate processing.

Some data exhibited significant outliers (the minimum Wellhead back pressure value of −30.64 MPa is inconsistent with actual conditions), necessitating the handling of these outliers.

2.3.3. Box Plot Analysis and Outlier Removal

The statistical characteristics table indicates that some variables contain significant outliers (e.g., minimum Wellhead back pressure of −70.44, extremely skewed Water cut), which may interfere with model training.

In this study, the data were first subjected to outlier removal using box plots (IQR) combined with logical judgment (Figure 3). Following this, a filtering process was performed to eliminate extreme values in key parameters such as Wellhead back pressure, Water cut, and Daily water production, thereby ensuring that only physically consistent samples were retained.
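The box-plot rule combined with a physical check can be sketched as below. This is a minimal illustration, assuming the conventional 1.5 × IQR fences and a non-negativity bound as the logical judgment; the column name and sample values are hypothetical apart from the −30.64 MPa reading mentioned in the text.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Lower and upper box-plot fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def filter_rows(rows, column, physical_min=None):
    """Keep rows whose `column` value lies inside the IQR fences and,
    if given, at or above a physically meaningful minimum."""
    lo, hi = iqr_bounds([row[column] for row in rows])
    if physical_min is not None:
        lo = max(lo, physical_min)
    return [row for row in rows if lo <= row[column] <= hi]


readings = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, -30.64]
rows = [{"wellhead_back_pressure": v} for v in readings]
clean = filter_rows(rows, "wellhead_back_pressure", physical_min=0.0)
print(len(rows), len(clean))  # 7 6
```

In this example the quartiles are 1.05 and 1.35, giving fences of roughly 0.6 and 1.8, so only the negative pressure reading is discarded.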

2.3.4. Feature Standardization

Standardization facilitates the model’s accelerated search for optimal solutions. To eliminate the influence of feature dimensions, continuous variables in this study were standardized using Z-score normalization, resulting in distributions with a mean of 0 and a standard deviation of 1. This ensures that each independent variable exhibits values on a uniform scale.

z = \frac{x - μ}{σ}, \qquad (9)

In the equation, z is the standard score; x is the raw data value; μ is the mean; and σ is the population standard deviation.
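Equation (9) can be sketched in a few lines. The function name and sample values are illustrative; the statistics are returned as well because, in common practice, the mean and standard deviation estimated on the training split are reused to transform the test split.

```python
import statistics


def zscore(values):
    """Standardise `values` with the population mean and standard
    deviation; return the scores together with (mu, sigma)."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values, mu)
    return [(x - mu) / sigma for x in values], mu, sigma


# Hypothetical feature column with mean 5 and standard deviation 2.
scores, mu, sigma = zscore([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(mu, sigma)
print(scores)
```

For this column the first value standardises to (2 − 5) / 2 = −1.5 and the last to (9 − 5) / 2 = 2.0.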

2.3.5. Correlation Analysis and Feature Selection

While the entire dataset underwent cleaning as described above, retaining all fourteen independent variables was excessive for predicting bottomhole pressure. Reducing the number of input parameters can improve model efficiency by mitigating the negative effects of excessive inputs on training, such as prolonged training times and reduced prediction accuracy. Therefore, selecting an appropriate number of independent variables is crucial. To this end, the independent variables were screened using Spearman’s correlation coefficient in this study.

The Spearman correlation diagram revealed strong positive or negative correlations among certain independent variables (Figure 4), as well as between these variables and bottomhole pressure. Among them, variables such as daily water production, water cut, oil pressure, tubing pressure, post-choke temperature, total gas production, total oil production, and total water production exhibited high correlations with the predicted variable (bottomhole pressure). These variables were therefore identified as preliminary candidates for feature selection.

However, strong correlations among independent variables can lead to collinearity and multicollinearity issues in linear machine learning models. If two variables are highly correlated, one variable can predict the other. Highly correlated independent variables provide redundant information for predicting the target output, thereby compromising the predictive capability of the developed model. Therefore, removing some highly correlated independent variables helps reduce dimensionality and improves prediction speed and accuracy.

To further reduce the number of independent variables required for predicting Bottomhole pressure and to limit the impact of multicollinearity on predictions, feature selection was refined further based on Spearman’s correlation coefficient. Information redundancy existed among three features: Total oil production, Total gas production, and Total water production. To ensure the statistical effectiveness of the model, only Total gas production was ultimately selected from these three. The final features chosen for this study were “Total gas production”, “Oil pressure”, “Water cut”, “Daily water production”, “Post-choke temperature”, and “Tubing pressure”.
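Spearman’s coefficient is the Pearson correlation of the ranks of two variables, which can be sketched as follows. The helper names and the sample series are illustrative assumptions; tied values receive their average rank.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def spearman(a, b):
    return pearson(average_ranks(a), average_ranks(b))


# Hypothetical cumulative series: both rise monotonically, so the rank
# correlation is 1 even though the relationship is not linear.
total_gas = [1.0, 2.0, 3.0, 4.0]
total_oil = [1.0, 4.0, 9.0, 16.0]
rho = spearman(total_gas, total_oil)
print(rho)
```

A coefficient near ±1 between two inputs, as in this example, signals the kind of redundancy that motivates keeping only one of the cumulative production features.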

2.3.6. Data Expansion

To address the challenges of sparse effective data and high variability across different production periods under non-steady-state flow and variable operating conditions, a data augmentation method based on sliding window replication with translation and noise augmentation was introduced in this study. The “sliding window replication with translation” technique moves a fixed-size window across the series and replicates each data segment it covers, thereby expanding the coverage of the training set. The core parameters of the translation sliding window method in this study were a window size of 1200 and an overlap step of 200. This process significantly increased the size of the sample set and markedly improved the density of the training data distribution. The workflow of the sliding window is shown in Figure 5.
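The window arithmetic can be sketched as follows. The sketch assumes that the overlap step of 200 is the stride between consecutive window starts and that only full windows are kept; both are interpretations, not details confirmed by the text. The function name is illustrative.

```python
def sliding_windows(n_samples, window=1200, step=200):
    """Return (start, end) index pairs for every full window, with
    `end` exclusive, moving the start forward by `step` each time."""
    return [(s, s + window)
            for s in range(0, n_samples - window + 1, step)]


spans = sliding_windows(2027)
print(spans)
# [(0, 1200), (200, 1400), (400, 1600), (600, 1800), (800, 2000)]
```

Under these assumptions the 2027 daily records yield five overlapping segments of 1200 samples each, so consecutive segments share 1000 samples.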

Small Gaussian noise was added to each input sample as an additive perturbation to enhance model robustness.

x_{aug} = x + N(0, σ^{2}), \qquad (10)

In the equation, x_{aug} denotes the noisy data, i.e., data with added noise; x represents the raw data, which are unperturbed input samples; N(0, σ^{2}) is a Gaussian (normal) distribution with zero mean and variance σ^{2}; and σ represents the noise standard deviation, used to control noise intensity.

During the data augmentation process, the noise intensity is set to 5%. This ratio is determined based on the actual noise level of oilfield data, which is particularly consistent with the measurement error characteristics of downhole and surface sensors and conforms to the actual working conditions of data acquisition in petroleum engineering. In this study, Gaussian noise perturbation with a mean (loc) of 0.0 and a standard deviation (scale) of 0.05 is employed to implement data augmentation. These parameters are optimized and selected through multiple experiments. By applying small-scale perturbations, only weak noise is added to the normalized bottomhole pressure labels. This method can not only effectively expand the dataset and enhance the generalization ability of the model, but also avoid data distortion, excessive augmentation and artifacts caused by an excessively high noise ratio or amplitude. It preserves the temporal distribution characteristics of the raw data to the maximum extent, thereby ensuring the physical authenticity and engineering rationality of the augmented data.
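Equation (10) with the stated settings (mean 0.0, standard deviation 0.05 on normalized values) can be sketched as follows. The function name, the seed argument, and the sample values are illustrative assumptions; the seed is only there to make the example reproducible.

```python
import random


def add_gaussian_noise(values, scale=0.05, seed=None):
    """Return a perturbed copy: x_aug = x + N(0, scale^2) per element."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, scale) for v in values]


# Hypothetical normalised values.
normalised = [-1.2, -0.4, 0.0, 0.7, 1.5]
augmented = add_gaussian_noise(normalised, scale=0.05, seed=42)
largest_shift = max(abs(a - b) for a, b in zip(augmented, normalised))
print(augmented)
print(largest_shift)
```

Because the data are standardized to unit variance, a noise standard deviation of 0.05 corresponds to perturbations of about 5% of one standard deviation of the signal, which keeps the augmented series close to the original.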

3. Results and Discussion

A unified dataset was utilized to train both the proposed model and conventional models (CNN, LSTM, GRU, BiGRU) along with their hybrid variants, and their performance was subsequently compared in this study. A reliable prediction evaluation should rely on test data rather than training data. Most deep learning models perform well on training sets but exhibit error accumulation on test sets. Therefore, the prediction results of each model on the test set are presented in this section to evaluate their generalization capabilities.

Among the basic models, both traditional time series models and convolutional models demonstrated certain convergence capabilities, though with noticeable differences. In the comparison of fusion models, performance disparities between different architectures became even more pronounced. Among the fusion models, the LSTM-GRU and LSTM-BiGRU models exhibited slower loss reduction during training, with higher convergence values in the later stages, accompanied by some degree of fluctuation. This indicated that their modeling capabilities for key features remained insufficient in the highly nonlinear and strongly time-correlated task of wellbore pressure estimation. In contrast, the loss of the CNN-BiGRU model declined faster and more stably and ultimately converged to a lower level. This result demonstrated an effective complementarity between the convolutional network and bidirectional temporal modeling in the extraction of local operating condition features.

The introduction of the multi-head attention mechanism further enhanced the model’s performance. The CNN-BiGRU-Multi-Head Attention model consistently maintained the lowest loss values throughout training and exhibited the smoothest convergence curve. This demonstrated that the multi-head attention mechanism adaptively enhanced focus on critical operating condition features and significant time steps, which effectively suppressed the interference of redundant information during model training. The comparison of loss functions for each model is shown in Figure 6. The horizontal axis represents the training epoch, and the vertical axis represents the loss function value, showing the loss convergence of each model during the training process.

The scatter plots of predicted versus actual bottomhole pressure (BHP) for different models are shown in Figure 7. The horizontal axis represents the actual BHP, and the vertical axis represents the predicted BHP. The blue dashed line is the ideal prediction line (True Line), and the closer the scatter points are to this line, the higher the prediction accuracy of the model.

For the basic models, the scatter points generally followed a diagonal distribution, but the point cloud remained relatively dispersed, with more pronounced deviations at the high and low extremes of the pressure range. This indicated that a single structural model still has limitations in fitting highly nonlinear and strongly time-correlated signals like bottomhole pressure. For the fusion models, the scatter points were generally distributed more closely along the diagonal and were more concentrated. This demonstrated that the fusion architecture exhibited stronger complementarity in feature representation and temporal modeling, which effectively enhanced the consistency and stability of bottomhole pressure predictions. However, differences persisted among the fusion architectures. Some models still exhibited notable dispersion in extreme pressure ranges, indicating room for improvement in capturing complex operational variations. A further comparison of the key models showed that the CNN-BiGRU model produced a more concentrated scatter distribution, with most data points closely aligned along the diagonal, evidencing its superior capability in modeling bottomhole pressure dynamics. With the incorporation of the multi-head attention mechanism, the scatter plot of the CNN-BiGRU-Multi-Head Attention model showed further convergence. Points near the diagonal became denser, and outliers decreased. These changes demonstrated that the multi-head attention mechanism could adaptively amplify key information while suppressing redundant noise, thereby enhancing the model’s overall fitting accuracy and robustness.

To comprehensively compare the bottomhole pressure prediction performance of each model, radar charts (Figure 8) were constructed based on five metrics: the coefficient of determination (R², measuring the goodness of fit of the model; the closer to 1, the better the performance), root mean square error (RMSE, measuring the deviation between predicted and actual values), mean absolute error (MAE, measuring the average absolute value of prediction errors), symmetric mean absolute percentage error (SMAPE, a symmetric indicator of relative errors), and mean directional accuracy (MDA, measuring the accuracy of prediction direction). The specific values of each evaluation metric are presented in Table 2.
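The five metrics can be sketched as follows. SMAPE and MDA have several definitions in the literature and the paper does not state which it uses, so the forms below (SMAPE in percent with the sum of absolute values in the denominator, MDA comparing predicted and actual changes relative to the previous actual value) are assumptions, as are the function name and sample values.

```python
import math


def regression_metrics(y_true, y_pred):
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    sse = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # SMAPE terms; pairs where both values are zero contribute nothing.
    smape_terms = [2.0 * abs(p - t) / (abs(t) + abs(p))
                   for t, p in zip(y_true, y_pred)
                   if abs(t) + abs(p) > 0.0]
    # MDA: share of steps whose predicted move from the previous actual
    # value has the same sign as the actual move.
    hits = sum(1 for i in range(1, n)
               if (y_true[i] - y_true[i - 1])
               * (y_pred[i] - y_true[i - 1]) > 0.0)
    return {
        "r2": 1.0 - sse / ss_tot,
        "rmse": math.sqrt(sse / n),
        "mae": sum(abs(e) for e in errors) / n,
        "smape": 100.0 * sum(smape_terms) / n,
        "mda": hits / (n - 1),
    }


# Hypothetical actual and predicted pressures.
metrics = regression_metrics([10.0, 12.0, 11.0, 13.0],
                             [10.0, 11.0, 12.0, 14.0])
print(metrics)
```

For these values the squared errors sum to 3 against a total sum of squares of 5, giving R² = 0.4, and two of the three steps have the correct direction, giving MDA = 2/3.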

Among the base models, BiGRU demonstrated the best overall performance. Compared to CNN/LSTM/GRU, BiGRU showed superiority in both error and goodness-of-fit, indicating that bidirectional temporal dependency modeling is more effective for signals with strong temporal correlations. In the corresponding radar chart of base models, BiGRU also covered a larger area, reflecting its more balanced comprehensive performance.

The fusion models generally outperformed the base models, indicating that the combined “local feature extraction + temporal modeling” approach could more fully characterize the dynamic variation patterns of bottomhole pressure. Among these, CNN-BiGRU demonstrated the best performance, exhibiting a larger coverage area and a more balanced shape in the radar chart among the fusion models.

Compared to BiGRU, which was the best basic model before fusion, the improvement of CNN-BiGRU was quite evident. This indicates that CNN’s extraction of local patterns within input sequences effectively complements BiGRU’s bidirectional temporal modeling. It should be noted that the fusion effect of LSTM-GRU and LSTM-BiGRU was not ideal. For instance, LSTM-GRU performed close to or even fell short of some baseline models. Although LSTM-BiGRU outperformed LSTM-GRU, it lagged significantly behind CNN-based fusion models. This suggested that merely stacking or fusing recurrent structures yields relatively limited gains for sequences like bottomhole pressure, which exhibit both local fluctuation characteristics and coexisting short- and long-term dependencies.

By introducing multi-head attention to the optimal fusion architecture CNN-BiGRU, the CNN-BiGRU-Multi-Head Attention model achieved the best results across the entire table. Compared to CNN-BiGRU, it delivered significant improvements: MSE decreased by approximately 69.1%, MAE further decreased by about 46.4%, SMAPE decreased by approximately 45.6%, R² increased by 0.0368, and MDA increased by 5.18%. In the radar chart of attention models, it exhibited the largest coverage area, with more “extended” and balanced dimensions across all metrics. This demonstrated that multi-head attention can adaptively enhance the contribution of critical time steps/key features, significantly improving trend judgment capability (MDA) while reducing errors. Consequently, it proved more suitable for predicting bottomhole pressure under complex operating conditions.

These findings indicate that the attention mechanism helps the model capture long-term dependencies and complex features in the data, thereby improving its predictive capability.
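To make the idea of adaptively weighting time steps concrete, the following is a minimal sketch of scaled dot-product attention for a single head, written with the Python standard library only. It is not the model used in this study: there are no learned projection matrices, and the three two-dimensional "hidden states" are made-up numbers standing in for BiGRU outputs. A multi-head layer runs several such heads in parallel on different learned projections and concatenates their outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(query, keys, values):
    """Single-head attention: weights = softmax(q . k / sqrt(d)); context = sum(w * v)."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    context = [
        sum(w * value[j] for w, value in zip(weights, values))
        for j in range(len(values[0]))
    ]
    return weights, context


# Hypothetical hidden states for three consecutive time steps (illustrative values).
states = [[0.1, 0.0], [0.9, 0.8], [1.0, 1.0]]

# Use the last time step as the query and attend over all steps.
weights, context = scaled_dot_product_attention(states[-1], states, states)
print("attention weights per time step:", [round(w, 3) for w in weights])
print("context vector:", [round(c, 3) for c in context])
```

In this toy case the time steps whose states are most similar to the query receive the largest weights, which is the mechanism by which attention emphasizes the time steps most relevant to the current prediction.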

Despite the advantages of the proposed CNN-BiGRU-Multi-Head Attention model, the present work has limitations. First, the model was trained and validated on a single-well dataset comprising only 2027 samples. While this dataset is representative of the target well’s operating conditions, its relatively small size may limit the robustness and generalization capability of the proposed model, particularly because the complexity of the deep learning architecture carries an inherent risk of overfitting. Second, the lack of validation across multiple wells, different reservoir types, and varying operating conditions restricts the broader applicability of the current findings. Future work will expand the dataset to include multi-well, multi-reservoir, and multi-condition data to further verify and enhance the generalization performance of the model.

4. Conclusions

This paper addresses challenges in bottomhole pressure prediction, including sparse samples, strong temporal fluctuations, and high modeling complexity, by proposing a deep prediction method based on data augmentation and multi-scale sliding windows. Key findings are as follows:

Data augmentation strategies, including Gaussian perturbation and sliding window translation, increased the diversity of training samples and improved model generalization without introducing physical anomalies.
The sliding window mechanism effectively balanced short-term disturbances and long-term trends, improving the model’s sensitivity to complex pressure variations.
The CNN-BiGRU-Multi-Head Attention architecture integrated local feature extraction, sequence modeling, and attention-based focusing, enabling more precise modeling of pressure sequences.
The proposed method outperformed the nine comparison models on all metrics in Table 2, with particularly large advantages in RMSE and MAE.
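The two augmentation steps named in the first finding can be sketched as follows. This is a simplified, univariate illustration using only the Python standard library, not the authors' implementation: the noise level `rel_sigma`, the window length, the stride, and the toy pressure values are all hypothetical choices made for the example.

```python
import random


def add_gaussian_noise(series, rel_sigma=0.01, seed=0):
    """Return a copy of `series` perturbed by zero-mean Gaussian noise.

    `rel_sigma` (hypothetical) scales the noise relative to the standard
    deviation of the series, so the perturbation stays small.
    """
    rng = random.Random(seed)
    mean = sum(series) / len(series)
    std = (sum((x - mean) ** 2 for x in series) / len(series)) ** 0.5
    return [x + rng.gauss(0.0, rel_sigma * std) for x in series]


def sliding_windows(series, window=3, stride=1):
    """Slice a series into (past `window` values, next value) training pairs."""
    samples = []
    for start in range(0, len(series) - window, stride):
        inputs = series[start:start + window]
        target = series[start + window]
        samples.append((inputs, target))
    return samples


# Toy bottomhole pressure readings (made-up values for illustration).
pressure = [93.1, 92.8, 92.5, 92.9, 93.4, 93.0, 92.6, 92.2, 92.7, 93.2]

# Windows from the original series plus windows from one noisy copy.
noisy = add_gaussian_noise(pressure, rel_sigma=0.01, seed=42)
samples = sliding_windows(pressure, window=3) + sliding_windows(noisy, window=3)

print(len(samples), "training pairs from", len(pressure), "raw readings")
print("first pair:", samples[0])
```

Here ten raw readings yield seven windows, and adding one perturbed copy doubles that to fourteen training pairs. Keeping the noise small relative to the natural spread of the signal is one way to avoid physically implausible samples.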

Author Contributions

Conceptualization: X.X. and G.Y.; Methodology: X.X. and X.J. (Xuecheng Jiang); Software: X.X., X.J. (Xuecheng Jiang) and G.Y.; Validation: X.J. (Xuecheng Jiang); Formal Analysis: X.X., X.J. (Xuecheng Jiang) and S.L.; Investigation: X.J. (Xuecheng Jiang); Data Curation: X.J. (Xuecheng Jiang) and S.L.; Resources: G.Y. and S.L.; Writing—Original Draft: X.X., X.J. (Xuecheng Jiang) and S.L.; Writing—Review and Editing: X.X., X.J. (Xuecheng Jiang) and S.L.; Visualization: X.J. (Xuecheng Jiang) and S.L.; Supervision: X.X., G.Y. and X.J. (Xujian Jiang); Project Administration: G.Y.; Funding Acquisition: G.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Oil & Gas Major Project, grant number 2025ZD1401502.

Data Availability Statement

With the consent of the corresponding author, data will be provided as needed.

Conflicts of Interest

Author Xujian Jiang was employed by the Research Institute of Petroleum Exploration and Development, Tarim Oilfield Company. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figure 1. The Overall Workflow of Machine Learning Research.

Figure 2. Overall framework of Multi-Head Attention.

Figure 3. Box plot of certain features.

Figure 4. Spearman correlation coefficient heatmap for all features.

Figure 5. Workflow of sliding-window data expansion.

Figure 6. Loss plots for different models.

Figure 7. Cross plot of the actual and predicted values of the ten models.

Figure 8. The comparison of model performance on the test set.

Table 1. Data Description of Relevant Production Characteristics.

Feature	Mean	Standard Deviation	Minimum	First Quartile	Median	Third Quartile	Maximum	Skewness	Kurtosis
Daily liquid production	77.70	18.43	0.90	68.90	78.36	84.26	161.23	0.37	2.05
Daily oil production	76.43	17.60	0.90	68.90	77.62	83.84	155.04	0.28	2.27
Daily water production	1.32	1.84	0.00	0.00	0.00	3.02	23.04	1.69	9.79
Daily gas production	45.42	10.34	0.56	40.32	47.29	50.27	123.86	0.94	8.08
Water cut	1.49	2.02	0.00	0.00	0.00	3.80	23.34	1.42	6.67
Oil pressure	58.44	17.82	0.00	46.59	50.12	82.91	90.90	0.65	−1.10
Tubing pressure	16.39	13.68	0.00	6.77	10.18	28.37	51.31	1.02	−0.38
Wellhead back pressure	15.47	2.43	−30.64	13.68	16.04	17.26	19.70	−4.42	67.62
Wellhead temperature	53.54	9.30	0.00	49.70	53.70	60.00	74.70	−1.70	6.08
Post-choke temperature	41.17	14.01	0.00	28.50	42.00	52.30	73.30	−0.00	−0.90
Total gas production	4.67	2.62	0.00	2.43	4.59	6.94	9.20	−0.03	−1.15
Total oil production	7.94	4.42	0.01	4.28	7.89	11.73	15.48	−0.08	−1.12
Total water production	0.21	0.08	0.00	0.18	0.27	0.27	0.27	−1.30	0.20
Gas-oil ratio	5999.60	667.95	1975.07	6099.67	6100.10	6174.67	10,742.57	−1.68	7.52
Bottomhole pressure	93.63	22.13	0.68	77.79	88.85	120.71	128.72	−0.53	1.50

Table 2. Model performance data on the test set.

Model	RMSE	R²	MAE	SMAPE(%)	MDA(%)
CNN	5.34595	0.8954	4.257	4.75	83.38
LSTM	5.67523	0.8853	4.510	5.03	81.45
GRU	5.18282	0.8993	4.132	4.61	84.38
BiGRU	4.64768	0.9204	3.689	4.13	85.13
CNN-LSTM	4.23495	0.9324	3.399	3.78	85.46
CNN-GRU	4.10117	0.9369	3.260	3.65	85.21
CNN-BiGRU	3.72384	0.9463	2.948	3.31	86.38
LSTM-GRU	5.14049	0.9011	4.078	4.62	83.12
LSTM-BiGRU	4.66329	0.9197	3.669	4.11	82.62
CNN-BiGRU-Multi-Head Attention	2.07094	0.9831	1.580	1.80	91.56
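For readers who want to reproduce the columns of Table 2 on their own predictions, the following sketch computes the five metrics on a toy series. SMAPE and MDA have several variants in the literature; the forms below are common definitions and are an assumption here, not necessarily the exact formulas used in this study. The sample values are made up.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def smape(y_true, y_pred):
    """Symmetric MAPE in percent (assumed form: |error| over the mean of |y| and |y_hat|)."""
    terms = [abs(p - t) / ((abs(t) + abs(p)) / 2.0) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(terms) / len(terms)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def sign(x):
    return (x > 0) - (x < 0)


def mda(y_true, y_pred):
    """Mean directional accuracy in percent (assumed form: predicted move
    relative to the previous actual value must match the actual move)."""
    hits = [
        sign(y_true[i] - y_true[i - 1]) == sign(y_pred[i] - y_true[i - 1])
        for i in range(1, len(y_true))
    ]
    return 100.0 * sum(hits) / len(hits)


# Toy actual and predicted pressures (illustrative values only).
actual = [90.0, 92.0, 91.0, 95.0]
predicted = [89.0, 93.0, 92.0, 94.0]

for name, fn in [("RMSE", rmse), ("R2", r2), ("MAE", mae), ("SMAPE(%)", smape), ("MDA(%)", mda)]:
    print(f"{name:9s} {fn(actual, predicted):.4f}")
```

RMSE, MAE, and SMAPE measure error magnitude, R² measures explained variance, and MDA measures how often the predicted direction of change is correct, which is why a model can improve on one group of metrics without improving on the other.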

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
