2.1. Simulation of Fracturing Process
The complex oilfield environment poses many challenges to the collection of fracturing operation data. On the one hand, equipment deployment is difficult and maintenance costs are high; on the other hand, strict safety and environmental protection regulations prevent data collection from being carried out frequently. In addition, problems such as fragmentation and inconsistent formats of historical data further exacerbate the shortage of available data. To obtain sufficient usable data, fracturing simulation software can be used to simulate the fracturing operation process and generate data such as fracture parameters and SRV. Conventional fracturing software, such as StimPlan and MFrac Suite, often adopts a linear elastic deformation mechanism, which frequently causes the fracture parameters derived from software simulations to deviate from actual measured values. In contrast, the GOHFER software simulates the fracturing process based on physical assumptions that better align with real-world rock failure and fracture propagation, such as the shear-slip decoupled fracturing mechanism and the process zone stress (PZS) fracture criterion. Compared with the linear elastic deformation mechanism used in traditional fracturing simulation software, the shear-slip decoupled fracturing mechanism more realistically depicts the dynamic coupling between rock shear deformation and sliding separation during fracture propagation; especially in heterogeneous reservoirs or complex stress fields, it can effectively capture irregular fracture paths and morphological changes that the linear elastic mechanism struggles to reflect. The PZS fracture criterion breaks through the simplified assumption that relies solely on stress concentration at the fracture tip: it accounts for the stress distribution of the process zone around the fracture tip and judges fracture propagation conditions based on a nearly constant stress threshold within this zone. This better conforms to the mechanical response of the process zone during actual rock fracture and can more accurately quantify the critical states of fracture initiation and propagation, particularly in ductile rocks or dynamic fracturing scenarios. As a result, GOHFER can more precisely simulate fracture propagation, pressure response, and fluid flow behavior. Additionally, GOHFER has an advantage in fracture height control accuracy, mainly because it fully accounts for multiple fracture-height constraint mechanisms, including interlayer slip, natural fractures, and plastic deformation. In conclusion, this study selects the GOHFER software to simulate the fracturing operation process.
This study conducts fracturing simulation based on on-site geological, engineering, and other actual data from Block Ma 2 of the Mabei Oilfield, located at the northwestern margin of the Junggar Basin. GOHFER simulates the fracturing operation process through the following steps. First, based on the basic parameters of the T1b2 interval of the Baikouquan Formation in Block Ma 2, such as an average porosity of approximately 8%, an average permeability of about 0.5 mD, and a reservoir pressure of around 52 MPa, combined with actual logging data from a fractured well (Well X) in this interval, the formation rock mechanical parameters, including Young's modulus, Poisson's ratio, and fracture toughness, are obtained through inversion using elastic wave theory formulas, thereby constructing a 3D geological model of the target interval. Second, we integrate information such as wellbore trajectory, perforation cluster positions, and completion structure to establish a production well model. Subsequently, we input on-site pumping data such as fracturing fluid type, pumping rate, and pressure to drive the software's fluid–structure interaction numerical simulation engine [22,23]. In addition, GOHFER fully considers the heterogeneity of the geological model and the dynamic interference caused by the stress shadow effect between perforation clusters [24,25]. After the simulation of the fracturing process is completed, the software outputs parameters such as SRV, fracture geometric parameters, and production performance for each fractured interval. By simulating multiple combinations of geological conditions, well conditions, and pumping schedules, the software generates 12,000 sets of data including fracture parameters and SRV, effectively addressing the issue of insufficient on-site data.
2.2. Parameter Sensitivity Analysis
Since hydraulic fracture propagation is a complex multi-factor coupled process involving reservoir geological conditions and fracturing operation parameters, it is comprehensively influenced by geological parameters such as Young's modulus, formation water saturation, and reservoir temperature, as well as engineering parameters such as pumping pressure, pumping rate, and sand concentration. If all potential parameters are directly incorporated into the model inputs, the curse of dimensionality is likely to arise. This not only increases the computational complexity of model training and extends the convergence time but may also lead to overfitting due to interference from redundant features, thereby reducing the prediction accuracy and generalization ability for fracture propagation. Therefore, it is necessary to conduct parameter sensitivity analysis to identify the key input features of the model. In addition, since SRV is largely determined by fracture geometric parameters such as fracture length and fracture width, the sensitivity analysis is carried out only on fracture geometric parameters, so as to screen out the key parameters that significantly influence them.
The parameter sensitivity analysis in this study is conducted based on the single-factor control variable method combined with the GOHFER software. In each analysis, only the target input parameter is set as a variable, while the remaining parameters are fixed to the on-site measured values of Block Ma 2 in the Mabei Oilfield. By observing the variation amplitude of the fracture geometric parameters output by the GOHFER software, the sensitivity of fracture propagation to this parameter is evaluated, and the key input parameters are finally screened out. Based on the on-site hydraulic fracturing operation experience in Block Ma 2 of the Mabei Oilfield, 10 representative candidate parameters were initially screened, covering two major categories (geological and engineering). The details are shown in Table 1.
Sensitivity analysis was conducted on these candidate parameters, with the specific steps as follows. For each candidate parameter, 10 gradient levels were set, uniformly distributed over a range consistent with on-site practical conditions; the gradients of all parameters were set according to this principle. The 10 gradient values of a given target parameter were input sequentially into the GOHFER software while the remaining parameters were held constant. Each gradient level corresponded to one independent hydraulic fracturing simulation experiment; that is, 10 simulation experiments were required for the sensitivity analysis of each parameter, so a total of 100 simulation experiments were needed to cover all parameters. The fracture length, fracture width, and fracture height output from each experiment were recorded, and their variation patterns were analyzed to quantify the sensitivity of fracture propagation to each candidate parameter.
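As a sketch of this one-at-a-time sweep, the following Python snippet illustrates the logic; `run_gohfer_simulation` is a hypothetical stand-in for a configured GOHFER run (which in practice is driven through the software's own interface), and the parameter names, baseline values, and ranges are illustrative assumptions rather than field data.

```python
import numpy as np

# Hypothetical stand-in for one GOHFER simulation run; the dummy body only
# keeps the sketch executable. A real run returns fracture geometry outputs.
def run_gohfer_simulation(params):
    return {"length_m": 0.0, "width_mm": 0.0, "height_m": 0.0}

# Baseline fixed at the on-site measured values of Block Ma 2
# (numbers here are placeholders, not field data).
baseline = {"youngs_modulus_GPa": 25.0, "pumping_rate_m3_min": 10.0,
            "sand_concentration_kg_m3": 300.0}

# 10 uniformly spaced gradient levels per candidate parameter (assumed ranges).
candidate_grids = {
    "youngs_modulus_GPa": np.linspace(15.0, 40.0, 10),
    "pumping_rate_m3_min": np.linspace(6.0, 14.0, 10),
}

sensitivity = {}
for name, grid in candidate_grids.items():
    # Vary one parameter at a time; all other parameters stay at baseline.
    outputs = [run_gohfer_simulation({**baseline, name: float(v)}) for v in grid]
    lengths = np.array([o["length_m"] for o in outputs])
    # The variation amplitude of each geometry output measures sensitivity.
    sensitivity[name] = lengths.max() - lengths.min()
```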
2.3. Data Processing
Pumping time-series data such as pumping pressure, pumping rate, and sand concentration generated during fracturing operations serve as the key link connecting fracturing operation parameters and the physical process of fracture propagation. Therefore, it is necessary to collect pumping data, which can be done by deploying high-precision sensors such as pressure transducers, flow meters, and densitometers at key locations, including pump outlets, wellhead manifolds, and fracturing pipelines. During on-site fracturing operations, accidents such as pump group vibration, equipment failure, signal transmission interruption, and construction operation errors may occur, leading to discontinuous data collection with missing values or outliers. If the field-collected data are used directly to train the neural network model, the model will learn pseudo-features, causing prediction results to deviate from reality. Therefore, preprocessing of the original data is required to ensure data integrity and accuracy. This paper employs the workflow shown in Figure 1 to preprocess the original data: first, the k-nearest neighbor (KNN) algorithm is used to fill missing values to ensure data integrity; second, the 3-sigma method is adopted to identify outliers to guarantee data accuracy; finally, the binomial feature transformation method is used to generate feature combinations to enhance data diversity. This study conducts data preprocessing based on actual fracturing data collected from multiple fractured wells in Block Ma 2 of the Mabei Oilfield. The pumping data processed by the KNN algorithm and the 3-sigma method in Figure 1 are all drawn from on-site pumping data in this block. Furthermore, in the process of generating feature combinations via the binomial feature transformation method shown in Figure 1, the features involved, such as x and y, each correspond to specific types of on-site pumping data, e.g., pumping pressure and pumping rate.
The KNN algorithm is primarily used for classification and is a supervised learning algorithm. Its basic principle is to classify new samples by calculating the distance between samples, essentially making category judgments based on similarity in the sample space [26]. When using the KNN algorithm to fill missing values in pumping data, it is first necessary to screen samples with complete features (such as pumping pressure, pumping rate, and sand concentration) from the original data as the known reference dataset, while the remaining samples with missing values are marked as the target dataset to be filled. Then, the distance between the target sample and the reference samples is calculated; the Euclidean distance is typically used, evaluated only over the non-missing feature dimensions of the target sample. A smaller distance indicates greater similarity between samples, which quantifies the degree of difference between the target sample and the reference samples. The specific calculation process is as follows:

$$d(x, y) = \sqrt{\sum_{i}\left(x_i - y_i\right)^{2}}$$

where $x_i$ and $y_i$ represent the i-th feature value of the target sample and the reference sample, respectively, and the sum runs only over the non-missing feature dimensions. For instance, when the pumping pressure value of the target sample is missing, the distance to the reference sample is calculated based solely on pumping rate and sand concentration, ignoring the pumping pressure feature.
Finally, a reasonable K value needs to be determined; the basic process is shown in Figure 2. The optimal K value can be found effectively through cross-validation and grid search, as follows. First, approximately 10% of missing values are randomly introduced into the known reference data to simulate actual missing scenarios. Next, multiple candidate K values are selected. The selection of K values should give priority to odd numbers to avoid voting ties when samples have the same distance, and it needs to balance algorithm stability and feature-capture ability: if K is too small, the algorithm is susceptible to local noise; if K is too large, the data will be over-smoothed and local features neglected [27]. Then, 5-fold cross-validation is performed for each candidate K value. The reference dataset containing simulated missing values is divided into 5 equal parts; each time, one part is selected as the validation set and the remaining 4 parts form the training set. For each candidate K value, the KNN model is trained using the training set, the trained model is used to fill the missing values in the validation set, and the mean squared error (MSE) between the filled values and the true values is calculated. After completing the 5-fold cross-validation, the average MSE of each K value across all folds is computed, and the K value with the smallest average MSE is selected as the optimal value for filling actual missing values. After completing the above steps, the missing values in the actual target dataset are filled. For each target sample, K nearest-neighbor samples are selected from the known reference dataset based on the Euclidean distance and the optimal K value. If the target sample has only one missing feature, for example only the pumping pressure value is missing, the pumping pressure values of these K nearest neighbors are extracted, weights are assigned based on the distances between samples (the closer the distance, the higher the weight), and the filling value is calculated through weighted averaging. If the target sample has multiple missing features, the above process is performed separately for each missing feature.
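A minimal sketch of this imputation workflow, using scikit-learn's KNNImputer (which computes Euclidean distances over non-missing dimensions and supports distance-weighted averaging) on synthetic placeholder data, might look as follows:

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
# Complete reference samples: columns = pumping pressure, rate, sand conc.
# (synthetic placeholder data; field data would be used in practice).
X = rng.normal(loc=[50.0, 10.0, 300.0], scale=[5.0, 1.0, 30.0], size=(500, 3))

def mask_entries(data, frac=0.10, seed=1):
    """Randomly hide ~frac of entries to simulate missing values."""
    out = data.copy()
    m = np.random.default_rng(seed).random(out.shape) < frac
    m[m.all(axis=1), 0] = False   # keep at least one observed feature per row
    out[m] = np.nan
    return out, m

best_k, best_mse = None, np.inf
for k in [3, 5, 7, 9, 11]:                          # odd candidates avoid ties
    fold_mse = []
    for tr, va in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
        X_va_masked, mask = mask_entries(X[va])
        # Distance-weighted KNN imputation; distances are computed only over
        # non-missing feature dimensions, matching the equation above.
        imp = KNNImputer(n_neighbors=k, weights="distance").fit(X[tr])
        filled = imp.transform(X_va_masked)
        fold_mse.append(np.mean((filled[mask] - X[va][mask]) ** 2))
    if np.mean(fold_mse) < best_mse:
        best_k, best_mse = k, float(np.mean(fold_mse))

# The real target dataset is then filled with the selected K, e.g.:
# filled = KNNImputer(n_neighbors=best_k, weights="distance").fit(X).transform(X_target)
```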
For individual values in the pumping data that deviate significantly from the others, the 3-sigma method is used for identification and processing. The 3-sigma method leverages the properties of the normal distribution in statistics (approximately 99.7% of data falls within the µ ± 3σ range) to screen for outliers quickly [28,29]. The first step is to calculate the mean of each parameter in the pumping data to reflect its central tendency, and then its standard deviation. Taking pumping pressure as an example (other parameters such as pumping rate and sand concentration are calculated similarly), the specific calculation process is as follows:

$$\mu_p = \frac{1}{n}\sum_{i=1}^{n} p_i, \qquad \sigma_p = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(p_i - \mu_p\right)^{2}}$$

where n represents the number of collected pumping pressure data points, and $p_i$ and $\mu_p$ denote the i-th pumping pressure value and the mean of all collected pumping pressures, respectively.
As shown in Figure 3, taking the pumping pressure parameter as an example, the data used are obtained from the on-site measured pumping pressure sequence of the Mabei Oilfield. According to the 3-sigma principle, normal pumping pressure values should fall within the interval [µp − 3σp, µp + 3σp], and data points outside this interval are identified as outliers. Similarly, outliers in pumping rate and sand concentration are identified using the intervals [µq − 3σq, µq + 3σq] and [µc − 3σc, µc + 3σc], respectively. Each parameter's data sequence is traversed, and each data point is compared with its corresponding normal interval; if the point falls outside the interval, it is marked as an outlier. Each outlier is replaced by the average of the two nearest data points in the time series.
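A minimal sketch of this screening-and-replacement step, applied to a synthetic pumping pressure sequence with an injected spike, could be:

```python
import numpy as np

def replace_3sigma_outliers(series):
    """Flag points outside [mu - 3*sigma, mu + 3*sigma] and replace each one
    with the average of its two nearest neighbors in the time series."""
    x = np.asarray(series, dtype=float).copy()
    mu, sigma = x.mean(), x.std()
    outliers = np.flatnonzero(np.abs(x - mu) > 3.0 * sigma)
    for i in outliers:
        left = x[i - 1] if i > 0 else x[i + 1]      # handle sequence edges
        right = x[i + 1] if i < len(x) - 1 else x[i - 1]
        x[i] = 0.5 * (left + right)
    return x

# Example on a synthetic pumping-pressure sequence (placeholder data).
pressure = 50 + 2 * np.sin(np.linspace(0, 6, 200))
pressure[120] = 90.0                                # artificial outlier
cleaned = replace_3sigma_outliers(pressure)
```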
In fracturing operations, parameters such as pumping pressure, pumping rate, and sand concentration are interrelated. Their combined relationships, such as the product of pumping pressure and pumping rate or the square of pumping pressure, contain key information about fracture propagation. Fracture propagation is inherently a nonlinear, multi-factor coupled process. By applying binomial feature transformation to generate power and interaction terms from the original features, these complex coupling relationships can be quantified into new features, thereby more effectively revealing the underlying physical laws of fracture propagation. Binomial feature transformation mainly includes univariate binomial, bivariate binomial, and full combinatorial binomial feature transformations [30]. Among them, univariate binomial feature transformation highlights the nonlinear variation law of a single feature through power operations. Bivariate binomial feature transformation captures the synergistic effect between features through the product of two features. Full combinatorial binomial feature transformation integrates the advantages of the previous two methods, including both the self-combination of single features and the cross-combination of pairs of features, comprehensively considering the nonlinear relationships of features and the synergistic effects between them. However, univariate binomial feature transformation only focuses on the nonlinear variations in a single feature and fails to capture the interaction effects between different features, while full combinatorial binomial feature transformation tends to cause excessive expansion of feature dimensions and increased computational complexity. By contrast, bivariate binomial feature transformation can effectively capture the interaction effects between features at a lower computational cost. Therefore, this paper selects bivariate binomial feature transformation to perform feature combination on the original data. As shown in Figure 4, taking pumping pressure P, pumping rate Q, and sand concentration C as examples, 9 features can be generated through this method: {P, Q, C, P², Q², C², PQ, PC, QC}. Among them, P² highlights the impact of extreme pumping pressure on fracture propagation by amplifying the extreme-value effect of pumping pressure. Q² enhances the difference between high and low pumping-rate regions; for example, an excessive pumping rate may cause fracture diversion and generate complex fractures, while a low pumping rate may impair proppant-carrying capacity. C² enhances the nonlinear response in high sand-concentration regions, highlighting the abrupt increase in sand-plugging risk caused by instantaneous high sand concentration. PQ reflects the energy input intensity, embodying the role of high pumping rate and high pumping pressure in synergistically accelerating fracture propagation. PC represents the pumping pressure cost of the proppant-carrying process; an abnormal pressure increase under high sand concentration is a key warning signal of sand-plugging risk. QC reflects the matching degree between pumping rate and sand concentration, affecting proppant placement and support efficiency in fractures. In this way, binomial feature transformation can, without increasing the amount of original data, use mathematical combinations to excavate the interaction effects and nonlinear relationships between features, expanding the representation capability of the feature space.
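For illustration, scikit-learn's PolynomialFeatures with degree 2 and no bias term reproduces exactly this 9-feature set:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One row per time step: [P, Q, C] (placeholder values).
PQC = np.array([[55.0, 10.0, 300.0],
                [56.5, 10.2, 320.0]])

# Degree-2 expansion without a bias column yields the 9-feature set
# {P, Q, C, P^2, PQ, PC, Q^2, QC, C^2}.
poly = PolynomialFeatures(degree=2, include_bias=False)
features = poly.fit_transform(PQC)
print(poly.get_feature_names_out(["P", "Q", "C"]))
# ['P' 'Q' 'C' 'P^2' 'P Q' 'P C' 'Q^2' 'Q C' 'C^2']
```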
Due to the significant differences in the dimensions and value ranges of parameters such as pumping pressure, pumping rate, and sand concentration, the Z-score standardization method is adopted to process the feature data. This method transforms the original features into a standard normal distribution with a mean of 0 and a standard deviation of 1. The processing procedure is as follows:

$$z = \frac{x - \mu}{\sigma}$$

where x represents the original feature value, which can be the value of pumping pressure, pumping rate, or sand concentration; µ represents the mean of the original feature; and σ represents its standard deviation.
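Continuing the previous snippet, the standardization reduces to one column-wise operation (scikit-learn's StandardScaler is equivalent):

```python
# Column-wise Z-score: z = (x - mu) / sigma, applied here to the 9-column
# feature matrix `features` from the binomial-transformation sketch above.
features_std = (features - features.mean(axis=0)) / features.std(axis=0)
```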
2.4. Pumping Time-Series Feature Extraction Method Based on Wavelet Transform
We adopt the wavelet transform time-frequency analysis method to extract the time-series features of pumping data. The wavelet transform decomposes the original signal into different time-frequency domains by stretching and translating the wavelet basis function, so as to effectively extract signal features [31]. The basic formula is as follows:

$$W(a, b) = \frac{1}{\sqrt{a}}\int_{-\infty}^{+\infty} x(t)\,\Psi^{*}\!\left(\frac{t - b}{a}\right)dt$$

where x(t) is a signal of parameters such as pumping pressure, pumping rate, and sand concentration changing with time; a is the scale factor, which controls the stretching of the wavelet and corresponds to frequency analysis; b is the translation factor, which controls the translation of the wavelet along the time axis and corresponds to time localization; Ψ(t) is the wavelet basis function, from which different wavelet functions are generated through scale stretching a and time translation b to analyze the signal; and W(a, b) is the wavelet transform coefficient, reflecting the similarity between the original signal and the wavelet basis function at different time-frequency positions.
Common wavelet basis functions include Daubechies, Symlets, and Coiflets wavelets, which should be selected adaptively based on the characteristics of the pumping data [32,33,34]. Among them, Daubechies wavelets tend to cause phase distortion due to their asymmetry, leading to misalignment of mutation points in pumping data. Symlets wavelets, with approximate symmetry, can not only effectively suppress phase distortion and accurately preserve the positional information of mutation points but also retain the feature correlations of parameters when decomposing multiparameter-coupled pumping curves. Coiflets wavelets are suitable for the sparse representation of smooth signals and the extraction of slowly varying features. Considering that pumping data contain abrupt changes, Symlets wavelets are selected for analysis. After the wavelet basis function is determined, the decomposition level J needs to be set. If the decomposition level is too small, it is difficult to fully extract the multi-scale features of the signal; if it is too large, the computational burden increases and redundant information may be introduced. Here, a decomposition level of J = 3 is selected to balance feature completeness and computational cost.
Using the selected Symlets wavelet basis function and decomposition level, discrete wavelet decomposition is performed on the pumping data. Taking the pumping pressure signal as an example, decomposition by the Mallat algorithm gives:

$$P(t) = A_3 + D_3 + D_2 + D_1$$

where $A_3$ is the third-layer approximation coefficient, reflecting the overall trend of the pumping pressure signal, and $D_1$, $D_2$, and $D_3$ are the first-, second-, and third-layer detail coefficients, corresponding to the variation details of the pumping pressure in different frequency ranges, respectively. The pumping rate and sand concentration signals are decomposed similarly to obtain their respective approximation and detail coefficients.
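This decomposition can be sketched with the PyWavelets library; the Symlets order (sym4 below) and the synthetic signal are assumptions, since the text specifies only the wavelet family and the level J = 3:

```python
import numpy as np
import pywt

# Synthetic pumping-pressure signal with an abrupt step (placeholder data).
t = np.linspace(0.0, 1.0, 1024)
pressure = 50 + 2 * np.sin(2 * np.pi * 5 * t)
pressure[512:] += 5.0                       # mutation point

# Level-3 discrete wavelet decomposition (Mallat algorithm), Symlets basis.
cA3, cD3, cD2, cD1 = pywt.wavedec(pressure, wavelet="sym4", level=3)
# cA3 ~ overall trend (A3); cD1..cD3 ~ detail coefficients per frequency band.
```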
Statistical features, including the mean, variance, energy, and entropy, are calculated for the wavelet coefficients of each layer. The mean reflects the central tendency of the coefficients, corresponding to the average intensity of the signal in the frequency band. The variance characterizes the dispersion degree of the coefficients, corresponding to the fluctuation amplitude of the signal in the frequency band. The energy represents the characteristic intensity of the frequency band. The entropy reflects the complexity and uncertainty of the coefficients; the larger the entropy, the more complex the signal variations in the corresponding frequency band. The specific calculation methods are shown in Equations (6)–(9):

$$\mu_{c_j} = \frac{1}{N_j}\sum_{i=1}^{N_j} c_{j,i} \quad (6)$$

where $N_j$ is the number of data points of the j-th layer coefficients, $c_{j,i}$ is the i-th coefficient value of the j-th layer, and $\mu_{c_j}$ is the mean value of the j-th layer coefficients.

$$\sigma_{c_j}^{2} = \frac{1}{N_j}\sum_{i=1}^{N_j}\left(c_{j,i} - \mu_{c_j}\right)^{2} \quad (7)$$

where $\sigma_{c_j}^{2}$ represents the variance of the j-th layer coefficients.

$$E_{c_j} = \sum_{i=1}^{N_j} c_{j,i}^{2} \quad (8)$$

where $E_{c_j}$ represents the energy of the j-th layer coefficients, corresponding to the energy proportion of the signal in the frequency band; the greater the energy, the more significant the features of that band.

$$H_{c_j} = -\sum_{i=1}^{N_j} E_{j,i}\ln E_{j,i}, \qquad E_{j,i} = \frac{c_{j,i}^{2}}{E_{c_j}} \quad (9)$$

where $E_{j,i}$ is the energy proportion of the i-th coefficient in the j-th layer and $H_{c_j}$ represents the entropy of the j-th layer coefficients.
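A compact sketch of these per-layer statistics, applied to the coefficients from the decomposition above (the natural logarithm in the entropy is an assumption, as the base is not stated):

```python
import numpy as np

def coeff_stats(c):
    """Mean, variance, energy, and energy-based entropy of one coefficient
    layer, following Eqs. (6)-(9)."""
    c = np.asarray(c, dtype=float)
    mean, var = c.mean(), c.var()
    energy = np.sum(c ** 2)
    p = (c ** 2) / energy                   # per-coefficient energy proportion
    entropy = -np.sum(p[p > 0] * np.log(p[p > 0]))
    return mean, var, energy, entropy

# Apply to every layer of the decomposition above to build the feature vector.
features = [s for layer in (cA3, cD3, cD2, cD1) for s in coeff_stats(layer)]
```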
Based on the time-series features of pumping pressure, pumping rate, and sand concentration extracted by the wavelet transform, the fracture evolution stages throughout the pumping process, such as fracture initiation, stable propagation, and fracture closure after pump shutdown, can be identified. After these features are input into the neural network model, the model identifies fracture propagation moments, captures the fluctuation characteristics of fracture propagation, and predicts the fracture propagation status in real time.
2.5. The LSTM Model Integrated with Attention Mechanism
Long Short-Term Memory (LSTM) is a special type of Recurrent Neural Network (RNN). By introducing gating mechanisms such as the forget gate, input gate, and output gate to control the flow of information, it solves the vanishing and exploding gradient problems of traditional RNNs and performs well in time-series data processing and prediction tasks [35]. However, when processing the complex time-series features of pumping data, LSTM struggles to automatically identify discriminative features that are highly relevant to critical fracture propagation stages, specifically initiation points and turning points [36,37]. To address this, this study proposes an LSTM-based method fused with an attention mechanism. The attention mechanism assigns weights to different features, strengthens the representation of critical features, and thereby improves the model's prediction accuracy for the fracture propagation status.
The structure of the preliminarily constructed LSTM model integrated with an attention mechanism is shown in Figure 5. The model adopts a five-layer architecture, comprising an input layer, an LSTM layer, an attention mechanism layer, a fusion layer, and an output layer [38].
The input layer receives the multi-dimensional time-series features of pumping pressure, pumping rate, sand concentration, etc., extracted by the wavelet transform, and employs a dynamic batch normalization mechanism to support real-time data streams input in batches by time step. The LSTM layer adopts a bidirectional structure, which simultaneously captures the historical dependencies and future information of the time-series features through forward and backward neurons; meanwhile, a Dropout layer is embedded between layers to suppress overfitting by randomly discarding the outputs of some neurons. The LSTM layer processes the features received from the input layer and outputs a hidden state at each time step, forming a sequence. Since the dimensionality and feature distribution of this hidden state sequence may not be directly adapted to the computational requirements of the attention mechanism, the attention mechanism layer first performs spatial dimension mapping on the sequence through a fully connected layer. This fully connected layer contains a learnable weight matrix and bias term and maps each hidden state from the LSTM feature space to the preset attention space through a linear transformation. This mapping not only adapts the feature dimensionality but also, through learning of the weight matrix, selectively strengthens the feature components in the hidden state that relate to the key stages of fracture propagation. For example, in the fracture initiation stage, the pumping pressure rises sharply and reaches the first peak within the construction cycle. In the stable propagation stage, the pumping pressure drops to a stable range with small fluctuations, the pumping rate maintains continuous and stable output to match the fracture propagation demand, and the sand concentration increases stepwise to the design value and remains stable to support the fracture. In the fracture closure stage, the pumping rate decreases gradually and finally drops to zero, while the pumping pressure attenuates gradually as the pumping rate decreases. The mapped feature vector is calculated as follows:

$$u_i = W_a h_i + b_a$$

where $u_i$ is the mapped feature vector, $W_a$ represents the learnable weight matrix of the fully connected layer, $h_i$ stands for the original hidden state output by the LSTM at the i-th time step, and $b_a$ indicates the bias term of the fully connected layer.
After the spatial mapping is completed, the attention scores are calculated. The core logic is to quantify the importance of the features at each time step in the fracture propagation process based on the mapped feature vectors. The initial scores are obtained by measuring the similarity between the mapped feature vectors and a learnable query vector, and the attention weights are then generated by normalizing the initial scores over all time steps. In the fracture propagation scenario, these weights intuitively reflect the contribution of the features at each time step to the overall evolution process. The attention scores and attention weights are calculated as follows:

$$e_i = v_q^{\top}\tanh\left(W_q u_i\right)$$

$$\alpha_i = \frac{\exp(e_i)}{\sum_{j=1}^{T}\exp(e_j)}$$

where $e_i$ represents the initial attention score at the i-th time step, $v_q$ represents the learnable query vector, $W_q$ represents the learnable weight matrix, and tanh serves as the activation function; $\alpha_i$ represents the attention weight at the i-th time step, exp denotes the exponential function, which amplifies differences in the scores while keeping the weights positive, and T represents the total number of time steps.
The attention mechanism layer then performs a weighted summation of the original hidden states output by the LSTM layer with the corresponding attention weights to generate a context vector. Through this weight assignment, the vector selectively aggregates the core features of the key stages of fracture propagation, enabling the context vector to characterize the laws of fracture propagation more accurately. The context vector is calculated as follows:

$$c = \sum_{i=1}^{T} \alpha_i h_i$$

where c represents the context vector, $\alpha_i$ represents the attention weight at the i-th time step, and $h_i$ stands for the original hidden state output by the LSTM.
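A minimal PyTorch sketch of this attention layer, implementing the three equations above, could read as follows; the attention dimension and module name are illustrative assumptions:

```python
import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    """Additive attention over LSTM hidden states:
    u_i = W_a h_i + b_a;  e_i = v_q^T tanh(W_q u_i);
    alpha = softmax(e);  c = sum_i alpha_i h_i."""
    def __init__(self, hidden_dim: int, attn_dim: int = 64):
        super().__init__()
        self.map = nn.Linear(hidden_dim, attn_dim)      # W_a, b_a
        self.Wq = nn.Linear(attn_dim, attn_dim, bias=False)
        self.vq = nn.Linear(attn_dim, 1, bias=False)    # query vector v_q

    def forward(self, h):                   # h: (batch, T, hidden_dim)
        u = self.map(h)                     # mapped feature vectors u_i
        e = self.vq(torch.tanh(self.Wq(u)))            # scores e_i: (B, T, 1)
        alpha = torch.softmax(e, dim=1)     # attention weights over time steps
        context = (alpha * h).sum(dim=1)    # context vector c: (B, hidden_dim)
        return context, alpha.squeeze(-1)
```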
The fusion layer introduces a gating mechanism to adaptively adjust the fusion weights of the LSTM hidden state sequence and the context vector through a dynamic learning strategy. The feature vector generated after weighted fusion contains both the time-series features extracted by the LSTM and the key information focused by the attention mechanism, thereby deeply analyzing the intrinsic correlation between pumping time-series features and dynamic fracture propagation [39,40]. Since the prediction of fracture propagation is a regression task, the output layer uses a zero-centered Tanh activation function. Compared with other activation functions such as Sigmoid, Tanh limits the output to the range of −1 to 1, mitigates vanishing gradients, accelerates training convergence, and improves model training efficiency.
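Putting the pieces together, a hedged sketch of the five-layer architecture might look as follows; it reuses the TemporalAttention module above, and the layer sizes, dropout ratio, and sigmoid gating form are assumptions for illustration. The Tanh output presumes targets scaled to [−1, 1].

```python
class AttentionLSTMSurrogate(nn.Module):
    """Sketch of input -> BiLSTM -> attention -> gated fusion -> Tanh output."""
    def __init__(self, n_features: int, hidden: int = 36, n_outputs: int = 4):
        super().__init__()
        self.norm = nn.BatchNorm1d(n_features)          # dynamic batch norm
        self.lstm = nn.LSTM(n_features, hidden, num_layers=2,
                            batch_first=True, bidirectional=True, dropout=0.2)
        self.attn = TemporalAttention(hidden_dim=2 * hidden)
        self.gate = nn.Linear(4 * hidden, 2 * hidden)   # fusion gate weights
        self.head = nn.Sequential(nn.Linear(2 * hidden, n_outputs), nn.Tanh())

    def forward(self, x):                   # x: (batch, T, n_features)
        x = self.norm(x.transpose(1, 2)).transpose(1, 2)
        h, _ = self.lstm(x)                 # hidden-state sequence (B, T, 2H)
        context, _ = self.attn(h)           # attention-focused context vector
        last = h[:, -1, :]                  # summary of the LSTM sequence
        g = torch.sigmoid(self.gate(torch.cat([last, context], dim=-1)))
        fused = g * last + (1 - g) * context            # gated fusion
        return self.head(fused)             # scaled fracture params / SRV
```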
2.6. Model Training
The dataset is divided into a training set and a test set at an 8:2 ratio. When training the neural network model, reservoir characteristics, fracturing operation data simulated by the GOHFER software, and the extracted pumping time-series features are used as inputs, and the model outputs fracture geometric parameters and SRV. To quantify the difference between model predictions and actual values and to optimize parameters, regression tasks often use loss functions such as mean squared error (MSE), root mean squared error (RMSE), and mean absolute percentage error (MAPE). Among them, MAPE is the mean of the absolute errors between predicted and actual values expressed as a percentage of the actual values. It quantifies the difference between the model's predicted values and the actual values in an intuitive percentage form and has significant advantages in cross-scale parameter evaluation and model generalization analysis. However, MAPE takes the actual value as its denominator; when the actual value is 0, the denominator loses mathematical meaning and the indicator cannot be computed, making MAPE difficult to apply when actual values include 0. Accordingly, this study improves the calculation of MAPE: a fixed constant is introduced into the denominator to circumvent the division by zero, thereby reducing the calculation deviation caused by a meaningless denominator and ensuring the effectiveness of the indicator when actual values include 0. For the remainder of this study, MAPE refers to this modified version. The specific calculation formula is as follows:

$$\text{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\frac{\left|A_i - F_i\right|}{\left|A_i\right| + c}$$

where $A_i$ represents the actual value, $F_i$ represents the predicted value of the model, n is the number of samples, and c represents a fixed constant greater than 0.
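A direct implementation of the modified metric (the exact placement of the constant c in the denominator is an assumption consistent with the description above):

```python
import numpy as np

def modified_mape(actual, predicted, c=1e-6):
    """MAPE with a fixed constant c > 0 added to the denominator so that
    targets equal to zero remain well-defined."""
    a, f = np.asarray(actual, float), np.asarray(predicted, float)
    return 100.0 * np.mean(np.abs(a - f) / (np.abs(a) + c))

print(modified_mape([0.0, 2.0, 4.0], [0.1, 1.8, 4.2]))  # finite even at A=0
```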
The hyperparameters to be optimized include the learning rate, batch size, number of hidden layers, number of neurons, and Dropout ratio. In this study, the single-factor optimization method is adopted to conduct model training on the training set. The MAPE between the fracture geometric parameters and SRV predicted by the model in each training session and those of the training set is calculated as the optimization index. The optimal values of the hyperparameters are determined in the order of learning rate, batch size, number of hidden layers, number of neurons, and Dropout ratio, thereby completing the determination of the model structure and training parameter tuning. The specific optimization process is as follows. First, set base values of all hyperparameters as the initial starting point and optimize the learning rate first: test different values within its preset range, calculate the MAPE between the corresponding model predictions and the training set, and select the learning rate that minimizes the MAPE. Next, with the optimal learning rate fixed and the initial values of the other parameters unchanged, optimize the batch size in the same way. Subsequently, using the optimal learning rate and batch size, sequentially optimize the number of hidden layers, number of neurons, and Dropout ratio by the same method until all hyperparameters are determined. After multiple rounds of tests and experiments, the training results of the model are shown in Table 2.
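The sequential single-factor search can be sketched as follows, with train_and_score standing in for a full training run and the candidate grids and base values being illustrative assumptions:

```python
# Placeholder for one full training run that returns the training-set MAPE
# of the surrogate under a given hyperparameter configuration.
def train_and_score(cfg):
    return 0.0

# Assumed candidate grids and base values (for illustration only).
search_space = {
    "learning_rate": [1e-2, 1e-3, 1e-4, 1e-5],
    "batch_size": [16, 32, 64, 128],
    "n_lstm_layers": [1, 2, 3],
    "n_neurons": [24, 36, 48, 64],
    "dropout": [0.1, 0.2, 0.3, 0.4],
}
config = {"learning_rate": 1e-3, "batch_size": 32, "n_lstm_layers": 2,
          "n_neurons": 36, "dropout": 0.2}

# Single-factor optimization: tune one hyperparameter at a time in the stated
# order, holding all others at their current values.
for name in ["learning_rate", "batch_size", "n_lstm_layers",
             "n_neurons", "dropout"]:
    scores = {v: train_and_score({**config, name: v}) for v in search_space[name]}
    config[name] = min(scores, key=scores.get)
```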
Experiments indicate that when the learning rate is 0.0001, the batch size is 32, and the number of neurons is 36, the MAPE of the model on the training set reaches 14.98%, achieving the optimal prediction performance. To further explore the model performance, the number of hidden layers in the neural network was adjusted. After the number of LSTM layers was increased to 3, the model's prediction accuracy did not improve significantly; instead, the model complexity increased, leading to a decline in training efficiency and even overfitting. In addition, the deeper structure may trigger vanishing or exploding gradients, making it difficult for the model to converge. By contrast, the two-layer LSTM architecture, by streamlining the model structure, can not only significantly improve training efficiency and reduce computational resource consumption but also effectively lower the risk of overfitting. Therefore, considering both prediction accuracy and computational efficiency, a neural network architecture with two LSTM layers was finally selected, yielding the surrogate model for intelligent prediction of fracture propagation.