Method for Formation Pore Pressure Prediction Based on Heterogeneous Transfer Learning

Dang, Wenhui; Wang, Yingjie; Zhong, Zhen; Wang, Xin; Chen, Hao; Xu, Yuqiang; Yang, Lei; He, Hailong

doi:10.3390/pr14081280


by Wenhui Dang ¹, Yingjie Wang ¹, Zhen Zhong ¹, Xin Wang ¹, Hao Chen ¹, Yuqiang Xu ², Lei Yang ² and Hailong He ²,*

¹ Oil Production Technology Research Institute, Xinjiang Oilfield Company, Karamay 834000, China

² School of Petroleum Engineering, China University of Petroleum (East China), Qingdao 266580, China

* Author to whom correspondence should be addressed.

Processes 2026, 14(8), 1280; https://doi.org/10.3390/pr14081280

Submission received: 25 March 2026 / Revised: 7 April 2026 / Accepted: 14 April 2026 / Published: 17 April 2026

(This article belongs to the Special Issue Application of Artificial Intelligence in Oil and Gas Engineering)


Abstract

Accurate prediction of formation pore pressure is of great significance for drilling safety, the efficient development of oil and gas resources, and engineering risk control. Traditional methods based on empirical parameters or mechanical models are difficult to fully adapt to complex geological conditions. Although intelligent models have strong nonlinear modeling capabilities, they are highly dependent on large-scale and high-quality training data, and tend to suffer from poor generalization ability and insufficient adaptability in blocks with limited samples or significant differences in geological characteristics. To improve the adaptability of the model between different blocks, this study introduces a heterogeneous transfer learning method to construct a formation pore pressure prediction model suitable for scenarios with inconsistent feature spaces. This method can effectively transfer knowledge from the source domain to the target domain, alleviating the prediction difficulties caused by differences in data distribution. Experimental results show that the proposed method still maintains excellent prediction accuracy and stability under the conditions of limited training samples and complex geological conditions, and has better generalization ability and cross-block applicability compared with traditional models.

Keywords: formation pore pressure prediction; heterogeneous transfer learning; HEMAP; LSTM

1. Introduction

Formation pore pressure, also known as formation pressure or pore pressure, is the pressure acting on the fluid in rock pores. It reflects the physical properties of subsurface rocks, such as porosity, diagenetic age, and burial depth, and also reveals the distribution and migration conditions of formation fluids. As China’s oil and gas exploration advances into deep, ultra-deep, deepwater, and unconventional fields, formation environments have become increasingly complex, and the complexity and uncertainty of pore pressure have increased significantly. Against this backdrop, abnormal pore pressure has become a major cause of downhole complications such as lost circulation, wellbore collapse, and blowout, which seriously threaten the safety and continuity of drilling operations. Accurate prediction of formation pore pressure is therefore essential for designing drilling parameters, ensuring engineering safety, and improving the efficiency of exploration and development [1,2].

For a long time, the prediction of formation pore pressure has been a challenging task [3,4,5]. Traditional methods mostly rely on idealized assumptions and use semi-empirical formulas for prediction. Although they have a certain physical basis, they depend on manual experience, have limited parameters, and are difficult to adapt to complex formation conditions. To improve prediction accuracy, some studies have introduced a large amount of block data and rock mechanics experiments, and made improvements by combining different models. However, the overall method process is cumbersome and inefficient, making it difficult to meet the current demand for fast drilling. Hence, the development of a prediction method with high efficiency, stability, and intelligence has become an urgent need.

In recent years, with the rapid development of machine learning, data-driven prediction methods have shown great potential in formation pressure prediction. Relying on their strong nonlinear modeling capabilities, machine learning methods have been widely applied in fields such as seismic inversion [6,7,8], well logging interpretation [9,10,11], production prediction [12,13,14], and reservoir development [15,16,17], achieving remarkable results. In pore pressure prediction, intelligent models can fully explore the deep correlations between multi-source data (such as logging and mud logging data) and formation pressure, and exhibit high prediction accuracy in areas with sufficient data. However, such models generally rely on a large number of high-quality training samples and have high requirements for the consistency of input features. If there are problems such as missing data or feature inconsistency, the prediction effect will decrease significantly. Especially in the application of new blocks or cross-blocks, their adaptability and robustness are obviously insufficient [18,19,20,21].

To address the above problems, transfer learning, an important branch of machine learning, has been widely applied in many fields in recent years. Its core idea is to transfer the knowledge obtained from existing tasks to new tasks, so as to improve the performance of the target model in scenarios with insufficient samples or significant distribution differences. The application of transfer learning in pore pressure prediction can not only transfer the drilling data of existing blocks to target blocks with scarce data, reducing the model’s dependence on a large number of new samples, but also improve the prediction ability when input parameters are inconsistent or data is missing, breaking through the adaptability bottleneck of traditional data-driven models under “heterogeneous feature spaces”.

Based on this, this study takes the prediction of formation pore pressure as the research object, addresses the practical problems of scarce data and heterogeneous features, and introduces the heterogeneous transfer learning method to construct a prediction model with cross-block generalization ability. The aim is to improve the adaptability and practicality of the model under complex geological conditions, and provide theoretical support and technical approaches for intelligent drilling and oilfield development.

2. Methodology

2.1. Principle of the Methodology

2.1.1. Principle of Heterogeneous Spectral Mapping

Heterogeneous Spectral Mapping (HEMAP) is a symmetric heterogeneous transfer learning method, which aims to address the problems of heterogeneous feature spaces, differences in data distribution, and inconsistent label spaces between the source domain and the target domain [22].

HEMAP comprises three key technical modules:

(1): Symmetric transformation and spectral mapping are adopted to project data from the source and target domains into a common subspace, thereby alleviating feature space heterogeneity.
(2): To mitigate discrepancies in data distribution, HEMAP uses a clustering-based sample selection approach in the latent subspace to select source-domain samples most similar to the target data as the new training set.
(3): To address inconsistent label spaces, HEMAP fuses output probabilities of the source and target domains via Bayesian inference for classification tasks, or unifies the output space via linear transformation for regression tasks, thus ensuring model generalization.

The optimization objective of the HEMAP heterogeneous transfer learning algorithm is as follows:

\begin{aligned} & \min_{B_{T}^{T} B_{T} = I,\ B_{S}^{T} B_{S} = I} G (B_{T}, B_{S}, P_{T}, P_{S}) \\ & \quad = \min_{B_{T}^{T} B_{T} = I,\ B_{S}^{T} B_{S} = I} \lVert T - B_{T} P_{T} \rVert^{2} + \lVert S - B_{S} P_{S} \rVert^{2} \\ & \qquad + β \left( \frac{1}{2} \lVert T - B_{S} P_{T} \rVert^{2} + \frac{1}{2} \lVert S - B_{T} P_{S} \rVert^{2} \right) \end{aligned}

(1)

In the formula, $B_{T}$ and $B_{S}$ are the projection matrices of the target data and source data in the common subspace; $P_{T}$ and $P_{S}$ are the linear mapping matrices; $β$ is the parameter controlling similarity; $S$ is the source domain dataset matrix; and $T$ is the target domain dataset matrix.
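To make the roles of the four terms in Equation (1) concrete, the following is a minimal sketch that evaluates the objective for given matrices. It assumes the norms are Frobenius norms and that the shapes are $T$: $N \times d_{T}$, $S$: $N \times d_{S}$, $B_{T}$, $B_{S}$: $N \times k$, $P_{T}$: $k \times d_{T}$, $P_{S}$: $k \times d_{S}$; the function name `hemap_objective` is hypothetical and not taken from the paper.

```python
import numpy as np


def hemap_objective(T, S, B_T, B_S, P_T, P_S, beta=1.0):
    """Evaluate the objective of Eq. (1) (assumed: squared Frobenius norms)."""
    # Reconstruction error of each domain from its own projection.
    fit_T = np.linalg.norm(T - B_T @ P_T) ** 2
    fit_S = np.linalg.norm(S - B_S @ P_S) ** 2
    # Cross terms: each domain reconstructed from the OTHER domain's projection.
    # Small values mean the two projections are similar in the common subspace.
    cross_T = np.linalg.norm(T - B_S @ P_T) ** 2
    cross_S = np.linalg.norm(S - B_T @ P_S) ** 2
    return fit_T + fit_S + beta * (0.5 * cross_T + 0.5 * cross_S)
```

With $β = 0$ the two domains are projected independently; increasing $β$ penalizes projections that differ between the domains.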

The detailed algorithm of HEMAP is shown in Algorithm 1.

Algorithm 1. Heterogeneous Spectral Mapping Method

Input: source domain dataset $S$, target domain dataset $T$, similarity parameter $β$ (default value: 1), dimension of the common subspace $k$

Output: transferred source domain dataset $B_{S}$ and target domain dataset $B_{T}$

1: Preprocessing: make the number of instances of target data $T$ and source data $S$ the same through random sampling (denoted $N$ for both)

2: Construct matrix $A$: calculate submatrices $A_{1}$, $A_{2}$, $A_{3}$, $A_{4}$ and combine them into a block matrix $A$

3: Eigendecomposition: compute the eigenvectors corresponding to the $k$ largest eigenvalues of matrix $A$, denoted as $U = [μ_{1}, \dots, μ_{k}]$

4: Split the projection matrix: $B_{T} = U_{\text{first } N \text{ rows}}$ (projection for target data), $B_{S} = U_{\text{last } N \text{ rows}}$ (projection for source data)
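The four steps of Algorithm 1 can be sketched in NumPy as follows. The text does not state how the submatrices $A_{1}$ to $A_{4}$ are computed; the definitions used here are the ones obtained by eliminating $P_{T}$ and $P_{S}$ from Equation (1), which is believed to match the original HEMAP formulation [22], and they should be read as an assumption. The function name `hemap_projections` and the `seed` argument are hypothetical.

```python
import numpy as np


def hemap_projections(T, S, k, beta=1.0, seed=0):
    """Sketch of Algorithm 1: returns (B_T, B_S), each of shape (N, k)."""
    rng = np.random.default_rng(seed)

    # Step 1: random sampling so both domains have the same number of rows N.
    n = min(T.shape[0], S.shape[0])
    T = T[rng.choice(T.shape[0], size=n, replace=False)]
    S = S[rng.choice(S.shape[0], size=n, replace=False)]

    # Step 2: block matrix A (ASSUMED block definitions, see lead-in).
    TT = T @ T.T  # N x N, independent of the target feature dimension
    SS = S @ S.T  # N x N, independent of the source feature dimension
    A1 = 2.0 * TT + 0.5 * beta**2 * SS
    A2 = beta * (SS + TT)
    A3 = A2.T
    A4 = 0.5 * beta**2 * TT + 2.0 * SS
    A = np.block([[A1, A2], [A3, A4]])  # 2N x 2N, symmetric

    # Step 3: eigenvectors of the k largest eigenvalues (eigh sorts ascending).
    eigvals, eigvecs = np.linalg.eigh(A)
    top = np.argsort(eigvals)[::-1][:k]
    U = eigvecs[:, top]

    # Step 4: first N rows project the target data, last N rows the source data.
    return U[:n], U[n:]
```

Because only the $N \times N$ Gram matrices enter $A$, the source and target feature dimensions (28 and 15 in this study) never need to match.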

After the projections are obtained, sample selection is performed to screen the relevant source data: the projected target data $B_{T}$ and the projected source data $B_{S}$ are merged, and the K-means algorithm is then applied to the merged data to obtain the new source domain and target domain.

The K-means clustering algorithm minimizes the objective function:

J = \sum_{i = 1}^{N} \sum_{k = 1}^{K} \mathbf{1} (c_{i} = k) \lVert x_{i} - μ_{k} \rVert^{2}

(2)

In the formula, $N$ is the number of data points; $K$ is the number of clusters; $c_{i}$ is the index of the cluster to which the data point $x_{i}$ belongs; and $μ_{k}$ is the center (mean value) of cluster $k$.

The main steps of the K-means clustering algorithm are as follows:

(1): Randomly select K data points as the initial cluster centers.
(2): Assign each data point to the nearest cluster center.
(3): Update the center of each cluster to the mean value of the data points within the cluster.
(4): Repeat steps 2 and 3 until the cluster centers stabilize or the maximum number of iterations is reached.
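Steps (1) to (4) and the objective of Equation (2) can be sketched as below. This is an illustrative implementation rather than the authors' code; the function names, the `seed` argument, and the choice to keep the previous center when a cluster becomes empty are assumptions.

```python
import numpy as np


def assign_clusters(X, centers):
    """Step (2): index of the nearest center for every data point."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(X, n_clusters, max_iter=100, seed=0):
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Step (1): K distinct data points as initial centers.
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)].copy()
    for _ in range(max_iter):
        labels = assign_clusters(X, centers)
        # Step (3): move each center to the mean of its members
        # (assumption: an empty cluster keeps its previous center).
        new_centers = centers.copy()
        for j in range(n_clusters):
            members = X[labels == j]
            if len(members) > 0:
                new_centers[j] = members.mean(axis=0)
        # Step (4): stop when the centers no longer move.
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    return assign_clusters(X, centers), centers


def kmeans_objective(X, labels, centers):
    """Eq. (2): sum of squared distances to the assigned centers."""
    X = np.asarray(X, dtype=float)
    return float(((X - centers[labels]) ** 2).sum())
```

In the HEMAP setting, `X` would be the row-wise stack of the projected target and source data.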

2.1.2. Principle of the Long Short-Term Memory Network

Long Short-Term Memory (LSTM) is an improved algorithm for Recurrent Neural Networks (RNNs). By introducing a gating mechanism, it integrates short-term memory with long-term memory, overcoming the limitation that traditional RNNs can only process short-term input information. Meanwhile, it alleviates issues such as vanishing gradients and exploding gradients to a certain extent. The model structure of LSTM is illustrated in Figure 1.

The core of LSTM consists of three gating units (forget gate, input gate, output gate) and a cell state, with the mathematical expressions of each unit as follows.

The forget gate combines the current input and the hidden state of the memory cell from the previous time step to generate a value between 0 and 1, which controls the retention of information in the memory cell.

f_{t} = σ (W_{f} \cdot [h_{t - 1}, x_{t}] + b_{f})

(3)

In the formula, $f_{t}$ is the output of the forget gate; $σ (\cdot)$ denotes the sigmoid activation function; $W_{f}$ is the weight matrix of the forget gate; $h_{t - 1}$ is the hidden state at the previous time step; $x_{t}$ is the input at the current time step; and $b_{f}$ is the bias term of the forget gate.

The input gate combines the current input and the hidden state of the memory cell from the previous time step to generate a value that controls which information should be updated or stored.

i_{t} = σ (W_{i} \cdot [h_{t - 1}, x_{t}] + b_{i})

(4)

In the formula, $i_{t}$ is the output of the input gate; $W_{i}$ is the weight matrix of the input gate; and $b_{i}$ is the bias term of the input gate.

Generation of new candidate information:

\hat{C}_{t} = \tanh (W_{c} \cdot [h_{t - 1}, x_{t}] + b_{c})

(5)

In the formula, $\hat{C}_{t}$ represents the new candidate information in the memory cell; $W_{c}$ is the weight matrix of the memory cell; and $b_{c}$ is the bias term of the memory cell.

The update of the cell state is jointly determined by the cell state from the previous time step and the new candidate information generated at the current time step.

C_{t} = f_{t} C_{t - 1} + i_{t} \hat{C}_{t}

(6)

In the formula, $C_{t}$ and $C_{t - 1}$ are the cell states at the current and previous time steps, respectively.

The output gate combines the current input and the hidden state from the previous time step to generate a value that controls how much of the current cell state is passed on as the new hidden state.

O_{t} = σ (W_{o} \cdot [h_{t - 1}, x_{t}] + b_{o})

(7)

In the formula, $O_{t}$ is the output of the output gate; $W_{o}$ is the weight matrix of the output gate; and $b_{o}$ is the bias term of the output gate.

Calculation of the hidden state $h_{t}$:

h_{t} = O_{t} \tanh (C_{t})

(8)

In the formula, $O_{t}$ is the output of the output gate, and $h_{t}$ is the hidden state at the current time step.
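As an illustration of how Equations (3) to (8) fit together, the sketch below computes one LSTM time step in NumPy. The products between gate outputs and states are taken element-wise, as is standard for LSTM; the function name `lstm_step` and the `params` dictionary layout are hypothetical, and a practical model would use a deep learning framework rather than this hand-written step.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; returns (h_t, C_t).

    params holds W_f, W_i, W_c, W_o of shape (n_hidden, n_hidden + n_input)
    and b_f, b_i, b_c, b_o of shape (n_hidden,).
    """
    z = np.concatenate([h_prev, x_t])                     # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])      # Eq. (3), forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])      # Eq. (4), input gate
    c_hat = np.tanh(params["W_c"] @ z + params["b_c"])    # Eq. (5), candidate
    c_t = f_t * c_prev + i_t * c_hat                      # Eq. (6), cell state
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])      # Eq. (7), output gate
    h_t = o_t * np.tanh(c_t)                              # Eq. (8), hidden state
    return h_t, c_t
```

Applied along a well, `x_t` would be the feature vector at one depth sample, and `h_t`, `c_t` would be carried to the next depth sample.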

2.2. HEMAP-LSTM Model

This study combines the transfer learning algorithm HEMAP with the LSTM neural network to form the HEMAP-LSTM model. Figure 2 presents in detail the specific implementation process of the HEMAP-LSTM model.

2.3. Model Evaluation Metrics

This study employs the following model evaluation metrics: Mean Absolute Error (MAE) and Mean Squared Error (MSE). Their calculation formulas are as follows:

\mathrm{MAE} = \frac{1}{N} \sum_{i = 1}^{N} | y_{\mathrm{true}, i} - y_{\mathrm{predict}, i} |

(9)

\mathrm{MSE} = \frac{1}{N} \sum_{i = 1}^{N} {(y_{\mathrm{true}, i} - y_{\mathrm{predict}, i})}^{2}

(10)

In the formula, $y_{\mathrm{true}, i}$ is the true value of the $i$-th sample; $y_{\mathrm{predict}, i}$ is the predicted value of the $i$-th sample; and $N$ is the number of samples.
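Both metrics are averages over per-sample errors, which a few lines of plain Python make explicit. The function names and the sample values in the usage note are illustrative only and are not data from this study.

```python
def mae(y_true, y_pred):
    """Mean Absolute Error: average of |true - predicted| over all samples."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Squared Error: average of (true - predicted)^2 over all samples."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

For example, with true values `[1.0, 2.0, 3.0]` and predictions `[1.5, 2.0, 2.0]`, the absolute errors are 0.5, 0, and 1, so `mae` returns 0.5; MSE weights the largest error more heavily because the errors are squared before averaging.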

2.4. Data Preprocessing

Due to significant differences in the units and value ranges of different logging parameters, direct input into the model will cause features with large values to dominate the training process, thereby impairing the generalization performance of the model. Therefore, the min-max normalization method is adopted to map the values of all features to the interval [0, 1], with the calculation formula given by:

x_{i}^{\prime} = \frac{x_{i} - x_{\min}}{x_{\max} - x_{\min}}

(11)

In the formula, $x_{i}^{\prime}$ is the normalized data; $x_{i}$ is the original input data; $x_{\min}$ is the minimum value of the variable; and $x_{\max}$ is the maximum value of the variable.

After normalization, all logging features are scaled to the same numerical range, which can effectively improve the feature mapping accuracy of the HEMAP algorithm and the training efficiency of the LSTM network, and avoid model bias caused by unit differences.
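A minimal sketch of column-wise min-max scaling to [0, 1], that is, $x^{\prime} = (x - x_{\min}) / (x_{\max} - x_{\min})$ applied to each feature separately, is given below. The function name is hypothetical, and mapping a constant column to zero is an assumption made here to avoid division by zero; the paper does not say how such columns are handled.

```python
import numpy as np


def min_max_normalize(X):
    """Scale every column (feature) of X to the interval [0, 1]."""
    X = np.asarray(X, dtype=float)
    x_min = X.min(axis=0)
    x_range = X.max(axis=0) - x_min
    # Assumption: a constant column has zero range; divide by 1 so it maps to 0.
    safe_range = np.where(x_range == 0, 1.0, x_range)
    return (X - x_min) / safe_range
```

For instance, a depth column spanning 1000 to 5000 m and a porosity column spanning 0.2 to 0.4 both end up in [0, 1], so neither dominates purely because of its unit.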

3. Formation Pore Pressure Prediction

This study adopts the Long Short-Term Memory Neural Network (LSTM) as the prediction model and introduces the HEMAP heterogeneous transfer learning algorithm to achieve accurate prediction of formation pore pressure in different blocks of the Xinjiang region.

In practical applications, logging data from different blocks are usually collected by different service companies, and there are significant differences in the logging instruments used and parameter naming conventions, resulting in inconsistent feature spaces across blocks. This feature heterogeneity severely restricts the popularization and application of traditional machine learning algorithms and conventional domain adaptation methods. To address this issue, this study employs the HEMAP algorithm to map the different feature spaces of the source domain and target domain into a common subspace, achieving feature homogenization and thereby improving the cross-block prediction capability of the model. On this basis, the LSTM model is combined to conduct research on formation pore pressure prediction.

Traditional data-driven methods generally require that the input parameters used in the training phase and prediction phase are consistent in quantity and type. Once there is parameter missing or structural inconsistency, the prediction accuracy of the model will decrease significantly, and effective prediction may even be impossible. Therefore, this study introduces a heterogeneous transfer learning framework to enhance the generalization ability and adaptability of the model under the condition of feature heterogeneity.

3.1. Geological Overview and Data Characteristics of the Study Blocks

In this study, two blocks (Block A and Block B) located in Xinjiang were selected as the source domain and target domain for heterogeneous transfer learning, respectively. Both blocks belong to the Junggar Basin and share homologous regional tectonic evolution, but exhibit significant differences in lithological characteristics, pressure systems, and data completeness, forming a typical heterogeneous feature space. This setup is used to verify the cross-block formation pore pressure prediction capability of the proposed HEMAP-LSTM model.

3.1.1. Geological and Pressure System Characteristics of the Blocks

Block A is located in the central depression belt of the Junggar Basin and is a mature exploration and development block whose target layers are the Triassic-Jurassic formations. The lithology is dominated by clastic rocks, with local volcanic reservoirs developed. The sand body distribution is stable, and the reservoir heterogeneity is weak to moderate. The block develops a normal to strong overpressure composite system, with the formation pore pressure equivalent density ranging from 0.5 to 2.5 g/cm³. The formation pore pressure profile of Well A1 is shown in Figure 3. Overpressure is concentrated in the 4500–5500 m well section and has multiple origins: it is dominated by mudstone undercompaction, with hydrocarbon generation and fluid thermal expansion superimposed, and the vertical distribution pattern of pressure is clear.

Block B is located in the Fukang Sag on the southern margin of the Junggar Basin and is a less explored development block whose target layer is the Jurassic formation. The lithology consists of fluvial-lacustrine sand-mudstone interbeds, with rapid lateral facies change in sand bodies and extremely strong reservoir heterogeneity. The block develops a normal to weak overpressure system, with the formation pore pressure equivalent density ranging from 0 to 1.8 g/cm³. The formation pore pressure profile of Well B1 is shown in Figure 4. Overpressure has a single origin, mudstone undercompaction. Affected by piedmont tectonic compression, local pressure fluctuations are significant, and the vertical distribution regularity is poor.

3.1.2. Input Data Characteristics and Heterogeneous Scenarios

The input data characteristics of the two blocks are shown in Table 1. The input features of the source domain (Block A) cover 5 categories: wellbore structure, conventional logging, drilling fluid logging, drilling engineering, and seismic attributes, totaling 28-dimensional parameters. The core features include well depth, borehole diameter, spontaneous potential, compensated neutron, acoustic travel time, full drilling fluid engineering parameters, full drilling mechanical parameters, and 7 seismic attribute parameters. The input features of the target domain (Block B) cover 4 categories: wellbore structure, conventional logging, drilling fluid logging, and drilling engineering, totaling 15-dimensional parameters. The core features include well depth, deep and shallow lateral resistivity, acoustic travel time, compensated neutron, partial drilling fluid logging parameters, and core drilling engineering parameters. It lacks spontaneous potential, partial drilling engineering parameters, and all seismic attribute parameters, with the prediction target being formation pore pressure.

The two blocks form a typical feature space heterogeneous scenario: the public parameters between the 28-dimensional source domain features and the 15-dimensional target domain features are few, and there is no strict one-to-one correspondence in feature dimensions and types. Traditional machine learning models cannot be directly adapted, which is the core application scenario of the heterogeneous transfer learning method in this study.

3.2. Experimental Design

Block B has a low degree of exploration and insufficient characteristic parameters, so traditional models cannot achieve high-precision cross-block formation pore pressure prediction there. To address this, this section takes Block A as the source domain of transfer learning and conducts formation pore pressure prediction experiments on Well B1 and Well B2 in Block B, verifying the cross-block generalization ability, prediction accuracy, and engineering applicability of the proposed HEMAP-LSTM heterogeneous transfer learning model in feature-heterogeneous scenarios.

3.2.1. Construction of Experimental Dataset

All data used in this experiment are from field measurements in the oilfield, ensuring the authenticity and engineering representativeness of the experimental data. The dataset is divided as follows:

(1): Source domain training and validation set:

The measured data of 5 wells in Block A are selected and randomly divided into a training set and a validation set at an 8:2 ratio. The input is the 28-dimensional full-feature parameters described in Section 3.1.2, and the output is formation pore pressure, which is used for model pre-training and hyperparameter optimization.

(2): Target domain test set:

The measured data of Well B1 and Well B2 in Block B are selected as the model prediction objects and accuracy verification set. Neither well participates in model training, and is only used to test the cross-block prediction ability of the model. The input is 15-dimensional heterogeneous feature parameters, and the output labels are the formation pore pressure of the two wells.
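The random 8:2 division of the source-domain data described in item (1) can be sketched with the standard library as follows. The function name, the fixed `seed`, and the rounding of the cut point are assumptions for illustration; the paper does not specify how the random division was implemented.

```python
import random


def split_train_val(samples, train_ratio=0.8, seed=42):
    """Randomly split samples into a training set and a validation set."""
    indices = list(range(len(samples)))
    # A dedicated Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(indices)
    cut = int(round(train_ratio * len(samples)))
    train = [samples[i] for i in indices[:cut]]
    val = [samples[i] for i in indices[cut:]]
    return train, val
```

Each element of `samples` would be one (features, pore pressure) pair from the five Block A wells; the Block B wells are kept out of this split entirely and serve only as the test set.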

3.2.2. Model Hyperparameter Setting

To ensure the scientific rigor and reproducibility of the model, this study clarifies the core hyperparameter settings of the HEMAP algorithm and LSTM network, as follows:

(1): HEMAP algorithm hyperparameters:

The similarity control parameter β = 1; the common subspace dimension k is determined according to the target domain feature dimension; the number of clusters K is determined based on the geological lithology combination and silhouette coefficient of the study area; the sample size N is consistent with the single-well sample size of the target domain.

(2): LSTM network hyperparameters:

The key hyperparameters of the LSTM model include the number of network layers, number of neurons, learning rate, dropout rate, and number of epochs. Due to the large number of parameters, complex tuning process and high computational cost of the LSTM model, the Bayesian optimization algorithm is adopted to tune the model hyperparameters. The number of epochs is determined according to the optimal hyperparameter combination, and the final value is set to 150. Table 2 shows the value range and final determined values of the hyperparameters.
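Item (1) mentions the silhouette coefficient as one criterion for choosing the number of clusters K. The sketch below computes the mean silhouette coefficient for a given clustering, using the standard definition $s_{i} = (b_{i} - a_{i}) / \max(a_{i}, b_{i})$, where $a_{i}$ is the mean distance to the other points of the same cluster and $b_{i}$ is the smallest mean distance to the points of any other cluster. The function name is hypothetical, Euclidean distance is assumed, and how the authors combined this score with lithological information is not described in the text.

```python
import numpy as np


def silhouette_score(X, labels):
    """Mean silhouette coefficient; requires at least two clusters."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all points.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = np.unique(labels)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # convention: a singleton cluster scores 0
        a = D[i, own].sum() / (n_own - 1)  # D[i, i] = 0, so exclude self by n-1
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return float(scores.mean())
```

Values close to 1 indicate compact, well-separated clusters, so candidate values of K can be compared by running the clustering for each and keeping the K with the highest score.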

4. Formation Pore Pressure Prediction Results

The data from 5 wells in Block A were selected as the source domain, and the HEMAP heterogeneous transfer learning method was used to predict the formation pore pressure of Well B1 and Well B2 in Block B. The prediction results for the two wells are shown in Figure 5.

The corresponding evaluation metrics for Well B1 and B2 are shown in Figure 6.

5. Discussion

In this study, data from different blocks within the same basin were used. Five wells in Block A were taken as the source domain to predict formation pore pressure in Well B1 and Well B2 of Block B. Prediction results from the HEMAP-LSTM model show that the MSE values of Well B1 and Well B2 are as low as 0.00143 and 0.00234, respectively, and the MAE values are 0.0318 and 0.0416, respectively. This demonstrates that the proposed method achieves high prediction accuracy and stability under feature-heterogeneous and cross-block conditions. The predicted formation pore pressure profiles of Well B1 and Well B2 are generally consistent with the measured data in their vertical distribution trend. Compared with conventional machine learning and homogeneous transfer learning models, HEMAP-LSTM overcomes the constraint of inconsistent feature spaces between the source and target domains. Although the 28-dimensional features of Block A and the 15-dimensional features of Block B share only a small number of parameters, the model still enables effective knowledge transfer without manual feature completion.

The symmetric spectral mapping of HEMAP aligns heterogeneous data into a common subspace, and K-means clustering is applied to select similar source-domain samples, thereby mitigating negative transfer caused by geological differences. Combined with the ability of LSTM to capture long-term dependencies in logging sequences, the model significantly reduces prediction errors under small-sample and missing-parameter conditions.

In contrast to empirical models such as the traditional Eaton method and the equivalent depth method, the proposed data-driven model does not rely on manually calibrated empirical parameters and exhibits stronger adaptability to complex formations with multi-genetic overpressure and strong heterogeneity. Compared with existing homogeneous transfer learning studies, HEMAP-LSTM extends the application scope to real engineering scenarios with feature dimension differences and incomplete logging parameters.

This method has certain applicability boundaries: it requires the source and target domains to share certain geological homology; otherwise, the transfer performance will degrade. In future work, the model can be extended to deep-water, unconventional, and carbonate reservoirs. Combined with data denoising and semi-supervised transfer learning, its adaptability to low-quality and unlabeled data can be further improved.

6. Conclusions

(1): To address the poor cross-block generalization ability of traditional formation pore pressure prediction methods caused by heterogeneous feature spaces, HEMAP heterogeneous transfer learning is introduced. It realizes effective knowledge transfer from mature blocks to newly explored areas and solves the problem of model adaptation under inconsistent feature conditions.
(2): The constructed HEMAP-LSTM model achieves high-precision cross-block formation pore pressure prediction under limited target-domain samples and missing logging features. The MSE values of Well B1 and Well B2 are as low as 0.00143 and 0.00234, with MAE values of 0.0318 and 0.0416, respectively. Both accuracy and robustness outperform traditional models.
(3): The proposed method requires neither manual parameter calibration nor feature completion, greatly reducing modeling costs. It is suitable for cross-block prediction under complex geological conditions and can provide reliable support for drilling safety design, reserve evaluation, and development decision-making.
(4): Heterogeneous transfer learning is more consistent with actual oilfield data conditions and provides a new technical approach for intelligent prediction of formation parameters.

Author Contributions

Conceptualization, W.D. and Y.W.; Methodology, Z.Z. and X.W.; Validation, H.C. and Y.X.; Formal analysis, L.Y.; Investigation, H.H.; Data curation, L.Y. and H.H.; Writing—original draft preparation, W.D.; Writing—review and editing, Y.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research and the APC were funded by the Science and Technology Special Project of China National Petroleum Corporation: “Research on Large-Scale Reserve Growth, Production Enhancement and Exploration & Development Technologies for Ultra-Deep Clastic Rock Oil and Gas Reservoirs” (grant number 2023ZZ14).

Data Availability Statement

Some data are not fully disclosed due to involving the commercial secrets of oilfield enterprises. For access to such data, please contact the corresponding author (E-mail: z23020125@s.upc.edu.cn) and provide proof of legitimate research purposes.

Acknowledgments

We would like to express our gratitude to the Oil Production Technology Research Institute of PetroChina Xinjiang Oilfield Company for providing valuable logging data and on-site geological data. Additionally, we appreciate the technical support and constructive suggestions from colleagues in the laboratory during the model validation and experimental tests. All authors have carefully reviewed the content of this manuscript and bear full responsibility for the accuracy of the research conclusions presented herein.

Conflicts of Interest

Authors Wenhui Dang, Yingjie Wang, Zhen Zhong, Xin Wang, Hao Chen were employed by the Oil Production Technology Research Institute, Xinjiang Oilfield Company. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The company in affiliation and funding had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

[1] Sheng, Y.; Guan, Z.; Luo, M.; Li, W.; Xu, Y. A quantitative evaluation method of drilling risks based on uncertainty analysis theory. J. China Univ. Pet. (Ed. Nat. Sci.) 2019, 43, 91–96.
[2] Jing, Y.; Wang, T.; Zhang, B.; Zheng, Y.; Li, X.; Lu, N. Safety Risk Analysis of Well Control for Wellbore with Sustained Annular Pressure and Prospects for Technological Development. Chem. Technol. Fuels Oils 2025, 61, 110–119.
[3] Ramdhan, A.M.; Goulty, N.R. Overpressure-generating mechanisms in the Peciko field, lower Kutai Basin, Indonesia. Pet. Geosci. 2010, 16, 367–376.
[4] Zhang, J. Pore pressure prediction from well logs: Methods, modifications, and new approaches. Earth-Sci. Rev. 2011, 108, 50–63.
[5] Radwan, A.E.; Abudeif, A.M.; Attia, M.M.; Mohammed, M. Pore and fracture pressure modeling using direct and indirect methods in Badri Field, Gulf of Suez, Egypt. J. Afr. Earth Sci. 2019, 156, 133–143.
[6] Ogbamikhumi, A.; Ebeniro, J.O. Reservoir properties estimation from 3D seismic data in the Alose field using artificial intelligence. J. Pet. Explor. Prod. 2021, 11, 1275–1287.
[7] Yan, X.; Zhang, M.; Wu, Q. Big-data-driven pre-stack seismic intelligent inversion. Inf. Sci. 2021, 549, 34–52.
[8] Zou, W. Research Status of Artificial Intelligence and Its Application in Well Logging Field. Well Logging Technol. 2020, 44, 323–328.
[9] Chen, X.; Deng, Z.; Liu, P.; Zhang, T.; Du, J.; Tang, H.; Hu, H.; Gao, X.; Wang, Z.; He, X. Rapidly improving the acid-fracture conductivity in deep and ultra-deep carbonate reservoirs through mineral alteration: A new method. Int. J. Rock Mech. Min. Sci. 2026, 199, 106415.
[10] Liao, G.; Li, Y.; Xiao, L.; Qing, Z.; Hu, X.; Hu, F. Prediction of Microscopic Pore Structure in Tight Reservoirs Using Convolutional Neural Network Model. Pet. Sci. Bull. 2020, 5, 26–38.
[11] Wu, D.; Wu, S.; Zhang, Y.; Yu, J. Research on Intelligent Interpretation Method of Reservoir Physical Property Parameters under Small Sample Condition. Pet. Sci. Bull. 2025, 10, 378–391.
[12] Chen, M. Production Prediction Method for Offshore Petroleum Geological Exploration and Development Based on LSTM Model. China Pet. Chem. Stand. Qual. 2025, 45, 86–88.
[13] Xue, L.; Wu, Y.; Liu, Q.; Liu, Y.; Wang, J.; Jiang, L.; Cheng, Z. Advances in numerical simulation and automatic history matching of fractured reservoirs. Pet. Sci. Bull. 2019, 4, 335–346.
[14] Zhang, R.; Jia, H. Production performance forecasting method based on multivariate time series and vector autoregressive machine learning model for waterflooding reservoirs. Pet. Explor. Dev. 2021, 48, 175–184.
[15] Liu, H.; Li, Y.; Jia, D.; Wang, S.; Qiao, M.; Qu, R.; Wen, P.; Ren, Z. Application status and prospects of artificial intelligence in the refinement of waterflooding development program. Acta Pet. Sin. 2023, 44, 1574–1586.
[16] Chen, X.; Hu, H.; Liu, P.; Du, J.; Wang, M.; Tang, H.; Deng, Z.; Wang, G.; Liu, F. Enhancing fracture conductivity in carbonate formations through mineral alteration. Int. J. Rock Mech. Min. Sci. 2025, 186, 106027.
[17] Zhang, K.; Zhao, X.; Zhang, L.; Zhang, H.; Wang, H.; Chen, G.; Zhao, M.; Jiang, Y.; Yao, J. Research Status and Prospect of Big Data and Intelligent Optimization Theory and Methods in Intelligent Oilfield Development. J. China Univ. Pet. (Ed. Nat. Sci.) 2020, 44, 28–38.
[18] Yu, H.; Chen, G.; Gu, H. A machine learning methodology for multivariate pore-pressure prediction. Comput. Geosci. 2020, 143, 104548.
[19] Chen, X.; Huang, Q.; Liu, P.; Du, J.; Guo, Y.; Xiong, Z. Numerical Simulation of CO₂ Sequestration Stability in Carbonate-Bearing Saline Aquifers: Effects of Mineral Dissolution. Energy 2025, 335, 138273.
[20] Luo, F.; Liu, J.; Chen, X.; Li, S.; Yao, X.; Chen, D. Intelligent Prediction Method for Formation Pore Pressure in Fault Zone 5 of Shunbei Oilfield Based on BP and LSTM Neural Networks. Oil Drill. Prod. Technol. 2022, 44, 506–514.
Farsi, M.; Mohamadian, N.; Ghorbani, H.; Wood, D.A.; Davoodi, S.; Moghadasi, J.; Alvar, M.A. Predicting formation pore-pressure from well-log data with hybrid machine-learning optimization algorithms. Nat. Resour. Res. 2021, 30, 3455–3481. [Google Scholar] [CrossRef]
Hottmann, C.E.; Johnson, R.K. Estimation of formation pressures from log-derived shale properties. J. Pet. Technol. 1965, 17, 717–722. [Google Scholar] [CrossRef]

Figure 1. Structure of LSTM model.

Figure 2. HEMAP-LSTM model process.

Figure 3. Formation pore pressure profile of Well A1.

Figure 4. Formation pore pressure profile of Well B1.

Figure 5. Formation pore pressure profiles of Well B1 and Well B2. (a) Well B1, (b) Well B2.

Figure 6. MSE and MAE of Well B1 and Well B2. (a) Well B1, (b) Well B2.

Table 1. Input Data of Blocks A and B.

Block	Input Data
A	DEP, CAL, SP, CNL, AC, TVOL, FLO_OUT, FLO_IN, COND_OUT, COND_IN, TEMP_OUT, TEMP_IN, DEN_OUT, DEN_IN, FPC, HKLD, TORQ, SPM, RPM, SPP, WOB, DIP_ILL, 3D_CURV, SWT, RMS_AMP, REFL_INT, LSA, ISO_FREQ
B	DEP, RD, RS, AC, CNL, TVOL, FLO_OUT, COND_OUT, TEMP_OUT, TEMP_IN, DEN_OUT, DEN_IN, HKLD, RPM, SPP
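
The mismatch between the two feature spaces in Table 1 can be made concrete by comparing the lists directly. The following is a minimal Python sketch that only restates the mnemonics from Table 1 and applies simple membership tests; the variable names are illustrative assumptions and are not taken from the paper, and the sketch does not reproduce the HEMAP mapping itself.

```python
# Input mnemonics copied from Table 1. The variable names below are
# illustrative assumptions, not identifiers used by the authors.
BLOCK_A = [
    "DEP", "CAL", "SP", "CNL", "AC", "TVOL", "FLO_OUT", "FLO_IN",
    "COND_OUT", "COND_IN", "TEMP_OUT", "TEMP_IN", "DEN_OUT", "DEN_IN",
    "FPC", "HKLD", "TORQ", "SPM", "RPM", "SPP", "WOB", "DIP_ILL",
    "3D_CURV", "SWT", "RMS_AMP", "REFL_INT", "LSA", "ISO_FREQ",
]
BLOCK_B = [
    "DEP", "RD", "RS", "AC", "CNL", "TVOL", "FLO_OUT", "COND_OUT",
    "TEMP_OUT", "TEMP_IN", "DEN_OUT", "DEN_IN", "HKLD", "RPM", "SPP",
]

# Split the features into those present in both blocks and those
# present in only one, preserving the order used in Table 1.
shared = [f for f in BLOCK_A if f in BLOCK_B]
only_a = [f for f in BLOCK_A if f not in BLOCK_B]
only_b = [f for f in BLOCK_B if f not in BLOCK_A]

print(f"Block A inputs: {len(BLOCK_A)}")
print(f"Block B inputs: {len(BLOCK_B)}")
print(f"Shared: {len(shared)} -> {shared}")
print(f"Only in A: {len(only_a)} -> {only_a}")
print(f"Only in B: {len(only_b)} -> {only_b}")
```

By this listing, Block A has 28 inputs and Block B has 15, of which 13 are shared; RD and RS appear only in Block B, so neither feature space is a subset of the other.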

Table 2. Optimal hyperparameter combination for LSTM.

No.	Parameter	Value Range	Final Value
1	layers	[1, 3]	1
2	neurons	[32, 128]	53
3	learning rate	[0.00001, 0.1]	0.0019
4	dropout	[0.01, 0.2]	0.01
5	epochs	[50, 300]	150
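
As a sanity check on Table 2, the search ranges and final values can be recorded in a small configuration structure and validated programmatically. The sketch below is plain Python with hypothetical names (`HyperParam`, `LSTM_HYPERPARAMS`); it assumes the ranges are inclusive and makes no claim about how the authors searched the space or which framework they used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HyperParam:
    """One row of Table 2: a search range and the selected value."""
    name: str
    low: float
    high: float
    final: float

    def in_range(self) -> bool:
        # Assumption: the ranges in Table 2 are inclusive at both ends.
        return self.low <= self.final <= self.high


# Values copied from Table 2; the container name is hypothetical.
LSTM_HYPERPARAMS = [
    HyperParam("layers", 1, 3, 1),
    HyperParam("neurons", 32, 128, 53),
    HyperParam("learning rate", 0.00001, 0.1, 0.0019),
    HyperParam("dropout", 0.01, 0.2, 0.01),
    HyperParam("epochs", 50, 300, 150),
]

# Fail early if a selected value falls outside its stated range.
for hp in LSTM_HYPERPARAMS:
    assert hp.in_range(), f"{hp.name}={hp.final} outside [{hp.low}, {hp.high}]"

# Flat dictionary that could be passed to a model-building routine.
config = {hp.name: hp.final for hp in LSTM_HYPERPARAMS}
print(config)
```

All five final values fall within their ranges under the inclusive-bounds assumption; the selected number of layers and the dropout rate both sit at the lower bound of their ranges.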
