Stratigraphic Correlation of Well Logs Using Geology-Informed Deep Learning Networks

Xu, Zhaohui; Zheng, Boyu; Liu, Bo; Song, Wendan

doi:10.3390/pr13051288

¹ State Key Laboratory of Petroleum Resources and Engineering, China University of Petroleum (Beijing), Beijing 102249, China

² College of Geosciences, China University of Petroleum (Beijing), Beijing 102249, China

³ Shuanghe Geological Research Institute, Research Institute of Exploration and Development, Henan Oilfield, Nanyang 473132, China

* Author to whom correspondence should be addressed.

Processes 2025, 13(5), 1288; https://doi.org/10.3390/pr13051288

Submission received: 2 March 2025 / Revised: 16 April 2025 / Accepted: 22 April 2025 / Published: 23 April 2025

(This article belongs to the Special Issue Applications of Intelligent Models in the Petroleum Industry)

Abstract

Stratigraphic correlation plays a crucial role in reservoir characterization. However, it is often time-consuming and heavily dependent on geological expertise. To address this issue, we propose a novel method called CMT-enhanced Hiformer, which integrates convolutional neural networks meet vision transformers (CMT) and hierarchical multi-scale representations using transformers (Hiformer). First, the architecture of CMT-enhanced Hiformer fuses the advantages of convolutional neural networks and transformers, effectively extracting complex features from well logs and capturing both local and global dependencies via a well-designed attention mechanism. Next, a geological constraint with regularization parameters is incorporated into the loss function. The new loss function promotes the accuracy of stratigraphic boundaries. The proposed method was validated using data from the Shuanghe oil field in central China. Specifically, the model achieved a maximum F1 score of 0.8857 and a precision of 0.8865 on the blind test dataset, demonstrating its robustness and high classification accuracy. Moreover, we conducted ablation studies and performed a detailed comparison with state-of-the-art deep learning models. The results demonstrate that the proposed method significantly improves the accuracy and efficiency of stratigraphic correlation.

Keywords: stratigraphic correlation; transformer; convolutional neural network; geological constraint; well log

1. Introduction

Stratigraphic correlation is a fundamental task in reservoir characterization, aimed at establishing an isochronous stratigraphic framework to provide data support for the spatial representation of reservoir bodies [1,2]. Well log data serve as one of the core datasets for conducting stratigraphic correlation. Essentially, it involves an in-depth analysis and comparison of the geological information embedded within logging curves, identifying stratigraphic units of various scales based on characteristic values and morphological variations of the curves. Traditional stratigraphic correlation based on well logs primarily relies on manual interpretation, with the accuracy of results heavily dependent on the expertise and experience of geoscientists. However, as the number of wells increases, stratigraphic correlation becomes increasingly labor-intensive and time-consuming [3].

To address these challenges, various semiautomated and automated stratigraphic correlation methods have been proposed, including signal decomposition techniques, statistical methods, and machine learning approaches. Signal decomposition methods focus on processing well log data by decomposing a single log curve into multiple modes, with each mode representing stratigraphic features at different scales. These methods include wavelet analysis [4,5], variational mode decomposition (VMD) [6], cluster analysis [7], principal component analysis (PCA) [8,9], and pattern recognition techniques [10]. By extracting multiple data components from the log curves, these approaches provide a more intuitive representation of multi-scale stratigraphic characteristics, thereby enhancing the accuracy of stratigraphic correlation. Statistical methods treat well log curves as data sequences and delineate stratigraphic units by analyzing the correlations between these sequences. Representative techniques in this category include dynamic time warping (DTW) [11,12,13], cross-correlation analysis [14], maximum likelihood estimation [15], and grey relational analysis [16].

In recent years, machine learning techniques have been increasingly applied to stratigraphic division and correlation tasks, achieving notable progress. Initially, methods like genetic algorithms were used for biostratigraphic correlation but were computationally intensive and time-consuming [17]. To improve interpretation efficiency, researchers have increasingly adopted machine learning algorithms, such as support vector machines, adaptive boosting, gradient boosting, and random forests, that leverage annotated datasets to achieve faster and more consistent stratigraphic analysis [18,19,20]. With the development of deep learning, more powerful tools emerged for automated geological analysis. Models like convolutional neural networks (CNNs) and recurrent neural networks have been increasingly adopted for stratigraphic correlation [21,22,23,24,25]. Furthermore, the strengths of CNNs, including parameter sharing, translation invariance, and direct learning from raw inputs, enable effective performance in classification, segmentation, and regression tasks [26,27,28,29,30,31]. However, CNNs face limitations in modeling long-range dependencies, which can require deep architectures and increase computational cost. To address these limitations, transformer-based architectures have been introduced, leveraging self-attention mechanisms to capture both local and global dependencies, making them ideal for sequential modeling tasks such as well log interpretation [32,33,34,35,36,37,38].

This paper addresses two key challenges in stratigraphic correlation of well logs: enhancing the quality of feature extraction through algorithm fusion and improving prediction accuracy through the design of a geology-informed loss function. First, we propose a geology-informed deep learning network for stratigraphic correlation of well logs. The deep learning architecture, namely CMT-enhanced Hiformer, leverages the strengths of both CNNs and transformers [39,40]. To improve prediction accuracy, we incorporate geological constraints into the loss function. Next, we describe the implementation of the training process and conduct ablation studies, which evaluate the effectiveness of the different modules in our CMT-enhanced Hiformer and of different regularization weights in the loss function. The well-trained model is then applied to perform stratigraphic correlation of well logs acquired from the Shuanghe oil field in central China. Representative well tops are analyzed to quantitatively assess the performance of the proposed method. We then discuss the effect of the number and distribution of training wells on prediction accuracy, as well as the broader applications of our model. Finally, we provide brief conclusions of this study.

2. Methodology

2.1. The Architecture of CMT-Enhanced Hiformer

The simplified structure of our proposed CMT-enhanced Hiformer is presented in Figure 1. The proposed model architecture primarily consists of two components: a feature extraction module and a decoding module. The feature extraction module comprises one CMTSTEM layer, two CMTBLOCK layers, and two Pooling layers, while the decoding module incorporates two CONVUP modules designed to restore both the resolution and dimension of feature maps extracted from well log data. Detailed explanations about the key modules are provided as follows.

2.1.1. CMTSTEM Module

As shown in Figure 1a, the CMTSTEM module serves as the front layer of the proposed model. The input well log data are passed through a 3 × 3 convolutional layer with a stride of 2 for initial feature extraction and downsampling. This reduces the spatial resolution of the feature map while preserving important local feature information. Such fine-grained features compensate for the limited ability of the transformer structure to capture local detail. Subsequently, two consecutive convolutional layers with a kernel size of 3 × 3 and a stride of 1 extract deeper features from the downsampled feature map without altering its spatial dimensions. Each layer integrates GELU activation and batch normalization (BN), forming a hierarchical feature processing architecture. This design provides subsequent operations with feature representations that simultaneously maintain local receptive fields and achieve spatial compression.
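As a rough check on the shapes described above, the following sketch computes the output length along one axis after the three stem convolutions. The padding of 1 and the input length of 256 samples are assumptions made for illustration; the paper does not state them.

```python
def conv_out_len(n, kernel=3, stride=1, padding=1):
    """Output length of a convolution along one axis (standard formula)."""
    return (n + 2 * padding - kernel) // stride + 1


def stem_out_len(n):
    """Length after the stem: one stride-2 conv, then two stride-1 convs."""
    n = conv_out_len(n, stride=2)  # initial downsampling halves the length
    n = conv_out_len(n, stride=1)  # length preserved (padding assumed to be 1)
    n = conv_out_len(n, stride=1)  # length preserved (padding assumed to be 1)
    return n


print(stem_out_len(256))  # 128
```

With these assumed settings, only the first layer changes the resolution, which matches the description of the two stride-1 layers leaving the spatial dimensions unaltered.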

2.1.2. CMTBLOCK Module

As depicted in Figure 1b,c, the CMTBLOCK is one of the key components of CMT [40], designed to effectively capture and represent complex features with an enhanced attention mechanism. The modified module starts with a 3 × 3 depth-wise convolution (DWConv) and includes a tailored attention mechanism, the Hiformer-MHSA (HMHSA). The typical multi-head self-attention refines the learning process by utilizing the extracted features. In addition to generating fine-grained feature maps $P^{s}$, it also offers the query $CLS^{s}$ for the HMHSA to further combine the low-level representation $P^{l}$. This wide range of features accommodates the physical constraints of stratigraphic layers. The HMHSA module utilizes cross-attention to effectively guide feature map fusion. As illustrated in Figure 1f, “P” represents the segmented local data for capturing local stratigraphic features and “CLS” is a learnable token prepended to the “P” sequence to aggregate global information for classification. The HMHSA takes the output feature map $P^{s}$ and tokens $CLS^{s}$ from the LMHSA (the lightweight multi-head self-attention of CMT [40]) to guide the fusion stage. Another input, $P^{l}$, is the raw feature map obtained directly after convolution. During the fusion stage, we use the weighted embedding representations $CLS^{s}$ as the query for fusion, where the value is the original feature maps generated by simple downsampling to capture global features.

Specifically, precise segmentation predictions depend heavily on boundary points, which highlights the difficulty of relying solely on the transformer model. Conversely, a CNN-based model loses the globally captured features. Hence, the Cross Attention module, as shown in Figure 1f, takes the resultant smallest $P^{s}$ and largest $P^{l}$ levels as inputs and employs a cross-attention mechanism to fuse information across scales, addressing the inconsistency and low precision problems. Finally, the IRFFN module reformulates the feature map into the required output size.
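To make the fusion step more concrete, the sketch below shows single-query scaled dot-product cross-attention, with a CLS-like vector as the query and patch tokens from the other scale as keys and values. It is a simplified illustration, not the authors' implementation: the learned projections and multiple heads are omitted, and the toy vectors `cls_s` and `p_l` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query, keys, values):
    """Single-query scaled dot-product attention.

    query:  one vector (standing in for the CLS token of one scale)
    keys:   list of vectors (patch tokens of the other scale)
    values: list of vectors, same length as keys
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    fused = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return fused, weights


# Hypothetical toy tokens: a 2-dimensional CLS query and three patch tokens.
cls_s = [1.0, 0.0]
p_l = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
fused, weights = cross_attention(cls_s, p_l, p_l)
```

The attention weights sum to 1, and the patch token most similar to the query contributes most to the fused vector, which is how a global token can select information from another scale.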

2.1.3. CONVUP Module

The CONVUP module is a decoder component designed for feature map upsampling. It processes input feature maps through a series of convolutional layers, GroupNorm, and ReLU activation functions, while incorporating bilinear interpolation for upscaling. For the CMTBLOCK1 and CMTBLOCK2 stages, the CONVUP module first reduces channel dimensions and extracts features via convolution, then enlarges spatial dimensions through upsampling. Finally, the two CONVUP modules work collaboratively to upsample the deep features and fuse them with shallow features, restoring high-resolution output and transforming low-resolution high-level semantic features into high-resolution detailed features.

2.2. Loss Function with Geological Constraint

The first loss function used in this study is the commonly used cross-entropy (CE) loss, defined as

$$\mathrm{Loss}_{CE}(y, p) = -\sum_{k=0}^{N_{k}} y_{k} \cdot \log(p_{k}) \tag{1}$$

where k denotes the stratigraphic class and $N_{k}$ represents the total number of stratigraphic classes. $y_{k}$ indicates the value of the ground-truth label for class k, i.e., if the sample belongs to class k, then $y_{k}$ is 1; otherwise, it is 0. $p_{k}$ represents the predicted probability for class k, which is the k-th element of the probability distribution output by the well-trained model.
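A small worked example may help here. The sketch below evaluates the cross-entropy of Equation (1) for a single sample; the three-class label and probabilities are hypothetical, and the `eps` clamp is an added safeguard against log(0) that is not part of the equation.

```python
import math


def cross_entropy(y_onehot, p, eps=1e-12):
    """Cross-entropy of Equation (1) for one sample."""
    return -sum(y * math.log(max(q, eps)) for y, q in zip(y_onehot, p))


y = [0, 1, 0]        # hypothetical sample belonging to class 1
p = [0.1, 0.7, 0.2]  # hypothetical predicted class probabilities
loss = cross_entropy(y, p)
```

Because the label is one-hot, only the term of the true class survives, so the loss here equals -log(0.7).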

Our suggested geological constraint (GC) is defined as

$$\mathrm{Loss}_{GC}(\bar{y}, \bar{p}) = -\sum_{l=0}^{N_{l}} \bar{y}_{l} \cdot \log(\bar{p}_{l}) \tag{2}$$

where l represents the index of the stratigraphic boundary, ranging from 0 to $N_{l}$: the boundary positions are sequentially numbered from 1 to $N_{l}$ and non-boundary positions are 0. $N_{l}$ denotes the number of stratigraphic boundaries. $\bar{y}_{l}$ denotes the ground-truth value for the stratigraphic boundary of class l and $\bar{p}_{l}$ represents the probability predicted by the model for boundary l.
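The boundary labelling described above can be sketched as follows. This is only one possible reading: it assumes that a boundary sits on the first sample of each new stratigraphic class, a choice the paper does not specify, and the function name `boundary_labels` is hypothetical.

```python
def boundary_labels(class_seq):
    """Return a per-sample boundary label sequence.

    Non-boundary samples get 0; the i-th boundary encountered from the top
    of the well gets label i (1-based). A boundary is assumed to sit on the
    first sample of each new class, which is an illustrative choice.
    """
    labels = [0] * len(class_seq)
    count = 0
    for i in range(1, len(class_seq)):
        if class_seq[i] != class_seq[i - 1]:
            count += 1
            labels[i] = count
    return labels


classes = [0, 0, 0, 1, 1, 2, 2, 2]  # hypothetical per-sample class labels
print(boundary_labels(classes))     # [0, 0, 0, 1, 0, 2, 0, 0]
```

Under this encoding, the GC loss of Equation (2) is a cross-entropy over boundary indices, so most samples fall in the non-boundary class 0 and the few boundary samples carry the constraint.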

The position of the stratigraphic boundary is the key point of the loss function proposed in this study. Focusing on optimizing the prediction of different stratigraphic boundary positions reduces confusion between different horizons in the model. The total loss function in this study is then defined as

$$\mathrm{Loss}_{total} = \lambda_{1} \mathrm{Loss}_{CE} + \lambda_{2} \mathrm{Loss}_{GC} \tag{3}$$

where $\lambda_{1}$ and $\lambda_{2}$ are two regularization weights. The selection of $\lambda_{1}$ and $\lambda_{2}$ is explained in Section 4.
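The weighted combination in Equation (3) can be sketched in a few lines. The default weights of 0.8 and 0.2 are the best-performing values reported in Section 4.2, which also constrains the two weights to sum to 1; the loss values passed in below are hypothetical.

```python
def total_loss(loss_ce, loss_gc, lam1=0.8, lam2=0.2):
    """Weighted sum of Equation (3); the weights must sum to 1."""
    if abs(lam1 + lam2 - 1.0) > 1e-9:
        raise ValueError("lam1 + lam2 must equal 1")
    return lam1 * loss_ce + lam2 * loss_gc


# Hypothetical loss values for one batch.
combined = total_loss(0.5, 1.0)
ce_only = total_loss(0.5, 1.0, lam1=1.0, lam2=0.0)
```

Setting `lam1=1.0` and `lam2=0.0` recovers training with the cross-entropy loss alone, which is the baseline used in the ablation study.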

3. Data Introduction and Implementation Details

3.1. Study Area and Training Data Preparation

The Nanxiang Basin is bordered to the north by the Qinling Mountains and to the south by the Daba Mountains. It is a Meso-Cenozoic continental intermountain rift basin that formed during the late Yanshanian period. The sedimentary basin is controlled by faults along its periphery and covers a total area of 17,000 km² [41]. The Nanxiang Basin includes four depressions and four uplifts. Among them, the Biyang Depression in the southeast is the main area for oil and gas exploration. Influenced by the late Yanshanian stretching, it forms three structural belts from north to south: the northern slope belt, the central deep depression belt, and the southern steep slope belt, and it is generally higher in the north and lower in the south [42], as presented in Figure 2.

More than ten oil fields have been discovered in the Biyang Depression. This study focuses on the northern block of the Shuanghe Oil Field, which features a nose-like structure with a northwest-southeast trend, tilting towards the southeast, and a steeper western wing than the eastern wing. The Shuanghe Oil Field develops the Pingyuan Formation, Shangsi Formation, Hetaoyuan Formation, Dacangfang Formation, and Yuhuangding Formation from top to bottom [43]. Among these, the third member of the Hetaoyuan Formation is the main oil-bearing layer. This research target is the second sandstone group of the third member of the Hetaoyuan Formation, with a burial depth of 1350–2400 m. This layer consists primarily of gravel rock, gravelly sandstone, conglomerate sandstone, and sandstone, forming a mixed sandstone-conglomerate complex deposited in the fan delta front. The sand bodies are distributed in a fan shape from southeast to northwest, with wide distribution and significant sediment thickness. The sand bodies gradually thin out and pinch out from southeast to northwest [44].

The well location map of the study area is presented in Figure 3. The gray points indicate well boreholes; 100 wells, named W1, W2, …, and W100, are used to build the well dataset in this study. When utilizing DL models for stratigraphic correlation, we can randomly select wells for model training, validation, and blind testing. We use three well logs in this study, i.e., Gamma Ray (GR), Spontaneous Potential (SP), and Resistivity (RT), and do not analyze the sensitivity of the results to the choice of well logs. To test the effectiveness of our model, we randomly select 30% of the wells in Figure 3 as our training data (indicated by the red points) and 5 wells as our validation data, while the rest are used as our blind testing data.
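The random split described above can be sketched as follows. The function name `split_wells` and the fixed seed are assumptions for reproducibility of the illustration; the paper does not report how the random selection was implemented.

```python
import random


def split_wells(well_names, n_train, n_val, seed=0):
    """Randomly split wells into disjoint training, validation, and blind test sets."""
    rng = random.Random(seed)
    shuffled = list(well_names)
    rng.shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


wells = [f"W{i}" for i in range(1, 101)]  # W1 ... W100
train, val, test = split_wells(wells, n_train=30, n_val=5)
```

With 100 wells, 30% for training and 5 wells for validation leave 65 wells for blind testing.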

3.2. Implementation Details of Training Process

All networks are implemented with the PyTorch 2.2.1 deep learning library and Python 3.8, and are trained on a workstation with two NVIDIA GeForce RTX 3090 GPUs and a dual-channel Intel 6354 CPU. The SGD optimizer is used with a learning rate scheduler, and the initial learning rate is 0.001. We train the base models for a maximum of 20 epochs, saving the model with the best validation loss during training.

To evaluate the performance of different models, the precision, recall, and F1 score derived from the confusion matrix are used in this study, as defined in Equation (4).

$$\begin{aligned} \mathrm{Precision} &= \frac{TP}{TP + FP}, \\ \mathrm{Recall} &= \frac{TP}{TP + FN}, \\ F1 &= 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \end{aligned} \tag{4}$$

Here, True Positives (TP) are instances where positive samples are correctly identified as positive, reflecting the model’s ability to detect the target condition. True Negatives (TN) are cases where negative samples are accurately classified as negative, demonstrating the model’s effectiveness in recognizing the absence of the target condition. False Positives (FP) occur when the model incorrectly classifies negative samples as positive, leading to false detections. False Negatives (FN) arise when positive samples are mistakenly classified as negative, resulting in missed detections. These metrics offer a thorough assessment of the model’s classification performance, providing insights into its precision and recall across various classes.
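The metrics of Equation (4) can be computed per class in a one-vs-rest fashion, as in the sketch below. The paper does not state how per-class values are averaged into the reported scores, so this example handles a single class only, and the label sequences are hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """One-vs-rest precision, recall, and F1 for a single class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


y_true = [0, 0, 1, 1, 1, 2]  # hypothetical ground-truth classes
y_pred = [0, 1, 1, 1, 2, 2]  # hypothetical predicted classes
precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive=1)
```

For class 1 in this toy example, TP = 2, FP = 1, and FN = 1, so precision, recall, and F1 all equal 2/3.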

4. Ablation Study

4.1. The Comparisons of Different Modules

In this section, we first implement an ablation study on different models within our proposed CMT-enhanced Hiformer, i.e., comparing the effectiveness of CMT, Hiformer, and CMT-enhanced Hiformer. As mentioned above, we randomly select 30 wells as our training data and 5 wells as our validation data, while the others are selected as the blind test dataset. The training loss curves of CMT, Hiformer, and our suggested CMT-enhanced Hiformer are presented in Figure 4. Note that we normalize all these three training losses to make a clear comparison. We obtain three well-trained and convergent models after model training, which can be easily used for model testing. Moreover, the final loss values of our suggested model are smaller than those of the other two models, indicating that our model shows a better convergence result.

Afterward, we apply these three models to the blind test dataset. The precision and F1 scores are, respectively, calculated and indicated in Table 1. Comparing these quantitative values, it can be easily found that our model shows better performance than the other two models by obtaining higher precision and F1. Moreover, we randomly select a well from the blind test dataset, i.e., W29, and then apply these three models for stratigraphic division. The results are presented in Figure 5c–e, while Figure 5a shows three well logs, i.e., GR, SP, and RT, and Figure 5b shows the ground truth stratigraphic classification labels. The red dashed lines indicate the sequence boundaries of W29. By comparing these images, we have several main observations. First, CMT shows an inaccurate stratigraphic division result even in the middle of a certain sequence, indicated by the red arrow. Second, there are incorrect predicted results at the sequence boundaries, denoted by the blue arrows in Figure 5c,d. Our model provides a more accurate result in Figure 5e by comparing it with the other two models, demonstrating the effectiveness of our fused model.

4.2. The Comparisons of Different Regularization Weights

We validate the effectiveness of our proposed geological constraint (GC) in this section. Moreover, we discuss in detail the selection of the two regularization weights $\lambda_{1}$ and $\lambda_{2}$ assigned to the cross-entropy (CE) loss and our GC loss in Equation (3), subject to the constraint $\lambda_{1} + \lambda_{2} = 1$. Note that selecting $\lambda_{1} = 1$ and $\lambda_{2} = 0$ means that only the cross-entropy loss defined in Equation (1) is used for model training and our GC loss is not used.

As the weight of our GC loss increases in Table 2, i.e., $\lambda_{2}$ increases from 0 to 0.3, the model’s performance generally improves on the test set, despite some fluctuations with certain weight allocations. Specifically, when $\lambda_{1} = 0.8$ and $\lambda_{2} = 0.2$, the highest precision (0.8865) and F1 score (0.8857) on the blind test dataset are achieved. The quantitative results of the other weight settings are worse than those of $\lambda_{1} = 0.8$ and $\lambda_{2} = 0.2$. These results indicate that the introduction of our GC loss enhances the robustness and generalization ability of our suggested model, particularly in terms of improved classification accuracy and consistency, demonstrating that our proposed GC loss has a positive impact on the overall results by effectively stabilizing gradients during the learning process.

Afterward, we train our CMT-enhanced Hiformer first with $\lambda_{1} = 1.0$ and $\lambda_{2} = 0$ and then with $\lambda_{1} = 0.8$ and $\lambda_{2} = 0.2$, i.e., without and with our geological constraint. After model convergence, we randomly select a well from the blind test dataset, i.e., W25, whose well logs are shown in Figure 6a. The ground truth stratigraphic classification label is shown in Figure 6b, while the predicted results via our suggested CMT-enhanced Hiformer without and with our GC loss are indicated in Figure 6c,d. The red dashed lines indicate the sequence boundaries of W25. Our CMT-enhanced Hiformer can realize stratigraphic division based on well logs. However, the introduction of our geological constraint effectively improves the accuracy of stratigraphic division, especially at the sequence boundaries indicated by the red arrows. These images in Figure 6 verify the validity of our suggested GC loss in this study.

5. Stratigraphic Correlation Results

We calculate the average differences between the ground truth and predicted well tops (in meters) of the blind test dataset by using SegNet, CMT, Hiformer, CMT-enhanced Hiformer, and CMT-enhanced Hiformer with GC loss, as indicated in Table 3. Our suggested model obtains the smallest differences among these models. Moreover, the proposed geological constraint further reduces the difference of our model by 0.46 m, verifying its effectiveness. In addition, we select two representative well tops, i.e., Class 1 and Class 3, to calculate the statistical errors of the predicted results by using different methods, as shown in Figure 7. These images again show that our proposed model and geological constraint effectively improve the performance of automatic stratigraphic correlation.
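One way to compute such a well-top difference is sketched below. It assumes that the top of a class is the shallowest sample labelled with that class and that the difference is taken as an absolute value; neither detail is stated in the paper, and the depths and labels are hypothetical.

```python
def well_top_depth(depths, classes, target):
    """Depth of the first (shallowest) sample labelled `target`, or None if absent."""
    for depth, cls in zip(depths, classes):
        if cls == target:
            return depth
    return None


def mean_top_difference(tops_true, tops_pred):
    """Mean absolute difference over paired well tops, in the unit of the inputs."""
    diffs = [abs(t - p) for t, p in zip(tops_true, tops_pred)]
    return sum(diffs) / len(diffs)


# Hypothetical well sampled every 0.5 m, with the predicted top one sample too deep.
depths = [1500.0, 1500.5, 1501.0, 1501.5, 1502.0]
true_cls = [0, 0, 1, 1, 1]
pred_cls = [0, 0, 0, 1, 1]
top_true = well_top_depth(depths, true_cls, target=1)
top_pred = well_top_depth(depths, pred_cls, target=1)
difference = mean_top_difference([top_true], [top_pred])
```

In this toy well the predicted top lies 0.5 m below the true top, so the mean difference over this single well is 0.5 m.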

Next, we calculate the structural depth maps of Class 1’s top surfaces by using the predicted and ground truth well tops, as presented in Figure 8. These maps are calculated by using the results of (a) SegNet, (b) CMT, (c) Hiformer, and our suggested CMT-enhanced Hiformer (d) without and (e) with our GC loss, while Figure 8f is calculated by using the ground truth labels. Comparing these images shows that our result in Figure 8e is the closest to the result in Figure 8f, further demonstrating the effectiveness of our suggested method.

After predicting the stratigraphic division results, we show a well section parallel to the sediment source direction and a well section perpendicular to the sediment source direction, in Figure 9 and Figure 10, respectively. The dashed blue lines denote the stratigraphic correlation results of our CMT-enhanced Hiformer without GC loss, while the solid red lines indicate those of our CMT-enhanced Hiformer with GC loss. These results show that both methods can perform automatic stratigraphic correlation of well logs with high accuracy, verifying the effectiveness of our suggested CMT-enhanced Hiformer. Moreover, after introducing our GC loss, the stratigraphic boundaries are interpreted more accurately, as indicated by the red arrows in Figure 9 and Figure 10, further demonstrating the usefulness of our GC loss.

6. Discussion

We propose a CMT-enhanced Hiformer with a geological constraint for stratigraphic correlation of well logs. On the one hand, based on Hiformer, we replace its encoder with the feature extraction module of CMT. This improves the capability for extracting the local features of well logs and enables the analysis of stratigraphic units at different scales. On the other hand, the introduction of the geological constraint improves the accuracy of the predicted formation boundaries. The applications and tests on the dataset from the Shuanghe Oil Field demonstrate the effectiveness of our model. However, the prediction accuracy and the generalization ability depend on the number of training wells and their distribution.

The percentage of training wells is the most important factor that affects the prediction accuracy. To further test the robustness of our suggested model, we randomly select different proportions of wells for model training, i.e., from 10% to 60%. The validation dataset is again built by randomly selecting 5 wells, while the other wells are used for building the blind test dataset. After model training, we apply the different models to the blind test dataset and calculate the precision and F1, as shown in Figure 11. By comparing these values, we have several observations. First, no matter how many training wells are used, our suggested CMT-enhanced Hiformer with GC shows a superior classification effect compared to CMT-enhanced Hiformer without GC. This indicates that our proposed geological constraint improves the classification effectiveness of our CMT-enhanced Hiformer. Second, as the number of training wells decreases, i.e., from 30% to 10%, the performance of all models degrades significantly, indicating the great influence of the size of the training dataset on the successful implementation of DL models. Moreover, the increase in model accuracy is limited when the training percentage increases from 30% to 60% (3.53%), which is less pronounced than the increase from 10% to 30% (7.92%). Furthermore, more training wells lead to more training time. To balance the training efficiency and the prediction accuracy, we select 30% of the wells as our training dataset and implement the detailed comparisons in this study.

The accuracy of the model also depends on the spatial distribution of the training wells. In practical applications, different positions on the reservoir plane are controlled differently by the sediment source, so both the depositional characteristics and the logging response characteristics differ. To improve the accuracy and generalization ability of the model, it is necessary to select wells in different reservoir locations as training datasets.

Finally, we focus on the automatic stratigraphic correlation of well logs in this study. Nevertheless, the suggested model is essentially a classification or segmentation model that can also be used for other seismic interpretation tasks, such as fault detection and horizon picking. Specialized losses should then be built to improve the performance of DL models on each of these tasks.

7. Conclusions

In this paper, we propose a geology-informed deep learning network called CMT-enhanced Hiformer for stratigraphic correlation of well logs. The network architecture integrates the advantages of convolutional neural networks and transformers, along with a mixed loss function composed of cross-entropy and a geological constraint. The proposed method was validated using data from the Shuanghe oil field in central China, achieving a maximum F1 score of 0.8857 and a precision of 0.8865 on the blind test dataset. Experiments show that CMT-enhanced Hiformer significantly improves the accuracy and efficiency of stratigraphic correlation and outperforms the state-of-the-art counterparts.

Author Contributions

Conceptualization, investigation, writing—review and editing, Z.X.; Methodology, writing—original draft, validation, visualization, B.Z.; Validation, resources, B.L.; Analyzed data, visualization, W.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China grant number 42472209.

Data Availability Statement

The datasets in this study can be obtained by contacting the corresponding author.

Acknowledgments

The authors would like to thank the Shuanghe Oil Field for providing access to the well log dataset, which significantly contributes to this study.

Conflicts of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Liu, Y.; Shi, C.; Wu, Q.; Zhang, R.; Zhou, Z. Visual analytics of stratigraphic correlation for multi-attribute well-logging data exploration. IEEE Access 2019, 7, 98122–98135. [Google Scholar] [CrossRef]
Dai, Y.; Huang, X.; Liu, H.; Yang, H.; Wei, G.; Lu, N.; Han, Z.; Song, H. Stratigraphic automatic correlation using SegNet semantic segmentation model. In Proceedings of the SEG/AAPG/SEPM First International Meeting for Applied Geoscience & Energy, Denver, CO, USA, 26 September–1 October 2021; p. D011S065R003. [Google Scholar]
Partovi, S.M.A.; Sadeghnejad, S. Fractal parameters and well-logs investigation using automated well-to-well correlation. Comput. Geosci. 2017, 103, 59–69. [Google Scholar] [CrossRef]
Liang, J.; Wang, H.; Blum, M.J.; Ji, X. Demarcation and correlation of stratigraphic sequences using wavelet and Hilbert-Huang transforms: A case study from Niger Delta Basin. J. Pet. Sci. Eng. 2019, 182, 106329. [Google Scholar] [CrossRef]
Kadkhodaie, A.; Rezaee, R. Intelligent sequence stratigraphy through a wavelet-based decomposition of well log data. J. Nat. Gas Sci. Eng. 2017, 40, 38–50. [Google Scholar] [CrossRef]
Xu, Z.; Zhang, B.; Li, F.; Cao, G.; Liu, Y. Well-log decomposition using variational mode decomposition in assisting the sequence stratigraphy analysis of a conglomerate reservoir. Geophysics 2018, 83, B221–B228. [Google Scholar] [CrossRef]
Al-Baldawi, B.A.H. Using well logs data in Logfacies determination by applying the cluster analysis technique for Khasib Formation, Amara oil field, south Eastern Iraq. J. Pet. Res. Stud. 2016, 6, 9–26. [Google Scholar] [CrossRef]
Karimi, A.M.; Sadeghnejad, S.; Rezghi, M. Well-to-well correlation and identifying lithological boundaries by principal component analysis of well-logs. Comput. Geosci. 2021, 157, 104942. [Google Scholar] [CrossRef]
Ma, Y.Z. Lithofacies clustering using principal component analysis and neural network: Applications to wireline logs. Math. Geosci. 2011, 43, 401–419. [Google Scholar] [CrossRef]
Vincent, P.; Gartner, J.; Attali, G. An approach to detailed dip determination using correlation by pattern recognition. J. Pet. Technol. 1979, 31, 232–240. [Google Scholar] [CrossRef]
Behdad, A. A step toward the practical stratigraphic automatic correlation of well logs using continuous wavelet transform and dynamic time warping technique. J. Appl. Geophys. 2019, 167, 26–32. [Google Scholar] [CrossRef]
Fang, H.; Lou, Y.; Zhang, B.; Xu, H.; Lu, M. Mimicking the process of manual sequence stratigraphy well correlation. Interpretation 2021, 9, T667–T684. [Google Scholar] [CrossRef]
Wang, C.; Wei, X.; Pan, H.; Han, L.; Wang, H.; Wang, H.; Zhao, H. Well Logging Stratigraphic Correlation Algorithm Based on Semantic Segmentation. Appl. Geophys. 2024, 21, 650–666. [Google Scholar] [CrossRef]
Rudman, A.J.; Lankston, R.W. Stratigraphic correlation of well logs by computer techniques. AAPG Bull. 1973, 57, 577–588.
Mehta, C.; Radhakrishnan, S.; Srikanth, G. Segmentation of well logs by maximum-likelihood estimation. Math. Geol. 1990, 22, 853–869.
Wang, X.; Du, M.; Yu, W. Application of grey correlation method for stratigraphic correlation and its improvements. Well Logging Technol. 2006, 30, 126.
Zhang, T.; Plotnick, R.E. Graphic Biostratigraphic Correlation Using Genetic Algorithms. Math. Geol. 2006, 38, 781–800.
Hohn, M.E.; Fontana, M.V. Geostatistics and artificial intelligence applied to stratigraphic correlation. Am. Assoc. Pet. Geol. Conv. 1986, 70, 5201053.
Parimontonsakul, M.; Lotongkum, S.; Mularlee, K. A machine learning based approach to automate stratigraphic correlation through marker determination. Improv. Oil Gas Recovery 2023, 7, IOGR.1204.
Tognoli, F.M.W.; Spaniol, A.F.; Mello, M.E.D.; Souza, L.V.d. A machine-learning based approach to predict facies associations and improve local and regional stratigraphic correlations. Mar. Pet. Geol. 2024, 160, 19.
Malmgren, B.A.; Nordlund, U. Application of Artificial Neural Networks to Stratigraphic Correlation. Paleontol. Soc. Spec. Publ. 1996, 8, 257.
Tokpanov, Y.; Smith, J.; Ma, Z.; Deng, L.; Benhallam, W.; Salehi, A.; Zhai, X.; Darabi, H.; Castineira, D. Deep-learning-based automated stratigraphic correlation. In Proceedings of the SPE Annual Technical Conference and Exhibition, Virtual, 26–29 October 2020; p. D022S061R020.
Gu, X.; Lu, W.; Li, Y.; Wang, Y. Semi-supervised seismic stratigraphic interpretation constrained by spatial structure. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5912710.
Jiang, C.; Zhang, D.; Chen, S. Lithology identification from well-log curves via neural networks with additional geologic constraint. Geophysics 2021, 86, IM85–IM100.
Wang, D.; Chen, G. Intelligent seismic stratigraphic modeling using temporal convolutional network. Comput. Geosci. 2023, 171, 105294.
Yuan, S.; Liu, J.; Wang, S.; Wang, T.; Shi, P. Seismic waveform classification and first-break picking using convolution neural networks. IEEE Geosci. Remote Sens. Lett. 2018, 15, 272–276.
Feng, R.; Balling, N.; Grana, D.; Dramsch, J.S.; Hansen, T.M. Bayesian convolutional neural networks for seismic facies classification. IEEE Trans. Geosci. Remote Sens. 2021, 59, 8933–8940.
Pardo, E.; Garfias, C.; Malpica, N. Seismic phase picking using convolutional networks. IEEE Trans. Geosci. Remote Sens. 2019, 57, 7086–7092.
Hu, G.; Hu, Z.; Liu, J.; Cheng, F.; Peng, D. Seismic fault interpretation using deep learning-based semantic segmentation method. IEEE Geosci. Remote Sens. Lett. 2020, 19, 7500905.
Ferreira, R.S.; Oliveira, D.A.; Semin, D.G.; Zaytsev, S. Automatic velocity analysis using a hybrid regression approach with convolutional neural networks. IEEE Trans. Geosci. Remote Sens. 2020, 59, 4464–4470.
Liu, N.; Lei, Y.; Yang, Y.; Wei, S.; Gao, J.; Jiang, X. Self-supervised time-frequency representation based on generative adversarial networks. Geophysics 2023, 88, IM87–IM99.
Torres, J.F.; Hadjout, D.; Sebaa, A.; Martínez-Álvarez, F.; Troncoso, A. Deep learning for time series forecasting: A survey. Big Data 2021, 9, 3–21.
Raghu, M.; Unterthiner, T.; Kornblith, S.; Zhang, C.; Dosovitskiy, A. Do vision transformers see like convolutional neural networks? Adv. Neural Inf. Process. Syst. 2021, 34, 12116–12128.
Liu, N.; Huo, J.; Li, Z.; Wu, H.; Lou, Y.; Gao, J. Seismic attributes aided horizon interpretation using an ensemble dense inception transformer network. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5902010.
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. arXiv 2017, arXiv:1706.03762.
Liu, C.; Zhao, R.; Shi, Z. Remote-sensing image captioning based on multilayer aggregated transformer. IEEE Geosci. Remote Sens. Lett. 2022, 19, 6506605.
Zerveas, G.; Jayaraman, S.; Patel, D.; Bhamidipaty, A.; Eickhoff, C. A transformer-based framework for multivariate time series representation learning. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Virtual Event, Singapore, 14–18 August 2021; pp. 2114–2124.
Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 10012–10022.
Heidari, M.; Kazerouni, A.; Soltany, M.; Azad, R.; Aghdam, E.K.; Cohen-Adad, J.; Merhof, D. Hiformer: Hierarchical multi-scale representations using transformers for medical image segmentation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 2–7 January 2023; pp. 6202–6212.
Guo, J.; Han, K.; Wu, H.; Tang, Y.; Chen, X.; Wang, Y.; Xu, C. CMT: Convolutional neural networks meet vision transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 12175–12185.
Xu, K.; Chen, H.; Huang, C.; Ogg, J.G.; Zhu, J.; Lin, S.; Yang, D.; Zhao, P.; Kong, L. Astronomical time scale of the Paleogene lacustrine paleoclimate record from the Nanxiang Basin, central China. Palaeogeogr. Palaeoclimatol. Palaeoecol. 2019, 532, 109253.
Su, A.; Chen, H.; Zhao, J.; Feng, Y.; Nguyen, A.D. Exhumation filling and paleo-pasteurization of the shallow petroleum system in the North Slope of the Biyang Sag, Nanxiang Basin, China. Mar. Pet. Geol. 2021, 133, 105267.
Su, A.; Chen, H.; Zhao, J.; Feng, Y. Integrated fluid inclusion analysis and petrography constraints on the petroleum system evolution of the central and southern Biyang Sag, Nanxiang Basin, Eastern China. Mar. Pet. Geol. 2020, 118, 104437.
Dong, Y.; Zhu, X.; Xian, B.; Hu, T.; Geng, X.; Liao, J.; Luo, Q. Seismic geomorphology study of the Paleogene Hetaoyuan Formation, central-south Biyang Sag, Nanxiang Basin, China. Mar. Pet. Geol. 2015, 64, 104–124.

Figure 1. The simplified structure of the proposed CMT-enhanced Hiformer: (a) CMTSTEM, (b–d) CMTBLOCK module, (e) Lightweight Multi-Head Self-Attention (LMHSA), (f) Cross Attention, and (g) Inverted Residual Feed-Forward Network (IRFFN).

Figure 2. (a) Location of the study area and (b) structural map of the study area.

Figure 3. The well location map of the study area, where the red points represent the training wells, the green points represent the validation wells, and the gray points represent the test wells.

Figure 4. The training loss curves of CMT (red), Hiformer (green), and CMT-enhanced Hiformer (cyan).

Figure 5. Well logs, ground truth labels, and the predictions of different models for well W29: (a) well logs (GR, SP, and RT), (b) the ground truth stratigraphic classification, and the results predicted by (c) CMT, (d) Hiformer, and (e) CMT-enhanced Hiformer. The red dashed lines indicate the sequence boundaries.

Figure 6. Well logs, ground truth labels, and predictions for well W25: (a) well logs (GR, SP, and RT), (b) the ground truth stratigraphic classification, and the results predicted by the proposed CMT-enhanced Hiformer (c) without and (d) with the GC loss. The red dashed lines indicate the sequence boundaries.

Figure 7. Statistical errors of the predictions obtained by applying different models to the blind test dataset: (a) Class 1 and (b) Class 3.

Figure 8. Structural depth maps of the top surface of Class 1, computed from the predictions of (a) SegNet, (b) CMT, (c) Hiformer, and the proposed CMT-enhanced Hiformer (d) without and (e) with the GC loss, together with (f) the ground truth.

Figure 9. Automatic stratigraphic correlation results of the proposed model without (dashed blue) and with (solid red) the GC loss in a well section parallel to the sedimentary source direction.

Figure 10. Automatic stratigraphic correlation results of the proposed model without (dashed blue) and with (solid red) the GC loss in a well section perpendicular to the sedimentary source direction.

Figure 11. Precision and F1 of the CMT-enhanced Hiformer without and with the GC loss for different amounts of training data.

Table 1. Precision and F1 on the blind test dataset for CMT, Hiformer, and CMT-enhanced Hiformer.

Methods	Precision	F1
CMT	0.8365	0.8357
Hiformer	0.8277	0.8271
CMT-enhanced Hiformer	0.8717	0.8708
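
As a rough illustration of how per-sample precision and F1 scores such as those in Table 1 can be computed, the sketch below evaluates both metrics on a toy label sequence. Macro averaging over classes is an assumption here, since the averaging mode is not stated in this part of the article, and the function name `macro_precision_f1` and the toy labels are hypothetical.

```python
import numpy as np

def macro_precision_f1(y_true, y_pred, n_classes):
    """Macro-averaged precision and F1 over depth samples.

    Assumption: classes are weighted equally (macro averaging).
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    precisions, f1_scores = [], []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))  # true positives
        fp = np.sum((y_pred == c) & (y_true != c))  # false positives
        fn = np.sum((y_pred != c) & (y_true == c))  # false negatives
        p = tp / (tp + fp) if tp + fp > 0 else 0.0
        r = tp / (tp + fn) if tp + fn > 0 else 0.0
        f1 = 2 * p * r / (p + r) if p + r > 0 else 0.0
        precisions.append(p)
        f1_scores.append(f1)
    return float(np.mean(precisions)), float(np.mean(f1_scores))

# Toy example: six depth samples and three stratigraphic classes (made-up labels).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2]
precision, f1 = macro_precision_f1(y_true, y_pred, n_classes=3)
print(f"precision={precision:.4f}, F1={f1:.4f}")
```

If the reported scores use sample-weighted rather than macro averaging, the per-class values would be weighted by class frequency instead of averaged uniformly.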

Table 2. Ablation study of the proposed geological constraint under different regularization weights ($λ_{1}$, $λ_{2}$), obtained by applying the CMT-enhanced Hiformer to the blind test dataset.

Weights	Precision	F1
$λ_{1}$ = 1, $λ_{2}$ = 0	0.8717	0.8708
$λ_{1}$ = 0.95, $λ_{2}$ = 0.05	0.8433	0.8414
$λ_{1}$ = 0.90, $λ_{2}$ = 0.10	0.8571	0.8567
$λ_{1}$ = 0.85, $λ_{2}$ = 0.15	0.8735	0.8731
$λ_{1}$ = 0.80, $λ_{2}$ = 0.20	0.8865	0.8857
$λ_{1}$ = 0.75, $λ_{2}$ = 0.25	0.8759	0.8749
$λ_{1}$ = 0.70, $λ_{2}$ = 0.30	0.8489	0.8480
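
The weights in Table 2 always sum to one, and the row with $λ_{2}$ = 0 reproduces the CMT-enhanced Hiformer scores in Table 1, which suggests that $λ_{1}$ scales the classification loss and $λ_{2}$ scales the geological constraint. A minimal sketch of such a weighted loss follows. The `order_penalty` term is a hypothetical stand-in, not the constraint defined in the article: it assumes class indices increase with depth and penalizes predictions that step back to a shallower class.

```python
import numpy as np

def cross_entropy(probs, labels):
    """Mean cross-entropy; probs has shape (depth_samples, n_classes)."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))

def order_penalty(probs):
    """HYPOTHETICAL geological term (assumption, not the article's formula).

    Penalizes any decrease of the expected class index with depth.
    """
    expected_class = probs @ np.arange(probs.shape[1])
    steps = np.diff(expected_class)
    return float(np.mean(np.maximum(0.0, -steps)))

def total_loss(probs, labels, lam1=0.80, lam2=0.20):
    """Weighted sum of the classification and constraint terms."""
    return lam1 * cross_entropy(probs, labels) + lam2 * order_penalty(probs)

labels = np.array([0, 0, 1])
probs_ok = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8]])   # order respected
probs_bad = np.array([[0.1, 0.9], [0.9, 0.1], [0.2, 0.8]])  # order violated

print(total_loss(probs_ok, labels), total_loss(probs_bad, labels))
```

Setting `lam1=1.0, lam2=0.0` reduces the sketch to plain cross-entropy, which corresponds to the first row of Table 2.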

Table 3. Average differences (in meters) between the ground truth and predicted well tops on the blind test dataset for SegNet, CMT, Hiformer, CMT-enhanced Hiformer, and CMT-enhanced Hiformer with GC loss.

Methods	Average Differences (m)
SegNet	2.91
CMT	2.23
Hiformer	2.54
CMT-enhanced Hiformer	1.85
CMT-enhanced Hiformer with GC	1.39
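
A well-top difference like those in Table 3 can be derived from per-sample class predictions by locating the first depth at which each class appears. The sketch below shows one way to do this. The helper names, the toy labels, and the 0.125 m sampling interval are all assumptions for illustration, and the article may define the well top differently.

```python
import numpy as np

def well_tops(labels, depths):
    """Depth of the first sample of each class, taken as the top of that unit.

    The class of the shallowest sample is skipped because its "top" is only
    the start of the logged interval, not a stratigraphic boundary.
    """
    labels = np.asarray(labels)
    tops = {}
    for c in np.unique(labels):
        if c == labels[0]:
            continue
        tops[int(c)] = float(depths[np.argmax(labels == c)])
    return tops

def mean_top_difference(true_labels, pred_labels, depths):
    """Mean absolute depth difference over tops present in both sequences."""
    true_tops = well_tops(true_labels, depths)
    pred_tops = well_tops(pred_labels, depths)
    shared = sorted(set(true_tops) & set(pred_tops))
    return float(np.mean([abs(true_tops[c] - pred_tops[c]) for c in shared]))

# Toy well: eight samples at an assumed 0.125 m spacing (made-up labels).
depths = 1000.0 + 0.125 * np.arange(8)
true_labels = [0, 0, 0, 1, 1, 1, 2, 2]
pred_labels = [0, 0, 1, 1, 1, 1, 1, 2]
difference_m = mean_top_difference(true_labels, pred_labels, depths)
print(f"average well-top difference: {difference_m:.3f} m")
```

Averaging this quantity over all blind test wells would give a single figure per model, comparable in form to the values in Table 3.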

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

