MS-Unet: A Multi-Scale Feature Fusion U-Net for 3D Seismic Fault Detection

Cui, Lijie; Huang, Yawen; Niu, Yuxi; Cui, Hongyan; Tao, Ye; Qian, Longlong; Zhao, Jiaqi

doi:10.3390/pr13071976

by Lijie Cui *, Yawen Huang, Yuxi Niu, Hongyan Cui, Ye Tao, Longlong Qian and Jiaqi Zhao

Petroleum Institute, China University of Petroleum-Beijing at Karamay, Karamay 834000, China

* Author to whom correspondence should be addressed.

Processes 2025, 13(7), 1976; https://doi.org/10.3390/pr13071976

Submission received: 30 April 2025 / Revised: 4 June 2025 / Accepted: 16 June 2025 / Published: 23 June 2025

(This article belongs to the Section Process Control and Monitoring)

Abstract

Accurate detection of fault structures in seismic data is vital for oil and gas exploration and geological hazard assessment. These faults exhibit diverse scales, shapes, and levels of complexity, ranging from small fractures to large-scale discontinuities across seismic volumes. Considering the multi-scale nature of fault features, we propose MS-Unet, an improved U-Net architecture that incorporates multi-scale feature fusion. This approach integrates encoder feature maps at different spatial resolutions, enabling the network to capture both local details and global structural context more effectively. We validate our model using the Dutch North Sea F3 dataset and seismic data from an oilfield in the Junggar Basin, China. The results demonstrate that MS-Unet outperforms other methods in preserving fault continuity, enhancing detail resolution, and improving structural interpretability. These findings highlight the potential of multi-scale deep learning architectures for robust and automated seismic fault identification.

Keywords: seismic fault detection; multi-scale feature fusion; U-Net; convolutional neural network

1. Introduction

Faults are significant geological structures formed by brittle deformation in the upper part of the Earth’s crust [1], and fault interpretation refers to the process of mapping discontinuous faults [2]. Accurate fault interpretation is not only critical for improving the efficiency of hydrocarbon exploration, but also forms the foundation for seismic hazard assessment, urban infrastructure planning, and geotechnical engineering design. It plays a vital role in ensuring public safety and supporting the sustainable development of natural resources.

Traditional seismic fault detection techniques typically compute and analyze discontinuity attributes of seismic reflection signals, such as similarity [3], coherence [4], curvature [5], and variance. Other approaches, including optimal surface voting [6], local fault models [7], and topological metrics [8], offer automated or semi-automated interpretation. However, these methods often require manual selection of hyperparameters, which can affect fault identification accuracy and algorithm stability. To reduce manual intervention, researchers have introduced artificial neural networks (ANNs) to fuse multiple seismic attributes for fault imaging [9,10,11,12,13,14,15,16,17,18]. However, these methods depend heavily on high-quality seismic data and substantial computational resources. If the input data are heavily contaminated by noise or have insufficient resolution, the accuracy of the final results is likely to suffer.

In recent years, deep learning has introduced a new perspective to seismic fault interpretation due to its excellent self-learning and generalization capabilities, enabling rapid, cost-effective, and reproducible fault predictions [19]. Xiong et al. [20] trained a CNN model to automatically detect and map fault zones using annotated seismic image cubes, outperforming coherence techniques in terms of fault continuity and clarity—where clarity refers to the visual distinctness of fault boundaries, including contrast and edge sharpness, in the seismic images.

Given the scarcity and difficulty of acquiring field data, Wu et al. [21] applied a simplified U-Net model trained on synthetic datasets, demonstrating that networks trained on synthetic data possess some generalization capability. The U-Net architecture itself, however, dates back to Ronneberger et al. [22] in 2015, and researchers have since explored more advanced architectures and methods to improve network performance and generalization in fault interpretation tasks. For example, Yang et al. [23] employed a densely connected U-Net++ for fault identification, enhancing the differentiation of adjacent faults. Gao et al. [24] introduced nested residuals into the U-Net, fusing three fault feature maps with low, medium, and high resolutions. Yan et al. [25] proposed WNet, which added a new extended path in the decoder to improve the flow of information through skip connections, allowing features from different levels to be integrated into the final results. To further improve the convergence speed and generalization of neural networks, researchers have incorporated attention mechanisms and other modules into network architectures to strengthen useful features and suppress irrelevant information [26,27].

These methods aim to enable the network to capture more features through the skip connections. However, whether through the dense connections of U-Net++, the shortcut connections of residual networks, or the dual extension paths of WNet, the features extracted remain limited to a single scale. In this study, we define “scale” in two key dimensions relevant to seismic fault interpretation: spatial scale, referring to the physical size of fault features, from small local discontinuities to large basin-wide structures; and resolution scale, related to seismic data sampling and imaging resolution, which impacts feature detectability. Fault development itself has multi-scale and multi-resolution characteristics [28]. To address the challenges posed by faults of varying complexity and scale, we propose a multi-scale feature fusion approach. Taking U-Net, which is widely adopted for segmentation, as the base network, we name our model Multi-Scale U-Net (MS-Unet). The primary innovation of this network lies in the fusion of multi-scale feature information extracted by the encoder. Before each skip connection, the encoder feature maps from different scales are resampled to a common resolution and concatenated, and the fused features are then passed through the skip connection and upsampling. This approach reduces the loss of information caused by layer-by-layer compression, improves fault identification accuracy, and enriches the spatial morphology of faults, enabling a more comprehensive understanding of fault structures.

Recent studies have also highlighted the importance of neural network architecture optimization in achieving robust predictive performance. For instance, various approaches have been explored to select optimal configurations, including genetic algorithms [29], Bayesian model selection [30], and structural performance modeling [31]. These findings reinforce the need for principled architectural design in fault detection tasks. Therefore, the proposed MS-Unet follows a lightweight yet expressive configuration validated in prior work [22], while integrating multi-scale fusion modules to further enhance its segmentation capability.

2. Materials and Methods

2.1. Overall Architecture

The proposed MS-Unet network builds on the U-Net architecture. U-Net is a symmetric U-shaped architecture consisting of an encoder and a decoder, and it is widely used across semantic segmentation tasks due to its simplicity and efficiency. As shown in Figure 1, the encoder part of the U-Net model adopts the classic CNN architecture, using convolutional and pooling layers for multi-level feature extraction and downsampling of the input image. The decoder part uses upsampling and skip connections to progressively restore the feature maps from the encoder back to the original size, achieving fine-grained segmentation of the image.

However, U-Net has limitations when applied to more detailed semantic segmentation tasks. In deep neural networks, deeper layers have larger receptive fields and stronger semantic representation capabilities, but lower feature resolution, which weakens their ability to represent geometric information. Conversely, shallower layers have smaller receptive fields, stronger geometric detail representation, and higher resolution, but weaker semantic representation capabilities. As the network depth increases, features are compressed layer by layer, making it easy to lose useful information. Although U-Net uses skip connections to compensate for this loss, the contributions of different levels of skip connections to the final result vary. U-Net does not differentiate between these contributions, which can result in key information being overlooked or irrelevant information being overly emphasized [25].

To address these limitations, we introduce a multi-scale feature fusion mechanism, which integrates feature maps of different resolutions at various decoder levels to enhance the model’s ability to capture fine details. As shown in Figure 2, the feature map from each convolutional block in the encoder is downsampled or upsampled to match the dimensions of the current layer, followed by a 3 × 3 × 3 convolution and the concatenation of features for fusion. This enhances the network’s ability to represent multi-scale features. A 1 × 1 × 1 convolution is then used to compress the feature map’s channel count, reducing computational cost while retaining important features.

2.2. Network Architecture Details

The proposed MS-Unet is implemented as a 3D fully convolutional architecture, consisting of an encoder, multi-scale fusion modules, and a decoder. The encoder comprises four convolutional blocks, each containing two 3D convolution layers with kernel size 3 × 3 × 3, padding set to ‘same’, and ReLU activation. Feature resolution is progressively halved through 3D max pooling (stride = 2), and the number of filters increases as follows: 16 → 32 → 64 → 128. The encoder–decoder framework, number of layers, and feature channel settings follow a lightweight U-Net design proposed by Wu et al. [21], who demonstrated that such configurations achieve high segmentation performance while maintaining computational efficiency. Building upon this validated structure, we introduced multi-scale fusion modules to enhance feature integration across hierarchical levels.
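The encoder description above implies a simple shape progression, which the following minimal sketch tabulates. It is only bookkeeping, not the authors' implementation: the function name `encoder_shapes` is hypothetical, and it assumes a 128 × 128 × 128 input and stride-2 pooling after every block except the last, which matches the 32 × 32 × 32 resolution the text later associates with conv3.

```python
def encoder_shapes(input_size=128, filters=(16, 32, 64, 128)):
    """Return (block name, size per spatial axis, channels) for each encoder block."""
    shapes = []
    size = input_size
    for i, channels in enumerate(filters, start=1):
        # 'same' padding keeps the spatial size unchanged inside a block.
        shapes.append((f"conv{i}", size, channels))
        # Assumption: stride-2 max pooling follows every block except the last.
        if i < len(filters):
            size //= 2
    return shapes


for name, size, channels in encoder_shapes():
    print(f"{name}: {size} x {size} x {size}, {channels} channels")
```

Under these assumptions the deepest block (conv4) works on 16 × 16 × 16 feature maps with 128 channels.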

The multi-scale fusion mechanism aggregates encoder features at different spatial resolutions at three decoder stages. For instance, at the second decoder level, feature maps from shallow and deep layers (e.g., conv1, conv2, conv3, and upsampled conv4) are rescaled to a uniform resolution (e.g., 32 × 32 × 32) using 3D up/downsampling, as shown in Figure 3. These are then concatenated and passed through a 1 × 1 × 1 convolution for dimensionality reduction, followed by two 3 × 3 × 3 convolutions for feature refinement. This process is repeated at higher resolutions (64³ and 128³). All intermediate layers use ReLU activations, and the final prediction is generated through a 1 × 1 × 1 convolution with a sigmoid activation to output voxel-wise fault probability.
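The rescaling step in the 32 × 32 × 32 example can be checked with a short sketch that lists, for each encoder feature map, the resampling needed to reach a target resolution. The names `ENCODER` and `fusion_plan` are hypothetical, the encoder sizes assume a 128³ input, and the channel total assumes that resampling leaves channel counts unchanged before concatenation, which the text does not state explicitly. Only the 32³ case is spelled out in the text; applying the same rule at other resolutions is an assumption.

```python
# Assumed encoder outputs for a 128^3 input: name -> (size per axis, channels).
ENCODER = {
    "conv1": (128, 16),
    "conv2": (64, 32),
    "conv3": (32, 64),
    "conv4": (16, 128),
}


def fusion_plan(target_size, encoder=ENCODER):
    """List the resampling each encoder map needs, plus channels after concatenation."""
    plan = []
    total_channels = 0
    for name, (size, channels) in encoder.items():
        if size > target_size:
            step = ("downsample", size // target_size)
        elif size < target_size:
            step = ("upsample", target_size // size)
        else:
            step = ("identity", 1)
        plan.append((name, step[0], step[1]))
        # Assumption: channel counts are preserved until the 1x1x1 compression.
        total_channels += channels
    return plan, total_channels


plan, channels = fusion_plan(32)
for name, op, factor in plan:
    print(f"{name}: {op} x{factor}")
print("channels entering the 1x1x1 convolution:", channels)
```

With these assumptions, the 1 × 1 × 1 convolution at the 32³ stage would receive 240 channels, which illustrates why a channel-compression step is useful before the two 3 × 3 × 3 refinement convolutions.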

To further clarify, in this study, we define overall expressiveness as the network’s ability to extract, preserve, and reconstruct multi-scale fault features, especially in terms of continuity, boundary sharpness, and completeness. This expressiveness is enhanced by the multi-scale fusion mechanism, which strengthens spatial coherence and interpretability in the final prediction.

The advantage of this architecture is that it enables dense reuse and fusion of hierarchical features, thus enhancing the model’s representation capability for both high-level semantics and low-level geometric details.

2.3. Evaluation Indicators

To evaluate the performance of the model and ensure its accuracy and reliability in real-world applications, this study uses the following four metrics: Intersection over Union (IOU) [32], Precision, Recall [33], and the Dice coefficient [34]. IOU and Dice measure the overlap between the predicted results and the ground truth labels, while Precision and Recall focus on false positives and false negatives, respectively. The specific formulas are as follows:

IOU = \frac{TP}{TP + FP + FN} \qquad (1)

Precision = \frac{TP}{TP + FP} \qquad (2)

Recall = \frac{TP}{TP + FN} \qquad (3)

Dice = \frac{2 \times TP}{2 \times TP + FP + FN} \qquad (4)

where TP (True Positive) is the number of voxels correctly predicted as faults, FP (False Positive) is the number of voxels incorrectly predicted as faults, and FN (False Negative) is the number of fault voxels that the model failed to detect. Together, these metrics assess the similarity between the model’s predictions and the ground truth, as well as the false positive and false negative behavior when identifying faults.
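As a worked illustration of Equations (1) to (4), the sketch below computes the four metrics from two flattened binary masks. The function names are hypothetical, and returning 0.0 for an empty denominator is a convention chosen here rather than something specified in the text.

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives and false negatives."""
    tp = fp = fn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif t and not p:
            fn += 1
    return tp, fp, fn


def fault_metrics(pred, truth):
    """Evaluate IOU, Precision, Recall and Dice as in Equations (1)-(4)."""
    tp, fp, fn = confusion_counts(pred, truth)

    def ratio(num, den):
        # Convention assumed here: an empty denominator yields 0.0.
        return num / den if den else 0.0

    return {
        "iou": ratio(tp, tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Toy example: 8 voxels, flattened to a list (1 = fault, 0 = background).
pred = [1, 1, 1, 0, 0, 0, 1, 0]
truth = [1, 1, 0, 0, 0, 1, 1, 0]
print(fault_metrics(pred, truth))
```

In this toy case TP = 3, FP = 1, and FN = 1, so IOU = 0.6 and Dice = 0.75. The two overlap measures are monotonically related through Dice = 2 IOU / (1 + IOU), so they rank predictions identically even though their values differ.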

3. Results

3.1. Synthetic Data

This study utilizes the synthetic seismic dataset released by Wu et al. [35], which includes 200 training samples and 20 validation samples, each with dimensions of 128 × 128 × 128. The data was generated by embedding randomly simulated faults and folds into reflectivity models, followed by convolution with Ricker wavelets and the addition of noise. Each sample contains a 3D seismic volume and its corresponding fault label, with an example shown in Figure 4, where white regions indicate the labeled fault structures.
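To make the data-generation recipe more concrete, here is a heavily simplified one-dimensional sketch: a sparse reflectivity series is convolved with a Ricker wavelet and Gaussian noise is added. It is not the published generation workflow, which builds 3D volumes with simulated faults and folds. The sampling interval, peak frequency, spike positions, and noise level are all illustrative assumptions.

```python
import math
import random


def ricker(t, peak_freq):
    """Ricker wavelet amplitude at time t (s) for a given peak frequency (Hz)."""
    a = (math.pi * peak_freq * t) ** 2
    return (1.0 - 2.0 * a) * math.exp(-a)


def convolve_same(signal, kernel):
    """Convolve with an odd-length kernel, keeping the output length of `signal`."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + half - k
            if 0 <= j < len(signal):
                acc += signal[j] * weight
        out.append(acc)
    return out


dt = 0.002        # assumed sampling interval in seconds
peak_freq = 25.0  # assumed peak frequency in Hz
half_len = 20     # wavelet half-length in samples
wavelet = [ricker(k * dt, peak_freq) for k in range(-half_len, half_len + 1)]

# Two isolated reflectors stand in for a full reflectivity model.
reflectivity = [0.0] * 64
reflectivity[20] = 1.0
reflectivity[40] = -0.5

rng = random.Random(0)  # fixed seed so the noisy trace is repeatable
clean = convolve_same(reflectivity, wavelet)
noisy = [x + rng.gauss(0.0, 0.05) for x in clean]
print(round(clean[20], 3), round(clean[40], 3))
```

Each reflector is replaced by a scaled copy of the wavelet centred on it, which is the same mechanism that turns a reflectivity model into a seismic image in the 3D case.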

The experiments were conducted in a PyCharm environment using the TensorFlow (version 2.10.1) framework and trained on an NVIDIA GeForce RTX 3090 GPU for 50 epochs. The model was optimized using the Adam optimizer with a learning rate of 1 × 10⁻⁴, employing binary cross-entropy as the loss function. A batch size of 1 was adopted due to the high memory requirements of 3D seismic volumes and to ensure that the model could fully capture the spatial context of each individual sample without downsampling. This setting also facilitates better structural learning in fault regions by avoiding interference from inter-sample variability. During training, the model weights were saved at each epoch, and the final model used for evaluation was selected based on the epoch with the best validation performance.
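For reference, the binary cross-entropy loss named above can be written out directly. The sketch below evaluates the mean loss over a flattened list of voxels; the function name and the clipping constant `eps` are assumptions added for numerical safety, not settings reported in the text.

```python
import math


def binary_cross_entropy(truth, prob, eps=1e-7):
    """Mean binary cross-entropy over voxels (labels in {0, 1}, probabilities in [0, 1])."""
    total = 0.0
    for y, p in zip(truth, prob):
        # Clip probabilities so that log(0) can never occur (eps is an assumed value).
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(truth)


# A confident, correct prediction gives a small loss ...
print(binary_cross_entropy([1, 0], [0.9, 0.1]))
# ... while a confident, wrong prediction is penalised heavily.
print(binary_cross_entropy([1, 0], [0.1, 0.9]))
```

Because fault voxels are a small minority of each volume, an unweighted mean such as this one is dominated by background voxels, which is one reason overlap metrics such as IOU and Dice are reported alongside the loss.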

Upon comparing the training outcomes of the U-Net, WNet, and MS-Unet models, as illustrated in Figure 5 and Figure 6, it becomes clear that with the progression of training iterations, WNet attains a higher voxel-wise classification accuracy than U-Net. Moreover, MS-Unet further surpasses WNet in both training and validation accuracy (Figure 5).

In terms of loss values, MS-Unet also demonstrates lower binary cross-entropy (BCE) loss compared to the other two models (Figure 6), which can be attributed to its multi-scale fusion mechanism. This mechanism enhances feature representation by integrating spatial and semantic information from various hierarchical levels, thereby improving model generalization and convergence.

We further compared the IOU, Precision, Dice coefficient, and AP (Average Precision) values. The curves in Figure 7a–c indicate that MS-Unet outperforms U-Net and WNet during the early to middle stages of training across these metrics. In the later stages, the models tend to approach a performance plateau on the synthetic dataset, although minor fluctuations—especially in the WNet curve—are observed, reflecting typical stochastic training dynamics and architectural differences. The comparison of PR curves and AP values in Figure 7d further demonstrates that MS-Unet achieves superior performance on class-imbalanced datasets.
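The text does not state how AP was computed. One common definition, used here purely as an assumption, sums the precision at each correctly ranked positive weighted by the corresponding increase in recall. The sketch below implements that definition for voxel labels and predicted probabilities; it ignores tied scores for simplicity.

```python
def average_precision(labels, scores):
    """AP as the sum of precision values weighted by recall increments."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(labels)
    if n_pos == 0:
        return 0.0
    tp = 0
    ap = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            tp += 1
            # Recall rises by 1 / n_pos at each positive; precision here is tp / rank.
            ap += (tp / rank) / n_pos
    return ap


labels = [1, 0, 1, 1]          # 1 = fault voxel
scores = [0.9, 0.8, 0.7, 0.1]  # predicted fault probabilities
print(average_precision(labels, scores))
```

For this toy ranking the precision at the three positives is 1, 2/3, and 3/4, giving AP = 29/36 ≈ 0.806. Because AP depends only on how positives are ranked, it stays informative when fault voxels are rare.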

To further evaluate the model’s performance in boundary prediction, we introduced the Hausdorff Distance (HD) as an additional metric. HD measures the maximum deviation between the predicted fault boundary and the ground truth, providing a more rigorous assessment of worst-case boundary error. As shown in Figure 8, we compare the HD trends over training epochs for three models: U-Net (blue) and WNet (orange) exhibit higher HD values during the early training stages, indicating larger boundary prediction errors; MS-Unet (green) demonstrates a significantly faster convergence rate and consistently lower HD values, reflecting better boundary robustness and lower model uncertainty. After approximately epoch 14, MS-Unet stabilizes at a low HD value (1.0), indicating superior performance in fault boundary localization. This result further confirms the effectiveness of our multi-scale feature fusion strategy in enhancing structural accuracy and reducing boundary ambiguity.
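The Hausdorff Distance described above can be illustrated on small sets of voxel coordinates. The brute-force sketch below uses the symmetric form, taking the larger of the two directed distances; the function names are hypothetical, and how boundary voxels were extracted for Figure 8 is not specified in the text.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point in `a` to its nearest neighbour in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Voxel coordinates (z, y, x) of a predicted and a reference fault boundary.
predicted = [(0, 0, 0), (0, 1, 0), (0, 2, 0), (0, 2, 3)]
reference = [(0, 0, 0), (0, 1, 0), (0, 2, 0)]
print(hausdorff(predicted, reference))
```

Here a single stray predicted voxel three samples away from the reference sets the distance to 3.0, even though every other voxel matches exactly. This sensitivity to the worst outlier is what makes HD a strict boundary metric. The brute-force search scales with the product of the two set sizes, so real volumes would call for a spatial index or a library routine.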

3.2. Three-Dimensional Field Data Examples

In this section, we selected widely used seismic data from the Dutch North Sea F3 block and data from an oilfield in the Junggar Basin, China, to compare the generalization capabilities of different networks. Experimental results demonstrate that all three networks can identify most of the faults, but both U-Net and WNet exhibit issues such as blurred boundaries and information loss.

Figure 9 presents fault identification results on the 3D seismic data from the F3 block, with dimensions of 128 [vertical] × 384 [inline] × 512 [crossline]. Panels (a)–(d) show the input seismic data and the U-Net, WNet, and MS-Unet identification results, respectively. The results reveal that both U-Net (b) and WNet (c) exhibit noticeable discontinuities and missing fault detections, especially in the regions highlighted by yellow ellipses. These issues indicate blurred fault boundaries and information loss during feature extraction and reconstruction. In contrast, MS-Unet (d) produces more continuous and complete fault delineations, with sharper boundaries and better alignment with geological features, resulting in clearer and more interpretable outputs.

The improved performance of MS-Unet can be attributed to its multi-scale feature fusion mechanism, which effectively integrates features across spatial resolutions, preserving fine details that are otherwise lost in single-scale architectures such as U-Net and WNet.

Figure 10 presents a horizontal time slice at vertical = 90 from the F3 block, along with the fault identification results of the networks. The comparison shows that MS-Unet outperforms the other two models in terms of fault continuity. We observed that none of the three networks accurately identified the complex faults in the lower right corner. We attribute this to the limitations of synthetic training data, which lack the complexity and diversity of real seismic data. Synthetic data often fail to fully simulate the various geological conditions, noise, and irregularities encountered in real seismic surveys. For example, synthetic datasets may not adequately capture the complex variations in fault orientations, densities, and intersecting structures present in actual subsurface environments. As a result of this discrepancy, models can perform well on synthetic validation sets yet still require better generalization when applied to real-world data.

Figure 11 presents time slices of the seismic data overlaid with identified faults, using a black-white-red color scheme to enhance fault visibility. Panel (a) shows the original time slice, while panels (b), (c), and (d) depict fault identification results from U-Net, WNet, and MS-Unet, respectively. The yellow ellipses highlight regions of interest where MS-Unet demonstrates superior performance. Specifically, MS-Unet captures richer details in the northeast-oriented fault located in the upper right corner, better aligning with geological patterns interpreted from the seismic data. It should be noted that these geological patterns are interpretations derived from the seismic data rather than independent datasets, and our results show improved consistency with these geological interpretations. Furthermore, in the lower left region, MS-Unet’s fault identification results appear more continuous and complete compared to the other methods. These improvements underscore the effectiveness of MS-Unet’s multi-scale feature fusion in preserving detailed fault structures and enhancing interpretability.

Figure 12 presents a volumetric fault identification result based on a 3D seismic data subset from an oilfield in the Junggar Basin, China (512 [vertical] × 384 [inline] × 128 [crossline]), offering a comprehensive overview of the fault structures within the seismic volume. Panels (a), (b), and (c) show the fault detection outcomes from U-Net, WNet, and MS-Unet, respectively. The results indicate that the fault maps produced by U-Net and WNet contain numerous interruptions and discontinuities, especially in regions highlighted by the yellow ellipses. In contrast, MS-Unet delivers more continuous and coherent fault detections, capturing finer fault details that are missed by the other two models.

Notably, this dataset features complex strike-slip faults, which present significant challenges for fault detection due to their intricate geometry and heterogeneous characteristics. We observe that neural networks trained solely on synthetic datasets exhibit limited performance when applied to such complex real-world seismic scenarios. This limitation underscores the necessity of incorporating more diverse and realistic training data to enhance model generalization and robustness.

These results demonstrate the advantage of MS-Unet’s multi-scale feature fusion in preserving fault continuity and enhancing detail resolution, while also highlighting the challenges posed by geological complexity in practical applications.

We also compared the fault identification results along a cross-section of the region (crossline = 70). Unlike the volumetric view presented in Figure 12, Figure 13 focuses specifically on this 2D vertical slice, enabling a detailed comparison of fault continuity and morphology within this cross-section.

Figure 13b,e,h display the results for U-Net, WNet, and MS-Unet, respectively. The areas highlighted with green ellipses show that MS-Unet identifies more fault details, while the areas in red ellipses indicate that, compared to the other two networks, MS-Unet exhibits fewer interruptions in the identified faults, leading to a more continuous fault structure and improved fault quality. This improvement is due to the multi-scale fusion mechanism of MS-Unet, which reduces the loss of useful features and enhances network performance, thus providing richer fault identification.

Furthermore, Figure 13i overlays the identified faults from Figure 13h onto the original seismic data using a black-white-red color scheme to enhance fault visibility. This overlay allows for direct visual verification of the accuracy and geological consistency of the fault detection. The faults identified by MS-Unet align well with seismic reflectors, confirming the reliability of the proposed method.

It is worth noting that the field seismic datasets used in this section, including the F3 block and Junggar Basin, do not provide publicly available ground-truth fault annotations. Therefore, we are unable to perform quantitative evaluations such as accuracy or Dice coefficient for these examples. Instead, performance was assessed qualitatively based on fault continuity, completeness, and geological plausibility, which are common and accepted evaluation criteria in seismic interpretation. This qualitative approach, though subjective to some extent, allows for meaningful visual comparison and interpretability across models when ground truth is unavailable.

In summary, MS-Unet can effectively enhance fault continuity and compensate for some of the missing fault information. The experimental results indicate that neural networks trained on synthetic data generalize to real seismic data to a certain degree: they identify most faults and show promise for improvement through further network optimization. However, when the seismic data are highly complex, the performance of neural network models can degrade. Moreover, geological complexity (including intricate fault networks, lithological variations, and subtle stratigraphic features) also poses significant challenges for fault identification. Such geological heterogeneity may result in ambiguous seismic signatures, complicating accurate fault detection. Therefore, seismic data quality and geological complexity jointly influence model performance, underscoring the need to incorporate realistic geological scenarios in future training datasets to improve robustness and generalization.

4. Conclusions

This study proposes MS-Unet, an improved U-Net, and trains it on synthetic data for fault identification. By employing a multi-scale feature fusion mechanism, MS-Unet integrates features of different scales obtained during the encoding stage, thereby enhancing the network’s ability to preserve detail and improving its overall expressiveness, which in this study is defined as the capacity to represent multi-scale fault structures with greater continuity, boundary sharpness, and structural completeness during decoding. This method enhances the accuracy and continuity of 3D fault identification and improves the interpretability of fault structures. However, the identification accuracy remains limited when dealing with intersecting or complex fault zones, which in this study refer to fault systems characterized by irregular geometries, closely spaced or intersecting faults, and heterogeneous seismic responses. These complexities challenge the model’s ability to distinguish boundaries and maintain structural continuity, particularly when it is trained on synthetic datasets. To further improve the network’s performance in complex fault identification, we plan to incorporate real seismic data into the synthetic dataset to enrich training diversity and realism, and to explore advanced architectural enhancements such as attention mechanisms or dynamic convolution.

Author Contributions

Methodology, L.Q. and J.Z.; Validation, Y.H. and Y.T.; Resources, L.C.; Data curation, Y.H. and Y.N.; Writing—original draft, Y.H.; Writing—review & editing, H.C. and L.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Research Foundation of China University of Petroleum-Beijing at Karamay (No. XQZX20240011) and the Xinjiang Uygur Autonomous Region “Tianchi Talents” Introduction Program.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

Fossen, H. Structural Geology; Cambridge University Press: Cambridge, UK, 2016. [Google Scholar]
An, Y.; Guo, J.; Ye, Q.; Childs, C.; Walsh, J.; Dong, R. Deep Convolutional Neural Network for Automatic Fault Recognition from 3D Seismic Datasets. Comput. Geosci. 2021, 153, 104776. [Google Scholar] [CrossRef]
Marfurt, K.J.; Kirlin, R.L.; Farmer, S.L.; Bahorich, M.S. 3-D Seismic Attributes Using a Semblance-based Coherency Algorithm. Geophysics 1998, 63, 1150–1165. [Google Scholar] [CrossRef]
Bahorich, M.; Farmer, S. 3-D Seismic Discontinuity for Faults and Stratigraphic Features: The Coherence Cube. Lead. Edge 1995, 14, 1053–1058. [Google Scholar] [CrossRef]
Roberts, A. Curvature Attributes and Their Application to 3D Interpreted Horizons. First Break. 2001, 19, 85–100. [Google Scholar] [CrossRef]
Wu, X.; Fomel, S. Automatic Fault Interpretation with Optimal Surface Voting. Geophysics 2018, 83, O67–O82. [Google Scholar] [CrossRef]
Lou, Y.; Zhang, B.; Wang, R.; Lin, T.; Cao, D. Seismic Fault Attribute Estimation Using a Local Fault Model. Geophysics 2019, 84, O73–O80. [Google Scholar] [CrossRef]
Lou, Y.; Zhang, B.; Yong, P.; Fang, H.; Zhang, Y.; Cao, D. Semiautomatic Fault-Surface Generation and Interpretation Using Topological Metrics. Geophysics 2021, 86, O13–O27. [Google Scholar] [CrossRef]
Tingdahl, K.M.; De Rooij, M. Semi-automatic Detection of Faults in 3D Seismic Data. Geophys. Prospect. 2005, 53, 533–542. [Google Scholar] [CrossRef]
Basir, H.M.; Javaherian, A.; Yaraki, M.T. Multi-Attribute Ant-Tracking and Neural Network for Fault Detection: A Case Study of an Iranian Oilfield. J. Geophys. Eng. 2013, 10, 015009. [Google Scholar] [CrossRef]
Zheng, Z.H.; Kavousi, P.; Di, H.B. Multi-Attributes and Neural Network-Based Fault Detection in 3D Seismic Interpretation. Adv. Mater. Res. 2014, 838, 1497–1502. [Google Scholar] [CrossRef]
Kumar, P.C.; Mandal, A. Enhancement of Fault Interpretation Using Multi-Attribute Analysis and Artificial Neural Network (ANN) Approach: A Case Study from Taranaki Basin, New Zealand. Explor. Geophys. 2018, 49, 409–424. [Google Scholar] [CrossRef]
Srivastava, E.; Mandal, A.; Kumar, P.C. Seismic Data Conditioning and Multiattribute Analysis for Enhanced Structural Interpretation: A Case Study from Offshore Nova Scotia, Scotian Basin. In Proceedings of the SEG International Exposition and Annual Meeting (SEG 2017), Houston, TX, USA, 24–29 September 2017. [Google Scholar]
Kumar, P.C.; Sain, K. Attribute Amalgamation-Aiding Interpretation of Faults from Seismic Data: An Example from Waitara 3D Prospect in Taranaki Basin off New Zealand. J. Appl. Geophys. 2018, 159, 52–68. [Google Scholar] [CrossRef]
Mandal, A.; Srivastava, E. Enhanced Structural Interpretation from 3D Seismic Data Using Hybrid Attributes: New Insights into Fault Visualization and Displacement in Cretaceous Formations of the Scotian Basin, Offshore Nova Scotia. Mar. Pet. Geol. 2018, 89, 464–478.
Kumar, P.C.; Kamal’deen, O.O.; Alves, T.M.; Sain, K. A Neural Network Approach for Elucidating Fluid Leakage along Hard-Linked Normal Faults. Mar. Pet. Geol. 2019, 110, 518–538.
Cui, L.; Wu, K.; Liu, Q.; Wang, D.; Guo, W.; Liu, Y.; Xu, G. Enhanced Interpretation of Strike-Slip Faults Using Hybrid Attributes: Advanced Insights into Fault Geometry and Relationship with Hydrocarbon Accumulation in Jurassic Formations of the Junggar Basin. J. Pet. Sci. Eng. 2022, 208, 109630.
Mirkamali, M.S.; Keshavarz Fk, N.; Bakhtiari, M.R. Fault Zone Identification in the Eastern Part of the Persian Gulf Based on Combined Seismic Attributes. J. Geophys. Eng. 2013, 10, 015007.
An, Y.; Du, H.; Ma, S.; Niu, Y.; Liu, D.; Wang, J.; Du, Y.; Childs, C.; Walsh, J.; Dong, R. Current State and Future Directions for Deep Learning Based Automatic Seismic Fault Interpretation: A Systematic Review. Earth-Sci. Rev. 2023, 243, 104509.
Xiong, W.; Ji, X.; Ma, Y.; Wang, Y.; Al Bin Hassan, N.M.; Ali, M.N.; Luo, Y. Seismic Fault Detection with Convolutional Neural Network. Geophysics 2018, 83, O97–O103.
Wu, X.; Liang, L.; Shi, Y.; Fomel, S. FaultSeg3D: Using Synthetic Data Sets to Train an End-to-End Convolutional Neural Network for 3D Seismic Fault Segmentation. Geophysics 2019, 84, IM35–IM45.
Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015; Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F., Eds.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2015; Volume 9351, pp. 234–241. ISBN 978-3-319-24573-7.
Yang, D.; Cai, Y.; Hu, G.; Yao, X.; Zou, W. Seismic Fault Detection Based on 3D Unet++ Model. In Proceedings of the SEG International Exposition and Annual Meeting, Virtual, 11–16 October 2020; p. D031S039R002.
Gao, K.; Huang, L.; Zheng, Y. Fault Detection on Seismic Structural Images Using a Nested Residual U-Net. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–15.
Yan, B.; Qian, L.; Zhao, J.; Li, M.; Pan, R. Fault Identification Based on W-Net in 3D Seismic Images. IEEE Geosci. Remote Sens. Lett. 2024, 21, 7504805.
Liu, G.; Ma, Z. Fault Identification of Post Stack Seismic Data by Improved Unet Network. Chin. J. Comput. Phys. 2023, 40, 742.
Li, X.; Li, K.; Xu, Z.; Huang, Z.; Dou, Y. Fault-Seg-Net: A Method for Seismic Fault Segmentation Based on Multi-Scale Feature Fusion with Imbalanced Classification. Comput. Geotech. 2023, 158, 105412.
Wang, X.; Jin, Z.; Chen, G.; Peng, M.; Huang, L.; Wang, Z.; Zeng, L.; Lu, G.; Du, X.; Liu, G. Multi-Scale Natural Fracture Prediction in Continental Shale Oil Reservoirs: A Case Study of the Fengcheng Formation in the Mahu Sag, Junggar Basin, China. Front. Earth Sci. 2022, 10, 929467.
Domashova, J.V.; Emtseva, S.S.; Fail, V.S.; Gridin, A.S. Selecting an Optimal Architecture of Neural Network Using Genetic Algorithm. Procedia Comput. Sci. 2021, 190, 263–273.
Kuok, S.; Yuen, K. Broad Bayesian Learning (BBL) for Nonparametric Probabilistic Modeling with Optimized Architecture Configuration. Comput.-Aided Civ. Infrastruct. Eng. 2021, 36, 1270–1287.
Oh, B.K.; Kim, J. Optimal Architecture of a Convolutional Neural Network to Estimate Structural Responses for Safety Evaluation of the Structures. Measurement 2021, 177, 109313.
Everingham, M.; Van Gool, L.; Williams, C.K.I.; Winn, J.; Zisserman, A. The Pascal Visual Object Classes (VOC) Challenge. Int. J. Comput. Vis. 2010, 88, 303–338.
Powers, D.M.W. Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness and Correlation. arXiv 2020, arXiv:2010.16061.
Dice, L.R. Measures of the Amount of Ecologic Association between Species. Ecology 1945, 26, 297–302.
Wu, X.; Geng, Z.; Shi, Y.; Pham, N.; Fomel, S.; Caumon, G. Building Realistic Structure Models to Train Convolutional Neural Networks for Seismic Structural Interpretation. Geophysics 2020, 85, WA27–WA39.

Figure 1. U-Net network architecture.

Figure 2. MS-Unet network architecture.

Figure 3. Multi-scale fusion mechanism, taking the second layer as an example.

Figure 4. A synthetic 3D seismic sample with fault labels. The seismic amplitude is shown in color, and white surfaces indicate simulated fault regions used as training labels.

Figure 5. Accuracy comparison of U-Net, WNet, and MS-Unet over 50 epochs. Accuracy refers to voxel-wise binary classification accuracy, computed on both training and validation sets.

Figure 6. Binary cross-entropy loss comparison of U-Net, WNet, and MS-Unet during training. Lower validation loss reflects better generalization performance.

Figure 7. Evaluation metrics. (a) IOU. (b) Dice. (c) Precision. (d) Precision–Recall.

Figure 8. Hausdorff Distance comparison.

Figure 9. Fault identification results on actual seismic data. (a) Seismic label. (b) U-Net identification result. (c) WNet identification result. (d) MS-Unet identification result.

Figure 10. Identifying faults in seismic data. (a) Corresponding labels of time slices of 3D seismic images. (b) U-Net identification results. (c) WNet identification results. (d) MS-Unet identification results.

Figure 11. Superimposed images of fault identification using seismic data. (a) Time slice of 3D seismic image. (b) U-Net identification result. (c) WNet identification result. (d) MS-Unet identification result.

Figure 12. Fault identification on a seismic data subset from an oilfield in the Junggar Basin, China. (a) U-Net identification results. (b) WNet identification results. (c) MS-Unet identification results.

Figure 13. Fault identification and stacking images using seismic data from an oilfield in the Junggar Basin, China. (a,d,g) Time slice of 3D seismic image. (b,c) U-Net identification result. (e,f) WNet identification result. (h,i) MS-Unet identification result.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
