Leakage Concentration Prediction and Interpretable Analysis of Buried Pipelines Based on Multi-Layer Perceptron and Interval Sampling
Abstract
1. Introduction
1.1. Background
1.2. Related Research Status
1.3. Contributions
- (1) A CFD simulation dataset for buried-pipeline leakage concentration prediction is constructed. By accounting for soil medium characteristics and multiple operating conditions, the simulation model generates a high-fidelity leakage diffusion dataset covering pressure, pipe diameter, burial depth, leakage aperture, porosity, particle size, soil temperature, time and spatial coordinates, providing a reliable sample source for subsequent machine learning modeling.
- (2) An interval-sampling strategy for massive CFD data is introduced. To reduce the storage and training cost of the 1.4 billion original CFD samples, 1:10 interval sampling is used to extract 140 million representative samples. Three independent sampling runs with different starting points are used to evaluate how the starting point affects model accuracy and stability, verifying that the sample size can be reduced while retaining representative information.
- (3) Multiple machine learning surrogate models are established and compared for leakage concentration prediction. Using the sampled dataset, LightGBM, XGBoost and MLP models are built under identical training and testing conditions and evaluated using MAE, MSE, RMSE, R² and EV. The MLP model is identified as the most suitable for this task because it achieves the highest prediction accuracy and the lowest error on the test set.
- (4) SHAP-based interpretability analysis is introduced to reveal the key factors controlling leakage concentration prediction. The direction and magnitude of each input feature's contribution are analyzed from both a global feature-importance perspective and a local, per-sample perspective. This clarifies the influence of spatial position, time, leakage aperture and soil particle size on the concentration distribution and strengthens the engineering credibility of the prediction model.
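The 1:10 interval sampling described in contribution (2) can be pictured as keeping every tenth record, with the repeated runs differing only in the record they start from. The sketch below is a minimal illustration under that assumption; the function name `interval_sample`, the offsets 0, 3 and 7, and the toy record list are hypothetical and are not taken from the study.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def interval_sample(records: Sequence[T], step: int = 10, start: int = 0) -> list[T]:
    """Keep every `step`-th record, beginning at index `start`."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    if not 0 <= start < step:
        raise ValueError("start must satisfy 0 <= start < step")
    return list(records[start::step])


# Toy stand-in for the ordered CFD records (hypothetical data).
records = list(range(1_000))

# Three runs that differ only in their starting point (offsets are assumptions).
runs = {start: interval_sample(records, step=10, start=start) for start in (0, 3, 7)}

for start, subset in runs.items():
    print(f"start={start}: kept {len(subset)} of {len(records)}, first={subset[:3]}")
```

Because each run uses a different starting offset with the same step, the three subsets do not overlap, so agreement between models trained on them indicates that the result is not tied to one particular subset.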
2. Methodology
2.1. Machine Learning Prediction Models
2.1.1. LightGBM
2.1.2. eXtreme Gradient Boosting (XGBoost)
2.1.3. Multi-Layer Perceptron (MLP)
2.2. Interpretability Analysis Method
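SHAP attributes a prediction to its input features using Shapley values. As a minimal sketch of that idea, the code below computes exact Shapley values for a toy three-feature function by enumerating all feature coalitions, with features outside a coalition replaced by a baseline value. The function `toy_model`, the sample `x` and the zero baseline are hypothetical; this brute-force enumeration is not the procedure used in the study, and practical SHAP tooling relies on approximations because the cost grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Sequence


def shapley_values(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> list[float]:
    """Exact Shapley values of `model` at `x`, relative to `baseline`."""
    n = len(x)

    def value(coalition: frozenset) -> float:
        # Features in the coalition take their real value, the rest the baseline.
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(features: Sequence[float]) -> float:
    """Hypothetical model: one additive term plus one interaction term."""
    a, b, c = features
    return 2.0 * a + b * c


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print("attributions:", phi)
print("sum:", sum(phi), "f(x) - f(baseline):", toy_model(x) - toy_model(baseline))
```

The attributions sum to the difference between the prediction at `x` and the prediction at the baseline, which is the additivity property that makes per-sample SHAP explanations readable as contributions.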
2.3. Model Evaluation Metrics
- (1) Mean Absolute Error (MAE)
- (2) Mean Squared Error (MSE)
- (3) Root Mean Squared Error (RMSE)
- (4) R-squared (R²)
- (5) Explained Variance (EV)
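The five metrics listed above can be computed from paired observations and predictions as sketched below, using their standard definitions: R² as one minus the ratio of mean squared error to the variance of the observations, and EV as one minus the ratio of residual variance to the variance of the observations. The function name `regression_metrics` and the sample values are hypothetical and serve only to illustrate the formulas.

```python
from math import sqrt
from statistics import fmean, pvariance
from typing import Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> dict[str, float]:
    """Return MAE, MSE, RMSE, R2 and EV for paired observations and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    var_true = pvariance(y_true)
    if var_true == 0:
        raise ValueError("R2 and EV are undefined when y_true is constant")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = fmean([abs(r) for r in residuals])
    mse = fmean([r * r for r in residuals])
    return {
        "MAE": mae,
        "MSE": mse,
        "RMSE": sqrt(mse),
        "R2": 1.0 - mse / var_true,                       # penalizes any error
        "EV": 1.0 - pvariance(residuals) / var_true,      # ignores constant bias
    }


# Hypothetical values: every prediction is offset from the truth by 0.5.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.5, 3.5, 4.5]
metrics = regression_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this example the predictions carry a constant offset, so EV equals 1 while R² is 0.8. Reporting both therefore helps separate systematic bias from scatter.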
3. Model Training and Result Analysis
3.1. Model Parameter Settings
3.2. Model Performance Comparison
3.3. Interval Sampling of the Optimal Model
3.4. SHAP Interpretability Analysis
4. Conclusions
- (1) Superiority of MLP for concentration regression: Among the models tested, MLP achieved the highest accuracy (R² = 0.9988, RMSE = 0.0153), demonstrating its ability to capture the complex, nonlinear mapping from multi-physical features (pressure, soil properties, spatial coordinates) to leakage concentration. This makes it a suitable surrogate model for CFD simulation.
- (2) Validation of the 1:10 sampling strategy: Three independent repeated experiments showed that the 1:10 interval-sampling strategy reduces the data volume from 1.4 billion to 140 million samples without compromising model performance. The resulting MLP model was highly stable across runs (R² ≈ 0.9987, CV ≈ 0%), confirming that the strategy is reliable for reducing the volume of large-scale CFD data.
- (3) Physical interpretability via SHAP: The SHAP analysis showed quantitatively that spatial location (X, Z) is the dominant factor, followed by time and leakage aperture. Notably, soil particle size in the X, Y and Z directions exhibited nonlinear effects, with intermediate values maximizing the predicted concentration, reflecting its complex influence on gas diffusion in anisotropic porous media.
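Assuming that CV in conclusion (2) denotes the coefficient of variation (standard deviation divided by the mean, expressed as a percentage), the stability check across repeated sampling runs can be sketched as follows. The three R² values are placeholders chosen for illustration and are not the per-run results of the study; the use of the sample standard deviation is likewise an assumption.

```python
from statistics import fmean, stdev
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Return the sample standard deviation as a percentage of the mean."""
    mean = fmean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return stdev(values) / abs(mean) * 100.0


# Placeholder R2 scores for three sampling runs (hypothetical, not reported data).
r2_runs = [0.9985, 0.9987, 0.9989]
cv_percent = coefficient_of_variation(r2_runs)
print(f"mean R2 = {fmean(r2_runs):.4f}, CV = {cv_percent:.4f}%")
```

A CV that rounds to 0% means the spread between runs is negligible relative to the mean score, which is the sense in which the sampling starting point does not affect the result.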
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest









| No. | Feature Name | Unit | Description |
|---|---|---|---|
| 1 | P | MPa | Pressure |
| 2 | D | mm | Pipe Diameter |
| 3 | L | m | Burial Depth |
| 4 | Dia | mm | Leakage Aperture |
| 5 | Sh | Dimensionless | Leakage Hole Shape Category |
| 6 | Ori | Dimensionless | Leakage Direction |
| 7 | PorX | Dimensionless | Porosity in X-Direction |
| 8 | ParX | μm | Particle Size in X-Direction |
| 9 | PorY | Dimensionless | Porosity in Y-Direction |
| 10 | ParY | μm | Particle Size in Y-Direction |
| 11 | PorZ | Dimensionless | Porosity in Z-Direction |
| 12 | ParZ | μm | Particle Size in Z-Direction |
| 13 | Soil-T | °C | Soil Temperature |
| 14 | Time | Dimensionless | Time Step |
| 15 | X-axis | Dimensionless | Spatial X Coordinate |
| 16 | Y-axis | Dimensionless | Spatial Y Coordinate |
| 17 | Z-axis | Dimensionless | Spatial Z Coordinate |

| Parameter | Value | Description |
|---|---|---|
| Hidden Layer Structure | [128, 64, 32] | Three fully connected layers |
| Activation Function | ReLU | Hidden layer |
| Output Layer Activation Function | Linear | Regression output |
| Dropout | 0.2 | Prevent overfitting |
| Batch Size | 1024 | Number of samples per batch |
| Epochs | 100 | Maximum training rounds |
| Optimizer | Adam | Adaptive learning rate |
| Initial Learning Rate | 0.001 | Fixed |
| Early Stopping | 10 rounds | Stop when validation loss does not decrease |
| Loss Function | MSE | Mean squared error |
| Weight Initialization | Xavier | Stabilize training |
| random_state | 42 | Random seed |
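The settings in the table above imply a fairly small network. The sketch below counts its trainable parameters, computes the Xavier initialization range for each layer, and expresses the early-stopping rule as a plain function. It assumes that all 17 listed features are used as inputs, that the output is a single concentration value, and that "Xavier" refers to the uniform variant; the function names are hypothetical and the validation-loss sequence is made up for illustration.

```python
from math import sqrt
from typing import Sequence

# 17 inputs (assumed), hidden layers [128, 64, 32] from the table, 1 output (assumed).
LAYER_SIZES = [17, 128, 64, 32, 1]


def count_parameters(sizes: Sequence[int]) -> int:
    """Weights plus biases of a fully connected network with these layer widths."""
    return sum(fan_in * fan_out + fan_out for fan_in, fan_out in zip(sizes, sizes[1:]))


def xavier_uniform_bound(fan_in: int, fan_out: int) -> float:
    """Half-width of the uniform range used by Xavier (Glorot) initialization."""
    return sqrt(6.0 / (fan_in + fan_out))


def early_stopping_epoch(
    val_losses: Sequence[float], patience: int = 10, max_epochs: int = 100
) -> int:
    """Return the 1-based epoch at which training would stop."""
    best = float("inf")
    stale = 0  # consecutive epochs without a new best validation loss
    last_epoch = 0
    for last_epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                return last_epoch
    return last_epoch


total_params = count_parameters(LAYER_SIZES)
print("trainable parameters:", total_params)
for fan_in, fan_out in zip(LAYER_SIZES, LAYER_SIZES[1:]):
    print(f"{fan_in:>3} -> {fan_out:<3} Xavier bound = {xavier_uniform_bound(fan_in, fan_out):.4f}")

# Made-up validation losses: improvement stops after the second epoch.
stop_epoch = early_stopping_epoch([1.0, 0.9] + [0.95] * 20)
print("training stops at epoch:", stop_epoch)
```

Under these assumptions the network has 12,673 trainable parameters, which is small relative to the number of training samples and is consistent with using dropout and early stopping as the only regularization.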
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Yu, Z.; Wang, X.; Qu, T.; Pan, T.; Liu, K.; Hong, S.; Cen, X.; Li, Z.; Yin, Z.; Wang, M. Leakage Concentration Prediction and Interpretable Analysis of Buried Pipelines Based on Multi-Layer Perceptron and Interval Sampling. Processes 2026, 14, 1771. https://doi.org/10.3390/pr14111771

