Research on Multi-Source Collaborative Leakage Location Method for Coal Mine Gas Extraction Pipeline Based on Stacking Integration Learning

Zhou, Jie; Zhang, Weihong; Zhao, Ju; Ge, Jiaqi; Li, Wenjing; Liu, Ji

doi:10.3390/pr14121908


by Jie Zhou ¹,²,*, Weihong Zhang ¹, Ju Zhao ¹, Jiaqi Ge ³, Wenjing Li ³ and Ji Liu ¹

¹ College of Energy and Chemical Engineering, Shaanxi Energy Institute, Xianyang 712000, China

² College of Energy and Mining Engineering, Xi’an University of Science and Technology, Xi’an 710054, China

³ College of Safety Science and Engineering, Xi’an University of Science and Technology, Xi’an 710054, China

* Author to whom correspondence should be addressed.

Processes 2026, 14(12), 1908; https://doi.org/10.3390/pr14121908

Submission received: 10 May 2026 / Revised: 28 May 2026 / Accepted: 5 June 2026 / Published: 11 June 2026

(This article belongs to the Section Energy Systems)


Abstract

Accurately locating leakage points is a key part of underground gas disaster prevention. To address the low positioning accuracy for gas extraction pipeline leakage, a gas extraction pipeline leakage experimental system was built, and a multi-source collaborative leakage localization method based on Stacking learning was proposed. The results showed that the Stacking–LSSVM–Elman–DBN (S-L-E-D) model with pressure–flow collaborative input achieved the best localization performance, with an accuracy of 0.932, a Root Mean Square Error (RMSE) of 0.053, a Mean Absolute Percentage Error (MAPE) of 0.082, a Theil Inequality Coefficient (TIC) of 0.056, and a distance error below 1 m. Compared with a single-parameter input, the collaborative pressure–flow input improved the localization accuracy by more than 10%, while the RMSE and MAPE decreased by 39.0% and 37.4%, respectively. When monitoring point 1, 4, or 5 was faulty, the localization accuracy was 0.884, 0.891, and 0.881, respectively, and a dual fault at monitoring points 1 and 4 still yielded an accuracy of 0.861. The study provides a feasible multi-source collaborative learning framework for leakage localization in gas extraction pipelines and offers a methodological reference for improving leakage monitoring and early warning.

Keywords: gas extraction pipeline; leakage identification; stacking integration learning; multi-parameter fusion

1. Introduction

With the continuous increase in the depth and intensity of coal mining in China, the amount of gas emission has increased significantly, posing a serious threat to production safety [1,2,3]. The gas extraction system is a key technical carrier for the efficient transportation of gas [4]. As an important component of the system, the gas extraction pipeline is widely used in the complex underground environment. However, gas extraction pipelines are prone to leakage caused by harsh underground conditions and aging, which reduces the utilization efficiency and economic operation of the system. Moreover, leaked gas may cause an explosion when it encounters fire sources or static electricity. Research on leakage location methods therefore holds significant theoretical and engineering value for enhancing the safety and operational efficiency of coal mine gas extraction systems.

The pipeline leakage diagnosis methods based on Deep Learning (DL) have been widely applied in recent years. In contrast to existing leakage detection methods that are constructed based on accurate pipeline flow models, the adaptive capability of Neural Networks (NNs) [5,6] is leveraged to identify various pipeline operating conditions. Yan et al. [7] proposed a Temporal Convolutional Network (TCN) method that leverages negative pressure wave signals and supervised learning to directly map sensor inputs to leakage positions, avoiding the need for complex physical modeling. Liu et al. [8] established pipeline leakage diagnostic methods based on a Back Propagation Neural Network (BPNN). The results showed that the proposed leakage diagnosis method based on a BPNN has a better transfer learning performance than that based on Random Forest (RF) or Support Vector Machine (SVM). Rahmati [9] proposed a pipeline leakage diagnosis method based on an NN. This model was validated with actual data, which ensured the practicability of the proposed method. Miao et al. [10] constructed a relationship between residual magnetic flux density and stress using an Improved Sparrow Search Algorithm (ISSA) and an Extreme Learning Machine (ELM). The results demonstrated that the relative error of the proposed residual magnetic stress model was controlled within 3%, indicating that the method could effectively identify the severity of the leakage. Tian et al. [11] employed the Extreme Learning Machine (ELM) in the leakage detection method. Through data training and model adjustment, a detection accuracy over 90% was achieved. Zhou et al. [12] obtained the flow and pressure data of each monitoring point under different leakage point sizes through gas extraction pipeline leakage experiments. 
The Simulated Annealing–Particle Swarm Optimization (SA-PSO)–BPNN leakage diagnosis model based on monitoring point changes was established using the validation samples, and the diagnosis accuracy was 93.33%. Chuang et al. [13] adopted a Convolutional Neural Network (CNN) as the leakage identification model, extracting Mel Frequency Cepstral Coefficients (MFCCs) from pipeline acoustic signals as characteristic indicators; consequently, the identification accuracy was improved from 96.6% to 99.4%. Zhu et al. [14] developed a Multi-Scale Attention Convolutional Neural Network (MSACNN) to effectively decouple coupled features under complex operating conditions via a multi-scale attention mechanism. The experimental results demonstrated that under extreme data imbalance conditions, the proposed hybrid diagnostic model could achieve accurate fault identification. Lu et al. [15] combined Variational Mode Decomposition (VMD) and Local Linear Embedding (LLE) for the feature extraction of Acoustic Emission (AE) signals, which achieved a recognition accuracy of 95%. Mujtaba et al. [16] adopted fault signatures to identify leakage in a natural gas pipeline; mass flow rate measurements were used to estimate an Auto Regressive eXogenous (ARX) model, and the minimum detectable leak with zero false alarms was 0.084 m in diameter. Zuo et al. [17] proposed a semi-supervised leakage detection method comprising an improved Long Short-Term Memory (LSTM) Autoencoder and a One-Class Support Vector Machine (OCSVM) to reduce dependency on leakage data. The results showed that an accuracy of 98% and an AUC of 99% were achieved on the actual dataset.

The identification of pipeline leakage using an SVM was fundamentally based on discovering characteristic quantities that could represent leakage. Subsequently, features distinct from noise signals were identified, and the training of the SVM was performed based on these selected features [18,19]. Tan et al. [20] developed a coupled CFD-DL framework for hydrogen leakage characterization and diagnosis. The result showed that the SVM demonstrates robust generalization in small-sample, high-dimensional cases. Yang et al. [21] proposed a novel integration model of a One-Dimensional Convolution Neural Network (1DCNN) and SVM to improve the detection accuracy in the process of pipeline leakage detection. The experimental results demonstrated that the developed integration model can extract pipeline data features more accurately and was more robust in pipeline leakage detection. Li et al. [22] established PSO with a Decreased and Variable Amplitude strategy in the Least Squares Twin Support Vector Machine (DVAPSO-LSTSVM) leakage diagnosis model. The experimental results showed that the leakage identification accuracy of the DVAPSO-LSTSVM was 95.2%. Hys et al. [23] combined K-Nearest Neighbors (KNN) with an SVM to identify the operating states of valves, and employed regression models to estimate the valve leakage rates. Bui and Kim [24] proposed a trained multi-class SVM for gas leakage diagnosis using a spectral portrait of AE signals acquired by two sensor channels.

Other machine learning-based methods are also widely used in pipeline leakage identification. Liu et al. [25] developed a probabilistic prediction model for the leakage diameter in gas transmission pipelines based on Bayesian Networks (BNs). The experimental results showed that the BNs model could accurately evaluate the leakage diameter. Yuan et al. [26] proposed a Whale-Optimized Evolutionary Convolutional Neural Network model (WOA-ECNN) for leakage diagnosis research. The results demonstrated that the proposed pixel-level fusion method can accurately identify and differentiate between leakage and non-leakage states. Li et al. [27] proposed a leakage diagnosis method based on the frequency oscillation function model. The results indicated that the absolute positioning errors of all leakage diagnosis cases were within 1%. Zhang et al. [28] combined Complete Integration Empirical Mode Decomposition with Adaptive Noise (CIEMDAN) and Sparse Representation (SR) techniques to create a signal processing model. The experiments confirm its effectiveness, with a RMSE of 0.1135 mm between the predicted and actual leak severity values. Zhu et al. [29] discussed the effectiveness of fuzzy C-means (FCM), K-means, and K-medoids for leakage rate discrimination. The results showed that the K-medoids clustering achieved an identification accuracy of 96.28%. Wang et al. [30] proposed a leakage detection and localization model based on Variational Mode Decomposition and Dynamic Time Warping (VMD-DTW). The results demonstrated that the proposed method offers higher localization accuracy, with a localization error of 0.247%.

To summarize, although existing machine learning-based leakage localization methods have improved diagnostic accuracy, most studies still rely on single-source monitoring data or single prediction models, which limits model robustness and adaptability under complex underground conditions. For example, our previous study [12] mainly focused on optimizing a single BPNN model and analyzing the influence of different monitoring point inputs on the localization accuracy. However, it did not systematically compare pressure-only, flow-only, and collaborative inputs, nor evaluate localization robustness under monitoring point fault conditions. Therefore, this study developed a multi-source collaborative leakage detection framework by combining complementary pressure–flow features with heterogeneous learners and adopting a stacked ensemble learning approach. Compared with single-parameter or single-model methods, the proposed framework improves localization stability and robustness under complex operating conditions.

2. Experiment on the Gas Flow Characteristics of the Leakage Gas Extraction Pipeline

2.1. Construction of the Gas Extraction Pipeline Leakage Experimental System

After a leakage point occurs in a coal mine gas extraction pipeline, the flow and pressure along the pipeline change accordingly [31]. The variation characteristics of the flow and pressure in the pipelines were therefore analyzed so that pipeline leakage could be identified accurately from the fluid parameters. The schematic diagram of the gas extraction pipeline leakage experimental system is shown in Figure 1. The experimental pipeline was divided into three sections: a main pipeline with a diameter of 50 mm and a length of 13 m, and two groups of branch pipelines with a diameter of 32 mm and a length of 4.5 m. Each section of the pipeline was equipped with flow and pressure monitoring devices. For each group of leakage points, three repeated tests were conducted, and the average value was obtained. Data were collected for 30 s after leakage, with one set of data recorded every 150 ms, resulting in 200 samples (30 s / 0.15 s) for each leakage condition. These samples were used to construct the leakage localization dataset, and the training and validation sets were divided at a ratio of 3:1. The training–validation split ensured that sufficient samples were used for model learning, while an independent validation subset was retained for evaluating model generalization performance. The present validation mainly reflects the model performance under the current laboratory dataset division. To verify the rationality of this data partitioning strategy and reduce the influence of random sample division, 5-fold cross-validation was further adopted during model training.
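The sample bookkeeping described above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes that the five folds are drawn from the training part of the 3:1 split, which the text does not state explicitly.

```python
# Constants taken from the text; everything else is an illustrative assumption.
SAMPLE_INTERVAL_S = 0.150        # one record every 150 ms
SAMPLES_PER_CONDITION = 200      # samples per leakage condition
TRAIN_PARTS, VALID_PARTS = 3, 1  # 3:1 training/validation ratio
N_FOLDS = 5


def recording_window_s(n_samples, interval_s):
    """Length of the recording window covered by n_samples records."""
    return n_samples * interval_s


def split_counts(n_samples, train_parts, valid_parts):
    """Number of training and validation samples for a given ratio."""
    n_train = n_samples * train_parts // (train_parts + valid_parts)
    return n_train, n_samples - n_train


def fold_sizes(n_samples, n_folds):
    """Sizes of n_folds nearly equal folds."""
    base, extra = divmod(n_samples, n_folds)
    return [base + (1 if f < extra else 0) for f in range(n_folds)]


window = recording_window_s(SAMPLES_PER_CONDITION, SAMPLE_INTERVAL_S)
n_train, n_valid = split_counts(SAMPLES_PER_CONDITION, TRAIN_PARTS, VALID_PARTS)
# Assumption: cross-validation folds are taken from the training part only.
folds = fold_sizes(n_train, N_FOLDS)
print(window, n_train, n_valid, folds)
```

Under these assumptions each leakage condition contributes 150 training and 50 validation samples, and each cross-validation fold holds 30 samples per condition.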

To ensure the reliability and reproducibility of the experimental data, the main instruments used in the gas extraction pipeline leakage experimental system were further specified. The experimental system mainly consisted of a rotary vane vacuum pump, flow monitoring devices, pressure sensors, leakage control valves, and a data acquisition unit. The flow and pressure signals at each monitoring point were synchronously collected by the data acquisition system. Before the experiment, all sensors were calibrated according to the instructions of the manufacturer, and zero-point correction was conducted to reduce systematic measurement errors. The detailed models, manufacturers, measurement ranges, and accuracies of the main experimental instruments are listed in Table 1. The locations of leakage points and monitoring points are shown in Table 2.

2.2. The Gas Flow Characteristic of the Leakage Gas Extraction Pipeline

2.2.1. Data Processing Based on EEMD

Ensemble Empirical Mode Decomposition (EEMD) [32] adds Gaussian white noise to the signal, performs Empirical Mode Decomposition (EMD) on the noise-added copy, and repeats this procedure multiple times with different noise realizations. The corresponding Intrinsic Mode Functions (IMFs) obtained from the repeated decompositions are then averaged, so that the added white noise cancels out and modal aliasing is suppressed. The white noise standard deviation σ and ensemble number Nd are two key parameters affecting the denoising performance of the EEMD. In general, excessively small σ values may lead to insufficient modal separation and residual modal aliasing, while excessively large σ values may introduce additional interference and distort the original signal characteristics. Similarly, increasing Nd can improve the decomposition stability and reduce random noise influence, but it also increases the computational complexity and processing time. The influence of the EEMD parameters on the localization accuracy is shown in Table 3.

As shown in Table 3, the EEMD parameters significantly affected the localization performance. When σ was too small, modal aliasing could not be fully suppressed, while excessively large σ values introduced additional noise and increased the RMSE and MAPE. Increasing Nd improved the decomposition stability, but the improvement became limited beyond 100 and increased the computational complexity. Therefore, σ = 0.05 and Nd = 100 were selected as the optimal parameter combination by balancing the denoising performance, localization accuracy, and computational efficiency.
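The trade-off between σ and Nd can be made concrete with the standard EEMD averaging relations. The block below is a sketch rather than the authors' formulation: the symbols n^{(m)}, c_j^{(m)}, and ε are introduced here for illustration, and σ is assumed to be expressed relative to the signal amplitude.

```latex
% m-th noise-added copy of the signal x(t); n^{(m)}(t) is unit-variance Gaussian white noise
x^{(m)}(t) = x(t) + \sigma \, n^{(m)}(t), \qquad m = 1, \dots, N_d

% EMD of each copy gives IMFs c_j^{(m)}(t); the ensemble IMF is their average
\bar{c}_j(t) = \frac{1}{N_d} \sum_{m=1}^{N_d} c_j^{(m)}(t)

% standard deviation of the added noise that remains after averaging
\varepsilon = \frac{\sigma}{\sqrt{N_d}}, \qquad
\sigma = 0.05,\; N_d = 100 \;\Rightarrow\; \varepsilon = 0.005
```

Because ε falls only with the square root of Nd, quadrupling the ensemble size halves the residual noise, which is consistent with the limited improvement reported beyond Nd = 100.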

The flow and pressure time series before and after denoising are shown in Figure 2. As can be seen from Figure 2, EEMD denoising effectively separates noise from the real signal and significantly weakens random interference. At the same time, the nonlinear and sudden-change characteristics of the signal are well retained.

2.2.2. Analysis of Gas Flow Characteristics in the Leaked Gas Extraction Pipeline

The distribution of the change rates at each monitoring point when leakage occurred at different locations is shown in Figure 3, where M and L denote monitoring points and leakage points, respectively. The pressure and flow responses at different monitoring points varied significantly with different leakage locations, mainly due to the spatial relationship between the leakage and monitoring points. For flow signals, M1 showed a clear response when L1 was activated, while M1 and M2 increased by 0.0317 and 0.0809 under L2. When L3 was activated, all monitoring points showed decreased flow rates, with M4 reaching the lowest change rate of −0.1628. For pressure signals, only M2 increased slightly under L4, with a change rate of 0.0167.

The relatively broad dispersion range for leakage point 5 in Figure 3 is mainly attributed to its location in branch pipeline 2 and the complex gas redistribution between the main pipeline and branch pipeline. After L5 was activated, the leakage disturbance was transmitted unevenly to different monitoring points, causing some signals to change slightly while others fluctuated significantly. For example, the pressure response at M3 reached a maximum change rate of 0.1100, indicating strong spatial differences in signal response. This phenomenon is mainly caused by branch pipeline resistance, local turbulence, airflow redistribution, and the distance between leakage and monitoring points. The broad standard deviation range of L5 reflects stronger spatial heterogeneity and pressure–flow instability under branch pipeline leakage, rather than abnormal data. It further indicates that single-source monitoring is insufficient for complex pipeline structures, and pressure–flow collaborative analysis is necessary to improve localization robustness.

3. The Method of Leakage Location in Gas Extraction Pipeline Based on Stacking Integration Learning

3.1. The Establishment Process of the Leakage Location Model for Gas Extraction Pipeline

3.1.1. The Leakage Location Model for Gas Extraction Pipeline

Integration (ensemble) learning can effectively balance out different data noises and enhance model generalization ability, and its prediction results are normally more accurate than those of the best individual model [33,34]. Stacking integration learning was used to fuse multiple base learners: each base learner generated prediction results, these results were combined into new features, and the final result was produced through an MR model [35]. The Least Squares Support Vector Machine (LSSVM), Elman Neural Network (ENN), and Deep Belief Network (DBN) were used as the base models in the Stacking integration model in this study. Five-fold cross-validation was adopted to balance model reliability and computational efficiency. Since the dataset size was limited, the five-fold strategy improved data utilization and reduced the randomness caused by data partitioning. Compared with 10-fold cross-validation, it also avoided excessive computational cost and unstable validation caused by overly small validation subsets [36]. The gas extraction pipeline leakage location model based on Stacking integration learning is shown in Figure 4.
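The Stacking procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the base learners are simple straight-line fits standing in for the LSSVM, ENN, and DBN, the meta-learner is assumed to be an ordinary least-squares linear model, the data are synthetic, and all function names are hypothetical. The part that carries over is the structure: out-of-fold base predictions become the features of the meta-learner.

```python
import math
import random


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(row) + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def line_learner(col):
    """Stand-in base learner: least-squares straight line on one input column."""
    def fit(X, y):
        xs = [row[col] for row in X]
        mx, my = sum(xs) / len(xs), sum(y) / len(y)
        sxx = sum((x - mx) ** 2 for x in xs)
        slope = sum((x - mx) * (t - my) for x, t in zip(xs, y)) / sxx
        return lambda row: slope * row[col] + (my - slope * mx)
    return fit


def fit_linear_meta(Z, y):
    """Assumed meta-learner: linear regression on the base-learner outputs."""
    rows = [[1.0] + list(z) for z in Z]
    p = len(rows[0])
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Aty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    w = solve(AtA, Aty)
    return lambda z: w[0] + sum(wj * zj for wj, zj in zip(w[1:], z))


def fit_stacking(X, y, base_fits, n_folds=5, seed=0):
    """Train base learners fold by fold, then fit the meta-learner on their
    out-of-fold predictions so it never sees predictions made on training rows."""
    n = len(X)
    order = list(range(n))
    random.Random(seed).shuffle(order)
    Z = [[0.0] * len(base_fits) for _ in range(n)]
    for f in range(n_folds):
        held_out = set(order[f::n_folds])
        X_tr = [X[i] for i in range(n) if i not in held_out]
        y_tr = [y[i] for i in range(n) if i not in held_out]
        for j, fit in enumerate(base_fits):
            model = fit(X_tr, y_tr)
            for i in held_out:
                Z[i][j] = model(X[i])
    meta = fit_linear_meta(Z, y)
    bases = [fit(X, y) for fit in base_fits]  # refit base learners on all data
    return lambda row: meta([m(row) for m in bases])


def rmse(model, X, y):
    return math.sqrt(sum((model(r) - t) ** 2 for r, t in zip(X, y)) / len(y))


# Synthetic demo data (not experimental data): the target depends on both columns.
X_demo = [[float(i % 7), float((3 * i) % 5)] for i in range(35)]
y_demo = [2.0 * a + 3.0 * b + 1.0 for a, b in X_demo]
base_fits = [line_learner(0), line_learner(1)]
stacked_rmse = rmse(fit_stacking(X_demo, y_demo, base_fits), X_demo, y_demo)
base_rmses = [rmse(fit(X_demo, y_demo), X_demo, y_demo) for fit in base_fits]
print(stacked_rmse, base_rmses)
```

Each stand-in learner sees only one input column, so neither can fit the target alone, while the stacked model combines their complementary information. This mirrors, in a much simpler setting, the motivation for fusing heterogeneous base learners.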

(1): LSSVM

The core idea of the LSSVM in regression prediction is to effectively project data in high dimensions and transform nonlinear problems into linear ones by constructing decision functions [37,38].

The regression function is as follows:

y(x) = \sum_{i = 1}^{k} \alpha_{i} K(x, x_{i}) + b_{m}

(1)

where \alpha_{i} is the Lagrange coefficient; K(x, x_{i}) = \exp\left(- \frac{\|x_{i} - x\|^{2}}{2 \sigma^{2}}\right) is the radial basis kernel function with kernel width σ; and b_m is the bias term.

The Gaussian radial basis function [39,40] was selected as the kernel function, which performs the feature mapping for the LSSVM. Compared with the traditional Gaussian radial basis classifier, the LSSVM constructed using the Gaussian radial basis function exhibits favorable generalization performance.
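The regression function of Equation (1) can be illustrated with a small LSSVM sketch. It uses the standard LSSVM linear system, in which the Lagrange coefficients and the bias are obtained by solving one set of linear equations. The regularization parameter `gamma`, the kernel width `sigma`, the function names, and the toy data are assumptions for illustration and are not taken from Table 4.

```python
import math


def rbf_kernel(a, b, sigma):
    """Gaussian radial basis kernel between two input vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(row) + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def lssvm_fit(X, y, sigma=1.0, gamma=1e6):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    k = len(X)
    A = [[0.0] + [1.0] * k]
    for i in range(k):
        row = [1.0] + [rbf_kernel(X[i], X[j], sigma) for j in range(k)]
        row[i + 1] += 1.0 / gamma  # regularization on the kernel diagonal
        A.append(row)
    solution = solve(A, [0.0] + list(y))
    return solution[0], solution[1:]  # bias, Lagrange coefficients


def lssvm_predict(x, X, alpha, bias, sigma=1.0):
    """Evaluate the regression function of Equation (1) at a new point x."""
    return sum(a * rbf_kernel(x, xi, sigma) for a, xi in zip(alpha, X)) + bias


# Toy data for illustration only (not experimental data).
X_demo = [[0.0], [1.0], [2.0], [3.0]]
y_demo = [0.0, 0.8, 0.9, 0.1]
bias, alpha = lssvm_fit(X_demo, y_demo)
fitted = [lssvm_predict(x, X_demo, alpha, bias) for x in X_demo]
print(fitted)
```

A large `gamma` makes the model follow the training targets closely, while a smaller value trades training accuracy for smoothness. In practice both `gamma` and `sigma` would be tuned, for example by the grid search mentioned for the base learners.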

(2): ENN

The ENN is modified from the structure of the BPNN, and is composed of four parts: input layer, hidden layer, context layer and output layer [41,42]. The inputs of the ENN are comprised of two parts: the current input values and the output values of the hidden layer at the previous moment.

Equations (2)–(4) are the functional representations of the mathematical model of the ENN:

x(k) = f(w_{1} x_{c}(k) + w_{2} i(k - 1))

(2)

x_{c}(k) = a \cdot x_{c}(k - 1) + x(k - 1)

(3)

y(k) = g(w_{3} x(k))

(4)

The error function of the ENN is as follows:

E = \frac{1}{2} (y_{d}(k) - y(k))^{T} (y_{d}(k) - y(k))

(5)

where x(k), x_c(k), and y(k) are the hidden layer output, context layer output, and network output at step k, respectively; i(k − 1) is the network input; y_d(k) is the desired output; a is the self-connection feedback gain factor; w₁, w₂ and w₃ are the connection weights of the context-to-hidden, input-to-hidden, and hidden-to-output layers, respectively; and f(·) and g(·) are the activation functions of the hidden layer and output layer, respectively.

The partial derivatives of the error function E with respect to w₁, w₂ and w₃ are calculated. The weights are then updated and the output error is recalculated, and this process is repeated until the output error meets the accuracy requirement.
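The recurrence in Equations (2)–(4) can be sketched as a forward pass. The example below is illustrative only: it assumes tanh for the hidden activation f and the identity for the output activation g, starts the hidden and context states at zero, and uses hypothetical names and toy weights. Weight training by the gradient procedure described above is not included.

```python
import math


def elman_forward(inputs, w1, w2, w3, a=0.5):
    """Run the Elman recurrence over a sequence of input vectors.

    w1: hidden x hidden (context-to-hidden), w2: hidden x input,
    w3: output x hidden. Returns the list of output vectors y(k).
    """
    n_hidden = len(w1)
    x_prev = [0.0] * n_hidden   # x(k-1), assumed zero at the start
    xc_prev = [0.0] * n_hidden  # x_c(k-1), assumed zero at the start
    outputs = []
    for i_prev in inputs:       # each element plays the role of i(k-1)
        # Equation (3): context state with self-feedback gain a
        xc = [a * c + x for c, x in zip(xc_prev, x_prev)]
        # Equation (2): hidden layer, with f assumed to be tanh
        x = [
            math.tanh(
                sum(w1[h][j] * xc[j] for j in range(n_hidden))
                + sum(w2[h][j] * i_prev[j] for j in range(len(i_prev)))
            )
            for h in range(n_hidden)
        ]
        # Equation (4): output layer, with g assumed to be the identity
        y = [sum(w3[o][h] * x[h] for h in range(n_hidden)) for o in range(len(w3))]
        outputs.append(y)
        x_prev, xc_prev = x, xc
    return outputs


# Toy example: one hidden unit, one input, one output, no self-feedback.
demo_outputs = elman_forward(
    inputs=[[1.0], [0.0]],
    w1=[[1.0]],
    w2=[[1.0]],
    w3=[[1.0]],
    a=0.0,
)
print(demo_outputs)
```

In the toy run the second input is zero, yet the second output is nonzero because the context layer feeds the previous hidden state back into the network. This memory of earlier steps is what distinguishes the ENN from a plain BPNN.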

(3): DBN

The DBN is composed of multiple stacked Restricted Boltzmann Machines (RBMs) and one regression layer. It is a deep network model that integrates feature learning with discriminative decision-making [43,44]. An RBM is a stochastic neural network: an undirected graphical model composed of one visible layer (V) and one hidden layer (H), in which neurons are connected only between layers, with no connections within the same layer. Training an RBM amounts to solving for the parameter θ using the marginal distribution obtained from the joint probability distribution of the visible and hidden layers. For given training samples, θ is adjusted iteratively so that the probability distribution represented by the RBM is as close as possible to that of the training data. The objective function for training an RBM is

\max_{θ} L_{θ, S} = \prod_{i = 1}^{n_{S}} P(v^{i})

(6)

where S is the training sample set, n_S is the number of training samples, and v^i is the i-th training sample presented to the visible layer.
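For reference, the likelihood in Equation (6) is usually maximized in logarithmic form. The block below is a sketch that assumes the common binary RBM parameterization θ = {W, b, c}; the energy function E_θ, the partition function Z_θ, and these parameter symbols are not given in the text and are introduced here only for illustration.

```latex
% Maximizing the product in Eq. (6) is equivalent to maximizing the log-likelihood
\ln L_{\theta, S} = \sum_{i = 1}^{n_S} \ln P(v^{i})

% Marginal distribution of a visible vector, obtained by summing the joint over hidden states
P(v) = \frac{1}{Z_{\theta}} \sum_{h} e^{-E_{\theta}(v, h)}, \qquad
Z_{\theta} = \sum_{v, h} e^{-E_{\theta}(v, h)}

% Energy of a binary RBM (assumed form), with weights W and biases b, c
E_{\theta}(v, h) = - b^{T} v - c^{T} h - v^{T} W h
```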

3.1.2. Establishment of the Multi-Parameter Fusion Location Model for Gas Extraction Pipeline Leakage

(1): Leakage location and detection criteria for gas extraction pipelines based on the Stacking integration model

The establishment of the gas extraction pipeline location model consists of four parts: data acquisition and processing, localization model based on Stacking integration, preliminary model selection, and optimal model selection. The gas extraction pipeline leakage location identification process based on Stacking integration learning is shown in Figure 5.

Data acquisition and processing. EEMD decomposition, reconstruction, and standardization were performed on the original dataset.
Localization model based on Stacking integration. Under different leakage point conditions, five sets of flow rate monitoring values and five sets of pressure monitoring values were input, and the leakage point location was output. Three input parameter combinations were considered: single pressure data, single flow data, and pressure–flow collaborative data. Seven localization models were established: the LSSVM, ENN, DBN, Stacking–LSSVM–ENN (S-L-E), Stacking–LSSVM–DBN (S-L-D), Stacking–ENN–DBN (S-E-D), and Stacking–LSSVM–ENN–DBN (S-L-E-D). Thus, 21 groups of extraction pipeline localization models (3 inputs × 7 models) were established using different methods and feature combinations.
Preliminary model selection. The data of the validation set were analyzed. The localization model and feature parameter combinations with an average R² ≥ 0.80 were retained, and among these, the prediction models with an RMSE ≤ 0.12 and MAPE ≤ 0.20 were selected as the preliminary optimized models.
Optimal model selection. The models with TIC ≤ 0.1 were selected. Subsequently, prediction models with RMSE ≤ 0.12 and MAPE ≤ 0.20 for each leakage point were selected as the final optimized localization models. The predicted data of each group from the optimized models were comparatively analyzed using the validation set.
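The candidate grid and the two screening stages above can be sketched as a pair of filters. The thresholds come from the text; the dictionary layout, function names, and per-leakage-point values are hypothetical. The example metrics use the values reported in the abstract for the S-L-E-D model, and treating the reported accuracy as R² is an assumption.

```python
from itertools import product

INPUTS = ("pressure", "flow", "pressure-flow")
MODELS = ("LSSVM", "ENN", "DBN", "S-L-E", "S-L-D", "S-E-D", "S-L-E-D")
CANDIDATES = list(product(INPUTS, MODELS))  # 3 inputs x 7 models

# Screening thresholds stated in the text.
R2_MIN, RMSE_MAX, MAPE_MAX, TIC_MAX = 0.80, 0.12, 0.20, 0.10


def passes_preliminary(avg):
    """Preliminary selection on validation-set averages."""
    return avg["r2"] >= R2_MIN and avg["rmse"] <= RMSE_MAX and avg["mape"] <= MAPE_MAX


def passes_optimal(avg, per_point):
    """Optimal selection: TIC threshold plus per-leakage-point error thresholds."""
    return (
        passes_preliminary(avg)
        and avg["tic"] <= TIC_MAX
        and all(p["rmse"] <= RMSE_MAX and p["mape"] <= MAPE_MAX for p in per_point)
    )


# Abstract values for S-L-E-D with pressure-flow input (accuracy used as R2: assumption).
reported = {"r2": 0.932, "rmse": 0.053, "mape": 0.082, "tic": 0.056}
# Hypothetical per-leakage-point errors, included only to exercise the second filter.
hypothetical_points = [{"rmse": 0.06, "mape": 0.09}, {"rmse": 0.05, "mape": 0.08}]
# Same record with the 0.528 accuracy reported for S-L-E-D under pressure-only input;
# the other fields are left unchanged purely for illustration.
low_accuracy = dict(reported, r2=0.528)

print(len(CANDIDATES), passes_optimal(reported, hypothetical_points),
      passes_preliminary(low_accuracy))
```

Applying the filters in this order means a combination is rejected as soon as its average R² falls below 0.80, before the finer per-leakage-point checks are made.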

(2): Base learner model parameters

The base learners were initially selected as the LSSVM, ENN and DBN models [45]. The parameters of the base learners are shown in Table 4. To improve the reproducibility and fairness of model comparison, the hyperparameters of the LSSVM, ENN, and DBN were determined through preliminary experiments and grid search. The RMSE and MAPE were selected as the main optimization criteria, and the parameter combination with a lower prediction error and more stable five-fold cross-validation performance was adopted. During model training, the random seed was fixed to reduce the influence of random initialization and data partitioning. All models were implemented under the same computational environment to ensure comparability.

3.1.3. Model Accuracy Evaluation Index

Indicators including the Coefficient of Determination (R²), Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), Theil Inequality Coefficient (TIC), Mean Absolute Error (MAE) and Coefficient of Variation (CV) were used to conduct comprehensive evaluation of the models [46,47]. Metrics for measuring the prediction accuracy are shown in Table 5.
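The indicators listed above can be computed as in the following sketch. Since Table 5 is not reproduced here, the formulas are the commonly used definitions and should be read as assumptions: MAPE is returned as a fraction rather than a percentage, TIC follows the bounded Theil form, and CV uses the sample standard deviation across folds. The function names are hypothetical.

```python
import math
import statistics


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction; requires nonzero targets."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def tic(y_true, y_pred):
    """Theil inequality coefficient, bounded between 0 (perfect) and 1."""
    n = len(y_true)
    root_true = math.sqrt(sum(t ** 2 for t in y_true) / n)
    root_pred = math.sqrt(sum(p ** 2 for p in y_pred) / n)
    return rmse(y_true, y_pred) / (root_true + root_pred)


def cv_percent(fold_values):
    """Coefficient of variation across folds, in percent (sample std / mean)."""
    return 100.0 * statistics.stdev(fold_values) / statistics.mean(fold_values)


# Toy numbers for illustration only.
y_true_demo = [2.0, 4.0]
y_pred_demo = [1.0, 5.0]
print(rmse(y_true_demo, y_pred_demo), mae(y_true_demo, y_pred_demo),
      mape(y_true_demo, y_pred_demo), r2(y_true_demo, y_pred_demo),
      tic(y_true_demo, y_pred_demo))
```

The RMSE-CV and MAPE-CV values reported in the verification section correspond to applying `cv_percent` to the five fold-wise RMSE or MAPE values of a model, under the definition assumed here.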

3.2. Verification of Gas Extraction Pipeline Leakage Location Model

3.2.1. Multi-Parameter Fusion Location Model of Gas Extraction Pipeline Leakage

(1): Discussion on the positioning accuracy of a single pressure data input

The localization accuracy of each model in the pressure training set is shown in Table 6. Under a single pressure input, the prediction performance differed among models. The LSSVM achieved an average accuracy of 0.799, indicating its ability to capture nonlinear relationships between pressure variation and leakage location. The ENN obtained a higher accuracy of 0.820, suggesting that its recurrent structure was more suitable for describing temporal pressure response characteristics. The DBN showed relatively unstable performance, with an average accuracy of 0.760, possibly due to the limited sample size and insufficient pressure-only features. Among the Stacking models, the S-L-E, S-L-D, and S-E-D achieved average accuracies of 0.819, 0.819, and 0.827, respectively, generally outperforming most single models.

Under pressure input, the S-E-D model achieved the best performance, with an average accuracy of 0.827, mainly because the combination of the ENN and DBN enhanced the extraction of temporal and nonlinear pressure response features. However, the S-L-E-D model achieved only 0.528, suggesting that more base learners do not necessarily lead to better performance. Overall, pressure data contain useful leakage location information, but a single pressure input remains insufficient for robust localization, further highlighting the necessity of pressure–flow collaborative input. As shown in Table 6, under a single pressure input, the training time of the single models was relatively short, with the LSSVM, ENN, and DBN requiring 3.82 s, 8.65 s, and 15.43 s, respectively. The Stacking models required a longer training time because multiple base learners and a meta-learner were involved. Among them, the S-E-D achieved the highest average accuracy of 0.827, with a training time of 27.64 s, indicating that the improvement in positioning accuracy was accompanied by an increase in offline computational cost.

The evaluation indicators of each model after preliminary selection with a single pressure data input are shown in Figure 6. The fold-wise results, RMSE, and MAPE were used together to evaluate the model reliability, absolute error, and relative error. These indicators provided a more comprehensive assessment of model accuracy, stability, and generalization performance. The average values of the MAPE and RMSE for all positioning methods exhibited slight fluctuations, with the RMSE ranging from 0.111 to 0.116 and the MAPE varying between 0.181 and 0.203. Under a single pressure data input, the S-L-E model achieved the best stability, with the CV values of the RMSE and MAPE being 3.83% and 6.02%, respectively. The S-E-D model showed slightly larger fluctuations, with an RMSE-CV and MAPE-CV of 6.77% and 12.46%, respectively. The S-L-E also exhibited the best comprehensive accuracy, with the lowest average RMSE (0.1110) and average MAPE (0.1810), making it the overall optimal method.

(2): Discussion on the positioning accuracy of a single flow data input

The localization accuracy of each model in the flow training set is shown in Table 7. The localization models showed different accuracy levels on the flow training set. Among the single models, the ENN achieved the highest accuracy of 0.820, outperforming the LSSVM (0.798) and DBN (0.763), indicating that its structure was more suitable for capturing the temporal characteristics of flow data. Among all models, the S-L-E-D performed best, with fold accuracies ranging from 0.813 to 0.843 and an average accuracy of 0.830. This improvement was mainly attributed to the complementary learning and feature fusion ability of the Stacking ensemble model.

The S-L-E-D model integrated the outputs of three foundational models (LSSVM, ENN, DBN), with the three models having inherently distinct modeling logic. When leakage occurred, the pipeline flow rate presented a temporal process of stability, sudden change, attenuation and new stability. Under a single flow data input, the S-L-E-D model exhibited the smallest fluctuation among all models. The RMSE-CV and MAPE-CV values were 5.94% and 8.31%, respectively, indicating strong robustness and generalization ability under different flow data distributions. The accuracy of the fourth fold dropped sharply to 0.676, which was a direct reflection of the insufficient generalization ability caused by overfitting. The local fitting advantage of the LSSVM compensated for the neglect of small-sample details by the DBN. The temporal capture capability of the ENN remedied the inadequacy of the LSSVM in the modeling of dynamic processes. This conclusion was consistent with the finding in the reference [48] that the integration of multiple models could reduce the positioning error through feature complementarity.

As shown in Table 7, the training time of each model under a single flow input was close to that under a single pressure input. The training time of the LSSVM was only 3.64 s, while that of the optimal S-L-E-D model increased to 31.57 s. Although the S-L-E-D had the longest training time, it achieved the highest average positioning accuracy of 0.830. This indicates that the Stacking ensemble structure improved the localization performance by increasing the offline computational complexity, and the additional training cost was acceptable for model construction.

The evaluation indicators of each model after preliminary selection with a single flow data input are shown in Figure 7. The overall pattern showed that the positioning accuracy of the Stacking ensemble model outperformed the single models, and the combinations containing the ENN among the fusion models performed more prominently. The S-L-E-D model had an RMSE stably below 0.11 and MAPE close to 0.15 across all five folds, which indicated that it could maintain high accuracy and strong generalization ability under different scenarios of flow rate data distribution.

(3): Discussion on the positioning accuracy of pressure–flow collaborative data input

The localization accuracy of each localization model on the pressure–flow collaborative training set is shown in Table 8. Compared with a single pressure or flow input, the accuracy of all models was significantly improved under collaborative input. The advantage of the S-L-E-D model was further amplified: under collaborative input it reached an accuracy of 0.932, representing an increase of more than 10%. Furthermore, under a single dataset input, the accuracy improvement of the ensemble models over the single models was approximately 3% to 5%, while this gap widened to 5% to 8% under collaborative input. The pressure–flow rate dataset provided richer features, enabling the mutual compensation of model defects and the feature fusion mechanisms of the ensemble models to exert their effects to a greater extent. Under pressure–flow collaborative input, the S-L-E-D model achieved the optimal stability, with an RMSE-CV of 2.41% and MAPE-CV of 4.26%. These results further demonstrated that single pressure or flow data may be affected by sensor errors and environmental interference, whereas pressure–flow collaborative input provides information from two different monitoring parameters. The collaborative input enabled the models to acquire two-dimensional features with spatial and temporal attributes, avoiding the limitation that single-source data could only capture single-dimensional information.

As shown in Table 8, the training time increased slightly under pressure–flow collaborative input due to the higher feature dimension. The S-L-E-D model required the longest training time of 36.59 s but achieved the highest average accuracy of 0.932. Although the Stacking models had higher offline training costs, the optimal S-L-E-D model had an average inference time of only 0.021 s (21 ms) per sample, shorter than the 150 ms sampling interval, indicating its potential for real-time leakage localization.

The evaluation indicators of each model after preliminary selection with pressure–flow collaborative data input are shown in Figure 8. The optimal model was the S-L-E-D model, with an RMSE below 0.09 and MAPE below 0.10 across all folds, and it exhibited the smallest fluctuation across folds. Compared with a single pressure or flow rate input, the RMSE and MAPE of all models decreased significantly under collaborative input, and the fluctuation of errors across folds was generally reduced. Therefore, collaborative data input significantly reduced the models' absolute deviation in predicting the leakage position.

To further evaluate whether the performance improvement in the S-L-E-D model was statistically significant, the uncertainty of the five-fold cross-validation results was analyzed. The mean accuracy, standard deviation, and 95% confidence interval were calculated for each model. In addition, paired t-tests were conducted between the S-L-E-D model and other models under pressure–flow collaborative input, and Cohen’s d was used to evaluate the effect size. A significance level of p < 0.05 was used. Statistical comparisons of the models under pressure–flow collaborative input are shown in Table 9.

As shown in Table 9, the S-L-E-D model achieved the highest mean accuracy of 0.932, with a 95% confidence interval of 0.926–0.938. Compared with the other models under pressure–flow collaborative input, the mean accuracy improvement of the S-L-E-D model ranged from 0.037 to 0.058. The paired t-test results showed that these differences were statistically significant at the p < 0.05 level. In addition, the effect sizes were relatively large, indicating that the improvement of the S-L-E-D model was not only reflected in the mean value but also supported by fold-wise statistical comparison.
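
As a minimal sketch of the fold-wise comparison described above, the following Python snippet computes a paired t statistic and a paired-samples Cohen's d from five fold accuracies. Because the per-fold values behind Table 9 are not reproduced in this section, the S-E-D and LSSVM fold accuracies under single pressure input (Table 6) are used as stand-in data. The function name `paired_comparison`, the choice of the d_z variant of Cohen's d (mean difference divided by the standard deviation of the differences), and the comparison against the tabulated two-sided critical value 2.776 for 4 degrees of freedom are assumptions of this sketch, not details confirmed by the study.

```python
import math
import statistics

# Two-sided critical t value for alpha = 0.05 and df = 4 (five folds).
T_CRIT_DF4 = 2.776


def paired_comparison(a, b):
    """Return (mean difference, paired t statistic, Cohen's d_z) for fold-wise scores."""
    if len(a) != len(b):
        raise ValueError("both models need one score per fold")
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t_stat = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    d_z = mean_diff / sd_diff
    return mean_diff, t_stat, d_z


# Stand-in data: fold accuracies under single pressure input (Table 6).
s_e_d = [0.811, 0.819, 0.832, 0.840, 0.832]
lssvm = [0.799, 0.801, 0.801, 0.804, 0.790]

mean_diff, t_stat, d_z = paired_comparison(s_e_d, lssvm)
significant = abs(t_stat) > T_CRIT_DF4
print(f"mean diff = {mean_diff:.4f}, t = {t_stat:.2f}, d_z = {d_z:.2f}, significant = {significant}")
```

With only five folds the test has 4 degrees of freedom, and the folds share training data, so a significant result of this kind is best read as supporting evidence rather than as a strict independence-based test.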

In summary, relying on a single flow or pressure signal for leakage localization is easily affected by normal pipeline fluctuations, leading to reduced positioning accuracy. In contrast, pressure–flow collaborative analysis can capture the spatiotemporal relationship between upstream flow variation and pressure attenuation near the leakage point. The mutual verification of these two parameters helps reduce interference and misjudgment from single-signal monitoring, thereby improving the accuracy and reliability of leakage localization.

(4): Determine the optimal localization model

After preliminary model selection, the methods with average five-fold cross-validation accuracy above 0.80, RMSE ≤ 0.12, and MAPE ≤ 0.20 were retained. The TIC values of different leakage points are shown in Table 10. Among them, L2 showed the highest TIC values under most input and model combinations. For example, under pressure input, the TIC values of the S-L-D and S-E-D reached 0.201 and 0.218, respectively, while under pressure–flow collaborative input, the TIC of the DBN reached 0.248. This may be related to the location of L2 in the middle of the pipeline, where turbulent flow, vortex flow, and unstable gas transmission increased pressure–flow signal fluctuations, making stable feature mapping more difficult. In contrast, L3 showed the lowest TIC values across different scenarios. Under pressure–flow collaborative input, the TIC values of the S-L-E-D and LSSVM were only 0.015 and 0.020, respectively. This is mainly because L3 was located in the downstream section with a relatively regular pipeline structure and stable signal transmission, allowing the models to extract leakage features more consistently.

The evaluation indicators of each optimized localization model are shown in Figure 9. L1 showed the most significant improvement under collaborative input, with the MAPE reduced by 38.7%. L2 exhibited the highest RMSE and MAPE among the five leakage points. Under a single pressure input, the RMSE of the S-L-D was 2.8 times that of L3, and under collaborative input, the RMSE of the S-L-E-D was still 2.1 times that of L3. This indicates that L2 remained more difficult to locate due to its complex flow characteristics. Nevertheless, collaborative input significantly reduced the overall errors, with the RMSE and MAPE decreasing by 39.0% and 37.4%, respectively, compared with a single pressure input. The average RMSE across all leakage points of the DBN, the worst model under collaborative input, was 0.082, which was still lower than the average RMSE of 0.091 of the S-E-D. Overall, the S-L-E-D is preferred under collaborative input, especially for high-accuracy scenarios.

The MAE and distance error of each leakage point are presented in Table 11. For the S-L-E-D positioning method, the MAE of each fold was lower than 0.1, and the distance error was less than 1 m. This verified that the S-L-E-D model was the most applicable localization model in this paper.

In practical underground gas extraction systems, monitoring points are usually spaced hundreds of meters apart. Therefore, the distance error of 0.814 m achieved by the S-L-E-D model indicates its potential for narrowing the leakage area and improving maintenance efficiency. Since pressure and flow sensors are commonly used, the pressure–flow collaborative framework also shows potential for online leakage warning. However, this result was obtained from a laboratory-scale pipeline, and practical deployment still requires further consideration of sensor calibration, data transmission delay, computational cost, long-term sensor stability, and underground disturbances such as gas fluctuation, dust, moisture, vibration, temperature variation, pipeline attenuation, and sensor drift. In addition, its transferability to pipelines with different diameters, layouts, lengths, roughness, and network structures needs further verification. Although the S-L-E-D performed best under the current validation framework, future work should use independent test sets, nested cross-validation, cost–benefit analysis, and field-scale validation before practical application.

3.2.2. Analysis of the Number of Monitoring Points and Leakage Localization Performance

To explore the relationship between the number of input layer nodes and the fitting effect of the integration localization model, the leakage positioning accuracy under three forms of data model input was investigated in this section. To clarify the monitoring point failure simulation protocol, sensor failure in this study was defined as complete loss of the pressure and flow signals at the corresponding monitoring point. The pressure and flow features of this point were removed from the input feature vector. The leakage localization model was then evaluated using the remaining available monitoring point data. Single-point fault scenarios and two-point fault scenarios were considered to analyze the influence of missing monitoring information on localization performance. This fault setting mainly represents complete sensor outage or communication interruption. The discussion was divided into three groups: (1) A fault occurred at any one point; (2) faults occurred at any two points; and (3) no sensor faults occurred.
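
The fault protocol above can be sketched in Python as follows. The snippet enumerates the no-fault, single-fault, and dual-fault scenarios for five monitoring points and builds the reduced input vector by dropping both the pressure and the flow feature of every failed point. The function names, the (pressure, flow) feature ordering, and the numeric readings are hypothetical and chosen only for illustration. Whether the model is retrained for each scenario is not stated in this section, so the sketch covers only the construction of the input vector.

```python
from itertools import combinations

MONITORING_POINTS = ["M1", "M2", "M3", "M4", "M5"]


def fault_scenarios(points):
    """List the fault sets: no fault, every single fault, every pair of faults."""
    scenarios = [()]
    scenarios += list(combinations(points, 1))
    scenarios += list(combinations(points, 2))
    return scenarios


def build_feature_vector(sample, failed=()):
    """Concatenate pressure and flow features of the points that are still online.

    `sample` maps a monitoring point name to a (pressure, flow) pair.
    """
    features = []
    for point in MONITORING_POINTS:
        if point in failed:
            continue  # complete signal loss: both features of this point are removed
        pressure, flow = sample[point]
        features.extend([pressure, flow])
    return features


# Hypothetical change-rate readings, for illustration only.
sample = {
    "M1": (0.09, 0.05), "M2": (0.04, 0.03), "M3": (0.02, 0.02),
    "M4": (-0.01, 0.01), "M5": (0.03, 0.02),
}

scenarios = fault_scenarios(MONITORING_POINTS)
full = build_feature_vector(sample)
dual = build_feature_vector(sample, failed=("M1", "M4"))
print(len(scenarios), len(full), len(dual))
```

Five monitoring points give 1 no-fault case, 5 single-fault cases, and 10 dual-fault cases, so the full enumeration contains 16 scenarios.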

The localization accuracy of the validation set under different monitoring point fault conditions is shown in Figure 10. Single faults at M1, M4, and M5 had relatively small effects, with localization accuracies of 0.884, 0.891, and 0.881, respectively, whereas faults at M2 and M3 caused greater accuracy loss. Under dual-fault conditions, the M1 and M4 combination achieved the highest accuracy of 0.861, followed by M1 and M5, M2 and M4, and M2 and M5, with accuracies of 0.858, 0.857, and 0.855, respectively. These results indicate that faults at key monitoring points on the main pipeline have a stronger influence on leakage localization.

To further evaluate the statistical significance and robustness of the sensor fault analysis, the localization results under different monitoring point fault conditions were statistically analyzed. For each fault scenario, the mean positioning accuracy, standard deviation, and 95% confidence interval were calculated based on repeated validation results. The 95% confidence interval was used to describe the uncertainty range of the positioning accuracy, while the Coefficient of Variation was introduced to evaluate the stability of the model under sensor fault conditions.

The statistical robustness results are shown in Table 12. Under normal monitoring conditions, the model achieved a mean accuracy of 0.932, with a standard deviation of 0.005 and a narrow 95% confidence interval of 0.926–0.938, indicating high stability. Under single-point fault conditions, the mean accuracy remained between 0.881 and 0.891, with Coefficients of Variation below 1.50%, showing stable performance when one sensor failed. Under dual-point fault conditions, the mean accuracy decreased to 0.855–0.861, and the standard deviation slightly increased, indicating greater localization uncertainty. Nevertheless, the confidence intervals remained narrow and the Coefficients of Variation were below 2.00%, demonstrating good robustness and fault tolerance of the proposed pressure–flow collaborative Stacking model.
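
As a quick consistency check on the figures reported for normal monitoring conditions, the sketch below recomputes the 95% confidence interval and the Coefficient of Variation from the stated mean (0.932) and standard deviation (0.005). It assumes five repeated validation results and a t-based interval with 4 degrees of freedom (critical value 2.776). Both are assumptions of this sketch, since the number of repeats and the interval formula are not stated in this section.

```python
import math

T_CRIT_DF4 = 2.776  # two-sided 95% critical t value for df = 4


def confidence_interval(mean, sd, n, t_crit):
    """Return the (lower, upper) bounds of mean +/- t_crit * sd / sqrt(n)."""
    half_width = t_crit * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


def coefficient_of_variation(mean, sd):
    """Return the Coefficient of Variation in percent."""
    return sd / mean * 100.0


low, high = confidence_interval(0.932, 0.005, 5, T_CRIT_DF4)
cv = coefficient_of_variation(0.932, 0.005)
print(f"95% CI = {low:.3f} to {high:.3f}, CV = {cv:.2f}%")
```

Under these assumptions the interval rounds to 0.926–0.938, which agrees with the reported confidence interval, and the Coefficient of Variation is about 0.54%.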

As shown in Figure 11, different monitoring point faults caused distinct error distributions among leakage points. When M1 failed, the positioning errors were widely distributed, indicating uneven effects on different leakage points. The failure of M4 caused large accuracy fluctuations, especially for L5, whose errors were mainly concentrated in medium-error intervals. When M5 failed, L5 showed the poorest accuracy, with 73.5% of errors falling within the 5–10% interval, while L1 maintained the best performance. Under the joint failure of M1 and M4, double-fault error superposition was observed, reducing the proportion of low-error intervals, especially for L1. Overall, monitoring point faults increased localization uncertainty, and the influence varied with both sensor position and leakage location.

In summary, the effect of monitoring point failure on leakage localization showed clear spatial dependence. Failures at middle monitoring points had greater influence than endpoint failures because the middle points provided more critical spatial gradient information. In multi-point fault scenarios, scattered failures caused less accuracy degradation than adjacent failures, as more spatial information could be retained. In contrast, concentrated failures led to local information loss and weakened the model’s inference ability. Therefore, sensors in branch pipelines and middle pipeline sections should be prioritized during field deployment and maintenance, and a spatially uniform monitoring layout is recommended to improve information complementarity and localization robustness under partial sensor failure.

It should be noted that the present fault simulation only considered the complete loss of monitoring point data. In actual underground gas extraction systems, sensor faults may also appear as signal drift, abnormal noise, intermittent packet loss, calibration error, delayed response, or partial measurement distortion. Different fault modes may have different effects on pressure–flow feature distribution and leakage localization accuracy. Therefore, future studies will further introduce more realistic sensor fault modes and compare their influence on the robustness of the proposed model.

4. Conclusions

In this study, the gas extraction pipeline leakage location method based on Stacking integration learning was proposed. The selection criterion for the gas extraction pipeline leakage location model was established, and the optimal model for gas extraction pipeline leakage location was identified. The main achievements were as follows:

(1): Different leakage locations cause distinct pressure and flow change rates at the monitoring points, for example 0.0892 at M1 for L1, −0.1628 at M4 for L3, and 0.1100 at M3 for L5. This variation stems from the spatial relationship between leakage and monitoring points and from the system response mechanism. Because each leakage location produces a differentiated combination of signals, multi-source monitoring data serve as valid feature vectors for leakage localization.
(2): A multi-parameter collaborative location model for gas extraction pipeline leakage based on the pressure–flow method was established. A total of 21 combinations of location models were obtained, and nine optimal location models for gas extraction pipeline leakage were selected through preliminary model screening. The Stacking–LSSVM–Elman–DBN was finally determined as the optimal location model. The R², RMSE, MAPE, and TIC of this model were 0.932, 0.053, 0.082, and 0.056, respectively.
(3): The accuracy of the pipeline leakage location system was highly dependent on the layout and reliability of the monitoring points. Single faults at points 1, 4, and 5 had relatively small effects, with accuracies of 0.884, 0.891, and 0.881, respectively, whereas faults at points 2 and 3 caused greater accuracy loss. Among the dual-fault conditions, the combination of points 1 and 4 achieved the highest accuracy of 0.861. Failures at middle monitoring points exert greater impacts on localization than endpoint failures, and scattered multi-point faults cause less accuracy loss than adjacent ones.

Therefore, efforts should be made to optimize the layout of on-site monitoring points to achieve a more uniform spatial distribution. In addition, the complementary characteristics of multi-source data fusion should be fully utilized to enhance the robustness of the gas extraction pipeline leakage localization method. The current framework provides a more comprehensive assessment of multi-source collaborative leakage localization and sensor fault robustness, while the localization error lower than 1 m demonstrates the promising engineering applicability of the proposed method for underground gas extraction systems. Moreover, the proposed multi-source collaborative framework can be integrated with existing underground monitoring systems, showing potential for intelligent leakage warning and real-time safety management in coal mines. However, it should be noted that the present study is mainly based on a laboratory-scale experimental system, and actual underground coal mine environments may involve more complex factors, such as pipeline vibration, dust interference, gas fluctuation, and long-distance transmission effects. Therefore, further research should focus on improving the universality of the model under complex underground conditions and different pipeline structures, while enhancing its robustness against sensor faults and environmental interference. Additional large-scale field validation under practical underground environments is still required before industrial deployment.

Author Contributions

Conceptualization, J.Z. (Jie Zhou) and J.Z. (Ju Zhao); methodology, J.Z. (Jie Zhou) and W.Z.; software, J.G. and W.Z.; validation, J.Z. (Jie Zhou) and W.L.; formal analysis, J.G.; investigation, J.L.; resources, J.Z. (Ju Zhao); data curation, J.Z. (Jie Zhou); writing—original draft preparation, J.Z. (Jie Zhou); writing—review and editing, J.Z. (Jie Zhou) and W.L.; visualization, W.L. and W.Z.; supervision, J.Z. (Ju Zhao); project administration, J.L.; funding acquisition, J.Z. (Jie Zhou). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Project of Shaanxi Provincial Department of Education, grant number 24JK0381.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

All authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figure 1. Schematic diagram of gas extraction pipeline leakage experimental system.

Figure 2. Time series signals before and after EEMD denoising: (a) flow signal, (b) pressure signal.

Figure 3. The data distribution of the change rates of each monitoring point when leakage occurred at different locations. (a) Flow monitoring points. (b) Pressure monitoring points.

Figure 4. The gas extraction pipeline leakage location model based on Stacking integration learning.

Figure 5. The gas extraction pipeline leakage location identification process based on Stacking integration learning.

Figure 6. The evaluation indicators of each model after preliminary selection with single pressure data input: (a) RMSE, (b) MAPE. (note: the dashed line in (a) denotes RMSE ≤ 0.12, while the dashed line in (b) denotes MAPE ≤ 0.20).

Figure 7. The evaluation indicators of each model after preliminary selection with single flow data input: (a) RMSE, (b) MAPE. (note: the dashed line in (a) denotes RMSE ≤ 0.12, while the dashed line in (b) denotes MAPE ≤ 0.20).

Figure 8. The evaluation indicator of each model after preliminary selection with pressure–flow collaborative data input: (a) RMSE, (b) MAPE.

Figure 9. The evaluation indicator of each optimized localization model after selection: (a) RMSE, (b) MAPE. (note: the dashed line in (a) denotes RMSE ≤ 0.12, while the dashed line in (b) denotes MAPE ≤ 0.20).

Figure 10. The localization accuracy of the validation set under different conditions. (note: the dashed line denotes positioning accuracy rate ≥ 0.85).

Figure 11. The positioning accuracy ranges of each leakage point under the fault conditions of different monitoring points: (a) failure of monitoring point 1, (b) failure of monitoring point 4, (c) failure of monitoring point 5, (d) failure of monitoring points 1 and 4.

Table 1. Main experimental instruments and technical specifications.

Instrument	Model	Manufacturer	Measurement Range	Accuracy	Function
Rotary vane vacuum pump	VSV-065	Xi’an Chengde Vacuum Equipment Co., Ltd. (Xi’an, China)	Exhaust rate: 65 m³·h⁻¹; maximum pressure: 0.3 mbar; power: 1.5 kW	—	Provides a stable negative pressure condition for the gas extraction pipeline experimental system
Gas flowmeter	MF5200	Xi’an Shenghong chuang Instrumentation & Metering Co., Ltd. (Xi’an, China)	Flow range: 0–800 L·min⁻¹; range ratio: 30:1	±(2.0 + 0.5 FS)%	Measures gas flow rate at monitoring points
Pressure sensor	SHC10X		Measuring range: −100 kPa to 100 kPa; sensitivity temperature drift: ≤1.5% FS; response time: ≤16 ms	±0.5% FS	Measures pipeline pressure variation
Leakage control valve	BVFL 06		Insert valve diameter: DN25; opening range: 0–90°	—	Controls leakage point opening and closing
Data acquisition unit	SH-R70		Sampling interval: 150 ms	0.2% FS	Collects and stores pressure and flow signals

Table 2. The locations of leakage points and monitoring points.

Monitoring Point	Location	Distance from the Extraction Pump/m	Leakage Point	Location	Distance from the Extraction Pump/m
M1	Main pipeline	2	L1	Main pipeline	3.5
M2	Main pipeline	6	L2	Main pipeline	8
M3	Main pipeline	10	L3	Main pipeline	11.5
M4	Branch pipeline 2	15.5	L4	Branch pipeline 1	10
M5	Branch pipeline 1	11.5	L5	Branch pipeline 2	13.5

Table 3. The influence of EEMD parameters on localization accuracy.

σ	Nd	RMSE	MAPE
0.03	50	0.071	0.109
0.03	100	0.066	0.101
0.05	50	0.061	0.094
0.05	100	0.053	0.082
0.05	150	0.054	0.084
0.08	100	0.069	0.106
0.10	100	0.076	0.118

Table 4. The parameters of the base learner.

Method	Parameter	Search Range	Value
LSSVM	Kernel function	-	Radial basis function
	Penalty factor	0.1, 0.3, 0.6, 0.9, 1.2	0.6
	Regularization parameter	10, 50, 100, 150, 200	100
	Gamma parameter	0.01, 0.03, 0.05, 0.08, 0.10	0.05
ENN	Number of hidden layers	5, 10, 15, 20, 25	15
	Maximum number of iterations	50, 70, 85, 100, 150	85
	Iterative target error	0.001, 0.0005, 0.0001	0.0001
	Transfer function	-	tansig
DBN	The quantity of RBM	1, 2, 3	2
	The number of hidden layer nodes 1	5, 10, 15, 20	10
	The number of hidden layer nodes 2	3, 5, 8, 10	5
	Number of iterations	30, 50, 80, 100	50
	Learning rate	0.001, 0.005, 0.01, 0.05	0.01
	Momentum	0, 0.3, 0.5, 0.9	0

Table 5. Metrics for measuring prediction accuracy.

Measurement Index	Definition
Coefficient of Determination (R²)	$1 - \frac{\sum_{i = 1}^{n} {(y_{i} - y_{p})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}$
Root Mean Square Error (RMSE)	$\sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{p} - y_{i})}^{2}}$
Mean Absolute Percentage Error (MAPE)	$\frac{1}{n} \sum_{i = 1}^{n} \left| \frac{y_{i} - y_{p}}{y_{i}} \right|$
Theil Inequality Coefficient (TIC)	$\frac{\sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{p} - y_{i})}^{2}}}{\sqrt{\frac{1}{n} \sum_{i = 1}^{n} {y_{p}}^{2}} + \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {y_{i}}^{2}}}$
Mean Absolute Error (MAE)	$\frac{1}{n} \sum_{i = 1}^{n} \left| y_{i} - y_{p} \right|$
Coefficient of Variation (CV)	$\frac{\sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}}{\bar{y}} \times 100\%$

Notes: y_i is the true value, i ∈ [1, n]; y_p is the predicted value, i ∈ [1, n]; $\bar{y}$ is the average value.
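The indices in Table 5 can be computed directly from paired true and predicted values. The following is a minimal Python sketch, not the authors' implementation. It assumes that $\bar{y}$ is the mean of the true values and that MAE is the mean of the absolute errors. The function name `regression_metrics` and the sample inputs are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return the six Table 5 indices for paired true/predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    # Assumption: y-bar in Table 5 is the mean of the true values.
    mean_true = sum(y_true) / n
    sq_err = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)

    rmse = math.sqrt(sq_err / n)
    return {
        # R2 is undefined when all true values are identical (ss_tot == 0).
        "R2": 1.0 - sq_err / ss_tot,
        "RMSE": rmse,
        # MAPE is undefined when any true value is zero.
        "MAPE": sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n,
        "TIC": rmse / (math.sqrt(sum(p * p for p in y_pred) / n)
                       + math.sqrt(sum(t * t for t in y_true) / n)),
        "MAE": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "CV": math.sqrt(ss_tot / n) / mean_true * 100.0,  # in percent
    }


# Hypothetical values, for illustration only (not data from the experiment).
metrics = regression_metrics([1.0, 2.0, 4.0], [1.0, 2.0, 3.0])
print({name: round(value, 4) for name, value in metrics.items()})
```

TIC is bounded between 0 and 1, with smaller values indicating closer agreement, which is why it can be compared across leakage points at different distances.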

Table 6. The localization accuracy of each method in the pressure training set.

Positioning Method	1st Fold	2nd Fold	3rd Fold	4th Fold	5th Fold	Average Accuracy Rate (5-Fold Cross-Validation)	Training Time/s
LSSVM	0.799	0.801	0.801	0.804	0.790	0.799	3.82
ENN	0.833	0.806	0.821	0.809	0.832	0.820	8.65
DBN	0.531	0.840	0.827	0.799	0.804	0.760	15.43
S-L-E	0.818	0.829	0.811	0.824	0.813	0.819	18.76
S-L-D	0.808	0.809	0.821	0.818	0.838	0.819	24.35
S-E-D	0.811	0.819	0.832	0.840	0.832	0.827	27.64
S-L-E-D	0.411	0.398	0.715	0.616	0.502	0.528	32.18

Table 7. The localization accuracy of each model in the flow training set.

Positioning Method	1st Fold	2nd Fold	3rd Fold	4th Fold	5th Fold	Average Accuracy Rate (5-Fold Cross-Validation)	Training Time/s
LSSVM	0.798	0.799	0.798	0.797	0.796	0.798	3.64
ENN	0.808	0.830	0.820	0.818	0.825	0.820	8.27
DBN	0.779	0.807	0.804	0.676	0.751	0.763	14.86
S-L-E	0.821	0.823	0.817	0.815	0.822	0.820	18.34
S-L-D	0.801	0.804	0.805	0.797	0.826	0.807	23.72
S-E-D	0.844	0.812	0.814	0.826	0.820	0.823	26.95
S-L-E-D	0.843	0.813	0.829	0.831	0.836	0.830	31.57

Table 8. The localization accuracy of each individual localization model in the pressure–flow collaborative training set.

Positioning Method	1st Fold	2nd Fold	3rd Fold	4th Fold	5th Fold	Average Accuracy Rate (5-Fold Cross-Validation)	Training Time/s
LSSVM	0.886	0.888	0.888	0.886	0.889	0.887	4.96
ENN	0.886	0.882	0.890	0.885	0.887	0.886	10.48
DBN	0.831	0.893	0.890	0.894	0.858	0.873	18.65
S-L-E	0.890	0.892	0.887	0.891	0.895	0.891	22.36
S-L-D	0.889	0.893	0.895	0.892	0.893	0.892	28.74
S-E-D	0.890	0.899	0.902	0.890	0.894	0.895	31.82
S-L-E-D	0.924	0.931	0.938	0.933	0.932	0.932	36.59

Table 9. Statistical comparison of models under pressure–flow collaborative input.

Model	Mean Accuracy	Standard Deviation	95% Confidence Interval	Mean Difference vs. S-L-E-D	p-Value	Cohen’s d
LSSVM	0.887	0.001	0.886–0.889	0.044	<0.001	9.71
ENN	0.886	0.003	0.882–0.890	0.046	<0.001	10.12
DBN	0.873	0.028	0.839–0.908	0.058	0.006	2.41
S-L-E	0.891	0.003	0.887–0.895	0.041	<0.001	6.24
S-L-D	0.892	0.002	0.890–0.895	0.039	<0.001	12.92
S-E-D	0.895	0.005	0.888–0.902	0.037	<0.001	8.68
S-L-E-D	0.932	0.005	0.926–0.938	—	—	—
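The mean, standard deviation, and confidence interval in Table 9 can be recomputed from the five fold accuracies in Table 8. The sketch below is a hedged reconstruction, not the authors' procedure. It assumes a sample standard deviation (n − 1), a Student-t interval with 4 degrees of freedom, and a paired Cohen's d (mean of the fold-wise differences divided by their standard deviation). These assumptions are inferred from the tabulated numbers and are not stated in this section. For LSSVM they give 0.887, 0.001, 0.886–0.889, and d ≈ 9.71, which agrees with Table 9. The helper names are hypothetical.

```python
import math
import statistics

# Fold accuracies copied from Table 8 (pressure-flow collaborative input).
FOLDS = {
    "LSSVM": [0.886, 0.888, 0.888, 0.886, 0.889],
    "DBN": [0.831, 0.893, 0.890, 0.894, 0.858],
    "S-L-E-D": [0.924, 0.931, 0.938, 0.933, 0.932],
}

# Two-sided 95% Student-t critical value for 4 degrees of freedom (5 folds).
T_CRIT_DF4 = 2.776


def summarize(acc):
    """Mean, sample SD, and 95% t-interval for five fold accuracies."""
    mean = statistics.mean(acc)
    sd = statistics.stdev(acc)  # sample standard deviation (n - 1)
    half_width = T_CRIT_DF4 * sd / math.sqrt(len(acc))
    return mean, sd, (mean - half_width, mean + half_width)


def paired_cohens_d(reference, other):
    """Assumed effect size: mean fold-wise difference over its sample SD."""
    diffs = [r - o for r, o in zip(reference, other)]
    return statistics.mean(diffs) / statistics.stdev(diffs)


for name in ("LSSVM", "DBN"):
    mean, sd, (low, high) = summarize(FOLDS[name])
    d = paired_cohens_d(FOLDS["S-L-E-D"], FOLDS[name])
    print(f"{name}: mean={mean:.3f} sd={sd:.3f} "
          f"ci=({low:.3f}, {high:.3f}) d={d:.2f}")
```

With only five folds the interval is wide for unstable learners such as DBN, so the effect size and the interval are more informative when read together than the mean alone.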

Table 10. The TIC of the localization results for each leakage point.

Dataset	Positioning Method	Leakage Point 1	Leakage Point 2	Leakage Point 3	Leakage Point 4	Leakage Point 5	Average TIC
Pressure	S-L-D	0.083	0.201	0.036	0.102	0.049	0.094
Pressure	S-E-D	0.043	0.218	0.061	0.089	0.038	0.090
Flow	S-E-D	0.321	0.203	0.073	0.098	0.071	0.153
Flow	S-L-E-D	0.298	0.193	0.064	0.081	0.075	0.142
Pressure–Flow	LSSVM	0.072	0.181	0.020	0.056	0.051	0.076
Pressure–Flow	ENN	0.103	0.169	0.039	0.057	0.044	0.082
Pressure–Flow	DBN	0.073	0.248	0.056	0.056	0.023	0.091
Pressure–Flow	S-L-E	0.081	0.161	0.034	0.061	0.039	0.075
Pressure–Flow	S-L-D	0.070	0.181	0.023	0.051	0.038	0.073
Pressure–Flow	S-E-D	0.069	0.164	0.036	0.064	0.029	0.072
Pressure–Flow	S-L-E-D	0.041	0.138	0.015	0.058	0.028	0.056
Average value		0.114	0.187	0.041	0.070	0.044

Table 11. The MAE and distance error of each leakage point.

Dataset	Positioning Method	Leakage Point 1	Leakage Point 2	Leakage Point 3	Leakage Point 4	Leakage Point 5	MAE	Distance Error/m
Pressure	S-L-D	0.029	0.168	0.025	0.107	0.051	0.076	1.672
Pressure	S-E-D	0.007	0.186	0.036	0.089	0.065	0.077	1.694
Pressure–Flow	LSSVM	0.016	0.114	0.014	0.047	0.064	0.051	1.122
Pressure–Flow	ENN	0.026	0.092	0.027	0.054	0.063	0.052	1.144
Pressure–Flow	DBN	0.024	0.115	0.046	0.078	0.025	0.058	1.276
Pressure–Flow	S-L-E	0.021	0.096	0.020	0.056	0.059	0.051	1.122
Pressure–Flow	S-L-D	0.019	0.102	0.018	0.049	0.059	0.049	1.078
Pressure–Flow	S-E-D	0.014	0.093	0.016	0.067	0.048	0.047	1.034
Pressure–Flow	S-L-E-D	0.009	0.069	0.009	0.056	0.041	0.037	0.814
Average value		0.027	0.107	0.034	0.089	0.083
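In every listed row of Table 11, the distance error equals the MAE multiplied by 22 (for example, 0.037 × 22 = 0.814). The sketch below illustrates that conversion. It assumes the MAE is a normalized position error and that 22 m is the normalization length. This factor is inferred from the table values and is not stated in this section, so `PIPELINE_SCALE_M` and `distance_error_m` are hypothetical names.

```python
# Assumed scale: inferred from Table 11, where distance error / MAE = 22.
PIPELINE_SCALE_M = 22.0


def distance_error_m(mae, scale_m=PIPELINE_SCALE_M):
    """Convert a normalized MAE into a distance error in metres."""
    return mae * scale_m


# MAE values copied from Table 11.
TABLE_11_MAE = {"Pressure S-L-D": 0.076, "LSSVM": 0.051, "S-L-E-D": 0.037}

for model, mae in TABLE_11_MAE.items():
    print(f"{model}: {distance_error_m(mae):.3f} m")
```

Under this assumption, only the S-L-E-D model with pressure–flow input falls below the 1 m threshold quoted in the abstract.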

Table 12. Statistical robustness evaluation under different monitoring point fault conditions.

Fault Condition	Mean Accuracy	Standard Deviation	95% Confidence Interval	Coefficient of Variation/%
No sensor fault	0.932	0.005	0.926–0.938	0.54
Failure of M1	0.884	0.012	0.869–0.899	1.36
Failure of M4	0.891	0.010	0.879–0.903	1.12
Failure of M5	0.881	0.013	0.865–0.897	1.48
Failure of M1 & M4	0.861	0.014	0.844–0.878	1.63
Failure of M1 & M5	0.858	0.015	0.839–0.877	1.75
Failure of M2 & M4	0.857	0.016	0.837–0.877	1.87
Failure of M2 & M5	0.855	0.016	0.835–0.875	1.87
