BiLSTM-VAE Anomaly Weighted Model for Risk-Graded Mine Water Inrush Early Warning

Liang, Manyu; Yao, Hui; Yin, Shangxian; Hou, Enke; Lian, Huiqing; Xia, Xiangxue; Wu, Jinsui; Xu, Bin

doi:10.3390/app151910394


by Manyu Liang ^1,2, Hui Yao ^1,2, Shangxian Yin ^2,*, Enke Hou ^1, Huiqing Lian ^2, Xiangxue Xia ^2, Jinsui Wu ^3 and Bin Xu ^2

^1 College of Geology and Environment, Xi’an University of Science and Technology, Xi’an 710054, China

^2 Hebei State Key Laboratory of Mine Disaster Prevention, North China Institute of Science and Technology, Langfang 065201, China

^3 Department of Management Science and Engineering, Khalifa University, Abu Dhabi P.O. Box 127788, United Arab Emirates

^* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(19), 10394; https://doi.org/10.3390/app151910394

Submission received: 31 July 2025 / Revised: 20 September 2025 / Accepted: 24 September 2025 / Published: 25 September 2025

(This article belongs to the Special Issue Hydrogeology and Regional Groundwater Flow)


Abstract

A new cascaded model is proposed to improve the accuracy and early warning capability of predicting mine water inrush accidents. The model sequentially applies a Bidirectional Long Short-Term Memory Network (BiLSTM) and a Variational Autoencoder (VAE) to capture the spatio-temporal dependencies between borehole water level data and water inrush events. First, the BiLSTM predicts borehole water levels, and the prediction errors are analyzed to summarize temporal patterns in water level fluctuations. Then, the VAE identifies anomalies in the predicted results. The spatial correlation between borehole water levels, induced by the cone of depression during water inrush, is quantified to assign weights to each borehole. A weighted comprehensive anomaly score is calculated for final prediction. In actual water inrush cases from Xin’an Coal Mine, the BiLSTM-VAE model triggered high-risk alerts 9 h and 30 min in advance, outperforming the conventional threshold-based method by approximately 6 h. Compared with other models, the BiLSTM-VAE demonstrates better timeliness and higher accuracy with lower false alarm rates in mine water inrush prediction. This framework extends the lead time for implementing safety measures and provides a data-driven approach to early warning systems for mine water inrush.

Keywords: mine water inrush early warning; borehole water level prediction; anomalous water level detection; hydrogeological spatial correlation; Ordovician limestone aquifer water inrush

1. Introduction

Water inrush is a global disaster that is not only severely destructive but also affected by a combination of factors [1,2,3,4]. These factors make water inrush accidents sudden and difficult to predict accurately [5,6,7,8,9].

In addition to traditional data analytics [10,11,12], numerical simulation [13,14,15], and empirical thresholds [14,15,16,17,18,19], machine learning models and time series analysis methods have gradually become the dominant approaches for predicting mine accident trends. Data-driven methods such as convolutional neural networks [20,21] and recurrent neural networks [22,23] are widely used to predict water inrush accidents. By learning the dynamic patterns of time series data, these neural network models overcome the limitations of autoregressive integrated moving average (ARIMA) models [22,24], Holt–Winters exponential smoothing [25,26], and support vector machines (SVMs) [27,28] in processing complex, nonlinear, and noisy data. However, they generalize poorly: they require large amounts of input data and can only be applied to specific mining conditions.

In recent years, scholars have proposed various novel optimization methods to improve prediction accuracy and generalization. These approaches include Data Preprocessing (e.g., feature selection, sparse coding [29,30], and dimensionality reduction [31,32]) to enhance model efficiency; Feature Engineering for extracting discriminative patterns from raw data; Data Augmentation to expand dataset diversity and improve generalization; Multi-factor Intelligent Recognition [33,34,35,36,37] to analyze interactions between complex factors; and Multi-source Information Fusion [38,39,40,41,42] to integrate heterogeneous data sources. Collectively, these methods provide comprehensive data representations that significantly boost model robustness in predicting mine water inrushes.

Model optimization methods involve optimizing model parameters and hyperparameters [43,44,45,46,47] to improve the model’s fit and robustness on the training set.

Recent work has emphasized the importance of transitioning coal mines toward multipurpose operations with integrated water management and methane utilization systems as part of sustainable ESG strategies, which further underlines the urgency of early and accurate water inrush prediction [48].

To respond rapidly to water inrush disasters, both the accuracy and the timeliness of water inrush prediction must improve. Risk positioning [49] and anomaly detection methods have therefore been introduced. Yin et al. [50] used a coupled LSTM and Isolation Forest (iForest) model to predict mine water inrush. This method combines the strengths of LSTM in time series prediction and of iForest in anomaly detection to provide hierarchical early warning of potential water inrush accidents.

This study innovatively integrates temporal prediction and anomaly detection technologies to construct a BiLSTM-VAE coupled model for mine water inrush early warning:

(1): Model Design: BiLSTM captures long-term dependencies in water level time series for precise prediction, while VAE learns latent data distributions to detect anomalies in prediction residuals, enabling dynamic risk perception.
(2): Indicator Optimization: The Ordovician limestone aquifer borehole water level was selected as the primary prediction indicator, based on the cone of depression effect observed near water inrush points. The sampling frequency was increased to 5-minute intervals using forward-filling interpolation, thereby improving early warning timeliness.
(3): Multi-level Early Warning: A weighted multi-level composite anomaly index is established by assigning weights based on spatial correlations among borehole water levels, overcoming the limitations of traditional threshold-based approaches. Validated by the Xin’an Mine case, this model provides warnings 6 h earlier than empirical methods (9 h and 30 min ahead of actual inrush events), reducing the false alarm rate.

2. Data Modeling

2.1. BiLSTM-VAE

BiLSTM is a variant of the Long Short-Term Memory (LSTM) network. Its core structure consists of two parallel LSTM layers: one processes the input sequence in the forward direction (from left to right), and the other processes it in the backward direction (from right to left). This bidirectional design enables the network to capture the context information of both past and future sequences simultaneously, overcoming the limitation of the unidirectional LSTM, which can only rely on historical information [51,52,53,54,55].

VAE is a generative network structure based on variational Bayes (VB) inference. It introduces the concepts of probabilistic inference and probabilistic modeling on the basis of autoencoders [56]. After encoding and decoding, the difference between the test data and the generated data is usually measured using distance metrics or reconstruction errors, which is advantageous in anomaly detection [57]. Therefore, the BiLSTM-VAE is used to detect abnormal water levels, the model structure of which is shown in Figure 1.

(1): BiLSTM networks are first trained individually on raw water-level sequences to generate six critical feature series per borehole: the predictive residuals (AE), the absolute change rates (Δy1, Δy2, ΔAE), and their source sequences (y1, y2). Combinations of these series form variable groups, from which the optimal input variables for subsequent anomaly detection are selected.
(2): Variational Autoencoders (VAEs) quantify single-borehole anomalies by evaluating the reconstruction error between the input water level data and their VAE-reconstructed outputs. The anomaly score is computed as the mean squared error (MSE) of the reconstruction. The decision threshold for true anomalies is set to 0.3 times the maximum absolute water level fluctuation observed during the monitoring period (between April 2020 and April 2021). Optimal input feature combinations for detection are determined through ROC-AUC analysis.
(3): Comprehensive early warning results are obtained by weighted summation of the anomaly values from the four boreholes, with weight coefficients determined based on the spatial correlation among the boreholes at the moment of water inrush. The determination of the true anomaly threshold remains unchanged; when the monitored value of any borehole exceeds its corresponding threshold, the comprehensive early warning system determines that moment as a true anomaly moment.
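The single-borehole score in step (2) reduces to a mean squared error between an input feature vector and its reconstruction. The following is a minimal sketch of that scoring step only, assuming the reconstruction comes from an already trained VAE (not shown here); the function name `reconstruction_mse` and the sample numbers are hypothetical.

```python
def reconstruction_mse(x, x_hat):
    """Mean squared error between an input vector and its reconstruction."""
    if len(x) != len(x_hat):
        raise ValueError("input and reconstruction must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


# Hypothetical feature vector and hypothetical decoder output.
x = [0.52, 0.50, 0.02, 0.01]
x_hat = [0.50, 0.50, 0.00, 0.01]
score = reconstruction_mse(x, x_hat)
```

A larger score means the VAE reproduces the sample less faithfully, which is what marks it as a candidate anomaly.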

2.2. Studied Area and Dataset

In October 2021, a water inrush accident occurred in the old coal kiln of Xin’an Mine. The mine is located in the southern runoff area of the Heilongdong Spring region, within the low hilly terrain between Jiushan and the southern section of Gushan. The elevation difference between the highest and lowest points reaches 104.7 m (Figure 2). Xin’an Mine belongs to the Carboniferous-Permian coalfield, where mining operations are primarily threatened by the underlying Ordovician limestone aquifer.

As illustrated in Figure 3, the water inrush event can be divided into three distinct stages.

The pre-inrush phase (22 April 2020 to 25 October 2021) includes water-level rises due to seasonal rainfall.

The inrush phase (25 October to 3 November 2021) is highlighted in yellow. Following a sustained rise in the water level across all boreholes, the water levels in XO5 and KM, both completed in the Ordovician limestone aquifer, abruptly declined by 0.7 m and 1.0 m, respectively, within the first three hours following the water inrush. Thus, the anomalous fluctuation of the water level within the Ordovician limestone aquifer (an initial rise followed by a sudden drop), induced by the formation of a cone of depression, serves as a precursor signal for water inrush events.

In the post-inrush treatment phase (3 November 2021 to 19 May 2022), continuous grouting efforts gradually reduced and stabilized the water inflow.

This study utilizes water-level data from April 2020 to December 2021, collected from boreholes completed in the Ordovician limestone aquifer (O2-4, O2-6, XO5, KM) and the Carboniferous Daqing limestone aquifer (D19), all located near the accident site. The dataset comprises approximately 17,000 entries per borehole. Continuous monitoring before the incident captured hydraulic fluctuations under extreme precipitation conditions, providing a robust foundation for training machine learning models.

2.3. Data Pre-Processing

To ensure the accuracy of model predictions, the dataset underwent comprehensive cleaning, which involved the removal of obviously erroneous records and the application of forward-filling interpolation. The sampling interval was reduced from 1 h to 5 min to enhance the temporal resolution and overall quality of the data.
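The forward-fill upsampling step can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' preprocessing code: the function name `upsample_ffill` and the two hourly readings are hypothetical, and records are assumed to be sorted by timestamp.

```python
from datetime import datetime, timedelta


def upsample_ffill(records, step=timedelta(minutes=5)):
    """Upsample (timestamp, level) records to a regular grid by forward filling.

    `records` must be sorted by timestamp. Each grid point takes the most
    recent observation at or before it.
    """
    if not records:
        return []
    out = []
    idx = 0
    t = records[0][0]
    end = records[-1][0]
    while t <= end:
        # Advance to the last observation that is not later than t.
        while idx + 1 < len(records) and records[idx + 1][0] <= t:
            idx += 1
        out.append((t, records[idx][1]))
        t += step
    return out


# Hypothetical hourly readings (m).
hourly = [
    (datetime(2021, 10, 24, 17, 0), 101.20),
    (datetime(2021, 10, 24, 18, 0), 101.35),
]
five_min = upsample_ffill(hourly)
```

Forward filling repeats the last known value, so it raises the temporal resolution of the grid without adding new measurements between hourly readings.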

Decomposition of the water-level time series using an additive model revealed no significant seasonal or periodic patterns in the Ordovician limestone aquifer boreholes, with variations primarily characterized by random fluctuations (Figure 4). Residual analysis further confirmed the stability of water levels in borehole XO5, indicating a generally stable mine environment. Consequently, seasonal factors were deemed unnecessary for inclusion in the water-inrush prediction model.

Correlation and multicollinearity among the boreholes (O2-4, O2-6, XO5, KM, and D19) were evaluated using Spearman’s rank correlation coefficient and the variance inflation factor (VIF). The results are illustrated in Figure 5 and summarized in Table 1.
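For readers unfamiliar with the statistic, Spearman’s rank correlation is the Pearson correlation of the rank-transformed series. A dependency-free sketch is given below; the helper names and the two short water-level series are hypothetical, and the VIF computation is not covered.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed series."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical water levels (m) for two boreholes.
a = [101.2, 101.4, 101.1, 101.8, 101.6]
b = [98.0, 98.3, 97.9, 99.5, 98.4]
rho = spearman(a, b)
```

Because only ranks enter the calculation, the coefficient captures monotonic co-movement between boreholes even when the relationship is not linear.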

During the period from 20 November 2021 to 20 May 2022, strong positive correlations (Spearman’s ρ = 0.70–0.99) were observed between the water levels in monitoring boreholes (XO5, O2-4, O2-6, KM) and the recorded water inflow during inrush events. In contrast, borehole D19 exhibited only moderate correlation (ρ = 0.71 with XO5 and KM; ρ = 0.70 with O2-4 and O2-6) and demonstrated a distinct dynamic response during the incident.

VIF analysis indicated significant multicollinearity among most of the borehole water-level variables. Borehole D19, however, displayed low multicollinearity, with VIF values ranging between 2 and 4. This finding was consistent with the Spearman correlation results, which showed consistently weaker associations between D19 and the other boreholes.

This divergence can be attributed to differences in hydrostratigraphic units: borehole D19 monitors the Carboniferous Daqing Limestone Aquifer, while the remaining boreholes are completed in the Ordovician limestone aquifer (O2 aquifer). Each aquifer is governed by distinct hydrogeological mechanisms, leading to independent water-level fluctuation behaviors. To mitigate multicollinearity effects and simplify the model, data from D19 were excluded from subsequent water-inrush prediction modeling.

Variables with the lowest VIF values were selected as inputs for constructing the prediction model. The following variable pairs were used: (O2-4, O2-6), (O2-6, O2-4), (XO5, KM), and (KM, XO5), where the first variable in each pair served as the input and the second as the prediction target.

3. Results and Discussion

3.1. Data Prediction

Using the aforementioned data combinations, water levels in boreholes O2-4, O2-6, XO5, and KM were predicted by applying their partitioned datasets to a Bidirectional LSTM (BiLSTM) network. To prevent the influence of post-accident mitigation measures (e.g., grouting) on model training, water level data from these boreholes were divided into training and testing sets based on the timeline of the water inrush incident. The training period spans from 1 April 2020 to 23 September 2021 and the testing period spans from 23 September 2021 to 1 December 2021.

The prediction model employed in this study is a BiLSTM network comprising a single bidirectional LSTM layer with 18 units, followed by batch normalization. The architecture also includes three fully connected layers with 16, 8, and 1 neurons, respectively, all using linear activation functions. Model training was conducted using the Adam optimizer (learning rate = 0.001, epsilon = 1 × 10⁻⁵), with mean squared error (MSE) as the loss function. Key training parameters included a batch size of 15, 50 epochs, a 10% validation split, and early stopping criteria (patience = 20, min_delta = 0.001) with restoration of the best weights.

Using borehole XO5 as an example, Figure 6 illustrates the training and validation loss curves, which converge sufficiently after 50 epochs, indicating stable training. Experiments showed that increasing prediction timesteps beyond 12 did not significantly improve accuracy (as measured by RMSE and MAE) but considerably increased computational cost. Thus, the final model uses 15 historical timesteps (75 min of data at the 5-minute sampling interval) to predict the water level for the next 5-minute interval.
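The input layout implied here, a fixed window of 15 past samples paired with the next sample as the target, can be sketched as below. The function name `make_windows` and the synthetic series are hypothetical; the sketch shows only how the supervised pairs are formed, not the network itself.

```python
def make_windows(series, lookback=15):
    """Split a series into (inputs, target) pairs for one-step-ahead prediction.

    Each input holds `lookback` consecutive values; the target is the value
    that immediately follows them.
    """
    pairs = []
    for end in range(lookback, len(series)):
        pairs.append((series[end - lookback:end], series[end]))
    return pairs


# Hypothetical 5-minute water levels (m): 20 samples.
levels = [100.0 + 0.01 * i for i in range(20)]
windows = make_windows(levels, lookback=15)
```

A series of N samples yields N − 15 training pairs, so a longer lookback trades a few usable samples, and more computation per sample, for additional context.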

Model performance was evaluated using the coefficient of determination (R²) and mean absolute error (MAE) across various cell configurations. As illustrated in Figure 7, the multivariate model (denoted as R²(2) and MAE(2)) consistently exhibited superior performance compared to the univariate model. The optimal configuration was uniquely attained with 18 cells, where R²(2) reached its maximum while MAE(2) simultaneously achieved a minimum—a dual optimum that was not observed at other cell counts, such as 10 or 12 cells.

A comprehensive set of eight evaluation metrics was selected to thoroughly evaluate the model’s performance in water-level prediction, addressing key aspects such as accuracy, stability, and operational safety (Table 2). These metrics were carefully chosen to align with the specific requirements of groundwater monitoring and early-warning systems for water inrush. R² was employed to evaluate the overall goodness-of-fit between predicted and observed water levels. RMSE and MAE were used to quantify absolute error magnitudes, with RMSE emphasizing larger deviations often critical in anomaly detection. MARE provided a relative error measure intuitive for interpreting percentage deviations in hydraulic head. MSRE complemented this by offering enhanced sensitivity to outliers, which is vital given the abrupt changes characteristic of inrush precursors. MaxARE was included to monitor worst-case prediction errors, essential for ensuring reliability in safety-critical forecasting. MBE helped identify any systematic bias in predictions—over- or under-estimation of water levels—that could influence risk decisions. Finally, SD quantified the consistency of prediction errors, reflecting the model’s stability under continuous operational deployment in dynamic mining environments [58].
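A compact sketch of how these eight metrics could be computed is given below. The conventions are assumptions and may differ from the definitions in Table 2: the error is taken as predicted minus observed, relative errors are divided by the observed value, and SD is the population standard deviation of the errors. The function name and sample values are hypothetical.

```python
import math


def regression_metrics(obs, pred):
    """Common error metrics for paired observed/predicted values.

    Assumed conventions: error = pred - obs, relative errors are divided by
    the observed value, SD is the population standard deviation of the errors.
    """
    n = len(obs)
    err = [p - o for o, p in zip(obs, pred)]
    rel = [abs(e) / abs(o) for e, o in zip(err, obs)]
    mean_obs = sum(obs) / n
    ss_res = sum(e ** 2 for e in err)
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    mbe = sum(err) / n
    return {
        "R2": 1 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in err) / n,
        "MARE": sum(rel) / n,
        "MSRE": sum(r ** 2 for r in rel) / n,
        "MaxARE": max(rel),
        "MBE": mbe,
        "SD": math.sqrt(sum((e - mbe) ** 2 for e in err) / n),
    }


# Hypothetical observed and predicted water levels (m).
observed = [100.0, 101.0, 102.0, 103.0]
predicted = [100.5, 100.5, 102.5, 102.5]
m = regression_metrics(observed, predicted)
```

In this toy case the errors alternate in sign, so MBE is zero even though MAE is 0.5 m, which illustrates why bias and magnitude are reported separately.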

A comparative analysis of four neural network models—BiLSTM, LSTM, GRU, and RNN—each trained over 50 runs using 15 time-step inputs and 18 hidden units (Table 3), reveals distinct performance differences across four boreholes. BiLSTM achieves the highest predictive accuracy in boreholes O2-4 and O2-6, with R² values of 0.980 and 0.978, and MAE values of 0.019 and 0.029, respectively. It also attains the lowest values in MSE, MAE, MARE, and MSRE across most boreholes, reflecting high precision and robustness against outliers.

A slight negative MBE, such as −0.178 in borehole KM, indicates a conservative prediction tendency that reduces the risk of missed alarms—an essential trait for operational safety. Furthermore, BiLSTM exhibits among the smallest maximum absolute errors (erMAX) and low standard deviation (SD), emphasizing its exceptional stability for continuous groundwater monitoring.

Although LSTM shows competitive accuracy in borehole KM (R² = 0.965, MAE = 0.189), its overall performance is generally inferior to BiLSTM. GRU yields inconsistent results across boreholes, such as a relatively low R² of 0.739 in XO5, while RNN performs poorest with high MAE and low R², particularly in XO5 and KM.

In summary, BiLSTM demonstrates superior accuracy, reliability, and stability, making it highly suitable for real-time water-level forecasting and early-warning systems in water inrush prevention.

During actual water inrush incidents, the model proved highly responsive. For example, a sudden drop of 1.19 m was observed in borehole XO5, during which predicted values consistently exceeded actual measurements with a visible time lag. A peak prediction error of 0.768 m occurred at 11:55 on 25 October 2021 (Figure 8). This behavior stems from the disruption of natural periodic aquifer fluctuations under sudden hydraulic forcing, resulting in anomalous variations that are effectively captured by BiLSTM as abrupt increases in prediction error.

In conclusion, while water level prediction alone is insufficient for reliable forecasting of major water inrush events, the integration of specialized anomaly detection methods—such as the proposed BiLSTM-based framework—can significantly enhance early warning capabilities and hazard preparedness.

3.2. Anomaly Detection

Given that neither the water level nor the water inflow data followed a normal distribution, the 3σ method failed to identify any anomalies. Considering the atypical nature of water-level data after April 2021 and based on expert knowledge and common threshold-setting practices in the coal mining industry, a threshold range of 0.1 to 0.4 times the maximum absolute variation rate of borehole XO5 observed between April 2020 and April 2021 was established [50]. Data points exceeding this variation rate threshold were labeled as anomalous. In the identification of genuine anomalies, the following scenarios are explicitly classified as anomalous conditions: firstly, a simultaneous sharp rise in water levels across multiple boreholes during the rainy season, accompanied by rapid hydraulic increase; secondly, a synchronized rapid decline in water levels observed across several boreholes within a short period during water inrush events; thirdly, throughout the water control phase, such as during the implementation of grouting and water reduction engineering, persistent declining and fluctuating water levels resulting from the continuous intervention of such engineering measures. Additionally, since grouting engineering exerts a continuous and complex influence on water level dynamics, it is often difficult to accurately delineate the specific time periods affected using empirical methods. Therefore, the entire grouting construction phase was excluded from subsequent anomaly indicator scoring in the evaluation process. A comparison between anomalies identified under different thresholds and the actual anomaly events is illustrated in Figure 9.
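The labeling rule described above can be sketched as follows, assuming equally spaced samples so that the step-to-step change stands in for the variation rate. The function name `label_anomalies` and both series are hypothetical; the factor 0.3 is one value from the 0.1 to 0.4 range under test.

```python
def label_anomalies(levels, reference_levels, factor=0.3):
    """Flag steps whose absolute change exceeds factor * reference maximum.

    `reference_levels` is the baseline series used to find the maximum
    absolute step-to-step change; `levels` is the series being labeled.
    Returns one boolean per step in `levels` (len(levels) - 1 values).
    """
    ref_max = max(abs(b - a) for a, b in zip(reference_levels, reference_levels[1:]))
    threshold = factor * ref_max
    return [abs(b - a) > threshold for a, b in zip(levels, levels[1:])]


# Hypothetical baseline and monitored series (m), equally spaced in time.
baseline = [100.0, 100.1, 100.6, 100.5, 100.4]   # max |change| is about 0.5
monitored = [100.4, 100.45, 100.1, 100.12]       # changes: 0.05, -0.35, 0.02
flags = label_anomalies(monitored, baseline, factor=0.3)
```

Only the 0.35 m drop exceeds the 0.15 m threshold here, so it is the single step labeled anomalous.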

The selection of evaluation metrics was guided by the critical need to minimize both missed detections and false alarms in mine water inrush early warning. Missed anomalies may lead to catastrophic safety failures, while false alerts can trigger unnecessary emergency responses, resulting in operational disruptions and economic losses.

To holistically evaluate model performance under these constraints, key metrics including Recall, Precision, F1-Score, False Positive Rate (FPR), Specificity, Matthews Correlation Coefficient (MCC), True Positives (TP), and False Positives (FP) were adopted (Table 4). These indicators collectively provide insights into the model’s ability to detect true anomalies while controlling false alarms.
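These indicators all follow from the four confusion-matrix counts. The sketch below uses the standard definitions; the counts are hypothetical and are not the paper's confusion matrix, and zero denominators are not handled.

```python
import math


def detection_metrics(tp, fp, fn, tn):
    """Confusion-matrix metrics used to compare anomaly thresholds."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    fpr = fp / (fp + tn)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "Recall": recall,
        "Precision": precision,
        "F1": 2 * precision * recall / (precision + recall),
        "FPR": fpr,
        "Specificity": 1 - fpr,
        "MCC": (tp * tn - fp * fn) / mcc_den,
    }


# Hypothetical counts, not the paper's confusion matrix.
scores = detection_metrics(tp=8, fp=2, fn=2, tn=88)
```

MCC uses all four counts, which makes it more informative than accuracy when normal samples greatly outnumber anomalies.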

Due to the specific requirements of mine water inrush early warning—which aims to capture true anomalies (true positives) to the greatest extent while minimizing false alarms (false positives)—the selection of an appropriate threshold is critical. As illustrated in Figure 9, a threshold set at or below 0.25 times the maximum variation rate results in a higher number of false alarms. Conversely, when the threshold is raised to 0.35 times or above, the missed detection rate increases significantly. In comparison, a threshold defined as 0.3 times the maximum absolute variation rate demonstrates a better balance between controlling false alarms and minimizing missed detections. This threshold accurately identifies two key genuine anomaly events: one is a more rapid synchronous rise occurring during the general water level increase in the rainy season, and the other is a synchronous rapid decline during the water inrush incident, which deviates from normal dynamic patterns. Furthermore, following the initiation of the third phase involving grouting and water reduction engineering, the 0.3-times threshold is also the first to trigger anomaly alerts, indicating its high sensitivity to sustained abnormal water level responses induced by continuous engineering interventions. This aligns well with the practical requirement for persistent anomaly detection during this phase.

Among the tested thresholds, 0.3 was identified as optimal. At this value, Recall reached 0.840, indicating high sensitivity to true anomalies, while Precision was 0.992 and FPR was only 0.00049—reflecting minimal false alarms. The MCC of 0.907 confirms well-balanced classification performance. With only six false positives across all samples, this threshold effectively meets the operational objective of maximizing detection without introducing significant false alerts.

Lower thresholds increased FPR substantially, raising false alarm rates, whereas higher thresholds severely reduced Recall, thus increasing the risk of missed events. Threshold 0.3 thus represents the most suitable trade-off, ensuring both safety and operational continuity in mine water inrush.

The BiLSTM prediction results yield the following variables: the observed borehole water level (y₁), the predicted water level (y₂), the prediction error (AE), the absolute rate of change in prediction error (ΔAE), the absolute rate of change in the observed water level (Δy₁), and the absolute rate of change in the predicted water level (Δy₂). Combining these variables gives six combinations (Table 5).
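A minimal sketch of deriving the six series from an observed and a predicted sequence is shown below. It assumes that AE is the absolute prediction error and that each rate of change is the absolute difference between consecutive samples at a fixed sampling interval; the function name `build_features` and the sample values are hypothetical.

```python
def build_features(y1, y2):
    """Derive the six per-step series from observed (y1) and predicted (y2) levels.

    AE is the absolute prediction error; each delta is the absolute
    step-to-step change of its source series. The first step is dropped so
    that all six series have the same length.
    """
    ae = [abs(a - b) for a, b in zip(y1, y2)]

    def abs_diff(series):
        return [abs(b - a) for a, b in zip(series, series[1:])]

    return {
        "y1": y1[1:],
        "y2": y2[1:],
        "AE": ae[1:],
        "d_y1": abs_diff(y1),
        "d_y2": abs_diff(y2),
        "d_AE": abs_diff(ae),
    }


# Hypothetical observed and predicted levels (m).
features = build_features([100.0, 100.5, 99.5], [100.0, 100.25, 100.0])
```

Any subset of the returned keys can then be stacked column-wise to form one candidate input combination for the detector.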

Selecting optimal input variables is critical for enhancing anomaly detection performance. Receiver Operating Characteristic (ROC) and Area Under the Curve (AUC) analysis were employed to evaluate classification model performance across variable combinations.

For the XO5 borehole, the combination (y₁, y₂, AE, Δy₁, Δy₂, ΔAE) designated as Combination 3 achieved maximum AUC for the VAE model (Figure 10). Its superior performance can be attributed to the comprehensive multi-perspective input it provides: the raw water levels (y₁, y₂) convey the system’s current state; the prediction error (AE) reflects model deviation; and the temporal rates of change (Δy₁, Δy₂, ΔAE) capture dynamic trends. This integration of instantaneous, residual, and derivative information enables the VAE to better learn normal patterns and identify subtle anomalies.

This variable group was subsequently used for anomaly detection in all four boreholes. The detection results were validated using a hydraulic threshold method, with the threshold value set at 0.3 times the maximum absolute water-level change (|Δh|_max).

Figure 11 shows the results of the VAE model’s detection of anomalies in the XO5 borehole, where Combination 3 performs best in terms of Recall, Precision, and F1 Score. In particular, its Recall value of 1 indicates the model’s high ability to identify anomalous samples. Despite the class imbalance in the anomaly detection data, the high Precision and F1 Score of Combination 3 further confirm its superior performance.

To identify the most appropriate detection model, the Combination 3 variables from the XO5 prediction results were used as input data for six candidate detectors: VAE, autoencoder (AE), one-class support vector machine (OCSVM), local outlier factor (LOF), robust covariance estimation (Elliptic Envelope, EE), and Isolation Forest (iForest). The models were trained on the training set, and their results are compared with the anomalies determined by the threshold method in Figure 12.

In detecting water level anomalies under precipitation and mining impacts, iForest suffers from missed detections, and OCSVM, LOF, and EE produce more false positives (EE being the most sensitive), while AE and VAE show the fewest false alerts and the highest accuracy.

Four standard evaluation metrics (Accuracy, F1-Score, Recall, and Precision) are used to evaluate the detection performance of the different models. Because normal samples far outnumber abnormal ones in the water level data, anomaly detection places more emphasis on F1-Score, Precision, and Recall than on Accuracy alone. Figure 13 shows the performance of different methods for detecting abnormal water levels in four boreholes.

Based on a comprehensive analysis of multiple performance metrics, the VAE demonstrates significant advantages in anomaly detection, achieving the highest scores in both Accuracy (0.99) and Precision (0.81), indicating high overall classification accuracy with a very low false alarm rate. Its F1 Score (0.70) and Recall (0.61) further reflect a well-balanced performance in identifying both positive and negative instances. In contrast, although the AE attains relatively high Accuracy (0.92), its exceptionally low Precision (0.12) and F1 Score (0.22) reveal a high false alarm propensity. The OCSVM shows strong Recall (0.89) and high sensitivity, yet its severely limited Precision (0.066) leads to significant false positive issues. The LOF model delivers mediocre performance across all metrics (F1 = 0.11, Precision = 0.066), rendering it inadequate for practical applications. Both EE and iForest exhibit complete performance failure, with near-zero F1 Scores (0.0078, 0.10) and extremely low Precision values (0.0042, 0.053), indicating an almost total inability to effectively distinguish anomalous samples. In summary, the VAE is the only method that combines high accuracy and strong robustness, significantly outperforming all other compared algorithms.

Using Combination 3 and a calibration threshold of 0.3 times the historical maximum absolute change in water level, the VAE detected anomalies across the four boreholes (with post-November 2021 data excluded for stability), demonstrating its effectiveness in stochastic water level scenarios (Figure 12 and Figure 14).

The anomaly detection results for the four boreholes are divided into three stages: the pre-inrush, inrush, and post-inrush periods. The VAE model can effectively detect anomalies at all stages, and these anomalies correspond to those determined by the threshold method.

The VAE model demonstrated high accuracy and sensitivity in anomaly detection across these stages for the four boreholes. By comparing the false alarm situations of each borehole over time, it was found that there were instances of multiple boreholes issuing warnings simultaneously, mutual warnings, and false alarms in one borehole corresponding to real anomalies in others. This occurs because the formation of a cone of depression during water inrush establishes spatial correlations between borehole water levels, leading to synchronized anomalies in the detection results that provide mutual verification.

In summary, an anomaly in a single borehole is not sufficient to confirm an anomaly at that time. The anomalies of all four boreholes must be considered together and analyzed further to obtain more reliable detection results and a more reliable basis for early warning of mine water inrush accidents.

3.3. Comprehensive Water Inrush Warning

When a water inrush occurs in a mine, the groundwater level near the inrush point drops rapidly, forming a cone of depression (Figure 15). Borehole water levels respond spatially to the inrush, and the response is more pronounced in boreholes closer to the inrush point. The higher the correlation between a borehole and the water inrush event, the better the anomalies detected in that borehole reflect the real water inrush. Therefore, the anomalies detected in all boreholes should be considered together to capture the real water inrush anomaly more accurately.

During a water inrush event, weights are assigned according to each borehole’s response speed and fluctuation magnitude: higher weights for rapid, significant changes and lower weights for delayed, minor variations. The weight assigned to each borehole is calculated as the proportion of its water level change rate relative to the total water level change rate across all boreholes. The formulas for calculating the weights are as follows:

\Delta h_{i} = \frac{H_{\mathrm{before}} - H_{\mathrm{after}}}{\Delta t} \quad (1)

W_{i} = \frac{\Delta h_{i}}{\sum_{j=1}^{n} \Delta h_{j}} \quad (2)

In the formula, Δhᵢ represents the water level change rate (unit: m/min) before and after the water inrush event, calculated as the difference between the water level before the inrush (H_before, unit: m) and after the inrush (H_after, unit: m) divided by the time interval (Δt, min); Wᵢ denotes the dimensionless weight coefficient derived from each borehole’s Δhᵢ value as a proportion of the total Δhᵢ sum across all four boreholes. The calculated weight values are presented in Table 6.
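Equations (1) and (2) can be worked through numerically as in the sketch below. The water levels and the 180-minute interval are hypothetical and are not the values behind Table 6; the function names are likewise illustrative.

```python
def change_rate(h_before, h_after, dt_min):
    """Equation (1): water level change rate in m/min."""
    return (h_before - h_after) / dt_min


def borehole_weights(rates):
    """Equation (2): each borehole's share of the summed change rate."""
    total = sum(rates.values())
    return {name: r / total for name, r in rates.items()}


# Hypothetical levels (m) before and after the inrush over 180 min;
# these are not the values behind Table 6.
rates = {
    "O2-4": change_rate(100.0, 99.7, 180),
    "O2-6": change_rate(100.0, 99.8, 180),
    "XO5": change_rate(100.0, 99.4, 180),
    "KM": change_rate(100.0, 99.1, 180),
}
weights = borehole_weights(rates)
```

With these inputs the borehole with the largest drop (KM) receives a weight of about 0.45, and the four weights sum to 1 by construction.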

The Comprehensive Alert Value (CAV) is derived through weighted integration of anomalies from the four boreholes, with values normalized to the range [0, 1]. Thresholds for the warning levels are determined from the distribution patterns of historical normal monitoring data and validated against documented water inrush events.

Normal (CAV ≤ 0.4): 86.7% historical baseline coverage;

Low risk (0.4 < CAV ≤ 0.6): Coordinated micro-fluctuations;

Medium risk (0.6 < CAV ≤ 0.8): Engineered/rainfall-induced hydraulic coordination;

High risk (CAV > 0.8): Diagnostic of inrush events or extreme hydrological responses.
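The grading above can be sketched in a few lines. The text does not give the exact integration formula, so this sketch assumes the CAV is a weighted average of per-borehole anomaly scores that already lie in [0, 1]; the scores and weights are hypothetical, and only the thresholds come from the list above.

```python
def comprehensive_alert_value(scores, weights):
    """Weighted average of per-borehole anomaly scores (assumed in [0, 1])."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[b] * scores[b] for b in weights) / total_weight


def risk_level(cav):
    """Map a CAV to the four warning levels listed above."""
    if cav <= 0.4:
        return "normal"
    if cav <= 0.6:
        return "low"
    if cav <= 0.8:
        return "medium"
    return "high"


# Hypothetical weights and anomaly scores for the four boreholes.
weights = {"O2-4": 0.1, "O2-6": 0.2, "XO5": 0.4, "KM": 0.3}
scores = {"O2-4": 0.2, "O2-6": 0.5, "XO5": 1.0, "KM": 0.9}

cav = comprehensive_alert_value(scores, weights)
print(f"CAV = {cav:.2f}, level = {risk_level(cav)}")
```

Dividing by the weight total keeps the CAV inside [0, 1] even if the supplied weights do not sum exactly to 1.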

As shown in Figure 16, the comprehensive early warning results of the water inrush incident successfully identified three distinct types of anomalies: a rise in water level due to heavy precipitation, a rapid decline caused by the water inrush itself, and subsequent fluctuations resulting from post-accident mitigation measures such as grouting and plugging. These anomalies were detected with a low false alarm rate and triggered warnings at different risk levels. This demonstrates that the comprehensive early warning approach not only enables early detection of water inrush events but also effectively responds to diverse abnormal conditions. Compared to single-borehole anomaly detection, this method offers higher accuracy and sensitivity in identifying water inrush.

For the water inrush incident at Xin’an Mine (03:35, 25 October 2021), Figure 17 compares the warning times and risk levels between the cascaded BiLSTM–VAE model and a conventional threshold-based method.

As illustrated, the cascaded BiLSTM–VAE model demonstrates superior anomaly detection and early warning performance for the Xin’an Mine case compared to the conventional method. The first medium-risk warning was issued at 17:00 on 24 October 2021—10 h and 35 min prior to the accident—and the first high-risk warning at 18:05 on the same day, 9 h and 30 min before the inrush. These warnings were issued approximately 7 h 5 min and 6 h earlier, respectively, than the first alert from the threshold method, thus allowing substantially more time for implementing safety measures before the water inrush occurred.
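The lead times quoted above follow directly from the reported timestamps. The short check below recomputes them; the threshold-method alert time (00:05 on 25 October) is taken from the Figure 17 caption, and the helper name is an assumption introduced here.

```python
from datetime import datetime

inrush = datetime(2021, 10, 25, 3, 35)
medium_warning = datetime(2021, 10, 24, 17, 0)
high_warning = datetime(2021, 10, 24, 18, 5)
threshold_alert = datetime(2021, 10, 25, 0, 5)  # from the Figure 17 caption


def minutes_between(earlier, later):
    """Whole minutes elapsed from `earlier` to `later`."""
    return int((later - earlier).total_seconds() // 60)


lead_medium = minutes_between(medium_warning, inrush)
lead_high = minutes_between(high_warning, inrush)
gain_medium = minutes_between(medium_warning, threshold_alert)
gain_high = minutes_between(high_warning, threshold_alert)

for label, minutes in [("medium-risk lead", lead_medium),
                       ("high-risk lead", lead_high),
                       ("medium-risk gain over threshold", gain_medium),
                       ("high-risk gain over threshold", gain_high)]:
    print(f"{label}: {minutes // 60} h {minutes % 60} min")
```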

3.4. Result Comparison

To systematically evaluate the performance of different models in mine water level prediction and water inrush early warning tasks, this study selects two types of prediction models for comparison: one is classical statistical models, including Seasonal Autoregressive Integrated Moving Average (SARIMA) and Holt–Winters Exponential Smoothing (HWES), which are characterized by clear structure and strong interpretability, suitable for modeling stationary time series; the other is deep learning models, including Gated Recurrent Unit (GRU) and Bidirectional Long Short-Term Memory Network (BiLSTM). Meanwhile, to deeply investigate the design necessity of the “prediction-detection” cascaded framework, this study introduces a Variational Autoencoder (VAE) as a comparison baseline that only performs anomaly detection, aiming to validate the contribution of the prediction component to enhancing anomaly detection performance through ablation comparison.

The water level prediction performance of each model at four boreholes (O2-4, O2-6, XO5, KM) is shown in Table 7. Owing to its bidirectional gated architecture, the BiLSTM model can integrate temporal context information and effectively model the complex nonlinear dynamics of mine water level influenced by the coupling effects of historical trends and external factors (such as rainfall, water inrush events, and grouting engineering), thus achieving the highest prediction accuracy (R²: 0.94–0.98) and the lowest errors (MAE: 0.019–0.265, MSE: 0.025–0.348) across all boreholes. As a lightweight variant of LSTM, GRU shows better prediction performance (R²: 0.739–0.972) than classical models in most boreholes; however, its simplified gated structure has limited capability in characterizing extreme nonlinear processes, leading to a significant increase in prediction deviation at borehole XO5 (R² = 0.739). SARIMA underwent rigorous series order determination: ACF diagnosis indicated that the original series was non-stationary (Figure 18a), but became stationary after first-order differencing (Figure 18c). Combined with PACF truncating after lag 2 (Figure 18b), the optimal order was determined as (2,1,0). However, its linear modeling nature struggles to adapt to the nonlinear and non-stationary characteristics of water level data, resulting in significantly higher prediction errors (MAE: 0.246–0.634) than neural network models. The HWES model, reliant on fixed seasonal patterns and smoothing coefficients, performed poorly when applied to water level series lacking significant seasonal patterns and containing sudden fluctuations, leading to prediction failure (negative R² values for some boreholes); consequently, it was excluded from subsequent analysis.
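For readers unfamiliar with the three prediction metrics, the sketch below shows one standard way to compute MAE, MSE, and R², and why R² turns negative when a forecast is worse than simply predicting the mean of the observations. The series are hypothetical values chosen for illustration, not the data behind Table 7.

```python
def regression_metrics(observed, predicted):
    """Return (MAE, MSE, R^2) for two equally long sequences."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [o - p for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined for a constant observed series")
    r2 = 1.0 - sum(e * e for e in errors) / ss_tot
    return mae, mse, r2


# Hypothetical water levels (m) and two hypothetical forecasts.
observed = [10.0, 10.5, 11.0, 10.5]
close_forecast = [10.1, 10.4, 11.1, 10.4]
poor_forecast = [11.0, 10.0, 10.0, 11.5]

mae_close, mse_close, r2_close = regression_metrics(observed, close_forecast)
mae_poor, mse_poor, r2_poor = regression_metrics(observed, poor_forecast)
print(f"close: MAE={mae_close:.3f}, MSE={mse_close:.3f}, R2={r2_close:.2f}")
print(f"poor:  MAE={mae_poor:.3f}, MSE={mse_poor:.3f}, R2={r2_poor:.2f}")
```

The second forecast has a larger squared error than the spread of the observations around their own mean, which is the situation a negative R² signals.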

To further investigate the impact of model architecture on water inrush early warning effectiveness, this study constructed hybrid detection frameworks by integrating various prediction modules (SARIMA, GRU, BiLSTM) with a variational autoencoder (VAE) for anomaly detection and compared them against a pure VAE baseline. Both false alarms and missed detections can have severe consequences in mine safety management, and water inrush events are extremely rare, often accounting for only about 2% of the total samples. Under such extreme class imbalance, accuracy (ACC) is a misleading metric, so the comprehensive set of evaluation metrics previously established in Table 4 for high-stakes, imbalanced anomaly detection was employed to ensure a robust and meaningful assessment. The complete evaluation results are presented in Table 8.

The BiLSTM-VAE model demonstrates superior overall performance, excelling across a comprehensive set of evaluation metrics. It achieves an optimal balance between a high true positive rate (Recall: 0.846) and a high positive predictive value (Precision: 0.958), yielding an F1-Score of 0.898. This result reflects the model’s strong capability to accurately detect actual inrush events while substantially reducing false alerts. Such a balance is critical for operational safety in mining environments and is further supported by an exceptionally low false positive rate (FPR: 0.005) and high Specificity (0.995), indicating a robust capacity to avoid false alarms during normal operational periods. The Matthews Correlation Coefficient (MCC), a reliable metric for evaluating classification performance under imbalanced conditions, reaches 0.887, further confirming the model’s overall robustness and discrimination capability.
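The metrics named above can all be derived from the four confusion-matrix counts. The sketch below uses standard definitions and hypothetical counts (1000 samples with 2% anomalies), not the confusion matrix behind Table 8. It also illustrates the class-imbalance point: a detector that never raises an alarm still scores 98% accuracy.

```python
import math


def detection_metrics(tp, fp, tn, fn):
    """Standard binary detection metrics; undefined ratios are returned as 0.0."""
    def ratio(num, den):
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "fpr": ratio(fp, fp + tn),
        "specificity": ratio(tn, tn + fp),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts: 20 anomalous and 980 normal samples.
never_alarms = detection_metrics(tp=0, fp=0, tn=980, fn=20)
detector = detection_metrics(tp=17, fp=1, tn=979, fn=3)

for name, m in [("never alarms", never_alarms), ("detector", detector)]:
    print(name, {k: round(v, 3) for k, v in m.items()})
```

Accuracy barely separates the two cases, whereas Recall, F1, and MCC drop to zero for the detector that never alarms.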

In comparison, the GRU-VAE and ARIMA-VAE models exhibit notable limitations. The GRU-VAE model shows significantly reduced performance, with Recall and Precision values of 0.527 and 0.439, respectively, indicating pronounced vulnerabilities to both missed detections and false alarms. These shortcomings are quantitatively reflected in its elevated FPR (0.092) and low MCC (0.403), suggesting that its simplified gating mechanism inadequately captures the complex precursor patterns of water inrush, thereby producing residual signals that hinder discriminative anomaly detection. The ARIMA-VAE model, while marginally outperforming GRU-VAE with Recall and Precision of about 0.605 and 0.633 and an MCC of 0.568, remains constrained by its inherent linearity assumption. Although it achieves a comparatively lower FPR (0.048) and higher Specificity (0.952), indicating some ability to suppress false alarms, its moderate Recall underscores a persistent tendency to miss true inrush events, a limitation rooted in its inability to model nonlinear hydrogeological dynamics.

The pure VAE model, which operates without a preceding prediction module, falls far below acceptable levels on all metrics (Recall: 0.034, Precision: 0.032, F1-Score: 0.033, MCC: −0.017) and is no better than random guessing. Given its comprehensively deficient detection capabilities, the pure VAE model is omitted from further visual analysis in Figure 19, which instead focuses on the three hybrid models that yielded practically meaningful anomaly detection performance.

The operational implications of these quantitative results are visually articulated in Figure 19. The BiLSTM-VAE model, for which intermediate detection results are further illustrated in Figure 16 and Figure 17, produces reliable early warnings 6 to 9.5 h preceding the inrush incident, with minimal false or missed alarms. This lead time provides a critical window for emergency response actions. In contrast, the ARIMA-VAE model fails to provide effective early warnings; although it produces fewer false alarms, it demonstrates a high missed alarm rate both before and during the inrush and grouting phases, severely limiting its practical utility. The GRU-VAE model, while achieving broader anomaly coverage with a lower missed alarm rate, triggers excessive false alerts and offers only minimal advance warning—approximately 3 h—which is operationally insufficient for effective hazard mitigation.

The significant performance disparity between the pure VAE model and the hybrid architectures underscores the essential role of the prediction module. The forecasting stage is instrumental in generating discriminative residual features that amplify the detectability of anomalous sequences, thereby establishing a necessary condition for the effectiveness of the two-stage early warning framework.
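The two-stage idea can be illustrated with deliberately simple stand-ins: a persistence forecast (each value predicted as the previous one) in place of the BiLSTM, and a z-score test on the residuals in place of the VAE. This is not the authors' model; it is only a sketch of how a prediction stage turns a drifting water level series into residuals on which an abrupt drop stands out. All values and names are hypothetical.

```python
import statistics


def persistence_residuals(levels):
    """Stage 1 stand-in: predict each value as the previous one; return the errors."""
    return [levels[t] - levels[t - 1] for t in range(1, len(levels))]


def flag_anomalies(residuals, n_baseline, z_limit=3.0):
    """Stage 2 stand-in: flag residuals far from the baseline residual distribution."""
    baseline = residuals[:n_baseline]
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline residuals have zero spread")
    return [abs(r - mu) / sigma > z_limit for r in residuals]


# Hypothetical levels (m): slow decline with small jitter, then an abrupt drop.
levels = [100.0 - 0.01 * t + (0.005 if t % 2 else -0.005) for t in range(50)]
levels += [levels[-1] - 0.5, levels[-1] - 1.2]

residuals = persistence_residuals(levels)
flags = flag_anomalies(residuals, n_baseline=40)
flagged_steps = [t + 1 for t, is_anomaly in enumerate(flags) if is_anomaly]
print("flagged time steps:", flagged_steps)
```

Only the two final steps, where the level falls abruptly, are flagged; the gradual decline is absorbed by the forecast and leaves small residuals.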

4. Discussion

4.1. Results Discussion

The comprehensive experimental results presented in Section 3 robustly demonstrate the superior predictive and early-warning capabilities of the proposed BiLSTM-VAE framework. This superiority is conclusively attributed to its dedicated architectural design rather than incidental factors.

While traditional regression models like SARIMA and HWES can achieve high prediction accuracy on stable segments of the water level series (as indicated in Figure 19), their referencing significance for actual accident forecasting remains markedly low. As corroborated by the warning timestamps listed in Table 8, these models failed to provide timely high-risk warnings prior to the water inrush event. This critical shortcoming stems from their inherent limitations: strict requirements for data stationarity and linearity, and an inability to adapt to the high degree of randomness and nonlinearity imparted by geological deformation, precipitation, mining activities, and sensor faults.

In stark contrast, the BiLSTM-VAE framework overcomes these pitfalls through a synergistic division of labor between its components. The BiLSTM module excels as a powerful adaptive temporal feature extractor. Its bidirectional structure is uniquely capable of capturing the complex, coupled dynamics of mine water levels, which are influenced by both historical trends and external perturbations. This is quantitatively affirmed by its superior prediction accuracy across all boreholes (e.g., R² of 0.94 at XO5), generating prediction residuals with a significantly higher signal-to-noise ratio than those produced by GRU or statistical models. The VAE module subsequently acts as a robust probabilistic anomaly discriminator. When provided with these high-fidelity residuals, the VAE effectively learns the manifold of normal operational data, allowing it to identify subtle, pre-inrush anomalies with high precision (0.958) and recall (0.846) while maintaining an extremely low false positive rate (0.005).

The catastrophic failure of the pure VAE model (MCC: −0.017), which attempts to detect anomalies directly from raw data, underscores a fundamental insight: raw water level data in itself is not a sufficiently discriminative feature for reliable precursor detection. The forecasting stage is, therefore, not merely additive but transformative; it performs essential feature engineering by extracting highly informative residual signals, thereby dramatically enhancing the separability of anomalies for the subsequent detector. This necessity is further highlighted by the suboptimal performance of the GRU-VAE and ARIMA-VAE hybrids. The GRU’s limited capacity for modeling extreme nonlinearities and SARIMA’s inherent linearity assumptions result in noisier residuals, which in turn degrade the performance of the connected VAE module.

In conclusion, the superior performance of the BiLSTM-VAE model is not incidental but is inherent to its architectural design: the BiLSTM module extracts highly discriminative residual features through accurate prediction, and the VAE module utilizes these features to achieve anomaly recognition with a low false alarm rate. The cascaded integration of these two modules produces a significant synergistic enhancement effect. Ablation comparison experiments, including the substitution and removal of the prediction module, demonstrate that the impairment or absence of any module leads to a substantial decline in overall performance. This validates the effectiveness and necessity of the proposed cascaded framework for the early warning of mine water inrushes.

4.2. Method Limitations

While this study demonstrates the potential of advanced models for water inrush prediction, several limitations should be acknowledged:

(1) The generalizability of the findings is constrained by the single-case validation at a specific mine site. Application to other mining regions with differing hydrogeological conditions would necessitate site-specific model retraining and threshold recalibration.
(2) There is a need to improve noise filtering techniques to enhance the accuracy of predictions.
(3) The challenge of integrating data models with physical models remains, requiring a combination of both to improve the reliability of predictions.
(4) The predictive capabilities for groundwater dynamics are limited, necessitating further research into the patterns of karst development, refinement of warning water levels, and thresholds to enhance the accuracy and practicality of the early warning system.

5. Conclusions

This study analyzes borehole water level data from Xin’an Coal Mine, examining changes in hydrogeological conditions, water level fluctuations, and water inrush events. It characterizes the precursors and influencing factors of water inrush incidents associated with the Ordovician limestone aquifer beneath the mine’s goaf. Through comparative analysis of multiple models’ prediction and anomaly detection results, a BiLSTM-VAE coupled model is proposed. The spatial correlation between borehole water levels, formed by the cone of depression during water inrush, serves as the weighting factor. The model’s anomaly detection results are weighted, enabling accurate risk classification. The model demonstrates strong predictive performance.

(1) To address the limitations in accuracy and timeliness of existing methods (empirical thresholds, numerical simulations, and time-series forecasting), a new cascaded BiLSTM-VAE model is proposed for water inrush prediction.
(2) The model introduces spatial correlation of borehole water levels as weighting factors for comprehensive early warning and develops dynamic risk classification criteria, enabling real-time water inrush risk evaluation.
(3) Verification using data from Xin’an Mine shows that the weighted early warning method predicts water inrush 6 h earlier than traditional threshold methods, providing alerts 9 h and 30 min before the actual inrush event. These results extend the safety response window and confirm the reliability of this cascaded approach, demonstrating significant practical value for mines with limited monitoring equipment.

Author Contributions

Conceptualization, S.Y.; methodology, H.Y. and M.L.; Investigation, X.X. and B.X.; validation, M.L.; formal analysis, H.L. and J.W.; resources, S.Y.; data curation, M.L. and H.Y.; writing—original draft, M.L.; writing—review and editing, H.Y. and E.H.; visualization, M.L.; supervision, E.H. and H.L.; project administration, S.Y. and B.X.; funding acquisition, S.Y. and J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Key Research and Development Program of China (Grant Nos. 2024YFC3013802 and 2022YFC3005905-1) and the Khalifa University Faculty Start-Up (FSU) Fund 2023 (Grant No. 8474000603). Grant No. 2024YFC3013802 is associated with Shangxian Yin. Grant No. 2022YFC3005905-1 is associated with Huiqing Lian. The Khalifa University grant is associated with Jinsui Wu.

Data Availability Statement

The source codes are available for downloading at the link: https://github.com/liangmanyu/BiLstm-VAE-Anomaly-Weighted-Model-for-Risk-Graded-Mine-Water-Inrush-Early-Warning (accessed on 23 September 2025).

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.


Figure 1. Internal structure of BiLSTM-VAE.

Figure 2. Schematic diagram of the Heilongdong spring hydrogeological unit in the Xin’an Mine. Blue arrows indicate the groundwater runoff direction, red lines and arrows represent faults and their orientations, and the locations of exploratory boreholes are marked by asterisks.

Figure 3. Water level data of the borehole group over a one-year period before the water inrush accident. The yellow shaded area indicates the phase during which the water level of XO5 (green line) and KM (blue line) dropped rapidly, accompanied by a sharp increase in water discharge. The start of the yellow shaded area corresponds to the time when the water inrush was sighted, on 25 October 2021 at 03:35:00. The blue shaded area represents the subsequent stage when the water discharge ceased to increase and grouting engineering for water control was implemented.

Figure 4. Decomposed water level data of XO5. The red dashed line marks the time when the accident was sighted. The yellow shaded area represents the period of the accident.

Figure 5. Spearman correlation analysis results of each borehole data.

Figure 6. Training and Validation Loss Curves of BiLSTM Model for XO5 Borehole Water Level Prediction.

Figure 7. Performance comparison of univariate (1) and multivariate (2) models under different cell configurations, evaluated by mean absolute error (MAE) and coefficient of determination (R²).

Figure 8. The prediction results of XO5 borehole water levels. The solid blue line represents the actual water level value, the red line represents the predicted value of the water level, the dark green line represents the prediction error, and the blue dashed line represents the occurrence time of the water inrush event.

Figure 9. Comparison of Anomalies Defined by Different Thresholds and True Anomaly Events. True anomaly events are annotated as follows: the pink area represents the synchronized rapid rise in water levels in multiple boreholes during the rainy season (26–28 September 2021); the yellow area indicates the simultaneous rapid decline of water levels across multiple boreholes around the time of the water inrush incident (25 October 2021); the blue area depicts fluctuations caused by multiple grouting and water reduction operations starting from 3 November 2021.

Figure 10. ROC curves of the VAE model with various combinations of input variables. The curves labeled AUC1 to AUC6 are the ROC curves of Combinations 1 to 6 in Table 5, respectively.

Figure 11. Anomaly Detection Performance Diagram of Different Methods.

Figure 12. Results of Various Methods for Water Level XO5. The purple area represents an instance where the absolute rate of change is greater than the threshold; red circles represent anomalies detected by various methods.

Figure 13. Comprehensive Anomaly Detection Performance Diagram of Different Methods.

Figure 14. VAE anomaly detection results using Variable Combination 3 as input data. (a) KM borehole; (b) O2-4 borehole; (c) O2-6 borehole. The green curve shows the borehole water level. The red circles mark the anomalies identified by the VAE. The purple shaded area defines the true anomalies based on the threshold method.

Figure 15. The cone of depression forms due to rapid groundwater level decline near the water inrush point during water inrush events. The gray area represents the depression zone formation, while the red circle indicates the water inrush location.

Figure 16. Weighted anomaly synthesis results for each borehole. The purple area marks the true anomalies determined by applying the threshold method to the four boreholes combined. Detected anomalies are divided into risk levels by circle color: green indicates a low-risk warning, yellow a medium-risk warning, and red a high-risk warning. Times without a marker are normal.

Figure 17. Specific early warning results of the water inrush event. The white area indicates the period before the water inrush. The green area represents the period after the inrush, with its starting point marking the actual incident time (03:35, October 25). The purple marker denotes the first warning time identified by the threshold method (00:05, October 25). A yellow circle indicates a medium-risk warning (17:00, October 24), and a red circle indicates a high-risk warning (18:05, October 24).
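The lead times implied by the timestamps in Figure 17 can be checked with a few lines of arithmetic. The sketch below assumes the year 2021 (taken from the Figure 3 caption); the dictionary keys are illustrative labels, not names used by the authors.

```python
from datetime import datetime

# Timestamps from the Figure 17 caption; the year 2021 comes from Figure 3.
incident = datetime(2021, 10, 25, 3, 35)
warnings = {
    "threshold method": datetime(2021, 10, 25, 0, 5),
    "medium-risk": datetime(2021, 10, 24, 17, 0),
    "high-risk": datetime(2021, 10, 24, 18, 5),
}

# Lead time = incident time minus first warning time.
lead_times = {name: incident - t for name, t in warnings.items()}

for name, lead in lead_times.items():
    hours, rem = divmod(int(lead.total_seconds()), 3600)
    print(f"{name}: {hours} h {rem // 60} min before the inrush")

# Gain of the high-risk warning over the threshold method.
gain = lead_times["high-risk"] - lead_times["threshold method"]
print(f"gain over threshold method: {gain.total_seconds() / 3600:.1f} h")
```

The high-risk warning precedes the incident by 9 h 30 min and the threshold method by 3 h 30 min, which is the roughly 6 h difference stated in the abstract.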

Figure 18. Autocorrelation, partial autocorrelation, and first-order difference of the XO5 data.

Figure 19. Early warning results for water inrush based on ARIMA-VAE and GRU-VAE models. (a) ARIMA-VAE model results; (b) GRU-VAE model results; (c) specific early warning results of the water inrush event from the ARIMA-VAE model; (d) specific early warning results of the water inrush event from the GRU-VAE model.

Table 1. Variance inflation factor (VIF) analysis results for each borehole's data.

	XO5 VIF		O2-4 VIF		O2-6 VIF		KM VIF
O2-4	502.935	XO5	181.393	XO5	181.052	XO5	21.068
O2-6	501.157	O2-6	36.468	O2-4	36.528	O2-4	501.359
D19	2.096	D19	3.796	D19	3.805	O2-6	498.448
KM	31.910	KM	273.880	KM	272.743	D19	2.545
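For readers unfamiliar with the variance inflation factor, the following is a minimal sketch of the standard definition, VIF_j = 1 / (1 − R_j²), where R_j² comes from regressing predictor j on the remaining predictors. The function name `vif` and the synthetic data are assumptions for illustration; they are not the authors' code or the borehole measurements.

```python
import numpy as np

def vif(X: np.ndarray) -> np.ndarray:
    """Return the VIF of each column of X (rows = samples, columns = predictors)."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept + other predictors
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

# Synthetic stand-in data: two nearly identical series and one independent series.
rng = np.random.default_rng(0)
base = rng.normal(size=500)
X = np.column_stack([
    base + 0.05 * rng.normal(size=500),
    base + 0.05 * rng.normal(size=500),
    rng.normal(size=500),
])
vif_values = vif(X)
print(vif_values)
```

The two near-duplicate columns receive large VIF values while the independent column stays close to 1, the same pattern as the strongly collinear O2-4 and O2-6 entries in Table 1.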

Table 2. Statistical metrics to evaluate the proposed methods.

Indicator	Formula
Coefficient of determination	$R^{2} = 1 - \frac{\sum_{i=1}^{n} (\mathrm{Actual}_{i} - \mathrm{Predicted}_{i})^{2}}{\sum_{i=1}^{n} (\mathrm{Actual}_{i} - \mathrm{Actual}_{avg})^{2}}$
Root mean squared error	$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\mathrm{Actual}_{i} - \mathrm{Predicted}_{i})^{2}}$
Mean absolute error	$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| \mathrm{Actual}_{i} - \mathrm{Predicted}_{i} \right|$
Mean absolute relative error	$MARE = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{\mathrm{Actual}_{i} - \mathrm{Predicted}_{i}}{\mathrm{Actual}_{i}} \right|$
Mean square relative error	$MSRE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\mathrm{Actual}_{i} - \mathrm{Predicted}_{i}}{\mathrm{Actual}_{i}} \right|^{2}$
Mean bias error	$MBE = \frac{1}{n} \sum_{i=1}^{n} (\mathrm{Actual}_{i} - \mathrm{Predicted}_{i})$
Maximum absolute relative error	$MaxARE = \max_{i} \left| \frac{\mathrm{Actual}_{i} - \mathrm{Predicted}_{i}}{\mathrm{Actual}_{i}} \right|$
Standard deviation	$SD = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (X_{i} - \bar{X})^{2}}$, where $\bar{X}$ is the arithmetic mean of $X$ and $n$ is the total number of values
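The indicators in Table 2 can be sketched in a few lines of NumPy. This is an illustrative implementation, not the authors' code: RMSE is computed with the square root, as its name implies, and SD is taken as the standard deviation of the prediction errors, which is an assumption because the table does not state what $X_i$ refers to. The water level values are hypothetical.

```python
import numpy as np

def regression_metrics(actual, predicted):
    """Compute the Table 2 indicators for one series of predictions."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    err = a - p
    rel = np.abs(err / a)  # assumes no actual value is zero
    return {
        "R2": 1.0 - np.sum(err**2) / np.sum((a - a.mean()) ** 2),
        "RMSE": np.sqrt(np.mean(err**2)),
        "MAE": np.mean(np.abs(err)),
        "MARE": 100.0 * np.mean(rel),  # expressed in percent
        "MSRE": np.mean(rel**2),
        "MBE": np.mean(err),
        "MaxARE": np.max(rel),
        "SD": np.std(err),  # assumption: SD of the prediction errors
    }

# Hypothetical water levels (m), for illustration only.
actual = [135.2, 135.0, 134.7, 134.1]
predicted = [135.1, 135.1, 134.6, 134.3]
metrics = regression_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.6g}")
```

With SD defined on the errors, the identity RMSE² = SD² + MBE² holds, which gives a quick consistency check on any reported row.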

Table 3. Performance comparison of the prediction submodels with 18 cells for each borehole on the water level dataset.

Borehole	Submodel	R²	MSE	MAE	MARE	MSRE	MBE	erMAX	SD
O2-4	BiLSTM	0.980	0.025	0.019	0.140 × 10⁻³	0.338 × 10⁻⁷	0.015	0.002	0.02
	LSTM	0.970	0.039	0.035	0.258 × 10⁻³	0.811 × 10⁻⁷	0.034	0.002	0.017
	GRU	0.972	0.130	0.104	0.775 × 10⁻⁴	0.10 × 10⁻⁵	−0.06	0.003	0.115
	RNN	0.519	0.539	0.428	0.003	0.160 × 10⁻⁴	0.226	0.01	0.49
O2-6	BiLSTM	0.978	0.034	0.029	0.218 × 10⁻³	0.647 × 10⁻⁷	0.028	0.002	0.02
	LSTM	0.995	0.054	0.045	0.337 × 10⁻³	0.162 × 10⁻⁶	−0.008	0.002	0.054
	GRU	0.896	0.251	0.199	0.001	0.400 × 10⁻⁵	−0.136	0.005	0.211
	RNN	0.762	0.380	0.372	0.275 × 10⁻²	0.800 × 10⁻⁵	0.372	0.441 × 10⁻²	0.074
XO5	BiLSTM	0.94	0.348	0.265	0.002	0.600 × 10⁻⁵	0.262	0.005	0.228
	LSTM	0.927	0.382	0.357	0.251 × 10⁻²	0.700 × 10⁻⁵	0.120	0.545 × 10⁻²	0.364
	GRU	0.739	0.725	0.62	0.437 × 10⁻²	0.260 × 10⁻⁴	0.312	0.118 × 10⁻¹	0.655
	RNN	0.813	0.614	0.536	0.376 × 10⁻²	0.186 × 10⁻⁴	−0.003	0.008	0.614
KM	BiLSTM	0.953	0.274	0.208	0.001	0.376 × 10⁻⁵	−0.178	0.005	0.209
	LSTM	0.965	0.237	0.189	0.001	0.281 × 10⁻⁵	−0.138	0.004	0.193
	GRU	0.945	0.298	0.247	0.002	0.436 × 10⁻⁵	0.160	0.004	0.251
	RNN	0.569	0.835	0.725	0.005	0.343 × 10⁻⁴	0.725	0.010	0.415

Table 4. Anomaly detection metrics across different threshold values.

Threshold	0.1	0.15	0.2	0.25	0.3	0.35	0.4
Recall	0.961	0.9134	0.8734	0.8463	0.8398	0.7619	0.4502
Precision	0.2268	0.3937	0.5147	0.7931	0.9923	0.9915	0.9858
F1-Score	0.3669	0.5502	0.6477	0.8188	0.9097	0.8617	0.6181
False Positive Rate (FPR)	0.2457	0.1055	0.0617	0.0166	0.0005	0.0005	0.0005
Specificity	0.7543	0.8945	0.9383	0.9834	0.9995	0.9995	0.9995
Matthews Correlation Coefficient (MCC)	0.3993	0.5588	0.64	0.8053	0.9072	0.8612	0.6523
True Positives (TP)	888	844	807	782	776	704	416
False Positives (FP)	3028	1300	761	204	6	6	6
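The metrics in Table 4 all follow from the four confusion-matrix counts. The sketch below uses the TP and FP values of the 0.3 threshold column; the FN and TN counts are not reported in the table and are assumptions here, inferred approximately from the reported recall and false positive rate.

```python
import math

def detection_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Derive the Table 4 metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "fpr": fpr,
        "specificity": 1.0 - fpr,
        "mcc": mcc,
    }

# TP and FP are from the 0.3 threshold column of Table 4.
# FN = 148 and TN = 12318 are inferred values (assumptions), not reported data.
m = detection_metrics(tp=776, fp=6, fn=148, tn=12318)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Under these inferred counts the computed values agree with the 0.3 column to the reported precision, which suggests the table is internally consistent.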

Table 5. Input variable combination table.

	Combination 1	Combination 2	Combination 3	Combination 4	Combination 5	Combination 6
Input 1	y₁	y₁	y₁	y₁	y₁	y₁
Input 2	y₂	y₂	y₂	y₂	y₂	y₂
Input 3	AE	AE	AE	AE	/	/
Input 4	Δy₁	Δy₁	Δy₁	Δy₁	Δy₁	/
Input 5	/	Δy₂	Δy₂	/	Δy₂	/
Input 6	/	/	ΔAE	ΔAE	/	ΔAE

Table 6. Weight table corresponding to boreholes.

Borehole Name	O2-4	O2-6	XO5	KM
Weight value	0.104	0.0879	0.564	0.244
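A minimal sketch of how the Table 6 weights could combine per-borehole detections into one comprehensive score is given below. It assumes each borehole contributes a binary anomaly flag (1 = anomaly detected by the VAE, 0 = normal); the function names and the risk cutoffs of 0.3 and 0.6 are hypothetical and are not taken from the paper.

```python
# Weights from Table 6.
WEIGHTS = {"O2-4": 0.104, "O2-6": 0.0879, "XO5": 0.564, "KM": 0.244}

def weighted_anomaly_score(flags: dict) -> float:
    """Sum the weights of the boreholes currently flagged as anomalous."""
    return sum(WEIGHTS[name] * flags.get(name, 0) for name in WEIGHTS)

def risk_level(score: float, medium: float = 0.3, high: float = 0.6) -> str:
    """Map a score to a risk grade. The cutoffs are hypothetical placeholders."""
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    if score > 0:
        return "low"
    return "normal"

# Example time step: anomalies flagged at XO5 and KM only.
flags = {"O2-4": 0, "O2-6": 0, "XO5": 1, "KM": 1}
score = weighted_anomaly_score(flags)
print(f"score = {score:.3f}, risk = {risk_level(score)}")
```

Because XO5 carries more than half of the total weight, an anomaly at XO5 alone already dominates the score, while anomalies confined to O2-4 and O2-6 contribute less than 0.2 combined.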

Table 7. Performance comparison of BiLSTM, GRU, SARIMA and HWES as prediction models for each borehole.

Submodel	Borehole	R²	MSE	MAE	MARE	MSRE	MBE	erMAX	SD
BiLSTM	O2-4	0.980	0.025	0.019	0.140 × 10⁻³	0.338 × 10⁻⁷	0.015	0.002	0.02
	O2-6	0.978	0.034	0.029	0.218 × 10⁻³	0.647 × 10⁻⁷	0.028	0.002	0.02
	XO5	0.94	0.348	0.265	0.002	0.600 × 10⁻⁵	0.262	0.005	0.228
	KM	0.953	0.274	0.208	0.001	0.376 × 10⁻⁵	−0.178	0.005	0.209
GRU	O2-4	0.972	0.130	0.104	0.775 × 10⁻⁴	0.10 × 10⁻⁵	−0.06	0.003	0.115
	O2-6	0.896	0.251	0.199	0.001	0.400 × 10⁻⁵	−0.136	0.005	0.211
	XO5	0.739	0.725	0.62	0.437 × 10⁻²	0.260 × 10⁻⁴	0.312	0.118 × 10⁻¹	0.655
	KM	0.945	0.298	0.247	0.002	0.436 × 10⁻⁵	0.160	0.004	0.251
SARIMA	O2-4	0.897	0.112	0.246	0.002	0.215 × 10⁻⁵	0.107	0.006	0.317
	O2-6	0.817	0.198	0.313	0.002	0.582 × 10⁻⁵	0.069	0.012	0.441
	XO5	0.828	0.943	0.606	0.004	0.291 × 10⁻⁵	0.004	0.017	0.974
	KM	0.815	0.926	0.634	0.005	0.1210 × 10⁻⁴	0.089	0.020	0.961
HWES	O2-4	−3.685	5.082	1.850	1.367	0.027	−1.846	2.554	1.293
	O2-6	−2.284	3.561	1.532	1.133	0.019	−1.490	2.230	1.157
	XO5	−3.543	24.996	4.191	2.935	0.122	−4.188	5.096	2.730
	KM	0.186	4.086	1.627	1.148	0.020	−1.035	2.524	1.736

Table 8. Performance Comparison of Hybrid Models for Coal Mine Water Inrush.

Evaluation Indicators	Recall	Precision	F1-Score	False Positive Rate (FPR)	Specificity	Matthews Correlation Coefficient (MCC)	True Positives (TP)	False Positives (FP)
BiLSTM-VAE	0.846	0.958	0.898	0.005	0.995	0.887	1786	79
ARIMA-VAE	0.527	0.439	0.479	0.092	0.908	0.403	1113	1420
GRU-VAE	0.605	0.633	0.618	0.048	0.952	0.568	1277	741
VAE	0.034	0.032	0.033	0.051	0.949	−0.017	28	850

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Liang, M.; Yao, H.; Yin, S.; Hou, E.; Lian, H.; Xia, X.; Wu, J.; Xu, B. BiLSTM-VAE Anomaly Weighted Model for Risk-Graded Mine Water Inrush Early Warning. Appl. Sci. 2025, 15, 10394. https://doi.org/10.3390/app151910394

