Article

Risk Prediction of Water Inrush in Diversion Tunnel Crossing Water-Rich Fault Based on NRBO-XGBoost Algorithm

1 School of Civil Engineering & Hunan Provincial Key Laboratory of Geotechnical Engineering for Stability Control and Health Monitoring, Hunan University of Science and Technology, Xiangtan 411201, China
2 Hunan University of Science and Technology Engineering Testing Co., Ltd., Xiangtan 411201, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(8), 3831; https://doi.org/10.3390/app16083831
Submission received: 25 March 2026 / Revised: 9 April 2026 / Accepted: 13 April 2026 / Published: 15 April 2026
(This article belongs to the Section Civil Engineering)

Abstract

Water inrush occurs readily during the construction of diversion tunnels that cross water-rich faults, and large-scale inrushes pose a serious threat to construction personnel and machinery. Accurate prediction of water inrush risk is therefore essential for the construction safety of diversion tunnels. To reduce the occurrence of water inrush disasters, this paper establishes a water inrush risk prediction model for diversion tunnels based on the NRBO-XGBoost algorithm, making full use of available engineering data. Nine indicators covering engineering geological conditions, hydrogeological conditions, and tunnel construction conditions were selected to establish a prediction indicator system for the water inrush risk of tunnels crossing water-rich faults. The model was trained and tested using 120 valid samples collected from the Longjinxi diversion tunnel, enabling fast and accurate water inrush risk prediction during construction. Its predictive performance was compared with that of BPNN and the standard XGBoost model. The R² and MAE of the proposed method are 0.9129 and 0.0667, respectively, both better than those of the other methods, which confirms the reliability and effectiveness of the proposed model.

1. Introduction

Fault zones are common adverse geological conditions during the construction of underground projects such as water diversion tunnels. The rock mass inside the fault zone has poor lithology and a fragmented structure, and the degree of rock mass fracture development is high [1]. Tunnel construction through a water-rich fault is bound to damage the underground rock structure, change the groundwater storage conditions and migration network, and cause a sudden change in the balance between the groundwater dynamic conditions and the rock mass [2]. Large-scale water inrush events in tunnels are typically characterized by rapid and sudden energy release. The intrusion of high-pressure groundwater poses significant risks to construction personnel and equipment, thereby constituting a major geological hazard during tunnel excavation in water-rich regions [3].
To achieve accurate prediction of water inrush risks in tunnels, various methodologies have been adopted for risk assessment, including fuzzy comprehensive evaluation [4,5], catastrophe theory [6,7], and cloud models [8,9], all of which have yielded satisfactory predictive outcomes. Wang et al. [10] proposed a risk prediction model based on the intuitionistic fuzzy method and obtained the subjective and objective weights of the influencing factors. Luo et al. [11] identified the source of water inrush based on hydrological, hydrochemical, and isotopic methods, and applied a lumped rainfall-runoff model to predict water inrush. Li et al. [12] employed the tunnel-induced polarization method to probe the 3D spatial location and distribution of water-rich areas, and carried out numerical simulations to summarize the corresponding velocity, pressure, and flow characteristics for each probing line after water inrush. These methods have achieved water inrush prediction to a certain extent; however, they are highly subjective and may have deviations.
With the development of artificial intelligence algorithms, neural networks [13,14], support vector machines [15], and intelligent optimization algorithms [16,17] have been applied in the risk assessment and prediction of tunnel engineering. In terms of water inrush disasters in tunnels, Arsalan et al. [18] used a database that included 600 datasets of water inflow and six machine learning methods to predict tunnel water inflow. Zhang et al. [19] applied the random forest algorithm to predict the hazard level of water inrush and validated the effectiveness of the model in actual engineering. In a similar vein, Xu et al. [20] adopted a Weighted Bayesian Network model to evaluate the probability of water inrush across different risk grade ranges. However, in complex engineering geological conditions, the accuracy of the existing prediction models is still insufficient, and there is a lack of interpretability analysis of the causal relationship between input parameters and output results, which limits their application in engineering decision-making.
With the gradual enrichment of measurement data, the prediction of water inrush risks has become more efficient, rapid, and accurate, and therefore better suited to engineering needs. Among the various artificial intelligence algorithms, the optimized eXtreme Gradient Boosting (XGBoost) offers superior computational efficiency and accuracy, and has been applied with good results in intelligent computing for tunnel engineering. Wu et al. [21] established real-time prediction models by comparing six different algorithms, and found that the XGBRF algorithm has superior prediction and generalization performance. Nguyen et al. [22] proposed an XGBoost model to predict the racking ratio of rectangular tunnels under seismic loading. However, fine-tuning of parameters (such as the maximum number of iterations, the depth of the tree, and the learning rate) is necessary to achieve optimal performance, and the computational cost of this tuning is high. The Newton-Raphson-based optimizer (NRBO) is a meta-heuristic algorithm that can explore the solution space precisely, especially when fine-tuning near the optimal solution. It has the advantages of fast convergence and high accuracy, and can effectively avoid local optima. Many scholars have used the NRBO algorithm to optimize the performance of the XGBoost algorithm and have achieved favorable results [23,24].
In this study, the NRBO-XGBoost algorithm was developed to predict water inrush risks in diversion tunnels intersecting water-rich faults. A comprehensive prediction system was established by incorporating nine evaluation indices, with particular emphasis on engineering geological conditions, hydrogeological conditions, and tunnel construction factors, based on extensive engineering data. The proposed model has been successfully applied to assess water inrush risk in such tunnel scenarios, thereby providing an effective tool for risk evaluation and a valuable reference for similar engineering projects.

2. NRBO-XGBoost Algorithm

2.1. Basic Principles of XGBoost Algorithm

XGBoost is an efficient algorithm toolkit that improves the performance and generalization ability of a model by optimizing the objective function. The algorithm, with high performance, high accuracy and interpretability [25], has been widely applied in the intelligent analysis and prediction of tunnel engineering.
The objective function is composed of a loss function and a regularization term. The loss function evaluates the difference between the actual and predicted results of the model, and the regularization term controls the complexity of the model and prevents overfitting. The objective function L is defined as follows [26]:
$$L = \sum_{i=1}^{I} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k) \tag{1}$$
$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in F \tag{2}$$
where $y_i$ and $\hat{y}_i$ are the true and estimated values of the i-th data, respectively; $f_k$ is the function of the k-th decision tree; $I$ is the total number of data samples; $F$ is the set of regression trees; $K$ is the total number of decision trees; and $\Omega(f_k)$ represents the complexity of the k-th tree. The specific expression is as follows:
$$\Omega(f_k) = \gamma J + \frac{1}{2} \lambda \sum_{j=1}^{J} w_j^2 \tag{3}$$
where $J$ is the total number of leaves, $w_j^2$ is the square of the weight of the j-th leaf, and $\gamma$ and $\lambda$ are the penalty coefficients.
In the XGBoost algorithm, the correlation between decision trees is manifested in that the newly generated decision tree takes the prediction error of the previous tree as the reference value and regards the sum of the complexities of the previous trees as a constant C. When k = t, the prediction result of the data sample xi is as follows [27]:
$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i) \tag{4}$$
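The objective function then becomes:
$$L^{(t)} = \sum_{i=1}^{I} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t) + C \tag{5}$$
To minimize the objective function, a second-order Taylor expansion is performed using Equation (4), and the simplified regularization term is incorporated into the function:
$$L^{(t)} \approx \sum_{i=1}^{I} \left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t) + C \tag{6}$$
$$g_i = \partial_{\hat{y}^{(t-1)}}\, l\left(y_i, \hat{y}_i^{(t-1)}\right), \quad h_i = \partial^2_{\hat{y}^{(t-1)}}\, l\left(y_i, \hat{y}_i^{(t-1)}\right)$$
where $g_i$ and $h_i$ are the first and second derivatives of the loss function with respect to the previous prediction, respectively. The zeroth-order term $l(y_i, \hat{y}_i^{(t-1)})$ does not depend on $f_t$ and is absorbed into the constant $C$.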
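The samples of the j-th leaf node are defined as $I_j = \{i \mid q(x_i) = j\}$, where $q$ maps a sample to its leaf index. Based on Equations (3) and (6), and omitting the constant $C$, the objective function can be expressed as follows:
$$L^{(t)} = \sum_{j=1}^{J} \left[ \left( \sum_{i \in I_j} g_i \right) w_j + \frac{1}{2} \left( \sum_{i \in I_j} h_i + \lambda \right) w_j^2 \right] + \gamma J \tag{7}$$
To obtain an optimal solution, Equation (7) is differentiated with respect to $w_j$ and the derivative is set to zero. The extreme value of the objective function is then:
$$L^{*} = -\frac{1}{2} \sum_{j=1}^{J} \frac{\left( \sum_{i \in I_j} g_i \right)^2}{\sum_{i \in I_j} h_i + \lambda} + \gamma J \tag{8}$$
Equation (8) represents the minimum objective function. A lower objective value indicates a more stable model with better prediction results.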
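As a small numerical illustration of Equations (7) and (8): setting the derivative of Equation (7) with respect to the leaf weight to zero gives the optimal weight w_j* = −G_j / (H_j + λ), where G_j and H_j are the sums of g_i and h_i over the samples in leaf j. The sketch below assumes a squared-error loss l = ½(y − ŷ)², for which g_i = ŷ_i − y_i and h_i = 1. The loss choice, function names, and sample values are illustrative assumptions and are not taken from the paper.

```python
def squared_error_grads(y_true, y_prev):
    """g_i and h_i for the assumed loss l = 0.5 * (y - y_hat)**2."""
    g = [p - t for t, p in zip(y_true, y_prev)]
    h = [1.0 for _ in y_true]
    return g, h


def leaf_weight(g, h, lam):
    """Optimal leaf weight w* = -G / (H + lambda) for one leaf."""
    return -sum(g) / (sum(h) + lam)


def structure_score(leaves, lam, gamma):
    """Minimum objective of Equation (8); leaves is a list of (g, h) pairs."""
    total = 0.0
    for g, h in leaves:
        total += sum(g) ** 2 / (sum(h) + lam)
    return -0.5 * total + gamma * len(leaves)


# Hypothetical data: four samples, previous prediction 2.5 for all of them.
y_true = [1.0, 2.0, 3.0, 4.0]
y_prev = [2.5, 2.5, 2.5, 2.5]
g, h = squared_error_grads(y_true, y_prev)

# Hypothetical tree with two leaves: samples 0-1 on the left, 2-3 on the right.
left = (g[:2], h[:2])
right = (g[2:], h[2:])
lam, gamma = 1.0, 0.1

w_left = leaf_weight(left[0], left[1], lam)
w_right = leaf_weight(right[0], right[1], lam)
score = structure_score([left, right], lam, gamma)
```

With these numbers the left leaf is pushed down and the right leaf up by the same amount, and increasing λ shrinks both weights toward zero, which is the regularizing effect described above.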

2.2. NRBO Optimizes the XGBoost Algorithm

The XGBoost algorithm, as an advanced machine-learning technique, is suitable for handling small datasets with nonlinear relationships and has a good ability to process sparse data. However, the multiple parameters of XGBoost must be carefully selected and adjusted. Inappropriate parameters can easily lead to overfitting, which affects the performance of the model on prediction data.
The NRBO algorithm is a meta-heuristic inspired by the Newton-Raphson method, a fast iterative technique for determining the zero point or minimum value of a function in which the search direction is determined by constructing a quadratic approximation model of the function. Two rules are used throughout the search process, namely the Newton-Raphson Search Rule (NRSR) and the Trap Avoidance Operator (TAO), and the best search results are explored further through several matrices. The new position expression is [28]:
$$x_{n+1} = x_n - randn \times \frac{(X_w - X_b) \times \Delta x}{2 \times (X_w + X_b - 2 \times x_n)} \tag{9}$$
where $x_{n+1}$ is the next position; $X_b$ is a better position in the neighborhood of $x_n$; $X_w$ is a worse position in the neighborhood of $x_n$; $randn$ is a random number from the standard normal distribution; and $\Delta x$ is the disturbance quantity.
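To make the role of each symbol concrete, the sketch below evaluates one search step per dimension in the form x_{n+1} = x_n − randn · (X_w − X_b) · Δx / (2 · (X_w + X_b − 2 · x_n)), which is how the Newton-Raphson search rule is commonly written. It is a simplified illustration, not the full NRBO update: the function name, the optional fixed noise value `z`, the `eps` guard against a zero denominator, and the sample numbers are all assumptions made here.

```python
import random


def nrsr_step(x_n, x_b, x_w, dx, z=None, eps=1e-12):
    """One per-dimension step: x_n - z * (x_w - x_b) * dx / (2 * (x_w + x_b - 2 * x_n)).

    z is the standard-normal draw; pass a fixed value for a reproducible demo,
    or leave it as None to sample a fresh value for each dimension.
    """
    new_position = []
    for xn, xb, xw, d in zip(x_n, x_b, x_w, dx):
        noise = random.gauss(0.0, 1.0) if z is None else z
        denom = 2.0 * (xw + xb - 2.0 * xn)
        if abs(denom) < eps:
            denom = eps  # assumed guard; not part of the formula itself
        new_position.append(xn - noise * (xw - xb) * d / denom)
    return new_position


# Hypothetical two-dimensional example with the noise fixed at 1.0.
step = nrsr_step(
    x_n=[1.0, 0.0],
    x_b=[2.0, 1.0],
    x_w=[4.0, 3.0],
    dx=[0.5, 0.5],
    z=1.0,
)
```

The step size scales with the gap between the worse and better neighbors and with the disturbance Δx, so the move shrinks as the neighboring positions converge.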
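When the search is detected to have fallen into a local optimum, TAO addresses this by combining the best position with the current position to find a better solution:
$$X_n^{IT+1} = X_{TAO}^{IT}, \quad rand < DF$$
$$X_{TAO}^{IT} = X_n^{IT} + \theta_1 \left( \mu_1 X_b - \mu_2 X_n^{IT} \right) + \theta_2 \delta \left( \mu_1\, Mean(X^{IT}) - \mu_2 X_n^{IT} \right) \tag{10}$$
where $\theta_1$ and $\theta_2$ are random numbers; $Mean()$ is the mean function; $\delta$ is the adaptive parameter; $\mu_1$ and $\mu_2$ are random numbers used to control diversity; $rand$ is a uniformly distributed random number; and $DF$ is the deciding factor that determines when TAO is applied.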
The NRBO algorithm can be used to determine the optimal parameters based on iterative optimization, which optimizes the performance of the XGBoost algorithm. The specific process of NRBO for optimizing the XGBoost algorithm is shown in Figure 1, and the calculation steps are as follows.
(1) Data normalization
Assume that there are M groups of data {(Xm, Ym) | m = 1, 2, …, M}, where Xm is the input data containing n indicator factors, that is, Xm = {Xm1, Xm2, …, Xmn}, and Ym is the output data. To facilitate the construction of the prediction model and improve the prediction accuracy, the dataset is normalized:
$$\dot{y}_i = \frac{y_i - y_{\min}}{y_{\max} - y_{\min}} \tag{11}$$
where $\dot{y}_i$ is the normalized data value; $y_i$ is the true data value; and $y_{\max}$ and $y_{\min}$ are the maximum and minimum values of the data, respectively.
(2) Data-set training
The normalized dataset was divided into a training set and a test set. The training set was input into the model for training, and the prediction accuracy of the model was then calculated.
(3) Parameter optimization
In the XGBoost algorithm, the three parameters that have the greatest impact on the model output are the number of iterations (num_trees), the maximum depth of the tree (max_depth), and the learning rate (eta). Candidate values of these parameters are generated by the NRBO algorithm and iteratively substituted into step (2) for calculation. The values from the iteration with the smallest test error are taken as the optimal parameters.
(4) Test effects
The optimized NRBO-XGBoost model was employed for prediction, with model performance evaluated using the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE). A value of R² closer to 1 indicates a better fit of the model to the data. RMSE reflects the overall prediction error and penalizes large deviations more heavily, with smaller values representing lower deviation. MAE is the average magnitude of the prediction errors, expressed in the same units as the output, and is less sensitive to individual large errors than RMSE. A smaller MAE signifies improved accuracy.
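Steps (1), (2), and (4) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function names, the fixed random seed, the handling of a constant column, and the toy numbers are all hypothetical, and the 75% training fraction simply mirrors the split reported in Section 4.3.

```python
import math
import random


def min_max_normalize(values):
    """Scale values to [0, 1] using (y - y_min) / (y_max - y_min)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: return zeros (assumed convention)
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def train_test_split(samples, train_fraction=0.75, seed=0):
    """Shuffle a copy of the samples and cut it into training and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy values (hypothetical): one indicator column and four risk grades.
normalized = min_max_normalize([10.0, 15.0, 20.0])
train, test = train_test_split(range(120))
y_true = [1, 2, 3, 4]
y_pred = [1, 2, 3, 3]
scores = (r2_score(y_true, y_pred), rmse(y_true, y_pred), mae(y_true, y_pred))
```

In the toy example a single one-grade error out of four samples gives an MAE of 0.25 and an RMSE of 0.5, which shows how RMSE weights the same error more heavily than MAE.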

3. Prediction Model of Water Inrush Risk

3.1. Levels of Water Inrush Risk

Faults are common adverse geological conditions in tunnel strata that are affected by factors such as the nature of the original rock and tectonic movements, and the scale, mechanical properties, and water richness of faults vary significantly. The construction of tunnels passing through water-rich fault zones involves many uncertainties and is highly prone to water-inrush disasters. To effectively guide the construction of water diversion tunnels through fault zones and achieve the goals of preventing water inrush disasters and reducing construction risks, the water inrush risk levels of water diversion tunnels are classified from low to high as Grade I to IV.
The water inrush risk is classified into four distinct grades based on severity. Grade I signifies no water inrush risk, allowing tunnel construction to proceed in accordance with the original design and construction plan. Grade II is characterized by seepage or linear water flow at the tunnel face, indicating a relatively low risk. Under such conditions, construction may generally continue, though the frequency of monitoring and measurement should be increased. Grade III involves stream-like water gushing at the tunnel face, reflecting a relatively high risk level. Mitigation measures, such as grouting, water blocking, and real-time monitoring, are required during construction. Grade IV represents an extremely high risk level, with the potential for large-scale water gushing into the tunnel. Emergency measures must be implemented immediately to ensure tunnel safety.

3.2. Prediction Indicators of Water Inrush Risk

Water diversion tunnels passing through water-rich fault zones are highly prone to water inrush disasters. Owing to construction disturbances, the movement and storage conditions of groundwater can change, resulting in water inrush. Numerous factors influence water inrush disasters in water diversion tunnels, including engineering geology, hydrogeology, and tunnel construction. The following requirements should be considered when the assessment indices are selected. Each indicator should fully reflect a characteristic of the evaluated object, and should correspond directly to that characteristic. The indicators should also be quantifiable, either directly or through established methods.
Therefore, based on existing literature and case studies [29], and considering the independence and ease of acquisition of each indicator in engineering practice, the following risk levels and indicator system are determined.
(1) Engineering geological indicators
The lithology of the strata on both sides of a fault (I1) governs the properties of the fault zone, including the characteristics of filling materials and the degree of fragmentation. Ma et al. [30] established a quantitative classification standard for such lithological characteristics. The degree of rock mass fragmentation (I2), represented by the integrity index, serves as a critical indicator for groundwater storage and flow potential. A lower integrity index corresponds to poorer rock mass integrity, enhanced permeability, and consequently, a heightened risk of water inrush. Furthermore, both the width (I3) and dip angle (I4) of the fault are essential parameters that reflect its scale and structural features, respectively. These geometrical attributes significantly influence the water inrush risk for tunnels crossing the fault zone.
(2) Hydrogeological indicators
The storage conditions of groundwater in the strata form the material basis for water inrush disasters during tunnel construction. The water richness and recharge degree within the fault zone are key factors that determine the degree of water inrush disasters. The higher the groundwater level (I5), the greater the water-richness of the fault zone, and the higher the risk of water inrush. The water level difference, which is the difference in elevation between the groundwater level and the tunnel floor, is usually used as an indicator of water richness. The water collection capacity (I6) includes the surface water collection areas of the underground fault system, such as depressions, funnels, sinkholes, valleys, groundwater infiltration recharge, and runoff conditions. The catchment area of a tunnel site is typically used to determine the water collection capacity [31]. Water pressure monitoring during tunnel construction is relatively simple; thus, groundwater pressure was selected as a predictive parameter reflecting groundwater recharge (I7), which is conducive to the engineering application of water inrush risk prediction.
(3) Tunnel construction conditions
Disturbance of rock strata and water-bearing structures during tunnel excavation is a direct cause of water inrush. The cross-sectional area of the tunnel (I8) affects the geometric and mechanical states of the surrounding rock. The larger the actual excavation area, the greater the water inrush volume in the tunnel and the more secondary disasters it causes. Advanced geological prediction (I9) can identify the location and water-bearing conditions of faults during tunnel construction, and its prediction accuracy is used as an important characterization parameter.
The water inrush risk levels and prediction indicators for water diversion tunnels passing through fault zones are quantified as shown in Table 1.

4. Engineering Application

4.1. Project Overview

The water diversion tunnel in Longjinxi is an important component of the Pingtan and Minjiang Estuary water resource allocation project in Fujian Province. The tunnel is 13.8 km long and is generally buried at a depth of 70–180 m. The exposed strata along the tunnel mainly consist of coarse granites and granodiorites from the Yanshanian period, as well as quartz fine sandstones and argillaceous siltstones from the Triassic period. Owing to intense regional geological activity, the water diversion tunnel is fractured with numerous faults throughout its length. Long faults cut through the strata from deep underground to the surface, forming gullies and streams on the surface, which results in rich groundwater around fault zones and their affected areas.
There are 21 faults developed around the tunnel site area, with widths ranging from 3 to 20 m. The faults are filled with crushed and fragmented rocks and breccias, and locally contain fault gouges. The rock mass within the fault zone is highly fragmented. The geological conditions and fault distribution in the site area of the water diversion tunnel are shown in Figure 2 and Figure 3. During tunnel construction through the fault zones, continuous large-scale rain-like seepage appeared in multiple sections, and large-scale sudden water inrush disasters occurred in some areas.

4.2. Dataset for Water Inrush Risk

Based on the risk quantification indicators established in Table 1, a dataset of water inrush risk indicators for the main and branch tunnels of the Longjinxi water diversion tunnel crossing the faults was collected. After eliminating samples with incomplete data and those containing obvious errors, a total of 120 samples were retained from the on-site data and used for model training and testing. This dataset includes nine risk indicator parameters and one risk-level parameter. According to the water inrush risk data collected on-site, the characteristic values of each indicator in the dataset were calculated, as shown in Table 2, and the violin distribution of each indicator parameter was plotted as shown in Figure 4.
The violin plot combines the kernel density plot and the box plot, and can effectively present the distribution characteristics of the data. As shown in Figure 4, the shape of each violin represents how the data are distributed along the vertical axis, with a box plot drawn inside the violin. Because of the limited sample size, the data are not perfectly evenly distributed, but in general each indicator is reasonably distributed within its range. The actual excavation area is mainly determined by the cross-section design of the diversion tunnel, and the data are concentrated between 13 m² and 22 m². Overall, the input parameters are distributed relatively uniformly, and the distribution of each indicator in the dataset has a relatively small impact on the prediction results.
To study the water inrush risk dataset comprehensively, the Pearson correlation coefficient (PCC) was calculated for the dataset (Figure 5) to evaluate the relationships among the indicators and between each indicator and the water inrush risk level. The PCC quantifies the linear correlation between two variables and ranges from −1 to 1. A value of zero indicates the absence of any linear relationship between two indicators. Conversely, a value approaching ±1 reflects a strong linear correlation, implying potential redundancy in their contributions to the prediction outcome, so one indicator of such a pair may be dropped when optimizing the dataset. As shown in Figure 5, the PCC values are generally less than 0.6, indicating that the mutual influence among the prediction indicators is not significant.
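The redundancy screening described above can be sketched as follows. The code computes the PCC for every pair of indicator columns and lists the pairs whose absolute correlation reaches a threshold. Using 0.6 as a cutoff is an assumption borrowed from the value quoted in the text, and the indicator values are made up for illustration (I5 is deliberately set to twice I3 so that one pair is flagged).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def redundant_pairs(columns, threshold=0.6):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    names = list(columns)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson(columns[names[i]], columns[names[j]])
            if abs(r) >= threshold:
                pairs.append((names[i], names[j], r))
    return pairs


# Hypothetical indicator values, not taken from the Longjinxi dataset.
columns = {
    "I3": [3.0, 8.0, 12.0, 20.0],
    "I4": [60.0, 45.0, 70.0, 50.0],
    "I5": [6.0, 16.0, 24.0, 40.0],  # exactly 2 * I3, so perfectly correlated
}
flagged = redundant_pairs(columns)
```

Here only the (I3, I5) pair is flagged, while I4 correlates weakly with both, which is the situation in which one of the two flagged indicators could be dropped.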

4.3. Model Training and Prediction

Of the 120 valid data samples, 90 (75%) were randomly selected as the training set to train the NRBO-XGBoost model, and the remaining 30 (25%) were used as the test set to verify the prediction accuracy of the model. The NRBO algorithm was employed to adaptively optimize the three parameters of the XGBoost algorithm. The search ranges of num_trees, max_depth, and eta are [100, 1000], [1, 20], and [0.0001, 0.1], respectively. After iterative calculations, the parameter values corresponding to the minimum RMSE of the training set were taken as the optimal values. The optimal parameters of the XGBoost model are listed in Table 3. The parameter-optimized XGBoost model was trained on the training set to obtain the predicted water inrush risk levels for both the training and test sets. The training and prediction loss values during the iterative calculation process are shown in Figure 6. The final prediction results are presented in Figure 7 and Figure 8, and the calculation results for the evaluation indicators are listed in Table 3.
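One way a population-based optimizer can work with these search ranges is to keep each candidate as a position in the unit cube and decode it into concrete parameter values before training. The sketch below shows only that decoding step. The normalized encoding, the rounding of the integer parameters, and the function name are assumptions for illustration, since the paper does not state how positions are encoded.

```python
# Search ranges quoted in the text for the three tuned parameters.
SEARCH_RANGES = {
    "num_trees": (100, 1000),
    "max_depth": (1, 20),
    "eta": (0.0001, 0.1),
}


def decode_position(position):
    """Map a position in [0, 1]^3 to a dict of XGBoost parameter values."""
    clipped = [min(1.0, max(0.0, p)) for p in position]  # keep inside bounds
    params = {}
    for p, (name, (lo, hi)) in zip(clipped, SEARCH_RANGES.items()):
        value = lo + p * (hi - lo)
        # eta is continuous; the tree count and depth must be integers.
        params[name] = value if name == "eta" else int(round(value))
    return params


lower = decode_position([0.0, 0.0, 0.0])
upper = decode_position([1.0, 1.0, 1.0])
mixed = decode_position([0.5, 1.2, -0.3])  # out-of-range entries are clipped
```

In the scikit-learn interface of the xgboost Python package, these three quantities correspond to the `n_estimators`, `max_depth`, and `learning_rate` arguments.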
As shown in Figure 6, the loss value of the model gradually decreases and stabilizes as the number of iterations increases, indicating that the model has learned the data features and converges well. As shown in Table 2, Figure 7, and Figure 8, a total of 90 groups of samples (75%) in the water inrush risk dataset participated in the training, and their predicted values are consistent with the true values. Of the remaining 30 test samples, 2 were predicted incorrectly, giving a prediction accuracy of 93.33%. For both of these samples, the predicted grade is one grade higher than the measured grade, so the prediction error is on the safe side for water inrush risk prevention and control in the diversion tunnel. The coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE) of the NRBO-XGBoost model are 0.9129, 0.2582, and 0.0667, respectively, which are within a relatively optimal range. The results indicate that the NRBO-XGBoost model predicts the water inrush risk of tunnels crossing water-rich fault zones well.
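As a quick consistency check on the reported metrics, take the situation described above: 28 of the 30 test samples are predicted exactly and the other 2 are each off by one grade. Under that reading, the error-based figures follow directly:

$$\text{Accuracy} = \frac{28}{30} \approx 93.33\%, \qquad \text{MAE} = \frac{1}{30}\,(2 \times 1) \approx 0.0667, \qquad \text{RMSE} = \sqrt{\frac{1}{30}\,(2 \times 1^2)} \approx 0.2582$$

These values agree with the reported MAE and RMSE. The R² value cannot be checked in the same way, because it also depends on the spread of the true grades in the test set, which is not given in the text.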

4.4. Effectiveness Comparison of Model

To validate the model performance, the water inrush risk datasets were utilized to train and test three models: BPNN, XGBoost, and the proposed NRBO-XGBoost. The datasets were partitioned into a 75% training set and a 25% test set for model development and evaluation. A comparative analysis of the prediction results from each model on the test set is presented in Figure 9. Furthermore, the predictive accuracy and evaluation metrics of all models are compared in Figure 10, with the corresponding numerical values detailed in Table 4.
From Figure 9, Figure 10, and Table 4, it can be observed that the accuracy rates of the XGBoost and NRBO-XGBoost models for water inrush risk prediction were 86.67% and 93.33%, respectively, indicating that prediction models based on the XGBoost algorithm can achieve relatively good prediction accuracy. In contrast, the accuracy of the BPNN model was only 76.67%, with seven incorrect predictions among the 30 test samples. The predicted grades of samples 14 and 18 in Figure 9 are lower than the actual values. Such underestimates of risk are non-conservative and are not conducive to the safety control of tunnel construction. Comparing the evaluation indicators of the three models, the R², RMSE, and MAE of the NRBO-XGBoost model were superior to those of the BPNN and XGBoost models. The results indicate that optimizing the three main parameters of the XGBoost model with the NRBO algorithm reduces misclassification and improves the prediction accuracy by 6.66 percentage points compared with the standard XGBoost model, which further verifies the effectiveness of the NRBO-XGBoost model in predicting water inrush risks in tunnels crossing water-rich fault zones.
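The accuracy figures can be read as counts of correct predictions on the 30 test samples. The count for BPNN follows from the seven errors stated above, while the counts for XGBoost and NRBO-XGBoost are inferred here from the reported percentages:

$$\text{BPNN: } \frac{23}{30} \approx 76.67\%, \qquad \text{XGBoost: } \frac{26}{30} \approx 86.67\%, \qquad \text{NRBO-XGBoost: } \frac{28}{30} \approx 93.33\%$$

The gain of NRBO-XGBoost over XGBoost is therefore $93.33 - 86.67 = 6.66$ percentage points, which corresponds to two additional correctly predicted samples, or a relative improvement of about $28/26 - 1 \approx 7.7\%$.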

5. On-Site Verification

The Longjinxi water diversion tunnel is located along a zone with well-developed tensional faults that have formed water-rich fault structures and deep weathering grooves in granite. During tunnel construction through fault zones, multiple water inrush disasters occurred, significantly affecting the safety and schedule of tunnel construction. To reduce the construction risks of the tunnel, the NRBO-XGBoost model was used to evaluate the water inrush risks of five fault zones in the main tunnel and two branch tunnels. The evaluation results were compared with the actual on-site conditions. Large water inrush disasters occurred at the 4+656 mileage of branch tunnel No. 2 and the 7+925 mileage of branch tunnel No. 3. The on-site exposure conditions are shown in Figure 11.
As shown in Table 5 and Figure 11, according to the prediction results, the tunnel risk levels at mileage 4+656 and 7+925 are both Grade IV, indicating a high possibility of water gushing disasters. Based on the engineering records, the on-site construction situation is basically consistent with the prediction, while the water gushing risks at the other three mileage points are relatively low. For the different risk levels, appropriate tunnel construction measures were proposed, as shown in Table 5. Combining the prediction results with the corresponding construction measures can support tunnel construction decisions.
The results indicate that the prediction model based on the NRBO-XGBoost algorithm constructed in this study has good reliability, and can quickly and accurately predict the water inrush risk for water diversion tunnels crossing water-rich fault zones. This can guide construction in a timely manner and reduce the risk of water inrush disasters to a certain extent.

6. Conclusions

The NRBO-XGBoost-based prediction model for water inrush risks in diversion tunnels crossing water-rich fault zones enables rapid and accurate risk assessment by effectively leveraging actual engineering data. The key conclusions, including a comparison with the BPNN and XGBoost models, are as follows.
(1) Based on the parameter indicators collected from actual engineering, considering the independence and ease of acquisition of water inrush risk indicators, nine indicators were selected from three aspects: engineering geological conditions, hydrogeological conditions, and tunnel construction conditions, to construct the water inrush risk prediction index system and classification standards for tunnels crossing water-rich fault zones.
(2) A total of 120 samples were collected from the Longjinxi water diversion tunnel crossing fault zones in Fujian. The samples were used to train and test the BPNN, XGBoost, and NRBO-XGBoost models, and a comparative analysis was conducted. The results show that the prediction accuracy of the NRBO-XGBoost model is the highest, reaching 93.33%, which is superior to that of the other models.
(3) The proposed model was further applied to predict the water inrush risk in five fault zones along the main tunnel and two branch tunnels of the Longjinxi water diversion project. The predictions demonstrate close agreement with the actual field observations, confirming the reliability and practical effectiveness of the NRBO-XGBoost model. It is suggested to further expand the training database, and at the same time, adjust the parameter indicators according to the characteristics of the actual engineering to improve the prediction accuracy.

Author Contributions

Y.P.: Data curation, funding acquisition, writing—original draft, writing—review and editing. S.Z.: Conceptualization, methodology, supervision, writing—original draft. L.S.: Conceptualization, methodology, supervision, writing—original draft. Z.Y.: Conceptualization, methodology. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Natural Science Foundation of Hunan [Grant no. 2025JJ50178], the Scientific Research Project of the Hubei Education Department [Grant no. D20242702] and Engineering Research Center of Rock-Soil Drilling & Excavation and Protection, Ministry of Education [Grant nos. 202410 and 202409].

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

Author Yaxiong Peng was employed by the company Hunan University of Science and Technology Engineering Testing Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

| Symbol | Description |
| --- | --- |
| $y_i$ | True values of the i-th data |
| $\hat{y}_i$ | Estimated values of the i-th data |
| $f_k$ | Function of the k-th decision tree |
| $n$ | Total number of data samples |
| $F$ | Set of regression trees |
| $K$ | Total number of decision trees |
| $\Omega(f_k)$ | Complexity of function |
| $J$ | Total number of leaves |
| $w_j^2$ | Square of the leaf weight |
| $\gamma$, $\lambda$ | Penalty coefficients |
| $C$ | Constant |
| $g_i$ | First derivatives of the loss function |
| $h_i$ | Second derivatives of the loss function |
| $x_{n+1}$ | Next position |
| $X_b$ | Better location in the vicinity of the $x_n$ neighborhood |
| $X_w$ | Worse location in the vicinity of the $x_n$ neighborhood |
| randn | Random number from the standard normal distribution |
| $\Delta x$ | Disturbance quantity |
| $\theta_1$, $\theta_2$ | Random numbers |
| Mean() | Mean function |
| $\delta$ | Adaptive parameter |
| $\mu_1$, $\mu_2$ | Random numbers used to control diversity |
| $\dot{y}_i$ | Normalized data value |
| $y_i$ | True data value |
| $y_{\max}$ | Maximum values of the data |
| $y_{\min}$ | Minimum values of the data |
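
For orientation, the XGBoost symbols listed above fit together as in the standard formulation of Chen and Guestrin. The block below is a sketch of that standard form written with this list's notation ($J$ leaves, leaf weights $w_j$). The loss function $l$, the iteration index $t$ and the input $x_i$ are assumptions introduced here for illustration, and the equations in the main text may differ in detail.

```latex
\begin{aligned}
\mathrm{Obj} &= \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega(f_k), \qquad f_k \in F, \\
\Omega(f_k) &= \gamma J + \tfrac{1}{2}\,\lambda \sum_{j=1}^{J} w_j^{2}, \\
\mathrm{Obj}^{(t)} &\approx \sum_{i=1}^{n} \left[ g_i\, f_t(x_i) + \tfrac{1}{2}\, h_i\, f_t^{2}(x_i) \right] + \Omega(f_t) + C .
\end{aligned}
```

Here $g_i$ and $h_i$ are the first and second derivatives of $l$ with respect to the previous-round prediction, and $C$ collects the terms that do not depend on the newly added tree $f_t$.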

References

  1. Liu, G.J.; Peng, Y.X.; Wu, L.; Cheng, Y.; Dong, D.J.; Jia, L.; Zhu, S. Stability prediction of surrounding rock in tunnel crossing fault zone based on cusp catastrophe theory. KSCE J. Civ. Eng. 2024, 28, 4146–4157. [Google Scholar] [CrossRef]
  2. Pan, Y.; Qi, J.R.; Zhang, J.F.; Peng, Y.X.; Chen, C.; Ma, H.N.; Ye, C. Analytical solution for water inflow into deeply buried symmetrical subsea tunnels with excavation damage zones. Water 2023, 15, 3556. [Google Scholar] [CrossRef]
  3. Li, X.W.; Li, S.C.; Wang, B.; Qu, J.X.; Zhao, J.L.; Zhao, S.S. Water inrush risk assessment during karst tunnel construction based on knowledge decision and data-driven methods. Tunn. Undergr. Space Technol. 2026, 168, 107120. [Google Scholar] [CrossRef]
  4. Kong, H.Q.; Zhang, N. Risk assessment of water inrush accident during tunnel construction based on FAHP-I-TOPSIS. J. Clean. Prod. 2024, 449, 141744. [Google Scholar] [CrossRef]
  5. Xu, K.; Li, S.; Lu, C.; Liu, J. Risk assessment of coal mine gas explosion based on cloud integrated similarity and fuzzy DEMATEL. Process. Saf. Environ. 2023, 177, 1211–1224. [Google Scholar] [CrossRef]
  6. Sun, X.L.; Teng, G.; Guo, X.L.; Li, X.Z. Risk assessment of water inrush in karst tunnels based on fuzzy comprehensive evaluation considering misjudgment losses: A case study. Arab. J. Geosci. 2022, 15, 421. [Google Scholar] [CrossRef]
  7. Zhu, J.Q.; Li, T.Z. Catastrophe theory-based risk evaluation model for water and mud inrush and its application in karst tunnels. J. Cent. South Univ. 2020, 27, 1587–1598. [Google Scholar] [CrossRef]
  8. Cheng, W.J.; Yin, H.Y.; Xie, D.L.; Dong, F.Y.; Li, Y.J.; Zhu, T.; Wang, J. Prediction of dominant roof water inrush windows and analysis of control target area based on set pair variable weight-Forward correlation cloud model. J. Clean. Prod. 2024, 483, 144253. [Google Scholar] [CrossRef]
  9. Peng, Y.X.; Wu, L.; Zuo, Q.J.; Chen, C.H.; Hao, Y. Risk assessment of water inrush in tunnel through water-rich fault based on AHP-Cloud model. Geomat. Nat. Hazards Risk 2020, 11, 301–317. [Google Scholar] [CrossRef]
  10. Wang, Y.C.; Chen, F.; Yin, X.; Geng, F. Study on the risk assessment of water inrush in karst tunnels based on intuitionistic fuzzy theory. Geomat. Nat. Hazards Risk 2019, 10, 1070–1083. [Google Scholar] [CrossRef]
  11. Luo, M.M.; Chen, J.; Hamza, J.; Li, N.; Guo, X.L.; Zhou, H. Identifying and predicting karst water inrush in a deep tunnel, South China. Eng. Geol. 2022, 305, 106716. [Google Scholar] [CrossRef]
  12. Li, S.C.; Bu, L.; Shi, S.S.; Li, L.P.; Zhou, Z.Q. Prediction for water inrush disaster source and CFD-based design of evacuation routes in karst tunnel. Int. J. Geomech. 2022, 22, 05022001. [Google Scholar] [CrossRef]
  13. Chen, J.Y.; Zhou, M.L.; Zhang, D.M.; Huang, H.W.; Zhang, F.S. Quantification of water inflow in rock tunnel faces via convolutional neural network approach. Autom. Constr. 2021, 123, 103526. [Google Scholar] [CrossRef]
  14. Li, Z.Y.; Wang, Y.C.; Olgun, C.G.; Yang, S.Q.; Jiao, Q.L.; Wang, M. Risk assessment of water inrush caused by karst cave in tunnels based on reliability and GA-BP neural network. Geomat. Nat. Hazards Risk 2020, 11, 1212–1232. [Google Scholar] [CrossRef]
  15. Song, W.H.; Cheng, S.; Zi, J.Q.; Jin, H.; Jiang, X.B.; Wang, J.H. Intelligent detection method for tunnel water inrush disasters based on deep learning. Tunn. Undergr. Space Technol. 2026, 172, 107549. [Google Scholar] [CrossRef]
  16. Li, D.D.; Xu, H.W.; Jiang, T.; Ding, H.; Xiang, Y. Tunnel water burst disaster management engineering based on artificial intelligence technology-taking Yonglian Tunnel in Jiangxi Province as the object in China. Water Suppl. 2023, 23, 3377–3391. [Google Scholar] [CrossRef]
  17. Li, S.C.; Liu, C.; Zhou, Z.Q.; Li, L.P.; Shi, S.S.; Yuan, Y.C. Multi-sources information fusion analysis of water inrush disaster in tunnels based on improved theory of evidence. Tunn. Undergr. Space Technol. 2021, 113, 103948. [Google Scholar] [CrossRef]
  18. Arsalan, M.; Mokhtar, M.; Krikar, M.; Mohammad, K.; Hawkar, H.I.; Hunar, F.; Sazan, N.A. Presenting the best prediction model of water inflow into drill and blast tunnels among several machine learning techniques. Autom. Constr. 2021, 127, 103719. [Google Scholar] [CrossRef]
  19. Zhang, N.; Niu, M.M.; Wan, F.; Lu, J.L.; Wang, Y.Y.; Yan, X.H.; Zhou, C.F. Hazard prediction of water inrush in water-rich tunnels based on random forest algorithm. Appl. Sci. 2024, 14, 867. [Google Scholar] [CrossRef]
  20. Xu, Z.G.; Kong, F.H.; Cao, C.; Zhan, Z.Y. Prediction and analysis of tunnel water inrush disasters in Chinese Karst area based on Variable weight-weighted Bayesian network model. Carbonates Evaporites 2025, 40, 2. [Google Scholar] [CrossRef]
  21. Wu, L.J.; Li, X.; Yuan, J.D.; Wang, S.J. Real-time prediction of tunnel face conditions using XGBoost Random Forest algorithm. Front. Struct. Civ. Eng. 2023, 17, 1777–1795. [Google Scholar] [CrossRef]
  22. Nguyen, V.Q.; Tran, V.L.; Nguyen, D.D.; Sadiq, S.; Park, D. Novel hybrid MFO-XGBoost model for predicting the racking ratio of the rectangular tunnels subjected to seismic loading. Transp. Geotech. 2022, 37, 100878. [Google Scholar] [CrossRef]
  23. Shen, Y.; Chen, Y.Z.; Wang, Y.M.; Ma, L.Y.; Zhang, X.L. Research on a prediction model based on a Newton-Raphson-Optimization-XGBoost Algorithm predicting environmental electromagnetic effects for an airborne synthetic aperture radar. Electronics 2025, 14, 2202. [Google Scholar] [CrossRef]
  24. Yang, H.J.; Zhang, J.M.; Pang, Z.Q. Early Indoor Fire Warning Algorithm Based on INRBO-XGBoost and Fuzzy Inference. In Proceedings of the 2024 China Automation Congress (CAC), Qingdao, China, 1–3 November 2024; pp. 911–916. [Google Scholar]
  25. Chen, T.Q.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  26. Vaneeta, R.S. IDPS_WOA-XGBoost: A novel intrusion detection and prevention system based on whale optimization and XGBoost algorithm. Int. J. Inf. Technol. 2026, 18, 345–353. [Google Scholar] [CrossRef]
  27. Zhou, K.; Jia, A.; Yang, C.; Li, Y.Y.; Wei, J. Short-term rockburst prediction model based on the WOA-XGBoost algorithm. Acta Geophys. 2026, 74, 103. [Google Scholar] [CrossRef]
  28. Ravichandran, S.; Manoharan, P.; Pradeep, J. Newton-Ralphson-based optimizer: A new population-based metaheuristic algorithm for continuous optimization problems. Eng. Appl. Artif. Intell. 2024, 128, 107532. [Google Scholar] [CrossRef]
  29. Peng, Y.X.; Wu, L.; Su, Y.; Zhou, R.F. Risk prediction of tunnel water or mud inrush based on disaster forewarning grading. Geotech. Geol. Eng. 2016, 34, 1923–1932. [Google Scholar] [CrossRef]
  30. Ma, S.W.; Mei, Z.R.; Zhang, J.W. Geologic structural quantitative evaluation of abrupt geologic hazards for long and large tunnels. Mod. Tunn. Technol. 2009, 46, 99–104. [Google Scholar] [CrossRef]
  31. Wang, J.; Li, S.C.; Li, L.P.; Lin, P.; Xu, Z.H.; Gao, C. Attribute recognition model for risk assessment of water inrush. Bull. Eng. Geol. Environ. 2019, 78, 1057–1071. [Google Scholar] [CrossRef]
Figure 1. The specific process of NRBO optimizing the XGBoost algorithm.
Figure 2. Distribution map of faults in the tunnel site area.
Figure 3. Geological profile in the tunnel site area.
Figure 4. The violin distribution of each indicator parameter.
Figure 5. Pearson correlation coefficient of the dataset.
Figure 6. Iterative loss curve.
Figure 7. Prediction results and confusion matrix of the training set.
Figure 8. Prediction results and confusion matrix of the test set.
Figure 9. Comparison of prediction results on test set for each model.
Figure 10. Comparison chart of prediction accuracy and evaluation indicator parameters of each model.
Figure 11. Water inrush in the Longjinxi water diversion tunnel.
Table 1. Prediction indicators and classification standards of water inrush in tunnel.

| Type of Indicators | Prediction Indicators | Characterization Parameter | Risk Level I | Risk Level II | Risk Level III | Risk Level IV |
| --- | --- | --- | --- | --- | --- | --- |
| Engineering geology | Lithology of the strata on both sides of the fault (I1) | Lithological index | 0~0.2 | 0.2~0.4 | 0.4~0.6 | 0.6~1 |
| Engineering geology | Degree of rock mass fragmentation (I2) | Integrity degree of rock | 1~0.75 | 0.75~0.55 | 0.55~0.35 | 0.35~0 |
| Engineering geology | Scale of the fault (I3) | Width of fault/m | 0~2 | 2~5 | 5~10 | >10 |
| Engineering geology | Characteristics of the fault (I4) | Dip angle of fault/° | 30~50 | 50~70 | 0~30 | 70~90 |
| Hydrogeology | Groundwater level (I5) | Water level difference/m | 0~10 | 10~30 | 30~60 | >60 |
| Hydrogeology | Water collection capacity (I6) | Catchment area/m² | <5 | 5~7.5 | 7.5~10 | >10 |
| Hydrogeology | Groundwater recharge (I7) | Groundwater pressure/MPa | 0~0.2 | 0.2~0.5 | 0.5~1.0 | >1.0 |
| Tunnel construction conditions | Geometric parameters of tunnel (I8) | Actual excavation area/m² | 0~10 | 10~50 | 50~100 | >100 |
| Tunnel construction conditions | Advanced geological prediction (I9) | Prediction accuracy/% | 90~100 | 70~90 | 50~70 | 0~50 |
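
As a rough illustration of how the classification standards in Table 1 can be applied to a single indicator, the following Python sketch maps a measured value to its risk level. The function names, the dictionary keys and the convention that each interval includes its lower bound are assumptions introduced here (Table 1 does not say how boundary values are resolved). The sketch grades one indicator at a time, whereas the prediction model derives the overall level from all nine indicators jointly.

```python
from bisect import bisect_right

# Upper bounds of levels I, II and III for indicators whose risk grows with
# the value (thresholds from Table 1; anything above the last bound is IV).
# The dictionary keys are hypothetical names chosen for this sketch.
ASCENDING_BOUNDS = {
    "I3_fault_width_m": (2, 5, 10),
    "I5_water_level_difference_m": (10, 30, 60),
    "I7_groundwater_pressure_MPa": (0.2, 0.5, 1.0),
}


def risk_level(indicator: str, value: float) -> int:
    """Return the risk level (1 = I ... 4 = IV) of one indicator value.

    Assumption: a value exactly on a boundary belongs to the higher level.
    """
    bounds = ASCENDING_BOUNDS[indicator]
    return bisect_right(bounds, value) + 1


def dip_angle_level(angle_deg: float) -> int:
    """Risk level for the fault dip angle (I4), which is not monotonic:
    0~30 -> III, 30~50 -> I, 50~70 -> II, 70~90 -> IV (Table 1)."""
    if not 0 <= angle_deg <= 90:
        raise ValueError("dip angle must lie between 0 and 90 degrees")
    if angle_deg < 30:
        return 3
    if angle_deg < 50:
        return 1
    if angle_deg < 70:
        return 2
    return 4


print(risk_level("I3_fault_width_m", 12.0))  # fault wider than 10 m
print(dip_angle_level(80.0))                 # steeply dipping fault
```

Both calls fall in the level IV range of Table 1 for their respective indicators.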
Table 2. Dataset eigenvalues.

| Type of Data | Prediction Indicator | Characterization Parameter | Maximum | Minimum | Average | Standard Deviation |
| --- | --- | --- | --- | --- | --- | --- |
| Input data | Lithology of the strata on both sides of the fault (I1) | Lithological index | 0.709 | 0.183 | 0.447 | 0.153 |
| Input data | Degree of rock mass fragmentation (I2) | Integrity degree of rock | 0.76 | 0.25 | 0.461 | 0.127 |
| Input data | Scale of the fault (I3) | Width of fault/m | 28 | 2 | 13.459 | 8.240 |
| Input data | Characteristics of the fault (I4) | Dip angle of fault/° | 85 | 15 | 49.036 | 21.105 |
| Input data | Groundwater level (I5) | Water level difference/m | 66 | 7 | 37.856 | 33.586 |
| Input data | Water collection capacity (I6) | Catchment area/m² | 13.9 | 1.2 | 6.74 | 3.904 |
| Input data | Groundwater recharge (I7) | Groundwater pressure/MPa | 1.92 | 0 | 0.482 | 0.408 |
| Input data | Geometric parameters of tunnel (I8) | Actual excavation area/m² | 22.7 | 12.7 | 18.068 | 4.473 |
| Input data | Advanced geological prediction (I9) | Prediction accuracy/% | 85 | 40 | 67.477 | 13.600 |
| Output data | Water inrush risk level (I~IV) | 1~4 | 4 | 1 | 2.243 | 0.876 |
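
Because the indicators in Table 2 span very different ranges, they are scaled before training. The nomenclature (normalized value, maximum and minimum of the data) suggests min-max scaling, and the sketch below assumes the usual form, normalized value = (y - y_min) / (y_max - y_min). The function name and the sample widths are hypothetical. The bounds of 2 m and 28 m follow the fault-width row of Table 2.

```python
def min_max_normalize(values, v_min=None, v_max=None):
    """Scale values linearly so that v_min maps to 0 and v_max maps to 1.

    If the bounds are omitted they are taken from the data itself.
    """
    values = list(values)
    lo = min(values) if v_min is None else v_min
    hi = max(values) if v_max is None else v_max
    if hi == lo:
        raise ValueError("a constant column cannot be min-max scaled")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical fault widths in metres, scaled with the Table 2 extremes.
widths_m = [2.0, 8.5, 15.0, 28.0]
scaled = min_max_normalize(widths_m, v_min=2.0, v_max=28.0)
print(scaled)  # [0.0, 0.25, 0.5, 1.0]
```

Passing the bounds explicitly keeps the scaling of the test samples consistent with that of the training samples, which is a design choice of this sketch rather than a documented step of the study.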
Table 3. The optimal parameters and evaluation indicators of NRBO-XGBoost model.

| num_trees | max_depth | eta | R² | RMSE | MAE |
| --- | --- | --- | --- | --- | --- |
| 320 | 20 | 0.09 | 0.9129 | 0.2582 | 0.0667 |

The first three columns are parameters of the XGBoost model; the last three are evaluation indicators.
Table 4. Prediction accuracy and evaluation indicator of each model.

| Model | BPNN | XGBoost | NRBO-XGBoost |
| --- | --- | --- | --- |
| Accuracy/% | 76.67 | 86.67 | 93.33 |
| R² | 0.8114 | 0.9015 | 0.9129 |
| RMSE | 0.4624 | 0.3251 | 0.2582 |
| MAE | 0.1163 | 0.0725 | 0.0667 |
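
The indicators in Table 4 can be computed from true and predicted risk levels as in the Python sketch below. The test vector is hypothetical: it assumes 30 test samples with integer-valued predictions, two of which are off by one level, and an invented distribution of levels. Neither the test-set size nor the label distribution is taken from the study.

```python
import math


def evaluate(y_true, y_pred):
    """Return (accuracy, MAE, RMSE, R2) for paired true/predicted levels."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    accuracy = sum(e == 0 for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot                       # coefficient of determination
    return accuracy, mae, rmse, r2


# Hypothetical test set: 30 samples with risk levels 1..4 (I..IV).
y_true = [1] * 6 + [2] * 12 + [3] * 8 + [4] * 4
y_pred = list(y_true)
y_pred[0] = 2    # a level I sample predicted as level II
y_pred[-1] = 3   # a level IV sample predicted as level III

acc, mae, rmse, r2 = evaluate(y_true, y_pred)
print(f"accuracy={acc:.2%}  MAE={mae:.4f}  RMSE={rmse:.4f}  R2={r2:.4f}")
```

Under these assumptions the sketch gives an accuracy of 93.33%, an MAE of 0.0667 and an RMSE of 0.2582, which match the NRBO-XGBoost column. The R² it prints depends on the invented label distribution and is therefore not comparable with the reported value.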
Table 5. Field prediction results and field conditions.

| Tunnel | Mileage | Fault No. | Occurrence | Risk Level | Face Condition | Engineering Record | Construction Measures |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Main tunnel | 1+566 | F48 | NW295°, SW∠75° | II | Seepage | Rocks within the fault zone are relatively fragmented, and there is a small amount of seepage at the face | Normal construction, with attention to water seepage in the surrounding rock |
| Main tunnel | 2+179 | F52 | NE35°, SE∠78° | I | Dry | Rocks in the fault zone are relatively fragmented and the face is dry | Normal construction in accordance with the tunnel design and plan |
| Branch tunnel No. 2 | 4+656 | F46 | NE80°, NW∠80° | IV | Water inrush | Rocks within the fault zone are extremely fragmented and filled with residual granite soil. Water gushes suddenly at the tunnel face and the face collapses | Stopping construction, grouting, and real-time monitoring measurement |
| Branch tunnel No. 2 | 5+519 | F45 | NE30°, SE∠60° | II | Dripping | Rocks within the fault zone are relatively fragmented, and the joints and fissures at the tunnel face produce dripping water | Normal construction, with attention to water seepage in the surrounding rock |
| Branch tunnel No. 3 | 7+925 | F19 | NE70°, NW∠55° | IV | Water inrush | Rocks within the fault zone are extremely fragmented and filled with fractured rocks. After excavation, a large water inrush occurs, with the water inflow reaching 438.54 m³/h | Stopping construction, grouting, and real-time monitoring measurement |

Peng, Y.; Zhang, S.; Su, L.; Yao, Z. Risk Prediction of Water Inrush in Diversion Tunnel Crossing Water-Rich Fault Based on NRBO-XGBoost Algorithm. Appl. Sci. 2026, 16, 3831. https://doi.org/10.3390/app16083831
