Article

Prediction of the Height of Fractured Water-Conducting Zone: Significant Factors and Model Optimization

1
Shaanxi Provincial Key Laboratory of Geological Support for Coal Green Exploitation, Xi’an University of Science and Technology, Xi’an 710054, China
2
College of Geology and Environment, Xi’an University of Science and Technology, Xi’an 710054, China
3
Geological Research Institute for Coal Green Mining, Xi’an University of Science and Technology, Xi’an 710054, China
*
Author to whom correspondence should be addressed.
Water 2023, 15(15), 2720; https://doi.org/10.3390/w15152720
Submission received: 31 May 2023 / Revised: 13 July 2023 / Accepted: 24 July 2023 / Published: 27 July 2023
(This article belongs to the Section Hydrogeology)

Abstract

Predicting the height of the fractured water-conducting zone (FWCZ) can be challenging due to its significant grey characteristics and the difficulty in scientifically selecting relevant influencing factors. To address this issue, we used Pearson correlation analysis and grey entropy correlation analysis to identify the significant factors and their degree of correlation with the height of FWCZ. Based on this, several constructed models were optimized, and the reliability of the best regression model was verified through parameter inversion analysis. The results indicate that the spatial distribution differences of the main coal mining seams contribute to the complex and variable occurrence conditions of coal seams. This is an important factor that contributes to the significant grey characteristics in predicting the height of FWCZ in the study area. A modeling approach is proposed for predicting the height of FWCZ, based on analyzing significant factors and conducting a multi-level evaluation of the selected prediction models. The order of correlation between significant influencing factors and the height of FWCZ is as follows: comprehensive hardness of overlying rock > average thickness of sandstone > mining depth > mining height. The results of the multi-level evaluation analysis show that, when using small, high-quality datasets, the GA-CatBoost algorithm has better prediction accuracy than the MSR and GA-BP algorithms. The results of the parameter inversion analysis for the GA-CatBoost regression prediction model indicate that within the mining height range of 2.5–5.5 m, the fractured/mining height ratio in the main coal seams is primarily concentrated between 20.45 and 30.59. In addition, a prediction method was developed to determine the limiting mining height by considering water conservation in coal mining.
The relevant research results can provide fundamental theoretical support for ensuring safety in underground production and protecting groundwater in mining areas.

1. Introduction

The height of the fractured water-conducting zone (FWCZ) is an important indicator of overburden damage caused by mining. Based on the predicted height of FWCZ, the technical parameters of mining can be reasonably selected to reduce, at the source, the disturbance or damage that coal mining causes to rock, soil, and water bodies. At the same time, the height of FWCZ is a key control indicator for the design of the coal–rock pillar. The retention width of the coal–rock pillar is determined by the height of FWCZ and the comprehensive movement angle of the overburden; once the expansion range of the roof breakage exceeds this width, water from various sources can easily infiltrate through the mining fissures into the lower mining face, causing water inrush accidents and threatening the safe recovery of the coal seam. Therefore, accurate prediction of the height of FWCZ is crucial for safe and efficient coal mining, preserving groundwater resources, and protecting the surface environment, especially in inland coal mining areas. The Yushenfu mining area is a typical mega coal-producing area in western China, characterized by a delicate ecological and geological environment. As coal seams in this area are shallow and thick, high-intensity mining is prone to cause overburden fractures, leading to the aggravation of phreatic water leakage, destruction of vegetation, and desertification. However, there is currently no widely applicable and functionally reliable model for predicting the height of FWCZ in this mining area.
At present, there are abundant prediction methods for the height of FWCZ, mainly including numerical simulation [1,2,3], physical simulation [4,5], field measurement [6,7,8], and theoretical prediction [9,10,11]. Both numerical simulation and physical simulation suffer from relatively idealized models and from the difficulty of selecting physical and mechanical parameters for the rock mass overlying the coal seam, which easily leads to inaccurate predictions. Field measurements have the highest accuracy among all methods, but the large workload and high economic and time costs seriously hinder their large-scale application. Theoretical prediction methods mainly include empirical formulas and statistical analysis methods. The empirical formulas recommended in coal industry norms are mainly derived from engineering practice with conventional coal mining techniques. For example, the “exploration specification of hydrogeology and engineering geology in mining areas” [12] is applicable to the mining of thin and medium-thick coal seams (1–3 m in a single layer and ≤15 m in total) using integrated or general mining techniques, but it is not applicable to the current large-scale and high-intensity coal exploitation. Statistical analysis methods are inexpensive, simple to operate, and widely applicable, but the selection of influence factors and mathematical formulae is highly subjective and dependent, which makes it difficult to overcome the shortcomings of low accuracy and weak generalization ability. In recent years, the construction of regression models based on machine learning algorithms (MLAs), such as RBFNN, PSO-SVM, RFR, and GEP [13,14,15,16], has become a research trend in the field of predicting the height of FWCZ [14].
MLAs can efficiently solve complex multivariate regression problems, but they often reveal some common drawbacks in practical applications, such as a great demand for sample size, difficulty in determining significantly influential input parameters of regression models, and a tendency to fall into overlearning. Models based on MLAs still have insufficient generalization capability to replace normative, empirical formulations as universal tools.
Considering all the above factors, this paper constructs an FWCZ height prediction model based on the GA-CatBoost algorithm. First, based on analysis of the development characteristics of the Jurassic coal seam occurrence strata in the research area, this study sorts out the spatial distribution patterns of the main mining coal seams. Then, the significant factors affecting the height of FWCZ and their correlation ranking are statistically analyzed. Based on this analysis, various FWCZ height prediction models are constructed using mathematical methods (single-factor regression analysis, multiple stepwise regression) and MLAs (GA-BP, GA-CatBoost). Finally, the R-squared value is used to determine the best regression model, and the reliability of this model is verified by typical example validation and parameter inversion analysis. The research results have important theoretical and practical significance for safe and efficient coal mining and mine water protection under similar geological conditions.

2. Overview of the Study Area

2.1. Physiographic Conditions

The Yushenfu mining area is located in the northern part of Yulin City, Shaanxi Province, with a total area of 836,900 hectares and two partition districts: Yushen and Shenfu (as shown in Figure 1). This mining area is located in the transition zone between the Loess Plateau and the Mu Us sandy land, with three types of geomorphic units: aeolian sand, loess, and river valley. These units are located separately in the northeast and southwest of the mining area, as well as along the first tributary of the Yellow River. The topography of the mining area is high in the northwest and low in the southeast, with flat lake beaches and sand dunes dominating the windy desert area. The undulating loess hilly-gully region is mixed with erosion gullies and hills. The upper reaches of river valleys are flat, while the middle and lower reaches have a large relative height difference. The tectonic units of the Yushenfu mining area are located in the east of the loess tableland syncline in the Ordos Basin and in the middle of the North Shaanxi Slope. The strata are distributed in a zonal pattern along the NE-SW direction with an overall northwest dipping monocline (dip angle < 3°) [17]. The mining area belongs to an arid and semi-arid climate in temperate zones, with annual rainfall less than 500 mm, showing the regional characteristics of being coal-rich, water-poor, and ecologically fragile.

2.2. The Spatial Distribution Characteristics of Main Coal Seams

The strata in the study area are arranged in sequential order from top to bottom, starting with the Quaternary, followed by the Neogene, Cretaceous, Jurassic, and Triassic. Among them, the main aquifer in the study area and the target layer for water conservation is the upper Pleistocene Salawusu Formation (Q3s) in the Quaternary period; in the Neogene Pliocene, the main aquifers in the study area are the Jingle Formation (N2j) and Baode Formation (N2b); the Middle Jurassic Yan’an Formation is the only coal-bearing rock series, and the overlying bedrock includes the Jurassic Zhiluo Formation (J2z), Anding Formation (J2a), and Cretaceous Luohe Formation (K1l) [18]. According to drilling data, the bedrock above the main coal seams in the mining area is primarily composed of sandstone, mudstone, and their interbedding. The lithology of this bedrock consists mainly of medium-hard rocks.
The Yan’an Formation in the Yushenfu mining area contains 8–20 coal seams, with an average of 16 seams. Among them, there are five main recoverable coal seams numbered from deep to shallow: No. 5−2, No. 4−3, No. 3−1, No. 2−2, and No. 1−2 (as shown in Table 1). The distribution of the main coal seams in the study area shows obvious spatial heterogeneity. In planar space, the main coal seams are distributed in a ladder pattern from southeast to northwest, and the burial depth of each coal seam gradually decreases along the line of “Erlingtan water source area–Yaozhen Town–Shenmu County”. The distribution pattern of burial depths of the main coal seams in the Shenfu mining area is not obvious. The No. 2−2 coal seam in the Yushen mining area is generally deeper than 200 m, with an average burial depth (unit: m) second only to the No. 1−2 coal seam (416.35). Meanwhile, the No. 3−1 (107.24) and No. 4−2 (113.22) coal seams have similar average burial depths, both lower than that of the No. 5−2 coal seam (133.43), which is mainly caused by the erosion of river valley terraces. In terms of minable area, shallow seams of No. 3−1, No. 1−2, and No. 2−2 are all exploitable throughout the research area. There is also significant variability in the distribution of the main coal seams in vertical space due to sedimentation and tectonics. For example, the spacing (unit: m) between adjacent main coal seams is generally 35 to 57, 40, 35, and 45 to 50 from the bottom up, with decreasing bottom-up seam thickness [19].

3. Methodology

3.1. Research Technical Route

The research process of this article includes the following six main steps:
Step 1: Initially, the main factors influencing the height of FWCZ are selected. Pearson correlation analysis is applied to identify the factors that are significantly correlated with the height of FWCZ, which are referred to as “significant factors”.
Step 2: Then, grey entropy correlation analysis is used to determine the correlation ranking of significant factors on the height of FWCZ, and single-factor regression prediction models are constructed based on the above correlation analysis results.
Step 3: The significant factors are used as the independent variables, and the height of FWCZ is used as the dependent variable. On this basis, multiple linear regression algorithms and MLAs (GA-BP, GA-CatBoost) are used to construct multi-factor regression models for the height of FWCZ.
Step 4: The goodness of fit of the models is compared, and the best predictive model is selected from those with a medium or better fit level. In this paper, R² is classified into five grades to distinguish the fitting effect: R² ∈ (0, 0.3), very poor; R² ∈ (0.3, 0.5), poor; R² ∈ (0.5, 0.7), medium; R² ∈ (0.7, 0.9), good; R² ∈ (0.9, 1), excellent.
Step 5: Based on on-site measured data, a design value scheme for the significant factors is determined by orthogonal design. The GA-CatBoost regression model is used to predict the height of FWCZ in each coal mining scenario, and the ratio of the height of FWCZ to the mining height (referred to as the “fractured/mining height ratio”) is calculated. To further enhance the credibility of the optimal prediction model, two approaches are employed: typical example validation and parameter inversion analysis. Example validation involves selecting multiple representative working faces and comparing the predicted and measured values of FWCZ height for each working face. Parameter inversion analysis involves statistically analyzing the fractured/mining height ratio to determine its range of values, which is then compared with the statistical analysis results of the measured values.
Step 6: Based on theoretical criteria for setting waterproof coal–rock pillars, the best predictive model is used to predict the limiting mining height. Height limits for mining are then proposed, and appropriate mining methods are recommended for areas with varying water-preservation conditions. The technical route is shown in Figure 2.
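The five-grade R² classification in Step 4 can be written as a small lookup. This is only a sketch: the function name `r2_grade` is hypothetical, and because the text gives open intervals, the treatment of boundary values (each grade includes its lower bound) is an assumption.

```python
def r2_grade(r2: float) -> str:
    """Map a coefficient of determination to the five grades listed in Step 4."""
    if not 0.0 <= r2 <= 1.0:
        raise ValueError("this grading expects R^2 in [0, 1]")
    # Assumption: each grade includes its lower bound (the text uses open intervals).
    if r2 < 0.3:
        return "very poor"
    if r2 < 0.5:
        return "poor"
    if r2 < 0.7:
        return "medium"
    if r2 < 0.9:
        return "good"
    return "excellent"


# R^2 values reported later in the article for the regression models.
for name, value in [("MSR (linear)", 0.473), ("MSR (non-linear)", 0.564),
                    ("GA-CatBoost", 0.9618)]:
    print(name, r2_grade(value))
```

With this grading, only models that reach at least the "medium" grade would remain candidates for the best predictive model.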

3.2. Calculation Principle of Grey Entropy Correlation Analysis

The grey entropy correlation analysis (GECA) method uses the similarity in geometry between the research indicator series and the influence factor series to determine the degree of correlation between them. This method determines influence factors that exhibit a consistent trend of change with the research indicator. In the decision-making process for correlation degree, the weight is determined by the size of the information entropy of the influence factor. The higher entropy indicates that the data are more discrete, carry more information, and have a greater weight [21]. The specific steps of this method are as follows:
Step 1: Determine the analysis series;
Construct the analysis matrix, namely X = [X1, X2, …, Xn, X0], where the reference sequence X0 = {x0(k)|k = 1, 2, …, m} and the comparative sequence Xi = {xi(k)|i = 1, 2, …, n; k = 1, 2, …, m}.
Step 2: Dimensionless standardization of variables in X;
To perform dimensionless processing on variable data, this paper adopts the initialization method; the formula is
$$x_i'(k) = \frac{x_i(k)}{x_i(1)}, \quad i = 1, 2, \ldots, n; \; k = 1, 2, \ldots, m \tag{1}$$
Step 3: Calculate the correlation coefficient;
Calculate the absolute difference between the reference sequence and each comparative sequence, ∆i(k) = |x0(k) − xi(k)|, and record the maximum difference M = max_i max_k ∆i(k) and the minimum difference m = min_i min_k ∆i(k). Then, the correlation coefficient between the reference sequence and the comparative sequence is denoted as
$$\xi_i(k) = \frac{m + \xi M}{\Delta_i(k) + \xi M}, \quad i = 1, 2, \ldots, n; \; k = 1, 2, \ldots, m \tag{2}$$
where ξ is the identification coefficient, usually taken as 0.5.
Step 4: Calculate the gray entropy value;
Normalizing xi(k) yields rik. Then the entropy value Ei and the weight factor ωi can be formulated as [21]:
$$E_i = -\frac{\sum_{k=1}^{m} r_{ik} \ln r_{ik}}{\ln m}, \quad i = 1, 2, \ldots, n; \; k = 1, 2, \ldots, m \tag{3}$$
$$\omega_i = \frac{1 - E_i}{\sum_{i=1}^{n} (1 - E_i)}, \quad \sum_{i=1}^{n} \omega_i = 1 \tag{4}$$
Step 5: Calculate the grey entropy correlation degree.
The grey entropy correlations for the i-th influencing factor can be expressed as
$$\alpha_0(i) = \sum_{k=1}^{m} \omega_i \, \xi_i(k), \quad i = 1, 2, \ldots, n; \; k = 1, 2, \ldots, m \tag{5}$$
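The five GECA steps above can be sketched in plain Python as follows. This is a minimal illustration, not the computation used in the paper: the function name, the argument `ident` (the identification coefficient ξ), and the toy numbers are hypothetical, and r_ik is assumed to be obtained by dividing each initialized value by the sum of its sequence, since the text does not state the normalization. All input values are assumed to be positive.

```python
import math


def grey_entropy_correlation(reference, factors, ident=0.5):
    """Return (weights, degrees) for n comparative sequences of length m."""
    m_len = len(reference)

    # Step 2, Equation (1): initial-value normalization of every sequence.
    x0 = [v / reference[0] for v in reference]
    xs = [[v / seq[0] for v in seq] for seq in factors]

    # Step 3: absolute differences and their global maximum / minimum.
    deltas = [[abs(a - b) for a, b in zip(x0, seq)] for seq in xs]
    d_max = max(max(row) for row in deltas)
    d_min = min(min(row) for row in deltas)

    # Equation (2): grey correlation coefficient for every factor and sample.
    xi = [[(d_min + ident * d_max) / (d + ident * d_max) for d in row]
          for row in deltas]

    # Step 4, Equation (3): entropy of each factor.
    # Assumption: r_ik is the share of each value in its sequence total.
    entropies = []
    for seq in xs:
        total = sum(seq)
        shares = [v / total for v in seq]
        entropies.append(-sum(p * math.log(p) for p in shares) / math.log(m_len))

    # Equation (4): a factor with lower entropy receives a larger weight.
    spread = [1.0 - e for e in entropies]
    weights = [s / sum(spread) for s in spread]

    # Step 5, Equation (5): weighted sum of the correlation coefficients.
    degrees = [w * sum(row) for w, row in zip(weights, xi)]
    return weights, degrees


# Made-up illustrative data (not taken from Table 2).
reference = [100.0, 120.0, 90.0, 140.0]           # X0
factors = [[3.0, 3.5, 2.8, 4.2],                  # nearly uniform sequence
           [150.0, 80.0, 300.0, 60.0]]            # widely scattered sequence
weights, degrees = grey_entropy_correlation(reference, factors)
print(weights, degrees)
```

In this toy case the widely scattered second sequence has the lower entropy, so Equation (4) assigns it the larger weight.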

3.3. Analysis of the Applicability of Regression Model Construction Methods

(1)
Single-factor regression model construction method
The current coal mining technologies in the study area are characterized by high intensity and efficiency. This is mainly due to the advanced equipment techniques used, as well as the general use of fully mechanized mining or fully mechanized top coal caving mining technologies with large mining heights to extract thick coal seams (with a stratified mining thickness of at least 3.5 m). The working face is typically large, with an advancing length of 1–5 km and a width of over 200 m. Moreover, the working face advances at a speed of at least 5 m/d, and the unit production is generally 5–10 million t/a [22]. With the significant increase in mining intensity in modern coal mining processes, the technical characteristics of mining in the study area do not align with the conditions of applicability of the normative empirical formula. Therefore, instead of using the aforementioned method, this paper employs Pearson correlation analysis and grey entropy correlation analysis to identify significant factors that have a relatively small grey correlation with the height of FWCZ. Based on the analysis results, single-factor regression models can be constructed.
(2)
Multi-factor regression model construction method
As indicated in Section 2.2, there is notable spatial heterogeneity in the distribution of the main coal seams in the study area. The burial depth of coal seams, the structure of the overlying strata, and the water resistance performance vary in complexity and variability across different spatial positions. The complex and variable geological conditions of coal seams result in a variety of mining methods. In addition, even if the geological and hydrogeological conditions for coal seam occurrence are the same, the technical conditions for coal mining can vary due to the influence of human subjective initiative. Therefore, the numerous and complex influencing factors, as well as unclear internal connections, result in significant uncertainty and inaccuracy in predicting the height of FWCZ. Considering all these causes, three algorithms, namely multiple stepwise regression, BP neural network, and CatBoost, are selected to construct multi-factor regression models. Multiple stepwise regression (MSR) clarifies the contribution of influencing factors to research indicators through a gradual screening process. This method can avoid collinearity by eliminating factors with insufficient contributions and retaining those with high contributions during the screening process.
The artificial neural network (ANN) is regarded as one of the most robust and versatile techniques and is well suited to problems in which numerous complex parameters influence the process and results [23,24]. As a typical type of ANN, the back propagation (BP) neural network is a multilayer feedforward network trained by the error back-propagation algorithm, which can adapt well to multidimensional mapping problems provided that the data are consistent and there are enough neurons in its hidden layer. As an improved gradient-boosting decision tree (GBDT) algorithm proposed in 2018, CatBoost adopts an ordered boosting scheme to process the training data, which enables it to effectively address gradient bias and prediction bias issues, as well as reduce overfitting [25]. CatBoost remains best suited for handling heterogeneous data and is sensitive to hyper-parameter settings [26]. The genetic algorithm (GA) is a random search algorithm that demonstrates robustness in searching for optimal solutions to many complex problems [27]. In this paper, the GA is used to optimize the hyper-parameters of the BP and CatBoost models.
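To illustrate how a GA can search a hyper-parameter space, the sketch below evolves two continuous parameters against a toy objective. Everything in it is an assumption made for illustration: the function `genetic_search`, the selection, crossover, and mutation settings, and the stand-in objective `toy_error`, which replaces the validation error that a real BP or CatBoost training run would return. It does not reproduce the GA configuration used in the paper.

```python
import random


def genetic_search(objective, bounds, pop_size=20, generations=30,
                   mutation_rate=0.2, seed=0):
    """Minimize `objective` over box `bounds` with a simple elitist GA."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    best = min(population, key=objective)
    for _ in range(generations):
        # Selection: keep the better half of the population as parents.
        population.sort(key=objective)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each gene is copied from one of the parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: occasionally resample a gene inside its bounds.
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] = rng.uniform(lo, hi)
            children.append(child)
        population = parents + children
        best = min(population + [best], key=objective)
    return best


def toy_error(params):
    """Hypothetical stand-in for a model's validation error."""
    learning_rate, depth = params
    return (learning_rate - 0.1) ** 2 + ((depth - 10.0) / 10.0) ** 2


bounds = [(0.01, 0.3), (2.0, 16.0)]   # assumed search ranges
best = genetic_search(toy_error, bounds)
print(best, toy_error(best))
```

In practice the objective would train the model with the candidate parameters and return a cross-validated error, which makes each evaluation far more expensive than in this toy case.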

4. Regression Model Construction for the Height of FWCZ

4.1. Initial Selection and Quantification of Influencing Factors

Referring to the empirical formulas provided in appendix A.1 of the “code for hydrogeological and engineering geological exploration of mining areas” [12], as well as previous scientific understanding of the primary factors that influence the height of FWCZ [4,26] and the characteristics of the overlying rock stratigraphy in the study area [28], the following indicators are considered as the primary factors that influence the prediction of the height of FWCZ (X0): mining depth (X1), mining height (X2), working face length (X3), bedrock–soil ratio (the ratio of bedrock thickness to overburden thickness, X4), sandstone–bedrock ratio (the ratio of sandstone thickness to bedrock thickness, X5), sandstone–bedrock layer coefficient (the ratio of the number of sand layers to the total number of bedrock layers, X6), the average thickness of sandstone (X7), and comprehensive hardness of cover rock (X8). The parameter X8 is calculated by taking the weighted average value of the Protodyakonov coefficient of the rock layers that lie above the coal seam, with the thickness of each layer being used as the weight. On this basis, data regarding the height of FWCZ in the overlying rock of the main coal seam in the study area and its influencing factors were compiled (as shown in Table 2).

4.2. Construction of Single-Factor Regression Models

Pearson correlation analysis was conducted using SPSS 21.0 software to examine the relationship between X0 and its influencing factors X1–X8. The results presented in Table 3 reveal that mining depth (X1), mining height (X2), average thickness of sandstone (X7), and comprehensive hardness of cover rock (X8) are significant factors affecting the height of FWCZ.
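For readers without SPSS, the Pearson coefficient itself is straightforward to compute. The sketch below uses only the Python standard library; the function name and the sample values are made up for illustration and are not taken from Table 2. It returns the coefficient only and does not perform the significance test that SPSS reports alongside it.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical sample: mining height (m) against FWCZ height (m).
mining_height = [2.5, 3.5, 4.5, 5.5]
fwcz_height = [60.0, 85.0, 100.0, 130.0]
r = pearson_r(mining_height, fwcz_height)
print(r)
```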
After normalizing the comparison series, calculations were performed using Equations (3) and (4). The results obtained from the entropy method are presented in Table 4. As shown in the table, the weights of the significant factors were ranked in the following order: comprehensive hardness of cover rock > average sandstone thickness > mining depth > mining height.
Equations (1), (2), and (5) were used to calculate the results of the grey entropy correlation (as shown in Figure 3). As shown in the figure, the correlation between the significant factors and the height of FWCZ is ranked as follows: comprehensive hardness of cover rock > average thickness of sandstone > mining depth > mining height. This indicates that the structural and strength characteristics of the overburden have a stronger correlation with the height of FWCZ than mining-related factors in the study area. We suppose that if the research scope were significantly reduced (for example, to a single coal mine), the geological and hydrogeological conditions of the coal seams would remain relatively stable and the data would tend to be concentrated within the corresponding feature dimension. Under this condition, the influence of mining factors, such as mining height, on the height of FWCZ would gradually become more prominent.
According to the concept of information entropy, it can be inferred that within a specific feature dimension, the weight or sensitivity of the independent variable on the dependent variable tends to decrease as the robustness of the data distribution increases. From Figure 4b, it can be observed that the mining thickness values range from 1.5 to 7.9 m but are most densely distributed between 5 and 5.5 m. This interval covers only 7.8% of the full range of values, making mining thickness the most densely distributed of the four factors. In contrast, there is no clearly defined concentrated area for the comprehensive hardness of the cover rock (as seen in Figure 4d), which is equivalent to a concentration interval encompassing 100% of the range of values. Based on the distribution density characteristics of the data for each feature dimension, we can verify the reliability of the correlation ranking of the four significant factors mentioned above on the height of FWCZ.
Based on the results of grey entropy correlation analysis, the correlation between the average thickness of sandstone and the height of FWCZ exhibits the most significant grey characteristics. Therefore, it is not possible to construct a good single-factor regression model based solely on the relationship between these two variables (as shown in Figure 4c). The height of FWCZ initially rose and then declined with increasing mining depth or comprehensive hardness of cover rock (as shown in Figure 4a,d). Conversely, it showed an increasing trend with mining height (as shown in Figure 4b). Three single-factor regression models were constructed using mining depth, comprehensive hardness of cover rock, and mining height as independent variables. The determination coefficients (R²) of these three models ranged from 0.1013 to 0.4888, indicating low prediction accuracy and making them unsuitable for model selection.

4.3. Multi-Factor Regression Models Constructed

Multiple regression models were constructed using X1, X2, X7, and X8 as independent variables and X0 as the dependent variable. Three different models were tested: a multiple stepwise regression model, a GA-BP neural network model, and a GA-CatBoost regression model.
(1)
Multiple stepwise regression (MSR) model
Stepwise regression analysis was conducted using SPSS 21.0 software. The entry value was set to 0.25, and the deletion value was set to 0.30. As a result, the independent variables X2 and X7 were retained. Accordingly, the predictive model is
$$y = 13.664 + 16.172 X_2 + 2.167 X_7 \tag{6}$$
The model has passed the F-test and does not exhibit multicollinearity. The model’s goodness of fit is poor, as indicated by an R2 value of 0.473.
A multiple non-linear regression model was created by improving Equation (6) as
$$y = 20.703 X_2 - 0.026 \exp(X_2) + 5.334 X_7 - 19.29 \tag{7}$$
The model has passed the F-test and does not exhibit multicollinearity. The model’s goodness of fit is medium, as indicated by an R2 value of 0.564.
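As a quick way to apply the two MSR models, the sketch below evaluates Equations (6) and (7) for a given mining height X2 and average sandstone thickness X7. The coefficients are transcribed from the equations above; the signs of the exponential and constant terms in Equation (7) are taken as negative, which is an assumption about how that equation reads. The function names and the sample inputs are hypothetical.

```python
import math


def fwcz_height_linear(x2, x7):
    """Equation (6): linear MSR model (x2 = mining height, x7 = sandstone thickness)."""
    return 13.664 + 16.172 * x2 + 2.167 * x7


def fwcz_height_nonlinear(x2, x7):
    """Equation (7): non-linear MSR model, with signs as assumed in the lead-in."""
    return 20.703 * x2 - 0.026 * math.exp(x2) + 5.334 * x7 - 19.29


# Hypothetical working face: 5.0 m mining height, 6.0 m average sandstone thickness.
print(fwcz_height_linear(5.0, 6.0))
print(fwcz_height_nonlinear(5.0, 6.0))
```

Given the reported R² values of 0.473 and 0.564, both functions should be read as rough estimators only.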
(2)
GA-BP neural network model
A GA-BP regression model was constructed using MATLAB (R2016a) with four neurons in the input layer, six neurons in the hidden layer, and one neuron in the output layer (as shown in Figure 5). The sample datasets from Table 2 were randomly divided into three sets: 34 (70%) for training, 7 (15%) for validation, and 7 (15%) for testing. The primary input parameters for the BP model were optimized using the GA algorithm to achieve the following values: iterations = 100, learning_rate = 0.094, depth = 10, l2_leaf_reg = 1. The remaining parameters were kept at their default values.
The results of training the GA-BP neural network are presented in Figure 6. The frequency distribution of errors in each dataset indicates that the errors were primarily concentrated within the range of −2.33–7.19 (as shown in Figure 6a), which is relatively low compared to the mean height of FWCZ of 101.039 ± 31.69. The linear regression demonstrates a strong fit for the model, as evidenced by the multiple correlation coefficient R, which exceeded 0.91 for each dataset (as shown in Figure 6b).
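The 34/7/7 random partition described above can be reproduced with a seeded shuffle. This is a sketch only: the function name and the seed are hypothetical, and the total of 48 samples is inferred from the stated subset sizes rather than quoted from Table 2.

```python
import random


def split_indices(n_samples, n_train, n_val, seed=0):
    """Shuffle sample indices and cut them into train / validation / test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


# 48 samples in total (34 + 7 + 7), i.e. roughly 70% / 15% / 15%.
train_idx, val_idx, test_idx = split_indices(48, 34, 7)
print(len(train_idx), len(val_idx), len(test_idx))
```

Fixing the seed keeps the partition reproducible, which matters with a dataset this small because a different split can noticeably change the reported errors.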
(3)
GA-CatBoost regression model
A GA-CatBoost regression model was constructed using SPSS 21.0 software. Firstly, the GA algorithm was used to determine the optimal values of the main input parameters. The optimal values obtained were 100 iterations, a learning rate of 0.1, a depth of 10, and an L2 leaf regularization (l2_leaf_reg) of 1. The original data in Table 2 were randomly divided into a training set consisting of 32 data points (67%) and a testing set consisting of 16 data points (33%). The primary input parameters of the CatBoost model were optimized using the GA algorithm to attain the following values: iterations = 100, learning rate = 0.1, depth = 16, and l2_leaf_reg = 1. The remaining parameters were set to their default values. Finally, the accuracy of the model was evaluated using multiple indicators, and the results are presented in Table 5. As shown in the table, the model’s test set has a mean absolute percentage error (MAPE) of 9.37% and an R² value of 0.872.

5. Reliability Verification and Basic Application of the Best Predictive Model

5.1. Optimization of Regression Models

The best predictive model was determined by using R² and the sum of squared errors (SSE) as evaluation indicators. The principle is that the closer R² is to 1 and the smaller the SSE, the better the fit of the corresponding model. R² is ranked as follows: GA-CatBoost (0.9618) > GA-BP (0.8614) > MSR (0.5640). SSE is ranked as follows: GA-CatBoost (1.06 × 10⁶) < GA-BP (1.07 × 10⁶) < MSR (1.11 × 10⁶). Based on the results, it can be concluded that the GA-CatBoost model has the highest level of prediction accuracy and is, therefore, the most effective predictive model (as shown in Figure 7).
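The evaluation indicators used here, together with the MAPE reported for the GA-CatBoost test set, can be computed as in the following sketch. The function names and the three-sample example are hypothetical; the formulas are the standard definitions of SSE, R², and MAPE.

```python
def sse(y_true, y_pred):
    """Sum of squared errors between measured and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SSE / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    sst = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - sse(y_true, y_pred) / sst


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (measured values must be non-zero)."""
    return 100.0 * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up measured and predicted FWCZ heights (m).
measured = [100.0, 120.0, 80.0]
predicted = [110.0, 115.0, 85.0]
print(sse(measured, predicted), r_squared(measured, predicted), mape(measured, predicted))
```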

5.2. Reliability Validation of the Best Predictive Model

Five working faces were randomly selected, and the absolute values of prediction errors for the GA-CatBoost model were calculated to be 0.87–4.80% (as shown in Table 6).
Based on the measured value ranges of X1, X2, X7, and X8 presented in Table 1, we established the design value scheme for these four significant factors. Mining depth (X1) was tied to mining height (X2) through the fitted equation X1 = 51X2 − 29.298 (R² = 0.525). When the design values of X2 are sequentially taken as 2.5, 3.5, 4.5, and 5.5 m, the corresponding values of X1 are 98.2, 149.2, 200.2, and 251.2 m, respectively. The design values of X7 are 3, 6, 9, and 12 m in sequence. The design values of X8 are 2.00, 2.35, 2.70, and 3.05 MPa in sequence. Then, using this design value scheme, we applied the GA-CatBoost regression model to obtain the predictive value of X0 and calculate the corresponding fractured/mining height ratio. The results are shown in Table 7.
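The design values can be generated directly from the fitted relation between mining depth and mining height, taking X1 as mining depth and X2 as mining height as defined in Section 4.1. In the sketch below, the variable names are hypothetical, and the full-factorial grid is used only as a stand-in: an orthogonal design, as referred to in the article, would select a subset of these combinations.

```python
from itertools import product


def mining_depth_from_height(mining_height):
    """Fitted relation from the text: X1 = 51 * X2 - 29.298 (both in metres)."""
    return 51 * mining_height - 29.298


mining_heights = [2.5, 3.5, 4.5, 5.5]                                   # X2 (m)
mining_depths = [round(mining_depth_from_height(h), 1)
                 for h in mining_heights]                               # X1 (m)
sandstone_thickness = [3, 6, 9, 12]                                     # X7 (m)
hardness = [2.00, 2.35, 2.70, 3.05]                                     # X8

# Assumption: full-factorial combination instead of an orthogonal array.
scheme = [
    {"X1": depth, "X2": height, "X7": thickness, "X8": f}
    for (height, depth), thickness, f in product(
        zip(mining_heights, mining_depths), sandstone_thickness, hardness)
]
print(mining_depths, len(scheme))
```

Each dictionary in `scheme` is one scenario that could be passed to the trained regression model to obtain a predicted FWCZ height and, from it, the fractured/mining height ratio.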
The calculation results indicate that the fractured/mining height ratio of the main coal seams is mainly concentrated between 20.45 and 30.59 within the mining height range of 2.5–5.5 m, with an average ratio of 25.52. Additionally, when the mining depth is less than 100 m, the mining height is ≤2.5 m, and the average thickness of sandstone is ≥9 m, the fractured/mining height ratio will exceed 31. Extensive statistical data on the height of FWCZ in the study area show that the fractured/mining height ratio of fully mechanized coal faces is mainly 21–30, with an average of about 26. Moreover, the lithology of the coal seam overburden in the study area is mainly medium-hard, and the fractured/mining height ratio of medium-hard rock is typically 18–28, which is widely accepted among many scholars [29,30].
In summary, it is believed that the fractured/mining height ratio of fully mechanized coal face in the study area is mainly concentrated in the range of 18–31. The results of the inversion calculation in this paper are highly consistent with previous research, indicating that the best predictive model is reliable. This model can expand the sample capacity through parameter inversion, even under the conditions of a small data set. As a result, it reveals the mining damage law of the overburdened rock of the main mining seam.
Based on the results of parameter inversion, a significant linear relationship was found between the fractured/mining height ratio (y) and the mining depth, mining height, and average sandstone thickness. The relationship can be expressed as follows:
$$y = 0.37 X_1 + 15.126 X_2 + 0.435 X_7 + 26.343 \tag{8}$$
The R2 of this formula is 0.759.

5.3. Determination of the Limiting Mining Height under Conditions of Water-Preserved Mining

The Salawusu Formation aquifer is the only source of water that sustains the ecological balance of the entire study area. To protect this phreatic water source, it is advisable to keep the height of the FWCZ below the weathered bedrock, which typically refers to the 20 m section of rock at the top of the bedrock, for the following reasons. Weathered bedrock has weak engineering and mechanical properties. Its failure and instability can lead to mining cracks that damage the impermeable clay layer above, threatening the stability of the shallow phreatic water. Additionally, the weathered bedrock has a highly developed network of fissures and is typically more saturated with water than the underlying bedrock layer. The “Procedures for coal pillar reservation and ‘three unders’ coal mining” recommend preventing the fractured water-conducting zone from reaching the bottom of water-filled aquifers by leaving a waterproof coal–rock pillar intact:
$$H_{sh} \geq H_f + H_b \tag{9}$$
where Hsh represents the thickness of the water-resisting layer (m); Hf represents the height of the FWCZ (m); and Hb represents the height of the watertight safety coal–rock pillar (m). When there is a clay water barrier, Hb is three times the mining height; otherwise, it is five times the mining height.
The distance from the coal seam to the base of the weathered bedrock can be used as the initial value of Hf. Next, the mining height can be calculated using Equation (8), and the resulting value can be substituted into Equation (9) to determine the value of Hsh. If the value of Hsh ensures that the water-resisting rock layers are located below the Salawusu Formation aquifer, then the calculation is complete. Otherwise, the value of Hf is decreased step by step and the check is repeated until this condition is satisfied.
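The iterative procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: a constant fractured/mining height ratio (the average of 25.52 reported above) stands in for Equation (8), the step size of 0.5 m is arbitrary, and the stopping check is taken to be that the required pillar thickness Hf + Hb does not exceed the available distance between the coal seam and the base of the aquifer. The function name and its parameters are hypothetical.

```python
def limiting_mining_height(
    dist_to_weathered_base_m: float,
    available_barrier_m: float,
    ratio: float = 25.52,
    has_clay_barrier: bool = True,
    step_m: float = 0.5,
):
    """Lower H_f step by step until H_f + H_b fits below the aquifer.

    Returns (mining_height, h_f, h_sh) in metres, or None if no positive
    H_f satisfies the check.
    """
    # H_b is 3x the mining height with a clay barrier, otherwise 5x (from the text).
    multiplier = 3.0 if has_clay_barrier else 5.0
    h_f = dist_to_weathered_base_m  # initial value suggested in the text
    while h_f > 0:
        # Assumption: a constant ratio replaces Equation (8) in this sketch.
        mining_height = h_f / ratio
        h_b = multiplier * mining_height
        h_sh = h_f + h_b  # required water-resisting thickness
        if h_sh <= available_barrier_m:
            return mining_height, h_f, h_sh
        h_f -= step_m
    return None


# Hypothetical face: 100 m from seam to weathered bedrock base,
# 100 m available below the aquifer, clay barrier present.
result = limiting_mining_height(100.0, 100.0)
if result is not None:
    mining_height, h_f, h_sh = result
    print(round(mining_height, 2), h_f, round(h_sh, 2))
```

In practice the constant ratio would be replaced by the fitted relationship or by the regression model itself, which makes the mining height an implicit unknown and may require a numerical solve inside the loop.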
Using the aforementioned method, the limiting mining height (the maximum thickness of coal seam that can be mined in a water-preserved mining coalfield) was calculated for each of the fully mechanized mining faces listed in Table 2. As shown in Figure 4b, the limiting mining height of the main coal seams in the study area is mainly concentrated within the range of 3.0–5.5 m. Based on the water conservation zoning framework depicted in Figure 8 and the statistical analysis of the limiting mining height, the geological conditions that characterize each zone are as follows.
(1) When the depth of a coal seam is less than 100 m and the bedrock–soil ratio is below 1.5, the limiting mining height should not exceed 1 m. If the actual mining height exceeds 1 m, there is a high risk of water loss during coal mining. The eastern part of the study area has poor water yield and recharge properties, so there are no engineering requirements for water preservation in mining there. However, the development of coal resources must be accompanied by measures to prevent and control secondary surface hazards in arid mining regions. The northeastern part of the study area lacks water resources. Coal mining in this area poses a significant threat to water bodies, and the damage caused by mining activities is often irreversible. Therefore, it is recommended to restrict or avoid coal seam mining in water-preserved and restricted mining regions.
(2) When the depth of a coal seam is less than 150 m and the bedrock–soil ratio is 1–3, the limiting mining height should be 1–3 m. The corresponding fully mechanized coal faces are mainly located in the southeastern regions of the Yushen I and II planning areas. These areas are classified as water-preserved and restricted mining regions because coal mining there poses a greater risk of damage to water bodies. However, despite the impact of coal mining, the abundant shallow phreatic water provides an opportunity for ecological restoration techniques to revitalize the ecology of the mining area in the short term.
(3) When a coal seam is at a depth of 150–250 m and the bedrock–soil ratio is 4–6, the limiting mining height should be 3–6 m. In such cases, the fractured/mining height ratio of the overburden typically ranges from 18.0 to 30.5. Mining techniques such as height-limited mining, slice mining, coordinated mining, and backfill mining are suitable for extracting coal seams in controllably water-preserved mining regions.
(4) When the coal seam depth exceeds 250 m and the bedrock–soil ratio is 0.25, the limiting mining height exceeds 6.0 m. At present, the longwall mining method, which mines the full seam thickness in one pass, does not damage the surface ecology or water levels under these conditions. This method is considered suitable for natural water-preserved mining regions.

5.4. Discussion

A modeling approach has been proposed for predicting the height of FWCZ. This method is based on the analysis of significant factors and the multi-level evaluation of the selected prediction models. The grey entropy correlation analysis method is used to determine the significant factors and thereby construct an index system. Next, regression models are developed using multiple approaches, and the optimal prediction model is determined through model selection. The multi-level evaluation consists of comparing the goodness of fit, validating typical examples, and analyzing parameter inversion. Through this method, the performance of the best prediction model can be evaluated in three aspects: fitting effect, local prediction effect, and global feature inversion effect. Evaluating the model through global feature inversion also checks the quality of the original data, which helps explain why the CatBoost model predicts well even with a small dataset. This modeling method has significant advantages in ensuring the objective and reasonable selection of indicators, as well as high model reliability.
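The "validating typical examples" step of the multi-level evaluation amounts to comparing predicted and measured heights of FWCZ. The sketch below recomputes the relative errors for four of the working faces in Table 6 (14202, 12-2 upper 0101, 122106, and 2305) and adds a few summary metrics. The function names are hypothetical, and the summary metrics over four samples are shown only to illustrate the calculation; they are not results reported in the paper.

```python
import math


def relative_error_pct(predicted: float, true: float) -> float:
    """Signed relative error of a prediction, in percent."""
    return (predicted - true) / true * 100.0


def summary_metrics(pairs):
    """RMSE, MAE (both in m) and MAPE (%) for (predicted, true) pairs."""
    n = len(pairs)
    residuals = [p - t for p, t in pairs]
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    mae = sum(abs(r) for r in residuals) / n
    mape = sum(abs(p - t) / t for p, t in pairs) / n * 100.0
    return {"rmse": rmse, "mae": mae, "mape_pct": mape}


# (predicted, true) heights of FWCZ in metres, taken from Table 6.
validation_pairs = [
    (74.63, 75.60),    # 14202, Zhangjiamao
    (116.49, 111.49),  # 12-2 upper 0101, Jinjitan
    (132.47, 139.15),  # 122106, Caojiatan
    (109.15, 110.11),  # 2305, Hanjiawan
]

errors = [round(relative_error_pct(p, t), 2) for p, t in validation_pairs]
metrics = summary_metrics(validation_pairs)
print(errors)
print({k: round(v, 3) for k, v in metrics.items()})
```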
We have proposed a prediction method for determining the limiting mining height by considering water conservation in coal mining. Based on theoretical criteria for setting waterproof coal–rock pillars, this method utilizes the best predictive model to predict the optimal mining height, thereby helping to prevent incidents of underground roof water inrush. Furthermore, by comprehensively applying the principles of water conservation, classifying coal mining areas, and using the prediction method for determining the maximum mining height, we can identify the typical characteristics of coal seam occurrence in each mining area and establish appropriate mining principles.
In addition, we believe that the GA-CatBoost regression model has the potential to assess the dynamics of groundwater in coal mining areas. Especially during the closure stage of coal mines, the volume of the overlying strata in the goaf expands after unloading, which increases rock porosity and, in turn, causes groundwater to seep into and submerge the mine. Previous studies [32,33] developed a hydrodynamic model based on the finite difference principle that accurately predicts the rebound and inflow of transient mine water. We plan to combine the GA-CatBoost regression model with this hydrodynamic model to analyze the spatial relationship between the height of FWCZ and the groundwater level. Our objective is to assess the impact of coal mining on the groundwater level in the mining region, with particular emphasis on the Quaternary ecological water level.

6. Conclusions

(1)
We have identified the reason why the prediction of the height of FWCZ in the Yushenfu mining area exhibits obvious grey characteristics. On the one hand, the main coal seams show significant spatial heterogeneity. The burial depth of the coal seams is generally shallow in the east and deep in the west, and shallow coal seams are distributed over a relatively wide area. Meanwhile, there is significant variability in the vertical distribution of the main coal seams due to sedimentation and tectonic activity. On the other hand, the complex and variable geological conditions of the coal seams result in a variety of mining methods, and the selection of technical parameters is also influenced by subjective factors. In brief, the complex and variable geological and hydrogeological conditions of coal seam occurrence are important factors contributing to significant uncertainty and inaccuracy in predicting the height of the FWCZ.
(2)
A modeling approach has been proposed for predicting the height of FWCZ. This method is based on the analysis of significant factors and the multi-level evaluation of the selected prediction models. This modeling method has significant advantages in ensuring the objective and reasonable selection of indicators, as well as ensuring a high level of model reliability. Through the grey entropy correlation analysis method, we can conclude that the descending order of correlation between the height of FWCZ and its significant influencing factors is as follows: comprehensive hardness of the overlying rock, the average thickness of sandstone, mining depth, and mining height. The multi-level evaluation consists of the comparison of the goodness of fit, validation of typical examples, and analysis of parameter inversion. The calculation results of the analysis of parameter inversion indicate that the fractured/mining height ratio of the main coal seams is mainly concentrated between 20.45 and 30.59 within the mining height range of 2.5–5.5 m, with an average ratio of 25.52. Through the application of a multi-level evaluation method, we can conclude that under the condition of small sample data sets with high quality, the GA-CatBoost algorithm has better prediction accuracy compared to SFR, MSR, and GA-BP algorithms. Thus, the GA-CatBoost regression model is the best predictive model.
(3)
We have proposed a prediction method for determining the limiting mining height by considering water conservation in coal mining. Based on theoretical criteria for setting waterproof coal–rock pillars, this method utilizes the best predictive model to predict the optimal mining height, thereby helping to prevent incidents of underground roof water inrush. Furthermore, by comprehensively applying the principles of water conservation, classifying coal mining areas, and using the prediction method for determining the maximum mining height, we can identify the typical characteristics of coal seam occurrence in each mining area and establish appropriate mining principles. Relevant research results can provide a fundamental theoretical guarantee for ensuring underground safety production and protecting groundwater in mining areas.

Author Contributions

Methodology, L.G., Y.S. and N.W.; validation, Y.S.; investigation, L.G. and S.S.; formal analysis, L.G.; writing—original draft preparation, L.G.; writing—review and editing, N.W. and H.K., supervision, N.W.; project administration, Y.S.; funding acquisition, Y.S. and S.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the funding of Key Research and Development Projects of Shaanxi Province (Grant No. 2023-YBSF-458), Key Program of Shaanxi Provincial Key Laboratory of Geological Support for Coal Green Exploitation (Grant No. DZBZ2022Z-03), 2020 Plan of Science and Technology (Industry-University-Research Project) of Yulin City (Grant No. CXY-2020-034).

Data Availability Statement

Not applicable.

Acknowledgments

We thank the entire team for their efforts to improve the quality of the article. At the same time, we would like to thank the editor for his timely handling of the manuscripts.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhang, Y.X.; Tu, S.H.; Bai, Q.S.; Li, J.J. Overburden fracture evolution laws and water-controlling technologies in mining very thick coal seam under water-rich roof. Int. J. Min. Sci. Technol. 2013, 23, 693–700.
  2. Wang, G.; Wu, M.M.; Wang, R.; Xu, H.; Song, X. Height of the mining-induced fractured zone above a coal face. Eng. Geol. 2017, 216, 140–152.
  3. Teng, Y.H.; Yi, S.H.; Zhu, W.; Jing, S.Q. Development patterns of fractured water-conducting zones under fully mechanized mining in wet-collapsible loess area. Water 2023, 15, 22.
  4. Zhou, Y.; Yu, X.Y. Study of the evolution of water-conducting fracture zones in overlying rock of a fully mechanized caving face in gently inclined extra-thick coal seams. Appl. Sci. 2022, 12, 9057.
  5. Xu, S.Y.; Zhang, Y.B.; Shi, H.; Wang, K.; Geng, Y.P.; Chen, J.F. Physical simulation of strata failure and its impact on overlying unconsolidated aquifer at various mining depths. Water 2018, 10, 650.
  6. Du, W.G.; Chai, J.; Zhang, D.D.; Lei, W.L. The study of water-resistant key strata stability detected by optic fiber sensing in shallow-buried coal seam. Int. J. Rock Mech. Min. Sci. 2021, 141, 104604.
  7. Wang, X.Z.; Zhu, W.B.; Xie, J.L.; Han, H.K.; Xu, J.M.; Tang, Z.Y.; Xu, J.L. Borehole-Based monitoring of mining-induced movement in ultrathick-and-hard sandstone strata of the Luohe formation. Minerals 2021, 11, 1157.
  8. Hou, E.K.; Yuan, F.; Wang, S.M.; Xie, X.S.; Wu, B.H. Seismic identification and development characteristics of water conducting fissure zone in goaf. J. China Coal Soc. 2023, 48, 414–429. (In Chinese)
  9. Guo, W.B.; Zhao, G.B.; Lou, G.Z.; Wang, S.R. Height of fractured zone inside overlying strata under high-intensity mining in China. Int. J. Min. Sci. Technol. 2019, 29, 45–49.
  10. He, J.H.; Li, W.P.; Fan, K.F.; Qiao, W.; Wang, Q.Q.; Li, L.N. A method for predicting the water-flowing fractured zone height based on an improved key stratum theory. Int. J. Min. Sci. Technol. 2023, 33, 61–71.
  11. Ning, J.G.; Wang, J.; Tan, Y.L.; Xu, Q. Mechanical mechanism of overlying strata breaking and development of fractured zone during close-distance coal seam group mining. Int. J. Min. Sci. Technol. 2020, 30, 207–215.
  12. State Bureau of Technical Supervision. Exploration Specification of Hydrogeology and Engineering Geology in Mining Areas; Coal Industry Publish House: Beijing, China, 2021; p. 30.
  13. Wu, Q.; Shen, J.J.; Liu, W.T.; Wang, Y. A RBFNN-based method for the prediction of the developed height of a water-conductive fractured zone for fully mechanized mining with sublevel caving. Arab. J. Geosci. 2017, 10, 172.
  14. Zheng, Q.S.; Wang, C.F.; Liu, W.T.; Pang, L.F. Evaluation on Development height of water-conduted fractures on overburden roof based on nonlinear algorithm. Water 2022, 14, 3853.
  15. Zhao, D.K.; Wu, Q. An approach to predict the height of fractured water-conducting zone of coal roof strata using random forest regression. Sci. Rep. 2018, 8, 10986.
  16. Rasouli, H.; Shahriar, K.; Madani, S.H. Prediction of the height of fracturing via gene expression programming in Australian longwall panels: A comparative study. Rudarsko-Geološko-Naftni Zbornik 2022, 37, 91–104.
  17. Miao, L.T.; Xia, Y.C.; Duan, Z.H.; Sun, X.Y.; Du, R.J. Coupling characteristics and intelligent integration technology of coal-overlying rock-groundwater-ecological environment in Yu-Shen-Fu mining area in the middle reaches of the Yellow River. J. China Coal Soc. 2021, 46, 1521–1531. (In Chinese)
  18. Wang, S.M.; Huang, Q.X.; Fan, L.M.; Wang, W.K.; Yu, X.Y.; Wang, G.Z.; Yang, Z.Y.; Hou, E.K.; Shen, T.; Fen, Q.K.; et al. Coal Mining and Ecological Water Level Protection in Ecologically Fragile Areas, 1st ed.; Science Press: Beijing, China, 2010; pp. 1–195.
  19. Miao, L.T. Study on Hosting Pattern and Influence of the Seam Exploitation on Water Recourses in the Main Seam of Yushenfu Deposit. Master’s Thesis, Xi’an University of Science and Technology, Xi’an, China, 2008. (In Chinese)
  20. Li, Z.X.; Shen, X.L.; Li, M.P.; Wang, H.S. Occurrence regularity of uppermost minable coal seams and their harmful level of mining in Yushen mining area. Coal Geol. Explor. 2019, 47, 130–139. (In Chinese)
  21. Li, H.H.; Chen, D.Y.; Arzaghi, E.; Abbassi, R.; Xu, B.B.; Patelli, E.; Tolo, S. Safety assessment of hydro-generating units using experiments and grey-entropy correlation analysis. Energy 2018, 165, 222–234.
  22. Guo, W.B.; Wang, Y.G. The definition of high-intensity mining based on green coal mining and its index system. J. Min. Saf. Eng. 2017, 34, 616–623. (In Chinese)
  23. Verma, A.K.; Sirvaiya, A. Comparative analysis of intelligent models for prediction of Langmuir constants for CO2 adsorption of Gondwana coals in India. Geomech. Geophys. Geo-Energy Geo-Resour. 2016, 2, 97–109.
  24. Singh, T.N.; Kanchan, R.; Verma, A.K. Prediction of blast induced ground vibration and frequency using an artificial intelligent technique. Noise Vib. World 2004, 35, 7–15.
  25. Lu, C.G.; Zhang, S.A.; Xue, D.; Xiao, F.C.; Liu, C. Improved estimation of coalbed methane content using the revised estimate of depth and CatBoost algorithm: A case study from southern Sichuan Basin, China. Comput. Geosci. 2022, 158, 104973.
  26. Wang, X.; Yin, S.X.; Xu, B.; Cao, M.; Zhang, R.G.; Tang, Z.Y.; Huang, W.X.; Li, W.L. Study on height optimization prediction model of overburden water-conducting fracture zone under fully mechanized mining. Coal Sci. Technol. 2022, 1–13. (In Chinese)
  27. Singh, J.; Verma, A.K.; Banka, H.; Singh, T.N.; Maheshwar, S. A study of soft computing models for prediction of longitudinal wave velocity. Arab. J. Geosci. 2016, 9, 224.
  28. Hancock, J.T.; Khoshgoftaar, T.M. CatBoost for big data: An interdisciplinary review. J. Big Data 2020, 7, 94.
  29. Song, S.J. Study on the Stratification Transfer Prediction Method of the Mining Subsidence Based on the Key Geological and Mining Factors. Ph.D. Thesis, Xi’an University of Science and Technology, Xi’an, China, 2008. (In Chinese)
  30. Liu, G. A formula for calculating the development height of water flowing fractured zone. Int. Core. J. Eng. 2022, 8, 511–522.
  31. Wang, S.M.; Shen, Y.J.; Sun, Q.; Hou, E.K. Scientific issues of coal detraction mining geological assurance and their technology expectations in ecologically fragile mining areas of Western China. J. Min. Strata Control 2020, 2, 5–19. (In Chinese)
  32. Rudakov, D.; Westermann, S. Analytical modeling of mine water rebound: Three case studies in closed hard-coal mines in Germany. Min. Miner. Depos. 2021, 15, 22–30.
  33. Bazaluk, O.; Sadovenko, I.; Zahrytsenko, A.; Saik, P.; Lozynskyi, V.; Dychkovskyi, R. Forecasting Underground Water Dynamics within the Technogenic Environment of a Mine Field: Case Study. Sustainability 2021, 13, 7161.
Figure 1. The physiographic conditions of the study area.
Figure 2. Research technology roadmap.
Figure 3. Calculation results of grey entropy correlation degree.
Figure 4. The association between FWCZ and significant factors. (a) Mining depth vs. the height of FWCZ; (b) mining height vs. the height of FWCZ; (c) average thickness of sandstone vs. the height of FWCZ; (d) comprehensive hardness of cover rock vs. the height of FWCZ.
Figure 5. Schematic diagram of a BP neural network. Note: w1, b1 represent the weights and biases, respectively, from the input layer to the hidden layer. Similarly, w2, b2 represent the weights and biases, respectively, from the hidden layer to the output layer.
Figure 6. Performance of GA-BP neural network. (a) Error histogram; (b) regression performance.
Figure 7. Comparison of fitting effects of regression models. (a) The MSR model; (b) the GA-BP model; (c) the GA-CatBoost model.
Figure 8. Water conservation mining of Yushenfu mining area [31].
Table 1. Occurrence characteristics of the main coal seams in the study area [18,20].

| Coal Seam | Areal Range | Minable Area (km2) | Burial Depth (m) | Thickness (m) | Texture | Stability |
|---|---|---|---|---|---|---|
| 5−2 | Shenfu mining area | 5040 | 8.11–280.75 (133.43) | 0.53–6.66 (3.00) | Simple/Relatively simple | Relatively stable |
| | Yushen II and IV planning areas | | 46.81–198.80 (123.43) | 0.80–2.06 (1.00) | Simple/Relatively simple | Relatively stable |
| 4−3 | Yushen III and IV planning areas | 3995 | 50.61–179.86 (113.22) | 0.30–3.68 (2.00) | Simple | Relatively stable |
| 3−1 | Yushen III and IV planning areas | 7175 | 43.23–267.53 (107.24) | 0.50–12.19 (2.50) | Simple | Stable |
| 2−2 | Yushen I, III, and IV planning areas | >6012 | 51.80–555.75 (261.15) | 0.53–12.58 (5.00) | Simple | Stable |
| 1−2 | Yushen III and IV planning areas | 6084 | 173.13–588.21 (416.35) | 0–11.27 (8.00) | Simple/Relatively simple | Relatively stable |

Note: For burial depth and thickness, the average value is shown in parentheses after the range.
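Table 2. Quantification of the height of FWCZ and its influencing factors in the study area.

| No. | Drilling Hole | Coal Mine | X1 (m) | X2 (m) | X3 (m) | X4 | X5 | X6 | X7 | X8 | X0 (m) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | L1 | Liuxiang | 266.5 | 7.9 | 150 | 1.54 | 0.94 | 0.93 | 8.22 | 2.48 | 117.84 |
| 2 | ZK2-5 | Daliuta | 90.3 | 3.0 | 230 | 2.47 | 0.99 | 0.71 | 7.02 | 2.31 | 60.09 |
| 3 | ZK2-6 | Ningtiaota | 95.5 | 3.5 | 250 | 0.95 | 0.99 | 0.64 | 6.88 | 2.30 | 72.31 |
| 4 | 6 | Ningtiaota | 188.9 | 5.5 | 300 | 2.80 | 0.86 | 0.82 | 11.16 | 2.98 | 145.23 |
| 5 | ZK2-7 | Zhangjiamao | 142.1 | 3.0 | 300 | 6.89 | 0.72 | 0.87 | 3.52 | 2.28 | 68.66 |
| 6 | ZK13 | Zhangjiamao | 87.0 | 3.0 | 300 | 2.14 | 1.00 | 0.91 | 5.77 | 2.83 | 75.20 |
| 7 | ZK15 | Zhangjiamao | 77.1 | 3.0 | 300 | 2.30 | 0.98 | 0.89 | 6.72 | 2.98 | 75.60 |
| 8 | 9 | Zhangjiamao | 165.9 | 5.6 | 300 | 1.80 | 0.96 | 0.92 | 9.45 | 3.00 | 165.90 |
| 9 | ZK2-8 | Hongliulin | 169.2 | 3.0 | 350 | 7.22 | 0.73 | 0.89 | 3.81 | 2.26 | 70.28 |
| 10 | ZK2-9 | Jinjie | 167.4 | 5.0 | 250 | 7.11 | 0.70 | 0.85 | 4.51 | 2.33 | 131.13 |
| 11 | ZK2-10 | Liangshujing | 101.8 | 5.0 | 290 | 8.23 | 0.70 | 0.80 | 4.63 | 2.42 | 86.07 |
| 12 | ZK2-11 | Liangshujing | 194.6 | 5.0 | 290 | 8.55 | 0.89 | 0.77 | 4.58 | 2.51 | 141.26 |
| 13 | ZK2-12 | Liangshujing | 161.1 | 5.0 | 290 | 7.14 | 0.88 | 0.89 | 5.21 | 2.09 | 130.29 |
| 14 | ZK2-13 | Liangshujing | 35.5 | 5.0 | 290 | 0.64 | 0.96 | 1.00 | 3.06 | 2.03 | 54.22 |
| 15 | ZK2-14 | Liangshujing | 242.6 | 5.0 | 290 | 6.66 | 0.92 | 0.90 | 5.84 | 2.50 | 139.45 |
| 16 | ZK3-1 | Liangshujing | 327.1 | 5.5 | 290 | 3.67 | 0.79 | 0.99 | 10.22 | 2.55 | 105.40 |
| 17 | ZK3-2 | Liangshujing | 389.9 | 5.5 | 290 | 4.57 | 0.80 | 0.97 | 12.10 | 2.61 | 96.82 |
| 18 | ZK3-3 | Liangshujing | 383.6 | 5.5 | 290 | 2.39 | 0.77 | 0.98 | 11.09 | 2.58 | 100.11 |
| 19 | ZK3-4 | Xiaobaodang | 243.4 | 4.5 | 290 | 2.00 | 0.98 | 0.97 | 5.89 | 2.44 | 108.30 |
| 20 | ZK3-5 | Yushuwan | 242.2 | 4.5 | 295 | 1.88 | 0.96 | 0.91 | 7.58 | 2.45 | 114.40 |
| 21 | Y3 | Yushuwan | 278.5 | 5.5 | 255 | 1.31 | 0.96 | 0.91 | 7.25 | 2.59 | 128.00 |
| 22 | Y4 | Yushuwan | 280.5 | 5.5 | 297 | 1.35 | 0.96 | 0.93 | 6.17 | 2.92 | 138.30 |
| 23 | Y5 | Yushuwan | 287.5 | 5.5 | 255 | 1.29 | 0.99 | 0.94 | 10.02 | 2.88 | 135.40 |
| 24 | Y6 | Yushuwan | 265.5 | 5.5 | 255 | 0.92 | 0.98 | 0.96 | 5.41 | 3.03 | 118.60 |
| 25 | Y7 | Yushuwan | 272.8 | 5.0 | 250 | 1.58 | 0.88 | 0.89 | 6.58 | 2.97 | 57.71 |
| 26 | ZK3-6 | Yushuwan | 242.9 | 4.5 | 300 | 1.84 | 0.91 | 0.84 | 5.30 | 2.82 | 107.63 |
| 27 | ZK3-7 | Yushuwan | 279.3 | 5.0 | 300 | 1.33 | 0.96 | 0.93 | 6.17 | 2.56 | 137.3 |
| 28 | ZK3-8 | Yushuwan | 286.9 | 5.0 | 300 | 1.40 | 0.99 | 0.89 | 10.02 | 2.57 | 138.9 |
| 29 | ZK3-9 | Yushuwan | 275.5 | 5.0 | 300 | 1.04 | 0.99 | 0.92 | 5.89 | 2.61 | 117.80 |
| 30 | H3 | Yushuwan | 244.2 | 5.0 | 300 | 1.78 | 0.98 | 0.97 | 5.54 | 2.98 | 112.44 |
| 31 | H4 | Hanglaiwan | 244.8 | 5.0 | 300 | 1.90 | 0.96 | 0.91 | 7.58 | 2.98 | 116.20 |
| 32 | H5 | Hanglaiwan | 249.9 | 4.5 | 300 | 1.82 | 0.91 | 0.84 | 5.30 | 2.82 | 107.80 |
| 33 | H7 | Hanglaiwan | 233.4 | 4.5 | 300 | 2.13 | 0.97 | 0.86 | 5.92 | 2.71 | 93.90 |
| 34 | JT4 | Jinjitan | 241.8 | 5.5 | 300 | 3.00 | 0.96 | 0.93 | 4.68 | 2.83 | 126.40 |
| 35 | JT5 | Jinjitan | 240.6 | 5.5 | 300 | 3.00 | 0.93 | 0.87 | 4.85 | 2.82 | 146.18 |
| 36 | JT6 | Jinjitan | 241.6 | 5.5 | 300 | 3.29 | 0.95 | 0.91 | 4.25 | 2.86 | 120.25 |
| 37 | JSD1 | Jinjitan | 263.6 | 5.5 | 300 | 3.27 | 0.96 | 0.93 | 4.56 | 3.03 | 51.52 |
| 38 | JSD2 | Jinjitan | 260.6 | 5.5 | 300 | 3.37 | 0.93 | 0.87 | 5.08 | 3.01 | 109.72 |
| 39 | JSD3 | Jinjitan | 261.7 | 5.5 | 300 | 3.68 | 0.94 | 0.91 | 4.28 | 3.01 | 69.94 |
| 40 | ZP1 | Yuyang | 195.2 | 3.5 | 200 | 3.11 | 0.85 | 0.90 | 5.68 | 2.81 | 96.30 |
| 41 | ZP2 | Yuyang | 188.3 | 3.5 | 200 | 3.07 | 0.89 | 0.92 | 5.05 | 2.88 | 84.80 |
| 42 | ZK1-1 | Yujialiang | 40.1 | 1.5 | 300 | 1.44 | 0.99 | 0.96 | 6.28 | 2.11 | 42.33 |
| 43 | ZK1-2 | Yujialiang | 52.6 | 1.5 | 300 | 1.21 | 0.98 | 0.99 | 4.12 | 2.02 | 44.64 |
| 44 | ZK1-3 | Yujialiang | 45.2 | 1.5 | 300 | 1.39 | 0.98 | 0.93 | 5.33 | 2.08 | 40.02 |
| 45 | ZK2-1 | Yujialiang | 77.3 | 3.5 | 360 | 0.56 | 1.00 | 0.70 | 4.33 | 2.01 | 62.86 |
| 46 | ZK2-2 | Yujialiang | 179.1 | 5.0 | 400 | 7.14 | 0.99 | 0.88 | 7.19 | 2.89 | 123.89 |
| 47 | ZK2-3 | Yujialiang | 129.1 | 3.5 | 360 | 0.79 | 0.98 | 0.88 | 6.01 | 2.22 | 82.37 |
| 48 | ZK2-4 | Yujialiang | 106.9 | 3.5 | 360 | 3.65 | 0.99 | 0.99 | 8.62 | 2.12 | 80.11 |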
Table 3. Results of correlation analysis on factors influencing the height of FWCZ.

| Influencing Factors | Correlation Coefficient | p-Value | Significance |
|---|---|---|---|
| X1 | 0.544 ** | 0.000 | Significant |
| X2 | 0.674 ** | 0.000 | Significant |
| X3 | −0.058 | 0.696 | Not significant |
| X4 | 0.155 | 0.294 | Not significant |
| X5 | −0.027 | 0.857 | Not significant |
| X6 | 0.030 | 0.838 | Not significant |
| X7 | 0.333 * | 0.021 | Significant |
| X8 | 0.421 ** | 0.003 | Significant |

Note: “*” indicates p-value < 0.05; “**” indicates p-value < 0.01.
Table 4. Entropy method calculation results.

| Influencing Factors | Entropy Value Ei | Weight Coefficient ωi |
|---|---|---|
| X8 | 0.915 | 0.43100 |
| X7 | 0.948 | 0.26125 |
| X1 | 0.968 | 0.15932 |
| X2 | 0.971 | 0.14843 |
Table 5. Results of the GA-CatBoost regression model evaluation.

| Indicators | MSE | RMSE | MAE | MAPE | R2 |
|---|---|---|---|---|---|
| Training set | 0.169 | 0.405 | 0.340 | 0.402% | 1 |
| Test set | 135.103 | 11.623 | 9.032 | 9.37% | 0.872 |
Table 6. Example validation of the best prediction model.

| Working Face | Coal Mine | Mining Depth X1 (m) | Mining Height X2 (m) | Average Thickness of Sandstone X7 (m) | Comprehensive Hardness of Cover Rock X8 (MPa) | Height of FWCZ, Predicted Value (m) | Height of FWCZ, True Value (m) | Error (%) |
|---|---|---|---|---|---|---|---|---|
| 14202 | Zhangjiamao | 87.1 | 3.0 | 6.21 | 2.66 | 74.63 | 75.60 | −1.28 |
| 12-2 upper 0101 | Jinjitan | 256.8 | 5.5 | 5.35 | 2.87 | 116.49 | 111.49 | 4.48 |
| 122106 | Caojiatan | 286.3 | 6.0 | 7.93 | 2.93 | 132.47 | 139.15 | −4.80 |
| 2207 | Wulanmulun | 97.5 | 2.2 | 9.89 | 3.09 | 101.77 | 97.50 | −4.38 |
| 2305 | Hanjiawan | 105.0 | 4.4 | 9.65 | 3.02 | 109.15 | 110.11 | −0.87 |
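Table 7. Inverse prediction results of the fractured/mining height ratio.

| No. | X1 (m) | X2 (m) | X7 (m) | X8 (MPa) | Ratio | No. | X1 (m) | X2 (m) | X7 (m) | X8 (MPa) | Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 98.2 | 2.5 | 3 | 2.00 | 24.12 | 33 | 98.2 | 2.5 | 9 | 2.00 | 31.90 |
| 2 | 98.2 | 2.5 | 3 | 2.35 | 27.13 | 34 | 98.2 | 2.5 | 9 | 2.35 | 30.64 |
| 3 | 98.2 | 2.5 | 3 | 2.70 | 30.55 | 35 | 98.2 | 2.5 | 9 | 2.70 | 34.60 |
| 4 | 98.2 | 2.5 | 3 | 3.05 | 35.75 | 36 | 98.2 | 2.5 | 9 | 3.05 | 39.62 |
| 5 | 149.2 | 3.5 | 3 | 2.00 | 21.39 | 37 | 149.2 | 3.5 | 9 | 2.00 | 26.71 |
| 6 | 149.2 | 3.5 | 3 | 2.35 | 22.96 | 38 | 149.2 | 3.5 | 9 | 2.35 | 26.17 |
| 7 | 149.2 | 3.5 | 3 | 2.70 | 23.82 | 39 | 149.2 | 3.5 | 9 | 2.70 | 27.02 |
| 8 | 149.2 | 3.5 | 3 | 3.05 | 25.30 | 40 | 149.2 | 3.5 | 9 | 3.05 | 29.62 |
| 9 | 200.2 | 4.5 | 3 | 2.00 | 21.66 | 41 | 200.2 | 4.5 | 9 | 2.00 | 24.39 |
| 10 | 200.2 | 4.5 | 3 | 2.35 | 23.39 | 42 | 200.2 | 4.5 | 9 | 2.35 | 24.40 |
| 11 | 200.2 | 4.5 | 3 | 2.70 | 23.07 | 43 | 200.2 | 4.5 | 9 | 2.70 | 23.72 |
| 12 | 200.2 | 4.5 | 3 | 3.05 | 20.93 | 44 | 200.2 | 4.5 | 9 | 3.05 | 25.57 |
| 13 | 251.2 | 5.5 | 3 | 2.00 | 19.73 | 45 | 251.2 | 5.5 | 9 | 2.00 | 20.56 |
| 14 | 251.2 | 5.5 | 3 | 2.35 | 20.71 | 46 | 251.2 | 5.5 | 9 | 2.35 | 21.44 |
| 15 | 251.2 | 5.5 | 3 | 2.70 | 20.02 | 47 | 251.2 | 5.5 | 9 | 2.70 | 21.77 |
| 16 | 251.2 | 5.5 | 3 | 3.05 | 13.78 | 48 | 251.2 | 5.5 | 9 | 3.05 | 23.07 |
| 17 | 98.2 | 2.5 | 6 | 2.00 | 26.35 | 49 | 98.2 | 2.5 | 12 | 2.00 | 33.27 |
| 18 | 98.2 | 2.5 | 6 | 2.35 | 27.46 | 50 | 98.2 | 2.5 | 12 | 2.35 | 32.6 |
| 19 | 98.2 | 2.5 | 6 | 2.70 | 30.75 | 51 | 98.2 | 2.5 | 12 | 2.70 | 35.82 |
| 20 | 98.2 | 2.5 | 6 | 3.05 | 36.03 | 52 | 98.2 | 2.5 | 12 | 3.05 | 39.61 |
| 21 | 149.2 | 3.5 | 6 | 2.00 | 24.94 | 53 | 149.2 | 3.5 | 12 | 2.00 | 26.54 |
| 22 | 149.2 | 3.5 | 6 | 2.35 | 24.95 | 54 | 149.2 | 3.5 | 12 | 2.35 | 26.00 |
| 23 | 149.2 | 3.5 | 6 | 2.70 | 24.35 | 55 | 149.2 | 3.5 | 12 | 2.70 | 27.20 |
| 24 | 149.2 | 3.5 | 6 | 3.05 | 26.66 | 56 | 149.2 | 3.5 | 12 | 3.05 | 28.99 |
| 25 | 200.2 | 4.5 | 6 | 2.00 | 23.66 | 57 | 200.2 | 4.5 | 12 | 2.00 | 24.03 |
| 26 | 200.2 | 4.5 | 6 | 2.35 | 23.25 | 58 | 200.2 | 4.5 | 12 | 2.35 | 24.05 |
| 27 | 200.2 | 4.5 | 6 | 2.70 | 21.33 | 59 | 200.2 | 4.5 | 12 | 2.70 | 23.81 |
| 28 | 200.2 | 4.5 | 6 | 3.05 | 22.64 | 60 | 200.2 | 4.5 | 12 | 3.05 | 24.69 |
| 29 | 251.2 | 5.5 | 6 | 2.00 | 20.54 | 61 | 251.2 | 5.5 | 12 | 2.00 | 20.40 |
| 30 | 251.2 | 5.5 | 6 | 2.35 | 21.36 | 62 | 251.2 | 5.5 | 12 | 2.35 | 21.28 |
| 31 | 251.2 | 5.5 | 6 | 2.70 | 20.85 | 63 | 251.2 | 5.5 | 12 | 2.70 | 21.68 |
| 32 | 251.2 | 5.5 | 6 | 3.05 | 20.79 | 64 | 251.2 | 5.5 | 12 | 3.05 | 22.10 |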
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Gu, L.; Shen, Y.; Wang, N.; Kou, H.; Song, S. Prediction of the Height of Fractured Water-Conducting Zone: Significant Factors and Model Optimization. Water 2023, 15, 2720. https://doi.org/10.3390/w15152720

