1. Introduction
Rockbursts are a threatening phenomenon characterized by their suddenness, destructiveness, and complexity [1]. They can cause serious consequences, such as equipment damage and injuries [2]. With the rapid growth of global demand for mineral resources, deep underground mining has become an inevitable trend in the mining industry, and rockbursts are particularly challenging for engineers [3]. Despite the progress of machine learning and deep learning methods in rockburst prediction, existing studies have obvious shortcomings in feature selection: they generally rely on the correlation between indicators while neglecting the correlation between indicators and rockburst grade, and they lack a systematic assessment of indicator systems of different dimensions.
This study proposes an optimization method based on multidimensional feature selection and ensemble learning. It constructs two types of evaluation systems, the indicator–indicator system and the indicator–rockburst grade system, each ranging from seven to three dimensions, and combines them with six models, including XGBoost and CatBoost, to address three questions: (1) the mechanism by which tree models resist multicollinearity; (2) the performance difference between the indicator–indicator system and the indicator–rockburst grade system; and (3) the basis for selecting the optimal indicator dimension. The experiments show that CatBoost performs best in both types of systems and that the six-dimensional system balances performance and complexity.
The innovations of this study are (1) showing that tree models do not require PCA dimensionality reduction; (2) verifying the validity of the indicator–rockburst grade system; and (3) establishing the engineering applicability of the six-dimensional indicator system. The results provide new theoretical support and a practical reference for rockburst prediction.
  2. Literature Review
With the increasing scale of mining and underground engineering, especially the rapid development of deep underground mining, the problem of predicting rockburst hazards has become increasingly complex and challenging. To address this challenge, many scholars have extensively explored advanced machine learning techniques to predict the occurrence of rockbursts, giving rise to a large number of data-driven research efforts [4]. Among the many approaches, machine learning and deep learning models have been widely used in rockburst prediction due to their superior data modeling capabilities and good predictive performance. Sun et al. [5] proposed a prediction framework that combines a random forest-based metric weight optimization method (RF-CRITIC) with an improved cloud model and can effectively predict short-term rockbursts. The method significantly improves the reliability and accuracy of prediction through multiple feature selection and fusion. Shen et al. [6] proposed a random forest model optimized with Optuna (Op-RF), which significantly improves prediction performance through efficient hyperparameter optimization. Their validation results showed that the method achieved an area-under-the-curve (AUC) score of 0.984 in rockburst prediction, demonstrating a strong classification capability. Li et al. [7] proposed the DeepForest model, which accounts for the contribution of each input variable to the occurrence of rockbursts by means of a multilevel ensemble learning mechanism and sensitivity analysis. The authors demonstrated the effectiveness of the model with fewer input parameters, especially for rockburst prediction in deep mines. In addition, Liu et al. [8] further extended the application of deep learning in rockburst prediction by employing a deep learning algorithm with a complex network structure and verified the great potential of deep learning models in this task.
In rockburst grade prediction, the selection of appropriate evaluation indicators plays a crucial role in the effectiveness of the model. If there are too few discriminating indicators, the key factors affecting rockbursts may not be fully reflected; however, if too many indicators are selected, the difficulty of data collection and processing is substantially increased [9]. Moreover, too many indicators may introduce redundant information and thereby affect the prediction performance of the model. Therefore, how to balance the quantity and quality of indicators is always an important challenge in research. Currently, researchers take different approaches to choosing evaluation indicators for rockburst prediction. For example, Xue et al. [10] directly selected six indicators as input variables when building the particle swarm optimized extreme learning machine model (PSO-ELM), namely, the maximum tangential stress of the surrounding rock (σθ), the uniaxial compressive strength of the rock (σc), the tensile strength (σt), the stress ratio (σθ/σc), the brittleness ratio of the rock (σc/σt), and the elastic energy index (Wet). Shukla et al. [11] directly chose four indicators, namely, maximum tangential stress, elastic energy index, uniaxial compressive strength, and uniaxial tensile strength. Meanwhile, Li et al. [7] used seven indicators in their deep forest model, namely, maximum tangential stress, uniaxial compressive strength, tensile strength, elastic strain energy index, stress concentration factor, and rock brittleness indices B1 and B2 of the surrounding rock, which significantly expanded the evaluation dimensions. On the other hand, the method proposed by Lin et al. [12] eliminates indicators that are highly correlated with each other through correlation analysis and retains only four indicators, namely, maximum tangential stress, uniaxial compressive strength, tensile strength, and the elastic strain energy index of the surrounding rock, which optimizes the input features by reducing redundant information. Similarly, Faradonbeh et al. [13] considered the correlation between the indicators and selected only four indicators, namely, maximum tangential stress, uniaxial compressive strength, tensile strength, and elastic energy index of the surrounding rock, in order to prevent multicollinearity from affecting the modeling process and the complexity of the final model. Armaghani et al. [14] likewise used the correlation between indicators as the criterion for selecting indicators. It can be seen that most studies on rockburst prediction tend to use the correlation between indicators as a criterion to obtain a weakly correlated evaluation indicator system.
However, a comparison of several studies reveals that the correlation between evaluation indicators does not always directly affect the performance of predictive models. For example, in the variational autoencoder–natural gradient boosting model (VAE-NGBoost) proposed by Lin et al. [12], the highest correlation coefficient between the input indicators was 0.582, which is relatively low, and the model achieved an accuracy of 0.921. The deep forest model of Li et al. [7] achieved an accuracy of 0.924 even though the highest correlation between its input variables was 0.9. In addition, the correlation between the indicators selected by Wang et al. [15] was 0.788, and the accuracy of the final model was 0.946. These results show that the correlation between the evaluation indicators is not a decisive factor for the model’s effectiveness. On the other hand, the correlation between the evaluation indicators and the rockburst grade also plays a non-negligible role in model performance. Generally speaking, choosing indicators that are highly correlated with the rockburst grade as inputs can improve the predictive ability of the model. However, current research on this aspect is still weak, and most existing studies do not use the correlation between the indicators and the rockburst grade as a criterion to select evaluation indicators.
  3. Purpose of This Study
From the existing literature review, it can be found that the current rockburst prediction research generally adopts the correlation analysis between indicators as the main basis for feature selection, aiming to construct a set of weakly correlated evaluation indicator systems as model input. However, empirical studies have shown that the statistical correlation between indicators has no significant effect on the performance of prediction models. It is worth noting that there is still a lack of comparative studies on different indicator systems in rockburst prediction, and there is a serious lack of empirical analyses to systematically evaluate the predictive efficacy of each indicator system. In addition, existing studies rarely consider the correlation between the indicators and the rockburst level as the feature selection criterion, and this approach has not yet been fully explored in related fields of research.
To explore the influence of the choice of evaluation indicators on the performance of the prediction model, this study uses the indicator–indicator correlation and the indicator–rockburst grade correlation as selection criteria. Seven indicators serve as the basis: the maximum tangential stress (σθ), the uniaxial compressive strength (σc), the tensile strength (σt), the elastic strain energy index (Wet), the stress concentration factor (SCF), the rock brittleness index B1 (B1 = σc/σt), and the rock brittleness index B2 (B2 = (σc − σt)/(σc + σt)). Different numbers of indicators (seven, six, five, four, and three) are used as input variables for six mainstream algorithms, comprising four ensemble learning algorithms (XGBoost, CatBoost, LightGBM, and random forest (RF)) and two traditional algorithms (support vector machine (SVM) and multilayer perceptron (MLP)), each tuned with the Optuna hyperparameter optimization framework, to carry out a comparative analysis. By comparing the prediction performance of each model under the different indicator systems, we aim to assess the specific impact of each indicator combination on model performance and provide a theoretical basis for optimizing rockburst prediction models.
  4. Materials and Methods
  4.1. Data Sources
Drawing on previous rockburst research, this study collected 330 rockburst-related engineering cases from domestic and international projects, as shown in Table 1. According to the classification criteria of rockbursts shown in Table 2, the rockburst grade was divided into four categories: none, light, moderate, and strong.
  4.2. Data Description and Analysis
The distribution of rockburst categories is shown in Figure 1: no rockburst (53 cases), light rockburst (101 cases), moderate rockburst (120 cases), and strong rockburst (56 cases). In this study, the dataset contains seven influence factors: maximum tangential stress (σθ), uniaxial compressive strength (σc), tensile strength (σt), elastic strain energy index (Wet), stress concentration factor (SCF), rock brittleness index B1 (B1 = σc/σt), and rock brittleness index B2 (B2 = (σc − σt)/(σc + σt)).
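The two brittleness indices are simple functions of σc and σt, so they can be derived directly from the measured strengths. The following is a minimal sketch; the function name and the sample strengths are hypothetical and are not taken from the dataset.

```python
def brittleness_indices(sigma_c, sigma_t):
    """Return (B1, B2) from the uniaxial compressive strength sigma_c and the
    tensile strength sigma_t, both given in the same unit (e.g., MPa)."""
    if sigma_c <= 0 or sigma_t <= 0:
        raise ValueError("strengths must be positive")
    b1 = sigma_c / sigma_t                          # B1 = sigma_c / sigma_t
    b2 = (sigma_c - sigma_t) / (sigma_c + sigma_t)  # B2 = (sc - st) / (sc + st)
    return b1, b2


# Hypothetical sample: sigma_c = 120 MPa, sigma_t = 6 MPa
b1, b2 = brittleness_indices(120.0, 6.0)
print(b1, round(b2, 3))  # 20.0 0.905
```

Because B2 = (B1 − 1)/(B1 + 1), B2 increases monotonically with B1, so the two indices carry closely related information.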
The violin plot of the rockburst dataset is shown in Figure 2. A violin plot combines a box plot with a density estimate and can effectively show the overall distribution of the data. The width of the violin at a given value reflects how densely the samples are concentrated around that value: wide sections mark value ranges that contain many samples, while narrow sections mark sparsely populated ranges. The box portion of the violin plot represents the median and interquartile range of the data, and the density of the scatter reflects the degree of concentration of the data in a certain value interval. Figure 2 shows that the dataset contains outliers and suggests a category imbalance or sampling bias; the outliers may be related to samples that were collected under specific operating conditions.
Further, the scatter and distribution density plots in Figure 3 demonstrate the data distribution relationships between features and the distribution of each rockburst category on a single feature, revealing significant differences in distribution and magnitude between the categories. Therefore, to address these issues, appropriate data enhancement is particularly necessary in the early stages of data processing.
Table 3 provides the descriptive statistics of each indicator (standard deviation, kurtosis, maximum, mean, median, and range).
   4.3. XGBoost
XGBoost (eXtreme gradient boosting) is an efficient machine learning algorithm that is deeply optimized on top of the gradient boosting decision tree (GBDT) framework. Its core mechanism is to continuously optimize the objective function by gradually adding new decision trees, and each tree is trained based on the residuals of the previous trees so as to gradually reduce the error between the model’s predicted value and the true value. As the decision trees continue to accumulate, the value of the loss function gradually decreases, pushing the model toward the optimal solution. This incremental learning approach has led to XGBoost being widely used in many fields, such as wind power prediction [23], wildfire disaster risk assessment [24], financial market trend prediction [25], and so on [26].
In terms of mathematical construction, XGBoost’s objective function $\mathrm{Obj}$ consists of two parts: the loss function term and the regularization term. The loss function measures how well the model fits the data, while the regularization term acts as a penalty mechanism to control the complexity of the model and prevent overfitting. This optimization strategy improves the generalization ability of the model while ensuring that it still maintains excellent computational efficiency when dealing with large-scale datasets. The formulas for the objective function and regularization term of XGBoost are as follows:
$$\mathrm{Obj} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega\left(f_k\right) \tag{1}$$
$$\Omega\left(f_k\right) = \gamma T + \frac{1}{2}\lambda \sum_{j=1}^{T} w_j^2 \tag{2}$$
In Equations (1) and (2), $\hat{y}_i$ represents the predicted output of sample $x_i$, while its corresponding true value is $y_i$. The model consists of $K$ subtrees, and the output of the $k$th subtree is denoted as $f_k$, whose complexity is bounded by the regularization term $\Omega(f_k)$. The hyperparameters $\gamma$ and $\lambda$ together regulate the optimization process of the tree, where $T$ refers to the number of leaf nodes in the decision tree, and $w_j$ indicates the value of the $j$th leaf node. In addition, the training error of sample $x_i$ is measured by the loss function $l(y_i, \hat{y}_i)$, which affects the learning effect of the overall model.
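To make the two parts of the objective concrete, the sketch below evaluates a loss term plus a complexity penalty of the standard XGBoost form, γT plus one half of λ times the sum of squared leaf values. The squared-error loss, the function names, and the toy numbers are assumptions chosen for illustration; the loss actually used in this study is not specified here.

```python
def tree_penalty(leaf_values, gamma, lam):
    """Complexity penalty of one tree: gamma * T + 0.5 * lam * sum(w_j ** 2),
    where T is the number of leaves and w_j are the leaf values."""
    num_leaves = len(leaf_values)
    return gamma * num_leaves + 0.5 * lam * sum(w * w for w in leaf_values)


def objective(y_true, y_pred, trees, gamma=1.0, lam=1.0):
    """Summed loss over all samples plus the penalty of every tree.
    Squared error is used as the loss purely for illustration (assumption)."""
    loss = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    penalty = sum(tree_penalty(leaves, gamma, lam) for leaves in trees)
    return loss + penalty


# Toy example: two samples and one tree with two leaves
value = objective([1.0, 0.0], [0.5, 0.0], trees=[[0.5, -0.5]], gamma=1.0, lam=2.0)
print(value)  # 0.25 (loss) + 2.5 (penalty) = 2.75
```

A larger γ penalizes every additional leaf, while a larger λ shrinks the leaf values; both push the model toward simpler trees.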
  4.4. LightGBM
Compared with XGBoost, LightGBM [27], as an emerging gradient boosting tree model, demonstrates higher computational efficiency in its algorithm design. Its core optimization lies in the introduction of the histogram algorithm, which effectively reduces memory usage and computational overhead and thus ensures efficient operation of the model on large-scale datasets.
Most traditional tree learning frameworks, including XGBoost, adopt a layer-by-layer (level-wise) growth strategy, i.e., all nodes in the same layer are expanded at the same time to ensure a balanced expansion of the tree. LightGBM instead introduces a leaf-by-leaf (leaf-wise) growth mechanism. Rather than following the hierarchical expansion, the method always splits the leaf node with the largest split gain, i.e., the largest reduction in loss, which enables the model to converge more quickly for the same number of leaves.
In summary, the layer-by-layer approach is robust and orderly, while the leaf-by-leaf approach is more adaptive and captures localized changes in data features more accurately.
  4.5. CatBoost
CatBoost, an open-source ensemble model based on gradient boosting [28], has shown excellent performance in complex classification and regression tasks involving highly nonlinear data due to its powerful learning capabilities.
Traditional gradient boosting methods estimate the gradient and construct the model on the same sample set, which introduces prediction bias. CatBoost focuses on solving this problem. The accumulation of gradient bias not only affects model stability but may also trigger the target leakage problem [29], which in turn weakens the generalization ability of the model.
To cope with such risks, CatBoost adopts an ordered boosting framework, in which the gradient for each sample is estimated by a model trained only on the samples that precede it in a random permutation, and it builds symmetric trees that apply the same splitting criterion to all nodes of a level. This effectively suppresses the negative effects of gradient bias and prediction bias. As a result, the algorithm’s resistance to overfitting is enhanced, and its accuracy and generalization ability are improved, making it more adaptable and stable in complex data environments.
  4.6. Random Forests
Random forest (RF) is an ensemble learning method that effectively overcomes the overfitting problem that may result from a single decision tree by constructing multiple decision trees and combining their results to make predictions [30]. Its basic algorithmic process is as follows.
First, the number of trees (N) and the number of randomly selected features (m) in each tree are set, and the training data are prepared. Next, in constructing each tree, the following steps are followed sequentially: (1) random sampling from the training set; (2) random selection of a subset of features; and (3) construction of a decision tree.
In the task of classifying rockburst intensity, the RF model employs a voting mechanism in which the majority vote determines the final classification result, a strategy that makes the overall prediction more robust. It is formulated as follows:
$$\hat{y} = \arg\max_{c} \sum_{n=1}^{N} I\left(h_n(x) = c\right)$$
where $N$ denotes the number of decision trees, the prediction of the $n$th tree for sample $x$ is $h_n(x)$, $c$ ranges over the rockburst grades, and $I$ is an indicator function.
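The voting rule can be sketched in a few lines. The integer encoding of the rockburst grades (0 = none to 3 = strong) and the tie-breaking rule are assumptions made for this illustration only.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees.
    Ties are broken by the smallest label so the result is deterministic
    (an illustrative choice, not part of the voting formula)."""
    counts = Counter(tree_predictions)
    most_votes = max(counts.values())
    return min(label for label, votes in counts.items() if votes == most_votes)


# Five hypothetical trees vote on one sample
print(majority_vote([2, 1, 2, 3, 2]))  # 2
```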
  4.7. Support Vector Machines
Support vector machine (SVM) is a machine learning algorithm widely used in classification and regression tasks. It maps data to a higher-dimensional space by means of the kernel trick and constructs an optimal separating hyperplane that can efficiently differentiate between classes of data. The core idea of the algorithm is to minimize classification errors while maximizing the margin, i.e., the distance between the hyperplane and the nearest data points. The wider the margin, the smaller the expected classification error, and such a separation makes the boundaries between classes clearer. By optimizing this separating function, SVM can achieve more accurate classification results [31].
  4.8. Multilayer Perceptron
Multilayer perceptron (MLP) is a feedforward artificial neural network used to perform complex data analysis. The brain, which served as the inspiration for the neural network architecture, relies on a vast number of neurons that transmit electrical signals through intricate connections and coordinate to process information [32]. Artificial neural networks work on a similar principle, consisting of artificial neurons that work together to solve problems. Each neuron consists of four basic components: input, weights, activation function, and output. The input data can come from other neurons or from the external environment, while the weights determine how strongly each input signal, such as the output of a neuron in the previous layer, affects the current neuron [33].
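The four components of a neuron can be illustrated with a single forward pass. In this sketch, the sigmoid activation and the optional bias term are assumptions added for illustration; they are not specified in the description above.

```python
import math


def neuron(inputs, weights, bias=0.0):
    """Output of one artificial neuron: the weighted sum of the inputs
    (plus an optional bias, an assumption here) passed through a sigmoid
    activation function (also an illustrative choice)."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Two inputs whose weighted contributions cancel: z = 0.5 - 0.5 = 0
print(neuron([1.0, 2.0], [0.5, -0.25]))  # 0.5
```

An MLP stacks layers of such neurons, with the outputs of one layer serving as the inputs of the next.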
  4.9. Optuna
As a next-generation hyperparameter optimization framework, Optuna [34] improves optimization efficiency by combining a dynamically constructed parameter search space with adaptive pruning algorithms. The workflow starts with the definition of the optimization problem: researchers explicitly define the objective function, the parameter types, and their value ranges. In the iterative optimization phase, a sampler (e.g., Bayesian optimization or an evolutionary algorithm) proposes parameter combinations, the intermediate results of each trial are evaluated as training proceeds, and trials that appear unpromising are terminated early (pruned). This focused search strategy directs computational resources toward the most promising regions of the parameter space until the preset termination conditions (e.g., number of trials or accuracy threshold) are satisfied, after which the best hyperparameter configuration found is output.
  4.10. Principal Component Analysis
Principal component analysis (PCA) is a classical unsupervised dimensionality reduction method whose core idea is to transform the original high-dimensional features into linearly uncorrelated low-dimensional variables (principal components) by an orthogonal transformation while retaining as much of the variance in the data as possible [35]. The method is widely used for eliminating redundant information among features, improving the computational efficiency of models, and visualizing high-dimensional data.
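As an illustration of the idea, the sketch below extracts the first principal component of a small dataset by power iteration on the sample covariance matrix and reports the share of variance it explains. This is a simplified stand-in for a full PCA routine; the function name, the iteration count, and the toy data are assumptions.

```python
import math


def first_principal_component(rows, iters=200):
    """Return (unit direction, explained-variance ratio) of the first
    principal component, found by power iteration on the covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d)
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue,
    # provided the start vector is not orthogonal to it.
    v = [float(j + 1) for j in range(d)]
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("degenerate data or unlucky start vector")
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                     for a in range(d))
    total_variance = sum(cov[a][a] for a in range(d))
    return v, eigenvalue / total_variance


# Two perfectly collinear toy features: one component carries all the variance
direction, ratio = first_principal_component([[1, 2], [2, 4], [3, 6], [4, 8]])
print([round(x, 3) for x in direction], round(ratio, 3))  # [0.447, 0.894] 1.0
```

When features are strongly correlated, as in this toy case, a single component captures almost all of the variance, which is why PCA is often used to remove redundancy among indicators.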
  5. Delineation of the System of Indicators for Rockburst Prediction
In order to construct the rockburst prediction indicator system, this study analyzed the correlation of the rockburst prediction sample data using the Pearson correlation coefficient to assess the interrelationships between the seven key characteristic parameters of rockbursts. The analysis results are shown in Figure 4. According to the analysis, there was a strong correlation between certain indicators. For example, the correlation coefficient between σθ and SCF was as high as 0.9, indicating that the trends of these two variables in the dataset were highly consistent. In addition, the correlation coefficient between σt and B1 was −0.63 and that between σt and B2 was −0.69, which indicated that these two brittleness indices showed a strong negative correlation with σt, while a strong positive correlation (correlation coefficient of 0.73) existed between B1 and B2. The absolute values of the correlation coefficients between the other variables were less than or equal to 0.48, indicating a moderate or weak correlation between them.
Further analysis reveals that σθ, SCF, and Wet have the strongest correlations with the rockburst grade, with correlation coefficients of 0.52, 0.41, and 0.49, respectively, which may be closely related to the strong intrinsic relationship between these variables. On the other hand, σt and σc showed a moderate correlation with the rockburst grade, while the correlation between B1 and B2 and the rockburst grade was relatively weak.
Combining the above analysis, this study fully considered the correlation between the evaluation indicators (indicator–indicator system, hereafter referred to as I-I) and their correlation with the rockburst grade (indicator–rockburst grade system, hereafter referred to as I-R). Based on these correlation characteristics, the evaluation system of rockburst prediction was classified into nine types (as shown in Table 4).
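The Pearson coefficient used in this analysis, and one plausible way of applying the I-R criterion (ranking indicators by the absolute value of their correlation with the rockburst grade), can be sketched as follows. The ranking helper, the integer grade encoding, and the toy data are assumptions for illustration; the actual indicator combinations are those listed in Table 4.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rank_by_grade_correlation(features, grade):
    """Sort indicator names by |r| with the rockburst grade, strongest first
    (a hypothetical helper illustrating an I-R style selection)."""
    return sorted(features,
                  key=lambda name: abs(pearson(features[name], grade)),
                  reverse=True)


# Toy data: grade encoded as 0..3; indicator "a" tracks it perfectly
grade = [0, 1, 2, 3]
features = {"a": [1, 2, 3, 4], "b": [1, 3, 2, 4]}
print(rank_by_grade_correlation(features, grade))  # ['a', 'b']
```

Keeping the first k names of such a ranking would give a k-dimensional I-R style system, whereas an I-I style selection would instead drop one indicator from each highly correlated pair.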
  8. Discussion
(1) Aspects of the multicollinearity problem.
As can be seen from Figure 8 and Table 6, the performance of XGBoost, LightGBM, CatBoost, and RF decreased after PCA treatment of the indicator system, with the most significant decrease for CatBoost. In contrast, the performance of SVM and MLP improved very slightly. It is especially noteworthy that XGBoost, LightGBM, CatBoost, and RF are all tree models.
In fact, tree models are good at capturing nonlinear relationships (e.g., interactions and piecewise functions) between features and targets [36], which may be destroyed by the linear transformation of PCA. Meanwhile, tree models can themselves deal effectively with the multicollinearity problem [37]. Therefore, applying PCA to the data before inputting them into XGBoost, LightGBM, CatBoost, and RF may instead reduce model performance. The experimental results of this study support this view. In addition, Cha et al. [38] reported similar results in their experiments: the R² of the decision tree (DT) model was 0.872, while that of the principal component analysis–decision tree (PCA-DT) model was 0.849, so the model performed worse after PCA treatment. DT is also a tree model. It can be seen that tree models (XGBoost, LightGBM, CatBoost, and RF) inherently have an excellent ability to deal with nonlinear relationships and multicollinearity, so there is often less need to consider multicollinearity when using tree models for prediction tasks.
In addition, for SVM and MLP, although the modeling effect is improved after PCA treatment, the improvement is very limited, with a maximum improvement of only 2.38%. This indicates that the multicollinearity problem of the original data is not serious.
(2) Comparison of the performance of the models.
As can be seen from Tables 7–14, in the I-I system, CatBoost had the highest average accuracy (0.6950), average precision (0.7106), average recall (0.6950), and average F1 score (0.6944), which was significantly better than the other models. In the I-R system, CatBoost also performed outstandingly, with the highest average accuracy (0.6913), average precision (0.7052), average recall (0.6913), and average F1 score (0.6907). In addition, CatBoost’s stability was significantly better than that of the other models in both the I-I system and the I-R system. CatBoost’s excellent performance may be attributed to its efficient processing of categorical features and its ability to resist overfitting, which allow it to better capture nonlinear relationships in rockburst prediction.
LightGBM’s overall performance in the I-I and I-R systems was second only to CatBoost, with average accuracies of 0.6801 and 0.6863, respectively. It is also worth noting that LightGBM’s training speed was significantly faster than that of the other models, which makes it suitable for real-world application scenarios that require fast responses.
The average accuracy of XGBoost in the I-I and I-R systems was 0.6776 and 0.6751, respectively, which is a moderate but stable performance.
RF performed well in the I-I system, with an average accuracy (0.6873) second only to CatBoost, but RF’s performance in the I-R system was significantly degraded, with a more pronounced drop in performance in the low-dimensional system.
SVM and MLP performed poorly in all the indicator systems. As can be seen from Figures 9–12, the performance indicators of the SVM and MLP models were significantly inferior in all aspects to those of the other four models, especially in the I-R system, where the performance was even worse.
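The four metrics compared above can be computed from true and predicted grades as sketched below. Support-weighted averaging over the classes is an assumption here, since the averaging scheme is not stated in this section; it is, however, consistent with the reported average recall being equal to the average accuracy, because support-weighted recall always equals accuracy. The function name and toy labels are hypothetical.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, F1), where the per-class
    precision, recall, and F1 are averaged with class-frequency weights."""
    n = len(y_true)
    support = Counter(y_true)     # number of true samples per class
    predicted = Counter(y_pred)   # number of predictions per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    accuracy = sum(hits.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        p = hits[label] / predicted[label] if predicted[label] else 0.0
        r = hits[label] / count
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = count / n
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1


# Toy labels for three classes
acc, prec, rec, f1 = weighted_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0])
print(round(acc, 4), round(prec, 4), round(rec, 4))  # 0.6667 0.7222 0.6667
```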
(3) Comparisons across indicator systems.
Under the I-I system, as can be seen in Tables 7–14, the average accuracy of all models decreased from 0.6632 to 0.6529 when the dimensionality was reduced from seven to four, which is a relatively small decrease (about 1.5%). CatBoost and RF showed the smallest performance fluctuation as the dimensionality decreased, which indicates that they are more robust to redundant indicators. The performances of the six- and five-dimensional systems were close to those of the seven-dimensional system. As can be seen from Figures 13–16, the I-I system was somewhat less stable, with a more pronounced decrease in model performance in the four-dimensional system and a sudden rise in the three-dimensional system. The drop in the four-dimensional system suggests that oversimplification may lead to the loss of key information. The sudden increase in the three-dimensional system may arise because reducing the dimensionality from four to three happens to retain the decisive indicators that are strongly correlated with rockbursts while eliminating redundant or noisy features.
In the I-R system, the average accuracy decreased from 0.6632 to 0.6508 when the dimensionality was reduced from seven to three, a decrease (about 1.9%) that is slightly larger than that of the I-I system. As shown in Figures 13–16, the I-R system was more sensitive to changes in dimensionality, especially in the four- and three-dimensional systems, where the decrease in performance was more pronounced, with a steady downward trend overall.
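The relative decreases quoted for the two systems follow directly from the reported average accuracies. A minimal check, with a hypothetical helper name, is given below.

```python
def relative_drop(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


i_i_drop = relative_drop(0.6632, 0.6529)  # I-I system, seven to four dimensions
i_r_drop = relative_drop(0.6632, 0.6508)  # I-R system, seven to three dimensions
print(round(i_i_drop, 2), round(i_r_drop, 2))  # 1.55 1.87
```

The two values correspond to the decreases of roughly 1.5% and 1.9% discussed above.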
Comparing the four performance indicators in Figures 9–12, the overall average performance of the I-I system was better than that of the I-R system, but the I-I system was unstable. In the four- and three-dimensional systems, although the average performance indicators of the I-R system were not as good as those of the I-I system, its performance declined more steadily as the dimensionality decreased, without sudden changes. This may be related to the selection criterion of the I-R system, whose indicators are chosen for their close correlation with the rockburst grade.
In general, the performance of the six-dimensional system of the I-I system and the I-R system was very close to that of the seven-dimensional system. The performance of the five-dimensional system decreased obviously, but it was still better than that of the four-dimensional and three-dimensional systems. The six-dimensional system of the I-I system and the six-dimensional system of the I-R system reduced the redundancy of the indexes. Therefore, it would be more suitable for practical application while guaranteeing the performance is close to that of the seven-dimensional system. If higher requirements are placed on the model’s performance, the seven-dimensional system of the I-I system can be selected. However, the performance of the five-dimensional, four-dimensional, and three-dimensional systems of the I-R system declines significantly, so they are not recommended for use in actual engineering applications.
  9. Summary
  9.1. Conclusions
This study draws the following conclusions by comparing the performance of different models and indicator systems in rockburst prediction:
(1) Tree models (e.g., CatBoost and LightGBM) are naturally resistant to multicollinearity, and PCA preprocessing can destroy the nonlinear feature relationships they rely on, leading to performance degradation; when using tree models, the original features can be retained directly, avoiding unnecessary dimensionality reduction.
(2) Model performance: CatBoost has the best overall performance (highest accuracy and stability); LightGBM ranks second and trains the fastest; XGBoost and RF are stable; and SVM and MLP lag behind significantly.
(3) Indicator system: The six-dimensional system is suitable for practical applications, as it reduces redundancy while retaining performance close to that of the seven-dimensional system; the seven-dimensional system is suitable for high-precision needs, while systems with five or fewer dimensions may lead to loss of information and degraded model performance; the I-I system performs better but fluctuates considerably, and the I-R system is more stable.
  9.2. Significance and Contribution of This Study
(1) For the first time, the natural resistance of tree models to multicollinearity has been systematically verified, and it is made clear that linear dimensionality reduction methods, such as PCA, destroy the nonlinear feature relationships that tree models depend on.
(2) The empirical study shows that the traditional feature selection method based on low correlation between indicators does not significantly improve the prediction performance. This finding challenges the feature selection paradigm commonly adopted in the current rockburst prediction research and provides a new theoretical basis for subsequent research.
(3) By introducing the feature selection method (I-R system) of correlation between indicators and rockburst grade and verifying its prediction efficacy, this study provides another idea for constructing an indicator system for rockburst prediction that makes up for the lack of exploration of this method in existing studies.
(4) For the first time, the differences in the predictive performance of different dimensional index systems have been systematically evaluated, and the advantages of the six-dimensional system in balancing the accuracy of the model and the engineering practicability have been clarified, which provides a reliable basis for decision-making in practical engineering applications.
  9.3. Limitations
(1) Only seven indicators commonly used in rockburst prediction (e.g., σθ, σc, etc.) were considered, and environmental factors such as geological formations, groundwater, etc., were not included, which may have omitted key predictive variables.
(2) Uneven distribution of samples (53 cases without rockburst vs. 120 cases with moderate rockburst) may affect the model’s generalization ability.
(3) The conclusions of this study were drawn from a specific dataset (N = 330). Although internal validity was ensured through cross-validation, the findings may be subject to a certain degree of chance because the samples come from a single type of source (all were derived from literature cases). The generalizability of the conclusions still needs to be verified by other means in the future.