1. Introduction
Rockburst seriously affects rock stability in deep underground conditions and continues to attract considerable attention [1,2]. In civil and mining engineering, rockburst events normally occur suddenly and cause economic losses through damage to working facilities. Accurately evaluating rockburst intensity is therefore a significant task, as it provides a guideline for the field and helps managers design with care [3,4].
Rockburst occurs under different conditions, such as tunneling and mining [5,6,7]. For instance, in a deep traffic tunnel in China, different grades of rockburst have caused different types of damage. Slight rockburst causes cracks in the concrete at the tunnel face; moderate rockburst produces arc-shaped cavity pits about 1 m deep; intense rockburst produces arc- and wedge-shaped pits about 2 m deep; and extremely intense rockburst almost destroys the working conditions, leaving pits about 3 m deep. Therefore, classifying and predicting rockburst intensity plays a significant role in working safety.
The mechanism of rockburst is still not clear, but its basic laws are understood as instantaneous slip and instantaneous fracturing. To control rockburst, different measures have been proposed, such as temporary and permanent rock support systems; however, these approaches are inefficient because the rockburst intensity is difficult to determine reliably. Thus, monitoring methods such as microseismic monitoring systems have been applied to record and analyze rockburst events [8]. However, a microseismic monitoring system records the rockburst intensity only after an event and cannot predict rockburst in advance. Hence, estimating and predicting rockburst intensity before its occurrence is important. Different models have been proposed, such as stress criteria, including the Barton, Hoek and Brown, Hou, Russenes, and Turchaninov criteria. Furthermore, the existing prediction approaches can be divided into short-term and long-term predictions. Short-term prediction of rockburst occurrence is based on in situ tests, whereas long-term prediction relies on fundamental methods such as strength theory and energy theory, as well as simulation, machine learning, and empirical knowledge methods [9].
Because of the uncertainties of rockburst and the unclear mechanism of its occurrence, a single given model or method is not suitable for the accurate prediction of rockburst. The method should consider more of the influencing factors related to rockburst occurrence, whether random, fuzzy, or both, and artificial intelligence methods are well suited to this problem [10,11]. For instance, various machine learning methods have been used for predicting long-term rockburst hazards, such as support vector machines, artificial neural networks, and decision trees. The previous studies are summarized in Table 1. It can be noted that the prediction accuracy of rockburst intensity is affected by the amount of data and the choice of machine learning algorithm. Therefore, developing a high-performance and less time-consuming ensemble classifier for a larger dataset is quite important.
Random forest (RF) has been applied in rockburst classification [27]. However, the relevant studies are few [14,24,28], and their accuracy is limited by the hyperparameters, i.e., the number of trees and the minimum leaf node size. To optimize the structure of RF, global optimization algorithms such as the firefly algorithm (FA) and particle swarm optimization (PSO) are available. However, these algorithms are time-consuming, and a more efficient global algorithm is therefore needed. Beetle Antennae Search (BAS) is a biologically inspired intelligent optimization algorithm based on the foraging principle of longicorn beetles. It has been used for tuning the hyperparameters of ML algorithms in recent years.
This research aims to develop a machine learning-based model for rockburst classification. The BAS algorithm was employed to tune the hyperparameters of the RF algorithm. The performance of the ensemble BAS-RF model was compared with that of other machine learning algorithms: the support vector machine, k-nearest neighbors, and decision tree algorithms. Furthermore, the BAS-RF model was tested against empirical criteria as well as previously published RF models developed to address the rockburst problem.
2. Dataset Preparation
A total of 279 rockburst cases reported in the literature were collected to build the dataset [14,26,29,30,31,32]. The dataset included five influencing variables as input parameters, namely the buried depth of the opening (H), the maximum tangential stress of the excavation boundary (σθ), the uniaxial compressive strength of the rock (σc), the tensile strength of the rock (σt), and the elastic energy index (Wet), with rockburst intensity as the output. These input variables are commonly applied in rockburst classification and can provide a fundamental understanding of rockburst occurrence in underground conditions. According to rock failure properties, the output parameter, i.e., rockburst intensity, contains four different classes, namely none, light, moderate, and strong. The frequency distribution of each input parameter is depicted in Figure 1. The statistics of the input parameters are summarized in Table 2.
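As an illustration of how one case in such a dataset might be represented, the following Python sketch defines a record with the five inputs and the four-class output. The class name, field names, and the example values are hypothetical and are not taken from the collected dataset; units are omitted because they are not stated in this section.

```python
from dataclasses import dataclass

# The four output classes named in the text, in order of increasing intensity.
INTENSITY_CLASSES = ("none", "light", "moderate", "strong")


@dataclass(frozen=True)
class RockburstCase:
    """One rockburst case (hypothetical record layout)."""
    depth: float        # H, buried depth of the opening
    sigma_theta: float  # maximum tangential stress of the excavation boundary
    sigma_c: float      # uniaxial compressive strength of the rock
    sigma_t: float      # tensile strength of the rock
    w_et: float         # elastic energy index
    intensity: str      # one of INTENSITY_CLASSES

    def features(self):
        """Input vector in a fixed order, ready for a classifier."""
        return [self.depth, self.sigma_theta, self.sigma_c, self.sigma_t, self.w_et]

    def label_index(self):
        """Integer class label (0 = none, ..., 3 = strong)."""
        return INTENSITY_CLASSES.index(self.intensity)


# Made-up values, for illustration only.
example = RockburstCase(500.0, 60.0, 120.0, 8.0, 5.0, "moderate")
```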
3. Algorithm Background and Ensemble Model
3.1. Algorithms Description
3.1.1. Decision Tree and Random Forest
The Decision Tree (DT) and Random Forest (RF) both have tree structures. In contrast to a single DT, the RF combines many trees and classifies by majority vote. The typical structures of DT and RF are shown in Figure 2.
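The C4.5 algorithm was applied in this study for the attribute selection process, which can be expressed as follows:
$$\mathrm{GainRatio}(S, A) = \frac{\mathrm{Gain}(S, A)}{\mathrm{SplitInfo}(S, A)}$$
where
S is the training set;
A is the attribute;
Gain(S, A) is the information gain obtained by splitting S on A; and SplitInfo(S, A) is given by
$$\mathrm{SplitInfo}(S, A) = -\sum_{i=1}^{n} \frac{|S_i|}{|S|} \log_2 \frac{|S_i|}{|S|}$$
where S_i is the subset of S in which A takes the i-th of its n values.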
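The gain ratio that C4.5 uses for attribute selection can be illustrated with a short Python sketch. It assumes a categorical attribute for brevity (continuous attributes such as the ones in this dataset would first be split at a threshold), and the function names are illustrative rather than part of the implementation used in this study.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of the value distribution in a list."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of a categorical attribute.

    values[i] is the attribute value of sample i and labels[i] is its class.
    """
    total = len(labels)
    # Partition the class labels by attribute value.
    partitions = {}
    for value, label in zip(values, labels):
        partitions.setdefault(value, []).append(label)
    # Information gain: entropy before the split minus weighted entropy after it.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy(labels) - remainder
    # Split information: entropy of the partition sizes themselves.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


# An attribute that separates the two classes perfectly has a gain ratio of 1.
print(gain_ratio(["high", "high", "low", "low"], ["strong", "strong", "none", "none"]))  # 1.0
```

Dividing by the split information penalizes attributes that fragment the training set into many small subsets, which plain information gain would otherwise favor.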
The necessary steps are (1) selecting K random data points from the training set, (2) building a decision tree from the selected data points, (3) choosing the number of decision trees to build, (4) repeating steps 1 and 2 until that number is reached, and (5) obtaining the prediction of each decision tree for a new data point and assigning the point to the category with the majority of votes.
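The bootstrap-and-vote procedure described by these steps can be sketched in a few lines of Python. To keep the example short, each "tree" is a one-split stump on a randomly chosen feature, which is a simplification and not the C4.5 trees used in this study; all function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(X, y, rng):
    """Fit a one-split tree on one random feature (stand-in for a full tree).

    Returns (feature, threshold, left_label, right_label).
    """
    j = rng.randrange(len(X[0]))
    best = None
    for t in sorted({row[j] for row in X}):
        left = [lab for row, lab in zip(X, y) if row[j] <= t]
        right = [lab for row, lab in zip(X, y) if row[j] > t]
        if not left or not right:
            continue
        left_label, right_label = majority(left), majority(right)
        errors = sum(lab != left_label for lab in left)
        errors += sum(lab != right_label for lab in right)
        if best is None or errors < best[0]:
            best = (errors, t, left_label, right_label)
    if best is None:  # feature is constant in this sample: always predict the majority
        m = majority(y)
        return (j, X[0][j], m, m)
    return (j,) + best[1:]


def fit_forest(X, y, n_trees=25, seed=0):
    """Build n_trees stumps, each on a bootstrap sample of the training set."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # sample with replacement
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest


def predict(forest, row):
    """Collect one vote per tree and return the majority class."""
    votes = [left if row[j] <= t else right for j, t, left, right in forest]
    return majority(votes)


# Toy data with two well-separated classes (made-up values).
X = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0], [8.0, 9.0], [9.0, 8.0], [8.5, 8.5]]
y = ["none", "none", "none", "strong", "strong", "strong"]
forest = fit_forest(X, y)
```

The two quantities exposed here, the number of trees and (in a full tree) the minimum leaf size, are the hyperparameters that an optimizer such as BAS would tune.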
3.1.2. K-Nearest Neighbor
The K-Nearest Neighbor (KNN) algorithm is a non-parametric, lazy learning algorithm in which K, the number of nearest neighbors, is the core deciding factor. The basic steps are to calculate the distances, find the closest neighbors, and vote for the label. The structure of KNN is depicted in Figure 3.
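The three KNN steps can be written directly in Python, as in the minimal sketch below. It assumes Euclidean distance and unscaled features; the function name and toy data are illustrative only. In practice the inputs would likely be normalized first, since variables measured on different scales would otherwise dominate the distance.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Step 1: distance from the query to every training point.
    distances = [(math.dist(row, query), label) for row, label in zip(train_X, train_y)]
    # Step 2: keep the k closest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step 3: vote for the label.
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy data (made-up values).
X = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0], [8.0, 9.0], [9.0, 8.0], [8.5, 8.5]]
y = ["none", "none", "none", "strong", "strong", "strong"]
print(knn_predict(X, y, [1.2, 1.2]))  # none
```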
3.1.3. Support Vector Machines
Support Vector Machines (SVM) are a classification approach that constructs a hyperplane in a multidimensional space to separate different classes. The procedure includes the following steps: generate candidate hyperplanes and select the hyperplane with the maximum margin between the classes. The structure of SVM is shown in Figure 4.
3.1.4. Beetle Antennae Search Algorithm
The Beetle Antennae Search (BAS) algorithm is an intelligent optimization algorithm proposed by Jiang et al. in 2017. Unlike other bionic algorithms, BAS is a single-individual search algorithm with the advantages of a simple principle, few parameters, and low computational cost. It has clear advantages in dealing with low-dimensional optimization objectives, such as low time complexity and strong search ability. The flow chart of BAS is given in Figure 5. In this study, the number of BAS iterations was set to 50, and the step factor was set to 0.95. All algorithms were implemented in MATLAB.
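The BAS update rule can be sketched as follows. The iteration count (50) and step factor (0.95) follow the settings stated above; the initial step size, the antenna length, the small 0.01 offset added to the antenna length, and the function name are assumptions made for illustration. The sketch is written in Python rather than MATLAB and is not the implementation used in this study.

```python
import math
import random


def bas_minimize(f, x0, n_iter=50, step=1.0, d=1.0, eta=0.95, seed=0):
    """Minimize f with a basic Beetle Antennae Search; returns (best_x, best_f)."""
    rng = random.Random(seed)
    x = list(x0)
    best_x, best_f = list(x), f(x)
    for _ in range(n_iter):
        # Random unit vector: the direction the beetle is facing.
        b = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(v * v for v in b)) or 1.0
        b = [v / norm for v in b]
        # Sense the objective at the two antenna tips.
        f_right = f([xi + d * bi for xi, bi in zip(x, b)])
        f_left = f([xi - d * bi for xi, bi in zip(x, b)])
        # Step toward the antenna with the lower objective value.
        sign = (f_right > f_left) - (f_right < f_left)
        x = [xi - step * sign * bi for xi, bi in zip(x, b)]
        fx = f(x)
        if fx < best_f:
            best_x, best_f = list(x), fx
        # Shrink the antenna length and the step size (assumed decay schedule).
        d = eta * d + 0.01
        step = eta * step
    return best_x, best_f


def sphere(x):
    """Simple test objective with its minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_f = bas_minimize(sphere, [3.0, -2.0])
```

Only two objective evaluations at the antenna tips plus one at the new position are needed per iteration, which is the source of the low computational cost noted above. For hyperparameter tuning, the position vector would hold the hyperparameter values (rounded to integers where required) and `f` would return the validation error.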
3.2. The Methodology of the Ensemble BAS-RF Model
The ensemble model is constructed through the following steps.
Step 1: Split the dataset into a training set and a test set, normally in proportions of 70% and 30%, respectively. Because the rockburst intensity is divided into four classes, the split should be applied within each class so that both the training and test sets contain all four classes.
Step 2: Initialize the parameters of BAS, i.e., the beetle’s position in the search space, where the dimension of the position vector equals the number of hyperparameters of the algorithm.
Step 3: Train the model on part of the training set and calculate the fitness value on the remaining subset of the training set.
Step 4: BAS tunes the hyperparameters by decreasing the fitness value. When the 50th iteration is reached, the optimal hyperparameters are obtained.
Step 5: The above process is repeated five times, which constitutes fivefold cross-validation (CV) (shown in Figure 6). The full procedure is depicted in Figure 7.
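The data handling in Steps 1, 3, and 5 can be sketched in Python as shown below. The sketch assumes that the fitness value is the mean misclassification rate on the held-out folds, which is not stated explicitly in this section; the function names and the `train_fn`/`predict_fn` callables are hypothetical placeholders for any classifier.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx) with every class split in the same proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def k_folds(indices, k=5, seed=0):
    """Yield (fit_idx, val_idx) pairs; each index is used for validation exactly once."""
    idx = list(indices)
    random.Random(seed).shuffle(idx)
    for fold in range(k):
        val = idx[fold::k]
        val_set = set(val)
        yield [i for i in idx if i not in val_set], val


def cv_error(train_fn, predict_fn, X, y, indices, k=5):
    """Mean validation error over k folds (assumed fitness for the optimizer)."""
    errors = []
    for fit_idx, val_idx in k_folds(indices, k):
        model = train_fn([X[i] for i in fit_idx], [y[i] for i in fit_idx])
        wrong = sum(predict_fn(model, X[i]) != y[i] for i in val_idx)
        errors.append(wrong / len(val_idx))
    return sum(errors) / len(errors)
```

In a tuning loop, the optimizer would call `cv_error` once per candidate hyperparameter vector and keep the vector with the lowest value; the test set from `stratified_split` is touched only after tuning is finished.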
3.3. Performance Evaluation Methods
In this study, classical methods were applied for model evaluation. The receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) were used to evaluate the rockburst classification. In the ROC curve, the horizontal axis is the false positive rate (FPR), and the vertical axis is the true positive rate (TPR).
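A minimal Python sketch of how the ROC curve and AUC can be computed for one class against the rest is given below. It assumes binary labels (1 for the class of interest, 0 otherwise) and a classifier score per sample; for the four rockburst classes, the same computation would be repeated once per class. The function names and the example scores are illustrative.

```python
def roc_curve(scores, labels):
    """Return (fpr, tpr) lists; labels are 1 (positive) or 0 (negative)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only after the last sample sharing this score (handles ties).
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            fpr.append(fp / n_neg)
            tpr.append(tp / n_pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )


fpr, tpr = roc_curve([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(auc(fpr, tpr))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the class from the others.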