E ﬀ ective Assessment of Blast-Induced Ground Vibration Using an Optimized Random Forest Model Based on a Harris Hawks Optimization Algorithm

: Most mines choose the drilling and blasting method which has the characteristics of being a cheap and e ﬃ cient method to fragment rock mass, but blast-induced ground vibration damages the surrounding rock mass and structure and is a drawback. To predict, analyze and control the blast-induced ground vibration, the random forest (RF) model, Harris hawks optimization (HHO) algorithm and Monte Carlo simulation approach were utilized. A database consisting of 137 datasets was collected at di ﬀ erent locations around the Tonglvshan open-cast mine, China. Seven variables were selected and collected as the input variables, and peak particle velocity was chosen as the output variable. At ﬁrst, an RF model and a hybrid model, namely a HHO-RF model, were developed, and the prediction results checked by 3 performance indices to show that the proposed HHO-RF model can provide higher prediction performance. Then blast-induced ground vibration was simulated by using the Monte Carlo simulation approach and the developed HHO-RF model. After analyzing, the mean peak particle velocity value was 0.98 cm / s, and the peak particle velocity value did not exceed 1.95 cm / s with a probability of 90%. The research results of this study provided a simple, accurate method and basis for predicting, evaluating blast-induced ground vibration and optimizing the blast design before blast operation.


Introduction
In blasting, only 25% to 30% of the explosive energy is spent in rock mass fragmentation, and more than 70% is wasted and causes side effects such as back break, fly rock, and blast-induced ground vibration [1][2][3]. Among these blast-induced side effects, blast-induced ground vibration is considered as the most common, important and dangerous side effects for nearby structures, human life and the environment [4][5][6]. Therefore, establishing the prediction methods to predict and analyze the blast-induced ground vibration and then to optimize the blasting design and building structure by using that model is of great significance to decrease the damage caused by blast-induced ground vibration.
Blast-induced ground vibration can be described by using peak particle velocity (PPV) and frequency. According to previous literature [7], the peak particle velocity is normally considered to be more important than frequency and always selected to describe the blast-induced ground vibration.
After reviewing the previous literature, some empirical models [8][9][10][11] were proposed and most of them predict PPV by using the distance between blast blocks to monitor location and max charge per delay. Although some studies show these empirical models can predict PPV in some mines with high precision, there are also some research works [5,12,13] that show these models are not sufficient for all field sites.
In recent years, artificial intelligence (AI) technology has been widely developed and applied in many areas for solving special problems such as the prediction of blast-induced rock movement [14], flyrock distance [15,16], overbreak [17,18], air overpressure [19,20], rockburst [21][22][23], mechanical properties of material [24][25][26][27][28][29], tunnel convergence [30] and tunnel boring machine penetration rate [31]. In practice, AI technology is also applied to predict blast-induced ground vibration and yield high prediction precision [32][33][34]. For example, Zhang et al. [2] combined the particle swarm optimization (PSO) algorithm and extreme gradient boosting machine (XGBoost) for peak particle velocity prediction. In his paper, the PSO was utilized to search the optimal hyperparameters of XGBoost and a promising result was obtained. Shang et al. [35] proposed a hybrid model that is combined with the firefly algorithm (FFA) and an artificial neural network (ANN). The prediction performance of FFA-ANN was compared with classification and regression tree (CART), support vector machine (SVM), and k-nearest neighbor (KNN) models, and the FFA-ANN model provided the best performance. Azimi [36] utilized the genetic algorithm (GA) to optimize the ANN, and have developed an efficient model for PPV prediction after comparing the performance of the neuro-fuzzy inference system by using 70 blast vibration datasets. In 2009, Khandelwal and Singh [37] developed an ANN model for blast-induced ground vibration prediction, and the prediction results of the ANN model is highly encouraging. Bui et al. [38] tested the feasibility of the FCM-QRNN (fuzzy C-means clustering and quantile regression neural network) model and found that the model has higher accuracy than other soft computing models such as ANN and QRNN. Nguyen et al. [39] combined the k-means clustering algorithm (HKM) and ANN and utilized that model to predict blast-induced ground vibration. In this model, the input datasets should be clustered first by the HKM algorithm, and the result obtained will be used to train the ANN model. In addition to these studies, the blast-induced ground vibration in some famous mines such as the Nui Beo open-cast mine [7], Miduk copper mine [40] and Gol-E-Gohar iron ore mine [41] were studied by using the artificial intelligence technology in previous literature. It can be found that the artificial intelligence technology shows strong prediction ability in blast-induced ground vibration prediction, and research into new AI technology and the new prediction model is necessary and meaningful for hazard control.
In this study, AI technology was tested to predict the blast-induced ground vibration of Tonglvshan open-cast mine. According to the no free lunch theorem, if the performance of algorithm A is better than algorithm B in this problem, there must be other problems that algorithm B has better performance at resolving. Hence, although a series of prediction models such as HKM-ANN [39], MLPNN (multilayer perceptron neural network) [42] and CART [43] were utilized in some mines, the proposed algorithms are far from solving all possible problems and being suited for all field sites. As far as the authors know, the combination of the Harris hawks optimization algorithm (HHO) and random forest (RF) has not been tried in predicting blast-induced ground vibration. In this paper, a new hybrid model namely HHO-RF is proposed to predict and control the PPV value of Tonglvshan open-cast mine. Considering the uncertainty of blasting parameters in field sites, a Monte Carlo process was considered as a stochastic approach to analyze the probability distribution with various combinations of blasting parameters for hazard control. Eventually, a sensitivity investigation was carried out to measure the impact of various parameters on peak particle velocity.
After analyzing, engineers can use the accurate prediction model proposed in this study to predict and evaluate the PPV value caused by blast operation before a blast, which is helpful for optimizing blast design and reducing disputes between mines and surrounding residents. In addition, the security department can stipulate measures for protecting the buildings and industrial facilities based on the probability distribution of PPV because all the probable combinations of variables will be examined. Last, the analysis method and research results in this paper can provide a reference for controlling blast-induced ground vibration in similar mines.

Data Set
To achieve the objectives of this study, blast-induced ground vibration of Tonglvshan open-cast mine was monitored by using an accurate blast-induced ground vibration monitoring machine (EXP3850-3) in Hubei province, China (see Figure 1). That mine is situated 3 km west of Daye city, and the drilling and blasting technique is utilized for rock fragmentation. In addition, the geological structure of the Tonglvshan open-cast mine is a skarn deposit, and the ore deposit is mainly composed of granodiorite porphyry, sodium porphyry, skarn, marble, copper-bearing skarn, copper-iron ore, etc. Meanwhile, the value of Protodyakonov impact strength coefficient of different rock type varies greatly, which leads to the significant difficulty of accurate peak particle velocity prediction.

Data Set
To achieve the objectives of this study, blast-induced ground vibration of Tonglvshan open-cast mine was monitored by using an accurate blast-induced ground vibration monitoring machine (EXP3850-3) in Hubei province, China (see Figure 1). That mine is situated 3 km west of Daye city, and the drilling and blasting technique is utilized for rock fragmentation. In addition, the geological structure of the Tonglvshan open-cast mine is a skarn deposit, and the ore deposit is mainly composed of granodiorite porphyry, sodium porphyry, skarn, marble, copper-bearing skarn, copper-iron ore, etc. Meanwhile, the value of Protodyakonov impact strength coefficient of different rock type varies greatly, which leads to the significant difficulty of accurate peak particle velocity prediction.
A total number of 137 blasting operations were analyzed, and 8 variables including peak particle velocity (PPV), Protodyakonov impact strength coefficient (f), max charge per delay (Qmax), total charge (Qtotal), delay time of detonator (T), burden of the first row (B), horizontal (DH) and vertical (DV) distance between blast block and monitoring station were collected. After analyzing by using the box plot, 20 outliers were moved from the database to improve the model performance [44] and the remaining 117 datasets were utilized in this study (see Table 1).   A total number of 137 blasting operations were analyzed, and 8 variables including peak particle velocity (PPV), Protodyakonov impact strength coefficient (f ), max charge per delay (Q max ), total charge (Q total ), delay time of detonator (T), burden of the first row (B), horizontal (D H ) and vertical (D V ) distance between blast block and monitoring station were collected. After analyzing by using the box plot, 20 outliers were moved from the database to improve the model performance [44] and the remaining 117 datasets were utilized in this study (see Table 1).

Random Forest (RF) Model
The RF model was invented by Breiman [45]. In this model, randomly selected input variables and datasets were utilized to train the classification or regression model, and a large number of decision trees were developed and combined to make a final decision. Limited by the article length, the detailed introduction was not shown here but readers can find it in the published papers of Breiman [45] and Liaw and Wiener [46]. Table 1. Some of the datasets used for peak particle velocity (PPV) prediction.

Number
improve prediction performance by search the optimal combination of prediction models. In fact, many meta-heuristic optimization techniques are inspired by natural phenomena and have widely applied in mining [51], healthcare [52][53][54][55], finance [56] and query optimization problems [57] with good optimization performance. Hence, a nature-inspired algorithm (the Harris hawks optimization (HHO) algorithm) was tested in this paper because of its superior performance compared with the other 11 methods through 29 unconstrained benchmark problems and 6 constrained engineering design tasks carried out by Heidari et al. [58].
The Harris hawk (also called Parabuteo unicinctus), a well-known bird in the southern half of Arizona, USA, often be considered as a truly cooperative predator because these birds usually discover, attack and share prey with other family members [58,59] (Figure 2). Inspired by the social behavior of Harris hawks, the HHO algorithm was proposed for solving optimization problems with the proper formulation. The HHO algorithm includes three phases: the exploration phase, the transition from exploration to exploitation and exploitation phase. After reviewing the study of Heidari et al. [58], a brief introduction of the HHO algorithm was performed here.

Harris Hawks Optimization (HHO) Algorithm
As an important branch of artificial intelligence, the random forest model has been widely used in many areas such as civil areas [47], energy areas [48] and medical areas [49]. Meanwhile, previous literature [50] shows that the using of meta-heuristic optimization technique can significantly improve prediction performance by search the optimal combination of prediction models. In fact, many meta-heuristic optimization techniques are inspired by natural phenomena and have widely applied in mining [51], healthcare [52][53][54][55], finance [56] and query optimization problems [57] with good optimization performance. Hence, a nature-inspired algorithm (the Harris hawks optimization (HHO) algorithm) was tested in this paper because of its superior performance compared with the other 11 methods through 29 unconstrained benchmark problems and 6 constrained engineering design tasks carried out by Heidari et al. [58].
The Harris hawk (also called Parabuteo unicinctus), a well-known bird in the southern half of Arizona, USA, often be considered as a truly cooperative predator because these birds usually discover, attack and share prey with other family members [58,59] (Figure 2). Inspired by the social behavior of Harris hawks, the HHO algorithm was proposed for solving optimization problems with the proper formulation. The HHO algorithm includes three phases: the exploration phase, the transition from exploration to exploitation and exploitation phase. After reviewing the study of Heidari et al. [58], a brief introduction of the HHO algorithm was performed here. (1) Exploration phase: In this part, the Harris hawks try to discover, search and find the prey in the desert site by using eyes. The position of a Harris hawk in the mathematical model means the closest position to the prey and can be obtained by using the following equations [58]: where X(t + 1) and X(t) are the positions of Harris hawk in iteration t + 1 and t, respectively. Xrabbit(t) and Xrand(t) are the position of prey in the t iteration and randomly selected Harris hawk in the current population. r1, r2, r3, r4, q are the random values, UB and LB are the upper and lower bounds, respectively.
Besides these parameters, Xm(t) is the mean position of Harris hawks. (1) Exploration phase: In this part, the Harris hawks try to discover, search and find the prey in the desert site by using eyes. The position of a Harris hawk in the mathematical model means the closest position to the prey and can be obtained by using the following equations [58]: where X(t + 1) and X(t) are the positions of Harris hawk in iteration t + 1 and t, respectively. X rabbit (t) and X rand (t) are the position of prey in the t iteration and randomly selected Harris hawk in the current population. r 1 , r 2 , r 3 , r 4 , q are the random values, UB and LB are the upper and lower bounds, respectively. Besides these parameters, X m (t) is the mean position of Harris hawks.
Appl. Sci. 2020, 10, 1403 where N is the number of Harris hawks; X i (t) is the position of each hawk in iteration t.
(2) The transition from exploration to exploitation: According to the escaping energy of prey, the behavior of Harris hawks in the HHO algorithm will be changed. For example, the exploitation behavior will be performed when the escaping energy is less than 1, and the Harris hawks continue to search for prey when escaping energy is greater than or equal to 1. In that model, the escaping energy (E) can be obtained as [58]: where E 0 is the initial escaping energy, and E is the escaping energy in iteration t, and T is the maximum number of iterations.
(3) Exploitation phase After searching for prey in the desert site, a Harris hawk will attempt to attack the prey and the prey will correspondingly attempt to escape the surprise pounce. According to the different escaping behavior of the prey, several possible strategies may be performed during the exploitation phase. To evaluate the possibility of successfully escaping, r, a random value range from 0 to 1, was utilized in this algorithm, and four situations were divided with different range of r value and E value. (a) In this situation, the prey still has enough energy, so the Harris hawks hope to exhaust it and then perform the attack. The following equations can be used to calculate the position of the Harris hawk in iteration t + 1 [58].
where r 5 is a random value range from 0 to 1.
In this situation, the prey is tired and the Harris hawks can perform the surprise pounce. The position can be updated using the following equation [58]: In this situation, the prey still has enough energy and has a low chance of successfully escaping. The position of Harris hawks can be updated by the following rules [58]: The hawks can dive according to the following rule: Subjected to Appl. Sci. 2020, 10, 1403 7 of 17 here, Γ is Gamma function; β is 1.5; S is a random vector; u and v are the random value range from 0 to 1. After calculation, the final position can be shown as follows: In this situation, the prey is exhausted and the escaping chance is low. Therefore, the Harris hawks can perform the hard siege behavior and can be modeled as follows [58]: where here, LF is levy flight function.

HHO-RF Model
The procedure of developing an HHO-RF model is shown in Figure 3. Firstly, the collected database should be divided into training datasets and testing datasets with a ratio of 80% and 20% [60][61][62][63]. Then the Harris hawks optimization algorithm will be used to search the best combination of the number of variables used to grow each try (m try ) and the number of trees (n tree ) in the RF model, and the final RF model will be established. Finally, the testing datasets can be used to evaluate the RF model by using some performance metrics such as correlation coefficient (R 2 ), mean absolute error (MAE) and root mean square error (RMSE).  Figure 3. Procedure for developing a Harris hawks optimization-random forest (HHO-RF) model.

Performance Metrics
In this section, an RF model and a HHO-RF model were developed to establish a stable and accurate relationship between the input variables (f, B, T, DV, DH, Qtotal, Qmax) and output variable (PPV). To evaluate the predictive performance, three performance indices including RMSE, R 2 and MAE, were introduced and utilized here [64][65][66][67][68].

Performance Metrics
In this section, an RF model and a HHO-RF model were developed to establish a stable and accurate relationship between the input variables (f, B, T, D V , D H , Q total , Q max ) and output variable Appl. Sci. 2020, 10, 1403 8 of 17 (PPV). To evaluate the predictive performance, three performance indices including RMSE, R 2 and MAE, were introduced and utilized here [64][65][66][67][68].
here x i , f (x i ) and n are the actual PPV value, predicted PPV value and number of input datasets, respectively.

Monte Carlo Simulation
The Monte Carlo (MC) simulation method is a typical sampling technique in stochastic modeling and has the ability to realize the risk and uncertainty in many problems such as financial cost and purchase decision [69]. Monte Carlo simulation can achieve two important functions; the first is the quantitative investigation of uncertainty and variation of problems regarding risk; and the second is the investigation of parameters affecting variation and uncertainty and their percentage [70]. In the calculation process, the repeated sampling method was utilized to create a large number of combinations of input variables, then these input variables will be calculated by using the HHO-RF model developed. Finally, statistical analysis will be carried out for obtaining the probability distribution of PPV values (see Figure 4). In recent years, the MC simulation approach has been utilized to solve many engineering problems such as flyrock assessment [70], backbreak analysis [71], compression coefficient prediction [72], blasting fragmentation [73] and drilling rate evaluation [74]. In the following section, the MC simulation method will be utilized to analyze the probability distribution of blast-induced ground vibration caused by the blasting operation in Tonglvshan open-cast mine. In recent years, the MC simulation approach has been utilized to solve many engineering problems such as flyrock assessment [70], backbreak analysis [71], compression coefficient prediction [72], blasting fragmentation [73] and drilling rate evaluation [74]. In the following section, the MC simulation method will be utilized to analyze the probability distribution of blast-induced ground vibration caused by the blasting operation in Tonglvshan open-cast mine.

Results of RF Model
In order to predict the PPV by using the RF model, two typical model parameters including m try and n tree should be determined first. Hence a parameter investigation was carried out by using the grid search method with m try of 1 to 7 and n tree of 2000, and 7 was selected as the optimal m try value (Figure 5a). In addition, the parameter investigation with m try of 7 and n tree of 1 to 2000 show that the prediction performance will vary within a range and that range will be stable when n tree = 2000. So 2000 was selected as the optimal value of n tree (Figure 5b). After optimal RF model development, the testing datasets were inputted into the established RF model for prediction, the prediction performance shows the prediction performance with using the training datasets with R 2 of 0.92, MAE of 0.19 and RMSE of 0.27 and using the testing datasets with R 2 of 0.90, MAE of 0.35 and RMSE of 0.40 (see Figure 6).

Results of HHO-RF Model
To obtain the best combination of mtry and ntree, 8 HHO-RF models with the maximum number of iterations of 500 and variable numbers of Harris hawks ranging from 20 to 250 were developed. The fitness curve of these HHO-RF models checked by RMSE can be found in Figure 7. As shown in

Results of HHO-RF Model
To obtain the best combination of mtry and ntree, 8 HHO-RF models with the maximum number of iterations of 500 and variable numbers of Harris hawks ranging from 20 to 250 were developed. The fitness curve of these HHO-RF models checked by RMSE can be found in Figure 7. As shown in

Results of HHO-RF Model
To obtain the best combination of m try and n tree , 8 HHO-RF models with the maximum number of iterations of 500 and variable numbers of Harris hawks ranging from 20 to 250 were developed. The fitness curve of these HHO-RF models checked by RMSE can be found in Figure 7. As shown in Figure 7, there is no significant change after 350 iterations. But the escaping energy of the rabbit has a relationship with the maximum number of iterations in the HHO algorithm according to Equation (3). The change of the maximum number of iterations may affect the optimization performance of the HHO algorithm. Hence, 500 and 150 were chosen as the best parameter combination of HHO-RF modeling in view of the excellent prediction performance.

Results of Monte Carlo Simulation
As mentioned in the previous sections, an RF model and a HHO-RF model were developed for peak particle velocity prediction, and the results obtained show that the HHO-RF model can provide more accurate prediction performance. Hence, the HHO-RF model was used here for Monte Carlo simulation, and four input variables including DV, DH, Qtotal, and Qmax were considered as the continuous probability distributions (CPD) and two input variables including T and B were considered as the discrete probability distribution (DPD) (see Table 2).

Results of Monte Carlo Simulation
As mentioned in the previous sections, an RF model and a HHO-RF model were developed for peak particle velocity prediction, and the results obtained show that the HHO-RF model can provide more accurate prediction performance. Hence, the HHO-RF model was used here for Monte Carlo simulation, and four input variables including DV, DH, Qtotal, and Qmax were considered as the continuous probability distributions (CPD) and two input variables including T and B were considered as the discrete probability distribution (DPD) (see Table 2).

Results of Monte Carlo Simulation
As mentioned in the previous sections, an RF model and a HHO-RF model were developed for peak particle velocity prediction, and the results obtained show that the HHO-RF model can provide more accurate prediction performance. Hence, the HHO-RF model was used here for Monte Carlo simulation, and four input variables including D V , D H , Q total , and Q max were considered as the continuous probability distributions (CPD) and two input variables including T and B were considered as the discrete probability distribution (DPD) (see Table 2). Table 2. Probability distribution functions of seven input variables. After inputting the probability distribution functions into the MC simulation, a large number of combinations should be randomly selected by using the sampling method. In this regard, two sampling methods including Latin hypercube sampling and simple random sampling are recommended. According to the previous literature, the Latin hypercube sampling method is much better than the simple random sampling method because it can achieve the same calculation performance with fewer simulations. It should be noted that the relationship between input variables should be considered in the MC simulation to guarantee the consistency of input variable combinations in MC simulation and monitoring databases.

Input Variables Function
In order to achieve a high level of simulation and probability distribution, 20,000 simulation runs were performed by using the Latin hypercube sampling method, and the relationship of 20,000 randomly selected combinations can be found in Figure 9. It is found that the correlations between input values in MC simulation and blast-induced ground vibration database agree well, so the 20,000 simulation runs developed can be used to analyze the probability distribution of blast-induced ground vibration. After inputting the probability distribution functions into the MC simulation, a large number of combinations should be randomly selected by using the sampling method. In this regard, two sampling methods including Latin hypercube sampling and simple random sampling are recommended. According to the previous literature, the Latin hypercube sampling method is much better than the simple random sampling method because it can achieve the same calculation performance with fewer simulations. It should be noted that the relationship between input variables should be considered in the MC simulation to guarantee the consistency of input variable combinations in MC simulation and monitoring databases.
In order to achieve a high level of simulation and probability distribution, 20,000 simulation runs were performed by using the Latin hypercube sampling method, and the relationship of 20,000 randomly selected combinations can be found in Figure 9. It is found that the correlations between input values in MC simulation and blast-induced ground vibration database agree well, so the 20,000 simulation runs developed can be used to analyze the probability distribution of blast-induced ground vibration. After analyzing, the probability distribution of blast-induced ground vibration is shown in Figure 10. As can be seen, the mean value of PPV in MC simulation results is determined as 0.98 cm/s while the mean value of PPV is 1.0 cm/s in the database, and the PPV value does not exceed 1.95 cm/s with a probability of 90%. Moreover, the measured PPV values, the predicted PPV values by the HHO-RF model, and the Monte Carlo simulation results are also shown in Figure 10. It can be observed that these three cumulative frequency curves are closely related and the curve performed After analyzing, the probability distribution of blast-induced ground vibration is shown in Figure 10. As can be seen, the mean value of PPV in MC simulation results is determined as 0.98 cm/s while the mean value of PPV is 1.0 cm/s in the database, and the PPV value does not exceed 1.95 cm/s with a probability of 90%. Moreover, the measured PPV values, the predicted PPV values by the HHO-RF model, and the Monte Carlo simulation results are also shown in Figure 10. It can be observed that these three cumulative frequency curves are closely related and the curve performed by the MC simulation is smoother. It is also found that these cumulative frequency curves still have some differences, and the differences may be due to the small amount of the monitoring database being too small and the error of the prediction model. After obtaining the probability distribution of PPV values, the engineer can adjust the blasting parameters such as the max charge per delay and burden to control the peak particle velocity and minimize the damage upon the building structure. Moreover, for a case that needs special protection, the engineer is advised to assume the most dangerous situation with the probability of 0.001% to arrange safeguards.

Variable Importance
To analyze the impact of various blasting parameters on blast-induced ground vibration, a sensitivity analysis was performed to measure the impact of various parameters on peak particle velocity. In this section, with the help of SPSS (Statistical Product and Service Solutions), the relationships between two variables can be calculated with the range of −1 to +1 after sorting the database and comparing the rank of each value. Figure 11 shows the impact of these input variables on PPV values both in blast-induced ground vibration database and MC simulation results. As shown, the DH with the correlation coefficient of −0.795 and −0.563 is the most sensitive variable, and the input variables can be divided into controllable parameters and uncontrollable parameters. The controllable parameters can be listed in descending order as DH, DV, Qtotal, Qmax, T and B in the PPV monitoring database and DH, DV, Qtotal, Qmax, B, T in MC simulation results. It is found that there is a difference in the ranking of T and B in these two databases. The reason may be that the PPV monitoring database is too small and the ranking order obtained from the MC simulation results is more reliable according to the big data principle. Meanwhile, the roughly same order of input variables demonstrates that the Monte Carlo simulation is reliable. After obtaining the probability distribution of PPV values, the engineer can adjust the blasting parameters such as the max charge per delay and burden to control the peak particle velocity and minimize the damage upon the building structure. Moreover, for a case that needs special protection, the engineer is advised to assume the most dangerous situation with the probability of 0.001% to arrange safeguards.

Variable Importance
To analyze the impact of various blasting parameters on blast-induced ground vibration, a sensitivity analysis was performed to measure the impact of various parameters on peak particle velocity. In this section, with the help of SPSS (Statistical Product and Service Solutions), the relationships between two variables can be calculated with the range of −1 to +1 after sorting the database and comparing the rank of each value. Figure 11 shows the impact of these input variables on PPV values both in blast-induced ground vibration database and MC simulation results. As shown, the D H with the correlation coefficient of −0.795 and −0.563 is the most sensitive variable, and the input variables can be divided into controllable parameters and uncontrollable parameters. The controllable parameters can be listed in descending order as D H , D V , Q total , Q max , T and B in the PPV monitoring database and D H , D V , Q total , Q max , B, T in MC simulation results. It is found that there is a difference in the ranking of T and B in these two databases. The reason may be that the PPV monitoring database is too small and the ranking order obtained from the MC simulation results is more reliable according to the big data principle. Meanwhile, the roughly same order of input variables demonstrates that the Monte Carlo simulation is reliable.

Limitations
Four limitations can be found in this paper and need to be considered in the future study. First, only 137 datasets were collected and utilized to develop the RF model and HHO-RF model, a larger database should be established for evaluating, improving model reliability and obtaining more accurate prediction performance. Then, only 8 variables including PPV, f, Qmax, Qtotal, T, B, DH, and DV were collected, more qualitative parameters such as the detonation velocity of the explosive may have an impact on the PPV prediction and should be considered in future investigations. Third, although the HHO algorithm shows good results in improving the prediction performance of the RF model, there are many optimization algorithms such as the GA, grey wolf optimization algorithm (GWO), artificial bee colony algorithm (ABC) and cuckoo search algorithm (CS) that can also be tried to select the hyperparameters of the random forest algorithm and no prediction performance comparison was carried out in this paper for these optimization algorithms. Finally, as a typical character of the machine learning method, the prediction performance is dependent on the fitness of the collected database and the complex relationship between prediction results and input variables is difficult to describe due to its "black box" nature.

Conclusions
This study aims to propose a new hybrid model for accurate blast-induced ground vibration prediction and use the stochastic modeling approach to obtain the probability distribution. At first, a simple RF model and a new proposed HHO-RF model were developed and utilized to predict the peak particle velocity; the prediction performance showed by HHO-RF model is much better than the prediction performance yield by RF model. In addition, the Monte Carlo simulation showed that the mean PPV value of the MC simulation results is 0.98 cm/s and the PPV value does not exceed 1.95 cm/s with the probability of 90%. Finally, the results of the sensitivity analysis showed that the DH is the most influential factor on blast-induced ground vibration. Based on the high-precision prediction model developed and the results of Monte Carlo simulation obtained, the engineers of Tonglvshan open-cast mine can accurately predict blast-induced ground vibration and ameliorate blast design before a blast operation, which is beneficial for controlling the blast's impact on nearby structures, human life, and the environment.

Limitations
Four limitations can be found in this paper and need to be considered in the future study. First, only 137 datasets were collected and utilized to develop the RF model and HHO-RF model, a larger database should be established for evaluating, improving model reliability and obtaining more accurate prediction performance. Then, only 8 variables including PPV, f, Q max , Q total , T, B, D H , and D V were collected, more qualitative parameters such as the detonation velocity of the explosive may have an impact on the PPV prediction and should be considered in future investigations. Third, although the HHO algorithm shows good results in improving the prediction performance of the RF model, there are many optimization algorithms such as the GA, grey wolf optimization algorithm (GWO), artificial bee colony algorithm (ABC) and cuckoo search algorithm (CS) that can also be tried to select the hyperparameters of the random forest algorithm and no prediction performance comparison was carried out in this paper for these optimization algorithms. Finally, as a typical character of the machine learning method, the prediction performance is dependent on the fitness of the collected database and the complex relationship between prediction results and input variables is difficult to describe due to its "black box" nature.

Conclusions
This study aims to propose a new hybrid model for accurate blast-induced ground vibration prediction and use the stochastic modeling approach to obtain the probability distribution. At first, a simple RF model and a new proposed HHO-RF model were developed and utilized to predict the peak particle velocity; the prediction performance showed by HHO-RF model is much better than the prediction performance yield by RF model. In addition, the Monte Carlo simulation showed that the mean PPV value of the MC simulation results is 0.98 cm/s and the PPV value does not exceed 1.95 cm/s with the probability of 90%. Finally, the results of the sensitivity analysis showed that the D H is the most influential factor on blast-induced ground vibration. Based on the high-precision prediction model developed and the results of Monte Carlo simulation obtained, the engineers of Tonglvshan open-cast mine can accurately predict blast-induced ground vibration and ameliorate blast design before a blast operation, which is beneficial for controlling the blast's impact on nearby structures, human life, and the environment.