Prediction of Peak Particle Velocity Caused by Blasting through the Combinations of Boosted-CHAID and SVM Models with Various Kernels

Abstract: This research examines the feasibility of hybridizing boosted Chi-Squared Automatic Interaction Detection (CHAID) with different kernels of support vector machine (SVM) techniques for the prediction of the peak particle velocity (PPV) induced by quarry blasting. To achieve this objective, a boosting-CHAID technique was applied to a large experimental database comprising six input variables. The technique identified four input parameters (distance from blast-face, stemming length, powder factor, and maximum charge per delay) as the most significant parameters affecting the prediction accuracy, and these were used to build SVM models with various kernels. The kernel types used in this study include radial basis function, polynomial, sigmoid, and linear. Several criteria, including mean absolute error (MAE), correlation coefficient (R), and gains, were calculated to evaluate the developed models' accuracy and applicability. In addition, a simple ranking system was used to evaluate the models' performance systematically. The R and MAE values of the radial basis function kernel of SVM in the training and testing phases confirm the high capability of this SVM kernel in predicting PPV values. This study demonstrates that a combination of boosting-CHAID and SVM models can identify the most effective parameters affecting PPV values and predict those values with a high level of accuracy.


Introduction
Blasting is a common method of rock breakage in mining and quarrying. It is also one of the standard techniques used in projects such as road and tunnel construction [1]. In excavation, blasting involves drilling series of blast-holes nearly equidistant from the bench's free face [2]. These operations generate undesirable environmental effects around the blasting area, for example, air overpressure, ground vibrations, flyrock, and backbreak [3][4][5][6][7][8][9][10][11][12][13][14]. Despite the availability of several experimental and analytical solutions for predicting these environmental effects, these solutions take into account only a small number of important factors, whereas other influential parameters such as the blasting pattern and geological conditions also influence these impacts as [...] kernel will be selected and introduced to predict the PPV induced by mine blasting. The remainder of the paper is organized as follows. The next section presents the modelling methodology and the case study. It is followed by the research results and their evaluation. The paper closes with a discussion and the conclusions of this research.
Table 1. The most prominent research on the PPV prediction by means of the SC procedures.

Materials and Methods
In this study, a systematic approach was employed to combine boosting-CHAID, used as an input selection technique, with SVM models with diverse kernels to predict the PPV resulting from blasting. Initially, a boosting-CHAID model was developed and the most important variables for predicting the PPV were identified. Subsequently, SVM models with various kernels (sigmoid, SIG; polynomial, POL; linear, LIN; and radial basis function, RBF) were built using these variables. Finally, the models' results were evaluated by applying certain performance criteria. Figure 1 presents the flowchart of the approach employed in this study. Three methods of model evaluation were used: performance indices, variable importance, and a ranking system.

Input Selection Technique
The Chi-Squared Automatic Interaction Detection (CHAID) algorithm builds decision trees using chi-square statistics to determine the optimal splits [89]. CHAID generates non-binary trees: some splits may have more than two branches, which makes the method especially suitable for examining complex datasets. Because CHAID handles only categorical inputs, it converts continuous inputs into ordinal ones by binning. During the learning process, a heuristic statistical technique is employed to examine the relationship between a set of categorical inputs and the target variable. The result is a tree diagram that shows which inputs most significantly affect the value of the target variable. The CHAID modelling steps are (1) binning, (2) merging, (3) splitting, and (4) stopping.
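The binning and splitting steps can be illustrated with a small sketch. The code below bins a continuous input into ordinal categories and computes the Pearson chi-square statistic of the resulting contingency table against a categorical target, which is the kind of quantity CHAID compares across candidate inputs. The function names, the cut points, and the toy data are assumptions made for illustration only; they are not taken from the study's software or database.

```python
def bin_values(values, edges):
    """Map continuous values to ordinal bin indices using ascending cut points."""
    return [sum(v > e for e in edges) for v in values]


def contingency_table(bins, labels):
    """Count records per (bin, class) pair; rows are bins, columns are classes."""
    bin_ids = sorted(set(bins))
    classes = sorted(set(labels))
    table = [[0] * len(classes) for _ in bin_ids]
    for b, y in zip(bins, labels):
        table[bin_ids.index(b)][classes.index(y)] += 1
    return table


def chi_square_statistic(table):
    """Pearson chi-square statistic; assumes no row or column sums to zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical toy data: distance to the blast-face (m) and a PPV class label.
distances = [50, 80, 120, 200, 260, 310, 400, 450]
ppv_class = ["high", "high", "high", "low", "high", "low", "low", "low"]

bins = bin_values(distances, edges=[100, 300])   # illustrative cut points
table = contingency_table(bins, ppv_class)
print(table, chi_square_statistic(table))
```

A larger statistic (for the same table shape) indicates a stronger association between the binned input and the target, so an input with a higher value would be preferred as the splitting variable in this simplified picture.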

Boosting procedures were introduced by Freund and Schapire [90], who used resampling and the combination of models to adjust the weights of misclassified examples. In this study, we used the boosted-CHAID technique for input selection because a single tree may not reveal the importance ranking of the variables: an important variable can be completely masked by other, related inputs.
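To make the idea of reweighting misclassified examples concrete, the following sketch shows one round of the discrete AdaBoost weight update. This is an assumption about the flavour of boosting: the text does not state which boosting variant the modelling software applied to CHAID, so the function name and the example values below are purely illustrative.

```python
import math


def adaboost_reweight(weights, misclassified):
    """One round of the discrete AdaBoost weight update.

    weights       -- current sample weights, summing to 1
    misclassified -- booleans, True where the current tree was wrong
    Returns the normalized new weights and the tree's vote weight (alpha).
    """
    error = sum(w for w, wrong in zip(weights, misclassified) if wrong)
    if not 0.0 < error < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    alpha = 0.5 * math.log((1.0 - error) / error)
    # Misclassified samples are up-weighted, correct ones down-weighted.
    updated = [w * math.exp(alpha if wrong else -alpha)
               for w, wrong in zip(weights, misclassified)]
    norm = sum(updated)
    return [w / norm for w in updated], alpha


# Hypothetical example: four samples, the first one misclassified.
new_weights, alpha = adaboost_reweight([0.25] * 4, [True, False, False, False])
print(new_weights, alpha)
```

After the update, the next component tree concentrates on the samples the previous tree got wrong, which is how variables that a single tree would mask can still surface in the ensemble.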

SVM Model and Its Variants
The SVM is one of the most prominent supervised machine learning (ML) techniques; it is based on statistical learning theory and the structural risk minimization principle [91]. The technique transforms a non-linear problem into a linear one by mapping the data into a space in which a separating hyperplane can be constructed, which yields a simpler, tractable formulation [92], as shown in Figure 2. The data transformation is carried out by a function known as the kernel function. The SVM seeks the largest margin of separation between the classes and places the classification hyperplane in the middle of that margin [93]. The two classes are labelled "+1" (positive samples), which lie above the hyperplane, and "−1" (negative samples), which lie below it. The class of a new record can then be predicted from its features.
This procedure is used for both classification and regression. For classification, the minimization assumes that all samples are classified correctly, whereas for regression it requires that the "y" value of each example deviates from f(x) by less than the required precision ε. For classification, the aim is to find a function f(x) = wx + b such that f(x) ≥ 1 for positive examples and f(x) ≤ −1 for negative examples. Under these conditions, maximizing the margin amounts to minimizing the magnitude of the derivative f′ = w. For regression, the objective is to determine a function f(x) = wx + b (pale diagonal line) such that f(x) lies within a required accuracy ε of the value y(x) (vertical bars) of every data point, namely |y(x) − f(x)| ≤ ε, where ε is the distance between the dashed lines and the pale diagonal line (Figure 3).
This research examined the SVM model with four kernels, RBF, LIN, SIG, and POL, to predict the PPV caused by quarry blasting. The term "kernel" typically refers to a technique that allows a linear classifier to solve a non-linear problem; in ML, this is also called the "kernel trick". The LIN kernel is suitable for simple, linearly separable data; otherwise, other functions should be employed. It is worth mentioning that the SIG kernel behaves like the RBF kernel for certain SVM parameter values [94]. The LIN kernel is a special case of the RBF kernel, so when the RBF kernel is adopted it is unnecessary to also apply the LIN kernel. With regard to precision, the RBF kernel interpolates better than the SIG kernel, which allows it to produce more consistent outcomes. However, the RBF kernel cannot extrapolate over longer ranges. The SIG kernel may behave inconsistently because it is not strictly positive definite, which may cause incorrect calculations. Tehrany et al. [95] asserted that the POL kernel is able to produce better extrapolations. Figure 4 presents the kernels' formulas. This study used every kernel type in order to investigate how efficiently each one predicts the PPV induced by blasting.
Figure 4 shows that some kernels, such as RBF and POL, have critical coefficients, "γ" and "d", that need to be tuned. In Figure 4, "γ" is the kernel width and "d" is the degree of the polynomial kernel. It is vital to find suitable values of "γ" and "d" because "γ" regulates the level of nonlinearity of the SVM model and "d" determines the degree of the polynomial kernel.
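As a rough illustration of the four kernel types and of the ε-tolerance used in regression, the sketch below implements the kernels in their commonly used textbook forms, together with the ε-insensitive loss. Since Figure 4 is not reproduced in the text, the exact parameterization (including the offset `coef0` and the default values) is an assumption and may differ from the formulas used in the study.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    """LIN: plain inner product."""
    return dot(x, y)


def polynomial_kernel(x, y, gamma=1.0, coef0=1.0, degree=3):
    """POL: (gamma * <x, y> + coef0) ** degree; 'degree' plays the role of d."""
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    """RBF: exp(-gamma * ||x - y||^2); gamma controls the kernel width."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    """SIG: tanh(gamma * <x, y> + coef0)."""
    return math.tanh(gamma * dot(x, y) + coef0)


def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Zero inside the epsilon tube |y - f(x)| <= epsilon, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Hypothetical two-feature samples, for illustration only.
a, b = [1.0, 2.0], [3.0, 4.0]
for kernel in (linear_kernel, polynomial_kernel, rbf_kernel, sigmoid_kernel):
    print(kernel.__name__, kernel(a, b))
```

Note that the RBF kernel depends only on the distance between the two samples, whereas the other three depend on their inner product; this difference underlies the interpolation and extrapolation behaviour discussed above.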

Experimental Database
When forecast models are developed, much attention is devoted to the computational model itself, whereas researchers pay only marginal attention to the database used for the development, training, and validation of the model. Without underestimating the importance and added value of research efforts towards the development of new computational models, we strongly believe that the reliability of the database is of utmost importance in achieving the ultimate goal of a reliable forecast. In addition to containing reliable data, a reliable database must comprise a sufficient amount of data, covering the full range of parameter (input and output) values that influence the problem under investigation.
It should be noted that the term "sufficient amount of data" does not necessarily imply a large amount of data, but rather datasets that cover a wide range of combinations of input parameter values, thus supporting the model's capability to simulate the problem. The demand for a reliable and capable database is especially crucial in the case of experimental databases, that is, databases compiled from experimental results. In this case, high deviation between experimental values is frequently observed, not only between experiments conducted by different research teams and laboratories, but even between datasets that derive from experiments conducted on specimens of the same synthesis, produced by the same technicians, cured under the same conditions, and tested using the same standards and the same testing instruments.
In light of the above discussion, a large experimental database consisting of 166 datasets was compiled from field measurements. To provide a significant amount of data for assessing the environmental effects of mine blasting, we studied four quarries in Malaysia. Details of these sites are presented in Figure 5. The goal of blasting at these sites is to provide aggregate material for various applications. Depending on the weather, six to twelve blasts are performed each month. Of the 166 blasting events investigated, 80 took place at the Kulai quarry site, and 31, 29, and 26 took place at the Bukit Indah, Senai Jaya, and Taman Bestari quarry sites, respectively. The smallest hole depth (10 m) was at the Kulai site, while the largest hole depth (28 m) was at the Bukit Indah site.
The following parameters influencing the blast effect were recorded: powder factor (kg/m³), spacing (m), stemming length (m), burden (m), maximum charge per delay (kg), and the distance from the blast-face to the monitoring point (m). These parameters are common blasting data and have been used in many published works [1,17,19,22]. It is also important to mention that the most important input factors in measuring/predicting the PPV are the maximum charge per delay and the distance from the blast-face [21,22,96,97]. All blast-holes in the database had a diameter of 115 mm, and fine gravel, a well-known stemming material, was used in these operations. We recorded the PPV using VibraZEB seismograph equipment at specific locations. Table 2 presents a summary of the measured input and output variables, including unit, maximum, minimum, mean, and standard deviation. The frequency distributions of the PPV employed in this investigation are presented in Figure 6.

Input Selection
We employed a hybrid approach for input selection. This technique was applied to six inputs for predicting the PPV values. The model was developed using the following settings: the tree-growing algorithm was set to CHAID; the maximum tree depth was set to five; the minimum numbers of records in parent and child branches were set to two and one, respectively; the number of component models for boosting was set to 10; and the significance level for splitting and merging was set to 0.05. The accuracies of the CHAID and boosted-CHAID models were 79.6% and 89%, respectively, which showed the superiority of the boosted model over the single-tree model. According to the boosted-CHAID model results, four inputs, namely distance (m), stemming (m), powder factor (kg/m³), and maximum charge per delay (kg), were the most important predictors for the PPV forecast. These critical inputs were then used to build the SVM models with diverse kernels to predict the PPV caused by quarry blasting.

SVM Models with Different Kernels
This study applied four SVM models with four different kernels: RBF, polynomial, sigmoid, and linear. The models were built on the four parameters identified as the most critical and relevant, namely stemming, powder factor, maximum charge per delay, and distance. The following settings were used in developing these models: the stopping criterion was set to 1.0 × 10⁻³, the regularization parameter (C) was set to 10, and the regression precision (epsilon) was set to 0.1. Before the models' development, the data were split into training and testing partitions using a ratio of 80:20. Thus, 104 samples were used in the training phase, and 36 samples were used for the testing phase. The measured PPV values and the values predicted by all four models are shown in Figure 7.
Appl. Sci. 2021, 11, 3705
This research used two common criteria for assessing the models' performance: Pearson's correlation coefficient (R) and the mean absolute error (MAE). In addition, a gain chart was used to illustrate the performance of the models graphically. In the definitions of these criteria, y_im, y_ip, and ȳ_m denote the measured values, the predicted values, and the mean of the measured values, respectively, and n represents the total number of data; the MAE, for instance, is the average of |y_im − y_ip| over the n data points. A simple ranking system that rates the performance of the models for each partition was also developed. In this system, training and testing rankings were assigned to each model. An accumulative ranking, the sum of the training and testing ranks, was then computed for each model as:
A-R = α_tr + β_tr + α_te + β_te
where A-R is the accumulative ranking of each model, α denotes the ranking of R, β denotes the ranking of MAE, "tr" indicates the training ranking, and "te" indicates the testing ranking.
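The evaluation criteria and the accumulative ranking can be sketched in a few lines. The code below uses the standard definitions of Pearson's R and the MAE and assumes that, among the models compared, the best value of a criterion receives the highest rank; how ties are handled is not stated in the text and is not treated specially here. All numeric values are hypothetical and are not the results reported in Table 3.

```python
import math


def mean_absolute_error(measured, predicted):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def pearson_r(measured, predicted):
    """Pearson's correlation coefficient between measured and predicted values."""
    mean_m = sum(measured) / len(measured)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)


def rank_scores(values, higher_is_better):
    """Rank from 1 (worst) to n (best); ties are not handled specially."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not higher_is_better)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def accumulative_ranking(r_train, mae_train, r_test, mae_test):
    """A-R = rank(R, train) + rank(MAE, train) + rank(R, test) + rank(MAE, test)."""
    parts = [rank_scores(r_train, True), rank_scores(mae_train, False),
             rank_scores(r_test, True), rank_scores(mae_test, False)]
    return [sum(model_ranks) for model_ranks in zip(*parts)]


# Hypothetical criteria for four models (illustration only).
a_r = accumulative_ranking(
    r_train=[0.95, 0.90, 0.85, 0.92], mae_train=[0.05, 0.08, 0.12, 0.06],
    r_test=[0.93, 0.88, 0.80, 0.91], mae_test=[0.07, 0.09, 0.15, 0.06],
)
print(a_r)
```

With four models, each individual rank lies between 1 and 4, so the accumulative ranking of a model lies between 4 and 16.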
The performances of the models developed in this study are shown in Table 3. In the training phase, the BC-SVMRBF model achieved the highest ranks for both R and MAE, whereas the lowest ranks for R and MAE belonged to the BC-SVMSIG model. In the testing phase, the BC-SVMRBF model outperformed the other models in terms of R; regarding the MAE, however, the BC-SVMLIN model achieved the highest ranking. Again, the BC-SVMSIG model achieved the lowest ranking in the testing phase compared with the other models. With regard to the accumulative ranking, the BC-SVMRBF model achieved the highest ranking (A-R ranking = 15), followed by the BC-SVMLIN model (A-R ranking = 12). In contrast, BC-SVMSIG had the worst performance and consequently the lowest accumulative ranking.
Note (Table 3): V, value; R, ranking; A-R, accumulative ranking.
We also employed a gain chart to compare the models developed in this study. Here, the "gain" measures how successfully a predictive technique identifies values higher than the midpoint of the field's range (PPV > 0.557). Mathematically, the gain is calculated as follows:
Gain (%) = (q / w) × 100
where "q" is the number of hits in the quantile and "w" is the total number of hits.
In the resulting gain chart, the perfect model is denoted by the blue line, the diagonal red line denotes the random (chance) model, and the lines in between denote the models used in this research. Generally speaking, higher lines indicate models with higher prediction accuracy, especially on the chart's left side. The area between a model's line and the red line illustrates the gain of the applied model over the random model, that is, its superiority over chance. The area between an applied model's line and the perfect model's line indicates the room for improvement of the applied model. The results of the gain computation are presented in Figure 8. They show that BC-SVMSIG (the green line) had the worst gain in both the training and testing phases.
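A cumulative gain curve of the kind described here can be sketched as follows. A "hit" is taken to be a record whose measured PPV exceeds the 0.557 threshold mentioned in the text; sorting the records by predicted value and the choice of quantile boundaries are assumptions about how such charts are typically built, and the sample values are hypothetical.

```python
def cumulative_gains(predicted, hits, n_quantiles=10):
    """Cumulative gain (%) after each quantile of records.

    Records are sorted by descending predicted value; the gain after a
    quantile is the share of all hits found so far, times 100.
    """
    total_hits = sum(hits)
    if total_hits == 0:
        raise ValueError("at least one hit is required")
    order = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    gains = []
    for k in range(1, n_quantiles + 1):
        cutoff = (k * len(order)) // n_quantiles   # records in the first k quantiles
        hits_so_far = sum(hits[i] for i in order[:cutoff])
        gains.append(100.0 * hits_so_far / total_hits)
    return gains


# Hypothetical measured and predicted PPV values (illustration only).
measured = [1.0, 0.6, 0.2, 0.7]
predicted = [0.9, 0.8, 0.3, 0.2]
hits = [m > 0.557 for m in measured]    # threshold taken from the text
gains = cumulative_gains(predicted, hits, n_quantiles=2)
print(gains)
```

A model whose curve rises quickly on the left finds most of the high-PPV events among its top-ranked predictions, which is why the left side of the chart is the most informative.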
The importance of the input variables in the different SVM models is shown in Figure 9. As can be seen, all models except BC-SVMSIG identified "distance" as the most important predictor of the PPV. In particular, the BC-SVMRBF and BC-SVMLIN models both identified "distance" as the most influential factor on the PPV. "Stemming" was recognized as an influential predictor only by the BC-SVMPOL and BC-SVMSIG models: the former identified it as another significant variable, whereas the latter identified it as the most significant PPV predictor. The "maximum charge per delay" and "powder factor" were identified as influential factors only by the BC-SVMSIG model.

Discussion
This study aimed to assess the feasibility of using a hybrid approach that combines a boosted-CHAID technique and SVM technique with different Kernels to predict the PPV induced by quarry blasting. The applied models were analyzed regarding accuracy, error, gain performance, and input variables' importance. The models' evaluation showed that the BC-SVMRBF model achieved the best performance, which shows the efficiency of hybridizing boosted-CHAID and SVM with RBF Kernel to predict the PPV. Alternatively, the BC-SVMSIG had the weakest performance in terms of accuracy, error, and gain, which showed that this hybridization approach is not suitable for predicting the PPV.
The finding of the present study in terms of better performance of RBF over other kernel types is in line with those of studies in other disciplines, which pointed out that the SVM model with RBF kernel has the greatest forecast capability (e.g., [91]).
Several properties of the RBF kernel may explain its better performance over the other kernel types. These properties include its stationarity and smoothness. In addition, the RBF kernel is isotropic. Here, stationarity means that the RBF kernel is invariant to translation, while isotropy means that the scaling by γ acts equally in all directions.
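The stationarity and isotropy described above can be checked numerically. The sketch below assumes the common form K(x, y) = exp(−γ‖x − y‖²) for the RBF kernel. The value of γ, the two points, and the helper functions `translate` and `rotate2d` are illustrative choices and are not taken from the study.

```python
import math

def rbf_kernel(x, y, gamma):
    """RBF kernel K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def translate(point, shift):
    """Shift a point by a fixed vector (hypothetical helper)."""
    return [a + s for a, s in zip(point, shift)]

def rotate2d(point, angle):
    """Rotate a 2-D point about the origin (hypothetical helper)."""
    c, s = math.cos(angle), math.sin(angle)
    return [c * point[0] - s * point[1], s * point[0] + c * point[1]]

gamma = 0.5                      # illustrative value, not from the study
x, y = [1.0, 2.0], [2.5, 0.5]    # arbitrary example points
shift = [10.0, -3.0]

base = rbf_kernel(x, y, gamma)
# Stationarity: shifting both points by the same vector leaves K unchanged.
shifted = rbf_kernel(translate(x, shift), translate(y, shift), gamma)
# Isotropy: rotating both points by the same angle leaves K unchanged,
# because K depends only on the distance between the points.
rotated = rbf_kernel(rotate2d(x, 0.7), rotate2d(y, 0.7), gamma)

print(base, shifted, rotated)
```

The three printed values agree up to floating-point rounding, since the kernel depends on the inputs only through the distance ‖x − y‖.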
To support the efficiency of the proposed approaches, we applied two ANN models with different structures, a Multilayer Perceptron (MLP) and a Radial Basis Function (RBF) network, to the same data. Training R values of 0.858 and 0.849 were achieved for the ANNMLP and ANNRBF models, respectively. These results showed that all SVM models except SVMSIG outperformed the ANN models when hybridized with the Boosted-CHAID method.
Compared with previous studies on the same dataset, this study achieved a slightly lower accuracy than the study by Armaghani et al. [98]. Armaghani et al. applied ANN and ANFIS models to five input parameters and achieved an R of 0.96, whereas in this study the hybrid models were applied to four inputs and achieved acceptable training accuracy, especially for the SVM with the RBF kernel (0.95). It can therefore be concluded that the proposed models are reliable and sufficiently accurate in predicting the PPV induced by blasting, while retaining the advantage of reduced complexity through fewer input parameters.

Conclusions
The aim of this research was to predict the PPV using a hybrid ML model that combines boosted-CHAID with SVM techniques using different kernels. The boosted-CHAID model required only four out of a total of five input variables (distance from blast-face, stemming length, powder factor, and maximum charge per delay). Based on these input variables, SVM models with different kernels, i.e., SVMRBF, SVMPOL, SVMSIG, and SVMLIN, were designed to predict PPV values. Among these four models, SVMRBF and SVMSIG were identified as the best and worst, respectively, in predicting the PPV. The R and MAE indices of the SVM with the radial basis function kernel in the training and testing phases confirm the high capability of this kernel in predicting PPV values. With regard to the importance of the PPV predictors, "distance" had the greatest importance, which is in line with the boosted-CHAID model results. All models identified this input parameter as an influential predictor, which underlines its importance for PPV forecasting. The variable-importance results are in line with the published intelligent and empirical studies in the area of PPV prediction.
The modeling method of this study can also be utilized in other disciplines to add a different problem-solving perspective. The performance of the SVM models is strongly affected by the choice of the values of "γ" and "d". In the present study, we employed the grid-search technique to determine the optimal values of "γ" and "d". The performance of the SVM models may therefore be improved if "γ" and "d" are chosen by novel optimization techniques. Future studies on the use of SVMs for PPV prediction should thus concentrate on adopting innovative soft computing optimization techniques to optimize the values of the kernel parameters.
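A grid search over "γ" and "d" can be sketched as below. This is a hedged illustration only and not the authors' implementation. To stay self-contained, it swaps in kernel ridge regression for the SVM and uses a polynomial kernel (γ⟨x, y⟩ + 1)^d, taking "d" to be the polynomial degree as is conventional. The data are synthetic, and the grid values, the `ridge` parameter, and all function names are assumptions made for the example.

```python
import itertools

def poly_kernel(x, y, gamma, d):
    """Polynomial kernel (gamma * <x, y> + 1)^d."""
    return (gamma * sum(a * b for a, b in zip(x, y)) + 1.0) ** d

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z

def fit_predict(train_X, train_y, test_X, gamma, d, ridge=1e-3):
    """Kernel ridge regression (a stand-in for SVM regression, not the paper's model)."""
    K = [[poly_kernel(xi, xj, gamma, d) + (ridge if i == j else 0.0)
          for j, xj in enumerate(train_X)] for i, xi in enumerate(train_X)]
    alpha = solve(K, train_y)
    return [sum(a * poly_kernel(xi, xt, gamma, d) for a, xi in zip(alpha, train_X))
            for xt in test_X]

def mae(observed, predicted):
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

# Synthetic 1-D data (a cubic curve); not data from the study.
X = [[i / 10.0] for i in range(-15, 16)]
y = [x[0] ** 3 - x[0] for x in X]
train_X, train_y = X[::2], y[::2]      # even-indexed points for fitting
valid_X, valid_y = X[1::2], y[1::2]    # odd-indexed points for validation

# Illustrative candidate values; the grid used in the study is not given here.
grid = {"gamma": [0.1, 0.5, 1.0], "d": [1, 2, 3]}

scores = {}
for gamma, d in itertools.product(grid["gamma"], grid["d"]):
    predictions = fit_predict(train_X, train_y, valid_X, gamma, d)
    scores[(gamma, d)] = mae(valid_y, predictions)

best = min(scores, key=scores.get)
print("best (gamma, d):", best, "validation MAE:", round(scores[best], 4))
```

The search evaluates every (γ, d) pair exhaustively, so its cost grows with the product of the grid sizes. This is one reason the optimization techniques suggested above may be attractive for larger parameter spaces.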
While the present study selected the RBF kernel as the best kernel, it should be noted that the proper kernel function is problem-specific. An interesting topic for future studies is therefore a practical procedure for selecting kernel functions and their parameter values to suit the given problem.
Future investigations utilizing SVM models may apply single and hybrid forms with various kernels to other environmental issues of blasting. It may also be of interest to apply the model to a wider database to enhance its prediction accuracy.

Conflicts of Interest:
The authors declare no conflict of interest.