Investigating the Applications of Machine Learning Techniques to Predict the Rock Brittleness Index

Abstract: Despite the vast usage of machine learning techniques to solve engineering problems


Introduction
In underground space and excavation-related projects, brittleness is considered one of the most important properties of the rock mass. An appropriate insight into rock brittleness also helps engineers in other fields alleviate brittleness-related issues. For example, sufficient knowledge of rock brittleness could help oil and gas engineers evaluate wellbore stability and appraise the performance of a hydraulic fracturing job [1]. Moreover, brittleness governs the mechanical properties of shale rocks; these properties can be characterized using several parameters such as the carbonates, the volumetric fractions of strong minerals, weak elements and pores, the Young's modulus, and the strength [2]. In deep underground engineering, brittleness is a critical factor in assessing the stability of the surrounding rock mass [3].
Besides, many rock mechanics disasters such as rock-bursts may stem from brittleness [4][5][6]. Several studies showed that brittleness is also an important factor in estimating tunnel boring machine (TBM) and road-header cutting performance [7]. In addition, it defines the excavation efficiency of drilling, which considerably influences coal mining [8]. Hence, the assessment of rock brittleness is necessary for geotechnical and rock mechanics projects [5]. However, although brittleness is an important parameter for designing civil and mining engineering projects, according to Altindag [9] there is still no consensus on its definition and measurement standards. According to Yagiz [10], various rock properties influence rock brittleness. Some studies have related brittleness to the lack of ductility or the inversion of ductility [11]. Ramsey [12] defined brittleness as the breaking of the inter-particle cohesion of a rock. In addition, Obert and Duvall [13] pointed out that brittleness is the inclination of a material, such as cast iron or many types of rock, to split under a pressure equal to or higher than the material's yield stress. A highly brittle rock typically has the following features: (a) failure without considerable force, (b) generation of small particles, (c) a high ratio of compressive to tensile strength, (d) high firmness, (e) a high internal friction angle, and (f) production of fully developed fractures in hardness lab experiments [14]. The majority of studies conducted on the rock brittleness index (BI) were based on the relationship between the tensile and uniaxial compressive strengths of the rock samples [15][16][17]. However, a few studies have suggested relationships between the BI and other rock properties such as the elasticity modulus, hardness, Poisson's ratio, internal friction angle, and quartz content [18]. The performance of these models has been reported as not capable enough to predict the BI, because most of them used only one or two dependent parameters [8,17]. The use of multi-input predictive systems for estimating the BI of the rock is therefore expected to yield a higher degree of accuracy than simple regression models.
Recently, a large number of studies have used soft computing (SC), machine learning (ML) and artificial intelligence (AI) techniques to solve problems in science and engineering. However, only a limited number of studies in the literature have applied these methods to predict the rock BI. Kaunda and Asbury [40] employed an artificial neural network (ANN) technique to predict the rock BI using system inputs such as the S- and P-wave velocities, Poisson's ratio, elastic modulus and unit weight. Yagiz and Gokceoglu [8] estimated the rock BI by constructing a fuzzy inference system (FIS) and non-linear regression analyses. The inputs used to develop these models were the unit weight, uniaxial compressive strength (UCS) and Brazilian tensile strength (BTS) of the rock. They concluded that the FIS model is an applicable technique for further studies in the same field. Koopialipoor et al. [16] proposed predictive equations for calculating the rock BI as a function of intact rock properties, including rock density, Schmidt hammer rebound number and p-wave velocity; they used a hybrid approach that combined the firefly algorithm and ANN models to develop the equations. Khandelwal et al. [17] examined the feasibility of a genetic programming model for predicting the brittleness of intact rocks; their study used multiple input variables, including UCS, BTS and unit weight, to forecast the BI of the rock mass.
While several previous studies acknowledged the suitability of ML techniques for solving engineering problems, several ML techniques remain unused or barely applied to predict the rock BI.
To the authors' best knowledge, no study has examined the feasibility of well-known ML techniques such as the chi-square automatic interaction detector (CHAID), random forest (RF), support vector machine (SVM), and K-nearest neighbors (KNN) for predicting the BI. Thus, in this study the abovementioned ML techniques, plus the ANN technique (as a benchmark in the field of ML), were employed for BI prediction. The performance of each model was evaluated through five performance indices and a gain chart. Additionally, the three best models of this study are discussed in more detail.

Models Developed
The models developed in this study are CHAID, RF, SVM, KNN, and ANN. The CHAID belongs to the decision tree family and produces a non-binary tree structure. This technique, developed by Kass [41], employs a chi-square test to produce multiple sequential combinations and splits, and finally a single decision tree. Typically, decision tree techniques are susceptible to overfitting; however, the CHAID automatically prunes the tree to alleviate this phenomenon. Moreover, the CHAID generates a number of rule sets, each with a confidence level and an accuracy.
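The chi-square statistic that CHAID uses to score candidate splits can be sketched in a few lines. The contingency table below is purely hypothetical (two candidate categories of a predictor against low/high BI classes), not taken from the paper's data:

```python
import numpy as np

def chi_square(table):
    """Pearson chi-square statistic for an observed contingency table."""
    table = np.asarray(table, dtype=float)
    # Expected counts under independence of rows (predictor categories)
    # and columns (target classes)
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
    return np.sum((table - expected) ** 2 / expected)

# Hypothetical table: rows = two candidate categories of a predictor,
# columns = low-BI / high-BI samples falling into each category
stat = chi_square([[30, 10], [12, 28]])
print(round(stat, 2))  # -> 16.24
```

CHAID compares such statistics (via their p-values) across candidate splits and merges categories whose distributions do not differ significantly.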
While single decision trees are easy to implement and understand, small changes to the training data can lead to very different generalization behaviour [42]; indeed, these techniques are viewed as unstable and high-variance. The RF technique is an efficient remedy for these shortcomings. This technique, developed by Breiman [43], is an ensemble-based approach (Figure 1). The RF generates more accurate predictions than single trees because it combines a large number of them. It is worth pointing out that the RF employs a bagging approach to create each member of the ensemble from a different bootstrap dataset; without this randomization, the trees drawn from the space of single decision trees would be almost identical and produce low-diversity predictions. Other ML techniques such as KNN, SVM, and ANN are also powerful tools for classification and regression analysis in civil and mining problems [35]. KNN is an easy-to-implement, simple and effective data mining algorithm [44]. The basic idea behind KNN is to find the group of "k" samples in the calibration dataset (e.g., using distance functions) that are nearest to an unknown sample. The KNN then determines the response for the unknown sample by averaging the response variables of those "k" samples [45]. Thus, "k" plays an important role in the KNN performance [46].
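The variance-reduction argument for bagging can be illustrated on synthetic data. This sketch (using scikit-learn, with a made-up noisy sine curve rather than the paper's dataset) compares a single fully grown tree against a 100-tree forest:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.3, size=200)  # noisy training targets

tree = DecisionTreeRegressor(random_state=0).fit(X, y)            # single tree
forest = RandomForestRegressor(n_estimators=100,                   # bagged ensemble
                               random_state=0).fit(X, y)

# Evaluate on noise-free test points: the single tree has memorized the
# training noise, while averaging over bootstrap trees smooths it out.
X_test = rng.uniform(0, 10, size=(100, 1))
y_test = np.sin(X_test[:, 0])
mse_tree = np.mean((tree.predict(X_test) - y_test) ** 2)
mse_forest = np.mean((forest.predict(X_test) - y_test) ** 2)
print(mse_forest < mse_tree)
```

On data like this the ensemble's test error is typically several times lower than the single tree's, which is the instability argument made above in miniature.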
Concerning the SVM, this technique is capable of handling high-dimensional and linearly non-separable datasets [47,48]. In addition, it can reduce the error on the training and testing datasets, as well as the model complexity [49]. According to Cortes and Vapnik [50], statistical learning theory is the basic theory behind the SVM. The performance of the SVM is influenced by kernel functions, such as the linear, radial basis function, sigmoid and polynomial kernels [51]. It is noteworthy that the SVM aims to determine an optimal separating hyperplane that can distinguish two classes [51], while SVM regression aims to discover the largest margin. Figure 2 shows a typical structure of the SVM. In terms of the ANN, this technique is a kind of artificial intelligence that emulates some functions of the human mind. Typically, the ANN tends to store experiential knowledge [52]. This technique comprises a series of layers, and each layer comprises a sequence of neurons. The neurons in every layer are connected through weighted links to all neurons in the previous and following layers [52][53][54][55].
A positive weight reveals an excitatory association, whereas a negative weight reveals an inhibitory association. A typical ANN includes three layers, i.e., an input layer, a hidden layer, and an output layer [56][57][58][59][60]. This structure is shown in Figure 3.
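A minimal forward pass through the three-layer structure described above can be sketched as follows. The weights are random and the input vector merely echoes the dataset averages quoted later in the paper, so this illustrates the layered, weighted-link computation, not the trained model:

```python
import numpy as np

rng = np.random.default_rng(42)
# 4 input neurons -> 4 hidden neurons -> 1 output neuron
W1 = rng.normal(size=(4, 4))   # positive entries: excitatory; negative: inhibitory
b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1))
b2 = np.zeros(1)

def forward(x):
    h = np.tanh(x @ W1 + b1)   # hidden-layer activations
    return (h @ W2 + b2)[0]    # linear output neuron for regression

# Illustrative (unscaled) input: Vp, D, Rn, Is50
x = np.array([5491.6, 2.59, 40.5, 3.6])
print(forward(x))
```

In practice the inputs would be normalized before training, and the weights would be fitted by back-propagation rather than drawn at random.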

Data and Case Study
The data of this study were acquired from the Pahang-Selangor tunnel, Malaysia (Figure 4). The main aim of constructing this tunnel was to provide a flow path for fresh water. The tunnel specifications are as follows: diameter, 5.2 m; length, 44.6 km; and longitudinal gradient, 1/1900. In addition, under free-flow conditions, the maximum allowable discharge of the tunnel is 27.6 m³/s. Three different TBMs were used to excavate about 35 km of the tunnel; the remainder was excavated using the drilling and blasting method. The geological units include granite, metamorphic and some sedimentary rocks, though most of the rock excavated by the abovementioned methods is granite. Many geotechnical and geological investigations were conducted in the tunnel to collect rock block samples for testing. Finally, at multiple locations of the TBM sites, more than 100 granite block samples were obtained from the tunnel face. A robust procedure from the International Society for Rock Mechanics [61] was followed to prepare the samples for testing. Several lab tests were conducted on the samples, including density in dry condition (D), Schmidt hammer rebound number (Rn), uniaxial compressive strength (σc), tensile strength (σt), point load index (Is50), and p-wave velocity (Vp). In this study, the BI values were calculated according to the following equation [62]:

BI = σc / σt (1)

where σc and σt are the uniaxial compressive strength and tensile strength, respectively. Thus, selecting the BI values as the model output, the four parameters of density, Schmidt hammer rebound number, point load index and p-wave velocity were set as inputs, forming a database with 110 datasets. The range, mean, unit and symbols of the input and output parameters are tabulated in Table 1. According to this table, average values of 5491.6 m/s, 2.59 g/cm³, 40.5, 3.6 MPa, and 15.5 were obtained for Vp, D, Rn, Is50 and BI, respectively. In the next section, the modeling procedure for approximating BI as a function f(Vp, D, Rn, Is50) and the obtained results are presented in detail.
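Equation (1) amounts to a one-line function. The sample strengths below are hypothetical, chosen only so that the ratio lands near the dataset's mean BI of 15.5:

```python
def brittleness_index(sigma_c, sigma_t):
    """BI as the ratio of uniaxial compressive to tensile strength, Eq. (1)."""
    return sigma_c / sigma_t

# Hypothetical granite sample: sigma_c = 124 MPa, sigma_t = 8 MPa
print(brittleness_index(124.0, 8.0))  # -> 15.5
```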

Modelling Process and Results
The present study developed five ML models to predict the BI of rock material. To develop the models, a database containing 110 datasets was used. These data were split into training and testing sets with a 70%:30% ratio; thus, 77 samples were used for training and 33 for testing. As pointed out earlier, five ML models, including RF, CHAID, SVM, KNN and ANN, were developed to estimate the BI of the rock. Each of these models was evaluated using a simple ranking system and a gains chart. The three best models are discussed in more detail.
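The 70%:30% partition can be reproduced with a simple random permutation of sample indices; the seed here is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 110                               # total samples in the database
idx = rng.permutation(n)              # shuffle the sample indices
n_train = 77                          # 70% of 110
train_idx, test_idx = idx[:n_train], idx[n_train:]
print(len(train_idx), len(test_idx))  # -> 77 33
```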

Evaluation of the Developed Models
Once the models had been developed, the accuracy of each model was evaluated using five well-known indices, i.e., the coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE), variance account for (VAF), and a20-index, calculated as in Equations (2)-(6):

R² = 1 − Σ(y − ŷ)² / Σ(y − ȳ)² (2)
RMSE = √((1/N) Σ(y − ŷ)²) (3)
MAE = (1/N) Σ|y − ŷ| (4)
VAF = (1 − var(y − ŷ)/var(y)) × 100 (5)
a20-index = m20/N (6)

where y denotes the measured values, ȳ and ŷ indicate the mean of y and the predicted values, respectively, N denotes the total number of data, and m20 is the number of samples whose ratio of experimental to predicted value lies between 0.8 and 1.20. This study also employed an easy-to-understand ranking system which ranked each developed model using the above-mentioned performance criteria for both the training and testing stages. For each criterion, the ranking system first sorted the models based on their obtained values, then assigned the highest rank (5) to the best value and the lowest rank (1) to the worst. The final rank of each model was calculated by summing the ranking values over both stages (Equation (7)):

Rⱼ = Σᵢ Rᵢⱼ(train) + Σᵢ Rᵢⱼ(test) (7)

where i denotes the indices, j the dataset, and R the model's ranking.
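Equations (2)-(6) translate directly into code; the measured/predicted vectors below are made-up placeholders, not the paper's data:

```python
import numpy as np

def r2(y, yhat):
    return 1 - np.sum((y - yhat) ** 2) / np.sum((y - np.mean(y)) ** 2)

def rmse(y, yhat):
    return np.sqrt(np.mean((y - yhat) ** 2))

def mae(y, yhat):
    return np.mean(np.abs(y - yhat))

def vaf(y, yhat):
    return (1 - np.var(y - yhat) / np.var(y)) * 100

def a20(y, yhat):
    # fraction of samples whose experimental/predicted ratio is in [0.8, 1.2]
    r = y / yhat
    return np.mean((r >= 0.8) & (r <= 1.2))

y = np.array([15.0, 16.2, 14.8, 17.1])       # hypothetical measured BI
yhat = np.array([15.3, 15.9, 14.5, 17.5])    # hypothetical predicted BI
print(round(rmse(y, yhat), 3), a20(y, yhat))  # -> 0.328 1.0
```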
The values and ranks of the performance indices and models are presented in Table 2. The results of this evaluation showed that the KNN model achieved the highest final rank (37), followed by RF (34) and ANN (33). For the training dataset, the RF obtained the highest rank (25) while the SVM obtained the lowest (5); for the testing dataset, the ANN achieved the highest rank (22) while the CHAID achieved the lowest (6). Turning to the performance indices, the RF outperformed the other models on the training dataset; however, for the testing dataset, the ANN achieved the best ranks for three indices, R², RMSE, and a20-index, and the second-best rank for VAF. Based on these rank values, the three models RF, KNN, and ANN were selected to be discussed in more detail in the following sections.
The authors also used a gain chart (Figure 5) to compare the performance of the proposed models for both the training and testing datasets. Gains are estimated as (number of hits in quantile / total number of hits) × 100%. Here, a "hit" refers to the success of a model in predicting values greater than the midpoint of the field's range (BI > 16.458). In this chart, the blue line denotes the perfect model, which has perfect confidence (hits = 100% of cases), the diagonal red line denotes the at-chance model, and the other five lines in the middle represent the models developed in this study. Typically, higher lines indicate better models, particularly on the left side of the chart. The area between a developed model and the red line quantifies how much better the proposed model is than the at-chance model, while the area between a proposed model and the perfect model identifies where the proposed model can be improved.
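The gains computation described above can be sketched as follows. The scores are synthetic stand-ins for a model's predicted BI values, and the 16.458 threshold is the midpoint quoted in the text:

```python
import numpy as np

def cumulative_gains(y_true, scores, threshold, quantiles=10):
    """Percent of 'hits' (y_true > threshold) captured in the top-k scored quantiles."""
    hits = y_true > threshold
    order = np.argsort(-scores)                 # best-scored samples first
    sorted_hits = hits[order]
    total = hits.sum()
    bins = np.array_split(sorted_hits, quantiles)
    return np.cumsum([b.sum() for b in bins]) / total * 100

rng = np.random.default_rng(1)
y = rng.uniform(10, 22, 100)                    # synthetic BI values
scores = y + rng.normal(0, 1, 100)              # a decent model's predictions
gains = cumulative_gains(y, scores, threshold=16.458)
print(gains[-1])  # -> 100.0: all hits are captured by the last quantile
```

A model whose gains curve rises toward 100% in the earliest quantiles behaves like the "perfect model" line in Figure 5; a curve hugging the diagonal behaves like the at-chance model.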
For the training stage, the perfect model correctly identified 100% of the samples with BI greater than 16.458 at the 40th percentile. The RF model was its closest follower, correctly identifying 100% of such samples at the 50th percentile; the weakest model was the CHAID, which identified the hits at the 84th percentile. For the testing dataset, the perfect model correctly identified 100% of the samples with BI greater than 16.458 at the 35th percentile, followed by the RF and ANN models at the 41st percentile. The KNN had the weakest performance, identifying the hits at the 47th percentile.

Random Forest Model
The RF model was developed using the four input variables Vp, D, Rn, and Is50 to predict the rock BI. Several parameters were tuned to develop the RF model: after a trial-and-error procedure, the number of trees to build was set to 100, the sample size to 0.95, the maximum number of nodes to 10,000, and the maximum tree depth and minimum child node size to 10 and 2, respectively. The BI values predicted by the RF, along with their actual values for the training and testing datasets, are displayed in Figure 6. The obtained R² values of 0.89 and 0.75 for the training and testing stages of the RF model, respectively, revealed a suitable accuracy level for both stages. In addition, the RF model quantified the importance of the input variables (Figure 7): Rn, with an importance of 0.37, was identified as the most important variable, followed by Vp (0.35) and Is50 (0.29). It is noteworthy that the RF model did not consider D an important factor.
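Under the reported settings (100 trees, sample size 0.95, maximum 10,000 nodes, depth 10, minimum child node size 2), a scikit-learn analogue might look like the sketch below. The data are synthetic stand-ins deliberately driven by the Rn column, so the importance ranking here is illustrative only:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Synthetic stand-in for the 77 training samples: columns Vp, D, Rn, Is50
X = np.column_stack([
    rng.uniform(4000, 7000, 77),   # Vp (m/s)
    rng.uniform(2.4, 2.8, 77),     # D (g/cm3)
    rng.uniform(25, 55, 77),       # Rn
    rng.uniform(1.5, 6.0, 77),     # Is50 (MPa)
])
y = 0.3 * X[:, 2] + rng.normal(0, 1, 77)  # hypothetical target driven mostly by Rn

rf = RandomForestRegressor(n_estimators=100,    # number of trees
                           max_samples=0.95,    # bootstrap sample size
                           max_leaf_nodes=10_000,
                           max_depth=10,
                           min_samples_leaf=2,  # minimum child node size
                           random_state=0).fit(X, y)
print(rf.feature_importances_.argmax())  # -> 2: the Rn column dominates
```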

ANN Model
As mentioned earlier, this study developed the ML predictive models using the four input variables Vp, D, Rn, and Is50 to predict the rock BI. Several parameters were used to develop the ANN model. The type of neural network was the multilayer perceptron. The study used "mean" as the default combining rule for the continuous target, and the number of component models for boosting and/or bagging was set to 10. To avoid over-fitting, the over-fit prevention set was set to 30%. Different values were examined to determine the number of hidden neurons, and the final model used 4 hidden neurons to predict the BI. Figure 8 shows the suggested architecture of the ANN model, with four input neurons, four hidden neurons and one output neuron. In addition, the BI values predicted by the ANN, along with their actual values for the training and testing datasets, are displayed in Figure 9. The obtained R² values of 0.75 and 0.85 for the training and testing stages, respectively, showed that the ANN model provides an acceptable level of accuracy, especially on the testing dataset, for estimating the BI. The ANN can also determine the importance of the inputs used in the system (Figure 10): Rn and Vp were identified as the most and least important parameters for the BI, respectively, and the result for Rn agrees with the RF analysis.
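A single-hidden-layer perceptron with 4 hidden neurons can be sketched with scikit-learn as below. Note that this is not the ensemble-of-10 boosted/bagged configuration described above, and the data are synthetic:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(77, 4))            # scaled Vp, D, Rn, Is50 (synthetic)
y = 10 + 8 * X[:, 2] + rng.normal(0, 0.1, 77)  # hypothetical BI-like target

mlp = MLPRegressor(hidden_layer_sizes=(4,),    # one hidden layer, 4 neurons
                   max_iter=5000,
                   random_state=0).fit(X, y)
print(mlp.n_layers_)  # -> 3: input, hidden, and output layers, as in Figure 8
```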

KNN Model
In developing the KNN model, several assumptions and parameters were considered. The KNN model was developed to strike a balance between speed and accuracy; therefore, the model automatically selected the best number of neighbors within a small range. In the present study, k values between 3 and 5 were examined through a trial-and-error procedure, and the distance computation was based on the Euclidean metric. The BI values predicted by the KNN, along with their actual values for the training and testing datasets, are displayed in Figure 11. With R² values of 0.81 and 0.84 for the training and testing stages, respectively, the KNN model offers a more balanced performance across the two stages than the RF and ANN. Figure 12 shows the suggested structure of the KNN predictive model for predicting the BI.
This figure shows the relationship between the predictors and the selection of k: the number of nearest neighbors is displayed on the horizontal axis, and the sum of squared errors on the vertical axis. As shown in the figure, the errors for k = 3, 4, and 5 were 372.31, 363.70, and 365.92, respectively; hence k = 4 is the best number of nearest neighbors for the developed KNN model. The KNN model also identified the importance of the input variables (Figure 13): Rn was identified as the most important variable, followed by Vp, D, and Is50, respectively. It should be noted that Rn was identified by all of the RF, ANN and KNN models as the most influential factor on the rock BI.
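The SSE-based selection of k over the range 3-5 can be sketched from scratch. The data here are synthetic, so the winning k will not necessarily be 4 as in the paper:

```python
import numpy as np

def knn_predict(X_tr, y_tr, X, k):
    """Average the targets of the k nearest training samples (Euclidean distance)."""
    d = np.linalg.norm(X[:, None, :] - X_tr[None, :, :], axis=2)
    nn = np.argsort(d, axis=1)[:, :k]   # indices of the k nearest neighbors
    return y_tr[nn].mean(axis=1)

rng = np.random.default_rng(0)
X_tr = rng.uniform(0, 1, size=(77, 4))                 # synthetic training inputs
y_tr = 10 + 8 * X_tr[:, 2] + rng.normal(0, 0.2, 77)    # hypothetical BI-like target
X_te = rng.uniform(0, 1, size=(33, 4))
y_te = 10 + 8 * X_te[:, 2]

# Try k = 3, 4, 5 and keep the k with the lowest sum of squared errors
sse = {k: np.sum((knn_predict(X_tr, y_tr, X_te, k) - y_te) ** 2) for k in (3, 4, 5)}
best_k = min(sse, key=sse.get)
print(best_k)
```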

Validation of the Selected Models
After developing the models for predicting the BI of the rock, they should be validated using new datasets. Therefore, the authors used 15 additional empirical data points from the same case study; it should be noted that these data were not used in the training and testing phases. The models selected in the previous section were run on the new datasets for validation purposes, and the measured and predicted BI values were then evaluated using the previous performance indices. Table 3 presents the results of the performance indices for the three predictive models, i.e., ANN, KNN and RF. According to this table, the a20-index is 1 for all predictive models, which shows that m20 (the number of samples whose experimental/predicted ratio lies between 0.8 and 1.2) equals N (the total number of samples); this confirms that all models are able to provide good results for similar data. In addition, R² values of 0.971, 0.860, and 0.807 and VAF values of 96.852%, 85.633%, and 80.642% were obtained for the RF, ANN and KNN models, respectively, in the validation stage, indicating that the RF model is better than the other two. In terms of system error, the RF model, with an RMSE of 0.62 and an MAE of 0.46, had a lower error than the ANN and KNN models. Figure 14 shows the measured and predicted BI values for the RF, ANN, and KNN models in the validation phase, and Figure 15 depicts the BI values predicted by the RF, ANN and KNN, together with the measured BI, for all 15 validation samples. As these two figures show, the BI values from the RF model are closer to the measured values than those of the KNN and ANN models. In conclusion, all models can provide good predictions of the BI when similar data are available; however, the RF achieves a higher performance capacity than the ANN and KNN models. This means that if other researchers or designers can collect or measure the inputs of this study within their ranges, the developed RF model can be expected to predict BI values with high correlation and low system error. Therefore, the developed RF model and its structure can be utilized to estimate the BI of the rock in the preliminary design of geotechnical projects in rock masses.

Discussion and Conclusions
This study investigated the application of multiple ML techniques for predicting the rock BI using a dataset from a water transfer tunnel in Malaysia. The main aim was to identify the best model(s) in terms of accuracy for both the training and testing stages. To compare the models, five performance indices, a ranking system, and a gain chart were used. Five ML models, including RF, CHAID, ANN, KNN, and SVM, were developed. While the results of the performance indices showed that the RF outperformed the other models on the training dataset, the ANN achieved the best ranking on the testing dataset. However, the KNN achieved the highest cumulative ranking. A possible explanation is that the KNN showed stable behavior across the training and testing stages, while the RF and ANN received markedly different rankings for the two stages. Concerning the importance of the predictors, all three models, RF, KNN, and ANN, identified Rn as the most important factor for predicting the BI. The KNN and ANN considered D an important predictor, while the RF did not. This can be explained by the fact that the data for D diverged from the average value, which suggests that the RF is intolerant of the dispersion of data points around the mean.
The RF method outperformed single-tree-based methods like CHAID for both the training and testing stages. The power of the RF stems from its ability to bag a large number of single-tree models into an ensemble. For categorical data, the RF produces a number of rules which can show the relationships between the predictors and the target variable; however, in this study the target variable was continuous, and the RF could not create a set of rules. While the ANN showed acceptable performance in predicting the BI, this method is viewed as a black box: although it can predict the BI, studying its structure does not provide an understanding of the function being approximated. Future studies on the BI should use the KNN cautiously. While this method is intuitive and immune to outliers in the predictors [25,63], it may be vulnerable to irrelevant features and correlated inputs [64]. In addition, the ability of the KNN to deal with mixed-type data is still doubtful [25,64].
The last analysis of this study was the validation of the selected predictive models, i.e., ANN, KNN and RF. To this end, 15 datasets with the same input parameters were considered, and the ANN, KNN and RF models were run again on them. The results of the validation stage showed that the RF, with an R² of 0.971, is more capable of predicting the rock BI than the KNN (R² of 0.807) and the ANN (R² of 0.860), although all of the models can be used under similar conditions in the future. More specifically, this research suggests that other researchers or designers use the RF and KNN models (or either of them) to predict the rock BI in the design stage of geotechnical projects.

Figure 4 .
Figure 4. Geological map of tunnel location and its route.

Figure 5.
Figure 5. Evaluation of the models proposed using a gain chart.

Figure 6 .
Figure 6. Testing and training results of RF model to predict the BI.

Figure 7 .
Figure 7. Input variables' importance to predict the BI derived from the RF model.

Figure 8 .
Figure 8. ANN network for predicting the BI.

Figure 9 .
Figure 9. Testing and training results of ANN model to predict the BI.

Figure 10 .
Figure 10. Input variables' importance to predict the BI derived from the ANN model.

Figure 11 .
Figure 11. Testing and training results of KNN model to predict the BI.

Figure 12 .
Figure 12. The relationship between the predictors and K selection for predicting the BI.

Figure 13 .
Figure 13. Input variables' importance to predict the BI derived from the KNN model.

Figure 14 .
Figure 14. Actual and predicted values for the models selected in validation phase.

Figure 15 .
Figure 15. Predicted BI values by RF, ANN and KNN together with their measured BI for all 15 data samples.

Table 1 .
The Range, Mean, Unit, Category and Symbol of Inputs and Output Parameters in Predicting BI of the Rock Samples.

Table 2 .
Evaluation of Models Developed Using Five Performance Indices.

Table 3 .
Performance Assessment for the Validation Phase.