Predicting Angle of Internal Friction and Cohesion of Rocks Based on Machine Learning Algorithms

The safe and sustainable design of rock slopes, open-pit mines, tunnels, foundations, and underground excavations requires appropriate and reliable estimation of rock strength and deformation characteristics. Cohesion (c) and angle of internal friction (ϕ) are the two key parameters widely used to characterize the shear strength of materials. Thus, the prediction of these parameters is essential to evaluate the deformation and stability of any rock formation. In this study, four advanced machine learning (ML)-based intelligent prediction models, namely Lasso regression (LR), ridge regression (RR), decision tree (DT), and support vector machine (SVM), were developed to predict c (MPa) and ϕ (°), with P-wave velocity (m/s), density (gm/cc), UCS (MPa), and tensile strength (MPa) as input parameters. The actual dataset, comprising 199 data points with no missing data, was allocated identically for each model, with 70% for training and 30% for testing purposes. To enhance the performance of the developed models, an iterative 5-fold cross-validation method was used. The coefficient of determination (R²), mean absolute error (MAE), mean square error (MSE), root mean square error (RMSE), and a10-index were used as performance metrics to identify the optimal prediction model. The results revealed the SVM to be a more efficient model in predicting c (R² = 0.977) and ϕ (R² = 0.916) than LR (c: R² = 0.928 and ϕ: R² = 0.606), RR (c: R² = 0.961 and ϕ: R² = 0.822), and DT (c: R² = 0.934 and ϕ: R² = 0.607) on the testing data. Furthermore, to check the level of accuracy of the SVM model, a sensitivity analysis was performed on the testing data. The results showed that UCS and tensile strength were the most influential parameters in predicting c and ϕ. The findings of this study contribute to long-term stability and deformation evaluation of rock masses in surface and subsurface rock excavations.


Introduction
The safe and sustainable design of rock slopes, open-pit mines, tunnels, foundations, and underground excavations needs a proper and reliable estimation of rock strength and deformation characteristics. Cohesion (c) and angle of internal friction (ϕ) are two widely used key mechanical strength parameters to characterize a material's shear strength [1,2]. Thus, the prediction and estimation of these parameters are essential to evaluate the deformation and stability of any rock formation [3]. The strength parameters c and ϕ can be obtained directly from laboratory tests (triaxial tests), which are destructive, laborious, and expensive. In addition, samples of the required quality are difficult to collect, especially in highly jointed and fragile rocks [2,4]. In rock mechanics and geotechnical engineering, it is imperative to analyze a rock's performance and estimate its related mechanical properties [5][6][7]. Therefore, it is worthwhile to adopt intelligent approaches for determining c and ϕ.
One of the earliest failure criteria adopted for determining c and ϕ was the Mohr-Coulomb (MC) failure criterion.
Due to its mathematical convenience, simplicity, and conventional use in the field of rock mechanics, the MC criterion is still widely used [1,[8][9][10]. The MC criterion includes two parameters, c and ϕ. The parameter c characterizes the bond between rock particles, and the parameter ϕ is related to the internal friction generated along the shear surface [11]. Before the practical application of the MC criterion, the parameters c and ϕ need to be estimated [12,13]. In order to evaluate the MC parameters c and ϕ, triaxial tests are performed at different confining pressures. However, considering the time and high cost associated with triaxial tests, there is a dire need for alternative methods to obtain MC parameters, especially at the preliminary stages of any project, where triaxial test results are limited [14][15][16]. For this reason, efforts have been devoted to the development of fast and inexpensive methods for indirect estimation. Tests such as the point load test [17], the Schmidt hammer test [18], sound velocity [19], impact strength [20], or the Los Angeles abrasion test [21] have been used to estimate uniaxial compressive strength (UCS) indirectly. Some researchers have investigated the applicability of UCS and uniaxial tensile strength (UTS) for estimating the c and ϕ of rocks in the absence of triaxial test data [16,[22][23][24]. Additionally, some indirect estimation models have been introduced for the prediction of c and ϕ. Weingarten and Perkins found a correlation between ϕ and the porosity of sandstone [25]. Plumb [26] determined the correlation between ϕ and neutron porosity, which was improved by Asquith et al. [27] and Jaeger et al. [28]. Moreover, Edlmann et al. [29] determined a linear relationship between ϕ and lab-measured core porosity. Abbas et al. evaluated the correlation of ϕ with compressional waves and gamma rays using wireline logging data [30,31]. In all cases, c was found to be dependent on ϕ and UCS, as revealed by Almalikee and Strength [32]. Though the results of these methods have significant application in estimating c and ϕ, they are not sufficient for long-term stability and deformation evaluation of rocks. Therefore, there is still a need to investigate the c and ϕ of rocks using indirect estimation methods (i.e., intelligent approaches).
Recently, intelligent approaches have been widely used in the field of geotechnical engineering and rock mechanics [24,[33][34][35][36][37][38][39][40][41][42][43]. Numerous researchers have used intelligent techniques, i.e., machine learning (ML) methods, to extend their knowledge for predicting c and ϕ. Shen et al. applied genetic programming (GP) to predict the c and ϕ of sandstone rocks; the proposed model provided adequate predictive performance in the absence of triaxial data [16]. Mahmoodzadeh et al. employed Gaussian process regression (GPR), support vector regression (SVR), decision trees (DT), and long short-term memory (LSTM) to predict the c and ϕ of intact rocks using three input parameters, i.e., UCS, UTS, and confining stress (σ3) [24]. Khandelwal et al. implemented different approaches, namely simple and multiple regression, artificial neural network (ANN), and genetic algorithm (GA)-ANN, to predict the cohesion of limestone; for this purpose, P-wave velocity, UCS, and Brazilian tensile strength (BTS) were chosen as inputs [43]. Hiba et al. aimed to construct a predictive model using actual well-logging data. The study was carried out using two ML techniques, namely DT and random forest (RF), with bulk density (ROHB), neutron porosity (NPHI), and compression time (DTC) used as input parameters to predict c and ϕ [44]. Kainthola et al. used an adaptive neuro-fuzzy inference system (ANFIS) and simple linear regression (SLR) to develop correlations between some basic physico-mechanical properties, including UCS, UTS, c, ϕ, and P-wave velocity [45]. Based on the above literature, it can be inferred that some useful, but not fully sufficient, insights have been provided in predicting c and ϕ. The use of a particular procedure can be appropriate in certain circumstances, but not in others. More precisely, it has been noted in the literature that only a small amount of work has been carried out to predict c and ϕ, especially using various types of rocks. Therefore, there is a need for novel ML-based intelligent methods to provide an accurate predictive model for predicting rock c and ϕ in order to safely install underground engineering projects.
In this study, P-wave velocity, density, UCS, and tensile strength are used as input parameters to predict c (MPa) and ϕ (°). In addition, four advanced ML-based prediction models, namely Lasso regression (LR), ridge regression (RR), decision tree (DT), and support vector machine (SVM), are developed to achieve the desired goals. To enhance the performance of the developed models, an iterative 5-fold cross-validation method is used. At present, the use of ML-based intelligent methods in predicting the mechanical and physical properties of rocks is gaining attention and providing an important contribution to rock excavation in different geotechnical and mining engineering projects [46][47][48][49][50][51][52][53]. The performance of the developed models is checked by analytical metrics such as the coefficient of determination (R²), mean absolute error (MAE), mean square error (MSE), root mean square error (RMSE), and a10-index. The findings of this study could be helpful for long-term stability and deformation evaluation of rock masses in surface and subsurface rock excavations. Figure 1 depicts the flowchart of the ML-based intelligent approach in this study.

Data Curation
In this study, c (MPa) and ϕ (°) were predicted by LR, RR, DT, and SVM using P-wave velocity (m/s), density (gm/cc), UCS (MPa), and tensile strength (MPa) as input parameters, with data taken from the reported literature [45] for various rocks, namely limestone, quartzite, slate, and quartz mica schist.
The actual dataset, comprising 199 data points with no missing data, was split into 70% for training purposes and 30% for testing purposes. To enhance the performance of the developed models, an iterative 5-fold cross-validation method was used. Figure 2 exhibits the test equipment for rock strength parameter measurement: (A) uniaxial testing machine, (B) tensile strength test, (C) P-wave velocity, and (D) triaxial test [45]. Figure 2 shows the histogram representation of the statistical distribution of the input and output parameters of the actual dataset used in this study.
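The split-and-validate procedure described above can be sketched with scikit-learn. Since the original 199-point dataset from ref. [45] is not reproduced here, a synthetic stand-in array is used; the column semantics in the comment are illustrative assumptions:

```python
import numpy as np
from sklearn.model_selection import train_test_split, KFold

# Synthetic stand-in for the 199-sample dataset (columns assumed to be
# P-wave velocity, density, UCS, tensile strength; target is cohesion).
rng = np.random.default_rng(0)
X = rng.random((199, 4))
y = rng.random(199)

# Identical 70/30 train/test split for every model
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.70, random_state=42)

# Iterative 5-fold cross-validation on the training portion
kf = KFold(n_splits=5, shuffle=True, random_state=42)
fold_sizes = [len(val_idx) for _, val_idx in kf.split(X_train)]
print(len(X_train), len(X_test), fold_sizes)
```

With 199 samples, this yields 139 training and 60 testing points, and the 5 validation folds partition the 139 training samples.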

Lasso Regression
Lasso regression (LR) was proposed as a biased estimator in the field of geophysics in 1986 and formalized as the lasso in 1996 [54]. Unlike ridge regression (RR), LR has the ability to perform both feature selection and penalty regularization to improve prediction accuracy. It combats multicollinearity by selecting the most important predictor from any set of highly correlated independent variables and removing all other variables. LR uses an L1-norm penalty term to shrink the regression coefficients, some exactly to zero, thus ensuring the choice of the most important explanatory variables [52]. LR has the additional property that if a dataset of size n is fitted to a regression model with p parameters and p > n, the LR model can choose at most n parameters [55]. To obtain the regression estimates, the loss function in Equation (1) is minimized [52]. The parameter λ can be selected using cross-validation. Though LR and RR, as given in Equations (1) and (2), bear a resemblance to each other, the resulting estimates β_ridge and β_lasso show significant differences. In the process of shrinking the coefficients, LR can set some of them exactly to zero; RR shrinks the coefficients but never sets any of them to zero. LR thus performs variable selection by setting some coefficients to zero and retaining those that have a significant impact on the output. Identifying these variables can improve the interpretability of the resulting model, especially when there is a large number of predictors [51].
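Equations (1) and (2) themselves did not survive the extraction. In their standard form (an assumption based on the surrounding description, with y_i the response, x_ij the predictors, and β the coefficients), the lasso and ridge loss functions are:

```latex
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta}\;
\sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Bigr)^{2}
+ \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \tag{1}

\hat{\beta}^{\text{ridge}} = \arg\min_{\beta}\;
\sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Bigr)^{2}
+ \lambda \sum_{j=1}^{p} \beta_j^{2} \tag{2}
```

The only difference is the penalty term: the L1 norm in (1) permits exact zeros in the solution, while the L2 norm in (2) does not.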

Ridge Regression
Ridge regression (RR), also known as penalized least squares, reduces the variance of the estimated regression coefficients. RR shrinks the coefficients toward zero, making the estimates more stable than ordinary least squares (OLS) estimates [51]. RR was presented by Hoerl et al. [56] to enhance the prediction accuracy of the regression model by minimizing the loss function in Equation (2) [52]. If λ is equal to 0, the obtained estimates are the OLS estimates of multilinear regression (MLR). The parameter λ can be selected by cross-validation. In RR, the L2-norm penalty term shrinks the regression coefficients to non-zero values to prevent overfitting, but it does not perform feature selection.
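The contrast between L1 and L2 shrinkage described in these two sections can be demonstrated on toy data; this is a minimal sketch with illustrative values, not the paper's actual setup:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Toy data: two nearly collinear predictors plus one irrelevant predictor
rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
X = np.column_stack([x1,
                     x1 + 0.01 * rng.normal(size=200),
                     rng.normal(size=200)])
y = 3.0 * x1 + rng.normal(scale=0.1, size=200)

lasso = Lasso(alpha=0.5).fit(X, y)  # L1 penalty: drives some coefficients to exactly zero
ridge = Ridge(alpha=0.5).fit(X, y)  # L2 penalty: shrinks but keeps all coefficients non-zero

print(lasso.coef_)
print(ridge.coef_)
```

Here the lasso zeroes out the irrelevant (and typically one of the collinear) predictors, while ridge retains small non-zero weights on all three, matching the selection behavior described in the text.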

Decision Tree
The decision tree (DT) is a supervised learning hierarchical model in which local regions are identified in a small number of steps through a series of recursive splits. Internal decision nodes and terminal leaves form the decision tree, and both classification and regression can be performed with this method. A regression tree is built in a similar way to a classification tree, except that the impurity measure used for classification is replaced with a measure suited to regression. Let X_m be the subset of X that reaches node m, i.e., the set of all x ∈ X that satisfy the conditions of all decision nodes on the path from the root to node m. The mean square error from the estimated value determines a good tree split. In regression, let g_m be the predicted value in node m.
The variance at node m is associated with the node error E_m. In a node, the mean of the desired outputs of the samples arriving at the node is employed as g_m.
If a node's error is acceptable (E_m < θ_r), a leaf node is generated and the g_m value is stored. In effect, a piecewise constant approximation is generated, with discontinuities at the leaf boundaries. If the error is unacceptable, the data arriving at node m are split again so that the sum of the errors in the branches is as small as possible [57,58].
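The node statistics referenced above were dropped by the extraction. Written in the standard regression-tree form assumed here, with r^t the desired output of training sample x^t and b_m(x^t) = 1 if x^t reaches node m (0 otherwise), they are:

```latex
g_m = \frac{\sum_t b_m(x^t)\, r^t}{\sum_t b_m(x^t)},
\qquad
E_m = \frac{1}{N_m} \sum_t \bigl(r^t - g_m\bigr)^{2}\, b_m(x^t),
\qquad
N_m = \lvert X_m \rvert
```

That is, g_m is the mean output of the samples reaching node m, and E_m is their mean squared deviation from that mean; a split is accepted when it reduces the summed branch errors.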

Support Vector Machine
Support vector machine (SVM) is a supervised learning tool originally proposed by Vapnik [59]. SVM is widely used in classification and regression analysis using hyperplane classifiers. The optimal hyperplane maximizes the margin between the two classes in which the support vectors are located [50]. It constructs prediction functions in a high-dimensional feature space by introducing a kernel function and Vapnik's ε-insensitive loss function [46]. For a dataset P = {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}, where x_i ∈ R^n is the input and y_i ∈ R is the output, the SVM uses a kernel function to map the nonlinear input data into a high-dimensional feature space and tries to find the optimal hyperplane to separate them. This allows relating the original input to the output through a linear regression function [60,61], defined in Equation (3), where ϕ(x) is the kernel mapping, and M_v and l_b denote the weight vector and the bias term, respectively. To obtain M_v and l_b, the cost function proposed by Cortes and Vapnik [62], given in Equation (4), needs to be minimized. Equation (4) can be minimized by transforming it into dual space using the Lagrange multiplier method, giving the solution in Equation (5), where α_i and α_i* are Lagrange multipliers with 0 ≤ α_i, α_i* ≤ C, and ϕ(x_i, x_j) is the kernel function. The choice of the kernel is significant to the success of SVR. A large number of kernel functions has been examined for SVM, such as linear, polynomial, sigmoid, Gaussian, radial basis, and exponential radial basis [63]. Figure 3 shows the basic structure of the SVM model.
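Equations (3)-(5) referenced above were lost in extraction. A standard ε-SVR formulation consistent with the text's notation is assumed here, with M_v the weight vector, l_b the bias, ϕ the kernel mapping, ξ_i and ξ_i* slack variables, and α_i and α_i* the Lagrange multipliers:

```latex
f(x) = M_v \cdot \phi(x) + l_b \tag{3}

\min_{M_v,\, l_b}\; \tfrac{1}{2}\lVert M_v \rVert^{2}
+ C \sum_{i=1}^{n} \bigl(\xi_i + \xi_i^{*}\bigr)
\quad \text{s.t.}\quad \lvert y_i - f(x_i) \rvert \le \varepsilon + \xi_i^{(*)},
\;\; \xi_i, \xi_i^{*} \ge 0 \tag{4}

f(x) = \sum_{i=1}^{n} \bigl(\alpha_i - \alpha_i^{*}\bigr)\, \phi(x_i, x) + l_b \tag{5}
```

In (5) the kernel ϕ(x_i, x) replaces the inner product in feature space, so the prediction depends only on kernel evaluations against the support vectors.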

Hyperparameters
An ML algorithm needs optimized hyperparameters for better performance. These hyperparameters should be calibrated to the data rather than defined manually. To minimize the bias related to the random partition of the training and validation data, k-fold cross-validation was implemented in this paper, where k represents the number of folds. By using cross-validation, the validity and accuracy of ML models can be evaluated by partitioning a dataset into different subsets and assessing the accuracy of the model on each subset [64]. The details of the optimized hyperparameters of the RR, LR, DT, and SVR models are presented in Table 2. The λ values for the RR model were randomly selected in the range of 0.0-1.0, while the λ values for the LR model were kept at 1.0 and 0.01 for c (MPa) and ϕ (°), respectively. The random_state and capacity constant (C) for SVM were kept at their default values in the Python module for both c (MPa) and ϕ (°). Moreover, three different kernel functions, namely the radial basis function (rbf), linear function, and polynomial function, were checked, and the performance of the rbf function was found to be the best.
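A kernel/hyperparameter search of the kind described above can be reproduced with scikit-learn's 5-fold grid search. The grid values and synthetic data below are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVR

# Synthetic stand-in for the training data (4 inputs -> one strength parameter)
rng = np.random.default_rng(0)
X = rng.random((139, 4))
y = X @ np.array([1.0, 0.5, 2.0, 1.5]) + 0.05 * rng.normal(size=139)

# Compare the three kernels mentioned in the text under 5-fold cross-validation
grid = GridSearchCV(
    SVR(),
    param_grid={"kernel": ["rbf", "linear", "poly"], "C": [0.1, 1.0, 10.0]},
    cv=KFold(n_splits=5, shuffle=True, random_state=42),
    scoring="r2",
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```

`best_params_` reports the winning kernel/C pair by mean cross-validated R²; on the real dataset this selection favored the rbf kernel according to the text.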

Model Evaluation
The performance indices play a key role in model evaluation. The most suitable model is the one with the highest R² [65]; the smallest MAE, MSE [66], and RMSE [65]; and a suitable a10-index [66]. Each investigated model is evaluated by Equations (9)-(13), as follows.
where S̄_o and S̄_p are the mean values of the actual and predicted values of the angle of internal friction and cohesion; S_o and S_p are the actual and predicted values of the angle of internal friction and cohesion, respectively; m10 signifies the number of data points whose ratio of actual to predicted value lies between 0.90 and 1.10; and N is the number of data points.
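The five indices can be computed as follows; the a10-index implementation assumes the definition given in the text (fraction of samples whose actual/predicted ratio falls within 0.90-1.10), and the small arrays are illustrative:

```python
import numpy as np
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

def a10_index(actual, predicted):
    """Fraction of samples with actual/predicted ratio in [0.90, 1.10]."""
    ratio = np.asarray(actual, float) / np.asarray(predicted, float)
    return float(np.mean((ratio >= 0.90) & (ratio <= 1.10)))

actual = np.array([10.0, 12.0, 8.0, 15.0])
predicted = np.array([9.8, 12.5, 8.1, 14.0])

r2 = r2_score(actual, predicted)
mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
rmse = np.sqrt(mse)
a10 = a10_index(actual, predicted)
print(r2, mae, mse, rmse, a10)
```

An a10-index of 1.00, as reported for the SVM model, means every test prediction falls within ±10% of its actual value under this ratio definition.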

Results and Discussion
We aimed to investigate the ability of the developed ML-based intelligent models, namely LR, RR, DT, and SVM, to predict the rock shear strength parameters ϕ (°) and c (MPa) using Python programming. In order to introduce the most suitable model for predicting the targeted output, the selection of appropriate input parameters can be considered one of the most essential tasks. In this study, P-wave velocity (m/s), density (gm/cc), UCS (MPa), and tensile strength (MPa) were chosen as the input parameters for all developed models. The actual and predicted values were then arranged and plotted to examine the performance and correlations of each model. Based on the final prediction results, the developed models were evaluated using different analytical indices, namely R², MAE, MSE, RMSE, and a10-index. The actual dataset of 199 data points was split into 70% for training purposes and 30% for testing purposes.

Figure 4 shows a comparison of scatter plots and performance plots between the actual and predicted values of ϕ (°) at the test level for the LR, RR, DT, and SVM models. Based on the test predictions, the R² values of the LR, RR, DT, and SVM models for ϕ (°) are 0.606, 0.607, 0.822, and 0.916, respectively. In the same manner, Figure 5 shows a comparison of scatter plots and performance plots between the actual and predicted values of c (MPa) at the test level for the LR, RR, DT, and SVM models, with R² values of 0.928, 0.934, 0.961, and 0.977, respectively.

The data were split into two parts by the DT, as shown in Figures 6 and 7. By averaging the two closest leaves, the similarity score and gain were computed, and the residuals were then transferred to the leaf with the maximum score and gain. The learning rate and maximum depth were set to 1.0 and 3.0, respectively, to limit model complexity. Once the prediction results (residuals) were obtained, all data points were run through the model to produce the h(x) and F(x) predictions.

Table 3 shows the performance indices of the developed LR, RR, DT, and SVM models calculated by Equations (6)-(10). Among the developed models, SVM outpaced the others at the testing level, with R² = 0.916, MAE = 0.9094, MSE = 1.6656, RMSE = 1.2906, and a10-index = 1.00 for the ϕ (°) prediction, and R² = 0.977, MAE = 0.5577, MSE = 0.6811, RMSE = 0.8253, and a10-index = 1.00 for the c (MPa) prediction. Therefore, SVM is an applicable ML-based intelligent approach that can be used to accurately predict ϕ (°) and c (MPa), as shown in Figure 8.

The dataset used in this study was extracted from the published literature [42], where the authors used an ANFIS and SLR to develop correlations between UCS, tensile strength, c (MPa), ϕ (°), and P-wave velocity. For a more comprehensive comparison between intelligent approaches, we used the robust SVM model to predict c (MPa) and ϕ (°), achieving the best results. Recently, a few studies have used ML techniques to predict c (MPa) and ϕ (°); however, their results are limited to a single type of rock, and the authors did not evaluate the performance of robust ML approaches for different types of rocks [16,24,43,44].

Sensitivity Analysis
It is crucial to accurately identify the most important parameters that have a considerable influence on the rock ϕ (°) and c (MPa), which can otherwise be problematic in the design of the rock structure. Therefore, the cosine amplitude method [67,68] is used in this study to evaluate the relative influence of the input parameters on the output.
Because of the high accuracy of the SVM model in predicting ϕ (°) and c (MPa), the sensitivity analysis was performed only at the testing level. Figure 9 shows the relationship between each input parameter of the developed model and the output. All parameters are positively correlated, while UCS and tensile strength are the most influential parameters in predicting ϕ (°) and c (MPa). In contrast, P-wave velocity and density are less influential. The feature importance of each input parameter is: P-wave velocity = 0.067, density = 0.066, UCS = 0.068, and tensile strength = 0.069 for ϕ (°); and P-wave velocity = 0.067, density = 0.067, UCS = 0.068, and tensile strength = 0.069 for c (MPa).
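The cosine amplitude method computes, for each input series x and output series y, the strength r = Σ_k x_k y_k / sqrt(Σ_k x_k² · Σ_k y_k²). A small sketch with illustrative (not the paper's) data:

```python
import numpy as np

def cosine_amplitude(x, y):
    """Relative influence strength between an input series x and output series y."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    return float(np.sum(x * y) / np.sqrt(np.sum(x**2) * np.sum(y**2)))

# Illustrative data: a perfectly proportional pair yields the maximum strength 1.0
ucs = np.array([50.0, 80.0, 120.0])
cohesion = np.array([5.0, 8.0, 12.0])  # exactly ucs / 10
print(cosine_amplitude(ucs, cohesion))
```

For non-negative geomechanical data the strength lies in [0, 1], with values near 1 indicating the strongest positive association, which is how the feature-importance values above are interpreted.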

Limitations and Future Work
The performance of the SVM ML-based intelligent approach in predicting ϕ (°) and c (MPa) is consistent. Thus, for large-scale rock engineering projects, this work presents a sufficient basis to overcome the constraints. For other projects, the model proposed in this study should be considered a foundation, and its results should be reanalyzed, reevaluated, and, where necessary, reprocessed.

Conclusions
In this study, four ML-based intelligent models, i.e., LR, RR, DT, and SVM, were developed in order to introduce the most accurate model for predicting ϕ (°) and c (MPa). An identical iterative 5-fold cross-validation method was used to improve the efficiency of each model. P-wave velocity (m/s), density (gm/cc), UCS (MPa), and tensile strength (MPa) were the selected input parameters for all developed models. Finally, the performance of each model was evaluated by R², MAE, MSE, RMSE, and a10-index values. The important conclusions drawn from this study are as follows:
1. Based on the estimated results of the developed LR, RR, DT, and SVM models, SVM outpaced the other developed models at the testing level, with R² = 0.916, MAE = 0.9094, MSE = 1.6656, RMSE = 1.2906, and a10-index = 1.00 for the ϕ (°) prediction and R² = 0.977, MAE = 0.5577, MSE = 0.6811, RMSE = 0.8253, and a10-index = 1.00 for the c (MPa) prediction.
2. According to the sensitivity analysis, UCS and tensile strength were the most influential parameters for predicting ϕ (°) and c (MPa), with coefficient values of 0.068 and 0.069, respectively.

Mathematics 2022, 18

Figure 1. Flowchart of the ML-based intelligent approach in this study.

Figure 2. The statistical description of the input and output parameters of the actual dataset.

Figure 3. Basic structure of the SVM model.

Figure 4. Performance plots of LR, RR, DT, and SVM models for the ϕ (°) at the testing level.

Figure 5. Performance plots of LR, RR, DT, and SVM models for the c (MPa) at the testing level.

Figure 8. Radar plots of performance indices R², MAE, MSE, RMSE, and a10-index of the developed predictive models for the (a) ϕ (°) and (b) c (MPa) at the testing phase in this study.

Table 1 shows the lithology-based minimum and maximum, mean, and standard deviation (STD) values of the dataset.

Table 1. Lithology-based minimum and maximum, mean, and standard deviation (STD) values of the dataset in this study.

Table 3. Performance indices of the ML-based developed models in this study.