Hybrid Machine-Learning-Based Prediction Model for the Peak Dilation Angle of Rock Discontinuities

The peak dilation angle is an important mechanical feature of rock discontinuities and is significant in assessing the mechanical behaviour of rock masses. Previous studies have shown that the efficiency and accuracy of traditional experimental methods and analytical models in determining the shear dilation angle are not completely satisfactory. Machine learning methods are popular because they can efficiently predict outcomes that depend on multiple influencing factors. In this paper, a novel hybrid machine learning model is proposed for predicting the peak dilation angle. The model uses support vector regression (SVR) as the primary prediction tool, combined with the grid search optimization algorithm to tune the hyperparameters and enhance prediction performance. The proposed model was applied to eighty-nine data samples with six input variables encompassing morphology and mechanical property parameters. A comparative analysis is conducted between the proposed model, the original SVR model, and existing analytical models. The results show that the proposed model surpasses both the original SVR model and the analytical models, with a coefficient of determination (R²) of 0.917 and a mean absolute percentage error (MAPE) of 4.5%. The study also reveals that normal stress is the most influential mechanical property parameter affecting the peak dilation angle. Consequently, the proposed model was shown to be effective in predicting the peak dilation angle of rock discontinuities.


Introduction
The forecasting and control of the mechanical behaviour of rock masses is an important factor in the safety of engineering structures [1-5]. The design of structures such as tunnels, embankments, mine openings, and underground chambers relies on accurate and reliable estimates of the compressive strength, tensile strength, hydraulic mechanics, internal damage characteristics, and shear strength of rock masses [6-11]. It is generally accepted that rock masses are often cut into intact rock pieces by rock discontinuities at different scales, as shown in Figure 1. These rock discontinuities include fractures, joints, bedding planes, weak intercalations, shear planes, etc. [12,13]. Because the shear strength of rock discontinuities is closely related to rock engineering disasters, such as rock slope failure, fault-slip burst, and collapse accidents in tunnels [14-16], it has attracted the attention of many researchers [17-21].
The Mohr-Coulomb law is widely used in existing shear strength models to characterise the shear behaviour of rock discontinuities. It incorporates an internal friction angle comprising the basic friction angle and the peak dilation angle [22,23]. The law also forms the basis of modern geotechnical software such as FLAC3D 5.0, which allows the behaviour of rock masses to be predicted under different conditions [24,25]. The peak dilation angle reflects the comprehensive effect of the joint morphology on the shear strength [26,27]. Generally speaking, the peak dilation angle is defined as the instantaneous inclination of the shear path at the shear strength with respect to the mean plane [28]. In addition, the peak dilation angle is the most commonly used parameter in numerical calculations to study the nonlinear shear dilation behaviour of rock materials and to simulate surrounding rock deformation [29-33].
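To make the role of the peak dilation angle concrete, the short sketch below evaluates one common cohesionless form of the friction law for a rough joint, tau_p = sigma_n * tan(phi_b + i_p). This specific form is an assumption chosen for illustration (the models reviewed in this paper differ in detail), and the function name and numbers are hypothetical.

```python
import math

def peak_shear_strength(normal_stress, basic_friction_angle_deg, peak_dilation_angle_deg):
    """Peak shear strength tau_p = sigma_n * tan(phi_b + i_p), angles in degrees.

    Assumed cohesionless form, used only to illustrate how the peak
    dilation angle adds to the basic friction angle.
    """
    total_angle = math.radians(basic_friction_angle_deg + peak_dilation_angle_deg)
    return normal_stress * math.tan(total_angle)

# Hypothetical joint: sigma_n = 2.0 MPa, phi_b = 30 degrees, i_p = 15 degrees
tau_p = peak_shear_strength(2.0, 30.0, 15.0)
print(round(tau_p, 3))
```

Because the two angles are summed inside the tangent, an error in the peak dilation angle propagates nonlinearly into the predicted shear strength, which is one reason an accurate estimate of i_p matters.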
Dilation is also inherent to failure in specimens starting from intact material, and it is a fundamental parameter for models with softening or hardening behaviour; it is also modified by the average stress level acting along the stress path. Numerous experimental studies [34,35] on the peak dilation angle have been carried out. Moreover, theoretical analyses and many empirical models [36,37] have been established based on the concept of the maximum dilation angle at zero normal stress. These models have some shortcomings. For example, Xia et al. [38] proposed a new empirical model using tensile joint replicas that satisfies new peak dilation angle boundary conditions under zero and critical state normal stresses. However, as the normal stress increases, the peak dilation angle predicted by the Xia et al. [38] model is half of the initial dilation angle, which is inconsistent with the actual behaviour [39]. To address this, Yang et al. [39] established a new empirical model based on the shear test results of granite joints and sandstone joints. Ban et al. [40] also took into account the real contact asperity distribution and proposed a semi-empirical model. Additionally, there are many empirical models for predicting the peak dilation angle and shear strength of rock discontinuities, as listed in Table 1. These models provide a valuable basis for understanding and predicting the peak dilation angle. However, the generality of these models has not been well addressed, and some model parameters lack clear physical meaning. From the point of view of engineering practicality, an ideal model should be able to accurately assess the peak dilation angle in a time-saving, labour-saving, and cost-effective way.

Table 1 column headings: References; Shear Strength Model; Peak Dilation Angle.
Rock materials exhibit complex behaviours and a high level of uncertainty under laboratory testing [49-54]. Machine learning (ML) techniques have been developed and used by an increasing number of researchers in the last several decades [55-62]. Compared with traditional test methods and empirical models, ML can effectively find implicit relationships between variables and handle nonlinear problems well [63,64]. The support vector regression (SVR) algorithm offers high accuracy and efficiency in modelling the nonlinear association between input variables and outputs, and it has been widely used in rock mechanics modelling in recent years [65]. For example, Huang et al. [66] used the joint roughness coefficient (JRC), uniaxial compressive strength, normal stress, and basic friction angle as the input variables of an SVR model to predict the shear strength. Under the framework of SVR, Babanouri and Fattahi [67] proposed a new shear constitutive model of rock discontinuity. Ceryan et al. [68] developed an SVR model to predict the elastic modulus of rock materials with different degrees of weathering. Recently, Xu et al. [69] used SVR to study multiple geomechanical properties of rock materials. In summary, SVR exhibits several distinct advantages when tackling high-dimensional and nonlinear recognition problems.
It can be noted that the peak dilation angle model of rock discontinuities is a very topical issue. Therefore, the purpose of this study is to provide an efficient method for predicting the peak dilation angle of rock discontinuities. To achieve this, the grid search optimization algorithm (GS) is introduced to improve the performance of the SVR, and a hybrid machine learning model, the GS-SVR model, is proposed. In addition, to show the development process of the proposed model, a detailed analysis and the model performance are presented. Finally, the limitations and future development of the proposed model are outlined.

SVR
As a typical kernel-based ML algorithm, SVR is an extension of the support vector machine (SVM). It follows the function approximation algorithm of SVM and solves the multivariate nonlinear regression estimation problem by introducing an alternative loss function [70]. As a supervised learning method based on the principle of structural risk minimization, SVR has good generalization ability in solving small-sample, nonlinear, and high-dimensional problems [71]. Because it is a convex quadratic optimization technique, it can always achieve the global optimal solution [72]. Figure 2 displays a schematic diagram of the SVR employed in this paper. SVR uses a nonlinear mapping to translate the input vector X into a space with higher dimensions. More details about SVR and its application can be found in other milestone papers [73-75]. In this work, SVR is chosen as the regression tool to predict the peak dilation angle because of its high generalization performance. It is worth mentioning that the relationship between the peak dilation angle and the underlying variables is nonlinear and high-dimensional, and the training data are generally not large. These circumstances are particularly suitable for SVR.
For a given set of training data $\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, the aim is to seek an optimal function $f(x)$ that deviates by at most $\varepsilon$ from the target values $y_{tar}$ for all the training data. In ε-SVR, this function can be written as Equation (1):
$$f(x) = \sum_{n=1}^{N} \omega_n \phi_n(x) + b \tag{1}$$
where $\omega$ is the weight vector, $b$ is the bias term, $N$ represents the total number of training data, and $\phi_n(x)$ denotes a nonlinear mapping function. Subsequently, the overall optimization is transformed into Equation (2):
$$\min \ \frac{1}{2}\|\omega\|^2 \tag{2}$$
where $\frac{1}{2}\|\omega\|^2 = \frac{1}{2}\omega^T\omega$ is half the squared Euclidean norm of $\omega$. The constraints of Equation (2) are shown below:
$$y_i - f(x_i) \le \varepsilon, \qquad f(x_i) - y_i \le \varepsilon, \qquad i = 1, 2, \ldots, n \tag{3}$$
Two slack variables $\xi_i$ and $\xi_i^*$ ($i = 1, 2, \ldots, n$) are introduced into Equation (3) to represent the separation between the actual values and the corresponding boundary values of the ε-deviation. Then $\omega$ and $b$ can be determined by minimizing the following optimization function:
$$\min \ \frac{1}{2}\|\omega\|^2 + c\sum_{i=1}^{n}\left(\xi_i + \xi_i^*\right) \quad \text{s.t.} \quad y_i - f(x_i) \le \varepsilon + \xi_i, \quad f(x_i) - y_i \le \varepsilon + \xi_i^*, \quad \xi_i, \xi_i^* \ge 0 \tag{4}$$
where $c$ is the regularization or penalty parameter, which is greater than zero.
The $\frac{1}{2}\|\omega\|^2$ term denotes the structural risk, and the second term, $c\sum_{i=1}^{n}(\xi_i + \xi_i^*)$, represents the empirical risk. Equation (4) is a constrained optimization problem that can be transformed into a dual form in terms of a Lagrange function $L(\alpha, \alpha^*)$ and solved by the sequential minimal optimization algorithm:
$$\max \ L(\alpha, \alpha^*) = -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}(\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)K(x_i, x_j) - \varepsilon\sum_{i=1}^{n}(\alpha_i + \alpha_i^*) + \sum_{i=1}^{n} y_i(\alpha_i - \alpha_i^*) \quad \text{s.t.} \quad \sum_{i=1}^{n}(\alpha_i - \alpha_i^*) = 0, \quad 0 \le \alpha_i, \alpha_i^* \le c \tag{5}$$
where $\alpha_i$ and $\alpha_i^*$ are the Lagrangian multipliers, and $K(x_i, x_j) = \phi(x_i)^T\phi(x_j)$ is the kernel function that yields the inner product in a higher-dimensional feature space. By using $K(x_i, x_j)$, one can work in the higher-dimensional feature space without calculating the explicit map $\phi(x)$. In this paper, the radial basis function (RBF) kernel is employed because of its high generalization performance:
$$K(x_i, x_j) = \exp\left(-g\|x_i - x_j\|^2\right) \tag{6}$$
where $g$ denotes the kernel parameter and $\|x_i - x_j\|$ is the Euclidean distance.
After taking the Lagrangian and the optimality conditions into account, the nonlinear regression function can be expressed as follows:
$$f(x) = \sum_{i=1}^{n}(\alpha_i - \alpha_i^*)K(x_i, x) + b \tag{7}$$
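As an illustration of the quantities in this section, the following minimal sketch implements the RBF kernel with kernel parameter g, the ε-insensitive loss that underlies ε-SVR, and the kernel-expansion form of the regression function, f(x) = Σ (α_i − α_i*) K(x_i, x) + b. The function names, support vectors, and coefficients are hypothetical; in practice the coefficients come from solving the dual problem, which is not done here.

```python
import math

def rbf_kernel(x_i, x_j, g):
    """RBF kernel K(x_i, x_j) = exp(-g * ||x_i - x_j||^2) for equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-g * squared_distance)

def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the epsilon-tube, growing linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

def svr_predict(x, support_vectors, dual_coefs, b, g):
    """Evaluate f(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + b.

    dual_coefs[i] holds the difference alpha_i - alpha_i* for support vector i.
    """
    kernel_sum = sum(coef * rbf_kernel(sv, x, g)
                     for sv, coef in zip(support_vectors, dual_coefs))
    return kernel_sum + b

# Hypothetical values, chosen only to exercise the functions
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
dual_coefs = [0.5, -0.25]
prediction = svr_predict([0.0, 0.0], support_vectors, dual_coefs, b=0.1, g=1.0)
print(prediction)
```

A larger g makes the kernel decay faster with distance, so each support vector influences a smaller neighbourhood of the input space.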

GS Optimization
To achieve accurate prediction, an important issue in implementing the SVR model is the tuning of its hyperparameters (e.g., the penalty parameter c and the width parameter g). The trade-off between model complexity and training error is controlled by c, while g sets the width of the RBF kernel and therefore the complexity of the solution. The tuning process is generally completed through optimization algorithms.
As a classical parameter optimization method, the grid search (GS) method has been shown to be an efficient optimization method with good convergence speed and success rate [76]. It optimizes the performance of a model by traversing every combination of the candidate parameter values and selecting the most suitable combination. The specific optimization process is shown in Figure 3. The evaluation metrics, such as root mean square error (RMSE), coefficient of determination (R²), and mean squared error (MSE), are obtained using K-fold cross-validation for all hyperparameter combinations of the selected grid nodes. The combination of c and g that resulted in the best values of the evaluation metrics was selected for subsequent model validation.
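The grid search with K-fold cross-validation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_fit_predict` is a hypothetical stand-in model (a kernel-weighted average, not an SVR) used only so that the example runs with the standard library, and the data and the five-point grid are invented.

```python
import itertools
import math
import random
import statistics

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validated_mse(fit_predict, X, y, c, g, folds):
    """Average the held-out MSE over all folds for one (c, g) pair."""
    fold_scores = []
    for held_out in folds:
        held_out_set = set(held_out)
        train = [i for i in range(len(X)) if i not in held_out_set]
        predictions = fit_predict([X[i] for i in train], [y[i] for i in train],
                                  [X[i] for i in held_out], c, g)
        squared_errors = [(p - y[i]) ** 2 for p, i in zip(predictions, held_out)]
        fold_scores.append(statistics.fmean(squared_errors))
    return statistics.fmean(fold_scores)

def grid_search(fit_predict, X, y, c_values, g_values, k=5, seed=0):
    """Return (best_mse, best_c, best_g) over every combination on the grid."""
    folds = k_fold_indices(len(X), k, seed)
    best = None
    for c, g in itertools.product(c_values, g_values):
        mse = cross_validated_mse(fit_predict, X, y, c, g, folds)
        if best is None or mse < best[0]:
            best = (mse, c, g)
    return best

def toy_fit_predict(X_train, y_train, X_test, c, g):
    """Toy stand-in for SVR (NOT an SVR): an RBF-weighted average of the
    training targets, shrunk towards the training mean by c / (1 + c)."""
    mean = statistics.fmean(y_train)
    predictions = []
    for x in X_test:
        weights = [math.exp(-g * (x - xt) ** 2) for xt in X_train]
        total = sum(weights)
        smooth = sum(w * yt for w, yt in zip(weights, y_train)) / total if total > 0 else mean
        predictions.append(mean + c / (1.0 + c) * (smooth - mean))
    return predictions

X = [i / 10 for i in range(30)]   # hypothetical one-dimensional inputs
y = [math.sin(x) for x in X]      # hypothetical targets
grid = [2.0 ** e for e in (-5, -2, 0, 2, 5)]
best_mse, best_c, best_g = grid_search(toy_fit_predict, X, y, grid, grid)
print(best_c, best_g, best_mse)
```

The same folds are reused for every grid node, so differences in the cross-validated MSE reflect the hyperparameters rather than the random split.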

Data Pre-Processing
A dataset with reliable experimental results and wide distribution is a prerequisite for the successful application of ML modelling [77,78]. Based on a review of existing research [39,40], six parameters, comprising normal stress, basic friction angle, three-dimensional roughness parameters, and uniaxial compressive strength, were selected as the input variables of the proposed model.
The results of joint shear tests available in the literature were compiled. The dataset consists of 89 shear test results collected by the authors from various experimental studies. These test results cover common joint types, such as cement mortar [40], granite [39,79], sandstone [38,39,80], marble [79], and limestone [79], and the projected lengths of these rock discontinuities range from 140 to 300 mm. More information on sample preparation procedures can be found in the corresponding literature. Detailed information on rock type, sample size, normal stress ($\sigma_n$), mechanical properties (uniaxial compressive strength $\sigma_c$ and basic friction angle $\phi_b$), three-dimensional roughness parameters ($A_0$, $C$, $\theta^*_{max}$), and measured peak dilation angle ($i_p$) collected in the dataset is given in the Supplemental Files. A detailed statistical description of the input variables and the output variable is shown in Table 2. As shown, the variables differ markedly in their data distribution (e.g., range and magnitude). Therefore, to improve the computational efficiency and convergence of the ML model, all inputs and the output are normalized to the range [0, 1] according to their maximum and minimum values. The normalization formula is shown in Equation (8):
$$x_i' = \frac{x_i - x_{min}}{x_{max} - x_{min}}, \qquad y_i' = \frac{y_i - y_{min}}{y_{max} - y_{min}} \tag{8}$$
where $x_i'$ and $y_i'$ represent the normalized input and output values of the i-th sample; $x_i$ and $y_i$ represent the experimental input and output values of the i-th sample; and $x_{min}$, $x_{max}$, $y_{min}$, $y_{max}$ represent the corresponding minimum and maximum values. The distribution characteristics of the dataset are visualized by means of violin plots, as shown in Figure 4. A violin plot combines the features of a kernel density plot and a box plot while showing the first quartile, median, and third quartile of the dataset. A correlation matrix was plotted to show the correlation coefficients between the variables, with negative numbers representing negative correlations. It can be seen from Figure 5 that all the correlation coefficients between the input variables are less than 0.53, which indicates that the input variables are not strongly correlated with one another and are unlikely to cause multicollinearity problems. Moreover, the correlation coefficients between the input and output variables are relatively low (all values are less than 0.35 in absolute value), which indicates that the relationship between the peak dilation angle and these inputs is not a simple multivariate linear relationship but a complex nonlinear mapping. In other words, it is difficult to establish an explicit equation between the peak dilation angle and the inputs. This is the reason why machine learning methods are used to predict the peak dilation angle in this paper.
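The min-max normalization of Equation (8) can be sketched as below. The inverse transform is added as an assumption, since predictions made on the normalized scale must be mapped back to degrees before they can be compared with measured values; the function names and the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Scale a list of numbers to [0, 1] using its own minimum and maximum."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("a constant column cannot be min-max normalized")
    return [(v - lowest) / (highest - lowest) for v in values]

def denormalize(scaled_values, lowest, highest):
    """Map values from [0, 1] back to the original scale."""
    return [s * (highest - lowest) + lowest for s in scaled_values]

# Hypothetical normal stresses in MPa
normal_stress = [0.5, 1.0, 2.0, 4.0]
scaled = min_max_normalize(normal_stress)
restored = denormalize(scaled, min(normal_stress), max(normal_stress))
print(scaled)
```

Each variable is scaled with its own minimum and maximum, so the smallest observation maps to 0 and the largest to 1 regardless of the original units.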

Hyperparameters Tuning Process
The hyperparameters c and g have a significant effect on the performance of the prediction model. The search space is divided into a grid over a range of coordinates according to a specified step, and all grid nodes are traversed. The evaluation metrics (e.g., RMSE and MSE) are obtained for every combination of c and g at the grid nodes, one by one, using K-fold cross-validation. K-fold cross-validation is a statistical technique that reduces the training bias introduced by an unrepresentative split of the data [73]. The search range and step are then adjusted according to the values of the evaluation metrics, and the best combination of c and g is the one that provides the best cross-validation metrics. As shown in Figure 6, the search range for c and g was set to ($2^{-5}$, $2^{5}$) with a step of $2^{0.2}$. All grid nodes were traversed, and the best model was determined as the one with the lowest MSE using 5-fold cross-validation. The corresponding values of c and g were taken as the optimal hyperparameters within the search range.
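The size of the search grid described above can be checked with a short sketch. It assumes that the step of 2^0.2 is a multiplicative step (an increment of 0.2 in the exponent) and that both endpoints of the range are included; neither detail is stated explicitly in the text. The function name is hypothetical.

```python
import itertools

def power_of_two_grid(min_exp=-5, max_exp=5, steps_per_unit=5):
    """Values 2**e for exponents from min_exp to max_exp in steps of 1/steps_per_unit."""
    count = (max_exp - min_exp) * steps_per_unit + 1
    return [2.0 ** (min_exp + i / steps_per_unit) for i in range(count)]

c_grid = power_of_two_grid()   # candidate penalty parameters
g_grid = power_of_two_grid()   # candidate kernel parameters
grid_nodes = list(itertools.product(c_grid, g_grid))
print(len(c_grid), len(grid_nodes))
```

Under these assumptions there are 51 candidate values per hyperparameter and 2601 grid nodes, each of which requires five model fits under 5-fold cross-validation.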
After obtaining the optimal combination of hyperparameters, the framework of the GS-SVR model for estimating the peak dilation angle is shown in Figure 7. In machine learning, a training set is typically used to build the model, and an independent test set is used to verify the model's ability to predict new data [81]. Therefore, the original dataset is randomly divided into two subsets after normalization: the training set and the test set. Through optimization analysis, 80% of the entire dataset was included in the training set and the remaining 20% was included in the test set.
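A random 80/20 split of the 89 samples can be sketched as follows. The seed and the rounding rule (truncating 0.8 × 89 = 71.2 to 71 training samples) are assumptions made for illustration, as the text does not specify how the fraction was rounded; the function name is hypothetical.

```python
import random

def train_test_split_indices(n_samples, train_fraction=0.8, seed=42):
    """Shuffle the sample indices and split them into training and test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(train_fraction * n_samples)  # assumed rounding: truncate
    return indices[:n_train], indices[n_train:]

train_idx, test_idx = train_test_split_indices(89)
print(len(train_idx), len(test_idx))
```

Fixing the seed makes the split reproducible, which is useful when two models are to be compared on exactly the same training and test sets.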

Performance of GS-SVR Model
The coefficient of determination (R²), adjusted R² (Adj.R²), root mean square error (RMSE), and mean absolute percentage error (MAPE) have been widely used for the performance evaluation of ML models. These four evaluation indices are used to characterize the agreement between the predicted and measured values of the peak dilation angle. R² measures the proportion of the variance in the measured values that is explained by the predictions. Adj.R² corrects R² for the number of input variables. The RMSE measures the typical magnitude of the deviation between the predicted and measured values. The MAPE measures the average relative error between the predicted and measured values. Generally, R² (Adj.R²) values equal to 1 and RMSE (MAPE) values equal to 0 indicate the best prediction performance. The mathematical expressions for these four evaluation indices are listed below [82]:
$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i^m - y_i^p\right)^2}{\sum_{i=1}^{N}\left(y_i^m - \bar{y}^m\right)^2}$$
$$\mathrm{Adj.}R^2 = 1 - \frac{\left(1 - R^2\right)(N - 1)}{N - m - 1}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i^m - y_i^p\right)^2}$$
$$\mathrm{MAPE} = \frac{100\%}{N}\sum_{i=1}^{N}\left|\frac{y_i^m - y_i^p}{y_i^m}\right|$$
where $y_i^m$ is the measured result; $y_i^p$ is the predicted result; $\bar{y}^m$ is the average of $y_i^m$; $N$ represents the number of samples; and $m$ represents the number of input variables.
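The four evaluation indices can be computed with a few lines of code. The sketch below uses the standard definitions of R², adjusted R², RMSE, and MAPE, which are assumed to match those of [82]; the function name and the sample values are hypothetical and are not taken from the dataset.

```python
import math

def regression_metrics(measured, predicted, n_inputs):
    """Return R2, adjusted R2, RMSE, and MAPE (in percent) for paired values."""
    n = len(measured)
    mean_measured = sum(measured) / n
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))  # residual sum of squares
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)         # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_inputs - 1)
    rmse = math.sqrt(ss_res / n)
    mape = 100.0 / n * sum(abs((m - p) / m) for m, p in zip(measured, predicted))
    return {"R2": r2, "Adj.R2": adj_r2, "RMSE": rmse, "MAPE": mape}

# Hypothetical peak dilation angles in degrees
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(measured, predicted, n_inputs=1)
print(metrics)
```

Note that MAPE is undefined when a measured value is zero, and adjusted R² requires more samples than input variables plus one.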
To highlight the predictive performance of the proposed model, the original SVR model is also applied to the training set and the test set. Figure 8 shows the predictions of the two models on the same training set and test set. The evaluation indices of the two models are given in Table 3. Compared with the SVR model, the GS-SVR model has higher coefficients of determination (R² and Adj.R²) and smaller error indices (MAPE and RMSE). Figure 8b indicates that the GS-SVR model underestimates most of the test set, which is conducive to leaving some safety redundancy in engineering. For both the training set and the test set, the predicted results of the GS-SVR model are distributed near the ideal fit line, and the predicted values are closer to the experimental results than those of the original SVR model. The values of the evaluation indices shown in the bold rows of Table 3 also indicate that the predicted values of the GS-SVR model are more consistent with the experimental values and more accurate than those of the original SVR model.

Relative Importance of Inputs
A sensitivity analysis of the input variables is carried out for a better understanding of the peak dilation angle. The method used for interpreting the relative importance of the input variables is Kendall's tau coefficient. Figure 10 shows the relative importance scores obtained for each input variable. Each input variable contributes to the peak dilation angle, but with a different level of significance. It can be seen that $\sigma_n$ is the most sensitive variable for the peak dilation angle, while the influence of $A_0$ is the smallest among the input variables. These findings might provide a more detailed understanding of the peak dilation angle and indicate directions for future experimental studies.
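Kendall's tau measures how consistently two variables rise or fall together across all pairs of samples. The sketch below computes the tie-corrected variant (tau-b) with the standard library. The text does not state which variant was used or how tau was converted into a relative importance score, so ranking the inputs by the absolute value of tau is an assumption; the data values are hypothetical.

```python
import math

def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length sequences, with tie correction."""
    concordant = discordant = ties_x_only = ties_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
            elif dx == 0 and dy != 0:
                ties_x_only += 1
            elif dy == 0 and dx != 0:
                ties_y_only += 1
            # pairs tied in both variables are ignored
    denominator = math.sqrt((concordant + discordant + ties_x_only)
                            * (concordant + discordant + ties_y_only))
    return (concordant - discordant) / denominator if denominator > 0 else 0.0

# Hypothetical normal stresses (MPa) and peak dilation angles (degrees)
normal_stress = [0.5, 1.0, 2.0, 3.0, 4.0]
dilation_angle = [20.0, 16.0, 17.0, 11.0, 9.0]
tau = kendall_tau_b(normal_stress, dilation_angle)
print(tau)
```

A value near -1 or +1 indicates a strongly monotonic relationship, so an input with a larger absolute tau would rank as more influential under the assumed scoring.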


Contribution and Limitations
The primary contribution of this study is a machine-learning-based model for predicting the peak dilation angle. This method can provide a low-cost, time-saving, and non-destructive prediction of the peak dilation angle for relevant geotechnical engineering, especially for projects with time and budget constraints.
Compared with the existing prediction models, the method has the following advantages: (1) the GS-SVR model does not require any mechanical testing after model training is completed; (2) the generalization capability of the GS-SVR model can be easily improved using larger datasets, which may make it better than the empirical equations established between the peak dilation angle and each influencing variable; (3) compared with the six analytical models for predicting the peak dilation angle, ML techniques offer strong data compatibility and model generalization. The accuracy of the GS-SVR model is the highest relative to the six analytical models.
There are still some shortcomings that need to be explored in the future. The scale effect is an important research topic in rock mechanics, and the effect of scale on the shear mechanical behaviour of rock discontinuities is still not well understood. In rock engineering design, an accurate understanding of the rock scale effect is needed for the selection of rock mechanics parameters. How to extend the proposed model, which is based on laboratory test results, to the engineering scale is the next important research topic, and how to apply this model in industry is also an interesting direction. It might be necessary to create a graphical user interface (GUI). The omission of factors such as water content, shear displacement rate, and temperature is also a clear limitation of this study. In addition, as a data-driven approach, the predictive performance of the proposed model is strongly affected by the quantity and quality of the supporting dataset. The method might be limited in some cases if there are information restrictions or not enough rock samples available. The final limitation is that the generalization capability of the proposed model on completely unknown test results (e.g., not included in this dataset) has not been fully investigated.

Conclusions
This paper provides an efficient method for predicting the peak dilation angle of rock discontinuities using a machine learning tool. The method is a hybrid GS-SVR model, which combines support vector regression (SVR) with the grid search optimization algorithm to optimize the hyperparameters and improve prediction performance. To train and evaluate the proposed model, relevant datasets from experimental tests on various rocks were retrieved, and GS and K-fold cross-validation methods were adopted to mitigate the overfitting or underfitting of the SVR model. The analysis results show that the hybrid GS-SVR model has higher prediction accuracy and lower error than the original SVR model and existing analytical models. In addition, a sensitivity analysis was performed to examine the relative importance scores of three groups of input variables (three-dimensional roughness, normal stress, and basic friction angle). The normal stress has the greatest effect on the peak dilation angle, followed by the basic friction angle, while the three-dimensional roughness has the least effect.

Figure 1. Photograph of rock outcrops in China.

Figure 2. Schematic diagram of the SVR.

Figure 3. Computation flowchart of grid search method.

Figure 4. Violin plots of variables used in the database.

Figure 5. Correlation coefficients plot between input and output variables.

Figure 7. Prediction process of the proposed ML-based model.

Figure 8. Performance of the model on the (a) training set; (b) test set.

Figure 9. Performance comparison between the GS-SVR model and other models.

Figure 10. Relative importance score of input variables.

Author Contributions: Conceptualization, S.X.; data curation, S.X.; visualization, R.Y.; funding acquisition, H.L. and S.X.; investigation, Y.C. and P.Z.; methodology, S.X. and R.Y.; writing-original draft, Y.Y.; writing-review and editing, Y.Y. and H.L. All authors have read and agreed to the published version of the manuscript.
Funding: This paper received financial funding from the Postgraduate Research and Practice Innovation Program of Jiangsu Province (No. KYCX23_0276), the National Natural Science Foundation of China (No. 42277175), the Hunan Provincial Natural Science Foundation of China (No. 2023JJ30657), and the Hunan Provincial Key Research and Development Program (No. 2022SK2082).

Table 1. An overview of existing shear strength models.

Table 2. Statistical description of inputs and output.

Table 3. Performance comparison of the proposed GS-SVR and SVR models.

Table 4. Comparison between the measured peak dilation angle and the calculated values by different models.