Prediction of the Structural Yield Strength of Saline Soil in Western Jilin Province, China: A Comparison of the Back-Propagation Neural Network and Support Vector Machine Models

Abstract: With the increase in transportation emissions, road diseases in the saline soil area of Jilin Province have become a problem that requires serious attention. In order to improve the subgrade performance, the structural yield strength (SYS) of remolded soil and its factor sensitivity are investigated in this study. Saline soils in Western Jilin are structural in the sense that the bonding strength of the soil skeleton is mainly provided by the solidification bond formed by a physicochemical interaction between particles. Its SYS is influenced by its


Symmetry 2020, 12, x FOR PEER REVIEW 3 of 19 were used to explore the relationship between water content, salt content, compactness, and SYS in this study.
The main objective of this study is to develop a prediction model for SYS at the design stage of roadbed engineering, so remolded soil was used. Remolded soil samples with different water contents, compactness, and salt contents were subjected to high-pressure consolidation tests, and 120 data points were obtained. To eliminate redundant features, the Pearson correlation coefficient (rPCC) was used as the evaluation criterion for feature selection. The K-fold cross-validation method was used to avoid overfitting. The BPNN and SVM were used to determine the relationship of SYS with the water content, salt content, and compactness of remolded saline soil in the west of Jilin Province, and prediction models for SYS were established. Finally, the influence of water content, salt content, and compactness on the SYS was studied.

Study Area
The western area of Jilin Province is a typical saline soil region in China. The study area is located in the low-lying Songnen Plain, which has a semi-humid to semi-arid climate and is also a typical seasonally frozen soil area. Because of climatic and environmental geological factors, salt accumulates readily at the surface of this area. As a consequence, vegetation in this area is scarce and the ecological environment is fragile (Figure 1). Temperatures stay below zero from November to March every year, resulting in surface freezing and upward salt migration under the influence of the temperature gradient. Precipitation is concentrated from June to August, and the annual precipitation is low. The average precipitation from 2008 to 2018 in Zhenlai is shown in Figure 2, along with the rainfall data collected from the China Meteorological Science Data Center. The average precipitation is distributed symmetrically, peaking in July. In spring and autumn, evaporation is especially strong, so a large amount of salt accumulates on the soil surface year by year under the action of the concentration gradient, which makes Jilin Province a typical saline soil area.
Soil samples were taken vertically downward at the sampling points and collected at every 10 cm of depth. The physical and chemical properties of each soil layer were determined in accordance with GB/T 50123 (1999). From the surface to 150 cm, the natural density of the samples was 1.60~2.02 g/cm³, and the natural moisture content was 3.20%~17.40%. Owing to strong evaporation, the surface layer had the lowest moisture content, 3.20%. The organic content was 0.168%~0.488%. Particle size analysis showed that the sand content of the samples was 5.66%~15.99%, the silt content was 46.92%~57.81%, and the clay content was 26.21%~45.07%. Thus, the saline soil is mainly composed of silt particles, followed by clay particles and sand particles.
In this study, the total soluble salt content was determined using a constant-temperature water bath. The total soluble salt content of the saline soil was 0.100%~0.408% from the surface to 150 cm. The salt content of the soil samples at 0~70 cm depth was higher than 0.3% and decreased below 70 cm. The Na⁺ and K⁺ contents were determined with a flame photometer.
The results show that the primary anion in the saline soil is HCO₃⁻ and the primary cation is Na⁺. The high clay and silt content in the saline soil increases the specific surface area and surface energy, and also strengthens the adsorption capacity of the soil surface [9]. Combined with the high Na⁺ content, a thick diffusion layer forms on the surface of soil particles, which weakens the bound water connection between particles until it disappears [19].
The variation of the ion components with depth is shown in Figure 3. The 40 cm soil layer is the turning point, where the total soluble salt content, HCO₃⁻ content, and SO₄²⁻ content reach their maxima. Therefore, the 40 cm soil layer was selected as the experimental soil for studying the compression characteristics of saline soil in Zhenlai, and the results of the physical and chemical tests are shown in Tables 1 and 2.

Specimen Design
The high clay and silt content in the saline soil increases the soil's specific surface area and surface energy and strengthens the adsorption capacity of the soil surface. The higher the Na⁺ content in the soil, the thicker the bound water film on the outer surface of the soil particles [20]. When the water content is very low, salt crystallizes and forms bonds between soil particles. As the water content increases, the strength and stability of the soil decrease because the salt dissolves in water, which thickens the bound water film and weakens the connection between soil particles. Because the salt content and water content are the factors that affect the thickness of the bound water film and the cementation strength between soil particles, they have a great influence on the structural strength of saline soil. The natural salt content of the 40 cm soil layer in western Jilin Province is 0.408%, and the highest value was 1.7%, so the salt content was set to 0.0%, 0.4%, 1.0%, and 2.0%. Based on the natural water content and the optimal water content, the water content was set to 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, and 23%.
Compactness requirements differ between projects. In highway projects, the required compactness is generally greater than 90%, and in other projects it is at least 85%. When compactness exceeds 95%, the soil is difficult to compact further. For economic reasons, excessive compaction should be avoided. Therefore, in view of the purpose of this test, the compactness was set to 85%, 90%, and 95%.
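As a rough check on the size of the test program, the factor levels listed above can be enumerated. The sketch below assumes a full factorial design, in which every water content is combined with every salt content and every compactness level; this is an assumption suggested by the 120 data points reported in this study rather than something stated explicitly, and the variable names are illustrative only.

```python
from itertools import product

# Factor levels taken from the specimen design described above (all in %).
water_contents = [13, 14, 15, 16, 17, 18, 19, 20, 21, 23]
salt_contents = [0.0, 0.4, 1.0, 2.0]
compactness_levels = [85, 90, 95]

# Assumption: a full factorial design, one specimen per combination.
specimens = list(product(water_contents, salt_contents, compactness_levels))

print(len(specimens))  # 120
```

Ten water contents, four salt contents, and three compactness levels give 10 × 4 × 3 = 120 combinations, which matches the number of data points reported.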

High Pressure Consolidation Test
The soil samples were repeatedly soaked in distilled water to desalinate them. According to the target water content and salinity, a given quantity of soluble salt was dissolved in distilled water to prepare the solution, which was then sprayed evenly onto the desalinated, air-dried soil. After thorough mixing, the soil was sealed in a freshness-preserving bag and placed in a humidifying cylinder for 24 h to ensure that the solution was fully and evenly distributed. The soil sample was compacted into a cutting ring (D = 61.8 mm; H = 20 mm) by the layer-compaction method. High-pressure consolidation tests were then carried out, and 120 data points were obtained by the double logarithmic coordinate method [4]. In this method, the compression curve of a soil sample is plotted in ln(1 + e)-lg(P) coordinates, where e is the void ratio and P is the vertical load (as shown in Figure 4). The compression curve can be fitted by two straight lines, and the vertical load corresponding to the intersection point of the two lines is the structural yield pressure, i.e., the SYS. The results are shown in Table 3.
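The two-line construction of the double logarithmic coordinate method can be sketched numerically. The example below is a minimal illustration, not the procedure used in this study: it assumes that the split between the two straight segments is chosen by minimizing the total squared fitting error, and the function names and the synthetic data are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares line y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def squared_error(xs, ys, a, b):
    """Sum of squared residuals of the points about the line y = a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


def yield_pressure(loads, void_ratios):
    """Load at the intersection of two lines fitted in ln(1 + e)-lg(P) space."""
    if len(loads) < 4:
        raise ValueError("need at least four load steps")
    x = [math.log10(p) for p in loads]
    y = [math.log(1.0 + e) for e in void_ratios]
    best = None
    # Try every split that leaves at least two points on each segment.
    for k in range(2, len(x) - 1):
        a1, b1 = fit_line(x[:k], y[:k])
        a2, b2 = fit_line(x[k:], y[k:])
        if a1 == a2:
            continue  # parallel lines do not intersect
        err = squared_error(x[:k], y[:k], a1, b1) + squared_error(x[k:], y[k:], a2, b2)
        if best is None or err < best[0]:
            best = (err, (b2 - b1) / (a1 - a2))
    if best is None:
        raise ValueError("no valid two-line fit found")
    return 10.0 ** best[1]


# Synthetic bilinear curve with a known break at 200 kPa (illustrative values only).
loads = [12.5, 25, 50, 100, 200, 400, 800, 1600, 3200]
x_break = math.log10(200)
ln_1_plus_e = [
    0.5 - (0.02 if math.log10(p) <= x_break else 0.15) * (math.log10(p) - x_break)
    for p in loads
]
void_ratios = [math.exp(v) - 1.0 for v in ln_1_plus_e]
estimated = yield_pressure(loads, void_ratios)
```

For real test data the two segments are not perfectly straight, so the choice of which points belong to each line affects the result and should be checked against the plotted curve.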

Feature Selection
Feature selection applies evaluation criteria to select a feature subset from the original feature space and eliminate redundant features; as a data preprocessing step, it improves the efficiency of the model [21]. The Pearson correlation coefficient ($r_{PCC}$) is one such evaluation criterion. Its values range from −1 to 1; the closer the value is to −1 or 1, the stronger the correlation between features. The Pearson correlation coefficient is calculated as follows [22]:

$$r_{PCC} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \qquad (1)$$

where $x_i$ is the water content, salt content, or compactness, and $y_i$ is the structural yield strength; $\bar{x}$ and $\bar{y}$ are the average values of the corresponding variables, and $n$ is the number of samples. Generally speaking, $|r_{PCC}| < 0.40$ indicates weak correlation among variables, $0.40 \le |r_{PCC}| < 0.70$ indicates moderate correlation, $0.70 \le |r_{PCC}| < 0.90$ indicates strong or high correlation, and $0.90 \le |r_{PCC}| \le 1$ indicates extremely strong correlation.
After calculation, the Pearson correlation coefficients of the water content, compactness, and salt content with the SYS are 1, 1, and 0.402, respectively. Water content and compactness are thus extremely strongly correlated with the SYS, whereas salt content is moderately correlated with it. Therefore, in this study, the water content, compactness, and salt content are all retained to establish the model.
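The correlation screening described above can be reproduced with a few lines of code. The following is a minimal sketch that uses the standard definition of the Pearson correlation coefficient together with the strength thresholds quoted in this section; the function names and the sample values are illustrative and are not the experimental data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_strength(r):
    """Classify |r| with the thresholds quoted in the text."""
    a = abs(r)
    if a < 0.40:
        return "weak"
    if a < 0.70:
        return "moderate"
    if a < 0.90:
        return "strong"
    return "extremely strong"


# Illustrative values only (not measured data).
r_demo = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
label_demo = correlation_strength(0.402)
```

Because the classification uses the absolute value, a strongly negative coefficient is treated the same way as a strongly positive one.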

K-Fold Cross-Validation
Performance evaluation of machine learning models is very important for model selection. K-fold cross-validation is a method to evaluate the model's performance through sample reuse, which can effectively reduce the prediction error caused by sampling randomness in the modeling process [23]. Therefore, in this study we used K-fold cross-validation for model evaluation and to avoid over-fitting, thus improving the stability and generalization ability of the models.
The main concept behind K-fold cross-validation is to divide the experimental data into K parts, of which K−1 parts form the training dataset and one part forms the testing dataset. Each part is used as the testing dataset in turn, so training and testing are carried out K times to obtain K groups of model evaluation parameters [24]. Finally, the average values of the evaluation parameters are used to compare the different models, and the model with the best evaluation parameters is selected.
Before the two prediction models were established, the 120 groups of experimental data were divided by five-fold cross-validation in order to avoid over-fitting.
To reduce the influence of differences in the order of magnitude, the variables of the training and testing datasets were normalized.
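The data partitioning and normalization steps can be sketched as follows. This is only an illustration under stated assumptions: the random shuffling, the seed, and the min-max scaling to [0, 1] are choices made here, since the text does not specify how the folds were drawn or which normalization was applied; all function names are hypothetical.

```python
import random


def k_fold_splits(n_samples, k=5, seed=0):
    """Return k (train_indices, test_indices) pairs; each fold is tested once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train_idx, test_idx))
    return splits


def fit_min_max(rows):
    """Column-wise minima and maxima of the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, lows, highs):
    """Scale each column to [0, 1] using previously fitted minima and maxima."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


splits = k_fold_splits(120, k=5)

# Example rows: (water content %, salt content %, compactness %); illustrative only.
train_rows = [(13, 0.0, 85), (18, 1.0, 90), (23, 2.0, 95)]
lows, highs = fit_min_max(train_rows)
scaled = apply_min_max(train_rows, lows, highs)
```

With 120 samples and K = 5, each split has 96 training samples and 24 testing samples. Fitting the scaling limits on the training rows and reusing them for the testing rows keeps information from the testing data out of the training step.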

Methodology

Back Propagation Neural Network (BPNN)
The neural network method is a simplified mathematical model based on the concept of information transmission between biological neurons. Neural networks have been applied in signal processing, pattern recognition, machine control, expert systems, and other fields, and are frequently used for prediction. The BPNN has a strong non-linear mapping ability and can solve many non-linear problems. The BPNN algorithm uses the least mean square (LMS) learning algorithm as its basis: a gradient search is used in the learning process of the network, and the error is propagated backward to correct the weights so as to minimize the mean square error between the actual output and the expected output of the network. The structure of the BPNN model of this study is shown in Figure 5. Although the BPNN model generally gives good results for fitting and classification, it has some problems, such as weak interpretability and a tendency to overfit. Because of its advantages, such as simple topology, high precision, easy programming, and strong practicability, the BPNN is widely applied and is one of the most important algorithms in the field of artificial intelligence. In the field of civil engineering, the BPNN has been applied to structural damage detection [25], soft rock strength prediction [26], ground vibration prediction [27], ground subsidence [28,29], engineering cost prediction [30,31], concrete expansion prediction [32], and soil-well potential prediction [33,34], as well as conductivity prediction and other engineering topics [35].
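The gradient-based weight correction described above can be illustrated with a very small network. The sketch below is not the model used in this study: it assumes one hidden layer with a logistic activation, a single tanh output neuron, per-sample (online) gradient descent on the squared error, and a synthetic target function; the function name `train_bpnn` and all parameter values are hypothetical.

```python
import math
import random


def train_bpnn(samples, targets, n_hidden=4, lr=0.2, epochs=300, seed=0):
    """Train a one-hidden-layer network by error back-propagation.

    Returns (w1, w2, history), where history holds the mean squared error per epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0])
    # w1[j]: weights from the inputs to hidden neuron j (last entry is the bias).
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    # w2: weights from the hidden neurons to the output (last entry is the bias).
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    history = []
    for _ in range(epochs):
        sq_err = 0.0
        for x, t in zip(samples, targets):
            # Forward pass: logistic hidden layer, tanh output.
            h = [
                1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(wj, x)) + wj[-1])))
                for wj in w1
            ]
            y = math.tanh(sum(w * hj for w, hj in zip(w2, h)) + w2[-1])
            err = y - t
            sq_err += err * err
            # Backward pass: error signal at the output, then at each hidden neuron.
            d_out = err * (1.0 - y * y)
            d_hid = [d_out * w2[j] * h[j] * (1.0 - h[j]) for j in range(n_hidden)]
            # Gradient-descent weight corrections.
            for j in range(n_hidden):
                w2[j] -= lr * d_out * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * d_hid[j] * x[i]
                w1[j][-1] -= lr * d_hid[j]
            w2[-1] -= lr * d_out
        history.append(sq_err / len(samples))
    return w1, w2, history


# Synthetic data: three normalized inputs and a smooth target inside (-1, 1).
samples = [(a, b, c) for a in (0.0, 0.5, 1.0) for b in (0.0, 0.5, 1.0) for c in (0.0, 0.5, 1.0)]
targets = [0.5 * a - 0.3 * b + 0.2 * c for a, b, c in samples]
w1, w2, history = train_bpnn(samples, targets)
```

Comparing `history[0]` with `history[-1]` shows how the mean squared error falls as the weights are corrected. Because the tanh output is bounded, targets must be scaled into its range, which is one reason the variables are normalized before training.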

Support Vector Machine (SVM)
The support vector machine (SVM) is a machine learning method based on statistical learning theory [36], Vapnik-Chervonenkis dimension theory, and the structural risk minimization principle, which give it good generalization ability for future data [37,38]. SVM was originally used to solve pattern recognition problems. With the introduction of the insensitive loss function, SVM has gradually been applied to non-linear regression problems [39]. In the field of civil engineering, the SVM is mainly used in tunnel deformation prediction [40], structural damage detection and diagnosis [41], earthquake disaster prediction [42], and saline-alkali degree classification [43].

For a training dataset $(x_i, y_i)$, $i = 1, 2, \ldots, l$, with $x_i \in R^n$ and $y_i \in R$, the data are fitted with a linear function in a high-dimensional feature space:

$$f(x) = w \cdot \phi(x) + b$$

where $w$ is the weight vector and $b$ is the bias. The nonlinear mapping function $\phi(x)$ maps the dataset from the input space to the high-dimensional feature space, so that the nonlinear fitting problem in the input space becomes a linear fitting problem in the high-dimensional feature space.
The regression estimation function obtained is

$$f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*)\, k(x_i, x) + b$$

where $k(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$ is called the kernel function, which is equal to the inner product of the two vectors $x_i$ and $x_j$ in the feature space. The kernel function must satisfy Mercer's theorem. Common kernel functions include the linear function, the radial basis function, and the multi-layer perceptron function. The coefficients $\alpha_i$ and $\alpha_i^*$ are obtained by solving a quadratic programming problem. Only some of the differences $(\alpha_i - \alpha_i^*)$ are non-zero, and the corresponding data points are the support vectors. $C$ is a positive constant that determines the balance between empirical risk and regularization [44,45].
For a given dataset, only the kernel function and the regularization parameter C are needed to construct the SVM. The SVM learning algorithm solves a constrained quadratic programming (QP) problem, whose solution is a global optimum.
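For reference, in the standard ε-insensitive support vector regression formulation, the quadratic programming problem for the coefficients is usually written in the dual form below. This is the textbook form rather than an expression taken from this study, and $\varepsilon$ (the half-width of the insensitive zone of the loss function) is a symbol assumed here.

```latex
\max_{\alpha,\,\alpha^*}\;
  -\frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}
     (\alpha_i-\alpha_i^*)(\alpha_j-\alpha_j^*)\,k(x_i,x_j)
  +\sum_{i=1}^{l} y_i\,(\alpha_i-\alpha_i^*)
  -\varepsilon\sum_{i=1}^{l}(\alpha_i+\alpha_i^*)

\text{subject to}\quad
  \sum_{i=1}^{l}(\alpha_i-\alpha_i^*)=0,
  \qquad 0\le\alpha_i,\ \alpha_i^*\le C,\quad i=1,\dots,l
```

The objective is concave and the constraints are linear, which is why any solution found is a global optimum; the upper bound C on the coefficients is where the trade-off between empirical risk and regularization enters.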

Model Evaluation
In a previous study, Lei (2018) used R², RMSE, and the mean relative deviation (MRD) to evaluate the predictive performance of the support vector regression (SVR) and BPNN models for the energy loss of a stepped spillway [46]. Zhang (2017) used R², RMSE, and MAPD to evaluate the predictive performance of the general regression neural network (GRNN) and BPNN models for frost heave behavior [34]. In view of the significance of these statistical parameters, this study compares the performance of the BPNN and SVM models using three parameters: R², RMSE, and MAPD.
The three statistical parameters are defined as in [34]. Here, P_SYS_i denotes the predicted SYS, E_SYS_i the measured SYS, and N the total number of data points.
(1) Coefficient of determination (R²), also called the decision coefficient: in a regression analysis, R² is an index that reflects how closely the regression predictions approximate the real data. More specifically, R² indicates the proportion of the variance in the dependent variable that is explained by the predictor (independent) variables. Its values range from 0 to 1; the closer the value is to 1, the closer the predicted values are to the experimental data.
(2) Root mean square error (RMSE): the RMSE measures the prediction errors of different models on a particular dataset. The smaller the RMSE, the better the match between the predicted values and the experimental values.
(3) Mean absolute percentage deviation (MAPD): because MAPD expresses the relative error in a very intuitive way, it is often used for model evaluation. The smaller the MAPD, the better the prediction performance of the model.
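The three evaluation parameters can be computed as sketched below. The definitions used here are common ones and are assumptions, not formulas confirmed by this text: R² is taken as one minus the ratio of the residual sum of squares to the total sum of squares, and MAPD is returned as a fraction rather than multiplied by 100. The function names and the sample values are illustrative.

```python
import math


def r_squared(measured, predicted):
    """Assumed definition: 1 - SS_res / SS_tot."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def rmse(measured, predicted):
    """Root mean square error between predicted and measured values."""
    n = len(measured)
    return math.sqrt(sum((p - m) ** 2 for m, p in zip(measured, predicted)) / n)


def mapd(measured, predicted):
    """Mean absolute percentage deviation, expressed as a fraction (assumption)."""
    n = len(measured)
    return sum(abs(p - m) / abs(m) for m, p in zip(measured, predicted)) / n


# Illustrative values only (not the experimental SYS data).
e_sys = [100.0, 200.0, 300.0, 400.0]
p_sys = [110.0, 190.0, 310.0, 390.0]
scores = (r_squared(e_sys, p_sys), rmse(e_sys, p_sys), mapd(e_sys, p_sys))
```

RMSE carries the units of the SYS, whereas R² and MAPD are dimensionless, so the three parameters together describe both the absolute and the relative size of the prediction errors.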

Determination of BPNN Parameters
In this study, the BPNN model adopts a three-layer network structure, because a BPNN with one hidden layer can approximate a highly complex nonlinear function when the number of neurons in the hidden layer is sufficient [47]. There are three input variables: water content, compactness, and salt content. The output variable is the SYS. The "logsig" function is applied to the hidden layer, and the "tansig" function is applied to the output layer. The maximum number of iterations is set to 30,000, and the learning rate is set to 0.8.
In addition, the appropriate number of hidden neurons is crucial for network performance, but it is difficult to determine. A BPNN with too few neurons cannot learn the problem, whereas a BPNN with too many neurons is not only difficult to train but also prone to overfitting [48]. Many scholars simply suggest "test and find it" [49,50]. According to Heaton, a number of hidden neurons close to twice the number of input neurons is a good starting point, after which neurons can be added or removed according to the network performance [51]. At present, there is no authoritative method for calculating the number of hidden neurons, but its range can be estimated with the following empirical formula [34]:

$$n_h = \sqrt{n_{in} + n_{out}} + a_0 \qquad (9)$$

where $n_h$ is the number of hidden neurons, $n_{in}$ is the number of neurons in the input layer, $n_{out}$ is the number of neurons in the output layer, and $a_0$ is a correction value ranging from 0 to 10. According to Equation (9), the number of hidden neurons of the BPNN in this study ranges from 2 to 12. The evaluation parameters of the 11 models with 2 to 12 hidden neurons are shown in Table 4. The average values of the three statistical parameters of the BPNN models, established on the five datasets of the five-fold cross-validation, are shown in Figure 6. When the number of neurons is 8, R² is closest to 1, and the RMSE and MAPD values are the smallest. Therefore, the optimal number of hidden neurons of the BPNN model is 8.
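The candidate range of hidden neurons follows directly from the empirical rule, which the stated range of 2 to 12 implies is the square root of the sum of the input and output neurons plus a correction value. The sketch below evaluates that rule for the network of this study (three inputs, one output, correction value from 0 to 10); the function name is hypothetical, and rounding to the nearest integer is an assumption that has no effect here because the square root is exactly 2.

```python
import math


def hidden_neuron_candidates(n_in, n_out, a_min=0, a_max=10):
    """Candidate hidden-layer sizes from n_h = sqrt(n_in + n_out) + a0."""
    base = math.sqrt(n_in + n_out)
    return [round(base + a0) for a0 in range(a_min, a_max + 1)]


candidates = hidden_neuron_candidates(n_in=3, n_out=1)
print(candidates)  # [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```

Each of the 11 candidates is then trained and scored under five-fold cross-validation, and the size with the best average R², RMSE, and MAPD is kept.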

SVM Parameter Determination
The quality of the parameter settings affects the prediction performance of the SVM model. In this study, the radial basis function (RBF) is used as the kernel function, and a grid search combined with cross-validation is used to find the optimal parameters C and g. C is the penalty factor of the model, which indicates the tolerance of the model to errors: the higher the value of C, the lower the tolerance. The parameter g belongs to the kernel function and implicitly determines the distribution of the original data after mapping to the high-dimensional feature space. The best values of C and g and the corresponding mean square error (MSE) for the five datasets are shown in Table 5.
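The grid search over C and g can be sketched in a few lines. The example below is only an outline under stated assumptions: the RBF kernel is written in its common form exp(−g‖x − z‖²), the grid uses powers of two (a frequent convention, not one stated in this text), and `toy_cv_mse` is a made-up stand-in for the cross-validated MSE that a real SVM training run would return.

```python
import math


def rbf_kernel(x, z, g):
    """RBF kernel exp(-g * ||x - z||^2); a larger g gives a narrower kernel."""
    return math.exp(-g * sum((a - b) ** 2 for a, b in zip(x, z)))


def grid_search(c_values, g_values, cv_mse):
    """Return (C, g, mse) with the lowest cross-validated MSE on the grid."""
    best = None
    for c in c_values:
        for g in g_values:
            mse = cv_mse(c, g)
            if best is None or mse < best[2]:
                best = (c, g, mse)
    return best


# Assumed grid: powers of two from 2^-5 to 2^5 for both parameters.
c_grid = [2.0 ** k for k in range(-5, 6)]
g_grid = [2.0 ** k for k in range(-5, 6)]


def toy_cv_mse(c, g):
    """Hypothetical stand-in for the cross-validated MSE of an SVM with (C, g)."""
    return (math.log2(c) - 3) ** 2 + (math.log2(g) + 2) ** 2


best_c, best_g, best_mse = grid_search(c_grid, g_grid, toy_cv_mse)
```

In practice `cv_mse` would train the SVM on the training folds for each (C, g) pair and return the average MSE on the held-out folds, so the cost of the search grows with the number of grid points times the number of folds.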

Model Performance Comparison
The regression relationships between the predicted SYS and measured SYS of the BPNN and SVM are shown in Figures 7 and 8, respectively, including the training stage and testing stage. A comparison of the BPNN and SVM evaluation parameters in the training stage is shown in Table 6, and a comparison of BPNN and SVM evaluation parameters in the testing stage is shown in Table 7.
First, the three statistical indicators of each dataset group are analyzed. The R² ranges of the five BPNN models during the training and testing stages were 0.974~0.986 and 0.943~0.986, respectively. The R² ranges of the five SVM models during the training and testing stages were 0.961~0.976 and 0.931~0.983, respectively. The minimum and maximum R² values of the BPNN models in both stages were greater than those of the SVM models, and the R² ranges of the BPNN models were narrower than those of the SVM models. The difference between the maximum and minimum R² of the BPNN and SVM models during the training stage was 0.012 and 0.015, respectively, and during the testing stage it was 0.043 and 0.052, respectively. This shows that the R² fluctuation of the BPNN model is smaller than that of the SVM model under different dataset grouping conditions; that is, the degree to which the independent variables explain the dependent variable is less affected by dataset grouping for the BPNN model.
The RMSE ranges of the five BPNN models during the training and testing stage were 41.809~57.946 and 41.554~90.967, respectively. The RMSE ranges of the five SVM models during training and testing stage were 52.294~68.578 and 42.370~99.438, respectively. The RMSE max and RMSE min of the BPNN models during the training and testing stage were less than those of the SVM models. The difference between the RMSE max and RMSE min of the BPNN and SVM models during the training stage was 16.137 and 16.284, respectively. The difference between the RMSE max and RMSE min of the BPNN and SVM models during the testing stage was 49.413 and 57.068, respectively. The RMSE range of BPNN models is smaller than that of SVM models. The RMSE of the SVM models for the K-2 group of testing dataset was 99.438, which indicates that the prediction error of the SVM models for K-2 group of the testing dataset was greater than other dataset group. Under different dataset grouping conditions, the RMSE fluctuation of the BPNN models was smaller, and the prediction errors of the BPNN models ware less affected by grouping.
The MAPDs of the five BPNN models during the training and testing stages were 0.091~0.120 and 0.102~0.168, respectively. The MAPDs of the five SVM models during the training and testing stages were 0.116~0.133 and 0.109~0.177, respectively. Both the maximum and minimum MAPD of the BPNN models during the training and testing stages were less than those of the SVM models. The MAPDs of the five SVM models during the training and testing stages were all larger than those of the corresponding BPNN models, which indicates that the relative prediction errors of the SVM models were all greater than those of the BPNN models.
The R² and RMSE ranges of the five BPNN models during the training and testing stages were narrower than those of the SVM models. The statistical parameters of the SVM models fluctuate more with different dataset groupings, so the performance of the SVM models is more strongly affected by how the dataset is grouped. This illustrates that the stability of the BPNN model is better than that of the SVM model.
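The stability comparison above rests on simple arithmetic over the reported ranges. As a minimal sketch, the following Python snippet recomputes the spread (maximum minus minimum) from the values quoted in the text; the dictionary layout and the names `spread` and `bpnn_is_more_stable` are illustrative assumptions, not part of the original study.

```python
# Reported (min, max) ranges over the five K-fold groups, copied from the text.
reported_ranges = {
    ("R2", "BPNN", "training"): (0.974, 0.986),
    ("R2", "BPNN", "testing"): (0.943, 0.986),
    ("R2", "SVM", "training"): (0.961, 0.976),
    ("R2", "SVM", "testing"): (0.931, 0.983),
    ("RMSE", "BPNN", "training"): (41.809, 57.946),
    ("RMSE", "BPNN", "testing"): (41.554, 90.967),
    ("RMSE", "SVM", "training"): (52.294, 68.578),
    ("RMSE", "SVM", "testing"): (42.370, 99.438),
}


def spread(value_range):
    """Return max - min, rounded to three decimals like the reported values."""
    low, high = value_range
    return round(high - low, 3)


spreads = {key: spread(rng) for key, rng in reported_ranges.items()}


def bpnn_is_more_stable(indicator, stage):
    """True when the BPNN spread is narrower than the SVM spread."""
    return spreads[(indicator, "BPNN", stage)] < spreads[(indicator, "SVM", stage)]


for key in sorted(spreads):
    print(key, spreads[key])
```

A narrower spread across the five groups is read here as lower sensitivity to dataset grouping, which is the sense in which the text calls the BPNN model more stable.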
Next, the average values of the three statistical indicators were analyzed. The average R² of the BPNN models during the training and testing stages was closer to 1 than that of the SVM models. The average RMSE of the BPNN models during the training and testing stages was 11.805 and 7.035 smaller than that of the SVM models, respectively. This shows that the prediction error of the BPNN models was smaller than that of the SVM models and that the predicted data of the BPNN models match the experimental data well. The average MAPDs of the BPNN models during the training and testing stages were 0.022% and 0.024% smaller than those of the SVM models, respectively. This shows that the relative error of the BPNN model is less than that of the SVM model, so the BPNN models predict better.
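For readers who want to reproduce the three indicators, a minimal sketch in Python follows. It assumes the common textbook definitions: R² as one minus the ratio of residual to total sum of squares, RMSE as the root of the mean squared residual, and MAPD as the mean absolute deviation relative to the measured value, expressed as a fraction. The paper's own equations are not part of this section, so these definitions, the function names, and the sample values are assumptions for illustration only.

```python
import math


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def rmse(measured, predicted):
    """Root mean square error, in the same units as the measured values."""
    squared = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    return math.sqrt(squared / len(measured))


def mapd(measured, predicted):
    """Mean absolute percentage deviation as a fraction (assumed definition)."""
    ratios = [abs(m - p) / abs(m) for m, p in zip(measured, predicted)]
    return sum(ratios) / len(ratios)


# Hypothetical SYS values, made up purely to exercise the functions.
measured = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

print(r_squared(measured, predicted))
print(rmse(measured, predicted))
print(mapd(measured, predicted))
```

RMSE is an absolute error in the units of SYS, whereas MAPD is a relative error, which is why the text reports both.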
The fitting curves between the predicted data of the BPNN and SVM models and the experimental data are shown in Figures 7 and 8. The slope and intercept of the fitting curves also reflect the prediction accuracy of the model: the closer the slope is to 1 and the intercept is to 0, the smaller the deviation between the predicted and experimental data, and the better the prediction. Comparing the fitting lines of the two models shows that the BPNN model predicts better overall than the SVM model. To avoid overfitting, the K-fold cross-validation method was used in this study. For the BPNN models, the R² of the K-2, K-3, and K-5 datasets in the testing stage is only slightly lower than in the training stage, and the R² of the K-1 and K-4 datasets in the testing stage is even higher than in the training stage. The RMSE and MAPD results show the opposite pattern, and the SVM results follow a similar pattern. Moreover, the statistical parameters of the BPNN and SVM models are very stable across groups, with no group performing exceptionally well or poorly. Thus, the performance of the BPNN and SVM models is stable and their generalization ability is good for the data of this study, whereas their generalization ability for external data needs to be explored and improved in future research.
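The two checks described above, the K-fold split and the slope and intercept of the fitting line, can be sketched in plain Python. The helper names `k_fold_indices` and `fit_line`, the sample count of 20, and the shuffling seed are hypothetical choices for illustration; the study's actual sample count and grouping procedure are not given in this section.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Split sample indices into k (train, test) pairs; each sample is tested once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Five groups (K-1 to K-5) over a hypothetical set of 20 samples.
splits = k_fold_indices(20, k=5)

# A perfect predictor gives slope 1 and intercept 0 on the measured-vs-predicted plot.
slope, intercept = fit_line([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
print(len(splits), slope, intercept)
```

Comparing training and testing indicators group by group, as the text does, is what reveals overfitting: a model that memorizes its training folds would show testing R² far below training R².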
Kogure (1977) [52], Stas (1984) [53], and Degroot (1999) [54] established empirical models to predict the pre-consolidation pressure of clay, and the R² values were all less than 0.80; the model performance was poor and the generalization ability was weak. Karim (2016) proposed a two-fold simple empirical model with R² = 83%, which greatly improved the performance of the model [55]. In this study, the R² results of the pre-consolidation pressure prediction models are all above 90%. Therefore, compared with the existing empirical formula models, the performance of the BPNN and SVM models established in this study is much better.
In the learning process of the BPNN model, a gradient search algorithm corrects the weights through error back-propagation so that the MSE between the actual output and the expected output is minimized. Because of this strong learning ability, the fitting accuracy of the BPNN model is high. The SVM model is based on the principle of structural risk minimization and transforms a nonlinear problem into a linear one by mapping the data to a high-dimensional feature space. In this study, the three independent variables have moderate to extremely strong correlations with the dependent variable, and there are no redundant features. Therefore, both the BPNN and SVM models show good performance and stability. However, the main drawbacks of the BPNN model are that it cannot give an explicit mathematical relationship and that its results are not interpretable. The performance of the SVM depends mainly on the choice of kernel function, and there is still no good general method for selecting the kernel function in different fields.
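The weight-correction idea described above can be illustrated with a very small network. The sketch below assumes one hidden layer of tanh units, a linear output, and full-batch gradient descent on the MSE; the architecture, learning rate, epoch count, and the synthetic data are all assumptions chosen for illustration and are not the configuration used in the study.

```python
import math
import random


def train_bpnn(inputs, targets, n_hidden=4, lr=0.05, epochs=500, seed=0):
    """Train a one-hidden-layer network by back-propagation; return the MSE per epoch."""
    rng = random.Random(seed)
    n_in, n = len(inputs[0]), len(inputs)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    history = []
    for _ in range(epochs):
        gw1 = [[0.0] * n_in for _ in range(n_hidden)]
        gb1 = [0.0] * n_hidden
        gw2 = [0.0] * n_hidden
        gb2 = 0.0
        mse = 0.0
        for x, t in zip(inputs, targets):
            # Forward pass: tanh hidden layer, linear output.
            h = [math.tanh(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
                 for j in range(n_hidden)]
            y = sum(w2[j] * h[j] for j in range(n_hidden)) + b2
            err = y - t
            mse += err * err / n
            # Backward pass: propagate d(MSE)/dy = 2 * err / n toward the inputs.
            dy = 2.0 * err / n
            gb2 += dy
            for j in range(n_hidden):
                gw2[j] += dy * h[j]
                dh = dy * w2[j] * (1.0 - h[j] ** 2)
                gb1[j] += dh
                for i in range(n_in):
                    gw1[j][i] += dh * x[i]
        history.append(mse)
        # Gradient-descent update of every weight and bias.
        b2 -= lr * gb2
        for j in range(n_hidden):
            w2[j] -= lr * gw2[j]
            b1[j] -= lr * gb1[j]
            for i in range(n_in):
                w1[j][i] -= lr * gw1[j][i]
    return history


# Synthetic data: three inputs in [0, 1] and an arbitrary smooth target.
# This is NOT a soil model; it only gives the network something to fit.
data_rng = random.Random(42)
inputs = [[data_rng.random() for _ in range(3)] for _ in range(30)]
targets = [math.sin(x[0]) + x[1] * x[2] for x in inputs]

history = train_bpnn(inputs, targets)
print(history[0], history[-1])
```

The learned weights carry no explicit formula linking inputs to output, which is the interpretability drawback noted in the text.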
In addition, in studies of basic soil properties, ignoring scale-dependence makes the experimental results deviate from engineering practice [56]. When scale-dependence is considered in establishing a model, the prediction results are closer to reality and the generalization ability is better [57][58][59][60]. However, the size effect is not taken into account in this study, which may affect the generalization ability of the model. This problem should be fully considered in future research.

Sensitivity Analysis
Because the BPNN model outperformed the SVM model, it was used to explore the influence of water content, compactness, and salt content on the SYS. Three BPNN models were built, each omitting one input variable. The first model, BPNN-1, takes compactness and salt content as the input variables; the second, BPNN-2, takes water content and salt content; and the third, BPNN-3, takes water content and compactness. As before, the datasets were divided by the K-fold cross-validation method into five groups of training and testing datasets, and the BPNN models were established and simulated for each group. The statistical evaluation parameters of the three models are shown in Table 8. The average R² of BPNN-3 during the training and testing stages was greater than 0.969 and closer to 1 than the average R² of the other two models, which shows that the proportion of variance in the dependent variable explained by the independent variables is high in BPNN-3. The R² of BPNN-1 was the smallest, with an average value of about 0.9 during the training and testing stages. This shows that the proportion of explained variance decreases markedly when water content is removed from the input variables.
The average RMSE of BPNN-3 during the training and testing stages was the smallest, which indicates that the error of BPNN-3 was the smallest and that its predicted data match the experimental data most closely. The average RMSE of BPNN-2 during the training and testing stages was 1.186 and 1.296 times that of BPNN-3, respectively. The average RMSE of BPNN-1 was the largest: during the training and testing stages it was 1.800 and 1.818 times that of BPNN-3, respectively.
The average MAPD of BPNN-3 during the training and testing stages was the smallest, which indicates that the relative error of BPNN-3 was the smallest. The average MAPD of BPNN-2 during the training and testing stages was 1.187 and 1.228 times that of BPNN-3, respectively. The average MAPD of BPNN-1 was the largest: during the training and testing stages it was 1.991 and 1.915 times that of BPNN-3, respectively.
The average values of all three statistical parameters are best for the BPNN-3 model, and those of the BPNN-2 model are better than those of BPNN-1. When water content is removed from the input variables (BPNN-1), the average R² is the smallest and the average RMSE and MAPD are larger than those of BPNN-2 and BPNN-3; that is, the proportion of variance in the dependent variable explained by the independent variables drops significantly, and the prediction error and relative error increase significantly. Therefore, the influence of the variables can be ranked as follows: water content > compaction degree > salt content.
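The ranking logic above can be sketched in a few lines of Python: enumerate every two-variable input subset, attach the error of the model trained without the omitted variable, and sort. The ratios are the average training-stage RMSE values relative to BPNN-3 as quoted in the text; the variable names and the dictionary layout are illustrative assumptions.

```python
from itertools import combinations

variables = ["water content", "compactness", "salt content"]

# Every two-variable input subset, keyed by the variable that was left out.
subsets = {
    next(v for v in variables if v not in pair): pair
    for pair in combinations(variables, 2)
}

# Average training-stage RMSE relative to BPNN-3, as quoted in the text.
relative_rmse = {
    "water content": 1.800,  # BPNN-1: water content removed
    "compactness": 1.186,    # BPNN-2: compactness removed
    "salt content": 1.000,   # BPNN-3: salt content removed (reference)
}

# The larger the error after removing a variable, the more influential it is.
ranking = sorted(relative_rmse, key=relative_rmse.get, reverse=True)
print(" > ".join(ranking))
```

The same ordering follows from the testing-stage RMSE ratios and from the MAPD ratios reported above.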
The relationship curves between the estimated SYS and the measured SYS of the three BPNN models are shown in Figures 9-11. The slope and intercept of the fitting curves also reflect the prediction performance of the models. Comparing the fitting lines between the predicted data of the BPNN-1, BPNN-2, and BPNN-3 models and the experimental data shows that the predictions of the BPNN-1 model are the worst, while those of the BPNN-3 model are the best.

Conclusions
Structural yield strength (SYS) is a key geotechnical parameter. However, for most geotechnical engineering projects it is impractical to determine the SYS of the soil layer in a region, because doing so is technically demanding, costly, and time-consuming. Therefore, it is of great engineering significance and economic benefit to establish a prediction model for SYS based on the basic properties of soil.
In this study, the BPNN and SVM models were used to predict the SYS of saline soil in western Jilin Province. Comparing the performance of the two models, we found that the BPNN model is slightly better than the SVM model. That is, in the BPNN model, the independent variables explain more of the variance in the dependent variable, the predicted values match the experimental values more closely, and the relative error is smaller.
A sensitivity analysis was also carried out with the BPNN model. Based on the same datasets, one input variable at a time was removed exhaustively, and the remaining two input variables were used to establish the BPNN model; the resulting models were BPNN-1 (water content removed), BPNN-2 (compaction degree removed), and BPNN-3 (salt content removed). The evaluation parameters of the BPNN-3 model are better than those of the BPNN-1 and BPNN-2 models, and those of BPNN-2 are better than those of BPNN-1. The results show that water content has the greatest influence on the SYS, whereas salt content has the least. The influence of the variables is therefore ranked as follows: water content > compaction degree > salt content.
Compared with the traditional empirical formula models, the BPNN and SVM models established in this study perform better on the internal dataset. Although K-fold cross-validation was used to avoid overfitting and to improve the generalization ability of the models, their performance on external datasets needs further research, which is a limitation of this study.