Soil Liquefaction Prediction Based on Bayesian Optimization and Support Vector Machines

Liquefaction has been responsible for several earthquake-related hazards in the past. An earthquake may cause liquefaction in saturated granular soils, which might lead to massive consequences. The ability to accurately anticipate soil liquefaction potential is thus critical, particularly in the context of civil engineering project planning. Support vector machines (SVMs) and Bayesian optimization (BO), a well-known optimization method, were used in this work to accurately forecast soil liquefaction potential. Before the development of the BOSVM model, an evolutionary random forest (ERF) model was used for input selection. From among the nine candidate inputs, the ERF selected six, including water table, effective vertical stress, peak acceleration at the ground surface, measured CPT tip resistance, cyclic stress ratio (CSR), and mean grain size, as the most important ones to predict the soil liquefaction. After the BOSVM model was developed using the six selected inputs, the performance of this model was evaluated using renowned performance criteria, including accuracy (%), receiver operating characteristic (ROC) curve, and area under the ROC curve (AUC). In addition, the performance of this model was compared with a standard SVM model and other machine learning models. The results of the BOSVM model showed that this model outperformed other models. The BOSVM model achieved an accuracy of 96.4% and 95.8% and an AUC of 0.93 and 0.98 for the training and testing phases, respectively. Our research suggests that BOSVM is a viable alternative to conventional soil liquefaction prediction methods. In addition, the findings of this research show that the BO method is successful in training the SVM model.


Introduction
Solid-to-liquid transitions in granular materials are known as liquefaction, and may be caused by a growth in pore water pressure [1,2]. Seismic liquefaction of saturated soils, which occurs as a result of earthquakes, is one of geotechnical engineers' most pressing problems. This is because the lateral expansion of soil mass might represent a significant hazard to civil engineering works in the area if it occurs [1][2][3]. As an example, after the Wenchuan earthquake of M 8.0 which struck China in 2008, both surface buildings and subsurface utilities were damaged by liquefaction [1,2,4]. Consequently, estimating the soil liquefaction potential is a significant issue, and must be considered when building civil engineering structures [5][6][7][8][9][10]. Soil liquefaction potential may be measured in a variety of ways, as described in the scientific literature (e.g., [11][12][13]). Since in situ observations can only be made in regions where testing may be done on site, most approaches rely on separating non-liquefaction sections from liquefaction components (e.g., the shear wave velocity (Vs) technique and flat dilatometer tests (DMTs)) [2,14]. Due to the great uncertainty in both soil properties and earthquake scenarios, it is difficult to find a single effective empirical formula for regression analysis. This is why scientists are working to develop scientific predictive methods that are simpler, more intuitive, and more accurate than the typical empirical models that were previously used to analyze soil liquefaction.
Liquefaction potential may be accurately predicted using artificial neural networks (ANN)-based models, the most extensively used of all (e.g., [15][16][17][18][19]). In fact, ANNs have been shown to be more efficient than statistical approaches, but they also display several shortcomings, such as slow convergence speed, over-fitting, falling into local minima, poor generalization, and so forth. Using post-liquefaction cone penetration test (CPT) and standard penetration test (SPT) data, Muduli and Das [20,21] created the multi-gene genetic programming (MGGP) technique to assess the ability of the soil to be liquefied. It was discovered that a new instrument for assessing liquefaction could be marketed and supported efficiently. Particle swarm optimization (PSO) was used to improve a neuro-fuzzy GMDH model created by Javdanian et al. [22]. This model was shown to be acceptable and reliable in this area. PSO was also hybridized with a kernel extreme learning machine (KELM) to evaluate liquefaction potential [23]. To forecast the likelihood of the soil liquefaction, Hoang and Bui [24] used a least squares support vector machine (LSSVM) and a kernel Fisher discriminant analysis. Their findings demonstrated that the suggested model is both acceptable and reliable in this domain. Soil liquefaction was also predicted using the ensemble group method of data handling (EGMDH) [25]. The EGMDH model was shown to be more accurate than the standard GMDH model in forecasting soil liquefaction. Rahbarzare and Azadi [26] proposed an improved fuzzy support vector machine (FSVM) based on PSO and a genetic method. According to the researchers, FSVM performance was improved by using PSO and genetic algorithms (GAs). It seems that machine learning models are able to solve liquefaction potential problems with an acceptable level of accuracy. It is important to note that such models have been successfully applied in different areas of civil engineering, as reported by many scholars . 
Some studies also employed Bayesian models to model the liquefaction triggers [50][51][52].
SVM models have been frequently utilized to forecast soil liquefaction. This method has been hybridized with different optimization techniques, including genetic algorithms (GAs), differential evolution (DE), grey wolf optimization (GWO), and kernel Fisher discriminant analysis (KFDA) [24,53]. However, to the best of our knowledge, no study to date has hybridized SVM with the Bayesian optimization (BO) technique, which is one of the most effective optimization techniques. Thus, in order to forecast soil liquefaction potential, this research develops a hybrid intelligence model (BOSVM). The remainder of the paper is organized as follows: Section 2 explains the mechanics of soil liquefaction. Section 3 goes on to describe the SVM and BO frameworks, as well as the datasets that were employed in this investigation. The results and discussion are described in Section 4. Finally, Section 5 gives a summary of this study.

Process of Soil Liquefaction
Saturated cohesionless soils liquefy when pore pressure rises, causing a loss in firmness that may lead to cracking and crumbling [1]. More specifically, Sladen et al. [54] describe the process of soils losing their shear resistance when subjected to cyclic, monotonic, or shock loadings; subsequently, the soil flows like a liquid until the shear stresses that operate on its mass are equal to or lower than its lowered resistance. In a broader sense, liquefaction is a transition from solidity to fluidity that happens when the pore pressure and the functional stresses are increased or decreased [1]. When soils are subjected to shearing forces, they have a propensity to shrink in volume, which can lead to the process known as liquefaction. After being sheared, saturated loose soil tends to compact into tighter particles that take up fewer pore spaces, analogous to the way that water is driven out of pores when trapped in them. Penetrating shear loads may cause pore water pressures to increase over time if the drainage system is blocked. When this occurs, stress is transferred from the soil mass to the pore water, reducing the soil's shear resistance and its effective stress [1]. Liquefaction occurs when the soil's shear resistance is less than its static, driving shear stress, allowing the soil to undergo structural damage. True liquefaction occurs only when the flow of soil is greater than the undrained residual shear resistance of a contracting soil under a static shear stress, according to Castro's most restrictive description [55]. It is worth remembering that both cyclic and monotonic shear stresses may lead to the liquefaction of cohesionless, loose soil.

Method
Soil liquefaction was predicted using a hybrid of the SVM model and the BO algorithm. Input selection, data splitting, model construction, and evaluation were all part of the procedure. A flowchart of this study is shown in Figure 1.

Evolutionary Random Forest (ERF)
Input selection is a critical step in the development of any ML model. Input selection refers to a process that identifies the most relevant inputs and removes irrelevant inputs from the dataset and modelling process. In this study, the evolutionary random forest (ERF) algorithm was employed. It is common practice to apply the random forest (RF) method and its ensemble theory when working with large datasets, selecting features for classification, and carrying out regression analyses. The RF method generates a variety of weak regressors based on decision trees (DTs) using randomly selected inputs or sample divisions from a training set. Each DT is created using data provided by the user, and generates a decision-making model. In other words, characteristics in the dataset are examined and disassembled in order to reach a satisfactory choice. Each model's forecasted decision outcomes are obtained by the algorithm throughout the regression procedure. The mean of all the forecasts is used to reach the final forecast. Regardless of whether the overfitting issue is successfully mitigated, the RF's arbitrary rule may impair learning capacity. As a result, the evolutionary computation that improves the subset sampling process [56,57] is expected to play a critical role in complementing the RF by enhancing the searchability of the complex objective function.
As can be seen in Figure 2, randomly generated rules at the start of the procedure determine the data partitions and assign these subsets to each weak classifier/regressor. The regressors predict the values of the training data, and their forecasts are averaged to reach a consensus. The resulting regression accuracy is used to gauge each individual's fitness in the evolutionary process. To improve the accuracy and genetic characteristics of the population, selection, crossover, mutation, and evaluation are then applied repeatedly. Once the individuals converge, the algorithm stops iterating and returns the optimum individual (data split) as the regression model. When the trained model is subsequently used, an ensemble regression based on this optimum individual is produced.
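The evolutionary loop described above (selection, crossover, mutation, evaluation) can be illustrated with a minimal sketch that evolves a Boolean mask over candidate inputs. This is not the ERF implementation used in the study: the fitness function `toy_fitness`, the set of "informative" indices, the population size, the number of generations, and the positive mutation probability are all hypothetical stand-ins. In a real pipeline the fitness would be the accuracy of a random forest trained on the masked inputs.

```python
import random

def tournament(pop, fits, rng, k=2):
    """Tournament selection: return the fittest of k randomly drawn individuals."""
    contenders = rng.sample(range(len(pop)), k)
    return pop[max(contenders, key=lambda i: fits[i])]

def uniform_crossover(a, b, rng):
    """Uniform crossover: each gene is copied from either parent with equal chance."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def mutate(mask, p_mut, rng):
    """Flip each bit of the mask independently with probability p_mut."""
    return [(not bit) if rng.random() < p_mut else bit for bit in mask]

def evolve_mask(fitness, n_features, pop_size=20, generations=30,
                p_init=0.5, p_cross=0.5, p_mut=0.1, seed=0):
    """Evolve a Boolean feature mask that maximizes fitness(mask)."""
    rng = random.Random(seed)
    pop = [[rng.random() < p_init for _ in range(n_features)]
           for _ in range(pop_size)]
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        fits = [fitness(m) for m in pop]
        # Remember the best individual evaluated so far.
        for m, f in zip(pop, fits):
            if f > best_fit:
                best, best_fit = m[:], f
        # Build the next generation.
        nxt = []
        while len(nxt) < pop_size:
            p1 = tournament(pop, fits, rng)
            p2 = tournament(pop, fits, rng)
            child = uniform_crossover(p1, p2, rng) if rng.random() < p_cross else p1[:]
            nxt.append(mutate(child, p_mut, rng))
        pop = nxt
    return best, best_fit

def toy_fitness(mask):
    """Hypothetical stand-in for model accuracy: reward three 'informative'
    inputs (indices 0, 2, 5) and penalize every other selected input."""
    informative = {0, 2, 5}
    hits = sum(1 for i in informative if mask[i])
    extras = sum(mask) - hits
    return hits - 0.5 * extras

if __name__ == "__main__":
    mask, fit = evolve_mask(toy_fitness, n_features=9)
    print("selected inputs:", [i for i, keep in enumerate(mask) if keep], "fitness:", fit)
```

The sketch uses a fixed number of generations rather than a convergence test, which keeps it short. A convergence-based stopping rule, as described in the text, could replace the outer loop condition.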

Support Vector Machines
SVM is an ML approach that combines several techniques, such as the maximum-margin hyperplane, slack (relaxation) variables, and kernel functions. The model is grounded in statistical principles. Classification problems involving few samples, nonlinearity, and complexity may be solved with this method [58]. SVM has been progressively used in civil engineering as interdisciplinary integration has become more widespread. A nonlinear transformation maps the input samples into a high-dimensional feature space, and an optimum classification plane is then found that separates the samples linearly within that feature space [59,60]. Because the occurrence of soil liquefaction is a binary outcome, the approach is well suited to the study of soil liquefaction and its risk assessment (e.g., [61]). Figure 3 depicts a schematic representation of the SVM concept. The distance between the hyperplane and the nearest sample points is known as the margin. The classifier's capacity to generalize improves as the margin increases. As a result, finding the hyperplane that maximizes the margin (i.e., the ideal hyperplane) is the primary goal of the SVM. The sample points lying on the margin boundaries on either side of the hyperplane are the support vectors, and the classification boundary is determined only by the support vectors, not by the other data points or the quantity of data. The behaviour of the SVM also depends strongly on its hyperparameters, so their optimization is essential. Among the several hyperparameters used in SVM, kernel type, C, and gamma are among the most important. As previously stated, the kernel transforms the observed data into a feature space. By imposing a penalty for every incorrectly classified data sample, hyperparameter C controls the trade-off between a smooth decision boundary and classification accuracy on the training data.
In various kernel types, gamma is a parameter linked to C. The influence of C is minimal when gamma is large. When gamma is small, C affects the model much as it would affect a linear one.
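For reference, the roles of C and gamma can be made concrete with the standard textbook soft-margin formulation. The paper does not state these equations, so the symbols below ($w$, $b$, $\xi_i$, $\phi$, $\gamma$) are assumptions introduced here for illustration.

```latex
% Soft-margin SVM (primal form), for n training samples (x_i, y_i) with y_i in {-1, +1}:
\min_{w,\,b,\,\xi}\quad \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\qquad \text{subject to} \qquad
y_i\left(w^{\top}\phi(x_i) + b\right) \ge 1 - \xi_i,\quad \xi_i \ge 0 .

% Radial basis function (Gaussian) kernel, one common kernel choice:
K(x, x') = \phi(x)^{\top}\phi(x') = \exp\!\left(-\gamma \lVert x - x' \rVert^{2}\right).
```

In this form the margin width is $2/\lVert w \rVert$, so minimizing $\lVert w \rVert^{2}$ widens the margin, while C weights the penalty on the slack variables $\xi_i$ that measure margin violations. For a linear kernel, $\phi$ is the identity map and gamma plays no role.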

Bayesian Optimization Algorithm
The adjustment of learning parameters and model hyperparameters is an important part of the implementation of ML algorithms [62]. Hyperparameters define properties of the model or of the training process, and have a substantial impact on the model's final outcome [63]. BO is commonly used with conventional ML algorithms as a hyperparameter optimization (selection) strategy. The BO algorithm is extensively used in AI applications because of its reported benefits over the particle swarm optimization algorithm, genetic algorithm, and other algorithms [63,64]. The technique uses a Gaussian process and Bayes' theorem to optimize parameters. Gaussian process regression is used to build a surrogate of the objective function and to quantify the uncertainty in that surrogate. An acquisition function is then derived from the surrogate to decide where to sample next. In Appendix A, the typical circumstances in which the BO algorithm encounters difficulties are explained. In addition, Figure 4 depicts a generic pseudocode of the BO.
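The surrogate-and-acquisition loop described above can be sketched in one dimension with a small Gaussian process and the expected-improvement acquisition function. This is a simplified illustration, not the implementation used in the study. The objective `toy_cv_error` is a hypothetical stand-in for a cross-validated classification error as a function of log10(C), and the RBF length scale, grid of candidates, and iteration counts are assumed values.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def gp_fit(xs, ys, noise=1e-6):
    """Fit a GP surrogate and return a function giving (mean, std) at a new point."""
    mean_y = sum(ys) / len(ys)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    alpha = solve(K, [y - mean_y for y in ys])

    def predict(x_new):
        k_star = [rbf(a, x_new) for a in xs]
        mu = mean_y + sum(k * a for k, a in zip(k_star, alpha))
        v = solve(K, k_star)
        var = max(1.0 - sum(k * w for k, w in zip(k_star, v)), 1e-12)
        return mu, math.sqrt(var)

    return predict

def expected_improvement(mu, sigma, best):
    """Expected improvement over the current best value (minimization)."""
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sigma * pdf

def bayes_opt(objective, lo, hi, n_iter=10, n_grid=61):
    """Minimize objective on [lo, hi], choosing each new sample by maximum EI."""
    grid = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]
    xs = [grid[0], grid[n_grid // 2], grid[-1]]      # three initial samples
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        predict = gp_fit(xs, ys)
        best = min(ys)
        candidates = [g for g in grid if g not in xs]
        x_next = max(candidates,
                     key=lambda g: expected_improvement(*predict(g), best))
        xs.append(x_next)
        ys.append(objective(x_next))
    i = min(range(len(ys)), key=ys.__getitem__)
    return xs[i], ys[i], list(zip(xs, ys))

def toy_cv_error(log10_c):
    """Hypothetical stand-in for a cross-validated error curve over log10(C)."""
    return 0.03 + 0.02 * (log10_c - 1.0) ** 2

if __name__ == "__main__":
    x, y, history = bayes_opt(toy_cv_error, -3.0, 3.0)
    print("best log10(C):", x, "error:", y, "evaluations:", len(history))
```

Searching over log10(C) between -3 and 3 corresponds to a range of 0.001 to 1000 on the original scale. A practical implementation would typically rely on an established optimization library rather than a hand-written Gaussian process.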

Performance Criteria
This study used several well-known performance criteria for classification. These criteria include the confusion matrix, accuracy (%), the receiver operating characteristic (ROC) curve, and the area under the ROC curve (AUC). The more accurate the model, the closer the ROC curve is to the top-left corner of the plot. AUC values fall within a range of 0 to 1. The greater the AUC, the more accurate the model.
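As an illustration of how these criteria are computed, the following sketch derives the confusion matrix, accuracy, ROC points, and AUC from binary labels and classifier scores. The function names and the four-sample example are hypothetical and are not data from this study. The AUC is obtained with the trapezoidal rule over the ROC points.

```python
def confusion_matrix(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def roc_points(y_true, scores):
    """(FPR, TPR) pairs obtained by lowering the decision threshold step by step."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

if __name__ == "__main__":
    # Hypothetical labels (1 = liquefied, 0 = not liquefied) and scores.
    y_true = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    print(confusion_matrix(y_true, y_pred), accuracy(y_true, y_pred))
    print(auc(roc_points(y_true, scores)))
```

A ROC curve that hugs the top-left corner passes through points with low false-positive rate and high true-positive rate, which is why its area approaches 1.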

Data for Modeling
The Great Tangshan Earthquake, which occurred on 28 July 1976, was a major natural catastrophe in China. The death toll put the catastrophe at the top of the list of the most devastating earthquakes of the 20th century. Hebei's industrial metropolis of Tangshan, home to almost a million people, was the epicenter of the earthquake. Initial estimates put the death toll at 655,000, but this has since been revised to between 240,000 and 255,000, with 164,000 people suffering from serious injuries [1]. In order to build the models described in this research, a database from prior studies was used [1]. The Tangshan Earthquake was the subject of this database [65]. Several entries were omitted from the final analysis, owing to incomplete or inaccurate data. Liquefaction potential was the sole parameter included in the model output. Variables used in this study are listed in Table 1. For each case, a value of "1" indicates that liquefaction occurred, whereas "0" indicates that it did not. The overall cyclic shear stress caused by the earthquake is denoted by $\tau_{av}$. A total of 79 data records were used for modeling. The data were split into training and test datasets, with a ratio of 70:30. To train the BOSVM model, this study used 5-fold cross-validation. The developed model was then tested using the test data.
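The data handling described above (a 70:30 split followed by 5-fold cross-validation on the training portion) can be sketched as follows. The shuffling seed, the rounding rule for the training size, and the lack of stratification are assumptions of this sketch. The paper does not report how the records were assigned, so the resulting counts are illustrative only.

```python
import random

def train_test_split_indices(n, train_frac=0.7, seed=0):
    """Shuffle indices 0..n-1 and split them into training and test index lists."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = round(train_frac * n)   # assumed rounding rule
    return idx[:n_train], idx[n_train:]

def k_fold(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f in (folds[:i] + folds[i + 1:]) for j in f]
        yield train, val

if __name__ == "__main__":
    train_idx, test_idx = train_test_split_indices(79)
    print(len(train_idx), "training records,", len(test_idx), "test records")
    for fold, (tr, val) in enumerate(k_fold(train_idx), start=1):
        print("fold", fold, "train:", len(tr), "validation:", len(val))
```

Keeping the test indices out of the cross-validation loop means that hyperparameters are tuned only on training records, and the held-out portion is used once for the final evaluation.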

Input Selection
Input selection is a critical phase in the machine learning modelling process [54][55][56][57][66][67][68][69], and is used to remove unnecessary variables while keeping those that are valuable. This study employed the evolutionary random forest (ERF) technique for input selection. The dataset used in this study consisted of nine candidate inputs, including $M$, $d_w$, $d_s$, $\sigma_v$, $\sigma_{v0}$, $a_{\max}$, $q_c$, CSR, and $D_{50}$. These nine parameters were selected because of their effects on the liquefaction from a geotechnical viewpoint; some of them were used in the previous related studies or suggested as the most influential factor in liquefaction occurrence [1,19,23,50,51]. Nonetheless, they have different levels of impact on liquefaction occurrence. It is necessary to keep intelligent models as simple as possible. This can be achieved by considering the most influential factors. To do this, the ERF selected six inputs, including $d_w$, $\sigma_{v0}$, $a_{\max}$, $q_c$, CSR, and $D_{50}$. Based on the ERF findings, the subset consisting of these six inputs outperformed other input subsets on the dataset utilized in this research. These inputs were used to develop SVM and BOSVM models to predict soil liquefaction. It is important to mention that several parameters were used to develop the ERF model. The selection scheme was set as "tournament", $p_{\text{initialize}}$ was set as 0.5, $p_{\text{mutation}}$ was set as −0.1, and $p_{\text{crossover}}$ was set as 0.5. The crossover type was uniform. The accuracy of this model was 92.50%.

BOSVM Model Development
In this work, the hyperparameters of the SVM-based prediction model were optimized using the Bayesian optimization (BO) approach. Specifically, the box constraint level and the kernel scale of the SVM were optimized, each over a range from 0.001 to 1000. An overview of how the BO method is used to optimize the SVM parameters is provided below:
1. Preparing the data: Using a suitable ratio (70:30), the data set is partitioned into training and testing sets. The distribution of inputs after the data split is shown in Figures 5 and 6.
2. Examination of fitness: The fitness function is computed and assessed before optimizing the target parameter value. The fitness function in this study is classification error.
3. Adjusting the settings: Hyperparameter optimization criteria may be adjusted according to the outcomes of each iteration, if desired.
4. Checking the stopping condition: Optimization stops once the best parameters have been found.
Sustainability 2022, 14, x FOR PEER REVIEW
Figure 5. Distribution of inputs after data split (training set).
SVM was used in conjunction with one optimization technique (i.e., BO) to create a hybrid intelligent model based on SVM that could better forecast soil liquefaction. As a result of the aforementioned optimization, several hyperparameter settings and model prediction results were produced.
The BO search was run for up to one hundred iterations, with classification error as the fitness measure, to obtain the optimal SVM hyperparameters. As shown in Figure 7, convergence occurred in the BOSVM model before the maximum number of iterations had been completed. After 19 iterations using the BOSVM approach, the optimum SVM hyperparameters with the lowest classification error of 0.033 were found. This suggests that the strategy is reasonably effective in identifying the optimum hyperparameters. The ability of BO to use all knowledge from prior runs in order to discover the next set of hyperparameters may explain this high rate of convergence [70,71].
BO was used to optimize the kernel and the regularization parameter (C); the best-performing configuration was the linear kernel with a C value of 18.49. These values were used as the BOSVM hyperparameters, whereas the standard SVM used the default configuration. Both the SVM and BOSVM models were evaluated using the accuracy (%), confusion matrix, and ROC curve to verify and evaluate results. The ROC curve shows the predictive power of the models, while the confusion matrix shows the specifics of the model's prediction capacity.
As stated in Table 2, the training accuracy of BOSVM is 96.4%, which is about 5.5 percentage points higher than the SVM's 90.9% training accuracy. Moreover, the SVM model's test accuracy improved by 4.1% with the use of the BO algorithm. The BOSVM model's training and testing accuracy are fairly close together, indicating the model's stability in predicting soil liquefaction. Overall, soil liquefaction was better predicted with BOSVM than with SVM.
[72]. Both the training and testing phases of the BOSVM model have AUC values of above 0.9. There seems to be an adequate distribution of ROC values for the BOSVM, and the majority are clustered towards the top.
As can be seen in Figure 9, the BOSVM model's prediction performance was compared with that of other models, including logistic regression, single decision trees, boosted trees, and artificial neural networks (ANNs). The hybrid optimization model had better predictive performance than the other models (see Figure 6). These findings indicate that the BOSVM hybrid model can learn, evaluate, and forecast well. Soil liquefaction can be predicted by applying the suggested BOSVM hybrid model.
Figure 9. Comparison of accuracy with other models.
In addition, it should be noted that the entire dataset used in this research was also used in the investigations carried out by Xue and Yang [1] and Cai et al. [53]. They used three more input parameters (i.e., $M$, $d_s$, and $\sigma_v$) together with the six inputs used in the current study, and developed an adaptive neuro fuzzy inference system (ANFIS), a least squares support vector machine (LSSVM) and a radial basis function neural network (RBFNN) in combination with optimization algorithms (i.e., grey wolf optimization (GWO), differential evolution (DE), and the genetic algorithm (GA)) for predicting the soil liquefaction values. The current study's results are comparable to those of the preceding investigations. This shows that the BOSVM model suggested in this work can make excellent forecasts.
This study's findings compare favorably with those of many previous soil liquefaction studies that used different datasets. For example, accuracy values of 92.2% and 93.19% were obtained in the studies conducted by Zhang et al. [61] and Hoang and Bui [24], respectively, to predict soil liquefaction by introducing grey wolf optimization (GWO)-SVM and kernel Fisher discriminant analysis (KFDA) with least squares support vector machine (LSSVM) techniques. In other words, the developed BOSVM prediction model outperformed the other models in terms of accuracy. Consequently, this study recommends that the BOSVM model be used and developed to anticipate soil liquefaction in the future.

Limitations and Future Works
Future studies might use the model created in this research to predict soil liquefaction. Soil liquefaction under more severe situations requires further data and research, and this should be emphasized. Only under identical circumstances and with a suitable range of database information should the hybrid model described here be used. It is recommended that in the future more data samples and characteristics should be included in the experimental database in order to improve model accuracy.

Conclusions
Soil liquefaction was the subject of this research, which used the hybridization of SVM models. A renowned optimization strategy (i.e., BO) that has been effectively studied by other scholars was chosen and integrated with SVM, and a BOSVM hybrid model was constructed for prediction purposes. This model was constructed using six model inputs and an output (i.e., soil liquefaction). For input selection, an ERF approach was used prior to the development of this model. The nine possible inputs were narrowed down to the six that were ultimately used. The performance of the SVM-based model was assessed using accuracy (%), ROC curve, AUC, and confusion matrix. In addition, for comparison purposes we predicted soil liquefaction using other proposed models (i.e., SVM, ANN, KNN, boosted trees, and bagged trees). The BOSVM model outperformed all other applied predictive approaches, with an accuracy of 96.4% and 95.8% and an AUC of 0.93 and 0.98 for training and testing, respectively, after being evaluated against all other created and applied models. Therefore, the model developed in this work may be applied in future studies to forecast soil liquefaction.