1. Introduction
Natural disasters, such as earthquakes, have long been associated with risks to human life and environmental degradation. Earthquake-induced soil liquefaction (EISL) is one of the most important geotechnical phenomena and occurs in loose, saturated sandy soils during an earthquake. The soil’s tendency to decrease in volume causes the pore water pressure to increase, which in turn reduces the effective stress in the soil. The shear resistance of the soil is then greatly reduced and approaches zero, and the granular material finally behaves like a liquid rather than a solid [1,2]. Several factors can affect the occurrence of EISL, including the intensity and magnitude of the earthquake, the type and mechanical characteristics of the local soil, unit weight, relative density, fines content, mean particle diameter, the void ratio, and the initial confining pressure during the earthquake [1,3,4,5,6]. EISL can cause significant damage to structures, facilities, and vital urban arteries. It can also cause loss of life and geographical disasters, such as the uplift of underground facilities, subsidence of buildings, and landslides [7]. Therefore, predicting the EISL potential is a basic and essential step in designing structures to reduce earthquake damage. Although EISL has been known for a long time, it has been widely studied by geotechnical engineers because of the massive destruction caused by several strong earthquakes, such as Niigata 1964, Kocaeli 1999, Chi-Chi 1999, Baja California 2010, Tohoku 2011, and Bushehr 2013 [1,8,9,10,11,12,13].
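The mechanism described above can be summarized with two standard soil mechanics relations, Terzaghi's effective stress principle and the Mohr-Coulomb strength of a cohesionless soil. The symbols $u$ (pore water pressure), $\varphi'$ (effective friction angle), and $\tau_f$ (shear strength) are introduced here for illustration only and are not defined in the original text.

```latex
\sigma'_v = \sigma_v - u, \qquad \tau_f = \sigma'_v \tan\varphi'
\quad\Longrightarrow\quad
u \to \sigma_v \;\Rightarrow\; \sigma'_v \to 0 \;\Rightarrow\; \tau_f \to 0
```

Under these assumptions, a rise in pore water pressure toward the total vertical stress drives both the effective stress and the shear strength toward zero, which is the condition described as liquefaction.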
In general, predicting liquefaction is difficult because the mechanism of soil liquefaction is complicated and depends on numerous influencing factors. Many researchers have therefore proposed prediction methods over the last three decades. Traditional techniques for estimating the liquefaction resistance of soils include statistical methods and laboratory tests, such as the cyclic simple shear test, the cyclic triaxial test, and the cyclic torsional test, which are not only expensive and time-consuming [14] but also have several limitations [15]. For example, statistical methods suffer from insufficient accuracy [16], and the dependence of laboratory methods on sampling and testing procedures makes their results unreliable [17].
Recently, field test methods based on the cone penetration test (CPT) [7,9,18,19,20,21,22,23,24], the standard penetration test (SPT) [6,7,14,17,25,26,27,28,29,30], and the shear wave velocity ($V_S$) test [17,31] have become the most widely used approaches to predict EISL. SPT has limitations in cohesive soils and coarse-grained soils, and it provides discrete datapoints rather than continuous profiling. The shear wave test involves sending shear waves into the ground and measuring their travel time and velocity. This test requires specialized equipment and expertise and is typically conducted in conjunction with other site investigations. In summary, while both CPT and SPT can provide valuable information for assessing liquefaction potential, CPT is generally considered the more accurate and reliable of the two because of its continuous profiling and direct measurement of soil resistance. The shear wave test provides additional insights into the dynamic properties of soils and can be used in conjunction with other tests for a comprehensive liquefaction assessment. Accordingly, the CPT has attracted the most attention from geotechnical researchers in the last 20 years, since it is consistent, fast, and repeatable [7,9,19,20,21,22,23,24].
With the development of soft computing techniques, many scholars have applied these models to science and engineering problems [2,32,33,34,35,36,37,38,39,40,41,42,43,44,45]. Techniques such as artificial neural networks (ANN), the fuzzy neural inference system (Neuro-Fuzzy), and the support vector machine (SVM) have also been proposed to predict EISL using in situ test data. Goh investigated the feasibility of using various ANN models, such as a backpropagation (BP) neural network and a probabilistic neural network (PNN), to assess the liquefaction potential based on historical cases [46,47]. Hanna et al. [16] developed a general regression neural network (GRNN) model to determine the occurrence of EISL using field test data from earthquakes in Taiwan and Turkey. Rezania et al. [24] presented an evolutionary polynomial regression (EPR) model to estimate the EISL potential of sands based on CPT data. Kayadelen developed two models based on gene expression programming (GEP) and the adaptive neuro-fuzzy inference system (ANFIS) to estimate the safety factor against soil liquefaction using data from earthquakes in Turkey and Taiwan [48]. Muduli and Das proposed a classification approach based on multi-gene genetic programming (MGGP) to evaluate soil liquefaction using a large CPT dataset [22]. Samui developed an SVM model to predict EISL based on actual CPT data from the 1999 Chi-Chi earthquake [19]. Ardakani and Kohestani investigated the potential of the decision tree (C4.5) algorithm for estimating EISL using CPT test results [23]. Xue and Xiao developed two hybrid SVM classifiers based on the genetic algorithm (GA) and grid search (GS) to estimate the potential of soil liquefaction based on CPT data from five significant earthquakes [9]. Xue and Liu used two optimization algorithms, namely particle swarm optimization (PSO) and GA, to select the optimal parameters and improve the prediction accuracy of a BP neural network model that forecasts the EISL susceptibility of soil, using CPT data from eight earthquakes between 1964 and 1983 [21]. Kurnaz and Kaya predicted the potential of soil liquefaction from CPT data using an ensemble model built on group method of data handling neural networks [14]. Rahbarzadeh and Azadi used a fuzzy support vector machine (FSVM) classification method optimized with a hybrid of PSO and GA to improve the prediction of soil liquefaction using CPT data [20]. Mahmood et al. proposed a hybrid approach based on a Bayesian belief network (BBN) to predict the potential of EISL using CPT data [49]. Zhang and Wang presented a hybrid classifier ensemble that integrates different base classifiers, including BP, SVM, random forest (RF), multiple linear regression (MLR), naive Bayes (NB), k-nearest neighbors (KNN), and logistic regression (LR), to improve the prediction accuracy of soil liquefaction. GA was also applied to tune the hyperparameters and weights of all classifiers [7]. Cai et al. developed a least squares support vector machine (LSSVM) and a radial basis function neural network (RBFNN) optimized by differential evolution (DE), GA, and grey wolf optimization (GWO) to assess the liquefaction potential based on CPT test results [50]. The results showed that LSSVM-GWO and RBFNN-GWO outperformed the other classifiers.
A review of the computational intelligence models developed to predict EISL shows that some of them have not reached sufficient accuracy and have been under-fitted or over-fitted during training. Moreover, because of the complexity of the developed models, the final model or computer code for predicting EISL from CPT test results has generally not been provided. This lack of a clear presentation renders the methods unusable in practice for engineers. It is therefore necessary to develop a computational intelligence method that, in addition to being highly accurate, can be expressed and implemented in a simple and straightforward manner to predict EISL from CPT test results.
This study proposes a hybrid intelligent method based on a wavelet neural network (WNN) and particle swarm optimization (PSO) to predict the potential of EISL from CPT results. PSO is employed to train the WNN in order to increase its accuracy in predicting the liquefaction potential. The accuracy of the developed WNN-PSO model is compared with that of other computational intelligence methods (e.g., ANN and SVM). To the best of the authors’ knowledge, this is the first time that a WNN architecture has been used to predict EISL based on CPT test data. Based on the discussion above, the WNN-PSO model may estimate EISL more accurately than existing approaches. Additionally, a MATLAB function for simulating the optimized WNN model is provided, which enables geotechnical engineers and researchers to estimate EISL from CPT results.
3. Dataset Description
The dataset used in this study was obtained from Goh [47]. It includes 109 CPT datapoints with a broad range of parameters from five strong earthquakes between 1964 and 1983. The majority of these records come from locations that have flat terrain and consist of sand or silty sand deposits. The CPT data include 79 case records from China based on the Tangshan earthquake, 16 from Japan based on the Niigata and Nihonkaichubu earthquakes, nine from the USA based on the Imperial Valley earthquake, and five from Romania based on the Vrancea earthquake. Out of the 109 records, the occurrence of EISL is reported in 74 records and the non-occurrence of EISL in 35 records. The dataset is provided in Appendix A. Susceptibility to EISL depends on seismic and soil parameters. Based on previous research, seven evaluation indices were considered for the assessment of the EISL potential: total vertical stress ($\sigma_v$), mean grain size ($D_{50}$), effective vertical stress ($\sigma'_v$), cone resistance ($q_c$), cyclic stress ratio (CSR), normalized peak horizontal acceleration at ground surface ($\alpha_{\max}$), and earthquake magnitude ($M_w$). Table 2 indicates the seismic and soil properties of the dataset. Table 3 shows the statistical descriptions of the data, including the minimum, maximum, mean, median, and standard deviation (SD) of each input variable for the training set, the testing set, and the overall data. A frequency histogram of the input variables is depicted in Figure 5.
Before training, the data were normalized using min-max scaling, where $x$ is the original value of the input parameter, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of that input parameter.
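The normalization step can be sketched in Python as follows. This is a minimal illustration that assumes scaling to the range [0, 1]. The target range is an assumption, and the function name `min_max_normalize` and the sample values are hypothetical rather than taken from the study.

```python
import numpy as np

def min_max_normalize(X, x_min=None, x_max=None):
    """Scale each column of X to [0, 1] (assumed target range).

    If x_min / x_max are given (e.g., from the training set), they are
    reused so that other data are scaled consistently.
    """
    X = np.asarray(X, dtype=float)
    x_min = X.min(axis=0) if x_min is None else np.asarray(x_min, dtype=float)
    x_max = X.max(axis=0) if x_max is None else np.asarray(x_max, dtype=float)
    # Guard against constant columns to avoid division by zero.
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return (X - x_min) / span, x_min, x_max

# Made-up values for two input variables (illustration only).
X_train = [[1.0, 10.0], [3.0, 20.0], [5.0, 40.0]]
X_train_n, col_min, col_max = min_max_normalize(X_train)

# Reuse the training minima and maxima for a new record.
X_new_n, _, _ = min_max_normalize([[3.0, 25.0]], col_min, col_max)
```

Reusing the training-set minima and maxima for new records is a common design choice that keeps the scaling consistent between training and prediction; whether the study did this is not stated here.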
6. Sensitivity Analysis
After developing the soft computing model, the degree of importance of each input variable for the model output can be determined by means of sensitivity analysis. It is also essential to carry out such analyses on the proposed model to validate it and to test its robustness for unseen data. In this study, the neighborhood component analysis (NCA) [
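61] algorithm was utilized to determine the degree of importance of the input variables in the prediction of the EISL potential. Let $S = \{(x_i, y_i),\ i = 1, 2, \ldots, n\}$ be the training set with $n$ records for a two-class classification problem, where $x_i \in \mathbb{R}^p$ are the feature vectors and $y_i$ are the class labels. The objective of NCA is to train a classifier $f$ which takes a feature vector $x$ and predicts the proper label $y$ for $x$. NCA builds a randomized classifier which (1) randomly selects a reference point for $x$, denoted $\mathrm{Ref}(x)$, and (2) labels $x$ based on the label of $\mathrm{Ref}(x)$. In the NCA algorithm, the relevance of each input variable to the class label is expressed through a feature weight. The distance between two records is obtained by Equation (7):

$$d_w(x_i, x_j) = \sum_{r=1}^{p} w_r^2 \left| x_{ir} - x_{jr} \right| \tag{7}$$

where $w_r$ is the weight of the $r$-th input variable.
The reference point of $x$ is selected from the input vectors according to Equation (8):

$$P\left(\mathrm{Ref}(x) = x_j \mid S\right) \propto k\left(d_w(x, x_j)\right) \tag{8}$$

The probability of selecting $x_j$ as the reference point for $x_i$ is obtained by Equation (9):

$$p_{ij} = \frac{k\left(d_w(x_i, x_j)\right)}{\sum_{l \neq i} k\left(d_w(x_i, x_l)\right)}, \qquad p_{ii} = 0 \tag{9}$$

Herein, $k$ corresponds to the kernel function ($k(z) = \exp(-z/\sigma)$) and $\sigma$ is the width of the kernel function, whereas the probability of correctly classifying $x_i$ into its real class is calculated by Equation (10):

$$p_i = \sum_{j} y_{ij}\, p_{ij} \tag{10}$$

where $y_{ij} = 1$ if and only if $y_i = y_j$, otherwise $y_{ij} = 0$.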
The results of the sensitivity analysis using NCA are illustrated in Figure 8.
The results of the sensitivity analysis using the NCA method reveal that, among the seven input variables, the cone resistance ($q_c$) has the highest degree of importance and the earthquake magnitude ($M_w$) has the lowest degree of importance in predicting EISL. After $q_c$, the remaining input variables rank in the following order of importance for predicting EISL from CPT results: mean grain size ($D_{50}$), total vertical stress ($\sigma_v$), cyclic stress ratio (CSR), effective vertical stress ($\sigma'_v$), and normalized peak horizontal acceleration at ground surface ($\alpha_{\max}$).
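The quantities in Equations (7), (9), and (10) can be illustrated with a short NumPy sketch. It assumes the commonly used feature-weighted form of NCA: a weighted L1 distance with squared weights, an exponential kernel of width `sigma`, and a leave-one-out reference-point probability. The function name `nca_correct_probabilities`, the default `sigma=1.0`, and the toy data are hypothetical. The sketch only evaluates the probabilities for given weights; a full NCA run would additionally optimize the weights, which is not shown.

```python
import numpy as np

def nca_correct_probabilities(X, y, w, sigma=1.0):
    """Return (p, p_correct) for fixed feature weights w.

    p[i, j]      : probability that record j is the reference point of i
    p_correct[i] : probability that record i is assigned its true class
    Requires at least two records.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    w = np.asarray(w, dtype=float)

    # Weighted L1 distance between every pair of records (assumed Eq. 7).
    diff = np.abs(X[:, None, :] - X[None, :, :])   # shape (n, n, p)
    d = diff @ (w ** 2)                            # shape (n, n)

    # Exponential kernel; a record is never its own reference point.
    k = np.exp(-d / sigma)
    np.fill_diagonal(k, 0.0)

    # Normalize each row to obtain selection probabilities (assumed Eq. 9).
    p = k / k.sum(axis=1, keepdims=True)

    # Sum probabilities over reference points of the same class (Eq. 10).
    same_class = (y[:, None] == y[None, :]).astype(float)
    p_correct = (p * same_class).sum(axis=1)
    return p, p_correct

# Toy data: two well-separated classes (illustration only).
X_toy = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
y_toy = [0, 0, 1, 1]
p, p_correct = nca_correct_probabilities(X_toy, y_toy, w=[1.0, 1.0])
```

In a feature-selection setting, the weights that maximize the sum of `p_correct` (usually with a regularization term) are interpreted as importance scores, so a weight near zero marks an input variable with little influence on the classification.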
8. Conclusions
Because of the extensive damage and disasters caused by earthquakes, predicting seismic liquefaction is an important task in geotechnical engineering. In this paper, a hybrid WNN-PSO model was developed to predict the occurrence of liquefaction based on seven input variables: total vertical stress ($\sigma_v$), effective vertical stress ($\sigma'_v$), normalized peak horizontal acceleration at ground surface ($\alpha_{\max}$), cone resistance ($q_c$), mean grain size ($D_{50}$), cyclic stress ratio (CSR), and earthquake magnitude ($M_w$). In the developed model, the PSO algorithm was used to train the wavelet neural network and to find the optimal values of the weights, dilation parameters, and translation parameters. A reliable CPT dataset including 109 datapoints was employed to train and test the WNN. In order to find the optimal wavelet neural network topology for predicting seismic liquefaction, different numbers of neurons in the hidden layer as well as different wavelet functions in the hidden layer (e.g., Morlet, Gaussian, Mexican hat, GGW, Meyer, Shannon) were investigated. The optimal architecture of the WNN was determined as 7-2-1, that is, seven neurons in the input layer, two neurons in the hidden layer with the Morlet wavelet function, and one neuron in the output layer with a Heaviside transfer function. Several performance indicators for binary classification models, including accuracy, precision, sensitivity, and specificity, were used to evaluate the performance of the WNN-PSO model. These indices are equal to 98.68%, 100%, 99.08%, and 96% for the training set; 100%, 100%, 100%, and 100% for the testing set; and 99.09%, 100%, 98.67%, and 97.14% for the total data, respectively.
A comparison of the results obtained by the proposed WNN-PSO model with those obtained by other methods (e.g., ANN, SVM, and C4.5) showed that the proposed model has superior classification performance in capturing the nonlinear relationship between the seismic and soil parameters and the liquefaction potential. The results of the sensitivity analysis using the NCA method showed that $q_c$ has the highest degree of importance and $M_w$ has the lowest degree of importance in predicting EISL.
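For readers who want to see what a 7-2-1 network of the kind described above computes, the following Python sketch shows one possible forward pass. It is not the trained model or the MATLAB function from the study. The wavelon form (weighted sum, then translation and dilation), the Morlet expression $\cos(1.75z)\exp(-z^2/2)$, the convention that the Heaviside function returns 1 at zero, and all parameter values are assumptions made for illustration. In the study, the weights, dilations, and translations are found by PSO.

```python
import numpy as np

def morlet(z):
    """Morlet mother wavelet in a form common in WNN studies (assumed)."""
    return np.cos(1.75 * z) * np.exp(-0.5 * z ** 2)

def wnn_forward(x, W, b, a, v, c):
    """Forward pass of a hypothetical 7-2-1 wavelet neural network.

    x : (m, 7) normalized inputs      W : (7, 2) input-to-hidden weights
    b : (2,) translations             a : (2,) dilations (non-zero)
    v : (2,) hidden-to-output weights c : scalar output bias
    Returns 1 (liquefaction) or 0 (no liquefaction) per record.
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    z = (x @ W - b) / a            # translate, then dilate
    hidden = morlet(z)             # two Morlet wavelons
    score = hidden @ v + c         # single output neuron
    return (score >= 0.0).astype(int)   # Heaviside step, H(0) = 1 assumed

# Hypothetical parameters and inputs (random, for illustration only).
rng = np.random.default_rng(0)
W = rng.normal(size=(7, 2))
b = rng.normal(size=2)
a = np.array([1.0, 2.0])
v = rng.normal(size=2)
c = 0.1
x_norm = rng.uniform(size=(3, 7))
labels = wnn_forward(x_norm, W, b, a, v, c)
```

With only two hidden neurons, such a network has few parameters to report (here 14 weights, 2 translations, 2 dilations, 2 output weights, and 1 bias under the stated assumptions), which is consistent with the stated goal of a model that is simple to express and implement.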