A Simple and Sustainable Prediction Method of Liquefaction-Induced Settlement at Pohang Using an Artificial Neural Network

Conventionally, liquefaction-induced settlements have been predicted through numerical or analytical methods. In this study, a machine learning approach for predicting the liquefaction-induced settlement at Pohang was investigated. In particular, we examined the potential of an artificial neural network (ANN) algorithm to predict the earthquake-induced settlement at Pohang on the basis of standard penetration test (SPT) data. The performance of two ANN models for settlement prediction was studied and compared in terms of the R2 correlation. Model 1 (input parameters: unit weight, corrected SPT blow count, and cyclic stress ratio (CSR)) showed higher prediction accuracy than model 2 (input parameters: depth of the soil layer, corrected SPT blow count, and the CSR), and the difference in the R2 correlation between the models was about 0.12. Subsequently, an optimal ANN model was used to develop a simple predictive model equation, which was implemented using a matrix formulation. Finally, the liquefaction-induced settlement chart based on the predictive model equation was proposed, and the applicability of the chart was verified by comparing it with the interferometric synthetic aperture radar (InSAR) image.


Introduction
The Pohang earthquake (Mw = 5.4), which struck the Heunghae Basin near Pohang City on 15 November 2017, caused considerable damage, including liquefaction and lateral spreading. Since the event, several attempts have been made to study the post-earthquake damage [1][2][3][4][5]. However, little attention has been paid to the settlement resulting from the liquefaction. This study attempts to predict the liquefaction-induced settlement at Pohang by applying a machine learning algorithm to standard penetration test (SPT) data and proposes a liquefaction settlement chart based on the results. The SPT is a common ground investigation method, and structures are normally designed on the basis of ground investigation results obtained before construction. Consequently, many sites, including Pohang, already have abundant SPT data.
Assessing liquefaction-induced settlements is a major challenge in geotechnical earthquake engineering since a variety of phenomena such as re-sedimentation or reconsolidation (volumetric strain) of the liquefied soil, ground loss due to venting of liquefied soil (i.e., sand boils or ejecta), lateral spreading under zero volume change, soil-structure interaction ratcheting, and bearing capacity failure are associated with them [6]. For numerical analysis, earthquake-induced liquefaction in the free-field can be interpreted as a 1D phenomenon occurring along a vertical soil column in which earthquake-induced cyclic shear and compressive forces increase the pore pressure and thereby cause a reduction in the transient stiffness and strength of the soil. After liquefaction, reconsolidation occurs in the soil owing to the dissipation of the excess pore pressure (Δu) by means of water flow, and it results in the vertical settlement of the ground surface [7].
Tang et al. [8] classified the significant parameters controlling seismic soil liquefaction into seismic parameters, site conditions, and soil parameters. Out of 22 influence factors, they identified 12 as significant: the magnitude, epicentral distance, duration, fines content, particle size, grain composition, relative density, drainage condition, degree of consolidation, thickness of the sand layer, depth of the sand layer, and groundwater table. Over the years, researchers have considered some of these significant influence factors for predicting earthquake-induced liquefaction and its effects through machine learning techniques [9,10].
In this study, simple artificial neural network (ANN) models were adopted to predict liquefaction-induced settlement on the basis of an SPT database from the Korea Geotechnical Information DB system [11] and the Pohang earthquake. In the following sections, the research methodology and findings are presented.

Motivation and Study Objective
Liquefaction-induced settlement is often calculated by considering numerous parameters and following complex analytical and numerical procedures. However, obtaining all of these parameters in the field is often impracticable because some of the required data may not be available. Hence, there is a need for a simple alternative settlement prediction procedure that requires only a few parameters readily obtained from a field observation database. The objective of this study is to fill this gap by presenting a tool that uses previously obtained SPT data to predict the liquefaction-induced settlement that may occur in the field during an earthquake.

Methodology
The database used in this study was compiled from the Korea Geotechnical Information DB system [11] and from analyses with the UBCSAND constitutive effective stress model [12]. Through a 1D column analysis, the UBCSAND model estimates the shear-induced deformation from SPT data and earthquake information. SPT data were obtained for five different borehole sites near the epicenter of the earthquake at Pohang. The summary statistics of the data set are presented in Table 1, and the details of the database are given in Table A1. The data set comprised 100 data points (20 data points per borehole) along with the corresponding settlement values. The locations of the boreholes considered in the study are shown in Figure 1.

Data Division and Preprocessing
The settlement prediction process comprises training and testing. Seventy percent of the entire data set was used for training, and the remaining 30% was used for testing. The data were preprocessed before training the algorithm to ensure quick convergence and minimize the generalization error. This involved scaling the input variables to the range −1 to +1 by using Equation (1).

$$x_{\mathrm{scaled}} = a + \frac{(x - A)(b - a)}{B - A} \qquad (1)$$

where x is an unscaled value and x_scaled is the corresponding scaled value, A and B are the minimum and maximum values of the unscaled data set, respectively, and a and b are the minimum and maximum values of the scaled data set, respectively (here a = −1 and b = +1).
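The scaling and data division described above can be sketched as follows. This is a minimal illustration rather than the study's actual preprocessing code: the function names, the random shuffling before the 70/30 split, the choice to scale before splitting, and the synthetic feature values are all assumptions.

```python
import numpy as np

def scale_to_range(x, a=-1.0, b=1.0):
    """Linearly map each column of x from [A, B] to [a, b]."""
    x = np.asarray(x, dtype=float)
    A = x.min(axis=0)  # per-feature minimum of the unscaled data
    B = x.max(axis=0)  # per-feature maximum of the unscaled data
    # Note: a constant column (A == B) would cause a division by zero.
    return a + (x - A) * (b - a) / (B - A)

def split_train_test(x, y, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into training and test subsets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_train = int(round(train_fraction * len(x)))
    train, test = order[:n_train], order[n_train:]
    return x[train], x[test], y[train], y[test]

# Hypothetical data: 100 rows, 3 features (illustrative values only,
# not the Pohang database).
rng = np.random.default_rng(1)
features = rng.uniform([16.0, 2.0, 0.05], [20.0, 40.0, 0.40], size=(100, 3))
targets = rng.uniform(0.0, 5.0, size=100)

scaled = scale_to_range(features)
x_train, x_test, y_train, y_test = split_train_test(scaled, targets)
print(scaled.min(axis=0), scaled.max(axis=0))
print(x_train.shape, x_test.shape)
```

With 100 data points, this split yields 70 training rows and 30 test rows.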

Basic Concept of ANN
Artificial neural networks (ANNs) are complex mathematical models inspired by biological neurons, and they emulate biological neural networks. They are widely used for nonlinear system modeling and system identification [13]. A typical ANN consists of an input layer, one or more hidden layers, and an output layer. The numbers of layers and neurons in each layer depend on the complexity of the problem under consideration.

Mathematical Representation of ANN Architecture
A neural network in its simplest form can be used to model the relationship between data points x and the corresponding real-valued targets y. Mathematically, if the inputs (x) comprise n features, weights (w) and a bias (b) can be chosen such that the prediction (y') is given by Equation (2).

$$y' = w_1 x_1 + w_2 x_2 + \dots + w_n x_n + b \qquad (2)$$

For easy computation, all the features can be collected into a vector x and all the weights into a vector w, so that the model is expressed compactly in dot product notation, as in Equation (3).

$$y' = \mathbf{w}^{\top}\mathbf{x} + b \qquad (3)$$
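As a small numerical check of this compact notation, the sketch below evaluates the prediction both as an explicit weighted sum and as a dot product. The function name and the numerical values of x, w, and b are hypothetical and chosen only for illustration.

```python
import numpy as np

def predict_linear(x, w, b):
    """Return y' = w . x + b for a single sample x (dot product form)."""
    return float(np.dot(w, x) + b)

# Hypothetical feature vector, weights, and bias
x = np.array([0.5, -0.2, 0.1])
w = np.array([1.0, 2.0, -3.0])
b = 0.25

# Explicit weighted sum over the n features
explicit_sum = float(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)

print(predict_linear(x, w, b), explicit_sum)
```

Both forms give the same value up to floating-point rounding; the vector form is simply easier to extend to many samples and many neurons.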
ANNs can learn by example (supervised learning). In an ANN, a set of input variables is multiplied by adjustable connection weights to produce the output. When input data are fed to the network, the weights are adjusted through a feed-forward back-propagation technique, in which the prediction error is propagated backward through the network, to determine the rules governing the relationship between the variables concerned. Figure 2 shows a graphical depiction of a typical feedforward ANN architecture. Two ANN models were considered in this study, and they are shown in Figure 3. Both models had three input variables. The input variables of model 1 were unit weight (γ), corrected SPT blow count (N1(60)), and cyclic stress ratio (CSR), while those of model 2 were depth of the soil layer (d), N1(60), and CSR.
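To make the feed-forward back-propagation idea concrete, the following sketch trains a network with three inputs and one hidden layer by full-batch gradient descent. It is not the configuration used in this study: the number of hidden nodes, the logistic sigmoid activation, the linear output node, the mean squared error loss, the learning rate, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_one_hidden_layer(x, y, n_hidden=5, learning_rate=0.1,
                           epochs=500, seed=0):
    """Train a 1-hidden-layer network with back-propagation (MSE loss)."""
    rng = np.random.default_rng(seed)
    n_samples, n_inputs = x.shape
    w1 = rng.normal(0.0, 0.5, size=(n_inputs, n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(0.0, 0.5, size=n_hidden)
    b2 = 0.0
    losses = []
    for _ in range(epochs):
        # Forward pass
        hidden = sigmoid(x @ w1 + b1)      # shape (n_samples, n_hidden)
        pred = hidden @ w2 + b2            # shape (n_samples,)
        error = pred - y
        losses.append(float(np.mean(error ** 2)))
        # Backward pass: gradients of the MSE loss
        grad_pred = 2.0 * error / n_samples
        grad_w2 = hidden.T @ grad_pred
        grad_b2 = grad_pred.sum()
        grad_hidden = np.outer(grad_pred, w2)
        grad_z = grad_hidden * hidden * (1.0 - hidden)  # sigmoid derivative
        grad_w1 = x.T @ grad_z
        grad_b1 = grad_z.sum(axis=0)
        # Gradient descent update of weights and biases
        w1 -= learning_rate * grad_w1
        b1 -= learning_rate * grad_b1
        w2 -= learning_rate * grad_w2
        b2 -= learning_rate * grad_b2
    return (w1, b1, w2, b2), losses

# Synthetic inputs already scaled to [-1, 1] and a made-up target
rng = np.random.default_rng(42)
x_demo = rng.uniform(-1.0, 1.0, size=(100, 3))
y_demo = x_demo[:, 0] - 0.5 * x_demo[:, 1] * x_demo[:, 2]

params, losses = train_one_hidden_layer(x_demo, y_demo)
print(losses[0], losses[-1])
```

Each pass through the loop corresponds to one epoch: the whole data set is run forward once, the error is propagated backward, and the weights are updated.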
The choice of input parameters was based on domain knowledge. They were chosen by considering how the seismic and soil properties influence liquefaction-induced settlement. The soil properties considered were γ, N1(60), and d, while the CSR represented the seismic property. The CSR quantifies the demand imposed on the critical soil layer as a result of the seismic ground motion.
After the models were trained, the root mean square error (RMSE) and loss were plotted to check the models' performance for the training and test data sets, as shown in Figures 4 and 5. The x-axis represents the number of epochs (i.e., the number of times the model ran through the entire training/test data set and updated the weights).
A comparison of models 1 and 2 in terms of the prediction accuracy shows that the prediction accuracy of the former is higher. The difference in the R2 correlation between the two models is about 0.12.
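The two performance measures mentioned above can be computed as in the sketch below. It assumes that the R2 correlation refers to the coefficient of determination; the source does not state the exact definition, and the squared Pearson correlation is another possibility. The sample values are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical settlements in mm (not values from the study)
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print(rmse(observed, predicted), r2_score(observed, predicted))
```

For these illustrative values, the residual sum of squares is 0.10 and the total sum of squares is 5.0, so the coefficient of determination is 0.98.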

Results and Discussion
From the results shown in Figures 5 and 6, it can be concluded that there exists a strong correlation between the model predictions and the actual settlement in both cases considered.
ANN models with two or more hidden layers were also considered, and the difference in accuracy between these models and the model with a single hidden layer was found to be insignificant. Therefore, an ANN model with a single hidden layer was adopted.

ANN-Based Numerical Equation
A simple equation was developed to predict the liquefaction-induced settlement. The optimal ANN model structure used for the purpose is shown in Figure 8, and its associated weights and biases are presented in Table 3.
In the resulting equation, T12 is the output variable, namely, the predicted settlement value (S), Bk is the bias value at the output layer, Wkj is the connection weight between the jth node in the hidden layer and the kth node in the output layer, Bj is the bias value of the jth hidden node, Wji is the connection weight between the ith input node and the jth hidden node, Xi is the ith input variable, and fsig is the sigmoid transfer function given by Equation (5).
To simplify the calculation, the weights and biases were arranged in matrix form. The actual value of the settlement was 1 mm, and the value predicted using the ANN model was 1.010 mm.
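The matrix formulation can be illustrated with the sketch below, which evaluates a network with one hidden layer from its weights and biases. Several details are assumptions, not taken from the study: fsig is taken to be the logistic sigmoid, the output node is taken to be linear, and the weight values are placeholders rather than the entries of Table 3. The variable names mirror the symbols Wji, Bj, Wkj, and Bk defined above.

```python
import numpy as np

def f_sig(z):
    """Logistic sigmoid (assumed form of the transfer function)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_settlement(x, w_ji, b_j, w_kj, b_k):
    """Evaluate the network for one scaled input vector x.

    w_ji: (n_hidden, n_inputs) input-to-hidden weights
    b_j:  (n_hidden,) hidden-node biases
    w_kj: (n_hidden,) hidden-to-output weights
    b_k:  scalar output bias
    """
    hidden = f_sig(w_ji @ x + b_j)       # hidden-layer activations
    return float(w_kj @ hidden + b_k)    # linear output node (assumed)

# Placeholder weights for a 3-input, 2-hidden-node network
w_ji = np.zeros((2, 3))
b_j = np.zeros(2)
w_kj = np.array([1.0, 1.0])
b_k = 0.5

x = np.array([0.2, -0.4, 0.9])  # hypothetical scaled inputs
print(predict_settlement(x, w_ji, b_j, w_kj, b_k))
```

With all hidden weights and biases set to zero, every hidden activation equals 0.5, so the output is 0.5 + 0.5 + 0.5 = 1.5. This makes the arithmetic easy to verify by hand before substituting trained weights. If the inputs were scaled before training, the same scaling has to be applied to x, and any scaling of the target has to be inverted on the output.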

Sensitivity Analysis
Sensitivity analysis was performed to determine the effect of the input parameters on the settlement prediction. The measure of variable importance was obtained using the permutation importance approach, originally described for random forests by Breiman [14]. This approach measures the drop in the ANN model performance when the values of one feature are randomly shuffled, so that the feature is effectively unavailable to the model.
As shown in Figures 9 and 10, the unit weight had the strongest influence on the settlement prediction in the case of ANN model 1, while the depth of the soil layer had the strongest influence on the predicted settlement in the case of model 2. In both cases, N1(60) had a stronger influence than the CSR.
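A minimal sketch of permutation importance is given below. It is a generic illustration, not the study's code: the function name, the use of mean squared error as the performance measure, the number of repeats, and the toy predictor standing in for a trained ANN are all assumptions.

```python
import numpy as np

def permutation_importance(predict, x, y, n_repeats=10, seed=0):
    """Mean increase in MSE when each feature column is shuffled."""
    rng = np.random.default_rng(seed)

    def mse(y_true, y_pred):
        return float(np.mean((y_true - y_pred) ** 2))

    baseline = mse(y, predict(x))
    importances = np.zeros(x.shape[1])
    for j in range(x.shape[1]):
        increases = []
        for _ in range(n_repeats):
            x_perm = x.copy()
            # Shuffle one column to break its link with the target
            x_perm[:, j] = rng.permutation(x_perm[:, j])
            increases.append(mse(y, predict(x_perm)) - baseline)
        importances[j] = np.mean(increases)
    return importances

# Toy predictor standing in for a trained model: the third feature is unused
def toy_predict(x):
    return 3.0 * x[:, 0] + 0.5 * x[:, 1]

rng = np.random.default_rng(7)
x_demo = rng.normal(size=(200, 3))
y_demo = toy_predict(x_demo)

importance = permutation_importance(toy_predict, x_demo, y_demo)
print(importance)
```

In this toy case, the first feature receives the largest importance, the second a smaller one, and the unused third feature an importance of exactly zero, which mirrors how the ranking of inputs is read in a sensitivity analysis.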

Parametric Study and Extrapolation beyond the Training Data
A parametric study was conducted to verify the validity and robustness of the optimal ANN model, and it involved generating a synthetic data set within the range of the training data set to test the model. For a given unit weight of soil, the settlement was determined based on the unit thickness of each layer. As shown in Figure 11a, the predicted settlement generally increased with increasing CSR and decreased with increasing N1(60). However, some field data fall outside the range of N1(60) and CSR covered by the parametric study, so this range needed to be extended. Therefore, a simple settlement chart based on the parametric study is proposed, as shown in Figure 11b.

Application of Settlement Chart Based on the ANN Method
The proposed settlement chart from the optimal ANN model was assessed using the SPT data obtained from three additional boreholes at the Pohang site. The locations of the boreholes and the measured settlement obtained from interferometric synthetic aperture radar (InSAR) imaging are shown in Figure 12.
The InSAR procedure was recommended by the Remote Sensing Lab at Kangwon National University, Korea [15]. Following this procedure, the settlement was analyzed using satellite images of Pohang acquired between 4 and 16 November 2017 from Google Earth. These images were used to generate the settlement map in Figure 12 with the Sentinel Application Platform (SNAP), a freely distributed program from the European Space Agency [16].
Assuming an average unit weight of 18 kN/m3, the N1(60) values were converted from the SPT blow counts (NSPT) of the boreholes [17]. The CSR can be calculated from Equation (6).

$$\mathrm{CSR} = 0.65\,\frac{a_{max}}{g}\,\frac{\sigma_{vo}}{\sigma'_{vo}}\,\gamma_d \qquad (6)$$

where amax = peak acceleration at the ground surface from the earthquake (this study used the Pohang earthquake, 0.2712 g); g = acceleration of gravity; σvo and σ'vo are the total and effective vertical overburden stresses, respectively; and γd = stress reduction coefficient.
The total settlements calculated for additional boreholes 1, 2, and 3 using the optimal ANN model are 17.14, 19.77, and 13.88 mm, respectively, as shown in Table 4. These settlement values are close to those measured by InSAR imaging. Unlike the numerical analysis approach, the proposed (N1)60-CSR-Settlement chart from the optimal ANN model was able to estimate settlement values with minimal input parameters. For an earthquake with similar impact and magnitude, this simple ANN model can be deployed as a handy tool to obtain liquefaction-induced settlement in the field.
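The CSR calculation for a single soil layer can be sketched as follows, using the simplified Seed-Idriss form with the 0.65 factor, which matches the variables defined above. The peak acceleration (0.2712 g) and the average unit weight (18 kN/m3) come from the text. The groundwater table depth, the unit weight of water, and the depth-dependent expression for the stress reduction coefficient (the Liao and Whitman approximation) are assumptions, since the source does not state which values or expression were used.

```python
def stress_reduction_coefficient(depth_m):
    """Liao and Whitman approximation (assumed; not stated in the source)."""
    if depth_m <= 9.15:
        return 1.0 - 0.00765 * depth_m
    if depth_m <= 23.0:
        return 1.174 - 0.0267 * depth_m
    raise ValueError("approximation is only defined here up to 23 m depth")

def cyclic_stress_ratio(depth_m, a_max_g=0.2712, unit_weight=18.0,
                        water_table_m=2.0, unit_weight_water=9.81):
    """CSR at a given depth for a uniform soil column.

    depth_m:        depth of the layer below ground surface (m), > 0
    a_max_g:        peak ground acceleration as a fraction of g
    unit_weight:    total unit weight of the soil (kN/m3)
    water_table_m:  hypothetical groundwater table depth (m)
    """
    if depth_m <= 0.0:
        raise ValueError("depth must be positive")
    sigma_vo = unit_weight * depth_m                        # total stress (kPa)
    pore_pressure = unit_weight_water * max(depth_m - water_table_m, 0.0)
    sigma_vo_eff = sigma_vo - pore_pressure                 # effective stress
    gamma_d = stress_reduction_coefficient(depth_m)
    return 0.65 * a_max_g * (sigma_vo / sigma_vo_eff) * gamma_d

for depth in (1.0, 5.0, 10.0):
    print(depth, cyclic_stress_ratio(depth))
```

Above the assumed water table, the total and effective stresses coincide, so the CSR reduces to 0.65 × 0.2712 × γd. Below it, the ratio of total to effective stress raises the CSR.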

Conclusions
In this study, the potential of an ANN to predict the liquefaction-induced settlement at Pohang was examined. Two ANN models were trained using a back-propagation algorithm. Both models had three input variables. The input variables of model 1 were unit weight, corrected SPT blow count (N1(60)), and CSR, while those of model 2 were depth of the soil layer, N1(60), and CSR. The output of the models was the settlement (S). After the training and testing of the models, it was evident that model 1 had higher prediction accuracy, and the difference in the R2 correlation between the two models was about 0.12. Subsequently, the weights and biases of an optimal ANN model were used to develop a simple predictive model equation, which was implemented using a matrix formulation.
Sensitivity analysis performed using the permutation importance algorithm indicated that the corrected SPT blow count had a stronger influence than the CSR on the predicted settlement. Furthermore, a parametric study showed that for a given unit weight of soil, the settlement decreased with an increase in N1(60).
Finally, a simplified (N1)60-CSR-Settlement relationship was proposed using the optimal ANN model. The cumulative settlement was predicted by applying the proposed relationship to additional boreholes and was compared with the InSAR results. The cumulative settlement was in a range similar to that of the InSAR displacement map. Thus, the simplified relationship of this study can be deployed as a handy tool to obtain liquefaction-induced settlement in the field.

Conflicts of Interest:
The authors declare no conflicts of interest.