Automatic Identification of Liquefaction Induced by 2021 Maduo M w 7.3 Earthquake Based on Machine Learning Methods

: Rapid extraction of liquefaction induced by strong earthquakes is helpful for earthquake intensity assessment and earthquake emergency response. Supervised classification methods are po‑ tentially more accurate and do not need pre‑earthquake images. However, the current supervised classification methods depend on the precisely delineated polygons of liquefaction by manual and landcover maps. To overcome these shortcomings, this study proposed two binary classification methods (i.e., random forest and gradient boosting decision tree) based on typical samples. The pro‑ posed methods trained the two machine learning methods with different numbers of typical samples, then used the trained binary classification methods to extract the spatial distribution of liquefaction. Finally, a morphological transformation method was used for the postprocessing of the extracted liquefaction. The recognition accuracies of liquefaction were estimated by four evaluation indices, which all showed a score of about 90%. The spatial distribution of liquefaction pits is also consis‑ tent with the formation principle of liquefaction. This study demonstrates that the proposed binary classification methods based on machine learning could efficiently and quickly provide the spatial distribution of liquefaction based on post‑earthquake emergency satellite images.


Introduction
Coseismic liquefaction refers to the change of sand from a solid state to a liquid state caused by rising pore water pressure and decreasing effective stress when an earthquake occurs [1][2][3]. As relevant to coseismic surface rupture, rapid acquisition of the spatial distribution of liquefaction is helpful for earthquake intensity assessment and earthquake emergency response [4,5]. In addition, post-earthquake investigation of liquefaction is a fundamental way of developing knowledge on seismic engineering and provides the basis for seismic-resistant design theories [6,7]. A complete liquefaction investigation could also help enrich the globe case base of liquefaction induced by the earthquake.
Currently, the identification of liquefaction usually depends on the field investigation, which can be time-consuming and labor-intensive. More importantly, due to lack of accessibility, this method could not obtain the complete spatial distribution of liquefaction, especially in highland areas. Thus, developing automatic extraction methods based on satellite or aerial images could be helpful in overcoming the shortcomings of field investigation.
The existing automatic identification of liquefaction can be summarized into two groups. The first group is the unsupervised classification strategy, which does not need liquefaction samples. This strategy detects liquefaction by comparing pre-and postearthquake images [2,[8][9][10]. For example, Ishitsuka et al. [11] identified the extent of liquefaction induced by the 2011 Tohoku earthquake by analyzing the surface changes with preand post-synthetic aperture radar (SAR) images. Baik et al. [12] detected liquefaction by

Data Sources 2.2.1. Satellite Images and Derived Covariates
In this study, Gaofen-7 (GF-7) images were collected for identifying the coseismic liquefaction. GF-7 is a stereo mapping satellite launched by China in 2019. It can effectively acquire panchromatic stereo images and multi-spectral images. The backward and forward panchromatic images have a width of 20 km and the resolution is greater than 0.8 m. Multi-spectral imaging features four spectral bands (blue, green, red and near infrared) with a resolution of 3.2 m (Table 1). In this study, a stereo pair of GF-7 images with a few clouds was captured on 10 June 2021, 19 days after the earthquake (Figure 1b). Besides the GF-7 image, another two kinds of image were used for validation in this study. Firstly, a Gaofen-1D image (GF-1D) that was captured on 29 April 2021, was used as the pre-earthquake image (Table 1). Meanwhile, the collected GF-1D image covered the whole study area.
The second were the UAV images captured on 25 July 2021, with a resolution of 0.1 m. The aim of the UAV images was to capture the surface rupture zone of the Maduo earthquake, and the images were mainly located in the north of the study area.
Before using the satellite images, two main methods of processing the GF-7 and GF-1D were applied. The first was pan-sharpening, in which the backward panchromatic image and multi-spectral image were fused into a new image with a 0.8 m resolution. Therefore, multi-spectral images could benefit from the high spatial resolution of panchromatic stereo images. The second process was to generate a 0.8 m resolution of the digital surface model (DSM) by using the forward-and-backward panchromatic images.
Based on the processed GF-7 data, three types of covariates were derived for identifying liquefaction:

1.
Terrain parameters: We used elevation, slope, all derived at 0.8 m resolution. Since liquefaction often occurs in flat and swampy areas, terrain parameters are mainly used to characterize topographic features.
NDWI is an important spectral index for measuring water in vegetation and detecting open water [21]. It has also been used for identifying liquefaction by Sengar et al. [22] and baik et al. [12]. Although NDVI is one of the important parameters for reflecting crop growth and nutritional information, it can also reflect the background influence of plant canopy, such as soil, wet ground, snow, dead leaves and roughness [23]. Thus, in addition to vegetation classification, NDVI is also widely used in the classification of other land features, such liquefaction [10,22].
MSAVI2 is also a kind of vegetation index, which was developed to minimize soil influence using a self-adjustment factor [24].
Salinity combines the R and NIR bands to detect the surface reflectance of salt-affected land [25].
GSI uses three types of visible spectral band to reflect the content of silt-clay and finesand of topsoil [26].

Liquefaction Samples
A total of 900 samples were collected in this study area, including 300 liquefaction samples and 600 non-liquefaction samples ( Figure 2). All the liquefaction samples were located inside the obvious liquefaction pits, not close to the edge as far as possible. Thus, the samples derived could be more representative. The 600 non-liquefaction samples were located in different landscapes including grassland, river/laker, floodplain, road, exposed rock, and so on.

Methods
In this study, two ensemble machine learning methods (i.e., RF, GBDT) were used. The principle of ensemble methods is to combine the results of a certain number of estimators to improve robustness. According to the synthesis strategies for the results of different estimators, the ensemble methods could be further distinguished into two classes: averaging methods and boosting methods.
As a typical averaging method, random forest uses multiple decision trees for predicting the classes independently, then uses a majority vote to predict the final result [27]. For a binary classification of liquefaction in this study, each decision tree in the for- The spatial distribution of samples and some location details of the nonliquefaction/liquefaction samples.

Methods
In this study, two ensemble machine learning methods (i.e., RF, GBDT) were used. The principle of ensemble methods is to combine the results of a certain number of estimators to improve robustness. According to the synthesis strategies for the results of different estimators, the ensemble methods could be further distinguished into two classes: averaging methods and boosting methods.
As a typical averaging method, random forest uses multiple decision trees for predicting the classes independently, then uses a majority vote to predict the final result [27]. For a binary classification of liquefaction in this study, each decision tree in the forest would individually determine whether a pixel was liquefaction or not, then RF chose the class with the most votes as the final class. The most important parameters that influence the performance of RF is the number of decision trees (Ntree) and the number of covariates selected for best split (Mtry).
By contrast, the basic idea of a boosting method is to sort the decision trees sequentially; each decision tree will give higher weight to the samples misclassified by previous decision trees during the training process. The final result is obtained by weighting the results of each decision tree. GBDT is a famous boosting method, which was developed by Friedman in 2001 [28]. The most import parameters influencing the performance of GBDT also include Ntree and Mtry. In this study, both parameters were set at default.
The main differences between RF and GBDT can be summarized into two aspects. The first is the sampling strategy; each tree in RF is built by a subset of samples drawn randomly with replacement from the complete sample set. It means the training set of each tree in RF is independent. However, trees in GBDT select subsets of samples from the complete set of samples randomly without replacement. The second different aspect is the classification result. RFs determine the class of each pixel by the majority-vote of all the trees in the RF, while GBDTs integrate the result of each tree using a weighted average method.
Morphological transformations are binary image operations based on the image shape [29]. The transformations usually include two inputs, the original image and the kernel that decides the nature of operation. In this study, the closing operation was selected to fill the gap of the extracted liquefaction.
The complete workflow of extracting liquefaction pits is shown in Figure 3.

Cross-Validation and the Evaluation Indices
To evaluate the performance of the proposed methods and the influences of the number of samples, the samples were randomly divided into 20 groups, then choosing 1, 2, 3, …, 19 group/groups as training samples in turn, the rest were used as validation samples. The above processes were repeated 20 times.

Cross-Validation and the Evaluation Indices
To evaluate the performance of the proposed methods and the influences of the number of samples, the samples were randomly divided into 20 groups, then choosing 1, 2, 3, . . . , 19 group/groups as training samples in turn, the rest were used as validation samples. The above processes were repeated 20 times.
For each validation, four quantitative evaluation indices (i.e., Recall, Precision, F1, Kappa) were used to evaluate the performance of the proposed method.
True positives (TP) represent the count of validation samples that were correctly classified into liquefaction by the proposed method. The false positives (FP) are the number of liquefaction samples that were misclassified into non-liquefaction. False negatives (FN) represent the count of non-liquefaction samples that were misclassified into liquefaction. True negatives (TN) are the number of non-liquefaction samples that were correctly classified into non-liquefaction. N represents the number of validation samples.

The Performance of the Two Proposed Methods
The relationship between the evaluation indices of the two classification methods and the number of samples is illustrated in Figure 4. The value of the Kappa index for the RF and GBDT ranged from 0.9 to 0.98 and from 0.87 to 0.97, respectively. The overall accuracies for RF and GBDT ranged from 0.954 to 0.99 and from 0.94 to 0.99, respectively. As for the Recall score, RF ranged from 0.956 to 0.98, a slight rise of 2.5%. However, GBDT had a score lower than 0.9 with the least number of training samples. Like the Kappa indices and overall accuracies, RF and GBDT had similar scores and the same trend in Precision scores. To sum up, RF and GBDT had similar scores in all four indices with the same training samples.
In addition, the number of samples could have had an influence on the final performance of the classification methods ( Figure 4). The four evaluation indices of both classification methods increased when more samples were used as training samples. Compared with RF, GBDT could be more sensitive to the number of training samples when there were a limited number of training samples. Taking the Kappa index as an example, when the number of training samples increased from 45 to 90, the scores increased from 0.87 to 0.92, a rise of 4.6%. However, when the samples reached a certain number, the relationship between the performance of the two proposed methods and the number of training samples was not obvious.

The Final Prediction of Liquefaction by the Two Proposed Methods
Figures 5 and 6 represent the final distribution of liquefaction that were predicted by the RF and GBDT when there were 270 training samples. Clearly, most of the liquefaction induced by the earthquake located in the southern river valley, Heihe River. Almost no liquefaction was distributed in the northern mountains. The spatial distribution was consistent with the formation principle of liquefaction. The soil in the valleys tends to have a lot of moisture, while the mountain region is often dry and short of groundwater. When an earthquake occurred, the increase of pore water pressure and decrease of effective stress had a greater possibility of occurring in the valley. Therefore, liquefaction pits were usually distributed in the river valley region.

The Final Prediction of Liquefaction by the Two Proposed Methods
Figures 5 and 6 represent the final distribution of liquefaction that were predicted by the RF and GBDT when there were 270 training samples. Clearly, most of the liquefaction induced by the earthquake located in the southern river valley, Heihe River. Almost no liquefaction was distributed in the northern mountains. The spatial distribution was consistent with the formation principle of liquefaction. The soil in the valleys tends to have a lot of moisture, while the mountain region is often dry and short of groundwater. When an earthquake occurred, the increase of pore water pressure and decrease of effective stress had a greater possibility of occurring in the valley. Therefore, liquefaction pits were usually distributed in the river valley region.    6) identified by the RF and GBDT. For region I, three types of images (i.e., GF-1D, GF-7 and UAV) are available. Compared to the pre-earthquake images (GF-1D), the post-earthquake images (GF-7) show clear surface change, and RF and GBDT identified those places as liquefaction-induced by the earthquake. In the UAV, the morphological characteristics of liquefaction were very obvious, while the size of the liquefaction in the UAV image was smaller than the liquefaction identified with GF-7 images by RF and GBDT. One possible reason is that GF-7 images have lower resolution than UAV images, and the discrete liquefaction could be grouped together in satellite images with low resolution.
Similar to the images in region I, the images pre-and post-earthquake in region II and region III also showed differences and were extracted by RF and GBDT as liquefactioninduced (Figures 8 and 9). However, the shape and size of liquefaction predicted by RF and GBDT were more similar in region III. This is because the boundaries of liquefaction were clearly distinguished from the surrounding ground features in region III. In that satiation, RF and GBDT could effectively identify the liquefaction based on a small number of liquefaction samples. However, when the boundary was relatively fuzzy, neither of the two proposed methods could completely identify the liquefaction morphology of the liquefaction (Figures 7 and 8). The incomplete identification of liquefaction could lead to the difference in area and quantity of liquefaction. Taking the liquefaction located in Figure 8 as an example, complete liquefaction could be divided into a number of small parts and affect the number and area of identified liquefaction.   Figures 5 and 6) identified by the RF and GBDT. For region Ⅰ, three types of images (i.e., GF-1D, GF-7 and UAV) are available. Compared to the pre-earthquake images (GF-1D), the postearthquake images (GF-7) show clear surface change, and RF and GBDT identified those places as liquefaction-induced by the earthquake. In the UAV, the morphological characteristics of liquefaction were very obvious, while the size of the liquefaction in the UAV image was smaller than the liquefaction identified with GF-7 images by RF and GBDT. One possible reason is that GF-7 images have lower resolution than UAV images, and the discrete liquefaction could be grouped together in satellite images with low resolution.
Similar to the images in region Ⅰ, the images pre-and post-earthquake in region Ⅱ and region Ⅲ also showed differences and were extracted by RF and GBDT as liquefaction-induced (Figures 8 and 9). However, the shape and size of liquefaction predicted by RF and GBDT were more similar in region Ⅲ. This is because the boundaries of liquefaction were clearly distinguished from the surrounding ground features in region Ⅲ. In that satiation, RF and GBDT could effectively identify the liquefaction based on a small number of liquefaction samples. However, when the boundary was relatively fuzzy, neither of the two proposed methods could completely identify the liquefaction morphology of the liquefaction (Figures 7 and 8). The incomplete identification of liquefaction could lead to the difference in area and quantity of liquefaction. Taking the liquefaction In region Ⅳ, compared to the pre-earthquake image, the post-earthquake image did not show obvious change. However, RF and GBDT both identified some beaches with bare sand as liquefaction-induced. In general, the provenance of the beach around the rivers is close to that of the liquefaction, which is composed of sand. It turned out to be difficult to distinguish the liquefaction from the beach with bare sand.         In region IV, compared to the pre-earthquake image, the post-earthquake image did not show obvious change. However, RF and GBDT both identified some beaches with bare sand as liquefaction-induced. In general, the provenance of the beach around the rivers is close to that of the liquefaction, which is composed of sand. It turned out to be difficult to distinguish the liquefaction from the beach with bare sand.

The Area Statistics of the Identified Liquefaction Pits by the Two Proposed Methods
As data-driven methods, RF and GBDT usually predict different spatial distributions of liquefaction with different training samples. Figures 11 and 12 represent the area statistics of the extracted liquefaction with different training samples by RF and GBDT. Clearly, even if the same number of training samples was used, the final area of liquefaction was different, especially when there were a smaller set of training samples. This is because data-driven methods depend heavily on the number and representativeness of training samples [14,19]. Due to the randomly selected training samples in this study, the number and representativeness could be different for each set, especially the training samples of other landcovers that were close to the spectrum of liquefaction. Rashidian et al. [15] designed training liquefaction samples by considering the local classes of landcovers, and keeping the number of training samples representing different landcovers inconsistent. Compared to the multi-class classification method proposed by Rashidian, binary classification methods demand fewer training samples [30]. Thus, in binary classification, the selected training samples should focus on the liquefaction and landcovers that have similar spectral characteristics with liquefaction.
samples of other landcovers that were close to the spectrum of liquefaction. Rashidian et al. [15] designed training liquefaction samples by considering the local classes of landcovers, and keeping the number of training samples representing different landcovers inconsistent. Compared to the multi-class classification method proposed by Rashidian, binary classification methods demand fewer training samples [30]. Thus, in binary classification, the selected training samples should focus on the liquefaction and landcovers that have similar spectral characteristics with liquefaction. Figure 11. The relationship between the total area of extracted liquefaction by RF and the number of training samples. Figure 11. The relationship between the total area of extracted liquefaction by RF and the number of training samples.

The Importance of Covariates Ranked by RF and GBDT
Successfully selecting and ranking covariates in prediction is one of the greatest abilities of ensemble classifiers such as RF and GBDT, which were used in this study [27,31]. The importance of covariates scored by RF and GBDT were illustrated in Figure  13 and Figure 14, respectively. When different training samples were selected, but the number of samples was the same, the importance of covariates was different (Figures  13b-d and 14b-d). Compared to GBDT, the RF had a relatively small change in the rank of importance of covariates. When the number of training samples used in RF and GBDT

The Importance of Covariates Ranked by RF and GBDT
Successfully selecting and ranking covariates in prediction is one of the greatest abilities of ensemble classifiers such as RF and GBDT, which were used in this study [27,31].
The importance of covariates scored by RF and GBDT were illustrated in Figures 13 and 14, respectively. When different training samples were selected, but the number of samples was the same, the importance of covariates was different (Figures 13b-d and 14b-d). Compared to GBDT, the RF had a relatively small change in the rank of importance of covariates. When the number of training samples used in RF and GBDT increased, the rank of importance of covariates did not show an obvious difference. Meanwhile, the six most important covariates ranked by RF and GBDT were B1, B2, B3, B4, DEM and GSI (Figure 15). The other covariates did not play a significant role in liquefaction classification. The result indicates the consistency of covariate rankings between RF and GBDT. In addition, this also reflects that the selected samples in this study are representative.

The Spatial Distribution of the Liquefaction Pits
According to the previous report, the liquefaction pits were observed during several earthquakes, such as the 1964 M 9.2 Alaska [32], 1999 7.5 Chi-Chi earthquakes [33], 2008 M 8 Wenchuan earthquake [6,7], 2011 M6.2 Christchurch earthquake [15]. Due to the lack of high-resolution images, field investigation in key areas is often an important way of obtaining liquefaction pits. The number of liquefaction pits collected ranges from 75 in the Chi-Chi earthquake [34] to 216 in the Wenchuan earthquake [6]. The liquefaction pits usually located around the surface rupture with a distance 20-40 km. The satellite images could cover more areas, and the automatic extraction method took all the potential liquefaction pits as the real liquefaction pits; it could be a possible reason why more liquefaction pits were extracted with a smaller Maduo earthquake. Liquefaction pits extracted by the proposed method were within 10 km of the surface rupture. Meanwhile, most of the liquefaction pits located near the river, which is consistent with previous studies [6].

The Potential Application for Evaluating Seismic Hazard
Identifying liquefaction pits based on high-resolution satellite images could be an unexpensive way, especially where the liquefaction was induced on plateaus or uninhabited areas that are difficult to reach. Therefore, exploring the methods of identifying liquefaction based on high-resolution satellite emergency images has vital value for earthquake emergency response, intensity assessment and virtual seismic research. The proposed method, which takes advantage of RF and GBDT, can quickly identify the spatial distribution and relatively accurate morphological structure of liquefaction from satellite-based images with a small number of samples. As more and more high-resolution satellites are launched, the high-quality images could be easier to access in time. The concentrated liquefaction area within one month to several months after a strong earthquake often forms a contiguous liquefaction area, which makes it easier to identify the distribution range; while the liquefaction in the years after the earthquake often only retains the liquefaction bunker and becomes difficult to identify. Thus, emergency remote sensing images can help scholars and officers to reassess regional coseismic liquefaction in a short time by the method proposed in this study without pre-earthquake images. Therefore, the time to obtain the mapping of liquefaction depends on when images from less cloudy days are available.

The Limitations of the Proposed Methods for Identifying Liquefaction
Although optical images usually have a relatively high resolution that could play an important role in identifying liquefaction, there are still several limitations to identifying the liquefaction. First, the availability and quality of optical images are often affected by the weather. Due to this reason, the timeliness of optical images could be a limitation on the performance of the proposed method [34]. In this study, the selected GF-7 image was nearly 20 days after the earthquake, some changes had occurred, such as the liquefied soil having become dry, and some adjacent liquefaction pits had joined together.
Second, the different land covers could have the same spectrum. Most satellite-based optical images have a resolution that is lower than 0.5 m. Thus, it is hard to identify liquefaction based on morphology ( Figure 16). In this study, the two proposed methods extract liquefaction mainly based on the four spectral bands and the derivate five spectral indices. The liquefaction usually has a similar spectrum with the exposed sand around the river. Soil type is an important indicator for whether a place will be susceptible to liquefaction during an earthquake; it could be one possible solution. Moreover, how to take advantage of this information will be the subject of further study. ble to liquefaction during an earthquake; it could be one possible solution. Moreover, how to take advantage of this information will be the subject of further study. Third, the proposed methods had good performance regarding the Maduo earthquake. However, this area had natural landcovers or bare land; it is necessary to validate the performance of the proposed method in other places, such as the city.

Conclusions
Based on optical images from the GF-7 satellite, two machine learning methods (i.e., RF and GBDT) were proposed to detect the liquefaction that was induced by the 2021 Maduo Mw7.3 earthquake on the Tibetan Plateau. The final prediction accuracies of liquefaction were evaluated using the Kappa index, overall accuracy, recall score and precision score. Even with only forty-five training samples, the scores of the four evaluation indices reached about 0.9. By increasing the number of training samples, the performance was also improved. Furthermore, the spatial distribution predicted by RF and GBDT with 270 training samples was analyzed. Finally, the extracted liquefaction regions were overlapped on a pre-earthquake (GF-1D) and post-earthquake image (UAV) for comparison and validation. The spatial distribution of the extracted liquefaction pits was consistent with the information in previous studies. However, the number of liquefaction pits was larger than in the reported studies of other earthquakes. The highresolution images, designed methods and geologic structures could be possible reasons. In summary, the proposed methods efficiently and quickly provided the spatial distribution of liquefaction based on post-earthquake satellite images, which could be helpful for earthquake intensity assessment and earthquake emergency response.
To the best of our knowledge, this study is the first to identify liquefaction based on random forest and gradient boosting decision tree methods with GF satellites. The proposed methods could be effective in taking advantage of optical satellite images and limited samples to identify liquefaction pits quickly after an earthquake. However, the weather could affect the timeliness of optical images, thus missing the optimal time to recognize the liquefaction pits. Furthermore, future studies will develop new extraction methods for liquefaction based on multi-source satellite data (e.g., different resolution optical satellite images, synthetic aperture radar images), optical satellite images with a time series long before the earthquake, and soil type information. Those further strategies could possibly overcome the shortcomings of the proposed methods for identifying liquefaction in this study. In addition, we will search for more data on other earthquakes to validate the replicability of the proposed methods.  Third, the proposed methods had good performance regarding the Maduo earthquake. However, this area had natural landcovers or bare land; it is necessary to validate the performance of the proposed method in other places, such as the city.

Conclusions
Based on optical images from the GF-7 satellite, two machine learning methods (i.e., RF and GBDT) were proposed to detect the liquefaction that was induced by the 2021 Maduo M w 7.3 earthquake on the Tibetan Plateau. The final prediction accuracies of liquefaction were evaluated using the Kappa index, overall accuracy, recall score and precision score. Even with only forty-five training samples, the scores of the four evaluation indices reached about 0.9. By increasing the number of training samples, the performance was also improved. Furthermore, the spatial distribution predicted by RF and GBDT with 270 training samples was analyzed. Finally, the extracted liquefaction regions were overlapped on a pre-earthquake (GF-1D) and post-earthquake image (UAV) for comparison and validation. The spatial distribution of the extracted liquefaction pits was consistent with the information in previous studies. However, the number of liquefaction pits was larger than in the reported studies of other earthquakes. The high-resolution images, designed methods and geologic structures could be possible reasons. In summary, the proposed methods efficiently and quickly provided the spatial distribution of liquefaction based on postearthquake satellite images, which could be helpful for earthquake intensity assessment and earthquake emergency response.
To the best of our knowledge, this study is the first to identify liquefaction based on random forest and gradient boosting decision tree methods with GF satellites. The proposed methods could be effective in taking advantage of optical satellite images and limited samples to identify liquefaction pits quickly after an earthquake. However, the weather could affect the timeliness of optical images, thus missing the optimal time to recognize the liquefaction pits. Furthermore, future studies will develop new extraction methods for liquefaction based on multi-source satellite data (e.g., different resolution optical satellite images, synthetic aperture radar images), optical satellite images with a time series long before the earthquake, and soil type information. Those further strategies could possibly overcome the shortcomings of the proposed methods for identifying liquefaction in this study. In addition, we will search for more data on other earthquakes to validate the replicability of the proposed methods.

Data Availability Statement:
The data presented in this study are available from the corresponding author, Y.X. upon reasonable request.