Article

Comparative Analysis of Machine Learning Methods and a Physical Model for Shallow Landslide Risk Modeling

1
State Key Laboratory of Soil Erosion and Dryland Farming on the Loess Plateau, Institute of Water and Soil Conservation, Chinese Academy of Sciences and Ministry of Water Resources, Yangling 712100, China
2
The Research Center of Soil and Water Conservation and Ecological Environment, Chinese Academy of Sciences and Ministry of Education, Yangling 712100, China
3
University of Chinese Academy of Sciences, Beijing 100049, China
4
Key Laboratory of Mollisols Agroecology, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin 150081, China
5
Institute of Soil and Water Conservation, Northwest A&F University, Yangling 712100, China
6
School of Environment and Resources, Taiyuan University of Science and Technology, Taiyuan 030024, China
*
Authors to whom correspondence should be addressed.
Sustainability 2023, 15(1), 6; https://doi.org/10.3390/su15010006
Submission received: 23 October 2022 / Revised: 7 December 2022 / Accepted: 10 December 2022 / Published: 20 December 2022
(This article belongs to the Section Soil Conservation and Sustainability)

Abstract

Shallow landslides restrict sustainable local socioeconomic development and threaten human lives and property in the loess tableland. Accurate risk maps are therefore critical for mitigating shallow landslide disasters. This study evaluated shallow landslide susceptibility in the loess tableland area using three machine learning (ML) methods, namely random forest (RF), support vector machine (SVM) and logistic regression (Log), and one physical model (SINMAP), and compared their results to select the best evaluation method. The nonlinear response relationship between shallow landslides and environmental factors was quantified based on the frequency ratio. Multicollinearity analysis was used to identify 10 factors that were applied in the ML methods to construct the spatial distribution model. The SINMAP model used a DEM and soil physical parameters to determine the stability coefficient of the study area. The results showed that (1) shallow landslides in Dongzhiyuan mainly occurred on shady, concave slopes with an elevation of 1068–1249 m and a slope gradient of 36°–60°. The stream power and stream transport indexes increased with increasing rainfall erosion, making shallow landslides more likely. The susceptibility of shallow landslides changed parabolically with the NDVI, and shallow landslides mainly occurred in grassland and shrubland. (2) The four methods produced similar spatial patterns of shallow landslide susceptibility. The high-incidence areas were on both sides of eroded gully slopes, whereas the tableland and gully bottom areas were not prone to shallow landslides. (3) The RF produced the highest area under the curve (AUC) values, 0.92 and 0.93 for the training and validation datasets, respectively, followed by the SVM (0.91 and 0.92), the Log (0.91 and 0.89) and the SINMAP model (0.69 and 0.74). In conclusion, the RF model best predicted the susceptibility of shallow landslides in the study area. The results provide a scientific basis for disaster mitigation on the Loess Plateau.

1. Introduction

Landslide hazard assessment, also known as sensitivity assessment and susceptibility zoning [1], is a systematic analysis of the environmental factors that affect the occurrence and development of natural disasters, such as landslides, on a large scale [2,3]. Shallow landslides are one of the most widespread, most frequent and most harmful types of landslides [4]. Under rainfall conditions, shallow landslides on slopes covered by vegetation have the characteristics of universality, mass occurrence and local explosions. The level of potential harm is high, and the disaster chain mode is significant [5]. The occurrence of shallow landslides provides a new underlying surface for hydraulic erosion, resulting in a sudden increase in sediment yield in the watershed [6]. Additionally, sediments derived from shallow landslides are often deposited in rivers and reservoirs, increasing the flood risk during rainfall and threatening land ecology and production safety in downstream areas [7].
The occurrence of shallow landslides results from the joint actions of various environmental factors [8]. The distribution of shallow landslides is closely related to topography. The slope gradient determines the effective free surface of the slope body and has a great impact on the near-surface hydrological characteristics [9]. In addition, the slope type, aspect and height directly reflect the development state and sliding potential energy of a shallow landslide [10,11]. Rainfall is the main environmental factor inducing shallow landslides [12]. Rainfall mainly affects slope stability through water infiltration [13]. Moreover, the existence of vegetation promotes the occurrence of shallow landslides [14]. Han et al. [15] found that when rainfall reaches the critical rainfall for slope failure, the root system of vegetation will aggravate the occurrence of landslides. Different vegetation types also have significant effects on shallow loess landslides [5]. Thus, the special geological structure in the loess tableland of China creates the basic environmental factors promoting the development of shallow landslides. Therefore, landslide susceptibility prediction methods are of great significance for studying the spatial probability distribution of regional landslides and the correlation between landslides and basic environmental factors.
Generally, there are three main types of landslide hazard risk assessment methods: physical models, statistical models and machine learning methods [16,17,18,19]. A physical model is a mechanism model established according to the process of landslide occurrence [20,21]. Scholars at home and abroad have proposed many physical mechanism models, among which the SINMAP model is widely used [22,23]. The applicability of the SINMAP model in predicting the spatial distribution of shallow landslides induced by rainfall has been tested in several areas, and the results are reliable [21,24,25]. In the middle of the 1970s, many scholars used statistical methods to study the sensitivity of landslides [26]. Different mapping units and the relationship between the spatial distribution of landslides and environmental factors were investigated at different landscape scales [27]. According to statistical principles, the areas where disasters may occur in the future were classified. Then, effective mitigation and control measures could be implemented to reduce the occurrence of disasters [28,29]. Machine learning methods are primarily based on data characteristics rather than on functional relationships [29,30]. Statistical models simplify the functional relationships between data variables through mathematical equations and then conduct spatial modeling [29,31]. With the development of artificial intelligence and big data, machine learning methods have gradually been widely used in the evaluation of regional landslide susceptibility [32,33]. For example, neuro-fuzzy and artificial neural networks, random forests, decision trees, support vector machines, logistic regression and other model algorithms have been applied to landslide susceptibility evaluation and have achieved good results [34,35,36,37,38]. 
Machine learning methods can be used for the sensitivity mapping of shallow landslides by constructing datasets of influencing factors and substituting these data into ML algorithms. Previous studies showed that the prediction accuracy of random forest and support vector machine methods was better than that of other models [39,40]. KC et al. [41] assessed the landslide sensitivity in the northern stretch of the Arun Tectonic window, Nepal, based on a bivariate statistical method, and the accuracy of the assessment results was only approximately 75%. Merghadi et al. [29] found that the random forest model performed well in estimating landslide sensitivity by comparing the evaluation results of various machine learning models.
The Loess Plateau is one of the regions with a fragile ecological environment and the most serious water and soil loss in China. Due to the special nature of loess [42,43], natural disasters such as shallow landslides frequently occur. In order to reduce the occurrence of landslides, a large number of scholars have conducted extensive research on loess landslides from the aspects of temporal and spatial distribution, influencing factors, formation mechanism, etc. [44]. With the development of landslide disaster research in the Loess Plateau, it is urgent to study the vulnerability of large-scale regional geological disasters. Zhuang et al. [45] took the area along the Silk Road from Xi'an to Lanzhou as the study area and drew a landslide hazard susceptibility zoning map based on a logistic regression model. Taking Zhidan County, Shaanxi Province, China as the research area, Qiu et al. [46] used a frequency ratio model and an artificial neural network to make a loess landslide sensitivity map. Overall, these studies showed that landslides developed more vigorously on loess ridges, on the loess tableland, in mountains and on both sides of river valleys [47]. Among them, the loess tableland is one of the common landforms of the Loess Plateau and the main land used for human life and production [48,49]. Additionally, landslides cause the area of the tableland to shrink continuously, which seriously endangers local human life and property safety. "Steep slope and deep gully" are the overall characteristics of the gully region of the loess tableland [42]. During extreme rainfall events, shallow landslides have occurred frequently in the gully slope area of the loess tableland, further aggravating the reduction in the loess tableland area [50]. However, the vulnerability to shallow landslides in the loess tableland area has not been evaluated.
Therefore, it is necessary to predict and evaluate landslide susceptibility in the loess tableland to provide scientific theoretical support for reducing disaster losses and establishing a sound landslide forecasting system.
In summary, most studies have evaluated shallow landslide susceptibility with models built on a single principle, and few have compared machine learning models with physical models. Comparing models built on different principles helps to highlight the advantages and limitations of each model in shallow landslide susceptibility mapping [23,51,52]. Therefore, in this study, machine learning methods and a physical model were selected to evaluate the stability of the largest loess tableland, Dongzhiyuan. The random forest, support vector machine and logistic regression algorithms were selected because previous studies have verified that these machine learning methods provide high prediction accuracy and can be effectively used to predict the vulnerability of shallow landslides [39,40]. The physical model applied was the SINMAP model, which has been commonly used in previous research. The purpose of this research was to provide a scientific basis for the deployment of gully consolidation and highland protection measures in the research area. The results also provide decision support for disaster reduction work.

2. Materials and Methods

2.1. Study Area Profile

The Dongzhi Loess tableland (35°25′55″–35°51′11″ N, 107°27′42″–107°52′48″ E) is located in the middle and north of the Loess Plateau and belongs to the gully region of the loess tableland (Figure 1). The total area of the study area is 2765 km², of which the slope area and channel area are approximately 1805.5 km², accounting for 65.3% of the total area. According to statistics, there are 3649 erosion gullies above 500 m. The area has a temperate continental semiarid climate with sufficient sunlight and an average annual temperature of 10 °C. The annual precipitation is 400–600 mm. The rainfall distribution is uneven, and the precipitation from July to September accounts for more than 50% of the total annual precipitation. The main soil type is silty loam. The silty loam has a loose soil texture and displays poor resistance to erosion. The former vegetation mainly consisted of secondary forest and grassland.

2.2. Frequency Ratio Analysis of Environmental Factors

This study divided each environmental factor into eight attribute intervals using the natural breaks method in ArcGIS. The frequency ratio (FR) is a simple geospatial tool for evaluating the nonlinear response relationship between shallow landslides and basic environmental factors; it quantifies how strongly each interval of a factor is linked to shallow landslide susceptibility [53]. Its expression is as follows:
$$FR = \frac{A / A'}{B / B'}$$
where A is the number of landslide grid cells within a given interval of an environmental factor, A′ is the total number of shallow landslide grid cells in the study area, B is the number of grid cells within that interval, and B′ is the total number of grid cells in the study area.
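As a minimal sketch of the frequency ratio defined above, the following function computes FR for every interval of one factor. The function name and the cell counts are hypothetical and are not taken from the study; returning 0 for an empty interval is an assumption made only to avoid division by zero.

```python
def frequency_ratio(landslide_cells, class_cells):
    """Return FR = (A / A') / (B / B') for each interval of one factor.

    landslide_cells[i] is A, the landslide cell count in interval i;
    class_cells[i] is B, the total cell count in interval i.
    """
    if len(landslide_cells) != len(class_cells):
        raise ValueError("both lists need one entry per interval")
    total_landslide = sum(landslide_cells)  # A'
    total_cells = sum(class_cells)          # B'
    ratios = []
    for a, b in zip(landslide_cells, class_cells):
        if b == 0:
            ratios.append(0.0)  # assumption: empty interval reported as 0
        else:
            ratios.append((a / total_landslide) / (b / total_cells))
    return ratios


# Hypothetical counts for three intervals of one factor (not study data).
landslides = [10, 30, 60]
cells = [500, 300, 200]
fr = frequency_ratio(landslides, cells)
```

An FR above 1 means that landslides are over-represented in that interval relative to its share of the study area, and an FR below 1 means that they are under-represented.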

2.3. The Principle of the Machine Learning Methods

In this study, random forest (RF), support vector machine (SVM) and logistic regression (Log) models were selected to reflect the modeling characteristics of machine learning methods to assess the susceptibility of shallow landslides in the study area.
(1) RF can be understood as the integration of bagging (bootstrap aggregating) ensemble learning and the decision tree algorithm. The basic idea is to construct multiple decision trees from randomly selected training samples; the output is determined by the mode (for classification) or the average (for regression) of the predictions of the individual trees. That is, the prediction for an unknown sample is determined by majority voting, and the information of multiple decision trees is integrated to improve the classification accuracy and the stability of the model. The importance of each factor is expressed as follows:
$$P_k = \frac{\sum_{h=1}^{n} \sum_{j=1}^{t} D_{Gkhj}}{\sum_{k=1}^{m} \sum_{h=1}^{n} \sum_{j=1}^{t} D_{Gkhj}}$$
where m, n and t are the total number of basic environmental factors, the number of classification trees and the number of nodes in a single tree, respectively; DGkhj is the decrease in the Gini index of the k-th factor at node j of tree h; and Pk is the importance of the k-th environmental factor.
(2) SVM is a supervised learning method. The basic idea is to map the samples in the input space to a high-dimensional feature space through a nonlinear transformation and then to find the optimal classification surface that linearly separates the samples in that feature space. Given a set of linearly separable vectors xi (i = 1, 2, ..., n), each comprising the 10 environmental factors, and their corresponding output classes yi, landslide and non-landslide samples are separated by the hyperplane with the maximum margin. Its expressions are as follows:
$$\min_{\omega, b} \; \frac{1}{2} \|\omega\|^{2}$$
$$\text{s.t.} \quad y_i (\omega \cdot x_i + b) \ge 1$$
where ‖ω‖ is the norm of the normal vector of the hyperplane, and b is a constant.
The convex quadratic optimization problem is solved by introducing a Lagrangian function, which is expressed as follows:
$$L(\omega) = \frac{1}{2} \|\omega\|^{2} - \sum_{i=1}^{N} \lambda_i \left\{ \left[ y_i (\omega \cdot x_i + b) \right] - 1 \right\}$$
where λi is the Lagrange multiplier.
For linearly non-separable cases, slack (relaxation) variables ξi ≥ 0 are added to control the classification error. The constraints for correct classification change to the following:
$$y_i (\omega \cdot x_i + b) \ge 1 - \xi_i$$
(3) Log is a generalized linear regression model that fits a logistic function to forecast the probability of an event occurring. Its expression is as follows:
$$P(X) = \frac{\exp(\beta X)}{1 + \exp(\beta X)}$$
where P(X) is the probability of an event occurrence, X is an independent variable matrix, and β is the regression coefficient. In binomial logistic regression, a P(X) greater than 0.5 indicates that the event occurs, and a P(X) less than 0.5 indicates that the event does not occur.
In ArcGIS, the natural breaks method [17,54] was used to divide shallow landslide susceptibility into five zones: extremely low (0.0–0.2), low (0.2–0.4), medium (0.4–0.6), high (0.6–0.8) and extremely high (0.8–1.0).
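The three formulations in this section can be made concrete with a small sketch in plain Python. It is illustrative only and is not the software used in the study: all function names and inputs are hypothetical, class labels are assumed to be coded as +1 and −1 for the SVM, the first entry of X is assumed to be 1 so that the first coefficient acts as an intercept, and a probability lying exactly on a zone boundary is assigned to the higher zone.

```python
import math


def gini_importance(decreases):
    """Normalized importance P_k of each factor.

    decreases[k][h] lists the Gini decreases D_Gkhj of factor k
    at the nodes j of tree h.
    """
    per_factor = [sum(sum(nodes) for nodes in trees) for trees in decreases]
    total = sum(per_factor)
    if total == 0:
        raise ValueError("no Gini decrease recorded")
    return [value / total for value in per_factor]


def slack(omega, b, x, y):
    """Smallest slack that satisfies y * (omega . x + b) >= 1 - slack."""
    decision = sum(w * v for w, v in zip(omega, x)) + b
    return max(0.0, 1.0 - y * decision)


def logistic_probability(beta, x):
    """P(X) = exp(beta.X) / (1 + exp(beta.X)), in an overflow-safe form."""
    z = sum(b * v for b, v in zip(beta, x))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


ZONES = ["extremely low", "low", "medium", "high", "extremely high"]


def susceptibility_zone(p):
    """Map a probability in [0, 1] to one of the five 0.2-wide zones."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return ZONES[min(int(p * 5), 4)]


# Hypothetical inputs, not taken from the study.
importance = gini_importance([[[0.2, 0.1], [0.3]], [[0.1], [0.3]]])
xi_inside = slack([1.0, -1.0], 0.0, [0.5, 0.0], +1)   # inside the margin
xi_outside = slack([1.0, -1.0], 0.0, [2.0, 0.0], +1)  # beyond the margin
p = logistic_probability([0.0, 0.0], [1.0, 3.0])
zone = susceptibility_zone(p)
```

The importances sum to 1 by construction, a sample beyond the margin needs no slack, and a linear predictor of zero yields a probability of 0.5, which falls in the medium zone.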

2.4. The Principle of the SINMAP Model

The SINMAP model is a physical model based on ArcGIS, and it simulates only shallow landslides and is not suitable for deep instability [21]. Its expression is as follows:
$$FS = \frac{C_r + C_s + \cos^{2}\theta \left[ \rho_s g (D - D_w) + (\rho_s g - \rho_w g) D_w \right] \tan\varphi}{D \rho_s g \sin\theta \cos\theta}$$
where FS is the stability coefficient (factor of safety), Cs is the soil cohesion (N/m²), Cr is the plant root cohesion (N/m²), θ is the slope gradient (°), ρs is the density of wet soil (kg/m³), ρw is the water density (kg/m³), g is the gravitational acceleration (9.81 m/s²), D is the vertical soil depth (m), Dw is the vertical height of the water table within the soil layer (m), and φ is the soil internal friction angle (°).
According to the calculation results, stability classifications were carried out, as shown in Table 1.
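The stability coefficient of the infinite-slope formulation above can be evaluated directly, as in the following sketch. The function name, argument names and parameter values are hypothetical and are not the calibrated values of this study; the SINMAP software itself additionally derives the wetness term from the terrain, which is not reproduced here.

```python
import math


def factor_of_safety(c_r, c_s, theta_deg, rho_s, depth, water_height,
                     phi_deg, rho_w=1000.0, g=9.81):
    """Infinite-slope factor of safety FS.

    c_r, c_s      root and soil cohesion (N/m^2)
    theta_deg     slope gradient (degrees)
    rho_s, rho_w  wet soil and water density (kg/m^3)
    depth         vertical soil depth D (m)
    water_height  vertical height of the water table D_w (m)
    phi_deg       soil internal friction angle (degrees)
    """
    theta = math.radians(theta_deg)
    phi = math.radians(phi_deg)
    normal_term = (rho_s * g * (depth - water_height)
                   + (rho_s * g - rho_w * g) * water_height)
    resisting = c_r + c_s + math.cos(theta) ** 2 * normal_term * math.tan(phi)
    driving = depth * rho_s * g * math.sin(theta) * math.cos(theta)
    if driving <= 0:
        raise ValueError("slope gradient and soil depth must be positive")
    return resisting / driving


# Hypothetical cohesionless slope whose gradient equals the friction angle.
fs_dry = factor_of_safety(0.0, 0.0, 30.0, 1800.0, 1.0, 0.0, 30.0)
fs_wet = factor_of_safety(0.0, 0.0, 30.0, 1800.0, 1.0, 0.5, 30.0)
```

For a dry, cohesionless slope the expression reduces to tan φ / tan θ, so the dry case sits at the stability limit; raising the water table lowers the value, which is the mechanism by which rainfall destabilizes the slope in this formulation.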

2.5. Dataset

(1)
The machine learning method dataset
Based on Google Earth (resolution: 2.50 m × 2.50 m), the locations of shallow landslides were determined by manual visual interpretation, and 841 shallow landslides were finally selected through field investigation (Figure 1). Among them, 70% of shallow landslides were randomly selected to build the machine learning method training set, and the remaining 30% were used to validate the model evaluation results. The application of machine learning models for shallow landslide susceptibility research must include the selection of environmental factors that can effectively affect the reliability and accuracy of the evaluation results [55,56,57]. According to the characteristics and environmental conditions of shallow landslide hazard development in the study area, the factors (12 factors) included topographic and geomorphic factors, hydrological environmental factors and land cover factors. Specifically, the elevation (i.e., the digital elevation model, DEM), slope gradient, slope aspect, topographic relief, plane curvature, profile curvature, topographic wetness index (TWI), stream power index (SPI), stream transport index (STI), rainfall erosivity, land use type and normalized difference vegetation index (NDVI) were measured (Figure 2). The spatial resolution of the data used in this study met the research requirements (Table 2).
(2)
The dataset of the SINMAP model
The study area is mainly composed of Quaternary loess deposits that are widely distributed in the whole area [58]. Since the study area is located in the gully area of the loess tableland, the geotechnical engineering geological characteristics are basically the same. Therefore, this study divided the research area into a uniform calibration area. The SINMAP model parameters were determined from previous studies [58], as shown in Table 3.
As shown in Figure 3, the RF, SVM and Log methods were based on the topographic and geomorphic, hydrological environmental and land cover factors, which were substituted into the algorithms to obtain the prediction results. SINMAP obtained its prediction results by coupling topographic data with other factors. In this study, the prediction results were quantitatively assessed based on sensitivity, specificity and other indexes.
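The random 70%/30% division of the 841 interpreted landslides into training and validation sets can be sketched as follows. The function name, the fixed seed and the rounding of the training-set size are assumptions made for reproducibility of the example; the source does not state how the split was implemented.

```python
import random


def split_landslides(ids, train_fraction=0.7, seed=42):
    """Shuffle the landslide identifiers and split them into two sets."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seeded for repeatability
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Identifiers 0..840 stand in for the 841 mapped shallow landslides.
train_ids, validation_ids = split_landslides(range(841))
```

Every landslide ends up in exactly one of the two sets, so the validation points remain unseen during training.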

2.6. Model Performance Evaluation and Validation

Four evaluation indicators were selected to measure the model performance: sensitivity (TPR), specificity (TNR) and accuracy, which are based on the confusion matrix, and the area under the receiver operating characteristic (ROC) curve (AUC) [59]. The expressions are as follows:
$$TPR = \frac{TP}{TP + FN}$$
$$TNR = \frac{TN}{TN + FP}$$
$$Accuracy = \frac{TP + TN}{TP + FP + TN + FN}$$
where TP (true positive) is the number of landslide points correctly classified as landslides, TN (true negative) is the number of non-landslide points correctly classified as non-landslides, FN (false negative) is the number of landslide points classified as non-landslides, and FP (false positive) is the number of non-landslide points classified as landslides.
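A short sketch shows how the three confusion-matrix indicators are obtained from labeled points. The function names and the eight example points are hypothetical; 1 denotes a landslide point and 0 a non-landslide point.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP and FN for binary labels (1 = landslide)."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1  # landslide correctly classified as landslide
        elif a == 0 and p == 0:
            tn += 1  # non-landslide correctly classified as non-landslide
        elif a == 0 and p == 1:
            fp += 1  # non-landslide classified as landslide
        else:
            fn += 1  # landslide classified as non-landslide
    return tp, tn, fp, fn


def performance(tp, tn, fp, fn):
    """Return sensitivity (TPR), specificity (TNR) and accuracy."""
    return {
        "TPR": tp / (tp + fn),
        "TNR": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical validation points, not study data.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 1, 1]
counts = confusion_counts(actual, predicted)
metrics = performance(*counts)
```

Because accuracy is a weighted average of sensitivity and specificity, it always lies between the two, which is a quick consistency check on reported values.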

3. Results

3.1. Frequency Ratio Analysis of Environmental Factors

The FR was normalized to FRn, with values ranging from 0 to 1, which helped analyze the spatial relationship between shallow landslides and environmental factors. Figure 4 shows the correlation between shallow landslides and the various factors. Topographic and geomorphic factors are the internal factors that trigger shallow landslides. For these factors (Figure 4a–f), shallow landslides were most likely to occur in regions with a moderate elevation, slope aspect and profile curvature, a large slope gradient and topographic relief, and a small plane curvature. Regarding the DEM, shallow landslides mainly occurred in the range of 1134–1193 m, followed by the range of 1068–1134 m, and finally the range of 1193–1249 m. In contrast, the elevation ranges of 847–986 m, 1309–1381 m and 1249–1309 m had FRn values of 0.05, 0.16 and 0.55, respectively, indicating that shallow landslides were less prevalent in these elevation ranges. As shown in Figure 4b, FRn increased with increasing slope gradient, indicating that a steeper slope promoted the occurrence of shallow landslides. Regarding the slope aspect, shallow landslides mainly occurred in the ranges of 112.5–157.5°, 157.5–202.5° and 202.5–247.5°, which had FRn values of 1.0, 0.82 and 0.69, respectively. In other directions, shallow landslides occurred less frequently. Figure 4d shows the correlation between topographic relief and shallow landslides. FRn increased with increasing topographic relief, indicating that greater terrain variation increased the occurrence probability of shallow landslides. Figure 4e shows that when the plane curvature was (−0.83)–(−0.35), (−8.88)–(−1.6) and (−1.6)–(−0.83), FRn reached the maximum value, the second largest value and the third largest value, respectively. Regarding the profile curvature (Figure 4f), shallow landslides mainly occurred between 0.43 and 2.99. In other ranges of profile curvature, the FRn values were smaller.
Hydrological environmental factors were the most direct factors inducing shallow landslides. Figure 4g shows that FRn decreased with increasing TWI. When the TWI was between 6.18 and 8.52, FRn reached the maximum value, and shallow landslides were the most likely to occur. Figure 4h shows that FRn changed parabolically with the SPI. When the SPI was between 8.15 and 9.65, shallow landslides were most likely to occur, and FRn reached its maximum value. Figure 4i shows that FRn increased with increasing STI. For rainfall erosivity (Figure 4j), shallow landslides mainly occurred at 1350.94–1371.40 MJ·mm/(hm²·h·a), 1393.73–1421.86 MJ·mm/(hm²·h·a) and 1308.17–1330.65 MJ·mm/(hm²·h·a), with FRn values of 1.0, 0.97 and 0.95, respectively.
Figure 4k shows that FRn showed a parabolic trend with the NDVI. When the NDVI was 0.50–0.58, FRn reached a maximum value. Figure 4l shows that shallow landslides mainly occurred in grassland and shrubland and occurred less often in woodland and cultivated land.

3.2. Multicollinearity Analysis

Before evaluating the susceptibility of landslides, each factor should be independent. If there is a strong linear correlation between the factors, it will lead to inaccurate landslide susceptibility evaluation results [17]. Therefore, it is necessary to carry out multicollinearity analysis. Multicollinearity is usually expressed by the variance inflation factor (VIF) and tolerance (TOL). When VIF < 10 or TOL > 0.1, each factor is independent [60]. The formulas are as follows:
$$VIF = \frac{1}{1 - R_i^{2}}$$
$$TOL = \frac{1}{VIF}$$
where $R_i^2$ is the coefficient of determination obtained by regressing the i-th factor on the other factors.
After multicollinearity analysis (Table 4), 10 factors were selected from the 12 factors considered in the study, including DEM, slope gradient, slope aspect, plane curvature, profile curvature, TWI, STI, rainfall erosivity, NDVI and land use type. The VIF values of all selected factors were less than 10 and the TOL values were greater than 0.1, indicating that there were no multicollinearity problems among the factors.
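The screening rule can be sketched as follows, assuming the coefficient of determination of each factor has already been obtained from a regression on the other factors. The function names and the two example values are hypothetical and are not the values in Table 4.

```python
def vif_and_tol(r_squared):
    """Return (VIF, TOL) for one factor from its R_i^2."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    vif = 1.0 / (1.0 - r_squared)
    return vif, 1.0 / vif


def is_independent(r_squared, vif_limit=10.0, tol_limit=0.1):
    """Apply the thresholds VIF < 10 and TOL > 0.1."""
    vif, tol = vif_and_tol(r_squared)
    return vif < vif_limit and tol > tol_limit


# Hypothetical R^2 values: a weakly and a strongly collinear factor.
weak = vif_and_tol(0.5)
keep_weak = is_independent(0.5)
keep_strong = is_independent(0.95)
```

Since TOL is the reciprocal of VIF, the two thresholds express the same condition: a factor fails the screen once the other factors explain more than 90% of its variance.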

3.3. Analysis for the Model Results

3.3.1. Model Performance and Validation

Table 5 shows that the RF had relatively high sensitivity (92.03%), specificity (85.99%) and accuracy (84.85%) and produced the best prediction results. The sensitivity, specificity and accuracy of the SVM were 87.87%, 86.60% and 82.23%, respectively, and the prediction performance of the SVM ranked second. The Log had lower performance indexes (sensitivity: 83.71%, specificity: 66.75%, accuracy: 79.38%) and thus a lower predictive performance. Compared with the machine learning methods, the physical SINMAP model had the lowest sensitivity, specificity and accuracy, with values of 81.21%, 59.98% and 70.59%, respectively, and its predictive performance was poor. Figure 5 shows the ROC curves of the models for predicting landslide susceptibility. The RF generated the highest AUCs, 0.92 and 0.93 for the training and validation datasets, respectively. The AUCs of the SVM were 0.91 and 0.92, and those of the Log were 0.91 and 0.89, respectively. The SINMAP model had the lowest AUCs, 0.69 and 0.74 for the training and validation datasets, respectively. Thus, for predicting landslide susceptibility, the models were ranked as follows: the RF was best, the SVM and the Log were moderate, and the SINMAP model had the poorest performance but was still acceptable [23].
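The AUC values compared above can be interpreted as the probability that a randomly chosen landslide point receives a higher susceptibility score than a randomly chosen non-landslide point. The sketch below computes the AUC in exactly that pairwise way; the function name and the six example scores are hypothetical, and this is an illustration of the quantity rather than the procedure used in the study.

```python
def roc_auc(labels, scores):
    """AUC as the share of (landslide, non-landslide) pairs ranked correctly.

    Ties between a landslide and a non-landslide score count as half.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical susceptibility scores, not study data.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

In this example, eight of the nine landslide and non-landslide pairs are ordered correctly. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.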

3.3.2. Mapping and Comparison of the Sensitivity of Shallow Landslides

The landslide susceptibility results predicted by the models for Dongzhiyuan are shown in Figure 6 and Figure 7. Figure 6 shows that the predicted patterns of shallow landslide susceptibility in Dongzhiyuan were similar overall. The high-occurrence area of shallow landslides was mainly on both sides of the gully slopes, whereas the tableland and the gully bottoms were areas where shallow landslides did not easily occur. Overall, there were many erosion gullies in northeastern Dongzhiyuan, and the high-incidence area of shallow landslides was mainly concentrated there. Figure 7 shows that the area proportions of very low, low, moderate, high and very high susceptibility predicted by the RF were 17.76%, 25.58%, 23.41%, 20.65% and 12.60%, respectively. The corresponding area proportions predicted by the SVM were 37.22%, 20.66%, 18.55%, 10.98% and 12.59%, and those predicted by the Log were 38.99%, 15.54%, 12.34%, 13.0% and 20.11%, respectively. The SINMAP model predicted area proportions of 60.60% for very low, 10.31% for low, 12.11% for moderate, 13.01% for high and 3.97% for very high susceptibility. The previous analysis showed that the RF performed best in predicting the susceptibility of shallow landslides, so its results were used as the benchmark. Compared with the RF, the SVM overestimated the very low area by 2.10 times and underestimated the low, moderate and high areas by 1.24, 1.26 and 1.88 times, respectively. The area proportions of the shallow landslide-prone zones predicted by the Log were similar to those of the SVM. For the SINMAP model, the very low area was overestimated by a factor of 3.41, while the low, moderate, high and very high areas were underestimated by factors of 2.48, 1.93, 1.59 and 3.17, respectively.
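The over- and underestimation factors quoted above follow directly from the area proportions in Figure 7. As a worked check, the sketch below recomputes the SVM factors against the RF benchmark from the percentages reported in the text; the function name is hypothetical.

```python
CLASSES = ["very low", "low", "moderate", "high", "very high"]
RF_AREA = [17.76, 25.58, 23.41, 20.65, 12.60]   # % of study area
SVM_AREA = [37.22, 20.66, 18.55, 10.98, 12.59]  # % of study area


def ratio_to_benchmark(benchmark, other):
    """Return (direction, factor) per class, with factor >= 1."""
    result = []
    for ref, value in zip(benchmark, other):
        if value >= ref:
            result.append(("over", value / ref))
        else:
            result.append(("under", ref / value))
    return result


svm_vs_rf = ratio_to_benchmark(RF_AREA, SVM_AREA)
svm_factors = [round(factor, 2) for _, factor in svm_vs_rf]
```

The rounded factors reproduce the reported 2.10, 1.24, 1.26 and 1.88 for the very low, low, moderate and high classes; the very high class is essentially unchanged between the two models.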

4. Discussion

4.1. Characteristics of Environmental Factors for Shallow Landslides

Shallow landslides can be induced when environmental conditions reach a certain critical state [49]. This study selected and evaluated 12 environmental factors: topographic and geomorphic factors (DEM, slope gradient, slope aspect, topographic relief, plane curvature and profile curvature), hydrological environmental factors (TWI, rainfall erosivity, STI and SPI) and land cover factors (land use type and NDVI). For the topographic and geomorphic factors, the occurrence frequency of shallow landslides in Dongzhiyuan increased with increasing slope gradient, which was related to the change in shear stress on the slope [61]. The research results of Deng et al. [62] indicated that landslides were prone to occur on steep slopes of 30°–55°. The shallow landslides in Dongzhiyuan mostly occurred in the northern part of the erosion gully slope. In arid or semiarid regions, the north side of the erosion gully slope is usually a shaded slope [63]. Shaded slope soil receives less sunshine, and its soil water content is higher than that of the sunny slope. Soil water is a lubricant that promotes shallow landslides, so shallow landslides occurred more frequently in the eroding gully of Dongzhiyuan [64]. Topographic relief is an important index for describing the geomorphic form, and it is also an important topographic and geomorphic factor affecting shallow landslides. The larger the topographic relief is, the more likely it is that a shallow landslide will occur. In terms of curvature, shallow landslides mainly occurred on concave slopes. The slope shape not only directly reflected the slope evolution processes and results of the erosion gully slope unit under the action of internal and external forces [65] but also affected the runoff velocity and infiltration distribution [66]. In conclusion, shallow landslides in Dongzhiyuan were mainly distributed on shady, steep and concave slopes with elevations between 1068 and 1249 m.
Hydrological environmental factors were the main factors inducing shallow loess landslides. FRn gradually decreased with increasing TWI. When the TWI was between 6.18 and 9.67, shallow landslides were more widely distributed. Previous studies have shown that a higher TWI promotes the occurrence of shallow landslides [17]. However, the results of this study were inconsistent with those findings. The main reason for this difference was that shallow landslides occur more frequently in the upper and middle parts of an erosion gully slope, while the higher TWI values in Dongzhiyuan were located on the tableland and at the bottom of the gullies (Figure 2g). Previous studies have shown that higher rainfall erosivity, SPI and STI values increase the occurrence of shallow landslides [67,68].
In recent years, studies have shown that shallow landslides on vegetated slopes have become a new ecological environmental problem in the process of vegetation restoration on the Loess Plateau [5]. The effects of vegetation on the stability of erosion gully slopes can be roughly divided into mechanical effects and hydrological effects, and these effects have positive and negative effects on stability [5]. Mechanical effects are mainly related to soil reinforcement by roots [69]. Hydrological effects are mainly related to vegetation canopy interception and increased infiltration [70]. The research results showed that the susceptibility of shallow landslides changed parabolically with the NDVI, indicating that when vegetation was too sparse, the existing vegetation promoted the occurrence of shallow landslides. Conversely, when there was abundant vegetation, the occurrence of shallow landslides was restrained. The main reason was that the sparse vegetation roots did not help strengthen the soil stability. Instead, the root system formed macropores in the soil [71], and these macropores promoted runoff into the permeable soil and thus intensified the occurrence of shallow landslides [64,72]. With the continuous increase in vegetation, the positive impact of vegetation reinforcement on soil stability gradually became greater than the negative impact [62], so shallow landslides did not easily occur. The results showed that shallow landslides occurred most often in grassland, followed by shrubs, and they were less prone to occur in forestland, which was consistent with previous research results. The main reason was related to the length of the vegetation root system, and the depth of shallow landslides on the Loess Plateau was mostly concentrated in the 0.30 m–1.0 m range [5].

4.2. Differences in Susceptibility Mapping of Shallow Landslides Predicted by the Models

Based on model performance, the highest AUCs were generated by the RF, with values of 0.92 and 0.93 for the training and validation datasets, respectively. The AUCs of the SVM were 0.91 and 0.92, and those of the Log were 0.91 and 0.89, respectively. There was little difference among the three models. However, the susceptibility maps of shallow landslides differed among the models. Compared with the evaluation results of the RF, the SVM and the Log overestimated the area of extremely low susceptibility by 2.10–2.20 times, while the low, medium and high areas were underestimated by 1.24–1.65 times, 1.26–1.90 times and 1.88–1.59 times, respectively. Moreover, the evaluation results of the Log contained some errors. The location marked in Figure 8 is the center of the tableland, which was predicted as a stable area by the RF. However, the Log evaluated it as a very high-risk area. The reason lies in the principle by which the Log predicts the susceptibility of shallow landslides: it assigns a weight to each environmental factor and then obtains the probability of occurrence of shallow landslides. From Figure 2k,l, it can be seen that the NDVI and land use type at the marker differed greatly from those of the surrounding areas, which led to the incorrect prediction by the Log. The RF produced more accurate evaluation results than did the other models. The reasons are as follows: first, the RF has strong generalization ability. Second, it can process high-dimensional data without feature selection and can balance errors. Third, the RF has strong resistance to noise and overfitting and can solve both classification and regression problems [29,40]. Research on machine learning-based landslide susceptibility mapping by Adnan et al. [73] and Ali et al. [74] also showed that the RF had strong modeling adaptability.
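One way to see why models with nearly identical AUCs can still produce different maps is that the AUC depends only on how the sampled landslide and non-landslide points are ranked, not on how the predicted values are distributed over the whole study area. The following minimal sketch computes the AUC from its rank interpretation (the probability that a randomly chosen landslide point receives a higher score than a randomly chosen non-landslide point). The function name and the toy labels and scores are hypothetical and are not data from this study.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical toy sample: 1 = landslide point, 0 = non-landslide point.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 of the 9 positive-negative pairs are ordered correctly
```

Any monotonic rescaling of the scores leaves this value unchanged, although it would change the areas assigned to each susceptibility class once the scores are binned.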
For conventional statistical models, the accuracy of predicting the susceptibility of shallow landslides was lower than that of machine learning models [29]. For example, KC et al. [41] predicted regional landslide susceptibility based on the weight of evidence approach and other methods, and the accuracy of their prediction results was only approximately 75%. The main reason for the low accuracy was that the statistical model was insensitive to the nonlinear relationships among factors, a shortcoming that machine learning models can largely mitigate. Moreover, compared with the study of KC et al. [41], we used more shallow landslide samples and a more comprehensive set of influencing factors.
In this study, the SINMAP model performed worse than the machine learning models, which was consistent with previous research results [23]. Compared with the RF, the SINMAP model underestimated the susceptibility of shallow landslides in the study area. The main reason was related to the SINMAP model parameters [23]. The SINMAP model is a physical model. According to model theory and previous research results [20,21], the SINMAP model focuses on local slope stability, which makes it poorly suited to landslides across a large study area and prone to spatial errors. Furthermore, physical models require a large amount of detailed data to provide reliable results, which demands considerable human and financial resources [29]. Therefore, the RF was the best method for mapping shallow landslide susceptibility in this study.
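The dependence of SINMAP on its parameters can be illustrated with a minimal sketch of the dimensionless infinite-slope factor of safety on which the model is commonly described as being based. The formula below is an assumption drawn from the general SINMAP literature, not an equation stated in this section, and the function name, slope angle and specific catchment area are hypothetical. The parameter bounds are those listed in Table 3, with C treated here as the dimensionless combined cohesion.

```python
import math

def factor_of_safety(slope_deg, specific_area_m, t_over_r_m, cohesion, phi_deg,
                     rho_s=1350.0, rho_w=1000.0):
    """Dimensionless infinite-slope factor of safety (assumed SINMAP-style form).

    slope_deg must be greater than 0; specific_area_m is the specific catchment area (m).
    """
    theta = math.radians(slope_deg)
    phi = math.radians(phi_deg)
    # Relative wetness, capped at 1 (fully saturated soil column).
    wetness = min(specific_area_m / (t_over_r_m * math.sin(theta)), 1.0)
    density_ratio = rho_w / rho_s
    resisting = cohesion + math.cos(theta) * (1.0 - wetness * density_ratio) * math.tan(phi)
    return resisting / math.sin(theta)

# Hypothetical cell: 40 degree slope, specific catchment area of 50 m.
fs_worst = factor_of_safety(40.0, 50.0, t_over_r_m=100.0, cohesion=0.35, phi_deg=15.0)
fs_best = factor_of_safety(40.0, 50.0, t_over_r_m=3000.0, cohesion=0.54, phi_deg=25.0)
```

For this hypothetical cell the worst-case parameter set gives a factor of safety below 1 and the best-case set gives a value above 1, which shows how strongly the predicted stability of a single cell depends on the chosen parameter ranges.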

4.3. Limitations and Implications of This Study

We confirmed that the models differed to a certain degree in their susceptibility maps of shallow landslides in the loess tableland. The random forest model displayed the best prediction performance and is suitable for mapping the susceptibility of shallow landslides in the loess tableland area. However, it is challenging to select the most appropriate prediction model for other regions because the prediction results depend not only on the accuracy of the available data but also on the uncertainty of the parameters used to build shallow landslide models and on the shortcomings of the different models. In conclusion, the susceptibility mapping of shallow landslides generated by machine learning models is a potentially valuable approach for predicting the risk of shallow landslides.

5. Conclusions

In this study, machine learning methods (the RF, SVM and Log methods) and a physical model (the SINMAP model) were used to evaluate the susceptibility to shallow landslides in Dongzhiyuan. The results showed that shallow landslides mainly occurred on shady slopes with elevations of 1068–1249 m, slope gradients of 36°–60° and concave shapes. Shallow landslides mainly occurred in areas with high rainfall erosivity, SPI and STI values. The susceptibility of shallow landslides changed parabolically with the NDVI, and shallow landslides mainly occurred in grassland and shrubland. Overall, the models produced similar results in predicting the susceptibility of shallow landslides, and the high-incidence areas were mainly located on both sides of the slopes of the erosion gullies. In summary, the three machine learning methods and the physical model could predict the susceptibility of shallow landslides relatively satisfactorily, but the RF was the most reliable and was therefore the best model for predicting the susceptibility of shallow landslides on the loess tableland.

Author Contributions

Conceptualization, L.F. and M.G.; methodology, L.F.; software, W.W.; validation, L.F., M.G. and Y.C.; formal analysis, Q.S.; investigation, W.G.; resources, Y.L.; data curation, L.F.; writing—original draft preparation, L.F.; writing—review and editing, H.K.; visualization, Z.C.; supervision, Y.Z.; project administration, L.F.; funding acquisition, L.F. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (42107356, 42077079 and 41907057), the State Key Laboratory of Soil Erosion and Dryland Farming on the Loess Plateau (A314021402-202102), the China Postdoctoral Science Foundation (2021T140663, 2020M681062 and 2020M683591) and the Heilongjiang Provincial Natural Science Foundation of China (YQ2021C036).

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Reichenbach, P.; Rossi, M.; Malamud, B.D.; Mihir, M.; Guzzetti, F. A review of statistically-based landslide susceptibility models. Earth-Sci. Rev. 2018, 180, 60–91.
  2. Chen, Y.L.; Irfan, M.; Uchimura, T.; Meng, Q.X.; Dou, J. Relationship between water content, shear deformation, and elastic wave velocity through unsaturated soil slope. Bull. Eng. Geol. Environ. 2020, 79, 4107–4121.
  3. Dou, J.; Yamagishi, H.; Pourghasemi, H.R.; Yunus, A.P.; Song, X.; Xu, Y.; Zhu, Z.F. An integrated artificial neural network model for the landslide susceptibility assessment of Osado Island, Japan. Nat. Hazards 2015, 78, 1749–1776.
  4. Godt, J.W.; Baum, R.L.; Savage, W.Z.; Salciarini, D.; Schulz, W.H.; Harp, E.L. Transient deterministic shallow landslide modeling: Requirements for susceptibility and hazard assessments in a GIS framework. Eng. Geol. 2008, 102, 214–226.
  5. Guo, W.Z.; Chen, Z.X.; Wang, W.L.; Gao, W.W.; Guo, M.M.; Kang, H.L.; Li, P.F.; Wang, W.X.; Zhao, M. Telling a different story: The promote role of vegetation in the initiation of shallow landslides during rainfall on the Chinese Loess Plateau. Geomorphology 2020, 350, 106879.
  6. Cao, B.T.; Jiao, J.Y.; Wang, Z.J.; Wei, Y.H.; Li, Y.J. Characteristics of landslide under the extreme rainstorm in 2013 in the Yanhe basin. Res. Soil Water Conserv. 2015, 22, 103–109. (In Chinese)
  7. Kuo, C.W.; Brierley, G. The influence of landscape connectivity and landslide dynamics upon channel adjustments and sediment flux in the Liwu Basin, Taiwan. Earth Surf. Process. Landf. 2014, 39, 2038–2055.
  8. Jaafari, A.; Panahi, M.; Pham, B.T.; Shahabi, H.; Bui, D.T.; Rezaie, F.; Lee, S. Meta optimization of an adaptive neuro-fuzzy inference system with grey wolf optimizer and biogeography-based optimization algorithms for spatial prediction of landslide susceptibility. Catena 2019, 175, 430–445.
  9. Yin, H.; Liu, F.; Du, L.X.; Sui, S.Y. Probability of loess landslide based on terrain and vegetation distribution in Loess Plateau. Geoscience 2010, 24, 1016–1021. (In Chinese)
  10. Youssef, A.M.; Al-Kathery, M.; Pradhan, B. Landslide susceptibility mapping at Al-Hasher area, Jizan (Saudi Arabia) using GIS-based frequency ratio and index of entropy models. Geosci. J. 2015, 19, 113–134.
  11. Chen, Z.X. Distribution Characteristic and Influencing Factors of Shallow Landslide on Vegetation-Covered Slope in the Loess-Tableland and Gully Region of the Loess Plateau. Master’s Thesis, Northwest A&F University, Yangling, China, 2020. (In Chinese)
  12. Peruccacci, S.; Brunetti, M.T.; Luciani, S.; Vennari, C.; Guzzetti, F. Lithological and seasonal control on rainfall thresholds for the possible initiation of landslides in central Italy. Geomorphology 2012, 139–140, 79–90.
  13. Ruette, J.V.; Lehmann, P.; Or, D. Effects of rainfall spatial variability and intermittency on shallow landslide triggering patterns at a catchment scale. Water Resour. Res. 2014, 50, 7780–7799.
  14. Wang, G.L.; Li, T.L.; Xing, X.L.; Zou, Y. Research on loess flow-slides induced by rainfall in July 2013 in Yan’an, NW China. Environ. Earth Sci. 2015, 73, 7933–7944.
  15. Han, Y.; Zheng, F.L.; Xu, X.M.; Sheng, H.W. Relationship between shallow landslide erosion and vegetation in the Ziwuling forest area: A case study of the “7·21” disaster in Fuxian County. Acta Ecol. Sin. 2016, 36, 4635–4643.
  16. Abedini, M.; Ghasemian, B.; Shirzadi, A.; Bui, D.T. A comparative study of support vector machine and logistic model tree classifiers for shallow landslide susceptibility modeling. Environ. Earth Sci. 2019, 78, 560.
  17. Chen, X.; Chen, W. GIS-based landslide susceptibility assessment using optimized hybrid machine learning methods. Catena 2021, 196, 104833.
  18. Zhao, X.; Chen, W. Optimization of computational intelligence models for landslide susceptibility evaluation. Remote Sens. 2020, 12, 2180.
  19. Aditian, A.; Kubota, T.; Shinohara, Y. Comparison of GIS-based landslide susceptibility models using frequency ratio, logistic regression, and artificial neural network in a tertiary region of Ambon, Indonesia. Geomorphology 2018, 318, 101–111.
  20. Nhu, V.H.; Zandi, D.; Shahabi, H.; Chapi, K.; Shirzadi, A.; Al-Ansari, N.; Singh, S.K.; Dou, J.; Nguyen, H. Comparison of support vector machine, bayesian logistic regression, and alternating decision tree algorithms for shallow landslide susceptibility mapping along a mountainous road in the west of Iran. Appl. Sci. 2020, 10, 5047.
  21. Lin, W.; Yin, K.L.; Wang, N.T.; Xu, Y.; Guo, Z.Z.; Li, Y.Y. Landslide hazard assessment of rainfall-induced landslide based on the CF-SINMAP model: A case study from Wuling Mountain in Hunan Province, China. Nat. Hazards 2021, 106, 679–700.
  22. Nery, T.D.; Vieira, B.C. Susceptibility to shallow landslides in a drainage basin in the Serra do Mar, São Paulo, Brazil, predicted using the SINMAP mathematical model. Bull. Eng. Geol. Environ. 2015, 74, 369–378.
  23. Nsengiyumva, J.B.; Luo, G.; Hakorimana, E.; Mind’je, R.; Gasirabo, A.; Mukanyandwi, V. Comparative analysis of deterministic and semiquantitative approaches for shallow landslide risk modeling in Rwanda. Risk Anal. 2019, 39, 2576–2595.
  24. Michel, G.P.; Kobiyama, M.; Goerl, R.F. Comparative analysis of SHALSTAB and SINMAP for landslide susceptibility mapping in the Cunha River basin, southern Brazil. J. Soils Sediments 2014, 14, 1266–1277.
  25. Deb, S.K.; El-Kadi, A.L. Susceptibility assessment of shallow landslides on Oahu, Hawaii, under extreme-rainfall events. Geomorphology 2009, 108, 219–233.
  26. Carrara, A. Multivariate models for landslide hazard evaluation. J. Int. Assoc. Math. Geol. 1983, 15, 403–426.
  27. Guzzetti, F.; Reichenbach, P.; Ardizzone, F.; Cardinali, M.; Galli, M. Estimating the quality of landslide susceptibility models. Geomorphology 2006, 81, 166–184.
  28. Bzdok, D.; Altman, N.; Krzywinski, M. Points of significance: Statistics versus machine learning. Nat. Methods 2018, 15, 232–233.
  29. Merghadi, A.; Yunus, A.P.; Dou, J.; Whiteley, J.; ThaiPham, B.; Bui, D.T.; Avtar, R.; Abderrahmane, B. Machine learning methods for landslide susceptibility studies: A comparative overview of algorithm performance. Earth-Sci. Rev. 2020, 207, 103225.
  30. Fang, R.K.; Liu, Y.H.; Huang, Z.Q. A review of the methods of regional landslide hazard assessment based on machine learning. Chin. J. Geol. Hazard Control 2021, 32, 1–8. (In Chinese)
  31. Kulesa, A.; Krzywinski, M.; Blainey, P.; Altman, N. Points of significance: Sampling distributions and the bootstrap. Nat. Methods 2015, 12, 477–478.
  32. Elith, J.; Leathwick, J.R.; Hastie, T. A working guide to boosted regression trees. J. Anim. Ecol. 2008, 77, 802–813.
  33. Chen, W.; Hong, H.Y.; Panahi, M.; Shahabi, H.; Wang, Y.; Shirzadi, A.; Pirasteh, S.; Alesheikh, A.A.; Khosravi, K.; Panahi, S.; et al. Spatial prediction of landslide susceptibility using GIS-based data mining techniques of ANFIS with Whale Optimization Algorithm (WOA) and Grey Wolf Optimizer (GWO). Appl. Sci. 2019, 9, 3755.
  34. Chen, W.; Pourghasemi, H.R.; Zhao, Z. A GIS-based comparative study of Dempster-Shafer, logistic regression and artificial neural network models for landslide susceptibility mapping. Geocarto Int. 2017, 32, 367–385.
  35. Trigila, A.; Iadanza, C.; Esposito, C.; Scarascia-Mugnozza, G. Comparison of logistic regression and random forests techniques for shallow landslide susceptibility assessment in Giampilieri (NE Sicily, Italy). Geomorphology 2015, 249, 119–136.
  36. Hong, H.Y.; Pradhan, B.; Xu, C.; Bui, D.T. Spatial prediction of landslide hazard at the Yihuang area (China) using two-class kernel logistic regression, alternating decision tree and support vector machines. Catena 2015, 133, 266–281.
  37. Chen, W.; Pourghasemi, H.R.; Kornejady, A.; Zhang, N. Landslide spatial modeling: Introducing new ensembles of ANN, MaxEnt, and SVM machine learning techniques. Geoderma 2017, 305, 314–327.
  38. Bui, D.T.; Tuan, T.A.; Klempe, H.; Pradhan, B.; Revhaug, I. Spatial prediction models for shallow landslide hazards: A comparative assessment of the efficacy of support vector machines, artificial neural networks, kernel logistic regression, and logistic model tree. Landslides 2016, 13, 361–378.
  39. Al-Ruzouq, R.; Shanableh, A.; Yilmaz, A.G.; Idris, A.; Mukherjee, S.; Khalil, M.A.; Gibril, M.B.A.; Barakat, A. Dam Site Suitability Mapping and Analysis Using an Integrated GIS and Machine Learning Approach. Water 2019, 11, 1880.
  40. Pourghasemi, H.R.; Yousefi, S.; Sadhasivam, N.; Eskandari, S. Assessing, mapping, and optimizing the locations of sediment control check dams construction. Sci. Total Environ. 2020, 739, 139954.
  41. KC, D.; Dangi, H.; Hu, L. Assessing landslide susceptibility in the northern stretch of Arun Tectonic Window, Nepal. CivilEng 2022, 3, 525–540.
  42. Zhang, M.S.; Hu, W.; Sun, P.P.; Wang, X.L. Advances and prospects of water sensitivity of loess and the induced loess landslides. J. Earth Environ. 2016, 7, 323–334. (In Chinese)
  43. Feng, L.; Lin, H.; Zhang, M.S.; Guo, L.; Jin, Z.; Liu, X.B. Development and evolution of Loess vertical joints on the Chinese Loess Plateau at different spatiotemporal scales. Eng. Geol. 2020, 265, 105372.
  44. Sun, P.P. Water Sensitivity of Loess and Prediction of Rainfall Induced Shallow Loess Landslides; Northwest University: Kirkland, WA, USA, 2020.
  45. Zhuang, J.Q.; Peng, J.B.; Zhu, X.H.; Li, W.; Ma, P.H.; Liu, T.M. Spatial distribution and susceptibility zoning of geohazards along the Silk Road, Xian-Lanzhou. Environ. Earth Sci. 2016, 75, 711.
  46. Qiu, H.J.; Cui, P.; Regmi, A.D.; Hu, S.; Hao, J.Q. Loess slide susceptibility assessment using frequency ratio model and artificial neural network. Q. J. Eng. Geol. Hydrogeol. 2019, 52, 38–45.
  47. Li, L.; Xu, C.; Zhang, Z.J.; Huang, Y.D. A review of research on landslide disasters on loess plateau. J. Inst. Disaster Prev. 2021, 23, 1–11. (In Chinese)
  48. Guo, M.M.; Wang, W.L.; Shi, Q.H.; Chen, T.D.; Kang, H.L.; Li, J.M. An experimental study on the effects of grass root density on gully headcut erosion in the gully region of China’s Loess Plateau. Land Degrad. Dev. 2019, 30, 2107–2125.
  49. Fan, X.M.; Scaringi, G.; Korup, O.; West, A.J.; van Westen, C.J.; Tanyas, H.; Hovius, N.; Hales, T.C.; Jibson, R.W.; Allstadt, K.E. Earthquake-induced chains of geologic hazards: Patterns, mechanisms, and impacts. Rev. Geophys. 2019, 57, 421–503.
  50. Liu, W.F.; Zhang, H.Y.; Zhu, J.H.; Hu, A.P. Strategies for gully stabilization and highland protection in Chinese Loess Plateau. Front. Earth Sci. 2022, 10, 812609.
  51. Sofia, C.L.; Oliveira, S.C.; Pereira, S.; Zezere, J.L.; Corsini, A. A comparison between bivariate and multivariate methods to assess susceptibility to liquefaction-related coseismic surface effects in the Po Plain (Northern Italy). Geomat. Nat. Hazards Risk 2018, 9, 108–126.
  52. Khosravi, K.; Shahabi, H.; Pham, B.T.; Adamowski, J.; Shirzadi, A.; Pradhan, B.; Dou, J.; Ly, H.B.; Grof, G.; Ho, H.L.; et al. A comparative assessment of flood susceptibility modeling using Multi-Criteria Decision-Making Analysis and Machine Learning Methods. J. Hydrol. 2019, 573, 311–323.
  53. Zhang, T.Y.; Han, L.; Wang, H. Assessment of landslide susceptibility using integrated ensemble fractal dimension with kernel logistic regression model. Entropy 2019, 21, 218.
  54. Bui, D.T.; Tsangaratos, P.; Nguyen, V.T.; Liem, N.V.; Trinh, P.T. Comparing the prediction performance of a Deep Learning Neural Network model with conventional machine learning models in landslide susceptibility assessment. Catena 2020, 188, 104426.
  55. Hong, H.Y.; Pourghasemi, H.R.; Pourtaghi, Z.S. Landslide susceptibility assessment in Lianhua County (China): A comparison between a random forest data mining technique and bivariate and multivariate statistical models. Geomorphology 2016, 259, 105–118.
  56. Ayalew, L.; Yamagishi, H. The application of GIS-based logistic regression for landslide susceptibility mapping in the Kakuda-Yahiko Mountains, Central Japan. Geomorphology 2005, 65, 15–31.
  57. Süzen, M.L.; Kaya, B.S. Evaluation of environmental parameters in logistic regression models for landslide susceptibility mapping. Int. J. Digit. Earth 2012, 5, 338–355.
  58. Gansu Geological Environment Monitoring Institute. Investigation of Geological Hazards and Report on Zoning in Xifeng District of Qingyang City, Gansu Province; Gansu Geological Environment Monitoring Institute: Qingyang, China, 2007.
  59. Arabameri, A.; Saha, S.; Roy, J.; Chen, W.; Blaschke, T.; Bui, D.T. Landslide susceptibility evaluation and management using different machine learning methods in the Gallicash River Watershed, Iran. Remote Sens. 2020, 12, 475.
  60. Chen, W.; Li, H.; Hou, E.K.; Wang, S.Q.; Wang, G.R.; Panahi, M.; Li, T.; Peng, T.; Guo, C.; Niu, C.; et al. GIS-based groundwater potential analysis using novel ensemble weights-of-evidence with logistic regression and functional tree models. Sci. Total Environ. 2018, 634, 853–867.
  61. Maltman, A. The geological deformation of sediments. J. Quat. Sci. 1996, 11, 171–172.
  62. Deng, J.Y.; Ma, C.; Zhang, Y. Shallow landslide characteristics and its response to vegetation by example of July 2013, extreme rainstorm, Central Loess Plateau, China. Bull. Eng. Geol. Environ. 2022, 81, 100.
  63. McGuire, L.A.; Rengers, F.K.; Kean, J.W.; Coe, J.A.; Mirus, B.B.; Baum, R.L.; Godt, J.W. Elucidating the role of vegetation in the initiation of rainfall-induced shallow landslides: Insights from an extreme rainfall event in the Colorado Front Range. Geophys. Res. Lett. 2016, 43, 9084–9092.
  64. Zhou, J.W.; Xu, F.G.; Yang, X.G.; Yang, Y.C.; Lu, P.Y. Comprehensive analyses of the initiation and landslide-generated wave processes of the 24 June 2015 Hongyanzi landslide at the Three Gorges Reservoir, China. Landslides 2016, 13, 589–601.
  65. Feng, M. Risk Assessment of Landslide Geological Disasters in Daning County Based on Machine Learning Model. Master’s Thesis, Chang’an University, Xi’an, China, 2020. (In Chinese)
  66. Chen, W.; Xie, X.S.; Wang, J.L.; Pradhan, B.; Hong, H.Y.; Bui, D.T.; Duan, Z.; Ma, J.Q. A comparative study of logistic model tree, random forest, and classification and regression tree models for spatial prediction of landslide susceptibility. Catena 2017, 151, 147–160.
  67. Conoscenti, C.; Di Maggio, C.; Rotigliano, E. GIS analysis to assess landslide susceptibility in a fluvial basin of NW Sicily (Italy). Geomorphology 2008, 94, 325–339.
  68. Prosser, I.P.; Rustomji, P. Sediment transport capacity relations for overland flow. Prog. Phys. Geogr. Earth Environ. 2000, 24, 179–193.
  69. Krzeminska, D.; Kerkhof, T.; Skaalsveen, K.; Stolte, J. Effect of riparian vegetation on stream bank stability in small agricultural catchments. Catena 2019, 172, 87–96.
  70. Pollen, N. Temporal and spatial variability in root reinforcement of streambanks: Accounting for soil shear strength and moisture. Catena 2007, 69, 197–205.
  71. Su, Z.G.; Xiong, D.H.; Dong, Y.F.; Zhang, B.J.; Zhang, S.; Zheng, X.Y.; Yang, D.; Zhang, J.H.; Fan, J.R.; Fang, H.D. Hydraulic properties of concentrated flow of a bank gully in the dry-hot valley region of southwest China. Earth Surf. Process. Landf. 2015, 40, 1351–1363.
  72. Wang, Y.; Lin, Q.G.; Shi, P.J. Spatial pattern and influencing factors of landslide casualty events. J. Geogr. Sci. 2018, 28, 259–274.
  73. Adnan, M.S.G.; Rahman, M.S.; Ahmed, N.; Ahmed, B.; Rabbi, M.F.; Rahman, R.M. Improving spatial agreement in machine learning-based landslide susceptibility mapping. Remote Sens. 2020, 12, 3347.
  74. Ali, M.Z.; Chu, H.J.; Chen, Y.C.; Ullah, S. Machine learning in earthquake- and typhoon-triggered landslide susceptibility mapping and critical factor identification. Environ. Earth Sci. 2021, 80, 233.
Figure 1. Study area and shallow landslide points. (a): the Loess Plateau; (b): Dongzhiyuan; (c) and (d): shallow landslide.
Figure 2. Foundational environmental factors of shallow landslides in Dongzhiyuan. (a): DEM; (b): Slope Gradient; (c): Slope Aspect; (d): Topographic Relief; (e): Plane Curvature; (f): Profile Curvature; (g): TWI; (h): SPI; (i): STI; (j): Rainfall Erosivity; (k): NDVI; (l): Land Use Type.
Figure 3. Flowchart of methods.
Figure 4. Relationships between shallow landslides and environmental factors. (a): DEM; (b): Slope Gradient; (c): Slope Aspect; (d): Topographic Relief; (e): Plane Curvature; (f): Profile Curvature; (g): TWI; (h): SPI; (i): STI; (j): Rainfall Erosivity; (k): NDVI; (l): Land Use Type.
Figure 5. The ROC curves of the model predicting landslide susceptibility. Note: (a) is the ROC curve generated by the training set data; (b) is the ROC curve generated by the validation set data.
Figure 6. Model prediction classification map of landslide susceptibility. (a): RF model; (b): SVM model; (c): Logistic model; (d): SINMAP model.
Figure 7. Model prediction of the susceptibility zoning percentage of shallow landslides.
Figure 8. Comparison of local prediction results between the RF model and the Log model. Note: (a) represents the prediction result of the RF model; (b) represents the prediction result of the Log model.
Table 1. Classification of stability levels.

| Class | Stability Index (Factor of Safety) | Characteristics |
|---|---|---|
| 1 | <0.5 | Very High Susceptibility |
| 2 | 0.5–1.0 | High Susceptibility |
| 3 | 1.0–1.25 | Moderate Susceptibility |
| 4 | 1.25–1.5 | Low Susceptibility |
| 5 | >1.5 | Very Low Susceptibility |
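The classification in Table 1 can be expressed as a simple lookup, sketched below. The function name is hypothetical. Table 1 does not state which class a value lying exactly on a shared boundary (0.5, 1.0 or 1.25) belongs to, so the sketch assumes that such a value falls into the more stable class; a value of exactly 1.5 is placed in class 4 because class 5 is defined as >1.5.

```python
def stability_class(stability_index):
    """Map a stability index to (class number, label) following Table 1.

    Assumption: values on a shared boundary (0.5, 1.0, 1.25) go to the more stable class.
    """
    if stability_index < 0.5:
        return 1, "Very High Susceptibility"
    if stability_index < 1.0:
        return 2, "High Susceptibility"
    if stability_index < 1.25:
        return 3, "Moderate Susceptibility"
    if stability_index <= 1.5:
        return 4, "Low Susceptibility"
    return 5, "Very Low Susceptibility"

# Hypothetical stability index values for three grid cells.
example_classes = [stability_class(v) for v in (0.3, 1.1, 2.0)]
```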
Table 2. Data sources and classification methods of landslide environmental factors.

| Environmental Factor Type | Factor | Data Sources | Data Resolution | Classification Method |
|---|---|---|---|---|
| Topographic and Geomorphic Factors | DEM | ALOS Satellite | 12.5 m | Natural Break |
| | Slope Gradient | | | |
| | Slope Aspect | | | |
| | Topographic Relief | | | |
| | Plane Curvature | | | |
| | Profile Curvature | | | |
| Hydrological Environmental Factors | TWI | | | |
| | SPI | | | |
| | STI | | | |
| | Rainfall Erosivity | Geospatial Data Cloud | 30.0 m | |
| Land Cover Factors | NDVI | | | |
| | Land Use Type | National Earth System Science Data Center | | Equal Interval |
Table 3. Basic parameters for the SINMAP model.

| ρs (kg·m−3) | T/R (m), min | T/R (m), max | C (N·m−2), min | C (N·m−2), max | φ (°), min | φ (°), max |
|---|---|---|---|---|---|---|
| 1350 | 100 | 3000 | 0.35 | 0.54 | 15 | 25 |
Table 4. Multicollinearity analysis.

| Factors | VIF | TOL |
|---|---|---|
| DEM | 2.36 | 0.42 |
| Slope Gradient | 1.81 | 0.55 |
| Slope Aspect | 1.08 | 0.93 |
| Plane Curvature | 1.36 | 0.73 |
| Profile Curvature | 1.36 | 0.73 |
| TWI | 1.58 | 0.64 |
| STI | 1.42 | 0.70 |
| Rainfall Erosivity | 2.84 | 0.35 |
| NDVI | 1.50 | 0.67 |
| Land Use Type | 1.05 | 0.95 |
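The two collinearity statistics in Table 4 are reciprocals of each other (TOL = 1/VIF), which the following minimal sketch verifies from the tabulated values. The screening thresholds used here (VIF below 10 and TOL above 0.1) are a commonly used rule of thumb and are an assumption of this sketch, not thresholds quoted from this section.

```python
# VIF and TOL values copied from Table 4.
table4 = {
    "DEM": (2.36, 0.42),
    "Slope Gradient": (1.81, 0.55),
    "Slope Aspect": (1.08, 0.93),
    "Plane Curvature": (1.36, 0.73),
    "Profile Curvature": (1.36, 0.73),
    "TWI": (1.58, 0.64),
    "STI": (1.42, 0.70),
    "Rainfall Erosivity": (2.84, 0.35),
    "NDVI": (1.50, 0.67),
    "Land Use Type": (1.05, 0.95),
}

# Largest gap between the tabulated TOL and 1/VIF (nonzero only because of rounding).
max_gap = max(abs(1.0 / vif - tol) for vif, tol in table4.values())

# Assumed rule of thumb: a factor is flagged when VIF >= 10 or TOL <= 0.1.
flagged = [name for name, (vif, tol) in table4.items() if vif >= 10 or tol <= 0.1]
```

With the values in Table 4, the gap stays below 0.01 and no factor is flagged under the assumed thresholds.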
Table 5. Predictive performance of the models.

| Model | Sensitivity | Specificity | Accuracy |
|---|---|---|---|
| RF | 92.03% | 85.99% | 84.85% |
| SVM | 87.87% | 76.60% | 82.23% |
| Log | 83.71% | 66.75% | 79.38% |
| SINMAP | 81.21% | 59.98% | 70.59% |
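The three accuracy parameters in Table 5 are conventionally derived from a confusion matrix, as in the minimal sketch below. It assumes the standard definitions (sensitivity as the share of landslide points correctly predicted, specificity as the share of non-landslide points correctly predicted, and accuracy as the share of all points correctly predicted). The function name and the counts are hypothetical and are not taken from this study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # landslide points correctly predicted
    specificity = tn / (tn + fp)   # non-landslide points correctly predicted
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts: 100 landslide points and 100 non-landslide points.
sens, spec, acc = classification_metrics(tp=90, fn=10, tn=80, fp=20)
```

Under these definitions, accuracy is a weighted average of sensitivity and specificity, with weights given by the numbers of landslide and non-landslide points.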
MDPI and ACS Style

Feng, L.; Guo, M.; Wang, W.; Chen, Y.; Shi, Q.; Guo, W.; Lou, Y.; Kang, H.; Chen, Z.; Zhu, Y. Comparative Analysis of Machine Learning Methods and a Physical Model for Shallow Landslide Risk Modeling. Sustainability 2023, 15, 6. https://doi.org/10.3390/su15010006
