Article

Comparative Assessment of the Efficacy of the Five Kinds of Models in Landslide Susceptibility Map for Factor Screening: A Case Study at Zigui-Badong in the Three Gorges Reservoir Area, China

1
School of Civil Engineering, Architecture and Environment, Hubei University of Technology, Wuhan 430068, China
2
Innovation Demonstration Base of Ecological Environment Geotechnical and Ecological Restoration of Rivers and Lakes, Hubei University of Technology, Wuhan 430068, China
*
Author to whom correspondence should be addressed.
Sustainability 2023, 15(1), 800; https://doi.org/10.3390/su15010800
Submission received: 10 November 2022 / Revised: 8 December 2022 / Accepted: 14 December 2022 / Published: 1 January 2023
(This article belongs to the Special Issue Innovation and Sustainable Development of Remote Sensing Technology)

Abstract

Landslides are geological disasters influenced by a variety of factors; they are highly destructive, develop rapidly, and cause major harm to the lives and property of people within the affected area. Using too many landslide susceptibility mapping (LSM) factors can reduce the accuracy of LSM results and makes it harder for researchers to identify the key LSM factors. In this study, with the Zigui-Badong section of the Three Gorges Reservoir area as an example, the frequency ratio (FR), index of entropy (IOE), Relief-F algorithm, and weights-of-evidence (WOE) Bayesian model were used to rank and screen 20 LSM factors by importance; the LSMs generated from the different factor sets were then evaluated and scored. The results showed that IOE screening outperformed the FR, Relief-F, and WOE Bayesian models when no fewer than eight factors were retained; the score for 20 factors without screening was 45 points, and the score for 12 factors screened based on the IOE was 44.8 points, indicating that with IOE screening there is an optimal number of retained factors at which screening has little effect on the LSM results. The core factor set obtained by comparing the score increases with the corresponding added factors effectively improved the accuracy of the LSM results, thus verifying the effectiveness of the proposed method for ranking the importance of LSM factors. The method proposed in this study can effectively screen the key LSM factors and improve the accuracy and scientific soundness of LSM results.

1. Introduction

Landslides occur frequently in the Three Gorges Reservoir area of China and pose a huge risk to people’s lives and property there, so the efficient and accurate generation of landslide susceptibility maps is of great significance for alleviating the issues caused by landslides [1]. Landslides are harmful geological disasters caused by rapid changes in the natural environment: when the stress applied to the rock and soil exceeds their strength, the material moves downslope along a slip surface. This downslope movement is highly destructive and develops rapidly. Geomorphologists, geologists, engineering geologists, and other researchers in the international community have combined different strategies and methods to explore landslides in long-term research [2]. The occurrence of landslides is influenced by many factors. There are many methods for landslide susceptibility mapping (LSM), but they all use different models to model different factors, and there are few studies on the selection of LSM factors. In order to find a suitable factor-screening method for landslide susceptibility evaluation, this paper applies several classical statistical methods to screen factors and compares them.
LSM, in which factors from engineering geology are used to predict the probability of future landslides by statistically analyzing the factors that lead to landslides, is common in regional landslide studies [3]. In recent years, machine learning models have been widely used in LSM models, and they include logistic regression (LR) [4,5,6,7,8], random forest (RF) [8,9,10,11], support vector machine (SVM) [12,13,14,15], and artificial neural network (ANN) [16,17,18] models, among others. There is no single or specific model that can be described as the best scenario for all situations, and the various modeling approaches are compared in this overview. LR models can produce results very quickly and easily, but the conditions of use are strict; the RF model avoids overfitting and outliers, but takes quite a long time to run; SVM can balance overall performance and computation time, but they are also sensitive to data structures; ANN has faster convergence speed and better performance, but the results are difficult to interpret, and a large number of samples are needed to obtain a reliable model [19]. These methods quantify the influence of each factor on the incidence of landslides by identifying the important factors at the time of a landslide and then calculating the values or relative weights of these factors [20]. Therefore, no matter which method is used in LSM, the choice of LSM factor is very important.
Landslides are affected by a variety of factors, including geology, soil, forest cover, topography, and climatic conditions [20,21,22,23,24]. Geology, soil, forest cover, and terrain are important internal factors that form the foundation for the development of landslides, whereas climate, earthquakes, and human engineering activities are important factors that induce landslides. Remote sensing images with different resolutions yield LSM factors with different classification accuracies, which in turn affects the overall accuracy of LSM [25]. A large amount of remote sensing data about the Three Gorges Reservoir area can be obtained by using unmanned airborne systems (UASs) and lidar. However, a challenge faced in UAS data collection is that the growth in data volume has outpaced the availability of appropriate analytical methods [26]. When analyzing the data, the steep and rugged terrain and dense vegetation cover in the Three Gorges area often obscure the geomorphic features that indicate landslides, and the resolution of the remote sensing images obtained from the original data is too low [25]. Therefore, this study did not use the related factors extracted from remote sensing images. Because landslides differ in geographical location, the factors that lead to landslide events are uncertain; therefore, the effects of various factors should be considered in LSM [27]. However, in LSM, the number of factors considered is not always proportional to the effects of a landslide [28], and in fact, only a few factors or combinations of factors contribute significantly to the occurrence of landslides; identifying these factors is critical for accurate LSM [27]. 
If all LSM factors are considered, not only do correlations and multicollinearity among factors affect the weights during computations, but also considering too many factors can result in a high computational burden and is not conducive to highlighting the results of the study factors when modeling landslide susceptibility [29]. Screening the LSM factors, removing redundancy and collinearity among factors, and reducing the dimensionality of factors are important for strengthening the stability of LSM models [30]. Therefore, the use of appropriate methods to screen the LSM factors is an important research topic.
Statistical methods are the most widely used methods for factor screening in LSM [31]. The principle is to use statistical methods to determine the importance of each factor in relation to the occurrence of landslides and only retain the factors with the greatest contributions. The commonly used statistical methods are the certainty factor (CF) [32,33,34,35], frequency ratio (FR) [36,37,38,39], index of entropy (IOE) [40,41], weights-of-evidence (WOE) [36,40,42,43], and relief algorithm methods [44,45,46,47]. Wu et al. used rough set and correlation coefficient analysis to screen 12 key environmental factors from 22 overall landslide factors for LSM [1]. Dou et al. used the CF method to optimize 14 possible LSM factors and selected slope angle, aspect, diversion density network, distance from a geological boundary, distance from a fault, and lithology as factors for further analysis [48]. Niu et al. used CF, the sensitivity index (SI), and the correlation coefficient (CC) to analyze the suitability of including nine environmental factors in landslide analysis and finally selected four terrain factors, namely slope, slope of slope, slope shape, and surface roughness, to map landslide susceptibility together with other factors, and the results exhibited reasonable accuracy [49]. Djukem et al. analyzed the geomechanical properties of 11 representative soil samples using the IOE method, calculated the weight values of geological factors, and concluded that the spatial distribution of soil mechanical properties played an important role in the occurrence of landslides [50]. Wang et al. analyzed nine LSM factors using the maximum entropy model, and the results showed that road distance, rainfall, and land use were the main risk factors affecting landslide occurrence [51]. In the above LSM-related research, the screening of factors occurred only as a one-step process, and few papers have specifically studied the screening of landslide factors. 
There are many methods of screening factors that, to a certain extent, can indeed remove some unnecessary factors and factors that have little impact on LSM, but it is unclear whether the results obtained by using different methods to screen factors are the same and which screening method is most effective.
In this study, the Zigui to Badong section of the Three Gorges reservoir was taken as the research area. 40 LSM factors were obtained by collating and analyzing the collected geological, topographic, and hydrological data, remote sensing images, and other original data. Then, using the Pearson correlation coefficient (PCC) and the variance inflation factor (VIF) to remove redundant, highly correlated, and multicollinear factors, 20 LSM factors were retained. To further screen the LSM factors, the importance of these factors was sorted with four methods—the FR, IOE, Relief-F, and WOE Bayesian models—and different sets of important factors (six factors, eight factors, ten factors and twelve factors) were selected in order of importance from highest to lowest. Then, the same batches of training and validation samples were used to generate LSMs using an SVM. To compare the effectiveness of the four screening methods, the LSM results were evaluated based on the receiver operating characteristic (ROC) curve and the area under the curve (AUC), and the specific category precision analysis and the five statistical measures were applied. The evaluation results were comprehensively and quantitatively scored by using the scoring system for comparisons. Furthermore, to study the importance of LSM factors, the importance of LSM factors was reordered by comparing the score increases and the corresponding factor increases.

2. Overview of the Study Area and Data Introduction

2.1. Overview of the Research Area

The study area is located in the first area of the Three Gorges Reservoir of the Yangtze River, from Zigui to Badong, covering two county-level administrative districts and eleven township-level administrative districts. The geographical coordinates of the study area are 110°18′~110°52′ E and 30°01′~30°56′ N. The water system in the study area consists mainly of the Yangtze River and its main tributaries flowing through Badong and Zigui, and the total length of the major river basin is approximately 55 km. In terms of topography, the study area is located in the eastern part of two natural geographical units in the Three Gorges Reservoir area and forms a basin; the terrain along the river is low in the middle and high on both sides [52]. The Three Gorges area of the Yangtze River was affected by the Quaternary uplift, and lumpy Paleozoic and Mesozoic (Triassic Jialing River Group) limestone is severely cut along the narrow fault zone; the main geological structures in the east-west direction are the Huangling anticline, Zigui syncline, and Badong syncline, and the lithology is composed of Cretaceous rocks, Jurassic rocks, Quaternary loose rocks, Devonian clastic rocks, Triassic and Sinian carbonate rocks, and Precambrian crystalline rocks [53]. Geological disasters occur frequently in the study area, and landslides are the most prominent type of geological disaster. There have been 202 verified landslides with an area of about 23.4 km², accounting for 6.03% of the entire study area [54]. A schematic diagram of the location of the study area is shown in Figure 1.

2.2. Data and Software Sources

The data sources used in this study are shown in Table 1.
The spatial resolution of the DEM data matches the selected topographic map and geological map at a 1:50,000 scale and the landslide disaster map at a 1:10,000 scale. The software used in this study was as follows: ArcGIS 10.8, ENVI 5.3, SPSS Modeler 18, SPSS Statistics 26, and PyTorch 1.7.1. ArcGIS 10.8 and ENVI 5.3 are software developed by ESRI in Redlands, California, USA; SPSS Modeler 18 and SPSS Statistics 26 are software developed by IBM in Armonk, New York, USA; PyTorch 1.7.1 is a deep learning framework developed by Facebook in Menlo Park, CA, USA.

2.3. Definition of Data

According to the data sources in Table 1, a total of 40 LSM factors were calculated, sorted, and divided into three categories, as shown in Table 2.

2.4. Creation of Training and Validation Samples

According to Table 1, 30 m × 30 m grid cells are used for calculation in this research area. After removing invalid data, 423,787 raster cells were obtained. Among them, there were 25,213 landslide grid units and 398,574 non-landslide grid units. In the study, 70% of the landslide grid units and the same number of non-landslide grid units were randomly selected as training samples, and the remainder of the grid units were used as validation samples. The SVM was used for LSM.
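The sampling scheme described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `split_samples`, the use of integer cell identifiers, and the fixed random seed are all assumptions made for the example.

```python
import random

def split_samples(landslide_ids, non_landslide_ids, train_fraction=0.7, seed=0):
    """Draw a balanced training set; every remaining cell becomes a validation sample.

    landslide_ids / non_landslide_ids: lists of unique cell identifiers (hypothetical).
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    n_train = round(train_fraction * len(landslide_ids))
    train_pos = rng.sample(landslide_ids, n_train)
    # The same number of non-landslide cells is drawn to balance the classes.
    train_neg = rng.sample(non_landslide_ids, n_train)
    train = set(train_pos) | set(train_neg)
    validation = [c for c in landslide_ids + non_landslide_ids if c not in train]
    return sorted(train), validation
```

Applied to the counts above, a 70% draw from 25,213 landslide cells gives about 17,649 landslide training cells and the same number of non-landslide training cells; the exact figure depends on how the authors rounded.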

3. Introduction to the Research Methods

3.1. Factor Pretreatment Method

3.1.1. PCC

The Pearson correlation coefficient (PCC), also known as the Pearson product-moment correlation coefficient, reflects the degree of linear correlation between two variables [55]; the greater its absolute value, the stronger the correlation. The value of the PCC lies between −1 and 1, so the correlation between two variables can range from a negative correlation to a positive correlation. When the PCC is 0, the two variables have no linear correlation, although this alone does not guarantee that they are independent of each other [56]. This article treats two variables with a PCC greater than 0.6 as strongly correlated, and one of the two variables is removed to eliminate correlation issues during analyses.
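A PCC-based filter of this kind might look like the sketch below. It is illustrative only: the greedy rule (keep the first factor of a strongly correlated pair), the comparison on the absolute value of the PCC, and the names `pearson` and `drop_correlated` are assumptions, since the text does not say which factor of a pair is removed.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_correlated(factors, threshold=0.6):
    """Keep a factor only if |PCC| <= threshold against every factor kept so far.

    factors: dict mapping factor name -> list of values (one per grid cell).
    """
    kept = []
    for name in factors:
        if all(abs(pearson(factors[name], factors[k])) <= threshold for k in kept):
            kept.append(name)
    return kept
```

Because the filter is greedy, the result depends on the order in which factors are listed; a study would normally decide which member of a pair to drop on domain grounds.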

3.1.2. VIF

VIF is a measure of the severity of multicollinearity in multiple linear regression models [57]. It is the ratio of the variance of a regression coefficient estimate to the variance that estimate would have if the corresponding independent variable were not linearly correlated with the other independent variables. In this article, factors with VIF values between 1 and 10 are regarded as having an acceptable degree of multicollinearity.
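For reference, the usual textbook form of the VIF is given below; that the authors computed it in exactly this form is an assumption. Here $R_j^2$ denotes the coefficient of determination obtained by regressing factor $j$ on all the other factors, and TOL is the tolerance, the reciprocal of the VIF.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}, \qquad
\mathrm{TOL}_j = \frac{1}{\mathrm{VIF}_j} = 1 - R_j^2
```

Under this definition, the condition VIF < 10 is equivalent to TOL > 0.1, that is, $R_j^2 < 0.9$.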

3.2. Factor-Screening Method

3.2.1. FR

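The correlation between the landslide distribution and the factors related to landslide genesis can be quantified using the FR. For a given class of a factor, the FR is commonly defined as the ratio of the share of landslide area falling in that class to the share of the entire study area occupied by that class; it can also be expressed as the ratio of the probability of landslide occurrence to that of nonoccurrence for a given attribute [4].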
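As an illustration, the class-wise FR can be computed from raster cell counts as sketched below. The function name and the dictionary layout are hypothetical, and the formula is the commonly used area-share form, which may differ in detail from the one the authors applied.

```python
def frequency_ratio(landslide_cells, total_cells):
    """FR per class = (share of landslide cells in class) / (share of all cells in class).

    landslide_cells: dict class -> number of landslide cells in that class
    total_cells:     dict class -> number of all cells in that class
    """
    total_landslides = sum(landslide_cells.values())
    total_area = sum(total_cells.values())
    fr = {}
    for cls, n_cells in total_cells.items():
        landslide_share = landslide_cells.get(cls, 0) / total_landslides
        area_share = n_cells / total_area
        fr[cls] = landslide_share / area_share
    return fr
```

An FR above 1 means that landslides are over-represented in a class relative to its area, and an FR below 1 means that they are under-represented.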

3.2.2. IOE

The IOE method builds on the fuzzy comprehensive evaluation method, a comprehensive evaluation method based on fuzzy theory that is used to describe the uncertainty of things; the degree of dispersion of an index is assessed from the entropy value of its information [58]. The greater the entropy value of an indicator, the smaller its weight, and vice versa.
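One formulation of the index of entropy that is widely used in landslide studies is sketched below; it is an assumption that the authors used this exact variant. The class-wise FR values of a factor are normalized into a probability distribution, the entropy of that distribution is compared with its maximum, and the resulting information coefficient is multiplied by the mean FR. The function name `ioe_weight` is hypothetical.

```python
from math import log2

def ioe_weight(fr_by_class):
    """Index-of-entropy weight of one factor from the FR values of its classes.

    Requires at least two classes and a positive FR sum.
    """
    s = len(fr_by_class)
    total = sum(fr_by_class)
    probs = [fr / total for fr in fr_by_class]        # normalized landslide density
    entropy = -sum(p * log2(p) for p in probs if p > 0)
    max_entropy = log2(s)                             # entropy of a uniform distribution
    info_coeff = (max_entropy - entropy) / max_entropy
    mean_fr = total / s
    return info_coeff * mean_fr
```

A factor whose classes all have the same FR has maximum entropy and therefore a weight of zero, whereas a factor with landslides concentrated in a single class receives the largest information coefficient.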

3.2.3. Relief-F Algorithm

As an individual-evaluation filter method for feature selection, Relief calculates a proxy statistic for each feature, and these results can be used to estimate feature quality or relevance to a target concept (i.e., predicting endpoint values). The original Relief algorithm is rarely applied in practice and has been replaced by Relief-F [59]. The principle of the Relief-F algorithm is to randomly extract a sample R from the sample set T; then, the k nearest neighbors of R are selected from the samples of the same class as R (near hits, H), and the k nearest neighbors of R are selected from the samples of each class different from that of R (near misses, N); the feature weights are updated from the differences between R and these neighbors, and the whole process is repeated m times [60].
When the Relief-F algorithm is used to evaluate the predictive ability of different landslide evaluation factors, a large weight indicates that the corresponding factor has a strong influence, whereas a small weight indicates that its influence is weak [61].
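The procedure can be sketched for the two-class case (landslide versus non-landslide) as follows. This is a simplified illustration, not the authors' implementation: the function name `relieff`, the range-based scaling of feature differences, the Manhattan distance, and sampling with replacement are assumptions.

```python
import random

def relieff(X, y, k=3, m=None, seed=0):
    """Two-class Relief-F weights. X: list of feature rows; y: class labels."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    m = m or n  # number of sampled instances (defaults to the sample count)
    # Feature ranges scale every per-feature difference into [0, 1].
    spans = []
    for f in range(d):
        col = [row[f] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(f, a, b):
        return abs(a[f] - b[f]) / spans[f]

    def distance(a, b):
        return sum(diff(f, a, b) for f in range(d))

    weights = [0.0] * d
    for _ in range(m):
        i = rng.randrange(n)
        r = X[i]
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: distance(r, X[j]))
        hits = [j for j in others if y[j] == y[i]][:k]     # nearest same-class samples
        misses = [j for j in others if y[j] != y[i]][:k]   # nearest other-class samples
        for f in range(d):
            # A good feature differs little from hits and strongly from misses.
            if hits:
                weights[f] -= sum(diff(f, r, X[j]) for j in hits) / (m * len(hits))
            if misses:
                weights[f] += sum(diff(f, r, X[j]) for j in misses) / (m * len(misses))
    return weights
```

Features with larger weights separate the two classes better; ranking the factors by weight gives the importance order used for screening.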

3.2.4. WOE Bayesian Model

The WOE Bayesian model combines weights-of-evidence with Bayes’ theorem. The log-linear form of the Bayesian probability model is used to estimate the WOE of each independent variable with respect to the dependent variable and thereby to judge the importance of each independent variable; it is a standard, quantitative, data-driven statistical method [62].
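The standard weights-of-evidence quantities for a single factor class can be computed from cell counts as in the sketch below. How the authors aggregate these values into a factor-level importance is not stated here, so the sketch stops at the positive weight, the negative weight, and their contrast; the function and argument names are hypothetical.

```python
from math import log

def weights_of_evidence(class_landslide, class_total, all_landslide, all_total):
    """Return (W+, W-, contrast) for one factor class from raster cell counts.

    All four conditional shares must be strictly positive.
    """
    class_stable = class_total - class_landslide
    all_stable = all_total - all_landslide
    # W+ = ln[ P(class | landslide) / P(class | no landslide) ]
    w_plus = log((class_landslide / all_landslide) / (class_stable / all_stable))
    # W- = ln[ P(not class | landslide) / P(not class | no landslide) ]
    w_minus = log(((all_landslide - class_landslide) / all_landslide)
                  / ((all_stable - class_stable) / all_stable))
    return w_plus, w_minus, w_plus - w_minus
```

A positive contrast indicates that the class is positively associated with landslide occurrence, and a negative contrast indicates the opposite.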

3.3. SVM

The SVM model, first proposed by Vapnik [63], has many unique advantages for solving small-sample, nonlinear, and high-dimensional pattern recognition problems and can be generalized to other machine learning problems, such as function fitting [7,64]. Assuming that a linearly separable set of training vectors $x_i\ (i = 1, 2, \ldots, n)$ belongs to two classes $y_i = \pm 1$, the SVM searches for a separating hyperplane, of one dimension less than the feature space, such that the two classes are separated and spaced as far apart as possible [65].
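For the linearly separable case described above, the textbook hard-margin formulation is shown below, with training vectors $x_i$ and labels $y_i = \pm 1$. The symbols $w$ (normal vector of the hyperplane) and $b$ (offset) are introduced here for illustration; the kernel and parameters actually used in the study are not specified in this section.

```latex
\min_{w,\, b} \ \frac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \left( w^{\top} x_i + b \right) \ge 1, \qquad i = 1, 2, \ldots, n
```

The separating hyperplane is $w^{\top} x + b = 0$, and the margin between the two classes is $2 / \lVert w \rVert$, so minimizing $\lVert w \rVert$ maximizes the spacing between the classes.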

3.4. Evaluation Methods

To clarify the influence of different combinations of factors obtained with different screening methods on the LSM results, the different sets of LSM results were analyzed from different perspectives by using the ROC curve, the specific category precision analysis method, and the five statistical measures.

3.4.1. ROC Curve Analysis

The ROC curve is a useful method for assessing the predictive power of a model, and it is often used in LSM. The vertical axis of the ROC curve is the true positive rate (TPR = TP/(TP + FN)), the horizontal axis is the false positive rate (FPR = FP/(FP + TN)), and the curve runs from point (0,0) to point (1,1) [66,67,68]. This approach is detailed in Table 3.
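The construction of the curve and the AUC can be sketched as follows: the threshold is swept over the predicted scores in descending order, a (FPR, TPR) point is recorded at each distinct score, and the area is integrated with the trapezoidal rule. The study itself used SPSS for this step; the function name `roc_auc` and the 0/1 label encoding are assumptions of the example.

```python
def roc_auc(scores, labels):
    """ROC points and AUC; labels are 1 (landslide) or 0 (non-landslide).

    Both classes must be present, otherwise the rates are undefined.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all samples tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    # Trapezoidal integration of TPR over FPR.
    auc = sum((x1 - x0) * (y0 + y1) / 2
              for (x0, y0), (x1, y1) in zip(points, points[1:]))
    return points, auc
```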

3.4.2. Specific Category Precision Analysis

The specific category precision analysis method is an improved quantitative analysis method that takes into account the number of computational units within the prediction area [52]. This method can be expressed as:
$$P_i = \frac{A_i}{B_i} \times 100\%$$
where $i = 1, 2, \ldots, n$; $n$ is the number of landslide zoning classifications; $A_i$ is the number of slope units occupied by landslides in the $i$th landslide-prone area; $B_i$ is the number of slope units in the $i$th landslide-prone zone; and $P_i$ is the specific classification accuracy of the division of the $i$th landslide-prone area.

3.4.3. Evaluation with Five Statistical Measures

Overall accuracy (OA), precision, recall, the F-measure, and the Matthews correlation coefficient (MCC) are five common statistical measures used to assess the ability of classification models based on confusion matrices [69,70]. In LSM, the higher the OA value, the higher the prediction accuracy over the whole study area. The higher the precision value, the larger the proportion of predicted landslides that are actual landslides. The higher the recall value, the larger the proportion of actual landslides that are correctly predicted. The F-measure is the weighted harmonic mean of precision and recall, and a high F-measure indicates that the test method is effective. The MCC describes the correlation between the actual classification and the predicted classification; a value of 1 indicates a perfect prediction, a value of 0 indicates that the prediction is no better than a random prediction, and a value of −1 indicates that the predicted classification and the actual classification are completely inconsistent.
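The five measures follow directly from the confusion-matrix counts, as in the sketch below. The function name is hypothetical, and the default `beta = 1` (equal weighting of precision and recall in the F-measure) is an assumption.

```python
from math import sqrt

def classification_measures(tp, fp, tn, fn, beta=1.0):
    """OA, precision, recall, F-measure, and MCC from confusion-matrix counts."""
    oa = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Weighted harmonic mean of precision and recall (F1 when beta = 1).
    f_measure = (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)
    mcc = (tp * tn - fp * fn) / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"OA": oa, "precision": precision, "recall": recall,
            "F": f_measure, "MCC": mcc}
```

With `beta = 1`, a precision of 0.1207 and a recall of 0.7938 give 2 × 0.1207 × 0.7938 / (0.1207 + 0.7938) ≈ 0.2095, which agrees with the F-measure reported for the 20-factor case in Section 4.5.3.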

4. Experiments

The overall workflow of this study is shown in Figure 2.

4.1. Data Preprocessing

To reduce the collinearity and redundancy of landslide-related factors, the PCC was used to generate a correlation matrix. Then, the strongly correlated factors (correlation greater than 0.6) were removed, and 20 LSM factors (terrain surface texture, total curvature, TPI, TWI, valley depth, SPI, slope length, slope height, slope, MRN, mid-slope position, fault, flow line curvature, flow width, lithology, elevation, convexity, convergence index, slope structure, and aspect) were screened. The results are shown in Figure 3.
Multicollinearity analyses were performed by the VIFs for the 20 LSM factors screened above, and the results are shown in Table 4.
All the factors in Table 4 meet the conditions of TOL > 0.1 and VIF < 10, and the 20 factors selected passed the multicollinearity test.
In summary, the initial 20 landslide factors that passed the PCC and VIF tests are shown in Figure 4.

4.2. Factor Importance Screening

After 40 factors were screened based on the PCC and VIF tests to obtain 20 LSM factors, the highly correlated and multicollinear factors were eliminated. However, the factors that had little impact on the LSM results were not screened and removed. Therefore, an importance analysis of the landslide factors was performed. In this study, four methods, the FR, IOE, Relief-F, and WOE Bayesian modeling methods, were used to screen the LSM factors.
The relationship between landslide occurrence and the LSM factors was calculated using the FR, IOE, Relief-F algorithm, and WOE Bayesian modeling methods, and the results are shown in Table 5.
According to Table 5, the factor importance was sorted by four screening methods from highest to lowest, and the results are shown in Table 6.

4.3. SVM Modeling

According to Table 6, the six, eight, ten, and twelve most important factors identified by each screening method were selected as LSM factors, giving a total of 16 groups. Together with the full set of 20 factors as one further group, there were 17 groups of LSM factors in total. The SVM was used to model these 17 groups of factors to generate the landslide susceptibility index (LSI). The SVM modeling process is shown in Figure 5.

4.4. Experimental Results

4.4.1. LSI Chart

To compare the effectiveness of different screening methods at different screening degrees, four sets of trials were conducted in this study. Based on Table 5, in the first, second, third, and fourth experiments, the top six, eight, ten, and twelve important factors were selected by each screening method. The training samples for the four sets of factors and all the factors were input into the SVM. Then, the LSM model was established, and the LSI of the study area was obtained by using all the samples. The experimental results are shown in Figure 6.

4.4.2. Landslide Susceptibility Zonation (LSZ)

The LSI is a continuous variable that ranges from 0 to 1. In this study, to increase the readability of the landslide susceptibility index maps, all landslide susceptibility indices were divided into five susceptibility levels using the manual threshold method based on the calculation results: very low (0–0.5), low (0.5–0.7), moderate (0.7–0.8), high (0.8–0.9), and very high (0.9–1.0) [71,72]. The results are shown in Figure 7.
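The zonation amounts to a simple threshold lookup, sketched below. The stated intervals share their endpoints, so the treatment of a value lying exactly on a break (assigned here to the higher level) is an assumption, as are the names `BREAKS`, `LEVELS`, and `susceptibility_level`.

```python
from bisect import bisect_right

BREAKS = [0.5, 0.7, 0.8, 0.9]  # manual thresholds stated in the text
LEVELS = ["very low", "low", "moderate", "high", "very high"]

def susceptibility_level(lsi):
    """Map an LSI value in [0, 1] to one of the five susceptibility levels."""
    if not 0.0 <= lsi <= 1.0:
        raise ValueError("LSI must lie between 0 and 1")
    # bisect_right counts how many breaks are <= lsi, which is the level index.
    return LEVELS[bisect_right(BREAKS, lsi)]
```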

4.5. Analysis of the Experimental Results

4.5.1. ROC Curve

Using SPSS statistical software, the LSM results generated based on the factors screened by the FR, IOE, Relief-F algorithm, and WOE Bayesian modeling methods were used to construct ROC curves. Additionally, AUC values were calculated, and the results are shown in Figure 8 and Table 7.
In Figure 8, (a) compares the use of different screening methods to the same degree and compares the results based on 20 factors. In Figure 8, (b)–(e) compare the different degrees of screening using the same methods. Figure 8 shows that as the number of factors increases, the area under the ROC curve increases. In plot (a), the ROC curves of FR-12, IOE-12, Relief-F-12, WOE-12, and LSM with 20 factors are similar. As shown in plots (b) and (e), the ROC curves of FR-6, FR-8, FR-10, and FR-12 are very similar, the ROC curves of WOE-6, WOE-8, WOE-10, and WOE-12 are very similar, and the different degrees of factor screening with the FR and WOE Bayesian modeling methods have little effect on the ROC curve in LSM. For IOE screening in plot (c), the ROC curves of IOE-8, IOE-10, and IOE-12 are very similar, but they are very different from the ROC curve of IOE-6. In plot (d), the ROC curves of Relief-F-8, Relief-F-10, and Relief-F-12 are very similar and plot far from the ROC curve of Relief-F-6.
Table 7 shows that the AUC value increases as the number of factors increases. Notably, when the number of factors increased from six to eight, the AUC values changed little, except for the factors screened by the IOE method. For the increases from eight to ten factors and from ten to twelve factors, the increase in the AUC values was small for all four screening methods. The AUC values for 12 factors obtained with the four screening methods were all between 0.9025 and 0.9103 and were lower than that in the case of 20 factors (AUC value of 0.9107).

4.5.2. Specific Category Precision Analysis

The LSM results of the specific category precision analysis based on the four factor-screening methods are shown in Figure 9.
Based on Figure 9, IOE-12 yields the highest specific category precision in the predicted very-high areas of landslides, with a value of 27.06%. The second highest value is observed for IOE-10 at 26.76%. The specific category precision obtained with 20 factors is only the fourth highest, with a value of 26.61%. WOE-12, which ranked fourth in terms of AUC value in Section 4.5.1, ranked only eleventh in specific category precision, with a value of 23.14%.

4.5.3. Evaluation of Statistical Measures

The results of the calculation of five statistical measures, namely OA, precision, recall, the F-measure, and the MCC, are shown in Figure 10.
Figure 10 shows that as the number of factors increases, the OA, precision, F-measure, and MCC all increase. A comparison indicates that, with the exception of recall, the OA, precision, F-measure, and MCC of the 20-factor case were the highest. The WOE-12 and WOE-10 scenarios yield the closest values of the five statistical measures to those in the 20-factor case. IOE-6 yields the lowest values of all five statistical measures. The lowest recall rate was associated with IOE-6, with a value of 0.7577, and the second lowest recall rate was for the case with 20 factors, with a value of 0.7938.
In summary, Figure 8 shows that, apart from the 20-factor case, IOE-12 performs best, and its ROC curve nearly coincides with that of the 20-factor case. Table 7 indicates that the AUC value for the 20-factor case is 0.9107 and that for IOE-12 is 0.9103, which are similar. The highest specific category precision in the very-high area in Figure 9 is observed for IOE-12, with a value of 27.06%, and the specific category precision for the 20-factor case is 26.61% for these areas, ranking only fourth. The ROC curves, AUC values, and specific category precision results for very-high landslide areas suggest that the LSM accuracy of IOE-12 and the 20-factor case is similar and that both yield accurate LSI results. The 20-factor, IOE-12, and IOE-10 cases all yield reasonable AUC and specific category precision values for very-high landslide areas, but not all five statistical measures are good in each case. Figure 10 shows that the OA in the 20-factor case is 84.19%, precision is 0.1207, recall is 0.7938, the F-measure is 0.2095, and the MCC is 0.2970; the OA, precision, F-measure, and MCC are the highest, but recall is relatively low, ranking second from the bottom in the 17 groups of experiments. The reason for the high OA, precision, F-measure, and MCC values but low recall in the 20-factor case is that FP is relatively small and FN is relatively large in the confusion matrix, which indicates that in the 20-factor landslide prediction result, raster units that were originally non-landslides are less often predicted as landslides, but more actual landslides are predicted to be non-landslides. In Figure 10, the OA, precision, recall, F-measure, and MCC results are all good for IOE-12, with values of 81.16%, 0.1077, 0.8423, 0.1910, and 0.2823, respectively. 
Among the five statistical measures in the 17 groups of experiments, the OA of IOE-12 ranked second, and the precision, recall, F-measure, and MCC all ranked sixth; the difference between the five calculated indicator values and the highest values of these indicators in the 17 groups of experiments was small. Thus, the overall prediction effect was the best for IOE-12, and this approach considers safety and cost. In summary, the importance of the LSM factors obtained by using different screening methods was different; the IOE-12 and 20-factor cases yielded similar ROC curve, specific category precision, and statistical measure results; as the degree of screening decreased, the LSM results obtained when more factors were retained improved; additionally, when no fewer than eight factors were retained, the IOE method outperformed FR, Relief-F, and WOE Bayesian modeling methods.

5. Discussion

5.1. Quantitative Analysis of the LSM Evaluation Results

To facilitate a comparison of the advantages and disadvantages of the methods, a scoring method is used to comprehensively quantify the ROC curve, specific category precision, and five statistical measure results in LSM. The scoring rules are as follows: the number of models in this study is 17, so for each evaluation criterion the highest possible score is 17 points and the lowest is 1 point. The score for the statistical measures is the average of the scores for the five measures. The scores are shown in Table 8.
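One reading of these scoring rules is sketched below: each model is ranked on every criterion, the best model receives as many points as there are models, the worst receives 1, and the total is the AUC score plus the specific category precision score plus the mean score over the five statistical measures. That the three components are simply summed, and how ties are broken, are assumptions; the function names are hypothetical.

```python
def rank_scores(values, higher_is_better=True):
    """Score models by rank: best gets len(values) points, worst gets 1.

    values: dict model name -> metric value. Ties are broken arbitrarily.
    """
    order = sorted(values, key=values.get, reverse=not higher_is_better)
    return {model: i + 1 for i, model in enumerate(order)}

def total_score(auc, category_precision, measures):
    """Sum of AUC score, precision score, and mean score of the statistical measures.

    measures: dict measure name -> (dict model name -> value).
    """
    auc_s = rank_scores(auc)
    cp_s = rank_scores(category_precision)
    measure_s = [rank_scores(m) for m in measures.values()]
    return {
        model: auc_s[model] + cp_s[model]
        + sum(s[model] for s in measure_s) / len(measure_s)
        for model in auc
    }
```

As a consistency check under this reading, the ranks reported for the 20-factor case in Section 4.5 (first in AUC, fourth in specific category precision, first in OA, precision, F-measure, and MCC, and second from the bottom in recall) give 17 + 14 + (4 × 17 + 2) / 5 = 45 points, which matches the reported score.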
Based on Table 8, the score increases with the number of retained factors, indicating that when the degree of factor screening is low, retaining more factors improves the effectiveness of LSM to a certain extent; however, the degree of improvement varies among methods. In this study, when the number of retained factors was greater than 10, the ranking was IOE > FR > WOE Bayesian model > Relief-F. When eight factors were retained, the ranking was IOE > WOE Bayesian model > FR > Relief-F. When six factors were retained, the ranking was WOE Bayesian model > FR > Relief-F > IOE. Therefore, the following conclusions can be drawn: (1) Relief-F is relatively ineffective at the different levels of screening; (2) when the degree of screening is high, the IOE method is not appropriate because key factors are missed; (3) when eight or more factors are retained, IOE performs best among the four methods; (4) when the degree of screening is high, the WOE Bayesian model performs better than FR, and when the degree of screening is relatively low, FR performs better than the WOE Bayesian model; (5) the 20-factor LSM scored 45 points, while IOE-12 scored 44.8 points, a small difference. Therefore, the 12 factors retained by IOE screening in this study (TWI, TPI, slope, terrain surface texture, the convergence index, convexity, elevation, fault, mid-slope position, slope structure, MRN, and aspect) can ensure the accuracy of the landslide susceptibility results, and the remaining eight factors (flow width, slope height, valley depth, slope length, flow line curvature, lithology, SPI, and total curvature) were considered to have little impact on the occurrence of landslides and are noncritical factors.
In a review paper, Reichenbach et al. counted the LSM factors used in landslide studies and found that 596 factors had been used, of which 445 were used only once or twice. Of the remaining factors, 10.5% were slope factors, 9.2% were geo-lithological factors, 8.1% were aspect factors, and 7.3% were river/catchment factors; additionally, curvature factors accounted for 7.2%, other morphometric factors accounted for 5.6%, elevation factors accounted for 5.2%, soil factors accounted for 5.2%, distance to fault accounted for 3.5%, and geo-structural factors accounted for 3.4% [20]. The 12 factors retained in IOE screening are all commonly used and important factors in landslide studies.

5.2. Reordering the Retained LSM Factors Based on the Increase in Scores and the Increase in Related Factors

Combining Table 6 and Table 8 shows that the factor importance rankings obtained with the different screening methods differ and that, as the number of factors increases, the AUC value, the specific category precision, and the five statistical measures all increase, but to different degrees. Where Table 8 shows a large increase in score as factors are added, the added factors are likely important for landslide occurrence prediction. To facilitate a comparison of the increases in score for different numbers of factors, Figure 11 was created.
As shown in Figure 11a, the score in the 20-factor case is 45 points, and the IOE-12 score is 44.8 points, indicating that eight factors (flow width, slope height, valley depth, slope length, flow line curvature, lithology, SPI, and total curvature) have little effect on landslide occurrence prediction and are noncritical factors; thus, they can be removed directly from the analysis. As shown in Figure 11b, the highest increase in score, 32.6 points, is obtained when the number of IOE screening factors is increased from six to eight, indicating that considering the elevation and fault factors leads to an increase in the score. As shown in Figure 11c, the second highest score increase, 19.6 points, is associated with the use of the WOE Bayesian model and six screening factors. After removing the unimportant factors, three factors were retained (elevation, slope, and terrain surface texture); thus, the score increase was caused by at least one of these three factors. As shown in Figure 11b,c, the two factor sets with the highest score increases both included elevation, so the addition of elevation may have led to a significant increase in scores. Similarly, in Figure 11c,d, the two factor sets with the highest score increases both included elevation and terrain surface texture, so the addition of these factors may have led to a significant increase in scores. By analogy, the remaining factors can also be assessed from multiple sets of comparisons: if a factor is repeatedly among those added in groups with large score increases, that factor is likely responsible for the increase in score. Ranked from the highest to the lowest score increase, the importance of the factors is as follows: elevation, terrain surface texture, slope, TWI, convexity, slope structure, mid-slope position, convergence index, fault, aspect, MRN, and TPI.
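The attribution logic described above can be sketched in a few lines. This is only an illustrative formalization, not the authors' procedure, which is a qualitative comparison across Figure 11: the function name and the tie-breaking rule (occurrence count first, then summed score increase) are assumptions. The two input comparisons use the score increases and factor sets quoted in the text.

```python
from collections import defaultdict

def rank_factors_by_gain(groups):
    """Rank factors by how often they appear in comparisons with a score
    increase, breaking ties by the summed increase (hypothetical rule).

    groups: list of (score_increase, factors_involved) pairs.
    """
    total = defaultdict(float)
    count = defaultdict(int)
    for gain, factors in groups:
        for factor in factors:
            total[factor] += gain
            count[factor] += 1
    return sorted(total, key=lambda f: (count[f], total[f]), reverse=True)

groups = [
    # IOE-6 -> IOE-8: +32.6 points, elevation and fault added (from the text)
    (32.6, ["elevation", "fault"]),
    # WOE-6: +19.6 points, three candidate factors (from the text)
    (19.6, ["elevation", "slope", "terrain surface texture"]),
]
ranking = rank_factors_by_gain(groups)
print(ranking)
```

With only these two comparisons, elevation comes first because it is the only factor present in both groups; the order of the remaining factors is not meaningful until more comparisons are added.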
To verify the correctness of these results, six core factors, eight core factors, and ten core factors were selected according to the importance of factors obtained in the new ranking, and after performing LSM using the SVM model, the results were evaluated based on the ROC curve, specific category precision, and five statistical measures. The evaluation results for the three groups of experiments were scored, and the results are shown in Table 9.
A comprehensive analysis of Table 8 and Table 9 shows that the score for all 10 core factors is higher than the scores for the 10 important factors individually. The score for the eight core factors is much higher than the scores for the eight important factors individually and higher than those for FR-12, WOE-12, and Relief-F-12. The score for the six core factors was higher than that for the combination of all six important factors and those for FR-8 and Relief-F-10.
The significant increase in score after reordering the factors based on the score increases indicates that the earlier assumption is correct: factors that are present in multiple groups with large score increases are the most important. Therefore, among the FR, IOE, Relief-F, and WOE Bayesian modeling methods, IOE is the best screening method when eight or more factors are retained. In addition, the factors were ranked by importance using the FR, IOE, Relief-F, and WOE Bayesian modeling methods at different degrees of screening and then reranked according to the score increases and a comprehensive comparison; this approach improved the accuracy of the importance ranking of factors and increased the effectiveness of factor screening.

6. Conclusions

In this paper, with the Zigui-Badong section of the Three Gorges Reservoir area as an example, 40 factors were extracted from the collected geological, topographical, hydrological, remote sensing, and other data. After PCC was used to remove factors with a correlation greater than 0.6 and VIF was used to remove factors exhibiting multicollinearity (VIF greater than 10 or tolerance less than 0.1), 20 LSM factors were obtained. FR, IOE, Relief-F, and WOE Bayesian modeling methods were used to rank the importance of the 20 LSM factors. To study the influence of the degree of screening of these four methods on LSM, different sets of important factors (six factors, eight factors, ten factors, and twelve factors) were screened according to the importance of the factors for comparison. Then, the same training and validation samples were used with each screened factor set to generate LSM results with an SVM, and the LSM results generated with different factor combinations were evaluated based on ROC curves, specific category precision analysis, and five statistical measures. To facilitate comparison, the evaluation results were scored by comprehensive quantitative analysis. By scoring the LSM results generated with the important factor sets obtained using the four screening methods with different screening degrees, it was found that when simplified to eight factors, ten factors, and twelve factors, the effect of IOE screening was the best; however, the effect of IOE screening was very poor when six factors were retained. In addition, the screening effect of Relief-F was worse than that of the FR and WOE Bayesian models in most cases. The evaluation score of the LSM result with IOE screening for 12 factors was 44.8 points, and that in the 20-factor case in LSM was 45 points, a small difference.
Thus, it was inferred that eight factors (flow width, slope height, valley depth, slope length, flow line curvature, lithology, SPI, and total curvature) have little effect on the LSM result and are noncritical factors; thus, they were screened out. In the discussion, the four screening methods were comprehensively compared, and the importance of the remaining 12 factors was ranked by comparing the score increases of the four screening methods at different screening degrees; from most to least important, the factors were as follows: elevation, terrain surface texture, slope, TWI, convexity, slope structure, mid-slope position, convergence index, fault, aspect, MRN, and TPI. According to these twelve reordered factors, the six, eight, and ten most important core factors were selected, and SVM modeling was performed. The ROC curves, specific category precision analysis results, and five statistical measures were used to evaluate the LSM results, and it was found that the core factor sets improved on the factor sets obtained with the four screening methods. Not only were the scores improved, but the accuracy of the importance ranking of factors also increased after reordering. Therefore, by using the FR, IOE, Relief-F, and WOE Bayesian models to screen the factors, comprehensively comparing the score increases achieved with the four methods and the factors that led to the increases, and reordering the importance of the factors, the accuracy of the importance scores and rankings of factors can be improved.
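The collinearity pre-screening summarized in these conclusions (a PCC threshold of 0.6 and a VIF threshold of 10) could be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the greedy rule for deciding which factor of a correlated pair to drop, and the synthetic data are hypothetical, and VIF is computed from its textbook definition, VIF_j = 1 / (1 - R_j^2).

```python
import numpy as np

def drop_correlated(X, names, threshold=0.6):
    """Keep a factor only if |PCC| with every already-kept factor is <= threshold.

    The greedy, order-dependent choice of which factor to drop is an assumption.
    """
    corr = np.corrcoef(X, rowvar=False)
    kept = []
    for j in range(X.shape[1]):
        if all(abs(corr[j, k]) <= threshold for k in kept):
            kept.append(j)
    return X[:, kept], [names[j] for j in kept]

def vif(X):
    """Variance inflation factor of each column: 1 / (1 - R^2), where R^2 comes
    from regressing that column on all other columns plus an intercept."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        out[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return out

# Synthetic demonstration: factor "c" is almost a copy of factor "a".
rng = np.random.default_rng(0)
a, b = rng.normal(size=500), rng.normal(size=500)
c = a + 0.1 * rng.normal(size=500)
X = np.column_stack([a, b, c])
X_kept, kept_names = drop_correlated(X, ["a", "b", "c"])
print(kept_names, vif(X_kept).round(2))
```

In this toy case the near-duplicate factor is removed by the correlation filter, and the surviving factors have VIF values well below 10.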

Author Contributions

Writing-original draft preparation, T.X. and X.Y.; conceptualization, T.X. and X.Y.; writing-review and editing, T.X. and X.Y.; validation, X.Y. and J.Z.; visualization, X.Y., W.J. and J.Z.; formal analysis, T.X. and J.Z.; investigation, X.Y. and J.Z.; resources, T.X. and W.J.; data curation, W.J. and J.Z.; methodology, T.X., X.Y. and W.J.; supervision, T.X.; project administration, T.X.; software, X.Y. and J.Z.; funding acquisition, T.X. and W.J. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by the National Natural Science Foundation of China (No. 41807297), National Natural Science Foundation of China (No. 42101375) and Innovation Demonstration Base of Ecological Environment Geotechnical and Ecological Restoration of Rivers and Lakes (No. 2020EJB004).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Remote sensing data and DEM data can be downloaded from public websites. However, basic geographic data, basic geological data, and landslide distribution data are all confidential data in China. According to relevant regulations, these confidential data had been decrypted when we used them. Researchers in related fields who need these decrypted data can contact the corresponding author.

Acknowledgments

We are grateful to the National Natural Science Foundation of China and the Innovation Demonstration Base of Ecological Environment Geotechnical and Ecological Restoration of Rivers and Lakes. We are also thankful to the Headquarters of Prevention and Control of Geo-Hazards in the Area of the Three Gorges Reservoir for providing data and material. We thank the anonymous reviewers for their constructive comments and suggestions on the manuscript. We also thank the editorial staff for editing the manuscript.

Conflicts of Interest

The authors declare no competing interest.

References

  1. Wu, X.; Niu, R.; Ren, F.; Peng, L. Landslide susceptibility mapping using rough sets and back-propagation neural networks in the Three Gorges, China. Environ. Earth Sci. 2013, 70, 1307–1318.
  2. Rosi, A.; Peternel, T.; Jemec-Auflič, M.; Komac, M.; Segoni, S.; Casagli, N. Rainfall thresholds for rainfall-induced landslides in Slovenia. Landslides 2016, 13, 1571–1577.
  3. Liu, J.; Zeng, Z.; Liu, H.; Wang, H. A rough set approach to analyze factors affecting landslide incidence. Comput. Geosci. 2011, 37, 1311–1317.
  4. Aditian, A.; Kubota, T.; Shinohara, Y. Comparison of GIS-based landslide susceptibility models using frequency ratio, logistic regression, and artificial neural network in a tertiary region of Ambon, Indonesia. Geomorphology 2018, 318, 101–111.
  5. Chen, W.; Shahabi, H.; Shirzadi, A.; Hong, H.; Akgun, A.; Tian, Y.; Liu, J.; Zhu, A.-X.; Li, S. Novel hybrid artificial intelligence approach of bivariate statistical-methods-based kernel logistic regression classifier for landslide susceptibility modeling. Bull. Eng. Geol. Environ. 2019, 78, 4397–4419.
  6. Chen, W.; Yan, X.; Zhao, Z.; Hong, H.; Bui, D.T.; Pradhan, B. Spatial prediction of landslide susceptibility using data mining-based kernel logistic regression, naive Bayes and RBFNetwork models for the Long County area (China). Bull. Eng. Geol. Environ. 2019, 78, 247–266.
  7. Pham, B.T.; Jaafari, A.; Prakash, I.; Bui, D.T. A novel hybrid intelligent model of support vector machines and the MultiBoost ensemble for landslide susceptibility modeling. Bull. Eng. Geol. Environ. 2019, 78, 2865–2886.
  8. Sun, D.; Xu, J.; Wen, H.; Wang, D. Assessment of landslide susceptibility mapping based on Bayesian hyperparameter optimization: A comparison between logistic regression and random forest. Eng. Geol. 2021, 281, 105972.
  9. Tanyu, B.F.; Abbaspour, A.; Alimohammadlou, Y.; Tecuci, G. Landslide susceptibility analyses using Random Forest, C4.5, and C5.0 with balanced and unbalanced datasets. Catena 2021, 203, 105355.
  10. Wang, S.; Zhuang, J.; Zheng, J.; Fan, H.; Kong, J.; Zhan, J. Application of Bayesian Hyperparameter Optimized Random Forest and XGBoost Model for Landslide Susceptibility Mapping. Front. Earth Sci. 2021, 9, 617.
  11. Hu, X.; Huang, C.; Mei, H.; Zhang, H. Landslide susceptibility mapping using an ensemble model of Bagging scheme and random subspace–based naïve Bayes tree in Zigui County of the Three Gorges Reservoir Area, China. Bull. Eng. Geol. Environ. 2021, 80, 5315–5329.
  12. Bui, D.T.; Tsangaratos, P.; Nguyen, V.-T.; Van Liem, N.; Trinh, P.T. Comparing the prediction performance of a Deep Learning Neural Network model with conventional machine learning models in landslide susceptibility assessment. Catena 2020, 188, 104426.
  13. Yu, C.; Chen, J. Landslide Susceptibility Mapping Using the Slope Unit for Southeastern Helong City, Jilin Province, China: A Comparison of ANN and SVM. Symmetry 2020, 12, 1047.
  14. Cao, Y.; Yin, K.; Zhou, C.; Ahmed, B. Establishment of Landslide Groundwater Level Prediction Model Based on GA-SVM and Influencing Factor Analysis. Sensors 2020, 20, 845.
  15. Han, H.; Shi, B.; Zhang, L. Prediction of landslide sharp increase displacement by SVM with considering hysteresis of groundwater change. Eng. Geol. 2021, 280, 105876.
  16. Moayedi, H.; Mehrabi, M.; Mosallanezhad, M.; Rashid, A.S.A.; Pradhan, B. Modification of landslide susceptibility mapping using optimized PSO-ANN technique. Eng. Comput. 2019, 35, 967–984.
  17. Tian, Y.; Xu, C.; Hong, H.; Zhou, Q.; Wang, D. Mapping earthquake-triggered landslide susceptibility by use of artificial neural network (ANN) models: An example of the 2013 Minxian (China) Mw 5.9 event. Geomat. Nat. Hazards Risk 2018, 10, 1–25.
  18. Harmouzi, H.; Nefeslioglu, H.A.; Rouai, M.; Sezer, E.A.; Dekayir, A.; Gokceoglu, C. Landslide susceptibility mapping of the Mediterranean coastal zone of Morocco between Oued Laou and El Jebha using artificial neural networks (ANN). Arab. J. Geosci. 2019, 12, 696.
  19. Merghadi, A.; Yunus, A.P.; Dou, J.; Whiteley, J.; ThaiPham, B.; Bui, D.T.; Avtar, R.; Abderrahmane, B. Machine learning methods for landslide susceptibility studies: A comparative overview of algorithm performance. Earth Sci. Rev. 2020, 207, 103225.
  20. Reichenbach, P.; Rossi, M.; Malamud, B.D.; Mihir, M.; Guzzetti, F. A review of statistically-based landslide susceptibility models. Earth Sci. Rev. 2018, 180, 60–91.
  21. Gutierrez-Martin, A. A GIS-physically-based emergency methodology for predicting rainfall-induced shallow landslide zonation. Geomorphology 2020, 359, 107121.
  22. Wu, Y.; Ke, Y.; Chen, Z.; Liang, S.; Zhao, H.; Hong, H. Application of alternating decision tree with AdaBoost and bagging ensembles for landslide susceptibility mapping. Catena 2020, 187, 104396.
  23. Zhang, Y.; Zhang, Z.; Xue, S.; Wang, R.; Xiao, M. Stability analysis of a typical landslide mass in the Three Gorges Reservoir under varying reservoir water levels. Environ. Earth Sci. 2020, 79, 42.
  24. Zheng, H.; Shi, Z.; Shen, D.; Peng, M.; Hanley, K.J.; Ma, C.; Zhang, L. Recent Advances in Stability and Failure Mechanisms of Landslide Dams. Front. Earth Sci. 2021, 9, 659935.
  25. Li, X.; Cheng, X.; Chen, W.; Chen, G.; Liu, S. Identification of Forested Landslides Using LiDar Data, Object-based Image Analysis, and Machine Learning Algorithms. Remote Sens. 2015, 7, 9705–9726.
  26. Lippitt, C.D.; Zhang, S. The impact of small unmanned airborne platforms on passive optical remote sensing: A conceptual perspective. Int. J. Remote Sens. 2018, 39, 4852–4868.
  27. Xu, C.; Sun, Q.; Yang, X. A study of the factors influencing the occurrence of landslides in the Wushan area. Environ. Earth Sci. 2018, 77, 406.
  28. Guzzetti, F.; Carrara, A.; Cardinali, M.; Reichenbach, P. Landslide hazard evaluation: A review of current techniques and their application in a multi-scale study, Central Italy. Geomorphology 1999, 31, 181–216.
  29. Pandove, D.; Goel, S.; Rani, R. Systematic Review of Clustering High-Dimensional and Large Datasets. ACM Trans. Knowl. Discov. Data 2018, 12, 1–68.
  30. Song, Y.; Niu, R.; Xu, S.; Ye, R.; Peng, L.; Guo, T.; Li, S.; Chen, T. Landslide Susceptibility Mapping Based on Weighted Gradient Boosting Decision Tree in Wanzhou Section of the Three Gorges Reservoir Area (China). ISPRS Int. J. Geo Inf. 2018, 8, 4.
  31. Liao, M.; Wen, H.; Yang, L. Identifying the essential conditioning factors of landslide susceptibility models under different grid resolutions using hybrid machine learning: A case of Wushan and Wuxi counties, China. Catena 2022, 217, 106428.
  32. Chen, W.; Li, W.; Chai, H.; Hou, E.; Li, X.; Ding, X. GIS-based landslide susceptibility mapping using analytical hierarchy process (AHP) and certainty factor (CF) models for the Baozhong region of Baoji City, China. Environ. Earth Sci. 2015, 75, 63.
  33. Lin, W.; Yin, K.; Wang, N.; Xu, Y.; Guo, Z.; Li, Y. Landslide hazard assessment of rainfall-induced landslide based on the CF-SINMAP model: A case study from Wuling Mountain in Hunan Province, China. Nat. Hazards 2021, 106, 679–700.
  34. Moustafa, S.S.; SN Al-Arifi, N.; Jafri, M.K.; Naeem, M.; Alawadi, E.A.; Metwaly, M.A. First level seismic microzonation map of Al-Madinah province, western Saudi Arabia using the geographic information system approach. Environ. Earth Sci. 2016, 75, 251.
  35. Zhao, Z.; Liu, Z.Y.; Xu, C. Slope Unit-Based Landslide Susceptibility Mapping Using Certainty Factor, Support Vector Machine, Random Forest, CF-SVM and CF-RF Models. Front. Earth Sci. 2021, 9, 589630.
  36. Aghdam, I.N.; Pradhan, B.; Panahi, M. Landslide susceptibility assessment using a novel hybrid model of statistical bivariate methods (FR and WOE) and adaptive neuro-fuzzy inference system (ANFIS) at southern Zagros Mountains in Iran. Environ. Earth Sci. 2017, 76, 237.
  37. Abedini, M.; Tulabi, S. Assessing LNRF, FR, and AHP models in landslide susceptibility mapping index: A comparative study of Nojian watershed in Lorestan province, Iran. Environ. Earth Sci. 2018, 77, 405.
  38. Das, G.; Lepcha, K. Application of logistic regression (LR) and frequency ratio (FR) models for landslide susceptibility mapping in Relli Khola river basin of Darjeeling Himalaya, India. SN Appl. Sci. 2019, 1, 1453.
  39. Mao, Z.; Shi, S.; Li, H.; Zhong, J.; Sun, J. Landslide susceptibility assessment using triangular fuzzy number-analytic hierarchy processing (TFN-AHP), contributing weight (CW) and random forest weighted frequency ratio (RF weighted FR) at the Pengyang county, Northwest China. Environ. Earth Sci. 2022, 81, 86.
  40. Li, R.; Wang, N. Landslide Susceptibility Mapping for the Muchuan County (China): A Comparison Between Bivariate Statistical Models (WoE, EBF, and IoE) and Their Ensembles with Logistic Regression. Symmetry 2019, 11, 762.
  41. Mondal, S.; Mandal, S. Landslide susceptibility mapping of Darjeeling Himalaya, India using index of entropy (IOE) model. Appl. Geomat. 2019, 11, 129–146.
  42. Kontoes, C.; Loupasakis, C.; Papoutsis, I.; Alatza, S.; Poyiadji, E.; Ganas, A.; Psychogyiou, C.; Kaskara, M.; Antoniadi, S.; Spanou, N. Landslide Susceptibility Mapping of Central and Western Greece, Combining NGI and WoE Methods, with Remote Sensing and Ground Truth Data. Land 2021, 10, 402.
  43. Batar, A.; Watanabe, T. Landslide Susceptibility Mapping and Assessment Using Geospatial Platforms and Weights of Evidence (WoE) Method in the Indian Himalayan Region: Recent Developments, Gaps, and Future Directions. ISPRS Int. J. Geo Inf. 2021, 10, 114.
  44. Qiu, H.; Cui, P.; Regmi, A.D.; Hu, S.; Zhang, Y.; He, Y. Landslide distribution and size versus relative relief (Shaanxi Province, China). Bull. Eng. Geol. Environ. 2018, 77, 1331–1342.
  45. Jeandet, L.; Steer, P.; Lague, D.; Davy, P. Coulomb Mechanics and Relief Constraints Explain Landslide Size Distribution. Geophys. Res. Lett. 2019, 46, 4258–4266.
  46. Van der Geest, K. Landslide loss and damage in Sindhupalchok District, Nepal: Comparing income groups with implications for compensation and relief. Int. J. Disaster Risk Sci. 2018, 9, 157–166.
  47. Martínez-Doñate, A.; Privat, A.M.-L.J.; Hodgson, D.M.; Jackson, C.A.-L.; Kane, I.A.; Spychala, Y.T.; Duller, R.A.; Stevenson, C.; Keavney, E.; Schwarz, E.; et al. Substrate Entrainment, Depositional Relief, and Sediment Capture: Impact of a Submarine Landslide on Flow Process and Sediment Supply. Front. Earth Sci. 2021, 9, 1083.
  48. Dou, J.; Yamagishi, H.; Pourghasemi, H.R.; Yunus, A.P.; Song, X.; Xu, Y.; Zhu, Z. An integrated artificial neural network model for the landslide susceptibility assessment of Osado Island, Japan. Nat. Hazards 2015, 78, 1749–1776.
  49. Niu, Q.; Dang, X.; Li, Y.; Zhang, Y.; Lu, X.; Gao, W. Suitability analysis for topographic factors in loess landslide research: A case study of Gangu County, China. Environ. Earth Sci. 2018, 77, 294.
  50. Djukem, W.D.L.; Braun, A.; Wouatong, A.S.L.; Guedjeo, C.; Dohmen, K.; Wotchoko, P.; Fernandez-Steeger, T.M.; Havenith, H.-B. Effect of Soil Geomechanical Properties and Geo-Environmental Factors on Landslide Predisposition at Mount Oku, Cameroon. Int. J. Environ. Res. Public Health 2020, 17, 6795.
  51. Wang, D.; Hao, M.; Chen, S.; Meng, Z.; Jiang, D.; Ding, F. Assessment of landslide susceptibility and risk factors in China. Nat. Hazards 2021, 108, 3045–3059.
  52. Yu, X.; Wang, Y.; Niu, R.; Hu, Y. A Combination of Geographically Weighted Regression, Particle Swarm Optimization and Support Vector Machine for Landslide Susceptibility Mapping: A Case Study at Wanzhou in the Three Gorges Area, China. Int. J. Environ. Res. Public Health 2016, 13, 487.
  53. Chen, W.; Li, X.; Wang, Y.; Liu, S. Landslide susceptibility mapping using LiDAR and DMC data: A case study in the Three Gorges area, China. Environ. Earth Sci. 2012, 70, 673–685.
  54. Yu, X.; Gao, H. A landslide susceptibility map based on spatial scale segmentation: A case study at Zigui-Badong in the Three Gorges Reservoir Area, China. PLoS ONE 2020, 15, e0229818.
  55. Quintero-Rincon, A.; D’Giano, C.; Risk, M. Epileptic seizure prediction using pearson’s product-moment correlation coefficient of a linear classifier from generalized gaussian modeling. arXiv 2020, arXiv:2006.01359.
  56. Ratner, B. The correlation coefficient: Its values range between +1/−1, or do they? J. Target. Meas. Anal. Mark. 2009, 17, 139–142.
  57. Cheng, J.; Sun, J.; Yao, K.; Xu, M.; Cao, Y. A variable selection method based on mutual information and variance inflation factor. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2021, 268, 120652.
  58. Yao, X.; Deng, H.; Zhang, T.; Qin, Y. Multistage fuzzy comprehensive evaluation of landslide hazards based on a cloud model. PLoS ONE 2019, 14, e0224312.
  59. Urbanowicz, R.J.; Meeker, M.; La Cava, W.; Olson, R.S.; Moore, J.H. Relief-based feature selection: Introduction and review. J. Biomed. Inform. 2018, 85, 189–203.
  60. Robnik-Šikonja, M.; Kononenko, I. Theoretical and Empirical Analysis of ReliefF and RReliefF. Mach. Learn. 2003, 53, 23–69.
  61. Dou, J.; Song, Y.; Wei, G.; Zhang, Y. Fuzzy Information Decomposition Incorporated and Weighted Relief-F Feature Selection: When Imbalanced Data Meet Incompletion. Inf. Sci. 2022, 584, 417–432.
  62. Bonham-Carter, G.F.; Bonham-Carter, G. Geographic Information Systems for Geoscientists: Modelling with GIS; Elsevier: Amsterdam, The Netherlands, 1994.
  63. Vapnik, V. The Nature of Statistical Learning Theory; Springer: Berlin/Heidelberg, Germany, 1999.
  64. Mandal, S.; Mondal, S. Machine Learning Models and Spatial Distribution of Landslide Susceptibility. In Geoinformatics and Modelling of Landslide Susceptibility and Risk; Springer: Berlin/Heidelberg, Germany, 2019; pp. 165–175.
  65. Maxwell, A.E.; Warner, T.A.; Fang, F. Implementation of machine-learning classification in remote sensing: An applied review. Int. J. Remote Sens. 2018, 39, 2784–2817.
  66. Vakhshoori, V.; Zare, M. Is the ROC curve a reliable tool to compare the validity of landslide susceptibility maps? Geomat. Nat. Hazards Risk 2018, 9, 249–266.
  67. Cantarino, I.; Carrion, M.A.; Goerlich, F.; Martinez Ibañez, V. A ROC analysis-based classification method for landslide susceptibility maps. Landslides 2019, 16, 265–282.
  68. Corsini, A.; Mulas, M. Use of ROC curves for early warning of landslide displacement rates in response to precipitation (Piagneto landslide, Northern Apennines, Italy). Landslides 2016, 14, 1241–1252.
  69. Chen, S.; He, H.B.; Garcia, E.A. RAMOBoost: Ranked Minority Oversampling in Boosting. IEEE Trans. Neural Netw. 2010, 21, 1624–1642.
  70. Chicco, D.; Starovoitov, V.; Jurman, G. The Benefits of the Matthews Correlation Coefficient (MCC) Over the Diagnostic Odds Ratio (DOR) in Binary Classification Assessment. IEEE Access 2021, 9, 47112–47124.
  71. Pham, B.T.; Tien Bui, D.; Prakash, I. Landslide susceptibility modelling using different advanced decision trees methods. Civ. Eng. Environ. Syst. 2018, 35, 139–157.
  72. Liu, L.; Li, S.; Li, X.; Jiang, Y.; Wei, W.; Wang, Z.; Bai, Y. An integrated approach for landslide susceptibility mapping by considering spatial correlation and fractal distribution of clustered landslide data. Landslides 2019, 16, 715–728.
Figure 1. Schematic diagram of the location of the study area: (a) Map of China; (b) Map of Three Gorges Reservoir area; (c) Geographical location of the study area.
Figure 2. Overall workflow of this study.
Figure 3. PCCs of 20 LSM factors. A1: Terrain surface texture; A2: Total curvature; A3: TPI; A4: TWI; A5: Valley depth; A6: SPI; A7: Slope length; A8: Slope height; A9: Slope; A10: MRN; A11: Mid-slope position; A12: Fault; A13: Flow line curvature; A14: Flow width; A15: Lithology; A16: Elevation; A17: Convexity; A18: Convergence index; A19: Slope structure; A20: Aspect.
Figure 4. LSM factors in the study area: (a) Elevation, (b) Convergence index, (c) Aspect, (d) Convexity, (e) Fault, (f) Flow line curvature, (g) Flow width, (h) Lithology, (i) MRN, (j) Mid-slope position, (k) Slope height, (l) Slope length, (m) Slope structure, (n) Slope, (o) Valley depth, (p) TWI, (q) TPI, (r) Total curvature, (s) Terrain surface texture, and (t) SPI.
Figure 5. Different important factor sets were selected for SVM modeling based on the importance ranking of LSM factors: (a) LSM factor sets generated after FR, IOE, Relief-F, WOE screening; (b) Modeling with SVM; (c) LSI generated by LSM (see Figure 6 for details).
Figure 6. LSI obtained with (a) 20 factors, (b) 6 factors based on FR, (c) 8 factors based on FR, (d) 10 factors based on FR, (e) 12 factors based on FR, (f) 6 factors based on IOE, (g) 8 factors based on IOE, (h) 10 factors based on IOE, (i) 12 factors based on IOE, (j) 6 factors based on Relief-F, (k) 8 factors based on Relief-F, (l) 10 factors based on Relief-F, (m) 12 factors based on Relief-F, (n) 6 factors based on the WOE Bayesian model, (o) 8 factors based on the WOE Bayesian model, (p) 10 factors based on the WOE Bayesian model, and (q) 12 factors based on the WOE Bayesian model.
Figure 7. LSZ obtained with (a) 20 factors, (b) 6 factors based on FR, (c) 8 factors based on FR, (d) 10 factors based on FR, (e) 12 factors based on FR, (f) 6 factors based on IOE, (g) 8 factors based on IOE, (h) 10 factors based on IOE, (i) 12 factors based on IOE, (j) 6 factors based on Relief-F, (k) 8 factors based on Relief-F, (l) 10 factors based on Relief-F, (m) 12 factors based on Relief-F, (n) 6 factors based on the WOE Bayesian model, (o) 8 factors based on the WOE Bayesian model, (p) 10 factors based on the WOE Bayesian model, and (q) 12 factors based on the WOE Bayesian model.
Figure 8. ROC curve analysis: (a) ROCs of different methods, (b) ROC of FR, (c) ROC of IOE, (d) ROC of Relief-F, (e) ROC of the WOE Bayesian model.
Figure 9. Specific category precision analysis.
Figure 10. The results of five statistical measures.
Figure 11. Comparison of the increases in scores and in related factors: (a) Comparison of the score of the LSM evaluation of IOE-12 and the score of the LSM evaluation of 20 factors; (b) Comparison of the score of the LSM evaluation of IOE-6 and the score of the LSM evaluation of IOE-8; (c) The score of the LSM evaluation of WOE-0; (d) The score of the LSM evaluation of FR-6.
Table 1. Data sources used in this study.
Name | Data Source | Spatial Resolution/Scale
DEM data | https://lpdaac.usgs.gov/tools/data~pool/ (accessed on 16 November 2021) | 30 m
Basic geographic data | Hubei Geological Survey Institute (accessed on 10 November 2021) | 1:50,000
The landslides distribution data | Landslide hazard map (accessed on 12 November 2021) | 1:10,000
Table 2. Name, description, and classification of variables.
Type | Variable | Unit | Value Range
Geology | Fault | m | 0~8719.7
Geology | Lithology | | 1. Hard Rock; 2. Soft–Hard Alternating Rock; 3. Soft Rock.
Geology | Slope Structure | | 1. Over-dip Slope; 2. Under-dip Slope; 3. Dip-oblique Slope; 4. Anaclinal Slope; 5. Anaclinal-oblique Slope; 6. Transverse Slope.
Terrain | TCI Low | | 0.169229~0.970989
Terrain | Terrain Surface Texture | | 0~0.694006
Terrain | Total Curvature | | 0~0.118286
Terrain | TPI | | −83.3541~227.751
Terrain | TRI | | 0~192.657
Terrain | Valley Depth | m | 0.00256774~1308.15
Terrain | Tangential Curvature | | −0.0749825~0.0468265
Terrain | Slope Length | m | 0~3909.74
Terrain | Slope Height | m | 0~1227.48
Terrain | Slope Form | | 1. Outside Convex Slope; 2. Outside Concave Slope; 3. Outside Straight Slope; 4. Inside Convex Slope; 5. Inside Concave Slope; 6. Inside Straight Slope; 7. Straight Convex Slope; 8. Straight Concave Slope; 9. Straight Slope.
Terrain | Slope | m | 0.00305845~78.419
Terrain | Profile Curvature | | −0.0689291~0.0629155
Terrain | Plan Curvature | | −1.96555~4.0909
Terrain | Minimal Curvature | | −0.405303~0.0755154
Terrain | Mid-slope Position | | 0~1
Terrain | Maximal Curvature | | −0.0707668~0.13735
Terrain | Longitudinal Curvature | | −0.639569~0.242719
Terrain | General Curvature | | −0.652313~0.42573
Terrain | Landforms | | 1. Canyons, deeply incised streams; 2. Mid-slope drainages, shallow valleys; 3. Upland drainages, headwaters; 4. U-shape valleys; 5. Plains; 6. Open slopes; 7. Upper slopes, mesas; 8. Local ridges/hills in valleys; 9. Mid-slope ridges, small hills in plains; 10. Mountain tops, high ridges.
Terrain | Elevation | m | 80~2000
Terrain | Cross-sectional Curvature | | −0.243084~0.183011
Terrain | Convexity | | 0.170068~0.81388
Terrain | Convergence Index | | −74.3747~82.5724
Terrain | Aspect | | 1. North; 2. Northeast; 3. East; 4. Southeast; 5. South; 6. Southwest; 7. West; 8. Northwest.
Hydrology | Distance from River | m | 0~2395.07
Hydrology | TWI | | 4.44223~18.03
Hydrology | VDCN | m | −464.027~1724.15
Hydrology | SPI | | 0~1136150
Hydrology | River Buffer Zone | m² | 377.319~4559.85
Hydrology | MRN | | 0~37.728
Hydrology | LS Factor | | 0~95.7068
Hydrology | Flow Line Curvature | | −2.72377~0.346225
Hydrology | Flow Path Length | m | 0~34.8072
Hydrology | Flow Width | m | 28.5~40.3051
Hydrology | CNBL | | 80.2274~1353.9
Hydrology | Catchment Slope | | 0~1.46657
Hydrology | Catchment Area | m² | 812.25~1637870
Table 3. Confusion matrix.
Confusion Matrix | Predicted Value: Positive | Predicted Value: Negative
Observed Value: Positive | True Positive, TP | False Negative, FN
Observed Value: Negative | False Positive, FP | True Negative, TN
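As a hedged illustration of how the four cells of Table 3 feed the five statistical measures named in Figure 10 and Table 8 (OA, precision, recall, F-measure, MCC), the sketch below applies the standard textbook definitions. The function name `five_measures` and the sample counts are hypothetical; the authors' own implementation is not shown in this section.

```python
import math


def five_measures(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard confusion-matrix metrics; zero denominators yield 0.0."""
    total = tp + fn + fp + tn
    oa = (tp + tn) / total if total else 0.0                 # overall accuracy
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f_measure = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # Matthews correlation coefficient
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "OA": oa,
        "Precision": precision,
        "Recall": recall,
        "F-Measure": f_measure,
        "MCC": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the formulas.
    for name, value in five_measures(tp=40, fn=10, fp=20, tn=30).items():
        print(f"{name}: {value:.4f}")
```

When landslide cells are rare relative to non-landslide cells, precision can stay low even with high recall, which is why MCC and the F-measure are usually read alongside OA.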
Table 4. Multicollinearity analyses of landslide susceptibility evaluation factors.
LSM Factor | TOL | VIF
Terrain Surface Texture | 0.583 | 1.714
Total Curvature | 0.707 | 1.414
TPI | 0.482 | 2.076
TWI | 0.194 | 5.144
Valley Depth | 0.374 | 2.677
SPI | 0.492 | 2.034
Slope Length | 0.395 | 2.532
Slope Height | 0.48 | 2.084
Slope | 0.325 | 3.074
MRN | 0.379 | 2.636
Mid-slope Position | 0.792 | 1.263
Fault | 0.909 | 1.1
Flow Line Curvature | 0.781 | 1.28
Flow Width | 0.964 | 1.037
Lithology | 0.777 | 1.287
Elevation | 0.71 | 1.408
Convexity | 0.644 | 1.552
Convergence Index | 0.368 | 2.716
Slope Structure | 0.901 | 1.11
Aspect | 0.964 | 1.037
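The two columns of Table 4 are tied together by the usual identity VIF = 1/TOL. The sketch below checks that identity on a few transcribed rows and applies a multicollinearity flag. The cutoffs (TOL < 0.1 or VIF > 10) are a commonly cited rule of thumb and an assumption here, since this section does not state which thresholds the authors used; the helper names are hypothetical.

```python
# (TOL, VIF) pairs transcribed from Table 4 for a few factors.
TABLE4 = {
    "TWI": (0.194, 5.144),
    "Slope": (0.325, 3.074),
    "Fault": (0.909, 1.1),
    "Aspect": (0.964, 1.037),
}


def vif_from_tol(tol: float) -> float:
    """Variance inflation factor as the reciprocal of tolerance."""
    return 1.0 / tol


def is_collinear(tol: float, vif: float,
                 tol_cutoff: float = 0.1, vif_cutoff: float = 10.0) -> bool:
    """Assumed rule of thumb: flag a factor if TOL < 0.1 or VIF > 10."""
    return tol < tol_cutoff or vif > vif_cutoff


if __name__ == "__main__":
    for name, (tol, vif) in TABLE4.items():
        print(f"{name}: reported VIF={vif}, 1/TOL={vif_from_tol(tol):.3f}, "
              f"flagged={is_collinear(tol, vif)}")
```

Small differences between 1/TOL and the reported VIF are expected because TOL is printed to three decimals.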
Table 5. Spatial relationship between each factor and landslide occurrence based on the FR, IOE, Relief-F, and WOE Bayesian modeling methods.
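Factor | Range of Values for Classification | The Size of Study Area | The Size of Landslide | FR | IOE | Relief-F | WOE Bayesian: Wfinal | WOE Bayesian: The AUC without the Factor
Terrain Surface Texture | ≤0.1 | 7729 | 1038 | 2.123041519 | 131,882.12 | 3.03% | 25.15672791 | 0.845
Terrain Surface Texture | 0.1~0.2 | 48,118 | 7521 | 2.470884598 | | | 84.71821732 |
Terrain Surface Texture | 0.2~0.3 | 125,372 | 10,258 | 1.293440877 | | | 32.40448146 |
Terrain Surface Texture | 0.3~0.4 | 131,362 | 5074 | 0.610611743 | | | −43.98146634 |
Terrain Surface Texture | 0.4~0.5 | 67,532 | 1116 | 0.261239608 | | | −49.86246713 |
Terrain Surface Texture | >0.5 | 18,461 | 206 | 0.17639913 | | | −26.02038569 |
Total Curvature | ≤0.00001 | 95,547 | 11,052 | 1.828555986 | 2459.21 | 7.41% | 73.89226925 | 0.849
Total Curvature | 0.00001~0.0001 | 195,487 | 11,557 | 0.934569649 | | | −10.52490482 |
Total Curvature | 0.0001~0.0002 | 47,941 | 1487 | 0.490329837 | | | −30.10794876 |
Total Curvature | 0.0002~0.0003 | 20,702 | 504 | 0.38485991 | | | −22.62282999 |
Total Curvature | 0.0003~0.0004 | 11,055 | 219 | 0.313162542 | | | −17.89668609 |
Total Curvature | 0.0004~0.0005 | 6908 | 123 | 0.281473312 | | | −14.56513525 |
Total Curvature | >0.0005 | 20,934 | 271 | 0.204645176 | | | −27.40032779 |
TPI | ≤−10 | 33,366 | 889 | 0.421193884 | 140,327.93 | 6.78% | −27.66180398 | 0.848
TPI | −10~−5 | 41,279 | 2232 | 0.854770379 | | | −8.087770482 |
TPI | −5~0 | 101,741 | 9347 | 1.452314529 | | | 42.93694928 |
TPI | 0~5 | 131,017 | 10,682 | 1.288870739 | | | 32.94845718 |
TPI | 5~10 | 57,159 | 1746 | 0.482885382 | | | −33.73766375 |
TPI | 10~15 | 22,288 | 271 | 0.192212945 | | | −28.50621446 |
TPI | >15 | 11,724 | 46 | 0.062024956 | | | −19.41890095 |
TWI | ≤8 | 19,973 | 83 | 0.065693021 | 142,407.35 | 0.23% | −25.73264017 | 0.86
TWI | 8~10 | 264,095 | 9672 | 0.578949323 | | | −92.00555826 |
TWI | 10~12 | 94,323 | 13,521 | 2.266082145 | | | 108.1623129 |
TWI | 12~14 | 18,043 | 1822 | 1.596335107 | | | 21.05945154 |
TWI | >14 | 2140 | 115 | 0.849510025 | | | −1.811694153 |
Valley Depth | ≤50 | 148,986 | 3528 | 0.374341139 | 65,898.16 | 4.25% | −74.03853917 | 0.856
Valley Depth | 50~100 | 108,418 | 6039 | 0.880537952 | | | −11.96593041 |
Valley Depth | 100~150 | 61,943 | 5447 | 1.390111325 | | | 27.26438873 |
Valley Depth | 150~200 | 33,694 | 4086 | 1.917035839 | | | 44.61237656 |
Valley Depth | >200 | 45,533 | 6113 | 2.122328333 | | | 63.81352851 |
SPI | ≤1000 | 57,270 | 1164 | 0.32129964 | 10,667.48 | 2.93% | −42.68162881 | 0.856
SPI | 1000~4000 | 174,786 | 8952 | 0.809651027 | | | −27.46588053 |
SPI | 4000~7000 | 76,428 | 6161 | 1.274333662 | | | 21.83437605 |
SPI | 7000~10,000 | 31,463 | 3099 | 1.557061933 | | | 26.46448039 |
SPI | >10,000 | 58,627 | 5837 | 1.573897564 | | | 38.5863394 |
Slope Length | ≤100 | 194,940 | 8034 | 0.651501331 | 44,684.15 | 3.78% | −54.82417952 | 0.855
Slope Length | 100~200 | 82,276 | 4447 | 0.854433763 | | | −12.1606934 |
Slope Length | 200~300 | 48,620 | 3437 | 1.117503827 | | | 7.18036755 |
Slope Length | 300~400 | 26,196 | 2408 | 1.453134929 | | | 19.56950679 |
Slope Length | 400~500 | 16,000 | 1806 | 1.784358872 | | | 25.85395212 |
Slope Length | 500~600 | 9640 | 1360 | 2.23021286 | | | 30.7159179 |
Slope Length | >600 | 20,902 | 3721 | 2.814208484 | | | 65.87992865 |
Slope Height | ≤20 | 49,193 | 1883 | 0.605105991 | 71,790.71 | 4.30% | −23.98276954 | 0.847
Slope Height | 20~70 | 136,262 | 9922 | 1.151089003 | | | 17.83047751 |
Slope Height | 70~120 | 87,219 | 7171 | 1.299729753 | | | 25.89550671 |
Slope Height | 120~170 | 50,705 | 3224 | 1.005144932 | | | 0.322254366 |
Slope Height | 170~220 | 28,954 | 1555 | 0.848997213 | | | −6.924069326 |
Slope Height | 220~270 | 16,583 | 711 | 0.677783421 | | | −10.93334835 |
Slope Height | >270 | 29,658 | 747 | 0.398165092 | | | −26.86361557 |
Slope | ≤10 | 14,971 | 673 | 0.710638439 | 135,956.40 | 2.59% | −9.325273695 | 0.843
Slope | 10~20 | 83,167 | 8374 | 1.591718859 | | | 49.03884418 |
Slope | 20~30 | 152,524 | 12,208 | 1.265292039 | | | 34.03781439 |
Slope | 30~40 | 105,169 | 3486 | 0.523991304 | | | −45.51085502 |
Slope | >40 | 42,743 | 472 | 0.174566715 | | | −40.66036184 |
MRN | ≤1 | 76,417 | 3139 | 0.649360359 | 87,202.25 | 4.27% | −27.70607953 | 0.852
MRN | 1~4 | 104,928 | 5976 | 0.900333967 | | | −9.766364922 |
MRN | 4~7 | 106,130 | 7280 | 1.084370406 | | | 8.334278476 |
MRN | 7~10 | 62,925 | 4897 | 1.230244186 | | | 16.3112819 |
MRN | >10 | 48,174 | 3921 | 1.286674149 | | | 17.37371656 |
Mid-slope Position | ≤0.1 | 39,380 | 3072 | 1.233189848 | 102,685.71 | 2.44% | 12.63613011 | 0.847
Mid-slope Position | 0.1~0.3 | 79,611 | 6386 | 1.268061381 | | | 21.88532577 |
Mid-slope Position | 0.3~0.5 | 81,371 | 6253 | 1.214795618 | | | 17.80136222 |
Mid-slope Position | 0.5~0.7 | 84,244 | 5245 | 0.984217209 | | | −1.340464374 |
Mid-slope Position | 0.7~0.9 | 88,206 | 3669 | 0.657557938 | | | −29.62068636 |
Mid-slope Position | >0.9 | 25,762 | 588 | 0.360813012 | | | −26.23025167 |
Fault | ≤1500 | 166,254 | 10,637 | 1.011419908 | 105,921.56 | 19.69% | 1.584897163 | 0.848
Fault | 1500~3000 | 109,824 | 6951 | 1.000540038 | | | 0.054640473 |
Fault | 3000~4500 | 52,646 | 3123 | 0.937758579 | | | −3.982618149 |
Fault | 4500~6000 | 39,685 | 2263 | 0.901452008 | | | −5.373160425 |
Fault | 6000~7500 | 25,964 | 2194 | 1.335824683 | | | 14.47859593 |
Fault | >7500 | 4201 | 45 | 0.169334041 | | | −12.26434224 |
Flow Line Curvature | ≤−0.001 | 42,035 | 833 | 0.3132697 | 42,314.63 | 8.91% | −36.19766115 | 0.846
Flow Line Curvature | −0.001~−0.0005 | 30,210 | 1231 | 0.644157057 | | | −16.55531373 |
Flow Line Curvature | −0.0005~0 | 128,052 | 10,662 | 1.316245058 | | | 35.42803841 |
Flow Line Curvature | 0~0.0005 | 126,273 | 10,603 | 1.327402723 | | | 36.2891063 |
Flow Line Curvature | 0.0005~0.001 | 30,470 | 1160 | 0.601824656 | | | −18.55144031 |
Flow Line Curvature | >0.001 | 41,534 | 724 | 0.27556195 | | | −37.38392252 |
Flow Width | ≤30 | 34,413 | 2276 | 1.045524381 | 74,675.20 | 0.64% | 2.295696285 | 0.849
Flow Width | 30~35 | 108,372 | 7317 | 1.067334157 | | | 6.748403248 |
Flow Width | 35~40 | 194,850 | 12,174 | 0.987682431 | | | −1.976271103 |
Flow Width | >40 | 60,939 | 3446 | 0.893931809 | | | −7.387354715 |
Lithology | Hard Rock | 78,421 | 802 | 0.161668881 | 38,710.95 | 3.61% | −57.54254831 | 0.836
Lithology | Soft-Hard Alternating Rock | 111,696 | 13,085 | 1.851912861 | | | 83.76213169 |
Lithology | Soft-Rock | 208,457 | 11,326 | 0.858903782 | | | −24.15633662 |
Elevation | ≤200 | 27,100 | 6729 | 3.925235146 | 120,609.11 | 7.32% | 115.4239629 | 0.803
Elevation | 200~300 | 50,157 | 9401 | 2.962967866 | | | 112.4825669 |
Elevation | 300~400 | 52,959 | 4986 | 1.488322131 | | | 31.05621469 |
Elevation | 400~500 | 55,967 | 2891 | 0.816583321 | | | −12.13363048 |
Elevation | 500~600 | 53,602 | 822 | 0.242423806 | | | −44.34313919 |
Elevation | 600~700 | 44,764 | 192 | 0.067804229 | | | −39.54329963 |
Elevation | >700 | 114,025 | 192 | 0.026618623 | | | −55.77986673 |
Convexity | ≤0.4 | 16,961 | 1762 | 1.642248566 | 124,943.95 | 0.72% | 21.92797827 | 0.85
Convexity | 0.4~0.5 | 110,925 | 8557 | 1.219485206 | | | 22.28710735 |
Convexity | 0.5~0.6 | 207,790 | 12,236 | 0.930891933 | | | −11.82278933 |
Convexity | >0.6 | 62,898 | 2658 | 0.668040176 | | | −23.35665684 |
Convergence Index | ≤−30 | 36,438 | 1333 | 0.578309144 | 129,519.44 | 6.37% | −21.61042251 | 0.85
Convergence Index | −30~−10 | 81,254 | 6129 | 1.192420168 | | | 15.94036802 |
Convergence Index | −10~10 | 129,174 | 12,400 | 1.517508102 | | | 57.66968393 |
Convergence Index | 10~30 | 97,150 | 4723 | 0.7685278 | | | −21.4584672 |
Convergence Index | >30 | 54,558 | 628 | 0.181964071 | | | −46.4254435 |
Slope Structure | Over-dip Slope | 19,196 | 548 | 0.451288492 | 96,820.98 | 3.66% | −19.6470487 | 0.848
Slope Structure | Under-dip Slope | 71,225 | 5792 | 1.285525029 | | | 21.75906997 |
Slope Structure | Dip-oblique Slope | 69,169 | 4932 | 1.127187106 | | | 9.552974322 |
Slope Structure | Anaclinal Slope | 123,813 | 7630 | 0.974187903 | | | −2.842537665 |
Slope Structure | Anaclinal-oblique Slope | 59,350 | 3586 | 0.955155329 | | | −3.076989636 |
Slope Structure | Transverse Slope | 55,821 | 2725 | 0.771708592 | | | −15.05732658 |
Aspect | North | 57,060 | 5646 | 1.564204561 | 80,616.50 | 7.06% | 37.35659867 | 0.847
Aspect | Northeast | 55,702 | 4339 | 1.231411776 | | | 15.26219266 |
Aspect | East | 42,955 | 2443 | 0.899071405 | | | −5.751234837 |
Aspect | Southeast | 45,540 | 2037 | 0.707102616 | | | −17.14717011 |
Aspect | South | 47,629 | 3493 | 1.159341984 | | | 9.618338101 |
Aspect | Southwest | 45,425 | 1627 | 0.566209378 | | | −25.09301787 |
Aspect | West | 53,524 | 1979 | 0.584496175 | | | −26.43084564 |
Aspect | Northwest | 50,739 | 3649 | 1.136884646 | | | 8.568868387 |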
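The FR column of Table 5 can be reproduced from the two count columns. The sketch below assumes the standard frequency-ratio definition, FR = (landslide cells in class / all landslide cells) / (class cells / all cells), and uses the Terrain Surface Texture rows as input. The function name `frequency_ratio` is hypothetical; it is not taken from the authors' code.

```python
def frequency_ratio(class_cells, landslide_cells):
    """Frequency ratio per class: share of landslide cells over share of area."""
    total_cells = sum(class_cells)
    total_landslide = sum(landslide_cells)
    return [
        (ls / total_landslide) / (area / total_cells)
        for area, ls in zip(class_cells, landslide_cells)
    ]


# Terrain Surface Texture rows of Table 5 (study-area cells, landslide cells).
TST_AREA = [7729, 48118, 125372, 131362, 67532, 18461]
TST_LANDSLIDE = [1038, 7521, 10258, 5074, 1116, 206]

if __name__ == "__main__":
    for fr in frequency_ratio(TST_AREA, TST_LANDSLIDE):
        print(f"{fr:.9f}")
```

An FR above 1 means the class holds a larger share of landslide cells than of the study area; a value below 1 means the opposite.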
Table 6. Importance ranking of LSM factors based on different screening methods.
Importance Ranking | FR | IOE | Relief-F | WOE
1 | Elevation | TWI | Fault | Elevation
2 | Slope Length | TPI | Flow Line Curvature | Lithology
3 | Terrain Surface Texture | Slope | Total Curvature | Slope
4 | TWI | Terrain Surface Texture | Elevation | Terrain Surface Texture
5 | Valley Depth | Convergence Index | Aspect | Flow Line Curvature
6 | Lithology | Convexity | TPI | Slope Height
7 | Total Curvature | Elevation | Convergence Index | Mid-slope Position
8 | Convexity | Fault | Slope Height | Aspect
9 | Slope | Mid-slope Position | MRN | TPI
10 | SPI | Slope Structure | Valley Depth | Fault
11 | Aspect | MRN | Slope Length | Slope Structure
12 | Convergence Index | Aspect | Slope Structure | Total Curvature
13 | TPI | Flow Width | Lithology | Flow Width
14 | Fault | Slope Height | Terrain Surface Texture | Convexity
15 | Flow Line Curvature | Valley Depth | SPI | Convergence Index
16 | Slope Height | Slope Length | Slope | MRN
17 | MRN | Flow Line Curvature | Mid-slope Position | Slope Length
18 | Slope Structure | Lithology | Convexity | Valley Depth
19 | Mid-slope Position | SPI | Flow Width | SPI
20 | Flow Width | Total Curvature | TWI | TWI
Table 7. AUC values of the four types of models.
Classifiers | 6 Factors | 8 Factors | 10 Factors | 12 Factors | 20 Factors
FR | 0.8900 | 0.8945 | 0.9003 | 0.9029 | 0.9107
IOE | 0.8041 | 0.9061 | 0.9064 | 0.9103 |
Relief-F | 0.8686 | 0.8908 | 0.8931 | 0.9025 |
WOE | 0.8974 | 0.9014 | 0.9023 | 0.9038 |
Table 8. Score table of LSM results’ evaluation for important subset of factors.
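Number of Factors | Model | AUC | Specific Category Precision Analysis-Very High | Five Statistical Measures: OA | Precision | Recall | F-Measure | MCC | Average Score | Total Score
6 factors | FR + SVM | 3 | 6 | 7 | 5 | 6 | 5 | 5 | 5.6 | 14.6
6 factors | IOE + SVM | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 3
6 factors | Relief-F + SVM | 2 | 3 | 2 | 2 | 11 | 2 | 2 | 3.8 | 8.8
6 factors | WOE + SVM | 7 | 4 | 8 | 9 | 10 | 8 | 8 | 8.6 | 19.6
8 factors | FR + SVM | 6 | 10 | 9 | 7 | 7 | 7 | 7 | 7.4 | 23.4
8 factors | IOE + SVM | 14 | 12 | 6 | 8 | 16 | 9 | 9 | 9.6 | 35.6
8 factors | Relief-F + SVM | 4 | 2 | 3 | 3 | 17 | 3 | 3 | 5.8 | 11.8
8 factors | WOE + SVM | 9 | 8 | 13 | 14 | 9 | 14 | 14 | 12.8 | 29.8
10 factors | FR + SVM | 8 | 11 | 12 | 11 | 8 | 11 | 11 | 10.6 | 29.6
10 factors | IOE + SVM | 15 | 16 | 10 | 10 | 13 | 10 | 10 | 10.6 | 41.6
10 factors | Relief-F + SVM | 5 | 9 | 4 | 4 | 15 | 4 | 4 | 6.2 | 20.2
10 factors | WOE + SVM | 10 | 5 | 15 | 16 | 5 | 16 | 16 | 13.6 | 28.6
12 factors | FR + SVM | 12 | 15 | 14 | 13 | 4 | 13 | 13 | 11.4 | 38.4
12 factors | IOE + SVM | 16 | 17 | 11 | 12 | 12 | 12 | 12 | 11.8 | 44.8
12 factors | Relief-F + SVM | 11 | 13 | 5 | 6 | 14 | 6 | 6 | 7.4 | 31.4
12 factors | WOE + SVM | 13 | 7 | 16 | 15 | 3 | 15 | 15 | 12.8 | 32.8
20 factors | SVM | 17 | 14 | 17 | 17 | 2 | 17 | 17 | 14 | 45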
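Every row of Table 8 is consistent with a simple composition: the total score equals the AUC score plus the specific-category-precision score plus the mean of the five statistical-measure scores. The sketch below assumes that rule, which is inferred from the table's numbers rather than quoted from the authors; the function name `total_score` is hypothetical.

```python
def total_score(auc_score, precision_score, measure_scores):
    """AUC score + specific-category-precision score + mean of the five
    statistical-measure scores (OA, Precision, Recall, F-Measure, MCC)."""
    average = sum(measure_scores) / len(measure_scores)
    return auc_score + precision_score + average


if __name__ == "__main__":
    # Rows transcribed from Table 8.
    rows = {
        "IOE + SVM, 12 factors": (16, 17, [11, 12, 12, 12, 12]),
        "SVM, 20 factors": (17, 14, [17, 17, 2, 17, 17]),
        "FR + SVM, 6 factors": (3, 6, [7, 5, 6, 5, 5]),
    }
    for name, (auc, scp, measures) in rows.items():
        print(f"{name}: {total_score(auc, scp, measures):.1f}")
```

Under this rule the 20-factor model and the 12-factor IOE model differ by only 0.2 points, matching the 45 versus 44.8 comparison reported in the abstract.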
Table 9. Evaluation of the LSM results for different core factor sets.
Factors | AUC | Specific Category Precision (Very High) | Five Statistical Measures: OA | Precision | Recall | F-Measure | MCC | Total Score
6 factors | 0.9000 | 24.86% | 79.32% | 0.0996 | 0.8503 | 0.1783 | 0.2703 | 27.55
8 factors | 0.9062 | 26.84% | 80.27% | 0.1039 | 0.8493 | 0.1852 | 0.2769 | 41.72
10 factors | 0.9082 | 26.83% | 80.93% | 0.1063 | 0.8405 | 0.1888 | 0.2801 | 42.88
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Yu, X.; Xiong, T.; Jiang, W.; Zhou, J. Comparative Assessment of the Efficacy of the Five Kinds of Models in Landslide Susceptibility Map for Factor Screening: A Case Study at Zigui-Badong in the Three Gorges Reservoir Area, China. Sustainability 2023, 15, 800. https://doi.org/10.3390/su15010800
