Landslide Susceptibility Mapping and Driving Mechanisms in a Vulnerable Region Based on Multiple Machine Learning Models

: Landslides can cause severe damage to both the environment and society, and many statistical, index-based, and inventory-based methods have been developed to assess landslide susceptibility; however, it is still challenging to choose the most effective method and properly identify major driving factors for speciﬁc regions. Here, we applied four machine learning algorithms, adaptive boosting (AdaBoost), gradient-boosting decision tree (GBDT), multilayer perceptron (MLP), and random forest (RF), to predict the landslide susceptibility at 30 m spatial scale based on thirteen landslide conditioning factors (LCFs) in a landslide-vulnerable region. Based on inventory landslide points, the classiﬁcation results were evaluated, and indicated that the performance of the RF ( F1-score : 0.85, AUC: 0.92), AdaBoost ( F1-score : 0.83, AUC: 0.91), and GBDT ( F1-score : 0.83, AUC: 0.88) methods were signiﬁcantly better than the MLP ( F1-score : 0.76, AUC: 0.79) method. The results further indicated that the areas with high and very high landslide risk (susceptibility greater than 0.5) accounted for about 40% of the study region. All four models matched well and predicted similar spatial distribution patterns in landslide susceptibility, with the very high risk areas mostly distributed in the western and southeastern regions. Daoshi, Qingliangfeng, Jinnan, and Linglong towns have the highest landslide risk, with mean susceptibility levels greater than 0.5. The leading contributing factors to landslide susceptibility were slightly different for the four models; however, population density, distance to road, and relief amplitude were generally among the top leading factors for most towns. Our study provided significant information on the highly landslide-prone areas and the major contributing factors for decision-makers and policy planners, and suggested that different areas should take unique precautions to mitigate or avoid severe damage from landslide events.


Introduction
Landslides are considered a natural hazard that causes extensive damage to the environment and societies. A landslide hazard map demonstrates regions susceptible to landslides by considering the phenomenon's causative and triggering factors (e.g., geomorphological, geological, and meteorological) [1]. Landslide susceptibility is the likelihood of landslides occurring based on local topographical conditions [2]. Various approaches to landslide susceptibility mapping (LSM) have been proposed and practiced during the last few decades [3]. These methods can be broadly divided into five categories: geomorphological mapping, analysis of inventories, index-based approaches, process-based methods, and statistical modeling methods [3]. Among the statistical modeling methods, machine learning methods, a powerful group of data-driven tools, have experienced an increasing preference in recent years [4].
to study LSM, while the ensemble-based machine learning methods are novel and have been less applied. More studies are needed to evaluate their performance against the conventional machine learning methods. To avoid or reduce the ecological, economic, and life losses from landslides and maintain sustainable development, it is urgent to provide the extent and intensity of landslide susceptibility and identify underlying contributing factors, which could help policy-makers reduce or avoid the occurrence of disasters in advance or rescue the environment after disasters. Therefore, our main objectives are to (1) compare the performance of four machine learning algorithms, MLP, AdaBoost, GBDT, and RF, in landslide susceptibility prediction; (2) explore the pixel-(30 m) and town-level patterns of high-risk landslide areas; and (3) identify the major contributing factors to landslide susceptibility at regional and town levels.

Study Region
The case study region is located in the Lin'An District (29°56′-30°23′N, 118°51′-119°52′E), Hangzhou City, Zhejiang Province, China ( Figure 1). This region is about 100 km wide from east to west, and about 50 km long from north to south, and with a total area of 3127 km 2 . This region belongs to the subtropical monsoon climate zone. The average annual precipitation is 1614 mm and mostly occurs in summer [47]. The extreme climate events such as strong wind, extreme rainfall events, and typhoons frequently occur. The terrain slopes decline from the west to southeast, and this region is surrounded by high mountains on the north, west, and southeast. The elevation of the eastern valley plain is generally below 50 m above sea level, with the lowest at 4 m below sea level in Qingshanhu town, while the highest elevation is 1787 m in the western Qingliang Mountains. The study region was reported to be vulnerable to landslides due to the high frequency of extreme rainfall and wind events, high slopes, the undulatory terrains, fractured geological formation and lithology, well-weathered soil, and improper human construction (e.g., roads and villages).

Model Input Variables and Descriptions
The identification of the main controlling factors of landslide disasters plays an indispensable role in landslide occurrence prediction and risk assessment [41]. Generally, the influencing factors are divided into stable condition factors and triggering condition Remote Sens. 2023, 15,1886 4 of 20 factors [48], and stable conditions can be further divided into material conditions and topographic conditions [42]. In this study, we mainly selected the stable condition factors, because our assessment emphasizes the possibility of landslides under long-term stable geological conditions, while the triggering condition factors mainly emphasize the possibility of disasters under a short-term influence [43]. Based on previous research, thirteen landslide condition factors were selected, including slope, aspect, curvature, surface roughness, relief amplitude, landform, lithology, vegetation coverage, distance to fault, distance to road, distance to river, population density, and precipitation (average monthly precipitation). The data sources and characteristics are shown in Table 1, and the spatial distribution patterns of some key factors are shown in Figure 2. Among the input variables, monthly precipitation in the study region ranges from 163 to 248 mm, with a declining tendency from the west to the east. The extremely high monthly rainfall can be over 400 mm in some years with super strong typhoons, such as during the period of Typhoon Lekima in August 2019. Lithology is divided into I to V grades in terms of hardness, representing rock hardness and structural integrity [49]. Landforms are divided into six categories: plain (elevation < 200 m), hill (201-500 m), low mountain (501-1000 m), middle mountain (1001-3500 m), high mountain (3501-5000 m), and highest mountain (>5000 m) [50].
The calculation methods for some of the input variables are described here. Vegetation fractional coverage (VFC) represents the fraction of vegetation within each pixel. The calculation formula is as follows: where NDV I soil is NDV I (normalized difference vegetation index) for bare soil, NDV I veg is NDV I for vegetated areas. The surface roughness (D) is used to reflect surface fluctuation and degree of erosion. It can be expressed as the ratio of the grid surface area to its projected area, which is the reciprocal of the cosine of the slope (∝) after conversion [51]: Relief amplitude (RDLS) reflects the relative height difference in the ground, which is the difference between the maximum elevation (DEM max ) and the minimum elevation (DEM min ) [52]:  The curvature is calculated using the incomplete quartic method [53]. All the above factors were calculated using QGIS software (https://qgis.org/en/site/, accessed on 10 September 2022).

Training and Validation Sample Plot Data
Landslide occurrence points recorded during 1949-2020 were collected from the Resource and Environment Science and Data Center, Chinese Academy of Sciences (https: //www.resdc.cn/, accessed on 10 September 2022) ( Table 1). This dataset recorded most of the landslide occurrence points in the study region. A landslide occurrence was recorded as "1". There are 146 landslide occurrence points, indicating 146 landslides occurred during the historical period. We further randomly established 146 points (marked as "0") to represent points without landslides. We further extracted the data for the 13 model input variables at each sample point ( Figure 2). Finally, we obtained a training and validation dataset with 292 sample points ( Figure 1). The dataset is further divided into a training and development set and a test set, which are used to train the models and test the accuracy of the models, respectively. The training and development set has 232 sample points (80%) and the test set has 60 sample points (20%).

Research Work Flow
The entire research flow is shown in Figure 3, more detailed processes are described below.  Before the prediction of landslide susceptibility, the independence of each input variable must be tested. This can not only reduce the amount of data but also improve the accuracy of the prediction models. The variance inflation factor (VIF) and tolerance (TOL)

Multicollinearity Test
Before the prediction of landslide susceptibility, the independence of each input variable must be tested. This can not only reduce the amount of data but also improve the accuracy of the prediction models. The variance inflation factor (VIF) and tolerance (TOL) were used to screen all 13 input variables for multicollinearity [54][55][56]. VIF represents the ratio of the variance of the regression coefficient estimator to the variance assuming a nonlinear correlation between independent variables, and TOL is the inverse of the VIF value, both of which can test the multicollinearity between variables. In general, VIF >10 and TOL <0.1 indicate a serious multicollinearity problem between variables [57].

Adaptive Boosting (AdaBoost) Method
The adaptive boosting algorithm is one of the most popular methods in the integrated learning-boosting methods, which was first proposed by Freund and Schapire [58]. Schapire proved that strong and weak learning algorithms are equivalent [59], Therefore, several weak learning algorithms can be constituted into a strong learning algorithm. AdaBoost is a boosting algorithm that combines a series of weak classifiers into a strong classifier [60]. Firstly, the algorithm initializes the weight distribution of the training data and trains each weak classifier. Then, it calculates the classification error between its classification result and the training data. Finally, it updates the weight distribution for the training dataset according to the classification error rate. In each round of calculation, the weight value of the samples classified incorrectly in the previous round increases, and the weight value of the samples correctly classified decreases, so that the later classifiers give more attention to the samples that are classified incorrectly in the previous round. Eventually, AdaBoost obtains the final classification result through weighted calculation [60]. In this study, we take the decision-tree classifier as the basic classifier of the AdaBoost algorithm. Compared with other weak classifiers, the decision tree with AdaBoost model has faster calculation speed and better performance in large-scale data training [61].

Gradient-Boosting Decision-Tree (GBDT) Method
The gradient-boosting algorithm was proposed by Freidman in 2001 [62]. Gradient boosting is a large class of algorithms in boosting. GBDT is a gradient-boosting algorithm that uses decision trees as weak classifiers. The idea of gradient boosting is based on the gradient-descent method, which reduces a loss by fitting the residual of the previous time to achieve the purpose of joint decision-making. The basic principle of GBDT is to train the weak classifiers according to the negative gradient information from the current model loss function, and then add the weak classifiers to the existing model. It has good interpretability and robustness, and is considered as an algorithm with strong generalization ability and wide application [63].

Multilayer Perceptron Method
MLP is the most common fully connected artificial neural network (ANN), which is generally considered to be a model that uses a hyperplane to classify the feature vector x of an input instance [12]. The model has multiple layers, including an input layer, an output layer, and one or more hidden layers [64,65]. In each layer, the neurons usually contain a linear function and a nonlinear activation function, and the weight coefficient W and bias b in the linear function are adjusted by the gradient-descent algorithm [66] to make the value of the loss function reach a minimum. In this study, the sigmoid function was used as the activation function for the output layer so that the result of the output layer can be regarded as the probability of classification in the binary classification model.

Random Forest Method
The random forest (RF) classifier is several decision trees constructed by bootstrapping [67], which has a better generalization ability than other traditional mathematical models when dealing with high-dimensional samples, and has been widely used in ge- ological disaster prevention research. The randomness of the random forest is reflected in the sampling process and feature selection. The random forest obtains N independent samples as the training samples of the ith (1 ≤ i ≤ N) decision tree of the random forest by performing random sampling with a replacement for n times. Since each sampling is an independent random event with replacement, according to the limit formula (Equation (4)), about one-third of the samples each time do not appear in the selected sample set.
The performance of the model is affected by the training process, such as the number of submodels and learning rate, and is also affected by the performance of submodels, such as maximum tree depth and criterion.

Model Parameterization
The four models should be first optimized by selecting the appropriate hyperparameters. Commonly used hyperparameter optimization methods include grid search, cross-validation, and random search [68,69]. In this study, grid search and K-fold crossvalidation were applied to select the optimal hyperparameters for the models in the trainingdevelopment set (Table 2). These optimal hyperparameters are obtained by achieving the highest mean testing accuracy in K-fold validation. The hidden layer for MLP is set to one layer containing four neurons with the Adam optimizer as its optimization algorithm. Epoch refers to how many cycles through the full training dataset, and batch size represents the number of samples per gradient update. The RF method uses a total of 17 trees, 3 maximum feature numbers for each tree, and 3 minimum numbers of samples. AdaBoost and GBDT use 40 and 17 trees, respectively. The more trees denote a more complex model and more powerful ability of data training. However, too many trees are prone to be overfiting when the models are trained by a small dataset. Considering our training data size, we established these parameters for four methods.

Model Evaluation Metrics
In this study, the overall accuracy (OA), F1-score, and AUC (ROC curve) were used to evaluate the performance of the modeling results. F1-score is a precision evaluation index that unifies precision and recall. It is widely used in the precision evaluation of machine learning models [70,71]. Firstly, we need to use the confusion matrix to calculate the precision and recall. The precision rate is the proportion of samples that are actually positive among all the samples and are also predicted to be positive (Equation (5)). The recall rate is the proportion of the correct sample predicted in the actual positive class sample (Equation (6)). F1-score is actually a harmonic mean of precision and recall (Equation (7)).
Remote Sens. 2023, 15, 1886 9 of 20 where TP is true positive sample numbers; FP is false positives; FN is false negatives; TN is true negatives. The receiver-operating characteristic curve (ROC) takes each value of the prediction result as a possible judgment threshold. With the false positive rate (FPR) as the X-axis (Equation (9)) and the true positive rate (TPR) as the Y-axis (Equation (10)), the ROC curve is calculated by connecting the corresponding points of the samples distributed among the coordinate system: The larger the FPR on the horizontal axis, the more actual negative classes in the predicted positive class, and the larger the TPR on the vertical axis, the more actual positive classes in the predicted positive class. The ideal prediction situation is FPR equal to 0 and TPR equal to 1, which corresponds to the (0, 1) point in the coordinate axis. As an evaluation metric measuring the accuracy of the classification model, the area under the curve (AUC) for an actual classifier is between 0.5 and 1 [72]. For example, a landslide prediction model has a false positive rate prediction that indicates the probability of a landslide occurring where it did not actually occur, and a true positive rate indicated the probability that a landslide would actually occur. The closer the ROC curve radian is to the (0, 1) point, that is, the closer the AUC value is to 1, the better the prediction effect [73].

Multicollinearity Analysis
Multicollinearity among the condition factors were identified using the variance inflation factors (VIF) and tolerances (TOL). The results showed that the largest VIF value was 4.98 from landform, which is lower than the collinearity threshold of 10 ( Table 3). The lowest tolerance was 0.20, also from landform, which is greater than the collinearity threshold of 0.1. Both indicators suggest that all thirteen factors did not have serious collinearity problems, and can be used as input variables for the four models.

Accuracy Assessment for the Modeling Results
The estimated landslide susceptibility from the four machine learning methods was evaluated against the sampling plot data based on the F1-score and AUC values. In terms of F1-score, the results indicated that the accuracy of RF (F1-score = 0.85), AdaBoost (F1-score = 0.83), and GBDT (F1-score = 0.83) were similar, but all significantly higher than MLP (0.77) ( Table 4). In terms of the AUC values, all the models exhibited a sufficient performance (AUC >0.79). The success rate and generalization ability of RF (0.92), AdaBoost (0.91), and GBDT (0.88) were also significantly higher than MLP (Figure 4). Overall, the RF model had the best performance for predicting landslide susceptibility, followed by AdaBoost and GBDT, with the worst from MLP. The evaluations also indicated that all models selected in this study have reasonable goodness-of-fit in spatial prediction of landslide susceptibility.  At spatial scale, the predictive differences among four models were also tively evaluated by cross-correlation coefficient ( Table 5). The results showed th relation coefficients between MLP and other three models were generally lower while the correlation coefficients among RF, AdaBoost and GBDT were higher This implied that the three methods based on decision trees generally perfor Overall, the modeling results of the integrated model used in the study have high spatial similarity, which further mutually proved the effectiveness of the m diction results.  At spatial scale, the predictive differences among four models were also quantitatively evaluated by cross-correlation coefficient ( Table 5). The results showed that the correlation coefficients between MLP and other three models were generally lower than 0.74, while the correlation coefficients among RF, AdaBoost and GBDT were higher than 0.88. This implied that the three methods based on decision trees generally performs better. Overall, the modeling results of the integrated model used in the study have relatively high spatial similarity, which further mutually proved the effectiveness of the model prediction results.

Predicted Landslide Susceptibility
The four models evaluated were used to assess landslide susceptibility in the study area. The landslide susceptibility was regrouped into five susceptibility classes: very high prone (0.75~1.00), high (0.50~0.75), moderate (0.25~0.50), and low (0~0.25) ( Table 6). Overall, the distribution of landslide susceptibility for each class was similar among the four methods. The "very high" susceptibility class covered about 8%, 24%, 7%, and 7% of the total area based on the RF, MLP, AdaBoost, and GBDT models, respectively. The "low" susceptibility class covered about 21%, 45%, 21%, and 19% of the total area based on the RF, MLP, AdaBoost, and GBDT models, respectively. Totally, the "high" and "very high" classes accounted for about 40%, 42%, 37%, and 40% of the total land area based on the RF, MLP, AdaBoost, and GBDT models, respectively, with a mean high-risk fraction of 40%. Overall, the three decision-treebased methods (RF, AdaBoost, and GBDT) had very similar prediction for all four categories, while the MLP method predicted a significantly higher fraction of "low" and "very high" risk area. At town level, the mean landslide susceptibility values in Qingliangfeng (0.53), Daoshi (0.66), Jinnan (0.61), and Linglong (0.58) towns were distributed in the range for the "high prone" class, indicating these towns are more vulnerable to landslide incidence ( Table 7). Most of the other towns have "moderate prone" landslide risk. Among the four highest risk towns, Daoshi had the highest fraction (37.56) of "very high" risk area, while Linglong had the highest fraction (68.92%) of "high" risk area. This indicates that the high overall landslide susceptibility in Daoshi mostly results from the "very high" risk fraction, while in Linglong it results from the "high" risk fraction (Table 7). At pixel level, all four models consistently predicted that the very high landslide susceptibility is mostly distributed in the western and southeastern regions ( Figure 5), while the other regions generally had low risk. Compared with the three decision-treebased models, MLP predicted more "very high" risk pixels in the two regions. Considering the uncertainties from the different methods, we further calculated the mean landslide susceptibility and CV ( Figure 6). The results also indicated that the southeastern and western regions had the highest landslide risks, and these regions generally had the lowest CV, further implying high agreements and confidence for the high landslide susceptibility predictions. These high risk areas generally have very different LCFs (Figure 2), implying different controlling LCFs for landslide susceptibility. At town scale, the "very-high" risk areas are mostly located in Daoshi and Jinnan, southeastern Qingliangfeng, Longgang, Changhua and Linglong, eastern Heqiao, and northern Banqiao Towns ( Figure 6). predictions. These high risk areas generally have very different LCFs (Figure 2), implying different controlling LCFs for landslide susceptibility. At town scale, the "very-high" risk areas are mostly located in Daoshi and Jinnan, southeastern Qingliangfeng, Longgang, Changhua and Linglong, eastern Heqiao, and northern Banqiao Towns ( Figure 6).

Driving Mechanisms of Landslide Suceptibility
The spatial distribution patterns of landslide susceptibility can be attributable to the spatial characteristics of LCFs ( Figure 2). The comparison of relative importance of LCFs is essential to improve the efficiency and the performance of landslide prediction models, and provide implications to policy-makers. In this study, the Gini information gain method was selected to determine the relative importance of each controlling factor. Different models indicated varied major contributing factors (Figure 7). For the RF model, the most important contributor is population density (15%), followed by distance to road (10%). For the MLP model, the most important contributors are slope (17%), relief amplitude (16%) and distance to road (13%). For the AdaBoost model, the major contributing factors are population density (16%) and distance to road (12%). For the GBDT model, the major contributing factors are population density (31%), precipitation (15%) and distance to road (12%). In the thirteen LCFs, RF, AdaBoost and GBDT models agreed well that population density has the largest relative importance. Landform and curvature generally had the minimum relative importance, indicating these two factors had less effect in data fitting and classification. Overall, population density, distance to road, and relief amplitude are generally among the top three major contributing factors for all four models. At town level, our results indicated that the leading contributing LCFs varied significantly. For Daoshi Town, the top three LCFs were population density, distance to fault, and curvature; for Jinnan, the top three LCFs were population density, relief amplitude, and aspect; for Linglong, the top three LCFs were aspect, relief amplitude, and population density; for Qingliangfeng, the top three LCFs were population density, precipitation, and distance to fault. Generally, population density, relief amplitude, and distance to fault were the most important LCFs for the high-risk towns.

Driving Mechanisms of Landslide Suceptibility
The spatial distribution patterns of landslide susceptibility can be attributable to the spatial characteristics of LCFs ( Figure 2). The comparison of relative importance of LCFs is essential to improve the efficiency and the performance of landslide prediction models, and provide implications to policy-makers. In this study, the Gini information gain method was selected to determine the relative importance of each controlling factor. Different models indicated varied major contributing factors (Figure 7). For the RF model, the most important contributor is population density (15%), followed by distance to road (10%). For the MLP model, the most important contributors are slope (17%), relief amplitude (16%) and distance to road (13%). For the AdaBoost model, the major contributing factors are population density (16%) and distance to road (12%). For the GBDT model, the major contributing factors are population density (31%), precipitation (15%) and distance to road (12%). In the thirteen LCFs, RF, AdaBoost and GBDT models agreed well that population density has the largest relative importance. Landform and curvature generally had the minimum relative importance, indicating these two factors had less effect in data fitting and classification. Overall, population density, distance to road, and relief amplitude are generally among the top three major contributing factors for all four models. At town level, our results indicated that the leading contributing LCFs varied significantly. For Daoshi Town, the top three LCFs were population density, distance to fault, and curvature; for Jinnan, the top three LCFs were population density, relief amplitude, and aspect; for Linglong, the top three LCFs were aspect, relief amplitude, and population density; for Qingliangfeng, the top three LCFs were population density, precipitation, and distance to fault. Generally, population density, relief amplitude, and distance to fault were the most important LCFs for the high-risk towns.

Effectiveness for Different Models
Compared with traditional landslide susceptibility prediction methods, the use of machine learning algorithms greatly improves the prediction accuracy [26]. How to choose and use these algorithms, however, is still debatable for researchers. Based on multiple evaluation metrics, our study found that all four models can adequately predict landslide susceptibility in the study region. Among these models, the RF model performed the best, followed by AdaBoost and GBDT, all having significantly higher F1-score and AUC

Effectiveness for Different Models
Compared with traditional landslide susceptibility prediction methods, the use of machine learning algorithms greatly improves the prediction accuracy [26]. How to choose and use these algorithms, however, is still debatable for researchers. Based on multiple evaluation metrics, our study found that all four models can adequately predict landslide susceptibility in the study region. Among these models, the RF model performed the best, followed by AdaBoost and GBDT, all having significantly higher F1-score and AUC values than the MLP model. In many previous studies for model comparisons, various conclusions were drawn. For example, Youssef and Pourghasemi [31] indicated that the RF model produced the best performance as compared with other machine learning methods including SVM. They argued that landslide susceptibility models depend on the variables used to implement the models, and thus RF performed better in some specific areas, while yielded poorer results in some other areas. Hong et al. [21] also concluded that the LR method has better performance than RF in terms of AUC in assessing landslide susceptibility in Lianhua County, China. The study by Tien Bui et al. [33] indicated that the MLP method outperformed several other traditional ML models. Pourghasemi and Rahmati [4] applied 10 machine learning techniques for landslide susceptibility assessment and compared their performance. The results indicated that the frequency ratio method had the best performance compared to RF and other models in terms of AUC values.
In recent years, the ensemble methods, which combine multiple classifiers for making decisions, are increasingly applied to predict landslide susceptibility. Many previous comparative studies showed that ensemble-based machine learning methods are superior to single machine learning methods in accuracy and robustness, which can increase the availability of high-resolution LSM [28,74]. Chen et al. [20] reported that the GBDT method outperformed the other machine learning methods, and was able to provide strong technical support for producing landslide susceptibility maps in the Three Gorges Reservoir area. Kadavi et al. [29] also concluded that ensemble models have higher accuracy than traditional frequency ratio models in Sacheon-myeon. However, in this study, we also applied both regular single classifier (MLP and RF) and ensemble classifiers (GBDT and AdaBoost), and we found that the RF method performed slightly better than the GBDT and AdaBoost methods, but the MLP method performed less accurately than other methods. Sahin [75] produced a landslide susceptibility map using three decision-tree-based ensemble methods, including GBDT, XGB, and RF, and his study also indicated that the RF method had a slightly lower prediction error and higher accuracy than GBDT. Overall, the performance of machine learning models depends on the data used, and implicitly on the extent of the study areas [3].
Except for the ensemble method, another increasingly trend is to apply various integration or stacking algorithms to landslide susceptibility prediction. It can integrate different types of models, thereby reducing errors that cannot be eliminated by a single model and greatly improving the prediction accuracy [14,70,76,77]. For example, Chen et al. [18,45] integrated RF with the bivariate statistical index (SI) and other machine learning methods to assess landslide susceptibility in China and proved the integrated method performed better. Huan et al. [27] integrated RF with LP, GBDT, and XGB to predict landslides in Hunan Province, China. In the near future, these integrated methods could be the most popular and effective methods to predict landslide susceptibility.

Major Contributing Factors
Identifying the contribution from LCFs can help better understand the reasons for the occurrence of landslides in a region, and thus provide implications for policy-makers [33]. Owing to the characteristics of the decision-tree algorithm, it can directly use Gini impurity to calculate the contribution of each factor to the model prediction [78]. The factor contributions of MLP were calculated by using the difference between the F1-score on the training set obtained by removing a certain factor and the original overall accuracy. Our results showed that the LCFs have different contributions to landslide susceptibility in different models. The factors that contribute most to the four models were population density, relief amplitude, and distance to road, respectively. Our results are partially in line with many previous studies performed at various spatial scales. For example, Saha et al. [79] indicated that elevation, distance from road, and precipitation are the most important features. Kawabata and Bandibas [80] found that geological factors were the most important factors in landslide occurrence. Liao et al. [39] concluded that elevation, lithology, distance from faults, and average annual rainfall were the most significant contributing factors in Wushan and Wuxi, Chongqing, China, while vegetation coverage (NDVI) and land cover had no significant contributions. Youssef et al. [81] reported that slope angle, land use, and elevation have higher importance in landslide occurrence. Meinhardt et al. [82] reported that lithology, slope gradient, and precipitation increase have higher importance. Pourghasemi and Rahmati [4] indicated that six LCFs including slope angle, distance from road, slope length, distance from fault, drainage density, and altitude were the major factors in predicting landslide susceptibility according to their generalized additive model (GAM) analysis. Therefore, different study areas or model algorithms lead to significant differences in the major contributing LCFs [3,4].
To further identify the contributions of the three leading LCFs, we applied the Jenks natural breaks and quantile methods to further divide the three factors into four classes ( Table 8). The frequency ratio (FR) and information quality (I) for each class were used to estimate the existing relation between the three variables and the presence of landslides [17,44]. The results showed 81% of the landslide hazards occurred in areas with low population density (≤2 persons/km 2 ). For distance to road, the information quality and FR were the highest when the distance was less than 391.6 m. For relief amplitude (RDLS), the unit height difference between 87 and 156 m was the most vulnerable area, in which the FR and information quality were 0.52 and 0.18, respectively. In summary, the construction of roads could greatly destroy the geological conditions. Therefore, the highest risk was generally closer to the road.

Implications for Policymaking and Disaster Mitigation
Landslides pose great threats to the environment and socioeconomic development. LSM can show the users and stakeholders risk levels and where landslides are expected, contributing to adopt policies and to take proactive action to reduce or avoid landslides and their negative consequences [83]. Users and stakeholders may include policy-makers; government administrations; land use planners; real estate agents; environmental agencies; road, transport and utility companies; agriculture and forest managers; insurance companies; and citizens. Our findings indicate that the results from the four models can help decision-makers identify the vulnerable areas in the Lin'An District. We identified the highrisk areas, which are mostly distributed in the western and southeastern regions, including most areas in Daoshi, Jinnan, Linglong, and Qingliangfeng towns, so the related stakeholders can make decisions to avoid these high-risk areas for cropland, residential zones, roads, and utility and infrastructure construction. The mitigation can include structural and geotechnical measures, and political, legal, and administrative measures to protect endangered populations and the environment in these vulnerable areas [83].
In addition, we found that the leading contributing factors to landslide susceptibility varied significantly among towns. Therefore, our study can provide guidance on specific and different mitigation measures to the stakeholders in different towns. For example, Daoshi Town has the highest landslide risk and the major contributing factors are population density, distance to fault, and curvature. Daoshi has the highest average elevation and more than 90% of the land area is covered by low and middle mountain landforms, so the stakeholders should try to avoid the construction of residential areas in the higher mountain areas and areas with less hard bedrock. In addition, this geographical environment also leads to low local population density (2 person per km 2 on average), and most residents should be concentrated in the flat valley areas. Therefore, relevant units should focus on monitoring the mountain conditions around the valley to avoid large-scale landslide disasters in these populated areas. In contrast, the major contributing factors are population density, precipitation, and distance to fault in Jinnan Town, where the major landform is plain and two fault lines pass beneath this town (Figure 2c). Therefore, the specific mitigation measures should not give attention to elevation, and land use planning should focus on drainage facilities to avoid building residential facilities in areas with active geological movement. These same leading contributing factors apply to Jinnan and Qingliangfeng Towns; however, the major landforms in Qingliangfeng are low and middle mountains (Figure 2c). Therefore, land use planning should give more attention to avoiding areas with both high precipitation and slopes.
In recent years, applications of landslide susceptibility models and the susceptibility maps in landslide early-warning systems have emerged at different spatial scales. For example, Hong and Adler [84] developed a global early-warning system for rainfall and seismically triggered landslides based on a global-scale susceptibility model. In the future, we can integrate the best models, remote sensing data, and inventory data to develop an early-warning system for the Lin'An District.

Conclusions
Machine learning methods have been widely applied in LSM during the recent several years and their good performance has been proved for worldwide regions and countries, but there is no consensus on the best methods and leading contributing factors for specific regions. Therefore, it is often necessary to identify the best methods and contributing factors in specific regions in LSM. To find the best prediction methods and provide spatial assessments of landslide susceptibility in the landslide-vulnerable Lin'An District, this research applied two single-classifier-based (MLP and RF) and two ensemble-based (AdaBoost and GBDT) machine learning methods to predict LSM. The comparisons and evaluations indicated that all models can be adequately applied to produce landslide susceptibility maps at 30 m spatial scale for the study area, with the better performance from the RF, AdaBoost and GBDT models in terms of both F1-score and AUC values. The three decisiontree models (RF, AdaBoost, and GBDT) matched better than the MLP in predicting the distribution of landslide susceptibility, especially for the high-risk areas. The importance and contributions from the screened 13 LCFs were further identified based on these models, and the results revealed that different factors have different importance in the four models, implying the selection of LCFs is important for LSM. The leading contributing factors for the entire study region are population density, distance to fault, and relief amplitude, while the four high-risk towns identified have slightly different leading factors. Our study is the first report concerning landslide susceptibility specifically for the study region at both pixel (30 m) and town scales. Therefore, our identified methods, the predicted LSM, and the identified leading contributing factors can help produce a crucial guide for general planning and assessment purposes in the study region, thus enabling proactive actions to reduce or avoid landslides and their damage. Our findings further suggest that protection and mitigation measures should be unique for different areas and towns.

Data Availability Statement:
The data that support the findings of this study are available on request from the authors (harveyfish@163.com).

Conflicts of Interest:
The authors declare no conflict of interest.