Landslide Susceptibility Modeling Using Integrated Ensemble Weights of Evidence with Logistic Regression and Random Forest Models

The main aim of this study was to compare the performances of the hybrid approaches of traditional bivariate weights of evidence (WoE) with multivariate logistic regression (WoE-LR) and machine learning-based random forest (WoE-RF) for landslide susceptibility mapping. The performance of the three landslide models was validated with receiver operating characteristic (ROC) curves and area under the curve (AUC). The results showed that the areas under the curve obtained using the WoE, WoE-LR, and WoE-RF methods were 0.720, 0.773, and 0.802 for the training dataset, and were 0.695, 0.763, and 0.782 for the validation dataset, respectively. The results demonstrate the superiority of hybrid models and that the resultant maps would be useful for land use planning in landslide-prone areas.


Introduction
Landslides are common geological hazards caused by multiple factors including landform [1,2], geological evolution [3], groundwater [4], land use type [5], precipitation [6,7], irrigation [8], earthquake [9], engineering construction [10], and climate change [11][12][13]. To avoid casualties caused by landslides and guarantee the stable development of mountainous areas, it is critical to determine a control and prevention scheme for landslides in a region. Generally, regional landslide susceptibility maps are beneficial to mitigate the effects of landslide hazards.
At present, various methods have been proposed and introduced into landslide susceptibility mapping. The existing modeling approaches can be put into two categories: qualitative approaches and quantitative approaches [14,15]. In recent years, conventional qualitative approaches have been gradually abandoned by many researchers due to the risk that expert opinion can make the results stray from objective reality [16]. Compared with qualitative approaches, quantitative approaches are mainly based on the hidden information of objective data instead of subjective experience. Additionally, quantitative approaches mainly include traditional mathematical statistic methods, deterministic models, and some state-of-the-art machine learning algorithms.
In the past decade, with the rise of machine learning and data mining, a number of relevant algorithms have been developed for landslide susceptibility zonation [35][36][37][38][39]. For instance, the logistic regression model (LRM), artificial neural network (ANN), support vector machine (SVM), and decision tree (DT) were the top four machine learning algorithms in landslide susceptibility mapping during the period of 2005-2016 [16]. It is clear that machine learning algorithms improve the prediction accuracy of regional landslide occurrence, but the generalization performance of single classifiers still needs to be promoted [40]. In this way, a series of ensemble approaches have recently become more and more popular in geo-hazard susceptibility mapping [37,[41][42][43].
In terms of ensemble approaches, several single classifiers have been combined using ensemble frameworks including random subspace [44], random forest [45,46], Bagging [47], AdaBoost [48], MultiBoost [49], and so on [37,[50][51][52]. Currently, some novel ensemble techniques have been proposed and applied in landslide susceptibility assessment, flood susceptibility mapping, and groundwater potential analysis [41,53,54]. Additionally, the excellent performance of ensemble algorithms on predictive ability and generalization capacity has also been proven. For example, Kadavi et al. [55] compared four ensemble-based machine learning models (AdaBoost, LogitBoost, Multiclass Classifier, and Bagging) with the traditional frequency ratio model (FRM) in the task of landslide susceptibility mapping. Furthermore, the results demonstrated that all of the AUC values of the four ensemble-based machine learning models were higher than that of FRM. In addition, many scholars preferred to construct ensemble learning models by integrating machine learning algorithms with bivariate statistical models because some of the hypotheses of the conventional models can be weakened through hybrid models [56]. Meanwhile, part of the merits of bivariate statistical models and machine learning models can remain by integrating together. Weights of evidence models, as a classic bivariate statistical approach, can calculate the weights of various categories of a conditioning factor based on sturdy mathematical theories [57]. Furthermore, the weights of evidence models can be integrated with other machine learning approaches to reveal the hidden correlations between different conditioning factors and landslide occurrence. Therefore, in the present study, based on GIS tools, the integrated ensemble weights of evidence with logistic regression and random forest models were employed to map landslide susceptibility, and the results were compared and analyzed quantitatively by receiver operating characteristic curves (ROC) and area under the curve (AUC).

Study Area
The study area was located in Shaanxi Province, China ( Figure 1) where the average annual temperature is 14.2 • C, the average annual rainfall is 909.8 mm, and the evaporation is 1537.1 mm. Topographically, the study area is part of the Qinba Mountain. The general trend is high in the south and low in the north. Elevation ranges from 442 m to 2410 m above sea level, with an average elevation of 1171 m. Slope angles in the study area are in the range of 0 to 70 • . Most of the slope angles are in classes of 10-20 • (29.27%), followed by 20-30 • (26.29%), 0-10 • (23.64%), and 30-40 • (14.99%). Only 5.81% of slope angles are higher than 40 • .

Study Area
The study area was located in Shaanxi Province, China ( Figure 1) where the average annual temperature is 14.2 °C, the average annual rainfall is 909.8 mm, and the evaporation is 1537.1 mm.
Topographically, the study area is part of the Qinba Mountain. The general trend is high in the south and low in the north. Elevation ranges from 442 m to 2410 m above sea level, with an average elevation of 1171 m. Slope angles in the study area are in the range of 0 to 70°. Most of the slope angles are in classes of 10-20° (29.27%), followed by 20-30° (26.29%), 0-10° (23.64%), and 30-40° (14.99%).
Only 5.81% of slope angles are higher than 40°.

Data Preparation
A landslide inventory includes the locations of the past and recent landslides [21]. A landslide inventory can give insight into landslide location, dates, type, frequency of occurrence, state of activity, magnitude or size, failure mechanisms, causal factors, and damage caused [58,59]. In the present study, the landslide inventory map was prepared on the basis of satellite images (Google Earth and ZY03 images) and historical landslide records of the area, which were verified by GPS. A total of 202 landslides were identified to prepare the landslide susceptibility map, of which most of the landslides were slides (190), the others included 12 rock falls [60]. According to an analysis in the GIS environment, the smallest landslide was nearly 160 m 3 , the largest landslide was more than 1,000,000 m 3 , while the average was 33,000 m 3 . Finally, 141 landslides were randomly selected as training data and rest of them were used for the verification of the landslide susceptibility map ( Figure 1).
There are no universal guidelines for selecting landslide conditioning factors [33,61]. A total of 16 landslide conditioning factors were used for landslide susceptibility mapping including slope angle, slope aspect, elevation, plan curvature, profile curvature, topographic wetness index (TWI), stream power index (SPI), sediment transport index (STI), distance to rivers, distance to roads, distance to faults, soil, land use, normalized difference vegetation index (NDVI), lithology, and rainfall, which are considered as controlling factors in the occurrence of landslides in the study area.
Slope angle is an important factor that affects the stress state of slope mass, and these positions where stress exceeds failure strength may contribute to landslide hazards [62,63]. In this case, as shown in Figure 3a, the thematic data layer of the slope angle was reclassified into seven categories with an interval of 10 • , namely, (0-10 • ), (10-20 • ), (20- Slope aspect is another common conditioning factor for the task of landslide susceptibility mapping [64,65]. It has been proven that most landslides usually occur at a certain slope aspect for a given study area, but the mechanism has not been revealed clearly [66]. Therefore, slope aspect was also employed as a conditioning factor. Here, slope aspect categories include flat, north, northeast, east, southeast, south, southwest, west, and northwest ( Figure 3b).
Generally, it is considered that elevation has a firm relationship with landslide occurrence [67]. There is no denying that elevation can influence the topography, vegetation, temperature, humidity, human activities, and many other conditions that have a connection with slope stability [30,68]. In Figure 3c, the elevation of the study area was divided into ten classes with an interval of 200 m, i.e., Plan curvature and profile curvature are two quantitative indices that embody topographic characteristics and trend from different perspectives [69]. Various curvature values indicate different runoff and erosion conditions of water. For instance, the upwardly convex surfaces have positive curvature values while negative curvature values mean upwardly concave surfaces [30]. In this study, the plan curvature and profile curvature values were both reclassified into three groups (Figure 3d,e).
TWI was proposed to indicate the local groundwater potential by Moore [70] in 1991. Currently, TWI is regarded as an extensively-used causative factor in landslide susceptibility assessment [71]. It is expressed as TWI = ln( α tan β ), where β is the slope angle (radian), and α is the flow accumulation through a point [72]. The TWI values of the study area can be calculated by GIS software and reclassified as (<4), (4)(5), (5)(6), (6)(7), and (>7) with an interval of 1 ( Figure 3f). SPI can directly measure the erosion capacity of the stream. A higher SPI value indicates that the stream has more powerful erosion on the slope surface [55]. The SPI values are mainly determined as SPI = α tan β [54,70]. In this study, the SPI values were identified as five categories with an interval of 20, namely, (<20), , , , and (>80) (Figure 3g).
Rivers can not only affect the moisture distribution in slopes, but can also erode the toes of slopes, which cause slope deformation and failure [75]. Thus, it is necessary to consider the river effects when producing landslide susceptibility maps. In this study, based on the distance to rivers, five buffer zones with an interval of 200 m were generated for each river: (<200 m), (200-400 m), (400-600 m), (600-800 m), and (>800 m) (Figure 3i).
Generally speaking, road construction in mountainous areas, which always produce an engineering load and destroy the integrity of slope structure, have significant negative impacts on the slope stability [76]. Hence, the distance to roads is usually selected as a conditioning factor to embody the influence of road engineering activities on landslide occurrence [77]. Here, values of the distance to roads were divided into five groups with an interval of 300 m, i.e., (<300 m), (300-600 m), (600-900 m), (900-1200 m), and (>1200 m) (Figure 3j).
Fault structures affect the spatial distribution and characteristics of landslides in a certain region [50]. According to relevant studies [30,78], the integrity of rock and soil mass generally decrease as the distance to the faults shorten. In this way, landslide hazards are more likely to occur in the neighboring area of faults. Ultimately, buffers of various faults in the study area were obtained and reclassified into five categories with an interval of 1000 m: (<1000 m), (1000-2000 m), (2000-3000 m), (3000-4000 m), and (>4000 m) (Figure 3k).
In terms of soil, this is an essential factor that has a strong correlation with landslide occurrence [79]. To a great extent, the strength, root cohesion, permeability, and vegetation coverage of the soil mass depend on the soil type [80,81], which can impact the failure characteristics of slopes [82,83]. In this study area, a total of nine soil types were identified including cumulic anthrosol, dystric cambisol, eutric cambisol, calcaric fluvisol, haplic luvisol, chromic luvisol, eutric planosol, calcaric regosol, and eutric regosol ( Figure 3l).
Land use is one of the most frequently used conditioning factors, and the correlation between landslides and land use has been confirmed [84]. For instance, in some farmland regions, landslides are frequent and common under long-term irrigation [85]. For the study area, the types of land use mainly consist of farmland, forestland, grassland, water, residential areas, and bareland ( Figure 3m).
NDVI is a very popular index to measure the degree of vegetation in a region. NDVI values can be figured out by the formula NDVI = (I − R)/(IR + R), where IR is the infrared band and R is the red band of the electromagnetic spectrum [86]. The range of NDVI values is from −1 to 1, and a positive value means that the local ground is covered by vegetation. Five categories of NDVI values were generated based on the natural break method [87], namely (-0. 21 Like soil, lithology is one of the most important factors that directly determines slope stability. According to many existing studies, the physical and mechanical properties of rock mass usually change dramatically with lithological units [88]. Therefore, most landslides occur in the sliding-prone lithological units that have lower strength and a higher moisture content. For this study area, the strata were mainly reclassified into twelve lithological units based on the lithofacies and geological ages, and the specific distribution of various lithologies was illustrated in Figure 3o. Rainfall is a crucial triggering factor that causes massive landslides by means of raising the groundwater level and increasing pore water pressure [89]. It can be observed that the probability of landslide occurrence indeed grows under the actions of long-term or heavy rainfall. Based on the meteorological data of the study area, the corresponding rainfall map with an interval of 100 mm/yr was produced, i.e., (<900 mm/yr), (900-1000 mm/yr), (1000-1100 mm/yr), (1100-1200 mm/yr), (1200-1300 mm/yr), (1300-1400 mm/yr), (1400-1500 mm/yr), (1500-1600 mm/yr), (1600-1700 mm/yr), and (>1700 mm/yr) (Figure 3p).

Weight of Evidence
Weight of evidence (WoE) is one of the most popular models that uses the Bayesian theory of conditional probability to quantify spatial associations between evidence layers and known mineral occurrences [90]. In the WoE method, conditional independence is the most important issue that should be considered. The WoE is based on the calculation of positive weight W  and negative weight W  as follows:

Weight of Evidence
Weight of evidence (WoE) is one of the most popular models that uses the Bayesian theory of conditional probability to quantify spatial associations between evidence layers and known mineral occurrences [90]. In the WoE method, conditional independence is the most important issue that should be considered. The WoE is based on the calculation of positive weight W + and negative weight W − as follows: where B is the presence predictive factor; B is the absence of the predictive factor; A is the presence of landslide; and A is the absence of landslide. In landslide susceptibility prediction, the weight contrast W f = W + − W − was used to measure and reflect the spatial association between the landslide conditioning factors and landslide occurrence [91].

Logistic Regression
Logistic regression (LR) is one type of regression analysis where categorical outcomes can be predicted based on a certain predictor [92]. By using the logistic functions, probabilities of the possible outcomes can be modeled [93].
The logistic regression model is useful for two-class classification. Assuming there are n samples of the pairs, (x i , y i ), i = 1, 2, . . . , n, y i ∈ {−1, +1} is a binary class label for each sample i = 1, 2, . . . , n and weights (w, b). In the logistic regression for binary classification, the occurrence probability of the class is modeled with the below function: where b is the intercept; T is the matrix transposition; and the k-dimensional coefficient vector, w = (w 1 , w 2 , . . . w k ) T are parameters to be estimated.

Random Forest
Random forests (RF) are an ensemble of separately trained binary decision trees [94]. In the random forest algorithm, a random vector i k is naturally produced, independent from the previous random vectors and distributed to all trees, and each tree is grown using the training dataset and random vector i k , and outcomes are in the collection of tree-structured classifiers h(x, i k ), k = 1, 2, . . . n at input vector x. In this study, i k is the landslide conditioning factors. The random forest consisted of two trees, namely, landslide and non-landslide, each constructed while considering sixteen random features.
Generally, in a random forest algorithm, the generalization error is described as below [95]: where x and y are the landslide conditioning factors indicating the probability over the x, y space, and mg is the margin function, which is defined as below: Which measures the extent to which the average number of votes at random vectors for the right output exceeds the average vote for any other output. The I( * ) is the indicator function [96].

Correlation Analysis
The correlation between the conditioning factors and probability of landslides occurrence was measured by the weight contrast W f , and the calculation results of the WoE model are listed in Table 1. The LR method was employed to produce the landslide susceptibility map, and one of the most critical applicable conditions of LR is that the landslide conditioning factors are mutually independent [97]. Therefore, it is necessary to diagnose the multicollinearity of various conditioning factors when evaluating landslide susceptibility [98]. Currently, the tolerance (TOL) (TOL = 1 − R 2 , and R is the coefficient of determination of the regression equation) and variance inflation factor (VIF) (VIF = 1/TOL) have been applied in multicollinearity diagnosis [99][100][101]. Generally, a TOL value less than 0.1 or a VIF value larger than 10 is regarded as a symbol of multicollinearity [61]. In this study, the results of the WoE model were used as inputs to calculate the TOL and VIF values of all of the conditioning factors. In accordance with the calculated results, there was no multicollinearity among the landslide conditioning factors (Table 2).

Application of the WoE Model
In terms of slope angle, the slope angle between 10 • -20 • (0.669) is more prone to landslide occurrence. Additionally, the W f values of the region where slope angles larger than 50 • are 0. For the slope aspect factor, W f was the highest for south-facing (0.769). Furthermore, southeast-facing (0.310) and east-facing (0.001) also had a positive correlation with landslide occurrence. In the case of elevation, most landslides were distributed in the classes of 442-600 m (0.485), 600-800 m (0.968), and 800-1000 m (0.717). When the elevation was larger than 1200 m, elevation had an inhibitory effect on landslides. In the case of plan curvature, flat areas had a more important impact on landslides, whereas the W f values of convex areas and concave areas were 0.106 and −0.203, respectively. In the case of profile curvature, the W f values of the concave class, flat class, and convex class ere 0.042, 0.751, and −0.271, respectively. For TWI, the highest W f value was observed for the interval of 5-6 (0.497) while the class <4 (−0.945) had the lowest value. For SPI, the class <20 (0.277) had the highest W f value, and the areas of SPI 20-40 and >60 were negative for landslides. For STI, the class <10 had the highest W f value of 0.361, while the class 20-30 had the lowest value of −0.853. In the case of distance to rivers, the classes of <200 m (0.168) and 600-800 m (0.133) occupied higher W f values when compared to the other classes. In the case of distance to roads, the class of <300 m (0.904) had a more intimate correlation with landslide occurrence. In the case of distance to faults, it can be seen that the class >4000 m had the highest W f value of 0.246. For soil, eutric cambisol (1.177) and eutric planosol (1.516) were more likely to induce landslides due to the dramatic falling of soil strength under saturated conditions [85]. For land use, farmland (1.351) had the highest probability of landslide occurrence, which may be essentially caused by irrigation. According to the W f values of NDVI, the class of 0.36-0.44 (1.121) mainly contributed to landslide occurrence, while the lowest value was for the class of 0.52-0.65 (−2.212), which indicates that high vegetation coverage can restrain landslides. In the case of lithology, the W f values of group 1 (Q) (0.588), group 11 (Ar) (0.594), and group 12 (Pt, Pz) (0.417) were larger than 0, indicating that these lithological groups had the highest susceptibility to landslide. In the case of rainfall, the range between 1100-1200 mm/yr (1.030) showed high susceptibility for landslide occurrence.
The calculated W f values for all landslide conditioning factors were summed using the following equation to construct the landslide susceptibility map (LSM): LSM WoE = Slope angle Wf + Slope aspect Wf + Elevation Wf + Plan curvature Wf +Profile curvature Wf + TWI Wf + SPI Wf + STI Wf + Distance to rivers Wf +Distance to roads Wf + Distance to faults Wf + NDVI Wf + Soil Wf + Landuse Wf +Lithology Wf + Rainfall Wf (6) The integrated result of the WoE model is shown in Figure 4. The LSM was reclassified into five classes based on the natural break method: very low, low, moderate, high, and very high.
(0.594), and group 12 (Pt, Pz) (0.417) were larger than 0, indicating that these lithological groups had the highest susceptibility to landslide. In the case of rainfall, the range between 1100-1200 mm/yr (1.030) showed high susceptibility for landslide occurrence.
The integrated result of the WoE model is shown in Figure 4. The LSM was reclassified into five classes based on the natural break method: very low, low, moderate, high, and very high.

Application of the WoE-LR Model
In this case, SPSS 18.0 software was applied to build a landslide susceptibility model with the WoE-LR model. The input table of the LR model can be generated by the determined class values of variables based on the WoE model [54]. In the analysis process, a forward stepwise LR was adopted, and the analysis results are given in Tables 3 and 4. The Cox and Snell R Square (0.245) and Nagelkerke R Square (0.326) are two pseudo determined coefficients that are used to reflect the degree of independent variables explaining dependent variables [102,103]. According to Table 4, the LR equation and landslide occurrence probability P can be expressed as Equations (7) and (8)  Ultimately, the landslide susceptibility index (LSI) for the LR model were obtained based on Equation (8), moreover, the LSI values were reclassified into five categories by the natural break method: very low, low, moderate, high, and very high ( Figure 5).

Application of the WoE-RF Model
Similarly, the calculated results of the WoE model can be used as input for the RF model. In this study, training of the RF model was implemented by WEKA software. During the analyzing process, the importance of various conditioning factors can be measured quantitatively and ordered by MDA (mean decrease accuracy) and MDG (mean decrease Gini). Generally, MDA is determined during the Out-Of-Bag error calculation phase, while MDG is a measure of how each variable contributes to the homogeneity of the nodes and leaves [104]. The values of the above-mentioned two metrics of the conditioning factors are illustrated in Figure 6, and a larger value of MDA or MDG means a higher importance of the corresponding variable. Accordingly, in terms of MDA, land use is the most critical factor in the RF model, while soil is second in importance only to land use. For the MDG, the importance of elevation was first, followed by rainfall and land use. Finally, based on ArcGIS software, the landslide susceptibility map using the WoE-RF was generated and is shown in Figure   7.

Application of the WoE-RF Model
Similarly, the calculated results of the WoE model can be used as input for the RF model. In this study, training of the RF model was implemented by WEKA software. During the analyzing process, the importance of various conditioning factors can be measured quantitatively and ordered by MDA (mean decrease accuracy) and MDG (mean decrease Gini). Generally, MDA is determined during the Out-Of-Bag error calculation phase, while MDG is a measure of how each variable contributes to the homogeneity of the nodes and leaves [104]. The values of the above-mentioned two metrics of the conditioning factors are illustrated in Figure 6, and a larger value of MDA or MDG means a higher importance of the corresponding variable. Accordingly, in terms of MDA, land use is the most critical factor in the RF model, while soil is second in importance only to land use. For the MDG, the importance of elevation was first, followed by rainfall and land use. Finally, based on ArcGIS software, the landslide susceptibility map using the WoE-RF was generated and is shown in Figure 7.

Validation of Landslide Models
Currently, the ROC and AUC have been widely applied to validate the performance of determined landslide susceptibility models [64,69,105]. The ROC curve can be generated by plotting the false positive rate (100-specificity) in the x-axis versus the sensitivity in the y-axis [71]. The area under the ROC curve (AUC) is an indicator of the global summary measure of the performance of a model [106][107][108]. In the present study, to assess the validation of the WoE, WoE-LR, and WoE-RF models, the ROC curves of three models with training and validation datasets are described in Figures 8  and 9.

Validation of Landslide Models
Currently, the ROC and AUC have been widely applied to validate the performance of determined landslide susceptibility models [64,69,105]. The ROC curve can be generated by plotting the false positive rate (100-specificity) in the x-axis versus the sensitivity in the y-axis [71]. The area under the ROC curve (AUC) is an indicator of the global summary measure of the performance of a

Discussions
Under the action of environmental factors and human activities, the frequency of landslide occurrence has been increasing in recent decades, which may result in catastrophic losses on lives, resources, and property [109,110]. Currently, numerous approaches have been used in landslide susceptibility mapping such as FR [23], WoE [19], IoE [111], machine learning [64,112], and ensemble learning models [53,54]. In the above-mentioned models, the probabilistic meaning and calculation procedure of the WoE model are relatively concise and specific, which makes the WoE a classical and widely used method in landslide susceptibility mapping. Nevertheless, due to the uncertainties and fuzziness in the data of the conditioning factors [113], for different datasets, the performance of the WoE models were significantly distinguished [114][115][116]. In the present study, the integrated ensemble WoE with LR and RF models were proposed and applied for landslide susceptibility modeling in order to improve the accuracy and generalization ability of the traditional WoE model.
Landslide inventory map is a preliminary step toward landslide susceptibility, hazard and risk assessment [59]. Generally, there are two classes of Landslide inventories: landslide-event inventories that are associated with a trigger and historical landslide inventories [59,117]. In the present study, we adopted the latter formation, which was the sum of many landslide events over a long time. In the case of the training dataset, the WoE-RF model had the best performance with the highest AUC value of 0.802, while the AUC values of the WoE model and WoE-LR model were 0.720 and 0.773, respectively. Meanwhile, the WoE-RF model had the lowest standard error (0.0275) and a 95% confidence interval of 0.729-0.829. Thus, the WoE-RF and WoE-LR models can improve the accuracy of the traditional WoE model in this study, and the WoE-RF model showed a relatively better performance.
In the case of the validation data, it can be seen that the AUC values of the various models decreased slightly when compared with the training dataset. The AUC values were 0.695, 0.763, and 0.782 for the WoE model, WoE-LR model, and WoE-RF model, respectively. Similarly, the lowest standard error was 0.0430 for the WoE-RF model, followed by the WoE-LR model (0.0440), and the WoE model (0.0484). The detailed results demonstrated that the WoE-RF model had a prominent prediction capacity on landslide susceptibility mapping.

Discussions
Under the action of environmental factors and human activities, the frequency of landslide occurrence has been increasing in recent decades, which may result in catastrophic losses on lives, resources, and property [109,110]. Currently, numerous approaches have been used in landslide susceptibility mapping such as FR [23], WoE [19], IoE [111], machine learning [64,112], and ensemble learning models [53,54]. In the above-mentioned models, the probabilistic meaning and calculation procedure of the WoE model are relatively concise and specific, which makes the WoE a classical and widely used method in landslide susceptibility mapping. Nevertheless, due to the uncertainties and fuzziness in the data of the conditioning factors [113], for different datasets, the performance of the WoE models were significantly distinguished [114][115][116]. In the present study, the integrated ensemble WoE with LR and RF models were proposed and applied for landslide susceptibility modeling in order to improve the accuracy and generalization ability of the traditional WoE model.
Landslide inventory map is a preliminary step toward landslide susceptibility, hazard and risk assessment [59]. Generally, there are two classes of Landslide inventories: landslide-event inventories that are associated with a trigger and historical landslide inventories [59,117]. In the present study, we adopted the latter formation, which was the sum of many landslide events over a long time. However, the evidence of many smaller landslides might has been lost due to various degrees of modification by subsequent landslides, erosional processes, vegetation growth and anthropic influences [59]. Therefore, application of multi-temporal high-resolution satellites images for interpretation of smaller landslides may be an effective supplement to the current landslide inventory and efficient for improving the accuracies of landslide susceptibility maps.
According to the existing literature and multicollinearity analysis, sixteen conditioning factors were selected: slope angle, slope aspect, elevation, plan curvature, profile curvature, TWI, STI, SPI, distance to rivers, distance to roads, distance to faults, NDVI, soil, land use, lithology, and rainfall. Furthermore, based on the W f values, the relationships between landslide occurrence and these factors were analyzed. It was demonstrated that all factors had nonlinear relationships with landslides. In addition, the RF model was employed to measure the importance of factors with two indices, the MDA and MDG. In terms of MDA, it could be observed that the most critical factor was land use, followed by soil and elevation. Slope angle had the lowest impact on landslide occurrence. However, for MDG, the importance of elevation, rainfall, and land use ranked first, second, and third, respectively, while the lowest MDG value was for profile curvature.
There are some classification techniques for a landslide susceptibility map in GIS software, such as manual, defined interval, natural break, equal interval, quantile, standard deviation, geometrical interval, and landslide percentage [118]. Generally, user-defined classification is more difficult for the reader to interpret and justify. Therefore, current automatic classification systems should be used instead of a user-defined classification [118]. Besides, when landslide susceptibility indexes have positive or negative skewness, the best classification methods are quantile or natural break [119]. In the present study, natural break method, which is the most commonly used models [120,121], is the most suitable method for modelling landslide susceptibility according to the histogram of data distribution.
In this paper, a comparison study of the WoE, WoE-LR, and WoE-RF models was implemented. LR is a widely used model for classification, particularly for binary classification problems [122]. Thus, we integrated the WoE with the LR model to acquire a better classifier. The WoE-RF model is a combination of the weight of evidence and random forest approach. It has been proven that RF is one of the most popular classification algorithms and can improve the performance of single classifiers [96,123]. Moreover, RF can decrease the dependence of the WoE model on independence among the conditioning factors. Accordingly, the results showed that both the LR model (AUC = 0.773 for training data; AUC = 0.763 for validation data) and RF model (AUC = 0.802 for training data; AUC = 0.782 for validation data) can increase the performance of the traditional WoE model (AUC = 0.720 for training data; AUC = 0.695 for validation data), and the WoE-RF model produced the best results.
Comparing the overall classification results of the three models, the results confirmed that the RF model had a better performance on improving the generalization ability of a weak classifier and raising the corresponding prediction accuracy. Therefore, the landslide susceptibility maps generated by the WoE-RF and WoE-LR models contain reference meaning for the study area to a certain extent. Furthermore, the procedure of factor selection and ensemble model construction is of some value to similar studies.

Conclusions
The results are indicative of the quality of the maps drawn by the hybrid approaches of traditional bivariate weights of evidence (WoE) with multivariate logistic regression (WoE-LR) and machine learning-based random forest (WoE-RF). In general, the following conclusions can be drawn: (1) Geomorphological factors, geological factors, geo-environmental factors, and anthropogenic factors were used for the development of the landslide model. The preliminary selection of these 16 conditioning factors was based on the multicollinearity diagnosis. The TOL and VIF values of all the conditioning factors indicated no multicollinearity.
(2) According to the results of the WoE model, most occurred at slopes of 10-20 • with the south aspect, elevations of 600-800 m, distance to rivers of <200 m, distance to roads of <300 m, and a farmland land cover category.
(3) WoE-RF possessed relatively good accuracy when compared to the WoE-LR and WoE models. By using the ROC curve, the AUC values of the training dataset produced by these three methods were 0.802, 0.773, and 0.720, respectively. For the validation dataset, the AUC values were 0.782, 0.763, and 0.695, respectively. It can be concluded that the proposed hybrid models are promising approaches for the spatial prediction of landslides and can also be applied in other landslide-prone areas.
Author Contributions: W.C. and Z.S. collected the field data and conducted the landslide mapping and analysis. W.C. and Z.S. wrote the manuscript. J.H. provided critical comments in planning this paper and edited the manuscript. All authors discussed the results and edited the manuscript.