Landslide Susceptibility Mapping Using the Stacking Ensemble Machine Learning Method in Lushui, Southwest China

Abstract: Landslide susceptibility mapping is considered to be a prerequisite for landslide prevention and mitigation. However, delineating the spatial occurrence pattern of landslides remains a challenge. This study investigates the potential application of the stacking ensemble learning technique for landslide susceptibility assessment. In particular, support vector machine (SVM), artificial neural network (ANN), logistic regression (LR)


Introduction
Landslides are severe natural disasters and are responsible for losses of life and property [1]. In 2018, approximately 105 deaths and direct economic losses of US$212 million were caused by landslides along with other disasters in China (http://www.cigem.cgs.gov.cn). To reduce these harmful impacts, an assessment of slope conditions must be performed. Landslide susceptibility mapping is an efficient method to identify vulnerable areas, and is used to carry out early risk assessments [2].
Numerous approaches, including heuristic and physically based methods, have been applied to landslide susceptibility mapping in the last few decades, and their advantages and disadvantages are well summarized in previous studies [3][4][5]. Heuristic methods usually fail to provide a quantitative evaluation of the spatial likelihood of landslide occurrence [6]. Physically based methods are characterized by heavy computing requirements and high costs for collecting the necessary data, and are not suitable for large-scale susceptibility mapping [7]. Recently, machine learning methods have been shown to be effective in addressing the issue of landslide spatial prediction [8]. Relevant work has focused on logistic regression (LR) [9,10], the fuzzy logic method [11,12], maximum entropy (ME) [13], artificial neural network (ANN) [14][15][16], Bayesian network [3], naïve Bayes (NB) [17], support vector machine (SVM) [18][19][20], classification and regression tree [21,22], and logistic model tree [23]. Machine learning algorithms treat landslide-related conditions as input to predict where landslides are likely to occur, without the constraint of statistical assumptions. Specifically, state-of-the-art machine learning techniques such as SVM and ANN are the most popular landslide modeling methods. They have achieved promising predictive capability for landslide modeling and are usually used as benchmarks to test novel methods [24,25]. NB and LR have also been widely employed for landslide susceptibility mapping as they are easily implemented [2]. Yet, there is still no general agreement on which method is better, because the single or simple hypothesis space of a learning algorithm has difficulty meeting all case scenarios as the data change [26].
Modeling of landslide susceptibility still faces challenges due to the complex nonlinear relationship between conditioning factors and landslide occurrences [27]. Achieving higher accuracy is the focus of landslide susceptibility assessment. More recently, the ensemble machine learning method has been shown to provide an improved solution for landslide susceptibility modeling [28]. Ensemble methods can expand the hypothesis space of the fitting function, thus providing better prediction than single algorithms [29,30]. In general, a single algorithm used to constitute an ensemble is called a "base learner", and base learners can be homogeneous or heterogeneous. Several landslide studies have investigated meta-learning techniques for assembling homogeneous base learners [25,31-36]. In these studies, bagging [37], boosting [38,39], random forest [40], and rotation forest [41] were most widely applied. Additionally, combination schemes based on heterogeneous landslide models have also been applied successfully to landslide susceptibility mapping, such as ANN-Bayes analysis [42]; ANN-fuzzy logic [43]; combinations based on linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), LR, and ANN [44]; the combination of the statistical index and adaptive neuro-fuzzy inference system [45]; and combinations based on ANN, MaxEnt, and SVM [24]. Exploring novel ensemble modeling methods is of significance for landslide susceptibility assessment.
In this paper, for the first time, the stacking ensemble method based on multiple machine learning algorithms was used to conduct landslide susceptibility mapping in the Lushui area, southwest China. The method has a two-level structure comprising heterogeneous base learners at the base level and a meta-learner at the high level. It is worth mentioning that the advantages of the stacking method have been explored in different disciplines. Wang et al. [46] stated that a stacking-based credit scoring model obtained improved predictive performance over its base learners with respect to average accuracy and error. Another example is a study of flood frequency analysis in which the ANN with stacking outperformed both the standalone ANN and the combination of ANNs using the averaging strategy [47]. In this work, four machine learning algorithms, namely SVM, ANN, LR, and NB, were selected as candidate base learners of the stacking ensemble method to predict spatial landslide susceptibility.
Furthermore, a strategy for evaluating the importance of the components of the stacking ensemble is presented. The strategy combines resampling with Pearson's correlation analysis. We first constructed an initial ensemble model composed of four machine learning algorithms. Then a series of point estimates of performance for each base learner was obtained via resampling. The scatter of these point pairs reflects the correlation between base learners, which can be measured using Pearson's correlation coefficient. Strongly correlated base learners were treated as inappropriate components and could be eliminated from the initial ensemble model. The performance of ensemble models built from the remaining base learners was compared with that of the initial ensemble model to verify this strategy. The strategy was used to construct a proper stacking ensemble machine learning model for a realistic case: the Lushui area, southwest China, where landslides occur frequently.
The structure of the paper is as follows. Section 2 gives a description of the study area. Section 3 presents the data and methodology used in this study. Sections 4 and 5 provide the results and discussion, respectively, and conclusions are given in the last section.

Study Area
Lushui is a subordinate Autonomous County of Yunnan Province in Southwest China, located between the longitudes of 98°34′ E and 99°09′ E and the latitudes of 25°33′ N and 26°32′ N, with a total area of approximately 3100 km² (Figure 1). Topographically, the Nujiang River Grand Canyon runs through the whole area from north to south. The elevation ranges from 740 m to 4160 m, and increases from the middle of the canyon to both sides. Areas with a slope angle between 15° and 35° account for 59.7% of the study area, and areas with slopes over 40° cover 13.2%. The region has a tropical monsoon climate, with an annual average temperature of approximately 15.1 °C. The rainy season extends from May to October each year, and the rainfall during the rainy season accounts for 70% of the total annual rainfall. The largest daily rainfall can reach 105.3 mm during the rainy season. The annual average rainfall is approximately 1200 mm. The surface runoff of the study area mainly consists of the Nujiang River system and its tributaries.
Geologic units in the study area range from Precambrian to Quaternary and include metamorphic, igneous, and sedimentary rocks as well as unconsolidated sediments (Figure 2). The oldest geological units are represented by the Proterozoic Gaoligongshan (Ptgl) and Chongshan (Ptch) groups, which widely appear in the eastern part of the study area and along the area west of the Nujiang Fault. Strata from the Paleozoic (Pz) to Mesozoic (Mz) outcrop in the area east of the Nujiang Fault. The youngest Quaternary sediments, consisting of alluvial and pluvial deposits, are widely distributed on the terraces of the Nujiang River and are the major slope-forming materials. Knowledge of geology is critical because different rock types and sediments have different physical hardness, interlayer structures, and weathering and erosion resistance, and those properties are related to slope susceptibility. In particular, soft clay or other phyllosilicate minerals are highly plausible causes of landslide occurrence, while rocks or sediments with strong mechanical strength and good weathering resistance usually have a better inhibiting effect on slope deformation. In this study, the geological units were reclassified into seven engineering rock groups (ERGs) according to the lithology, structure, hardness, and weathering degree of the rocks. Table 1 gives a detailed description of the ERGs.
Lushui County is one of the key monitoring zones for geo-hazards in Yunnan Province. A complex geographic and geological setting and seasonal heavy rain make the area prone to geo-hazards. For example, on 5 July 2018, falling rocks induced by a landslide near a road in Chenggan hit moving cars and directly caused three deaths, with seven injured. On 3 October 1997, unwise mining activities at Shiganghe disturbed unstable strata located on a fracture zone, triggering slope sliding and causing 23 deaths and economic losses of US$2.5 million. Landslides have caused huge damage in this area, prompting the urgent need for effective regional landslide susceptibility mapping.

Spatial Database
Landslide modeling is mainly based on the assumption that landslides are more likely to occur in the future under conditions similar to those that led to past landslides [2]. Therefore, the landslide inventory plays a fundamental role in landslide susceptibility assessment. The field surveys conducted from 2014 to 2016 by the Department of Nature Resources of Yunnan Province are the major sources of the landslide inventory data used in this study. A total of 388 landslide locations were identified in the study area, mainly distributed in the middle area of Lushui County, especially along the Nujiang River Grand Canyon and important traffic lines. The recorded landslides mainly occurred on soil slopes, varying from 80 m² to 600,000 m² in plan area. Overall, 66.8% had a slide depth <10 m and 94.04% were classified as small to medium-sized (<100 × 10⁴ m³). Since most of the landslides were small relative to the working scale, their locations could be represented as point features [48,49]. Subsequently, the landslide inventory map of the study area was generated by plotting the centroids of those landslides (Figure 1).
Landslide occurrence is the result of the negative synergy of various topographical, geological, hydrological, and anthropogenic features [50]. The preparation of conditioning factors used for landslide modeling is a crucial step. According to the conditions of the study area and the mechanism of landslide occurrence, twelve landslide-related factors were initially considered: elevation, slope angle, slope aspect, plan curvature, profile curvature, ERG, land use, distance to roads, distance to rivers, distance to faults, annual rainfall, and normalized difference vegetation index (NDVI). Those factors were collected from various available sources. Specifically, the digital elevation models (DEMs) were derived from a digital contour map (at the scale of 1:10,000) with an equidistance of 5 m. We identified five geomorphic factors, namely elevation, slope angle, slope aspect, plan curvature, and profile curvature, with a resolution of 100 m. The road network, river network, and land use were acquired from the data of the Second Detailed Land Investigation Nationwide of China (on a county scale) provided by the Department of Nature Resources of Yunnan Province. The distance to roads and distance to rivers were then generated by buffering the road network and river network, respectively. Geological factors play an indispensable part in landslide susceptibility analysis [51]. The ERG and distance to faults were considered important geological factors and were collected from a regional geological map at the scale of 1:100,000. The types of ERGs reflect different physical and mechanical properties of rocks that are related to the failure mechanism. The integrity of rock and slope is usually influenced by the distance to faults. The NDVI map used in this study was constructed from Landsat 8 OLI imagery with a resolution of 30 m (http://www.gscloud.cn). Rainfall is generally regarded as a causative factor for slope movement [52]. In this work, the Yunnan Digital Village Website (http://www.ynszxc.net) was used to collect the annual average rainfall for each village in Lushui County, from which the annual rainfall map of the whole study area was then produced through Kriging interpolation. Finally, all landslide conditioning factor layers were resampled into 310,604 raster cells with the same pixel size of 100 m × 100 m to unify the data scale for further analysis (Figure 3).
Appl. Sci. 2020, 10, 4016

Preparation of Training and Validation Datasets
Landslide spatial prediction can be treated as a binary classification problem, where landslides and non-landslides represent positive cases and negative cases, respectively [53]. It is usually recommended that an equal number of negative and positive cases be prepared [54]. To select representative negative cases, K-means clustering was used to determine non-landslides from the extensive landslide-free areas. Accordingly, the information of all non-landslide grid units was imported into SPSS 22, and those grid units were classified into 388 categories using the K-means clustering function. The unit nearest to the clustering center of each category was taken as a negative case, because the clustering distance represents how closely a unit approximates its category [55]. The resulting 388 non-landslide locations are shown in Figure 4a. Then, landslides and non-landslides were randomly divided into two parts with the same configuration (70% for training and 30% for validation) (Figure 4b).
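The negative-case selection and the 70/30 split described above can be sketched in a few lines. This is only an illustrative stand-in for the SPSS workflow: the pure-Python Lloyd's K-means, the function names, the random toy points, and k = 5 are all assumptions made for the example (the study used 388 clusters on the real grid units).

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's algorithm; returns the k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centers[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers

def representative_negatives(points, k, seed=0):
    """Index of the unit closest to its own cluster center, one per non-empty cluster."""
    centers = kmeans(points, k, seed=seed)
    best = {}
    for i, p in enumerate(points):
        c = min(range(k), key=lambda j: squared_distance(p, centers[j]))
        d = squared_distance(p, centers[c])
        if c not in best or d < best[c][0]:
            best[c] = (d, i)
    return sorted(i for _, i in best.values())

def split_70_30(items, seed=0):
    """Random 70% training / 30% validation split."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical stand-in for the non-landslide grid units (two features each).
rng = random.Random(42)
points = [(rng.random(), rng.random()) for _ in range(200)]
negative_idx = representative_negatives(points, k=5)
train, valid = split_70_30(list(range(10)))
```

In practice the same split would be applied to the landslide and non-landslide sets separately so that both parts keep the same class balance.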

Importance Analysis of Landslide Conditioning Factors
Many studies have suggested that the importance of landslide conditioning factors should be evaluated, because factors with noise or poor quality may jeopardize landslide spatial prediction results [5,23]. To assess landslide susceptibility in the study area, we prepared twelve conditioning factors in the initial step. Considering that the contribution of each factor to landslide occurrence may not be equal, the information value (IV) evaluation method was utilized in this step to detect the factors appropriate for modeling. The IV is known as an effective feature selection method in the field of credit risk scorecards [56,57], and is designed to describe the ability of an input variable to distinguish whether a target variable occurs or not. A higher IV indicates that the variable is more important. The IV can be calculated using the following equations:
\[ \mathrm{WoE}_i = \ln\left( \frac{n_{i1}/n_1}{n_{i0}/n_0} \right) \]
\[ \mathrm{IV} = \sum_i \left( \frac{n_{i1}}{n_1} - \frac{n_{i0}}{n_0} \right) \mathrm{WoE}_i \]
where $\mathrm{WoE}_i$ is the weight of evidence of class $x_i$, $n_1$ is the total number of landslide pixels, $n_0$ is the total number of non-landslide pixels, $n_{i1}$ is the number of landslide pixels in class $x_i$ of variable $x$, and $n_{i0}$ is the number of non-landslide pixels in class $x_i$ of variable $x$.
The prepared landslide conditioning factors contain continuous variables (e.g., elevation) and discrete variables (e.g., land use). The IV analysis, implemented in R, can automatically split continuous variables into several intervals based on supervised discretization [58,59]. The parameter p (minimum allowed percentage of samples per interval) was set to 0.01 to control the discretization.
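As a minimal sketch of the IV calculation, the function below uses the weight-of-evidence form common in credit scoring: for each class of a factor, the share of landslide pixels is compared with the share of non-landslide pixels. The function name, the toy counts, and the choice to skip classes with an empty count are assumptions for illustration, not the R implementation used in the study.

```python
import math

def information_value(landslide_counts, non_landslide_counts):
    """IV of one factor from per-class landslide / non-landslide pixel counts."""
    n1 = sum(landslide_counts)       # total landslide pixels
    n0 = sum(non_landslide_counts)   # total non-landslide pixels
    iv = 0.0
    for ni1, ni0 in zip(landslide_counts, non_landslide_counts):
        if ni1 == 0 or ni0 == 0:
            continue  # WoE is undefined here; skipping is an assumption of this sketch
        p1, p0 = ni1 / n1, ni0 / n0
        woe = math.log(p1 / p0)      # weight of evidence of this class
        iv += (p1 - p0) * woe
    return iv

# Hypothetical counts for a factor split into two classes.
iv_uninformative = information_value([10, 10], [10, 10])
iv_informative = information_value([30, 10], [10, 30])
```

A factor whose classes contain landslide and non-landslide pixels in the same proportions gets an IV of zero, which is why such a factor carries no discriminating information.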

Modeling Methods
The stacking ensemble method was first introduced by Wolpert [60]. Unlike most existing ensemble learning methods, stacking uses meta-learning to combine different types of algorithms [61]. The structure of stacking consists of two levels, namely level-0 and level-1, and the outputs of multiple base learners (level-0) are combined by the meta-learner (level-1). A simple sketch of the stacking structure constructed in this study is shown in Figure 5. SVM, ANN, LR, and NB served as the base learners. These machine learning methods are commonly used for landslide susceptibility modeling [20,24,62,63]. As for the meta-learner, LR was applied following the suggestions of previous research [64,65].

Support Vector Machine
The support vector machine (SVM) is a binary classifier based on the structural risk minimization principle [66], which uses quadratic optimization to calculate the optimum separating hyperplane between datasets with a maximum margin on either side. In this way, two different classes of points can be separated as far as possible. The classification function of the SVM can be expressed as the following constrained optimization problem:
\[ \min_{\omega, b} \ \frac{1}{2} \lVert \omega \rVert^2 \]
subject to:
\[ y_i (\omega \cdot x_i + b) \ge 1 \]
where $\omega$ is an n-dimensional vector normal to the optimum hyperplane, $b$ is the bias, and $y_i$ is the class label (landslide or non-landslide), which belongs to the set {1, −1}.
The radial basis function (RBF) kernel is usually recommended for cases where the classes are not linearly separable [67,68]. In this work, the SVM with the RBF kernel was utilized.
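For reference, the RBF kernel measures the similarity of two samples as a decaying function of their squared Euclidean distance. The sketch below shows the standard form; the function name and the value of the width parameter gamma are assumptions, since the kernel settings are not stated in this section.

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """Standard RBF kernel: exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

# Identical samples have similarity 1; similarity decays with distance.
k_same = rbf_kernel((1.0, 2.0), (1.0, 2.0))
k_far = rbf_kernel((0.0, 0.0), (1.0, 1.0))
```

A larger gamma makes the similarity fall off faster, so the decision boundary can follow the training points more closely.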

Artificial Neural Network
The artificial neural network (ANN) is a computational algorithm that attempts to simulate the way biological neurons process information [69]. The error backpropagation (BP) algorithm is probably the most popular ANN training algorithm [70]. The BP ANN process includes forward propagation of the signal and backward propagation of the error. The initial connection weights, which pass the outputs of units in the previous layer forward to units in the next layer, are set randomly and then iteratively adjusted according to the back-propagated discrepancies between the computed and desired results, until the error of the network falls below a satisfactory threshold or is minimized.

Logistic Regression
Logistic regression (LR) fits a regression model that describes the relationship between several independent variables and a binary dependent variable [71]. The output of the LR is an estimated probability of class membership, obtained by transforming the result of a linear regression with the logistic function [72]. The calculation of the LR is shown as follows:
\[ p = \frac{1}{1 + e^{-z}} \]
\[ z = B_0 + B_1 x_1 + B_2 x_2 + \cdots + B_n x_n \]
where $p$ is the probability of landslide occurrence, $n$ represents the number of landslide conditioning factors, $x_i$ are the conditioning factors, $B_0$ is a constant that represents the intercept of the fitted model, and $B_i$ ($i = 1, 2, \ldots, n$) are the regression coefficients.
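A short sketch of how an LR model turns conditioning factors into a landslide probability: a linear combination of the factors is passed through the logistic function. The function name, coefficients, and factor values below are made up for illustration only.

```python
import math

def landslide_probability(x, intercept, coefs):
    """Logistic function applied to the linear predictor intercept + sum(coef * x)."""
    z = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical intercept and coefficients for two conditioning factors.
p_neutral = landslide_probability([0.0, 0.0], 0.0, [1.0, 2.0])
p_high = landslide_probability([1.0, 1.5], -1.0, [1.0, 2.0])
log_odds = math.log(p_high / (1.0 - p_high))
```

Taking the log-odds of the output recovers the linear predictor, which is what makes the fitted coefficients directly interpretable.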

Naïve Bayes
Naïve Bayes (NB) is a statistical classifier based on Bayes' theorem [73]. To simplify the calculation of the posterior probability of observed cases in the training data, NB assumes that each attribute independently affects the classification result [74]. Landslide spatial prediction using NB can be expressed by the following equation:
\[ y_{NB} = \arg\max_{y_i \in \{\text{landslide},\ \text{non-landslide}\}} P(y_i) \prod_{k=1}^{n} P(x_k \mid y_i) \]
where $P(y_i)$ is the prior probability of target class $y_i$, and $P(x_k \mid y_i)$ is the conditional probability of attribute $x_k$ given that class.
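The decision rule can be illustrated with a small categorical NB classifier: the class prior is multiplied by one conditional probability per attribute, and the class with the largest product wins. The toy attributes, the labels, and the use of Laplace smoothing are assumptions of this sketch and are not taken from the study.

```python
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Count class frequencies and attribute-value frequencies per class."""
    class_counts = Counter(labels)
    feature_counts = Counter()      # (attribute index, value, class) -> count
    values = defaultdict(set)       # attribute index -> set of observed values
    for row, y in zip(rows, labels):
        for k, v in enumerate(row):
            feature_counts[(k, v, y)] += 1
            values[k].add(v)
    return class_counts, feature_counts, values

def predict_nb(model, row):
    """Return the class maximizing prior * product of conditional probabilities."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for y, ny in class_counts.items():
        score = ny / total          # prior P(y)
        for k, v in enumerate(row):
            # Laplace smoothing (an assumption) avoids zero probabilities.
            score *= (feature_counts[(k, v, y)] + 1) / (ny + len(values[k]))
        scores[y] = score
    return max(scores, key=scores.get), scores

# Hypothetical training cells: (slope class, distance-to-road class), 1 = landslide.
rows = [("steep", "near"), ("steep", "near"), ("steep", "far"),
        ("gentle", "far"), ("gentle", "far"), ("gentle", "near")]
labels = [1, 1, 1, 0, 0, 0]
model = train_nb(rows, labels)
label_a, scores_a = predict_nb(model, ("steep", "near"))
label_b, scores_b = predict_nb(model, ("gentle", "far"))
```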

Ensemble Modeling
Once all base learning algorithms are prepared, they are integrated into a single framework using the stacking method. Suppose the initial dataset $D$ consists of examples $d_i = (x_i, y_i)$, $i \in [1, N]$, where $x_i$ denotes the landslide conditioning factors, $y_i$ denotes the corresponding classification (landslide or non-landslide), and $N$ represents the total number of examples in the modeling dataset. The base learning algorithms SVM, ANN, LR, and NB are denoted as $L_t$ ($t = 1, 2, 3, 4$). Firstly, the dataset $D$ is repeatedly divided into two disjoint subsets; one subset, written here as $D_{\mathrm{train}}$, is used to train the base learning algorithms to generate level-0 classifiers, denoted as $h_t$:
\[ h_t = L_t(D_{\mathrm{train}}) \]
The remaining examples are used to make predictions ($z_{it}$) through the trained classifiers:
\[ z_{it} = h_t(x_i) \]
These outputs from the level-0 classifiers, along with their true classifications, comprise a new dataset $D' = \{((z_{i1}, z_{i2}, \ldots, z_{i4}), y_i)\}$, which is then fed to level-1 to train the meta-learner (LR). Thus, LR can assemble the classification results of the base learners to produce the final prediction for new cases:
\[ h'(x) = \mathrm{LR}(h_1(x), h_2(x), h_3(x), h_4(x)) \]
Based on the four single candidate algorithms and the stacking technique, the initial ensemble model was constructed, denoted as SVM-ANN-NB-LR (SANL).
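The two-level procedure can be sketched as follows: each base learner is trained on part of the data, its predictions on the held-out part become the inputs of the meta-learner, and a logistic regression is fitted on those out-of-fold predictions. To keep the example dependency-free, two toy learners stand in for SVM, ANN, LR, and NB; the function names, the 5-fold scheme, the learning rate, and the synthetic data are all assumptions rather than the study's configuration.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def fit_centroid(X, y):
    """Toy level-0 learner: relative distance to the two class centroids."""
    def centroid(label):
        rows = [x for x, t in zip(X, y) if t == label]
        return [sum(col) / len(rows) for col in zip(*rows)]
    c0, c1 = centroid(0), centroid(1)
    def predict(x):
        d0 = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, c0)))
        d1 = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, c1)))
        return d0 / (d0 + d1) if d0 + d1 > 0 else 0.5
    return predict

def fit_threshold(X, y):
    """Toy level-0 learner: squashes feature 0 around the midpoint of the class means."""
    m0 = sum(x[0] for x, t in zip(X, y) if t == 0) / y.count(0)
    m1 = sum(x[0] for x, t in zip(X, y) if t == 1) / y.count(1)
    mid, sign = (m0 + m1) / 2.0, (1.0 if m1 >= m0 else -1.0)
    return lambda x: sigmoid(sign * (x[0] - mid))

def fit_logistic(Z, y, lr=0.5, epochs=500):
    """Level-1 meta-learner: logistic regression by batch gradient descent."""
    w, b = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for z, t in zip(Z, y):
            err = sigmoid(b + sum(wk * zk for wk, zk in zip(w, z))) - t
            gw = [g + err * zk for g, zk in zip(gw, z)]
            gb += err
        w = [wk - lr * g / len(y) for wk, g in zip(w, gw)]
        b -= lr * gb / len(y)
    return lambda z: sigmoid(b + sum(wk * zk for wk, zk in zip(w, z)))

def fit_stacking(X, y, base_fitters, n_folds=5, seed=0):
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    Z = [[0.0] * len(base_fitters) for _ in X]    # level-1 inputs z_it
    for held_out in folds:
        held = set(held_out)
        X_tr = [X[i] for i in idx if i not in held]
        y_tr = [y[i] for i in idx if i not in held]
        for t, fit in enumerate(base_fitters):
            h_t = fit(X_tr, y_tr)                 # level-0 classifier
            for i in held_out:
                Z[i][t] = h_t(X[i])               # out-of-fold prediction
    meta = fit_logistic(Z, y)                     # train the meta-learner on D'
    level0 = [fit(X, y) for fit in base_fitters]  # refit on all data for new cases
    return lambda x: meta([h(x) for h in level0])

# Synthetic two-class data: class 0 around (0, 0), class 1 around (3, 3).
rng = random.Random(1)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20)] + \
    [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(20)]
y = [0] * 20 + [1] * 20
stacked = fit_stacking(X, y, [fit_centroid, fit_threshold])
```

Training the meta-learner only on out-of-fold predictions keeps it from rewarding base learners that merely memorized their own training samples.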

Resampling Strategy and Correlation Analysis
In statistical learning, an experiment with the purpose of comparing and ranking several algorithms in terms of a certain performance measure is referred to as a benchmark experiment.The resampling strategy is usually utilized to generate point estimations for the performance, which allows for an observation on the performance distribution of an algorithm on resampled subsets [75].In this work, resampling was used as a pretreatment method for evaluating the correlation between candidate algorithms (Figure 6).
The remaining examples are used to make predictions ( ) through trained classifiers: These outputs from level-0 classifiers along with their true classification comprise a new dataset = ( , , ⋯ , ), , and are then fed to level-1 to train the meta-learner (LR).Thus, LR can assemble the classification results of base learners to produce the final prediction for new cases: = ( ), ( ), ( ), ( ) Based on four single candidate algorithms and the stacking technique, the initial ensemble model was constructed, denoted as the SVM-ANN-NB-LR (SANL).

Resampling Strategy and Correlation Analysis
In statistical learning, an experiment with the purpose of comparing and ranking several algorithms in terms of a certain performance measure is referred to as a benchmark experiment.The resampling strategy is usually utilized to generate point estimations for the performance, which allows for an observation on the performance distribution of an algorithm on resampled subsets [75].In this work, resampling was used as a pretreatment method for evaluating the correlation between candidate algorithms (Figure 6).To be specific, when an ensemble model was simulated and constructed in the training stage, the training data could be randomly divided into ten folds to test single base learners.Each fold data was used to produce a prediction by a base learner, and then its prediction accuracy on this fold was recorded.This processing was repeated with n rounds.In other words, n × 10 different subsets with the same size were randomly resampled from the initial training data, and thus n × 10 accuracy values were obtained for each base learner.In this study, we set n as 2, 3, and 4 to produce 20 resamples, 30 resamples, and 40 resamples, respectively.Recalling the processing of resampling, for each resampled subset, four base learners produced prediction in parallel.As resampling proceeded, a series of point estimates of prediction accuracy was derived.Thus, the correlation between base learners could be measured according to the variation degree of those prediction accuracy pairs over the change of subset.
Pearson's correlation analysis is considered an effective method for estimating the degree of linear correlation between variables [76]. Therefore, the correlation between the base learners was measured using Pearson's correlation coefficient (PCC) in this study. The PCC between two base learners $L_i$ and $L_j$ is defined as:

$$\mathrm{PCC}(L_i, L_j) = \frac{\mathrm{cov}(ACC_i, ACC_j)}{\sqrt{\mathrm{var}(ACC_i)\,\mathrm{var}(ACC_j)}}$$

where $\mathrm{cov}(ACC_i, ACC_j)$ is the covariance, $ACC_i$ and $ACC_j$ are the predictive accuracy sets of $L_i$ and $L_j$, respectively, and $\mathrm{var}(ACC_i)$ and $\mathrm{var}(ACC_j)$ are the variances of $ACC_i$ and $ACC_j$, respectively.
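As a small worked illustration of this definition, the coefficient can be computed directly from two paired accuracy lists. The function name `pcc` and the accuracy values below are hypothetical and are not results from this study.

```python
import math

def pcc(acc_i, acc_j):
    """Pearson's correlation coefficient between two paired accuracy lists."""
    n = len(acc_i)
    mean_i = sum(acc_i) / n
    mean_j = sum(acc_j) / n
    cov = sum((a - mean_i) * (b - mean_j) for a, b in zip(acc_i, acc_j)) / (n - 1)
    var_i = sum((a - mean_i) ** 2 for a in acc_i) / (n - 1)
    var_j = sum((b - mean_j) ** 2 for b in acc_j) / (n - 1)
    return cov / math.sqrt(var_i * var_j)

# Hypothetical per-subset accuracies of two base learners on the same subsets.
acc_svm = [0.84, 0.82, 0.86, 0.83, 0.85]
acc_lr = [0.81, 0.80, 0.83, 0.80, 0.82]
print(round(pcc(acc_svm, acc_lr), 2))
```

A value near 1 means the two learners tend to do well and badly on the same subsets, which is the situation the correlation analysis is designed to detect.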

Selection of Conditioning Factors
The IV evaluation method was carried out to analyze the importance of the landslide conditioning factors, and the results are presented in Table 2. Distance to roads had the highest IV value (3.329) among all factors, followed by elevation (2.694), land use (1.136), NDVI (1.0894), distance to faults (0.733), ERG (0.396), slope angle (0.176), profile curvature (0.105), slope aspect (0.065), annual rainfall (0.052), distance to rivers (0.050), and plan curvature (0). This result indicates that distance to roads was the most important factor for landslide susceptibility assessment in the study area. Previous studies have emphasized the role of roads, as excavation and the removal of vegetation during road construction usually disturb the natural terrain and interfere with slope stability [32]. In addition, the effect of faults cannot be neglected in landslide analysis. The integrity of the rock near faults is disturbed by fault activity, which makes it prone to weathering and favors landsliding [77]. Cracks and joints induced by faults also contribute to the enrichment and infiltration of rainwater and groundwater. It should be noted that the spatial variation of plan curvature fails to distinguish landslide-prone areas effectively. Therefore, plan curvature was excluded from landslide modeling owing to its lack of discriminating power.

Appropriateness Evaluation for Base Learners
The appropriateness of a base learner for the ensemble was evaluated mainly through correlation analysis. However, to verify the improvement offered by the stacking-based ensemble, the performance of the single learners integrated into the ensemble framework still needed to be presented. In the resampling stage, a series of predictive accuracies was recorded for each base learner as subsets were repeatedly sampled. To observe the accuracy distribution, a density curve was created by plotting the accuracy value on the x-axis and the corresponding frequency on the y-axis. Figure 7 shows the density curves for all base learners for 20, 30, and 40 resamples. In general, it is preferable to use a single value to describe a model's performance. Thus, the mean accuracy (MACC) was used to quantify the performance of the base learners. For 20 resamples, the ANN showed the highest MACC (84.20%), followed by the SVM (83.92%), NB (82.63%), and LR (81.00%). For 30 resamples, the highest MACC was also achieved by the ANN (84.49%), followed by the SVM (84.01%), NB (82.36%), and LR (81.26%). For 40 resamples, the ranking was the same: the ANN had the highest MACC (84.38%), followed by the SVM (83.92%), NB (82.68%), and LR (81.17%). Consistently, the results demonstrated that the ANN performed slightly better than the SVM, NB, and LR.

The scatter-plot matrix of the accuracies for each pair of base learners is shown in Figure 8. This allowed for a pair-wise comparison of the base learners. The correlation between two base learners reflects how closely the scatter points of their accuracies lie to a straight line. Visual inspection of Figure 8 shows that the scatter for the SVM and LR pair was roughly linear. The PCC provided a quantitative measure of this correlation, and it revealed that the most highly correlated pair was the SVM and LR regardless of whether 20, 30, or 40 resamples were used, with PCC values of 0.75, 0.73, and 0.68, respectively. The mean PCC (MPCC) reached 0.72, which was considered to indicate a strong correlation. The remaining pairs were weakly or not highly correlated, with MPCC values of 0.37, 0.33, 0.59, 0.47, and 0.42 for NB/SVM, NB/LR, NB/ANN, SVM/ANN, and LR/ANN, respectively. On average, pairs that contained the NB had the lowest PCCs.
If highly correlated base learning algorithms are considered to be inappropriate components in ensemble modeling, they should be excluded from the initial ensemble model. According to the PCC results, the LR and SVM were separately eliminated from the SANL model to produce the SVM-ANN-NB (SAN) model and the ANN-NB-LR (ANL) model. Furthermore, to verify that weakly correlated base learners are good candidates, the NB was also removed from the SANL model, yielding the SVM-ANN-LR (SAL) model. As a result, a total of four ensemble models were prepared for the experiments, and their performances were evaluated and compared.
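The selection logic above amounts to a few lines of bookkeeping, sketched here for illustration. The PCC values for the SVM/LR pair are those reported in the text; the variable and function names (`pcc_svm_lr`, `drop`, `reduced`) are hypothetical.

```python
# PCC of the SVM/LR pair for 20, 30, and 40 resamples (values reported in the text).
pcc_svm_lr = {20: 0.75, 30: 0.73, 40: 0.68}
mpcc_svm_lr = sum(pcc_svm_lr.values()) / len(pcc_svm_lr)
print(round(mpcc_svm_lr, 2))  # mean PCC of the most highly correlated pair

initial = ["SVM", "ANN", "NB", "LR"]  # the SANL ensemble

def drop(ensemble, member):
    """Return a copy of the ensemble without one base learner."""
    return [m for m in ensemble if m != member]

# Remove each member of the highly correlated pair in turn, plus the NB as a control.
reduced = {
    "SAN": drop(initial, "LR"),
    "ANL": drop(initial, "SVM"),
    "SAL": drop(initial, "NB"),
}
for name, members in reduced.items():
    print(name, "-".join(members))
```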

Landslide Susceptibility Assessment Using Ensemble Modeling
As the main goal of this study is to examine the "optimal" ensemble on the basis of the available candidates for landslide susceptibility assessment, the susceptibility maps of the study area were developed using the four ensemble models. Once ensemble modeling was successfully implemented, the landslide conditioning factors were fed into the model to produce predictions for the whole area. Thus, each pixel in the study area was allocated a landslide susceptibility index (LSI). The higher the index value, the more susceptible the location. These values were then categorized into five levels, namely very low, low, moderate, high, and very high, using the geometrical interval classification technique in ArcGIS 10.2. Figure 9 shows that the models produced similar distribution patterns of landslide susceptibility, especially the SAN model and the ANL model. All ensemble models were in accordance in that areas with very high susceptibility were mainly distributed near the Nujiang River Grand Canyon, where strong tectonic breaks disturb the terrain, and that the developed road system in conjunction with low vegetation coverage in low-elevation areas further promoted slope instability. Clay-bearing rocks (e.g., mudstones and shales) and sediments play an important role in slope sliding as they contribute to the infiltration process [24]. Additionally, carbonate rocks are likely to be located in highly vulnerable areas because they are prone to weathering, which damages slope stability. Visual inspection of the geological map indicates that landslides were related to these unfavorable geologic conditions.
Appl. Sci. 2020, 10, x FOR PEER REVIEW

Statistical results on the percentage of landslide pixels (PL) and the percentage of pixels in a susceptibility class (PC) are summarized in Table 3. For the SANL model, 69.85% of the total landslides were located in the very high class, which accounted for 16.80% of the total study area, whereas for the SAN model 83.76% of the total landslides occurred in the very highly susceptible areas (22.90%). For the ANL and SAL models, 79.38% and 57.73% of the total landslides fell into the very high class, which occupied 21.82% and 11.80% of the total study area, respectively. Furthermore, the reliability of the landslide susceptibility mapping was evaluated using landslide density (LD) analysis. LD is expressed as the ratio of PL to PC [2]. It can be observed from Table 3 that the LD values consistently increased from the very low susceptibility level to the very high susceptibility level for all ensemble models, indicating that the developed landslide susceptibility assessment was reasonable.
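As a quick worked check of the LD definition, the ratio can be computed for the very high class from the PL and PC percentages quoted above. The names `very_high` and `landslide_density` are hypothetical, and the resulting values may differ from Table 3 in rounding.

```python
# (PL, PC) in percent for the very high susceptibility class, as quoted in the text.
very_high = {
    "SANL": (69.85, 16.80),
    "SAN": (83.76, 22.90),
    "ANL": (79.38, 21.82),
    "SAL": (57.73, 11.80),
}

def landslide_density(pl, pc):
    """LD = PL / PC: share of landslide pixels relative to the share of area."""
    return pl / pc

ld = {name: landslide_density(pl, pc) for name, (pl, pc) in very_high.items()}
for name, value in ld.items():
    print(name, round(value, 2))
```

An LD above 1 means the class holds a larger share of the landslides than of the area, which is what a very high susceptibility class should show.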

Performance Evaluation of Landslide Models
The model evaluation was performed using both the training set and the validation set. The former shows the model's goodness of fit and the latter reflects its generalization ability. The accuracy (ACC), Kappa coefficient (K), and the receiver operating characteristic (ROC) curve were employed as performance metrics in this study. The ACC is the proportion of landslide and non-landslide grid cells that are correctly predicted, while K measures the reliability of a constructed model as the agreement between predictions and observations corrected for chance agreement. The ROC curve is a popular method for evaluating the performance of landslide models [78,79]. The area under the ROC curve (AUC) summarizes the curve in a single value, and a higher AUC indicates a better model.
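The three metrics can be computed from scratch as in the sketch below. It assumes binary labels (1 for landslide, 0 for non-landslide), a 0.5 cut-off for ACC and K, and the rank-based (pairwise comparison) form of the AUC; the function names and the small example data are hypothetical and unrelated to the study's results.

```python
def accuracy_and_kappa(y_true, y_pred):
    """ACC and Cohen's Kappa from the binary confusion matrix."""
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    acc = (tp + tn) / n
    # Agreement expected by chance from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (acc - expected) / (1.0 - expected)
    return acc, kappa

def auc(y_true, scores):
    """AUC as the probability that a landslide cell outranks a non-landslide cell."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and susceptibility scores for eight grid cells.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc, kappa = accuracy_and_kappa(y_true, y_pred)
print(acc, kappa, auc(y_true, scores))  # 0.75 0.5 0.9375
```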
The overall performance is shown in Table 4 and Figure 10. All ensemble models performed well in goodness of fit as well as generalization ability. The SANL, SAN, and ANL models yielded almost equal values of ACC, K, and AUC, but in detail the SAN model (ACC, K, and AUC of 86.95%, 0.739, and 0.951) and the ANL model (86.58%, 0.732, and 0.944) achieved slightly better goodness of fit in terms of ACC and K than the SANL model (86.40%, 0.728, and 0.945). The SAL model had a relatively lower performance, especially in generalization ability, with values of 82.33%, 0.647, and 0.886 for ACC, K, and AUC, respectively, on the validation set.

Discussion
Landslide susceptibility mapping aims to detect the vulnerable areas potentially threatened by landslide geo-hazards. Over the years, advancements in computer technology, the development of machine learning methods, the availability of multi-source data such as GIS and remote sensing data, and the accessibility of convenient software have facilitated landslide susceptibility assessment. In this context, how to improve the accuracy of assessment results has become the main concern of researchers [5]. Recently, ensemble techniques have proved able to provide more accurate solutions in landslide susceptibility modeling. In this work, we used a meta-learning method, namely the stacking ensemble method, to integrate multiple different algorithms. When predictive accuracy was used as the measure, the SVM-ANN-NB-LR ensemble was found to be superior to all single algorithms. The advantage of stacking is that the two-level framework can learn more classification information, and the meta-learner at the higher level is capable of correcting bias generated at the lower level [80,81]. In this study, the stacking ensemble model showed improved performance over its components, and can be considered a promising method for landslide susceptibility mapping.
In general, combining several algorithms enables an ensemble model to gain better performance [29]. However, is it necessary to use all of them to constitute the ensemble when multiple predictors are available? For landslide susceptibility modeling, this needs to be investigated. In the field of machine learning, Zhou [82] introduced the concept of the selective ensemble, that is, making appropriate choices from multiple solutions and then combining the selected solutions to obtain the final decision. Similarly, Zeng et al. [83] proposed the selecting base classifiers on bagging (SBCB) algorithm, which proved to be superior to the original bagging method. Determining suitable base learning algorithms is important for arriving at the "optimal" landslide susceptibility assessment when ensemble modeling is used. Rossi et al. [44] found that the combination of LDA-QDA-LR had a better generalization capability than the combination of LDA-QDA-LR-ANN because the ANN overfitted the landslide information in the training stage. Chen et al. [24] investigated the SVM, ANN, and ME together with their possible ensembles of ANN-SVM, ANN-ME, SVM-ME, and ANN-ME-SVM for landslide susceptibility mapping. They concluded that the synergic interaction of the two best single models, namely the SVM and ANN, constituted a stronger bond than the remaining modeling methods. The ensemble models containing the ME showed moderate performance due to the underperformance of the ME model. This result is quite consistent with a gully erosion modeling study, which reported that single models with good performance such as SVM and ANN played an improver role in ensemble modeling, while the relatively weak model (e.g., ME) acted as a depriver [84]. A powerful model tends to add positive properties to subordinate models. In other words, the strong performance of an ensemble depends mainly on good components. Therefore, the best base learner, e.g., the ANN used in this study, should be included in ensemble modeling to maintain that performance. In contrast, a base learner with relatively poor performance, which may play a reducer role, seems to be inappropriate for constituting the ensemble. However, all base learners used in this study achieved good predictive accuracies. Here, "slightly weak" is a relative concept; the threshold for positive synergy among base learners may be difficult to determine [24]. In addition, the extremely similar behaviors of the SAN model and the ANL model are difficult to explain by the components' performance alone, because removing either the SVM or the LR from the initial SVM-ANN-NB-LR ensemble had a nearly equal effect even though the SVM outperformed the LR.
To address this issue, we introduced a resampling scheme and correlation analysis for estimating the correlation between base learners within the ensemble. When constituting the ensemble, it is desirable that the predictions produced by the base learners be weakly correlated, as similar predictions may not contribute to the ensemble performance. In this study, the PCC values under different numbers of resamples (20, 30, and 40) showed a high correlation between the SVM and LR. This result means that the SVM generates very similar or even the same predictions as the LR for ensemble modeling in most cases, reducing the benefit of assembling predictions. In other words, the SVM and LR played an equal role in the SVM-ANN-NB-LR ensemble, which led to similar performances of the SAN model and the ANL model, as well as similar distribution patterns in the resulting susceptibility maps. In fact, the SANL model also behaved similarly to the SAN model and the ANL model. We argue that evaluating the performance of the candidates used for composing the ensemble is important but may not be conclusive, because all base learners yielded good predictive accuracies in this work. In this case, a correlation analysis is helpful for determining eligible candidates for "optimal" landslide ensemble modeling. We further verified this by removing the NB from the SANL model, and found that the SAL model underperformed the other ensemble models. The correlation analysis showed that pairs containing the NB had the lowest PCC on average. The decreased performance of ensemble modeling after excluding the NB indicated that weakly or not highly correlated base learners should be included in the ensemble. Overall, our results suggest that the candidates serving as base learners should be skillful but diverse, allowing the ensemble method to determine how to combine the best solutions from the base learners for performance refinement [85,86].
However, some limitations should be noted. Regarding geologic factors as modeling input, we considered only the physical and mechanical properties of the engineering rock groups, without taking into account geologic factors related to slope failure such as the angle of rock bedding and the orientation of bedding relative to slope direction. As detailed knowledge of geology is critical for landslide susceptibility studies, we will focus more on these important geologic factors and examine their effect in future work.

Conclusions
Landslide modeling in this study mainly served two goals: (1) to investigate the stacking ensemble technique for landslide susceptibility assessment; and (2) to set a strategy for determining appropriate combinations when multiple candidates are available in ensemble learning. Before modeling, the IV evaluation was carried out for feature selection. Of the initially prepared factors, eleven, namely distance to roads, elevation, land use, NDVI, distance to faults, ERG, slope angle, profile curvature, slope aspect, annual rainfall, and distance to rivers, were finally identified as inputs for ensemble modeling. The performance evaluation showed that the stacking ensemble-based susceptibility mapping procedure could make full use of the advantages of various individual models and achieve an improved prediction accuracy, indicating that stacking is a promising method for landslide susceptibility mapping. Moreover, the resampling strategy and Pearson's correlation analysis were jointly used to test the correlation between base learners. The SVM and LR were highly correlated within the SANL ensemble, which resulted in the similar behaviors of the SANL model, the SAN model, and the ANL model. This result implies that highly correlated candidates may be inappropriate for combination in an ensemble prediction. As a result, using the SAN model or the ANL model instead of the SANL model is feasible for landslide susceptibility mapping in the study area. The findings and analysis from this study present the advantages of the stacking ensemble method and provide an accessible way to select good candidates for ensemble modeling, which may be helpful for future landslide susceptibility studies.
° and 35° account for 59.7% of the study area, and areas with slopes over 40° cover 13.2%. The region has a tropical monsoon climate, with an annual average temperature of approximately 15.1 °C. The rainy season extends from May to October each year, and the rainfall during the rainy season accounts for 70% of the total annual rainfall. The largest daily rainfall can reach 105.3 mm during the rainy season. The annual average rainfall is approximately 1200 mm. The surface runoff of the study area mainly consists of the Nujiang River system and its tributaries.

Figure 1. Geographical position of the study area and landslide inventory map.

Figure 2. Geological map of the study area.

Figure 4. (a) Locations of the selected non-landslide; (b) Distributions of the training dataset and validation dataset. DEM: digital elevation model.
Ensemble Modeling
Once all base learning algorithms are prepared, they are integrated into a single framework using the stacking method. Suppose the initial dataset $D$ consists of examples $d_i = (x_i, y_i)$, where $x_i$ indicates the landslide conditioning factors and $y_i$ indicates the corresponding classification (landslide or non-landslide), with $i \in [1, N]$, where $N$ represents the total number of examples in the modeling dataset. Base learning algorithms such as SVM, ANN, LR, and NB are denoted as $L_t$ ($t = 1, 2, 3, 4$). Firstly, the dataset is repeatedly divided into two disjoint subsets; one is used to train the base learning algorithms to generate level-0 classifiers, noted as $C_t$:

$$C_t = L_t(D - d_i), \quad \forall i = 1, 2, \cdots, N;\ \forall t = 1, 2, 3, 4$$

Figure 6. The flowchart of the resampling strategy and correlation evaluation. ACC: accuracy.

Figure 10. ROC analysis of four ensemble models (a) on the training dataset, (b) on the validation dataset. SE: standard error of the AUC; CI: confidence interval of the AUC; p: significance level.

Table 1. Description of the engineering rock groups (ERGs).

Table 3. Landslide density analysis on landslide susceptibility maps. LD: landslide density; PC: percentage of pixels in a susceptibility class; PL: percentage of landslide pixels.

Table 4. Performance evaluation of landslide models. AUC: the area under the receiver operating characteristic (ROC) curve; K: Kappa coefficient.