Landslide Susceptibility Assessment by Novel Hybrid Machine Learning Algorithms

Abstract: Landslides have multidimensional effects on the socioeconomic as well as environmental conditions of the impacted areas. The aim of this study is the spatial prediction of landslides using hybrid machine learning models, including bagging (BA), random subspace (RS) and rotation forest (RF), with the alternating decision tree (ADTree) as the base classifier, in the northern part of the Pithoragarh district, Uttarakhand, Himalaya, India. To construct the database, ten conditioning factors and a total of 103 landslide locations, split 70/30 into training and validation sets, were used. The significant factors were determined by the chi-square attribute evaluation (CSEA) technique. The validity of the hybrid models was assessed by the true positive rate (TP rate), false positive rate (FP rate), recall (sensitivity), precision, F-measure and area under the receiver operating characteristic curve (AUC). The results showed that land cover was the most important factor, while curvature had no effect on landslide occurrence in the study area and was removed from the modelling process. Additionally, although all ensemble models enhanced the predictive power of the ADTree classifier (AUC training = 0.859; AUC validation = 0.813), the RS ensemble model (AUC training = 0.883; AUC validation = 0.842) outperformed the RF (AUC training = 0.871; AUC validation = 0.840) and BA (AUC training = 0.865; AUC validation = 0.836) ensemble models. The results should help to identify landslide-prone areas in the future, so that damage and negative impacts on the environment can be better managed and reduced.


Introduction
Landslide is a local natural phenomenon, popularly understood as the "mass displacement of earth, debris, and rocks", that can be triggered by hydrological, topographic, and geophysical reasons [1]. However, anthropogenic activities such as mining and construction, as well as natural events including heavy rainfall, earthquake, volcanic eruption, and marine erosion can also trigger landslides [2]. Landslides have multidimensional effects on the socioeconomic as well as environmental conditions of the impacted areas. Hilly areas are highly prone to landslides and the after effect may significantly change the topographic features as well as river course and patterns [3]. The environmental loss could be in the form of forest along with habitat and wildlife destruction, and can elicit mild to severe tsunami and floods. Human populations living in the landslide susceptible areas are the foremost victims of landslides and may experience loss of houses, cattle, fertile lands, and even lives of their families and friends.
The World Bank reported that nearly 3.7 million square kilometers of the world's land area is highly prone to landslides, which could put about 300 million human lives at risk [4]. Nearly one and a half decades later, these figures must have changed; however, because landslide reports lack consistency, it is very difficult to arrive at an exact number of landslide incidents and fatalities. These inconsistencies could stem from the varied nature of landslides; for example, landslides could be seismic only or triggered by rainfall, rock-slides, floods, or hurricanes. A recent study provides a good discussion of the discrepancies in landslide reports [2]. Furthermore, its authors report that between 2004 and 2016 alone, a total of 4862 landslide events occurred globally, with high impacts in Central and South America, the Caribbean Islands, East Africa, Turkey, Iran, the European Alps, and Asia [2].
These seismic landslides caused 55,997 fatalities [2]. On the other hand, the National Aeronautics and Space Administration (NASA) reported that nearly 10,804 landslides occurred between 2007 and 2017, triggered by rainfall [5]. Petley captured the landslides events between 2004 and 2010, and reported 2620 landslides that caused 32,322 human losses [6]. The global infrastructure and economic damages due to landslides are daunting, costing about US$20 billion, the highest in the USA (US$12.1-4.3 billion) and Italy (US$3.9 billion), followed by Japan (>US$3.0 billion), India (US$2.0 billion), China (>US$1.0 billion), and Germany (US$0.3 billion) [7]. These numbers have significantly risen in the last decades when compared with previous reports. For example, in 1992, China's estimated annual economic loss due to landslides was nearly US$500 million [8] and in 1998, USA's estimated economic damages due to landslides was approximately US$1-3.6 billion [9].
The Himalayan Arc across India and southeastern China has experienced the highest number of landslide events, followed by areas of Laos, Bangladesh, Myanmar, Indonesia, and the Philippines [2]. India, the scope of this study, has experienced severe naturally triggered landslides in the 21st century. In 2001, nearly 40 individuals died in Amboori in the state of Kerala; in 2013, a landslide occurred in Kedarnath in the state of Uttarakhand and more than 5000 people died; and in 2014, in Pune in the state of Maharashtra, over 100 individuals were reported missing after a landslide (Figure 1) [10,11]. Figure 1 depicts the locations of landslide events in India and the associated fatalities; an estimated 12.6% of the land area of India is affected, with approximately 4.5 million USD in economic damage [12]. Based on our analysis of the NASA data, we found that a total of 958 landslide events occurred between 2007 and 2015, causing 6779 human fatalities (Table 1). Landslides may not be stopped or controlled; however, the losses can be reduced by establishing a decision support system to predict possible landslides or by identifying landslide prone areas for management. Therefore, prediction of possible landslides at the local and regional levels is required for proactive landslide mitigation policy creation and management [13]. Although a domain-knowledge-driven qualitative approach is advantageous in predicting landslides, data-driven quantitative methods are widely used because field data from landslide areas are challenging to collect [3].
Pourghasemi et al. [14] reported that a variety of quantitative statistical, multi-criteria decision making, and machine learning methods have been applied for predicting landslide susceptibility, of which logistic regression [15][16][17][18] is the most frequently used method, followed by the frequency ratio [19,20], weights-of-evidence [18,21], artificial neural networks [22,23], analytic hierarchy process [24,25], statistical index [26], index of entropy [27][28][29][30], and support vector machine [31,32]. Environmental data collected in the field or extracted from satellite images to develop landslide prediction models are diverse in nature, and therefore prone to inaccuracies [13]. To mitigate these challenges, researchers have applied various fuzzy-based techniques, which have yet to achieve satisfactory results [13].
In order to correctly predict possible landslides, landslide prone areas have to be clearly understood and should be used for prediction model development. This study aims to fill the above research gaps by introducing a novel decision tree-based hybrid machine learning system to correctly predict the landslide susceptible areas. To achieve a landslide susceptibility map with reliable and high prediction accuracy, we combined a decision tree, ADTree, as the base classifier with several meta-classifiers, namely the bagging (BA), random subspace (RS) and rotation forest (RF) ensembles, in the northern part of the Pithoragarh district, Uttarakhand, Himalaya, India. The alternating decision tree (ADTree) classifier is one of the more powerful decision tree algorithms, yet it is rarely used for spatial prediction modelling [71,72]. It combines decision rules from boosting and decision tree algorithms in classification problems; it can therefore produce a simpler structure, and its classification rules are simple to interpret and easy to visualize [73]. The modelling process was carried out using ArcGIS 10.3 and Weka 3.9.

Description of the Study Area
The study area is located in the northern part of the Pithoragarh region, Uttarakhand in India, between the latitudes 29°59′13″ N and 29°48′2″ N and the longitudes 80°0′34″ E and 80°12′28″ E, covering an area of about 242 km² (Figure 2). Topographically, the region includes rugged hills and high mountain peaks which are dissected by long, narrow and deep valleys. The maximum elevation of the basin is 2713 m in the north and the minimum is 757 m in the south, at the outlet of the East Ramganga River from the basin. The East Ramganga River originates from the Namik glacier in the Himalayan Mountains, and flows into the Ganges River after passing 108 km through the Kumaon Region. The average slope is 28.61° and the maximum gradient is 75.50°. The steepest areas are the slopes overlooking the river bed. Additionally, 9.6% of the basin area has a slope of more than 44°, and only 14.81% of the basin has a slope of less than 15°. In terms of land cover, 57% of the study area is covered by moderate to high density vegetation and 23.32% of the area is under cultivation. The rest of the basin includes sparsely vegetated land (10.57%), barren land (6.84%), lakes and rivers (1.33%), settlements (0.31%) and extensive slope cuts (0.31%). In recent years, a major part of the basin has been converted into low-density forest and degraded land due to the destruction of forests and land use change from forest to agricultural land. The very fertile lands are located on the riverside. The settlements are surrounded by vast areas of agricultural land. Most soil mass movements occur along the river valleys and the periphery of roads built along the rivers, especially during the rainy season.
In terms of geology, this basin is occupied by metamorphic rocks found in the Dharamgarh Formation (biotite gneiss, chlorite schist, inter bands of schistose quartzite with meta-volcanics) and Baijnath Formation (quartzite and gneiss), meta-sedimentary rocks of the Pithoragarh Formation (dolomitic limestone with intercalations of talcose schist, carbonaceous phyllite, slate, limestone and quartzite) and the Bering Formation (quartzite/arenite/sericitie and phyllite intercalated with meta-volcanics) of Garhwal group and recent alluvium. In terms of lithology, the northern and northeastern part of the basin is mainly covered by slate, quartzite, talc and dolomite. This lithological unit covers an area about 56% of the basin that contained 90.27% of landslides. The remaining 9.73% of landslides are located in colluvium units covering 9% of the basin area. The southwestern part of the study area, consisting of in situ soil and quartzite and slate with basic metavolcanics, has covered an area about 24% of the basin but without any landslides. In terms of geological structure, the area is affected by friction and fault due to tectonic activity. Most likely, the instability of rock in the rock masses is fractured due to discontinuities caused by faults, cracks, fractures and seams.

Landslide Inventory Map
The past landslide locations of any given area give valuable information on the spatial distribution patterns of landslide events for landslide susceptibility zonation [74]. Past landslide locations help to understand landslide behavior and the relation between the landslide causative factors. On account of this, making a landslide inventory is an important step in landslide susceptibility assessment. Many scholars prepare a landslide inventory using high resolution remote sensing data or aerial photograph interpretation [27,[75][76][77]. Every year the area is affected by several active landslides during or after the rainy season [78]; therefore, Google Earth data were used to cover rainfall-induced landslides. In the present study, the landslide inventory map was prepared by digitizing post-rainfall-season Google Earth imagery, and the locations were verified in the field. A total of 103 landslide polygons were delineated and converted into raster format. Of the delineated landslide locations, we selected 70% as the training dataset and the remaining 30% as the validation dataset (Figure 2).
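The 70/30 partition described above can be sketched in a few lines of Python. This is only an illustration: the function name `split_inventory`, the random shuffling, and the fixed seed are assumptions, since the text does not state how the landslides were assigned to the two subsets.

```python
import random

def split_inventory(landslide_ids, train_fraction=0.7, seed=42):
    """Randomly split landslide IDs into training and validation subsets."""
    ids = list(landslide_ids)
    rng = random.Random(seed)          # fixed seed (assumption) for reproducibility
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]

# 103 landslide polygons, identified here by hypothetical integer IDs.
train_ids, valid_ids = split_inventory(range(103))
print(len(train_ids), len(valid_ids))
```

With 103 locations, rounding 70% gives 72 training and 31 validation landslides.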

Landslide Conditioning Factors
A variety of landslide conditioning factors (LCF) have been used for developing landslide susceptibility prediction models, including slope, lithology, aspect, land use, elevation, distance from rivers, distance from roads, distance from faults, plan curvature, profile curvature, precipitation, topographic wetness index, soil type, stream power index, normalized difference vegetation index, slope length, curvature, and drainage density [14]. The selection of these factors may vary based on the study area, the scale of the study, and data availability [14]. Among the above-listed LCF, slope gradient has been the most frequently used. The selection of LCF in this study is based on previous research and our field observations.

Overburden Depth
The overburden depth captures the information of depth to the bedrock and has been linked to shallow translational debris landslides [79]. Furthermore, it is also influenced by slope and erosion. The study area is highly prone to erosion and has steep slopes; therefore, the overburden depth could play an important role in identifying landslide prone areas and developing prediction models. The overburden depth in the study area ranges between '0' and '4' m ( Figure 3a).

Land Cover
The majority of landslides occur in sparsely forested areas, because in densely vegetated areas the plant roots hold the soil and rocks firmly, keep them stable on steeper slopes, reduce soil erosion, and therefore protect against landslides [80,81]. Hence, understanding how various land covers affect landslides is imperative in developing landslide prediction models. In this study, we have categorized land cover into barren, cultivated land, extensive slope cut, lakes and rivers, moderately vegetated, settlement areas, sparsely vegetated areas, thickly vegetated, and wasteland (Figure 3b).

Geomorphology
Geomorphology is one of the most important LCF, as the various geomorphological units represent different geomorphological phenomena, including alluvial flood plain, colluvial footslope, denudational hill slope, highly dissected hills, lowly dissected hills, moderately dissected hills, ridge, river, and transportation midslope (Figure 3c) [79]. Geomorphology has also been found to contribute to shallow and deep debris landslides [79].

Distance to Rivers
River networks erode the catchment areas in their natural course through surface runoff, therefore making the hilly areas highly vulnerable to landslides. Consequently, distance to river has been an important LCF in those studies where the study areas have dense river networks such as in our case in Uttarakhand [82]. We have classified the distance to rivers into 0-100 m, 100-200 m, 200-300 m, 300-400 m, 400-500 m, and above 500 m from the landslide locations (Figure 3d).

Distance to Roads
As mentioned in the previous section, landslide could be induced by road construction; considering road network in the landslide prediction model development therefore becomes a necessity. As road networks negatively impact the slopes by loosening the slope materials, the distance from roads helps understand the landslide prone areas [83]. Like distance to rivers, we have classified the distance to roads into 0-100 m, 100-200 m, 200-300 m, 300-400 m, 400-500 m, and above 500 m from the landslide locations ( Figure 3e).

Curvature
Erosion of riverbanks steepens the curvature, thus acting as a trigger point for landslides. Therefore, knowing whether the curvature is negative, zero, or positive, corresponding to concave, flat, and convex surfaces, is vital in identifying landslide prone areas and in developing landslide prediction models [84]. In this study, the curvature is classified into below −0.05, between −0.05 and 0.05, and above 0.05 (Figure 3f).

Aspect
Slope aspect is another important LCF that plays a significant role in inducing landslides in the study area, as it influences evapotranspiration by controlling the topographic moisture [82,85]. Slope aspect represents the direction of the steepest slope of the terrain surface and is measured clockwise, starting at 0° (north) and returning to north at 360° [86]. The slope aspect in this study is categorized into flat, north, northeast, east, southeast, south, southwest, west, and northwest (Figure 3g).
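As an illustration of the nine aspect categories, the sketch below maps an aspect angle to a class. The 45° sectors centred on each compass direction and the use of a negative value for flat cells are assumptions borrowed from common GIS practice; the text does not specify the class boundaries.

```python
def aspect_class(aspect_deg):
    """Map an aspect angle (degrees clockwise from north) to a compass class.

    Negative values are treated as flat terrain (assumed convention).
    """
    if aspect_deg < 0:
        return "flat"
    names = ["north", "northeast", "east", "southeast",
             "south", "southwest", "west", "northwest"]
    # Each sector spans 45 degrees and is centred on its compass direction,
    # so north covers 337.5-360 and 0-22.5 degrees.
    index = int(((aspect_deg % 360) + 22.5) // 45) % 8
    return names[index]

for angle in (-1, 0, 90, 270, 350, 360):
    print(angle, aspect_class(angle))
```

Both 0° and 360° fall in the north class, which is why the scale closes on itself.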

Valley Depth
The valley depth above 160 m in the study area is highly prone to landslides and showed a positive association between the valley depth and landslides [53]. We have classified valley depth into six categories (Figure 3h).

Slope
Slope is one of the most significant LCF used in developing landslide susceptibility prediction models since with an increase in the slope angles, the likelihood of the occurrence of landslides increases [2,84,87]. Slope is also found to be associated with both shallow translational rockslides and debris slides and has the highest landslide susceptibility predictive capability [79,85]. The study area is precipitous, and the majority of the areas fall between 15-45 degrees and goes up to 700 m, making the area prone to landslide during heavy rainfall. A majority of the landslides in the area was found to have occurred in cut-slopes [82]. We have classified the slope of the study area into 0-15, 15-25, 25-35, 35-45, and 45-75 degrees (Figure 3i).

SFM
Slope forming material (SFM) defines the rock and the soil types of the area and has significant impact on both shallow translational rock and debris landslides [79]. In this study, we classified the SFM into twelve categories based on their rock and soil types ( Figure 3j) and a majority of landslide events were reported in the study area with weak rock formed slopes [53].

Machine Learning Algorithms
Landslide susceptibility modeling has been approached using both qualitative (inventory-based analysis) and quantitative or data-driven models [88,89]. The development of geographical information systems (GIS) and machine learning algorithms has provided advanced techniques for precise model building, such as the alternating decision tree (ADTree), support vector machine, artificial neural network and kernel logistic regression (KLR) [90]. Machine learning based data-driven models, which perform better than conventional models, are increasingly popular [88]. They are also more cost efficient and faster than conventional models and can be extended to large area analysis [91]. Artificial neural networks and support vector machines have yielded high prediction accuracy, but comparison with other models is still required to understand their precision.

Base Classifier: Alternating Decision Tree (ADTree)
Decision trees are among the most advanced classification techniques, offering a minimum probability of error, robustness, easy interpretation and precise classification, and they are readily applicable to real-world situations [73]. The model is built through data partitioning, in which the data are split at each iteration according to the attribute values. The goal is to keep splitting the data into subsets until each subset contains a homogeneous target value or predictable attribute. In each split, the impact of the selected variables on the predictable attribute is examined. If the predictable attribute comprises discrete data, the resulting tree model is called a classification tree. This process is also called decision tree induction [92]. The training set inputs are divided into prediction nodes using split tests to obtain the prediction node values, where W+(c) and W−(c) refer to the weighted sums of the positive and negative tuples that satisfy the condition d, and W′ is the weighted sum of all tuples other than those assigned to p. The best split test is the one that minimizes the Z value.
The optimized construction algorithm for ADTree described in [93] uses the Z_pure pruning technique, in which Z_pure represents the lower limit of Z used for evaluating the predictive nodes.
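For orientation, the prediction values and the split criterion can be written in the standard ADTree notation of Freund and Mason. This is a sketch under the assumption that the study follows that formulation: p denotes the precondition of a prediction node, c the candidate split condition, and W(¬p) plays the role of W′, the weight of the tuples not covered by p.

```latex
a = \frac{1}{2}\ln\frac{W_{+}(p \wedge c)}{W_{-}(p \wedge c)}, \qquad
b = \frac{1}{2}\ln\frac{W_{+}(p \wedge \neg c)}{W_{-}(p \wedge \neg c)}

Z(p, c) = 2\left(\sqrt{W_{+}(p \wedge c)\,W_{-}(p \wedge c)}
        + \sqrt{W_{+}(p \wedge \neg c)\,W_{-}(p \wedge \neg c)}\right) + W(\neg p)
```

Here a and b are the values attached to the two new prediction nodes, and the split with the smallest Z is selected at each boosting iteration.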

Bagging Ensemble Classifier
An ensemble model combines several base models to produce a better predictive model than a single decision tree classifier. The main idea is to train several weak learners, for example on bootstrap samples, and to aggregate them into a strong learner that enhances the predictive ability of the model. This helps to reduce bias, noise and variance errors. AdaBoost, random forest and bagging are some of the techniques used to build ensemble models. These techniques have been utilized for groundwater potential analysis and for landslide and flood susceptibility analysis (Chen et al. 2019) [18]. AdaBoost can lose accuracy because it concentrates on the difficult samples and neglects the remaining data, which leads to a wide range of performance relative to bagging [94]. The bagging ensemble, however, can be used effectively for landslide susceptibility mapping and has better prediction power than conventional models [95].
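The bootstrap-and-aggregate idea can be illustrated with a minimal pure-Python sketch. It is not the Weka implementation used in the study: a one-feature decision stump stands in for ADTree, the toy data (slope in degrees and a second hypothetical factor) are invented for illustration, and redrawing single-class bootstrap samples is a convenience assumed here.

```python
import random
from collections import Counter

def train_stump(X, y):
    """Fit a one-feature threshold rule; a simple stand-in for ADTree."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [1 if sign * (row[j] - t) >= 0 else 0 for row in X]
                errors = sum(p != label for p, label in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, sign)
    _, j, t, sign = best
    return lambda row: 1 if sign * (row[j] - t) >= 0 else 0

def bagging_fit(X, y, n_models=10, seed=1):
    """Train each base model on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        while True:
            idx = [rng.randrange(len(X)) for _ in range(len(X))]
            if len({y[i] for i in idx}) > 1:   # redraw single-class samples (assumption)
                break
        models.append(train_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def bagging_predict(models, row):
    """Aggregate the base models by majority vote."""
    votes = Counter(model(row) for model in models)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: 1 = landslide cell, 0 = non-landslide cell.
X = [[10, 0.2], [15, 0.1], [12, 0.3], [35, 0.9], [40, 0.8], [38, 0.7]]
y = [0, 0, 0, 1, 1, 1]
models = bagging_fit(X, y)
print(bagging_predict(models, [45, 0.9]), bagging_predict(models, [5, 0.1]))
```

Because every base model sees a different resample of the training data, the vote averages out the instability of the individual learners.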

Random Subspace Ensemble Classifier
In pattern recognition, machine learning classifiers are a topic of great interest among researchers [94]. The random subspace ensemble model trains several classifiers, each on a randomly selected subset of the data feature space. It can be combined with nearest neighbor, linear, support vector and other classifiers [67]. The advantage of this model is that the training sample is relatively larger in a random subspace than in the original feature space, because the number of training objects stays the same while the dimensionality is reduced.
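A minimal sketch of the random subspace idea follows. The nearest-centroid base learner is a stand-in for ADTree, and the four-feature toy data, subspace size and seed are hypothetical choices made only to keep the example short.

```python
import random
from collections import Counter

def train_centroid(X, y):
    """Nearest-centroid rule; a simple stand-in for the ADTree base classifier."""
    centroids = {}
    for label in set(y):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(row):
        return min(centroids, key=lambda lab: sum(
            (a - b) ** 2 for a, b in zip(row, centroids[lab])))
    return predict

def random_subspace_fit(X, y, n_models=9, subspace_size=2, seed=7):
    """Train each base model on a random subset of the features."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ensemble = []
    for _ in range(n_models):
        feats = sorted(rng.sample(range(n_features), subspace_size))
        X_sub = [[row[j] for j in feats] for row in X]
        ensemble.append((feats, train_centroid(X_sub, y)))
    return ensemble

def random_subspace_predict(ensemble, row):
    """Project the row onto each model's subspace and take a majority vote."""
    votes = Counter(model([row[j] for j in feats]) for feats, model in ensemble)
    return votes.most_common(1)[0][0]

# Hypothetical toy data with four normalized conditioning factors.
X = [[0.1, 0.2, 0.1, 0.3], [0.2, 0.1, 0.3, 0.2], [0.3, 0.3, 0.2, 0.1],
     [0.8, 0.9, 0.7, 0.9], [0.9, 0.8, 0.9, 0.7], [0.7, 0.7, 0.8, 0.8]]
y = [0, 0, 0, 1, 1, 1]
ensemble = random_subspace_fit(X, y)
print(random_subspace_predict(ensemble, [0.85, 0.9, 0.8, 0.9]))
```

Unlike bagging, every base model here sees all training objects but only some of the features, which is where the diversity of the ensemble comes from.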

Rotation Forest Ensemble Classifier
Several methods are used for landslide susceptibility analysis, but none of them is perfect [70]. Accurate landslide susceptibility assessment can be achieved by combining ensemble classifiers [63]. The rotation forest ensemble approach, first introduced by Rodriguez et al. [94], focuses on promoting both diversity and individual accuracy within the ensemble. To create the training set, principal component analysis (PCA) was used to extract the features. The success of this model depends on the rotation matrix, which is formed by the base classifier and the transformation method [63] (Figure 4).

Selecting the Most Important Conditioning Factors Using Chi-Square Attribute Evaluation (CSEA) Technique
Feature selection techniques, which are widely used in artificial intelligence, select a small feature set from the training dataset to reduce the cost and time of the modelling process while still producing acceptable results [96]. Available feature selection techniques include gain ratio (GI), information gain ratio (IGR), least square support vector machine (LSSVM), chi-square attribute evaluation (CSEA), correlation-based feature selection (CFS), fast correlation-based feature selection (FCBF), Euclidean distance, i-test, principal component analysis (PCA), and Markov blanket filter [97]. In this study, the chi-square attribute evaluation (CSEA) technique was used. The chi-square statistic is calculated as $\chi^2 = \sum_{i=1}^{n} (O_i - E_i)^2 / E_i$, where $E_i$ are the expected values and $O_i$ are the actual (observed) values. The higher the chi-square value of a conditioning factor, the more important that factor is for landslide incidence.
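To make the statistic concrete, the sketch below computes the chi-square value of a contingency table that cross-tabulates the classes of one conditioning factor against landslide presence. The table layout and the counts are hypothetical; the expected counts are derived from the row and column totals under the usual independence assumption.

```python
def chi_square(observed):
    """Chi-square statistic of a contingency table.

    Rows are the classes of one conditioning factor; columns are the
    counts of non-landslide and landslide cells (assumed layout).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / total  # expected count under independence
            stat += (o - e) ** 2 / e
    return stat

# Hypothetical counts for three land cover classes.
table = [[30, 5], [20, 20], [10, 35]]
print(round(chi_square(table), 3))
```

A factor whose classes are distributed identically among landslide and non-landslide cells yields a value of zero, which matches the treatment of curvature in this study.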

Model Validation and Comparison
Although several evaluation measures are available to validate the performance of machine learning models, this study used the TP rate, FP rate, recall, precision, F-measure and ROC. All these measures can be computed from the confusion matrix (Table 2). Another measure for evaluating model performance is the receiver operating characteristic (ROC) curve. It is plotted with 100 − specificity on the x-axis and recall (sensitivity) on the y-axis [13,99]. Specificity is the number of correctly classified non-landslide cells per total number of non-landslide cells [55]. The area under the ROC curve (AUC) is generally used to evaluate model performance. An ideal model has an AUC of 1, whereas a model that performs no better than chance has an AUC of 0.5 [69]. In the AUC calculation, P and N are the total numbers of landslide and non-landslide cells, respectively.
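These measures can be sketched directly from the four confusion matrix counts. The AUC function below uses the standard rank-based definition (the probability that a randomly chosen landslide cell receives a higher score than a randomly chosen non-landslide cell), which is an assumption and may differ in form from the expression used in the study; the function names and the example counts are hypothetical.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Performance measures derived from confusion matrix counts."""
    recall = tp / (tp + fn)        # TP rate, also called sensitivity
    fp_rate = fp / (fp + tn)       # equals 1 - specificity
    precision = tp / (tp + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    return {"tp_rate": recall, "fp_rate": fp_rate,
            "precision": precision, "f_measure": f_measure}

def auc_from_scores(scores, labels):
    """Rank-based AUC; ties between a positive and a negative count as half."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

metrics = confusion_metrics(tp=8, fp=2, tn=6, fn=4)
print(metrics)
print(auc_from_scores([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))
```

In the second call, three of the four landslide/non-landslide pairs are ranked correctly, so the AUC is 0.75.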

The Most Significant Conditioning Factors
The predictive merit of the landslide conditioning factors according to the CSEA method is shown in Figure 5. Conditioning factors with average merit (AM) values higher than zero contribute to the landslide models. The factor selection revealed that curvature had no effect on landslide susceptibility modelling, because its AM was zero; hence, it was not entered into the modelling process. The CSEA method also showed that the other nine conditioning factors were useful for landslide susceptibility modelling (AM > 0). Land cover has the highest predictive merit for landslide susceptibility modelling (AM = 234.285). It is followed by geomorphology (

Landslide Modelling, Evaluation and Comparison
The number of seeds and iterations can affect landslide model performance. To select the optimal values, a trial and error procedure was carried out in which the numbers of seeds and iterations were varied and compared against the AUC for both the training and validation data. The results showed that the best performance of the RSADT model (AUC = 0.915) for the validation dataset was obtained with the numbers of iterations and seeds equal to 14 and 7, respectively (Figure 6a,b). The maximum performance (AUC) of the BAADT model in the validation step was 0.919, obtained with 20 iterations (Figure 6c) and 3 seeds (Figure 6d). From Figure 6e,f, it can be observed that the highest AUC value (0.931) of the RFADT model for the validation dataset was obtained with the numbers of iterations and seeds equal to 3 and 12, respectively.
The ADTree, BAADT, RSADT and RFADT models were constructed using the training datasets. According to the statistical performance analysis in Table 3, all of the models showed acceptable performance for landslide prediction in the training step. Among the four models, the RFADT model has the best performance in terms of TP rate (0.911), FP rate (0.100), precision (0.911), AUC (0.972), Kappa (0.815) and RMSE (0.305). It is followed by the BAADT and RSADT models. The ADTree model showed the lowest performance, with TP rate, FP rate, precision, AUC, Kappa and RMSE equal to 0.863, 0.131, 0.867, 0.939, 0.722 and 0.326, respectively. The statistical performance criteria in the validation step showed that all of the landslide susceptibility models had acceptable values (Table 4). As in the training stage, the RFADT model was the best performing model (TP rate = 0.717, FP rate = 0.285, precision = 0.771, AUC = 0.931, Kappa = 0.433, and RMSE = 0.397) and the ADTree model showed the lowest performance (TP rate = 0.717, FP rate = 0.285, precision = 0.771, AUC = 0.931, Kappa = 0.433, and RMSE = 0.397). The BAADT and RSADT models therefore had intermediate efficiency between the RFADT and ADTree models.

Development of Landslide Susceptibility Maps
After determining the landslide susceptibility index using different models, the entire study area was classified into five susceptibility classes (very low (VLS), low (LS), moderate (MS), high (HS) and very high (VHS)) based on the geometrical interval, natural break and quantile classification schemes. The relative distribution of the susceptible classes in the study area and the contribution of classes in the recorded landslides are shown in Figure 7. Generally, the histograms of all models for different classification methods revealed that most of the recorded landslides are located in very high (VHS) susceptibility classes, except for ADTree model in which high class (HS) had the highest proportion of the recorded landslides. In the case of ADTree model, the very high susceptibility class determined by geometrical interval, natural breaks and quantile schemes cover 15.1, 16.9, and 16.9 percentages of the whole watershed pixels and, 27.4, 29.8, and 29.8 percentages of the recorded landslide pixels, respectively. Therefore, the natural break and quantile schemes were the best methods; however, the quantile was selected as the most appropriate method for landslide susceptibility classification. Accordingly, the quantile method was selected as the best classification method for the BAADT. However, both quantile and geometrical interval were best for the RFADT susceptibility maps for which the quantile was used. Finally, the geometrical method was the most appropriate for classification of the RSADT susceptibility maps.
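Of the three schemes, the quantile classification is the simplest to illustrate: the class boundaries are chosen so that each class holds roughly the same number of cells. The sketch below is an assumption-laden illustration rather than the ArcGIS implementation; the function names and the synthetic susceptibility index values are hypothetical.

```python
from collections import Counter

def quantile_breaks(values, n_classes=5):
    """Class boundaries that place roughly equal numbers of cells in each class."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_classes] for k in range(1, n_classes)]

def classify(value, breaks, names=("VLS", "LS", "MS", "HS", "VHS")):
    """Assign a susceptibility index value to one of the five classes."""
    for upper, name in zip(breaks, names):
        if value < upper:
            return name
    return names[-1]

# Synthetic susceptibility index values for 100 cells.
index_values = [i / 100 for i in range(100)]
breaks = quantile_breaks(index_values)
counts = Counter(classify(v, breaks) for v in index_values)
print(breaks)
print(counts)
```

Because the breaks follow the data distribution, the very high class always covers about one fifth of the cells, which makes the share of recorded landslides falling into it a convenient comparison between models.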

Evaluation of the Landslide Susceptibility Maps
The performance of the ensemble models in predicting landslide susceptibility was compared using the area under the ROC curve (AUC) for both the training and validation datasets. Figure 9a shows the ROC curves of the four landslide susceptibility maps prepared by the ADTree, RSADT, RFADT and BAADT models in the training step. The RSADT model had the highest degree of fit (AUC = 0.883), followed by the RFADT (AUC = 0.871), BAADT (AUC = 0.865), and ADTree (AUC = 0.859) models. From Figure 9b, it can be observed that in the validation step RSADT has the highest area under the curve, with an AUC value of 0.842. It is followed by RFADT, BAADT and ADTree with AUC values of 0.840, 0.836, and 0.813, respectively. Comparison of the ROC curves between the training and validation steps showed that the AUC values for the training dataset were higher than those for the validation dataset. This is because the landslides used for performance analysis in the training step are the same ones that were already used to construct the landslide models.

Discussion
Recognizing regions that are prone to landslide occurrence is one of the most important issues in land management and allocation strategies. Although different methods and techniques have been explored for the spatial prediction of landslides around the world, they all share the same aim. Nevertheless, how to achieve a reasonable and reliable landslide susceptibility map remains a controversial subject among landslide researchers. Earlier research focused mainly on individual models for the spatial prediction of natural hazards such as landslides. More recently, most researchers have turned to ensemble/hybrid models because of their advantages. The aim of this study is to introduce a new hybrid artificial intelligence approach for landslide susceptibility mapping in the Pithoragarh region, Uttarakhand, India. The study focused on the application of meta-classifier/ensemble algorithms, including BA, RS and RF, built on a decision tree classifier, ADTree, for the spatial prediction of landslides. Ten factors were selected for the modelling process and for analyzing the goodness-of-fit and performance of the ensemble models. According to the chi-square attribute evaluation (CSEA) technique, all factors except curvature were effective and were used in the final process. Feature selection techniques help to increase the goodness-of-fit and performance of models by identifying and removing unimportant factors [55]. In this study, curvature, found to be an ineffective factor, was removed from the final modelling process because it would create noise and over-fitting problems. Hybrid models are powerful techniques for considering the appropriate factors and enhancing the predictive power of base/individual classifiers while decreasing noise and over-fitting problems.
Indeed, the results of hybrid models have compared favourably with those of other cutting-edge soft computing individual algorithms [55].
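The factor ranking described above can be illustrated with the Pearson chi-square statistic, which measures how strongly the classes of a conditioning factor are associated with landslide occurrence. The sketch below is a minimal illustration under stated assumptions: the function name `chi_square`, the factor classes, and all counts are hypothetical, and the attribute evaluation used in the study may differ in detail (for example, in how continuous factors are discretized).

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table.

    Rows are the classes of one conditioning factor; the two columns hold
    the counts of [landslide, non-landslide] samples in each class.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts, chosen only to contrast an informative factor
# with one that carries no information about landslide occurrence.
factors = {
    "land_cover": [[30, 5], [10, 15], [2, 22]],
    "curvature": [[14, 14], [14, 14], [14, 14]],
}

# Rank the factors from most to least associated with landslides.
ranking = sorted(factors, key=lambda name: chi_square(factors[name]), reverse=True)

for name in ranking:
    print(name, round(chi_square(factors[name]), 3))
```

A factor whose classes contain landslide and non-landslide samples in identical proportions scores zero, which corresponds to the situation in which a factor is dropped from the modelling process.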
The ADTree is suited to binary classification and has produced promising results in the spatial prediction of landslides around the world [63,67,90,100]. It is an interpretable algorithm that is robust against noise and provides a significant improvement in classification error in comparison to individual/base decision stump classifiers [73]. In addition to a classification, the ADTree provides a measure of confidence known as the classification margin. Its classification represents a majority vote over very simple/weak rules, and the alternating tree is learned from the training dataset using the Adaboost/boosting algorithm [73].
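To make the idea of a vote over simple rules concrete, the sketch below scores one sample with a small alternating decision tree. In the usual formulation, a sample follows every branch whose test it reaches and the prediction values along those paths are summed; the sign of the sum gives the class and its magnitude serves as the confidence. The tree structure, factor names (`slope_deg`, `rainfall_mm`, `land_cover`), thresholds, and prediction values are all hypothetical, and the learning step (boosting) is not shown.

```python
def adtree_score(node, x):
    """Sum the prediction values on every path that sample x can follow."""
    total = node["value"]
    for splitter in node.get("splitters", []):
        # Each splitter sends the sample to exactly one of its two children,
        # but a prediction node may carry several splitters in parallel.
        child = splitter["yes"] if splitter["test"](x) else splitter["no"]
        total += adtree_score(child, x)
    return total


# Hypothetical tree: a root prediction value with two parallel splitters,
# the first of which has a nested splitter on its "yes" branch.
tree = {
    "value": 0.2,
    "splitters": [
        {
            "test": lambda x: x["slope_deg"] > 30,
            "yes": {
                "value": 0.6,
                "splitters": [
                    {
                        "test": lambda x: x["rainfall_mm"] > 2000,
                        "yes": {"value": 0.3},
                        "no": {"value": -0.2},
                    },
                ],
            },
            "no": {"value": -0.5},
        },
        {
            "test": lambda x: x["land_cover"] == "barren",
            "yes": {"value": 0.4},
            "no": {"value": -0.1},
        },
    ],
}

steep_wet = {"slope_deg": 35, "rainfall_mm": 2500, "land_cover": "forest"}
gentle = {"slope_deg": 10, "rainfall_mm": 2500, "land_cover": "forest"}

for cell in (steep_wet, gentle):
    score = adtree_score(tree, cell)
    label = "landslide" if score > 0 else "non-landslide"
    print(label, "confidence:", round(abs(score), 3))
```

Because several rules contribute to each score, a sample can be classified confidently even when one individual rule points the other way.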
After comparing the goodness-of-fit and performance of the ensemble models using the TP rate/recall, FP rate, precision, kappa index, RMSE, and ROC indexes, the RS-ADTree model was identified as the best model for landslide prediction. The RS is more efficient at reducing both variance and bias than the other ensemble methods. The obtained results are in agreement with Shirzadi et al. [13] and Pham et al. [101], who reported the ability of RS to predict landslides spatially and to enhance the accuracy of the base classifier used in their studies. However, the other ensemble models, RFADT and BAADT, were also powerful techniques with higher prediction accuracy than the ADTree as an individual/base classifier. In this study, ADTree was selected as a weak classifier (a classifier with poor performance) for the landslide susceptibility modelling process. We then developed novel ensemble models to improve the performance of the ADTree classifier by building more powerful decision rules.
It is worth noting that ensemble models may give different results depending on the individual/base classifier with which they are combined. For example, bagging may be useful for weak and unstable classifiers such as perceptron neural networks and linear discriminant analysis (LDA); bagging and RS may be advantageous for k-nearest neighbors classification rules; and boosting and bagging are advantageous for linear classifiers [102]. Accordingly, it is possible that in a given region not all of these ensembles enhance the prediction accuracy of the single base classifier. For example, Bui et al. [103] used the functional tree (FT) as a base/weak classifier to develop ensemble models such as bagging-FT, Adaboost-FT, and multiboost-FT for landslide susceptibility modelling. Their results showed that Adaboost-FT had lower prediction accuracy than the FT algorithm. Similar results were reported by Bui et al. [104], who found that Adaboost-DT had lower prediction accuracy than the DT and bagging-DT models. In this study, all ensemble models performed well and achieved higher prediction accuracy than the ADTree classifier. Hong et al. [98], Pham et al. [99], and Pham et al. [100] obtained the same result, in that the ensembles they applied had better accuracy than the base classifier. Skurichina and Duin [102] demonstrated that the RS may perform better than the boosting and bagging algorithms, which are useful for unstable classifiers [105]. They also confirmed that bagging is not useful for linear classifiers because these are mainly stable. Additionally, they reported that bagging is usually not appropriate for very small or very large training sample sizes.
The advantage of bagging is that it shifts the generalization error of the base classifier in the direction of the generalization error computed on smaller training datasets. Therefore, it is applicable to classifiers that have a decreasing learning curve. On the other hand, the RF is a robust classifier with low bias and noise that enhances the accuracy of the individual/base classifiers and, at the same time, the diversity within the ensemble [94]. The RS is a useful meta-classifier for weak linear classifiers obtained from small and critical training datasets. However, a limitation of RS is that its efficiency depends on the level of redundancy in the feature space of the training dataset [102]. Owing to the above-mentioned advantages, the ensemble models produced reasonable landslide susceptibility maps with higher prediction accuracy than the weak classifier used alone.
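The difference between bagging and the random subspace method lies in what each ensemble member is trained on, which the following sketch illustrates. Bagging resamples the training rows with replacement and keeps all factors, whereas the random subspace method keeps all rows and draws a random subset of the factors. The function names, the number of rows, the subspace size, and the number of ensemble members are assumptions for illustration; only the count of nine factors (ten minus curvature) comes from the text. Rotation forest is omitted because it additionally requires a feature transformation step.

```python
import random


def bagging_sample(n_rows, rng):
    """Bootstrap sample: draw n_rows row indices with replacement."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def random_subspace_sample(n_features, k, rng):
    """Random subspace: draw k distinct feature indices; all rows are kept."""
    return sorted(rng.sample(range(n_features), k))


def majority_vote(predictions):
    """Combine 0/1 predictions of the ensemble members; ties go to class 1."""
    return 1 if 2 * sum(predictions) >= len(predictions) else 0


rng = random.Random(0)   # fixed seed so that the sketch is repeatable
n_rows = 100             # hypothetical number of training samples
n_features = 9           # nine conditioning factors retained in the study
n_members = 5            # hypothetical ensemble size

# One training view per ensemble member.
bags = [bagging_sample(n_rows, rng) for _ in range(n_members)]
subspaces = [random_subspace_sample(n_features, 5, rng) for _ in range(n_members)]

for rows, cols in zip(bags, subspaces):
    print("bagging: distinct rows =", len(set(rows)), "| subspace: factors =", cols)

# Hypothetical member predictions for one map cell (1 = landslide).
print("ensemble prediction:", majority_vote([1, 0, 1, 1, 0]))
```

Each base classifier would then be trained on its own view of the data, and the member predictions combined as in `majority_vote`.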

Conclusions
According to results reported in the literature, hybrid machine learning algorithms are stronger and more robust than other methods and techniques for the spatial prediction of landslides and are therefore favoured by landslide researchers. Since each classifier/algorithm has a different probability distribution function and structure, the modelling output will differ because of uncertainties in the model and the inputs. In landslide modelling by machine learning, performance and prediction accuracy are enhanced when the proper meta/ensemble classifier is tested and selected. Such an enhancement is obtained when a training dataset with little noise, few over-fitting problems, and high performance and goodness-of-fit is selected. In this study, among the ten conditioning factors, curvature introduced error and noise into the training dataset and was removed from the modelling, while land cover was the most significant factor for landslide occurrence in the study area. Three meta-classifiers, BA, RS and RF, were combined with ADTree as a weak base classifier to construct hybrid models. Our findings, based on several statistical metrics, indicated that the RS-ADTree hybrid model outperformed the BA-ADTree and RF-ADTree models. This model was better able to overcome bias and over-fitting problems, resulting in higher prediction accuracy. Therefore, we conclude that the RS-ADTree ensemble model can be used as a promising new technique for the spatial prediction of landslides in the study area. The RF-ADTree and BA-ADTree models are also suitable for landslide susceptibility mapping. To check the applicability and efficiency of these models, we suggest that they be applied and validated in more case studies with different climatic and geo-environmental factors.
We believe that achieving a landslide susceptibility map with reliable and high prediction accuracy, which is the main aim of landslide researchers, may be useful and constructive for decision making, enabling better management of landslide-prone areas.

Conflicts of Interest:
The authors declare no conflicts of interest.