A Comparative Study of Statistics-Based Landslide Susceptibility Models : A Case Study of the Region Affected by the Gorkha Earthquake in Nepal

As a result of the Gorkha earthquake in 2015, about 9000 people lost their lives and many more were injured. Most of these losses were caused by earthquake-induced landslides. Sustainable planning and decision-making are required to reduce the losses caused by earthquakes and related hazards. The use of remote sensing and geographic information systems (GIS) for landslide susceptibility mapping can help planning authorities to prepare for and mitigate the consequences of future hazards. In this study, we developed landslide susceptibility maps using GIS-based statistical models at the regional level in central Nepal. Our study area included the districts affected by landslides after the Gorkha earthquake and its aftershocks. We used the 23,439 landslide locations obtained from high-resolution satellite imagery to evaluate the differences in landslide susceptibility using analytical hierarchy process (AHP), frequency ratio (FR) and hybrid spatial multi-criteria evaluation (SMCE) models. The nine landslide conditioning factors of lithology, land cover, precipitation, slope, aspect, elevation, distance to roads, distance to drainage and distance to faults were used as the input data for the applied landslide susceptibility mapping (LSM) models. The spatial correlation of landslides and these factors were identified using GIS-based statistical models. We divided the inventory into data used for training the statistical models (70%) and data used for validation (30%). Receiver operating characteristics (ROC) and the relative landslide density index (R-index) were used to validate the results. The area under the curve (AUC) values obtained from the ROC approach for AHP, FR and hybrid SMCE were 0.902, 0.905 and 0.91, respectively. The index of relative landslide density, R-index, values in sample datasets of AHP, FR and hybrid SMCE maps were 53%, 58% and 59% for the very high hazard classes. The final susceptibility results will be beneficial for regional planning and sustainable hazard mitigation.


Introduction
Mass movements are one of the major geological hazards in Nepal; almost 80% of the total area of the country is prone to landslides [1].Nepal is in the central part of the Himalayan mountain range, located in the intercontinental collision zone where the Indian plate is subducting under the Eurasian plate [2].On 25 April 2015, an earthquake of magnitude (M) 7.8 struck Nepal and was followed by several aftershocks.The earthquake shook most of the country and neighboring countries, and it was one of the most powerful earthquakes since the Bihar-Nepal earthquake of 1934.The main event and aftershocks triggered mass movements and resulted in damage to infrastructure and human life.The Gorkha earthquake resulted in the deaths of more than 9000 people and is estimated to affect about eighty-five million people in Nepal and neighboring countries [3].The estimates of previous studies as to the number of landslides triggered by the Gorkha earthquake range between 3000 and 25,000 [4][5][6][7][8][9].A landslide inventory is one of the essential aspects for landslide susceptibility evaluation [10,11].In most of the studies related to the Gorkha earthquake, the emphasis was on the development of a landslide inventory while using this information to develop a landslide susceptibility zonation for the earthquake-affected region was largely neglected.There has been extensive work done on the preparation of coseismic landslide inventories for susceptibility evaluations worldwide.The first coseismic landslide inventory was prepared after the 1957 San Francisco earthquake [12,13].Since then, several researchers have attempted to prepare coseismic landslide inventories over the years.Due to advancements in geographic information systems (GIS) and Earth observation (EO) technologies, including semi-automatic and machine learning techniques, the preparation of a coseismic landslide inventory is becoming more attainable and less time-consuming.
Landslide susceptibility mapping (LSM) is done by assessing the probability of a landslide occurrence in a given region [14].Over the years, landslide susceptibility mapping has become a practical approach with the aim to obtain better insights into the potential slope failures.Remote sensing and GIS techniques are the foundation of landslide susceptibility mapping.In this study, we used landslide inventory data obtained from very high-resolution Digital Globe Worldview 2-3 imagery.Satellite imagery of high spatial resolution helps in identifying even the smallest of the slope failures.Many previous studies have used very high-resolution satellite imagery to obtain landslide inventory data, for instance [15] used 2 m Worldview-2 imagery to detect landslides caused by the 2010 Haiti earthquake.Another study used IKONOS-1 imagery for landslide detection in the Tangjiaoa, Wanzhou district of China [16].Reference [17] used different CNN models and three different machine learning models, namely artificial neural network (ANN), support vector machines (SVM), and random forest (RF) for landslide detection and inventory dataset production.High-resolution satellite imagery data is useful to generate input datasets for LSM.Integration of different models for the identification of probable areas of landslide occurrence through LSM is a recognized approach for sustainable risk mitigation [18].Access to accurate inventory data along with hazard conditioning factors using both data-driven and knowledge-based models have yielded successful results for LSM in different studies [19][20][21].Several studies have used expert knowledge-based models for conducting LSM.A range of models is used in these studies, including the analytical hierarchy process (AHP) [22][23][24][25].Although the AHP yields accurate results in natural hazard susceptibility modelling and mapping, it has been proved that there is a degree of uncertainty associated with the resulting maps.Therefore, the AHP model is not always a satisfactory model for susceptibility mapping based on decision makers' preferences.A number of attempts have been made to improve the accuracy of the AHP as it is potentially a valuable tool for susceptibility mapping [19,26].
Currently, data-based models show fairly good results in the natural hazard susceptibility assessments [27][28][29].We used a data-based frequency ratio (FR) model for LSM in the Gorkha region.FR is based on data from previous landslides and causal factors and their spatial coverage [30].Several studies used FR to generate LSMs at regional or small scales, such as [24,[30][31][32][33][34][35][36].A study carried out in Selangor, Malaysia by [37] concluded that the FR model shows a correlation between landslide causal factors and the location of past landslides in a given study area.As the weightings of the FR model are directly based on the density of the past landslides in the study area, it is sensitive to the use of different training datasets.
Several researchers have integrated different models to improve the susceptibility assessment of landslides.These studies used a range of models.For example, the entropy method, logistic regression (LR) and support vector machine (SVM) were combined for LSM by [34], and the ensemble weights of evidence, LR and random forest models were integrated into a hybrid approach for LSM in Shaanxi Province, China by [38].The integration of the AHP model with a sensitivity and uncertainty analysis led to better results of LSM in a study that have done with [39,40].
In this study, the integration of the AHP and FR models was used as the theoretical background of the spatial multi-criteria evaluation (SMCE) model.The multi-criteria evaluation (MCE) is a decision-based approach in which alternatives are evaluated in a tree-like hierarchy of criteria and objectives [41].There are several stages in carrying out a hybrid SMCE.Firstly, a problem analysis is carried out by making a tree-like hierarchy; this is followed by the standardization of the weights of the factors and the generation of the final composite maps [41][42][43][44].The hybrid SMCE model yielded a higher prediction accuracy, resulting in higher reliability of the generated landslide maps, compared to FR, SVM, index of entropy (IOE) and the weighted linear combination (WLC) models implemented in a study done in the Wunning area in China by [45].
We used the AHP, FR, and hybrid SMCE models to carry out LSM for the study area.First, the AHP model was created based on expert knowledge, which was used to assign weights to factors.Then, the FR model was used to generate landslide susceptibility maps based on a data-driven approach.LSM was carried out based on different class frequency ratios of landslide inventory data.Finally, the third landslide susceptibility map was generated using the hybrid SMCE model, which is the integration of the AHP and FR models.In this study, we selected the region most affected by landslides after the earthquake; the districts are Gorkha, Rasuwa, Sindhupalchok, Dhading, Nuwakot, and Dholaka.GIS-based landslide susceptibility maps were produced using the three statistical models AHP, FR and hybrid SMCE.The validation of the three models was carried out using two different accuracy assessment methods and the validation inventory dataset of the past landslide occurrences in our case study area.

Study Area and Inventory Data Set
The epicenter of the earthquake was in the Gorkha district and its aftershock was about 140 km away in the Dholaka district.But landslides triggered by the main event and several aftershocks were scattered all over central Nepal.We have selected seven districts that were severely affected by the earthquake.The study area covers a 14,502 km 2 region spanning most of central Nepal.The area has two major drainage systems, namely Narayani and Saptakoshi [6].A total of 23,439 landslides have been selected for the landslide susceptibility evaluation (see Figure 1).
Geologically, the study area lies within the Himalayan fold and thrust zone of central Nepal, which resulted from the collision of the Indian plate with the Eurasian plate [2].The collision led to extensive crustal shortening and upheaval, resulting in the formation of the Himalaya, the quintessential collision orogen.Himalayan seismicity is attributed to the movement of the Indian plate relative to the Eurasian plate at a rate of about 5 cm per year.When the convergence is locked in some areas, the energy is ultimately released in the form of tremors along the mountain range and in its surroundings [46].

Landslide Inventory Data Set
The landslide inventory map illustrates the active landslide sites and their properties, such as landslide type, structural attributes, and distance to roads.These slope deformations are linked to the morphological, geological and climatic conditions of the locations; thus, these attributes can predict the conditions that could cause future landslides in the area.The first step was to identify the landslide locations in the satellite imagery and evaluate landslide-prone areas.Then active landslide locations were mapped and an inventory was prepared using satellite image interpretation, an extensive field survey, and literature search for historical landslide records [47].The landslide inventory dataset was generated by comparing high-resolution imagery of pre-and post-event (Figure 2) data taken from the USGS data catalogue [48].The satellite imagery was obtained from Digital Globe Worldview 2 and 3, and the spatial resolution of the imagery used for mapping the landslides was very high in the study area.The images were collected between 26th April and 15th June 2015, thus covering the main event of the Gorkha earthquake and its aftershocks.For our study area, a total of 23,439 landslide locations were identified.Our landslide inventory dataset was separated into a training and a testing dataset.Dividing the inventory dataset randomly for training and validation purposes is a general approach that has been used in several natural hazard studies [49,50].Samples are chosen based on the size of the study area, completeness of the inventory, and the methodology used.Currently, there is no standard methodology for the selection of testing and training samples [51].Reference [52] have given different ratios for various methods, and [53] also used the same ratio for prediction of landslides using computational methods.To make our models computationally robust, the inventory dataset was randomly divided into two sections, with 70% (16,407) used for training the models and 30% (7032) used for validating the results (see Figure 2).
extensive field survey, and literature search for historical landslide records [47].The landslide inventory dataset was generated by comparing high-resolution imagery of pre-and post-event (Figure 2) data taken from the USGS data catalogue [48].The satellite imagery was obtained from Digital Globe Worldview 2 and 3, and the spatial resolution of the imagery used for mapping the landslides was very high in the study area.The images were collected between 26th April and 15th June 2015, thus covering the main event of the Gorkha earthquake and its aftershocks.For our study area, a total of 23,439 landslide locations were identified.
Our landslide inventory dataset was separated into a training and a testing dataset.Dividing the inventory dataset randomly for training and validation purposes is a general approach that has been used in several natural hazard studies [49,50].Samples are chosen based on the size of the study area, completeness of the inventory, and the methodology used.Currently, there is no standard methodology for the selection of testing and training samples [51].Reference [52] have given different ratios for various methods, and [53] also used the same ratio for prediction of landslides using computational methods.To make our models computationally robust, the inventory dataset was randomly divided into two sections, with 70% (16,407) used for training the models and 30% (7032) used for validating the results (see Figure 2).

Landslide Conditioning Factors
The nine factors that most affect landslide occurrence in our study area are categorized into the four main groups of topographical, geological, hydrological and anthropogenic factors.Topographical factors include slope, elevation and aspect.The SRTM digital elevation model (DEM) with a 30 m spatial resolution was used as the basis of our slope and aspect layers.The slope angle is considered as the most critical factor of the slope stability assessment in any LSM study [54].The elevation is another crucial factor of LSM as most geomorphological and geological processes are controlled by this topographical factor [55].It was also used to describe the local relief of the study area.Relief classes denote the elevation range from the lowest to the highest point in the study area [56].Five altitude classes were defined to determine the density of landslides in different relief classes.The aspect factor affects the landslide occurrence as it is related to meteorological factors such as the direction of the precipitation and the average amount of sunshine.
The geological factors in this study are defined as lithology and distance to faults.These geological factors were extracted from the available 1:250,000 geological map of central Nepal.Different lithologic units and proximity to faults play a significant role in controlling the landslides.More than 40% of the lithology of the study area is categorized as gneiss and migmatite.The distance to faults is essential, especially in our case study, since areas closer to the faults were more affected by the earthquake [6].
Precipitation and the distance to drainage were our hydrological factors.Drainage is also one of the causal factors of the landslides because they cause erosion and saturation of the materials in the lower parts of valleys [57,58].
The anthropogenic factors are the distance to roads and land use.Human activities, and road construction in particular, cause a diminution in the load on either the toe or the topography of slopes.Finally, to apply the three models, the layers of all conditioning factors were included in the GIS environment as raster layers with a 30 m resolution (see Figure 3).The scale of 30 m was selected to match the spatial resolution of the SRTM 30 m data on which the topography data set factors were based.

Methods
Landslide susceptibility analysis was carried out using the AHP, FR and hybrid SMCE models in the region affected by the Gorkha earthquake.

Analytical Hierarchy Process (AHP)
The AHP developed by [59] has been used to weight related factors of any spatial problem in GIS environments [60].It is a common tool for analysis to support site selection, urban planning, and natural hazard susceptibility analysis [61].The AHP is a decision-making process based on multi-criteria and multiple objectives and involves the participation of experts [62].A hierarchical order of factors and numerical values is established based on the importance of each factor [63]. Subsequently, these factors are integrated, and each factor is weighted according to its importance [64].The AHP is used to establish the correlative pairwise comparison matrix.This matrix is constructed using values that represent experts' judgments by comparing the importance of each

Methods
Landslide susceptibility analysis was carried out using the AHP, FR and hybrid SMCE models in the region affected by the Gorkha earthquake.

Analytical Hierarchy Process (AHP)
The AHP developed by [59] has been used to weight related factors of any spatial problem in GIS environments [60].It is a common tool for analysis to support site selection, urban planning, and natural hazard susceptibility analysis [61].The AHP is a decision-making process based on multi-criteria and multiple objectives and involves the participation of experts [62].A hierarchical order of factors and numerical values is established based on the importance of each factor [63]. Subsequently, these factors are integrated, and each factor is weighted according to its importance [64].The AHP is used to establish the correlative pairwise comparison matrix.This matrix is constructed using values that represent experts' judgments by comparing the importance of each factor in relation to all the other related factors [65].Each layer is based on a nine-point rating scale and added to the matrix as developed by [66] (See Table 1).The decision maker specifies the value of each factor.Each factor weight from the matrix class was multiplied by the weight class.Local representation of factors in the study area determined the results of the susceptibility map.The factors were weighted using the pairwise comparison matrices of the AHP based on expert knowledge.The principle of transitivity is important in AHP for any given three factors (such as f 1 , f 2 and f 3 ), and is defined as follows; if f 1 > f 2 and f 2 > f 3 , then f 1 > f 3 .The principle of transitivity is a basis for conditioning factors weighing in AHP.Due to this principle, a consistent pairwise comparison matrix would require that if 2f 1 > f 2 (i.e., f 1 is two times more preferable than f 2 ) and 4f 2 > f 3 , then 8f 1 > f 3 to account for the transitivity principle [19,67].Therefore, it was necessary to compute the consistency of expert comparisons in matrices in each stage [20].Inconsistency can be defined based on the observation that λ max > n for comparison matrices and λ max = n, if C is a consistent comparison.The consistency ratio (CR) is defined by Equation ( 1 where RI is the random index of a randomly created pairwise comparison matrix for n = 2, 3, If the consistency ratio is <0.10 it indicates an acceptable level of consistency, whereas a CR > 0.10 points to a degree of inconsistency [68].Our case study in this research is to map areas susceptible to landslides.Accordingly, to calculate the criteria weights using the AHP model, questionnaire forms were prepared to gather expert knowledge for layer weighting.Therefore, nine local academic experts and twelve researchers and geologists from the international center for integrated mountain development (ICIMOD) and national organizations involved in post-earthquake reconstruction planning were asked to fill in the questionnaire, and a total of twenty-one forms were obtained for layer weighting.The assessed weights of the nine layers were calculated using the AHP model based on the resulting pairwise comparison matrices.

Frequency Ratio (FR)
FR is a useful geospatial assessment tool that provides the probabilities of the distribution of the occurrence and non-occurrence of landslides for each class of related factors [36].Classes are weighted based on the ratio of observed landslides to the whole study area [58].FR is one of the best geospatial assessment tools to determine the spatial correlation between inventory data and the class of related factors [69].The number or percentage of the inventory data in each class indicates the significance of the correlation with the landslide [70].For computing the FR weights, the frequency ratios of the factor classes were calculated based on our landslide inventory dataset.The percentage ratio for different classes of each factor to the whole study area was calculated.The final LSM was generated by a linear combination model (see Equation ( 2)) of the FR weights for all the factors.
where n is the number of factors and k is the number of classes in each factor that contributed to the landslide susceptibility map generation.Thus, FR ij is the FR weight of the j-th class in the i-th factor.
To prepare qualitative landslide susceptibility maps, the FR model was implemented using GIS tools.The frequency ratio of each factor was calculated from the relationship of the area of the corresponding class with the landslide events.The FR was calculated for all classes of each conditioning factor.
According to the FR model, the ratio is defined as the area where the landslide occurred to the total area.This means that an FR value of 1 is an average value for the occurrence of a landslide in a particular area compared to the whole area.If the value is >1, it means the percentage of landslides that occurred in the area is above the regional average and shows a high correlation, whereas values lower than 1 indicate a lower correlation [71,72].The generated landslide susceptibility map based on the FR weights is shown in Figure 4.

Hybrid Spatial Multi-Criteria Evaluation (SMCE)
The hybrid SMCE model enables users to solve the spatial problems of natural hazard susceptibility mapping [70].Spatial features are defined as lines, points, and areas in this approach and the generated final maps result from the consideration of landslide conditioning factors [73].SMCE is an approach that makes it possible to incorporate spatial analysis and GIS to use both spatial and non-spatial input data to produce final maps [74].In hybrid SMCE, input layers are spatially represented as factors.Based on the criteria tree, input layers are grouped weighted and standardized.The input layers are standardized from their original values to the 0-1 value range.The outputs of the hybrid SMCE were the index maps, which represent the match and mismatch of criteria in different areas.The multi-criteria evaluation (MCE) of the AHP was used as the theory of the hybrid SMCE model.The steps involved in the operation of the hybrid SMCE were the problem analysis, weighting the factors, standardization, ranking the factors using priority ordering, and finally producing the output map [41].The conditioning factors were grouped into the four main groups of topographical, geological, hydrological, and anthropogenic.The significance of each group on the landslide susceptibility was calculated based on the AHP pairwise comparison matrix completed by the same experts (see Table 2).The same methodology was used for weighting the factors categorized into different hybrid SMCE groups.However, as there were some differences in the expert's reports regarding the pairwise comparison of the hybrid SMCE groups, we used the weighted average of reported values for this matrix.Using the average of the comparison values of Saaty's rating scale (see Table 1) from different expert knowledge is a common approach for dealing with the uncertainty sources in the multi-criteria decision analyses [60,63].The hybrid SMCE resulting weight for each conditioning factor is represented in Table 3.These weights are the same with those of used for the AHP model resulting from the pairwise comparison matrix.However, in case of the hybrid SMCE, we used the resulting weights from the FR model for the classes of each conditioning factor.Therefore, the weightings of the factors were produced based on the AHP pairwise comparison matrices, whereas the FR values were used for the classes.Hence, the final impact of each hybrid SMCE factor on the resulting landslide susceptibility map was an integration of the weighting processes based on both the AHP and FR models, and it was considered the result of both models.

Results
To generate the landslide susceptibility maps to identify the areas that are highly susceptible to landslides, three different models were used in this study.The output values of the AHP and FR models were represented in Tables 4 and 5, respectively.The resulting weights from each model were applied to weight the conditioning factors and classes in a GIS environment.The derived FR value of over unity shows a strong relationship between the landslide data of the training inventory dataset high landslide susceptibility.FR values of less than unity show a low susceptibility.This is also true for the resulting factor weights from the AHP.Therefore, higher AHP weights indicated a high susceptibility.The weightings of the FR model were derived based on our produced inventory data set.However, those of the AHP model were generated according to the expert preferences in the pairwise comparison matrices.The weightings for sub-criteria classes from both the FR and AHP models followed the same pattern, there were some big differences in the first class of land use, distance to faults, distance to roads, and slope factors.According to the results based on expert knowledge, the factor of lithology containing the quartzite rock formation had the most significant impact on landslide occurrence, and this class got the lowest FR value.However, the rest of the classes were approximately the same in terms of the FR value.This is also the case for the weightings of the classes of the land use factor.
For the hybrid SMCE approach, the main groups and their corresponding factors were weighted based on the AHP model, and the FR values were considered for the classes of the factors.Thus, the resulting landslide susceptibility map based on the hybrid SMCE approach was a combination of the results of both the AHP and the FR models.Figure 4 shows the resulting landslide susceptibility maps obtained from each model.The natural breaks classification technique was used to classify similar values separated by breakpoints.This is a common and useful technique for classification of any hazard susceptibility map.The results were interpreted as belonging to the same class when they were values close to each class boundary, e.g., values between "low" and "moderate" susceptibility.Thus, the resulting landslide susceptibility maps were classified into five classes of susceptibility using the natural breaks classification technique.These results were validated using the 30% validation inventory dataset.We explain the validation approach and results in the next section.

Validation
In this section, we outline two different validation methods, which are widely used in the accuracy assessment of natural hazard susceptibility mapping.Validation is an important part of preparing natural hazard susceptibility maps to predict possible future events [76].To ascertain the success of applied landslide susceptibility models, we compared the resulting maps with the landslide inventory data from the study area.Analyzing the conformity between the inventory data and the resulting maps of the applied models can give a clear indication as to the effectiveness of each model for LSM and indicate whether the applied models could correctly predict areas that are susceptible to landslides.The locations of the historical landslides support the modelling and pattern recognition.Addressing more accurate areas that were exposed and susceptible to landslides becomes an advantageous driver to inform spatial managers in order to inform sustainable risk management and landslide mitigation.It would be useful for minimizing the adverse impacts, mainly by saving lives of the local people who live in such susceptible areas and protecting public properties in the area under investigation [44].In this study, 30% (7032 points) of the landslide points of the inventory data set was reserved for the validation.The receiver operating characteristics (ROC) curves and the relative landslide density method (R-index) were used for the accuracy assessment of the resulting landslide susceptibility maps based on the AHP, FR and hybrid SMCE models.After preparing susceptibility maps using the three models, we compared the results with the ground truth at a randomly selected sub-area as shown in Figure 5.The presented graphical verification indicates that the hybrid SMCE model gave a more accurate representation of the landslides that fell into the very high susceptibility class.Validation of the results showed good agreement between the observed and the predicted values for the hybrid SMCE model.

Validation
In this section, we outline two different validation methods, which are widely used in the accuracy assessment of natural hazard susceptibility mapping.Validation is an important part of preparing natural hazard susceptibility maps to predict possible future events [76].To ascertain the success of applied landslide susceptibility models, we compared the resulting maps with the landslide inventory data from the study area.Analyzing the conformity between the inventory data and the resulting maps of the applied models can give a clear indication as to the effectiveness of each model for LSM and indicate whether the applied models could correctly predict areas that are susceptible to landslides.The locations of the historical landslides support the modelling and pattern recognition.Addressing more accurate areas that were exposed and susceptible to landslides becomes an advantageous driver to inform spatial managers in order to inform sustainable risk management and landslide mitigation.It would be useful for minimizing the adverse impacts, mainly by saving lives of the local people who live in such susceptible areas and protecting public properties in the area under investigation [44].In this study, 30% (7032 points) of the landslide points of the inventory data set was reserved for the validation.The receiver operating characteristics (ROC) curves and the relative landslide density method (R-index) were used for the accuracy assessment of the resulting landslide susceptibility maps based on the AHP, FR and hybrid SMCE models.After preparing susceptibility maps using the three models, we compared the results with the ground truth at a randomly selected sub-area as shown in Figure 5.The presented graphical verification indicates that the hybrid SMCE model gave a more accurate representation of the landslides that fell into the very high susceptibility class.Validation of the results showed good agreement between the observed and the predicted values for the hybrid SMCE model.

Receiver Operating Characteristics (ROC)
The ROC curve [77] was used to validate different resulting landslide susceptibility maps using our validation data.The ROC approach displays a full scene of trade-off among the true-positive rate (TPR) and the false-positive rate (FPR) in the landslide susceptibility maps [78].ROC curves were calculated for all three resulting landslide susceptibility maps.The vertical axis indicates the TPR, while the horizontal axis shows the FPR [79] (see Figure 6).TPRs are the pixels that correctly referred to the landslide areas, while FPRs are the pixels wrongly labelled as landslides.To generating ROC curves, we obtain the correctly and incorrectly labelled pixels in the inventory dataset and then plot TPRs versus FPRs across the values that they took.The area under the curve (AUC) is the measure that indicates the accuracy of the landslide susceptibility maps.The resulting AUCs indicate the probability that more pixels were correctly labelled than incorrectly labelled [27].Therefore, greater AUC values indicate a higher accuracy of the resulting susceptibility map.The AUC values close to unity indicate a comprehensive susceptibility map; a value of 0.5 shows a worthless map because it means the map was generated by chance [80].The AUC values obtained from the ROC approach for AHP, FR and hybrid SMCE were 0.902, 0.905 and 0.910, respectively.Therefore, according to the results, the hybrid SMCE model indicates a more accurate landslide susceptibility map compared to the other two methods.Moreover, the resulting landslide susceptibility map based on the FR model was more accurate than that based on the AHP, which means that the data-based model performed better than the knowledge-based model.However, the result of the knowledge-based models may vary if we use other experts for criteria weightings or other models (e.g., technique for order of preference by similarity to ideal solution (TOPSIS) and conventional AHP).The resulting maps based on the data-based models are also sensitive to the training data, and they will change even when using another section of the inventory data set.The resulting landslide susceptibility map based on the FR contained 6701 landslides in its very high susceptibility class, while those of the AHP and hybrid SMCE contained 7948 and 4454, respectively.

Receiver Operating Characteristics (ROC)
The ROC curve [77] was used to validate different resulting landslide susceptibility maps using our validation data.The ROC approach displays a full scene of trade-off among the true-positive rate (TPR) and the false-positive rate (FPR) in the landslide susceptibility maps [78].ROC curves were calculated for all three resulting landslide susceptibility maps.The vertical axis indicates the TPR, while the horizontal axis shows the FPR [79] (see Figure 6).TPRs are the pixels that correctly referred to the landslide areas, while FPRs are the pixels wrongly labelled as landslides.To generating ROC curves, we obtain the correctly and incorrectly labelled pixels in the inventory dataset and then plot TPRs versus FPRs across the values that they took.The area under the curve (AUC) is the measure that indicates the accuracy of the landslide susceptibility maps.The resulting AUCs indicate the probability that more pixels were correctly labelled than incorrectly labelled [27].Therefore, greater AUC values indicate a higher accuracy of the resulting susceptibility map.The AUC values close to unity indicate a comprehensive susceptibility map; a value of 0.5 shows a worthless map because it means the map was generated by chance [80].The AUC values obtained from the ROC approach for AHP, FR and hybrid SMCE were 0.902, 0.905 and 0.910, respectively.Therefore, according to the results, the hybrid SMCE model indicates a more accurate landslide susceptibility map compared to the other two methods.Moreover, the resulting landslide susceptibility map based on the FR model was more accurate than that based on the AHP, which means that the data-based model performed better than the knowledge-based model.However, the result of the knowledge-based models may vary if we use other experts for criteria weightings or other models (e.g., technique for order of preference by similarity to ideal solution (TOPSIS) and conventional AHP).The resulting maps based on the data-based models are also sensitive to the training data, and they will change even when using another section of the inventory data set.The resulting landslide susceptibility map based on the FR contained 6701 landslides in its very high susceptibility class, while those of the AHP and hybrid SMCE contained 7948 and 4454, respectively.

Relative Landslide Density (R-index)
The accuracy of the landslide susceptibility prediction was evaluated using an index of relative landslide density (R-index).We used a total of 23,439 landslides to validate the results.The R-index is given as follows, by [81] where ni is the percentage of the area that is susceptible to landslides in each susceptibility class, and Ni is the percentage of landslides in each susceptibility class.Results show that the hybrid SMCE and FR models have higher R-index values in the very high susceptibility classes.For low and moderate susceptibility classes, the R-index values are very similar for all three models.The R-index values for the very high susceptibility class in the hybrid SMCE, AHP and FR maps are 59%, 53%, and 58%, respectively (Table 6).

Discussion
The area covered by the very high susceptibility class varies from 648,798,300 m 2 in the hybrid SMCE model to almost three times that area in the AHP model (1,894,469,000 m 2 ), and in the FR model (1,090,157,000 m 2 ) it is more than the area covered by the hybrid SMCE model.In the resulting AHP map, 13.17% of the area falls into the very high susceptibility class, and the number of landslides located in this class was 7948.In the resulting hybrid SMCE map, the corresponding values were 4.51% and 4454, respectively, which resulted in the highest R-index of all three models.As shown in Figure 7, the R-index of the FR model was a value between that of the AHP and the hybrid SMCE.In the map resulting from the FR model, 7.55% of the area falls into the very high susceptibility class with a total of 6701 landslides in this area.The distribution of the locations of landslides in the different classes of the resulting landslide susceptibility maps of a sub-area is shown in Figure 5.For the high susceptibility class, the resulting R-index values and area covered by the corresponding class for the implemented models were almost similar.But the number of landslides that fall into that class of the hybrid SMCE model is higher than in the other two models.When considering the differences in areas of the susceptibility classes in the resulting landslide susceptibility maps and the subsequent R-index for each class, the hybrid SMCE model achieved better results overall.The overall results of the AUC values obtained from the ROC approach for the three applied models showed that the hybrid SMCE generated a more accurate landslide susceptibility map than the other two models.As the hybrid SMCE is an integration of the AHP and FR models, we can say that the integration of knowledge-based models with the data-based ones achieves a more accurate result for our case study.The R-index value for the very high susceptibility category of the resulting susceptibility map based on the FR model is higher than that of the AHP.The higher accuracy of the resulting landslide susceptibility map based on the FR model compared to the AHP model shows that the data-based model performs better than the knowledge-based model.However, the results of knowledge-based models may vary if we use other experts for criteria weightings or use other models.Thus, the limitation of our work is that we used only the AHP model as our main knowledge-based model.Different multi-criteria decision-making models have been used in several spatial problem studies based on expert knowledge, and in some cases, the resulting accuracies of the models were compared to each other.The AHP model was compared to the TOPSIS by [82] within a fuzzy calculus.Both models were found to be suitable for their study.However, the TOPSIS model yielded better results than the AHP.In another study, the AHP model was compared to the interval pairwise comparison matrix (IPCM) by [83] and the second model was slightly more accurate in LSM.Generally, the AHP model was optimized through integration with other methodologies, such as fuzzy theory and the Dempster Shafer, to improve the accuracy of the final resulting landslide susceptibility maps [84].However, the conventional AHP model was used for LSM in our study.Consequently, as it was not the intent of our study to analyze the data-based and knowledge-based models separately, but rather to determine the value in combining them, the results of our study should not be considered as the best in terms of the individual models.There are also several studies in the field of landslide susceptibility that implement and compare data-based models [85].The resulting maps based on the data-based models are sensitive to the training data, and they will change by using a different selection of the inventory dataset.On the one hand, the results from the knowledge-based models may vary if using other experts for the criteria weightings.On the other hand, the resulting natural hazard susceptibility maps based on the data-based models can easily be affected by the training data, and the results will change if using training data from another source.Based on the data presented in Figure 8, although most of the weightings of the sub-criteria based on the AHP and FR models are the same, there are also some significant differences in the first class of the precipitation and distance to roads factors.According to the results based on the AHP, the class of 2500-3275 (mm/yr) of precipitation is the most important in terms of the landslide susceptibility.However, based on the weighting of the FR model, the class of The overall results of the AUC values obtained from the ROC approach for the three applied models showed that the hybrid SMCE generated a more accurate landslide susceptibility map than the other two models.As the hybrid SMCE is an integration of the AHP and FR models, we can say that the integration of knowledge-based models with the data-based ones achieves a more accurate result for our case study.The R-index value for the very high susceptibility category of the resulting susceptibility map based on the FR model is higher than that of the AHP.The higher accuracy of the resulting landslide susceptibility map based on the FR model compared to the AHP model shows that the data-based model performs better than the knowledge-based model.However, the results of knowledge-based models may vary if we use other experts for criteria weightings or use other models.Thus, the limitation of our work is that we used only the AHP model as our main knowledge-based model.Different multi-criteria decision-making models have been used in several spatial problem studies based on expert knowledge, and in some cases, the resulting accuracies of the models were compared to each other.The AHP model was compared to the TOPSIS by [82] within a fuzzy calculus.Both models were found to be suitable for their study.However, the TOPSIS model yielded better results than the AHP.In another study, the AHP model was compared to the interval pairwise comparison matrix (IPCM) by [83] and the second model was slightly more accurate in LSM.Generally, the AHP model was optimized through integration with other methodologies, such as fuzzy theory and the Dempster Shafer, to improve the accuracy of the final resulting landslide susceptibility maps [84].However, the conventional AHP model was used for LSM in our study.Consequently, as it was not the intent of our study to analyze the data-based and knowledge-based models separately, but rather to determine the value in combining them, the results of our study should not be considered as the best in terms of the individual models.There are also several studies in the field of landslide susceptibility that implement and compare data-based models [85].The resulting maps based on the data-based models are sensitive to the training data, and they will change by using a different selection of the inventory dataset.On the one hand, the results from the knowledge-based models may vary if using other experts for the criteria weightings.On the other hand, the resulting natural hazard susceptibility maps based on the data-based models can easily be affected by the training data, and the results will change if using training data from another source.Based on the data presented in Figure 8, although most of the weightings of the sub-criteria based on the AHP and FR models are the same, there are also some significant differences in the first class of the precipitation and distance to roads factors.According to the results based on the AHP, the class of 2500-3275 (mm/yr) of precipitation is the most important in terms of the landslide susceptibility.However, based on the weighting of the FR model, the class of 950-1275 (mm/yr) is the most significant.The weightings of the AHP and FR models for the criteria of elevation and distance to drainage follow a very similar pattern.The weightings of both models show that the impact of the drainage on the landslide susceptibility continuously decreases with increasing distance.
ISPRS Int.J. Geo-Inf.2019, 8, x FOR PEER REVIEW 19 of 24 950-1275 (mm/yr) is the most significant.The weightings of the AHP and FR models for the criteria of elevation and distance to drainage follow a very similar pattern.The weightings of both models show that the impact of the drainage on the landslide susceptibility continuously decreases with increasing distance.

Conclusions
The production of landslide susceptibility maps is the first step in managing a sustainable risk mitigation program in any landslide-prone area.For the entire area affected by the Gorkha earthquake, no previous attempts have been made to produce susceptibility maps using multiple statistical approaches.In this research, we produced landslide susceptibility maps for the entire Gorkha region.As the purpose of LSM is to predict the probability of future landslide occurrence in an area, our aim was to identify which of the statistical techniques (AHP, FR, or hybrid SMCE) is more successful for LSM.For this purpose, we used a total of 23,439 landslides and randomly divided them into two datasets, 70% for training and 30% for validation.The degree of fit and accuracy of the resulting landslide susceptibility maps was validated using two approaches, namely the ROC and R-index.The results of these validation approaches show that although all models were applicable for landslide susceptibility mapping, the hybrid SMCE model (0.910) gives slightly better results for ROC than the AHP (0.902) and FR (0.905) models.The hybrid SMCE resulted in better prediction accuracy and reliability of output landslide susceptibility maps.In summary, these investigations have resulted in the understanding that the integration of knowledge-based and data-based models outperforms the use of these models individually for the generation of landslide susceptibility maps.The FR model slightly outperforms the knowledge-driven AHP model.Reliable and precise susceptibility maps can minimize the costs and damage from natural disasters, such as landslides, if prepared in advance.The output maps can assist decision makers and planners to identify areas that are susceptible to future hazards.It will lead to the mitigation of future damage to infrastructure and human life.

Conclusions
The production of landslide susceptibility maps is the first step in managing a sustainable risk mitigation program in any landslide-prone area.For the entire area affected by the Gorkha earthquake, no previous attempts have been made to produce susceptibility maps using multiple statistical approaches.In this research, we produced landslide susceptibility maps for the entire Gorkha region.As the purpose of LSM is to predict the probability of future landslide occurrence in an area, our aim was to identify which of the statistical techniques (AHP, FR, or hybrid SMCE) is more successful for LSM.For this purpose, we used a total of 23,439 landslides and randomly divided them into two datasets, 70% for training and 30% for validation.The degree of fit and accuracy of the resulting landslide susceptibility maps was validated using two approaches, namely the ROC and R-index.The results of these validation approaches show that although all models were applicable for landslide susceptibility mapping, the hybrid SMCE model (0.910) gives slightly better results for ROC than the AHP (0.902) and FR (0.905) models.The hybrid SMCE resulted in better prediction accuracy and reliability of output landslide susceptibility maps.In summary, these investigations have resulted in the understanding that the integration of knowledge-based and data-based models outperforms the use of these models individually for the generation of landslide susceptibility maps.The FR model slightly outperforms the knowledge-driven AHP model.Reliable and precise susceptibility maps can minimize the costs and damage from natural disasters, such as landslides, if prepared in advance.The output maps can assist decision makers and planners to identify areas that are susceptible to future hazards.It will lead to the mitigation of future damage to infrastructure and human life.

Figure 1 .
Figure 1.Map showing the location of the study area and field photographs: (a) Mailung Khola and (b) camp near Mailung Khola hydropower plant (c) road section near Ramche (d) road section near Syaprubesi.

Figure 1 .
Figure 1.Map showing the location of the study area and field photographs: (a) Mailung Khola and (b) camp near Mailung Khola hydropower plant (c) road section near Ramche (d) road section near Syaprubesi.

Figure 2 .
Figure 2. Landslide inventory map of the study area with the distribution of training and validation datasets.

Figure 2 .
Figure 2. Landslide inventory map of the study area with the distribution of training and validation datasets.

Figure 5 .
Figure 5.An example the performance of the three landslide susceptibility models in a sub-area of the case study area.(A) Frequency ratio; (B) analytical hierarchy process; (C) spatial multi-criteria evaluation.

Figure 5 .
Figure 5.An example the performance of the three landslide susceptibility models in a sub-area of the case study area.(a) Frequency ratio; (b) analytical hierarchy process; (c) spatial multi-criteria evaluation.

Figure 6 .
Figure 6.Receiver operating characteristics (ROC) representing quality method success rate curves for the three methods.

Figure 6 .
Figure 6.Receiver operating characteristics (ROC) representing quality method success rate curves for the three methods.

Figure 7 .
Figure 7. R-index in susceptibility classes for hybrid SMCE, AHP and FR.

Figure 7 .
Figure 7. R-index in susceptibility classes for hybrid SMCE, AHP and FR.

Figure 8 .
Figure 8.The comparison of the resulting weights for the classes of each factor based on the FR and AHP model.

Figure 8 .
Figure 8.The comparison of the resulting weights for the classes of each factor based on the FR and AHP model.

Table 2 .
The pairwise comparison matrix of the hybrid spatial multi-criteria evaluation (SMCE) groups and the resulting weights.

Table 3 .
Weights for groups and factors based on the hybrid SMCE model.

Table 4 .
Pairwise comparison matrix for the analytical hierarchy process (AHP) model.

Table 5 .
The normalized weights for the classes of the factors based on the frequency ratio (FR) model.

Table 6 .
Resulting R-indexes for the landslide susceptibility mappings (LSMs) based on AHP, FR and hybrid SMCE models.