Integrating Landslide Typology with Weighted Frequency Ratio Model for Landslide Susceptibility Mapping: A Case Study from Lanzhou City of Northwestern China

: Although numerous models have been employed to address the issue of landslide susceptibility at regional scale, few have incorporated landslide typology into a model application. Thus, the aim of the present study is to perform landslide susceptibility zonation taking landslide classiﬁcation into account using a data-driven model. The speciﬁc objective is to answer the question: how to select reasonable inﬂuencing factors for different types of landslides so that the accuracy of susceptibility assessment can be improved? The Qilihe District in Lanzhou City of northwestern China was undertaken as the test area, and a total of 12 inﬂuencing factors were set as the predictive variables. An inventory map containing 227 landslides was created ﬁrst, which was divided into shallow landslides and debris ﬂows based on the geological features, distribution, and formation mechanisms. A weighted frequency ratio model was proposed to calculate the landslide susceptibility. The weights of inﬂuencing factors were calculated by the integrated model of logistic regression and fuzzy analytical hierarchy process, whereas the rating among the classes within each factor was obtained by a frequency ratio algorithm. The landslide susceptibility index of each cell was subsequently calculated in GIS environment to create landslide susceptibility maps of different types of landslide. The analysis and assessment process were separately performed for each type of landslide, and the ﬁnal landslide susceptibility map for the entire region was produced by combining them. The results showed that 73.3% of landslide pixels were classiﬁed into “very high” or “high” susceptibility zones, while “very low” or “low” susceptibility zones covered only 3.6% of landslide pixels. The accuracy of the model represented by receiver operating characteristic curve was satisfactory, with a success rate of 70.4%. When the landslide typology was not considered, the accuracy of resulted maps decreased by 1.5~5.4%.


Introduction
Landslides are one of the main causes of human casualties, environmental damages, and economic loss worldwide [1,2]. Hence, concern about landslide risk management and reduction has been increasing within the scientific community [3,4]. Landslide susceptibility assessment (LSA) has been proven as an effective tool to this end, with the goal of identifying the potential location of landslides [5,6]. Numerous models have been employed to create reliable landslide susceptibility maps in the past few decades. An overview of the literature on this topic shows that there are three categories of models, mainly: expert-based models, physically based models, and data-driven models [7][8][9][10]. Among them, the data-driven models are the most widely used models. Not only do they illustrate the nonlinear relationship between landslide occurrence and influencing factors well, but also normally have high prediction accuracies [11,12].
As a rule, there are two main branches within data-driven models, including machine learning models and statistically based models. Both have a basic assumption that future landslides are more likely to occur under the settings that led to historical landslides in a region [11]. Nonetheless, there are still some obvious differences between them. It is complicated to reflect the internal response of landslide mechanisms to indicators by machine learning models, because most of the models are characterized by a "black box" nature [13] and usually bypass the geomorphological explanation [14]. Numerous literature reports regarding commonly used machine learning algorithms have presented this point recently, such as the artificial neural network model [8,15], support vector machine model [16,17], tree-based model [18][19][20], among others. In contrast, it is relatively simple to achieve this goal when it comes to statistical models. For instance, the regression coefficients obtained by generalized additive models (GAMs) can represent the relative importance of predictive variables [21][22][23]. Concerning the application of stepwise variable selection and the Kfold cross-validation method, observations of the relative frequencies of variables are also available to evaluate variable importance in linear regression models [13,24,25]. Moreover, some statistical indices can also be used to rank the contribution of variables to the final results, such as information gain ratio [17] and Gini coefficient [26]. The fuzzy analytical hierarchy process (FAHP) is a multiple criteria decision-making tool that has been widely adopted in the recent past [27,28]. The determination of the fuzzy membership values is the most important step within this process, but most previous studies directly assigned these values based on expert opinion and field work, which is highly personal [29]. As a remedy, this study used logistic regression (LR) modeling to calculate the relative importance of factors, which can help define the fuzzy membership among the factors. On the other hand, statistical models can also be used to calculate the ranking of importance among categories of each factor, which is critical for understanding the impact of different states of the factor on inducing landslides. Many attempts have been made on this topic by researchers using bivariate or multivariate models, such as the weight of evidence model [30], information value model [31], index of entropy model [32], conditional probability model [33], certainty factor method [34], and evidential belief function [35]. In this study, the weights of different classes of a factor were determined by the frequency ratio (FR) model.
As earlier stipulated by Guzzetti et al. [5] and Corominas et al. [36], landslide occurrence depends on different influencing factors, including those that are extrinsic and intrinsic. Hence, it is important to establish a reasonable factor system for landslide susceptibility assessment. This mainly requires users to achieve two aspects: (i) selecting the factors that fit well with the development of landslides in the study area [37] and (ii) assigning appropriate weights/importance for different factors [38]. However, many studies considered all factors to have the same contribution to landslides, thus the same weight was assigned to each factor (e.g., [8,11]). This needs to be addressed.
It is evident that in a region where more than one type of landslide exists, it is necessary to separately assess the susceptibility for each type of landslide, because they may have different spatial incidence associated with distinct threshold conditions of influencing factors [17,39]. For example, to assess the landslide susceptibility, Zêzere [39] divided landslides of the test site into three types (rotational movement, translational, and shallow translational movement) and applied the information value method to both the entire landslide dataset and the three individual landslide types. A similar situation also appeared in the study of Epifânio et al. [40]. The results of both studies highlight an opinion that developing separate models for different types of landslides is essential. However, from the perspective of risk management, they only repeated the model executions by using different input datasets (then compared their accuracies) and did not generate a better overall landslide susceptibility map by considering landslide typology. Marjanović et al. [16] presented how some properties of different terrain pixels can be selected and sampled to perform landslide spatial modeling. Although their resulting susceptibility maps can distinguish the stability of several types of landslides (active and dormant landslides), the techniques adopted are only associated with machine learning approaches. Hence, it is difficult to be replicated by other types of models. A relatively standard procedure for landslide susceptibility assessment considering landslide typology was discussed by Zhou et al. [17]. They analyzed the relationship between landslides and influencing factors, and determined weights of the factors by traditional statistical methods. However, such studies have not been widely applied to other cases, most of which still considered the landslides in the region as a whole dataset.
Compared with previous work, the present study focuses on one specific aspect, i.e., to improve the process of LSA by constructing a reasonable influencing factor system for each type of landslide by statistically based models. To this end, Lanzhou City of northwestern China was selected as the test area and the landslides in the region were classified into two types. Based on the weighted frequency ratio model, the landslide susceptibility for each type of landslide was calculated in GIS, and final landslide susceptibility map for the entire region was produced by their combination. Additionally, the results obtained from the model without landslide typology, and some individual models (LR, FR, and FAHP) were also validated in an attempt to compare the model's performance. These results may help users create more reliable landslide susceptibility maps, and subsequently help decision-makers prepare corresponding strategies for landslide risk mitigation.

Description of the Study Area
The study area encompasses the Qilihe District (35 • 50 25" N, 103 • 36 43" E~36 • 06 09" N, 103 • 54 28" E) within middle-southern Lanzhou City, China, covering an area of 394.5 km 2 ( Figure 1). The area is part of the western Longzhong Loess Plateau, which is highly dissected and lies within the Yellow River valley belt. It is characterized by a parallel ridgeand-valley area and surrounded by a typical hilly and mountainous landscape [41]. The elevation in the region ranges from 1469 to 3056 m above sea level, with high south and low north. The area has a temperate continental climate with a mean temperature of 7.8°C and an average yearly rainfall of more than 327 mm. Meanwhile, a clear temporal difference on rainfall pattern can be seen: the dry season extends from November to March whereas the rainy season is between May and September, which normally yields a maximum total precipitation in August [41].
Two geological units can be observed in the region, namely the Lajishan and the Middle of Qilian strata. Both units have subhorizontal layering and several sets of subunits vertically. The exposed sedimentary strata include the Middle Ordovician (O 2 ), Sinian (Z), Upper Triassic (T 3 ), Lower Jurassic (J 1 ), Lower Cretaceous (K 1 ) and Quaternary (Q 1 , Q 2 , Q 3 , Q 4 ). The most important difference among these strata is the lithology, of which the Gaolan Formation rock mass is the most common in the area. It has been subjected to the composite action of multistage regional metamorphism, magmatic activity, and geological structure. The lithologies are mainly schist, gneiss, variegated conglomerate, and quartzite. The other typical stratum is the Quaternary that can be represented by argillaceous pebble in the Qilihe faulted basins and by extensive loess deposits, which cover the bedrock with the thickness of more than 5 m in most situations. Most faults are distributed in the southern part of the region, extending from northwest to southeast, but the scales are moderate or small. the scales are moderate or small.
The resident population in the area is approximately 5.78 × 10 5 inhabitants and the settlements are mainly distributed on the banks of rivers and within valley areas, especially the northern and southeastern parts. With a population density of more than 1300/km 2 , this part of the district is highly populated and urbanized, where many engineering activities have been performed, such as infrastructure construction and farming [42].

Landslide Inventory
Qilihe District spreads over a mixture of hilly and mountain landscape. Combined with the seasonal rainfall regime, interesting dynamics and plenty of landslides have been recorded in the region. The depths of these landslides vary from approximately 1 m to more than 20 m. Most of the landslides occurred in the loess deposits. Showing the spatial locations of these landslides in a GIS platform and creating an accurate database are important for LSA [43]. However, detailed detection of landslide extent is a rather complicated and challenging task. In this study, therefore, the landslide inventory was created according to an extensive field survey, earlier geotechnical reports, and previous literature [41]. Some literature [25,37] was referred to when we prepared the inventory to include some necessary information. The geotechnical reports were internal documents from the Lanzhou Institute of Geological Environment Monitoring in 2015. The details of every landslide were reported, such as area, volume, and type. It should be stated that landslides in these reports included the landslides that caused losses during 1996-2015. Landslides without losses are not included in the inventory; this is because there are very few and it is difficult to record when they occurred a long time ago. The field survey was conducted in 2017 and 2018, and we cross-checked the landslide information with that in the reports. The type of each landslide was confirmed according to protocol reported in standard literature [44,45].
Finally, a total of 227 landslides were recorded in the inventory, with an area of approximately 1.4 × 10 6 m 2 , accounting for 0.35% of the whole study area. Landslide volume The resident population in the area is approximately 5.78 × 10 5 inhabitants and the settlements are mainly distributed on the banks of rivers and within valley areas, especially the northern and southeastern parts. With a population density of more than 1300/km 2 , this part of the district is highly populated and urbanized, where many engineering activities have been performed, such as infrastructure construction and farming [42].

Landslide Inventory
Qilihe District spreads over a mixture of hilly and mountain landscape. Combined with the seasonal rainfall regime, interesting dynamics and plenty of landslides have been recorded in the region. The depths of these landslides vary from approximately 1 m to more than 20 m. Most of the landslides occurred in the loess deposits. Showing the spatial locations of these landslides in a GIS platform and creating an accurate database are important for LSA [43]. However, detailed detection of landslide extent is a rather complicated and challenging task. In this study, therefore, the landslide inventory was created according to an extensive field survey, earlier geotechnical reports, and previous literature [41]. Some literature [25,37] was referred to when we prepared the inventory to include some necessary information. The geotechnical reports were internal documents from the Lanzhou Institute of Geological Environment Monitoring in 2015. The details of every landslide were reported, such as area, volume, and type. It should be stated that landslides in these reports included the landslides that caused losses during 1996-2015. Landslides without losses are not included in the inventory; this is because there are very few and it is difficult to record when they occurred a long time ago. The field survey was conducted in 2017 and 2018, and we cross-checked the landslide information with that in the reports. The type of each landslide was confirmed according to protocol reported in standard literature [44,45].
Finally, a total of 227 landslides were recorded in the inventory, with an area of approximately 1.4 × 10 6 m 2 , accounting for 0.35% of the whole study area. Landslide volume varies greatly, with sizes ranging from 50 m 3 to approximately 1 × 10 6 m 3 . They can be roughly divided into three categories ( small amount of rock slides or earth slides; and (iii) debris flows or earth flows in gulley channels. Additionally, there are also three large deep-seated landslides in the study area. However, the number is too few to fit with the sampling standard of statistical methods. Hence, they were not considered in the analysis. According to the depth of sliding surfaces, landslides in categories (i) and (ii) are shallow landslides, whereas category (iii) generally does not have evident sliding surfaces. From the perspective of movement type, most landslides in categories (i) and (iii) moved in the horizonal direction along the sliding surfaces or channels, whereas landslides of category (ii) had a larger vertical movement distance. Landslides of category (ii) and (iii) normally occurred during a short time so their impact forces were larger. However, because the number of buildings and people in their runout paths were small, the direct losses caused by these two categories of landslides were not very large. Some landslides of category (i) were characterized as creep deformation, which might show sudden acceleration behavior, particularly during the rainy season. Although the infrastructure affected by them was not totally destroyed, the increasing deformation of these landslides might pose a considerable risk to residents and properties. amount of rock slides or earth slides; and (iii) debris flows or earth flows in gulley channels. Additionally, there are also three large deep-seated landslides in the study area. However, the number is too few to fit with the sampling standard of statistical methods. Hence, they were not considered in the analysis. According to the depth of sliding surfaces, landslides in categories (i) and (ii) are shallow landslides, whereas category (iii) generally does not have evident sliding surfaces. From the perspective of movement type, most landslides in categories (i) and (iii) moved in the horizonal direction along the sliding surfaces or channels, whereas landslides of category (ii) had a larger vertical movement distance. Landslides of category (ii) and (iii) normally occurred during a short time so their impact forces were larger. However, because the number of buildings and people in their runout paths were small, the direct losses caused by these two categories of landslides were not very large. Some landslides of category (i) were characterized as creep deformation, which might show sudden acceleration behavior, particularly during the rainy season. Although the infrastructure affected by them was not totally destroyed, the increasing deformation of these landslides might pose a considerable risk to residents and properties.
During the next stage, the spatial distribution of these landslides was plotted on a 1:50,000 map with 12.5 m resolution in GIS. As seen in Figure 3a, this study shows landslide locations in the form of point shapefile in GIS. However, the polygon-based file of landslides is also available, which indicates the extent of each landslide. In addition, the detailed properties of individual landslides were linked and recorded in the database, such as the area, volume, occurrence time, and associated damage.  During the next stage, the spatial distribution of these landslides was plotted on a 1:50,000 map with 12.5 m resolution in GIS. As seen in Figure 3a, this study shows landslide locations in the form of point shapefile in GIS. However, the polygon-based file of landslides is also available, which indicates the extent of each landslide. In addition, the detailed properties of individual landslides were linked and recorded in the database, such as the area, volume, occurrence time, and associated damage.

Landslide Typology
The main objective of this section is to justify why and how we divided the landslides in the inventory into different types. We analyzed the spatial distribution of landslides and clarified the difference of characteristics between different landslide types. According to this, we summarized the schematic models for different landslide types. Additionally, this analysis helps to understand the roles of factors in different landslide types.

Analysis of Landslide Density
During this step, all of the landslides were projected onto the DEM according to their coordinates. The kernel density analysis tool in GIS was used to obtain the spatial

Landslide Typology
The main objective of this section is to justify why and how we divided the landslides in the inventory into different types. We analyzed the spatial distribution of landslides and clarified the difference of characteristics between different landslide types. According to this, we summarized the schematic models for different landslide types. Additionally, this analysis helps to understand the roles of factors in different landslide types.

Analysis of Landslide Density
During this step, all of the landslides were projected onto the DEM according to their coordinates. The kernel density analysis tool in GIS was used to obtain the spatial density distribution of these points. As seen in Figure 3a, there are three areas in Qilihe District where landslides are densely distributed: (i) The northern part of the area-the Qinglan gulley and Leitan River are located in this area. On one hand, the long channel of the gulley provides a runout path for landslides [46]. On the other hand, the erosion of riverbanks also affects the slope stability [47]. (ii) The middle-western part of the area-as the largest gulley of the study area, Huangyu gulley is located here (Figure 3b), and it is one of representative areas prone to debris flows in the Qilihe District. (iii) the southeastern part of the area-this lies near Agan town, which is a highly populated region.
From the perspective of geological conditions and triggering factors, the study area has dozens of gullies and many have a long and steep channel. For instance, Huangyu gulley has the channel with a length of more than 3 km, and the elevation difference between the top and the bottom points reaches hundreds of meters. The loess and other Quaternary deposits in the mountain can be easily destabilized during the heavy rainfall and then form debris flows that move along the channel. In the southeastern part of the area, geological structures, particularly faults, are relatively concentrated. Most are parallel and have a direction from NW to SE. Historically, frequent tectonic movements make the rock-soil masses here relatively weak. Moreover, the strata angles are large at both sides of the faults. Nearby Agan town, there is a coal mine that has operated for more than ten years and greatly affects stability conditions of slopes. Meanwhile, a national road also passes this area. The frequent engineering activities-building construction, roads and mining activities-have caused many shallow landslides in the area. Moreover, Agan town is located in a gulley that also has a large elevation difference and steep slopes, so many debris flows have also occurred in this area, similar to the Huangyu gulley. Hence, the combined influences of internal geological conditions and external triggers make the Agan town area the most susceptible to landslides.
According to remote sensing images, the difference of geomorphology in different parts of Qilihe District can be observed. There are many gullies with different scales in the study area that extend from the south to north. As seen in Figure 4a,b, the cutting effect of gullies creates channels and steep slopes which are prone to debris flows. However, in the northern and eastern parts of the Qilihe District (Figure 4c), the topography is relatively gentle, and typical gullies and channels are very few. Roads and settlements are mainly located in this region; thus, external conditions are positive for shallow landslides.

Landslide Typology and Schematic Model
As mentioned above, according to topography conditions, triggering factors, and movement characteristics, the landslides in Qilihe District were divided into two types: shallow landslides and debris flows. Using the archived reports, pictures of the sites, and remote sensing images, the reclassification for every landslide was performed. As seen in Figure 5, the typical schematic models for two types of landslides in the study area are summarized. Although the materials composing of the sliding body may vary, the initiation of landslides is considered to be mainly controlled by the morphology. An important reason for this classification is the landslide mechanism, which may lead to different responses of landslides to the same influencing factor. Hence, it is not reasonable to use just one set of influencing factors for the susceptibility analysis of all the landslides. According to this rule, two types of landslides are shown in Figure 6. In total, there are 141 shallow landslides and 86 debris flows.
parts of Qilihe District can be observed. There are many gullies with different scales in the study area that extend from the south to north. As seen in Figure 4a,b, the cutting effect of gullies creates channels and steep slopes which are prone to debris flows. However, in the northern and eastern parts of the Qilihe District (Figure 4c), the topography is relatively gentle, and typical gullies and channels are very few. Roads and settlements are mainly located in this region; thus, external conditions are positive for shallow landslides.

Landslide Typology and Schematic Model
As mentioned above, according to topography conditions, triggering factors, and movement characteristics, the landslides in Qilihe District were divided into two types: shallow landslides and debris flows. Using the archived reports, pictures of the sites, and remote sensing images, the reclassification for every landslide was performed. As seen in Figure 5, the typical schematic models for two types of landslides in the study area are summarized. Although the materials composing of the sliding body may vary, the initiation of landslides is considered to be mainly controlled by the morphology. An important reason for this classification is the landslide mechanism, which may lead to different responses of landslides to the same influencing factor. Hence, it is not reasonable

Weighted Frequency Ratio Model
Landslide susceptibility can be obtained by the weight of factors and rating of categories. As Lee et al. [38] reported, the weight shows the relative importance among factors (e.g., slope and lithology) whereas the rating represents the relative importance among categories of one factor (e.g., slope of 0~10° and 10~20°). In this study, the rating was determined by an index called frequency ratio and the weight of factors was calcu-

Weighted Frequency Ratio Model
Landslide susceptibility can be obtained by the weight of factors and rating of categories. As Lee et al. [38] reported, the weight shows the relative importance among factors (e.g., slope and lithology) whereas the rating represents the relative importance among categories of one factor (e.g., slope of 0~10 • and 10~20 • ). In this study, the rating was determined by an index called frequency ratio and the weight of factors was calculated by a logistic regression model combined with a fuzzy analytical hierarchy process (FAHP) model.

Frequency Ratio Method
Frequency ratio is a kind of statistical index that can show the nonlinear relationship between landslide distribution and external factors. Generally, the statistical models are conducted based on the assumption that future landslides will occur under the similar/same conditions as past landslides [11,13]. The frequency ratio (FR) can be obtained as follows: where i is the ith category of one factor, L i is the area of landslides within this category, TL is the total area of landslides in the entire study area, S i is the area of ith category and TS is the total area of the factor (same with the total area of the study area). If FR is more than 1, the ith category is positive for landslide occurrence when compared to the average level of all categories for this factor. The larger the value of FR, the contribution given by this category is larger.

Calculation of Weight of Factors
An analytic hierarchy process (AHP), recognized as a useful technique for multiple criteria decision-making, has been widely employed in GIS-based susceptibility analysis. However, from the detailed descriptions on AHP in previous literature [29,48,49], the model semiquantitatively determines the weights of factors by comparing them in pairs, which is highly subjective. Hence, in this study fuzzy theory was integrated with AHP (called FAHP) that allows users to change the pairwise comparisons to fuzzy numbers in the judgment matrix [50,51]. The fuzzy judgment matrixÃ can be expressed as follows: where n is the total number of factors.ã ij is the fuzzy comparison value of ith factor compared with jth factor (i = 1, 2, . . . n; j = 1, 2, . . . n). If the ith factor is relatively important,ã ij is set as 1 whereasã ij is 0 when the jth factor is more important than the ith factor. If both are of equal importance, the fuzzy comparison value is 0.5. The fuzzy geometric mean of every factor (r i ) can be determined according to the geometric mean technique as follows:r The matrixÃ can be converted into fuzzy consistent matrixẼ =ẽ ij (n × n) as follows: whereẽ ij is the element in the matrixẼ,r i andr j are fuzzy geometric means of ith and jth factor, respectively. The matrixẼ can be used to define the initial values of weightsW (0) : whereW (0) is the matrix composed of the initial values of weights (w 1 ,w 2 , . . .w n ). This matrix can also act as the initial vector of characteristic valueH 0 = (h 01 ,h 02 , . . . ,h 0n ) T .
Then an iterative method can be adopted to obtain the characteristic vectorH k as follows: When the difference ofH k andH k−1 is less than ε (the accuracy determined by users), the vector obtained from the normalization ofH k can be considered as the weight of factors.
As seen from the process mentioned above, a key step in the FAHP model is the determination of fuzzy number in the fuzzy matrix. However, the judgment of relative importance among factors is still necessary for this step, which is generally conducted by the expert-based method.
In this study, a logistic regression model was employed to rank the importance of factors quantitatively to remedy this limitation. The logistic regression model considers landslide occurrences as a bivariate variable, i.e., only two values are used to express a landslide-related event: 0 and 1 represent the landslide presence and absence, respectively. If the influencing factors are set as independent variables and the landslide event is the dependent variable, the relationship between them can be given as: where Y is the landslide event, X are the influencing factors, X nj represents the jth category of nth factor. The values from a 0 to a n are regression coefficients of these factors. Because a larger/smaller coefficient can lead to a value closer to Y = 1/0, these coefficients can be considered as an index showing the importance of a factor [21]. It should be noted that a factor can have both negative and positive effects on landslide occurrence, thus the importance is determined by the absolute values of coefficients and not the actual values.

Analysis of Influencing Factors
Multiple factors may influence the landslide occurrence and they are often interconnected [52]. Therefore, determination of influencing factors is an essential step for landslide susceptibility mapping because researchers need to extract principal information that can represent landslide mechanisms. According to previous work and expert knowledge [8,53,54], a total of 12 influencing factors are utilized as preliminary inputs in the present analysis, which can be divided into five categories: (i) geomorphological factors, including elevation, slope, aspect, curvature, relief degree of land surface (RDLS), and topographic position index (TPI), (ii) the geological factor, lithology, (iii) hydrological factors, including stream power index (SPI), topographic wetness index (TWI), and specific catchment area (SCA), (iv) land use, which is an environmental factor that can be anthropically modified, and (v) soil. It should be noted that some factors have been widely accepted by scientific communities in this topic (e.g., slope and lithology) while some are not commonly used in this topic, such as RDLS and SCA. However, they are truly related to some aspects of landslides, such as mechanical and hydrological processes. Additionally, some factors are considered more as triggering factors of landslides instead of influencing factors because they generally have evident temporal variability. For example, both rainfall and human activities have impacts on landslides, but they can change much dynamically even during short periods. Therefore, they influence the temporal distribution of landslides more than spatial distribution [55,56], which excludes them from the database.
The multicollinearity analysis of factors was carried out before the calculation [57]. An ArcGIS tool named "Band Collection Statistics" was applied to do this. The results showed that the correlation between every two factors is not larger than 0.4. Hence, the collinearity among factors is considered appropriate. These selected factors involve both continuous and categorical data, of which the former may increase the computational amount much and subsequently lead to complex data processing. To remedy this, these factors were discretized into several categories with the same attribute interval (Figure 7). A review of the literature [15,20,58,59] showed that an attribute number of 5~9 is reasonable because it not only simplifies the calculation but also leads a moderate division of continuous data. The main reasons to select these factors and descriptions regarding reclassification are clarified below.

Elevation
Elevation normally represents the potential energy of a slope and it also impacts the environmental conditions on slopes such as human activity, vegetation, and climate. The digital elevation model (DEM) of the study area was obtained from free open-source data (http://www.gscloud.cn/ accessed on 25 October 2020). The resolution selected was 12.5 m because its ability in quantitative analysis of geomorphological processes has been proven [60]. The elevation in the area varies from 1469 m to 3056 m a.s.l. and was classified into eight classes with an interval of 200 m (Figure 7a).

Slope
Slope reflects the steepness at each grid surface, which contributes much to slope stability. The slope map was prepared from the DEM in GIS, which shows a range from 0 to 74 • ; 10 • was used as the interval [61] and a total of six classes were obtained (Figure 7b).

Aspect
Aspect represents the direction that a slope faces. This factor causes differences in microclimate (e.g., temperature and sun exposure) and affects land use on slopes. The aspect map (Figure 7c) was created from the DEM map. Eight directions (i.e., north, northeast, east, southeast, south, southwest, west, and northwest) were determined based on aspect values, in addition to flat area (the value is −1).

Curvature
The curvature controls the flow across a surface, thus affecting the erosion and deposition. The standard curvature was used in this study, which considers both the plan and profile curvatures [62]; thus, the whole process of flow can be understood accurately. The curvature values in the area range from -25 to 24 and divided into eight classes [63,64], among which most cells distribute in the range from −5 to 5 (Figure 7d).

Relief Degree of Land Surface
RDLS is an indicator to describe the topographic characteristics of the region, which can be obtained by calculating the maximum difference in height per unit area [65]. The map created shows that RDLS values within the study area vary from 0 to 169, which were divided into eight classes (Figure 7e).

Topographic Position Index
TPI has an impact on landslide susceptibility because many physical processes on the slope are highly associated with topographic position [66,67]. The map was created using the Land Facet Corridor Designer (LFCD) (http://www.jennessent.com/arcgis/land_ facets.htm accessed on 26 October 2020) extension in ArcGIS 10. There are six classes in this factor: valley, lower slope, gentle slope, steep slope, upper slope, and ridge (Figure 7f).

Soil
The soil herein mainly represents the materials overlying bedrock, which also makes up the sliding bodies of landslides. Geotechnical parameters are quite different when soil type varies, which can greatly affect the slope stability. Because the barren area (e.g., bare rock area) are limited in the Loess Plateau, most areas are covered by various sediments. Obtained from the Resource and Environment Science and Data Center (RESDC) (http: //www.resdc.cn/Default.aspx accessed on 26 October 2020) of China with a 1:1,000,000 scale, the soil data were reclassified into nine types (chestnut soil, sandy soil, sierozem, cinnamon soil, calcareous soil, alfisols, colluvium, loess, limed soil) in the study area, among which loess and colluvium are the most widely distributed (Figure 7g).

Lithology
The lithology can represent the geomechanical and hydraulic properties of the bedrock and subsequently affects the soil coverage [37]. Hence, it is accepted that the lithology is one of the most representative factors for landslide occurrence. This factor was obtained from geology maps (scale 1:100,000) provided by Lanzhou Institute of Geological Environment Monitoring. The geological units were divided into nine categories based on their formation ages (Figure 7h

Land Use
Land use affects the root cohesions and hydrological process on the landscape, and also indirectly explains the human influences on the hillslopes [68]. A land use map of 2018 downloaded from RESDC (http://www.resdc.cn/Default.aspx accessed on 28 October 2020) including six classes (grassland, farmland, urban area, forest, water, and bare rock) was created in 1:100,000 scale (Figure 7i).

Stream Power Index
Runoff is one of the most important hydrological processes during the landslide events, among which some factors are key indicators for runoff models. SPI can measure where DA is the upstream drainage area at a given cell and G is the slope value at this cell. This map was generated by the raster calculator tool in ArcGIS 10 ( Figure 7j).

Topographic Wetness Index
TWI (Figure 7k) considers both slope and local upslope contributing area, which can be used to quantify the topographic attributes of hydrological processes [69]. The equation to calculate this index is expressed as: where a is the upslope area draining from a specific cell and tan β is the slope at this cell.

Specific Catchment Area
SCA is defined as the upstream catchment area of a given unit [70], which is a parameter commonly used for hydrology and soil erosion modeling (Figure 7l). This factor reflects the area draining to the outlet of the catchment; thus, it can be determined as follows: where the CA is the upstream area of a contour segment and CL is the length of the segment. During the next stage, these factors were digitalized to corresponding thematic maps that were in raster format with 12.5 m-resolution cells. We used the band collection statistics tool in GIS to calculate the interrelationship among factors and the results showed that the correlation coefficient between every two factors was less than 0.4. This means these factors are independent variables when shown together.

Procedure of Landslide Susceptibility Modeling
In this step, all landslide polygons were transformed to cells for the subsequent analysis of influencing factors. The entire area is characterized by a maximum row number of 2579 and column number of 2395 cells. The total number of cells is 2,599,948, among which 8978 cells cover landslides, including 6528 cells for shallow landslides and 2450 cells for debris flows. In the ArcGIS database, all influencing factors (including detailed values, category numbers and frequency ratios) were linked to the landslide inventory for application in the weighted frequency ratio model. The detailed procedure of this analysis is shown in Figure 8. These steps are described as follows: (i) All of 8978 cells covering landslides were converted into a point shapefile. The same number of cells that do not cover landslides were randomly selected over the whole area. Eighty percent of the data (5222 cells covering shallow landslides and 1960 cells covering debris flows) were set as the training dataset, whereas the remaining 20% of the data (1306 cells covering shallow landslides and 490 cells covering debris flows) were the testing data. Similar with the frequency ratio analysis in Section 3.1, the preparation of the dataset was also individually conducted for each type of landslide, not for the entire landslide inventory. For a specific cell, a total of 13 attribute values were contained, i.e., 12 influencing factors and one set of bivariate data that represented the landslide presence/absence (0: absence; 1: presence). The influencing factors were set as input data whereas the bivariate data were the output data. Hence, the datasets used for modeling are shown in Table 1. analysis of influencing factors. The entire area is characterized by a maximum row number of 2579 and column number of 2395 cells. The total number of cells is 2,599,948, among which 8978 cells cover landslides, including 6528 cells for shallow landslides and 2450 cells for debris flows. In the ArcGIS database, all influencing factors (including detailed values, category numbers and frequency ratios) were linked to the landslide inventory for application in the weighted frequency ratio model. The detailed procedure of this analysis is shown in Figure 8. These steps are described as follows: (i) All of 8978 cells covering landslides were converted into a point shapefile. The same number of cells that do not cover landslides were randomly selected over the whole area. Eighty percent of the data (5222 cells covering shallow landslides and 1960 cells covering debris flows) were set as the training dataset, whereas the remaining 20% of the data (1306 cells covering shallow landslides and 490 cells covering debris flows) were the testing data. Similar with the frequency ratio analysis in Section 3.1, the preparation of the dataset was also individually conducted for each type of landslide, not for the entire landslide inventory. For a specific cell, a total of 13 attribute values were contained, i.e., 12 influencing factors and one set of bivariate data that represented the landslide presence/absence (0: absence; 1: presence). The influencing factors were set as input data  (ii) All of the training datasets were converted from a database in ArcGIS to numeric matrices and imported into SPSS Modeler software where the logistic regression method was employed to fit the equation of landslide occurrence.
(iii) The regression coefficients for every influencing factor obtained from step (ii) were considered as the relative importance. A fuzzy matrix was determined according to Equation (2) and subsequently used in the FAHP model. The model was performed in Stata software where a self-written program was applied and the weight of every influencing factor was determined from this step.
(iv) The steps noted above determined two kinds of weights: one is the weight of factors and the other is the rating of categories of each factor. Hence, the landslide susceptibility index (LSI) at one cell can be calculated as follows: where LSI is the landslide susceptibility index at one cell; W i is the weight of ith factor at this cell; and R ij is the rating of jth category of the ith factor at this cell, which is expressed as the frequency ratio in this study. The values of LSI at all cells were ranked from high to low and divided into five intervals on average (i.e., equal interval method for classification), namely five landslide susceptibility levels: very high, high, middle, low, and very low. To obtain the susceptibility map for the entire inventory, the LSIs of two types of landslides were added, and the same classification method was used for susceptibility zonation.
(v) The validation of the model was determined by considering the area of each susceptibility level being different, not the landslide number or density, but the relative ratio of landslides (R L ) was counted to check the model performance. This index was defined as follows: where S i is the area of one susceptibility level and TS is the total area of the study area; A Li is the area of landslides in this susceptibility level; and TA L is the total area of the landslides in the study area. Hence, this index inherently reflects the the ratio between the area proportion of one susceptibility level in the whole area and the proportion of landslides in the entire landslide inventory. If a model contains more historical landslides in less area, it has a higher value of R L , which indicates better performance.
Additionally, the receiver operating characteristic (ROC) curve, which has been widely used in previous studies, was used to compare the performance of different models. The accuracy can be evaluated according to the area under the curve (AUC); the larger the AUC value, the greater the model accuracy.

Weights of Influencing Factors
According to Table 2, a same category may have different impacts on the two types of landslides. Taking the slope as an example, 10~20 • slope value is positive for shallow landslide occurrence because the frequency ratio is more than 1.1. Although the range from 40 • to 50 • also has a large frequency ratio, our statistics showed the number of landslides in this range was small. This leads us to conclude that the lower slope values have a positive impact on this type of landslide. Although this does not fit with some previous studies that revealed medium slopes (e.g., 40 •~6 0 • ) may contribute the most to landslides in hilly areas [68,71], it reflects well the distribution features of shallow landslides in this study area. For debris flows, most are distributed in areas with slopes of 30 •~4 0 • (frequency ratio 1.665), 40 •~5 0 • (frequency ratio 2.557), or 50 •~6 0 • (frequency ratio 3.528), which is different from the case of shallow landslides.
Regarding the RDLS, shallow landslides have the largest FR value at 20~40 (1.817) and 80~100 (1.601). Although the value of RDLS > 140 is also high, very few landslides occur in this region. This is different from that of debris flows. Most debris flows occur in the region with RDLS more than 80. This indicates that shallow landslides mainly occur in the region that is relatively flat, whereas debris flows mainly occur in variable terrain.
Regarding the aspect, shallow landslides mainly occur in the area with an eastern direction, whereas debris flows mainly occur in areas that have a southwest and west aspect. Regarding the curvature, shallow landslides have larger FR values when the curvature is at −5~−3.4 and 5~25, whereas debris flows mainly occur where curvature is less than −3.4 or larger than 5.
Some similarities can also be observed on these two types of landslides. In the area with low elevations, both types have evidently larger frequency ratios than the area with higher elevations. Regarding the TPI, both types mainly occur in steep and upper slopes, which are normal situations.
Overall, from the frequency ratio of the different classes of factors, different conditions (mainly geomorphological) in which the two landslide types appear can be observed.
In addition to geomorphological factors, the impacts of other conditions can be seen. Taking the soil type as an example: the frequency ratio of loess and limed soil on shallow landslides are 1.862 and 1.306, thus indicating the these are positive conditions for shallow landslides. The frequency ratios of the other types are all less than 1. On the contrary, the frequency ratio on debris flows given by limed soil is rather low. This is also the case regarding lithology: the Lower Jurassic (J 1 ) and Sinian (Z) lithologies are negative for the occurrence of shallow landslides, while both lithologies are positive for debris flows.
Such evident differences showed that the landslide typology is a major reason influencing the nonlinear relationship between factor and landslide development and occurrence. In contrast, some factors have similar influence on both types of landslides. For instance, the SCA ranging from 100 m 2 to 10,000 m 2 has the greatest influence on both types while a SCA greater than 10,000 m 2 is negative for landslide occurrence. This is mainly because flowing water mostly converts into runoff on slope surfaces rather than infiltration flow inside slopes when the upstream area is too large or under extreme rainfall events. Moreover, the depths of many landslides in the area are small; thus, the slopes may be already saturated when the flowing water increases significantly.
The result from the logistic regression model follows: where Y 1 represents the shallow landslides and Y 2 is the debris flows. X 1~X12 are elevation, slope, aspect, curvature, RDLS, TPI, soil, lithology, land use, SPI, TWI, and SCA, respectively. The j means jth category within a factor occurring at the calculated cell. According to Equation (13), the weight of every influencing factor calculated by FAHP model is shown in Table 3. It can be found that the weight of a same factor varies with the landslide typology. This illustrates that different types of landslides in an area are conditioned in different ways by the same set of influencing factors. The most relevant influencing factors to shallow landslides are soil, RDLS, and lithology, while those for debris flows are soil, land use, and elevation. There are two possible reasons why soil is important for both landslide types: one is because geotechnical properties vary with soil types, which can affect slope stability significantly; the other aspect is that this factor has impact on runoff processes and soil moisture on slopes. Shallow landslides can be easily triggered by heavy rainfall or human engineering activities when the soil properties (especially cohesion and friction angle) are weak. For debris flows, they are often related to the failure of a colluvial soil layer overlying the bedrock on hillslopes [46]. Most debris flows in the study area occur in the colluvium and cinnamon soil layers, which agrees well with the findings from some other regions [31,72]. It is evident that the factor contributes much when landslides mainly distribute on a certain category within a factor. From the perspective of spatial location, most shallow landslides are located along the road lines or coal mines; thus, easily excavated lithology is also relatively important.
For debris flows, many are located on medium-height (1700~2300 m) hillslopes with the frequency ratio more than 0.8, where sufficient potential energy and material sources are present. On the contrary, the high mountain areas (2300~3100 m) have very few landslides and their frequency ratio (less than 0.2) is evidently lower than that of medium-height mountains. Hence, the distribution of debris flows vary with elevation by a large level and this factor shows great importance. Regarding land use, the result on its importance is similar to some studies in other mountainous settings, which concluded that the initiation and flow dynamics (e.g., infiltration and runoff) during this type of landslide event have a strong relationship with vegetation types on the slope. For instance, Shu et al. [68] analyzed the landslide density in a debris-flow-prone area of the Pyrenees Mountains, and found that the land use types and land cover change have a great impact on landslide susceptibility. However, most shallow landslides in the study area were triggered by human activities, so the number of shallow landslides involved in farmland and urban area are more than that of debris flows. The computed result on frequency ratio ( Table 2) agrees well with this point; the frequency ratio of farmland for shallow landslides reaches 0.734 while it decreases to 0.008 (almost no landslides in this category) for debris flows. Therefore, the concentrated distribution of landslides leads to a relatively high weight of this factor.

Landslide Susceptibility Zonation
The landslide susceptibility index at every cell was calculated according to Equation (11) taking into account the two landslide groups separately. The total LSI exhibited similar ranges for different types of landslides: from 0.385 to 1.865 for shallow landslides and from 0.165 to 1.919 for debris flows. The results generated for debris flows had a relatively greater range of LSI, thus indicating a larger contrast between high and low susceptibility zones. This point is confirmed by the landslide susceptibility mapping shown in Figure 9. The zones with very high or high debris flow susceptibilities mostly distributed in the central-eastern part of the study area, most of which were covered by the area with medium elevation and high slope angle. The northern and southwestern parts mostly have very low or low susceptibilities (Figure 9a). This fits with the spatial distribution of debris flows. In fact, the loess area is mainly located in the central part of the study area whereas the southern and northern areas are mostly covered by colluvium, limed soil, and sandy soil. In the area with high slope angles, the collapsibility and high porosity of loess can easily trigger landslides under heavy rainfall conditions, and relative examples have been reported by previous studies [73,74]. The frequency ratio of loess is 1.495 (Table 2), much higher than other categories. Hence, the strong impact of geological factors associated with soil type on the occurrence of this type of landslide is verified. In addition, given that movement characteristics of debris flows, the external topographical factors are also key to controlling the distribution of debris flows; medium elevation and relatively steep slopes provide potential energy for the occurrences, and the channels in such mountainous area provide possible flow paths.
For shallow landslides in the study area, the resulting LSIs are relatively low and the total area of high susceptibility is small (Figure 9b). The zones with very high or high susceptibilities are mainly distributed in the northeastern part of the area, which is the downtown Qilihe District. Another point obviously different from the susceptibility map of debris flow is that the southeastern part of the study area is also high susceptibility. This is mainly because a coal mine is located in this area and the mining activities lasted for decades. Moreover, the national road was also constructed in this area. Hence, these areas are highly populated and with well-developed infrastructure. On one hand, engineering activities (e.g., undercutting of slopes) can result in many steep slopes that may cause precedent deformation of slopes with stress redistribution in soils [75], which are positive for shallow landslides. Furthermore, extensive underground coal mining created much instability (e.g., unstable foundation, surface subsidence, and cracks) and poses a high risk for shallow landslides. Hence, at both sides of the mine, a series of landslides were triggered. Remote Sens. 2021, 13, x 25 of 35

Model Validation and Comparison
According to Equation (12), Figure 10a presents the values of index RL in each susceptibility level for the two landslide typologies. Both typologies show evidence and the same tendency: RL gradually increase when the susceptibility level changed from very low to very high. Especially for the debris flows, the landslide ratio is 0 in the very low susceptibility level while it is more than 6 in the very high susceptibility level. For shallow landslides, all the RL values in the area with very low, low, and middle are less than 1, while the values in the high and very high susceptibility zones are 3.3 and 4.6, respectively. By combining the LSIs from the two types of landslides, similar results of RL for the entire landslide inventory in the area are observed: the RL in high and very high susceptibility zones are 2.1 and 4.5, respectively, which are evidently larger than that of the other susceptibility levels. This difference in RL shows the concentrated distribution of landslides in the high susceptibility area. The number of landslide pixels in each suscep- In the last step, the LSIs of two maps were added by using the raster calculator tool in GIS and the landslide susceptibility map for the entire landslide inventory was obtained (Figure 9c). It can be seen that this map combines the characteristics of the two maps above: the zones with very high and high landslide susceptibilities mainly covered the zones near the urban, roads and mining area. The soil type is mostly loess. In the conditions negative for landslides, such as forested area, limed soil, and the area with low slope angles, the susceptibility level evidently decreases. Compared with the landslide points density map (Figure 3a), nearly all of the densely distributed areas of landslides are characterized by very high or high landslide susceptibilities. However, it should be noted that several landslides are located at the zones with very low or low susceptibilities, which are mainly distributed in the southwest part of the study area. Considering that this zone is also recognized as having very low or low susceptibility levels in individual maps (Figure 9a,b), such errors might exist at the beginning and propagated into the final map containing the entire landslide inventory. The northern part is another region with low susceptibility. This is mainly because nearly no historical landslides occurred in this region, so the statistical model identified it as nonprone area for landslides. Furthermore, the elevation in this region is low and slope is flat, which are negative for landsliding.
In conclusion, the susceptibility maps generated for different landslide typologies exhibit differences related to the spatial distribution of susceptibility levels, thus indicating different influences of the same set of input factors. Moreover, the spatial distribution of each type of landslides agrees well with the corresponding higher landslide susceptibility levels.

Model Validation and Comparison
According to Equation (12), Figure 10a presents the values of index R L in each susceptibility level for the two landslide typologies. Both typologies show evidence and the same tendency: R L gradually increase when the susceptibility level changed from very low to very high. Especially for the debris flows, the landslide ratio is 0 in the very low susceptibility level while it is more than 6 in the very high susceptibility level. For shallow landslides, all the R L values in the area with very low, low, and middle are less than 1, while the values in the high and very high susceptibility zones are 3.3 and 4.6, respectively. By combining the LSIs from the two types of landslides, similar results of R L for the entire landslide inventory in the area are observed: the R L in high and very high susceptibility zones are 2.1 and 4.5, respectively, which are evidently larger than that of the other susceptibility levels. This difference in R L shows the concentrated distribution of landslides in the high susceptibility area. The number of landslide pixels in each susceptibility level was counted and the following results obtained: for these three landslide datasets, the proportion of landslides classified into very high or high susceptibility levels are 79.5% (debris flows), 71.3% (shallow landslides), and 73.3% (entire landslide inventory), respectively. It should be noted that the total area with very high and high susceptibilities only cover a small percentage of the whole study area, which are 32.4% (debris flows), 21.2% (shallow landslides), and 32.7% (all of landslides), respectively. As Zêzere [39] concluded, the effectiveness of a model can be validated when the majority of historical unstable slopes concentrate in zones with very high or high susceptibilities. Within the very low and low susceptibility areas, the percentages of landslides are 3.1% (debris flow), 4.7% (shallow landslides), and 3.6% (all of landslides), respectively. This means that the rate of false alerts (landslide inventory points recognized in the low susceptibility area) of the model is low.
Next, 227 landslide points were analyzed in detail by comparing them with 5000 cells randomly selected in the susceptibility maps. The LSIs were divided into 20 intervals on average and the number of landslide and random points in each interval were accounted. The results were normalized in the form of percentage of points versus the LSI for every susceptibility map, which is presented in Figure 10b-d. When the LSIs are low (<50%), the percentage of random points is evidently larger; when the LSIs range from 60~100%, the percentage of landslide points is larger. For these three susceptibility maps, the percentage of landslide points is always the largest between 60% and 70% LSIs. This indicates that most landslides have larger LSIs than random points; thus, the map recognized most landslides. Furthermore, the number of landslides located in the intervals with more than 90% LSIs is relatively small. This is mainly because the area of this susceptibility level is very small. from 60%~100%, the percentage of landslide points is larger. For these three susceptibility maps, the percentage of landslide points is always the largest between 60% and 70% LSIs. This indicates that most landslides have larger LSIs than random points; thus, the map recognized most landslides. Furthermore, the number of landslides located in the intervals with more than 90% LSIs is relatively small. This is mainly because the area of this susceptibility level is very small. During the next stage, different models were used to produce landslide susceptibility maps. Although some newly developed models (e.g., some machine learning models) may have better performance [76], the models considered here only included methods relevant to the weighted frequency ratio (WFR) model, because the goal of this part was to verify the rationality of the proposed model, not to find the best model for landslide susceptibility assessment. The models used for comparison were (a) weighted frequency ratio model without taking landslide typology into account, (b) LR model, (c) FAHP model, and (d) FR model. Please note that the last three models also only regarded the entire landslide inventory without including landslide typology. The resulting landslide susceptibility maps are shown in Figure 11. On one side, a detailed receiver operating characteristic (ROC) analysis was adopted to quantitatively clarify their accuracies (Figure 12a). On the other side, the accuracy from the confusion matrix versus LSI thresholds were computed and compared for all these models (Figure 12b). From the perspective of the area under curve (AUC), the WFR model for shallow landslides is the largest (AUC = 0.771) among all the models, thus indicating its better performance. The second largest AUC value is provided by the WFR model using the entire landslide inventory (AUC = 0.704), followed by the WFR model for debris flows (AUC = 0.699), FR During the next stage, different models were used to produce landslide susceptibility maps. Although some newly developed models (e.g., some machine learning models) may have better performance [76], the models considered here only included methods relevant to the weighted frequency ratio (WFR) model, because the goal of this part was to verify the rationality of the proposed model, not to find the best model for landslide susceptibility assessment. The models used for comparison were (a) weighted frequency ratio model without taking landslide typology into account, (b) LR model, (c) FAHP model, and (d) FR model. Please note that the last three models also only regarded the entire landslide inventory without including landslide typology. The resulting landslide susceptibility maps are shown in Figure 11. On one side, a detailed receiver operating characteristic (ROC) analysis was adopted to quantitatively clarify their accuracies (Figure 12a). On the other side, the accuracy from the confusion matrix versus LSI thresholds were computed and compared for all these models (Figure 12b). From the perspective of the area under curve (AUC), the WFR model for shallow landslides is the largest (AUC = 0.771) among all the models, thus indicating its better performance. The second largest AUC value is provided by the WFR model using the entire landslide inventory (AUC = 0. From the comparison of LSI threshold with accuracy, the accuracies were similar and relatively stable between the percentages from 40% to 70%. The largest values mostly were in the range of 50~60%. This means the LSI threshold in this range could be used to divide the cells into stable and unstable. Certainly, the selection of an adequate threshold must be a cautious decision, because it is an important index to identify the positive and negative conditions. For this study, the 55% was determined as the LSI threshold because most models had the largest accuracy at this value. Under this situation, the accuracy of the WFR model considering landslide typology was still the best, which was 0. 68

Discussion
In this study, the entire landslide inventory was divided into two types: shallow landslides and debris flows. This is mainly based on geomorphological features and spatial distribution (Section 3.1). In the study area, the shallow landslides occurred in open slopes, whereas debris flows occurred in channels and gullies. Their movement mechanisms are different. This procedure is similar to that reported in previous literature, for example, Zhou et al. [17] divided all landslides in Longju (China) into colluvial landslides and rockfalls; Erener et al. [77] divided landslides in NE Turkey into dormant and active landslides. However, other studies have different attempts on this topic, where researchers classified landslides into different types according to movement pattern or widely accepted criteria [44,45], for example, translational slides and rotational slides. However, it should be noted that in order to incorporate a relatively standard landslide typology into landslide susceptibility assessment, a complete landslide inventory is particularly important. It should allow for recording the relevant information of landslide typology, but unfortunately this is seldom available in many cases, including this study. The dataset in Qilihe was compared with a few studies that performed a standard landslide classification, including Zêzere [39] (144 landslides in an area of 11.3 km 2 ) and Thiery et al. [78] (192 landslides in an area of 100 km 2 ), and it can be found that the preparation of an accurate landslide inventory is often an operational challenge when it comes to regional scale studies, particularly for our study (227 landslides in an area of 395 km 2 ). However, as Corominas et al. [36] summarized, it is difficult to provide strict In summary, the present results show that the best performance was given by the WFR model of shallow landslides, while the worst one was the LR model without considering landslide typology. Although the accuracy of the model for the entire landslide inventory considering landslide typology was in the middle level between the accuracies of two landslide typologies, it still was better than all of the other models that did not take landslide typology into account. Hence, if decision-makers want to generate a landslide susceptibility map for the entire landslide dataset of an entire area (not only for a certain landslide typology), the weighted frequency ratio model is a good option.

Discussion
In this study, the entire landslide inventory was divided into two types: shallow landslides and debris flows. This is mainly based on geomorphological features and spatial distribution (Section 3.1). In the study area, the shallow landslides occurred in open slopes, whereas debris flows occurred in channels and gullies. Their movement mechanisms are different. This procedure is similar to that reported in previous literature, for example, Zhou et al. [17] divided all landslides in Longju (China) into colluvial landslides and rockfalls; Erener et al. [77] divided landslides in NE Turkey into dormant and active landslides. However, other studies have different attempts on this topic, where researchers classified landslides into different types according to movement pattern or widely accepted criteria [44,45], for example, translational slides and rotational slides. However, it should be noted that in order to incorporate a relatively standard landslide typology into landslide susceptibility assessment, a complete landslide inventory is particularly important. It should allow for recording the relevant information of landslide typology, but unfortunately this is seldom available in many cases, including this study. The dataset in Qilihe was compared with a few studies that performed a standard landslide classification, including Zêzere [39] (144 landslides in an area of 11.3 km 2 ) and Thiery et al. [78] (192 landslides in an area of 100 km 2 ), and it can be found that the preparation of an accurate landslide inventory is often an operational challenge when it comes to regional scale studies, particularly for our study (227 landslides in an area of 395 km 2 ). However, as Corominas et al. [36] summarized, it is difficult to provide strict guidelines for the type of data required in a landslide risk analysis. Hence, in this study, the landslide typology was determined mainly according to triggering factors and geomorphological features. On one hand, the lack of detailed information on standard landslide typology makes the traditional classification impossible; on other hand, the analysis on landslide point density does indicate an evident difference between the two landslide types. Hence, in spite of historical data that may be lacking, the present classification is acceptable for a preliminary analysis.
The proposed weighted frequency ratio model to assess the regional-scale landslide susceptibility is fundamentally a statistically based method because it can quantitatively present the contribution of influencing factors or their categories in the modeling procedure. Our work indicates that soil and land use are relatively important for debris flows whereas slope is of small importance; for shallow landslides, soil and relief degree of land surface are the most important among the prediction factors. Some of these findings contradict those of previous studies. For instance, Hürlimann et al. [79] found that slope angle between 30 • and 35 • is the most significant factor for the rainfall-induced flows in the Pyrenees. Wu et al. [54] tested the importance of input factors using machine learning methods in an area with a setting similar to this study, and recorded a moderate contribution of geological factors. This leads us to conclude that the model performance depends much on input datasets (e.g., quality, availability, and type) and the types of model used, so it may vary with different cases. Given the difference among the development laws of each landslide typology at a regional scale, it is not easy to directly extrapolate an empirical landslide susceptibility model to neighboring regions. Moreover, another limitation of this study related to the selection of influencing factors; each landslide type should have specific conditioning factors that depend on the mechanism. It is therefore important to adopt different sets of influencing factors to assess landslide susceptibility. This means different numbers and types of factors are required to distinguish the occurrence conditions for different landslide types. However, in our study, only the weight of every factor was changed without considering different combinations of the factors. As some classic literature (e.g., [44]) have already stated, the fact that different types of landslides occur in different conditions is evident. Hence, subsequent analysis should deal with these conditions in different landslides types.
To obtain a reliable result, a set of easy-to-obtain geological or environmental data were used to produce thematic maps as input variables. Although several of them are still debated (e.g., RDLS and SCA) and are not commonly mentioned in previous literature, the twelve influencing factors used in this study fit well with the assertion that landslides are controlled by mechanical laws that can be determined empirically, statistically, or in an indeterministic fashion [5]. In fact, the linear statistical relationship on landslide number and factors indicates they do affect landslide occurrences. For debris flows, there is an evident positive correlation between frequency ratio and RDLS, and the FR value in the category of 120~140 is larger than most categories (including other factors). For shallow landslides, the weight of SCA is higher than some commonly used factors, such as curvature and land use. Certainly, this does not mean that more new influencing factors could be taken into account of the modeling casually; on the contrary, our study indicates that the selection of influencing factors should not only use other methods described in the literature for reference, but also take the geological/environmental settings of the test site into account. Hence, as Segoni et al. [80] suggested, researchers should have a good understanding of the geological meaning of the units defined in their maps because the difference among geological units may have a deep influence on the effectiveness of susceptibility modeling. This process can be conducted by using a variety of methods, such as field survey on the test site, review of existing literature, and examination of historical data, among others. In the present study, these geological meanings were recognized mainly through field work and archived documents (e.g., landslide triggers and historical engineering activities). During the next stage, a statistical tool in GIS (density analysis of landslide points) was adopted to perform back analysis to validate their significance on landslide occurrence. In terms of model performance, the accuracy of the proposed model is only approximately 70%, which is much closer to other studies using the same type of approach (e.g., the mean AUC of 0.74 in [81]; the maximum AUC of 0.71 in [82]; the AUC of 0.75 in [83]). It must be admitted that this accuracy is not superior, which mainly includes two reasons: one is that the data quality of the inventory is not very high, and the other is associated with the limitation of statistically based methods. The main limitation on landslide inventory is that we used landslide initiation points and not polygons as sampling data. Point data of landslide occurrence ignore the size, volume, and magnitude; as a result, landslide susceptibility mapping can be biased. Hence, our next work should consider landslide polygons to prepare landslide susceptibility mapping. Related work can be seen in Catani et al. [37]. Users should be careful, however, on this point because polygon data also have the drawback of being subjective; independent researchers may produce very different landslide susceptibility mappings in the same study area, which can result in spatial positioning inconsistencies in the boundaries of mapped landslide polygons [84]. Considering this point, landslide point data are still used in some studies worldwide [20,25].
Although the present model did not increase the accuracy by much, it does offer an improvement compared with the basic version of the model. In other words, we think the current result is acceptable for a preliminary analysis because what we concern the most is to discuss the rationality of landslide classification by using the weighted frequency ratio model, which can be considered in landslide susceptibility assessment, rather than the comparison of methodologies. This mainly addresses an operational and technical challenge: when it comes to a landslide inventory that includes more than one landslide type, landslide susceptibility mapping for the entire landslide dataset is still possible and the modeling process allows for reflecting the different incidences of a same group of input factors. It is a main topical issue currently in this research niche [85]. Moreover, it has a more complete geomorphological meaning; traditionally data-driven approaches (including statistically based approaches and machine learning approaches) mostly only analyze the nonlinear relationship between landslide and input factors without concentrating on geomorphological implications involved in the landslide process. These models usually can perform accurate prediction on landslide spatial distribution, but fail to explain the obtained results from a geomorphological point of view. However, geomorphology attributes are quite important in the Loess Plateau of China because it has been confirmed that landslide types have a close relationship with it [86]. Regarding the model accuracy, in fact, along with the development of GIS and machine learning techniques, a trend toward advanced but complicated landslide susceptibility models has shown up. Hence, it can be believed that a better model will be found in future work that not only allows consideration of the landslide typology, but also has a satisfactory model performance.

Conclusions
This study aims at incorporating landslide typology into landslide susceptibility assessment of Qilihe District in Lanzhou (northwestern China), and individually establishing an influencing factor system for each type of landslide to create an effective landslide susceptibility map. For this purpose, a WFR model was proposed, where the LR-FAHP approach was used to calculate the weight of influencing factors, and the rank among different categories within each factor was determined by the FR method. According to the analysis of landslide spatial distribution and geomorphological features, all of the landslides in the region were classified into two categories: debris flows and shallow landslides. Based on 12 thematic layers, the WFR model was demonstrated to be effective when mapping the landslide susceptibility; the accuracy presented by AUC was 70.4%. Such accuracy was better than that of the individual LR, FAHP, and FR models. When all landslides were considered as a group from the beginning, the accuracy of landslide susceptibility assessment was reduced by 1.5~5.4%.
Summarizing, the current results reveal that more reliable landslide susceptibility maps can be produced when landslide typology is incorporated into the modeling process. Therefore, it is highly recommended to separately conduct susceptibility assessment for each type of landslide instead of as a whole group. However, it should be noted that the eventual map was determined by combining individual susceptibility maps for each landslide type. This means the error in every individual map may be propagated into the final results. In this study, the resulting accuracy of shallow landslides is nearly 80% while that of debris flows is only approximately 70%. These results suggest that more work is needed before applying the proposed procedure to other cases. There are two potential tasks in our future work: one is to improve the accuracy of individual landslide susceptibility maps by comparing methodologies; the other is to simplify or optimize the influencing factor combination, especially under the condition that the weight of factors can be obtained by the existing model.