Validation of Spatial Prediction Models for Landslide Susceptibility Mapping by Considering Structural Similarity

In this paper, we propose a methodology for validating landslide susceptibility results in the Pinggu district (Beijing, China). A landslide inventory including 169 landslides was prepared, and eight factors correlated to landslides (lithology, tectonic faults, topographic elevation, slope gradient, aspect, slope curvature, land use, and road network) were processed, integrating two techniques, namely the frequency ratio (FR) and the certainty factor (CF), in a geographic information system (GIS) environment. The area under the curve (success rate curve and prediction curve) analysis was used to evaluate model compatibility and predictability. Validation results indicated that the values of the area under the curve for the FR model and the CF model were 0.769 and 0.768, respectively. Considering spatial correlation, an alternative complementary method for validating landslide susceptibility maps was introduced. The spatially approximate maps could be discriminated from their matrices which carry structural information, and the structural similarity index (SSI) was then proposed to quantify the similarity. As a specific example, the SSI value of the FR (74.15%) scored higher than that of the CF model (69.36%), demonstrating its promise in validating different landslide susceptibility maps. These results show that the FR model outperforms the CF model in producing a landslide susceptibility map in the study area.


Introduction
Landslides are geological events involving the down-slope transport of soil and rock materials, and assessing landslide susceptibility is a major task for local authorities in land-use planning and mitigation [1,2]. Landslide susceptibility mapping (LSM) has been acknowledged as an effective tool to understand landslides and predict landslide-prone areas [3]. It addresses the propensity of soil and rock to produce various types of landslides, with susceptibilities expressed cartographically in maps that highlight the spatial distribution of potential slope failures [4].
Because landslides depend on many interacting conditions, such as soil properties, root strength, bedrock, topography, hydrology, and human activities, producing a reliable spatial prediction of landslides is still a challenging task. The quality of a landslide model depends not only on the quality of the data used [5], but also strongly on the modeling approach employed [6]. Moreover, the analysis of cause and consequence relationships is not always simple and, on many occasions, LSM has attracted heavy criticism [7]. For example, geographic information system (GIS)-based interrelation analysis usually begins with the collection of landslide inventory maps, which mainly come from aerial photo-interpretation and systematic field checks [8,9]. The scale of the available aerial photographs, the topographic maps, the typology of the landslide phenomena, and the environmental contexts greatly affect the reliability and completeness of the landslide inventory map [10,11]. In addition, spatial resolution and pixel size effects [12,13], the selection of conditional factors [14], the criteria used to classify each conditional factor map [15], and factor weight assignment techniques [16,17] also introduce uncertainties in LSM.
In GIS, if we consider a spatial database containing maps of lithological units, land cover units, topographic elevation and derived attributes (slope, aspect, etc.), and the spatial distribution of clearly identified mass movements, we can transform the multi-layered database into an aggregation of functional values to obtain an index of the propensity of the land to failure [18]. To minimize subjectivity and bias in the weight assignment process, quantitative methods, such as statistical analysis [19] and deterministic analysis [20], may be used. The core issue then becomes the validation of the results of these models. Generally, validation can best be performed using the random-partition, spatial-partition, and time-partition techniques [21,22]. The time-partition technique validates a model by comparing landslides that occurred in one period with those that occurred in a different period. It is the most adequate way to confirm the validity of the "prediction" made, but also the most difficult to apply, as it requires knowledge of the temporal distribution of landslides over a sufficiently long time span [23]. Drawing on mathematical statistics, various validation strategies have been introduced, which can be categorized as landslide density analysis [24], receiver operating characteristic (ROC) curve analysis, area under the ROC curve (AUC) analysis [25,26], and success rate or prediction rate curve analysis [26,27]. Landslide density analysis falsely assumes that landslide occurrence is a spatially continuous variable, whereas landslide occurrence cannot be interpolated without considering its relation to the local geological settings [28]. The AUC computed from the ROC curve validates the accuracy of a landslide model in two respects: prediction skill and model fitting [29]. When the validation uses an inventory that was unknown during model calibration, the result measures predictive skill. When, instead of an independent population, the same landslide set is used, the result measures how well the model fits the data. Similarly, the success rate method uses the training landslide pixels that have already been used for building the landslide models; thus, it is not suitable for assessing the prediction capacity of the models [30]. Confusion matrix analysis [31] and agreed area analysis [32] have also been reported in the literature for model validation. Despite much research effort on LSM, there is still a dispute about which method or technique is best for predicting landslide-prone areas [33].
The main objective of this study is, therefore, to deal with the issue of validating landslide susceptibility models. Traditional validation methods (i.e., success rate curve analysis and prediction rate curve analysis) were first conducted to confirm the compatibility and predictive capacity of the landslide susceptibility models. On that basis, we propose a validation approach based on the definition of a structural similarity index (SSI), which aims to quantify the similarity between different landslide susceptibility models. The SSI appears to be effective in confirming the validity of the results of some models over others. An application example is presented to describe the strategy used in this research and to provide a basis for generalizing the approach to validation.

Description of the Study Area
The study area (Figure 1) is about 1075 km² and covers the whole of the Pinggu district. It is located in the northeast of Beijing, China, between longitudes 116°55′E and 117°25′E, and latitudes 40°00′N and 40°25′N. The geomorphology is dominated by hilly and alluvial-proluvial landforms, with the elevation ranging from 14.3 m to 1233.8 m above sea level. The area experiences a continental monsoon climate, with uneven distribution of seasonal precipitation. According to the Beijing Meteorological Service, the annual average precipitation is 639.5 mm, and the rainy season commences in June and ends in September, with the heaviest precipitation during July, accounting for 74.9% of the annual precipitation.
Based on the geological map prepared by the Beijing Geological and Mineral Bureau [34], there are Archaeozoic metamorphic units, Proterozoic sedimentary units, and Quaternary deposit units in the area. Outcrops of Proterozoic sedimentary rocks are spread widely in the area, with lithologies of dolomites, dolomitic sandstones, and sandstones. Generally, road construction in mountainous areas cuts through these carbonate rocks, and excavation of rock slopes during residential construction also leads to potential instability problems.
Due to the specific morphological settings, geology, and climate, landslides (especially rockfalls) are frequent in the area. A detailed site investigation, including geological and geotechnical studies of the landslides, was conducted to gain a better understanding of the triggering mechanisms and failure processes and to better prepare for future failures in the study area. The main triggering factors are lithology and tectonic structures. Among the 169 registered landslides, 155 were found to be controlled by the attitude of discontinuities in the dolomitic sedimentary rocks. Ninety-six landslides were located on steep slopes (slope angle > 80°). Human activities (mainly land cover changes and road construction) also play an important role in landslide occurrence, as they increase the sensitivity of the surface layer to detachment and sliding. These events are destructive to roads (Figure 2), agricultural lands, and buildings.

Landslide Inventory
The first important step in LSM is to identify the locations of past and present landslides [35]. In the area, slope movements are frequent, and over 70% of those movements are rockfalls. To perform the analysis on a homogeneous population, only one type of movement was selected for the landslide inventory: rockfalls. A detailed and reliable landslide inventory map (Figure 1) was prepared from two main sources: (1) the landslide database from aerial photograph (1:50,000) interpretation (obtained on 30 September 2012) with 16 landslide locations, and (2) a detailed regional field investigation at 1:10,000 scale (undertaken from January to May 2013) with 153 landslide locations. A total of 169 landslides that occurred during the last 20 years were registered. Figure 2 depicts typical landslides observed during the field investigation.
In GIS, point data can be described as a single (X,Y) coordinate, which does not reflect landslide-affected areas; this type of feature is usually used when the areal extent of a small landslide cannot be drawn due to the scale of the map [36]. However, the logical method is to reveal the landslide-responsible pixels. Lee et al. [37] suggested that when the scale of the map is between 1:5000 and 1:50,000, pixel sizes of 5 m, 10 m, and 30 m yield similar accuracy. In this study, a pixel size of 30 m × 30 m was adopted, and landslides larger than one cell were used for the analyses. For each landslide, the areal extent was delimited from the accumulation/depletion zone as a polygon feature drawn from field work, and converted to raster format in GIS.

Conditional Factors
Generally, the selection of landslide-correlated factors should take the geologic characteristics of the study area and data availability into consideration. In GIS-based analysis, the selected factors should be operational, complete, non-uniform, measurable, and non-redundant [10]. For rockfall susceptibility mapping, Antoniou and Lekkas [38] indicated that geological information (e.g., discontinuities, joints, and faults) and geomorphological information (e.g., slope angle and slope aspect) should be used as input parameters. In this study, eight conditional factors were recognized as correlated to rockfalls: lithology, proximity to major faults, proximity to road networks, slope degree, slope aspect, slope curvature, topographical elevation, and land cover (Figure 3). All of the above factors and the landslides were then entered into GIS using ArcGIS 10.1 software. Continuous variables were converted into categorical classes using expert opinion [39] and Jenks [40] natural breaks to define the class intervals.
Lithology is considered one of the most important factors because it influences the geomechanical characteristics of the terrain (e.g., static and dynamic friction, restitution characteristics, and fragmentation ratio), thereby controlling the types and mechanisms of rockfalls [41]. The lithology map was constructed based on the geological mineral resources maps at 1:200,000 scale. This is the only geological map available for the study area. The lithology map was constructed with nine lithological units, which are described in Table 1.
Large-scale structures such as faults or thrusts induce regional perturbations in the occurrence and density of fracturing, which are very unfavorable to natural or man-made slopes. The dip/dip direction often indicates the potential type of rockfalls [42]. In this study, a map of the distance to faults with eight classes was prepared.
A digital elevation model (DEM) for the study area with a spatial resolution of 30 m × 30 m was generated from topographic maps at a scale of 1:10,000. Topographic attributes such as slope, aspect, curvature, and elevation can be derived from the DEM. It is well known that slope failures occur more readily on steeper slopes because of gravity stresses; at the same time, because steeper slopes present a higher potential for failure, the materials that make up such slopes can be expected to be stronger [38]. The geometric relationship between rock beds and slope aspect can influence groundwater flow, and hence the process and type of rockfall [43]. In general, if the rocks dip in the same direction as the topographic surface, the slope is said to be cataclinal. If the beds dip in the opposite direction, anaclinal slopes are created. When the strike of the rock beds is perpendicular to the azimuth of the mountain face, the result is an orthoclinal slope. Slope curvature is a major contributor to terrain instability because it influences the concentration of soil/rock moisture [44] and therefore rockfalls. Although elevation by itself is not a conditional factor, there are some altitude ranges in which slope failures are frequent [32]. In this study, four factor maps were constructed from the DEM data: the slope, aspect, curvature, and topographic elevation maps, each with eight classes.
Land cover is also considered an indirect factor influencing rockfalls, because different land cover types on the slopes lead to different slope roughness and energy loss at impact [32]. Generally, the normalized difference vegetation index (NDVI) is used to measure surface reflectance and gives a quantitative estimate of vegetation growth and biomass [36]. The NDVI value was calculated by the formula:
$$NDVI = \frac{IR - R}{IR + R} \tag{1}$$
where IR refers to the infrared portion of the electromagnetic spectrum, and R is the red portion of the electromagnetic spectrum [45]. The NDVI map with six classes was then constructed based on the computed values.
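As a minimal sketch of the NDVI calculation described above, the following Python snippet computes (IR − R)/(IR + R) per pixel with NumPy. The function name `ndvi`, the handling of pixels where both bands are zero (set to NaN), and the small example arrays are assumptions made for illustration; they are not taken from the study.

```python
import numpy as np


def ndvi(ir, red):
    """Return (IR - R) / (IR + R) per pixel; NaN where IR + R is zero."""
    ir = np.asarray(ir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = ir + red
    out = np.full(total.shape, np.nan)
    # Divide only where the denominator is non-zero; other cells stay NaN.
    np.divide(ir - red, total, out=out, where=total != 0)
    return out


# Hypothetical 2 x 2 reflectance bands, for illustration only.
ir_band = np.array([[0.50, 0.40], [0.30, 0.00]])
red_band = np.array([[0.10, 0.20], [0.30, 0.00]])
ndvi_map = ndvi(ir_band, red_band)
```

The resulting continuous raster would then still need to be grouped into classes (six in this study) before it is used as a conditional factor.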
The presence of a road is a major long-term factor affecting the stability of the rock mass. When a road is constructed, the buttress of the compartment is reduced and, consequently, stresses increase at its base [43]. For this reason, road networks are included in GIS-based rockfall susceptibility analyses. In this study, the distance to roads was used as an index to quantify this influence, and eight buffer categories were constructed.

Preparation of Training and Validation Datasets
For landslide susceptibility modeling, landslide locations should be divided into training and validation datasets: the first is used for building models and the second for validating them [46]. In this study, the dates of most landslide occurrences are unknown; therefore, we randomly split the landslide locations into two subsets with a 7:3 ratio, the larger for model construction and the smaller for model validation.
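A random 7:3 partition of this kind can be sketched as follows. The function name `split_inventory`, the fixed seed, and the use of integer identifiers for the 169 landslides are assumptions for illustration; the text does not state how the random split was implemented or the exact subset sizes.

```python
import numpy as np


def split_inventory(landslide_ids, train_fraction=0.7, seed=0):
    """Randomly split landslide identifiers into training and validation sets."""
    rng = np.random.default_rng(seed)  # fixed seed only for reproducibility
    shuffled = rng.permutation(np.asarray(landslide_ids))
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers 0..168 standing in for the 169 inventoried landslides.
train_ids, valid_ids = split_inventory(np.arange(169))
```

With 169 records and rounding to the nearest integer, this sketch places 118 landslides in the training subset and 51 in the validation subset.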

Landslide Susceptibility Modeling
Landslide susceptibility can be assessed using different GIS-based methods. Especially in the last 20 years, many research papers have been published to address the deficiencies and difficulties in the assessment of susceptibility. However, it should be noted that the procedure for preparing landslide susceptibility maps should be simple and highly accurate. The frequency ratio model is a statistics-based approach widely used in landslide susceptibility analysis [36,47]. In the frequency ratio model, the input, calculation, and output processes are very simple and can be readily understood. In addition, among the commonly used GIS analysis models for landslide susceptibility, the certainty factor has also been widely considered and experimentally investigated in the literature [48,49]. Therefore, in this study, the landslide susceptibility analyses were implemented using these two methods.

Frequency Ratio Method
The assumption that conditions leading to slope failure in the past and present are likely to cause landslides in the future is important for LSM [50]. Statistical approaches are based on the relationships between each landslide-related factor and the distribution of past landslides, and this correlation can be quantitatively evaluated using the frequency ratio (FR) model. The aforementioned eight landslide-correlated factors were used to establish this relationship with landslides.
The number of landslide pixels in each class was evaluated, and the frequency ratio for each factor class was found by dividing the landslide ratio by the area ratio [51], denoted as:
$$FR_{i,j} = \frac{N_{pix}(S_{i,j}) / \sum_{j} N_{pix}(S_{i,j})}{N_{pix}(N_{i,j}) / \sum_{j} N_{pix}(N_{i,j})} \tag{2}$$
where $FR_{i,j}$ is the frequency ratio of class j in factor i; $N_{pix}(S_{i,j})$ is the number of pixels of landslide occurrence within class j in factor i; and $N_{pix}(N_{i,j})$ is the number of pixels of class j in factor i. In relation analyses, a value greater than 1.0 indicates a strong correlation between landslide occurrence and the factor's class, and a value lower than 1.0 means a weak correlation [52]. Once the frequency ratio of each landslide factor's class was obtained, the landslide susceptibility index (LSI) could be calculated by summing each factor's frequency ratio values. A higher LSI means a higher susceptibility to landslides, while a lower LSI indicates a lower susceptibility [47].
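The frequency ratio and the LSI summation can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: each factor is an integer-coded class raster, the landslide inventory is a boolean raster of the same shape, and the landslide ratio and area ratio are taken relative to the totals over all classes of the factor. The function names and the two tiny example rasters are hypothetical.

```python
import numpy as np


def frequency_ratio(class_map, landslide_mask):
    """Return {class_label: FR} for one classified factor raster."""
    class_map = np.asarray(class_map)
    landslide_mask = np.asarray(landslide_mask, dtype=bool)
    total_landslide = landslide_mask.sum()
    ratios = {}
    for label in np.unique(class_map):
        in_class = class_map == label
        # Share of landslide pixels that fall in this class.
        landslide_ratio = (in_class & landslide_mask).sum() / total_landslide
        # Share of the study area occupied by this class.
        area_ratio = in_class.sum() / class_map.size
        ratios[int(label)] = float(landslide_ratio / area_ratio)
    return ratios


def lsi_from_fr(class_maps, fr_tables):
    """Sum the FR value of each factor's class at every pixel."""
    lsi = np.zeros(np.asarray(class_maps[0]).shape, dtype=float)
    for class_map, table in zip(class_maps, fr_tables):
        class_map = np.asarray(class_map)
        for label, fr in table.items():
            lsi[class_map == label] += fr
    return lsi


# Hypothetical 2 x 4 rasters, for illustration only.
landslides = np.array([[1, 0, 0, 0], [1, 1, 1, 0]], dtype=bool)
slope_classes = np.array([[1, 1, 1, 1], [2, 2, 2, 2]])
aspect_classes = np.array([[1, 2, 1, 2], [1, 2, 1, 2]])

factors = [slope_classes, aspect_classes]
tables = [frequency_ratio(f, landslides) for f in factors]
lsi_map = lsi_from_fr(factors, tables)
```

In this toy case, slope class 2 holds three of the four landslide pixels on half of the area, so its FR is 1.5, while slope class 1 has an FR of 0.5. In a real workflow the FR tables would be computed from the training landslides only.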

Certainty Factor Method
The certainty factor (CF) is one of the proposed favorability functions for handling the combination of heterogeneous data. The CF is calculated for each data layer based on the landslide inventory and the landslide occurrence frequency in each class of every thematic layer. The CF for each pixel is defined as the change in certainty that a proposition is true, from the case without the evidence (the prior probability of having a landslide in the study area) to the case given the evidence (the conditional probability of having a landslide given a certain class of a thematic layer), for each data layer [49]. The CF, as a function of probability, originally proposed by Shortliffe and Buchanan [53], is:
$$CF = \begin{cases} \dfrac{pp_a - pp_s}{pp_a (1 - pp_s)}, & pp_a \ge pp_s \\ \dfrac{pp_a - pp_s}{pp_s (1 - pp_a)}, & pp_a < pp_s \end{cases} \tag{3}$$
where CF is the certainty factor, $pp_a$ is the conditional probability of having a number of landslides in a class (e.g., south-facing slope in the aspect layer, Quaternary deposits in the lithology layer), and $pp_s$ is the prior probability of having the total number of landslides in the study area. The CF ranges between -1 and 1, where positive values imply an increase in certainty after the evidence of a landslide is observed, and negative values correspond to a decrease in certainty. A value close to 0 means that the prior probability is very similar to the conditional one; hence, it does not give any indication about the certainty of the occurrence of the event [54].
The layers are combined pairwise according to the integration rules [55]. The combination 'z' of the CF values of two thematic layers is expressed in the following equation given by Binaghi et al. [56]:
$$z = \begin{cases} x + y - xy, & x, y \ge 0 \\ \dfrac{x + y}{1 - \min(|x|, |y|)}, & x, y \text{ of opposite sign} \\ x + y + xy, & x, y < 0 \end{cases} \tag{4}$$
where x and y are the CF values of the two layers being combined. The CF values are computed by overlaying each thematic layer with the landslide inventory map and calculating the landslide frequencies. Each thematic layer is reclassified according to the calculated CF values, and the layers are combined pairwise to obtain the LSI using the integration rule of Equation (4).
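The CF calculation and the pairwise integration can be sketched as follows. This assumes the commonly cited piecewise forms: CF = (ppa − pps)/(ppa(1 − pps)) when ppa ≥ pps and (ppa − pps)/(pps(1 − ppa)) otherwise, and a combination rule of x + y − xy for two non-negative values, x + y + xy for two negative values, and (x + y)/(1 − min(|x|, |y|)) for values of opposite sign. The function names, the zero-denominator handling, and the sample values are assumptions for illustration.

```python
from functools import reduce


def certainty_factor(ppa, pps):
    """CF of one class from its conditional (ppa) and the prior (pps) probability."""
    if ppa >= pps:
        denom = ppa * (1.0 - pps)
        # Assumption: treat the degenerate case (zero denominator) as no evidence.
        return 0.0 if denom == 0 else (ppa - pps) / denom
    return (ppa - pps) / (pps * (1.0 - ppa))


def combine_cf(x, y):
    """Combine the CF values of two layers at one pixel."""
    if x >= 0 and y >= 0:
        return x + y - x * y
    if x < 0 and y < 0:
        return x + y + x * y
    denom = 1.0 - min(abs(x), abs(y))
    if denom == 0:
        raise ValueError("combination undefined for CF values of +1 and -1")
    return (x + y) / denom


def lsi_from_cf(cf_values):
    """Fold the pairwise rule over the CF values of all layers at one pixel."""
    return reduce(combine_cf, cf_values)


# Hypothetical CF values of three layers at a single pixel.
pixel_lsi = lsi_from_cf([0.5, 0.5, -0.25])
```

Here the first two layers combine to 0.75, and combining that result with −0.25 gives (0.75 − 0.25)/(1 − 0.25) = 2/3. The same fold can be applied raster-wide by vectorizing the three cases.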

Model Validation Strategies
Validation implies a comparison between the maps obtained from the models and an independent dataset. This comparison can be qualitative (for example, visual, by a simple overlay) or quantitative, performed using functions such as the cumulative curve [22]. The current study attempts to analyze the spatial correlation between different landslide models, and to offer an alternative procedure based on evaluating the similarity between the results of different methods. To this end, traditional validation approaches, i.e., success rate and prediction rate curves, were first used to test model compatibility and prediction capacity. On that basis, the spatially correlated validation was then performed to obtain the SSI, which can quantitatively validate the performance of different landslide models.

Traditional Validation Approaches
Landslide susceptibility maps can be verified by comparing them with both the training data that were used for building the models and the validation data that were not used during the model building process.
The success rate curve is based on the comparison between the prediction image and the landslides used in the modeling [21], and the success rate method can help determine how well the resulting landslide susceptibility maps have classified the areas of existing landslides [30]. The prediction rate method mimics the comparison by partitioning landslide data; one subset of the data (training data) is used for obtaining a prediction image, and the other subset (validation data) is compared with the prediction results for validation, explaining how well the model and predictor variables predict the results [55]. Rate curves can be created for both techniques. The area under the success rate curve (AUSC) represents the ability of a landslide model to reliably classify the occurrence of existing landslides (training dataset), while the area under the prediction rate curve (AUPC) describes the capacity of the proposed landslide model for predicting landslide susceptibility. The AUC value ranges from 0.5 to 1.0; an ideal model has an AUC value close to 1.0 (perfect fit/prediction), whereas a random fit/prediction model has an AUC value close to 0.5 [57]. In this study, the LSI values of all cells in the study area were divided into 10 subdivisions, the cumulative percentage of landslide occurrence (for the training dataset and the validation dataset, respectively) in these classes was calculated, and the curves were then drawn to calculate their AUCs.
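The rate-curve procedure can be sketched as follows. This is an illustration under assumptions, not the exact implementation used in the study: pixels are sorted by descending LSI, split into 10 equal-count subdivisions, and the area under the cumulative curve is obtained with the trapezoidal rule. The function name `rate_curve_auc` and the example data are hypothetical. Passing the training landslides gives a success rate curve; passing the validation landslides gives a prediction rate curve.

```python
import numpy as np


def rate_curve_auc(lsi, landslide_mask, n_bins=10):
    """Cumulative landslide share against cumulative area share, plus its AUC."""
    lsi = np.asarray(lsi, dtype=float).ravel()
    hits = np.asarray(landslide_mask, dtype=bool).ravel()
    if hits.sum() == 0:
        raise ValueError("the landslide mask contains no landslide pixels")
    # Rank pixels from most to least susceptible (ties keep their input order).
    order = np.argsort(-lsi, kind="stable")
    bins = np.array_split(hits[order], n_bins)
    area = np.concatenate(([0.0], np.cumsum([len(b) for b in bins]) / lsi.size))
    captured = np.concatenate(
        ([0.0], np.cumsum([b.sum() for b in bins]) / hits.sum())
    )
    # Trapezoidal rule over the (area, captured) curve.
    auc = float(np.sum(np.diff(area) * (captured[1:] + captured[:-1]) / 2.0))
    return area, captured, auc


# Hypothetical example: 20 pixels, landslides on the 4 most susceptible ones.
lsi_values = np.arange(20, 0, -1, dtype=float)
mask = np.zeros(20, dtype=bool)
mask[:4] = True
area_share, landslide_share, auc_value = rate_curve_auc(lsi_values, mask)
```

In this toy case all landslide pixels fall in the two most susceptible subdivisions, so the curve reaches 100% after 20% of the area and the AUC is 0.9.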

Spatially Correlated Validation Approaches
When a landslide susceptibility map consists of different levels of susceptibility, we usually visualize such thematic representations. It should be noted that the prediction map is constructed using mathematical models by computing a predicted value at each pixel on a continuous scale, and pixel values from two prediction maps may have different meanings. For an objective and proper comparison of the prediction maps, the ranking of equal-number classes [55] is first used, i.e., all of the pixels are sorted according to the pixel values in descending order, and the total number of pixels is divided into a number of classes with an equal number of pixels.
To compare prediction maps, a virtual standard map (VSM) is needed. The VSM is constructed from the intersection of the compared maps, and it contains the mutual pixels that correspond to the same susceptibility level in the compared maps. For instance, if we rank the respective pixel values derived from two individual landslide models into four levels for visualization, the two prediction maps can be denoted as map A and map B, and the pixels in the first class of the VSM are the pixels that are in the first class of both map A and map B. Since both map A and map B should first be subjected to the same validation (success rate and prediction rate curve analysis) to confirm their compatibility and validity, the reliability of the VSM can be guaranteed.
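The equal-number ranking and the VSM intersection can be sketched as follows. The function names, the choice of four classes, and the use of 0 as a "no agreement" code for pixels whose classes differ between the two maps are assumptions for illustration; the text does not specify how non-mutual pixels are stored.

```python
import numpy as np


def equal_number_classes(values, n_classes=4):
    """Rank pixels in descending order and cut them into equal-count classes.

    Class 1 holds the highest values, class n_classes the lowest.
    """
    values = np.asarray(values, dtype=float)
    order = np.argsort(-values.ravel(), kind="stable")
    ranks = np.empty(order.size, dtype=int)
    ranks[order] = np.arange(order.size)
    classes = ranks * n_classes // order.size + 1
    return classes.reshape(values.shape)


def virtual_standard_map(classes_a, classes_b, nodata=0):
    """Keep the class where both maps agree; mark the rest with `nodata`."""
    return np.where(classes_a == classes_b, classes_a, nodata)


# Hypothetical LSI rasters from two models, on different numeric scales.
map_a = np.array([[8.0, 7.0, 6.0, 5.0], [4.0, 3.0, 2.0, 1.0]])
map_b = np.array([[0.9, 0.8, 0.3, 0.7], [0.6, 0.5, 0.2, 0.1]])

classes_a = equal_number_classes(map_a)
classes_b = equal_number_classes(map_b)
vsm = virtual_standard_map(classes_a, classes_b)
```

Because the ranking uses only the order of the pixel values, the two maps become comparable even though their raw LSI scales differ.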
In the resulting landslide susceptibility maps, the different susceptibility classes can also be represented at a discrete level: the grey level of every pixel is an integer between 0 and 255, and different grey levels represent different LSIs. The visible landslide susceptibility maps can therefore be regarded as highly structured natural images, formed by various positional combinations of pixels of different grey levels. In fact, a landslide susceptibility map produced in GIS is intrinsically a matrix carrying structural information. The structural information can thus be extracted from the spatially approximate maps to measure the similarity between two arbitrary maps.
To obtain the matrices of the maps, image processing techniques can be adopted in MATLAB to optimize the landslide susceptibility maps/images and to extract edges using a grey level threshold based on the local entropy of the image. Suppose x and y are the matrices of two prediction images that have been aligned with each other. If we consider one of the matrices to have perfect quality, i.e., the map from which this matrix is derived is more suitable and predictable, then the similarity measure can serve as a quantitative measurement of the quality of the second matrix. In this way, the spatial similarity of two landslide susceptibility maps can be quantitatively evaluated by a structural similarity index (SSI) [58]. In mathematical form, the SSI is given by Equation (5), where $x_i$ and $y_i$ are the elements of matrices x and y, respectively. The value of the SSI ranges from 0 to 1.0: the higher the SSI, the higher the similarity of the two matrices, i.e., the more closely one landslide susceptibility map resembles the other. In this study, colorized landslide susceptibility maps were constructed from the results of the FR and CF models, respectively, and classified into four susceptibility classes (very low, low, moderate, and high). Using the embedded functions in MATLAB, these colorized maps were converted into grey-scale images. The SSI was then calculated according to Equation (5) using the matrix manipulation functions in MATLAB.
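The idea of scoring the similarity of two aligned map matrices can be illustrated with a stand-in measure. The sketch below is not Equation (5) of this study. It computes a global (single-window) form of the universal image quality index from the image-quality literature, Q = 4 σxy x̄ ȳ / ((σx² + σy²)(x̄² + ȳ²)), in Python rather than MATLAB. Unlike the SSI described above, Q can range from −1 to 1. The function name and the two small class matrices are hypothetical.

```python
import numpy as np


def global_quality_index(x, y):
    """Stand-in similarity score for two aligned matrices (not the paper's SSI)."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    mean_x, mean_y = x.mean(), y.mean()
    var_x, var_y = x.var(ddof=1), y.var(ddof=1)
    cov_xy = np.sum((x - mean_x) * (y - mean_y)) / (x.size - 1)
    denom = (var_x + var_y) * (mean_x**2 + mean_y**2)
    if denom == 0:
        raise ValueError("index undefined for constant, all-zero inputs")
    return float(4.0 * cov_xy * mean_x * mean_y / denom)


# Hypothetical four-class maps coded 1..4, for illustration only.
reference = np.array([[1, 1, 2, 2], [3, 3, 4, 4]])
candidate = np.array([[1, 1, 3, 2], [2, 3, 4, 4]])

score_same = global_quality_index(reference, reference)
score_diff = global_quality_index(reference, candidate)
```

An identical pair scores 1, and the score drops as the class patterns diverge, which mirrors the way a higher similarity value is read in the text.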

Landslide Conditional Factor Analysis
Using the aforementioned conditional factors, the FR and CF values were computed from the training dataset. The results are set out in Table 2. For comparison, the calculated FRs and CFs for each class/type of the conditional factors are plotted in Figure 4. It can be seen that the variations of FR_{i,j} are consistent with those of CF_{i,j}. While the CF gives the degree of belief, the FR denotes the level of correlation between landslide locations and conditional factors. The lithology factor is closely correlated with landslide occurrence, as most of the landslides have occurred in the classes Jxy, Chc, and IR. Landslides are also more likely to occur as the slope gradient increases. Additionally, areas at distances of 1500–2100 m from the road network and 2000–2500 m from major faults have a higher probability of landslide occurrence.
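To make the relation between the two weights concrete, the sketch below computes both for a single factor class from pixel counts. It assumes the commonly used definitions: FR as the share of landslide pixels in a class divided by the share of area in that class, and CF as the piecewise function of the conditional and prior landslide probabilities. The paper's own equations are given in its methods section and may be parameterized differently. The function names and the pixel counts are hypothetical.

```python
def frequency_ratio(landslides_in_class, landslides_total,
                    pixels_in_class, pixels_total):
    """FR = (share of landslide pixels in class) / (share of area in class)."""
    landslide_share = landslides_in_class / landslides_total
    area_share = pixels_in_class / pixels_total
    return landslide_share / area_share


def certainty_factor(landslides_in_class, landslides_total,
                     pixels_in_class, pixels_total):
    """CF in [-1, 1] from the conditional (ppa) and prior (pps) probabilities."""
    ppa = landslides_in_class / pixels_in_class  # landslide density in class
    pps = landslides_total / pixels_total        # landslide density overall
    if ppa >= pps:
        return (ppa - pps) / (ppa * (1 - pps))
    return (ppa - pps) / (pps * (1 - ppa))


# Hypothetical counts: two classes covering 2000 of 10000 pixels each.
for hits in (30, 5):
    fr = frequency_ratio(hits, 100, 2000, 10000)
    cf = certainty_factor(hits, 100, 2000, 10000)
    print(f"landslide pixels={hits}: FR={fr:.3f}, CF={cf:.3f}")
```

Under these assumed definitions FR equals ppa/pps, so FR > 1 exactly when CF > 0, which is one way to see why the two sets of weights vary consistently across classes.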

Model Results and Analysis
Once the FR and CF models had been trained, they were used to calculate the LSI for all pixels in the domain. The LSIs were reclassified into four susceptibility levels (high, moderate, low, and very low) using the equal-number classes method [18]. For visualization, the two landslide susceptibility maps produced from the FR and CF models are shown in Figure 5.
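The reclassification step can be sketched as follows, assuming that "equal-number classes" means that each class receives (as nearly as possible) the same number of pixels when these are ranked by LSI. The function name, the handling of ties, and the sample LSI values are illustrative assumptions rather than details taken from the study.

```python
def equal_number_classes(lsi_values,
                         labels=("very low", "low", "moderate", "high")):
    """Assign each pixel a class so that classes hold equal pixel counts.

    Pixels are ranked by LSI (ascending); tied values are split by their
    original order, which is an arbitrary choice made for this sketch.
    """
    n = len(lsi_values)
    k = len(labels)
    order = sorted(range(n), key=lambda i: lsi_values[i])
    classes = [None] * n
    for rank, pixel_index in enumerate(order):
        classes[pixel_index] = labels[rank * k // n]
    return classes


# Hypothetical LSI values for eight pixels.
lsi = [5.1, 0.3, 2.2, 9.8, 7.4, 1.0, 3.3, 6.0]
print(equal_number_classes(lsi))
```

With eight pixels and four labels, each class receives two pixels; on a real raster the same ranking would be applied to the flattened LSI grid.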


Model Validation and Comparison
The compatibility of the susceptibility models was evaluated using the training dataset, and their prediction capability was assessed using the validating dataset. The area under the curve approach was used, and the results are shown in Figure 6. The success rate curves gave AUC values of 0.807 and 0.773 for the FR model and the CF model, respectively, showing that the map obtained from the FR model gives a higher accuracy in classifying the areas of existing landslides. To evaluate the prediction capacity of the developed landslide models, the prediction rate curve was obtained using the landslide pixels in the validating dataset (30% of the total observed landslides). Figure 6 shows that both models have a good prediction capability, with the FR model scoring higher (AUC = 0.773), although the prediction capacities of the two models are relatively similar.
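A minimal sketch of how the area under a success rate or prediction rate curve can be obtained is given below. It assumes the common construction in which pixels are ranked from the highest to the lowest LSI and the cumulative share of landslide pixels is plotted against the cumulative share of the study area. The function name and the toy data are hypothetical, and the study's own GIS workflow may bin the LSI values differently.

```python
def rate_curve_auc(lsi_values, is_landslide):
    """Area under the cumulative landslide share vs. cumulative area curve.

    Use training landslides for the success rate curve and validation
    landslides for the prediction rate curve. Tied LSI values are taken
    in input order, which is a simplification.
    """
    pairs = sorted(zip(lsi_values, is_landslide),
                   key=lambda pair: pair[0], reverse=True)
    total_pixels = len(pairs)
    total_landslides = sum(1 for _, hit in pairs if hit)
    if total_landslides == 0:
        raise ValueError("at least one landslide pixel is required")

    auc = 0.0
    prev_x = prev_y = 0.0
    found = 0
    for rank, (_, hit) in enumerate(pairs, start=1):
        if hit:
            found += 1
        x = rank / total_pixels          # cumulative share of area
        y = found / total_landslides     # cumulative share of landslides
        auc += (x - prev_x) * (y + prev_y) / 2  # trapezoidal rule
        prev_x, prev_y = x, y
    return auc


# Hypothetical ten-pixel map with landslides in the two highest-LSI pixels.
lsi = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
hits = [True, True] + [False] * 8
print(rate_curve_auc(lsi, hits))
```

A model that ranks landslide pixels near the top yields a curve that rises quickly and an AUC close to 1, whereas a ranking unrelated to landslide occurrence gives a value near 0.5.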

In this study, the VSM was generated from the maps produced by the FR and CF models, and the matrix of the VSM was assigned to x, while the matrices of the maps derived from the FR and CF models were each assigned to y in turn. Using the image processing technique in MATLAB, the grey figures of the VSM and of the landslide susceptibility maps produced by the FR and CF models were obtained. Based on Equation (5), a matrix manipulation function was adopted to calculate the SSIs. The SSIs between the VSM and the map produced by the FR model, and between the VSM and the map produced by the CF model, are 78.43% and 71.42%, respectively. This indicates that, of the two, the landslide susceptibility map produced by the FR model more closely approximates the VSM.
Therefore, in terms of both model compatibility and predictability, the FR model outperforms the CF model, but the difference, as represented by the AUCs, is slight. The SSI, however, appears to amplify this difference and makes the relative discrepancy evident, so that it can confirm the validity of the FR model's results over those of the CF model with more confidence.

Discussion and Conclusions
Methodologies for producing GIS-based landslide susceptibility maps are varied, and much literature has been published to address the deficiencies and difficulties in landslide susceptibility assessment. However, it should be noted that the procedure for preparing landslide susceptibility maps must be simple and must have a high accuracy [36].
In this study, the frequency ratio model, as a simple method, and the certainty factor, as a complicated method, were applied to construct landslide susceptibility models. In the frequency ratio model, the input process, calculations, and output process are simple and can be readily understood [59]. The certainty factor, however, involves extensive calculations and complicated data processing: the data must be converted from raster format to shapefiles, converted back to raster format after the statistical analyses are completed, and then subjected to complex logical calculations in GIS to obtain the final results.
For validating landslide susceptibility models (maps), Sarkar and Kanungo [60] stated that in an ideal landslide susceptibility map, the very high susceptibility class should have the highest landslide density or area proportion, and there should be a decreasing trend in the landslide density and percentage of landslide area from the very high to the very low susceptibility zones. In this study, high LSM accuracies were obtained for the two models. Using the training data, the respective AUC values for the FR and CF models were 0.807 and 0.773. When the validating dataset was used instead of the training dataset, the AUC values for the FR and CF models were 0.782 and 0.760, respectively. Swets [57] suggested that AUC values between 0.7 and 0.9 indicate a reasonable discrimination ability. Thus, both models can be considered to have relatively similar partition performance and prediction ability. By taking into account the spatial discrepancies of the corresponding susceptibility classes within each map, the spatially correlated validation method described here is shown to be useful for map validation. Therefore, structural similarity can serve as a complementary validation approach for map comparison when traditional methods fail to confirm the validity of one model over others in a targeted place.
The FR and CF models quantitatively estimate landslide susceptibility given a set of geo-environmental conditions. In fact, a landslide disaster is itself a typical nonlinear system, and the relationship between the various factors is complex; a landslide susceptibility map based on any stochastic method therefore remains uncertain. Nevertheless, without claiming to be the best landslide model, the FR model performs better than the CF model in producing a landslide susceptibility map of the study area.

Figure 1. Landslide inventory map and location of the study area.

Figure 2. Rockfalls at the Zhenluoying area of the study area (photographs were taken in April 2013).

Figure 4. Variations of the computed FRs and CFs for each class/type of the conditional factors.

Figure 5. Landslide susceptibility maps using (a) the frequency ratio, and (b) the certainty factor.

Figure 6. Area under the curve analysis: (a) success rate curve using the training dataset; and (b) prediction rate curve using the validating dataset.

Table 1. Description of geological units in the study area.

Table 2. Weighted values calculated for each category of the conditional factors, based on the FR model and the CF model.