Landslide Susceptibility Mapping Based on Selected Optimal Combination of Landslide Predisposing Factors in a Large Catchment

Landslides are usually initiated under complex geological conditions, so it is important to identify the optimal combination of predisposing factors and to build an accurate landslide susceptibility map from it. In this paper, the Information Value Model was modified to form the Modified Information Value (MIV) Model. Using the MIV Model together with GIS (Geographical Information System) and the AUC (Area Under the Receiver Operating Characteristic Curve) test, 32 factor combinations were evaluated separately. The combination of Slope, Lithology, Drainage network, Annual precipitation, Faults, Road and Vegetation was selected as the optimal group, with an accuracy of 95.0%. Based on this group, a landslide susceptibility zonation map was drawn in which the study area was reclassified into five classes. The map describes the different levels of landslide susceptibility accurately: 79.41% and 13.67% of the validation field survey landslides fall in the Very High and High zones, respectively, which are mainly distributed in the south and southeast of the catchment. The results show that the MIV model handles the problem of “no data in subclass” well, yields the true information value and reflects the real trend. It therefore performs well in expressing the relationship between predisposing factors and landslide occurrence and can be used for preliminary landslide susceptibility assessment in the study area.


Introduction
Landslides are often triggered by the interaction of many factors. In regions with a sensitive geological environment, the original rock structure has been damaged by frequent earthquakes, and loose debris has accumulated. Geological instability is further aggravated in these areas by a combination of complicated geological structure, steep terrain, erosion from frequent heavy rain and human activities [1][2][3]. Owing to the diversity and uncertainty of landslide predisposing factors and the sharp increase in landslide frequency, the potential threat of landslides is growing. It not only endangers the life and property of the local population, but also causes severe damage to regional resources and the environment [4,5].
Many landslide researchers [6,7] have found that research on predisposing factors in sensitive areas is extremely important. The reason is that the dynamic mechanism of landslides and the landslide predisposing factors are uncertain, for example in the non-uniqueness of factors and the uncertainty of the master-slave relationship among them, and they may change with the seasons and with triggering events (such as concentrated precipitation in monsoon climate areas) [8]. In regions with a sensitive geological environment, landslide predisposing factors such as topography and geomorphology, geological structures, rock properties and hydrological conditions interact with each other, which makes it more difficult to determine the time and scale of landslides [9]. Many studies have evaluated predisposing factors. Dahl et al. [1] carried out research in the Faroe Islands in the North Atlantic and found that the landslide initiation area was determined by slope, lithology and soil coverage, and that the primary area mainly has slopes from 25° to 40°. Other scholars found that factors such as topography and geomorphology, geological conditions, cutting intensity, vegetation environment and human intervention are all indispensable in the study of geological predisposing factors, and that the inner relationship between landslides and the predisposing factors should be studied further [10][11][12]. Chinese scholars have done much forecasting and evaluation work on landslides and debris flows in southwest China with GIS technology and the sensitivity index model. Lan et al. [13] considered precipitation a significant landslide predisposing factor. The earthquake threshold for triggering landslides is significantly reduced after precipitation. From the perspective of terrain factors, and based on remote sensing image interpretation, Zhang et al. [14] examined the relationship between landslide characteristics and three terrain factors closely related to landslides (slope, relief, and gully vertical drop). To summarize, landslides are usually initiated under complex conditions, in which important factors such as terrain conditions, lithologic distribution, precipitation and seismic activity should not be ignored [15]. The triggering mechanism arises from the combined action of geological structure and environment. Different factors should be considered together, including topography and geomorphology, geological conditions, cutting intensity, vegetation environment and human factors, while factors that exert little effect should be excluded. Therefore, in the practical analysis and evaluation of landslides, it is important to identify the combination of factors that most influences landslides in a specific area.
It is extremely important to analyze objectively the distribution of every factor in the study area, and the relationship between predisposing factors and landslide bodies, before geological hazard assessment and prediction [16]. Different methods have been used to evaluate landslide susceptibility. Kayastha [17] grouped the research methods for sensitive factors into five types: (1) direct map plotting; (2) landslide catalogs; (3) exploratory analysis; (4) statistical methods such as fuzzy logic and artificial neural networks; and (5) conceptual models. Among them, direct map plotting and landslide catalogs are the most intuitive and basic ways to identify regional disasters. Bathrellos [18] proposed an approach that uses mainly natural hazards, together with the geological, geomorphological and geographical characteristics of the study area, for urban planning and sustainable development. Among other methods, Kritikos and Davies [19] developed a GIS-based (Geographic Information System) approach to assess susceptibility to precipitation-induced shallow landslides. They applied a fuzzy logic technique to deal with uncertainties and intricate relationships between conditioning factors and landslides.
The information value (IV) model is a statistical analysis method developed from information theory, and it is now often applied to the spatial prediction of geological hazards and to disaster risk assessment [20][21][22][23]. However, the IV model has a hidden problem: when no landslide exists in a certain subclass, the information value cannot be computed. Researchers usually assign "0" [20] or "no data" [22] to those pixels, which exaggerates the results considerably if many such pixels exist. A value of "0" in the model means that the ratio of landslide pixels in subclass i equals the average ratio of the study area, whereas a subclass without landslides should receive a value far below that. To overcome this problem, Oliveira et al. [24] specified that when $Npix(x_i) = 0$, $I(H, x_i)$ is not calculated but is assigned qualitatively as a low information value chosen with regard to the data set of predisposing variables. This avoids the strong exaggeration, but the result still does not show the exact information value of the area.
Baoxing Catchment has suffered frequent geological disasters in the past decade. Landslides have caused heavy losses of life and property and have posed great threats to post-disaster reconstruction. Taking the catchment as the study area and using ArcGIS and SPSS software, this paper aims to identify the factor combination that most influences landslides there and to produce a susceptibility map based on it. The Modified Information Value (MIV) Model was established and, together with the AUC (Area Under the Receiver Operating Characteristic Curve) test, was used to select the optimal combination group.

Study Area
In this paper, Baoxing Catchment in Sichuan Province, China, which includes Baoxing County and Lushan County, was taken as the study area. The catchment is located at 102°26′-103°14′ E, 30°02′-30°57′ N, in the western part of the Sichuan Basin, and has a total area of 4319 km² (Figure 1). Mountainous terrain dominates the catchment, and the elevation falls gradually from 5268 m in the northwest to 557 m in the southeast. The catchment lies at the junction of the Yangtze Paraplatform and the Ganzisongpan Geosynclinal fold system and spans three seismic belts: the Longmenshan Mountain Belt, the Xianshuihe River Belt and the Anning River Belt. Fault structures are dense in Sichuan Province, and seismic activity has been frequent in the past decade. The Wenchuan earthquake (12 May 2008) and the Lushan earthquake (20 April 2013) struck the area in succession. Both earthquakes caused heavy losses of life and property and posed great threats to post-disaster reconstruction [25]. Water resources, ecology, communication, electric power and the road system were affected to different degrees [26]. When several earthquakes occur in the same area, the rock structure loosens and the shear strength of geological structures is significantly reduced [27]. The study area lies in a monsoon climate zone, where the mean annual precipitation reaches 1000 mm, most of it concentrated in heavy rainfall events.
Sustainability 2015, 7

Data Source and Preprocessing
Landsat TM and ETM+ remote sensing (RS) image data were downloaded from the data sharing platform Geospatial Data Cloud (http://www.gscloud.cn/). Atmospheric correction of the RS images was performed with the FLAASH (Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes) model, and geometric correction was done against 1:250,000 topographic maps and GPS field sampling points. After data processing and integration, multi-spectral color images with a spatial resolution of 15 m were generated. Drainage network, road, landslide and vegetation coverage information was interpreted by combining an object-oriented classification method with manual vectorization. In addition, high spatial resolution remote sensing images from Google Earth were consulted to improve accuracy. Altogether, 1258 landslides were extracted (Figure 2), 97% of which were small- and medium-sized shallow slides or collapses, mainly triggered by the Wenchuan earthquake (12 May 2008) and the Lushan earthquake (20 April 2013). Together the landslides affected a total area of 3,529,530 m², which corresponds to 0.08% of the study area. Landslide size ranges from 8.9 to 173,459 m², with a mean of 2790 m². The landslide density is 0.29 landslides/km².
The geological information used in this article was derived from the 1:250,000 geological map of the Geological Survey Institute of Sichuan Province. Seismic data were taken from the national earthquake science data sharing center (http://data.earthquake.cn). Precipitation data were collected from the meteorological science data sharing service network of Sichuan Province (http://www.climate.sc.cn). Digital elevation data (ASTER GDEM, 30 m) of Baoxing Catchment were used to delineate the drainage network, slope and aspect in ArcGIS. The field survey landslide distribution map was provided by the Geological Survey Institute of Sichuan Province. Half of those landslides were used to calculate the Information Value, and the remaining half were used to evaluate the prediction accuracy in the AUC calculation.

Selection of Predisposing Factors
There are no universal guidelines for selecting factors for landslide susceptibility [28]. A parameter may be an important controlling factor for landslide occurrence in one area but not in another [20]. According to previous data and the research literature on the geological environment of landslide-sensitive areas, the critical predisposing factors of landslides include regional geological conditions (fault structure, slope, aspect, formation lithology, etc.), climatic factors (precipitation distribution, temperature, etc.), geological disaster factors (density, frequency, types, etc.), vegetation conditions, river terrain cutting, and human engineering activities (roads and other engineering constructions) [29]. Considering the complex conditions in the study area and data accessibility, we chose nine factors from four characteristic groups as landslide predisposing factors, covering most of the critical elements leading to landslides in this region. The groups are the Topography group (slope, aspect and terrain), the Geological lithology group (lithology and faults), the Terrain cutting group (river and road cutting intensity) and the Natural environment group (vegetation coverage and annual precipitation distribution).
Each predisposing factor was reclassified into several classes (Table 1), which were used as the zonal statistic areas in the Modified Information Value Model to evaluate landslide susceptibility.

Information Value (IV) Model
The information value (IV) for each subclass of the factors is calculated with the following equation [20,30]:
$$I(H, x_i) = \ln \frac{Npix(x_i) / Npix(N_i)}{\sum Npix(x_i) / \sum Npix(N_i)} \qquad (1)$$
where $I(H, x_i)$ is the information value of subclass i of a predisposing factor; $Npix(x_i)$ is the number of landslide pixels in subclass i; $Npix(N_i)$ is the total pixel number of subclass i; $\sum Npix(x_i)$ is the total landslide pixel number in the study area; and $\sum Npix(N_i)$ is the total pixel number of the study area.
The total information value $I_{total}$ for each pixel is then calculated by summing the information values of all factor layers using Equation (2):
$$I_{total} = \sum_{i=1}^{n} I(H, x_i) \qquad (2)$$
where n is the number of layers of the predisposing factors.
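As an illustration of how the per-subclass information value could be computed, the following minimal sketch assumes the common log-ratio form of the IV model (natural logarithm of the subclass landslide ratio divided by the study-area ratio). The function name and the pixel counts are hypothetical and are not taken from the study data. A subclass without landslides is returned as `None`, because the logarithm of zero is undefined; this is the gap the introduction describes.

```python
import math


def information_value(landslide_pixels, class_pixels):
    """Log-ratio information value per subclass (assumed form of the IV model).

    landslide_pixels[i]: landslide pixels in subclass i
    class_pixels[i]:     all pixels in subclass i
    Returns None for a subclass with no landslide pixels (log of 0 is undefined).
    """
    mean_ratio = sum(landslide_pixels) / sum(class_pixels)
    values = []
    for n_slide, n_class in zip(landslide_pixels, class_pixels):
        if n_slide == 0:
            values.append(None)
        else:
            values.append(math.log((n_slide / n_class) / mean_ratio))
    return values


# Hypothetical counts for three subclasses of one factor.
iv = information_value([30, 10, 0], [1000, 1000, 2000])
print(iv)
```

In this toy example the first subclass has three times the average landslide ratio, the second matches the average, and the third has no landslides at all.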

The Establishment of the Modified Information Value (MIV) Model
To avoid the problem mentioned in the introduction ("no data" for a certain subclass) and to express the results quantitatively, Equation (1) of the IV model was modified in this paper to establish the MIV model:
$$I(H, x_i) = \frac{Npix(x_i) / Npix(N_i)}{\sum Npix(x_i) / \sum Npix(N_i)} \qquad (3)$$
When no landslides exist in a certain subclass, $I(H, x_i)$ equals "0", which is the smallest possible value. When $I(H, x_i)$ is "1", the ratio of landslide pixels in subclass i equals the average of the study area. When the value is larger than "1", the ratio of landslide pixels in the subclass is higher than the study area average, and the larger the value, the higher the landslide ratio in the subclass, and vice versa. Taking the factor Road (buffering distance) as an example, statistic maps of the two models are shown in Figure 3. No landslide was found in the field survey at distances larger than 4000 m, so the two models show different fitting trends at the last subclass. The MIV model gives the true information value and the real trend there.
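The behavior described above (0 for a subclass without landslides, 1 at the study-area average, larger than 1 above it) can be sketched as a plain ratio of ratios. This is a minimal sketch under that reading of the MIV model; the function name and the counts are hypothetical, and every subclass is assumed to contain at least one pixel.

```python
def modified_information_value(landslide_pixels, class_pixels):
    """Ratio of the subclass landslide ratio to the study-area landslide ratio.

    Unlike the log form, a subclass with zero landslide pixels gives 0.0
    instead of an undefined value.
    """
    mean_ratio = sum(landslide_pixels) / sum(class_pixels)
    return [
        (n_slide / n_class) / mean_ratio
        for n_slide, n_class in zip(landslide_pixels, class_pixels)
    ]


# Hypothetical counts; the last subclass has no landslides.
miv = modified_information_value([30, 10, 0], [1000, 1000, 2000])
print(miv)  # roughly 3, 1 and 0
```

Because the value is defined for every subclass, a trend line fitted through the subclasses does not need a substitute value for the empty one.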
$I_{total}$ in the MIV model is still calculated with Equation (2). Landslides are affected by many factors, and each factor contributes differently, so the weight of each factor cannot simply be assigned by subjective judgment. Half of the field survey landslides were used to calculate the information value of each subclass of every predisposing factor with Equation (3), which indicates the degree to which the subclass contributes to landslides. Summing the information layers of all factors gives the $I_{total}$ layer, which shows the spatial distribution of $I_{total}$ values: the larger the value, the higher the landslide susceptibility.
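The layer summation can be sketched on small grids as follows. This is only an illustration in plain Python, not the ArcGIS workflow used in the study: each factor is assumed to be a grid of subclass indices with a lookup table of per-subclass values, and all names and numbers are hypothetical.

```python
def total_information_layer(class_rasters, value_tables):
    """Sum per-factor information values pixel by pixel.

    class_rasters[k][r][c]: subclass index of factor k at pixel (r, c)
    value_tables[k][j]:     information value of subclass j of factor k
    """
    rows = len(class_rasters[0])
    cols = len(class_rasters[0][0])
    total = [[0.0] * cols for _ in range(rows)]
    for classes, values in zip(class_rasters, value_tables):
        for r in range(rows):
            for c in range(cols):
                total[r][c] += values[classes[r][c]]
    return total


# Two hypothetical factors on a 2 x 2 grid.
slope_classes = [[0, 1], [2, 2]]
road_classes = [[1, 1], [0, 2]]
slope_values = [0.5, 1.2, 3.0]
road_values = [1.5, 0.8, 0.0]

i_total = total_information_layer(
    [slope_classes, road_classes], [slope_values, road_values]
)
print(i_total)
```

Pixels that fall in high-value subclasses of several factors at once end up with the largest totals.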

Factor Combination
Because geological environments differ between regions, the factors that induce landslides are not fully consistent and each factor contributes differently. There should therefore be an optimal factor combination under which the spatial probability of landslide occurrence is the highest. One of the aims of this paper is to find the optimal factor combination in the study area.
The enumeration method was used to build different combinations of the nine predisposing factors. Four typical predisposing factors, each from a different group, were selected as the basic factors through the following analysis, and 32 test factor combinations were set up by adding 1, 2, 3, 4, or 5 of the remaining factors each time. For each combination, a raster layer of the total information value of every pixel was generated in ArcGIS with the MIV model.
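The enumeration can be sketched with the standard library. The choice of the four basic factors below (one per group) is an assumption made for illustration only. Adding one to five of the five remaining factors yields 31 groups; the sketch reaches 32 by also keeping the basic group on its own, which is likewise an assumption about how the total was counted.

```python
from itertools import combinations

# Hypothetical basic factors, one from each of the four groups.
BASE = ("slope", "lithology", "drainage_network", "annual_precipitation")
# The five remaining factors of the nine.
REMAINING = ("aspect", "terrain", "faults", "road", "vegetation")


def enumerate_combinations(base, remaining):
    """Return the base group extended by every subset of the remaining factors."""
    groups = []
    for k in range(len(remaining) + 1):
        for extra in combinations(remaining, k):
            groups.append(base + extra)
    return groups


groups = enumerate_combinations(BASE, REMAINING)
print(len(groups))  # 32
```

Each tuple in `groups` would then be turned into one total information value layer and scored separately.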

Selection of Optimal Combination Based on ROC Curve Test
The receiver operating characteristic curve, referred to as the "ROC curve", is an effective method for evaluating performance on binary classification problems, which divide objects into two classes, positive and negative, as in diagnostic tests [31]. An ROC curve is constructed from the pair of true positive rate (TPR) and false positive rate (FPR) at each possible threshold value of the test; each point on the curve plots the TPR against the FPR for one threshold. The area under the ROC curve (AUC) is a common metric for comparing different tests (indicator variables). An AUC close to 0.5 corresponds to a poor diagnostic test, and the larger the AUC, the more accurate the test. Landslide susceptibility maps are often validated using the AUC [32,33], which has been widely used as a measure of the performance of a predictive rule [20,34,35]. ROC analysis therefore allows the prediction accuracy of a model to be assessed [36].
In this paper, the AUC of the ROC curve was applied together with the enumeration method to evaluate the 32 factor combinations separately in SPSS. The combination with the maximum AUC value (which must also exceed 0.9) was chosen as the optimal combination of predisposing factors for landslides in the area. The AUC also indicates the combination's "diagnosis" ability for landslides: a higher score indicates higher consistency between the landslide susceptibility and the actual spatial distribution of landslides.
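For readers without SPSS, the AUC can be sketched through its rank interpretation: the probability that a randomly chosen landslide pixel receives a higher score than a randomly chosen non-landslide pixel, with ties counted as one half. The sketch below is a simple quadratic-time illustration, not the procedure used in the study; the scores, labels, group names and the helper `select_optimal` are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both landslide and non-landslide samples are needed")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def select_optimal(auc_by_group, minimum=0.9):
    """Return the group with the largest AUC, or None if it does not exceed minimum."""
    best = max(auc_by_group, key=auc_by_group.get)
    return best if auc_by_group[best] > minimum else None


# Hypothetical total information values and landslide labels (1 = landslide).
scores = [4.5, 3.0, 2.0, 1.3, 0.4, 2.0]
labels = [1, 1, 0, 0, 0, 1]
print(auc(scores, labels))

# Hypothetical AUC values for three factor groups.
print(select_optimal({"group_a": 0.91, "group_b": 0.95, "group_c": 0.80}))
```

For large rasters a sort-based computation would be preferable to the nested loops shown here.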

Landslide Susceptibility Zoning Based on the Optimal Combination
According to the preceding analysis, the larger the $I_{total}$ value, the higher the landslide susceptibility. Using the natural break method in ArcGIS, the layer of total information values ($I_{total}$) for the optimal combination was reclassified into five classes (very high, high, moderate, low, and very low) to define different levels of landslide susceptibility in the study area. The layer of field survey landslides was then overlaid on the landslide susceptibility layer to show the accuracy of the susceptibility assessment.
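The reclassification and overlay step can be illustrated as follows. The sketch does not implement the natural break method; it assumes the four break values are already known (for example, read from the GIS classification) and only shows how total information values map to the five classes and how the share of validation landslides per class could be tallied. The break values and sample values are hypothetical.

```python
from bisect import bisect_right

CLASS_NAMES = ("very low", "low", "moderate", "high", "very high")


def classify(value, breaks):
    """Map a total information value to one of five classes using four ascending breaks."""
    return CLASS_NAMES[bisect_right(breaks, value)]


def landslide_share_by_class(values_at_landslides, breaks):
    """Percentage of validation landslides that fall in each susceptibility class."""
    if not values_at_landslides:
        raise ValueError("no validation landslides given")
    counts = {name: 0 for name in CLASS_NAMES}
    for value in values_at_landslides:
        counts[classify(value, breaks)] += 1
    total = len(values_at_landslides)
    return {name: 100.0 * count / total for name, count in counts.items()}


# Hypothetical break values and total information values at five validation landslides.
breaks = [4.0, 6.0, 8.0, 10.0]
share = landslide_share_by_class([11.2, 10.5, 9.0, 12.1, 7.5], breaks)
print(share)
```

A map is considered consistent with the field survey when most validation landslides fall in the two highest classes.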

Spatial Distribution Characteristics of Information Value
Information values were calculated in ArcGIS for every landslide predisposing factor, giving an IV layer for each factor (Figure 4); the IV of each subclass of those factors is shown in Figure 5. A larger IV suggests a higher degree of correlation between the factor and landslides, and vice versa. The information value is not only a quantitative indicator of a factor's influence on landslides, but also a reflection of the landslide distribution within each factor. Based on Figures 4 and 5, the spatial distribution of IV was analyzed for each predisposing factor.

In Topography Group
Different topographic conditions may lead to different landslide probabilities. In this paper, Slope, Aspect and Terrain (Relief amplitude) were taken as the predisposing factors of the Topography group to show the changes in information value.
In general, the steeper the slope, the less favorable it is to rock and soil consolidation, and the greater the possibility of landslides [37]. Lower slope IVs are mainly distributed in relatively flat areas, while higher IVs are concentrated in the central and southern mountains. The relationship between IV and slope fluctuates, but the overall trend is that IV rises as slope increases. When the slope is lower than 35°, the slope IV is relatively low, which shows that the probability of landslides is lower on slopes under 35°; the contribution to landslides is lowest in subclass 5°-10° (IV 0.45). When the slope is higher than 35°, the slope IV increases dramatically, with a drop in subclass 45°-50°, and the highest IV (3.22) is found in subclass >60°.
As for the Aspect layer, higher IVs are mainly distributed in subclasses E (East), S (South) and SW (Southwest), and lower IVs in SE (Southeast), NW (Northwest) and N (North). Remote sensing image interpretation showed that most landslides are concentrated on east- and south-facing slopes. This is related to differences in the natural geographical environment. Sunny hillsides receive plenty of sunlight, which leads to high evaporation and poor vegetation coverage; the combination of water and heat conditions is therefore poorer than on shady hillsides, rock weathering is more severe, and more landslides develop [8]. On shady hillsides, less moisture evaporates and the good combination of water and heat conditions benefits vegetation growth, which also prevents soil erosion; the rock is then better consolidated, so the possibility of landslides is lower.
Relief amplitude indicates how rugged the terrain of a region is: the greater the relief amplitude, the more complex the topography. Overall, the IV trend line of the Terrain layer is increasing. Terrain IV is lowest where the relief amplitude is less than 100 m (0.81); it then increases with relief amplitude and reaches its highest point in subclass >200 m (2.66), although this subclass covers only 2.24% of the study area. Higher IVs are mainly distributed in the central and southern mountains with high relief amplitude, and the areas of low IV are located in the northern and southern plains.

In Geological Lithology Group
Geological lithology plays a key role in controlling the spatial distribution of regional geological disasters. Such areas suffer many landslides because of tectonic activity and the geological setting [38]. In this paper, two predisposing factors, Lithology and Faults (buffering distance), were considered in the Geological lithology group.
Lithology is one of the most basic and important factors in geological hazard research, and it influences the stability of the regional geological structure [39]. The distribution of lithology in the study area is complex, with hard and soft rocks interleaved. The fitting curve of Lithology IVs rises monotonically from the hard rock group to the soft rock group, with a high degree of fit. The minimum IV, 0.67, is in the hard rock subclass, and the peak IV, 3.91, is in the soft rock subclass. The information values of middle soft and soft rock are both larger than 1, which shows that they have a greater effect on the likelihood of landslide development.
Faults are an important factor controlling regional geological stability. Fault activity tends to induce landslides and mud-rock flows in a zonal distribution. The fitting curve of fault distance IVs first rises and then falls. High IVs are mainly distributed 3000-9000 m from faults, where the landslide area is the largest. The highest value, 1.70, is reached at 6000-9000 m, where faults exert the greatest influence on landslides. Altogether there are three low points, in subclasses 0-3000 m, 9000-12,000 m and >15,000 m. Fault IV reaches its minimum of 0.23 when the distance is larger than 15,000 m. In subclass 9000-12,000 m the Fault IV is 0.45, but it increases to 1.18 in the next subclass, 12,000-15,000 m.

In Terrain Cutting Group
In the terrain cutting group, two predisposing factors were chosen, namely Drainage network and Road, which usually exert a strong cutting action on terrain. Multi-ring buffers (every 200 m for Drainage network and every 1000 m for Road) were used to generate subclasses.
In the natural environment, the drainage network has a cutting effect on the surrounding terrain and, at the same time, influences the stability of rock and soil along its banks [40]. In the study area, IVs of the drainage network were distributed in rings along the network and decreased with distance: the closer to the network, the higher the information value, and the farther, the lower. In mountain areas, sections along the drainage network are often regions with very frequent geological disasters, which have historically been closely related to the drainage network. Landslides often develop around the water system [27]. Figures 4 and 5 show that the fitting curve of drainage network information declined overall with increasing distance, and the reduction was relatively uniform and moderate. Within a distance of 400 m, IVs of the drainage network were greater than 1, so the effect on landslides was obvious. The IV was highest, 1.68, at 200-400 m from the drainage network. When the distance was larger than 600 m, the degree of influence on landslides decreased continuously, and when it was larger than 1000 m, the IV reached its minimum, 0.34.
Road was selected as the man-made predisposing factor for landslides. The road information curve had a high degree of fit and decreased monotonically with road buffer distance: the nearer to the road, the higher the Road IV. Road construction disturbs the slope ground and leaves the rock structural plane in an unstable state. The maximum IV (2.58) was within the 1000 m buffer distance, and the landslide area in this range accounted for 68.2% of the total landslide area. When the distance was larger than 4000 m, no landslide was found in the field survey and the degree of influence dropped to the minimum value, 0. In mountainous areas, roads may be cut into steep slopes, which can cause high landslide occurrence close to the road [41].
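The multi-ring buffering described above amounts to binning each pixel's distance to the nearest feature into fixed-width rings. The following is a minimal Python sketch of that binning, assuming distances are already available in metres. The function name `ring_subclass`, the half-open ring convention, and the label format are illustrative assumptions; only the ring widths (200 m, 1000 m) and the open-ended upper classes (>1000 m, >4000 m) come from the text.

```python
import math


def ring_subclass(distance_m: float, ring_width_m: float, last_edge_m: float) -> str:
    """Return the buffer-ring label for a distance to the nearest feature.

    Rings are treated as half-open intervals [lower, upper); anything at or
    beyond `last_edge_m` falls into the open-ended class (an assumed convention).
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    if distance_m >= last_edge_m:
        return f">{last_edge_m:g} m"
    lower = math.floor(distance_m / ring_width_m) * ring_width_m
    return f"{lower:g}-{lower + ring_width_m:g} m"


# Drainage network: 200 m rings, open-ended class beyond 1000 m.
drainage_label = ring_subclass(250.0, 200, 1000)
# Road: 1000 m rings, open-ended class beyond 4000 m.
road_label = ring_subclass(4500.0, 1000, 4000)
```

In a GIS workflow the same classification would normally be produced by a buffer or distance-raster tool; the sketch only makes the subclass boundaries explicit.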

In Natural Environment Group
Annual precipitation distribution and Vegetation coverage were taken to represent the Natural environment group; both usually play important roles in landslide occurrence [42,43].
Precipitation is a driving force of rock sliding and at the same time helps form the sliding surface of the slip mass; it is therefore one of the essential landslide factors in monsoon climate regions [44,45]. In the research area, annual precipitation increases gradually from north to south. The Precipitation IV fitting curve increased monotonically with annual precipitation. The lowest IV, close to zero, lay at the northern end of the study area, where annual average precipitation is less than 800 mm, while the peak IV, 2.52, appeared in the southernmost part of the study area, where annual average precipitation is larger than 1200 mm. Where annual average precipitation is greater than 1000 mm, both the area and the number of landslides became larger. The monotonic trend of Precipitation IV with precipitation further supports the view that precipitation is one of the main triggering factors of landslides.
For vegetation coverage, the Vegetation IV fitting curve shows a slow downward trend, with an inflexion point at subclass 0.2-0.3. When vegetation coverage was lower than 0.2, Vegetation IV reached its highest value, 2.22. The maximum impact of vegetation on landslides occurred in this subclass, with more landslides in a smaller subclass area. Furthermore, landslides increase land instability, and a slope is likely to slide again if natural reforestation is slow [46]. When the coverage was from 0.4 to 0.6, Vegetation IV was slightly larger than 1. In the remaining three subclasses, Vegetation IVs were all lower than 1; in particular, when the coverage was higher than 0.6, Vegetation IV reached its minimum value, 0.08. High vegetation coverage can usually reduce the frequency and severity of geological hazards. However, the Vegetation IV curve did not drop monotonically with rising vegetation coverage in this area, which shows that vegetation coverage is only an auxiliary factor affecting landslide development in the study area.

Optimal Combination Selection Based on ROC Curve Test
Among those predisposing factors, there should be an optimal combination to express landslide susceptibility. Based on the statistics of IV in Figure 5, Slope (from the Topography group), Lithology (from the Geological lithology group), Drainage network (from the Terrain cutting group) and Precipitation (from the Natural environment group) were selected as the basic combination group, and then 32 combination groups were made by adding 0, 1, 2, 3, 4, or 5 of the remaining factors each time (Table 2).
The AUC of the ROC curve was applied to evaluate the 32 factor combinations separately in SPSS software. The AUC results of all 32 combination groups are shown in Table 2. Among groups with five members, Group 5, in which Road was added to the basic group, had the highest AUC value, 0.943, showing that Road was more important than the other four factors in landslide occurrence. With six members, Group 14, with Road and Faults added to the basic group, became the best combination with an AUC value of 0.947, higher than that of Group 5. With three factors added to the basic group, the highest AUC value belonged to Group 25, reaching 0.950, the highest of all 32 groups; here Road and Faults were still group members, and Vegetation was added to Group 14. When further members were added, Group 31 had the highest AUC value of the remaining groups, but it was still lower than that of Group 25. This means that beyond Group 25, adding more factors decreases the AUC value; that is to say, more factors do not always mean better results. When Aspect, Terrain or both were added to Group 25, the AUC value was lower. Therefore Group 25, made up of Slope, Lithology, Drainage network, Annual precipitation, Faults, Road and Vegetation, was selected as the optimal combination group to express the susceptibility of landslide occurrence in the study area.
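The count of 32 groups follows from keeping the four basic factors fixed and adding every subset of the five remaining factors (2^5 = 32). A minimal Python sketch of this enumeration is given below; the function name `factor_groups` is an illustrative assumption, and the order in which groups are generated is not claimed to match the group numbering in Table 2.

```python
from itertools import combinations

# Basic combination group named in the text.
BASIC = ("Slope", "Lithology", "Drainage network", "Precipitation")
# The five remaining factors that may be added.
OPTIONAL = ("Aspect", "Terrain", "Faults", "Road", "Vegetation")


def factor_groups(basic=BASIC, optional=OPTIONAL):
    """Return every group formed by the basic factors plus 0..n optional factors."""
    groups = []
    for k in range(len(optional) + 1):
        for extra in combinations(optional, k):
            groups.append(basic + extra)
    return groups


groups = factor_groups()
print(len(groups))  # 32
```

The groups split by size as 1, 5, 10, 10, 5 and 1 for four to nine members, which is why several groups with the same number of members have to be compared at each step.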
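The AUC used for this comparison can be read as the probability that a randomly chosen landslide pixel receives a higher total information value than a randomly chosen non-landslide pixel, with ties counted as one half. The sketch below computes the AUC in that pairwise form in plain Python. It is not the SPSS procedure used in the paper, and the function name `auc` and the toy scores are hypothetical.

```python
def auc(scores_landslide, scores_stable):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    scores_landslide: susceptibility scores of pixels with observed landslides
    scores_stable:    susceptibility scores of pixels without landslides
    """
    if not scores_landslide or not scores_stable:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in scores_landslide:
        for n in scores_stable:
            if p > n:
                wins += 1.0   # landslide pixel ranked above stable pixel
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(scores_landslide) * len(scores_stable))


# Toy scores only, not data from the study area.
example = auc([3.0, 2.5, 1.0], [0.5, 1.0, 2.0])
```

The pairwise loop is quadratic in the number of pixels, so for full raster layers a rank-based implementation from a statistics package would be the practical choice.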

Discussions
For every landslide predisposing factor, information values differ between subclasses, and one subclass contributes the most to landslide development. Each factor plays a part in landslide development, but none is the only dominating factor. Landslides are therefore the result of the joint action of different kinds of factors.
Theoretically speaking, the group with the highest AUC value is the optimal combination of landslide predisposing factors. In terms of predictive capacity, however, there are no significant differences between similar AUC values, such as 0.946, 0.947 and 0.950, with respect to the ROC curves. From this point of view, all the groups with an AUC value higher than 0.940 include H (Road) besides the basic four factors, which shows that H is an important predisposing factor in considering landslide susceptibility. In contrast, when either or both of E (Aspect) and F (Terrain) are added to a group, AUC values are slightly lower than before, indicating that E and F could be excluded in the study area. What is more, a group with a similarly high AUC value but fewer predisposing factors would commonly be accepted as the more suitable choice of optimal combination, especially when data acquisition is difficult. Therefore, Group 25 can be defined as the optimal combination in the study area, and when either or both of G (Faults) and I (Vegetation coverage) are difficult to obtain, Group 14, Group 16 or Group 5 can suffice.
The whole Baoxing Catchment was reclassified into five classes (very high, high, moderate, low, and very low) to define different levels of landslide susceptibility based on the total information values (I total) of the optimal combination group (Group 25 in Table 2). The landslide susceptibility map was produced and overlaid with the layer of field survey landslides (Figure 6). The statistical results are listed in Table 3 to show the accuracy of the susceptibility assessment.
According to the AUC values in Table 2, Group 25 had the highest AUC, 0.950, which also means that the accuracy of the susceptibility map generated by Group 25 was 95.0%. From Figure 6, it can be judged visually that most of the landslides fall in the High and Very High zones, with a high degree of fit. The High and Very High zones are mainly distributed in the south and southeast of the catchment, and the Low and Very Low zones mainly in the north and west. Table 3 shows that there were almost no landslides in the Very Low and Low zones, which held only 0.01% and 0.32% of the total landslide area, respectively, while the High and Very High zones together held 93.08%. In the Very High zone alone the ratio is 79.41%; with a small zone area (30,855.06 hm²) but a very large landslide area, its landslide density is 2.33%.
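The two kinds of figures quoted from Table 3 can be derived from a zone-by-zone overlay of landslide area on the susceptibility map. The sketch below assumes that the "ratio" is the share of total landslide area falling in a zone and that "landslide density" is landslide area divided by zone area; both definitions are assumptions read from the wording above. The function name `zone_statistics` and the toy areas are hypothetical and are not values from Table 3.

```python
def zone_statistics(zone_area, landslide_area):
    """Per-zone share of all landslide area and landslide density, both in percent.

    zone_area:      mapping zone name -> zone area (any consistent unit, e.g. hm2)
    landslide_area: mapping zone name -> landslide area inside that zone (same unit)
    """
    total_landslide = sum(landslide_area.values())
    if total_landslide <= 0:
        raise ValueError("total landslide area must be positive")
    stats = {}
    for zone, area in zone_area.items():
        if area <= 0:
            raise ValueError(f"zone area must be positive: {zone}")
        slide = landslide_area.get(zone, 0.0)
        stats[zone] = {
            "share_pct": 100.0 * slide / total_landslide,  # share of all landslides
            "density_pct": 100.0 * slide / area,           # landslide area per zone area
        }
    return stats


# Toy areas only, chosen for easy arithmetic.
toy = zone_statistics({"Low": 1000.0, "High": 200.0}, {"Low": 1.0, "High": 9.0})
```

A high share combined with a small zone area gives a high density, which is the pattern reported for the Very High zone.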
In this paper, we have analyzed the relationship between landslides and their predisposing factors without considering landslide typology. The four basic factors in the groups were selected based on experience. Moreover, in the MIV model proposed in this paper, the typical negative information values of the IV model are constrained between 0 and 1, which could possibly affect the predictive capacity of the model. All these points may bring some uncertainty to the results of this paper. Nevertheless, the validation with the ROC curve indicates that the results are highly reliable. Judging the effects of these points precisely will be one of the key issues in our future research.
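The remark on negative information values can be illustrated by comparing a logarithmic information value with the plain density ratio it is computed from. The sketch below is not the MIV definition of this paper, which is given in the methods section. It assumes the classic IV is the natural logarithm of the ratio between landslide density in a subclass and landslide density in the whole area, and it shows how a ratio form maps the negative range of the logarithm onto values between 0 and 1 and stays defined for a subclass without landslides. All function names and numbers are hypothetical.

```python
import math


def density_ratio(slide_in_subclass, subclass_area, slide_total, total_area):
    """Landslide density in the subclass relative to the density in the whole area."""
    return (slide_in_subclass / subclass_area) / (slide_total / total_area)


def log_information_value(slide_in_subclass, subclass_area, slide_total, total_area):
    """Assumed classic form: ln of the density ratio; None when the subclass is empty."""
    ratio = density_ratio(slide_in_subclass, subclass_area, slide_total, total_area)
    if ratio == 0:
        return None  # ln(0) is undefined: the "no data in subclass" case
    return math.log(ratio)


# Hypothetical subclass with half the average landslide density.
sparse_ratio = density_ratio(5.0, 200.0, 50.0, 1000.0)
sparse_log = log_information_value(5.0, 200.0, 50.0, 1000.0)
# Hypothetical subclass with no landslides at all.
empty_ratio = density_ratio(0.0, 100.0, 50.0, 1000.0)
empty_log = log_information_value(0.0, 100.0, 50.0, 1000.0)
```

Under these assumptions a subclass with below-average landslide density has a negative logarithmic value but a ratio between 0 and 1, and an empty subclass yields a ratio of exactly 0 instead of an undefined value.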


Conclusions
Landslides are among the most hazardous geological disasters in nature and are usually initiated under complex environmental conditions. Finding the optimal combination of predisposing factors and creating a landslide susceptibility map based on it is of great significance and interest to planning agencies for preliminary hazard studies, especially when a regulatory planning policy is to be implemented. In the present paper, the Information Value model was modified to create the MIV model, based on which raster layers of total information value for every pixel were generated in ArcGIS for all 32 combinations separately. Together with the AUC test of the ROC curve, the optimal combination group of predisposing factors was selected, and a landslide susceptibility map was drawn based on this group. The results show that: (1) the MIV model can tackle the problem of "no data in subclass" well and generate the true information value and real running trend; it performs well in showing the relationship between predisposing factors and landslide occurrence and is well suited to preliminary landslide susceptibility assessment in the study area; (2) the factor combination group with members Slope, Lithology, Drainage network, Annual precipitation, Faults, Road and Vegetation was selected as the optimal combination group to express the susceptibility of landslide occurrence in the Baoxing Catchment, with an accuracy of 95.0%; (3) in the landslide susceptibility zonation map drawn from the optimal group, the whole Baoxing Catchment was reclassified into five classes (very high, high, moderate, low, and very low), which presented an accurate description of different levels of landslide susceptibility, with 79.41% and 13.67% of the validating field survey landslides falling in the Very High and High zones, respectively, both mainly distributed in the south and southeast of the catchment.

Figure 1. Location of the study area.


Figure 3. Result comparison of the two models. (A) IV model; (B) MIV model.


Different topographic conditions may lead to different possibilities of landslides. In this paper, Slope, Aspect and Terrain (Relief amplitude) were taken as predisposing factors in the Topography group to show the changes of Information Values.

Table 1. The reclassification of different predisposing factors.

Table 2. AUC (Area Under Receiver Operating Characteristic Curve) values of all the combination groups.

Table 3. Statistical fitting results of landslides in each landslide susceptibility zone.
