Mapping Landslide Susceptibility Using Machine Learning Algorithms and GIS: A Case Study in Shexian County, Anhui Province, China

Abstract: In this study, five machine learning algorithms, Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting Machine (GBM), and Multilayer Perceptron (MLP), are combined with GIS techniques to map landslide susceptibility in Shexian County, China. Using satellite images and various topographic and geological maps, 16 landslide susceptibility factor maps of Shexian County were first constructed. In total, 502 landslide points and 502 random safety points were then selected. Using the “Extract Multi Values to Points” tool in ArcGIS, values of the 16 factors were extracted at these points and imported into models for the five algorithms; 70% of the samples were used for training and 30% for verification, which is consistent with data symmetry. The Shexian grid was converted into 260,130 vector points and imported into the five models, and the natural breaks method was used to divide the grid into four levels: low, moderate, high, and very high. Finally, the landslide susceptibility maps of Shexian County produced by the five methods were compared using a grid distribution histogram and Area Under Curve (AUC) analysis. Results indicate that the ratio of the proportion of landslide points in the high or very high levels to the proportion of the corresponding grade area was 1.52, 1.77, 1.95, 1.83, and 1.64 for LR, SVM, RF, GBM, and MLP, and the ratio of the proportion of very high landslide points to the proportion of the grade area was 1.92, 2.20, 2.98, 2.62, and 2.14, respectively. The success rate on training samples for the five methods was 0.781, 0.824, 0.853, 0.828, and 0.811, and prediction accuracy was 0.772, 0.803, 0.821, 0.815, and 0.803, respectively; the order of accuracy of the five algorithms was RF > SVM > MLP > GBM > LR. Our results indicate that all five machine learning algorithms perform well for landslide susceptibility evaluation in the Shexian area, with Random Forest performing best.


Introduction
Landslides are a common geological disaster, resulting in economic losses of up to $100 billion globally and accounting for hundreds of deaths. Landslides have a serious effect on the lives and safety of populations, affecting the stable development of society [1]. Although China has a considerable land area, it is characterized by a generally flat topography in the north and the east, with higher land areas in the south and the west. Landslides in China are therefore predominantly concentrated in the south and western areas, especially near the Yangtze River Basin. Over the past 70 years, more than 20,000 people have died due to landslides in China; annual economic losses caused by landslides amount to

Study Area
Shexian County, located at the southernmost tip of Anhui Province, China, bordering Zhejiang Province, has an area of 2122 km² (118°15′00″-118°53′50″ E, 29°30′25″-30°07′00″ N). In 2020, the permanent resident population of Shexian County was 420,000. Elevation in this county fluctuates greatly, ranging from 67 to 1777 m. In general, terrain in the east and the central area is relatively low, with areas in the south, west and north being higher. Shexian County has a humid monsoon climate typical of the northern edge of the subtropics; spring and autumn each account for two months and summer and winter each account for four months. Annual average temperature in this county is 16.3 °C, and annual precipitation is 1100-2000 mm (average annual precipitation is 1582.7 mm). The study area has a developed water system, comprising a dense river network. Almost all rivers in the study region converge into Xin'an River, the largest river in Shexian County. This river has a channel slope of 1.71% and an average annual discharge of 41.86 m³/s. Shexian County also has a good transport network, having a total highway mileage of 1682.5 km [4]. In terms of geology, the study area is located in the South China stratigraphic area, and exposed strata include the Sinian of the Proterozoic, the Cambrian and Ordovician of the Paleozoic, the Jurassic and Cretaceous of the Mesozoic, and Quaternary loose soils and deposits. In terms of structure, Shexian County has experienced many tectonic movements, with abundant folds and faults.
Symmetry 2020, 12, 1954
A landslide is the downward movement of rock or soil that occurs when gravity or other shear stresses exceed the shear strength of the slope. Shexian County is frequently affected by landslide events, with a total of 502 landslide disaster points recorded in this area (Figure 1). In general, landslide disasters in the study area are numerous and densely distributed.

Data
In order to map and predict landslide susceptibility in Shexian County, 502 landslide points were collected in this study. It is important to note that the range of thematic data types used for susceptibility assessment has not changed significantly over time [18]. With reference to previous studies, and according to the geological environment of the study area and the development characteristics of landslides, a total of 16 condition factors were selected and divided into four categories according to type: topography, geology, hydrology, and others [34]. Among the different condition factors, topography was mainly derived from DEM data downloaded from the Geospatial Data Cloud website (http://www.gscloud.cn/search). This category included aspect, slope, plan curvature, profile curvature, topographic relief, surface roughness, and landform. Geology was mainly derived from vectorization of the regional geological map, and included faults and lithology. Hydrology was derived from vectorization of the topographic map, grid calculation in GIS, and extraction of the rainfall distribution map, and included river and net flow intensity indices. The other category included road and vegetation coverage. All condition factors were converted into raster images with a pixel size of 30 m × 30 m.

Topographical and Geological Factors
Among the terrain factors, slope aspect, slope degree, plane curvature, profile curvature, topographic relief, and surface roughness were extracted from DEM data in GIS. Although slope aspect does not directly affect landslide stability, different slope aspects are affected by different

Slope degree is the angle between the slope surface and the horizontal plane. It not only determines the spatial distribution characteristics of landslides but also controls the geotechnical distribution of the slope, and therefore has an important effect on slope stability [37]. Slope in Shexian County varies considerably and was divided into five categories: <5°, 5-15°, 15-30°, 30-50°, and >50° (Figure 2b).
Terrain curvature, a quantitative measure of the degree of change at each point on a slope [29], is decomposed into horizontal and vertical components, termed plan curvature and profile curvature, respectively [4,34,38]. Plan curvature is obtained by first extracting aspect from the DEM data and then extracting slope from the aspect raster. It is divided into three categories: concave (<0), flat (0), and convex (>0) (Figure 2c). Profile curvature is obtained by extracting slope twice from the DEM data and is likewise divided into <0, 0, and >0 (Figure 2d).
Topographical relief refers to the difference between the highest and the lowest altitudes in a certain area. Landslides that develop on slopes with different relief often differ in character [39,40]. The relief degree of Shexian County can be divided into five grades: <15, 15-30, 30-45, 45-60, and >60 (Figure 3a).
Surface roughness of Shexian reflects the degree of surface erosion [41], being the ratio of the surface area of the surface unit to the projected area on the horizontal plane. By dividing surface roughness of Shexian County into <1.05, 1.05-1.15, 1.15-1.30, and >1.30, landslides can be seen to be mainly distributed in areas with relatively low surface roughness (Figure 3b).
Different geomorphic phenomena in Shexian County have had different effects on the occurrence of landslides in this region [34,42]. Landforms in the study area include plains, hills and mountains. Mountainous areas and hills account for 95% of the total study area, and plains account for 5%. Geomorphology of the study area was divided into eight grades: I-1 (plain), I-2 (shallow hilly plain), II-1 (medium hill), II-2 (high hill), III-1 (low undulating low mountain), III-2 (high undulating low mountain), IV-1 (low undulating mountain), and IV-2 (high undulating mountain) [4] (Figure 3c).
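The roughness ratio described above can be sketched in a few lines. This is only an illustration: it assumes the common approximation that the ratio of surface area to projected area of a cell equals 1 / cos(slope), which the source does not state, and the slope values are made up. The class edges are the ones given in the text.

```python
import numpy as np

def surface_roughness(slope_deg):
    """Approximate roughness as surface area / projected area = 1 / cos(slope).

    Assumption: the 1 / cos(slope) approximation is used here for illustration;
    the study does not say how the ratio was computed.
    """
    slope_rad = np.radians(np.asarray(slope_deg, dtype=float))
    return 1.0 / np.cos(slope_rad)

# Hypothetical slope values in degrees
slope = np.array([0.0, 20.0, 45.0, 60.0])
roughness = surface_roughness(slope)

# Class edges from the text: <1.05, 1.05-1.15, 1.15-1.30, >1.30
classes = np.digitize(roughness, [1.05, 1.15, 1.30])
```

Here `classes` holds an integer grade per cell (0 for the lowest roughness class, 3 for the highest).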
Geological faults control the weak structural plane of a slope. Faults act to cut rock and soil mass of a slope into a discontinuous whole, forming a potential landslide point. At the same time, faults can destroy rock and soil structure, providing a channel for rainfall and other factors [43]. Based on the distance from faults, the study area can be divided into five grades: <400, 400-800, 800-1200, 1200-2000, and >2000 m (Figure 4a). The closer a location is to a fault, the greater the impact of the fault.
Lithology is the basis of landslide development, and different lithology compositions affect the type and scale of a landslide [40,44]. Through vectorization of the bedrock geological map, a stratigraphic lithologic map of Shexian County was created and divided into six types (Table 1, Figure 4b).

Hydrological Factors
Water is an important factor affecting the development of landslides, playing a key role in the evaluation of landslide susceptibility. Hydrological factors mainly derive from vectorization of the topographic map and grid calculation by GIS. Rainfall derives from the precision monitoring points established by Shexian meteorological stations in 28 towns and villages, as well as the interpolation of observation data of other meteorological stations nearby (Tunxi District, Huangshan District) [4].
The spatial distance from rivers or water systems expresses the influence of water level change on landslide formation [45]. Slopes can be easily eroded by a river water system, forming a steep free surface. These changes can alter the stress in the slope, resulting in a landslide [46]. By vectorizing rivers in a topographic map, buffer zones were established using GIS. Based on the distance from rivers, the study area can be divided into five grades: <400, 400-800, 800-1200, 1200-2000, and >2000 m (Figure 5a).
The catchment area (As) was calculated from the DEM data of the study area using GIS hydrological tools such as "Flow Direction" and "Flow Accumulation". The Stream Power Index (SPI) was then calculated in the "Raster Calculator" [47][48][49] from As and the slope β, with the slope converted to radians. SPI of Shexian County can be divided into three categories: <2, 2-4, and >4 (Figure 5b). Topographic Wetness Index (TWI) is a quantitative description of soil moisture in a watershed and expresses the influence of terrain change on the soil [50,51]. Based on TWI, Shexian County can be divided into three categories: <6, 6-8, and >8 (Figure 5c).
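As an illustration of how SPI and TWI are typically derived from catchment area and slope, the sketch below uses the widely cited forms SPI = As · tan β and TWI = ln(As / tan β). These forms are an assumption here: the exact expressions used in the study (for example, a log-scaled SPI) may differ. The input values and the `eps` guard for flat cells are hypothetical.

```python
import numpy as np

def spi_twi(catchment_area, slope_deg, eps=1e-6):
    """Return (SPI, TWI) using the commonly cited definitions.

    Assumed forms (not confirmed by the source):
        SPI = As * tan(beta)
        TWI = ln(As / tan(beta))
    """
    beta = np.radians(np.asarray(slope_deg, dtype=float))  # degrees -> radians
    tan_beta = np.maximum(np.tan(beta), eps)               # guard against flat cells
    a_s = np.asarray(catchment_area, dtype=float)
    spi = a_s * tan_beta
    twi = np.log(a_s / tan_beta)
    return spi, twi

# Hypothetical catchment areas and slopes for two cells
a_s = np.array([900.0, 9000.0])
slope = np.array([45.0, 45.0])
spi, twi = spi_twi(a_s, slope)
```

The resulting arrays can then be binned into the categories listed above in the same way as any other factor raster.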
Flow length is the horizontal projection of the maximum ground distance from a ground point along the flow direction to the start or end point of the flow path. It directly affects the speed of surface runoff and therefore the erosive force acting on surface soil [52]. Flow length of Xin'an River and its tributaries, extracted from DEM data by GIS, can be divided into five categories: <500, 500-1000, 1000-1500, 1500-2500, and >2500 m (Figure 5d).

Landslides caused by rainfall events refer to the natural phenomenon whereby rainfall infiltration leads to changes in pore pressure and in the strength of the rock and soil mass, causing rock and soil to slide downward along a weak surface under the action of gravity. Rainfall magnitude determines the degree of landslide risk, with heavy rainfall having a direct effect on landslide occurrence [53]. A rainfall threshold can be used to define the landslide data set quantitatively, that is to say, landslides may occur when a certain rainfall threshold is reached. Unavailable or unreliable landslide records can thus be discarded [53]. Based on the global landslide disaster database compiled by Froude and Petley, 3285 out of 5318 non-seismic landslides in the world were related to rainfall from 2004 to 2016 [54][55][56]. The landslides in Shexian County are also related to rainfall events, and 81.4% of the total landslides occurred in the rainy season (from May to July) [4,5]. Rainfall causes an increase in groundwater level, a decrease in effective stress between rock and soil particles, an increase in pore water pressure, and a decrease in the shear strength of the rock and soil mass. As these changes can result in the occurrence of a landslide, rainfall is regarded as the main factor in landslide occurrence [57]. According to the distribution of atmospheric rainfall in Shexian County, rainfall can be divided into four grades: <1530, 1530-1580, 1580-1630, and >1630 mm (Figure 5e).

Other Factors
Among other factors, roads were derived from the vectorization of geographical location maps, and the normalized difference vegetation index (NDVI) was calculated from remote sensing images.
Roads have been categorized as the main man-made factor causing landslides [13]. As Shexian County is located in a mountainous area, roads are therefore predominantly constructed along mountains. Engineering activities used in road construction destroy the integrity of the mountain along the road, resulting in an increase in landslide susceptibility. According to the distance from the road, buffer zones of <400, 400-1200, 1200-2000, and >2000 m were set (Figure 6a).
Vegetation anchors the soil via its root system, thereby improving the shear resistance of the soil. At the same time, plant transpiration can reduce soil moisture to a certain extent [21,47]. NDVI of Shexian was extracted from remote sensing images and divided into five grades (Figure 6b). Our results indicate that NDVI values were higher in areas further away from towns and lower in areas with concentrated populations [2,4,9].
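For reference, NDVI is conventionally computed from the near-infrared and red bands as (NIR - Red) / (NIR + Red). The sketch below shows that calculation on made-up reflectance values; the band arrays are hypothetical, and the source does not say which sensor or bands were used.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red).

    Cells where NIR + Red is zero are set to 0 to avoid division by zero.
    """
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    out = np.zeros_like(denom)
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

# Hypothetical reflectance values for three pixels
values = ndvi([0.5, 0.2, 0.0], [0.1, 0.2, 0.0])
```

Values close to 1 indicate dense vegetation, while values near 0 or below indicate bare or built-up surfaces.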

Methodology
First, on the basis of the 16 landslide susceptibility factors, the factor values at 502 landslide points and 502 safety points were extracted using the "Extract Multi Values to Points" GIS tool. Of these, 351 landslide points and 351 safety points (70%) were used for training, and 151 landslide points and 151 safety points (30%) were used for verification [31,40,58]. All data (1004 groups) were imported into the scikit-learn (sklearn) library for Python for training and verification, and were analyzed using the LR, SVM, RF, GBM, and MLP algorithms. After the models were established, the "Raster to Point" GIS tool was used to convert the raster image of the study area into 260,130 vector points, and the values of the 16 factors at these 260,130 points were extracted using the "Extract Multi Values to Points" tool. The 260,130 groups of data were then imported into the established models to calculate landslide susceptibility scores. Finally, using the "Point to Raster" GIS tool, five Landslide Susceptibility Maps (LSMs) were obtained, one for each method. The process of drawing LSMs using a machine learning algorithm is shown in Figure 7.
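The training and verification workflow described above can be sketched with scikit-learn as follows. This is a minimal illustration rather than the authors' code: the data are synthetic (1004 samples with 16 features standing in for the extracted factor values), the hyperparameters are library defaults or arbitrary choices, and the feature scaling step is an assumption that the source does not mention.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the 1004 samples x 16 condition factors
X, y = make_classification(n_samples=1004, n_features=16, n_informative=8,
                           random_state=0)

# 70% for training, 30% for verification, keeping the class balance
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Hyperparameters below are illustrative assumptions, not the study's settings
models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC(probability=True, random_state=0)),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "GBM": GradientBoostingClassifier(random_state=0),
    "MLP": make_pipeline(StandardScaler(),
                         MLPClassifier(max_iter=1000, random_state=0)),
}

auc = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Probability of the positive (landslide) class is the susceptibility score
    score = model.predict_proba(X_test)[:, 1]
    auc[name] = roc_auc_score(y_test, score)
```

In a real run, the fitted models would then be applied with `predict_proba` to the factor values of every grid point to obtain the susceptibility score that is mapped back to a raster.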

Logistic Regression (LR)
LR is a generalized linear model [59,60] that has many similarities with linear regression. LR assumes that the dependent variable y follows a Bernoulli distribution, whereas linear regression assumes that y follows a Gaussian distribution [61]; LR is nevertheless supported by linear regression theory. By introducing the logistic function (i.e., the sigmoid function), LR can be used to deal with 0/1 classification problems [43].
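The role of the sigmoid function can be illustrated with a short sketch: a linear combination of the inputs is squashed into the interval (0, 1) and read as the probability of class 1. The weights, bias, and inputs below are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(weights, bias, x):
    """Probability of class 1 for one sample under a logistic regression model."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(z)

# Hypothetical two-factor model: the two terms cancel, so the probability is 0.5
p = predict_proba([1.0, -1.0], 0.0, [2.0, 2.0])
```

A sample is assigned to class 1 when the probability exceeds a chosen cut-off, commonly 0.5.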

Support Vector Machine (SVM)
In order to enhance the generalization ability of the model, an additional optimization objective was added to the linear discriminant method. This method was termed SVM [20,36].
SVM is a linear classifier with the maximum possible margin (safe interval), in which the support vectors are the points lying on both sides of the margin (Figure 8). SVM can be regarded as solving two problems: on the one hand, it finds an appropriate way to measure the similarity between input vectors, i.e., the kernel function K(x, y); on the other hand, it constructs a linear structure by combining the outputs of the training samples, weighted by their similarity to the new test sample [21,61]. The more similar a training sample is to the input, the greater its contribution to the output. As with the original nearest neighbor classifier, the output can be approximately expressed in terms of l, the number of training samples; y_i, the output of the ith training sample; x_i, the ith training sample; and x, the new test sample to be classified. When the kernel is used to calculate the dot product of data points mapped by the function φ(x), it is not necessary to compute the mapping function explicitly.
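The point that a kernel returns the dot product of mapped points without computing the mapping can be checked numerically. The sketch below uses a degree-2 polynomial kernel on 2-D inputs as an assumed example; the source does not state which kernel was used in the study.

```python
import numpy as np

def phi(v):
    """Explicit feature map whose dot product equals (a . b) ** 2 for 2-D inputs."""
    x1, x2 = v
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def poly_kernel(a, b):
    """Degree-2 polynomial kernel K(a, b) = (a . b) ** 2, no mapping needed."""
    return float(np.dot(a, b)) ** 2

# Hypothetical input vectors
a = np.array([1.0, 2.0])
b = np.array([3.0, -1.0])

explicit = float(np.dot(phi(a), phi(b)))  # dot product in the mapped space
implicit = poly_kernel(a, b)              # same value via the kernel
```

Both routes give the same number, but the kernel route never builds the higher-dimensional vectors, which is what makes nonlinear SVMs practical.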

Random Forest (RF)
Tree-based learning algorithms produce prediction models that are accurate, stable and easy to interpret. Unlike linear models, tree-based algorithms can also effectively capture nonlinear relationships. Common tree-based models include decision trees, random forests, and boosted trees [27,28,62,63].
Both classification and regression trees are types of decision tree. The category predicted by a classification tree is the most common category among the training observations in a given region, namely the modal response of the training observations. For classification purposes, the system typically predicts a set of categories together with their probabilities of occurrence. Generally, recursive binary splitting is used to grow a classification tree. However, as the Residual Sum of Squares (RSS) cannot be used as a binary splitting criterion in a classification tree, it is necessary to define an impurity measure Q_m of a leaf node to replace RSS, that is, a measure of the homogeneity of the target variable within the subset regions R_1, R_2, ..., R_J [61]. In node m, which represents region R_m containing n_m sample observations, the frequency of the kth class among the training observations can be expressed as:
$$\hat{p}_{mk} = \frac{1}{n_m} \sum_{x_i \in R_m} I(y_i = k)$$
RF decorrelates the trees through random perturbation. Its core idea is the same as that of bagged trees, so variance is reduced. In addition, RF can consider a large number of predictors, not only because this reduces bias, but also because local feature predictors play an important role in tree structure. RF can use a large number of predictors, even more than the number of observed samples. The most significant advantage of RF is that it can obtain more information to reduce the bias of the fitted values and estimated splits [64,65]. As RF builds enough decision trees, each predictor has at least a few chances to be the predictor that defines a split. In most cases, both the leading predictors and the local feature predictors have the opportunity to define splits of the dataset.
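The class frequency in a node and an impurity measure built from it can be illustrated as follows. The Gini index is used here as one common choice of impurity measure; this is an assumption, as the text does not say which measure the study relies on, and the labels are made up.

```python
from collections import Counter

def class_proportions(labels):
    """Share of each class among the observations that fall in one node."""
    n = len(labels)
    return {k: count / n for k, count in Counter(labels).items()}

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions (0 for a pure node)."""
    return 1.0 - sum(p * p for p in class_proportions(labels).values())

# Hypothetical node with three landslide (1) and one safety (0) observation
node = [1, 1, 0, 1]
proportions = class_proportions(node)
impurity = gini(node)
```

A split is chosen so that the child nodes are as pure as possible, that is, so that the weighted impurity of the children is lowest.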

Gradient Boosting Machine (GBM)
GBM, an ensemble learning method, combines multiple decision trees to build a more powerful model that can be used for classification or regression [28,61,66]. Unlike RF, GBM builds trees sequentially, with each tree trying to correct the errors of the previous one. The idea behind GBM is to combine multiple weak learners to improve their performance. The main parameters of the GBM tree model include the number of trees and the learning rate.

Multilayer Perceptron (MLP)
MLP consists of three layers: an input layer, a hidden layer and an output layer (Figure 9). The different layers in MLP are fully connected [25,67,68]. In the classification task, the softmax function is used as the activation function in the output layer to ensure that the outputs are probabilities that sum to 1. The softmax function takes a vector of arbitrary real-valued scores and converts it into a vector of values between 0 and 1 whose sum is 1 [69].
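The behaviour of the softmax function described above can be shown with a small sketch. The input scores are hypothetical; subtracting the maximum score before exponentiating is a standard numerical-stability step and does not change the result.

```python
import math

def softmax(scores):
    """Convert real-valued scores into probabilities that sum to 1."""
    m = max(scores)                          # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical output-layer scores for three classes
probs = softmax([2.0, 1.0, 0.1])
```

The largest score receives the largest probability, and the ordering of the scores is preserved.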

Results and Discussion
Results gained using the five algorithms were imported into GIS, and the Janks natural breakpoint method [70] was used to divide landslide susceptibility into four grades [29]: low, moderate, high, and very high. Using this method, landslide susceptibility zoning maps were generated for each algorithm ( Figure 10). Results from this analysis recorded certain similarities. For

Results and Discussion
Results from the five algorithms were imported into GIS, and the Jenks natural breaks method [70] was used to divide landslide susceptibility into four grades [29]: low, moderate, high, and very high. Using this method, landslide susceptibility zoning maps were generated for each algorithm (Figure 10). The five maps show certain similarities. For example, landslide susceptibility was high in the middle and northeastern areas, and low in the southern and northwestern areas.
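The principle of the natural breaks classification can be sketched as follows. This is an exhaustive search that is only practical for a handful of values, not the optimized routine provided by GIS software, and the susceptibility scores are hypothetical. The class boundaries are chosen so that the total squared deviation within the classes is as small as possible.

```python
from bisect import bisect_left
from itertools import combinations


def within_class_sse(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def natural_breaks(values, n_classes):
    """Return the upper bound of every class except the last one."""
    data = sorted(values)
    best_cost, best_cuts = None, None
    # Try every way of cutting the sorted values into n_classes contiguous groups.
    for cuts in combinations(range(1, len(data)), n_classes - 1):
        bounds = (0,) + cuts + (len(data),)
        cost = sum(within_class_sse(data[a:b]) for a, b in zip(bounds, bounds[1:]))
        if best_cost is None or cost < best_cost:
            best_cost, best_cuts = cost, cuts
    return [data[c - 1] for c in best_cuts]


GRADES = ["low", "moderate", "high", "very high"]


def grade_of(score, breaks):
    # A score equal to a break value belongs to the lower class.
    return GRADES[bisect_left(breaks, score)]


# Hypothetical susceptibility scores predicted for eleven grid points.
scores = [0.05, 0.08, 0.10, 0.31, 0.35, 0.38, 0.62, 0.66, 0.90, 0.93, 0.95]
breaks = natural_breaks(scores, 4)
```

For these scores the search places the breaks in the three widest gaps of the data, so `grade_of(0.5, breaks)` falls in the high grade.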
In order to quantify and compare differences between the zoning results of the five methods, a histogram of the grid and landslide point distribution across susceptibility grades was drawn (Figure 11). Here, the gray bars show the proportion of the total area that falls in each grade, and the red bars show the proportion of known landslide points that fall in each grade. Higher red bars coupled with lower gray bars in the high and very high grades indicate a more successful model and a better fit. When the proportion of landslide points in the high or very high grades was divided by the proportion of the area in those grades, the five algorithms gave ratios of 1.52 (LR), 1.77 (SVM), 1.95 (RF), 1.83 (GBM), and 1.64 (MLP); when the proportion of landslide points in the very high grade was divided by the proportion of the area in that grade, the ratios were 1.92, 2.20, 2.98, 2.62, and 2.14, respectively. Based on these ratios, the five models can be ranked in the order RF > GBM > SVM > MLP > LR.
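The ratio used in this comparison can be written out as a short calculation. The grid and landslide point counts below are hypothetical and only illustrate the arithmetic; they are not the counts behind Figure 11.

```python
def point_to_area_ratio(area_cells, landslide_points, grades):
    """Share of landslide points in the given grades divided by their share of the area."""
    point_share = sum(landslide_points[g] for g in grades) / sum(landslide_points.values())
    area_share = sum(area_cells[g] for g in grades) / sum(area_cells.values())
    return point_share / area_share


# Hypothetical counts per susceptibility grade.
area_cells = {"low": 400, "moderate": 300, "high": 200, "very high": 100}
landslide_points = {"low": 10, "moderate": 20, "high": 30, "very high": 40}

ratio_high_or_very_high = point_to_area_ratio(
    area_cells, landslide_points, ["high", "very high"]
)
ratio_very_high = point_to_area_ratio(area_cells, landslide_points, ["very high"])
```

A ratio above 1 means that the grade contains a larger share of known landslides than of the study area, so a larger ratio in the high and very high grades indicates a better concentration of landslides in the zones flagged as susceptible.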
The accuracy of a landslide susceptibility model can be assessed using the Receiver Operating Characteristic (ROC) curve [18]. Here, the x-axis is the false positive rate, i.e., 1 - specificity, which indicates the probability that non-disaster points are mispredicted; the y-axis is the true positive rate, i.e., sensitivity, which indicates the probability that disaster points are correctly predicted [20,40,71]. The Area Under the Curve (AUC) was used to analyze the accuracy of the prediction results. AUC is the area enclosed by the ROC curve and the coordinate axes. The closer the AUC value is to 1, the more accurate the model's predictions are [29,31,72].
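The construction of the ROC curve and the AUC can be sketched in a few lines. This is a plain-Python illustration with hypothetical labels and scores, not the script used to produce the curves in this study: the decision threshold is lowered step by step, the false positive rate and true positive rate are recorded at each step, and the area is obtained with the trapezoidal rule.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs for decreasing thresholds."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical test samples: 1 = landslide point, 0 = safety point.
labels = [1, 1, 0, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]
curve = roc_points(labels, scores)
area = auc(curve)
```

For this example the AUC is 0.8125, which equals the fraction of landslide and safety point pairs in which the landslide point receives the higher score (13 of 16).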
Python was used in this analysis to derive the success rate curves of the training samples (Figure 12) and the prediction rate curves of the test samples (Figure 13). The success rates of the LR, SVM, RF, GBM, and MLP models on the training samples were 0.781, 0.824, 0.853, 0.828, and 0.811, respectively (Figure 12), and the prediction rates on the test samples were 0.772, 0.803, 0.821, 0.815, and 0.803, respectively (Figure 13). These results indicate that the accuracy of the RF model was relatively high. Taking the success rate and prediction rate together, the accuracy of the five algorithms was in the order RF > GBM > SVM > MLP > LR. This ranking was generally consistent with the results of the grid distribution histogram (Figure 11).
Of the five machine learning algorithms used for landslide susceptibility mapping in Shexian County, RF performed best, followed by GBM and SVM; MLP and LR had the lowest accuracy. Since RF is an ensemble model, its generalization ability, robustness to interference, and fitting ability are all stronger than those of a single model. Because landslide susceptibility in Shexian County is derived from 16 factors, the relationships between the factors are complex. For this practical problem, the RF ensemble model achieved good results.

Figure 13. Prediction rate curve.

Conclusions
In this study, GIS and Sklearn were used in conjunction with five machine learning algorithms (LR, SVM, RF, GBM, and MLP) to analyze conditioning factors such as slope aspect, slope, plane curvature, profile curvature, terrain emergence degree, surface roughness, landform, fault, lithology, river buffer zone, net flow intensity index, and land surface roughness. On the basis of 16 conditioning factors, which also include the shape humidity index, flow length, rainfall, road buffer, and NDVI, 502 landslide and safety points were analyzed using the five algorithms. Overall, 70% of the sample points were used as training data and 30% were used for verification. After the models for the five algorithms were established, the grid of the study region was converted into 260,130 points, which were then imported into the models for calculation. Finally, landslide susceptibility maps based on the five algorithms were created. According to the degree of susceptibility, the study area was divided into four grades: low, moderate, high, and very high.
At the same time, the distribution histogram of grid cells and landslide points in different grades, together with ROC curves and AUC values, was used to compare the performance of the algorithms. Results indicated that all five models were relatively successful in predicting landslide susceptibility. The ratio of high or very high landslide points to grade area defined by LR, SVM, RF, GBM, and MLP was 1.52, 1.77, 1.95, 1.83, and 1.64, and the ratio of very high landslide points to grade area was 1.92, 2.20, 2.98, 2.62, and 2.14, respectively. The success rates of training samples were 0.781, 0.824, 0.853, 0.828, and 0.811, and the prediction rates of test samples were 0.772, 0.803, 0.821, 0.815, and 0.803, respectively. The success rates and prediction rates of SVM, RF, GBM, and MLP were all greater than 0.8, whereas those of LR were slightly lower than 0.8. Ordering the five algorithms from best to worst (RF > GBM > SVM > MLP > LR), our results indicated that RF gave the best landslide susceptibility evaluation. By combining machine learning algorithms with GIS to map and evaluate landslide susceptibility, this investigation provides more detailed information for relevant staff. The method presented in this study is not only suitable for Shexian County but can also be extended to other mountainous areas in southern Anhui Province.