Landslide Susceptibility Prediction Using Particle-Swarm-Optimized Multilayer Perceptron: Comparisons with Multilayer-Perceptron-Only, BP Neural Network, and Information Value Models

Abstract: Landslides are one type of serious geological hazard which cause immense losses of local life and property. Landslide susceptibility prediction (LSP) can be used to determine the spatial probability of landslide occurrence in a certain area. It is important to implement LSP for landslide hazard prevention and reduction. This study developed a particle-swarm-optimized multilayer perceptron (PSO-MLP) model for LSP implementation to overcome the drawbacks of the conventional gradient descent algorithm and to determine the optimal structural parameters of MLP. Shicheng County in Jiangxi Province of China was used as the study area. In total, 369 landslides, randomly selected non-landslides, and 14 landslide-related predisposing factors were used to train and test the present PSO-MLP model and three other comparative models (an MLP-only model with the gradient descent algorithm, a back-propagation neural network (BPNN), and an information value (IV) model). The results showed that the PSO-MLP model had the most accurate prediction performance (area under the receiver operating characteristic curve (AUC) of 0.822 and frequency ratio (FR) accuracy of 0.856) compared with the MLP-only (0.791 and 0.829), BPNN (0.800 and 0.840), and IV (0.788 and 0.824) models. It can be concluded that the proposed PSO-MLP model addresses the drawbacks of the MLP-only model well and performs better than conventional artificial neural networks (ANNs) and statistical models. The spatial probability distribution law of landslide occurrence in Shicheng County was well revealed by the landslide susceptibility map produced using the PSO-MLP model. Furthermore, the present PSO-MLP model may have higher prediction and classification performances in some other fields compared with conventional ANNs and statistical models.


Introduction
A landslide is a type of very serious natural hazard that occurs worldwide and results in immense losses in human life and property [1][2][3]. Much attention has been paid by geological engineers to determine the susceptible areas where landslides are likely to occur, and landslide susceptibility prediction (LSP) and susceptibility mapping are significant technologies used to this end [4,5].
Along with the development of information technologies, remote sensing and the geographic information system (GIS) have gradually become data sources and spatial analysis platforms for LSP [6,7]. Based on remote sensing and GIS, many mathematical models have been proposed to calculate landslide susceptibility indices (LSI), such as the analytic hierarchy process [8][9][10], weight evidence method [11], information value (IV) theory [5,12], frequency ratio (FR) method [13,14],

Materials
The materials include an introduction to the study area, the landslide inventory information, and the related predisposing factors.

Study Area and Landslide Inventory Information
Shicheng County is located in the southeastern part of Jiangxi Province and spans longitudes 116°05′46″ E to 116°38′03″ E and latitudes 25°57′47″ N to 26°36′13″ N (Figure 1). Its total area is about 1581.5 km² with a length of 71.8 km and a width of 53.7 km. The total population is about 3.33 × 10⁵. Shicheng County belongs to the subtropical monsoon humid climate zone and has abundant sunshine, four distinctive seasons, and rich rainfall. Its average annual precipitation was about 1748.6 mm between 1970 and 2015, and the total precipitation of the main flood season (April-June) accounts for 50.1% of the total annual precipitation. Both precipitation and temperature are non-uniformly distributed in Shicheng County due to the complex terrain characteristics and the relationship between land and sea locations.
Appl. Sci. 2019, 9, x FOR PEER REVIEW 3 of 18
Shicheng County is a mountainous area with a well-developed river system and a dense river network (0.6 km per km²). The Qin River flows through the whole area from northeast to southwest and finally flows into the Ganjiang River. The groundwater of Shicheng County is mainly shallow groundwater, characterized by good recharge conditions, rapid regeneration, and easy extraction. Furthermore, Shicheng County is in a mountain basin surrounded by the Wuyi Mountains. These mountains are mainly composed of pre-Devonian metamorphic rocks, Devonian quartz sandstone, sandy conglomerate, and sandy shale. In general, Shicheng County is in a typical southeast hilly region, with many mountains in the northeast area, rolling hills in the southwest area, and flat terrain in the central area.
Based on investigations by the Land and Resources Department of Jiangxi Province, the landslide inventory map shown in Figure 1 suggests that 369 landslides occurred in the study area from 1970 to 2012. The sliding masses of these landslides are mainly composed of Quaternary silty clay intercalated with crushed stones that have a thickness ranging from 2 to 8 m. These landslides can be mainly classified as shallow soil landslides with a movement type of clay/silt slide [42]. The main features of these shallow soil landslides are their small scale, high frequency, group occurrence, and wide distribution. The total cover area of the recorded landslides in Shicheng County is about 2.44 × 10⁶ m², and the area of these landslides ranges from about 1.0 × 10³ to 1.6 × 10⁴ m². Furthermore, the main direct trigger factors of these landslides are seasonal heavy rainfall and frequent unreasonable human engineering activities, such as slope toe cutting and road excavation.

Landslide-Related Predisposing Factors
In this study, we selected 14 landslide-related predisposing factors as input variables of the PSO-MLP, MLP-only, BPNN, and IV models. These predisposing factors were topography factors (digital elevation model (DEM), slope, aspect, relief amplitude, plan curvature, and profile curvature); hydrological factors (distance to river, topographic wetness index (TWI), and modified normalized difference water index (MNDWI)); lithological factors (rock types); and land cover factors (normalized difference building index (NDBI), normalized difference vegetation index (NDVI), total surface radiation, and population density index). The DEM was obtained from the Global Digital Elevation Model of the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER GDEM), and the MNDWI, NDBI, and NDVI factors were calculated from Landsat 8 TM images (obtained on 5 October 2013, path/row 121/42, 30 m resolution). These 14 predisposing factors were selected according to the related literature and studies in areas with similar physical geography and geological environments [11,43,44].
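The three normalized difference indices named above share one arithmetic form, which the following minimal sketch illustrates. It assumes the commonly used band pairings (NDVI from near-infrared and red, NDBI from shortwave-infrared and near-infrared, MNDWI from green and shortwave-infrared); the text does not state which Landsat bands were used, and the function names, argument names, and toy reflectance values are hypothetical.

```python
import numpy as np

def normalized_difference(a, b):
    """Return (a - b) / (a + b) element-wise, with 0 where the denominator is 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.zeros_like(denom)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out

def spectral_indices(green, red, nir, swir):
    """Compute NDVI, NDBI, and MNDWI rasters (band pairings are assumed)."""
    return {
        "NDVI": normalized_difference(nir, red),      # vegetation
        "NDBI": normalized_difference(swir, nir),     # built-up land
        "MNDWI": normalized_difference(green, swir),  # surface water
    }

# Toy 2 x 2 reflectance rasters (illustrative values only, not study data).
green = np.array([[0.10, 0.30], [0.08, 0.12]])
red = np.array([[0.10, 0.20], [0.06, 0.15]])
nir = np.array([[0.50, 0.10], [0.45, 0.25]])
swir = np.array([[0.20, 0.05], [0.18, 0.35]])

indices = spectral_indices(green, red, nir, swir)
```

Each index is bounded between -1 and 1 for non-negative reflectances, which makes the three rasters directly comparable before they are classified into subclasses.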
In this study, the data format of DEM and original remote sensing images was grid cells with a spatial resolution of 30 × 30 m. Moreover, the raster format has the advantages of fast subdivision, simple expression, and high computational efficiency. Hence, the raster format with grid cells was applied to express all 14 predisposing factors in Shicheng County. The recorded landslides were subdivided into 2709 landslide grid cells.
(1) Topography factors in Shicheng County
The DEM (Figure 2a), which indicates the altitude above sea level in the study area, was the data source of the five other topographic factors. These six topographic factors have important effects on the probability of landslide occurrence [45]. Slope (Figure 2b) has a direct influence on the landslide stability coefficient, and it is one of the basic elements of landslide evaluation [46]. Aspect (Figure 2c) influences landslide occurrence by affecting moisture movement, soil properties, and so forth [47]. Relief amplitude (Figure 2d) directly reflects terrain complexity through the variations of DEM in a certain area [48]. Plan curvature (Figure 2e) and profile curvature (Figure 2f), respectively defined as the slope of the aspect and the slope of the slope, can also effectively reflect the terrain complexity of the study area [49]. Terrain complexity affects landslide evolution by influencing soil erosion, slope structure, sediment transportation, and so forth.
(2) Hydrological, lithological, and land cover factors
Hydrological factors are very important factors of landslide evolution which influence the surface water distribution, water and soil saturation, water sources, runoff generation, and so forth [50]. The distance to river and TWI calculated from DEM and the MNDWI were used in this study as hydrological factors (Figure 3a-c). Distance to river reflects the water scour and soil erosion effects of surface runoff on landslide evolution [49], and TWI reflects the influence of terrain on soil saturation, rainfall infiltration, and runoff [50,51]. MNDWI suggests the surface humidity features, which indirectly present the moisture content in the slope mass [13].
The lithological factor in Shicheng County was expressed using rock types in this study, which remarkably affect the slope soil types, slope structure, and soil shear strength [11,43]. The rock types in Shicheng County are mainly defined as metamorphic, carbonate, and clastic rocks (Figure 3d).
Land cover is also a key factor for LSP. The NDBI, NDVI, total surface radiation, and population density index were adopted in this study to express the land cover factors (Figure 3e-h). NDBI reflects the building distribution features, and it indirectly affects the hydrological conditions and mechanical strength of slope soil [52,53]. NDVI reflects the vegetation growth features, and it mainly changes the hydrological and soil shear strength features of slope soil [11]. Furthermore, total surface radiation is the sum of direct solar radiation and diffuse solar radiation, and it indirectly influences the probability of landslide occurrence by affecting the aboveground vegetation and soil moisture [54]. The population density index affects landslide evolution indirectly, as it serves as a proxy for the intensity of human engineering activity [55].

FR and Correlation Analysis of Predisposing Factors
The FR values of the 14 predisposing factors were calculated to build connections between the recorded landslide grid cells and the predisposing factors [26,49,50,56]. Two discrete predisposing factors, namely, distance to river and rock types, were directly used, while the other continuous predisposing factors were first divided into several subclasses, which were then used to calculate the FR values. In this study, the continuous predisposing factors were generally divided into eight subclasses using the Jenks natural break classification method [45,49,56].
The calculated FR values are shown in Table 1, which demonstrates that these factors play important roles in landslide occurrence in Shicheng County. Generally, a higher FR value indicates a more significant influence of a subclass of a predisposing factor on landslide occurrence. Table 1 shows that, for example, slope values between 7.5° and 14.9° have a greater effect on landslide occurrence than other values, and that the closer a location is to the river, the greater the influence on landslide occurrence. Further, metamorphic rock is more conducive to landslide occurrence than other rock types.
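To make the FR calculation concrete, the sketch below uses the common definition of the frequency ratio: the share of landslide cells falling in a subclass divided by the share of all cells falling in that subclass. The section does not spell out the formula, so this definition is an assumption, and the function name and the ten-cell toy raster are hypothetical.

```python
import numpy as np

def frequency_ratio(class_ids, is_landslide):
    """FR per subclass: (landslide share in subclass) / (area share of subclass)."""
    class_ids = np.asarray(class_ids)
    is_landslide = np.asarray(is_landslide, dtype=bool)
    total_cells = class_ids.size
    total_landslides = is_landslide.sum()
    fr = {}
    for c in np.unique(class_ids):
        in_class = class_ids == c
        landslide_share = is_landslide[in_class].sum() / total_landslides
        area_share = in_class.sum() / total_cells
        fr[int(c)] = float(landslide_share / area_share)
    return fr

# Toy raster flattened to 10 cells: subclass label and landslide flag per cell.
class_ids = np.array([1, 1, 1, 1, 2, 2, 2, 2, 2, 2])
is_landslide = np.array([1, 1, 1, 0, 1, 0, 0, 0, 0, 0])

fr = frequency_ratio(class_ids, is_landslide)
```

In this toy case subclass 1 covers 40% of the cells but holds 75% of the landslide cells, so its FR exceeds 1, whereas subclass 2 has an FR below 1. An FR above 1 therefore marks a subclass in which landslides are over-represented relative to its area.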

Methods
The spatial data of the landslide inventory and the related predisposing factors were first obtained, and the FR values were then calculated. Based on these data, a novel PSO-MLP model is proposed to carry out LSP in Shicheng County.

Multilayer Perceptron
MLP is a multilayer feed-forward network model with one-way error propagation, and it is one of the most widely used ANNs. MLP can solve the problems of pattern recognition, time series prediction, and so forth [34,57]. The evolution of a landslide, which is a complex physical process, is also a nonlinear system affected by the natural environment and human engineering activities [58,59]. Therefore, compared with deterministic models or general linear statistical methods, the MLP model has excellent nonlinear mapping ability for performing LSP [33].
The MLP model is composed of input, hidden, and output layers, all of which are composed of similar neurons (Figure 4). The connections between the input and hidden layers and between the hidden and output layers are all weighted. By adjusting these weight values during training, the neural network forms an orderly and stable structure with decision-making ability. Since an MLP with a single hidden layer can approximate a nonlinear system with arbitrary accuracy, this paper mainly studied the single-hidden-layer MLP model [60].
For an MLP model with multiple input and output variables, let the input vector be $X = [x_1, x_2, \cdots, x_{n_0}]$, and let the numbers of neurons in the input, hidden, and output layers be $n_0$, $n_1$, and $n_2$, respectively. The input of each hidden-layer neuron is the weighted sum of the input variables combined with the neuron's threshold, and its output is the activation function applied to that input. Here, $z_j$, $b_j$, and $y_j$ are respectively the input, the threshold, and the output of the $j$th neuron in the hidden layer; $w_{ij}$ is the weight value between the $i$th input neuron and the $j$th hidden-layer neuron; and $f(z_j)$ is the activation function. The input and output of the neurons in the output layer are defined in the same way, where $z_k$, $b_k$, and $y_k$ are the input, the threshold, and the output of the $k$th neuron in the output layer, and $w_{jk}$ is the weight value between the $j$th hidden-layer neuron and the $k$th output-layer neuron.
In general, the error back-propagation method with the conventional gradient descent algorithm, which is used as the training rule of MLP, adjusts the weight values between the neurons based on the errors between the actual values and the MLP-predicted values for the training samples. The minimum value of the objective function and the optimal weight values can then be approached step by step through the iterative solution of the conventional gradient descent algorithm.
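Written out, the hidden-layer and output-layer relations described above take the following form. This is a sketch of the standard single-hidden-layer formulation rather than a transcription of the authors' equations; in particular, adding the thresholds $b_j$ and $b_k$ (rather than subtracting them) is an assumed sign convention, and using the same activation $f$ in both layers is also an assumption.

```latex
\begin{aligned}
z_j &= \sum_{i=1}^{n_0} w_{ij}\, x_i + b_j, & y_j &= f(z_j), & j &= 1, \dots, n_1, \\
z_k &= \sum_{j=1}^{n_1} w_{jk}\, y_j + b_k, & y_k &= f(z_k), & k &= 1, \dots, n_2.
\end{aligned}
```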
However, conventional gradient descent has several disadvantages: it can zigzag inefficiently along its search direction, it is easily trapped in a local optimum, and it converges slowly. To mitigate these problems, mini-batch gradient descent, a compromise between batch gradient descent and stochastic gradient descent, was used to train and test the MLP model. Compared with conventional gradient descent, mini-batch gradient descent is less likely to become trapped in a poor local optimum and converges faster.
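The training procedure discussed in this subsection can be sketched as follows: a single-hidden-layer MLP trained by mini-batch gradient descent with momentum. This is a minimal illustration, not the authors' implementation. The sigmoid activation, the cross-entropy loss, the additive bias convention, all hyperparameter values, and the toy two-feature dataset are assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_mlp_minibatch(X, y, n_hidden=8, lr=0.1, momentum=0.9,
                        batch_size=20, epochs=200, seed=0):
    """Train a single-hidden-layer MLP with mini-batch gradient descent."""
    rng = np.random.default_rng(seed)
    params = {
        "W1": rng.normal(0.0, 0.5, (X.shape[1], n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, (n_hidden, 1)),
        "b2": np.zeros(1),
    }
    velocity = {k: np.zeros_like(v) for k, v in params.items()}
    for _ in range(epochs):
        order = rng.permutation(len(X))  # reshuffle the samples every epoch
        for start in range(0, len(X), batch_size):
            idx = order[start:start + batch_size]
            xb, yb = X[idx], y[idx].reshape(-1, 1)
            # Forward pass through the hidden and output layers.
            h = sigmoid(xb @ params["W1"] + params["b1"])
            p = sigmoid(h @ params["W2"] + params["b2"])
            # Backward pass: for cross-entropy with a sigmoid output,
            # the gradient w.r.t. the output pre-activation is (p - y).
            d2 = (p - yb) / len(idx)
            d1 = (d2 @ params["W2"].T) * h * (1.0 - h)
            grads = {"W1": xb.T @ d1, "b1": d1.sum(axis=0),
                     "W2": h.T @ d2, "b2": d2.sum(axis=0)}
            # Momentum update: keep part of the previous step direction.
            for k in params:
                velocity[k] = momentum * velocity[k] - lr * grads[k]
                params[k] += velocity[k]
    return params

def predict_proba(params, X):
    h = sigmoid(X @ params["W1"] + params["b1"])
    return sigmoid(h @ params["W2"] + params["b2"]).ravel()

# Toy binary problem (illustrative only): label is 1 when x0 + x1 > 0.
rng = np.random.default_rng(42)
X = rng.uniform(-1.0, 1.0, (200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

params = train_mlp_minibatch(X, y)
accuracy = float(np.mean((predict_proba(params, X) > 0.5) == (y > 0.5)))
```

Each update uses only one small batch of samples, so the weights move many times per pass over the data, and the momentum term smooths the direction of successive steps.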

Theory of PSO-MLP Model
PSO is a global optimization algorithm that simulates the foraging behavior of birds in groups [61]. The position and velocity of each particle can be updated based on the globally optimal solution and the current optimal solution; as a result, all the particles move in the direction guided by the objective function. Then, the final global optimal solution can be calculated. The PSO algorithm has better global optimization capability and higher calculation performance than other optimization algorithms (e.g., the genetic algorithm and the ant colony algorithm) [62].
The mini-batch gradient descent algorithm is mainly used to determine the appropriate connection weights between the neurons of MLP. However, several structural parameters, such as the learning rate, the learning decay rate, and the learning momentum, must also be determined appropriately [39]. The learning rate indicates the change range of the weight values between the neurons for each training iteration, and the learning momentum is applied to ensure that the change direction of the weight value is stable. In addition, the number of neurons in the hidden layer has an important effect on the prediction performance of MLP. Thus, it is necessary to select a proper number of neurons in the hidden layer [41]. In this study, PSO was used to determine these four structural parameters of MLP.
The calculation processes of the proposed PSO-MLP are shown in Figure 5. Firstly, the initial parameters of PSO itself were selected, including the number of particles, the maximum number of iterations, and so forth. Secondly, the MLP model with the mini-batch gradient descent algorithm was trained and tested based on landslide and non-landslide samples. Thirdly, the area under the receiver operating characteristic (ROC) curve (AUC) was selected as the fitness function of the MLP, and this fitness function also served as the objective function of PSO. Fourthly, by comparing the AUC value of each particle with the global and local best AUC values, the position and velocity of each particle were updated gradually. Finally, the update process continued until the preset maximum number of PSO iterations was reached.
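The update loop described above can be sketched with the standard inertia-weight form of PSO. This is a minimal illustration under stated assumptions: the inertia weight and the two acceleration coefficients are hypothetical values (the text only says default settings were used), and a simple toy function stands in for the fitness, which in the study is the AUC of a trained MLP.

```python
import numpy as np

def pso_maximize(fitness, lower, upper, n_particles=20, n_iter=50,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize `fitness` over a box using inertia-weight PSO."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    span = upper - lower
    pos = rng.uniform(lower, upper, (n_particles, dim))
    vel = 0.1 * rng.uniform(-span, span, (n_particles, dim))
    # Each particle remembers its own best position (local best).
    pbest_pos = pos.copy()
    pbest_val = np.array([fitness(p) for p in pos])
    g = int(np.argmax(pbest_val))
    gbest_pos, gbest_val = pbest_pos[g].copy(), pbest_val[g]
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity blends the previous velocity with pulls toward the
        # particle's own best and the swarm's global best positions.
        vel = (inertia * vel
               + c1 * r1 * (pbest_pos - pos)
               + c2 * r2 * (gbest_pos - pos))
        pos = np.clip(pos + vel, lower, upper)
        vals = np.array([fitness(p) for p in pos])
        improved = vals > pbest_val
        pbest_pos[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        g = int(np.argmax(pbest_val))
        if pbest_val[g] > gbest_val:
            gbest_pos, gbest_val = pbest_pos[g].copy(), pbest_val[g]
    return gbest_pos, float(gbest_val)

def toy_fitness(p):
    """Stand-in objective with its maximum (value 0) at (0.3, 0.7)."""
    return -float(np.sum((p - np.array([0.3, 0.7])) ** 2))

best_pos, best_val = pso_maximize(toy_fitness, [0.0, 0.0], [1.0, 1.0])
```

Replacing `toy_fitness` with a function that trains an MLP for a given parameter vector and returns its test AUC would give the overall structure of a PSO-driven hyperparameter search, at the cost of one model fit per particle per iteration.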

Training and Testing Variables of the Four Models
The model-building processes of LSP can be considered as a 0/1 classification problem. In general, input and output variables are needed to build these binary classification models. In this study, we stored and managed 369 recorded landslide polygons as 2709 landslide grid cells (assigned to 1) in GIS software. Further, 2709 non-landslide grid cells (assigned to 0) were randomly selected from the landslide-free areas [11,13,43]. The recorded landslide grid cells and selected non-landslide grid cells were used as output variables, while the landslide-related predisposing factors were used as input variables of these models. A spatial database containing input and output variables was divided into two parts: a training dataset (70% of the landslide and non-landslide grid cells) for model construction and a testing dataset (the remaining 30% of the landslide and non-landslide grid cells) for model validation. Finally, the FR values of the 14 predisposing factors were used as numeric input variables of these LSP models.
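The 70/30 partition described above can be sketched as follows. The split here is applied separately to landslide and non-landslide cells so that both parts stay balanced. Whether the original split was stratified in exactly this way is an assumption, and the function name and the random placeholder feature matrix are hypothetical.

```python
import numpy as np

def stratified_split(X, y, train_fraction=0.7, seed=0):
    """Split samples into training and testing parts, class by class."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for label in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == label))
        n_train = int(round(train_fraction * idx.size))
        train_idx.append(idx[:n_train])
        test_idx.append(idx[n_train:])
    train_idx = np.concatenate(train_idx)
    test_idx = np.concatenate(test_idx)
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

# 2709 landslide cells (1) and 2709 non-landslide cells (0), 14 factors each.
n_per_class = 2709
rng = np.random.default_rng(1)
X = rng.random((2 * n_per_class, 14))  # placeholder for the FR-valued inputs
y = np.concatenate([np.ones(n_per_class, dtype=int),
                    np.zeros(n_per_class, dtype=int)])

X_train, y_train, X_test, y_test = stratified_split(X, y)
```

Splitting per class keeps the 1:1 ratio of landslide to non-landslide cells in both the training and testing datasets, which avoids an accidental class imbalance in the smaller testing part.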

PSO-MLP Model for LSP
The PSO algorithm was applied to optimize the parameter selection of the MLP model with the mini-batch gradient descent algorithm. First, the PSO algorithm was initialized; for example, the particle population was set to 100, the maximum number of iterations was set to 35, the initial velocity and position of each particle were randomly determined, and the other related parameters were set to default values. Meanwhile, the dimensions of both velocity and position were set to four because there were four parameters (learning rate, learning decay rate, the momentum of the mini-batch gradient descent algorithm, and the number of neurons in the hidden layer of the MLP model) that needed to be determined.
Secondly, several initial parameters of the MLP model were assigned: the number of hidden layers was set to 1; the search ranges of the learning rate, learning decay rate, and momentum of mini-batch gradient descent were set to 0.001-0.2, 0.001-0.5, and 0.05-0.95, respectively; and the number of neurons in the hidden layer ranged from 5 to 35. In addition, some other parameters of MLP were set to default values. Thirdly, the PSO-MLP model was trained and tested based on the datasets introduced in Section 4.3.
The PSO-MLP model-building process showed that its optimal parameters were a learning rate of 0.006, a learning decay rate of 0.004, a learning momentum of 0.78, and a hidden-layer neuron number of 23.
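One way to connect the four-dimensional particle position to the MLP settings is to clip each coordinate to its search range and round the neuron count to an integer. The sketch below illustrates this mapping using the ranges and the optimum reported above; the dictionary keys, the function name, and the clip-and-round strategy are assumptions for illustration, not the authors' stated procedure.

```python
import numpy as np

# Search ranges quoted in the text; the key names are hypothetical.
SEARCH_SPACE = {
    "learning_rate": (0.001, 0.2),
    "learning_decay_rate": (0.001, 0.5),
    "momentum": (0.05, 0.95),
    "n_hidden": (5, 35),
}

def decode_particle(position):
    """Map a 4-D particle position to a valid MLP parameter set."""
    lows = np.array([bounds[0] for bounds in SEARCH_SPACE.values()], dtype=float)
    highs = np.array([bounds[1] for bounds in SEARCH_SPACE.values()], dtype=float)
    clipped = np.clip(np.asarray(position, dtype=float), lows, highs)
    params = dict(zip(SEARCH_SPACE, clipped.tolist()))
    # The neuron count must be an integer, so round the continuous coordinate.
    params["n_hidden"] = int(round(params["n_hidden"]))
    return params

# The optimum reported in the text, expressed as a particle position.
reported_optimum = decode_particle([0.006, 0.004, 0.78, 23])

# A position outside the ranges is pulled back to the nearest bounds.
out_of_range = decode_particle([0.5, -1.0, 0.99, 40.6])
```

Treating the neuron count as a continuous coordinate that is rounded at evaluation time lets a single swarm search over real-valued and integer-valued parameters together.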
Finally, a landslide susceptibility map was produced based on the LSI values calculated by the PSO-MLP model, as shown in Figure 6a. This landslide susceptibility map was divided into five levels using the natural interval point method [63]: very high (10.2%), high (15.7%), moderate (25.7%), low (24.8%), and very low (23.6%) susceptibility levels (Table 2). In general, the high and very high landslide susceptibility levels were mainly distributed in areas close to river networks with relatively low elevation, moderate slope, high population density, and active human engineering and construction. On the contrary, the low and very low landslide susceptibility levels were mainly located in zones far from river networks with high elevation, gentle or steep slopes, high vegetation cover, and few human activities.

MLP-Only Model for LSP
The MLP-only model with the conventional gradient descent algorithm was used to carry out LSP for comparison. The optimal number of neurons in the hidden layer was determined to be 21 according to the minimum prediction error method [62]. All the other parameters of the MLP-only model were set to default values and/or determined using default methods. The landslide susceptibility map produced by the MLP-only model is shown in Figure 6b and was also divided into very low (27.7%), low (22.9%), moderate (22.7%), high (16.9%), and very high (9.9%) susceptibility levels using the natural interval point method (Table 2).

BPNN Model
The commonly used BPNN model for LSP is mainly composed of input, hidden, and output layers. Related studies have shown that this model structure can fit any nonlinear function and can handle many kinds of complex time series prediction and pattern recognition problems [62]. Therefore, this study also adopted a single-hidden-layer BPNN model to carry out LSP in Shicheng County. Each layer of the BPNN model was composed of a certain number of neurons, and the input, hidden, and output layers were connected by weight values. The error back-propagation algorithm was applied to determine these weight values.
In this study, the input-output variables used for PSO-MLP were used again for BPNN model training and testing, and the optimal number of neurons in the hidden layer was set to 20 based on the minimum prediction error method [64]. The other parameters of the BPNN model were set to default values. Finally, the landslide susceptibility map produced by the BPNN model is shown in Figure 6c, which had very low (25.7%), low (22.4%), moderate (24.9%), high (16.1%), and very high (10.9%) susceptibility levels ( Table 2).
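The structure described above (input, hidden, and output layers connected by weights that are fitted by error back-propagation) can be sketched in a few lines. This is an illustrative toy and not the configuration used in the study: sigmoid activations, a squared-error loss, full-batch updates, three hidden neurons, and the four training samples are all assumptions.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, W2):
    """Forward pass; the trailing 1.0 feeds the bias weight of each neuron."""
    h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in W1]
    o = sigmoid(sum(w * v for w, v in zip(W2, h + [1.0])))
    return h, o

def train_bpnn(X, y, n_hidden=3, lr=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(X[0])
    # W1[j] holds the weights from all inputs (plus bias) to hidden neuron j.
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    # W2 holds the weights from all hidden neurons (plus bias) to the output.
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    history = []
    for _ in range(epochs):
        gW1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        gW2 = [0.0] * (n_hidden + 1)
        loss = 0.0
        for x, t in zip(X, y):
            h, o = forward(x, W1, W2)
            loss += 0.5 * (o - t) ** 2
            # Back-propagate the output error through the sigmoid derivatives.
            delta_o = (o - t) * o * (1.0 - o)
            for j, hv in enumerate(h + [1.0]):
                gW2[j] += delta_o * hv
            for j in range(n_hidden):
                delta_h = delta_o * W2[j] * h[j] * (1.0 - h[j])
                for i, xv in enumerate(x + [1.0]):
                    gW1[j][i] += delta_h * xv
        history.append(loss / len(X))
        # Gradient descent step on the mean gradient of the batch.
        n = len(X)
        W2 = [w - lr * g / n for w, g in zip(W2, gW2)]
        W1 = [[w - lr * g / n for w, g in zip(row, grow)] for row, grow in zip(W1, gW1)]
    return W1, W2, history

# Hypothetical two-factor samples: label 1 = landslide, 0 = non-landslide.
toy_X = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9]]
toy_y = [0, 1, 0, 1]
W1, W2, loss_history = train_bpnn(toy_X, toy_y)
_, toy_prediction = forward([0.85, 0.85], W1, W2)
```

The output of `forward` lies between 0 and 1 and plays the role of a susceptibility index for one grid cell.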

IV Model for LSP
The IV model is an indirect conventional statistical model that is often used in LSP [5,65]. According to these studies, the information value $V_i$ of a predisposing factor can be formulated as:

$$V_i = \ln \frac{L_i / T_i}{L / T}$$

where $L_i$ is the number of landslide grid cells in which the predisposing factor is present, $T_i$ is the number of grid cells in which the predisposing factor is present, $L$ is the total number of landslide grid cells, and $T$ is the total number of grid cells in Shicheng County. In principle, the presence of the predisposing factor does not contribute to landslide evolution when $V_i$ is negative, and contributes to it when $V_i$ is positive. A higher value of $V_i$ suggests a stronger correlation between the predisposing factor and landslide occurrence. Hence, the total information value $V$, which reflects the landslide susceptibility index of each grid cell, can be calculated as:

$$V = \sum_{i=1}^{n} V_i$$

where $n$ is the number of predisposing factors. In this study, the landslide susceptibility map generated by the IV model is shown in Figure 6d, for which the landslide susceptibility levels were divided into very low (23.7%), low (22.8%), moderate (26.7%), high (17.2%), and very high (9.2%) levels (Table 2). The raw distribution of landslide susceptibility indices in Shicheng County was very similar to that generated by the PSO-MLP, MLP-only, and BPNN models.
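As a worked illustration of the information value calculation, the sketch below computes a per-class value from the four counts defined above and sums the values over factors for one grid cell. The natural logarithm is an assumption (the IV literature also uses other bases), and all counts, class names, and function names are hypothetical.

```python
import math

def information_values(landslide_cells, all_cells):
    """Information value per class of one predisposing factor.

    landslide_cells[c] is L_i and all_cells[c] is T_i for class c. The classes
    are assumed to partition the study area, so L and T are the class sums.
    """
    L = sum(landslide_cells.values())
    T = sum(all_cells.values())
    values = {}
    for cls, T_i in all_cells.items():
        L_i = landslide_cells.get(cls, 0)
        if L_i == 0 or T_i == 0:
            raise ValueError(f"class {cls!r} has a zero count; the logarithm is undefined")
        # Natural log is an assumption of this sketch.
        values[cls] = math.log((L_i / T_i) / (L / T))
    return values

def total_information_value(cell_classes, tables):
    """Sum the information values over all factors for one grid cell."""
    return sum(tables[factor][cls] for factor, cls in cell_classes.items())

# Hypothetical counts for two factors (not data from the study).
slope_iv = information_values(
    {"gentle": 10, "moderate": 60, "steep": 30},
    {"gentle": 400, "moderate": 400, "steep": 200},
)
river_iv = information_values(
    {"near": 70, "far": 30},
    {"near": 300, "far": 700},
)
tables = {"slope": slope_iv, "river_distance": river_iv}
cell_v = total_information_value({"slope": "moderate", "river_distance": "near"}, tables)
```

In this toy case the gentle-slope class receives a negative value (landslides are under-represented there), while the moderate-slope and near-river classes receive positive values, so a cell combining the latter two has a positive total.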

Frequency Ratio Accuracy Analysis
Table 2 shows that the FR values of the very high susceptibility level for the PSO-MLP, MLP-only, BPNN, and IV models were 4.546, 4.344, 4.229, and 4.184, respectively, indicating that a small area classified as very highly susceptible accounts for many landslide locations. Furthermore, the FR values of all four models increased rapidly from the very low to the very high level, suggesting that the landslide susceptibility maps generated by the four models were all reasonable and reliable. In addition, the FR accuracy of a prediction model was defined as the ratio of the sum of the FR values of the high and very high susceptibility levels to the sum of the FR values of all susceptibility levels [13]. A higher FR accuracy indicates better prediction performance. Table 2 shows that the FR accuracies of the PSO-MLP, MLP-only, BPNN, and IV models were 0.856, 0.829, 0.840, and 0.824, respectively. Hence, the PSO-MLP model outperformed the other three models.
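The FR accuracy can be made concrete with a short sketch. It assumes the usual definition of the FR of a susceptibility level (the share of landslide cells in that level divided by the share of the study area in that level) and reads the FR accuracy as a ratio of FR sums. The shares below are hypothetical round numbers, not the values in Table 2.

```python
def frequency_ratios(landslide_share, area_share):
    """FR per level: share of landslide cells divided by share of area."""
    return [l / a for l, a in zip(landslide_share, area_share)]

def fr_accuracy(fr):
    """FR accuracy for levels ordered from very low to very high.

    The last two entries are the high and very high levels.
    """
    return (fr[-2] + fr[-1]) / sum(fr)

# Hypothetical shares for the five levels, very low to very high.
toy_area = [0.25, 0.25, 0.25, 0.15, 0.10]
toy_landslides = [0.02, 0.06, 0.17, 0.30, 0.45]
fr = frequency_ratios(toy_landslides, toy_area)
accuracy = fr_accuracy(fr)
```

With these toy shares the FR rises monotonically from the very low to the very high level, which is the pattern the text treats as evidence of a reliable map.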

ROC Accuracies of These Models
The prediction performance of the four models was also assessed by the ROC curve method. The y-axis of an ROC curve shows the sensitivity (the true positive rate), while the x-axis shows "1-specificity" (the false positive rate). Hence, an ROC curve plots sensitivity as a function of "1-specificity" and, because it is evaluated over all cut-off thresholds, reflects the prediction accuracy of a 0/1 classification system [7,8,10].
The ROC method has been commonly and successfully applied to the evaluation of landslide susceptibility prediction models [16,25,26]. The AUC, which ranges from 0.5 (no better than random) to 1 (perfect prediction), is the standard index of model prediction performance: the greater the AUC, the better the prediction performance. Figure 7 shows that the AUC values of the PSO-MLP, MLP-only, BPNN, and IV models were 0.822, 0.791, 0.800, and 0.786, respectively. This accuracy index again indicates that the PSO-MLP model had the best prediction performance among the four models.
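The construction of an ROC curve and its AUC can be sketched as follows: the cut-off threshold is swept from the highest to the lowest predicted score, the false positive rate and true positive rate are recorded at each step, and the area is integrated with the trapezoidal rule. The scores and labels are hypothetical, and the function names are illustrative.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs over all cut-offs."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Cells with tied scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical susceptibility scores; label 1 = landslide, 0 = non-landslide.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
curve = roc_points(toy_scores, toy_labels)
toy_auc = auc(curve)
```

For this toy sample, 13 of the 16 landslide/non-landslide pairs are ranked correctly, so the AUC is 0.8125.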

PSO-MLP Model-Building Analysis
The PSO-MLP model proposed in this study first used FR analysis to quantify the correlations between recorded landslides and the related predisposing factors. FR values are commonly used as the input variables of prediction models and are acknowledged as an efficient tool. Correlation analysis was then used to eliminate collinear predisposing factors. Removing this redundant information from the input variables helps to ensure the validity of the prediction models.
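The elimination of collinear factors can be illustrated with a simple filter based on the Pearson correlation coefficient. This is a sketch under assumptions: the 0.7 threshold, the greedy keep-first rule, and the factor values are hypothetical and are not taken from the study.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def drop_collinear(factors, threshold=0.7):
    """Keep a factor only if it is not strongly correlated with any kept factor."""
    kept = []
    for name in factors:
        if all(abs(pearson(factors[name], factors[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical FR-coded factor values for five grid cells.
toy_factors = {
    "slope": [1.0, 2.0, 3.0, 4.0, 5.0],
    "relief": [2.0, 4.0, 6.0, 8.0, 10.5],  # nearly a multiple of slope
    "ndvi": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept_factors = drop_collinear(toy_factors)
```

Here `relief` is dropped because it is almost perfectly correlated with `slope`, while `ndvi` (correlation of -0.3 with `slope`) is retained.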
Next, the mini-batch gradient descent algorithm was applied to overcome the drawbacks of the conventional gradient descent and stochastic gradient descent algorithms, namely convergence to non-global optima, low training speed, reduced accuracy, and lack of parallel computing. In addition, the PSO algorithm was introduced to search for the optimal structural parameters of the MLP, including the learning rate, learning decay rate, learning momentum, and the number of neurons in the hidden layer. The comparative results showed that the proposed PSO-MLP model had better prediction performance than the conventional MLP-only model (AUC of 0.822 versus 0.791 and FR accuracy of 0.856 versus 0.829), suggesting that the PSO-MLP model does address the MLP model's drawbacks of local optima and inefficient parameter selection. Moreover, it also produced a more reasonable landslide susceptibility map than the conventional ANN model (i.e., BPNN) and the statistical model (i.e., IV), indicating that the PSO-MLP model is a promising alternative that can be extended to other areas for LSP.
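The parameter search performed by PSO can be sketched with a generic global-best particle swarm. This is not the implementation used in the study: the inertia and acceleration coefficients are common textbook choices, only two of the four structural parameters are searched, and `surrogate_error` is a hypothetical stand-in for the prediction error that would be obtained by training an MLP with the candidate parameters.

```python
import math
import random

def pso(objective, bounds, n_particles=20, n_iter=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` inside `bounds` with a global-best particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia plus attraction to the personal and global bests.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history

def surrogate_error(params):
    """Hypothetical error surface with its minimum at lr = 0.01 and 16 neurons."""
    learning_rate, n_hidden = params
    return (math.log10(learning_rate) + 2.0) ** 2 + ((n_hidden - 16.0) / 16.0) ** 2

# Search bounds: learning rate and number of hidden neurons (both hypothetical).
search_bounds = [(1e-4, 1.0), (2.0, 40.0)]
best_params, best_error, error_history = pso(surrogate_error, search_bounds)
best_hidden_neurons = round(best_params[1])
```

In practice the objective would train and validate an MLP for each particle, which is far more expensive than this surrogate; the swarm logic is unchanged.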

Conclusions
Based on the PSO and mini-batch gradient descent algorithms, an improved MLP model, namely, the PSO-MLP, was developed to carry out LSP for Shicheng County, China. The FR values of 14 predisposing factors were used as the input variables, while 2709 recorded landslide grid cells and 2709 randomly selected non-landslide grid cells were combined as the output variables of the prediction models. The MLP-only, BPNN, and IV models were also applied to perform LSP for comparison. The results showed that the landslide susceptibility maps produced by all four models were reasonable and reliable. However, the PSO-MLP model had higher prediction capability than the other three models, as assessed by the FR and ROC accuracies.
In summary, the PSO algorithm can effectively optimize the structural parameters of the MLP model trained with the mini-batch gradient descent algorithm, improving on the conventional MLP. The proposed PSO-MLP model has the advantages of a more global parameter search and more accurate landslide susceptibility prediction compared with the commonly used BPNN and IV models. This PSO-MLP model can be used in other study areas for LSP. Moreover, the landslide susceptibility map produced for Shicheng County is valuable for the local government in carrying out landslide hazard prevention and land use planning.