Combining Evolutionary Algorithms and Machine Learning Models in Landslide Susceptibility Assessments

: The main objective of the present study is to introduce a novel predictive model that combines evolutionary algorithms and machine learning (ML) models, so as to construct a landslide susceptibility map. Genetic algorithms (GA) are used as a feature selection method, whereas the particle swarm optimization (PSO) method is used to optimize the structural parameters of two ML models, support vector machines (SVM) and artiﬁcial neural network (ANN). A well-deﬁned spatial database, which included 335 landslides and twelve landslide-related variables (elevation, slope angle, slope aspect, curvature, plan curvature, proﬁle curvature, topographic wetness index, stream power index, distance to faults, distance to river, lithology, and hydrological cover) are considered for the analysis, in the Achaia Regional Unit located in Northern Peloponnese, Greece. The outcome of the study illustrates that both ML models have an excellent performance, with the SVM model achieving the highest learning accuracy (0.977 area under the receiver operating characteristic curve value (AUC)), followed by the ANN model (0.969). However, the ANN model shows the highest prediction accuracy (0.800 AUC), followed by the SVM (0.750 AUC) model. Overall, the proposed ML models highlights the necessity of feature selection and tuning procedures via evolutionary optimization algorithms and that such approaches could be successfully used for landslide susceptibility mapping as an alternative investigation tool.


Introduction
Landslides are natural disasters, the evolution of which has significant adverse consequences on the natural and build environment and in some cases, unfortunate human losses [1][2][3]. An increase in landslides has been reported worldwide, which is mainly associated with the impact of human activities and also natural processes. Specifically, it is well documented that favorable geo-environmental settings, climate change (the manifestation of extreme rainfall), human activities (the construction of new settlements and infrastructures in landslide-prone areas), and also changes in land-use patterns increase the possibility of landslides, which is a trend that likely will continue in the future [3][4][5]. As a response to increasing stress, the international scientific community has focused its efforts on designing have been reported [36,54]. Another example of the usage of evolutionary algorithms in landslide susceptibility is the one reported by [31], which introduced a novel model using a geographically weighted regression (GWR) technique so as to segment study areas into a series of prediction regions with appropriate sizes and an SVM classifier optimized by the PSO algorithm. The authors report higher prediction accuracy than conventional methods. Likewise, Nguyen et al. [55] applied a PSO and ABC algorithm to tune the structural parameters of an ANN. The outcomes of their study confirmed the significant increase of prediction accuracy from nearly 77% to around 86%. Chen, Panahi, and Pourghasemi [20] used three optimization techniques (PSO, GA, and differential evolution (DE)) to tune an ANFIS model. The authors highlighted that all three ensembles obtained satisfactory prediction accuracy, with the ANFIS-DE being the most promising ensemble model. In similar studies, concerning flood susceptibility maps, Tien Bui et al. [56] applied two meta-heuristic algorithms (PSO and EG) in order to tune an ANFIS model. The comparison of the results obtained by the proposed approach with the outcomes of a series of ML models (J48, DT, RF, MLP, Neural Nets, and SVM models) revealed the higher accuracy achieved by the proposed approach.
Although there are many studies that could be found in the literature with the usage of evolutionary algorithms in landslide susceptibility mapping, there are few a that combine two different evolutionary algorithms for feature selection and tuning process. In this context, the present study illustrates the integration of a feature selection evolutionary algorithm and an evolutionary optimization algorithm with two ML models used in landslide susceptibility assessments. The usage of the GA algorithm generated prediction models with reduced complexity and high generalization ability, whereas the model parameter tuning process via the PSO ensured more accurate predictions for both models. The remainder of the study is organized as follows: In the next section, Section 2, there is a short description concerning the study area. In Section 3, the methodology has been introduced, including a description of the algorithms and ML models used during the study. Data are presented in Section 3, whereas Section 4 lists the outcomes of the analysis. Section 5 covers the discussion, whereas the last section includes the conclusion of the present study.

Study Area
The Regional Unit of Achaia (RUA) covers the NW of Peloponnese, Greece, stretching to an area of 3274 km 2 , between longitudes 260,000 and 360,000 and latitudes 4,180,000 and 4,250,000 (Coordinate system, EPSG:2100, GGRS87) and a population of 300,000 inhabitants ( Figure 1). The RUA is one of the most mountainous regions of Greece covered by massive rocky limestone ridges and high peaks [24]. The geomorphological settings appear to be controlled mainly by geology, tectonic activity, and weathering and erosion procedures [57]. The elevation ranges between 0 and 2310 m with over 53% of its total area having elevation values above 1000 m. The climate of the area is characterized by a Mediterranean type, having warm, dry summers and mild, wet winters. The wet season expands from October to May. The total rainfall accounts for 93% of the annual rainfall. The wettest month is December, in which the mean rainfall is 128.9 mm, followed by November (124.7 mm). August is the driest month with a mean rainfall of 7.0 mm, followed by July (8.8 mm). Concerning the highest mean annual temperature, it corresponds to 24.3 °C, whereas the lowest value corresponds to 8.0 °C. The mean annual temperature reaches 13.7 °C [58]. Concerning the geological structure and the geotectonic settings of the study area, it is The climate of the area is characterized by a Mediterranean type, having warm, dry summers and mild, wet winters. The wet season expands from October to May. The total rainfall accounts for 93% of the annual rainfall. The wettest month is December, in which the mean rainfall is 128.9 mm, followed by November (124.7 mm). August is the driest month with a mean rainfall of 7.0 mm, followed by July (8.8 mm). Concerning the highest mean annual temperature, it corresponds to 24.3 • C, whereas the lowest value corresponds to 8.0 • C. The mean annual temperature reaches 13.7 • C [58]. Concerning the geological structure and the geotectonic settings of the study area, it is characterized by the presence of three geotectonic zones (Ionian, Olonos-Pindos and Gavrovo-Tripolis) and dominant tectonic fracturing that is responsible for the occurrence of Plio-Pleistocene sediments [59]. Specifically, within the study area, the following geological units are identified: fine-grained to coarse-grained loose Quaternary formations, coarse-grained coherent Quaternary formations, fine-grained to coarse-grained Plio-Pleistocene sediments, flysch formations, Cretaceous, Triassic, and Jurassic limestones, dolomites, schist-chert formations, sandstones, semi-metamophic, and formations of volcanic origin [60,61]

Methodology
The proposed methodology follows a five-phase procedure, which involves (i) data processing, (ii) multi-collinearity analysis, (iii) feature selection, (iv) optimizing the SVM and ANN models, (v) constructing the landslide susceptibility map, testing, and comparing (linear regression analysis) the two models. For the implementation of the evolutionary algorithms and ML models, the "caret" R package [62] was used, whereas ArcGIS 10.3.1 [63] was used for accessing the spatial data and constructing the landslide susceptibility maps. Figure 3 illustrates the five-phase methodology, whereas a brief description of each phase is introduced in the following paragraphs.

Methodology
The proposed methodology follows a five-phase procedure, which involves (i) data processing, (ii) multi-collinearity analysis, (iii) feature selection, (iv) optimizing the SVM and ANN models, (v) constructing the landslide susceptibility map, testing, and comparing (linear regression analysis) the two models. For the implementation of the evolutionary algorithms and ML models, the "caret" R package [62] was used, whereas ArcGIS 10.3.1 [63] was used for accessing the spatial data and constructing the landslide susceptibility maps. Figure 3 illustrates the five-phase methodology, whereas a brief description of each phase is introduced in the following paragraphs. Remote Sens. 2020, 12, x FOR PEER REVIEW 6 of 28

First Phase
The first phase involved, as a first action, the identification of the landslide and non-landslide areas and the construction of the inventory map. Landslide areas were located during field trips from 2017 to 2018, previous reports and studies that cover a time span of 67 years from 1950, and the interpretation of aerial photographs, whereas non-landslide areas were randomly selected from the freeing of landslide space by applying the Create Random Points, which is a geo-processing tool installed in the Data Management Tool of the ArcGIS platform [63].
A second action involved the selection of the landslide-related parameters and the classification and weighting of each parameter. Twelve parameters (elevation, slope angle, slope aspect, profile curvature, plan curvature, curvature, topographic wetness index (TWI), stream power index (SPI), lithology, hydrolithology cover, distance to faults, and distance to river network) were selected as landslide-related parameters based on the experience gained from previous studies conducted in the same area but also in areas with similar geo-environmental settings [57,64]. Resolution of the input data has an impact on the produced landslide accuracy, as a larger grid size leads to lost information [65]. In our case, each parameter was transformed into raster format with a grid size of 30 m. The grid size was determined by the input data, the DEM file (30 m), and the scale of the geological and topographic maps (1:50,000). The process of classification and weighting is mandatory in landslide susceptibility assessments. The classification of the continuous parameters, such as the geomorphological parameters, was based on expert knowledge and previous studies with similar settings, whereas the weighting was based on the Weight of Evidence (WofE) method [66]. In the literature, several weighting methods have been used, including expert-based methods with most common being the Analytic Hierarchic Process and also data-driven methods with the Frequency ratio being the most common [65]. In our case, the WofE method was used in order to capture the different probability of landslide occurrence of each variable and class.
WofE has been extensively used in studies involving natural hazard analysis, including floods and landslides and is characterized as a data-driven approach. The theoretical background of the WofE model involves the implementation of the Bayes theorem and prior and posterior probability

First Phase
The first phase involved, as a first action, the identification of the landslide and non-landslide areas and the construction of the inventory map. Landslide areas were located during field trips from 2017 to 2018, previous reports and studies that cover a time span of 67 years from 1950, and the interpretation of aerial photographs, whereas non-landslide areas were randomly selected from the freeing of landslide space by applying the Create Random Points, which is a geo-processing tool installed in the Data Management Tool of the ArcGIS platform [63].
A second action involved the selection of the landslide-related parameters and the classification and weighting of each parameter. Twelve parameters (elevation, slope angle, slope aspect, profile curvature, plan curvature, curvature, topographic wetness index (TWI), stream power index (SPI), lithology, hydrolithology cover, distance to faults, and distance to river network) were selected as landslide-related parameters based on the experience gained from previous studies conducted in the same area but also in areas with similar geo-environmental settings [57,64]. Resolution of the input data has an impact on the produced landslide accuracy, as a larger grid size leads to lost information [65]. In our case, each parameter was transformed into raster format with a grid size of 30 m. The grid size was determined by the input data, the DEM file (30 m), and the scale of the geological and topographic maps (1:50,000). The process of classification and weighting is mandatory in landslide susceptibility assessments. The classification of the continuous parameters, such as the geo-morphological parameters, was based on expert knowledge and previous studies with similar settings, whereas the weighting was based on the Weight of Evidence (WofE) method [66]. In the literature, several weighting methods have been used, including expert-based methods with most common being the Analytic Hierarchic Process and also data-driven methods with the Frequency ratio being the most common [65]. In our case, the WofE method was used in order to capture the different probability of landslide occurrence of each variable and class.
WofE has been extensively used in studies involving natural hazard analysis, including floods and landslides and is characterized as a data-driven approach. The theoretical background of the WofE model involves the implementation of the Bayes theorem and prior and posterior probability concepts [66]. In our case, each class of each parameter is assigned with a positive or a negative weight (W+ and W-) estimated based on the percentage of landslides in each class. The weight corresponds to a positive or a negative spatial correlation that may present among the classes of the landslide-related parameters and landslides. The measure of the spatial association is provided through the magnitude of contrast (C), the difference between W+ and W-. When the value of C is positive, it means that a positive correlation exists, while in the case of a negative value, it implies negative spatial association [66][67][68][69].
The following action in phase A was normalizing the parameters using the max-min procedure, with new values ranging between 0.1 and 0.9 [70]. The final action involved the construction of the training and testing subsets. This action had two parts, the first involving assigning to each data point the values that correspond to each parameter and the second partitioning the data into training and testing data. For the first part, the Spatial Analyst geoprocessing tool Extra Multi Values to Points was applied, whereas for the second part, the geoprocessing tool Create Subsets found in the Geostatistical Toolbox was used [63]. The training subset included 70% of landslide and non-landslide areas, whereas the testing subset included the remaining 30%.

Second Phase
The second phase involved a multi-collinearity analysis using the training subset, so as to find potential correlation among the landslide-related parameters and exclude those parameters from the next phase of analysis [48]. The analysis involves the calculation of two metrics, the Variance Inflation Factor (VIF) and the Tolerance index (TOL). In order to indicate severe multi-collinearity, the VIF index should be higher than 10 and the TOL index should be lower than 0.1 [71,72].

Third Phase
The third phase involved the feature selection procedure. As mentioned earlier, GA was the main evolutionary algorithm used for this purpose. In general, GA, which has been proposed by Holland [73], was developed based on the concepts of natural selection and biological operators (mutation, crossover, and selection). They are mainly used in optimization and search problems. The basic idea behind GA is that from an initial set of candidate solutions, which constitute the population, and repeated evolutionary process, the generated solutions would be better than the initial. Solutions, which are referred to as individuals, correspond to specific fitness values, where in our case, higher values are considered better. Those solutions that have the best fitness values are selected and undergo a crossover and random mutation process, which is repeated several times, producing several generations, which in principle creates optimal solutions. In our case, where we used GA for feature selection, the individuals are subsets of landslide-related parameters that are encoded as binary (1 and 0). The fitness values are measures of model performance, and specifically the classification accuracy achieved by the base learner, an RF model, using a 10-fold cross-validation procedure. The feature selection routine was conducted using the r package "caret" [62].

Fourth Phase
The fourth phase involved the implementation of the two machine learning models (SVM and ANN) and the usage of PSO as a tuning procedure. The PSO algorithm is an algorithm that is based on the behavior of bird flocking and fish schooling in the real world (swarm intelligence) [17], which was first introduced by Kennedy and Eberhart-Phillips [74]. Similar to GA, the PSO analyzes an initial group of random solutions (particles), sets a population (swarm), and updates the generation in order to achieve an optimal solution. The movement of each particle is influenced by the position and velocity of the surrounding particles, which is controlled by a fitness function. The objective of the PSO algorithm is to alter the velocity and position of each particle in order to maximize or minimize the Remote Sens. 2020, 12, 3854 7 of 26 fitness function. The fitness function was the same used in the feature selection process, maximizing the classification accuracy.
Concerning the two models, SVM and ANN, both are quite popular in landslide susceptibility assessments. SVM are characterized as non-parametric kernel-based techniques [75], appearing efficient in solving linear and non-linear classification and regression problems [76]. SVM are capable of defining complex decision functions that could separate two classes of data samples in an optimal way [77]. Specifically, the objective is to construct a function f (x) that deviates from known outcomes by a value no greater than ε (epsilon) for each training point x and at the same time to be as flat as possible. The main structural parameters used during the implementation of SVM model were C (cost), the cost of predicting a sample, γ (gamma), the precision parameter for the radial basis function, and ε (epsilon), the epsilon in the SVM insensitive loss function.
ANN are defined mainly as supervised ML models; they are able to classify unknown data to possible classes based on known data points and a set of inputs. ANN are considered as information processing systems that can provide knowledge based on a learning mechanism that is similar to the way humans learn [41,43]. ANN learns by processing known examples, storing within the data structure of the developed network probability-weighted correlations. The most common architecture of an ANN involves layers of neurons interconnected with a set of correlation weights, which by using a complex non-linear function transforms the input data to certain outputs. In our study, the ANN model, a feedforward MultiLayer Perceptron neural network, included three layers of neurons: an input layer, a hidden layer, and an output layer. The input layer had one neuron for each input landslide-related parameter. The hidden layer included a number of neurons that permitted complexities to develop among the input neurons. The output layer included only one neuron that corresponded to the outcome of the process (in our case, classifying an unknown area as landslide or non-landslide). The learning process that an ANN follows involves the estimation of the difference between the prediction of the ANN and the known target outcome. The ANN model adjusts the weighted correlations of each neuron that it has based on this difference and a learning rule. After a number of successive adjustments, the outcomes of the ANN model will be similar to the known target outcome. The learning procedure terminates when a certain criteria of accuracy are achieved. The "nnet" R package [78] was used, whereas the main structural parameters that the PSO algorithm tuned were the number of neurons in the hidden layer (size), the learning rate decay (decay), and the number of training iterations (epochs).

Fifth Phase
The first action of the fifth phase involved the implementation of the optimized SVM and ANN models for the construction of the landslide susceptibility maps. The landslide susceptibility maps illustrate the spatial distribution of the probability of an area to manifest landslides. The two maps were reclassified into a five-level map (very low susceptibility, low susceptibility, moderate susceptibility, high susceptibility, and very high susceptibility) using the natural break classification scheme [79]. The next action involved testing and comparing the two models by conducting a receiver operating characteristic (ROC) curve analysis and estimating the area under the ROC curve value (AUC) [11,80]. The final action was the implementation of the Wilcoxon signed-rank test in order evaluate the chance that the models produce statistically significant different outcomes by estimating the p and z values [81]. The null hypothesis was that the two models produced similar outcomes, and it was rejected when the p value was less than the significant level (0.05) and the z value was out of the range of the critical values of z (−1.96 and +1.96) [82][83][84].

Data
The landslide database included 335 landslides which according to the Varnes Classification System [85] were classified into three types: rockfalls, translational, and rotational landslides [57,59,86]. The majority were translational and rotational landslides, which were mainly manifested at areas that are covered by a thick weathering mantle and within the upper fragmentation zone of the geological Remote Sens. 2020, 12, 3854 8 of 26 formations [57] (Figure 4). Rockfalls were mainly associated with areas that are characterized by intense slope and are covered by limestone, schist, and chert formations [57]. Based on the results of previous studies concerning the area, in most cases, the combined action of intense rainfall, seismic-tectonic and human activity were identified as the main triggering factors [59,[87][88][89]. However, in our study, triggering factors are not included in the analysis, since the purpose is to provide a series of landslide susceptibility maps that by default do not consider a triggering factor. The 335 data point samples were extracted representing the centroid of each landslide. In only few cases of very large landslides was there the need to introduce additional points in order to represent the overall settings of the area. The 335 non-landslide samples were selected following the procedure described in the previous section (Section 3.1). Thus, a final database of 670 landslide and non-landslide samples was formed.
Remote Sens. 2020, 12, x FOR PEER REVIEW 9 of 28 rainfall, seismic-tectonic and human activity were identified as the main triggering factors [59,[87][88][89]. However, in our study, triggering factors are not included in the analysis, since the purpose is to provide a series of landslide susceptibility maps that by default do not consider a triggering factor. The 335 data point samples were extracted representing the centroid of each landslide. In only few cases of very large landslides was there the need to introduce additional points in order to represent the overall settings of the area. The 335 non-landslide samples were selected following the procedure described in the previous section (Section 3.1). Thus, a final database of 670 landslide and nonlandslide samples was formed.
(a) (b) The geo-morphological parameters of elevation, slope angle, slope aspect, plan curvature, profile curvature, curvature, TWI, and SPI were derived using specific geo-processing tools found in the ArcGIS suite from an ASTER GDEM with 30 m resolution [63,84,90]. The lithological cover, hydrological cover, distance from faults, and distance from river network were digitized from the geological and topographic map sheets from IGME and National Topographic Maps that covered the area [91,92]. It is well known that elevation, slope angle, and slope aspect are three geo-morphological parameters that have a significant impact on landslides [93,94] (Figure 5a-c). It also well observed that areas of high altitude and smooth surface are less prone to landslide than abrupt slopes [95,96]. In the present study, elevation was classified into 5 classes, using the natural break classification scheme, whereas slope angles were classified into 5 classes based on expert knowledge. Slope aspect was classified into 8 classes, namely: north (N), northeast (NE), east (E), southeast (SE), south (S), southwest (SW), west (W), northwest (NW). Curvature, profile curvatures, plan curvatures, and secondary geo-morphological parameters are also considered as significant parameters, since they express local surface relief and complexities, which in most cases has an impact on the erosion process and landslides [93,97] (Figure 5d, Figure 6a,b). In the present study, the three curvature parameters were classified into three categories that represent negative, near zero, and positive values. According to Nefeslioglu et al. [38,98], the SPI parameter expresses the erosive power of surface flow and also the capability of sediment transportation, both processes influencing the landslide manifestation, with high values being evidence for increased erosion risk (Figure 6d) [99]. The TWI parameter can provide the spatial distribution of saturated source areas of runoff generation [100,101] based on the topography of the area (Figure 6c). SPI and TWI values were classified into 5 classes using the natural break method. As already mentioned, seismo-tectonic activity plays an important role in the manifestation of landslides [59,88]. In the present study, the distance from faults was classified into five classes: <150 m, 151-300 m, 301-450 m, 451-600 m, and > 601 m (Figure 7a). The spatial distribution of the river network influences the surface runoff and degree of ground water infiltration, which are processes that may create the appropriate conditions to cause landslides [102]. The distance from the river network was classified into five classes: <200 m, 201-400 m, 401-600 m, 601-800 m, The geo-morphological parameters of elevation, slope angle, slope aspect, plan curvature, profile curvature, curvature, TWI, and SPI were derived using specific geo-processing tools found in the ArcGIS suite from an ASTER GDEM with 30 m resolution [63,84,90]. The lithological cover, hydrological cover, distance from faults, and distance from river network were digitized from the geological and topographic map sheets from IGME and National Topographic Maps that covered the area [91,92]. It is well known that elevation, slope angle, and slope aspect are three geo-morphological parameters that have a significant impact on landslides [93,94] (Figure 5a-c). It also well observed that areas of high altitude and smooth surface are less prone to landslide than abrupt slopes [95,96]. In the present study, elevation was classified into 5 classes, using the natural break classification scheme, whereas slope angles were classified into 5 classes based on expert knowledge. Slope aspect was classified into 8 classes, namely: north (N), northeast (NE), east (E), southeast (SE), south (S), southwest (SW), west (W), northwest (NW). Curvature, profile curvatures, plan curvatures, and secondary geo-morphological parameters are also considered as significant parameters, since they express local surface relief and complexities, which in most cases has an impact on the erosion process and landslides [93,97] (Figure 5d, Figure 6a,b). In the present study, the three curvature parameters were classified into three categories that represent negative, near zero, and positive values. According to Nefeslioglu et al. [38,98], the SPI parameter expresses the erosive power of surface flow and also the capability of sediment transportation, both processes influencing the landslide manifestation, with high values being evidence for increased erosion risk (Figure 6d) [99]. The TWI parameter can provide the spatial distribution of saturated source areas of runoff generation [100,101] based on the topography of the area (Figure 6c). SPI and TWI values were classified into 5 classes using the natural break method. As already mentioned, seismo-tectonic activity plays an important role in the manifestation of landslides [59,88]. In the present study, the distance from faults was classified into five classes: <150 m, 151-300 m, 301-450 m, 451-600 m, and > 601 m (Figure 7a). The spatial distribution of the river network influences the surface runoff and degree of ground water infiltration, which are processes that may create the appropriate conditions to cause landslides [102]. The distance from the river network was classified into five classes: <200 m, 201-400 m, 401-600 m, 601-800 m, and >801 m (Figure 7b). Finally, the hydrological cover was classified into three categories, namely: permeable formations, semi-permeable formations, and impermeable formations (Figure 7c).

First Phase-WofE Analysis
Based on the results of the weighing procedure performed by the WoE method, the lithological cover had the highest correlation followed by the slope angle ( Figure 8). The highest correlation (C = 1.0623) was estimated for the class Volcanic rock and for the 4th class of the slope angle (>31°). Considering the slope angle, a clear increase has been recorded in the occurrence of landslide with the increase in slope angle.

First Phase-WofE Analysis
Based on the results of the weighing procedure performed by the WoE method, the lithological cover had the highest correlation followed by the slope angle ( Figure 8). The highest correlation (C = 1.0623) was estimated for the class Volcanic rock and for the 4th class of the slope angle (>31 • ). Considering the slope angle, a clear increase has been recorded in the occurrence of landslide with the increase in slope angle.
A strong influence also has been recorded for the parameter distance from fault (C = 0.6724), and classes that characterize areas between 201 and 600 m away from faults. A similar influence has been observed for the 4th class (C = 0.6285) of the SPI parameter, whereas areas with elevations between 506 and 804 m exhibit the high values of C (0.5877).
The class of the profile curvature that characterizes surfaces that appear upwardly convex and flow of water will be decelerated, which was positively correlated with the occurrence of landslides to a higher degree (0.2809) than the class that characterizes surfaces that appear upwardly concave, in which the flow of water will be accelerated (0.1517). In the case of plan curvature, both classes that characterize the surface that are laterally convex and concave appear with a positive correlation (0.3269 and 0.3152 respectively). The SPI value has the highest correlation, with an occurrence of landslides in the range of 10-20. A strong influence also has been recorded for the parameter distance from fault (C = 0.6724), and classes that characterize areas between 201 and 600 m away from faults. A similar influence has been observed for the 4th class (C = 0.6285) of the SPI parameter, whereas areas with elevations between 506 and 804 m exhibit the high values of C (0.5877).
The class of the profile curvature that characterizes surfaces that appear upwardly convex and flow of water will be decelerated, which was positively correlated with the occurrence of landslides to a higher degree (0.2809) than the class that characterizes surfaces that appear upwardly concave, in which the flow of water will be accelerated (0.1517). In the case of plan curvature, both classes that characterize the surface that are laterally convex and concave appear with a positive correlation (0.3269 and 0.3152 respectively). The SPI value has the highest correlation, with an occurrence of landslides in the range of 10-20.

Second phase-Multi-Collinearity Analysis
According to the outcomes of the multi-collinearity analysis, there is no multi-collinearity issues among the twelve parameters; thus, the whole set of parameters could be included for further analysis in the next phase (Table 1). Slope angle was the parameter with the smallest TOL value (0.3053) and the highest VIF value (3.2750), however, these were outside of the limits that are considered as an indication of collinearity (TOL < 0.1 and VIF > 10).

Second Phase-Multi-Collinearity Analysis
According to the outcomes of the multi-collinearity analysis, there is no multi-collinearity issues among the twelve parameters; thus, the whole set of parameters could be included for further analysis in the next phase (Table 1). Slope angle was the parameter with the smallest TOL value (0.3053) and the highest VIF value (3.2750), however, these were outside of the limits that are considered as an indication of collinearity (TOL < 0.1 and VIF > 10).

Third Phase-Feature Selection by GA
Concerning the feature selection procedure based on GA, the maximum number of generations was set to 100 with a population size of 20 individuals, the crossover probability was set to 0.8, the mutation probability was set to 0.1, and elitisim was set to 0. These were the default values. During the analysis, the top five selected variables were elevation, distance from river network, slope angle, distance from faults, and lithological cover. On average, seven variables were selected, whereas in the final search using the entire training set, nine parameters were selected at iteration 25. The external performance at this iteration was 0.7508. The nine parameters were elevation, slope angle, profile curvature, plan curvature, TWI, SPI, distance from river network, distance from faults, and lithology, whereas hydrological cover, curvature, and slope aspect were found to be the three least predictive parameters.
Overall, approximately 4367 sec were needed to give an outcome using a desktop PC with an Intel ® Core™ i5-4460 CPU 3.20 GHz processor and 8 GB RAM.

Fourth Phase-Optimizing SVM and ANN by PSO for Landslide Susceptibility Mapping
The next phase included the optimization procedure based on PSO for the two models, using only the features selected by the previous feature selection procedure. The optimal parameters are as follows: for the SVM model, cost = 5.48, gamma = 0.32, epsilon = 0.47, and the ANN model size =22, decay = 0.089, and maximum iteration = 120. The computational time needed for the optimization procedures was 828 and 1323 s, respectively. Figures 9 and 10 illustrates the landslide susceptibility maps produced by the implementation of ANN and SVM optimized by GA and PSO. Both maps were classified into a five-level scheme: very low, low, medium, high, and very high using the Natural Break classification method; details are in Section 3.1. From a visual inspection, the two maps show a similar spatial distribution; however, the ANN model presents a slightly higher coverage in the very high susceptibility class.

Fifth Phase-Evaluating the Performance of the SVM and ANN
Concerning the testing phase and the ROC analysis, for both models, their learning and predicting performance was evaluated based on the training and testing database, respectively ( Figure 11, Table 2). The results showed that the SVM model had a slightly higher AUC value (0.977) followed by the ANN model (0.969). Based on the testing database, a quite different pattern has been recorded. The ANN model presented the highest AUC value (0.800) followed by the SVM model (0.750). Figure 12a-d illustrates four histograms with the relative frequency and normal distribution that correspond to the two models and for landslide and non-landslide areas based on the testing dataset. The best ability to identify potential landslides was attributed to the ANN model, with almost 60% having values over 0.9. On the other hand, SVM gave the best results in the case of non-landslide areas. Slightly higher than 60% of the cases were lower than 0.3, indicating non-landslide areas, whereas the ANN model classified about 18% of non-landslide as landslides. Concerning the testing phase and the ROC analysis, for both models, their learning and predicting performance was evaluated based on the training and testing database, respectively ( Figure 11, Table 2). The results showed that the SVM model had a slightly higher AUC value (0.977) followed by the ANN model (0.969). Based on the testing database, a quite different pattern has been recorded. The ANN model presented the highest AUC value (0.800) followed by the SVM model (0.750).    Figure 12a-d illustrates four histograms with the relative frequency and normal distribution that correspond to the two models and for landslide and non-landslide areas based on the testing dataset. The best ability to identify potential landslides was attributed to the ANN model, with almost 60% having values over 0.9. On the other hand, SVM gave the best results in the case of non-landslide areas. Slightly higher than 60% of the cases were lower than 0.3, indicating non-landslide areas, whereas the ANN model classified about 18% of non-landslide as landslides. Although from the visual inspection the two susceptibility maps seem quite similar, from the Wilcoxon signed-rank test, the two models gave statistically significant different results. For a 95% significant level, it was found that the p value was 0.044 (less than 0.05), whereas the z value was 2.01 Although from the visual inspection the two susceptibility maps seem quite similar, from the Wilcoxon signed-rank test, the two models gave statistically significant different results. For a 95% significant level, it was found that the p value was 0.044 (less than 0.05), whereas the z value was 2.01 (outside the range of critical values −1.96 and +1.96). Thus, the two models produce different outcomes not by chance.

Discussion
Despite the large number of studies concerning landslide susceptibility and hazard modeling assessments, there are still no guidelines about the method or technique that appears as the most appropriate [2,17,93]. According to Van Westen et al. [103], every study area has its own unique set of landslide-related parameters, which affects to a different degree the probability of a landslide. Furthermore, no framework exists for the selection of the most appropriate parameters, making the whole process of landslide assessment a very difficult and complex task. Based on the above, no safe conclusion can be drawn as to the effectiveness of a method or technique that has universal validity. However, there is an increase confidence that ML methods provide higher predictive models than other statistical or knowledge-based methods. Two main advances of ML models are that they efficiently minimize uncertainties and have a greater tolerance in noisy or incomplete data [17,104].
Passing to our study, although the multi-collinearity analysis showed that all parameters could be used for analysis (since the parameters showed TOL values greater than 0.1 and VIF values less than 10), the feature selection procedure excluded three parameters: slope aspect, curvature, and hydrological cover. This may be attributed to the spatial distribution of landslide locations within the generated classes of the three parameters that appears to be more even. Thus, the three parameters appear with less discernment ability. Another probable explanation could be the presence of spatial autocorrelation; in our case, landslide and non-landslide locations have similar values in those three parameters. Our findings are in accordance with previous studies that indicate that an exclusion of certain parameters may be due to the spatial autocorrelation and data redundancy among the landslide-related parameters [4]. However, it must be mentioned that slope aspect and hydrological cover and also curvature are significant parameters that may have a great impact on the evolution of landslides [2,93]. Concerning the slope aspect, it defines the impact of sun exposure and rainfall controlling indirectly surface erosion and slope stability, which can vary greatly even at distances shorter than the resolution of the model (30 m). Although feature selection procedures assist in producing more accurate prediction models, they should be used with caution, and one must have in mind that transferring the knowledge gained in a study area to another may not be visible in all cases.
Based on the WofE method, concerning the lithological cover, the highest values were found in areas covered by volcanic rocks, fine grained deposits, and flysch formations. Similar findings have been provided from other researchers [57,59]. It seems that the fine-grained sediments, composed by mixing layers of clayey marls, marls, silty sands, and weak sandstones, are more prone to landslides. This could be attributed to the heterogeneous structure and the presence of a high degree of looseness that characterizes this type of formation [59]. Similarly, the flysch formations that are found within the research area are prone to rotational and transitional slides. Flysch formations are characterized by a varying and anisotropic geotechnical behavior, intense folded sediments, and thick weathering mantle, which are characteristics that significantly influence the evolution of landslides [87,89]. It is also well documented that slope angle is one of the most critical parameters that plays a significant role in slope stability [33]. Slope angle influences the shear and normal strength, which are developed on the discontinuities surfaces, with a steeper slope characterized by a higher shear stress and a lower safety factor [50]. In our study, higher slope angles (greater than 30 • ) showed higher C values; therefore, they contributed to a much higher degree to landslide susceptibility.
Although the two models gave similar results and their difference in predictive performance is about 5%, the final map that should be used is the one with the highest performance. As several previous studies have reported, even a 1 or 2% increment of the prediction accuracy could significantly Remote Sens. 2020, 12, 3854 20 of 26 control the resulting landslide susceptibility zones [11,16,105], making it a necessity to accurately predict these zones with the implementation of a high-performance-based model. In addition, our study highlighted the difference between the way each model classified the non-landslide areas, which is a variation that may be attributed to the learning and prediction procedures each model follows but also the identification of the non-landslide areas. Similar findings are reported by other researchers, who highlight the important task of identifying non-landslide areas. In our case, non-landslide areas may concern data characterized by some degree of "noise" that is more evenly distributed within the classes of each parameter and that the SVM model may by more tolerant to such data. Comparing the performance of each model, the ANN model shows a more stable performance, low variation between learning and predictive accuracy, and a higher prediction accuracy. A similar performance has been reported by several researchers in the field of natural hazard assessments, which assume that ANN is more capable of solving non-linear and complex problems in comparison to other ML models, such as SVM and tree-based models [106,107]. According to Tien Bui et al. [82], a Multi-Layer Preceptron model (neural-based model) was used to construct an landslide susceptibility map, and it outperformed several other ML models among which was a SVM model, based precisely on this ability. Overall, more studies are required under different settings and conditions that use other ML methods so that one can make safe conclusions about the superiority of one ML method over another.
Another issue that has a significant impact in landslide assessments is that during the selection of a predictive model, there is a need to understand its abilities and limitations. In our case, despite its very good performance, the optimized ANN model was not capable of providing an estimate of the contribution each parameter had on the classification process. In general, ANN models are known as black-box models, which provide little information concerning the learning and predictive process or how conclusions can be obtained from the model's outcomes [108,109]. The implementation of our methodology comes along with some drawbacks and limitations. First of all, there was a high computational power requirement and the significant computational time it took to obtain the final result. Both may be a serious drawback when results are needed immediately. Rewriting parts of the code or dividing the area into smaller parts and applying the methods in each area may help mitigate some of the above limitations. Another issue that may have an impact on the overall performance of the applied methods is the selection of the appropriate values for the structural parameters of the evolutionary algorithms that are used for feature selection and tuning. Default values defined by the development team or the various r packages that were used in our study were applied. However, these may influence the outcomes. Specifically, the parameter elitism in GA that allows the best individual in the overall optimization to survive until the end of the evolution was set to 0 in our study. By this, there is a chance that the best individual at a given generation may be replaced by a new offspring or may be selected for mutation; hence, it can be lost. Perhaps the usage of grid search techniques may assist to overcome the important task of setting the appropriate structural parameters of the evolutionary algorithms. This could be an interesting future work. Finally, neither of the two models gave any explanation concerning the influence that each parameter had on the overall landslide susceptibility. The implementation of methods that provide such information may help in order to establish a better understanding about the relation between the landslide parameters and landslide susceptibility. However, a clear importance value would be useful probably only for our study area, since the underlying mechanism responsible for landslide manifestation is rather complicated and site specific.

Conclusions
In the present study, an advanced approach for landslide susceptibility mapping was developed that involved the usage of evolutionary algorithms for feature selection procedures and tuning the structural parameters of ML models. Two ML models were evaluated, SVM and ANN. The results of our study indicate that both models performed very well, achieving high learning and predictive accuracy, with the ANN model having a more stable performance with a much higher predictive accuracy. The results illustrate a difference in the way that the non-landslide areas were classified, which may be attributed to the different way SVM and ANN learn and predict, but also the identification procedure that was followed concerning the non-landslide areas. The ANN model classified more unknown sites as landslides than SVM. Finally, it must be highlighted that the gained knowledge regarding the methodology approach and also the application of the method at the Achaia Regional Unit may assist both the scientific community and the local and government authorities. The scientific community may adopt the approach, process the raw data within a specific probabilistic manner, introduce feature selection procedure, and optimize-tune the predictive models, whereas the local and government authorities could include the findings of our study in any forthcoming landslide management plan.
Author Contributions: All authors have read and agreed to the published version of the manuscript. W.C., Y.C., P.T., I.I. and X.W. contributed equally to this work. P.T. and I.I. collected field data, conducted the modeling and wrote the manuscript. W.C., Y.C. and X.W. provided critical comments in planning of this paper and edited the manuscript. W.C., Y.C., P.T., I.I. and X.W. contributed to the revision of the manuscript. All authors discussed the results and edited the manuscript.