A Novel Swarm Intelligence—Harris Hawks Optimization for Spatial Assessment of Landslide Susceptibility

In this research, the novel metaheuristic algorithm Harris hawks optimization (HHO) is applied to landslide susceptibility analysis in Western Iran. To this end, the HHO is synthesized with an artificial neural network (ANN) to optimize its performance. A spatial database comprising 208 historical landslides, as well as 14 landslide conditioning factors—elevation, slope aspect, plan curvature, profile curvature, soil type, lithology, distance to the river, distance to the road, distance to the fault, land cover, slope degree, stream power index (SPI), topographic wetness index (TWI), and rainfall—is prepared to develop the ANN and HHO–ANN predictive tools. Mean square error and mean absolute error criteria are defined to measure the performance error of the models, and area under the receiving operating characteristic curve (AUROC) is used to evaluate the accuracy of the generated susceptibility maps. The findings showed that the HHO algorithm effectively improved the performance of ANN in both recognizing (AUROCANN = 0.731 and AUROCHHO–ANN = 0.777) and predicting (AUROCANN = 0.720 and AUROCHHO–ANN = 0.773) the landslide pattern.


Introduction
Landslides are defined as gravity-triggered downward mass movements which can result from anthropogenic and natural activities [1,2]. They are considered as one of the most devastating environmental threats and have cause huge physical and financial damage worldwide [3]. In Iran, landslide hazard is responsible for the loss of around 187 lives [4]. Due to geographical conditions, the approximately from 750 to 3100, where more than 80% of the area is above 1500 m.s.l. The terrain slope ranges from 0 to 60°, where more than 55% are classified as gentle slopes (i.e., lower than 15°). Also, the land is mainly covered by good ranges. Geologically, among the diverse kinds of lithological units, the so-called category "dark grey argillaceous shale" is the most common representing around 18% of the bedrock area.

Data Preparation and Spatial Relationship between the Landslide and Related Factors
As is known, providing a valid dataset is a crucial step of using artificial intelligence tools [9]. The spatial database of this research comprises a landslide inventory map as the target, and fourteen landslide conditioning factors as the input variables, namely elevation, slope aspect, plan curvature, profile curvature, soil type, lithology, distance to the river, distance to the road, distance to the fault, land cover, slope degree, stream power index (SPI), topographic wetness index (TWI), and rainfall. Notably, plan curvature and profile curvature are topographical attributes which contribute to the convergence (or divergence) of the flowing water and the velocity change of the flowing mass, respectively [36][37][38]. A total of 208 historical landslides were identified, and the same number of the non-landslide points were randomly produced over the study area.
All layers were created from basic sources (e.g., satellite imagery and vector data) and processed in geographic information system (GIS) with a pixel size of 10 m × 10 m [34,39]. In the following, frequency ratio (FR) theory is used to evaluate the spatial relationship between the landslide and each subclass of the conditioning factors. In this sense, the higher values indicate a greater correlation between the landslide and the proposed subclass [40]. Equation 1 expresses the FR formulation: where Nlandslide and Ndomain respectively stand for the percentage of the landslides found in the proposed subclass and the percentage of the terrain it covers. Figure 2 illustrates the classification of the considered conditioning factors as well as the calculated FR. Also, the description of the geological units is presented in Table 1. As is seen, the largest FRs are obtained for (a) 1000-1500 m for elevation,

Data Preparation and Spatial Relationship between the Landslide and Related Factors
As is known, providing a valid dataset is a crucial step of using artificial intelligence tools [9]. The spatial database of this research comprises a landslide inventory map as the target, and fourteen landslide conditioning factors as the input variables, namely elevation, slope aspect, plan curvature, profile curvature, soil type, lithology, distance to the river, distance to the road, distance to the fault, land cover, slope degree, stream power index (SPI), topographic wetness index (TWI), and rainfall. Notably, plan curvature and profile curvature are topographical attributes which contribute to the convergence (or divergence) of the flowing water and the velocity change of the flowing mass, respectively [36][37][38]. A total of 208 historical landslides were identified, and the same number of the non-landslide points were randomly produced over the study area.
All layers were created from basic sources (e.g., satellite imagery and vector data) and processed in geographic information system (GIS) with a pixel size of 10 m × 10 m [34,39]. In the following, frequency ratio (FR) theory is used to evaluate the spatial relationship between the landslide and each subclass of the conditioning factors. In this sense, the higher values indicate a greater correlation between the landslide and the proposed subclass [40]. Equation (1) expresses the FR formulation: where N landslide and N domain respectively stand for the percentage of the landslides found in the proposed subclass and the percentage of the terrain it covers. Figure 2 illustrates the classification of the considered conditioning factors as well as the calculated FR. Also, the description of the geological units is presented in Table 1. As is seen, the largest FRs are obtained for (a) 1000-1500 m for elevation, distance to the road, (h) 300-400 m for distance to the fault, (i) "Agriculture" for land cover, (j) 10-15 • for slope, (k) 85 × 10 5 -23 × 10 6 for SPI, (l) −9.38 to −7.40 for TWI, (m) 500-600 mm for rainfall, and (n) "Jugr" for lithology. Note that explaining the pattern of the obtained FRs requires exact analysis of the conditioning factors, like importance evaluation, which is not the aim of this paper.  Figure 2. The calculated frequency ratio(FR) for (a) elevation, (b) slope aspect, (c) plan curvature, (d) profile curvature, (e) soil type, (f) distance to river, (g) distance to road, (h) distance to fault, (i) land cover, (j) slope degree, (k) SPI, (l) TWI, and (m) rainfall.

Methodology
The overall steps carried out in this research are shown in Figure 3. After providing the required landslide inventory map, as well as landslide conditioning layers, the marked landslides are randomly divided into two separate parts, including 70% of whole data (146 landslides and 146 non-landslide points) for training the ANN and HHO-ANN models, and 30% of data (62 landslides and 62 non-landslide points) for evaluating their efficiency in predicting future landslides in the programming language of MATLAB 2014. The HHO is then coupled with ANN to optimize its computational parameters. The landslide susceptibility maps are produced, and the accuracy of each model is evaluated by three criteria, namely area under the receiving operating characteristic curve (AUROC), mean square error (MSE), and mean absolute error (MAE). Equations (2) and (3). denote the MSE and MAE in which N shows the number of involved samples, and Y i observed and Y i predicted are the desired and predicted values of landslide susceptibility, respectively.

Methodology
The overall steps carried out in this research are shown in Figure 3. After providing the required landslide inventory map, as well as landslide conditioning layers, the marked landslides are randomly divided into two separate parts, including 70% of whole data (146 landslides and 146 non-landslide points) for training the ANN and HHO-ANN models, and 30% of data (62 landslides and 62 non-landslide points) for evaluating their efficiency in predicting future landslides in the programming language of MATLAB 2014. The HHO is then coupled with ANN to optimize its computational parameters. The landslide susceptibility maps are produced, and the accuracy of each model is evaluated by three criteria, namely area under the receiving operating characteristic curve (AUROC), mean square error (MSE), and mean absolute error (MAE). Equations 2. and 3. denote the MSE and MAE in which N shows the number of involved samples, and Yi observed and Yi predicted are the desired and predicted values of landslide susceptibility, respectively.

Artificial Neural Network
Inspired by the biological interaction in the neural systems, artificial neural network (ANN) was first suggested by McCulloch and Pitts [41] as a predictive tool. It is constructed from some extremely

Artificial Neural Network
Inspired by the biological interaction in the neural systems, artificial neural network (ANN) was first suggested by McCulloch and Pitts [41] as a predictive tool. It is constructed from some extremely connected units which aim to discern the nonlinear relationship between external inputs [42]. The main reason for selecting the ANN as the basic model of this study was its high capability in various engineering simulations [43][44][45][46][47]. Figure 4 illustrates the structure of a commonly held type of neural networks named multilayer perceptron (MLP). The learning process is carried out by assigning some weights to each input vector. After producing the network response, the determined weights and biases are adjusted to minimize the error. More specifically, when the output (O z ) is produced, χ z is computed as follows: where T z is the desired response for the output node z. Then, the parameter λ s is calculated for each node of the middle layer as follows: in which W sz represents the connecting weight between the neurons s and z.
Next, considering m as the learning rate (ranging from 0 to 1), two parameters below are calculated to change the weights and biases in layer L: Eventually, the new weights and biases are produced as follows: networks named multilayer perceptron (MLP). The learning process is carried out by assigning some weights to each input vector. After producing the network response, the determined weights and biases are adjusted to minimize the error. More specifically, when the output ( ) is produced, is computed as follows: where is the desired response for the output node z. Then, the parameter is calculated for each node of the middle layer as follows: in which represents the connecting weight between the neurons s and z. Next, considering m as the learning rate (ranging from 0 to 1), two parameters below are calculated to change the weights and biases in layer L: Eventually, the new weights and biases are produced as follows:

Harris Hawks Optimization
Inspired by the chasing style and cooperative behavior of Harris' hawks, the Harris hawks optimization (HHO) technique was suggested by Heidari et al. [48]. Some of the hawks aim to surprise the prey by swooping it from different paths. Note that they select the chase pattern based on the flying pattern of the prey. HHO is a population-based search algorithm which draws on three major steps which are explained as follows (see Figure 5): (i) Exploration Phase

Harris Hawks Optimization
Inspired by the chasing style and cooperative behavior of Harris' hawks, the Harris hawks optimization (HHO) technique was suggested by Heidari et al. [48]. Some of the hawks aim to surprise the prey by swooping it from different paths. Note that they select the chase pattern based on the flying pattern of the prey. HHO is a population-based search algorithm which draws on three major steps which are explained as follows (see Figure 5): In this phase, it is determined to mathematically wait, search, and discover the desired hunt. The iter + 1 (the Harris hawks position) is mathematically expressed as follows: where X rabit stands for the rabbit position, iter denotes the present iteration, X rand is the randomly selected hawk among the available population, r i , I = 1, 2, 3, 4, q are random numbers ranging in [0, 1], and X m shows the average position for hawks and is computed as follows: in which X i shows the place of the hawks and N represents the hawk size.
(ii) Transition from Exploration to Exploitation Considering T as the maximum size about the repetitions and E 0 ∈ (−1, 1) as the initial energy during each step, HHO calculates the escaping energy of rabbit (E) by Equation (12). Regarding this value, exploration and exploitation may be changed.
In this sense, if |E| ≥ 1, the exploration phase gets started; otherwise, the neighborhood of the solutions is aimed to be exploited.

(iii) Exploitation Phase
Depending on the residual energy of the prey, the hawks may consider a soft or hard besiege for hunting it from different directions. A so-called parameter "r" is defined to measure the escaping chance of the prey. Accordingly, r < 0.5 represents a successful escape. In addition, when |E| ≥ 0.5, HHO takes soft surround and when |E| < 0.5, hard surround is applied. It is worth noting that even if the prey is able to escape (i.e., |E| ≥ 0.5), its success also depends on r. The attack procedure is influenced by the escaping and pursuing strategy of the prey and hawks, respectively. In this sense, four major steps are considered which are broadly explained in the Appendix A [49].
In this phase, it is determined to mathematically wait, search, and discover the desired hunt. The iter + 1 (the Harris hawks position) is mathematically expressed as follows: where stands for the rabbit position, iter denotes the present iteration, is the randomly selected hawk among the available population, , I = 1, 2, 3, 4, q are random numbers ranging in [0, 1], and shows the average position for hawks and is computed as follows: in which shows the place of the hawks and N represents the hawk size.
(ii) Transition from Exploration to Exploitation Considering T as the maximum size about the repetitions and ∈ (−1. 1) as the initial energy during each step, HHO calculates the escaping energy of rabbit (E) by Equation 12. Regarding this value, exploration and exploitation may be changed.
In this sense, if | | ≥ 1, the exploration phase gets started; otherwise, the neighborhood of the solutions is aimed to be exploited.

(iii) Exploitation Phase
Depending on the residual energy of the prey, the hawks may consider a soft or hard besiege for hunting it from different directions. A so-called parameter "r" is defined to measure the escaping chance of the prey. Accordingly, < 0.5 represents a successful escape. In addition, when | | ≥ 0.5, HHO takes soft surround and when | | < 0.5, hard surround is applied. It is worth noting that even if the prey is able to escape (i.e., | | ≥ 0.5), its success also depends on r. The attack procedure is influenced by the escaping and pursuing strategy of the prey and hawks, respectively. In this sense, four major steps are considered which are broadly explained in the Appendix [49].

Model Implementation
As stated above, this study addresses novel optimization of the artificial neural network based on the Harris hawks optimization technique for spatial prediction of a landslide. To this end, the HHO is synthesized with a multilayer perceptron neural network for finding the most appropriate computational parameters. More specifically, the ANN calculates the outputs by assigning some weights and biases. For optimization purposes, the structure of the ANN is mathematically introduced to the evolutionary algorithms. Here, the proposed HHO aims to find a solution containing the best alternatives to the weights and biases of the unreinforced ANN. It is worth noting that based on the authors' experience as well as a trial and error process, the proposed ANN was constructed with five hidden neurons. Also, "Tansig" was considered as their activation function. The proposed HHO-ANN ensemble was performed within 1000 repetitions when the MSE was defined as the objective function (OF). The model executed all 1000 iterations within 8370 s and reduced the OF from 0.238274922 to 0.196520013. However, the majority was decreased before the 750th try. Figure 6 shows the convergence curve of the proposed HHO-ANN.

Model Implementation
As stated above, this study addresses novel optimization of the artificial neural network based on the Harris hawks optimization technique for spatial prediction of a landslide. To this end, the HHO is synthesized with a multilayer perceptron neural network for finding the most appropriate computational parameters. More specifically, the ANN calculates the outputs by assigning some weights and biases. For optimization purposes, the structure of the ANN is mathematically introduced to the evolutionary algorithms. Here, the proposed HHO aims to find a solution containing the best alternatives to the weights and biases of the unreinforced ANN. It is worth noting that based on the authors' experience as well as a trial and error process, the proposed ANN was constructed with five hidden neurons. Also, "Tansig" was considered as their activation function. The proposed HHO-ANN ensemble was performed within 1000 repetitions when the MSE was defined as the objective function (OF). The model executed all 1000 iterations within 8370 s and reduced the OF from 0.238274922 to 0.196520013. However, the majority was decreased before the 750th try. Figure 6 shows the convergence curve of the proposed HHO-ANN.

Landslide Susceptibility Mapping
Next, the landslide susceptibility map of the study area was generated by transferring the outputs (i.e., the predicted values of landslide susceptibility index) of the ANN and HHO-ANN to the GIS environment. As is known, there are various techniques for classifying the constructed maps, such as natural break, equal interval, and quantile method. Out of those, equal interval is not practical as it emphasizes one susceptibility class relative to others [50]. Also, the disadvantage of the quantile method lies in placing widely differing values into the same class [51]. Regarding natural break, it reduces the variance within classes and maximizes the variance between classes [52,53], and is the most commonly used method for this aim [54][55][56][57][58]. Hence, using the "natural break" technique, the created maps were classified into five susceptibility categories, namely "Very low", "Low", "Moderate", "High", and "Very high". These maps are presented in Figure 7 along with the corresponding histograms. The resulted maps show a good approximation of the location of the marked landslide events. Also, it can be seen that appreciable parts of the area in the West-North and South are found to be under the high risk of landslide.

Landslide Susceptibility Mapping
Next, the landslide susceptibility map of the study area was generated by transferring the outputs (i.e., the predicted values of landslide susceptibility index) of the ANN and HHO-ANN to the GIS environment. As is known, there are various techniques for classifying the constructed maps, such as natural break, equal interval, and quantile method. Out of those, equal interval is not practical as it emphasizes one susceptibility class relative to others [50]. Also, the disadvantage of the quantile method lies in placing widely differing values into the same class [51]. Regarding natural break, it reduces the variance within classes and maximizes the variance between classes [52,53], and is the most commonly used method for this aim [54][55][56][57][58]. Hence, using the "natural break" technique, the created maps were classified into five susceptibility categories, namely "Very low", "Low", "Moderate", "High", and "Very high". These maps are presented in Figure 7 along with the corresponding histograms. The resulted maps show a good approximation of the location of the marked landslide events. Also, it can be seen that appreciable parts of the area in the West-North and South are found to be under the high risk of landslide.

Performance Assessment of the Models
Firstly, to measure the performance error of the used models, MSE and MAE error criteria were employed. These values were calculated for both training and testing samples. In this regard, Figure 8 illustrates the comparison between the actual and predicted landslide susceptibility values, as well as the histogram of the errors (i.e., the difference between the mentioned parameters). As is seen, applying the HHO algorithm helped the ANN to have more accurate pattern recognition. Accordingly, the training MSE declined from 0.20646 for the ANN to 0.19652 for the HHO-ANN. The decrease in the calculated MAE (from 0.42006 to 0.39809) provides additional evidence for this claim. As for the testing phase, the decrease of the MSE from 0.21235 to 0.19749 indicates the improvement of the generalization power of the ANN, which results in a more reliable approximation of the unseen landslides. The obtained MAE also attests this statement.

Performance Assessment of the Models
Firstly, to measure the performance error of the used models, MSE and MAE error criteria were employed. These values were calculated for both training and testing samples. In this regard, Figure 8 illustrates the comparison between the actual and predicted landslide susceptibility values, as well as the histogram of the errors (i.e., the difference between the mentioned parameters). As is seen, applying the HHO algorithm helped the ANN to have more accurate pattern recognition. Accordingly, the training MSE declined from 0.20646 for the ANN to 0.19652 for the HHO-ANN. The decrease in the calculated MAE (from 0.42006 to 0.39809) provides additional evidence for this claim. As for the testing phase, the decrease of the MSE from 0.21235 to 0.19749 indicates the improvement of the generalization power of the ANN, which results in a more reliable approximation of the unseen landslides. The obtained MAE also attests this statement. The accuracy of the developed susceptibility maps was also evaluated by drawing the receiver operating characteristic (ROC) curve of both training and testing predictions. The ROC curve plots the specificity (i.e., the proportion of the non-landslide grid cells which are correctly labeled "stable") versus the sensitivity (i.e., the proportion of the landslide grid cells which are correctly labeled "unstable" [59][60][61]. Beguería [61] stated that the area under the plotted ROC curves (AUROC) is a good indicator of the accuracy of natural hazard modeling. It ranges from 0.5 to 1 directly The accuracy of the developed susceptibility maps was also evaluated by drawing the receiver operating characteristic (ROC) curve of both training and testing predictions. The ROC curve plots the specificity (i.e., the proportion of the non-landslide grid cells which are correctly labeled "stable") versus the sensitivity (i.e., the proportion of the landslide grid cells which are correctly labeled "unstable" [59][60][61]. Beguería [61] stated that the area under the plotted ROC curves (AUROC) is a good indicator of the accuracy of natural hazard modeling. It ranges from 0.5 to 1 directly proportional to the prediction accuracy. Figure 9 displays the plotted ROC curves, as well as the computed AUROC for both ANN and HHO-ANN models ( Table 2). Based on this figure, both learning and prediction accuracies of the typical ANN increased after incorporation with the HHO. Accordingly, the accuracy of the training samples rose from 73.1% to 77.7%, and the accuracy of predicting unseen landslides increased from 72.0% (SE = 0.046) to 77.3% (SE = 0.027). The pairwise comparison of the ROC curves was also carried out using the method of Hanley and McNeil [62]. The obtained "significance level" and "z statistic" were 0.1978 and 1.288, respectively.
Sensors 2019, 19, 14 of 22 proportional to the prediction accuracy. Figure 9 displays the plotted ROC curves, as well as the computed AUROC for both ANN and HHO-ANN models ( Table 2). Based on this figure, both learning and prediction accuracies of the typical ANN increased after incorporation with the HHO. Accordingly, the accuracy of the training samples rose from 73.1% to 77.7%, and the accuracy of predicting unseen landslides increased from 72.0% (SE = 0.046) to 77.3% (SE = 0.027). The pairwise comparison of the ROC curves was also carried out using the method of Hanley and McNeil [62]. The obtained "significance level" and "z statistic" were 0.1978 and 1.288, respectively.
(a) (b) Figure 9. The ROC curves plotted for the (a) training and (b) testing datasets. The percentage of the area which is labeled by each one of the susceptibility classes is also calculated. Based on the generated landslide susceptibility maps, both models have found more than one-third of the studied area to be under the high risk of the landslide (i.e., "High" susceptibility category). Also, more than 20% of the area was classified as "Very high" susceptibility. All in all, the ANN and HHO-ANN classified around 57% (i.e., 4474.50 km 2 ) and 54% (i.e., 4210.50 km 2 ) of the area as regions with "High" and "Very high" susceptibility. More details of this analysis are presented in Table 3.   The percentage of the area which is labeled by each one of the susceptibility classes is also calculated. Based on the generated landslide susceptibility maps, both models have found more than one-third of the studied area to be under the high risk of the landslide (i.e., "High" susceptibility category). Also, more than 20% of the area was classified as "Very high" susceptibility. All in all, the ANN and HHO-ANN classified around 57% (i.e., 4474.50 km 2 ) and 54% (i.e., 4210.50 km 2 ) of the area as regions with "High" and "Very high" susceptibility. More details of this analysis are presented in Table 3. Moreover, the percentage of training and testing landslides located in each susceptibility class are presented in Table 4. According to this table, a considerable distinction is the percentage of the landslides found within the "Very high" susceptibility class. As is seen, the prediction of the unreinforced ANN indicates that about 27% and 20% of the training and testing landslides, respectively, are located in the mentioned class. Use of the optimized version increased these values to about 39% and 40%. Also, about 71% and 78% of the landslides which used for pattern discerning by the ANN and HHO-ANN were found in the regions with "High" and "Very high" susceptibility in the developed maps. These values were calculated as 68.87% and 75.98% for unseen landslide events.

Presenting the HHO-Based Predictive Formula
As previously explained, the main contribution of the used hybrid algorithm (i.e., the HHO) to the problem of landslide susceptibility assessment lies in determining the most proper values of the weights assigned to each landside conditioning factor. Therefore, this section aims to present the landslide susceptibility index formula of the ANN which is optimized by the proposed HHO algorithm. The weights and biases of the HHO-ANN model is shown in Table 5.
where Z1, Z2, . . . , Z8 are calculated as follows.  It is well-established that generating a susceptibility map of environmental threats is one of the most fundamental prerequisites for dealing with them. Landslides are one of the most disastrous of these threats. In the case of susceptibility zonation of the landslide, scholars have used different predictive and evaluative techniques. Also, due to the existing shortcomings of statistical methods, as well as the regular machine learning tools, they have employed hybrid metaheuristic algorithms to overcome these drawbacks. These algorithms represent search methods which aim to find the best-fitted solution for a mathematically defined problem.
Artificial neural networks have been efficiently used for landslide susceptibility analysis in various areas. As one of the first attempts, Lee et al. [63] determined the relative importance of landslide thematic factors through training an ANN by the backpropagation algorithm. More recently, Can et al. [64] tested four different learning strategies of the ANN, namely quick propagation, batch backpropagation, Levenberg-Marquardt, and conjugate gradient descent (CGD) for producing the landslide susceptibility map of Ovacık region, Tukey. Their findings revealed that the last algorithm outperforms those presented by other colleagues. However, it also emerged as the slowest algorithm.
Evolutionary algorithms have also shown good incorporation with typical intelligent models to enhance their performance. Different attempts have conducted the optimization of ANNs and ANFIS for modeling various natural phenomena like landslide [65], forest fire [66,67], and groundwater potential [68,69]. As is known, these population-based algorithms mime different social behaviors to search the most appropriate response to a problem. When it comes to machine learning models, the main objective is to overcome existing drawbacks like local minimum and dimension dangers [23], through achieving more suitable values of computational parameters. It also can be considered as the main contribution of such techniques to the proposed problems. In the case of spatial analysis of landslides, as stated in previous research [63,70], the connecting weights of the ANN denote the relative importance of the thematic layers. In this sense, Moayedi et al. [32] found that utilizing the PSO algorithm enhances the capability of the MLP neural network. They also presented the extracted weights that the PSO assigned to each considered landslide conditioning factor.
In the current research, for the first time, the HHO optimization algorithm was applied to the landslide susceptibility problem by incorporation with ANN. The results clearly showed that the HHO performed efficiently in improving the reliability of the proposed spatial analysis. In investigating the optimization path, the convergence curve of the HHO-ANN ensemble showed that an appreciable portion of the objective function reduction occurred in the first attempts. It continued to decrease the error until it remained more and less steady in the last 200 repetitions.
Referring to the results, the authors believe that the proposed HHO algorithm can be promisingly used for optimizing different intelligent tools for landslide susceptibility mapping. It seems to be a proper model for assist the engineers and authorities in future planning in order to alleviate the damage caused by landslides. However, the authors believe that the proposed ensemble could achieve higher accuracy by applying some measures like sensitivity analysis for better structures or optimizing the conditioning factors. Last but not least, establishing a comparison between the optimization capability of the HHO with other well-known metaheuristic techniques would be a good idea for future studies.

Conclusions
Recent years have witnessed the large employment of evolutionary science for optimizing the performance of a typical intelligent model due to the existing drawbacks in dealing with highly complex issues. Landslides are disastrous environmental hazards which need nonlinear analysis to be simulated. Hence, in this paper, a novel metaheuristic technique, namely Harris hawks optimization, was synthesized with an artificial neural network to overcome the computational shortcomings of this model in spatial modeling of landslide susceptibility mapping. After providing the required spatial database, the ANN was coupled with the HHO for adjusting the computational parameters. The results revealed that the HHO acts efficiently in reducing the learning error of the ANN, which resulted in more accurate analysis from the spatial relationship between the landslide occurrence and its conditioning factors. Consequently, the landslide susceptibility map produced by the HHO-ANN was more successful than the ANN map in terms of predicting the unseen landslide events. Finally, due to the acceptable accuracy of the generated maps, they can be used for risk management and decision making in the future. Acknowledgments: This work was financially supported by Ton Duc Thang University and University of South-Eastern Norway.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
More details about four steps of the exploitation phase in the HHO algorithm are presented below: 1. Soft surround: when r ≥ 1 2 and |E| ≥ 1 2 , then we have: where the outcome ∆X denotes the difference between the prey position vector and the term J stands for the prey jump severity. 2. Hard surround: when r ≥ 1 2 and |E| < 1 2 , then the current position is expressed as follows: 3. Advanced rapid dives while soft surround: when r < 1 2 and |E| ≥ 1 2 , the next action of the hawks is determined by the following relation: In the following, the below equation describes the diving of the hawks: where S 1×D shows random vector, D symbolizes the issue dimension, and LF indicates the levy flight. Let µ and ϑ be random values which can vary from 0 to 1, the LF and is calculated as: Consequently, the hawks' location is updated by the below equation: 4. Advanced rapid dives while hard surround: when r < 1 2 and |E| < 1 2 , then the behavior of the hawks which are considered to be near the rabbit can be expressed as: In the above formula, considering X m (iter) = 1 N N i=1 X i (iter), Y and Z should be calculated as follow [71]: Y = X rabit (iter) − E JX rabit (iter) − X m (iter) (A9)