Assessment of Landslide Susceptibility Using Statistical- and Artiﬁcial Intelligence-Based FR–RF Integrated Model and Multiresolution DEMs

: Landslide is one of the most important geomorphological hazards that cause signiﬁcant ecological and economic losses and results in billions of dollars in ﬁnancial losses and thousands of casualties per year. The occurrence of landslide in northern Iran (Alborz Mountain Belt) is often due to the geological and climatic conditions and tectonic and human activities. To reduce or control the damage caused by landslides, landslide susceptibility mapping (LSM) and landslide risk assessment are necessary. In this study, the e ﬃ ciency and integration of frequency ratio (FR) and random forest (RF) in statistical- and artiﬁcial intelligence-based models and di ﬀ erent digital elevation models (DEMs) with various spatial resolutions were assessed in the ﬁeld of LSM. The experiment was performed in Sangtarashan watershed, Mazandran Province, Iran. The study area, which extends to 1072.28 km 2 , is severely a ﬀ ected by landslides, which cause severe economic and ecological losses. An inventory of 129 landslides that occurred in the study area was prepared using various resources, such as historical landslide records, the interpretation of aerial photos and Google Earth images, and extensive ﬁeld surveys. The inventory was split into training and test sets, which include 70 and 30% of the landslide locations, respectively. Subsequently, 15 topographic, hydrologic, geologic, and environmental landslide conditioning factors were selected as predictor variables of landslide occurrence on the basis of literature review, ﬁeld works and multicollinearity analysis. Phased array type L-band synthetic aperture radar (PALSAR), ASTER (Advanced Spaceborne Thermal Emission and Reﬂection Radiometer), and SRTM (Shuttle Radar Topography Mission) DEMs were used to extract topographic and hydrologic attributes. The RF model showed that land use / land cover (16.95), normalised di ﬀ erence vegetation index (16.44), distance to road (15.32) and elevation (13.6) were the most important controlling variables. Assessment of model performance by calculating the area under the receiving operating characteristic curve parameter showed that FR–RF integrated model (0.917) achieved higher predictive accuracy than the individual FR (0.865) and RF (0.840) models. Comparison of PALSAR, ASTER, and SRTM DEMs with 12.5, 30 and 90 m spatial resolution, respectively, with the FR–RF integrated model showed that the prediction accuracy of FR–RF–PALSAR (0.917) was higher than FR–RF–ASTER (0.865) and FR–RF–SRTM (0.863). The results of this study could be used by local planners and decision makers for planning development projects and landslide hazard mitigation measures.


Introduction
Mass movements are influenced by natural and human factors [1].These hazards, which occur in many parts of the world, cause considerable damage to people's lives and property every year.For example, the average annual economic loss is approximately $1.5 billion in the United States, $2 billion in Japan and $2 million in Italy [2].These hazards annually cause approximately 500 billion rial worth of damage in Iran, and this amount does not include the destruction of nonrenewable natural resources [3].The climatic, topographic and environmental conditions prevailing in Iran, especially in northern provinces, have made landslide one of the most important natural hazards [4].In geomorphology, landslide is defined as a gravitational motion of the mass of rocks, rubble, and earth materials [5,6], which can be disastrous and cause the destruction of human settlements and infrastructure, such as roads, bridges, and dams, making it the third cause of disasters worldwide [7].Landslide is one of the greatest disturbances in the development programs in mountainous areas [8].In comparison with other natural hazards, such as volcanic eruptions or flooding, landslide has caused more economic losses and casualties [9].Therefore, knowing where landslides are more likely to occur is important for planners and policymakers to use the natural environment and its resources and economic and infrastructural facilities optimally and prevent damage and casualties [10,11].The rapid population growth, the expansion of human settlements in mountainous areas, the difficulty of predicting the occurrence of landslide and the multiple factors controlling this phenomenon indicate the need for landslide susceptibility assessment (LSA) [12].Landslide susceptibility (LS) is the probability of a landslide occurring in a region on the basis of the local state of the earth [13].Identifying sensitive areas with potential hazard is possible by using landslide susceptibility mapping (LSM), and landslides can be prevented and/or its damage can be reduced by providing appropriate management.LSM is the prediction of the spatial occurrence of landslide on the basis of geomorphological and geo-environmental conditioning factors [14,15].
In this study, statistical and machine learning models and their integration, were compared.Among statistical models, the FR was selected due to its simplicity of equations, ease of interpretation of results, and high efficiency [59][60][61].Abedini and Tulabi [62] used three models of landslide nominal risk factor, FR and AHP for LSM in Lorestan Province, Iran; they stated that the FR model had better prediction accuracy among the three.
The advantages of the RF method over other machine learning models is that it can apply several input factors without eliminating any of them and return a small set of categories that maintain high prediction accuracy [63][64][65].The classification accuracy of this model is affected by the number, scale, type and precision of input data.Processing the use of all suitable factors increases the accuracy of this model.Moreover, in comparison with other models, RF has higher sufficiency to apply a large number of datasets [66].Ref. [67] used three machine learning models, namely, BRT, MARS (multiple adaptive regression splines), and RF, for spatial modelling of gully erosion in Shahroud Basin, Iran and stated that FR had better performance.[68] used the integration of the FR bivariate statistical model and RF data mining model for LSA in the Chehel-chai watershed in Golestan Province.They stated that the integrated model, with area under the receiving operating characteristic (ROC) curve (AUC) of 0.831, had a high capability to identify susceptible areas of landslide occurrence.
Sangtarashan watershed in northern Iran is one of the mountainous regions exposed to numerous landslides.Most of these landslides have had a negative impact on the lives and the economy of people because they often affect residential areas and infrastructure, such as roads.Therefore, LS modelling in this area is of great interest.
A literature review of LS modelling indicates that many different methods have been used in this field worldwide, each having its advantages and disadvantages.The present study aims to increase the efficiency of LS modelling by integrating artificial intelligence-based RF models and FR statistical-based RF models.It also aims to evaluate how different resolutions of the digital elevation model (DEM) used to extract the topographic and hydrologic predictors affect the reliability of the landslide predictive models.Multiresolution DEMs, namely, phased array type L-band synthetic aperture radar (PALSAR), SRTM, and ASTER, were utilised for LSM in the study area.

Study Area
Sangtarashan watershed is located in Mazandran Province, Iran.The area extends to 1072.28 km 2  and lies between longitudes of 53 • 03 52" E and 53 • 31 39" E and latitudes of 36 • 11 48" N and 36 • 30 28" N (Figure 1).The elevation averages 704 m, varying from 72 m in the northeast to 1681 m in the southwest and east of the study area.The mean annual temperature and rainfall are 12.8 • C and 900 mm, respectively [69].The area is characterised by a mountainous landscape with an average slope gradient of 15.8 • and a maximum of 67.8 • .The normalised difference vegetation index (NDVI), which varies from −0.132 to 0.639 with a mean of 0.458, indicates a high vegetation density.Convex and concave plan curvature occur on 49.43% and 46.06% of the study area, respectively, whereas only 3.14% has flat topography.The site is characterised by seven land use/land cover (LU/LC) classes, including agricultural lands (0.08%), dense forest (78.9%), low forest (1.5%), agriculture-orchard (7.6%), agricultural forest (0.02%), dry farming forest (11.5%) and water (0.3%).The dominant lithological units in the area [70] are marl, calcareous sandstone, sandy limestone and minor conglomerate (Mm, s, l); hyporite-bearing limestone (K2l1); medium-to thick-bedded limestone (Pel); polymictic conglomerate and sandstone (Plc); and thick-bedded to massive limestone (K2l2).

Data Collection (Landslide Inventory Map and Landslide Conditioning Factors [LCFs])
The first step in LS modelling is preparing a landslide inventory.This stage is considered the most important because the accuracy of the landslide inventory used to calibrate the models affects the reliability of LSA [71][72][73].Thus, the higher the accuracy and quality of the landslide inventory, the better the predictive performance of the susceptibility maps will be [71].
The following are used to produce detailed and reliable landslide inventory map and interpret high-resolution images from Google Earth: (i) Historical landslide records [74]; (ii) interpretation of aerial photos with scales of 1:20,000 and 1:40,000 and high-resolution images available from Google Earth images; and (iii) extensive field surveys in three periods (i.e., 07/12/2016, 06/13/2017 and 03/02/2018).Finally, 129 landslides were identified in the study area from 2013 to 2018.The landslides in the study area are characterised by two modes, that is, slump-tensile and creep-tensile ruptures, which eventually lead to landslide instability.Of these landslides, 70% (90 landslide locations) were selected for training and 30% (39 landslide locations) for validation of the models [12].
Of the 129 landslides identified in the study area, 56 (43.41%) were translational slides, 38 (29.45%) were rotational slides, 5 (3.87%) were soil creep, 23 (17.82%) were shallow translational slide, and 7 (5.42%) were complex slide.These detections covered a total area of 13,324,421 m 2 .Analysis of landslide size showed that the smallest was 135.63 m 2 and the largest was 976,423 m 2 , with a mean size of 108,215 m 2 .The models developed by [75,76] were used to classify all landslides using landslide points located in the centre of the landslide scar.
On the basis of an extensive literature review [11,40,55,77] and of the physical characteristics of the site, 15 LCFs were used to model the LS in the study area.These LCFs include elevation, slope degree, slope aspect, convergence index, SL, plan curvature, profile curvature, drainage density, distance to stream, distance to road, distance to fault, lithology, rainfall, LU/LC, and NDVI (Figure 2a-o).The variables were submitted to a multicollinearity test that excluded strong relationships among them.The terrain attributes were extracted from PALSAR DEM with 12.5 m spatial resolution (http://www.eorc.jaxa.jp/ALOS/en/aw3d30),DEM-SRTM with 30 m spatial resolution (http://dwtkns.com/srtm30m/),and DEM-ASTER with 90 m spatial resolution (http://dwtkns.com/aster90m/).The slope, aspect, plan curvature, profile curvature, and LS are respectively calculated as follows: Remote Sens. 2019, 11, 999 5 of 24

Data Collection (Landslide Inventory Map and Landslide Conditioning Factors [LCFs])
The first step in LS modelling is preparing a landslide inventory.This stage is considered the most important because the accuracy of the landslide inventory used to calibrate the models affects the reliability of LSA [71 -73].Thus, the higher the accuracy and quality of the landslide inventory, the better the predictive performance of the susceptibility maps will be [71].
where A s is specific upstream contributing area, and β is the slope gradient.The values of H, G, p, q, r, s and t are obtained as follows: where z 1 -z 9 are altitude values in 3 × 3 cellular networks and ∆ s denotes the cell size.The stream network was extracted from the PALSAR DEM using ArcGIS10.5software to calculate the distance to stream and drainage density.After extracting the stream network, distance to stream and drainage density were calculated using the Euclidean distance and line density tools.To produce the grid of distance to roads, these factors were digitized from 1:50,000 scale topographic maps obtained from the Geographic Organisation of the Armed Forces (2008) and Google Earth satellite images.Next, the distance to road was calculated using the Euclidean distance tool.The same tool was used to extract the grid of distance to faults on the basis of the structural map provided by the Soil Conservation Section of Agricultural and Natural Resources Research Centre of Isfahan Province.The lithological map of the study area, which includes 12 units (Table 1), was prepared on the basis of a 1:100,000 scale geological map [70].A Landsat 8 image (path 163/ row 35) (7 September 2017), which was archived by USGS with 30 m and 15 m spatial resolution for visual and panchromatic bands, respectively (https://earthexplorer.usgs.gov/),was used to prepare the LU/LC map in ENVI4.8 software.The supervised (maximum likelihood) algorithm was used for this purpose.A set of 375 ground control points were used to validate the LU/LC map with kappa coefficient.The annual rainfall data recorded at the Talaroum, Afrachal, Soleiman-tangeh, Vasetan, Rigcheshmeh-nakam, Sarkat-tajan and Afrachal weather stations during a 30-year period (i.e., 1986-2016) were interpolated via kriging in ArcGIS10.5 [69] to prepare a rainfall map of the study area.Landsat 8 was used in ArcGIS10.5 software to calculate the NDVI from bands 4 (red band) and 5 (infrared band).The NDVI map was calculated as follows: where IR and R represent the infrared and red electromagnetic spectra, respectively.Table 2 shows the source and resolution of parameters used for classification, along with their classes, methods and references.Finally, all layers were unified similarly with DEM spatial resolution and the UTM Zone39N geographic coordinate system.

Methodology
The method used in this study (Figure 3) consists of five main steps as follows: (1) Detection of landslide locations by optical remote sensing analysis of Google Earth images and extensive field surveys and random subdivision of the landslide archive into training and validation sets; (2) selection and extraction of 15 LCFs using various data sources, including PALSAR DEM, geological map (1:100,000 scale), topographic map (1:50,000 scale) and Landsat 8 image; (3) application of FR model and establishment of the spatial relationship between LCFs and location of landslides; (4) application of RF model and determination of the importance of LCFs; (5) LSM using two approaches, namely, (i) individual statistical FR and artificial intelligence-based RF models and (ii) FR-RF integrated models; (6) validation of LSMs by calculating AUC parameter; and (7) comparison of FR-RF integrated models prepared using topographic variables extracted from DEMs with different resolutions.2.4.Models

FR
The FR approach defines a quantitative relationship between landslide occurrence and spatial variability of a set of predictor variables [83].This method is based on the calculation, for each class of the predictor variables, of the ratio of landslide area within that class to the total landslide area divided by the ratio of the area of that class to the total investigated area.FR is calculated as follows [84]: where A is the number of landslide pixels in each class of the predictors, B is the total number of landslide pixels in the study area, C is the total number of pixels in each class of the predictors, D is the total number of pixels in the study area, E is the percentage of landslide occurrence in each class of the factors, and F is the relative percentage of the area of each subclass of the total area.The FR values of the controlling factors are gathered together in the GIS environment to obtain index of susceptibility to landslide occurrence [85]: where LSI is the landslide susceptibility index, FR is the frequency ratio of factors and n represents the total input factors.
(1:100,000 scale), topographic map (1:50,000 scale) and Landsat 8 image; 3) application of FR model and establishment of the spatial relationship between LCFs and location of landslides; 4) application of RF model and determination of the importance of LCFs; 5) LSM using two approaches, namely, i) individual statistical FR and artificial intelligence-based RF models and ii) FR-RF integrated models; 6) validation of LSMs by calculating AUC parameter; and 7) comparison of FR-RF integrated models prepared using topographic variables extracted from DEMs with different resolutions.

FR
The FR approach defines a quantitative relationship between landslide occurrence and spatial variability of a set of predictor variables [83].This method is based on the calculation, for each class

RF
Data mining emerged in the late 1980s, and considerable progress was made in this subject in the 1990s [86].RF was introduced by [87].It is a method for the classification and regression of data that combines trees with branches and random data to form an RF.Hence, this method is a collection of regression random trees [88].Further details about the RF method can be found in [87,[89][90][91].RF analyses have been performed in R 3.3.1 software, using the 'Randomforest' package [92].

Ensemble of FR and RF (FR-RF)
Given the shortcomings and limitations of each of the statistical and machine learning models, scientists have proposed and developed integrated methods to overcome their disadvantages and increase their efficiency [93].In this study, two types of bivariate and machine learning methods, namely, FR and RF, and their ensemble were applied to produce LSM.The FR-RF integrated method eliminates the disadvantages of bivariate methods, such as the failure to calculate the importance of the parameters and disadvantages of machine learning methods, such as the non-calculation of the spatial relationship between the position of the landslides and the parameters that affect them.The main disadvantage of this methodology is the instability of the RF model.A small change in input data leads to a massive change in the tree structure and the output of the model.

Validation of models
In this experiment, the validation strategy was based on a random partition of the landslide inventory into a calibration set, which includes 70% of landslide locations, and a validation set, which includes 30% of landslide locations [39].The AUC was used to measure the predictive performance of the models [93][94][95][96][97].The AUC values, which are in the range of 0-1, was used to classify the accuracy of the landslide predictive models as follows: 0.9-1: excellent accuracy; 0.8-0.9:very good accuracy; 0.7-0.8:good accuracy; 0.6-0.7:moderate accuracy; and 0.5-0.6:poor accuracy [98].

Multicollinearity Test
In LSM, analysing the collinearity among the selected predictors is important as existence of collinearity reduces the performance of the predictive models.Tolerance (TOL) and variance inflation factor (VIF) are commonly used indicators for collinearity between independent variables and were thus used in this study.TOL values less than 0.1 and VIF greater than 10 show collinearity among the predictors [99].In this study, the values of VIF and TOL were calculated by extracting the values of the parameters in landslide and non-landslide points and analysing them in SPSS16 software.The results of collinearity analysis (Table 3) indicated that among 17 LCFs, which were initially selected, stream power index (SPI), and topography wetness index (TWI) were highly correlated with TOL (0.063 and 0.074) and VIF (15.94 and 13.47) values exceeding the thresholds.Therefore, SPI and TWI were excluded, and LS modelling in the study area was based on the following variables, which were unaffected by collinearity problems: Elevation, slope degree, slope aspect, convergence index, SL, plan curvature, profile curvature, drainage density, distance to stream, distance to road, distance to fault, lithology, rainfall, LU/LC, and NDVI.

Application of FR Model
The spatial relationship between the landslides that occurred in the study area and the LCFs are shown in Table 4.The results showed an inverse relationship between elevation and FR.The highest value of FR (i.e., 3.2032) was found in the lowest elevation class (<338 m), whereas the lowest value (i.e., 0.1273) was calculated in the highest elevation class (>1136 m).Although a positive relationship between the altitude and FR could be expected for several reasons (e.g., increase of precipitation with elevation), in the study area, low elevation areas exhibited conditions that were more favourable to landslide occurrence due to the presence of deeper soils and higher human pressure at the base of the mountainous areas; this result is consistent with the those of Pourghasemi and Rossi (2016).The highest value of FR (1.4692) was found where the slope angle is low to moderate (7.1 • -13 • ).This result, which is consistent with findings of [72,100], could be related to human activities, such as road construction and LU changes, which are frequently at the base of mountains and may favour slope instability conditions.East (FR = 1.8788) and south-west (FR = 1.5664) slope aspect classes showed a strong correlation with landslide occurrence in the area due to the higher moisture conditions.FR values of convergence index classes (1.4136 and 1.2785) showed higher landslide frequency on classes <−29.41 and −9.01 to 7.45.The FR analysis of the LS factor showed that classes of 77.9-103.4m with FR = 2.7919 and 103.4-128.2m with FR = 1.007 had a strong relationship with landslide occurrence in the study area.The analysis of the plan curvature classes showed that the regions with convex and concave topography had a significant effect on landslide occurrence with FR = 1.0910 and 0.9715, respectively.This result was due to the divergence of the flow of water in these slopes and the expansion and contraction of the soil of the convex slopes that had provided conditions favourable for erosion and landslides.These results agree with the findings of [101].The analysis of profile curvature showed that classes of <−1.13 (FR = 1.3583) and −0.35 to 0.28 (FR = 1.1193) were related to landslides in the study area.
A positive relationship was found between drainage density factor and FR, which showed the highest value (FR = 2.2105) where density of streams was the highest (i.e., >1.12 km −1 ).As expected, the lowest distance to stream class (<227 m) had a strong relationship with the triggering of landslides, whereas landslides were less frequent as the distance to streams increased.River erosion and saturation at the foot of slopes had a negative effect on the stability and increased the probability of landslide occurrence [102].The FR analysis of distance to roads showed that areas close to roads had higher susceptibility to landslides.The results of lithology showed that Plc units had a strong relation with landslides and if water penetrates into them, then landslide is more likely to occur.The results of the rainfall parameter showed that the areas that receive the highest amount of rainfall (>1196 mm) with the highest value of FR (1.9542) had the highest potential for landslide occurrence.On the basis of the LU/LC results, a class of agriculture-orchard with FR = 5.6107 had a strong correlation with landslides.These results indicated that human activities could cause the instability of domains and increase landslide occurrence.On the basis of the NDVI factor, the sensitivity of the areas to landslide decreased with the increase in the value of the indicator, indicating the importance of vegetation in reducing mass movements.After determining the weights of the classes of parameters using the FR method, an LSM map was produced using Equation ( 5 The resulting LSM obtained using the FR model varied from 6.25 to 17.92.For the classification of the map, four techniques, including equal interval, quantile, natural break and geomatical interval were analysed (Figure 4).Natural break technique was selected due to its high accuracy.Natural break was also used by [103]) as the best classification method for numerical variables.The resulting map was divided into five classes as follows (Figure 5a): Very low (6.25-11.45),low (11.45-14.01),moderate (14.01-17.14),high (17.14-21.77),and very high (21.77-30.39).The results (Figure 4c) showed that 27.11, 33.92, 22.60, 10.24 and 6.11% of the study area were in very low to very high LS classes, respectively.

Application of RF model
The out-of-bag value of the RF model was 21.6%, and the accuracy of the model was 78.4%.According to the results of the confusion matrix shown in Table 5, out of the 64 non-landslide

Application of RF Model
The out-of-bag value of the RF model was 21.6%, and the accuracy of the model was 78.4%.According to the results of the confusion matrix shown in Table 5, out of the 64 non-landslide locations observed, 54 (84.37%) were predicted as non-landslide and 10 (15.62%) as landslide, and out of the 61 observed landslide locations, 17 (27.86%)were predicted as non-landslide and 44 (72.13%) as landslide.The analysis of the importance of LCFs (Figure 6) showed the following ranking: LU/LC (16.95), NDVI (16.44), distance to roads (15.32), elevation (13.6), rainfall (8.98), lithology (8.71), distance to fault (6.19), drainage density (4.87), distance to stream (1.24), slope (0.34), SL (0.29), convergence index (−1.93),aspect (−2.27), plan curvature (−2.49) and profile curvature (−2.63).The LSM created using the RF model (Figure 5b) was classified according to the natural break method into five classes (very low, low, moderate, high and very high).Approximately 29.23% of the study area is located in low susceptibility class, whereas 8.15% is located in very high susceptibility class (Figure 4c).Approximately 24.26%, 23.51% and 14.83% are located in very low, moderate and high susceptibility classes, respectively.

3.4.Application of FR-RF Integrated Model
To calculate LS in the study area by using the FR-RF integrated approach, the following equation was used in ArcGIS10.5 environment: The LSM created using the RF model (Figure 5b) was classified according to the natural break method into five classes (very low, low, moderate, high and very high).Approximately 29.23% of the study area is located in low susceptibility class, whereas 8.15% is located in very high susceptibility class (Figure 4c).Approximately 24.26%, 23.51% and 14.83% are located in very low, moderate and high susceptibility classes, respectively.

Application of FR-RF Integrated Model
To calculate LS in the study area by using the FR-RF integrated approach, the following equation was used in ArcGIS10.5 4c) indicated that the largest fraction (35.37%) of the study area had low susceptibility, whereas the very high susceptibility class had the smallest extent (6.31%).Very low, moderate, and high susceptibility classes cover 30.23, 19.45, and 8.61% of the study area, respectively.

Model Validation
The results of the model validation using the ROC curve (Figure 7) showed that the FR-RF integrated model (AUC = 0.917) had a higher predictive accuracy compared with the individual methods FR (AUC = 0.865) and RF (AUC = 0.840).To investigate the effect of the resolution of different DEMs on the accuracy of the models, the DEM used in this study (ALOS PALSAR, with 12.5 m spatial resolution) was compared with ASTER and SRTM DEMs with 30 m and 90m spatial resolutions, respectively.The results of comparing the LSM derived from ASTER (Figure 8a) and SRTM (Figure 8b) DEMs and the FR-RF integrated model with the LSM derived from PALSAR DEM and FR-RF integrated model (Figure 5c) showed that the PALSAR DEM provided higher accuracy of the combined approach compared with the other DEMs.The AUC of FR-RF-PALSAR was 0.917, whereas those of FR-RF-ASTER and FR-RF-SRTM approaches were 0.865 and 0.863, respectively (Figure 9).
according to natural break method into five classes (Figure 5c), which included 19. 96-60.47The results (Figure 4c) indicated that the largest fraction (35.37%) of the study area had low susceptibility, whereas the very high susceptibility class had the smallest extent (6.31%).Very low, moderate, and high susceptibility classes cover 30.23, 19.45, and 8.61% of the study area, respectively.

Model Validation
The results of the model validation using the ROC curve (figure 7) showed that the FR-RF integrated model (AUC = 0.917) had a higher predictive accuracy compared with the individual methods FR (AUC = 0.865) and RF (AUC = 0.840).To investigate the effect of the resolution of different DEMs on the accuracy of the models, the DEM used in this study (ALOS PALSAR, with 12.5 m spatial resolution) was compared with ASTER and SRTM DEMs with 30 m and 90m spatial resolutions, respectively.The results of comparing the LSM derived from ASTER (Figure 8a) and SRTM (Figure 8b) DEMs and the FR-RF integrated model with the LSM derived from PALSAR DEM and FR-RF integrated model (Figure 5c) showed that the PALSAR DEM provided higher accuracy of the combined approach compared with the other DEMs.The AUC of FR-RF-PALSAR was 0.917, whereas those of FR-RF-ASTER and FR-RF-SRTM approaches were 0.865 and 0.863, respectively (Figure 9).

Discussion
In this study, low-frequency radar data (i.e.ALOS PALSAR) was used in providing a DEM to obtain high-precision and high-quality data.The integration of two highly efficient models, namely, FR and RF, was tested to predict areas that are susceptible to landslides accurately.Analysing the varied and complex data related to landslide occurrence requires robust and flexible analytical methods that control nonlinear relationships, interactions, and lost data.Understanding and presenting the results achieved by these methods should be simple and easily interpretable.RF is a powerful and useful data mining technique, which is used for classification and prediction [89].Treebased methods can be easily understood by everyone.These methods combine data analysis and modelling; thus, they are a powerful step in modelling.Given its high degree of classification accuracy, the RF has introduced a new method for determining the importance of variables, that is,

Discussion
In this study, low-frequency radar data (i.e., ALOS PALSAR) was used in providing a DEM to obtain high-precision and high-quality data.The integration of two highly efficient models, namely, FR and RF, was tested to predict areas that are susceptible to landslides accurately.Analysing the varied and complex data related to landslide occurrence requires robust and flexible analytical methods that control nonlinear relationships, interactions, and lost data.Understanding and presenting the results achieved by these methods should be simple and easily interpretable.RF is a powerful and useful data mining technique, which is used for classification and prediction [89].Tree-based methods can be easily understood by everyone.These methods combine data analysis and modelling; thus, they are a powerful step in modelling.Given its high degree of classification accuracy, the RF has introduced a new method for determining the importance of variables, that is, the capability to model complex interactions between predictor variables and the flexibility to implement different types of statistical data analysis.This method is also useful when complex interactions exist between predictor and response variables and when coherence exists among predictor variables.Another advantage of this method is that it does not need to normalise the data and it can handle nonlinear relationships.The most important disadvantage of this method in the prediction of landslide is the lack of calculation of the spatial relationship between the occurrence of landslides and their effective parameters.Therefore, this method cannot calculate the importance of the classes of each of the parameters in the occurrence of landslide.This problem can be resolved using the FR model.By contrast, the most important disadvantage of the FR model is its incapability to calculate the importance of LCFs, which can be solved using the RF model.
The results of determining the importance of parameters by the RF method showed that parameters LU/LC, NDVI, distance to roads and elevation had the greatest effect on landslide occurrence.These results are in line with the findings of [72,100].In the results of the model validation, several cases can be mentioned.
(i) The RF model had a higher predictive accuracy compared with the FR model, which is consistent with the results of [18,45,50,57 2018).[45]) used the logistic model tree, RF, and classification and regression tree (CART) to prepare an LSM.They used 12 controlling parameters and 171 landslide locations.Their results showed that the RF model had a higher predictive accuracy than the other models.[18] evaluated the LS in China using RF and decision tree models along with GIS techniques.They used 34 hydrological, topographic, geological, LC, and environmental LCFs and 300 landslide locations.They concluded that the RF model had a higher efficiency in identifying sensitive areas to landslide.[50] prepared an LS map at Pyeong-Chang, Korea using RF and boosted tree methods.They used various information layers and divided the landslides that occurred in the area into two groups for modelling (50%) and validation (50%) purposes.Both models had an acceptable accuracy in identifying landslide sensitive areas.[100] evaluated and compared four methods, namely, RF, BRT, CART, and the general linear model (GLM) in LSM in Wadi Tayyah Basin area.The authors used 11 LCFs and 125 landslide locations.The validation of the models, which was performed using success and prediction rate curves, showed that the RF model had a reasonable accuracy in predicting LS and could thus be used to this aim by environmental planners.
(ii) In the experiment, the integration of the individual models (i.e., FR and RF) increased their efficiency.This result is in line with the findings of [32, 48,52,56,68].Ref. [56] combined RF and EBF methods to assess LS.They used 153 landslide locations and 11 effective parameters, namely, slope angle, altitude, TWI, distance from roads, plan curvature, slope aspect, profile curvature, lithology, LU, distance from rivers, and distance to faults.Their results indicated that the integration approach had a higher efficiency in identifying areas susceptible to landslides and eliminate the disadvantages of each individual models.
(iii) The PALSAR DEM had a greater influence on the prediction accuracy of the proposed integrated model compared with ASTER and SRTM DEMs.Moreover, different DEMs derived from various data sources with different spatial resolutions could considerably affect the models' accuracy.This result is in line with the findings of [104,105], which highlighted that the accuracy of landslide prediction depends on the accuracy and quality of the controlling factors, which in turn depend on the accuracy of their resources, such as DEMs.The methodological framework introduced in this study showed that the integration of models could eliminate their disadvantages and increase their efficiency.Furthermore, the use of radar remote sensing data in comparison with optical data could play a significant role in increasing the prediction accuracy of the models.Thus, using this methodology is recommended in areas with the same environmental conditions.

Conclusions
LSM is the first and most important step in managing mountainous areas and reducing the damage caused by landslides.In recent years, many quantitative and qualitative methods have been introduced for LSM.Thus far, no method is considered the best; nevertheless, each one has its obvious advantages and disadvantages.In this study, two statistical-and artificial intelligence-based methods, namely FR and RF, and their integration were used for LSM to promote their advantages and overcome their shortcomings.For this purpose, 15 LCFs (i.e., elevation, slope, slope aspect, convergence index, SL, plan curvature, profile curvature, drainage density, distance to stream, distance to road, distance to fault, lithology, rainfall, LU/LC, and NDVI) and a modelling dataset were used for LSM.The results of RF showed that the LU/LC parameter had the most important role in landslide occurrence in the study area.Validation results showed that the FR-RF integrated model had a higher prediction accuracy than individual models.The results also showed that the quality and accuracy of the input DEMs considerably influenced the prediction accuracy of the resulting LSMs; thus, the resulting LSM from the PALSAR DEM had a higher accuracy compared with the LSM from ASTER and SRTM DEMs.On the basis of the results of the introduced methodology (i.e., FR-RF-PALSAR), 14.92% of the study area is located in high and very high susceptibility classes.The proposed methodology had many advantages (Section 4).The main advantage of this methodology is its capability to determine the relative importance of effective factors and the spatial relationship between these factors and landslide locations.Areas with high sensitivity to landslides are located around residential areas and human infrastructure; thus, any construction in these areas should be performed with caution and observance of engineering principles.The results of this study can be used for LU management, reduction of landslide damages and sustainable development.

Figure 1 .
Figure 1.Study area: (a) Location of the study area in Iran; (b) location of the study area in Mazandran Province; (c) location of training and validation landslides in the study area.

Figure 6 .
Figure 6.Determining the importance of LCFs using RF.

Figure 6 .
Figure 6.Determining the importance of LCFs using RF.

Figure 7 .
Figure 7. Area under curve (AUC) values of FR, RF and integrated model.Figure 7. Area under curve (AUC) values of FR, RF and integrated model.

Table 2 .
Overview of factors used for landslide susceptibility mapping (LSM).

Table 3 .
Multicollinearity analysis among independent variables.

Table 4 .
Spatial relationship between conditioning factors and landslide locations with frequency ratio (FR).