Investigating the Effect of Cross-Modeling in Landslide Susceptibility Mapping

To mitigate the negative effects of landslide occurrence, there is a need for effective landslide susceptibility mapping (LSM). The fundamental source for LSM is a landslide inventory. Unfortunately, there are still areas where landslide inventories have not been generated because of financial or accessibility constraints. This led to the following research question: can we model landslide susceptibility in an area for which a landslide inventory is not available but where one is available for the surrounding areas? To answer this question, we performed cross-modeling using various strategies for landslide susceptibility. Namely, landslide susceptibility was cross-modeled using two adjacent regions (“Łososina” and “Gródek”) separated by Rożnów Lake and the Dunajec River. Thus, 46% and 54% of the total detected landslides were used for the LSM in the “Łososina” and “Gródek” models, respectively. Various topographical, geological, hydrological and environmental landslide-conditioning factors (LCFs) were created. These LCFs were generated on the basis of the Digital Elevation Model (DEM), Sentinel-2A data, digitized geological and soil suitability maps, precipitation, the road network and the Rożnów Lake shapefile. For LSM, we applied the Frequency Ratio (FR) and Landslide Susceptibility Index (LSI) methods. Five zones showing various landslide susceptibilities were generated via the Jenks natural breaks classification. The Seed Cell Area Index (SCAI) and Relative Landslide Density Index were used for model validation. Although the SCAI indicated extremely high values for “very low” susceptibility classes and very small values for “very high” susceptibility classes in both the training and validation areas, the accuracy of the LSM in the validation areas was significantly lower. In the “Łososina” model, 90% and 57% of the landslides fell into the “high” and “very high” susceptibility zones in the training and validation areas, respectively.
In the “Gródek” model, 86% and 46% of the landslides fell into the “high” and “very high” susceptibility zones in the training and validation areas, respectively. Moreover, the two models were compared. Discrepancies between them exist in areas of critical geological structures (thrust and fault proximity), where the reliability of the susceptibility zones can be low (2–3 susceptibility zone difference). However, such areas cover only 11% of the analyzed area; thus, we can conclude that in the remaining regions (89%), LSM generated from the inventory of the surrounding area can be useful. Nevertheless, the low reliability of such a map in areas of critical geological structures should be borne in mind.

are selected randomly across the entire study area, so they can capture a variety of conditions. This is a major advantage in LSM. Unfortunately, there are still areas for which landslide inventory maps have not been generated because of financial constraints or poor accessibility. The authors of [29] presented a summary of the completeness of landslide inventories for various countries in Europe in 2012, which confirms that there are still areas for which landslide maps have not been generated. For example, the landslide inventory in Poland has been generated step by step (commune after commune), and its generation sometimes proved to be very time consuming where the geological conditions were complex, resulting in a lack of landslide inventory in such regions. At the same time, such regions still need land-use planning, decision making, and continued building and investment.
Considering this led to the following research question: could we assess landslide susceptibility in an area for which landslide inventory is not available but where it is available for the surrounding places? To answer this research question, we performed LSM based on so-called cross-modeling. More specifically, we divided our study area into two separate regions ("Łososina" and "Gródek") and performed modeling for the entire study area based on the "Łososina" and "Gródek" areas in the first and the second modeling scenarios, respectively. For the LSM, we utilized various topographical, geological and environmental LCFs together with the widely used Frequency Ratio method. LSM was performed in the study area greatly affected by landslide activity in the Polish Carpathians, in the area of the Rożnów Lake.

Study Area
The study area is located in the Outer Carpathians in the Małopolskie voivodeship. The Dunajec, the main river of the region, flows through the study area, and Rożnów Lake was created as a result of damming the Dunajec River. The dam serves the Rożnów Power Plant [30]. The Rożnów water reservoir is one of the main elements for managing the water resources of the Dunajec river basin (Figure 1). The study area extends from 49°40′ N to 49°46′ N latitude and from 20°38′ E to 20°48′ E longitude, which corresponds to 136 km² (Figure 1). Around 18.2 km² of this area is affected by landslides. According to Varnes' classification, updated by Hungr et al. [31], all landslides have a slide type of movement. According to Hungr's classification [31], within the Łososina commune, there are rock rotational slides (no. 6) and clay/silt rotational, planar and compound slides (nos. 11, 12 and 14). Unfortunately, the landslide inventory provided by the Polish National Geological Institute does not provide more detailed characteristics of landslide types, e.g., shallow/deep-seated landslides or planar/rotational slides. Generally, landslides are classified as deep or shallow based on the material, movement mechanism and depth of the rupture surface [32]. The authors of [33,34] reported that a landslide is considered shallow when the depth of the sliding plane is less than 10 m and deep-seated when the sliding plane is deeper than 10 m. Having investigated the sizes [35,36], characteristics and slip depths of some landslides described in the landslide inventory documentation [37,38], we can conclude that both shallow and deep-seated landslides exist in the investigated region. Generally, landslides in this region are slow- to very-slow-moving. The study area mainly consists of Eocene-Oligocene sandstones and shales and Upper Cretaceous sandstone and conglomerate (Lower Stebna layers). This is one of the most landslide-affected areas in Poland.
Additionally, due to the complex geological conditions, landslide inventories in this region of the eastern shore of Rożnów Lake were unavailable for a long time. Thus, its situation suited the research question of this paper. Appl. Sci. 2020, 10, x FOR PEER REVIEW 4 of 31

Methodology
The methodology flowchart is presented in Figure 2. The study area was divided into two regions called "Łososina" and "Gródek", separated by the Dunajec River (which is also the boundary between the Łososina and Gródek communes). These regions were used for cross-modeling. This means that in the first model, the landslides in "Łososina" were used to model the landslide susceptibility of the entire study area, while in the second, the landslides in "Gródek" were used. The official national landslide inventory map (SOPO) was used for LSM. The Frequency Ratio (FR) method was applied as a tool for landslide modeling. The susceptibility assessment was performed in the same manner in both cases, differing only in the landslide input data. Based on this, a direct comparison between these two strategies was possible. More detailed descriptions of the data and methods used are presented in the following subsections.

Input Data
The landslide inventory database (SOPO) from the Polish National Geological Institute was used for the input data ( Figure 1). These data allowed the generation of models and validation datasets. The existing landslide locations and boundaries within the SOPO database were captured using conventional techniques, mostly comprising field reconnaissance, the visual interpretation of aerial photographs, and the analysis of historical data [39]. The landslides within the study area are stored in the SOPO database and were mapped during field work in the years 2010, 2011, 2012, 2013, 2014 and 2015 [35,37,38,40]. Additional mapping was also performed on the basis of topographic maps at a 1:10 000 scale, supported by stereoscopic analyses of aerial photographs and LiDAR data [9].
Various Landslide Conditioning Factors (LCFs) were generated from the data captured from different sources. Table 1 presents the data sources used to create so-called second-order input data such as the Digital Elevation Model (DEM), geological and soil map etc. The LiDAR data were acquired within the framework of the IT System of the Country Protection (ISOK) project. Almost the entire area of Poland has been scanned for the implementation phase of the extraordinary hazard (mostly water hazard) protection system (ISOK project) [41]. A point cloud with a density of 4–6 points/m² was generated, and the calculated Root Mean Square Error was about 0.15 m for the height coordinate [41]. This LiDAR point cloud was subsequently used for DEM generation. To decrease the data volume and discard some of the artefacts from the original DEM, we followed the recommendations of [42–44] and generated a DEM with a resolution of 2 m. This approach appears to be common among various papers, and many authors have reported that the finest DEM resolution is not always the best choice [42–45]. Based on the DEM, topographical LCFs were subsequently generated as described in Section 2.4.
A Sentinel-2A image acquired via the Copernicus Scientific data hub was used to extract the boundary of Rożnów Lake. This shape was extracted by the calculation of the Normalized Difference Vegetation Index (NDVI). More specifically, an NDVI value lower than 0 was used to extract the shape of the lake, based on which another LCF was calculated (the lake proximity). Moreover, an agricultural soil map acquired from the geoportal of the Małopolskie voivodeship was used to generate a soil-suitability map. A geological map was also obtained from the Polish National Geological Institute.
Both the geological map and the soil maps were digitized and stored as vector layers in ArcGIS (version 10.6, ESRI, Redlands, CA, USA). Additionally, a thrust and fault network was generated based on previous work [30] via a digitization process. The road network was acquired from OpenStreetMap (OSM). Because the river network in OSM is very sparse, the stream network was extracted from the DEM, from which the flow direction and flow accumulation were first calculated. The subsequent thresholding of the flow accumulation values allowed the extraction of the stream network. Finally, this network was manually refined and converted into vector layers. Because precipitation is a direct trigger of landslides worldwide, we downloaded the precipitation measurements from four different meteorological stations in this area. Weather conditions are changing as a result of climate change, and from one year to another, we observe more extreme weather phenomena. Thus, to represent the most current rainfall situation, we used the rainfall measurements from 2019. A precipitation map was generated from these measurements by inverse distance weighted interpolation. A summary of all the data used in this study is presented in Table 1.
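The inverse distance weighted interpolation mentioned above can be illustrated with a minimal sketch. The station coordinates, rainfall totals, the power parameter of 2 and the function name `idw_interpolate` are hypothetical; the paper does not report the interpolation settings used in ArcGIS.

```python
import numpy as np

def idw_interpolate(station_xy, station_values, grid_xy, power=2.0):
    """Inverse distance weighted estimate at each grid point."""
    station_xy = np.asarray(station_xy, dtype=float)
    station_values = np.asarray(station_values, dtype=float)
    grid_xy = np.asarray(grid_xy, dtype=float)
    # Pairwise distances, shape (n_grid, n_stations)
    d = np.linalg.norm(grid_xy[:, None, :] - station_xy[None, :, :], axis=2)
    exact = d < 1e-12
    # Clamp distances to avoid division by zero at station locations
    w = 1.0 / np.maximum(d, 1e-12) ** power
    est = (w * station_values).sum(axis=1) / w.sum(axis=1)
    # Where a grid point coincides with a station, return the measured value
    hit = exact.any(axis=1)
    est[hit] = station_values[exact[hit].argmax(axis=1)]
    return est

# Hypothetical station coordinates (km) and annual totals (mm), not from the paper
stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
rain = [700.0, 800.0, 750.0, 850.0]
grid = [(5.0, 5.0), (0.0, 0.0), (2.0, 1.0)]
precip = idw_interpolate(stations, rain, grid)
```

With only four stations, the resulting surface is smooth and strongly controlled by the nearest gauge, so the power parameter has a visible effect on the map.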

Preparation of Landslide Conditioning Factors
Various topographical layers (LCFs) were derived from the DEM generated from the LiDAR data. Table 2 presents an overview of the widely applied LCFs generated from the DEM, and Figure 3 presents a graphical representation of some of the LCFs used. Based on the DEM analysis, hydrology-related layers were also extracted: the compound topographic index (CTI), integrated moisture index (IMI) and flow direction (FD). We also generated a stream proximity layer from the stream network extracted from the DEM (see Section 2.3) using the Euclidean Distance Buffering (EDB) tool within ArcGIS.
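As an illustration of a proximity layer, the sketch below computes the Euclidean distance from every raster cell to the nearest stream cell. It is a brute-force NumPy stand-in for the ArcGIS tool, not the tool itself; the function name `euclidean_proximity` and the tiny stream mask are hypothetical, and the 2 m cell size is taken from the DEM resolution stated in the text.

```python
import numpy as np

def euclidean_proximity(stream_mask, cell_size=2.0):
    """Distance (in map units) from each cell to the nearest stream cell."""
    stream_mask = np.asarray(stream_mask, dtype=bool)
    rows, cols = np.indices(stream_mask.shape)
    cells = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(float)
    streams = np.argwhere(stream_mask).astype(float)
    # Distance from every cell to every stream cell; keep the minimum
    d = np.linalg.norm(cells[:, None, :] - streams[None, :, :], axis=2)
    return (d.min(axis=1) * cell_size).reshape(stream_mask.shape)

# Hypothetical 3 x 3 raster with a single stream cell in the corner
mask = np.zeros((3, 3), dtype=bool)
mask[0, 0] = True
dist = euclidean_proximity(mask)
```

The pairwise distance matrix grows with the product of raster size and stream length, so this form is only suitable for small examples; a distance transform is the usual choice for full rasters.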

Landslide Susceptibility Modeling
To achieve the goal of this study, we used different landslide datasets for the landslide susceptibility assessment. Namely, we generated two landslide susceptibility models based on the landslides located within the "Łososina" and "Gródek" regions in the first and second modeling scenarios, respectively. Quantitative information about the number of landslides used for modeling and their share of the total analyzed area is presented in Table 3. A graphical representation of the region separation is presented in Figure 4. Based on this, it can be observed that 46% and 54% of the total detected landslides were used for the LSM in the "Łososina" and "Gródek" strategies, respectively. Thus, there is an 8% difference in the share of landslides used for modeling between these two strategies. At this point, it is worth reiterating that 70% of the landslides are typically used for LSM in the literature [2,3,13,18,20–22,24,27]. Moreover, a random sampling strategy is usually applied when assessing landslide susceptibility; the landslides used for modeling are distributed randomly and evenly across the investigated area, so the variety of conditions (LCFs) can be better captured. In contrast, in the first and second strategies, we used fewer landslides for modeling, and they were confined to one side of the study area. Thus, we expected that this modeling of landslide susceptibility would have lower performance than that usually presented in the literature. However, the goal of the presented work was to investigate whether a landslide inventory could be used for LSM in surrounding regions rather than to strive for accuracy.

Application of Frequency Ratio Model for Landslide Susceptibility Zonation
The Frequency Ratio (FR) is a geospatial assessment tool used to estimate the susceptibility to landslides in a given research area [13]. It belongs to the bivariate statistical methods, and its value depends on the relationship between the location of landslides and LCFs [27]. For each LCF class, a weight is assigned based on the ratio of the share of observed landslides in that class to the share of the study area covered by that class. The weights, represented as FRs, can be calculated using the landslide inventory and each specific LCF [27]. When j is the class of a specific LCF (i), the FR is defined by the following equation:
$$FR_{ij} = \frac{A/B}{C/D}$$
where:
A = the number of landslide pixels in each LCF class;
B = the total number of landslide pixels in the test area;
C = the number of pixels in each LCF sub-class;
D = the total number of pixels in the test area [1].
Accordingly, the FR was calculated by overlaying landslide pixels with the thematic layers or LCF layers presented in Section 2.4. Values greater than 1 imply high landslide susceptibility within this class [13] and a strong correlation between the landslide occurrence and LCF class [28]. Values below 1 indicate no or only a slight correlation between the LCF and landslide occurrence [13,28].
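A minimal sketch of the FR calculation for one LCF is given below, assuming the common form FR = (A/B)/(C/D) with A, B, C and D as defined above. The function name `frequency_ratio` and the small rasters are hypothetical and serve only to make the arithmetic concrete.

```python
import numpy as np

def frequency_ratio(lcf_classes, landslide_mask):
    """FR per class of one LCF raster: (A / B) / (C / D)."""
    lcf_classes = np.asarray(lcf_classes)
    landslide_mask = np.asarray(landslide_mask, dtype=bool)
    B = landslide_mask.sum()   # total landslide pixels
    D = lcf_classes.size       # total pixels
    fr = {}
    for cls in np.unique(lcf_classes):
        in_class = lcf_classes == cls
        A = (in_class & landslide_mask).sum()  # landslide pixels in this class
        C = in_class.sum()                     # all pixels in this class
        fr[int(cls)] = float((A / B) / (C / D))
    return fr

# Hypothetical 4 x 4 raster: class 1 in the top half, class 2 in the bottom half
lcf = np.array([[1, 1, 1, 1],
                [1, 1, 1, 1],
                [2, 2, 2, 2],
                [2, 2, 2, 2]])
slides = np.zeros((4, 4), dtype=bool)
slides[0, 0] = slides[0, 1] = slides[1, 0] = True  # three landslide pixels in class 1
slides[3, 3] = True                                # one landslide pixel in class 2
fr = frequency_ratio(lcf, slides)
```

In this toy case, class 1 holds 75% of the landslide pixels on 50% of the area, so its FR is 1.5, while class 2 obtains 0.5.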
The Landslide Susceptibility Index (LSI) was subsequently calculated for each pixel in the image (x,y) according to the formula:
$$LSI(x,y) = \sum_{i=1}^{n} FR_{ij}(x,y)$$
where n is the number of LCFs and $FR_{ij}(x,y)$ is the frequency ratio of the class j of LCF i found at pixel (x,y). The LSI values were then classified using natural breaks. Jenks [56] developed this optimization method to minimize within-class variance while maximizing between-class variance. This classification method is generally implemented in GIS software, such as ESRI ArcGIS or QGIS (free open-source software). The classes are split according to natural clusters inherent in the data, and the boundaries are statistically determined where relatively large jumps occur within the susceptibility indices. The LSI values were separated by this method into five susceptibility classes: very low, low, moderate, high and very high. This number of susceptibility zones is commonly used in small-scale landslide susceptibility mapping [57,58]. These classes are considered adequate for revealing any spatial patterns preserved in a dataset and aiding the interpretation of the LSMs [59].
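The LSI step can be sketched as a per-pixel sum of the FR values looked up for each LCF, followed by a split into five classes. The rasters, FR tables and break values below are hypothetical; in particular, the breaks are fixed by hand for illustration and are not computed with the Jenks optimization used in the study.

```python
import numpy as np

def landslide_susceptibility_index(lcf_rasters, fr_tables):
    """Sum, per pixel, the FR of the class observed in each LCF raster."""
    lsi = np.zeros(np.asarray(lcf_rasters[0]).shape, dtype=float)
    for raster, table in zip(lcf_rasters, fr_tables):
        raster = np.asarray(raster)
        for cls, fr in table.items():
            lsi[raster == cls] += fr
    return lsi

def classify(lsi, breaks):
    """Map LSI values to classes 1..5 given four ascending upper bounds."""
    return np.digitize(lsi, breaks, right=True) + 1

# Two hypothetical LCF rasters with their hypothetical FR lookup tables
slope = np.array([[1, 2], [2, 1]])
litho = np.array([[1, 1], [2, 2]])
fr_slope = {1: 0.5, 2: 1.5}
fr_litho = {1: 0.2, 2: 2.0}

lsi = landslide_susceptibility_index([slope, litho], [fr_slope, fr_litho])
zones = classify(lsi, breaks=[1.0, 2.0, 3.0, 4.0])  # assumed breaks, not Jenks
```

Class 1 corresponds to "very low" and class 5 to "very high"; replacing the fixed breaks with Jenks breaks only changes the `breaks` argument.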

Methods for Model Validation
Model validation is a crucial step that indicates whether the generated model achieves a certain level of accuracy. Without accuracy assessment, generated models are meaningless [13]. We applied the Seed Cell Area Index (SCAI) and Relative Landslide Density Index (R_ind) to evaluate our landslide susceptibility maps in this study. We also calculated a residual map and the correlation coefficient between the two susceptibility maps. The SCAI and R_ind were calculated for the validation and modeling areas of both models.

Index of Relative Landslide Density
Bearing in mind that the LSM was used to distinguish susceptible and non-susceptible areas, landslides may be expected to occur in the more susceptible zones [60]. A relative landslide density index (R_ind) was calculated to verify this. This index is defined as the ratio of the landslide density in a given susceptibility class to the overall landslide density. The index takes the following form:
$$R_{ind} = 100 \cdot \frac{n_i / N_i}{\sum (n_i / N_i)}$$
where:
n_i = the number of landslides observed in a susceptibility class;
N_i = the area covered by the cells of this class [61].
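The index can be sketched as follows, assuming the commonly used form in which the landslide density of each class (n_i / N_i) is normalized by the sum of the densities over all classes and expressed as a percentage. The counts and areas are hypothetical, as is the function name `relative_landslide_density`.

```python
import numpy as np

def relative_landslide_density(n, N):
    """R_ind per class: 100 * (n_i / N_i) / sum over classes of (n_i / N_i)."""
    n = np.asarray(n, dtype=float)
    N = np.asarray(N, dtype=float)
    density = n / N
    return 100.0 * density / density.sum()

# Hypothetical landslide counts and class areas (km^2), very low -> very high
n_landslides = [1, 2, 7, 30, 60]
class_area = [10.0, 20.0, 35.0, 25.0, 10.0]
r_ind = relative_landslide_density(n_landslides, class_area)
```

By construction, the values sum to 100 across the five classes, so a well-behaved map shows most of the index concentrated in the high and very high classes.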

Seed Cell Area Index
The Seed Cell Area Index (SCAI) validation technique used in this study was developed by Süzen and Doyuran [62]. It is described as the ratio between the percentage of pixels in a given landslide susceptibility class and the percentage of the pixels of existing landslides in that class. The SCAI is assumed to be a reliable validation technique [13,14]. If its values decrease from very low to very high classes of LS, the model is regarded as excellent [13]. When i is the specific susceptibility class, the SCAI is represented as follows:
$$SCAI_i = \frac{F_i}{E_i}$$
where:
E_i = the percentage of landslide pixels in the specific susceptibility class relative to the total landslide pixels;
F_i = the percentage of pixels in the specific susceptibility class relative to the total image pixels.
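A short sketch of the SCAI calculation is given below, assuming SCAI_i = F_i / E_i, i.e., the areal share of a class divided by the share of landslide pixels falling in it, which matches the verbal definition above. The zone array, landslide mask and function name `seed_cell_area_index` are hypothetical.

```python
import numpy as np

def seed_cell_area_index(zone_raster, landslide_mask, n_classes=5):
    """SCAI per class: (% of all pixels in class) / (% of landslide pixels in class)."""
    zones = np.asarray(zone_raster)
    slides = np.asarray(landslide_mask, dtype=bool)
    scai = {}
    for c in range(1, n_classes + 1):
        in_zone = zones == c
        F = 100.0 * in_zone.sum() / zones.size
        E = 100.0 * (in_zone & slides).sum() / slides.sum()
        # A class without any landslide pixel has an unbounded index
        scai[c] = F / E if E > 0 else float("inf")
    return scai

# Hypothetical map of 25 cells: class sizes 10, 6, 4, 3, 2 (very low -> very high)
zones = np.repeat([1, 2, 3, 4, 5], [10, 6, 4, 3, 2])
slides = np.zeros(zones.size, dtype=bool)
for cls, count in zip([1, 2, 3, 4, 5], [1, 1, 2, 2, 2]):
    slides[np.flatnonzero(zones == cls)[:count]] = True
scai = seed_cell_area_index(zones, slides)
```

Here the index falls monotonically from the very low class (3.2) to the very high class (0.32), which is the pattern regarded as a good result.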

Map/Model Comparison
To evaluate the reliability of the two susceptibility maps generated from two different landslide input datasets, we performed a model (map) comparison. Firstly, we determined the raster difference between the two maps, showing how they differed from each other. By subtracting the susceptibility zones of one model from those of the other, a so-called residual map was generated [63]. This residual image showed how pixels shifted from one landslide susceptibility class to another between the two maps. Thus, a residual image can have a maximum of five different classes: no difference, one-zone difference (1, −1), two-zone difference (2, −2), three-zone difference (3, −3) and four-zone difference (4, −4) [14]. We also calculated the Pearson correlation coefficient between the two landslide susceptibility maps to evaluate their similarity.
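The comparison can be sketched as a raster subtraction followed by a correlation of the two maps. The class rasters below are hypothetical, as is the function name `compare_maps`; correlating the class rasters rather than the continuous LSI values is an assumption, since the text does not state which was used.

```python
import numpy as np

def compare_maps(zones_a, zones_b):
    """Residual map, share of pixels per absolute zone difference, Pearson r."""
    a = np.asarray(zones_a, dtype=int)
    b = np.asarray(zones_b, dtype=int)
    residual = a - b  # values in -4..4 for five susceptibility classes
    share = {k: float(np.mean(np.abs(residual) == k)) for k in range(5)}
    r = float(np.corrcoef(a.ravel(), b.ravel())[0, 1])
    return residual, share, r

# Hypothetical susceptibility class rasters (1 = very low ... 5 = very high)
model_a = np.array([[1, 2, 3], [4, 5, 5]])
model_b = np.array([[1, 3, 3], [2, 5, 4]])
residual, share, r = compare_maps(model_a, model_b)
```

The `share` dictionary corresponds to a table of the proportion of the map falling into each zone-difference category.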

Results
This section comprises the following three subsections: (1) the spatial relationship between landslides and conditioning factors, and the FR values used for the LSI calculations (presented in Appendices A and B); (2) a presentation of the landslide susceptibility maps generated by the two models; and (3) the validation and comparison of the generated models using the SCAI, R_ind and model comparison.

The Spatial Relationship Between Landslide Locations and Analyzed Landslide-Controlling Factors
The frequency ratios for the first and second models are presented in Appendices A and B, respectively. The FRs show that the topographic factors were not of great importance for the preparation of the LSM in this study. In contrast, the lithostratigraphic units were more important than tectonics or other geological or environmental conditioning factors. For clarity, the ten highest frequency ratios achieved for each model are shown in Figure 5, from which it is clear that the lithostratigraphic units were the most important variable in both models. Soil suitability also played an important role in landslide occurrence. In the "Łososina" model, precipitation and tectonics were also important landslide-controlling factors.

Landslide Susceptibility Maps
The LSIs were calculated as described in Section 2.5 based on the FRs presented in Appendices A and B and afterwards sorted into five susceptibility classes by the Jenks natural breaks classification. Figure 6a,b present the LSIs calculated for the models based on "Łososina" and "Gródek", respectively. Similarly, Figure 6c,d present the susceptibility classes (SC) for the models based on "Łososina" and "Gródek", respectively. The results from the "Łososina" model show that the very low, low, moderate, high and very high landslide susceptibility zones of the LSM cover 5%, 10%, 28%, 31% and 26% of the investigated study area, respectively (Figure 7a). The "Gródek" model results show that the very low, low, moderate, high and very high landslide susceptibility zones of the LSM cover 7%, 20%, 35%, 28% and 10% of the investigated study area, respectively (Figure 7b). From these values, comparable proportions can be observed for the high susceptibility class, whereas relatively large differences can be observed within the low and very high susceptibility classes.

Seed Cell Area Index
We evaluated the resulting susceptibility models based on the SCAI; however, our evaluation was performed separately for the split areas ("Łososina" and "Gródek"). This allowed for a better investigation of possible changes (the modeling and validation splitting strategy is described in Section 2.5). If the values of the SCAI decrease from very low to very high classes of LS, the model is regarded as excellent [13]. Table 4 presents the percentage of each susceptibility class, the percentage of landslides observed in each class and the SCAI for both models. For the LSM based on "Łososina" and evaluated in its modeling area, the SCAI was extremely high (2491) and extremely low (0.42) for the "very low" and "very high" susceptibility zones, respectively. These are exceptionally good results. A somewhat lower performance can be observed for the LSM based on "Gródek" and evaluated in the "Gródek" area; however, even its results were largely correct. Moreover, in the validation areas, the SCAI for the "very low" class in both cases ("Łososina" and "Gródek" models) was higher than 50, and it was below 0.5 for the "very high" susceptibility class, which is also considered appropriate.

Index of Relative Landslide Density
The values of the relative landslide density index (R_ind) calculated for the high and very high susceptibility classes are presented in Table 5. The R_ind calculated for the area used for modeling is very high for both models (90% and 86%). For the validation areas, located on the other side of the river, the R_ind values are significantly lower (57% and 46%). This indicates that the accuracy of the landslide susceptibility assessment changes between the modeling and validation areas. Therefore, to better investigate model performance, R_ind should be taken into account alongside the SCAI.
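As a rough illustration of how such an index can be computed, the sketch below assumes the widely used density-normalized form, R_i = 100 · (n_i / N_i) / Σ (n_j / N_j), where n_i is the number of landslides in class i and N_i is the area of that class. The definition applied in this study is given in its methods section and may differ; the counts and areas below are hypothetical.

```python
def relative_density_index(landslide_counts, class_areas):
    """Relative landslide density per class, in percent (assumed definition).

    R_i = 100 * (n_i / N_i) / sum_j (n_j / N_j)
    """
    densities = [n / a for n, a in zip(landslide_counts, class_areas)]
    total = sum(densities)
    return [100.0 * d / total for d in densities]


# Hypothetical values, ordered from "very low" to "very high" susceptibility.
counts = [2, 8, 30, 60, 100]           # landslides per class
areas = [40.0, 30.0, 15.0, 10.0, 5.0]  # class area, arbitrary units

r_ind = relative_density_index(counts, areas)
high_and_very_high = r_ind[3] + r_ind[4]
print([round(r, 1) for r in r_ind])
print("R_ind, high + very high:", round(high_and_very_high, 1))
```

Summing the values of the two highest classes gives a single figure of the kind reported for the high and very high zones.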

Map/Model Comparison
To better investigate the similarity of the models, we calculated the differences in the LSI and in the susceptibility zones between them (Figure 8). The residual maps present these differences. Five levels of difference in the susceptibility classes can be distinguished (a difference of 0-4 zones), as presented in Table 6. Green represents no difference in the susceptibility zones between the two models, while light red and light blue represent a difference of one susceptibility zone (1, −1). According to Table 6, almost half of the map (47%) shows no difference in landslide susceptibility zones, and 42% differs by one susceptibility zone. Comparing the left and right images, it can be seen that negative and positive values occur on the two sides of the study area, respectively (Figure 8). This is because the "Łososina" model was adjusted to the "Łososina" area and the "Gródek" model was adjusted to the "Gródek" area.
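The zone-difference summary can be illustrated with a small sketch. It assumes that both maps are stored as equally sized grids of class codes (1 = very low to 5 = very high) and that the difference is taken as the first model minus the second; both the sign convention and the tiny grids below are hypothetical.

```python
from collections import Counter


def zone_difference_shares(zones_a, zones_b):
    """Percentage of cells for each class difference (zones_a - zones_b)."""
    diffs = [
        a - b
        for row_a, row_b in zip(zones_a, zones_b)
        for a, b in zip(row_a, row_b)
    ]
    counts = Counter(diffs)
    n_cells = len(diffs)
    return {d: 100.0 * c / n_cells for d, c in sorted(counts.items())}


# Hypothetical 3 x 4 susceptibility grids from two models.
model_a = [[1, 2, 3, 4], [2, 3, 4, 5], [1, 1, 3, 5]]
model_b = [[1, 3, 3, 2], [2, 2, 4, 5], [2, 1, 5, 5]]

shares = zone_difference_shares(model_a, model_b)
for diff, pct in shares.items():
    print(f"difference {diff:+d}: {pct:.1f}% of cells")
```

The share at difference 0 plays the role of the "no difference" entry in a table such as Table 6, and the shares at ±1 together give the one-zone difference.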

The correlation index calculated for both susceptibility maps is 0.697. This value is quite similar to the total average landslide density in the high and very high susceptibility zones for the entire study area (compare with Table 5). Therefore, the overall accuracy of both susceptibility maps for the entire investigated region can be estimated at 70%. However, the accuracy changes from region to region. In the area where the landslides used for modeling are located, 90% accuracy can be expected, while in the area where the landslides used for modeling are not present (validation area), an LSM accuracy between 46% and 57% can be expected.

Discussion
Based on a comparison of the SCAI values, the general performance (in terms of accuracy) of both models can be described as good. For the validation study area, the SCAI was higher than 50 and lower than 0.5 for the very low and very high susceptibility classes, respectively. This shows higher performance than that described in other studies [13,14,55]. The R_ind calculated for the validation area shows that 46% and 57% of landslides fall into the high and very high susceptibility zones in the "Gródek" and "Łososina" models, respectively. This implies that, besides the SCAI, the R_ind should also be used to evaluate the LSM. The explanation for the low value of R_ind for the high and very high classes is that 37% and 45% of the landslides are located in the moderate class in the "Łososina" and "Gródek" models, respectively. This indicates that another classification method (e.g., quantiles, IsoData) should be considered to differentiate the susceptibility zones effectively. This issue has been discussed in [60].
When comparing these two models, the SCAI values for the "Łososina" model are more satisfactory. This is mostly due to the high SCAI values for the very low and low susceptibility zones and the very low values for the high and very high susceptibility zones. Moreover, the relative landslide density (R_ind) is higher for the "Łososina" model (72%). Therefore, this model is more reliable than the "Gródek" model (R_ind = 68%). This is surprising because the "Gródek" model was expected to predict landslide susceptibility better, given the greater share of landslides used for modeling (54%) compared with the "Łososina" model (46%). Nevertheless, this indicates that a greater number of landslides used for modeling does not necessarily go hand in hand with improved landslide susceptibility assessment.
As mentioned in the Introduction, researchers commonly use a randomly selected 70% of the landslides for modeling and 30% for validation. With this selection strategy, the landslides used for modeling are distributed across the study area. In contrast to our sampling strategy (splitting the area into modeling and validation regions), landslides distributed evenly across the study area can better "capture" a variety of topographical, geological and other environmental conditions (LCFs). When the study area is split into modeling and validation areas, it is possible for some landslide-prone geological units to be absent from the area used for modeling. Additionally, by comparing Figure 8 with the LCFs, it can be observed that differences of up to two or three susceptibility zones occur in areas close to faults and thrusts (intensive red and blue color). Differences of two classes can also be observed where the lithostratigraphic units change (intensive red and blue color). This additionally confirms our notion that the landslide sampling strategy had a crucial effect on the accuracy of the LSM.
To better investigate the difference between these two models, a correlation matrix between the difference in the LSI (Figure 8a) and each LCF was calculated and is presented in Figure 9. Based on this, it can be concluded that the difference between the models corresponds strongly to fault, lake and thrust proximity. It can therefore be deduced that the landslides used for modeling did not capture the variability of the geological (thrust and fault proximity) and environmental (lake proximity) conditions. Landslides that are better distributed across the entire study area can better describe the variety of geological and environmental conditions (e.g., thrust, fault and lake proximity) and therefore better predict landslide susceptibility.
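A correlation screening of this kind can be sketched as follows. The sketch assumes a Pearson correlation computed over co-located cell values; the type of correlation behind Figure 9 is not stated in this section, so this choice is an assumption. The factor names and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical cell values: LSI difference between two models and two LCFs.
lsi_difference = [0.8, 0.5, 0.1, -0.2, -0.6, -0.9]
factors = {
    "fault_proximity": [100, 250, 400, 600, 800, 950],
    "slope": [12, 30, 8, 25, 15, 20],
}

correlations = {name: pearson(lsi_difference, vals) for name, vals in factors.items()}
# Rank factors by the absolute strength of their correlation with the difference.
ranked = sorted(correlations, key=lambda name: abs(correlations[name]), reverse=True)
for name in ranked:
    print(f"{name}: r = {correlations[name]:+.3f}")
```

Factors at the top of such a ranking are the ones along which the two models disagree most systematically.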
This outcome answers our research question of whether LSM can be performed in an area for which a landslide inventory is not available but where such an inventory is available for the surrounding region. Based on this study, we can infer that the assessment of landslide susceptibility based on surrounding landslide inventories cannot be effectively performed in areas where the geological and environmental conditions change. Generally, based on the results in Table 6, there is a 47% chance of obtaining the correct susceptibility zone and a 42% chance of an error of one susceptibility zone. Therefore, we can generally perform landslide susceptibility modeling based on the landslide inventory for a neighboring region. However, it should be reiterated that there is a 10% chance of our susceptibility map differing from the target landslide susceptibility by two zones in some areas. These are generally located in areas with significant changes in the geological or environmental conditions. In this case, when only the very low susceptibility classes are considered, the susceptibility map can be used for decision making and planning.
Another aspect that should be discussed here is the variability of the landslide types used for LSM. Since information about the specific landslide type (shallow/deep-seated or planar/rotational) is not available within the national inventory, all landslides were used for LSM. Many researchers do not differentiate between landslide types in LSM [3,13,14,22,55,61,62]. However, shallow and deep-seated landslides differ in terms of damage influence, size and volume [64]. Deep-seated landslides generally result from the relationship between natural denudation processes and long-term rainfall, whereas shallow landslides are related to short, high-intensity rainfall [64]. Therefore, it is currently unknown whether the high correlation between the LSM difference (Figure 8a) and the geological factors (fault and thrust proximity) corresponds to specific characteristics and mechanisms of deep-seated landslides, which are known for their strong connection with geology. Thus, in future work it is worth considering the differentiation of landslide types (shallow/deep-seated) in susceptibility modeling and its relation to the landslide-conditioning factors.

Conclusions
The fundamental source for LSM is a landslide inventory. Unfortunately, there are still areas for which landslide inventories are not generated due to financial or reachability constraints. Thus, this study evaluated whether landslide susceptibility could be effectively assessed in such areas when a landslide inventory is available for an adjacent region. Having considered the results from the two landslide susceptibility maps generated by cross-modeling, we can assert that susceptibility zones generated based on an inventory located in an adjacent region will differ for 53% of the analyzed area when compared with a susceptibility map generated using a landslide inventory for the analyzed region. Conversely, it can be estimated that 47% of the map generated based on an inventory for a surrounding area will be similar to a map generated in the traditional manner (using a landslide inventory for the analyzed region).
Based on a comparison of the accuracy measures for the training and validation areas, we can conclude that the cross-modeling of landslide susceptibility has a great influence on the accuracy of landslide susceptibility determination. This is directly connected with the sampling strategy (landslide selection for susceptibility modeling). In this study, we applied a sampling strategy completely different from that widely presented in the literature [13,14,20,55]. Namely, we performed cross-modeling based on landslides located in a cluster (the Łososina/Gródek region) defined by a specific natural boundary (Rożnów Lake and the Dunajec River). In the LSM literature, the landslide events used for modeling are usually randomly and evenly distributed across the study area. This allows the variety of geological conditions and the susceptibility to landslides to be captured. Therefore, in a perfect scenario, it is desirable to use landslides distributed evenly across the study area and located in various geological settings. This will introduce the effect of autocorrelation and allow the effective assessment of landslide susceptibility.
Nevertheless, when such a landslide sampling strategy is not possible due to the lack of a landslide inventory, landslide susceptibility can be assessed based on a landslide inventory available for an adjacent region. However, discrepancies between these two models exist in the areas of critical geological structures (thrust and fault proximity), and the reliability of the susceptibility zones there can be low (a difference of 2-3 susceptibility zones). Nonetheless, such areas cover only 11% of the analyzed area; thus, we can conclude that in the remaining regions (89%), an LSM generated from the inventory for the surrounding area can be used. Still, the low reliability of such a map in areas of critical geological structures should be borne in mind. This applies to LSM in other regions where the geology can be much more complex.
As such, the discrepancies among the LSMs generated with the strategy presented in this study are particularly relevant given that the overestimation or underestimation of susceptibility can have crucial effects on land-use management and civil protection planning. The optimal map should be able to depict most potential landslide events and, at the same time, be effective and accurate for preventing failures in the study area. Therefore, a specific landslide inventory is certainly needed to estimate landslide susceptibility more reliably in complex geological areas.

Acknowledgments:
The authors are very grateful to Tomasz Wojciechowski from Polish National Geological Institute for providing geological data and documentation from the study area.

Conflicts of Interest:
The authors declare no conflict of interest.