Gis-based Landslide Susceptibility Mapping on the Peloponnese Peninsula, Greece

In this paper, bivariate statistical analysis modeling was applied and validated to derive a landslide susceptibility map of Peloponnese (Greece) at a regional scale. For this purpose, landslide-conditioning factors such as elevation, slope, aspect, lithology, land cover, mean annual precipitation (MAP) and peak ground acceleration (PGA), and a landslide inventory were analyzed within a GIS environment. A landslide dataset was realized using two main landslide inventories. The landslide statistical index method (LSI) produced a susceptibility map of the study area and the probability level of landslide occurrence was classified in five categories according to the best classification method from three different methods tested. Model performance was checked by an independent validation set of landslide events. The accuracy of the final result was evaluated by receiver operating characteristics (ROC) analysis. The prediction ability was found to be 75.2% indicating an acceptable susceptibility map obtained from the GIS-based bivariate statistical model.


Introduction
Landslides are one of the major types of geo-hazards [1] as almost 9% of global natural disasters refer to landslides [2].Despite advances in science and technology, these events continue to result in OPEN ACCESS economic, human, and environmental losses worldwide.Landslide susceptibility (LS) is the propensity of soil or rock to produce various types of landslides [3,4].LS is usually expressed through cartographic means.A LS map presents areas with the potential for landsliding in the future by combining some of the critical factors that contributed to the occurrence of past landslides [5].Such a map is a valuable tool for assessing current and potential risks that can be used for developing early warning systems and mitigation plans, such as selecting the most suitable locations for construction of structures and roads.
Elevation, slope, aspect, lithology, land cover, mean annual precipitation (MAP) and peak ground acceleration (PGA) were selected as landslide-conditioning factors in our study.Among all parameters for LS zonation, elevation, slope and aspect have been recognized as the most important conditioning factors [6][7][8][9].The elevation dataset is useful to classify the local relief and locate points of maximum and minimum heights within terrains.Generally, it is well justified through the literature [10,11] that slope gradients have a large impact on landsliding in Peloponnese.The aspect parameter is related to differential weathering, exposure to sunlight and drying winds and soil moisture [7].Lithology also plays a key role in landslide activity since different lithologic units have different landslide susceptibility values [12].Moreover, slope stability is also influenced by land cover.Finally, both seismicity and precipitation factors [13,14] have been used as conditioning factors in many LS zonation studies.It is, therefore, imperative to incorporate these parameters, while carrying out LS analysis in seismically active and frequently rainfall-influenced regions.
From the beginning of the 1970s, the interest of both geoscientists and engineering professionals in LS zonation and the increasing emphasis on the use of Geographic Information Systems (GIS) technology allowed for the development of many methods.According to general overviews, presented in the work of [15][16][17][18], these methods can be divided into two groups: qualitative and quantitative methods.
The qualitative methods depend on the knowledge and previous experience of the experts, and include geomorphologic analysis [19] and the use of index or parameter maps [20].The quantitative methods depend on numerical expressions of the relationships between conditioning factors and landslide occurrence.They include geotechnical engineering approaches [21,22], statistical analysis, the artificial neural network (ANN) and neuro-fuzzy logic methods [23,24].Some qualitative approaches, however, incorporate the idea of ranking and weighting the parameters involved, and therefore can be considered as semi-quantitative in nature [25].
In statistical analysis methods, the combination of parameters that have led to landslides in the past are determined statistically, and quantitative predictions are made for areas not affected by landslides but where similar background conditions exist [13].Several researchers have applied these methods, which are either bivariate [26,27] or multivariate analyses [28,29].In bivariate analysis, each individual landslide-influencing factor is combined with a landslide inventory map, and weight values based on landslide densities are calculated for its corresponding classes.A well-known and widely used bivariate analysis method for LS zonation is frequency ratio calculation, often referred as the landslide susceptibility index (LSI) [26,27,[30][31][32][33][34].Tien Bui et al. [35] evaluated and compared the results of applying the LSI and the (multivariate) logistic regression methods for estimating landslide susceptibility in the Hoa Binh province of Vietnam.Conforti et al. [36] applied the LSI method for drafting a landslide susceptibility map in the Vitravo River catchment (southern Italy).In addition, Polykretis et al. [37] compared the performance of a conventional statistical method like LSI and a soft computing method like ANNs in the Krathis and Krios drainage basins (northern Peloponnese).These three studies indicated a good prediction capability (from 73% up to 94%) for the LSI method.
The main aim of this paper was to produce a landslide susceptibility map at regional scale (1:500,000) using a quantitative method.The performance of this model was evaluated in Peloponnese peninsula, Greece.Furthermore, we implemented validation analysis to estimate the prediction ability of the proposed model.

Study Area
The Peloponnese peninsula is located in the southern part of Greece (Figure 1), and is connected with the mainland through the Isthmus of Corinth.The total area of Peloponnese is 21,439 km 2 , and its population stands at 1,086,935 inhabitants (Hellenic Statistical Agency, 2001).Agricultural, forest and semi-natural areas cover the main part of Peloponnese, whereas urban is the dominant land cover in the coastal zone of the peninsula.The climate is typical Mediterranean with a hot and relatively dry summer between June and August, and a wet season during autumn, winter and spring [38].
Peloponnese has a complex geomorphology, with mountainous inland, many coastal cliffs in the south, and basins, coastal beaches, lakes and inland basins at the western and southwestern coasts.The slopes vary from gentle to very steep, while the drainage network is well developed, and is highly controlled by fault tectonics.The study area belongs to an active tectonic zone manifested by faults, thrust zones and folds.The main rock units in the study area are (a) carbonate rocks (44%) and (b) Neogene sediments (22%).In these two types of rock units, the majority of landslides have occurred.Peloponnese is a region highly damaged by the occurrence of severe natural disasters such as earthquakes, floods, landslides, and forest fires.Heavy rainfall and earthquakes have triggered several landslide events, mainly in northern and western areas of the peninsula [10,39,40].Many serious events are related to major faults and to unstable zones located in steep slopes.In addition, the human interventions for the construction of roads have also played a key role in landslide activation.
Accordingly, the study area forms a complex physiographic region where all conditioning factors of landslides present a high spatial variability.Considering these conditions, and the fact that the past is the key to the future, it is evident that slope instability is one of the most severe hazards.

Data
In order to accomplish the LS analyses in the study area, a spatial database was designed and developed, and spatial analysis tools were implemented within GIS environment with the use of ArcGIS (ver.9.3) software package.This database comprises two main parts: (a) the datasets with the background geographic conditions (slope, lithology, land cover, etc.) and (b) the landslide inventory dataset.
A spatial dataset that represents former landslides (presented as point features in the centroid of each pixel) is the most critical information layer in order to implement quantitative statistical analysis for LS assessment.In the present paper, two main landslide inventories were used: (a) an inventory maintained by Institute of Geology and Mineral Exploration (IGME) formed only from the recent historical records, covering a time period 1910 to 1995 [41]; and (b) an inventory developed on the basis of fieldwork and aerial photograph (provided from the Hellenic Military Geographical Service-HMGS, with scale 1:40,000) interpretation.This inventory contains landslide events that occurred from 1995 through 2003.The final landslide dataset consists of 282 landslides (217 and 65 landslides from the first and second inventory, respectively).This dataset (Figure 1) was randomly split into two separate groups: a training dataset (75% of the landslide inventory) and a validation dataset (25% of the landslide inventory).The training dataset was used for the implementation of statistical analysis, whereas the validation dataset was used during the verification of the results produced from the model.
The landslides under investigation have more or less the same characteristics: lateral based and downslope movement of soils or rocks.Seismic triggered landslides, occur in the vicinity of active faults, and usually related to other secondary seismic events, like soil liquefaction, subsidence of the coastal strip, and rock falls [42][43][44].Subaqueous and liquefaction events are not included in this study.Rainfall triggered landslides usually are rapid short moving events, while slow short-moving type also occur including extensive instability zones [45].They occur in gentle natural slopes where the translational type predominates.The occasional planar slip surfaces are located in the weathered zone of marls or flysch while ground water level reaches the surface of the slope during heavy rainfall.The most critical landslide-prone formations regarding lithology, and structure are flysch and neogene sediments, while schist and cherts significantly contribute in landslide phenomena [46].Slides which usually take place in the gentle slope of flysch mantle are typically quite shallow and take form of a sheet of weathered zone sliding on a slip surface parallel to the ground [47].In line with [48] and [49], in this paper the term landslide is used for translational and rotational earth slides, which were recorded in landslide inventories.These events vary consistently in volume, from some thousands of m 3 to several million m 3 [46], and depicting small to extremely large magnitude, according to [50] classification.
In this study, elevation, slope, aspect, PGA, MAP, lithology and land cover have been selected as the conditioning factors on landslide susceptibility.Although there are no standard guidelines for selecting these parameters [16], the nature of the study area, the scale of the analysis, and data availability were taken into account [18].The seven factors used in current research were selected on the basis of the aforementioned criteria while literature outputs and general guidelines for GIS-based studies were also considered [7,13,18,51,52].
The Digital Elevation Model (DEM) was the key to create various topographic parameters related to the landslide activity of the study area.Here, we used the DEM from SRTM (Shuttle Radar Topography Mission) database, with cell size 90 m × 90 m.From this DEM, elevation (0-2367 m), slope angle (0°-54°) and slope aspect layers have been extracted.
The lithological layer (with cell size 250 m × 250 m) of the study area was created from the Geological Map of Greece with scale 1:500,000 (IGME, 1983).The land-cover layer (with cell size 250 m × 250 m) in the region of Peloponnese was based on CORINE program (Coordinate of Information on the Environment).The study area was classified by following the level-1 classification scheme of the CORINE data [53].
Previous studies emphasized the need of incorporating dynamic factors (PGA, MAP) with other "static" factors for LS zonation mapping in areas, whereas these factors are playing an important role not only in the reactivation of old landslides but in the development of new ones [54,55].The precipitation data (339-1655 mm) used in this study refers to the mean annual precipitation during the period from 1950 through 1974 (source: Public Power Company-PPC, with about 30 active meteorological-rainfall stations in the study area, cell size: 250 m × 250 m).MAP is the average of the available long-term records [56].The seismic factor (0.05-3.95 m/s 2 ) layer was produced from the map of expected Peak Ground Acceleration (PGA) with 475-year return period (10% probability of exceedance in 50 years).PGA is the absolute maximum amplitude of recorded acceleration [57].The source of this layer (with cell size 250 m × 250 m) was the Technical Chamber of Greece (TCG, 1992).Landslide distribution is strongly affected by seismicity and especially by ground acceleration, while magnitude-distance relations have been established for earthquake induced landslides [48].

Statistical Method: Landslide Susceptibility Index (LSI)
One of the most common methods used in the statistical analysis of landslides is the frequency ratio method.This method is based on the relationship between the spatial distribution of landslides and each conditioning parameter [7].In the present paper, we estimate this relationship with the calculation of landslide susceptibility index (LSI).This method calculates the LSI for each category of all conditioning factors (e.g., land cover, lithology, slope, aspect, elevation, etc.), which are selected for the case study.Thus, supposing that j is a category within the factor i, then the LSI for this category (LSI i,j ) is defined as follows [13]: N , the number of landslides in category j of the factor i, j i A , the area of this category, T N the total number of landslides and T A the total area under investigation.Thus, the LSI presents the relative susceptibility to landslide occurrence.If a category is highly correlated to landslides, the area associated with this category will have a high positive LSI value.A negative LSI value for a specific category is an indicator of low landslide density in this class.Thus, for a conditioning factor to be useful for landslide susceptibility mapping, its categories should provide a range of LSI values.The overall susceptibility S for each pixel is defined as: where LSI i is the susceptibility for the factor i, and n is the total number of the factors.This analysis involves three main steps: (a) Categorization of all landslide-conditioning factors.In this step, the natural breaks (jenks) categorization in five discrete classes was implemented for the factors with continuous values (elevation, PGA, MAP), except for slope factor whose categorization was executed in a manual way based on its presented values.For the categorical factors (landcover, lithology, and aspect) we preserved all the classes of the nominal scale.(b) Calculation of the landslide density in each class and of the area for each category using GIS-based overlay functions.In this stage, the calculation of LSI for all factor categories based on Equation (1) was implemented (Table 1).(c) Finally, the integrated landslide susceptibility map (Figure 2) was created by combining the LSI values of multiple factors by means of GIS overlay analysis, and the Equations ( 1) and ( 2).We classified this map into five discrete categories: "Very Low", "Low", "Moderate", "High" and "Very High" landslide susceptibility according to the following classification methods: (a) equal interval; (b) standard deviation and (c) natural breaks (jenks).
The standard deviation (σ n ), the range (range n ) and the maximum value (max n ) of LSI for all factors were also calculated (Table 2) in order to interpret the importance of each factor [13].
The output LS map (Figure 2) from the LSI model shows that 25% (5319 km 2 ) and 16% (3391 km 2 ) of the study area were classified as "High" and "Very High" susceptibility zones, respectively.The same map also shows that the zones of "High" and "Very High" susceptibility are located around the northern (coastal) and western part of the study area.Finally, the overlay of the final LS map with the landslide training dataset indicated that 48%, 32% and 15% (total: 95%) of the landslide events fall within "Very High", "High" and "Moderate" landslide susceptibility zones (in total 65% of the study area), respectively.It is notable that according to the used model only 5% of the landslide test set falls in "Low" susceptibility zone, with no landslides into the "Very Low" susceptibly zone.
We have also proceeded with a standard validation analysis using data from landslide events, not included in the initial spatial database (validation data set), in order to estimate the overall performance of the LS model in the study area.For the validation of the output from our analysis, the receiver operating characteristics (ROC) curve was drawn, and the area under curve (AUC) value was calculated for the proposed model.In practice, the AUC performs very well and is often used when a general measure of predictiveness is desired [58].ROC analysis is considered as a powerful method for the validation of landslide susceptibility models [59,60].The AUC value ranges from 0.5 to 1.0.The ideal model yields an AUC value close to 1.0 (perfect fit), whereas a value close to 0.5 indicates an inaccurate model (random fit).
As was aforementioned, the landslide susceptibility map was classified into five discrete categories according to equal interval, standard deviation and natural breaks (jenks) classification methods.In order to evaluate these classification methods, the results from ROC analysis for the three methods were computed and compared (Table 3).From this process, the natural breaks classification was shown to be the most efficient classification method, as it presented good overall results for landslide and non-landslide cases (high values of sensitivity and specificity, and a satisfactory AUC value).So the final map was classified based on this classification method (Figure 2).
Figure 2 shows the ROC curve of LSI model for the validation dataset.The AUC value of 0.752 indicates a reasonable prediction ability of the model.

Discussion and Conclusions
This study was aimed at assessing the landslide susceptibility at a regional scale (1:500,000) on the Peloponnese peninsula, Greece.By applying a bivariate statistical analysis, implemented in a GIS environment, the relationships between landslide events and geo-environmental factors were assessed and shown on the susceptibility map.Despite the model, the choice of these factors plays a major role in the relative accuracy of the outcomes.Limited emphasis has been directed towards selection.The literature indicates that the most common conditioning factors are; lithological units, tectonic features, slope angle, proximity to (road or drainage) networks, land cover and rainfall distribution [3,61].All of these factors are considered to be related to landslide occurrence.However, there are additional factors that may be arguably as influential.
In this study, seven geo-environmental factors (elevation, slope, aspect, PGA, MAP, lithology and land cover) were selected.All the factor layers were harmonized to cell size 250 m × 250 m, apart from elevation layer which was harmonized to cell size 90 m × 90 m.
Statistical methods are generally considered the most appropriate method for LS mapping at regional scales because they are objective, reproducible and easily updatable [13].The LSI model is a data-driven bivariate statistical approach in which each parameter is analyzed individually and the calculation and application are easy and fast [62].The bivariate statistical technique is one of the most preferable models at this scale [63].However, many researchers argued that multivariate models-although more complex-are superior to bivariate methods as they predict the landslide susceptibility in areas with limited training data available better [64].In data-driven LS modeling, all the problems of landslide inventory maps will automatically be inherited to the final susceptibility maps [16].In general, a reliable, accurate and complete landslide inventory map will provide high-quality statistical models [35].In many countries-including Greece-this kind of extended landslide inventory is not available.
The validation with the use of the empiric ROC area for the LSI model was estimated to be 0.752 for the validation dataset (Figure 3).Thus, there is 75.2% agreement between prepared LS map and landslide locations of the validation dataset, which is a reasonable result, taking into consideration the scale of analysis.Recently, LS analyses in the international literature have used ROC analysis, not only to validate the landslide susceptibility mapping models, but also to compare their prediction capabilities.Bivariate statistical analysis is a common method at this scale [64][65][66], with fair to satisfactory results (AUC values from 0.59 up to 0.76).Several researchers have also compared the LSI model with expert-based qualitative models using different data sets, and finding the LSI model to be superior [26,27,67].
In addition, a previous study [25] in the same area with the use of a semi-quantitative (expert-based fuzzy weighting-EFW) model produced rather worse results (AUC value: 0.70).Comparing the final LS map of this study with this one of [25], it is derived that in both maps 9% (1917 km 2 ), 7% (1566 km 2 ) and 1% (286 km 2 ) of the study area belong to "Moderate", "High" and "Very High", respectively, susceptibility zones.On the other hand, the differences between two LS maps for these three susceptibility zones are presented to cover 26% (5518 km 2 ) of the study area.The differences for the remaining susceptibility zones ("Very Low" and "Low") cover a restricted part of the study area (6% or 1335 km 2 ).
The implementation of the bivariate statistical model in the study area revealed that the high susceptibility areas are mainly concentrated in the western Peloponnese (Figure 2).Additionally, a linear pocket of high susceptibility values is located in the northern coastal zone of the study area.According to the final LS map, the "Very High" susceptibility zone covers a significant part of the study area (16% of the total area).
The most important factors for the LS zonation in the study area are land cover, elevation and MAP.It seems that the incorporation of dynamic factors (precipitation, seismicity) in this regional analysis was more or less beneficial to the assessment of landslide susceptibility.Most of the landslide events are detected in "agricultural areas", in areas composed of "neogene", in areas with elevation lower than 234 m, slope angle lower than 10°, north or west facing, moderate levels of annual precipitation (885-1079) and high seismic acceleration (2.54-3.11m/s 2 ).
The findings of our analysis are acceptable for this scale and better than other semi-quantitative approaches [25].Thus, in a future work, we intend to combine this method with expert-based modeling in a "hybrid approach".Some limitations and assumptions of the applied method have to be pointed out.First of all, as the final result is given in a medium-scale map, the determination of the exact extent of the slope instability areas demands further site-specific research.At large scales, more exhaustive datasets and detailed geotechnical information are required.A second limitation is related to the landslide inventory dataset.At a regional scale, this dataset does not include the total amount of the landslide events within the study area.Furthermore, another limitation of this study is that the developed method does not take into account the mutual relationships between conditioning factors.Finally, it should be mentioned that, according to our analysis, the output LS map presents only the predicted spatial distribution of landsliding and not its temporal probability.Therefore, the result of this study should be used in the preliminary hazard mapping and guide quantitative risk analysis at a detailed scale.
Despite these limitations, the produced LS map could be very useful to community and local officials for choosing suitable locations for future land-use planning and implementation of developments.
Additionally, planners and developers could use this map to identify roads and settlements subject to damage by future landslides, and take drastic measures for preventing the landslide events.

Figure 1 .
Figure 1.The location of the study area (Peloponnese peninsula) and the landslide inventory.

Figure 2 .
Figure 2. (a) The landslide susceptibility map produced by the LSI model and (b) receiver operating characteristics (ROC) curve for the LSI model.

Table 1 .
Categories, landslide densities and susceptibility values (LSI) for landslide related factors.

Table 2 .
Landslide susceptibility index (LSI) properties of the conditioning factors.
min : Minimum value of LSI, LSI max : Maximum value of LSI, LSI range : Range of LSI values, LSI st.dev : Standard deviation of LSI values.Asterisk indicates very limited area cover of this class.

Table 3 .
ROC analysis results for the classification methods.