Remote Sens. 2019, 11(23), 2866; https://doi.org/10.3390/rs11232866

Article
A Novel Ensemble Approach for Landslide Susceptibility Mapping (LSM) in Darjeeling and Kalimpong Districts, West Bengal, India
1 Department of Geography, University of GourBanga, West Bengal 732103, India
2 Department of Geomorphology, Tarbiat Modares University, Tehran 14115-111, Iran
3 Department of Geoinformatics – Z_GIS, University of Salzburg, 5020 Salzburg, Austria
4 Institute of Research and Development, Duy Tan University, Da Nang 550000, Vietnam
* Authors to whom correspondence should be addressed.
Received: 12 October 2019 / Accepted: 15 November 2019 / Published: 2 December 2019

Abstract

Landslides are among the most harmful natural hazards for human beings. This study aims to delineate landslide hazard zones in the Darjeeling and Kalimpong districts of West Bengal, India, using a novel ensemble approach that combines the weight-of-evidence (WofE) and support vector machine (SVM) techniques with remote sensing (RS) datasets and geographic information systems (GIS). The study area currently faces severe landslide problems, which cause fatalities and losses of property. The landslide inventory database was prepared using Google Earth imagery and a field investigation carried out with a global positioning system (GPS). Of the 326 landslides in the inventory, 228 (70%) were used for modeling and 98 (30%) were used for validation. Elevation, rainfall, slope, aspect, geomorphology, geology, soil texture, land use/land cover (LULC), normalized difference vegetation index (NDVI), topographic wetness index (TWI), sediment transportation index (STI), stream power index (SPI), and seismic zone maps were used as independent variables in the modeling process. The WofE and SVM techniques were ensembled to prepare landslide susceptibility maps (LSMs), which were then classified into four classes (low, medium, high, and very high susceptibility to landslide occurrence) using the natural breaks classification method in the GIS environment. The very high susceptibility zones produced by the ensemble models cover 630 km² (WofE & RBF-SVM), 474 km² (WofE & Linear-SVM), 501 km² (WofE & Polynomial-SVM), and 498 km² (WofE & Sigmoid-SVM) of a total area of 3914 km². The results were validated using the receiver operating characteristic (ROC) curve and quality sum (Qs) methods. The area under the curve (AUC) values of the WofE & RBF-SVM, WofE & Linear-SVM, WofE & Polynomial-SVM, and WofE & Sigmoid-SVM models are 87%, 90%, 88%, and 85%, respectively, which indicates that they are very good models for identifying landslide hazard zones. According to both validation methods, the WofE & Linear-SVM model is more accurate than the other ensemble models. The results of this study can provide useful information to decision-makers and policy planners in the landslide-prone areas of these districts.
Keywords:
landslide; machine learning models; remote sensing; ensemble models; validation

1. Introduction

Landslides are a common natural disaster that threatens mountainous regions. Like hurricanes, floods, droughts, earthquakes, soil erosion, and tsunamis, landslides are major environmental disasters that damage and destroy residential areas, roads, agricultural fields, gardens, and grasslands. Spatial prediction of landslide-prone areas can play an important role in disaster management and can serve as a standard tool for decision-making in different areas [1]. In geological engineering, a landslide is defined as the downward movement of a mass of material on a slope [2]. Worldwide, mountainous areas are profoundly affected by landslides due to the instability of slopes and masses [3]. For example, Indian Himalayan mountain regions such as Jammu & Kashmir, Himachal Himalaya, Kumayun, Darjeeling, Sikkim, and the north-eastern hilly regions are severely affected by landslides [4]. In the Darjeeling Himalayan region, landslides have a severe impact on the environment and on socio-economic development. Every year, the Darjeeling and Kalimpong districts are hit by landslides triggered by heavy monsoon rainfall and seismic activity [5]. During July–August 1993, May 2009, and September 2011, the Darjeeling and Kalimpong districts were severely affected by extreme landslides [6]. Furthermore, major towns in these districts, such as Darjeeling, Mirik, Kurseong, and Kalimpong, were hit by landslides during June–July 2015 due to heavy rainfall, causing fatalities and damage to property. It is therefore necessary to address the landslide risk faced by this region to reduce the impact of this environmental disaster. Information regarding the magnitude, character, and probability of landslides can be used to reduce the impact of landslide hazards and to support sustainable environmental development and future planning [7]. 
Therefore, landslide susceptibility zoning is an important step for sustainable land management, not only for this particular region but also for other mountainous regions all over the world. The chance of landslide occurrence depends on various conditioning factors rather than a single factor. Two things are important for preparing a landslide susceptibility map: first, the landslide inventory data, which is considered the dependent variable, and second, the landslide conditioning factors, which are considered the independent variables. In this study, the landslide inventory map was prepared using data collected from Google Earth imagery, a global positioning system (GPS), and extensive field investigations. The landslide conditioning factors or environmental factors, such as slope, aspect, altitude, curvature, geology, soil, land use, normalized difference vegetation index (NDVI), distance from drainage, distance from fault, distance from road, topographic wetness index (TWI), and stream power index (SPI), were selected based on the findings of previous literature (Yilmaz [8], Abedini et al. [9], Regmi et al. [10], Chawla et al. [11], Shahabi and Hashim [12], Roy and Saha [13], Pradhan [14], Pourghasemi et al. [15], Pham et al. [16], and Goetz et al. [17]). The landslide inventory data and the aforementioned landslide conditioning factors were used to prepare the LSMs with the help of remote sensing (RS) data and geographical information systems (GIS). Nowadays, most researchers argue that machine learning algorithms combined with remote sensing and geographical information systems are reliable and appropriate methods for assessing landslide hazards. During recent decades, many studies on landslide susceptibility mapping have been conducted in various parts of the world. 
Researchers have applied different approaches to produce landslide susceptibility maps, such as statistical models, probabilistic models, knowledge-driven models, and machine learning models using geographical information systems and remote sensing techniques like the analytical hierarchy process (AHP) and bivariate statistics [9,18], logistic regression (LR), artificial neural networks (ANN), frequency ratio (FR), naive bayes classifier, auto logistic modeling, static methods, multivariate adaptive regression, two-class kernel logistic regression, SVM, artificial neural network kernel, logistic regression and logistic tree, random forest, and decision tree methods [19]. Ensemble techniques have been shown to achieve better results than a single method. In this article, WofE was ensembled with four kernels (radial basis function (RBF), linear kernel, polynomial kernel, and sigmoid kernel) of SVM to predict probable landslide hazard areas and for a comparison of the results. The Darjeeling and Kalimpong districts are parts of the eastern Himalayan region in India. Both districts are mostly covered in hilly terrain. Every year, these districts are affected by landslides, which cause destruction to the roads, residential areas, tea gardens, and forests, leading to numerous fatalities. Therefore, these regions were selected as the study area to raise awareness among the public and government so necessary steps can be taken to mitigate the landslide hazard.

2. Materials and Methods

2.1. Study Area

The Darjeeling and Kalimpong districts are situated in the eastern Himalaya region of India and are mainly covered in hilly and rugged mountainous terrain. Together, these districts cover an area of 3914 km². The research site lies between 26°27′N and 27°13′N latitude and between 87°59′E and 88°53′E longitude (Figure 1). The altitude of the study area ranges from 15 m to 3602 m above mean sea level. Climatically, the region is influenced by the south-west and north-east Indian monsoon. The summer season is very wet, and the winter season is dry and cold. The temperature of this region can drop close to zero degrees. According to the India Meteorological Department, the rainfall of this region ranges from 1877 mm to 2333 mm. Geologically, the region is composed of Precambrian (Darjeeling gneiss, Daling series), Permian (Damuda series), Miocene (Siwaliks), and recent Pleistocene (Alluvium) lithologies, as shown in Table 1 [20]. The Gorubathan and Rangamati surfaces are tectonic landscapes of these regions. The majority of the study area is covered in Triassic rock. Regarding its geomorphology, the research site is composed of active flood plain, alluvial plain, folded ridge, highly dissected hill slope, intermontane valley, and piedmont fan plain [11]. Pedologically, the region is characterized by various soil texture classes; namely, gravelly-loamy, fine-loamy to coarse-loamy, gravelly-loamy to loamy-skeletal, and gravelly-loamy to coarse-loamy [21]. Several rivers (the Mahananda, Tista, Mechi, Balason, Jaldhaka, Rammam, and Rangit) originate in the mountainous areas and flow across these districts. The Darjeeling and Kalimpong districts are famous destinations for national and international tourism. Attractive tourist places in these regions include Tiger Hill, Rock Garden, Mahakal Temple, Dhirdham Temple, Batasia Loop, Ghoom Monastery, and Happy Valley Tea Garden. The major economic activities of these regions are tea plantation, horticulture, and tourism. 
The tea of this region is famous worldwide. Siliguri, Darjeeling, and Kalimpong are the major towns and headquarters within our study area. The total population comprises 1,846,823 people, of which 50.75% are male and 49.25% are female. The population density of the research site is 586 people/km², which is higher than the mean Indian population density [22]. Between 2001 and 2011, the lengths of the national highways, state highways, and other main district highways increased from 100 to 111 km, 80 to 191 km, and 37 to 79 km, respectively. Different cultural communities are present in the study area, such as Nepali, Lepcha, Bhutia, and Rai.

2.2. Methodology

The methodology of the present study is depicted in Figure 2. The flowchart is divided into four main steps, as follows. Step 1: Data used: here, the landslide inventory map (LIM) and landslide conditioning factors (LCFs) data layers were prepared. Step 2: Multicollinearity analysis of landslide conditioning factors was carried out. Step 3: New ensembles of weight-of-evidence (WofE) and SVM models were applied to prepare the landslide susceptibility maps (LSMs). Step 4: LSMs were validated using the receiver operating characteristics (ROC) and quality sum (Qs) methods to measure the capability of the models and identify the best suitable model.

2.3. Data Preparation

2.3.1. Landslide Inventory Dataset

It is vital to analyze the landslide distribution and the landslide conditioning factors to determine which areas are most at risk of landslide occurrence. The landslide inventory map (LIM) is an important part of the evaluation and assessment of landslide hazards and risks. Several researchers have used landslide inventory datasets for landslide susceptibility mapping [8,9,10,11,12,13,14,15,16,17,23]. In the present study, a total of 326 landslides were identified through extensive field investigations using a global positioning system (GPS) and Google Earth imagery. Of these 326 landslides, 228 (70%) were chosen randomly for landslide modeling, and the remaining 98 (30%) were used to validate the prepared landslide susceptibility maps. The landslide inventory map (LIM) was prepared in a GIS environment and is shown in Figure 1. Field photographs of some landslides in the study area are shown in Figure 3.
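The random 70/30 partition of the inventory described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used in the study: the function name `split_inventory`, the integer landslide IDs, and the fixed random seed are all assumptions introduced here.

```python
import random

def split_inventory(landslide_ids, train_fraction=0.7, seed=42):
    """Randomly split landslide IDs into modeling and validation subsets.

    The seed is a hypothetical choice, used only to make the split repeatable.
    """
    ids = list(landslide_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# 326 inventory landslides, identified here by placeholder IDs 0..325.
train_ids, valid_ids = split_inventory(range(326))
print(len(train_ids), len(valid_ids))  # 228 98
```

Fixing the seed keeps the modeling and validation subsets identical across runs, which matters when several models are compared on the same validation landslides.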

2.3.2. Preparing Effective Factors

Landslides are processes of mass movement under the influence of different effective factors. Accordingly, it is essential to analyze the conditions of the selected factors to assess landslide susceptibility. In general, the major effective factors responsible for landslides are topographic (altitude, slope, aspect), climatic (rainfall), lithological (geology, distance from lineament), and hydro-morphological (geomorphology, distance from river, sediment transportation index, stream power index, topographic wetness index) factors, together with land use, vegetation index, soil texture, and earthquake intensity. Previous studies, including Yilmaz [8], Abedini et al. [9], Regmi et al. [10], Chawla et al. [11], Shahabi et al. [12], Roy and Saha [13], Pradhan [14], Pourghasemi et al. [15], Pham et al. [16], and Goetz et al. [17], used these effective factors for landslide susceptibility mapping. In the present study, elevation (Figure 4a), slope (Figure 4b), aspect (Figure 4c), rainfall (Figure 4d), geology (Figure 4e), soil texture (Figure 4f), distance from river (Figure 4g), distance from lineament (Figure 4h), land use/land cover (Figure 4i), normalized difference vegetation index (Figure 4j), distance from road (Figure 4k), topographic wetness index (Figure 4l), stream power index (Figure 4m), sediment transportation index (Figure 4n), geomorphology (Figure 4o), and seismic zone (Figure 4p) maps were used to delineate the landslide susceptibility area. The techniques listed in Table 2 were used to prepare the thematic layers of these effective factors. A DEM with a spatial resolution of 30 m × 30 m was selected to prepare the landslide susceptibility maps, and all parameters with coarser or finer scales than the DEM were resampled to 30 m × 30 m resolution.
Slope is one of the main landslide conditioning factors. The slope in the study area ranges from 0 to 89 degrees (Figure 4b). The aspect (Figure 4c) was classified into ten categories, i.e., flat (−1), north (0–22.5; 337.5–360), northeast (22.5–67.5), east (67.5–112.5), southeast (112.5–157.5), south (157.5–202.5), southwest (202.5–247.5), west (247.5–292.5), and northwest (292.5–337.5). The altitude of the study area ranges from 15 m to 3602 m above mean sea level (Figure 4a). The average rainfall ranges from 1877 mm to 2333 mm (Figure 4d). The geological map was obtained from the Geological Survey of India. The river buffer map was classified into five classes, based on the distance from the river, using the natural breaks classification method. The maximum distance from the river in this study area is 4.33 km (Figure 4g). Similarly, the maximum distances from the road and lineament are 16.4 km (Figure 4k) and 10 km (Figure 4h), respectively. The land use of this study area was classified into five categories; namely, water bodies, settlement, vegetation, tea gardens, fallow land, and agricultural land (Figure 4i). The NDVI values range from −0.072 to 0.432 (Figure 4j). The topographic wetness index (TWI) of the study area ranges from 1.95 to 18.41 (Figure 4l). Geomorphologically, the research area consists of active flood plain, alluvial plain, folded ridge, highly dissected hill slope, intermontane valley, and piedmont fan plain (Figure 4o). The seismic map of the study area was classified into two categories; namely, moderate and high seismic zones. The values of the moderate risk zone range from 3 to 5 on the Richter scale, while values above 5 on the Richter scale characterize the high seismic zone (Figure 4p). The STI ranges from 0 to 203 (Figure 4n). The SPI in the study area ranges from −11.0 to 7.81 (Figure 4m).
The elevation, slope, aspect, rainfall, normalized difference vegetation index (NDVI), topographic wetness index (TWI), stream power index (SPI), and sediment transportation index (STI) factors were classified into five sub-layers using the natural breaks classification method in a GIS environment (Figure 4). The land use/land cover (LULC) map was derived using the maximum likelihood classification method (Figure 4). The geology, soil texture, geomorphology, and seismic zone maps were categorized into different sub-layers using a general classification technique in a GIS environment (Figure 4).
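The aspect classes listed above (a flat class coded as −1 plus eight compass sectors of 45 degrees, with north spanning both 0–22.5 and 337.5–360) can be expressed as a small lookup. The sketch below is illustrative only: the function name `aspect_class` is hypothetical, and assigning a boundary value such as 22.5 to the upper class is an assumption, since the text does not state how boundaries are handled.

```python
def aspect_class(aspect_deg):
    """Map an aspect angle in degrees to a compass class.

    Negative values (the -1 code) are treated as flat terrain.
    Boundary angles fall into the upper class (an assumption).
    """
    if aspect_deg < 0:
        return "flat"
    names = ["north", "northeast", "east", "southeast",
             "south", "southwest", "west", "northwest"]
    # Shift by half a sector so that north is centered on 0 degrees.
    index = int(((aspect_deg % 360) + 22.5) // 45) % 8
    return names[index]

for angle in (-1, 10, 90, 180, 300, 350):
    print(angle, aspect_class(angle))
```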

2.4. Multicollinearity Analysis

Multicollinearity analysis is a vital way of identifying and selecting appropriate landslide conditioning factors [13]. In this study, multicollinearity was evaluated through the tolerance value and variance inflation factor (VIF). In normal conditions, tolerance values under 0.10 or VIF values of 10 and above indicate multicollinearity [31,32,33]. In the present study, the multicollinearity test of landslide conditioning factors was done using SPSS software.
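The study ran this test in SPSS. For readers who want to reproduce the idea outside SPSS, the sketch below computes the same two diagnostics in plain Python, using the identity that the VIF of factor j equals the j-th diagonal entry of the inverse correlation matrix, with tolerance as its reciprocal. The function names and the toy data are assumptions for illustration, not values from the study.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centered, norms)]
            for ci, ni in zip(centered, norms)]

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif_and_tolerance(columns):
    """Return (VIF list, tolerance list), one entry per factor."""
    inv = invert(correlation_matrix(columns))
    vif = [inv[j][j] for j in range(len(columns))]
    return vif, [1.0 / v for v in vif]

# Toy example with two factors whose correlation is 0.8,
# so VIF = 1 / (1 - 0.64) and tolerance = 0.36 for both.
vif, tol = vif_and_tolerance([[1, 2, 3, 4, 5], [2, 1, 4, 3, 5]])
print(vif, tol)
```

A factor would be flagged under the thresholds quoted above when its tolerance falls below 0.10 or its VIF reaches 10.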

2.5. Models

2.5.1. Weight-of-Evidence (WofE) Model

The present study applies an ensemble of the WofE model (a Bayesian probability model) and the SVM model to assess landslide susceptibility in the GIS environment. Two types of data were incorporated in the WofE model; namely, the landslide inventory data and the landslide conditioning factors. The WofE model assigns a weight to each landslide conditioning factor. It is a data-driven statistical method based on Bayesian probability [29,34,35,36,37,38,39,40]. Mohammady et al. [38] and Regmi et al. [10] emphasized the value of using the WofE model for the evaluation of landslide hazard zones.
The positive weight ($W^+$) and negative weight ($W^-$) were calculated to complete the weight-of-evidence function. This calculation was the basis for assigning the weights to the landslide conditioning factors (B) based on the presence and absence of landslides within the area [35], using Equations (1) and (2):
$$ W_i^{+} = \ln \frac{P\{B \mid A\}}{P\{B \mid \bar{A}\}} \tag{1} $$
$$ W_i^{-} = \ln \frac{P\{\bar{B} \mid A\}}{P\{\bar{B} \mid \bar{A}\}} \tag{2} $$
Here, P is the probability and ln is the natural logarithm. $B$ and $\bar{B}$ indicate the presence and absence of a landslide predictive factor, and $A$ and $\bar{A}$ indicate the presence and absence of landslides. A positive weight ($W^+$) indicates the presence of landslides in a sub-category of a landslide conditioning factor, and its magnitude indicates the strength of the positive correlation between that sub-category and landslide occurrence. A negative weight ($W^-$) indicates the absence of landslides in a sub-category of a landslide conditioning factor and a negative correlation between that sub-category and the occurrence of landslides [36]. For modeling purposes, the weight contrast $C$ ($C = W^+ - W^-$) measures the spatial association between landslide conditioning factors and landslide occurrences. A positive C value indicates a positive spatial association and a negative C value indicates a negative spatial association [37].
The standard deviation of the contrast C is calculated using Equation (3):
$$ S(C) = \sqrt{S^{2}(W^{+}) + S^{2}(W^{-})} \tag{3} $$
where $S^{2}(W^{+})$ is the variance of the positive weights and $S^{2}(W^{-})$ is the variance of the negative weights. The variances of the weights were calculated using Equations (4) and (5):
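$$ S^{2}(W^{+}) = \frac{1}{N\{B \cap A\}} + \frac{1}{N\{B \cap \bar{A}\}} \tag{4} $$
$$ S^{2}(W^{-}) = \frac{1}{N\{\bar{B} \cap A\}} + \frac{1}{N\{\bar{B} \cap \bar{A}\}} \tag{5} $$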
The studentized contrast is the final weight. It is a measure of confidence and is defined as the ratio of the contrast to its standard deviation, C/S(C). The studentized contrast serves as an informal test of whether C is significantly different from zero, that is, whether the contrast is likely to be “real” [35]. After applying the WofE model, the factor weights calculated by this model were ensembled with the SVM model.
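The chain from pixel counts to the studentized contrast can be traced in a few lines. The sketch below follows the definitions of the weights, their variances, the contrast, and its standard deviation given above; the function name and the pixel counts in the example are hypothetical and are not taken from Table 5.

```python
import math

def weights_of_evidence(n_b_a, n_b_not_a, n_not_b_a, n_not_b_not_a):
    """Return (W+, W-, C, S(C), studentized contrast) for one factor class.

    n_b_a          pixels in the class with landslides
    n_b_not_a      pixels in the class without landslides
    n_not_b_a      pixels outside the class with landslides
    n_not_b_not_a  pixels outside the class without landslides
    """
    p_b_given_a = n_b_a / (n_b_a + n_not_b_a)
    p_b_given_not_a = n_b_not_a / (n_b_not_a + n_not_b_not_a)
    w_plus = math.log(p_b_given_a / p_b_given_not_a)
    w_minus = math.log((1 - p_b_given_a) / (1 - p_b_given_not_a))
    var_plus = 1 / n_b_a + 1 / n_b_not_a
    var_minus = 1 / n_not_b_a + 1 / n_not_b_not_a
    contrast = w_plus - w_minus
    s_c = math.sqrt(var_plus + var_minus)
    return w_plus, w_minus, contrast, s_c, contrast / s_c

# Hypothetical counts: a class holding 40 of 100 landslide pixels
# and 960 of 9900 non-landslide pixels.
w_plus, w_minus, contrast, s_c, studentized = weights_of_evidence(40, 960, 60, 8940)
print(round(w_plus, 3), round(w_minus, 3), round(contrast, 3), round(studentized, 2))
```

In this example the class is over-represented among landslide pixels, so W+ is positive, W− is negative, and the contrast is positive. A class with a zero count in any of the four cells would need special handling, since the logarithms and variances above are then undefined.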

2.5.2. Support Vector Machine (SVM) Model

Among the different machine learning algorithms, SVM is an important supervised binary classifier that is based on the structural risk minimization principle [41,42,43,44]. The method constructs a separating hyperplane from the training dataset. The separating hyperplane is built in the original space of n coordinates (the $x_i$ parameters of the vector $x$) between the points of two distinct classes [43]. SVM finds the maximum margin of separation between the classes and places the classification hyperplane in the center of that margin [14,44]. A point located above the hyperplane is classified as +1; otherwise, it is classified as −1. The training points closest to the optimal hyperplane are called support vectors. Once the decision surface is obtained, new data can be classified [45]. Consider a training dataset of instance-label pairs $(X_i, Y_i)$ with $X_i \in R^n$, $Y_i \in \{+1, -1\}$, and $i = 1, \ldots, m$. For delineating the landslide susceptibility zones, X represents the vector space that includes rainfall, slope, aspect, elevation, geology, soil texture, land use/land cover, normalized difference vegetation index, distance from river, distance from lineament, distance from road, topographic wetness index, sediment transportation index, stream power index, geomorphology, and the seismic zone map. The +1 class indicates landslide pixels, whereas the −1 class indicates non-landslide pixels.
The aim of SVM is to find the optimal separating hyperplane that can separate the training dataset into the two classes of landslides and non-landslides {+1, −1}. The separating hyperplane separates data using the following equations:
$$ Y_i \left( W \cdot X_i + b \right) \geq 1 - \xi_i \tag{6} $$
where $W$ is a coefficient vector that defines the orientation of the hyperplane in the feature space, $b$ is the offset of the hyperplane from the origin, and $\xi_i$ represents the positive slack variables [46]. The optimization problem is solved by determining the optimal hyperplane using Lagrangian multipliers [47]:
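$$ \text{Minimize} \quad \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j Y_i Y_j \left( X_i \cdot X_j \right) \tag{7} $$
$$ \text{Subject to} \quad \sum_{i=1}^{n} \alpha_i Y_i = 0, \quad 0 \leq \alpha_i \leq C \tag{8} $$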
where $\alpha_i$ represents the Lagrange multipliers, $C$ is the penalty value, and the slack variables $\xi_i$ allow for penalized constraint violation. The decision function, which is used for the classification of new data, can then be written as:
$$ g(X) = \operatorname{sign}\left( \sum_{i=1}^{n} Y_i \alpha_i X_i + b \right) \tag{9} $$
If the two classes cannot be separated by a linear hyperplane, the original input data may be mapped into a high-dimensional feature space through a nonlinear kernel function. The classification decision function is presented in Equation (10):
$$ g(X) = \operatorname{sign}\left( \sum_{i=1}^{n} Y_i \alpha_i K(X_i, X_j) + b \right) \tag{10} $$
where K(Xi, Xj) is the kernel function.
Linear kernel (LN), polynomial kernel (PL), radial basis function kernel (RBF), and sigmoid kernel (SIG) are the most popular kernel types for SVM analysis [14]. PL and RBF (the latter is also called the Gaussian kernel) are the most commonly used kernels in the literature [43]. To prepare the landslide susceptibility map using SVM, we used the remote sensing (RS) software ENVI 4.7, which is an environment for visualizing images. The ENVI 4.7 SVM classifier offers four types of kernels; namely, radial basis function (RBF), linear kernel, polynomial kernel, and sigmoid kernel. The mathematical calculation was carried out as shown in Table 3.
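To make the role of the kernel in the decision function concrete, the sketch below implements the four kernel types in their commonly used forms and evaluates the kernelized decision function for a new sample. Table 3 is not reproduced here, so the exact formulas and parameters used in the study may differ: the parameter names `gamma`, `coef0`, and `degree`, their default values, and the toy support vectors are all assumptions. The second kernel argument is taken to be the sample being classified.

```python
import math

def dot(x, y):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    return dot(x, y)

def polynomial_kernel(x, y, gamma=1.0, coef0=1.0, degree=2):
    return (gamma * dot(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=1.0):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(x, y) + coef0)

def decision(x, support_vectors, labels, alphas, b, kernel):
    """Return +1 or -1 from sign(sum_i Y_i * alpha_i * K(X_i, x) + b)."""
    total = sum(y * a * kernel(sv, x)
                for sv, y, a in zip(support_vectors, labels, alphas)) + b
    return 1 if total >= 0 else -1

# Toy model: one support vector per class (values are illustrative only).
svs = [[1.0, 1.0], [-1.0, -1.0]]
labels = [1, -1]
alphas = [0.5, 0.5]

for kernel in (linear_kernel, polynomial_kernel, rbf_kernel, sigmoid_kernel):
    print(kernel.__name__, decision([2.0, 1.0], svs, labels, alphas, 0.0, kernel))
```

Only the kernel changes between the four ensemble variants. The multipliers and offset would in practice come from solving the optimization problem on the training pixels, which this sketch does not attempt.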

3. Results

3.1. Considering the Multicollinearity Analysis of the Effective Factors

The landslide conditioning factors were tested for multicollinearity. The lowest tolerance value among the landslide conditioning factors is 0.446 (rainfall) and the highest is 0.824 (slope) (Table 4). The highest variance inflation factor (VIF) value is 2.241, and the lowest is 1.213 (Table 4). All tolerance values are greater than 0.1 and all VIF values are less than 10, suggesting that there are no collinearity problems among these factors. Therefore, the 16 selected landslide conditioning factors are suitable for modeling landslide susceptibility.

3.2. Relationship Between Landslide Location and Effective Factors

The WofE values of each class of the explanatory variables represent the degree of landslide occurrence (Table 5). The topographic factors of elevation, slope, and aspect are vital factors that determine the landslide susceptibility of an area. Areas at high elevations are more susceptible to landslides than lower-altitude areas. In the present study, the altitude class between 422 m and 985 m has the highest WofE value, which indicates a high susceptibility to landslides. The other sub-layers of elevation are comparatively less susceptible to landslides. Slope plays a vital role in landslide hazard assessment. When slope stability becomes weak, the tendency of landslide occurrence is very high. Therefore, high slope values indicate a high probability of landslide occurrence. In our study area, the slope sub-class of 36 to 79 degrees is more prone to landslides than the other sub-layers of slope because this sub-class attained the maximum WofE value. Aspect, the direction that a slope faces, is also correlated with the probability of landslide occurrence. In this study, south-facing slopes obtained the maximum WofE value, indicating a high susceptibility to landslides. Heavy rainfall detaches soil and rock easily, leading to an increased probability of landslide occurrence. The study area is highly influenced by the monsoon rainfall from June to November, during which the tendency of landslide occurrence is very high. The rainfall sub-layer of 2167 mm to 2239 mm attained the highest WofE value and, therefore, has a higher risk of landslides than the other sub-layers of rainfall. Regarding the geology, the Darjeeling gneiss, Daling series, and Siwaliks geological segments attained the highest WofE values, suggesting the highest risk of landslides. The soil texture is strongly associated with the probability of landslide occurrence. 
The gravelly-loamy, gravelly-loamy to loamy-skeletal, and coarse-loamy soil texture classes, with WofE values of 23.44, 21.01, and 19.05, respectively, indicate a higher risk of landslide occurrence than the other soil texture classes. River proximity also increases the chances of landslide occurrence. Areas nearest to rivers have a higher landslide risk than areas in the more distant classes. Here, areas in the class of 0 to 1.66 km distance from rivers have a high probability of landslide occurrence, with a WofE value of 14.78. Similarly, areas closest to roads and lineaments have a high probability of landslide occurrence based on the WofE values. In recent times, land use has had a strong influence on the occurrence of landslides. Our study area is categorized into five land use types; namely, water bodies, settlement, vegetation, fallow land, and agricultural land. The outcome of the WofE model indicates that fallow land has a higher risk of landslides than vegetation and other land uses. Areas with a high normalized difference vegetation index are less prone to landslide occurrence, and vice versa. Here, the −0.07 to 0.12 NDVI sub-class, with a WofE value of 33.27, is the most critical zone for landslide occurrence. The other sub-layers of NDVI indicate lower probabilities of landslide occurrence. For the factors of TWI, STI, and SPI, the maximum values have the highest probability of landslide occurrence. Geomorphologically, the folded ridge and highly dissected mountain regions have the highest potential for landslide occurrence, with WofE values of 15 and 33, respectively. The hilly and mountainous regions are comparatively more prone to landslides than the plain and plateau regions. Seismologically, the high seismic zone is more susceptible to landslide occurrence than the low seismic zone.
All sub-layers of the different landslide conditioning factors were assigned a weight by the WofE model in the GIS environment. The weighted layers were then converted to a raster layer to prepare the landslide susceptibility map. Before the landslide susceptibility mapping, the weighted (by WofE) layers were reclassified as the input data layers of the support vector machine (SVM) for ensembling with WofE.

3.3. Landslide Susceptibility Models

The support vector machine is an important machine learning algorithm that is used to assess an area’s susceptibility to landslides and other natural hazards. In the present study, the SVM classification was used and ensembled with WofE. The landslide conditioning factors; namely, elevation, slope, aspect, rainfall, geology, soil texture, land use/land cover, normalized difference vegetation index (NDVI), distance from river, distance from road, distance from lineament, topographic wetness index (TWI), stream power index (SPI), sediment transportation index (STI), geomorphology, and seismic zone map, were used as the input of the SVM classification. The probability values produced by the SVM classification range from 0 to 1. Each pixel therefore carries a landslide susceptibility index between 0 and 1, where 0 represents stable conditions and 1 indicates a high chance of landslide occurrence. The SVM classification has four kernel types; namely, radial basis function, linear kernel, polynomial kernel, and sigmoid kernel. All four kernels were applied in the SVM classification. The output images created by the SVM classification were integrated and used to prepare the landslide susceptibility maps (LSMs) in the GIS environment.
The four landslide susceptibility maps (LSMs) shown in Figure 5a–d were prepared using the four ensemble models of WofE and SVM; namely, WofE & RBF-SVM, WofE & Linear-SVM, WofE & Polynomial-SVM, and WofE & Sigmoid-SVM. These LSMs were classified into four categories; namely, low, medium, high, and very high susceptibility to landslides, using the natural breaks classification method in the GIS environment. In the WofE & RBF-SVM ensemble map, the low, medium, high, and very high susceptibility classes covered 1071 km² (34%), 813 km² (25.8%), 635 km² (20.2%), and 630 km² (20%) of the districts, respectively (Table 6 and Figure 6). In the WofE & Linear-SVM model, the low, medium, high, and very high susceptibility classes covered 1128 km² (35.8%), 918 km² (29.1%), 630 km² (20%), and 474 km² (15%), respectively (Table 6). In the WofE & Polynomial-SVM model, the low, medium, high, and very high susceptibility classes covered 1095 km² (34.8%), 944 km² (30%), 608 km² (19.3%), and 501 km² (15.9%), respectively (Table 6). Meanwhile, in the WofE & Sigmoid-SVM ensemble map, the low, medium, high, and very high susceptibility classes covered 1153 km² (36.6%), 893 km² (28.3%), 605 km² (19.2%), and 498 km² (15.8%) of the area, respectively (Table 6).

3.4. Validation and Comparison of Models

The landslide susceptibility maps of Darjeeling and Kalimpong districts were prepared by the ensembles of WofE and SVM. These LSMs were then validated using the receiver operating characteristic (ROC) curve, which evaluates the accuracy of the models [48,49,50,51,52,53,54,55,56]. The ROC curve plots the false positive rate (1 − specificity) on the X-axis against the true positive rate (sensitivity) on the Y-axis [57]. ROC curves have been extensively used for the assessment of susceptibility maps [8,12,15,58,59,60,61,62,63,64,65,66]. In the present study, 98 (30%) of the 326 landslides were used to validate the landslide susceptibility maps. The area under the curve (AUC) values of the ensemble models WofE & RBF-SVM, WofE & Linear-SVM, WofE & Polynomial-SVM, and WofE & Sigmoid-SVM are 87%, 90%, 88%, and 85%, respectively, indicating that they are very good models for the identification of landslide hazard zones (Figure 7a–d). Based on the results of the ROC curves, the WofE & Linear-SVM model is considered more accurate (AUC = 90%) than the other three ensemble models.
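To make the ROC construction concrete, the following sketch builds the curve by sweeping a threshold over the susceptibility scores and integrates it with the trapezoidal rule. The labels and scores are hypothetical validation samples, not data from this study, and the function names are illustrative.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per distinct threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Rank samples from most to least susceptible
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    for index, (score, label) in enumerate(ranked):
        if label == 1:
            true_pos += 1
        else:
            false_pos += 1
        # Emit a point only after the last sample sharing this score (handles ties)
        if index == len(ranked) - 1 or ranked[index + 1][0] != score:
            points.append((false_pos / negatives, true_pos / positives))
    return points


def area_under_curve(points):
    """Trapezoidal integration of the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical validation samples: 1 = landslide, 0 = non-landslide
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

auc = area_under_curve(roc_points(labels, scores))
print(f"AUC = {auc:.3f}")
```

An AUC of 1 means every landslide sample is ranked above every non-landslide sample, while 0.5 corresponds to a random ranking.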
It is not sufficient to validate the susceptibility models with only one validation method because this can lead to erroneous results if the samples are randomly distributed across the basin. Therefore, it is essential to cross check the validation result using another suitable validation method. In the present study, the quality sum (Qs) index was used as a second method to assess the accuracy and compare the landslide susceptibility models. Abedini and Tulabi [67] used the Qs method for landslide hazard assessment. In the Qs method, greater values indicate a higher accuracy and correctness of the landslide susceptibility map, whereas lower values indicate lower accuracy [67]. To evaluate this index, the density ratio (Dr) was first calculated using Equation (11).
$$D_r = \frac{S_i / A_i}{\sum_{i=1}^{n} S_i \Big/ \sum_{i=1}^{n} A_i} \qquad (11)$$
where Si is the sum of the area of the landslides in each risk class, Ai is the area in the class of risk, and n is the number of risk classes in a zonation map. The Qs index is shown in Equation (12).
$$Q_s = \sum_{i=1}^{n} (D_r - 1)^2 \times S \qquad (12)$$
where Qs is the quality sum index, Dr is the density ratio, and S is the areal ratio of each risk class to the total area. The Qs method is a reliable validation technique that is calculated from the landslide distribution and the landslide hazard map using Equation (12). The four ensemble models in this study obtained the following Qs values: the WofE & RBF-SVM ensemble model scored 2.10; the WofE & Linear-SVM ensemble model scored 2.24; the WofE & Polynomial-SVM ensemble model scored 2.10; and the WofE & Sigmoid-SVM ensemble model scored 2.18 (Table 7). In line with the ROC results, the Qs validation results also indicate that the WofE & Linear-SVM model is more accurate than the other ensemble models.
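The density ratio and quality sum described by Equations (11) and (12) can be sketched as follows. The class and landslide areas are hypothetical and serve only to show the arithmetic; the variable names are illustrative, with `landslide_area` standing for Si, `class_area` for Ai, and the class share of the total area for S.

```python
def density_ratios(landslide_area, class_area):
    """Equation (11): landslide density of each class relative to the overall density."""
    overall_density = sum(landslide_area) / sum(class_area)
    return [s / a / overall_density for s, a in zip(landslide_area, class_area)]


def quality_sum(landslide_area, class_area):
    """Equation (12): sum of (Dr - 1)^2 weighted by each class's share of the total area."""
    total_area = sum(class_area)
    ratios = density_ratios(landslide_area, class_area)
    return sum((dr - 1.0) ** 2 * (a / total_area) for dr, a in zip(ratios, class_area))


# Hypothetical areas (km2) for the low, medium, high, and very high classes
class_area = [40.0, 30.0, 20.0, 10.0]
landslide_area = [1.0, 2.0, 3.0, 4.0]

ratios = density_ratios(landslide_area, class_area)
qs = quality_sum(landslide_area, class_area)
print("Dr per class:", [round(dr, 3) for dr in ratios])
print("Qs:", round(qs, 3))
```

If landslides were spread evenly over all classes, every Dr would equal 1 and Qs would be 0; the more strongly landslides concentrate in the higher classes, the larger Qs becomes.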

4. Discussion

Landslide susceptibility maps play a vital role in helping stakeholders make suitable decisions in landslide-prone areas. Landslide events not only cost human lives, but also destroy residential areas, roads, and agricultural fields. The assessment of landslide hazards using LSMs, as performed in this study, is an important tool to mitigate landslide hazards, sustain the environment, and help the residents of high-risk landslide susceptibility zones. In this study, ensemble models of weight-of-evidence (WofE) and SVM were used to prepare the landslide susceptibility maps (LSMs). Various statistical, knowledge-driven, probabilistic, and machine learning models have been used to recognize which areas are at severe risk of landslide occurrence. Several past studies have produced landslide susceptibility maps using different methods and models, such as landslide numerical risk factor (LNRF), frequency ratio (FR), analytical hierarchical process (AHP), SVM, artificial neural network (ANN), logistic regression (LR), conditional probability (CP), multi-criteria decision approach (MCDA), and bivariate and multivariate statistical models [8,9,60,61,62,63,64,65]. These studies determined the critical zones of landslide risk in their respective study regions. In the present study, by contrast, a new ensemble technique was used, which has shown better results than those of previous studies. An ensemble of two or three models may provide better results than any single model. In the present study, landslide susceptibility maps were prepared using ensemble models of WofE & RBF-SVM, WofE & Linear-SVM, WofE & Polynomial-SVM, and WofE & Sigmoid-SVM. With AUC values of 85–90%, these models proved reliable and accurate for the study area. The landslide susceptibility maps were created using landslide inventory data (326 landslides) and landslide conditioning factors (16 environmental factors).
The landslide susceptibility maps (LSMs) produced by the ensemble models were classified into four susceptibility classes; namely, low, medium, high, and very high susceptibility to landslide occurrence. The very high susceptibility zones of the WofE & RBF-SVM, WofE & Linear-SVM, WofE & Polynomial-SVM, and WofE & Sigmoid-SVM models cover areas of 630 km² (20%), 474 km² (15%), 501 km² (15.9%), and 498 km² (15.8%), respectively.
The landslide susceptibility maps (LSMs) were validated and compared using the receiver operating characteristic (ROC) and quality sum (Qs) validation methods. Based on these validation methods, all models are considered very good to excellent. A high-resolution DEM for this area is not freely available, which posed the main challenge for this study. If high-resolution images were used for the extraction of landslide conditioning factors instead of a 30 m DEM, these methods could be used to model landslide susceptibility at a micro level and achieve better results [68,69]. Of the four ensemble models, the landslide map produced by the WofE & Linear-SVM model is more suitable and accurate than those produced by the other models. The areal distribution of the landslide susceptibility classes is shown in Figure 6. In the present study, the very high susceptibility zones are found in the middle portion of the study area. The areas in these districts closer to roads, such as the NH-31 road, Rohini road, Rishi road, Darjeeling road, Sevoke road, and Sikkim-Kalimpong road, are highly affected by landslides. The Teesta River is the major river in these districts, and the areas closer to it form the most critical zone of landslide susceptibility. The Lish, Mahananda, and Torsha catchments are major catchments that are highly affected by landslides. The other critical landslide areas are Sukhia-Pokhari, Kurseong, Sevoke, Majua tea garden, and Kalimpong. The main factors determining landslide risk in these regions are heavy rainfall, steep slope, elevation, soil texture, geology, distance from road, and LULC. During the monsoon season, these areas are strongly affected by landslides due to heavy rainfall. These regions are also affected by high seismic intensity, which is an important cause of landslides. Overall, the study delineates the landslide risk zones of Darjeeling and Kalimpong districts.
This study will help the government to mitigate the effects of landslides and strengthen public awareness for sustainable development.

5. Conclusions

Landslides are very harmful natural hazards that cost human lives and cause widespread damage to roads, residences, gardens, and agricultural land. In this study, the weight-of-evidence (WofE) and SVM models were ensembled to produce landslide susceptibility maps (LSMs) for the Darjeeling and Kalimpong districts. The ensemble approach is an appropriate method for landslide susceptibility mapping that provides better results than using a single model. The four LSMs produced in this study were classified into four categories; namely, low, medium, high, and very high susceptibility to landslide occurrence. The very high susceptibility class covered 20% (WofE & RBF-SVM model), 15% (WofE & Linear-SVM model), 15.9% (WofE & Polynomial-SVM model), and 15.8% (WofE & Sigmoid-SVM model) of the study area. The very high landslide-prone areas are mainly located in the southern and middle parts of Darjeeling and Kalimpong districts. In particular, the Lish catchment area, Teesta catchment area, Sevoke road, and Majua tea garden areas are highly susceptible to landslide occurrences. The results of the ensemble models were validated using the Qs index and ROC methods. Both validation methods confirmed that the landslide susceptibility maps produced by the WofE & RBF-SVM, WofE & Linear-SVM, WofE & Polynomial-SVM, and WofE & Sigmoid-SVM ensemble methods are very good and appropriate. Of the four ensemble models, the WofE & Linear-SVM model was found to be more accurate than the other ensemble models. This work helps to increase the awareness of the public and government and aims to reduce the impact of landslides by providing steps and suitable strategies for hazard mitigation. Some steps and techniques are essential in the very high landslide risk zones of the study area. Identification of faults and weak geological regions, proper drainage management, and afforestation programs in landslide-prone areas may reduce the landslide risks.
The results obtained from this study can provide proper and significant information to the decision-makers and policy planners in the landslide-prone areas of these districts.

Author Contributions

Methodology, J.R., S.S., and A.A.; formal analysis, J.R., and S.S.; investigation, J.R., S.S., and A.A.; writing—original draft preparation, J.R., S.S., and A.A.; writing—review and editing, J.R., S.S., A.A., T.B., and D.T.B.

Funding

This research was partly funded by the Austrian Science Fund (FWF) through the Doctoral College GIScience (DK W 1237-N23) at the University of Salzburg.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Bui, D.T.; Tuan, T.A.; Klempe, H.; Pradhan, B.; Revhaug, I. Spatial prediction models for shallow landslide hazards: A comparative assessment of the efficacy of support vector machines, artificial neural networks, kernel logistic regression, and logistic model tree. Landslides 2015, 13, 361–378.
  2. Cruden, D.M.; Varnes, D.J. Landslide Types and Processes, Transportation Research Board, U.S. National Academy of Sciences. Spec. Rep. 1996, 247, 36–75.
  3. Gerrard, J. The landslide hazard in the Himalayas: Geological control and human action. Geomorphology 1994, 10, 221–230.
  4. Bhandari, R.K. Landslide hazard zonation: Some thoughts. In Coping with Natural Hazards: Indian Context; Valdiya, K.S., Ed.; Orient Longman: Hyderabad, India, 2004; pp. 134–152.
  5. Panikkar, S.V.; Subramanyan, V. A geomorphic evaluation of the landslides around Dehradun and Mussoorie, Uttar Pradesh, India. Geomorphology 1996, 15, 169–181.
  6. Sarkar, S. Landslides in Darjiling Himalayas. Trans. Jpn. Geomorphol. Union 1999, 20, 299–315.
  7. Fan, X.; Scaringi, G.; Domènech, G.; Yang, F.; Guo, X.; Dai, L.; Huang, R. Two multi-temporal datasets that track the enhanced landsliding after the 2008 Wenchuan earthquake. Earth Syst. Sci. Data 2019, 11, 35–55.
  8. Yilmaz, I. Comparison of landslide susceptibility mapping methodologies for Koyulhisar, Turkey: Conditional probability, logistic regression, artificial neural networks, and support vector machine. Environ. Earth Sci. 2009, 61, 821–836.
  9. Abedini, M.; Ghasemyan, B.; Mogaddam, M.H.R. Landslide susceptibility mapping in Bijar city, Kurdistan Province, Iran: A comparative study by logistic regression and AHP models. Environ. Earth Sci. 2017, 76, 308.
  10. Regmi, A.D.; Devkota, K.C.; Yoshida, K.; Pradhan, B.; Pourghasemi, H.R.; Kumamoto, T.; Akgun, A. Application of frequency ratio, statistical index, and weights-of-evidence models and their comparison in landslide susceptibility mapping in Central Nepal Himalaya. Arab. J. Geosci. 2013, 7, 725–742.
  11. Chawla, A.; Pasupuleti, S.; Chawla, S.; Rao, A.C.S.; Sarkar, K.; Dwivedi, R. Landslide Susceptibility Zonation Mapping: A Case Study from Darjeeling District, Eastern Himalayas, India. J. Indian Soc. Remote Sens. 2019, 47, 497–511.
  12. Shahabi, H.; Hashim, M. Landslide susceptibility mapping using GIS-based statistical models and Remote sensing data in tropical environment. Sci. Rep. 2015, 5, 15.
  13. Roy, J.; Saha, S. Landslide susceptibility mapping using knowledge driven statistical models in Darjeeling District, West Bengal, India. Geoenvironmental Disasters 2019, 6, 11.
  14. Pradhan, B. A comparative study on the predictive ability of the decision tree, support vector machine and neuro-fuzzy models in landslide susceptibility mapping using GIS. Comput. Geosci. 2013, 51, 350–365.
  15. Pourghasemi, H.R.; Jirandeh, A.G.; Pradhan, B.; Xu, C.; Gokceoglu, C. Landslide susceptibility mapping using support vector machine and GIS at the Golestan Province, Iran. J. Earth Syst. Sci. 2013, 122, 349–369.
  16. Pham, B.T.; Pradhan, B.; Bui, D.T.; Prakash, I.; Dholakia, M. A comparative study of different machine learning methods for landslide susceptibility assessment: A case study of Uttarakhand area (India). Environ. Model. Softw. 2016, 84, 240–250.
  17. Goetz, J.; Brenning, A.; Petschko, H.; Leopold, P. Evaluating machine learning and statistical prediction techniques for landslide susceptibility modeling. Comput. Geosci. 2015, 81, 1–11.
  18. Gravina, T.; Figliozzi, E.; Mari, N.; Schinosa, F.D.L.T. Landslide risk perception in Frosinone (Lazio, Central Italy). Landslides 2016, 14, 1419–1429.
  19. Pham, B.T.; Bui, D.T.; Prakash, I.; Dholakia, M. Hybrid integration of Multilayer Perceptron Neural Networks and machine learning ensembles for landslide susceptibility assessment at Himalayan area (India) using GIS. Catena 2017, 149, 52–63.
  20. Pawde, M.B.; Saha, S.S. Geology of Darjeeling Himalaya; GSI: Kolkata, India, 1982.
  21. Pramanik, M.K. Site suitability analysis for agricultural land use of Darjeeling district using AHP and GIS techniques. Model. Earth Syst. Environ. 2016, 2.
  22. Government of West Bengal. Bureau of Applied Economics and Statistics; Department of Statistics & Programme Implementation, District Statistical Handbook, Government of West Bengal: Kolkata, India, 2013.
  23. Guzzetti, F.; Reichenbach, P.; Ardizzone, F.; Cardinali, M.; Galli, M. Estimating the quality of landslide susceptibility models. Geomorphology 2006, 81, 166–184.
  24. Li, Z.; Zhu, Q.; Gold, C. Digital Terrain Modeling: Principles and Methodology; CRC Press: Boca Raton, FL, USA, 2005.
  25. Wentworth, C.K. A simplified method of determining the average slope of land surfaces. Am. J. Sci. 1930, 117, 184–194.
  26. Burrough, P.A.; McDonnell, R.A. Principles of Geographical Information Systems; Oxford University Press: New York, NY, USA, 1998; p. 190.
  27. Bayraktar, H.; Turalioglu, S. A Kriging-based approach for locating a sampling site—In the assessment of air quality. Stoch. Environ. Res. Risk Assess. 2005, 19, 301–305.
  28. Anderson, C.G.; Maxwell, D.C. Starting a Digitization Center; Elsevier: Amsterdam, The Netherlands, 2004; ISBN 978-1843340737.
  29. Ay, N.; Amari, S.-I. A Novel Approach to Canonical Divergences within Information Geometry. Entropy 2015, 17, 8111–8129.
  30. Myung, I.J. Tutorial on Maximum Likelihood Estimation. J. Math. Psychol. 2003, 47, 90–100.
  31. Crippen, R.E. Calculating the vegetation index faster. Remote Sens. Environ. 1990, 34, 71–73.
  32. Moore, I.D.; Grayson, R.B.; Ladson, A.R. Digital terrain modelling: A review of hydrological, geomorphological, and biological applications. Hydrol. Process. 1991, 5, 3–30.
  33. Moore, I.D.; Burch, G.J. Physical Basis of the Length Slope Factor in the Universal Soil Loss Equation. Soil Sci. Soc. Am. 1986, 50, 1294–1298.
  34. Available online: http://dx.doi.org/10.2136/sssaj1986.03615995005000050042x (accessed on 21 October 2017).
  35. O’Brien, R.M. A Caution Regarding Rules of Thumb for Variance Inflation Factors. Qual. Quant. 2007, 41, 673–690.
  36. Dormann, C.F.; Elith, J.; Bacher, S.; Buchmann, C.; Carl, G.; Carré, G.; Lautenbach, S. Collinearity: A review of methods to deal with it and a simulation study evaluating their performance. Ecography 2012, 36, 27–46.
  37. Wang, H.; Wang, G.; Wang, F.; Sassa, K.; Chen, Y. Probabilistic modeling of seismically triggered landslides using Monte Carlo simulations. Landslides 2008, 5, 387–395.
  38. Mohammady, M.; Pourghasemi, H.R.; Pradhan, B. Landslide susceptibility mapping at Golestan Province, Iran: A comparison between frequency ratio, Dempster-Shafer, and weights-of-evidence models. J. Asian Earth Sci. 2012, 61, 221–236.
  39. Bonham-Carter, G.F. Geographic information systems for geoscientists: Modeling with GIS. In Computer Methods in the Geosciences; Bonham-Carter, F., Ed.; Pergamon: Oxford, UK, 1994; p. 398.
  40. Dahal, R.K.; Hasegawa, S.; Nonomura, A.; Yamanaka, M.; Dhakal, S.; Paudyal, P. Predictive modeling of rainfall-induced landslide hazard in the Lesser Himalaya of Nepal based on weights-of-evidence. Geomorphology 2008, 102, 496–510.
  41. Dahal, R.K.; Hasegawa, S.; Nonomura, A.; Yamanaka, M.; Masuda, T.; Nishino, K. GIS-based weights-of-evidence modeling of rainfall-induced landslides in small catchments for landslide susceptibility mapping. Environ. Geol. 2008, 54, 314–324.
  42. Wan, S.; Lei, T.C. A knowledge-based decision support system to analyze the debris-flow problems at Chen-Yu-Lan River, Taiwan. Knowl. Based Syst. 2009, 22, 580–588.
  43. Yao, X.; Tham, L.; Dai, F. Landslide susceptibility mapping based on support vector machine: A case study on natural slopes of Hong Kong, China. Geomorphology 2008, 101, 572–582.
  44. Marjanovic, M.; Kovacevic, M.; Bajat, B.; Vozenilek, V. Landslide susceptibility assessment using SVM machine learning algorithm. Eng. Geol. 2011, 123, 225–234.
  45. Tehrany, M.S.; Pradhan, B.; Jebu, M.N. A comparative assessment between object and pixel-based classification approaches for land use/land cover mapping using SPOT 5 imagery. Geocarto Int. 2013, 29, 1–19.
  46. Tien Bui, D.; Pradhan, B.; Lofman, O.; Revhaug, I. Landslide susceptibility assessment in Vietnam using support vector machines, decision tree, and Naïve Bayes Models. Math. Probl. Eng. 2012, 2012, 1–26.
  47. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
  48. Samui, P. Slope stability analysis: A support vector machine approach. Environ. Geol. 2008, 56, 255–267.
  49. Arabameri, A.; Pradhan, B.; Rezaei, K. Gully erosion zonation mapping using integrated geographically weighted regression with certainty factor and random forest models in GIS. J. Environ. Manag. 2019, 232, 928–942.
  50. Arabameri, A.; Pradhan, B.; Rezaei, K. Spatial prediction of gully erosion using ALOS PALSAR data and ensemble bivariate and data mining models. Geosci. J. 2019, 23, 1–18.
  51. Arabameri, A.; Cerda, A.; Tiefenbacher, J.P. Spatial pattern analysis and prediction of gully erosion using novel hybrid model of entropy-weight of evidence. Water 2019, 11, 1129.
  52. Arabameri, A.; Pradhan, B.; Rezaei, K.; Conoscenti, C. Gully erosion susceptibility mapping using GIS-based multi-criteria decision analysis techniques. Catena 2019, 180, 282–297.
  53. Arabameri, A.; Rezaei, K.; Cerda, A.; Lombardo, L.; Rodrigo-Comino, J. GIS-based groundwater potential mapping in Shahroud plain, Iran. A comparison among statistical (bivariate and multivariate), data mining and MCDM approaches. Sci. Total Environ. 2019, 658, 160–177.
  54. Arabameri, A.; Pourghasemi, H.R.; Yamani, M. Applying different scenarios for landslide spatial modeling using computational intelligence methods. Environ. Earth Sci. 2017, 76, 832.
  55. Arabameri, A.; Pradhan, B.; Rezaei, K.; Sohrabi, M.; Kalantari, Z. GIS-based landslide susceptibility mapping using numerical risk factor bivariate model and its ensemble with linear multivariate regression and boosted regression tree algorithms. J. Mt. Sci. 2019, 16, 595–618.
  56. Arabameri, A.; Pradhan, B.; Rezaei, K.; Saro, L.; Sohrabi, M. An ensemble model for landslide susceptibility mapping in a forested area. Geochem. Int. 2019, 1–18.
  57. Chung, C.J.F.; Fabbri, A.G. Validation of Spatial Prediction Models for Landslide Hazard Mapping. Nat. Hazards 2003, 30, 451–472.
  58. Negnevitsky, M. Artificial Intelligence—A Guide to Intelligent Systems; Addison-Wesley Co.: Boston, MA, USA, 2002; p. 394.
  59. Mallick, J.; Singh, R.K.; Alawadh, M.A.; Islam, S.; Khan, R.A.; Qureshi, M.N. GIS-based landslide susceptibility evaluation using fuzzy-AHP multi-criteria decision-making techniques in the Abha Watershed, Saudi Arabia. Environ. Earth Sci. 2018, 77, 276.
  60. Feizizadeh, B.; Roodposhti, M.S.; Jankowski, P.; Blaschke, T. A GIS-based extended fuzzy multi-criteria evaluation for landslide susceptibility mapping. Comput. Geosci. 2014, 73, 208–221.
  61. Park, S.; Choi, C.; Kim, B.; Kim, J. Landslide susceptibility mapping using frequency ratio, analytic hierarchy process, logistic regression, and artificial neural network methods at the Inje area, Korea. Environ. Earth Sci. 2012, 68, 1443–1464.
  62. Bijukchhen, S.M.; Kayastha, P.; Dhital, M.R. A comparative evaluation of heuristic and bivariate statistical modelling for landslide susceptibility mappings in Ghurmi–DhadKhola, east Nepal. Arab. J. Geosci. 2012, 6, 2727–2743.
  63. Pradhan, B.; Youssef, A.M. Manifestation of remote sensing data and GIS on landslide hazard analysis using spatial-based statistical models. Arab. J. Geosci. 2009, 3, 319–326.
  64. Ozdemir, A.; Altural, T. A comparative study of frequency ratio, weights of evidence and logistic regression methods for landslide susceptibility mapping: Sultan Mountains, SW Turkey. J. Asian Earth Sci. 2013, 64, 180–197.
  65. Arabameri, A.; Yamani, M.; Pradhan, B.; Melesse, A.; Shirani, K.; Bui, D.T. Novel ensembles of COPRAS multi-criteria decision-making with logistic regression, boosted regression tree, and random forest for spatial prediction of gully erosion susceptibility. Sci. Total Environ. 2019, 688, 903–916.
  66. Tien Bui, D.; Hoang, N.D.; Martínez-Álvarez, F.; Ngo, P.T.T.; Hoa, P.V.; Pham, T.D.; Samui, P.; Costache, R. A novel deep learning neural network approach for predicting flash flood susceptibility: A case study at a high frequency tropical storm area. Sci. Total Environ. 2019.
  67. Abedini, M.; Tulabi, S. Assessing LNRF, FR, and AHP models in landslide susceptibility mapping index: A comparative study of Nojian watershed in Lorestan province, Iran. Environ. Earth Sci. 2018, 77, 405.
  68. Haneberg, W.C.; Cole, W.F.; Kasali, G. High-resolution lidar-based landslide hazard mapping and modeling, UCSF Parnassus Campus, San Francisco, USA. Bull. Eng. Geol. Environ. 2009, 68, 263–276.
  69. Nichol, J.E.; Shaker, A.; Wong, M.S. Application of high-resolution stereo satellite images to detailed landslide hazard assessment. Geomorphology 2006, 76, 68–75.
Figure 1. Study area and landslide location map.
Figure 2. Methodological flowchart of the present work.
Figure 3. Field photographs of some landslides in the study area. (a) Sikkim-Kalimpong road (27°03′20″N, 88°26′03″E) (b) Sevokekalimandir (26°54′01″N, 88°28′18″E). (c) Lish catchment (26°57′N, 88°30′17″E). (d) Darjeeling road (26°54′33″N, 88°17′10″E). (e) Pagla Jhora (26°52′47.70″N, 88°18′11.24″E). (f) Sevoke Road (26°54′33″N, 88°28′04″E).
Figure 4. Landslide conditioning factors - a. elevation, b. slope, c. aspect, d. rainfall, e. geology, f. soil texture, g. distance from river, h. distance from lineament, i. land use/land cover (LULC), j. normalized differential vegetation index (NDVI), k. distance from road, l. topographic wetness index (TWI), m. stream power index (SPI), n. sediment transportation index (STI), o. Geomorphology, p. Seismic map.
Figure 5. Landslide Susceptibility maps (LSMs) produced by different ensemble models – (a). WofE& RBF-SVM, (b). WofE&Linear-SVM, (c). WofE& Polynomial-SVM, (d). WofE& Sigmoid-SVM models.
Figure 6. Areal distributions of LSMs by– (a). area distribution of LSMs, (b). percentage of area distribution of LSMs.
Figure 7. Validation of LSMs using the ROC curve showing the area under curve (AUC) – (a). WofE& RBF-SVM, (b). WofE&Linear-SVM, (c). WofE& Polynomial-SVM, (d). WofE& Sigmoid-SVM models.
Table 1. Geological successions of Darjeeling Himalaya.
| Age | Series | Lithological Characteristics |
| --- | --- | --- |
| Recent (Holocene); Pleistocene | Sub-aerial formations (soil, alluvia, colluvial); Raised Terraces | Younger flood plain deposits of the rivers composed of sand, gravel, pebble, etc. and soil covering the rocks sandy, clay, gravel, pebble, boulders etc. representing older fluvial deposits |
| Miocene | Siwalik | Micaceous sandstones with slaty bands, seams of graphitic coal, silts and minor bands of limestone |
| Permian | Damuda Series (Lower Gondwana) | Quartzitic sandstones with slaty bands, carbonaceous shales, seams of graphitic coal, lamprophyre sills and minor bands of limestone |
| Precambrian | 1) Darjeeling gneiss; 2) Daling gneiss | Golden-silvery mica schists; carbonaceous mica schists; granatiferous mica schists and coarse grained gneisses. Slates (greenish to grey with perfect slaty cleavage). Phyllites surrounded by pebbles of quartz, chlorite-schists with bands of gritty schists injected with gneiss (crinkled). Granites, pegmatites and quartz veins, with tourmaline and iron as accessories |
Source: Mallet (50); Gansser, (51); Pawde and Saha, (52).
Table 2. Production techniques used for the various thematic data layers.
| Sl. No. | Parameters | Data Used & Scale | Sources of Data | Techniques | References |
| --- | --- | --- | --- | --- | --- |
| 1 | Elevation | DEM, 30 m × 30 m | U.S. Geological Survey | 30 m × 30 m digital elevation model | [24] |
| 2 | Slope | DEM, 30 m × 30 m | U.S. Geological Survey | $\tan\theta = \frac{N \times i}{636.6}$, where N is the number of contour cuttings and i is the contour interval | [25] |
| 3 | Aspect | DEM, 30 m × 30 m | U.S. Geological Survey | $\mathrm{Aspect} = 57.29 \times \operatorname{atan2}\left(\left[\frac{dz}{dy}\right], -\left[\frac{dz}{dx}\right]\right)$, where dz/dx = ((c + 2f + i) − (a + 2d + g))/8 and dz/dy = ((g + 2h + i) − (a + 2b + c))/8; a to i indicate the cell values of the 3 × 3 window | [26] |
| 4 | Rainfall | Annual average rainfall data of different stations in the last 5 years | India Meteorological Department (IMD) | Kriging interpolation method | [27] |
| 5 | Geology | Reference geological map, 1:50,000 | Geological Survey of India | Digitization process | [28] |
| 6 | Soil | Reference district soil map, 1:50,000 | National Bureau of Soil Survey and Land Use Planning | Digitization process | [28] |
| 7 | Distance from river | Reference topomap, 1:50,000 | Survey of India | Euclidean distance buffering | [29] |
| 8 | Distance from lineament | Reference sheet of lineament, 30 m × 30 m | https://bhuvan-vec2.nrsc.gov.in/bhuvan/wms | Euclidean distance buffering | [29] |
| 9 | Land use/land cover (LULC) | Landsat 8 OLI/TIRS, 30 m × 30 m | U.S. Geological Survey | Maximum likelihood classification | [30] |
| 10 | Normalized differential vegetation index (NDVI) | Landsat 8 OLI/TIRS, 30 m × 30 m | U.S. Geological Survey | $NDVI = \frac{NIR - IR}{NIR + IR}$, where NIR is the near infrared band or band 4 and IR is the infrared band or band 3 | [31] |
| 11 | Distance from road | Reference topomap, 1:50,000 | Survey of India | Euclidean distance buffering | [29] |
| 12 | Topographic wetness index (TWI) | DEM, 30 m × 30 m; 1:50,000 | U.S. Geological Survey | $TWI = \ln(A_s / \tan\beta)$, where $A_s$ is the cumulative upslope area draining through a point (per unit contour length) and $\beta$ is the slope gradient (in degrees) | [32] |
| 13 | Stream power index (SPI) | DEM, 30 m × 30 m; 1:50,000 | U.S. Geological Survey | $SPI = A_s \times \tan\beta$, where $A_s$ is the upstream contributing area and $\beta$ is the slope gradient (in degrees) | [32] |
| 14 | Sediment transportation index (STI) | DEM, 30 m × 30 m | U.S. Geological Survey | $STI = (m + 1) \times (A_s / 22.13)^m \times (\sin B / 0.0896)^n$, where $A_s$ is the specific catchment area; B is the local slope gradient in degrees; m is usually set to 0.4; n is usually set to 0.0896 | [33] |
| 15 | Geomorphology | Reference sheet, 1:50,000 | https://bhuvan-vec2.nrsc.gov.in/bhuvan/wms | Digitization process | [27] |
| 16 | Seismic zone map | Last 200 years point data of earthquakes, 30 m × 30 m | National Centre for Seismology, New Delhi, India | Gridding and interpolation (inverse distance weight method) | [11] |
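Three of the index formulas in Table 2 are simple enough to sketch directly. The example below is a minimal illustration under stated assumptions: the function and parameter names are hypothetical, the slope is supplied in degrees and converted to radians for the tangent, and the second NDVI band (labelled IR in Table 2) is treated as the red band of the usual NDVI definition. The input values in the usage lines are invented.

```python
import math


def ndvi(nir, red):
    """NDVI = (NIR - red) / (NIR + red) for a single pixel's reflectances."""
    return (nir - red) / (nir + red)


def twi(upslope_area, slope_deg):
    """TWI = ln(A_s / tan(beta)); undefined for a perfectly flat cell."""
    tan_beta = math.tan(math.radians(slope_deg))
    if tan_beta <= 0:
        raise ValueError("slope must be greater than 0 degrees")
    return math.log(upslope_area / tan_beta)


def spi(upslope_area, slope_deg):
    """SPI = A_s * tan(beta)."""
    return upslope_area * math.tan(math.radians(slope_deg))


# Invented example values for one raster cell
print("NDVI:", round(ndvi(0.5, 0.1), 3))
print("TWI :", round(twi(100.0, 45.0), 3))
print("SPI :", round(spi(100.0, 45.0), 3))
```

In practice these indices are computed cell by cell over the whole raster in the GIS environment; flat cells need special handling for TWI because the tangent of the slope is zero.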
Table 3. SVM kernel types and their equations.
| Kernel Types | Equations | Kernel Parameters |
| --- | --- | --- |
| Radial Basis Function (RBF) | $K(X_i, X_j) = \exp\left(-\gamma \lVert X_i - X_j \rVert^2\right)$ | $\gamma$ |
| Linear kernel | $K(X_i, X_j) = X_i^T X_j$ | --- |
| Polynomial kernel | $K(X_i, X_j) = \left(\gamma X_i^T X_j + 1\right)^d$ | $\gamma$, $d$ |
| Sigmoid kernel | $K(X_i, X_j) = \tanh\left(\gamma X_i^T X_j + 1\right)$ | $\gamma$ |
(Source: Tien Bui et al. [46], Yao et al. [43]).
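As a minimal sketch of the four kernel types in Table 3, the functions below evaluate each kernel for a pair of feature vectors. The function names are hypothetical, and the sigmoid kernel is written in the common form tanh(γ·xᵀy + 1) with the offset fixed at 1, which is an assumption here; an actual SVM library may parameterize the kernels differently.

```python
import math

def dot(x, y):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma):
    """exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def linear_kernel(x, y):
    """x^T y; no kernel parameter."""
    return dot(x, y)

def polynomial_kernel(x, y, gamma, d):
    """(gamma * x^T y + 1)^d."""
    return (gamma * dot(x, y) + 1.0) ** d

def sigmoid_kernel(x, y, gamma):
    """tanh(gamma * x^T y + 1); offset of 1 is assumed."""
    return math.tanh(gamma * dot(x, y) + 1.0)

# Hypothetical two-feature samples, for illustration only
x, y = [1.0, 2.0], [3.0, 4.0]
print(rbf_kernel(x, y, 0.5), linear_kernel(x, y),
      polynomial_kernel(x, y, 1.0, 2), sigmoid_kernel(x, y, 0.1))
```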
Table 4. Multicollinearity analysis of landslide conditioning factors.

| Landslide Conditioning Factors | Tolerance | VIF |
| --- | --- | --- |
| Rainfall | 0.446 | 2.241 |
| Elevation | 0.520 | 1.924 |
| Slope | 0.824 | 1.213 |
| Aspect | 0.672 | 1.488 |
| Geology | 0.688 | 1.453 |
| Soil | 0.756 | 1.323 |
| Distance from River | 0.570 | 1.753 |
| Distance from lineament | 0.773 | 1.294 |
| Distance from Road | 0.499 | 2.003 |
| Land use/land cover (LULC) | 0.754 | 1.326 |
| Normalized differential vegetation index (NDVI) | 0.757 | 1.320 |
| Topographic wetness index (TWI) | 0.677 | 1.477 |
| Stream power index (SPI) | 0.684 | 1.461 |
| Sediment transportation index (STI) | 0.768 | 1.302 |
| Geomorphology | 0.789 | 1.268 |
| Seismic zone | 0.618 | 1.618 |
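Tolerance and VIF in Table 4 are reciprocals of one another (VIF = 1/tolerance), which the sketch below checks for a few rows. The screening thresholds used here (VIF above 10 or tolerance below 0.1) are a commonly cited rule of thumb and are an assumption in this example, as are the function names.

```python
# A few (tolerance, VIF) pairs copied from Table 4
TABLE4 = {
    "Rainfall": (0.446, 2.241),
    "Elevation": (0.520, 1.924),
    "Slope": (0.824, 1.213),
    "Distance from Road": (0.499, 2.003),
}

def vif_from_tolerance(tolerance):
    """Variance inflation factor as the reciprocal of tolerance."""
    return 1.0 / tolerance

def flag_collinear(factors, vif_limit=10.0, tol_limit=0.1):
    """Names of factors exceeding the assumed rule-of-thumb limits."""
    return [name for name, (tol, vif) in factors.items()
            if vif > vif_limit or tol < tol_limit]

for name, (tol, vif) in TABLE4.items():
    print(name, round(vif_from_tolerance(tol), 3), vif)
print(flag_collinear(TABLE4))
```

Small differences between the recomputed and tabulated VIF values are expected, since the tolerances are rounded to three decimals.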
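Table 5. Spatial relationship between landslide conditioning factors and landslide occurrence extracted by the Weight-of-evidence (WofE) model.

| Factor | Class | Pixels | % of Pixels | Landslide Pixels | % of Landslide Pixels | W+ | W− | C | S²(W+) | S²(W−) | S(C) | C/S(C) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Rainfall (mm) | 1877.38–1991.97 | 322590 | 8.784 | 0 | 0.000 | 0.000 | 0.092 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| | 1991.97–2090.54 | 289906 | 7.894 | 0 | 0.000 | 0.000 | 0.082 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| | 2090.45–2167.44 | 944320 | 25.712 | 393 | 7.895 | −1.182 | 0.215 | −1.397 | 0.003 | 0.000 | 0.053 | −26.580 |
| | 2167.44–2239.06 | 1333493 | 36.309 | 3670 | 73.684 | 0.709 | −0.885 | 1.594 | 0.000 | 0.001 | 0.032 | 49.504 |
| | 2239.06–2333.96 | 782357 | 21.302 | 918 | 18.421 | −0.145 | 0.036 | −0.182 | 0.001 | 0.000 | 0.037 | −4.963 |
| Slope (Degree) | 0–9.32 | 1175818 | 32.015 | 92 | 1.847 | −2.854 | 0.368 | −3.222 | 0.011 | 0.000 | 0.105 | −30.614 |
| | 9.32–18.64 | 665098 | 18.109 | 571 | 11.464 | −0.458 | 0.078 | −0.536 | 0.002 | 0.000 | 0.044 | −12.044 |
| | 18.44–27.34 | 813896 | 22.161 | 1172 | 23.529 | 0.060 | −0.018 | 0.078 | 0.001 | 0.000 | 0.033 | 2.326 |
| | 27.34–36.66 | 694449 | 18.909 | 1579 | 31.700 | 0.518 | −0.172 | 0.690 | 0.001 | 0.000 | 0.030 | 22.622 |
| | 36.66–79.23 | 323404 | 8.806 | 1567 | 31.460 | 1.277 | −0.286 | 1.563 | 0.001 | 0.000 | 0.031 | 51.122 |
| Altitude (m) | 15–422 | 1351511 | 36.799 | 417 | 8.373 | −1.482 | 0.372 | −1.854 | 0.002 | 0.000 | 0.051 | −36.226 |
| | 422–985 | 837224 | 22.796 | 2491 | 50.000 | 0.787 | −0.435 | 1.222 | 0.000 | 0.000 | 0.028 | 43.079 |
| | 985–1576 | 738499 | 20.108 | 1005 | 20.173 | 0.003 | −0.001 | 0.004 | 0.001 | 0.000 | 0.035 | 0.115 |
| | 1576–2279 | 518669 | 14.122 | 839 | 16.844 | 0.176 | −0.032 | 0.209 | 0.001 | 0.000 | 0.038 | 5.509 |
| | 2279–3602 | 226762 | 6.174 | 230 | 4.610 | −0.293 | 0.017 | −0.309 | 0.004 | 0.000 | 0.068 | −4.572 |
| Aspect | Flat (−1) | 1905 | 0.052 | 0 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| | north | 236967 | 6.452 | 39 | 0.788 | −2.104 | 0.059 | −2.163 | 0.025 | 0.000 | 0.160 | −13.495 |
| | northeast | 462023 | 12.580 | 363 | 7.289 | −0.546 | 0.059 | −0.605 | 0.003 | 0.000 | 0.055 | −11.098 |
| | east | 454970 | 12.388 | 651 | 13.061 | 0.053 | −0.008 | 0.061 | 0.002 | 0.000 | 0.042 | 1.443 |
| | southeast | 522200 | 14.219 | 1098 | 22.045 | 0.439 | −0.096 | 0.535 | 0.001 | 0.000 | 0.034 | 15.640 |
| | south | 525807 | 14.317 | 1292 | 25.946 | 0.596 | −0.146 | 0.742 | 0.001 | 0.000 | 0.032 | 22.922 |
| | southwest | 457236 | 12.450 | 890 | 17.868 | 0.362 | −0.064 | 0.426 | 0.001 | 0.000 | 0.037 | 11.505 |
| | west | 378362 | 10.302 | 462 | 9.279 | −0.105 | 0.011 | −0.116 | 0.002 | 0.000 | 0.049 | −2.376 |
| | northwest | 419573 | 11.424 | 154 | 3.093 | −1.308 | 0.090 | −1.398 | 0.006 | 0.000 | 0.082 | −17.074 |
| | north | 213621 | 5.817 | 31 | 0.630 | −2.223 | 0.054 | −2.277 | 0.032 | 0.000 | 0.179 | −12.718 |
| Geology | Siwaliks | 1936266 | 52.721 | 3182 | 63.889 | 0.192 | −0.270 | 0.462 | 0.000 | 0.001 | 0.030 | 15.659 |
| | Darjeeling Gneiss | 270526 | 7.366 | 692 | 13.889 | 0.635 | −0.073 | 0.709 | 0.001 | 0.000 | 0.041 | 17.273 |
| | Daling series | 131471 | 3.580 | 415 | 8.333 | 0.847 | −0.051 | 0.897 | 0.002 | 0.000 | 0.051 | 17.480 |
| | Alluvium | 678512 | 18.475 | 0 | 0.000 | 0.000 | 0.205 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| | Damuda series | 655890 | 17.859 | 692 | 13.889 | −0.252 | 0.047 | −0.299 | 0.001 | 0.000 | 0.041 | −7.293 |
| Soil | Gravelly-loamy | 274651 | 7.478 | 830 | 16.667 | 0.803 | −0.105 | 0.908 | 0.001 | 0.000 | 0.038 | 23.845 |
| | Fine loamy_Coarse Loamy | 1477848 | 40.239 | 1107 | 22.222 | −0.594 | 0.264 | −0.858 | 0.001 | 0.000 | 0.034 | −25.171 |
| | Gravelly loamy_Loamy Skeletal | 450035 | 12.254 | 1107 | 22.222 | 0.596 | −0.121 | 0.717 | 0.001 | 0.000 | 0.034 | 21.019 |
| | Gravelly Loamy_Coarse Loamy | 1404794 | 38.250 | 1660 | 33.333 | −0.138 | 0.077 | −0.214 | 0.001 | 0.000 | 0.030 | −7.131 |
| | Coarse Loamy | 65336 | 1.779 | 277 | 5.556 | 1.142 | −0.039 | 1.181 | 0.004 | 0.000 | 0.062 | 19.055 |
| Distance from River (km) | 0–0.42 | 1160959 | 31.611 | 1049 | 21.053 | −0.407 | 0.144 | −0.551 | 0.001 | 0.000 | 0.035 | −15.837 |
| | 0.42–1.10 | 1291696 | 35.171 | 1966 | 39.474 | 0.116 | −0.069 | 0.184 | 0.001 | 0.000 | 0.029 | 6.356 |
| | 1.10–1.66 | 750401 | 20.432 | 1442 | 28.947 | 0.349 | −0.113 | 0.462 | 0.001 | 0.000 | 0.031 | 14.784 |
| | 1.66–2.26 | 371677 | 10.120 | 393 | 7.895 | −0.249 | 0.024 | −0.273 | 0.003 | 0.000 | 0.053 | −5.195 |
| | 2.26–4.33 | 97931 | 2.666 | 131 | 2.632 | −0.013 | 0.000 | −0.014 | 0.008 | 0.000 | 0.089 | −0.153 |
| Distance from Lineament (km) | 0–1.54 | 763490 | 20.788 | 906 | 18.182 | −0.134 | 0.032 | −0.167 | 0.001 | 0.000 | 0.037 | −4.531 |
| | 1.54–2.85 | 1093457 | 29.773 | 1019 | 20.455 | −0.376 | 0.125 | −0.501 | 0.001 | 0.000 | 0.035 | −14.243 |
| | 2.85–4.20 | 941314 | 25.630 | 1472 | 29.545 | 0.142 | −0.054 | 0.197 | 0.001 | 0.000 | 0.031 | 6.323 |
| | 4.20–5.75 | 633142 | 17.239 | 1245 | 25.000 | 0.372 | −0.099 | 0.471 | 0.001 | 0.000 | 0.033 | 14.378 |
| | 5.75–10.12 | 241263 | 6.569 | 340 | 6.818 | 0.037 | −0.003 | 0.040 | 0.003 | 0.000 | 0.056 | 0.710 |
| Distance from Road (km) | 0–1.74 | 1636028 | 44.546 | 792 | 15.909 | −1.031 | 0.417 | −1.448 | 0.001 | 0.000 | 0.039 | −37.353 |
| | 1.74–3.94 | 988335 | 26.911 | 906 | 18.182 | −0.393 | 0.113 | −0.506 | 0.001 | 0.000 | 0.037 | −13.754 |
| | 3.94–6.72 | 589253 | 16.044 | 906 | 18.182 | 0.125 | −0.026 | 0.151 | 0.001 | 0.000 | 0.037 | 4.109 |
| | 6.72–10.22 | 316628 | 8.621 | 1472 | 29.545 | 1.235 | −0.260 | 1.495 | 0.001 | 0.000 | 0.031 | 48.066 |
| | 10.22–16.49 | 142420 | 3.878 | 906 | 18.182 | 1.550 | −0.161 | 1.711 | 0.001 | 0.000 | 0.037 | 46.466 |
| Land use/Land cover | Water bodies | 40427 | 1.101 | 0 | 0.000 | 0.000 | 0.011 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| | Vegetation | 2650294 | 72.163 | 1119 | 22.464 | −1.168 | 1.027 | −2.195 | 0.001 | 0.000 | 0.034 | −64.607 |
| | Fallow land | 168382 | 4.585 | 1624 | 32.609 | 1.970 | −0.348 | 2.318 | 0.001 | 0.000 | 0.030 | 76.445 |
| | Agricultural land | 763256 | 20.782 | 2238 | 44.928 | 0.773 | −0.364 | 1.137 | 0.000 | 0.000 | 0.029 | 39.858 |
| | Settlement | 50306 | 1.370 | 0 | 0.000 | 0.000 | 0.014 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Normalized differential vegetation index (NDVI) | −0.07–0.12 | 442450 | 12.047 | 1399 | 28.093 | 0.849 | −0.202 | 1.050 | 0.001 | 0.000 | 0.032 | 33.271 |
| | 0.12–0.17 | 972514 | 26.480 | 1421 | 28.523 | 0.074 | −0.028 | 0.103 | 0.001 | 0.000 | 0.031 | 3.270 |
| | 0.17–0.23 | 997257 | 27.154 | 1312 | 26.336 | −0.031 | 0.011 | −0.042 | 0.001 | 0.000 | 0.032 | −1.297 |
| | 0.23–0.29 | 816592 | 22.234 | 618 | 12.411 | −0.584 | 0.119 | −0.703 | 0.002 | 0.000 | 0.043 | −16.346 |
| | 0.29–0.49 | 443851 | 12.085 | 231 | 4.636 | −0.959 | 0.081 | −1.041 | 0.004 | 0.000 | 0.067 | −15.436 |
| Topographic wetness index (TWI) | 1.95–7.37 | 582990 | 15.874 | 918 | 18.421 | 0.149 | −0.031 | 0.180 | 0.001 | 0.000 | 0.037 | 4.916 |
| | 7.37–8.53 | 1326854 | 36.128 | 2097 | 42.105 | 0.153 | −0.098 | 0.252 | 0.000 | 0.000 | 0.029 | 8.765 |
| | 8.53–9.76 | 1088701 | 29.643 | 1311 | 26.316 | −0.119 | 0.046 | −0.165 | 0.001 | 0.000 | 0.032 | −5.140 |
| | 9.76–11.70 | 547267 | 14.901 | 655 | 13.158 | −0.125 | 0.020 | −0.145 | 0.002 | 0.000 | 0.042 | −3.454 |
| | 11.70–18.91 | 126853 | 3.454 | 0 | 0.000 | 0.000 | 0.035 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Sediment transportation index (STI) | 0–4.80 | 3576809 | 97.390 | 4850 | 97.368 | 0.000 | 0.008 | −0.008 | 0.000 | 0.008 | 0.089 | −0.096 |
| | 4.80–20.81 | 78362 | 2.134 | 131 | 2.632 | 0.210 | −0.005 | 0.215 | 0.008 | 0.000 | 0.089 | 2.429 |
| | 20.81–56.85 | 13728 | 0.374 | 0 | 0.000 | 0.000 | 0.004 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| | 56.85–120.10 | 3037 | 0.083 | 0 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| | 120.10–203.38 | 729 | 0.020 | 0 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Stream power index (SPI) | −11.16 – −6.84 | 457701 | 12.462 | 427 | 8.571 | −0.375 | 0.044 | −0.418 | 0.002 | 0.000 | 0.051 | −8.260 |
| | −6.84 – −4.31 | 670452 | 18.255 | 1139 | 22.857 | 0.225 | −0.058 | 0.283 | 0.001 | 0.000 | 0.034 | 8.385 |
| | −4.31 – −2.08 | 994622 | 27.082 | 1139 | 22.857 | −0.170 | 0.056 | −0.226 | 0.001 | 0.000 | 0.034 | −6.700 |
| | −2.08 – −0.002 | 1003492 | 27.323 | 1423 | 28.571 | 0.045 | −0.017 | 0.062 | 0.001 | 0.000 | 0.031 | 1.978 |
| | −0.002 – 7.81 | 546398 | 14.877 | 854 | 17.143 | 0.142 | −0.027 | 0.169 | 0.001 | 0.000 | 0.038 | 4.491 |
| Geomorphology | Alluvial plain | 591694 | 16.111 | 0 | 0.000 | 0.000 | 0.176 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| | Piedmont fan plain | 453016 | 12.335 | 119 | 2.381 | −1.646 | 0.108 | −1.754 | 0.008 | 0.000 | 0.093 | −18.867 |
| | Inter montane valley | 383190 | 10.434 | 474 | 9.524 | −0.091 | 0.010 | −0.101 | 0.002 | 0.000 | 0.048 | −2.101 |
| | Active flood plain | 205950 | 5.608 | 0 | 0.000 | 0.000 | 0.058 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| | Folded ridge | 499607 | 13.603 | 1067 | 21.429 | 0.455 | −0.095 | 0.550 | 0.001 | 0.000 | 0.035 | 15.919 |
| | Highly dissected hill slope | 1539208 | 41.910 | 3321 | 66.667 | 0.465 | −0.556 | 1.021 | 0.000 | 0.001 | 0.030 | 33.948 |
| Seismic zone map | High | 1000641 | 27.246 | 2604 | 52.273 | 0.653 | −0.422 | 1.075 | 0.000 | 0.000 | 0.028 | 37.859 |
| | Moderate | 2672024 | 72.754 | 2377 | 47.727 | −0.422 | 0.653 | −1.075 | 0.000 | 0.000 | 0.028 | −37.859 |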
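The weights in Table 5 can be recomputed from the pixel counts. The sketch below uses the standard weight-of-evidence formulation (W+ and W− as log ratios of conditional probabilities, C = W+ − W−, and S(C) from the reciprocal cell counts); that this is exactly the procedure the authors followed is an assumption, although it does reproduce the 2167.44–2239.06 mm rainfall row. The totals (3,672,666 pixels and 4,981 landslide pixels) are obtained here by summing the rainfall rows, and the function name is hypothetical.

```python
import math

def wofe_weights(class_pixels, class_landslide, total_pixels, total_landslide):
    """Weight-of-evidence statistics for one factor class (standard formulation, assumed)."""
    n_bl = class_landslide                      # class present, landslide
    n_bn = class_pixels - class_landslide       # class present, no landslide
    n_al = total_landslide - class_landslide    # class absent, landslide
    total_stable = total_pixels - total_landslide
    n_an = total_stable - n_bn                  # class absent, no landslide
    if min(n_bl, n_bn, n_al, n_an) <= 0:
        return None  # log ratios are undefined when any cell count is zero
    w_plus = math.log((n_bl / total_landslide) / (n_bn / total_stable))
    w_minus = math.log((n_al / total_landslide) / (n_an / total_stable))
    contrast = w_plus - w_minus
    s_c = math.sqrt(1 / n_bl + 1 / n_bn + 1 / n_al + 1 / n_an)
    return {"w_plus": w_plus, "w_minus": w_minus, "contrast": contrast,
            "s_c": s_c, "studentized": contrast / s_c}

# Rainfall class 2167.44–2239.06 mm from Table 5
row = wofe_weights(1333493, 3670, 3672666, 4981)
print({k: round(v, 3) for k, v in row.items()})
```

Classes with no landslide pixels return `None` in this sketch, whereas the table reports zeros for W+ and C in those rows.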
Table 6. Areal distribution of ensemble model landslide susceptibility maps (LSMs).

| Landslide Susceptibility Classes | WofE & RBF-SVM: Area (sq. km) | WofE & RBF-SVM: % of Area | WofE & Linear-SVM: Area (sq. km) | WofE & Linear-SVM: % of Area | WofE & Polynomial-SVM: Area (sq. km) | WofE & Polynomial-SVM: % of Area | WofE & Sigmoid-SVM: Area (sq. km) | WofE & Sigmoid-SVM: % of Area |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Low | 1071 | 34.0 | 1128 | 35.8 | 1095 | 34.8 | 1153 | 36.6 |
| Medium | 813 | 25.8 | 918 | 29.1 | 944 | 30.0 | 893 | 28.3 |
| High | 635 | 20.2 | 630 | 20.0 | 608 | 19.3 | 605 | 19.2 |
| Very High | 630 | 20.0 | 474 | 15.0 | 501 | 15.9 | 498 | 15.8 |
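Table 7. Mathematical Calculation of Qs Method of Ensemble LSMs.

| Ensemble Models | Classes | ai (sq. km) | si (sq. km) | DR | s | Qs |
| --- | --- | --- | --- | --- | --- | --- |
| WofE & RBF-SVM | Low | 1071.23 | 0.00 | 0.00 | 0.34 | 2.10 |
| | Medium | 812.95 | 0.12 | 0.10 | 0.26 | |
| | High | 635.02 | 0.93 | 1.07 | 0.20 | |
| | Very High | 629.80 | 3.26 | 3.78 | 0.20 | |
| WofE & Linear-SVM | Low | 1127.55 | 0.00 | 0.00 | 0.36 | 2.24 |
| | Medium | 917.57 | 0.34 | 0.27 | 0.29 | |
| | High | 630.04 | 1.13 | 1.32 | 0.20 | |
| | Very High | 473.84 | 2.84 | 4.37 | 0.15 | |
| WofE & Polynomial-SVM | Low | 1095.14 | 0.00 | 0.00 | 0.35 | 2.10 |
| | Medium | 944.15 | 0.34 | 0.26 | 0.30 | |
| | High | 608.44 | 1.13 | 1.36 | 0.19 | |
| | Very High | 501.27 | 2.84 | 4.13 | 0.16 | |
| WofE & Sigmoid-SVM | Low | 1153.40 | 0.00 | 0.00 | 0.37 | 2.18 |
| | Medium | 892.57 | 0.23 | 0.19 | 0.28 | |
| | High | 604.55 | 1.25 | 1.51 | 0.19 | |
| | Very High | 498.48 | 2.84 | 4.16 | 0.16 | |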
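The Qs values in Table 7 appear consistent with the usual quality-sum definition, Qs = Σ (DR_i − 1)² · s_i, where DR_i is the landslide density ratio of class i and s_i is its share of the total area. The sketch below assumes that definition, and the function names are hypothetical; it recomputes the WofE & RBF-SVM and WofE & Linear-SVM values from the tabulated columns.

```python
def density_ratios(class_areas, landslide_areas):
    """DR_i = (landslide share of class i) / (area share of class i); assumed definition."""
    total_area = sum(class_areas)
    total_slide = sum(landslide_areas)
    return [(s / total_slide) / (a / total_area)
            for a, s in zip(class_areas, landslide_areas)]

def quality_sum(ratios, area_shares):
    """Qs = sum((DR_i - 1)^2 * s_i) over the susceptibility classes."""
    return sum((dr - 1.0) ** 2 * s for dr, s in zip(ratios, area_shares))

# DR and s columns of Table 7 (Low, Medium, High, Very High)
qs_rbf = quality_sum([0.00, 0.10, 1.07, 3.78], [0.34, 0.26, 0.20, 0.20])
qs_linear = quality_sum([0.00, 0.27, 1.32, 4.37], [0.36, 0.29, 0.20, 0.15])
print(round(qs_rbf, 2), round(qs_linear, 2))

# Density ratios recomputed from the ai and si columns of the RBF-SVM ensemble
print(density_ratios([1071.23, 812.95, 635.02, 629.80], [0.00, 0.12, 0.93, 3.26]))
```

A higher Qs indicates that landslides are concentrated more strongly in the higher susceptibility classes, which is how the table ranks the four ensembles.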