Use of Very High-Resolution Optical Data for Landslide Mapping and Susceptibility Analysis along the Karnali Highway, Nepal

: The Karnali highway is a vital transport link and the only primary roadway that connects the remote Karnali region to the lowlands in Mid-Western Nepal. Every year there are reports of landslides blocking the road, making this area largely inaccessible. However, little e ﬀ ort has focused on systematically identifying landslides and landslide-prone areas along this highway. In this study, landslides were mapped with an object-based approach from very high-resolution optical satellite imagery obtained by the DigitalGlobe constellation in 2012 and PlanetScope in 2018. Landslides ranging from 10 to 30,496 m 2 were detected within a 3 km bu ﬀ er along the highway. Most of the landslides were located at lower elevations (between 500–1500 m) and on steep south-facing slopes. Landslides tended to cluster closer to the highway, near drainage channels and away from faults. Landslides were also most prevalent within the Kuncha Formation geologic class, and the forested and agricultural land cover classes. A susceptibility map was then created using a logistic regression methodology to highlight patterns in landslide activity. The landslide susceptibility map showed a good prediction rate with an area under the curve (AUC) of 0.90. A total of 33% of the study arealies in high / very high susceptibility zones. The map highlighted the lower elevated areas between Bangesimal and Manma towns with the Kuncha Formation geologic class as being the most hazardous. The banks of the Karnali River, its tributaries and areas near the highway were also highly susceptible to landslides. The results highlight the potential of very high-resolution optical imagery for documenting detailed spatial information on landslide occurrence, which enables susceptibility assessment in remote and data scarce regions such as the Karnali highway. m was created using logistic regression. The map has a good prediction rate, with an AUC of 0.90. Results indicate that approximately 33% of the study area lies in high / very high susceptibility zones. The road section between Bangesimal and Manma was found to be the most hazardous. This susceptibility map provides the ﬁrst estimates of highly susceptible areas to landslides along the Karnali highway, which can inform decisions about where to apply mitigation approaches, such as bioengineering. This method demonstrates the potential for conducting similar analyses in remote areas, providing the ﬁrst step towards hazard and risk estimates and can be expanded to other landslide-prone regions around the country.


Introduction
The mountains in Nepal are one of the most hazardous environments in the world, with frequent landslides caused by tectonic activity, monsoonal rainfall and infrastructure development [1]. About 72% of the country is hilly or mountainous, with 50% of the total population residing in these areas [2]. Long-term development and economic prosperity of this region is contingent on the availability and reliability of roads for access to infrastructure such as marketplaces, schools, and hospitals [3]. As a result, understanding the frequency, distribution, and susceptibility of landslides along Nepal's main transportation corridors is vital for better characterizing the impact that landslides may impose on the population within this region. Landslide susceptibility maps provide an estimate of where landslides might occur, typically based on knowledge of landslides that have occurred in the past [8]. There are many different methods for conducting landslide susceptibility mapping depending on the objectives of the resulting map [9]. Landslide susceptibility mapping can be categorized as heuristic, deterministic and statistical [10]. Heuristic methods are based on expert knowledge, hence are subjective. Deterministic approaches require a large number of input data gathered from laboratory tests and field visits, and can only be applied in smaller areas [11]. Statistical approaches are based on relationships between known or inferred instability variables with past and present distributions of landslides [12]. Similarly in Nepal, various landslide susceptibility mapping methods have been tested and their performances have been documented [13][14][15][16][17][18][19][20][21][22][23][24][25][26][27][28][29]. Most of these studies have been conducted in Central and Eastern Nepal, possibly due to the lack of landslide data in the Western part of the country. Landslide susceptibility maps provide an estimate of where landslides might occur, typically based on knowledge of landslides that have occurred in the past [8]. There are many different methods for conducting landslide susceptibility mapping depending on the objectives of the resulting map [9]. Landslide susceptibility mapping can be categorized as heuristic, deterministic and statistical [10]. Heuristic methods are based on expert knowledge, hence are subjective. Deterministic approaches require a large number of input data gathered from laboratory tests and field visits, and can only be applied in smaller areas [11]. Statistical approaches are based on relationships between known or inferred instability variables with past and present distributions of landslides [12]. Similarly in Nepal, various landslide susceptibility mapping methods have been tested and their performances have been documented [13][14][15][16][17][18][19][20][21][22][23][24][25][26][27][28][29]. Most of these studies have been conducted in Central and Eastern Nepal, possibly due to the lack of landslide data in the Western part of the country. The most important information required for susceptibility mapping is an inventory of past landslide occurrences. Conventional landslide inventories are produced by field visits and aerial photo interpretation [30]. This process is very time-consuming and can be subjective [31,32]. Recent advancements in remote-sensing technologies have significantly increased our ability to rapidly map landslides of various sizes, with less in situ surveys or human interaction [33][34][35]. Remote sensing of landslides can be categorized into two groups: pixel-based and object-based image analysis (OBIA). Pixel-based methods utilize spectral information alone, and ignore spatial information [36]. Hence, these are not the best method to map geomorphic processes such as landslides [37]. OBIA, on the other hand, incorporates spectral, textural, morphological and topographical characteristics, which are more suitable for detecting landslides [38]. OBIA has been successfully used to map landslides all around the world [37,[39][40][41][42][43][44][45][46][47][48][49][50][51][52][53][54][55][56][57]. Comparative studies between pixel-based methods and OBIA have highlighted the superiority of OBIA in detecting landslides with fewer false positives [58,59].
Landsat 8, with a spatial resolution of 30 m, has been found effective in mapping larger landsides for rapid assessment in Nepal [60][61][62][63]. The launch of Sentinel-2, with a spatial resolution of 10 m in 2015, has also increased the availability of free high-resolution optical imagery and enabled landslide detection at finer scales than what was possible with previous open source satellite imagery from Landsat and ASTER. However, hazard and risk studies require a complete landslide inventory, and freely available optical imagery from sensors such as Sentinel-2, Landsat and ASTER are not capable of detecting small landslides with areas of less than 100 m 2 . As a result, very high-resolution (VHR) imagery is the only space-based option for a systematic and comprehensive landslide mapping [64]. VHR satellites are commercially owned, expensive and not freely available, except when disaster charters (www.disasterscharter.org) are activated. NextView licensing agreement, a partnership between the U.S. government and U.S. commercial vendors, provides access to VHR imagery for federal agencies in support of scientific research [65]. This partnership provides access to VHR optical imagery obtained from the DigitalGlobe (DG) (Westminster, Colorado, USA) constellation. The Small Satellite Data Buy program [66] is another initiative started by NASA to investigate the effectiveness of VHR imagery in support of research activities undertaken by the agency. This program provides access to additional VHR optical imagery from three satellites: SkySat, RapidEye and PlanetScope, operated by Planet Labs Inc. (San Francisco, California, USA) for NASA funded projects.
The aim of this study is to create multi-temporal landslide inventories along the Karnali highway using VHR imagery from DG and PlanetScope with the OBIA methodology. The effectiveness of this remote sensing based landslide inventory is demonstrated by producing a susceptibility map, documenting its predictive capacity and potential applications.

Study Area
The Karnali highway ( Figure 1) in Province 6 of Nepal runs south to north starting at Bangesimal in the Surkhet District and ending in Khalanga, of the Jumla District. This highway mostly runs parallel to the Karnali river between Bangesimal and Manma. Beyond Manma, it follows the Tila river, a tributary of the Karnali river. Since we are interested in landslide activity that might affect the highway, we defined a buffer of 3 km along the highway as our study area. The route is 233 km long and paved. Construction of the highway started in 1991/1992. However, the road was finally opened along the entire route to Khalanga only on 1 April 2007.
Nepal is divided into five geotectonic zones from south to north: the Gangetic Plain, Siwaliks, Lesser Himalaya, Higher Himalaya and Tibetan-Tethys Himalaya. These zones are separated from each other by the thrust faults. The Neogene Siwaliks consists of mudstone, sandstone and conglomerate. The Lesser Himalayan rocks mainly consist of low to medium grade Proterozoic metamorphic rocks (e.g., phyllite, schist, metasandstone and quartzite) of the Kuncha Formation along with metasedimentary rocks of the Nawakot Complex that extend from MBT at the southern margin to MCT at the northern margin. The granitic intrusions are also present within the Lesser Himalayas, such as the Ulleri Gneiss. The Gondwana rocks and Post-Gondwana Eocene-Early Miocene sediments of the Tansen group are present sporadically throughout the Lesser Himalayas. The Lesser Himalayan Crystalline, consisting of Proterozoic gneisses and schists, are present in the Lesser Himalayas as klippe.

Satellite Imagery
Multispectral data from three satellites of the DG constellation: GeoEye-1 (GE01), QuickBird-2 (QB02) and WorldView-2 (WV02) ( Table 1) and PlanetScope [68] were used for creating the landslide inventories. The DG satellites acquire images over a certain area only when it is tasked and at various off-nadir angles. VHR imagery tends to be best if the off-nadir angle is <20° [64]. Images acquired with higher off-nadir angles will suffer from image distortion and result in lower georeferencing accuracies. We queried the DG archives for images with off-nadir angle <20° and cloud cover <20%. The satellite coverage over this area was intermittent, with nearly complete coverage available only for the year 2012. In order to complete the coverage, we incorporated one frame from 2010, two

Satellite Imagery
Multispectral data from three satellites of the DG constellation: GeoEye-1 (GE01), QuickBird-2 (QB02) and WorldView-2 (WV02) ( Table 1) and PlanetScope [68] were used for creating the landslide inventories. The DG satellites acquire images over a certain area only when it is tasked and at various off-nadir angles. VHR imagery tends to be best if the off-nadir angle is <20 • [64]. Images acquired with higher off-nadir angles will suffer from image distortion and result in lower georeferencing accuracies.
We queried the DG archives for images with off-nadir angle <20 • and cloud cover <20%. The satellite coverage over this area was intermittent, with nearly complete coverage available only for the year 2012. In order to complete the coverage, we incorporated one frame from 2010, two frames from 2011 and two frames from 2013 ( Figure 1). As most of the images are from 2012, we will consider landslides detected using DG as a 2012 inventory. DG Level 1B products are available through the NextView license [65]. DG level 1B data are radiometrically and sensor corrected, but not projected to a plane using map projection or datum [69]. We orthorectified the data using a 30 m NASADEM [70] and then converted into top-of-the-atmosphere reflectance for normalized difference vegetation index (NDVI) calculations. All the processing was done using Polar Geospatial Center's orthorectification tools [71] on the Advanced Data Analytics Platform (ADAPT) at NASA Goddard Space Flight Center (GSFC)'s NASA Center for Climate Simulation (NCCS) (http://www.nccs.nasa.gov/services/adapt). In order to map recent landslides, we created another landslide inventory (referred to as the 2018 inventory) using 22 PlanetScope tiles from 11 and 12 November 2018, covering the whole highway. We used Level 3B PlanetScope Ortho Scene products at 3 m resolution with tile size of 24 km by 7 km.

Landslide Explanatory Variables
The explanatory variables ( Figure 3) used in this study can be categorized into four main groups: topographical, geological, hydrological and anthropogenic. These variables were generated in raster format with spatial resolution of 30 m in ArcGIS.
Topographical variables considered in this study are elevation, slope and aspect and were generated using a 30 m NASADEM. Elevation controls most of the geomorphological and geological processes and ranges across the study region from 341 to 3753 m. The elevation was classified into eight classes ( Figure 3a). Slope is often considered the most important factor contributing to landslides [72] and varies from 0 to 75 • over the study area. The slope was classified into eight classes ( Figure 3b). Slope aspect determine the amount of insolation and rainfall received due to prevailing climatologic patterns, which can have a substantial influence on landslide triggering [73]. The slope aspect was prepared with values ranging from 0 to 360 and further classified into nine classes ( Figure 3c).
Geology plays an important role in the occurrences of landslides as lithological and structural variations often lead to a difference in rock strength and permeability of soils and rocks [74,75]. Geological variables were obtained from digitized geological map from the Department of Mines and Geology in Nepal [67]. The Karnali highway passes through seven geological classes ( Figure 3d). The main geological structures that demarcate the study area are the MBT and the Mahabharat Thrust (MT) (Figure 2). These fault lines were digitized from the geological map. Distance to faults were created using the Euclidean distance toolset in ArcGIS and classified into six classes (Figure 3e).
The hydrological variable was represented by distance to drainage, since runoff adversely affects stability by eroding the slope or by saturating the lower portion of the hillslope [76]. Drainage was  Anthropogenic variables were represented by land cover and distance to highway. Human activities and road construction along steep mountain terrain destabilize the slope, which can increase the frequency of landslides [77]. Land cover with seven land cover types (Figure 3g) was obtained from the 2010 land cover map created from Landsat at 30 m resolution by The International Centre for Integrated Mountain Development (ICIMOD) [78]. Road networks data were obtained from OpenStreetMap [79]. The distance to highway was created using the Euclidean distance method and classified into six classes (Figure 3h).

Landslide Inventory Mapping
The methodology for landslide mapping is outlined in Figure 4. Imagery used in this study have different dates and acquisition modes. Hence, the methodology described below has been applied to individual tiles of DG and PlanetScope separately. The steps are described in sequential order. subsequent steps [83]. MRS requires three parameters: scale, shape and compactness. The scale controls the image object size, with higher scale resulting in larger objects and small scale in smaller objects. The shape determines the degree to which shape influences segmentation vs. spectral homogeneity. The compactness defines the weight of compactness criteria. The higher the value, more compact the objects will be. There are many methods for selecting the optimal scale automatically, such as estimation of scale parameter 2 [84], plateau objective function [42] and optimal scale parameter selector [85]. However, currently there is no standardized or best method for optimal scale estimation [86]. Manual trial and error selection of scale parameter is time consuming. Different scales were tested for segmentation, but obtaining a scale that accounts for different sizes of landslides present in the region was very difficult. Generally, over-segmentation is preferred to under-segmentation as merging is possible in later steps [37]. Hence, to ensure that the boundaries of the smallest landslide areas were derived, we used a small scale parameter of 10 with shape 0.1 and compactness 0.9 [37,57].

Segmentation
The first important procedure is segmentation, which divides an image into objects based on the homogeneity of the pixel values [80]. We used multiresolution segmentation (MRS) [81] available in eCognition®Developer 8.9 (Trimble Germany GmbH, Munich, Germany) software [82]. MRS is a bottom-up region-merging technique in which small objects are merged into bigger ones in subsequent steps [83]. MRS requires three parameters: scale, shape and compactness. The scale controls the image object size, with higher scale resulting in larger objects and small scale in smaller objects. The shape determines the degree to which shape influences segmentation vs. spectral homogeneity. The compactness defines the weight of compactness criteria. The higher the value, more compact the objects will be. There are many methods for selecting the optimal scale automatically, such as estimation of scale parameter 2 [84], plateau objective function [42] and optimal scale parameter selector [85]. However, currently there is no standardized or best method for optimal scale estimation [86]. Manual trial and error selection of scale parameter is time consuming. Different scales were tested for segmentation, but obtaining a scale that accounts for different sizes of landslides present in the region Remote Sens. 2019, 11, 2284 8 of 23 was very difficult. Generally, over-segmentation is preferred to under-segmentation as merging is possible in later steps [37]. Hence, to ensure that the boundaries of the smallest landslide areas were derived, we used a small scale parameter of 10 with shape 0.1 and compactness 0.9 [37,57].

Selection of Likely Landslide Candidate Objects
In this study, landslides and false positives (non-landslide areas appearing/initially classified as landslides) were detected first, with sequential elimination of false positives in the subsequent step. A NDVI threshold was used to separate likely landslide candidate objects from other vegetated surfaces. This threshold is selected according to Martha et al. [41] using K-means clustering [87]. K-means clustering finds cluster centers in continuous data, which can be used to set thresholds for identification of landslides and removal of false positives. One requirement of this method is that the number of clusters (K) must be predefined. We used the Elbow method to calculate the number of clusters (Figure 5a). This method was implemented in Python. In the Elbow method, a graph is plotted between within-cluster sum of squares (WCSS), which is the distance between each member of the cluster and its centroid and number of clusters. The main idea behind K-means clustering is to group data such that the WCSS is at a minimum. The location of bend (elbow) in the plot indicates the appropriate number of clusters beyond which increasing the number of clusters will not result in a decrease of WCSS. The elbow point was then determined automatically using the Elbow point detection method [88,89]. Using the obtained number of clusters, a NDVI threshold was obtained using K-means clustering to select likely landslide candidate objects (Figure 5b).

Selection of Likely Landslide Candidate Objects
In this study, landslides and false positives (non-landslide areas appearing/initially classified as landslides) were detected first, with sequential elimination of false positives in the subsequent step. A NDVI threshold was used to separate likely landslide candidate objects from other vegetated surfaces. This threshold is selected according to Martha et al. [41] using K-means clustering [87]. Kmeans clustering finds cluster centers in continuous data, which can be used to set thresholds for identification of landslides and removal of false positives. One requirement of this method is that the number of clusters (K) must be predefined. We used the Elbow method to calculate the number of clusters (Figure 5a). This method was implemented in Python. In the Elbow method, a graph is plotted between within-cluster sum of squares (WCSS), which is the distance between each member of the cluster and its centroid and number of clusters. The main idea behind K-means clustering is to group data such that the WCSS is at a minimum. The location of bend (elbow) in the plot indicates the appropriate number of clusters beyond which increasing the number of clusters will not result in a decrease of WCSS. The elbow point was then determined automatically using the Elbow point detection method [88,89]. Using the obtained number of clusters, a NDVI threshold was obtained using K-means clustering to select likely landslide candidate objects (Figure 5b).

Removal of False Positives from Landslides
Since NDVI was used as a distinguishing feature, objects with similar or lower NDVI values, such as shadows, rivers, roads, buildings, agricultural and barren lands, are misclassified as landslides. Combination of local knowledge, K-means clustering and spatial datasets were used for elimination of these false positives in sequential order.
Candidate objects belonging to shadow class were separated using brightness and hillshade. The hillshade map was generated using solar altitude and azimuth information at the time of image acquisition in ArcGIS. Shadows have low brightness and hillshade. K-means clustering was then used to obtain thresholds for brightness and hillshade. River objects were separated using stream order obtained from Strahler's method [90]. Stream order >5 represents perennial flowing water body. Candidate objects that intersected high stream orders were assigned to river class. As water exhibits negative NDVI, a threshold of NDVI < 0 was also used as additional criteria to separate river. Alluvial sands are present closer to river channels and are brighter than landslides. Hence, candidate objects belonging to this class were separated using the distance from river and higher brightness values compared to surrounding gorges. Distance to river and brightness threshold were set manually

Removal of False Positives from Landslides
Since NDVI was used as a distinguishing feature, objects with similar or lower NDVI values, such as shadows, rivers, roads, buildings, agricultural and barren lands, are misclassified as landslides. Combination of local knowledge, K-means clustering and spatial datasets were used for elimination of these false positives in sequential order.
Candidate objects belonging to shadow class were separated using brightness and hillshade. The hillshade map was generated using solar altitude and azimuth information at the time of image acquisition in ArcGIS. Shadows have low brightness and hillshade. K-means clustering was then used to obtain thresholds for brightness and hillshade. River objects were separated using stream order obtained from Strahler's method [90]. Stream order >5 represents perennial flowing water body. Candidate objects that intersected high stream orders were assigned to river class. As water exhibits negative NDVI, a threshold of NDVI < 0 was also used as additional criteria to separate river. Alluvial sands are present closer to river channels and are brighter than landslides. Hence, candidate objects belonging to this class were separated using the distance from river and higher brightness Remote Sens. 2019, 11, 2284 9 of 23 values compared to surrounding gorges. Distance to river and brightness threshold were set manually according to local conditions. Candidate objects that intersected the roads vector layer were separated and assigned to the road class. Agricultural land such as terrace farming was separated based on low slope and texture values, which was calculated using the gray level co-occurrence matrix (GLCM) [37]. Mean GLCM of the red band was calculated using Haralick's method [91]. The mean GLCM threshold was obtained from K-means clustering. Buildings were removed using shape criteria (i.e., small area and rectangular fit). Barren areas appeared darker than landslides, so a manual brightness threshold was selected to differentiate objects belonging to the barren class from landslide objects. A visual check to remove remaining obvious false positives was achieved in under 15 min. The remaining objects were merged and exported from eCognition in shapefiles as final landslides for validation and further analysis in ArcGIS.

Landslide Susceptibility Mapping
In this study we used logistic regression to calculate susceptibility, which allows for a multivariate regression between a dependent variable and several independent variables [92]. In landslide susceptibility mapping, the dependent variable is binary, representing the presence or absence of landslides, and the independent variables are landslide explanatory variables. The independent variables can be continuous, categorical or a combination of both [93]. The generalized linear model function in R software was used to fit the logistic regression model. In logistic regression, the logit function converts the probabilities into values from 0 to 1. The function that defines probability of landslide occurrence (P) is defined as: and where, x i are the explanatory variables considered by the model, b 0 is the intercept and b i the coefficients assigned to each explanatory variable x i . Figure 6 shows two example areas for the OBIA-detected landslides from DG and PlanetScope images. In order to validate the OBIA-detected landslides, the inventory was compared with a reference inventory that was compiled manually, using one WV02 image from 8 October 2012 (red square in Figure 1). Three metrics were calculated: true positive (TP), false negative (FN), and false positive (FP). These metrics were not based on the number of landslides because segmentation-derived image objects rarely correspond to single landslides due to over or under-segmentation [53,94]. Instead, the performance metrics were determined according to the overlapping area. TPs are correctly mapped landslides; FPs are detected landslides that have not been mapped in the reference inventory and FNs are reference landslides not identified by OBIA. Based on these metrics, the two accuracy indices, producer accuracy (PA) also known as detection percentage, and user accuracy (UA) were calculated as follows: PA = TP TP + FN It was observed that the OBIA-based method was only successful in obtaining 59% of the area of the reference inventory (Table 2). When evaluating based on the intersection of any portion of the reference and OBIA landslides, a PA of 98% was obtained. This suggests that nearly all landslides were detected using OBIA, but that the areas were not always accurate and had the tendency to be underestimated.  Figure 7 shows the location and size of OBIA-based landslides within the 980 km 2 Karnali highway study area. The size of the landslides varied from 10 to 30,496 m 2 . As OBIA was not successful in delineating the complete area of individual landslides, and also to avoid spatial autocorrelation of landslide samples during susceptibility modelling [95,96], we converted the detected landslides into initiation points using a Digital Elevation Model (DEM). The initiation point was assumed to be the highest elevation on the landslide boundary. In total, 1061 landslide initiation points were generated. A total of 993 landslide initiation points were generated from landslide areas It was observed that the OBIA-based method was only successful in obtaining 59% of the area of the reference inventory (Table 2). When evaluating based on the intersection of any portion of the reference and OBIA landslides, a PA of 98% was obtained. This suggests that nearly all landslides were detected using OBIA, but that the areas were not always accurate and had the tendency to be underestimated.  Figure 7 shows the location and size of OBIA-based landslides within the 980 km 2 Karnali highway study area. The size of the landslides varied from 10 to 30,496 m 2 . As OBIA was not successful in delineating the complete area of individual landslides, and also to avoid spatial autocorrelation of landslide samples during susceptibility modelling [95,96], we converted the detected landslides into initiation points using a Digital Elevation Model (DEM). The initiation point was assumed to be the highest elevation on the landslide boundary. In total, 1061 landslide initiation points were generated. A total of 993 landslide initiation points were generated from landslide areas obtained from DG images    (Figure 8c) that have high insolation and evaporation rates, resulting in less vegetated surfaces [73]. These slopes are also on the windward side, which tends to receive more precipitation compared to leeward slopes and might result in more landslide activity. The phyllite, schist, metasandstone and quartzite of the Kuncha Formation geologic class contains the majority of the landslides (Figure 8d). The phyllite of the Kuncha Formation are moderately to highly weathered and fractured [97], which might lead to increased landsliding. Landslides are located at a distance of 10-15 km from faults (Figure 8e), with the dominant fault zones oriented east-west. The effects of runoff and undercutting by the river are highlighted by a higher number of landslides within 500 m of drainages (Figure 8f). Comparison with the 2010 land cover map reveals landslides predominantly occur in forest and agriculture land cover types (Figure 8g), which might be due to destruction of forest for settlement and agriculture, as the development of the highway continued throughout the years. A higher number of landslides within 100 m of the highway (Figure 8h) suggests landslide initiation could be exacerbated due to infrastructure development associated with the highway.   (Figure 8b). Landslides are predominantly distributed within south-facing slopes (Figure 8c) that have high insolation and evaporation rates, resulting in less vegetated surfaces [73]. These slopes are also on the windward side, which tends to receive more precipitation compared to leeward slopes and might result in more landslide activity. The phyllite, schist, metasandstone and quartzite of the Kuncha Formation geologic class contains the majority of the landslides (Figure 8d). The phyllite of the Kuncha Formation are moderately to highly weathered and fractured [97], which might lead to increased landsliding. Landslides are located at a distance of 10-15 km from faults (Figure 8e), with the dominant fault zones oriented east-west. The effects of runoff and undercutting by the river are highlighted by a higher number of landslides within 500 m of drainages (Figure 8f). Comparison with the 2010 land cover map reveals landslides predominantly occur in forest and agriculture land cover types (Figure 8g), which might be due to destruction of forest for settlement and agriculture, as the development of the highway continued throughout the years. A higher number of landslides within 100 m of the highway (Figure 8h) suggests landslide initiation could be exacerbated due to infrastructure development associated with the highway. Remote Sens. 2019, 12, x FOR PEER REVIEW 12 of 23

Landslide Susceptibility and Validation
Different approaches exist for separating the landslide inventory into a training and validation data set. It is commonly suggested that approximately 20% of the data selected at random should be used to validate the result [95]. In this study, 75% of the landslide points selected randomly were

Landslide Susceptibility and Validation
Different approaches exist for separating the landslide inventory into a training and validation data set. It is commonly suggested that approximately 20% of the data selected at random should be used to validate the result [95]. In this study, 75% of the landslide points selected randomly were used for training the logistic regression model and 25% were held for validation. An equal number of non-landslide points were also used during the training phase.
Eight landslide explanatory variables were considered for logistic regression. Aspect, geology and land cover are categorical variables and elevation, slope, distance to faults, distance to highway and distance to drainage are continuous variables. All the explanatory variables were classified into different classes with nominal values. In this study, each of the classes within the variables were represented using landslide densities and these values were used for logistic regression. Use of landslide densities also allows for the representation of explanatory variables on the same scale, which enables us to estimate the effect of each variables on landslide occurrence [98]. Landslide density [99] was calculated as: where Bi is the number of landslides within a class, Ai is number of pixels in a class, B is the total number of landslides and A is total number of pixels within the study area. Table 3 shows the regression coefficients along with test statistics of the eight explanatory variables. Standard error is the upper and lower limits of the coefficient. The z value is the ratio of regression coefficient to standard error. Pr(>|z|) is the significance. From the analysis of coefficients and test statistics, it is seen that all variables except for distance to drainage have a prominent role in landslide formation, highlighted by the positive coefficient and significance at 0.05 level. Slope with the highest coefficient has a higher effect on landslide formation than any other variable. The susceptibility values obtained from logistic regression are probabilities on a continuous scale from 0 to 1. These values are classified into five levels: very low, low, moderate, high and very high ( Figure 9) using the natural breaks algorithm [100], which groups similar values together maximizing the difference between classes. Table 4 shows the percentage of landslide susceptibility classes for the whole study area. A total of 53.58% of the area lies in the very low/low susceptibility zones, whereas 32.97% of the study area lies in the high/very high susceptibility zones. The map shows the lower elevation areas between Bangesimal and Manma towns (Figure 1), with the Kuncha Formation geologic class being the most hazardous. The banks of the Karnali River, its tributaries and areas near to the highway are also highly susceptible to landslides.
The landslide susceptibility map was verified with the receiver operating characteristic (ROC) curve statistics, a useful method for representing the quality of deterministic and probabilistic detection and forecast systems [101]. The ROC curve is created by plotting the true positive rate and false positive rate of each possible binary classification of a dataset [102]. The area under the curve (AUC) indicates the performance of the model. Its value ranges from 0 to 1, where 1 indicates a perfect model fit and 0.5 indicates that the model does not perform any better than random chance. A total of 259 landslides and an equal number of non-landslide points not used for training the logistic regression model were used for validating the landslide susceptibility map. Results of the ROC analysis shown in Figure 10 give an AUC value of 0.90, which is higher than the 0.7 suggested for a successful prediction [103].  The landslide susceptibility map was verified with the receiver operating characteristic (ROC) curve statistics, a useful method for representing the quality of deterministic and probabilistic detection and forecast systems [101]. The ROC curve is created by plotting the true positive rate and false positive rate of each possible binary classification of a dataset [102]. The area under the curve (AUC) indicates the performance of the model. Its value ranges from 0 to 1, where 1 indicates a perfect  model fit and 0.5 indicates that the model does not perform any better than random chance. A total of 259 landslides and an equal number of non-landslide points not used for training the logistic regression model were used for validating the landslide susceptibility map. Results of the ROC analysis shown in Figure 10 give an AUC value of 0.90, which is higher than the 0.7 suggested for a successful prediction [103].

Discussion
This study represents the first time VHR imagery was applied using the OBIA methodology to map landslides along the Karnali highway. This method was only successful in mapping 59% of the landslide areas relative to the reference inventory. Upon close investigation (Figure 11), it was noted that the OBIA based method was successful in delineating landslide scarps, but was unable to detect long narrow debris trails due to the lack of distinct spectral properties. A similar result and issue was reported while mapping landslide hotspots in New Zealand using OBIA [94]. Error may have been introduced by using the 30 m DEM, the single segmentation scale used for identification of the landslide candidates, and the removal of false positives in this study. The OBIA detected landslide inventory accuracy decreases gradually with the reduction of DEM resolution [104]. Most of the studies using OBIA for landslide detection utilized a high-resolution DEM (10 m and higher) created from sources such as stereo pairs [37], LiDAR [47] and contours [38]. Another factor contributing to the error might be the single segmentation scale used in this study. Multiple scale-based false positive identifications significantly improves the overall accuracy [42]. However, creating a DEM and using multiple segmentation scale OBIA landslide detection for areas as large as the entire Karnali highway is a challenge.

Discussion
This study represents the first time VHR imagery was applied using the OBIA methodology to map landslides along the Karnali highway. This method was only successful in mapping 59% of the landslide areas relative to the reference inventory. Upon close investigation (Figure 11), it was noted that the OBIA based method was successful in delineating landslide scarps, but was unable to detect long narrow debris trails due to the lack of distinct spectral properties. A similar result and issue was reported while mapping landslide hotspots in New Zealand using OBIA [94]. Error may have been introduced by using the 30 m DEM, the single segmentation scale used for identification of the landslide candidates, and the removal of false positives in this study. The OBIA detected landslide inventory accuracy decreases gradually with the reduction of DEM resolution [104]. Most of the studies using OBIA for landslide detection utilized a high-resolution DEM (10 m and higher) created from sources such as stereo pairs [37], LiDAR [47] and contours [38]. Another factor contributing to the error might be the single segmentation scale used in this study. Multiple scale-based false positive identifications significantly improves the overall accuracy [42]. However, creating a DEM and using multiple segmentation scale OBIA landslide detection for areas as large as the entire Karnali highway is a challenge.
A total of 1161 initiation points were obtained based on landslide areas delineated from VHR imagery. In this study, we did not address amalgamation (i.e., mapping of several adjacent landslides as a single landslide) that might have occurred during merging and exporting from eCognition. However, amalgamation must be addressed if landslide inventories obtained using OBIA are to be utilized for studies where individual characteristics of a landslide are of importance, such as assessment of area-frequency distributions and estimation of landslide volumes [105,106]. A hybrid approach that combines OBIA with manual improvement could streamline the whole mapping process with acceptable accuracy, and reduce the time and effort needed for generating landslide inventories [94]. In mountainous areas, shadow is a major problem. Landslides that lie within shadows cannot be detected using optical imagery. Hence, field validation and inclusion of those landslides missed, because of this issue and the OBIA methodology in general, must be prioritized. A total of 1161 initiation points were obtained based on landslide areas delineated from VHR imagery. In this study, we did not address amalgamation (i.e., mapping of several adjacent landslides as a single landslide) that might have occurred during merging and exporting from eCognition. However, amalgamation must be addressed if landslide inventories obtained using OBIA are to be utilized for studies where individual characteristics of a landslide are of importance, such as assessment of area-frequency distributions and estimation of landslide volumes [105,106]. A hybrid approach that combines OBIA with manual improvement could streamline the whole mapping process with acceptable accuracy, and reduce the time and effort needed for generating landslide inventories [94]. In mountainous areas, shadow is a major problem. Landslides that lie within shadows cannot be detected using optical imagery. Hence, field validation and inclusion of those landslides missed, because of this issue and the OBIA methodology in general, must be prioritized.
Based on initiation points, a landslide susceptibility map was produced using logistic regression. Although a good prediction rate (AUC value of 0.90) was achieved, some inherent problems related to the statistical method remain. The landslide inventory contains all types of landslides. Hence, the susceptibility map produced is highly generalized. Training the model for each landslide type might result in a slightly different susceptibility map that may be more suitable for local scale analysis. Use of initiation points may lead to underestimation of the hazard, as it does not consider size or runout of the landslide. Similarly, small landslides are assigned the same weight as larger landslides, which may result in a shift towards higher susceptibility further up on the slope [107]. Logistic regression is based on the assumption that past combinations of explanatory variables that have resulted in landslides in some area holds true in other areas as well. However, finding the optimal combination of variables is not always straightforward [108]. The training data only included landslide information from a 3 km buffer along the highway, which might not sufficiently capture all variable combinations responsible for landslide occurrence. Thus, some critical slopes might not be identified as highly susceptible zones. In order to avoid these scenarios, continuous monitoring and updating of the susceptibility map must be prioritized. Nevertheless, this study shows that remote sensing and Based on initiation points, a landslide susceptibility map was produced using logistic regression. Although a good prediction rate (AUC value of 0.90) was achieved, some inherent problems related to the statistical method remain. The landslide inventory contains all types of landslides. Hence, the susceptibility map produced is highly generalized. Training the model for each landslide type might result in a slightly different susceptibility map that may be more suitable for local scale analysis. Use of initiation points may lead to underestimation of the hazard, as it does not consider size or runout of the landslide. Similarly, small landslides are assigned the same weight as larger landslides, which may result in a shift towards higher susceptibility further up on the slope [107]. Logistic regression is based on the assumption that past combinations of explanatory variables that have resulted in landslides in some area holds true in other areas as well. However, finding the optimal combination of variables is not always straightforward [108]. The training data only included landslide information from a 3 km buffer along the highway, which might not sufficiently capture all variable combinations responsible for landslide occurrence. Thus, some critical slopes might not be identified as highly susceptible zones. In order to avoid these scenarios, continuous monitoring and updating of the susceptibility map must be prioritized. Nevertheless, this study shows that remote sensing and the OBIA methodology is valuable for detecting landslides in a short amount of time in order to systematically characterize landslide pattern and improve our ability to identify susceptible zones in remote regions such as the Karnali highway in Mid-Western Nepal.
This study established landslide locations and areas along the Karnali highway for the first time. Approximately 33% of the study area lies in high/very high susceptibility zones. The road section that lies between Bangesimal and Manma towns is highly susceptible to landslides. This spatial and areal distribution will be helpful for decision makers to focus on locations with higher densities of landslide activity. There may be additional opportunities to further characterize slow moving landslides by considering displacement rates obtained from synthetic aperture radar (SAR) [49,109]. However, in mountainous regions such as Karnali, shadow, layover and decorrelation due to vegetation and snow, are factors limiting the applicability of SAR [110]. The 30 m susceptibility map provides estimates of locations where landslides might initiate and does not model landslide flow and deposition. Thus, it cannot be used for land-use planning, emergency response and engineering decisions directly. It can be used to provide a general overview of landslide hazard in the area and to delineate areas that might benefit from low-cost slope stabilization measures such as bioengineering [111,112]. These preventive measures not only help minimize landslide occurrence but also protect the population from its cascading impacts. While the susceptibility map proposed here is static, it can be used with dynamic variables such as satellite precipitation [113] to create a hazard map and identify sections of the highway that may be more likely to be exposed to landslides each season or with particular extreme rain events. However, this would require improved temporal information on when landslides occurred. Continuous monitoring of landslides and their evolution in time may be possible with recent availability of free high-resolution imagery from Sentinel-2 however, the resolution remains too coarse to resolve smaller landslide events. Additional information on population from Landscan [114] or Gridded Population of the World [115] can be used to better understand the potential exposure and risks to populations within the region.

Conclusions
A landslide inventory was created from VHR imagery in 2012 and 2018 using OBIA within a 3 km buffer of the Karnali highway. To our knowledge, this is the first landslide inventory in this area. The OBIA method was able to identify 59% of the landslide area obtained from manual mapping. The primary reason for the mismatch in landslide area was due to the challenge in detecting landslide tail or runout. Using landslide initiation points derived from the OBIA-based inventory area as training data, a landslide susceptibility map with a spatial resolution of 30 m was created using logistic regression. The map has a good prediction rate, with an AUC of 0.90. Results indicate that approximately 33% of the study area lies in high/very high susceptibility zones. The road section between Bangesimal and Manma was found to be the most hazardous. This susceptibility map provides the first estimates of highly susceptible areas to landslides along the Karnali highway, which can inform decisions about where to apply mitigation approaches, such as bioengineering. This method demonstrates the potential for conducting similar analyses in remote areas, providing the first step towards hazard and risk estimates and can be expanded to other landslide-prone regions around the country.