Forecasting of Landslides Using Rainfall Severity and Soil Wetness: A Probabilistic Approach for Darjeeling Himalayas

: Rainfall induced landslides are creating havoc in hilly areas and have become an important concern for the stakeholders and public. Many approaches have been proposed to derive rainfall thresholds to identify the critical conditions that can initiate landslides. Most of the empirical methods are deﬁned in such a way that it does not depend upon any of the in situ conditions. Soil moisture plays a key role in the initiation of landslides as the pore pressure increase and loss in shear strength of soil result in sliding of soil mass, which in turn are termed as landslides. Hence this study focuses on a Bayesian analysis, to calculate the probability of occurrence of landslides, based on di ﬀ erent combinations of severity of rainfall and antecedent soil moisture content. A hydrological model, called Syst è me Hydrologique Europ é en Transport (SHETRAN) is used for the simulation of soil moisture during the study period and event rainfall-duration (ED) thresholds of various exceedance probabilities were used to characterize the severity of a rainfall event. The approach was used to deﬁne two-dimensional Bayesian probabilities for occurrence of landslides in Kalimpong (India), which is a highly landslide susceptible zone in the Darjeeling Himalayas. The study proves the applicability of SHETRAN model for simulating moisture conditions for the study area and delivers an e ﬀ ective approach to enhance the prediction capability of empirical thresholds deﬁned for the region.


Introduction
Rainfall induced landslides are among the major natural disasters in hilly terrains, with an alarmingly high frequency of occurrence [1]. Such incidents are becoming common in Indian Himalayas during monsoon seasons and are creating a serious threat to lives and properties. The need for forecasting landslides induced by rainfall is vital for minimizing losses and managing the hazard. Forecasting landslides can reduce the causalities by providing a warning to the officials and public in general. For an effective warning system to work, it is crucial to identify the rainfall threshold conditions, which can initialize a landslide event.
The threshold can be set on process or empirical bases [2]. Process/physically based thresholds require detailed field investigations and continuous monitoring of physical parameters. The hydrological, geological, geomorphological, climatic conditions, and the physical processes can trigger landslides [3][4][5]. The physically based thresholds work on integrated hydrological model and slope stability analysis while empirical thresholds are developed based on rainfall and occurrence of landslide in the past only [6]. Hence empirical thresholds are often followed by practitioners [2,[7][8][9][10], as it bypasses the difficulties in understanding and evaluating the physical processes involved in the initiation of landslides [6,[11][12][13]. The threshold defines a critical condition of rainfall, if crossed can initiate failure of slopes in the region. The critical factor in the development of an empirical threshold is the definition of rainfall and landslide parameters. A rainfall event is defined as a duration of continuous precipitation separated from the no rainfall periods before and after the event. Similarly, a landslide event can be defined either as the number of slides in a specific time period or can be expressed as a single event during the same period [14]. Guzzetti et al. (2007) [2] has explained the definition of parameters used to characterize a rainfall event. An event rainfall is the cumulative amount of rainfall measured from the starting of precipitation to the time of failure [15]; the term duration specifies the time span of the rainfall event or rainfall period [16]; and rainfall intensity is the amount of precipitation in unit duration, i.e., the rate of rainfall over the duration considered [7]. The variables generally used to define empirical threshold conditions are intensity-duration (ID), event rainfall-duration (ED) and antecedent rainfall. ID threshold was first proposed by Caine in 1980 [7] and has been used extensively by researchers [2,8,10,17]. In most cases, for long observation periods, the term intensity denotes the average rainfall over the duration, ignoring the peak values, which makes the ID threshold models complicated. ED thresholds can be used to overcome this limitation. The definition of an empirical threshold is complete only when the area of validity of equation is also mentioned. There are global, regional, and local thresholds based on the area upon which the model is valid. Local and regional thresholds are defined for a particular region and cannot be used for other regions. Global thresholds are useful when regional or local scale thresholds are not available, but they may lead to false warnings as the threshold values are too low for most of the regions considered.
Based on the objectives, empirical thresholds can be generally classified into two categories [17,18]. One predicts the minimum levels to initiate slides (minimum thresholds) and the other balances the number of true and false warnings while forecasting landslides (warning thresholds). Establishing minimum thresholds for different locations has been in practice across the world [7,8,10,14]. These are very conservative in nature and may lead to false alarms. Warning thresholds are developed to minimize false alarm rates. The sensitivity of the warning threshold is of the utmost importance when it has to be clubbed with a warning system. The prediction performance can be improved by considering antecedent moisture conditions, as the antecedent rainfall and resulting moisture content plays a critical role in the initiation of landslides [19]. A concrete example of the influence of antecedent soil moisture content, the case of an event in Sarno (Italy) in 1998 has been pointed out by Crosta and Fattini [20]. Antecedent soil moisture plays a key role in the rainfall-runoff transformation and thereby the infiltration process [21]. Attempts for finding critical moisture content for triggering of landslides have been discussed in the literature for different locations. Weighted indexes based on antecedent rainfalls is one such approach [22,23], whereas another popular approach is by using satellite data [24]. These moisture content data can be used for improving the performance of statistical thresholds [25][26][27]. The advantage of taking soil moisture as a parameter in defining threshold is the consideration of in situ parameter along with the empirical thresholds. The amount of event rainfall needed to trigger a landslide will be different for different soils, based on their moisture content. For dry soils, threshold values should be kept high, to avoid false alarms and for saturated soils, they must be kept low to minimize missed signals. Again, the values are highly regional specific and should be defined on a local or regional scale through detailed investigations and should be validated for understanding performance efficiency. For measuring soil moisture in the field, volumetric water content sensors can be installed at site [28]. The drawback of this approach is that the sensor can provide the volumetric moisture content at the point of installation only, not for the whole area considered. In order to get the moisture content of a large area, a number of such sensors are required, which is economically not feasible. Satellite databases are another source of soil moisture data, which has already been validated. Hydrological models are also being used to predict the moisture content over a larger area [29][30][31]. In this study, a hydrological model Système Hydrologique Européen Transport (SHETRAN) [31,32] is used to estimate the moisture content. The model occasionally overestimates the moisture content [33] and hence equifinality testing [34,35] is required to verify the simulated data. In this study, the model is calibrated using satellite data to explore the applicability of the model in the study area. Thus, the simulated model helps in understanding the catchment properties like canopy storage capacity, saturated water content, conductivity, and roughness coefficient using the simulated moisture content values. A two-dimensional Bayesian approach, integrating the moisture content and event rainfall, is proposed. Hence, the objective of the study is twofold, to understand the applicability of SHETRAN model for the study area and to propose probabilistic thresholds based on rainfall severity and soil moisture condition.

Study Area
The study area is Kalimpong town of West Bengal, India ( Figure 1) and the duration of analysis is from 2010 to 2017. The area belongs to Darjeeling Himalayas, which are prone to landslides due to heavy rainfall, subsequent infiltration, and untrained mountain rivulets (termed as jhoras in the local dialect) [14]. Frequent landslides in the area make life difficult for both the inhabitants and tourists due to destruction of farmlands and hinderance of transportation facilities. Urbanization and infrastructural developments have increased the risk associated with landslides in the area. This hilly region is in quest of effective landslide early warning and mitigation strategies to tackle effectively the socio-economic setbacks due to landslides. The relationship between rainfall and occurrence of landslides in Kalimpong has been a topic of research in recent times [14,36,37]. The region has become a focal point of research due to the increase in number of landslide events in the recent past. The researchers have established empirical rainfall thresholds for the area using frequentist and Bayesian inference methods. The established empirical thresholds are the minimum threshold levels based on the severity of rainfall event only. The minimum levels often lead to false alarms, challenging their applicability in the development of a landslide early warning system (LEWS). Hence, the study is an attempt to improve the performance of empirical thresholds for the area, by incorporating an in situ parameter, antecedent moisture content. Instead of defining a minimum threshold level or critical condition, a two-dimensional Bayesian analysis is conducted for the study area, considering the effects of both rainfall severity and antecedent water content. This probabilistic approach gives the probability of occurrence of landslide with respect to the rainfall and moisture conditions of the study area.
Lying in an average elevation of 1240 m in the state of West Bengal, Kalimpong town is hemmed between two rivers, river Teesta in the western part and river Relli in the eastern part. The variation in temperature of the town is from 5 • C to 27 • C annually [38] The western face of the town is steep in nature while the eastern slopes are gentler. The reddish soil and steep slopes of Kalimpong are highly susceptible to landslides.
The severe landslides had adverse effects on the stability of the hill slopes of Kalimpong, especially on the western side and the town gets isolated from the rest of the state. Erosion of river Teesta and its tributaries during heavy rainfalls add to the poor lithological quality of the area and contributes to the initiation of landslides [39]. Debris/rockslides have become a common incident in Kalimpong during monsoon.
Agriculture is the primary source of income for the people of Kalimpong. As large areas of paddy cultivation increase the chances of the landslide, farmers moved towards other crops like cardamom. As per Indian census 2011, the population density of Kalimpong district is around 40.70 /km 2 . The town is getting densely populated and expanded along the roads without proper planning. Multi-storied buildings are getting constructed along the streets and on steeper slopes.

Geology
The geologic composition of Kalimpong consists of schists, gneiss, and phyllite [40]. The expansive nature of the upland soils present in this region is accounted for by the high organic content and moisture holding capacity. The shear strength of the soil is considerably reduced by continuous percolation of water. Geological joints and cracks make the rocks disintegrate and decompose and form delicate matter. Joints are observed in parallel, perpendicular, and oblique directions to foliations. In terms of lithology, quartz-mica schist of Daling series of a golden to a silver color contributes the bedrock throughout the region [41]. It is also evident that, the rock is metamorphosed in the eastern margin. Rich micaceous minerals and foliation of schist controls the landslides to a great extent.

Geohydrology
The study area can be sub divided into smaller basins, contributing to the tributaries of river Teesta [41]. The basin of river Teesta has minor first order streams that combine to form higher order streams ( Figure 2). The smaller streams are often disappearing at some location and then reappearing at some other place downslope. Streams in the western slope flow through deep narrow valleys to Teesta while eastern slope streams flow downslope. The slopes can be classified morphometrically into moderately steep category C (20°-30°) followed by a gentle slope category D (10°-20°) and the steep slope category B (30°-45°) [41]. During the monsoon, the increase in volume and high velocity of flow of rivers carry small pebbles and boulders along with them. High rate of erosion often leads to bank failures and landslides in the region. The rainfall data of Kalimpong during the monsoons seasons of the study period has been tabulated in Table 1 below:

Geology
The geologic composition of Kalimpong consists of schists, gneiss, and phyllite [40]. The expansive nature of the upland soils present in this region is accounted for by the high organic content and moisture holding capacity. The shear strength of the soil is considerably reduced by continuous percolation of water. Geological joints and cracks make the rocks disintegrate and decompose and form delicate matter. Joints are observed in parallel, perpendicular, and oblique directions to foliations. In terms of lithology, quartz-mica schist of Daling series of a golden to a silver color contributes the bedrock throughout the region [41]. It is also evident that, the rock is metamorphosed in the eastern margin. Rich micaceous minerals and foliation of schist controls the landslides to a great extent.

Geohydrology
The study area can be sub divided into smaller basins, contributing to the tributaries of river Teesta [41]. The basin of river Teesta has minor first order streams that combine to form higher order streams ( Figure 2). The smaller streams are often disappearing at some location and then reappearing at some other place downslope. Streams in the western slope flow through deep narrow valleys to Teesta while eastern slope streams flow downslope. The slopes can be classified morphometrically into moderately steep category C (20 • -30 • ) followed by a gentle slope category D (10 • -20 • ) and the steep slope category B (30 • -45 • ) [41]. During the monsoon, the increase in volume and high velocity of flow of rivers carry small pebbles and boulders along with them. High rate of erosion often leads to bank failures and landslides in the region. The rainfall data of Kalimpong during the monsoons seasons of the study period has been tabulated in Table 1 below:

Landslides in Kalimpong
Shivalik range of the Himalayas where Kalimpong is located, is a young fold mountain with lateritic soil. The active tectonic movements make this range vulnerable to landslides. As per GSI reports, the history of landslides in Kalimpong traces back to 1899. The years from 2010 to 2016 has been chosen as the calibration period and a total of 61 landslides were identified during the period, which were triggered by rainfall [44]. Most of the landslides were shallow in nature, typical rock, or debris slides during monsoon season, triggered by rainfalls. Some landslides were characterized by high saturation, propagated as debris flows with long run-out distances. The slope movements happened in the year 2017, monitored by MEMS tilt sensors established in Kalimpong has been used for the validation of the thresholds. The movements are very slow in nature, due to toe erosion by the mountain rivulets, resulting in displacements. The hills are experiencing continuous sinking in a very slow rate during rainy season.

Rainfall
The annual rainfall of the region during the study period ranges from 1623 mm to 2061 mm. Heavy rain breaks the bond between soil particles, which leads to particle disintegration. Continuous flow of runoff water along the banks of rivers and streams weakens the rocks and they start weathering. In a long span of time, these weak earth fails, leading to slides. For soils which absorb water, the unit area and total weight increases. This eventually increase the chance for landslides due to increase in pore pressure [45]. The water gets percolated in joints and cracks and triggers landslides. Addition of water makes the soil or rock heavy. It moves through the rock, which induces weathering of rock. Over a long time, the rock gets weakened and finally it fails, leading to a rockfall. The rainfall data of daily accuracy for Kalimpong town during 2010-2016 was collected from the rain

Landslides in Kalimpong
Shivalik range of the Himalayas where Kalimpong is located, is a young fold mountain with lateritic soil. The active tectonic movements make this range vulnerable to landslides. As per GSI reports, the history of landslides in Kalimpong traces back to 1899. The years from 2010 to 2016 has been chosen as the calibration period and a total of 61 landslides were identified during the period, which were triggered by rainfall [44]. Most of the landslides were shallow in nature, typical rock, or debris slides during monsoon season, triggered by rainfalls. Some landslides were characterized by high saturation, propagated as debris flows with long run-out distances. The slope movements happened in the year 2017, monitored by MEMS tilt sensors established in Kalimpong has been used for the validation of the thresholds. The movements are very slow in nature, due to toe erosion by the mountain rivulets, resulting in displacements. The hills are experiencing continuous sinking in a very slow rate during rainy season.

Rainfall
The annual rainfall of the region during the study period ranges from 1623 mm to 2061 mm. Heavy rain breaks the bond between soil particles, which leads to particle disintegration. Continuous flow of runoff water along the banks of rivers and streams weakens the rocks and they start weathering. In a long span of time, these weak earth fails, leading to slides. For soils which absorb water, the unit area and total weight increases. This eventually increase the chance for landslides due to increase in pore pressure [45]. The water gets percolated in joints and cracks and triggers landslides. Addition of water makes the soil or rock heavy. It moves through the rock, which induces weathering of rock. Over a long time, the rock gets weakened and finally it fails, leading to a rockfall. The rainfall data of daily accuracy for Kalimpong town during 2010-2016 was collected from the rain gauge maintained by Save The Hills at Tirpai, Kalimpong (Save The Hills) as shown in Table 1. Heavy rainfall is generally associated with landslides [46][47][48][49][50][51] in hilly areas and hence it is important to identify the rainfall Water 2020, 12, 804 6 of 19 parameters which triggered landslides on a regional scale. The distribution of landslides and rainfall during the calibration period, from 2010 to 2016 is graphically represented in Figure 3.
gauge maintained by Save The Hills at Tirpai, Kalimpong (Save The Hills) as shown in Table 1. Heavy rainfall is generally associated with landslides [46][47][48][49][50][51] in hilly areas and hence it is important to identify the rainfall parameters which triggered landslides on a regional scale. The distribution of landslides and rainfall during the calibration period, from 2010 to 2016 is graphically represented in Figure 3. From Figure 3, it can be inferred that the number of landslides is highly influenced by the amount of rainfall. Maximum number of landslides occurred during monsoon season (June-September). Eighty-seven percent of the total rainfall during the study period was contributed by monsoon rains. Heavy monsoon rains are thus identified as the major triggering factor of landslides in this region.

Drainage System
Radically descending rivulets dissipated the Kalimpong ridge and contributes to the Basin of Teesta River. A number of small tributaries along the ridgeline turn empty into some natural rivulets. These are locally called "jhoras". Jhoras cause intense scouring in the region and are fed by water from numerous perennial springs in the upper part of slopes. Urban area expansion has increased the use of concrete and asphalt in the region. These materials do not allow the water to percolate through. As the area is in the process of rapid expansion, a huge amount of runoff also enhances the monsoon water flow. Being untrained, jhoras create threat to farm lands and weak slopes. The absence of a planned drained system in the area that is getting urbanized allows the water to be fed into natural tributaries that slowly flows into the major jhoras. This causes intense headward and lateral erosion, which results in slope failures near the jhoras mostly in the form of a large landslide.

Data for the Hydrological Simulation
Precipitation and potential evapotranspiration are the meteorological inputs required to run SHETRAN simulation. For areas affected by snow melt, heat budgets are also required. The rainfall data obtained is of daily accuracy and it was converted to hourly rates, assuming uniform intensity of rainfall throughout the day. A digital elevation model (DEM) of 30 m resolution was used from the Bhuvan portal of National Remote Sensing Centre, India [43]. The DEM is developed from the From Figure 3, it can be inferred that the number of landslides is highly influenced by the amount of rainfall. Maximum number of landslides occurred during monsoon season (June-September). Eighty-seven percent of the total rainfall during the study period was contributed by monsoon rains. Heavy monsoon rains are thus identified as the major triggering factor of landslides in this region.

Drainage System
Radically descending rivulets dissipated the Kalimpong ridge and contributes to the Basin of Teesta River. A number of small tributaries along the ridgeline turn empty into some natural rivulets. These are locally called "jhoras". Jhoras cause intense scouring in the region and are fed by water from numerous perennial springs in the upper part of slopes. Urban area expansion has increased the use of concrete and asphalt in the region. These materials do not allow the water to percolate through. As the area is in the process of rapid expansion, a huge amount of runoff also enhances the monsoon water flow. Being untrained, jhoras create threat to farm lands and weak slopes. The absence of a planned drained system in the area that is getting urbanized allows the water to be fed into natural tributaries that slowly flows into the major jhoras. This causes intense headward and lateral erosion, which results in slope failures near the jhoras mostly in the form of a large landslide.

Data for the Hydrological Simulation
Precipitation and potential evapotranspiration are the meteorological inputs required to run SHETRAN simulation. For areas affected by snow melt, heat budgets are also required. The rainfall data obtained is of daily accuracy and it was converted to hourly rates, assuming uniform intensity of rainfall throughout the day. A digital elevation model (DEM) of 30 m resolution was used from the Bhuvan portal of National Remote Sensing Centre, India [43]. The DEM is developed from the stereo data covered by Cartosat-1. The satellite uses two panchromatic cameras of 2.5 m spatial resolution, acquiring two images at the same time. The same scene is captured by the camera with a time difference of 52 seconds. This stereo pair is used for the development of DEM [52].
Evapotranspiration data was downloaded from the Copernicus Climate Change Service for the time period of 2010-2017. The fifth generation European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis (ERA5) hourly data was downloaded for the study area [53]. The reanalysis combines the previous forecasts with available observations with a reduced resolution, in order to provide the data of several decades. The spatial resolution of the data used in this study is 0.1 • by 0.1 • . The vertical coverage is from 2 m above the surface level to a depth of 289 cm beneath the ground. Evapotranspiration data is available in meters, with an hourly accuracy. The assumptions in calculating the data are that agricultural land is watered well, and artificial surface conditions due to watering, vegetation, and wind are not affecting the atmosphere. The assumption does not hold true in case of arid conditions, but for the study area where temperature is moderate throughout the year, this assumption is justified.
Other catchment properties required are soil properties and land use requirements. The catchment properties are spatially distributed, and the meteorological properties are time dependent parameters. In this study, uniform soil properties and land use type is assumed throughout the area. The reason for this assumption is that the area considered for the study is less than 32 km 2 and spatializing the data within this small area may cause overfitting problems. To overcome this limitation, the parameters used were calibrated using moisture content values collected from a different database, called the Modern Era Retrospective-Analysis for Research and Applications (MERRA) using Goddard Earth Observing System Data Assimilation System Version 5 (GEOS 5) [54]. The data provides hourly surface data with 30 horizontal resolution. The data is also a reanalysis product that has been validated.

Simulation of Soil Moisture
The SHETRAN model can be used to simulate processes and drainage pathways in river catchments. The model has been used extensively across the world [32,47,55,56]. This study focused only on one out of the three components of SHETRAN, the water flow component. The spatially distributed input data is used to solve the partial differential equations governing the flow. A three-dimensional grid based on the DEM resolution was used to solve the equations.
In this study, the data from only one rain gauge station was used for the analysis, and hence, meteorological properties were considered as being uniform across the domain for each time interval. The evapotranspiration and rainfall conditions are assumed to be same within the study area.
SHETRAN uses the following equations for hydrological processes.
where w o i . and w m i are the observed and simulated moisture content when at each time step. w o i is the arithmetic mean of observed moisture content. Calibration was done for n days. The value of NSE should be one in the case of a perfect model.

Rainfall Events and Empirical Thresholds
The rainfall thresholds are subjected to many uncertainties, including (i) quality and availability of data regarding historical landslides and corresponding rainfall events [2,61,62], (ii) identification of a rainfall event associated with landslides [2], and (iii) the statistical approaches adopted to determine the thresholds [62]. In this study, we have considered 61 landslide events happened within the study area, during the time period of 2010-2016. Rainfall event was set with a dry period of one day throughout the year. Only rainfall events resulted in the occurrence of landslides were considered for the analysis. Hence, the term event rainfall (E) is the total cumulative rainfall from the starting of precipitation until the occurrence of a landslide. The duration (D) is the time period in hours from the starting of rainfall until the occurrence of a landslide. The term intensity (I) is used to define the average rate of precipitation in mm/h.
Once the rainfall events are identified, ED rainfall thresholds for different probabilities of exceedance were defined by using the frequentist approach, in the form of a power law as follows: where α is the intercept (scaling factor) and γ defines the slope of the curve (shape parameter). The approach is based on a frequency analysis of the precipitation conditions which resulted in landslides [63]. Since the data was spanning multiple orders, the rainfall parameters were log transformed first and were plotted using a single graph. The distribution was then fitted using a straight line equation of the form: which is same as the power law in Equation (2). The next step was to find the thresholds at different probabilities of exceedance, based on the distribution of data points. The difference δE between logarithm of each rainfall event logE and the corresponding event value of the best fit line log E f was calculated. By using a kernel density estimation [64], the probability density function of the distribution of δE was determined. The result fitted with a Gaussian function in the form where, a, c > 0 and a, b, c R. The thresholds at different probabilities of exceedance were then found using the shift of data points from the mean of the Gaussian distribution (fitted curve). In the case of a Gaussian distribution, both Bayesian and frequentist approaches yielded the same results [65] and in order to define different exceedance probabilities, the frequentist approach was followed in this study.

Bayes' Theorem
Evaluating the conditional probability of occurrence of the landslide was done using two-dimensional Bayesian analysis. The analysis considers the effect of two factors for the possible occurrence of landslides and defines a probability for the same. In this study, the factors considered are the antecedent water content and the severity of a rainfall event. The antecedent moisture content was simulated using SHETRAN and the rainfall event severity was determined using the total event rainfall and duration parameters, using the frequentist approach. Considering the range of antecedent soil moisture values, the results were scaled from 0 to 1 for easy classification. The scaled values are known as soil wetness. The higher the value of soil wetness, the wetter the in situ soil. The rainfall thresholds were also classified into six categories based on the exceedance probability conditions, i.e., rainfall events which are below the minimum threshold comes under one category, with very low severity. Thus, based on the defined thresholds (T min , T 5 , T 10 , T 20 and T 50 ) six categories were defined. Hence, 30 cells (5 by 6) were available for the two-dimensional Bayesian analysis. The theorem for analysis can be defined as follows: In this study, we are trying to find out the probability of occurrence of at least one landslide in the study area based on soil wetness and rainfall severity. Thus, A is the event of the occurrence of one or more landslides, B is the soil moisture condition, and C indicates the rainfall severity. 'B, C' indicates the occurrence of specific range of values of B, while the severity of rainfall is having a certain defined level. In short, it defines the condition of each cell in the previously mentioned 30 cells (5 by 6). The other terms in the equation can be defined as: P (B,C|A) is the probability of having a certain 'B, C' condition (cell condition) when a landslide happens. This is known as the conditional probability or the likelihood of B, C given A. P (A) is the prior probability of the occurrence of a landslide, regardless of rainfall severity, and soil wetness. P (B,C) is called the marginal probability of finding a certain cell condition with or without the occurrence of landslides. P (A|B,C) is the conditional probability or the posterior probability of occurrence of a landslide when a cell condition is satisfied.
All these probabilities are estimated in the terms of relative frequencies. where: N A = The total number of landslide events (If n number of landslides occur on the same day, it is considered as one landslide event) N R = The total number of rainfall events during the study period N B,C = The number of events in each cell condition N (B,C|A) = The number of rainfall events that resulted in landslides while satisfying a cell condition The conditional probability of occurrence of a landslide can be determined by using the above-mentioned frequencies.

Soil Moisture Estimation
The SHETRAN model used in study was calibrated using moisture content data. The rainfall data is first used to identify the phreatic line for the area under consideration based on the soil depth and catchment properties. Hence, actual evapotranspiration data is put as input for SHETRAN, AE/PE ratio is always kept as one during the simulation. The moisture content values and tension is calculated for the zone of unsaturation and for the saturated zone, recharge is calculated. The zone below the phreatic line is assumed to be in a saturated condition. Moisture conditions are then simulated for the soil above phreatic line, in a shallow depth which is often susceptible to shallow landslides. The hydraulic parameters for water flow modelling were determined using van Genuchten parameters in SHETRAN. It has been proven that the assumption of considering effective saturation by a smooth function, without considering the air entry value can sometimes lead to unrealistic results [66]. Hence to incorporate the effects of all assumptions taken during the modelling, it is important to perform the model calibration, for reliable outputs. The catchment properties were modified several times and an optimum performance was obtained with an R 2 value of 0.84. The calibrated parameters are listed in Table 2 below: The results are comparable with the hydrodynamic parameters predicted using ROSETTA Lite module [67] for Kalimpong, using the particle size distribution of soil [13]. This module takes the precipitation data as input and determines evapotranspiration by using the variation in temperature by Hargreaves equation [68].

Rainfall Thresholds
From the historical data, 61 landslide events happened in Kalimpong town during the study period was used for the definition of rainfall thresholds [44]. The cumulated event rainfall (E) and the duration of the rainfall events associated with these landslides were found and the frequentist approach was followed for the determination of thresholds. The probability density function of distribution of δE is plotted in Figure 4. modelling, it is important to perform the model calibration, for reliable outputs. The catchment properties were modified several times and an optimum performance was obtained with an R 2 value of 0.84. The calibrated parameters are listed in Table 2 below: The results are comparable with the hydrodynamic parameters predicted using ROSETTA Lite module [67] for Kalimpong, using the particle size distribution of soil [13]. This module takes the precipitation data as input and determines evapotranspiration by using the variation in temperature by Hargreaves equation [68].

Rainfall Thresholds
From the historical data, 61 landslide events happened in Kalimpong town during the study period was used for the definition of rainfall thresholds [44]. The cumulated event rainfall (E) and the duration of the rainfall events associated with these landslides were found and the frequentist approach was followed for the determination of thresholds. The probability density function of distribution of δE is plotted in Figure 4.   Figure 4 shows the shift of the threshold line with 5% exceedance probability from the best fit line. Based on the standard Gaussian distribution, the shift of each threshold from the best fit line has been calculated. Using this approach, five different threshold levels are defined in this study (T min , T 5 , T 10 , T 20 , and T 50 ). The defined thresholds and data points are plotted in Figure 5.
Water 2020, 12, x FOR PEER REVIEW 11 of 19 Figure 4 shows the shift of the threshold line with 5% exceedance probability from the best fit line. Based on the standard Gaussian distribution, the shift of each threshold from the best fit line has been calculated. Using this approach, five different threshold levels are defined in this study (Tmin, T5,  T10, T20, and T50). The defined thresholds and data points are plotted in Figure 5. From Figure 5, it can be observed that the Tmin threshold is the threshold below which no landslides are expected to occur. The obtained results are found to be in good agreement with the ID thresholds defined for the area [44]. The variation in duration was found to be relatively less, from one to eight days while the event rainfall values varied the order of tens to hundreds. Out of the 61 events, 5% of the events are expected to fall below the T5 line, 10% below the T10 line, and so on. The T50 line is the best fit line where 50% of the data points are expected to be below the line and 50% above the line. The equation of the minimum threshold has been obtained as E = 1.50D 0.65 and the best fit line is defined as E = 6.03D 0. 65 . The values of α and γ of different thresholds are tabulated in Table  3.  From Figure 5, it can be observed that the T min threshold is the threshold below which no landslides are expected to occur. The obtained results are found to be in good agreement with the ID thresholds defined for the area [44]. The variation in duration was found to be relatively less, from one to eight days while the event rainfall values varied the order of tens to hundreds. Out of the 61 events, 5% of the events are expected to fall below the T 5 line, 10% below the T 10 line, and so on. The T 50 line is the best fit line where 50% of the data points are expected to be below the line and 50% above the line. The equation of the minimum threshold has been obtained as E = 1.50D 0.65 and the best fit line is defined as E = 6.03D 0. 65 . The values of α and γ of different thresholds are tabulated in Table 3. The values indicate that for one day duration (24 hours) the chances of occurrence of landslide when the rainfall is less than 11.83 mm is nil, 18.78 mm is 5%, 26.11 mm is 10%, 32.19 mm is 20%, and 47.58 mm is 50%.
It can be observed from Figure 5 that the cumulated event rainfall is the highest when the duration is the maximum. For shorter duration events, landslides are triggered by very less rainfall as well. The reason could be the antecedent soil moisture conditions. Thus, the defined thresholds are further modified by incorporating the effect of antecedent soil moisture, using a Bayesian probability analysis.

Probabilistic Thresholds
Probabilistic thresholds are defined for the study area using the 30 cell conditions for Kalimpong town. The cell conditions are defined based on the six threshold classifications and five soil wetness classifications. Thresholds are classified as less than T min , T min -T 5 , T 5 -T 10 , T 10 -T 20 , T 20 -T 50 , and greater than T 50  The values indicate that for one day duration (24 hours) the chances of occurrence of landslide when the rainfall is less than 11.83 mm is nil, 18.78 mm is 5%, 26.11 mm is 10%, 32.19 mm is 20%, and 47.58 mm is 50%.
It can be observed from Figure 5 that the cumulated event rainfall is the highest when the duration is the maximum. For shorter duration events, landslides are triggered by very less rainfall as well. The reason could be the antecedent soil moisture conditions. Thus, the defined thresholds are further modified by incorporating the effect of antecedent soil moisture, using a Bayesian probability analysis.

Probabilistic Thresholds
Probabilistic thresholds are defined for the study area using the 30   It is evident from Figure 6 that the maximum probabilities are not always associated with the extreme conditions. The maximum probability of one was obtained in three cases: a) the severity of a rainfall event exceeds 50% and the soil wetness is between 0.4 and 0.6 b) the severity of a rainfall event between 20% and 50% and the soil wetness is between 0.6 and 0.8 c) the severity of a rainfall event between 5% and 10% and the soil wetness is between 0.6 and 0. 8 Thus, it can be inferred that for severe rainfall events, even lesser moisture content can trigger landslides and for less severe rainfall event, moisture content plays an important role in the occurrence of landslides. Further moisture content values, depending upon the circulation pattern, can further generate heavy rainfalls as mentioned in the analysis of Oder Flood [69] and this aspect should be explored in detail in further studies. It is evident from Figure 6 that the maximum probabilities are not always associated with the extreme conditions. The maximum probability of one was obtained in three cases: (a) the severity of a rainfall event exceeds 50% and the soil wetness is between 0.4 and 0.6 (b) the severity of a rainfall event between 20% and 50% and the soil wetness is between 0.6 and 0.8 (c) the severity of a rainfall event between 5% and 10% and the soil wetness is between 0.6 and 0.8.
Thus, it can be inferred that for severe rainfall events, even lesser moisture content can trigger landslides and for less severe rainfall event, moisture content plays an important role in the occurrence of landslides. Further moisture content values, depending upon the circulation pattern, can further generate heavy rainfalls as mentioned in the analysis of Oder Flood [69] and this aspect should be explored in detail in further studies.

Validation
The thresholds derived should be validated to understand if the moisture content values actually help in improving the performance of empirical thresholds or not. Quantitative comparison using statistical attributes is the best approach in this regard, as the attributes can verify the prediction power of each model [18]. For accessing the performance of empirical thresholds and the probabilistic approach, a receiver operating characteristic (ROC) curve approach has been used in this study. The data used for validation is the rainfall and landslide details of 2017, a dataset independent of that used for calibration. Displacements were observed on seven days during 2017 monsoon and the prediction by thresholds for each day was considered for deriving the statistical attributes. A confusion matrix of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) was used to calculate the sensitivity and specificity of the model, to plot the ROC curves. The statistical attributes were calculated for each level of exceedance for empirical thresholds and for probabilities greater than 0.1, 0.2, 0.4, and 0.8 for the probabilistic model.
For analysis, TPs are counted when the threshold considered is exceeded and displacements are observed in the field. TNs are cases when thresholds are not exceeded, and no movements are recorded. FPs are counted for each false alarm issued by the threshold considered and FN for each missed alarm. The efficiency of the model can be characterized by the ratio of sum of TP and TN to the total of all predictions. The term is not an apt criterion to understand the performance of model, as the number of TNs is much higher than the other three variables. Hence it is crucial to evaluate the performance using its sensitivity and specificity. Sensitivity measures the number of events predicted correctly, out of the total number of events, seven in this case. The term points to the true positive prediction rate of the model. Specificity is the ratio of TNs to the total number of days without sliding events. The model is better when both the sensitivity and specificity are one, i.e., when all landslides are correctly predicted without any false alarms. The false positive rate is expressed as one-specificity and is the ratio of false positives to the total number of days without landslide events. The results of statistical comparison of the two models are listed below in Table 4. The sensitivity is maximum for when the critical probability is considered as 0.1. In case of empirical thresholds, the sensitivity is the maximum when T min threshold is considered. While increasing the threshold level, the value of specificity is increased, but at the cost of many missed alarms, reducing the value of sensitivity in both cases. The term likelihood ratio is the ratio of sensitivity to the false positive rate and can be used to evaluate the model performance. The more the likelihood ratio, the better is the model. The likelihood ratio is also the highest for the minimum thresholds considered in both cases. The relative values of likelihood ratio for two-dimensional probabilistic model is higher than those of an empirical model in all the cases, assuring the better performance of the probabilistic model. The lesser values of sensitivity are expected as the total number of displacement events considered for validation is as less than seven and even single missed alarms can have a significant effect on the term.
ROC curves are then plotted between sensitivity and false positive rate for both the models considered, for a better understanding and comparison of models. The perfect point of a ROC curve is (0,1) when both sensitivity and specificity are 1. Area under the curve (AUC) is used to compare two different models, as the model with higher AUC performs better than the one with lower AUC [27]. ROC curves for both the models are plotted in Figure 7. The points shift from (1,1) to (0,0) from the minimum to a maximum level of exceedance considered in both the cases. The sensitivity is maximum for when the critical probability is considered as 0.1. In case of empirical thresholds, the sensitivity is the maximum when Tmin threshold is considered. While increasing the threshold level, the value of specificity is increased, but at the cost of many missed alarms, reducing the value of sensitivity in both cases. The term likelihood ratio is the ratio of sensitivity to the false positive rate and can be used to evaluate the model performance. The more the likelihood ratio, the better is the model. The likelihood ratio is also the highest for the minimum thresholds considered in both cases. The relative values of likelihood ratio for two-dimensional probabilistic model is higher than those of an empirical model in all the cases, assuring the better performance of the probabilistic model. The lesser values of sensitivity are expected as the total number of displacement events considered for validation is as less than seven and even single missed alarms can have a significant effect on the term.
ROC curves are then plotted between sensitivity and false positive rate for both the models considered, for a better understanding and comparison of models. The perfect point of a ROC curve is (0,1) when both sensitivity and specificity are 1. Area under the curve (AUC) is used to compare two different models, as the model with higher AUC performs better than the one with lower AUC [27]. ROC curves for both the models are plotted in Figure 7. The points shift from (1,1) to (0,0) from the minimum to a maximum level of exceedance considered in both the cases. The ROC curves for the empirical thresholds T10 and T20 are the closest to the diagonal line, making them the least reliable out of all the thresholds considered. It is evident from the curves that the AUC of the probabilistic thresholds is larger than that of an empirical model. The effect of the antecedent moisture content considered in the probabilistic model has increased the prediction power by considerably minimizing the false alarms. Even though the number of false alarms are reduced, the number of TPs for higher values of critical probabilities is not much higher than those of empirical thresholds. The reason could be the change in typology of landslides used for calibration and a better approach should be adopted for predicting slow movements, by considering the longterm effect of rainfall. The ROC curves for the empirical thresholds T10 and T20 are the closest to the diagonal line, making them the least reliable out of all the thresholds considered. It is evident from the curves that the AUC of the probabilistic thresholds is larger than that of an empirical model. The effect of the antecedent moisture content considered in the probabilistic model has increased the prediction power by considerably minimizing the false alarms. Even though the number of false alarms are reduced, the number of TPs for higher values of critical probabilities is not much higher than those of empirical thresholds. The reason could be the change in typology of landslides used for calibration and a better approach should be adopted for predicting slow movements, by considering the long-term effect of rainfall.

Conclusions
This study was carried out to understand the applicability of SHETRAN model in simulating the moisture content conditions for Kalimpong region and to evaluate the two-dimensional Bayesian probability of the occurrence of landslides in the region using rainfall severity and antecedent moisture conditions.
The historical landslide and rainfall data of Kalimpong from 2010-2017 and satellite were made use of in the research to determine the Bayesian probability. The study proves SHETRAN model can be used as an effective tool to predict moisture content of the region when direct field-based observations are not available. Using frequentist approach, empirical rainfall thresholds for the region were first defined for the region, for different exceedance probabilities. The ED thresholds defined for the region indicate that for one day duration (24 hours) the chances of occurrence of landslide when the rainfall is less than 11.83 mm is nil, 18.78 mm is 5%, 26.11 mm is 10%, 32.19 mm is 20%, and 47.58 mm is 50%. These thresholds were further modified by incorporating the antecedent moisture conditions. It was found that antecedent soil moisture conditions play a significant role in the initiation of landslides in Kalimpong, especially for the less severe rainfalls. The maximum probability of occurrence of landslides is not associated with the extreme conditions of the parameters considered. It is observed that less severe rainfalls can trigger landslides when the soil wetness is high and even if the soil is dry, highly severe rainfall can trigger landslides.
The study proposes an effective method to enhance the prediction capability of the empirical thresholds already defined for the region [14,70]. The major drawback of empirical thresholds is that, the increased number of false alarms can be considerably reduced by the two-dimensional analysis. The enhancement in performance has been evaluated by using a ROC approach where the probabilistic model with critical probability 0.1 gave the maximum performance with a likelihood ratio of 7.48. The probabilistic method outperforms the empirical approach and can be used to improve the performance of the empirical thresholds. Thus, the study proves that SHETRAN hydrological model can be used as a satisfactory alternative to the robust instrumentation network for estimating moisture content and the results can be used to predict the probability of occurrence of landslides. This highly landslide susceptible zone is in the dire need of an effective LEWS and the results can be integrated with field monitoring systems to predict shallow landslides in the region.