Using Field-Based Monitoring to Enhance the Performance of Rainfall Thresholds for Landslide Warning

Landslides are natural disasters that can create major setbacks to the socioeconomic conditions of a region. Destructive landslides may happen quickly, resulting in severe loss of lives and properties. Landslide Early Warning Systems (LEWS) can reduce the risk associated with landslides by providing enough time for the authorities and the public to take necessary decisions and actions. LEWS are usually based on statistical rainfall thresholds, but this approach is often associated with high false alarm rates. This manuscript discusses the development of an integrated approach, considering both rainfall thresholds and field monitoring data. The method was implemented in Kalimpong, a town in the Darjeeling Himalayas, India. In this work, a decisional algorithm is proposed using rainfall and real-time field monitoring data as inputs. The tilting angles measured using MicroElectroMechanical Systems (MEMS) tilt sensors were used to reduce the false alarms issued by the empirical rainfall thresholds. When critical conditions are exceeded for both components of the system (rainfall thresholds and tiltmeters), authorities can issue an alert to the public regarding a possible slope failure. This approach was found effective in improving the performance of the conventional rainfall thresholds. We improved the efficiency of the model from 84% (model based solely on rainfall thresholds) to 92% (model with the integration of field monitoring data). This conceptual improvement in the rainfall thresholds enhances the performance of the system significantly and makes it a potential tool that can be used in LEWS for the study area.


Introduction
Landslides are geomorphological processes, which can sometimes lead to severe loss of lives and properties [1,2]. Due to the increase in population, most hilly terrains are being urbanised, and the risk Japan [55,56]. The climatic and geological settings of the study area in India differ from the locations where MEMS tilt sensors have already been tested. In a global database of fatal landslides, it has been reported that Asia contributes 75% of the world's fatal landslides, with a significant number of events in the Himalayan Arc [57]. The study also states that landslide events are increasing in India, Nepal, Bangladesh, Northern Pakistan, and Bhutan during the summer monsoon. When rainfall-induced landslides are considered, India and Nepal contribute 16% and 10% of the events in the global dataset [57]. Several studies have been conducted to derive empirical rainfall thresholds for the occurrence of landslides in this region [16,58–60]. The Himalayan Arc [57,61] and Western Ghats [62] are critical landslide-prone areas in India, contributing a major share of the total landslides in the country. Hence, it is important to verify the applicability of the sensors in the study area by comparing the sensor readings with field observations. The primary triggering factor of landslides is rainfall, and hence several rainfall thresholds are being used to forecast the occurrence of landslides in this region [22,63–68]. This study evaluates the use of field-based monitoring data for enhancing the forecasting performance of empirical rainfall thresholds defined for the study area in the Darjeeling Himalayas.
An algorithm is proposed and validated in this study, to integrate the field monitoring data with the rainfall thresholds. The algorithm can be used by the authorities to issue alerts to the public for possible occurrence of landslides. This study, making use of both historical data and real time observations for landslide forecasting, is the first of its kind for Darjeeling Himalayas. The proposed approach signifies the need for a network of real-time field monitoring sensors on critical slopes. For a country like India, the investment in such low-cost sensor networks can help in reducing the risk associated with landslides.

Study Area
Kalimpong town is situated in the state of West Bengal, India (Figure 1). The town is close to the Sikkim-West Bengal border in the Darjeeling Himalayas. Kalimpong town lies between two rivers, Tista and Relli. The western slopes towards Tista River are very steep, and the slopes towards Relli River are gentle. The ground inclination is about 20° near Tista River and increases to around 40° at higher elevations towards the town. The morphometric classification ranges from gentle slopes to escarpment. The topsoil ranges in grain size from silt to sand. Colluvium, older debris, and young debris deposits are found in the region and get eroded during heavy rains.
Kalimpong is a part of the Darjeeling Himalayas, with intra-thrusted Fold-Thrust Belt (FTB) rock formations ranging in age from Precambrian to Quaternary [69]. The bedrock lithology consists mainly of golden to silver coloured quartz mica schist of the Daling series [70]. The quartz mica schist in some portions grades to dull white mica quartz schist and bright silver white mica schist. The bedrock is largely covered with thick layers of weathered material. The rocks are highly jointed and weathered, resulting in intense landsliding and erosion during the monsoon season and the formation of loose soil deposits, especially in the upper portion of the young folded mountain sectors. The general landslide susceptibility of the area is also increased by land use practices (urbanisation and slope cuts for roads and buildings) and lateral erosion of streams. The most common landslide types in the region are debris flows and slides, often influenced by slope cuts and riverbank erosion. During the monsoon season, the soil gets saturated by rainwater. The joints in the rocks also allow the rainwater to percolate. The saturation of soil further reduces the shear strength and leads to slope failure. The region has a high drainage density, and its most critical locations lie along the mountain rivulets (jhoras in the local dialect). The jhoras, flowing at high velocity, erode soil from their banks and eventually cause slope failure. One of the most hazardous areas of the test site is the Pyarieni jhora, which is located in Chibo village of Kalimpong (Figure 1c). It develops through deposits of an ancient debris slide consisting of a sandy silty matrix with phyllite fragments, phyllitic quartzite and mica schist [69], with a regolith thickness varying from 4 to 5 m along the banks.
The upper portions of the location have scanty exposures of schist, phyllite and sheared gneiss, while the exposures of silvery grey mica schist are found along the
The rainfall data used for this study are of daily resolution, measured at Tirpai, Kalimpong [71]. The average monthly rainfall data for the monitoring period, from July 2017 to August 2020, are reported in Table 1, and it can be observed that the major share of the annual rainfall arrives from July onwards, during the monsoon season. The daily rainfall data during the study period are plotted in Figure 2. It can be observed that rainfall with an intensity greater than 100 mm per day occurred multiple times during the monsoon. The average annual rainfall of the region is around 2000 mm; during the investigated period, the least rainfall was received in 2018 and the maximum amount was received in 2020. From January to August 2020, a total rainfall of 2246.3 mm was recorded in the region.

Methodology
The study evaluates the performance of empirical rainfall thresholds, field monitoring data and the combined use of both, in forecasting the slope failures near a mountain rivulet, Pyarieni jhora in Chibo village of Kalimpong. The unstable slopes near the rivulet have been monitored using three tilt sensors since July 2017. The empirical thresholds for the region were derived using the rainfall and landslide data from 2010 to 2016 and are validated in this study using the data from July 2017 to August 2020. The field monitoring data and the combined approach are calibrated and verified using the data from July 2017 to August 2020.
The empirical thresholds and field monitoring are combined using an algorithm that first considers the rainfall thresholds and then verifies the in situ conditions using the data from tilt meters. All three methods are explained in detail in the following sections.

Empirical Thresholds
Defining the triggering rainfall of landslide events and identifying the associated rainfall parameters are the most critical parts of deriving a rainfall threshold [21]. The parameters associated with the responsible rainfall event are then plotted on a two-dimensional plane in logarithmic scale. For example, while defining ID thresholds, the values of I and D of the rainfall events that have resulted in landslides are plotted on the y and x axes, respectively. The logarithmic scale is used because the data span multiple orders of magnitude. A power law in the original scale becomes a straight line in logarithmic scale, and the best fit line can be derived using the method of least squares. The definition of empirical thresholds considers only those events which have resulted in landslides. The effect of rainfall events that have not triggered landslides is usually ignored in the analysis.
Water 2020, 12, 3453
The thresholds in this study are defined using a frequentist approach, according to the distribution of data around the best fit line. The data plotted on the logarithmic scale are of multiple orders, and the best fit line can be calculated using the method of least squares. On a logarithmic scale, the ID thresholds are generally straight lines with negative slope, and ED thresholds are straight lines with positive slopes. The distance of each data point in the vertical axis from the best fit line is the shift of each data point. The best fit line represents a threshold line with 50% exceedance probability. It is assumed that the data follows normal distribution, and the best fit line represents the mean value. The threshold lines can be defined using the shift of data points from the best fit line. Hence the shift is fitted using a standard Gaussian distribution and thresholds can be defined based on the distribution of data.
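The frequentist procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the function name `fit_frequentist_threshold` and the sample values are hypothetical, the power law is assumed to be I = αD^β, and the 5% threshold is obtained by shifting the intercept of the best fit line by the 5% quantile of a zero-mean Gaussian fitted to the vertical shifts.

```python
import math
from statistics import NormalDist, mean, pstdev


def fit_frequentist_threshold(durations, intensities, exceedance=0.05):
    """Fit I = alpha * D**beta in log-log space and derive a lower threshold.

    Returns (alpha50, alpha_t, beta): the intercept of the best fit line,
    the intercept of the threshold line, and the common slope.
    """
    x = [math.log10(d) for d in durations]
    y = [math.log10(i) for i in intensities]
    x_bar, y_bar = mean(x), mean(y)

    # Ordinary least squares for y = log10(alpha50) + beta * x
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    log_alpha50 = y_bar - beta * x_bar

    # Vertical shift of every data point from the best fit line
    shifts = [yi - (log_alpha50 + beta * xi) for xi, yi in zip(x, y)]
    sigma = pstdev(shifts)

    # Assumption: the shifts are Gaussian with zero mean, so the threshold
    # intercept is moved by the standard normal quantile times sigma.
    z = NormalDist(mu=0.0, sigma=1.0).inv_cdf(exceedance)  # negative for 5%
    log_alpha_t = log_alpha50 + z * sigma
    return 10 ** log_alpha50, 10 ** log_alpha_t, beta


# Hypothetical landslide-triggering events (duration in h, intensity in mm/h)
durations = [24, 48, 72, 96, 120, 168]
intensities = [1.2, 0.7, 0.8, 0.5, 0.6, 0.35]
alpha50, alpha5, beta = fit_frequentist_threshold(durations, intensities)
print(f"best fit: I = {alpha50:.2f} D^{beta:.2f}; 5% threshold: I = {alpha5:.2f} D^{beta:.2f}")
```

Only the intercept changes between the best fit line and the threshold line; both share the slope β, which is why the two lines are parallel on the logarithmic plot.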
The standard form of ID thresholds is given by:
$$I = \alpha D^{\beta}$$
where I is the rainfall intensity, D is the rainfall duration, and α and β are empirical parameters in which α is the intercept and β is the scaling parameter. In the case of ED thresholds, the power law has a similar form, given by:
$$E = a D^{b}$$
where E is the cumulated event rainfall and a and b are empirical parameters, similar to the case of ID thresholds. Since E = I × D, for the same data β and b are related by the following equation:
$$b = \beta + 1$$
The rainfall thresholds were derived for the region using rainfall and landslide data from 2010 to 2016 [22,58] in previous studies conducted by the same research group. The landslide data were collected from various government sources and media reports to derive a relationship between the occurrence of rainfall and landslides in the region. For the definition of ID thresholds, responsible rainfall events were identified by expert judgement based on the time and location of each landslide. Then, the intensity and duration of the rainfall were calculated from the daily rainfall data and used to define the minimum threshold, in the form of a power law. An ID threshold (Figure 3a) with an exceedance probability of 5% was derived for Kalimpong in [58]. To overcome the limitations of the defined ID threshold due to uncertainty in data and the spatial distribution of landslide and rain gauge locations, an automatic algorithm-based approach [18] was used to define the ED thresholds (Figure 3b) with 5% exceedance probability for Kalimpong [22].
The algorithm defines the threshold using the frequentist approach, but the rainfall event responsible for the initiation of a landslide is identified by reconstructing all rainfall events from the rainfall series data. Next, the Maximum Probability Rainfall Conditions (MPRC) were identified as the rainfall condition with the highest weight. The weight assigned is directly proportional to the cumulative event rainfall and intensity of the rainfall conditions and inversely proportional to the square of the distance between the rain gauge and the landslide. The thresholds were calibrated using the rainfall and landslide data from 2010 to 2016 and have been validated in this study using the data from July 2017 to August 2020.
Figure 3. Empirical thresholds derived for Kalimpong along with the rainfall events considered for derivation of thresholds: (a) ID threshold [58] and (b) ED threshold [22].
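The weighting rule used to pick the MPRC can be illustrated with a short sketch. It assumes a proportionality constant of 1, takes the distance as the one between the rain gauge and the landslide, and uses hypothetical names (`RainfallCondition`, `weight`, `select_mprc`) and sample values; it is not the code of the algorithm in [18].

```python
from dataclasses import dataclass


@dataclass
class RainfallCondition:
    event_rainfall_mm: float   # cumulative event rainfall E
    duration_h: float          # duration D
    gauge_distance_km: float   # assumed: rain gauge to landslide distance

    @property
    def intensity(self) -> float:
        """Mean intensity I = E / D in mm/h."""
        return self.event_rainfall_mm / self.duration_h


def weight(c: RainfallCondition) -> float:
    """Weight proportional to E and I, inversely proportional to distance squared."""
    return c.event_rainfall_mm * c.intensity / c.gauge_distance_km ** 2


def select_mprc(conditions):
    """Return the rainfall condition with the highest weight."""
    return max(conditions, key=weight)


# Hypothetical candidate rainfall conditions for one landslide
candidates = [
    RainfallCondition(100.0, 48.0, 2.0),
    RainfallCondition(150.0, 72.0, 2.0),
    RainfallCondition(60.0, 24.0, 5.0),
]
print(select_mprc(candidates))
```

Because the distance enters as an inverse square, a nearby gauge with moderate rainfall can outweigh a distant gauge with heavier rainfall.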

Field Monitoring System
For monitoring the unstable slopes near Pyarieni jhora in Chibo, three sensor units were installed in July 2017. Each field monitoring unit consists of a MEMS tilt sensor, a volumetric water content sensor and a control unit, as shown in Figure 4. The MEMS tilt sensor measures the tilt angle in two directions: one parallel to the sloping surface and one in the perpendicular direction. The abscissa of the sensors was kept parallel to the direction of slope and the ordinate in the perpendicular direction. The angles were recorded with an accuracy of 0.017°. The sensitivity of the tilt sensors was 4 V/g, and the output was measured as digital voltage. This voltage value was converted to the tilt angle using the conversion equation provided by the manufacturer. The volumetric water content sensors measured the water content with a precision of ±3% and a response time of 10 ms. The control unit contains a wireless communication module which transfers the data from each sensor unit to a data logger. The data received from all sensors were collected by the data logger and transferred in real time through the internet. Power was provided to the sensors by four C size alkaline batteries, which can work continuously in field conditions for a long time. The sensor sleeps for 10 min after sending a signal, which increases the battery life in the field. The configuration of the sensor is shown in Figure 5. The sensors for this research were procured from Chuo Kaihatsu Corporation, Japan [72]. The manufacturing cost of the sensors is lower than that of conventional instruments, owing to the integration of MEMS with integrated circuits of very small dimensions [73].
The sensors can be installed easily at shallow depths. The tilt sensors are placed at a depth of 1 m or slightly more, and the volumetric moisture content sensors at around 3 cm depth from the ground surface. Thus, tilt sensors overcome the disadvantages of inclinometers and extensometers through easy installation and quick response time [47], making them an easy-to-use tool for providing early warning. The low cost of the sensor unit makes it feasible to install many sensors on a single slope. The major disadvantage associated with the sensors is the possibility of false alarms due to human or animal interventions. Any impact on the peg or pole attached to the sensor can result in tilting of the sensor. Hence, utmost care has to be taken while interpreting the data from sensors. The data from sensors should always be correlated with the rainfall and moisture data before arriving at conclusions.
The locations of the sensors were selected after detailed field investigations, consultation with the Geological Survey of India, and discussions with local people [69]. The sensors S1, S2, and S3 were placed near Pyarieni jhora. The location of the sensors, along with the drainage map, is shown in Figure 6. The sensors have been used to monitor the slopes since July 2017, and the readings are matched with field observations to assess the reliability of the data [46,69]. The recordings from the tilt meters during the monitoring period are plotted in Figure 7.
It can be observed from Figure 7 that the maximum variation in tilt was recorded by sensor 2, which is at the most critical location. There are also many abrupt changes in the monitoring data, which are mostly vertical and caused by sudden impacts. From field observations, it has been noted that such changes are not always associated with slope failures. The slope instabilities occur during monsoon seasons, due to heavy rainfall, during which the tilt meters record variations in tilting angles in accordance with the directions of the failing mass.
During the 2017 monsoon, sensors 3 and 2 showed a notable increase in tilt angles. The displacement periods were 28 and 29 July 2017, and 13 to 17 August 2017. During the 2018 monsoon, maximum variations in tilting angles were recorded by sensor 2. From 3 August to 15 September 2018, significant changes were observed in tilt angles, with a flat period from 14 to 24 August 2018. During the 2019 monsoon, both sensors 1 and 2 showed minor variations. There were two displacement periods, one from 12 to 20 July 2019 and the second from 9 to 18 August 2019. The displacement periods during the 2020 monsoon were from 17 to 22 June and from 26 July to 27 August. The second displacement period during 2020 was the most critical, and cracks were observed in the village roads near the jhora (Figure 8). The cracks are close to the location of sensor 2, which recorded the maximum variations in tilt angles. The recorded data were in accordance with the field observations. The data from all four monsoon seasons were used to calibrate and validate the thresholds for tilt meters.
The peak hourly tilt rate is used for issuing the alert and, after evaluating the field monitoring data, its threshold was set to 0.03°/h. All three sensors were considered for analysis, and if a reading crossing the threshold is recorded by any of the three sensors, an alert should be issued. The method of measuring tilting rate has been modified from the previous studies, for ease of calculation. The common approach is to calculate the slope of the tilt angle vs. time graph for a small constant interval of tilt angle. In this study, the time interval is kept constant instead of the change in tilt angle: the change in tilt angle for every hour is recorded as the tilting rate. It is then verified whether the rate stays above 0.03°/h continuously for at least an hour, which minimizes the errors due to noise in the tilt angle readings. When the threshold is crossed, minor displacements can be expected near the jhora, which may result in very slow sinking of the uphill area. The process is very slow and takes place over a few days. If the tilting rate crosses 0.1°/h, the alert is critical and there are chances of sudden slope failures, leading to landslides.
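A minimal sketch of the constant-time-interval tilt rate check is given below. It assumes one tilt reading per hour for each sensor and treats tilt in either direction as movement by taking the absolute change; the function names and the sample readings are hypothetical, while the 0.03°/h and 0.1°/h values are the thresholds quoted in the text.

```python
ALERT_RATE = 0.03     # deg/h, alert threshold from the text
CRITICAL_RATE = 0.1   # deg/h, critical threshold from the text


def hourly_tilt_rates(angles_deg):
    """Absolute change in tilt angle between consecutive hourly readings (deg/h)."""
    return [abs(b - a) for a, b in zip(angles_deg, angles_deg[1:])]


def peak_rate(angles_deg):
    """Peak hourly tilt rate of one sensor; 0.0 if fewer than two readings."""
    rates = hourly_tilt_rates(angles_deg)
    return max(rates) if rates else 0.0


def alert_level(sensor_series):
    """Classify a set of sensors; any single sensor can raise the level.

    sensor_series maps a sensor name to its list of hourly tilt angles.
    """
    peak = max((peak_rate(s) for s in sensor_series.values()), default=0.0)
    if peak > CRITICAL_RATE:
        return "critical"
    if peak > ALERT_RATE:
        return "alert"
    return "none"


# Hypothetical hourly readings (degrees) for the three sensors
readings = {
    "S1": [1.00, 1.01, 1.02],
    "S2": [2.00, 2.05, 2.06],
    "S3": [0.50, 0.50, 0.51],
}
print(alert_level(readings))
```

With hourly sampling, each rate already covers a full hour, which is one simple way to read the requirement that the rate stay above the threshold for at least an hour. Sub-hourly data would need an explicit persistence check.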

Combined Use of Field Monitoring Data and Empirical Thresholds
The abovementioned rainfall thresholds and field monitoring data were integrated in a procedure that is depicted in Figure 9 and represents a prototype of a warning system combining both techniques. Warnings are issued only when a precursor event has happened in the study region. The precursor event is a rainfall event which is above the threshold line. According to the historical data, such an event has the potential to trigger a slope failure in the region. The slope failure need not occur during the event or immediately after it. When the soil is moist, there are chances that the failure occurs within a short span after the occurrence of the precursor event. The lag time accounts for the possibility of landslides occurring within 72 h after a rainfall event, as observed during the test period. The value of 72 h was chosen for Kalimpong using a trial-and-error approach to minimize the missed alerts. The algorithm first checks whether the rainfall threshold is crossed, and then checks whether the peak hourly tilting rate exceeded the tilt rate threshold on the previous day. An alert is issued only if both conditions are satisfied. The term Ts in Figure 9 represents the time for which the alert is issued. If the rainfall forecast for a day is less than the empirical threshold, the algorithm checks whether the time span is less than the lag time (72 h in this study). If the lag time has not been crossed, it checks the tilt rate threshold again and issues an alert if that threshold is crossed.
The variable T5 in Figure 9 indicates the empirical rainfall threshold with 5% exceedance probability. The lag time is the period after the end of the precursor rainfall event during which an alert can still be issued. The algorithm was applied to both ID and ED thresholds, with a lag time of 72 h, and the resulting performance was evaluated.
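The decision logic described above can be sketched for daily time steps as follows. This is an interpretation of the text rather than a transcription of Figure 9: the function name `daily_alerts` and the input lists are hypothetical, each day is assumed to add 24 h to the time elapsed since the last precursor event, and a day exactly 72 h after the precursor is assumed to still fall within the lag time.

```python
LAG_HOURS = 72  # lag time chosen for Kalimpong in the text


def daily_alerts(rain_exceeds, tilt_exceeds, lag_hours=LAG_HOURS):
    """Return one alert flag per day.

    rain_exceeds[d] is True when the rainfall on day d is above the
    empirical threshold (a precursor event).
    tilt_exceeds[d] is True when the peak hourly tilt rate on day d
    crossed the tilt rate threshold on any sensor.
    """
    alerts = []
    hours_since_precursor = None  # None until the first precursor event
    for day, rain_flag in enumerate(rain_exceeds):
        if rain_flag:
            hours_since_precursor = 0
        elif hours_since_precursor is not None:
            hours_since_precursor += 24

        within_lag = (
            hours_since_precursor is not None
            and hours_since_precursor <= lag_hours
        )
        # The tilt condition is evaluated on the previous day's readings
        tilt_previous_day = day > 0 and tilt_exceeds[day - 1]
        alerts.append(bool(within_lag and tilt_previous_day))
    return alerts


# Hypothetical week: one precursor event on day 1
rain = [False, True, False, False, False, False, False]
tilt = [False, True, False, False, True, True, False]
print(daily_alerts(rain, tilt))
```

In this example the tilt exceedance on days 4 and 5 raises no alert, because by the following day more than 72 h have passed since the precursor event. This is how the rainfall condition filters out sensor disturbances that are unrelated to rainfall.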

Results
The rainfall [71], monitoring data and landslide data for Kalimpong town from 1 July 2017 to 31 August 2020 were used for evaluation of the performance of the different methods considered. A confusion matrix was used, and some commonly used skill scores were calculated to quantify the prediction performance. The outcome or prediction by the thresholds were classified into four categories as true positives, false positives, false negatives, and true negatives for quantitative performance evaluation. The days in which the threshold predicts the possibility of a slope failure are a positive outcome. If landslide has happened on the day, it is true positive and if not; it is defined as a false positive. Negative outcomes are days on which the rainfall is below the threshold condition. If no landslides or movements are reported on the day, the outcome is a true negative and otherwise it is a false negative. By combining these four attributes, the performance of a model can be quantitatively evaluated through statistical indexes.
Efficiency is the ratio of true outcomes to total outcomes [74]. The ratio of true outcomes to false outcomes is termed the odds ratio. For landslide forecasting models, the number of true negatives is expected to be much larger than the other three outcomes, and hence scores such as efficiency and odds ratio are dominated by the true negatives. It is therefore important to also use scores on which the effect of true negatives is minimal. Sensitivity, specificity, and the attributes derived from them are helpful in this respect. Sensitivity is the ratio of correctly predicted landslide events to the total number of landslide events, and specificity is the ratio of correctly predicted non-landslide days to the total number of non-landslide days. Both should be 1 for a perfect model. The likelihood ratio, defined as the ratio of sensitivity to (1 - specificity), is another measure of overall performance. When analysing the performance of a model, all these attributes should be evaluated thoroughly to identify the best model for the needs of the user.
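The skill scores defined above follow directly from the four confusion-matrix counts. The sketch below uses the definitions as stated in the text; the function name `skill_scores` is hypothetical, and a zero denominator is mapped to infinity as an assumption made here for convenience.

```python
import math
from typing import Dict


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning infinity when the denominator is zero (an assumption)."""
    return numerator / denominator if denominator else math.inf


def skill_scores(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute the skill scores defined in the text from confusion-matrix counts."""
    sensitivity = _ratio(tp, tp + fn)   # correctly predicted landslide events
    specificity = _ratio(tn, tn + fp)   # correctly predicted non-landslide days
    return {
        "efficiency": _ratio(tp + tn, tp + fp + fn + tn),  # true / total outcomes
        "odds_ratio": _ratio(tp + tn, fp + fn),            # true / false outcomes
        "sensitivity": sensitivity,
        "specificity": specificity,
        "likelihood_ratio": _ratio(sensitivity, 1.0 - specificity),
    }
```

For purely illustrative counts (not taken from the study) of 10 true positives, 20 false positives, 10 false negatives and 960 true negatives, the efficiency is 0.97 even though the sensitivity is only 0.5, which shows how the true negatives dominate the first score.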

Performance of Empirical Thresholds
The skill scores of the empirical thresholds are calculated using the confusion matrix and are listed in Table 2. The ID thresholds were 84% efficient and the ED thresholds were 86% efficient in predicting true outcomes. Table 2 shows that the number of false positives predicted by the empirical thresholds is much higher than the number of true positives; the model is therefore highly conservative and, if used in a LEWS, would produce a very high number of false alarms. The odds ratios are 5.36 and 5.98 for the ID and ED thresholds, respectively. Even though the false positives outnumber the true positives, the odds ratio is greater than 1 because of the large number of true negatives. In our case study, both rainfall thresholds fail to predict almost 50% of the landslide events, and the number of false alerts is almost twice the number of true alerts. The sensitivity of the thresholds is 0.51, which is too low for an operational LEWS. These drawbacks affect the reliability of the model, and false predictions will negatively affect the public response to the alerts issued. Hence it is important to overcome these two disadvantages in order to develop an effective LEWS.
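Under the definitions given in the Results section (efficiency as true outcomes over all outcomes, odds ratio as true outcomes over false outcomes), the two scores are tied together by a simple identity. The short derivation below is added for illustration; the symbols E, OR and N are notation assumed here and do not appear in the original text.

```latex
\begin{align}
E  &= \frac{TP + TN}{N}, \qquad N = TP + TN + FP + FN, \\
OR &= \frac{TP + TN}{FP + FN} = \frac{E}{1 - E}, \qquad
E = \frac{OR}{1 + OR}.
\end{align}
```

The reported odds ratios of 5.36 and 5.98 correspond to E of about 0.843 and 0.857, consistent with the rounded efficiencies of 84% and 86%. Because the odds ratio defined in this way depends on efficiency alone, it is dominated by the true negatives in the same way.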

Performance of Tilt Meters
From the data received from the sensors, it was observed that the sensors near Pyarieni jhora (sensors 1, 2, and 3) recorded variations during the monsoon.
The prediction performance obtained using the field monitoring data alone is given in Table 3. The data show a total of 273 false positives on days without landslides. These appear as abrupt changes in tilt angle during the months without rainfall. The variation in tilt angles was associated with external interference and noise, and was not related to slope failure. Hence it is important to correlate the sensor data with the in-situ moisture content and daily rainfall data for better interpretation and for issuing alarms, as done by the algorithm proposed in this work and explained in Section 3.3. As can be seen from Table 3, field monitoring data alone cannot support a good early warning system, as many false alarms would be issued. Even though field monitoring is more sensitive than the rainfall thresholds, the higher number of false alarms makes it less specific, resulting in a lower likelihood ratio of 3.41.

Performance of Proposed Algorithm
The peak daily tilting rates were employed to enhance the forecasting performance of conventional empirical rainfall thresholds. As shown in Tables 2 and 3, the major disadvantage of the rainfall thresholds and tilt meters, when used alone, is the high number of false positives. Another important aspect is that such thresholds do not consider a lag time, which is the time between the end of the precursor event (rainfall) and the occurrence of landslides. Hence this study proposes an algorithm which uses both rainfall thresholds and field monitoring data to issue an alert to the public.
The performance statistics obtained using the proposed algorithm are given in Table 4 (prediction performance of the algorithm-based approach, using empirical rainfall thresholds). Table 4 shows that the performance of both ID and ED thresholds improved considerably with the algorithm-based approach employed in this research. The main change after including the field monitoring data is the decrease in false alarms. The number of false positives was reduced from 273 (tilt meters alone) to 63 using the algorithm combining ED thresholds and tilt meters.

Discussion
The empirical rainfall thresholds, ID and ED, were observed to be in good agreement with each other. Both thresholds had similar prediction performance, with many false alerts and missed alerts. The tilt meters, when used alone, are able to predict most of the events, but with the downside of a very high number of false alerts. This drawback would limit the potential use of such methods in an operational LEWS: when false alarms are issued repeatedly, the response to future warnings may be affected. Since the social response to early warning is an important aspect of implementing a LEWS, it is crucial to lower the number of false alarms as far as possible. This is done by the combined approach proposed in this study. Figure 10 shows the variations in the alerts issued by the different methods considered. Another important component of the proposed warning system is the use of lag time, which helps in minimising the number of false alarms on days without rainfall. The lag time explores the effect of antecedent rainfall conditions on the initiation of landslides [75].
When compared with the empirical thresholds, the combined approach shows improved performance in terms of all the attributes. The number of true alerts increases and the numbers of both false alerts and missed alerts decrease (Figure 10), making the model more sensitive and more specific. It should also be noted that the performance of the algorithm varies slightly depending on the initial rainfall threshold used, ID or ED. The ED thresholds perform better than the ID thresholds both when used individually and when used in the algorithm. The difference lies in the smaller number of false alerts issued when ED thresholds are used (Figure 10). The reason for this improvement is the method of selecting the rainfall parameters considered in the definition of the thresholds. The use of an automatic algorithm accounts for both spatial and temporal variations in the rainfall and landslide data and reduces the number of false alarms issued.
The very high number of false alerts provided by the tilt sensors can be explained by the local conditions of the test site. First, seepage occurs at different locations within the old slide debris and through the joints of rocks. Erosion of debris material during rain may sometimes tilt the sensor, and such movements can be less critical and are not always associated with landslide processes. Second, when sensors are placed in locations where external factors cannot be excluded, they may record tilt angles which are not related to slope failure, but are due to micro-scale deformations unconnected with slope instability or to external factors such as disturbance from animals or human activities. This strengthens the initial idea of the proposed approach, in which data from the sensors must be interpreted in connection with the rainfall data in order to take decisions regarding early warning.
Figure 10 shows that the number of false alarms issued by the field monitoring is reduced considerably by the combined approach. This comes at the cost of missing some of the true alerts, which occurred long after the precursor event (rainfall). Such events started during the rainfall but continued at very slow rates for a few more days after it. The algorithm fails to predict such events when the delay between the rainfall and the failure exceeds the 72 h lag time. Increasing the lag time further would result in a greater number of false alerts. Thus, the use of tilt meters alone is a more sensitive approach with fewer missed alerts, but the higher number of false alerts makes it unsuitable for direct use in a LEWS. The combined approach has a better overall performance, with higher efficiency and likelihood ratio.
The method proposed in this study is conceived to issue alerts with a lead time of 24 h. That lead time would allow the local community to implement some counteractions, which should be carefully evaluated in subsequent stages of the research, based also on the expected risk scenarios (yet to be defined) and the reliability of the warning system. The prototype LEWS should be tested first for a longer duration and detailed action plans should be decided, after considering different risk scenarios.
The development of LEWS demands strong scientific research in data collection, interpretation, and integration of different modules. The proposed algorithm overcomes the disadvantages associated with both rainfall thresholds and field monitoring data. The method is simple and easy to export, given historical rainfall and landslide data and a network of slope monitoring sensors. The cost of installing a large number of sensors is a major concern for developing countries like India; hence a cost-effective approach is used in this study. The use of MEMS sensors along with the rainfall thresholds is thus a simple and economical approach for the prediction of landslide events.
Indeed, the methodology is far from producing perfect predictions and needs further improvement: a purely statistical rainfall threshold approach is too simplistic to fully cope with the complex geological setting of this area. The combined use of instrumental monitoring provides more insight into the behaviour of the slopes, but the error rates returned in the validation demonstrate that the complex geology of the area is still not adequately captured by the model. This conclusion opens other research directions to be explored in the near future, such as extending the monitoring network to gather more data at different locations, or combining the rainfall threshold method with a zonation of landslide susceptibility [15] based on the morphometry and geotechnical characteristics (e.g., weathering degree, orientation of joints, residual strength of the soil) of the area. Such improvements could help to better relate the triggering factors (i.e., rainfall) to the variability of the geological setting of the area.
Another limitation of the methodology lies in the intrinsic nature of the empirical rainfall threshold method, which does not directly consider the failure mechanism. Since the major rock type of the study area is schist, the chances of purely translational failure cannot be neglected. Theoretically, this leaves open the possibility of failures that go undetected by the tilt sensors. Since no such circumstances occurred during the study period, the response of tilt sensors to purely translational failure remains unexamined.
The proposed system should be considered a prototype, as a high rate of errors is still present; nonetheless, the tested approach improves on both the rainfall thresholds and the tilt sensors used alone, and thus represents a step in the right direction for a more balanced hazard management in the study area. The methodology proposed in this study, which is the first of its kind for the Darjeeling Himalayas, can be easily exported to other parts of the world, to test whether it can improve the forecasting performance of existing empirical thresholds in different hydrological and geological settings.
To understand the applicability of the proposed algorithm to different meteo-hydrological settings, experimental investigations can also be conducted at laboratory scale. This could verify the possibility of exporting the model to different regions, by simulating different rainfall conditions on different types of soil and topography [76,77]. The prototype developed in this study is a site-specific case from the Pyarieni Jhora of Kalimpong, where failure occurs at very slow rates and is observed as cracks and displacements in the uphill areas. In a laboratory-scale study, it is possible to simulate different types of landslides and define different thresholds for each case. Such studies can help in calibrating the model and the critical tilt rate for different conditions of precursors and slope failures, and can aid field-scale studies.

Conclusions
The rainfall thresholds defined for the Kalimpong region in the Darjeeling Himalayas have been evaluated quantitatively, using the recent rainfall and landslide data from the 2017 to 2020 monsoons. The analysis shows that the large number of false alarms makes the conventional empirical rainfall thresholds unsuitable for use in a LEWS. Even though the thresholds predict most of the landslide events correctly, the false alarms affect the reliability of the system and make it difficult to issue a trustworthy early warning to the public. To overcome this limitation, real-time field monitoring data from Kalimpong were used to enhance the forecasting performance of the empirical thresholds. The field monitoring system consists of MEMS units integrating a tilt sensor, a soil moisture meter, and a real-time wireless transmitter. The sensors which recorded peak variations were used to modify the empirical rainfall thresholds, using an algorithm that combines rainfall thresholds, lag time, and recorded tilting rates. According to the algorithm, the rainfall thresholds were taken as a first line of action, and the criticality was verified using the field monitoring data before issuing alerts. The results showed that the combined approach has an efficiency of 92% and a likelihood ratio of 11.33, which makes it a more suitable component of the proposed LEWS for Kalimpong than both the original version based only on rainfall thresholds and a monitoring system based solely on MEMS sensors. The algorithm overcomes the limitations of both rainfall thresholds and field monitoring data when used separately, and can be used by the authorities to warn the public regarding a possible slope failure. The study discusses the first prototype of a LEWS for this highly vulnerable landslide zone of Kalimpong, and the approach is found to be promising for implementation as an operational LEWS.