The impact of Probability Density Functions assessment on model performance for slope stability analysis

: The development of forecasting models for the evaluation of potential slope instability after rainfall event represents an important issue for the scientific community. This topic has re-ceived considerable impetus due to climate change effect on the territory [1, 2] as several studies demonstrate that the increase in global warming can significantly influence the landslide activity and stability conditions of natural and artificial slopes [3]. A consolidated approach in evaluating rainfall induced landslide hazard is based on the integration of rainfall forecasts and physically based (PB) predictive models through deterministic laws. However, considering the complex nature of the processes and the high variability of the random quantities involved, probabilistic approaches are recommended in order to obtain reliable predictions. A crucial aspect of the stochastic approach is represented by the definition of appropriate probability density functions (pdfs) to model the uncertainty of the input variables as this may have an important effect on the evaluation of the probability of failure (PoF). The role of the pdf definition on reliability analysis is discussed through a comparison of PoF maps generated using Monte Carlo (MC) simulations performed over a study area located in the Umbria Region of central Italy.


Introduction
In many areas of the world, rainfall-induced landslides represent a relevant threat to population, infrastructures, buildings, and cultural heritage. In the last years, the extreme rainfall events have induced an increasing frequency of slope movements [4,5]; therefore, the prediction of rainfall-induced landslides represent a major challenge for the scientific community.
The most damaging landslides are triggered by intense or prolonged rainfall [6,7,8,9] and the most common phenomena are the shallow landslides [10].
Considering the negative impact of landslides on society, different approaches have been developed to protect and safeguard the territory, for example geomorphological mapping, analysis of landslide inventories, heuristic terrain and statistically-based classification methods [11,12] have often been used to evaluate the landslide susceptibility of an area.
Landslide susceptibility (S) is the likelihood of a landslide occurring in an area on the basis of local terrain conditions [13]; it represents an estimate of "where" landslides are likely to occur.
The susceptibility (S), the hazard H (representing the probability that a landslide of a given magnitude will occur in a given period and in a given area) and the vulnerability V (considering people and infrastructures involved) define the landslide risk, which is therefore based on what has happened in the past.
The current extreme regimes are not comparable with past rains therefore the zoning of risk (function of S, H, and V) not always is sufficient to guarantee the protection of the territory and people.
A robust and sustainable response to planning policies is represented by landslide forecasting models that are able to simulate the slope stability as a function of expected meteorological conditions and physical characteristics of the territory.
The spatial and temporal forecasting of shallow landslides triggered by rainfall can be performed according to different approaches: empiric methods, that analyze records of landslide events and attempts to determine spatial and temporal variations in the occurrence and frequency of landslides [14], and physically-based (PB) approaches accounting for the local physical and mechanical properties that control the failure processes [15,16,17,18,19,20,21,22]. The latter approaches are preferred to forecast the spatial and the temporal occurrence of shallow landslides triggered by individual rainfall events; and they are commonly used by the scientific community due to their capability to describe the natural physical processes through appropriate analytical equations (some examples are presented by: 23,24,25,26).
If a detailed description of the study area is available in terms of slope topography and physical, mechanical and hydraulic soil properties, PB approaches can provide a high level of reliability [see , 27]. Generally, the detailed reconstruction of slope topography does not represent a relevant problem; on the contrary, an adequate characterization of physical, mechanical and hydraulic properties of soil cover is subjected to economic and practical limitations. Soils and rocks are described by parameters characterized by high variability in space both in horizontal and vertical dimension [28]. For instance, mechanical properties show their uncertainty not only from site to site and within a given stratigraphy, but also within homogeneous covers, as a consequence of natural deposition processes [29]. In addition to the soil properties variability (random uncertainty), the soil parameters are characterized by two different forms of uncertainty, the first one, the epistemic uncertainty, is linked to impossibility to measure directly a soil characteristic [30]; the second one, the model uncertainty, is related to the approaches used to describe a specific phenomenon [31]. The model uncertainty includes: 1) the measurement uncertainty, related to systematic errors (bias) and random errors (precision), 2) the statistical uncertainty linked to limited information and influenced by the technique used, and 3) the uncertainty due to the idealizations present in the physics formulation of the problem. While the epistemic uncertainty can be reduced, the random uncertainty cannot be eliminated.
Nevertheless, PB models are often used considering the quantities involved in the landslide processes deprived of uncertainty, quantified by a single fixed value [32; 33]; consequently, the derived predictions are expressed by a single value of the Factor of Safety, Fs, or the critical rainfall intensity, Ic.
On the contrary, when PB models are used with a probabilistic approach, the variability of the input quantities is modeled and the dependent variables are described as random as well.
For landslide forecasting through PB approaches, a deterministic approach that assumes the input data without uncertainty [19; 34; 35, 36; 20; 37] is less suitable than a probabilistic one, which considers input data as random variables [38] defined through their Probability Density Functions (pdf) [39]. In the probabilistic approach, the safety level of the slope is given by the Probability of Failure (PoF), i.e., the probability associated to a value of factor of safety equal or less than 1 [40; 41].
A key point in the PB probabilistic modeling is represented by the definition of theoretical pdf for the random quantities involved in the simulated physical processes. For a specific variable, the theoretical pdf must be able to reproduce its variability starting from the available measures. Typically, the large variability of physical and mechanical properties of the soil makes quite difficult identifying a priori the theoretical pdf. In addition, when applying the PB probabilistic models over large areas, the objective difficulties of having a significant number of measures to correctly assess the pdf for each variable leads to the adoption of simplified hypotheses. When the probability density functions of strength parameters could not be determined from in situ or laboratory tests, the pdfs are estimated based on judgment, experience, or indications supported by other authors [42].
In this paper, the impact of the selection of the pdf on the PoF estimation is discussed through the application of a PB probabilistic model to a pilot study area characterized by a detailed geotechnical characterization. Available collected measures in this area are used to define the pdfs for soil shear strength properties: effective cohesion (c') and friction angle ('). In this work a comparison is carried out between the PoF evaluated: i) starting from uniform pdf for all random variables, called PoFu, when the internal structure of the uncertainty is unknown and only the minimum and maximum values of variable is known [43]; ii) assuming pdf able to consider the spatial variability of random quantities, called PoFr. The analyses have been carried out developing a modification of transient rainfall infiltration and grid-based regional slope-stability analysis (TRIGRS) code [20], in its probabilistic version [44]. The reliability analysis has been carried out using Monte Carlo method [40].
The paper is organized as follows. In section 2, an overview of the theoretical aspects of the approach and the equations that govern the physically based probabilistic model implemented are described; a detailed description of the study area is presented in the section 3, in particular the geotechnical and hydrological assumptions considered for the reliability analysis are illustrated in subsections 3.1 and 3.2 respectively.
The experimental settings and the results, obtained in terms of PoF, are discussed in section 4. The conclusions and future research developments represent the final section of the paper (section 5).

Physically-based landslide forecasting model
The probabilistic model, implemented in order to assess the impact of pdfs definition on the PoF evaluation, represents a new extension of the probabilistic version [44] of the original TRIGRS code [20]. This is a PB model, able to work through a discretization of the study area on a regular grid, coupling a hydraulic model for the evolution of the pore water pressure during time and a mechanical model for the assessment of the temporal slope stability conditions.
In the deterministic analysis, the slope stability is expressed by the factor of safety, Fs, equal to: Where a is the inclination grade of slope, c' represents the effective cohesion of the soil, ' is the effective friction angle,  is the pressure head and  is the unit weight of the soil.
In transient flow conditions, the Factor of Safety varies with Z and t, due to the evolution with time and space of the pressure head  generated by the rainfall infiltration process. This phenomenon is controlled by the balance of mass for the pore water which requires the definition of: i) satured (s) and residual (r) volumetric water content, ii) the soil stiffness (Eed), iii) the initial pre-storm water table depth dw and the pre-storm infiltration rate parameters a and ILT; iv) the thickness of the soil cover h and v) the hydraulic conductivity ks. Fs results to depend on 12 parameters, represented in a synthetic form in the equation 2: = ( , ℎ, , , ′ , ′, , , , , , , ) In the stochastic analysis, the stability conditions are expressed by the PoF. In principle, all the 12 parameters in equation (2) can be considered as random variables, however, the number of random variables can be reduced, without affecting the forecast reliability under appropriate hypotheses. In the considered case study, the following assumptions can be done: 1) on the safe-side of the prediction, and without detailed characterization for the unsaturated parameters, the soil cover can be considered in fully saturated conditions; thus the balance of mass for the pore water (d/dt=Dα d 2 /dZ 2 , in which  is the pressure head linked to infiltration process and Dα is the soil 1D consolidation coefficient) reduces to the simple diffusion; 2) since s is typically affected by low uncertainty, it can be considered constant and evaluated from literature data; 3) if a high resolution DTM is used, the slope steepness can be assumed accurate enough to be characterized by no uncertainty; 4) the water table depth should be monitored in different points of the study area. Therefore, the PB probabilistic model used in this paper considers the randomness of the following quantities: • soil mechanical properties (c','); • soil saturated hydraulic conductivity ks and soil stiffness Eed, both considered in the infiltration problem through the coefficient of consolidation; = ( )⁄ • the thickness of the soil cover layer, h. The model is organized into two blocks, the first assess the pdfs, and the second performs the Monte Carlo simulations for reliability analysis.
The comparison is carried out in terms of PoF estimations, by considering two different pdfs: 1) the assessed pdfr, evaluated by considering the actual variability of the random quantities, and 2) the uniform pdfu, which ignore the internal structure of the uncertainty considering only the possible or probable range of parameters variation. In the first case, the PoFr is obtained, in the second one the PoFu is estimated.
To evaluate the influence of pdf theoretical distribution, Monte Carlo simulation has been used for the reliability analysis. For unrelated variables, and for the problem analyzed, the Monte Carlo method involves the generation of 5 sequences of random numbers, the N realizations provide a sample of possible values for the safety factor Fs. In particular, the PoFu and PoFr have been evaluated considering a number of Monte Carlo realizations, for all cases, equal to N = 10.000, corresponding to possible estimate error of about 25%.

Study area
The study area selected for the investigation is known as Nuvole di Morra (Fig. 1a), a district located in the Città di Castello municipality (northern sector of Umbria Region, central Italy). The study area has an extension of 3 km 2 . Slopes in the area have a steepness varying between a maximum of 30° and a minimum of 5° with the highest values in the west and south-east part of the area. The landslide appeared to be chiefly a reactivation of pre-existing landslides; however, areas never affected by old landslides were also affected by the new slope movement.
The landslide (Fig. 1 a) can be divided into three zones: the source area, located in the north-west sector of the area; the entrainment area, located in the south-west sector and the indirect movement located in the east part.
Relevant damages were caused to structures and infrastructures, and several families were transferred in other safe places (Fig. 1 b, c). After the event, geotechnical surveys were planned, to support the structural remedial works design and implementation.
In particular in correspondence to 8 perforation tests (S1 to S8 in Fig. 1a), 4 tests of class Q5 (undisturbed sample) and 10 tests of class Q3 (disturbed samples of class 3) have been analyzed. In addition to laboratory tests, 2 standard penetrometric tests SPT, 4 geometric tests type MASW (Multichannel Analysis of Surface Waves), 4 seismic refraction profiles and 3 permeability tests, were performed.
In this study, the available in situ measurements were integrated with the information derived by the dataset of Perugia Province [45; 46], to obtain a detailed geotechnical characterization that makes this area particularly suitable for the evaluation of the impact of the pdf on the model performance.

Physical and mechanical properties
The study area is characterized by a homogeneous soil cover. According to the Geological Map of the Umbria Region, the outcropping lithotype is "turbidites pelitic arenaceous" (also reported in the classification described in [45]; [46]). The definition of the stochastic parameters (mean value and coefficient of variation COV) for c' and ' ( Table   1) derives from the integration between the regional dataset information [45] and in situ test measurements as mentioned in the previous Sec. 3.
For the soil stiffness (Eed) and the hydraulic conductivity (ks), considering the few available measurements, literature data have been used. In particular, for ks, characterized by a high variability, a log-normal distribution has been considered, while for Eed, often delimited by two fixed extreme values, the beta distribution, which combines to low values of physical quantity a low probability of occurrence, has been considered [28].
The thickness of the soil cover (h), for each cell of the grid, has been evaluated using the following empirical formula: where α is expressed in degrees [35]; h is a further random variable of the probabilistic model, considered normally distributed (Table 1). The unit weight of the soil, s, has been assumed equal to 19 kN/m 3 , and the soil cover is considered homogeneous and completely saturated (Sr = 1). Considering the demonstrative character of the present study, and in absence of detailed information or recorded data, the initial pre-storm water table depth (dw) has been set equal to 50% of h, and the steady pre-storm infiltration rate (ILT) is assumed to be negligible.

Rainfall data
The role of pdfs on the PoF evaluation is discussed with reference to the rainfall event that affected the Umbria region between September and December 2005. The large amount of rainfall triggered many shallow landslides, among these Nuvole di Morra is the most relevant event.
Meteorological rainfall data for the study area are obtained through the raingauges of the Umbria Region monitoring network which provide semi-hourly data, recorded continuously throughout the year 2005. The rainfall data recorded by the rain gauges of Trestina and Città di Castello were considered (Fig. 2), in particular Morra landslide was triggered by rainfall observed between 25th and 28th November. The rainfall event considered as input in the probabilistic physically-based landslide forecasting model is characterized by a total duration (t) of 32 h and cumulative rainfall of about 90 mm.

Study area
The modified PG_TRIGRS code was applied to the study area to obtain predictions on its stability with different rainfall conditions. At first, the spatial distribution of FS was evaluated considering the mean deterministic values defined in Tab. 1 for the physical and mechanical properties of the soil.
As shown in Fig. 3, in absence of rainfalls, the area results stable with values of Fs always greater than unity. As can be expected, the areas characterized by high susceptibility to landslides correspond to the sectors characterized by the highest slope (west and south-east area of the polygon in Fig. 3). Using the rainfall event described in Sect. 3.2, the code provides the predictions shown in Fig. 4. As expected the stability conditions of the study area changes during time, with the evolution of the rainfall event. To observe the effect of the rainfall on the model spatially-distributed predictions, the PoF is evaluated at 4 different time instants td: td=8h; td=16h; td=24h; and iv) td=t=32h, considering the uniform pdf for the random quantities (Fig. 4). Considering a subdivision of the PoF variability range into 4 classes: 0≤PoF<0.15; 0.15≤PoF<0.35; 0.35≤PoF<0.5; PoF≥0.5, as expected, at early time (Fig. 4 a) there are not areas associated with very high PoFu (PoF>50%). As time increases (Fig. 4 b) PoFu starts to increase and there is a transition from low (white cells) to higher PoFu levels (orange and red pixels). Starting from time t = 24 h (Fig. 4 c), cells with PoFu variable between 15% and 50% start to concentrate in southeast and west sector of the landscape.
At the end of the storm (Fig. 4 d) the stability conditions are similar to those shown for t = 24 h. In the last 8 hours, the rain intensity of the event is low and it corresponds to a cumulative rainfall of 8 mm.
In the second simulation, the pdf based on the actual measures of the study area were used. Similar results are obtained for PoFr; in particular, at the beginning of the event (Fig.  5 a), area in western sector depicted by PoFr> 50%, are observed. The first 8 hours of rain are characterized by a cumulative rain intensity of about 20 mm.
As time increases the pixels that fall in very high PoFr class do not increase (Fig. 5b), but a transition from low PoFr value (white cells) to medium PoFr values (beige and orange cells) in the southeast sector are visible (Fig. 5 c and Fig. 5 d).
Also in this case the stability conditions reached at t = 24 h are similar to those obtained at the end of the event. a) b) c) d) A similar spatial PoF distribution can be observed in Fig. 4 and Fig. 5; the portion characterized by higher probability of failure are distributed in the same sectors of the area, however the pixels, that as time progresses pass into higher PoF classes is different ( Fig. 6 and Fig. 7).
It is possible to notice, as expected, that at increasing rainfall time the areas characterized by lower PoF decrease (in both cases of PoFr and PoFu), as long as the areas with bigger PoF values, increase. It can be also remarked an increase with rainfall time of areas characterized by PoFr>50 (red classes) while red classes of PoFu keep on small and almost constant values.
In particular, from the histograms plots ( Fig. 6 and Fig. 7  To promote a quantitative comparison between two approaches, the quantity: has been evaluated. For the sake of brevity only the comparison in relation to the end of the event is shown (Fig. 8). The areas where the quantity ∆ E is negative, thus PoFr> PoFu, correspond to the areas characterized by the highest PoF (red class in Fig. 4 and Fig. 5); the portions characterized by positive or null value of ∆ E correspond to areas with lower probability of failure (white and beige pixels in Fig. 4 and Fig. 5).
Considering the topographic features measure after Morra landslide, the sector in which ∆ E is negative corresponds to the triggering area of the landslide, the portion in which ∆ E is positive corresponds to the secondary area of the sliding phenomenon. In conclusion, PB probabilistic model with realistic pdf provides higher PoFr in trigger zones; PB model with uniform pdf provides higher PoFu in less susceptible sectors. The use of uniform pdf then produce: i) a non-precautionary estimate of the failure conditions in the most vulnerable areas and ii) false alarms in the stable slopes (through a PoF overestimation in areas characterized by small instability conditions).

Conclusions
The physical process that leads to the slope failure is characterized by multiple uncertainties [39], which should never be considered null [47], the uncertainty of the quantities involved is considered in probabilistic approaches where the safety level of slope is expressed as a random variable, quantified by the probability of failure PoF.
The PoF evaluation is related to the theoretical pdfs considered for the input quantities of the forecast models. In this work has been shown that the definition of pdfs for the physical and mechanical properties of the soil influences the reliability analysis, this effect has been quantify with the exact method of Monte Carlo. The probability of failure has been computed in relation to a particular case study where a detailed geotechnical characterization is available, this feature supports the possibility to define other than uniform pdf. In addition, the small extension of the study area ensures a good compromise between the reliability of the results and the calculation times which can be result prohibitive for too large scale. In relation to Morra area, the results show differences between PoFu and PoFr not negligible. In the portion of the area, characterized by high susceptibility to landslide, the use of pdfu seems to lead to non-precautionary PoF estimates (PoFu <PoFr). On the contrary, in areas characterized by lower susceptibility, the PoFu appears to overestimate the conditions of potential instability (PoFu>PoFr). This trend occurs in relation to all the time instants considered it seems to be independent of the rainfall intensities observed.
In order to generalize the results, the comparison in terms of PoF on other study areas is necessary and large areas, characterized by different morphologies, will be considered in future research.
Properly defining pdfs for random parameters not only improves the reliability of the results but promotes a more informed use of physics-based probabilistic approaches.