Bayesian Inference in Snow Avalanche Simulation with r.avaﬂow

: Simulation tools for gravitational mass ﬂows (e.g., avalanches, debris ﬂows) are commonly used for research and applications in hazard assessment or mitigation planning. As a basis for a transparent and reproducible decision making process, associated uncertainties need to be identiﬁed in order to quantify and eventually communicate the associated variabilities of the results. Main sources of variabilities in the simulation results are associated with parameter variations arising from observation and model uncertainties. These are connected to the measurement inaccuracies or poor process understanding and the numerical model implementation. Probabilistic approaches provide various theoretical concepts to treat these uncertainties, but their direct application is not straightforward. To provide a comprehensive tool, introducing conditional runout probabilities for the decision making process we (i) introduce a mathematical framework based on well-established Bayesian concepts, (ii) develop a work ﬂow that couples this framework to the existing simulation tool r.avaﬂow, and (iii) apply the work ﬂow to two case studies, highlighting its application potential and limitations. The presented approach allows for back, forward and predictive calculations. Back calculations are used to determine parameter distributions, identifying and mapping the model, implementation and data uncertainties. These parameter distributions serve as a base for forward and predictive calculations, embedded in the probabilistic framework. The result variability is quantiﬁed in terms of conditional probabilities with respect to the observed data and the associated simulation and data uncertainties. To communicate the result variability the conditional probabilities are visualized, allowing to identify areas with large or small result variability. The conditional probabilities are particularly interesting for predictive avalanche simulations at locations with no prior information where visualization explicitly shows the result variabilities based on parameter distributions derived through back calculations from locations with well-documented observations.


Introduction
Risk and uncertainty analyses are gaining increasing attention in civil engineering applications, for example in the field of geology and geotechnics [1]. Their basis are probabilistic approaches, accounting for the distribution and uncertainty of input parameters, subsequently allowing to map the resulting output variations.
For snow avalanches simulation tools are usually utilized in deterministic approaches describing singular events, limiting, e.g., the quantification of the effectiveness of mitigation measures, advanced hazard zoning or the application of risk-based mitigation planning [2]. Due to model simplifications and the numerical implementation, process parameters that are essential for the simulation tools are not directly measurable and remain conceptual. Therefore, these parameters have to be estimated via back calculations, fitting simulation results to observed data, solving an inverse problem [3]. An important task is to understand the arising uncertainties in order to provide insight into the range of plausible results.
Over the years, different probabilistic approaches have been established to deal with the uncertainties involved in the optimization of process parameters and simulation procedure. A comprehensive probabilistic model has been presented by Ancey et al. [3], who investigates both the release probability and the Coulomb friction parameter of a one-dimensional process model. Eckert et al. [4] connect statistical-topographical methods (velocity modeling, volume correlation) for the predictive distribution of avalanche runout distances. The runout distance is also the target output of the probabilistic framework of Eckert et al. [5]. They used a simplified one-dimensional sliding block propagation model for the French avalanche database, choosing the friction proportional to release characteristics and varying the starting position. Eckert et al. [6] adapted the method of Eckert et al. [5] from a mass block model to a depth-averaged model by additionally considering the dynamic evolution of the avalanche body. Furthermore, the method was extended taking into account the variability of the starting position and velocity as well as the release depth and length. In this context the non-exceedance probability of the runout distance and impact pressures have been related to the return period. One common challenge in these concepts is a homogenized data availability allowing to transfer knowledge from one to another avalanche event or path [7]. It is crucial to treat uncertainties explicitly when the avalanche risk is evaluated [2]. Straub and Grêt-Regamey [8] set new ground, combining process and data based simulation approaches, explicitly treating parameter uncertainties through Bayesian inference. They compared the uncertainty of the avalanche release process with respect to nine deterministic parameter scenarios for a Voellmy-like process model in AVAL-2D, estimating annual probabilities of runout distances. Despite these efforts stochastic concepts and specifically Bayesian inference are still insufficiently considered in the snow and avalanche field. Particularly when operating with a state of the art process propagation model in three dimensional terrain explicit uncertainty treatment and providing a direct interpretation of derived probabilities remain unclear.
To tackle the problem of explicit uncertainty treatment and to relate these uncertainties to conditional probabilities we propose a work flow to (i) incorporate uncertainties in the optimization process of a two-dimensional open source avalanche simulation tool and derive parameter distributions, which can (ii) be used for probabilistic forward simulation and prediction.
Thereby, the open source simulation software r.avaflow [9] is combined with a transformation of the simulation results in an avalanche path dependent coordinate system [10,11] in order to compare the simulation results with observed extreme avalanche events. The open and adaptable features of r.avaflow provide a base for implementing single-or multiphase flow solvers in a python environment, allowing for the direct coupling to the postprocessing and optimization framework. The utilized single-phase process model in the avalanche simulation software includes the combination of the conservation equations, their numerical implementation and their closures, such as bottom friction relation or entrainment.
The basis of this work is back calculations of observed field data y obs , using measured input data x and a process model f with parameters θ. The first target is to derive distributions π(θ) of process model parameters, that allow one to approximate the observation avalanche data y obs through the process model y obs = f (x, θ). A stochastic approach is used, which explicitly treats the arising uncertainties in the measurements as well as in the process model. Thereby, Bayes' theorem is employed to derive the unknown parameter distributions π(θ) through the back calculations. The back calculation of a documented avalanche event allows one to estimate a parameter distribution π post (θ) that is associated with optimized simulation results with respect to the observations y obs . At this place, it becomes obvious that the process parameters θ directly depend on the observation data y obs and therefore may differ for different avalanche paths or respective events, further yielding different conditional probabilities.
The subsequent step of this work is to perform forward and predictive calculations. In this context predictive calculations correspond to class A predictions [12], i.e., a prediction without prior knowledge of the event itself, based on other available observations. Class C1 predictions on the other hand take into account the observations of the event itself and may therefore be compared to a simulation of a documented avalanche event with the best fitted parameter set in a deterministic sense. Avalanche observations are utilized to determine the parameter distributions π(θ) in the back calculation. Based on this parameter distribution π(θ) the forward calculation provides simulation results, that allow one to evaluate the correspondence to the avalanche observations utilized in the back calculation. This corresponds to a C1 prediction, however, with additional statistical information depicting the variability associated with the parameter distribution. The predictive calculation relies on the back calculated parameter distribution of one event, which is applied to another event, and thus this corresponds to a class A prediction for the other event. The predictive calculation results allow to evaluate changes in the result variability and the availability of local observations additionally allow to investigate the simulation result plausibility. Monte Carlo simulations allow one to estimate the variability of the predictive simulation results y pred with given input data x. As a final, practical outcome, a visualization of the resulting variability is presented. These visualizations show conditional probabilities that a certain area is affected by the avalanche simulation, i.e., the probability that the dynamic peak pressure results exceed a threshold value, considering the observation data. The runout probability visualizations allow for an intuitive interpretation of predictive avalanche simulation results and their associated uncertainties.

Simulation and Postprocessing
The input-output model y = f (x, θ) is a combination of the avalanche simulation tool r.avaflow [9] and an evaluation of the simulation results in a path dependent coordinate system [10], which allows one to define simple evaluation criteria. The tool r.avaflow includes different flow models: a single-phase model for solid flows and a two-phase model additionally considering solid fluid interactions. The friction relation in the latter is represented by a classical Voellmy friction [13,14] or a modified approach to account for random kinetic energy [15]. In this work, we choose the one-phase flow model with the Voellmy friction model defining the set of process model parameters θ = {δ 0 , ξ}. The friction angle δ 0 (or Coulomb friction coefficient µ = tan δ 0 ) and the turbulent friction coefficient ξ are the determining parameters of the basal shear stress τ (b) , which depends on the normal stress σ (b) and the velocity u, τ (b) = σ (b) tan δ 0 + g ξ u 2 , with the gravitational acceleration g. The appropriate parameter ranges of each model parameter can be constrained by various approaches. Physically relevant ranges [16] may be useful, but also experimental work [17], existing guidelines and literature [14,18] or prior knowledge based on back calculations [19] can be utilized. Based on these values and in order to avoid any a priori restrictions on the possible parameter combinations [11], we choose the rather large intervals of the single parameters to be δ 0 ∈ [0 • , 30 • ], resp. µ ∈ [0, 0.577] and ξ ∈ [500 m s 2 , 2500 m s 2 ]. The simulation input x, including the digital elevation model (DEM) and the release volume V rel , are directly derived from field measurements, documentation data and respective guidelines. Uncertainties of the simulation input are not explicitly treated although it is known that they may also influence the simulation results [20,21]. A simulation run is performed combining the fixed simulation input and a set of variable process parameters θ. Therefore, the influence of the simulation input on the simulation results is implicitly associated to the uncertainties that arise from the model itself. One challenge is to properly define the simulation output y that can be compared to observations y obs since the direct results of depth averaged state of the art simulations tools are the evolution of flow depths d(x, t) and velocities u(x, t) in space x and time t, which are usually not observed. Hence, it is necessary to define evaluation criteria [9], which in our case are based on the peak (maximum over all time steps) values of flow depths d(x, t), velocities u(x, t) and derived impact pressures p peak (x) = ρ u 2 (with the density ρ), that are transformed in a flow path relative coordinate system. In this coordinate system, the scalar result variables run-out (r), true positive (tp, simulated area matching the observed affected area, measured relative to the observed affected area A obs ), false positive ( f p, area that is affected in simulations but not in the observations, measured to the observed affected area A obs ) and maximal velocity (u max ) are derived and collected in the output vector y = {r, tp, f p, u max } [11]. Defining optimization variables such as runout or impacted area in terms of impact pressures is in accordance with avalanche hazard mitigation guidelines [10,22,23]. Therefore, the outline of the peak pressure result is identified by p peak > p lim with a pressure limit of p lim = 1 kPa (for ρ = 200 kg/m 3 corresponding to u = 2.2 m/s). With this outline the runout r can be identified and by a comparison to the documented affected area A obs the measures true positive tp and false positive f p, are determined. The maximum velocity u max is identified as the maximal velocity value over all time steps over the whole simulation domain.

Avalanche Data
To highlight the applicability of the presented approach we arbitrarily chose two case studies that are representative in terms of the typically available avalanche data. With this we ensure that the same method can be applied to a larger set of case studies allowing for a more detailed analysis of the resulting parameter distributions.
The case studies are from different regions in Austria with a similar quality of documented avalanche events (approximate destructive size d4-5 [24]). High quality assessments of release area and depth, as well as affected area exist for both events , but no additional information with respect to dynamic data (velocity, impact pressures, ...) are available. The Kerngraben avalanche event (Salzburg, AT, V rel = 65,000 m 3 ) is utilized to perform the back calculation, yielding optimized parameter distributions for the process model parameters, covering the uncertainties associated with the observations and the simulation itself. With these optimized parameter distributions we perform forward and predictive calcutlations (i.e., Monte Carlo simulations) for the Kerngraben and the a priori unknown Wolfsgruben avalanche (Tyrol, AT, V rel = 275,000 m 3 ). In order to further evaluate the predictive simulations we compare the simulation results with the existing observations. The DEMs at a resolution of 10 m are freely available [25]. Release area mappings and release depth estimates exist for each of these avalanches. They have been obtained through expert assessments, based on observational data supported by state of the art tools (slope and curvature [21,[26][27][28]).
Direct observations of avalanches are rather sparse. Therefore, the available information is often derived from historical documentation which usually includes the affected areas (see Table 1), which allow one to determine the associated avalanche runout.
Observations on dynamic data, such as velocities, form an essential part of parameter optimization for spatio-temporal propagation models but such data are rare [29]. While sparse radar measurements allow detailed velocities evaluations with spatial Reference [30], most velocity estimates rely on measurements of travel times without any spatial reference. We use an empirical velocity estimate compensate for missing velocity observations, linking the maximum velocities of avalanches to the total altitude difference along their avalanche path (u max ≈ 0.6 g ∆z, [31,32]).

Mathematical Framework
The objective of the back calculation is to solve an inverse problem, i.e., to determine the model parameters θ for given known input x and measured output y obs , mapping the uncertainties that are associated with the simulation and data in a probabilistic distribution of process parameters. Therein is the input-output model, the vector x describes the simulation input, θ denotes the model parameters and y is the resulting output. The standard optimization approach to solving an inverse problem would be to find a parameter combinationθ that minimizes the distance of the corresponding model output f (x,θ) to the observed data y obs . However, for the purpose of modelling the uncertainty of the fitting parameters as well as for the forward calculation and prediction in Section 5, the probability distribution of the parameters θ is required. Here is where the Bayesian approach comes in, which allows one to derive such a probability distribution from (a) prior knowledge about the parameters (in our case, physical restrictions on their range), (b) the observed data y obs , and (c) assumptions about the distribution of the model error and the measurement error.
Consequently, we treat the problem as a statistical inverse problem. All parameters θ included in the model f and the observed data y obs , as well as their errors are considered as random variables (denoted by capital letters and their realizations by lower case letters). The statistical inverse problem is solved for the unknown parameter vector Θ. X corresponds to the fixed simulation input such as release volume and topography and Y to the documented data. The error term E should model the arising uncertainties of the avalanche data Y (runout, velocity, affected area) and the model and its numerical implementation, respectively. The error term E is considered to be a mean zero random variable acting as additive noise, the probability density π E of which is assumed to be known. These assumptions lead to where E f denotes the error from the implementation of the model, E Y denotes the noise associated with the the uncertainties of the measured data, summarized in Table 1, and E = E Y + E f . The function f describes the simulation procedure and the postprocessing of the simulation results. Note that we do not take into account any error resulting from uncertainties in the simulation input X.
The goal of the Bayesian approach is to determine the probability distribution of the parameters θ, given the observed data y obs , the so-called posterior distribution π post (θ). The required ingredients are

•
The prior probability density π prior (θ) encoding the prior knowledge about the model parameters; • The likelihood function π(y obs |θ) expressing the probability of the observed data when the parameter has a given value θ.
Bayes' theorem is an extension of the the well-known formula for conditional probabilities of events A, B to the case of random variables. A subtlety in Formula (4) is that π(y obs ) is unknown. However, this information is not needed, because π(θ|y obs ) is a probability distribution, so its integral over θ-space has to be equal to 1. Thus π(y obs ) turns out to be a normalizing constant which in principle can be evaluated as the reciprocal of the integral over the known quantities π prior (θ)π(y obs |θ). This fact is often expressed by the formula π(θ|y obs ) ∝ π prior (θ)π(y obs |θ) .
The specification of the likelihood function is commonly based on the probability distribution π E of the error, that is, At this point, a statistical assumption about the distribution of the error is needed. The natural assumption is that the error has a normal distribution with zero mean and a certain variance or covariance matrix Σ E in the multi-valued case, i.e., for multiple observation variables. The choice of the error covariance matrix is usually based on the covariances of the observed data and should ensure that all components scale equally. This leads to the likelihood function To explore the posterior distribution we use the Markov chain Monte Carlo (MCMC) method, i.e., we utilize the Metropolis-Hastings algorithm [33,34] to produce a sample of the posterior distribution π(θ|y obs ). The Metropolis-Hastings algorithm generates a Markov chain which converges to π(θ|y obs ) as its unique stationary distribution. Convergence of the algorithm is known to hold under rather mild conditions (for more details see [35][36][37][38]). In principle, π(θ|y obs ) could also be evaluated directly which, however, would require computing the mentioned normalizing constant by a possibly high-dimensional integration. The major advantage of the Metropolis-Hastings algorithm is that it requires knowledge of the likelihood function and the prior distribution only up to a constant, as can be seen from the iteration step which uses the ratio (7).
In addition to the prior density π prior (θ), the Metropolis-Hastings algorithm needs the specification of a proposal distribution q(θ * , θ), which we choose of the form q(θ * − θ) where q is a Gaussian distribution with mean zero and covariance matrix Σ prop . The Markov chain of parameters θ t , t = 0, 1, 2, 3, . . . is computed by the algorithm as follows. First, a starting value θ 0 is drawn (or simply selected) from the prior distribution π prior (θ). Having obtained the member θ t of the chain at step t, a candidate θ * for the next step is drawn from the proposal distribution q(θ * , θ t ), centered at θ t . At this stage, the model output y = {r, tp, f p, u max } with given simulation input x, candidate process parameters θ * ) is simulated and π(y obs |θ * ) = π err (y obs − f (x, θ * )) is computed. Next, the ratios α = π(θ * |y obs ) π(θ t |y obs ) = π prior (θ * )π(y obs |θ * ) π prior (θ t )π(y obs |θ t ) are formed. The fraction α is used to identify those parameter combinations which are suitable to be added to the Markov chain. The candidate θ * is always accepted if α ≥ 1; otherwise, it is accepted with probability α and rejected with probability 1 − α. If the candidate is accepted, the parameter combination θ * is set as new state (θ * → θ t+1 ); if the candidate is rejected, the actual state of the chain is kept for the next iteration step (θ t → θ t+1 ). Possibly after a burn-in phase, the generated tails of the Markov chain may serve as a sample of the posterior distribution π post (θ). An important measure to assess the quality of the Markov chain is the acceptance rate, i.e., the fraction of proposed candidates which are accepted. An acceptance rate of about 25% is generally considered appropriate for the multivariate case, see, e.g., [39,40]. Moreover, the mixing of the chain is an indicator for whether the chain is exploring the whole parameter space. A good mixing is desired, which means that the candidates move fast from one part of the state space to other parts. One possibility, which we apply to test for sufficient mixing, is the visual interpretation of trace plots (e.g., see Figure 1). The acceptance rate and mixing of the chain are influenced through the proposal distribution. A proposal matrix with high variances leads to a high fluctuation of the parameters, which in turn leads to small acceptance rates. On the other hand, for too small variances the parameter chain is at risk to get stuck in regions with "good parameters" and not to explore the whole parameter space, which leads to a too high density of accepted parameters.

Application-Kerngraben Avalanche
To describe the investigated uncertainties, i.e., the error covariance matrix Σ E in the likelihood function, we proceed as formulated in (3) and account separately for the data measurement error ε y and the process model error ε f , i.e., we use the error model which in turn leads to the covariance matrix in the likelihood function. The four components arising from the measurement errors are assumed to be independent. They are summarized in the diagonal covariance matrix Σ E,y . This assumption is reasonable, if measurement devices for the respective observations, e.g., statistical estimates for velocities and field assessments for runout distances, differ. We are aware of the fact that runout distances and affected areas are not independent (e.g., that an increase in affected area is usually accompanied by an increase in runout). An explicit autocorrelation treatment or an additional correlation analysis of the avalanche observations could improve the results. On the other hand, the process model correlates the four output components of the model and thus also the model errors cannot be independent. These correlations are explicitly treated and estimated in combination with the variances of the model errors in the covariance matrix Σ E, f . The estimates for data measurement errors are based on expert judgment (std E,r = 25 m, std E,tp/ f p = 0.05 A obs , std E,u max = 10 m/s). While the diagonal elements of the covariance matrix for the model errors may also be determined via expert estimates, this is not straightforward for their off diagonal counterparts representing their correlations. In order to determine a model error covariance matrix, which may also be applied to other case studies, we perform 15 Monte Carlo simulations (each of sample size 500) of events with similar destructive avalanche size characteristics. The data measurement and model errors are of the same order of magnitude and the covariance matrix reads as: Assuming that the errors ε y and ε f are independent and Gaussian, the sum is again Gaussian and the covariance matrix is obtained by adding the individual covariance matrices For the prior distribution, no knowledge about the process model parameters apart from physically relevant intervals is assumed (see [11,41]). This information is expressed by a continuous uniform distribution of the respective parameter ranges, which can be expressed as δ 0 ∈ [0 • , 30 • ](µ ∈ [0, 0.577]), ξ ∈ [500 m s 2 , 2500 m s 2 ] for the prior density of θ = {δ 0 , ξ}. An important task when applying the Metropolis-Hastings algorithm is to asses the convergence behaviour. Accordingly, it is common to investigate different initial distributions as well as proposal distributions. These choices influence how the chain moves through the parameter space. We vary the initial distribution and the proposal distribution, respectively. A suitable choice for the proposal matrix Σ prop is a diagonal matrix with variances of (5 • ) 2 for δ 0 and 500 m s 2 2 for ξ, such that the acceptance rate reaches about 50%. Given this, the choice of the starting value has no influence on the resulting parameter distribution. Therefore, choosing θ (0) = {15 • , 1500 m s 2 } as starting values, located in the center of the parameter space appears appropriate with respect to the initial distribution.
The outcomes of the Metropolis-Hastings algorithm, i.e., the resulting trace plot of the Markov chain of length 2000 and the histogram of the stationary distribution for the parameters θ = {δ 0 , ξ} are displayed in Figure 1. The candidates for δ 0 are in the range δ 0 ∈ [0 • , 20 • ], whereas the distribution of possible candidates of ξ is wider and covers the whole parameter space of the prior distribution.
Due to the fact that we cannot exactly determine when the chain has reached its equilibrium distribution we decide to search for an appropriate proposal distribution and stop the algorithm after 2000 iterations, such that the monitoring properties are sufficiently met. We take the resulting Markov chain as a sample of the posterior distribution. The calculation time on a single core CPU with 3.40 GHz for the back calculation task is about ≈18 h and is strongly dependent on the utilized simulation tool.
The acceptance rate, i.e., the fraction of accepted and proposed elements, of the investigated scenario is 0.48. This means that 967 parameter combinations (candidates) were accepted and added to the resulting distributions θ post (Figure 1). The mean values of the distributions areδ 0 = 11.3 • andξ = 1714 m s 2 , which fit in the range of corresponding values found in the literature (e.g., [14,42,43]).

Mathematical Framework
The objectives of this task are (i) to use the posterior distribution π post (θ) and perform forward calculations or predictive simulations, leading to probabilities of simulation results y pred (runout, velocities), and (ii) to assess explicit information about their variability, which is directly linked to the back calculated posterior distribution. The main difference between forward calculations and predictive simulations is that forward calculations utilize local avalanche observations, which directly influence the back calculated posterior distribution, to perform forward calculations at the same location (class C1 prediction). Predictive simulations are performed in case that the utilized back calculated posterior distribution is independent of the local avalanche observations (class A prediction). For both calculations the same Monte Carlo approach is applied: a sample of size N of the model parameters following the posterior distribution is generated, simulations with these parameter combinations θ 1 , . . . , θ N are performed and the simulation results y 1 , . . . , y N are evaluated statistically. Thus a sample following the posterior distribution θ post is generated, following the same distribution as obtained from the Markov chain. There exists a variety of approaches to determine a sample of the posterior distribution [44]. One common method to realize this is the application of the component-wise inversion method, that is separately for each parameter. However, this method cannot be used in a straightforward way because the parameters and thus the components of the chain are correlated.
Another widespread approach requiring minimal artificial assumptions but ensuring a robust, smooth estimate is to approximate the joint distribution by means of a copula, preserving the marginal distributions and the linear correlation coefficients (estimated through the covariance matrix C θ ) of the original multivariate distribution. A two-dimensional copula [45] is a function C(a, b) of two variables a, b, both from the unit interval, which has the form of a two-dimensional distribution function whose marginals are one-dimensional uniform distributions. The copula encodes the desired correlation structure; given any two one-dimensional distribution functions F 1 , F 2 , the function F = C(F 1 , F 2 ) is a two-dimensional distribution function with marginals F 1 , F 2 and the desired correlation structure. The applied two-stage procedure consists of (i) the approximation of the marginal distributions This method of generating multivariate samples is applied, using a Gaussian copula, which is of the form C R (a, b) = Φ R (Φ −1 (a), Φ −1 (b)), where R is a given correlation matrix, Φ is the cumulative distribution function of a univariate standard Gaussian variable, and Φ R is the joint cumulative distribution function of a bivariate Gaussian variable with mean zero and covariance matrix R. Using C R as copula produces a joint distribution F(δ 0 , ξ) which has the (rank) correlation given by R. Generating random numbers from a Gaussian copula is simply done by generating a bivariate Gaussian variable (s, t) with correlation matrix R, set a = Φ(s), b = Φ(t) and finally δ 0 . To obtain the lower bound for the sample size in order to get statistically meaningful results we use Bikelis' theorem (see [46]), which leads to an appropriate sample size of 500 simulations to estimate the expectation value of the runout up to 3 m with a confidence level of 95%. This appears to be appropriate as it is below the spatial resolution of the utilized DEM.
The simulation with a sample of the posterior distribution corresponds to a simple Monte Carlo approach which gives a distribution of the estimator in each component of the result variable y.
With the obtained parameter sample θ pred post (see Figure 2) forward calculations for the Kerngraben and predictive simulations for the Wolfsgruben avalanche are performed. In each case this corresponds to 500 simulation runs allowing for a quantitative, statistical evaluation of the avalanche simulation results (runout, true positive, false positive, maximum velocity) y pred = (r, tp, f p, u max ).

Forward Calculation-Application to the Kerngraben Avalanche
Forward calculations are an appropriate instrument to test the accuracy of the simulation result with respect to the parameter optimization and the avalanche observations. Table 1 summarizes the observed avalanche data y obs and the corresponding simulation results y for the two case studies.
For the Kerngraben avalanche event the runout length derived from the observations corresponds to r = 1741 m. The corresponding runout simulation results, which are summarized in Figure 3, deviate on average ≈10 m from this observed value (median 1751 m), with 50% (75-25% quantiles) of the simulated runout lengths ranging from 1731 to 1771 m and 90% (95-5% quantiles) ranging from 1701 to 1831 m. Large tp values show that the simulated affected area is in high correspondence with the observed one A obs (>99% for the 25-75% quantiles and 96-99% in the 5-95% quantiles). These high values appear evident comparing the P(p peak (x) > p lim |y obs ) = 95% (most simulation runs affect this area) line with the observed affected area A obs in Figure 4. This underlines the importance of the false positive f p values in the apparent case where the overestimation by the simulation appears more distinct than the correctly depicted affected area. Here the false positive ( f p) values show that this high correspondence indicated by the high tp values, is accompanied by an overestimation of the simulated affected area (96-99% for the 47-58% quantiles and 40-66% in the 5-95% quantiles, always relative to the observed affected area). The estimated velocity of u max = 55 m/s cannot be reproduced by the simulations runs: the calculated median is 25 m/s, ranging from 20 to 29 m/s (5-95% quantiles).  The observed affected area A obs is shown as reference for the true positive (tp) and false positive f p values in the runout area (black shading, solid line). The color map indicates the probability P(p peak (x) > p lim |y obs ), that a respective area of the simulation raster is affected by an avalanche, given the considered data. The P = 95% and P = 5% probability isolines are highlighted (dotted and dashed-dotted lines), allowing to identify areas that exceed the threshold peak pressure p peak (x) > p lim in most simulation runs (dark red) or outliers that appear with probabilities P < 5% (light blue).

Prediction-Application to the Wolfsgruben Avalanche
Comparing the predictive simulations y pred to the observations y obs (see Table 1 and Figure 3) one can see that the median value of the predicted run-out (2071 m) deviates about 32 m from the documented run-out (2103 m), with 50% (75-25% quantile) of simulation runs ranging from 2038 m to 2114 m and 90% (95-5% quantile) ranging from 1985 m to 2233 m. On average the predicted affected area covers 97% of the observed affected area A obs (tp ranging from 94% to 100% for the 5% and 95% quantiles). However, it has to be noticed that this high correspondence is accompanied with an area of 33% (24-44%, f p in relation to the affected area) that was affected in the simulations, but not observed (see also Figure 4). Similarly to the forward calculation we observe that the predicted maximum velocities are in the range of 31-38 m/s (5% to 95% quantile), a value that appears significantly lower than the estimated value of 58 m/s.

Mathematical Framework
In order to evaluate the variability of the avalanche simulation results, which in turn indicates the robustness of the employed simulation scenario, a visualization of the conditional runout probabilities P(x|y obs ) is introduced. The visualization displays the conditional probabilities to exceed a certain peak pressure threshold at an arbitrary location p peak (x) = ρ u 2 , (ρ = 200 kg/m 3 , see Section 2), with the numerical cell j representing the discretization of the location x. Here we define the pressure threshold with p peak (x) > p lim = 1 kPa, which is in accordance with the lower range of values that are suggested in international hazard zoning guidelines [Johannesson].
To count the relative hitting frequency over the grid, define the indicator variable I j at cell j as Each I j is a Bernoulli random variable, where the probability p for outcome 1 is the conditional runout probability p = P(I j = 1|y obs ) that the peak pressure exceeds the threshold at cell j. Hence the conditional runout probability that the peak pressure exceeds the threshold at a location x, discretized by a numerical cell j, in a realization of the Monte Carlo simulation sample I i j , i = 1, . . . , N can be evaluated as For any location, the map displays the probability to be affected by the avalanche simulation, conditional on the observed data y obs . The conditional runout probabilities differ from approaches in, e.g., Straub and Grêt-Regamey [8], where the annual probability distribution of the run-out distance is conditional on deterministic model parameters or probabilistic measures, such as impact indicator index [47] with their direct dependence on the observational data.

Application to the Kerngraben and Wolfsgruben Avalanche
The runout probabilities of the forward calculations and predictive simulations with the Monte Carlo sample θ pred post are shown in Figure 4. The Kerngraben avalanche starts at an elevation 2175 m.a.s.l. and travels towards its runout area, located at ≈1225 m.a.s.l. Areas with P(p peak (x) > p lim |y obs ) > 5%, representing the majority of the avalanche simulations (95-5% quantiles, correspond to runout variation of −40 m to +90 m, compared to the observation) largely overlap with the documented affected area, which is also reflected by the accompanying large tp values (see Section 5.2 and Table 1). The corresponding f p values originate due to the affected areas on the west side of the runout area, where the simulations exceed the observed affected areas. For areas with low runout probabilities P(x|y obs ) the variation in runout appears to increase, especially considering single simulation runs that distinctly follow the topography along the valley bottom on the south-east side of the runout area.
The release zone of the Wolfgruben avalanche is located at about 2250 m.a.s.l. and the avalanche runout areas at ≈1300 m.a.s.l. The variability of the results are low along the main avalanche track, but increase towards the runout zone. Here we see that the variability for most of the simulation runs is rather small (5-95% runout probability corresponding to −119 m to +130 m runout), whereas in the runout area the variability of the result increases for low runout probabilities on the north-west side towards the counter slope and the right side, where very few simulation runs (P(p peak (x) > p lim |y obs ) < 5%) follow the river bed at ≈2250 m.a.s.l. It also appears evident that the main flow direction in the simulations slightly differs from the observed one, which is also indicated by the obtained runout difference of 32 m, due to the fact that the maximum of the runout observed is rather located at the north side of the runout area.

Discussion
The results highlight the potential of the proposed framework, but also indicate its limitations. The main drawback of the method is that uncertainties in the simulation input (DEM and release volume) are not explicitly treated. Although it is known that they have an influence on the simulation results, previous studies [10] indicate that small variations of frictions parameters cover the result variation range of large release volume variations. Accordingly, their expected variability is implicitly included in the model and observation error but not explicitly modeled as random variable X, considering the corresponding error in Equation (3). The explicit treatment would enhance the composition of the error covariance matrix. Therefore, the errors due to uncertain input lead to parameter variability and thus can not be distinguished from any model uncertainties. A proper analysis of the error composition could improve the performance of the proposed method. However, a pseudo-accuracy in the derived and acceptable error could hinder the method, since smaller errors would lead to significantly different acceptance rates and mixing of the chain. Moreover, the utilized models still are simplifications of the physical process and other influences, such as path variability, which may have a significant effect on the posterior parameter distribution. Thus if model parameters are forced to be exact, a combined optimization, i.e., a solution which fits for multiple avalanche paths, may not exist. However, similarly to existing approaches [7] the presented framework could be also applied to a case study including multiple avalanche paths or multiple avalanche events on one path. In this case the interpretation of the conditional runout probabilities would change. The presented approach provides a class A prediction including the variability associated with observation and model uncertainties for a single event. This implicitly assumes that both events have a similar destructive avalanche size characteristic. Although that might be true for two events, special care has to be taken to identify and treat sources of variabilities due to differences in spatio temporal characteristics, avalanche sizes or flow behaviour [48]. In the presented application the evaluation of the simulation results y and the avalanche observations y obs shows good agreement for simulated runouts r and matched affected areas (tp). A strength of the presented approach is that it reveals the accompanying overestimates of affected areas ( f p), reflecting the conservative character of the obtained simulation results. For maximal velocities u max the correspondence of simulated and estimated values appears to be rather low and deserves further discussion. Sources for this mismatch are manifold. One source of result variations are differences in the numerical solution and implementation of similar models (such as [9,49,50]), which are expected to be small and can not be further addressed since only one simulation tool was used. Other more obvious sources include the process model with the respective parameter values and the estimated reference velocity. This appears evident comparing the results for the two case studies. Similar total altitude differences lead to similar empirical estimates of maximum velocities (u max ≈ 0.6 g ∆z = 55 m/s for the Kerngraben and 58 m/s for the Wolfsgruben avalanche). Velocities derived from the depth-averaged simulation approach vary significantly. The median value of the simulated velocity distributions are 25 m/s and 35 m/s respectively (see Table 1), covering a wider range of 20-29 m/s and 31-38 m/s (5-95% quantiles). The significant differences between the two test cases can be attributed to the fact, that the simulation approach takes the two-dimensional topography with distinct terrain features into account, while the employed empirical model provides a maximum velocity estimate solely based on the total altitude difference. The empirical approach yields a valuable, rough estimate for an upper velocity bound, especially when no other information is available, but its generalization may not apply for all types of avalanches paths. Velocities derived from the depth-averaged simulation approach mainly depend on the employed process model and the corresponding friction parameters. In case of the presented, classical Voellmy model an adaptation of the upper bound for the turbulent friction coefficient ξ could increase the maximum velocities but has not been further investigated since a distinct peak already appeared in the parameter distribution. A significant change in velocities may be achieved by modifying the process model by, e.g., including entrainment [11] or considering a different friction relation [16,51]. It is also important to note that depth averaged model approaches do not provide any vertical velocity information and are therefore unable of reproducing turbulent structures resulting in large local velocity variations, that are particularly interesting in the frontal region of the avalanche [52]. The challenge of properly considering velocities in avalanche simulation evaluations has been previously debated yielding similar results [29,30] and new, promising measurement [53] and evaluation approaches [54] are currently discussed.

Conclusions
A probabilistic back calculation approach is realized with the Metropolis-Hastings algorithm to derive posterior distributions of the 2-dimensional process parameter θ = {δ 0 , ξ} for avalanche simulations. These posterior distributions are used to perform Monte Carlo simulations as forward and predictive avalanche simulations. We showed the different ingredients required to apply a Metropolis-Hastings algorithm in the back calculation procedure. After 2000 iterations and a total CPU-time of ≈18 h, a Markov chain consisting of 967 combinations of the process model parameters was determined.
Forward and predictive simulations revealed how uncertainties in the avalanche data lead to variations in the simulation results. The visualization of the conditional runout probabilities P(x|y obs ) in combination with the quantitative analysis of the forward and predictive avalanche simulation results y pred appears as a useful tool to evaluate the variability of the results.
The simulations are used to assess the quality of the derived process model parameters and to give, moreover, empirical distributions of the estimator of each result variable. The resulting conditional probabilities allow one to derive runouts that are in the range of the documented event. Although these results appear encouraging at first more simulations are necessary to narrow the parameter distributions and achieve lower result variabilities. An important observation is that large result variabilities can be associated with small probabilities or a small number of corresponding simulations.
The presented workflow and concept of conditional runout probabilities are particularly useful to evaluate the variability of simulation results. As an alternative to deterministic state of the art approaches, this probabilistic approach explicitly considers uncertainties of the observed avalanche data, providing an additional information on the accuracy of the simulation results that may appear indispensable for risk based approaches. The visualization allows for an intuitive interpretation, introducing probability thresholds (e.g., 95% and 5% quantiles), which are in accordance with engineering design guidelines [1,55] and allow one to identify computational outliers. Consideration and evaluation of uncertainties associated with avalanche simulations is imperative for researchers and practitioners as rational basis for further risk based management strategies.