Seismic Risk of Infrastructure Systems with Treatment of and Sensitivity to Epistemic Uncertainty

Abstract: Modern society’s very existence is tied to the proper and reliable functioning of its Critical Infrastructure (CI) systems. In the seismic risk assessment of an infrastructure, taking into account all the relevant uncertainties affecting the problem is crucial. While both aleatory and epistemic uncertainties affect the estimate of seismic risk to an infrastructure and should be considered, the focus herein is on the latter. After providing an up-to-date literature review about the treatment of and sensitivity to epistemic uncertainty, this paper presents a comprehensive framework for seismic risk assessment of interdependent spatially distributed infrastructure systems that accounts for both aleatory and epistemic uncertainties and provides confidence in the estimate, as well as sensitivity of uncertainty in the output to the components of epistemic uncertainty in the input. The logic tree approach is used for the treatment of epistemic uncertainty and for the sensitivity analysis, whose results are presented through tornado diagrams. Sensitivity is also evaluated by elaborating the logic tree results through weighted ANOVA. The formulation is general and can be applied to risk assessment problems involving not only infrastructural but also structural systems. The presented methodology was implemented into an open-source software, OOFIMS, and applied to a synthetic city composed of buildings and a gas network and subjected to seismic hazard. The gas system’s performance is assessed through a flow-based analysis. The seismic hazard, the vulnerability assessment and the evaluation of the gas system’s operational state are addressed with a simulation-based approach. The presence of two systems (buildings and gas network) proves the capability to handle system interdependencies and highlights that uncertainty in models/parameters related to one system can affect uncertainty in the output related to dependent systems.


Introduction
Modern societies are increasingly dependent on their Critical Infrastructure (CI) to produce and distribute the essential goods and services they need. From a system-theoretic point of view, the infrastructure is a system of systems comprising several subsystems, which can be area-like (e.g., buildings), line-like (e.g., lifelines, utilities) and point-like (e.g., critical facilities). The first two categories are spatially distributed.
The best practice for the seismic risk assessment of an infrastructure encompasses taking into account all the relevant uncertainties affecting the problem, as well as providing the confidence in the estimate and the sensitivity of the total output uncertainty to each input model.
The employed steady-state gas flow formulation is the one presented by Cavalieri (2017, 2020) [13,14], encompassing multiple pressure levels, the pressure-driven mode and the correction for pipe elevation change. Flow equations for high, medium and low pressure are available, allowing one to handle both distribution and transmission networks. Models for reduction groups and compressor stations, which allow the pressure to be locally reduced or increased, are included. The flow-based gas system model was implemented by the first author into an open-source simulation tool for civil infrastructures, namely the Object-Oriented Framework for Infrastructure Modelling and Simulation (OOFIMS) [15], coded in the MATLAB® [16] language and initially developed within the EU-funded project SYNER-G (2012) [17]. OOFIMS is an integrated framework for reliability analysis that allows one to consider multiple interdependent infrastructure systems, accounting for the relevant uncertainties.
The remaining sections of this article are organised as follows. Section 2 presents a state-of-the-art review about the treatment of and sensitivity to epistemic uncertainty, offering suggestions on the best practice and presenting a novel approach to handle parameter correlation in a logic tree. A seismic risk assessment, encompassing the proposed methodology, was carried out for demonstration purposes on a "synthetic" city, composed of buildings and a gas network, as described in Section 3. Before presenting and discussing the results, Section 3 briefly describes the models and the performance metrics adopted in the framework. Conclusions and future work are reported in Section 4.

Modelling of Input Epistemic Uncertainty
The possible approaches to the treatment of epistemic uncertainty vary depending on its type, denoted as I and II in the following:

1. Epistemic uncertainty on model form (Type I): a logic tree is commonly used, composed of a chain of sequential modules, the latter intended as groups of parallel branches or choices. In this framework, alternative models Θ, corresponding to logic tree branches within each module, are considered in each step of the analysis. Individual simulations are carried out for each different combination of sequential branches (i.e., of models), thus yielding multiple results, for instance in terms of mean annual frequency (MAF) of exceedance curves of a performance metric. Weights, summing up to one, are attached to branches to indicate subjective degrees of belief of the analyst in each model. This is common practice in probabilistic seismic hazard analysis (PSHA), where a typical uncertainty in model form is represented by the ground motion prediction equation (GMPE). The outcome is usually expressed in terms of mean hazard curve over the logic tree, obtained as a weighted average of the curves from all branches (Bommer and Scherbaum, 2008) [8]. Upper and lower fractile curves and/or a confidence interval around the mean curve are often computed based on the set of curves from the tree, so as to quantify the effect of epistemic uncertainty on the results.

2. Epistemic uncertainty on model parameters (Type II): each model parameter θ is modelled with a random variable, whose distribution describes its epistemic uncertainty. (a) The parameters (e.g., the maximum magnitude M max ) are arranged in a hierarchical model together with aleatory uncertainty (e.g., the magnitude M). In this case, only one simulation is carried out and the risk analysis provides a single result (i.e., a single MAF curve of a performance metric, for instance), embedding the effects of both aleatory and epistemic uncertainty (e.g., Franchin and Cavalieri, 2015 [18], Su et al., 2020 [19], Morales-Torres et al., 2016 [20]). This approach prevents the analyst from properly treating Type II epistemic uncertainty, in terms of computing confidence intervals or fractiles, and identifying distinct contributions of input aleatory and epistemic uncertainty within the output uncertainty (refer to Figure 8 for an example). (b) Alternatively, the risk analysis is repeated for discrete values of each parameter θ (e.g., 16%, 50% and 84% fractiles). This approach, involving a higher associated computational effort, allows one, however, to arrange parameters in a logic tree, as done for Type I uncertainty. Since the discrete values of the model parameters are values from a probability distribution, their choice, as well as that of the corresponding weights attached to tree branches, could be assigned, for instance, according to Miller and Rice (1983) [21].

3. Epistemic uncertainty of both Type I and II: (a) The analyst can decide to adopt approach (2a) for Type II uncertainty. This cheaper approach leads to carrying out the expectation over all sources of uncertainty, presenting the results as the mean over the logic tree. However, in this case, confidence intervals or fractiles would refer only to part of the total epistemic uncertainty (i.e., the one related to models). (b) The second option, involving approach (2b) for Type II uncertainty, consists of building an expanded logic tree for the treatment of both Type I and II uncertainties.
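As a minimal illustration of approach (3b), an expanded logic tree mixing one Type I module and one discretised Type II module can be enumerated by taking the Cartesian product of the module choices and multiplying the branch weights. The module names, choices and weights below are hypothetical, a sketch only:

```python
# Hypothetical expanded logic tree: one Type I module (GMPE choice) and one
# Type II module (discretised Mmax fractiles). Each branch weight is the
# product of the weights of the choices along the branch.
from itertools import product

modules = {
    "GMPE": [("model_A", 0.6), ("model_B", 0.4)],   # Type I: model form
    "Mmax": [(7.0, 0.2), (7.5, 0.6), (8.0, 0.2)],   # Type II: parameter values
}

branches = []
for combo in product(*modules.values()):
    choices = {name: choice for name, (choice, _) in zip(modules, combo)}
    weight = 1.0
    for _, w in combo:
        weight *= w
    branches.append((choices, weight))

# Reliability weights over all branches sum to one.
total_weight = sum(w for _, w in branches)
```

One risk analysis is then run per branch, and the resulting curves are harvested with the branch weights.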
It should be clear from the above that the best practice for the treatment of epistemic uncertainty on model parameters is the adoption of approaches (2b) or (3b). However, one issue with a logic tree including model parameters is that, strictly speaking, sequential modules can only be used when all parameters are statistically independent. In the presence of parameter correlation, a variation in one parameter, belonging to one module, changes the (conditional) distribution of the others, thus making the results dependent on the ordering of modules. Such correlation is often disregarded, as occurs for instance in PSHA with reference to the parameters of the Gutenberg-Richter recurrence relationship, namely a, b and M max , for which independent sequential modules are commonly adopted, thus neglecting, at least, the correlation between a and b. As a result, the possible combinations are likely to cover an unrealistically wide range of values, presenting a range of uncertainty much wider than the analyst actually intends to model (Bommer and Scherbaum, 2008) [8]. A possible solution to this issue, proposed by Cavalieri and Franchin (2019) [12], is reported in Section 2.2, together with a novel and more efficient solution developed in the current work.

Case of Correlated Parameter Values
As pointed out above, in the presence of correlation between parameters in logic trees (approaches (2b) and (3b)), sequential modules can still be used, but the results will in any case depend on the ordering of modules. Care should be taken in conditioning the values of each subsequent parameter on those of the preceding one. One possible option would be to change the weights, so that they are obtained from the distribution of each parameter conditional on the value of the correlated parameter that precedes it (e.g., Yilmaz et al., 2020 [22]). This approach would lead to having different weights for the same choices within a module depending on the choice of the preceding module, or to the same weights attached to different choices (parameter values). However, it can work only as long as there are no other constraints involved. One case arising in the context of seismic risk analysis, where a simple adjustment of the weights does not solve the problem, is related to the uncertainty in damage to a component given the input seismic intensity, which is represented by fragility functions. The latter describe the probability of reaching or exceeding a given limit state conditional on the intensity at the component's site. These functions are commonly postulated as lognormal and are thus defined by two parameters, the log-mean and the log-standard deviation. Epistemic uncertainty on these parameters, which are obviously correlated, is sometimes available. As long as the component's state is defined as binary and thus a single fragility function is employed to determine the damage state, modifying the weights is a viable option. The problem arises when the component is not modelled as binary and multiple limit states are therefore considered, so that two or more fragility curves are used, which must never intersect.
To overcome these issues, Cavalieri and Franchin (2019) [12] proposed to avoid the use of sequential modules corresponding to correlated parameters, and instead to lump them all together into the same module, accounting for their dependence within the module itself. The needed input data are the marginal distributions (e.g., normal with mean µ and standard deviation σ) of the parameters and their correlation matrix. One parameter is fixed to a certain fractile and the same fractile of the remaining parameters is obtained using the conditional mean and standard deviation. For the formulation to be exhaustive, all combinations of conditioning and conditioned parameters are included. Indicating with ρ the correlation coefficient, with x_1 the conditioning parameter (fixed to a fractile) and with x_2 the vector of the conditioned parameters, the diagonal matrix of standard deviations, D_xx, the correlation matrix, R_xx, and the covariance matrix, C_xx, are related by:

C_xx = D_xx R_xx D_xx = [ C_x1x1  C_x1x2 ; C_x2x1  C_x2x2 ]    (1)

where C_x2x1, C_x1x2 and C_x2x2 are partitions of the covariance matrix C_xx. The conditional mean vector, µ_x2|x1, and covariance matrix, C_x2x2|x1, are obtained as (Benjamin and Cornell, 1970) [23]:

µ_x2|x1 = µ_x2 + C_x2x1 C_x1x1^(−1) (x_1 − µ_x1)    (2)

C_x2x2|x1 = C_x2x2 − C_x2x1 C_x1x1^(−1) C_x1x2    (3)

The conditional standard deviations are simply the square roots of the diagonal elements of C_x2x2|x1. Starting from the conditional means and standard deviations, one can retrieve any fractile of the conditioned parameters, using the inverse of the cumulative distribution function (CDF).
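For two jointly normal parameters, the conditioning step reduces to the familiar scalar formulas µ_2|1 = µ_2 + ρ(σ_2/σ_1)(x_1 − µ_1) and σ_2|1 = σ_2·sqrt(1 − ρ²). A minimal sketch of the procedure of [12], with hypothetical means, standard deviations and correlation, is:

```python
# Sketch of the conditioning step in [12] for two correlated normal
# parameters (all numeric values hypothetical). Parameter 1 is fixed at a
# fractile; the same fractile of parameter 2 is taken from its conditional
# distribution.
import math
from statistics import NormalDist

def conditional_params(mu1, s1, mu2, s2, rho, x1):
    mu_c = mu2 + rho * (s2 / s1) * (x1 - mu1)   # conditional mean
    s_c = s2 * math.sqrt(1.0 - rho * rho)       # conditional std deviation
    return mu_c, s_c

def fractile(mu, s, p):
    return NormalDist(mu, s).inv_cdf(p)

# fix parameter 1 at its 84% fractile, then take the same fractile of
# parameter 2 conditional on that value:
mu1, s1, mu2, s2, rho = 0.5, 0.10, 0.30, 0.05, 0.7
x1 = fractile(mu1, s1, 0.84)
mu_c, s_c = conditional_params(mu1, s1, mu2, s2, rho, x1)
x2 = fractile(mu_c, s_c, 0.84)
```

For more than two parameters, the same computation is carried out with the matrix partitions of Equations (2) and (3).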
The method proposed by Cavalieri and Franchin (2019) [12] is, however, flawed: the fractile chosen for the parameters on their respective marginal distributions does not correspond to the same fractile of the joint distribution, which is the actual target. This is illustrated with reference to an example including only the first two correlated fragility parameters used in the application (see Section 3.2), related to the yield limit state fragility curve, namely, µ ln,Y and σ ln,Y . Taking only two parameters out of the set of four correlated parameters allows the joint CDF to be visualised in a 3D plot. In particular, only the first two columns in Table 1 and the first 2 × 2 submatrix in Table 2 are considered for this example (the description of these tables is given in Section 3.2). First, the extreme values of µ ln,Y and σ ln,Y , extracted from a large sample of parameter values obtained from their marginal normal distributions, were used for a discretisation of the parameter supports, thus building a grid of points. Then, the joint CDF was evaluated at such grid points, as displayed in Figure 1. The latter shows the process of deriving the fractile of the joint distribution corresponding to a given fractile of the two parameters, according to the method proposed in [12]. For simplicity, and without loss of generality, the 50% fractile is considered, leading to the retrieval of only one set of parameter values, extracted from their marginal unconditional distributions (the set and marginal CDFs are shown in the figure). Figure 1 clearly shows that the 50% fractile of both parameters corresponds to the 27% fractile (black dot) of the joint distribution, which is thus quite different from the desired (target) 50% fractile. This shortcoming may occur, of course, also in the case of parameter fractiles different from 50%, when one parameter is fixed and the other one is extracted from the conditional distribution (see procedure outlined above).
Figure 1. Example showing the derivation of the fractile (black dot) of the joint distribution corresponding to the 50% fractile of the two fragility parameters, according to the approach proposed in [12].
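The mismatch can also be verified in closed form for the bivariate normal case: by Sheppard's orthant-probability formula, the joint CDF evaluated at both marginal medians equals 1/4 + arcsin(ρ)/(2π), which is below 50% for any ρ < 1:

```python
# Closed-form check of the flaw: for a bivariate normal, the joint CDF at
# the point (median_1, median_2) is 1/4 + asin(rho)/(2*pi), i.e., the pair
# of 50% marginal fractiles never sits on the 50% joint fractile unless
# rho = 1 (Sheppard's formula).
import math

def joint_cdf_at_medians(rho):
    return 0.25 + math.asin(rho) / (2.0 * math.pi)
```

For instance, ρ ≈ 0.13 yields a joint CDF of about 27%, the same order as the value found in the two-parameter example above (the exact ρ of that example is not reproduced here).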
In order to overcome this issue and properly handle the case of correlated parameters in a logic tree, a novel approach, which is also more efficient, is proposed in this work. The procedure is straightforward and encompasses the following steps:

1. The extreme values of all parameters are obtained from their marginal distributions. Based on this, a sufficiently accurate discretisation of the parameter supports is carried out, thus building a grid of points.

2. The joint CDF and probability density function (pdf) are evaluated at all grid points, using the parameters of the multivariate distribution (e.g., normal with mean vector µ X and covariance matrix C xx ).

3. For each desired fractile of the joint distribution, corresponding to a CDF value F*, the points characterised by CDF = F* are extracted, as displayed with cyan dots in Figure 2a with reference to the example described above and the 50% fractile. The same cyan dots are also shown in Figure 2b, each with its joint pdf value. The accuracy in this step is of course a function of the discretisation employed in step #1 and the tolerance fixed for CDF around F*.

4. Among the extracted points, the desired fractile of the joint distribution is selected as the one with the highest value of the joint pdf, as shown with a black dot in Figure 2b. The same selected point is also shown in Figure 2a. The selected set of values is the most likely parameter set; as such, it is not expected to lead to extreme combinations of values nor, in the case of fragility parameters, to intersection of fragility curves related to different limit states.

The point selected in Figure 2 is the same reported in Figure 3 with a black dot. It can be clearly noted that the (target) 50% fractile of the joint distribution corresponds to the 70% and 69% fractiles of µ ln,Y and σ ln,Y , respectively. This further confirms that, in general, the fractiles of the parameters and of the joint distribution do not coincide, and that the proposed approach is the correct one to properly handle the desired fractiles of joint distributions within a logic tree. In fact, the approach (i) is fully consistent with the parameter correlation structure, (ii) keeps the results independent of the ordering of modules, and (iii) is also much more efficient than the one presented in [12]; indeed, it allows one to considerably reduce the number of branches in the logic tree and thus the required computational effort, as will be better highlighted in Section 3.3.
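For two correlated normal parameters, the four steps can be sketched with standard-library code only; grid resolution, tolerance and all distribution parameters below are hypothetical, and the bivariate normal CDF is computed by one-dimensional numerical quadrature (for more parameters, a library multivariate-normal CDF would be used instead):

```python
# Steps 1-4 of the proposed grid-based procedure, sketched for a bivariate
# normal with hypothetical parameters. The joint CDF is obtained by 1-D
# trapezoidal quadrature of phi(u) * Phi((b - rho*u)/sqrt(1 - rho^2)).
import math

def phi(u):   # standard normal pdf
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def Phi(u):   # standard normal cdf
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def bvn_cdf(x, y, mu1, s1, mu2, s2, rho, n=200):
    a, b = (x - mu1) / s1, (y - mu2) / s2
    lo, h = -8.0, (a + 8.0) / n
    if h <= 0.0:
        return 0.0
    total = 0.0
    for i in range(n + 1):
        u = lo + i * h
        w = 0.5 if i in (0, n) else 1.0   # trapezoidal weights
        total += w * phi(u) * Phi((b - rho * u) / math.sqrt(1.0 - rho * rho))
    return total * h

def bvn_pdf(x, y, mu1, s1, mu2, s2, rho):
    z1, z2 = (x - mu1) / s1, (y - mu2) / s2
    q = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho)
    return math.exp(-0.5 * q) / (2.0 * math.pi * s1 * s2 * math.sqrt(1.0 - rho * rho))

def joint_fractile_point(mu1, s1, mu2, s2, rho, f_star, ngrid=60, tol=0.03):
    # Step 1: grid over +/- 4 sigma. Step 2: CDF/pdf at grid points.
    # Step 3: keep points with |CDF - F*| < tol. Step 4: highest-pdf point.
    best = None
    for i in range(ngrid):
        x = mu1 - 4.0 * s1 + 8.0 * s1 * i / (ngrid - 1)
        for j in range(ngrid):
            y = mu2 - 4.0 * s2 + 8.0 * s2 * j / (ngrid - 1)
            if abs(bvn_cdf(x, y, mu1, s1, mu2, s2, rho) - f_star) < tol:
                p = bvn_pdf(x, y, mu1, s1, mu2, s2, rho)
                if best is None or p > best[2]:
                    best = (x, y, p)
    return best
```

The returned point is the most likely parameter set on the F* iso-CDF contour, i.e., the single branch entered in the logic tree for that joint fractile.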

Quantification of Output Uncertainty Due to the Epistemic Component
As already said, in a probabilistic framework where a logic tree is used to account for epistemic uncertainty (approaches 1, (2b) and (3b)), risk is estimated through a chain of modules, each composed of parallel choices (i.e., alternative models or model parameter values). The logic tree provides multiple results, for instance in terms of a number N of MAF of exceedance (λ) curves, denoted λ-curves in the following, for a performance metric of interest. Each λ-curve is related to one out of N simulations (e.g., Monte Carlo), each encompassing multiple runs.
Propagation of epistemic uncertainty through the logic tree results in a distribution f X (x) of the output X. The latter may express either λ values for a fixed performance metric value or performance metric values for a fixed value of λ. This choice is dictated by the goal of the analyst, who can be interested in the mean value of f X (x) for fixed performance metric values or for fixed λ values. The two approaches can yield very different results for the mean value. Bommer and Scherbaum (2008) [8], referring to hazard curves, pointed out that in current engineering design practice the procedure is to retrieve the expected value of intensity measure at a preselected frequency of exceedance (or return period). This is the choice adopted also in this work, with reference to a performance metric (refer to Figure 8).
As a minimum, f X (x) can be summarised through its mean µ X and variance σ² X . Computing the mean is commonly termed harvesting the logic tree in PSHA practice, while using the variance it is possible to estimate a confidence interval around the mean curve. Finally, variability in X is also often expressed through weighted fractiles, as an alternative to σ² X . For a tree with N Θ models and N θ parameters (see example in Figure 4), µ X can be estimated as the weighted average of X [12]:

µ_X ≈ X̄ = Σ_{n=1}^{N} w_n x_n    (4)

with:

N = Π_{i=1}^{N_Θ} N_cΘ,i · Π_{k=1}^{N_θ} N_cθ,k ,   w_n = Π_{i=1}^{N_Θ} w_{i,j(i)} · Π_{k=1}^{N_θ} w_{k,m(k)}

In Equation (4), N cΘ,i and N cθ,k are the numbers of choices for the i-th model and the k-th parameter, respectively, x n is the MAF or performance metric value corresponding to the n-th branch of the tree, formed from the sequence j(1), . . . , j(N Θ ) of models and m(1), . . . , m(N θ ) of parameter values, while w n is the associated weight. It is noted that the final weights w n , as well as all other weights within each module, are "reliability weights", as opposed to "frequency weights", therefore they sum up to unity:

Σ_{n=1}^{N} w_n = 1    (5)

The variance σ² X can be estimated through the weighted sample variance:

s²_X = [ Σ_{n=1}^{N} w_n (x_n − X̄)² ] / [ 1 − Σ_{n=1}^{N} w_n² ]    (6)

which is the unbiased estimator when weights sum up to unity as in Equation (5).
The weighted p% fractile is defined as the element x_p of X satisfying the following condition:

Σ_{n=1}^{N} w_n · I(x_n ≤ x_p) = p/100    (7)

where I is an indicator function equal to 1 if x n is lower than or equal to x p , and 0 otherwise. Finally, the confidence interval around the estimate X̄ of µ X , following, e.g., Rubinstein and Kroese (2016) [4] or Petty (2012) [5], can be obtained by recalling that the estimator X̄ has approximately a normal distribution, N(µ X , σ² X /N), where N is the sample size. When σ² X is estimated as in Equation (6), the confidence interval at the confidence level (1 − α)100% (e.g., α = 0.05 yields a confidence level of 95%) is:

X̄ ∓ t_{N−1,1−α/2} · s_X / √N    (8)

where t N−1,1−α/2 is the critical value of the Student t distribution with (N − 1) degrees of freedom.
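Assuming branch results x_n and reliability weights w_n are available, the weighted mean, the unbiased weighted variance and the weighted fractile can be sketched as follows (the numeric values are illustrative only):

```python
# Harvesting a logic tree: weighted mean (Eq. 4), unbiased weighted variance
# (Eq. 6) and weighted p-fractile, for reliability weights summing to one.
def weighted_mean(x, w):
    return sum(wi * xi for xi, wi in zip(x, w))

def weighted_variance(x, w):
    # unbiased estimator for reliability weights that sum to unity
    m = weighted_mean(x, w)
    num = sum(wi * (xi - m) ** 2 for xi, wi in zip(x, w))
    return num / (1.0 - sum(wi * wi for wi in w))

def weighted_fractile(x, w, p):
    # smallest x_p whose cumulative weight reaches p (p in [0, 1])
    cum = 0.0
    for xi, wi in sorted(zip(x, w)):
        cum += wi
        if cum >= p:
            return xi
    return max(x)

x = [1.0, 2.0, 3.0, 4.0]    # e.g., metric values from four branches
w = [0.1, 0.2, 0.3, 0.4]    # associated branch reliability weights
```

With these values, the weighted mean is 3.0, the unbiased weighted variance is 1/0.7 ≈ 1.43, and the weighted 50% fractile is 3.0.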

Sensitivity to Input Epistemic Uncertainty
In order to assess which component of (input) epistemic uncertainty contributes most to the uncertainty in the output, a sensitivity analysis should be carried out. Most of the works found in the current literature about sensitivity analysis can be grouped into three main threads, based on the one-at-a-time (OAT) approach, the logic tree and the analysis of variance, respectively.
According to the OAT approach, sensitivity is evaluated by varying one uncertain input variable or model at a time, while holding all the others constant (at their median value or best estimate). Visser et al. (2000) [6] and Merz and Thieken (2009) [24] proposed indices based on the difference between the maximum "uncertainty" range, MUR(λ), i.e., the difference between the maximum and minimum value of the metric for a fixed λ (they use T = 1/λ), and the corresponding range UR i (λ) obtained by switching off the i-th module (i.e., setting it to its best estimate choice); the larger the reduction MUR(λ) − UR i (λ), the more influential the i-th module. Other authors, for instance, Celik and Ellingwood (2010) [25] and Celarec et al. (2012) [26], with reference to seismic risk assessment of structures, performed sensitivity analyses of selected seismic response parameters (drifts, forces, etc.) with respect to the input random variables, the latter also including Type II epistemic uncertainty. In these works, the output is commonly computed first with all input variables set to their medians, and then setting one input variable at a time to a lower or upper fractile (typically 16% and 84%): the resulting variations are represented with tornado diagrams. Along the same lines, Porter et al. (2002) [27] used the OAT approach to carry out a deterministic sensitivity analysis of building loss estimates to major uncertain variables, which include spectral acceleration, mass, damping and structural force-deformation behaviour. Tornado diagrams were also used by Pourreza et al. (2020) [28] to screen out the modelling variables least influential on the collapse capacity of a five-storey steel special moment resisting frame (SMRF). Similar approaches were followed, for instance, by Easa et al. (2020) [29] and Sevieri et al. (2020) [30].
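A minimal OAT tornado computation, with a purely hypothetical response model and hypothetical input distributions, might look like:

```python
# One-at-a-time (OAT) tornado sketch: each input moves to its 16%/84%
# fractiles while the others stay at their medians; the output swings give
# the tornado bar lengths. Model and distributions below are hypothetical.
from statistics import NormalDist

def response(mass, damping, strength):   # hypothetical response model
    return mass / (damping * strength)

inputs = {"mass": (1.0, 0.1), "damping": (0.05, 0.01), "strength": (2.0, 0.3)}
medians = {k: mu for k, (mu, s) in inputs.items()}
z = NormalDist().inv_cdf(0.84)           # ~ +1 sigma

swings = {}
for k, (mu, s) in inputs.items():
    lo = dict(medians, **{k: mu - z * s})
    hi = dict(medians, **{k: mu + z * s})
    swings[k] = abs(response(**hi) - response(**lo))

ranking = sorted(swings, key=swings.get, reverse=True)   # tornado bar order
```

As the text notes, such a computation never moves two inputs at once, so interaction effects between correlated inputs are invisible to it.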
The findings of all these studies are limited by their use of the OAT method. As pointed out by Saltelli and Annoni (2010) [31], sensitivity analyses employing the OAT approach are incomplete, as they are fundamentally unable to quantify the portion of uncertainty arising from interactions between input variables. In other words, correlations cannot be considered in OAT analyses because this would require the simultaneous movement of more than one input variable, as also highlighted by Cremen and Baker (2020) [32].
Frameworks including a logic tree are instead capable of taking into account interactions, since each combination of sequential branches involves variations within all modules (the latter including parallel choices of models or model parameters). For each choice within a module, it is possible to compute the weighted average X considering only the logic tree branches involving that choice. The sensitivity of output epistemic uncertainty to logic tree choices can still be presented through tornado diagrams (see, e.g., Figure 10b). Crowley et al. (2005) [33] evaluated the impact of epistemic uncertainty on earthquake loss models. In particular, they used systematic combined variations of the parameters defining the demand (i.e., ground motion) and the capacity (i.e., vulnerability) to identify the relative impacts on the resulting losses, finding that the influence of the epistemic uncertainty in the capacity is larger than that of the demand for a single earthquake scenario. Sensitivity results were presented through tornado diagrams. The logic tree approach was also used by Tyagunov et al. (2014) [34], who estimated the sensitivity of seismic risk curves to several input parameters belonging to three different modules of risk analysis, namely, hazard, vulnerability and loss. As a further, recent example, Bensi and Weaver (2020) [35] used a logic tree to treat both Type I and Type II epistemic uncertainties associated with the estimation of tropical cyclone storm recurrence rates. A sensitivity analysis, presented through tornado diagrams, was also carried out to show how storm rate (input) epistemic uncertainty impacts estimates of storm surge hazards.
An alternative way to account for interaction effects between input variables requires the use of Global Sensitivity Analysis (GSA), which allows for the identification of the parameters that have the largest influence on a set of model performance metrics and the identification of non-influential parameters (Saltelli et al., 2008) [36]. GSA is typically undertaken using variance-based techniques. The latter aim to quantify the contribution of each input parameter to the total variance of the output, by means of the analysis of variance (ANOVA) decomposition.
Tate et al. (2015) [11], after providing an understanding of the total variation and central tendency of the loss (i.e., output) estimates through an uncertainty analysis, carried out a sensitivity analysis based on the method of Sobol' (1993) [37]. The latter decomposes the variance to determine the proportional contribution of each model component to the total uncertainty. Building on the original Sobol' sensitivity indices, which include the first-order sensitivity index (or main effect) of the i-th uncertain variable, S i , Homma and Saltelli (1996) [38] introduced the total sensitivity index, S Ti . While S i gives the effect of the i-th uncertain variable by itself, S Ti indicates its "total" effect, which includes the fraction of variance accounted for by the variable alone, as well as the fraction accounted for by any combination of it with the remaining variables [7,31]. These indices, whose computation is carried out via sampling schemes, were also reviewed by Helton et al. (2006) [39] among sampling-based methods for uncertainty and sensitivity analysis, and used by Cremen and Baker (2020) [32] within a variance-based sensitivity analysis for the FEMA P-58 methodology.
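The first-order and total indices can be estimated with a standard pick-and-freeze (Saltelli-type) sampling scheme. The sketch below uses a hypothetical three-variable test model with one interaction term and one inert variable, for which the analytical values are S_a = 1/6, S_b = 4/6, S_c = 0:

```python
# Monte Carlo estimation of Sobol' first-order (S_i) and total (S_Ti)
# indices with a pick-and-freeze scheme. The model y = a + 2b + a*b is
# hypothetical; its interaction term makes S_Ta exceed S_a, and c is inert.
import random

random.seed(0)

def model(v):
    a, b, c = v
    return a + 2.0 * b + a * b

N, k = 20000, 3
A = [[random.gauss(0, 1) for _ in range(k)] for _ in range(N)]
B = [[random.gauss(0, 1) for _ in range(k)] for _ in range(N)]
fA = [model(v) for v in A]
fB = [model(v) for v in B]

mean = sum(fA + fB) / (2 * N)
var = sum((f - mean) ** 2 for f in fA + fB) / (2 * N - 1)

S, ST = [], []
for i in range(k):
    # A with column i taken from B
    AB = [A[j][:i] + [B[j][i]] + A[j][i + 1:] for j in range(N)]
    fAB = [model(v) for v in AB]
    S.append(sum(fB[j] * (fAB[j] - fA[j]) for j in range(N)) / N / var)
    ST.append(sum((fA[j] - fAB[j]) ** 2 for j in range(N)) / (2 * N) / var)
```

Each S_Ti estimate costs N extra model runs per variable, which is why these indices are usually computed on cheap surrogate models when the risk analysis itself is expensive.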
For complete variance decomposition, in addition to the method of Sobol', techniques based on the Fourier Amplitude Sensitivity Test (FAST) are also available [40,41]. Bovo and Buratti (2019) [42] used an ANOVA procedure based on a two-way crossed classification model [43], in order to evaluate the effect of epistemic uncertainty in plastic-hinge constitutive models on the definition of fragility curves for reinforced concrete (RC) frames. It was found that the uncertainty associated with the hysteretic model definition has a magnitude similar to the one related to record-to-record variability.
Among all the algorithms proposed in the literature for GSA, several methods consider the entire pdf of the model output, rather than its variance only. Such moment-independent methods are preferable in cases where variance is not a suitable proxy of uncertainty, for instance, when the output distribution is highly skewed or multi-modal. However, the adoption of density-based methods has been limited so far, owing to the difficulty of their implementation. In order to overcome this issue, Pianosi and Wagener (2015) [44] introduced the PAWN method, which efficiently enables the computation of density-based sensitivity indices. It is based on the concept of characterising output distributions by their CDF, which is easier to derive than the pdf. Zadeh et al. (2017) [45] provided an interesting comparison between the commonly used variance-based method of Sobol' and the recently proposed moment-independent PAWN method, with reference to the Soil and Water Assessment Tool (SWAT), an environmental simulator applied worldwide for watershed management purposes. Sobol' and PAWN identified the same non-influential parameters, but the ranking of the influential ones was better quantified by the PAWN method. This was probably due to the asymmetric character of the output distributions, which undermines the implicit Sobol' assumption that variance is a good proxy for output uncertainty.
Following an alternative approach, Cavalieri and Franchin (2019) [12] employed (modified) ANOVA combined with the logic tree to perform sensitivity analysis of output uncertainty to input epistemic uncertainty. ANOVA allows one to test the assumption that a sample is divided into groups, by expressing the total variance in the sample, s² X , as the sum of a variance "within" each group, s² X,W , and a variance "between" groups, s² X,B . A large ratio s² X,B /s² X indicates that the difference between groups is large. If each module in the logic tree is taken in turn as a criterion for grouping and the set of results x n is divided into N G sub-groups, each corresponding to a different choice in the module, the ratio s² X,B /s² X can be used to rank modules. The classical ANOVA expression is derived for a sample where all realisations have the same weight and reflect the underlying distribution, which is assumed to be normal. If the sample is weighted, as in the case of results derived from a logic tree, the following relation for weighted ANOVA holds:

s²_X = s²_X,W + s²_X,B = [ Σ_{g=1}^{N_G} Σ_{j=1}^{n_g} w_gj (x_gj − X̄_g)² + Σ_{g=1}^{N_G} ( Σ_{j=1}^{n_g} w_gj ) (X̄_g − X̄)² ] / [ 1 − Σ_{n=1}^{N} w_n² ]    (10)

where x gj and w gj are the output value and the associated weight corresponding to the j-th branch within the n g branches in the g-th group (g-th option in the considered module), and X̄ g is the weighted average within the g-th group. Note that weights within a group do not sum up to unity. To summarise, the importance of the i-th module according to weighted ANOVA, based on which a module ranking can be derived, is obtained as:

s²_X,B,i / s²_X    (11)

This is one of the approaches adopted for sensitivity analysis in the current work (see Section 3.4).
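Since the between-group and total variances share the same denominator (1 − Σ w_n²), the module-importance ratio reduces to a ratio of weighted sums of squares. A sketch with illustrative branch values, weights and grouping:

```python
# Weighted ANOVA ratio s2_between / s2_total for one grouping criterion
# (one logic tree module). The common denominator (1 - sum w^2) cancels,
# so only weighted sums of squares are needed. Values are illustrative.
def anova_ratio(x, w, groups):
    xbar = sum(wi * xi for xi, wi in zip(x, w))          # weights sum to 1
    ss_tot = sum(wi * (xi - xbar) ** 2 for xi, wi in zip(x, w))
    ss_between = 0.0
    for g in set(groups):
        Wg = sum(wi for wi, gi in zip(w, groups) if gi == g)
        xg = sum(wi * xi for xi, wi, gi in zip(x, w, groups) if gi == g) / Wg
        ss_between += Wg * (xg - xbar) ** 2
    return ss_between / ss_tot

# four branches of an illustrative tree, grouped by the choice made in
# one module (labels "a"/"b" denote the two options of that module):
x = [1.0, 1.0, 3.0, 3.0]
w = [0.25, 0.25, 0.25, 0.25]
```

Repeating the computation with each module as the grouping criterion yields the ratios used to rank the modules.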

Application
The methodology presented above for the treatment of and sensitivity to epistemic uncertainty was implemented in OOFIMS. For demonstration purposes, a seismic risk assessment was carried out on a "synthetic" city composed of buildings and a gas network (see Section 3.2) and subjected to distributed seismic hazard.

Hazard, Vulnerability and Performance Metric for Gas System
The hazard probabilistic model adopted herein takes into account both transient ground deformation (TGD) and permanent ground deformation (PGD). TGD is related to ground shaking due to travelling seismic waves, while PGD is induced by geotechnical hazards (e.g., landslides) and causes large permanent soil deformations. In the application, the intensity measures (IMs) used to express TGD are the peak ground acceleration (PGA) and the peak ground velocity (PGV). For further details on the implemented hazard model, interested readers are referred to [46].
After computing the relevant IMs at the surface, namely, PGA, PGV and PGD (the latter acronym also indicates the resulting displacement) for each site, the physical damage state of the components is obtained through fragility models. Concerning gas pipelines, the model proposed by the American Lifelines Alliance (ALA) (2001) [47] for water pipelines was adopted. Two Poisson repair rates of faults (leaks/breaks) per unit length are assumed to be lognormally distributed, with an expression for the median, λ̄ repair (returned in km⁻¹, with PGV in cm/s and PGD in m), and a logarithmic standard deviation:

λ̄_repair,PGV = K_1 · 0.0024 · PGV
λ̄_repair,PGD = K_2 · 11.224 · PGD^0.319
λ_repair,PGV = λ̄_repair,PGV · ε_RR,PGV
λ_repair,PGD = λ̄_repair,PGD · ε_RR,PGD
λ_repair = max( λ_repair,PGV ; λ_repair,PGD )    (12)

The residual terms ε RR,PGV and ε RR,PGD are both lognormal with unit median and log-standard deviation equal to 1.15 and 0.74, respectively, whilst K 1 and K 2 are modification factors, functions of the pipe material, soil, joint type and diameter. For the generic pipe, the highest repair rate, λ repair , is used to estimate the number of repairs (i.e., leaks or breaks). The outflow associated with such repairs is then lumped at the end nodes of the pipe as additional loads. Any leak or break thus plays a role in the flow analysis in damaged conditions. The metering/reduction (M/R) stations, which function as source nodes, were assumed to behave as binary components (i.e., intact/broken) and were assigned the same fragility model used for compressor stations, provided by HAZUS (2007) [48]. Their lognormal fragility curve is characterised by a median PGA of 0.77 g and a logarithmic standard deviation of 0.65. The seismic vulnerability of the remaining components in the taxonomy of gas systems was neglected because no fragility models are available in the literature, to the authors' knowledge. The adopted fragility model for buildings will be introduced in Section 3.2.
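Equation (12) can be sketched as a sampling routine; K_1 = K_2 = 1 is assumed here for simplicity, with units as in the text (PGV in cm/s, PGD in m, rates in km⁻¹):

```python
# Sketch of the ALA (2001) pipe repair-rate model of Eq. (12): lognormal
# residuals with unit median scale the median rates, and the larger of the
# PGV- and PGD-based rates is retained. K1 = K2 = 1 is assumed here.
import random

random.seed(1)

def repair_rate(pgv_cm_s, pgd_m, K1=1.0, K2=1.0):
    med_pgv = K1 * 0.0024 * pgv_cm_s
    med_pgd = K2 * 11.224 * pgd_m ** 0.319 if pgd_m > 0.0 else 0.0
    lam_pgv = med_pgv * random.lognormvariate(0.0, 1.15)   # unit-median residual
    lam_pgd = med_pgd * random.lognormvariate(0.0, 0.74)
    return max(lam_pgv, lam_pgd)                            # repairs per km

# sanity check on the median: with PGV = 30 cm/s and no PGD, the sampled
# rate should have median close to 0.0024 * 30 = 0.072 km^-1
rates = sorted(repair_rate(30.0, 0.0) for _ in range(2001))
sample_median = rates[1000]
```

The sampled rate, multiplied by the pipe length, would then drive the Poisson number of leaks/breaks lumped at the pipe's end nodes.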
Since the focus of this work is on epistemic uncertainty, rather than on the comparison of metrics, only one metric was considered for the gas system, namely, the System Serviceability Index, SSI (Wang and O'Rourke, 2008) [49]. It is defined at the system level and is flow-based, meaning that it requires a flow analysis and thus gives insight into the system's actual serviceability. SSI, ranging between 0 and 1, is defined as the ratio between the sum, over the n load nodes, of the delivered gas flows Q_i(P_1,i)_s after an earthquake (subscript s), and the sum of the node demands Q_i.
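In symbols, the definition above reads:

$$
\mathrm{SSI}=\frac{\displaystyle\sum_{i=1}^{n}\left(Q_i(P_{1,i})\right)_s}{\displaystyle\sum_{i=1}^{n} Q_i},\qquad 0\le \mathrm{SSI}\le 1
$$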
The term P 1,i is the pressure at the i-th load node, needed to possibly reduce the demands according to the pressure-driven formulation (see [13]). This is a deterministic metric. Starting from the values computed within the single simulation runs, new probabilistic metrics can be retrieved at the end of the simulation, thus accounting for aleatory uncertainty. Examples of probabilistic metrics are given by moments or distributions of a deterministic metric (see Section 3.4).

The Synthetic City
The topology of the gas distribution network, already used by Cavalieri (2017) [13], is sketched in Figure 5a, together with the indication of node types. The graph is characterised by a grid or mesh-like topological structure, typical in urban areas where the main pipes connect suburbs or districts, and can be considered as a transmission/distribution (TD) system. The topology was generated using the network model developed by Dueñas-Osorio (2005) [50], which aims to represent real TD systems based on the ideal class of the d-lattice-graph. Out of the total 49 nodes, five are sources (i.e., M/R stations) and the remaining ones are RG/RGM stations, load nodes and joints. Note that RG (reduction group) stations operate pressure reduction between two pressure ranges (i.e., low, medium, high) or two pressure levels within the same range, while RGM (reduction group and metering) stations also serve as sink (or load) nodes for industrial users (at high or medium pressure). In load nodes, gas is delivered to end-users.
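As a minimal sketch in pure Python, the ideal 7 × 7 square lattice underlying a 49-node grid topology of this kind can be built as follows; the role assignment (sources at corners and centre) is hypothetical, for illustration only:

```python
# Build a 7x7 square lattice (49 nodes), the ideal structure underlying
# the d-lattice-based generator of Duenas-Osorio (2005) [50]; the real
# network perturbs this lattice and has 65 pipes rather than the full 84.
N = 7
nodes = [(i, j) for i in range(N) for j in range(N)]
edges = []
for i in range(N):
    for j in range(N):
        if i + 1 < N:
            edges.append(((i, j), (i + 1, j)))  # vertical pipe
        if j + 1 < N:
            edges.append(((i, j), (i, j + 1)))  # horizontal pipe

# Hypothetical role assignment: five source nodes (M/R stations) at the
# corners and centre, the remaining nodes acting as load nodes/joints.
sources = {(0, 0), (0, N - 1), (N - 1, 0), (N - 1, N - 1), (N // 2, N // 2)}
roles = {n: ("source" if n in sources else "load") for n in nodes}
```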
Three pressure levels are present, namely, two medium pressure (three bar and two bar) and one low pressure (0.05 bar). A total of 65 high-density polyethylene (HDPE) pipes connect the nodes. Pressure reduction is operated by reduction groups, which are automatically split into two nodes (see [13]). Elevation is 0 m for all nodes, while all pipes were assigned a D = 300 mm diameter. Figure 5b displays the gas network (GAS) together with two other layers, namely, the seismic hazard (HAZ) and buildings (BDG). The 15 by 15 km square footprint of the city is subdivided into nine geometrically coincident subcity districts (SCDs) and building census areas (BCAs), each of size 5 by 5 km. A park surrounds the central area. The total population is 1,800,000, corresponding to an average density equal to 8000 inhabitants/km 2 . For simplicity, only reinforced concrete buildings are present. The corresponding seismic fragility model was assigned on the basis of a comprehensive literature survey conducted within the SYNER-G (2012) project [17]. A set of two lognormal fragility curves is provided (in terms of PGA, in units of g), for the yield and collapse limit states. The curve parameters (µ ln,Y and σ ln,Y for the yield limit state, µ ln,C and σ ln,C for the collapse limit state) are themselves characterised as joint normal variables (epistemic uncertainty of Type II). Table 1 above reports the mean and coefficient of variation (CoV) of fragility parameters for RC buildings (mid-rise, seismically designed, non-ductile bare frames, according to Crowley et al., 2014 [51]), while Table 2 reports the corresponding correlation matrix, which is needed to ensure that the yield and collapse fragility curves do not intersect.
The seismic environment (i.e., the HAZ layer) consists of two sources, discretised into three and two rectangular seismogenic areas, displayed and numbered in Figure 5b. Their activity is characterised in terms of the parameters of the truncated Gutenberg-Richter recurrence law. In particular, the values of λ_0 (i.e., the mean annual rate of events in the source with M greater than the lower limit M_min) are 0.014, 0.016, 0.020, 0.018 and 0.022 for the five areas, while the magnitude slope β and the lower and upper magnitude limits M_min and M_max are set to 2.72, 4.5 and 6.5 for the first three areas, and 1.70, 5.0 and 6.5 for the remaining two areas.
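Under this recurrence law, a magnitude can be drawn by inverting the truncated Gutenberg-Richter CDF, F(m) = (1 − e^(−β(m − M_min))) / (1 − e^(−β(M_max − M_min))). A minimal sketch, using the parameters of the first three source areas:

```python
import math, random

rng = random.Random(0)

def sample_magnitude(beta, m_min, m_max):
    """Inverse-CDF sample from the truncated Gutenberg-Richter law."""
    u = rng.random()
    c = 1.0 - math.exp(-beta * (m_max - m_min))  # truncation constant
    return m_min - math.log(1.0 - u * c) / beta

# Source areas 1-3 in the application: beta = 2.72, M in [4.5, 6.5];
# low magnitudes are far more frequent, as the exponential law implies.
mags = [sample_magnitude(2.72, 4.5, 6.5) for _ in range(2000)]
```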

Adopted Logic Tree
Following approach (3b) in Section 2.1, a logic tree was used to provide confidence in the estimate of the risk assessment results, as well as the sensitivity of the total output uncertainty to the components of epistemic uncertainty in the input. The four employed modules, related to models or parameter values, concern (weights are indicated in curly braces):

1. M_max for all the seismic sources (two alternative values);
2. the ground motion prediction equation (GMPE): Akkar and Bommer (2010) [52] or Boore and Atkinson (2008) [53];
3. the residuals ε_RR,PGV and ε_RR,PGD in the gas pipeline fragility model (8.5%, 50% and 91.5% fractiles) {0.25, 0.5, 0.25};
4. the fragility parameters for RC buildings (8.5%, 50% and 91.5% fractiles) {0.25, 0.5, 0.25}.

The fractiles and corresponding weights in modules #3 and #4 were set according to the work by Miller and Rice (1983) [21]. A Gaussian quadrature procedure with a selected weighting function is employed to approximate a continuous cumulative distribution with a user-defined number of pairs of random variable values and cumulative probabilities. Each value is paired with a probability, so as to obtain the probability mass function of the discretised random variable. Such probabilities are used as branch weights in a logic tree module containing several parameter values. For the case at hand, the parameter distributions in modules #3 and #4 were discretised with three points, namely, the 91.5%, 50% and 8.5% fractiles, with weights 0.25, 0.5 and 0.25, respectively.
Concerning module #3, the two variables ε RR belong to the same fragility model and could be correlated. However, only their (lognormal) marginal distributions are available [47], while their correlation structure is unknown. In order to handle this case, both parameters are lumped together in the same module, rather than in two subsequent modules, as done for the case of known correlation. Then, in the absence of the needed information, for each of the three branches the same fractile is derived for the two parameters by using their inverse marginal lognormal CDF, with zero log-mean and the log-standard deviations reported in Section 3.1.
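A minimal sketch of the resulting module #3 branches, using Python's `statistics.NormalDist` for the inverse standard normal CDF (zero log-mean and the log-standard deviations of Section 3.1):

```python
from statistics import NormalDist
from math import exp

# Miller & Rice (1983) three-point discretisation: fractiles and weights
fractiles = [0.915, 0.50, 0.085]
weights = [0.25, 0.50, 0.25]

# The same fractile is applied to both residuals (zero log-mean, given
# log-stds), as done in module #3 for lack of a known correlation structure.
sigma = {"eps_RR_PGV": 1.15, "eps_RR_PGD": 0.74}
branches = [
    {name: exp(s * NormalDist().inv_cdf(p)) for name, s in sigma.items()}
    for p in fractiles
]
# The middle branch (50% fractile) recovers the unit median of both residuals.
```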
As already mentioned, the four parameters of the two lognormal fragility curves for RC buildings are correlated (see Table 2). According to the approach by Cavalieri and Franchin (2019) [12], to handle parameter correlation, the RC fragility module (#4 above) in the logic tree would include four branches for each fractile. However, for the 50% fractile, all branches provide the same set of fragility curves (i.e., the mean curves), so that the four branches are lumped in just one. To summarise, such an approach would require four combinations for both 8.5% and 91.5% fractiles and one combination for the 50% fractile, for a total of nine branches to include in module #4, and consequently a total of 2 × 2 × 3 × 9 = 108 branches in the logic tree. On the other hand, the approach proposed herein for handling parameter correlation leads the RC fragility module to include only one branch per fractile (i.e., just three branches), for a total of 2 × 2 × 3 × 3 = 36 branches in the whole logic tree, thus entailing a significant reduction in computational effort.
The fragility curves related to the three branches in module #4 were thus retrieved according to the proposed approach (see Figure 2 above), employing a 25-point discretisation for each of the four parameters (for a total of 25^4 = 390,625 grid points) and fixing a 10^-3 tolerance on the CDF around each of the three considered F* values, namely, 0.915, 0.50 and 0.085. The curves are displayed (with thicker lines) in Figure 6, where the mean fragility curves are also reported (with thinner lines) for reference. Three aspects deserve to be highlighted. First, the fragility curves for the 50% fractile branch (see Figure 6b) are quite different from the mean ones, confirming once more that the parameter values corresponding to the 50% fractile of the joint normal distribution are not the 50% fractiles of the marginal distributions. Second, the 50% fractile fragility curves (in Figure 6b) are not very distant from the 91.5% fractile ones (in Figure 6a), owing to the steep slope of the joint CDF in the [0.5, 1] range: this leads the 50% and 91.5% fractiles of the joint distribution to be reflected by similar (high) fractiles of the parameters. Third, selecting the point with the highest joint pdf value (see Section 2.2) leads the fractiles of the parameters to be different from the target fractile of the joint distribution, but similar to each other, as can also be gathered from the two-parameter example in Figure 3. As a consequence, it can be noted in Figure 6 that the fragility curves gradually decrease in both median and slope (controlled by the log-mean and the log-standard deviation, respectively) when moving from the 91.5% to the 8.5% fractile.
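The grid search can be sketched as follows for a two-parameter case. Note the deviations from the paper's setup: the joint CDF is estimated by Monte Carlo rather than evaluated exactly, the mean, correlation and grid bounds are made up, and the tolerance is loosened accordingly:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two-parameter illustration: among grid points whose joint-normal CDF is
# within tol of the target fractile F*, pick the one with highest joint pdf.
mean = np.array([0.0, 0.0])
cov = np.array([[1.0, 0.6],
                [0.6, 1.0]])   # made-up correlation

# Monte Carlo estimate of the joint CDF on a 25x25 grid
samples = rng.multivariate_normal(mean, cov, size=100_000)
axes = [np.linspace(-3.0, 3.0, 25), np.linspace(-3.0, 3.0, 25)]
grid = np.array([(x, y) for x in axes[0] for y in axes[1]])
cdf = np.array([(samples <= g).all(axis=1).mean() for g in grid])

# Joint pdf (up to a constant factor, which is enough for ranking)
inv = np.linalg.inv(cov)
d = grid - mean
pdf = np.exp(-0.5 * np.einsum("ij,jk,ik->i", d, inv, d))

F_star, tol = 0.50, 1e-2   # looser than the paper's 1e-3, to suit the MC CDF
candidates = np.flatnonzero(np.abs(cdf - F_star) < tol)
best = grid[candidates[pdf[candidates].argmax()]]
```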

Results and Discussion
In order to carry out the seismic risk assessment in OOFIMS, a Monte Carlo simulation scheme was adopted. Distinct simulations were executed for each combination of branches in the logic tree (see Section 3.3), thus yielding multiple results. Within the generic simulation, before starting the sampling, the implemented steady-state flow formulation is used to assess the operational state of the undamaged gas network. Then, in each simulation run, the following activities are undertaken:

• The IMs of interest (i.e., PGA, PGV and PGD) at the vulnerable components' sites are estimated using the OOFIMS hazard module;
• The damage state of the components at risk, belonging to both the gas and building systems, is estimated through the adopted fragility models;
• The connectivity and operational state of the damaged gas network are assessed;
• The gas system's performance is estimated through the adopted flow-based metric;
• A second performance metric of interest for this application is computed, namely, the displaced population, P_d. In the implemented model, people can be displaced from their homes either because of direct physical damage (building usability) or because of lack of basic services/utilities (building habitability) resulting from damage to interdependent utility systems (only the gas system in this case); see Franchin and Cavalieri (2015) [18] for further details.
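A toy, heavily simplified version of this per-run loop might look as follows. Only the M/R-station fragility numbers (median 0.77 g, β = 0.65) and the SSI_0 = 0.976 reference come from the text; the single-site "GMPE" and the serviceability drop upon source failure are stand-ins, not the OOFIMS models:

```python
import math, random

rng = random.Random(0)

def lognormal_fragility(im, median, beta):
    """P(damage | IM) for a lognormal fragility curve."""
    return 0.5 * (1.0 + math.erf(math.log(im / median) / (beta * math.sqrt(2.0))))

def run_branch(n_runs, gmpe_median_pga=0.2, gmpe_beta=0.6):
    """Toy single-branch Monte Carlo loop: sample IM, sample damage,
    then map damage to a serviceability value."""
    ssi_samples = []
    for _ in range(n_runs):
        pga = gmpe_median_pga * math.exp(rng.gauss(0.0, gmpe_beta))  # toy GMPE
        failed = rng.random() < lognormal_fragility(pga, median=0.77, beta=0.65)
        # SSI0 = 0.976 in undamaged conditions; a failed source is assumed
        # to cut serviceability drastically in this toy flow model.
        ssi_samples.append(0.976 * (0.4 if failed else 1.0))
    return ssi_samples

ssi = run_branch(1000)
```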
As mentioned above, the logic tree resulting from the adoption of the proposed approach for correlated parameters is composed of 36 branches, thus entailing the execution of 36 Monte Carlo simulations. Each single simulation consists of 1000 runs (i.e., iterations or samples), a number deemed to be sufficient in consideration of the illustrative character of the paper, and appropriate to show the main features of the implemented methodology for the treatment of and sensitivity to epistemic uncertainty. Overall, the analyses required a computational time of about 14 h, using a computer with a 3.50 GHz eight-core CPU and 64 GB RAM.
The subplots in Figure 7 show twelve λ-curves of SSI, obtained from different combinations of choices in the first three modules, concerning M_max, the GMPE and both ε_RR variables in the GAS fragility model. An important feature of this application is that the highest attainable value of SSI, corresponding to the reference undamaged conditions and denoted here SSI_0, turns out to be lower than 1, namely, 0.976. This is due to the pressure-driven mode adopted in the flow formulation, combined with the assigned pressure levels and thresholds in the network (see [13,14]); as a result, several node loads are reduced even in non-seismic conditions, yielding a SSI_0 value lower than 1. Looking at the curves, it is possible to appreciate the amount of epistemic uncertainty in the output and also qualitatively assess its sensitivity to the considered modules. The maximum magnitude practically does not affect the curves, while the influence of the residuals and especially of the GMPE is more evident. In particular, the upper fractiles for ε_RR shift the λ-curves slightly towards lower exceedance rates for given SSI values, as expected, whilst in all cases the Akkar and Bommer (2010) [52] GMPE leads to more severe λ-curves of SSI, lying below those related to the Boore and Atkinson (2008) [53] model.

Figure 8 shows the MAF curves of the displaced population normalised to the total population, P_d/P. The λ-curves shown are 36, as many as the simulations carried out. Each curve corresponds to a single branch of the logic tree and quantifies the aleatory uncertainty contained in the employed models and parameters, whilst the spread of the curves around the average gives a clear indication of the epistemic uncertainty. As a consequence, the distribution of P_d/P corresponding to the full suite of curves captures both aleatory and epistemic uncertainties (Bommer and Scherbaum, 2008) [8].
Figure 8 also displays the weighted average curve of P_d/P (for fixed MAF values), as well as the 95% confidence bounds and the 16% and 84% weighted fractile curves. Note that, while the confidence in the estimate increases with the number of simulations, N, thus narrowing the confidence interval, the weighted fractiles depend only on the sampled values. To gain qualitative insight into the sensitivity of uncertainty in the final output (i.e., the MAF of P_d/P) to the input modules, the subplots in Figure 9 show a disaggregation of the total epistemic uncertainty according to the fractiles of the RC fragility distribution, following the same order of subplots as Figure 6. Thus, subplots (a), (b) and (c) show the λ-curves obtained by fixing the RC fragility distribution at the 91.5%, 50% and 8.5% fractiles, respectively. The curves are clearly grouped into clusters related to the considered fractiles: naturally, the higher the fractile, the lower the estimated number of displaced people. The similarity of the 50% and 91.5% fractile fragility curves (see Figure 6a,b) leads to two almost overlapping clusters (in Figure 9a,b). In each subplot, two further clusters of curves, related to the two adopted GMPEs, can be identified. In particular, as already seen for SSI, the Akkar and Bommer (2010) [52] GMPE leads to more severe λ-curves of P_d/P, lying to the right of those related to the Boore and Atkinson (2008) [53] model. The other two modules, namely, M_max and ε_RR, do not cause clustering of curves, and are therefore deemed to contribute less to output epistemic uncertainty. Quantitative information about sensitivity can be gained from Figure 10a, showing the module importance ranking obtained from weighted ANOVA (see Section 2.4) for return periods of 100 and 500 years.
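The weighted average and weighted fractiles can be extracted from the branch results at a fixed MAF level along these lines; the branch values and the uniform weights below are illustrative stand-ins (the true weights are the products of branch weights along each logic-tree path):

```python
import numpy as np

rng = np.random.default_rng(2)

def weighted_fractile(values, weights, p):
    """p-fractile of branch values under logic-tree weights."""
    order = np.argsort(values)
    v = np.asarray(values, float)[order]
    w = np.asarray(weights, float)[order]
    cum = np.cumsum(w) / w.sum()           # weighted empirical CDF
    return v[np.searchsorted(cum, p)]

# Illustrative data: 36 branch values of P_d/P read off the lambda-curves
# at one fixed MAF level, with uniform weights.
values = rng.uniform(0.05, 0.25, size=36)
weights = np.full(36, 1.0 / 36.0)

avg = np.average(values, weights=weights)
f16 = weighted_fractile(values, weights, 0.16)
f84 = weighted_fractile(values, weights, 0.84)
```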
Each module is displayed with two bars indicating the total variance for the two return periods; the hatched portion of each bar indicates the variance between groups and thus the importance of the module in terms of contribution to the total epistemic uncertainty. It can be readily observed that the building fragility model has the highest importance, whilst the contributions of M_max and ε_RR are negligible. Figure 10a thus provides a clear indication of the modules having the largest impact and hence deserving increased "knowledge" (in terms of enhanced modelling, additional data for calibration, etc.), in the effort to reduce the total epistemic uncertainty in the problem. The tornado diagram in Figure 10b presents the sensitivity of the weighted average of P_d/P to the different choices in the considered modules, for the same two return periods used for the weighted ANOVA. The plot partly reflects the importance ranking gained from Figure 10a, since the modules characterised by a larger distance between the extreme choices, for a given return period, clearly make a higher contribution to uncertainty. Once again, the close values (especially for the 100-year return period) of the P_d/P weighted average for choices (a) and (b) in module #4 reflect the similarity of the 50% and 91.5% fractile fragility curves (see Figure 6a,b). Concerning the only module containing models, namely, #2 above, the Akkar and Bommer (2010) [52] GMPE yields higher values of the P_d/P weighted average, confirming that it provides more severe shaking intensities than the Boore and Atkinson (2008) [53] model, as already seen in Figures 7 and 9. Finally, for all three modules containing parameter values, the sensitivity results are consistent with the input settings: higher magnitude, upper fractiles for ε_RR and lower fractiles for the RC fragility distribution yield higher values of displaced population, and vice versa, as expected.
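The importance measure behind this ranking, i.e., the share of between-group (weighted) variance in the total weighted variance, can be sketched as follows with made-up numbers:

```python
import numpy as np

def weighted_anova(values, weights, labels):
    """Share of weighted variance explained by grouping `labels`
    (variance between groups / total variance)."""
    values = np.asarray(values, float)
    weights = np.asarray(weights, float)
    weights = weights / weights.sum()
    mean = np.sum(weights * values)
    total = np.sum(weights * (values - mean) ** 2)
    between = 0.0
    for g in set(labels):
        m = np.array([lab == g for lab in labels])
        wg = weights[m].sum()
        group_mean = np.sum(weights[m] * values[m]) / wg
        between += wg * (group_mean - mean) ** 2
    return between / total

# Toy check: the output is driven almost entirely by module A, hardly by B.
vals = [1.0, 1.1, 3.0, 3.1]            # made-up output values, 4 branches
w = [0.25] * 4                          # uniform branch weights
mod_a = ["a1", "a1", "a2", "a2"]        # choice in module A per branch
mod_b = ["b1", "b2", "b1", "b2"]        # choice in module B per branch
share_a = weighted_anova(vals, w, mod_a)   # close to 1: dominant module
share_b = weighted_anova(vals, w, mod_b)   # close to 0: negligible module
```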

Conclusions and Future Work
The best practice for the seismic risk assessment of interdependent Critical Infrastructure systems encompasses taking into account all the relevant uncertainties affecting the problem, as well as providing confidence in the estimate and sensitivity of the total output uncertainty to uncertainty in the input.
Focusing on epistemic uncertainty, this paper presents a comprehensive framework for seismic risk assessment of infrastructure systems that (i) accounts for both aleatory and epistemic uncertainties, (ii) provides confidence in the estimate and (iii) provides the sensitivity of output epistemic uncertainty to the components of epistemic uncertainty in the input. Both models and model parameters affected by epistemic uncertainty are arranged in a chain of sequential modules within a logic tree. The latter is used for the treatment of epistemic uncertainty and, in combination with weighted ANOVA and tornado diagrams, to evaluate the sensitivity of uncertainty in the output to input epistemic uncertainty. The formulation is general and can be applied to risk assessment problems involving not only infrastructural but also structural systems (an example of treatment of Type II epistemic uncertainty using a logic tree is given in Gabbianelli et al., 2020 [54], related to steel storage pallet racks). The main novel contribution of the paper is a method for properly handling parameter correlation in a logic tree. This is a crucial aspect that is often disregarded by analysts in different sectors (e.g., in PSHA); neglecting it may lead the possible combinations of parameter values to cover an unrealistically wide range, thus impairing the entire analysis.
The presented methodology was implemented within an open-source simulation tool for civil infrastructures, OOFIMS, and exemplified with reference to a synthetic city composed of buildings and a gas network. A simulation-based (Monte Carlo) approach was adopted, in which the seismic hazard is characterised in terms of both TGD and PGD, and a flow-based performance metric was used to evaluate the serviceability of the gas system. The employed steady-state gas flow formulation encompasses multiple pressure levels. The presence of the building layer demonstrates the ability to handle system interdependencies, in terms of displaced population evaluated as a function of utility loss in the gas network, and highlights that uncertainty in models/parameters related to one system can affect uncertainty in the output related to dependent systems. Sensitivity results showed that the building fragility model has the highest importance and the GMPE a rather low impact, while the contributions (to the uncertainty of the output) of the maximum magnitude and of the residuals in the gas fragility model are negligible. These results suggest that increased "knowledge" is needed to reduce epistemic uncertainty, especially in the derivation of building fragility curves. Such knowledge may be related to an enhanced modelling approach, to additional data for model calibration, or to both. For example, instead of employing generic or typological fragility functions, one could use structure-specific fragility curves, obtained with a Bayesian Network (BN) model (e.g., the one proposed by Franchin et al., 2016 [55] for bridges). If complete information is available for model calibration, it would be possible to retrieve a single set of fragility curves: epistemic uncertainty in the input would still be present to some extent, albeit reduced, but would not be treated in the framework (for the case study considered herein, this would mean removing module #4 from the logic tree).
On the other hand, in cases of incomplete information, the BN model can still be used and may lead to obtaining a probabilistic fragility model (such as the one used in this work) characterised by reduced uncertainty (with respect to a generic model). In both cases, the MAF curves of the performance metric of interest will produce a tighter "fan" around the weighted average (see Figure 8 above).
The proposed methodology for treatment of and sensitivity to epistemic uncertainty, as well as its software implementation, can be used by emergency managers, infrastructure owners and all the stakeholders engaged in mitigating the seismic risk of communities in earthquake-prone regions.
Future work aims to consider additional sources of epistemic uncertainty in this field, to compare or develop further methods for sensitivity analysis, as well as to explore the sensitivity of the ANOVA-based importance ranking to the number of runs in the logic tree parallel simulations and to the performance metrics considered.
Funding: This research was partially funded by the European Commission, for the SYNER-G collaborative research project, grant number 244061.