Bayesian Calibration of Hysteretic Parameters with Consideration of the Model Discrepancy for Use in Seismic Structural Health Monitoring

: Bayesian model calibration techniques are commonly employed in the characterization of nonlinear dynamic systems, as they provide a conceptual and e ﬀ ective framework to deal with model uncertainties, experimental errors and procedure assumptions. This understanding has resulted in the need to introduce a model discrepancy term to account for the di ﬀ erences between model-based predictions and real observations. Indeed, the goal of this work is to investigate model-driven seismic structural health monitoring procedures based on a Bayesian uncertainty quantiﬁcation framework, and thus make relevant considerations for its use in the seismic structural health monitoring, focusing on masonry structures. Speciﬁcally, the Bayesian inference has been applied to the calibration of nonlinear hysteretic systems to both provide: (i) most probable values (MPV) of the parameters following the calibration; and (ii) estimates of the model discrepancy posterior distribution. The e ﬀ ect of the model discrepancy in the calibration is ﬁrst illustrated recurring to a single degree of freedom using a Bouc–Wen type oscillator as a numerical benchmark. The model discrepancy is then introduced for calibrating a reference nonlinear Bouc–Wen model derived from real data acquired on a monitored masonry building. The main novelty of this study is the application of the framework of uncertainty quantiﬁcation on models representing data measured directly on masonry structures during seismic events.


Introduction
Models play a key role in simulating the behavior of engineering structures, though even very detailed models may fail to represent critical mechanisms. The variety of schemes and uncertainties that are typical of civil structures makes the prediction of the actual mechanical behavior and structural performance a difficult task. All this being said, computer simulations are useful engineering tools to design complex systems and to assess their performance [1][2][3]. These simulations aim at reproducing the underlying physical phenomena in question providing a solution for the governing equations. However, accurate modelling of the structural systems requires them to be calibrated and validated with direct observations and measured experimental data [4].
Most calibration methods are essentially regression techniques that estimate the model parameters based on the outputs, and eventually on the input, by means of various optimization algorithms. Some of the classical approaches of parameter estimation include weighted least-squares estimation, best linear unbiased estimation, etc. [4]. Basically, they are optimization problems of minimizing the difference between computed model output and experimental measured data. However, such deterministic approaches bump into some common problems: they are prone to be ill-conditioned, extremely susceptible to errors, and affected by stability issues. As a consequence, a single optimal parameter vector is not sufficient to specify the structural model, but rather a family of all plausible values of the model parameters needs to be identified that are consistent with the observations [5]. Moreover, a common situation in practice is to have limited acquired data and hence sparse datasets. This introduces epistemic uncertainty (lack of knowledge) alongside aleatory uncertainty (natural variability) [6]. All these factors result in uncertainty on model prediction. Therefore, addressing the uncertainty in the model parameters and studying how it influences the uncertainty in the response of the system is crucial and necessary to get accurate predictions with realistic levels of confidence. This evidence naturally leads to the consideration of the problem from a probabilistic perspective. The existing literature on probabilistic calibration is extensive [7][8][9] and focuses particularly on Bayesian approaches [10][11][12][13][14][15][16], Bayesian approaches with model class selection [17] and Bayesian filtering methods [18,19].
A consequence of using imperfect models in the calibration is that the identified parameters may not correspond to their physical namesakes [1]. The resulting posteriors do not necessarily give estimates of the true value of physical parameters, but rather give values which lead the model to best explain the data. In order to estimate the true physical value of parameters, careful thought needs to be taken when considering which parameters to include in the calibration. In addition, the model discrepancy term must be precisely specified to connect model prediction to the observations [6].
In the civil engineering research field, the Bayesian methodology has been used especially for the structural model updating based on experimental dynamic data [5,20] to update the uncertainty of the mechanical parameters of reinforced concrete (RC) structures [21,22] and masonry structures [23][24][25]. The novelty of this study is the application of this framework on models representing data measured directly on masonry structures during seismic events. The goal is to investigate model-driven seismic structural health monitoring (SHM) procedures by incorporating the posterior uncertainty linked to updated model discrepancy in order to propose its use for the seismic SHM of buildings. This points to the importance of correlating the choice of the discrepancy model function to the degradation amount and to the characteristics of the external seismic input. In this context, the use of simplified physics-based models is made herein, to capture the general nonlinear dynamical behavior of the structure at an acceptable time-computational cost (e.g., to support real-time applications), without resorting to complex FE models that would have required the introduction of surrogate modelling with a consequent increase in uncertainty. Specifically, the well-known Bayesian inversion framework (Section 2) has been applied to the calibration of nonlinear hysteretic systems. This approach clearly defines the errors and considers the uncertainties present in the model explicitly, providing the full multivariate distribution of the calibrated parameters and insights into a variety of quantities of interest (e.g., expected values, maximum a posteriori estimates). The uncertainty quantification (UQ) framework developed in [6] is first applied to a numerical benchmark and then on a case study, whose data came from a building monitored within the Italian Seismic Observatory of Structures (OSS) [26] (Section 3). The calibration of the latter is conducted exploring different parametric and non-parametric degrading models, quantifying the uncertainties that arise from each of them using the same discrepancy function. Finally, the effect of different levels of degradation on the inference of model parameters is also investigated in Section 3 and discussed in Section 4. Conclusions are then drawn in Section 5.

Materials and Methods
In this section some theoretical concepts are recalled about uncertainty quantification that are deemed necessary in order to understand the main concepts and terminology in view of the presented applications to SHM.
In the context of Bayesian statistics, it is assumed that all model parameters are random variables. This randomness describes the degree of uncertainty related to their realizations. The inference goal is Appl. Sci. 2020, 10, 5813 3 of 23 to draw conclusions making probability statements about the hypermeters x given the evidence of actual observations y, through Bayes' theorem: π(x y) = π(x) L(x; y) Z , (1) where π(x y) is the posterior distribution of the hyperparameters, π(x) is their prior distribution and L(x; y) is the likelihood function. Z stands for normalizing factor, named evidence or marginal likelihood, that ensures that the distribution π(x) integrates to 1. Z is defined by the integral D x π(x) L(x; y) dx. In this way, the solution of the inverse problem coincides with the posterior probability distribution.

Discrepancy Model
In engineering problems, the analysis of physical systems is often performed by means of computational models which are commonly based on analytical equations governing the system or numerical methods. A computational forward model M is usually defined as a function that maps a set of model input parameters, x, governing the system to predict certain output quantities of interest (QoI) y. However, forward models are only a mathematical representation of the real system. As a consequence, a discrepancy term shall be introduced to connect model predictions M(x) to the experimental observations y [6]: In this discrepancy term, the effects of measurement error on y ∈ Y and model inaccuracy are gathered. The assumption of additive Gaussian discrepancy with zero mean and residual covariance matrix Σ has been made in this work: , where its parameters x ε are additional unknowns to infer jointly with the input parameters. x M of M(x) [6] is a quantity not known a priori, this issue being overcome by parametrizing the residual covariance matrix. A diagonal covariance matrix with unknown residual variances σ 2 is assumed, specifically Σ =σ 2 I. This results in reducing the discrepancy parameter vector in a single scalar (x ε ≡ σ 2 ) and in setting the parameter vector as x = x M , σ 2 . Assuming then a prior distribution π σ 2 for the unknown variance of ε, as well as independency on the priors of the uncertain model and the discrepancy, the joint prior distribution can be drawn out: With these assumptions, a particular measurement point y i ∈ Y is a realization of a Gaussian distribution with mean value M(x) and variance σ 2 . In this way the likelihood function reads: Specifically: where N is the number of measurement points. Hence, the posterior distribution can be computed as: Appl. Sci. 2020, 10, 5813 4 of 23 which summarizes the updated information about the unknowns as x = x M , σ 2 after conditioning on the observation y.

Bayesian Calibration Procedure
In this section, we introduce the key steps of the entire procedure used for the calibration of the stochastic systems examined below (for the numerical benchmark and for the case study). Specifically, a three-phase approach was used as follows: • Phase 1: definition of the seismic input excitation and of the computational model. First, a ground earthquake acceleration record must be selected as seismic input excitation of the system. Then, one specifies general options for the physical model and its governing laws. In this work the system of Ordinary Specifically, the posterior distribution can also be seen as an intermediate quantity used to compute the conditional expectation of a certain quantity of interest (QoI) h(x) : D x → R [27], so that: where x (t) is the step of the chain at iteration t, whereas T is the total number of the generated MCMC sample point. In this way, the posterior is explored by realizing appropriate Markov chains over the prior support. Specifically, the affine invariant ensemble sampler (AIES) solver algorithm [28] has been used in this work, and an additive Gaussian discrepancy with zero mean and residual diagonal covariance matrix Σ = σ 2 I with unknown residual variances σ 2 has been introduced to connect the model response to the experimental observations. To infer σ 2 , a uniform prior distribution was assumed: whose standard deviation was set equal to the maximum of the absolute value over the time of the experimental observation Y at the beginning of the calibration. The Bayesian updating is then conducted by minimizing this value (i.e., maximizing the likelihood function of Equation (6)) until the convergence of the Markov chains is reached. The obtained sample can be used to estimate output statistics by drawing samples from the posterior. Finally, having obtained model parameters posterior distributions, a deterministic model response prediction can be estimated choosing the maximum a posteriori (MAP) or the mean of the distributions as a point estimate. The aforementioned approach is summarized by the flowchart in Figure 1.

Numerical Benchmark: Calibration of a SDoF Bouc-Wen Type Hysteretic System
The Bouc-Wen model of hysteresis is widely used in structural engineering [29][30][31][32][33][34]. The model was proposed by Bouc [35], and thereafter modified by Wen [36]. The differential equation governing the motion of a single degree of freedom (SDoF) is written in the form: .. + + ( ( ), ( )) = ( ), where is the mass of the system, is the viscous linear damping coefficient, ( ) is the displacement, ( , ) is the restoring force, and ( ) is the external excitation. ( ) and ( ) denote derivations in time. According to the Bouc-Wen model, the restoring force is given by the following expression: where ( , ) represents the elastic component whereas ( , ) is the hysteretic component which depends on the past history of the system response, is the initial stiffness of the system, is the ratio between the final stiffness and initial one and ( ) is the hysteretic displacement as defined by Baber and Noori [37] to enhance the capacity of the original model to represent hysteretic cycles: .

Numerical Benchmark: Calibration of a SDoF Bouc-Wen Type Hysteretic System
The Bouc-Wen model of hysteresis is widely used in structural engineering [29][30][31][32][33][34]. The model was proposed by Bouc [35], and thereafter modified by Wen [36]. The differential equation governing the motion of a single degree of freedom (SDoF) is written in the form: where m is the mass of the system, c is the viscous linear damping coefficient, u(t) is the displacement, f r (u, z) is the restoring force, and p(t) is the external excitation.
. u(t) and .. u(t) denote derivations in time. According to the Bouc-Wen model, the restoring force is given by the following expression: where f el r (u, z) represents the elastic component whereas f h r (u, z) is the hysteretic component which depends on the past history of the system response, k i is the initial stiffness of the system, α is the ratio between the final stiffness and initial one and z(t) is the hysteretic displacement as defined by Baber and Noori [37] to enhance the capacity of the original model to represent hysteretic cycles: .
In Equation (12) the parameters β, γ and N control the shape of the cycles, while the additional parameters A(ε), η(ε) and ν(ε) are degradation functions in terms of the dissipated hysteretic energy ε h (t) and take into account the stiffness and strength degradation: where the constant values of A 0 , ν 0 , η 0 are usually set to unity [38]. Whereas the values δ A , δ ν , δ η are constant terms which specify the amount of stiffness and strength degradation [39]. Finally, the dissipated hysteretic energy ε h (t) is given by: A record of the Montenegro earthquake (1979) was selected from the PEER Ground Motion Database (PEER, 2019) as seismic excitation ( Figure 2) for the benchmark.
Appl. Sci. 2020, 10, x FOR PEER REVIEW 6 of 22 In Equation (12) the parameters , and control the shape of the cycles, while the additional parameters ( ), ( ) and ( ) are degradation functions in terms of the dissipated hysteretic energy ( ) and take into account the stiffness and strength degradation: where the constant values of A0, , are usually set to unity [38]. Whereas the values , , are constant terms which specify the amount of stiffness and strength degradation [39].
Finally, the dissipated hysteretic energy ( ) is given by: A record of the Montenegro earthquake (1979) was selected from the PEER Ground Motion Database (PEER, 2019) as seismic excitation ( Figure 2) for the benchmark. This input excitation was used to simulate the response of a SDoF Bouc-Wen-Baber-Noori (BWBN) nonlinear system with mass of = 1200 kg and monitored by a dynamic monitoring system composed of one accelerometer installed on the system lumped mass. The initial stiffness of the system was calibrated with the objective of obtaining a plausible system frequency for a masonry structure of approximately 4 Hz. A damping ratio of = 3% was adopted according to relevant literature on masonry structures. Finally, the remaining BWBN parameters used to generate the simulated record are presented in Table 1, while Figure 3 depicts the simulation record data. The Bayesian calibration was then carried out on the simulated velocity response. The latter was chosen because it is simpler to describe such a system by a single ODE with respect to velocity. Deterministic values were set for the following parameters of the BWBN model: = 1200 kg, = 1, = 1, and also for the damping ratio, = 3%. The latter assumption was made to release uncertainties related to the identification of from the calibration process. Instead, the remaining ones were considered independent random variables with associated distributions given in Table 2 and constitute the vector of uncertain model parameters . It is worth specifying that the degradation effect of the BWBN model has been neglected on purpose, in order to introduce a discrepancy in the model used for emulating the reference response of the system. Besides, no uncertainties in input excitation have been considered herein, in order to address them only on the model side. The first moments of the prior probability density functions (PDFs) of model parameters and α, namely = 7.42 and = 0.063, were selected by two linear regressions on the reference restoring force curve for small and high displacements. The standard deviations of these parameters were chosen in order to take into account an amount of variation from these values of the 10%. It is worth noticing that in Table 2 a new random variable * was introduced in order to guarantee the bounded-input, bounded-output (BIBO) stability [40]. Indeed, assuming as prior knowledge on the parameter the required BIBO stability condition (i.e., being uniformly distributed between the bounds − ≤ ≤ ), the conditional distribution ( | ) must be uniform. This input excitation was used to simulate the response of a SDoF Bouc-Wen-Baber-Noori (BWBN) nonlinear system with mass of m = 1200 kg and monitored by a dynamic monitoring system composed of one accelerometer installed on the system lumped mass. The initial stiffness k i of the system was calibrated with the objective of obtaining a plausible system frequency for a masonry structure of approximately 4 Hz. A damping ratio of ξ = 3% was adopted according to relevant literature on masonry structures. Finally, the remaining BWBN parameters used to generate the simulated record are presented in Table 1, while Figure 3 depicts the simulation record data. The Bayesian calibration was then carried out on the simulated velocity response. The latter was chosen because it is simpler to describe such a system by a single ODE with respect to velocity. Deterministic values were set for the following parameters of the BWBN model: m = 1200 kg, A = 1, N = 1, and also for the damping ratio, ξ = 3%. The latter assumption was made to release uncertainties related to the identification of ξ from the calibration process. Instead, the remaining ones were considered independent random variables with associated distributions given in Table 2 and constitute the vector of uncertain model parameters x M . It is worth specifying that the degradation effect of the BWBN model has been neglected on purpose, in order to introduce a discrepancy in the model used for emulating the reference response of the system. Besides, no uncertainties in input excitation have been considered herein, in order to address them only on the model side. The first moments of the prior probability density functions (PDFs) of model parameters k i and α, namely µ k i = 7.42 and µ α = 0.063, were selected by two linear regressions on the reference restoring force curve for small and high displacements. The standard deviations of these parameters were chosen in order to take into account an amount of variation from these values of the 10%. It is worth noticing that in Table 2 a new random variable γ * was introduced in order to guarantee the bounded-input, bounded-output (BIBO) stability [40]. Indeed, assuming as prior knowledge on the parameter γ the required BIBO stability condition (i.e., γ being uniformly distributed between the bounds −β ≤ γ ≤ β), the conditional distribution π(γ β) must be uniform. Table 1. Parameters of the BWBN model adopted for simulating the record data.
This can be achieved by transforming the input variables, namely removing the parameter and introducing an auxiliary variable * ∼ (−1,1). Hence, for a joint realization of the parameters and * , the actual parameter reads: = * for 0. The standard deviations of the latter parameters were computed trivially once the supports of the uniform distributions were defined. To infer the unknown residual variances (Equation (9)), a uniform prior distribution was assumed: whose standard deviation was set equal to the maximum of the absolute value over the time of the experimental observation . Markov Chain Monte Carlo (MCMC) simulations have been conducted using MATLAB-based Uncertainty Quantification framework UQLab-V1.3-113 ® [6] with 100 chains, 700 steps and AIES solver algorithm [28]. The number of the BWBN forward model ℳ ( ) calls in MCMC was 70′000. The system of ordinary differential equation (ODEs; Equation (10) and (12)) was solved with the MATLAB solver ode45 (explicit Runge-Kutta method with relative error tolerance 1 × 10 −3 ). The Latin hypercube sampling (LHS) method was used to get the parameters prior distributions (Table 2) of the BWBN model, which are shown in Figure 4.  This can be achieved by transforming the input variables, namely removing the parameter γ and introducing an auxiliary variable γ * ∼ U(−1, 1). Hence, for a joint realization of the parameters β and γ * , the actual parameter reads: γ = γ * β for β 0. The standard deviations of the latter parameters were computed trivially once the supports of the uniform distributions were defined. To infer the unknown residual variances σ 2 (Equation (9)), a uniform prior distribution was assumed: whose standard deviation was set equal to the maximum of the absolute value over the time of the experimental observation . u. Markov Chain Monte Carlo (MCMC) simulations have been conducted using MATLAB-based Uncertainty Quantification framework UQLab-V1.3-113 ® [6] with 100 chains, 700 steps and AIES solver algorithm [28]. The number of the BWBN forward model M BW (x) calls in MCMC was 70 000. The system of ordinary differential equation (ODEs; Equations (10) and (12)) was solved with the MATLAB solver ode45 (explicit Runge-Kutta method with relative error tolerance 1 × 10 −3 ). The Latin hypercube sampling (LHS) method was used to get the parameters prior distributions (Table 2) of the BWBN model, which are shown in Figure 4.  The evolutions of the Markov chains are plotted instead in Figure 5 for each model parameter = { , , * , , }. These give valuable insights about convergence of the chains. Indeed, it can be clearly seen that 700 steps are sufficient to reach the steady state. Consequently, samples generated by the chains follow the posterior distributions. However, the sample points generated by the AIES MCMC algorithm have been post-processed carrying out the burn-in of the first half of sample points before convergence to avoid the pollution of the estimate of posterior properties. In our case, the quality of the generated MCMC chains can be judged satisfactory. Table 3 reports the result of the Bayesian inversion analysis: mean and standard deviation of the posterior 5%−95% quantiles of the distribution and Maximum A Posteriori (MAP) point estimate. The MAP can be assumed as the most probable parameter value following calibration. This value has been used as a best-fit parameter. The evolutions of the Markov chains are plotted instead in Figure 5 for each model parameter x = k i , β, γ * , α, σ 2 . These give valuable insights about convergence of the chains. Indeed, it can be clearly seen that 700 steps are sufficient to reach the steady state. Consequently, samples generated by the chains follow the posterior distributions. However, the sample points generated by the AIES MCMC algorithm have been post-processed carrying out the burn-in of the first half of sample points before convergence to avoid the pollution of the estimate of posterior properties. In our case, the quality of the generated MCMC chains can be judged satisfactory. Table 3 reports the result of the Bayesian inversion analysis: mean and standard deviation of the posterior 5%-95% quantiles of the distribution and Maximum A Posteriori (MAP) point estimate. The MAP can be assumed as the most probable parameter value following calibration. This value has been used as a best-fit parameter. MCMC algorithm have been post-processed carrying out the burn-in of the first half of sample points before convergence to avoid the pollution of the estimate of posterior properties. In our case, the quality of the generated MCMC chains can be judged satisfactory. Table 3 reports the result of the Bayesian inversion analysis: mean and standard deviation of the posterior 5%−95% quantiles of the distribution and Maximum A Posteriori (MAP) point estimate. The MAP can be assumed as the most probable parameter value following calibration. This value has been used as a best-fit parameter.  The model response prediction using MAP as a point estimate of the posterior distributions is plotted in Figure 6. It can be clearly seen how the inferred response reproduces the experimental record quite accurately, with a mean square error (MSE) over the time of 2.007 × 10 −4 (i.e., half of the average of the MSE computed on all the post-predictive model runs, MSE = 4.0415 × 10 −4 ). In fact, we should not dwell on this specific prediction; rather, we should consider its confidence intervals. The latter tells us how uncertainties on model input propagate through the model. The model response prediction using MAP as a point estimate of the posterior distributions is plotted in Figure 6. It can be clearly seen how the inferred response reproduces the experimental record quite accurately, with a mean square error (MSE) over the time of 2.007 × 10 −4 (i.e., half of the average of the MSE computed on all the post-predictive model runs, MSE = 4.0415 × 10 −4 ). In fact, we should not dwell on this specific prediction; rather, we should consider its confidence intervals. The latter tells us how uncertainties on model input propagate through the model. The model response prediction using MAP as a point estimate of the posterior distributions is plotted in Figure 6. It can be clearly seen how the inferred response reproduces the experimental record quite accurately, with a mean square error (MSE) over the time of 2.007 × 10 −4 (i.e., half of the average of the MSE computed on all the post-predictive model runs, MSE = 4.0415 × 10 −4 ). In fact, we should not dwell on this specific prediction; rather, we should consider its confidence intervals. The latter tells us how uncertainties on model input propagate through the model.  From inspection of Figure 6 it can be concluded that uncertainties on the input parameters produce higher uncertainty of the response prediction in the lower amplitude regions rather than in the peak values. However, this is a satisfying result in earthquake engineering, where one is more interested in predicting maximum response quantities. The posterior distributions of the calibrated parameter are plotted in Figure 7.
Appl. Sci. 2020, 10, x FOR PEER REVIEW 10 of 22 From inspection of Figure 6 it can be concluded that uncertainties on the input parameters produce higher uncertainty of the response prediction in the lower amplitude regions rather than in the peak values. However, this is a satisfying result in earthquake engineering, where one is more interested in predicting maximum response quantities. The posterior distributions of the calibrated parameter are plotted in Figure 7.

Demonstration on a Case Study
Having established that the UQ framework can significantly help to gain insight into the uncertainties intrinsic to calibration, thus furnishing a realistic level of confidence on model prediction, this can now be applied to a case study. The aim is to study how the UQ works with an identified hysteretic and degrading model that emulates the experimental response of a real monitored masonry building, the Town Hall of Pizzoli, subjected to its real earthquake recorded data. This is a three-story stone masonry building located northwest of the city of L'Aquila (Abruzzo), which is away from the city. It was built around 1920 and it formerly hosted a school. The structure presents a u-shaped regular plan mainly distributed along one direction, and its elevations are characterized by regular openings distributed along three levels above the ground (the raised ground floor, the first floor and the under-roof floor). The total area is about 770 m 2 while the volume is about

Demonstration on a Case Study
Having established that the UQ framework can significantly help to gain insight into the uncertainties intrinsic to calibration, thus furnishing a realistic level of confidence on model prediction, this can now be applied to a case study. The aim is to study how the UQ works with an identified hysteretic and degrading model that emulates the experimental response of a real monitored masonry building, the Town Hall of Pizzoli, subjected to its real earthquake recorded data. This is a three-story stone masonry building located northwest of the city of L'Aquila (Abruzzo), which is away from the city. It was built around 1920 and it formerly hosted a school. The structure presents a u-shaped regular plan mainly distributed along one direction, and its elevations are characterized by regular openings distributed along three levels above the ground (the raised ground floor, the first floor and the under-roof floor). The total area is about 770 m 2 while the volume is about 5000 m 3 . Figure 8 reports the schematizations of the analyzed building. The Town Hall of Pizzoli belongs to the network of buildings monitored by OSS [26]. The OSS monitors the structure of Pizzoli thanks to a dynamic monitoring system composed of eight accelerometers installed on the building and one placed in the basement. More specifically, a tri-axial accelerometer to record the seismic input in the three space directions is fixed in the basement. On the raised ground floor, no accelerometers are present. On the first floor, three biaxial accelerometers and one monoaxial accelerometer (in X direction) are present to capture the acceleration responses of the first floor in the two horizontal directions. Finally, the same scheme of sensors installed on the first floor is present on the second floor. It is worth noting that the acquisition system and the sensors setup was designed by OSS and represents an input data for our analyses. In this regard, to obtain an optimal positioning of the sensors, a finite element model should be used. In particular, the optimal sensor placement is done recurring to a linear finite element model (using eigen-analysis, singular value decomposition, etc.). The model is often linear to perform this task because for nonlinear analysis the linear elastic component commonly brings the predominant amount of the output variance.
Appl. Sci. 2020, 10, x FOR PEER REVIEW 11 of 22 directions. Finally, the same scheme of sensors installed on the first floor is present on the second floor. It is worth noting that the acquisition system and the sensors setup was designed by OSS and represents an input data for our analyses. In this regard, to obtain an optimal positioning of the sensors, a finite element model should be used. In particular, the optimal sensor placement is done recurring to a linear finite element model (using eigen-analysis, singular value decomposition, etc.). The model is often linear to perform this task because for nonlinear analysis the linear elastic component commonly brings the predominant amount of the output variance. The system sampled the signals at 250 Hz, for a total duration of about 50 s for the analyzed earthquake. The acquisition system recorded the sequence that struck central Italy in 2016. For this study, the earthquake acceleration responses of the building recorded on 30 October 2016 were used as reference signals. It is worth noting that accelerometric data are used in this work to define the likelihood function during the calibration. This is true as the use of inter-storey drift in the definition of the likelihood function would require its estimation. This estimate is often complicated because the filtering and integration operation on experimental data usually brings errors during the elaboration, especially with respect to the time alignment of two different recorded signals (quantities that are needed to estimate the inter-storey drift).
Since the scope of this study is to show how the discrepancy of models affects the calibration of model parameters value for seismic SHM, only the inter-storey response of the under-roof floor (2nd DoF) has been considered as reference output quantity. The acceleration of the first floor (1st DoF) is considered as the input signal of the reference model. The decision to use an inter-storey law is based on the fact that the results of the calibration will be compared with the results of calibration performed with discrepant models. In this sense, a single DoF model helps to make a direct comparison of the results, since the correspondence of parameters belonging to different models becomes less obvious for multi DoF models. The idea is to present a numerical case study derived from a real-world structure, with realistic values for hysteretic model parameters.

Reference Model
In this subsection, the model that came as the result of a nonlinear identification previously performed on the Pizzoli Town Hall structure is described for completeness. The nonlinear identification, like in [41,42], relied on a plane frame model in the direction of minor inertia of the building. Assuming the mass contribution of the raised ground floor in the dynamics of the structure is The system sampled the signals at 250 Hz, for a total duration of about 50 s for the analyzed earthquake. The acquisition system recorded the sequence that struck central Italy in 2016. For this study, the earthquake acceleration responses of the building recorded on 30 October 2016 were used as reference signals. It is worth noting that accelerometric data are used in this work to define the likelihood function during the calibration. This is true as the use of inter-storey drift in the definition of the likelihood function would require its estimation. This estimate is often complicated because the filtering and integration operation on experimental data usually brings errors during the elaboration, especially with respect to the time alignment of two different recorded signals (quantities that are needed to estimate the inter-storey drift).
Since the scope of this study is to show how the discrepancy of models affects the calibration of model parameters value for seismic SHM, only the inter-storey response of the under-roof floor (2nd DoF) has been considered as reference output quantity. The acceleration of the first floor (1st DoF) is considered as the input signal of the reference model. The decision to use an inter-storey law is based on the fact that the results of the calibration will be compared with the results of calibration performed with discrepant models. In this sense, a single DoF model helps to make a direct comparison of the results, since the correspondence of parameters belonging to different models becomes less obvious for multi DoF models. The idea is to present a numerical case study derived from a real-world structure, with realistic values for hysteretic model parameters.

Reference Model
In this subsection, the model that came as the result of a nonlinear identification previously performed on the Pizzoli Town Hall structure is described for completeness. The nonlinear identification, like in [41,42], relied on a plane frame model in the direction of minor inertia of the building. Assuming the mass contribution of the raised ground floor in the dynamics of the structure is negligible, the building has been modelled with two lumped masses in correspondence to the first and under-roof floor.
The identification procedure consisted of using absolute acceleration data measured at the floors to approximate the response with a generalized linear model obtained by expanding the Bouc-Wen model of hysteresis.
The basic functions of the generalized linear model ended up being nonlinear in terms of the exponential parameters of the BW model. These parameters were therefore identified with specialized optimization algorithms, while for the remaining parameters a direct estimate in the joint time-frequency domain was performed. The identification provided also instantaneous values of the BW model parameters made explicit by the generalized linear model. The main findings of the identification can be summarized as follow: • The adopted procedure allowed verification of the consistency of the assumed nonlinear model. This was done by checking the stability of the values of the model parameters over time.

•
The resulting model satisfactorily reproduced the experimental response. This was validated in [43] by comparing the model results with the experimental findings obtained by the authors and by other researchers [44] independently, • The procedure allowed for the collection of timely information on the health of the structure immediately after the occurrence of the earthquake.
In the identification process, a global box-like behavior was assumed, also based on the results of on-site inspections that led to the verification of the existence of good connections between walls and floor-walls before the occurrence of seismic events. In addition, no strength deterioration was accounted for (i.e., ν ε h = 1), while for the stiffness degradation, only the proportional component was maintained (i.e., η ε h = 1 and A ε h = 1 − δ A ε h (t)). The parameter α was instead imposed to zero. These choices were made to make the reference model consistent with the model identified in [43], which was calibrated with experimental data.
The reference two DoFs model has been reconducted to a DoF by imposing the acceleration response of the first floor of the Town Hall (recorded by the monitoring system), .. u 1 (t), as the input signal. The output was obtained for the inter-story drift, u(t), between the under-roof and the first floor. For further information on the identified model one can refer to [43].
Thus, the reference model used for the analysis from now on becomes: . .
The reference model is depicted in Figure 9 with the acceleration input   Table 4. For the numerical simulations, a viscous damping term (with a damping ratio of 3% set as a deterministic value in the calibration process) has been added to the model of Equation (16).
The reference model is depicted in Figure 9 with the acceleration input ( ) , while the corresponding model parameters, identified in a previous study with the Pizzoli Town Hall building responses, are reported in Table 4. For the numerical simulations, a viscous damping term (with a damping ratio of 3% set as a deterministic value in the calibration process) has been added to the model of Equation (16).

Bayesian Calibration of the Reduced Single DoF Reference Model
The reduced single DoF BWBN model was used to validate the entire procedure in the first phase. Tables 5 and 6 report the prior information on model parameters and the results of the Bayesian inversion, respectively. Figure 10 depicts the posterior distributions of the parameters, while Figure 11 depicts the velocity response of the oscillator once the model has been calibrated. From the inspection of the latter, it can be concluded that uncertainties on the input parameters do not affect the response prediction.    Moreover, the value of the degrading term turns out not to influence the response prediction. This is due to the fact that during the earthquake shaking (PGA ≈ 1 m/s 2 ) the structure exhibited a low level of damage [44].
Both parametric and non-parametric degradation models have been used herein. Two models per each family are first mathematically introduced, then insights into the accuracy afforded by each model are provided.

Models for Stiffness Degradation
Besides the original BWBN degrading model, an additional parametric model is now introduced to investigate the capability of the Gaussian model discrepancy term to cover model uncertainty. An important part of the current literature on damage indices focuses on exponential function of the dissipated energy.
Consequently, an energy-based exponential function, according to [45], has been used in this work to replace the original BWBN stiffness degradation term 1 − ( ) that multiplies the initial Moreover, the value of the degrading term δ A turns out not to influence the response prediction. This is due to the fact that during the earthquake shaking (PGA ≈ 1 m/s 2 ) the structure exhibited a low level of damage [44].
Both parametric and non-parametric degradation models have been used herein. Two models per each family are first mathematically introduced, then insights into the accuracy afforded by each model are provided.

Models for Stiffness Degradation
Besides the original BWBN degrading model, an additional parametric model is now introduced to investigate the capability of the Gaussian model discrepancy term to cover model uncertainty. An important part of the current literature on damage indices focuses on exponential function of the dissipated energy.
Consequently, an energy-based exponential function, according to [45], has been used in this work to replace the original BWBN stiffness degradation term (1 − δ A ε(t)) that multiplies the initial stiffness k i : where β is an additional parameter to infer jointly with k i . In order to replicate the degrading effect on the initial stiffness k i through non-parametric models, probability distribution functions have been used herein. First, according to the procedure used in [46], a Weibull distribution function was adopted: where R, p 1 , p 2 are the distribution parameters to infer jointly with k i . Meaning to these parameters may be given as follows: p 1 represents the instant of time at which there is loss of stiffness; p 2 controls its rate; and R W controls the amount of damage. Secondly, a logistic distribution function was adopted: In this case, parameter µ is the instant of time in which the loss of stiffness occurs, while parameter s represents its rate, and R L defines the amount of degradation.

Comparison of the Calibrated Models
It is useful to make a comparison among all the models introduced to get insights on their accuracy in predicting the response of the oscillator. For the sake of simplicity, from now Model 1, Model 2, Model 3 and Model 4 will refer to the models with the degrading law of the original BWBN model, the exponential law, the Weibull and the logistic distribution functions, respectively. It is also worth noticing that, in order to reduce the number of parameters to infer, only the initial stiffness k i and the degrading term have been involved in the calibration process. This can be easily justified by the fact that, for practical earthquake engineering applications, the focus relies on the quantification of the structure's stiffness and its drop (especially for masonry structures which are prone to crack during a seismic event). Prior information on the possible value of the model parameters was obtained by visual inspection of the dissipated energy versus time curve. For both parametric and non-parametric models, the velocity response of the system obtained was almost identical to Figure 11 for a low-level of degradation. Figure 12 depicts the posterior distributions of the initial stiffness parameter k i for each model considered. From this inspection, it can be seen that all models were able to predict the correct value of the initial stiffness. The choice of the MAP (or of the mean of the distributions) as a point estimate after the uncertainty propagation led to a slightly underrated initial stiffness regardless of the model adopted. However, this is neglectable from an engineering point of view. Moreover, Figure 12 depicts the distributions of the discrepancy variance term σ 2 as well. All the models share the same order of magnitude for σ 2 and, more or less, the same variance. This can be read as follows: the adopted Gaussian discrepancy term, with mean null and unknown variance, performs as a good function to embody all the model errors arising. Thus, at the end, it appears clear how the choice of the model used to infer the stiffness of the oscillator is not meaningful in the presence of a low-level of damage. the mean of the distributions) as a point estimate after the uncertainty propagation led to a slightly underrated initial stiffness regardless of the model adopted. However, this is neglectable from an engineering point of view. Moreover, Figure 12 depicts the distributions of the discrepancy variance term as well. All the models share the same order of magnitude for and, more or less, the same variance. This can be read as follows: the adopted Gaussian discrepancy term, with mean null and unknown variance, performs as a good function to embody all the model errors arising. Thus, at the end, it appears clear how the choice of the model used to infer the stiffness of the oscillator is not meaningful in the presence of a low-level of damage.

Influence of the Degradation Level on Model Parameters Inference
Having established that the choice of the models used to infer the stiffness parameter appears to not be relevant for a low-level of degradation, the aim is now to investigate what happens in possible further scenarios in which a higher level of damage is exhibited. This study is important when working in the field of model calibration because it explains the importance of linking the choice of the discrepancy model distribution-which is needed as input in the calibration processto the degradation amount and the characteristics of the external force (e.g., amplitude and frequency).
To this aim, the input excitation was rescaled. Specifically, two cases are examined as follows: (i) one in which the input has been rescaled with a PGA of 6 m/s 2 (medium level of degradation); (ii) one in which the input has been rescaled with a PGA of 9 m/s 2 (high level of degradation).

Influence of the Degradation Level on Model Parameters Inference
Having established that the choice of the models used to infer the stiffness parameter k i appears to not be relevant for a low-level of degradation, the aim is now to investigate what happens in possible further scenarios in which a higher level of damage is exhibited. This study is important when working in the field of model calibration because it explains the importance of linking the choice of the discrepancy model distribution-which is needed as input in the calibration process-to the degradation amount and the characteristics of the external force (e.g., amplitude and frequency).
To this aim, the input excitation was rescaled. Specifically, two cases are examined as follows: (i) one in which the input has been rescaled with a PGA of 6 m/s 2 (medium level of degradation); (ii) one in which the input has been rescaled with a PGA of 9 m/s 2 (high level of degradation).

Medium Level of Degradation
When increasing the level of excitation to a medium value (i.e., PGA of 6 m/s 2 ) and keeping the same reference damage, all the models were still able to predict the correct value of the initial stiffness k i . However, an increase of the uncertainties in the model prediction was registered, this time due to the greater values of the discrepancy variance. The choice of the MAP (or of the mean of the distributions) as point estimate led to a slight overrating of the initial stiffness. As the estimation error was negligible, the adopted discrepancy term still performs as a good function to embody all the model errors. However, increasing the level of the input without accounting for a growth of the damage level appears to be unrealistic. This is because higher seismic excitations generally involve greater dissipation of energy. Therefore, a loss up to 40% of the initial stiffness k i was taken into account as well, resulting in Model 1 (i.e., the one consistent with the BWBN degrading model, which is able to predict the correct value of the initial stiffness). On the contrary, all the models inconsistent with the BWBN model (i.e., Models 2, 3 and 4) overrate the initial stiffness value in a non-negligible proportion (Figure 13), leading to higher level of uncertainty in the model response prediction. It is also worth noticing how the estimation error was amplified for non-parametric models. This is a considerable indicator of the inadequacy of the discrepancy function adopted. In order to recover the response records, more accurate and well-suited discrepancy models should be investigated, but this is beyond the scope of this paper.
the greater values of the discrepancy variance. The choice of the MAP (or of the mean of the distributions) as point estimate led to a slight overrating of the initial stiffness. As the estimation error was negligible, the adopted discrepancy term still performs as a good function to embody all the model errors. However, increasing the level of the input without accounting for a growth of the damage level appears to be unrealistic. This is because higher seismic excitations generally involve greater dissipation of energy. Therefore, a loss up to 40% of the initial stiffness was taken into account as well, resulting in Model 1 (i.e., the one consistent with the BWBN degrading model, which is able to predict the correct value of the initial stiffness). On the contrary, all the models inconsistent with the BWBN model (i.e., Models 2, 3 and 4) overrate the initial stiffness value in a non-negligible proportion (Figure 13), leading to higher level of uncertainty in the model response prediction. It is also worth noticing how the estimation error was amplified for non-parametric models. This is a considerable indicator of the inadequacy of the discrepancy function adopted. In order to recover the response records, more accurate and well-suited discrepancy models should be investigated, but this is beyond the scope of this paper.

High Level of Degradation
The final step consisted of increasing the level of input to a high value (i.e., PGA of 9 m/s 2 ). Again, when keeping the same level of reference damage, all models were able to predict the correct value of the initial stiffness although with a visibly greater variance than before. When the level of damage increased, taking into account a stiffness loss of up to the 80% of the initial one, the same conclusions made in Section 3.3.1 were also confirmed, stressing the fact that level of uncertainty is now greater due to both greater excitation and a higher level of damage ( Figure 14).

High Level of Degradation
The final step consisted of increasing the level of input to a high value (i.e., PGA of 9 m/s 2 ). Again, when keeping the same level of reference damage, all models were able to predict the correct value of the initial stiffness although with a visibly greater variance than before. When the level of damage increased, taking into account a stiffness loss of up to the 80% of the initial one, the same conclusions made in Section 3.3.1 were also confirmed, stressing the fact that level of uncertainty is now greater due to both greater excitation and a higher level of damage ( Figure 14). Appl. Sci. 2020, 10, x FOR PEER REVIEW 17 of 22

Discussion
This realistic case study has shown that a Gaussian discrepancy term, with null mean and unknown variance, is an effective choice to tackle the model inaccuracy rising for low levels of degradation when the aim is inference of the system's initial stiffness. At the same time, it has been proven that this kind of discrepancy still works fine for low levels of damage even if the input is rescaled up to 1 g. Consequently, the discrepancy term adopted can be judged suitable to cover model errors with increasing values of the PGA. However, higher levels of PGA introduce higher uncertainties in the model prediction. It has been shown that increasing the amount of degradation jointly to the level of damage leads to inaccurate predictions for those models that are inconsistent with the original physical BWBN model and more uncertainty also arises in the latter. This can be appreciated, for example, in the velocity response prediction of Model 3 ( Figure 15). This is the model that leads to the worst model prediction for high level of degradation, owing to the presence of a greater number of overestimated outliers.

Discussion
This realistic case study has shown that a Gaussian discrepancy term, with null mean and unknown variance, is an effective choice to tackle the model inaccuracy rising for low levels of degradation when the aim is inference of the system's initial stiffness. At the same time, it has been proven that this kind of discrepancy still works fine for low levels of damage even if the input is rescaled up to 1 g. Consequently, the discrepancy term adopted can be judged suitable to cover model errors with increasing values of the PGA. However, higher levels of PGA introduce higher uncertainties in the model prediction. It has been shown that increasing the amount of degradation jointly to the level of damage leads to inaccurate predictions for those models that are inconsistent with the original physical BWBN model and more uncertainty also arises in the latter. This can be appreciated, for example, in the velocity response prediction of Model 3 ( Figure 15). This is the model that leads to the worst model prediction for high level of degradation, owing to the presence of a greater number of overestimated outliers.

Discussion
This realistic case study has shown that a Gaussian discrepancy term, with null mean and unknown variance, is an effective choice to tackle the model inaccuracy rising for low levels of degradation when the aim is inference of the system's initial stiffness. At the same time, it has been proven that this kind of discrepancy still works fine for low levels of damage even if the input is rescaled up to 1 g. Consequently, the discrepancy term adopted can be judged suitable to cover model errors with increasing values of the PGA. However, higher levels of PGA introduce higher uncertainties in the model prediction. It has been shown that increasing the amount of degradation jointly to the level of damage leads to inaccurate predictions for those models that are inconsistent with the original physical BWBN model and more uncertainty also arises in the latter. This can be appreciated, for example, in the velocity response prediction of Model 3 ( Figure 15). This is the model that leads to the worst model prediction for high level of degradation, owing to the presence of a greater number of overestimated outliers.  What has been determined can be easily read from the graphs in Figure 16. The latter depicts the MSE error over the time of the velocity response prediction for each case. Specifically, Figure 16a-c display the variation of MSE computed for all the post-predictive model runs, for each model and for each level of degradation (low, medium and high). Instead, the comparison between the average MSE computed on all the post-predictive models runs versus the MSE of the MAP estimate prediction at different levels of degradation, is presented in Figure 16d. From the inspection of latter, we can conclude how the error in prediction increases as the level of degradation increases. This leads to more uncertain predictions.
Appl. Sci. 2020, 10, x FOR PEER REVIEW 18 of 22 What has been determined can be easily read from the graphs in Figure 16. The latter depicts the MSE error over the time of the velocity response prediction for each case. Specifically, Figures 16a-c display the variation of MSE computed for all the post-predictive model runs, for each model and for each level of degradation (low, medium and high). Instead, the comparison between the average MSE computed on all the post-predictive models runs versus the MSE of the MAP estimate prediction at different levels of degradation, is presented in Figure 16d. From the inspection of latter, we can conclude how the error in prediction increases as the level of degradation increases. This leads to more uncertain predictions. At last, for those cases inconsistent with the physics model that presents from a medium to high level of damage, it can be concluded that the simple discrepancy function adopted is not sufficient to embody all the model errors. This means that the choice of the discrepancy function should be linked to the level of damage and, in actual practice applications, to the level of excitation.
Thus, to summarize the considerations on the use of Bayesian updating and the uncertainty quantification framework, one can state that the model errors and the observation errors can be taken into account by adding a discrepancy term in the model calibration process. This brings benefits in terms of calibration outcomes (i.e., in addition to good time-series fitting, the order of the parameters is close to the real value). However, for some cases, the Bayesian approach within the uncertainty quantification framework could bring trivial results in terms of model parameter values. This especially occurs if the discrepancy model distribution is not chosen properly. The findings of this paper show that a good way to choose the discrepancy model distribution could be to study potential correlations between it and the statistical nature of the external force (e.g., amplitude and frequency) and the statistical nature of the modal characteristics (e.g., natural frequencies) of a system, evaluated in operational conditions. For example, for external forces with frequency content close to the natural At last, for those cases inconsistent with the physics model that presents from a medium to high level of damage, it can be concluded that the simple discrepancy function adopted is not sufficient to embody all the model errors. This means that the choice of the discrepancy function should be linked to the level of damage and, in actual practice applications, to the level of excitation.
Thus, to summarize the considerations on the use of Bayesian updating and the uncertainty quantification framework, one can state that the model errors and the observation errors can be taken into account by adding a discrepancy term in the model calibration process. This brings benefits in terms of calibration outcomes (i.e., in addition to good time-series fitting, the order of the parameters is close to the real value). However, for some cases, the Bayesian approach within the uncertainty quantification framework could bring trivial results in terms of model parameter values. This especially occurs if the discrepancy model distribution is not chosen properly. The findings of this paper show that a good way to choose the discrepancy model distribution could be to study potential correlations between it and the statistical nature of the external force (e.g., amplitude and frequency) and the statistical nature of the modal characteristics (e.g., natural frequencies) of a system, evaluated in operational conditions. For example, for external forces with frequency content close to the natural frequencies of the system, there is a high chance of high degradation to occur, and thus a discrepancy model distribution close to a Gaussian distribution could bring trivial results. This is clearly valid for a given model, since the model selection framework is another research topic that was not evaluated in this study and is complementary to the actual work.
This finding could stimulate further research on the statistical correlation of discrepancy models and the magnitude of excitation, helping to calibrate mathematical models that try to emulate real word systems and structures.

Conclusions
This article addressed the Bayesian probabilistic calibration of nonlinear hysteretic systems for model-driven seismic SHM purposes. The well-known Bayesian inversion framework was first illustrated through a numerical benchmark of a SDOF Bouc-Wen type oscillator, and then applied to a reference model derived from a calibrated 2-DOFs system of a real monitored masonry building hit by the 2016 Central Italy earthquake. The main novelty of this study is the application of this framework on models representing data measured directly on masonry structures during seismic events. In the examined cases, the model calibration procedures was carried out by incorporating the posterior uncertainty linked to model discrepancy in order to: (i) explicitly define the errors and uncertainties present in the model; (ii) provide the full multivariate distribution of the calibrated parameters; (iii) estimate the model discrepancy posterior distribution; (iv) provide insights on quantities of interest (e.g., maximum a posteriori estimates, time-history response prediction).
Furthermore, the study points to the importance of choosing a discrepancy model function correlated to the underlying seismic characteristics. This consideration can be drawn from the results obtained for the case study associated to Pizzoli Town Hall, for which several parametric and non-parametric degrading models were investigated to gain insight into how model inaccuracy could be embodied by the discrepancy function adopted. Specifically, with reference to non-parametric models, it has been determined that: (i) for low level of damage, and even for high levels of PGA, accurate predictions can be achieved adopting a Gaussian discrepancy term with null mean and unknown variance, which overcomes the low sensitivity of the term used to model the degradation in the response; (ii) for high levels of damage, on the contrary, the simple Gaussian discrepancy function is unable to tackle the model inaccuracy rising.
This means that, when the magnitude of the external excitation is very high (for the system considered), or the system is subjected to resonance, the system is likely to undergo damage (e.g., to high level of degradation) and thus simplified assumptions on the model discrepancy term are not reasonable. For these cases, the use of a fuller and more specific model for the mean of the discrepancy term should be used to get reliable estimates. These results suggest further studies to better relate discrepancy models to the statistical nature of the excitation and other factors that can influence probabilistic model calibration for use in seismic SHM.