Bayesian Updates for an Extreme Value Distribution Model of Bridge Traffic Load Effect Based on SHM Data

Abstract: As the distribution function of traffic load effect on bridge structures has always been unknown or very complicated, the probability model of extreme traffic load effect during service periods cannot yet be reliably predicted by traditional extreme value theory. Here, we focus on this problem and introduce a novel method based on bridge structural health monitoring data. The method was based on the fact that the tails of the probability distribution govern the behavior of extreme values. The generalized Pareto distribution was applied to model the tail distribution of traffic load effect using the peaks-over-threshold method, while the filtered Poisson process was used to model the traffic load effect stochastic process. Once the parameters of the tail distribution were estimated, the parameters of the extreme value distribution of traffic load effect during a service period could be determined by theoretical derivation. Moreover, Bayes' theorem was applied to update the distribution model and reduce the statistical uncertainty. Finally, the proposed method was applied to the monitoring data of the suspenders of a concrete-filled steel tube arch bridge to verify its rationality. The results showed that the approach was convenient and suggested that the extreme value distribution type III might be more suitable as the traffic load effect probability model.


Introduction
Bridge safety has become a public concern after several collapses in recent years [1,2]. Nowadays, increased axle loads and growing traffic density are the main causes of bridge accidents worldwide [3][4][5][6]. Trucks are often illegally overloaded, which leads to higher risks [7]. There is an increasing demand for a systematic and efficient safety assessment of bridges to prevent possible disasters [8][9][10][11].
Among all the load effects that must be determined for a bridge assessment, the most variable effect is that induced by traffic load [12]. The accuracy of bridge safety assessments greatly depends on the accuracy of the traffic load effect probability model which describes the uncertainty of the traffic load effect [13,14]. Therefore, an accurate prediction method for an extreme traffic load effect model is desired, especially for evaluating the safety of existing bridge structures [12,15].
Traditionally, the traffic load effect probability model has been determined based on statistical regularity from conservative design code [13,16]. However, this approach neglects site-specific truck loads that may be substantially different from bridge to bridge and are important for the safety evaluation of an existing bridge [17,18]. In recognition of this, many studies have been conducted to improve the measurement of the site-specific traffic load. One of the most effective methods is the weigh-in-motion (WIM) technology [19][20][21], while others have tried to obtain traffic load effects directly through structural health monitoring (SHM) technology [22][23][24][25].
The WIM and SHM technologies solve the problem of data acquisition for site-specific traffic. It is very difficult, however, to collect enough data within a comparatively short monitoring period to determine the characteristic values of the extreme load effect expected over the design service life [26,27]. Therefore, using limited data to predict the extreme value distribution is a common practice [12].
Many approaches have been put forward for predicting the probability model of extreme traffic load effect based on limited site-specific data, such as the normal probability paper method and the Monte Carlo method [28]. To deal with the statistics of extremes, the extreme value theory (EVT) [29] represents an ideal framework, which has already been used in structural dynamics [11]. The EVT has also been used for modeling extreme traffic load effect for several years. Many studies have tried to solve the problem by identifying the maximum load effect recorded during a loading event, or in a reference period such as one day or one month, and then fitting those maxima to an extreme value distribution [26]. Caprani et al. [30] classified the loading events by the number of trucks involved in the maximal load effect, and then the maxima of each loading event were modeled by the generalized extreme value (GEV) distribution.
In theory, it is not very difficult to predict the probability model for extreme value traffic load effects if they follow some common distribution types or are identically distributed. However, in practice, the distribution function of the traffic load effect is always unknown or follows a very complicated distribution, which makes the accurate estimation of extreme traffic load effect almost impossible [12,14,31]. Caprani et al. [30] highlighted that the traffic load effect is not identically distributed, which violates the assumption of classical EVT that the underlying observations are independent and identically distributed. According to the EVT, the tails of the probability distribution for the traffic load effect govern the behavior of extreme values [32,33]. Noting this, some methods tried to solve the problem by fitting the tail data only, for example, the peaks-over-threshold (POT) method [34] and generalized Pareto distribution (GPD) estimation [35]. Both the POT method and GPD estimation are sensitive to the threshold, and a relatively high threshold increases the statistical uncertainty of the estimated extreme value [36].
As a result, the statistical uncertainty of probability distribution parameters cannot be ignored [33]. Meanwhile, the traffic load conditions will change during the bridge operation period, so the traffic load effect model should also be updated to reflect the real status of traffic. To solve this problem, the Bayesian approach may offer an appealing alternative [4,37]. The Bayesian approach has been utilized to estimate the extreme value of bridges due to various loads. Sheng Xu et al. [36] proposed the Bayesian method to study short-term extreme mooring tensions. Y.Q. Ni et al. applied Bayesian approaches when evaluating wind-resistant performance [38] and in condition assessment [39]. Yiming Gu et al. [14] proposed a novel Bayesian approach for estimating the extreme traffic. Most of the above research does not pay attention to the stochastic process, while some is under the framework of Gaussian processes [11,40]. According to the Unified Standard for Reliability Design of Highway Engineering Structures (GB/T 50283-1999), the traffic loading process conforms to a filtered Poisson process (FPP). The character of the FPP influences which extreme value distribution model of traffic load effect is needed.
The purpose of this study is to introduce a method for predicting the probability model of extreme traffic load effect based on SHM data. To establish the extreme value distribution of traffic load effect, the method uses a generalized Pareto distribution (GPD) to model the tail distribution of traffic load effect based on the POT method, and an FPP to model the loading process, following the EVT for stochastic processes. Meanwhile, since the number of extreme traffic load effect values in the sample is relatively small, especially when the threshold is high [36,39], Bayes' theorem is applied to reduce the statistical uncertainty and update the distribution model.

The GPD Model for Peaks-Over-Threshold Traffic Load Effect
To model the tail distribution, the POT method is applied to treat the POT traffic load effect data.
The POT method is a natural choice for modeling exceedance over a threshold. It has shown its importance and success in a number of statistical analysis areas such as finance, insurance, hydrology, and geographical phenomena domains [14,41,42], and in recent years, it has attracted attention for modeling extreme values of bridge traffic load effect.
Traditionally, the traffic load effect caused by trucks crossing a bridge has been assumed to be a sequence of independent and identically distributed observations, $s_1, s_2, \ldots, s_n$, with the same marginal distribution function, $F_s(\cdot)$. Its extreme value, $s_{\max}$, is then
$$s_{\max} = \max(s_1, s_2, \ldots, s_n) \tag{1}$$
In theory, the distribution of $s_{\max}$ can be derived exactly as Equation (2) if $F_s(\cdot)$ is known:
$$F_{s_{\max}}(s) = P(s_{\max} \le s) = [F_s(s)]^n = F_s^n(s) \tag{2}$$
Most of the time, however, the distribution function, $F_s(\cdot)$, is unknown. In this case, the distribution $F_s^n(s)$ can be approximated by the GEV distribution according to the EVT:
$$F_s^n(s) \approx \exp\left\{-\left[1 + \xi\left(\frac{s - \mu}{\sigma}\right)\right]^{-1/\xi}\right\} \tag{3}$$
where $\mu$, $\sigma$, and $\xi$ are the location, scale, and shape parameters of the GEV distribution, respectively. Although the GEV model can be used to describe the probability model of $s_{\max}$, the problem of how to determine the values of $\mu$, $\sigma$, and $\xi$ remains, since the extreme data $s_{\max}$ are limited most of the time.
The extreme traffic load effect, $s_{\max}$, is the maximum value of $s_1, s_2, \ldots, s_n$, and it is also the maximum value of $s^*$, the traffic load effects in $s_1, s_2, \ldots, s_n$ that exceed the high threshold $u$. The probability distribution function of $s^*$ can be defined by the conditional probability in the POT method, as Equation (4):
$$F_s^*(s^*) = P(s \le s^* \mid s > u) = \frac{F_s(s^*) - F_s(u)}{1 - F_s(u)}, \quad s^* > u \tag{4}$$
According to the EVT, the distribution $F_s^*(s^*)$ belongs to the GPD family if the threshold $u$ is large enough. The GPD, which has been proven to be the limit distribution of scaled excesses over high thresholds, is often used to fit the upper tail of the probability distribution [33]. The cumulative distribution function (CDF) of the GPD is defined as follows:
$$F_s^*(s^*) = 1 - \left[1 + \xi\left(\frac{s^* - \mu}{\sigma}\right)\right]^{-1/\xi} \tag{5}$$
where $\mu$, $\sigma$, and $\xi$ are the location, scale, and shape parameters of the GPD, respectively.
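The relationship between the conditional tail distribution of Equation (4) and the GPD can be checked numerically. The following is a minimal sketch, not the authors' implementation: the exponential parent distribution, its mean of 2.0, the threshold of 5.0, and the function names are all assumptions, chosen only because the tail of an exponential distribution is exactly a GPD with ξ = 0.

```python
import math

def gpd_cdf(x, mu, sigma, xi):
    """CDF of the generalized Pareto distribution (location mu, scale sigma, shape xi)."""
    z = (x - mu) / sigma
    if z <= 0.0:
        return 0.0
    if abs(xi) < 1e-12:              # limiting case xi -> 0: exponential tail
        return 1.0 - math.exp(-z)
    t = 1.0 + xi * z
    if t <= 0.0:                     # beyond the upper end point when xi < 0
        return 1.0
    return 1.0 - t ** (-1.0 / xi)

def conditional_tail_cdf(parent_cdf, s, u):
    """P(S <= s | S > u): the POT conditional distribution of a parent CDF."""
    if s <= u:
        return 0.0
    return (parent_cdf(s) - parent_cdf(u)) / (1.0 - parent_cdf(u))

def exp_cdf(s, mean=2.0):
    """Hypothetical parent distribution (illustrative only)."""
    return 1.0 - math.exp(-s / mean) if s > 0 else 0.0

u = 5.0
for s in (5.5, 7.0, 10.0):
    lhs = conditional_tail_cdf(exp_cdf, s, u)
    rhs = gpd_cdf(s, mu=u, sigma=2.0, xi=0.0)
    print(f"s={s:5.1f}  conditional={lhs:.6f}  GPD={rhs:.6f}")
```

For other parent distributions the agreement is only approximate and improves as the threshold increases, which is why threshold selection matters.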

The Filtered Poisson Process Model for the Traffic Load Effect Stochastic Process
The traffic loading process is a stochastic process which conforms to an FPP. Since the traffic load and its load effect usually have a linear relationship, the traffic load effect process, $S(t)$, can also be reasonably defined as an FPP (Equation (6)):
$$S(t) = \sum_{i=1}^{N(t)} \omega(t, \tau_i, s_i) \tag{6}$$
where $\omega(\cdot)$ is the response function of the process; $\{N(t), t \ge 0\}$ is the number of traffic load effects during period $t$, which follows a Poisson process with parameter $\lambda$; $s_i$ is the $i$-th traffic load effect in order of appearance, an independent and identically distributed random variable with CDF $F_s(s)$; and $\tau_i$ is the time at which $s_i$ appears. Similarly, the stochastic process of the traffic load effect that exceeds a high threshold, $S^*(t)$, can be reasonably defined as an FPP, but with different parameters (Equation (7)):
$$S^*(t) = \sum_{i=1}^{N^*(t)} \omega(t, \tau_i^*, s_i^*) \tag{7}$$
where $s_i^*$ is the $i$-th traffic load effect over the high threshold $u$ in order of appearance, with CDF $F_s^*(s^*)$; $\tau_i^*$ is the time at which $s_i^*$ appears; and $N^*(t)$ is the number of $s_i^*$ during the period $t$, which follows a Poisson process with parameter $\lambda^* = [1 - F_s(u)]\lambda$.
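The thinning property used here, that exceedances of a threshold u in a Poisson stream of rate λ again form a Poisson stream of rate [1 − F_s(u)]λ, can be illustrated by simulation. This is a rough sketch under stated assumptions: the exponential distribution of load effects, the rate, the threshold, and the function name are hypothetical and not taken from the monitoring data.

```python
import math
import random

def simulate_exceedance_counts(lam, u, horizon, n_runs, mean_effect=1.0, seed=0):
    """Simulate marked Poisson arrivals and count the marks above u in each run.

    Arrivals follow a Poisson process with rate lam; each load effect s_i is
    drawn i.i.d. from an exponential distribution (an illustrative assumption).
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_runs):
        t, n_exceed = 0.0, 0
        while True:
            t += rng.expovariate(lam)                 # inter-arrival time
            if t > horizon:
                break
            s = rng.expovariate(1.0 / mean_effect)    # load effect s_i
            if s > u:
                n_exceed += 1
        counts.append(n_exceed)
    return counts

lam, u, horizon = 50.0, 2.0, 10.0
counts = simulate_exceedance_counts(lam, u, horizon, n_runs=400)
empirical_rate = sum(counts) / (len(counts) * horizon)
theoretical_rate = math.exp(-u) * lam                 # [1 - F_s(u)] * lambda
print(f"empirical lambda* = {empirical_rate:.3f}, theoretical = {theoretical_rate:.3f}")
```

The empirical rate of exceedances should be close to the theoretical value, up to sampling noise.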

Probability Model for Extreme Traffic Load Effect
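According to the EVT, the CDF of the traffic load effect EV distribution for an FPP during the service period $T$ is as follows:
$$F_{s_{\max}}(s_{\max}) = \exp\{-\lambda T[1 - F_s(s_{\max})]\} \tag{8}$$
For the same reason, the CDF of the EV distribution of the traffic load effect that exceeds a high threshold, for an FPP during the service period $T$, is as follows:
$$F^*_{s_{\max}}(s_{\max}) = \exp\{-\lambda^* T[1 - F_s^*(s_{\max})]\} \tag{9}$$
With Equations (4), (8), and (9), and $\lambda^* = [1 - F_s(u)]\lambda$, Equation (10) can be deduced as follows:
$$F^*_{s_{\max}}(s_{\max}) = \exp\left\{-[1 - F_s(u)]\lambda T \cdot \frac{1 - F_s(s_{\max})}{1 - F_s(u)}\right\} = \exp\{-\lambda T[1 - F_s(s_{\max})]\} = F_{s_{\max}}(s_{\max}) \tag{10}$$
Equation (10) proves the equivalence between $F^*_{s_{\max}}(s_{\max})$ and $F_{s_{\max}}(s_{\max})$, which means the probability model of extreme traffic load effect can be obtained through the analysis of the traffic load effect that exceeds a high threshold.
As mentioned in Section 2.1, the probability model of the peaks-over-threshold traffic load effect, $F_s^*(s^*)$, follows a GPD. With Equations (5) and (9), the CDF of the EV distribution of the traffic load effect that exceeds a high threshold, for an FPP during the service period $T$, is Equation (11), which has the form of a GEV distribution:
$$F^*_{s_{\max}}(s_{\max}) = \exp\left\{-\lambda^* T\left[1 + \xi\left(\frac{s_{\max} - \mu}{\sigma}\right)\right]^{-1/\xi}\right\} \tag{11}$$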
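To make the GEV form explicit, the sketch below assumes that Equation (11) reads exp{−λ*T[1 + ξ(s − u)/σ]^(−1/ξ)}, which is what substituting a GPD with location u into the Poisson extreme value formula gives. Under that assumption the GEV parameters follow in closed form: μ_g = u + σ[(λ*T)^ξ − 1]/ξ, σ_g = σ(λ*T)^ξ, with the shape ξ unchanged. The function names and the numerical values are hypothetical.

```python
import math

def extreme_cdf_from_gpd(s, u, sigma, xi, lam_star, T):
    """exp(-lam_star*T*[1 + xi*(s-u)/sigma]^(-1/xi)) for xi != 0."""
    t = 1.0 + xi * (s - u) / sigma
    if t <= 0.0:
        return 1.0 if xi < 0 else 0.0
    return math.exp(-lam_star * T * t ** (-1.0 / xi))

def gev_parameters(u, sigma, xi, lam_star, T):
    """Location, scale, and shape of the equivalent standard GEV distribution."""
    big_lambda = lam_star * T            # expected number of exceedances during T
    sigma_g = sigma * big_lambda ** xi
    mu_g = u + sigma * (big_lambda ** xi - 1.0) / xi
    return mu_g, sigma_g, xi

def gev_cdf(s, mu_g, sigma_g, xi):
    """Standard GEV CDF exp(-[1 + xi*(s-mu_g)/sigma_g]^(-1/xi)) for xi != 0."""
    t = 1.0 + xi * (s - mu_g) / sigma_g
    if t <= 0.0:
        return 1.0 if xi < 0 else 0.0
    return math.exp(-t ** (-1.0 / xi))

# Hypothetical values: threshold 0.4, GPD scale 0.1, shape -0.2,
# 5 exceedances per day, and a reference period of 20 years (7300 days).
u, sigma, xi, lam_star, T = 0.4, 0.1, -0.2, 5.0, 7300.0
mu_g, sigma_g, xi_g = gev_parameters(u, sigma, xi, lam_star, T)
for s in (0.70, 0.80, 0.85, 0.89):
    a = extreme_cdf_from_gpd(s, u, sigma, xi, lam_star, T)
    b = gev_cdf(s, mu_g, sigma_g, xi_g)
    print(f"s={s:.2f}  from GPD={a:.6f}  GEV={b:.6f}")
```

The two columns should agree to floating-point precision, since the GEV parameters are only a reparameterization of the same CDF.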

Parameter Estimation Based on Bridge Health Monitoring Data
Equation (11) shows that once the distribution parameters of the traffic load effect that exceeds a high threshold are determined, the probability model of extreme traffic load effect is established.
Threshold selection is an important step in the application of the POT method. An appropriate threshold is critical for estimating the EV distribution [33,43]. To choose an appropriate threshold, the mean excess function (MEF) [44] of the GPD is defined as follows:
$$e(\mu) = E(X - \mu \mid X > \mu) = \frac{\sigma + \xi\mu}{1 - \xi}, \quad \xi < 1 \tag{12}$$
Therefore, $e(\mu)$ is a linear function of $\mu$, and $E(X - \mu \mid X > \mu)$ is simply the mean of the exceedances of the threshold $\mu$. According to Equation (12), the sample estimator of $e(\mu)$ is as follows:
$$e_n(\mu) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu) \tag{13}$$
where $x_1, \ldots, x_n$ are the $n$ observations that exceed $\mu$. The plot of the point set $\{(\mu, e_n(\mu))\}$ is called the mean residual life plot (MRLP). If the GPD provides a valid approximation of the distribution of the exceedances over a threshold $u_0$, the MEF of the samples over $u_0$ produces a straight line, or close to it. Hence, if the threshold $u_0$ is appropriate, the $\mu > u_0$ region of the MRLP is nearly a straight line. After determining the threshold $u_0$, the parameters of the GPD can be estimated by maximum likelihood. Supposing the values $x_1, \ldots, x_k$ are the $k$ exceedances of the threshold $u_0$, for $\xi \ne 0$ the log-likelihood is derived as follows:
$$\ell(\sigma, \xi) = -k\ln\sigma - \left(1 + \frac{1}{\xi}\right)\sum_{i=1}^{k}\ln\left[1 + \xi\left(\frac{x_i - u_0}{\sigma}\right)\right] \tag{14}$$
provided that $1 + \xi(x_i - u_0)/\sigma > 0$ for $i = 1, \ldots, k$. In the case $\xi = 0$, the log-likelihood is obtained as follows:
$$\ell(\sigma) = -k\ln\sigma - \frac{1}{\sigma}\sum_{i=1}^{k}(x_i - u_0) \tag{15}$$
The values $x_1, \ldots, x_k$ are obtained from the bridge health monitoring data. Analytical maximization of the log-likelihood is not possible; therefore, numerical techniques are required. Standard errors and confidence intervals for the GPD parameters are obtained from standard likelihood theory.
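The sample mean excess and the GPD log-likelihood described above can be sketched in a few lines. This is only an illustration under stated assumptions: the data are synthetic exponential values rather than monitoring data, the function names are hypothetical, and a coarse grid search stands in for the numerical optimizer that would be used in practice.

```python
import math
import random

def mean_excess(data, threshold):
    """Sample mean excess: average of (x - threshold) over observations x > threshold."""
    excesses = [x - threshold for x in data if x > threshold]
    if not excesses:
        raise ValueError("no observations exceed the threshold")
    return sum(excesses) / len(excesses)

def gpd_log_likelihood(exceedances, u0, sigma, xi):
    """GPD log-likelihood of exceedances over u0 (-inf outside the support)."""
    if sigma <= 0.0:
        return -math.inf
    k = len(exceedances)
    if abs(xi) < 1e-9:                                   # xi = 0 case
        return -k * math.log(sigma) - sum(x - u0 for x in exceedances) / sigma
    total = 0.0
    for x in exceedances:
        t = 1.0 + xi * (x - u0) / sigma
        if t <= 0.0:
            return -math.inf
        total += math.log(t)
    return -k * math.log(sigma) - (1.0 + 1.0 / xi) * total

# Synthetic data (assumption): unit-mean exponential values, whose mean
# excess is flat in the threshold and whose tail is a GPD with xi = 0.
rng = random.Random(1)
data = [rng.expovariate(1.0) for _ in range(5000)]
for thr in (0.5, 1.0, 1.5, 2.0):
    print(f"threshold={thr:.1f}  mean excess={mean_excess(data, thr):.3f}")

# Coarse grid search over (sigma, xi) for the exceedances of u0 = 1.0.
u0 = 1.0
exc = [x for x in data if x > u0]
grid = [(s / 100.0, x / 100.0) for s in range(60, 141, 5) for x in range(-30, 31, 5)]
sigma_hat, xi_hat = max(grid, key=lambda p: gpd_log_likelihood(exc, u0, p[0], p[1]))
print(f"grid estimate: sigma={sigma_hat:.2f}, xi={xi_hat:.2f}")
```

With real monitoring data, the mean excess values would be plotted against the threshold and inspected for the onset of linearity before the likelihood is maximized.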
According to the bridge health monitoring data, the parameter $\lambda^*$ can be determined as follows:
$$\lambda^* = \frac{n_t}{\Delta t} \tag{16}$$
where $n_t$ is the number of data that exceed the threshold $u_0$ during a monitoring period $\Delta t$, which reflects the traffic volume changes at different sites during different periods of service.
Since the existing bridge structure has been in service for a few years, there are some differences between $T$ for the existing bridge structure and the design reference period $T_N$ for the new bridge structure, which need to be considered. To determine $T$, the equal exceeding probability principle [45] is applied and $T$ can be calculated as follows:
$$T = \frac{M}{N}\,T_N \tag{17}$$
where $M$ represents the remaining design working life for the existing bridge and $N$ represents the design working life.
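The two quantities defined here reduce to simple arithmetic. In the sketch below, the proportional form T = T_N · M / N is an assumption; it is the form that reproduces the values quoted in the application section (N = 30 years, M = 6 years, T_N = 100 years, T = 20 years). The function names and the exceedance count are hypothetical.

```python
def exceedance_rate(n_exceedances, monitoring_period):
    """lambda* = n_t / delta_t: threshold exceedances per unit of monitoring time."""
    if monitoring_period <= 0:
        raise ValueError("monitoring period must be positive")
    return n_exceedances / monitoring_period

def reference_period(remaining_life, design_life, design_reference_period):
    """T = T_N * M / N (assumed form of the equal exceeding probability rule)."""
    if design_life <= 0:
        raise ValueError("design working life must be positive")
    return design_reference_period * remaining_life / design_life

# Hypothetical count: 150 exceedances observed over 30 days of monitoring.
lam_star = exceedance_rate(150, 30.0)

# Values quoted in the paper for the suspenders: N = 30, M = 6, T_N = 100 (years).
T = reference_period(remaining_life=6, design_life=30, design_reference_period=100)
print(f"lambda* = {lam_star} per day, T = {T} years")
```

Note that λ* and T must be expressed in the same time unit before their product is used in the extreme value distribution.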

Bayesian Updates for the Extreme Traffic Load Effect Model
The number of traffic load effect extreme value samples is relatively small, especially when the threshold is high. As a result, the statistical uncertainty of probability distribution parameters cannot be ignored. Meanwhile, the traffic load conditions will change during the bridge operation period, meaning the traffic load effect model should also be updated to reflect the real status of traffic. In this paper, a Bayesian approach is employed to update the extreme traffic load effect based on monitoring data.
In general, Bayes' theorem can be expressed in terms of probability density functions (PDFs) as follows:
$$\pi(\theta \mid x) = \frac{L(x \mid \theta)\,\pi(\theta)}{\int_{\Theta} L(x \mid \theta)\,\pi(\theta)\,d\theta} \tag{18}$$
where π(θ) is the prior distribution of the model parameters θ (e.g., σ and ξ in the GPD) and Θ is the parameter space. The prior π(θ) represents the statistical uncertainty, which can be decreased as additional traffic load effect monitoring data x are gathered; π(θ|x) is the posterior distribution of θ and represents the updated state of θ; and L(x|θ) is the likelihood function. The marginal likelihood of x can be expressed as follows:
$$m(x) = \int_{\Theta} L(x \mid \theta)\,\pi(\theta)\,d\theta \tag{19}$$
Since $m(x)$ does not depend on θ, the expression for the posterior distribution π(θ|x) is shortened as follows:
$$\pi(\theta \mid x) \propto L(x \mid \theta)\,\pi(\theta) \tag{20}$$
On the basis of the above Bayes' theorem, the Bayesian updating procedure for the extreme traffic load effect model is as follows:

1. Evaluate π(θ). Since the GEV distribution parameters can be calculated from the GPD parameters (see Equation (11)), only σ and ξ need to be chosen as the model parameters θ. At the beginning, even if no monitoring data are available, several choices can be used for the original π(θ), such as the Jeffreys prior or flat priors. When new monitoring data x are gathered, the previous posterior distribution π(θ|x) becomes the new prior.

2. Evaluate π(θ|x). It is very difficult to evaluate π(θ|x) by direct numerical integration, but the Markov Chain Monte Carlo (MCMC) algorithm provides a good solution to this problem. The basic idea of the MCMC algorithm is to simulate a Markov chain whose stationary distribution is π(θ|x). Samples of π(θ|x) can then be drawn from the stationary distribution.

3. Model update. The posterior distribution, π(θ|x), contains the information from the new data x and also the prior information from π(θ). Therefore, the mean values of π(θ|x) are chosen as the updated model parameters θ.
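The three steps above can be sketched with a random-walk Metropolis sampler, one of the simplest MCMC algorithms. This is an illustrative sketch, not the authors' implementation: the paper does not state which MCMC variant was used, and the normal prior hyperparameters, step sizes, simulated data, and function names below are all assumptions.

```python
import math
import random

def gpd_loglik(excesses, sigma, xi):
    """Log-likelihood of GPD excesses (threshold already subtracted)."""
    if sigma <= 0.0:
        return -math.inf
    if abs(xi) < 1e-9:
        return -len(excesses) * math.log(sigma) - sum(excesses) / sigma
    total = 0.0
    for y in excesses:
        t = 1.0 + xi * y / sigma
        if t <= 0.0:
            return -math.inf
        total += math.log(t)
    return -len(excesses) * math.log(sigma) - (1.0 + 1.0 / xi) * total

def normal_logpdf(x, mean, sd):
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2.0 * math.pi))

def metropolis_gpd(excesses, prior_sigma, prior_xi, n_iter=4000, burn_in=1000,
                   steps=(0.05, 0.03), seed=0):
    """Random-walk Metropolis sampler for (sigma, xi) with independent normal priors."""
    rng = random.Random(seed)

    def log_post(sigma, xi):
        ll = gpd_loglik(excesses, sigma, xi)
        if ll == -math.inf:
            return -math.inf
        return ll + normal_logpdf(sigma, *prior_sigma) + normal_logpdf(xi, *prior_xi)

    sigma, xi = prior_sigma[0], prior_xi[0]          # step 1: start at the prior means
    current = log_post(sigma, xi)
    if current == -math.inf:
        raise ValueError("prior means lie outside the support implied by the data")
    samples = []
    for i in range(n_iter):                          # step 2: simulate the chain
        cand_sigma = sigma + rng.gauss(0.0, steps[0])
        cand_xi = xi + rng.gauss(0.0, steps[1])
        cand = log_post(cand_sigma, cand_xi)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, cand - current)):
            sigma, xi, current = cand_sigma, cand_xi, cand
        if i >= burn_in:
            samples.append((sigma, xi))
    return samples

def mean_and_variance(values):
    m = sum(values) / len(values)
    return m, sum((v - m) ** 2 for v in values) / (len(values) - 1)

# Simulated "new monitoring data" from a hypothetical GPD (sigma = 1.0, xi = -0.2).
rng = random.Random(42)
true_sigma, true_xi = 1.0, -0.2
new_data = [true_sigma / true_xi * ((1.0 - rng.random()) ** (-true_xi) - 1.0)
            for _ in range(300)]

prior_sigma, prior_xi = (1.0, 0.3), (-0.1, 0.2)      # hypothetical (mean, sd) pairs
samples = metropolis_gpd(new_data, prior_sigma, prior_xi)
post_sigma = mean_and_variance([s for s, _ in samples])   # step 3: posterior means
post_xi = mean_and_variance([x for _, x in samples])
print(f"sigma: mean={post_sigma[0]:.3f}, var={post_sigma[1]:.5f} (prior var {prior_sigma[1]**2:.3f})")
print(f"xi:    mean={post_xi[0]:.3f}, var={post_xi[1]:.5f} (prior var {prior_xi[1]**2:.3f})")
```

The posterior variances printed at the end can be compared with the prior variances to see the reduction in statistical uncertainty. In practice, the chain length, burn-in, and step sizes would need convergence checks.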

Application to a CFST Arch Bridge
In recent years, several collapses of through and half-through arch bridges have occurred in China. Most of those accidents were caused by broken suspenders. The increasing traffic load was a primary cause of the suspender breakage, so the analysis of traffic load effect probabilistic models of suspenders has become a research focus [46].
The E'Bian bridge, as shown in Figure 1, is a half-through concrete-filled steel tube (CFST) arch bridge located in China. The main span is 138 m and the width is 13 m. The arch rib is made of CFST with 50 suspenders connected to the concrete deck. A vehicle load effect monitoring system was implemented on suspenders on this bridge, as shown in Figure 1. The monitoring system can collect the traffic load effect data through the fiber sensor installed on the steel wires of the suspenders (Figure 1).

Figure 1. Traffic load effect monitoring system: (a) sensor placement (fiber sensors on suspenders S01 to S25, upriver side); (b) monitoring system components.
Figure 2 illustrates the distribution of traffic load effect on four suspenders (S10, S13, S16, and S22), expressed as $K_{SQ} = S_Q/S_K$, where $S_Q$ is the real stress on the suspender steel wires and $S_K$ is the normal value of the traffic load effect calculated by the design code. It can be observed in Figure 2 that the Weibull distribution, normal distribution, log-normal distribution, and gamma distribution, which are recommended in the Unified Standard for Reliability Design of Highway Engineering Structures (GB/T 50283-1999) as the traffic load effect model, deviate from the empirical distribution because of its multi-peak shape.
As the probability models recommended by the design code do not fit the monitoring data well, a tail-based model is employed. First, a threshold is determined by inspecting the MRLP, as shown in Figure 3. The threshold for each suspender is chosen at the point where the MRLP becomes almost a straight line. Then, based on the standard likelihood theory, the tail region data can be fitted by the GPD (Figure 4), which is accepted by the K-S test.
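The Kolmogorov-Smirnov (K-S) check of a fitted tail model can be sketched as follows. This is a minimal illustration under stated assumptions: the tail sample is simulated from a hypothetical GPD instead of being taken from the suspender data, and the asymptotic 5% critical value 1.358/√n strictly applies to a fully specified CDF, so it is only approximate when the parameters are estimated from the same data.

```python
import math
import random

def ks_statistic(sample, cdf):
    """K-S statistic D_n = sup |F_n(x) - F(x)| against a given CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def gpd_cdf(x, u, sigma, xi):
    """GPD CDF for xi != 0 with threshold (location) u."""
    if x <= u:
        return 0.0
    t = 1.0 + xi * (x - u) / sigma
    return 1.0 if t <= 0.0 else 1.0 - t ** (-1.0 / xi)

# Hypothetical fitted tail model and a simulated tail sample (inverse-CDF sampling).
rng = random.Random(7)
u, sigma, xi = 0.4, 0.1, -0.2
tail = [u + sigma / xi * ((1.0 - rng.random()) ** (-xi) - 1.0) for _ in range(500)]

d_n = ks_statistic(tail, lambda x: gpd_cdf(x, u, sigma, xi))
critical = 1.358 / math.sqrt(len(tail))       # asymptotic 5% critical value
print(f"D_n = {d_n:.4f}, critical value = {critical:.4f}, accepted = {d_n < critical}")
```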
[Figures 2 to 4, panels: (a) S10; (b) S13; (c) S16; (d) S22.]
The priors for the GPD parameters σ and ξ are assumed to follow normal distributions in this study, $\sigma \sim N(\mu_\sigma, \sigma_\sigma^2)$ and $\xi \sim N(\mu_\xi, \sigma_\xi^2)$, and the hyperparameters $\mu_\sigma$, $\sigma_\sigma$ and $\mu_\xi$, $\sigma_\xi$ are generated from the prior monitoring data. When new monitoring data are obtained, the MCMC algorithm is used to evaluate π(θ|x) based on the new monitoring data x. Then, the samples of π(θ|x) are generated and the model parameters of π(θ|x) are found by likelihood estimation. In this study, the new monitoring data are simulated based on the prior GPD model to show the effect of updating. The results (see Figure 5) clearly show that the variance of θ is reduced after updating.

[Figure panels: (a) and (b) for S10; (c) and (d) for S13; (e) and (f) for S16.]
For a suspender, if the design working life, $N$, is 30 years, the remaining design working life, $M$, is 6 years, and the design reference period, $T_N$, is 100 years, then according to Equation (17), the reference period, $T$, for the suspenders is 20 years. The probability model of extreme traffic load effect for the suspenders, and the mean and the standard deviation (S.D.) of $x_{up}$ based on the above parameters, are shown in Table 1.

Discussion
There is an interesting result that should be discussed. The parameter ξ in the GPD and GEV is negative, which means the extreme traffic load effect probability model follows the extreme value distribution type III (Weibull family). Bailey and Bez [18] reached a similar conclusion from the simulated static effects of traffic, but more proof is required. According to the EVT, if ξ < 0, the GPD and GEV have an upper bound of the distribution equal to the following:
$$x_{up} = \mu - \frac{\sigma}{\xi}$$
where µ and σ are the location and scale parameters of the respective distribution. According to the Unified Standard for Reliability Design of Highway Engineering Structures (GB/T 50283-1999), the extreme traffic load effect probability model is usually assumed to be a normal distribution or extreme value distribution type I, which does not have an upper bound. In general, the traffic load effect cannot be unlimited if there is no damage where the monitoring sensor is placed. This means the extreme value distribution type III might be more reasonable as the probability model for the extreme traffic load effect, compared with the conventional models.
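The upper bound of a negative-shape model, and its mean and standard deviation over posterior draws of the parameters, can be computed as in the sketch below. The bound is taken as µ − σ/ξ, the standard upper end point of a GPD or GEV with ξ < 0. The threshold value and the posterior draws are hypothetical placeholders, not values from Table 1.

```python
import math
import statistics

def upper_bound(mu, sigma, xi):
    """Upper end point mu - sigma/xi for xi < 0; unbounded otherwise."""
    if xi >= 0.0:
        return math.inf
    return mu - sigma / xi

# Hypothetical posterior draws of (sigma, xi) for one suspender, threshold mu = 0.4.
mu = 0.4
draws = [(0.10, -0.20), (0.11, -0.22), (0.09, -0.18), (0.10, -0.25), (0.12, -0.24)]
bounds = [upper_bound(mu, s, x) for s, x in draws]

print(f"mean of x_up = {statistics.mean(bounds):.3f}")
print(f"S.D. of x_up = {statistics.stdev(bounds):.3f}")
```

A type I (Gumbel) or normal model corresponds to the unbounded branch, which is the contrast drawn in the discussion above.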

Conclusions
In this study, a Bayesian updating approach for predicting the extreme value distribution of traffic load effect using monitoring data is proposed based on the EVT for stochastic processes. The following conclusions can be made:
(I) The GPD is suitable for modeling the tail distribution of traffic load effect data. This addresses the problem that the actual traffic load effect sometimes follows uncommon types of probability distribution, for which conventional methods cannot accurately predict the probability model of extreme traffic load effect from bridge structural health monitoring data.
(II) The extreme traffic load effect excess for an FPP during a service period is proven to follow the GEV distribution, when the probability model of peak-over-threshold traffic load effect follows the GPD.
(III) Bayesian updates can reduce the variance of distribution model parameters θ.
(IV) The improved method can consider the site-specific features of traffic load and also the differences between T for an existing bridge structure and the design reference period, T N , for a new bridge structure.
(V) The application results show that the extreme value distribution of traffic load effect follows the extreme value distribution type III (Weibull families), which has an upper bound and might be more reasonable than the conventional model.
(VI) Threshold selection is an important step in the application of the new method. An appropriate threshold is critical for estimating the EV distribution. Although the mean residual life plot presents a typical graphical method, it is not accurate enough. How to estimate a proper threshold should be addressed in future studies.
(VII) Other than the traditional standard likelihood theory, many methods such as the differential evolution algorithm [47] can be used to estimate the parameters of the GPD based on monitoring data, and can be studied in the future.
Funding: This research was funded by the National Natural Science Foundation of China, grant number 51208224.