Analysis of Structural Health Monitoring Data with Correlated Measurement Error by Bayesian System Identification: Theory and Application

Measurement error is non-negligible and plays a crucial role in SHM data analysis. In many applications of SHM, measurement errors in data from sensor networks are statistically correlated in space and/or in time. Existing works consider only spatial correlation for the measurement error. When spatial and temporal correlation must be considered simultaneously, the existing approaches break down, because they lack a suitable form for describing spatially and temporally correlated measurement error. To address this limitation, this paper generalizes the form of correlated measurement error from spatial correlation only or temporal correlation only to spatial-temporal correlation. A new form of spatial-temporal correlation and the corresponding likelihood function are proposed, and multiple candidate model classes for the measurement error are constructed, including no correlation, spatial correlation, temporal correlation, and the proposed spatial-temporal correlation. Bayesian system identification is conducted to obtain not only the posterior probability density function (PDF) of the model parameters, but also the posterior probability of each candidate model class, so that the most suitable/plausible model class for the measurement error can be selected. Examples are presented with applications to model updating and modal frequency prediction under varying environmental conditions, demonstrating the necessity of considering correlated measurement error and the capability of the proposed Bayesian system identification for uncertainty quantification at the parameter and model levels.


Introduction
Structural health monitoring (SHM), which uses measured data to infer the health status of the monitored structure, has received tremendous attention over the last decades [1][2][3][4][5][6][7][8][9][10]. For a particular SHM problem, the model output reflects the corresponding parameterization, and the measurement error is the discrepancy between the measured noisy output and the model output. Because measurement error is non-negligible in SHM data analysis, it is crucial to select a suitable form for the measurement error and to include it in the system identification.
Deterministic inference methods typically construct the objective function as a generalized least-squares criterion under an assumed covariance matrix of the measurement errors. Although the assumption of uncorrelated measurement error is widely adopted [11], it has been found that inference can be improved in some circumstances by relaxing this assumption [12,13]. Probabilistic inference methods construct the objective function as the likelihood function in the frequentist approach or as a posterior probability density function (PDF) in the Bayesian approach [14][15][16][17]. By Bayes' theorem, the posterior PDF is proportional to the product of the prior PDF and the likelihood function, and it accounts for the uncertainty both in the prior knowledge and in the measurements. The construction of the likelihood requires selecting a probability model for the measurement error. The joint probability distribution of the measurement error vector is determined by the marginal probability distribution and the correlation function. The marginal probability distribution prescribes that each measurement error follows a univariate Gaussian distribution, according to the Principle of Maximum Entropy [18,19]. The correlation function characterizes the statistical correlation between any two distinct measurement errors.
In many applications of SHM, measurement errors are statistically correlated in space (especially for dense sensor grids), time (especially for high sampling frequencies), or both in space and time, for data from sensor networks [20]. For the purpose of considering the statistical correlation of measurement errors, different correlation models have been considered. For instance, McFarland and Mahadevan [12] chose temporally exponential correlation for model calibration of thermal problems. Cheung et al. [21] considered spatially exponential correlation for turbulence modeling. Papadimitriou and Lombaert [22] investigated the effect of spatially exponential correlation for optimal sensor placement. Simoen et al. [23] introduced different functions for spatial correlation, e.g., an exponential correlation function, a spherical correlation function, and an exponentially damped cosine correlation function. Mu and Yuen [24] considered identical correlation in sparse Bayesian learning for risk pattern recognition.
Based on a prescribed set of candidate model classes for the correlated measurement error, Bayesian model class selection can be performed. The most suitable/plausible model class of measurement error is the one possessing the maximum posterior probability. Recently, Bayesian model class selection has been studied and developed for structural health monitoring [25][26][27][28][29], structural damage detection [30][31][32][33], and risk assessment [24,34,35]. In particular, Simoen et al. [23] considered only spatial correlation for the measurement error and performed Bayesian model class selection to select the most suitable/plausible model class of spatially correlated measurement error in model updating. When temporal correlation alone is considered, the existing works [12,21-23] must be extended, not only by introducing a suitable form describing temporally correlated measurement error, but also by deriving the corresponding inferences. Furthermore, when spatial and temporal correlation are considered simultaneously, i.e., spatial-temporal correlation, the existing works [12,21-23] break down, because they lack a suitable form for describing spatially and temporally correlated measurement error. To address this limitation, this paper generalizes the form of correlated measurement error from spatial correlation only or temporal correlation only to spatial-temporal correlation. A new form of spatial-temporal correlation and the corresponding likelihood function are proposed, and multiple candidate model classes for the measurement error are constructed, including no correlation, spatial correlation, temporal correlation, and the proposed spatial-temporal correlation.
Bayesian system identification is conducted to achieve not only the posterior probability density function (PDF) for the model parameters, but also the posterior probability of each candidate model class for selecting the most suitable/plausible model class for the measurement error.
In Section 2, the theory is presented, including candidate model classes of data with correlated measurement error, posterior probability density function for uncertain model parameters, and posterior probability for model class selection. In Section 3, examples are presented with applications for model updating and modal frequency prediction under varying environmental conditions.

Candidate Model Classes of Data with Correlated Measurement Error
Consider a structure monitored by a sensor network with $N_o$ observations at each time step. Let $Q_n(b|S) \in \mathbb{R}^{N_o}$ denote the model output at the $n$-th time step, which is parameterized by a structural model $S$ with the unknown structural parameter vector $b$. For an SHM problem, the model output reflects the corresponding parameterization. For example, in structural model updating, $S$ can represent a particular structural model with unknown stiffness and damping parameters $b$. As the measured data are always subject to measurement error, the measured noisy output at the $n$-th time step, $Y_n \in \mathbb{R}^{N_o}$, is:
$$Y_n = Q_n(b|S) + \varepsilon_n(\tau|E) \quad (1)$$
where $\varepsilon_n(\tau|E) \in \mathbb{R}^{N_o}$ is the uncertain measurement error at the $n$-th time step, parameterized by a measurement error model $E$ with the unknown measurement error parameter vector $\tau$. By collecting data up to $N_T$ sampling time steps, the measured noisy output matrix is $Y = [Y_1, \ldots, Y_{N_T}] \in \mathbb{R}^{N_o \times N_T}$, and the measurement error matrix $\varepsilon$ is assembled likewise.
In SHM, the selection of a mathematical model of correlated measurement error $E$ directly affects the inference for the structural model $S$ and the unknown structural parameters $b$. The joint probability distribution of $\varepsilon$ is determined by two assumptions: the marginal probability distribution of $\varepsilon_{ij}$ and the correlation function. According to the Principle of Maximum Entropy [18,19], the marginal probability distribution of $\varepsilon_{ij}$ is a univariate Gaussian distribution with zero mean and unknown variance $\tau_0$, that is, $\varepsilon_{ij} \sim N(\varepsilon_{ij}|0, \tau_0)$. First, the zero-mean assumption is valid because an uncertain bias can be added to the model output as another uncertain parameter. Second, the variance of the measurement error is an uncertain parameter to be identified. The homogeneous variance model assumes that the variances of the distinct measurement errors are identical, while the heterogeneous variance model assumes that they are different. Further discussion on homogeneous and heterogeneous variance models can be found in [35,36].
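The correlation function characterizes the statistical correlation between any two distinct measurement errors ($\varepsilon_{ij}$ and $\varepsilon_{i'j'}$). The following correlation models are introduced: (1) uncorrelated model (denoted as $E_1$); (2) identical correlation model [24] (denoted as $E_2$); (3) exponential correlation model [23] (denoted as $E_3$). Let $\rho^{E_k}_{ij}$ denote the correlation function of two variables $i$ and $j$ conditional on $E_k$, described as follows:
$$\rho^{E_1}_{ij} = \delta_{ij} \quad (2)$$
$$\rho^{E_2}_{ij} = \delta_{ij} + (1 - \delta_{ij})\,\tau^{E_2}_{\gamma} \quad (3)$$
$$\rho^{E_3}_{ij} = \exp\left(-\Delta_{ij}/\tau^{E_3}_{\lambda}\right) \quad (4)$$
where $\delta_{ij}$ is the Kronecker delta, $\Delta_{ij} = |i - j|$ represents the distance between measurements $i$ and $j$, and $\tau^{E_2}_{\gamma}$ and $\tau^{E_3}_{\lambda}$ are the correlation parameters of $E_2$ and $E_3$, respectively. Based on $\rho^{E_k}_{ij}$, the spatial and temporal correlations can be constructed. Let $\mathbf{E}_{spat} = \{E_{spat,m} = E_m,\ m = 1, \ldots\}$ and $\mathbf{E}_{temp} = \{E_{temp,n} = E_n,\ n = 1, \ldots\}$ denote the candidate model classes for the correlated measurement error of spatial correlation and temporal correlation, respectively. Accordingly, introduce the spatial correlation matrix $L^{E_{spat,m}} \in \mathbb{R}^{N_o \times N_o}$ and the temporal correlation matrix $L^{E_{temp,n}} \in \mathbb{R}^{N_T \times N_T}$ as follows:
$$L^{E_{spat,m}}_{ij} = \rho^{E_m}_{ij}, \quad i, j = 1, \ldots, N_o, \quad m = 1, 2, 3 \quad (5)$$
$$L^{E_{temp,n}}_{ij} = \rho^{E_n}_{ij}, \quad i, j = 1, \ldots, N_T, \quad n = 1, 2, 3 \quad (6)$$
It is worth noting that the existing works [12,21-23] are capable of considering either the spatial correlation of Equation (5) only or the temporal correlation of Equation (6) only, and they do not possess a suitable form describing spatially and temporally correlated measurement error. Here, a new form of spatial-temporal correlation is proposed. Let $\rho_{ij,i'j'}$, $\rho^{E_n}_{ii'}$, and $\rho^{E_m}_{jj'}$ denote the spatial-temporal correlation of $\varepsilon_{ij}$ and $\varepsilon_{i'j'}$, the temporal correlation between $\varepsilon_{ij}$ and $\varepsilon_{i'j}$, and the spatial correlation between $\varepsilon_{i'j}$ and $\varepsilon_{i'j'}$, respectively. Based on the property of correlation, the discrepancy between the spatial-temporal correlation and the product $\rho^{E_n}_{ii'}\rho^{E_m}_{jj'}$ is bounded as follows:
$$\left|\rho_{ij,i'j'} - \rho^{E_n}_{ii'}\rho^{E_m}_{jj'}\right| \le \sqrt{\left(1 - \left(\rho^{E_n}_{ii'}\right)^2\right)\left(1 - \left(\rho^{E_m}_{jj'}\right)^2\right)}$$
When $\rho^{E_n}_{ii'}$ and $\rho^{E_m}_{jj'}$ are close to 1 (high correlation in time and space), the following approximation for the spatial-temporal correlation holds:
$$\rho_{ij,i'j'} \approx \rho^{E_n}_{ii'}\rho^{E_m}_{jj'}$$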
Thus, the covariance matrix of the spatial-temporal correlated measurement error vector $\mathrm{vec}(\varepsilon)$ can be expressed as:
$$\Sigma^{E_{spat,m}E_{temp,n}} = \tau_0\, L^{E_{temp,n}} \otimes L^{E_{spat,m}} \quad (7)$$
where $\mathrm{vec}(\cdot)$ is vectorization and $\otimes$ is the Kronecker product. When $L^{E_{temp,n}}$ or $L^{E_{spat,m}}$ is set to an identity matrix, the proposed spatial-temporal correlation in Equation (7) degenerates to spatial correlation only or temporal correlation only, respectively. That is, the spatial-only or temporal-only correlation considered in the existing works [12,21-23] is a special case of the spatial-temporal correlation proposed in this paper.
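As a minimal sketch of the construction described above, the following standard-library Python snippet builds small correlation matrices for the three correlation models and combines them into a spatial-temporal covariance matrix through the Kronecker product. The function names, the matrix sizes, the parameter values, and the specific exponential form exp(-distance/length) are assumptions made for illustration, not the paper's implementation.

```python
import math

def correlation_matrix(n, model, param=None):
    """Return an n-by-n correlation matrix as a list of lists.

    model: "E1" (uncorrelated), "E2" (identical), "E3" (exponential).
    param: constant off-diagonal correlation for E2; correlation length
           for E3 (the exp(-distance/length) form is assumed here).
    """
    def rho(i, j):
        if i == j:
            return 1.0
        if model == "E1":
            return 0.0
        if model == "E2":
            return float(param)
        if model == "E3":
            return math.exp(-abs(i - j) / param)
        raise ValueError(f"unknown model: {model}")
    return [[rho(i, j) for j in range(n)] for i in range(n)]

def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    return [[a_ij * b_kl for a_ij in row_a for b_kl in row_b]
            for row_a in a for row_b in b]

def covariance(tau0, l_temp, l_spat):
    """Covariance of the stacked error vector: tau0 * (L_temp kron L_spat)."""
    return [[tau0 * v for v in row] for row in kron(l_temp, l_spat)]

# Hypothetical setup: 2 sensors, 3 time steps, variance tau0 = 0.04.
l_spat = correlation_matrix(2, "E2", 0.5)   # identical correlation in space
l_temp = correlation_matrix(3, "E3", 2.0)   # exponential correlation in time
sigma = covariance(0.04, l_temp, l_spat)    # 6-by-6 covariance matrix

# Setting the temporal matrix to identity leaves spatial correlation only.
sigma_spatial_only = covariance(0.04, correlation_matrix(3, "E1"), l_spat)
```

In this sketch, each entry of `sigma` is the product of one temporal and one spatial correlation scaled by the variance, and `sigma_spatial_only` is block diagonal, which mirrors the degenerate case obtained when the temporal correlation matrix is the identity.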
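The universal set of candidate model classes for the correlated measurement error is $\mathbf{E} = \mathbf{E}_{spat} \times \mathbf{E}_{temp}$, and the corresponding unknown measurement error parameter vector $\tau$ collects the variance $\tau_0$ together with the parameters of the selected spatial and temporal correlation models. Finally, the universal set of candidate model classes for the system and correlated measurement error is $\mathbf{M} = \mathbf{S} \times \mathbf{E} = \{M_k,\ k = 1, 2, \ldots\}$. As the joint probability model of the measured noisy output matrix $Y$ is an $N_oN_T$-dimensional normal distribution $N\left(\mathrm{vec}(Y)\,|\,\mathrm{vec}(Q(X, b|S_i)),\ \tau_0 L^{E_{temp,n}} \otimes L^{E_{spat,m}}\right)$, the proposed likelihood function of the SHM dataset $D$ with spatially and temporally correlated measurement error, which is conditional on $M_k$ with its associated parameter vector $\theta = [b^T, \tau^T]^T$, can be expressed as:
$$p(D|\theta, M_k) = (2\pi\tau_0)^{-\frac{N_oN_T}{2}} \left|L^{E_{temp,n}}\right|^{-\frac{N_o}{2}} \left|L^{E_{spat,m}}\right|^{-\frac{N_T}{2}} \exp\left\{-\frac{1}{2\tau_0}\mathrm{tr}\left[\left(L^{E_{temp,n}}\right)^{-1}(Y - Q)^T\left(L^{E_{spat,m}}\right)^{-1}(Y - Q)\right]\right\} \quad (9)$$
where $Q = Q(X, b|S_i)$ and tr is the trace of the matrix. When $L^{E_{temp,n}}$ or $L^{E_{spat,m}}$ is set to an identity matrix, the proposed likelihood function for spatial-temporal correlation in Equation (9) degenerates to the traditional likelihood function for spatial correlation only or temporal correlation only in the existing works [12,21-23]. Based on the proposed likelihood function, Bayesian system identification is rederived in the following parts.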
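The Kronecker structure of the covariance is what keeps this likelihood tractable. As a sketch of the reasoning, write $R = Y - Q$ for the $N_o \times N_T$ residual matrix, $A = L^{E_{temp,n}}$, and $B = L^{E_{spat,m}}$ (shorthand introduced here for brevity, not notation from the paper). Standard Kronecker-product identities for symmetric positive-definite $A$ and $B$ then give:

```latex
\begin{aligned}
\left|\tau_0\, A \otimes B\right| &= \tau_0^{N_o N_T}\, |A|^{N_o}\, |B|^{N_T}, \\
(A \otimes B)^{-1} &= A^{-1} \otimes B^{-1}, \\
\operatorname{vec}(R)^{T} (A \otimes B)^{-1} \operatorname{vec}(R)
  &= \operatorname{vec}(R)^{T} \operatorname{vec}\!\left(B^{-1} R A^{-1}\right)
   = \operatorname{tr}\!\left(A^{-1} R^{T} B^{-1} R\right).
\end{aligned}
```

Substituting these into the $N_oN_T$-dimensional normal density yields a likelihood expressed through a trace, so only the $N_T \times N_T$ temporal matrix and the $N_o \times N_o$ spatial matrix need to be factorized, never the full $N_oN_T \times N_oN_T$ covariance.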

Posterior Probability Density Function for Uncertain Model Parameters
According to Bayes' theorem, the posterior PDF is [14]:
$$p(\theta|D, M_k) = \frac{p(D|\theta, M_k)\, p(\theta|M_k)}{p(D|M_k)} \quad (10)$$
where $p(\theta|M_k)$ is the prior PDF, reflecting uncertainty introduced by modeling error at the parameter level based on the prior information. The form of the prior PDF can be determined using the Principle of Maximum Entropy [18] or the conjugate distribution. According to the Principle of Maximum Entropy, $N(\theta_i|\mu, \nu^2)$ (normally distributed prior PDF with mean $\mu$ and variance $\nu^2$) and $\mathrm{Gamma}(\theta_i|\alpha, \beta)$ (Gamma-distributed prior PDF with shape factor $\alpha$ and scale factor $\beta$) are assigned to $\theta_i \in (-\infty, +\infty)$ and $\theta_i \in (0, +\infty)$, respectively. In the conjugate distribution, $\tau_0$ follows $\mathrm{Inv\text{-}Gamma}(\tau_0|\alpha', \beta')$ (Inverse Gamma-distributed prior PDF with shape factor $\alpha'$ and scale factor $\beta'$). A rational way of determining the hyperparameters of the prior PDF is as follows. First, according to the previous information, prescribe the Maximum A Priori (MAPr) value and the coefficient of variation (COV). Then, determine the hyperparameters based on the MAPr and COV. For $N(\theta_i|\mu, \nu^2)$, $\mu = \mathrm{MAPr}$ and $\nu = \mathrm{MAPr} \cdot \mathrm{COV}$; for the Gamma and Inverse Gamma priors, the hyperparameters are likewise determined from the MAPr and COV.
The posterior PDF $p(\theta|D, M_k)$ is dominated by $p(D|\theta, M_k)$, given that the amount of data is large and $p(\theta|M_k)$ is relatively flat. Denote $L(\theta|D, M_k) = -\ln p(\theta|D, M_k)$. The Maximum A Posteriori (MAP) estimate is [16]:
$$\hat{\theta} = \arg\min_{\theta} L(\theta|D, M_k) \quad (11)$$
The posterior covariance matrix of the parameters, $\Sigma_\theta$, is related to the local curvature of $L(\theta|D, M_k)$, which can be described by the Hessian matrix $H(L(\theta|D, M_k))$ as follows:
$$\Sigma_\theta = H\left(L(\hat{\theta}|D, M_k)\right)^{-1} \quad (12)$$
In the globally identifiable case [4], the posterior PDF $p(\theta|D, M_k)$ can be well approximated by the multivariate normal distribution $N\left(\theta\,|\,\hat{\theta},\ H(L(\hat{\theta}|D, M_k))^{-1}\right)$.
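The step from a prescribed MAPr and COV to prior hyperparameters can be sketched in a few lines of Python. The normal-prior rule follows the text ($\mu$ = MAPr, $\nu$ = MAPr·COV). The Gamma rule below is an assumption of this sketch: it matches the mode $(\alpha - 1)\beta$ and the coefficient of variation $1/\sqrt{\alpha}$ of a Gamma distribution with shape $\alpha$ and scale $\beta$, and may differ from the formulas used by the authors. The function names are hypothetical.

```python
def normal_hyperparameters(mapr, cov):
    """Normal prior: mean mu = MAPr, standard deviation nu = |MAPr| * COV."""
    return mapr, abs(mapr) * cov

def gamma_hyperparameters(mapr, cov):
    """Gamma prior (shape alpha, scale beta) matched to a mode and a COV.

    Assumed matching rule: mode = (alpha - 1) * beta, COV = 1 / sqrt(alpha).
    Requires 0 < cov < 1 so that alpha > 1 and the mode is positive.
    """
    if not 0.0 < cov < 1.0:
        raise ValueError("cov must lie in (0, 1) for a positive mode")
    alpha = 1.0 / cov ** 2
    beta = mapr / (alpha - 1.0)
    return alpha, beta

# Hypothetical prior information: most probable value 1.0, COV of 50%.
mu, nu = normal_hyperparameters(1.0, 0.5)
alpha, beta = gamma_hyperparameters(1.0, 0.5)
```

A larger COV gives a flatter prior, which is the situation in which the posterior PDF is dominated by the likelihood.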

Posterior Probability for Model Class Selection
The purpose of model class selection is to select the most suitable model class from the universal set of candidate model classes for the system and correlated measurement error, $\mathbf{M} = \mathbf{S} \times \mathbf{E} = \{M_k,\ k = 1, 2, \ldots\}$. Note that the determination of the universal set of candidate model classes $\mathbf{M}$ reflects uncertainty introduced by modeling error at the model level based on the prior information.
According to Bayes' theorem, the posterior probability of model class $M_k$ is [16,30]:
$$P(M_k|D) = \frac{p(D|M_k)\, P(M_k)}{\sum_{j} p(D|M_j)\, P(M_j)} \quad (13)$$
where $P(M_k|D)$ is the posterior probability of $M_k$ (also called the plausibility of $M_k$); $p(D|M_k)$ is the evidence (also called the marginal likelihood); $P(M_k)$ is the prior probability of $M_k$. When $P(M_k)$ is selected to be the discrete uniform distribution, the most suitable/plausible model class is $\hat{M}$, possessing the largest evidence:
$$\hat{M} = \arg\max_{M_k} p(D|M_k) \quad (14)$$
In the globally identifiable case, $p(D|M_k)$ can be well approximated as [30]:
$$p(D|M_k) \approx p(D|\hat{\theta}, M_k)\, O_k \quad (15)$$
where $p(D|\hat{\theta}, M_k)$ is the likelihood function evaluated at $\hat{\theta}$; $O_k$ is the Ockham factor [15,16,30]:
$$O_k = p(\hat{\theta}|M_k)\, (2\pi)^{\frac{N_k}{2}} \left|H\left(L(\hat{\theta}|D, M_k)\right)\right|^{-\frac{1}{2}} \quad (16)$$
where $N_k$ is the number of uncertain parameters in $\theta$ for $M_k$. It can be seen that the optimal model $\hat{M}$ should balance the model fitting capability (quantified by $p(D|\hat{\theta}, M_k)$) and the model robustness (quantified by $O_k$). Recall the two correlation functions $\rho^{E_1}_{ij}$ and $\rho^{E_2}_{ij}$ in Equations (2) and (3): $\rho^{E_1}_{ij}$ can be viewed as a simplified case of $\rho^{E_2}_{ij}$ with $\tau^{E_2}_{\gamma} = 0$. On the one hand, if only the model fitting capability is adopted for model class selection, $\rho^{E_2}_{ij}$ is superior to $\rho^{E_1}_{ij}$. On the other hand, if only the model robustness is adopted for model class selection, $\rho^{E_1}_{ij}$ is superior to $\rho^{E_2}_{ij}$. From a Bayesian point of view, the model fitting capability and the model robustness should be considered simultaneously for selecting the most suitable/plausible correlation model of the measurement error.
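The trade-off between fitting capability and robustness can be illustrated numerically. The sketch below combines a log-likelihood at the MAP estimate with a log-Ockham factor to form a log-evidence, and then normalizes across candidate model classes under a uniform prior. The Laplace-type expression used in `log_ockham_factor` is the standard asymptotic form and is assumed here; all numbers and function names are hypothetical and are not taken from the paper's tables.

```python
import math

def log_ockham_factor(log_prior_at_map, n_params, log_det_hessian):
    """Laplace-type log Ockham factor (assumed form):
    ln p(theta_hat | M) + (N / 2) * ln(2 * pi) - 0.5 * ln |H|."""
    return (log_prior_at_map
            + 0.5 * n_params * math.log(2.0 * math.pi)
            - 0.5 * log_det_hessian)

def posterior_model_probabilities(log_evidences, priors=None):
    """Normalize evidences into posterior model probabilities.

    Works in the log domain with a max-shift so that very negative
    log-evidences do not underflow to zero.
    """
    n = len(log_evidences)
    priors = priors or [1.0 / n] * n
    log_post = [le + math.log(p) for le, p in zip(log_evidences, priors)]
    shift = max(log_post)
    weights = [math.exp(v - shift) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical values: the richer models fit better (higher log-likelihood)
# but pay a larger robustness penalty (more negative log-Ockham factor).
log_likelihoods = {"E1": -1250.0, "E2": -1180.0, "E3": -1181.0}
log_ockhams = {"E1": -12.0, "E2": -20.0, "E3": -24.0}

names = list(log_likelihoods)
log_evidences = [log_likelihoods[k] + log_ockhams[k] for k in names]
probabilities = dict(zip(names, posterior_model_probabilities(log_evidences)))
best = max(probabilities, key=probabilities.get)
```

With these illustrative numbers, the simplest model has the best Ockham factor and the identical-correlation model has the best likelihood; the evidence, which adds the two on the log scale, decides between them.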

Illustrative Example
Two examples are presented with applications to model updating and modal frequency prediction under varying environmental conditions. In each example, the SHM data with correlated measurement errors are analyzed by the proposed Bayesian system identification. It is worth noting that the true correlation model and the associated parameters of measurement error, as well as the structural parameters, are unknown during the process of system identification.

Application to Model Updating
Model updating has received tremendous attention in many science and engineering fields. In SHM, Bayesian system identification has been studied and developed to conduct uncertainty quantification [37][38][39][40][41][42][43]. This example demonstrates the analysis of SHM data with correlated measurement errors in model updating. The Rayleigh damping model is adopted, and the damping ratios of the first and third modes are 5%. The base excitation is the El Centro earthquake record. Two cases of structural health status are considered: (1) no damage case: the structure is undamaged, so all structural parameters are equal to their nominal values; (2) damaged case: the stiffnesses of the 1st and 3rd stories are both reduced by 20%, while those of the other stories remain unchanged. Accelerations of the 1st, 3rd, and 5th floors are measured. The measurement error follows a zero-mean multivariate normal distribution with the covariance matrix $\Sigma^{E_{spat,2}E_{temp,2}}$. That is, both spatial and temporal correlations follow the identical correlation model ($E_2$). The measurement noise level is taken to be 15% rms of the noise-free responses of the top floor. Note that the true correlation model and the associated parameters of measurement error, as well as the structural parameters, are unknown during the process of system identification.
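For readers unfamiliar with Rayleigh damping, the following sketch shows how prescribing the damping ratios of two modes fixes the two Rayleigh coefficients, using the textbook relation $\zeta(\omega) = a_0/(2\omega) + a_1\omega/2$ for a damping matrix of the form $a_0 \times$ mass $+\ a_1 \times$ stiffness. The modal frequencies used below are hypothetical placeholders, since the structural properties of the example are not listed here, and the function names are illustrative.

```python
def rayleigh_coefficients(omega_i, omega_j, zeta_i, zeta_j):
    """Solve zeta = a0 / (2 * omega) + a1 * omega / 2 at two modes.

    omega_i, omega_j: circular frequencies (rad/s) of the two anchor modes.
    zeta_i, zeta_j:   damping ratios prescribed at those modes.
    Returns (a0, a1), the mass- and stiffness-proportional coefficients.
    """
    det = omega_j ** 2 - omega_i ** 2
    a0 = 2.0 * omega_i * omega_j * (zeta_i * omega_j - zeta_j * omega_i) / det
    a1 = 2.0 * (zeta_j * omega_j - zeta_i * omega_i) / det
    return a0, a1

def damping_ratio(omega, a0, a1):
    """Damping ratio implied by the Rayleigh model at frequency omega."""
    return a0 / (2.0 * omega) + a1 * omega / 2.0

# Hypothetical first and third modal frequencies (rad/s), both at 5% damping.
a0, a1 = rayleigh_coefficients(10.0, 30.0, 0.05, 0.05)
zeta_between = damping_ratio(20.0, a0, a1)  # implied ratio for a mode in between
```

Modes between the two anchor frequencies receive a damping ratio somewhat below the prescribed value, which is a known property of the Rayleigh model rather than a feature specific to this example.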
The candidate model class of the structure is introduced. In the structural model class $S_1$, the unknown stiffness matrix $K$ and damping matrix $L$ are parametrized as linear combinations of the prescribed nominal stiffness submatrices $K^{S_i}_j$, $j = 1, \ldots, N^{S_i}_K$, and nominal damping submatrices $L^{S_i}_j$, $j = 1, \ldots, N^{S_i}_L$, with the entries of the uncertain stiffness parameter vector and the uncertain damping parameter vector as the respective weights. The damage level of a substructure can be reflected through the reduction of the corresponding stiffness parameter. For example, a 5% reduction of a stiffness parameter indicates a 5% stiffness loss of the corresponding substructure. The candidate model classes for the measurement error are introduced as follows: (1) $E_{spat,1}E_{temp,1}$ ($E_{spat,1} = E_1$ and $E_{temp,1} = E_1$) does not consider any correlation spatially or temporally; (2) $E_{spat,2}E_{temp,2}$ ($E_{spat,2} = E_2$ and $E_{temp,2} = E_2$) considers that spatial and temporal correlations are both identical correlations ($E_2$); (3) $E_{spat,3}E_{temp,3}$ ($E_{spat,3} = E_3$ and $E_{temp,3} = E_3$) considers that spatial and temporal correlations are both exponential correlations ($E_3$). Finally, the universal set of all candidate model classes is $\mathbf{M} = \{S_1E_{spat,1}E_{temp,1},\ S_1E_{spat,2}E_{temp,2},\ S_1E_{spat,3}E_{temp,3}\}$. It is worth noting that because both $S_1E_{spat,2}E_{temp,2}$ and $S_1E_{spat,3}E_{temp,3}$ correspond to spatial-temporal correlation, the existing works [12,21-23] are incapable of handling these two candidate model classes. Table 1 shows the prior PDF of the parameters.
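To make the substructure parametrization concrete, the sketch below assembles a stiffness matrix from nominal story stiffnesses scaled by stiffness parameters, so that a parameter of 0.8 represents a 20% stiffness loss of that story. The shear-building idealization, the five-story size, the nominal stiffness value, and the function name are assumptions made for illustration; they are not specified in the text above.

```python
def shear_building_stiffness(nominal_story_stiffness, stiffness_parameters):
    """Assemble the stiffness matrix of a shear building (assumed model).

    Each story stiffness is the nominal value times its stiffness parameter.
    Story s connects floor s - 1 (or the ground when s = 0) to floor s.
    """
    scaled = [b * k for b, k in zip(stiffness_parameters, nominal_story_stiffness)]
    n = len(scaled)
    K = [[0.0] * n for _ in range(n)]
    for s, ks in enumerate(scaled):
        K[s][s] += ks                 # contribution to the floor above the story
        if s > 0:
            K[s - 1][s - 1] += ks     # contribution to the floor below the story
            K[s - 1][s] -= ks         # coupling between the two floors
            K[s][s - 1] -= ks
    return K

nominal = [100.0] * 5  # hypothetical nominal story stiffnesses
undamaged = shear_building_stiffness(nominal, [1.0] * 5)
# Damaged case as described in the text: 1st and 3rd stories reduced by 20%.
damaged = shear_building_stiffness(nominal, [0.8, 1.0, 0.8, 1.0, 1.0])
```

In an identification run, the stiffness parameters would be the unknowns, and a MAP estimate near 0.8 for a story would be read as damage in that story.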
The following are the results for the no damage case. Table 2 shows the model class selection results (no damage case). The results of the most plausible class are shown in red. For the log-likelihood, $S_1E_{spat,2}E_{temp,2}$ is superior. This is anticipated because the correlation pattern of $S_1E_{spat,2}E_{temp,2}$ is the same as the true correlation of the measurement error. For the log-Ockham factor, $S_1E_{spat,1}E_{temp,1}$ is superior because the simplest parametrization possesses the highest robustness. Because the posterior model probability balances the likelihood and the Ockham factor, the optimal model class is $S_1E_{spat,2}E_{temp,2}$, which is identical to the true model class of the measurement error. Table 3 shows the parameter identification results (no damage case). Based on the capability of Bayesian inference for uncertainty quantification at the parameter level, both the MAP estimates and the posterior standard deviations are shown. The MAP estimates of the most plausible class ($S_1E_{spat,2}E_{temp,2}$) are the closest to the true parameter values, and the corresponding posterior standard deviations are smaller than those of the other model classes. Figure 1 shows the contour plot of the posterior PDF of the substructure parameters of the most plausible model class (no damage case). The true value, MAP estimate, 50% confidence interval, and 95% confidence interval are represented by "+", "o", a dashed line, and a solid line, respectively. The true values of the story stiffnesses are within the 50% confidence intervals. Figure 2 shows the contour plots of the posterior PDF of the measurement error parameters of the most plausible model class (no damage case). The true values of the measurement error parameters are within the 95% confidence interval.
The following are the results for the damaged case. Table 4 shows the model class selection results (damaged case).
It can be anticipated that the true model class $S_1E_{spat,2}E_{temp,2}$ is selected because it is the model that best balances the likelihood and the Ockham factor. Table 5 shows the parameter identification results (damaged case). The most plausible class ($S_1E_{spat,2}E_{temp,2}$) not only successfully detects the stiffness reductions at the 1st and 3rd stories, but also possesses the smallest posterior standard deviations. Figures 3 and 4 show that the true values of the story stiffnesses and of the measurement error parameters are within the 50% confidence intervals and the 95% confidence intervals, respectively.
The conclusions are as follows: (1) it is essential to select the most plausible model of measurement errors based on a prescribed set of candidate model classes characterizing spatially and temporally correlated measurement error; (2) an over-simplified or over-complicated model class for the correlated measurement error in model updating degrades the identification result in terms of both the optimal value and the uncertainty; (3) the proposed Bayesian system identification is capable of detecting structural damage with satisfactory precision without knowing the true correlation model and the associated parameters.

Application to Modal Frequency Prediction under Varying Environmental Conditions
This example demonstrates the analysis of SHM data with correlated measurement error in modal frequency prediction under varying environmental conditions. The monitored tower is located in Perugia, Italy. Figure 5 shows the side view and sensor configuration. One environmental sensor at 27 m and three accelerometers at 41 m were installed to obtain the temperature and modal parameters, respectively. Detailed information can be found in [44]. In the paper, a regression model, Equation (19), is proposed to predict the modal frequencies based on temperature. In this model, $T_n$ is the $n$-th input data point, corresponding to the outdoor air temperature at the belfry at 27 m height with southern orientation; $\omega_{x1,n}$ is the $n$-th output data point corresponding to the frequency of the 1st bending mode in the x direction; $\omega_{y1,n}$, $\omega_{y2,n}$, and $\omega_{y3,n}$ are the $n$-th output data points corresponding to the frequencies of the 1st, 2nd, and 3rd bending modes in the y direction, respectively. The measured noisy output $Y_n$ consists of these four modal frequencies at the $n$-th time step.
The candidate model class of the input-output relation ($S_1$) is identical to the model output $Q_n(b|S)$ of Equation (19). The candidate model classes of measurement error are introduced as follows: (1) $E_{spat,1}E_{temp,1}$ ($E_{spat,1} = E_1$ and $E_{temp,1} = E_1$); (2) $E_{spat,2}E_{temp,1}$ ($E_{spat,2} = E_2$ and $E_{temp,1} = E_1$); (3) $E_{spat,2}E_{temp,2}$ ($E_{spat,2} = E_2$ and $E_{temp,2} = E_2$). Finally, the universal set of all candidate model classes is $\mathbf{M} = \{S_1E_{spat,1}E_{temp,1},\ S_1E_{spat,2}E_{temp,1},\ S_1E_{spat,2}E_{temp,2}\}$. Table 6 shows the model class selection results (modal frequency prediction). The most plausible model class is $E_{spat,2}E_{temp,1}$, indicating that the measurement errors are correlated in space but independent in time. The reason behind this result is explained as follows.
On the one hand, two modal frequencies at the same time step ($\omega_{x1,n}$ and $\omega_{y1,n}$) are identified from the same set of vibration data, so it can be expected that they are statistically correlated, which corresponds to $E_{spat,2} = E_2$. On the other hand, for the same modal frequency at different time steps ($\omega_{x1,n}$ and $\omega_{x1,n'}$ for $n \neq n'$), the temporal distance between them is large, so it can be expected that they are statistically independent, which corresponds to $E_{temp,1} = E_1$. Table 7 shows the parameter identification results (modal frequency prediction). Figure 7 shows the measured and predicted values of $\omega_{x1}$ and $\omega_{y1}$ for the different candidate model classes. The upper and lower three subplots are for $\omega_{x1}$ and $\omega_{y1}$, respectively. The horizontal and vertical axes are for the measured and predicted values, respectively. Points on the 45-degree reference line indicate that the predicted and measured values are identical. The performance of the most plausible model class ($S_1E_{spat,2}E_{temp,1}$, the two middle subplots) is superior to that of the oversimplified model class ($S_1E_{spat,1}E_{temp,1}$, the two left subplots), and is similar to that of the overcomplicated model class ($S_1E_{spat,2}E_{temp,2}$, the two right subplots). This reconfirms the result of Table 6. On the one hand, the likelihood value of $S_1E_{spat,2}E_{temp,1}$ is higher than that of $S_1E_{spat,1}E_{temp,1}$. On the other hand, the likelihood values of $S_1E_{spat,2}E_{temp,1}$ and $S_1E_{spat,2}E_{temp,2}$ are similar, so, by the principle of model parsimony, the simpler model is preferred over the unnecessarily complicated one.

Conclusions
Measurement error is non-negligible and plays a crucial role in SHM data analysis. In many applications of SHM, measurement errors in data from sensor networks are statistically correlated in space and/or in time. This paper generalizes the form of correlated measurement error from spatial correlation only or temporal correlation only to spatial-temporal correlation. A new form of spatial-temporal correlation and the corresponding likelihood function are proposed, and multiple candidate model classes for the measurement error are constructed, including no correlation, spatial correlation, temporal correlation, and the proposed spatial-temporal correlation. Bayesian system identification is conducted to obtain not only the posterior probability density function (PDF) of the model parameters but also the posterior probability of each candidate model class, so that the most suitable/plausible model class for the measurement error can be selected. Examples are presented with applications to model updating and modal frequency prediction under varying environmental conditions. The results show that: (1) to analyze SHM data with correlated measurement error, it is essential to select the most plausible model of measurement errors based on a prescribed set of candidate model classes characterizing spatially and temporally correlated measurement error; (2) an over-simplified or over-complicated model class for the correlated measurement error degrades the identification result; (3) the proposed Bayesian system identification is capable of analyzing SHM data with correlated measurement error without knowing the true correlation model and the associated parameters.
