Enhancing Seismic Damage Detection and Assessment in Highway Bridge Systems: A Pattern Recognition Approach with Bayesian Optimization

Highway bridges stand as paramount elements within transportation infrastructure systems. The ability to ensure swift recovery after extreme events, such as earthquakes, is a fundamental trait of resilient communities. Consequently, expediting the recovery process necessitates near real-time diagnosis of structural damage to provide dependable information. In this study, a data-driven approach for damage detection and assessment is investigated, focusing on bridge columns—the pivotal supporting elements of bridge systems—based on simulations derived from nonlinear time history analysis. This research introduces a set of cumulative intensity-based damage features, whose efficacy is demonstrated through unsupervised learning techniques. Leveraging the support vector machine, a prominent pattern recognition algorithm in supervised learning, alongside Bayesian optimization with a Gaussian process, seismic damage detection and assessment are explored. Encouragingly, the methodology yields high estimation accuracies for both binary outcomes (indicating the presence of damage or the occurrence of collapse) and multi-class classifications (indicating the severity of damage). This breakthrough opens avenues for the practical implementation of on-board sensor computing, enabling near real-time damage detection and assessment in bridge structures.


Introduction
According to the ASCE infrastructure report card [1], the majority of infrastructures in the US are currently rated as mediocre to poor, with many nearing or surpassing their initial design service life and showing signs of deterioration. Among these infrastructures, reinforced concrete (RC) highway bridge systems play a crucial role in transporting goods and people across natural terrains. Ensuring proper recovery after extreme events like earthquakes is vital for building resilient communities. The effective management of post-disaster consequences requires reliable information about the impact of seismic events, thereby necessitating the allocation of existing resources and skills [2]. A resilient system is characterized by its ability to achieve "reduced time to recovery" [3]. Consequently, rapid condition monitoring of highway bridges is imperative [4], with an emphasis on near real-time assessment of structural integrity to determine their safety for reoperation. Traditionally, visual inspections have been the primary method for post-disaster condition assessment. However, deploying dedicated teams for manual inspection poses challenges in terms of both time efficiency and financial resources. Significant efforts have been made to automate the visual inspection process (e.g., [5][6][7][8]). Despite these advancements, automated inspection methods can only detect larger visible defects, leaving the possibility of serious invisible defects going unnoticed [9].
Sensors 2024, 24, 611
Vibration records serve as another valuable source of information in Structural Health Monitoring (SHM), operating under the assumption that dynamic properties and responses undergo changes in the presence of damage. This approach has been studied extensively over the years (e.g., [10,11]). The model-based approach treats SHM as an inverse problem, utilizing the finite element method and model updating for analysis (e.g., [12][13][14][15][16]), which, however, poses challenges for near real-time implementation. More recent developments in SHM involve data-driven methods. Worden et al. [17] applied outlier analysis to detect damage in a three-degree-of-freedom spring system, utilizing Mahalanobis squared distance as the discordancy measure. Santos et al. [18] explored kernel-based algorithms for damage detection under varying operational and environmental conditions. A hybrid approach based on the expectation-maximization algorithm and Gaussian mixture models was proposed by the same group [19] to identify the normal state of a bridge. Another combined approach, utilizing the concept of symbolic data analysis [20], was applied for structural modification assessment using vibration data from a continuously monitored bridge structure. Abdeljaber et al. [21] proposed the use of one-dimensional convolutional neural networks for damage detection, validated on a grandstand simulator. Other researchers have successfully implemented and applied auto-associative neural networks (e.g., [22][23][24]), auto-regressive models (e.g., [25]), and cluster analysis (e.g., [26,27]) for damage detection in recent years. Modal identification and model updating have also been studied using a Bayesian model (e.g., [28][29][30]).
The aforementioned works exemplify damage detection under operational events. In earthquake engineering, there is significant interest in the rapid condition monitoring of structural health for post-earthquake safety assessment. González and Zapico [31] introduced a seismic damage detection method based on artificial neural networks (ANNs) for buildings with steel moment-frame structures. De Lautour and Omenzetter [32] presented an approach using ANNs to identify seismic-induced damage in two-dimensional (2D) reinforced concrete frames. Elwood et al. [33] proposed an approach based on fuzzy pattern recognition for seismic damage detection in concrete building structures. Zhang et al. [34] employed regression trees and random forests to map building response and damage patterns to residual collapse capacity. Recent methods include using convolutional neural networks [35][36][37] and hybrid deep learning models [38]. Notably, most research efforts in seismic damage detection have focused on 2D building structures subjected to unidirectional ground motion excitation.
This paper introduces a data-driven methodology for damage detection and assessment, utilizing acceleration data obtained from over 60,000 nonlinear time history analysis (NTHA) simulations conducted on two representative RC highway bridge systems subjected to bidirectional ground motion (GM) inputs. This study puts forth a set of low-dimensional cumulative intensity-based damage features, including fractional ones, specifically tailored for bridge columns, which are pivotal components of RC highway bridge systems. The effectiveness of these features is evidenced through their estimated joint probability density function (PDF). A comparative analysis is carried out on selected representative bridge systems under different conditions: normal circumstances and earthquake scenarios with probabilities of exceedance (POEs) of 50%, 10%, and 2% in 50 years, respectively.
This study leverages the support vector machine (SVM), a widely recognized pattern recognition algorithm, to scrutinize structural damage features. The SVM plays a pivotal role in identifying collapse occurrences, detecting damage presence, and assessing severity. Addressing the challenge of overfitting, hyperparameter tuning is conducted through Bayesian optimization [39], wherein the generalization performance of the learning algorithm is modeled as a Gaussian process (GP) sample. To the author's knowledge, this paper marks the pioneering application of SVM with Bayesian optimization in SHM for civil infrastructures. The outcomes demonstrate highly promising accuracies and robustness (to a significant amount of noise) in both binary (indicating the presence of damage or collapse) and multi-class (indicating the severity of damage) classifications. This research opens up possibilities for leveraging onboard sensor computing, enabling near real-time damage detection and assessment.

Damage Feature Extraction
An ideal damage feature is a low-dimensional quantity extracted from the system response data, demonstrating a robust correlation with the structural damage state. In this study, the SHM process is simulated by emulating the placement of four accelerometers (virtual sensors) on the bridge column, as illustrated in Figure 1. Among these, two sensors capture the bidirectional ground motion (GM) excitation, while the other two record the acceleration time histories of the column's top in both longitudinal and transverse directions.
It is to be noted that these features serve as the parameters that machine learning algorithms will analyze to identify and quantify damage. Consequently, these damage features are ideally expected to exhibit a monotonic change as the damage levels increase. In this paper, a set of cumulative intensity-based damage features is proposed as follows:

$$I_{\eta}^{g} = \int_{0}^{T_d} \left| a_g(t) \right|^{\eta} \, dt \qquad (1)$$

where $a_g(t)$ represents the acceleration time history of the GM input and $T_d$ denotes the duration of the earthquake. Consequently, this series of damage features incorporates both amplitude and temporal contributions. It is noted that $\eta = 1$ leads to the cumulative absolute velocity (previously proposed as a damage feature in [40]) and $\eta = 2$ leads to the Arias intensity up to a constant factor (the Arias intensity is $I_A = \frac{\pi}{2g} \int_{0}^{T_d} a_g^2(t) \, dt$, so that $I_{2}^{g} = \frac{2g}{\pi} I_A$), as illustrated in Figure 2.
The damage feature suggested in Equation (1) holds broader applicability, as $\eta$ can be any positive real number, thereby eliminating the constraint of being a positive integer. This flexibility allows for the incorporation of potential damage features based on fractional cumulative intensity.
To provide a comprehensive assessment of the bridge column's damage conditions, an additional related feature is introduced as follows:

$$R_{\eta} = \frac{I_{\eta}^{ct}}{I_{\eta}^{g}} = \frac{\int_{0}^{T_d} \left| a_{ct}(t) \right|^{\eta} \, dt}{\int_{0}^{T_d} \left| a_g(t) \right|^{\eta} \, dt} \qquad (2)$$

where $a_{ct}(t)$ represents the acceleration time history sensed at the bridge column top. When analyzing the bridge column as an input-output system from an energy perspective, the ratio shows a decreasing trend with higher energy dissipation, signifying an escalation in the acquired damages on the bridge column. Nevertheless, normalizing absolute intensity measures through the calculation of the corresponding ratio in Equation (2) leads to the loss of information regarding the magnitude of the input energy. Taking into account that $I_{\eta}^{g}$ corresponds to the input energy of excitation [41], it is prudent to incorporate both absolute and relative intensity measures as inputs. As a result, considering the bidirectional GM input in the x and y directions, for each selected $\eta$, a total of four damage features are taken into account in this study, i.e., $I_{\eta-x}^{g}$, $I_{\eta-y}^{g}$, $R_{\eta-x}$, and $R_{\eta-y}$.
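The two features in Equations (1) and (2) can be evaluated directly from sampled acceleration records. The following is a minimal sketch, assuming uniformly sampled records with time step `dt`, trapezoidal integration, and the ratio taken as column-top intensity over ground intensity; the function names are hypothetical and not taken from the study.

```python
from typing import Sequence, Tuple


def cumulative_intensity(acc: Sequence[float], dt: float, eta: float) -> float:
    """Trapezoidal approximation of the integral of |a(t)|**eta over the record."""
    if eta <= 0:
        raise ValueError("eta must be a positive real number")
    powered = [abs(a) ** eta for a in acc]
    return dt * sum(0.5 * (p0 + p1) for p0, p1 in zip(powered, powered[1:]))


def damage_features(a_g: Sequence[float], a_ct: Sequence[float],
                    dt: float, eta: float) -> Tuple[float, float]:
    """Return (I_eta^g, R_eta) for one direction of excitation.

    a_g  : ground (input) acceleration record
    a_ct : column-top acceleration record
    Assumption: R_eta is the column-top intensity divided by the ground intensity.
    """
    i_g = cumulative_intensity(a_g, dt, eta)
    i_ct = cumulative_intensity(a_ct, dt, eta)
    return i_g, i_ct / i_g
```

Calling `damage_features` once per direction (x and y) for a chosen `eta` gives the four features used as classifier inputs; fractional values such as `eta = 0.5` are handled in the same way as integer ones.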

Pattern Recognition
A pattern recognition algorithm is one that assigns a class label to a sample of measured data, typically by training a diagnostic. Supervised learning algorithms, a category within pattern recognition, educate the diagnostic by presenting it with the true label for each dataset. Consequently, these learning algorithms are crucial for evaluating factors such as the severity of damage, where datasets representing various damage states are employed for training and classification purposes. In this investigation, the use of support vector machine (SVM), a prominent representative of supervised learning algorithms, is explored.

Support Vector Machine
The aim of using the SVM is to construct a hyperplane, as defined in the following equation, to separate two different classes of data samples ($y_i \in \{-1, 1\}$) and to maximize the margin from the hyperplane to the closest data points in either class:

$$f(x) = h(x)^{T} \beta + \beta_0 = 0 \qquad (3)$$

where $x$ denotes the selected damage features, i.e., $I_{\eta-x}^{g}$, $I_{\eta-y}^{g}$, $R_{\eta-x}$, and $R_{\eta-y}$ for each $\eta$, and $\beta$ and $\beta_0$ are the coefficients of the hyperplane. This hyperplane is in terms of the extended features $h(x)$. Accordingly, the optimization problem for SVM can be expressed as follows [42]:

$$\min_{\beta, \beta_0} \; \frac{1}{2} \left\| \beta \right\|^2 + C \sum_{i=1}^{N} \xi_i \quad \text{subject to} \quad \xi_i \ge 0, \;\; y_i \left( h(x_i)^{T} \beta + \beta_0 \right) \ge 1 - \xi_i, \;\; i = 1, \ldots, N \qquad (4)$$

where $N$ is the total number of sampled points, $C$ is the cost parameter to control the tradeoff of bias and variance, and $\xi_i$ is the slack variable to allow for some data points to be on the wrong side of the margin. The solution to Equation (4) changes Equation (3) into the following:

$$f(x) = \sum_{i=1}^{N} \alpha_i y_i K(x, x_i) + \beta_0$$

where $\alpha_i$ are the Lagrange multipliers of the constraints in Equation (4) and $K(x, x_i) = \langle h(x), h(x_i) \rangle$ is the kernel function. In this paper, the radial basis kernel function as in Equation (8) is used.
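The kernel form of the SVM decision function can be illustrated with a short sketch. It assumes the common radial basis parameterization exp(-gamma * ||x - z||^2), which may differ from the kernel-scale convention used in the study, and it takes the multipliers `alphas` and the offset `beta0` as given (hypothetical values, not the result of training).

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], z: Sequence[float], gamma: float) -> float:
    """Radial basis kernel, assumed here to be exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def decision_function(x, support_vectors, labels, alphas, beta0, gamma) -> float:
    """f(x) = sum_i alpha_i * y_i * K(x, x_i) + beta0."""
    return beta0 + sum(
        a * y * rbf_kernel(x, sv, gamma)
        for sv, y, a in zip(support_vectors, labels, alphas)
    )


def classify(x, support_vectors, labels, alphas, beta0, gamma) -> int:
    """Return the predicted class label in {-1, +1} from the sign of f(x)."""
    f = decision_function(x, support_vectors, labels, alphas, beta0, gamma)
    return 1 if f >= 0.0 else -1
```

Only samples with nonzero multipliers (the support vectors) contribute to the sum, which is why a trained model can be evaluated cheaply on new records.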

Bayesian Optimization
To avoid overfitting, the common practice is to minimize the K-fold cross-validated (CV) [42] loss ($\mathrm{CVL}^{K}$) of the SVM model with respect to its hyperparameters, the cost parameter $C$ and the kernel scale $\gamma$ in this study. First, one splits the training set into $K$ non-overlapping subsets. For $k = 1, 2, \ldots, K$, the test set is represented by the $k$-th subset, while the training set is represented by the remaining $K-1$ subsets. For the $k$-th iteration, the loss $E_k(\lambda)$ is evaluated, with $\lambda = (C, \gamma)$, while the K-fold CV loss is computed as follows:

$$\mathrm{CVL}^{K}(\lambda) = \frac{1}{K} \sum_{k=1}^{K} E_k(\lambda)$$

The objective function at hand is evidently non-convex, lacking a closed-form expression, and thus its derivatives are inaccessible. One can only acquire observations of this function at sampled values, and such evaluations come at a considerable cost. Consequently, direct application of common optimization algorithms, like the Monte Carlo method or Genetic Algorithm, appears impractical. Bayesian optimization emerges as a potent strategy for extremum discovery in cases where the objective function, such as the one presented in Equation (10), is difficult to assess. What sets Bayesian optimization apart is its approach: it constructs a probabilistic model for the objective function and utilizes this model to determine the next point for evaluation. The aim is to leverage all available information from previous evaluations, thus avoiding an exclusive reliance on local gradient and Hessian approximations [39]. Despite the additional computational effort required to determine the next point for evaluation, Bayesian optimization generally proves to be effective in identifying the minimum of challenging non-convex functions with relatively few evaluations [43].
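The K-fold procedure above can be sketched as follows. This is an illustrative outline only: it assumes the K fold losses are combined by a plain arithmetic mean, and `fold_loss` is a hypothetical callback that would train an SVM with a fixed λ = (C, γ) on the training indices and return its loss on the held-out indices.

```python
from typing import Callable, List


def k_fold_indices(n: int, k: int) -> List[List[int]]:
    """Split the indices 0..n-1 into k non-overlapping, nearly equal folds."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    return [list(range(i, n, k)) for i in range(k)]


def cv_loss(n_samples: int, k: int,
            fold_loss: Callable[[List[int], List[int]], float]) -> float:
    """Average the held-out loss over the k folds (assumed arithmetic mean)."""
    losses = []
    for held_out in k_fold_indices(n_samples, k):
        held = set(held_out)
        train = [i for i in range(n_samples) if i not in held]
        losses.append(fold_loss(train, held_out))
    return sum(losses) / k
```

Each call to `cv_loss` requires K model fits, which is what makes the objective expensive and motivates a sample-efficient optimizer.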
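The fundamental assumption adopted in Bayesian optimization is that the function $\mathrm{CVL}^{K}(\lambda)$ is drawn from a GP prior, i.e., $\mathrm{CVL}^{K}(\lambda) \sim \mathcal{N}(0, \mathbf{K})$ (without loss of generality, the prior mean is given as 0), whose kernel matrix is given by

$$\mathbf{K} = \begin{bmatrix} k(\lambda_1, \lambda_1) & \cdots & k(\lambda_1, \lambda_t) \\ \vdots & \ddots & \vdots \\ k(\lambda_t, \lambda_1) & \cdots & k(\lambda_t, \lambda_t) \end{bmatrix}$$

where $k(\lambda, \lambda')$ is the covariance function. From previous iterations, the following observation is acquired: $D_{1:t} = \{\lambda_{1:t}, \mathrm{CVL}^{K}_{1:t}\}$, where $\mathrm{CVL}^{K}_{1:t} = \mathrm{CVL}^{K}(\lambda_{1:t})$. $\lambda_{t+1}$ is obtained as the next point to evaluate, and the value of the function at $\lambda_{t+1}$ is denoted as $\mathrm{CVL}^{K}_{t+1} = \mathrm{CVL}^{K}(\lambda_{t+1})$. Under the GP prior, $\mathrm{CVL}^{K}_{1:t}$ and $\mathrm{CVL}^{K}_{t+1}$ are jointly Gaussian and one can obtain the following expression for the predictive distribution [39,44]:

$$P\left( \mathrm{CVL}^{K}_{t+1} \mid D_{1:t}, \lambda_{t+1} \right) = \mathcal{N}\left( \mu(\lambda_{t+1}), \sigma^2(\lambda_{t+1}) \right)$$

where

$$\mu(\lambda_{t+1}) = \mathbf{k}^{T} \mathbf{K}^{-1} \mathrm{CVL}^{K}_{1:t}, \qquad \sigma^2(\lambda_{t+1}) = k(\lambda_{t+1}, \lambda_{t+1}) - \mathbf{k}^{T} \mathbf{K}^{-1} \mathbf{k}, \qquad \mathbf{k} = \left[ k(\lambda_{t+1}, \lambda_1), \ldots, k(\lambda_{t+1}, \lambda_t) \right]^{T}$$

Therefore, the predictive posterior distribution $\mathrm{CVL}^{K}_{t+1} \mid D_{1:t}$ is sufficiently characterized by its predictive mean function $\mu(\lambda_{t+1})$ and predictive variance function $\sigma^2(\lambda_{t+1})$, which solely depend on the selection of the covariance function $k(\lambda, \lambda')$. In this study, the automatic relevance determination (ARD) Matérn 5/2 kernel recommended in [43] is used as follows to permit greater flexibility in modeling functions:

$$k(\lambda, \lambda') = \theta_0 \left( 1 + \sqrt{5 r^2(\lambda, \lambda')} + \frac{5}{3} r^2(\lambda, \lambda') \right) \exp\left( -\sqrt{5 r^2(\lambda, \lambda')} \right)$$

where

$$r^2(\lambda, \lambda') = \sum_{d=1}^{D} \frac{(\lambda_d - \lambda'_d)^2}{\theta_d^2}$$

and $\theta_0$ and $\theta_d$, $d = 1, \ldots, D$, are the hyperparameters of the ARD Matérn 5/2 kernel that are learned by "seeding" with a few random samples and maximizing the log-likelihood of the evidence given $\theta = (\theta_0, \theta_1, \ldots, \theta_D)$ [32,36]. In this case, $D = 2$ corresponds to the dimensionality of $\lambda = (C, \gamma)$.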
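The predictive mean and variance of a zero-mean GP with an ARD Matérn 5/2 kernel can be sketched in a few lines. This is a dependency-free illustration for a handful of observations, not the implementation used in the study: the kernel hyperparameters are passed in as fixed values rather than learned, a small diagonal `jitter` is added for numerical stability (an assumption), and the linear solver is a plain Gaussian elimination.

```python
import math


def matern52_ard(a, b, theta0, length_scales):
    """ARD Matern 5/2 covariance between two hyperparameter points a and b."""
    r2 = sum(((ai - bi) / li) ** 2 for ai, bi, li in zip(a, b, length_scales))
    s = math.sqrt(5.0 * r2)
    return theta0 * (1.0 + s + 5.0 * r2 / 3.0) * math.exp(-s)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def gp_posterior(lam_new, lams, values, theta0, length_scales, jitter=1e-10):
    """Predictive mean and variance at lam_new given observations (lams, values)."""
    K = [[matern52_ard(p, q, theta0, length_scales) + (jitter if i == j else 0.0)
          for j, q in enumerate(lams)] for i, p in enumerate(lams)]
    k_vec = [matern52_ard(lam_new, p, theta0, length_scales) for p in lams]
    mean = sum(k * w for k, w in zip(k_vec, solve(K, values)))
    var = (matern52_ard(lam_new, lam_new, theta0, length_scales)
           - sum(k * w for k, w in zip(k_vec, solve(K, k_vec))))
    return mean, max(var, 0.0)
```

At an already evaluated point the variance collapses to nearly zero, while far from all observations the prediction reverts to the prior (mean 0, variance θ0); this is the behavior the acquisition function exploits.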
To sample efficiently, Bayesian optimization uses an acquisition function to determine the next location $\lambda_{t+1}$ for evaluation. The acquisition function used in this study is the Expected Improvement (EI), which is to maximize the EI over the best current value $\lambda_{best} = \arg\min_{\lambda_i \in \lambda_{1:t}} \mathrm{CVL}^{K}(\lambda_i)$. This has a closed-form solution under the GP [44] assumption as follows:

$$a_{EI}(\lambda) = \left( \mathrm{CVL}^{K}(\lambda_{best}) - \mu(\lambda) \right) \Phi(Z) + \sigma(\lambda) \, \phi(Z)$$

where

$$Z = \frac{\mathrm{CVL}^{K}(\lambda_{best}) - \mu(\lambda)}{\sigma(\lambda)}$$

and $\Phi(\cdot)$ and $\phi(\cdot)$, respectively, denote the cumulative distribution function and PDF of the standard normal. Unlike the original unknown objective function in Equation (7), $a_{EI}(\cdot)$ can be cheaply sampled to be maximized. Note that GPs scale cubically with the number of observations. In summary, the goal of Bayesian optimization is to efficiently discover the global optimum with a limited number of evaluations by intelligently allocating additional computing power to identify the next point for assessment. The algorithm of SVM with Bayesian optimization is summarized in Algorithm 1. The proposed pattern recognition algorithm is implemented through the following steps:
1. Generate damage features from the training data, as detailed in Section 2;
2. Train and fine-tune support vector machines (SVMs) according to the procedures outlined in Section 3. This involves using the generated damage features from Step 1 along with corresponding labels (e.g., damaged or not);
3. Employ the trained SVMs from Step 2 for future predictions when a new set of acceleration records is acquired.
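The closed-form Expected Improvement for a minimization problem can be written compactly. The sketch below assumes the predictive mean `mu` and standard deviation `sigma` at a candidate point come from the GP model, and `best` is the lowest CV loss observed so far; the function and argument names are illustrative.

```python
import math


def expected_improvement(mu: float, sigma: float, best: float) -> float:
    """Expected improvement over the current best (lowest) observed loss."""
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mu) * cdf + sigma * pdf


def next_point(candidates, predict, best):
    """Pick the candidate with the largest EI; predict(c) returns (mu, sigma)."""
    return max(candidates, key=lambda c: expected_improvement(*predict(c), best))
```

The first term rewards candidates whose predicted loss is below the current best (exploitation), and the second rewards candidates with large predictive uncertainty (exploration).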

Case Study
In this section, the proposed framework is investigated on RC highway bridge systems.

Computational Bridge Model and Ground Motion Selection
Two representative RC highway bridge systems (designed after 2000), Jack Tone Road Overcrossing (denoted as Bridge A) and La Veta Avenue Overcrossings (denoted as Bridge B), are selected for this study. Comprehensive analytical modeling and simulations of these bridges can be found in [45]. The software platform OpenSees (version 3.5.0) [46] is employed for both the modeling and simulations. The computational models explicitly encompass the superstructure, column-bents, and seat-type abutments. Given that modeling assumptions can significantly influence the dynamic response characteristics of short bridges [47,48], verified and/or validated modeling techniques are adopted whenever feasible.
The bridge superstructure (depicted in Figure 3), comprising the bridge deck and cap beam, is modeled using elastic beam-column elements with uncracked section properties. Exceptionally high torsional and out-of-plane stiffness values are assigned to the cap beam due to its integral construction with the deck. To accurately capture dynamic responses, the mass of the superstructure, including rotational mass, is distributed to the superstructure elements. The bridge column is represented by nonlinear force-based beam-column elements (as illustrated in Figure 3), incorporating fiber-discretized cross-sections. This approach employs three concurrent constitutive models: (1) confined concrete for the core, (2) unconfined concrete for the cover, and (3) steel for the reinforcing bars. For both cover and core concretes, following the methodology in [49], the Concrete01 constitutive model is applied. This model represents a uniaxial Kent-Scott-Park concrete material object with degraded linear unloading/reloading stiffness and no tensile strength. The steel reinforcing bars are modeled using the Steel02 material, which represents a uniaxial Giuffre-Menegotto-Pinto steel material object with isotropic strain hardening [50].
Two modeling approaches, designated as Type I and Type II, are under consideration for the abutment (refer to Figure 4). Both approaches explicitly address longitudinal, transverse, and vertical responses. In Type I (as illustrated in Figure 4a), the model employs two nonlinear springs, each located at the ends, connected in series to gap elements. These springs, modeled with an elastic-perfectly plastic (EPP) backbone, represent the passive backfill response and the expansion joint, respectively [51]. The transverse direction incorporates an EPP backbone relationship to model the backfill-wingwall-pile system, with the Type I model ignoring the resistance of the shear keys for simplicity. The vertical response of the bearing pads and stemwall is captured using two parallel springs. The first spring represents the flexible part of the elastomeric bearing pad in the vertical direction, and the second represents the vertical stiffness of the stemwall. In Type II (depicted in Figure 4b), the longitudinal response is modeled using five abutment nonlinear hyperbolic springs connected in series to gap elements. Additionally, the resistance provided by the shear key is modeled in the transverse direction using a nonlinear spring with a trilinear backbone relationship. It is noteworthy that the modeling technique employed in this study aligns with the approach utilized by Cruz and Saiidi [52], validated through a large-scale four-span bridge test at the University of Nevada, Reno, demonstrating a comparable correlation between seismic demands derived from analytical models and experimental data.
Utilizing a magnitude 7 earthquake scenario outlined in [53], 99 pairs of seed bidirectional horizontal ground motion (GM) records are selected from the PEER Next Generation Attenuation (NGA) Project GM database [54]. Subsequently, these 99 pairs of GM records are scaled based on the lognormal distribution of peak ground velocity (PGV) as detailed in [55]. For this investigation, 25 PGV values (representing 25 intensity levels) are chosen to encompass this distribution. Additionally, various intercept angles, ranging from 0° to 150° in increments of 30° (refer to Figure 3), are explored. Consequently, considering both Bridges A and B, abutment modeling Types I and II, and the six intercept angles mentioned above for all 99 unscaled GMs with 25 PGV values, a total of 59,400 NTHA simulations are conducted. In this extensive set, simulations for the first five intercept angles are designated for training, while those for the last intercept angle constitute the test set. Both sets comprise representative samples, including damaged and undamaged instances, addressing the classification problem related to the existence of damage.
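The simulation count and the angle-based partition described above can be checked with a short enumeration. This is a bookkeeping sketch only; the tuple layout and variable names are illustrative and carry no simulation data.

```python
from itertools import product

bridges = ("A", "B")
abutment_types = ("Type I", "Type II")
angles_deg = tuple(range(0, 151, 30))  # 0, 30, ..., 150 degrees
n_ground_motions = 99
n_pgv_levels = 25

# One tuple per NTHA simulation: (bridge, abutment, angle, GM index, PGV index).
cases = list(product(bridges, abutment_types, angles_deg,
                     range(n_ground_motions), range(n_pgv_levels)))

# The first five intercept angles form the training set, the last one the test set.
train = [c for c in cases if c[2] != angles_deg[-1]]
test = [c for c in cases if c[2] == angles_deg[-1]]

print(len(cases), len(train), len(test))
```

Splitting by intercept angle rather than at random keeps every ground motion and intensity level represented in both sets while ensuring that no test simulation shares its excitation direction with a training simulation.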

Damage Feature
The efficacy of $R_{\eta}$ proposed in Equation (2) as a damage feature is demonstrated through unsupervised learning, where a statistical model (such as a joint probability density function) of damage features during the undamaged state is established [56,57]. Monitoring data are then compared against this model. In this study, the multivariate probabilistic model of Distributions with Independent Components [58,59] is employed, with univariate distributions modeled through kernel density estimation (KDE) [60]. The dataset is derived from 99 ground motions with small scaling factors, ranging from 0.01 to 0.1 in increments of 0.01. This dataset comprises a total of 990 NTHA simulations, representing undamaged conditions for each investigated bridge system configuration (e.g., Bridge A with Type I abutment modeling). The detailed procedure is outlined in Appendix A.
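A baseline density of this kind can be sketched as a product of univariate kernel density estimates. The sketch below is a simplification under stated assumptions: the joint density is taken as the product of per-feature Gaussian KDE marginals in the original feature coordinates, and the bandwidth follows the normal-reference rule of thumb h = 1.06 σ n^(-1/5). The procedure in Appendix A may differ (for example, by transforming the features first), and all function names are hypothetical.

```python
import math
import statistics


def gaussian_kde(samples, bandwidth):
    """Return a univariate Gaussian kernel density estimate as a function."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)

    def pdf(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

    return pdf


def rule_of_thumb_bandwidth(samples):
    """Normal-reference bandwidth (an assumption, not the study's choice)."""
    return 1.06 * statistics.stdev(samples) * len(samples) ** (-0.2)


def joint_pdf(columns):
    """Joint density as the product of independent per-feature KDE marginals."""
    marginals = [gaussian_kde(col, rule_of_thumb_bandwidth(col)) for col in columns]

    def pdf(point):
        return math.prod(m(x) for m, x in zip(marginals, point))

    return pdf
```

Here `columns` would hold the undamaged-state values of, for instance, the x- and y-direction ratios; feature pairs from a monitored event that fall in low-density regions of `joint_pdf` indicate a departure from the undamaged baseline.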
Three sets of 40 GMs, which correspond to the earthquake scenarios with 50%, 10%, and 2% POE in 50 years, are selected to represent three damage levels for the bridges. As a demonstration, Figures 5 and 6 show the comparisons between the joint PDF (the heat maps) and the three groups (red dots) for $R_1$ and $R_2$, respectively. It is noted that the joint PDF of $R_2$ is much flatter than that of $R_1$ (e.g., peaks from 1.6 to 1.8 for $R_{1-x}$ compared to those from 2.3 to 2.8 for $R_{2-x}$), which explains the order of magnitude difference between their color bars. As the damage level increases (i.e., from 50% POE in 50 years to 10% POE in 50 years, and then to 2% POE in 50 years), clear monotonic trends are discernible for both damage features (as groups), showcasing a gradual shift of the ellipses (encompassing most of the red dots) toward the lower-left corner. Consequently, $R_1$ and $R_2$ are effective damage indicators for the bridge columns of the investigated RC highway bridge systems and can be used as damage features in the pattern recognition algorithm introduced next.

Simulated Measurement Noise
For earthquake event applications, it is crucial that the damage detection and assessment algorithm remains robust in the presence of measurement noise in sensor recordings.To simulate such noise, the following procedures are proposed: 1. Random Gaussian noise, with a noise-to-signal ratio of 30% (calculated as the ratio of standard deviations within the duration of each ground motion), is added to the acceleration time history of the input GM excitation at the column bottom (see Figure 1); 2. The acceleration time history at the column top is obtained by summing up the acceleration at the ground level (with noise) from Step 1 and the relative acceleration from NTHA simulations.Note that the relative acceleration is recorded in OpenSees [48] for the column top; 3. Again, random Gaussian noise with a noise-to-signal ratio of 30% is added to the obtained acceleration time history at the column top in Step 2. Figure 7 illustrates the comparison of the acceleration signal at the column top with and without noise for one GM scaled to the highest intensity level. 1.5

Following these procedures, for each GM, the noise-to-signal ratio consistently increases with the increase in intensity level. Figure 8 depicts such trends, presenting the average noise-to-signal ratio for the selected 99 ground motions across all four bridge configurations for the 30-degree intercept angle.
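The three-step noise procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the sinusoidal signals are synthetic stand-ins for a recorded ground motion and an NTHA relative acceleration, and the noise standard deviation in Step 3 is assumed to be scaled by the standard deviation of the signal obtained in Step 2.

```python
import math
import random
import statistics
from typing import List, Optional, Tuple


def add_gaussian_noise(signal: List[float], ratio: float,
                       rng: random.Random) -> List[float]:
    """Add zero-mean Gaussian noise whose standard deviation equals
    `ratio` times the standard deviation of `signal` over its duration."""
    sigma = ratio * statistics.pstdev(signal)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def simulate_readings(ground_acc: List[float], relative_acc: List[float],
                      ratio: float = 0.30,
                      rng: Optional[random.Random] = None
                      ) -> Tuple[List[float], List[float]]:
    """Return noisy readings at the column bottom and at the column top."""
    if rng is None:
        rng = random.Random()
    # Step 1: noise on the input GM excitation at the column bottom.
    noisy_ground = add_gaussian_noise(ground_acc, ratio, rng)
    # Step 2: column-top acceleration = noisy ground + relative acceleration.
    top_acc = [g + r for g, r in zip(noisy_ground, relative_acc)]
    # Step 3: noise on the column-top acceleration (assumed scaling, see lead-in).
    noisy_top = add_gaussian_noise(top_acc, ratio, rng)
    return noisy_ground, noisy_top


# Synthetic stand-ins (assumptions): 20 s of data sampled at 100 Hz.
dt = 0.01
t = [i * dt for i in range(2000)]
ground_acc = [math.sin(2 * math.pi * 1.0 * ti) for ti in t]
relative_acc = [0.5 * math.sin(2 * math.pi * 2.0 * ti) for ti in t]

noisy_ground, noisy_top = simulate_readings(
    ground_acc, relative_acc, ratio=0.30, rng=random.Random(0))
```

Because the column-top signal already carries the ground-level noise before Step 3 is applied, the noise content of the final column-top reading relative to the noise-free signal is generally larger than the nominal 30%.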

Classification Results
For visualization purposes, Figure 9 shows the training results using $R_1$ and $R_2$ (refer to Equation (2)), respectively, as damage features for predicting the occurrence of collapse (i.e., when the peak column drift ratio exceeds 8% [61]) and the existence of damage (i.e., when the peak column drift ratio exceeds 2% [62]) for Bridge A with Type I abutment modeling. The decision boundary is determined by the labeled training data and an SVM tuned using Bayesian optimization. It is important to note that the decision boundary is nonlinear, and the damage regions exhibit discontinuities due to the utilization of a nonlinear kernel, as defined in Equation (6). Figure 10 illustrates the minimization of the 10-fold cross-validated loss (adopted as the objective function in this paper) using Bayesian optimization.

Both classification tasks, detecting the existence of damage and predicting the occurrence of collapse, are performed for all investigated bridge configurations using the damage features outlined in Table 1. In this study, the damage feature vector has around ten dimensions [4]. As mentioned earlier, for each η, a total of four damage features are taken into account, i.e., $I^g_{\eta-x}$, $I^g_{\eta-y}$, $R_{\eta-x}$, and $R_{\eta-y}$. With the hyperparameter values of the SVM models (Table 2) determined using Bayesian optimization (searching over a cube with $C, \gamma \in [0.001, 1000]$, as shown in Figure 10), the training, CV, and testing accuracies for all scenarios are documented in Table 3. Notably, the CV accuracy (i.e., $1 - \mathrm{CVL}_K$, where $\mathrm{CVL}_K$ is the cost function minimized in Equation (9)) closely approximates the testing accuracy. While a subtle decrease in testing accuracy is observed with more intricate structures (from single-column Bridge A to two-column Bridge B, and from Type I abutment modeling to Type II with additional springs and gap elements), the testing accuracies remain high for these two binary classifications. Figures 11 and 12 provide example confusion matrices for both training and testing sets, illustrating accuracies and misclassification errors for each class.

The SVM is further extended to handle multi-class classification problems. In this paper, a three-class classification is conducted: no damage, damaged without collapse, and collapse (i.e., peak column drift ratio below 2%, between 2% and 8%, and above 8%). This entails three SVM classifiers, each comparing one of the three classes to the remaining two. In this case, λ in Algorithm 1 becomes a vector with six elements, representing three cost parameters and three kernel scales, one pair for each SVM classifier. The last two columns of Table 2 contain the hyperparameter values for the three SVM models, i.e., from top to bottom, (0, 2%), [2%, 8%), and [8%, +∞) versus the remaining two classes. Accuracies of approximately 90% are achieved (Table 3). Additionally, Figures 13-16 present the confusion matrices for training and testing sets in the three cases, thereby providing the predicted accuracies for each class. It is noteworthy that the accuracies for training and testing sets in all cases are comparable, indicating that the Bayesian-optimized SVM classifiers exhibit robustness against overfitting.

The advantages of Bayesian optimization are evident in the comparison of testing accuracies achieved with and without Bayesian optimization for hyperparameter selection (Table 4). It is important to note that the hyperparameters leading to the testing accuracies without Bayesian optimization are randomly selected (i.e., those used in the first iteration of the corresponding Bayesian optimization).
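The three-class labeling and its decomposition into three one-versus-rest classifiers can be illustrated with a minimal Python sketch. The function and variable names are hypothetical and the drift ratios are made-up values; assigning a drift ratio of exactly 2% or 8% to the higher class follows the interval notation (0, 2%), [2%, 8%), and [8%, +∞) quoted for Table 2. The sketch covers only the label construction, not the SVM training or the Bayesian optimization.

```python
from typing import Dict, List

DAMAGE_THRESHOLD = 0.02    # peak column drift ratio marking the existence of damage
COLLAPSE_THRESHOLD = 0.08  # peak column drift ratio marking the occurrence of collapse

CLASS_NAMES = ("no damage", "damaged without collapse", "collapse")


def severity_class(peak_drift_ratio: float) -> int:
    """Map a peak column drift ratio to a class index 0, 1, or 2."""
    if peak_drift_ratio < DAMAGE_THRESHOLD:
        return 0
    if peak_drift_ratio < COLLAPSE_THRESHOLD:
        return 1
    return 2


def one_vs_rest_labels(classes: List[int], positive: int) -> List[int]:
    """Binary targets (+1 / -1) for the classifier that separates
    class `positive` from the remaining two classes."""
    return [1 if c == positive else -1 for c in classes]


# Made-up peak drift ratios, for illustration only.
drifts = [0.005, 0.019, 0.02, 0.05, 0.08, 0.12]
classes = [severity_class(d) for d in drifts]

# One binary problem per class; each would get its own (C, gamma) pair,
# which is why the hyperparameter vector has six elements.
binary_targets: Dict[int, List[int]] = {
    k: one_vs_rest_labels(classes, k) for k in range(len(CLASS_NAMES))
}
```

Each of the three binary target vectors would then be paired with the same feature matrix to train one SVM, and a sample is typically assigned to the class whose classifier returns the largest decision score.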

Conclusions
This paper introduces a novel data-driven approach for detecting damage in bridge columns through nonlinear time history simulations conducted on a reinforced concrete highway bridge system. The proposed structural health monitoring method simulates the placement of four accelerometers on the bridge column. Two of these accelerometers measure the bidirectional GM excitation, while the other two record the acceleration time histories of the column top in the longitudinal and transverse directions. A set of cumulative intensity-based damage features, including fractional ones, is derived from the acceleration time histories. These features are shown to be effective and reliable indicators of damage through unsupervised learning. The analysis takes into account distributions with independent components, utilizing univariate kernel density distributions. Subsequently, a support vector machine is applied, with its hyperparameters optimized using Bayesian optimization. This approach is used to address various binary and multi-class classification problems related to damage diagnosis, such as predicting the occurrence of collapse, identifying the existence of damage, and assessing its severity. High accuracies and robustness are achieved, even when the recordings are subjected to simulated measurement noise with a high noise-to-signal ratio. This suggests the model's potential for implementation in sensor networks equipped with onboard computing capabilities, thereby enabling near real-time damage detection and assessment.

Figure 1. Virtual accelerometer placement on the bridge columns.
the acceleration time history of the GM input and $T_d$ denotes the duration of the earthquake. Consequently, this series of damage features incorporates both amplitude and temporal contributions.

the bidirectional GM input in x and y directions, for each selected η, a total of four damage features are taken into account in this study, i.e., $I^g_{\eta-x}$, $I^g_{\eta-y}$, $R_{\eta-x}$, and $R_{\eta-y}$.
Sensors 2024, 24, 611

Figure 3. Modeling of the bridge under bidirectional GM input considering different intercept angles.

Figure 4. Abutment modeling with springs and gap elements. (a) Type I. (b) Type II.

Figure 7. Time history reading of the column top for one GM scaled to the highest intensity level.

Figure 8. The average noise-to-signal ratio of the four bridge configurations under all intensity levels for the 30-degree intercept angle.

Figure 9. SVM training results for the occurrence of collapse using $R_{1-x}$ and $R_{1-y}$ and the existence of damage using $R_{2-x}$ and $R_{2-y}$ for Bridge A with Type I abutment modeling. (a) $R_{1-x}$ and $R_{1-y}$. (b) $R_{2-x}$ and $R_{2-y}$.

Figure 10. An illustration of cross-validated loss minimization using Bayesian optimization and its bird's-eye view. (a) Bayesian optimization. (b) Bird's-eye view.

Figure 11. Existence of damage: confusion matrices for Bridge B with Type I abutment modeling. (a) Training set. (b) Testing set.

Figure 15. Severity of damage: confusion matrices for Bridge B with Type I abutment modeling. (a) Training set. (b) Testing set.

Figure 16. Severity of damage: confusion matrices for Bridge B with Type II abutment modeling. (a) Training set. (b) Testing set.

Table 1. Selected damage features for investigated classification cases.

Table 2. Hyperparameters selected using Bayesian optimization for investigated classification cases.

Table 3. Training, CV, and testing accuracies achieved for investigated classification cases.

Table 4. Comparison of testing accuracies with and without Bayesian optimization for hyperparameter selection.

Table 2 column headers: Bridge; Abutment; Existence of Damage; Occurrence of Collapse; Severity of Damage (sub-columns: C, γ).