Review

Bridge Structural Health Monitoring: A Multi-Dimensional Taxonomy and Evaluation of Anomaly Detection Methods

Computer and Network Engineering Department, Umm Al-Qura University, Makkah 21955, Saudi Arabia
* Author to whom correspondence should be addressed.
Buildings 2025, 15(19), 3603; https://doi.org/10.3390/buildings15193603
Submission received: 31 August 2025 / Revised: 29 September 2025 / Accepted: 5 October 2025 / Published: 8 October 2025
(This article belongs to the Special Issue Structural Health Monitoring Through Advanced Artificial Intelligence)

Abstract

Bridges are critical to national mobility and economic flow, making dependable structural health monitoring (SHM) systems essential for safety and durability. However, the SHM data quality is often affected by sensor faults, transmission noise, and environmental interference. To address these issues, anomaly detection methods are widely adopted. Despite their wide use and variety, there is a lack of systematic evaluation that comprehensively compares these techniques. Existing reviews are often constrained by limited scope, minimal comparative synthesis, and insufficient focus on real-time performance and multivariate analysis. Consequently, this systematic literature review (SLR) analyzes 36 peer-reviewed studies published between 2020 and 2025, sourced from eight reputable databases. Unlike prior reviews, this work presents a novel four-dimensional taxonomy covering real-time capability, multivariate support, analysis domain, and detection methods. Moreover, detection methods are further classified into three categories: distance-based, predictive, and image processing. A comparative evaluation of the reviewed detection methods is performed across five key dimensions: robustness, scalability, real-world deployment feasibility, interpretability, and data dependency. Findings reveal that image-processing methods are the most frequently applied (22 studies), providing high detection accuracy but facing scalability challenges due to computational intensity. Predictive models offer a trade-off between interpretability and performance, whereas distance-based methods remain less common due to their sensitivity to dimensionality and environmental factors. Notably, only 11 studies support real-time anomaly detection, and multivariate analysis is often overlooked. Moreover, time-domain signal processing dominates the field, while frequency and time-frequency domain methods remain rare despite their potential. Finally, this review highlights key challenges such as scalability, interpretability, robustness, and practicality of current models. Further research should focus on developing adaptive and interpretable anomaly detection frameworks that are efficient enough for real-world SHM deployment. These models should combine multi-modal strategies, handle uncertainty, and follow standardized evaluation protocols across varied monitoring environments.

1. Introduction

Bridges form the backbone of transportation networks by providing connectivity across natural and man-made obstacles such as rivers, valleys, and roads. Therefore, they play an important role in economic progress through enhanced mobility [1]. However, they experience wear and tear over time due to environmental factors and aging materials, which gradually weaken their structure [2]. To mitigate these risks, a structural health monitoring (SHM) process is implemented to detect early signs of damage. This proactive approach enhances safety, reduces risks, and extends the lifespan of bridges [3].
A typical SHM process is illustrated in Figure 1. The process begins with the excitation phase, where either controlled external forces (active excitation) or naturally occurring ambient vibrations (passive excitation) are applied to the bridge structure to initiate a dynamic response [4,5]. Following excitation, sensors record critical performance parameters (such as strain, displacement, and environmental conditions) during the data acquisition stage. The acquired data is then normalized to eliminate distortions and ensure consistency. In the anomaly detection phase, the data is examined for irregularities that may indicate faults. Such anomalies can result from sensor malfunctions, environmental noise, transmission errors, or human mistakes [6,7]. If anomalies are detected, a data correction step is applied to remove or mitigate errors. Once the data is refined, it undergoes processing to extract meaningful insights. These insights support informed decision-making regarding the bridge’s condition. Based on these insights, bridge maintenance actions are then implemented to enhance structural safety and ensure timely intervention [8,9].
As illustrated in Figure 1, timely data correction in the SHM process is essential for accurate assessment. If errors are not rectified before the data processing phase, evaluations may become unreliable. Therefore, detecting abnormal data helps prevent unnecessary maintenance actions. By identifying such anomalies, engineers can distinguish genuine structural issues from faulty sensor readings [10,11]. Consequently, robust methods for detecting abnormal data are used to improve the reliability of the SHM process. These methods can be broadly categorized into three main types: distance-based methods, predictive models, and image processing approaches.
Distance-based anomaly detection assumes that anomalous data points lie far from the majority of other points in the dataset. If the distance between data points exceeds a set threshold, the point is considered abnormal [12]. Predictive models estimate expected data values from historical trends and identify abnormalities as deviations from the predicted results [13,14,15]. Examples of predictive models are regression models [16], Bayesian approaches [17], and neural networks [15]. These models enhance detection accuracy but pose challenges in setting optimal thresholds. Image processing methods [18] use deep learning to detect anomalies by converting SHM time-series data into visual formats. Although effective, this approach risks losing information and introduces complexity into anomaly detection frameworks.
The performance of the aforementioned abnormal data detection methods depends on several key factors such as real-time capability [13], analysis domain [12], and multivariate analysis [17]. Real-time capability enables the detection of anomalies as they occur, allowing for prompt corrective actions and preventing major failures. Similarly, analysis domain approaches (time, frequency, or time-frequency techniques) help uncover meaningful patterns in monitoring data. Multivariate analysis further enhances detection performance by integrating data from multiple sensors, revealing complex correlations that single-variable methods may overlook. Together, these factors significantly improve monitoring reliability and contribute to long-term bridge safety.

1.1. Motivation for Conducting a Systematic Literature Review on SHM Anomaly Detection

A systematic review of anomaly detection methods is essential to consolidate growing research. It helps to assess existing methods by highlighting their strengths, limitations, and suitability across different contexts [19]. Moreover, a systematic literature review (SLR) may examine how factors such as real-time capability, analysis domain, and multivariate analysis influence reliability and efficiency. Ultimately, conducting an SLR helps improve anomaly detection accuracy and enhances long-term bridge safety through well-informed, data-driven decisions.

1.2. Existing Literature Reviews on Abnormal Data Detection in SHM of Bridges

Recent literature reviews have examined anomaly detection in SHM from multiple perspectives. Some studies focus on statistical outlier detection methods [20]. Others emphasize deep learning-based classification techniques [11]. A few works propose frameworks for assessing data quality in SHM systems [21]. Despite these contributions, many reviews exhibit notable limitations. Their scope is often narrow, addressing only specific algorithms or fault types. For example, one study categorizes anomaly detection methods solely by algorithm type (e.g., statistical vs. machine learning) [20], while another focuses only on data domains (e.g., time vs. frequency) [11]. These isolated approaches do not offer a unified understanding of how different methods relate or compare. Moreover, comparative evaluations across detection paradigms are generally absent. Therefore, a multi-dimensional taxonomy is needed to bring clarity and cohesion. Table 1 provides a summary of key prior reviews. It highlights their main focus areas and outlines the limitations that this study aims to address.
Although multi-dimensional taxonomies have been proposed in related fields, they often lack integration tailored to bridge-specific monitoring. These frameworks are common in areas such as time-series anomaly detection and general SHM applications. However, they do not fully address the unique challenges associated with bridge monitoring. For instance, the works in [7,19] primarily focus on algorithmic performance. The authors offer useful insights into detection accuracy and model design but often overlook practical aspects such as real-time processing. Furthermore, multivariate sensor fusion is frequently neglected, despite being critical for effective bridge monitoring. Similarly, the work in [22] explores AI applications in infrastructure maintenance, highlighting the potential of intelligent systems to improve operational efficiency. However, it does not provide a structured evaluation of anomaly detection methods. Signal domain analysis and bridge-specific typologies are rarely discussed in detail.
In contrast, our review introduces a unified four-dimensional taxonomy comprising real-time capability, multivariate support, analysis domain, and detection methodology. Real-time capability refers to the system’s ability to detect anomalies with minimal latency (typically within sub-second thresholds). Multivariate support indicates whether the method can handle multiple sensor streams simultaneously. Analysis domain distinguishes between time-series, frequency-domain, and image-based representations used for anomaly detection. Finally, the detection method categorizes the core algorithmic paradigm employed, such as distance-based, predictive modeling, or image processing techniques. These dimensions collectively enable a deployment-oriented classification of SHM anomaly detection approaches.
This taxonomy is explicitly designed to address the operational and computational complexities of bridge SHM. Unlike prior works, we present a comparative synthesis of 36 peer-reviewed studies (2020–2025) using standardized evaluation metrics, including accuracy, precision, recall, and F1-score. We also highlight underexplored areas such as time-frequency domain analysis and adaptive multi-modal approaches. Our review not only consolidates existing knowledge in the field of bridge monitoring systems but also identifies actionable gaps and outlines future directions for scalable, interpretable, and real-time anomaly detection.

1.3. Contributions

This SLR makes the following key contributions to the field of abnormal data detection in bridge SHM:
  • Multi-Dimensional Taxonomy: A novel four-dimensional taxonomy is proposed, encompassing real-time capability, multivariate support, analysis domain (time, frequency, time-frequency), and detection methodology (distance-based, predictive, and image processing). It enables a holistic and operationally relevant classification of anomaly detection methods.
  • Comparative Analysis: The study systematically evaluates 36 peer-reviewed articles published between 2020 and 2025 using standardized performance metrics (accuracy, precision, recall, F1-score), offering a detailed synthesis of strengths, limitations, and deployment feasibility across different detection paradigms.
  • Identification of Underexplored Areas: The review highlights critical gaps in current research, including limited adoption of real-time processing (only 11 studies), sparse use of multivariate sensor fusion (only 8 studies), and under-utilization of frequency and time-frequency domain analysis.
  • Deployment-Oriented Insights: Robustness, scalability, interpretability, and data dependency of each detection method are assessed, providing actionable guidance for real-world SHM system integration.
  • Future Research Directions: The study outlines pathways for advancing SHM anomaly detection, including hybrid frameworks, lightweight and explainable AI models, multi-modal sensor fusion, domain adaptation, and the need for benchmark datasets and standardized evaluation protocols.
The following research questions (RQs) have been formulated to guide a structured investigation into existing methodologies.
  • RQ1: How frequently are different abnormal data detection techniques used in SHM studies, and which method dominates current research?
  • RQ2: How do various abnormal data detection methods perform in terms of real-time capability, analysis domain, and multivariate analysis?
  • RQ3: How do anomaly detection methods for bridge SHM compare in terms of detection performance across different fault types and study contexts?
  • RQ4: What are the key challenges in abnormal data detection, and how can emerging advancements improve detection accuracy in future research?

1.4. Overview of the Systematic Literature Review Approach

To address the constructed research questions (RQ1–RQ4), the overall layout of this SLR is illustrated in Figure 2. A total of 36 relevant studies were systematically selected. These came from eight reputable scientific databases: IEEE, Springer, Elsevier, SAGE, MDPI, Wiley, Tech Science Press, and Techno-Press. The selection was guided by well-defined inclusion and exclusion criteria. The detailed selection protocol is described in Section 2. The selected studies are categorized using four key dimensions. These include real-time capability, analysis domain, multivariate support, and anomaly detection methods, as detailed in Section 3. Anomaly detection methods are further classified into distance-based methods, predictive models, and image processing methods in Section 4. Subsequently, Section 5 provides a comparative performance analysis of detection techniques across metrics such as robustness, scalability, deployment feasibility, interpretability, and data dependency. Section 6 highlights key challenges and directions for future research. The research question responses, along with noted limitations, are summarized in Section 7. The article is concluded in Section 8.

2. Research Methodology

To address the research questions outlined in the introduction, the SLR was conducted in accordance with the guidelines provided by [23]. These guidelines have been implemented in many other SLRs [24,25,26,27]. The review process involved defining key categories and developing a structured protocol for selecting relevant research articles. Section 2.1 presents the foundational background for these categories, while Section 2.2 outlines the detailed steps of the review protocol.

2.1. Categories Definition

Various anomaly detection methods are employed in SHM systems. Figure 3 presents a hierarchical taxonomy of these methods, categorized into distance-based, predictive, and image processing approaches, along with their subtypes [28]. Each category offers unique strengths in identifying abnormal data. While selecting an appropriate detection method is important, other factors also influence performance in SHM applications. These include real-time processing capability, the analysis domain (time, frequency, or time-frequency), and the ability to handle multivariate sensor data. The following sections explore each category in detail.

2.1.1. Abnormal Data Detection Methods

This section provides a brief overview of abnormal data detection methods, highlighting their core principles, strengths, and limitations within the SHM context.
Distance-Based Method
Distance-based methods identify anomalies by measuring the proximity between data points in a feature space [29]. Outliers are those that deviate significantly from the majority. The main idea is that normal data points are close together, while anomalies are farther away from them. A sample is flagged as abnormal if its distance from nearby points exceeds a predefined threshold. These methods are simple and interpretable but vary in scalability, sensitivity, and effectiveness across datasets. They are typically categorized into k-nearest neighbors (KNN)-based approaches [30], which rely on neighborhood density, and threshold-based methods [31], which compare data against fixed limits.
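To make the idea concrete, the following is a minimal sketch of KNN-based distance scoring, assuming scikit-learn is available; the synthetic data, neighborhood size k, and percentile threshold are illustrative choices rather than values taken from the reviewed studies.

```python
# Minimal sketch of KNN distance-based anomaly screening on a feature matrix
# X of shape (n_samples, n_features) built from SHM sensor readings.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(1000, 3))          # stand-in for normal sensor features
X[::100] += 6.0                                    # inject a few synthetic outliers

k = 5
knn = NearestNeighbors(n_neighbors=k + 1).fit(X)   # +1 because each point is its own neighbor
dist, _ = knn.kneighbors(X)
score = dist[:, 1:].mean(axis=1)                   # mean distance to the k nearest neighbors

threshold = np.percentile(score, 99)               # data-driven threshold (a design choice)
anomalies = np.flatnonzero(score > threshold)
print(f"flagged {anomalies.size} candidate anomalies")
```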
Predictive Methods
Predictive methods are data-driven approaches that use historical and real-time measurements to estimate expected structural behavior [15]. They work by learning temporal patterns or statistical relationships within the data to generate forecasts, which are then compared to actual sensor readings. Significant discrepancies between predicted and observed values indicate potential anomalies [32]. Common predictive techniques in SHM include multivariate linear regression [33], Bayesian networks [13,17], and Long Short-Term Memory (LSTM) neural networks [34]. Regression models capture trends, Bayesian inference provides a probabilistic framework for managing uncertainty, and neural networks uncover complex relationships. Predictive methods often require large volumes of high-quality training data, are sensitive to noise, and may perform poorly under non-stationary conditions or sudden structural changes. Additionally, setting appropriate thresholds for anomaly detection can be difficult.
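The predict-and-compare principle can be illustrated with a short sketch, assuming scikit-learn; the autoregressive lag order, synthetic signal, and 3-sigma rule are illustrative assumptions, not a reproduction of any reviewed model.

```python
# Minimal sketch of predictive anomaly detection: fit a lagged linear
# (autoregressive) model on a univariate signal and flag samples whose
# prediction residual exceeds 3 standard deviations.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
t = np.arange(2000)
signal = np.sin(2 * np.pi * t / 200) + 0.05 * rng.normal(size=t.size)
signal[1500] += 1.0                                # synthetic spike anomaly

lags = 10
X = np.column_stack([signal[i:i - lags] for i in range(lags)])  # previous 10 samples
y = signal[lags:]                                               # next sample to predict

model = LinearRegression().fit(X, y)
residual = y - model.predict(X)
flags = np.abs(residual) > 3 * residual.std()      # simple 3-sigma decision rule
print("anomalous indices:", np.flatnonzero(flags) + lags)
```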
Image Processing Methods
These methods detect anomalies by transforming sensor data into visual formats [35]. Typically, time-series or multivariate signals are converted into image-based representations [36,37,38], which are then analyzed using deep learning models, particularly convolutional neural networks (CNNs). Image-based approaches are generally categorized into two types: two-dimensional (2D) inputs, such as signal plots or spectrograms, and other formats like structured feature maps or encoded attributes. These methods offer key advantages, including the ability to capture complex spatial–temporal patterns, improved recognition accuracy, and the reuse of established computer vision models. This makes them effective for detecting subtle or evolving structural changes. However, challenges include potential loss of signal fidelity during transformation, sensitivity to visual noise, and high computational demands during both pre-processing and inference.
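A minimal sketch of the conversion step is shown below, assuming NumPy; the segment length and 64 x 64 image size are arbitrary illustrative choices.

```python
# Minimal sketch of turning a 1-D monitoring segment into a 2-D grayscale image.
import numpy as np

def segment_to_grayscale(segment: np.ndarray, size: int = 64) -> np.ndarray:
    """Normalize a 1-D segment to [0, 255] and reshape it into a size x size image."""
    seg = segment[: size * size]                       # truncate to fit the image grid
    lo, hi = seg.min(), seg.max()
    norm = (seg - lo) / (hi - lo + 1e-12)              # scale to [0, 1]
    return (norm * 255).astype(np.uint8).reshape(size, size)

rng = np.random.default_rng(2)
acceleration = rng.normal(size=64 * 64)                # stand-in for one sensor segment
image = segment_to_grayscale(acceleration)
print(image.shape, image.dtype)                        # (64, 64) uint8, ready for a CNN
```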
Embedded Role of Statistical Methods in SHM Anomaly Detection Frameworks
Statistical methods are not listed as a separate category in our taxonomy. This is intentional, as they are embedded within broader modeling frameworks. Statistical techniques (such as regression, Bayesian inference, and principal component analysis (PCA)) are often part of predictive models. These techniques assist in learning temporal patterns, estimating uncertainty, and extracting features [7]. Statistical descriptors (such as mean, standard deviation, skewness, and kurtosis) are widely used as input features. They support image-based and hybrid anomaly detection models. For instance, some deep learning methods convert time-series data into images that include statistical features [32,39]. These images are then analyzed using CNNs. In such cases, statistical methods are used during pre-processing or feature extraction, not as separate detection tools. This shows a common trend in SHM research: statistical techniques are now built into machine learning workflows. Their use helps improve clarity, reliability, and generalization. Listing them as a separate category would be unnecessary and could give a misleading impression of their role. Interested readers can find the exact nature of their embedded role in SHM anomaly detection frameworks in [32,39].
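As an illustration of this embedded role, the sketch below computes common statistical descriptors per signal window as input features for a downstream detector; it assumes NumPy and SciPy, and the window length is arbitrary.

```python
# Minimal sketch: statistical descriptors computed per window and used as
# input features, rather than as a stand-alone detection method.
import numpy as np
from scipy.stats import skew, kurtosis

def window_features(window: np.ndarray) -> np.ndarray:
    """Return [mean, std, skewness, kurtosis, RMS] for one signal window."""
    return np.array([
        window.mean(),
        window.std(),
        skew(window),
        kurtosis(window),
        np.sqrt(np.mean(window ** 2)),     # root mean square
    ])

rng = np.random.default_rng(3)
signal = rng.normal(size=10_000)
windows = signal.reshape(-1, 500)          # non-overlapping 500-sample windows
features = np.vstack([window_features(w) for w in windows])
print(features.shape)                      # (20, 5) feature matrix for a detector
```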

2.1.2. Real-Time Capability

Real-time capability refers to a system’s ability to process and analyze sensor data instantly or with minimal delay, enabling immediate detection of anomalies. This is typically achieved through high-frequency data acquisition combined with fast processing algorithms, often deployed on edge devices or lightweight embedded systems. The main advantage of real-time processing is its support for timely decision-making, which is critical in scenarios such as earthquakes, overloads, or material fatigue. However, these systems face challenges, including limited computational resources, especially when using complex models such as deep neural networks or image-based methods.

2.1.3. Domain Analysis

Domain analysis involves examining signal characteristics across different transform domains to uncover structural behaviors or anomalies. The three primary domains are time, frequency, and time–frequency. Each domain offers unique insights: for instance, time-domain analysis captures raw signal trends, while frequency-domain techniques can reveal hidden patterns such as resonance shifts or periodic disturbances. Time–frequency methods provide a localized view of how signal characteristics evolve over time and frequency simultaneously. Damage-related features may be more prominent in specific domains depending on the nature of the anomaly. Therefore, selecting the appropriate domain is critical for enhancing detection accuracy and interpretability in SHM applications.
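The three domains can be contrasted on a single synthetic record, as in the sketch below (assuming NumPy and SciPy); the sampling rate and signal content are illustrative assumptions.

```python
# Minimal sketch contrasting time, frequency, and time-frequency views of one
# synthetic acceleration record.
import numpy as np
from scipy.signal import stft

fs = 100.0                                             # sampling rate in Hz
t = np.arange(0, 60, 1 / fs)
accel = np.sin(2 * np.pi * 2.0 * t) + 0.1 * np.random.default_rng(4).normal(size=t.size)

# Time domain: descriptive statistics of the raw record.
peak_to_peak = accel.max() - accel.min()

# Frequency domain: FFT amplitude spectrum and the dominant frequency.
spectrum = np.abs(np.fft.rfft(accel))
freqs = np.fft.rfftfreq(accel.size, d=1 / fs)
dominant = freqs[np.argmax(spectrum[1:]) + 1]          # skip the DC bin

# Time-frequency domain: short-time Fourier transform (spectrogram-like view).
f, seg_times, Z = stft(accel, fs=fs, nperseg=256)
print(f"peak-to-peak={peak_to_peak:.2f}, dominant={dominant:.2f} Hz, STFT shape={Z.shape}")
```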

2.1.4. Multivariate Analysis

This type of analysis involves examining multiple sensor measurements simultaneously to detect complex inter-dependencies and uncover patterns that may indicate structural anomalies. It works on the principle that damage or deterioration often affects several parameters concurrently (such as strain, acceleration, and temperature) and analyzing them collectively can reveal insights that are not visible when analyzing individual signals. The key advantages of multivariate analysis include improved anomaly detection accuracy, enhanced noise resilience, and the ability to model interactions between variables over time. However, this approach can face limitations such as increased computational complexity, sensitivity to correlated noise, and challenges in interpreting high-dimensional results. It also requires well-calibrated sensor networks and synchronized data collection to ensure reliability.
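A minimal sketch of one common multivariate strategy, PCA reconstruction error over synchronized sensor channels, is shown below; it assumes scikit-learn, and the synthetic six-channel data and 3-sigma threshold are illustrative.

```python
# Minimal sketch of multivariate anomaly scoring via PCA reconstruction error
# on a synchronized matrix X of shape (n_samples, n_sensors).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(5)
base = rng.normal(size=(5000, 1))
X = np.hstack([base + 0.1 * rng.normal(size=(5000, 1)) for _ in range(6)])  # 6 correlated channels
X[2500, 3] += 5.0                                       # fault on a single channel

pca = PCA(n_components=2).fit(X)
recon = pca.inverse_transform(pca.transform(X))
error = np.linalg.norm(X - recon, axis=1)               # per-sample reconstruction error

threshold = error.mean() + 3 * error.std()
print("flagged samples:", np.flatnonzero(error > threshold))
```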

2.2. Review Protocol Development

Based on predefined SLR guidelines [23], we developed a review protocol outlining the inclusion/exclusion criteria, search strategy, quality assessment, and data synthesis. Details are provided in the following sections.

2.2.1. Selection and Rejection Criteria

A concrete set of criteria was established, as shown in Table 2, to guide the selection and exclusion of studies. Only works that met these standards were included in the review.
To maintain relevance and reflect current technologies, only studies published between 2020 and 2025 were included. This time frame captures the most recent advancements in anomaly detection for SHM, particularly the integration of deep learning, real-time analytics, and sensor fusion techniques. Older studies published before 2020, though important, have already been reviewed in earlier surveys. They often miss the modern computational tools and deployment strategies used in today’s SHM systems. By focusing on this five-year window, this review emphasizes state-of-the-art methodologies and aligns with current research priorities.

2.2.2. Search Process

The databases listed in Table 3 were searched using varied keyword combinations and a publication filter (2020–2025) to ensure recency. While the AND operator yielded narrow results, the OR operator provided broader coverage. To refine the output, we applied two additional filters: article type and engineering subject area. Figure 4 illustrates the steps of the search process. We used various search terms across the target databases and retrieved approximately 6812 results. First, 4321 studies were excluded based on their titles. Then, 1469 were removed after reviewing their abstracts. A general review of the remaining 1022 articles was conducted to assess coherence and alignment with research questions RQ1–RQ4. This led to the exclusion of 442 articles. The remaining 580 were evaluated in detail. Based on this comprehensive review, 544 were filtered out. Finally, 36 articles were selected for inclusion. It is important to note that for this systematic review, a single reviewer independently screened all titles, abstracts, and full-text articles. As no dual-reviewer process was employed, inter-reviewer agreement metrics (e.g., Cohen’s κ) were not applicable.
The final selection of 36 studies reflects a rigorous, multi-stage review process. From an initial pool of over 6800 articles across eight reputable databases, each paper was screened for relevance to bridge SHM, methodological clarity, and validation strength. Redundant or weakly supported studies were excluded. The chosen papers span key modeling paradigms, sensor modalities, and anomaly detection strategies. They include both foundational and recent works, ensuring comprehensive and up-to-date coverage. Adding more studies would introduce redundancy without enhancing insight. This curated set enables a focused synthesis of trends and innovations, making it optimal for constructing a meaningful and well-structured taxonomy. The distribution of selected studies by publication year is illustrated in Figure 5.

3. Classification Results

This section begins by classifying the selected studies according to the abnormal data detection methods, discussed in Section 3.1. It then explores three additional design parameters: real-time capability (Section 3.2), analysis domain (Section 3.3), and multivariate analysis capability (Section 3.4).

3.1. Abnormal Data Detection Methods

The introductory part of this article states that this SLR focuses on three primary categories of abnormal data detection methods: distance-based methods, predictive approaches (including Bayesian inference, regression analysis, and neural networks), and image processing techniques, as outlined in Section 2.1.1. Consequently, Table 4 displays the distribution of the selected studies across the predefined categories.
It can be observed from Table 4 that image processing methods are the most frequently adopted approach. These techniques have gained popularity due to their ability to extract complex features and support deep learning architectures. In contrast, distance-based methods are rarely used. Their limited application is likely due to scalability issues and reduced effectiveness in high-dimensional SHM data. Predictive approaches, however, represent a strong area of ongoing research. They combine statistical models such as regression and Bayesian inference with advanced machine learning techniques. This fusion enhances their adaptability and interpretability. As a result, predictive models offer a practical balance between detection performance and computational efficiency. A detailed comparative analysis of abnormal data detection techniques within each category is provided in Section 4 and Section 5. Nevertheless, the categorization in Table 4 sets the stage for the analytical comparisons that follow, enabling targeted insights into the strengths, limitations, and application contexts of each methodological class.

3.2. Integration of Real-Time Processing in Structural Health Monitoring Frameworks

Real-time anomaly detection is a critical capability in SHM systems, enabling immediate response to structural faults and minimizing the risk of catastrophic failure. Among the 36 reviewed studies, only 11 explicitly incorporated real-time processing mechanisms, underscoring their underutilization in current research. Table 5 categorizes the reviewed SHM studies based on their support for real-time anomaly detection, revealing a notable skew toward non-real-time implementations.
Table 6 further details the real-time studies, summarizing their methods, accuracy, latency, hardware (HW) platform, and key characteristics. The reviewed studies demonstrate that real-time SHM performance is achievable through efficient architectures such as Bayesian models and lightweight neural networks, even under resource constraints. Several methods report sub-second latency and detection accuracies exceeding 95%, confirming their suitability for continuous monitoring and early fault detection.

3.2.1. Recent Advances in Real-Time Anomaly Detection for Bridge SHM

Recent studies have demonstrated significant progress in real-time anomaly detection for bridge SHM. Zhang et al. [13] employed a Bayesian Dynamic Linear Model (BDLM) with subspace detection and adaptive thresholding. Their system validated 600 data points in 0.69 s, achieving a per-point latency of just 0.024 s and an overall accuracy of 98.96%. In parallel, Zhang et al. [15] applied LSTM networks with a dual-threshold strategy, enabling fast and accurate fault detection from live SHM data streams with accuracy exceeding 93%. Zhu et al. [16] introduced Gaussian Process Regression (GPR) with representative data selection under operational variations, achieving 96% accuracy and a latency of 0.10 s on CPU-based systems. Gao et al. [18] developed a Pattern Recognition Neural Network (PRNN) that supports efficient deployment across long-span bridges, achieving 96.4% accuracy. Kim and Mukhiddinov [43] addressed class imbalance using a hyperparameter-tuned 1D CNN, achieving 97.6% accuracy through layer-wise training and gradual refinement.
Yang et al. [40] proposed a GPS data cleansing method, reporting 95–98% accuracy in anomaly localization and reconstruction. Qu et al. [42] introduced SARIMA for forecasting outlier effects, achieving 95% accuracy in early warning scenarios. Hao et al. [53] advanced scalability with a Mixture of Bridge Experts framework, integrating MobileViT and BST to reach 99.35% accuracy with a latency of 0.145 s. Wang et al. [54] developed a conversion system for server-based SHM, achieving 98% accuracy. In parallel, Wang et al. [59] implemented a transfer learning approach on edge devices using domain adaptation and self-distillation, delivering 98% accuracy with a latency of 0.3 s. Juntao et al. [60] demonstrated real-time alerting capabilities using neural networks with accuracy exceeding 95%.

3.2.2. Real-Time SHM: Bottlenecks and Deployment Hurdles

Despite these advances, several bottlenecks hinder the widespread deployment of real-time anomaly detection in bridge SHM. A primary challenge is computational complexity. Deep learning models such as CNNs, LSTMs, and hybrid architectures (e.g., Bi-LSTM + cGAN) demand substantial memory and processing power, limiting their feasibility on embedded or edge platforms. Another critical issue is the latency–accuracy trade-off. While lightweight models offer faster inference, they may compromise precision. Conversely, high-accuracy models often incur longer processing times. For instance, BDLM [13] and GPR [16] demonstrate sub-second latencies with high accuracy, but models like transfer learning [59] and MobileViT-based frameworks [53] require more computational resources despite their strong performance.
To address these constraints, several studies have explored edge computing, model compression, and hybrid frameworks. Wang et al. [59] leveraged domain adaptation on edge devices, while Hao et al. [53] integrated specialized models for sensor-specific optimization. Research by [54,61] investigated model conversion and inference strategies to reduce deployment friction. Although only a subset of approaches have achieved real-time performance under constrained conditions, their success underscores the feasibility of scalable SHM solutions. Future work must focus on adaptive architectures that balance latency, accuracy, and hardware efficiency across diverse bridge environments and sensor modalities.

3.3. Analysis Domain Investigations

Table 7 summarizes the signal analysis domains used in SHM studies. Time-domain analysis is the most frequently adopted approach. It appears in 25 out of the 36 reviewed studies. This reflects its simplicity, low computational cost, and ease of implementation. Time-domain features such as root mean square, kurtosis, and peak-to-peak amplitude are widely used for anomaly detection. Frequency-domain methods are also common. Fourier Transform (FT) is the most widely used technique in this category. FT decomposes time-varying signals into harmonic components. It provides insights into amplitude, phase, and frequency content. However, FT assumes signal stationarity and cannot capture transient behaviors.
In addition to time-domain and frequency domain, time-frequency domain techniques offer a more comprehensive view. They analyze signals that change over time and frequency simultaneously. Methods such as Wavelet Transforms (WT), Short-Time Fourier Transform (STFT), Wigner–Ville Distribution (WVD), Empirical Mode Decomposition (EMD), and Hilbert–Huang Transform (HHT) are designed to capture non-stationary and localized signal features [62,63,64]. For example, Deng et al. [46] used Continuous Wavelet Transform (CWT) to convert SHM signals into two-dimensional images. Their model achieved high accuracy in identifying pseudo-normal data and generalized well across different structural systems.
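A minimal sketch of the CWT-to-image step is given below; it assumes the PyWavelets package (pywt), and the Morlet wavelet and scale range are illustrative choices, not the configuration used in [46].

```python
# Minimal sketch of converting a 1-D signal into a 2-D time-frequency image
# (scalogram) via the continuous wavelet transform.
import numpy as np
import pywt

fs = 100.0
t = np.arange(0, 10, 1 / fs)
signal = np.sin(2 * np.pi * 3 * t)
signal[500:520] += 2.0                                   # short transient anomaly

scales = np.arange(1, 64)
coeffs, freqs = pywt.cwt(signal, scales, "morl", sampling_period=1 / fs)
scalogram = np.abs(coeffs)                               # 2-D time-frequency image
print(scalogram.shape)                                   # (63, 1000): usable as a CNN input
```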
Despite the advantages of time-frequency methods, they remain underutilized. Only five studies in this review employed them. This limited adoption is due to several barriers. These techniques often involve high computational complexity. They require more memory and processing time than time-domain methods. Implementation can be challenging, especially for real-time applications. Some methods also demand expert knowledge for parameter tuning and interpretation. However, the potential benefits are significant. Time-frequency analysis can detect subtle and transient anomalies that are missed by simpler methods. It improves sensitivity to localized damage and environmental effects. These techniques are especially valuable for bridges exposed to dynamic loads and varying operational conditions.
In summary, while time-domain analysis dominates current SHM practice, time-frequency methods offer richer diagnostic capabilities. Their ability to capture both temporal and spectral features makes them ideal for complex anomaly detection tasks. Future research should focus on optimizing these methods for real-time deployment. This includes reducing computational load and developing adaptive algorithms. Greater attention to time-frequency analysis can enhance the accuracy, robustness, and generalizability of SHM systems across diverse bridge environments.

3.4. Multivariate Analysis Capability Investigations

As summarized in Table 8, only 8 out of the 36 reviewed studies employed multivariate analysis techniques in their anomaly detection frameworks. This limited adoption appears to stem from a combination of technical complexity, data scarcity, and a broader research gap. Multivariate analysis requires synchronized, high-quality data from multiple sensor modalities (such as strain, acceleration, displacement, temperature, and humidity) which are often unavailable or difficult to calibrate in real-world SHM deployments. Additionally, the increased computational burden and challenges associated with interpreting high-dimensional results may discourage its use, especially in resource-constrained environments.
Despite the aforementioned barriers, multivariate analysis offers significant advantages in detecting latent or compound failures that single-sensor approaches may overlook. For instance, Xiang et al. [17] utilized Bayesian estimation across four sensor channels to quantify uncertainty and derive probabilistic metrics for anomaly detection. Similarly, Son et al. [41] applied CNNs to structured multivariate features, enabling joint learning across time and frequency domains. Kang et al. [60] integrated diverse signal descriptors—including strain, displacement, vibration, and environmental parameters—into a machine learning framework that improved detection robustness under varying operational conditions.
The reviewed studies show that sensor fusion plays a vital role in enhancing anomaly detection. By combining data from multiple sensors, it captures cross-sensor dependencies that single-sensor approaches may miss. This integration improves noise resilience and helps uncover subtle patterns linked to structural degradation. Techniques like PCA, Independent Component Analysis (ICA), and attention-based neural networks support this process. They reduce dimensionality and extract meaningful features, making multivariate frameworks more scalable and easier to interpret. Despite its advantages, multivariate analysis is still underused in current SHM research. However, its potential to boost detection accuracy and generalization across different bridge types is significant. Future SHM systems should focus on sensor fusion strategies to gain deeper insights from diverse data sources. This will enable more adaptive, intelligent, and reliable monitoring solutions.
As shown in Table 8, multivariate analysis methods fall into three main categories: time series models, CNN-based feature learning, and machine learning frameworks.
  • Multivariate time series models: Studies such as [17,30,40] employ multivariate time series models to analyze structural response data captured from multiple sensors over time. These methods are particularly effective in capturing temporal dependencies and cross-sensor correlations, which are critical for detecting subtle anomalies under dynamic structural conditions.
  • Multivariate feature-based learning via CNN: Studies including [41,43,54,57] utilize CNNs to process structured multivariate input features. These features typically consist of time-domain and frequency-domain indicators extracted from raw monitoring signals. The CNN architecture allows for joint learning across these input dimensions, enabling the model to detect a variety of abnormal patterns.
  • Multivariate machine learning approaches: The study by [60] presents a comprehensive machine learning-based framework that integrates multivariate analysis for anomaly detection. It processes diverse signal descriptors (such as strain, displacement, vibration, and environmental parameters) from multiple sensor types. By applying classification models that capture inter-dependencies among these signals, the framework enhances detection accuracy and robustness. This approach is particularly effective under varying environmental and loading conditions, offering improved generalization while maintaining computational efficiency suitable for real-time SHM applications.
To summarize, multivariate anomaly detection techniques are diverse and evolving. They include time series modeling, CNN-based feature learning, and broader machine learning integration. Although only a few studies explicitly adopt these strategies, their impact is notable. Multivariate frameworks excel at capturing temporal patterns, cross-sensor relationships, and contextual anomalies. These strengths highlight their value in SHM. Future research should focus more on multivariate methodologies. Such approaches can reveal deeper insights from SHM data. They also support the development of adaptive, scalable, and intelligent monitoring systems.

4. Analysis of Abnormal Data Detection Methods

Section 3 presents a classification of the selected research studies based on abnormal data detection methods, real-time capabilities, analysis domains, and multivariate analysis support. However, a detailed analysis of abnormal data detection approaches is one of the core objectives of this SLR. Accordingly, this section provides a comprehensive analysis of these methods across several performance dimensions. Specifically, Section 4.1 examines studies utilizing distance-based techniques, while Section 4.2 evaluates predictive approaches, including Bayesian, regression, and neural network-based models. Lastly, Section 4.3 focuses on image-processing methods.
The evaluation metrics used to assess the performance of abnormal data detection methods include Accuracy, Precision, Recall, and F1-score, as defined in Equations (1)–(4), respectively. In binary classification tasks, model effectiveness is characterized by four key outcomes: true positives (TPs ), true negatives (TNs), false positives (FPs), and false negatives (FNs). These quantities serve as the foundation for several widely used performance metrics. Accuracy, defined in Equation (1), measures the overall proportion of correctly predicted instances. Precision, shown in Equation (2), quantifies the ratio of correctly predicted positive observations to the total predicted positives. Recall, presented in Equation (3), captures the ability of the model to correctly identify actual positive cases. Lastly, the F1-score, defined in Equation (4), represents the harmonic mean of Precision and Recall. It is particularly valuable in scenarios involving class imbalance, offering a balanced evaluation of both false positives and false negatives.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (1)$$
$$\text{Precision} = \frac{TP}{TP + FP} \quad (2)$$
$$\text{Recall} = \frac{TP}{TP + FN} \quad (3)$$
$$\text{F1-score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \quad (4)$$
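For reference, the following short snippet evaluates Equations (1)–(4) directly from raw confusion counts; the example values are arbitrary.

```python
# Minimal sketch computing Equations (1)-(4) from raw confusion counts.
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

print(classification_metrics(tp=90, tn=880, fp=10, fn=20))
```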

4.1. Distance-Based Methods

Table 9 presents a comparison of distance-based anomaly detection methods used in bridge SHM. It includes two studies with different dataset characteristics. The first study uses a real-world dataset with 60,000 samples of acceleration data from a cable-stayed bridge. This dataset captures actual operational noise and variability. The second study uses a simulated dataset with 40,000 samples. It includes multiple sensor types such as temperature, strain, displacement, humidity, and deflection sensors. Real-world data improves realism, while simulated data allows controlled testing across diverse fault scenarios.
The first method [31] applies the Minimum Covariance Determinant (MCD) to detect low-quality signals and abnormal behavior. It uses robust distance metrics in multivariate space. The second method [30] uses KNN to identify statistical outliers by comparing temporal patterns. Both methods are unsupervised and easy to implement. They require minimal labeled data and are suitable for edge deployment. However, they face challenges with high-dimensional inputs and threshold sensitivity. These approaches are useful for early anomaly screening and can be integrated into broader SHM frameworks where interpretability and efficiency are important.

4.2. Predictive Methods

Within this category, 12 studies were selected and classified into three subgroups: Bayesian methods (2 studies), regression-based methods (2 studies), and neural network-based methods (8 studies). These classifications reflect the range of modeling strategies adopted for predictive anomaly detection in SHM applications.

4.2.1. Bayesian Methods

Bayesian estimation models rely on prior knowledge expressed as probability distributions. These distributions are updated as new data becomes available. The approach treats parameters as random variables that evolve with incoming observations [65,66]. This allows the model to adapt continuously and refine its predictions over time. Table 10 provides an overview of selected research studies that have employed Bayesian methods. It includes the bridge type, sensor modalities used, and the targeted anomaly types. Reported accuracy, where available, reflects detection performance. The final column outlines the specific Bayesian techniques employed, such as BDLM and Probability Density Functions (PDFs).
As shown in Table 10, Kim et al. [13] analyzed data from a long-span cable-stayed bridge. Their dataset included 144 samples of acceleration and strain measurements. The study focused on detecting spikes and baseline shifts in the sensor signals. They used a Bayesian Dynamic Linear Model (BDLM) with subspace detection. This method achieved an accuracy of 98.96 percent. It demonstrated strong performance in identifying sudden changes and signal drift. Similarly, Xiang et al. [17] worked with a larger dataset of 245 samples. The data came from a large-span bridge equipped with multiple sensors. These included anemometers, temperature sensors, anchor load cells, and a connected pipe system. The study targeted sensor faults that could affect system reliability. They applied Bayesian estimation to generate probability density functions. A certainty index was introduced to measure confidence in the detection results. The method did not report a specific accuracy value but emphasized reliability and uncertainty quantification.
In conclusion, Bayesian methods are effective at identifying various types of abnormal data in SHM, including spikes, baseline shifts, and sensor faults. They employ probabilistic reasoning to manage uncertainty and dynamically update model parameters as new data becomes available, enhancing robustness in changing structural environments. Compared to distance-based methods, which rely on fixed thresholds and simple similarity metrics, Bayesian approaches offer greater flexibility and interpretability. While distance-based techniques are computationally efficient and suitable for detecting clear outliers, they often struggle with complex or subtle anomalies and lack adaptability across varying sensor inputs. However, the performance of Bayesian models depends on well-defined prior distributions and sufficient data. Despite these challenges, Bayesian approaches remain powerful tools for modeling uncertainty and improving anomaly detection in bridge monitoring systems.
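To illustrate the recursive predict-update-flag loop underlying such models, the sketch below uses a one-dimensional random-walk Kalman filter and flags measurements with large standardized innovations; the noise settings and threshold are illustrative assumptions, not the BDLM configuration of the cited studies.

```python
# Minimal sketch of Bayesian sequential updating with innovation-based flags
# (a 1-D random-walk Kalman filter).
import numpy as np

def bayesian_flag(measurements, q=1e-4, r=1e-2, k_sigma=4.0):
    mean, var = measurements[0], 1.0
    flags = []
    for z in measurements[1:]:
        var += q                                    # predict: prior widens by process noise
        innovation = z - mean
        s = var + r                                 # predictive variance of the measurement
        flags.append(abs(innovation) > k_sigma * np.sqrt(s))
        gain = var / s                              # update: Kalman gain
        mean += gain * innovation
        var *= (1 - gain)
    return np.array(flags)

rng = np.random.default_rng(6)
z = np.cumsum(0.01 * rng.normal(size=500)) + 0.1 * rng.normal(size=500)
z[300] += 1.5                                       # injected spike
print("flagged indices:", np.flatnonzero(bayesian_flag(z)) + 1)
```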

4.2.2. Regression Methods

These models are used to quantify the relationship between one or more input variables and an output variable, and are commonly applied in prediction or inference tasks. Regression models can be broadly categorized into two types. The first includes general-purpose models, such as linear regression, Gaussian process regression, and support vector regression, which are primarily used to model static relationships between variables [67]. The second category comprises time series-based models, including Moving Average (MA), Autoregressive (AR), and Autoregressive Integrated Moving Average (ARIMA). These methods are designed to handle time-dependent data, capturing evolving patterns and accounting for random variation. Time series models are particularly useful for sequential datasets but are more complex, as they must accommodate changing conditions and dynamic system behavior [33].
Table 11 summarizes studies that employ regression-based methods for anomaly detection in SHM bridge systems. The “Dataset” column identifies the dataset deployed in the evaluation process. The “Bridge type” column highlights the structural context, including an oblique arch bridge and a long-span cable-stayed bridge, demonstrating the adaptability of regression models across different bridge types. The “Sensor Data” column indicates the nature of the input data, such as strain and stress measurements, which are commonly used to monitor structural integrity. The “Anomaly Type” column identifies the specific irregularities targeted in each study (outliers in one case and noise in another) reflecting the models’ ability to handle different forms of abnormal data. The final column outlines the regression techniques used. For example, Zhu et al. [16] proposed a representative data selection strategy for online performance assessment. Their approach uses Gaussian Process models to enhance prediction accuracy and reduce computational load. It is designed for streaming bridge monitoring data, supporting real-time applications. The study also introduces a performance warning index to evaluate bridge conditions and detect anomalies. The method operates within a probabilistic framework, capturing uncertainty in the monitoring data. Compared to SARIMA, Gaussian process regression handles non-linear relationships better. It also provides confidence intervals for predictions. However, SARIMA is more efficient for long-term forecasting in time-dependent systems. Both models support anomaly detection but differ in focus. SARIMA excels in temporal pattern recognition. Gaussian processes are stronger in terms of noise filtering and uncertainty modeling.
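The confidence-interval-based flagging described above can be sketched as follows, assuming scikit-learn's GaussianProcessRegressor; the kernel, synthetic data, and 99% interval are illustrative and do not reproduce the cited method.

```python
# Minimal sketch of regression-based detection with uncertainty: points
# outside an approximate 99% predictive interval are flagged.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(7)
x = np.linspace(0, 10, 300).reshape(-1, 1)
y = np.sin(x).ravel() + 0.1 * rng.normal(size=x.shape[0])
y[150] += 1.0                                        # injected outlier

kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.01)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(x, y)
mean, std = gpr.predict(x, return_std=True)

flags = np.abs(y - mean) > 2.58 * std                # ~99% two-sided interval
print("flagged indices:", np.flatnonzero(flags))
```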
When we compare regression models with Bayesian models, we observe that the latter offer a different approach. They rely on prior knowledge and update beliefs as new data arrives. This makes them powerful for uncertainty quantification and decision-making when data is limited. Bayesian inference can model complex systems with fewer samples. However, it often involves high computational cost and requires careful prior selection. In contrast, regression models are more scalable and easier to implement. They perform well when data is abundant and prior knowledge is limited.

4.2.3. Neural Network Methods

Neural networks have shown strong performance in tasks such as object detection, classification, and segmentation. This success has led to their increasing use in data analysis, forecasting, and anomaly detection. In SHM, they are effective at predicting structural responses and improving accuracy [68]. These models learn patterns directly from data without needing explicit equations. They capture complex nonlinear relationships that traditional models often miss. However, they lack interpretability, making it hard to trace how inputs affect outputs. Their architectures can also be complex. As layers increase, so do parameters and computational costs. Training becomes more time-consuming and resource-intensive.
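A minimal sketch of an LSTM forecaster whose one-step residuals could be thresholded for anomaly flags is shown below; it assumes PyTorch, and the window length, hidden size, and single-feature input are illustrative assumptions rather than any reviewed architecture.

```python
# Minimal sketch of an LSTM one-step forecaster; residuals between forecasts
# and observed values would drive the anomaly rule.
import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    def __init__(self, n_features: int = 1, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_features)     # predict the next sample

    def forward(self, x):                             # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])               # forecast from the last hidden state

model = LSTMForecaster()
window = torch.randn(8, 50, 1)                        # 8 windows of 50 past samples
target = torch.randn(8, 1)                            # next observed values
loss = nn.MSELoss()(model(window), target)
loss.backward()
print(float(loss))
```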
Table 12 provides a comprehensive summary of studies that apply neural network-based methods for anomaly detection in bridge SHM systems. The “Ref.” column lists the cited studies, each representing a unique application of neural networks in real-world monitoring scenarios. The “Dataset” column identifies the dataset deployed in the training and testing processes. The “Bridge Type” column shows the diversity of structural configurations, including long-span cable-stayed bridges, suspension bridges, and twin-box girder bridges, indicating the broad applicability of these methods. The “Sensor data” column highlights the sensor modalities used, such as acceleration, GPS, and cable tension data, which are essential for capturing structural behavior. The “Anomaly Type” column outlines the range of anomalies detected, including missing data, outliers, drift, trend shifts, and abnormal patterns. This demonstrates the versatility of neural networks to identify both subtle and severe deviations. The next two columns identify the Neural network architecture and the methodology deployed in each study. The next four columns (“P = Precision”, “R = Recall,” “F1 = F1 Score,” and “A = Accuracy”) report performance metrics used to evaluate detection effectiveness.
It can be observed from Table 12 that CNN-based methods dominate the landscape, particularly for acceleration data, with studies like [14,43,44] demonstrating strong performance in detecting complex anomalies such as drift, trend, and square patterns. Notably, ref. [44] achieves the highest F1 score (0.975) and accuracy (99.15%), employing data compression for enhanced feature extraction. LSTM variants, including Bi-LSTM and reconstruction-based models [40,41], show superior recall and precision, especially when handling GPS and cable tension data, with [41] reaching near-perfect accuracy (99.98%). The SCN-based (Stochastic Configuration Network) approach in [45] introduces structural innovation via Random Node Removal, yielding competitive metrics (>0.96 F1, >99% accuracy) despite a smaller dataset. Methodologies vary from subspace enhancement and dual-thresholding to statistical feature integration, reflecting a trade-off between interpretability and complexity. Overall, while CNNs offer robust generalization across anomaly types, LSTM-based models excel in temporal pattern recognition, and SCNs provide promising structural adaptability, underscoring the importance of aligning architecture choice with sensor type and anomaly characteristics.
In summary, the reviewed studies demonstrate that neural networks can detect a wide range of fault types with high precision and accuracy. This includes minor errors, missing signals, trends, and drift patterns. Although the models differ in input sources and bridge types, most achieve high recall and F1 scores, indicating reliable and balanced performances. Despite challenges such as limited interpretability and high computational demand, neural networks remain effective and versatile tools for detecting abnormal patterns in SHM applications.
Summary of Predictive Models: While neural networks dominate the predictive model category, interpretability remains a nuanced aspect. In this review, interpretability was assessed based on the transparency of model architecture and the availability of feature attribution mechanisms. For instance, studies such as [32,41] employed CNNs with structured input features, allowing partial traceability of decision pathways through saliency maps or attention weights. Regression-based models [42] were considered more interpretable due to their explicit mathematical formulations and direct mapping between input variables and outputs. Bayesian models [13] offered probabilistic reasoning and uncertainty quantification, further enhancing interpretability. However, we acknowledge that standardized metrics for interpretability such as SHAP (SHapley Additive exPlanations) values and LIME (Local Interpretable Model-agnostic Explanations) scores were not uniformly reported across studies [69,70]. This limitation has been noted in Section 6.2, and future research should incorporate formal interpretability assessments to support transparent and trustworthy SHM applications.

4.3. Image Processing Methods

Image processing is a key part of artificial intelligence. It simulates human vision to perform tasks such as segmentation, classification, recognition, tracking, and decision-making. In civil engineering, it is used to measure crack widths, locate structural elements, and assess earthquake damage. Recently, it has been applied in SHM. Instead of using raw sensor data, the data is converted into images. These images are then used to train models to detect anomalies [71,72]. This approach mimics how humans visually detect changes in their environment. In this category of image processing methods, 22 research studies were reviewed. Of these, 13 utilized 2D image inputs, while the remaining 9 employed alternative visual representations.
Next, we present two major categories of image-processing-based anomaly detection methods in SHM: those that rely on two-dimensional (2D) image inputs and those that utilize alternative input formats. Each category is evaluated in terms of data transformation techniques, model performance, and practical applicability across different bridge types and fault scenarios.

4.3.1. Two-Dimensional Image Input Classes

This category of methods converts one-dimensional (1D) time series monitoring data into 2D image representations. These images are then analyzed using deep learning techniques to detect anomalies. Transforming the data into 2D form enhances the ability to capture both spatial features and temporal patterns. Deep learning models (particularly CNNs) have demonstrated strong performance in image classification tasks. When monitoring data is visualized, these models can effectively employ their pattern recognition capabilities to identify abnormal behavior. Moreover, image-based data augmentation techniques, such as rotation, scaling, and translation, can be applied to the generated images. These augmentations increase the diversity and volume of training samples, therefore improving the robustness and accuracy of the models in detecting anomalies [73,74].
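For illustration, the sketch below defines a small CNN over 64 x 64 grayscale images of the kind produced by such conversions; it assumes PyTorch, and the layer sizes and number of anomaly classes are illustrative assumptions.

```python
# Minimal sketch of a small CNN classifying 2-D grayscale images derived from
# monitoring segments (normal vs. several anomaly classes).
import torch
import torch.nn as nn

class AnomalyCNN(nn.Module):
    def __init__(self, n_classes: int = 7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 64 -> 32
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32 -> 16
        )
        self.classifier = nn.Linear(32 * 16 * 16, n_classes)

    def forward(self, x):                              # x: (batch, 1, 64, 64)
        return self.classifier(self.features(x).flatten(1))

model = AnomalyCNN()
images = torch.randn(4, 1, 64, 64)                     # e.g., batches of converted segments
print(model(images).shape)                             # (4, 7) class logits
```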
Table 13 summarizes recent studies that employ 2D image-based methods for anomaly detection in structural health monitoring (SHM). Each entry includes the dataset details, bridge type, sensor data used, categories of detected anomalies, and key performance evaluation metrics. The “Dataset” column specifies the data used for training and testing. The “Bridge” column indicates the structural form (e.g., cable-stayed, railway, general), while the “Sensor Data” column refers to the source of input, such as acceleration or multi-sensor combinations. The “Anomaly Type” column lists specific forms of data irregularity, including missing values, outliers, trends, drift, frequency-domain confusion (FDC), and time-frequency confusion (TFC). The final two columns describe the neural network architecture and the methodology used to convert monitoring data into 2D images, respectively.
The performance metrics are abbreviated as follows: Precision (P) measures the ratio of correct anomaly predictions to total predicted anomalies; Recall (R) indicates the proportion of actual anomalies correctly identified; F1 is the harmonic mean of precision and recall; and Accuracy (A) reflects the overall correctness of the model. As the table shows, most models achieved high accuracy (often exceeding 95%), and several studies reported strong recall and F1 scores, indicating reliable detection capabilities across a wide range of fault categories and bridge types.
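In standard notation, where TP, FP, TN, and FN denote true positives, false positives, true negatives, and false negatives, these metrics are defined as P = TP/(TP + FP), R = TP/(TP + FN), F1 = 2PR/(P + R), and A = (TP + TN)/(TP + TN + FP + FN).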
Table 13 presents a comparative overview of recent studies employing 2D image-based techniques for anomaly detection in structural health monitoring of bridges. A key differentiator among these studies is the dataset size, which directly influences model generalization and robustness. For instance, Ref. [57] utilizes the largest dataset with 675,432 samples, integrating multi-sensor inputs such as acceleration, strain, displacement, humidity, and temperature. In contrast, studies like [46,47] rely on smaller datasets of 7200 samples, focusing solely on acceleration data. Mid-sized datasets, such as those used by [35,38,48] (54,720 samples), strike a balance between computational feasibility and anomaly diversity.
Architecturally, CNNs dominate the landscape, appearing in 11 out of 13 studies. However, hybrid models such as Generative Adversarial Networks (GANs) (e.g., [49,55]) and ensemble networks [36] demonstrate promising performance, particularly in handling complex anomaly types. Methodologically, grayscale conversion and FFT (Fast Fourier Transform) are the most commonly used techniques for transforming time-series data into 2D images. Notably, RGB encoding [57] and adversarial networks [55] are associated with higher accuracy scores, suggesting that richer image representations and generative modeling may enhance detection capabilities. In terms of anomaly coverage, ref. [57] stands out by addressing six distinct types, including mutation and environmental anomalies, thanks to its multi-sensor input. Studies such as [46,49] also cover a broad spectrum of anomalies, including time-frequency confusion (TFC), spikes, and constant shifts. Conversely, refs. [38,47] focus on fewer anomaly types, primarily within the frequency and trend domains.
Performance metrics reveal trade-offs. Ref. [55] achieves the highest reported accuracy (99.1%), employing a CNN-GAN hybrid with adversarial training. Refs. [37,58] report balanced precision and recall (both at 95%), indicating consistent detection across anomaly classes. Meanwhile, ref. [53] prioritizes precision (>85.48%) but suffers from lower recall (>68.18%), suggesting a conservative detection strategy that may miss subtle anomalies. Ref. [57], despite its complex input and broad anomaly coverage, reports slightly lower accuracy (93.28%) but maintains high precision (93.76%) and F1-score (0.94), reflecting its robustness across diverse conditions.
Overall, the comparative analysis highlights that studies integrating larger datasets, multi-sensor inputs, and advanced image encoding techniques tend to achieve superior performance. While CNN remains the backbone of most architectures, the inclusion of generative models and ensemble strategies offers meaningful improvements in anomaly detection accuracy and generalizability. While 2D image-based input techniques are effective for anomaly detection, they come with notable limitations. Converting 1D time-series data into 2D images can lead to information loss, especially for continuous signals where subtle patterns may be obscured. The transformation process may also introduce noise due to quantization errors. Additionally, image representations increase computational complexity, requiring more advanced models and longer training times. Finally, the nonlinear conversion reduces interpretability, making it harder to trace how original signals relate to detected anomalies.

4.3.2. Hybrid Input Classes

To address the limitations of 2D image input classes, researchers have adopted alternative input formats. Table 14 presents a summary of relevant studies. The second column indicates the dataset used for training and testing. The third column specifies the type of bridge evaluated. The fourth column describes the input data format analyzed in each study. The fifth column identifies the anomaly types targeted by the models. The sixth and seventh columns outline the neural network architecture and the methodology employed. The final column reports the performance of each method in detecting abnormal data.
Most studies in Table 14 use CNNs, making up over 85% of the methods reviewed. Some use ensemble CNNs (e.g., [36]), while others combine CNNs with GANs (e.g., [55]). A few adopt autoencoder-based GANs (e.g., [49]). These models show growing interest in using generative techniques to improve anomaly detection. Feedforward Networks (FFNs) are used in [18] as a simpler option. They perform well when paired with well-designed input features. In addition to the model architecture type, input transformation methods also vary. FFT is the most common (e.g., [35]). Other methods include grayscale encoding (e.g., [37,48]), Gramian Angular Fields (GAF) (e.g., [47]), and RGB encoding (e.g., [57]). These techniques affect how well models capture spatial and temporal features, and they influence sensitivity to signal distortions.
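As an illustration of one such transformation, the following minimal sketch (a generic Gramian Angular Summation Field, not taken from any specific reviewed implementation) encodes a 1D window into a 2D matrix whose off-diagonal entries capture temporal correlations:

```python
import numpy as np

def gramian_angular_field(x: np.ndarray) -> np.ndarray:
    """Gramian Angular Summation Field of a 1-D series.

    Values are min-max rescaled to [-1, 1], mapped to polar angles
    phi = arccos(x), and the field entry (i, j) is cos(phi_i + phi_j),
    so each pixel reflects the interaction of two time steps.
    """
    lo, hi = x.min(), x.max()
    x_s = np.zeros_like(x, dtype=float) if hi == lo else 2 * (x - lo) / (hi - lo) - 1
    x_s = np.clip(x_s, -1.0, 1.0)          # guard against rounding outside [-1, 1]
    phi = np.arccos(x_s)
    return np.cos(phi[:, None] + phi[None, :])

# Hypothetical usage: a 128-sample strain window -> 128 x 128 GASF image
t = np.linspace(0, 4 * np.pi, 128)
gasf = gramian_angular_field(np.sin(t))
print(gasf.shape)  # (128, 128)
```

Because each window produces an n by n matrix, the memory cost grows quadratically with window length, which is why such encodings are in practice typically computed over relatively short windows.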
The scope of anomaly detection differs significantly across studies. The works in [46,57] detect up to six anomaly types, including mutation, drift, and time–frequency confusion (TFC), indicating high temporal and spectral sensitivity. Ref. [18] also covers six types, such as square and trend anomalies, while [37,47] identify five types, focusing on signal distortions and frequency shifts. The work in [49] includes unique anomalies like spikes and constant shifts, suggesting enhanced temporal resolution through GAF encoding and adversarial reconstruction. Notably, models employing richer input formats (e.g., RGB, GAF) tend to support broader anomaly detection capabilities, while grayscale-based models often focus on fewer, well-defined categories.
Performance metrics reveal that GAN-enhanced models (e.g., [55]) achieve the highest accuracy (99.1%), likely due to their ability to model complex data distributions. CNN-based models using grayscale or FFT inputs (e.g., [37,58]) demonstrate balanced precision and recall (95%) with accuracy exceeding 98%. Ref. [46] reports the highest F1 score (0.97), indicating strong overall detection capability. FFN-based models (e.g., [18]) remain competitive, particularly when paired with feature engineering, though they may lack temporal depth. Models like [53] show high accuracy (98.94%) but lower recall (>68.18%), suggesting a precision-biased detection strategy. Overall, the choice of input transformation and model architecture significantly influences anomaly coverage and detection performance.
In conclusion, methods based on alternative (hybrid) input classes offer an effective solution for anomaly detection in SHM. They overcome key limitations of 2D image-based approaches by preserving signal features and maintaining a direct link to the original time-series data. The reviewed studies show strong performance across different bridge types and sensor inputs. Most models achieve high accuracy, with consistent precision, recall, and F1 scores. Combining statistical feature extraction with classification improves interpretability. These methods are also computationally efficient and suitable for real-time monitoring. Overall, hybrid-input techniques provide a reliable and scalable approach for fault detection in civil infrastructure.

5. Comparative Evaluation of Various Anomaly Detection Methods

To enhance the analytical depth of this review, we present a comparative synthesis of the three main anomaly detection categories: distance-based, predictive, and image-processing methods. This synthesis is structured around five key dimensions: robustness, scalability, real-world deployment feasibility, interpretability, and data dependency. These dimensions are important because they define the practical value, reliability, and adaptability of anomaly detection techniques in bridge SHM applications. Table 15 presents a comparative synthesis of anomaly detection methods used in bridge SHM, evaluated across these dimensions. Neural networks and image-based approaches demonstrate high robustness and scalability, but their deployment is often constrained by computational demands and limited interpretability. Bayesian models offer strong robustness and interpretability through probabilistic reasoning, though their scalability is moderate and performance depends on prior assumptions. Regression models strike a balance between transparency and efficiency, making them suitable for embedded systems, but they require well-calibrated data and lack uncertainty modeling. Distance-based methods are lightweight and interpretable, ideal for edge deployment, yet they struggle with scalability and sensitivity to feature engineering. Hybrid models emerge as a promising alternative, combining structured inputs and statistical features to enhance robustness, scalability, and interpretability while reducing data dependency. This comparative framework supports informed method selection based on operational constraints and monitoring objectives.

5.1. Robustness

Robustness is a model’s ability to perform reliably under difficult conditions. It includes noisy sensor data, missing values, environmental changes, or unexpected structural shifts. A robust anomaly detection method can still identify faults accurately, even when the input data is imperfect or unpredictable. Neural network-based predictive models [14,15,32,40,41,43,44,45] and image-processing techniques [35,36,37,38,46,47,48,49,53,55,57,58,59] show high resilience to noise and complex fault patterns. Most studies report F1-scores above 0.90, confirming their strong detection performance. Neural networks show strong robustness in detecting various fault types. These include missing data, drift, trend shifts, and outliers [14,15,44,45]. They generalize well across noisy and incomplete datasets. For example, recall and F1 scores reach 0.9753 in [44] and exceed 0.96 in [45]. However, their robustness may decline under extreme class imbalance or when exposed to unseen fault types without retraining [32]. Image processing techniques also demonstrate high robustness. They effectively detect faults such as drift, missing data, gain shifts, and frequency-domain confusion. Studies like [46,58] report F1 scores of 0.95 or higher, even under noisy conditions. Methods that combine statistical features or multivariate descriptors [51,52,60] further improve robustness. These techniques capture subtle and compound anomalies across multiple sensor channels.
Complementing the aforementioned data-driven approaches, Bayesian models [13,17] are robust due to their probabilistic reasoning and adaptive learning. Xu et al. [17] highlight their ability to model uncertainty using probability density functions and certainty indices. This improves fault detection under noisy or incomplete sensor conditions. Bayesian models can update beliefs with new data, making them resilient in dynamic environments where structural behavior changes over time. However, their performance depends heavily on the accuracy of prior distributions. They may struggle with abrupt structural changes or non-stationary conditions that differ significantly from learned patterns [17,65]. Regression models, especially time-series variants like SARIMA and Gaussian Process Regression, show moderate robustness. They can capture evolving structural patterns and filter out noise, as demonstrated by Zhu et al. [16]. Yet, their performance may decline when faced with sudden changes or anomalies that deviate from historical trends. Unlike Bayesian models, most regression methods lack built-in uncertainty quantification (Gaussian Process Regression being a notable exception). This limits their resilience in highly variable environments.
Finally, distance-based methods (MCD [31] and KNN [30]) are sensitive to threshold selection and environmental variability, limiting their robustness in high-dimensional SHM datasets. MCD provides comparatively robust multivariate outlier detection by minimizing the influence of noise and corrupted data, which is particularly effective in identifying low-quality acceleration signals [31]. On the other hand, KNN is sensitive to temporal variability and environmental fluctuations. Its performance degrades when anomalies are subtle or when the data distribution shifts, as it lacks intrinsic mechanisms for uncertainty quantification.
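For illustration, the following minimal sketch (scikit-learn's MinCovDet on synthetic placeholder features; the threshold is arbitrary and not taken from [31]) shows how a robust Mahalanobis distance can flag corrupted multivariate windows:

```python
import numpy as np
from sklearn.covariance import MinCovDet

# Hypothetical multivariate SHM features: rows = time windows,
# columns = per-channel statistics (e.g., RMS of three accelerometers).
rng = np.random.default_rng(1)
features = rng.normal(0.0, 1.0, size=(500, 3))
features[:5] += 6.0                      # inject a few corrupted windows

mcd = MinCovDet(random_state=0).fit(features)
d2 = mcd.mahalanobis(features)           # squared robust Mahalanobis distances

# Flag windows whose robust distance exceeds an empirical cutoff;
# the 99th-percentile threshold here is purely illustrative.
threshold = np.quantile(d2, 0.99)
print(np.where(d2 > threshold)[0][:10])
```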
To summarize, robustness in anomaly detection varies across methods. Neural networks and image processing techniques offer high resilience, consistently achieving strong performance under noisy and complex conditions. Bayesian models excel in dynamic environments due to their uncertainty modeling but depend on accurate priors. Regression models show moderate robustness, though their lack of uncertainty quantification limits adaptability. Distance-based methods like MCD and KNN are more sensitive to environmental variability and threshold settings, making them less reliable in high-dimensional SHM contexts. Overall, method selection should align with data complexity and variability, with hybrid approaches offering balanced performance.

5.2. Scalability

Scalability refers to how well a detection method handles increasing data volume, sensor diversity, or system complexity without significant loss in performance or efficiency. In SHM, scalable models can process data from large sensor networks or multiple bridge structures while maintaining acceptable computational speed and accuracy. Two-dimensional image-based approaches are highly scalable due to parallel processing and GPU acceleration [37,38,59]. Hybrid input image processing techniques that avoid full image conversion and use structured features [39,54] further reduce computational overhead. These methods are well-suited for large-scale deployments, especially in resource-constrained environments.
Predictive models also scale well with data volume. Neural networks, in particular, can handle multivariate inputs such as GPS, cable tension, and acceleration [40,41]. However, deep architectures like CNN-LSTM require significant memory and processing power [43], which may limit their use in real-time or embedded SHM systems. Additionally, retraining may be necessary under non-stationary conditions [16,42]. Bayesian models offer moderate scalability. They perform efficiently on low- to mid-dimensional datasets [13,17]. Yet their computational cost increases with multivariate sensor networks, making them less practical for large-scale SHM applications. Regression methods are generally scalable and computationally efficient. Linear and SARIMA models [42] are suitable for large-scale deployments. Gaussian Process Regression, while accurate, becomes resource-intensive as data volume grows due to kernel matrix operations. Distance-based methods like MCD and KNN are lightweight and scale well with small to medium-sized datasets [30,31]. Their non-parametric nature avoids complex training. However, in high-dimensional sensor networks, scalability suffers. Distance metrics lose discriminative power, and interpretability declines due to the curse of dimensionality.
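To make the Gaussian Process trade-off concrete, the following minimal sketch (scikit-learn on synthetic temperature and strain data; a hypothetical setup rather than the procedure of [16]) fits a predictive band whose training cost grows roughly cubically with the number of samples:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Hypothetical example: predict strain from temperature and flag readings
# far outside the predictive band. The data are synthetic placeholders.
rng = np.random.default_rng(3)
temp = np.sort(rng.uniform(-5, 35, 200))[:, None]
strain = 0.8 * temp.ravel() + rng.normal(0, 1.0, 200)

gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gpr.fit(temp, strain)                    # cost scales roughly O(n^3) in samples

mean, std = gpr.predict(temp, return_std=True)
outliers = np.abs(strain - mean) > 3 * std   # illustrative 3-sigma rule
print("flagged:", int(outliers.sum()))
```

Sparse or inducing-point approximations are the common remedy when sample counts grow large.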
In summary, image-based and neural network methods offer strong scalability but demand high computational resources. Bayesian and regression models provide a balance between efficiency and performance, especially in mid-scale deployments. Distance-based methods are best suited for simpler SHM systems. Selecting the right approach depends on the scale of monitoring, available resources, and the complexity of sensor data. Hybrid multi-modal frameworks may offer the most practical solution for scalable, real-time SHM.

5.3. Real-World Deployment

Real-world deployment feasibility refers to how practical it is to implement a detection method in operational SHM systems. Key considerations include computational cost, latency, hardware compatibility (e.g., edge devices), and ease of integration with existing infrastructure. A feasible model should deliver timely results with minimal resource usage and remain robust under continuous monitoring. Distance-based techniques [30,31] are well-suited for edge deployment. Their simplicity and low resource demands allow operation on embedded systems with limited memory and processing power. However, they lack adaptability to changing conditions, often requiring manual recalibration. These methods are ideal for static environments or preliminary anomaly screening.
Predictive models including regression [16,42], Bayesian [13,17], and neural networks [14,15,32,40,41,43,44,45] offer a balance between performance and feasibility. Regression models are lightweight and transparent, making them highly suitable for embedded systems. For example, Zhu et al. [16] demonstrated real-time monitoring with fast training and inference, supporting continuous bridge assessment. Bayesian models also show strong feasibility. Zhang et al. [13] validated 600 data points in 0.69 s, with a latency of just 0.024 s per measurement. Their low storage needs and transparent structure make them practical for resource-constrained environments. Neural networks, once trained, offer fast inference and can be integrated into real-time pipelines [40,44]. Full-scale deployments have achieved high accuracy (e.g., 99.98% in [41]). However, training and model updates are resource-intensive. Deployment on edge devices may require compression or pruning to meet hardware limits [43].
Image-processing methods, while accurate, often demand high-resolution sensors and significant computational power [35,36,37,38,46,47,48,49,53,55,57,58,59]. This limits their feasibility in embedded SHM systems. Nonetheless, 2D models have shown success in full-scale deployments with high accuracy (e.g., 99.1% in [55], 98.94% in [53]). Pre-processing and storage demands, however, pose challenges for edge integration. Hybrid input models offer a more deployment-friendly alternative. By reducing transformation complexity and maintaining lighter model footprints, they improve feasibility in constrained environments [54,60].
To summarize, feasibility in real-world SHM deployment depends on balancing performance with resource constraints. Distance-based and regression models are highly practical for embedded systems. Bayesian approaches offer efficient, transparent solutions for moderate-scale deployments. Neural networks and hybrid input models provide high accuracy but require optimization for edge compatibility. Image-based methods, while powerful, face limitations in resource-constrained settings. Ultimately, the choice of method should align with system scale, hardware availability, and operational demands.

5.4. Interpretability

Interpretability refers to how easily engineers and decision-makers can understand and trust a model’s outputs. In anomaly detection, interpretability ensures that alerts are traceable and decisions are explainable. This is especially important in safety-critical domains like bridge monitoring. Regression-based models offer high interpretability. Their mathematical transparency allows users to trace predictions back to input variables. This supports diagnostic clarity and makes them ideal for safety-sensitive SHM applications. Bayesian frameworks also excel in interpretability. Their probabilistic reasoning and transparent inference mechanisms help engineers understand how anomalies are detected. Confidence levels can be quantified, supporting informed decision-making [17]. Compared to neural networks, Bayesian models provide clearer diagnostic pathways. Neural networks generally function as black-box models. Tracing the influence of specific input features is difficult, which limits transparency in safety-critical applications. Some studies have introduced attention mechanisms and feature attribution to improve interpretability [40]. These efforts enhance explainability but do not fully resolve the challenge.
Distance-based methods are intuitive and easy to understand. Anomalies are flagged based on deviation from expected patterns or proximity thresholds [30,31]. However, they lack deeper diagnostic capabilities and cannot explain the root cause beyond statistical deviation. Their interpretability is high, but their explanatory depth is limited. Image-based models also face interpretability issues. The link between raw sensor data and anomaly decisions is often obscured. Saliency maps and attention mechanisms have been used to improve transparency [51]. However, pure 2D approaches remain difficult to validate in engineering contexts. Hybrid input image processing models offer a more interpretable alternative. They preserve signal fidelity and allow partial traceability through structured inputs and feature attribution [39,52]. These models strike a balance between performance and transparency, making them more suitable for engineering decision support.
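As a concrete illustration of the feature-attribution idea discussed above, the following minimal sketch (synthetic window-level features and labels; hypothetical and not drawn from any reviewed study) trains a small random forest and reports two complementary importance measures that engineers can inspect:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Hypothetical window-level features (e.g., mean, std, peak factor, band
# energy) with synthetic anomaly labels; real SHM features would replace these.
rng = np.random.default_rng(2)
X = rng.normal(size=(400, 4))
y = (X[:, 1] + 0.5 * X[:, 3] + 0.2 * rng.normal(size=400) > 1.0).astype(int)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances come for free with the ensemble; permutation
# importances provide a model-agnostic cross-check of which inputs drive alarms.
print("impurity importance:   ", np.round(clf.feature_importances_, 3))
result = permutation_importance(clf, X, y, n_repeats=10, random_state=0)
print("permutation importance:", np.round(result.importances_mean, 3))
```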
Interpretability varies widely across SHM anomaly detection methods. Regression and Bayesian models provide clear, traceable logic and are highly suitable for safety-critical decisions. Distance-based methods offer intuitive reasoning but lack diagnostic depth. Neural networks and image-based models require additional mechanisms to improve transparency. Hybrid input approaches present a promising middle ground, combining performance with explainable outputs. Ultimately, the choice of method should consider the need for diagnostic clarity alongside detection accuracy.

5.5. Data Dependency

Data dependency refers to how much a model’s performance relies on the quantity, quality, and diversity of input data. Highly data-dependent models often require extensive labeled datasets and consistent sensor readings. In contrast, models with low data dependency can operate effectively even with sparse, noisy, or partially missing data. Predictive and image-based methods typically demand large volumes of labeled data. This requirement poses challenges in SHM contexts where annotated datasets are limited. Neural networks perform well when trained on rich and diverse data [44,45]. However, they are highly data-dependent. Limited fault types or insufficient labels can lead to overfitting and poor generalization [32]. To address this, data augmentation and transfer learning are commonly used.
Bayesian models offer greater flexibility in data-scarce environments. They can incorporate prior knowledge to compensate for limited labeled data [65]. Still, their accuracy depends on sensor quality and the validity of prior assumptions. In non-stationary or sparse datasets, performance may degrade unless the model is carefully tuned or embedded within multi-modal frameworks. Regression models rely on historical patterns and statistical consistency. They perform well with high-quality data but are sensitive to missing or imbalanced inputs. Unlike Bayesian models, regression techniques cannot leverage prior knowledge. Their effectiveness declines in sparse environments unless supported by pre-processing or hybrid multi-modal strategies. Distance-based methods are less data-intensive. They require minimal labeled data and can operate in unsupervised settings. This makes them suitable for early-stage SHM systems. However, their performance depends on well-calibrated thresholds and feature selection. Poor pre-processing can lead to false positives or missed detections. Overall, they exhibit low data dependency but high sensitivity to input quality. Pure 2D image-based models demand extensive labeled datasets. They often struggle with rare fault types and imbalanced data distributions [59]. Their performance is tightly linked to the diversity and quality of training data. Hybrid input models offer a more adaptive solution. Techniques like transfer learning and domain adaptation [55,58] reduce reliance on labeled data. These strategies enable generalization across bridge types and improve robustness in varied SHM scenarios.
In summary, data dependency varies significantly across detection methods. Neural networks and image-based models are powerful but require rich datasets. Bayesian and regression models offer more transparency but depend on data quality and consistency. Distance-based methods are lightweight and suitable for sparse settings, though they are sensitive to feature engineering. Hybrid input approaches provide a balanced solution, reducing data dependency and enhancing generalization. Understanding these trade-offs helps researchers select appropriate methods based on data availability, system constraints, and monitoring goals.

6. Challenges and Future Research Directions

Despite notable progress in abnormal data detection methods for bridge SHM, several important issues remain unresolved. These challenges span multiple domains, including, but not limited to, computational, methodological, and application-driven concerns. Consequently, they open up many opportunities for future research.

6.1. Key Challenges in Abnormal Data Detection for SHM Systems

The challenges in abnormal data detection in the bridge SHM process are related to the scalability, interpretability, robustness, and practicality of current models when deployed in dynamic environments with uncertain and noisy data.
  • Computational Complexity and Real-Time Limitations: Deep learning-based image processing techniques have shown excellent accuracy in detecting difficult and complex anomalies in SHM systems [32,38,57,59,60]. These methods often use large neural networks such as CNNs, which require significant computing power. As a result, running these models on low-power devices such as edge systems or embedded hardware becomes very challenging. This limits their use in real-time applications, where fast responses are necessary to prevent serious damage or failure. Studies such as [13,16] have tried to reduce this delay by using faster algorithms, but the trade-off between speed and detection accuracy remains a major issue. Therefore, designing lightweight models that can work efficiently on limited hardware without sacrificing performance is still a big challenge.
  • Lack of Interpretability: Neural networks have demonstrated strong capabilities in capturing complex patterns within SHM datasets [14,32]. However, their decision-making processes are often opaque, functioning as “black-box” models with limited interpretability. When an anomaly is detected, it is unclear which input features contributed to the decision or how the internal representations led to the output. This lack of transparency poses challenges in safety-critical contexts such as bridge monitoring, where understanding the rationale behind alerts is essential. In SHM applications, particularly during emergency evaluations or maintenance planning, practitioners require interpretable outputs to support timely and reliable decision-making. Consequently, despite their high detection accuracy, the limited explainability of neural networks remains a major limitation, hindering their broader adoption in operational SHM systems.
  • Under-utilization of Multivariate and Domain Analysis: Many SHM studies rely on data from a single type of sensor, such as acceleration or strain. However, combining data from different sensor types, such as temperature, displacement, and humidity, can provide a more complete picture of a bridge’s condition. Still, only a small number of studies have used this approach in their detection systems [17,57,60]. In addition, most studies analyze data only in the time domain. Time-domain methods are simple and fast, but they can miss important features that show up only when data is transformed into other forms. Frequency-domain and time–frequency-domain techniques, such as the Fourier Transform or Wavelet Transform, can uncover hidden or subtle faults that are not obvious in raw signals. These methods are very useful, especially for detecting early or small-scale changes in structures. However, they are not widely applied in current research [46,58]. As a result, valuable insights may be lost, and some types of damage may go undetected (a minimal frequency-domain feature sketch is provided after this list).
  • Class Imbalance and Fault Diversity: Many SHM studies rely on datasets that contain a large number of samples for common fault types, but very few instances of rare yet critical faults [32,39,48]. This imbalance skews model learning, leading to poor detection of minority classes. Models trained on frequent faults often misclassify or miss rare anomalies, undermining reliability (especially in emergency scenarios). While a few reviewed studies have attempted to address class imbalance using techniques such as synthetic oversampling, cost-sensitive learning, or data augmentation, these approaches are not consistently applied across the literature. For instance, Qu et al. [55] used attention mechanisms and small sample augmentation to boost minority class recall. Similarly, Du et al. [37] designed a CNN framework that explicitly handles imbalance under limited data. Mao et al. [49] applied GANs and autoencoders to improve detection in skewed datasets. However, these strategies are not widely adopted. Most studies report overall metrics without fault-type breakdowns, masking gains in rare fault detection. Moreover, performance metrics rarely highlight improvements for minority classes. Accuracy and F1 scores are often aggregated, limiting visibility into fault-specific performance. Fault diversity is another concern. Many models target specific fault types and fail to detect others such as bias, drift, gain errors, or environmental noise [15,31]. This narrow focus reduces generalization across bridge types and conditions. Addressing both imbalance and fault diversity is key to building robust, transferable SHM systems.
  • Data Quality and Labeling Constraints: Training supervised learning models requires large amounts of labeled data. However, in SHM systems, especially for rare or unusual anomalies, labeled data is often very limited [37,49]. This makes it hard for the models to learn effectively and detect these uncommon but important faults. Another challenge is that labeling data by hand is time-consuming, costly, and can sometimes introduce errors. It also requires expert knowledge, which is not always available. As a result, many datasets remain partially labeled or entirely unlabeled. To solve this issue, more research is focusing on unsupervised and semi-supervised learning approaches. These methods can learn patterns from unlabeled data or from just a small number of labeled samples. This makes them more practical for SHM where obtaining labeled data is difficult or expensive [37,49].
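As referenced in the multivariate and domain analysis item above, the following minimal Python sketch (synthetic signal and arbitrary band edges; an illustration rather than a method taken from the reviewed studies) shows how inexpensive frequency-domain features can complement time-domain statistics:

```python
import numpy as np

def band_energy_features(signal: np.ndarray, fs: float,
                         bands=((0, 2), (2, 8), (8, 20))) -> np.ndarray:
    """Summarize a sensor window by its relative energy in a few frequency bands.

    A drifting or noisy channel changes how energy is distributed across
    bands even when time-domain statistics look normal, which is the kind
    of fault that purely time-domain pipelines can miss.
    """
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    total = spectrum.sum() + 1e-12
    return np.array([spectrum[(freqs >= lo) & (freqs < hi)].sum() / total
                     for lo, hi in bands])

# Hypothetical usage: a 20-second acceleration window sampled at 50 Hz
fs = 50.0
t = np.arange(0, 20, 1 / fs)
window = np.sin(2 * np.pi * 1.5 * t) + 0.05 * np.random.default_rng(4).standard_normal(t.size)
print(np.round(band_energy_features(window, fs), 3))
```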
In summary, SHM systems are advancing quickly. This progress is driven by developments in deep learning and signal processing. Despite these gains, current techniques still face important limitations. High computational demands make real-time deployment difficult. Many models lack interpretability, which reduces trust in critical scenarios. Multivariate and domain-specific analyses are still underexplored. Fault imbalance and poorly labeled data also hinder performance. These challenges pose serious barriers to reliable and scalable SHM systems. To overcome them, future methods must be more efficient, transparent, and adaptive. Such improvements are essential for building trustworthy SHM systems that can protect critical infrastructure.

6.2. Future Research Directions

The growing complexity of SHM systems and the increasing demand for real-time, accurate anomaly detection call for more advanced and flexible research strategies. The following future research directions highlight key opportunities to enhance the performance, scalability, and trustworthiness of SHM solutions across diverse bridge monitoring applications.
  • Multi-modal and Adaptive Frameworks: No single method can handle all types of anomalies in SHM data. Statistical, distance-based, predictive, and image-based techniques each offer unique advantages. Distance-based methods are simple and interpretable. However, they often fail with high-dimensional or large-scale datasets. Deep learning models can detect complex patterns, but they require labeled data and high computational resources. Hybrid methods combine multiple techniques to balance these trade-offs. For example, integrating CNNs with handcrafted statistical features improves detection accuracy. It also keeps the model lightweight and easier to interpret [32,39]. Sensor fusion further enhances performance. It merges data from different sources such as strain, vibration, and temperature sensors. This fusion captures complementary features and reduces false positives. Transfer learning is also essential. It helps models adapt to new bridges or sensor layouts. Domain adaptation techniques allow anomaly detectors to generalize without retraining [59]. This reduces deployment time and improves scalability. Future SHM systems should be both multi-modal and adaptive. They must select or combine methods based on data type, resource constraints, and latency requirements. This adaptability ensures robust and efficient anomaly detection in diverse bridge environments.
  • Lightweight and Explainable Models: While many deep learning models (CNNs, LSTM etc.) exhibit strong performance in detecting anomalies, they often require substantial computational resources. Consequently, it is difficult to deploy these models on embedded or low-power devices commonly used in field-based bridge monitoring. To address this, researchers have explored model compression techniques such as pruning (removing redundant neurons or layers) and quantization (reducing the numerical precision of weights and activations). These methods reduce model size and computational load without significantly compromising accuracy [52,54]. In addition to model compression, another challenge is interpretability. In safety-critical applications like bridge SHM, the lack of transparency (interpretability) can hinder trust and adoption. It is important for engineers to understand why a model flagged a particular anomaly and which features contributed most to the decision. Therefore, explainability is essential, not only for validation and debugging but also for regulatory compliance and operational confidence. Techniques such as saliency maps, layer-wise relevance propagation, and SHAP (SHapley Additive exPlanations) are increasingly being used to visualize and interpret model behavior [69,70]. In addition to improving interpretability, ensemble machine learning models offer practical advantages in SHM. These models combine multiple learners (such as decision trees, support vector machines, or neural networks) to improve robustness and generalization [75]. Ensemble approaches are particularly valuable in SHM, where sensor data can be noisy, heterogeneous, and context-dependent. By employing diverse model architectures and learning strategies, ensemble methods can mitigate overfitting, improve fault tolerance, and provide more stable outputs. Moreover, some ensemble models, like Random Forests or Gradient Boosting Machines, offer built-in feature importance metrics, which contribute to interpretability and help engineers understand which sensor inputs are most influential.
  • Multi-modal and Multivariate Fusion: In many bridge SHM systems, researchers rely on a single type of sensor data. Common examples include acceleration or strain signals (as shown in different tables of Section 4 in this manuscript). While these can detect certain faults, they often miss more complex issues. Using multiple sensor types together, known as multi-modal fusion, offers a deeper view of the bridge’s condition. For instance, combining acceleration, strain, temperature, and displacement data allows for richer insights. This approach helps uncover hidden or subtle anomalies that may not appear in single-sensor readings. To manage this large volume of data, multivariate analysis techniques are useful. Methods like principal component analysis (PCA) and independent component analysis (ICA) reduce dimensionality and highlight the features most relevant for fault detection (see the sketch after this list). Studies show that combining sensor fusion with multivariate analysis improves detection accuracy. It also enhances reliability in identifying structural problems [17,30]. Despite its promise, this approach is not yet common in SHM research. There is a need for better algorithms that can handle diverse data sources. Researchers must also develop user-friendly frameworks that support real-time analysis of multi-modal and multivariate data.
  • Domain Adaptation and Transfer Learning: In many SHM projects, models are trained using data from one specific bridge. These models often perform well on the original structure. However, when applied to a different bridge, their accuracy usually drops. This happens because the new data may contain unfamiliar patterns, features, or noise levels. The model has not seen this type of data before, so it struggles to make correct predictions. Transfer learning offers a solution to this problem. It allows a model trained on one bridge to be reused on another bridge. This process requires little or no extra training. As a result, it saves time and reduces the need for large labeled datasets in each new deployment. Several recent studies have shown that this method works well in SHM applications [57,58,59]. Another useful method is domain adaptation. It helps the model understand both the original and new data. This is achieved by aligning the data distributions between the two bridges. When the distributions are similar, the model can perform better on both. Together, transfer learning and domain adaptation make SHM systems more flexible. They help models generalize across different environments. This reduces the need to collect and label new data for every bridge. It also makes it easier to deploy AI-based monitoring systems in real-world conditions.
  • Robust Detection under Noise and Uncertainty: In real-world SHM systems, sensor data often contains noise or missing information due to weather, communication problems, or sensor faults. This makes it hard for models to correctly detect true anomalies. If a model is too sensitive, it may raise false alarms. If it is not sensitive enough, it may miss important faults. To handle this, future SHM systems should use probabilistic models that can deal with uncertainty in the data. For example, Bayesian models can estimate how confident the system is when it labels a data point as an anomaly [17]. They can also update their decisions as new data arrives, making them more flexible. Other approaches, such as Gaussian process models, can provide not just predictions but also a measure of uncertainty in those predictions [16]. This helps engineers better understand whether a warning is strong evidence of failure or just a weak signal. Similarly, newer methods try to combine uncertainty estimation with deep learning models to improve reliability under noisy conditions [52]. Using these ideas, future frameworks can become more robust and trustworthy, even when the input data is incomplete or unreliable.
  • Benchmark Datasets and Standardized Evaluation: One of the key challenges in SHM research is the lack of open and well-annotated datasets. Most studies rely on private datasets collected from specific bridges, which are not shared with the research community. This makes it hard to compare different methods fairly and slows down progress in the field. Publicly available datasets that cover various bridge types, fault categories, and environmental conditions would allow researchers to develop, test, and improve their methods on common ground [39]. In addition to datasets, there is also a need for standardized evaluation metrics. Right now, different studies use different ways to measure accuracy, precision, recall, and other performance indicators. This makes it difficult to judge which method is actually better or more reliable. For example, some works report only accuracy, while others use more detailed metrics such as F1-score or processing time [40]. Having a common set of benchmarks and evaluation criteria would help the community perform consistent comparisons and drive progress toward more dependable SHM systems.
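As referenced in the multi-modal and multivariate fusion item above, the following minimal sketch (scikit-learn PCA on a synthetic fused feature matrix; purely illustrative) reduces stacked multi-sensor features and derives a simple reconstruction-error anomaly score:

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical fused feature matrix: each row stacks statistics from
# acceleration, strain, temperature, and displacement channels for one
# time window. The data below are synthetic placeholders.
rng = np.random.default_rng(5)
latent = rng.normal(size=(300, 2))                  # two underlying factors
mixing = rng.normal(size=(2, 8))
fused = latent @ mixing + 0.1 * rng.normal(size=(300, 8))

pca = PCA(n_components=0.95)                        # keep 95% of the variance
reduced = pca.fit_transform(fused)
print(reduced.shape, np.round(pca.explained_variance_ratio_, 3))

# A simple anomaly score: reconstruction error in the discarded subspace.
recon = pca.inverse_transform(reduced)
score = np.linalg.norm(fused - recon, axis=1)
print("windows with largest residuals:", np.argsort(score)[-5:])
```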
In conclusion, advancing SHM research requires a multi-faceted approach that combines the strengths of various detection strategies. Future systems should not only be accurate but also adaptable, explainable, and lightweight enough for real-world deployment. Emphasizing multi-modal frameworks, improved uncertainty handling, transferable models, and standardized evaluation protocols will be essential for meeting these goals. Together, these research directions will pave the way for smarter, safer, and more resilient bridge health monitoring systems.

7. Answers to Formulated Research Questions and Limitations of the Research

This section presents the findings of the SLR by addressing the four research questions that guided the study. Each question explores a critical dimension of abnormal data detection in SHM, including method prevalence, real-time capability, performance across bridge types, and key challenges. The answers in Section 7.1 are supported by evidence from the 36 selected studies and corresponding summary tables. Following this, Section 7.2 outlines the methodological limitations of the review process, offering transparency and identifying areas for future refinement.

7.1. Answers to Formulated Research Questions

Research question 1: How frequently are different abnormal data detection techniques used in SHM studies, and which method dominates current research?
Answer: Among the 36 reviewed studies, image processing methods are the most prevalent in abnormal data detection for SHM systems in bridges. As shown in Table 4, 22 studies utilized image-based approaches, particularly those involving deep learning and two-dimensional visual transformations. Predictive methods were applied in 12 studies, while distance-based methods were used in only 2 studies. Section 5 further contextualizes these findings through a comparative analysis across five dimensions: robustness, scalability, deployment feasibility, interpretability, and data dependency. This evaluation reinforces the trend toward deep learning while underscoring the trade-offs that researchers must consider in real-world SHM applications.
Research question 2: How do various abnormal data detection methods perform in terms of real-time capability, analysis domain, and multivariate analysis?
Answer: Based on the findings presented in Section 3 of the systematic review, abnormal data detection methods in bridge SHM systems show varied performance across real-time capability, analysis domain, and multivariate analysis. Real-time capability remains significantly underutilized, with only 11 out of 36 studies explicitly addressing this dimension. Most deep learning-based image processing methods, while highly accurate, suffer from computational complexity that limits their suitability for real-time deployment. In terms of analysis domain, the majority of studies rely on time-domain features, with only a few exploring frequency or time-frequency domains despite their potential for enhancing fault detection. This indicates a narrow application of domain-specific techniques. Regarding multivariate analysis, only eight studies incorporated multiple sensor inputs or feature dimensions, suggesting that many current approaches still operate on simplified or univariate data representations. Overall, Section 3 highlights that while detection accuracy has improved, real-time responsiveness, domain diversity, and multivariate integration remain critical areas for advancement in SHM anomaly detection research.
Research question 3: How do anomaly detection methods for bridge SHM compare in terms of detection performance across different fault types and study contexts?
Answer: Image processing methods, as detailed in Table 13 and Table 14, demonstrated high accuracy (often exceeding 95%) and effectively detected a wide range of fault types, including drift, missing values, trends, and noise. However, these methods face practical limitations due to their computational demands and potential information loss during transformation. Predictive models, summarized in Table 10, Table 11 and Table 12, offered a balanced trade-off between interpretability and performance. For example, Bayesian models achieved an accuracy of 98.96% (Table 10), while LSTM-based neural networks showed strong precision and recall, particularly in detecting drift and outliers (Table 12). In contrast, distance-based methods (Table 9) were computationally simple but exhibited limited generalization. They were primarily effective for detecting clear outliers and less suitable for complex or subtle anomalies.
Research question 4: What are the key challenges in abnormal data detection, and how can emerging advancements improve detection accuracy in future research?
Answer: The key challenges identified in Section 6 of the article include computational complexity, the need for high-performance models, lack of interpretability, under-utilization of sensor fusion and frequency-domain analysis, and issues related to imbalanced or poorly labeled data. To address these limitations, the article recommends the development of multi-modal frameworks that combine multiple detection paradigms, the adoption of lightweight and explainable models suitable for embedded deployment, and the expansion of transfer learning and domain adaptation techniques to improve cross-bridge generalization. Additionally, it emphasizes the need to create public benchmark datasets that include diverse fault types and standardized evaluation protocols.

7.2. Limitations of the Research

Although this SLR followed a rigorous and clearly defined methodology, several inherent limitations in the research process require attention:
  • Search Process: We utilized defined search terms across selected databases and applied systematic filtering. Nevertheless, thousands of results made exhaustive screening infeasible. Additionally, article exclusion based solely on titles may have omitted relevant studies with non-explicit titles.
  • Databases Selection: While our study considered eight highly regarded databases (IEEE, Springer, Elsevier, SAGE, MDPI, Wiley, Tech Science Press, Techno-Press), we acknowledge the possibility of overlooking pertinent work indexed elsewhere. Nonetheless, due to the breadth and prestige of the selected repositories, we believe the findings of this SLR remain representative and impactful.
  • Scope and Selection Limitations: The number of selected papers may seem small compared to the growing research in SHM and AI. However, this review focused on quality rather than quantity. Many studies were excluded because they lacked strong validation or repeated similar ideas. We also limited our scope to bridge-specific SHM to keep the review focused. Future reviews can expand the coverage by including more sources and using broader criteria. This will help capture newer trends and emerging techniques. Despite these limitations, the selected corpus of studies offers a comprehensive and credible foundation for evaluating abnormal data detection in SHM systems. Acknowledging these constraints also highlights valuable opportunities for further meta-analytical exploration and deeper cross-database synthesis in future research.

8. Conclusions

This SLR examined 36 peer-reviewed studies published between 2020 and 2025. The studies were sourced from eight major databases and focused on abnormal data detection techniques used in bridge SHM systems. The findings were organized using a four-dimensional taxonomy. These dimensions included real-time capability, multivariate analysis, signal domain, and detection methods. The SLR grouped existing detection methods into three main categories: distance-based methods, predictive models, and image processing approaches. Among these, image processing methods, especially those using deep learning, showed the highest accuracy in identifying complex data anomalies. Predictive models provided a balance between interpretability and performance. Distance-based methods were simple but had limited scalability. Subsequently, a comparative analysis of these three detection paradigms was conducted across five key dimensions: robustness, scalability, deployment feasibility, interpretability, and data dependency. This analysis revealed the strengths and limitations of each detection paradigm in practical SHM applications. Unlike previous reviews, which primarily focused on broad SHM applications or isolated algorithmic benchmarks, our framework targeted bridge-specific challenges such as multivariate sensor fusion and real-time constraints.
Despite recent progress, the SLR identified several research gaps. Real-time detection and multivariate analysis remain underexplored. Only 11 studies addressed real-time capability, and just 8 considered multivariate analysis. Domain-specific techniques such as frequency or time-frequency analysis are rarely applied outside the time domain, even though they are useful for fault detection. The review also highlighted some major challenges, including limited interpretability of deep models, computational demands restricting real-time use, imbalance in fault types, and a shortage of well-labeled data.
To address these issues, future research should focus on adaptive multi-modal frameworks, lightweight and interpretable AI models, and sensor fusion with advanced domain analysis. Transfer learning can enhance cross-bridge adaptability, while benchmark datasets and standardized protocols are vital for consistent evaluation. By offering a clear synthesis of current methods, performance trade-offs, and future priorities, this SLR provides actionable guidance for improving abnormal data detection in bridge monitoring systems.

Author Contributions

Conceptualization, O.S.S. and M.R.; investigation, O.S.S.; writing—original draft preparation, O.S.S. and M.R.; writing—review and editing, M.R.; supervision, M.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

  1. Varghese, A.M.; Pradhan, R.P. Transportation infrastructure and economic growth: Does there exist causality and spillover? A Systematic Review and Research Agenda. Transp. Res. Procedia 2025, 82, 2618–2632. [Google Scholar] [CrossRef]
  2. Faris, N.; Zayed, T.; Fares, A. Review of Condition Rating and Deterioration Modeling Approaches for Concrete Bridges. Buildings 2025, 15, 219. [Google Scholar] [CrossRef]
  3. Azhar, A.S.; Kudus, S.A.; Jamadin, A.; Mustaffa, N.K.; Sugiura, K. Recent vibration-based structural health monitoring on steel bridges: Systematic literature review. Ain Shams Eng. J. 2024, 15, 102501. [Google Scholar] [CrossRef]
  4. Gharehbaghi, V.R.; Noroozinejad Farsangi, E.; Noori, M.; Yang, T.; Li, S.; Nguyen, A.; Málaga-Chuquitaype, C.; Gardoni, P.; Mirjalili, S. A critical review on structural health monitoring: Definitions, methods, and perspectives. Arch. Comput. Methods Eng. 2022, 29, 2209–2235. [Google Scholar] [CrossRef]
  5. He, Z.; Li, W.; Salehi, H.; Zhang, H.; Zhou, H.; Jiao, P. Integrated structural health monitoring in bridge engineering. Autom. Constr. 2022, 136, 104168. [Google Scholar] [CrossRef]
  6. Brighenti, F.; Caspani, V.F.; Costa, G.; Giordano, P.F.; Limongelli, M.P.; Zonta, D. Bridge management systems: A review on current practice in a digitizing world. Eng. Struct. 2024, 321, 118971. [Google Scholar] [CrossRef]
  7. Deng, Y.; Zhao, Y.; Ju, H.; Yi, T.H.; Li, A. Abnormal data detection for structural health monitoring: State-of-the-art review. Dev. Built Environ. 2024, 17, 100337. [Google Scholar] [CrossRef]
  8. Sonbul, O.S.; Rashid, M. Algorithms and Techniques for the Structural Health Monitoring of Bridges: Systematic Literature Review. Sensors 2023, 23, 4230. [Google Scholar] [CrossRef]
  9. Rashid, M.; Sonbul, O.S. Towards the Structural Health Monitoring of Bridges Using Wireless Sensor Networks: A Systematic Study. Sensors 2023, 23, 8468. [Google Scholar] [CrossRef]
  10. Qu, C.; Zhang, H.; Zhang, R.; Zou, S.; Huang, L.; Li, H. Multiclass Anomaly Detection of Bridge Monitoring Data with Data Migration between Different Bridges for Balancing Data. Appl. Sci. 2023, 13, 7635. [Google Scholar] [CrossRef]
  11. Choi, K.; Yi, J.; Park, C.; Yoon, S. Deep Learning for Anomaly Detection in Time-Series Data: Review, Analysis, and Guidelines. IEEE Access 2021, 9, 120043–120065. [Google Scholar] [CrossRef]
  12. Mejri, N.; Lopez-Fuentes, L.; Roy, K.; Chernakov, P.; Ghorbel, E.; Aouada, D. Unsupervised anomaly detection in time-series: An extensive evaluation and analysis of state-of-the-art methods. Expert Syst. Appl. 2024, 256, 124922. Available online: https://www.sciencedirect.com/science/article/pii/S0957417424017895 (accessed on 28 September 2025). [CrossRef]
  13. Zhang, Y.M.; Wang, H.; Wan, H.P.; Mao, J.X.; Xu, Y.C. Anomaly detection of structural health monitoring data using the maximum likelihood estimation-based Bayesian dynamic linear model. Struct. Health Monit. 2021, 20, 2936–2952. [Google Scholar] [CrossRef]
  14. Zhang, Y.; Lei, Y. Data Anomaly Detection of Bridge Structures Using Convolutional Neural Network Based on Structural Vibration Signals. Symmetry 2021, 13, 1186. [Google Scholar] [CrossRef]
  15. Zhang, J.; Zhang, J.; Wu, Z. Long-Short Term Memory Network-Based Monitoring Data Anomaly Detection of a Long-Span Suspension Bridge. Sensors 2022, 22, 6045. [Google Scholar] [CrossRef] [PubMed]
  16. Zhu, Y.C.; Zheng, Y.W.; Xiong, W.; Li, J.X.; Cai, C.S.; Jiang, C. Online Bridge Structural Condition Assessment Based on the Gaussian Process: A Representative Data Selection and Performance Warning Strategy. Struct. Control Health Monit. 2024, 2024, 5579734. [Google Scholar] [CrossRef]
  17. Xu, X.; Forde, M.C.; Ren, Y.; Huang, Q.; Liu, B. Multi-index probabilistic anomaly detection for large span bridges using Bayesian estimation and evidential reasoning. Struct. Health Monit. 2023, 22, 948–965. [Google Scholar] [CrossRef]
  18. Gao, K.; Chen, Z.D.; Weng, S.; Zhu, H.p.; Wu, L.Y. Detection of multi-type data anomaly for structural health monitoring using pattern recognition neural network. Smart Struct. Syst. 2022, 29, 129–140. [Google Scholar] [CrossRef]
  19. Fan, Z.; Tang, X.; Chen, Y.; Ren, Y.; Deng, C.; Wang, Z.; Peng, Y.; Shi, C.; Huang, Q. Review of anomaly detection in large span bridges: Available methods, recent advancements and future trends. Adv. Bridge Eng. 2024, 5, 2. [Google Scholar] [CrossRef]
  20. Ayadi, A.; Ghorbel, O.; Obeid, A.M.; Abid, M. Outlier detection approaches for wireless sensor networks: A survey. Comput. Netw. 2017, 129, 319–333. [Google Scholar] [CrossRef]
  21. Makhoul, N. Review of data quality indicators and metrics, and suggestions for indicators and metrics for structural health monitoring. Adv. Bridge Eng. 2022, 3, 17. [Google Scholar] [CrossRef]
  22. Shahrivar, F.; Sidiq, A.; Mahmoodian, M.; Jayasinghe, S.; Sun, Z.; Setunge, S. AI-based bridge maintenance management: A comprehensive review. Artif. Intell. Rev. 2025, 58, 135. [Google Scholar] [CrossRef]
  23. Kitchenham, B. Procedures for Performing Systematic Reviews; Keele University: Keele, UK, 2004; p. 33. [Google Scholar]
  24. Rashid, M.; Anwar, M.W.; Khan, A.M. Toward the tools selection in model based system engineering for embedded systems—A systematic literature review. J. Syst. Softw. 2015, 106, 150–163. [Google Scholar] [CrossRef]
  25. Rashid, M.; Imran, M.; Jafri, A.R.; Al-Somani, T.F. Flexible architectures for cryptographic algorithms—A systematic literature review. J. Circuits, Syst. Comput. 2019, 28, 1930003. [Google Scholar] [CrossRef]
  26. Imran, M.; Bashir, F.; Jafri, A.R.; Rashid, M.; ul Islam, M.N. A systematic review of scalable hardware architectures for pattern matching in network security. Comput. Electr. Eng. 2021, 92, 107169. [Google Scholar] [CrossRef]
  27. Tasadduq, I.A.; Rashid, M. Toward Intelligent Underwater Acoustic Systems: Systematic Insights into Channel Estimation and Modulation Methods. Electronics 2025, 14, 2953. [Google Scholar] [CrossRef]
  28. Chandola, V.; Banerjee, A.; Kumar, V. Anomaly Detection: A Survey. ACM Comput. Surv. (CSUR) 2009, 41, 1–58. [Google Scholar] [CrossRef]
  29. Knorr, E.M.; Ng, R.T.; Tucakov, V. Distance-based outliers: Algorithms and applications. VLDB J. 2000, 8, 237–253. [Google Scholar] [CrossRef]
  30. Lei, Z.; Zhu, L.; Fang, Y.; Li, X.; Liu, B. Anomaly detection of bridge health monitoring data based on KNN algorithm. J. Intell. Fuzzy Syst. 2020, 39, 5243–5252. [Google Scholar] [CrossRef]
  31. Jeong, S.; Jin, S.S.; Sim, S.H. Modal Property-Based Data Anomaly Detection Method for Autonomous Stay-Cable Monitoring System in Cable-Stayed Bridges. Struct. Control Health Monit. 2024, 2024, 8565150. [Google Scholar] [CrossRef]
  32. Zhang, H.; Lin, J.; Hua, J.; Gao, F.; Tong, T. Data Anomaly Detection for Bridge SHM Based on CNN Combined with Statistic Features. J. Nondestruct. Eval. 2022, 41, 28. [Google Scholar] [CrossRef]
  33. Yuen, K.V.; Ortiz, G.A. Outlier detection and robust regression for correlated data. Comput. Methods Appl. Mech. Eng. 2017, 313, 632–646. [Google Scholar] [CrossRef]
  34. Zhang, X.; Zhou, W. Structural Vibration Data Anomaly Detection Based on Multiple Feature Information Using CNN-LSTM Model. Struct. Control Health Monit. 2023, 2023, 3906180. [Google Scholar] [CrossRef]
  35. Shajihan, A.; Wang, S.; Zhai, G.; Spencer, B. CNN based data anomaly detection using multi-channel imagery for structural health monitoring. Smart Struct. Syst. 2022, 29, 181–193. [Google Scholar] [CrossRef]
  36. Chou, J.Y.; Fu, Y.; Huang, S.K.; Chang, C.M. SHM data anomaly classification using machine learning strategies: A comparative study. Smart Struct. Syst. 2022, 29, 77–91. [Google Scholar] [CrossRef]
  37. Du, Y.; Li, L.; Hou, R.; Wang, X.; Tian, W.; Xia, Y. Convolutional Neural Network-based Data Anomaly Detection Considering Class Imbalance with Limited Data. Smart Struct. Syst. 2022, 29, 63–75. [Google Scholar] [CrossRef]
  38. Liu, G.; Niu, Y.; Zhao, W.; Duan, Y.; Shu, J. Data anomaly detection for structural health monitoring using a combination network of GANomaly and CNN. Smart Struct. Syst. 2022, 29, 53–62. [Google Scholar] [CrossRef]
  39. Zhang, Y.; Tang, Z.; Yang, R. Data anomaly detection for structural health monitoring by multi-view representation based on local binary patterns. Measurement 2022, 202, 111804. [Google Scholar] [CrossRef]
  40. Yang, K.; Ding, Y.; Jiang, H.; Zhao, H.; Luo, G. A two-stage data cleansing method for bridge global positioning system monitoring data based on bi-direction long and short term memory anomaly identification and conditional generative adversarial networks data repair. Struct. Control Health Monit. 2022, 29, e2993. [Google Scholar] [CrossRef]
  41. Son, H.; Jang, Y.; Kim, S.E.; Kim, D.; Park, J.W. Deep Learning-Based Anomaly Detection to Classify Inaccurate Data and Damaged Condition of a Cable-Stayed Bridge. IEEE Access 2021, 9, 3100419. [Google Scholar] [CrossRef]
  42. Qu, B.; Liao, P.; Huang, Y. Outlier Detection and Forecasting for Bridge Health Monitoring Based on Time Series Intervention Analysis. Struct. Durab. Health Monit. 2022, 16, 323–341. [Google Scholar] [CrossRef]
  43. Kim, S.Y.; Mukhiddinov, M. Data Anomaly Detection for Structural Health Monitoring Based on a Convolutional Neural Network. Sensors 2023, 23, 8525. [Google Scholar] [CrossRef]
  44. Ni, F.; Zhang, J.; Noori, M.N. Deep learning for data anomaly detection and data compression of a long-span suspension bridge. Comput.-Aided Civ. Infrastruct. Eng. 2020, 35, 685–700. [Google Scholar] [CrossRef]
  45. Yang, J.; Liu, D.; Zhao, L.; Yang, X.; Li, R.; Jiang, S.; Li, J. Improved stochastic configuration network for bridge damage and anomaly detection using long-term monitoring data. Inf. Sci. 2025, 700, 121831. [Google Scholar] [CrossRef]
  46. Deng, Y.; Ju, H.; Zhong, G.; Li, A.; Ding, Y. A general data quality evaluation framework for dynamic response monitoring of long-span bridges. Mech. Syst. Signal Process. 2023, 200, 110514. [Google Scholar] [CrossRef]
  47. Deng, Y.; Ju, H.; Zhong, G.; Li, A. Data quality evaluation for bridge structural health monitoring based on deep learning and frequency-domain information. Struct. Health Monit. 2023, 22, 2925–2947. [Google Scholar] [CrossRef]
  48. Zhao, M.; Sadhu, A.; Capretz, M. Multiclass anomaly detection in imbalanced structural health monitoring data using convolutional neural network. J. Infrastruct. Preserv. Resil. 2022, 3, 10. [Google Scholar] [CrossRef]
  49. Mao, J.; Wang, H.; Spencer Jr., B.F. Toward data anomaly detection for automated structural health monitoring: Exploiting generative adversarial nets and autoencoders. Struct. Health Monit. 2021, 20, 1609–1626. [Google Scholar] [CrossRef]
  50. Jian, X.; Zhong, H.; Xia, Y.; Sun, L. Faulty data detection and classification for bridge structural health monitoring via statistical and deep-learning approach. Struct. Control Health Monit. 2021, 28, e2824. [Google Scholar] [CrossRef]
  51. Lei, X.; Xia, Y.; Wang, A.; Jian, X.; Zhong, H.; Sun, L. Mutual information based anomaly detection of monitoring data with attention mechanism and residual learning. Mech. Syst. Signal Process. 2023, 182, 109607. [Google Scholar] [CrossRef]
  52. Xu, J.; Dang, D.; Qian, M.; Liu, X.; Han, Q. A novel and robust data anomaly detection framework using LAL-AdaBoost for structural health monitoring. J. Civ. Struct. Health Monit. 2022, 12, 305–321. [Google Scholar] [CrossRef]
  53. Hao, C.; Gong, Y.; Liu, B.; Pan, Z.; Sun, W.; Li, Y.; Zhuo, Y.; Ma, Y.; Zhang, L. Data anomaly detection for structural health monitoring using the Mixture of Bridge Experts. Structures 2025, 71, 108039. [Google Scholar] [CrossRef]
  54. Wang, L.; Kang, J.; Zhang, W.; Hu, J.; Wang, K.; Wang, D.; Yu, Z. Online diagnosis for bridge monitoring data via a machine learning-based anomaly detection method. Measurement 2025, 245, 116587. [Google Scholar] [CrossRef]
  55. Qu, C.X.; Yang, Y.T.; Zhang, H.M.; Yi, T.H.; Li, H.N. Two-stage anomaly detection for imbalanced bridge data by attention mechanism optimisation and small sample augmentation. Eng. Struct. 2025, 327, 119613. [Google Scholar] [CrossRef]
  56. Zhang, Y.; Wang, X.; Wu, W.; Xia, Y. Anomaly detection of sensor faults and extreme events by anomaly locating strategies and convolutional autoencoders. Struct. Health Monit. 2025. [Google Scholar] [CrossRef]
  57. Pan, Q.; Bao, Y.; Li, H. Transfer learning-based data anomaly detection for structural health monitoring. Struct. Health Monit. 2023, 22, 3077–3091. [Google Scholar] [CrossRef]
  58. Qu, C.X.; Zhang, H.M.; Yi, T.H.; Pang, Z.Y.; Li, H.N. Anomaly detection of massive bridge monitoring data through multiple transfer learning with adaptively setting hyperparameters. Eng. Struct. 2024, 314, 118404. [Google Scholar] [CrossRef]
  59. Wang, X.; Wu, W.; Du, Y.; Cao, J.; Chen, Q.; Xia, Y. Wireless IoT Monitoring System in Hong Kong–Zhuhai–Macao Bridge and Edge Computing for Anomaly Detection. IEEE Internet Things J. 2024, 11, 4763–4774. [Google Scholar] [CrossRef]
  60. Kang, J.; Wang, L.; Zhang, W.; Hu, J.; Chen, X.; Wang, D.; Yu, Z. Effective alerting for bridge monitoring via a machine learning-based anomaly detection method. Struct. Health Monit. 2024. [Google Scholar] [CrossRef]
  61. Arif, M.; Rashid, M. A Literature Review on Model Conversion, Inference, and Learning Strategies in EdgeML with TinyML Deployment. Comput. Mater. Contin. 2025, 83, 13–64. [Google Scholar] [CrossRef]
  62. Beale, C.; Niezrecki, C.; Inalpolat, M. An adaptive wavelet packet denoising algorithm for enhanced active acoustic damage detection from wind turbine blades. Mech. Syst. Signal Process. 2020, 142, 106754. [Google Scholar] [CrossRef]
  63. Nikkhoo, A.; Karegar, H.; Mohammadi, R.K.; Hajirasouliha, I. An acceleration-based approach for crack localisation in beams subjected to moving oscillators. J. Vib. Control 2021, 27, 489–501. [Google Scholar] [CrossRef]
  64. Hou, Z.; Hera, A.; Noori, M. Wavelet-based techniques for structural health monitoring. In Health Assessment of Engineered Structures: Bridges, Buildings and Other Infrastructures; World Scientific: Singapore, 2013; pp. 179–202. [Google Scholar]
  65. Moghaddass, R.; Sheng, S. An anomaly detection framework for dynamic systems using a Bayesian hierarchical framework. Appl. Energy 2019, 240, 561–582. [Google Scholar] [CrossRef]
  66. Wan, H.P.; Ni, Y.Q. Bayesian Modeling Approach for Forecast of Structural Stress Response Using Structural Health Monitoring Data. J. Struct. Eng. 2018, 144, 04018130. [Google Scholar] [CrossRef]
  67. Pang, J.; Liu, D.; Peng, Y.; Peng, X. Anomaly detection based on uncertainty fusion for univariate monitoring series. Measurement 2017, 95, 280–292. [Google Scholar] [CrossRef]
  68. Kim, C.; Lee, J.; Kim, R.; Park, Y.; Kang, J. DeepNAP: Deep neural anomaly pre-detection in a semiconductor fab. Inf. Sci. 2018, 457–458, 1–11. [Google Scholar] [CrossRef]
  69. Gramegna, A.; Giudici, P. SHAP and LIME: An Evaluation of Discriminative Power in Credit Risk. Front. Artif. Intell. 2021, 4, 752558. [Google Scholar] [CrossRef]
  70. Mane, D.; Magar, A.; Khode, O.; Koli, S.; Bhat, K.; Korade, P. Unlocking Machine Learning Model Decisions: A Comparative Analysis of LIME and SHAP for Enhanced Interpretability. J. Electr. Syst. 2024, 20, 1252–1267. [Google Scholar] [CrossRef]
  71. German, S.; Brilakis, I.; DesRoches, R. Rapid entropy-based detection and properties measurement of concrete spalling with machine vision for post-earthquake safety assessments. Adv. Eng. Inform. 2012, 26, 846–858. [Google Scholar] [CrossRef]
  72. Kabir, S. Imaging-based detection of AAR induced map-crack damage in concrete structure. NDT E Int. 2010, 43, 461–469. [Google Scholar] [CrossRef]
  73. Mumuni, A.; Mumuni, F. Data augmentation: A comprehensive survey of modern approaches. Array 2022, 16, 100258. [Google Scholar] [CrossRef]
  74. Shorten, C.; Khoshgoftaar, T.M. A survey on Image Data Augmentation for Deep Learning. J. Big Data 2019, 6, 60. [Google Scholar] [CrossRef]
  75. Kazemi, F.; Asgarkhani, N.; Jankowski, R. Optimization-based stacked machine-learning method for seismic probability and risk assessment of reinforced concrete shear walls. Expert Syst. Appl. 2024, 255, 124897. [Google Scholar] [CrossRef]
Figure 1. Structural health monitoring workflow incorporating data anomaly detection.
Figure 2. Research framework illustrating database sources and classification of selected studies.
Figure 3. Taxonomy of anomaly detection methods in structural health monitoring of bridges.
Figure 4. Year-wise distribution of selected research articles from WoS-indexed journals (2020–2025), illustrating publication trends and the evolving focus on abnormal data detection methods in SHM for bridges.
Figure 5. Search process of the systematic literature review highlighting the sequential filtering stages from initial results to final study selection.
Table 1. Summary of existing reviews: focus areas and key limitations.

| Ref. | Year | Focus | Limitations |
|---|---|---|---|
| [20] | 2017 | Outlier detection in wireless sensor networks using statistical methods | Lacks AI, real-time analysis, multivariate support, and domain-level assessment |
| [11] | 2021 | DL-based anomaly detection in time series with benchmark and training analysis | No statistical–DL integration; overlooks computational efficiency and real-time deployment |
| [21] | 2022 | SHM data quality indicators and a generic evaluation framework | Lacks AI-driven enhancements and detailed performance analysis across different methods |
| [7] | 2024 | Taxonomy and evaluation of anomaly detection methods for SHM | Lacks emphasis on various models and AI-driven approaches for enhanced accuracy |
| [19] | 2024 | Anomaly detection in large-span bridges through structural metrics | Lacks AI integration, detailed performance analysis, and uncertainty management strategies |
| [22] | 2025 | AI applications in bridge maintenance, highlighting efficiency and sustainability | Lacks coverage of multi-sensor fusion and real-time AI deployment in large-scale monitoring |
Table 2. Inclusion and exclusion criteria with rationale.

| Criterion | Rationale |
|---|---|
| Subject Relevance | Ensures that selected studies directly address anomaly detection in bridge SHM and contribute to the research questions. |
| Publication Date (2020–2025) | Focuses the review on recent advancements and excludes outdated methodologies that may not reflect current trends. |
| Publisher | Limits selection to reputable sources indexed in eight major scientific databases (IEEE, Springer, Elsevier, SAGE, MDPI, Wiley, Tech Science Press, Techno-Press) to ensure quality. |
| Impactful Contributions | Prioritizes studies that demonstrate practical relevance and deployment potential for abnormal data detection in bridge SHM. |
| Results-Oriented | Filters out studies lacking empirical validation or rigorous experimentation, ensuring reliability of reported findings. |
| Avoid Repetition | Prevents duplication by selecting only one representative study in cases of overlapping research within the same context. |
Table 3. Search terms and results under AND/OR operators. Abbreviations: Spr. = Springer, Els. = Elsevier, TSP = Tech Science Press, TP = Techno-Press.
Search TermsOp.IEEESpr.Els.SAGEWileyMDPITSPTP
‘Bridges’ ‘SHM’ ‘Anomaly detection’AND1216115946323855
OR11312949852328913592256
‘Bridges’ ‘SHM’ ‘Data cleansing’AND193313212474381169
OR739438741224511245254
‘Bridges’ ‘SHM’ ‘Abnormal Data detection’AND48521952101497223126
OR152164487475321174111295
‘Bridges’ ‘SHM’ ‘Abnormal Data detection’AND69833142642129322164
OR312364952865658352163648
‘Bridges’ ‘SHM’ ‘Outlier detection’AND18262161698736854
OR13218554353240419859354
‘Bridges’ ‘SHM’ ‘Data quality management’AND54742272141627234147
OR12723667863254525867497
‘Bridges’ ‘SHM’ ‘Anomalous data detection’AND28522071821326514106
OR263325529654552365125516
Table 4. Classification and statistical overview of abnormal data detection methods.

| Detection Method | No. of Studies | Cited References |
|---|---|---|
| Distance-Based | 2 | [30,31] |
| Predictive | 12 | [13,14,15,16,17,32,40,41,42,43,44,45] |
| Image Processing | 22 | [18,34,35,36,37,38,39,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60] |
Table 5. Prevalence of real-time capability in SHM systems.

| Real-Time Capability | No. of Studies | Associated References |
|---|---|---|
| Yes | 11 | [13,15,16,18,40,42,43,53,54,59,60] |
| No | 25 | [14,17,30,31,32,34,35,36,37,38,39,41,44,45,46,47,48,49,50,51,52,55,56,57,58] |
Table 6. Performance of real-time anomaly detection methods (NG = not given).

| Ref. | Method | Accuracy (%) | Latency (s) | HW | Key Highlights |
|---|---|---|---|---|---|
| [13] | BDLM | 98.96 | 0.024 | NG | Subspace detection with adaptive thresholding |
| [15] | LSTM | >93 | NG | NG | Correlation coefficient with double-threshold method |
| [16] | GPR | 96 | 0.10 | CPU | Representative data selection with operational variations |
| [18] | PRNN | 96.4 | NG | NG | Feature extraction from long time-series data |
| [43] | CNN | 97.6 | NG | NG | Layer-wise training with gradual refinement |
| [40] | Bi-LSTM + cGAN | 95 to 98 | NG | NG | GPS data cleansing across multi-sensor datasets |
| [42] | SARIMA | 95 | NG | NG | Forecasting outlier effects for early warnings |
| [53] | Multiple models | 99.35 | 0.145 | NG | Combination of MobileViT and the BST |
| [54] | 2D time series | 98 | NG | Server | Multi-channel data conversion |
| [59] | Transfer learning | 98 | 0.3 | Edge | Edge-based domain-adaptive anomaly detection |
| [60] | Neural network | >95 | NG | NG | Real-time alerting |
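For context on the latency figures reported in Table 6, the following minimal Python sketch shows how per-sample processing latency can be measured for a simple streaming detector. The rolling-median rule, window size, and threshold are illustrative assumptions only and do not correspond to any specific method listed above.

```python
import time
import numpy as np

def stream_detector(samples, window=200, k=4.0):
    """Flag samples whose deviation from a rolling median exceeds k robust
    standard deviations; also record per-sample processing latency."""
    buf, flags, latencies = [], [], []
    for x in samples:
        t0 = time.perf_counter()
        if len(buf) >= window:
            med = np.median(buf)
            mad = np.median(np.abs(np.array(buf) - med)) + 1e-9
            flags.append(abs(x - med) > k * 1.4826 * mad)  # robust z-score rule
        else:
            flags.append(False)  # not enough history yet
        buf.append(x)
        if len(buf) > window:
            buf.pop(0)
        latencies.append(time.perf_counter() - t0)
    return np.array(flags), np.array(latencies)

# Synthetic acceleration-like signal with two injected spikes
rng = np.random.default_rng(0)
signal = rng.normal(0, 0.01, 5000)
signal[[1200, 3400]] += 0.3
flags, lat = stream_detector(signal)
print(f"anomalies flagged: {flags.sum()}, mean latency: {lat.mean() * 1e6:.1f} microseconds")
```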
Table 7. Categorization of abnormal data detection methods by analysis domain: time, frequency, and time–frequency techniques.

| Analysis Domain | No. of Studies | Associated References |
|---|---|---|
| Time | 25 | [13,14,15,17,18,30,32,34,35,36,38,39,40,41,42,43,44,48,49,51,55,56,57,58,60] |
| Frequency | 14 | [16,31,32,34,35,38,44,45,47,50,54,57,58,59] |
| Time–Frequency | 5 | [36,37,46,52,53] |
Table 8. Distribution of multivariate analysis usage among selected studies.

| Multivariate Usage | No. of Studies | References and Technique Category |
|---|---|---|
| Yes | 8 | [17,30,40]—Multivariate Time Series Models; [41,43,54,57]—CNN Feature-Based Learning; [60]—Multivariate Machine Learning Framework |
| No | 28 | [13,14,15,16,18,31,32,34,35,36,37,38,39,42,44,45,46,47,48,49,50,51,52,53,55,56,58,59] |
Table 9. Distance-based methods in the area of SHM for bridges.

| Ref. | Dataset | Bridge Type | Sensor Data Type | Anomaly Type | Methodology |
|---|---|---|---|---|---|
| [31] | Real bridge (60,000) | Cable-stayed | Acceleration | Low-quality, abnormal behavior | Uses MCD for distance-based anomaly detection |
| [30] | Simulation (40,000) | General | Temperature, deflection, train, humidity, displacement | Outlier | Applies KNN by measuring distances to identify anomalies |
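As a concrete illustration of the distance-based family in Table 9, the sketch below scores each sample by its mean distance to the k nearest neighbours, in the spirit of the KNN approach of [30]. The synthetic features, the value of k, and the percentile threshold are arbitrary assumptions and do not reproduce the published pipeline.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(1)
# Toy multivariate records: temperature, deflection, humidity (illustrative only)
normal = rng.normal([20.0, 5.0, 60.0], [2.0, 0.5, 5.0], size=(2000, 3))
outliers = rng.normal([20.0, 9.0, 60.0], [2.0, 0.5, 5.0], size=(10, 3))
X = np.vstack([normal, outliers])

# Anomaly score = mean distance to the k nearest neighbours
k = 10
nn = NearestNeighbors(n_neighbors=k + 1).fit(X)  # +1 because the query point itself is returned
dist, _ = nn.kneighbors(X)
score = dist[:, 1:].mean(axis=1)                 # drop the zero self-distance

threshold = np.percentile(score, 99.5)           # simple percentile threshold
print("flagged indices:", np.where(score > threshold)[0])
```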
Table 10. Bayesian methods in the area of SHM for bridges.

| Ref. | Dataset | Bridge Type | Sensor Data Type | Anomaly Type | Accuracy | Methodology |
|---|---|---|---|---|---|---|
| [13] | Real bridge (144) | Long-span cable-stayed | Acceleration, strain | Spikes, baseline shift | 98.96% | BDLM with subspace detection |
| [17] | Real bridge (245) | Large span | Anemometers, temperature sensors, anchor load cells, connected pipe | Sensor fault | – | PDFs with certainty index |
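To make the Bayesian idea in Table 10 tangible, the following sketch flags observations whose one-step-ahead innovation under a local-level dynamic linear model is unusually large. This is a deliberately simplified stand-in for the maximum likelihood-based BDLM of [13]; the noise variances and the z-score threshold are assumed values.

```python
import numpy as np

def local_level_dlm_flags(y, q=1e-4, r=1e-2, z_crit=4.0):
    """One-step-ahead prediction with a local-level dynamic linear model
    (random-walk state). An observation is flagged when its standardized
    innovation exceeds z_crit."""
    m, c = y[0], 1.0            # posterior mean / variance of the state
    flags = []
    for t in range(1, len(y)):
        a, rr = m, c + q        # prior state at time t
        f, qq = a, rr + r       # one-step-ahead forecast mean / variance
        e = y[t] - f            # innovation
        flags.append(abs(e) / np.sqrt(qq) > z_crit)
        gain = rr / qq          # Kalman gain
        m, c = a + gain * e, rr * (1 - gain)
    return np.array(flags)

rng = np.random.default_rng(2)
strain = np.cumsum(rng.normal(0, 0.01, 3000)) + rng.normal(0, 0.1, 3000)
strain[1500] += 2.0             # injected spike
print("flagged at:", np.where(local_level_dlm_flags(strain))[0])
```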
Table 11. Regression methods in the area of SHM for bridges.

| Ref. | Dataset | Bridge | Sensor Data | Anomaly Type | Methodology |
|---|---|---|---|---|---|
| [42] | Real bridge (105) | Oblique arch | Strain | Outlier | SARIMA model for anomaly detection |
| [16] | Real bridge (7048) | Long-span cable-stayed | Stress | Noise | Gaussian process regression with data selection |
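A regression-based detector typically fits a predictive model and flags points that fall outside its prediction band. The sketch below uses scikit-learn's Gaussian process regression on synthetic data as a generic example of this residual-based flagging; it does not reproduce the representative-data-selection strategy of [16] or the SARIMA intervention analysis of [42], and the kernel and 3-sigma band are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(3)
t = np.linspace(0, 10, 300)[:, None]                 # time axis (hypothetical stress record)
y = 5 * np.sin(0.8 * t.ravel()) + rng.normal(0, 0.3, 300)
y[120] += 4.0                                        # injected outlier

kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(t, y)

mu, sd = gpr.predict(t, return_std=True)
flags = np.abs(y - mu) > 3 * sd                      # points outside the 3-sigma prediction band
print("flagged indices:", np.where(flags)[0])
```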
Table 12. Neural network methods in the area of SHM for bridges (P = precision, R = recall, A = accuracy).

| Ref. | Dataset | Bridge | Sensor Data | Anomaly Type | Arch. | Methodology | P (%) | R (%) | F1 | A (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| [14] | Real bridge (28,272) | Long-span cable-stayed | Acceleration | Miss, minor, outlier, square, trend, drift | CNN | Subspace-enhanced features | 86.65 | 92.96 | 0.89 | 95 |
| [15] | Real bridge (4320) | Long-span cable-stayed | Acceleration | Outlier, minor, missing, trend, drift, break | LSTM | Dual-threshold with point-wise comparison | >90 | >92 | – | >93 |
| [40] | Real bridge (86,400) | Suspension | GPS | Miss, outlier, drift, trend | Bi-LSTM | Data cleaning by integrating multi-modals | 88.9 | 95.40 | 0.92 | 98.26 |
| [41] | Real bridge (691,200) | Long-span cable-stayed | Cable tension | Outlier | LSTM | Anomaly scores from reconstruction | 95.6 | 92.01 | 0.938 | 99.98 |
| [43] | Real bridge (54,720) | Cable-stayed | Acceleration | Missing, minor, outlier, square, trend, drift | CNN | Feature learning from acceleration | >70 | >85 | >0.77 | 97.6 |
| [44] | Real bridge (14,400) | Long-span cable-stayed | Acceleration | Abnormal data | CNN | Data compression | 97.93 | 97.13 | 0.975 | 99.15 |
| [45] | Real bridge (3000) | Twin-box girder | Acceleration | Outlier | SCN | Random node removal | >95 | >96 | >0.96 | >99 |
| [32] | Real bridge (54,720) | – | Acceleration | Missing, minor, outlier, square, trend, drift | CNN | CNN with statistical features | >72 | >85 | – | 94.26 |
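The neural network studies in Table 12 generally map fixed-length signal segments to anomaly classes. The following PyTorch sketch defines a compact 1D CNN classifier for the six frequently used classes (missing, minor, outlier, square, trend, drift); the layer sizes and segment length are illustrative assumptions and are not taken from any cited architecture.

```python
import torch
import torch.nn as nn

class AnomalyCNN(nn.Module):
    """Compact 1D CNN mapping a fixed-length acceleration segment to one of
    six anomaly classes (missing, minor, outlier, square, trend, drift)."""
    def __init__(self, n_classes=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=16, stride=2, padding=7), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=8, stride=1, padding=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(8),
        )
        self.classifier = nn.Linear(32 * 8, n_classes)

    def forward(self, x):                 # x: (batch, 1, seg_len)
        z = self.features(x)
        return self.classifier(z.flatten(1))

model = AnomalyCNN()
segments = torch.randn(4, 1, 1024)        # four random "acceleration" segments
logits = model(segments)
print(logits.shape)                       # torch.Size([4, 6])
```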
Table 13. Two-dimensional input classes for image-based methods in SHM for bridges (P = precision, R = recall, A = accuracy).

| Ref. | Dataset | Bridge | Sensor Data | Anomaly Type | Arch. | Methodology | P (%) | R (%) | F1 | A (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| [47] | Real bridge (7200) | – | Acceleration | FDC, drift, square, missing, trend | CNN | GAF and FFT | >84 | >92 | 0.94 | >96 |
| [46] | Real bridge (7200) | Long-span cable | Acceleration | TFC, drift, square, trend, missing, minor | CNN | CWT | >84 | >94 | 0.97 | 97.1 |
| [35] | Real bridge (54,720) | Long-span cable | Acceleration | Trend, drift, minor, missing, outlier, square | CNN | FFT | >92 | >92 | – | >96 |
| [36] | Real bridge (54,720) | Long-span cable | Acceleration | Trend, drift, minor, missing, outlier, square | Ensemble network | FFT | – | – | – | 97 |
| [37] | Real bridge (54,720) | Long-span cable | Acceleration | Missing, minor, outlier, square, trend, drift | CNN | Grayscale | 95 | 95 | – | 98.3 |
| [38] | Real bridge (54,720) | Long-span cable | Acceleration | Missing, minor, outlier, square, trend, drift | CNN | GAF and FFT | – | – | 0.94 | 98.2 |
| [48] | Real bridge (54,720) | Long-span cable | Acceleration | Missing, minor, outlier, square, trend, drift | CNN | Grayscale | >85 | >73 | >0.82 | 97.74 |
| [49] | Real bridge (22,320) | Long-span cable | Acceleration | Trend, shift, spikes, constant, drift | GAN and autoencoders | GAF | – | – | – | >94 |
| [53] | Real bridge (142,848) | Long-span railway and cable | Acceleration | Missing, drift, amplitude, constant | CNN | Transition field | >85.48 | >68.18 | 0.76 | 98.94 |
| [55] | Real bridge (20,160) | Large-span cable | Acceleration | Local gain, outlier, drift, missing | CNN, GAN | Adversarial network | – | >95 | – | 99.1 |
| [57] | Real bridge (675,432) | Long-span | Acceleration, strain, displacement, humidity, temperature | Missing, outlier, mutation, trend, square | CNN | RGB format | 93.76 | >89.9 | 0.94 | 93.28 |
| [58] | Real bridge (8250) | Large-span cable | Acceleration | Local gain, drift, missing, noise, outlier | CNN | Grayscale | 95 | 95 | 0.95 | 96.8 |
| [59] | Real bridge (54,720) | Long-span | Acceleration | Missing, minor, outlier, square, trend, drift | CNN | Grayscale | 78.66 | 85.5 | – | 95 |
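Several of the image-based studies above convert 1D signals into 2D inputs, for example via the Gramian angular field (GAF). The sketch below is a minimal NumPy implementation of the GAF summation variant; the normalization details and segment length are assumptions, and published pipelines typically pair such images with FFT channels and a CNN classifier.

```python
import numpy as np

def gramian_angular_field(x):
    """Encode a 1D signal as a Gramian Angular Summation Field image:
    rescale to [-1, 1], map samples to angles, then build cos(phi_i + phi_j)."""
    x = np.asarray(x, dtype=float)
    x_scaled = 2 * (x - x.min()) / (x.max() - x.min() + 1e-12) - 1.0
    phi = np.arccos(np.clip(x_scaled, -1.0, 1.0))
    return np.cos(phi[:, None] + phi[None, :])   # (N, N) image

rng = np.random.default_rng(4)
segment = np.sin(np.linspace(0, 8 * np.pi, 128)) + rng.normal(0, 0.05, 128)
image = gramian_angular_field(segment)
print(image.shape, image.min(), image.max())      # (128, 128), values in [-1, 1]
```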
Table 14. Hybrid input classes for image-based methods in the area of SHM for bridges (P = precision, R = recall, A = accuracy).

| Ref. | Dataset | Bridge | Sensor Data | Anomaly Type | Arch. | Methodology | P (%) | R (%) | F1 | A (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| [39] | Real bridge (54,720) | Long-span cable | Acceleration | Missing, minor, outlier, drift, square, trend | N/A | Multi-view binary patterns; random forest | – | – | – | 97.5 |
| [50] | Real bridge (180,000) | Large-span arch & cable | Acceleration | Missing, minor, outlier, noise, biased | CNN | Relative FDH | 88.29 | 81.54 | 0.84 | 99.39 |
| [18] | Real bridge (54,720) | Long-span cable | Acceleration | Missing, minor, outlier, drift, square, trend | FFN | Feature extraction with FFN | 90.5 | 88.07 | – | 97 |
| [51] | Real bridge (21,600) | Long-span cable | Acceleration | Missing, minor, outlier, drift, normal, biased | CNN | Mutual information correlation analysis | >97 | >97 | >0.97 | 99.45 |
| [52] | Real bridge (72,000) | Long-span cable | Acceleration | Missing, minor, outlier, drift, square, trend | Ensemble | Active learning and AdaBoost algorithm | 90.67 | 94.24 | 0.92 | 97.95 |
| [34] | Real bridge (2160) | Long-span suspension | Acceleration | Outlier, square, missing, minor | CNN-LSTM | Model integration | >96 | >96 | >0.96 | 97.87 |
| [54] | Real bridge | Box girder | Strain, displacement, vibration | Noise | – | Data conversion | 90.62 | 97.55 | 0.93 | 99.36 |
| [56] | Real bridge (54,720) | Long-span, footbridge | Acceleration | Missing, minor, outlier, drift, square, trend | Autoencoders | Encoder–decoder modeling | >99 | >99 | >0.99 | >99 |
| [60] | Real bridge (51,840) | Box girder | Strain, displacement, vibration, temperature, humidity | Outlier | Encoder–decoder | Data conversion | 92.1 | 92.4 | 0.92 | 94.9 |
Table 15. Comparative evaluation of anomaly detection methods in SHM.

| Method | Robustness | Scalability | Deployment Feasibility | Interpretability | Data Dependency |
|---|---|---|---|---|---|
| Distance-based | Moderate; threshold-sensitive | Low in high dimensions; scalable for small sets | High; edge-suitable | High; intuitive, limited depth | Low; minimal labels, feature-sensitive |
| Regression models | Moderate; data-sensitive | High; efficient at scale | High; lightweight | High; traceable mathematics | Moderate; needs calibrated input |
| Bayesian models | High; uncertainty-aware | Moderate; efficient in low-to-mid dimensions | Moderate; tunable for embedded use | High; probabilistic traceability | Moderate; priors help sparse data |
| Neural networks | High; noise-resilient | High; multivariate scalable | Moderate; fast inference, costly training | Low; black-box limits clarity | High; needs large labeled sets |
| 2D image-based | High; noise-tolerant | High; GPU-accelerated | Low; costly and preprocessing-heavy | Low; obscures raw signals | High; label diversity critical |
| Hybrid inputs | High; method-integrated | High; efficient transforms | High; edge-suitable | Moderate to high; structured inputs aid traceability | Moderate; transfer learning helps |