Investigation of Issues in Data Anomaly Detection Using Deep-Learning- and Rule-Based Classifications for Long-Term Vibration Measurements

Abstract: Structural health monitoring (SHM) systems are widely used for civil infrastructure monitoring. Data acquired from the SHM systems play an important role in assessing structural integrity and determining further maintenance activities. Considering that sensors in the SHM systems are installed in a harsh environment for long-term measurements, some sensors can malfunction and produce faulty data. Because large amounts of measured data are typically processed automatically and abnormal data can adversely affect structural assessments, identifying such abnormal data is important. This paper provides critical investigations of the automated detection of data anomalies using existing deep-learning-based classification in conjunction with a simple rule-based approach. The issues investigated in this study include (1) the presence of ambiguous data that cannot be categorized as an anomaly class in the literature, (2) information loss during the conversion of time-series data into images for the deep-learning-based approach, and (3) additional issues, such as misclassification by trained models and requirements of the threshold selection in the rule-based approach. The results of these key investigations can be utilized to develop an effective anomaly detection process.


Introduction
With the recent development of advanced sensing techniques, structural health monitoring (SHM) is considered an effective maintenance tool for civil engineering structures [1][2][3]. SHM systems were deployed in several full-scale structures for long-term monitoring [2][4][5][6]. Monitoring data from SHM systems can be used to evaluate structural performance, predict the future behavior of structures, detect damage, and assess the condition of structures [7][8][9][10][11][12]. However, the sensors in SHM systems often malfunction, resulting in various types of data anomalies. Identifying and removing anomalies before data processing are crucial for proper SHM. Thus, effective approaches for handling such anomalies in the measured data are essential.
Data-driven approaches, including statistical approaches, have been studied for identifying anomalies in SHM data [13][14][15][16][17]. A method using principal component analysis was proposed for detecting, isolating, and reconstructing faulty sensor data [18]. A Mahalanobis-distance-based algorithm was used to detect outliers in the time-series data obtained from different structures in the laboratory [19]. Based on the Gaussian process approach, the sensor anomalies were identified using the residual between the measured and expected responses [20]. Although these statistics-based approaches can achieve reasonably high performance, they are intrinsically limited to data with a single type of data anomaly.
As a result, deep learning was introduced to resolve problems associated with statistics-based identification. Recently, deep learning has been widely used for the automatic analysis of time-series data, such as prediction [21], classification [22,23], and anomaly detection [24][25][26][27][28][29][30]. One of the earliest approaches employing deep learning was the development of a deep neural network in which the inputs were images converted from measured time-history data [25]. For better classification performance, an approach using a convolutional neural network (CNN) was proposed with enhanced input information, combining images of time- and frequency-domain data [26]. Furthermore, multichannel images consisting of the time-domain response, spectrogram, and probability density function information were proposed to further increase the information in the input images for accurate anomaly detection [27]. Although data enhancement in the images used for training the neural networks was shown to improve the classification performance, the methods employing converted images have critical issues, such as ambiguous labeling of training images, imbalanced data for each class, and limited performance due to low-resolution converted images.
Therefore, investigating effective and practical anomaly detection using deep learning is necessary. Most importantly, the measured time-history data often have an intrinsic ambiguity that prevents a dataset from being classified definitively as either normal or abnormal. For example, the "Trend" category [26] indicates a data anomaly with an overall increasing or decreasing trend; because no dataset has a mean of exactly zero, data labeling involves an ambiguity that makes the deep-learning approaches difficult to apply. Furthermore, the converted images used as input to the CNN typically have a low resolution compared to the original time histories, which can lead to effects similar to quantization errors. These issues should be appropriately considered before applying image-classification-based deep-learning approaches for detecting data anomalies.
This paper presents a comparative study of two automated data anomaly detection schemes of CNN- and rule-based classifications to provide in-depth discussions regarding the important issues described previously. Herein, rule-based classification is selected because it has advantages in exploring the issues in that (1) conversion to images is unnecessary, (2) class ambiguity can be better investigated and handled, and (3) classification performance is less sensitive to false detections. This paper first defines rules for data anomaly classes typically used in the literature and prepares labeled datasets from actual long-term SHM data. Subsequently, CNN-based and rule-based classification approaches are compared to identify the characteristics and limitations of these methods for data anomaly detection.

Background: CNN-Based Classification
Among the state-of-the-art approaches for anomaly detection described in the Introduction, CNN-based classification using converted images [26] was selected for comparison. This approach involves converting both time- and frequency-domain data into images that are, in turn, used as the input to a CNN model for classifying multi-class data anomalies. A total of seven classes are considered: Missing (most or all of the data are missing), Minor (the vibration response oscillates with a small amplitude), Outlier (one or more outliers appear in data), Square (the vibration response oscillates abnormally over a range of the accelerometer), Trend (the response is non-stationary with a monotonic trend), Drift (the response is non-stationary with random drift), and Normal. This method, validated using acceleration data measured from a cable-stayed bridge in China, exhibited 94% accuracy.
In the data preprocessing step, the measured time-history data and their Fourier transforms are converted into images to prepare the input to the CNN. Image conversion includes plotting the time and frequency responses without axes and storing them as images. The single-channel images of the time- and frequency-domain data were stacked together to form a dual-channel image in which the time and frequency responses were stored in the red and green channels, respectively. The size of each dual-channel image was 100 × 100 pixels. Only the acceleration response of the bridge was considered for anomaly detection [26].
The neural network used for data anomaly detection includes typical convolutional layers. The architecture of the CNN is shown in Figure 1. The first hidden layer is a feature map with a size of 60 × 60, generated after implementing the convolutional layer with 20 filters of a size of 41 × 41 and a stride of one. The third layer is of size 30 × 30, generated after implementing a max pooling layer with a pooling size of 2 × 2 and a stride of two. To address the issues of the CNN-based approach (information loss in data-to-image conversion, intrinsic class ambiguity, and false detections), this study introduces a rule-based classification method that does not require data-to-image conversion, allows the investigation of class ambiguity, and is less sensitive to false detections. The rule-based classification is described in detail in the following section by defining the classes of data anomalies [26].
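The layer sizes quoted above follow from the standard output-size relation for convolution and pooling layers. The short check below is only a sketch: it assumes no zero padding, which the text does not state but which is consistent with the reported 60 × 60 and 30 × 30 sizes. The function name is illustrative.

```python
def output_size(input_size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Size along one axis after a convolution or pooling layer."""
    return (input_size + 2 * padding - kernel) // stride + 1


IMAGE_SIZE = 100  # input images are 100 x 100 pixels (from the text)

# 41 x 41 filters with a stride of one; zero padding is an assumption.
conv_size = output_size(IMAGE_SIZE, kernel=41, stride=1)
# 2 x 2 max pooling with a stride of two.
pool_size = output_size(conv_size, kernel=2, stride=2)

print(conv_size, pool_size)
```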

Rule-Based Classification
This study introduces a rule-based classification and compares it with the CNN-based classification to investigate the issues of intrinsic class ambiguity and information loss in data-to-image conversion. The overall process of the rule-based classification is shown in Figure 2. Each data anomaly class is defined by specific rules and algorithms that minimize the need for extensive data preprocessing, such as converting data into images. The measured time-history data were scrutinized for each class and classified into a single anomalous class upon detection. If no anomalous classes were found, the data were considered Normal. Note that the time-history data can have multiple classes of data anomalies, except for Normal. Herein, we define the data anomaly classes and provide rules for determining classes.

Definitions of Data Anomaly Classes
This study considered the same anomaly classes used previously [26] for classifying the measured time-history data, with the rules defined for each class based on prior descriptions [26]. The definitions in this section involve threshold values for which no universal choice exists. The specific values used in this study are presented in the next section, along with the actual measured data.

Missing Class
The Missing class indicates that the measured data are incomplete. Examples of the Missing class are shown in Figure 3. A simple approach comparing the number of data points in the measured data with the desired length can be used to determine the Missing class.
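A minimal sketch of this length comparison is given below. The expected length of 60,000 points corresponds to a 10 min record sampled at 100 Hz, as used later in this paper; the function name and the optional `min_fraction` tolerance are hypothetical and not part of the described method.

```python
def is_missing(samples, expected_length, min_fraction=1.0):
    """Flag a record as Missing when it is shorter than the desired length.

    min_fraction is a hypothetical tolerance: 1.0 means any shortfall counts.
    """
    return len(samples) < min_fraction * expected_length


EXPECTED_LENGTH = 10 * 60 * 100  # 10 min at 100 Hz = 60,000 points

complete_record = [0.0] * EXPECTED_LENGTH
truncated_record = [0.0] * 45_000

print(is_missing(complete_record, EXPECTED_LENGTH))   # complete record
print(is_missing(truncated_record, EXPECTED_LENGTH))  # truncated record
```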

Minor Class
The Minor class indicates data with clear quantization errors that typically occur when the response level is low compared with the resolution of the analog-to-digital converter (ADC), as shown in Figure 4. To detect the Minor class, the number of unique values in the measured structural response was counted. If the number was less than a pre-defined threshold, the data were classified as a Minor class. The threshold can be selected considering that a 24-bit ADC is common in the market.
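The unique-value count described above can be sketched as follows. The threshold used in the example is a placeholder chosen only for illustration; it is not the value used in this study.

```python
def is_minor(samples, max_unique_values):
    """Flag a record as Minor when it contains fewer unique values than the threshold."""
    return len(set(samples)) < max_unique_values


# A response confined to three quantization levels versus a finely resolved ramp.
coarse_record = [0.0, 0.1, -0.1] * 100
fine_record = [i * 0.001 for i in range(300)]

ILLUSTRATIVE_THRESHOLD = 10  # hypothetical value, for demonstration only
print(is_minor(coarse_record, ILLUSTRATIVE_THRESHOLD))
print(is_minor(fine_record, ILLUSTRATIVE_THRESHOLD))
```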

Outlier Class
The identification of the Outlier class involves data points that deviate significantly from the rest of the measured data. Examples of the Outlier class are shown in Figure 5. A statistical approach can be used to determine the Outlier class. The initial step was to detrend the data to eliminate any underlying trends (e.g., constant, linear, or higher-order). Subsequently, all the peaks in the measured data were obtained. Three conditions were considered for classifying the time-history data as Outlier:
• The magnitude of the maximum peak divided by the standard deviation of the time history is greater than a pre-defined threshold;
• The magnitude of the maximum peak divided by the mean of the time history is greater than a pre-defined threshold;
• A data segment was selected right after the maximum peak. The ratio of the standard deviation of the data segment to the standard deviation of the raw time history is less than a pre-defined threshold.
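The three conditions can be sketched as below, under several assumptions that the text leaves open: only a constant trend is removed, the "mean of the time history" is taken as the mean absolute value of the detrended record, all three conditions must hold together, and the third condition is treated as satisfied when the peak is too close to the end of the record to form a segment. All names and threshold arguments are hypothetical.

```python
from statistics import fmean, pstdev


def is_outlier(samples, peak_to_std_min, peak_to_mean_min,
               segment_std_ratio_max, segment_length):
    """Rule-based Outlier check (sketch; see the assumptions stated above)."""
    mean = fmean(samples)
    detrended = [x - mean for x in samples]          # constant detrend only
    magnitudes = [abs(x) for x in detrended]
    peak_index = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    peak = magnitudes[peak_index]

    overall_std = pstdev(detrended)
    mean_magnitude = fmean(magnitudes)               # assumed meaning of "mean"
    if overall_std == 0.0 or mean_magnitude == 0.0:
        return False                                 # constant record: no peak

    # Segment taken right after the maximum peak.
    segment = detrended[peak_index + 1:peak_index + 1 + segment_length]

    cond1 = peak / overall_std > peak_to_std_min
    cond2 = peak / mean_magnitude > peak_to_mean_min
    cond3 = len(segment) < 2 or pstdev(segment) / overall_std < segment_std_ratio_max
    return cond1 and cond2 and cond3


# Low-level oscillation with one isolated spike.
spiky = [0.01 * (-1) ** i for i in range(1000)]
spiky[500] = 5.0
print(is_outlier(spiky, 10.0, 50.0, 0.5, 100))
```

With these assumptions, a spike followed by a slowly decaying oscillation keeps a large standard deviation in the segment after the peak, so the third condition fails and the record is not flagged.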
The third condition is necessary for differentiating the impulse response from an outlier in the time-history data.

Square Class
The Square class occurs when the vibration response oscillates abnormally over the measurement range. Thus, most data points were approximately at the maximum or minimum amplitudes, resulting in a square shape, as shown in Figure 6a. Consequently, the data histogram has two distinct peaks corresponding to the maximum and minimum amplitudes, as shown in Figure 6b. The characteristics of the Square class can be utilized for classification. Consider the following criteria for determining the Square class:
• The two maximum numbers of data points in the histogram (maximum bins) divided by the total number of data points are greater than a pre-defined threshold;
• Bins with larger or smaller acceleration values than those of the maximum bins do not possess data points that exceed a pre-defined threshold.
The second criterion was added to ensure the histogram has two distinct peaks on the far left and right sides, as shown in Figure 6b.
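A possible reading of the two histogram criteria is sketched below. It assumes that the first criterion applies to the combined share of the two largest bins and that the second criterion limits the share of data lying outside those two bins; the number of bins, the thresholds, and all names are hypothetical.

```python
def histogram(samples, n_bins):
    """Equal-width histogram counts between the minimum and maximum values."""
    lo, hi = min(samples), max(samples)
    counts = [0] * n_bins
    if hi == lo:
        counts[0] = len(samples)
        return counts
    width = (hi - lo) / n_bins
    for x in samples:
        index = min(int((x - lo) / width), n_bins - 1)  # keep the maximum in the last bin
        counts[index] += 1
    return counts


def is_square(samples, peak_fraction_min, outer_fraction_max, n_bins=20):
    """Rule-based Square check (sketch; see the assumptions stated above)."""
    counts = histogram(samples, n_bins)
    total = len(samples)
    largest_two = sorted(range(n_bins), key=counts.__getitem__, reverse=True)[:2]
    left, right = sorted(largest_two)
    # Criterion 1: the two largest bins hold most of the data.
    cond1 = (counts[left] + counts[right]) / total > peak_fraction_min
    # Criterion 2: almost no data beyond the two largest bins.
    outside = sum(counts[:left]) + sum(counts[right + 1:])
    cond2 = outside / total <= outer_fraction_max
    return cond1 and cond2


square_wave = ([1.0] * 50 + [-1.0] * 50) * 10
ramp = [i / 999 for i in range(1000)]
print(is_square(square_wave, 0.8, 0.01), is_square(ramp, 0.8, 0.01))
```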
For the Drift class, the index is $d_{\mathrm{drift}} = \sigma_{\mu} / \mu_{\sigma}$, where $\sigma_{\mu}$ is the standard deviation of the means of the segments and $\mu_{\sigma}$ is the mean of the standard deviations of the segments. If the index $d_{\mathrm{drift}}$ is greater than a pre-defined threshold, the signal is classified as Drift. $\sigma_{\mu}$ indicates the level of variation in the segmented signals. By dividing $\sigma_{\mu}$ by $\mu_{\sigma}$, which is the overall amplitude of the original signal, the index $d_{\mathrm{drift}}$ can be used as a measure of the random drift in a signal.
This rule-based classification can be utilized to classify measured data into the above-defined classes. This classification approach is independent of data conversion into images and is less sensitive to false detections. The rule-based classification with actual measured data is discussed in detail in Section 4.
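The Drift index, the standard deviation of the segment means divided by the mean of the segment standard deviations, can be computed as in the sketch below. The number of segments is a free parameter that the text does not specify, and the handling of a zero denominator is an assumption added for robustness.

```python
from statistics import fmean, pstdev


def drift_index(samples, n_segments):
    """Std of the segment means divided by the mean of the segment stds."""
    segment_length = len(samples) // n_segments  # trailing samples are dropped
    segments = [samples[i * segment_length:(i + 1) * segment_length]
                for i in range(n_segments)]
    std_of_means = pstdev([fmean(segment) for segment in segments])
    mean_of_stds = fmean([pstdev(segment) for segment in segments])
    if mean_of_stds == 0.0:
        # Assumption: a flat record has no drift unless its level changes.
        return float("inf") if std_of_means > 0.0 else 0.0
    return std_of_means / mean_of_stds


# Stationary oscillation versus the same oscillation with a shifting baseline.
stationary = [(-1.0) ** i for i in range(1000)]
drifting = [x + 2.0 * (i // 100) for i, x in enumerate(stationary)]

print(drift_index(stationary, 10), drift_index(drifting, 10))
```

A record would then be labeled Drift when the returned index exceeds the chosen threshold.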

Issue in the Definitions of Anomaly Classes
Although proper definitions for classifying the measured data into various classes are provided in Section 3.1, ambiguity exists in the data anomaly classes for a certain dataset. For example, the Trend and Drift classes have intrinsic ambiguities because the threshold values for $r_{\mathrm{trend}}$ and $d_{\mathrm{drift}}$ cannot be selected perfectly, causing difficulties in labeling data for CNN-based classifications. Further discussion on this issue is provided in a later section.

Data Source
The measured time-history data from the Hwatae Bridge, a cable-stayed bridge located in the city of Yeosu, Republic of Korea, were used for the analysis of data anomaly detection. This bridge was constructed in 2015 with two 130-meter-high pylons and a 500-meter span between the two pylons. The data consisted of acceleration, displacement, and wind data. Only the acceleration data of the bridge were used for anomaly detection. To collect the acceleration data, 36 channels were placed at various parts of the bridge, such as piers, decks, pylons, and cables. The sensor layout of the 36 accelerometers on the bridge is shown in Figure 9. Detailed information on the positions of these channels is presented in Table 1. Acceleration data were recorded every 10 min at a sampling frequency of 100 Hz.
Rule- and CNN-based methods were applied to the data measured from the bridge to investigate their applicability and performance. First, a rule-based method was applied to the measured data, which allowed us to identify class ambiguity in the data. Based on these results, labeled data were prepared for the CNN-based method. Furthermore, the CNN-based classification results were obtained for the labeled dataset. Given that data labeling is conducted using the rule-based classification, in this study, we do not intend to compare the accuracy of the two methods but to investigate their characteristics and associated issues. An in-depth discussion of the classification results is presented in the next section.

Results of the Rule-Based Classification
The rule-based classification was applied to the acceleration data from the Hwatae Bridge along with added artificial data to investigate its applicability. The threshold values used in the rule-based classification are listed in Table 2. The selection of the specific values is discussed later in this section. Acceleration data were obtained from December 2020 to May 2021 for all 36 channels. During the classification, Normal, Missing, Outlier, Trend, and Drift classes were identified in the actual acceleration data. Examples of these classes representing 10 min acceleration data are shown in Figure 11.
A critical issue in applying a rule-based method is that the measured data have an intrinsic ambiguity in the anomaly classes. In particular, certain data are difficult to identify as the Trend, Drift, or Normal class because the data do not exhibit distinctive features for classification. To better describe the phenomenon, the index defined for the Trend class, $r_{\mathrm{trend}}$, for all acceleration data is plotted in Figure 12a, showing an almost continuous distribution without clear patterns that can be used for classifying the Trend class. Thus, we selected two threshold values for $r_{\mathrm{trend}}$ to isolate the data that clearly belong to the Trend and Normal classes, such that data above the red line (i.e., the first threshold of $r_{\mathrm{trend}}$ = 0.00035) and below the green line (i.e., the second threshold of $r_{\mathrm{trend}}$ = 0.00005) in Figure 12b,c are regarded as obvious Trend and Normal classes, respectively. The data between the red and green lines are defined as Trend-ambiguous. A similar procedure was applied to the $d_{\mathrm{drift}}$ distribution shown in Figure 13a for the Drift class. Time history examples for the Drift and Normal classes, right above the red line and right below the green line, are shown in Figure 13b and Figure 13c, respectively. The data between the red and green lines correspond to Drift-ambiguous.
Examples of both Trend-ambiguous and Drift-ambiguous classes are shown in Figure 14. These examples illustrate the difficulty of assigning such data unambiguously to the Normal, Trend, or Drift class. In total, 7216 Trend-ambiguous samples and 3489 Drift-ambiguous samples were identified in the measured acceleration data with the pre-defined thresholds. Using rule-based classification, the total numbers of samples for each class and their proportions with respect to the total number of samples labeled in the database are listed in Table 3.
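The two-threshold labeling described in this section can be sketched as follows, using the two Trend thresholds quoted above (0.00035 and 0.00005). The function name is hypothetical, and assigning index values that fall exactly on a threshold to the ambiguous band is an assumption.

```python
TREND_UPPER = 0.00035  # red line: above this, the record is an obvious Trend
TREND_LOWER = 0.00005  # green line: below this, the record is an obvious Normal


def label_by_index(index, lower, upper, anomaly):
    """Three-way labeling with an ambiguous band between two thresholds."""
    if index > upper:
        return anomaly
    if index < lower:
        return "Normal"
    return f"{anomaly}-ambiguous"  # includes values exactly on a threshold


for r_trend in (0.0005, 0.0002, 0.00001):
    print(r_trend, label_by_index(r_trend, TREND_LOWER, TREND_UPPER, "Trend"))
```

The same function could be reused for the Drift index with its own pair of thresholds.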

Results of the CNN-Based Classification
A CNN-based classification method was applied to the prepared labeled datasets. The labeled data must be converted into images for the CNN-based classification. The converted dual-channel images contain time and frequency information in the red and green channels, respectively. Examples of the dual-channel images for all seven classes are shown in Figure 15. The training dataset was prepared by randomly selecting data samples from the labeled dataset. Trend-ambiguous and Drift-ambiguous classes were not included in the training and test datasets to ensure consistency with a previous CNN-based classification study [26]. The details of both the training and test datasets are listed in Table 4. A balanced dataset with the same amount of data for each class was used to train the CNN model because training on an imbalanced dataset usually degrades the performance of the network. A total of 1000 random samples from each class were selected from the database to train the CNN model. The training dataset for the CNN model was further divided into 70% training and 30% validation datasets. The remaining data samples from the database were used as the test dataset to evaluate the accuracy of the trained CNN model. The results for the test dataset are shown in Figure 16. The overall accuracy of the test dataset was 93.3%. For the Normal and Minor classes, the recall values were 88.6% and 92.8%, respectively. Except for the Normal and Minor classes, all classes exhibited >95% recall values. The Minor and Drift classes showed 83.3% and 56.8% precision values, respectively, which were less than the precision values of the other classes. The results are similar to those of a previous study [26], indicating that the CNN-based classification was appropriately implemented. Nonetheless, certain critical issues such as intrinsic ambiguity in the anomaly class and low resolution in the converted images must be investigated to determine the applicability of the data anomaly detection process. Further discussion of these issues is presented in the next section of this paper.
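For reference, the accuracy, recall, and precision values quoted above are the standard quantities derived from a confusion matrix. The sketch below assumes that rows hold the true class and columns the predicted class, which may differ from the layout of Figure 16. The three-class matrix contains made-up numbers for illustration only and is not taken from the results of this study.

```python
def precision_recall(confusion, labels):
    """Per-class precision and recall; confusion[i][j] counts true i predicted as j."""
    n = len(labels)
    precision, recall = {}, {}
    for k, label in enumerate(labels):
        true_positive = confusion[k][k]
        predicted_as_k = sum(confusion[i][k] for i in range(n))
        actually_k = sum(confusion[k])
        precision[label] = true_positive / predicted_as_k if predicted_as_k else 0.0
        recall[label] = true_positive / actually_k if actually_k else 0.0
    return precision, recall


def accuracy(confusion):
    """Share of all samples that lie on the diagonal."""
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / sum(sum(row) for row in confusion)


# Illustrative numbers only (not the values behind Figure 16).
toy_labels = ["Normal", "Minor", "Drift"]
toy_confusion = [
    [90, 8, 2],
    [5, 95, 0],
    [1, 0, 99],
]

toy_precision, toy_recall = precision_recall(toy_confusion, toy_labels)
print(accuracy(toy_confusion), toy_precision, toy_recall)
```

A class can therefore have a high recall and a low precision at the same time when many samples of other, larger classes are predicted as that class.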

Ambiguous Data Investigation
Ambiguous classes were investigated based on the results of the CNN-based classification. Data samples from both Trend-ambiguous and Drift-ambiguous were tested using the trained CNN model. Note that this analysis does not evaluate the performance of the CNN-based classification but discusses the effects of intrinsic ambiguity in the measured data. The classification results for Trend-ambiguous and Drift-ambiguous conditions are shown in Figure 17 and Figure 18, respectively. In the case of Trend-ambiguous, the data were mostly classified into Normal, Trend, and Drift classes. Similarly, in the case of Drift-ambiguous, most of the data were classified into Normal and Drift classes, whereas few were classified into Trend and Minor classes by the trained CNN. These ambiguous classes not only make data labeling difficult but also considerably degrade the classification accuracy of CNN-based methods. Class ambiguity also applies to the rule-based classification. Perfect thresholds that can distinguish ambiguous classes from Normal classes do not exist.

Image-Resolution-Related Issue
The conversion of time-series data into images can impair data anomaly classification because the information in the data is partially lost during the conversion process. The converted image, which was used as the input of the CNN model, could not contain the full information of the measured dynamic signal. The image size used in this study was only 100 × 100 pixels, whereas the acceleration data contained 60,000 data points digitized with a 24-bit analog-to-digital converter (ADC). Indeed, lost information can lead to the misclassification of relevant data.
One of the adverse effects of image conversion is the loss of information on the amplitude axis, similar to quantization error. A vertical length of 100 pixels, which is less than $2^7$, is equivalent to applying a 6- or 7-bit ADC to the original acceleration signal. The resulting images after the low-bit ADC are often seen as data of the Minor class, the main cause of which is the quantization phenomenon. Figure 19 shows examples of misclassification (i.e., data in the Normal class were classified as Minor). Image conversion also causes information loss on the time axis, which behaves like resampling. In this dataset, one pixel corresponded to 600 data points. Thus, gaps of fewer than 600 missing data points could not be represented in the converted image. Figure 20 shows examples of this effect.
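The figures quoted above (600 data points per pixel and an equivalent resolution between 6 and 7 bits) follow from simple arithmetic, and the time-axis effect can be illustrated with a toy rasterization. The sketch below is not the image conversion used in this study: it only checks whether each pixel column still contains at least one sample, and it represents missing samples as `None`, which is an assumption made for illustration.

```python
import math

N_POINTS = 60_000  # data points per 10 min record (from the text)
IMAGE_SIZE = 100   # pixels along each image axis (from the text)

points_per_pixel = N_POINTS // IMAGE_SIZE  # samples sharing one pixel column
equivalent_bits = math.log2(IMAGE_SIZE)    # amplitude resolution of 100 levels


def occupied_columns(samples, width):
    """True for every pixel column that still contains at least one sample."""
    per_column = len(samples) // width
    return [
        any(x is not None for x in samples[c * per_column:(c + 1) * per_column])
        for c in range(width)
    ]


def record_with_gap(start, length):
    """Flat record with `length` missing samples (None) starting at `start`."""
    record = [0.0] * N_POINTS
    record[start:start + length] = [None] * length
    return record


short_gap = occupied_columns(record_with_gap(1000, 500), IMAGE_SIZE)
long_gap = occupied_columns(record_with_gap(1000, 1300), IMAGE_SIZE)

print(points_per_pixel, round(equivalent_bits, 2))
print(all(short_gap), all(long_gap))
```

In this toy setting, the 500-point gap leaves every column occupied and is therefore invisible in the image, whereas the 1300-point gap empties an entire column.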

Supplementary Issues
Other issues in the data anomaly detection process are discussed in this section. Although the accuracy of the CNN-based classification was high, false detections were observed, as shown in Figure 16. Because the converted images were used as inputs to the CNN, data with similar shapes can be misclassified, as shown in Figure 21. For example, an image of impulse responses belonging to the Normal class (see Figure 21a) can be classified as Outlier by the CNN-based method.
One of the most salient drawbacks of the rule-based classification is that an appropriate selection of threshold values is necessary for each SHM dataset. Determining the threshold values defined in this study is a subjective task that cannot be generalized. In addition, as the threshold values are manually selected before the actual application, the automation of data anomaly detection becomes difficult.

Conclusions
The critical issues in data anomaly detection were investigated with the rule-based and the CNN-based approaches. The rules for the seven classes of data anomaly used in [26] were defined for the rule-based classification. As the two classes Trend and Drift were not seen to be strictly separated from Normal, Trend-ambiguous and Drift-ambiguous were further defined. The acceleration data obtained from the Hwatae Bridge were labeled using the defined rules and were used to train the CNN model for classification. The following issues in the data anomaly classification were identified in the process of applying rule-based and CNN-based classifications:
(1) Measured data have intrinsic ambiguity, as reflected in the two ambiguous classes defined in this study. Thus, both determining a perfect rule that can separate the data without ambiguity and labeling such data for training the CNN model were seen to be impossible.
(2) Important information can be lost during data conversion into images, as this conversion generates non-original data anomalies due to quantization and resampling phenomena.
(3) The CNN model misclassified normal data as an anomaly class when the data had a shape similar to that of the anomaly class.
(4) Rule-based classification requires definitions of data anomalies and the selection of the associated threshold values as prior knowledge. This requirement is the most important reason for developing the CNN-based classification, while the deep-learning-based approach used to date also has clear drawbacks, as discussed in this paper.
Both the rule-based and CNN-based classifications have advantages and disadvantages with respect to their applicability. The rule-based classification requires prior knowledge necessary to determine threshold values for the data anomalies sought. On the other hand, the CNN-based classification is free of such definitions of data anomalies; however, information loss due to the conversion of data into images can be an issue. These characteristics of the rule-based and CNN-based methods discussed in this study result in differences in applicability, depending on the target SHM system.

Figure 1. Detailed CNN architecture [26].

Although the CNN-based approach has shown potential for data anomaly detection, several important issues have been identified, as discussed previously.
• Conversion of measured data into images: low-resolution input images have a similar effect to quantization errors;
• Intrinsic class ambiguity in measured data: the presence of data that cannot be classified into a specific class;
• False detections: classes with similar shapes are falsely detected.

Figure 2. Flowchart of the rule-based classification for data anomalies.

Figure 3. Examples of the Missing class.

Figure 4. Examples of the Minor class.

Figure 6. Example of the Square class: (a) time history and (b) histogram plot.

Figure 7. Examples of the Trend class.

Drift Class
The Drift class is characterized by nonstationary vibrations with random drift, as illustrated in Figure 8. To identify the Drift class, the measured data were partitioned into equally sized segments. Here, we define the following index to determine the Drift class: $d_{\mathrm{drift}} = \sigma_{\mu} / \mu_{\sigma}$, the standard deviation of the segment means divided by the mean of the segment standard deviations.

Figure 8. Examples of the Drift class.

Normal Class
If a measured signal is not classified into any of the anomaly classes, it is considered the Normal class.

Figure 9. Schematic of accelerometers installed on the Hwatae Bridge.

Top of the right pylon 24-26 Foundation of the right pylon 27-29 Girder end 30-32 Free field 33, 35, 36 Girder at ½ span

Considering that the data for the Minor and Square classes do not exist in the measured data, the data for these two classes were artificially produced. The data for the Minor class were generated by quantizing randomly selected signals from the original acceleration data. Similarly, the Square class was artificially prepared using the square wave phenomenon. A total of 5105 Minor and 5000 Square class samples were generated. The artificially generated Minor and Square class examples are shown in Figure 10.
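The generation of artificial Minor data by quantization can be illustrated with the sketch below. The quantization step and the sinusoidal stand-in for a measured acceleration record are assumptions made for demonstration; the procedure actually used to generate the 5105 Minor samples is not detailed in the text.

```python
import math


def quantize(samples, step):
    """Round every sample to the nearest multiple of a coarse step."""
    return [round(x / step) * step for x in samples]


# Stand-in for a low-level acceleration record (hypothetical signal).
original = [0.001 * math.sin(0.05 * i) for i in range(2000)]
artificial_minor = quantize(original, step=0.0005)

# The quantized record collapses onto a handful of amplitude levels.
print(len(set(original)), len(set(artificial_minor)))
```

Because the quantized record contains only a few unique values, it would satisfy the unique-value rule defined for the Minor class.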

Figure 12. (a) Trend class classification index distribution, (b) time history example right above the red line, and (c) time history example right below the green line.

Figure 13. (a) Drift class classification index distribution, (b) time history example right above the red line, and (c) time history example right below the green line.

Figure 16. Confusion matrix for the unseen test dataset.

Figure 17. CNN-based classification results for Trend-ambiguous data samples.

Figure 18. CNN-based classification results for Drift-ambiguous data samples.

Figure 19. Misclassification of the Normal class into the Minor class.
In Figure 20, the images at the top exhibit missing data, whereas the converted images shown at the bottom are classified as Normal.

Figure 20. Misclassification of the Missing class due to low-resolution images.

Figure 21. Misclassification of the Normal class into the (a) Outlier and (b) Drift classes.

Author Contributions:
Conceptualization, S.-H.S.; methodology, I.U.K. and S.-H.S.; software, I.U.K., S.J. and S.-H.S.; validation, I.U.K. and S.J.; investigation, S.J.; writing-original draft, I.U.K.; writing-review and editing, S.J. and S.-H.S.; supervision, S.-H.S.; funding acquisition, S.-H.S. All authors have read and agreed to the published version of the manuscript.

Funding: This research was supported by the Sungkyunkwan University and the BK21 FOUR (Graduate School Innovation) funded by the Ministry of Education (MOE, Korea) and National Research Foundation of Korea (NRF), as well as by a grant (RS-2020-KA156887) from the Smart Construction Technology Development Program funded by the Ministry of Land, Infrastructure, and Transport of the Korean Government.

Table 1. Information about selected channels for anomaly detection.

Table 2. Threshold values for the data anomaly classes.

Table 3. Total samples and percentage of each class anomaly in database.

Table 4. Training and test dataset information for the database.