1. Introduction
Reinforced concrete bridges are susceptible to vehicle loads, chloride erosion, freeze–thaw cycles, environmental humidity, and alkali–silica reaction (ASR) during long-term service, leading to multiple types of damage, such as cracking, reinforcement corrosion, and material deterioration [1,2]. Such damage is usually hidden and progressive; if not identified in time, the load-bearing capacity and durability of the structure may deteriorate, thereby threatening bridge service safety. As a passive non-destructive testing method, acoustic emission (AE) technology can capture transient elastic waves released by internal microcracking. Owing to its high sensitivity to microcrack initiation and propagation, AE technology can reflect the dynamic process of structural damage development [3,4], and therefore holds considerable application potential for damage identification in concrete structures.
Existing studies have shown that AE signals are closely related to different damage processes in concrete structures. Under loading and mechanical failure, AE parameters such as event counts, ring-down counts, energy, and duration can be used to characterize crack initiation, propagation, and failure processes [5,6,7]. In recent years, Zhou et al. [8], Rather et al. [9], and Cui et al. [10] further verified the capability of AE signals to characterize load-induced damage from the perspectives of fatigue damage assessment, interface degradation identification, and fatigue crack propagation analysis, respectively. For reinforcement corrosion damage, previous studies have shown that hit counts, cumulative signal strength, and absolute energy can reflect corrosion initiation, passive film breakdown, and corrosion-induced cracking processes [11,12,13,14]. For freeze–thaw damage, related studies have indicated that freeze–thaw conditions and saturation state affect AE response characteristics [15,16]. Choi et al. [17], Mainali et al. [18], and Todak et al. [19] further revealed the characterization capability of AE signals for freeze–thaw damage from the perspectives of freeze–thaw damage response, stage-dependent AE activity, and the influence of saturation state.
However, owing to the complex internal structure of concrete, AE signals are inherently burst-like, non-stationary, and multi-parametric. It is difficult to accurately identify different damage modes using only a single parameter or manually defined thresholds. With the development of data processing methods, machine learning has gradually been introduced into AE signal analysis. Thirumalaiselvi et al. [20] used pattern recognition methods to characterize crack development during progressive damage in large concrete structures. Abdeljaber et al. [21] and Ding et al. [22] carried out structural damage identification based on convolutional neural networks and deep belief networks, respectively, demonstrating the potential of machine learning methods for feature extraction and damage identification in complex structural responses. Compared with supervised learning, unsupervised clustering methods do not require pre-labeled samples and can mine the latent internal structure of AE signals. Previous studies have applied clustering methods to AE signal interpretation in different cement-based materials, including polymer concrete damage pattern recognition and steel fiber reinforced concrete fracture classification [23,24]. More recently, K-means clustering has been used to analyze the failure process and damage mechanisms of basalt fiber-reinforced polymer (BFRP)-strengthened concrete beams based on AE parameters [25], while clustering based on the Gaussian mixture model (GMM) has been adopted to identify tensile crack AE events and predict crack propagation in lightly reinforced concrete beams [26].
Although these studies demonstrate the applicability of clustering and machine learning methods in AE signal interpretation, many existing AE clustering studies still rely on conventional clustering algorithms in the original or reduced Euclidean feature space. Such methods may be affected by initialization, cluster shape assumptions, and overlapping feature distributions. In particular, AE signals generated by different damage mechanisms may exhibit nonlinear and locally overlapping distributions in multi-parameter feature space, which limits the separability achieved by conventional clustering methods. Kernel-based methods provide a potential way to improve nonlinear separability through implicit feature space mapping. In AE-based structural damage monitoring, kernel methods such as kernel ridge regression have been used for corrosion-induced damage assessment in reinforced concrete structures [27]. In the broader field of structural health monitoring, adaptive kernel spectral clustering has also been applied for automated damage detection under environmental and operational variability [28]. However, the performance of kernel clustering methods is strongly affected by the selected kernel function and its parameters. Although optimization algorithms have been introduced to improve AE data clustering, such as genetic algorithm-based AE clustering optimization [29], automatic optimization of both kernel type and key kernel parameters for unsupervised AE damage-type identification remains insufficiently explored.
In addition, from the perspective of multi-type damage identification in reinforced concrete (RC) beams, most existing studies still focus on a single damage mechanism or a specific experimental scenario. In practical monitoring of RC beams, AE signals may be associated with different damage types, such as loading-induced cracking, reinforcement corrosion, and freeze–thaw deterioration. Because the time-domain, frequency-domain, and energy-related features of these signals overlap, distinguishing them remains challenging [30]. Therefore, improving the identification accuracy of AE signals corresponding to different damage types is still an important issue in AE-based monitoring applications.
To address this issue, this study focuses on AE signals collected from RC beams under three different damage conditions, namely loading-induced damage, corrosion damage, and freeze–thaw damage. A kernel K-means clustering method optimized by the coyote optimization algorithm (COA), denoted COA-KKM, is proposed to improve the unsupervised identification of AE signals associated with different damage types. Compared with existing AE clustering studies, the proposed framework introduces nonlinear kernel mapping to enhance the separability of overlapping AE feature distributions, and uses the COA to optimize the kernel function type and key parameters under an unsupervised Gap Statistic criterion. This reduces the dependence on empirically selected kernels and manually tuned parameters, and provides a more adaptive clustering strategy for multi-type AE signal identification. The proposed method is compared with K-means, fuzzy C-means (FCM), and GMM, and its clustering accuracy and stability are evaluated. The results can provide a methodological reference for distinguishing damage-related AE signals and establishing cluster–damage type mappings in RC beam assessment.
3. Model Establishment
3.1. Clustering Analysis
3.1.1. K-Means Clustering Algorithm
The classical K-Means clustering algorithm is a partition-based clustering method. Its core idea is to divide data samples into a pre-specified number of non-overlapping clusters through iterative optimization. Let the dataset be as follows:

$$D = \{x_1, x_2, \ldots, x_N\}, \quad x_i \in \mathbb{R}^{n}$$

In the formulation, let $x_i$ denote the i-th n-dimensional feature vector and N the total number of samples. Given that the number of clusters is pre-specified, the objective of K-means is to partition the dataset into k disjoint clusters such that the within-cluster similarity of samples is maximized and the between-cluster dissimilarity is also maximized. Euclidean distance is adopted to quantify the similarity between samples. In the clustering procedure, the distance between a data point x and the centroid $\mu_i$ of the i-th cluster can be formulated as follows:

$$d(x, \mu_i) = \lVert x - \mu_i \rVert_2 = \sqrt{(x - \mu_i)^{\mathsf{T}} (x - \mu_i)}$$

Meanwhile, K-means clustering uses the sum of squared errors (SSE) as an evaluation metric for cluster quality, with its objective function defined as follows:

$$\mathrm{SSE} = \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - \mu_i \rVert_2^{2}$$

In the formula, $C_i$ represents the i-th cluster, and $\mu_i$ denotes the corresponding cluster center. SSE reflects the compactness of samples within each cluster; a smaller SSE value indicates higher intra-cluster consistency in the clustering results.
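The assignment step, centroid update, and SSE evaluation described above can be illustrated with a minimal pure-Python sketch. The toy data, the initial centroids, and all function names are assumptions made for illustration; they are not taken from the study, and the sketch does not handle empty clusters.

```python
import math

def euclidean(x, mu):
    """Euclidean distance between a sample x and a centroid mu."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, mu)))

def assign(data, centroids):
    """Assign every sample to its nearest centroid (hard partition)."""
    return [min(range(len(centroids)), key=lambda i: euclidean(x, centroids[i]))
            for x in data]

def update(data, labels, k):
    """Recompute each centroid as the mean of its assigned samples."""
    centroids = []
    for i in range(k):
        members = [x for x, lab in zip(data, labels) if lab == i]
        centroids.append([sum(col) / len(members) for col in zip(*members)])
    return centroids

def sse(data, labels, centroids):
    """Sum of squared distances between samples and their own centroid."""
    return sum(euclidean(x, centroids[lab]) ** 2 for x, lab in zip(data, labels))

# Toy 2-D data (hypothetical values, not AE measurements)
data = [[0.0, 0.0], [0.0, 1.0], [4.0, 4.0], [4.0, 5.0]]
centroids = [[0.0, 0.0], [4.0, 4.0]]  # assumed initial centroids
for _ in range(5):
    labels = assign(data, centroids)
    centroids = update(data, labels, k=2)
total = sse(data, labels, centroids)
print(labels, total)
```

Because each pass can only keep or lower the SSE, the loop settles once the assignments stop changing.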
3.1.2. Fuzzy C-Means Clustering Algorithm
Fuzzy C-Means (FCM) is a representative soft clustering algorithm, whose core mechanism introduces a membership framework that allows each sample to belong to multiple clusters with varying degrees of weight. Unlike hard clustering, FCM does not enforce exclusive assignment; instead, it quantifies the relative proximity of each sample to cluster centroids, making it particularly suitable for datasets with fuzzy boundaries, overlapping distributions, or transitional patterns. Let the input dataset be denoted as D. The objective of FCM is to partition D into C fuzzy clusters by constructing a membership matrix $U = [u_{ij}]$, where $u_{ij}$ represents the degree to which sample $x_i$ belongs to cluster j. The membership values must satisfy the following constraints:

$$\sum_{j=1}^{C} u_{ij} = 1, \quad u_{ij} \in [0, 1], \quad i = 1, 2, \ldots, N$$

The constraint ensures that the sum of membership degrees of each sample over all clusters equals 1, reflecting its fuzzy membership property. Given the membership matrix, the centroid of the j-th cluster is determined by the weighted average of the samples:

$$c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} x_i}{\sum_{i=1}^{N} u_{ij}^{m}}$$

In the formula, m is the fuzzy weighting exponent, which usually takes a value in the interval [1.5, 2.5]. A larger m makes the membership distribution smoother and enhances fuzziness, while a smaller m makes the result closer to hard partitioning. FCM achieves optimization by minimizing the following objective function:

$$J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \lVert x_i - c_j \rVert^{2}$$

The objective function simultaneously incorporates the spatial distance between samples and cluster centroids, as well as the membership weights of samples to each cluster. Through the alternating update of the membership matrix U and the cluster centroids $c_j$, the algorithm drives each sample in the feature space to converge toward its high-membership cluster center.
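The alternating update mentioned above can be sketched as one membership step followed by one centroid step. The membership rule used here is the standard FCM update, u_ij = 1 / Σ_k (d_ij / d_ik)^(2/(m−1)), which the text does not state explicitly; the toy data and function names are likewise assumptions for illustration only.

```python
def fcm_memberships(data, centroids, m=2.0):
    """Membership of each sample in each cluster from squared distances."""
    U = []
    for x in data:
        d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centroids]
        if 0.0 in d2:
            # Sample coincides with a centroid: give it full membership there
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            U.append([h / sum(hits) for h in hits])
            continue
        # d^(-2/(m-1)) written in terms of the squared distance d2
        inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
        U.append([v / sum(inv) for v in inv])
    return U

def fcm_centroids(data, U, m=2.0):
    """Centroids as membership-weighted averages of the samples."""
    n_clusters, n_dims = len(U[0]), len(data[0])
    centroids = []
    for j in range(n_clusters):
        w = [u[j] ** m for u in U]
        centroids.append([sum(wi * x[d] for wi, x in zip(w, data)) / sum(w)
                          for d in range(n_dims)])
    return centroids

# Toy 1-D data (hypothetical values) with assumed initial centroids
data = [[0.0], [1.0], [4.0], [5.0]]
U = fcm_memberships(data, [[0.5], [4.5]], m=2.0)
new_centroids = fcm_centroids(data, U, m=2.0)
print(U[0], new_centroids)
```

Repeating the two steps until the memberships stop changing gives the usual FCM iteration; a convergence tolerance would be needed in practice.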
3.1.3. Gaussian Mixture Model
The Gaussian Mixture Model (GMM) is a soft clustering method based on a probabilistic generative mechanism. The model assumes that observed samples do not come from a single distribution, but are jointly generated by several latent sub-distributions, and that each Gaussian component corresponds to a latent clustering structure in the data space. Through the weighted superposition of the components, the GMM can describe the multimodal characteristics of data under a unified probabilistic framework.
Formally, given the dataset D, the GMM models the probability density function of the overall data as the weighted sum of K component Gaussian distributions:

$$p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)$$

The symbols in the equation are defined as follows: K: the number of Gaussian components; $\pi_k$: the mixing coefficient of the k-th component, representing its weight in the overall model, satisfying $\pi_k > 0$ and $\sum_{k=1}^{K} \pi_k = 1$; $\mu_k$: the mean vector of the k-th Gaussian component; and $\Sigma_k$: the covariance matrix of the k-th Gaussian component.

The probability density function of the multivariate Gaussian distribution is defined as follows:

$$\mathcal{N}(x \mid \mu_k, \Sigma_k) = \frac{1}{(2\pi)^{n/2} \lvert \Sigma_k \rvert^{1/2}} \exp\left(-\frac{1}{2}(x - \mu_k)^{\mathsf{T}} \Sigma_k^{-1} (x - \mu_k)\right)$$

The parameter set $\{\pi_k, \mu_k, \Sigma_k\}_{k=1}^{K}$ in Formula (10) is typically estimated using the Expectation-Maximization (EM) algorithm. The EM algorithm alternately optimizes between the hidden variable space and the parameter space to gradually approach the maximum value of the model’s logarithmic likelihood function of the observed data [33].
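One EM pass can be sketched as follows. To keep the example short it is restricted to one-dimensional data with scalar variances, which is a simplification of the multivariate model in the text; the toy data, the starting parameters, and the variance floor are assumptions for illustration.

```python
import math

def gaussian_pdf(x, mu, var):
    """Univariate Gaussian density (simplified from the multivariate form)."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def e_step(data, pis, mus, variances):
    """Posterior probability (responsibility) of each component for each sample."""
    R = []
    for x in data:
        w = [p * gaussian_pdf(x, mu, v) for p, mu, v in zip(pis, mus, variances)]
        total = sum(w)
        R.append([wi / total for wi in w])
    return R

def m_step(data, R):
    """Re-estimate mixing coefficients, means, and variances from responsibilities."""
    n, n_comp = len(data), len(R[0])
    pis, mus, variances = [], [], []
    for k in range(n_comp):
        nk = sum(r[k] for r in R)
        mu = sum(r[k] * x for r, x in zip(R, data)) / nk
        var = sum(r[k] * (x - mu) ** 2 for r, x in zip(R, data)) / nk
        pis.append(nk / n)
        mus.append(mu)
        variances.append(max(var, 1e-6))  # assumed floor to avoid a zero variance
    return pis, mus, variances

# Toy 1-D data (hypothetical values) and assumed starting parameters
data = [0.0, 0.2, 9.8, 10.0]
R = e_step(data, pis=[0.5, 0.5], mus=[0.0, 10.0], variances=[1.0, 1.0])
pis, mus, variances = m_step(data, R)
print(pis, mus)
```

Alternating the two functions until the log-likelihood stops increasing gives the full EM loop; each sample is then assigned to the component with the largest responsibility.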
3.2. Model Modification
3.2.1. Kernel Clustering
In practical engineering and complex data analysis problems, the relationships between samples often exhibit significant nonlinear characteristics, and traditional clustering methods based on Euclidean distance struggle to characterize the true similarity between samples in the original feature space. To overcome this limitation, kernel clustering was proposed, and Kernel K-Means is its most typical form.
Unlike traditional methods, Kernel K-Means does not explicitly compute coordinates in the high-dimensional feature space. Instead, it implicitly completes sample assignment and cluster center updating with the help of a kernel function. It can therefore identify non-convex, irregularly shaped cluster structures and adapts better to nonlinearly distributed data, as shown in Figure 7.
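There are many types of kernel functions, and this paper considers the following three classical kernel functions that are widely used in pattern recognition and clustering tasks:

1. Gaussian radial basis kernel function:

$$K(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^{2}}{2\sigma^{2}}\right) \tag{11}$$

The parameter $\sigma$ is the kernel width parameter, which controls the smoothness of the kernel function.

2. Polynomial kernel function:

$$K(x_i, x_j) = \left(x_i^{\mathsf{T}} x_j + c\right)^{e} \tag{12}$$

In the formula, c is the bias constant, and e is the degree parameter of the polynomial kernel function.

3. Linear kernel function:

$$K(x_i, x_j) = x_i^{\mathsf{T}} x_j \tag{13}$$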
3.2.2. COA Kernel Optimization K-Means Clustering Method
The performance of kernel K-means clustering is significantly influenced by the type of kernel function and its parameter values. Different kernel functions correspond to distinct feature mapping schemes, while the kernel parameters alter the distance relationships among samples in the kernel space and the expressiveness of cluster structures. When kernel functions and their parameters are selected based on manual experience, the clustering results become susceptible to subjective bias, making it difficult to stably adapt to the nonlinear and overlapping distribution characteristics inherent in acoustic emission signal feature spaces. To address this limitation, this paper proposes a COA-optimized kernel K-means clustering method (COA-KKM) by integrating the Coyote Optimization Algorithm (COA) to adaptively optimize the kernel function type and its critical parameters, with the Gap Statistic employed as the fitness evaluation metric.
The COA is a swarm intelligence optimization algorithm that simulates the social behavior and evolutionary mechanisms of coyote populations. In this algorithm, each candidate solution is regarded as an individual coyote, and the variable combination corresponding to the candidate solution represents the social state of the coyote. During iteration, each coyote individual is jointly influenced by the dominant individual in the group and the cultural trend of the population. The fundamental update process can be expressed as follows:

$$\mathrm{soc}_{c,\mathrm{new}}^{p,t} = \mathrm{soc}_{c}^{p,t} + r_1\left(\alpha^{p,t} - \mathrm{soc}_{cr_1}^{p,t}\right) + r_2\left(\mathrm{cult}^{p,t} - \mathrm{soc}_{cr_2}^{p,t}\right) \tag{14}$$

In the formula, $\mathrm{soc}_{c}^{p,t}$ is the social state of the c-th coyote in the p-th group at the t-th iteration and $\mathrm{soc}_{c,\mathrm{new}}^{p,t}$ is its updated state; $\alpha^{p,t}$ is the dominant coyote in the p-th group; $\mathrm{cult}^{p,t}$ is the cultural trend of the p-th group, which is usually determined by the median of the social states in each dimension within the group; $cr_1$ and $cr_2$ are two randomly selected coyote individuals within the group; and $r_1$ and $r_2$ are random factors in the interval [0, 1]. This updating mechanism allows individuals to move closer to dominant individuals and the overall population trend while retaining a certain random search capability, thus balancing local exploitation and global exploration.
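The update described above can be sketched for a single group as follows. The sketch assumes a maximization problem (the dominant coyote is the one with the highest fitness) and omits bound clipping and the acceptance test that COA implementations typically apply to the new state; the function name and toy values are hypothetical.

```python
import random
from statistics import median

def coa_update(pack, fitness, rng):
    """Propose a new social state for every coyote in one group."""
    n, dim = len(pack), len(pack[0])
    # Dominant coyote: best fitness in the group (maximization assumed)
    alpha = pack[max(range(n), key=lambda i: fitness[i])]
    # Cultural trend: per-dimension median of the social states
    cult = [median(c[d] for c in pack) for d in range(dim)]
    proposals = []
    for c in pack:
        cr1, cr2 = rng.sample(range(n), 2)    # two random coyotes of the group
        r1, r2 = rng.random(), rng.random()   # random factors in [0, 1)
        proposals.append([c[d]
                          + r1 * (alpha[d] - pack[cr1][d])
                          + r2 * (cult[d] - pack[cr2][d])
                          for d in range(dim)])
    return proposals

rng = random.Random(0)
pack = [[1.0, 2.0], [3.0, 0.5], [2.0, 1.0], [0.5, 4.0]]  # toy social states
fitness = [0.2, 0.9, 0.4, 0.1]                           # toy fitness values
proposals = coa_update(pack, fitness, rng)

# A group of identical coyotes is a fixed point of the update
same = coa_update([[1.0, 2.0]] * 3, [0.0, 0.0, 0.0], rng)
print(proposals, same)
```

In a full optimizer each proposal would be re-evaluated and kept only if its fitness improves, which is what drives the group toward better kernel configurations.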
This paper selects the Gaussian radial basis kernel function, polynomial kernel function and linear kernel function (Equations (11)–(13)) as the optimization objects. The kernel function type and its parameters are encoded into a three-dimensional vector as the parameters to be optimized:

$$\theta = [t, p_1, p_2]$$

where t denotes the kernel type code, which takes the value 1, 2, or 3 (1 = RBF, 2 = Poly, 3 = Linear), and $p_1$, $p_2$ denote the kernel parameters.
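A minimal sketch of how such an encoded vector could be decoded into a kernel function is given below. The layout of the two parameter slots is an assumption made here for illustration: the first slot is read as the RBF width or the polynomial bias constant, and the second as the polynomial degree. The names `decode` and `kernel_dist2` are hypothetical.

```python
import math

def rbf(x, y, sigma):
    """Gaussian radial basis kernel with width sigma."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def poly(x, y, c, e):
    """Polynomial kernel with bias constant c and degree e."""
    return (sum(a * b for a, b in zip(x, y)) + c) ** e

def linear(x, y):
    """Linear kernel (plain inner product)."""
    return sum(a * b for a, b in zip(x, y))

def decode(theta):
    """Turn [type code, parameter 1, parameter 2] into a kernel function.

    Assumed layout: code 1 = RBF (parameter 1 is sigma), code 2 = polynomial
    (parameter 1 is c, parameter 2 is rounded to the degree e), code 3 = linear.
    """
    code = min(3, max(1, round(theta[0])))
    p1, p2 = theta[1], theta[2]
    if code == 1:
        return lambda x, y: rbf(x, y, sigma=p1)
    if code == 2:
        return lambda x, y: poly(x, y, c=p1, e=max(1, round(p2)))
    return linear

def kernel_dist2(k, x, y):
    """Squared distance in the kernel-induced space, from kernel values only."""
    return k(x, x) - 2.0 * k(x, y) + k(y, y)

lin = decode([3, 0.0, 0.0])
d_lin = kernel_dist2(lin, [1, 2], [4, 6])
p_val = decode([2, 1.0, 2.0])([1, 2], [3, 4])
r_val = decode([1, 1.0, 0.0])([0, 0], [0, 0])
print(d_lin, p_val, r_val)
```

With the linear kernel the kernel-induced distance reduces to the ordinary squared Euclidean distance, which is a quick sanity check on the decoding.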
For the kernel function configuration corresponding to any coyote individual, kernel K-means is adopted to cluster the standardized acoustic emission feature matrix. Let the number of clusters be k, the r-th cluster be $C_r$, and the number of samples in it be $n_r$; then the within-cluster dispersion in the kernel space can be expressed as follows:

$$W_k = \sum_{r=1}^{k} \frac{1}{2 n_r} \sum_{x_i, x_j \in C_r} \lVert \phi(x_i) - \phi(x_j) \rVert^{2}$$

In the formula, $\phi(\cdot)$ is the nonlinear mapping function implicitly defined by the kernel function. According to the properties of kernel functions, the distance in the kernel space can be calculated directly from the kernel function:

$$\lVert \phi(x_i) - \phi(x_j) \rVert^{2} = K(x_i, x_i) - 2K(x_i, x_j) + K(x_j, x_j)$$
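To evaluate the clustering effectiveness under different kernel function configurations, this paper adopts the Gap Statistic as the fitness function for the COA. Based on the range of features in the original data, B reference datasets are generated, and kernel K-means clustering is performed on each to compute the corresponding within-cluster scatter. With B = 20, the Gap Statistic can be expressed as follows:

$$\mathrm{Gap}(k) = \frac{1}{B} \sum_{b=1}^{B} \log W_{k,b}^{*} - \log W_k$$

where $W_{k,b}^{*}$ is the within-cluster scatter of the b-th reference dataset and $W_k$ is that of the original data. A larger Gap Statistic indicates that the clustering structure obtained under the current kernel function configuration is more distinct compared to a random distribution. Therefore, the optimization objective of the COA can be expressed as follows:

$$\theta^{*} = \arg\max_{\theta} \, \mathrm{Gap}(k; \theta)$$

where $\theta$ denotes the encoded kernel configuration and $\mathrm{Gap}(k; \theta)$ is the Gap Statistic obtained with that configuration.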
After random initialization of the coyote population within the given search space, each individual corresponds to one combination of kernel function type and parameters. For each individual, the kernel function is constructed from its encoded representation, kernel K-means clustering is performed on the standardized acoustic emission feature matrix, and the resulting Gap Statistic is taken as the individual’s fitness value. Subsequently, the COA iteratively updates the social state of each individual according to Equation (14), while maintaining population diversity through operations such as newborn individual replacement, inferior individual elimination, and inter-group migration. The optimization process terminates when the global best fitness has not improved for a preset number of consecutive iterations, or when the maximum number of iterations is reached. Finally, the individual with the maximum Gap Statistic is selected as the optimal kernel configuration, and kernel K-means clustering is performed using this optimal kernel function and its corresponding parameters to obtain the final clustering results of acoustic emission signals from RC beams.
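The fitness evaluation for one candidate kernel can be sketched as below: kernel K-means on the Gram matrix, the within-cluster dispersion in kernel space, and the Gap Statistic against uniform reference data. This is a sketch under stated assumptions, not the study's implementation: the round-robin initial labels, the fixed RBF width, the toy data, and all function names are hypothetical, and only the reference-set count B = 20 comes from the text.

```python
import math
import random

def rbf(x, y, sigma=1.0):
    """Gaussian RBF kernel; sigma is an assumed width for the toy example."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (2.0 * sigma ** 2))

def gram(data, kernel):
    """Gram matrix with entries kernel(x_i, x_j)."""
    return [[kernel(x, y) for y in data] for x in data]

def kernel_kmeans(K, k, iters=50):
    """Kernel K-means on a Gram matrix, round-robin initial labels (assumed)."""
    n = len(K)
    labels = [i % k for i in range(n)]
    for _ in range(iters):
        clusters = [[i for i in range(n) if labels[i] == r] for r in range(k)]
        # Mean pairwise kernel value inside each cluster (same for all samples)
        self_sim = [sum(K[j][l] for j in c for l in c) / len(c) ** 2 if c else 0.0
                    for c in clusters]
        new = []
        for i in range(n):
            # Squared kernel-space distance from sample i to each cluster mean
            dists = [K[i][i] - 2.0 * sum(K[i][j] for j in c) / len(c) + self_sim[r]
                     if c else float("inf") for r, c in enumerate(clusters)]
            new.append(dists.index(min(dists)))
        if new == labels:
            break
        labels = new
    return labels

def dispersion(K, labels, k):
    """Within-cluster dispersion in kernel space from pairwise kernel distances."""
    W = 0.0
    for r in range(k):
        c = [i for i, lab in enumerate(labels) if lab == r]
        if c:
            W += sum(K[i][i] - 2.0 * K[i][j] + K[j][j]
                     for i in c for j in c) / (2.0 * len(c))
    return W

def gap_statistic(data, kernel, k, B=20, seed=0):
    """Mean log-dispersion of B uniform reference sets minus that of the data."""
    rng = random.Random(seed)
    K = gram(data, kernel)
    log_w = math.log(dispersion(K, kernel_kmeans(K, k), k))
    lo = [min(col) for col in zip(*data)]
    hi = [max(col) for col in zip(*data)]
    ref_logs = []
    for _ in range(B):
        ref = [[rng.uniform(a, b) for a, b in zip(lo, hi)] for _ in data]
        Kb = gram(ref, kernel)
        ref_logs.append(math.log(dispersion(Kb, kernel_kmeans(Kb, k), k)))
    return sum(ref_logs) / B - log_w

# Toy data: two tight, well-separated groups (hypothetical, not AE features)
data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
labels = kernel_kmeans(gram(data, rbf), k=2)
gap = gap_statistic(data, rbf, k=2)
print(labels, gap)
```

An optimizer such as the COA would call `gap_statistic` once per candidate kernel and keep the configuration with the largest value.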
3.2.3. Parameter Settings
The main parameter settings used in the COA-KKM algorithm are listed in Table 3.
3.3. Dataset Construction
3.3.1. AE Feature Dataset
Because acoustic emission signals are strongly non-stationary and random, they are usually characterized quantitatively through feature parameters in order to reveal the acoustic response during the evolution of internal structural damage. Although an acoustic emission system can extract a variety of feature parameters, the selection of feature parameters for unsupervised clustering analysis needs to balance the clarity of physical meaning, parameter stability, and the ability to distinguish different damage mechanisms. Meanwhile, excessive or redundant feature parameters not only increase the dimension of the feature space, but may also reduce the discriminative performance of clustering results.
Based on these considerations and the AE generation characteristics during reinforcement corrosion, freeze–thaw damage, and loading-induced failure of RC beams, seven AE parameters were selected as input features for subsequent clustering analysis, including rise time, ring-down counts, energy, duration, average frequency, counts to peak, and amplitude. These parameters characterize AE events from the perspectives of time-domain response, activity level, energy release, frequency content, and signal intensity, and can provide complementary information for distinguishing different damage-related AE signals. The definitions of several AE parameters are illustrated in Figure 8.
Based on the AE monitoring data obtained from three RC beams under loading, corrosion, and freeze–thaw conditions, as described in Section 2, a seven-dimensional feature dataset containing 490 valid samples was constructed. The distribution ranges of the AE parameters for each damage condition are listed in Table 4.
To provide a more intuitive description of the collected AE dataset, the distributions of the seven AE features under the three damage conditions are shown in Figure 9. It can be observed that the loading-induced damage samples generally exhibit higher values in ring-down counts, duration, average frequency, and amplitude, whereas the freeze–thaw and corrosion samples show closer distributions in several features. Meanwhile, overlaps still exist among the three damage types, indicating that single AE parameters are insufficient for reliable damage identification. Therefore, multi-feature clustering analysis is necessary for further distinguishing different damage-related AE signals.
3.3.2. Data Pretreatment
The seven AE features used in this study have different physical dimensions and numerical ranges. Therefore, min–max normalization was applied to each feature before clustering, so that all features were transformed into the same numerical range while preserving their relative distribution characteristics.
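The normalization described above can be sketched as a per-feature rescaling to [0, 1]. The sample values and the handling of a constant column are assumptions made for illustration.

```python
def min_max_normalize(X):
    """Rescale every feature (column) of X to the range [0, 1]."""
    cols = list(zip(*X))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    # A constant column has no spread; it is mapped to 0.0 (assumed convention)
    return [[(v - a) / (b - a) if b > a else 0.0
             for v, a, b in zip(row, lo, hi)]
            for row in X]

# Two hypothetical features with very different numerical ranges
X = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]]
X_norm = min_max_normalize(X)
print(X_norm)
```

After rescaling, features with large raw magnitudes no longer dominate the distance computations used by the clustering algorithms.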
Although the normalized seven-dimensional feature set characterizes AE signals from multiple perspectives, including time-domain response, energy release, amplitude response and frequency content, the Pearson correlation matrix shows that correlations exist among several feature parameters. As shown in Figure 10, rise time is strongly correlated with peak frequency, while ring-down counts show strong correlations with duration and amplitude. Duration is also highly correlated with amplitude. These correlations indicate that the original AE feature set contains partially overlapping information and potential feature redundancy. Therefore, PCA was introduced to reduce feature redundancy and obtain a compact low-dimensional representation for visualization and the corresponding clustering analysis. The cumulative variance contribution rates of the first three and first four principal components were 90.05% and 94.68%, respectively. Therefore, the first three principal components were used for three-dimensional visualization, while the first four principal components were retained as the PCA-reduced input for the corresponding clustering analysis.
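The two diagnostics used above, the Pearson correlation matrix and the cumulative variance contribution rate, can be sketched as follows. The feature values and eigenvalues are hypothetical placeholders, not the study's data, and the eigen-decomposition that produces the PCA eigenvalues is omitted.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def correlation_matrix(X):
    """Pairwise Pearson correlations between the columns (features) of X."""
    cols = list(zip(*X))
    return [[pearson(ci, cj) for cj in cols] for ci in cols]

def cumulative_variance(eigenvalues):
    """Cumulative share of total variance for eigenvalues sorted in descending order."""
    total, running, shares = sum(eigenvalues), 0.0, []
    for ev in eigenvalues:
        running += ev
        shares.append(running / total)
    return shares

# Hypothetical three-feature sample matrix
X = [[1.0, 2.0, 3.0], [2.0, 4.0, 2.0], [3.0, 6.0, 1.0]]
R = correlation_matrix(X)
# Hypothetical covariance eigenvalues
shares = cumulative_variance([4.0, 2.0, 1.0, 1.0])
print(R[0][1], R[0][2], shares)
```

The number of retained components is then the smallest count whose cumulative share reaches the desired level.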
To further interpret the physical meaning of the retained principal components, the loading matrix of PC1–PC4 was analyzed. The absolute loading values of the first four principal components are listed in Table 5. For PC1, the dominant features are ring-down counts, amplitude and duration, indicating that PC1 mainly reflects the overall AE activity intensity and sustained response level. PC2 is dominated by average frequency, suggesting that it primarily represents the frequency-related characteristics of AE signals. PC3 is mainly controlled by amplitude, peak frequency and rise time, and can be associated with waveform morphology, peak response and frequency distribution characteristics. PC4 is dominated by energy, amplitude and duration, indicating that it preserves supplementary information related to energy release and signal duration. This also explains why retaining the fourth principal component is meaningful, although its individual variance contribution is relatively small.
The PCA visualization of the AE feature dataset is shown in Figure 11. It can be observed that the three types of damage signals exhibit a certain degree of aggregation in the reduced feature space, indicating that the selected AE parameters contain damage-related discriminative information. However, the category boundaries are not completely separated, and local overlap still exists among different damage types. This further suggests that simple visual separation or single-parameter judgment is insufficient, and clustering algorithms with stronger nonlinear separation capability are required for multi-type AE signal identification.
3.4. Performance Evaluation
The AE dataset used in this study was constructed from three experimentally designed dominant damage conditions, namely loading-induced damage, freeze–thaw damage, and reinforcement corrosion. Accordingly, three clusters were used for all clustering algorithms in the comparative analysis, so that the clustering results could be consistently compared with the experimentally defined damage categories.
Since the cluster labels output by unsupervised clustering algorithms are only numerical identifiers of sample partitions in the feature space, they do not inherently correspond to specific damage categories. Therefore, before calculating external evaluation metrics, the Hungarian algorithm was employed to establish the optimal correspondence between the clustering labels and the experimental damage labels. Specifically, a cost matrix was constructed according to the mismatch between the clustering results and the ground-truth damage categories, and the optimal assignment with the minimum total matching error was obtained. Based on the matched labels, the clustering accuracy was then calculated to evaluate the consistency between the clustering results and the experimental damage categories.
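The label-matching step can be sketched as follows. For three clusters, an exhaustive search over all cluster-to-class permutations finds the same optimal assignment as the Hungarian algorithm, so it is used here as a dependency-free stand-in; the label vectors are hypothetical.

```python
from itertools import permutations

def match_labels(y_true, y_pred, k):
    """Relabel clusters so that agreement with the true classes is maximal.

    Exhaustive search over permutations; practical only for small k.
    """
    best_map, best_hits = None, -1
    for perm in permutations(range(k)):
        # perm[p] is the class proposed for cluster p
        hits = sum(1 for t, p in zip(y_true, y_pred) if perm[p] == t)
        if hits > best_hits:
            best_map, best_hits = perm, hits
    matched = [best_map[p] for p in y_pred]
    return matched, best_hits / len(y_true)

# Hypothetical labels: the cluster identifiers are permuted and one sample is wrong
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [2, 2, 0, 0, 1, 0]
matched, acc = match_labels(y_true, y_pred, k=3)
print(matched, acc)
```

Maximizing the number of matches is equivalent to minimizing the total mismatch cost described above; for larger numbers of clusters a polynomial-time assignment solver would be preferable.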
In this paper, the performance of different clustering methods is evaluated using clustering accuracy (ACC), adjusted Rand index (ARI), and silhouette coefficient (SC), whose calculation formulas are given in Equations (20)–(22):

$$\mathrm{ACC} = \frac{1}{N} \sum_{i=1}^{N} \delta\left(y_i, \hat{y}_i\right) \tag{20}$$

$$\mathrm{ARI} = \frac{\mathrm{RI} - E[\mathrm{RI}]}{\max(\mathrm{RI}) - E[\mathrm{RI}]} \tag{21}$$

$$\mathrm{SC} = \frac{1}{N} \sum_{i=1}^{N} \frac{b(i) - a(i)}{\max\{a(i), b(i)\}} \tag{22}$$

Here, $y_i$ represents the true label of the i-th sample; $\hat{y}_i$ denotes the cluster label after optimal matching via the Hungarian algorithm; $\delta(\cdot,\cdot)$ equals 1 when its two arguments coincide and 0 otherwise; N is the total number of samples; a(i) and b(i) are the mean intra-cluster distance and the mean nearest-cluster distance for sample i, respectively; and RI and $E[\mathrm{RI}]$ denote the Rand index and its expected value under random partitioning. To ensure comparability among different methods, SC was calculated in the normalized original seven-dimensional feature space for all clustering results. It should be noted that, for kernel-based clustering, the distance relationship in the kernel-induced feature space is different from that in the original Euclidean space. Therefore, SC is used only as a supplementary internal validity index in this study, while ACC and ARI are taken as the main metrics for evaluating the consistency between clustering results and experimental damage categories.
Among these indexes, the ACC reflects the proportion of correctly classified samples after label matching, where a higher ACC indicates better clustering accuracy. The ARI measures the agreement between the clustering result and the ground truth with correction for chance, with values closer to 1 signifying higher consistency and 0 corresponding to random partitioning. The SC evaluates the compactness and separation of clusters without using true labels; a higher SC suggests that samples are well assigned to their clusters.
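The ARI and SC can be sketched with the standard library as follows. The ARI is computed in its pair-counting form, which is equivalent to the chance-corrected Rand index described above, and the SC uses Euclidean distances; the toy labels and points are hypothetical.

```python
import math
from collections import Counter

def adjusted_rand_index(y_true, y_pred):
    """Chance-corrected agreement between two partitions (pair-counting form)."""
    n = len(y_true)
    together = sum(math.comb(c, 2) for c in Counter(zip(y_true, y_pred)).values())
    a = sum(math.comb(c, 2) for c in Counter(y_true).values())
    b = sum(math.comb(c, 2) for c in Counter(y_pred).values())
    expected = a * b / math.comb(n, 2)
    max_index = (a + b) / 2.0
    # Undefined (division by zero) when both partitions are trivial
    return (together - expected) / (max_index - expected)

def silhouette(X, labels):
    """Mean silhouette coefficient with Euclidean distances."""
    def dist(p, q):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(p, q)))
    scores = []
    for i, x in enumerate(X):
        by_cluster = {}
        for j, y in enumerate(X):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(dist(x, y))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)  # singleton cluster or single-cluster partition
            continue
        a = sum(own) / len(own)
        b = min(sum(v) / len(v) for v in by_cluster.values())
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

ari_same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])   # same partition, renamed
ari_cross = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])  # crossing partition
sc = silhouette([[0.0], [1.0], [10.0], [11.0]], [0, 0, 1, 1])
print(ari_same, ari_cross, sc)
```

The first ARI call shows that the index ignores how clusters are numbered, which is why it needs no label matching, unlike ACC.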
Since clustering results may be affected by random initialization, each method was independently repeated 10 times with different random seeds. The mean values were used to compare the overall clustering performance, while the standard deviations were used to evaluate the stability of each method.
5. Discussion
5.1. Class-Level Performance
Although the overall clustering performance has been evaluated in Section 4 using the ACC, ARI and SC, these global metrics cannot fully reveal the recognition characteristics of different damage types. These metrics compress the classification performance of freeze–thaw damage, corrosion damage and load-induced damage into single overall values. As a result, they may conceal the fact that some damage types are easier to identify while others are more prone to misclassification. For AE-based damage identification of RC beams, this class-dependent recognition performance is important because different damage mechanisms may generate AE signals with different degrees of feature separability. If only the overall clustering accuracy is considered, the model may appear reliable while still performing poorly for a specific damage type. Therefore, further analysis of the recognition performance for each damage category is necessary to clarify the identification difficulty of different AE signals and to reveal the source of misclassification.
To further evaluate the recognition performance of each damage type, the F1-score was introduced in this study. The F1-score is a class-level evaluation metric that combines precision and recall. Precision reflects the proportion of correctly identified samples among all samples predicted as a certain damage type, while recall reflects the proportion of correctly identified samples among all actual samples of that damage type. Therefore, the F1-score can simultaneously consider false positives and false negatives, making it suitable for evaluating the recognition performance of each individual damage category. The precision, recall and F1-score are defined as follows:

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \quad \mathrm{Recall} = \frac{TP}{TP + FN}, \quad F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

where TP denotes the number of samples correctly identified as a given damage type, FP denotes the number of samples from other damage types incorrectly identified as that damage type, and FN denotes the number of samples of that damage type incorrectly assigned to other categories. A higher F1-score indicates better recognition performance for the corresponding damage type. In contrast, a lower F1-score suggests that the damage type is more likely to be missed or confused with other categories in the current feature space.
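The class-level metrics can be computed directly from a confusion matrix, as in the following sketch. The matrix values are hypothetical and are not the counts reported in this study.

```python
def class_metrics(conf):
    """Per-class (precision, recall, F1) from a confusion matrix.

    conf[i][j] is the number of samples of true class i assigned to class j.
    """
    k = len(conf)
    out = []
    for c in range(k):
        tp = conf[c][c]
        fp = sum(conf[r][c] for r in range(k)) - tp  # other classes labeled as c
        fn = sum(conf[c]) - tp                       # class c labeled as something else
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2.0 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        out.append((precision, recall, f1))
    return out

# Hypothetical 3-class confusion matrix (rows: true class, columns: assigned class)
conf = [[8, 2, 0],
        [1, 9, 0],
        [0, 0, 10]]
metrics = class_metrics(conf)
print(metrics)
```

Reading the off-diagonal entries of the same matrix row by row gives the misclassification paths between damage types.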
As shown in Table 10 and Figure 15, the class-level F1-scores show clear differences in the recognition difficulty of the three damage types. Load-induced damage generally achieves high F1-scores under different methods, with values of 0.9944 ± 0.0008, 0.8760 ± 0.1532, 0.9717 ± 0.0039 and 0.9706 ± 0.0158 for K-Means, FCM, GMM and COA-KKM, respectively. Except for the relatively large fluctuation observed in FCM, the F1-scores of load-induced damage remain at a high level, indicating that the AE signals associated with loading-induced damage are more distinguishable in the current feature space.
In contrast, freeze–thaw and corrosion damage show stronger dependence on the clustering method. For freeze–thaw damage, K-Means and FCM obtain relatively low and unstable F1-scores, especially FCM, whose large standard deviation indicates strong sensitivity to random initialization. The GMM and COA-KKM significantly improve the recognition performance of freeze–thaw damage, with F1-scores of 0.8860 ± 0.0792 and 0.8759 ± 0.1091, respectively. This suggests that methods capable of describing non-spherical or nonlinear feature distributions are more suitable for identifying freeze–thaw-related AE signals.
For corrosion damage, COA-KKM achieves the highest F1-score of 0.9214 ± 0.0538, outperforming K-Means, FCM and the GMM. Although the GMM shows good overall performance in Table 8, its F1-score for corrosion damage is 0.8067 ± 0.2494, with a relatively large standard deviation, indicating that corrosion-related AE signals are still difficult to identify stably under the Gaussian mixture assumption. In comparison, COA-KKM provides a more balanced recognition performance across the three damage categories, especially improving the identification of corrosion damage. These results indicate that the advantage of COA-KKM is mainly reflected in its ability to handle boundary or overlapping samples, thereby improving the class-level consistency of different damage-related AE signals.
For visualization and misclassification analysis, the representative run whose ACC was closest to the mean ACC of the 10 repeated runs was selected.
Figure 16 further reveals the specific misclassification paths of different methods in the representative runs. For K-Means, the misclassification mainly occurs between freeze–thaw and corrosion damage: 83 freeze–thaw samples are identified as corrosion, whereas corrosion and load-induced samples are almost all correctly assigned. This indicates that the poor recognition of freeze–thaw damage by K-Means is not caused by general confusion among all three categories, but mainly by the shift of freeze–thaw samples toward the corrosion category in the feature space. FCM partly alleviates this problem, reducing the number of freeze–thaw samples misclassified as corrosion from 83 to 38. However, the misclassified samples are still mainly concentrated between freeze–thaw and corrosion, suggesting that fuzzy membership can improve boundary sample identification to some extent, but cannot fully eliminate the feature overlap between these two categories.
The misclassification pattern of the GMM differs from those of K-Means and FCM. Most freeze–thaw samples are correctly identified, whereas 22 corrosion samples are misclassified as freeze–thaw and 9 corrosion samples are misclassified as load-induced damage. This indicates that corrosion samples show a certain transitional characteristic in the feature space, and may be close to either freeze–thaw or load-induced AE signals. For COA-KKM, the main misclassification is the shift of freeze–thaw samples toward the load-induced category, with 21 freeze–thaw samples misclassified as load-induced damage. In addition, 12 corrosion samples are misclassified as freeze–thaw and 3 corrosion samples are misclassified as load-induced damage, while all load-induced samples are correctly identified. These results show that COA-KKM maintains stable recognition of load-induced damage and effectively reduces the misclassification of corrosion samples; however, some boundary freeze–thaw samples may still be assigned to the load-induced category when their AE features become close to those of loading-related signals.
Overall, the misclassified samples of the four methods are mainly concentrated in adjacent or transitional regions of the feature space. K-Means and FCM mainly confuse freeze–thaw with corrosion, the GMM tends to disperse corrosion samples toward the other two categories, and COA-KKM performs better in reducing corrosion misclassification while maintaining stable recognition of load-induced damage. Combined with the F1-score results, this shows that the three types of AE signals exhibit different degrees of separability in the feature space. Load-induced AE signals most readily form a stable category, whereas freeze–thaw- and corrosion-related signals remain the main sources of blurred classification boundaries.
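The representative-run rule and the reading of misclassification paths from a confusion matrix can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names are hypothetical, the toy inputs are invented, and the step that maps cluster indices to damage types (an exhaustive search for the mapping with maximal agreement) is an assumption, since the source does not describe how clusters were aligned with the reference labels.

```python
from itertools import permutations

CLASSES = ("freeze-thaw", "corrosion", "load-induced")


def representative_run(acc_per_run):
    """Index of the run whose ACC is closest to the mean ACC of all runs."""
    mean_acc = sum(acc_per_run) / len(acc_per_run)
    return min(range(len(acc_per_run)), key=lambda i: abs(acc_per_run[i] - mean_acc))


def aligned_confusion(true_labels, cluster_ids, n_classes=3):
    """Confusion matrix after mapping each cluster to the class that maximizes agreement."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for t, c in zip(true_labels, cluster_ids):
        counts[t][c] += 1
    # Assumed alignment step: try every cluster-to-class mapping (only 6 for 3 clusters).
    best = max(
        permutations(range(n_classes)),
        key=lambda p: sum(counts[p[c]][c] for c in range(n_classes)),
    )
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t in range(n_classes):
        for c in range(n_classes):
            cm[t][best[c]] += counts[t][c]
    return cm


def misclassification_paths(cm, names=CLASSES):
    """Off-diagonal entries as (true class, assigned class, count), largest first."""
    paths = [
        (names[t], names[p], cm[t][p])
        for t in range(len(cm))
        for p in range(len(cm))
        if t != p and cm[t][p] > 0
    ]
    return sorted(paths, key=lambda x: -x[2])


# Toy inputs (illustrative only): four run accuracies and nine labelled samples.
toy_acc = [0.90, 0.95, 0.93, 0.97]
toy_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
toy_clusters = [2, 2, 1, 0, 0, 0, 1, 1, 1]
toy_cm = aligned_confusion(toy_true, toy_clusters)
toy_paths = misclassification_paths(toy_cm)
```

In the toy example, one freeze–thaw sample falls into the cluster aligned with load-induced damage, so `toy_paths` contains a single freeze–thaw-to-load path, which mirrors the kind of path discussed for COA-KKM.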
5.2. Feature Interpretation
To further interpret the feature characteristics of the COA-KKM clustering results, the original AE features of the samples identified by the model and the main misclassification paths were analyzed based on the representative run. Figure 17 presents the normalized median values of the seven AE features for different sample groups. The first three rows represent the samples identified by COA-KKM as freeze–thaw, corrosion and load-induced damage, while the last two rows correspond to the two main misclassification paths in Figure 16d, namely corrosion samples misclassified as freeze–thaw damage and freeze–thaw samples misclassified as load-induced damage.
For the three identified damage groups, the samples identified as load-induced damage show the most distinctive feature profile. Their normalized median values of ring-down count, duration, average frequency and counts to peak are 0.37, 0.30, 0.49 and 0.27, respectively, which are generally higher than those of the freeze–thaw and corrosion groups. This indicates that COA-KKM tends to assign samples with stronger AE activity, longer signal duration and more prominent frequency characteristics to the load-induced damage category. In contrast, the samples identified as freeze–thaw damage generally show lower feature values, with only counts to peak reaching a relatively noticeable level. The samples identified as corrosion damage show higher values than the freeze–thaw group in rise time, duration, average frequency and amplitude, but are still generally lower than the load-induced group, presenting an intermediate feature profile between freeze–thaw and load-induced damage.
For the main misclassification paths, the corrosion samples misclassified as freeze–thaw damage show relatively low values in most AE features. Their ring-down count, duration and amplitude are lower than those of the overall samples identified as corrosion damage, and are closer to the feature levels of the freeze–thaw group. This suggests that some corrosion samples with weak AE activity and less pronounced duration-related responses may be located close to the freeze–thaw category in the feature space. On the other hand, the freeze–thaw samples misclassified as load-induced damage show a particularly high normalized median value of counts to peak, reaching 0.71, and their average frequency is also higher than that of the overall freeze–thaw group. Although these samples still have relatively low ring-down count, duration and amplitude, their more prominent rising-stage activity may cause them to shift toward the load-induced category during clustering.
Overall, Figure 17 further explains the misclassification phenomena observed in the COA-KKM confusion matrix. Load-induced samples show more pronounced responses in several AE features and therefore form a relatively stable identified category. Freeze–thaw and corrosion samples exhibit closer feature profiles, especially when corrosion samples have weak AE activity, which may cause them to be assigned to the freeze–thaw category. In addition, some freeze–thaw samples with high counts to peak tend to approach the load-induced category, becoming the main source of freeze–thaw-to-load misclassification.
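A group-wise profile of normalized median feature values, of the kind compared above, could be computed along the following lines. This is only a sketch: it assumes min-max scaling of each feature over the whole dataset, which the source does not confirm, and the function names, group labels and toy values are hypothetical.

```python
from statistics import median


def minmax_normalize(rows):
    """Scale each feature column to [0, 1] using the min and max over all samples."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]


def group_medians(rows, groups):
    """Median of each normalized feature within each sample group."""
    norm = minmax_normalize(rows)
    profile = {}
    for g in sorted(set(groups)):
        members = [r for r, lab in zip(norm, groups) if lab == g]
        profile[g] = [median(col) for col in zip(*members)]
    return profile


# Toy data (illustrative only): four samples, two features, two groups.
toy_rows = [[0, 10], [5, 20], [10, 30], [10, 10]]
toy_groups = ["a", "a", "b", "b"]
toy_profile = group_medians(toy_rows, toy_groups)
```

Each row of such a profile corresponds to one sample group, so a misclassified subgroup can be compared directly with the overall group it was drawn from and with the group it was assigned to.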
6. Conclusions
This paper proposed a COA-optimized kernel K-means clustering method for the unsupervised identification of AE signals associated with different damage types in RC beams. Based on AE signals collected from three RC beams under loading-induced damage, freeze–thaw damage and reinforcement corrosion conditions, a dataset containing 490 valid samples with seven AE features was constructed. The main conclusions are as follows:
- (1) The AE signals corresponding to the three damage types show distinguishable but partially overlapping feature distributions. Load-induced damage generally exhibits higher ring-down count, duration, average frequency and amplitude, while freeze–thaw and corrosion damage show closer feature distributions in several parameters. This indicates that single AE parameters are insufficient for reliable damage identification, and multi-feature clustering is necessary.
- (2) Compared with K-Means, FCM and the GMM, the proposed COA-KKM method achieves the best overall clustering performance. Over 10 repeated runs, COA-KKM obtains the highest ACC and ARI, reaching 92.86% ± 4.19% and 0.8215 ± 0.0662, respectively. The improvements in ACC and ARI are statistically significant compared with K-Means and FCM, while COA-KKM also shows a higher mean performance and better stability than the GMM. The results show that kernel mapping and COA-based parameter optimization can improve the nonlinear separation ability and stability of clustering for different damage-related AE signals.
- (3) The recognition difficulty differs among damage types. Load-induced damage can be identified more stably, while freeze–thaw and corrosion damage are more prone to confusion due to feature overlap. Class-level results show that COA-KKM achieves the highest F1-score for corrosion damage, reaching 0.9214 ± 0.0538, and maintains good recognition performance for both freeze–thaw and load-induced damage.
- (4) Feature interpretation based on the representative COA-KKM result further shows that the main misclassified samples are located in feature-overlapping or transitional regions. Corrosion samples with weaker AE activity tend to be misclassified as freeze–thaw damage, while some freeze–thaw samples with higher counts to peak tend to shift toward the load-induced category.
It should be noted that this study focuses on validating the feasibility of the COA-KKM method for the unsupervised identification of AE signals associated with different damage types; it does not include a complete engineering deployment study for bridge structural health monitoring systems. Owing to the limited experimental conditions, the dataset was collected from controlled tests on three RC beams and has a relatively small sample size. In addition, each of the three types of AE signals was obtained under a single dominant damage condition, which cannot fully represent complex service scenarios in which multiple types of damage coexist or evolve sequentially in actual bridge structures. Nevertheless, the COA-KKM method showed good performance in terms of clustering accuracy, ARI and stability, indicating its potential as an auxiliary analysis tool for AE-based bridge monitoring. For future engineering application, further validation using field AE monitoring data is still needed, with consideration of sensor deployment, real-time computation, traffic-induced noise, environmental variations, sensor coupling conditions and non-damage AE sources, so as to evaluate the robustness and applicability of the method under complex on-site conditions.