A Novel Residual Dual Attention Multiscale Network for Vibration-Based Damage Recognition in Floating Wind Turbine Structural Health Monitoring

Han, Huiming; Li, Yifei; Wang, Renqiang; Deng, Hua; Lu, Yuchen; Zhang, Yuxuan

doi:10.3390/s26134104


by Huiming Han ¹, Yifei Li ²,*, Renqiang Wang ¹, Hua Deng ¹, Yuchen Lu ³ and Yuxuan Zhang ⁴,⁵,*

¹ School of Nautical Technology, Jiangsu Maritime Institute, Nanjing 210024, China

² School of Engineering, Huzhou Normal University, Huzhou 313000, China

³ Yantai Research Institute, Harbin Engineering University, Yantai 264006, China

⁴ College of Intelligent Science and Engineering, Beijing University of Agriculture, Beijing 102206, China

⁵ Department of Computer and Electrical Engineering, Mid Sweden University, SE-851 70 Sundsvall, Sweden

* Authors to whom correspondence should be addressed.

Sensors 2026, 26(13), 4104; https://doi.org/10.3390/s26134104 (registering DOI)

Submission received: 15 May 2026 / Revised: 23 June 2026 / Accepted: 25 June 2026 / Published: 28 June 2026

(This article belongs to the Section Physical Sensors)


Abstract

Floating wind turbines (FWTs) are key equipment for deep-sea clean energy exploitation, and their structural health condition is directly related to operational safety and energy output. However, FWT vibration signals exhibit significant non-stationary and multi-scale characteristics, with damage-sensitive features of different damage patterns spanning multiple temporal scales. Existing methods fail to sufficiently extract and fuse multi-scale damage-sensitive features. To this end, this paper proposes a novel Residual Dual Attention Multiscale Network (RDAMNet). The network innovatively designs a signal-level multi-scale decoupling strategy that extracts damage-sensitive features at different scales from complementary signal representations through a multi-branch differentiated architecture. Furthermore, an ECA-SE dual attention mechanism is designed to collaboratively enhance damage-related channel responses at both the feature extraction and fusion stages. Multiple independent experimental results on a publicly available dataset demonstrate that RDAMNet achieves a mean damage recognition accuracy and a weighted F1-score of 95.39% and 95.37%, respectively, significantly outperforming five compared methods. Cross-condition generalization experiments further demonstrate that RDAMNet maintains mean accuracies exceeding 94% across different wind speed and wind direction combinations, validating its stability across operating conditions. Moreover, RDAMNet only contains 663,783 parameters with a single-sample GPU inference time of 5.35 ms, exhibiting a favorable performance–efficiency trade-off. The ablation study verifies the effective contribution of each core component, and branch importance analysis, together with Grad-CAM visualization, further substantiates the multi-scale feature learning capability of the network. The proposed method provides an effective technical approach for intelligent structural health monitoring of FWTs in complex oceanic environments.

Keywords: floating wind turbine; structural health monitoring; vibration-based damage recognition; multi-scale feature extraction; attention mechanism

1. Introduction

Floating wind turbines (FWTs) serve as key equipment for deep-sea wind energy exploitation. They are capable of overcoming the water depth limitations of nearshore fixed-bottom wind turbines and harnessing stronger and more consistent wind resources in deep seas, thus playing an increasingly important role in the global clean energy transition and offshore renewable energy development [1,2,3,4,5,6,7,8]. However, FWTs operate over extended service periods in harsh oceanic environments characterized by high wind loading, intense wave excitation, salt spray corrosion, and continuous cyclic loading. These severe conditions subject critical structural components [9,10,11,12,13,14,15,16,17], including the tower, blades, floater connections, and mooring systems, to persistent degradation and damage risks. Early-stage structural damage, if not detected in a timely manner, may progressively propagate into severe structural failure under uncontrollable environmental loading, potentially leading to catastrophic consequences [18,19,20,21,22,23,24]. Given the highly restricted accessibility of offshore environments, manual inspection and maintenance operations are both costly and hazardous [25,26,27,28,29,30]. Therefore, developing automated structural health monitoring (SHM) techniques for FWTs is of paramount importance for enabling predictive maintenance, reducing unplanned downtime, and maximizing energy output [18,31,32,33,34,35,36].

Among the SHM techniques for FWTs, vibration-based methods have attracted extensive attention due to their broad applicability, cost-effectiveness, and capability for continuous real-time monitoring [37,38,39,40]. The core principle of this technique lies in the fact that structural damage induces measurable vibration response changes by altering stiffness, mass distribution, or damping properties [41,42,43,44,45]. In fact, vibration- and strain-response-based damage identification strategies have been extensively investigated for various engineering structures such as bridges, trusses, and pipelines [46,47,48,49,50], and these methods have accumulated a rich theoretical and methodological foundation for the field of structural damage identification. In terms of traditional machine learning, data-driven approaches leverage signal processing techniques to extract time-domain or frequency-domain features from vibration signals, combined with classifiers such as decision trees, artificial neural networks, k-nearest neighbors, and support vector machines for damage diagnosis [51,52,53,54]. Methods based on stochastic parametric models, such as autoregressive models and linear parameter-varying autoregressive models, achieve robust diagnosis of wind turbine structural damage through statistical testing. However, traditional machine learning methods rely on manual feature extraction and exhibit limited generalization capability in handling high-dimensional nonlinear data [55,56,57,58,59]. Deep learning has been widely adopted across multiple domains and continues to demonstrate outstanding performance, exhibiting strong automatic feature learning capability and superior generalization ability [60,61,62,63,64,65]. 
With the development of deep learning, convolutional neural networks, recurrent neural networks, and their hybrid architectures have shown significant advantages in wind turbine fault diagnosis, including end-to-end feature learning and improved generalization performance [66,67,68,69,70,71]. Meanwhile, attention mechanisms and multi-scale feature extraction strategies have been introduced to enhance the modeling of complex vibration patterns, and multisensory collaborative diagnosis frameworks have been further explored for damage localization and quantification in FWTs [72,73,74,75,76].

Although the aforementioned methods have achieved certain progress, existing research on FWT vibration-based damage recognition still suffers from several limitations. The vibration response signals of FWTs exhibit significant multi-scale time-frequency characteristics, and different types of damage manifest as differentiated signal signatures across varying temporal scales [77,78]. However, most existing methods employ single-scale or unified input representations for feature extraction, failing to sufficiently capture the complementary information among multi-scale damage-sensitive features. Moreover, the coupled effects of oceanic environmental noise and multi-source loading severely obscure early-stage subtle damage features [79,80]. Existing network architectures lack effective channel-level feature enhancement mechanisms, causing the responses of damage-sensitive channels to be diluted by noise-dominant channels. Meanwhile, current research on FWT vibration-based damage diagnosis predominantly focuses on mooring systems, with a notable lack of systematic investigation into early-stage damage recognition of other critical structural components, including the tower, floater connections, and blades.

To address these challenges, this paper proposes a novel Residual Dual Attention Multiscale Network (RDAMNet) for vibration-based damage recognition of FWTs. RDAMNet innovatively designs a signal-level multi-scale decoupling strategy that decomposes the raw vibration signal at the network front-end into complementary signal representations with distinct physical characteristics, and extracts damage-sensitive features at different scales through a differentiated branch architecture. Residual connections ensure the effective propagation of subtle damage features through deep networks. The ECA-SE dual attention mechanism collaboratively enhances damage-sensitive channel responses at both the feature extraction and fusion stages, effectively suppressing the interference of oceanic environmental noise. The proposed method is validated on a publicly available dataset. The main contributions of this paper are summarized as follows:

1. A novel Residual Dual Attention Multiscale Network (RDAMNet) is proposed for vibration-based damage recognition of FWTs, which innovatively designs a signal-level multi-scale decoupling strategy and a differentiated branch architecture to achieve complementary extraction of multi-scale damage-sensitive features, overcoming the limitation of existing methods in insufficiently capturing multi-scale information from unified input representations.
2. A dual attention mechanism composed of ECA and SE is designed to hierarchically enhance damage-sensitive channel responses at the feature extraction stage and adaptively recalibrate the contribution weights of cross-branch channels at the fusion stage, effectively mitigating the masking effect of oceanic environmental noise on subtle damage features.
3. Comprehensive experiments on the UPATRAS Floating Wind Turbine Vibration Dataset validate the effectiveness of RDAMNet, where multi-run comparative analyses, cross-condition generalization validation, ablation studies, and interpretability analyses systematically demonstrate the superiority of the proposed method from multiple perspectives, namely statistical reliability, generalization capability, component contribution, and feature visualization.

2. Proposed Method

2.1. Problem Formulation

Floating wind turbines (FWTs) operate in complex oceanic environments over extended service periods, and the vibration response variations induced by structural degradation serve as critical information sources for damage recognition. This study aims to leverage one-dimensional vibration signals acquired by nacelle-mounted accelerometers to automatically identify the structural health state of FWTs. The task is formulated as a supervised multi-pattern damage recognition problem. Given a one-dimensional vibration signal $x \in \mathbb{R}^{L}$ of length $L$, the objective is to learn a mapping function $f : \mathbb{R}^{L} \to \{1, 2, \dots, C\}$ that assigns each signal to one of $C$ predefined structural health states, where $C$ denotes the total number of damage patterns including the healthy state. The training set $D = \{(x_{i}, y_{i})\}_{i = 1}^{N}$ comprises $N$ labeled samples, where $y_{i}$ represents the ground-truth label of the $i$-th sample.

2.2. Overview of the Proposed RDAMNet

The vibration response signals of FWTs exhibit significant multi-scale time-frequency characteristics, and distinct damage patterns manifest as differentiated signal signatures across varying temporal scales. For instance, mooring line degradation induces low-frequency global stiffness variations that alter long-range structural responses, whereas localized structural defects such as crack propagation generate transient impulse signatures. Moreover, the coupled effects of oceanic environmental noise and multi-source loading further obscure damage-sensitive features. However, conventional single-branch convolutional neural networks, constrained by a fixed receptive field, fail to simultaneously capture both localized transient impulses and long-range structural variations. Meanwhile, standard convolutional operations assign equal weights across all feature channels, causing critical damage-related channel responses to be diluted by noise-dominant channels.

To address these challenges, a novel Residual Dual Attention Multiscale Network (RDAMNet) is proposed for vibration-based damage recognition of FWTs. The core design philosophy of RDAMNet encompasses three aspects. Firstly, a multi-branch multi-scale input strategy is employed to extract complementary time-frequency features from the raw signal, max-pooled signal, and average-pooled signal, thereby enhancing the perceptual capability of the network across distinct damage patterns. Secondly, residual connections are incorporated within each feature extraction branch to mitigate the degradation of subtle damage features in deep networks, ensuring effective gradient propagation. Finally, a dual attention mechanism is introduced, which comprises ECA [81] and SE [82] attention. Specifically, ECA is embedded within each branch to enhance damage-sensitive channel responses, while SE operates on the post-fusion features for channel recalibration. The two mechanisms collaboratively achieve adaptive highlighting of damage-sensitive features. The overall architecture of the proposed RDAMNet is illustrated in Figure 1. To facilitate reproducibility, Table 1 lists the detailed configuration parameters of each module in RDAMNet and Table 2 further provides the internal specifications of the three core modules: ResECA, ECA, and SE.

The following subsections present each component of RDAMNet in the order of the data flow. The multi-scale signal input and differentiated branch design are introduced first, followed by the detailed structure of the ResECA feature extraction block within each branch. Subsequently, the principles of the ECA and SE attention mechanisms are described, respectively. Finally, the adaptive feature fusion and damage recognition output are presented.

2.3. Multi-Scale Signal Input and Differentiated Branch Design

In the vibration signals of FWTs, distinct types of damage signatures are distributed across varying temporal scales. Long-range low-frequency structural responses induced by global stiffness variations such as mooring line degradation, transient impulse signatures generated by localized defects such as crack propagation, and subtle damage features submerged in oceanic environmental noise each require extraction at different temporal resolutions. A single signal representation fails to simultaneously accommodate the extraction requirements of these multi-scale features. Therefore, RDAMNet applies max-pooling and average-pooling operations to the original input signal $x$ at the network front-end, generating three complementary multi-scale input signals. Among them, the raw signal preserves the full time-domain resolution, the max-pooled signal retains the peak responses within local regions to highlight transient impulses and abrupt variations, and the average-pooled signal suppresses stochastic high-frequency noise to facilitate the extraction of stationary low-frequency structural responses. The three input signals can be formulated as:

$$x_{1} = x, \quad x_{2} = \mathrm{MaxPool}_{p}(x), \quad x_{3} = \mathrm{AvgPool}_{p}(x) \tag{1}$$

where $p$ denotes the pooling kernel size and stride, $x_{1} \in \mathbb{R}^{1 \times L}$ represents the raw signal, $x_{2} \in \mathbb{R}^{1 \times \lfloor L / p \rfloor}$ represents the max-pooled signal, and $x_{3} \in \mathbb{R}^{1 \times \lfloor L / p \rfloor}$ represents the average-pooled signal.
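As a rough illustration of Eq. (1), the following sketch builds the three input signals with NumPy. The function name `multiscale_inputs`, the toy signal, and the value p = 2 are assumptions made for illustration only; this section does not state the pooling size used by RDAMNet.

```python
import numpy as np

def multiscale_inputs(x, p):
    """Return (x1, x2, x3) in the sense of Eq. (1): raw, max-pooled, average-pooled.

    Kernel size and stride are both p (hypothetical value chosen by the caller).
    Trailing samples that do not fill a whole window are dropped, so the pooled
    signals have floor(L / p) samples.
    """
    x = np.asarray(x, dtype=float)
    n = len(x) // p
    windows = x[: n * p].reshape(n, p)  # one row per non-overlapping window
    return x, windows.max(axis=1), windows.mean(axis=1)

# Toy signal of length 7, so the last sample is dropped by the pooling.
x = np.array([0.0, 1.0, 5.0, 1.0, 2.0, 2.0, -3.0])
x1, x2, x3 = multiscale_inputs(x, p=2)
print(x2)  # peaks of each window
print(x3)  # means of each window
```

On the toy signal, the max-pooled output keeps the spike at 5.0, while the average-pooled output smooths it to 3.0, which mirrors the roles the two pooled branches are given in the text.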

The three signals are fed into three differently configured feature extraction branches. Each branch comprises three cascaded ResECA feature extraction blocks, while employing differentiated configurations in terms of kernel size, pooling strategy, and dilation rate to match the physical characteristics of each signal.

The raw signal branch employs progressively decreasing large kernel sizes to extract low-frequency and global structural response features over extended temporal ranges. Larger convolutional kernels cover a wider temporal window, facilitating the capture of global vibration pattern variations induced by structural damage. The max-pooled branch employs smaller kernel sizes. Since the input signal has already been processed by max-pooling to retain peak information, this branch combined with small convolutional kernels is more effective at capturing localized impulse, abrupt variation, and spike response features. The average-pooled and dilated convolution branch introduces progressively increasing dilation rates on the basis of the average-pooled signal. The average-pooling operation suppresses stochastic noise beforehand, while dilated convolutions expand the receptive field without increasing the parameter count or further reducing temporal resolution, enabling this branch to capture structural response patterns spanning longer temporal extents under noise-suppressed conditions.
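The claim that dilation widens the receptive field at no parameter cost can be checked with a short calculation. The sketch below assumes a stack of stride-1 convolutions with no pooling in between, and the kernel sizes and dilation rates are hypothetical values, not the configuration listed in Table 1.

```python
def receptive_field(layers):
    """Receptive field of stacked stride-1 1-D convolutions.

    layers: list of (kernel_size, dilation) pairs, first layer first.
    Each layer widens the field by (kernel_size - 1) * dilation samples.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf

# Hypothetical three-layer stacks with identical kernel sizes, hence the same
# number of weights per channel pair, but different dilation schedules.
plain = receptive_field([(3, 1), (3, 1), (3, 1)])
dilated = receptive_field([(3, 1), (3, 2), (3, 4)])
print(plain, dilated)
```

Under these assumptions the dilated stack covers roughly twice as many input samples as the plain stack while using the same number of weights.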

2.4. ResECA Feature Extraction Block

The vibration variations induced by early-stage structural damage in FWTs are often extremely subtle and tend to degrade or even vanish during the layer-by-layer abstraction process of multi-layer convolutions. To ensure these subtle damage features are effectively preserved in deep networks, the fundamental feature extraction unit within each branch adopts the ResECA structure. The residual connection constructs a shortcut path for identity mapping, enabling subtle damage features to be directly propagated across multiple convolutional layers to deeper levels. The ECA adaptively weights channel responses at each hierarchical level, highlighting damage-sensitive channels.

Specifically, this module comprises two layers of one-dimensional convolution [83], batch normalization (BN) [84], ReLU activation [85], ECA channel attention [81], and a residual connection. Given the input feature $F_{in} \in \mathbb{R}^{C_{in} \times T}$, the feature extraction process can be formulated as

$$H = \mathrm{Conv1d}_{2}(\mathrm{ReLU}(\mathrm{BN}_{1}(\mathrm{Conv1d}_{1}(F_{in})))) \tag{2}$$

$$F_{out} = \mathrm{Pool}(\mathrm{ReLU}(\mathrm{ECA}(\mathrm{BN}_{2}(H)) + S(F_{in}))) \tag{3}$$

where $S(\cdot)$ denotes the shortcut connection, which is an identity mapping when the input and output dimensions are identical, or a $1 \times 1$ convolution with batch normalization when a dimensional transformation is required. $\mathrm{Pool}(\cdot)$ represents the optional max-pooling operation for temporal downsampling.
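To make the order of operations in Eqs. (2) and (3) concrete, here is a minimal single-sample NumPy sketch. It is not the authors' implementation: `channel_norm` is a stand-in for batch normalization (per-channel standardization over time, without learned scale and shift), the weights are random, and all shapes, kernel sizes, and helper names are assumptions for illustration.

```python
import numpy as np

def conv1d_same(x, w):
    """x: (C_in, T), w: (C_out, C_in, k) with odd k; stride 1, zero 'same' padding."""
    k = w.shape[2]
    t = x.shape[1]
    xp = np.pad(x, ((0, 0), (k // 2, k // 2)))
    cols = np.stack([xp[:, j:j + t] for j in range(k)], axis=1)  # (C_in, k, T)
    return np.einsum("oik,ikt->ot", w, cols)

def channel_norm(x, eps=1e-5):
    """Stand-in for BN: standardize each channel over time (assumption)."""
    return (x - x.mean(axis=1, keepdims=True)) / np.sqrt(x.var(axis=1, keepdims=True) + eps)

def eca(x, kernel):
    """GAP per channel, 1-D convolution across channels, sigmoid, rescale."""
    s = x.mean(axis=1)
    z = np.convolve(np.pad(s, len(kernel) // 2), kernel[::-1], mode="valid")
    return x * (1.0 / (1.0 + np.exp(-z)))[:, None]

def res_eca_block(f_in, w1, w2, eca_kernel, w_short=None, pool=1):
    # Eq. (2): H = Conv1d_2(ReLU(BN_1(Conv1d_1(F_in))))
    h = conv1d_same(np.maximum(channel_norm(conv1d_same(f_in, w1)), 0.0), w2)
    # Shortcut S: identity, or 1x1 convolution plus normalization if shapes differ.
    shortcut = f_in if w_short is None else channel_norm(conv1d_same(f_in, w_short))
    # Eq. (3): ReLU(ECA(BN_2(H)) + S(F_in)), followed by optional max-pooling.
    out = np.maximum(eca(channel_norm(h), eca_kernel) + shortcut, 0.0)
    if pool > 1:
        n = out.shape[1] // pool
        out = out[:, : n * pool].reshape(out.shape[0], n, pool).max(axis=2)
    return out

rng = np.random.default_rng(0)
f_in = rng.standard_normal((4, 32))              # hypothetical C_in = 4, T = 32
w1 = rng.standard_normal((8, 4, 5)) * 0.1
w2 = rng.standard_normal((8, 8, 5)) * 0.1
w_short = rng.standard_normal((8, 4, 1)) * 0.1   # 1x1 conv: channels change 4 -> 8
f_out = res_eca_block(f_in, w1, w2, np.ones(3) / 3, w_short, pool=2)
print(f_out.shape)
```

Note that the attention is applied to the residual path only, before the shortcut is added, which is what Eq. (3) specifies.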

2.5. Efficient Channel Attention

ECA is a lightweight channel attention mechanism proposed by Wang et al. [81]. Unlike SE attention that employs fully connected layers to model global channel relationships, ECA leverages one-dimensional convolution to achieve local cross-channel interaction, maintaining efficient channel modeling capability while avoiding the information loss caused by dimensionality reduction. RDAMNet embeds ECA within each ResECA block to adaptively enhance damage-sensitive channel responses at each hierarchical level. The principle of ECA is illustrated in Figure 2.

Given the input feature $F \in \mathbb{R}^{C \times T}$, ECA first compresses each channel into a scalar descriptor through global average pooling, then captures local interactions among adjacent channels through a one-dimensional convolution with an adaptive kernel size, and ultimately generates channel weights through sigmoid activation. The computational process can be formulated as:

$$s = \mathrm{GAP}(F) \in \mathbb{R}^{C \times 1} \tag{4}$$

$$w = \sigma\left( \left( \mathrm{Conv1d}_{k}(s^{\top}) \right)^{\top} \right) \in \mathbb{R}^{C \times 1} \tag{5}$$

$$F' = F \odot w \tag{6}$$

where $\mathrm{GAP}(\cdot)$ denotes the global average pooling, $\sigma(\cdot)$ denotes the sigmoid activation function, $\odot$ represents the channel-wise multiplication, and $k$ represents the adaptive kernel size of the one-dimensional convolution. The kernel size $k$ is adaptively determined based on the channel dimensionality $C$ as:

$$k = \left| \frac{\log_{2} C}{\gamma} + \frac{b}{\gamma} \right|_{\mathrm{odd}} \tag{7}$$

where $\gamma$ and $b$ are hyperparameters that control the mapping relationship between channel dimensionality and interaction range, and $|\cdot|_{\mathrm{odd}}$ denotes rounding to the nearest odd integer.
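A small worked example of Eq. (7) follows. The values γ = 2 and b = 1 are assumed here because they are commonly used defaults for ECA; this section does not state the values used in RDAMNet. The odd-rounding rule (truncate, then move even results up by one) is likewise one common convention and an assumption. Of the channel counts shown, only 128 appears in this section; 32 and 64 are hypothetical.

```python
import math

def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive ECA kernel size from Eq. (7).

    Assumed convention: truncate the value to an integer, then bump an even
    result up to the next odd integer.
    """
    t = int(abs(math.log2(channels) / gamma + b / gamma))
    return t if t % 2 == 1 else t + 1

sizes = {c: eca_kernel_size(c) for c in (32, 64, 128)}
print(sizes)
```

With these assumed hyperparameters the interaction range grows slowly with the channel count, so even a 128-channel layer only mixes a handful of neighboring channels.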

2.6. Squeeze-And-Excitation Attention

SE is a channel attention mechanism proposed by Hu et al. [82], which achieves channel recalibration through two steps, namely global information squeeze and channel excitation. In RDAMNet, the complementary multi-scale features extracted by the three branches are concatenated along the channel dimension and compressed through a $1 \times 1$ convolution, after which the channels originating from different branches contribute differently to the final damage recognition. SE attention is introduced at the fusion stage to adaptively adjust the importance weights of each channel, thereby highlighting the channel responses most relevant to damage recognition in the fused features. The principle of SE attention is illustrated in Figure 3.

Given the fused feature $G \in \mathbb{R}^{C' \times T'}$, SE first compresses the temporal information of each channel into a scalar descriptor through global average pooling, then learns the nonlinear inter-channel dependencies through a two-layer fully connected network, and ultimately generates a channel weight vector. The computational process can be formulated as:

$$z = \mathrm{GAP}(G) \in \mathbb{R}^{C'} \tag{8}$$

$$w_{SE} = \sigma(W_{2} \cdot \mathrm{ReLU}(W_{1} \cdot z)) \in \mathbb{R}^{C'} \tag{9}$$

$$G' = G \odot w_{SE} \tag{10}$$

where $W_{1} \in \mathbb{R}^{(C' / r) \times C'}$ and $W_{2} \in \mathbb{R}^{C' \times (C' / r)}$ are the learnable weight matrices of the two fully connected layers, respectively, $r$ denotes the reduction ratio that controls the bottleneck dimensionality, and $\sigma(\cdot)$ represents the sigmoid activation function.
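Eqs. (8) to (10) translate almost line by line into NumPy. The sketch below uses random weights, omits bias terms (as Eq. (9) does), and picks hypothetical sizes C' = 8, T' = 16 and r = 4 for illustration; none of these values is taken from the paper's configuration tables.

```python
import numpy as np

def se_attention(g, w1, w2):
    """SE recalibration for a single sample.

    g:  (C', T') fused feature
    w1: (C'/r, C') squeeze weights, w2: (C', C'/r) excitation weights
    Returns the recalibrated feature and the channel weight vector.
    """
    z = g.mean(axis=1)                            # Eq. (8): GAP over time
    hidden = np.maximum(w1 @ z, 0.0)              # bottleneck with ReLU
    w_se = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # Eq. (9): sigmoid gate
    return g * w_se[:, None], w_se                # Eq. (10): channel-wise scaling

rng = np.random.default_rng(0)
c, t, r = 8, 16, 4                                # hypothetical sizes
g = rng.standard_normal((c, t))
w1 = rng.standard_normal((c // r, c)) * 0.5
w2 = rng.standard_normal((c, c // r)) * 0.5
g_prime, w_se = se_attention(g, w1, w2)
print(w_se.round(3))
```

Each channel is scaled by a single weight in (0, 1), so SE can attenuate channels but does not change their temporal shape.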

2.7. Adaptive Feature Fusion and Damage Recognition

After the feature extraction through the three branches described above, the output features $f_{1}, f_{2}, f_{3} \in \mathbb{R}^{128 \times T'}$ from each branch are first aligned along the temporal dimension and then concatenated along the channel dimension to generate a joint feature representation. Subsequently, a $1 \times 1$ convolution compresses the channel dimensionality of the concatenated features from $3 \times 128$ to 128, reducing redundancy while achieving nonlinear cross-branch feature fusion. After the fused features undergo SE attention-based channel recalibration, two complementary global feature vectors are generated through global average pooling (GAP) and global max pooling (GMP), respectively. GAP captures the average activation response across each channel, reflecting the overall feature distribution. GMP captures the peak activation response across each channel, preserving the most salient discriminative information. The two global vectors are concatenated and subsequently processed through Dropout regularization and a two-layer fully connected network, ultimately yielding a $C$-dimensional damage recognition result vector. The fusion and damage recognition process can be formulated as:

$$F_{cat} = \mathrm{Concat}(f_{1}, f_{2}, f_{3}) \in \mathbb{R}^{384 \times T'} \tag{11}$$

$$F_{fuse} = \mathrm{SE}(\mathrm{ReLU}(\mathrm{BN}(\mathrm{Conv}_{1 \times 1}(F_{cat})))) \in \mathbb{R}^{128 \times T'} \tag{12}$$

$$v = \mathrm{Concat}(\mathrm{GAP}(F_{fuse}), \mathrm{GMP}(F_{fuse})) \in \mathbb{R}^{256} \tag{13}$$

$$\hat{y} = \mathrm{FC}_{2}(\mathrm{Dropout}(\mathrm{ReLU}(\mathrm{FC}_{1}(\mathrm{Dropout}(v))))) \in \mathbb{R}^{C} \tag{14}$$

where $\hat{y}$ denotes the predicted logit vector, and the final damage recognition result is obtained by selecting the damage pattern with the maximum predicted value.
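The tensor shapes in Eqs. (11) to (14) can be traced with a simplified inference-time sketch. Several simplifications are assumed: the branch outputs are already aligned to a common T', batch normalization and SE in Eq. (12) are left out for brevity, Dropout is treated as the identity (its inference behavior), biases are omitted, and the hidden width of 64 is hypothetical. C = 6 matches the six health states of the dataset.

```python
import numpy as np

def fuse_and_classify(f1, f2, f3, w_fuse, w_fc1, w_fc2):
    """Shape walk-through of the fusion head for one sample (simplified)."""
    f_cat = np.concatenate([f1, f2, f3], axis=0)       # Eq. (11): (384, T')
    # A 1x1 convolution is a matrix product applied at every time step.
    f_fuse = np.maximum(w_fuse @ f_cat, 0.0)           # Eq. (12) without BN and SE
    v = np.concatenate([f_fuse.mean(axis=1),           # GAP: (128,)
                        f_fuse.max(axis=1)])           # GMP: (128,) -> v: (256,)
    logits = w_fc2 @ np.maximum(w_fc1 @ v, 0.0)        # Eq. (14), Dropout = identity
    return f_cat, f_fuse, v, logits

rng = np.random.default_rng(0)
t_prime, n_classes, hidden = 10, 6, 64                 # T' and hidden are hypothetical
f1, f2, f3 = (rng.standard_normal((128, t_prime)) for _ in range(3))
w_fuse = rng.standard_normal((128, 384)) * 0.05
w_fc1 = rng.standard_normal((hidden, 256)) * 0.05
w_fc2 = rng.standard_normal((n_classes, hidden)) * 0.05
f_cat, f_fuse, v, logits = fuse_and_classify(f1, f2, f3, w_fuse, w_fc1, w_fc2)
predicted = int(np.argmax(logits))                     # index of the largest logit
print(f_cat.shape, f_fuse.shape, v.shape, logits.shape, predicted)
```

Because GAP and GMP both collapse the time axis, the classifier input has a fixed size of 256 regardless of T'.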

3. Experimental Validation

3.1. Dataset Description

The vibration data employed in this study are sourced from the UPATRAS Floating Wind Turbine Vibration Dataset [18], publicly released by the Stochastic Mechanical Systems and Automation Laboratory at the University of Patras, Greece. As illustrated in Figure 4, this dataset is based on a lab-scale FWT model, with vibration signals acquired by a single uniaxial accelerometer mounted on the upper part of the tower at a sampling frequency of $f_{s} = 1024$ Hz. The dataset encompasses six structural health states, including one healthy state and five early-stage damage scenarios. The experiments were conducted under nine operating conditions derived from three wind speeds and three wind directions, yielding a total of 540 vibration signals.

This study selects the vibration signals of all six structural health states under the operating condition of Wind Direction 1 and Wind Speed 1, designated as WD1_WS1, as experimental data. The six structural health states and their corresponding labels are described as follows. The healthy state H indicates that the FWT structure is free from any artificial damage, serving as the baseline reference. The bolt connection degradation state B represents the connection degradation between the tower and floater, implemented by removing two of the eight mounting bolts. The added mass state M1 simulates the additional loading caused by ice accumulation by attaching a mass of 1.7 g to the blade edge. The added mass state M2 simulates more severe ice accumulation by attaching a larger mass of 2.3 g. The blade crack state C1 has a crack length of 1.5 cm, corresponding to approximately 4% of the overall blade length. The blade crack state C2 has a crack length of 3 cm, corresponding to approximately 8% of the overall blade length.

To ensure that no information leakage exists between the training and testing sets, a strict data partitioning strategy is adopted. First, the complete vibration recordings corresponding to each health state are divided into non-overlapping training and testing subsets at a ratio of 8:2. Subsequently, a non-overlapping sliding window of length 1024 is independently applied within the already-assigned training and testing subsets to segment the signals. Since the segmentation is performed strictly after the partitioning and no overlap exists between adjacent windows, the samples in the training and testing sets are completely isolated in time, eliminating any cross-contamination from adjacent or identical signal segments. The final dataset comprises 1140 training samples and 282 testing samples, yielding a total of 1422 samples.
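The split-then-segment order described above can be sketched in a few lines. The sketch assumes a single contiguous cut per recording, with the first 80% used for training; the text does not say where the cut is placed. The recording length of 10,000 samples is hypothetical, chosen only to keep the example small.

```python
def split_then_window(recording, train_ratio=0.8, window=1024):
    """Split one recording in time first, then cut non-overlapping windows.

    Windows are taken separately inside each subset, so no window can contain
    samples from both sides of the cut. Incomplete trailing windows are dropped.
    """
    cut = round(len(recording) * train_ratio)

    def windows(segment):
        last_start = len(segment) - window
        return [segment[i:i + window] for i in range(0, last_start + 1, window)]

    return windows(recording[:cut]), windows(recording[cut:])

# Sample indices stand in for acceleration values so leakage is easy to check.
recording = list(range(10_000))
train, test = split_then_window(recording)
latest_train = max(max(w) for w in train)
earliest_test = min(min(w) for w in test)
print(len(train), len(test), latest_train, earliest_test)
```

Segmenting before splitting would instead allow neighboring windows of the same stretch of signal to land on both sides, which is the leakage this ordering is meant to rule out.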

3.2. Evaluation Metrics

Two evaluation metrics, damage recognition accuracy and weighted F1-score, are employed to evaluate the performance of the proposed method.

The damage recognition accuracy measures the overall recognition correctness of the model across all test samples, which is computed as

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\% \tag{15}$$

where TP, TN, FP, and FN denote the number of true positives, true negatives, false positives, and false negatives, respectively.

The weighted F1-score comprehensively considers the precision and recall of each damage pattern, providing a more thorough reflection of the model’s recognition performance under imbalanced conditions. For the multi-pattern damage recognition task, the weighted F1-score is computed as

$$F1_{weighted} = \sum_{i = 1}^{C} \frac{n_{i}}{N} \cdot \frac{2 \cdot \mathrm{Precision}_{i} \cdot \mathrm{Recall}_{i}}{\mathrm{Precision}_{i} + \mathrm{Recall}_{i}} \tag{16}$$

where $C$ denotes the total number of damage patterns, $n_{i}$ represents the number of samples of the $i$-th damage pattern, $N$ denotes the total number of samples, and $\mathrm{Precision}_{i}$ and $\mathrm{Recall}_{i}$ represent the precision and recall of the $i$-th damage pattern, respectively.
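Both metrics can be computed directly from label lists, as in the following standard-library sketch. It interprets the multi-class accuracy as the fraction of correctly recognized samples, which is an assumption about how Eq. (15) is applied to six classes. The toy labels are invented for illustration and are not results from the paper.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Percentage of samples whose predicted label matches the true label."""
    return 100.0 * sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def weighted_f1(y_true, y_pred):
    """Eq. (16): per-class F1 weighted by class support n_i / N."""
    n = len(y_true)
    total = 0.0
    for label, n_i in Counter(y_true).items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n_i - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        total += (n_i / n) * f1
    return total

# Toy example: one bolt-degradation sample is mistaken for a crack.
y_true = ["H", "H", "B", "B", "C1", "C1"]
y_pred = ["H", "H", "B", "C1", "C1", "C1"]
acc = accuracy(y_true, y_pred)
f1w = weighted_f1(y_true, y_pred)
print(round(acc, 2), round(f1w, 4))
```

In the toy example the single error lowers the recall of B and the precision of C1, so two of the three per-class F1 values drop even though only one sample is wrong.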

3.3. Compared Methods

To comprehensively evaluate the performance of the proposed RDAMNet, five representative deep learning methods are selected as comparative baselines. These methods encompass different technical routes, including classical convolutional networks, multi-scale feature extraction, attention mechanism fusion, and spatiotemporal modeling, enabling the validation of RDAMNet from multiple perspectives. Each compared method is briefly introduced as follows.

1. ResNet18 [86] is a classical deep residual network based on convolutional neural networks, which effectively mitigates the gradient degradation problem in deep networks through the introduction of a residual learning framework. This method serves as a widely adopted baseline model in the deep learning community and is selected to verify the performance advantage of RDAMNet over general-purpose deep feature extraction architectures.
2. DCNet [87] is a dual-channel feature aggregation network proposed by Guo et al. for wind turbine fault diagnosis under variable speed operating conditions. This method constructs a parallel patch-aware convolutional module to extract multi-scale features from time-frequency representations, introduces Haar wavelet downsampling to reduce spatial resolution while preserving discriminative features, and dynamically allocates channel and spatial attention weights through a channel prior convolutional attention mechanism. This method is selected to evaluate the competitiveness of RDAMNet in attention mechanism-driven multi-scale feature fusion.
3. IMCTN [78] is a physics-aware spatiotemporal diagnostic framework proposed by Zhao et al. for structural health monitoring of ultra-large wind turbine blades. This method integrates ensemble empirical mode decomposition with a hybrid Transformer-CNN architecture, coupling multi-head self-attention with multi-scale convolutions to model long-range temporal dependencies and localized patterns. This method is selected to evaluate whether RDAMNet can achieve competitive damage recognition performance without the global modeling capability of Transformers.
4. MCAMCNN [88] is a fault diagnosis method based on a multi-channel attention mechanism convolutional neural network, proposed by Zheng et al. for wind turbine condition monitoring. This method employs a dual-layer multi-scale convolution combined with multi-channel attention to extract multi-domain features and dynamically calibrate feature channel weights, with adaptive feature fusion ultimately achieved through ECA. This method is selected to evaluate the performance difference between the dual attention mechanism of RDAMNet and the multi-channel attention strategy of this method in channel feature modeling.
5. MSCNN-BiLSTM-WMV [89] is a fusion model of multi-scale convolutional neural network and bidirectional long short-term memory network, proposed by Xu et al. for wind turbine bearing fault diagnosis. This method extracts spatial features through multi-scale convolutions, captures temporal dependencies through bidirectional LSTM, and proposes a weighted majority voting rule to fuse multi-sensor information for improving generalization capability. This method is selected to evaluate the effectiveness of the pure convolutional multi-branch architecture of RDAMNet compared with CNN-RNN hybrid architectures in temporal feature modeling.

3.4. Implementation Details

All experiments were conducted on a computer equipped with an Intel Core i9-14900KF processor, 64 GB RAM, and an NVIDIA GeForce RTX 5080 GPU. The network is implemented with Python 3.9.19 and PyTorch 2.0.0 framework. To ensure the fairness of comparative experiments, all compared methods use the same training and testing data split, receive the same vibration signals as input, and are trained and evaluated on the same hardware platform. Table 3 lists the detailed training configurations of all methods.

3.5. Signal-Level Motivation for Multi-Scale Modeling

This subsection further analyzes the rationality of the multi-scale structure adopted in RDAMNet from the perspective of raw vibration signals, demonstrating that the proposed network design is derived from the intrinsic feature distribution of FWT damage-induced vibration responses rather than a simple stacking of architectural components. From the perspective of structural dynamics, the vibration response of an FWT results from the coupled interaction of wind loading, wave excitation, structural stiffness, and mass distribution. Different types of structural damage alter the system’s stiffness matrix, mass matrix, or damping properties, inducing multi-level dynamic effects such as natural frequency shifts, modal shape distortions, and transient response variations. These effects manifest as feature changes across different temporal scales in the vibration signals: variations in global stiffness or mass distribution predominantly affect the response characteristics of the dominant low-frequency modes, whereas localized structural defects tend to produce high-frequency transient components or local waveform distortions at shorter temporal scales. The time-domain waveforms and frequency-domain amplitude spectra of randomly selected signal segments under different damage modes are presented in Figure 5, and the normalized band energy distribution in the low-frequency range is presented in Figure 6.
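A normalized band-energy comparison of the kind described above can be sketched as follows. This is a minimal illustration rather than the authors' processing code: the sampling rate `fs`, the band edges, and the synthetic two-tone signal are all assumptions chosen for the example.

```python
import numpy as np

def normalized_band_energy(signal, fs, band_edges):
    """Fraction of spectral energy falling in each [lo, hi) frequency band."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                              # remove the DC offset
    power = np.abs(np.fft.rfft(x)) ** 2           # one-sided power spectrum
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)   # frequency of each bin in Hz
    energies = np.array([
        power[(freqs >= lo) & (freqs < hi)].sum()
        for lo, hi in zip(band_edges[:-1], band_edges[1:])
    ])
    total = energies.sum()
    return energies / total if total > 0 else energies

# Synthetic example (assumed values): a dominant 3 Hz component plus a weak 12 Hz one.
fs = 100.0
t = np.arange(1000) / fs
x = np.sin(2 * np.pi * 3 * t) + 0.2 * np.sin(2 * np.pi * 12 * t)
bands = [0, 5, 10, 15, 20]
ratios = normalized_band_energy(x, fs, bands)
print(dict(zip(["0-5 Hz", "5-10 Hz", "10-15 Hz", "15-20 Hz"], ratios.round(3))))
```

In this toy signal the weak 12 Hz component holds only a few percent of the total energy, which is the kind of band-specific contribution that a purely energy-driven feature extractor could overlook.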

As shown in Figure 5, all six structural states exhibit evident non-stationary vibration responses, but their amplitude envelopes, local peak distributions, and oscillatory patterns differ substantially. Bolt degradation state B produces pronounced amplitude fluctuations and local abrupt responses, which can be attributed to the reduction in tower-floater connection stiffness caused by the removal of mounting bolts, inducing larger relative displacements and nonlinear contact responses under dynamic loading. Blade crack states C1 and C2 show oscillation intensities that differ to varying degrees. The stiffness discontinuity introduced by cracks causes the cracked region to exhibit a breathing effect under alternating loads, thereby introducing localized nonlinear transient components into the time-domain signal. Added-mass states M1 and M2 present more evident low-frequency modulation, as the additional blade mass alters the mass distribution and moment of inertia of the rotating components, causing a downward shift of the structural natural frequencies. These phenomena indicate that damage-sensitive information is not concentrated at a single temporal scale, but is simultaneously distributed across global low-frequency trends, local peak responses, and medium-scale waveform variations.

The frequency-domain analysis further corroborates the above observation. Although the dominant vibration energy is concentrated in the low-frequency range, different damage modes exhibit distinct spectral peak locations, peak intensities, and band-energy distributions. The band-energy results in Figure 6 reveal that the relative energy contributions of different damage modes vary across low-frequency sub-bands, indicating that the diagnostic information encompasses both dominant low-frequency structural responses and weaker band-specific components with discriminative capability. A model employing only a single receptive field for feature extraction may overly focus on the dominant low-frequency energy, thereby insufficiently capturing weaker but critical damage-sensitive features in other time-frequency regions. RDAMNet addresses this issue through the synergy of its multi-branch differentiated architecture and dual attention mechanism. Specifically, the raw signal branch preserves the full time-domain resolution, and its convolutional kernels can directly operate on the original waveform containing high-frequency transient components. The max-pooled branch highlights transient impulses and abrupt variations by retaining local peak responses, which inherently correspond to the high-frequency energy components in the signal. Furthermore, the ECA attention embedded within each branch adaptively enhances damage-sensitive channel responses at each hierarchical level, preventing weaker but discriminative high-frequency feature channels from being overwhelmed by low-frequency dominant channels. After the features from the three branches are fused, SE attention further performs adaptive recalibration across cross-branch channels, ensuring that complementary features from different frequency ranges are effectively utilized.
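The signal-level decoupling into raw, max-pooled, and average-pooled inputs can be illustrated with a short NumPy sketch. The pooling kernel size, the non-overlapping stride, and the function names `pool1d` and `decouple` are assumptions made for this example; the actual front-end settings of RDAMNet are not restated here.

```python
import numpy as np

def pool1d(x, kernel, mode):
    """Non-overlapping 1D pooling (stride == kernel); a trailing remainder is dropped."""
    x = np.asarray(x, dtype=float)
    n = (x.size // kernel) * kernel
    windows = x[:n].reshape(-1, kernel)
    return windows.max(axis=1) if mode == "max" else windows.mean(axis=1)

def decouple(signal, kernel=4):
    """Return the three branch inputs: raw, max-pooled, and average-pooled."""
    return {
        "raw": np.asarray(signal, dtype=float),   # full temporal resolution
        "max": pool1d(signal, kernel, "max"),     # keeps local peaks
        "avg": pool1d(signal, kernel, "avg"),     # smooths toward the slow trend
    }

# Synthetic signal (illustration only): a slow oscillation with one sharp impulse.
t = np.arange(64)
x = np.sin(2 * np.pi * t / 32)
x[21] += 5.0
views = decouple(x, kernel=4)
print(views["max"].max(), views["avg"].max())
```

The impulse survives max-pooling at full height, whereas average-pooling spreads it over its window, which mirrors the complementary roles the text assigns to the two pooled branches.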

The above analysis reveals that FWT damage-induced vibration responses simultaneously contain dominant low-frequency components, local transient variations, and subtle differences across frequency bands, exhibiting evident multi-scale properties. Single-scale feature learning alone may result in insufficient feature representation, particularly overlooking weaker but damage-related components. The multi-scale design of RDAMNet is motivated by this signal-level feature distribution, aiming to obtain more comprehensive damage-sensitive representations from different scales, thereby providing a more discriminative feature basis for subsequent damage pattern recognition.

3.6. Hyperparameter Sensitivity Analysis

To determine the optimal training configuration of RDAMNet, a joint sensitivity analysis is conducted on the optimizer and learning rate. Three optimizers, namely AdamW, Adam, and SGD, are combined with three learning rates of 0.001, 0.005, and 0.01 in a full factorial design, yielding 9 experiments in total. The results are presented as 3D surface plots in Figure 7. As shown in Figure 7, AdamW at a learning rate of 0.001 achieves the globally optimal performance, with the damage recognition accuracy and weighted F1-score reaching 96.45% and 96.44%, respectively. Both AdamW and Adam exhibit a monotonic performance decline as the learning rate increases, whereas SGD shows the opposite trend, achieving only 64.89% accuracy at a learning rate of 0.001 but improving to 89.71% at 0.01, which still remains significantly lower than the best result of AdamW. Based on the above analysis, all subsequent comparative experiments and ablation experiments adopt the optimal configuration of the AdamW optimizer and a learning rate of 0.001.
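The full factorial design can be enumerated in a few lines. The sketch below only builds the grid and selects the best entry; the helper `select_best` is a hypothetical name, and only the three accuracies quoted in the text are filled in because the remaining six cells are not reported in this section.

```python
from itertools import product

optimizers = ["AdamW", "Adam", "SGD"]
learning_rates = [0.001, 0.005, 0.01]

# Full factorial design: every optimizer is paired with every learning rate.
grid = list(product(optimizers, learning_rates))

def select_best(results):
    """Return the (optimizer, learning_rate) key with the highest accuracy."""
    return max(results, key=results.get)

# Accuracies (%) quoted in the text; the other six cells are intentionally omitted.
reported = {
    ("AdamW", 0.001): 96.45,
    ("SGD", 0.001): 64.89,
    ("SGD", 0.01): 89.71,
}

best = select_best(reported)
print(len(grid), best)
```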

3.7. Comparative Results and Analysis

To comprehensively evaluate the robustness and stability of the models, all methods are independently run five times, each with a different random seed for data splitting and network parameter initialization, and results are reported as mean ± standard deviation. It should be noted that all the comparison results below were obtained on an independent test set. The comparative damage recognition results of all methods on the UPATRAS dataset are presented in Figure 8. As shown in Figure 8a,b, RDAMNet achieves the best mean damage recognition accuracy and weighted F1-score of 95.39 ± 1.23% and 95.37 ± 1.24%, respectively, across five independent runs, significantly outperforming all compared methods on both metrics. Compared with the classical ResNet18, RDAMNet demonstrates a substantial accuracy improvement. Although ResNet18 mitigates the gradient degradation problem in deep networks through residual connections, its single-scale feature extraction strategy fails to sufficiently capture the complementary information of multi-scale damage features in FWT vibration signals, resulting in insufficient discriminability for similar damage patterns. Compared with DCNet and MCAMCNN, which also employ attention mechanisms and multi-scale feature fusion, RDAMNet consistently maintains a clear performance lead. This performance difference is primarily attributed to the differentiated branch design of RDAMNet. Although DCNet and MCAMCNN introduce multi-scale convolutions and channel attention, both methods extract features from a unified input representation without achieving multi-scale decoupling at the signal level. 
RDAMNet generates complementary signals with different physical characteristics through max-pooling and average-pooling at the network front-end, enabling each branch to extract peak impulses, low-frequency structural responses, and long-range structural patterns under noise-suppressed conditions in a targeted manner, thereby obtaining richer discriminative feature representations. Furthermore, compared with the remaining baseline methods, RDAMNet also demonstrates significant performance advantages. This demonstrates that for the FWT vibration damage recognition task, the pure convolutional multi-branch architecture of RDAMNet combined with the dual attention mechanism achieves superior damage recognition performance without the need for complex temporal modeling components such as Transformers or recurrent neural networks.
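The mean ± standard deviation reporting over repeated runs can be sketched as follows. The per-run accuracies are placeholders, not values from the paper, and whether the sample or the population standard deviation was used is not stated, so both are exposed as an assumption.

```python
from statistics import mean, pstdev, stdev

def summarize(values, sample=True):
    """Return (mean, standard deviation) over independent runs.

    sample=True uses the n-1 (sample) estimator; sample=False uses the
    population estimator. Which one the paper used is an assumption here.
    """
    spread = stdev(values) if sample else pstdev(values)
    return mean(values), spread

# Placeholder accuracies (%) for five seeds; these are NOT the reported results.
runs = [94.0, 95.0, 96.0, 97.0, 95.5]

m, s = summarize(runs)
_, s_pop = summarize(runs, sample=False)
print(f"{m:.2f} ± {s:.2f}% (sample), ± {s_pop:.2f}% (population)")
```

With only five runs the two estimators differ noticeably, so stating which one is used makes the reported spread easier to compare across papers.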

Figure 8c,d further present the scatter distributions of accuracy and F1-score across five independent runs for each method. The scatter points of RDAMNet are concentrated in the high-performance region with relatively small fluctuations, indicating that it consistently achieves stable and high recognition performance under different random initializations. In contrast, some compared methods exhibit more dispersed distributions, reflecting their higher sensitivity to random seeds and insufficient stability. Furthermore, Figure 8e presents the model complexity metrics of RDAMNet. RDAMNet has only 663,783 parameters, 241.41 M FLOPs, and a single-sample GPU inference time of 5.35 ms, indicating that RDAMNet achieves the best recognition performance while maintaining low computational overhead, demonstrating a favorable performance-efficiency trade-off and practical deployment potential.

To further assess the overfitting risk of RDAMNet, Figure 9 simultaneously presents the training and testing loss, accuracy, and F1-score curves over 100 training epochs. The training and testing loss curves exhibit highly consistent trends, both showing steady decline and converging after approximately 60 epochs, with no typical overfitting signs such as the test loss increasing while the training loss continues to decrease. Meanwhile, the accuracy and F1-score of both the training and testing sets increase synchronously and converge to comparable levels, further confirming that RDAMNet possesses good generalization capability without significant overfitting.

To verify the above quantitative conclusions more intuitively, t-SNE is employed to visualize the deep features extracted by each method, with the results presented in Figure 10, where subplot (g) annotates the damage pattern labels corresponding to different colors. As shown in Figure 10a, the features extracted by RDAMNet exhibit the most distinct clustering structure, with the six damage patterns forming tight and well-separated clusters in the feature space and inter-pattern boundaries that are clearly distinguishable, with virtually no cross-pattern sample overlap. As shown in Figure 10b, ResNet18 is capable of separating most damage patterns, yet overlap remains between some patterns. As shown in Figure 10c–f, the feature distributions of the remaining four compared methods all exhibit varying degrees of pattern overlap, with significant overlapping and intersection among the feature clusters of multiple damage patterns. The above visualization results corroborate the quantitative metrics presented in Figure 8, further verifying that RDAMNet achieves the best pattern separation in the feature space among the compared methods. This advantage stems from the synergistic effect of the multi-branch multi-scale feature extraction and the ECA-SE dual attention mechanism. The multi-scale branches capture complementary damage features from signals at different temporal resolutions, ECA enhances the responses of damage-sensitive channels within each branch, and SE adaptively recalibrates the contribution weights of cross-branch channels at the fusion stage. The three components collectively ensure that the final feature representation possesses high intra-pattern compactness and inter-pattern separability.

3.8. Cross-Condition Generalization Analysis

The preceding comparative experiments validate the damage recognition performance of RDAMNet under the single operating condition of WD1_WS1. However, FWTs in practical operation face continuously varying wind speed and wind direction conditions, and the model’s generalization capability across different operating conditions is a critical prerequisite for its practical deployment. To this end, this subsection further evaluates the cross-condition generalization performance of RDAMNet under different wind speed and wind direction combinations. In addition to the training condition WD1_WS1, four different wind speed and wind direction combinations, namely WD1_WS2, WD1_WS3, WD2_WS2, and WD2_WS3, are selected. RDAMNet is independently trained and tested under each operating condition, with three independent runs per condition, and the results are presented in Figure 11.

As shown in Figure 11, the mean accuracy and mean F1-score of RDAMNet exceed 94% across all four operating conditions, with small standard deviations, indicating that the model maintains stable and high recognition performance under different wind speed and wind direction conditions. These results demonstrate that the damage-sensitive feature representations learned through the multi-branch multi-scale feature extraction strategy and dual attention mechanism exhibit favorable cross-condition transferability. It is noteworthy that the performance under WD2_WS3 is slightly lower than under other conditions, which may be attributed to the increased signal complexity and noise level caused by the coupled effects of higher wind speed and different wind direction. Nevertheless, the mean accuracy under this condition still exceeds 94%, indicating that RDAMNet remains capable of effectively recognizing damage patterns under more challenging operating conditions.

3.9. Ablation Study

To systematically verify the contribution of each core component of RDAMNet to the final damage recognition performance, six ablation experiments are designed in this subsection. All ablation experiments are conducted under the same training hyperparameters and data split settings as the comparative experiments above, ensuring fair comparability. The ablation variants are designed as follows:

1.: w/o Multi-scale Input: The inputs of all three branches are unified to the raw signal by removing the max-pooling and average-pooling operations at the input stage, to verify the effectiveness of signal-level multi-scale decoupling.
2.: w/o Multi-branch: Only the raw signal branch is retained, and the max-pooled branch and the average-pooled dilated convolution branch are removed, to verify the necessity of the multi-branch architecture.
3.: w/o ECA: The ECA channel attention module is removed from all ResECA blocks, to verify the role of intra-branch hierarchical channel attention.
4.: w/o SE: The SE channel attention module is removed from the fusion layer, to verify the role of channel recalibration at the fusion stage.
5.: w/o Dual Attention: Both ECA and SE attention modules are simultaneously removed, to verify the overall synergistic effect of the dual attention mechanism.
6.: w/o Residual: The residual connections are removed from all ResECA blocks, to verify the role of residual connections in preserving subtle damage features.

The ablation study results are presented in Table 4. Among all ablation variants, w/o Multi-scale Input leads to the most significant performance degradation, with the accuracy dropping from 96.45% to 87.23%, indicating that signal-level multi-scale decoupling is the core driving factor of RDAMNet’s performance. The accuracy of w/o Multi-branch drops to 89.36%, verifying the necessity of the multi-branch architecture in extracting complementary damage features. Regarding the attention mechanisms, the accuracies of w/o ECA and w/o SE decrease to 92.19% and 93.97%, respectively, while w/o Dual Attention, which simultaneously removes both modules, results in an accuracy of 91.13%, lower than that of removing either attention module alone. This indicates a significant synergistic gain between ECA and SE, where ECA hierarchically enhances the responses of damage-sensitive channels within each branch, and SE adaptively recalibrates the contribution weights of cross-branch channels at the fusion stage, thereby achieving a performance gain greater than the sum of their individual contributions. Additionally, w/o Residual yields an accuracy of 94.32%, verifying the role of residual connections in facilitating the propagation of subtle damage information during deep feature extraction. The above results collectively demonstrate that each core component in RDAMNet contributes indispensably to the final damage recognition performance, where multi-scale input decoupling and the multi-branch architecture constitute the performance foundation, the dual attention mechanism provides critical feature enhancement, and residual connections ensure the integrity of feature propagation.
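The arithmetic behind the ablation comparison can be checked directly from the accuracies quoted above. The sketch below takes the variant without any attention as the reference point for the attention modules; this framing and the variable names are choices made for this example, not part of the original analysis.

```python
# Accuracies (%) quoted in the ablation discussion.
acc = {
    "full": 96.45,
    "w/o Multi-scale Input": 87.23,
    "w/o Multi-branch": 89.36,
    "w/o ECA": 92.19,
    "w/o SE": 93.97,
    "w/o Dual Attention": 91.13,
    "w/o Residual": 94.32,
}

# Accuracy drop (percentage points) caused by removing each component.
drops = {k: round(acc["full"] - v, 2) for k, v in acc.items() if k != "full"}

# Gains measured from the variant that has no attention at all.
base = acc["w/o Dual Attention"]
gain_eca_only = round(acc["w/o SE"] - base, 2)   # ECA present, SE absent
gain_se_only = round(acc["w/o ECA"] - base, 2)   # SE present, ECA absent
gain_both = round(acc["full"] - base, 2)         # both modules present

# A positive value means the joint gain exceeds the sum of the individual gains.
interaction = round(gain_both - (gain_eca_only + gain_se_only), 2)

print(drops)
print(gain_eca_only, gain_se_only, gain_both, interaction)
```

Measured this way, adding ECA alone yields 2.84 points and adding SE alone yields 1.06 points, while adding both yields 5.32 points, 1.42 points more than the sum of the two.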

Notably, the contributions of these components are not simply independent and additive. Multi-scale input decoupling provides each branch with complementary signal representations possessing distinct physical characteristics, upon which the multi-branch architecture extracts damage-sensitive features at different scales through differentiated kernel sizes and dilation rates. ECA then hierarchically suppresses noise-dominant channels and enhances damage-related channel responses within each branch, thereby ensuring that the branch-level features delivered to the fusion stage possess a high signal-to-noise ratio. Building on this, SE adaptively recalibrates channels according to their contributions to the classification objective, achieving effective integration of cross-branch complementary information, while residual connections span the entire feature extraction process to safeguard subtle damage features from being lost during layer-by-layer abstraction. This hierarchical cooperation also explains why w/o Dual Attention yields lower performance than removing either ECA or SE alone: when both attention mechanisms are simultaneously absent, the adaptive regulation of intra-branch feature purification and cross-branch feature integration is lost, resulting in the complete forfeiture of synergistic gains.
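The two-stage channel gating described above can be sketched in NumPy. This is an illustration of generic ECA-style and SE-style gating under stated assumptions, not the RDAMNet implementation: the feature-map sizes, the ECA kernel size of 3, the SE reduction ratio of 4, channel-wise concatenation as the fusion step, and the random weights standing in for learned parameters are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def eca(x, kernel_weights):
    """ECA-style gating. x: (channels, length); kernel_weights: odd-length 1D kernel."""
    pooled = x.mean(axis=1)                                      # global average pooling -> (C,)
    padded = np.pad(pooled, kernel_weights.size // 2)            # zero padding keeps length C
    mixed = np.correlate(padded, kernel_weights, mode="valid")   # local cross-channel interaction
    return x * sigmoid(mixed)[:, None]                           # per-channel gate in (0, 1)

def se(x, w1, w2):
    """SE-style gating. w1: (C // r, C) squeeze weights, w2: (C, C // r) excitation weights."""
    pooled = x.mean(axis=1)                                      # squeeze -> (C,)
    hidden = np.maximum(w1 @ pooled, 0.0)                        # bottleneck with ReLU
    return x * sigmoid(w2 @ hidden)[:, None]                     # channel recalibration

rng = np.random.default_rng(0)
channels, length, reduction = 8, 32, 4                           # assumed sizes

# Stage 1: ECA inside each branch (one placeholder feature map per branch).
branches = [rng.standard_normal((channels, length)) for _ in range(3)]
eca_kernel = rng.standard_normal(3)                              # stands in for a learned kernel
refined = [eca(b, eca_kernel) for b in branches]

# Stage 2: fuse along the channel axis, then apply SE across all fused channels.
fused = np.concatenate(refined, axis=0)                          # (24, 32)
c = fused.shape[0]
w1 = rng.standard_normal((c // reduction, c))
w2 = rng.standard_normal((c, c // reduction))
recalibrated = se(fused, w1, w2)
print(recalibrated.shape)
```

Because both gates lie in (0, 1), each stage can only attenuate channels relative to one another; ECA does so within a branch using neighbouring channels, whereas SE sees all fused channels at once.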

3.10. Interpretability Analysis of Multi-Scale Features

To examine whether the trained RDAMNet captures discriminative representations at different temporal resolutions, branch importance analysis and one-dimensional Grad-CAM visualization are introduced based on Grad-CAM [90]. Grad-CAM is a gradient-based interpretability method that highlights the feature regions most relevant to the target-class prediction by backpropagating class-specific gradients. Following this principle, the branch importance is calculated by measuring the gradient magnitude of the target-class score with respect to the output feature map of each RDAMNet branch. The obtained values are then normalized by the total importance of all three branches, allowing a direct comparison among different health and damage states. The results are presented in Figure 12 and Figure 13, respectively. Figure 12 presents the normalized importance distribution of the three branches under different damage modes.
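The two quantities described above can be sketched as follows. In practice the activations and gradients would come from automatic differentiation of the target-class score; here random arrays stand in for them. The use of the mean absolute gradient as the "gradient magnitude", the array shapes, and the function names are assumptions made for this example.

```python
import numpy as np

def grad_cam_1d(activations, gradients):
    """1D Grad-CAM map for one layer. Both inputs have shape (channels, length)."""
    weights = gradients.mean(axis=1)                        # time-averaged gradient per channel
    cam = (weights[:, None] * activations).sum(axis=0)      # weighted sum over channels
    cam = np.maximum(cam, 0.0)                              # ReLU keeps positive evidence only
    peak = cam.max()
    return cam / peak if peak > 0 else cam                  # scale to [0, 1]

def branch_importance(branch_gradients):
    """Normalize each branch's mean absolute gradient so the shares sum to 1."""
    raw = {name: float(np.abs(g).mean()) for name, g in branch_gradients.items()}
    total = sum(raw.values())
    if total == 0:
        return {name: 0.0 for name in raw}
    return {name: value / total for name, value in raw.items()}

# Placeholder gradients of the target-class score w.r.t. each branch output.
rng = np.random.default_rng(0)
scales = [("raw", 1.0), ("max_pooled", 0.6), ("avg_pooled", 0.3)]
grads = {name: scale * rng.standard_normal((16, 128)) for name, scale in scales}

shares = branch_importance(grads)
cam = grad_cam_1d(rng.standard_normal((16, 128)), grads["raw"])
print({k: round(v, 3) for k, v in shares.items()}, cam.shape)
```

Normalizing by the total makes the shares comparable across samples and damage modes, since the absolute gradient scale can differ from one input to another.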

All three branches contribute to the final decision, but their relative importance varies with the damage mode. For blade crack C1, blade crack C2, and heavy added mass M2, the raw-signal branch shows the highest importance, indicating that local waveform details and transient variations at the original temporal resolution are highly informative for these cases. For the healthy state H, the max-pooled branch becomes dominant, suggesting that peak-preserved patterns are useful for identifying the baseline vibration behavior without artificial damage. For bolt degradation B, the raw-signal and max-pooled branches make comparable contributions, implying that this state contains both abrupt local responses and detailed waveform information. In contrast, the average-pooled branch provides relatively stable support across all damage modes, reflecting the complementary role of smoothed low-frequency structural responses.

The above branch importance distribution shows that different damage modes do not rely on a single-scale representation. Instead, RDAMNet adaptively emphasizes different branches according to the vibration characteristics of each condition. This result corroborates the ablation conclusions in Table 4, further confirming that multi-scale input decoupling and the multi-branch architecture are important sources of the performance improvement of RDAMNet. Figure 13 further presents the one-dimensional Grad-CAM responses under different damage modes. All six damage mode samples are correctly recognized, and the high-response regions are distributed across multiple temporal segments rather than concentrated at isolated sampling points. For blade crack states, the activation regions predominantly appear at positions with enhanced local oscillations and evident waveform variations. For added-mass states, especially M2, the activation responses extend across several low-frequency oscillation cycles. For bolt degradation, the high-response regions are associated with local peak-valley variations and non-stationary fluctuations. These results demonstrate that RDAMNet simultaneously attends to local transient features, waveform details, and longer-range structural response variations during damage recognition.

Overall, the branch importance analysis and Grad-CAM visualization provide interpretability evidence that the three branches form a complementary feature extraction system rather than redundantly learning similar information. The raw-signal branch captures fine-grained transient details, the max-pooled branch emphasizes peak-related patterns, and the average-pooled branch supplies a noise-suppressed low-frequency baseline. Moreover, the dispersed Grad-CAM activations indicate that the predictions are mainly supported by structural vibration patterns over continuous temporal regions, instead of by isolated noise-like points. This improves the physical credibility of RDAMNet and further explains its effectiveness in FWT vibration damage recognition.

3.11. Engineering Implications for Condition-Based Maintenance

The experimental results indicate that RDAMNet can distinguish the six structural health states of FWTs with high reliability, providing a technical basis for condition-based maintenance in offshore wind farm operation. In engineering practice, different fault patterns imply different risk levels and intervention priorities. Bolt connection degradation B represents the loosening or deterioration of fasteners between the tower and floater. If this condition is not detected and addressed in time, stress concentration may increase in the connection region, potentially accelerating fatigue crack initiation and threatening global structural stability. Therefore, this state should be regarded as a high-priority warning condition. Blade crack states C1 and C2 correspond to structural defects with different severity levels. Since cracks may continue to propagate under cyclic wind-wave loading, their timely identification allows maintenance personnel to plan targeted inspection or repair before the defect reaches a critical stage, thereby reducing the risk of blade fracture. Added-mass states M1 and M2 represent mass imbalance caused by ice accumulation. Such loading can change the dynamic characteristics of rotating components and increase the wear of bearings and drivetrain systems. Early recognition of these states can support timely de-icing decisions and prevent secondary mechanical deterioration.

From a decision-making perspective, RDAMNet goes beyond binary anomaly detection by further identifying the fault category and severity level. This capability enables graded maintenance responses. For example, a minor blade crack C1 may be incorporated into the next planned inspection cycle, whereas a more severe crack C2 or bolt degradation B should trigger a higher-priority intervention. Such a differentiated strategy can help balance structural safety and maintenance cost by avoiding both unnecessary downtime caused by over-maintenance and risk accumulation caused by delayed repair.

This implication is particularly important for FWTs deployed in deep-sea environments, where offshore maintenance is strongly constrained by weather windows, vessel availability, accessibility, and personnel safety. Conventional time-based preventive maintenance cannot fully reflect the actual structural condition and may lead to inefficient inspections or delayed responses. As an automated vibration-based recognition model, RDAMNet can be integrated into an online monitoring framework using vibration signals continuously collected by nacelle-mounted accelerometers. In this way, the model can provide real-time or near-real-time health assessment for maintenance planning. In addition, its compact model size and millisecond-level inference speed support edge-side deployment, which is beneficial for reducing data transmission burden and enabling fast local diagnosis. These characteristics make RDAMNet a promising tool for supporting the transition from passive periodic inspection to proactive health-aware maintenance, with potential benefits in reducing unplanned downtime, extending structural service life, and improving the economic efficiency of offshore wind farm operation.

3.12. Limitations and Future Work

Although RDAMNet achieves strong performance on the laboratory dataset, several limitations should be acknowledged. First, the UPATRAS dataset is obtained from a laboratory-scale FWT model, whose dynamic behavior may differ from that of full-scale offshore turbines. Real marine environments involve more complex disturbances, including sensor noise, non-stationary wind-wave excitations, and long-term signal drift, which may affect recognition accuracy. In addition, the structural scale, material properties, boundary conditions, and damage morphologies of practical turbines are more complicated than those represented in the experimental model. Therefore, the transferability of RDAMNet to real offshore structures still requires further validation. Second, this study considers six discrete health or damage states, while actual faults usually evolve continuously and may appear as compound scenarios involving multiple simultaneous defects. Third, although the cross-condition experiments show stable performance under different wind speed and direction combinations, real offshore operating conditions are broader and more extreme. The robustness of the model under severe sea states, storms, surges, and long-term operational uncertainty remains to be further investigated.

Future work will focus on validating RDAMNet using field monitoring data from real offshore wind farms and improving its adaptability to industrial environments. Domain adaptation and transfer learning can be introduced to reduce the distribution gap between laboratory and field data. Robust recognition under missing data, sensor degradation, and variable noise levels also deserves further study. In addition, the damage categories should be extended from discrete single-fault states to continuous degradation processes and multi-fault coupling conditions. Another important direction is to combine fault recognition with long-term degradation modeling and remaining useful life prediction, so that the framework can be further developed from state classification toward structural life-cycle assessment and maintenance optimization.

4. Conclusions

This paper proposes RDAMNet, a Residual Dual Attention Multiscale Network for vibration-based damage recognition of floating wind turbines. RDAMNet introduces a novel multi-branch multi-scale input strategy that decouples the raw vibration signal at the signal level into complementary representations with distinct physical characteristics, enabling different branches to capture damage-sensitive features at different scales. Additionally, an ECA-SE dual attention mechanism is designed to collaboratively enhance damage-related information at both the feature extraction and fusion stages. On the UPATRAS Floating Wind Turbine Vibration Dataset, RDAMNet achieves a mean damage recognition accuracy and a weighted F1-score of 95.39 ± 1.23% and 95.37 ± 1.24%, respectively, across five independent runs, significantly outperforming five representative methods, namely ResNet18, DCNet, IMCTN, MCAMCNN, and MSCNN-BiLSTM-WMV. Cross-condition generalization experiments demonstrate that RDAMNet maintains mean accuracies exceeding 94% across different wind speed and wind direction combinations, validating its generalization capability under different operating conditions. Model complexity analysis reveals that RDAMNet has only 663,783 parameters, requires 241.41 M FLOPs, and achieves a single-sample GPU inference time of 5.35 ms, striking a balance between recognition performance and computational efficiency. The ablation study demonstrates that each core component contributes indispensably to the recognition performance, with multi-scale input decoupling and the multi-branch architecture serving as the performance foundation, while the dual attention mechanism provides critical feature enhancement. Branch importance analysis and Grad-CAM visualization further confirm that RDAMNet adaptively leverages features at different scales according to the vibration response characteristics of distinct damage modes.
These results demonstrate that RDAMNet provides an effective and interpretable solution for vibration-based damage recognition in the structural health monitoring of floating wind turbines.

Author Contributions

Conceptualization, H.H. and Y.L. (Yifei Li); methodology, H.H., Y.L. (Yifei Li) and Y.L. (Yuchen Lu); software, H.H., Y.L. (Yifei Li), Y.L. (Yuchen Lu) and Y.Z.; validation, H.H., Y.L. (Yifei Li), Y.L. (Yuchen Lu) and Y.Z.; formal analysis, H.H., Y.L. (Yifei Li), Y.L. (Yuchen Lu) and Y.Z.; investigation, H.H., Y.L. (Yifei Li) and H.D.; resources, H.H., Y.L. (Yifei Li), R.W., H.D., Y.L. (Yuchen Lu) and Y.Z.; data curation, H.H., Y.L. (Yifei Li), R.W., H.D., Y.L. (Yuchen Lu) and Y.Z.; writing—original draft preparation, H.H.; writing—review and editing, Y.L. (Yifei Li), R.W., Y.L. (Yuchen Lu) and Y.Z.; visualization, H.H., Y.L. (Yifei Li), R.W., H.D. and Y.Z.; supervision, R.W., H.D. and Y.Z.; project administration, R.W., H.D. and Y.Z.; funding acquisition, R.W. and Y.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This project is supported by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China through Grant No. 24KJA580002, and by Mid Sweden University for open access publication.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are available on request from the corresponding authors.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figure 1. The overall architecture of the proposed RDAMNet.

Figure 2. The principle of the ECA attention mechanism.

Figure 3. The principle of the SE attention mechanism.

Figure 4. Experimental setup of the lab-scale floating wind turbine (FWT). (a) Photograph of the lab-scale FWT model. (b) Schematic of the FWT under the wind directions WD1, WD2, and WD3 and the wind speeds WS1, WS2, and WS3. (c) Configuration of the lab-scale FWT model, including the accelerometer installation position and the considered damage scenarios: connection degradation between the tower and floater (B); added mass of 1.7 g (M1) or 2.3 g (M2); and blade cracks of length 1.5 cm (C1, 4% of the blade length) or 3 cm (C2, 8% of the blade length).

Figure 5. Time-domain and frequency-domain responses of randomly selected signal segments under different damage modes.

Figure 6. Normalized frequency-band energy distributions of different damage modes.

Figure 7. Hyperparameter sensitivity analysis of RDAMNet. (a) Accuracy surface across different optimizers and learning rates. (b) F1-score surface across different optimizers and learning rates.

Figure 8. Comprehensive performance comparison of RDAMNet and baseline models. (a) Mean accuracy of six models across five independent runs. (b) Mean F1-score of six models across five independent runs. (c) Accuracy distribution over five runs. (d) F1-score distribution over five runs. (e) Model complexity of RDAMNet: parameters, FLOPs, and single-sample GPU inference time.

Figure 9. Training convergence and generalization of RDAMNet. (a) Training and testing loss and accuracy curves over 100 epochs. (b) Training and testing loss and F1-score curves over 100 epochs.

Figure 10. t-SNE visualization of the features extracted by different methods. (a) RDAMNet. (b) ResNet18. (c) DCNet. (d) IMCTN. (e) MCAMCNN. (f) MSCNN-BiLSTM-WMV. (g) Legend of damage pattern labels.

Figure 11. Performance of RDAMNet under different operating conditions. (a) Mean accuracy with error bars across three independent runs. (b) Mean F1-score with error bars across three independent runs.

Figure 12. Branch importance analysis of the three RDAMNet branches under different damage modes.

Figure 13. One-dimensional Grad-CAM visualization of RDAMNet under different damage modes.

Table 1. Detailed architecture of RDAMNet.

Module	Layer	Filters	Kernel/Dilation	Output
Branch 1 (Raw)	ResECA × 3	$32 \to 64 \to 128$	49/1, 15/1, 7/1	$128 \times 128$
Branch 2 (MaxPool)	ResECA × 3	$32 \to 64 \to 128$	9/1, 5/1, 3/1	$128 \times 128$
Branch 3 (AvgPool)	ResECA × 3	$32 \to 64 \to 128$	7/2, 5/2, 3/4	$128 \times 128$
Fusion	Conv $1 \times 1$ + SE	$384 \to 128$	–	$128 \times 128$
Classifier	GAP + GMP + FC	$256 \to 128 \to C$	–	C
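
The layer widths in Table 1 can be cross-checked with standard convolution arithmetic. The following is a minimal sketch, not the authors' implementation: it assumes the three branch outputs are concatenated along the channel axis before the 1 × 1 convolution and that the GAP and GMP vectors are concatenated before the fully connected layers. The names `BRANCHES`, `effective_kernel`, `fusion_in`, and `classifier_in` are illustrative and do not come from the paper.

```python
# Sketch: sanity-check Table 1 with standard 1-D convolution arithmetic.
# Kernel/dilation pairs are copied from Table 1; all names are illustrative.

BRANCHES = {
    "Branch 1 (Raw)": [(49, 1), (15, 1), (7, 1)],
    "Branch 2 (MaxPool)": [(9, 1), (5, 1), (3, 1)],
    "Branch 3 (AvgPool)": [(7, 2), (5, 2), (3, 4)],
}
BRANCH_CHANNELS = 128  # final channel width of every branch in Table 1


def effective_kernel(kernel: int, dilation: int) -> int:
    """Span in samples covered by one dilated 1-D kernel: d * (k - 1) + 1."""
    return dilation * (kernel - 1) + 1


spans = {
    name: [effective_kernel(k, d) for k, d in layers]
    for name, layers in BRANCHES.items()
}

# Assumption: the three branch outputs are concatenated channel-wise.
fusion_in = BRANCH_CHANNELS * len(BRANCHES)
# Assumption: GAP and GMP of the fused 128-channel map are concatenated.
classifier_in = 2 * BRANCH_CHANNELS

for name, values in spans.items():
    print(f"{name}: effective kernel spans {values}")
print("fusion input channels:", fusion_in)
print("classifier input features:", classifier_in)
```

Under these assumptions the computed widths agree with the $384 \to 128$ fusion and the $256 \to 128 \to C$ classifier listed in Table 1, and the dilated kernels of Branch 3 span 13, 9, and 9 samples.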

Table 2. Block specifications of ResECA, ECA, and SE modules.

Block	Parameter	Value
ResECA	Conv layers per block	2
ResECA	Shortcut	Identity/1 × 1 Conv + BN
ResECA	Temporal downsampling	MaxPool1d, stride 2
ECA	$\gamma$, $b$	2, 1
SE	Reduction ratio r	8
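
To make the ECA and SE settings in Table 2 concrete, the sketch below evaluates the adaptive kernel-size rule of the original ECA-Net formulation, $k = \left| \log_2(C)/\gamma + b/\gamma \right|_{\mathrm{odd}}$, for the channel widths in Table 1. It assumes RDAMNet applies this standard rule with $\gamma = 2$ and $b = 1$, and that the SE block operates on the 128-channel fused map. The truncate-then-round-up-to-odd step follows the reference ECA-Net code rather than anything stated here, and the function names are hypothetical.

```python
import math


def eca_kernel_size(channels: int, gamma: int = 2, b: int = 1) -> int:
    """Odd 1-D kernel size from the ECA-Net rule k = |log2(C)/gamma + b/gamma|_odd."""
    t = int(abs((math.log2(channels) + b) / gamma))
    # Bump even values up to the next odd number so the kernel has a centre tap.
    return t if t % 2 == 1 else t + 1


def se_hidden_width(channels: int, reduction: int = 8) -> int:
    """Bottleneck width of an SE block with reduction ratio r."""
    return channels // reduction


for c in (32, 64, 128):
    print(f"C = {c}: ECA kernel size {eca_kernel_size(c)}")
print("SE bottleneck width for 128 channels:", se_hidden_width(128))
```

With these settings the rule gives kernel sizes of 3, 3, and 5 for 32, 64, and 128 channels, and a reduction ratio of 8 gives a 16-unit SE bottleneck for a 128-channel input.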

Table 3. Training configurations of all compared methods.

Method	Optimizer	Learning Rate	Batch Size	Epochs	Loss Function
RDAMNet	AdamW	0.001	64	100	CE (label smoothing = 0.1)
ResNet18	AdamW	0.001	64	150	CE (label smoothing = 0.1)
DCNet	Adam	0.001	32	50	CE
IMCTN	AdamW	0.001	64	50	CE
MCAMCNN	Adam	0.01	16	100	CE
MSCNN-BiLSTM-WMV	Adam	0.001	64	100	CE
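
The "CE (label smoothing = 0.1)" entries in Table 3 can be illustrated with a small pure-Python sketch. It assumes the common convention in which the smoothed target places $\varepsilon / K$ on every class and an additional $1 - \varepsilon$ on the true class; the paper's exact implementation is not shown in this section. The six-element logit vector and the function name `smoothed_cross_entropy` are hypothetical.

```python
import math


def smoothed_cross_entropy(logits, target, smoothing=0.1):
    """Cross-entropy against a label-smoothed target distribution."""
    num_classes = len(logits)
    # Numerically stable log-softmax.
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    log_probs = [z - log_norm for z in logits]
    # Smoothed target: eps / K on every class, plus (1 - eps) on the true class.
    uniform = smoothing / num_classes
    loss = 0.0
    for index, log_p in enumerate(log_probs):
        weight = uniform + (1.0 - smoothing if index == target else 0.0)
        loss -= weight * log_p
    return loss


logits = [4.0, 0.5, 0.2, 0.1, 0.0, -0.3]  # hypothetical six-class scores
plain = smoothed_cross_entropy(logits, target=0, smoothing=0.0)
smooth = smoothed_cross_entropy(logits, target=0, smoothing=0.1)
print(f"plain CE: {plain:.4f}, smoothed CE: {smooth:.4f}")
```

For a confident, correct prediction the smoothed loss is never smaller than the plain loss, which is how label smoothing discourages over-confident outputs.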

Table 4. Ablation study results of RDAMNet on the UPATRAS dataset.

Variant	Accuracy (%)	F1-Score (%)
RDAMNet	96.45	96.44
w/o Multi-scale Input	87.23	87.02
w/o Multi-branch	89.36	89.31
w/o ECA	92.19	92.16
w/o SE	93.97	93.27
w/o Dual Attention	91.13	91.14
w/o Residual	94.32	94.28
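
The relative contribution of each component can be read from Table 4 by subtracting every ablated variant from the full model. The sketch below only performs that arithmetic on the tabulated values; the names `ABLATION`, `drops`, and `ranked` are illustrative.

```python
# Accuracy (%) and F1-score (%) copied from Table 4.
ABLATION = {
    "RDAMNet": (96.45, 96.44),
    "w/o Multi-scale Input": (87.23, 87.02),
    "w/o Multi-branch": (89.36, 89.31),
    "w/o ECA": (92.19, 92.16),
    "w/o SE": (93.97, 93.27),
    "w/o Dual Attention": (91.13, 91.14),
    "w/o Residual": (94.32, 94.28),
}

full_acc, full_f1 = ABLATION["RDAMNet"]

# Drop in percentage points relative to the full model.
drops = {
    name: (round(full_acc - acc, 2), round(full_f1 - f1, 2))
    for name, (acc, f1) in ABLATION.items()
    if name != "RDAMNet"
}

# Rank variants by accuracy drop, largest first.
ranked = sorted(drops.items(), key=lambda item: item[1][0], reverse=True)

for name, (d_acc, d_f1) in ranked:
    print(f"{name:<24} accuracy -{d_acc:.2f} pp, F1 -{d_f1:.2f} pp")
```

On these numbers, removing the multi-scale input costs the most accuracy (9.22 percentage points) and removing the residual connections the least (2.13 percentage points).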
