Research on Propeller Defect Diagnosis of Rotor UAVs Based on MDI-STFFNet

Cui, Beining; Jiang, Dezhi; Wang, Xinyu; Xiao, Lv; Tan, Peisen; Li, Yanxia; Tan, Zhaobin

doi:10.3390/sym18010003


by Beining Cui ¹, Dezhi Jiang ², Xinyu Wang ¹, Lv Xiao ¹, Peisen Tan ¹, Yanxia Li ² and Zhaobin Tan ¹,*

¹ School of Electronic and Control Engineering, North China Institute of Aerospace Engineering, Langfang 065000, China

² Zhejiang Dayuan Pumps Industrial Co., Ltd., Taizhou 317500, China

* Author to whom correspondence should be addressed.

Symmetry 2026, 18(1), 3; https://doi.org/10.3390/sym18010003

Submission received: 4 November 2025 / Revised: 10 December 2025 / Accepted: 18 December 2025 / Published: 19 December 2025

(This article belongs to the Section Engineering and Materials)


Abstract

To address flight safety risks from rotor defects in rotorcraft drones operating in complex low-altitude environments, this study proposes a high-precision diagnostic model based on the Multimodal Data Input and Spatio-Temporal Feature Fusion Network (MDI-STFFNet). The model uses a dual-modality coupling mechanism that integrates vibration and air pressure signals, forming a “single-path temporal, dual-path representational” framework. The one-dimensional vibration signal and the five-channel pressure array are mapped into a texture space via phase space reconstruction and color-coded recurrence plots, followed by extraction of transient spatial features using a pre-trained ResNet-18 model. Parallel LSTM networks capture long-term temporal dependencies, while a parameter-free 1D max-pooling layer compresses redundant pressure data, reducing LSTM parameter growth. The CSW-FM module enables adaptive fusion across modal scales via shared-weight mapping and learnable query vectors that dynamically assign spatiotemporal weights. Experiments on a self-built dataset with seven defect types show that the model achieves 99.01% accuracy, improving by 4.46% and 1.98% over single-modality vibration and pressure inputs. Ablation studies confirm the benefits of spatiotemporal fusion and soft weighting in accuracy and robustness. The model provides a scalable, lightweight solution for UAV power system fault diagnosis under high-noise and varying conditions.

Keywords:

PSR-CRP; multimodal data fusion; spatiotemporal feature coordination; deep learning; UAV rotor blade defect diagnosis

1. Introduction

With the advancement of the low-altitude economy, rotary-wing drones, a critical component of this emerging sector, have become indispensable tools across diverse industries, including agriculture, logistics, security, inspection, and surveying. Their simple design, high maneuverability, and vertical takeoff and landing capabilities facilitate efficient and flexible operations, contributing to their widespread deployment and the continuous expansion of application scenarios [1,2,3,4,5]. The power system of a rotary-wing unmanned aerial vehicle (UAV) consists of the motor, propeller, electronic speed controller, and power supply. Among these components, the motor and propeller are the primary elements for energy conversion and transmission: they transform electrical energy into mechanical energy and, through aerodynamic interaction, generate the lift and thrust that enable maneuverable flight. During flight operations, rotary-wing UAVs are susceptible to propeller damage caused by collisions or other external factors. Such damage may lead to degraded flight performance, accelerated component deterioration, or even loss of control and a subsequent crash, potentially resulting in mission failure. In severe cases, drone crashes may pose risks to human safety, including personal injury or fatalities [6].

From a symmetry perspective, a perfectly balanced propeller operating under steady-state conditions exhibits rotational symmetry and periodic dynamic stability, leading to symmetric patterns in vibration and pressure pulsations. Propeller defects fundamentally disrupt this symmetry, as mass imbalances, aerodynamic asymmetries, and geometric deviations introduce characteristic asymmetric signatures across both temporal and spatial domains. These symmetry-breaking phenomena manifest as non-uniform load distributions, phase distortions, and aperiodic oscillations, which are the features that the proposed multimodal fusion approach is designed to detect and quantify. They are key diagnostic indicators of faults; however, existing research has largely overlooked their potential in fault detection and diagnosis. Therefore, advancing research on rotorcraft propeller fault diagnosis from a symmetry-oriented framework holds significant scientific merit and practical engineering value.

Fault diagnosis technology is an interdisciplinary field characterized by strong integration, encompassing multiple disciplines such as digital signal processing, machine learning, mathematical statistics, automatic control, and sensor detection. Current fault diagnosis techniques are primarily categorized into model-based approaches [7,8,9,10] and data-driven methods [11,12,13,14]. Model-based approaches require substantial expert knowledge and complex modeling procedures. In practical systems, accurate and effective models are often difficult to establish, which limits the ability to achieve high-accuracy diagnosis. With the advancement of artificial intelligence, data-driven approaches have increasingly focused on input data preprocessing and model architecture design. This enables the constructed networks to adaptively learn fault-related features from the data, facilitating accurate fault classification without the need for extensive expert knowledge. Liu et al. [15] achieved over 90% accuracy in detecting propeller physical damage by capturing flight-induced acoustic noise and applying convolutional neural networks with transfer learning. Iannace et al. [16] measured the noise emitted by unmanned aerial vehicles and applied an artificial neural network to achieve diagnostic classification of unbalanced propeller blades, achieving an accuracy rate of approximately 97%. Steinhoff et al. [17] achieved high spatial resolution in sound source localization using acoustic imaging cameras and employed convolutional neural networks to diagnose faults in two-bladed and three-bladed propellers, with classification accuracies of 99% and 97%, respectively. Altinors et al. [18] employed machine learning algorithms, including decision tree (DT), support vector machine (SVM), and k-nearest neighbors (KNN), to perform fault diagnosis of drone propeller damage, motor eccentricity, and bearing failures based on acoustic data analysis. 
Compared with existing advanced fault detection methods, the MDI-STFFNet-based fault diagnosis model proposed in this paper demonstrates significant advantages and innovative contributions across multiple dimensions. While current UAV rotor defect detection technologies are capable of identifying faults to a certain extent, most rely on single-modality data—such as vibration or pressure signals—limiting their robustness in complex low-altitude environments where they are highly susceptible to environmental noise interference, thereby degrading diagnostic accuracy. Moreover, a substantial body of existing research focuses predominantly on acoustic data, which is inherently vulnerable to background noise and aerodynamic disturbances in real-world scenarios, further constraining its practical applicability. More critically, current approaches lack a systematic analysis of symmetry disruptions induced by rotor defects, failing to fully exploit these characteristic asymmetries as diagnostic indicators for enhanced fault identification.

To address the aforementioned challenges, this paper proposes MDI-STFFNet (Multimodal Data Input and Spatio-Temporal Feature Fusion Network), a novel deep learning architecture. The model adopts a multimodal data fusion strategy that integrates one-dimensional vibration signals with multi-channel air pressure pulsation signals. This dual-modal input design not only mitigates the limitations of single-modal data in information representation but also significantly improves fault detection accuracy and robustness by leveraging the physical coupling and complementary characteristics between sensor modalities. Simultaneously, MDI-STFFNet enables parallel extraction of temporal and spatial features through a “single-path time—dual-path representation” framework. It employs PSR-CRP-ResNet to capture spatial textures and transient dynamics, while long-term temporal dependencies are extracted via LSTM networks. This spatiotemporal feature fusion mechanism enhances sensitivity to subtle faults and strengthens overall detection performance. Furthermore, MDI-STFFNet incorporates a channel-wise soft weight fusion module (CSW-FM), which dynamically assigns adaptive weights to different modal channels based on their relevance to fault conditions. By emphasizing the most discriminative features, this mechanism further improves the model’s adaptability and diagnostic precision.

Finally, the symmetry-based fault characterization method introduces a novel perspective for anomaly detection in rotating machinery. By performing high-dimensional phase space reconstruction of one-dimensional signals, it enables the visualization of symmetry-breaking phenomena, thereby significantly improving the interpretability and generalizability of diagnostic outcomes. These innovative design elements not only empower MDI-STFFNet to achieve superior performance in rotor blade fault detection for rotary-wing UAVs operating in complex low-altitude environments but also offer a highly efficient and scalable solution applicable to fault diagnosis across a broader range of rotating mechanical systems.

The method proposed in this paper achieves a three-stage progression spanning the data layer, representation layer, and fusion layer:

(1) It replaces traditional acoustic measurements with a dual-physical-coupling modality that integrates IEPE vibration signals from the motor base and distributed air pressure pulsation signals along the arm, thereby mitigating the detrimental effects of wind disturbances, dust particles, and background noise on feature reliability;
(2) It introduces a “single-time-series, dual-representation” paradigm that uses phase space reconstruction and color recurrence plots to explicitly characterize transient spatial textures, while leveraging LSTM networks to mine long-term temporal dependencies. This enables complementary integration of spatiotemporal information and improves sensitivity to subtle defects;
(3) It designs a channel-wise soft weight fusion module (CSW-FM) that dynamically allocates cross-modal feature weights through shared-weight linear mapping and learnable query vectors, enabling adaptive scalar weighting across heterogeneous scales and making fuller use of the discriminative potential of limited training samples.

2. Theoretical Analysis

2.1. PSR-CRP

The phase space reconstruction theory was first proposed by Packard et al. [19] and subsequently refined by Takens [20]. Takens’ theorem states that if a dynamical system is smooth, then, given a sufficiently large embedding dimension, a phase space topologically equivalent to the original system can be reconstructed from the delay coordinates of a single time series. The core concept of phase space reconstruction is to extend a one-dimensional time series into trajectories within a high-dimensional space, thereby revealing the underlying dynamic characteristics of the original system [21,22,23]. It is a method for nonlinear time series analysis: once the time delay τ and embedding dimension m are determined, the one-dimensional series is unfolded into an m-dimensional state space in which the intrinsic characteristics of the original dynamical system can be visualized. For the original sequence

x_{1}, x_{2}, \dots, x_{n}

with n = N + (m - 1) τ samples, phase space reconstruction yields N phase points in an m-dimensional state space:

X = \begin{bmatrix} X_{1}^{T} \\ X_{2}^{T} \\ \vdots \\ X_{N}^{T} \end{bmatrix} = \begin{bmatrix} x_{1} & x_{1 + \tau} & \cdots & x_{1 + (m - 1) \tau} \\ x_{2} & x_{2 + \tau} & \cdots & x_{2 + (m - 1) \tau} \\ \vdots & \vdots & \ddots & \vdots \\ x_{N} & x_{N + \tau} & \cdots & x_{N + (m - 1) \tau} \end{bmatrix}

(1)
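As a minimal sketch of Eq. (1), the delay embedding can be written in a few lines of plain Python. The function name `delay_embed` and the toy series are illustrative assumptions, not part of the original work; indices are 0-based here, whereas Eq. (1) is 1-based.

```python
def delay_embed(x, m, tau):
    """Return the N phase points X_i = (x_i, x_{i+tau}, ..., x_{i+(m-1)tau})."""
    n_points = len(x) - (m - 1) * tau  # N in Eq. (1)
    if n_points <= 0:
        raise ValueError("series too short for the chosen m and tau")
    # Row i of the trajectory matrix collects m samples spaced tau apart.
    return [[x[i + k * tau] for k in range(m)] for i in range(n_points)]


# Toy series (hypothetical values), embedded with m = 3 and tau = 2.
series = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
points = delay_embed(series, m=3, tau=2)
print(len(points), points[0], points[-1])
```

With 10 samples, m = 3, and τ = 2, the trajectory matrix has N = 10 − (3 − 1)·2 = 6 rows.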

The selection of the time delay τ and embedding dimension m is critical. An excessively high embedding dimension m amplifies noise interference and reduces the model’s generalization capability, whereas an excessively low embedding dimension m fails to adequately capture the system’s dynamic behavior, leading to information loss. An excessively large time delay τ leads to phase space diffusion, weakening the correlation between coordinates and causing the reconstructed phase space to fail in representing the true dynamics of the system. Conversely, an excessively small time delay τ results in excessive compression of the phase space, leading to redundant and repetitive information [24]. In this paper, m = 3 is adopted, and the time delay τ is determined using the mutual information method [25]. The recurrence plot serves as a powerful tool for visualizing the symmetry properties of dynamical systems. For a propeller in healthy operating condition, the phase space trajectory exhibits periodic recurrence and structural regularity, which manifest as symmetric patterns in the recurrence plot—characterized by uniform diagonal lines and consistent texture features. When defects disrupt the propeller’s rotational symmetry, the recurrence plot reveals corresponding asymmetries, including fragmented diagonal structures, irregular texture distributions, and distorted attractor boundaries. This transformation from symmetry to asymmetry in the reconstructed phase space constitutes the fundamental discriminative basis of the proposed defect diagnosis framework.
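The delay selection mentioned above can be sketched as follows. The text only states that the mutual information method [25] is used; this sketch assumes the common criterion of taking the first local minimum of a histogram-based mutual information estimate. The bin count, function names, and fallback rule are illustrative assumptions.

```python
import math


def mutual_information(a, b, bins=16):
    """Histogram estimate of the mutual information (in nats) of two equal-length series."""
    lo_a, hi_a, lo_b, hi_b = min(a), max(a), min(b), max(b)

    def bin_index(v, lo, hi):
        if hi == lo:
            return 0
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)

    n = len(a)
    joint, count_a, count_b = {}, [0] * bins, [0] * bins
    for u, v in zip(a, b):
        i, j = bin_index(u, lo_a, hi_a), bin_index(v, lo_b, hi_b)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        count_a[i] += 1
        count_b[j] += 1
    # Sum p(i,j) * log(p(i,j) / (p(i) * p(j))) over occupied cells.
    return sum((c / n) * math.log(c * n / (count_a[i] * count_b[j]))
               for (i, j), c in joint.items())


def select_delay(x, max_tau):
    """Return the first local minimum of MI(x_t, x_{t+tau}) for tau in 1..max_tau."""
    mis = [mutual_information(x[:-t], x[t:]) for t in range(1, max_tau + 1)]
    for k in range(1, len(mis) - 1):
        if mis[k] < mis[k - 1] and mis[k] <= mis[k + 1]:
            return k + 1  # mis[k] corresponds to tau = k + 1
    return mis.index(min(mis)) + 1  # assumed fallback: global minimum


# Hypothetical test signal: a sine wave with a 40-sample period.
signal = [math.sin(2 * math.pi * t / 40) for t in range(2000)]
tau = select_delay(signal, max_tau=30)
print(tau)
```

A histogram estimator is the simplest choice; its result depends on the bin count, so the selected delay should be checked visually against the mutual information curve.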

Recurrence plots serve as a fundamental tool for analyzing the periodicity, chaotic behavior, and non-stationarity of time series, revealing the internal structure of the data and the temporal interdependencies among individual time points. A recurrence plot visually represents the internal dependencies within a time series by depicting the relationship between the system’s states at different time points as a two-dimensional graph [26]. The raw one-dimensional vibration data are reconstructed into a three-dimensional phase space (m = 3), and the Euclidean distance between each pair of phase points is computed. Given two points

a (x_{11}, x_{12}, \dots, x_{1 n})

and

b (x_{21}, x_{22}, \dots, x_{2 n})

in n-dimensional space, the Euclidean distance between them is defined as:

d_{12} = \sqrt{\sum_{k = 1}^{n} {(x_{1 k} - x_{2 k})}^{2}}

(2)

The distance matrix D is defined as follows:

D = \begin{bmatrix} d_{n1} & d_{n2} & d_{n3} & \cdots & d_{n(n-2)} & d_{n(n-1)} & d_{nn} \\ d_{(n-1)1} & d_{(n-1)2} & d_{(n-1)3} & \cdots & d_{(n-1)(n-2)} & d_{(n-1)(n-1)} & d_{(n-1)n} \\ d_{(n-2)1} & d_{(n-2)2} & d_{(n-2)3} & \cdots & d_{(n-2)(n-2)} & d_{(n-2)(n-1)} & d_{(n-2)n} \\ \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\ d_{31} & d_{32} & d_{33} & \cdots & d_{3(n-2)} & d_{3(n-1)} & d_{3n} \\ d_{21} & d_{22} & d_{23} & \cdots & d_{2(n-2)} & d_{2(n-1)} & d_{2n} \\ d_{11} & d_{12} & d_{13} & \cdots & d_{1(n-2)} & d_{1(n-1)} & d_{1n} \end{bmatrix}

(3)

Here, d_{ij} denotes the Euclidean distance between the i-th and j-th phase points, and n in Equation (3) is the number of phase points (not the dimension of the space, as in Equation (2)). Subsequently, the distance matrix is mapped onto the Magma colormap to generate a color-coded recurrence plot. This plot provides an intuitive visualization of the distance relationships among phase points and reveals abrupt changes in the system.
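The distance matrix of Eq. (3) can be sketched as below. The function names and the rescaling to [0, 1] before colormap lookup are illustrative assumptions; the original text does not state how distances are scaled before the Magma mapping.

```python
import math


def distance_matrix(points):
    """Pairwise Euclidean distances (Eq. (2)) between all phase points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def rescale_unit(matrix):
    """Scale all entries to [0, 1] so they can index a colormap (assumed step)."""
    peak = max(max(row) for row in matrix)
    if peak == 0:
        return [[0.0 for _ in row] for row in matrix]
    return [[value / peak for value in row] for row in matrix]


# Three toy phase points in 3-D (hypothetical values).
phase_points = [[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [0.0, 0.0, 0.0]]
D = distance_matrix(phase_points)
D_unit = rescale_unit(D)
print(D)
```

The matrix is symmetric with a zero diagonal. If Matplotlib is available, `imshow(D_unit, cmap="magma", origin="lower")` is one way to render it with row 1 at the bottom, matching the layout of Eq. (3).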

2.2. ResNet

ResNet is a deep learning architecture introduced by Kaiming He et al. [27]. This network effectively addresses the gradient vanishing problem commonly encountered in training deep neural networks by introducing residual modules and skip connections. This facilitates the training of exceptionally deep architectures and significantly improves model performance [28]. The residual mapping structure is depicted in Figure 1.

The input to the residual block is denoted as x. After passing through two successive weight layers, the feature output is F(x). Following the residual connection, this is transformed into F(x) + x. Subsequent residual blocks take the output of the preceding block as input, thereby enabling residual learning. This approach significantly mitigates the challenges associated with training deep neural networks.
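The residual mapping F(x) + x can be illustrated with a toy example. The two "weight layers" below are hypothetical scalar scalings rather than real convolutions, and the activation that ResNet applies after the addition is omitted for brevity.

```python
def residual_block(x, weight_layers):
    """Return F(x) + x, where F is the stacked weight layers."""
    fx = weight_layers(x)
    return [f + xi for f, xi in zip(fx, x)]


def toy_layers(x):
    """Hypothetical F: scale by 0.5, apply ReLU, then scale by 0.1."""
    hidden = [max(0.0, 0.5 * v) for v in x]
    return [0.1 * v for v in hidden]


x = [1.0, -2.0, 4.0]
y = residual_block(x, toy_layers)
# If F outputs zeros, the block reduces to the identity mapping.
identity = residual_block(x, lambda v: [0.0] * len(v))
print(y, identity)
```

The identity case shows why skip connections ease optimization: a block only has to learn a correction to its input rather than the full mapping.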

This paper employs the ResNet-18 architecture, with the network structure illustrated in Figure 2.

Given the use of multimodal inputs and dual feature representations, two sets of feature extraction networks are required to learn parameter weights. Training from randomly initialized parameters proves challenging for achieving effective convergence. Therefore, this paper adopts the ResNet18-IMAGENET1K_V1 pre-trained model to initialize the network parameters, followed by forward and backward propagation to fine-tune the weights. IMAGENET1K_V1 is one of the standard pre-trained weights provided by TorchVision, suitable for foundational transfer learning in most visual tasks. Renowned for its stability, compatibility, and consistent performance, it serves as a widely adopted baseline for model initialization in both research and industrial applications.

2.3. LSTM

In the field of deep learning, recurrent neural networks (RNNs) exhibit distinct advantages in processing sequential time series data. RNNs leverage internal states (commonly referred to as “memory”) to process sequential inputs by incorporating information from previous time steps, making them well-suited for tasks such as speech recognition, natural language processing, and time series forecasting [29]. However, traditional RNNs suffer from severe vanishing gradients when processing long sequences, which hinders the network’s ability to capture long-range dependencies. Long Short-Term Memory (LSTM) networks effectively mitigate this issue by incorporating gating mechanisms and dedicated memory cells [30,31]. The core structure of an LSTM cell consists of an input gate, a forget gate, and an output gate. Through these three gating mechanisms, the LSTM network can selectively retain or discard information, thereby improving its ability to capture long-term dependencies. The structure of an LSTM memory cell is shown in Figure 3.

The forget gate determines which information in the cell state should be retained and which should be discarded. It takes the current input x_t and the previous hidden state h_{t-1} as inputs and maps them to values between 0 and 1 through a sigmoid activation function. Values close to 0 indicate that the corresponding information in the cell state will be discarded, whereas values close to 1 indicate that the information will be preserved. The forget gate is computed as follows:

f_{t} = \operatorname{sigmoid}(W_{fh} h_{t - 1} + W_{fx} x_{t} + b_{f})

(4)

The input gate regulates the extent to which information from the current input is incorporated into the cell state. It likewise receives x_t and h_{t-1} as inputs. A sigmoid activation produces the input gate value, a tanh activation produces the candidate cell state, and the element-wise product of the two is the new information used to update the cell state.

i_{t} = \operatorname{sigmoid}(W_{ih} h_{t - 1} + W_{ix} x_{t} + b_{i})

(5)

{\overset{\frown}{C}}_{t} = \tanh (W_{ch} h_{t - 1} + W_{cx} x_{t} + b_{c})

(6)

The cell state is updated based on the outputs from the forget gate and the input gate.

C_{t} = f_{t} \cdot C_{t - 1} + i_{t} \cdot {\overset{\frown}{C}}_{t}

(7)

The output gate determines which information in the cell state will be output as the hidden state at the current time step. It receives x_t and h_{t-1} as inputs, computes an output gate value through the sigmoid activation function, and then multiplies this value element-wise by the tanh-transformed cell state to produce the hidden state at the current time step.

O_{t} = \operatorname{sigmoid}(W_{oh} h_{t - 1} + W_{ox} x_{t} + b_{o})

(8)

h_{t} = O_{t} \cdot \tanh (C_{t})

(9)

The above provides a brief overview of the operational mechanisms of the forget gate, input gate, and output gate in an LSTM network. Here, c denotes the memory cell, C_t represents the cell state at time step t, h_t signifies the hidden state, W denotes the weight matrix, b denotes the bias term, sigmoid and tanh represent the activation functions, and “\cdot” denotes the element-wise multiplication operation.
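Equations (4) to (9) can be traced in a single time step with the plain-Python sketch below. The parameter dictionary layout and function names are illustrative assumptions, and the output-gate bias is named `b_o` here. A practical model would use a deep learning framework instead of list arithmetic.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W_h, h, W_x, x, b):
    """Compute W_h @ h + W_x @ x + b for list-based matrices and vectors."""
    return [
        sum(wh * hj for wh, hj in zip(row_h, h))
        + sum(wx * xj for wx, xj in zip(row_x, x))
        + bias
        for row_h, row_x, bias in zip(W_h, W_x, b)
    ]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step; p maps names such as 'W_fh' and 'b_f' to parameters."""
    f = [sigmoid(z) for z in affine(p["W_fh"], h_prev, p["W_fx"], x_t, p["b_f"])]      # Eq. (4)
    i = [sigmoid(z) for z in affine(p["W_ih"], h_prev, p["W_ix"], x_t, p["b_i"])]      # Eq. (5)
    c_hat = [math.tanh(z) for z in affine(p["W_ch"], h_prev, p["W_cx"], x_t, p["b_c"])]  # Eq. (6)
    c = [ft * cp + it * ch for ft, cp, it, ch in zip(f, c_prev, i, c_hat)]             # Eq. (7)
    o = [sigmoid(z) for z in affine(p["W_oh"], h_prev, p["W_ox"], x_t, p["b_o"])]      # Eq. (8)
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]                                   # Eq. (9)
    return h, c


# Hidden size 1, input size 1, all parameters zero (hypothetical values).
zero = {name: [[0.0]] for name in
        ("W_fh", "W_fx", "W_ih", "W_ix", "W_ch", "W_cx", "W_oh", "W_ox")}
zero.update({name: [0.0] for name in ("b_f", "b_i", "b_c", "b_o")})
h_t, c_t = lstm_step([1.0], [0.0], [2.0], zero)
print(h_t, c_t)
```

With all parameters zero, every gate equals 0.5 and the candidate state is 0, so the cell state is halved (2.0 becomes 1.0) and the hidden state is 0.5·tanh(1.0).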

3. Experimental Data Acquisition

To investigate rotorcraft propeller defect detection and validate the effectiveness of the proposed diagnostic model based on multi-modal feature fusion, the research team developed a rotorcraft propeller test rig with pre-defined propeller defects. Figure 4 shows the test rig: ➀ marks the mounting location of the brushless motor (DJI 2312A, Shenzhen, China); ➁ marks the mounting location of the propeller (Model 9450, Shenzhen, China); ➂ marks the mounting location of the barometric pressure sensor acquisition board; ➃ marks the motor speed control unit; ➄ marks the mounting location of the IEPE vibration sensor.

To collect barometric array data from the propeller, BMP581 barometric pressure sensors are deployed at 2 cm intervals, enabling the detection of barometric pressure pulsations on the arm during propeller rotation. Figure 5 illustrates the distribution of inspection points on propeller blades, with a standard propeller used as an example.

As shown in Figure 6, the predefined propeller defect types are as follows: Defect ➀ represents a 5% tip fracture; Defect ➁ represents a 10% tip fracture; Defect ➂ denotes a trailing edge notch on the blade with a maximum depth of 5 mm; Defect ➃ denotes a trailing edge notch on the blade with a maximum depth of 10 mm; Defect ➄ denotes a leading edge notch on the blade with a maximum depth of 5 mm; Defect ➅ denotes a leading edge notch on the blade with a maximum depth of 10 mm.

Table 1 presents statistical data samples related to propeller defects collected using a rotorcraft propeller test rig, IEPE vibration sensors, and a BMP581 barometric pressure sensor. All tests were conducted at a constant rotational speed corresponding to a 20% duty cycle.

4. Fault Diagnosis

This paper proposes a rotorcraft propeller defect diagnosis system based on MDI-STFFNet. Figure 7 illustrates the overall architecture of the model.

4.1. Signal Processing

This experiment utilizes one-dimensional vibration data and five-channel barometric pressure data as inputs. The vibration data are sampled at a frequency of 12 kHz, with each segment comprising 2048 data points. The five pressure channels are sampled at 100 Hz, with data segmented into groups of 200 points. Thus, each sample file contains one channel of 2048-point vibration data along with five channels of 200-point pressure pulsation data. To eliminate the influence of dimensional differences on experimental results, the data must be normalized separately.

X_{\mathrm{normalized}} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}

(10)

Here, X denotes the original data point, X_min represents the minimum value in the signal sequence, and X_max denotes the maximum value in the signal sequence.
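A minimal sketch of the min-max normalization in Eq. (10), applied to one signal segment at a time. The function name and the handling of a constant segment (mapped to zeros) are assumptions; the original text does not address that edge case.

```python
def min_max_normalize(signal):
    """Scale a sequence to [0, 1] using its own minimum and maximum (Eq. (10))."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        # Assumed convention: a constant segment carries no shape information.
        return [0.0 for _ in signal]
    return [(value - lo) / (hi - lo) for value in signal]


# Each channel is normalized separately so that vibration and pressure
# amplitudes, which have different physical units, become comparable.
segment = [2.0, 4.0, 6.0, 10.0]  # hypothetical samples
normalized = min_max_normalize(segment)
print(normalized)
```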

Using operational data from a normal propeller and a propeller with a 5 mm notch at the blade leading edge as an illustrative example, Figure 8 presents the waveform profiles of the vibration signal and five pressure signals. The vibration signal corresponding to the normal propeller exhibits a relatively regular and stable waveform with minimal amplitude fluctuations, indicating a uniform distribution of vibrational energy and consistent system operation. In contrast, the vibration signal of the defective propeller contains more pronounced spikes and abrupt transitions, reflecting significant waveform irregularity and carrying diagnostic information associated with localized structural damage. At the five pressure sensing locations, the normal propeller produces relatively regular and stable periodic pressure fluctuations, indicating smooth and consistent airflow dynamics. The defective propeller, by comparison, exhibits significantly greater amplitude variability and waveform irregularities in the pressure signals, particularly at monitoring points 1 and 2, attributable to airflow disturbances induced by the blade notch.

This experiment employs m = 3 for phase space reconstruction, with the time delay τ determined using the mutual information method over the candidate range [1, 200]. The Euclidean distance between each pair of reconstructed phase points is computed and mapped onto the Magma colormap to generate a color recurrence plot, in which color intensity reflects the distance between phase points. Darker colors (closer to black) indicate smaller distances between phase points, signifying higher similarity and recurrence in the system’s dynamic behavior. Lighter colors (closer to red) indicate larger distances between phase points and thus greater divergence in dynamic behavior. Taking the operational data of a standard propeller and a propeller with a 5 mm notch at the blade leading edge as examples, Figure 9 compares the vibration and barometric pressure pulsation recurrence plots for both propellers. The recurrence plot (a) generated from vibration data during normal propeller operation exhibits strong periodicity, high determinism, and a clear structural pattern. In contrast, the recurrence plot (g) derived from vibration data with a 5 mm notch at the blade leading edge displays disrupted periodicity, fragmented structure, and increased chaotic behavior. The divergence between (a) and (g) indicates that the blade notch significantly alters the system’s nonlinear dynamic behavior. For the five-channel pressure pulsation array, the 5 mm leading-edge notch induces significant changes at monitoring points 2 and 3. Compared with those of the standard propeller, the recurrence plots of monitoring points 2 and 3 under the notch condition exhibit numerous abrupt structural changes, with the attractor boundary expanded and indistinct. This indicates the presence of strong transient disturbances.

From a symmetry perspective, the comparative visualization in Figure 9 illustrates how propeller defects disrupt the symmetry of the system’s dynamic behavior. Under healthy operating conditions, propeller rotation exhibits strong periodic symmetry, which manifests as regular periodic orbits in phase space reconstruction and highly symmetric texture patterns in recurrence plots. The recurrence plots for the healthy propeller (a–f) exhibit strong symmetric recurrence characteristics: (i) parallel diagonal structures indicating periodic recurrence, (ii) uniform texture distribution reflecting stable rotational symmetry, and (iii) consistent attractor morphology across all pressure monitoring points. In contrast, the defective propeller (g–l) displays pronounced asymmetry, including: (i) fragmented and non-parallel diagonal structures, (ii) irregular texture clustering with abrupt intensity variations at monitoring points 2 and 3, and (iii) expanded and distorted attractor boundaries. These asymmetric features are directly correlated with the physical imbalance induced by the blade notch, as the defect disrupts periodic pressure wave propagation and introduces phase jitter into the vibration signals.

The time-series data from normal and defective propellers (vibration data and five-channel air pressure pulsation arrays) exhibit distinct differences when processed using PSR-based color recurrence maps. These differences can serve as feature inputs for classification models to distinguish between various fault types. Experimental validation confirms the feasibility of this approach for propeller defect classification.

4.2. MDI-STFFNet

4.2.1. Spatial Feature Extraction

Following normalization, the single-channel vibration data and five-channel air pressure pulsation data were subjected to phase space reconstruction. This technique projects the one-dimensional time series vibration and air pressure data into a three-dimensional phase space. By calculating the Euclidean distance between each pair of phase points, a distance matrix was constructed. Subsequently, the distances in the matrix were mapped onto the Magma color spectrum to generate a color-coded recurrence plot. The input consists of six time series, generating six color recurrence maps. Each color recurrence map has three channels, resulting in a total of eighteen input channels. These eighteen channels undergo a 3 × 3 convolution operation. Each output channel corresponds to one set of convolutional kernels (eighteen in total, with each kernel matching one input channel). The outputs from these eighteen kernels are summed element-wise to produce a feature map for a single output channel. Three such sets of kernels are used to generate three output channels, thereby conforming to the input requirements of the ResNet architecture.
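The 18-to-3 channel reduction described above can be sketched as a plain-Python 3 × 3 convolution in which the responses of all input channels are summed per output channel. Padding and stride are not stated in the text; this sketch assumes no padding and stride 1, and the tiny all-ones input is a hypothetical example.

```python
def conv3x3_sum_channels(image, kernels, biases):
    """Cross-correlate image[C_in][H][W] with kernels[C_out][C_in][3][3].

    Each output channel sums the 3x3 responses of all input channels,
    giving a result of shape [C_out][H-2][W-2] (no padding, stride 1).
    """
    c_in, height, width = len(image), len(image[0]), len(image[0][0])
    output = []
    for kernel_set, bias in zip(kernels, biases):
        fmap = [[bias] * (width - 2) for _ in range(height - 2)]
        for c in range(c_in):
            for r in range(height - 2):
                for s in range(width - 2):
                    fmap[r][s] += sum(
                        kernel_set[c][u][v] * image[c][r + u][s + v]
                        for u in range(3) for v in range(3)
                    )
        output.append(fmap)
    return output


# 6 recurrence plots x 3 color channels = 18 input channels (4x4 toy images).
image = [[[1] * 4 for _ in range(4)] for _ in range(18)]
kernels = [[[[1] * 3 for _ in range(3)] for _ in range(18)] for _ in range(3)]
out = conv3x3_sum_channels(image, kernels, biases=[0, 0, 0])
n_weights = 3 * 18 * 3 * 3  # kernel weights in this adapter layer
print(len(out), out[0][0][0], n_weights)
```

Under these assumptions the adapter layer has 3 × 18 × 3 × 3 = 486 kernel weights, which is small compared with the ResNet-18 backbone it feeds.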

4.2.2. Temporal Feature Extraction

The six normalized data streams are further processed through six LSTM networks to extract six sets of abstract features. As the six original datasets comprise one vibration signal and five air pressure pulsation signals—representing two distinct modal data types—direct fusion of these disparate modalities risks overemphasizing or neglecting certain modality-specific information, potentially leading to information loss or misinterpretation. This paper first applies max pooling to the five air pressure pulsation signals, thereby compressing the temporal dimension, filtering out noise and minor disturbances, and preserving the most salient pulsation characteristics. The vibration data features extracted by the LSTM are then fused with the max-pooled air pressure pulsation features through the Channel-wise Soft Weighted Fusion Module (CSW-FM, Section 4.2.3), yielding a combined feature representation that integrates both vibration and air pressure pulsation information.
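The parameter-free max pooling applied to each pressure channel can be sketched as follows. The kernel size and stride are not given in the text, so the values below are purely illustrative assumptions.

```python
def max_pool_1d(signal, kernel_size, stride=None):
    """Keep the maximum of each window; stride defaults to the kernel size."""
    stride = stride or kernel_size
    last_start = len(signal) - kernel_size
    return [max(signal[i:i + kernel_size]) for i in range(0, last_start + 1, stride)]


pooled = max_pool_1d([1, 3, 2, 5, 4, 0], kernel_size=2)

# A 200-point pressure segment pooled with a hypothetical kernel size of 4
# shrinks to 50 points without introducing any trainable parameters.
pressure_segment = [float(i % 7) for i in range(200)]
compressed = max_pool_1d(pressure_segment, kernel_size=4)
print(pooled, len(compressed))
```

Because pooling has no weights, compressing the five pressure channels this way shortens the sequence that later stages must process while keeping the largest pulsation peaks.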

4.2.3. Spatio-Temporal Feature Fusion

The fusion of spatiotemporal features is performed using a Channel-wise Soft Weighted Fusion Module (CSW-FM), in which spatial and temporal features are separately fed into a shared linear transformation matrix W to generate low-dimensional attention-based latent representations. Subsequently, the dot product between the latent representations and the learnable query vector q is computed to obtain the raw scores for both modalities. Following softmax normalization, the dynamic weights α_vib and α_air are obtained, satisfying the constraint α_vib + α_air = 1. Scalar-level weighting is applied to the spatial and temporal features using the aforementioned weights to obtain weighted single-modal vectors. Subsequently, these vectors are concatenated to form the final fused representation. This characterization simultaneously captures both the high-frequency transient information from vibration signals and the macro-level trend information from air pressure data. The proportion of each type of information is automatically adjusted through an attention mechanism conditioned on the current input, thereby significantly improving the robustness and accuracy of subsequent fault diagnosis and operational condition recognition tasks.
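The weighting scheme described above can be sketched in plain Python. This is a sketch under stated assumptions rather than the authors' implementation: both feature vectors are assumed to have the same length so that the shared matrix W applies to each, no nonlinearity is inserted between the linear map and the query dot product, and all names and values are hypothetical.

```python
import math


def csw_fuse(feat_a, feat_b, W, q):
    """Softmax-weight two feature vectors via a shared map W and query q, then concatenate."""

    def score(feat):
        latent = [sum(w * f for w, f in zip(row, feat)) for row in W]  # shared linear map
        return sum(qi * li for qi, li in zip(q, latent))               # dot product with q

    s_a, s_b = score(feat_a), score(feat_b)
    peak = max(s_a, s_b)  # subtract the maximum for numerical stability
    e_a, e_b = math.exp(s_a - peak), math.exp(s_b - peak)
    alpha_a, alpha_b = e_a / (e_a + e_b), e_b / (e_a + e_b)
    # Scalar-level weighting of each vector, followed by concatenation.
    fused = [alpha_a * v for v in feat_a] + [alpha_b * v for v in feat_b]
    return fused, alpha_a, alpha_b


W = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical shared weights
q = [1.0, 1.0]                # hypothetical query vector
fused, alpha_a, alpha_b = csw_fuse([1.0, 2.0], [1.0, 2.0], W, q)
print(fused, alpha_a, alpha_b)
```

Identical inputs receive equal weights of 0.5 each. When one input scores higher under q, its weight grows at the expense of the other, since the two weights always sum to 1.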

4.2.4. Model: Training and Defect Classification Results

In the proposed MDI-STFFNet model, ResNet18 is initialized with the ImageNet-1K V1 pre-trained weights, while the remaining parameters are randomly initialized. The ResNet18 parameters are not frozen during training; they are fine-tuned by backpropagation together with the rest of the network. The model was trained for 200 epochs with a batch size of 16 and a learning rate of 1 × 10⁻³ on the 7-class classification task. The dataset was partitioned into training and test sets in an 8:2 ratio. Figure 10 shows the training and test accuracy curves over the course of training. The model converged within about 50 epochs, with training and test accuracy improving together. Slight overfitting emerged in the later stages of training (beyond 150 epochs): the training accuracy approached 100%, while the test accuracy stabilized at approximately 95% or higher. Overall, accuracy improved steadily, convergence was reasonable, and the gap between training and test accuracy remained small. Future work may consider expanding the test set to assess the model’s generalization performance more accurately.

After 200 training epochs, the accuracy on the 7-class training set reached 99.75%, while the test set achieved 99.01%. The confusion matrix is presented in Figure 11. The training curves show that after 150 epochs the training accuracy continues to increase gradually while the test accuracy stabilizes at roughly 95% or higher, with a standard deviation below 0.3% across three repeated trials and a final training–testing gap of about 0.7 percentage points (99.75% versus 99.01%). Given that each class in the in-house dataset contains approximately 115 training samples, a test accuracy of 99% is close to the practical performance ceiling for a dataset of this size. Furthermore, the confusion matrix contains only two off-diagonal misclassifications, which appear scattered rather than concentrated in a systematic pattern, a distribution more indicative of inherent data noise than of systematic overfitting. Thus, the observed minor discrepancies are attributed primarily to uncertainty arising from the limited sample size rather than to model overfitting. Future work will systematically evaluate the incremental benefits of regularization strategies such as dropout and early stopping on larger-scale datasets.
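The reported percentages can be cross-checked against whole sample counts. The sketch below assumes that the 8:2 split of the 1008 samples in Table 1 (7 classes × 144 samples) yields 806 training and 202 test samples; the exact split sizes are not stated in the text, so `n_test = 202` is an assumption obtained by rounding 201.6. The helper names are hypothetical.

```python
from typing import List

SAMPLES_PER_CLASS = 144   # from Table 1
NUM_CLASSES = 7           # labels 0 to 6
TEST_FRACTION = 0.2       # 8:2 split

total = SAMPLES_PER_CLASS * NUM_CLASSES   # 1008
n_test = round(total * TEST_FRACTION)     # assumed rounding: 201.6 -> 202
n_train = total - n_test                  # 806


def as_percent(correct: int, n: int) -> str:
    """Format an accuracy as a percentage with two decimals."""
    return f"{100.0 * correct / n:.2f}"


def counts_matching(reported: str, n: int) -> List[int]:
    """All correct-sample counts whose rounded accuracy equals `reported`."""
    return [k for k in range(n + 1) if as_percent(k, n) == reported]


print(n_train, n_test)                    # 806 202
print(as_percent(n_test - 2, n_test))     # 99.01 (two misclassified samples)
print(counts_matching("99.75", n_train))  # [804]
print(round(n_train / NUM_CLASSES, 1))    # 115.1 training samples per class
```

Under this assumption, two misclassified test samples reproduce the reported 99.01%, and each additional error costs about 0.5 percentage points, which is why small accuracy differences on this test set should be read with caution.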

To further validate the model’s stability and generalization capability, five-fold cross-validation was added to the experimental framework. The training dataset was randomly partitioned into five subsets; each subset was used once for validation while the remaining four were used for training. The validation accuracies of the five folds were 100.00%, 96.27%, 96.89%, 95.65%, and 95.65%, giving an average accuracy of 96.89%. The fold accuracies range from 95.65% to 100.00%, indicating that performance remains consistently high across different data partitions and that the model accommodates variations in data distribution. These results support the reliability and generalization ability of the proposed approach.
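The cross-validation summary can be reproduced directly from the five fold accuracies quoted above. The short sketch below recomputes the mean and adds two spread measures (sample standard deviation and range) that the text does not report; the variable names are illustrative.

```python
import statistics

# Fold accuracies in percent, as reported for the five-fold cross-validation.
fold_acc = [100.00, 96.27, 96.89, 95.65, 95.65]

mean_acc = statistics.mean(fold_acc)
std_acc = statistics.stdev(fold_acc)       # sample standard deviation
spread = max(fold_acc) - min(fold_acc)     # best fold minus worst fold

print(f"mean   = {mean_acc:.2f}%")   # mean   = 96.89%
print(f"stdev  = {std_acc:.2f} percentage points")
print(f"spread = {spread:.2f} percentage points")
```

The mean matches the reported 96.89%. The fold-to-fold standard deviation of roughly 1.8 percentage points gives a sense of how much a single train/test split can vary on a dataset of this size.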

4.3. Ablation Experiments

4.3.1. Comparison Between Single-Modality Input and Multi-Modality Input

To verify that multimodal input characterizes the system’s health status more comprehensively than unimodal input, experiments were conducted with vibration data alone and with pressure array data alone. For vibration-only input, spatial features are extracted by the PSR-CRP-ResNet network and temporal features by the LSTM network. The extracted spatio-temporal features are fused by the CSW-FM, and a fully connected layer produces the classification result. After 200 training epochs, the test set accuracy reached 94.55%. Figure 12 shows the accuracy curve and confusion matrix of this vibration-only model. The multimodal model (99.01%) is therefore 4.46 percentage points more accurate than the vibration-only model.

For pressure-array-only input, spatial features are extracted by the PSR-CRP-ResNet network, while temporal features are captured by a five-channel parallel LSTM network followed by max pooling for dimensionality reduction. The resulting spatio-temporal features are fused by the CSW-FM, and a fully connected layer produces the classification result. After 200 training epochs, the test set accuracy reached 97.03%. Figure 13 shows the accuracy curve and confusion matrix of this pressure-array-only model. The multimodal model (99.01%) is therefore 1.98 percentage points more accurate than the pressure-array-only model.

The results indicate that multimodal input improves test set accuracy by 4.46 and 1.98 percentage points over vibration-only and pressure-array-only input, respectively. This supports the view that integrating sensor data from different sources captures richer feature information and gives a more comprehensive characterization of the system’s health status, which in turn improves the model’s ability to recognize defects.

4.3.2. Comparison Between Single Feature and Spatio-Temporal Feature Integration

To assess the benefit of spatio-temporal feature fusion, ablation studies were conducted on the temporal and spatial feature extraction components. In the spatial-only configuration, the vibration signal and the five air pressure pulsation signals are fed directly into the PSR-CRP-ResNet network, whose output is connected to the fully connected layer. After 200 training epochs, the test set accuracy reached 99.01%, identical to that of the spatio-temporal fusion model. Figure 14 shows the accuracy curve and confusion matrix of the spatial-only model. The accuracy curves differ, however. With spatio-temporal fusion (Figure 10), the test accuracy fluctuates only slightly around a high level, indicating a well-trained model with good generalization. With spatial features alone, the test accuracy fluctuates markedly, which suggests overfitting and a limited ability to capture the dynamic characteristics of the data. Adding temporal features therefore helps the model capture the underlying data patterns and makes its performance more reliable in practical applications.

In the temporal-only configuration, six LSTM networks extract temporal features from the vibration signal and the five air pressure pulsation signals. The temporal features of the five pressure signals are reduced by max pooling and then fused with the temporal features of the vibration signal by the CSW-FM. The fused features are passed through a fully connected layer to produce the final defect classification. After 200 training epochs, the test set accuracy reached 99.50%, which is 0.49 percentage points higher than that of spatio-temporal feature fusion. Figure 15 shows the accuracy curve and confusion matrix of the temporal-only model. Despite its marginally higher final test accuracy, the temporal-only model exhibits pronounced fluctuations in test accuracy, which are attributed to the absence of spatial feature processing and indicate limited adaptability across varying data distributions. The spatio-temporal fusion model, which integrates PSR-CRP-ResNet and LSTM, consolidates multidimensional information and is therefore more robust and adaptable under complex data conditions. Considering stability, practical application value, and potential for further improvement, the spatio-temporal fusion model is the more favorable choice.

In summary, the ablation experiments show that the spatial-only model reaches a test accuracy of 99.01% but oscillates severely during convergence because it omits dynamic evolutionary information, which limits its generalization performance. The temporal-only model reaches 99.50% but lacks explicit modeling of spatial correlation structures, which makes it sensitive to shifts in data distribution and insufficiently robust. The spatio-temporal fusion approach combines the multidimensional spatial representations from PSR-CRP-ResNet with the long-term dependency modeling of the LSTM in parallel branches. It reduces the variance of the accuracy curve while maintaining high discriminative power, resulting in a stable accuracy curve with a small generalization error. The approach integrates structural interpretability with engineering scalability. Spatio-temporal fusion therefore confirms the complementarity of the spatial and temporal domains and offers a balanced trade-off among accuracy, robustness, and potential for enhancement, supporting reliable fault diagnosis under complex operating conditions.
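The stability argument above rests on how much the test accuracy curve fluctuates late in training. One simple way to put a number on this is to summarize the last epochs of each curve by their mean, standard deviation, and range. The sketch below does so for two synthetic placeholder curves; these values are invented for illustration only and are not measurements from Figures 10, 14, or 15. The function name `late_stage_fluctuation` and the 50-epoch window are likewise assumptions.

```python
import statistics
from typing import Dict, Sequence


def late_stage_fluctuation(test_acc: Sequence[float], last_n: int = 50) -> Dict[str, float]:
    """Summarize the final `last_n` epochs of a test accuracy curve."""
    tail = list(test_acc[-last_n:])
    if len(tail) < 2:
        raise ValueError("need at least two epochs to measure fluctuation")
    return {
        "mean": statistics.mean(tail),
        "std": statistics.pstdev(tail),   # spread around the late-stage mean
        "range": max(tail) - min(tail),   # largest epoch-to-epoch swing
    }


# Synthetic placeholder curves over 200 epochs (NOT data from the paper):
# one alternates by +/-0.2 points, the other by +/-3.0 points.
steady_curve = [98.5 + 0.2 * (-1) ** epoch for epoch in range(200)]
oscillating_curve = [96.0 + 3.0 * (-1) ** epoch for epoch in range(200)]

steady = late_stage_fluctuation(steady_curve)
oscillating = late_stage_fluctuation(oscillating_curve)
print(steady["std"] < oscillating["std"])  # True
```

Reporting such a late-stage standard deviation alongside the final accuracy would make the comparison between fused and single-feature models less dependent on visual inspection of the curves.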

4.3.3. Direct Feature Concatenation Versus CSW-FM

The proposed model contains two channel-level soft-weighted fusion modules: one fuses the vibration features with the air pressure pulsation features during temporal feature extraction, and the other fuses the spatial features with the temporal features. To validate the effectiveness of the CSW-FM, both modules were replaced with direct feature concatenation. Figure 16 shows the accuracy curve and confusion matrix of the diagnostic model without CSW-FM. After 200 training epochs, its test set accuracy reached 98.51%. Compared with direct feature concatenation, the CSW-FM therefore improves diagnostic accuracy by 0.5 percentage points (99.01% versus 98.51%).
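For reference, the test accuracies reported in Section 4.2.4 and in the comparisons of this section can be collected in one place and expressed as differences from the full model, in percentage points. The dictionary keys below are shorthand labels chosen for this sketch, not names used by the model.

```python
FULL_MODEL_ACC = 99.01  # full MDI-STFFNet, test accuracy in percent

# Test accuracies (percent) of the reduced configurations, as reported.
variant_acc = {
    "vibration only": 94.55,
    "pressure array only": 97.03,
    "spatial features only": 99.01,
    "temporal features only": 99.50,
    "direct concatenation (no CSW-FM)": 98.51,
}

# Positive gap: the full model is better; negative gap: the variant is better.
gaps = {name: round(FULL_MODEL_ACC - acc, 2) for name, acc in variant_acc.items()}

for name, gap in gaps.items():
    print(f"{name:34s} {variant_acc[name]:6.2f}%  gap = {gap:+.2f} pp")
```

The differences are percentage points rather than relative improvements. Only the temporal-only variant has a negative gap (-0.49), which is why the case for fusion in Section 4.3.2 rests on curve stability rather than on final accuracy.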

4.3.4. Analysis of the Lightweight Design and Real-Time Deployment Capabilities of MDI-STFFNet

MDI-STFFNet, as a blade defect diagnosis model designed for rotary-wing unmanned aerial vehicles (UAVs), not only achieves high diagnostic accuracy but also shows potential for lightweight, real-time deployment. In practical UAV applications, a lightweight architecture and low-power inference are essential for successful onboard implementation. As highlighted by Bakirci [32], UAV platforms impose stringent constraints on computational resources and power consumption, which makes the deployment of deep learning models challenging. That study evaluates multiple lightweight variants of the YOLO series (e.g., YOLOv10-nano) and demonstrates the feasibility of low-power, low-latency, real-time object detection on UAV platforms. These findings provide a technical reference and practical guidance for the airborne deployment of MDI-STFFNet. Additionally, Liu et al. [33] proposed a real-time object detection method based on the lightweight FasterNet-16 backbone network, designed to meet the low-power and resource-constrained requirements of UAV-based remote sensing systems. By integrating multi-scale feature fusion techniques and passive ensemble strategies, their method improves feature processing efficiency without compromising architectural compactness. This result shows that, with principled network architecture design and systematic optimization, deep learning models can run efficiently on board UAV platforms while satisfying real-time inference and energy efficiency demands.

In the design of MDI-STFFNet, the pre-trained ResNet-18 model is employed as the feature extraction backbone. ResNet-18 has a relatively shallow architecture and a low parameter count, which gives it inherent advantages in computational complexity and memory footprint. Furthermore, the Channel-wise Soft Weighted Fusion Module (CSW-FM) allocates weights dynamically across modalities using only a shared projection and a query vector, so it adds little computational overhead while preserving high diagnostic accuracy. These architectural choices keep MDI-STFFNet lightweight without sacrificing diagnostic precision, which improves its suitability for real-time deployment on UAV platforms. Nevertheless, opportunities remain for further reduction in power consumption. Future work will focus on optimizing computational efficiency and energy usage to better match the operational constraints of practical UAV applications.
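To give a feel for the relative cost of the temporal branch, the sketch below counts the parameters of a single-layer LSTM using the standard formula of four gates, each with an input weight matrix, a recurrent weight matrix, and one bias vector. The input size of 1 and hidden size of 64 are hypothetical, since the LSTM dimensions are not given in this section, and frameworks that keep two bias vectors per gate will report slightly more parameters than this formula.

```python
def lstm_param_count(input_size: int, hidden_size: int) -> int:
    """Parameters of one single-layer LSTM (one bias vector per gate).

    Each of the 4 gates has:
      - an input weight matrix     (hidden_size x input_size)
      - a recurrent weight matrix  (hidden_size x hidden_size)
      - a bias vector              (hidden_size)
    """
    per_gate = hidden_size * input_size + hidden_size * hidden_size + hidden_size
    return 4 * per_gate


# Hypothetical sizes: scalar input per time step, 64 hidden units.
INPUT_SIZE = 1
HIDDEN_SIZE = 64
NUM_LSTMS = 6  # one vibration channel plus five pressure channels

single = lstm_param_count(INPUT_SIZE, HIDDEN_SIZE)
all_lstms = NUM_LSTMS * single
print(single, all_lstms)  # 16896 101376
```

With these assumed sizes, the six LSTMs together hold about 0.1 million parameters, small compared with the roughly 11.7 million parameters of a standard 1000-class ResNet-18. Under such settings the backbone, not the temporal branch, would dominate the model's memory footprint.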

5. Conclusions

In summary, this study presents a rotor blade defect diagnosis model for rotary-wing unmanned aerial vehicles (UAVs) based on the MDI-STFFNet architecture, achieving a test accuracy of 99.01% in a seven-class defect classification task. The proposed model integrates multimodal signal acquisition with strong interference resistance, employs symmetry-based defect characterization to enhance interpretability, and utilizes a single-input dual-path architecture with multi-scale spatio-temporal feature fusion to comprehensively capture defect-related features. Specifically, the dual-sensor framework enhances robustness against environmental disturbances, while the symmetry-driven detection approach offers a generalizable paradigm for anomaly identification. The dual-path design enables simultaneous modeling of spatial textures and long-term temporal dependencies, thereby significantly improving sensitivity to subtle defects. Future research will focus on several key directions: expanding the dataset to encompass diverse motor types and rotor configurations to support transfer learning and domain adaptation; enhancing noise robustness through adversarial training to maintain diagnostic accuracy above 95% under high-noise conditions; evaluating model performance under extreme operational scenarios using multi-variable data fusion, including integration of temperature sensors; and optimizing the model for on-board deployment on UAVs via techniques such as knowledge distillation and quantization.

Author Contributions

Conceptualization, B.C., Z.T. and D.J.; methodology, Z.T., B.C. and L.X.; software, B.C., D.J. and X.W.; validation, B.C. and Z.T.; formal analysis, L.X. and Y.L.; investigation, B.C., X.W. and L.X.; resources, Z.T.; data curation, B.C., D.J., X.W. and P.T.; writing—original draft, B.C.; writing—review & editing, Z.T.; visualization, B.C., P.T. and Y.L.; supervision, Z.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the North China Institute of Aerospace Engineering, grant number YKY-2024-23.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

Authors Dezhi Jiang and Yanxia Li were employed by Zhejiang Dayuan Pumps Industrial Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Zhang, K.; Zhao, L.; Cui, J.; Mao, P.; Yuan, B.; Liu, Y. Design and Implementation of Evaluation Method for Spraying Coverage Region of Plant Protection UAV. Agronomy 2023, 13, 1631.
2. Zhang, J.; Qiang, Z.; Lin, H.; Chen, Z.; Li, K.; Zhang, S. Research on Tobacco Field Semantic Segmentation Method Based on Multispectral Unmanned Aerial Vehicle Data and Improved PP-LiteSeg Model. Agronomy 2024, 14, 1502.
3. Patiluna, V.; Maja, J.M.; Robbins, J. Evaluation of Radio Frequency Identification Power and Unmanned Aerial Vehicle Altitude in Plant Inventory Applications. AgriEngineering 2024, 6, 1319–1334.
4. Liu, R.; Zhang, Z.; Jiao, Y.; Yang, C.; Zhang, W. Study on Flight Performance of Propeller-Driven UAV. Int. J. Aerosp. Eng. 2019, 2019, 6282451.
5. Kameyama, S.; Sugiura, K. Estimating Tree Height and Volume Using Unmanned Aerial Vehicle Photography and SfM Technology, with Verification of Result Accuracy. Drones 2020, 4, 19.
6. Czyż, Z.; Karpiński, P.; Szczepaniak, R.; Sapiński, P.; Depczyński, W.; Bańkowski, D.; Skiba, K. Investigation of the autogyro main rotor blade failure after flight tests. Eng. Fail. Anal. 2025, 182, 110136.
7. Rogers, T.J.; Worden, K.; Cross, E.J. On the application of Gaussian process latent force models for joint input-state-parameter estimation: With a view to Bayesian operational identification. Mech. Syst. Signal Process. 2020, 140, 106580.
8. Zheng, L.; Xiang, Y.; Luo, N. Nonlinear dynamic modeling and vibration analysis for early fault evolution of rolling bearings. Sci. Rep. 2024, 14, 23687.
9. Dore, P.; Chakkor, S.; Oualkadi, A.E.; Baghouri, M. Real-time intelligent system for wind turbine monitoring using fuzzy system. E-Prime - Adv. Electr. Eng. Electron. Energy 2023, 3, 100096.
10. Liu, X.; Chen, Y.; Xiong, L.; Wang, J.; Luo, C.; Zhang, L.; Wang, K. Intelligent fault diagnosis methods toward gas turbine: A review. Chin. J. Aeronaut. 2024, 37, 93–120.
11. Xu, X.; Huang, X.; Bian, H.; Wu, J.; Liang, C.; Cong, F. Total process of fault diagnosis for wind turbine gearbox, from the perspective of combination with feature extraction and machine learning: A review. Energy AI 2024, 15, 100318.
12. Ke, L.; Liu, Y.; Yang, Y. Compound Fault Diagnosis Method of Modular Multilevel Converter Based on Improved Capsule Network. IEEE Access 2022, 10, 41201–41214.
13. Wei, L.; Peng, X.; Cao, Y. Intelligent fault diagnosis of rolling bearings in strongly noisy environments using graph convolutional networks. Int. J. Adapt. Control Signal Process. 2024, 39, 1469–1482.
14. Yang, P.; Wen, C.; Geng, H.; Liu, P. Intelligent Fault Diagnosis Method for Blade Damage of Quad-Rotor UAV Based on Stacked Pruning Sparse Denoising Autoencoder and Convolutional Neural Network. Machines 2021, 9, 360.
15. Liu, W.; Chen, Z.; Zheng, M. An Audio-Based Fault Diagnosis Method for Quadrotors Using Convolutional Neural Network and Transfer Learning. In Proceedings of the 2020 American Control Conference (ACC), Denver, CO, USA, 1–3 July 2020; pp. 1367–1372.
16. Iannace, G.; Ciaburro, G.; Trematerra, A. Fault diagnosis for UAV blades using artificial neural network. Robotics 2019, 8, 59.
17. Steinhoff, L.; Koschlik, A.K.; Arts, E.; Soria-Gomez, M.; Raddatz, F.; Kunz, V.D. Development of an acoustic fault diagnosis system for UAV propeller blades. CEAS Aeronaut. J. 2024, 15, 881–893.
18. Altinors, A.; Yol, F.; Yaman, O. A sound based method for fault detection with statistical feature extraction in UAV motors. Appl. Acoust. 2021, 183, 108325.
19. Packard, N.H.; Crutchfield, J.P.; Shaw, R.S. Geometry from a Time Series. Phys. Rev. Lett. 2008, 45, 712–725.
20. Takens, F. Detecting strange attractors in turbulence. In Dynamical Systems and Turbulence, Warwick 1980; Springer: Berlin/Heidelberg, Germany, 2006; pp. 366–381.
21. Ma, Y.F.; Niu, P.F.; Ma, X.F. Research on the Phase Space Reconstruction Method of Chaotic Time Series. Appl. Mech. Mater. 2010, 26–28, 236–240.
22. Tolle, C.R.; Pengitore, M. Phase-space reconstruction: A path towards the next generation of nonlinear differential equation based models and its implications towards non-uniform sampling theory. In Proceedings of the 2009 2nd International Symposium on Resilient Control Systems, Idaho Falls, ID, USA, 11–13 August 2009; pp. 63–68.
23. Lekscha, J.; Donner, R.V. Phase space reconstruction for non-uniformly sampled noisy time series. Chaos Interdiscip. J. Nonlinear Sci. 2018, 28, 085702.
24. McGowan, S.P.; Robertson, W.S.P.; Blachut, C.; Balasuriya, S. Optimal Reconstruction of Vector Fields from Data for Prediction and Uncertainty Quantification. J. Nonlinear Sci. 2024, 34, 73.
25. Cui, B.; Tan, Z.; Gao, Y.; Wang, X.; Xiao, L. Research on a Fault Diagnosis Method for Rolling Bearings Based on the Fusion of PSR-CRP and DenseNet. Processes 2025, 13, 2372.
26. Ladino-Moreno, E.O.; García-Ubaque, C.A.; Zamudio-Huertas, E. Method for phase space reconstruction to estimate the short-term future behavior of pressure signals in pipelines. MethodsX 2024, 12, 102620.
27. He, K.; Zhang, X.; Ren, S.; Sun, J. Identity Mappings in Deep Residual Networks. In Computer Vision - ECCV 2016; Lecture Notes in Computer Science; Springer International Publishing: Berlin/Heidelberg, Germany, 2016; pp. 630–645.
28. Zaeemzadeh, A.; Rahnavard, N.; Shah, M. Norm-Preservation: Why Residual Networks Can Become Extremely Deep? IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 3980–3990.
29. Biswas, S.; Breuel, T. Learning Morphological Transformations with Recurrent Neural Networks. Procedia Comput. Sci. 2015, 53, 335–344.
30. Wu, B.; Wang, Z.; Chen, K.; Yan, C.; Liu, W. GBC: An Energy-Efficient LSTM Accelerator With Gating Units Level Balanced Compression Strategy. IEEE Trans. Circuits Syst. I Regul. Pap. 2022, 69, 3655–3665.
31. Hong, J.; Liang, F.; Yang, H.; Zhang, C.; Zhang, X.; Zhang, H.; Wang, W.; Li, K.; Yang, J. Multi-forword-step state of charge prediction for real-world electric vehicles battery systems using a novel LSTM-GRU hybrid neural network. ETransportation 2024, 20, 100322.
32. Bakirci, M. Performance evaluation of low-power and lightweight object detectors for real-time monitoring in resource-constrained drone systems. Eng. Appl. Artif. Intell. 2025, 159, 111775.
33. Liu, Z.; Chen, C.; Huang, Z.; Chang, Y.C.; Liu, L.; Pei, Q. A low-cost and lightweight real-time object-detection method based on UAV remote sensing in transportation systems. Remote Sens. 2024, 16, 3712.

Figure 1. Residual Mapping Structure.

Figure 2. ResNet-18 Network Architecture.

Figure 3. The Structure of an LSTM Memory Cell.

Figure 4. Propeller Test Bench for Rotary-Wing Unmanned Aerial Vehicles.

Figure 5. Distribution of Inspection Points on Propeller Blades.

Figure 6. Propeller Defect Types.

Figure 7. Propeller Defect Diagnosis Model for Rotary-Wing UAVs Based on MDI-STFFNet.

Figure 8. Vibration and Air Pressure Waveform Data for Normal Propeller and Propeller with 5 mm Blade Leading Edge Notch.

Figure 9. Comparison of vibration and pressure pulsation recurrence plots between a standard propeller and a propeller with a 5 mm notch at the blade leading edge.

Figure 10. Accuracy Curves for the Training and Test Sets.

Figure 11. Confusion Matrix.

Figure 12. Accuracy Curve (a) and Confusion Matrix (b) for the Diagnostic Model with Single Vibration Data Input.

Figure 13. Accuracy Curve (a) and Confusion Matrix (b) of the Diagnostic Model with Single-Pressure Pulsation Array Input.

Figure 14. Accuracy Curve (a) and Confusion Matrix (b) for Diagnostic Models Using Single Spatial Feature Extraction.

Figure 15. Accuracy Curve (a) and Confusion Matrix (b) for Diagnostic Models Using Single Temporal Feature Extraction.

Figure 16. Accuracy Curve (a) and Confusion Matrix (b) of the Diagnostic Model without CSW-FM.

Table 1. Statistical Sample of Propeller Data.

Defect Type	Defect Dimensions	Sample Size	Sampling Rate (Vibration Data)	Sampling Rate (Air Pressure Data)	Label
Normal	-	144	12 kHz	100 Hz	0
Blade Tip Fracture	5% Fracture	144	12 kHz	100 Hz	1
Blade Tip Fracture	10% Fracture	144	12 kHz	100 Hz	2
Pitching Notch on the Trailing Edge of the Blade	$d_{\max} = 5\ \text{mm}$	144	12 kHz	100 Hz	3
Pitching Notch on the Trailing Edge of the Blade	$d_{\max} = 10\ \text{mm}$	144	12 kHz	100 Hz	4
Pitching Notch on the Leading Edge of the Blade	$d_{\max} = 5\ \text{mm}$	144	12 kHz	100 Hz	5
Pitching Notch on the Leading Edge of the Blade	$d_{\max} = 10\ \text{mm}$	144	12 kHz	100 Hz	6

