1. Introduction
The fault diagnosis of rolling bearings is critical across key sectors such as industrial manufacturing, aerospace, energy, and transportation, as bearing performance directly impacts system safety, stability, and lifespan. However, external loads, friction, and structural fatigue during prolonged operation often cause failures such as bearing wear, gear fractures, and rotor imbalance. If not promptly detected and diagnosed, these failures can lead to reduced performance, increased energy consumption, or even catastrophic accidents [1,2]. Therefore, research into bearing fault diagnosis is of both significant engineering value and vital importance to the efficient operation of modern industrial systems.
Traditional fault diagnosis methods mainly rely on signal processing and statistical techniques such as time-domain, frequency-domain, and time–frequency analysis. These approaches extract feature parameters from signals to identify faults but often depend heavily on expert knowledge and are susceptible to noise, leading to reduced diagnostic accuracy and reliability, especially under complex and variable operating conditions. To overcome these limitations, data-driven methods have gained traction. Compared to traditional approaches, machine learning techniques offer stronger self-learning capabilities, enabling the automatic discovery of underlying patterns in large datasets while reducing dependence on domain expertise. In bearing fault diagnosis, supervised learning methods such as support vector machines (SVM) [3] and k-nearest neighbors (kNN) [4] have shown the ability to classify faults based on features extracted from vibration signals, providing accurate and reliable solutions. Despite overcoming some limitations of traditional methods, machine learning approaches still face challenges when dealing with high-dimensional, complex signals, particularly in low-data regimes.
To overcome these challenges and enhance feature extraction in complex signal environments, deep learning methods have been increasingly explored. Deep learning has shown great promise in this domain due to its powerful feature learning capabilities. CNNs [5] excel at extracting spatiotemporal features, RNNs [6] and LSTMs [7] effectively capture temporal dependencies, and Transformers [8] leverage self-attention mechanisms to enhance the modeling of long-sequence data. These advancements have significantly boosted the capabilities of intelligent fault diagnosis, especially under big data scenarios. However, in practical industrial settings, two key challenges persist: (1) limited labeled fault samples in the target domain, and (2) significant domain shifts caused by varying operating conditions, load levels, or sensor settings. These issues lead to distribution mismatch and poor generalization of diagnostic models trained in source domains.
To address this, few-shot domain adaptation (FSDA) has emerged as a promising direction, aiming to leverage limited target samples to enable cross-domain generalization. Several recent studies have combined meta-learning with domain adaptation to improve performance in FSDA settings. For example, ADMTL [9] introduces attention-based meta-transfer learning to reuse source knowledge in new tasks, while PMML [10] employs prototype-based matching to facilitate few-shot fault classification. These methods demonstrate the effectiveness of episodic training and representation reuse.
However, two important limitations remain unresolved: (1) Global-only alignment: Most existing domain adaptation methods rely on aligning global feature distributions via adversarial training or maximum mean discrepancy. These approaches fail to consider intra-class variations that naturally exist across different operational modes within the same fault category. Ignoring these subdomain discrepancies often leads to feature distortion or negative transfer [10]. Recent studies [11,12] have further emphasized that global alignment techniques often fail to preserve class-conditional structures and inter-class margins across subdomains, highlighting the need for subdomain-level feature adaptation in practical diagnostic settings. (2) Static alignment weights: Prior methods typically use fixed loss weights to balance domain adaptation and classification losses, and therefore lack flexibility in dynamic FSDA scenarios where the domain shift severity and task difficulty vary across episodes.
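The first limitation can be illustrated with a small numerical sketch. It is an assumption-laden toy, not the method of any cited work: the one-dimensional features, the class names `inner` and `outer`, the Gaussian kernel, and its bandwidth `gamma` are all hypothetical. The two classes swap positions between domains, so the pooled (global) distributions coincide while the class-conditional distributions do not.

```python
import math

def rbf(x, y, gamma=0.5):
    """Gaussian kernel between two feature vectors (gamma is an assumed bandwidth)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mmd2(xs, ys, gamma=0.5):
    """Biased estimate of the squared maximum mean discrepancy between two sample sets."""
    def mean_k(a, b):
        return sum(rbf(u, v, gamma) for u in a for v in b) / (len(a) * len(b))
    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)

# Hypothetical toy features: the two fault classes swap places across domains.
src = {"inner": [[0.0], [0.2]], "outer": [[5.0], [5.2]]}
tgt = {"inner": [[5.0], [5.2]], "outer": [[0.0], [0.2]]}

# Global discrepancy pools all classes; per-class discrepancy compares like with like.
global_mmd = mmd2(src["inner"] + src["outer"], tgt["inner"] + tgt["outer"])
class_mmd = {c: mmd2(src[c], tgt[c]) for c in src}

print(f"global MMD^2: {global_mmd:.4f}")
for c, v in class_mmd.items():
    print(f"class '{c}' MMD^2: {v:.4f}")
```

In this toy case a purely global criterion reports essentially no shift, although every class is misaligned, which is the kind of failure that subdomain-level alignment is meant to expose.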
To overcome these challenges, we propose a novel meta-learning based framework named Dynamic Balance Domain-Adaptation based Few-shot Diagnosis (DBDA-FD). Our approach explicitly incorporates both global and subdomain adversarial alignment mechanisms to improve feature matching fidelity. A dynamic balance factor is further introduced to adaptively modulate the importance of global vs. subdomain alignment during training. This enables our model to dynamically focus on more critical alignment levels according to domain discrepancy and task complexity. From a broader perspective, symmetry plays a critical role in various scientific domains, including physics, biology, and artificial intelligence. In the context of fault diagnosis, symmetry can be interpreted as the consistency or invariance in feature distributions across different domains and fault conditions. Our proposed method exploits this notion by enforcing symmetric alignment between the source and target domain features at both global and subdomain levels, leading to more robust and transferable representations for few-shot learning scenarios. Extensive experiments on the benchmark CWRU and PU datasets demonstrate that DBDA-FD consistently outperforms existing methods in both five-way five-shot and three-way five-shot diagnostic tasks. Our model achieves 97.6% and 97.3% accuracy on these two tasks, respectively, and shows significant improvements in robustness under severe domain shifts and class imbalance conditions.
The main contributions of this paper are as follows:
Dual-Level Alignment: We propose a novel few-shot diagnosis framework named DBDA-FD, which for the first time integrates both global and subdomain adversarial alignment into a meta-learning structure. This enables more fine-grained feature alignment and better handling of domain shifts across diverse working conditions.
Dynamic Balance Factor: A dynamic balancing factor is introduced to adaptively weigh global and subdomain alignment losses during adversarial training. This mechanism enhances feature transferability while mitigating overfitting, especially under data imbalance.
Superior Performance: Extensive experiments on CWRU and PU datasets show that DBDA-FD achieves 97.6% and 97.3% accuracy on five-way five-shot and three-way five-shot tasks, respectively, outperforming recent SOTA methods including ADMTL and PMML by 0.6–1.4%.
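To make the idea of a dynamic balance factor concrete, the following is a minimal sketch under stated assumptions; this section does not give the factor's formula, so the rule below is illustrative only. It assumes that each discriminator's error rate is converted into a proxy A-distance, 2(1 − 2·err), and that each alignment level is weighted in proportion to its estimated discrepancy. The function names `proxy_a_distance`, `balance_factor`, and `alignment_loss` are hypothetical.

```python
def proxy_a_distance(disc_error):
    """Proxy A-distance 2 * (1 - 2 * err) from a domain discriminator's error rate."""
    return max(0.0, 2.0 * (1.0 - 2.0 * disc_error))

def balance_factor(global_err, subdomain_errs, eps=1e-8):
    """Assumed rule: share of alignment weight assigned to the global level."""
    d_global = proxy_a_distance(global_err)
    d_sub = sum(proxy_a_distance(e) for e in subdomain_errs) / len(subdomain_errs)
    total = d_global + d_sub
    if total < eps:
        return 0.5  # neither level shows measurable shift: split evenly
    return d_global / total

def alignment_loss(loss_global, loss_sub, mu):
    """Convex combination of the global and subdomain adversarial losses."""
    return mu * loss_global + (1.0 - mu) * loss_sub

# Hypothetical episode: the global discriminator separates domains easily
# (low error), while the per-class discriminators are close to chance.
mu = balance_factor(global_err=0.1, subdomain_errs=[0.4, 0.4])
total = alignment_loss(loss_global=1.0, loss_sub=2.0, mu=mu)
print(f"mu = {mu:.2f}, combined alignment loss = {total:.2f}")
```

Recomputing `mu` once per episode or epoch, rather than fixing it in advance, is what distinguishes this kind of scheme from the static loss weights discussed in the introduction.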
2. Related Work
Convolutional neural networks (CNNs) have been widely adopted in mechanical fault diagnosis due to their powerful feature extraction capabilities. Unlike traditional methods that rely on handcrafted features, CNNs enable end-to-end learning of critical local patterns, making them particularly effective for time–frequency images and vibration signal pattern recognition. Ince et al. [13] proposed a real-time motor fault detection method based on one-dimensional CNN (1D-CNN) that directly processes raw current signals without additional feature extraction, learning temporal features through multiple convolutional and pooling layers for efficient classification. Zhu et al. [14] employed the symmetrized dot pattern (SDP) technique to convert rotor vibration signals into visual representations, enabling CNN-based feature extraction and enhancing adaptability to complex signal patterns. Iqbal et al. [15] integrated vibration and acoustic signals, constructing time–frequency maps via short-time Fourier transform (STFT) and feeding them into CNNs for feature extraction and classification, thereby improving diagnostic accuracy under multimodal inputs. Addressing computational complexity, Pan et al. [16] developed a lightweight CNN (LW-CNN) for real-time intelligent diagnosis, optimizing network architecture to reduce computational costs and employing sliding window-based data augmentation to enhance generalization. Zhong et al. [17] further combined transfer learning and self-attention mechanisms, proposing a lightweight CNN (SLCNN) where vibration signals are transformed into time–frequency images using continuous wavelet transform (CWT), a self-attention module (SAM) is embedded into an optimized SqueezeNet architecture, and parameters are transferred from ImageNet pre-trained models, achieving high classification accuracy even with limited training data. These studies demonstrate that CNNs and their variants exhibit strong feature extraction and classification capabilities, particularly suited for time–frequency image-based fault detection.
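Sliding window-based augmentation, mentioned above, can be sketched generically as follows. This is not the specific procedure of any cited work; the function name `sliding_windows` and the window and stride values are hypothetical. Overlapping windows turn one long recording into many training segments.

```python
def sliding_windows(signal, window, stride):
    """Cut a 1-D signal into overlapping fixed-length segments."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    last_start = len(signal) - window
    return [signal[i:i + window] for i in range(0, last_start + 1, stride)]

# Toy stand-in for a vibration record: ten samples, windows of four, stride two.
signal = list(range(10))
segments = sliding_windows(signal, window=4, stride=2)
for seg in segments:
    print(seg)
```

A stride smaller than the window length yields overlapping segments and therefore more samples per recording, at the cost of stronger correlation between neighboring samples.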
For vibration signals and other time-series data, recurrent neural networks (RNNs) and their variants, such as long short-term memory networks (LSTM), are valuable for capturing long-term dependencies, proving effective in mechanical fault prediction and diagnosis. Yang et al. [18] proposed an LSTM-based fault diagnosis approach by first converting sensor data from rotating machinery to the frequency domain, applying sparse representation and random projection for dimensionality reduction and subsequently modeling temporal relationships using LSTM for accurate classification. Sabir et al. [19] utilized LSTM to directly model stator current signals, effectively capturing sequential features and enhancing bearing fault diagnosis accuracy. Zhang et al. [6] introduced a method combining gated recurrent units (GRUs) by converting time-series vibration signals into two-dimensional images, using GRUs to learn key temporal information and employing multilayer perceptrons (MLPs) for classification, effectively integrating temporal and spatial features. Additionally, Pan et al. [20] proposed a hybrid 1D-CNN and LSTM model, where CNN extracts local features and LSTM models long-term dependencies for accurate bearing fault classification. Khorram et al. [21] developed an end-to-end CNN-LSTM model that directly processes accelerometer vibration signals without the need for explicit feature extraction, where CNN learns local features and LSTM captures long-range dependencies, thus avoiding reliance on traditional preprocessing. Overall, LSTM and its variants exhibit strong modeling capabilities for time-series data, mitigating gradient vanishing issues typical in traditional RNNs and achieving outstanding performance in mechanical fault prediction and diagnosis.
Although RNNs and LSTMs have succeeded in sequence modeling, their inherently sequential computation limits training speed and parallelism. Transformers, leveraging global self-attention mechanisms and efficient parallel computation, have emerged as promising tools for time-series analysis and have been rapidly applied in mechanical fault diagnosis. Jin et al. [22] proposed a time-series Transformer (TST)-based method for rotating machinery fault diagnosis, introducing a sequence tokenizer to segment one-dimensional vibration signals into subsequences, transforming them into high-dimensional representations suitable for Transformer-based feature learning. The multi-head self-attention mechanism enhances the model’s ability to capture global patterns, improving classification accuracy. Li et al. [23] introduced a variational attention-based Transformer network (VATN), incorporating variational inference into the standard Transformer encoder and optimizing attention weights via Dirichlet distributions, thereby enhancing the interpretability of causal relationships between signal patterns and fault types. Wu et al. [24] proposed a Transformer-based classification model for manufacturing rotational systems, capable of identifying known fault types and detecting novel faults, thus improving system adaptability. Furthermore, Pei et al. [25] proposed a hybrid Transformer-CNN (TCN) model, where Transformers capture long-range dependencies and CNNs extract local features, with transfer learning employed to enhance generalization. Experimental results showed that TCN outperforms traditional LSTM and CNN models in classification accuracy across multiple vibration datasets, while also achieving higher computational efficiency.
Few-shot learning approaches for bearing fault classification have shown significant promise in addressing the challenge of data scarcity. For instance, Han et al. [26] introduced domain-adversarial networks into a meta-learning framework, enabling effective feature transfer from source to target domains by generating meta-knowledge. Chen et al. [27] adopted a model-agnostic meta-learning strategy to enhance generalization across varying operating conditions, refining parameters through gradient-based adaptation on novel tasks.
In addition to these works, some recent methods have attempted to integrate domain adaptation and meta-learning strategies for fault diagnosis, such as ADMTL and PMML.
ADMTL [9] introduces an attention-based meta-transfer learning framework, aiming to reuse knowledge from previously learned fault categories for quick adaptation to new ones under domain shifts. PMML [10], on the other hand, adopts a prototype-based meta-learning approach, which learns task-invariant knowledge from source domains and performs feature matching in target domains. While both methods provide valuable insights into few-shot domain adaptation, they primarily focus on global feature alignment and do not explicitly consider subdomain-level discrepancies that may exist within the same domain due to varying operational conditions. In contrast, our proposed DBDA-FD introduces a subdomain discriminator to capture fine-grained domain differences and a dynamic balancing factor to adaptively weight global and subdomain alignment objectives. This dual-level alignment framework promotes symmetric structure not only globally but also within subdomains, enhancing the model’s ability to generalize under varying operational conditions.
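Prototype-based matching, in its generic form, can be sketched as below. This is a minimal nearest-prototype classifier and is not claimed to reproduce PMML; the two-dimensional embeddings, the class labels, and the function names `prototypes` and `classify` are hypothetical. Each class is summarized by the mean of its support embeddings, and a query is assigned to the closest mean.

```python
def prototypes(support):
    """Mean embedding per class, given {label: [embedding vectors]}."""
    protos = {}
    for label, vecs in support.items():
        dim = len(vecs[0])
        protos[label] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return protos

def classify(query, protos):
    """Assign the query embedding to the class with the nearest prototype."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(protos, key=lambda label: sq_dist(query, protos[label]))

# Hypothetical 2-D embeddings for a two-class support set.
support = {
    "normal": [[0.0, 0.0], [0.2, 0.1]],
    "inner_race": [[1.0, 1.0], [0.9, 1.2]],
}
protos = prototypes(support)
print(classify([0.1, 0.0], protos))
print(classify([1.0, 0.9], protos))
```

Because each prototype is an average over only a few support samples, a shift between source and target embeddings moves queries relative to the prototypes, which is one reason feature alignment matters in this setting.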
Recently, Yang et al. [28] proposed an enhanced diagnosis framework that combines Relief-F-based feature selection with an optimized random forest algorithm. By identifying the most discriminative features from vibration signals using Relief-F, the model reduces computational complexity without sacrificing performance. The optimized random forest further addresses class imbalance by dynamically adjusting tree-splitting criteria and sample weighting, leading to improved robustness in imbalanced datasets. In parallel, Zhang et al. [29] developed a hybrid dilated convolutional network (HDCN) that leverages multi-scale dilated convolutions to capture both local and global features. A novel class-aware attention (CAA) mechanism is introduced to adaptively reweight feature maps based on class rarity, ensuring adequate representation of minority fault types. Experiments demonstrate that HDCN outperforms conventional CNN and LSTM models in imbalanced scenarios, particularly in identifying rare and underrepresented fault conditions. Building on these advances, recent studies have also explored more scalable and generalizable paradigms in mechanical fault diagnosis. For instance, Mehta et al. [30] proposed a federated transfer learning framework to achieve cross-factory fault generalization without sharing raw data, while Vijayalakshmi et al. [31] designed a decentralized federated learning scheme to preserve data privacy and mitigate distribution discrepancies. In the context of long-range dependency modeling, Tang et al. [32] introduced Signal-Transformer, which captures spectral-temporal patterns under variable conditions using attention mechanisms. Moreover, Wang et al. [33] developed a Transformer-based few-shot learning model tailored for noisy labels and limited samples, achieving high accuracy under changing working conditions. These studies highlight the growing demand for fault diagnosis frameworks that support privacy, domain robustness, and data efficiency, further motivating our dynamic dual-alignment strategy within the DBDA-FD framework.
In summary, these recent methods—spanning CNNs, RNNs, Transformers, and few-shot learning frameworks—demonstrate a clear trend toward improving diagnostic performance under practical constraints such as limited or imbalanced data. However, achieving robust domain adaptation remains a persistent challenge.
3. Problem Definition
To address the challenges of few-shot fault diagnosis in rotating machinery, this study categorizes datasets into three types: domain-specific training data , domain-adaptive validation data , and test data . We define a dataset as , where all three subsets are disjoint and the dataset contains B classes of data, each class consisting of C + E instances, with C labeled instances and E unlabeled instances. In this work, a “task” is defined as a specific scenario, denoted as , where represents the support set containing labeled samples, represents the label corresponding to the fault situation, and denotes the query set with labeled samples. indicates ground-truth labels used only for testing or validation purposes. The term refers to the predicted labels for query examples from a source domain task, which excludes any samples in the corresponding support or query sets. Our proposed method treats the query set of a target domain as the ground truth, leverages the remaining samples as the source domain, and finally evaluates model performance by averaging the prediction results across all target scenarios.
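The task construction described above follows the usual N-way K-shot pattern, which can be sketched as follows. The sampler is a generic illustration and not the authors' implementation; the function name `sample_episode`, the number of query samples per class, and the toy sample identifiers are assumptions.

```python
import random

def sample_episode(data_by_class, n_way, k_shot, n_query, rng):
    """Draw one N-way K-shot task with disjoint support and query sets."""
    classes = rng.sample(sorted(data_by_class), n_way)
    support, query = [], []
    for label in classes:
        picked = rng.sample(data_by_class[label], k_shot + n_query)
        support += [(x, label) for x in picked[:k_shot]]
        query += [(x, label) for x in picked[k_shot:]]
    return support, query

# Hypothetical pool: six fault classes with twenty sample identifiers each.
pool = {f"class_{c}": [f"c{c}_s{i}" for i in range(20)] for c in range(6)}
rng = random.Random(0)  # fixed seed so the episode is reproducible
support, query = sample_episode(pool, n_way=5, k_shot=5, n_query=3, rng=rng)
print(len(support), "support samples;", len(query), "query samples")
```

Drawing the support and query samples of a class from one shuffled selection guarantees that no sample appears in both sets of the same task.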
6. Conclusions and Future Work
When trained with limited samples, deep learning models often suffer from overfitting and poor generalization, especially in fault diagnosis tasks under domain shift. To address these challenges, we proposed a novel few-shot fault diagnosis framework, DBDA-FD, which integrates global and subdomain feature alignment within a meta-learning paradigm. A dynamic balancing factor is introduced to adaptively regulate the contribution of global and subdomain alignment objectives, enabling the model to generate more transferable features across varying working conditions. Comprehensive experiments conducted on the CWRU and PU bearing fault datasets confirm the superior performance of DBDA-FD. Specifically, the model achieved classification accuracies of 97.6% and 97.3% in five-way five-shot and three-way five-shot tasks, respectively, outperforming representative state-of-the-art methods such as PMML and ADMTL by up to 0.7%. These results demonstrate the robustness and effectiveness of our approach in handling cross-domain scenarios with limited annotated data. In practical applications, DBDA-FD holds strong potential for real-world deployment in intelligent maintenance systems, where labeled fault data are scarce, and domain conditions are highly variable. However, its performance may degrade when the discrepancy between source and target domains is excessively large, limiting the effectiveness of feature transfer. To further improve the scalability and adaptability of the framework, future research will focus on developing quantitative transferability scoring metrics to evaluate domain similarity prior to adaptation. This will guide the dynamic selection or weighting of source domains, improving alignment efficiency and reducing negative transfer. 
Additionally, we plan to incorporate self-supervised pretraining techniques to reduce the reliance on labeled source data, so that the model learns reliable, transferable features. This should improve its performance and stability in few-shot mechanical fault detection under unknown operating conditions and better meet the needs of real-world applications.