A Synergistic Fault Diagnosis Method for Rolling Bearings: Variational Mode Decomposition Coupled with Deep Learning
Abstract
1. Introduction
- Limitations in long-range temporal dependency modeling: Methods such as the IWOA-VMD-KELM model [11] use the improved whale optimization algorithm (IWOA) to optimize variational mode decomposition (VMD) parameters and rely on a kernel extreme learning machine (KELM) for classification. Although VMD enhances feature extraction, KELM is a traditional machine learning classifier that cannot capture long-range temporal dependencies in sequential vibration data, a critical shortcoming when dealing with dynamic fault evolution processes (e.g., the gradual expansion of bearing cracks).
- Insufficiency in global feature correlation: The WI-CNN model [12], which integrates adaptive stochastic resonance denoising and the Gramian angular field-based convolutional neural network (CNN), strengthens weak fault features through preprocessing and spatial transformation. However, it relies solely on the CNN for local spatial feature extraction and fails to model the global correlations between decomposed modal components. This oversight often leads to incomplete feature representation when faults exhibit multi-scale propagation characteristics.
- Suboptimal decomposition of strongly non-stationary signals: For small-sample scenarios, the ALA-FMD-MSCA-RN framework [13] uses the artificial lemming algorithm (ALA) to optimize feature mode decomposition (FMD) parameters. Nevertheless, FMD lacks the rigorous mathematical foundation of VMD for modal separation, resulting in suboptimal decomposition effects when processing strongly non-stationary bearing vibration signals.
- Difficulty in handling high-dimensional features: The SK-LS-SVM method [14], which combines spectral kurtosis (SK) and the least-squares support vector machine (LS-SVM), can effectively identify fault-related frequency bands. However, the LS-SVM struggles to handle high-dimensional features extracted from complex vibration signals, leading to a reduction in the accuracy of multi-fault classification tasks (e.g., simultaneously distinguishing inner ring, outer ring, and rolling element faults).
2. Materials and Methods
2.1. Data Acquisition and Preprocessing
2.1.1. Data Source
2.1.2. Data Preprocessing
2.2. Basic Method
2.2.1. Variational Mode Decomposition
2.2.2. Convolutional Neural Network
- Convolutional Layer
- Pooling Layer
- Fully Connected Layer
2.2.3. Multi-Head Self-Attention Mechanism
- Scaled Dot-Product Attention
- Multi-Head Attention Calculation
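For reference, the two calculations named above follow the standard formulation of Vaswani et al. [24]: attention weights are softmax(QKᵀ/√d_k), and multi-head attention runs this on several feature subspaces and concatenates the results. The sketch below is a minimal pure-Python illustration, not the model's actual layer. The function names are hypothetical, and the learned projection matrices (W^Q, W^K, W^V, W^O) are deliberately omitted, so each head simply attends over a slice of the input features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for matrices given as lists of rows."""
    d_k = len(K[0])
    output = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted sum of the value rows.
        output.append([sum(w * v[j] for w, v in zip(weights, V))
                       for j in range(len(V[0]))])
    return output


def multi_head_self_attention(X, num_heads):
    """Split features into heads, attend per head, concatenate the heads.

    Simplification (assumption): learned projections W^Q, W^K, W^V, W^O are omitted.
    """
    d_model = len(X[0])
    if d_model % num_heads:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads
    heads = []
    for h in range(num_heads):
        part = [row[h * d_head:(h + 1) * d_head] for row in X]
        heads.append(scaled_dot_product_attention(part, part, part))
    # Concatenate the per-head outputs position by position.
    return [[x for head in heads for x in head[i]] for i in range(len(X))]


# Toy sequence: 3 positions, 4 features, 2 heads (values are illustrative only).
X = [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.5, 0.5], [1.0, 1.0, 0.0, 1.0]]
Y = multi_head_self_attention(X, num_heads=2)
print(len(Y), len(Y[0]))
```

The output keeps the input shape (sequence length by feature dimension), which is what allows attention blocks to be stacked.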
2.2.4. Feature Fusion Module
- Cross-Component Attention Weight Calculation
- Multi-Component Feature Weighted Fusion
2.3. Rolling Bearing Fault Diagnosis Model Based on the VMD-CNN-Transformer Model
- (1) Data Preprocessing: The acquired data are denoised and normalized to reduce noise interference and differences in scale. Variational mode decomposition is then applied to the preprocessed signals.
- (2) Dataset Partitioning: The original signal is segmented into individual samples with a sliding window of 1024 points and a step of 512 points, i.e., an overlap ratio of 0.5. The resulting samples are then split into training, validation, and test sets at a ratio of 7:1:2 to provide sufficient data for model training and evaluation.
- (3) Construction of the VMD-CNN-Transformer Model: A CNN captures the local features of each modal time series, and a Transformer captures global dependencies. The features of the different modal signals are then fused, and a fully connected layer performs fault classification.
- (4) Model Training: The parameters of the VMD-CNN-Transformer model are randomly initialized, and the Adam optimizer and cross-entropy loss function are selected [25]. Training data are fed in batches: outputs are obtained through forward propagation, the loss is computed, gradients are obtained via backpropagation, and the optimizer updates the parameters. This cycle is repeated for a preset 50 epochs, with loss and accuracy recorded at each epoch to monitor training progress [26].
- (5) Model Evaluation: Metrics such as accuracy and F1-score are calculated to assess classification performance across the fault categories. The cross-entropy loss curve is plotted to check convergence, and a confusion matrix is generated to show the classification results for each category.
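The dataset partitioning in step (2) can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names, the random seed, and the synthetic signal are assumptions made for the example, and the source does not state whether samples are shuffled before the 7:1:2 split.

```python
import random


def sliding_windows(signal, window=1024, step=512):
    """Cut a 1-D signal into overlapping windows; a trailing partial window is dropped."""
    return [signal[i:i + window]
            for i in range(0, len(signal) - window + 1, step)]


def split_7_1_2(samples, seed=0):
    """Shuffle the samples and split them into train/validation/test at 7:1:2."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)  # seed is an arbitrary choice for reproducibility
    n_train = len(idx) * 7 // 10
    n_val = len(idx) // 10
    train = [samples[i] for i in idx[:n_train]]
    val = [samples[i] for i in idx[n_train:n_train + n_val]]
    test = [samples[i] for i in idx[n_train + n_val:]]
    return train, val, test


# Synthetic stand-in for one vibration record (not CWRU data).
signal = [float(i) for i in range(1024 + 512 * 99)]
samples = sliding_windows(signal)
train, val, test = split_7_1_2(samples)
print(len(samples), len(train), len(val), len(test))
```

With a step of 512 and a window of 1024, consecutive samples share half of their points (overlap ratio 1 − 512/1024 = 0.5). Because neighboring windows overlap, a random split places partially shared signal segments in different subsets; whether the original experiments account for this is not stated in the text.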
3. Results
3.1. Experimental Results
3.1.1. Experimental Setup
3.1.2. Model Evaluation Metrics
3.1.3. Algorithm Comparison
- IWOA-VMD-KELM: With the improved whale optimization algorithm as the core, it optimizes the number of modal components and penalty coefficient of variational mode decomposition, as well as the regularization coefficient and kernel parameter of the kernel extreme learning machine; VMD decomposes vibration signals to obtain optimal modal components and construct feature vectors, while KELM completes fault classification.
- ALA-FMD-MSCA-RN: For few-shot scenarios, the artificial lemming algorithm is utilized to optimize the parameters of feature mode decomposition, including the number of modes and filter length. Subsequently, the modal components with the minimum Residual Energy Index (REI) are selected and converted into time–frequency maps. The Multi-Scale Coordinate Attention (MSCA) mechanism is employed to enhance key features and combined with a Relation Network (RN) to calculate the similarity of samples, thereby achieving fault classification.
- SK-LS-SVM: Spectral kurtosis adaptively determines the optimal center frequency and bandwidth of vibration signals, and a band-pass filter is constructed to reduce noise. Subsequently, the Hilbert transform is applied to extract the envelope spectrum in order to acquire fault features. Feature vectors are constructed using the signal kurtosis and the amplitude ratio of the characteristic frequencies of the inner and outer rings, which are then input into the LS-SVM to achieve fault classification.
3.1.4. Ablation Experiment
3.1.5. Evaluation of Generalization Ability
3.2. Analysis of Experimental Results
4. Discussion
- (1) Generalizability: The model has so far been validated only on the publicly available CWRU dataset, and its applicability in real industrial scenarios has yet to be verified. Future work should assess its performance on additional public datasets from diverse sources, for example the bearing datasets from Jiangnan University (JNU), the University of Connecticut (CU), and the University of Ottawa (OU). Such cross-dataset validation would test the model’s diagnostic stability and adaptability under varying data distributions and operating conditions and provide stronger empirical support for industrial application.
- (2) Model Extension: The current research focuses on rolling bearing faults induced by mechanical loads. In practical industrial scenarios such as electric vehicles, however, electrical stress arising from circulating bearing currents is also a critical cause of bearing degradation and failure. Future studies will consider extending and validating the proposed method in such contexts, broadening its industrial utility beyond the existing mechanical-load-centric fault diagnosis framework [27,28].
- (3) Model Lightweighting: The computational complexity of the Transformer architecture remains high. Subsequent research could explore lightweight modifications, such as sparse attention mechanisms, to enhance compatibility with embedded systems.
- (4) Real-Time Performance: For online monitoring applications, the inference speed of the model requires further optimization to meet industrial standards for real-time fault diagnosis.
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Xu, F.N.; Ding, N.; Li, N.; Liu, L.; Hou, N.; Xu, N.; Guo, W.; Tian, L.; Xu, H.; Wu, C.-M.L.; et al. A Review of Bearing Failure Modes, Mechanisms and Causes. Eng. Fail. Anal. 2023, 152, 107518.
- Xin, J.; Chen, Y.M.; Wang, L.; Hua, H.L.; Cheng, P. Failure Prediction, Monitoring and Diagnosis Methods for Slewing Bearings of Large-Scale Wind Turbine: A Review. Measurement 2021, 172, 108855.
- He, F.; Xie, G.; Luo, J. Electrical Bearing Failures in Electric Vehicles. Friction 2020, 8, 4–28.
- Zhang, C.; Wei, S.; Dong, G.; Zeng, Y.; Zhu, G.; Zhou, X.; Liu, F. Time-Domain Sparsity Based Bearing Fault Diagnosis Methods Using Pulse Signal-to-Noise Ratio. IEEE Trans. Instrum. Meas. 2024, 73, 3516804.
- Wang, B.; Yang, Q.F. Research on Fault Diagnosis of Industrial Robot Rotating Components Based on Improved Multi-Scale Residual Network. Sci. Technol. Innov. 2024, 7, 217–220.
- Elouaham, S.; Dliou, A.; Nassiri, B.; Zougagh, H. Combination Method for Denoising EMG Signals Using EWT and EMD Techniques. In Proceedings of the 2023 IEEE International Conference on Advances in Data-Driven Analytics and Intelligent Systems (ADACIS), Marrakesh, Morocco, 23–25 November 2023; pp. 1–6.
- Zheng, L.C.; Liang, X.Y.; Yuan, G.N. Based on Ensemble Neural Network and Improved Extreme Learning Machine for Fault Detection of Mine Mobile Robots. Met. Mine 2024, 6, 159–164.
- Cai, J.; Yang, Z.; Liu, X.; Xiong, J.; Chen, H. A Review of Data-Driven Machinery Fault Diagnosis Using Machine Learning Algorithms. J. Vib. Eng. Technol. 2022, 10, 2481–2507.
- Zhong, R.; Hu, B.; Feng, Y.; Lou, S.; Hong, Z.; Wang, F.; Li, G.; Tan, J. Lithium-Ion Battery Remaining Useful Life Prediction: A Federated Learning-Based Approach. Energ. Ecol. Environ. 2024, 9, 549–562.
- Li, Y.; Jia, Z.; Liu, Z.; Shao, H.; Zhao, W.; Liu, Z.; Wang, B. Interpretable Intelligent Fault Diagnosis Strategy for Fixed-Wing UAV Elevator Fault Diagnosis Based on Improved Cross Entropy Loss. Meas. Sci. Technol. 2024, 35, 076110.
- Yuan, B.; Lu, L.; Chen, S. Research on Bearing Fault Diagnosis Based on Vibration Signals and Deep Learning Models. Electronics 2025, 14, 2090.
- Zhong, W.; Pang, B. Intelligent Diagnosis Method for Early Weak Faults Based on Wave Intercorrelation–Convolutional Neural Networks. Electronics 2025, 14, 2808.
- Wang, H.; Shui, F.; Xie, R.; Gu, J.; Li, C. Few-Shot Bearing Fault Diagnosis Based on ALA-FMD and MSCA-RN. Electronics 2025, 14, 2672.
- Lai, L.; Xu, W.; Song, Z. A Novel Fault Diagnosis Method for Rolling Bearings Based on Spectral Kurtosis and LS-SVM. Electronics 2025, 14, 2790.
- Neupane, D.; Seok, J. Bearing Fault Detection and Diagnosis Using Case Western Reserve University Dataset with Deep Learning Approaches: A Review. IEEE Access 2020, 8, 93155–93178.
- Case Western Reserve University (CWRU) Open-Access Rolling Bearing Dataset. Available online: https://engineering.case.edu/bearingdatacenter/download-data-file (accessed on 15 April 2024).
- Han, F.; Zhang, X.; Cao, J. A Novel Approach of Fault Diagnosis for Gearbox Based on VMD Optimized by GSWOA and Improved RCMSE. Eksploat. I Niezawodn.–Maint. Reliab. 2025, 28.
- Zhang, S.; Zhang, S.; Wang, B.; Habetler, T.G. Deep Learning Algorithms for Bearing Fault Diagnostics—A Comprehensive Review. IEEE Access 2020, 8, 29857–29881.
- You, K.S.; Wang, P.Z.; Huang, P.; Gu, Y.K. A Sound-Vibration Physical-Information Fusion Constraint-Guided Deep Learning Method for Rolling Bearing Fault Diagnosis. Meas. Sci. Technol. 2025, 36, 025103.
- Cao, L.; Sun, W. Research on Bearing Fault Identification of Wind Turbines’ Transmission System Based on Wavelet Packet Decomposition and Probabilistic Neural Network. Energies 2024, 17, 2581.
- Chen, L.; Zhang, X.; Li, Z.; Jiang, H. Research on a Wind Turbine Gearbox Fault Diagnosis Method Using Singular Value Decomposition and Graph Fourier Transform. Sensors 2024, 24, 3234.
- Yan, R.; Lin, J. Equipment Intelligent Operation and Maintenance; CRC Press: Boca Raton, FL, USA, 2025.
- Wang, C.; Tian, X.; Zhou, F.; Shao, X.; Liang, X.; Li, H. Fault Diagnosis of Electric Transmission System Based on Graph-Enhanced Deep Feature Fusion Network Model Using Efficient Decision Mapping. Meas. Sci. Technol. 2025, 36, 046123.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS’17), Long Beach, CA, USA, 4–9 December 2017; pp. 6000–6010.
- Jin, D.; He, C.; Zou, Q.; Qin, Y.; Wang, B. Source Code Vulnerability Detection Based on Joint Graph and Multimodal Feature-Fusion. Electronics 2025, 14, 975.
- Al-Karawi, A.; Abofanas, M.; Mohammedqasem, R.; Mohammedqasim, H. Revolutionizing the Classification of Medical Image Classification by Using Integrating Advanced Neural Networks with Pre-Processing. Procedia Comput. Sci. 2025, 258, 1326–1337.
- Tombul, Y.; Tillmann, P.; Andert, J. Simulation of the Circulating Bearing Currents for Different Stator Designs of Electric Traction Machines. Machines 2023, 11, 811.
- Jie, H.; Wang, C.; See, K.Y.; Li, H.; Zhao, Z. A Systematic EV Bearing Degradation Testing Approach Considering Circulating Bearing Currents. IEEE/ASME Trans. Mechatron. 2025.
Label | Status | Fault Size |
---|---|---|
0 | Normal | None |
1 | Inner ring fault | 0.007 inches |
2 | Inner ring fault | 0.014 inches |
3 | Inner ring fault | 0.021 inches |
4 | Outer ring fault | 0.007 inches |
5 | Outer ring fault | 0.014 inches |
6 | Outer ring fault | 0.021 inches |
7 | Rolling element fault | 0.007 inches |
8 | Rolling element fault | 0.014 inches |
9 | Rolling element fault | 0.021 inches |
Convolutional Layer No. | Conv1d Operation | Activation Function | MaxPool1d Operation |
---|---|---|---|
Layer 1 | Input channels: 4, output channels: 8, kernel size: 3, and padding: 1 | ReLU | Kernel size: 2, stride: 2 |
Layer 2 | Input channels: 8, output channels: 16, kernel size: 3, and padding: 1 | ReLU | Kernel size: 2, stride: 2 |
Layer 3 | Input channels: 16, output channels: 32, kernel size: 3, and padding: 1 | ReLU | Kernel size: 2, stride: 2 |
Layer 4 | Input channels: 32, output channels: 64, kernel size: 3, and padding: 1 | ReLU | Kernel size: 2, stride: 2 |
Layer 5 | Input channels: 64, output channels: 128, kernel size: 3, and padding: 1 | ReLU | Kernel size: 2, stride: 2 |
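As a rough consistency check on the convolutional stack tabulated above, the sketch below traces the feature-map shape and counts the trainable parameters of the five Conv1d layers. It assumes a convolution stride of 1 (the table gives none), an input length of 1024 taken from the window size in the dataset partitioning step, and bias terms in every layer. The classification head is hypothetical: a single linear layer from 128 features to the 10 fault classes, as it would follow global pooling over time.

```python
# (in_channels, out_channels, kernel_size, padding) for each Conv1d layer in the table.
CONV_LAYERS = [(4, 8, 3, 1), (8, 16, 3, 1), (16, 32, 3, 1),
               (32, 64, 3, 1), (64, 128, 3, 1)]
POOL_KERNEL, POOL_STRIDE = 2, 2


def conv1d_out_len(length, kernel, padding, stride=1):
    """Output length of a 1-D convolution (stride 1 is an assumption here)."""
    return (length + 2 * padding - kernel) // stride + 1


def pool1d_out_len(length, kernel, stride):
    """Output length of 1-D max pooling without padding."""
    return (length - kernel) // stride + 1


def trace(length=1024):
    """Return the (channels, length) after each layer and the total conv parameters."""
    shapes, total_params = [], 0
    for c_in, c_out, k, p in CONV_LAYERS:
        length = conv1d_out_len(length, k, p)
        length = pool1d_out_len(length, POOL_KERNEL, POOL_STRIDE)
        total_params += c_in * c_out * k + c_out  # weights plus biases
        shapes.append((c_out, length))
    return shapes, total_params


shapes, conv_params = trace()
head_params = 128 * 10 + 10  # hypothetical Linear(128, 10) classification head
print(shapes)
print(conv_params, conv_params + head_params)
```

Under these assumptions the convolutional stack has 32,984 parameters, and adding the hypothetical 1,290-parameter head gives 34,274, which coincides with the parameter count reported for the VMD-CNN model in the time and parameter table. The match is suggestive rather than a confirmation of the actual architecture.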
Recent Methods | Accuracy on CWRU Dataset | Difference from VMD-CNN-Transformer (99.48%) |
---|---|---|
IWOA-VMD-KELM [11] | 98.8% | −0.68% |
ALA-FMD-MSCA-RN [13] | 96.8% | −2.68% |
SK-LS-SVM [14] | 95% | −4.48% |
Model | Accuracy | Precision | Recall | F1-Score |
---|---|---|---|---|
CNN-Transformer | 0.9911 | 0.9914 | 0.9911 | 0.9910 |
VMD-CNN | 0.9792 | 0.9813 | 0.9792 | 0.9791 |
VMD-Transformer | 0.8854 | 0.8994 | 0.8854 | 0.8818 |
VMD-CNN-Transformer | 0.9948 | 0.9951 | 0.9948 | 0.9948 |
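The four metrics in the table above can all be derived from a confusion matrix. The sketch below uses support-weighted averaging across classes, which is an assumption: the averaging scheme is not stated here. It is at least consistent with the table, where Recall equals Accuracy in every row, since support-weighted recall is always identical to accuracy (macro-averaged recall would also match if the test classes were balanced). The confusion matrix is a toy example, not data from the experiments.

```python
def classification_metrics(cm):
    """Accuracy and support-weighted precision, recall, and F1.

    cm[i][j] is the number of samples with true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total
    precision = recall = f1 = 0.0
    for c in range(n):
        tp = cm[c][c]
        support = sum(cm[c])                      # samples whose true class is c
        predicted = sum(cm[r][c] for r in range(n))  # samples predicted as c
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = support / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1


# Toy 3-class confusion matrix (illustrative only).
toy_cm = [[9, 1, 0],
          [0, 10, 0],
          [0, 2, 8]]
acc, prec, rec, f1 = classification_metrics(toy_cm)
print(round(acc, 4), round(prec, 4), round(rec, 4), round(f1, 4))
```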
Model | Time/s | Parameters |
---|---|---|
CNN-Transformer | 6.32 | 296,682 |
VMD-CNN | 5.44 | 34,274 |
VMD-Transformer | 5.82 | 1,845,002 |
VMD-CNN-Transformer | 6.78 | 299,234 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, S.; Su, X.; Li, J.; Li, F.; Li, M.; Ren, Y.; Wang, G.; Shi, N.; Qian, H. A Synergistic Fault Diagnosis Method for Rolling Bearings: Variational Mode Decomposition Coupled with Deep Learning. Electronics 2025, 14, 3714. https://doi.org/10.3390/electronics14183714