Bearing Fault Diagnosis Grounded in the Multi-Modal Fusion and Attention Mechanism
Abstract
1. Introduction
2. Methods
2.1. Data Processing
2.2. FAN-BD
3. Data
4. Experimental Design and Results
4.1. Experimental Environment
4.2. Dataset Classification
4.3. Experimental Results
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Detailed Architecture Description

Layer Name | Output Tensor Shape | Layer Description |
---|---|---|
Conv1 | [2, 64, 120, 100] | Convolutional layer: input channels: 1; output channels: 64; kernel size: 3 × 3; padding = 1. Extracts low-level features. |
ReLU1 | [2, 64, 120, 100] | Activation function: applies ReLU activation to Conv1 output. |
Conv2 | [2, 128, 120, 100] | Convolutional layer: input channels: 64; output channels: 128; kernel size: 3 × 3; padding = 1. Extracts higher-level features. |
ReLU2 | [2, 128, 120, 100] | Activation function: applies ReLU activation to Conv2 output. |
Padding | [2, 128, 120, 100] | Padding operation: ensures image dimensions are divisible by patch_size = 8. |
Unfold (Patch Division) | [2, 15, 12, 128, 8, 8] | Divides image into patches of size 8 × 8, resulting in (batch_size, num_patches_h, num_patches_w, channels, patch_size, patch_size). |
Patch Embedding | [2, 180, 256] | Linear projection layer, flattens each patch and maps to embed_dim = 256 dimension. Results in [batch_size, num_patches, embed_dim]. |
Position Embedding | [1, 180, 256] | Dynamically generates position encoding and adds to patch_embeddings. |
Transformer Encoder | [2, 180, 256] | Transformer encoder section with 4 TransformerEncoderLayers, with each layer containing num_heads = 4. Performs encoding on input patches. |
CBMA Attention | [2, 180, 256] | Multi-head self-attention mechanism (CBMA): input is patch embeddings after Transformer encoding; dimensions remain unchanged. |
MLP | [2, 180, 256] | Multi-layer perceptron, including GELU activation. |
Mean Pooling | [2, 256] | Pooling operation: computes mean along num_patches dimension, resulting in [batch_size, embed_dim]. |
Fully Connected | [2, 4] | Classification layer: outputs num_classes = 4 categories, representing the classification results. |
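
The 15 × 12 patch grid and the 180 patch count listed in the table can be checked with simple shape arithmetic. The sketch below is not the authors' code; `patch_grid` and its `pad` flag are hypothetical names introduced for illustration. It assumes non-overlapping 8 × 8 patches on a 120 × 100 feature map, as in the table, and compares two conventions: dropping remainder pixels versus padding up to the next multiple of the patch size.

```python
import math


def patch_grid(height, width, patch_size, pad=False):
    """Return (rows, cols, total) for non-overlapping square patches.

    pad=False drops remainder pixels (stride equal to patch size, no padding).
    pad=True rounds each side up to the next multiple of patch_size first.
    """
    if pad:
        rows = math.ceil(height / patch_size)
        cols = math.ceil(width / patch_size)
    else:
        rows = height // patch_size
        cols = width // patch_size
    return rows, cols, rows * cols


# Feature-map size and patch size taken from the table above.
print(patch_grid(120, 100, 8))            # (15, 12, 180)
print(patch_grid(120, 100, 8, pad=True))  # (15, 13, 195)
```

Under these assumptions, the reported 180 patches match the 15 × 12 grid obtained without widening the 100-pixel side; padding it to 104 would instead give 13 columns and 195 patches. The table does not state how the remaining 4 columns are handled.
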
Data Type | Collection Location/Sensors | Collection Equipment | Sampling Frequency | Unit | Included Columns |
---|---|---|---|---|---|
Vibration Data | Two bearing housings (A and B) in x and y directions, using 4 accelerometers (PCB352C34) | Siemens SCADAS Mobile 5PM50 | 25.6 kHz | g (standard gravity) | Timestamp, x-direction (Bearing A), y-direction (Bearing A), x-direction (Bearing B), y-direction (Bearing B) |
Current Data | Current sensors (3 CT sensors, Hioki CT6700) | NI9775 | 100 kHz | Ampere (A) | Timestamp, R-phase current, S-phase current, T-phase current |
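
Because the two modalities are sampled at different rates, a shared time window contains a different number of samples for each. The sketch below illustrates that arithmetic using the sampling frequencies from the table; the 100 ms window length and the helper name `samples_in_window` are assumptions made for illustration, not values taken from the paper.

```python
from fractions import Fraction

VIBRATION_HZ = 25_600   # 25.6 kHz, from the table above
CURRENT_HZ = 100_000    # 100 kHz, from the table above


def samples_in_window(rate_hz, window_ms):
    """Number of whole samples recorded in a window of window_ms milliseconds."""
    return rate_hz * window_ms // 1000


window_ms = 100  # hypothetical window length, chosen only for this example

print(samples_in_window(VIBRATION_HZ, window_ms))  # 2560
print(samples_in_window(CURRENT_HZ, window_ms))    # 10000

# Exact ratio of current samples to vibration samples per unit time.
rate_ratio = Fraction(CURRENT_HZ, VIBRATION_HZ)
print(rate_ratio)  # 125/32
```

The ratio 125/32 is not an integer, so aligning the two streams sample-by-sample would need resampling or window-level alignment; which approach the authors use is not stated in this section.
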
Evaluation Indicators | Predicted Positive | Predicted Negative |
---|---|---|
Actual Positive | True Positive (TP) | False Negative (FN) |
Actual Negative | False Positive (FP) | True Negative (TN) |
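
The confusion-matrix entries above lead directly to the standard definitions of precision, recall, and F1-score. The following is a minimal sketch of those definitions; the function name `precision_recall_f1` and the example counts are hypothetical and are not results from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts.

    precision = TP / (TP + FP)
    recall    = TP / (TP + FN)
    F1        = harmonic mean of precision and recall
    A zero denominator yields 0.0 instead of raising an error.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for one class, for illustration only.
p, r, f1 = precision_recall_f1(tp=95, fp=5, fn=3)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

For a four-class problem such as this one, the counts are typically computed per class in a one-vs-rest fashion, treating the class of interest as "positive" and the other three as "negative".
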
Metric | Algorithm | Normal | Inner Fault | Outer Fault | Ball Fault |
---|---|---|---|---|---|
Precision | FAN-BD | 97.0% | 97.5% | 97.5% | 97.0% |
Precision | CBMA-VIT | 91.3% | 92.2% | 95.0% | 91.9% |
Precision | ViT | 93.0% | 90.4% | 91.0% | 89.5% |
Precision | Swin Transformer | 93.9% | 93.5% | 93.9% | 92.1% |
Precision | ConvNeXt | 92.0% | 90.5% | 91.1% | 91.5% |
Recall | FAN-BD | 97.5% | 97.5% | 97.5% | 98.0% |
Recall | CBMA-VIT | 90.0% | 94.5% | 95.0% | 90.6% |
Recall | ViT | 93.0% | 89.5% | 91.0% | 89.5% |
Recall | Swin Transformer | 93.0% | 93.5% | 93.0% | 93.0% |
Recall | ConvNeXt | 92.0% | 90.5% | 92.9% | 91.5% |
F1-score | FAN-BD | 97.0% | 97.5% | 97.5% | 97.5% |
F1-score | CBMA-VIT | 90.6% | 93.3% | 95.0% | 91.2% |
F1-score | ViT | 93.0% | 89.9% | 91.0% | 89.5% |
F1-score | Swin Transformer | 93.4% | 93.5% | 93.4% | 92.5% |
F1-score | ConvNeXt | 92.0% | 90.5% | 92.0% | 91.5% |
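
One way to compare the models at a glance is to average the per-class F1-scores from the table. The sketch below computes a simple unweighted (macro) mean over the four classes; these averages are derived here for illustration and are not figures reported by the authors. The names `f1_by_model` and `macro_average` are hypothetical.

```python
# Per-class F1-scores (%) copied from the table above, in the order:
# Normal, Inner Fault, Outer Fault, Ball Fault.
f1_by_model = {
    "FAN-BD": [97.0, 97.5, 97.5, 97.5],
    "CBMA-VIT": [90.6, 93.3, 95.0, 91.2],
    "ViT": [93.0, 89.9, 91.0, 89.5],
    "Swin Transformer": [93.4, 93.5, 93.4, 92.5],
    "ConvNeXt": [92.0, 90.5, 92.0, 91.5],
}


def macro_average(scores):
    """Unweighted mean of per-class scores."""
    return sum(scores) / len(scores)


# Rank models from highest to lowest macro-averaged F1.
ranking = sorted(
    ((name, macro_average(scores)) for name, scores in f1_by_model.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, score in ranking:
    print(f"{name}: {score:.1f}%")
```

An unweighted mean treats all four classes equally, which is reasonable only if the classes are roughly balanced in the test set; the class distribution is not given in this section.
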
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yang, J.; Han, H.; Dong, X.; Wang, G.; Zhang, S. Bearing Fault Diagnosis Grounded in the Multi-Modal Fusion and Attention Mechanism. Appl. Sci. 2025, 15, 1531. https://doi.org/10.3390/app15031531