Analysis of Gearbox Bearing Fault Diagnosis Method Based on 2D Image Transformation and 2D-RoPE Encoding
Abstract
1. Introduction
2. Materials and Methods
2.1. Materials
Dataset
- (1) MCC5 gearbox dataset
- (2) HUST gearbox dataset
2.2. Methods
2.2.1. Overview
2.2.2. Data Preprocessing
Algorithm 1 Applying the proposed RoPE-DWTrans model to classification
Input: a set of dataset samples S = {(X1, label1, type1), …, (Xn, labeln, typen)}, where X denotes the features of a GAF image, label is its class label (drawn from either 14 or 23 classes), and type is the image type (one of two possible types). S is divided into a training set (trainX, trainlabel, traintype), a validation set (valX, vallabel, valtype), and a testing set (testX, testlabel, testtype) in a ratio of 7.6:1.2:1.2. M denotes the number of training epochs, and num denotes the number of hidden layers.
Output: the optimal model and its classification statistics.
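The 7.6:1.2:1.2 partition named in the input of Algorithm 1 can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_dataset`, the fixed shuffle seed, and the placeholder samples are all assumptions introduced here.

```python
import random


def split_dataset(samples, ratios=(7.6, 1.2, 1.2), seed=0):
    """Shuffle the samples and split them into train/val/test by the given ratios."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # seeded shuffle so the split is reproducible
    total = sum(ratios)
    n_train = round(len(items) * ratios[0] / total)
    n_val = round(len(items) * ratios[1] / total)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


# Hypothetical samples: (feature placeholder, class label, image type).
samples = [(f"x{i}", i % 14, i % 2) for i in range(1000)]
train, val, test = split_dataset(samples)
print(len(train), len(val), len(test))
```

Whether the original split is stratified by class or by image type is not stated in the text, so the sketch uses a plain random shuffle.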
Algorithm 2 Data preprocessing
Input: raw vibration data files in CSV format, where N is the length of the signal in each CSV file.
Output: GAF image features in NPY format.
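The core of this preprocessing, reducing a signal with Piecewise Aggregate Approximation (PAA) and encoding it as GASF and GADF images, can be sketched in plain Python. The sketch assumes the standard GAF definitions (rescale to [-1, 1], take φ = arccos(x), then GASF = cos(φᵢ + φⱼ) and GADF = sin(φᵢ − φⱼ)); the function names and the synthetic sine signal are illustrative only, and reading the CSV files and writing NPY output are omitted.

```python
import math


def paa(signal, window):
    """Piecewise Aggregate Approximation: mean of consecutive non-overlapping windows."""
    n = len(signal) // window
    return [sum(signal[k * window:(k + 1) * window]) / window for k in range(n)]


def rescale(series):
    """Min-max rescale to [-1, 1] so that arccos is defined for every value."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0] * len(series)
    # Clamp to guard against floating-point overshoot outside [-1, 1].
    return [max(-1.0, min(1.0, 2.0 * (v - lo) / (hi - lo) - 1.0)) for v in series]


def gaf(series):
    """Return (GASF, GADF) matrices for a 1D series."""
    phi = [math.acos(v) for v in rescale(series)]
    gasf = [[math.cos(a + b) for b in phi] for a in phi]
    gadf = [[math.sin(a - b) for b in phi] for a in phi]
    return gasf, gadf


# Synthetic stand-in for one vibration channel (assumption, not real data).
signal = [math.sin(0.05 * t) for t in range(1200)]
reduced = paa(signal, window=150)  # window size 150, as in the parameter table
gasf, gadf = gaf(reduced)
print(len(reduced), len(gasf), len(gasf[0]))
```

With the settings listed in the parameter table, each channel would yield a 512 × 512 GASF and a 512 × 512 GADF image, which is consistent with 8 channels producing the stated (16, 512, 512) GAF tensor.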
2.2.3. Depthwise Feature Extractor
2.2.4. Self-Attention with 2D RoPE
- A. Preliminaries
- B. Two-dimensional Rotary Position Embedding
Algorithm 3 The 2D relative position encoding
Input: query vector qₙ and key vector kₘ ∈ ℝ^{1 × d_head}; positions (xₙ, yₙ) and (xₘ, yₘ).
Output: attention score with 2D relative position encoding.
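One common way to realize the inputs and output of Algorithm 3 is the axial form of 2D RoPE, in which the first half of the head dimension is rotated by the x coordinate and the second half by the y coordinate. The sketch below assumes that variant, a frequency base of 100, and scaling by √d_head; none of these choices is confirmed by the text above, and all function names are hypothetical.

```python
import math


def rotate_pairs(vec, pos, base=100.0):
    """1D RoPE: rotate each pair (vec[2t], vec[2t+1]) by the angle pos * theta_t."""
    d = len(vec)
    out = []
    for t in range(d // 2):
        theta = base ** (-2.0 * t / d)  # per-pair rotation frequency
        c, s = math.cos(pos * theta), math.sin(pos * theta)
        a, b = vec[2 * t], vec[2 * t + 1]
        out += [a * c - b * s, a * s + b * c]
    return out


def rope_2d(vec, x, y):
    """Axial 2D RoPE: first half encodes x, second half encodes y."""
    half = len(vec) // 2
    return rotate_pairs(vec[:half], x) + rotate_pairs(vec[half:], y)


def attention_score(q, k, pos_q, pos_k):
    """Scaled dot product of the rotated query and key (d_head divisible by 4)."""
    rq = rope_2d(q, *pos_q)
    rk = rope_2d(k, *pos_k)
    return sum(a * b for a, b in zip(rq, rk)) / math.sqrt(len(q))


q = [0.1 * i for i in range(1, 9)]
k = [0.2 * (9 - i) for i in range(1, 9)]
s1 = attention_score(q, k, (3, 5), (1, 2))
s2 = attention_score(q, k, (13, 25), (11, 22))  # same relative offset (2, 3)
print(abs(s1 - s2) < 1e-9)
```

Because each pair is rotated by an angle proportional to its position, the dot product depends only on the offsets (xₙ − xₘ, yₙ − yₘ), which is the relative-position property the algorithm's output refers to.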
- C. Residual with Zero initialization
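A zero-initialized residual connection in the ReZero style (Bachlechner et al., listed in the references) computes x + α · F(x), with the scalar α starting at 0 so that every block begins as the identity map. The sketch below illustrates only that general idea on plain Python lists; the class name is hypothetical, and whether the model uses exactly this formulation is an assumption.

```python
class ReZeroResidual:
    """Residual wrapper computing x + alpha * sublayer(x), with alpha initialized to zero."""

    def __init__(self, sublayer):
        self.sublayer = sublayer
        self.alpha = 0.0  # would be a trainable scalar in a real framework

    def __call__(self, x):
        fx = self.sublayer(x)
        return [xi + self.alpha * fi for xi, fi in zip(x, fx)]


# Toy sublayer standing in for attention or a feed-forward network.
layer = ReZeroResidual(lambda x: [2.0 * v for v in x])
at_init = layer([1.0, 2.0])   # identity while alpha is 0
layer.alpha = 0.5             # pretend training has moved alpha away from 0
after = layer([1.0, 2.0])
print(at_init, after)
```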
3. Results
3.1. Training Environment and Parameter Settings
3.2. Evaluation Criteria
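The result tables report accuracy, precision, recall, and F1 score. A minimal sketch of these four metrics for a multi-class problem is given below; macro-averaging over classes is an assumption made here, since the averaging scheme is not stated in this section, and the labels in the example are invented.

```python
def macro_metrics(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1)."""
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(classes)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Invented labels for three classes, for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
acc, prec, rec, f1 = macro_metrics(y_true, y_pred)
print(round(acc, 3), round(prec, 3), round(rec, 3), round(f1, 3))
```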
3.3. Performance
3.3.1. Classification Performance
3.3.2. Feature Extraction Performance Evaluation Under Multiple Time Windows
3.3.3. Classification of Gearbox Fault Types and Severity Levels
3.3.4. Comparison of Different Modules
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
CNN | Convolutional neural network |
RNN | Recurrent neural network |
SE | Squeeze and excitation |
FFT | Fast Fourier Transform |
GAF | Gramian Angular Field |
LSTM | Long Short-Term Memory |
ViT | Vision Transformer |
DWFE | Depthwise feature extractor |
2D-RoPE | Two-dimensional rotary position encoding |
PAA | Piecewise Aggregate Approximation |
GASF | Gramian Angular Summation Field |
GADF | Gramian Angular Difference Field |
References
1. Mikić, D.; Desnica, E.; Kiss, I.; Mikić, V. Reliability analysis of rolling ball bearings considering the bearing radial clearance and operating temperature. Adv. Eng. Lett. 2022, 1, 16–22.
2. Vasic, M.; Stojanovic, B.; Blagojevic, M. Fault analysis of gearboxes in open pit mine. Appl. Eng. Lett. 2020, 5, 50–61.
3. Molęda, M.; Małysiak, M.; Sunderam, V.; Ding, W.; Mrozek, D. From corrective to predictive maintenance—A review of maintenance approaches for the power industry. Sensors 2023, 23, 5970.
4. Shao, Z.; Zhang, T.; Kosasih, B. Compound Faults Diagnosis in Wind Turbine Gearbox Based on Deep Learning Methods: A Review. In Proceedings of the 2024 Global Reliability and Prognostics and Health Management Conference (PHM-Beijing), Beijing, China, 11–13 October 2024.
5. Seo, M.; Yun, W. Gearbox Condition Monitoring and Diagnosis of Unlabeled Vibration Signals Using a Supervised Learning Classifier. Machines 2024, 12, 127.
6. Mohad, F.; Gomes, L.; Tortorella, G.; Lermen, F.H. Operational excellence in total productive maintenance: Statistical reliability as support for planned maintenance pillar. Int. J. Qual. Reliab. Manag. 2025, 42, 1274–1296.
7. Khalil, A.; Rostam, S. Machine learning-based predictive maintenance for fault detection in rotating machinery: A case study. Eng. Technol. Appl. Sci. Res. 2024, 14, 13181–13189.
8. Chukwunweike, J.; Anang, A.; Dike, J.; Adeniran, A.A. Enhancing manufacturing efficiency and quality through automation and deep learning: Addressing redundancy, defects, vibration analysis, and material strength optimization. World J. Adv. Res. Rev. 2024, 23, 1272–1295.
9. Li, X.; Wang, Y.; Yao, J.; Li, M.; Gao, Z. Multi-sensor fusion fault diagnosis method of wind turbine bearing based on adaptive convergent viewable neural networks. Reliab. Eng. Syst. Saf. 2024, 245, 109980.
10. Mian, Z.; Deng, X.; Dong, X.; Tian, Y.; Cao, T.; Chen, K.; Al Jaber, T. A literature review of fault diagnosis based on ensemble learning. Eng. Appl. Artif. Intell. 2024, 127, 107357.
11. Xu, L.; Teoh, S.; Ibrahim, H. A deep learning approach for electric motor fault diagnosis based on modified InceptionV3. Sci. Rep. 2024, 14, 12344.
12. Szegedy, C.; Vanhoucke, V.; Ioffe, S. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
13. Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
14. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
15. Vo, T.; Liu, M.; Tran, M. Harnessing attention mechanisms in a comprehensive deep learning approach for induction motor fault diagnosis using raw electrical signals. Eng. Appl. Artif. Intell. 2024, 129, 107643.
16. O’Shea, K.; Nash, R. An introduction to convolutional neural networks. arXiv 2015, arXiv:1511.08458.
17. Medsker, L.; Jain, L. Recurrent neural networks. Des. Appl. 2001, 5, 64–67.
18. Lv, J.; Xiao, Q.; Zhai, X. A high-performance rolling bearing fault diagnosis method based on adaptive feature mode decomposition and Transformer. Appl. Acoust. 2024, 224, 110156.
19. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30.
20. Singh, A.; Mousavi, S.; Gaurav, K. SHS: Scorpion Hunting Strategy Swarm Algorithm. arXiv 2024, arXiv:2407.14202.
21. Luo, X.; Wang, H.; Han, T.; Zhang, Y. FFT-trans: Enhancing robustness in mechanical fault diagnosis with Fourier transform-based transformer under noisy conditions. IEEE Trans. Instrum. Meas. 2024, 73, 2515112.
22. Duhamel, P.; Vetterli, M. Fast Fourier transforms: A tutorial review and a state of the art. Signal Process. 1990, 19, 259–299.
23. Xie, S.; Zhou, S.; Sakurada, K.; Ishikawa, R.; Onishi, M.; Oishi, T. G2fR: Frequency Regularization in Grid-Based Feature Encoding Neural Radiance Fields. In European Conference on Computer Vision; Springer Nature: Cham, Switzerland, 2024; pp. 186–203.
24. You, K.; Wang, P.; Huang, P.; Gu, Y. A sound-vibration physical-information fusion constraint-guided deep learning method for rolling bearing fault diagnosis. Reliab. Eng. Syst. Saf. 2025, 253, 110556.
25. Liu, M.; Chen, L.; Du, X. Activated gradients for deep neural networks. IEEE Trans. Neural Netw. Learn. Syst. 2021, 34, 2156–2168.
26. Sun, Y.; Lao, D. Surprising instabilities in training deep networks and a theoretical analysis. Adv. Neural Inf. Process. Syst. 2022, 35, 19567–19578.
27. Zeng, A.; Chen, M.; Zhang, L. Are transformers effective for time series forecasting. Proc. AAAI Conf. Artif. Intell. 2023, 37, 11121–11128.
28. Zhao, B.; Xing, H.; Wang, X.; Song, F.; Xiao, Z. Rethinking attention mechanism in time series classification. Inf. Sci. 2023, 627, 97–114.
29. Garcia, G.; Michau, G.; Ducoffe, M.; Gupta, J.S.; Fink, O. Temporal signals to images: Monitoring the condition of industrial assets with deep learning image processing algorithms. Proc. Inst. Mech. Eng. Part O J. Risk Reliab. 2022, 236, 617–627.
30. Yi, K.; Zhang, Q.; Cao, L. A survey on deep learning-based time series analysis with frequency transformation. arXiv 2023, arXiv:2302.02173.
31. Qiu, S.; Cui, X.; Ping, Z.; Shan, N.; Li, Z.; Bao, X.; Xu, X. Deep learning techniques in intelligent fault diagnosis and prognosis for industrial systems: A review. Sensors 2023, 23, 1305.
32. Wu, G.; Ji, X.; Yang, G.; Jia, Y.; Cao, C. Signal-to-image: Rolling bearing fault diagnosis using ResNet family deep-learning models. Processes 2023, 11, 1527.
33. Li, Z.; Fan, R.; Tu, J. Tdanet: A novel temporal denoise convolutional neural network with attention for fault diagnosis. arXiv 2024, arXiv:2403.19943.
34. Sun, Y.; Li, S.; Wang, Y.; Wang, X. Fault diagnosis of rolling bearing based on empirical mode decomposition and improved manhattan distance in symmetrized dot pattern image. Mech. Syst. Signal Process. 2021, 159, 107817.
35. Zhou, Y.; Long, X.; Sun, M.; Chen, Z. Bearing fault diagnosis based on Gramian angular field and DenseNet. Math. Biosci. Eng. 2022, 19, 14086–14101.
36. Wang, M.; Wang, W.; Zhang, X.; Iu, H.H.-C. A new fault diagnosis of rolling bearing based on Markov transition field and CNN. Entropy 2022, 24, 751.
37. Tang, H.; Tang, Y.; Su, Y.; Feng, W.; Wang, B.; Chen, P.; Zuo, D. Feature extraction of multi-sensors for early bearing fault diagnosis using deep learning based on minimum unscented kalman filter. Eng. Appl. Artif. Intell. 2024, 127, 107138.
38. Julier, S.; Uhlmann, J. New extension of the Kalman filter to nonlinear systems. Signal Process. Sens. Fusion Target Recognit. VI 1997, 3068, 182–193.
39. Wang, L.; Zhao, W. An ensemble deep learning network based on 2D convolutional neural network and 1D LSTM with self-attention for bearing fault diagnosis. Appl. Soft Comput. 2025, 172, 112889.
40. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
41. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
42. Heo, B.; Park, S.; Han, D.; Yun, S. Rotary position embedding for vision transformer. In European Conference on Computer Vision; Springer Nature: Cham, Switzerland, 2024; pp. 289–305.
43. Sun, S.; Xia, X.; Zhou, H. A graph representation learning-based method for fault diagnosis of rotating machinery under time-varying speed conditions. Nonlinear Dyn. 2025, 113, 17449–17475.
44. Chen, S.; Liu, Z.; He, X.; Zou, D.; Zhou, D. Multi-mode Fault Diagnosis Datasets of Gearbox Under Variable Working Conditions. Data Brief 2024, 54, 110453.
45. Zhao, C.; Zio, E.; Shen, W. Domain generalization for cross-domain fault diagnosis: An application-oriented perspective and a benchmark study. Reliab. Eng. Syst. Saf. 2024, 245, 109964.
46. Chollet, F. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1251–1258.
47. Ramachandran, P.; Zoph, B.; Le, Q. Searching for activation functions. arXiv 2017, arXiv:1710.05941.
48. Su, J.; Ahmed, M.; Lu, Y. Roformer: Enhanced transformer with rotary position embedding. Neurocomputing 2024, 568, 127063.
49. Bachlechner, T.; Majumder, B.; Mao, H.; Cottrell, G.; McAuley, J. Rezero is all you need: Fast convergence at large depth. Uncertain. Artif. Intell. 2021, 161, 1352–1361.
50. Li, T.; Zhao, Z.; Sun, C.; Yan, R.; Chen, X. Multireceptive field graph convolutional networks for machine fault diagnosis. IEEE Trans. Ind. Electron. 2020, 3, 12739–12749.
51. Wang, Z.; Wu, Z.; Li, X.; Shao, H.; Han, T.; Xie, M. Attention-aware temporal–spatial graph neural network with multi-sensor information fusion for fault diagnosis. Knowl.-Based Syst. 2023, 278, 110891.
52. Jiang, Z.; Zheng, W.; Men, D. Research on gearbox fault diagnosis method under variable working conditions based on HHO-MLP neural network. Manuf. Technol. Mach. Tool 2025, 2, 29–35.
53. Zhao, X.; Zhu, X.; Liu, J.; Hu, Y.; Gao, T.; Zhao, L.; Yao, J.; Liu, Z. Model-assisted multi-source fusion hypergraph convolutional neural networks for intelligent few-shot fault diagnosis to electro-hydrostatic actuator. Inf. Fusion 2024, 104, 102186.
Dataset | MCC5 | HUST | Total |
---|---|---|---|
4 s extraction | 3360 | 180 | 3540 |
10 s extraction | 1440 | 90 | 1530 |
Category | Parameter | Setting |
---|---|---|
Data preprocessing | Sample split | [i, i + 6], i ∈ {0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52}; [i, i + 6], i ∈ {0, 10, 20, 30, 40, 50} |
Data preprocessing | Window size for PAA dimensionality reduction | 150 |
Data preprocessing | Number of signal channels | 8 |
Data preprocessing | Output GASF/GADF image resolution | 512 |
Data preprocessing | Shape of GAF (merges all channels of GASF/GADF) | (16, 512, 512) |
Data preprocessing | Patch size | 16 |
Network | Kernel size of depthwise convolution | 3 × 3, 5 × 5 |
Network | Kernel number of depthwise convolution | 32 |
Network | Number of attention heads | {1, 2, 3, 4} |
Network | Numerical stability parameter for LayerNorm | 1 × 10⁻¹² |
Network | Number of hidden layers in the transformer module | 6 |
Network | FFN hidden size | 2048 |
Others | Dropout | 0.1 |
Others | Loss function | CrossEntropyLoss |
Others | Batch size | 16 |
Others | Epochs | 256 |
Others | Learning rate | 2 × 10⁻⁶ |
Others | Optimizer | AdamW |
Model | Accuracy | Precision | Recall | F1 Score |
---|---|---|---|---|
GRU + ShuffleNet | 0.927 | 0.932 | 0.925 | 0.919 |
SeNetTrans | 0.938 | 0.944 | 0.954 | 0.943 |
InceptionV3 | 0.859 | 0.841 | 0.858 | 0.843 |
ConvNeXtV2 | 0.940 | 0.953 | 0.959 | 0.946 |
RoPE-DWTrans (Ours) | 0.953 | 0.959 | 0.973 | 0.961 |
Model | Accuracy | Precision | Recall | F1 Score |
---|---|---|---|---|
GRU + ShuffleNet | 0.831 | 0.834 | 0.831 | 0.827 |
SeNetTrans | 0.895 | 0.899 | 0.892 | 0.893 |
InceptionV3 | 0.733 | 0.754 | 0.729 | 0.729 |
ConvNeXtV2 | 0.905 | 0.919 | 0.908 | 0.911 |
RoPE-DWTrans (Ours) | 0.923 | 0.932 | 0.928 | 0.928 |
Model | Evaluated Metric | 4 s Extraction, 14 Class | 4 s Extraction, 23 Class | 10 s Extraction, 14 Class | 10 s Extraction, 23 Class |
---|---|---|---|---|---|
ViT | Accuracy | 0.929 | 0.893 | 0.947 | 0.922 |
ViT | F1 score | 0.932 | 0.897 | 0.962 | 0.912 |
DWFE + ViT | Accuracy | 0.937 | 0.905 | 0.964 | 0.933 |
DWFE + ViT | F1 score | 0.948 | 0.906 | 0.967 | 0.943 |
RoPE-DWTrans (Ours) | Accuracy | 0.953 | 0.923 | 0.964 | 0.950 |
RoPE-DWTrans (Ours) | F1 score | 0.961 | 0.928 | 0.967 | 0.951 |
Feature Extraction | Dataset | Accuracy | Precision | Recall | F1 Score | Training Time |
---|---|---|---|---|---|---|
4 s extraction | 14 class | 0.953 | 0.959 | 0.973 | 0.961 | 51 s |
4 s extraction | 23 class | 0.923 | 0.932 | 0.928 | 0.928 | 122 s |
10 s extraction | 14 class | 0.964 | 0.976 | 0.971 | 0.967 | 33 s |
10 s extraction | 23 class | 0.950 | 0.949 | 0.966 | 0.951 | 95 s |
Fault | Fault Category | Fault | Fault Category |
---|---|---|---|
Fault 1 | Gear pitting | Fault 4 | Teeth break and bearing outer |
Fault 2 | Gear wear | Fault 5 | Teeth break |
Fault 3 | Teeth break and bearing inner | Fault 6 | Teeth crack |
Feature Extraction | Model | Fault 1 (Classes 0–2) | Fault 2 (Classes 3–5) | Fault 3 (Classes 8–10) | Fault 4 (Classes 11–13) | Fault 5 (Classes 14–16) | Fault 6 (Classes 17–19) |
---|---|---|---|---|---|---|---|
4 s extraction | ViT | 0.833 | 0.854 | 0.897 | 0.923 | 0.873 | 0.915 |
4 s extraction | Ours | 0.870 | 0.887 | 0.870 | 0.958 | 0.943 | 0.969 |
10 s extraction | ViT | 1.000 | 0.958 | 0.933 | 0.933 | 0.857 | 0.833 |
10 s extraction | Ours | 1.000 | 1.000 | 1.000 | 0.933 | 0.928 | 0.883 |
Methods | Accuracy | F1-Score |
---|---|---|
FFT-SGCN [43] | 0.962 | 0.967 |
Multireceptive-GCN [50] | 0.971 | 0.968 |
Attention-TSGNN [51] | 0.975 | 0.976 |
HHO-MLP [52] | 0.975 | - |
Ours | 0.981 | 0.978 |
Performance Corresponding to Different Percentages of Training Data (Accuracy and F1 Score)
Extraction | Class | Accuracy (56.5%) | F1 Score (56.5%) | Accuracy (68.2%) | F1 Score (68.2%) | Accuracy (88.7%) | F1 Score (88.7%) |
---|---|---|---|---|---|---|---|
4 s | 14 | 0.828 | 0.831 | 0.862 | 0.862 | 0.953 | 0.961 |
4 s | 23 | 0.802 | 0.812 | 0.843 | 0.854 | 0.923 | 0.928 |
10 s | 14 | 0.839 | 0.846 | 0.897 | 0.890 | 0.964 | 0.967 |
10 s | 23 | 0.813 | 0.828 | 0.883 | 0.886 | 0.950 | 0.951 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Luo, X.; Wang, M.; Zhang, Z. Analysis of Gearbox Bearing Fault Diagnosis Method Based on 2D Image Transformation and 2D-RoPE Encoding. Appl. Sci. 2025, 15, 7260. https://doi.org/10.3390/app15137260