Machine Fault Diagnosis: Experiments with Different Attention Mechanisms Using a Lightweight SqueezeNet Architecture
Abstract
1. Introduction
1.1. Literature Review
1.2. Contribution Summary
- This paper presents the classification results of the lightweight SqueezeNet combined with four attention mechanisms: self-attention, CBAM, channel, and spatial. Even with the additional attention mechanisms, the model remains lightweight compared to state-of-the-art architectures such as CNN, VGG, DenseNet, and ResNet. The SqueezeNet architecture with and without attention mechanisms has 0.73~0.82 million trainable parameters, whereas the state-of-the-art architectures have 1.3~138 million [38].
- All experiments are run on texture images generated by applying EMD and the Gammatone spectrogram to two audio sensor datasets: Malfunctioning Industrial Machine Investigation and Inspection (MIMII) and Toy Anomaly Detection in Machine Operating Sounds (ToyADMOS). In [38], it is shown that the EMD–Gammatone spectrogram generates more distinctive texture patterns for each type of fault than the MFCC, Gammatone, and Hilbert–Huang transform. The experimental results show that the EMD–Gammatone spectrogram achieves 88.81% accuracy with the SqueezeNet architecture, whereas the other state-of-the-art texture generation techniques achieve only 62~79.42% accuracy.
- Among the four attention mechanisms, self-attention performs best. We conclude that the self-attention mechanism acquires the most effective information from the input vector forwarded by the convolution and pooling layers: it achieves 97% accuracy on the MIMII and ToyADMOS datasets, whereas the other attention mechanisms achieve 95~96% accuracy.
- In addition, to explain the classifier results step by step, t-SNE is applied to visualize the features. All experiments are run for 200 epochs. For the SqueezeNet architecture with the self-attention mechanism, we report the training-set features every 100 epochs and the testing-set features after 200 epochs, for both MIMII and ToyADMOS. The t-SNE results illustrate that the model with a self-attention block extracts the distinct features of each fault almost accurately for both balanced and imbalanced datasets.
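
The self-attention block referred to in the contributions operates on the feature map forwarded by the convolution and pooling layers. The following is a minimal NumPy sketch of that idea, assuming a single-head scaled dot-product formulation over flattened spatial positions. The projection sizes, the random weights, and the residual connection are illustrative assumptions, not the authors' configuration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def self_attention(feature_map, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over spatial positions.

    feature_map: (H, W, C) activations from the preceding conv/pool layers.
    w_q, w_k: (C, d) projections; w_v: (C, C) projection (hypothetical sizes).
    Returns the attended map (H, W, C) and the (H*W, H*W) attention weights.
    """
    h, w, c = feature_map.shape
    tokens = feature_map.reshape(h * w, c)        # one token per spatial position
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(w_q.shape[1])      # pairwise similarity of positions
    weights = softmax(scores, axis=-1)            # each row sums to 1
    attended = weights @ v                        # weighted sum of value vectors
    out = tokens + attended                       # residual connection (assumed)
    return out.reshape(h, w, c), weights


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4, 8))                # toy 4x4 map with 8 channels
w_q = rng.standard_normal((8, 4)) * 0.1
w_k = rng.standard_normal((8, 4)) * 0.1
w_v = rng.standard_normal((8, 8)) * 0.1
y, attn = self_attention(x, w_q, w_k, w_v)
print(y.shape, attn.shape)
```

Each row of `attn` describes how strongly one spatial position attends to every other position, which is what lets the block weight informative regions of the texture image.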
2. Methodology
| Algorithm 1: Steps followed to implement the proposed fault detection model |
|---|
| Step 1: Signal pre-processing and texture image generation |
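
Step 1 turns an audio signal into a texture image. The sketch below illustrates only the Gammatone spectrogram part, under stated assumptions: the EMD decomposition is omitted, the filters are fourth-order gammatone impulse responses with ERB-spaced center frequencies (Glasberg and Moore approximation), and the band count, frame length, and hop size are hypothetical defaults rather than values taken from the paper.

```python
import numpy as np


def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)


def erb_spaced_centers(fmin, fmax, n_bands):
    """Center frequencies equally spaced on the ERB-rate scale."""
    lo = 21.4 * np.log10(4.37 * fmin / 1000.0 + 1.0)
    hi = 21.4 * np.log10(4.37 * fmax / 1000.0 + 1.0)
    erb_rate = np.linspace(lo, hi, n_bands)
    return (10.0 ** (erb_rate / 21.4) - 1.0) * 1000.0 / 4.37


def gammatone_ir(fc, fs, duration=0.05, order=4):
    """Unit-energy impulse response of a gammatone filter centered at fc."""
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * erb(fc)                           # bandwidth parameter
    ir = t ** (order - 1) * np.exp(-2.0 * np.pi * b * t) * np.cos(2.0 * np.pi * fc * t)
    return ir / np.sqrt(np.sum(ir ** 2))


def gammatone_spectrogram(signal, fs, n_bands=32, fmin=50.0, frame=400, hop=160):
    """Log band energies, shape (n_bands, n_frames); all defaults are assumptions."""
    centers = erb_spaced_centers(fmin, 0.45 * fs, n_bands)
    n_frames = 1 + (len(signal) - frame) // hop
    spec = np.empty((n_bands, n_frames))
    for i, fc in enumerate(centers):
        band = np.convolve(signal, gammatone_ir(fc, fs), mode="same")
        for j in range(n_frames):
            segment = band[j * hop : j * hop + frame]
            spec[i, j] = np.mean(segment ** 2)    # frame energy in this band
    return 10.0 * np.log10(spec + 1e-12), centers


fs = 16_000
t = np.arange(fs) / fs
tone = np.sin(2.0 * np.pi * 1000.0 * t)           # 1 s test tone at 1 kHz
spec, centers = gammatone_spectrogram(tone, fs)
peak_band = int(np.argmax(spec.mean(axis=1)))
print(spec.shape, round(float(centers[peak_band])))
```

The resulting array can be rendered as a grayscale or color image; for a pure tone, the band with the highest energy is the one whose center frequency lies closest to the tone.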
2.1. Channel Attention Mechanism
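
As an illustrative sketch rather than the authors' implementation, channel attention in the CBAM formulation [9] pools the feature map over its spatial dimensions, passes the average-pooled and max-pooled descriptors through a shared two-layer MLP, and uses the sigmoid of their sum to reweight the channels. The reduction ratio, the weight shapes, and the omission of bias terms below are assumptions.

```python
import numpy as np


def sigmoid(x):
    """Logistic function mapping real values to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feature_map, w1, w2):
    """CBAM-style channel attention.

    feature_map: (H, W, C); w1: (C, C // r) and w2: (C // r, C) form a shared
    two-layer MLP (bias terms omitted for brevity).
    Returns the reweighted map and the per-channel weights in (0, 1).
    """
    avg_desc = feature_map.mean(axis=(0, 1))      # (C,) average-pooled descriptor
    max_desc = feature_map.max(axis=(0, 1))       # (C,) max-pooled descriptor

    def mlp(desc):
        return np.maximum(desc @ w1, 0.0) @ w2    # ReLU hidden layer

    weights = sigmoid(mlp(avg_desc) + mlp(max_desc))
    return feature_map * weights, weights         # broadcast over H and W


rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6, 16))               # toy 6x6 map with 16 channels
w1 = rng.standard_normal((16, 4)) * 0.1           # reduction ratio r = 4 (assumed)
w2 = rng.standard_normal((4, 16)) * 0.1
y, ch_weights = channel_attention(x, w1, w2)
print(y.shape, ch_weights.shape)
```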
2.2. Spatial Attention Mechanism
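
As an illustrative sketch rather than the authors' implementation, spatial attention in the CBAM formulation [9] pools the feature map along the channel axis, convolves the stacked average and max maps with a single kernel, and applies a sigmoid to obtain one weight per spatial position. The 7×7 kernel size is an assumption here; it does give 2 · 7 · 7 = 98 weights, which equals the difference between the 73,730 and 73,632 trainable parameters reported for the spatial-attention and baseline models.

```python
import numpy as np


def sigmoid(x):
    """Logistic function mapping real values to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def spatial_attention(feature_map, kernel):
    """CBAM-style spatial attention.

    feature_map: (H, W, C); kernel: (k, k, 2) convolution weights, no bias.
    Returns the reweighted map and the (H, W) spatial weights in (0, 1).
    """
    avg_map = feature_map.mean(axis=2)            # (H, W) channel-wise average
    max_map = feature_map.max(axis=2)             # (H, W) channel-wise maximum
    pooled = np.stack([avg_map, max_map], axis=2) # (H, W, 2)

    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(pooled, ((pad, pad), (pad, pad), (0, 0)))  # zero "same" padding
    h, w, _ = pooled.shape
    logits = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            logits[i, j] = np.sum(padded[i:i + k, j:j + k, :] * kernel)

    weights = sigmoid(logits)
    return feature_map * weights[:, :, None], weights


rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6, 16))               # toy 6x6 map with 16 channels
kernel = rng.standard_normal((7, 7, 2)) * 0.1     # 7x7 kernel (assumed size)
y, sp_weights = spatial_attention(x, kernel)
print(y.shape, sp_weights.shape, kernel.size)
```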
2.3. CBAM Attention Mechanism
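
For reference, the CBAM formulation of Woo et al. [9] applies channel attention followed by spatial attention to an input feature map $F$. The notation below follows that paper and is not necessarily the exact configuration used in these experiments; in particular, the $7\times 7$ convolution $f^{7\times 7}$ is the default from [9].

```latex
\begin{aligned}
M_c(F)  &= \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big), \\
F'      &= M_c(F) \otimes F, \\
M_s(F') &= \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F');\ \mathrm{MaxPool}(F')])\big), \\
F''     &= M_s(F') \otimes F'.
\end{aligned}
```

Here $\sigma$ is the sigmoid function, $\otimes$ is element-wise multiplication with broadcasting, $M_c$ is the channel attention map, and $M_s$ is the spatial attention map.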
3. Experimental Results Analysis
3.1. Experiment with MIMII Dataset
3.2. Experiment with ToyADMOS
4. Interpretation of Self-Attention SqueezeNet Results with t-SNE
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Liu, R.; Yang, B.; Zio, E.; Chen, X. Artificial intelligence for fault diagnosis of rotating machinery: A review. Mech. Syst. Signal Process. 2018, 108, 33–47.
2. Xu, L.; Teoh, S.S.; Ibrahim, H. A deep learning approach for electric motor fault diagnosis based on modified InceptionV3. Sci. Rep. 2024, 14, 12344.
3. Siraj, F.M.; Ayon, S.T.K.; Samad, M.A.; Uddin, J.; Choi, K. Few-Shot Lightweight SqueezeNet Architecture for Induction Motor Fault Diagnosis using Limited Thermal Image Dataset. IEEE Access 2024, 12, 50986–50997.
4. Kankar, P.K.; Sharma, S.C.; Harsha, S.P. Fault diagnosis of ball bearings using machine learning methods. Expert Syst. Appl. 2011, 38, 1876–1886.
5. Sobie, C.; Freitas, C.; Nicolai, M. Simulation-driven machine learning: Bearing fault classification. Mech. Syst. Signal Process. 2018, 99, 403–419.
6. Liu, Y.; Yan, X.; Zhang, C.A.; Liu, W. An ensemble convolutional neural networks for bearing fault diagnosis using multi-sensor data. Sensors 2019, 19, 5300.
7. Zhang, R.; Peng, Z.; Wu, L.; Yao, B.; Guan, Y. Fault diagnosis from raw sensor data using deep neural networks considering temporal coherence. Sensors 2017, 17, 549.
8. Guo, M.; Liu, H.; Xu, Y.; Huang, Y. Building extraction based on U-Net with an attention block and multiple losses. Remote Sens. 2020, 12, 1400.
9. Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the 2018 European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
10. He, A.; Li, T.; Li, N.; Wang, K.; Fu, H. CABNet: Category attention block for imbalanced diabetic retinopathy grading. IEEE Trans. Med. Imaging 2020, 40, 143–153.
11. Zhang, H.; Wu, C.; Zhang, Z.; Zhu, Y.; Lin, H.; Zhang, Z.; Sun, Y.; He, T.; Mueller, J.; Manmatha, R.; et al. ResNeSt: Split-attention networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 2736–2746.
12. Chen, L.; Zhang, H.; Xiao, J.; Nie, L.; Shao, J.; Liu, W.; Chua, T.S. SCA-CNN: Spatial and channel-wise attention in convolutional networks for image captioning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 5659–5667.
13. Akbar, A.S.; Fatichah, C.; Suciati, N. Single level UNet3D with multipath residual attention block for brain tumor segmentation. J. King Saud Univ. - Comput. Inf. Sci. 2022, 34, 3247–3258.
14. Huang, J.; Chen, B.; Yao, B.; He, W. ECG arrhythmia classification using STFT-based spectrogram and convolutional neural network. IEEE Access 2019, 7, 92871–92880.
15. Islam, M.M.; Kim, J.M. Automated bearing fault diagnosis scheme using 2D representation of wavelet packet transform and deep convolutional neural network. Comput. Ind. 2019, 106, 142–153.
16. Shao, S.; McAleer, S.; Yan, R.; Baldi, P. Highly accurate machine fault diagnosis using deep transfer learning. IEEE Trans. Ind. Inform. 2018, 15, 2446–2455.
17. Wen, L.; Li, X.; Li, X.; Gao, L. A new transfer learning based on VGG-19 network for fault diagnosis. In Proceedings of the 2019 IEEE 23rd International Conference on Computer Supported Cooperative Work in Design (CSCWD), Porto, Portugal, 6–8 May 2019; pp. 205–209.
18. Zhang, D.; Zhou, T. Deep convolutional neural network using transfer learning for fault diagnosis. IEEE Access 2021, 9, 43889–43897.
19. Guo, L.; Lei, Y.; Xing, S.; Yan, T.; Li, N. Deep convolutional transfer learning network: A new method for intelligent fault diagnosis of machines with unlabeled data. IEEE Trans. Ind. Electron. 2018, 66, 7316–7325.
20. Yang, B.; Lei, Y.; Jia, F.; Xing, S. An intelligent fault diagnosis approach based on transfer learning from laboratory bearings to locomotive bearings. Mech. Syst. Signal Process. 2019, 122, 692–706.
21. Zhang, R.; Tao, H.; Wu, L.; Guan, Y. Transfer learning with neural networks for bearing fault diagnosis in changing working conditions. IEEE Access 2017, 5, 14347–14357.
22. Han, T.; Liu, C.; Wu, R.; Jiang, D. Deep transfer learning with limited data for machinery fault diagnosis. Appl. Soft Comput. 2021, 103, 107150.
23. Li, G.; Wu, J.; Deng, C.; Wei, M.; Xu, X. Self-supervised learning for intelligent fault diagnosis of rotating machinery with limited labeled data. Appl. Acoust. 2022, 191, 108663.
24. Berenji, A.; Taghiyarrenani, Z.; Rohani Bastami, A. Fault identification with limited labeled data. J. Vib. Control 2024, 30, 1502–1510.
25. Wei, M.; Liu, Y.; Zhang, T.; Wang, Z.; Zhu, J. Fault diagnosis of rotating machinery based on improved self-supervised learning method and very few labeled samples. Sensors 2021, 22, 192.
26. Tang, H.; Gao, S.; Wang, L.; Li, X.; Li, B.; Pang, S. A novel intelligent fault diagnosis method for rolling bearings based on Wasserstein generative adversarial network and Convolutional Neural Network under Unbalanced Dataset. Sensors 2021, 21, 6754.
27. Islam, R.; Uddin, J.; Kim, J.M. Texture analysis based feature extraction using Gabor filter and SVD for reliable fault diagnosis of an induction motor. Int. J. Inf. Technol. Manag. 2018, 17, 20–32.
28. Zabin, M.; Choi, H.J.; Uddin, J.; Furhad, M.H.; Ullah, A.B. Industrial Fault Diagnosis using Hilbert Transform and Texture Features. In Proceedings of the 2021 IEEE International Conference on Big Data and Smart Computing (BigComp), Jeju Island, Korea, 17–20 January 2021; pp. 121–128.
29. Fan, H.; Ren, Z.; Zhang, X.; Cao, X.; Ma, H.; Huang, J. A gray texture image data-driven intelligent fault diagnosis method of induction motor rotor-bearing system under variable load conditions. Measurement 2024, 233, 114742.
30. Tran, M.Q.; Liu, M.K.; Tran, Q.V.; Nguyen, T.K. Effective fault diagnosis based on wavelet and convolutional attention neural network for induction motors. IEEE Trans. Instrum. Meas. 2021, 71, 3139706.
31. Chen, L.; Ma, Y.; Wang, H.; Wen, S.; Guo, L. A novel deep convolutional neural network and its application to fault diagnosis of the squirrel-cage asynchronous motor under noisy environment. Meas. Sci. Technol. 2023, 34, 115113.
32. Liu, H.; Li, L.; Ma, J. Rolling Bearing Fault Diagnosis Based on STFT-Deep Learning and Sound Signals. Shock Vib. 2016, 1, 6127479.
33. Benkedjouh, T.; Zerhouni, N.; Rechak, S. Deep Learning for Fault Diagnosis based on short-time Fourier transform. In Proceedings of the 2018 International Conference on Smart Communications in Network Technologies (SaCoNeT), El Oued, Algeria, 27–31 October 2018; pp. 288–293.
34. Du, Y.; Wang, A.; Wang, S.; He, B.; Meng, G. Fault diagnosis under variable working conditions based on STFT and transfer deep residual network. Shock Vib. 2020, 1, 1274380.
35. Ahmed, H.O.; Nandi, A.K. Connected components-based colour image representations of vibrations for a two-stage fault diagnosis of roller bearings using convolutional neural networks. Chin. J. Mech. Eng. 2021, 34, 37.
36. Pham, M.T.; Kim, J.M.; Kim, C.H. Rolling bearing fault diagnosis based on improved GAN and 2-D representation of acoustic emission signals. IEEE Access 2022, 10, 78056–78069.
37. Qin, Z.; Huang, F.; Pan, J.; Niu, J.; Qin, H. Improved Generative Adversarial Network for Bearing Fault Diagnosis with a Small Number of Data and Unbalanced Data. Symmetry 2024, 16, 358.
38. Zabin, M.; Kabir, A.N.B.; Kabir, M.K.; Choi, H.J.; Uddin, J. Machine Fault Diagnosis Using EMD-Gammatone Texture Representation and A Lightweight Self-Attention SqueezeNet. In Proceedings of the 2024 IEEE International Conference on Big Data and Smart Computing (BigComp), Bangkok, Thailand, 18–21 February 2024; pp. 32–39.
39. Jung, H.; Choi, S.; Lee, B. Rotor Fault Diagnosis Method Using CNN-Based Transfer Learning with 2D Sound Spectrogram Analysis. Electronics 2023, 12, 480.
40. Ruan, D.; Wang, J.; Yan, J.; Gühmann, C. CNN parameter design based on fault signal analysis and its application in bearing fault diagnosis. Adv. Eng. Inform. 2023, 55, 101877.
41. Lin, S.L. Intelligent fault diagnosis and forecast of time-varying bearing based on deep learning VMD-DenseNet. Sensors 2021, 21, 7467.
42. Wen, L.; Li, X.; Gao, L. A transfer convolutional neural network for fault diagnosis based on ResNet-50. Neural Comput. Appl. 2020, 32, 6111–6124.
43. Su, J.; Wang, H. Fine-Tuning and Efficient VGG16 Transfer Learning Fault Diagnosis Method for Rolling Bearing. In Proceedings of the IncoME-VI and TEPEN 2021, Performance Engineering and Maintenance Engineering; Springer: Berlin/Heidelberg, Germany, 2022; pp. 453–461.
44. Li, H.; Qiu, K.; Chen, L.; Mei, X.; Hong, L.; Tao, C. SCAttNet: Semantic segmentation network with spatial and channel attention mechanism for high-resolution remote sensing images. IEEE Geosci. Remote Sens. Lett. 2020, 18, 905–909.
45. Park, J.; Woo, S.; Lee, J.Y.; Kweon, I.S. A simple and light-weight attention module for convolutional neural networks. Int. J. Comput. Vis. 2020, 128, 783–798.
46. Cao, G.; Luo, S. Multimodal perception for dexterous manipulation. In Tactile Sensing, Skill Learning, and Robotic Dexterous Manipulation; Academic Press: Cambridge, MA, USA, 2022; pp. 45–58.
47. Tanabe, R.; Purohit, H.; Dohi, K.; Endo, T.; Nikaido, Y.; Nakamura, T.; Kawaguchi, Y. MIMII DUE: Sound dataset for malfunctioning industrial machine investigation and inspection with domain shifts due to changes in operational and environmental conditions. In Proceedings of the 2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, 17–20 October 2021; pp. 21–25.
48. Koizumi, Y.; Saito, S.; Uematsu, H.; Harada, N.; Imoto, K. ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In Proceedings of the 2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, 17–20 October 2021; pp. 313–317.

| Ref., Year | 2D Image Generation Techniques and Dataset | Architectures | Limitations | 
|---|---|---|---|
| [33], 2018 | STFT-based spectrogram, vibration dataset | CNN | Although the STFT represents the frequency components of time-series data effectively because it captures temporal and spectral information simultaneously, it does not provide optimal resolution in both the time and frequency domains. |
| [15], 2019 | DDRgram, AE dataset | ADCNN | With DDRgram-based input (1024 pixels), the 2D adaptive deep CNN exhibited higher diagnostic performance while requiring less processing time than with the 1D raw signal for the AE dataset. Selecting the wavelet filter size when generating the DDRgram is a critical issue. |
| [35], 2021 | RGBVI, vibration dataset | CNN | RGB vibration images are generated by segmenting the vibration dataset. Although the model shows improved results compared to the ML models, the model architecture and trainable parameters are not reported. |
| [26], 2021 | Window-based grayscale image, vibration dataset | Self-attentive convolutional neural network (SECNN) | The model shows improved accuracy for an imbalanced dataset. However, the model was validated by generating sample data using GANs due to the unavailability of the actual dataset. |
| [36], 2022 | Constant-Q transform (CQT), AE data | GAN | Although the model converges rapidly and achieves high accuracy, it has 0.31 million trainable parameters and is tested in limited environments. The limitations of the CQT are its sensitivity to the tuning parameters, its computational burden, and its unsuitability for large datasets and real-time applications. |
| [37], 2024 | CWT, vibration dataset | CNN-CA (coordinate attention) | The coordinate attention mechanism with CNN shows improved performance. Although the CWT, unlike the STFT, does not require selecting a window size, it is mostly suitable for non-stationary signals, and its computational complexity is an issue. |

| Model | Trainable Parameters | Accuracy | F1-Score |
|---|---|---|---|
| SqueezeNet without attention | 73,632 | 96% | 0.96 | 
| SqueezeNet with Self-attention | 82,224 | 97% | 0.97 | 
| SqueezeNet with Channel attention | 73,796 | 96% | 0.96 | 
| SqueezeNet with Spatial attention | 73,730 | 96% | 0.96 | 
| SqueezeNet with CBAM attention | 73,894 | 96% | 0.96 | 
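
The parameter counts in the table above make the cost of each attention block easy to quantify. The short calculation below uses only the reported counts; the variable names are illustrative.

```python
# Trainable-parameter counts as reported in the table above.
baseline = 73_632
with_attention = {
    "self-attention": 82_224,
    "channel": 73_796,
    "spatial": 73_730,
    "CBAM": 73_894,
}

# Extra parameters each attention block adds on top of plain SqueezeNet.
overhead = {name: count - baseline for name, count in with_attention.items()}

for name, extra in overhead.items():
    share = 100 * extra / baseline
    print(f"{name:>14}: +{extra:,} parameters ({share:.2f}% of baseline)")
```

The CBAM overhead (262 parameters) equals the sum of the channel (164) and spatial (98) overheads, which is consistent with CBAM being the composition of those two modules, while self-attention adds 8,592 parameters.
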

| Model | Accuracy | F1-Score |
|---|---|---|
| SqueezeNet without attention | 96% | 0.96 | 
| SqueezeNet with Self-attention | 97% | 0.97 | 
| SqueezeNet with Channel attention | 97% | 0.97 | 
| SqueezeNet with Spatial attention | 96% | 0.96 | 
| SqueezeNet with CBAM attention | 95% | 0.95 | 
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zabin, M.; Choi, H.-J.; Kabir, M.K.; Kabir, A.N.B.; Uddin, J. Machine Fault Diagnosis: Experiments with Different Attention Mechanisms Using a Lightweight SqueezeNet Architecture. Electronics 2024, 13, 3112. https://doi.org/10.3390/electronics13163112