# Deep Transfer Learning-Based Fault Diagnosis Using Wavelet Transform for Limited Data


## Abstract


## 1. Introduction

- We adopt a pre-trained model, GoogLeNet, to classify industrial faults. The wavelet transformation converts one-dimensional time-series data into two-dimensional time-frequency images that serve as the network input. To deal with the limited number of target samples, we first partially retrain the pre-trained model using intermediate data that are conceptually related to the target domain but whose samples are less expensive to collect. We then retrain a relatively small portion of the resulting intermediate model using the target data.
- We extensively evaluate the effectiveness of the proposed method and compare it with the state of the art. We use a Simulink model of a triplex reciprocating pump with different fault combinations and severity levels. The proposed method improves the generalization capability to classify the fault types while reducing the dependency on the training data of the source domain. In particular, we show the critical impact of the severity level on the fault classification accuracy.

## 2. Related Works

## 3. Wavelet-Based Deep Transfer Learning

#### 3.1. Overall Framework

- Pre-trained model: We use GoogLeNet trained on the ImageNet data set to classify images into 1000 categories [20]. GoogLeNet is a 22-layer CNN, a variant of the inception network developed at Google for image classification and object detection.
- Training on the intermediate domain: The CWT converts the time-series signals of the intermediate domain into wavelet images that capture their time and frequency characteristics. The intermediate network has nearly the same architecture as the pre-trained model of the source domain. The parameters of the later layers are updated using the intermediate samples, while most parameters of the earlier layers are fixed, which reduces the number of learnable parameters and helps avoid overfitting. Once trained, the network serves as a general fault diagnosis model that takes wavelet images as input.
- Training on the target domain: Similarly, the CWT converts the time-series signals of the target domain into time-frequency wavelet images. The limited wavelet images of the target domain are passed to the intermediate network to fine-tune a few of its later layers for the fault diagnosis of the target domain.
- Fault diagnosis stage: The target network is finally used to classify the fault types based on the features extracted from the wavelet images.
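
The two retraining stages above differ mainly in how many layers remain trainable. The following is a minimal sketch of that freezing schedule, not the authors' implementation: the six-block layer list, the `Layer` class, the `unfreeze_last` helper, and the counts of retrained layers (3 and 1) are all hypothetical and only illustrate that the target stage updates a smaller tail of the network than the intermediate stage.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    trainable: bool = False  # frozen by default, as in a pre-trained network


def unfreeze_last(layers, n_trainable):
    """Freeze every layer, then mark only the last n_trainable as trainable."""
    if not 0 <= n_trainable <= len(layers):
        raise ValueError("n_trainable must be between 0 and len(layers)")
    cutoff = len(layers) - n_trainable
    for index, layer in enumerate(layers):
        layer.trainable = index >= cutoff
    return [layer.name for layer in layers if layer.trainable]


# Hypothetical six-layer stand-in for the pre-trained network
# (this is not GoogLeNet's real layout).
network = [Layer(f"block{i}") for i in range(1, 6)] + [Layer("classifier")]

# Stage 1: retrain the later layers on intermediate-domain wavelet images.
stage1 = unfreeze_last(network, n_trainable=3)  # assumed count

# Stage 2: fine-tune an even smaller tail on the limited target-domain images.
stage2 = unfreeze_last(network, n_trainable=1)  # assumed count

print("intermediate stage trains:", stage1)
print("target stage trains:", stage2)
```

In a deep learning framework, the same idea is usually expressed by disabling gradient updates for the frozen layers before each training run.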

#### 3.2. Wavelet Transformation
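
The conversion of a one-dimensional signal into a time-frequency image, as used in the framework of Section 3.1, can be sketched as follows. This is a direct, unoptimized continuous wavelet transform under stated assumptions: a complex Morlet wavelet with centre frequency `OMEGA0 = 6`, a synthetic 20 Hz sine as the input, and four analysis frequencies. None of these choices come from the paper, and the function names are hypothetical.

```python
import cmath
import math

OMEGA0 = 6.0  # Morlet centre frequency (a common default, assumed here)


def morlet(t):
    """Complex Morlet wavelet evaluated at dimensionless time t."""
    return math.pi ** -0.25 * cmath.exp(1j * OMEGA0 * t) * math.exp(-0.5 * t * t)


def cwt_magnitude(signal, fs, freqs, hop=10):
    """Return |CWT| as rows (one per frequency) sampled at every hop-th time index."""
    dt = 1.0 / fs
    image = []
    for f in freqs:
        scale = OMEGA0 / (2.0 * math.pi * f)  # scale whose centre frequency is f
        row = []
        for k in range(0, len(signal), hop):
            acc = 0j
            for n, x in enumerate(signal):
                acc += x * morlet((n - k) * dt / scale).conjugate()
            row.append(abs(acc) * dt / math.sqrt(scale))
        image.append(row)
    return image


# Synthetic test signal: one second of a 20 Hz sine sampled at 200 Hz.
fs = 200.0
signal = [math.sin(2.0 * math.pi * 20.0 * n / fs) for n in range(200)]
freqs = [5.0, 10.0, 20.0, 40.0]

image = cwt_magnitude(signal, fs, freqs)

# The row with the largest average magnitude marks the dominant frequency.
row_energy = [sum(row) / len(row) for row in image]
dominant = freqs[row_energy.index(max(row_energy))]
print("dominant analysis frequency (Hz):", dominant)
```

Each row of `image` corresponds to one frequency and each column to one time position, so the nested list is a small grayscale version of the wavelet images passed to the network.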

#### 3.3. Deep Transfer Learning

## 4. Evaluation Setup

#### 4.1. Fault Data

#### 4.2. Comparison

- SVM: The support vector machine (SVM) is a machine learning technique based on structural risk minimization [29]. It can perform well on high-dimensional non-linear problems with limited samples. Various signal processing techniques for the time and frequency domains are adopted to manually extract the features of the signals. In the time domain, we use statistical metrics including the mean, standard deviation, root mean square, kurtosis, maximum-to-minimum difference, and median absolute deviation of the signal. Furthermore, spectral analysis extracts useful features for predicting faults in components such as bearings, gears, and engines [21]. We consider the cumulative powers in the low-frequency range of 10–20 Hz, the mid-frequency range of 40–60 Hz, and the high-frequency range above 100 Hz, as well as the frequency of the peak magnitude and the spectral kurtosis peak. Note that these frequency ranges are chosen from the harmonics expected from the specifications of the triplex reciprocating pump model, as we will discuss in Section 5.
- CNN: Figure 3 depicts the structure and the configuration of the CNN, which consists of 18 layers. The hidden layers mainly consist of convolutional, batch normalization, activation, sub-sampling, and dropout layers. We adopt rectified linear units (ReLUs) as the activation function to reduce the training time. The output of each convolutional layer is fed to the max pooling operation of the sub-sampling layer. The softmax function is applied to the output of the last fully connected layer and returns a probability distribution over the eight class labels corresponding to the fault types. The classification accuracy of the CNN depends considerably on the configuration parameters, including the input image size, activation function, filter size, sampling method, and iteration number. A network parameter optimization method is adopted to tune these configuration parameters for classification accuracy [30].
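
The hand-crafted features used for the SVM baseline can be sketched as below. This is an illustrative example rather than the authors' feature pipeline: the function names, the synthetic test signal, the 400 Hz sampling rate, and the inclusive band edges are assumptions, kurtosis is computed in its non-excess form, and the spectral kurtosis peak is omitted for brevity.

```python
import cmath
import math
import statistics


def time_features(x):
    """Time-domain statistics listed in the text (x must not be constant)."""
    mean = statistics.fmean(x)
    std = statistics.pstdev(x)
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    # Non-excess kurtosis: fourth central moment over squared variance.
    kurtosis = sum((v - mean) ** 4 for v in x) / len(x) / std ** 4
    median = statistics.median(x)
    mad = statistics.median([abs(v - median) for v in x])
    return {"mean": mean, "std": std, "rms": rms, "kurtosis": kurtosis,
            "peak_to_peak": max(x) - min(x), "mad": mad}


def power_spectrum(x, fs):
    """One-sided power spectrum via a direct DFT (O(N^2), fine for a short sketch)."""
    n = len(x)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n ** 2)
    return freqs, power


def spectral_features(x, fs):
    """Cumulative band powers and the frequency of the largest non-DC peak."""
    freqs, power = power_spectrum(x, fs)
    pairs = list(zip(freqs, power))
    peak = max(range(1, len(power)), key=power.__getitem__)  # skip the DC bin
    return {
        "low_10_20": sum(p for f, p in pairs if 10.0 <= f <= 20.0),
        "mid_40_60": sum(p for f, p in pairs if 40.0 <= f <= 60.0),
        "high_100_plus": sum(p for f, p in pairs if f > 100.0),
        "peak_freq": freqs[peak],
    }


# Synthetic signal: offset 2, a 15 Hz component, and a weaker 50 Hz component.
fs = 400.0
x = [2.0 + math.sin(2 * math.pi * 15 * n / fs) + 0.5 * math.sin(2 * math.pi * 50 * n / fs)
     for n in range(400)]

feats = time_features(x)
spec = spectral_features(x, fs)
print(feats)
print(spec)
```

Concatenating the values of `feats` and `spec` gives one feature vector per signal, which is the form of input an SVM classifier expects.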

## 5. Performance Evaluation

#### 5.1. Fault Data Analysis

#### 5.2. Diagnosis Performance Analysis

## 6. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References

1. Rai, A.; Kim, J.M. A novel health indicator based on information theory features for assessing rotating machinery performance degradation. IEEE Trans. Instrum. Meas. **2020**, 69, 6982–6994.
2. Chen, J.; Li, Z.; Pan, J.; Chen, G.; Zi, Y.; Yuan, J.; Chen, B.; He, Z. Wavelet transform based on inner product in fault diagnosis of rotating machinery: A review. Mech. Syst. Signal Process. **2016**, 70–71, 1–35.
3. Cerrada, M.; Sánchez, R.V.; Li, C.; Pacheco, F.; Cabrera, D.; Valente de Oliveira, J.; Vásquez, R.E. A review on data-driven fault severity assessment in rolling bearings. Mech. Syst. Signal Process. **2018**, 99, 169–196.
4. Khan, S.; Yairi, T. A review on the application of deep learning in system health management. Mech. Syst. Signal Process. **2018**, 107, 241–265.
5. Park, P.; Ergen, S.C.; Fischione, C.; Lu, C.; Johansson, K.H. Wireless network design for control systems: A survey. IEEE Commun. Surv. Tutor. **2018**, 20, 978–1013.
6. Liu, R.; Yang, B.; Zio, E.; Chen, X. Artificial intelligence for fault diagnosis of rotating machinery: A review. Mech. Syst. Signal Process. **2018**, 108, 33–47.
7. Liu, J.; Ren, Y. A general transfer framework based on industrial process fault diagnosis under small samples. IEEE Trans. Ind. Inform. **2021**, 17, 6073–6083.
8. Zhang, T.; Chen, J.; Xie, J.; Pan, T. SASLN: Signals augmented self-taught learning networks for mechanical fault diagnosis under small sample condition. IEEE Trans. Instrum. Meas. **2021**, 70, 1–11.
9. Xie, J.; Li, Z.; Zhou, Z.; Liu, S. A novel bearing fault classification method based on XGBoost: The fusion of deep learning-based features and empirical features. IEEE Trans. Instrum. Meas. **2021**, 70, 1–9.
10. Li, X.; Jiang, H.; Zhao, K.; Wang, R. A deep transfer nonnegativity-constraint sparse autoencoder for rolling bearing fault diagnosis with few labeled data. IEEE Access **2019**, 7, 91216–91224.
11. Cao, P.; Zhang, S.; Tang, J. Preprocessing-free gear fault diagnosis using small datasets with deep convolutional neural network-based transfer learning. IEEE Access **2018**, 6, 26241–26253.
12. An, Z.; Li, S.; Wang, J.; Jiang, X. A novel bearing intelligent fault diagnosis framework under time-varying working conditions using recurrent neural network. ISA Trans. **2020**, 100, 155–170.
13. Janssens, O.; Slavkovikj, V.; Vervisch, B.; Stockman, K.; Loccufier, M.; Verstockt, S.; Van de Walle, R.; Van Hoecke, S. Convolutional neural network based fault detection for rotating machinery. J. Sound Vib. **2016**, 377, 331–345.
14. Zhang, W.; Li, C.; Peng, G.; Chen, Y.; Zhang, Z. A deep convolutional neural network with new training methods for bearing fault diagnosis under noisy environment and different working load. Mech. Syst. Signal Process. **2018**, 100, 439–453.
15. Azamfar, M.; Singh, J.; Bravo-Imaz, I.; Lee, J. Multisensor data fusion for gearbox fault diagnosis using 2-d convolutional neural network and motor current signature analysis. Mech. Syst. Signal Process. **2020**, 144, 106861.
16. Lv, H.; Chen, J.; Zhang, T.; Hou, R.; Pan, T.; Zhou, Z. SDA: Regularization with cut-flip and mix-normal for machinery fault diagnosis under small dataset. ISA Trans. **2021**, 111, 337–349.
17. Yosinski, J.; Clune, J.; Bengio, Y.; Lipson, H. How transferable are features in deep neural networks? In Proceedings of the International Conference on Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014; pp. 3320–3328.
18. Zhang, A.; Li, S.; Cui, Y.; Yang, W.; Dong, R.; Hu, J. Limited data rolling bearing fault diagnosis with few-shot learning. IEEE Access **2019**, 7, 110895–110904.
19. Saufi, S.R.; Ahmad, Z.A.B.; Leong, M.S.; Lim, M.H. Gearbox fault diagnosis using a deep learning model with limited data sample. IEEE Trans. Ind. Inform. **2020**, 16, 6263–6271.
20. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9.
21. Park, P.; Di Marco, P.; Shin, H.; Bang, J. Fault Detection and Diagnosis Using Combined Autoencoder and Long Short-Term Memory Network. Sensors **2019**, 19, 4612.
22. Abdelgayed, T.S.; Morsi, W.G.; Sidhu, T.S. A new approach for fault classification in microgrids using optimal wavelet functions matching pursuit. IEEE Trans. Smart Grid **2018**, 9, 4838–4846.
23. Tran, M.Q.; Liu, M.K.; Tran, Q.V.; Nguyen, T.K. Effective fault diagnosis based on wavelet and convolutional attention neural network for induction motors. IEEE Trans. Instrum. Meas. **2022**, 71, 1–13.
24. Abdeljaber, O.; Avci, O.; Kiranyaz, M.S.; Boashash, B.; Sodano, H.; Inman, D.J. 1-d CNNs for structural damage detection: Verification on a structural health monitoring benchmark data. Neurocomputing **2018**, 275, 1308–1317.
25. Bechhoefer, E. Rolling Element Bearing Fault Diagnosis Data. Mathworks Inc., 2018. Available online: https://github.com/mathworks/RollingElementBearingFaultDiagnosis-Data (accessed on 21 July 2022).
26. Zhou, S.; Chellappa, R. From sample similarity to ensemble similarity: Probabilistic distance measures in reproducing kernel Hilbert space. IEEE Trans. Pattern Anal. Mach. Intell. **2006**, 28, 917–929.
27. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the International Conference on Learning Representations, San Diego, CA, USA, 7–9 May 2015; pp. 1–6.
28. Miller, S. Triplex Pump with Faults. Mathworks Inc., 2020. Available online: https://github.com/mathworks/Simscape-Triplex-Pump (accessed on 21 July 2022).
29. Wang, S.S.; Chern, A.; Tsao, Y.; Hung, J.W.; Lu, X.; Lai, Y.H.; Su, B. Wavelet speech enhancement based on nonnegative matrix factorization. IEEE Signal Process. Lett. **2016**, 23, 1101–1105.
30. Snoek, J.; Larochelle, H.; Adams, R.P. Practical Bayesian Optimization of Machine Learning Algorithms. In Proceedings of the International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–8 December 2012; pp. 2951–2959.

**Figure 2.** GoogLeNet architecture. Conv and FC denote the convolutional and fully connected layers, and MaxPool and AveragePool denote the max pooling and average pooling layers, respectively; $(a\times b)$ is the filter size of a convolutional layer or the pooling size of a pooling layer. We add a dropout layer after the last average pooling layer.

**Figure 3.** Designed CNN architecture trained from scratch: $(c, a\times b)$ means $c$ filters with the size of $a\times b$.

**Figure 5.** Output flow rate and power spectrum of ${f}_{0}$ and ${f}_{1}$ with different severity levels, $l=2,4,6$.

**Figure 7.** Confusion matrices of (**a**) SVM, (**b**) the CNN, and (**c**) CNN–TL for the fault types ${f}_{0},\dots ,{f}_{7}$.

**Figure 8.** Classification accuracy of SVM, the CNN, and CNN–TL with different ratios of the training data, $r=0.1,\dots ,0.8$.

**Figure 9.** Classification accuracy of SVM, the CNN, and CNN–TL with different severity levels, $l=1,\dots ,9$.

**Figure 10.** Required number of training samples of SVM, the CNN, and CNN–TL with different severity levels, $l=1,\dots ,9$.

| Label | Classes | Number of Samples | Number of Samples per Severity Level |
|---|---|---|---|
| ${f}_{0}$ | healthy state | 3600 | − |
| ${f}_{1}$ | cylinder leak | 3600 | 400 |
| ${f}_{2}$ | blocked inlet | 3600 | 400 |
| ${f}_{3}$ | cylinder leak and blocked inlet | 3600 | 400 |
| ${f}_{4}$ | bearing friction | 3600 | 400 |
| ${f}_{5}$ | cylinder leak and bearing friction | 3600 | 400 |
| ${f}_{6}$ | blocked inlet and bearing friction | 3600 | 400 |
| ${f}_{7}$ | cylinder leak and blocked inlet and bearing friction | 3600 | 400 |
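
The sample counts in the table can be cross-checked with a few lines of arithmetic. The sketch below assumes that the training ratio $r$ of Figure 8 is applied per class as a fraction of that class's 3600 samples; the source does not state how the split is made, so that part is an assumption, as are the variable names.

```python
classes = [f"f{i}" for i in range(8)]  # f0 (healthy) through f7
samples_per_class = 3600
samples_per_level = 400  # applies to the faulty classes f1..f7

# 3600 / 400 = 9 severity levels, matching l = 1, ..., 9 in the figure captions.
levels = samples_per_class // samples_per_level
total_samples = len(classes) * samples_per_class

# Training samples per class for r = 0.1, ..., 0.8 (assumed per-class split).
# Keys are r expressed in tenths to keep the arithmetic exact.
train_per_class = {tenths: samples_per_class * tenths // 10 for tenths in range(1, 9)}

print("severity levels:", levels)
print("total samples:", total_samples)
for tenths, count in train_per_class.items():
    print(f"r = {tenths / 10:.1f}: {count} training samples per class")
```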

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Bang, J.; Di Marco, P.; Shin, H.; Park, P. Deep Transfer Learning-Based Fault Diagnosis Using Wavelet Transform for Limited Data. *Appl. Sci.* **2022**, *12*, 7450.
https://doi.org/10.3390/app12157450

**AMA Style**

Bang J, Di Marco P, Shin H, Park P. Deep Transfer Learning-Based Fault Diagnosis Using Wavelet Transform for Limited Data. *Applied Sciences*. 2022; 12(15):7450.
https://doi.org/10.3390/app12157450

**Chicago/Turabian Style**

Bang, Junseong, Piergiuseppe Di Marco, Hyejeon Shin, and Pangun Park. 2022. "Deep Transfer Learning-Based Fault Diagnosis Using Wavelet Transform for Limited Data" *Applied Sciences* 12, no. 15: 7450.
https://doi.org/10.3390/app12157450