Rolling Bearing Fault Diagnosis Based on Wavelet Packet Decomposition and Multi-Scale Permutation Entropy

This paper presents a rolling bearing fault diagnosis approach that integrates wavelet packet decomposition (WPD) with multi-scale permutation entropy (MPE). The approach uses the MPE values of sub-frequency band signals to identify faults in rolling bearings. Specifically, vibration signals measured from a rolling bearing test system under different defect conditions are decomposed into a set of sub-frequency band signals by the WPD method. Each sub-frequency band signal is then divided into a series of subsequences, and the MPE of every subsequence in that band is calculated. The average MPE value over all subsequences of a sub-frequency band is taken as the fault feature of that band. The MPE values of all sub-frequency bands then form the input feature vector, and a hidden Markov model (HMM) is used to identify the fault pattern of the rolling bearing. An experimental study on a data set from the Case Western Reserve University bearing data center shows that the presented approach can accurately identify faults in rolling bearings.


Introduction
It is important to detect and diagnose rolling element bearing failures in rotating machinery in real time to avoid abnormal event progression and to reduce productivity loss [1]. Among commonly used techniques, vibration-based analysis is widely established for diagnosing bearing faults because structural defects change the dynamic characteristics of the bearing, and these changes are manifested in its vibrations [2]. However, some non-linear factors, such as clearance, friction, and stiffness, affect the complexity of the vibration signals. As a result, accurate evaluation of the working condition of rolling bearings becomes very challenging if only traditional time-domain or frequency-domain analysis is used [3]. In the past few years, various methods have been studied for detecting bearing faults in rotating machines, such as the stator current and vibration harmonic analysis method, the stray flux measurement method, the Park's vector approach, the instantaneous power factor (IPF) monitoring method, and advanced artificial-intelligence-based methods [4,5].
Development of the wavelet transform over the past years has also provided an effective tool for extracting features from transient, time-varying signals for machine fault diagnosis. The research in [6,7] presents a comprehensive overview of the application of wavelets in machine fault diagnosis. The wavelet packet decomposition (WPD) is an extension of the wavelet transform. It has attracted increasing attention because it provides a more flexible time-frequency decomposition, especially in the high-frequency region. The WPD is widely used in various machine fault diagnosis applications because of its excellent performance [2]. For example, the research in [8] used different sets of wavelet packet vectors to represent bearing vibration signals under different defect conditions. It was found that the WPD can improve on the continuous wavelet transform (CWT) in terms of computational cost. It can also resolve the frequency-band mismatch of the discrete wavelet transform (DWT), which splits only the approximation branch [9]. For rolling bearing vibration signal analysis, manifold learning and the wavelet packet transform were combined to extract weak signatures from the waveform feature space [10]. In another study, the WPD was combined with ensemble empirical mode decomposition to enable roller bearing defect detection at the incipient stage [11].
Generally, some features, such as the energy content and the kurtosis value, are extracted from each sub-frequency band of the vibration signal. However, a preliminary study [12] has shown that the energy content has good robustness but relatively low sensitivity for incipient defect detection, whereas kurtosis has high sensitivity to incipient defects but low stability. Combining these two parameters is therefore expected to enhance the defect severity assessment capability. In practice, however, using more parameters in feature extraction does not necessarily improve diagnostic performance, and it increases the computational cost [2,13]. New parameters that facilitate effective feature extraction are therefore needed.
Permutation entropy (PE), a parameter of average entropy, describes the complexity of a time series. It is robust under non-linear distortion of the signal and is also computationally efficient. PE has been used for online chatter detection in turning processes [14], tool flute breakage detection in end milling [15], and status characterization of rotary machines [16]. Multi-scale permutation entropy (MPE), which is based on PE, measures the complexity of a time series at different scales. A diagnosis method based on the MPE and the support vector machine (SVM) has been used to monitor and diagnose the working conditions of rolling bearings [17]. The MPE has also been combined with a Laplacian score to refine features for bearing fault classification with the SVM [18].
In this paper, by taking advantage of the WPD and the MPE, an enhanced method for rolling bearing fault diagnosis is presented. The WPD is used as the pretreatment to decompose a vibration signal into a set of sub-frequency band signals, and the MPE value of each sub-frequency band signal is calculated. All MPE values of each vibration signal are used as a feature vector for a classifier, where the hidden Markov model (HMM) is used to identify the fault pattern of the rolling bearing. The rest of this paper is organized as follows. In Section 2, the fault diagnosis method based on the WPD and the MPE is reviewed, and the proposed method for rolling bearing fault diagnosis is discussed. The evaluation and experiments are presented in Section 3. Finally, concluding remarks are drawn in Section 4.

Wavelet Packet Decomposition
The principle of the WPD can be described as follows [9,19]. Mathematically, a wavelet packet consists of a set of linearly combined wavelet functions, which are generated using a recursive relationship. In this relationship, $x_{j,k}(t)$ denotes the wavelet coefficients at the $j$-th level, $k$-th sub-frequency band.
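For orientation, one common textbook form of the wavelet packet recursion is sketched below. The filter symbols $h$ and $g$ (a low-pass and high-pass quadrature mirror filter pair), the packet functions $u_n$, the scaling function $\varphi$, and the wavelet $\psi$ are assumptions introduced for illustration; the notation and normalization used in [9,19] may differ.

```latex
% Two-scale recursion for the wavelet packet functions (assumed notation):
\begin{align*}
u_{2n}(t)   &= \sqrt{2}\sum_{k} h(k)\,u_{n}(2t-k), \\
u_{2n+1}(t) &= \sqrt{2}\sum_{k} g(k)\,u_{n}(2t-k),
\qquad u_{0}(t)=\varphi(t),\quad u_{1}(t)=\psi(t).
\end{align*}
% Corresponding recursion for the coefficients of level j+1 from level j:
\begin{align*}
x_{j+1,2k}(t)   &= \sum_{m} h(m-2t)\,x_{j,k}(m), \\
x_{j+1,2k+1}(t) &= \sum_{m} g(m-2t)\,x_{j,k}(m).
\end{align*}
```

In this form, every sub-frequency band at level $j$ is split again into a low-pass and a high-pass child, which is what distinguishes the WPD from the DWT.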
Therefore, the signal $x(t)$ can be expressed as:
$$x(t) = \sum_{k=0}^{2^{j}-1} x_{j,k}(t) \tag{5}$$
where the symbols $j$ and $k$ denote the decomposition level and the sub-frequency band, respectively. An example of a 3-level decomposition of the signal $x(t)$ using the wavelet packet decomposition is shown in Figure 1. Given a signal's decomposition as represented in Equation (5), the energy content $E_{j,k}$ in each sub-frequency band is defined as:
$$E_{j,k} = \sum_{i=1}^{N} \left[x_{j,k}(i)\right]^{2} \tag{6}$$
where $N$ is the number of wavelet packet coefficients in each sub-frequency band, and $x_{j,k}(i)$ is the $i$-th coefficient of that band. In most cases, the energy content values can be treated as features to construct a feature vector for defect classification. However, a preliminary study has verified that the energy content has low sensitivity for incipient defect detection [12]. In the following, the MPE technique is integrated with the WPD to achieve more accurate fault classification results.
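The decomposition and the band energy can be illustrated with a minimal sketch. To stay dependency-free it uses the orthonormal Haar filters, which are swapped in here purely for illustration and are not the base wavelet used in this paper; the function names and the test signal are hypothetical.

```python
import math

def haar_step(x):
    """Split x into (approximation, detail) halves with the orthonormal Haar filters."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def wavelet_packet(x, level=3):
    """Return the 2**level coefficient sequences at the given level.

    Unlike the DWT, both the approximation and the detail branch are split
    again at every level. Bands come out in filter-bank (natural) order,
    which is not the same as increasing frequency order.
    """
    bands = [list(x)]
    for _ in range(level):
        next_bands = []
        for band in bands:
            approx, detail = haar_step(band)
            next_bands.extend([approx, detail])
        bands = next_bands
    return bands

def band_energy(band):
    """Energy content of one sub-frequency band: sum of squared coefficients."""
    return sum(c * c for c in band)

# Hypothetical test signal: two sinusoids, 64 samples (a multiple of 2**3).
signal = [math.sin(2 * math.pi * 5 * n / 64) + 0.5 * math.sin(2 * math.pi * 20 * n / 64)
          for n in range(64)]
bands = wavelet_packet(signal, level=3)
energies = [band_energy(b) for b in bands]
```

Because the Haar filters are orthonormal and the signal length is a multiple of 8, the eight band energies add up to the energy of the original signal, which gives a quick sanity check on the decomposition.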

Multi-Scale Permutation Entropy
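The mathematical theory of the PE and the MPE is described in detail in [16][17][18]. According to the Takens-Maine theorem, the phase space of a time series $\{x(i), i = 1, 2, \ldots, N\}$ can be reconstructed as:
$$X(i) = \{x(i), x(i+\tau), \ldots, x(i+(m-1)\tau)\}, \quad i = 1, 2, \ldots, N-(m-1)\tau \tag{7}$$
where $m$ is the embedded dimension and $\tau$ is the time delay. The $m$ real values contained in each $X(i)$ can be arranged in increasing order as:
$$x(i+(j_1-1)\tau) \le x(i+(j_2-1)\tau) \le \cdots \le x(i+(j_m-1)\tau) \tag{8}$$
If two or more elements in $X(i)$ have the same value, e.g., $x(i+(j_1-1)\tau) = x(i+(j_2-1)\tau)$, they are ordered by their original positions, so that for $j_1 < j_2$, $x(i+(j_1-1)\tau) \le x(i+(j_2-1)\tau)$ can be written. Accordingly, any vector $X(i)$ can be mapped onto a group of symbols as:
$$S(l) = (j_1, j_2, \ldots, j_m) \tag{9}$$
where $l = 1, 2, \ldots, k$ and $k \le m!$. $S(l)$ is one of the $m!$ symbol permutations, which is mapped onto the $m$ number symbols $(j_1, j_2, \ldots, j_m)$. With $P_l$ denoting the probability of the $l$-th symbol sequence, the PE of order $m$ is defined as the Shannon entropy of the $k$ symbol sequences:
$$H_P(m) = -\sum_{l=1}^{k} P_l \ln P_l \tag{10}$$
The maximum value of $H_P(m)$ is $\ln(m!)$, obtained when all the symbol sequences have the same probability $P_l = 1/m!$. Therefore, the PE of order $m$ can be normalized as:
$$H_P = \frac{H_P(m)}{\ln(m!)} \tag{11}$$
The value of $H_P$ indicates the degree of randomness of the time series. The greater $H_P$ is, the more random the time series is; conversely, a smaller $H_P$ indicates a more regular time series.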
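The PE calculation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the implementation used in this study: the function name and default parameters are hypothetical, and ties are broken by original position through Python's stable sort.

```python
import math
from collections import Counter

def permutation_entropy(x, m=3, tau=1, normalize=True):
    """Permutation entropy of order m with time delay tau (natural logarithm)."""
    n_vectors = len(x) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for the chosen m and tau")
    counts = Counter()
    for i in range(n_vectors):
        # Reconstructed vector X(i) = (x(i), x(i+tau), ..., x(i+(m-1)tau)).
        vector = [x[i + j * tau] for j in range(m)]
        # Ordinal pattern: indices that sort the vector in increasing order.
        # sorted() is stable, so equal values keep their original order.
        pattern = tuple(sorted(range(m), key=lambda j: vector[j]))
        counts[pattern] += 1
    probs = [c / n_vectors for c in counts.values()]
    h = sum(-p * math.log(p) for p in probs)
    # Normalizing by ln(m!) maps the result into [0, 1].
    return h / math.log(math.factorial(m)) if normalize else h

# A strictly increasing series has a single ordinal pattern, so its PE is zero.
pe_regular = permutation_entropy(list(range(20)), m=3)
# A short irregular series visits both patterns of order 2.
pe_mixed = permutation_entropy([4, 7, 9, 10, 6, 11, 3], m=2)
```

In the second example, four of the six neighboring pairs are increasing and two are decreasing, so the normalized PE is below 1 but well above 0.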
The MPE is employed to estimate the complexity of a signal. It calculates the PE over multiple scales to avoid the contradictory results that a single-scale entropy can give. Shannon entropy neglects the sequential relation between the values of a time series, which makes it better suited to linear systems, whereas the MPE compares neighboring values and therefore suits complex time-domain data. This property makes the MPE more useful for the analysis of non-stationary signals.
Based on the multi-scale technique, the main step in calculating the MPE is to construct the consecutive coarse-grained time series. This is done by averaging the data inside non-overlapping windows of length $l$, which is called the scale factor. The coarse-grained time series can be expressed by:
$$y_l(i) = \frac{1}{l}\sum_{n=(i-1)l+1}^{il} x(n), \quad 1 \le i \le \lfloor N/l \rfloor \tag{12}$$
where $y_l(i)$ denotes the coarse-grained time series at scale $l$. When the scale factor is equal to one, the sequence is the original time series. The MPE is then obtained by calculating the PE of each coarse-grained time series.
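The coarse-graining step can be illustrated with a short sketch. The function name is hypothetical, and dropping the trailing samples that do not fill a complete window is an assumption made here.

```python
def coarse_grain(x, scale):
    """Average x over non-overlapping windows of length `scale`."""
    if scale < 1:
        raise ValueError("scale factor must be a positive integer")
    n_windows = len(x) // scale  # incomplete trailing window is discarded
    return [sum(x[i * scale:(i + 1) * scale]) / scale for i in range(n_windows)]

series = [1, 2, 3, 4, 5, 6, 7]
scale_1 = coarse_grain(series, 1)  # same values as the original series
scale_2 = coarse_grain(series, 2)  # [1.5, 3.5, 5.5]; the last sample is dropped
```

Applying a PE routine to the output of `coarse_grain` for a chosen scale factor gives the MPE value at that scale.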

Fault Diagnosis Based on WPD and MPE
The WPD has been widely applied in signal feature extraction because of its strong analysis ability in the time-frequency domain. Combining it with the MPE, which is useful for the analysis of non-stationary signals, yields the hybrid rolling bearing fault diagnosis approach shown in Figure 2.
The main steps are as follows:
Step 1: The rolling bearing vibration signal is sampled and then processed by the WPD with a three-level decomposition, as shown in Figure 1.
Step 2: Each time series, corresponding to one sub-frequency band signal, is divided into several subsequences of length $w$, with $w = 256$. The subsequences are obtained with the maximum overlap; that is, each subsequence is shifted by one data point to obtain the next subsequence. Then, the MPE values of all subsequences from one sub-frequency band signal are calculated using Equation (10).
Step 3: The average of the MPE values for each sub-frequency band is calculated, and this average is taken as the fault feature of that sub-frequency band signal. The fault feature vector of each rolling bearing vibration signal is then formed from the features of all its sub-frequency bands.
Step 4: The feature vectors are scalar-quantized with the index calculation formula of the Lloyd's algorithm given in Equation (13) [20], and the quantized feature vectors of each working condition are used to train the HMM for that condition. In Equation (13), $N$ is the length of the codebook vector, $partition(i)$ is the partition vector with the length of $N − 1$, and $x$ is the feature vector to be quantized.
Step 5: A test vibration signal is then acquired for diagnosis, and its feature vector is extracted in the same way. The feature vector is fed into the trained HMMs, and the HMM with the maximum probability gives the classification result [21,22].
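Steps 4 and 5 can be sketched as follows. This is a minimal illustration under several assumptions: the quantization index is zero-based, the quantized feature vector is treated as the observation sequence of a discrete HMM, and the partition values, model names, and model probabilities are hypothetical rather than trained. Training itself (for example with the Baum-Welch algorithm) is not shown.

```python
import math

def quantize(x, partition):
    """Zero-based index of the interval of `partition` that contains x.

    Returns 0 if x <= partition[0] and len(partition) if x > partition[-1].
    """
    return sum(1 for edge in partition if x > edge)

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm for a discrete HMM; returns ln P(obs | model)."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then emit obs[t].
            alpha = [sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][obs[t]]
                     for s in range(n_states)]
        norm = sum(alpha)
        if norm == 0.0:
            return float("-inf")
        log_p += math.log(norm)
        alpha = [a / norm for a in alpha]  # rescale to avoid underflow
    return log_p

def classify(obs, models):
    """Name of the model under which the observation sequence is most likely."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))

# Hypothetical values: a three-symbol codebook and two toy two-state HMMs.
partition = [0.3, 0.6]
transitions = [[0.9, 0.1], [0.1, 0.9]]
models = {
    "normal": ([0.5, 0.5], transitions, [[0.80, 0.15, 0.05], [0.60, 0.30, 0.10]]),
    "inner_race": ([0.5, 0.5], transitions, [[0.05, 0.15, 0.80], [0.10, 0.30, 0.60]]),
}
features = [0.72, 0.81, 0.65, 0.90, 0.77, 0.70, 0.88, 0.93]  # eight band features
symbols = [quantize(v, partition) for v in features]
label = classify(symbols, models)
```

All eight hypothetical features fall above the upper partition edge, so the sequence consists of the highest symbol only and the toy "inner_race" model, which emits that symbol with high probability, is selected.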

Evaluation Using the Simulated Signal
Three signals x1(t), x2(t), and x3(t) are simulated as shown in Figure 3. Each signal consists of a set of Gaussian-type impulses with different amplitudes and white noise. The relative bandwidth of the Gaussian-type impulses in the signal x1(t) is 0.5, and the center frequency is 100 Hz. The relative bandwidth of the Gaussian-type impulses in the signal x2(t) is 0.4, and the center frequency is 50 Hz. The relative bandwidth of the Gaussian-type impulses in the signal x3(t) is 0.3, and the center frequency is 150 Hz. Since the characteristics of the simulated signals are very similar to those of real fault signals, the simulation results can verify the validity of the proposed method to a certain extent. Considering the effectiveness of the decomposition level as well as the computational complexity, a three-level WPD is adopted for data processing, which decomposes each simulated signal into eight sub-frequency band signals. The reverse biorthogonal wavelet 5.5 is chosen as the base wavelet of the decomposition. Figure 4 shows each sub-frequency band signal of x1(t), x2(t), and x3(t). Table 1 shows the PE of each sub-frequency band signal after a moving average computation. There is relatively little difference between the PE values of the sub-frequency band signals, and no obvious trend can be identified. Table 2 shows the MPE of each sub-frequency band signal after the moving average computation. Comparing the MPE values of x1(t) with those of x2(t) and x3(t), it can be seen that the three groups of MPE values are clearly distributed in different ranges. After scalar quantization, the feature vectors are used to train the HMM for signal classification. A total of 120 feature vectors were collected from the three groups of signals using the proposed approach. One-third of the feature vectors in each condition were used for training the classifier, and the others were used for testing. The results of the signal classification are listed in Table 3.
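The simulated signals described at the start of this section can be approximated with the sketch below. Only the center frequencies and relative bandwidths come from the text; the Gaussian-modulated sinusoid with a -6 dB bandwidth reference, the sampling rate, the impulse spacing, the amplitude, and the noise level are all assumptions chosen for illustration.

```python
import math
import random

def gauss_pulse(t, fc, bw, bwr=-6.0):
    """Gaussian-modulated sinusoid with center frequency fc (Hz) and
    fractional bandwidth bw, measured at bwr dB below the spectral peak."""
    ref = 10.0 ** (bwr / 20.0)
    a = -(math.pi * fc * bw) ** 2 / (4.0 * math.log(ref))  # positive, since ref < 1
    return math.exp(-a * t * t) * math.cos(2.0 * math.pi * fc * t)

def simulate(fc, bw, fs=1000.0, duration=1.0, period=0.25,
             amplitude=1.0, noise_std=0.1, seed=0):
    """Periodic Gaussian-type impulses plus white Gaussian noise."""
    rng = random.Random(seed)
    n_samples = int(fs * duration)
    centers = [period * (k + 0.5) for k in range(int(duration / period))]
    samples = []
    for i in range(n_samples):
        t = i / fs
        value = sum(amplitude * gauss_pulse(t - c, fc, bw) for c in centers)
        samples.append(value + rng.gauss(0.0, noise_std))
    return samples

x1 = simulate(fc=100.0, bw=0.5, seed=0)
x2 = simulate(fc=50.0, bw=0.4, seed=1)
x3 = simulate(fc=150.0, bw=0.3, seed=2)
```

Each call returns one second of data at the assumed 1 kHz sampling rate; changing `amplitude` per signal would reproduce the different impulse amplitudes mentioned in the text.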
The results in Table 3 indicate that the presented method based on the WPD and the MPE can effectively identify the different signals, with an overall classification rate of 95.6%. For comparison, the classification rate using the MPE alone was also calculated, and a rate of 90% was obtained. This verifies that the proposed signal classification method improves, to a certain extent, on the MPE-alone method.

Evaluation Using Experimental Data
In order to illustrate the practicability and effectiveness of the proposed method, a bearing fault data set from the Case Western Reserve University bearing data center is analyzed [23]. The data set was acquired from the test stand shown in Figure 5, which consists of a 2 hp motor, a torque transducer, a dynamometer, and control electronics. The test bearings, which support the motor shaft, are deep groove ball bearings of type 6205-2RS JEM SKF. Single point faults with a diameter of 0.18 mm were introduced to the inner raceway, outer raceway, and ball of the test bearings using electro-discharge machining. Vibration data were collected at 12,000 samples per second using accelerometers, which were attached to the housing with magnetic bases. Accelerometers were placed at the 12 o'clock position at both the drive end and the fan end of the motor housing. The motor load level was controlled by the fan on the right side of Figure 5. For performance comparison between the MPE and the PE, the sample vibration signals shown in Figure 6 are used for analysis, and the corresponding single factor analysis result is shown in Figure 8. In the processing, the scale of the MPE is set to 4, based on the research in [17] and on experiments. From Figure 8, it can be seen that the MPE differentiates the bearing conditions better than the PE. The parameters in Tables 4 and 5 were quantized by the Lloyd's algorithm in Equation (13) to form the feature vectors for training the HMMs of the different conditions.
A total of 160 feature vectors were collected from the four conditions. One-fourth of the feature vectors were used for training the classifier and the others for signal classification, and the classification results are listed in Table 6. Out of 120 test feature vectors, only seven cases were not correctly classified, and the overall classification rate is 94.2%. For comparison, Table 7 lists the classification results of the WPD-PE method, and Table 8 lists the classification results of the MPE-alone method. The comparison shows that the proposed method is effective for rolling bearing fault diagnosis, and that its overall classification rate is higher, to a certain extent, than those of the MPE method and the WPD-PE method. In order to further verify the applicability of the proposed method, the signals measured under the 2 hp motor load, which are shown in Figure 7, are processed. The classification result is listed in Table 9. The overall classification rate of the proposed fault detection method under the 2 hp motor load condition is 93.3%, whereas the classification rate of the MPE method alone under this condition is only 84.2%. That is to say, the proposed fault detection method has good applicability.
Table 9. Classification results of signals shown in Figure 7 with the proposed method.
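The test-set size and the overall classification rate quoted for the 0 hp experiment follow from simple arithmetic, which the short sketch below reproduces. The function name is hypothetical; the counts are those stated in the text.

```python
def classification_rate(n_test, n_wrong):
    """Overall classification rate in percent."""
    return 100.0 * (n_test - n_wrong) / n_test

total_vectors = 160
n_train = total_vectors // 4        # one-fourth used for training
n_test = total_vectors - n_train    # remaining vectors used for testing
rate = classification_rate(n_test, n_wrong=7)
print(f"{n_test} test vectors, {rate:.1f}% correctly classified")
```

With 40 training vectors and 120 test vectors, seven misclassifications correspond to 113 correct decisions.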

Conclusions
Aiming at diagnosing rolling bearing faults, a hybrid approach that integrates the WPD with the MPE is proposed in this paper. The WPD is used as the pretreatment to decompose a vibration signal into a set of sub-frequency band signals, and the MPE value of each sub-frequency band signal is calculated. All MPE values of each vibration signal form a feature vector that is used as the input to a classifier, where the HMM is chosen to characterize the bearing faults. Compared with the WPD-PE approach, the proposed approach has been shown to achieve a higher classification rate (e.g., 95.6% for simulated signals and 94.2% for experimental data). Since the approach presented in this study is generic in nature, it can be readily adapted to a broad range of applications in machine fault diagnosis.

Figure 2. Flow chart of the proposed fault diagnosis method.

Figures 6 and 7 illustrate representative waveforms of the sample vibration signals measured from the test bearings under four initial conditions: (a) signal from a healthy bearing, (b) signal from a bearing with inner ring defect, (c) signal from a bearing with rolling element defect, and (d) signal from a bearing with outer ring defect. Signals in Figure 6 are measured under 0 hp motor load with a motor speed of 1797 rpm, and signals in Figure 7 are measured under 2 hp motor load with a motor speed of 1750 rpm.

Figure 6. Vibration signal waveforms of different conditions (0 hp motor load). (a) healthy bearing, (b) a bearing with inner ring defect, (c) a bearing with rolling element defect and (d) a bearing with outer ring defect.

Figure 7. Vibration signal waveforms of different conditions (2 hp motor load). (a) healthy bearing, (b) a bearing with inner ring defect, (c) a bearing with rolling element defect and (d) a bearing with outer ring defect.
Each signal shown in Figure 6 is first decomposed into eight sub-frequency band signals. Then, the PE and MPE values of each sub-frequency band signal are calculated. The results of the PE and the MPE are shown in Tables 4 and 5, respectively.

Table 1. Permutation entropy (PE) of each sub-frequency band.

Table 4. Permutation entropy (PE) value of each sub-frequency band.

Table 5. Multi-scale permutation entropy (MPE) value of each sub-frequency band.

Table 6. Classification results of the method based on wavelet packet decomposition (WPD) and multi-scale permutation entropy (MPE).

Table 7. Classification results of the wavelet packet decomposition and permutation entropy (WPD-PE) method.

Table 8. Classification results of the MPE method.