Improved LMD, Permutation Entropy and Optimized K-means to Fault Diagnosis for Roller Bearings

This paper proposes a novel method for extracting and recognizing fault features in bearing vibration signals, based on improved local mean decomposition (LMD), permutation entropy (PE) and an optimized K-means clustering algorithm. The improved LMD exploits the self-similarity of roller bearing vibration signals to extend the original signal on both its left and right sides, which suppresses the edge effect. After the extended signal is decomposed into a set of product functions (PFs), PE is used to quantify the complexity of each PF component and, at the same time, to extract the fault features. The optimized K-means algorithm is then applied to cluster analysis as a new pattern recognition approach; it uses the probability density distribution (PDD) to select the initial centroids and achieves higher recognition accuracy than the classic algorithm. Finally, the experimental results show that the proposed method is effective for fault feature extraction and recognition in roller bearings.


Introduction
Roller bearings are among the most common parts of rotating machinery and play a key role in such systems. Under high-speed, heavy-load working conditions, failures of varying degrees appear at different locations of the bearings, and these are probably related to almost 50% of all motor faults [1]. To monitor the health condition of bearings, vibration-based signal processing techniques are regarded as the most effective means of diagnosing roller bearing faults, because vibration signals carry a great deal of useful information about the failures [2,3]. Furthermore, it is generally accepted that vibration-based signal processing consists of two major aspects: fault feature extraction and fault pattern recognition [4].
Many techniques have been put forward to extract fault characteristics from vibration signals, such as time-domain analysis, frequency-domain analysis and time-frequency analysis [5]. However, vibration signals are in most cases non-linear, non-Gaussian and non-stationary, so the traditional time-domain and frequency-domain techniques, which assume a linear system, may not be suitable for detecting faults from them [6]. Much research has therefore been devoted to time-frequency analysis, which has proved able to detect dynamic changes in such vibration signals effectively. The short-time Fourier transform (STFT) [7] cannot achieve high time resolution and high frequency resolution simultaneously. The Wigner-Ville distribution [...]

Entropy 2016, 18, 70

1. Find all the local extrema $n_i$ of the raw signal $x(t)$, and calculate the $i$th mean value $m_i$ and the $i$th envelope estimate $a_i$, respectively.

$$m_i = (n_i + n_{i+1})/2 \qquad (1)$$

2. Smooth the local means and the local envelope estimates separately using the moving average (MA) method to obtain the local mean function $m_{11}(t)$ and the local envelope function $a_{11}(t)$.
3. Obtain a frequency modulated signal $s_{11}(t)$.

$$s_{11}(t) = (x(t) - m_{11}(t))/a_{11}(t)$$

If $s_{11}(t)$ is not a purely frequency modulated signal, regard it as the new original signal and repeat steps 1 to 3 until $s_{1n}(t)$ is purely frequency modulated, that is, until the envelope function $a_{1(n+1)}(t)$ equals 1.
4. Multiply together all the envelope estimates obtained during the iterative process to get the envelope signal $a_1(t)$.

$$a_1(t) = a_{11}(t)\,a_{12}(t) \cdots a_{1n}(t) = \prod_{q=1}^{n} a_{1q}(t)$$

5. Form the first PF as the product of the envelope signal $a_1(t)$ and the purely frequency modulated signal $s_{1n}(t)$.

$$PF_1 = a_1(t)\,s_{1n}(t)$$

6. Finally, separate $PF_1$ from $x(t)$ to obtain a new signal $u_1(t)$. Repeat the whole process until $u_k(t)$ becomes constant or monotonic. The raw signal can then be reconstructed according to

$$x(t) = \sum_{p=1}^{k} PF_p(t) + u_k(t) \qquad (2)$$
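The six steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the envelope estimate $a_i = |n_i - n_{i+1}|/2$ is the usual LMD definition and does not appear in the text above, the moving-average window (one third of the longest extrema spacing), the iteration cap `max_sift`, the tolerance `tol` and all function names are hypothetical choices, and no boundary processing is applied.

```python
def local_extrema(x):
    """Indices of the interior local maxima/minima of x, plus both end points."""
    idx = [0]
    for i in range(1, len(x) - 1):
        if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0:
            idx.append(i)
    idx.append(len(x) - 1)
    return idx


def moving_average(y, window):
    """Centred moving average; the window shrinks near the two ends."""
    half = window // 2
    out = []
    for i in range(len(y)):
        seg = y[max(0, i - half): i + half + 1]
        out.append(sum(seg) / len(seg))
    return out


def mean_and_envelope(x):
    """Steps 1-2: piecewise-constant local mean and envelope, then MA smoothing."""
    ext = local_extrema(x)
    mean, env = [0.0] * len(x), [0.0] * len(x)
    for lo, hi in zip(ext[:-1], ext[1:]):
        m = (x[lo] + x[hi]) / 2.0      # Equation (1)
        a = abs(x[lo] - x[hi]) / 2.0   # assumed: usual LMD envelope estimate
        for t in range(lo, hi + 1):
            mean[t], env[t] = m, a
    longest = max(hi - lo for lo, hi in zip(ext[:-1], ext[1:]))
    window = max(3, longest // 3)      # hypothetical window choice
    return moving_average(mean, window), moving_average(env, window)


def extract_pf(x, max_sift=10, tol=1e-3, eps=1e-12):
    """Steps 3-5: iterate s = (s - m) / a and accumulate the envelope product."""
    s = list(x)
    a_total = [1.0] * len(x)
    for _ in range(max_sift):
        m, a = mean_and_envelope(s)
        a = [max(v, eps) for v in a]   # guard against division by zero
        s = [(sv - mv) / av for sv, mv, av in zip(s, m, a)]
        a_total = [at * av for at, av in zip(a_total, a)]
        # Stop once the envelope used in this pass was already close to 1.
        if max(abs(av - 1.0) for av in a) < tol:
            break
    return [at * sv for at, sv in zip(a_total, s)]


def lmd(x, max_pfs=3):
    """Step 6: peel off PFs until the residual has no interior extrema."""
    pfs, residual = [], list(x)
    for _ in range(max_pfs):
        if len(local_extrema(residual)) < 3:  # constant or monotonic residual
            break
        pf = extract_pf(residual)
        pfs.append(pf)
        residual = [r - p for r, p in zip(residual, pf)]
    return pfs, residual
```

By construction, the sum of the returned PFs plus the residual reproduces the input signal, which mirrors the reconstruction relation in step 6. The quality of the individual PFs depends strongly on the smoothing and stopping choices assumed here.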

2.2. The Boundary Processing Method
Because the local extreme points at the beginning and at the end of a finite-duration signal are indeterminate, border distortion, known as the edge effect, appears. Here a multi-component AM-FM simulation signal $x(t)$ is given.
where t = 0:1/1000:2 and the sampling frequency is set to 1000 Hz. The decomposition results of the simulation signal $x(t)$ by the original LMD are shown in Figure 1. Figure 1 clearly shows that the raw signal is decomposed into three PFs, corresponding to $x_1(t)$, $x_2(t)$ and $x_3(t)$, respectively, and a constant residual $R(t)$; however, both ends of each PF are distorted to different extents. Furthermore, the time-frequency representation obtained by combining the instantaneous frequencies (IFs) and instantaneous amplitudes (IAs), shown in Figure 2, exhibits a "swing" caused by the edge effect. This swing at the edges of the signal waveform can introduce additional errors into the calculation of the PE, which is based on permutation patterns obtained by comparing neighboring values of the signal. In addition, as the number of iterations increases, the divergence gradually "pollutes" the whole decomposition process and ultimately leads to fatal results.

To solve the edge effect problem, this paper proposes a novel method for extending the signal: self-similar continuation. Self-similarity means that a signal is exactly or approximately similar to a part of itself, and it is a typical property of fractals. In engineering, many kinds of signals are fractal systems with statistical self-similarity, either global or local [23-26]. The method is based on waveform matching and makes the extension follow the trend of the original signal as closely as possible, so as to preserve its inner rules and characteristics. Focusing on the left extension, the leftmost data point is $x(1)$, and $m_i$, $n_i$ ($i = 1, 2, 3, \cdots$) are defined as the maximum and minimum values of the given signal $x(t)$, with corresponding times $tm_i$ and $tn_i$, respectively.

1. Build a characteristic waveform, which is a triangular waveform based on the three points $x(1)$-$m_1$-$n_1$.
2. Calculate all the start points $x(tx_i)$ and search for the interval $x(i)$-$m_i$-$n_i$ that best matches the characteristic waveform. This is the self-similarity matching process, and the corresponding time is given by Equation (3).
3. Find the best extension of the signal by means of the shape error parameter $e(i)$, without considering the order of magnitude.
4. Extend the right end in the same way. The extended signal is thereby obtained.
The decomposition results and the time-frequency representation of the mono-components derived from the improved LMD are shown in Figures 3 and 4. Figures 3 and 4 clearly show that the edge effect is greatly reduced, especially near the right end of the time-frequency representation. The analysis results therefore confirm that the improved LMD based on self-similar continuation can significantly decrease the border distortion.
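The idea of extending a signal by waveform matching can be illustrated with the simplified sketch below. It is loudly simplified: instead of the extrema-based triangular waveform and the shape error $e(i)$ of Equations (3) and (4), it matches the first `template_len` samples against every later segment using a mean-removed squared error, then copies the samples that precede the best match. The function names and the parameters `n_ext` and `template_len` are hypothetical.

```python
def extend_left(x, n_ext, template_len):
    """Return n_ext samples to prepend to x, found by waveform matching.

    The first template_len samples act as the characteristic waveform. The
    segment that matches it best (after removing each segment's mean level)
    is located, and the n_ext samples just before that segment are returned.
    """
    template = x[:template_len]
    t_mean = sum(template) / template_len
    best_start, best_err, best_mean = None, float("inf"), 0.0
    for start in range(max(1, n_ext), len(x) - template_len + 1):
        seg = x[start:start + template_len]
        s_mean = sum(seg) / template_len
        err = sum(((a - t_mean) - (b - s_mean)) ** 2
                  for a, b in zip(template, seg))
        if err < best_err:
            best_start, best_err, best_mean = start, err, s_mean
    if best_start is None:
        raise ValueError("signal too short for the requested extension")
    shift = t_mean - best_mean  # align the mean level of the copied samples
    return [v + shift for v in x[best_start - n_ext:best_start]]


def extend_both(x, n_ext, template_len):
    """Extend x on the left and, by mirroring the same search, on the right."""
    x = list(x)
    left = extend_left(x, n_ext, template_len)
    right = extend_left(x[::-1], n_ext, template_len)[::-1]
    return left + x + right
```

In a pipeline of the kind described here, the extended signal would be decomposed and the added samples discarded afterwards, so that any distortion falls on the extension rather than on the original data.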

Permutation Entropy
Permutation entropy calculates entropy based on permutation patterns obtained by comparing neighboring values of the time series [19]. It directly accounts for the temporal information contained in the time series, which allows it to detect dynamical changes and contributes to the understanding of complex and chaotic systems.

For a given time series $x = \{x_t : t = 1, \ldots, N\}$, a vector composed of $D$ subsequent values is constructed:

$$X_i^D = \{x(i), x(i+\tau), \ldots, x(i+(D-1)\tau)\} \qquad (5)$$

where $D$ is the embedding dimension, which determines how much information is contained in each vector, $\tau$ is the time delay, and $i = 1, 2, \ldots, N-(D-1)\tau$. $X_i^D$ is a new time series, and it has a permutation $\pi_j = (j_1, j_2, \ldots, j_D)$ if it satisfies

$$x(i+(j_1-1)\tau) \le x(i+(j_2-1)\tau) \le \cdots \le x(i+(j_D-1)\tau) \qquad (6)$$

Furthermore, the relative frequency of each permutation can be defined as

$$p(\pi_j) = \frac{\#\{X_i^D \text{ has type } \pi_j\}}{N-(D-1)\tau} \qquad (7)$$

According to Shannon's entropy of the $D!$ distinct symbols, the PE of a time series can be defined as follows:

$$H_p(D) = -\sum_{\pi_j \in S_D} p(\pi_j)\ln(p(\pi_j)) \qquad (8)$$

For convenience, Equation (8) can be normalized by $\ln(D!)$:

$$H_p = H_p(D)/\ln(D!) = -\frac{1}{\ln(D!)}\sum_{i=1}^{k} p(\pi_i)\ln(p(\pi_i)) \qquad (9)$$

Equation (5) indicates that two main parameters must be determined: the embedding dimension $D$ and the time delay $\tau$. The evaluation of the appropriate probability distribution relies on the embedding dimension, since $D$ determines the number of accessible states, $D!$. For practical purposes, it is adequate to use $3 \le D \le 7$ and $\tau = 1$ to calculate the PE, as in [27].
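As a concrete illustration of the definition above, the following sketch computes the normalized PE of a sequence in plain Python. The function name and the tie-breaking rule (equal values are ordered by their position in the window) are assumptions made for the example.

```python
import math
from collections import Counter


def permutation_entropy(x, D=3, tau=1, normalize=True):
    """Permutation entropy of sequence x for embedding dimension D and delay tau."""
    if D < 2:
        raise ValueError("D must be at least 2")
    n_vectors = len(x) - (D - 1) * tau
    if n_vectors <= 0:
        raise ValueError("time series too short for the chosen D and tau")

    counts = Counter()
    for i in range(n_vectors):
        window = [x[i + k * tau] for k in range(D)]
        # Ordinal pattern: the index order that sorts the window ascending.
        # sorted() is stable, so ties are broken by position (an assumption).
        pattern = tuple(sorted(range(D), key=lambda k: window[k]))
        counts[pattern] += 1

    # Shannon entropy of the relative frequencies of the observed patterns.
    h = 0.0
    for c in counts.values():
        p = c / n_vectors
        h -= p * math.log(p)

    if normalize:
        h /= math.log(math.factorial(D))  # divide by ln(D!)
    return h
```

A strictly increasing sequence produces a single ordinal pattern and therefore a PE of zero, while an uncorrelated random sequence visits all $D!$ patterns almost equally often and gives a normalized PE close to one.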

K-means Clustering Algorithm
After the fault features have been extracted by the improved LMD and PE, the condition of the roller bearings must be classified. The K-means clustering algorithm has become the most popular method for unsupervised clustering because of its simple idea, straightforward formulation and good convergence [28]. The detailed process of the K-means clustering algorithm is described below.

1. Randomly initialize the cluster centroids for a given data set.
2. Calculate the distance between every point and each cluster centroid. Assign each point to the cluster represented by its nearest centroid, according to the shortest distance principle.
3. Calculate the mean value of every cluster and take it as the new cluster centroid.
4. Compare the new centroids with the previous ones, or check the convergence of the cluster objective function. Repeat steps (2) and (3) until the cluster centroids remain unchanged or the function converges.
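The four steps can be written compactly in plain Python, as in the sketch below. The function name, the optional `init` argument for supplying starting centroids, and the handling of empty clusters (the old centroid is kept) are assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, init=None, max_iter=100, seed=0):
    """Plain K-means on a list of equal-length numeric tuples.

    Returns (centroids, labels, iterations). `init` optionally supplies the
    k starting centroids; otherwise they are drawn at random (step 1).
    """
    rng = random.Random(seed)
    if init is None:
        centroids = [tuple(p) for p in rng.sample(points, k)]
    else:
        centroids = [tuple(c) for c in init]

    for iteration in range(1, max_iter + 1):
        # Step 2: assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Step 3: the mean of each cluster becomes its new centroid.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # empty cluster: keep old centroid
        # Step 4: stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels, iteration
```

Because the result depends on the starting centroids, running the function with different `seed` values can give different partitions on harder data sets, which is the sensitivity that motivates a more careful initialization.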
The clustering quality of the K-means algorithm depends heavily on the initialization of the cluster centers. Because the initial selection is random, the clustering quality cannot be guaranteed. This paper therefore selects the peaks of the data set's probability density curve as the initial values, an approach called optimized K-means. In theory, a probability density function (PDF), or density, of a continuous random variable is a function that describes the relative likelihood for this random variable to take on a given value [29]. In other words, it describes the probability of finding points near the given value. It is therefore reasonable to choose the maxima of the PDF as the initial centers of the K-means algorithm. In practice, we construct the 3-dimensional data set shown in Figure 5 to compare the optimized K-means with the traditional algorithm in terms of classification accuracy and number of iterations. Part of the comparison results are shown in Table 1. As Table 1 shows, the optimized K-means achieves higher classification accuracy for the same number of iterations.
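One way to pick density peaks as initial centroids is sketched below. The text does not specify how the density is estimated or how several peaks are chosen, so everything here is an assumption: the density is an unnormalized Gaussian kernel estimate evaluated at the data points, and the `k` densest points that lie at least `min_separation` apart are returned. The function name and the parameters `bandwidth` and `min_separation` are hypothetical.

```python
import math


def density_peak_init(points, k, bandwidth=1.0, min_separation=None):
    """Choose k initial centroids at well-separated high-density data points."""
    if min_separation is None:
        min_separation = 2.0 * bandwidth  # assumed default spacing between peaks

    # Unnormalised Gaussian kernel density estimate at every data point.
    density = [
        sum(math.exp(-math.dist(p, q) ** 2 / (2.0 * bandwidth ** 2))
            for q in points)
        for p in points
    ]

    # Visit points from densest to sparsest, keeping those far enough from
    # the centroids already selected.
    order = sorted(range(len(points)), key=lambda i: density[i], reverse=True)
    centroids = []
    for i in order:
        if all(math.dist(points[i], c) >= min_separation for c in centroids):
            centroids.append(tuple(points[i]))
        if len(centroids) == k:
            break

    if len(centroids) < k:
        raise ValueError("could not find k sufficiently separated density peaks")
    return centroids
```

The returned list can be passed as the starting centroids of any K-means routine in place of a random draw. The choice of `bandwidth` matters: too small a value makes every point its own peak, and too large a value merges neighboring clusters into one.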

The Fault Feature Extraction Combining Improved LMD and PE
To verify the effectiveness of the improved LMD and PE in fault feature extraction, the proposed approach is applied to the analysis of experimental bearing vibration signals. All the experimental data in this paper are obtained from the website of the Case Western Reserve University lab [30], and a sketch of the experimental system is given in Figure 6, in which an SKF bearing is used as the test object. The test stand mainly consists of a 2 hp motor, a torque transducer, a dynamometer and control electronics. The vibration signals are collected under four conditions: normal, inner race fault (IRF), outer race fault (ORF) and ball fault (BF). Faults are introduced into the test bearings using electro-discharge machining, with fault diameters of 0.007 inches, 0.014 inches, 0.021 inches and 0.028 inches.
The improved LMD uses the self-similarity of the signal to extend it and thereby suppress the edge effect. First, we demonstrate the self-similarity of the vibration signals. For example, Figure 7 shows the temporal distributions of a roller bearing vibration signal under the same inner race fault condition at different sampling frequencies. The trend of the vibration signal at a high sampling frequency is similar to the waveform at the lower one. The roller bearing vibration signals therefore exhibit self-similarity, and the improved LMD can be used to decompose them.

The PE values for all four conditions are calculated and shown in Figure 8. Figure 8 shows that the four conditions are distinguished effectively and that the PE values of the PFs further reflect the complexity of the vibration signal. The PE values of normal roller bearings are smaller than those of roller bearings under fault conditions. When a fault occurs in a roller bearing, the dynamic system changes, which produces one more PF, corresponding to the failure frequency, and a larger PE than in the normal condition. The fault features of the roller bearing vibration signal have thus been extracted based on the improved LMD and PE, and the first three PE values are chosen for classification.

The Fault Pattern Recognition Based on the Optimized K-means Clustering Algorithm
Fault pattern recognition for roller bearings can be carried out on the fault feature vectors obtained by the improved LMD and PE. The state classification results of the optimized K-means are shown in Table 2, from which we can see that the actual clustering results of K-means are highly consistent with the target ones. Thus, the proposed method combining the improved LMD, PE and the optimized K-means can realize fault feature extraction and fault pattern recognition effectively.

Conclusions
To extract the fault features and recognize the fault patterns of bearing vibration signals, this paper proposes a novel approach combining the improved LMD, PE and the optimized K-means. The analysis results from the simulation signal and the experimental data demonstrate the advantages of the approach, which are as follows.

Figure 2. The time-frequency representation of the product functions (PFs) derived from the LMD.

Figure 3. The improved LMD decomposition results of $x(t)$.

Figure 4. The time-frequency representation of the PFs derived from the improved LMD.

Figure 5. The sample data sets for clustering.

Figure 8. The permutation entropy (PE) values for all of the four conditions.

Table 1. The comparison of clustering quality based on K-means and optimized K-means.

Table 2. State classification results of the optimized K-means.