Feature Extraction Method of Rolling Bearing Fault Signal Based on EEMD and Cloud Model Characteristic Entropy

The randomness and fuzziness that accompany rolling bearing faults introduce uncertainty into the acquired signals and reduce the accuracy of signal feature extraction. To solve this problem, this study proposes a new method in which cloud model characteristic entropy (CMCE) is used as the signal eigenvalue. This approach avoids the complex parameter selection that traditional entropy measures require when handling uncertainty problems. First, the acoustic emission signals collected experimentally under normal and damage states of a rolling bearing are decomposed via ensemble empirical mode decomposition. The mutual information method is then used to select the sensitive intrinsic mode functions that reflect the signal characteristics, from which the signal is reconstructed to eliminate noise interference. Subsequently, CMCE is used as the eigenvalue of the reconstructed signal. Finally, an experimental comparison of sample entropy, root mean square, and CMCE shows that CMCE better represents the characteristic information of the fault signal.


Introduction
The rolling bearing is the most common element of rotating machinery. Approximately 30% of mechanical faults occur in rolling bearings; hence, the detection and diagnosis of rolling bearing faults are popular issues among scholars worldwide. In the last 10 years, acoustic emission (AE) technology has become significant in monitoring the states of rolling bearings. This technology can effectively detect early signs of pitting corrosion defects and crack initiation, and thus prevent disastrous consequences [1,2]. Researchers have investigated feature extraction methods for AE signals from bearings. A.M. Al-Ghamdi compared the root mean square (RMS), amplitude, and kurtosis values of the AE and vibration signals from the outer race fault of a rolling bearing. He asserted that AE technology was more effective than vibration technology in early fault diagnosis [3]. P. Beck investigated AE parameters, such as the relationship between physical properties and AE energy in material fracturing. The test results revealed a linear relationship between AE energy and fracture area or depth [4]. B. Kilundu employed a cyclostationary technique and proposed an indicator that was more sensitive in the continuous monitoring of defects than traditional temporal indicators (e.g., RMS, kurtosis, crest factor) [5].
Considering the poor working environment of mechanical equipment, signals collected on site are frequently heavily contaminated by noise. To eliminate noise from signals, some researchers introduced wavelet noise reduction technology into the feature extraction of AE signals and achieved good results. However, this method exhibits difficulties in selecting the wavelet basis and determining the threshold. Empirical mode decomposition (EMD) does not have a fixed basis function, and thus the difficulty of selecting a wavelet basis in wavelet analysis can be avoided. This technique is more effective in non-stationary signal de-noising than wavelet transform methods [6]. However, EMD suffers from the mode mixing problem, which can distort the decomposed intrinsic mode functions (IMFs). To solve the mode mixing problem in EMD de-noising, Huang proposed ensemble EMD (EEMD) to perform signal decomposition and enhance the thoroughness of de-noising [7].
Given the complex structure of mechanical equipment, the randomness and fuzziness in fault causes, fault phenomena, and fault mechanisms may result in uncertainty in the acquired signals. Uncertainty can influence the accuracy of signal feature extraction and reduce the precision of fault diagnosis. Entropy is a measure of uncertainty. It can effectively reduce the dimension of an eigenvalue and fully represent the feature information of a signal. Energy entropy, information entropy, approximate entropy (ApEn), sample entropy (SampEn), and so on, are frequently used as signal eigenvalues. Yan et al. [8] integrated ApEn into the state monitoring of bearings and achieved good results. Su et al. [9] introduced SampEn into fault feature extraction in rolling bearings. Their experimental results showed that SampEn performs better than ApEn and is suitable for distinguishing the fault states of rolling bearings. However, both methods suffer from complex parameter selection and other shortcomings; for example, a slight variation in the threshold causes a sudden change in entropy, which affects the stability of the statistics. To overcome the disadvantages of traditional entropy, this study presents a solution, namely, cloud model characteristic entropy (CMCE). CMCE (En) is obtained from a backward cloud generator without threshold and dimension settings. This generator avoids the difficulties caused by parameter selection and handles the uncertainty problem better.
Drawing on both probability theory and fuzzy mathematics, Li [10] proposed the cloud model, built by a specific construction algorithm, as a model for converting between qualitative concepts and quantitative expressions. The cloud model can reflect not only the uncertainty of a natural language concept, but also the correlation between randomness and fuzziness, and it constitutes a mutual mapping between qualitative and quantitative concepts. CMCE is one of the digital characteristics of the cloud model and measures the uncertainty of a qualitative concept. In [11], Li evaluated the stability of segmented sequence data, and then adaptively identified and expressed the basic characteristics of a time series. In [12], Yu used CMCE to measure the fluctuation range of harmonic current under normal operation mode. Harmonic current was determined according to the curves of the 3En outer boundary of normal cloud membership. The cloud model is also widely applied in emitter identification, power transformer fault diagnosis, and network intrusion detection, among others [13][14][15]. However, the use of this model in the fault signal feature extraction of rolling bearings has not yet been reported. Accordingly, this study decomposes AE signals using the EEMD algorithm and selects sensitive IMFs that represent characteristic information via the mutual information (MI) algorithm to eliminate noise interference. CMCE is then used as the eigenvalue of the reconstructed signal to overcome the shortcomings of traditional entropy. This process can improve the accuracy of signal feature extraction and the precision of fault diagnosis.

EEMD Algorithm
The EEMD algorithm is a noise-assisted signal processing method. In this algorithm, Gaussian white noise is added to the signal, and the noisy signal is repeatedly decomposed via empirical mode decomposition. After noise is added, the signal becomes continuous across regions with different frequencies because the frequency content of Gaussian white noise is statistically evenly distributed. Consequently, the degree of mode mixing among the IMF components is reduced. The EEMD algorithm is described as follows [16].
(1) The ensemble number M (the number of decompositions to be averaged) and the standard deviation k of the white noise are set.
(2) EMD is performed M times, each time after adding white noise.
(2.1) A random Gaussian white noise $n_m(t)$ with standard deviation k is added to the input signal $x(t)$, and the signal $x_m(t)$ is obtained as follows:

$$x_m(t) = x(t) + n_m(t)$$

(2.2) $x_m(t)$ is decomposed by EMD to obtain $c_{j,m}$ (j = 1, 2, …, $N_m$), the j-th IMF obtained in the m-th decomposition. $N_m$ denotes the number of IMFs in the m-th decomposition.
(2.3) If m < M, then let m = m + 1 and return to (2.1).
(2.4) The minimum number of IMFs obtained over the M decompositions is taken as the final number of IMFs.
(3) Each IMF is averaged over the M decompositions as follows:

$$\bar{c}_j(t) = \frac{1}{M}\sum_{m=1}^{M} c_{j,m}(t)$$

(4) $\bar{c}_j$ is output as the j-th IMF obtained after EEMD decomposition.
The added white noise $n_m(t)$ is generated randomly in each decomposition. When the value of M is large, the ensemble average of the added Gaussian white noise is close to zero.
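The ensemble loop in steps (1)-(4) can be sketched in a few lines of Python. This is only an illustration of the averaging scheme: `toy_decompose` is a hypothetical stand-in that splits a signal into a fast and a slow part with a moving average, and it is not a real EMD. In practice a proper EMD routine would be passed in its place. The function names, the default values of `M` and `k`, and the seed are assumptions made for the example.

```python
import numpy as np

def toy_decompose(signal, width=9):
    """Hypothetical stand-in for EMD (NOT a real EMD): split the signal
    into a fast part and a slow part using a moving average."""
    kernel = np.ones(width) / width
    slow = np.convolve(signal, kernel, mode="same")
    fast = signal - slow
    return [fast, slow]

def eemd(x, decompose, M=100, k=0.2, seed=0):
    """Average the components returned by `decompose` over M noisy copies of x."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    trials = []
    for _ in range(M):                                  # steps (2.1)-(2.3)
        noise = rng.normal(0.0, k, size=x.shape)        # n_m(t) with std k
        trials.append(decompose(x + noise))             # c_{j,m}
    n_imf = min(len(c) for c in trials)                 # step (2.4)
    # step (3): average the j-th component over all M decompositions
    return np.array([np.mean([c[j] for c in trials], axis=0)
                     for j in range(n_imf)])

t = np.linspace(0.0, 1.0, 512)
x = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)
imfs = eemd(x, toy_decompose, M=200, k=0.2)
print(imfs.shape)
```

Because the added noise is drawn independently in every decomposition, its contribution to the averaged components shrinks as M grows, which is the property the last sentence of the algorithm description relies on.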

MI Algorithm
The MI algorithm was proposed by Claude Elwood Shannon, the founder of information theory. MI measures the statistical dependence between two random variables and is derived by extending the concept of entropy. Entropy is a measure of the degree of disorder of a physical system. The entropy of a variable X is defined as [17]

$$H(X) = -\sum_{x} p(x)\log_2 p(x)$$

where p(x) is the probability that event X will occur. The conditional entropy between two different random variables X and Y is defined as follows:

$$H(X|Y) = -\sum_{y} p(y)\sum_{x} p(x|y)\log_2 p(x|y)$$

where p(y) is the probability that event Y will occur independently, and p(x|y) is the conditional probability that event X will occur given that event Y occurs. The joint entropy of X and Y is

$$H(X,Y) = -\sum_{x}\sum_{y} p(x,y)\log_2 p(x,y)$$

where p(x, y) is the joint probability that events X and Y will occur simultaneously.
In general, the relationship between the joint entropy and the conditional entropy is

$$H(X,Y) = H(Y) + H(X|Y)$$

For variables X and Y, MI is defined as follows:

$$I(X;Y) = H(X) + H(Y) - H(X,Y) \tag{5}$$

The MI obtained from Equation (5) is not normalized; it is normalized by Equation (6).
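As an illustration of the definitions above, MI between two sampled signals can be estimated from histograms using the identity I(X;Y) = H(X) + H(Y) − H(X,Y). The sketch below is a simple plug-in estimator and is not necessarily the estimator used in this study; the function names and the bin count `bins=32` are assumptions, and no normalization is applied.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability array (zero entries skipped)."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(x, y, bins=32):
    """Histogram (plug-in) estimate of I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()        # joint probability p(x, y)
    px = pxy.sum(axis=1)             # marginal p(x)
    py = pxy.sum(axis=0)             # marginal p(y)
    return entropy(px) + entropy(py) - entropy(pxy.ravel())

rng = np.random.default_rng(0)
a = rng.normal(size=5000)
b = rng.normal(size=5000)            # independent of a
print(mutual_information(a, a), mutual_information(a, b))
```

A signal compared with itself gives an MI equal to its own entropy, whereas two independent signals give a value close to zero (the histogram estimate carries a small positive bias that depends on the bin count and sample size).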

Cloud Model Algorithm
Let U be a quantitative universe that can be represented by numbers, and let C be a qualitative concept on U. If the quantitative value $x \in U$ is a random realization of the qualitative concept C, and the certainty degree of x for C, $\mu(x) \in [0, 1]$, is a random number with a stable tendency,

$$\mu: U \to [0, 1], \quad \forall x \in U, \quad x \to \mu(x),$$

then the distribution of x in the universe is a cloud, which can be denoted as C(x).
Each x is called a cloud droplet. The numerical characteristics of the cloud model include Ex (expected value), En (CMCE), and He (hyper entropy). The cloud model algorithm is divided into the backward cloud generator algorithm and the normal cloud generator algorithm [18]. The backward cloud generator algorithm converts a certain number of data samples into a qualitative concept expressed by digital characteristics, as follows.
(1) From the sample points $x_i$ (i = 1, 2, …, n), the sample mean is $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} x_i$, the first-order sample absolute central moment is $\frac{1}{n}\sum_{i=1}^{n} |x_i - \bar{X}|$, and the sample variance is $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{X})^2$.
(2) Calculate the expected value as follows.

$$E_x = \bar{X}$$

(3) Calculate CMCE as follows.

$$E_n = \sqrt{\frac{\pi}{2}} \times \frac{1}{n}\sum_{i=1}^{n} |x_i - E_x|$$
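The backward cloud generator can be sketched as follows. The expressions for Ex and En follow the commonly used backward cloud generator without certainty degrees, in which En is the first-order absolute central moment scaled by sqrt(π/2). The hyper entropy line, He = sqrt(S² − En²), is the usual companion formula and is included here as an assumption, since the text above gives no expression for He. The function name is hypothetical.

```python
import numpy as np

def backward_cloud(samples):
    """Estimate (Ex, En, He) of a normal cloud from data samples."""
    x = np.asarray(samples, dtype=float)
    mean = x.mean()                               # sample mean
    abs_moment = np.mean(np.abs(x - mean))        # first-order absolute central moment
    var = x.var(ddof=1)                           # sample variance S^2
    ex = mean                                     # expected value Ex
    en = np.sqrt(np.pi / 2.0) * abs_moment        # CMCE (En)
    # Assumption: standard hyper-entropy formula, clipped at zero because
    # S^2 - En^2 can be slightly negative for finite samples.
    he = np.sqrt(max(var - en ** 2, 0.0))
    return ex, en, he

rng = np.random.default_rng(1)
ex, en, he = backward_cloud(rng.normal(5.0, 2.0, size=20000))
print(ex, en, he)
```

For Gaussian data the scaling factor sqrt(π/2) makes En an estimate of the standard deviation, so the example returns values near Ex = 5 and En = 2. No threshold or embedding dimension has to be chosen, which is the property that distinguishes CMCE from ApEn and SampEn.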
A normal cloud generator represents a mapping from the qualitative to the quantitative, and cloud droplets are obtained according to the digital characteristics (Ex, En, He). This algorithm generates random quantitative values of an uncertain concept together with their degrees of certainty. The steps of the normal cloud generator algorithm are described as follows.
(1) Generate a normal random number En′ with expected value En and standard deviation He.
(2) Generate a normal random number x with expected value Ex and standard deviation En′.
The probability density of X is

$$f(x) = \int_{-\infty}^{+\infty} \frac{1}{2\pi H_e |y|} \exp\left[-\frac{(x - E_x)^2}{2y^2} - \frac{(y - E_n)^2}{2H_e^2}\right] dy$$

The expected value of cloud droplet X is E(X) = Ex, and its variance is D(X) = En² + He².
(3) Calculate the certainty degree as follows.

$$y = \exp\left[-\frac{(x - E_x)^2}{2(E_n')^2}\right]$$

(4) x is a cloud droplet of the universe, and y is its certainty degree.
(5) Repeat Steps (1)-(4) until the required number of cloud droplets is generated.
The schematic of the final cloud model is shown in Figure 1. Among the digital characteristics, CMCE (En) represents the uncertainty of the qualitative concept. On the one hand, CMCE (En) indicates the range of cloud droplets in the universe of discourse that the linguistic term can accept, which is also called fuzziness. On the other hand, CMCE (En) denotes the randomness with which a cloud droplet representing the qualitative concept appears. Entropy therefore also indicates the relationship between fuzziness and randomness. CMCE (En) can be used to represent the granularity of a qualitative concept. In general, when entropy is high, the concept is macroscopic, fuzziness and randomness are considerable, and definite quantization is difficult.
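The normal cloud generator steps can be sketched as a vectorized Python function that draws all droplets at once instead of looping. The function name, the seed, and the example parameter values are assumptions made for illustration.

```python
import numpy as np

def normal_cloud(ex, en, he, n, seed=0):
    """Generate n cloud droplets x and their certainty degrees y."""
    rng = np.random.default_rng(seed)
    en_prime = rng.normal(en, he, size=n)        # step (1): En' ~ N(En, He^2)
    # step (2): x ~ N(Ex, En'^2); the absolute value is taken because a
    # standard deviation must be non-negative, and it leaves the
    # distribution of x unchanged by symmetry.
    x = rng.normal(ex, np.abs(en_prime))
    # step (3): certainty degree of each droplet
    y = np.exp(-(x - ex) ** 2 / (2.0 * en_prime ** 2))
    return x, y

x, y = normal_cloud(ex=0.0, en=1.0, he=0.1, n=50000)
print(x.mean(), x.var(), y.min(), y.max())
```

With Ex = 0, En = 1, and He = 0.1, the sample mean of the droplets should be close to Ex and the sample variance close to En² + He² = 1.01, in line with the moments stated above. Plotting y against x gives the familiar cloud shape, whose thickness grows with He.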

Signal Feature Extraction Method Based on EEMD and CMCE
According to the preceding analysis, the feature extraction method based on EEMD and CMCE can be summarized in the following steps.
(1) IMFj (j = 1, 2, …, n) are obtained by decomposing the collected AE signals with the EEMD algorithm.
(2) The MI values between each IMFj and the original signal are calculated by the MI algorithm, and sensitive IMFs are selected according to the MI threshold.
The formula of the MI threshold is as follows [19]:

$$\mu_h = \frac{\max(\mu_i)}{10 \times \max(\mu_i) - 3}, \quad i = 1, 2, \ldots, n \tag{16}$$

In Equation (16), μi is the MI between the i-th IMF and the original signal, n is the number of IMFs, and max(μi) is the maximum MI value.
An IMF whose MI with the original signal is larger than the MI threshold μh is retained as a sensitive IMF; an IMF whose MI with the original signal is smaller than μh is removed.
(3) The selected sensitive IMFs are used to reconstruct the signal.
(4) CMCE is calculated from the reconstructed signal using the backward cloud generator and is used as the eigenvalue.
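Steps (2) and (3) can be sketched as follows, assuming that the MI threshold has the form max(μ) / (10·max(μ) − 3). That form is a reading of the threshold formula taken from [19] and should be checked against that reference. The function names and the toy MI values are hypothetical and are not experimental data.

```python
import numpy as np

def mi_threshold(mi_values):
    """Assumed threshold: max(mu) / (10 * max(mu) - 3).
    The expression is positive only when max(mu) > 0.3."""
    mu_max = np.max(mi_values)
    return mu_max / (10.0 * mu_max - 3.0)

def reconstruct(imfs, mi_values):
    """Keep IMFs whose MI exceeds the threshold and sum them."""
    imfs = np.asarray(imfs, dtype=float)
    mi_values = np.asarray(mi_values, dtype=float)
    keep = mi_values > mi_threshold(mi_values)   # mask of sensitive IMFs
    return keep, imfs[keep].sum(axis=0)          # reconstructed signal

# Toy illustration: four "IMFs" of three samples each, with made-up MI values.
imfs = np.arange(12.0).reshape(4, 3)
keep, signal = reconstruct(imfs, [0.9, 0.1, 0.05, 0.5])
print(keep, signal)
```

In the full method, the reconstructed signal returned here would then be passed to the backward cloud generator to obtain CMCE as the eigenvalue.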

Design and Layout of Test Rig
To complete the simulation experiment, two conditions were tested, namely, the normal state and the damage state of the inner ring of the bearing. The rolling bearing experimental test rig is illustrated in Figure 2. The test rig consists of the motor, coupling, test bearing, and support structure. The type of rolling bearing used is K001, which is produced by Chinese Nanjing haning Bearing Manufacturing Co., Ltd. The damaged area of the bearing sample is 6 mm², which is machined by electrical discharge. Physical samples are shown in Figure 3.

Instrumentation
The test instrument used in the experiment is a four-channel signal acquisition system of PCI-2-PAC produced by the American Acoustic Physics Company. The acoustic sensor is an R15α, whose response frequency is 60~500 kHz and service temperature is −20~80 °C. The acoustic sensor is fixed to the support, which is attached to the bearing by an M20 magnetic fixture. The acoustic sensor is then connected to the data acquisition system through a preamplifier (40 dB). The output impedance of the preamplifier is 50 Ω, and its working frequency is 10 kHz~2 MHz. The data acquisition system employs AEwin 3.5 software for data acquisition and analysis. The speed of the rotating motor is 14,000 r/min, and the sampling rate is 1 MSPS during the experiments. The schematic of the data acquisition process is shown in Figure 4.

In general, the most significant information of the original signal is concentrated in the first several IMF components of the EEMD decomposition. The MI values between IMF1~IMF7 and the original signal under normal and damage states are calculated by the MI algorithm, as shown in Figure 7. In this figure, the abscissa label Ai (i = 1, 2, …, 7) denotes the MI between IMFi (i = 1, 2, …, 7) and the original signal. The normal and damage MI thresholds are calculated according to [19], with threshold values of 0.1452 and 0.0711, respectively. Figure 7 shows that the MI values between IMF1 and IMF4 and the original signal under the normal state are higher than the threshold of 0.1452; hence, IMF1 and IMF4 are determined to be sensitive IMFs. The MI values between the remaining IMFs and the original signal are lower than the threshold of 0.1452; thus, these IMFs are removed as false components. Similarly, the MI values between IMF1 and IMF4 and the original signal under the damage state are higher than the threshold of 0.0711; hence, IMF1 and IMF4 are sensitive IMFs. The MI values of the other IMFs are smaller than the threshold of 0.0711, and thus these IMFs are removed as false components. In this manner, sensitive IMFs are determined under the two states, and the reconstructed signal is obtained by summing the sensitive IMFs. The reconstructed signals under the two states are shown in Figure 8.

As shown in Figures 10 and 11, CMCE and SampEn remain relatively stable as the number of samples increases. However, the difference in SampEn between the normal and damage states is minimal, and the distinguishing effect is not obvious. SampEn uses the unit step function, and it is highly sensitive to the threshold r. Consequently, sudden changes occur in SampEn, which cause the difference between the two states to decrease further. As shown in Figure 12, the RMS of the AE signals under the two states fluctuates noticeably and its stability is relatively poor. Although the RMS differs between the normal and damage states, the distinguishing effect is poor. RMS represents the change in signal amplitude; thus, it is easily disturbed by the environment, has poor anti-noise capability, and fails to solve the signal uncertainty problem. Moreover, obvious fluctuations easily appear as the number of samples increases.
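The dependence of SampEn on the threshold r can be explored with a short sketch of a common SampEn formulation. The defaults m = 2 and r = 0.2 × standard deviation are widely used conventions assumed here, not values reported in this study, and the test signals are synthetic.

```python
import numpy as np

def sample_entropy(x, m=2, r=None):
    """SampEn = -ln(A / B): B counts template pairs of length m within
    tolerance r, A counts pairs of length m + 1 (self-matches excluded)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    if r is None:
        r = 0.2 * np.std(x)                      # assumed conventional default

    def matches(length):
        # the same n - m starting points are used for both template lengths
        t = np.array([x[i:i + length] for i in range(n - m)])
        # Chebyshev distance between every pair of templates
        d = np.max(np.abs(t[:, None, :] - t[None, :, :]), axis=2)
        return np.sum(d <= r) - len(t)           # hard threshold (unit step)

    b = matches(m)
    a = matches(m + 1)
    return -np.log(a / b)                        # infinite if no matches of length m + 1

rng = np.random.default_rng(0)
noise = rng.normal(size=500)
sine = np.sin(2 * np.pi * np.arange(500) / 50)
print(sample_entropy(sine), sample_entropy(noise))
print(sample_entropy(noise, r=0.1 * noise.std()),
      sample_entropy(noise, r=0.3 * noise.std()))
```

The hard comparison `d <= r` is the unit step function referred to above: a template pair either counts fully or not at all, so the result moves with the choice of r. CMCE requires no such tolerance.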
The CMCE and SampEn of the reconstructed signal are used as eigenvalues. Fault diagnosis is performed with the K-nearest neighbor (KNN) classification algorithm and the support vector machine (SVM) algorithm [20][21][22]. One hundred and twenty groups of samples are selected from the CMCE and SampEn samples, respectively, of which 80 groups are used for training and 40 groups for testing. K = 3 is set in the KNN algorithm, and the penalty factor C = 150 and σ = 1 are set in the SVM algorithm. The experimental results are shown in Table 1, which shows that the diagnostic performances of KNN and SVM are both good under EEMD-CMCE. These results also indicate that the proposed EEMD-CMCE method is a better feature extraction method than the EEMD-SampEn method.

Conclusions
This study proposes a feature extraction method for rolling bearing fault signals based on EEMD and CMCE. The EEMD algorithm is used for signal decomposition. Sensitive IMFs are then selected using the MI method and used to reconstruct the signal, which eliminates noise. CMCE is used as the eigenvalue of the reconstructed signal to overcome the disadvantages of traditional entropy. The analysis of the AE signals from a rolling bearing under normal and damage states verifies experimentally that CMCE is an effective and superior bearing fault feature.

(2.2) xm(t) is decomposed by EMD to obtain cj,m (j = 1, 2, …, Nm), the j-th IMF obtained in the m-th decomposition. Nm denotes the number of IMFs in the m-th decomposition. (2.3) If m < M, then let m = m + 1 and return to (2.1).

Figure 1. Schematic of the cloud model.

(1) IMFj (j = 1, 2, …, n) are obtained by decomposing the collected AE signals with the EEMD algorithm. (2) The MI values between each IMFj and the original signal are calculated by the MI algorithm.

Figure 3. Damage fault in the inner ring of the bearing.

Figure 4. Schematic of the data acquisition systems.

Figure 7. MI values between IMF1~IMF7 and the original signal under the two states.

Figure 8. Reconstructed signals. (a) The reconstructed normal signal; (b) The reconstructed damage signal.

Figure 9. Cloud model diagram of the reconstructed signal.

Figure 10. Fitting curves of all CMCE values under the two states.

Figure 11. Fitting curves of all SampEn values under the two states.

Figure 12. Fitting curves of all RMS values under the two states.

Table 1. Comparison of fault diagnosis results.