Fault Recognition of Rolling Bearings Based on Parameter Optimized Multi-Scale Permutation Entropy and Gath-Geva

To extract fault features of rolling bearing vibration signals precisely, a fault diagnosis method based on parameter optimized multi-scale permutation entropy (MPE) and Gath-Geva (GG) clustering is proposed. The method selects the important parameters of MPE adaptively, overcomes the disadvantages of fixed MPE parameters, and greatly improves the accuracy of fault identification. Firstly, to address the problem of parameter determination while accounting for the interaction among the MPE parameters, the skewness of MPE is taken as the fitness function, and the time series length and embedding dimension are optimized by the particle swarm optimization (PSO) algorithm. Then the fault features of the rolling bearing are extracted by the parameter optimized MPE, and the standard clustering centers are obtained with GG clustering. Finally, the samples are classified with the Euclid nearness degree to obtain the recognition rate. The validity of the parameter optimization is verified by calculating the partition coefficient and average fuzzy entropy. Compared with unoptimized MPE, the proposed method has a higher fault recognition rate.


Introduction
As the core component of rotating machinery, the rolling bearing directly affects the operation of the equipment through its state [1]. Vibration signals collected by sensors are often contaminated by noise and are thus unusable for direct machine fault diagnosis [2]. How to identify the state of a rolling bearing quickly and effectively has become a focus of current research. Fault feature extraction and pattern recognition are the key links in rolling bearing fault diagnosis [3,4]. At present, feature extraction for non-stationary complex signals mainly applies traditional time-frequency analysis [5] and filtering. The statistical characteristics of such signals in the time and frequency domains, such as root mean square (RMS) [6], kurtosis [7], and shape factor [8], change with time; however, these indicators change when a bearing fails regardless of whether the fault is located in the outer ring, the inner ring, or the rolling element. Relying solely on these features therefore cannot effectively identify the fault location. Fast Fourier transform (FFT) [9], wavelet transform [10], and ensemble empirical mode decomposition (EEMD) [11] are commonly used for signal denoising during feature extraction. Fault types are then determined by comparing the current fault features with standard or existing fault features [12,13]. However, due to factors such as friction, vibration, and load during mechanical operation, the vibration signal of a mechanical system often shows nonlinear behavior. Using time-frequency analysis to decompose such a signal into stationary components therefore inevitably has limitations and difficulties [14].
For the above reasons, this paper proposes a method that combines parameter optimized MPE with the GG clustering algorithm to extract fault features and recognize fault patterns of rolling bearings. The effectiveness of the proposed method is verified by several rolling bearing fault experiments.

MPE Theory
MPE calculates the permutation entropy of a time series at different scales, that is, it considers the characteristics of the time series at multiple scales. The calculation steps are as follows. For the time series $X = [x_i, i = 1, 2, \cdots, N]$, the coarse-grained time series $y_j^{(s)}$ are obtained by coarse-grained processing [25]:

$$y_j^{(s)} = \frac{1}{s} \sum_{i=(j-1)s+1}^{js} x_i, \quad j = 1, 2, \cdots, [N/s] \qquad (1)$$

where $s$ is the scale factor of $X$ and $N$ is the length of $X$.
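The coarse-graining of Equation (1) followed by a permutation entropy calculation at each scale can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the entropy is normalized by ln(m!) (an assumption, since the normalization is not stated here), and ties inside a window are broken by position.

```python
import math
from collections import Counter


def coarse_grain(x, s):
    """Average consecutive, non-overlapping windows of length s (Equation (1))."""
    n = len(x) // s
    return [sum(x[j * s:(j + 1) * s]) / s for j in range(n)]


def permutation_entropy(y, m, t=1):
    """Permutation entropy with embedding dimension m and delay t.

    Assumption: the value is normalized by ln(m!) so that it lies in [0, 1].
    """
    n_vectors = len(y) - (m - 1) * t
    if n_vectors <= 0:
        raise ValueError("series too short for the chosen m and t")
    patterns = Counter()
    for i in range(n_vectors):
        window = [y[i + k * t] for k in range(m)]
        # Ordinal pattern: the index order that sorts the window
        # (sorted() is stable, so ties are broken by position).
        patterns[tuple(sorted(range(m), key=lambda k: window[k]))] += 1
    probs = [count / n_vectors for count in patterns.values()]
    h = -sum(p * math.log(p) for p in probs)
    return h / math.log(math.factorial(m))


def multiscale_permutation_entropy(x, m, t=1, max_scale=12):
    """Permutation entropy of the coarse-grained series for s = 1..max_scale."""
    return [permutation_entropy(coarse_grain(x, s), m, t)
            for s in range(1, max_scale + 1)]


# Illustrative input only: a deterministic two-tone signal, not bearing data.
demo_signal = [math.sin(0.3 * i) + 0.5 * math.sin(1.7 * i) for i in range(512)]
demo_mpe = multiscale_permutation_entropy(demo_signal, m=3, t=1, max_scale=12)
```

With 12 scales, each sample yields a 12-dimensional vector of entropy values, one per scale factor.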

Parameter Selection for MPE
In order to analyze the general trend of a group of data, the first step is to find the mean value. However, the mean value alone cannot fully represent the overall situation of a group of data, so the skewness of the data is also calculated [45]. The smaller the absolute value of the skewness, the more symmetric the distribution and the more reliably the mean value represents the data.
The MPE values of $X = [x_i, i = 1, 2, \cdots, N]$ at all scales constitute the sequence

$$H_D(X) = \{H_D(1), H_D(2), \cdots, H_D(s)\}$$

The skewness of $H_D(X)$ is

$$\mathrm{skew} = \frac{E\left(\left(H_D(X) - H_D^{ave}(X)\right)^3\right)}{\left(H_D^{std}(X)\right)^3}$$

where $H_D^{ave}(X)$ and $H_D^{std}(X)$ are the average value and standard deviation of $H_D(X)$, and $E(*)$ stands for expectation. Therefore, this paper selects the square of the skewness as the objective function [42],

$$F(X) = \mathrm{skew}^2$$

and the MPE parameters are optimized by searching for the minimum value of $F(X)$.
Entropy 2021, 23, 1040
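As a hedged illustration of the objective function, the squared skewness of a sequence of MPE values can be computed as below. The function names are hypothetical, and the population standard deviation (division by the number of values) is an assumption, since the estimator is not specified in the text.

```python
import math


def skewness(values):
    """Skewness E((v - mean)^3) / std^3, using the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return 0.0  # a constant sequence has no asymmetry
    third_moment = sum((v - mean) ** 3 for v in values) / n
    return third_moment / std ** 3


def fitness(mpe_values):
    """Objective F = skew^2; smaller values mean a more symmetric MPE sequence."""
    return skewness(mpe_values) ** 2


symmetric_fitness = fitness([1, 2, 3, 4, 5])   # symmetric values, skewness 0
skewed_fitness = fitness([1, 1, 1, 10])        # right-skewed values
```

Squaring removes the sign, so strongly left-skewed and strongly right-skewed MPE sequences are penalized equally.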

Particle Swarm Optimization
Particle swarm optimization (PSO) [46] regards the individuals in the population as particles without mass and volume in the multi-dimensional search space. Each particle has its own position and velocity. In the solution space, the fitness evaluation function guides each particle to move continuously toward its personal best historical position $p_{best}$ and the group best historical position $g_{best}$, which realizes the evolution of the candidate solutions.
The special memory function of PSO makes it possible to dynamically track the current search situation and adjust the search strategy. The evolution process of particle swarm optimization is as follows:

$$ve_i^{\sigma+1} = w \, ve_i^{\sigma} + c_1 r_1 \left(p_i^{\sigma} - po_i^{\sigma}\right) + c_2 r_2 \left(g_i^{\sigma} - po_i^{\sigma}\right)$$

$$po_i^{\sigma+1} = po_i^{\sigma} + ve_i^{\sigma+1}$$

where $\sigma$ is the evolutionary generation, $ve_i^{\sigma}$ is the flight velocity of particle $i$, $po_i^{\sigma}$ is the position vector of particle $i$, $p_i^{\sigma}$ is the best position experienced by particle $i$, and $g_i^{\sigma}$ is the best position experienced by the whole particle swarm in the solution space. $r_1$ and $r_2$ are random numbers in $[0, 1]$, $c_1$ and $c_2$ are learning factors, and $w$ is the inertia weight factor. The position $po_i$ and velocity $ve_i$ meet the following condition:

$$|po_i| \le po_{max}, \quad |ve_i| \le ve_{max}, \quad ve_{max} = \delta \cdot po_{max}$$

where $\delta$ is the proportional coefficient between the maximum velocity $ve_{max}$ and the maximum search space $po_{max}$.
When the position or velocity of a certain dimensional variable exceeds the boundary range, the boundary absorption strategy is adopted, that is, the particle falls on the boundary of the search space in the next iteration.
The parameters of the PSO algorithm in this paper are set as follows: population size group = 20, maximum number of iterations $T_{max}$ = 10, acceleration constants $c_1 = c_2 = 1.5$, and inertia weight $w = 0.5$. The process of MPE parameter optimization using PSO is shown in Figure 1.
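The PSO procedure with the settings listed above (20 particles, 10 iterations, $c_1 = c_2 = 1.5$, $w = 0.5$) and the boundary absorption strategy can be sketched as follows. This is a generic sketch under stated assumptions rather than the authors' code: the function name, the value of `delta`, the velocity limit taken as a fraction of the search range, and the toy objective are all hypothetical.

```python
import random


def pso_minimize(objective, lower, upper, n_particles=20, n_iter=10,
                 w=0.5, c1=1.5, c2=1.5, delta=0.2, seed=0):
    """Minimize objective(position) inside the box [lower, upper]."""
    rng = random.Random(seed)
    dim = len(lower)
    # Assumption: the velocity limit is a fraction delta of the search range.
    v_max = [delta * (upper[d] - lower[d]) for d in range(dim)]
    pos = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-v_max[d], v_max[d]) for d in range(dim)]
           for _ in range(n_particles)]
    p_best = [p[:] for p in pos]
    p_best_val = [objective(p) for p in pos]
    g_idx = min(range(n_particles), key=lambda i: p_best_val[i])
    g_best, g_best_val = p_best[g_idx][:], p_best_val[g_idx]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (p_best[i][d] - pos[i][d])
                     + c2 * r2 * (g_best[d] - pos[i][d]))
                # Boundary absorption: clip velocity and position to their limits.
                vel[i][d] = max(-v_max[d], min(v_max[d], v))
                pos[i][d] = max(lower[d], min(upper[d], pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < p_best_val[i]:
                p_best[i], p_best_val[i] = pos[i][:], val
                # The global best is updated as soon as a particle improves on it.
                if val < g_best_val:
                    g_best, g_best_val = pos[i][:], val
    return g_best, g_best_val


def toy_objective(p):
    """Hypothetical stand-in for the MPE skewness fitness."""
    return (p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2


best_pos, best_val = pso_minimize(toy_objective, [-10.0, -10.0], [10.0, 10.0])
```

For MPE parameter selection, a particle position would plausibly encode the pair (m, L) rounded to integers before the fitness is evaluated; that mapping is an assumption and is not shown here.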

GG Algorithm
The specific algorithm given in reference [47] is as follows.
(1) Suppose a sample set of $n$ samples is to be partitioned into $z$ clusters.
(2) Initialize the membership matrix $U = [u_{ik}]_{z \times n}$, where $u_{ik}$ indicates the degree to which the $k$th sample belongs to the $i$th cluster ($1 \le i \le z$, $1 \le k \le n$).
(3) Calculate the cluster centers, where $\lambda$ is the iteration index and $\beta$ is the fuzzy exponent, generally taken as 2.
(4) Calculate the distance measure $DM_{ik}$, where $DM_{ik}$ is the maximum likelihood estimation distance, $A_i$ is the covariance matrix of the $i$th cluster, and $q_i$ is the prior probability of the $i$th cluster being selected.
(5) Update the membership matrix (Equation (13)). If the condition $\|U^{\lambda} - U^{\lambda-1}\| < \varepsilon$ is satisfied, where $\varepsilon$ is the termination tolerance, the operation is terminated; otherwise, $\lambda = \lambda + 1$ and the iteration continues until the condition is satisfied.
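The iteration described in this section can be sketched with the commonly used Gath-Geva update rules. The sketch below is an assumption-laden illustration, not the formulation of reference [47]: the function name, the farthest-point initialization, the small covariance regularization `reg`, and the log-domain membership update (used only for numerical stability) are all choices made here.

```python
import numpy as np


def gg_clustering(X, z, beta=2.0, tol=1e-6, max_iter=100, reg=1e-6):
    """Gath-Geva fuzzy clustering; returns memberships U (z, n) and centers V (z, d)."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    # Assumption: initialize from farthest-point centers and inverse squared distances.
    centers = [X[0]]
    for _ in range(1, z):
        dist = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    d2 = np.array([np.sum((X - c) ** 2, axis=1) for c in centers]) + 1e-12
    U = (1.0 / d2) / np.sum(1.0 / d2, axis=0)

    for _ in range(max_iter):
        U_old = U
        W = U ** beta                                    # fuzzified memberships
        V = (W @ X) / W.sum(axis=1, keepdims=True)       # cluster centers
        q = np.maximum(U.mean(axis=1), 1e-12)            # prior probability of each cluster
        log_d2 = np.empty((z, n))
        for i in range(z):
            diff = X - V[i]
            # Fuzzy covariance matrix A_i, lightly regularized to stay invertible.
            A = (W[i][:, None] * diff).T @ diff / W[i].sum() + reg * np.eye(d)
            maha = np.einsum("kd,de,ke->k", diff, np.linalg.inv(A), diff)
            _, logdet = np.linalg.slogdet(A)
            # log of DM^2 = sqrt(det A_i) / q_i * exp(maha / 2)
            log_d2[i] = 0.5 * logdet - np.log(q[i]) + 0.5 * maha
        # u_ik is proportional to (1 / DM_ik^2)^(1 / (beta - 1)); normalize over clusters.
        logits = -log_d2 / (beta - 1.0)
        logits -= logits.max(axis=0)
        U = np.exp(logits)
        U /= U.sum(axis=0)
        if np.linalg.norm(U - U_old) < tol:              # termination tolerance
            break
    return U, V


# Illustrative data only: three well-separated synthetic blobs, not bearing features.
rng = np.random.default_rng(0)
X_demo = np.vstack([rng.normal(loc=c, scale=0.3, size=(30, 2))
                    for c in ((0, 0), (5, 5), (0, 5))])
U_demo, V_demo = gg_clustering(X_demo, z=3)
labels_demo = U_demo.argmax(axis=0)
```

Because the distance measure uses a per-cluster covariance matrix and prior, the clusters may differ in shape, size, and density, which is the usual motivation for choosing GG over fuzzy c-means.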

Evaluation Index of Clustering Effect
The clustering effect of GG fuzzy clustering can be assessed quantitatively with the partition coefficient (PAC) [48] and the partition entropy (PAE). In their definitions, $\zeta_i$ and $\zeta_{i\tau}$ are the number of all members in cluster $i$ and the number of members of cluster $i$ belonging to class $\tau$, respectively, and the number of categories in cluster $i$ is also used.
Suppose the sample set $\Omega = (\psi_1, \psi_2, \cdots, \psi_k, \cdots, \psi_n)$ is composed of the set $\theta$ and the set $\phi$, and $n$ is the number of samples in $\Omega$. Euclid closeness [49] is used for fault recognition in this paper; the Euclid closeness between $\theta$ and $\phi$ is calculated from $\theta(\psi_k)$ and $\phi(\psi_k)$, the membership functions of $\theta$ and $\phi$, respectively.
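For orientation, the sketch below uses the textbook forms of these indices, which are assumptions and may differ from the definitions used in this paper: the partition coefficient as the mean of squared memberships, the partition entropy as the mean of $-u \ln u$, and the Euclid closeness as $1 - \frac{1}{\sqrt{n}} \left( \sum_k (\theta(\psi_k) - \phi(\psi_k))^2 \right)^{1/2}$. All function names are hypothetical.

```python
import math


def partition_coefficient(U):
    """Mean of squared memberships; equals 1 for a crisp (non-fuzzy) partition."""
    n = len(U[0])
    return sum(u * u for row in U for u in row) / n


def partition_entropy(U):
    """Mean of -u*ln(u); equals 0 for a crisp partition."""
    n = len(U[0])
    return -sum(u * math.log(u) for row in U for u in row if u > 0) / n


def euclid_closeness(theta, phi):
    """Closeness in [0, 1] between two membership vectors of equal length."""
    n = len(theta)
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(theta, phi)))
    return 1.0 - distance / math.sqrt(n)


def recognize(sample, standards):
    """Return the label of the standard pattern closest to the sample."""
    return max(standards, key=lambda label: euclid_closeness(sample, standards[label]))


# Rows are clusters, columns are samples (illustrative values only).
crisp_U = [[1.0, 0.0], [0.0, 1.0]]
fuzzy_U = [[0.5, 0.5], [0.5, 0.5]]
demo_label = recognize([0.9, 0.1], {"NR": [1.0, 0.0], "ORF": [0.0, 1.0]})
```

Under these definitions a partition coefficient near 1 and a partition entropy near 0 both indicate a crisp, well-separated clustering.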

The Process of Bearing Fault Pattern Recognition
The framework of the proposed method is shown in Figure 2. The general implementation procedure is summarized as follows.
(1) Carry out the experiment and collect the vibration data.
(2) For each signal, the initial parameters of MPE are optimized by the PSO algorithm, and the optimal parameters (m, L) of MPE are determined.

Parameter Influence Analysis of MPE
In order to study the influence of different parameters on MPE, the rolling bearing experimental data from Case Western Reserve University (CWRU) [50] are used for analysis. The test stand, shown in Figure 3, is composed of a 2 hp motor (left), a torque transducer/encoder (center), a dynamometer (right), and control electronics (not shown). The test bearings support the motor shaft. Single point faults were introduced to the test bearings using electro-discharge machining. Vibration data were collected using accelerometers, which were attached to the housing with magnetic bases. The rolling bearing near the drive end is tested in the experiment. Its type is 6205-2RS JEM SKF.
Taking the normal vibration signal of the drive end bearing as an example, the motor speed is 1797 r/min and the sampling frequency is 12 kHz. The data length L ranges from 128 to 4096, the embedding dimension m from 3 to 8, the delay time t from 1 to 6, and the scale factor s from 1 to 12. Figure 4 shows the variation of the MPE of the samples under different data lengths, different embedding dimensions, and different delay times.
It can be seen from Figure 4 that, for the normal vibration signal of the bearing with m = 6 and t = 1, the entropy increases markedly as L increases. Different L values therefore have a great impact on the entropy, and it is necessary to select an appropriate value of L. With L = 1024 and t = 1 fixed, the entropy decreases markedly as m increases, so it is also necessary to select an appropriate value of m.
With m = 6 and L = 1024 fixed, Figure 4c shows that the entropy neither increases nor decreases obviously at different scales as the delay time t increases, which indicates that t has little effect on the entropy; t is therefore fixed at 1 in this paper. When m is too small, the ability of the algorithm to detect signal mutations is low; the larger m is, however, the larger the amount of calculation and the longer the running time of the algorithm. In summary, selecting an appropriate data length L and embedding dimension m is necessary.

Case 1: CWRU Data Analysis
When the motor speed is 1797 r/min, four types of vibration signals are analyzed: normal (NR) bearings, outer ring fault (ORF) bearings, inner ring fault (IRF) bearings, and ball fault (BF) bearings. Figure 5 shows part of the time waveform of the vibration signal collected by the sensors in the four states; the horizontal axis is time in seconds and the vertical axis is the acceleration amplitude of the vibration signal in m·s⁻². Samples of each state are intercepted from the original signal according to the respective data lengths to obtain the samples of the four states. There are 30 samples for each state, so a total of 120 feature vectors are obtained, and each feature vector has 12 dimensions, one for each scale factor.
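The sample construction described above (30 samples per state, four states, 12 features per sample) can be sketched as follows. The function names are hypothetical, non-overlapping segments are an assumption, and `placeholder_features` is only a stand-in for the MPE calculation so that the block stays self-contained.

```python
import math


def segment_signal(signal, length, n_samples=30):
    """Cut n_samples consecutive, non-overlapping segments of the given length."""
    if length * n_samples > len(signal):
        raise ValueError("signal too short for the requested samples")
    return [signal[k * length:(k + 1) * length] for k in range(n_samples)]


def build_feature_matrix(signals_by_state, length_by_state, feature_fn, n_samples=30):
    """Return (features, labels) with one feature vector per segment."""
    features, labels = [], []
    for state, signal in signals_by_state.items():
        for sample in segment_signal(signal, length_by_state[state], n_samples):
            features.append(feature_fn(sample))
            labels.append(state)
    return features, labels


def placeholder_features(sample):
    """Hypothetical 12-value feature vector standing in for MPE at 12 scales."""
    return [sum(sample[i::12]) / len(sample[i::12]) for i in range(12)]


# Synthetic signals for illustration; the per-state lengths are hypothetical.
states = ["NR", "ORF", "IRF", "BF"]
demo_signals = {s: [math.sin(0.01 * (k + 1) * i) for i in range(30 * 256)]
                for k, s in enumerate(states)}
demo_lengths = {"NR": 256, "ORF": 128, "IRF": 256, "BF": 128}
demo_features, demo_labels = build_feature_matrix(
    demo_signals, demo_lengths, placeholder_features)
```

Keeping the data length per state in a dictionary reflects that the optimized length L may differ from one state to another.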
Firstly, every sample is analyzed by MPE to extract the features, and the effectiveness of the PSO parameter optimization is verified by comparison with the parameters in reference [32]. The vibration signals of the four states of the bearing are analyzed, and the change of the fitness value during the optimization process is shown in Figure 6. The optimized parameters of MPE for the various state samples are shown in Table 1.
It can be seen from Figure 7b that the MPE with optimized parameters distinguishes the four states of the bearing more clearly than the MPE with fixed parameters in Figure 7a. The parameter optimized MPE can therefore be used as the feature vector to further classify and identify the bearing fault modes.

In Figure 8, PC 1 and PC 2 are the two vectors of the two-dimensional space used for data visualization; they have the same meaning in the following similar figures. As can be seen from Figure 8b, after processing by the proposed method the samples are distributed around four clustering centers according to fault type, and compared with Figure 8a the distance between different classes becomes larger and the distance within classes becomes smaller.
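The text does not name the projection behind PC 1 and PC 2. One common choice, assumed here purely for illustration, is principal component analysis, which maps each 12-dimensional feature vector onto its two directions of largest variance. The function name and the random demonstration data are hypothetical.

```python
import numpy as np


def project_to_2d(features):
    """Project feature vectors onto their first two principal components."""
    X = np.asarray(features, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each feature
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:2].T                       # columns: PC 1 and PC 2
    explained = (S[:2] ** 2) / np.sum(S ** 2)    # variance fraction per component
    return scores, explained


# Illustrative input only: 120 random 12-dimensional vectors.
rng_demo = np.random.default_rng(1)
demo_scores, demo_explained = project_to_2d(rng_demo.normal(size=(120, 12)))
```

The explained-variance fractions indicate how much of the spread in the original feature space survives in the two-dimensional plot.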
In order to further illustrate the effectiveness of the method, the PAC, PAE, and fault recognition rate are used for quantitative evaluation. Corresponding to Figure 8, the performance comparison of the two recognition methods is shown in Table 2. It can be seen that:
(1) The closer the PAC is to 1, the better the clustering effect, and the closer the PAE is to 0, the better the clustering effect. Although the PAC values of MPE and PSO-MPE are both 1, the PAE of PSO-MPE is lower than that of MPE.
(2) The fault recognition rate of PSO-MPE with GG clustering reaches 100%, which is consistent with its clustering performance.
(3) The proposed PSO-MPE method can effectively extract the fault feature information of rolling bearings and accurately identify their different fault types.

Case 2: A Freight Locomotive Wheelset Bearing Signal
To further demonstrate the performance of the proposed method, a fault experiment is carried out in this section. The experimental setup and the tested wheelset bearing are shown in Figure 9. An RD2 wheelset and a 197726 double row tapered roller bearing are installed on the test bench. The fault bearings are shown in Figure 10. The wheelset bearing defects are natural damage generated during the operation of railway freight vehicles, located in the outer raceway, the inner raceway, and the ball, respectively. The experimental equipment includes CA-YD-188 piezoelectric accelerometers, a signal amplifier, an INV36DF signal acquisition instrument, and the DASP data processing software. The sensors are installed on the test bench in turn, at the positions shown in Figure 9. The sampling frequency is 25.6 kHz.
To show the time domain characteristics while saving space, Figure 11 gives only the time domain waveforms of the bearing inner ring and ball fault signals. The signals collected in this experiment (Figure 11) contain more noise than those of the CWRU bearing experiment (Figure 5), which increases the difficulty of verifying the method. There are 30 samples collected in each state.
GG clustering with the parameter unoptimized MPE is first applied directly to the signal. As shown in Figure 12a, without parameter optimization the entropy values of the four states are intertwined and are not effectively distinguished, so they are not suitable as quantitative features of rolling bearing faults. In Figure 12b, the distance between samples of the same class is too large and the distance between different classes is small. Although roughly four groups can be seen, the distinction between NR and ORF is not obvious, and some NR samples are wrongly classified as ORF; when the signal contains more noise components, misjudgment is likely.
Table 3 lists the parameters of MPE in the various states obtained by PSO. Figure 13a shows the PSO-MPE of the four state signals. The distance between the entropy curves of the different operation states is significantly increased, and the curves are completely separated. This is because when the rolling bearing has faults, the randomness of the vibration signal changes, which changes the entropy values at different scales. In the same state, as the scale increases, the randomness and complexity of the coarse-grained sequence decrease, and the range of variation of the entropy decreases.
As can be seen from Figure 13b, after the samples are processed by PSO-MPE and the GG clustering algorithm, they are distributed around four clustering centers according to fault type; compared with Figure 12b, the distance between different classes becomes larger and the distance within classes becomes smaller.
According to Table 4, the fault recognition rate of rolling bearings based on PSO-MPE and GG clustering is 99.17%, which is higher than the recognition rate of MPE. Moreover, the PAC and PAE are better than those of the parameter unoptimized MPE. This shows that the proposed method is still effective under relatively difficult experimental conditions.
In order to prove the superiority of the parameter optimized MPE as a signal feature extraction index, it is compared with the feature vector composed of kurtosis and root mean square (RMS). Figure 14 shows the effect of clustering with kurtosis and RMS as the feature vector. Compared with Figure 13b, the four types of samples are clearly not effectively distinguished. Because these indexes change no matter which part of the bearing fails, the fault location cannot be effectively distinguished through kurtosis or RMS alone, whereas the method in this paper effectively distinguishes the different types of fault samples.
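For reference, the two baseline indicators used in this comparison can be computed as below. The function names are hypothetical, and the non-excess (Pearson) definition of kurtosis, for which a Gaussian signal gives 3, is an assumption about the convention used.

```python
import math


def rms(x):
    """Root mean square of a signal segment."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def kurtosis(x):
    """Fourth central moment divided by the squared variance (non-excess form)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / m2 ** 2


def kurtosis_rms_features(sample):
    """Two-dimensional feature vector [kurtosis, RMS] for one sample."""
    return [kurtosis(sample), rms(sample)]


# Illustrative segments only: a square-like wave and a single sharp impulse.
square_like = [1.0, -1.0] * 50
impulsive = [0.0] * 99 + [10.0]
square_features = kurtosis_rms_features(square_like)
impulse_features = kurtosis_rms_features(impulsive)
```

Both values respond to the presence of impacts in the signal but carry no information about where the impacts originate, which is in line with the limitation discussed above.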

Case 3: A High-Speed Locomotive Wheelset Bearing Fault Signal
In order to verify whether the method is still effective under more complex working conditions with more noise components, practical test data from a self-made experiment platform are selected for analysis. In this case, the vibration signal is collected from a high-speed locomotive wheelset bearing. The test rig structure [51] is depicted in Figure 15. In order to simulate the load change of the wheelset bearing during operation, a random force with a frequency of 0.2~20 Hz and an average value of about 10 kN is applied in the radial direction, and a simple harmonic force with a frequency of 1 Hz and a maximum value of 10 kN is applied axially on the test rig.
The field diagram of the test rig and the test bearings is depicted in Figure 16. The sensor is located at the top of the end-shield of the test bearing in Figure 16c, and the vibration signal is collected by a PCB356A25 accelerometer. The dynamic loads are provided by the radial and axial actuators. There is an artificial local defect in the outer race of the test bearing, as shown in Figure 16d, with a width of 1 mm and a length of 5 mm. The artificial defect is relatively slight in comparison with the bearing geometry. The sampling frequency is set to 51.2 kHz and the set speed is 2000 r/min.

As can be seen from Figure 17, GG clustering cannot effectively cluster the fault feature samples constructed by MPE. It is difficult for the entropy to represent the different running states of the bearings, so further treatment is necessary. The steps are the same as in the last section and are not repeated here.
Table 5 lists the parameters of MPE in the various states, which are obtained by the PSO algorithm. Then GG is used to cluster the samples. The results with PSO-MPE are shown in Figure 18.
It can be seen from Figure 18a that the proposed method can effectively distinguish the three states. The values of NR and ORF are clearly separated, whereas they are not in Figure 17a. Compared with Figure 17b, the samples of each state in Figure 18b are clearly separated and grouped around their own cluster centers; the distance between different classes becomes larger and the distance within classes becomes smaller.
According to Table 6, the fault recognition rate of rolling bearings based on the proposed method is 100%, a large improvement over the 78.89% of MPE. Moreover, the PAC and PAE of PSO-MPE are better than those of MPE, which proves the necessity and advantage of combining PSO-MPE with the GG method: the combination has a better clustering effect and recognition effect.
In order to prove the robustness advantage of the proposed method, it is compared with the feature vector composed of kurtosis and root mean square (RMS). Figure 19 shows the clustering effect with kurtosis and RMS as the feature vector. Compared with Figure 18b, the three types of samples are clearly not effectively distinguished. Because the experimental environment simulates the working condition of high-speed train operation, the collected signal is close to the vibration signal of a train running on an actual line and is seriously disturbed by environmental noise. These indexes are almost invalid in this case.
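The recognition rate itself is the share of samples assigned to their true state. The sketch below uses a hypothetical function name and illustrative counts: 120 samples follows from the 30 samples per state and four states stated for Case 2, while 90 samples for this case is an assumption (three states with 30 samples each) chosen because it reproduces the reported 78.89%.

```python
def recognition_rate(true_labels, predicted_labels):
    """Percentage of samples whose predicted state matches the true state."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)


# Illustrative label lists, not the experimental results.
truth_120 = ["NR"] * 120
pred_120 = ["NR"] * 119 + ["ORF"]          # one misclassified sample out of 120
truth_90 = ["NR"] * 90
pred_90 = ["NR"] * 71 + ["ORF"] * 19       # 19 misclassified samples out of 90

rate_120 = round(recognition_rate(truth_120, pred_120), 2)   # 99.17
rate_90 = round(recognition_rate(truth_90, pred_90), 2)      # 78.89
```

Under these assumed counts, 99.17% corresponds to a single misclassified sample and 78.89% to 19 misclassified samples.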

Conclusions
In this paper, a rolling bearing fault detection method based on PSO-MPE and GG is proposed. The method selects the important parameters of MPE adaptively, overcomes the disadvantages of fixed MPE parameters, and greatly improves the accuracy of fault identification. The method is verified by several experiments. The conclusions are as follows:
(1) To solve the problem of parameter determination of MPE, the fitness function is constructed from the skewness of the multi-scale permutation entropy, and the time series length L and embedding dimension m are optimized. The effectiveness of the optimization method is verified by experiments.
(2) Compared with the MPE with fixed parameters, the parameter optimized MPE extracts fault features accurately and achieves better classification and a higher recognition rate for typical rolling bearing faults.
(3) The effectiveness and robustness of the proposed method are verified by several rolling bearing experiments whose signals range from simple to complex. Compared with the feature vector composed of root mean square and kurtosis, the proposed method shows advantages when the vibration signal contains more noise components and serious environmental interference, with more accurate and stable performance in fault diagnosis.
Data Availability Statement: The data are not publicly available due to laboratory restrictions.

Conflicts of Interest:
The authors declare no conflict of interest.
Appendix A
Table A1. Similarities and Differences between This Paper and Other References.

References | Similarities | Differences
Reference [26] | The variational mode decomposition (VMD) is used to denoise and obtain intrinsic mode functions (IMFs), the MPE of these IMFs is calculated, and FCM is used to cluster. | The influence of the MPE parameters is not considered.
MPE-LS-SVM | The MPE of the bearing vibration signal at different scales is calculated, the Laplacian score (LS) is used to refine the feature vector, and SVM is used to classify. | The parameters of MPE are fixed.
Reference [29]: EEMD-MPE-SA-SVM | A number of intrinsic mode functions (IMFs) are obtained using ensemble empirical mode decomposition (EEMD), the multi-scale permutation entropy of the IMFs is extracted, and SA-SVM is used to classify. | The parameters (m, t) of MPE are fixed.
Reference [30]: CEEMD-MPE-GK | Complementary ensemble empirical mode decomposition (CEEMD) is used to denoise and obtain intrinsic mode functions (IMFs), the MPE of the IMFs is calculated, and GK is used for fault type recognition. | The influence of the MPE parameters is not considered.