Fault Detection for Vibration Signals on Rolling Bearings Based on the Symplectic Entropy Method

Bearing vibration response studies are crucial for the condition monitoring of bearings and the quality inspection of rotating machinery systems. However, it is still very difficult to diagnose bearing faults, especially rolling element faults, due to the complex, high-dimensional and nonlinear characteristics of vibration signals as well as the strong background noise. A novel nonlinear analysis method—the symplectic entropy (SymEn) measure—is proposed to analyze the measured signals for fault monitoring of rolling bearings. The core technique of the SymEn approach is the entropy analysis based on the symplectic principal components. The dynamical characteristics of the rolling bearing data are analyzed using the SymEn method. Unlike other techniques consisting of high-dimensional features in the time-domain, frequency-domain and the empirical mode decomposition (EMD)/wavelet-domain, the SymEn approach constructs low-dimensional (i.e., two-dimensional) features based on the SymEn estimate. The vibration signals from our experiments and the Case Western Reserve University Bearing Data Center are applied to verify the effectiveness of the proposed method. Meanwhile, it is found that faulty bearings have a great influence on the other normal bearings. To sum up, the results indicate that the proposed method can be used to detect rolling bearing faults.


Introduction
Rolling bearings are basic, key components of rotary machine systems. Even a slight fault can lead to abnormal working states of the rotating machinery and can affect the normal running of other components. The resulting concurrent faults can cause economic and personnel losses. Rolling bearings with defects have been widely investigated from two aspects: mechanism studies and diagnosis analyses [1][2][3]. Mechanism studies mainly involve dynamic model design and theoretical analysis of rolling bearings with surface defects (such as pitting or spalling) from the viewpoint of mechanical dynamics theory [4]. Various nonlinear multi-body dynamic models have been proposed that account for Hertzian contact, unbalanced rotor effects, radial internal clearance, raceway spalls and the size of the rolling elements [4][5][6][7]. Diagnosis analyses cover the development of approaches for feature extraction and identification of rolling bearing faults from the viewpoint of signal processing or information theory, such as statistical processing, fractal dimension, linear discriminant analysis, cepstrum analysis, time-frequency analysis, supervised learning and so on [8][9][10][11][12][13][14][15][16][17][18][19][20]. However, few studies have extracted the characteristics of the vibration signals of rolling bearings in terms of dynamics theory. This paper therefore introduces the SymEn method to analyze these vibration signals from the point of view of dynamics.

To describe the behavior of rolling bearings, the features extracted from vibration signals are most commonly high-dimensional feature vectors consisting of values in various domains, such as the time domain, frequency domain, EMD/wavelet domain and so on [20][21][22][23][24][25][26][27][28][29][30][31]. Yang et al. constructed 14-dimensional features including three fractal dimensions and 11 time-domain statistical features to evaluate various fault conditions of rolling bearings [20]. For bearing fault detection, Prieto et al. initially gave a set of 30-dimensional statistical features in the time domain for two channels [21]. Wu et al. studied 80 original features from multi-scale entropy (MSE), multi-scale permutation entropy (MPE), multi-scale root-mean-square (MSRMS) and multi-band spectrum entropy (MBSE) [22]. Jiang et al. constructed a feature vector consisting of 59 features from the time, frequency and wavelet domains [23]. Ali et al. used empirical mode decomposition and time-domain statistics to extract 50 features [24]. Yang et al. extracted 35 features to identify the condition of a rotor [25]. Wang et al. extracted a set of 22 statistical features from the time, frequency and time-frequency domains [26]. Mbo'o et al. extracted a 10 × 6 feature matrix by using frequency analysis and statistical measures for the diagnosis of bearing damage [27]. Sun et al. applied a weighted wavelet packet entropy (WWPE) method to construct a feature vector of 50 WWPE values for each moving window [28]. Xu et al. applied multiscale base-scale entropy approaches to construct high-dimensional feature vectors (such as 4-dimensional vectors) for fault diagnosis of rolling bearings [29]. Li et al. used 20 features in the time domain and frequency domain to reflect the running states of bearings [30]. Yu gave 20-dimensional original features including Shannon entropy, second-order statistics, frequency characteristics, wavelet energy, etc. [31].

However, high-dimensional fault features often contain a great deal of redundant and irrelevant information, which increases the computational burden of the corresponding classification techniques and can even reduce the classification accuracy [32]. To avoid these problems, dimension reduction of the high-dimensional features is regarded as a necessary step. In other words, obtaining appropriate features is a key step for feature recognition. Various dimension reduction techniques have been developed and applied, such as marginal Fisher analysis (MFA) [23], principal component analysis (PCA) [33], generalized discriminant analysis (GDA) [34] and so on, but these methods depend on the intrinsic dimensionality of the analyzed data. That is to say, to a great extent, the recognition performance depends on the intrinsic dimension value of the data [35,36]. Although choosing an optimal value of the intrinsic dimension is still a great challenge, the intrinsic dimension of the analyzed data should come from its system. This paper introduces the SymEn method to study the dimensionality and nonlinearity of the vibration signals by reconstructing a Hamilton matrix of the rolling bearing system in symplectic space.
Symplectic space is an even-dimensional phase space. In this space, the geometry transform theory is called symplectic geometry. A Hamilton matrix of the system in symplectic space can be equivalent to its original system. Namely, the eigenvalues of the Hamilton matrix contain the dynamical information of the system, such as its dimension. At present, symplectic geometry theory has mainly been studied in nonlinear differential systems [37][38][39]. However, it has also been applied to time series from complex systems in the reconstructed phase space, for tasks such as dimension estimation, nonlinear analysis, symplectic principal component analysis, noise reduction, data prediction and feature extraction [40][41][42][43]. Meanwhile, the SymEn estimate is an entropy measure for complex systems in symplectic geometry space, which can reflect the intrinsic characteristics of dynamical systems [44].
The rest of this paper is organized as follows: Section 1 introduces the symplectic entropy method as well as the approximate entropy (ApEn), the sample entropy (SampEn), the fuzzy entropy (FuzzyEn), and the radial basis function (RBF) classifier. Section 2 presents our experimental signals and the vibration data sets from the Case Western Reserve University (CWRU) Bearing Data Center. Section 3 analyzes and discusses the dynamical characteristics of our experimental data based on the SymEn method. The effectiveness of the proposed approach is verified by using the RBF classifier to diagnose faults in our experiments, in comparison with the ApEn, SampEn and FuzzyEn methods. Then, the SymEn method with the RBF classifier is also applied to detect faults in the CWRU data. Finally, Section 4 gives the conclusions.

Methodology
From the point of view of dynamics theory, a time series, i.e., a one-dimensional signal, can be used to reconstruct the system dynamics according to Takens' embedding theorem. In other words, for a real time series $x$ ($x_i$, $i = 1, \dots, N$), an attractor $X_{l \times m}$ is reconstructed in phase space, which is also called the trajectory matrix of the dynamical system:

$$X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_l \end{bmatrix} = \begin{bmatrix} x_1 & x_2 & \cdots & x_m \\ x_2 & x_3 & \cdots & x_{m+1} \\ \vdots & \vdots & & \vdots \\ x_l & x_{l+1} & \cdots & x_N \end{bmatrix} \quad (1)$$

where $X_i = (x_i, x_{i+1}, \dots, x_{i+m-1})$, $i = 1, \dots, l$, represents a point of the system in the phase space, $m$ is the embedding dimension, and $l = N - m + 1$ is the number of points in $R^m$. The matrix $X$ contains the dynamical information of the system. To keep the dynamical characteristics of the matrix $X$ unchanged, this paper introduces the SymEn method, which relies on the measure-preserving transformation of symplectic geometry theory. The ApEn, SampEn and FuzzyEn methods are also given for comparison.
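The reconstruction in Equation (1) can be sketched in a few lines. This is a minimal illustration assuming a unit time delay, as implied by $l = N - m + 1$; the function name `trajectory_matrix` is hypothetical and not taken from the paper.

```python
import numpy as np

def trajectory_matrix(x, m):
    """Build the l x m trajectory matrix of Equation (1), with l = N - m + 1.

    Assumes a unit time delay between successive coordinates.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if not 1 <= m <= n:
        raise ValueError("m must satisfy 1 <= m <= len(x)")
    l = n - m + 1
    # Column i is the series shifted by i samples, so row r holds
    # (x_r, x_{r+1}, ..., x_{r+m-1}).
    return np.stack([x[i:i + l] for i in range(m)], axis=1)

X = trajectory_matrix(np.arange(1, 7), m=3)
print(X.shape)   # number of phase-space points by embedding dimension
```

For a six-sample series and m = 3, the rows are the four overlapping windows of length three.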

Symplectic Entropy (SymEn)
The SymEn method estimates a SymEn measure based on the energy distribution of the attractor $X$ in symplectic space. The SymEn statistic can better describe the properties of the system by using the symplectic transform, even for nonlinear dynamic systems [44]. The SymEn algorithm is as follows.

First, the attractor $X$ in Equation (1) is mapped into symplectic space. That is, for the attractor $X$, one can get a Hamilton matrix $M$ of the system in symplectic space [41]:

$$M = \begin{bmatrix} A & 0 \\ 0 & -A^{T} \end{bmatrix}$$

where $A = X^{T}X$. $M$ is equivalent to the attractor $X$.

Second, the eigenvalues of the Hamilton matrix $M$ are obtained by a symplectic similar transform. According to symplectic geometry theory, a Hamilton matrix keeps its structure under a symplectic similar transform [40], and a similar transform preserves eigenvalues. The eigenvalues of $M$ therefore reflect the dynamical characteristics of the system. The symplectic similar transform is a regular transform based on a bilinear antisymmetric nonsingular cross product, called the symplectic cross product $[x, y]$ of vectors $x, y \in R^{2n}$:

$$[x, y] = \langle x, Jy \rangle$$

where $\langle \cdot, \cdot \rangle$ denotes the inner product and $J$ is the skew-symmetric matrix:

$$J = \begin{bmatrix} 0 & I_n \\ -I_n & 0 \end{bmatrix}$$

where $I_n$ is the $n$-dimensional unit matrix. In symplectic geometry, the Hamilton matrix $M$ can be reduced by a symplectic matrix $H$ to a form built from an upper Hessenberg matrix $B$ ($b_{ij} = 0$, $i > j + 1$):

$$H M H^{-1} = \begin{bmatrix} B & 0 \\ 0 & -B^{T} \end{bmatrix}$$

where $H$ can also be a Householder matrix, i.e.,

$$H = \begin{bmatrix} P & 0 \\ 0 & P \end{bmatrix}, \qquad P = I_n - \frac{2\varpi\varpi^{T}}{\varpi^{T}\varpi}$$

Here, $P$ can be obtained from the matrix $A$ [40,42]. For the matrix $A = (a_{ij})_{n \times n}$, when the entries of the first column below $a_{11}$ are not all zero, an elementary reflector $P^{(1)}$ of the above form is calculated from the first column of $A$. Applying $P^{(1)}$ transforms $A$ into $A^{(2)}$, in which the values of the first column are all zero except the first element. Next, the matrix $P^{(2)}$ is constructed in the same way from the second column of $A^{(2)}$. Thus, $A^{(2)}$ is transformed into $A^{(3)}$, in which the values of the second column are all zero except the first two elements. One can repeat the above steps until $A^{(n)}$ becomes an upper triangular matrix $B$. The matrix $P$ is then computed from the elementary reflectors $P^{(1)}, P^{(2)}, \dots$

Entropy 2017, 19, 607

The eigenvalues $\mu = \{\mu_1, \mu_2, \dots, \mu_m\}$ of $A$ are then obtained from $B$. The $\mu$ values reflect the distribution of the energy of the system in different directions of symplectic space. The probability space of the system can be expressed as:

$$p_i = \frac{\mu_i}{\sum_{j=1}^{m} \mu_j}, \quad i = 1, \dots, m \quad (17)$$

where $m$ is the embedding dimension, $0 \le p_i \le 1$ and $\sum_{i=1}^{m} p_i = 1$. $p_i$ describes the uncertainty in different directions. Then, the SymEn measure can be defined as follows:

$$SymEn = -\sum_{i=1}^{m} p_i \log p_i \quad (18)$$

The SymEn statistic calculates the statistical average and measures the degree of uncertainty of the underlying energy probability distribution, following the Shannon entropy.
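As a rough numerical illustration of the last two steps, the sketch below computes an entropy of the normalized eigenvalues of $A = X^{T}X$. It rests on several assumptions that are not confirmed by the text: the eigenvalues are obtained with NumPy's symmetric eigensolver (`numpy.linalg.eigvalsh`) as a stand-in for the Householder-based symplectic reduction, $p_i$ is taken as $\mu_i / \sum_j \mu_j$, and the natural logarithm is used. The function name `symplectic_entropy` is hypothetical.

```python
import numpy as np

def symplectic_entropy(x, m=7):
    """Shannon entropy of the normalized eigenvalues of A = X^T X.

    Assumptions: unit-delay embedding, p_i = mu_i / sum(mu), natural log,
    and a generic symmetric eigensolver instead of the symplectic reduction.
    """
    x = np.asarray(x, dtype=float)
    l = x.size - m + 1
    X = np.stack([x[i:i + l] for i in range(m)], axis=1)   # trajectory matrix
    A = X.T @ X                                            # m x m, symmetric
    mu = np.clip(np.linalg.eigvalsh(A), 0.0, None)         # guard round-off
    total = mu.sum()
    if total == 0.0:
        raise ValueError("signal has zero energy")
    p = mu / total
    p = p[p > 0]                                           # 0 * log 0 := 0
    return float(-(p * np.log(p)).sum())

rng = np.random.default_rng(0)
se_noise = symplectic_entropy(rng.standard_normal(5000), m=7)
se_const = symplectic_entropy(np.ones(5000), m=7)
print(se_noise, se_const)
```

Under these assumptions the value is bounded by ln m: a constant signal concentrates all energy in one direction and gives a value near zero, while white noise spreads energy almost evenly and gives a value close to ln 7.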

Approximate Entropy (ApEn)
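For the ApEn algorithm [45], when the dimension is equal to $m$, the distance $d^m_{ij}$ between $X^m_i$ and $X^m_j$ in Equation (1) is defined as:

$$d^m_{ij} = \max_{k = 0, \dots, m-1} \left| x_{i+k} - x_{j+k} \right| \quad (19)$$

Then, the similarity between $X^m_i$ and $X^m_j$ can be described by the measure $C^m_i(r)$, i.e.,

$$C^m_i(r) = \frac{1}{N-m+1} \sum_{j=1}^{N-m+1} \Theta\left(r - d^m_{ij}\right), \qquad r = k \cdot \mathrm{std}(x)$$

where std(·) means the standard deviation, $r$ represents a predetermined tolerance value, $k$ is a constant larger than zero, and $\Theta$ is the Heaviside function, i.e.,

$$\Theta(z) = \begin{cases} 1, & z \ge 0 \\ 0, & z < 0 \end{cases} \quad (22)$$

Next, one can get $\phi^m(r)$ as follows:

$$\phi^m(r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} \ln C^m_i(r)$$

For $m + 1$, there is:

$$\phi^{m+1}(r) = \frac{1}{N-m} \sum_{i=1}^{N-m} \ln C^{m+1}_i(r)$$

Then, for the time series $x$ with finite length $N$, the ApEn value can be defined as:

$$ApEn(m, r, N) = \phi^m(r) - \phi^{m+1}(r)$$

Here, for given $N$ data points, the parameters $m$ and $r$ must be chosen so that the ApEn values are reasonable. In applications, $m$ is generally chosen as 2, and $r$ ranges from 0.1 to 0.2 std [45]. In addition, the ApEn estimate is biased because the ApEn algorithm counts self-matches [46].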
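The ApEn steps can be sketched as follows. This is a minimal illustration using the standard ApEn definition with self-matches included; the function name `approximate_entropy` and the default k = 0.2 are illustrative choices, and the all-pairs distance array makes it suitable only for short series.

```python
import numpy as np

def approximate_entropy(x, m=2, k=0.2):
    """ApEn(m, r, N) with r = k * std(x); self-matches are counted."""
    x = np.asarray(x, dtype=float)
    r = k * x.std()

    def phi(dim):
        l = x.size - dim + 1
        X = np.stack([x[i:i + l] for i in range(dim)], axis=1)
        # Maximum-norm distance between every pair of templates.
        d = np.abs(X[:, None, :] - X[None, :, :]).max(axis=2)
        C = (d <= r).mean(axis=1)          # Heaviside count, self included
        return np.log(C).mean()

    return float(phi(m) - phi(m + 1))

rng = np.random.default_rng(0)
apen_periodic = approximate_entropy(np.tile([0.0, 1.0], 100))
apen_noise = approximate_entropy(rng.standard_normal(300))
print(apen_periodic, apen_noise)
```

A strictly periodic series yields a value near zero, whereas white noise yields a clearly larger value.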

Sample Entropy (SampEn)
The SampEn algorithm does not count self-matches in order to reduce the bias. For the SampEn estimate, the similarity measure $C^m_i(r)$ is defined as follows:

$$C^m_i(r) = \frac{1}{N-m-1} \sum_{j=1, j \ne i}^{N-m} \Theta\left(r - d^m_{ij}\right)$$

Here, the analysis uses the first $N - m$ vectors of length $m$ so that, for $1 \le i \le N - m$, both $X^m_i$ and $X^{m+1}_i$ are defined. $d^m_{ij}$ is calculated by using Equation (19) with $i \ne j$. Then:

$$\Phi^m(r) = \frac{1}{N-m} \sum_{i=1}^{N-m} C^m_i(r) \quad (28)$$

and for $m + 1$, there is:

$$\Phi^{m+1}(r) = \frac{1}{N-m} \sum_{i=1}^{N-m} C^{m+1}_i(r) \quad (30)$$

The SampEn can be calculated by using Equations (28) and (30):

$$SampEn(m, r, N) = -\ln \frac{\Phi^{m+1}(r)}{\Phi^m(r)} \quad (31)$$

Here, the choices of $m$ and $r$ are the same as those of the ApEn algorithm. The SampEn statistic computes the negative logarithm of a probability associated with the time series as a whole and does not employ a template-wise strategy, whereas the ApEn statistic calculates probabilities in a template-wise fashion. On the other hand, the SampEn statistic employs the same Heaviside function as the ApEn to measure the similarity of vectors. However, the Heaviside function is discontinuous and has a hard boundary, which causes problems in the validity and accuracy of the results for a small tolerance $r$ [45,46].
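A compact sketch of the SampEn computation is given below, following the standard definition in which self-matches are excluded and the first N − m templates are used for both lengths. The function name `sample_entropy` is hypothetical, and the all-pairs distance array limits it to short series.

```python
import numpy as np

def sample_entropy(x, m=2, k=0.2):
    """SampEn(m, r, N) with r = k * std(x); self-matches are excluded."""
    x = np.asarray(x, dtype=float)
    n = x.size
    r = k * x.std()

    def match_count(dim):
        # The first N - m templates are used for dim = m and dim = m + 1.
        X = np.stack([x[i:i + n - m] for i in range(dim)], axis=1)
        d = np.abs(X[:, None, :] - X[None, :, :]).max(axis=2)
        return np.count_nonzero(d <= r) - X.shape[0]   # remove the diagonal

    b = match_count(m)        # proportional to Phi^m
    a = match_count(m + 1)    # proportional to Phi^{m+1}
    if a == 0 or b == 0:
        raise ValueError("no template matches; increase r or the data length")
    return float(-np.log(a / b))

rng = np.random.default_rng(0)
sampen_periodic = sample_entropy(np.tile([0.0, 1.0], 100))
sampen_noise = sample_entropy(rng.standard_normal(400))
print(sampen_periodic, sampen_noise)
```

Because both counts share the same normalization, their ratio equals the ratio of the two averaged similarity measures, so the normalization can be skipped.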

Fuzzy Entropy (FuzzyEn)
The FuzzyEn algorithm employs a fuzzy membership function instead of the Heaviside function to avoid the above issues in the SampEn and ApEn algorithms [47]. A fuzzy membership function has ambiguous boundaries and takes real values in the range [0, 1].

For the FuzzyEn estimate, the similarity measure $C^m_i(r)$ is defined as follows:

$$C^m_i(r) = \frac{1}{N-m-1} \sum_{j=1, j \ne i}^{N-m} D^m_{ij}$$

where $D^m_{ij} \in [0, 1]$ denotes the fuzzy membership degree of the distance $d^m_{ij}$ with respect to the tolerance $r$, which replaces the Heaviside function $\Theta(\cdot)$ in Equation (22). The measure $C^{m+1}_i(r)$ for $m + 1$ is defined in the same way from $D^{m+1}_{ij}$. Then, referring to Equations (28), (30) and (31) in the SampEn algorithm, the FuzzyEn measure for finite datasets can be defined:

$$FuzzyEn(m, r, N) = \ln \Phi^m(r) - \ln \Phi^{m+1}(r)$$

The FuzzyEn differs from the ApEn and the SampEn in two main characteristics, i.e., continuity and convexity. Thus, the similarity changes continuously and self-similarity is the maximum. Moreover, the FuzzyEn statistic is also the negative natural logarithm of the ratio of $\Phi^{m+1}$ to $\Phi^m$.

The FuzzyEn algorithm is similar to the ApEn and SampEn algorithms. The SymEn algorithm differs from the ApEn, SampEn and FuzzyEn algorithms, although the first step of all four is to reconstruct an attractor from a time series. The SymEn values are calculated by a series of measure-preserving transforms in symplectic space, whereas the other entropy values are not. In practical applications, the SymEn method can use any embedding dimension, even a high one [44], whereas the others can often only use low dimensions, such as 2 or 3 [45][46][47]. For the vibration signals of the rolling bearings, this paper studies the effect of the embedding dimension on the SymEn algorithm.
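The sketch below illustrates a FuzzyEn-style computation. Two choices are assumptions taken from a common formulation of FuzzyEn rather than from the text above: each template has its own mean removed before distances are computed, and the membership function is the exponential $\exp(-d^n / r)$ with n = 2. The function name `fuzzy_entropy` is hypothetical.

```python
import numpy as np

def fuzzy_entropy(x, m=2, k=0.2, n=2):
    """FuzzyEn sketch with r = k * std(x).

    Assumed (not confirmed by the source): per-template baseline removal
    and the membership function exp(-(d ** n) / r).
    """
    x = np.asarray(x, dtype=float)
    size = x.size
    r = k * x.std()

    def phi(dim):
        # First N - m templates for both dim = m and dim = m + 1.
        X = np.stack([x[i:i + size - m] for i in range(dim)], axis=1)
        X = X - X.mean(axis=1, keepdims=True)          # assumed baseline removal
        d = np.abs(X[:, None, :] - X[None, :, :]).max(axis=2)
        D = np.exp(-(d ** n) / r)                      # assumed membership
        np.fill_diagonal(D, 0.0)                       # exclude self-matches
        count = D.shape[0]
        return D.sum() / (count * (count - 1))

    return float(np.log(phi(m)) - np.log(phi(m + 1)))

rng = np.random.default_rng(0)
fe_periodic = fuzzy_entropy(np.tile([0.0, 1.0], 100))
fe_noise = fuzzy_entropy(rng.standard_normal(300))
print(fe_periodic, fe_noise)
```

Unlike the Heaviside-based counts, every pair contributes a graded similarity, so the estimate does not fail when no pair falls strictly inside the tolerance.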

Radial Basis Function (RBF) Classifier
Classifiers based on RBF have been widely used in different fields for pattern recognition [48]. The RBF classifier includes three distinct layers: an input layer, a hidden layer and an output layer (see Figure 1). The input layer is an interface between the input sources and the RBF network. The hidden layer is a high-dimensional nonlinear mapping layer of multiple RBF neurons. The output layer consists of linear neurons that map the hidden layer to the outputs. Given the input set $x_j \in R^n$, $c_i$ is the center of the $i$-th radial basis function $R_i$ and $y_k$ is the $k$-th class output. For the $j$-th input $x_j$, the output $y_k$ of the $k$-th node in the output layer is a weighted sum of the Gaussian radial basis functions of the hidden layer:

$$y_k(x_j) = \sum_{i} \omega(k, i) R_i(x_j), \qquad R_i(x_j) = \exp\left(-\frac{\left\| x_j - c_i \right\|^2}{2\sigma^2}\right)$$

where $\omega(k, i)$ represents the weight between the $i$-th node in the hidden layer and the $k$-th node in the output layer, and $\sigma$ is the dispersion of the Gaussian function. For the algorithmic details of the RBF classifier, readers may refer to [49].

RBF networks have global approximation and best-approximation properties, and their training is fast and simple. This paper applies the RBF method to classify the four states of rolling bearings.
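The three-layer structure can be illustrated with a small NumPy sketch. It is not the training procedure of [49]: the centers are simply supplied by the caller, the Gaussian uses a $2\sigma^2$ denominator, and the output weights are fitted by linear least squares on one-hot targets. The class name `RBFClassifier` and the toy two-dimensional data (standing in for two-channel entropy features) are hypothetical.

```python
import numpy as np

class RBFClassifier:
    """Gaussian hidden layer plus linear output layer (least-squares fit)."""

    def __init__(self, centers, sigma):
        self.centers = np.asarray(centers, dtype=float)
        self.sigma = float(sigma)
        self.weights = None

    def _hidden(self, X):
        # Squared distance from every input to every center.
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * self.sigma ** 2))

    def fit(self, X, labels, n_classes):
        X = np.asarray(X, dtype=float)
        targets = np.eye(n_classes)[np.asarray(labels)]     # one-hot rows
        self.weights, *_ = np.linalg.lstsq(self._hidden(X), targets, rcond=None)
        return self

    def predict(self, X):
        scores = self._hidden(np.asarray(X, dtype=float)) @ self.weights
        return scores.argmax(axis=1)

# Toy data: four well-separated clusters, one per class.
cluster_centers = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
offsets = np.array([[0.05, 0.0], [-0.05, 0.0], [0.0, 0.05], [0.0, -0.05]])
X_train = (cluster_centers[:, None, :] + offsets[None, :, :]).reshape(-1, 2)
y_train = np.repeat(np.arange(4), 4)

clf = RBFClassifier(cluster_centers, sigma=0.3).fit(X_train, y_train, n_classes=4)
accuracy = float((clf.predict(X_train) == y_train).mean())
print(accuracy)
```

Fitting only the output weights keeps training a linear problem, which is one reason RBF training is usually fast.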

Case 1: Experimental Vibration Time Series for the Rolling Bearings
In Case 1, two types of rolling bearings (GB203 and GB6203) are used for convenience of analysis. Table 1 shows their main parameters. For the testing sets, the numbers of NC, ORF, REF and IRF data sets are 154, 66, 132 and 66, respectively (see Table 2).



Case 2: Standard Reference Data Sets in the CWRU Bearing Database
To further test the utility of the proposed method, the vibration data sets for 6205-2RS JEM Svenska Kullager-Fabriken (SKF) deep-groove ball bearings from the website database of the Case Western Reserve University (CWRU) Bearing Data Center (http://csegroups.case.edu/bearingdatacenter/pages/welcome-case-western-reserve-university-bearing-data-center-website) are used [8,23,25,29,30]. These data are acceleration signals in the vertical direction on the housing of the drive end (DE) bearing and on the housing of the fan end (FE) bearing, respectively. The rotor states include NC, ORF, REF and IRF. Table 3 gives the number of data sets in the seven analyzed data groups, with a 12 kHz sampling frequency and a 1730 r/min shaft speed, for the 6205-2RS JEM SKF deep-groove ball bearings. The fault depth is 0.11 inches. The fault diameters are 0.18 mm, 0.36 mm and 0.53 mm, respectively. Meanwhile, the ORF position is located at the lowest point (6 o'clock), the highest point (12 o'clock) and the horizontal point (3 o'clock), respectively.

Dimension Analysis of Vibration Signals for the Rolling Bearing
Dimension estimation is often a basic and important step in the dynamical analysis of vibration time series. Takens' embedding theorem shows that, if the embedding dimension is appropriate, the n-dimensional dynamics of a system can be reconstructed from a time series of an observable of the system. The reconstructed dynamical properties can reflect the dynamical characteristics of the measured experimental data. Based on Takens' embedding theorem, the dimension of the bearing time series is estimated by using symplectic geometry, following our previous study [40]. Figure 3 shows the symplectic geometry spectra of the typical vibration time series in Figure 2d, i.e., the logarithm of the energy distribution in Equation (17). The embedding dimension m ranges from 3 to 28. It can be seen that the symplectic geometry spectra gradually decrease as the dimension m increases and flatten abruptly at m = 7. Therefore, the embedding dimension is chosen as 7 for the following SymEn analysis of the bearing vibration data.
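A symplectic geometry spectrum of this kind can be sketched as the logarithm of the normalized, sorted eigenvalues of $A = X^{T}X$. The sketch assumes a base-10 logarithm and a generic symmetric eigensolver, neither of which is specified in the text; the function name `symplectic_geometry_spectrum` is hypothetical.

```python
import numpy as np

def symplectic_geometry_spectrum(x, m):
    """log10 of the normalized eigenvalues of A = X^T X, in descending order.

    Assumptions: unit-delay embedding, base-10 logarithm, and
    numpy.linalg.eigvalsh in place of the symplectic reduction.
    """
    x = np.asarray(x, dtype=float)
    l = x.size - m + 1
    X = np.stack([x[i:i + l] for i in range(m)], axis=1)
    mu = np.linalg.eigvalsh(X.T @ X)[::-1]            # largest eigenvalue first
    mu = np.clip(mu, np.finfo(float).tiny, None)      # avoid log of zero
    return np.log10(mu / mu.sum())

rng = np.random.default_rng(0)
spectrum = symplectic_geometry_spectrum(rng.standard_normal(2000), m=5)
print(spectrum)
```

In practice one would compute such spectra for a range of m and look for the dimension at which the trailing part of the curve levels off.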



SymEn Analysis of Vibration Signals for the Rolling Bearing
The SymEn measure reflects the dynamical characteristics of a system. As an entropy estimate, the SymEn value quantifies the lack of information in the system by analyzing a time series in symplectic space. Here, the SymEn estimate is used as a test statistic to detect the nonlinear dynamical characteristics of the vibration signals of the rolling bearings. The null hypothesis of a Gaussian linear stochastic process is applied to generate surrogates of the raw vibration signals [52]. The aim of the null hypothesis is to test whether the dynamics of the vibration signals are consistent with a Gaussian linear noise system. Under the null hypothesis, surrogate data with the same power spectrum and empirical distribution as the raw data are generated by using the iterated amplitude adjusted Fourier transform (IAAFT) algorithm [53]. If there is a significant difference between the test statistic value of the raw data and those of its surrogate data, the null hypothesis is rejected, which indicates that the raw data could be nonlinear. To make a judgement at the 95% confidence level, 39 surrogate data sets are generated. The SymEn values are calculated according to Equation (18). For the rolling bearings, the difference between the SymEn values of the raw vibration data in Figure 2d and those of its surrogate data increases with the embedding dimension m (see Figure 4a). Figure 4b shows that there is a significant difference between the SymEn value of the raw data and those of its surrogates at m = 7. The null hypothesis can be rejected at the 95% confidence level, i.e., the raw data do not come from a Gaussian linear stochastic process. The result indicates that the raw data could contain nonlinear dynamical characteristics. Subsequently, this paper further investigates the fault detection of vibration signals by applying the SymEn method.
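The surrogate test can be sketched as follows. The number 39 is consistent with a two-sided rank-order test, for which $2/\alpha - 1 = 39$ surrogates are needed at $\alpha = 0.05$. The code is a basic IAAFT sketch with a fixed iteration count and a simple min/max rank criterion; the function names `iaaft_surrogate` and `rejects_null` are hypothetical, and the decision rule actually used in the paper may differ.

```python
import numpy as np

def iaaft_surrogate(x, n_iter=100, rng=None):
    """One IAAFT surrogate: same value distribution, similar power spectrum."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    target_amp = np.abs(np.fft.rfft(x))     # amplitude spectrum to impose
    sorted_x = np.sort(x)                   # value distribution to impose
    s = rng.permutation(x)                  # random shuffle as starting point
    for _ in range(n_iter):
        # Step 1: keep the current phases, replace the amplitudes.
        phases = np.angle(np.fft.rfft(s))
        s = np.fft.irfft(target_amp * np.exp(1j * phases), n=x.size)
        # Step 2: restore the original values by rank ordering.
        s = sorted_x[np.argsort(np.argsort(s))]
    return s

def rejects_null(stat_raw, stat_surrogates):
    """Two-sided rank test: reject if the raw statistic lies outside all surrogates."""
    stats = np.asarray(stat_surrogates, dtype=float)
    return bool(stat_raw < stats.min() or stat_raw > stats.max())

rng = np.random.default_rng(0)
raw = rng.standard_normal(256)
surrogate = iaaft_surrogate(raw, n_iter=20, rng=1)
print(rejects_null(5.0, [1.0, 2.0, 3.0]), rejects_null(2.0, [1.0, 2.0, 3.0]))
```

In use, the test statistic (for example an entropy value) would be computed for the raw series and for each of the 39 surrogates before calling `rejects_null`.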


Fault Detection of Vibration Signals for the Rolling Bearing Based on SymEn
For the four running states of rolling bearings in Case 1, the SymEn values of the vibration data are calculated with the dimension m = 7 according to Equation (18). Figure 5 shows the SymEn values of the vibration data for the seven NC tests, three ORF tests, six REF tests and three IRF tests in Table 2. We can see that, for the same running state, the SymEn values of different tests are very close, although different rolling bearings have somewhat different SymEn values (see Figure 5). Table 4 gives the mean and standard deviation of the SymEn values of the vibration signals for the four running states. The SymEn values of the four running states are clearly different. The results indicate that the SymEn measure could reflect the dynamical characteristics of rolling bearings.

Moreover, this paper analyzes the effect of the analysis data length L on the proposed method in order to better detect faults of rolling bearings. The SymEn values of the vibration data in Case 1 are calculated with m = 7 and the data length L ranging from 2000 to 10,000 points. Then, the overall accuracies based on the SymEn values are obtained by using the radial basis function (RBF) classifier (see Table 5). For comparison, the ApEn, SampEn and FuzzyEn values are estimated with the dimension m = 2 as well (see Table 5). For the SymEn values, the classification rates increase with the analysis data length L and level off from a data length of 6000 points (see Table 5). For the ApEn, SampEn and FuzzyEn values, the rates level off from a data length of 8000 points and are lower than those of the SymEn at the same analysis data length L, so the length of the analysis data is chosen as 6000 points in this paper. The rates of the four running states based on the SymEn estimates are shown in Table 6. The above classification features are two-dimensional vectors consisting of two SymEn values, one per channel. Compared with the accuracy rates based on the high-dimensional features of Case II in the literature [50,51], the results show that the low-dimensional features based on SymEn can achieve higher recognition rates. The SymEn estimate could describe the characteristics of different running states of rolling bearings. On these data, the SymEn method outperforms the other entropy methods. To further verify the performance of the proposed method, the receiver operating characteristic (ROC) curves and the classification confusion matrix are given in Figure 6. For the training set, the area under the ROC curves is 1. For the testing set, the area is 0.9968. The results show that the proposed method can be applied to classify the four states of rolling bearings.

Here, the data in Case 2 from the CWRU database website are analyzed as well. The DE data are taken as the data of channel 1 and the FE data as the data of channel 2.
The SymEn values are calculated with m = 7 and L = 6000 points. Figure 7 shows the scatter plots based on the SymEn values of the DE and FE data for the groups in Table 3. We can see that the SymEn estimates are clearly different for different running states of the DE bearing (see Figure 7). The accuracies of the RBF classifier are all 100% for the NC, ORF, REF and IRF (see Table 7). In addition, for the same fault diameter (such as groups 1 to 3 and groups 5 to 7 in Table 3), the SymEn values differ with the outer race fault position (see Figure 7a-c,e-f). For the same outer race fault position (such as the 6 o'clock position), the SymEn values also differ with the fault diameter (see Figure 7a,d,e). The results indicate that classification groups of the four running states built with the same shaft speed, the same fault diameter and the same outer race fault position help to investigate the dynamic characteristics of rolling bearings. For this purpose, other data of SKF bearings for four different shaft speeds in the CWRU database are grouped into 70 sets in the same way as Table 3. The best classification rate is 100% and the worst is 84.8%, so not all of the analyzed results reach 100% classification accuracy. Compared with Case 1, this might be caused by using the vertical accelerations at two different points of a horizontal rotating shaft, i.e., the housings of the drive end (DE) bearing and the fan end (FE) bearing, respectively. For the normal FE bearing, the SymEn estimates of the FE data should be very close but are clearly different due to the faults of the DE bearing, and vice versa. These results demonstrate that faulty bearings have a great impact on other bearings of rotary machine systems, such as fan end bearings or drive end bearings. In other words, even if a bearing is normal, it might be considered faulty if another bearing on the same shaft is faulty. Therefore, for fault diagnosis of rolling bearings, the vertical and horizontal accelerations at a single test position might better help to detect faults.
In addition, the data in the CWRU database have been widely investigated. In previous studies, the classification features were often high-dimensional in order to obtain high accuracy rates [23,25,29,30,50]. By comparison, the low-dimensional features based on the SymEn estimate can effectively detect the faults of rolling bearings for the data used in [23,25,29,30,50]. The results indicate that the SymEn estimate could describe the characteristics of different running states of rolling bearings. The performance of the SymEn method is better than that of the other high-dimensional approaches.

Table 7. Classification accuracies for the seven data groups in Table 3.
Training (%): 100, 100, 100, 100, 100, 100, 100
Testing (%): 100, 100, 100, 100, 100, 100, 100

Conclusions
The vibration signals of rolling bearings usually contain non-stationary and nonlinear characteristics. This paper introduces a novel entropy analysis method, the SymEn measure, to investigate the dynamic behavior of rolling bearings. Based on the SymEn analysis, the vibration data of the rolling bearings appear to be high-dimensional and nonlinear. The embedding dimension of the data can be chosen as 7 for the data analysis. Based on the SymEn measure, there are significant differences among the dynamic characteristics of the NC, ORF, REF and IRF conditions of the rolling bearings. For bearing fault detection with the radial basis function classifier, the performance of the proposed method is better than that of the ApEn, SampEn, FuzzyEn and other high-dimensional approaches. The SymEn measure, as a low-dimensional feature, can provide a higher accuracy rate than the high-dimensional features based on the time domain, frequency domain, wavelet domain and so on. The results show that the SymEn method better reflects the vibration characteristics of rolling bearings and can be applied to diagnose rolling bearing faults.

Figure 1. The architecture of the RBF classifier.
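To make the classifier stage concrete, the following is a minimal sketch of a radial basis function network acting on two-dimensional features, such as one entropy value per vibration channel. It is not the implementation used in this paper. The choice of one Gaussian unit per class placed at the class mean, the width value, the function names and the synthetic feature clusters are all assumptions made for illustration.

```python
import numpy as np

def rbf_activations(X, centers, width):
    """Gaussian hidden-layer outputs: one column per center."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * width ** 2))

def fit_rbf(X, y, n_classes, width):
    """Place one center at each class mean (an assumption) and fit the
    linear output layer by least squares against one-hot targets."""
    centers = np.vstack([X[y == k].mean(axis=0) for k in range(n_classes)])
    H = rbf_activations(X, centers, width)
    targets = np.eye(n_classes)[y]
    weights, *_ = np.linalg.lstsq(H, targets, rcond=None)
    return centers, weights

def predict_rbf(X, centers, weights, width):
    """Assign each sample to the class with the largest network output."""
    return (rbf_activations(X, centers, width) @ weights).argmax(axis=1)

def make_synthetic(rng, n_per_class):
    """Synthetic 2-D clusters standing in for four running states.
    The cluster positions are arbitrary, not measured SymEn values."""
    means = np.array([[0.2, 0.2], [0.5, 0.8], [0.8, 0.3], [0.9, 0.9]])
    X = np.vstack([m + 0.03 * rng.standard_normal((n_per_class, 2)) for m in means])
    y = np.repeat(np.arange(len(means)), n_per_class)
    return X, y

rng = np.random.default_rng(0)
X_train, y_train = make_synthetic(rng, 20)
X_test, y_test = make_synthetic(rng, 30)

centers, weights = fit_rbf(X_train, y_train, n_classes=4, width=0.1)
predicted = predict_rbf(X_test, centers, weights, width=0.1)
accuracy = (predicted == y_test).mean()
print(f"test accuracy on synthetic data: {accuracy:.2f}")
```

With well-separated clusters the output layer is close to a scaled identity matrix, so the example mainly shows the data flow from features through Gaussian units to class scores. Real features would need the width and the number of centers to be tuned.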
The analyzed data in Case 1 are measured in the running states of the normal condition (NC), outer race fault (ORF), rolling element fault (REF) and inner race fault (IRF). The ORF, REF and IRF are generated by single pitting on the outer race surface, rolling element surface and inner race surface of the bearings, respectively. The two-channel bearing vibration signals are collected by a data acquisition system with an accelerometer (8791A250, Kistler, Amherst, NY, USA), a signal conditioner (Kistler 5134A), a data acquisition card (PXI-1042, National Instruments, Austin, TX, USA) and a computer on the bearing test rig in the State Key Laboratory of Mechanical System and Vibration [50,51]. Figure 2 shows (a) the bearing test rig, (b) the data acquisition system diagram and (c) the test bearing and accelerometer. Channel 1 is the vertical vibration data; Figure 2d shows typical REF vibration data taken in the vertical direction. Channel 2 is the horizontal vibration data. The rotating speed is 720 r/min. The sampling frequency is 25.6 kHz. In total, there are seven NC tests, three ORF tests, six REF tests and three IRF tests (see Table 2). The sampling time length of each record is 8 s, i.e., 204,800 points. The length of the analysis data is 6000 points. For the training sets, the number of the NC, ORF, REF and IRF data sets is 84, 36, 72 and 36, respectively (see Table 2).
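The record length and data set counts quoted for Case 1 can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the text (25.6 kHz, 8 s, 6000 points, 720 r/min, the number of tests, and the training and testing set counts of Table 2). The assumption that each record is cut into non-overlapping segments, with 12 segments per test going to training and the rest to testing, is inferred from those totals and is not stated explicitly in the text.

```python
FS_HZ = 25_600            # sampling frequency
RECORD_SECONDS = 8        # sampling time length of each record
SEGMENT_POINTS = 6_000    # length of the analysis data
SPEED_RPM = 720           # rotating speed
TESTS = {"NC": 7, "ORF": 3, "REF": 6, "IRF": 3}

# Inferred, not stated: non-overlapping segments, 12 per test used for training.
TRAIN_SEGMENTS_PER_TEST = 12

points_per_record = FS_HZ * RECORD_SECONDS
segments_per_record = points_per_record // SEGMENT_POINTS
test_segments_per_test = segments_per_record - TRAIN_SEGMENTS_PER_TEST

train_counts = {state: n * TRAIN_SEGMENTS_PER_TEST for state, n in TESTS.items()}
test_counts = {state: n * test_segments_per_test for state, n in TESTS.items()}

# Shaft revolutions covered by one 6000-point segment.
revolutions_per_segment = SEGMENT_POINTS / FS_HZ * SPEED_RPM / 60

print("points per record:", points_per_record)
print("segments per record:", segments_per_record)
print("training sets:", train_counts)
print("testing sets:", test_counts)
print("revolutions per segment:", revolutions_per_segment)
```

Under this assumption the script reproduces the reported counts of 84, 36, 72 and 36 training sets and 154, 66, 132 and 66 testing sets, and shows that each analysis segment spans a little under three shaft revolutions.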

Figure 2. The bearing test rig system and typical REF data in Case 1.
These data are acceleration signals in the vertical direction on the drive end (DE) bearing housing and on the fan end (FE) bearing housing, respectively. The rotor states include the normal condition (NC) and three single-point defects (ORF, REF and IRF) at the DE, as well as NC at the FE. Table 3 gives the data set number in the seven analyzed data groups, with a 12 kHz sampling frequency and a 1730 r/min shaft speed, for 6205-2RS JEM SKF deep-groove ball bearings. The fault depth is 0.11 inches. The fault diameters are 0.18 mm, 0.36 mm and 0.53 mm (0.007, 0.014 and 0.021 inches), respectively. Meanwhile, the ORF position is located at the lowest point (6 o'clock), the highest point (12 o'clock) and the horizontal point (3 o'clock), respectively.


Figure 3. The symplectic geometry spectra of the data in Figure 2d.

Entropy 2017, 19, 607

a Gaussian linear stochastic process. The result indicates that the raw data could contain nonlinear dynamical characteristics. Subsequently, this paper further investigates the fault detection of vibration signals by applying the SymEn method.
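The comparison with surrogate data follows the usual logic of a nonlinearity test: surrogates are built to be consistent with a Gaussian linear stochastic process, and a clear difference between the measure on the raw data and on the surrogates suggests nonlinear structure. This section does not state which surrogate algorithm was used. The sketch below uses Fourier phase randomization, a common choice, purely as an assumption. The function name and the toy signal are illustrative only.

```python
import numpy as np

def phase_randomized_surrogate(x, rng):
    """Return a surrogate with the same amplitude spectrum as x but
    randomized Fourier phases (assumed surrogate scheme)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.size)
    phases[0] = 0.0              # leave the DC term (the mean) untouched
    if n % 2 == 0:
        phases[-1] = 0.0         # the Nyquist bin must stay real for even n
    randomized = spectrum * np.exp(1j * phases)
    return np.fft.irfft(randomized, n=n)

# Toy signal with a quadratic nonlinearity plus noise; not bearing data.
rng = np.random.default_rng(1)
t = np.arange(6000)
x = np.sin(0.05 * t) + 0.5 * np.sin(0.05 * t) ** 2 + 0.1 * rng.standard_normal(t.size)

surrogate = phase_randomized_surrogate(x, rng)

# The linear properties (amplitude spectrum, hence autocorrelation) are kept,
# while the waveform itself changes.
same_spectrum = np.allclose(np.abs(np.fft.rfft(surrogate)), np.abs(np.fft.rfft(x)))
print("amplitude spectrum preserved:", same_spectrum)
```

In a test of this kind, the entropy measure would be computed on the raw signal and on a set of such surrogates, and the two sets of values compared.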

Figure 4. The SymEn analysis of the data in Figure 1b and its surrogate data: (a) SymEn values as the embedding dimension m increases from 2 to 25; (b) SymEn values at m = 7.
gives the mean and standard deviation of the SymEn values of vibration signals for four running states. The SymEn values of the four running states are obviously different. The results indicate that the SymEn measure could reflect the dynamical characteristics of rolling bearings.

the vibration data for seven NC tests, three ORF tests, six REF tests, three IRF tests in Table

Figure 5. The SymEn values of two-channel vibration data for 7 NC tests, 3 ORF tests, 6 REF tests and 3 IRF tests. The abscissa denotes each test. The ordinate is the SymEn value with mean and standard deviation.

Figure 6. The performance analysis of the proposed method for the 6000-point data in Case 1: (a,b) for the training sets; (c,d) for the testing sets.

faulty. Therefore, for fault diagnosis of rolling bearings, the vertical and horizontal accelerations at a test position might better help to detect faults.

Figure 7. The scatter plots based on the SymEn values of DE and FE data for the groups in Table 3, with outer raceway faults located at 6 o'clock, 3 o'clock and 12 o'clock, respectively: (a-c) 0.007-inch fault diameter; (d) 0.014-inch fault diameter; (e-g) 0.021-inch fault diameter. The abscissa is the SymEn value of the DE data. The ordinate is the SymEn value of the FE data.

Table 1. Parameters of the rolling bearings in Case 1.
Columns: Type; Pitch Diameter (mm); Ball Diameter (mm); Ball Number; Contact Angle (°)

For the testing sets, the number of the NC, ORF, REF and IRF data sets is 154, 66, 132 and 66, respectively (see Table 2).


Table 2. Test times (TT), training set number (TrSN) and testing set number (TeSN) for the data in Case 1.

Table 3. The used CWRU bearing data set number (SN) of every classification group for 6205-2RS JEM.

Table 4. SymEn values of vibration signals in four running states.

Table 5. Effects of data length L on the SymEn, ApEn, SampEn and FuzzyEn methods for the rolling bearing fault diagnosis.

Table 6. Classification rate (%) of four classes in Case 1 by using SymEn with the RBF classifier.

Table 7. Classification of the data in Case 2, group number corresponding to Table 3.