Rolling Element Bearing Fault Diagnosis under Impulsive Noise Environment Based on Cyclic Correntropy Spectrum

Rolling element bearings are widely used in various industrial machines. Fault diagnosis of rolling element bearings is a necessary tool to prevent any unexpected accidents and improve industrial efficiency. Although proved to be a powerful method in detecting the resonance band excited by faults, the spectral kurtosis (SK) exposes an obvious weakness in the case of impulsive background noise. To well process the bearing fault signal in the presence of impulsive noise, this paper proposes a fault diagnosis method based on the cyclic correntropy (CCE) function and its spectrum. Furthermore, an important parameter of CCE function, namely kernel size, is analyzed to emphasize its critical influence on the fault diagnosis performance. Finally, comparisons with the SK-based Fast Kurtogram are conducted to highlight the superiority of the proposed method. The experimental results show that the proposed method not only largely suppresses the impulsive noise, but also has a robust self-adaptation ability. The application of the proposed method is validated on a simulated signal and real data, including rolling element bearing data of a train axle.


Introduction
Rolling element bearings are one of the most widely used mechanical components in various industrial machines, such as gearboxes, railway axles, and turbines. Their health states tend to degrade due to repeating rotations under harsh working conditions. Fault diagnosis of rolling element bearings is essential to prevent machine breakdown and ensure production efficiency. Therefore, research related to bearing degradation prognostics and fault diagnosis has recently attracted much attention [1][2][3].
The narrowband-based amplitude demodulation of a vibration signal enables the extraction of more detailed information about the fault signature. The procedure of this method uses a band-pass filter to retain one of the resonant frequency bands. Hilbert transform is then employed to demodulate the signal preprocessed by the band-pass filter and to construct a squared envelope on the filtered signal [4]. At last, the squared envelope spectrum is utilized to identify bearing fault frequencies. Thus, the key of the narrowband-based amplitude demodulation is to find a more informative resonant frequency band for Hilbert demodulation. Much effort has been made to address this issue. Spectral

1.
A cyclostationary analysis method based on the correntropy function is introduced into the fault diagnosis of the rolling element bearing under an impulsive noise environment.

2.
The kernel size of the correntropy function is investigated to find out its influence on the rolling element bearing fault diagnosis. 3.
The diagnosis performance of the proposed method is compared with the Fast Kurtogram method using train axle bearing data to verify its suitability.

Correntropy Function
Correntropy function can be regarded a similarity measure based on kernel function. The definition of correntropy function is defined as follows [24]: Let {x t , t ∈ T} be a stochastic process with T being an index set and x t ∈ R d . The correntropy function V(t 1 , t 2 ) is defined as a function from T * T into R + given by Equation (1): where E[·] denotes the mathematical expectation over the stochastic process x t . κ(x t 1 − x t 2 ) corresponds to the positive-definite function which satisfies the Mercer condition [25]. The Gaussian kernel is usually chosen as the Mercer kernel function due to its smoothness and strict positive-definiteness [26]. The kernel function is shown in Equation (2): where σ is the kernel size. The correntropy function is dependent upon the kernel size and it is selected according to certain applications [27]. By applying an extension of the Taylor series to the correntropy function, Equation (1) can be rewritten as follows [24]: which involves all the even-order moments of the random variable ||x t 1 − x t 2 ||. Specifically, the term corresponding to n = 1 in Equation (3) is proportional to: where R x (t 1 , t 2 ) is the covariance function of the random process. Thus, the information provided by the traditional correlation function is included within the new function. It can be seen that the correntropy function is a second-order statistic of the mapped feature space data. Besides, the correntropy function incorporates higher-order moments of the random variable ||x t 1 − x t 2 || by adjusting the values of the kernel size. Thanks to these attractive properties, the correntropy function has been widely applied in machine learning and signal processing. Among those applications, the suppression performance of non-Gaussian noise has been highly researched [28][29][30], in which the Gaussian kernel function plays an important part. The superiority of the Gaussian kernel function mainly includes two points: transforming values of outliers into zero and extracting higher statistical moments. These two merits enable the correntropy function to handle signals corrupted by non-Gaussian noise, especially impulsive noise.

Cyclic Spectral Analysis
A cyclostationary process is a stochastic process that exhibits some hidden periodicities in its structure [16]. Denote a bearing fault signal as x(t n ) and its stream as x[n], where t n = n/F s , n = 0, 1, . . . , L − 1 indicate time instants with sampling frequency F s . Assume that the bearing fault signal is cyclostationary on the second order, which indicates that the instantaneous autocorrelation function is periodic with period T [20]: where E{·} refers to the ensemble average operator; τ = m/F s ; and * is the complex conjugate operator. One important tool for the cyclic spectral analysis of bearing signals is SC analysis, which is obtained in the form of a bi-spectral map, thus reflecting the whole picture of spectral frequency f and cyclic frequency α in the signal. Based on Equation (5), SC is defined as follows [20]: In the case of the second-order cyclostationary signal, SC is continuous at spectral frequency f and discrete at cyclic frequency α. It can be rewritten as follows: where S k x ( f ), k = 0, ±1, ±2, . . . are cyclic spectra. The alignment composed of non-zero values at a certain cyclic frequency α demonstrates the existence of a sinusoidal modulation in the signal, which also suggests the presence of fault signature during the fault diagnosis process. Estimators of SC analysis, such as averaged cyclic periodogram, cyclic modulation spectrum, and fast estimator of SC, are proposed and discussed, contributing to the practical applications of SC analysis.

Fundamental of CCE Function and CCE Spectrum
Similar to formulations of the cyclostationary process of the first and second order, we can obtain the basic expression of the CCE. Let V x (t, τ) denote the correntropy function for a stochastic process x(t) that exhibits hidden periodicities in its structure and for which the time shift is τ. Assume the correntropy function is periodic with T 0 , then: Furthermore, represent V x (t, τ) by Fourier series, as shown as Equation (9): where α = n/T, n ∈ Z is taken as the cyclic frequency associated with the fault characteristic. Let us define the CCE function for x(t) as Fourier coefficients V α x (τ), computed by: According to those formulations, they are functions of time lag τ and are indexed by the cyclic frequency α. Note that for α = 0, the CCE function returns to the conventional correntropy function.
Combined with Equation (2), the CCE function can be rewritten as [23]: The CCE function has proved to be very useful in signal processing, although its applications are limited to communication engineering. It is often more effective and natural to transform the structure of a cyclostationary signal in the frequency domain. Note that the CCE is a function with variables t and τ, its frequency domain is a two-dimensional (2D) Fourier transform with two frequency variables α and f , and CCE spectrum (CCES) is defined as a Fourier transform of CCE as follows: S α x ( f ) displays the power distribution of the cyclostationary signal with respect to both the spectral frequency f , which is associated with the system resonance frequency, and the cyclic frequency α, which is also known as the fault frequency.

Kernel Size Selection of the CCE Function
The Gaussian variance, also known as kernel size, is a free parameter which must be chosen according to certain applications. A well-tuned kernel size can minimize the effect of the noise interruption. Therefore, the diagnosis performance of cyclic correntropy is largely determined by the kernel size. However, in previous studies related to cyclic correntropy, the selection method of kernel size was not addressed in detail [21][22][23] and kernel size mainly was decided by multiple trials. Thus, it is necessary to find a solution for the kernel size selection.
The correntropy function is an extended work of the information theoretic learning (ITL) theory [31,32]. Thanks to the work of Robert et al. [33], the link between the inner product in a kernel feature and the ITL cost function was discovered. Robert et al. claimed that the ITL cost functions, when estimated by the Parzen method, can be expressed in terms of inner products in a kernel feature space defined by a Mercer kernel. This link offers a criterion for the selection of the Mercer kernel size based on density estimation.
Among those kernel size estimation methods [34][35][36], Silverman's rule is a classical method because of its low computation cost and robust performance [37]. Therefore, this paper selects the optimal parameter according to Silverman' rule, one of the most widely used kernel estimation methods. Silverman's rule is shown here as Equation (13): where N is the data length. A stands for the minimum of the empirical standard deviation of data and the data interquartile range scaled by 1.34. The influence of the kernel size on the bearing fault diagnosis result is discussed more in greater depth in Section 4 with detailed examples.

Estimation of CCES
Based on the introduction of above two sections, the estimation of CCES is shown as follows: Step 1. Denote the input signal x[n], with signal length n. Then divide the input signal into L blocks, with each block of N samples.

Estimation of CCES
The following example aims to illustrate the failure of the Fast Kurtogram in the presence of impulsive noise. Firstly, the simulated bearing fault signal is modeled as a single-degree-of-freedom system according to Reference [38]: where α is equal to 900; f m is the fault frequency, which is set to 100 Hz; and F s is the sampling frequency, which is set to 10,000 Hz. f is the resonant frequency, which is equal to 1000. τ r is subject to discrete uniform distribution, and is thus used to simulate the randomness caused by roller slippage. Impulsive noise has a similar behavior to the bearing fault signal. According to the impulsive noise simulation method proposed by Antoni [15], the noise is similarly modeled as a single-degree-of-freedom system with different parameter sets. Typically, α is equal to 300, while f m is set to 30 Hz. The resonant frequency is equal to 3000. Note that the impulsive noise is not composed of a single transient but rather multiple noisy transients in the time domain. The final synthetic signal of length L = 10 4 is displayed in Figure 1. Figure 1a shows that impulsive noise is distributed around sample points 1000, 1334, and so on, whose amplitudes are obviously larger than normal transients The Fast Kurtogram method [7] is applied to select an optimal band for the square envelope analysis and the corresponding Kurtogram is displayed in the Figure 2. It can be seen that the impulsive noise around 3000 Hz dominates the Kurtogram at level 4.5, while the fault signature is masked in the noise. Furthermore, the frequency band centered at 3020 Hz is obtained by the band filter and the corresponding envelope and amplitude spectrum are displayed in Figure 3. Figure 3b shows that the filtered signal band is actually the impulsive noise component, whose frequency is set to be 30 Hz. Then, the simulated signal is further analyzed with the CCES proposed in Section 3.3. To observe the fault frequency in greater detail, the CCES is projected onto the cyclic domain. According to Silverman's rule, σ is equal to 0.107. The cyclic domain profile when σ is equal to 0.107 is displayed in Figure 4. It can be seen from the figure that the fault frequency is easily identified, marked with red arrows. The spectrum also displays the frequency component of 30 Hz, marked with a red circle. Compared to the fault frequency at 100 Hz and its harmonics, the impulsive noise is largely suppressed. Thus, the analysis results of the simulated signal under an impulsive environment demonstrates that the proposed method can detect the fault frequency effectively. The Fast Kurtogram method [7] is applied to select an optimal band for the square envelope analysis and the corresponding Kurtogram is displayed in the Figure 2. It can be seen that the impulsive noise around 3000 Hz dominates the Kurtogram at level 4.5, while the fault signature is masked in the noise. Furthermore, the frequency band centered at 3020 Hz is obtained by the band filter and the corresponding envelope and amplitude spectrum are displayed in Figure 3. Figure 3b shows that the filtered signal band is actually the impulsive noise component, whose frequency is set to be 30 Hz. The Fast Kurtogram method [7] is applied to select an optimal band for the square envelope analysis and the corresponding Kurtogram is displayed in the Figure 2. It can be seen that the impulsive noise around 3000 Hz dominates the Kurtogram at level 4.5, while the fault signature is masked in the noise. Furthermore, the frequency band centered at 3020 Hz is obtained by the band filter and the corresponding envelope and amplitude spectrum are displayed in Figure 3. Figure 3b shows that the filtered signal band is actually the impulsive noise component, whose frequency is set to be 30 Hz. Then, the simulated signal is further analyzed with the CCES proposed in Section 3.3. To observe the fault frequency in greater detail, the CCES is projected onto the cyclic domain. According to Silverman's rule, σ is equal to 0.107. The cyclic domain profile when σ is equal to 0.107 is displayed in Figure 4. It can be seen from the figure that the fault frequency is easily identified, marked with red arrows. The spectrum also displays the frequency component of 30 Hz, marked with a red circle. Compared to the fault frequency at 100 Hz and its harmonics, the impulsive noise is largely suppressed. Thus, the analysis results of the simulated signal under an impulsive environment demonstrates that the proposed method can detect the fault frequency effectively. Then, the simulated signal is further analyzed with the CCES proposed in Section 3.3. To observe the fault frequency in greater detail, the CCES is projected onto the cyclic domain. According to Silverman's rule, σ is equal to 0.107. The cyclic domain profile when σ is equal to 0.107 is displayed in Figure 4. It can be seen from the figure that the fault frequency is easily identified, marked with red arrows. The spectrum also displays the frequency component of 30 Hz, marked with a red circle. Compared to the fault frequency at 100 Hz and its harmonics, the impulsive noise is largely suppressed. Thus, the analysis results of the simulated signal under an impulsive environment demonstrates that the proposed method can detect the fault frequency effectively.

Case Study 1
To further testify the performance of the proposed method, comparison experiments are conducted on real bearing data. The first example applies the dataset from the Western Reserve University (WRU) bearing data center [39]. This dataset is free from impulsive noise and can be used to test the performance of the proposed method in the presence of common background noise. The data of inner race faults numbered 169 are chosen for analysis. According to formulas for bearing fault characteristic frequency [40], the inner race fault frequency is 162.185 Hz.
Firstly, the Fast Kurtogram is obtained for the optimal band selection, as shown in Figure 5. It can be seen from the figure that the optimal band is located at level 6 and the frequency center is 2750 Hz, marked with the black circle. This optimal band acquired through the bandpass filter and squared envelope spectrum is displayed in Figure 6. The fault frequency and its harmonics are marked with red arrows. Figure 6b shows that the fault signature is diagnosable but shows some discrete components interrupting the frequency domain.

Case Study 1
To further testify the performance of the proposed method, comparison experiments are conducted on real bearing data. The first example applies the dataset from the Western Reserve University (WRU) bearing data center [39]. This dataset is free from impulsive noise and can be used to test the performance of the proposed method in the presence of common background noise. The data of inner race faults numbered 169 are chosen for analysis. According to formulas for bearing fault characteristic frequency [40], the inner race fault frequency is 162.185 Hz.
Firstly, the Fast Kurtogram is obtained for the optimal band selection, as shown in Figure 5. It can be seen from the figure that the optimal band is located at level 6 and the frequency center is 2750 Hz, marked with the black circle. This optimal band acquired through the bandpass filter and squared envelope spectrum is displayed in Figure 6. The fault frequency and its harmonics are marked with red arrows. Figure 6b shows that the fault signature is diagnosable but shows some discrete components interrupting the frequency domain.

Case Study 1
To further testify the performance of the proposed method, comparison experiments are conducted on real bearing data. The first example applies the dataset from the Western Reserve University (WRU) bearing data center [39]. This dataset is free from impulsive noise and can be used to test the performance of the proposed method in the presence of common background noise. The data of inner race faults numbered 169 are chosen for analysis. According to formulas for bearing fault characteristic frequency [40], the inner race fault frequency is 162.185 Hz.
Firstly, the Fast Kurtogram is obtained for the optimal band selection, as shown in Figure 5. It can be seen from the figure that the optimal band is located at level 6 and the frequency center is 2750 Hz, marked with the black circle. This optimal band acquired through the bandpass filter and squared envelope spectrum is displayed in Figure 6. The fault frequency and its harmonics are marked with red arrows. Figure 6b shows that the fault signature is diagnosable but shows some discrete components interrupting the frequency domain.  Then, with σ set equal to 0.015, the cyclic domain profile of the corresponding signal is displayed in Figure 7. Compared to the squared envelope of the filtered signal based on Fast Kurtogram, the fault frequency and its harmonics are easier to find. The result demonstrates that the proposed method can still obtain a better diagnosis performance compared to the Fast Kurtogram in an environment with common background noise.   Then, with σ set equal to 0.015, the cyclic domain profile of the corresponding signal is displayed in Figure 7. Compared to the squared envelope of the filtered signal based on Fast Kurtogram, the fault frequency and its harmonics are easier to find. The result demonstrates that the proposed method can still obtain a better diagnosis performance compared to the Fast Kurtogram in an environment with common background noise. Then, with σ set equal to 0.015, the cyclic domain profile of the corresponding signal is displayed in Figure 7. Compared to the squared envelope of the filtered signal based on Fast Kurtogram, the fault frequency and its harmonics are easier to find. The result demonstrates that the proposed method can still obtain a better diagnosis performance compared to the Fast Kurtogram in an environment with common background noise.  Then, with σ set equal to 0.015, the cyclic domain profile of the corresponding signal is displayed in Figure 7. Compared to the squared envelope of the filtered signal based on Fast Kurtogram, the fault frequency and its harmonics are easier to find. The result demonstrates that the proposed method can still obtain a better diagnosis performance compared to the Fast Kurtogram in an environment with common background noise.

Case Study 2
In this case study, industrial railway axle bearing fault data are used for further comparison experiments. One unique component of the industrial railway axle bearing signal is the impulsive background noise. During the train operation on the rail, impulsive force will be produced when the train passes through curves or small gaps between rail joints. This impulsive force will be conducted to the bearing casing through the train wheel and creates an impulsive noise environment. This phenomenon raises new challenges to the bearing fault diagnosis. Focusing on this issue, railway bearing data under impulsive noise interference are acquired to validate the superiority of the proposed method.
Our experimental platform for collecting railway axle bearing fault data is shown in Figure 8. Through a transmission set, a variable speed DC motor with a speed up to 1480 r/min is used to drive the rotation of an axle at different speeds. Axle bearings are assembled at the ends of the axle. A lateral load set and a vertical load set are installed to impose practical loads during rail vehicle operation. Note that the lateral load is used to simulate multiple impacts when the train passes curves or small gaps between rail joints. Fan motors are installed to simulate the effect of natural wind in the opposite direction of the vehicle's momentum. Sensors are mounted at 12 o'clock (directly in the vertical load zone) and 3 o'clock (orthogonal to the vertical load zone) of the bearing casing to acquire vibration data. Two fault bearings are selected from the railway maintenance center and their degradation conditions are shown in Figure 8b,c respectively. The sampling frequency is set to 12,800 Hz. The simulated speed and vertical load are set to 150 km/h and 272 kN, respectively. The lateral load for generating the impulsive noise is 20 kN. According to the transmission ratio of our experimental platform, the inner race fault frequency and the outer race fault frequency are calculated as 164 Hz and 120 Hz, respectively.

Case Study 2
In this case study, industrial railway axle bearing fault data are used for further comparison experiments. One unique component of the industrial railway axle bearing signal is the impulsive background noise. During the train operation on the rail, impulsive force will be produced when the train passes through curves or small gaps between rail joints. This impulsive force will be conducted to the bearing casing through the train wheel and creates an impulsive noise environment. This phenomenon raises new challenges to the bearing fault diagnosis. Focusing on this issue, railway bearing data under impulsive noise interference are acquired to validate the superiority of the proposed method.
Our experimental platform for collecting railway axle bearing fault data is shown in Figure 8. Through a transmission set, a variable speed DC motor with a speed up to 1480 r/min is used to drive the rotation of an axle at different speeds. Axle bearings are assembled at the ends of the axle. A lateral load set and a vertical load set are installed to impose practical loads during rail vehicle operation. Note that the lateral load is used to simulate multiple impacts when the train passes curves or small gaps between rail joints.  The input data are firstly analyzed with the Fast Kurtogram method and Figure 9 displays the optimal band marked with the black circle. It can be seen from the figure that the optimal band is obtained at level 3, for which the central frequency is 6000 Hz. The associated squared envelope spectrum is further displayed in Figure 10 based on the optimal band. It is found that the Fast Kurtogram fails to find an informative spectral frequency band relevant to the fault signature for further squared envelope spectrum analysis. In Figure 10b, the frequency components are mainly impulsive noise produced by the lateral load. Therefore, it should be noted that the frequency bands associated with the several largest kurtosis values are not enough to establish a correct spectral frequency range for fault diagnosis of the industrial railway axle bearing. The input data are firstly analyzed with the Fast Kurtogram method and Figure 9 displays the optimal band marked with the black circle. It can be seen from the figure that the optimal band is obtained at level 3, for which the central frequency is 6000 Hz. The associated squared envelope spectrum is further displayed in Figure 10 based on the optimal band. It is found that the Fast Kurtogram fails to find an informative spectral frequency band relevant to the fault signature for further squared envelope spectrum analysis. In Figure 10b, the frequency components are mainly impulsive noise produced by the lateral load. Therefore, it should be noted that the frequency bands associated with the several largest kurtosis values are not enough to establish a correct spectral frequency range for fault diagnosis of the industrial railway axle bearing.  Then, the CCES is applied to analyze the input data of the inner race fault and the corresponding cyclic domain profile is displayed in Figure 11. According to Silverman's rule, the σ is set to 4.496 in this experiment. The highlighted frequencies marked with red arrows are 161 Hz, 329 Hz, and 491 Hz. One thing that should be noted is that there is a minor difference between the theoretic fault frequency and the obtained frequency. This is due to random slips between rolling elements and the inner race caused by the large vertical load [41,42]. Furthermore, irregular rotations of the faulty bearing caused by the high speed and vertical load may also have some influence on the fault frequency calculation.  Then, the CCES is applied to analyze the input data of the inner race fault and the corresponding cyclic domain profile is displayed in Figure 11. According to Silverman's rule, the σ is set to 4.496 in this experiment. The highlighted frequencies marked with red arrows are 161 Hz, 329 Hz, and 491 Hz. One thing that should be noted is that there is a minor difference between the theoretic fault frequency and the obtained frequency. This is due to random slips between rolling elements and the inner race caused by the large vertical load [41,42]. Furthermore, irregular rotations of the faulty bearing caused by the high speed and vertical load may also have some influence on the fault frequency calculation. Then, the CCES is applied to analyze the input data of the inner race fault and the corresponding cyclic domain profile is displayed in Figure 11. According to Silverman's rule, the σ is set to 4.496 in this experiment. The highlighted frequencies marked with red arrows are 161 Hz, 329 Hz, and 491 Hz. One thing that should be noted is that there is a minor difference between the theoretic fault frequency and the obtained frequency. This is due to random slips between rolling elements and the inner race caused by the large vertical load [41,42]. Furthermore, irregular rotations of the faulty bearing caused by the high speed and vertical load may also have some influence on the fault frequency calculation.
The previous procedure is applied to analyze the axle outer race fault signal and the relevant results are plotted in Figures 12 and 13, respectively. The center frequency of the optimal band is located at level 3 with 6000 Hz. Furthermore, the squared envelope analysis based on the optimal band is displayed in Figure 13. Unfortunately, the fault signature seems to be buried in the impulsive noise again. The previous procedure is applied to analyze the axle outer race fault signal and the relevant results are plotted in Figures 12 and 13, respectively. The center frequency of the optimal band is located at level 3 with 6000 Hz. Furthermore, the squared envelope analysis based on the optimal band is displayed in Figure 13. Unfortunately, the fault signature seems to be buried in the impulsive noise again.    The previous procedure is applied to analyze the axle outer race fault signal and the relevant results are plotted in Figures 12 and 13, respectively. The center frequency of the optimal band is located at level 3 with 6000 Hz. Furthermore, the squared envelope analysis based on the optimal band is displayed in Figure 13. Unfortunately, the fault signature seems to be buried in the impulsive noise again.    The previous procedure is applied to analyze the axle outer race fault signal and the relevant results are plotted in Figures 12 and 13, respectively. The center frequency of the optimal band is located at level 3 with 6000 Hz. Furthermore, the squared envelope analysis based on the optimal band is displayed in Figure 13. Unfortunately, the fault signature seems to be buried in the impulsive noise again.   The application of the CCES method with σ equal to 3.021 and the cyclic domain profile of outer race fault signal is shown in Figure 14. The highlighted frequencies marked with red arrows are 120 Hz, 241 Hz, and 358 Hz, which are almost the same as the theoretic fault frequency and its harmonics.
The application of the CCES method with σ equal to 3.021 and the cyclic domain profile of outer race fault signal is shown in Figure 14. The highlighted frequencies marked with red arrows are 120 Hz, 241 Hz, and 358 Hz, which are almost the same as the theoretic fault frequency and its harmonics. An interesting point is that the CCES can obtain the cyclic frequency of the fault signature while suppressing the periodic noise, which also is achieved in the simulation signal analysis. Generally speaking, cyclostationary modeling aims to determine all those cyclostationary components in the signal. According to the CCES procedure proposed in Section 3, it is found that this method can easily deal with impulsive noise without any cyclostationarity. When the impulsive noise is cyclostationary, which can also be considered periodic in time domain, the CCES can also detect this component and the cyclic domain profile will show this cyclic frequency in the spectrum plot. However, this frequency component will not dominate the frequency spectrum because this phenomenon only happens when impulsive noise is distributed all along the whole signal and the impulsive amplitude is large enough, which rarely occurs during train operation or in other industries.

Influence of Kernel Size on the Diagnosis Performance
The Gaussian kernel size is a free parameter selected according to certain applications. The variation of kernel size expands its application range, while improper kernel size may become its potential weakness. Previous studies on the kernel size selection of CCE were mainly based on multiple trials, which does not lead to the optimal parameter for the Gaussian kernel function. In this section, fault diagnosis performance of different kernel size is analyzed to provide an effective mechanism for kernel size selection.
Input data are derived from the above two case studies for comparison, including the WRU data of the inner race fault numbered 169 and the axle bearing data of the outer race fault. The optimal kernel size of each experiment is based on Silverman' rule. In addition, another random kernel size is used for comparisons.
The random kernel size is set to 1 and the optimal kernel size is 0.015 according to Silverman's rule. The analysis results of the WRU data numbered 169 are displayed in Figure 15. Both of the two analysis results can identify the fault frequency and its harmonics. However, Figure 15b with the optimal kernel size displays fewer frequency components than Figure 15a with a random kernel size, especially in the frequency band between 100 Hz and 350 Hz. This is because those frequency components which are uncorrelated to the fault signature have been suppressed. An interesting point is that the CCES can obtain the cyclic frequency of the fault signature while suppressing the periodic noise, which also is achieved in the simulation signal analysis. Generally speaking, cyclostationary modeling aims to determine all those cyclostationary components in the signal. According to the CCES procedure proposed in Section 3, it is found that this method can easily deal with impulsive noise without any cyclostationarity. When the impulsive noise is cyclostationary, which can also be considered periodic in time domain, the CCES can also detect this component and the cyclic domain profile will show this cyclic frequency in the spectrum plot. However, this frequency component will not dominate the frequency spectrum because this phenomenon only happens when impulsive noise is distributed all along the whole signal and the impulsive amplitude is large enough, which rarely occurs during train operation or in other industries.

Influence of Kernel Size on the Diagnosis Performance
The Gaussian kernel size is a free parameter selected according to certain applications. The variation of kernel size expands its application range, while improper kernel size may become its potential weakness. Previous studies on the kernel size selection of CCE were mainly based on multiple trials, which does not lead to the optimal parameter for the Gaussian kernel function. In this section, fault diagnosis performance of different kernel size is analyzed to provide an effective mechanism for kernel size selection.
Input data are derived from the above two case studies for comparison, including the WRU data of the inner race fault numbered 169 and the axle bearing data of the outer race fault. The optimal kernel size of each experiment is based on Silverman' rule. In addition, another random kernel size is used for comparisons.
The random kernel size is set to 1 and the optimal kernel size is 0.015 according to Silverman's rule. The analysis results of the WRU data numbered 169 are displayed in Figure 15. Both of the two analysis results can identify the fault frequency and its harmonics. However, Figure 15b with the optimal kernel size displays fewer frequency components than Figure 15a with a random kernel size, especially in the frequency band between 100 Hz and 350 Hz. This is because those frequency components which are uncorrelated to the fault signature have been suppressed. Similarly, the random kernel size is set to 0.1 and the optimal kernel size is 3.021 according to Silverman's rule. The analysis results of the axle bearing signal of the outer race fault are displayed in Figure 16. It can be seen from the Figure 16a that the result with the randomly chosen kernel size cannot locate frequencies corresponding to the fault characteristic. Compared with Figure 16a, the cyclic domain profile with the optimal kernel size exhibits a much better diagnosis performance, as shown in Figure 16b. The above two analysis results have shown that the kernel size plays an important role in the fault diagnosis through the CCES method, especially when signal components are complex and affected by impulsive noise. The kernel optimization method based on Silverman's rule is highly efficient and accurate during the fault diagnosis. Thus, CCES can be regarded as a self-adaptation method which may also be applied other objects in mechanical fields.  Similarly, the random kernel size is set to 0.1 and the optimal kernel size is 3.021 according to Silverman's rule. The analysis results of the axle bearing signal of the outer race fault are displayed in Figure 16. It can be seen from the Figure 16a that the result with the randomly chosen kernel size cannot locate frequencies corresponding to the fault characteristic. Compared with Figure 16a, the cyclic domain profile with the optimal kernel size exhibits a much better diagnosis performance, as shown in Figure 16b. Similarly, the random kernel size is set to 0.1 and the optimal kernel size is 3.021 according to Silverman's rule. The analysis results of the axle bearing signal of the outer race fault are displayed in Figure 16. It can be seen from the Figure 16a that the result with the randomly chosen kernel size cannot locate frequencies corresponding to the fault characteristic. Compared with Figure 16a, the cyclic domain profile with the optimal kernel size exhibits a much better diagnosis performance, as shown in Figure 16b. The above two analysis results have shown that the kernel size plays an important role in the fault diagnosis through the CCES method, especially when signal components are complex and affected by impulsive noise. The kernel optimization method based on Silverman's rule is highly efficient and accurate during the fault diagnosis. Thus, CCES can be regarded as a self-adaptation method which may also be applied other objects in mechanical fields. The above two analysis results have shown that the kernel size plays an important role in the fault diagnosis through the CCES method, especially when signal components are complex and affected by impulsive noise. The kernel optimization method based on Silverman's rule is highly efficient and accurate during the fault diagnosis. Thus, CCES can be regarded as a self-adaptation method which may also be applied other objects in mechanical fields.

Conclusions
In this paper, we have investigated an impulsive noise suppression method for bearing fault diagnosis based on the CCE function. Silverman's rule was used to obtain an optimal kernel size for CCE input. This adjustable parameter provides an effective mechanism to eliminate the negative effect of impulsive noise. Then, a simulated signal and two real cases were studied to validate the performance of the proposed method. The experiment results show that the proposed method has a good diagnosis performance, especially in the presence of impulsive noise. Furthermore, a powerful frequency band selection method named Fast Kurtogram was applied to analyze the same data for comparison. It was found that the fault diagnosis method based on CCE function outperforms the Fast Kurtogram method.
Further research should mainly focus on the combination of CCE and mode identification methods. Typically, frequency domain features of CCES can be extracted to reflect different fault characteristics. Moreover, considering the computation time of cyclic methodology, a simplified and fast computation method should be developed in subsequent research works. Note that this new method can generate cyclostationary signatures for bearing fault diagnosis. Thus, it should be helpful in the diagnosis of other machinery, such as gears and blades.
Author Contributions: X.Z. and Y.Q. designed the proposed method and prepared the manuscript; X.Z. and C.H. finished writing the code. L.J. and L.K. provided guidance for modifying the manuscript. All authors prepared, revised, and approved the final submission.