A New Fault Feature Extraction Method for Rotating Machinery Based on Multiple Sensors

During the operation of rotating machinery, the vibration signals measured by sensors are the aliasing signals of various vibration sources, and they contain strong noises. Conventional signal processing methods have difficulty separating the aliasing signals, which causes great difficulties in the condition monitoring and fault diagnosis of the equipment. The principle and method of blind source separation are introduced, and it is pointed out that the blind source separation algorithm is invalid in strong pulse noise environments. In these environments, the vibration signals are first de-noised with the median filter (MF) method and the de-noised signals are separated with an improved joint approximate diagonalization of eigenmatrices (JADE) algorithm. The simulation results found here verify the effectiveness of the proposed method. Finally, the vibration signal of the hybrid rotor is effectively separated by the proposed method. A new separation approach is thus provided for vibration signals in strong pulse noise environments.


Introduction
During the operation of rotating machinery, the vibration signals measured by vibration sensors are often composed of the vibrations of multiple components [1,2]. Determining how to analyze, process, and identify these signals is very important for judging the working state of rotating machinery and for fault diagnosis [3]. Analyzing and processing these sensor signals directly is very difficult, which complicates mechanical condition monitoring and fault diagnosis [4]. Conventional signal processing methods are insufficient for the overlapping, multi-component vibration signals of rotating machinery [5][6][7][8].
In recent years, digital signal processing technology has developed rapidly, and a large number of methods are now used for the extraction and noise reduction of rotating machinery fault signals. The empirical mode decomposition (EMD) proposed by Norden E. Huang et al. [9] is a non-stationary signal analysis method that can reveal the feature information hidden in a signal, and it is widely used in the fault extraction and noise reduction of rotating machinery [10][11][12]. Minimum entropy deconvolution [13,14] designs an optimal filter that eliminates the random noise in the bearing impact signal by maximizing the kurtosis value. The adaptive filter [15,16] requires an additional noise signal and extracts the bearing fault impact signal by designing an optimal filter. Matching pursuit [17,18] defines atoms according to the characteristics of bearing fault vibration and impact, decomposes the vibration signal with them, and extracts the impact characteristic components. In mathematical morphology analysis [19,20], a pre-defined structuring element is used to carry out erosion, dilation, opening, and closing operations on the signal so as to suppress the noise.
In Figure 1, the signals emitted by n original signal sources ($s_1, s_2, s_3, \ldots, s_n$) are measured by m sensors, which output the observation signals ($x_1, x_2, x_3, \ldots, x_m$). In practice, when multiple sensors are used for observation, the number of sensors is generally required to be not less than the number of signal sources, that is, $m \ge n$. Assuming that the transmission is instantaneous and that each sensor receives a linear mixture of the original source signals, the output of the i-th sensor is given as follows:

$$x_i(t) = \sum_{j=1}^{n} a_{ij} s_j(t) + v_i(t), \quad i = 1, 2, \ldots, m$$

where $a_{ij}$ is the mixing coefficient and $v_i(t)$ is the observation noise of the i-th sensor. The matrix form is given as follows:

$$\begin{bmatrix} x_1(t) \\ \vdots \\ x_m(t) \end{bmatrix} = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} s_1(t) \\ \vdots \\ s_n(t) \end{bmatrix} + \begin{bmatrix} v_1(t) \\ \vdots \\ v_m(t) \end{bmatrix}$$

which is also written as follows:

$$x(t) = A s(t) + v(t)$$

In the formula, $A \in R^{m \times n}$ is an unknown full-rank mixing matrix, $s(t)$ is an n-dimensional source vector, and $v(t)$ is an additive noise vector whose components are statistically independent.
The purpose of blind source separation is to find a separation matrix W, so that y(t) = Wx(t) is the optimal estimate of s(t).
Here, A is the mixing matrix. Separating the source signals amounts to finding a separation matrix U such that

$$\hat{s}(t) = U x(t) = U A s(t) \tag{5}$$

If

$$U A = I \tag{6}$$

where I is the identity matrix, then $\hat{s}(t)$ is an effective separation of $s(t)$.
Since blind source separation estimates the input signal according to only the observed signal, without any prior knowledge about the source signal, there are some uncertainties between the estimated input signal and the source signal, which are mainly reflected in the uncertainties in the estimation of the amplitude and order of the input signal. However, these two uncertainties do not affect the analysis of the signal, because most of the information of the signal is contained in the waveform rather than in the magnitude and order.
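The noise-free mixing model and the amplitude and order uncertainty can be illustrated with a short numerical sketch. The source waveforms, the mixing matrix `A`, and the use of the pseudo-inverse as an ideal separation matrix are assumptions chosen for illustration; in a truly blind setting `A` is unknown and `U` has to be estimated.

```python
import numpy as np

# Hypothetical example: n = 2 sources observed by m = 3 sensors (m >= n).
t = np.arange(1000) / 1000.0
s = np.vstack([np.sin(2 * np.pi * 25 * t),            # source 1
               np.sign(np.sin(2 * np.pi * 40 * t))])  # source 2

A = np.array([[1.0, 0.5],
              [0.3, 1.2],
              [0.8, -0.7]])   # assumed full-column-rank mixing matrix
x = A @ s                     # noise-free observations x(t) = A s(t)

# With A known, the pseudo-inverse is an ideal separation matrix: U A = I.
U = np.linalg.pinv(A)
y = U @ x

# A permuted and rescaled version of U separates the sources equally well,
# which is the amplitude and order uncertainty described above.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])    # swaps the order of the outputs
D = np.diag([2.0, -0.5])      # rescales the outputs and flips one sign
y_alt = (P @ D @ U) @ x
```

Here `y` reproduces the sources exactly, while `y_alt` returns the same waveforms in a different order and with different amplitudes.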

Median Filter
Median filtering is a nonlinear order-statistic filtering method. Based on the distribution of the sample data, a fixed-length window is set to scan the data. At each position, the data in the window are sorted and the median value is taken as the filtered output at that point [37,38]. The resulting median sequence is the filtered signal. The sorting operation suppresses impulse noise in the signal well while retaining the edge profile information of the original data, but it has only a weak ability to suppress stationary random noise that is linearly superimposed on the signal. Therefore, the median filter is mainly used in signal processing tasks that need to suppress impulse noise and retain edge profile information.
The one-dimensional median filter is described mathematically as follows. Suppose that a dataset consists of k data $\{x(1), x(2), \ldots, x(k)\}$, and let D be a filter window of length $L = 2N + 1$, where N is a positive integer. The dynamic sub-window at the n-th time contains the $2N + 1$ data

$$\{x(n-N), \ldots, x(n), \ldots, x(n+N) \mid n + N \le k\}$$

and the output of the median filter is then defined as follows:

$$s(n) = \mathrm{med}[x(n-N), \ldots, x(n), \ldots, x(n+N)]$$

where $\mathrm{med}[\cdot]$ denotes that the data in the window are arranged in ascending order and the median value is then taken. Scanning the samples with the window, the output median sequence $s(n)$ $(1 \le n \le k - N + 1)$ is the filtered signal. Because the superposition principle no longer holds for this operation, the median filter is a nonlinear filtering method, and its ability to suppress stationary noise is weak.
The above process shows that the median filter is a neighborhood operation that outputs the median of each window sequence. Gaussian white noise that is linearly superimposed on the signal largely remains in the output median sequence of the dynamic window D, and the samples at the edges of the signal are difficult to process. The method therefore preserves signal details, but its ability to suppress linearly superimposed stationary random noise such as Gaussian white noise is weak.
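A minimal sketch of the sliding-window median described above is given below. The function name `median_filter` and the edge handling (truncating the window at the two ends of the record) are assumptions made for this example, not details taken from the text.

```python
from statistics import median

def median_filter(x, N):
    """Sliding median with window length L = 2N + 1.

    Assumption for this sketch: near the two ends of the record the window
    is truncated to the samples that exist, so the output has the same
    length as the input.
    """
    k = len(x)
    return [median(x[max(0, n - N):min(k, n + N + 1)]) for n in range(k)]

spike = [1, 1, 1, 9, 1, 1, 1]   # isolated impulse on a constant signal
step = [0, 0, 0, 5, 5, 5]       # sharp edge without noise

print(median_filter(spike, 1))
print(median_filter(step, 1))
```

With `N = 1` the isolated impulse is removed completely, while the step edge passes through unchanged, which is the behavior that makes the filter attractive before blind separation.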

Independent Component Analysis Based on JADE
If the vibration source signals are statistically independent of each other, blind source separation amounts to finding the independent components in the observed mixed signals. The key step is to find the mixing matrix A or the separation matrix U.

Estimation of the Mixing Matrix A Based on the Fourth-Order Cumulant
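The JADE algorithm is based on the fourth-order cumulant. For n random variables $x_1, \ldots, x_n$, the fourth-order cumulants are defined as follows:

$$\mathrm{cum}(x_i, x_j, x_k, x_l) = E[\tilde{x}_i \tilde{x}_j \tilde{x}_k \tilde{x}_l] - E[\tilde{x}_i \tilde{x}_j] E[\tilde{x}_k \tilde{x}_l] - E[\tilde{x}_i \tilde{x}_k] E[\tilde{x}_j \tilde{x}_l] - E[\tilde{x}_i \tilde{x}_l] E[\tilde{x}_j \tilde{x}_k] \tag{9}$$

where $\tilde{x}_i = x_i - E[x_i]$ and $i, j, k, l = 1, 2, \ldots, n$. The element in row i and column j of the fourth-order cumulant matrix can be expressed as follows:

$$[C_X(M)]_{ij} = \sum_{k=1}^{n} \sum_{l=1}^{n} \mathrm{cum}(x_i, x_j, x_k, x_l)\, m_{kl} \tag{10}$$

where $m_{kl}$ is the $(k, l)$ element of an arbitrary $n \times n$ weight matrix M. If the mean value of each variable in X is zero, then according to Equations (9) and (10), the cumulant matrix of X can be expressed as follows:

$$C_X(M) = E[(X^T M X) X X^T] - R_X \,\mathrm{tr}(M R_X) - R_X M R_X - R_X M^T R_X \tag{11}$$

where $\mathrm{tr}(\cdot)$ is the trace of the matrix and $R_X$ is the covariance matrix of X. It has been proven in [39] that the fourth-order cumulant matrix can be expressed as follows:

$$C_X(M) = \sum_{i=1}^{n} \lambda_i a_i a_i^T = A \Lambda A^T, \quad \Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$$

where $\lambda_i$ is the eigenvalue of $C_X(M)$, $a_i$ is the i-th column vector of A, $i = 1, 2, \ldots, n$, and $A = [a_1, a_2, \ldots, a_n]$. Taking two $n \times n$ matrices $M_1$ and $M_2$, this decomposition gives the following:

$$C_X(M_1) = A \Lambda_1 A^T, \quad C_X(M_2) = A \Lambda_2 A^T$$

Here, $\Lambda_1$ and $\Lambda_2$ are the diagonal matrices associated with $M_1$ and $M_2$. Provided that $C_X(M_2)$ is invertible, it can be concluded that

$$G = C_X(M_1) C_X(M_2)^{-1} = A \Lambda_1 \Lambda_2^{-1} A^{-1} = A \Delta A^{-1}, \quad \text{that is,} \quad G A = A \Delta$$

with $\Delta = \Lambda_1 \Lambda_2^{-1}$. Therefore, the diagonal elements of $\Delta$ can be regarded as the eigenvalues of G, and the columns of A are the eigenvectors of G. Theoretically, finding the eigenvectors of G is equivalent to finding the mixing matrix A.

Sensors 2020, 20, 1713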
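The cumulant matrix of zero-mean data can be evaluated numerically from sample averages. The sketch below assumes the standard matrix expression for real zero-mean data, $C_X(M) = E[(x^T M x)\, x x^T] - R_X \mathrm{tr}(M R_X) - R_X M R_X - R_X M^T R_X$; the function name `cumulant_matrix`, the test data, and the weight matrix are hypothetical.

```python
import numpy as np

def cumulant_matrix(X, M):
    """Sample fourth-order cumulant matrix C_X(M) of zero-mean data X (n x N)."""
    N = X.shape[1]
    R = X @ X.T / N                           # sample covariance matrix R_X
    q = np.einsum('it,ij,jt->t', X, M, X)     # x(t)^T M x(t) for every sample t
    fourth = (X * q) @ X.T / N                # sample estimate of E[(x^T M x) x x^T]
    return fourth - R * np.trace(M @ R) - R @ M @ R - R @ M.T @ R

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(3, 2000))    # hypothetical non-Gaussian data
X = X - X.mean(axis=1, keepdims=True)         # centre each channel
M = rng.standard_normal((3, 3))               # arbitrary weight matrix
C = cumulant_matrix(X, M)
```

The matrix form avoids the four nested loops of the element-wise definition, which matters once a cumulant matrix has to be computed for every weight matrix.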

Signal Whitening
The above discussion is based on the condition that matrix A is invertible and the source signals are statistically independent of each other. Therefore, in practice, the observation signals must first be centered and whitened so that these conditions hold. Centering replaces $x_i(t)$, $i = 1, 2, \ldots, n$, with $x_i(t) - E[x_i]$, so that each observation sequence $x_i(t)$ becomes a zero-mean sequence. Whitening removes the correlation between the components, which is a necessary condition for their statistical independence. Without loss of generality, if the matrix of observation variables after centering is still denoted by

$$X = [x(1), x(2), \ldots, x(N)]$$

where N is the number of observation points, then its covariance matrix can be expressed as follows:

$$R_X = \frac{1}{N} X X^T$$

If V is the unitary matrix of eigenvectors and $\Sigma$ is the diagonal matrix of eigenvalues of $R_X$, so that $R_X = V \Sigma V^T$, then the whitening transformation matrix Q can be expressed as follows:

$$Q = \Sigma^{-1/2} V^T$$

The whitened signal can be expressed as follows:

$$Z(t) = Q X(t)$$

the covariance matrix of which can be written as follows:

$$R_Z = E[Z Z^T] = Q R_X Q^T = I$$

Letting $H = QA$, we obtain

$$Z(t) = Q A S(t) = H S(t)$$

Thus, the process of finding A based on X(t), described in Section 4.1, has been transformed into the process of finding H based on the whitened matrix Z(t).
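Centering and whitening can be sketched with an eigendecomposition of the sample covariance matrix, using $Q = \Sigma^{-1/2} V^T$. The function name `whiten`, the test sources, and the mixing matrix are hypothetical, and the sketch assumes as many sensors as sources so that the covariance matrix has full rank.

```python
import numpy as np

def whiten(X):
    """Centre X (n x N) and return the whitened data Z = Q X together with Q."""
    Xc = X - X.mean(axis=1, keepdims=True)      # centering: zero-mean rows
    R = Xc @ Xc.T / Xc.shape[1]                 # sample covariance matrix R_X
    eigvals, V = np.linalg.eigh(R)              # R_X = V diag(eigvals) V^T
    Q = np.diag(1.0 / np.sqrt(eigvals)) @ V.T   # Q = Sigma^(-1/2) V^T
    return Q @ Xc, Q

rng = np.random.default_rng(1)
S = rng.uniform(-1.0, 1.0, size=(3, 5000))      # hypothetical sources
A = np.array([[1.0, 0.4, 0.2],
              [0.3, 1.0, 0.5],
              [0.6, 0.1, 1.0]])                 # assumed invertible mixing matrix
X = A @ S + 2.0                                 # mixtures with a constant offset

Z, Q = whiten(X)
R_Z = Z @ Z.T / Z.shape[1]                      # should be the identity matrix
```

When there are more sensors than sources, the smallest eigenvalues approach zero and only the n largest eigenpairs should be kept before inverting; that case is not handled here.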

Joint Approximate Diagonalization
In practical calculation, numerical errors and interference noise make exact diagonalization impossible, so only approximate diagonalization can be achieved. Moreover, an H obtained from only two matrices $M_1$ and $M_2$ is generally not optimal, so the joint approximate diagonalization method is used instead. Based on the whitened signal Z(t), we take p arbitrary $n \times n$ matrices $M_1, \ldots, M_p$. In order to meet the accuracy requirements, we generally take $p = n^2$. Next, we compute $C_Z(M_i)$ $(i = 1, 2, \ldots, p)$ for each $M_i$ and find a unitary matrix H that minimizes the following expression:

$$J(H) = \sum_{i=1}^{p} \mathrm{off}\left(H^T C_Z(M_i) H\right) \tag{23}$$

where $\mathrm{off}(\cdot)$ is defined as the sum of squares of all off-diagonal elements of the matrix. The estimate of the independent components of the mixed signals can be expressed as follows:

$$\hat{S}(t) = H^T Z(t) = H^T Q X(t) = U X(t)$$
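One common way to minimize a criterion of the form $\sum_i \mathrm{off}(H^T C_i H)$ is a sequence of Jacobi (plane) rotations, each chosen optimally for one pair of indices. The sketch below follows that idea for real symmetric matrices. It is not claimed to be the improved algorithm used in this paper, and the function names and the synthetic test matrices are assumptions.

```python
import numpy as np

def off(C):
    """Sum of squares of the off-diagonal elements of C."""
    return np.sum(C ** 2) - np.sum(np.diag(C) ** 2)

def joint_diagonalize(mats, sweeps=100, tol=1e-12):
    """Orthogonal H that reduces sum_i off(H^T C_i H) by Jacobi rotations.

    Assumes real symmetric n x n input matrices.
    """
    mats = [C.copy() for C in mats]
    n = mats[0].shape[0]
    H = np.eye(n)
    for _ in range(sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                # Each row holds [C_pp - C_qq, C_pq + C_qp] for one matrix.
                g = np.array([[C[p, p] - C[q, q], C[p, q] + C[q, p]] for C in mats])
                gg = g.T @ g
                ton = gg[0, 0] - gg[1, 1]
                toff = gg[0, 1] + gg[1, 0]
                # Rotation angle that maximizes the diagonal energy for (p, q).
                theta = 0.5 * np.arctan2(toff, ton + np.hypot(ton, toff))
                if abs(np.sin(theta)) > tol:
                    rotated = True
                    c, s = np.cos(theta), np.sin(theta)
                    G = np.array([[c, -s], [s, c]])
                    H[:, [p, q]] = H[:, [p, q]] @ G
                    for C in mats:            # C <- G^T C G on rows/columns p, q
                        C[[p, q], :] = G.T @ C[[p, q], :]
                        C[:, [p, q]] = C[:, [p, q]] @ G
        if not rotated:
            break
    return H

rng = np.random.default_rng(2)
H0, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # hypothetical true rotation
mats = [H0 @ np.diag(rng.standard_normal(3)) @ H0.T for _ in range(9)]  # p = n^2
H = joint_diagonalize(mats)
cost_before = sum(off(C) for C in mats)
cost_after = sum(off(H.T @ C @ H) for C in mats)
```

For this synthetic set, which is exactly jointly diagonalizable, the criterion drops to numerical zero. With cumulant matrices estimated from noisy data it only reaches a minimum, which is why the diagonalization is called approximate.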

Blind Source Separation of Multi-Fault Vibration Signals Based on MF-JADE
Sensors are usually arranged on the X and Y axes of rotating machinery. As the collected signals are transmitted, the statistical independence of each signal is easily affected by transmission time differences, noise, and other factors. Therefore, median filtering and whitening are required before blind source separation is performed.

Basic Steps
Step 1: Carry out median filtering and centering for the fault signal data observed in each channel.
Step 2: Whiten the data with Equations (18)-(20) to get $Z(t)$, $t = 1, 2, \ldots, N$.
Step 3: Take p weight matrices $M_1, \ldots, M_p$ and calculate the fourth-order cumulant matrices with Equation (11), where generally $p = n^2$.
Step 4: Jointly approximately diagonalize the matrix group $C_Z(M_i)$ of Step 3 by minimizing the objective function (Equation (23)), which yields the unitary matrix H.
Step 5: Calculate the separation matrix $U = H^T Q$.
Step 6: Estimate the source signal according to Equation (5).
Step 7: Analyze the signal characteristics and conduct fault diagnosis.

Evaluation Index of Separation Effect
To quantify the effect of blind source separation (BSS), several performance evaluation indices are needed to reflect, from different aspects, the error between the separated signal and the original source signal. In this paper, three evaluation indices for independent component analysis are introduced: the correlation coefficient, $\rho_i$, the secondary residual, VQM, and the performance index, PI. If $s_i$ is the i-th vibration source signal and $\hat{s}_i$ is the separated signal corresponding to $s_i$, the correlation between $s_i$ and $\hat{s}_i$ can be expressed as follows:

$$\rho_i = \frac{\mathrm{cov}(s_i, \hat{s}_i)}{\sqrt{\mathrm{cov}(s_i, s_i)\, \mathrm{cov}(\hat{s}_i, \hat{s}_i)}}$$

where $\mathrm{cov}(\cdot)$ represents the covariance. The closer the signal separated by the independent component analysis (ICA) algorithm is to the corresponding source signal, the closer the value of $\rho_i$ is to 1 and the better the separation effect. For the secondary residual, a calculation formula with an amplitude correction factor was adopted. The smaller the value of VQM, the better the separation effect; a value below −23 dB indicates a good separation.
By Equations (5) and (6), we let Φ = UA. Ideally, Φ should be an identity matrix. Considering that the order of the output vectors is uncertain in the ICA method, Φ may instead be a matrix with only one nonzero element in each row and column. In that case, each source signal corresponds to exactly one separated signal, which is an effective separation. PI measures the difference between the actual Φ matrix and this one-to-one correspondence requirement, and it is computed from $h_{ij}$, the $(i, j)$ element of matrix Φ. The smaller the PI value, the better the separation effect.
In addition, the vibration signals generated by the friction and collision of fatigue-damaged parts must have certain periodic characteristics. Therefore, for rotating machinery, the frequency characteristics, such as the resonance frequency, are key factors to reflect the effectiveness of the separation signal, which also needs to be included in the evaluation index of the separation effect.
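Two of the indices above can be sketched in a few lines. The correlation coefficient follows the covariance definition, with an absolute value added here as an assumption so that a sign flip of the separated signal is not penalized. For PI, the exact formula is not reproduced in the text above, so the sketch uses a widely used form, $PI = \frac{1}{n(n-1)} \sum_i \left[ \left( \sum_k \frac{|h_{ik}|}{\max_j |h_{ij}|} - 1 \right) + \left( \sum_k \frac{|h_{ki}|}{\max_j |h_{ji}|} - 1 \right) \right]$, which is an assumption and may differ from the authors' normalization.

```python
import math

def correlation_coefficient(s, y):
    """|cov(s, y)| / sqrt(cov(s, s) cov(y, y)); the absolute value is an assumption."""
    n = len(s)
    ms, my = sum(s) / n, sum(y) / n
    cov = sum((a - ms) * (b - my) for a, b in zip(s, y))
    var_s = sum((a - ms) ** 2 for a in s)
    var_y = sum((b - my) ** 2 for b in y)
    return abs(cov) / math.sqrt(var_s * var_y)

def performance_index(Phi):
    """Assumed common form of PI for a square global matrix Phi = U A."""
    n = len(Phi)
    total = 0.0
    for i in range(n):
        row = [abs(h) for h in Phi[i]]
        col = [abs(Phi[k][i]) for k in range(n)]
        total += sum(row) / max(row) - 1.0   # how far row i is from one dominant entry
        total += sum(col) / max(col) - 1.0   # same check for column i
    return total / (n * (n - 1))

# A scaled, sign-flipped copy of the source is still a perfect match.
rho = correlation_coefficient([1, 2, 3], [-2, -4, -6])

# Scaled permutation matrix (ideal separation) versus a fully mixed matrix.
pi_ideal = performance_index([[0, 2, 0], [-3, 0, 0], [0, 0, 0.5]])
pi_mixed = performance_index([[1, 1, 1], [1, 1, 1], [1, 1, 1]])
```

In this form PI is zero exactly when Φ has a single nonzero element in each row and column, and it grows as the outputs remain mixed.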

Simulations
In order to verify the effectiveness of the algorithm, three periodic signals with different frequencies were used to simulate the vibration mixing caused by machines with different rotating frequencies. Generally, the vibration signal of a single rotating shaft can be simply regarded as the superposition of its rotating frequency and its double frequency. The source signal can be expressed as follows:

$$s_i(t) = \sum_{k=1}^{2} A_{ki} \sin(2\pi k f_i t + \phi_{ki}), \quad i = 1, 2, 3$$

where $A_{ki}$ is the amplitude of the i-th source signal, $f_i$ is the rotating frequency of the i-th source signal, $k f_i$ is the k-times rotating frequency of the i-th source signal, $\phi_{ki}$ is the phase, and $A_{ki}$ and $\phi_{ki}$ are randomly generated by the computer. Here, the $f_i$ values are 25, 50, and 75 Hz, respectively.
It can be seen from Figure 2f that under pulse signal interference the source signals are not separated and the error is large. This shows that an algorithm based on the noiseless model produces large errors when separating noisy data and can even lead to incorrect results.
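Simulated sources of this kind, each a rotating frequency plus its double frequency with random amplitude and phase, can be generated as sketched below. The sampling rate, record length, amplitude range, mixing matrix, and impulse model are all assumptions made for illustration, since they are not specified above.

```python
import numpy as np

fs = 1000                        # assumed sampling rate in Hz
t = np.arange(fs) / fs           # assumed record length: one second
rng = np.random.default_rng(3)
f = [25.0, 50.0, 75.0]           # rotating frequencies given in the text

def source(f_i):
    """Rotating frequency plus its double frequency, random amplitude and phase."""
    amp = rng.uniform(0.5, 1.5, size=2)            # A_1i, A_2i (assumed range)
    phase = rng.uniform(0.0, 2 * np.pi, size=2)    # phi_1i, phi_2i
    return sum(amp[k - 1] * np.sin(2 * np.pi * k * f_i * t + phase[k - 1])
               for k in (1, 2))

s = np.vstack([source(f_i) for f_i in f])          # 3 x N source matrix

# Hypothetical linear mixing and impulse interference on every channel.
A = rng.uniform(0.2, 1.0, size=(3, 3))             # assumed mixing matrix
x = A @ s
idx = rng.choice(t.size, size=20, replace=False)   # positions of the impulses
x_noisy = x.copy()
x_noisy[:, idx] += rng.choice([-10.0, 10.0], size=(3, 20))
```

The array `x_noisy` plays the role of the observed signals under pulse interference, and `x` is the corresponding clean mixture for comparison.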
Comparing Figure 2h with Figure 2b, it can be seen that the separated signals correspond to the source signals as $y_1 \to s_1$, $y_2 \to s_2$, $y_3 \to s_3$. Figure 2h also shows that the separation based on MF-FastICA contains additional frequency components, which are marked by red circles. Nevertheless, it is much better than direct separation with the FastICA method.
Comparing Figure 2j with Figure 2b, it can be seen that the separated signals correspond to the source signals as $y_1 \to s_1$, $y_2 \to s_2$, $y_3 \to s_3$. The separated signals and the source signals differ only in amplitude and order, which does not affect the identification of fault characteristics. As can be seen from Table 1, the method proposed in this paper is better than the traditional FastICA and JADE methods. The simulation results show that applying the median filter before the blind separation of vibration signals under impulse noise interference effectively removes the impulse noise, improves the signal-to-noise ratio, and enables the extraction of fault features.

Experiments
In order to verify the separation performance of the proposed algorithm on measured mixed vibration signals, an experimental platform was built to analyze the measured mixed rotor vibration signal. Since there may be multiple potential source signals during rotor rotation, such as the vibration signal of ball bearings, axial vibration signals, and noise signals from shafts, and since the sensors measure all of them at the same time, the signal measured by each sensor is a mixed vibration signal. In order to satisfy the assumption in blind source separation that the number of sensors is greater than or equal to the number of source signals, five sensors were used in the experiment. The installation positions of the sensors are shown in Figure 3b. The rotating speed of the rotor was about 3200 r/min and the sampling frequency was 5 kHz. Figure 3a shows the rotor test bench. The test bench was used to simulate the rub impact fault, and the simulated fault debugging part is shown in Figure 3c. Figure 4a shows the time-domain vibration signals collected by the sensors in the case of a rub impact fault. For this fault, the classic FastICA algorithm was first used directly to separate the sampled signals. The time-domain waveform of the separated signal is shown in Figure 5a. Comparing Figures 4a and 5a, there is no obvious difference in the time domain between the mixed signal measured on the rotor test bench and the separated signal. Figure 6a shows the time-domain vibration signals separated by the JADE algorithm. Figure 7a shows the time-domain vibration signals separated by median noise reduction followed by the JADE algorithm. Comparing Figures 4a and 7a, it can be seen that the impulse noise is well suppressed after median filtering.
In order to compare the complex vibration of the rotor before and after separation more intuitively, it is necessary to analyze the spectrum of each data signal before and after the separation and observe the different characteristics of the signal before and after the separation from the frequency domain.
It can be seen from Figure 4b that most of the frequencies are submerged in the noise and mixed together, except in the third panel, where the frequencies are not submerged by the noise. As shown in Figure 5b, the spectrum after direct separation still contains a lot of noise, and the frequencies are not completely separated, which shows that the separation effect of the FastICA algorithm is significantly worse when the data contain strong impulse noise. Comparing Figures 5b and 7b, the 100 Hz spectral line in Figure 5b is submerged by noise and cannot be identified, while the characteristic 100 Hz spectral line in Figure 7b is highlighted. This shows that the MF-JADE algorithm performs better than direct separation when separating aliased signals in impulsive noise environments, where it effectively suppresses noise signals and highlights the periodic signals.
In the first panel of Figure 7b, the 50 Hz frequency is highlighted while the other frequencies are suppressed. Since the power frequency used in daily life is 50 Hz, it can be determined that this signal is a power frequency signal. The second panel of Figure 7b has two frequency components: 50 Hz, which corresponds to the rotor's rotating frequency, and 100 Hz, which is its double frequency. Since the amplitude of the rotating frequency is greater than that of the double frequency, the signal can be identified as the unbalanced rotor fault signal. From the third and fourth panels of Figure 7b, the amplitude of the rotating frequency is less than that of the double frequency and there are spectral lines at other multiples of the rotating frequency, so the signal can be identified as the rotor rub impact fault signal. In the fifth panel of Figure 7b, the frequency content is distributed over the entire frequency band, and Figure 7a shows that this signal is random in the time domain. Combining these two points, it can be determined that this signal is a noise signal.
Comparing Figures 5 and 7, the improved method proposed in this paper can effectively separate the rub impact fault signal, the mass disk imbalance fault signal caused by the rub impact, and the noise signal. With the classical FastICA and JADE separation methods, however, only the 50 Hz power frequency signal of the rotor system can be separated, as shown in Figures 5 and 6.
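The spectral reading used in this section, comparing the amplitude at the rotating frequency with the amplitude at its double frequency, can be sketched with an FFT. The 5 kHz sampling frequency and the 50 Hz rotating frequency are taken from the text; the function name `harmonic_amplitudes`, the one-second record length, and the two synthetic components are hypothetical and do not represent the measured data.

```python
import numpy as np

def harmonic_amplitudes(y, fs, f0, orders=(1, 2)):
    """Single-sided amplitude at each multiple of f0, read from the nearest FFT bin."""
    spectrum = 2.0 * np.abs(np.fft.rfft(y)) / len(y)
    freqs = np.fft.rfftfreq(len(y), d=1.0 / fs)
    return [spectrum[np.argmin(np.abs(freqs - k * f0))] for k in orders]

fs = 5000                            # sampling frequency of the experiment, Hz
t = np.arange(fs) / fs               # assumed record length: one second

# Hypothetical separated components, not measured data.
y_imbalance = 1.0 * np.sin(2 * np.pi * 50 * t) + 0.4 * np.sin(2 * np.pi * 100 * t)
y_rub = (0.5 * np.sin(2 * np.pi * 50 * t) + 0.9 * np.sin(2 * np.pi * 100 * t)
         + 0.3 * np.sin(2 * np.pi * 150 * t))

a1, a2 = harmonic_amplitudes(y_imbalance, fs, 50.0)   # rotating frequency dominates
b1, b2 = harmonic_amplitudes(y_rub, fs, 50.0)         # double frequency dominates
```

Under the rule described above, `a1 > a2` would point to imbalance, while `b1 < b2` together with further harmonics would point to rub impact. On real records the bin spacing and spectral leakage need to be taken into account before comparing amplitudes.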

Conclusions
In order to solve the problem of fault feature extraction for rotating machinery in a strong impulse noise environment, a fault separation method combining the median filter and an improved JADE algorithm (MF-JADE) has been proposed here. Through simulation and an experimental study of the vibration signal separation of a hybrid rotor, the following conclusions may be made:

1. When blind separation is applied directly to observation signals with strong impulse noise, the error of the separation result is large, and an incorrect result may even be obtained.
2. The median filtering method can effectively remove the impulse noise without losing the useful components of the original signal, improve the signal-to-noise ratio, and provide the precondition for accurate blind separation.
3. For the measured signal, although the independence assumption of blind source separation does not strictly hold, the MF-JADE algorithm is still effective in separating the actual vibration signals.
4. The combination of a median filtering method and a blind source separation algorithm provides a new method for the separation of aliased signals in strong impulse noise environments.