Fault Diagnosis of Rotating Machinery Based on Multi-Sensor Signals and Median Filter Second-Order Blind Identification (MF-SOBI)

Abstract: Feature extraction plays a crucial role in the diagnosis of rotating machinery faults. However, measured vibration signals are inherently complex and non-stationary, and the features of faulty signals are often submerged by noise. The principle and method of blind source separation are introduced, and we point out that blind source separation algorithms fail in an environment of strong impulse noise. To separate multi-sensor signals quickly in such an environment, the window width of the median filter (MF) is first calculated according to the sampling frequency, so that the impulse noise and part of the white noise can be effectively filtered out. Next, the filtered signals are separated by the improved second-order blind identification (SOBI) algorithm. The method is tested on strong impulse background noise and on a rub impact dataset. The results show that this method has higher efficiency and accuracy than the direct separation method. Its speed and efficiency make it possible to apply the method to real-time signal analysis.


Introduction
Rotating machinery is a kind of widely used power equipment, so it is of great significance to implement intelligent operation and maintenance management [1,2]. However, due to the complex structure and dynamic operating environment, the vibration of such equipment generally presents strong nonlinear and background noise characteristics [3,4]. For many years, only the local vibration signals collected by a single sensor have been used to solve the problem of fault identification of rotating machinery system, which has led to obvious difficulties. Thus, we should make full use of a series of sensors arranged at several key sections of the rotating machinery, and implement intelligent fault decision-making technology based on as much information as possible. This view has gained consensus on the research prospect of industrial big data technology [5]. Obviously, the more sensors that collect fault information, the more dimensions the fault feature dataset will have. Therefore, in the process of researching the intelligent operation and maintenance of rotating machinery, a major question is how to effectively extract the sensitive quantitative features of its operational state from the nonlinear, strong noise, high-dimensional vibration signal fault feature dataset, which has significance for the development of intelligent decision-making technology driven by big data.
The purpose of vibration signal analysis of rotating machinery is to extract the operational information of the key rotating parts (e.g., gears, bearings, and rotors), and effective signal processing and feature extraction are key to condition monitoring and fault diagnosis. Therefore, more accurate and comprehensive extraction of vibration signal characteristics has long been pursued by researchers in this field. Feature extraction methods for vibration signals include empirical mode decomposition (EMD) [6,7], minimum entropy deconvolution [8,9], adaptive filtering [10,11], matching pursuit [12,13], mathematical morphology analysis [14][15][16], cyclostationary signal analysis [17,18], Wiener filtering [19,20], the wavelet transform [21][22][23], Kalman filtering [24,25], and stochastic resonance [26,27], all of which have contributed to the development of mechanical fault diagnosis. These methods can effectively eliminate background noise and interference components and extract a fault signal in a specific environment, but they are not suitable for complex interference situations. In particular, when the interference in the vibration signal of rotating machinery is similar to the fault signal, it is difficult for these methods to identify and eliminate the interference components and extract the fault signal. Blind source separation (BSS) technology [28] can separate multiple aliased signals even when the source signals overlap in time and frequency. Moreover, the separated signals do not lose the weak characteristic information contained in the vibration signal of rotating machinery.
Many effective blind source separation algorithms with different characteristics have been developed, including the fast fixed-point algorithm [3], the natural gradient algorithm [29], the Equivariant Adaptive Separation via Independence (EASI) algorithm [30], and the Joint Approximate Diagonalization of Eigen-matrices (JADE) algorithm [31]. These have been widely used in fault diagnosis [32,33], image processing [34], speech recognition [35], earthquake prediction [36], and other fields. Lu et al. [37] proposed a source contribution quantitative estimation method based on underdetermined blind source separation to identify the major vibration and radiation noise. Belaid et al. [38] presented a new multiscale decomposition algorithm that enables the blind separation of convolutively mixed images. He et al. [39] proposed an improved underdetermined blind source separation algorithm based on sparse component analysis (SCA) and applied it to sampled pressure pulsation signals to obtain separated signals. Zhang et al. [40] proposed a universal single-channel blind source separation method based on a combination of ensemble empirical mode decomposition and independent component analysis, which is suitable for autonomous underwater vehicles in time-varying ocean currents. These algorithms usually show good separation performance on noiseless mixed signals, but they can produce large errors when separating strongly noisy mixed signals. Especially at a low signal-to-noise ratio (SNR), they can lead to completely wrong conclusions, because they are based on a noise-free model. In rotating machinery operation, the signals measured by sensors inevitably contain noise. If the above blind source separation algorithms are applied directly to the mixed vibration signal, many errors may be produced, or wrong conclusions may be drawn.
Therefore, a fault diagnosis method for rotating machinery is developed based on a combination of improved median filter (MF) and second-order blind identification (SOBI). Firstly, the window width of the median filter is calculated according to the sampling frequency, so that the impulse noise and part of the white noise can be effectively filtered out. Secondly, the improved SOBI algorithm is used to separate the signal after noise reduction. Finally, the method is tested on strong pulse background noise and the rub impact dataset, and the fault feature signal is effectively separated.

Model of Fault Diagnosis System
In the condition monitoring and fault diagnosis of rotating machinery, the signal measured by a sensor is usually a linear mixture of multiple original signals. The mixing process is shown in Figure 1. Suppose the signal vector observed from $m$ sensors is $x(t) = [x_1(t), x_2(t), \ldots, x_m(t)]^T$, and the vibration signal vector of the $n$ signal sources is $s(t) = [s_1(t), s_2(t), \ldots, s_n(t)]^T$, where $s_j(t)$ $(j = 1, \ldots, n)$ is the $j$th source signal at time $t$. The output signal of each sensor can be expressed as follows [41]:
$$x_i(t) = \sum_{j=1}^{n} a_{ij}\, s_j(t) + v_i(t), \quad i = 1, 2, \ldots, m,$$
where $a_{ij}$ is the mixing coefficient, and $v_i(t)$ is the noise signal.
Its matrix form can be expressed as follows:
$$x(t) = A s(t) + v(t),$$
where $A \in R^{m \times n}$ is an unknown full-rank mixing matrix; $s(t)$ is an $n$-dimensional source vector; and $v(t)$ is an additive, statistically independent noise vector. The purpose of blind source separation is to find a separation matrix $W$, so that the result of Equation (4) is the optimal estimate of $s(t)$:
$$y(t) = W x(t). \tag{4}$$
Separating the source signals thus amounts to finding a separation matrix $U$ that satisfies the following formula, in which the noise term is neglected:
$$\hat{s}(t) = U x(t) = U A s(t) = s(t), \quad \text{if } UA = I.$$
If $I$ is the identity matrix, then $\hat{s}(t)$ is an effective separation of $s(t)$.
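The instantaneous mixing model can be illustrated with a small numerical sketch. The sampling frequency, the two sinusoidal sources, the random mixing matrix, and the noise level below are illustrative assumptions, not values from this paper; the pseudo-inverse is used only to show the condition $UA = I$ when $A$ is known, which is never the case in a real blind separation problem.

```python
import numpy as np

rng = np.random.default_rng(0)

fs = 1000                          # assumed sampling frequency in Hz
t = np.arange(2000) / fs           # 2 s of samples

# n = 2 hypothetical source signals stacked as rows: s has shape (n, T)
s = np.vstack([
    np.sin(2 * np.pi * 50 * t),
    np.sin(2 * np.pi * 120 * t),
])

m, n = 3, s.shape[0]               # m sensors, n sources (m >= n)
A = rng.standard_normal((m, n))    # full-rank mixing matrix (random here)
v = 0.01 * rng.standard_normal((m, t.size))   # additive sensor noise

x = A @ s + v                      # observed signals, shape (m, T)

# With A known, its pseudo-inverse is a separation matrix with U A = I.
U = np.linalg.pinv(A)
s_hat = U @ x                      # equals s up to the filtered noise U v
```

In practice $A$ is unknown, so $U$ has to be estimated from $x(t)$ alone, which is the task of the algorithms discussed in the following sections.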
However, the structure of the key parts of rotating machinery is complex: there are many sources of vibration excitation, the actual propagation path of the vibration is complicated, pieces of equipment interfere with one another, and the environment is noisy. The measured signal is often the result of multiple vibration sources overlapping with each other. In a linear convolutive mixing model, the observation vector can be expressed as follows:
$$x(t) = \sum_{k} A(k)\, s(t-k) + v(t).$$
Convolution in the time domain is equal to multiplication in the frequency domain, so the convolutive mixture in the time domain is converted into an instantaneous mixture in the frequency domain. Its model can be expressed as follows:
$$X(f) = A(f)\, S(f) + V(f).$$
The two models are unified in form. Therefore, no matter how the signals are mixed, they can be processed in the same way.
In this article, unless otherwise stated, the following assumptions are made: (1) the source signal is a zero-mean, spatially uncorrelated, but time-correlated signal; (2) the source signal is a stationary process, i.e., its covariance $E[s(t)\,s^T(t-\tau)]$ depends only on the delay $\tau$.

Principle of Median Filter
A median filter is a nonlinear filter that preserves edges well and suppresses impulse noise [8]. It is essentially a window filter: a fixed-length window slides over the sample data, and the sample at the center of the current window is replaced with the median of the data in the window [42]. When the window has moved to the end of the sequence, the whole sample signal has been filtered. Since the vibration signals analyzed in this paper are one-dimensional, only the one-dimensional discrete median filter is discussed. Let the discrete sampling sequence of signal $x(t)$ be $x(n)$ $(n = 1, 2, 3, \ldots)$, and take a filtering window of length $2d + 1$ (where $d$ is a positive integer) to median filter this sequence. At the $n$th time, the data in the window are $W(n) = \{x(n-d), \ldots, x(n), \ldots, x(n+d)\}$. The elements of $W(n)$ are sorted in ascending order, and the middle value $y(n)$ replaces the original $x(n)$; this completes the filtering of one data point of the signal [43,44]. The mathematical expression of this process is:
$$y(n) = \mathrm{Med}\{x(n-d), \ldots, x(n), \ldots, x(n+d)\},$$
where $\mathrm{Med}\{\cdot\}$ is the median of all numbers in the window.
The principle of the median filter is simple, the algorithm is easy to implement on a computer, and impulses shorter than half the window width are essentially eliminated. Therefore, as long as an appropriate window width is set, the median filter can effectively reduce the impulse noise in the vibration signal. Because of the characteristics of the filtering method itself, however, it cannot filter out white noise.
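The sliding-window operation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name `median_filter_1d` is hypothetical, and repeating the end samples to pad the edges is an assumption, since the text does not specify how the boundaries are handled.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_filter_1d(x, d):
    """Median filter with window length 2*d + 1 (edge samples are repeated as padding)."""
    x = np.asarray(x, dtype=float)
    padded = np.pad(x, d, mode="edge")
    # Each row of `windows` holds x(n-d), ..., x(n), ..., x(n+d).
    windows = sliding_window_view(padded, 2 * d + 1)
    return np.median(windows, axis=1)

x = np.array([1.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0])   # one-sample impulse at index 3
y = median_filter_1d(x, d=1)                         # the impulse is replaced by 1.0

step = np.array([0.0, 0.0, 0.0, 5.0, 5.0, 5.0])      # an edge, not an impulse
y_step = median_filter_1d(step, d=1)                 # the edge is preserved
```

The two inputs show both properties named in the text: the isolated impulse, which is shorter than half the window, disappears, while the step edge passes through unchanged.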

Improved Method of Median Filter
The key problem with the median filtering method is determining the filter window width according to the signal characteristics. On the one hand, the window should not be too wide, or the details of useful signals will be lost; on the other hand, the window should not be too narrow, or too much impulse noise will remain. In order to filter out impulse noise without losing useful signal, the window width should be double the pulse width. If the sampling interval of the vibration signal is $T_s$ and the duration of the impulse noise is $L_s$, the reasonable window width $L_d$ can be expressed as follows:
$$L_d = \frac{2 L_s}{T_s} = 2 L_s F_s, \tag{11}$$
where $F_s = 1/T_s$ is the sampling frequency. According to [16], the impulse noise is mainly continuous, so a fixed pulse duration $L_s$ is assumed. It can be seen from Equation (11) that the window width is adaptively adjusted with $F_s$, which is more conducive to eliminating impulse noise and retaining useful signals.
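As a worked example of the window-width rule, the sketch below evaluates $L_d = 2 L_s F_s$ in samples. The pulse duration of 1 ms is a purely illustrative assumption (the value used by the authors is not reproduced here), the sampling frequency of 5000 Hz is the one reported for the rotor experiment, and rounding up to an odd width is an added convention so that the window has a center sample.

```python
def median_window_width(fs_hz, pulse_duration_s):
    """Window width in samples: L_d = 2 * L_s * F_s, rounded, then made odd."""
    width = int(round(2 * pulse_duration_s * fs_hz))
    if width % 2 == 0:
        width += 1        # an odd width gives the window a well-defined center
    return max(width, 3)  # the smallest useful median window has 3 samples

# Hypothetical pulse duration of 1 ms at the experiment's 5000 Hz sampling rate:
# 2 * 0.001 s * 5000 Hz = 10 samples, bumped to the next odd width.
width = median_window_width(fs_hz=5000, pulse_duration_s=0.001)
```

Doubling the sampling frequency doubles the width in samples, so the window always spans the same physical duration, which is the adaptive behavior described above.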

The Fast Second-Order Blind Identification (SOBI) Method for Diagonalization of Average Matrix
The second-order statistical method does not require the non-Gaussian assumption of the higher-order statistical methods [43,44]. However, because it usually relies on the joint diagonalization of a set of autocorrelation matrices whose dimension increases with the number of sensors, the computation of the BSS method becomes more difficult when the number of sources is large. Therefore, a simple and effective SOBI-based BSS method is needed.

The Joint Approximation Diagonalization SOBI Method
Methods for the blind separation of second-order statistics generally include preprocessing data, calculating second-order statistics, calculating diagonalization matrices, and obtaining source and mixed process estimates.
(1) Data Preprocessing
The most common preprocessing step in the second-order blind separation method is whitening. When the number of source signals equals the number of sensors, whitening is not strictly required, although it still removes the spatial correlation. When the numbers of source signals and sensors differ, whitening makes it possible to estimate the number of sources and, at the same time, to suppress the effect of additive noise on the signal. Belouchrani and Cichocki [45] improved the algorithm by using robust orthogonalization as a preprocessing step, under the condition that the number of observations is greater than the number of sources.
(2) Calculate Second-Order Statistics
For colored sources with different power spectra, a delay covariance matrix is used:
$$R_x(\tau) = E[x(t)\, x^T(t-\tau)].$$
(3) Joint Matrix Diagonalization to Obtain Unitary Matrix
The goal of joint diagonalization is to find the orthogonal matrix $U$ that diagonalizes a group of matrices, so that
$$M_i = U D_i U^T, \quad i = 1, \ldots, K.$$
In this formula, $M_i$ is the delay covariance matrix, and $D_i$ is a real diagonal matrix.

Algorithm Steps for Joint Approximate Diagonalization SOBI Method Implementation
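The implementation algorithm flow of the SOBI method is as follows:
Step 1: Estimate the sample covariance matrix $\hat{R}(0)$ from $T$ data samples. Let $\lambda_1, \ldots, \lambda_n$ denote its $n$ largest eigenvalues and $h_1, \ldots, h_n$ the corresponding eigenvectors. Under the assumption of white noise, the noise variance estimate $\hat{\sigma}^2$ is the average of the $m - n$ smallest eigenvalues of $\hat{R}(0)$. In the ideal noise-free case, the last $m - n$ eigenvalues of $\hat{R}(0)$ are equal to 0, so the noise variance $\hat{\sigma}^2$ is 0.
Step 2: Obtain the whitened signal $z(t) = [z_1(t), \ldots, z_n(t)]^T$ with $z_i(t) = (\lambda_i - \hat{\sigma}^2)^{-1/2} h_i^T x(t)$, $i = 1, \ldots, n$; this forms a whitening matrix $\hat{W} = [(\lambda_1 - \hat{\sigma}^2)^{-1/2} h_1, \ldots, (\lambda_n - \hat{\sigma}^2)^{-1/2} h_n]^T$.
Step 3: Fix a set of delays $\tau \in \{\tau_j \mid j = 1, \ldots, K\}$ and form the sample covariance matrices $\hat{R}(\tau_j)$ of $z(t)$.
Step 4: Get the approximate joint diagonalizer $U$ of the set $\{\hat{R}(\tau_j) \mid j = 1, \ldots, K\}$.
Step 5: Calculate the estimate of the source signal $\hat{s}(t) = U^T \hat{W} x(t)$.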
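Steps 1 and 2 (covariance estimation, noise-variance estimate, and whitening) can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: the function name `sobi_whiten` is hypothetical, the signals are real-valued, and the data are centered before the covariance is estimated.

```python
import numpy as np

def sobi_whiten(x, n):
    """Whiten m-channel data x of shape (m, T) down to n components."""
    x = x - x.mean(axis=1, keepdims=True)        # enforce the zero-mean assumption
    m, T = x.shape
    R0 = (x @ x.T) / T                           # sample covariance R(0)
    eigvals, eigvecs = np.linalg.eigh(R0)        # eigh returns ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    # Noise variance: mean of the m - n smallest eigenvalues (0 when m == n).
    sigma2 = eigvals[n:].mean() if m > n else 0.0
    scale = 1.0 / np.sqrt(eigvals[:n] - sigma2)
    W = scale[:, None] * eigvecs[:, :n].T        # whitening matrix, shape (n, m)
    return W, W @ x, sigma2

# Noise-free example with hypothetical data: 2 sources observed by 3 sensors.
rng = np.random.default_rng(1)
s = rng.standard_normal((2, 5000))
A = rng.standard_normal((3, 2))
W, z, sigma2 = sobi_whiten(A @ s, n=2)
```

In this noise-free example the smallest eigenvalue is numerically zero, so `sigma2` is approximately 0 and the whitened signal `z` has an identity sample covariance.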

Method Principle
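The correlation matrices of the sensor observations at different time delays have the following structure:
$$R_x(0) = A R_s(0) A^T + \sigma^2 I, \qquad R_x(\tau_i) = A R_s(\tau_i) A^T, \quad \tau_i \neq 0.$$
Spatially whiten the correlation matrix $R_x(0)$:
$$\hat{W}\,[R_x(0) - \sigma^2 I]\,\hat{W}^T = \hat{W} A R_s(0) A^T \hat{W}^T = I,$$
where $\hat{W}$ is the whitening matrix. The whitened correlation matrices are
$$M_i = \hat{W} R_x(\tau_i) \hat{W}^T, \quad i = 1, \ldots, K,$$
and together they form a joint matrix set $\{M_1, \ldots, M_K\}$. Applying the general diagonalization criterion $U^T M_i U = D_i$ to every matrix and summing over $i$, we get
$$\sum_{i=1}^{K} U^T M_i U = \sum_{i=1}^{K} D_i.$$
Dividing both sides by $K$ gives the new diagonalization criterion:
$$U^T \bar{M} U = \bar{D}, \qquad \bar{M} = \frac{1}{K}\sum_{i=1}^{K} M_i, \quad \bar{D} = \frac{1}{K}\sum_{i=1}^{K} D_i.$$
The improved diagonalization method is based on the following optimization problem:
$$\min_{U^T U = I} \ \mathrm{off}(U^T \bar{M} U),$$
where $\mathrm{off}(\cdot)$ denotes the sum of the squared off-diagonal elements.
The joint diagonalization of the matrix set is thus transformed into the diagonalization of the average matrix, which greatly simplifies the algorithm. The average matrix reflects the average feature structure of the matrices, which inevitably causes the loss of some detailed features and affects the separation accuracy to some extent. However, the subsequent experiments show that this simplification does not seriously affect the separation effect, while the separation speed is significantly improved.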
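The diagonalization of the average matrix reduces to a single symmetric eigendecomposition, which can be sketched as follows. The function names, the symmetrization of each sample covariance, and the test signals (two unit-variance sinusoids mixed by a rotation, standing in for already-whitened data) are assumptions made for illustration.

```python
import numpy as np

def lagged_cov(z, tau):
    """Symmetrized sample covariance of z (shape (n, T)) at integer lag tau >= 1."""
    T = z.shape[1]
    R = z[:, tau:] @ z[:, :T - tau].T / (T - tau)
    return 0.5 * (R + R.T)       # symmetrize so that eigh applies

def average_matrix_diagonalizer(z, lags):
    """Orthogonal U whose columns are eigenvectors of the lag-averaged covariance."""
    M_bar = sum(lagged_cov(z, tau) for tau in lags) / len(lags)
    _, U = np.linalg.eigh(M_bar)
    return U

# Hypothetical whitened data: unit-variance sinusoids mixed by a rotation Q.
fs = 1000
t = np.arange(2000) / fs
s = np.sqrt(2) * np.vstack([np.sin(2 * np.pi * 50 * t),
                            np.sin(2 * np.pi * 120 * t)])
theta = 0.6
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
z = Q @ s

U = average_matrix_diagonalizer(z, lags=[1, 2, 3, 4, 5])
s_hat = U.T @ z                  # sources up to order and sign
```

The approach works here because the two sources have different lag-averaged autocorrelations, so the eigenvalues of the average matrix are distinct; sources whose averaged autocorrelations coincide could not be told apart by this simplification.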

Implementation Algorithm Steps of Improved MF-SOBI Method
The implementation algorithm flow of the improved MF-SOBI method is as follows:
Step 1: Filter the sampled signal with the improved median filtering method of Section 3.2.
Step 2: Estimate the filtered sample covariance matrix $\hat{R}(1)$ from the $T$ data samples. Perform singular value decomposition of $\hat{R}(1)$ to obtain a whitening matrix $\hat{W}$. Let $\lambda_1, \ldots, \lambda_n$ represent the $n$ largest eigenvalues and $h_1, \ldots, h_n$ the corresponding eigenvectors.
Step 3: Under the assumption of white noise, the estimate of the noise variance $\hat{\sigma}^2$ is the average of the $m - n$ smallest eigenvalues of $\hat{R}(1)$. In the ideal noise-free case, the last $m - n$ singular values of $\hat{R}(1)$ are equal to 0.
Step 4: Fix a set of delays $\tau \in \{\tau_j \mid j = 1, \ldots, K\}$.
Step 5: Obtain the approximate diagonalizer $U$.
Step 6: Estimate the source signal as $\hat{s}(t) = U^T \hat{W} x(t)$, where $x(t)$ is the filtered observation.
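The six steps can be strung together in a compact sketch. It is an illustration under assumptions, not the authors' implementation: `mf_sobi` is a hypothetical name, `scipy.signal.medfilt` (which zero-pads the edges and needs an odd kernel size) stands in for the improved median filter, the lagged covariances are symmetrized, and the test signals, mixing matrix, and impulse positions are invented for the demonstration.

```python
import numpy as np
from scipy.signal import medfilt

def mf_sobi(x, n, kernel_size, lags):
    """Median-filter each channel, whiten, then diagonalize the lag-averaged covariance."""
    xf = np.vstack([medfilt(ch, kernel_size) for ch in x])    # Step 1
    xf = xf - xf.mean(axis=1, keepdims=True)
    m, T = xf.shape
    lam, H = np.linalg.eigh(xf @ xf.T / T)                    # Step 2
    lam, H = lam[::-1], H[:, ::-1]
    sigma2 = lam[n:].mean() if m > n else 0.0                 # Step 3
    W = (H[:, :n] / np.sqrt(lam[:n] - sigma2)).T              # whitening matrix
    z = W @ xf
    M_bar = np.zeros((n, n))                                  # Step 4
    for tau in lags:
        R = z[:, tau:] @ z[:, :T - tau].T / (T - tau)
        M_bar += 0.5 * (R + R.T) / len(lags)
    _, U = np.linalg.eigh(M_bar)                              # Step 5
    return U.T @ z                                            # Step 6

# Hypothetical test: two sinusoids, three sensors, isolated strong impulses.
fs, T = 5000, 5000
t = np.arange(T) / fs
s = np.sqrt(2) * np.vstack([np.sin(2 * np.pi * 50 * t),
                            np.sin(2 * np.pi * 120 * t)])
A = np.array([[1.0, 0.6], [0.5, 1.0], [0.8, -0.7]])
x = A @ s
x[0, 100::500] += 20.0
x[1, 250::500] += 20.0
x[2, 400::500] -= 20.0

y = mf_sobi(x, n=2, kernel_size=5, lags=range(1, 11))
```

Each row of `y` should correlate strongly with one of the two sources, up to the order and sign ambiguity inherent in blind source separation.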

Evaluation Index of Method Separation Performance
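(1) Performance Index (PI)
To measure the effectiveness of the method, a performance index is used to quantify the separation performance [46], which is defined as
$$PI = \frac{1}{n(n-1)} \sum_{i=1}^{n} \left\{ \left( \sum_{k=1}^{n} \frac{|g_{ik}|}{\max_j |g_{ij}|} - 1 \right) + \left( \sum_{k=1}^{n} \frac{|g_{ki}|}{\max_j |g_{ji}|} - 1 \right) \right\}.$$
In the formula, $g_{ik}$ is the $(i,k)$th element of the system matrix $G$, the product of the separation matrix and the mixing matrix $A$; $\max_j |g_{ij}|$ represents the maximum absolute value of the elements in the $i$th row of $G$. Similarly, $\max_j |g_{ji}|$ represents the maximum absolute value of the elements in the $i$th column of $G$. The smaller the PI, the better the separation effect. When complete separation is achieved, the performance index is 0.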
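A short sketch of the performance index follows. It assumes the widely used normalization by $n(n-1)$ and takes the global matrix $G$ (separation matrix times mixing matrix) as input; the function name `performance_index` is hypothetical.

```python
import numpy as np

def performance_index(G):
    """PI of a square global matrix G; 0 means G is a scaled permutation matrix."""
    G = np.abs(np.asarray(G, dtype=float))
    n = G.shape[0]
    rows = (G.sum(axis=1) / G.max(axis=1) - 1).sum()   # row-wise crosstalk
    cols = (G.sum(axis=0) / G.max(axis=0) - 1).sum()   # column-wise crosstalk
    return (rows + cols) / (n * (n - 1))

# Perfect separation up to order and scale: exactly one nonzero entry per row and column.
pi_perfect = performance_index([[0.0, -2.0], [3.0, 0.0]])
# No separation at all: every output contains every source equally.
pi_worst = performance_index(np.ones((2, 2)))
```

The first case shows why the index tolerates the permutation and scaling ambiguity of blind source separation: only the leakage of other sources into each output is penalized.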
(2) Residual Cross-Talking Error (RCTE)
The residual cross-talking error (RCTE) between the separated signal and the source signal also reflects their similarity. Generally, an RCTE value of -20 dB is considered to indicate a good separation effect. In the definition of the RCTE, $s_i$ is the $i$th source signal and $x_i$ is the $i$th comparison signal.

Simulation Study on Separation of Mixed Signals with Different Frequency Carriers
In the simulation experiment, the vibration of the rotor system is mainly simulated. Rotating parts of the rotor system include the rotor, bearing, and other rotating parts. The vibration signal of the rotor system can be determined from the sinusoidal signal of each frequency and each resonance frequency. The simulation signal can be expressed as The randomly generated mixed matrix is as follows: Assume that the rotor rub impact fault frequency is a mixed signal of 100 Hz and 150 Hz, and other source signals are composed of a vibration signal with a base frequency of 50 Hz and Gaussian white noise. The source signal waveform obtained is shown in Figure 2.
The source signal is randomly and linearly mixed, and the mixing matrix is A . The mixed signal is as shown in Figure 3.
In Figure 3, it can be seen that the characteristics of the source signals are completely hidden in the mixed signals, and no information on the source signals can be read from the time domain waveforms. Figure 4 shows the signals separated directly from the mixed signals by the JADE algorithm. The noise signal has not been successfully separated, and the other separated signals contain many harmonics, so the signal characteristics cannot be accurately identified, which indicates that JADE fails under strong noise. Figure 5 shows the signals separated by the classical SOBI algorithm. Although the noise signal has been successfully separated, the other two separated signals also contain many harmonics, so the signal characteristics still cannot be accurately identified. That is, classical SOBI can separate the noise signal, but it cannot effectively separate the other source signals.
By comparing Figure 2b with Figures 4b-6b, it can be seen that the results of the classical JADE and SOBI algorithms contain additional frequency components (marked with red circles), which indicates that their separation is not accurate. The evaluation indexes in Table 1 also show that the improved MF-SOBI algorithm is better than SOBI and JADE.
By comparing Figures 2-6, it can be seen that the remaining differences between the separated signals and the source signals lie mainly in the amplitude and ordering ambiguity inherent in blind source separation itself. Apart from this ambiguity, the other characteristics of the signals are well recovered. Furthermore, the performance index and similarity coefficient of the algorithm are calculated and compared with those of the SOBI and JADE algorithms. The performance index shows the overall separation ability of the algorithm; the smaller the value, the better. The similarity coefficient shows how well a single source signal is restored; the closer its absolute value is to 1, the better the separation result. The comparison results are shown in Table 1. The data in Table 1 show that, under the interference of strong impulse noise, the improved algorithm in this paper effectively separates the source signals and has a higher separation performance than the classical algorithms.
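The text does not give the formula for the similarity coefficient. A commonly used definition, assumed here, is the absolute normalized inner product between a separated signal $y$ and a source signal $s$, $|\sum_t y(t)s(t)| / \sqrt{\sum_t y^2(t) \sum_t s^2(t)}$; the sketch below uses that assumption, and the function name is hypothetical.

```python
import numpy as np

def similarity_coefficient(y, s):
    """Absolute normalized inner product of two equal-length signals, in [0, 1]."""
    y = np.asarray(y, dtype=float)
    s = np.asarray(s, dtype=float)
    return abs(np.dot(y, s)) / np.sqrt(np.dot(y, y) * np.dot(s, s))

t = np.arange(1000) / 1000.0
source = np.sin(2 * np.pi * 50 * t)

# A sign flip and rescaling do not change the coefficient.
same = similarity_coefficient(-3.0 * source, source)
# A sinusoid at a different frequency is orthogonal over this interval.
different = similarity_coefficient(np.sin(2 * np.pi * 120 * t), source)
```

Because the coefficient ignores sign and scale, it is insensitive to the amplitude ambiguity of blind source separation and measures only the waveform agreement.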

Applications
The double-span rotor experimental platform is shown in Figure 7a. The rotor test bed is 134 mm long, 51 mm wide, and 120 mm high, with a total mass of about 50 kg. The test rotor has a shaft of Φ10 mm × 810 mm, on which three lumped mass disks are installed. The rotor is supported by four cylindrical bearings and divided into a double-span structure. For convenience of description, the rotor span close to the motor is called the front span and the rotor span far from the motor is called the rear span. The rotor system is driven by a direct current (DC) motor with a power of about 1.1 kW. A flexible nylon-rope coupling is used between the motor output shaft and the front span rotor, and between the front span rotor and the rear span rotor. The DC motor is controlled by a pulse width modulator. The experiments show that the first critical speed of the rotor system is about 2000 r/min, the unstable speed is close to 5000 r/min, and 3000 r/min is the best stable operating point in the flexible working range of the rotor system. The vibration displacement signal acquisition program is implemented on the LabVIEW platform. To simulate rotor imbalance, rotor misalignment, and dynamic and static rub impact faults, the rotor experimental platform is equipped with a counterweight screw, a rub impact screw, gaskets, and other auxiliary devices.
To simulate the dynamic and static rub impact fault on the test bench, a threaded hole is reserved above the test bench bracket, as shown in Figure 7c, and a long white plastic screw is provided. The screw is screwed into the threaded hole toward the mass disk but must not make full contact with it, because, as the speed increases, the amplitude of the rotor grows under the unbalanced force. When the amplitude reaches the critical distance, friction occurs between the mass disk and the plastic screw. Rub impact faults under different working conditions can be simulated by changing the distance and the speed. In this experiment, the dynamic and static rub impact experiments were completed at a speed of 2800 r/min. The sampling frequency was 5000 Hz and the number of sampling points was 1024. The eddy current sensor probe is composed of two vertical probes, installed near the journal and around the disk, where the vibration is pronounced and the signal is easy to obtain. The sensor installation position is shown in Figure 7b. A single sensor at the end of the rotor measures the real-time speed of the rotor. To ensure that the number of sensors in blind source separation is greater than or equal to the number of source signals, four sensors are used in the experiment.
The four measured sensor signals are shown in Figure 8. The signals separated by the SOBI algorithm are shown in Figure 9, and the signals separated by the MF-SOBI algorithm are shown in Figure 10. Due to the interference of strong impulse noise, the fault characteristics of the rotor system cannot be identified from the time domain waveforms in Figure 8a. However, in the frequency domain waveform y1 of Figure 8b, the 50 Hz power frequency signal of the rotor can be clearly identified.
When the classical SOBI algorithm is applied directly, without noise reduction, under strong impulse noise, the separated signals are as shown in Figure 9. In the time and frequency waveforms of Figure 9, only a 50 Hz rotor signal can be clearly identified. This shows that the classical SOBI algorithm separates poorly, or even incorrectly, under strong impulse interference. As can be seen from Figure 10b (the frequency domain waveforms of the separated signals), the impulse noise is well filtered and the rotor vibration source signals are well separated. In y1 of Figure 10, the 50 Hz signal is obvious, accompanied by a low-amplitude double-frequency component. According to these frequency characteristics, y1 reflects the slight imbalance caused by the rub impact fault. In y2 and y4, the 50 Hz power frequency and the 100 Hz double frequency can be clearly seen, along with a small number of high-frequency components; these frequency characteristics identify them as rub impact fault signals. Because y3 is random in both the time domain and the frequency domain, it can be judged to be random noise.
Through the above analysis, it can be seen that, under strong impulse noise interference, the improved algorithm proposed in this paper can effectively separate the rub impact fault signal of the rotor system, the slight imbalance fault caused by the rub impact, and random noise, which verifies the effectiveness of this method.
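The frequency-domain reading used in this analysis can be reproduced with a short spectrum sketch. The sampling frequency (5000 Hz) and record length (1024 points) are the values reported for the experiment; the test signal, a 50 Hz component with a weaker 100 Hz harmonic, and the Hann window are assumptions chosen only for illustration.

```python
import numpy as np

fs = 5000                  # sampling frequency of the experiment, Hz
N = 1024                   # number of sampling points of the experiment
t = np.arange(N) / fs

# Hypothetical separated signal: 50 Hz component plus a weaker 100 Hz harmonic.
y = np.sin(2 * np.pi * 50 * t) + 0.3 * np.sin(2 * np.pi * 100 * t)

window = np.hanning(N)                       # reduces spectral leakage
spectrum = np.abs(np.fft.rfft(y * window))   # one-sided amplitude spectrum
freqs = np.fft.rfftfreq(N, d=1 / fs)         # frequency of each bin, Hz

resolution = fs / N                          # bin spacing, about 4.88 Hz
peak_freq = freqs[np.argmax(spectrum)]       # dominant spectral line
```

With 1024 points at 5000 Hz the bin spacing is about 4.88 Hz, so a spectral line can only be located to within roughly 5 Hz; this is enough to distinguish a 50 Hz component from its 100 Hz harmonic.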

Conclusions
In order to solve the problem of fast fault feature extraction of rotating machinery, an improved blind source separation algorithm based on MF-SOBI is proposed. Our conclusions are as follows:
1. If the observation signal under the interference of strong impulse noise is separated directly, the error of the separation result is large, and a wrong result may even be obtained.
2. The improved adaptive median filtering method can effectively remove the interference of strong impulse noise.
3. The improved algorithm in this paper can effectively separate the simulated rotor vibration signals, and its performance is greatly improved compared with the traditional method.