An Effective Bearing Fault Diagnosis Technique via Local Robust Principal Component Analysis and Multi-Scale Permutation Entropy

The acquired bearing fault signal usually reveals nonlinear and non-stationary nature. Moreover, in the actual environment, some other interference components and strong background noise are unavoidable, which lead to the fault feature signal being weak. Considering the above issues, an effective bearing fault diagnosis technique via local robust principal component analysis (LRPCA) and multi-scale permutation entropy (MSPE) was introduced in this paper. Robust principal component analysis (RPCA) has proven to be a powerful de-noising method, which can extract a low-dimensional submanifold structure representing signal feature from the signal trajectory matrix. However, RPCA can only handle single-component signal. Therefore, in order to suppress background noise, an improved RPCA method named LRPCA is proposed to decompose the signal into several single-components. Since MSPE can efficiently evaluate the dynamic complexity and randomness of the signals under different scales, the fault-related single-components can be identified according the MPSE characteristic of the signals. Thereafter, these identified components are combined into a one-dimensional signal to represent the fault feature component for further diagnosis. The numerical simulation experimentation and the analysis of bearing outer race fault data both verified the effectiveness of the proposed technique.


Introduction
The bearing as an essential element has been widely used in rotating machinery [1,2]. Due to the severe working conditions, such as long and uninterrupted operation, alternating loads, and corrosion, the probability of bearing failure increases greatly, which may cause heavy economic losses or even serious personal injury [3]. Hence, an available diagnosis technique for bearing faults is highly valuable [4]. The bearing faults can be classified into three main types: inner race fault, outer race fault, and rolling element fault [5]. The vibration signals of the bearings are often used for fault diagnosis for their containing abundant equipment operation information [6]. When the bearing faults occur, the corresponding vibration signals will produce periodic impulses, and the feature of this signal behaves in a typical nonlinear and non-stationary nature, which increases in spectral complexity [5,7,8]. In the actual industrial production environment, the signals usually contain some interference vibrations caused by other mechanical components and strong background noise besides useful fault feature.

RPCA
The acquired one-dimensional bearing fault signal x ∈ R n can be converted into a high dimensional signal trajectory matrix X ∈ R n 1 ×n 2 by phase space reconstruction, which is based on a embedding process with the parameter of the embedding dimension n 1 and the delay time τ (where (n 1 − 1)τ + n 2 = n) [22]: x 2+τ · · · x n 2 +τ . . . . . . · · · . . .
Except for the strong background noise component, X is composed of multiple feature components, which include the fault feature component and the unwanted interfering components. Hence, on the basis of noise suppression, how to separate the useful feature component from these mixed multi-components and back it to the one-dimensional signal to represent the extracted fault feature component was an inevitable task for extracting weak faults in this paper.
By referring to manifold learning theory, it can be found that the feature component in X has a low-dimensional submanifold structure [26,27]. RPCA can extract this structure by solving the following low-rank matrix and sparse matrix decomposition model (shown as Figure 1): minarg L,S rank(L) + α S 0 , s.t. X = L + S (2) where α = 1/ max(n 1 , n 2 ) is the optimal weighted parameter. The first term (L ∈ R n 1 ×n 2 , r(L) r(X)) of this model is a rank constraint function, which uses a low-rank matrix to estimate the low-dimensional submanifold structure, and the second term (S ∈ R n 1 ×n 2 ) is a regularization strategy, which is mainly used to correct the deviation of matrix data caused by noise. In fault feature extraction, L represents the signal feature component and S captures the noise [28]. Thus, RPCA can effectively separate the feature component of the signal from the noise. However, it can be seen from the theoretical basis of Equation (2) that RPCA can only deal with the matrix data composed of a single feature component and noise, which means this method cannot separate the data containing the multiple feature components.

LRPCA
To tackle the drawback of RPCA in processing the multi-component signals, we proposed a novel LRPCA method based on CLSLRMA to decompose X into several single-components. In LRPCA, the following fundamental assumption is introduced.
Fundamental Assumption: In addition to the background noise signal matrix S, X contains a feature signal matrix L composed of a linear mixture of several feature components. Furthermore, each feature component corresponds to a low-dimensional submanifold structure hidden in the high-dimensional phase space and has a characteristically of low-rank. feature extraction, L represents the signal feature component and S captures the noise [28]. Thus, RPCA can effectively separate the feature component of the signal from the noise. However, it can be seen from the theoretical basis of Equation (2) that RPCA can only deal with the matrix data composed of a single feature component and noise, which means this method cannot separate the data containing the multiple feature components. = + X S L Figure 1. Illustration of the robust principal component analysis (RPCA); A signal trajectory matrix X ∈ R n 1 ×n 2 can be decomposed into a low rank feature component L ∈ R n 1 ×n 2 and a sparse component S ∈ R n 1 ×n 2 . Figure 2 depicts the main ideal of the LRPCA. Specifically, each submanifold structure is generally hidden in the different high-dimensional phase space T (e i ) associated with the local selected anchor point (e i = (a i , b i ), a i = 1, . . . , n 1 ; b i = 1, . . . , n 2 ; i = 1, . . . , m), and T (e i ) is derived from the weighting of X: T (e i ) = W e i X where W e i ∈ R n 1 ×n 2 is a local weighted coefficient matrix. Thereafter, T (e i ) is decomposed into a low-rank component L e i and a sparse component S e i . The resulting low-rank matrices L e 1 , . . . , L e m represent the low-dimensional submanifolds, which correspond to different feature components in raw signal, that of the fault feature component and the unwanted interfering components. Conclusively, L is expressed as the weighted combination of these local low-rank matrices: L = R 1 L e 1 + · · · + R m L e m (4) where R i ∈ R n 1 ×n 2 is a weighted regression matrix. Thus, the fault-related feature components can be extracted from these low-rank matrices obtained by decomposing L. For this task, a novel local low-rank matrix and sparse matrix decomposition model was proposed to obtain these low-rank matrices from X: L e i = minarg L e i ,S e i rank(L e i ) + α S e i 0 , s.t. T(e i ) = L e i + S e i The RPCA is actually a special case of this model when W e i is a unit matrix. Therefore, the de-noising performance of this model can be guaranteed.   Figure 2. Illustration of the proposed LRPCA method; in the different high-dimensional phase space T (e i ) associated with the local selected anchor point e i = (a i , b i ), the signal trajectory matrix X ∈ R n 1 ×n 2 can be decomposed into a low rank component L e i ∈ R n 1 ×n 2 and a sparse component S e i ∈ R n 1 ×n 2 .

• Model Construction and Algorithm Solving
The derivation of the phase space T (e i ) is the key step in LRPCA. Firstly, a distance function d((a, b), (a , b )) is defined to describe the similarity between any two elements X(a, b) and X(a , b ) in X. A smaller value of d means the probability of the two elements being in the same phase space is higher. According to the theory of CLSLRMA, The standard incomplete SVD X = UV T [39] is employed to divide d into two independent terms and the arc-cosine function is utilized to calculate these two terms: Then, the following non-parametric smoothing Epanechnikov kernel function [40] is adopted to define the local weighted coefficient matrix W e i : The (i, j)-element of W e i is expressed as W(e i , (i, j)). The Epanechnikov kernel function is a typical unilateral quadratic decrement function related to d. Consequently, a lager value of the weighted coefficient means that the weight for X(i, j) belonging to T (e i ) is bigger. Besides, for different local anchor point e i , there theoretically should exist a unique corresponding phase space hidden the single submanifold. However, as the fundamental assumption states, there should be finite number of single submanifold structures in X. Gratifyingly, this function has the mathematical property of changing slowly, which means as long as the similarity of two anchor points e i and e j is large enough, their corresponding phase spaces T (e i ) and T (e i ) will have quite high similarity. Moreover, both of those two phase spaces may hide the same single submanifold structure.
Through the above theoretical analysis process, the T (e i ) satisfying the fundamental assumption was successful created. Finally, T (e i ) is decomposed to extract the hidden low-rank matrix L e i by solving the Equation (5). Since the discrete combination nature of the l 0 -norm and rank function, the solution of this model is an non-deterministic polynomial (NP)-hard problem [41]. Take this into consideration, some recent studies [36,37,[42][43][44][45] pointed out that an equivalent convex optimization program of this model can be obtained from the convex hull of the two constraints; that is, the l 1 -norm and the nuclear norm are employed to replace those two constraints respectively: The solution of Equation (8) is a typical convex optimization problem, whose minimizer is globally unique [43]. Algorithm 1 provides a precise and convergent solution to this equation via ADMM [44,46]. Note that steps 3 and 4 are both convex problems, they both have closed-form solutions via singular value thresholding operator [47].

Algorithm 1 solve (8) by ADMM
Input: signal trajectory matrix X ∈ R n 1 ×n 2 ; Parameter: number of anchor points: q; regularization parameter: α = 1/ max(n 1 , n 2 ); for all i = 1:m, parallel do; 1. select e i (a i , b i ) uniformly in X, and calculate W e i by Equation (7); 2. T (e i ) = W e i X; Initialize: L 0 e i = S 0 e i = Y 0 = 0, γ 0 = e −3 , γ max = e 10 , µ = 1.1, ε = 1e − 8; while not converged do; 3. fix the others and update L k+1 e i by: − S k+1 e i ; 6. update τ: τ k+1 = min(µτ k , τ max ); 7. check the convergence conditions: end; end; output: L e 1 , · · · , L e m . Through Algorithm 1, we can obtain m low-rank matrices corresponding to different feature components in the raw signal. Besides, it needs to be emphasized that the performance of the final fault diagnosis is highly susceptible to the location selection and the number of the anchor points. Once the chosen anchor points are inappropriate, there may be multiple low-rank matrices corresponding to the same feature component. More seriously, the fault-related low-rank matrices may be completely missed. In view of the above problems, for the selection of the anchor points, we do not have a good solution for now. But, in the process of analyzing the experimental data, the following two principles are feasible. The first one is that m should be large enough to ensure that the low-rank matrices corresponding to all feature components can be extracted specifically. This can be explained from the perspective of the probability. When the number of the anchor points is more, the probability of the extracted fault-related low-rank matrices is higher. However, this inevitably requires a lot of computing time. The other one is that these anchor points should be uniformly chosen from the elements set and the distance between any two anchor points should be made large enough. This principle is to make extracted low-rank matrices corresponding to the different feature components differ as much as possible. In the numerical simulation experiment, since the raw simulated signal contained three feature components, and we found that when m was set as six, the final decomposition performance was quite good. Additionally, when analyzing the experimental signal, it was found that m = 6 is also appropriate. Therefore, according to the analysis results of the experimental data, we assumed that the number of the main feature components in the acquired bearing fault signal generally would not exceed three, and set m to be twice that number; that is m = 6.

• Global Approximation of Fault Feature Component
These low-rank matrices can be backed to m one-dimensional component signals by inverse transform. In Section 2.2, it will show that the one-dimensional components related to fault feature can be identified from these signal through the MPSE characteristic of the signal. Thus, there are o (o < m) identified one-dimensional signals and their corresponding low-rank matrices L e 1 , . . . , L e o . These low-rank matrices are actually local sensitive, which can only be used to describe the fault feature information contained in the corresponding low-dimensional submanifolds. Hence, the Nadaraya-Watson regression model [38] is adopted to combine these low-rank matrices into a global approximationL f ∈ R n 1 ×n 2 of the fault feature component, which can be expressed as: Note that if o = m, this equation is equivalent to Equation (4). Thus, the estimator of L can be obtained, which is actually a de-noising process. Thereby, we returnedL f to the one-dimensional time seriesl f ∈ R n , which is the expected fault feature signal. Hence, the task to suppress the strong background noise and extract the fault feature signal was completed.

Basic Theory of MSPE
The MPSE can efficiently evaluate the dynamic complexity and randomness of the time series under different scales. The calculation of the MSPE depends on three parameters: scale factor ε, embedding dimension d and time-lag δ [32,36]. For a signal time series x = [x 1 , x 2 , · · · , x n ], the main calculation steps can be divided as follows: (1) Transform x into a successive coarse-grained time series y ε ∈ R n ε (n ε = [n/ε]) by averaging the time data points in x with the given non-overlapping time slice of the increasing length, ε. Then, each element of y ε is defined as: (2) For each coarse-grained time series y ε , the PE value needs to be calculated. Firstly, y ε is cut into a series of data segments through d and δ: There will be n ε − (d − 1)δ data segments in total. Then, there are d! different types of ordinal patterns (ψ i , i = 1, . . . , d!) in the data segments. Then, count the frequency of each pattern and denote them as f (ψ i ), i = 1, . . . , d!. Thus, the relative frequency of each pattern can be written as: Finally, the PE of y ε is expressed as: For convenience, we normalize P(ε) by dividing its maximum value log 2 d!:  Figure 3 shows the process of coarse granulation and the data segmentation of a time series. The bearing fault feature signal can be modeled as the combination of finite pulsed excitations [8,[48][49][50]:   The bearing fault feature signal can be modeled as the combination of finite pulsed excitations [8,[48][49][50]: where b i (t), c(t), and a i (t) indicate the amplitude modulation component, the frequency modulation component, and the modulation effect caused by vibration transfer path, respectively. f e and θ i indicate the system resonance frequency and the initial phase, respectively. b i (t) and c i (t) can be expressed as: where ξ, f c , and T d indicate the resonance attenuation coefficient, the fault feature frequency, and the time period of fault, respectively. u(t) represents a unit step function. υ i and θ il represent the random slip of the i-th pulse and the initial phase, respectively. B i and C il are amplitudes. Generally, the vibration sensor is mounted at the bearing seat, which is fixed with the outer race. In the case of the rolling element fault or the inner race fault, the vibration propagation generated by the signal transfer path from the fault location to the sensor is varies with time, resulting in an amplitude modulation effect in the signal: where f r indicates the rotation frequency of the shaft where the fault bearing is located, and C is a constant.
In the case of the outer race fault, the vibration propagation only has a scaling effect on the signal amplitude due to the transfer path being fixed: One accepted approach to fault identification is to identify the fault-related frequency contents from the signal spectrum. For example, Figure 4 shows the waveform and the spectrum of a simulated bearing's inner race fault signal, and the related frequency content includes: the fault feature frequency f c , its harmonic frequencies n f c , the rotational frequency f r , and the modulated side band formed by their combination n f c ± f r ; Figure 5 shows the waveform and the spectrum of a bearing outer race fault signal. The frequency content includes: the fault feature frequency f c and its harmonic frequencies n f c . It can be observed that nonlinear and non-stationary nature of these fault signals increases the complexity of the spectrum. Especially in the fault of inner race, where the complex modulation side band appears. As a result, the dynamic characteristic of the inner race fault should be more complex than that of outer race fault. side band formed by their combination c r nf f ± ; Figure 5 shows the waveform and the spectrum of a bearing outer race fault signal. The frequency content includes: the fault feature frequency c f and its harmonic frequencies c nf . It can be observed that nonlinear and non-stationary nature of these fault signals increases the complexity of the spectrum. Especially in the fault of inner race, where the complex modulation side band appears. As a result, the dynamic characteristic of the inner race fault should be more complex than that of outer race fault.

Identification of Fault-Related Components through MSPE
When faults occur in a mechanical system, such as gear or bearing parts, the MSPE value of the vibration signal will change, and the value varies with different type of the faults. Thus, MSPE can be adopted to identify the fault-related components.
The common interference components in bearing signals include the shock signals generated by other parts and the harmonic signals. Therefore, without loss of generality, we discuss the MSPE characteristics of different types of components bearing fault signals, including the fault feature signals of the bearing's inner race (shown as Figure 4) and the bearing outer race (shown as Figure 5), a harmonic signal, a shock signal, and a Gaussian white noise signal. The results of these five type signals are shown as Figure 6a. The results demonstrate that the MSPE value of a regular time series, such as harmonic signal or shock signal, is smaller, the examples being basically below 0.3; in contrast, since advent of the complex dynamic characteristic, the MPSE values of the bearing fault feature signals are larger, ranging from 0.4 to 0.7. In addition, it can be observed that the MSPE x 10 -3 frequency/HZ bearing outer race fault signal. The frequency content includes: the fault feature frequency c f and its harmonic frequencies c nf . It can be observed that nonlinear and non-stationary nature of these fault signals increases the complexity of the spectrum. Especially in the fault of inner race, where the complex modulation side band appears. As a result, the dynamic characteristic of the inner race fault should be more complex than that of outer race fault.

Identification of Fault-Related Components through MSPE
When faults occur in a mechanical system, such as gear or bearing parts, the MSPE value of the vibration signal will change, and the value varies with different type of the faults. Thus, MSPE can be adopted to identify the fault-related components.
The common interference components in bearing signals include the shock signals generated by other parts and the harmonic signals. Therefore, without loss of generality, we discuss the MSPE characteristics of different types of components bearing fault signals, including the fault feature signals of the bearing's inner race (shown as Figure 4) and the bearing outer race (shown as Figure 5), a harmonic signal, a shock signal, and a Gaussian white noise signal. The results of these five type signals are shown as Figure 6a. The results demonstrate that the MSPE value of a regular time series, such as harmonic signal or shock signal, is smaller, the examples being basically below 0.3; in contrast, since advent of the complex dynamic characteristic, the MPSE values of the bearing fault feature signals are larger, ranging from 0.4 to 0.7. In addition, it can be observed that the MSPE x 10 -3 frequency/HZ

Identification of Fault-Related Components through MSPE
When faults occur in a mechanical system, such as gear or bearing parts, the MSPE value of the vibration signal will change, and the value varies with different type of the faults. Thus, MSPE can be adopted to identify the fault-related components.
The common interference components in bearing signals include the shock signals generated by other parts and the harmonic signals. Therefore, without loss of generality, we discuss the MSPE characteristics of different types of components bearing fault signals, including the fault feature signals of the bearing's inner race (shown as Figure 4) and the bearing outer race (shown as Figure 5), a harmonic signal, a shock signal, and a Gaussian white noise signal. The results of these five type signals are shown as Figure 6a. The results demonstrate that the MSPE value of a regular time series, such as harmonic signal or shock signal, is smaller, the examples being basically below 0.3; in contrast, since advent of the complex dynamic characteristic, the MPSE values of the bearing fault feature signals are larger, ranging from 0.4 to 0.7. In addition, it can be observed that the MSPE value of the inner race fault is larger than that of the outer race. In particular, the randomness of the noise signal is the strongest with the MSPE value more than 0.9, which proves that the MSPE is sensitive to noise. effectively identify the early weak fault feature signals under strong background noise. Figure 6b shows the MPSE of the four feature signals in Figure 6a, after Gaussian white noise with a SNR of −5 is added. Due to the existence of the strong background noise, the randomness of signals is greatly enhanced, resulting in the MSPE values of the four feature signals increasing to more than 0.9 and mixing together. As a result, the fault feature components are impossible to be identified. Therefore, it is an urgent problem to reduce the noise before identifying the weak fault feature components.  (1) Using the proposed LRPCA method to decompose the trajectory matrix consisting of the acquired bearing fault signal into multiple low-rank matrices and to suppress the noise synchronously; (2) Convert the low-rank matrices obtained into one-dimensional component signals by inverse transformations and identify the fault-related components from these signals through the MPSE characteristic of the signal; (3) Using the weighted Nadaraya-Watson regression model and inverse transform to combine the low-rank matrices corresponding to identified components into a one-dimensional signal to represent the extracted fault feature component; (4) Confirm the bearing fault by identifying the fault-related frequency contents from the signal spectrum. According to the above analysis results, we can set a threshold range (0.5-0.85) of the MSPE value or select the larger value (but no more than 0.9) to identify the components related to bearing fault feature from the q one-dimensional component signals obtained by LRPCA.

The Process of the Effective Fault Diagnosis Technique
It needs to be emphasized that although MSPE has a good anti-noise ratio, it may not be able to effectively identify the early weak fault feature signals under strong background noise. Figure 6b shows the MPSE of the four feature signals in Figure 6a, after Gaussian white noise with a SNR of −5 is added. Due to the existence of the strong background noise, the randomness of signals is greatly enhanced, resulting in the MSPE values of the four feature signals increasing to more than 0.9 and mixing together. As a result, the fault feature components are impossible to be identified. Therefore, it is an urgent problem to reduce the noise before identifying the weak fault feature components. (1) Using the proposed LRPCA method to decompose the trajectory matrix consisting of the acquired bearing fault signal into multiple low-rank matrices and to suppress the noise synchronously; (2) Convert the low-rank matrices obtained into one-dimensional component signals by inverse transformations and identify the fault-related components from these signals through the MPSE characteristic of the signal; (3) Using the weighted Nadaraya-Watson regression model and inverse transform to combine the low-rank matrices corresponding to identified components into a one-dimensional signal to represent the extracted fault feature component; (4) Confirm the bearing fault by identifying the fault-related frequency contents from the signal spectrum.

xperiments . Numerical Simulation Experiment
In order to verify the diagnostic performance of the proposed technique, without loss erality, a simulated signal composed of multiple components was generated: is the simulated fault feature signal of a bearing's inner race, as shown in Figure 4, and ailed parameters are listed in Table 1.

Numerical Simulation Experiment
In order to verify the diagnostic performance of the proposed technique, without loss of generality, a simulated signal composed of multiple components was generated: where x 1 (t) is the simulated fault feature signal of a bearing's inner race, as shown in Figure 4, and its detailed parameters are listed in Table 1. x 2 (t) and x 3 (t) are the interferences of a shock signal and a harmonic signal, respectively. Figure 8a,b shows the waveforms of these two signals and both of their feature frequencies are 150 Hz. n(t) represents the strong background white Gaussian noise. The signal sampling frequency and the sampling point are f s = 20,000 and n = 20,000, respectively.   Then, the fault feature extraction performance of the proposed technique was investigated. In this experiment, a noise with a SNR of −5 db was added into the simulation signal. Figure 10 shows the waveform and the spectrum of the noise-signal mixture. It can be seen that the fault feature is completely submerged by noise and the other interferences, which inevitably increases the difficulty of recognizing the fault feature. The methods of wavelet shrinkage denoising [14], basis pursuit denoising [53], EMD, and SSA were selected for comparative analysis. In the wavelet shrinkage denoising method, the decomposition layer was selected as 4 and the wavelet basis function was set as "db15." The waveform and the spectrum of the final extracted Firstly, the de-noising performance of the proposed LRPCA method was tested. The strong background noise with SNR from −5 to 5 db was added to the simulated signals to imitate the early weak fault. For visually displaying the de-noising performance of LRPCA, the methods of RPCA, SSA, and EMD were employed to make a comparative analysis. The phase space reconstruction parameters used in LRPCA, RPCA, and SSA were all set as n 1 = 200 and τ = 100. EMD employed the energy difference tracking method [51] to select desired IMF components. The hard threshold method [52] was adopted in SSA to select the best combination of singular values to reconstruct the signal. Figure 9 illustrates the de-noising result of the above four methods. In the vertical axis of the graphics, the higher SNR value indicates a better de-noising performance. It is clear that the proposed LRPCA method can provide the best noise suppression performance.  Then, the fault feature extraction performance of the proposed technique was investigated. In this experiment, a noise with a SNR of −5 db was added into the simulation signal. Figure 10 shows the waveform and the spectrum of the noise-signal mixture. It can be seen that the fault feature is completely submerged by noise and the other interferences, which inevitably increases the difficulty of recognizing the fault feature. The methods of wavelet shrinkage denoising [14], basis pursuit denoising [53], EMD, and SSA were selected for comparative analysis. Then, the fault feature extraction performance of the proposed technique was investigated. In this experiment, a noise with a SNR of −5 db was added into the simulation signal. Figure 10 shows the waveform and the spectrum of the noise-signal mixture. It can be seen that the fault feature is completely submerged by noise and the other interferences, which inevitably increases the difficulty of recognizing the fault feature. The methods of wavelet shrinkage denoising [14], basis pursuit denoising [53], EMD, and SSA were selected for comparative analysis.
Then, the fault feature extraction performance of the proposed technique was investigated. In this experiment, a noise with a SNR of −5 db was added into the simulation signal. Figure 10 shows the waveform and the spectrum of the noise-signal mixture. It can be seen that the fault feature is completely submerged by noise and the other interferences, which inevitably increases the difficulty of recognizing the fault feature. The methods of wavelet shrinkage denoising [14], basis pursuit denoising [53], EMD, and SSA were selected for comparative analysis. In the wavelet shrinkage denoising method, the decomposition layer was selected as 4 and the wavelet basis function was set as "db15." The waveform and the spectrum of the final extracted In the wavelet shrinkage denoising method, the decomposition layer was selected as 4 and the wavelet basis function was set as "db15." The waveform and the spectrum of the final extracted fault feature signal are shown in Figure 11. In the resulting spectrum, the fault feature frequency ( f c ) and some of its harmonic frequencies (2 f c −4 f c ) can be found. However, due to a large number of interference components and strong background noise, the identification of these fault-related frequencies was seriously affected, and the modulation sidebands were completely submerged. hence, this method was insufficient in feature extraction of a simulated signal. fault feature signal are shown in Figure 11. In the resulting spectrum, the fault feature frequency ( c f ) and some of its harmonic frequencies ( 2 c f − 4 c f ) can be found. However, due to a large number of interference components and strong background noise, the identification of these fault-related frequencies was seriously affected, and the modulation sidebands were completely submerged. hence, this method was insufficient in feature extraction of a simulated signal.
In the basis pursuit denoising method, we adopted the compressed sensing reconstruction algorithm [54] to extract the fault feature signal and Figure 12 shows the final analysis result. In the resulting spectrum, the Fault harmonic frequencies ( In the EMD, fourteen IMFs can be obtained by decomposing the signal and the waveforms of their top twelve are shown in Figure 13a. By applying Fourier transform to these IMFs, in the resulting spectrums of IMF3 and IMF4, partial fault-related frequency contents can be found; i.e., fault harmonic frequencies ( Frequency/Hz In the basis pursuit denoising method, we adopted the compressed sensing reconstruction algorithm [54] to extract the fault feature signal and Figure 12 shows the final analysis result. In the resulting spectrum, the Fault harmonic frequencies (3 f c −5 f c ) and their modulated sidebands (3 f c ± f r , 4 f c ± f r , and 5 f c ± f r ) could be identified. But, there were still a lot of interference frequency components and strong background noise, leading to the fault of the signal not being directly determined.
In the EMD, fourteen IMFs can be obtained by decomposing the signal and the waveforms of their top twelve are shown in Figure 13a. By applying Fourier transform to these IMFs, in the resulting spectrums of IMF3 and IMF4, partial fault-related frequency contents can be found; i.e., fault harmonic frequencies (2 f c −5 f c ) and the modulated sidebands (3 f c − f r , 4 f c − f r , and 5 f c + f r ). The waveform and the spectrum of these two IMFs are shown in Figure 13b-e. However, there were still plenty of interference frequency components and noise, which are adverse to the identification of fault feature. In the EMD, fourteen IMFs can be obtained by decomposing the signal and the waveforms of their top twelve are shown in Figure 13a. By applying Fourier transform to these IMFs, in the resulting spectrums of IMF3 and IMF4, partial fault-related frequency contents can be found; i.e., fault harmonic frequencies ( In the SSA, by setting the threshold value of the energy of singular value to reach 95% of the total, four singular subspaces representing different feature components can be obtained. Figure 14a illustrates the waveforms of the corresponding one-dimensional feature component signals of these four subspaces obtained through inverse transform. Then, through similarity analysis, it was concluded that components 2 and 4 had higher similarity with the raw fault feature signal. Thus, the concluded that components 2 and 4 had higher similarity with the raw fault feature signal. Thus, the signal obtained by adding them represents the extracted fault feature component. The waveform and the spectrum of this extracted signal are shown in Figure 14b,c. In the spectrum, some fault-related frequency contents could be found. But, similar to the analysis result of EMD, there were also some interference frequency components and noises, which may have had an adverse effect on the final diagnosed result. The diagnosed result of the proposed technique is shown in Figure 15. Figure 15a shows the waveforms of the one-dimensional component signals obtained by LRPCA, which correspond to the six low-rank matrices. Additionally, the MSPEs of those components are displayed in Figure 13b. It can be observed that the MSPE values of component 3 and 6 are relatively higher and evenly distributed between 0.5 and 0.8. Thus, we could combine the low-rank matrices corresponding to those two components into a one-dimensional signal to represent the extracted fault feature signal. Figure 15c,d show the waveform and the spectrum of this extracted signal. All fault related frequencies were clearly discernable in the spectrum. Furthermore, it can be observed that the interference frequency components and the background noise were basically eliminated. According to the above analysis information, we can undoubtedly confirm that the fault occurred at the inner race. In the SSA, by setting the threshold value of the energy of singular value to reach 95% of the total, four singular subspaces representing different feature components can be obtained. Figure 14a illustrates the waveforms of the corresponding one-dimensional feature component signals of these four subspaces obtained through inverse transform. Then, through similarity analysis, it was concluded that components 2 and 4 had higher similarity with the raw fault feature signal. Thus, the signal obtained by adding them represents the extracted fault feature component. The waveform and the spectrum of this extracted signal are shown in Figure 14b,c. In the spectrum, some fault-related frequency contents could be found. But, similar to the analysis result of EMD, there were also some interference frequency components and noises, which may have had an adverse effect on the final diagnosed result. The diagnosed result of the proposed technique is shown in Figure 15. Figure 15a shows the waveforms of the one-dimensional component signals obtained by LRPCA, which correspond to the six low-rank matrices. Additionally, the MSPEs of those components are displayed in Figure 13b. It can be observed that the MSPE values of component 3 and 6 are relatively higher and evenly distributed between 0.5 and 0.8. Thus, we could combine the low-rank matrices corresponding to those two components into a one-dimensional signal to represent the extracted fault feature signal. Figure 15c,d show the waveform and the spectrum of this extracted signal. All fault related frequencies were clearly discernable in the spectrum. Furthermore, it can be observed that the interference frequency components and the background noise were basically eliminated. According to the above analysis information, we can undoubtedly confirm that the fault occurred at the inner race. The above simulation results indicate that EMD and SSA perform when decomposing a complicated multi-component signal into finite single-components. However, their anti-noise ability against strong noise is insufficient and there are still some interference components in the final extracted feature signals. On the contrary, the proposed technique is more effective in eliminating the strong background noise and extracting the weak fault feature component, and its diagnostic performance for the simulated signal is obviously better than the other two methods.

Experimental Signal Analysis
The vibration signal acquired on the spot is more complex than the simulation signal. In order to further verify the practicability of the proposed technique, a pitting fault signal of the bearing's outer race sampling from a bearing-gear fault's experimental table was analyzed. Figure 16 shows the experimental table, which consists of an AC motor, couplings, a gearbox with two pairs of meshing gears, and a magnetic powder brake. The test bearing is a single-row tapered roller bearing of the type of 32206 and its fault was handled by electrical discharge machining (EDM) method. The red arrow in Figure 16a displays the mounting position of the bearing. The experiment parameter was listed in Table 2. The vibration data of this experiment are measured by PCB acceleration sensor in the vertical direction of the bearing. the experimental table, which consists of an AC motor, couplings, a gearbox with two pairs of meshing gears, and a magnetic powder brake. The test bearing is a single-row tapered roller bearing of the type of 32206 and its fault was handled by electrical discharge machining (EDM) method. The red arrow in Figure 16a displays the mounting position of the bearing. The experiment parameter was listed in Table 2. The vibration data of this experiment are measured by PCB acceleration sensor in the vertical direction of the bearing.   Figure 17 is the diagram of the waveform and the spectrum of the acquired signal. It can be seen that the weak fault feature was impossible to be identified. Then, the proposed technique was utilized to diagnose the signal, and the methods of wavelet shrinkage denoising, basis pursuit denoising, EMD, and SSA were chosen for a comparative analysis.   Figure 17 is the diagram of the waveform and the spectrum of the acquired signal. It can be seen that the weak fault feature was impossible to be identified. Then, the proposed technique was utilized to diagnose the signal, and the methods of wavelet shrinkage denoising, basis pursuit denoising, EMD, and SSA were chosen for a comparative analysis. The analysis results of the wavelet shrinkage denoising method and basis pursuit denoising method are shown as Figures 18 and 19, respectively. In their result spectrums, although the fault characteristic frequency ( o f ) and its harmonic frequencies ( 2 o f − 3 o f ) can be found, the identification of these fault-related frequencies is seriously affected by a large number of interference components and strong background noise. Therefore, the fault type of bearing cannot be directly determined from these analysis results. The analysis results of the wavelet shrinkage denoising method and basis pursuit denoising method are shown as Figures 18 and 19, respectively. In their result spectrums, although the fault characteristic frequency ( f o ) and its harmonic frequencies (2 f o −3 f o ) can be found, the identification of these fault-related frequencies is seriously affected by a large number of interference components and strong background noise. Therefore, the fault type of bearing cannot be directly determined from these analysis results.
The analysis results of the wavelet shrinkage denoising method and basis pursuit denoising method are shown as Figures 18 and 19, respectively. In their result spectrums, although the fault characteristic frequency ( o f ) and its harmonic frequencies ( 2 o f − 3 o f ) can be found, the identification of these fault-related frequencies is seriously affected by a large number of interference components and strong background noise. Therefore, the fault type of bearing cannot be directly determined from these analysis results.   Figure 20b,d,e. However, that fault feature information is hard to be recognized due to the presence of many interference frequency components and strong background noise. Therefore, it is difficult to determine whether the outer race of the bearing is faulty. The analysis results of the wavelet shrinkage denoising method and basis pursuit denoising method are shown as Figures 18 and 19, respectively. In their result spectrums, although the fault characteristic frequency ( o f ) and its harmonic frequencies ( 2 o f − 3 o f ) can be found, the identification of these fault-related frequencies is seriously affected by a large number of interference components and strong background noise. Therefore, the fault type of bearing cannot be directly determined from these analysis results.
(a) (b)   Figure 20b,d,e. However, that fault feature information is hard to be recognized due to the presence of many interference frequency components and strong background noise. Therefore, it is difficult to determine whether the outer race of the bearing is faulty.   Figure 20b,d,e. However, that fault feature information is hard to be recognized due to the presence of many interference frequency components and strong background noise. Therefore, it is difficult to determine whether the outer race of the bearing is faulty. Figure 21 shows the fault feature extraction result of SSA. The peaks of fault feature frequency ( f o ) and its triple frequency (3 f o ) were obvious in the spectrum. But there are still many interference peaks and noise, which affects the identification of fault feature. These above analysis results indicate that neither EMD nor SSA can provide good fault diagnosis performances for the experimental fault signal.
The proposed technique was utilized to diagnose the signal. Figure 22a shows the waveforms of the one-dimensional component signal corresponding to the six low-rank matrices obtained by LRPCA. Furthermore, their MSPE values were displayed in Figure 22b. It can be observed that the MSPE values of component 1, 3, 4, and 6 are relatively higher. Meanwhile, the MSPE value of component 4 was basically above 0.9, which may be the feature signal of other component with more complex dynamics characteristic. The MSPE values of components 1, 3, and 6 ranged from 0.7 to 0.9, and their trends were similar, so we combined the low-rank matrices corresponding to these three components into a one-dimensional feature signal representing the extracted the fault feature signal. Figure 22c,d show the waveform and the spectrum of this extracted signal. In the spectrum, the rotation frequency ( f r ), the fault feature frequency ( f o ), and its double frequency (2 f o ) and triple frequency (3 f o ) can be easily found. Moreover, it can be seen that the energy of the noise was suppressed at a low level and there were a few interference frequency components, which had little effect on the final diagnosed result. Consequently, we could confirm that the outer race of bearing had fault. These analysis results demonstrate that the proposed technique can effectively suppress the strong background noise and extract weak fault feature component from the multi-component signal, which means the proposed technique can provide a great diagnostic performance when dealing with the experimental signal.
waveform and the spectrum of this extracted signal. In the spectrum, the rotation frequency ( r f ), the fault feature frequency ( o f ), and its double frequency ( 2 o f ) and triple frequency ( 3 o f ) can be easily found. Moreover, it can be seen that the energy of the noise was suppressed at a low level and there were a few interference frequency components, which had little effect on the final diagnosed result. Consequently, we could confirm that the outer race of bearing had fault.

Conclusions
In general, the bearing's weak fault feature exhibits the nature of nonlinear and non-stationary, which is hard to be extracted under the situation of existing strong background noise and interference components. In considering this problem, an effective bearing fault diagnosis technique via LRPCA and MSPE was introduced in this paper. The LRPCA can decompose the signal trajectory matrix into multiple low-rank matrices, and meanwhile, suppress the noise. The MSPE was used to identify the low-rank matrices corresponding to bearing's fault feature. Thereafter, those identified low-rank matrices were combined into a one-dimensional signal to represent the extracted fault feature component for further fault diagnosis. The principle and the effectiveness of this technique was verified by the analysis of both the simulation signal and the acquired bearing fault signal. The analysis results indicate that the proposed technique can effectively detect and locate the bearing faults accurately.
The threshold range representing the bearing fault feature component was determined based on the results of simulation experiments. However, the feature components of the acquired bearing fault signals should be much more complex than simulated signals. Therefore, we will focus on determining a more appropriate threshold range for acquired signal in future work. In addition, the de-nosing performance of proposed LRPCA method was verified by the signal simulation Equation (20), and Figure 9 shows that it performs better than other methods. In fact, the robustness of the method can be improved by statistical evidence. By extending the evaluation to greater number of rounds, the results of each round are collected, and through meaningful statistical tests, the parameters of the method can be optimized, leading to a significant improvement in noise reduction performance. Therefore, we will also carry out more deep research to that end in future work.