Diagnosis of Compound Fault Using Sparsity Promoted-Based Sparse Component Analysis

Compound faults often occur in rotating machinery, which increases the difficulty of fault diagnosis. In this case, blind source separation, which usually includes independent component analysis (ICA) and sparse component analysis (SCA), was proposed to separate mixed signals. SCA, which is based on the sparsity of the target signals, was developed to separate compound faults and diagnose them effectively because of its advantage over ICA in underdetermined conditions. However, vibration signals are inadequately sparse, and it is difficult to represent them in a sparse way. To overcome this problem, a sparsity-promoting approach based on wavelet modulus maxima is applied to obtain sparse observation signals. Then, the potential function is used to estimate the number of source signals and the mixing matrix from the sparse signals. Finally, the source signals are separated by the shortest path method. To validate the effectiveness of the proposed method, simulated signals and vibration signals measured from faulty roller bearings are used. The faults that occur in a roller bearing are the outer-race flaw, the inner-race flaw and the rolling element flaw. The results show that the fault features acquired using the proposed approach are close to the theoretical values. For instance, the inner-race feature frequency of 101.3 Hz is very close to the theoretical value of 101 Hz. Therefore, the suggested method can effectively separate compound faults, even in underdetermined cases. In addition, a comparison shows that the proposed method outperforms the traditional SCA method when the vibration signals are inadequately sparse.


Introduction
Rotating machinery, such as roller bearings, is commonly used in industrial settings [1,2]. The fault diagnosis of rotating machinery is very important to guarantee that the whole machinery system works under normal conditions [3,4]. However, fault diagnosis is complicated because compound faults occur frequently [5,6]. To separate compound faults effectively, the blind source separation (BSS) method was proposed [7,8]. Among BSS methods, independent component analysis (ICA) and sparse component analysis (SCA) are the most widely used algorithms. They are effective in separating the source signals from the mixed observations [9,10]. ICA is based on the statistical independence of the source signals, and SCA is based on their sparsity.
In the case of compound faults, problems with equal numbers of observations and sources are called determined problems. If the number of observations is greater than the number of sources, they are called overdetermined problems. For some determined and overdetermined problems, the ICA methods perform well on separation [11,12]. However, the actual signals are usually collected by a limited number of sensors because of environmental conditions, so the number of observations is less than the number of sources, which makes the problem underdetermined. To solve this problem, certain classical algorithms, such as ensemble empirical mode decomposition (EEMD), were used to decompose a single-channel signal into multiple signals [13,14]. However, the number of decomposed channels is often greater than the number of source signals, so the separation result contains false source signals. Moreover, the EEMD-based ICA method takes a long time to carry out. In addition, ICA-based BSS has the precondition that the source signals are non-Gaussian and independent of each other [15,16]. All of this means that the above methods are not applicable to compound faults under underdetermined conditions.
To solve the above-mentioned problems, sparse component analysis was gradually developed. It makes full use of the sparse characteristics of the signals to realize source signal separation, and it can effectively separate sparse source signals when ICA does not perform well. The concept of the sparsity model was first proposed by Olshausen [17], and later the concept of SCA was raised by Lewicki [18,19]. The concept that SCA can completely reconstruct the source signal in BSS was put forward by Georgiev et al. [20,21]. In addition, Bofill et al. presented a method of estimating the mixing matrix by the potential function (PF) and successfully separated mixed sound signals [22].
To separate the observed signals using the SCA method, the sparsity of the target signals is one of the most significant conditions that must be satisfied. Source signals such as sine and cosine signals can be represented sparsely by the Fourier Transform. However, it is very difficult to represent vibration signals in a sparse way using traditional methods, such as the Fourier Transform and the wavelet transform. Therefore, to represent the vibration signals sparsely, a sparsity-promoting method using the wavelet modulus maxima is proposed in this paper. First, the wavelet modulus maxima are used to thin the mixed signals. Then, the number of source signals and the mixing matrix are estimated by the PF based on the sparsity of the observed signals. Finally, the shortest path method is used to separate the source signals. In addition, a comparison is made with the traditional envelope spectrum method and the SCA method in order to verify the effectiveness of the proposed method.
The rest of the paper is organized as follows. The basic theories of sparse component analysis and wavelet modulus maxima are introduced in Section 2. Section 3 presents the detailed steps to separate the compound faults based on the proposed method. Section 4 mainly discusses the research on simulation signals and experimental signals. Lastly, the conclusions are drawn in Section 5.

Sparse Component Analysis
Sparse component analysis algorithms require that the signal is sparse enough. A sparse signal is a signal that is zero (or close to zero) most of the time and takes significantly larger values only occasionally. Its probability density function shows a sharp peak followed by a smooth decay. If the source signals are sparse, zero (or near-zero) values are far more frequent than nonzero values, and at most one source signal is dominant at most times [23,24].
The model can be expressed as:
$$X = AS \tag{1}$$
where $X = [x(1), x(2), \cdots, x(T)]$ represents the observation signals; $A$ represents the mixing matrix; and $S$ represents the source signals. The PF method uses a clustering algorithm based on the sparsity of the data to estimate the number of sources and the mixing matrix of underdetermined blind signals [22]. The PF method assumes that the number of mixed (observed) signals is two. Let $l_t = \sqrt{x_1^2(t) + x_2^2(t)}$ and $\theta_t = \tan^{-1}(x_2(t)/x_1(t))$ be the polar coordinates of each data point. The PF $\Phi$ is defined as:
$$\Phi(\theta, \lambda) = \sum_t l_t \, \varphi(\lambda(\theta - \theta_t)) \tag{2}$$
where the basis function $\varphi$ is a triangular function of the local angle $\alpha$ given by Equation (3), $\theta - \theta_t$ is the angular difference between an arbitrary direction $\theta$ and $\theta_t$ in polar coordinates, $\lambda$ adjusts the angular width (resolution) of the local contributions, and $l_t$ is a weight that puts more emphasis on the more reliable data:
$$\varphi(\alpha) = \begin{cases} 1 - \dfrac{|\alpha|}{\pi/4}, & |\alpha| < \pi/4 \\ 0, & \text{otherwise} \end{cases} \tag{3}$$
The PF $\Phi$ shows how the sampled data cluster around each direction: if a column vector of the mixing matrix $A$ lies along a direction in which the sampled data concentrate, the PF in that direction is large. The number of local maxima of $\Phi$ therefore estimates the number of vibration sources, and the directions of the local maxima give the column vectors of the estimated mixing matrix $\hat{A}$. Thus, the number of vibration sources and the mixing matrix estimate $\hat{A}$ are obtained.
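The PF estimate described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the angle grid size `n_grid`, the value of `lam` (the parameter λ) and the relative peak threshold are all assumptions, and directions are folded into [0, π) on the assumption that a mixing column and its negative describe the same line.

```python
import math


def potential_function(x1, x2, lam=20.0, n_grid=180):
    """Evaluate a triangular-kernel potential function on an angle grid.

    Returns (angles, phi); angles are in radians and cover [0, pi).
    lam and n_grid are illustrative choices, not values from the paper.
    """
    # Polar form of every sample; the angle is folded into [0, pi).
    lengths = [math.hypot(a, b) for a, b in zip(x1, x2)]
    thetas = [math.atan2(b, a) % math.pi for a, b in zip(x1, x2)]

    angles = [math.pi * k / n_grid for k in range(n_grid)]
    phi = []
    for theta in angles:
        total = 0.0
        for l_t, th_t in zip(lengths, thetas):
            # Smallest angular difference on the half circle.
            diff = abs(theta - th_t)
            diff = min(diff, math.pi - diff)
            alpha = lam * diff
            if alpha < math.pi / 4:
                # Triangular basis function weighted by the sample length.
                total += l_t * (1.0 - alpha / (math.pi / 4))
        phi.append(total)
    return angles, phi


def local_maxima(phi, min_rel_height=0.2):
    """Indices of circular local maxima of phi above a relative threshold."""
    n = len(phi)
    top = max(phi)
    return [k for k in range(n)
            if phi[k] > phi[k - 1] and phi[k] >= phi[(k + 1) % n]
            and phi[k] >= min_rel_height * top]
```

Each returned peak angle θ gives one estimated column (cos θ, sin θ) of the mixing matrix, and the number of peaks is the estimated number of sources. In practice λ trades resolution against robustness: a larger value narrows the kernel and separates close directions, but makes the peaks noisier.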
When the mixing matrix $A$ is known, the solution is still not unique, because the system in Equation (4) is underdetermined. The usual approach in sparse BSS is to find the solution that minimizes the $l_1$ norm. In this case, the optimal representation of the data point $x(t)$ is:
$$\min_{s(t)} \sum_{j=1}^{n} |s_j(t)| \quad \text{subject to} \quad A s(t) = x(t) \tag{4}$$
The shortest path method finds the shortest path from the origin to $x(t)$ along the directions of the column vectors of the mixing matrix: among all possible decompositions of the observed signal, the one with the smallest sum of vector lengths is the solution. The shortest path decomposition involves the following steps [25]:
(1) Extract any $m$ column vectors from the estimated $m \times n$ mixing matrix $\hat{A}$ to form an $m \times m$ submatrix $\hat{A}^{*}$; there are $C_n^m$ such submatrices.
(2) Calculate the inverse $(\hat{A}^{*}_i)^{-1}$ of each submatrix, $i = 1, \cdots, C_n^m$.
(3) Decompose the observed signal vector $x(t)$ along the base vectors, i.e., the column vectors of each submatrix, and obtain the length of each component vector.
For any time $t$, the observed signal is formed by a linear combination of at most $m$ sparse source signals, so the source signal is determined by:
$$\hat{s}(t) = (\hat{A}^{*}_{k})^{-1} x(t) \tag{5}$$
where $k$ indexes the submatrix whose decomposition has the smallest sum of vector lengths, and the sources that correspond to the remaining columns are set to zero. The source signals can therefore be calculated for all times according to Equation (5), which decomposes the source signal estimation into $T$ linear programming problems.
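The three steps of the shortest path decomposition can be illustrated for the two-observation case. The sketch below is an assumption-laden example rather than the method of [25] verbatim: the function name and the column-list layout of the matrix are hypothetical, the columns are assumed to have unit length (so that the l1 norm of the coefficients equals the path length), and each 2 × 2 system is solved with Cramer's rule instead of an explicit inverse.

```python
from itertools import combinations


def shortest_path_separate(x, A):
    """Estimate n source values at one time instant from a 2-D observation.

    x : (x1, x2), the observation at time t
    A : list of n columns [(a11, a21), (a12, a22), ...] of the 2 x n mixing matrix
    Returns a list of n source estimates; at most two entries are nonzero.
    """
    n = len(A)
    best, best_cost = [0.0] * n, float("inf")
    for i, j in combinations(range(n), 2):
        (a, c), (b, d) = A[i], A[j]      # submatrix [[a, b], [c, d]]
        det = a * d - b * c
        if abs(det) < 1e-12:             # skip (near-)parallel column pairs
            continue
        # Solve the 2 x 2 system with Cramer's rule.
        s_i = (d * x[0] - b * x[1]) / det
        s_j = (a * x[1] - c * x[0]) / det
        cost = abs(s_i) + abs(s_j)       # path length for unit-norm columns
        if cost < best_cost:
            best_cost = cost
            best = [0.0] * n
            best[i], best[j] = s_i, s_j
    return best
```

Calling this function once per sample yields the full source estimate. The result is only reliable where the sparsity assumption holds: when more sources are active at the same instant than there are observations, the minimum-length decomposition need not coincide with the true one.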

Wavelet Modulus Maxima
Wavelet analysis extends Fourier analysis by providing a local transformation in both space (time) and frequency, so it can effectively extract information from a signal and solve many problems that are difficult to handle with the Fourier Transform. Discretizing the continuous wavelet means discretizing the continuous scale parameter $s$ and the translation parameter $\tau$. Usually, they are taken as $s = s_0^j$, $\tau = k \tau_0 s_0^j$, $j, k \in Z$. Accordingly, the wavelet function can be expressed as [26,27]:
$$\psi_{j,k}(t) = s_0^{-j/2} \, \psi(s_0^{-j} t - k \tau_0) \tag{6}$$
and the corresponding discrete wavelet transform can be expressed as:
$$W_f(j, k) = \int_{-\infty}^{+\infty} f(t) \, \psi_{j,k}^{*}(t) \, dt \tag{7}$$
If the wavelet function $\psi$ is the first or second derivative of a low-pass function, the wavelet transform reflects the singular points of the signal, i.e., its extreme points or turning points. The definition of a signal singularity can be expressed as follows.
The function $f(x)$ has a Lipschitz index $\alpha$ at the point $x_0$ if and only if there exist a constant $K > 0$ and a polynomial $p_n(x)$ of degree $n = \lfloor \alpha \rfloor$ such that, $\forall x \in R$,
$$|f(x) - p_n(x - x_0)| \le K |x - x_0|^{\alpha} \tag{8}$$
It can be seen from Equation (8) that the local Lipschitz index $\alpha$ of the function $f(x)$ at the point $x_0$ characterizes the type of singularity at that point. Specifically, $\alpha > 1$ means that $f(x)$ is differentiable at $x_0$; $0 < \alpha < 1$ means that $f(x)$ is continuous but not differentiable at $x_0$; $\alpha = 0$ means that $f(x)$ is not continuous but bounded at $x_0$; and $\alpha < 0$ means that $f(x)$ has a singularity at $x_0$ that is actually noise.
When using this local singularity of wavelet analysis, the wavelet coefficients depend on the characteristics of f (x) in the x 0 neighborhood and the size of the scale selected by the wavelet transform. In wavelet transform, the local singularity of the signal is defined as follows.
The necessary and sufficient condition for the function $f(x)$ to have a uniform Lipschitz index $\alpha$ on the interval $[a, b]$ is that there exists a constant $K > 0$ such that, $\forall x \in [a, b]$, its wavelet transform satisfies
$$|W_{2^j} f(x)| \le K (2^j)^{\alpha} \tag{9}$$
Taking the base-2 logarithm of both sides gives
$$\log_2 |W_{2^j} f(x)| \le \log_2 K + j\alpha \tag{10}$$
Then, $\alpha$ is called the singularity index at $x_0$ (also known as the Lipschitz index). Equation (10) gives the law by which the logarithm of the wavelet transform coefficient varies with the scale $j$, and the modulus of the wavelet transform at a singular point of the signal follows this law. If the Lipschitz index of $f(x)$ satisfies $\alpha > 0$, the modulus maxima of the wavelet transform coefficients increase with the scale. On the contrary, when $\alpha < 0$, the modulus maxima decrease as the scale increases. For $\alpha = 0$, the modulus maxima do not change with the scale.
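The scale dependence described above suggests a simple numerical check: if the modulus maximum of one singularity is tracked across dyadic scales, the slope of its base-2 logarithm against the level j approximates α. The sketch below assumes that the bound is approximately tight along the maxima line and that levels are numbered j = 1, 2, ...; the function name is hypothetical, and the result depends on the normalization of the wavelet transform.

```python
import math


def lipschitz_slope(maxima_by_scale):
    """Least-squares slope of log2|W| against the level j = 1, 2, ...

    maxima_by_scale : nonzero modulus maxima of one singularity, finest scale first
    """
    js = list(range(1, len(maxima_by_scale) + 1))
    ys = [math.log2(abs(m)) for m in maxima_by_scale]
    j_mean = sum(js) / len(js)
    y_mean = sum(ys) / len(ys)
    num = sum((j - j_mean) * (y - y_mean) for j, y in zip(js, ys))
    den = sum((j - j_mean) ** 2 for j in js)
    return num / den
```

A positive slope indicates maxima that grow with the scale, which is the behaviour expected of genuine signal singularities, while a negative slope indicates maxima that fade at coarser scales, as noise does.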
The extreme points or turning points of a signal are called singular points, and they mainly reflect the characteristics of the signal. For an actual signal, the Lipschitz index satisfies α > 0, so the wavelet transform modulus maxima of the signal increase with the scale. This propagation characteristic is the basis of the sparse representation of a signal by its wavelet modulus maxima. Since the extreme points of the signal are relatively stable, the wavelet modulus maxima are well suited to improving the sparsity of the signal. The thinning procedure based on wavelet modulus maxima is as follows:
(1) Perform an n-level wavelet transform with the chosen wavelet basis function to get the high-frequency wavelet coefficients at each level, and obtain the wavelet modulus maxima.
(2) For each level, search for the communication (propagation) points of its maxima, and a neighborhood around them, on the previous level.
(3) Keep the maxima points inside the neighborhood of the communication points and remove the modulus maxima outside the neighborhood.
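The thinning steps above can be sketched for two adjacent levels. This is a simplified illustration under stated assumptions, not the authors' code: both coefficient sequences are assumed to have the same length (as in an undecimated transform), the neighborhood is a fixed index radius, and the function names and the `radius` parameter are hypothetical.

```python
def modulus_maxima(coeffs):
    """Indices where |coeffs| has a local maximum (strict on the left side)."""
    mags = [abs(c) for c in coeffs]
    return [k for k in range(1, len(mags) - 1)
            if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]]


def thin_by_propagation(fine, coarse, radius=2):
    """Keep fine-level maxima that have a coarse-level maximum nearby.

    fine, coarse : coefficient sequences of equal length
    Returns a sparse copy of `fine` in which every other sample is zero.
    """
    coarse_idx = modulus_maxima(coarse)
    keep = [k for k in modulus_maxima(fine)
            if any(abs(k - c) <= radius for c in coarse_idx)]
    sparse = [0.0] * len(fine)
    for k in keep:
        sparse[k] = fine[k]
    return sparse
```

For example, with `fine = [0, 1, 0, 0, 0, 3, 0, 0, 0, 0.5, 0]` and `coarse = [0, 0, 0, 0, 1, 4, 1, 0, 0, 0, 0]`, only the fine-level maximum at index 5 has a coarse-level counterpart within the radius, so it is the only sample that survives. With a decimated transform, the indices of the two levels would first have to be mapped onto a common time axis.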

Separation Method by Sparsity Promoted-Based SCA
There are mainly two ways to solve the underdetermined BSS problem. One is to decompose the original signals into multiple channel signals, through which the underdetermined BSS problem is converted into a determined or overdetermined problem. For this purpose, the EEMD-based ICA method is commonly used. The other is SCA, which separates mixed signals based on their sparsity. However, the EEMD-based ICA method is often time-consuming and is likely to produce false source signals. In addition, the vibration signals generated by a roller bearing fault cannot be represented sparsely in the normal way. Thus, the SCA method might not perform well when compound faults exist. To meet the sparsity requirement, the wavelet modulus maxima are used to represent the vibration signals in a sparse way.
A new method based on the wavelet modulus maxima and sparse underdetermined BSS is proposed in this paper. Several experiments on roller bearing faults are performed to verify the efficiency of the presented method. The flowchart of the proposed method is shown in Figure 1, and its results are compared with those of the traditional SCA method. First, the roller bearing signals are collected by the acceleration sensor. Second, the db4 wavelet is used to perform a four-level wavelet transform to get the high-frequency wavelet coefficients of each level and acquire the modulus maxima. The maxima points outside the neighborhood of the communication (propagation) points are removed and those inside the neighborhood are kept, which yields the sparse mixed signals. The sparse signal that has the most obvious peaks and the most prominent turning points in the potential function is chosen for the subsequent analysis. Third, the PF is used to estimate the mixing matrix and the number of sources, and the source signals are separated by the shortest path method. Lastly, the roller bearing fault features are extracted by comparing them with the theoretical characteristic frequencies of a roller bearing.

Simulation Analysis
To prove the applicability of the SCA algorithm in underdetermined BSS applications, three sparse source signals are generated according to Equation (11), with characteristic frequencies of 100 Hz, 67 Hz and 30 Hz. The sampling frequency is $f_s$ = 20,000 Hz and the data length is $N$ = 40,960 samples. The spectra of the source signals are presented in Figure 2. The three signals are randomly mixed into two observed signals $x_1$ and $x_2$ according to Equation (12) using a random mixing matrix.
Sensors 2017, 17, 1307
Figure 3 shows the mixed signals and their spectra. The characteristic frequencies cannot be identified from the results shown in Figure 3. In this case, the wavelet modulus maxima are employed to generate the sparse signals. First, the db4 wavelet is used to perform a four-level wavelet transform to get the high-frequency wavelet coefficients of each level and acquire the modulus maxima. Second, the communication (propagation) points on the previous level are located; the maxima inside the neighborhood of these points are kept, while those outside it are removed. Finally, the sparse signals are obtained.
The scatter diagram of the sparse signals is shown in Figure 4, and their PF is shown in Figure 5. The peaks and turning points are apparent, which means that the signals are sufficiently sparse. The number of source signals and the estimate of the mixing matrix $A$ are obtained from Figures 4 and 5. The number of sources is three, and the estimated mixing matrix is very close to the true mixing matrix.
Finally, the source signals are separated by the shortest path method. The results, shown in Figure 6, are very close to the source signals displayed in Figure 2, which reveals that the proposed method works well in underdetermined BSS.
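A simulation of this kind can be set up along the following lines. The sketch is illustrative only: the source model (a decaying oscillation restarted at the characteristic frequency), its resonance and decay parameters, and the entries of the 2 × 3 matrix are assumptions that stand in for Equations (11) and (12), which are not reproduced here.

```python
import math


def impulse_train(f_char, fs, n, f_res=3000.0, decay=800.0):
    """Decaying oscillation restarted f_char times per second (hypothetical model)."""
    period = fs / f_char                   # samples between two impulses
    out = []
    for k in range(n):
        tau = (k % period) / fs            # time since the last impulse, in seconds
        out.append(math.exp(-decay * tau) * math.sin(2 * math.pi * f_res * tau))
    return out


def mix(sources, A):
    """Instantaneous linear mixing x(t) = A s(t); A is given as a list of rows."""
    n = len(sources[0])
    return [[sum(a * s[k] for a, s in zip(row, sources)) for k in range(n)]
            for row in A]


fs, n = 20000, 2000                        # shortened record for illustration
sources = [impulse_train(f, fs, n) for f in (100, 67, 30)]
A = [[0.9, 0.5, 0.2],                      # illustrative mixing matrix,
     [0.3, 0.8, 0.95]]                     # not the one used in the paper
x1, x2 = mix(sources, A)
```

Two observations of three sources make the problem underdetermined, which is the case the PF and shortest path steps are meant to handle.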

Experimental Verification and Discussion
To verify the effectiveness of the method proposed in this paper, an experimental system for bearing diagnosis is used. As shown in Figure 7a, the experimental system includes a rotating machine, the roller bearing (NTN N204) and acceleration sensors. A 1-kW AC motor drives the experimental setup, and the measurements are made without load. The accelerometers are mono-axial piezoelectric accelerometers (IMI 608A11) with a bandwidth from 0.5 Hz to 10 kHz and a sensitivity of 100 mV/g. The vibration signals of the roller bearing faults are collected by two accelerometers mounted on the bearing housing in the vertical and horizontal directions, as indicated in Figure 7b. The signals used in this paper are those from the sensor in the vertical direction, because the fault features are more obvious in this direction. The faults that often occur in a roller bearing are the outer-race flaw, the inner-race flaw and the rolling element flaw. These flaws were made artificially, as shown in Figure 7c-e, for the condition diagnosis tests. The technical characteristics of the NTN N204 bearing are displayed in Table 1. The flaws in the roller bearings are all 0.5 × 0.15 mm (width × depth) and were created using a wire-cutting machine. The inherent frequencies of a roller bearing are generally distributed in the high-frequency band. To fully analyse the signal features and to support further research, a relatively high sampling frequency is chosen to acquire more comprehensive information on the machinery condition. Thus, the sampling frequency is 100 kHz and the sampling time is 10 s. A 1 s segment of the vibration signals is used in this study. The vibration signals collected at 900 rpm are used to verify the effectiveness of the proposed method.

Table 1. Technical characteristics of the NTN N204 bearing.
Number of Rollers | External Diameter (mm) | Inner Diameter (mm) | Width (mm)
10 | 47 | 20 | 14

The fault passing frequency of each element of the roller bearing can be calculated according to Equations (13)-(15), where $D$ is the pitch diameter, $Z$ is the number of rollers, $d$ is the diameter of the rollers, $\alpha$ is the contact angle of the rollers, and $f_r$ is the rotating frequency. $f_o$, $f_i$ and $f_b$ represent the fault frequencies of the outer race, the inner race and the rollers, respectively. These equations are based on the assumption of a pure rolling motion. However, in practice, some sliding motion may occur, which causes slight deviations in the characteristic frequency locations. Therefore, these equations should be regarded as approximations only. The approximate results are shown in Table 2.
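For orientation, the widely used kinematic formulas for the outer-race, inner-race and roller frequencies under pure rolling can be coded as below. Whether Equations (13)-(15) take exactly this form is an assumption; the function name is hypothetical, and the roller and pitch diameters in the usage lines are illustrative values, not data from Table 1, so the output is not meant to reproduce Table 2.

```python
import math


def bearing_frequencies(f_r, Z, d, D, alpha=0.0):
    """Standard pure-rolling fault frequencies (outer race, inner race, roller spin).

    f_r   : shaft rotating frequency in Hz
    Z     : number of rollers
    d, D  : roller diameter and pitch diameter, in the same unit
    alpha : contact angle in radians
    """
    ratio = (d / D) * math.cos(alpha)
    f_o = 0.5 * Z * f_r * (1 - ratio)               # outer-race passing frequency
    f_i = 0.5 * Z * f_r * (1 + ratio)               # inner-race passing frequency
    f_b = 0.5 * (D / d) * f_r * (1 - ratio ** 2)    # roller spin frequency
    return f_o, f_i, f_b


f_r = 900 / 60                                      # 900 rpm corresponds to 15 Hz
f_o, f_i, f_b = bearing_frequencies(f_r, Z=10, d=6.5, D=33.5)   # d and D are illustrative
```

A quick consistency property of these formulas is that the outer-race and inner-race frequencies always sum to Z times the rotating frequency. Some references define the roller fault frequency as twice the spin frequency, since a roller defect can strike both races in one revolution.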

Table 2. Fault characteristic frequencies.
Fault | Characteristic Frequency
Outer race | 60 Hz
Inner race | 101 Hz
Roller | 72 Hz

The fault signals of the outer race, the inner race and the rollers are mixed into two signals $x_1$ and $x_2$ by a random matrix. The waveforms and the spectra of the mixed signals are shown in Figure 8, where the fault characteristic frequencies cannot be obtained from the envelope spectrum of the mixed signals.
The wavelet modulus maxima are used to thin the mixed signals to achieve linear clustering. First, the db4 wavelet is used to perform a four-level wavelet transform to acquire the high-frequency wavelet coefficients of each level and obtain the wavelet modulus maxima. Second, the communication (propagation) points on the previous level are located; the maxima points inside the neighborhood of these points are kept, whereas the modulus maxima far from them are removed. The wavelet modulus maxima and their spectra are displayed in Figures 9 and 10. Only the 60.18 Hz and 101.3 Hz components can be faintly recognized from the envelope spectrum, and the roller defect cannot be diagnosed.
The wavelet modulus maxima at the fourth level of each mixed signal are used as the sparse signals because they have appropriate sparsity. The PF of the sparse signals is presented in Figure 11, which shows three distinct peaks. Figure 11 indicates that the linear clustering effect is apparent and that the estimated number of source signals is three. The estimate of the mixing matrix is obtained from the PF.

Comparison with the Traditional SCA
To confirm that sparsity is the key to the SCA method and that the proposed method effectively improves the sparsity of the signal, fault diagnosis based on the traditional SCA is carried out for comparison. First, the sparse signal is obtained by the wavelet transform; then, the number of source signals and the mixing matrix are estimated by the PF from the sparse signals.
The PF is shown in Figure 13, and the number of source signals is only 2. Then, the shortest path method is used to separate the source signal. The waveforms and spectrum of the separated source signal are shown in Figure 14, where the fault feature of the inner-race is 101.38 Hz, but the characteristics of the outer-race and the roller are

Conclusions
A sparse component analysis method based on the wavelet modulus maxima is proposed in this paper to extract compound fault features of roller bearings in underdetermined BSS. To overcome the insufficient sparsity encountered when SCA is performed, the wavelet modulus maxima are used to promote the sparsity of the vibration signals. Then, the PF is employed to estimate the mixed matrix. Once the mixed matrix is known, the observed signal is decomposed according to the shortest path method, in which the solution is the combination with the smallest sum of vector lengths, and the compound faults are thereby separated. To verify the validity of the proposed method, simulated signals and vibration signals of faulty roller bearings are used. The results indicate that the extracted fault features of roller bearings with outer-race, inner-race and roller defects are close to the theoretical values. For example, the inner-race feature frequency of 101.3 Hz is very close to the theoretical value of 101 Hz. Therefore, compound faults can be separated with the presented method. In addition, the experimental results show that, compared with the traditional SCA method, the proposed method successfully sparsifies the mixed signal and extracts the compound fault features. The time required to separate the source signals still needs to be reduced, and future work will focus on this point.
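The shortest path decomposition summarized above can be illustrated with a short Python sketch for two observations. It is a brute-force reading of "the smallest sum of the vector length in all combinations": for every sample, each pair of mixed-matrix columns is tried and the pair with the smallest |s_i| + |s_j| is kept. The function name `shortest_path_separate` and the row-wise layout of `A` are assumptions made for the example, and the columns are assumed to have unit norm.

```python
from itertools import combinations


def shortest_path_separate(x1, x2, A):
    """Separate n sources from two observations with a known 2 x n matrix A.

    A is given as two rows, A[0] and A[1]; column i is (A[0][i], A[1][i]).
    For each sample, the pair of columns that reproduces the observation
    with the smallest |s_i| + |s_j| is selected; other sources stay zero.
    """
    n = len(A[0])
    sources = [[0.0] * len(x1) for _ in range(n)]
    for t, (a, b) in enumerate(zip(x1, x2)):
        best = None
        for i, j in combinations(range(n), 2):
            det = A[0][i] * A[1][j] - A[0][j] * A[1][i]
            if abs(det) < 1e-12:
                continue                      # parallel columns, skip this pair
            si = (a * A[1][j] - b * A[0][j]) / det   # Cramer's rule
            sj = (b * A[0][i] - a * A[1][i]) / det
            cost = abs(si) + abs(sj)
            if best is None or cost < best[0]:
                best = (cost, i, j, si, sj)
        if best is not None:
            _, i, j, si, sj = best
            sources[i][t], sources[j][t] = si, sj
    return sources
```

Trying every pair for every sample is simple but slow for long records; restricting the search to the two columns whose directions enclose the observation is a common shortcut that may help with the separation time noted above.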
Acknowledgments: This work is partially supported by the National Natural Science Foundation of China (Grant Nos. 51675035 and 51375037).
Author Contributions: Huaqing Wang designed and conceived the experiments; Yansong Hao and Liuyang Song carried out the experimental work and collected the data; Peng Chen and Liuyang Song analyzed the experimental results; Yansong Hao wrote the paper; Huaqing Wang and Yanliang Ke reviewed and edited the manuscript. All authors contributed to discussing and revising the manuscript.

Conflicts of Interest: The authors declare no conflict of interest.
