A Novel Signal Separation Method Based on Improved Sparse Non-Negative Matrix Factorization

To separate and extract compound fault features from a single-channel vibration signal, a novel signal separation method based on improved sparse non-negative matrix factorization (SNMF) is proposed. Because traditional SNMF performs poorly in underdetermined blind source separation, a constraint reference vector, which can be generated by the pulse method, is introduced into the SNMF algorithm. Square wave sequences are constructed as the constraint reference vector. The output separated signal is constrained by the vector, and the vector is updated according to the feedback of the separated signal. The redundancy of the mixture signal is reduced as the vector is continually updated. The time–frequency distribution is first applied to capture the local fault features of the vibration signal. The high-dimensional feature matrix of the time–frequency distribution is then factorized with the improved SNMF method to select local fault features. Meanwhile, the compound fault features can be separated and extracted automatically by exploiting the sparse property of the improved SNMF method. Finally, envelope analysis is used to identify the features of the output separated signal and realize compound fault diagnosis. The simulation and test results show that the proposed method can effectively separate compound faults of rotating machinery while reducing the dimension and improving the efficiency of the algorithm. They also confirm that the feature extraction and separation capability of the proposed method is superior to that of the traditional SNMF algorithm.


Introduction
Analysis methods based on vibration signals have been widely used in the fault diagnosis of mechanical equipment [1,2], because the vibration signal usually contains the main information about the operating state of the equipment [3,4]. However, in real engineering the observed signals of mechanical equipment are often non-stationary [5,6] and contain multiple fault characteristics at the same time [7,8]. Moreover, the coupling of fault features increases the difficulty of compound fault diagnosis [9,10]. Therefore, separating multiple source signals and extracting compound fault features effectively from vibration signals is of great significance for the normal operation of mechanical equipment and for system health management [11,12].
A transform domain decomposition method [13] is usually used to achieve the separation of multiple source components, such as empirical mode decomposition (EMD) [14], local mean ...
... appropriate reference constraints on the basis of the sparse non-negative matrix factorization (SNMF) algorithm, the improved SNMF method can improve the processing ability of the SNMF algorithm and reduce the procedure of source signals and noise features detected. With the improved SNMF method, the high-dimensional feature matrix of the time-frequency distribution is factorized to select local fault features. Furthermore, owing to the sparse property of the improved SNMF, the proposed method can accomplish the separation of the compound fault features automatically.
The remaining sections are organized as follows: Section 2 describes the fundamental principle of NMF algorithm. In Section 3, the improved SNMF algorithm is introduced. The separation of compound fault signals based on the proposed method is presented in Section 4. The simulation and test signals are discussed to evaluate the proposed method in Section 5. Finally, the conclusions are summarized in Section 6.

Principle of Non-Negative Matrix Factorization
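Matrix factorization is an effective means for large-scale data processing and analysis. The NMF algorithm can be stated as follows: given a non-negative matrix $V \in \mathbb{R}_{+}^{m \times n}$, the algorithm constructs an approximate factorization as the product of two non-negative matrices $W \in \mathbb{R}_{+}^{m \times r}$ and $H \in \mathbb{R}_{+}^{r \times n}$ [42], namely:
$$V_{m \times n} \approx W_{m \times r} H_{r \times n} \tag{1}$$
where $m$ is the dimension of the matrix $V_{m \times n}$ and $n$ is the number of samples. The parameter $r$, called the reduced rank, is generally chosen such that $r < mn/(m + n)$, so the product $W_{m \times r} H_{r \times n}$ can be considered a compressed form of the data $V_{m \times n}$.
Since the NMF algorithm was put forward, a variety of optimization algorithms based on different loss functions have been proposed to improve its efficiency. Traditionally, the loss function is the squared Euclidean distance:
$$E(W, H) = \| V - WH \|_F^2 = \sum_{i,j} \left( V_{ij} - (WH)_{ij} \right)^2 \tag{2}$$
The NMF algorithm for the loss function of Equation (2) can be regarded as the following optimization problem:
$$\min_{W, H} \| V - WH \|_F^2 \quad \text{s.t.} \quad W \geq 0,\ H \geq 0 \tag{3}$$
The optimization problem in Equation (3) is convex with respect to W and H separately, but it is non-convex in W and H simultaneously. Therefore, the problem is solved with an iterative multiplicative update algorithm that runs until the objective function converges to a constant value. The update rules are:
$$H_{a\mu} \leftarrow H_{a\mu} \frac{(W^{T} V)_{a\mu}}{(W^{T} W H)_{a\mu}}, \qquad W_{ia} \leftarrow W_{ia} \frac{(V H^{T})_{ia}}{(W H H^{T})_{ia}}$$
Later, the objective function was defined based on the KL divergence, namely:
$$D(V \,\|\, WH) = \sum_{i,j} \left( V_{ij} \log \frac{V_{ij}}{(WH)_{ij}} - V_{ij} + (WH)_{ij} \right)$$
The update rules used to obtain W and H are then:
$$H_{a\mu} \leftarrow H_{a\mu} \frac{\sum_{i} W_{ia} V_{i\mu} / (WH)_{i\mu}}{\sum_{k} W_{ka}}, \qquad W_{ia} \leftarrow W_{ia} \frac{\sum_{\mu} H_{a\mu} V_{i\mu} / (WH)_{i\mu}}{\sum_{\nu} H_{a\nu}}$$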
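The multiplicative updates described above can be sketched in a few lines of NumPy. This is a minimal illustration that assumes the standard Lee and Seung rules for the squared Euclidean and generalized KL losses; the function name `nmf_multiplicative`, the iteration count, and the small constant `eps` (added only to avoid division by zero) are illustrative choices, not taken from the paper.

```python
import numpy as np


def nmf_multiplicative(V, r, n_iter=200, loss="euclidean", eps=1e-12, seed=0):
    """Approximate a non-negative V (m x n) by W (m x r) @ H (r x n)."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    # Random non-negative initialization (strictly positive to avoid dead entries).
    W = rng.random((m, r)) + eps
    H = rng.random((r, n)) + eps
    for _ in range(n_iter):
        if loss == "euclidean":
            # Updates for the squared Euclidean loss ||V - WH||_F^2.
            H *= (W.T @ V) / (W.T @ W @ H + eps)
            W *= (V @ H.T) / (W @ H @ H.T + eps)
        elif loss == "kl":
            # Updates for the generalized KL divergence D(V || WH).
            H *= (W.T @ (V / (W @ H + eps))) / (W.sum(axis=0)[:, None] + eps)
            W *= ((V / (W @ H + eps)) @ H.T) / (H.sum(axis=1)[None, :] + eps)
        else:
            raise ValueError("loss must be 'euclidean' or 'kl'")
    return W, H
```

Because each update only multiplies by non-negative ratios, W and H stay non-negative without any explicit projection step.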

Sparse Non-Negative Matrix Factorization
In this section, the basic description of sparseness is discussed. "Sparse coding" is a representation scheme that can effectively represent typical data vectors with only a few active units. In other words, most of the representation units are close to zero, and only a very small part takes significantly non-zero values.
Currently, there are many functions for measuring the sparseness of data. A normalized measure should have the following property: the sparsest vector, with only one non-zero component, should have a sparseness of one; the least sparse vector, with all elements equal, should have a sparseness of zero.
The sparsity degree function [43] in this paper is based on the relationship between the $L_1$ norm and the $L_2$ norm, namely:
$$\mathrm{sparseness}(x) = \frac{\sqrt{n} - \left( \sum_{i} |x_i| \right) / \sqrt{\sum_{i} x_i^2}}{\sqrt{n} - 1}$$
where n is the dimension of the vector x. This function takes a value of 1 if and only if the vector x contains a single non-zero element, and takes a value of 0 if and only if all elements are equal; otherwise, the value varies smoothly between the two extremes. Different degrees of sparseness are illustrated in Figure 1, where each bar indicates the value of one element. At the leftmost, the level of sparseness is low and all the elements are roughly equal. At the rightmost, the level is high and most coefficients are zero except for a few non-zero elements.
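As a small worked example of the sparseness measure described above, the sketch below assumes the measure is the normalized ratio of the L1 norm to the L2 norm (Hoyer's definition), which matches the stated boundary behaviour: 1 for a single non-zero element and 0 for all-equal elements. The function name `sparseness` is an illustrative choice.

```python
import numpy as np


def sparseness(x):
    """Sparseness in [0, 1] based on the ratio of the L1 norm to the L2 norm."""
    x = np.asarray(x, dtype=float).ravel()
    n = x.size
    l1 = np.abs(x).sum()
    l2 = np.sqrt((x ** 2).sum())
    if n < 2 or l2 == 0.0:
        raise ValueError("need at least two elements and a non-zero vector")
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1.0)


one_hot = sparseness([0.0, 0.0, 5.0, 0.0])   # only one non-zero element
uniform = sparseness([2.0, 2.0, 2.0, 2.0])   # all elements equal
```

Here `one_hot` evaluates to 1 and `uniform` to 0, the two extremes of the scale; any other vector falls in between.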

For the choice of constraint terms, whether sparsity is imposed on W, on H, or on both depends on the specific application. Since the base matrix W contains the feature information of the original data, a sparseness constraint on W may make storage and calculation more convenient, but it may cause base features to be lost. Therefore, the constraint is usually applied to the coefficient matrix H, which can effectively enhance the features of the base matrix W.
The SNMF method based on the $L_1$ norm constraint is derived from Hoyer's non-negative sparse coding method [44], which combines the Euclidean distance with the norm constraint to form the objective function. In this objective function, λ is a regularization parameter used to balance sparseness against reconstruction error, and $a_i$ is the ith line of V. The corresponding update rules involve η, the step size of the iteration. Similarly, an objective function can be obtained by combining the generalized Kullback-Leibler (KL) divergence with the norm constraint, together with its own update rules.
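To make the KL-divergence variant with an L1 penalty concrete, here is a hedged NumPy sketch. It assumes the objective D(V || WH) + λ Σ H with the columns of W renormalized to unit sum after every step, which gives the fully multiplicative rule H ← H ⊙ Wᵀ(V ⊘ WH) / (1 + λ). This is one common formulation and is not necessarily the exact update rule used in the paper; `snmf_kl`, `lam`, and `eps` are illustrative names.

```python
import numpy as np


def snmf_kl(V, r, lam=0.1, n_iter=200, eps=1e-12, seed=0):
    """Sparse NMF sketch: generalized KL divergence plus an L1 penalty on H."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, r)) + eps
    W /= W.sum(axis=0, keepdims=True)        # unit-sum columns (assumed normalization)
    H = rng.random((r, n)) + eps
    for _ in range(n_iter):
        # H update: with unit-sum columns of W, the denominator W^T 1 + lam is 1 + lam.
        R = V / (W @ H + eps)
        H *= (W.T @ R) / (1.0 + lam)
        # W update followed by renormalization, so the penalty cannot be
        # evaded by shifting scale from H into W.
        R = V / (W @ H + eps)
        W *= R @ H.T
        W /= W.sum(axis=0, keepdims=True) + eps
    return W, H
```

A larger `lam` shrinks the coefficient matrix more strongly: after each H update the total mass of H is approximately `V.sum() / (1 + lam)`.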

Improved Sparse Non-Negative Matrix Factorization
In this section, it is shown how to improve sparseness in the SNMF framework. In both the Euclidean distance and the KL divergence formulations, the $L_1$ norm is added to the objective function as the sparse constraint term. The convergence of the two objective functions and the certainty of the solution have been analyzed and proved rigorously [45], which provides a solid mathematical basis for the solution process. However, the solution process based on KL divergence consists entirely of multiplicative rules, which reduces the computational complexity and better guarantees the iteration process [46]. Therefore, the objective function based on KL divergence is adopted in the algorithm.
Meanwhile, in order to improve the processing ability of the sparse non-negative matrix factorization algorithm, lower the dimension of the problem, and reduce the redundancy of the information after decomposition, a constraint reference vector $\vec{r} = (r_1, r_2, \cdots, r_n)^T$ (where n is the sample length) is introduced into the traditional algorithm. The vector contains the feature information of the proposed objective function, and this information can change according to the source signal. This paper uses the mean square error $\varepsilon(y, \vec{r})$ to measure the error between the output y and the reference vector. When y is closest to the source signal, $\varepsilon(y, \vec{r})$ has a minimum value. When $\varepsilon(y, \vec{r})$ satisfies Equation (14), the output result y is the desired source signal, where ξ is the threshold value. Using g(y) as the feasibility constraint of Equation (10), the solution of the algorithm can be projected onto the feasibility constraint function. Therefore, the problem of the improved SNMF algorithm can be summarized as Equation (15), where p(y) is the limiting constraint of the objective function, and y is the solution vector.
The steps of Algorithm 1 are as follows:
Step 1. Initialize the non-negative matrices W and H randomly.
Step 2. Extract the constraint reference vector $\vec{r}$ with the feature of the source signal.
Step 3. Calculate the initial value of the objective function from Equation (15).
Step 4. According to Equations (11) and (12), update the matrices W and H alternately and iteratively.
Step 5. If the objective function converges, stop the iteration and output the matrices W and H; otherwise, repeat Steps 3 and 4.
The biggest advantage of the improved SNMF method is that the vector is added as a constraint reference: it constrains the objective function, can be generated adaptively according to the source signal, and reduces the redundant components after decomposition.
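The reference-vector idea can be illustrated with a short sketch: build a square-wave reference at a given characteristic frequency, then use the mean square error and a threshold ξ to decide whether a candidate output is close enough to the reference. The duty cycle, the 0/1 amplitude levels, and all function names are assumptions made for illustration; the paper does not specify them here.

```python
import numpy as np


def square_wave_reference(n_samples, fs, f_char, duty=0.1):
    """0/1 square-wave sequence repeating at f_char (duty cycle is an assumption)."""
    t = np.arange(n_samples) / fs
    phase = (t * f_char) % 1.0               # position inside the current period
    return (phase < duty).astype(float)


def reference_error(y, r):
    """Mean square error between an output signal y and the reference r."""
    y = np.asarray(y, dtype=float)
    r = np.asarray(r, dtype=float)
    return float(np.mean((y - r) ** 2))


def is_close_to_reference(y, r, xi):
    """Threshold test: accept y when its error to the reference is at most xi."""
    return reference_error(y, r) <= xi


r = square_wave_reference(n_samples=1000, fs=1000.0, f_char=10.0)
```

In an iterative scheme, a test like `is_close_to_reference` would be evaluated on each candidate output, and the reference would be regenerated when the estimated characteristic frequency changes.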

Signal Separation Method Based on Improved SNMF
Based on the above analysis, a separation method for compound fault signals of bearings in rotating machinery is put forward using improved sparse non-negative matrix factorization. The implementation steps of Algorithm 2 are summarized as follows:

Algorithm 2: Signal Separation Method Based on Improved SNMF
Step 1. Apply the short-time Fourier transform (STFT) to the original vibration signal to obtain a high-dimensional feature matrix that characterizes local information.
Step 2. Take the energy value of the feature matrix to form a non-negative input matrix for the improved SNMF.
Step 3. Use the improved SNMF algorithm to reduce the dimension and obtain the base matrix W and the coefficient matrix H.
Step 4. Reconstruct the base matrix W and the coefficient matrix H in a low-dimensional space, and transform the time-frequency information into the time domain with the inverse short-time Fourier transform (ISTFT) to obtain a reconstructed waveform of each feature component.
Step 5. Select the reconstructed signal for envelope spectrum analysis to extract the fault feature of the bearing.
The flow chart is shown in Figure 2.
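Steps 1 to 4 above can be sketched end to end with SciPy. This is only an outline under stated assumptions: plain Euclidean NMF is used as a stand-in for the improved SNMF, and each component is brought back to the time domain by applying a soft mask to the complex STFT, since the phase handling is not specified in the text. The function name `separate_components` and all default parameters are illustrative.

```python
import numpy as np
from scipy.signal import stft, istft


def separate_components(x, fs, n_components=2, nperseg=256, n_iter=200,
                        eps=1e-12, seed=0):
    # Step 1: STFT gives the complex time-frequency feature matrix.
    _, _, Z = stft(x, fs=fs, nperseg=nperseg)
    # Step 2: energy values form the non-negative input matrix.
    V = np.abs(Z) ** 2
    # Step 3: factorize V (plain NMF here, standing in for the improved SNMF).
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], n_components)) + eps
    H = rng.random((n_components, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    # Step 4: rebuild each component with a soft mask and invert with the ISTFT.
    total = W @ H + eps
    signals = []
    for k in range(n_components):
        mask = (np.outer(W[:, k], H[k]) + eps / n_components) / total
        _, xk = istft(mask * Z, fs=fs, nperseg=nperseg)
        signals.append(xk[: len(x)])
    return np.array(signals), W, H
```

The masks sum to one in every time-frequency bin, so the separated waveforms add back up to the original signal; Step 5 (envelope analysis) would then be applied to each row of the returned array.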

Simulation Analysis
In order to verify the effectiveness of the proposed method, a model of compound faults of a rolling bearing was simulated, in which g = 0.1 is the damping coefficient and the source signals $s_1(t)$ and $s_2(t)$ are each defined by two feature parameters: the natural frequencies are 3000 Hz and 5000 Hz, respectively, and the characteristic frequencies are 63 Hz and 173 Hz. The sampling frequency is 100 kHz, and a 0.5 s time segment is analyzed. A = [0.8147, 0.9058] is a randomly generated mixing matrix. The mixed signal S(t) is obtained by Equation (17). The waveform and the envelope spectrum of the mixed signal are shown in Figure 3.
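A signal with these parameters can be generated as follows. The impulse model used here, an exponentially decaying sinusoid at the natural frequency restarted at every period of the characteristic frequency, is an assumed form that is common for simulated bearing faults; it is not necessarily the exact model of the paper. Only the numerical parameters (g, the four frequencies, the sampling rate, the duration, and A) come from the text.

```python
import numpy as np


def impulse_train(fs, duration, f_nat, f_char, g=0.1):
    """Decaying oscillation at f_nat that restarts every 1 / f_char seconds."""
    t = np.arange(int(round(fs * duration))) / fs
    tau = np.mod(t, 1.0 / f_char)            # time elapsed since the latest impact
    return np.exp(-g * 2 * np.pi * f_nat * tau) * np.sin(2 * np.pi * f_nat * tau)


fs = 100_000                                 # sampling frequency in Hz
s1 = impulse_train(fs, 0.5, f_nat=3000.0, f_char=63.0)
s2 = impulse_train(fs, 0.5, f_nat=5000.0, f_char=173.0)
A = np.array([0.8147, 0.9058])               # mixing matrix from the text
S = A[0] * s1 + A[1] * s2                    # single-channel mixed signal
```

Because A has one row and two columns, two sources are mixed into a single observation, which is the underdetermined case the method targets.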

According to the feature information of the simulated signal, the constraint reference signals were constructed by the pulse method. Square wave sequences with the same length as the mixed signal were generated as the constraint reference signals. The waveform and the partial enlargement of the reference signals are shown in Figure 4.
The proposed method was then applied to the mixed simulation signal. Firstly, the feature matrix X was obtained by STFT, and the time-frequency distribution is shown in Figure 5. Secondly, the energy value of the feature matrix was taken as the input matrix of the improved SNMF. Thirdly, the energy-value matrix was decomposed by the improved SNMF algorithm to obtain the base matrix W and the coefficient matrix H. Finally, the matrices W and H were reconstructed in the subspace, and the inverse short-time Fourier transform was used to transform them into the time domain, yielding the separated signals.
By introducing the constraint reference signals, two sets of separated signals were obtained, which indicated that the feature information in the two sets of separated signals was rich and described the source signal better. The envelope spectra of the separated signals are shown in Figure 6.
It can be seen from Figure 6 that the two characteristic components included in the source signal, 63 Hz and 173 Hz, can be separated by the proposed method. Therefore, the analysis of the simulation signal shows that the proposed method can effectively separate the source signals from the mixed signal, and that the characteristic frequency of each source signal can be extracted in the envelope spectrum, which verifies the effectiveness of the proposed method.
To verify the advantages of the proposed method, it was compared with the traditional SNMF algorithm using the KL-divergence model. When the reference vector is not introduced as a constraint term, some redundant signal components ($s_3$, $s_4$, $s_5$, $s_6$) are obtained, as Figure 7 shows, and the spectrogram is disorganized. To illustrate the advantages of the reference vector quantitatively, the best two sets of signals in Figure 7 were compared with the separated signals in Figure 6, and Equation (18) was defined to quantify the gains of the proposed approach. The signal-to-noise ratios (SNRs) of the two methods are shown in Table 1. In Equation (18), N is the sampling length, $A_{F_r}$ is the amplitude of the first three orders of the characteristic frequency, and $A_i$ is the amplitude of the frequency domain signal.
Table 1. SNRs of the two methods.
Method | SNR of first separated signal | SNR of second separated signal
The improved SNMF | -0.2527 | -0.3421
The traditional SNMF | -0.8484 (Figure 7c) | -1.6449 (Figure 7d)
According to Figure 7 and Table 1, the proposed approach has better separation and noise reduction effects.
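The envelope spectra used above to read off the characteristic frequencies can be computed with a Hilbert transform. The sketch below is a generic envelope analysis, not the authors' exact processing chain; the function name `envelope_spectrum` and the amplitude scaling are illustrative choices, and the test signal is a synthetic carrier at 3000 Hz modulated at 63 Hz.

```python
import numpy as np
from scipy.signal import hilbert


def envelope_spectrum(x, fs):
    """Return (frequencies, amplitudes) of the spectrum of the signal envelope."""
    x = np.asarray(x, dtype=float)
    env = np.abs(hilbert(x))                 # magnitude of the analytic signal
    env = env - env.mean()                   # remove the DC component
    amp = np.abs(np.fft.rfft(env)) * 2.0 / len(env)
    freqs = np.fft.rfftfreq(len(env), d=1.0 / fs)
    return freqs, amp


fs_demo = 10_000
t_demo = np.arange(fs_demo) / fs_demo        # one second of samples
x_demo = (1.0 + 0.5 * np.cos(2 * np.pi * 63 * t_demo)) * np.sin(2 * np.pi * 3000 * t_demo)
freqs_demo, amp_demo = envelope_spectrum(x_demo, fs_demo)
peak_hz = freqs_demo[np.argmax(amp_demo)]
```

For this synthetic example the dominant envelope line sits at the 63 Hz modulation frequency, which is how a fault characteristic frequency shows up when the carrier is a structural resonance.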

Experiment and Discussion
In this section, measured compound fault signals of a bearing are used to verify the effectiveness of the proposed method. Defects 0.5 mm wide and 0.15 mm deep were machined artificially on the outer ring and rolling elements of the bearing, an NTN N204 cylindrical roller bearing. During the experiment, the vibration signal in the vertical direction was collected by an acceleration sensor placed on the bearing housing. The experimental platform of the rotating machine and the faulty bearing are shown in Figure 8. The sampling frequency was 100 kHz and the sampling time was 10 s. The motor speed was set to 900 rpm, and the characteristic frequencies of the rolling bearing components were calculated according to the bearing structural parameters (Table 2) and the equations in references [47,48]. The theoretical characteristic frequencies are shown in Table 3.
The collected compound fault signals were used for analysis, with randomly selected 1 s time segments as the analysis data. The waveform and the envelope spectrum at 900 rpm are shown in Figure 9. The impact pulse component can be seen clearly in the time domain waveform, indicating that the bearing had failed, but the periodicity is not obvious, and useful state information of the bearing could not be obtained. In the envelope spectrum, the defect features of the outer race and rolling element were submerged by the noise component and were difficult to identify. The first peak in the spectrum appeared at about 6 Hz, close to the characteristic frequency of the cage, that is, the revolving frequency of the rolling elements, so this peak was caused by the impact of the rolling elements.
According to the feature information of the experimental signal, the constraint reference signals were constructed by the pulse method. Square wave sequences with the same length as the experimental signal were generated as the constraint reference signals. The waveform and the partial enlargement of the reference signals are shown in Figure 10.
Figure 10. Constraint reference signals: (a) the waveform of reference signal 3; (b) the partial enlargement of reference signal 3; (c) the waveform of reference signal 4; (d) the partial enlargement of reference signal 4.
According to the proposed method, the original signal was subjected to the short-time Fourier transform to obtain a feature matrix X, and the time-frequency distribution is shown in Figure 11. The energy value of the feature matrix was taken as the input matrix of the improved SNMF, and the energy-value matrix was then decomposed by the improved SNMF algorithm to obtain the base matrix W and the coefficient matrix H. Finally, the matrices W and H were reconstructed in the subspace, and the inverse short-time Fourier transform was used to transform them into the time domain, yielding the reconstructed signals.
Two sets of separated signals were obtained by introducing the constraint reference vector, which indicated that the feature information in the two sets of separated signals was rich and that the source signal contained two fault components. The envelope spectra of the separated signals are shown in Figure 12.
Two source signal components were clearly obtained by the proposed method. They corresponded to the fault characteristic frequencies of the outer race and the rolling element, respectively, and were consistent with the theoretical characteristic values. Their higher harmonic components were also extracted clearly. In addition, the cage frequency (6 Hz) and its higher harmonics appear in Figure 12b, and the sidebands of the fault characteristic frequency were spaced at the cage frequency, which is consistent with the features of a roller failure. Therefore, the results show that the proposed method could effectively separate the fault source signals from the mixed signal and that the fault characteristic frequencies could be extracted in the envelope spectrum, which verified the effectiveness of the proposed method for compound fault diagnosis of bearings.
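For orientation, the theoretical characteristic frequencies of a rolling bearing follow from the shaft speed and the bearing geometry through the standard kinematic formulas, sketched below. The geometry in the example (10 rollers, roller diameter 8 mm, pitch diameter 40 mm, zero contact angle) is hypothetical and chosen only for round numbers; it is not the NTN N204 data of Table 2, and the function name `bearing_frequencies` is illustrative.

```python
import math


def bearing_frequencies(rpm, n_rollers, d, D, alpha_deg=0.0):
    """Characteristic frequencies in Hz for a bearing with a stationary outer ring.

    d is the rolling element diameter, D the pitch diameter (same units),
    alpha_deg the contact angle in degrees.
    """
    fr = rpm / 60.0                                   # shaft rotation frequency
    ratio = (d / D) * math.cos(math.radians(alpha_deg))
    ftf = 0.5 * fr * (1.0 - ratio)                    # cage (fundamental train) frequency
    bpfo = 0.5 * n_rollers * fr * (1.0 - ratio)       # outer race defect frequency
    bpfi = 0.5 * n_rollers * fr * (1.0 + ratio)       # inner race defect frequency
    bsf = (D / (2.0 * d)) * fr * (1.0 - ratio ** 2)   # rolling element spin frequency
    return {"fr": fr, "cage": ftf, "outer": bpfo, "inner": bpfi, "roller": bsf}


# Hypothetical geometry, not the values of the test bearing.
freqs_900 = bearing_frequencies(rpm=900, n_rollers=10, d=8.0, D=40.0)
```

At 900 rpm the shaft frequency is 15 Hz, and with this hypothetical geometry the cage frequency comes out at 6 Hz, the same order as the first envelope peak discussed above. A rolling element defect is often also observed at twice the spin frequency, because the defect strikes both races once per revolution of the roller.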

Comparison with Traditional Method
In order to verify the advantages of the proposed method for compound fault diagnosis of bearings, it was compared with the traditional SNMF algorithm based on the KL-divergence model. The experimental data at 900 rpm were selected for this comparison. The energy values of the feature matrix were obtained by the short-time Fourier transform, and the traditional SNMF algorithm was used to reduce the dimension. The matrices W and H were reconstructed in the subspace, the reconstructed signals were taken as the separated signals (f1 and f2), and their envelope spectra are shown in Figure 13.
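The baseline in this comparison, sparse NMF under the KL-divergence, can be illustrated with the sketch below. This section does not list the baseline's update rules, so everything specific here is an assumption: the objective D_KL(V || WH) + sparsity * sum(H) with basis columns held at unit L2 norm, the multiplicative updates obtained by splitting the gradient into its positive and negative parts, and the names `sparse_nmf_kl`, `objective` and `sparsity`.

```python
import numpy as np


def objective(V, W, H, sparsity, eps=1e-12):
    """Generalized KL divergence D(V || WH) plus an L1 penalty on H."""
    V_hat = W @ H
    kl = np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat)
    return float(kl + sparsity * H.sum())


def sparse_nmf_kl(V, rank, sparsity=0.1, n_iter=300, seed=0, eps=1e-12):
    """Sparse NMF with KL divergence and unit-L2-norm columns of W."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    W /= np.linalg.norm(W, axis=0, keepdims=True)
    H = rng.random((rank, n)) + eps
    ones = np.ones_like(V)
    for _ in range(n_iter):
        # H step: the sparsity weight enters the denominator.
        R = V / (W @ H + eps)
        H *= (W.T @ R) / (W.T @ ones + sparsity + eps)
        # W step: gradient taken through the column normalization, so the
        # penalty on H cannot be cancelled by rescaling W.
        R = V / (W @ H + eps)
        neg = R @ H.T       # negative part of the gradient
        pos = ones @ H.T    # positive part of the gradient
        num = neg + W * np.sum(W * pos, axis=0, keepdims=True)
        den = pos + W * np.sum(W * neg, axis=0, keepdims=True) + eps
        W *= num / den
        W /= np.linalg.norm(W, axis=0, keepdims=True) + eps
    return W, H
```

With `sparsity=0` the sketch reduces to KL-divergence NMF with normalized basis columns. The L2 normalization is a deliberate choice: if the columns of W were normalized to unit sum instead, the sum of H would equal the sum of WH, and the L1 penalty would only rescale the reconstruction rather than favor sparse activations.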

It can be seen from Figure 13 that the compound fault signals were not separated effectively by the traditional SNMF algorithm based on KL-divergence. The fault feature of the outer race was almost extracted, but the fault feature of the rolling element was submerged. In addition, the cage characteristic frequency (6 Hz) could only be obtained in Figure 13b, which fails to describe the fault source signal accurately. By contrast, the proposed method can extract and separate the fault features of the rollers and the outer race effectively. Comparing Figures 12 and 13 shows that, because the improved SNMF algorithm enhances the local features and sparsity of the fault components and reduces the redundant information in the reconstructed signal, the source signals can be separated and their features extracted. This demonstrates the advantages of the proposed method for compound fault diagnosis of bearings.
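The envelope spectra compared above can be computed along the following lines. This is a minimal sketch, assuming the common Hilbert-transform form of envelope analysis (magnitude of the analytic signal followed by an FFT); this section does not state which envelope detector the authors used, and the helper names `envelope_spectrum` and `dominant_frequency` are hypothetical.

```python
import numpy as np
from scipy.signal import hilbert


def envelope_spectrum(x, fs):
    """Return (freqs, amplitude) of the envelope spectrum of a real signal."""
    envelope = np.abs(hilbert(x))          # magnitude of the analytic signal
    envelope = envelope - envelope.mean()  # remove the DC term
    spectrum = np.abs(np.fft.rfft(envelope)) * 2.0 / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs, spectrum


def dominant_frequency(freqs, spectrum, f_max):
    """Frequency (Hz) of the largest envelope-spectrum peak up to f_max."""
    band = freqs <= f_max
    return float(freqs[band][np.argmax(spectrum[band])])
```

Applied to each separated signal, the peaks of `spectrum` would be compared against the theoretical fault characteristic frequencies and their harmonics, and sidebands spaced at the cage frequency would be read off the same curve.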

Conclusions
The features of compound fault signals in rotating machinery are weak, and the source signals of the compound fault are difficult to separate from the mixture signal. To address these problems, a single-channel blind source separation method based on the improved SNMF was proposed. Square wave sequences carrying the feature information were constructed as the constraint reference vector and introduced into the objective function of the traditional SNMF algorithm to reduce the redundancy of the decomposed data. The STFT was applied to obtain a high-dimensional feature matrix, and the improved SNMF was then adopted to select local features from the time-frequency distribution, which lowers the dimension of the problem. Meanwhile, owing to the local learning ability of the improved SNMF algorithm, the compound fault features can be separated effectively, and the redundant components after decomposition are reduced on the basis of effective data reduction. The simulation and test results show that the proposed method can effectively separate the compound fault features of roller bearings, and that its feature extraction and separation capability is superior to that of the traditional SNMF. Therefore, the proposed method is of significance for compound fault diagnosis of rotating machinery and has engineering application value. Regarding the influence of initialization on the improved SNMF, this paper used only random initialization to test the performance of the algorithm; optimized initialization will be studied in future work.