Weak Signal Detection Based on Combination of Sparse Representation and Singular Value Decomposition

Abstract: Due to inevitable acquisition-system noise and strong background noise, it is often difficult to detect the features of weak signals. Sparse representation can effectively extract useful information according to the sparse characteristics of signals; however, it is less effective against non-Gaussian noise. Therefore, a novel method named SRSVD, combining sparse representation and singular value decomposition, is proposed to further improve denoising performance. All signal components highly matched with the dictionary are extracted by sparse representation, and each component of the singular value decomposition is then weighted by an evaluation index, the periodic modulation intensity (PMI), which indicates the components carrying useful information, so that the denoising performance of the algorithm is greatly improved. The performance of the proposed method is verified on weak signals in a circuit and on early bearing fault signals. The results show that SRSVD can successfully suppress noise interference and, compared with other existing methods, has better denoising performance.


Introduction
Extracting useful information from noise has always been a topic of extensive research in various fields. For example, in image processing, image denoising and super-resolution analysis can enrich the information in images [1]; in condition monitoring, fault signal extraction can reveal the existence of early faults [2,3]; in electronics, extracting small changes in current can identify various kinds of weak measured quantities [4]. Therefore, it is crucial to study effective signal denoising methods.
The evaluation criteria of a "weak signal" are not fixed, and there is no clear, widely agreed definition or interpretation of the term. As weak signal detection is used in many fields, the detection objects in each field are completely different, the detection methods vary greatly, and the degree of signal weakness differs between detection objects, so it is difficult to define "weak". To distinguish it from weak signals in other fields, it should therefore be emphasized that this paper studies weak sensor electrical signals, where weakness means that the energy of the electrical signal is relatively small compared with other electrical noise. Alternatively, weak signals can be interpreted from the perspective of signal detection: weak signals are signals that are difficult to detect with conventional signal analysis methods.
The commonly used signal processing methods include the wavelet transform (WT), short-time Fourier transform (STFT) [5], empirical mode decomposition (EMD) [6], variational mode decomposition (VMD) [7], the fast kurtogram [8], and singular value decomposition (SVD) [9], etc. Sparsity is a universal property of signals [10]. In recent years, scholars have carried out extensive discussions and studied various methods that exploit the sparsity prior. The most representative method for extracting the sparse features of signals is the sparse representation (SR) algorithm [11]. The core idea of SR is that any signal can be represented as a superposition of a few optimal basis functions from a basic waveform library that best match the main structure of the signal. The set of functions in the basic waveform library is called a dictionary, and dictionaries can be constructed from prior knowledge of the signal or from nonparametric signal examples. Since the dictionary matches the major features of the original signal, the coefficients of the signal are sparse under decomposition over the dictionary. The theoretical system of sparse representation was first established with the matching pursuit algorithm proposed by Mallat in 1993. In 1998, Donoho proposed the sparse decomposition theory, which in essence relaxed the l0 regularization term in the matching pursuit model to l1 regularization; this marked the formal establishment of the two core families of sparse representation algorithms, and subsequent algorithms have been improvements or extensions on this basis. Greedy pursuit algorithms based on l0 regularization mainly include orthogonal matching pursuit (OMP) [12] in 1993, the tree-based pursuit algorithm proposed by Jost et al. [13] in 2006, and stagewise orthogonal matching pursuit (StOMP) proposed by Donoho [14] in 2012.
Algorithms based on l1 or lp regularization mainly include the least absolute shrinkage and selection operator (LASSO) [15] in 1996, the focal underdetermined system solver (FOCUSS) [16] in 1997, and gradient projection-based sparse representation (GPSR) [17] in 2008. In the same year, Huggins [18] proposed greedy basis pursuit, and Bobin [19] proposed morphological component analysis (MCA), also belonging to this category. In 2004, Daubechies et al. [20] proposed a new sparse atomic decomposition algorithm, the iterative shrinkage-thresholding algorithm (ISTA), which has gradually become one of the most widely used convex optimization algorithms.
Singular value decomposition is a widely used signal denoising method, but it encounters several problems in practical application [9]: the cutoff rule in the difference spectrum is difficult to determine, it is easily affected by the noise energy, and it cannot by itself identify useful signal components. Therefore, a novel signal denoising method called SRSVD is proposed to solve these problems. Using the powerful ability of SR to weaken Gaussian noise, we filter once, find all components that best match the dictionary, and reconstruct the signal to be measured. Then an improved SVD algorithm removes the irrelevant components to achieve secondary denoising. Previously, most methods combining SR and SVD were used for pattern recognition: SVD was generally used for dimensionality reduction, and SR was then used for classification [21,22]. In our method, SR is first used to remove Gaussian noise, which requires reconstructing the original signal; in the second step, improved SVD is used for secondary filtering. The purpose, sequence, and flow of our method are therefore fundamentally different from previous methods.
The major contributions of this paper are as follows: (1) SR and SVD are combined for weak signal detection for the first time, achieving good results. (2) According to the characteristics of SR and SVD, and in combination with the practical application, different dictionaries are adopted to achieve SRSVD denoising. (3) The PMI index is used for signal reconstruction in SRSVD, enabling SRSVD to retain the useful information in the signal.
This paper is organized as follows. In Section 2, SR and SVD are briefly described. Section 3 details the proposed method SRSVD. In Section 4, the SRSVD method will be analyzed using experimental test data. Finally, conclusions are drawn in Section 5.

Sparse Representation
For a signal, if the coefficient vector obtained by decomposition over a suitable dictionary has only a few non-zero elements while all the others are zero, the signal can be sparsely represented, and the coefficient vector obtained by the decomposition is called the sparse representation vector of the original signal. Sparse representation is the most representative method in linear representation and has been proved to be a very powerful tool. It shows great potential in signal and image denoising and has been widely used in the fields of machine learning [23] and computer vision [24].
Assuming a given measurement signal y ∈ R^M, its sparse representation model [2] is defined as

y = Dx + ε, (1)

where D ∈ R^(M×K) is the dictionary matrix, M is the number of rows and K the number of columns of D; d_k is a column vector of the dictionary, called an atom or basis function; x = [x_1, x_2, . . . , x_K]^T is the sparse representation vector of the input signal y, which can be regarded as the weights of the basis functions d_k; and ε is noise. In the sparse representation model, the number of columns of dictionary D is usually much greater than the number of rows (K >> M); that is, the dictionary is over-complete or redundant, so x can have infinitely many solutions. Adding a sparsity constraint to the model resolves the degeneracy caused by over-completeness. Therefore, the objective function of sparse representation can be expressed as

min_x ||y − Dx||_2^2 + λ||x||_0, (2)

where ||·||_0 is the l0 pseudo-norm, whose value is the number of non-zero entries of the vector. ||y − Dx||_2^2 is the data fidelity term, which ensures that the signal reconstructed as the product of the dictionary and the coefficient vector is as consistent as possible with the original signal; λ||x||_0 is the regularization term, generally designed with prior knowledge to constrain the parameters of the model.
In the sparse representation model, the sparsity constraint acts on the decomposed coefficient vector. The methods for solving Equation (2) fall mainly into two categories. The first is greedy algorithms, such as matching pursuit (MP), orthogonal matching pursuit (OMP), and stagewise orthogonal matching pursuit (StOMP). Since the l0 norm is non-convex, this class of algorithms essentially relies on combinatorial search to solve the sparse representation problem, which is NP-hard; as the dimension increases, the complexity of the solution grows rapidly, so a large computational cost is required in practice.
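As a concrete illustration of the greedy strategy, the following is a minimal NumPy sketch of OMP on a toy problem; the dictionary size and the 2-sparse test signal are illustrative assumptions, not values from the paper.

```python
import numpy as np

def omp(D, y, sparsity):
    """Orthogonal matching pursuit: greedily pick the atom most correlated
    with the residual, then re-fit all selected atoms by least squares."""
    residual = y.copy()
    support = []
    x = np.zeros(D.shape[1])
    for _ in range(sparsity):
        # Atom whose correlation with the current residual is largest.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k not in support:
            support.append(k)
        # Least-squares fit of y on the selected atoms (orthogonalization step).
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        x[:] = 0.0
        x[support] = coef
        residual = y - D @ x
    return x

# Toy example: a 2-sparse signal under a random unit-norm dictionary.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)          # unit-norm atoms
x_true = np.zeros(128)
x_true[[10, 40]] = [1.5, -2.0]
y = D @ x_true
x_hat = omp(D, y, sparsity=2)
```

With a low-coherence random dictionary and only two active atoms, OMP recovers the exact support in two iterations.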
Another strategy is to relax the non-convex l0 norm into an l1 norm, yielding a convex optimization problem. Candes and Tao proved that under the restricted isometry property (RIP), the solution of Equation (2) with the l0 norm coincides with that of its l1 counterpart, and the relaxed problem is easily solved by gradient-based algorithms. The l1-regularized problem is also called the Lasso regression problem, and its objective function is

min_x ||y − Dx||_2^2 + λ||x||_1, (3)

where ||x||_1 = Σ_n |x_n| is the l1 norm. Typical methods for solving Equation (3) include the iterative shrinkage-thresholding algorithm (ISTA), gradient projection for sparse reconstruction (GPSR), and interior point methods. Note that, compared with Equation (2), Equation (3) only replaces the l0 norm in the regularization term with the l1 norm. It can be converted to the following constrained form:

min_x ||y − Dx||_2^2  s.t.  ||x||_1 ≤ λ. (4)

It is assumed that x is two-dimensional to facilitate visual analysis, as shown in Figure 1a.
Contour lines of Equation (4) can be drawn on the (x1, x2) plane, and the constraint is a region of size λ on the plane (within the square contour). The point where the contour lines first intersect this region is the optimal solution. The region constrained by the l1 norm has "corners" on the coordinate axes, and in most cases the contour lines of the objective function first touch the region at a corner, which produces sparsity, unless the contours happen to be positioned very particularly. In contrast, the l2 norm does not have this property, as shown in Figure 1b: because its constraint region is a circle, the probability of the first intersection occurring at a sparse position is very small. Therefore, converting the l0 norm to the l1 norm can still produce sparsity, while the l2 norm produces smoothness but not sparsity.
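The sparsity-inducing effect of the l1 penalty can also be seen from its proximal operator, soft thresholding, which sets small coefficients exactly to zero, whereas the corresponding l2 (ridge) operator only shrinks them. A minimal sketch (the input vector and λ are illustrative assumptions):

```python
import numpy as np

def prox_l1(v, lam):
    # Soft thresholding: exact minimizer of 0.5*(x - v)^2 + lam*|x|.
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def prox_l2(v, lam):
    # Minimizer of 0.5*(x - v)^2 + 0.5*lam*x^2: pure shrinkage, never exactly zero.
    return v / (1.0 + lam)

v = np.array([3.0, 0.4, -0.2, -5.0, 0.05])
x1 = prox_l1(v, lam=0.5)   # small entries become exactly zero
x2 = prox_l2(v, lam=0.5)   # all entries shrink but stay nonzero
```

Here `x1` keeps only the two large entries, while `x2` remains fully dense, mirroring the corner argument of Figure 1.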


Singular Value Decomposition
For a discrete sampled signal X ∈ R^N, its Hankel matrix is constructed as

H = [ x(1)   x(2)    ···  x(n)
      x(2)   x(3)    ···  x(n+1)
      ···    ···     ···  ···
      x(m)   x(m+1)  ···  x(N) ], (5)

where m = N − n + 1. To obtain the best denoising effect, when the signal length N is even, the number of columns should be n = N/2 and the number of rows m = N/2 + 1; when N is odd, n = (N + 1)/2 and m = (N + 1)/2 should be selected.
The singular value decomposition of H is

H = USV^T, (6)

where the diagonal elements of S are the singular values of H, denoted σ1, σ2, . . . , σn. Typically the first k singular values are large, while the later ones are very close to 0; the difference spectrum describes this change in the singular value sequence. By keeping the superposition of the first k components and setting the singular values after k to zero, the noise-reduced signal is obtained.
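The truncation procedure above can be sketched as follows; this is a minimal NumPy illustration on a synthetic noisy sine, where the rank k = 2 and the noise level are assumed values, and simple anti-diagonal averaging is used to map the rank-k matrix back to a signal.

```python
import numpy as np

def hankel_svd_denoise(x, k):
    """Truncated-SVD denoising via the Hankel matrix: keep the k largest
    singular values, then average the anti-diagonals of the rank-k matrix."""
    N = len(x)
    n = N // 2 if N % 2 == 0 else (N + 1) // 2          # columns, as in the text
    m = N - n + 1                                       # rows
    H = np.lib.stride_tricks.sliding_window_view(x, n)  # m x n Hankel matrix
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    s[k:] = 0.0                                         # zero the trailing singular values
    Hk = (U * s) @ Vt
    # Anti-diagonal (Hankel) averaging back to a length-N signal.
    out = np.zeros(N)
    count = np.zeros(N)
    for i in range(m):
        out[i:i + n] += Hk[i]
        count[i:i + n] += 1
    return out / count

t = np.linspace(0, 1, 400, endpoint=False)
clean = np.sin(2 * np.pi * 5 * t)
noisy = clean + 0.5 * np.random.default_rng(1).standard_normal(400)
den = hankel_svd_denoise(noisy, k=2)
```

A single sinusoid gives a Hankel matrix of rank 2, so keeping the first two components removes most of the noise.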

Proposed SRSVD Method
The basic flowchart of the proposed method is shown in Figure 2, and the steps are explained in detail below.
From the perspective of probabilistic interpretation, the fidelity term of the SR model is given by the maximum likelihood estimator of y|x, under the assumption that the noise follows a white Gaussian distribution. Therefore, the SR model has a strong ability to remove white Gaussian noise. A lot of noise in nature is caused by the irregular movement of particles; in circuits in particular, the irregular motion of electrons, such as thermal motion, produces strong white Gaussian noise. Therefore, the first purpose of using SR is to remove Gaussian white noise.
Next, the reconstructed signal is decomposed by SVD, and the decomposed components are weighted by an appropriate evaluation index to extract the informative parts of the signal. The specific evaluation index and weight selection are given in Section 4 in combination with the actual situation. The SRSVD noise reduction algorithm can therefore be summarized as follows:

Step 1. Construct a dictionary matrix D by prior knowledge or by dictionary learning. For dictionary learning, the objective function is

min_{D,A} ||Y − DA||_F^2  s.t.  ||α_i||_0 ≤ T, i = 1, . . . , m, (7)

where Y = (y_i)_{i=1}^m ∈ R^(m×n) is the sample matrix, which may consist of nonparametric signal observations; A = (α_i)_{i=1}^m is the sparse coefficient matrix; T is the sparsity; and ||·||_F is the Frobenius norm. A dictionary learning algorithm can construct an appropriate dictionary without prior knowledge of the samples, so it has a wider range of application than an analytic dictionary.

Step 2. Solve the sparse coefficients by ISTA [20]. The objective function is shown in Equation (3), and the ISTA procedure is shown in Algorithm 1: after initializing x^(0), the step size µ < ||D||_2^(−2), and the iteration number Iter, the gradient and shrinkage steps are repeated for i = 0 to Iter.

Step 3. Reconstruct the signal as the product of the dictionary and the solved sparse coefficients.

Step 4. Construct the Hankel matrix H of the reconstructed signal; the construction method is shown in Equation (5).

Step 5. Perform singular value decomposition on H to obtain the signal components.

Step 6. Weight each component according to the evaluation index. In this paper, we choose the periodic modulation intensity (PMI) [25] as the component evaluation index, which serves as an important measure of the information in the signal. The weights assigned to the components by PMI are denoted w_1, w_2, . . . , w_n, and the denoised signal is finally obtained by the weighted superposition of the components.
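The ISTA solver of Step 2 can be sketched as follows; the dictionary size, sparsity pattern, and λ are illustrative assumptions, with the step size chosen from the spectral norm of D as required by Algorithm 1.

```python
import numpy as np

def ista(D, y, lam, n_iter=500):
    """Iterative shrinkage-thresholding for min_x 0.5||y - Dx||^2 + lam*||x||_1:
    a gradient step on the fidelity term followed by soft thresholding."""
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the gradient
    mu = 0.99 / L                        # step size mu < ||D||_2^{-2}
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = x - mu * (D.T @ (D @ x - y))                        # gradient step
        x = np.sign(g) * np.maximum(np.abs(g) - mu * lam, 0.0)  # shrinkage
    return x

rng = np.random.default_rng(2)
D = rng.standard_normal((100, 256))
D /= np.linalg.norm(D, axis=0)           # unit-norm atoms
x_true = np.zeros(256)
x_true[[5, 50, 120]] = [2.0, -1.5, 1.0]
y = D @ x_true + 0.01 * rng.standard_normal(100)
x_hat = ista(D, y, lam=0.05)
```

After a few hundred iterations the three active coefficients dominate the solution, with a small l1-induced bias toward zero.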


Case Study I
In this section, the feasibility of the SRSVD algorithm is tested by building a signal acquisition system. The test system is shown in Figure 3 and consists of five parts: the signal generator generates useful signals for testing; the amplifier analog circuit board attenuates the useful signal output of the signal generator with a large capacitance and outputs the amplification result of the amplifier chip; the signal acquisition instrument (DN8509N) collects the output signal of the amplifier chip; the shielding box blocks external interference; and the PC records the chip output signal. We set the sampling frequency to 128 kHz and the frequency of the signal to be measured to 123 Hz, and randomly intercept a segment of 6400 sampling points for analysis. Figure 4 shows the signal waveform and its frequency spectrum. As can be seen from Figure 4b, the signal generated by the signal generator cannot be effectively identified due to the adverse effects of noise in the circuit. We used SRSVD to denoise this signal and tried to extract the signal from the signal generator. First, we used the dictionary learning method to train a dictionary.
It should be noted that although the data in this experiment are generated by the signal generator, in practice we often do not know the types and features of the collected signals. Therefore, the prior of the signal generator was ignored in this experiment. The K-SVD algorithm was used for training, and the procedure is shown in Algorithm 2. K-SVD is a dictionary learning algorithm proposed by Aharon et al. [1] in 2006 that has gradually become the most widely used dictionary learning algorithm; it updates the dictionary columns one by one, reducing the computational complexity of the dictionary update.
The initial dictionary is set to a Gaussian random dictionary, n = 6400, m = 10, and the number of samples is set to 50. Some of the dictionary atoms obtained by K-SVD are shown in Figure 5. It can be seen that the learned atoms have a certain harmonic character, whose internal structure matches the signal generated by the signal generator to some degree. The original signal is denoised by SR with the learned dictionary, and the result is shown in Figure 6 ((a) the denoised signal waveform by SR and (b) its frequency spectrum). SVD is then used to perform secondary filtering on the denoised signal. After filtering by PMI, the final result is shown in Figure 7. It can be seen that after SRSVD denoising, the dynamic range of the frequencies of the irrelevant components is smaller, and the 123 Hz frequency is more obvious.
The dictionary update in Algorithm 2 proceeds atom by atom: after normalizing the columns of the initial dictionary D^(0), for each atom d_j0 the set Ω_j0 of samples that use atom d_j0 is defined; only the columns of the error matrix E_j0 corresponding to Ω_j0 are selected, giving E^R_j0; the singular value decomposition E^R_j0 = U∆V^T is computed; and the dictionary atom is updated as d_j0 = u_1 together with its sparse coefficients. When the change in the sparse coefficients is small enough, the iteration is stopped; otherwise the loop continues, and finally the learned dictionary D^(k) is returned.
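The atom-update step of Algorithm 2 can be sketched as follows; this is a toy NumPy illustration on a small exact factorization, with all matrix sizes and coefficient values being assumptions for the demonstration.

```python
import numpy as np

def ksvd_atom_update(D, A, Y, j):
    """One K-SVD dictionary-update step for atom j (cf. Algorithm 2):
    restrict the residual that ignores atom j to the samples using it,
    then replace the atom and its coefficients by the rank-1 SVD
    approximation of that restricted residual."""
    omega = np.nonzero(A[j, :])[0]               # samples whose code uses atom j
    if omega.size == 0:
        return D, A
    E = Y - D @ A + np.outer(D[:, j], A[j, :])   # error with atom j's contribution removed
    U, s, Vt = np.linalg.svd(E[:, omega], full_matrices=False)
    D[:, j] = U[:, 0]                            # new atom: leading left singular vector
    A[j, omega] = s[0] * Vt[0, :]                # matching sparse coefficients
    return D, A

# Toy check: corrupt one atom of an exact factorization, then repair it.
rng = np.random.default_rng(6)
D = rng.standard_normal((8, 4))
D /= np.linalg.norm(D, axis=0)
A = np.zeros((4, 20))
A[1, :6] = [1.0, -2.0, 0.5, 3.0, -1.0, 2.0]
A[0, 3:8] = 1.0
Y = D @ A
D[:, 1] = rng.standard_normal(8)
D[:, 1] /= np.linalg.norm(D[:, 1])
D, A = ksvd_atom_update(D, A, Y, 1)
```

Because the other atoms are untouched, the restricted residual is exactly rank one, and the update restores an exact (sign-ambiguous) factorization.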

Case Study II
The signal processing method is also widely used in the field of fault signal extraction, so we take an actual bearing fault signal as an example to verify our method. We use an open-access run-to-failure bearing data set as the signal to be processed. As shown in Figure 8, the test bench [26] consists of an AC induction motor, a motor speed controller, a support shaft, and a hydraulic loading system. Two accelerometers are mounted vertically and horizontally on the housing of the test bearing. The sampling frequency is 25.6 kHz, and data are collected for 1.28 s every 1 min. We select the outer-ring fault as the fault to be detected. As the speed and bearing size are known, the fault frequency can be estimated to be about 109 Hz from the physical model of the faulty bearing.
It should be noted that, different from Case Study I, the bearing fault signal model can often be used as prior knowledge in fault signal extraction. The fault signal of a rolling bearing can be expressed as the sum of a series of impulses:

x(t) = Σ_{i=1}^{M} A_i s(t − iT − τ_i) + n(t), (8)

where s(t) = A sin(ω_d t) e^(−ξω_d t), A_i is the magnitude of the ith impulse, T is the period of the impulses, τ_i is the rolling-element slip time of the ith impulse, which is usually a small random value, M is the number of impulses, and n(t) is random noise. According to this model, we adopt an analytic dictionary composed of the tunable Q-factor wavelet transform (TQWT). Because TQWT matches the characteristics of faulty bearings well, it is widely used in the field of fault diagnosis.
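The impulse model of Equation (8) can be sketched as follows; all parameter values except the 25.6 kHz sampling rate and the roughly 109 Hz fault frequency are illustrative assumptions.

```python
import numpy as np

fs = 25_600                                  # sampling rate used in the case study
f_fault, f_d, xi = 109.0, 3000.0, 0.05       # fault rate, resonance freq, damping (assumed)
t = np.arange(int(fs * 0.1)) / fs            # 0.1 s of signal
T = 1.0 / f_fault                            # impulse period

def impulse(t):
    # s(t) = sin(w_d * t) * exp(-xi * w_d * t) for t >= 0, zero otherwise.
    w_d = 2 * np.pi * f_d
    return np.where(t >= 0, np.sin(w_d * t) * np.exp(-xi * w_d * t), 0.0)

rng = np.random.default_rng(4)
x = np.zeros_like(t)
for i in range(int(0.1 / T) + 1):
    tau = rng.normal(0.0, 1e-4)              # small random slip of the i-th impulse
    x += 1.0 * impulse(t - i * T - tau)      # A_i = 1 for all impulses here
x += 0.2 * rng.standard_normal(t.size)       # additive random noise n(t)
```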
The severity of a bearing fault can be roughly judged by the value of the root mean square (RMS), and the development stage of the fault can be divided according to this value, as shown in Figure 9.
Figure 9c,d show the signal waveforms in the early and medium stages of the fault, respectively. The dynamic range of the signal in the medium stage is larger and presents obvious periodic impulse characteristics, while fault characteristics are difficult to find in the early-stage waveform. We use SRSVD to analyze the early-stage signal and try to extract the fault characteristics. This work is meaningful because if early fault characteristics can be detected during the operation of rotating equipment, economic losses and loss of life can be greatly reduced. We chose the 60th minute signal for analysis, with 32,768 data points. The parameters of TQWT are as follows: Q = 2, r = 5, J = 10.
In the SR process, the ISTA algorithm is again used to solve the sparse coefficients, and the evaluation index for SVD is again PMI. The denoised signal obtained by SRSVD is shown in Figure 10g, and the fault signal is demodulated through Hilbert envelope spectrum analysis, as illustrated in Figure 10h. As can be seen from Figure 10h, the fault characteristic frequency (109 Hz) and its 2nd, 3rd, and 4th harmonics can be clearly found. We compared SRSVD with three other widely used denoising algorithms: L1-Kurtosis [27], RSVD [25], and variational mode decomposition (VMD) [7]. L1-Kurtosis is a kurtosis-based weighted sparse representation denoising model. RSVD is the secondary filtering algorithm used in SRSVD, i.e., the PMI evaluation index is embedded in SVD so that the reconstruction process can select components according to their information content. VMD is an adaptive, fully non-recursive method for modal variation and signal processing.
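The Hilbert envelope spectrum used for demodulation can be sketched as follows; the FFT construction of the analytic signal is the standard one, and the amplitude-modulated test signal is an illustrative assumption.

```python
import numpy as np

def envelope_spectrum(x, fs):
    """Hilbert envelope spectrum: magnitude of the analytic signal's envelope,
    Fourier-transformed to expose modulation (fault) frequencies."""
    n = len(x)
    # Analytic signal via the standard FFT construction.
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(X * h)
    env = np.abs(analytic)
    env = env - env.mean()                   # remove the DC component of the envelope
    spec = np.abs(np.fft.rfft(env)) / n
    freqs = np.fft.rfftfreq(n, d=1 / fs)
    return freqs, spec

# Test signal: a 3 kHz carrier amplitude-modulated at 109 Hz (assumed values).
fs = 25_600
t = np.arange(fs) / fs                       # 1 s of signal
x = (1 + 0.8 * np.cos(2 * np.pi * 109 * t)) * np.sin(2 * np.pi * 3000 * t)
freqs, spec = envelope_spectrum(x, fs)
k = np.argmax(spec[1:]) + 1                  # dominant bin, skipping DC
```

The envelope spectrum peaks at the 109 Hz modulation frequency even though the raw spectrum is concentrated around the 3 kHz carrier.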
Here, the number of decomposition levels was selected as six, and the first two components with the largest kurtosis were selected for reconstruction. The denoising results of these three methods are shown in Figure 10a-f. There was no obvious fault feature in their envelope spectra; only in the envelope spectrum of the VMD-denoised signal could the fault frequency (109 Hz) and its third and fifth harmonics (327 Hz, 545 Hz) be found clearly, but the noise reduction effect was still not as obvious as that of SRSVD. In addition, kurtosis is used to evaluate the denoised signals; a larger kurtosis indicates more fault information. Kurtosis is an important index for evaluating the amount of fault information in a signal and is widely used in fault diagnosis. Its calculation formula is

K = [(1/n) Σ_{i=1}^{n} (x_i − µ)^4] / [(1/n) Σ_{i=1}^{n} (x_i − µ)^2]^2, (9)

where n is the number of data points and µ is the mean value. After calculation, the kurtosis of the original signal was 4.673, and the kurtosis of the signals denoised by SRSVD, L1-Kurtosis, RSVD, and VMD was 5.486, 4.759, 5.146, and 5.464, respectively. Therefore, after SRSVD noise reduction, the fault features in the signal were the most obvious, and the fault information was the most abundant.
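The kurtosis of Equation (9) can be computed directly; the impulsive test signal below is an illustrative assumption showing why impulses raise the value (a Gaussian signal gives a value of about 3).

```python
import numpy as np

def kurtosis(x):
    """Kurtosis as the fourth central moment over the squared variance:
    about 3 for Gaussian data, larger for impulsive fault signals."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    return np.mean((x - mu) ** 4) / np.mean((x - mu) ** 2) ** 2

rng = np.random.default_rng(5)
gauss = rng.standard_normal(100_000)
impulsive = gauss.copy()
impulsive[::1000] += 10.0        # sparse impulses sharply raise the kurtosis
```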
In addition, the same analysis and comparison experiments were performed for the signals at the 50th minute and the 100th minute. These two signals can be regarded as the noisier case and the less noisy case, respectively. Firstly, the 100th-minute signal was processed. The envelope spectra of the signals denoised by the four methods are shown in Figure 11. It can be seen that in the middle fault stage, all four methods could find the fault frequency and its higher harmonics, among which VMD and SRSVD performed particularly well. Their corresponding kurtosis values were K_{L1-Kurtosis} = 5.663, K_{RSVD} = 5.176, K_{VMD} = 5.862, and K_{SRSVD} = 5.937, respectively. Secondly, the signal at the 50th minute was tested, and the resulting envelope spectra are compared in Figure 12. It can be seen that at the 50th minute, the performance of all four methods was not ideal; only SRSVD could find the third harmonic of the fault frequency. At this time, their corresponding kurtosis values were K_{L1-Kurtosis} = 3.765, K_{RSVD} = 3.567, K_{VMD} = 3.649, and K_{SRSVD} = 3.956, respectively.

Conclusions
This paper presents a signal processing method for weak signal detection. By combining sparse representation with singular value decomposition, Gaussian white noise is removed by sparse representation as a primary filter, and the useful information in the signal is extracted by weighted singular value decomposition as a secondary filter, thereby achieving denoising. This multi-layer filtering design mainly takes into account that sparse representation is suited to removing Gaussian noise, while the improved SVD method can select the useful components of the signal using an appropriate indicator. After the two layers of filtering, the useful components of the signal can be extracted. Two experiments verify the effectiveness of the proposed method, and the experimental results show that SRSVD has advantages in fault signal extraction compared with the L1-Kurtosis method, RSVD, and VMD. Future research can focus on indexes of useful information to extract weak signals more effectively.
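The secondary filtering stage summarized above (SVD of the signal's trajectory matrix with per-component weighting before reconstruction) can be sketched as follows. The PMI index itself is not defined in this excerpt, so the default weighting here is a hypothetical stand-in (hard-keeping singular values above their mean); `weight_fn`, the embedding depth `rows`, and the test signal are all illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def hankel(x, rows):
    """Embed a 1-D signal into a Hankel (trajectory) matrix."""
    cols = len(x) - rows + 1
    return np.array([x[i:i + cols] for i in range(rows)])

def dehankel(H):
    """Invert the embedding by averaging along anti-diagonals."""
    rows, cols = H.shape
    out = np.zeros(rows + cols - 1)
    cnt = np.zeros(rows + cols - 1)
    for i in range(rows):
        out[i:i + cols] += H[i]
        cnt[i:i + cols] += 1
    return out / cnt

def weighted_svd_filter(x, rows=50, weight_fn=None):
    """Weighted-SVD secondary filtering sketch: decompose the trajectory
    matrix, reweight each rank-one component, then reconstruct."""
    H = hankel(x, rows)
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    if weight_fn is None:
        # placeholder for the PMI weighting: keep dominant components
        w = (s > s.mean()).astype(float)
    else:
        w = np.array([weight_fn(U[:, k], s[k], Vt[k]) for k in range(len(s))])
    return dehankel((U * (w * s)) @ Vt)

# A 50 Hz tone buried in Gaussian noise is largely recovered.
fs = 1000
t = np.arange(0, 1.0, 1.0 / fs)
clean = np.sin(2 * np.pi * 50 * t)
rng = np.random.default_rng(2)
noisy = clean + 0.5 * rng.standard_normal(len(t))

denoised = weighted_svd_filter(noisy, rows=50)
err_before = np.mean((noisy - clean) ** 2)
err_after = np.mean((denoised - clean) ** 2)
```

Replacing the placeholder threshold with an information-content index such as PMI is precisely what turns plain SVD truncation into the component-selective reconstruction described in the paper.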
