Sparse ECG Denoising with Generalized Minimax Concave Penalty

The electrocardiogram (ECG) is an important diagnostic tool for cardiovascular diseases. However, ECG signals are susceptible to noise, which may degenerate waveform and cause misdiagnosis. In this paper, the ECG noise reduction techniques based on sparse recovery are investigated. A novel sparse ECG denoising framework combining low-pass filtering and sparsity recovery is proposed. Two sparsity recovery algorithms are developed based on the traditional ℓ1-norm penalty and the novel generalized minimax concave (GMC) penalty, respectively. Compared with the ℓ1-norm penalty, the non-differentiable non-convex GMC penalty has the potential to strongly promote sparsity while maintaining the convexity of the cost function. Moreover, the GMC punishes large values less severely than ℓ1-norm, which is utilized to overcome the drawback of underestimating the high-amplitude components for the ℓ1-norm penalty. The proposed methods are evaluated on ECG signals from the MIT-BIH Arrhythmia database. The results show that underestimating problem is overcome by the proposed GMC-based method. The GMC-based method shows significant improvement with respect to the average of output signal-to-noise ratio improvement (SNRimp), the average of root mean square error (RMSE) and the percent root mean square difference (PRD) over almost any given SNR compared with the classical methods, thus providing promising approaches for ECG denoising.


Introduction
As the recording of the bioelectric potentials produced by rhythmical cardiac activities, the Electrocardiogram (ECG) signals provides plentiful and valuable information on the heart functional conditions, and it is thus has been widely used not only for the clinical diagnosis of cardiovascular diseases in hospital, but also for patient monitoring at home. However, the ECG signals are essentially weak and non-stationary, and so susceptible to various noises. The noise may blur or even mask the key characteristics in the ECG waveform, which will further degenerate the reliability and efficiency of the clinical diagnosis. Therefore, it is important and valuable to denoise the ECG, i.e., extract high resolution cardiac signals from noisy ECG observations. ECG noise elimination is a very challenging task due to the spectral overlap of signals and noise, and a wide range of ECG denoising techniques have been continuously and extensively studied in the last few decades. These methods can be typically classified as the linear filtering, empirical mode decomposition (EMD) and wavelet transform, etc. They are the most well-developed approaches and have their own advantages and drawbacks.
The linear filtering methods [1][2][3][4] have low computational complexity and fast calculation speed and thus have been widely used in embedded or portable monitoring instruments. However, the classical filters failed to reduce the noise components that overlap with signal spectrum. Moreover, they will cause significant damage to the details of ECG signals which contain sharp edges and impulses of short duration. Therefore, they are often used in combination with other noise reduction algorithms.
To overcome these difficulties, nonlinear methods have been investigated and Wavelet-based methods are the most famous and effective [5][6][7][8][9][10][11][12][13][14]. These approaches rely on the idea of thresholding under the assumption that signal magnitudes will concentrate in the Wavelet representation while the noise will spread, such that the noise can be illuminated by cutting the Wavelets coefficients less than a pre-determined threshold. The performance of the Wavelet denoising methods depends on the matching degree of the selected Wavelet basis functions and the ECG signals. The basis functions will be fixed once the algorithms have been developed and may not necessarily match the practical ECG signals.
Due to the ability of dealing with non-stationary and nonlinear signals, the EMD-based ECG denoising methods gained extensive attention recently [15][16][17][18][19][20][21]. The EMD method was first introduced in [22] for analyzing data from nonstationary and nonlinear processes. The EMD decomposes a given signal into a series of intrinsic mode functions (IMFs) through an iterative process called sifting. The IMFs play the roles of basis functions, and the EMD method can be seen as a type of Wavelet decomposition with adaptive basis functions. Since the basis functions of the EMD are derived from the signal itself, it has the potential to outperform the traditional Wavelet methods. However, the EMD has a limitation of mode mixing, which causes the mixing of higher order components with the lower order components. Because of the mode mixing problem, it is difficult to separate the signal and noise components clearly even in higher order IMFs. To overcome this problem, paper [21] proposed a grey spectral noise cancellation (GSNC) scheme for ECG signals where two-stage discrimination is employed with the empirical mode decomposition (EMD), the ensemble empirical mode decomposition (EEMD) and the grey spectral noise estimation (GSNE). This method effectively alleviates the mode mixing problem. In the case of denoising, removing some of the IMFs will inevitably corrupt the useful information of the signal [23]. For example, in ECG denoising, the EMD may cause damage to P-waves and T-waves, which may result in misdiagnosis [24].
Either the Wavelet or the EMD methods can be classified as the transform-domain processing approaches under the assumption that the true signal can be well approximated by a linear combination of some basis functions [25,26]. The essence underlying these methods is that the signal can be sparsely represented in the transform domain. Since the ECG signals consist of periodical pulse components, they can also be well represented by a combination of low pass components and sparse components in time domain. Consequently, ECG denoising methods based on the time-domain sparse representations have been an active research area in recent years. Paper [27] modeled the ECG signals as the sum of two second and third order sparse derivative signals to reduce noise. In [28], an sparse ECG signals enhancement and QRS waves detection algorithm was proposed. In [29,30], the sparse optimizationis combined with linear time invariant (LTI) filtering to conduct ECG signals denoising. In these sparse representation methods, 1 -norm is used as a penalty term to construct sparse optimization models. This penalty term has the ability to control the sparsity of the optimization results. However, the 1 -norm penalty has a drawback that it always underestimates the interested signal [31], since larger coefficients are penalized more heavily in the 1 norm than smaller coefficients [32]. The underestimating causes the amplitude of estimated signals to be lower than the true value, which may lead to misdiagnoses.
In order to improve the denoising performance, a variety of non-convex penalty terms have been developed to replace 1 -norm [32][33][34]. Paper [32] proposes a weighted formulation of 1 -norm minimization designed to more closer to the solution of 0 -norm. In [33], the author uses 2 − 0 penalty instead of 1 -norm penalty. Paper [34] addresses the minimization problem of 0 -regularized least squares for a continuum and propose two heuristic search algorithms. However, these methods adopt non-convex penalty to replace 1 -norm, which leads to nonconvex objective functions. As a result, the algorithms become complicated and the solution is often stuck in the local optimal solutions. To overcome these drawbacks, a novel multivariate non-separable non-convex penalty, namely generalized minimax-concave (GMC) penalty, has been recently presented in [35,36]. The GMC penalty is a non-differentiable non-convex penalty obtained by subtracting a differentiable generalized Huber function from 1 -norm. It strongly promotes sparsity while maintaining the convexity of the cost function.
Currently, the GMC penalty has been successfully applied to the fault detection of gearbox and achieved good results [31,37]. However, the application of GMC penalty to the ECG denoising has not been well studied yet. It is not known whether the GMC penalty can be applied to the ECG signals processing area to improve the denoising performance. This motivates the work in this paper to explore the GMC-based sparse denoising method.
In this paper, we study the sparse denoising techniques for the ECG signals. Through modeling the observed ECG signals as a sum of low-pass component, sparse component and additive noise component, we propose an ECG denoising framework by combining the LTI low-pass filtering and the sparsity recovering. As the key ingredients of the framework, the sparsity recovery algorithms are investigated and we proposed two sparsity recovery algorithms based on 1 -norm penalty and GMC penalty, respectively. We compare the denoising performance of the proposed two schemes and demonstrate that the GMC-based scheme preservers more detail information with respect to QRS waves of ECG signals. The contributions of this work are summarized as follows.
1. A novel ECG denoising framework based on sparsity recovery is proposed. Low-pass filtering and sparse denoising idea was first proposed in [38], but it is not known if it can be applied to the ECG denoising. Moreover, the sparsity in [38] deals with the derivatives of the signal, while we directly deal with the sparsity components of the signal. Therefore, we propose a different sparse framework for ECG denoising. 2. We extend the application of GMC penalty to the area of ECG denoising. Although the GMC penalty has been proposed in [35] and several application examples were given there, it is not known how to apply it in the ECG area and whether the performance of ECG denoising can be improved. 3. The performance of the GMC-based sparse denoising algorithm is comprehensively tested.
We carry out a series of experiments to explore the performance the GMC-based denoising algorithm. The relationship between the parameter setting and the performance are demonstrated through experiments. It is also shown that the GMC-based algorithm helps to overcome the underestimation problem and outperforms the classical methods known in the literature.
The paper is organized as follows. We introduce the sparsity recovery denoising framework in Section 2. The 1 -norm based algorithm is described in Section 3. The GMC-based sparse denoising algorithm is proposed in Section 4. Experimental results are presented in Section 5. Section 6 concludes the work.

System Model
In this section, the sparse signal model in time domain is presented. A sparse denoising framework based on this signal model is established with the help of mathematical derivations.

Signal Model
The noisy ECG signals can be modeled as the superposition of a low-pass signal, a sparse signal and the additive white Gaussian noise (AWGN), which can be written as where n is the index to the discrete sampling sequences, l(n), d(n) and w(n) represent the low-pass signal, the sparse signal and the additive white Gaussian noise (AWGN), respectively. The 'sparse' property means that there are many zeros in a signal frame with N data points. To easily understand, we can regard d(n), 0, 1, 2, . . . ., N − 1, as the periodic components in the ECG signal frame. In the following, to facilitate our derivations, we omit the time index (n) in the case of not being confused.

Denoising Framework
We aim to recover the pure ECG signals l + d from the observed noisy signals s. To achieve this, a straightforward method is the traditional linear filtering procedure. Specifically, let s l + d represent the pure ECG signals, then it can be approximately estimated through traditional linear filtering where LPF represents a low-pass filtering (LPF) process andĺ LPF(l + w). Here, we use the zero-phase filter since it can reduce noise in the signals and preserves the QRS complex at the same time, it occurs in the original [38,39]. To facilitate readers, the design method of the LPF is sketched in Appendix A.
From Equation (2c), it is observed that the traditional LPF will smooth not only the low-pass component but also the sparse component in the original signals. Since the sparse component d consists of high frequency ingredients, it is prone to be damaged by the LPF, which will be embodied by the injury of the details of the ECG signals. For example, the amplitude of the peaks, the shape of the Q and R waves may be damaged by the LPF.
As mentioned above, the traditional LPF methods may cause damage to the details of the ECG signals due to the high-frequency ingredients. In order to enhance the details in the recovered signals, we propose a new ECG denoising framework as shown by Figure 1. The main idea is to recover these details, which is sparse, from the residual of the low-pass filter. Specifically, the residual of the LPF is written as where HPF represents a high-pass filter (HPF). It is assumed that the LPF and HPF are compensative, i.e., HPF = I − LPF. Here, I represents the identity operator. From Equation (3d), we know that the residual of the LPF consists of the high-passed ingredient of the sparse signals plus high-frequency noise. Ifd HPF(d) is recovered, then it can be combined with the output of the LPF (l =ĺ + LPF(d)) as shown in Figure 1. The combined signal can be written as Recall that HPF(d) contains the details of the sparse high-frequency components, the details of the recovered ECG signals can be thus improved. Then the problem boils down to how to recover this high-frequency sparse signals, i.e., HPF(d), from noise.
In the following, we propose two methods to recover the sparse components in the residual of the LPF. The first one is based on the 1 -norm penalty, which improves the details in the detains of the output of LPF but still has obvious underestimating phenomenon. The second one is based on the GMC penalty, which achieves better performance than the 1 -norm method and overcomes the underestimating problem.

Sparsity Recovery with 1 -Norm
In this section, we propose the 1 -norm based sparsity recovery algorithm for the sparsity recovery block in Figure 1.
From Figure 1, we know that the performance of the sparsity recovery plays the vital role in the whole denoising procedure. The classical sparsity recovery methods can be applied here straightforwardly, e.g., the 1 -norm penalized least squares method [40]. Specifically, the sparse signal d can be decomposed asd where A ∈ R N×M is the basis vectors of the signal space and x ∈ R M is the representation coefficients. Given A, we expect to find a sparse solution x to satisfy the equality. The sparse solving process can be mathematically reformulated as the following 1 -norm optimization problem Note that A can be constructed using different methods in principle [35]. In this work, we construct A based on the Short Time Fourier Transform (STFT) matrix as shown in Appendix B, considering that the ECG signals are periodic pulses in time series.
Whend is corrupted by noise, i.e., d w d +w is observed, an approximation solution to Equation (6) can be found by solving the following basis pursuit denoising (BPD) problem [41,42] arg min Specifically, the forward-backward spliting (FBS) method can be applied to find the solution of Equation (7) [43], as listed in Algorithm 1. It is noted that the so f t(·) in the algorithm is the soft thresholding operator defined by where T ∈ R + is a given threshold. The maxeig(·) denotes the maximum eigenvalue of a matrix.
Our simulation results show that the details in the recovered ECG signals of the proposed 1 -norm method are enhanced significantly compared with the output of LPF. However, there still exists an obvious gap between the recovered signal and the noiseless signal with respect to the peaks. That is to say the proposed 1 -norm method underestimates the peaks values in the signals. This drawback of the 1 -norm methods has also been revealed in literature [31,35]. To further improve the performance of the denoised signals with respect to the details, we propose a non-convex sparse regularization method based on the generalized minimax concave (GMC) penalty in the following.

Sparsity Recovery with GMC Penaly
As mentioned above, the classical sparsity recovery method based on 1 -norm penalty will lead to underestimation to the peaks values. In this section, instead of the 1 -norm penalty, we adopt a novel penalty for the sparsity recovery problem, which can keep the convexity of the least squares cost function minimized and avoids underestimation the values of waveform.

GMC Penalty
The GMC penalty function ψ B : R N → R is defined as } is the generalized Huber function [35], inf{·} denotes the infimum of a function. Figure 2 illustrates the scalar version of GMC penalty functions. It shows that the GMC penalty has a basic property that large values are penalized no less than small values, but the degree of punishment is not as severe as 1 -norm [35]. Because of this property, it becomes possible to overcome the drawback of underestimating the high-amplitude components for the 1 -norm penalty. Moreover, the GMC penalty function is non-convex itself, but it can be carefully customized to maintain the convexity of the cost function to be minimized. This means that the global optimization of the cost function can be achieved with GMC penalty.

Problem Formulation
By replacing the 1 -norm penalty in Equation (1) with the GMC penalty Equation (9), we present the following sparsity recovery problem arg min In order to maintain the convexity of the objective function, the parameters in B should be set carefully based on the following lemma.

Lemma 1.
In [35], Let y ∈ R M , A ∈ R M×N and λ > 0, the function F( Based on Lemma 1 and given A, the convexity of the objective in Equation (10) can be satisfied by simply setting • The non-convexity of the penalty ψ B is controlled by the parameter γ. When γ = 0, the elements of B are zeros, and the penalty degenerates to the 1 -norm. When γ > 0, a higher γ will lead to higher non-convexity.

Proposed GMC-Based Sparsity Recovery Algorithm
Similar with the classical 1 -norm penalty problem, there exists no closed-form solution for the GMC penalty problem Equation (10). In fact, the gradient of the GMC penalty is also inexplicit. In order to solve it, we substitute Equation (9) into Equation (10) and obtain a reformulated optimization problem (x opt , v opt ) = arg min where 2 2 with 0 ≤ γ ≤ 1, and d w =d +w. Problem Equation (12) is a saddle-point optimization problem [44], which can also be solved by the FBS method [35,45]. We list details of the solving process in Algorithm 2.
Based on the sparse denoising framework in Figure 1 and the sparsity recovery algorithm in Algorithm 2, we propose the GMC-penalty ECG denoising method. The detailed processing flow of the method is shown in Figure 3 with some key parameter settings. The whole process consists of four steps. At the first step, the noisy ECG signal is sent to a low-pass filter. At the second step, the output of the LPFl and the residual signal y −l are obtained. Two parameters of the LPF, i.e., the order k and the cutoff frequency is w c should be configured beforehand. The next step is to perform sparse recovery following Algorithm 2. The linear transform matrix A, the data length R, the data overlapping ration and the window function should be set before applying the algorithm. Moreover, the regularization parameter λ and the convexity-guaranteeing factor γ are also set at the start of the algorithm. After processing d w by Algorithm 2, the optimal solution x opt is obtained for each data frame. Since x opt is the sparse coefficients in frequency domain, we should further recover the denoised sparse signal through the calculationd = A H x opt . Finally, we obtain the denoised ECG signal by synthesizing the low-pass componentl and the high-frequency sparse componentd in the last step.
Step 1: Input noisy ECG signals Step 2: Low Pass Filter Parameter Setting Step 3: GMC penalty denoising

Configurations and Preprocessing
The performance of the proposed algorithms are evaluated based on the ECG records of the MIT-BIH arrhythmia database [46]. The database contains a total of 48 records, each record length is 30 min, the sampling rate is 360 Hz. We note that the original signals in the MIT-BIH database have been filtered by a band-pass filter with a passband from 0.1 to 100 Hz. However, it is obvious that the noise components falling in the passband will not be eliminated. To understand its influence on the clean signal, we make an observation of the original signals in the database, as shown in Figure 4a. It can be observed that there exist noise components in the signals, but the amplitudes of the noise components are relatively weak compared with the useful signal. In order to gain deeper insight into the influence of the noise on our algorithms, we process the original signal directly using our algorithms, and the results are shown in Figure 4b. It can be perceived that the weak noise in the original signal is 'smoothed' out by our algorithms. That is to say our algorithms have the ability to suppress the noise in the acquired signals. However, considering that the original noise in the original signals of the database is relatively weak, we omit its impact on the 'true' signal. In this case, we call the original signals in the MIT-BIH database as 'pure signal', despite there existing very weak noise in practice.  The performance of the proposed algorithm is compared with the works in [30,47,48]. Three different performance criteria are chosen for the performance measurement, i.e., the rooted mean square error (RMSE), the percent root mean square difference (PRD) an the signal-to-noise ratio improvement (SNR imp ).
For the LPF, we set the parameter of filter order as k = 2. The the digital cutoff frequency is set as w c = 0.06π, i.e., f c = 0.03 * F s = 10.8 Hz, where F s = 360 Hz is the sampling frequency of the ECG signals. It is noted that the cutoff frequency should be customized for different ECG signals, since the periodic characteristics of the ECG signals are not uniform in practice. We took data from the first 1 min, a total of 21,600 points. The data segment length R is set as 32, and the frame length of the STFT is 64.
To illustrate the proposed denoising framework, as shown by Figure 1, we plot the outputs of different preprocessing nodes in Figure 5. Specifically, Figure 5a is the original ECG signal read from the No.100 recode in the MIT-BIH database. This pure signal is then corrupted by AWGN noise with 10dB SNR and the result waveform is shown in Figure 5b. The noisy ECG signal, i.e., s, is the input of our denoising framework. It is sent to the LPF, and the output of which is shown in Figure 5c. By substituting this LPF signal from the raw noisy signal, a residual signal is obtained which can be treated approximately as a residual high-frequency sparse component of the ECG signal contaminated with noise, as shown in Figure 5d.

Convergence of the Proposed Sparsity Recovery Algorithms
The noisy sparse signal d w =d +w is further sent to the sparsity recovery block in Figure 1, where the proposed 1 -norm method or the GMC-penalty method is applied to extract the sparse signal. The convergence curves of the algorithms are plotted in Figure 6. In the simulation, we set γ = 0.8 and λ = 0.1, and we define the update rate asd (k)−d(k−1)

d(k)
, where k denotes the kth iteration. It can be observed that the updating rate of the algorithm will be less than 0.1% after about 250 iterations. This means the proposed algorithm will converge after hundreds of iterations. In Figure 6, we also plot the converge performance of the SASS method which is proposed in [30]. It can be observed that the update rate of the SASS algorithm will be less than 10 −4 after 250 steps, which reveals that the proposed algorithms in this work converge slower than the SASS algorithm.

Performance Comparison of the Proposed Algorithms
The outputs of the sparsity recovery block are plotted in Figure 7. It can be observed that the noise in the d w (i.e., Figure 5d is removed by both algorithms).
As the final step of the proposed denoising scheme, we obtain the final denoised ECG signal by adding the outputs of the LPF and the sparsity recovery block according to Figure 1. The resulting waveforms are shown in Figure 8. It can be observed that all the three methods, i.e., the proposed 1 -norm method, the proposed GMC method and the SASS method, eliminate the noise effectively. However, the 1 -norm method underestimates the R-peaks visibly as shown in the down-right corner of Figure 8. The SASS method performs better than the 1 -norm method with respect to the R-peaks, but it underestimates the Q-peaks as shown in the down-left corner of Figure 8. As a comparison, the GMC method outperforms the other two methods with respect to both the R-peaks and Q-peaks.
To gain more insights into the proposed algorithms, we plot the time-frequency relationships of the recovered ECG signals in Figure 9. It is observed that the GMC-based method preserves more high-frequency components than the 1 -norm method, which can be perceived in the spectrogram.

Impacting of λ
For the GMC-based algorithm, both λ and γ should be specified. Since the two parameters are coupled, we can fix one of them and vary the other to study their impact on the performance of the method. Specifically, we set γ = 0.8, and varying λ from 0 to 0.6. The resulted RMSE performance of the algorithm is shown in Figure 10. It can be observed that a better RMSE can be achieved when λ is chosen in the vicinity of 0.09. In Figure 11, we compare the rooted mean square error (RMSE), the percent root mean square difference (PRD) and the signal-to-noise ratio improvement (SNR imp ) achieved by different methods over different SNRs (20 dB 15 dB 10 dB 5 dB −5 dB −10 dB). The SASS [30], NLM [47] and CE-EMD-AN [48] algorithms are adopted as references. At each SNR, we average the results of all 48 data in the MIT-BIH database. It can be observed that the proposed GMC-based algorithm outperforms the referenced algorithms for all SNRs. To fully demonstrate the performance of the proposed algorithm, we list the SNR imp values achieved by different algorithms over SNR = 10 dB in Table 1. The results show that the GMC-based algorithm performs the best of all of the experimented SNRs. In order to more intuitively show the denoising performance of each algorithm, we illustrate in Figure 12 the values of RMSE, PRD and SNR imp achieved by the algorithms for the data number as 100, 101, 103, 105, 106, 115 and 220 in the database. It is noted that these data are selected because they were used in literature [47]. In fact, our algorithm can be applied to any ECG data in the database. To further show performance of the the proposed GMC-based method, we compared the SNR imp percentage improvement of the GMC algorithm with the three referenced algorithms, i.e., SASS, NLM and CE-EMD-AN, under different noise levels. The percentage improvement is calculated according to the following formula We designed an experiment to calculate the percentage of GMC increase in SNR imp value relative to other algorithms. The calculation formula is as follows: Percentage improvement = SNR_imp of GMC − SNR_imp of the referenced algorithm SNR_imp of the referenced algorithm × 100%. (13) The results are listed in Table 2. It shows that the performance of the proposed algorithm outperforms the referenced algorithms significantly. There is a question that if the proposed algorithm can be applied to the abnormal ECG signals, such as arrhythmia signal, and how it will perform. To answer this question, we apply the proposed algorithm to the normal heart beat signal and the arrhythmia signal and results are shown in Figures 13 and 14, respectively. As can be seen from Figure 13, all the four algorithms can remove noise effectively for the normal ECG signal. However, there is a huge difference in the type of signal that has an abnormal heart beat. As shown in Figure 14, after four seconds from the ECG signal (mitdb/230), the waveform begins to have turbulence and the T wave disappears significantly. After denoising, the output of the SASS algorithm is not smooth enough; the output of the CE-EMD-AN algorithm is seriously distorted in the period of 4 s-5 s; the output of the NLM algorithm has obvious sharp noise in the S-T segment during the 3 s-4 s period. Compared with these referenced algorithms, the proposed GMC algorithm achieves the best waveform. Therefore, the proposed GMC-based denoising algorithm can also be used with the arrhythmia signals.

Conclusions
We have investigated the ECG noise reduction techniques based on sparsity recovery in this paper. Through combining low-pass filtering and sparsity recovery, we proposed a novel sparse ECG denoising framework. The 1 penalty and the GMC penalty were utilized to construct sparsity recovery algorithms. Although the GMC penalty function was non-differentiable non-convex itself, it could be used to construct convex cost function and thus helps to obtain optimal solution. Since the GMC had the property of penalizing large values more than small values, but the degree of punishment is not as severe as 1 -norm, it was utilized to overcome the drawback of underestimating the high-amplitude components for the 1 -norm penalty. Experimental results demonstrated that the proposed GMC-based method overcame the underestimating problem of the 1 -norm method and classical ECG denoising methods proposed in literature. Results showed that the GMC-based method outperforms the conventional algorithms such as the SASS algorithm and the EMD based method with respect to the SNR imp , RMSE and PRD.
Given finite-length sequences p n and q n , two banded Toeplitz matrices P and Q are defined as P =       p 1 p 2 p 3 p 1 p 2 p 3 . . .
Based on these matrices, a linear filter matrix can be constructed as The matrix H is approximately Toeplitz, and it represents approximately an LTI system with the frequency response H(e jω ) = |Q(e jω )| 2 |Q(e jω )| 2 + α|P(e jω )| 2 . (A4) H(e jω ) is zero-phase, and its transfer function is written as After implementing Equation (A3) to the input data vector y (e.g., the noisy data), the vectorized output of the filter is written as x = Hy . (A6) An advantage of Equation (A6) is that transient effects at the start and end of the finite-length signal are intrinsically avoided. This means that the transient end-point artifacts is avoided.

Appendix B. The STFT Used in the Sparsity Recovery
We use the STFT in the sparsity recovery block in Figure 1. Specifically, the input of the sparsity recovery block is cut as small data segments with length R. The segments are overlapped with a ratio of 50%. After padding R zeros behind the tail of the data segment, a STFT frame with length N = 2R is constructed.
The constructed time-domain signal frame d ∈ R N can be represented by the frequency-domain signal x ∈ C N as where A ∈ C N×N is the inverse discrete Fourier transform (IDFT) matrix defined as The time-domain representation Equation (A11) can be transformed into the frequencyrepresentation as It is noted specially that a window function with length R is applied to the segmented time-domain data, which is defined as w(n) = sin( n + 0.5 R π), n = 0, . . . , R − 1 .
After the frequency-to-time transform, this window should be removed by multiplexing its reciprocal.