ECG Signal Denoising and Reconstruction Based on Basis Pursuit

: The electrocardiogram (ECG) is widely used for the diagnosis of heart diseases. However, ECG signals are easily contaminated by different noises. This paper presents efﬁcient denoising and compressed sensing (CS) schemes for ECG signals based on basis pursuit (BP). In the process of signal denoising and reconstruction, the low-pass ﬁltering method and alternating direction method of multipliers (ADMM) optimization algorithm are used. This method introduces dual variables, adds a secondary penalty term, and reduces constraint conditions through alternate optimization to optimize the original variable and the dual variable at the same time. This algorithm is able to remove both baseline wander and Gaussian white noise. The effectiveness of the algorithm is validated through the records of the MIT-BIH arrhythmia database. The simulations show that the proposed ADMM-based method performs better in ECG denoising. Furthermore, this algorithm keeps the details of the ECG signal in reconstruction and achieves higher signal-to-noise ratio (SNR) and smaller mean square error (MSE).


Introduction
Noise removal is a fundamental problem in signal processing. Electrocardiography (ECG) is a vital biomedical tool for heart disease diagnosis. ECG is a noninvasive technique which is a safe, harmless, and quick method of cardiovascular diagnosis. The accuracy of the information extracted from a signal requires proper characterization of waveform morphologies and that they are not contaminated by noises [1][2][3]. However, ECG is a fairly weak electric signal, and its amplitude is usually in the millivolt level. ECG noises include interference by the instrument itself, baseline wander, human activities, and other factors in the signal. Baseline wander is the most common noise in ECG signals, which has greatest interference to its amplitude. It usually causes the signal to deviate from the normal baseline level, affecting the ST segment and small waves such as P wave and T wave, etc. The changes of these morphological waves can seriously interfere with disease diagnosis [4][5][6]. The main cause of baseline wander is the effect of breathing. Compared with ECG signals, baseline wander is a low-frequency noise, with a frequency of 0.05~2 Hz [7].
Several techniques have been reported in the literature to address ECG components contaminated with different noises. Some solutions have been proposed, such as adaptive filter architecture, wavelet transform, neural networks, and nonlinear filter banks. Several ECG compression techniques have been developed in recent years, most of which are based on lossy schemes for their higher compression ratio [8]. These techniques have also been used to extract a noise-free signal from noisy ECG.
Compressed sensing (CS) is a data sampling technology which has emerged in recent years. Its main idea is to reconstruct and restore the signal with less sampled data [9][10][11][12].
(2) This paper proposes the BP-ADMM algorithm, which overcomes the problems of low reconstruction accuracy. This method introduces dual variables, reduces constraint conditions, and achieves the purpose of optimizing the original variable and the dual variable at the same time through alternate optimization. (3) A low-pass filter matrix is constructed for baseline wander correction and denoising through a zero-phase filter. The results show that the issue of the peak underestimation of the ECG signal is effectively improved, and the performance of the algorithm is systematically proposed. (4) The rest of this paper is organized as follows. The related techniques and mathematical methods are presented in Section 2. Section 3 introduces the system model. Section 4 shows the algorithm set-up process as well as the ECG signal denoising and reconstruction process. A detailed description of the simulation results under various algorithms is provided in Section 5, and Section 6 is the conclusion.

Basis Pursuit
In the theory of CS, the number of observations is much smaller than the length of the signal, so the signal reconstruction is faced with the problem of solving underdetermined equations. In order to solve this problem, the BP algorithm is proposed. The BP algorithm is a new method in signal sparse representation and also a principle of global optimization [14]. The observation matrix has restricted isometric properties (RIP). Moreover, RIP also provides a theoretical guarantee for accurately recovering signals from observed values. The classical basis pursuit algorithm is based on the above principles to achieve the purpose of reconstruction and Gaussian white noise denoising. At present, the basis pursuit method has good applications in the field of one-dimensional signal processing [31][32][33]. In the case of noisy observations, the model is as follows: where x ∈ R n is an unknown vector, y ∈ R m is the measuring signal, the baseline wander f , and the Gaussian white noise n. M ∈ R m×n is the perceptual matrix, and the goal is to recover the unknown vector x from the noisy signal y.

Review of the ADMM Method
The ADMM method is an algorithm for constraint minimization. ADMM is a powerful technique in linearly constrained optimization problems for splitting a large problem into smaller ones. It has been widely used in many areas including machine learning, signal processing, and other fields [34][35][36][37]. It can be decomposed into several sub-vectors, that is, its objective function can also be decomposed into: with variable x ∈ R, and where f are convex. Then the large-scale optimization problem can be transformed into the distributed optimization problem. Accordingly, the equality constraint matrix Ax = b is also divided into: where A ∈ R r×n is the measurement matrix, and b ∈ R r is the projection vector. Therefore, the augmented Lagrange objective function L ρ (x, λ) can be written as follows: where ρ > 0 is the penalty parameter, λ > 0 is a regular term parameter, and L ρ is the standard Lagrangian for the problem. The penalty function obtained is still the same as that obtained in the augmented Lagrange multiplier method. Then, the dual ascending method is adopted to obtain a decentralized algorithm that can carry out parallel operations: The x i can be updated independently, because x i updates are done in an alternating or sequential manner, and they are alternating direction multiplicators.

System Model
The ECG signal is a multi-segment sequential, periodic signal. The value of the signal generally appears as a signal close to zero, and its waveform is composed of abrupt peaks returning to the baseline. The ECG signal appears as a form of sparse waveform distribution, and the signal has a certain periodicity. When the ECG signal is calculated by the third-order difference, it can be obtained that most of the values in the difference signal have tended to or approximated to zero. Therefore, the ECG signal can be regarded as a sparse signal.
An ECG recording signal affected by noise and baseline wander after collection can be viewed as the following three parts: the original clean ECG signal x; the baseline wander f , and the Gaussian white noise n. Therefore, a mixed signal of length N can be modeled as follows: When the signal itself is sparse or close to sparse representation, signal processing can be performed by the sparse optimization correlation method. It is well known that the L norm [38] is a representative of convex sparsity. The problem of sparse signal recovery can be transformed into an optimization problem for the following objective function. Therefore, the problem of sparse noise reduction is attributed to the optimization of the norm of the problem of minimization under the condition of data fidelity.
where • p denotes p norm, α is the regularization factor, M is the measurement matrix, and the row vectors are orthogonal to each other. According to the signal model, the nonconvex optimization problem of sparse signal estimation is expressed as a basis tracking denoising problem model, in which regularization parameters and signal norm control errors and sparsity are introduced.

BP Denoising
BP denoising (BPDN) is now mainly applied to image processing denoising, signal recovery, and reconstruction [39]. It is a typical noise reduction method based on the sparse signal model. It is defined by the minimization of a convex cost function that contains a quadratic data truth term and a nondifferentiable convex penalty term, whose penalty term is a compound term of a linear operator and a L1 norm [40]. When the signal or its differential signal is sparse, the noise in the signal can be effectively suppressed, which can be regarded as a convex optimization problem with quadratic data fidelity.
The BPDN algorithm extracts a subset y from the columns of M, and its goal is to minimize the number of extracted columns or minimize the number of non-zero items in the vector x. The x is reconstructed from the noisy data y. The BPDN algorithm transforms Equation (8) into the following optimization problem: In the BPDN model, the signal can be regarded as a linear combination of multiple basis functions. Most of the basis function coefficients are zero, while only a few of them have non-zero coefficients. The basis function here is called the atom, and all the combinations of atoms are called dictionaries. When the number of atoms in the dictionary is greater than the signal dimension, the dictionary is overcomplete or redundant. Sparse representation is to find the optimal solution in the whole signal space by redundant dictionary according to the sparse metric standard. The typical sparse metric is L0. The commonly used measurement matrix is the Gaussian random matrix. It has been proved that the high probability of such a matrix satisfies the RIP property [41]. Therefore, when the Gaussian random matrix is used to observe the original signals, the original signals can be accurately recovered from the measured value with a very high probability through the reconstruction algorithm.
The measurement matrix plays an important role in CS and signal reconstruction. The reconstruction error is smaller if the performance of the measurement matrix is better [42]. The measurement matrix M makes the dimension of ECG signals reconstructed from L measured values such as Figure 1. The measurement matrix M can be established according to the length of ECG signals N and the number of measurements L. So M is a matrix of L × N. In this model, considering the periodic pulse characteristics of the ECG signal on the time series and the need for real-time performance, an analysis dictionary is selected and constructed using short-time Fourier transform (STFT) [43].
where Mψ satisfies the RIP feature. That is, for any sparse signal x and constant ξ, ξ ∈ (0, 1), The initial observation matrix M is a full-rank matrix. The row vectors of the measurement matrix M meet the orthogonal characteristics, and the energy is equal. That is Appl. Sci. 2021, 11, x FOR PEER REVIEW 5 of 16

BP Denoising
BP denoising (BPDN) is now mainly applied to image processing denoising, signal recovery, and reconstruction [39]. It is a typical noise reduction method based on the sparse signal model. It is defined by the minimization of a convex cost function that contains a quadratic data truth term and a nondifferentiable convex penalty term, whose penalty term is a compound term of a linear operator and a L1 norm [40]. When the signal or its differential signal is sparse, the noise in the signal can be effectively suppressed, which can be regarded as a convex optimization problem with quadratic data fidelity.
The BPDN algorithm extracts a subset y from the columns of M , and its goal is to minimize the number of extracted columns or minimize the number of non-zero items in the vector x . The x is reconstructed from the noisy data y . The BPDN algorithm transforms Equation (8) into the following optimization problem: In the BPDN model, the signal can be regarded as a linear combination of multiple basis functions. Most of the basis function coefficients are zero, while only a few of them have non-zero coefficients. The basis function here is called the atom, and all the combinations of atoms are called dictionaries. When the number of atoms in the dictionary is greater than the signal dimension, the dictionary is overcomplete or redundant. Sparse representation is to find the optimal solution in the whole signal space by redundant dictionary according to the sparse metric standard. The typical sparse metric is L0. The commonly used measurement matrix is the Gaussian random matrix. It has been proved that the high probability of such a matrix satisfies the RIP property [41]. Therefore, when the Gaussian random matrix is used to observe the original signals, the original signals can be accurately recovered from the measured value with a very high probability through the reconstruction algorithm.
The measurement matrix plays an important role in CS and signal reconstruction. The reconstruction error is smaller if the performance of the measurement matrix is better [42]. The measurement matrix M makes the dimension of ECG signals reconstructed from L measured values such as  The M column vector of the measurement matrix is normalized, namely: Figure 2 shows the BP signal reconstruction and BPDN denoising process. Figure 2a shows the original ECG signals. Figure 2b is the signal reconstruction through BP algorithm. It can be seen from Figure 2c that the measurement process amplifies the variance of the Appl. Sci. 2021, 11, 1591 6 of 15 original noise component, which is the phenomenon of noise folding. This is mainly because the compression measurement process aliases the noise corresponding to the zero component of the sparse signal into the signal, making the noise multiplied. The power of the sparse signal remains basically unchanged after being compressed and measured, resulting in a sharp drop in the signal-to-noise ratio and affecting the performance of compressed sensing reconstruction. It can be seen from Figure 2d that the noise reduction effect of the BPDN algorithm is not ideal.
The M column vector of the measurement matrix is normalized, namely:  Figure 2 shows the BP signal reconstruction and BPDN denoising process. Figure 2a shows the original ECG signals. Figure 2b is the signal reconstruction through BP algorithm. It can be seen from Figure 2c that the measurement process amplifies the variance of the original noise component, which is the phenomenon of noise folding. This is mainly because the compression measurement process aliases the noise corresponding to the zero component of the sparse signal into the signal, making the noise multiplied. The power of the sparse signal remains basically unchanged after being compressed and measured, resulting in a sharp drop in the signal-to-noise ratio and affecting the performance of compressed sensing reconstruction. It can be seen from Figure 2d that the noise reduction effect of the BPDN algorithm is not ideal.

Low-Pass Filter MATRIX Construction
Digital filters are mainly divided into finite impulse response (FIR) filters and infinite impulse response (IIR) filters. For a FIR filter, the impulse response decays to zero in a finite time, and its output depends only on the current and past input signal values. Finite length unit impulse response filter, also known as non-recursive filter, is the most basic element in digital signal processing systems [44]. It can guarantee arbitrary amplitude-frequency characteristics while having strict linear phase-frequency characteristics. Meanwhile, its unit sampling response is finite length, so the filter is a stable system. Therefore, FIR filters are widely applied to communication, image processing, pattern recognition, and other fields.
Our goal is to extract a clean signal x from the noisy signal y. In order to achieve this, the common solution is to use a traditional linear low-pass filtering system. We use a zero-phase filter because it can reduce the noise in the signal while keeping the QRS waveform from shifting after noise reduction. The transfer function of the zero-phase filter is: The effect of a linear time invariant (LTI) system on the input signal is to change the complex amplitude of each frequency component in the signal. The nature of this effect can be understood in detail by using the modulus phase representation to look at this effect. Take the discrete time situation as an example: where H e jw is called the gain of the system. The effect of an LTI system on the input Fourier transform model is to multiply it by the modulus of the system frequency response. The phase shift of the system can change the relative phase relationship among the components in the input signal, so that even if the gain of the system is constant for all frequencies, it is possible to produce great changes in the time-domain characteristics of the input. The vectorized output after filtering the input noise-containing data y is: y = Hx [45].

BP-ADMM Methods
In this section we formulate our problem and derive an efficient distributed optimization algorithm via ADMM. In the case of noise, according to Equations (7) and (9) of the paper, it can be concluded that: where ε represents a very small normal number error. In addition, according to the Lagrange multiplier theorem, a constant λ can be introduced to make the above optimization problem with constraints become an unconstrained minimization problem. Firstly, we set up the objective function: In order to apply the ADMM to the BP method, we first transform the ADMM formula to an equivalent constrained problem. Then according to yy = Hx of Section 4.2, it can be described as the following formula: We introduce new variables z = Ax. Then the ECG denoising and reconstruction problem can be transformed into an unconstrained form, and the corresponding augmented Lagrange function is: where λ is a dual variable, and ρ is the penalty operator. Use the alternate direction multiplier of the ADMM algorithm to solve the problem. Therefore, the variables x and z are updated in sequence, and then the dual variables are updated. In the second step of ADMM, substituting the augmented Lagrange function in Equation (18) into the ADMM algorithm framework shown in Equation (15), the new ADMM algorithm process can be drawn as: Let us define: Obviously, to derive G k (x) in Equation (18) with respect to x, the derivation process is as follows: Finally, we update the dual variables. Let Equation (24) be 0, and then we get: The multiplier term and the penalty term are completely squared, and after discarding some constant terms, the optimization sub-problem can be transformed into: In addition, the soft threshold function, the soft threshold operator S is defined as Its iterative process turns into a soft threshold for point-by-point operation: But different from the general iterative algorithm, the ADMM algorithm is a dualcondition stop threshold judgment on the stop criterion of iterative convergence, that is, both the original problem residual and the dual residual must reach the convergence threshold.
In the ADMM algorithm, the selection of parameters λ and ρ have a relatively important impact on the accuracy of the image reconstruction algorithm [46]. In the image solution process, it is necessary to select a more appropriate balance factor λ value to maintain the balance between the regular part and the fidelity part. It is also necessary to select a more appropriate penalty parameter ρ value to determine the convergence of the iterative reconstruction, and the current academic circle does not propose a clear selection method for these two parameters, being only based on the balance of the fidelity term and the regular term. The selection of parameters is estimated, and the optimal value is selected according to the experiment. Therefore, in order to improve the accuracy of the signal generated by the iterative reconstruction process, this section has conducted multiple experiments to study the influence of the values of the two parameters λ and ρ on the experimental results of the ADMM algorithm, and obtained relatively appropriate parameter values.

ECG Database
The algorithm in this paper is mainly verified by the currently popular MIT-BIH ECG database. In order to better simulate the patient's ECG signals in the real state, we artificially add noises of different amplitudes. The MIT-BIH noise stress test database provides three common ECG noise signals: baseline drift, power-line interference, and motion artifacts. The frequency range of the first baseline drift is between 0.02 Hz and 2 Hz. The first baseline drift as the noise source is more representative [47]. Superimposing different signal-to-noise ratio (SNR) Gaussian white noise on ECG signals simulates the ECG with noises in the real state. We add different SNR noises during the experiment.

Performance Analysis
The existing literature generally uses signal-to-noise ratio and mean square error to measure the quality of the noise reduction effect [48]. In order to effectively evaluate the noise reduction effect of the algorithm proposed in this article and compare it with the existing algorithm, this article uses the following defined evaluation indicators. The output signal-to-noise ratio is defined as follows: The mean square error is defined as follows:

Simulation Results
We evaluate the denoising performance of our method in this section. To perform a comprehensive study of the proposed method, we consider different numbers of ECG data. We also study the performance of the BP-ADMM algorithm according to SNR and MSE in order to demonstrate its ability.
The first group of ECG data in the MIT-BIH arrhythmia database is recorded in file No. 100. The data in this group of data contain little noise and abnormal heart beats. This group of ECG data is preprocessed to obtain the first length 3000 pure ECG signal x in Figure 3a, which is linearly superimposed with the first baseline wandering noise in the BW file to obtain a noisy ECG signal y in Figure 3c.

Simulation Results
We evaluate the denoising performance of our method in this section. To perform a comprehensive study of the proposed method, we consider different numbers of ECG data. We also study the performance of the BP-ADMM algorithm according to SNR and MSE in order to demonstrate its ability.
The first group of ECG data in the MIT-BIH arrhythmia database is recorded in file No. 100. The data in this group of data contain little noise and abnormal heart beats. This group of ECG data is preprocessed to obtain the first length 3000 pure ECG signal x in Figure 3a, which is linearly superimposed with the first baseline wandering noise in the BW file to obtain a noisy ECG signal y in Figure 3c. The blue line represents the noisy signal, and the solid red line represents the signal after filtering the baseline wander noise and Gaussian white noise in Figure 4. It can be seen from the figure that after the algorithm filtering in this paper, the ECG signal that initially deviated from the baseline level returns to the baseline level, and the Gaussian white noise is effectively removed, which proves that the BP-ADMM algorithm can effec- The blue line represents the noisy signal, and the solid red line represents the signal after filtering the baseline wander noise and Gaussian white noise in Figure 4. It can be seen from the figure that after the algorithm filtering in this paper, the ECG signal that initially deviated from the baseline level returns to the baseline level, and the Gaussian white noise is effectively removed, which proves that the BP-ADMM algorithm can effectively remove baseline wander noise and Gaussian white noise. Before denoising, due to the large amount of white noise and baseline drift in the initial signal, the waveform diagram basically does not show the original characteristics of the ECG signal. However, after processing by the proposed algorithm, the waveform characteristics of pulse signals can be easily seen.
ci. 2021, 11, x FOR PEER REVIEW In order to further verify the effect of denoising, the values of the SNR befor after denoising of the pulse signal are calculated, respectively. The calculation resu shown in Table 1.  In order to further verify the effect of denoising, the values of the SNR before and after denoising of the pulse signal are calculated, respectively. The calculation results are shown in Table 1.  Table 1 is a comparison of SNR and MSE after filtering with wavelet, mean, and total variation (TV) noise reduction algorithms. In Table 1, the wavelet transform and the mean value method are not very effective in filtering Gaussian white noise, but the TV transform shows a good denoising effect. Even so, the denoising effect of the algorithm proposed in this article is still better than the three denoising algorithms mentioned above. The data in Table 1 show that the signal-to-noise ratio after denoising has been significantly improved compared to the ratio before denoising. The larger the value of the signal-to-noise ratio, the better the denoising effect, which indicates that the algorithm in this paper can effectively suppress a variety of noises and achieve a good denoising effect. We calculate the mean SNR of the above 28 pieces of ECG data shown in Table 2. Among them, the mean SNR of BP-ADMM is 7.129 which is the biggest among the four methods. This experiment is performed under the noise data input 5 dB. In order to verify the effectiveness of the noise reduction algorithm proposed in this paper, we compare it with TV, wavelet, and mean filtering methods with different SNR noises. Figures 5 and 6 show the results. It can be seen from Figures 6 and 7 that the algorithm proposed in this paper performs better in preserving the detailed features of ECG signals after noise reduction. The guidance signal constructed by this method can contain most of the morphological characteristics of the original ECG signals. However, the algorithm proposed in this paper can filter the electrode interference noises very well while retaining the original morphological characteristics of the P wave and T wave. However, the reconstructed ECG signals offset the original signals and many stair-case artifacts, which cannot retain the detailed features from the other filtering methods. The noise reduction differences among these algorithms are shown in Figure 7.

Conclusions
In order to better remove baseline wander interference and Gaussian white noise, a BP-ADMM algorithm is proposed for ECG denoising and reconstruction through combining low-pass filtering and CS recovery. The BP-ADMM algorithm is studied on the basis of the traditional BP algorithm, which reconstructs the ECG signals with denoising. The method introduces dual variables, adds a secondary penalty term, and reduces constraint conditions through alternate optimization to optimize the original variables and the dual variables at the same time. The dual decomposition method improves the calculation speed by decomposing the objective function in parallel. The effectiveness of the algorithm is verified by the records from the MIT-BIH arrhythmia database. The experiments show that this denoising method can effectively remove baseline wander interference and random Gaussian white noise in ECG signals. The proposed framework outperforms the wavelet and average algorithms in SNR and MSE. The waveform of ECG signals is much smoother and more details are preserved.