Marginal Component Analysis of ECG Signals for Beat-to-Beat Detection of Ventricular Late Potentials

Abstract: Heart condition diagnosis based on electrocardiogram signal analysis is the basic method used in the prevention of cardiovascular diseases, which are recognized as the leading cause of death globally. To anticipate the occurrence of ventricular arrhythmia, the detection of Ventricular Late Potentials (VLPs) is clinically worthwhile. VLPs are low-amplitude, high-frequency signals appearing at the end of QRS complexes in the electrocardiogram, and they can be considered a robust feature for arrhythmia risk stratification in patients with cardiac diseases. This paper proposes a beat-to-beat VLP detection method based on marginal component analysis and investigates its performance for different ratios between QRS and VLP power. After a denoising phase, performed with the singular value decomposition technique, heartbeats characterized by VLP onsets are identified and extracted taking into account the vector magnitude of each high-resolution ECG (HR-ECG) record. To evaluate the performance of the proposed method, a 15-lead HR-ECG database consisting of real VLP-negative and simulated VLP-positive patterns was used. The achieved results highlight the validity of the method for VLP detection.


Introduction
The shape of electrocardiographic signal is considered by cardiologists as representative of the heart state and, therefore, is useful in detecting cardiac pathologies such as arrhythmia [1,2]. Many types of arrhythmias have been classified based on heart rate and mechanism or site of origin. Some of them are benign but others may indicate the presence of serious heart disease, stroke or sudden cardiac death [3].
To prevent malignant ventricular arrhythmias in patients with arrhythmogenic right ventricular cardiomyopathy, the detection of ventricular late potential (VLP) signals is helpful [4]. In fact, VLPs are used as non-invasive markers for the prognosis of sudden cardiac death risk in patients recovering from myocardial infarction [5,6]. Detecting VLP occurrences therefore provides physicians with additional information about the patient's heart condition.
The detection of VLP occurrences in the cardiac signal is useful not only to identify post-myocardial-infarction patients prone to sudden cardiac death but also to evaluate thrombolytic and coronary angioplasty therapy, to follow patient progress after some kinds of heart surgery, to evaluate the evolution of cardiac conditions (such as cardiomyopathy, ischemia, and angina pectoris) and to study patients with risk factors for cardiovascular disease (e.g., hypertension, diabetes mellitus, and smoking).
VLPs are cardiac signals with high-frequency content (in the range of 40-250 Hz) and very low voltage (between 1 and 20 µV) that are located at the end of the QRS complex but may also extend into the early part of the ST segment (Figure 1). They are considered non-stationary and non-Gaussian signals and are generated in cardiac tissue zones whose architecture has been altered as a result of necrosis, fibrosis or dystrophy. These alterations create high-resistivity areas where the speed of the cardiac impulse decreases, which results in delayed and fragmented depolarization. By giving rise to ventricular late potentials, such heterogeneous areas represent an electrophysiological substrate for the development of re-entrant ventricular tachycardia [7].
As VLP signals are masked both by the low-frequency, high-amplitude deflections of the ECG and by high-frequency interference arising from biomedical instrumentation and muscular activity, their detection and quantification are hard tasks. It follows that noise reduction is an essential and delicate phase when the cardiac signal is processed for VLP detection and localization [8]. The Simson amplitude-time procedure [9] is the best-known method for VLP detection. It is based on ensemble averaging of a multitude of identical cardiac cycles both to reduce random noise in ECGs and to enhance the detection of low-amplitude signals. To characterize VLPs through time and amplitude measurements of QRS complexes, the three bipolar orthogonal XYZ leads are normally combined and several tens of beats per lead are averaged.
Several signal processing techniques for VLP detection implement a time-domain analysis. Because VLPs have low amplitude and are contiguous with the QRS complex, the detection of these microvolt waveforms requires high amplification and suitable filtering to reject the low frequencies associated with the repolarization phases of the action potential, the ST segment and the T wave. The performance of these methods depends on the characteristics of the selected filter. Most often, a linear, shift-invariant (time-invariant) digital filter is implemented in the time domain as a convolution sum to avoid phase distortion in a single processing step. A bi-directional four-pole Butterworth high-pass digital filter is generally adopted to prevent the ringing effect and to ensure that the QRS complex onset and offset in the filtered ECG coincide with those in the original signal. In addition, the localization of the QRS complex endpoint is a hard task because noise makes the portion of the signal following the QRS unstable. Standards propose the automatic estimation of the initial and final points of the filtered QRS complex [9].
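The bi-directional high-pass filtering mentioned above can be illustrated with a minimal Python sketch. The 40 Hz cut-off, the 1000 Hz sampling rate and the function name `bidirectional_highpass` are assumptions made for illustration only, and plain forward-backward filtering over the whole trace is a simplification of the clinical procedure, in which the two passes are usually applied to different parts of the beat.

```python
import numpy as np
from scipy import signal

def bidirectional_highpass(x, fs=1000.0, cutoff_hz=40.0, order=4):
    """Zero-phase high-pass: a Butterworth filter run forward, then backward.

    cutoff_hz and fs are illustrative assumptions, not values from the text.
    """
    sos = signal.butter(order, cutoff_hz, btype="highpass", fs=fs, output="sos")
    return signal.sosfiltfilt(sos, x)

fs = 1000.0
t = np.arange(0, 2.0, 1.0 / fs)
slow = np.sin(2 * np.pi * 5.0 * t)     # stands in for ST/T-wave content
fast = np.sin(2 * np.pi * 100.0 * t)   # stands in for VLP-band content

def rms(v):
    return float(np.sqrt(np.mean(v ** 2)))

mid = slice(500, 1500)                 # central part only, away from edge effects
slow_gain = rms(bidirectional_highpass(slow, fs)[mid]) / rms(slow[mid])
fast_gain = rms(bidirectional_highpass(fast, fs)[mid]) / rms(fast[mid])
print(f"gain at 5 Hz: {slow_gain:.2e}, gain at 100 Hz: {fast_gain:.3f}")
```

Because the filter is applied in both directions, the phase responses of the two passes cancel, which is why the positions of sharp features such as the QRS onset and offset are not shifted.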
Moreover, the incomplete characterization of re-entrant activity, the poor accuracy of positive prediction [10], and the impossibility to detect VLPs in patients with bundle branch block pathology [11] are the major limitations of time-domain analysis.
Unlike time-domain analysis, a standard for a frequency-domain approach has not yet been defined. Several studies have considered the Fourier transform [12] and the Short Time Fourier Transform, while other methods have used maximum entropy spectrum estimation [13,14], time-variant auto-regressive spectral analysis [15], spectral turbulence analysis [6,16] and the Wigner distribution [17]. Some evident limits of the aforementioned methods concern the fixed duration of the window adopted for the selection of QRS complex segments, the generation of interference terms and the fixed time-frequency resolution, which is a poor choice for the analysis of non-stationary signals such as VLPs. The wavelet transform is also adopted because it allows good tracking of sudden changes in the analyzed signal [18-20].
The averaging of a large number of QRS complexes, the so-called Signal Averaged Electrocardiogram (SAECG), improves the signal-to-noise ratio (SNR) in High Resolution ECG (HR-ECG) signals, but decreases the sensitivity of VLP detection and may give rise to waveform smoothing as a consequence of alignment jitter [21]. In fact, the adoption of the SAECG signal makes it impossible to detect ventricular variations from beat to beat.
Researchers have made several efforts to reduce noise, but the unknown characteristics of VLPs and their noise-like behavior have so far prevented results that are widely accepted by physicians.
To preserve the variability from beat to beat, as well as late potential onsets, as much as possible, a beat-to-beat correlation based denoising approach is proposed in this paper. The time alignment of heartbeat signals acquired by each lead (up to 15) of the HR-ECG monitoring system and the Singular Value Decomposition (SVD) technique characterize the denoising procedure of the implemented method.
As the paper aim is the detection of VLP occurrences in the ST segment of each heartbeat, the most significant singular values are retained in the reconstruction of HR-ECG signals; the reconstructed signals are summed to identify the peak to be analyzed. Moreover, a visual inspection allows VLP detections.
The confusion matrix is adopted for the performance evaluation of the proposed heuristic approach, taking into account different ratios between QRS and VLP powers.
Due to the lack of reference databases containing multi-lead HR-ECGs with VLPs, a simulated injection of VLP-like signals was used. Each VLP was simulated as a multifrequency signal in Gaussian noise.
This approach is common in the scientific literature [22,23] and has the benefit of scoring the detectability of added signals with varying VLP levels, as the beats affected by the injected VLP are known.
The paper is organized as follows. In Section 2, the technique used for the analysis phase is presented. In Section 3, the implemented method is detailed. In Section 4, the database adopted as test bench of the proposed approach is presented. Experimental results and discussion close the paper.

Adopted Technique
In the proposed system, the Singular Value Decomposition (SVD) method is employed for VLP detection. The aforementioned technique is widely used in several applications of signal processing such as compressing, denoising, data reduction and so on.
The SVD technique is a matrix decomposition method for reducing a matrix to its constituent parts [24].
Denoting by $X$ the data matrix of size $m \times n$, by $U$ and $V$ two matrices with orthonormal columns of size $m \times n$ and $n \times n$, respectively, and by $S$ a diagonal matrix of size $n \times n$, the SVD can be written as:
$$X = U S V^{H} \tag{1}$$
where the superscript $(\cdot)^{H}$ denotes the conjugate transpose (Hermitian operator). The columns of $U$ and $V$ are the left and right singular vectors of $X$, respectively, while the values along the diagonal of $S$ are the singular values of $X$, sorted from the highest to the lowest. The $U$ and $V$ matrices satisfy the following relations:
$$U^{H} U = I \tag{2}$$
and
$$V^{H} V = V V^{H} = I \tag{3}$$
where $I$ is the identity matrix. Since $V$ is unitary, right-multiplying Equation (1) by $V$ and using Equation (3) gives:
$$X V = U S = E \tag{4}$$
$V$ is the matrix that performs the linear combinations of the columns of $X$ needed to obtain a matrix with mutually orthogonal columns, named $E$. The right multiplication of $U$ by $S$ only scales the column vectors of $U$ without affecting their orthogonality. The $E$ matrix is thus obtained as a simple linear combination of the columns of $X$.
The norm of each column vector of $E$ equals the corresponding singular value; it is easy to show that, up to a scale factor, it estimates the standard deviation of the (zero-mean) signal in that column of $E$.
This procedure tends to accumulate all the information that is strongly correlated across the columns of $X$ in the first left singular vector. It follows that the highest-energy terms shared by the columns of $X$ are gathered in the first singular vector of $U$, leaving the uncorrelated components in the remaining columns.
This SVD ability to extract the uncorrelated marginal component of a signal has been appropriately exploited for reducing significantly the noise effect in the cardiac signal.
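The properties stated in this section can be checked numerically. The following is a minimal NumPy sketch on a synthetic complex matrix (the matrix sizes, the test waveform and all variable names are assumptions chosen for illustration); it verifies the decomposition, the orthonormality relations, the identity E = XV = US, and the concentration of the correlated energy in the first singular vector.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 200, 6
# Synthetic data: the same waveform on every column (different gains) plus noise.
common = np.exp(2j * np.pi * 5 * np.arange(m) / m)
noise = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
X = np.outer(common, rng.uniform(0.5, 1.5, n)) + 0.01 * noise

# Thin SVD: U is m x n, s holds the n singular values, Vh is V^H (n x n).
U, s, Vh = np.linalg.svd(X, full_matrices=False)
V = Vh.conj().T
E = X @ V                                   # E = X V

recon_ok = np.allclose(X, (U * s) @ Vh)     # X = U S V^H
ortho_ok = (np.allclose(U.conj().T @ U, np.eye(n))
            and np.allclose(V.conj().T @ V, np.eye(n)))
e_equals_us = np.allclose(E, U * s)         # E = U S
col_norms = np.linalg.norm(E, axis=0)       # column norms of E
energy_first = s[0] ** 2 / np.sum(s ** 2)   # energy share of the first component
print(recon_ok, ortho_ok, e_equals_us, round(float(energy_first), 4))
```

In this toy case almost all the energy falls in the first component because every column shares the same waveform; what is left in the other components is the uncorrelated part.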

Implemented Method
Unlike most published studies, the method implemented in this paper carries out the VLP detection on the cardiac signal and not on the mean amplitudes of the signal (SAECG), as usually performed. It follows that the signal phase information is preserved and a better cross correlation among different acquisitions of beats is allowed.
Assuming one HR-ECG record composed of several lead signals, the implemented method detects VLP occurrences by carrying out a beat-to-beat analysis of the HR-ECG record under test. To pursue this aim, the following phases have been implemented:
• Pre-processing phase
• Detection phase
In the next sections, both phases are detailed.

Pre-Processing Phase
For each lead of a generic HR-ECG record, band-pass filtering was carried out both to exclude signal wandering due to low-frequency terms and to limit the observation bandwidth to the properly selected frequency range. In fact, as the VLP signal contains only frequency terms below 300 Hz, the upper and lower limits of the filter bandwidth were chosen at $f_H = 330$ Hz and $f_L = 5$ Hz, respectively, to avoid neglecting some VLPs or some useful cardiac signal information.
A subsequent comb filter was used to reduce the misleading effects on the tested signals arising from the 50 Hz noise and all its harmonics, caused by the power supply section of the acquisition system. The cascaded impulse response of the filters used is that of a 1000-tap linear-phase Finite Impulse Response (FIR) filter whose transfer function is represented in Figure 2.
VLP detection on a beat-to-beat basis requires a preliminary signal analysis. For each lead of every HR-ECG record, all R-peak positions were detected and a beat-to-beat segmentation was carried out. These operations made it possible to construct P matrices, with P equal to the number of heartbeats in one lead. These matrices, named $B_i$ (for $i = 1, \ldots, P$), have size $M \times Q$, where M is the number of samples composing each heartbeat of the cardiac signal and Q is the number of leads composing one HR-ECG record. If $b^{i}_{j}$ is the column vector that represents the ith heartbeat of the jth lead, the $B_i$ matrix for the generic ith heartbeat is composed as indicated in Figure 3a.
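The band-pass and comb filtering stage described above can be sketched as a single linear-phase FIR design. This is only an illustration under stated assumptions: the multiband `firwin` design, the ±3 Hz stop-band half-width around each 50 Hz harmonic, the odd length of 1001 taps (chosen so that the filter is exactly symmetric) and the helper names are not taken from the paper.

```python
import numpy as np
from scipy import signal

FS = 1000.0      # sampling frequency of the records (Hz)
NUMTAPS = 1001   # assumption: odd length close to the 1000 taps quoted above

def design_prefilter(fs=FS, f_low=5.0, f_high=330.0, mains=50.0, half_width=3.0):
    """Pass f_low..f_high, with a stop-band around every mains harmonic."""
    edges = [f_low]
    k = 1
    while k * mains + half_width < f_high:
        edges += [k * mains - half_width, k * mains + half_width]
        k += 1
    edges.append(f_high)
    # pass_zero=False: the response is zero at DC, then bands alternate.
    return signal.firwin(NUMTAPS, edges, pass_zero=False, fs=fs)

def gain_at(h, freq_hz, fs=FS):
    """Magnitude of the FIR frequency response at a single frequency."""
    n = np.arange(len(h))
    return float(abs(np.sum(h * np.exp(-2j * np.pi * freq_hz * n / fs))))

def apply_zero_delay(h, x):
    """Filter x and compensate the constant group delay of the symmetric FIR."""
    return np.convolve(x, h, mode="same")

h = design_prefilter()
for f in (0.0, 50.0, 120.0, 150.0):
    print(f"{f:6.1f} Hz -> gain {gain_at(h, f):.4f}")
```

A symmetric impulse response gives a constant group delay, so trimming the convolution output with `mode="same"` keeps the R peaks at their original sample positions, which matters for the subsequent beat segmentation.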
For the analysis of the whole HR-ECG, the lead beat matrix (LBM) was constructed by concatenating all the $B_i$ matrices as follows:
$$LBM = [B_1, B_2, \ldots, B_P] \tag{5}$$
The obtained matrix has size $M \times N_{leads}$, where $N_{leads}$ is equal to $P \times Q$ (Figure 3b). For each HR-ECG record to be tested, one LBM matrix was created. The LBM matrix was constructed using the analytic signal associated with each beat of each lead. The complex description of the analytic signal is reported in Equation (6):
$$x_c(t) = x(t) + j\,\hat{x}(t) \tag{6}$$
In Equation (6), $x(t)$ is the generic acquired lead signal, $\hat{x}(t)$ is its Hilbert transform, $j$ is the imaginary unit and $x_c(t)$ is the analytic signal in its complex representation. The purpose of this transformation is to preserve the signal phase information.
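The construction of the LBM from segmented beats and analytic signals can be sketched as follows. The function name `build_lbm`, the fixed window of `pre`/`post` samples around each R peak and the choice of computing the analytic signal on the whole lead before segmentation are assumptions made for illustration; R-peak detection is taken as already done.

```python
import numpy as np
from scipy import signal

def build_lbm(record, r_peaks, pre, post):
    """Stack the analytic-signal beats of every lead into an M x (P*Q) matrix.

    record   : real array of shape (n_samples, Q), one column per lead
    r_peaks  : sample indices of the R peaks (assumed already detected)
    pre/post : samples kept before/after each R peak, so that M = pre + post
    """
    analytic = signal.hilbert(record, axis=0)       # x_c = x + j * Hilbert{x}
    beats = [analytic[r - pre:r + post, :]          # B_i, shape (M, Q)
             for r in r_peaks
             if r - pre >= 0 and r + post <= record.shape[0]]
    return np.hstack(beats)                         # [B_1, B_2, ..., B_P]

# Toy record: 3 leads, 5 identical pulses one second apart, sampled at 1000 Hz.
fs, Q = 1000, 3
t = np.arange(6 * fs) / fs
r_peaks = np.arange(1, 6) * fs
pulse = sum(np.exp(-0.5 * ((t - r / fs) / 0.01) ** 2) for r in r_peaks)
record = np.column_stack([(q + 1) * pulse for q in range(Q)])

lbm = build_lbm(record, r_peaks, pre=300, post=400)
print(lbm.shape, lbm.dtype)
```

The real part of each column is the original beat, while the imaginary part carries its Hilbert transform, so the phase information is kept in the matrix that is later decomposed.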
The conceived method decomposes each LBM matrix into orthogonal components by means of the SVD technique, so that the main correlated part of the lead signals composing one HR-ECG record is retained in the first few left singular vectors of the decomposition. Therefore, the first singular vector represents the expected heartbeat signal, while the signal differences among different beats are stored in the subsequent singular vectors. As the SVD is an energy-based method and the singular values are sorted in descending order, the largest part of the lead signal energy is contained in the first few left singular vectors, which correspond to the highest singular values (Figure 4). Because VLPs carry little energy, they may not be detectable in every LBM decomposition: the presence of the VLP in the left singular matrix depends on the VLP power relative to the power of the difference signals among the LBM column vectors. To define the minimum number of left singular vectors that can ensure an accurate signal representation, the highest singular values were considered. By thresholding the singular values, two distinct subspaces can be defined, namely the signal subspace and the noise subspace.
To obtain a denoised LBM matrix, the recomposition of the matrix from its SVD decomposition was performed, retaining only the higher singular values and setting to zero the singular values corresponding to the noise subspace in the diagonal matrix of Equation (1).
Because the energy of the VLP signal is unknown, to separate the signal subspace from the noise subspace in the LBM SVD decomposition, only the first fifteen singular values were retained. In fact, the highest singular values are the most energetic terms in the signal orthogonal decomposition and correspond to the vectors able to accurately represent the signal. The singular value flat zone can be considered mainly due to the background noise, which is always present in acquired signals. Figure 2 shows the plot of the singular vectors of a LBM decomposition.
A denoised LBM matrix (named DLBM) was obtained by recomposing the LBM matrix through Equation (1). Therefore, the DLBM is expressed as follows:
$$DLBM = [DB_1, DB_2, \ldots, DB_P] \tag{7}$$
where the generic $DB_i$ is the denoised $B_i$, whose generic element is denoted by $db^{i}_{m,j}$ with $m = 1, \ldots, M$ and $j = 1, \ldots, Q$.
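The denoising step, that is, the recomposition of the matrix from a truncated set of singular values, can be sketched in a few lines of NumPy. The function name `denoise_lbm` and the synthetic low-rank test matrix are assumptions for illustration; the default of 15 retained singular values follows the text, while the toy example keeps only 3 because its clean part has rank 3 by construction.

```python
import numpy as np

def denoise_lbm(lbm, n_keep=15):
    """Rebuild the matrix from its n_keep largest singular values only."""
    U, s, Vh = np.linalg.svd(lbm, full_matrices=False)
    s_trunc = s.copy()
    s_trunc[n_keep:] = 0.0            # zero the values of the noise subspace
    return (U * s_trunc) @ Vh, s

# Toy data: a rank-3 "signal" matrix buried in white noise.
rng = np.random.default_rng(1)
M, n_cols, rank = 300, 60, 3
clean = rng.standard_normal((M, rank)) @ rng.standard_normal((rank, n_cols))
noisy = clean + 0.1 * rng.standard_normal((M, n_cols))

dlbm, s = denoise_lbm(noisy, n_keep=rank)
err_before = np.linalg.norm(noisy - clean)
err_after = np.linalg.norm(dlbm - clean)
print(f"error before: {err_before:.2f}, after: {err_after:.2f}")
```

Plotting `s` for a real LBM would be one way to inspect the flat zone of the singular values that the text attributes to background noise.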

Detection Phase
As the method aims both to detect VLPs and to evaluate their frequency of occurrence, a beat-to-beat detection algorithm is required.
The output of the pre-processing phase, namely the DLBM matrix of one HR-ECG record, represents the input of the detection phase.
Assuming that heartbeats and VLP onsets are uncorrelated (or marginally correlated), the SVD was performed to confine the VLPs to the components associated with the secondary singular values. In particular, the implemented procedure adopts marginal component analysis for the identification and extraction of heartbeats characterized by VLP occurrences. The procedure discards the first singular vector and uses the other vectors for the DLBM matrix reconstruction. The retained vectors carry the details on VLP onsets, such as the mutual differences among corresponding heartbeats of all the leads composing one HR-ECG record.
The plot of the singular vectors after the SVD of a generic DLBM matrix shows noticeable VLP onsets in the third left singular vector of the decomposition (Figure 5).
The spectral analysis of some ST segments extracted from the DLBM shows that the obtained peaks occur at the VLP frequency components (Figure 6).
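The marginal component extraction described above (discarding the dominant singular vector and reconstructing from the others) can be sketched as follows. The function name `marginal_component`, the Gaussian pulse standing in for the QRS complex and the 120 Hz burst standing in for a VLP are illustrative assumptions.

```python
import numpy as np

def marginal_component(dlbm, n_discard=1):
    """Reconstruct the matrix without its n_discard dominant singular vectors."""
    U, s, Vh = np.linalg.svd(dlbm, full_matrices=False)
    s_marg = s.copy()
    s_marg[:n_discard] = 0.0          # drop the "expected heartbeat" term
    return (U * s_marg) @ Vh, U

rng = np.random.default_rng(2)
M, n_cols = 400, 30
t = np.arange(M) / 1000.0
beat = np.exp(-0.5 * ((t - 0.1) / 0.01) ** 2)           # shared QRS-like pulse
X = np.outer(beat, rng.uniform(0.8, 1.2, n_cols))       # same beat, varying gain
burst = 0.01 * np.sin(2 * np.pi * 120 * t) * ((t > 0.15) & (t < 0.19))
X[:, 7] += burst                                        # one column carries a burst

residual, U = marginal_component(X)
col_energy = np.sum(np.abs(residual) ** 2, axis=0)
print("column with the largest residual energy:", int(np.argmax(col_energy)))
```

Although the burst is 100 times smaller than the pulse, it dominates the residual because the pulse, being common to all columns, is absorbed by the discarded first component.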
To separate the VLP contribution from the cardiac signal, the Vector Magnitude (VM) of each HR-ECG record was evaluated, which quantifies the total energy of the above-mentioned record. The VM is recognized as a standard for this type of analysis by the Task Force Committee of the European Society of Cardiology, the American Heart Association, and the American College of Cardiology [25].
For the generic ith heartbeat, $VM_i = [vm^{i}_{1}, \ldots, vm^{i}_{M}]^{T}$ is a column vector whose elements were evaluated adopting the following formula:
$$vm^{i}_{m} = \sqrt{\sum_{j=1}^{Q} \left| db^{i}_{m,j} \right|^{2}} \tag{8}$$
A $VM$ matrix of size $M \times P$ was constructed by concatenating the above-mentioned column vectors:
$$VM = [VM_1, VM_2, \ldots, VM_P] \tag{9}$$
The extraction of all the ST segments from each $VM_i$ ($1 \le i \le P$) makes the identification of VLP onsets possible. In fact, by plotting the ST segments composing each $VM_i$ arranged in columns, the VLP occurrences are highlighted by longer vertical light gray segments in Figure 7. The results of the method in terms of VLP detections are reported in Figure 8. Circles represent the presence of true VLP signals, while peaks are the computed standard deviation in the ST segment of each beat.
Adopting the proposed algorithm, a perfect match between real VLP locations and detected VLP positions can be obtained with high probability when the VLP peak magnitude is 40 dB below the R peak.
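The vector magnitude and the per-beat standard deviation of the ST segment can be sketched as below. The column ordering of the input (beat by beat, Q leads per beat), the ST window indices and especially the median-plus-k-times-MAD threshold in `flag_beats` are hypothetical choices made to obtain a runnable example; the text describes the decision in terms of the standard deviation peaks and visual inspection, without specifying a threshold.

```python
import numpy as np

def vector_magnitude(dlbm, n_leads):
    """Return the M x P vector-magnitude matrix of an M x (P*Q) beat matrix."""
    M, n_cols = dlbm.shape
    P = n_cols // n_leads
    beats = dlbm.reshape(M, P, n_leads)      # columns are ordered beat by beat
    return np.sqrt(np.sum(np.abs(beats) ** 2, axis=2))

def flag_beats(vm, st_start, st_stop, k=10.0):
    """Flag beats whose ST-segment standard deviation is an outlier.

    The median + k*MAD rule is a hypothetical threshold, not the paper's.
    """
    st_std = np.std(vm[st_start:st_stop, :], axis=0)
    med = np.median(st_std)
    mad = np.median(np.abs(st_std - med)) + 1e-12
    return st_std > med + k * mad, st_std

rng = np.random.default_rng(3)
M, P, Q = 200, 40, 15
dlbm = 0.001 * rng.standard_normal((M, P * Q))      # background noise only
t = np.arange(M) / 1000.0
for i in (4, 11):                                   # same burst on all leads of a beat
    burst = 0.05 * np.sin(2 * np.pi * 100 * t[120:160])
    dlbm[120:160, i * Q:(i + 1) * Q] += burst[:, None]

vm = vector_magnitude(dlbm, Q)
flags, st_std = flag_beats(vm, st_start=110, st_stop=170)
detected = np.flatnonzero(flags).tolist()
print("beats flagged:", detected)
```

Taking the modulus before squaring keeps the computation valid when the matrix holds complex analytic signals as well as real ones.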

Adopted Database
The method was validated using real electrocardiographic signals provided by the PhysioNet database. In particular, the PTB Diagnostic ECG Database, a collection of real ECGs acquired by the Physikalisch Technische Bundesanstalt (PTB), the German national metrology institute, was selected [26]. Its signals are characterized by a sampling frequency of 1000 Hz, a resolution of 16 bits with 0.5 µV/LSB and a total duration of about 2 min. Each HR-ECG acquisition is composed of 15 signals: 12 acquired by conventional leads and 3 orthogonal (Frank leads). Each set of 15-lead HR-ECG data is referred to as an HR-ECG record.
Sixty HR-ECG records from the PTB Diagnostic ECG Database, each acquired in the absence of VLP signals, were used. To evaluate the algorithm accuracy, 60 HR-ECG records with VLPs were synthesized by adding VLP signals, properly generated in Matlab, to the ST segment of healthy HR-ECG signals. For each record, a number of VLPs randomly selected in the range 1-30 was added to the VLP-free acquisitions in random (but known) positions. This approach is shared with other studies in the literature concerning VLP detection [23,27,28].
As HR-ECG records were corrupted by adding VLPs in known positions, the ground truth of VLP localizations was available and the detection process performance could be verified and accurately evaluated.
Based on the aforementioned features, a model was developed for VLP signal generation as the sum of sinusoids in accordance with Equation (10):
$$VLP(t) = \sum_{n=1}^{N} \alpha_n \sin\left(2\pi f_n t + \varphi_n\right) \tag{10}$$
where $N$ is the number of sinusoidal components and $\alpha_n$, $\varphi_n$ and $f_n$ are parameters randomly selected inside the ranges [0, 1], [0, 2π] and [40, 250] Hz, respectively. For each heartbeat, the generated signal has fixed frequency terms but different peak amplitudes, which depend on the phase composition of the frequency components in each ST segment. Therefore, the VLP peak amplitude may be considered as a random variable with a uniform distribution ranging in a random interval that is related to the VLP frequency component amplitudes. In addition, the position of the additive VLP signals was slightly and randomly varied from beat to beat with respect to the R peak, but it was the same for corresponding heartbeats of all 15 leads composing one HR-ECG record. In this way, the VLP variability due to physiological causes was reproduced.
The block diagram of the VLP signal generation is represented in Figure 9. After the VLP signal generation, a normalization was carried out to set the ratio between the amplitude of the highest R peak present in the lead under test and the amplitude of the most elevated VLP equal to a pre-established value.
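The generation and normalization steps described above can be sketched in Python (the paper itself reports using Matlab). The number of components, the 40 ms duration, the 1 mV R-peak amplitude and the function names are assumptions for illustration; the sketch also scales a single VLP, whereas the text normalizes with respect to the highest R peak and the largest VLP of the lead under test.

```python
import numpy as np

def synth_vlp(n_samples, fs, n_components, rng):
    """Sum of sinusoids with random amplitude, phase and frequency."""
    t = np.arange(n_samples) / fs
    alpha = rng.uniform(0.0, 1.0, n_components)        # amplitudes in [0, 1]
    phi = rng.uniform(0.0, 2.0 * np.pi, n_components)  # phases in [0, 2*pi]
    freq = rng.uniform(40.0, 250.0, n_components)      # frequencies in Hz
    waves = alpha[:, None] * np.sin(2 * np.pi * freq[:, None] * t + phi[:, None])
    return np.sum(waves, axis=0)

def scale_to_ratio(vlp, r_peak_amplitude, ratio_db):
    """Scale vlp so that 20*log10(R peak / VLP peak) equals ratio_db."""
    target_peak = r_peak_amplitude / 10.0 ** (ratio_db / 20.0)
    return vlp * target_peak / np.max(np.abs(vlp))

rng = np.random.default_rng(4)
vlp = synth_vlp(n_samples=40, fs=1000.0, n_components=5, rng=rng)
vlp = scale_to_ratio(vlp, r_peak_amplitude=1000.0, ratio_db=40.0)  # values in uV
peak_uv = float(np.max(np.abs(vlp)))
print(f"VLP peak after scaling: {peak_uv:.2f} uV")
```

With a hypothetical 1000 µV R peak and a 40 dB ratio, the scaled VLP peaks at 10 µV, which lies inside the 1-20 µV range quoted in the Introduction.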

Evaluation Parameters
The performance of the implemented diagnostic system was evaluated using sensitivity, specificity and accuracy. Sensitivity (Se) is defined as the probability of detecting a VLP when a VLP is actually present. Specificity (Sp) represents the probability of obtaining a negative HR-ECG record when no VLP is present. Accuracy (Ac) is defined as the observed agreement between the procedure results and the physician's opinion about the HR-ECG record under test [30]. They were computed as follows:
$$Se = \frac{TP}{TP + FN} \tag{11}$$
$$Sp = \frac{TN}{TN + FP} \tag{12}$$
$$Ac = \frac{TP + TN}{TP + TN + FP + FN} \tag{13}$$
where TP (the number of true positives) is the number of correct identifications of VLPs inside the HR-ECG record under test; FN (the number of false negatives) is the number of VLPs present in the HR-ECG record that the algorithm is not able to detect; FP (the number of false positives) is the number of VLPs detected by the algorithm that are not actually present in the HR-ECG record; and TN (the number of true negatives) is the number of HR-ECG records that the procedure considers without VLPs and that really do not have VLPs.
In general, high values of both Se and Sp are desirable for computer-aided detection (CAD) systems. In practice, a trade-off between Sp and Se is necessary, based both on the impact of FP and FN diagnoses and on the prevalence of the disease in the subjects under test [31].
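As a small worked example of the three evaluation parameters, the sketch below computes them from confusion-matrix counts using the standard definitions. The counts are made up for illustration and are not results from this work.

```python
def detection_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    se = tp / (tp + fn)                      # detected VLPs over all true VLPs
    sp = tn / (tn + fp)                      # correct negatives over all negatives
    ac = (tp + tn) / (tp + fn + fp + tn)     # overall agreement
    return se, sp, ac

# Made-up counts, for illustration only (not results from the paper).
se, sp, ac = detection_metrics(tp=90, fn=10, fp=5, tn=95)
print(f"Se = {se:.3f}, Sp = {sp:.3f}, Ac = {ac:.3f}")
```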

Results of the Implemented Method
The entire collection of healthy records of the PTB database was processed. Since each HR-ECG was recorded adopting 15 leads and is 2 min long, 1800 cardiac tracings for a total of about 90,000 heartbeats were tested for the system evaluation.
Denoting by $A_{QRS}/A_{VLP}$ the ratio between the R peak amplitude and the VLP amplitude of the same heartbeat, the conceived system reached a sensitivity, specificity and accuracy of about 95.3%, 94.5% and 94.3%, respectively, at a ratio $A_{QRS}/A_{VLP}$ equal to 40 dB.
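To make the decibel figures used in this section concrete, the following worked conversion expresses them as amplitude ratios. The 1 mV R-peak amplitude in the last line is a hypothetical value chosen only to give an order of magnitude.

```latex
\left.\frac{A_{QRS}}{A_{VLP}}\right|_{\mathrm{dB}} = 20\log_{10}\frac{A_{QRS}}{A_{VLP}}
\quad\Longrightarrow\quad
\frac{A_{QRS}}{A_{VLP}} = 10^{\,\mathrm{dB}/20}

40\ \mathrm{dB} \;\rightarrow\; 10^{2} = 100, \qquad
45\ \mathrm{dB} \;\rightarrow\; 10^{2.25} \approx 177.8

A_{QRS} = 1\ \mathrm{mV}\ \text{(assumed)}: \quad
A_{VLP} = 10\ \mu\mathrm{V}\ \text{at 40 dB}, \qquad
A_{VLP} \approx 5.6\ \mu\mathrm{V}\ \text{at 45 dB}
```

Both values fall within the 1-20 µV range given for VLPs in the Introduction, and moving from 40 dB to 45 dB roughly halves the VLP amplitude to be detected.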
In Figure 10, the achieved Se, Sp and Ac trends for different $A_{QRS}/A_{VLP}$ values are plotted. An accuracy not lower than 90% was achieved up to $A_{QRS}/A_{VLP}$ = 45 dB (Table 1).
Figure 10. CAD performance plot in terms of HR-ECG peak to VLP peak ratio.
Comparisons of the obtained performance with other methods reported in the literature show the validity of the procedure (Table 2).
Orosco L. et al. [34]: the procedure analyzes the signal-averaged HR-ECG record and defines a diagnostic index as a combination of the best temporal parameters and the most significant time-frequency index of VLP analysis.

Discussion and Conclusions
Technological innovations have contributed both to human health (improving people's quality of life) and to the management of diagnostic institutes (making the diagnostic process more efficient by increasing the productivity of each physician). In particular, new systems have been implemented that are able to detect signs of illness even if the diagnostic signals are complex and difficult to analyze.
In this paper, a computer aided detection system is employed for automatic detection of ventricular late potentials (VLPs) in high resolution ECG signals. The implemented method adopts the marginal component analysis implemented by the singular value decomposition to perform a beat-to-beat analysis. Due to VLP low amplitude, the detection of their occurrences is a challenge because they are masked by noise, interference and cardiac signal components.
To evaluate the performance of the procedure, a database composed of real and semi-simulated HR-ECG records was used, which provided a realistic and controllable environment for algorithm assessment. In particular, randomly generated sequences resembling the VLP characteristics were added to non-VLP records in random positions.
Even though our method exhibits performance quite similar to that of the algorithm of Zandi et al. [23], it shows many distinctive characteristics, which are improvements not only with regard to the characteristic parameters (sensitivity, specificity and accuracy) but especially in terms of signal processing, tool applicability and application conditions. In fact, the implemented method shows that a careful use of marginal component analysis in a suitable pre-processing phase of a CAD system aimed at VLP detection makes it possible to reach performance in line with the best methods reported in the literature without resorting to complex detection/classification systems (based on neural networks, heuristic or probabilistic approaches). Moreover, since the implemented method has the benefit of being an open architecture where each block is an object-oriented module, in future work the detection/classification section might be upgraded individually to improve the CAD system performance. Additionally, even if the obtained performance is comparable with that achieved in [23], the application conditions are quite different. In fact:

• The databases selected to test the procedures are different: the authors of [23] used a private database composed of HR-ECG records without VLPs, with a sampling frequency of 2000 Hz and a 16-bit A/D converter, while a freely available public database composed of HR-ECG records without VLPs, with a sampling frequency of 1000 Hz and a 16-bit A/D converter, was adopted here. The decision to use a public database was motivated by the intention of obtaining results comparable with other procedures in the literature that use the same database. It is well known that database characteristics influence the achieved performance of a CAD method and, therefore, the same procedure could produce different results when the signal dataset changes. Most studies in the literature test VLP detection on private datasets.

• The procedures for VLP generation and insertion in HR-ECG signals are different in [23] and in the proposed method. In [23], the basic VLP waveform is simulated as a colored Gaussian process and added to the end part of the QRS complex of every heartbeat. The position of the additive VLP waveforms is varied randomly from beat to beat, and the amplitude of the VLP waveforms is modified for each heartbeat so that the absolute peak value of the R wave is 100 times (40 dB) that of the VLP waveform in that heartbeat. In the proposed method, for each heartbeat, the generated signal has fixed frequency terms but different peak amplitudes, which depend on the phase composition of the frequency components in each ST segment. Therefore, the VLP peak amplitude may be considered as a random variable with an almost uniform distribution ranging in a random interval that is related to the VLP frequency component amplitudes. In addition, the position of the additive VLP signals is slightly and randomly varied from beat to beat with respect to the R peak, but it is the same for corresponding heartbeats of all the leads composing one HR-ECG record.
In the proposed tool, there is no guarantee that, for each heartbeat, a ratio not greater than 100 is preserved between the R and the VLP peak values in that heartbeat (i.e., the VLP amplitude might be lower, making its detection more difficult), giving rise to a more critical situation in comparison with the method in [23].
In summary, the following benefits characterize the implemented method:
• an open architecture where each block is an object-oriented module, which can be upgraded individually to improve the CAD system;
• able to achieve better, or at least comparable, performance than other procedures detailed in the literature;
• able to preserve the beat-to-beat variability information;
• able to achieve satisfactory results up to a ratio of R peak amplitude to VLP amplitude equal to 45 dB;
• a heuristic approach that needs no training and subsequent validation for the test procedure; and
• an efficient approach with respect to the required computational load.
Obviously, the acceptance of a CAD system in a diagnostic environment would depend not only on the performance of the method alone, but also on how well a physician performs the task when the computer output is used as an aid.