A Tandem Feature Extraction Approach for Arrhythmia Identification

Heart disease is currently the leading cause of death in the world. The electrocardiogram (ECG) is the recording of the electrical activity generated by the heart. Its low cost and simplicity have made it an essential test for monitoring heart disease, especially for the identification of arrhythmias. With the advances in electronic technology, there are nowadays sensors that enable the recording of the ECG during the daily life of the patient and its wireless transmission to healthcare facilities. This type of information has a great potential to detect cardiac diseases in their early stages and to permit early interventions before the patient’s health deteriorates. However, to usefully exploit the large volume of information obtained from ambulatory ECG, pattern recognition techniques that are capable of automatically analyzing it are required. Tandem feature extraction techniques have proven to be useful for the processing of physiological parameters such as the electroencephalogram (EEG) and speech. However, to the best of our knowledge, they have never been applied to the ECG. In this paper, the utility of tandem feature extraction for the identification of arrhythmias is studied. The coefficients of a regression using Hermite functions are used to create a feature vector that represents the heartbeat. A multiple-layer perceptron (MLP) is trained using these features and its posterior probability outputs are used to extend the original feature vector. Finally, a Gaussian mixture model (GMM) is trained on the extended feature vectors, which is then used in a GMM-based arrhythmia identification system. This approach has been validated using the MIT-BIH Arrhythmia database. The accuracy of the Gaussian mixture model increased by 15.8% when applied over the extended feature vectors, compared to its application over the original feature vectors, showing the potential of tandem feature extraction for ECG analysis and arrhythmia identification.


Introduction
Cardiovascular diseases are the main cause of death in the world [1,2]. Approximately 18 million people died from cardiovascular diseases in 2019, representing 33% of all deaths worldwide. In the near future, it is expected that the proportion of global deaths from heart disease will increase [3], and this increase will be more pronounced in developing countries due to changes in diet and lifestyle derived from the greater purchasing power of their citizens [4].
The electrocardiogram (ECG) is a fundamental test in the clinical routine for the diagnosis and monitoring of cardiovascular diseases. Its low cost, ease of use, non-invasive nature and the simplicity of the instrumentation necessary for its acquisition make it an ideal candidate for long-term ambulatory monitoring during the patient's daily life [5,6]. In the ECG, a lead is a measure of the electrical activity of the heart given by the difference in potential between two points. This difference can be measured between two electrodes (bipolar lead) or between a virtual point and an electrode (monopolar lead). Different leads provide different perspectives of the electrical stimulus, and therefore complementary information.
The QRS complex is the most distinctive element of the heartbeat in an ECG [7]. It corresponds to the ventricular depolarization, which causes contraction of the right and left ventricles. Normal QRS complexes last between approximately 60 and 120 ms. The duration, amplitude and morphology of the QRS complex provide valuable information about the state of the heart and are useful in the diagnosis of cardiac arrhythmias, that is, of conduction abnormalities and other heart disorders (see Figure 1). Nowadays, there are algorithms capable of reliably detecting the position of the QRS complexes, with a sensitivity of about 99.9% [8]. However, the identification of the heartbeat morphology is still an open problem [9,10].
In the last decade, there have been significant advances in electronic technology, including miniaturization of components, increased battery life and decreases in production costs. At the same time, advances in communication technologies have facilitated the wireless transmission of information both in local area networks (such as Wi-Fi, Bluetooth and Zigbee [11]) and over long distances (such as 4G-5G [12]). This has made the ambulatory monitoring of patients during their daily life activities technologically possible and cost effective [13]. However, the volume of this information is too high to be exploited manually by healthcare staff. A normal patient has approximately 100,000 heartbeats per day, and each recorded ECG lead captures a different electrical representation of those heartbeats. Hence the need for pattern recognition techniques for the automatic analysis of the ECG that are suitable for use in the context of a wearable telemonitoring platform [6,14].

Related Work
Significant research has been conducted on the use of pattern recognition techniques in ECG analysis that aim to automatically detect different types of arrhythmias in ECG recordings. For years, the most widely used pattern recognition approaches have been based on two stages: feature extraction and classification. The feature extraction stage aims to extract robust (and commonly hand-crafted) features that effectively represent the ECG signal, and the classification stage employs those features to carry out heartbeat classification. Topological features derived from persistent homology [15] such as mean, standard deviation, skewness and kurtosis for persistence, birth time, death time, persistence entropy, number of dimensions, sums of persistence, number of layers in the landscapes, number of valleys per layer, and mean, standard deviation, skewness and kurtosis for the number of peaks per layer, along with other feature types such as demographic data and RR intervals, have been employed in [16]. Some other works present time-frequency-based features such as the wavelet transform [17–19] and wavelet packet entropy [20], and frequency-based features such as the fast Fourier transform [19]. Fusion of different feature sets (morphological features based on the discrete wavelet transform (DWT), statistical variational features and temporal features) has also been explored [21].
Regarding the classification stage, discriminative models that aim to differentiate between the heartbeat classes are widely used (e.g., support vector machine (SVM) [17] and random forest [16,20]). More advanced discriminative models such as convolutional neural networks have been investigated in [18], and fusion of different classifiers (SVM and nearest neighbour) was also considered in [21].
However, in recent years, the traditional approaches based on those two stages have increasingly been replaced by deep neural network-based approaches. These approaches do not need the feature extraction stage, since they are able to carry out classification from the raw ECG signal [22–26]. Transformer models based on attention and encoder-decoder architectures have also been employed [19,27]. However, all these approaches present an important drawback when considered as part of a daily life monitoring solution: they cannot be fully integrated in a wearable device due to their computational requirements.

Motivation and Organization of this Paper
The approaches presented in the literature for ECG analysis (and hence for arrhythmia detection) are commonly based on hand-crafted features. For ECG analysis, Hermite functions have been shown to provide a compact representation that is robust in the presence of noise for feature extraction in ECG signal classification systems [28–30]. On the other hand, the tandem approach for feature extraction was first presented in the early 2000s [31] for speech recognition tasks. It aims to augment the original hand-crafted features with discriminatively-trained features. These features were originally based on a multiple-layer perceptron (MLP), although linear discriminant analysis (LDA) has also been explored later for signals other than speech. To do so, the MLP is employed to obtain a posterior probability for each class to be identified. The resulting set of posterior probabilities in the output layer of the MLP is added to the original features to create the so-called tandem approach. For LDA-based features, the projections of the LDA are employed. For speech signals, the tandem approach typically incorporates logarithm and PCA decorrelation-based transformations to match the speech signal characteristics. This tandem approach has been shown to significantly improve pattern recognition performance with MLP-based features in speech recognition [32–34], speaker verification [35], language identification [35–37] and fiber optic recognition [38,39]; both MLP and LDA-based features proved useful in electroencephalogram (EEG) recognition [40], with the best result obtained from the MLP-based features.
Based on the power of Hermite functions for ECG signal representation as hand-crafted features [28–30], this work explores whether the tandem approach could be useful for the classification of heartbeat morphology. To do so, the proposal uses an augmented feature extraction strategy, which adds new features based on the posterior probabilities output by an MLP to the Hermite-based features. Then, the augmented features are fed to a Gaussian mixture model (GMM)-based classification system, which carries out the final classification of the heartbeats. It must be noted that the work presented in [30] also employs Hermite functions for ECG signal representation and an MLP for classification. However, our work differs from [30] since we propose the use of the MLP within the feature extraction and we base our classification system on Gaussian mixture modelling. The work presented in [41], which employs multiscale principal component analysis for signal preprocessing, statistical features (i.e., mean, average power, standard deviation and mean value ratio) related to the coefficients of the DWT for feature extraction and decision trees for classification, also differs from our approach since the signal preprocessing, feature extraction and classification stages are all different. Therefore it can be said that, to the best of our knowledge, this is the first work that employs a tandem approach for feature extraction in arrhythmia identification from ECG. Moreover, due to the low complexity of the GMM approach employed for classification in this work, the system can be fully integrated in a (low-cost) wearable device.
The rest of the paper is organized as follows: Section 2 presents the database used in this work. The novel tandem feature extraction for ECG arrhythmia identification is presented in Section 3. The experimental procedure is presented in Section 4. Section 5 presents the experiments and results, which are discussed in Section 6. Finally, Section 7 concludes the paper.

Database
To validate the technique presented in this work, the most referenced database in the literature of arrhythmia identification will be used: the MIT-BIH Arrhythmia Database [42]. The wide variety of patients, the different types of heartbeats and the large number of annotations have fostered the use of this database [28,43–47]. This database contains 48 electrocardiogram recordings obtained from 47 different patients. Each recording consists of two leads among the following: MLII, V1, V2, V3, V4 and V5. The recordings are digitized at a sampling rate of 360 Hz with a resolution of 11 bits. The database should not be considered a representative sample of the population, as the records were carefully selected to cover as wide a variety of cardiac disorders as possible. Each heartbeat was reviewed by at least two cardiologists; approximately 68% of the heartbeats were considered normal and the remaining 32% were divided into 16 types of abnormal heartbeats.
After the publication of the MIT-BIH Arrhythmia Database, the Association for the Advancement of Medical Instrumentation (AAMI) proposed guidelines for evaluating the performance of arrhythmia identification algorithms, which recommended dividing heartbeats into five types: normal (N), supraventricular (S), ventricular (V), fusion (F) and indeterminate (Q) heartbeats [48]. This classification has become a de facto standard, and it will be used here to evaluate the results of our novel approach. The original labels of the MIT-BIH Arrhythmia Database were mapped to the five heartbeat labels recommended by AAMI, as in [28]. This mapping is presented in Table 1.

Tandem Feature Extraction for Arrhythmia Identification
The tandem feature approach has been tested on the ECG arrhythmia identification system presented in Figure 2, which is based on four different stages: (1) signal preprocessing, where the ECG signals are initially denoised, (2) raw feature extraction, in which a raw set of discriminant features is extracted from the denoised ECG signals, (3) augmented feature extraction, in which the raw feature vector is enhanced with the MLP-based features to create the so-called tandem feature vectors, and (4) pattern classification, which involves two different stages itself: training, which trains the Gaussian mixture model for each AAMI heartbeat class from the training tandem feature vectors, and testing, which classifies each heartbeat into one of the predefined AAMI heartbeat classes from the testing tandem feature vectors (see Table 1). These stages are explained in more detail next.

Signal Preprocessing
Signal preprocessing aims to filter noise in the signal to allow the feature extraction step to be based on the morphological properties of the heartbeat, without being affected by issues such as baseline drift or high frequency noise. Two filters were applied on the ECG recordings. First, an eight-level Daubechies-based wavelet transform with extremal phase filters of width 4 was applied for baseline drift removal. The result of this filter is used to reconstruct a time series with the baseline drift of the ECG recording, and this drift is subtracted from the original recording. Afterwards, a low-pass Butterworth filter with a cut-off frequency of 40 Hz was applied to eliminate high frequency noise as well as the power line hum. The filtered ECG signals comprise the output of this module and are the input to the raw feature extraction stage.
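The low-pass stage can be illustrated with a minimal sketch. The filter order is not stated above, so a second-order Butterworth section (designed with the bilinear transform) applied forwards and backwards to obtain zero phase is assumed here. The function names are hypothetical, and the wavelet-based baseline removal is not reproduced.

```python
import math

def butter2_lowpass_coeffs(cutoff_hz, fs_hz):
    """Second-order Butterworth low-pass biquad (bilinear transform).

    The order (2) is an assumption; the text only gives the 40 Hz cut-off.
    """
    k = math.tan(math.pi * cutoff_hz / fs_hz)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b = [k * k * norm, 2.0 * k * k * norm, k * k * norm]
    a = [1.0,
         2.0 * (k * k - 1.0) * norm,
         (1.0 - math.sqrt(2.0) * k + k * k) * norm]
    return b, a

def biquad(b, a, x):
    """Direct-form I filtering of the sequence x with zero initial state."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

def lowpass_zero_phase(x, cutoff_hz=40.0, fs_hz=360.0):
    """Filter forwards, then backwards, so that the phase delays cancel."""
    b, a = butter2_lowpass_coeffs(cutoff_hz, fs_hz)
    forward = biquad(b, a, x)
    backward = biquad(b, a, forward[::-1])
    return backward[::-1]
```

With the 360 Hz sampling rate of the database, a 5 Hz component passes almost unchanged, whereas components well above 40 Hz are strongly attenuated. The first and last few samples are affected by the zero initial state of the filter.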

Raw Feature Extraction
From the denoised ECG signals, the raw feature extraction aims to obtain the most discriminant information from the various types of heartbeats present in the database. To do so, Hermite functions were used in this work for raw signal representation. The orthogonal Hermite functions have a shape reminiscent of QRS morphology and include a width parameter that enables an efficient modelling of QRS complexes of different widths. This makes it possible to obtain accurate heartbeat representations with few coefficients. The heartbeat is represented by a feature vector with the coefficients that permit its reconstruction from the combination of the Hermite functions. This representation has been shown to be compact and robust in the presence of noise [28].
From the ECG signals, a 200 ms window was extracted for each heartbeat by considering the samples before and after the actual heartbeat position labelled in the database. Hermite functions tend to zero at both $-\infty$ and $\infty$. To make the Hermite functions converge at the window edges, a 100 ms zero segment was added at both sides of the QRS complex so that the resulting window has a length of 400 ms. This window, $x[l]$, is represented as a linear combination of Hermite functions plus an error term, as given by Equation (1), where $l$ refers to the window sample, $N_h$ is the number of Hermite functions, $c_n(\sigma)$ are the coefficients of the linear combination, $\varphi_n[l, \sigma]$ is the $n$-th discrete Hermite function, obtained by sampling the corresponding continuous Hermite function $\varphi_n(t, \sigma)$, $e[l]$ is the approximation error between the actual window $x[l]$ and the Hermite representation, and $\sigma$ is a dilation parameter that relates the width of the Hermite function to the width of the QRS complex. The index $l$ varies according to Equation (2), where $W$ is the window size, $F_s$ is the sampling frequency and $\lfloor \cdot \rfloor$ represents the floor function. The Hermite functions $\varphi_n[l, \sigma]$ are defined in Equation (3), where $T_s$ is the sampling period (i.e., the inverse of the sampling frequency $F_s$) and $\alpha$ is defined in Equation (4). The Hermite polynomial $H_n(\alpha)$ in Equation (3) is defined recursively in Equation (5), where $H_0(x) = 1$ and $H_1(x) = 2x$. This Hermite representation enables the heartbeat contained in each signal window to be represented by the $N_h$ coefficients $c_n(\sigma)$ of the linear combination of Hermite functions, together with $\sigma$.
For a given value of $\sigma$, the Hermite functions form an orthogonal basis, as shown in Equation (6). It must be noted that Equation (6) holds if the discrete Hermite function $\varphi_n[l, \sigma]$ is close enough to zero both at the edges of and outside the analysis window. At the edges of each analysis window, $\varphi_n[l, \sigma]$ is required to be at most 1/10 of its maximum value within the window, as defined in Equation (7), where $-l_0$ and $l_0$ refer to the first and last window samples, respectively. Moreover, we also impose that the value of $\varphi_n[l, \sigma]$ is smaller outside the analysis window than at the edges of the analysis window, as shown in Equation (8). For a certain value of $\sigma$, the linear combination coefficients $c_n(\sigma)$ are computed by minimizing the summed squared error given by Equation (9); thanks to the orthogonality of the basis, the minimizing coefficients are given by the inner product $c_n(\sigma) = \sum_l x[l] \cdot \varphi_n[l, \sigma]$. For a predefined window size and for a fixed number of Hermite functions, it is possible to calculate theoretical limits for the value of $\sigma$. Through an incremental iterative process, different values of $\sigma$ are tested, starting at 0 and going up to the theoretical maximum, until the one that minimizes the error is found. The average values of $\sigma$ for $N_h \in [1, 30]$ range from 14 ms to 21 ms. Then, a raw feature vector $x_{rf}$ is stored for each heartbeat, which consists of the $N_h$ coefficients $c_n(\sigma)$ of the Hermite representation of the corresponding heartbeat plus the $\sigma$ value. This process is carried out for each ECG lead available; since our system employs two different ECG leads, $N_r = 2(N_h + 1)$-dimensional raw feature vectors $x_{rf}$ comprise the output of this module and are given to the augmented feature extraction module.
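For reference, Equations (1)-(5) can be written as follows. This is a sketch assuming the standard formulation of Hermite functions for QRS modelling: Equations (1), (2) and (5) follow directly from the definitions above, whereas the normalisation constant and the exact form of the exponent in Equation (3) are assumptions.

```latex
% Eq. (1): Hermite expansion of the heartbeat window
x[l] = \sum_{n=0}^{N_h-1} c_n(\sigma)\,\varphi_n[l,\sigma] + e[l]

% Eq. (2): range of the sample index
l = -\left\lfloor \frac{W \cdot F_s}{2} \right\rfloor, \ldots,
     \left\lfloor \frac{W \cdot F_s}{2} \right\rfloor

% Eq. (3): sampled Hermite function (normalisation assumed)
\varphi_n[l,\sigma] = \frac{1}{\sqrt{\sigma\,2^n\,n!\,\sqrt{\pi}}}\;
    e^{-(l\,T_s)^2/(2\sigma^2)}\,H_n(\alpha)

% Eq. (4): normalised time argument
\alpha = \frac{l\,T_s}{\sigma}

% Eq. (5): recursion for the Hermite polynomials
H_n(x) = 2x\,H_{n-1}(x) - 2(n-1)\,H_{n-2}(x)
```

With $H_0(x) = 1$ and $H_1(x) = 2x$, the recursion gives, for example, $H_2(x) = 4x^2 - 2$.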
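The fitting procedure can be sketched as follows, assuming the standard normalised Hermite functions sampled at t = l·T_s. Under that assumption the sampled basis is orthogonal up to a factor 1/T_s, so each coefficient is the inner product scaled by T_s. The helper names and the candidate σ grid are illustrative assumptions, not the authors' implementation, and the sweep starts at a small positive σ because σ = 0 is not a valid width.

```python
import math

def hermite_function(n, t, sigma):
    """Continuous Hermite function phi_n(t, sigma) evaluated at time t (seconds)."""
    a = t / sigma
    h_prev, h = 0.0, 1.0                      # placeholder, H_0
    for k in range(1, n + 1):                 # H_k = 2a H_{k-1} - 2(k-1) H_{k-2}
        h_prev, h = h, 2.0 * a * h - 2.0 * (k - 1) * h_prev
    norm = math.sqrt(sigma * (2.0 ** n) * math.factorial(n) * math.sqrt(math.pi))
    return math.exp(-0.5 * a * a) * h / norm

def hermite_basis(n_funcs, sigma, fs, window_s):
    """Sample the first n_funcs Hermite functions on l = -L..L, L = floor(W*Fs/2)."""
    half = math.floor(window_s * fs / 2.0)
    ts = 1.0 / fs
    return [[hermite_function(n, l * ts, sigma) for l in range(-half, half + 1)]
            for n in range(n_funcs)]

def fit_hermite(x, n_funcs, sigma, fs, window_s):
    """Project the window x onto the basis; return (coefficients, squared error)."""
    ts = 1.0 / fs
    basis = hermite_basis(n_funcs, sigma, fs, window_s)
    # The T_s factor compensates for the assumed continuous-time normalisation.
    coeffs = [ts * sum(xi * p for xi, p in zip(x, phi)) for phi in basis]
    approx = [sum(c * phi[i] for c, phi in zip(coeffs, basis))
              for i in range(len(x))]
    err = sum((xi - ai) ** 2 for xi, ai in zip(x, approx))
    return coeffs, err

def best_sigma(x, n_funcs, sigmas, fs, window_s):
    """Try each candidate sigma and keep the one with the smallest squared error."""
    best = None
    for s in sigmas:
        coeffs, err = fit_hermite(x, n_funcs, s, fs, window_s)
        if best is None or err < best[2]:
            best = (s, coeffs, err)
    return best
```

For one lead, the raw features of a heartbeat would then be the returned coefficients followed by the selected σ. A grid search over a fixed list of candidates is used here for brevity in place of the incremental search up to the theoretical maximum.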

Augmented Feature Extraction
This module takes the raw feature vectors $x_{rf}$ as input and produces tandem feature vectors $x_{tf}$ as output. An MLP is employed to add the feature-level augmented information to each heartbeat in the ECG arrhythmia identification system. The MLP consists of three layers, as shown in Figure 3: an input layer with the $N_r$ raw feature vector values, a hidden layer, whose number of units was selected based on preliminary experiments, and an output layer, which employs the softmax activation function, with a number of units equal to the number of heartbeat classes (five in our case). The MLP models are trained by the MLP training module in Figure 2. The standard back-propagation algorithm [49] is employed to learn the MLP weights (i.e., connections between the units of the input and hidden layers and connections between the units of the hidden and output layers, as shown in Figure 3) so that the classification error in the training data is minimized. The learned weights are then used to obtain the posterior probability vectors.
The augmented feature extraction consists of two different stages, which are applied to each of the $N_r$-dimensional raw feature vectors $x_{rf}$, as presented next.

Posterior Probability Vector Computation
From the raw feature vectors $x_{rf}$ and employing the weights computed during MLP training, the MLP calculates a posterior probability for each class to be recognized. This process is similar to the use of the MLP for classification, in which each raw feature vector is assigned the class with the highest posterior probability. However, instead of making a class decision for each raw feature vector, the MLP generates one posterior probability per class, as shown in Figure 3. These posterior probability values are then used as new features, hence building a set of $N_c$-dimensional posterior probability vectors, where $N_c$ is the number of different AAMI heartbeat classes.

Tandem Feature Vector Construction
This stage concatenates the original $N_r$-dimensional raw feature vectors $x_{rf}$ (those generated by the raw feature extraction module) and the $N_c$-dimensional posterior probability vectors computed by the MLP. Therefore, $(N_r + N_c)$-dimensional tandem feature vectors $x_{tf}$ are built, which are then used in the pattern classification system.
The ICSI QuickNet toolkit [50], which was originally developed for the tandem approach in speech recognition tasks, provides different tools for developing signal processing systems based on MLP strategies. Here, we have used the ICSI QuickNet toolkit with the default parameter values for MLP training, posterior probability vector computation and tandem feature vector construction.
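The two stages above can be sketched in a few lines. This is not the QuickNet implementation: the sigmoid hidden activation is an assumption, the function names are hypothetical, and randomly drawn weights stand in for the weights that back-propagation would learn.

```python
import math
import random

def softmax(z):
    """Numerically stable softmax: outputs are positive and sum to one."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def mlp_posteriors(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer MLP (sigmoid hidden units assumed)."""
    hidden = [1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + b)))
              for row, b in zip(w_hidden, b_hidden)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w_out, b_out)]
    return softmax(logits)

def tandem_vector(x_rf, posteriors):
    """Concatenate the raw features with the class posteriors."""
    return list(x_rf) + list(posteriors)

# Sizes taken from the text: N_r = 62 inputs, 100 hidden units, N_c = 5 classes.
rng = random.Random(0)
n_r, n_hidden, n_c = 62, 100, 5
# Placeholder weights; a trained MLP would supply these.
w_hidden = [[rng.gauss(0.0, 0.1) for _ in range(n_r)] for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)] for _ in range(n_c)]
b_out = [0.0] * n_c

x_rf = [rng.uniform(-1.0, 1.0) for _ in range(n_r)]   # synthetic raw features
posteriors = mlp_posteriors(x_rf, w_hidden, b_hidden, w_out, b_out)
x_tf = tandem_vector(x_rf, posteriors)                # 62 + 5 = 67 values
```

The raw features are left untouched; the five posterior values are simply appended, so any classifier trained on the raw vectors can be retrained on the tandem vectors without other changes.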

Pattern Classification
Gaussian mixture modelling is widely used in pattern classification tasks (e.g., speech recognition [51], image recognition [52], video recognition [53], etc.). For ECG arrhythmia identification, GMMs are a suitable tool because: (1) GMMs can be trained from a limited amount of data [54], as is the case for some heartbeat types present in the MIT-BIH Arrhythmia Database; (2) GMMs provide a simple strategy for classification, making them suitable for embedding the ECG arrhythmia identification system in a wearable device that aims to continuously monitor heart activity; and (3) GMMs can represent a large class of sample distributions (e.g., those corresponding to the training and testing data).
Therefore, for the GMM $\lambda_k$, where $k$ is one of the five heartbeat classes, the probability that a certain feature vector $x_{tf}$ belongs to the class represented by that model can be obtained. We will denote this probability as $p(x_{tf} \mid \lambda_k)$.

Training
From a subset of the heartbeats for a certain class $k$, which comprises the training subset, the training stage estimates the parameters (i.e., mean and covariance values) of each GMM $\lambda_k$ from the tandem feature vectors of that subset. To do so, the Expectation-Maximization algorithm [55], which makes use of a maximum likelihood criterion, is employed. This training stage is needed just once, after which the classification stage employs the set of trained GMMs. For the sake of simplicity, each GMM consists of a single Gaussian component.

Classification
Once the models have been trained, classification is conducted on a fully independent data subset, the so-called testing subset. The classification stage finds the class $\hat{c}$ whose model has the maximum posterior probability. Hence, for a given input tandem feature vector $x_{tf}$, Bayes' rule is applied as in Equation (10):
$$\hat{c} = \arg\max_k \, p(\lambda_k \mid x_{tf}) = \arg\max_k \, p(x_{tf} \mid \lambda_k) \qquad (10)$$
where we have considered a uniform prior probability for each class.
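Training and classification with one Gaussian per class can be sketched as below. With a single component, the maximum likelihood estimate sought by Expectation-Maximization is simply the sample mean and covariance. A diagonal covariance and a small variance floor are assumed here, since the covariance structure is not specified above; the function names and the two-dimensional toy data are hypothetical.

```python
import math

def fit_gaussian(vectors, var_floor=1e-6):
    """Maximum likelihood mean and diagonal variance of a list of vectors."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    var = [max(sum((v[j] - mean[j]) ** 2 for v in vectors) / n, var_floor)
           for j in range(d)]
    return mean, var

def log_likelihood(x, model):
    """log p(x | lambda_k) for a diagonal-covariance Gaussian."""
    mean, var = model
    return sum(-0.5 * (math.log(2.0 * math.pi * s) + (xi - m) ** 2 / s)
               for xi, m, s in zip(x, mean, var))

def classify(x, models):
    """Pick the class with the highest likelihood (uniform priors)."""
    return max(models, key=lambda k: log_likelihood(x, models[k]))

# Toy two-dimensional "tandem vectors" for two classes (synthetic data).
train = {
    "N": [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0], [0.1, 0.2]],
    "V": [[5.0, 5.2], [4.8, 5.1], [5.3, 4.9], [5.1, 5.0]],
}
models = {k: fit_gaussian(v) for k, v in train.items()}
predicted = classify([0.05, 0.0], models)
```

Working with log-likelihoods avoids numerical underflow for high-dimensional vectors, and it leaves the arg max unchanged because the logarithm is monotonic.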

Evaluation Metrics
The main metric used to test the system was the classification accuracy, which was computed as Equation (11):
$$\text{Acc} = \frac{\text{Correct}}{N} \qquad (11)$$
where Correct is the number of correctly classified testing heartbeats and $N$ represents the total number of testing heartbeats. We also present the confusion matrix, which shows the number of testing heartbeats of a given class that were classified as each of the considered AAMI classes, along with the sensitivity and specificity values for each class. The sensitivity was defined as Equation (12):
$$\text{Se}_k = \frac{TP_k}{TP_k + FN_k} \qquad (12)$$
where $TP_k$ is the number of true positive testing heartbeats for class $k$ (i.e., heartbeats of class $k$ that are correctly classified by the system) and $FN_k$ is the number of false negative testing heartbeats for class $k$ (i.e., heartbeats of class $k$ that are incorrectly classified by the system). It must be noted that the sensitivity coincides with the intra-class accuracy.
The specificity was calculated as Equation (13):
$$\text{Spe}_k = \frac{TN_k}{TN_k + FP_k} \qquad (13)$$
where $TN_k$ is the number of true negative testing heartbeats for class $k$ (i.e., heartbeats of all the classes except $k$ that are classified by the system as any class except $k$) and $FP_k$ is the number of false positive testing heartbeats for class $k$ (i.e., heartbeats that were incorrectly classified by the system as belonging to class $k$).
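As an illustration, the three metrics can be derived from a confusion matrix as sketched below. The function name is hypothetical and the three-class matrix contains made-up counts, not results from this work.

```python
def metrics(confusion, classes):
    """confusion[i][j]: beats of true class i that were classified as class j."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(classes)))
    out = {"accuracy": correct / total}
    for k, name in enumerate(classes):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                 # class k beats labelled otherwise
        fp = sum(row[k] for row in confusion) - tp  # other beats labelled as k
        tn = total - tp - fn - fp
        out[name] = {"sensitivity": tp / (tp + fn),
                     "specificity": tn / (tn + fp)}
    return out

# Made-up counts for three classes, for illustration only.
toy = [[8, 1, 1],
       [2, 6, 2],
       [0, 1, 9]]
result = metrics(toy, ["A", "B", "C"])
```

For class "A" in this toy matrix, 8 of the 10 true "A" beats are recovered (sensitivity 0.8) and 18 of the 20 beats of other classes are not labelled "A" (specificity 0.9). A class with no testing beats would need separate handling to avoid a division by zero.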

System Configuration
Regarding feature extraction, $N_h = 30$ Hermite functions were used since they were shown to optimally represent the vast majority of the heartbeats according to both the Bayesian Information Criterion (BIC) and the Akaike Information Criterion (AIC) [28]. This meant that we used $N_r = 62$-dimensional raw feature vectors, since two leads were used and each lead provides 30 Hermite coefficients and the $\sigma$ parameter. Then, the feature vector was augmented to $N_r + N_c = 67$-dimensional tandem feature vectors by adding the five posterior probabilities calculated by the MLP, one for each AAMI heartbeat class. The MLP training and posterior probability computation employed a hidden layer with 100 units.

Evaluation Strategy
Experiments were carried out following the training/testing data division presented in [56] (see Table 2). It must be noted that the work presented in [56] did not employ the paced recordings in the MIT-BIH Arrhythmia database. Since our work does employ those recordings (no recording from the database was excluded), we assigned two of these recordings to the training data (102 and 217) and the other two (104 and 107) to the testing data. Both the training and testing data sets were made up of 24 different recordings. The number of heartbeats that belonged to each AAMI class for both training and testing data can be found in Table 3. The training data were employed both for GMM and MLP training and the testing data were employed for GMM testing.
It must be noted that, although the same training data were employed to train the MLP used in the augmented feature extraction and to train the GMMs (from the training augmented feature vectors), the reported results are not biased by this, given that the final validation was carried out on the testing data, which were used to train neither the GMMs nor the MLP.

MLP-Based Experiments
An initial set of experiments was carried out to show the potential effectiveness of using the MLP-based features in the classification stage. These experiments employed the $N_r$-dimensional raw feature vectors as input for the MLP and carried out an MLP-based classification on the testing data. For classification, the class with the highest posterior probability was assigned to each heartbeat so that the performance could be evaluated. Results are presented in Table 4 and they showed that the sensitivity (i.e., intra-class accuracy) of the MLP was, in general, above chance (i.e., higher than 20%). The only exception was the S class, for which there were limited training data and which integrated the highest number of heartbeat morphological classes (see Table 1). This may have dramatically reduced the performance for that class due to both data scarcity and a blurred model. The results obtained for the other classes with the MLP classification provided optimism about the utility of the MLP-based features for improving the performance of our GMM-based classification system.

ECG Arrhythmia Identification System
The results for the raw feature extraction, which was considered as the baseline in this work, and the augmented feature extraction, are presented in Table 5. It must be noted that the raw feature extraction experiments employed the raw feature vectors for both GMM training (using the training recordings of Table 2) and classification (using the testing recordings of Table 2), hence matching the feature type as in the augmented feature extraction approach. Results showed that the augmented feature extraction approach significantly outperformed the raw feature extraction, with a 15.8% improvement in the accuracy. The confusion matrix for the raw feature extraction is presented in Table 6 and that of the augmented feature extraction is presented in Table 7.
Table 5. ECG arrhythmia identification results with the raw feature extraction alone (Raw) and the augmented feature extraction module (Tandem), with the best statistically significant results for each metric in bold font. 95% confidence intervals are also presented. 'Feat. extr.' stands for feature extraction, 'Se.' for sensitivity, 'Spe.' for specificity and 'Acc.' for accuracy.
Table 6. Confusion matrix of the ECG arrhythmia identification system with the raw feature extraction. The number of heartbeats that are classified as any of the considered classes is shown in each cell. The values between brackets represent the number of heartbeats that belong to the real class. Sensitivity and specificity percentage values are also provided for each class.

Discussion
The results of Table 5 show a 15.8% improvement in the accuracy of the augmented feature extraction when compared with the raw feature extraction. This indicates that the MLP is able to produce robust features that are suitable for GMM-based classification by providing information complementary to that present in the raw feature vector. This supports our hypothesis that tandem feature extraction can be useful for ECG analysis, in the same way that it has proven useful in the analysis of other physiological signals such as EEG and speech [32,40].
However, when we also consider the intra-class contribution (see Tables 5-7), we can see that not all heartbeat types improve their performance with the incorporation of the tandem features. Although for the 'N', 'F' and 'Q' classes the augmented feature set improves the corresponding intra-class accuracy, for the 'S' and 'V' classes, the opposite occurs. We consider that this may be due to the fact that the 'S' class integrates a large variety of heartbeat morphology types (see Table 1), which may produce less robust MLP-based features and a more blurred GMM. This is supported by the fact that this class obtained the worst intra-class accuracy in the MLP-based experiments (see Table 4). Furthermore, we must consider the fact that the tandem feature vector includes only features extracted from the QRS morphology. The QRS is the most significant feature of the heartbeat and the most relevant for the identification of arrhythmias, with the possible exception of the 'S' heartbeats. These heartbeats originate in the upper chambers of the heart and usually propagate through the ventricles in a similar way to normal ones, whereas the QRS only captures information related to propagation through the ventricles. Therefore, it is often not possible to distinguish between 'S' and 'N' heartbeats using only the QRS morphology. This is consistent with the results in Table 7, which show that most 'S' heartbeats have been classified as 'N' heartbeats, probably because their QRS morphology was similar to that of an 'N' heartbeat. Given the extreme difficulty of reliably identifying and extracting the electrical information of the propagation of the heartbeat through the atria (the P wave), this is often mitigated by incorporating information related to the distance between each heartbeat and the previous ones [28,30,56]. It is likely that incorporating this type of information into the tandem feature vector would yield better performance for this heartbeat type.
In Tables 6 and 7, it can be clearly seen that the classes for which limited training data are available due to the lower prevalence of those heartbeat types (i.e., 'S', 'F' and 'Q') obtain the worst performance, and they are mainly confused with the class for which more training data are available (i.e., 'N'). Something similar happens with the MLP results (see Table 4).
It should be noted that the classification of most of the 'F' heartbeats as 'N'/'V' heartbeats is expected, since the former type of heartbeat happens when supraventricular and ventricular impulses concur, hence producing a hybrid complex. In addition, the 'Q' heartbeat in the raw feature extraction (see Table 6) is confused with the 'V' heartbeat, which could be due to the complexity of both paced and unclassifiable heartbeats when using a less robust feature extraction approach.

Conclusions and Future Work
This paper has evaluated whether the tandem feature extraction approach is useful for ECG arrhythmia identification. To do so, MLP-based posterior features have been integrated into the feature extraction of a GMM-based arrhythmia identification system. While the use of tandem feature extraction is common in other application domains and it has previously been used in the analysis of other physiological signals such as EEG and speech, to the best of our knowledge it has never been applied to ECG analysis for arrhythmia identification. Our approach consists of adding the posterior probabilities from an MLP as features to the feature vector representing each of the heartbeats. To represent the morphology of each heartbeat we have used the coefficients of a regression based on Hermite functions. Our results have shown that the augmented feature extraction significantly outperforms the Hermite representation alone (15.8% improvement in accuracy), and that the heartbeat types for which more training data are available benefit more from this approach.
This result suggests that our approach could benefit from the use of data augmentation techniques to handle class imbalance [57,58]. The introduction of features related to the distance between heartbeats should also be considered to be able to better distinguish the 'S' heartbeats. Finally, the use of other types of classifiers, such as those derived from deep learning techniques, could improve the performance of the arrhythmia identification system, although at the cost of higher computational requirements.