Electrocardiographic Fragmented Activity (II): A Machine Learning Approach to Detection

Abstract: Hypertrophic cardiomyopathy is a relatively common disease associated with the risk of sudden cardiac death, heart failure and stroke. It is characterized by the excessive deposition of collagen among healthy myocardium cells. This condition, medically known as fibrosis, creates effective conduction obstacles in the electrical path through the myocardium and, when severe enough, manifests as additional peaks or notches in the QRS, clinically termed fragmentation. Nowadays, fragmentation detection is performed by visual inspection, but a fragmented QRS can be confused with the noise present in the electrocardiogram (ECG). Fibrosis, in turn, is detected by magnetic resonance imaging with late gadolinium enhancement, whose main drawback is its cost in terms of time and money. In this work, we propose two automatic algorithms, one for fragmented QRS detection and another for fibrosis detection. For this purpose, we used four different databases: the subrogated database described in the companion paper and three additional ones, one composed of more accurate subrogated ECG signals and two composed of real records from affected subjects as labeled by expert clinicians. The first real-world database contains fragmented QRS records and the second contains records with fibrosis; both were recorded at Hospital Clínico Universitario Virgen de la Arrixaca (Spain). To analyze the scope of these datasets in depth, we benchmarked several classifiers, such as Neural Networks, Support Vector Machines (SVM), Decision Trees and Gaussian Naïve Bayes (NB). For the fragmentation dataset, the best results were 0.94 sensitivity, 0.88 specificity, 0.89 positive predictive value, 0.93 negative predictive value and 0.91 accuracy, obtained with an SVM with Gaussian kernel.
For the fibrosis database, more limited accuracy was reached: 0.47 sensitivity, 0.91 specificity, 0.82 positive predictive value, 0.66 negative predictive value and 0.70 accuracy with Gaussian NB. Nevertheless, this is the first time that fibrosis detection has been attempted automatically from ECG postprocessing, paving the way towards improved algorithms and methods. We can therefore conclude that the proposed techniques could offer clinicians a valuable support tool for both fragmentation and fibrosis diagnosis.


Introduction
The heart is the principal element of the circulatory system and is divided into four chambers: the right and left atria, at the upper part of the heart, and the right and left ventricles, at the lower part. The right side of the heart receives deoxygenated blood from the whole body and carries it to the lungs to be oxygenated, whereas the left side receives the oxygenated blood from the lungs and sends it to the whole body. This process is driven by the heartbeat, which originates in the sinoatrial node, a small mass of tissue located in the right atrium. This node acts as a pacemaker, creating the basic electrical impulse that travels to the atrioventricular node, a mass of tissue located between the atria and the ventricles. In this node, the impulse is delayed, allowing blood to flow from the atria to the ventricles. Finally, the impulse is conducted to the His bundle, where it divides into two branches in order to reach both ventricles simultaneously [1]. As a representation of the real electrical activity and operational behavior of the heart, the electrocardiogram (ECG) has proven to be the most widespread tool for fast diagnosis of heart function. This test consists of several electrodes attached to the patient's chest and limbs, which record different projections of the cardiac electrical activity. It allows clinicians to easily detect a wide range of illnesses, such as rhythm abnormalities like tachycardia or bradycardia, electrical conduction abnormalities like bundle branch blocks, or markers of sudden cardiac death, among others [2,3].
In this work, we address fibrosis, which is the appearance of non-conductive (fibrous) tissue patches among the normal myocardium tissue. These patches act as obstacles to normal myocardial electrical conduction and increase the risk of suffering different kinds of arrhythmias [4], even those that cause sudden cardiac death. On the other hand, recent research [5][6][7] suggested that the presence of fragmentation, an ECG feature manifesting as a number of extra peaks and deflections in the QRS complex, is related to the presence of myocardial fibrosis. Nowadays, fragmentation is detected by visual inspection of the ECG, and this method presents two main drawbacks, namely, several different definitions of the condition coexist in the literature and the inter-observer error is high, because this characteristic can be confused with noise or artifacts in the ECG. Fibrosis, in turn, is detected by magnetic resonance imaging with late gadolinium enhancement (MRI-LGE), which is an expensive test in terms of time and money. Therefore, this work has two main objectives, namely, the development of an algorithm allowing the systematic detection of fragmentation in the ECG, and the evaluation of the detection power of these very same techniques to identify the visually unnoticeable effects of fibrosis on digital registers.
In the companion paper [8], we scrutinized the behavior of two frequently used multivariate transforms, principal component analysis (PCA) and independent component analysis (ICA), over a fragmented subrogated model of the ECG. That work proved that these transforms can enhance the presence of fragmentation waves in the computed components. In the present work, we extend the use of these transformations to compute features that can feed a classifier for fragmentation and fibrosis detection. This paper is structured as follows: Section 2 describes the available literature with regard to fragmentation and fibrosis detection in ECG digital registers; Section 3 describes the four databases used in this work, as well as the different proposed methods; Section 4 presents the results of each proposed method over the previously described databases; finally, Section 5 presents the conclusions together with suggested future work in this area.

Background
Very limited literature has been published on this topic, especially regarding the automatic detection of singularities such as fragmentation, and almost no paper on the systematic automatic detection of fibrosis using the ECG was found in our search of precedents. In particular, and to the best of our knowledge, there are just four main works available on fragmentation detection. In Reference [9], the authors proposed a method based on the discrete wavelet transform (DWT).
The algorithm first applies a signal-conditioning stage with denoising and detrending. Then, the DWT is computed using the Haar wavelet and the resulting coefficients are interpolated. Finally, fragmentation is detected and classified according to the values of the coefficients close to the zero-crossing points. This method was tested over 31 registers from the PTB database from Physionet, achieving 0.90 sensitivity and 0.90 specificity.
In Reference [10], the authors presented an algorithm based on intrinsic time-scale decomposition (ITD). First, the algorithm computes the first four ITD components from the ECG. Second, the ECG is delineated according to the second and third ITD components. Then, the fragmentation index is computed by averaging, over the QRS, the ratio between the wave speed and the absolute amplitude of the half wave, both obtained from the first ITD component. Finally, the algorithm was tested over the records of the PTB Physionet database, limited to cases meeting the following criteria: QRS duration lower than 120 ms and noise level, computed as the standard deviation of the first 80 ms of the record, less than 0.02 mV. Results showed an area under the receiver operating characteristic curve of 0.96.
In Reference [11], the authors proposed an algorithm based on the stationary wavelet transform (SWT). In this case, the ECG is detrended using median filters. Then, fiducial point detection is performed based on zero-crossing techniques. Finally, fragmented QRS are detected by computing the SWT with the Haar wavelet for each beat, finding the zero-crossings of the transformed signal and applying certain morphology-based rules. Results over 51 patients from University Hospital of Southampton showed 0.90 accuracy.
In Reference [12], the authors proposed a method to detect fragmentation. Their algorithm first applies a pre-processing stage where the signal is band-pass filtered and the baseline wander is removed. Afterwards, the beats are detected and delineated using variational mode decomposition (VMD). Then, ten features are computed from the signal and from the VMD output, namely: the average slopes of the rising and falling flanks of the QRS; the values of the linear fit of the averaged slopes; the peaks and main frequencies of the 3rd, 4th and 5th VMD components; and the number of peaks of the QRS. Finally, a number of classifiers were tested using these features over a private database of 616 records labeled by 5 experts. The final algorithm exhibited 0.86 sensitivity, 0.89 specificity and 0.88 accuracy.
With regard to fibrosis, and as mentioned earlier, to our knowledge there is no published paper devoted to automatic detection of fibrosis. Hence, this would be a new research line pioneered by the present paper.
In this work, we used linear and non-linear classifiers to detect both fragmentation and fibrosis. We selected classifiers that have previously shown good performance in similar bio-signal processing fields, which are: the support vector machine (SVM), used for vasovagal syncope detection [13]; the K-nearest neighbour (KNN), applied in beat classification tasks [14]; the multilayer perceptron (MLP), used for arrhythmia classification in implanted cardioverter defibrillators [15]; the naïve Bayes (NB), also applied in arrhythmia classification tasks [16]; and decision trees (DT), used for lung cancer detection [17].

Materials and Methods
This section is divided into three main parts describing the databases, the classifiers used and the feature spaces. In the first subsection, the following four databases are described in detail: the subrogated fragmented database (Sfrag-DB), the subrogated wide-fragmented database (SWfrag-DB), the fragmented database (FHCM-DB) and the fibrosis database (HCM-DB). The second subsection summarizes the principles of the classifiers applied in this research, namely, SVM, MLP, KNN, DT and Gaussian NB. The last subsection is devoted to describing a systematic approach to the different possible input spaces and to the signal processing done at this stage, such as the previous normalization or the construction of the features themselves.

Databases
As mentioned above, in this work we used four different databases, namely, Sfrag-DB, SWfrag-DB, FHCM-DB and HCM-DB. Each of them includes both affected and control recordings in order to be able to apply and benchmark the selected learning techniques.
The Sfrag-DB was created using control records (from healthy subjects) and modifying them by the synthetic incorporation of fragmented waves. This added element was generated according to y(t) = A sin(nπt/2w), where A is a random number between 1% and 30% of the main peak amplitude, t is the time vector with values going from 0 to 2w, w is a random number between 4 and 24 ms that represents the fragmentation duration and n is the number of semi-cycles of the sinusoid. This database was developed to include a wide variety of values for the variables in the equation, so as to cover any imaginable composition of the fragmentation: (i) from almost invisible amplitudes to implausibly large ones; (ii) from low frequencies to very high ones; (iii) from just a fraction of a wavelength to a number of them. This sinusoid was applied to the first 150 ECG of 418 control ECG recorded from students of Universidad Católica San Antonio (Murcia, Spain). These ECG were recorded using an ELI 350 from Mortara™, a device with a 500 Hz sampling frequency and 5 µV resolution. The demographic data of this ECG population were a men-to-women ratio of 314/104 and an age of 23.1 ± 4.3 years. The fragmented positive-to-negative ratio was 150/150. An example of a control (fragmented) record from this database can be seen in Figure 1a,b.

The SWfrag-DB was created to simulate the effect of wide miss-conduction areas of the heart. Hence, we used the same control database, this time synthetically fragmented using the sinusoid y(t) = A sin(2πft), where A is a random number between 1% and 30% of the main peak amplitude, t is the time vector with values going from 0 to 2w, w is a random number between 20 and 40 ms that represents the fragmentation duration and f is a random number from 45 to 75 representing the frequency, in Hertz, of the sinusoid.
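As an illustration, the subrogated fragmentation wave can be sketched from the parameters described above (amplitude A, duration parameter w, n semi-cycles over a time vector from 0 to 2w). The closed form y(t) = A sin(nπt/2w) and the range of n are assumptions of this sketch, not necessarily the paper's exact expression:

```python
import numpy as np

def fragmentation_wave(A, w, n, fs=500):
    # Sinusoid with n semi-cycles over [0, 2w): the phase reaches n*pi at t = 2w.
    t = np.arange(0, 2 * w, 1 / fs)          # time vector, 0 to 2w seconds
    return A * np.sin(n * np.pi * t / (2 * w))

rng = np.random.default_rng(0)
A = rng.uniform(0.01, 0.30)    # 1%-30% of a unit main-peak amplitude
w = rng.uniform(0.004, 0.024)  # fragmentation duration parameter, 4-24 ms
n = rng.integers(1, 5)         # number of semi-cycles (range assumed)
wave = fragmentation_wave(A, w, n)
```

The wave would then be added to a clean QRS segment of a control record to produce a subrogated fragmented beat.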
An example of a record from this database can be seen in Figure 1c. The FHCM-DB was created using 80 standard 12-lead ECG records from patients positively diagnosed with HCM. For the development of this database, the presence of fragmentation was identified by two independent clinical reviewers from Hospital Clínico Universitario Virgen de la Arrixaca (Murcia, Spain). This batch was selected from a larger set of 225 cases. The demographic data of the whole database were 47.0 ± 15.9 years and a men-to-women ratio of 152/73. These ECG were recorded with a GE-MAC 5000 from General Electric™ at a 500 Hz sampling rate and 4.88 µV resolution. The fragmented positive-to-negative ratio was 42/38. An example of a record from this database can be seen in Figure 1d.
The HCM-DB comprised the fibrosis cases; it is a subset of 300 records from a larger database of 1750 analyzed HCM patients, including patients positively diagnosed with fibrosis using MRI-LGE. These records were collected, analyzed and diagnosed by the expert clinicians in our group, from Hospital Clínico Universitario Virgen de la Arrixaca (Murcia, Spain). The signal recorder was a PageWriter TC30 from Philips™ at a 500 Hz sampling rate and 5 µV resolution. The demographic data of the whole database were 55.3 ± 16.4 years and a men-to-women ratio of 651/1099. The fibrosis positive-to-negative ratio was 150/150. An example of a record from this database can be seen in Figure 1.

Classifiers
In this work, we used five different classifiers to evaluate the diagnostic capability of each of them to detect fragmentation or fibrosis. The implemented classifiers were SVM, MLP, KNN, DT and NB, and all of them were developed and tested according to the recommendations in [18].
Before explaining each classifier, some shared mathematical concepts need to be summarized. First, our observations are defined in terms of their features, which are numeric values that shape the observations, and they are labelled according to the class to which they belong. Moreover, hereafter we work with binary problems. Therefore, from a mathematical point of view, our data set can be defined as {(x_i, y_i)}, i = 1, ..., N, where x_i (y_i) is the feature vector (label) of the i-th observation. Second, all of these classifiers provide a final classification function, computed using the training observations x_i, that determines the label of a new unknown observation, denoted as x. Third, most of these classifiers are based on a training-test scheme and must be trained with a subset of observations to set their internal parameters and thus define their classification function. As said before, each observation is defined by its features; hence, the input space is often also called the feature space.
The SVM, proposed in [19], is a classifier with at least three interesting properties, namely, a tractable optimization formulation, tractable complexity control and flexible non-linear parametrization. The basic idea behind the SVM is the computation of a hyperplane that splits the input space according to the observed classes. For the reader's convenience, some basic cases are explained next. Let us consider a linearly separable binary classification problem in which the classification function is given by C(x) = sign(w^T x + b), where the separator hyperplane is H : w^T x + b = 0, w is the normal vector to this plane and b is the bias or independent term. According to this equation, infinitely many separator hyperplanes exist; therefore, we must define the optimal one. It is advantageously defined as the hyperplane that maximizes the distance to the closest observations of both classes, which are named the support vectors.
After working out the calculus, the solution can be written as w = Σ_i α_i y_i x_i, so that C(x) = sign(Σ_i α_i y_i ⟨x_i, x⟩ + b), where α_i is the i-th Lagrange multiplier, which is greater than zero exactly when the i-th observation is a support vector. As can be seen, the classification function C(x) is computed from the inner products between the support vectors (those observations with α_k ≠ 0) and the observation x that we want to classify. This solution is known as hard-margin classification and an example of this kind of problem can be seen in Figure 2a, where the highlighted points are the support vectors, the solid line represents the optimal separator hyperplane and the dotted lines represent the optimized margin.
As can be seen, the SVM provides the classifier equation as a function of the inner products of the training observations. It should be noticed that, in the real world, the previous solution cannot be computed as cleanly as expected, due to classification errors and noisy observations. To deal with this, a new element needs to be added to the formulation: slack variables. The solution derived from this approach is known as soft margin, and the margin constraints are relaxed to y_k(w^T x_k + b) ≥ 1 − ζ_k, with ζ_k ≥ 0, where ζ_k is the slack variable associated with the k-th observation and the sum of the slacks is penalized in the objective. The hard- and soft-margin approaches present a similar classification function, which only depends on the inner products between the support vectors and the observation to be classified. An example of this kind of problem can be seen in Figure 2b, where the highlighted points are the support vectors, the solid black line represents the optimal separator hyperplane, the solid colored lines represent the slack variables and the dotted lines represent the optimized margin. The SVM is a powerful tool for linear problems, yet a classical question is how it deals with non-linear ones. As we have seen, the SVM can solve linearly separable problems; hence, the way to handle non-linear problems is to use a function φ(x) that maps the feature space into a higher-dimensional space where the problem is linearly separable. The problem is then to find a function φ(x) meeting the following equation, K(x_i, x_j) = ⟨φ(x_i), φ(x_j)⟩,
where K is the kernel function, which represents the inner product in the higher-dimensional space. As can be observed, finding a φ(·) that meets the previous equation can be difficult; therefore, we can apply Mercer's Theorem, which allows us to use a function as a kernel if it is symmetric and positive semi-definite. An example of the use of a kernel function to solve a non-linear problem is shown in Figure 2c, where the highlighted points are the support vectors, the solid line represents the optimal separator hyperplane and the dotted lines represent the optimized margin. The kernel functions used in this work are described hereafter. First, the linear kernel is the inner product, K(x_i, x) = x_i^T x, where x_i is the i-th observation of the training set and x is a new observation. In a similar way, the Gaussian kernel maps the input space and performs the product in the feature space according to K(x_i, x) = exp(−||x_i − x||² / (2σ²)), where x_i is the i-th observation of the training set, x is a new observation and σ is the width of the Gaussian, given as an SVM input.

The second classifier is the MLP [20]. This model is inspired by biological neural systems and is structured as several perceptrons, or neurons, arranged in interconnected layers. Each neuron is composed of four different parts, namely: input weights, which assign importance to the inputs; a bias, which acts like a threshold; an adder, which sums all the weighted inputs and the bias; and an activation function, which gives the output of the neuron. Mathematically, the output of a single neuron can be expressed as o_i = g(Σ_{j=1..P} w_j x_ij + w_0), where g(·) is the activation function, P is the number of inputs, w is the vector of weights associated with the inputs, x_ij is the j-th feature of the i-th observation and w_0 is the bias. The most common activation functions are the linear function, g(z) = z, the step function, g(z) = 1 if z ≥ 0 and 0 otherwise, and the sigmoid function, g(z) = 1/(1 + e^(−z)).
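The effect of the kernel trick can be illustrated on a toy problem that is not linearly separable, such as two concentric classes. scikit-learn's SVC is used here as a stand-in for the SVM implementations discussed in the text; its gamma parameter corresponds to 1/(2σ²) in the Gaussian kernel above:

```python
import numpy as np
from sklearn.svm import SVC

# Two concentric classes: not linearly separable in the input space, but
# separable after the Gaussian-kernel mapping.
rng = np.random.default_rng(1)
ang = rng.uniform(0, 2 * np.pi, 400)
r = np.r_[rng.uniform(0.0, 1.0, 200), rng.uniform(2.0, 3.0, 200)]
X = np.c_[r * np.cos(ang), r * np.sin(ang)]
y = np.r_[np.zeros(200), np.ones(200)]

lin_acc = SVC(kernel="linear").fit(X, y).score(X, y)    # a hyperplane cannot split rings
rbf_acc = SVC(kernel="rbf", gamma=0.5).fit(X, y).score(X, y)
```

The Gaussian kernel reaches near-perfect training accuracy on this problem, while the linear hyperplane stays close to chance level.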
As said before, these neural networks are formed by aggregating neurons into layers. For a network of L layers with P neurons per layer, the classification function can be written as C(x_i) = g(Σ_{j=1..P} w^L_j o^{L−1}_j + w^L_0), where g is the activation function, w^L_j is the weight associated with the j-th input of the output layer and o^{L−1}_j, which is a function of x_i, is the j-th output of the previous layer. An example of an MLP neural network can be seen in Figure 3. In order to use the MLP as a classifier, a training stage is needed, where each bias and weight of every perceptron in the classifier is adjusted. The algorithm used to perform this training stage comprises several steps: (i) all the weights and biases are initialized; (ii) the outputs are computed for each observation of the training subset; (iii) the squared error is computed for these observations; (iv) the generalized delta rule is used to modify the weights and biases; (v) steps (ii), (iii) and (iv) are repeated until all the training observations have been used; (vi) the total error of each training iteration is computed; (vii) if the total error is less than or equal to a threshold, training is finished; otherwise, steps (ii) to (vi) are repeated until the threshold is satisfied or the maximum number of iterations is reached.
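A minimal example of an MLP used as a binary classifier is the XOR-like problem, which no single perceptron can solve. The hyperparameters below (hidden layer size, solver) are illustrative choices for the sketch, not the ones used in this work:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# XOR-like labels: class 1 when the two coordinates have opposite signs.
rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, (400, 2))
y = (np.sign(X[:, 0]) != np.sign(X[:, 1])).astype(int)

# One hidden layer of sigmoid (logistic) units, as in the formulation above.
mlp = MLPClassifier(hidden_layer_sizes=(16,), activation="logistic",
                    solver="lbfgs", max_iter=2000, random_state=0).fit(X, y)
acc = mlp.score(X, y)
```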
Another option is the NB classifier [20], which is based on Bayes' Theorem; the word naïve refers to the assumption of statistical independence among the variables. The NB classifies using the probability of belonging to a class. From a mathematical point of view, the probability that an observation modelled by its feature vector x_i belongs to a class c can be written as P(c | x_i) = P(c) P(x_i | c) / P(x_i). The previous equation cannot be solved without considering statistical independence among the features, but by applying this assumption we can rewrite it as P(c | x_i) ∝ P(c) Π_j P(x_ij | c), where P(c) is the probability of a class, a constant that can be computed as 1/#classes or as the probability of the class in the training data, and P(x_ij | c) can be computed as a Gaussian function, according to P(x_ij | c) = (1/√(2π σ_c²)) exp(−(x_ij − µ_c)² / (2σ_c²)), where σ_c² is the variance of the j-th feature, computed using the training data, µ_c is the mean of the j-th feature, also computed using the training data, and x_ij is the value of the j-th feature of the test observation. Finally, the decision rule can be written as ĉ = arg max_k P(c_k) Π_j P(x_ij | c_k), where P(c_k) is the probability of the k-th class and x_ij is the j-th feature of the i-th observation.
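The Gaussian NB decision rule above can be implemented directly from per-class means, variances and priors; the sketch below (with a small variance floor added for numerical stability, an implementation detail not in the text) is checked against scikit-learn's GaussianNB:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def gnb_predict(Xtr, ytr, Xte):
    # log P(c) + sum_j log P(x_j | c) with per-class Gaussian likelihoods.
    classes = np.unique(ytr)
    log_post = []
    for c in classes:
        Xc = Xtr[ytr == c]
        mu, var = Xc.mean(axis=0), Xc.var(axis=0) + 1e-9   # variance floor
        prior = np.log(len(Xc) / len(Xtr))                 # P(c) from training data
        ll = -0.5 * np.log(2 * np.pi * var) - (Xte - mu) ** 2 / (2 * var)
        log_post.append(prior + ll.sum(axis=1))
    return classes[np.argmax(log_post, axis=0)]            # arg max_k decision rule

rng = np.random.default_rng(3)
X = np.r_[rng.normal(0, 1, (100, 3)), rng.normal(2, 1, (100, 3))]
y = np.r_[np.zeros(100), np.ones(100)]
agreement = (gnb_predict(X, y, X) == GaussianNB().fit(X, y).predict(X)).mean()
```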
Another widely used classifier is the KNN [21], which belongs to the so-called lazy learning classifiers, since the solution is not computed from the whole training dataset but only as a discriminative function of the subset of training observations closest to the test observation. The classification process follows these steps: first, the training dataset is mapped to the feature space; second, when a new observation is to be classified, the KNN computes its distance to every training point; finally, the classification is made taking into account the labels of the k nearest points. The distance definition can vary, but the Euclidean distance is the most used, and it can be expressed as d(x_i, x_j) = √(Σ_{d=1..D} (x_id − x_jd)²), where D is the number of dimensions of the observations and x_i and x_j are the points between which the distance is computed. Moreover, there is a collection of algorithms to label an observation according to its neighbors; one of the most used performs a poll among the k nearest neighbors, that is, Vote(c, x_i) = Σ_{k=1..K} α(x_k, c), where Vote(c, x_i) is the voting value of the c-th class for observation x_i, x_i is the observation we want to classify, x_k is the k-th nearest neighbor and α is a constant that is 1 when x_k belongs to class c and zero otherwise. The final classification takes the label reaching the highest value of Vote(c, x_i).

DT are classification algorithms that recursively split the input space [20]. A DT is composed of three different parts, namely, the root node (which has no inputs), the leaf nodes (which have one input and no outputs) and the internal nodes (which have one input and several outputs). Each internal node divides the input space into two or more semi-spaces according to a discrete function of the features.
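The Euclidean-distance KNN with majority polling described earlier in this section can be sketched in a few lines:

```python
import numpy as np

def knn_predict(Xtr, ytr, x, k=3):
    d = np.sqrt(((Xtr - x) ** 2).sum(axis=1))  # Euclidean distance to every training point
    nearest = ytr[np.argsort(d)[:k]]           # labels of the k closest observations
    vals, counts = np.unique(nearest, return_counts=True)
    return vals[np.argmax(counts)]             # majority poll among the neighbours

Xtr = np.array([[0., 0.], [0., 1.], [5., 5.], [5., 6.]])
ytr = np.array([0, 0, 1, 1])
label = knn_predict(Xtr, ytr, np.array([4.5, 5.0]), k=3)  # nearest cluster is class 1
```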
Many algorithms to generate a binary DT are based on Hunt's algorithm, a simple procedure consisting of the following steps: first, if all the observations in a node belong to the same class, the node is marked as a leaf; second, if not, the node is marked as internal and the observations are divided using attribute test conditions; this process iterates until all the observations are classified. Nowadays, the algorithms that build a DT are very efficient and implement stopping criteria to prevent overfitting.
As said before, we need attribute test conditions, which are the way of splitting the observations in a node according to the values of their features, as shown in Figure 4. In order to select the best split, we need a variable that takes into account the impurity, a measure of the mixture of classes present in a node, of the parent and child nodes. This variable is known as the gain, ∆, and it is computed as ∆ = I(parent) − Σ_{j=1..K} (S_{v_j}/S) I(v_j), where I(parent) is the impurity measured in the parent node, K is the number of child nodes, S_{v_j} is the number of observations in the child node v_j, S is the number of observations in the parent node and I(v_j) is the impurity of child node v_j. The impurity can be computed in different ways and, in this work, we used the Gini index, I(t) = 1 − Σ_{i=1..C} p(i|t)², where C is the number of classes in the subset, i is the class, t is the selected node and p(i|t) is the fraction of observations of class i in node t. The main drawback of DT is their tendency to grow too much. Hence, existing techniques must be implemented, such as pruning algorithms that remove the branches with lower information, or setting maximum values for the parameters that control the growth, such as the maximum depth or the minimum number of observations per leaf.
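The gain computation can be sketched numerically. Gini impurity is used here as the impurity measure (an assumption for the sketch; any impurity measure fits the same gain formula):

```python
import numpy as np

def gini(labels):
    # I(t) = 1 - sum_i p(i|t)^2
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - (p ** 2).sum()

def gain(parent, children):
    # Delta = I(parent) - sum_j |S_vj| / |S| * I(vj)
    S = len(parent)
    return gini(parent) - sum(len(c) / S * gini(c) for c in children)

parent = np.array([0, 0, 1, 1])
pure_split = gain(parent, [np.array([0, 0]), np.array([1, 1])])   # perfect split
bad_split = gain(parent, [np.array([0, 1]), np.array([0, 1])])    # uninformative split
```

A perfect split of this parent yields a gain of 0.5 (the parent's full impurity), while an uninformative split yields zero gain.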

Processing and Features Spaces
The data processing comprises several steps over our ECG data: (i) low-order band-pass filtering from 0.5 to 100 Hz, in order to reduce the out-of-band noise without endangering the characteristics of fragmentation (fibrosis); (ii) notch filtering to reduce the power-line interference; (iii) baseline wander removal, based on cubic spline interpolation [22]; (iv) a QRS-detector stage, based on a tailored version of the Pan-Tompkins algorithm developed by our group [23]; (v) a beat template stage, where beat templates are created by averaging the well-correlated QRS complexes, which reduces the total amount of noise present in the ECG without noticeable distortion [24].
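Steps (i) and (ii) can be sketched with standard IIR filters. The filter order, the notch quality factor and the 50 Hz mains frequency are assumptions of this sketch, not the paper's exact design:

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

def preprocess(ecg, fs=500):
    # (i) 0.5-100 Hz band-pass; (ii) power-line notch (50 Hz assumed).
    b, a = butter(2, [0.5, 100], btype="bandpass", fs=fs)
    ecg = filtfilt(b, a, ecg)            # zero-phase, so the QRS is not shifted
    bn, an = iirnotch(50, Q=30, fs=fs)
    return filtfilt(bn, an, ecg)

fs = 500
t = np.arange(0, 2, 1 / fs)
clean = np.sin(2 * np.pi * 10 * t)                  # in-band component
noisy = clean + 0.5 * np.sin(2 * np.pi * 50 * t)    # add mains interference
residual = np.abs(preprocess(noisy) - clean)[fs // 2:-fs // 2].max()  # ignore edges
```

The mains component is removed by the notch while the in-band component passes almost unchanged.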
Once the signals have been properly conditioned and key variables such as fiducial points have been carefully isolated, the environment is set for the feature space analysis. To do so, the feature-space calculation is divided into three further stages: (i) a transformation stage, where the signal is processed according to the multivariate transformations presented in the companion paper [8]; (ii) a signal selection stage, where a slot of 140 ms or 700 ms around the ECG main peak is set and normalization is applied, if required; and (iii) feature computation, which comes in two different flavors, namely, statistical features and signal-sample-based features.
First, in order to broadly analyze the signal, statistics (stats) were computed in four separate situations, namely, over the whole signal, over the 25 Hz low-pass filtered signal, over the 25 to 75 Hz band-pass filtered signal and over the 75 Hz high-pass filtered signal. For each of them we computed: (i) the average; (ii) the standard deviation, which represents the power of the signal; (iii) the skewness of the windowed signal, which accounts for the position of the maximum of the windowed signal distribution; (iv) the kurtosis, which represents the shape of the windowed signal distribution; and (v) the number of maxima present in the windowed signal. All of these features were computed according to Reference [25]. On the other hand, the signal-sample-based features were three, namely, the aggregation of the ECG samples across all components (Sum), the aggregation of the squared values of the ECG samples across all components (PowSum) and the concatenation of the samples of each ECG component into one single vector (Concat).
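The per-band statistical features can be sketched as follows; the Butterworth filter order is an assumption of the sketch:

```python
import numpy as np
from scipy.signal import argrelmax, butter, filtfilt
from scipy.stats import kurtosis, skew

def stats_features(x):
    # The five statistics: mean, std, skewness, kurtosis, number of local maxima.
    return np.array([x.mean(), x.std(), skew(x), kurtosis(x), len(argrelmax(x)[0])])

def band_features(x, fs=500):
    bands = [x]                                   # whole signal
    for spec, btype in [(25, "lowpass"), ([25, 75], "bandpass"), (75, "highpass")]:
        b, a = butter(4, spec, btype=btype, fs=fs)
        bands.append(filtfilt(b, a, x))
    return np.concatenate([stats_features(s) for s in bands])  # 4 bands x 5 stats

feats = band_features(np.sin(2 * np.pi * 10 * np.arange(0, 1, 1 / 500)))
```

Each windowed segment thus yields a 20-dimensional statistical feature vector.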

Results
The presentation of our results is divided into four subsections. The first subsection is devoted to fragmentation detection based on linear models, describing the experiments using linear classifiers over the two fragmented databases. The second subsection, focused on feature relevance, shows the statistical relevance of the best linear classifier and a new subrogated model that takes these results into account. The third subsection presents the benchmarking of linear classifiers on the fibrosis database, where the statistical relevance of the best performing features is additionally addressed. Finally, the fourth subsection scrutinizes non-linear detection and presents in detail the results of the non-linear classifiers on the fibrosis and fragmented databases.
In all cases, we benchmarked the classifiers using the figures of merit usually applied for clinical purposes, namely, sensitivity (Sen), specificity (Spe), positive predictive value (PPV), negative predictive value (NPV) and accuracy (Acc). The interpretation of these parameters is extensively described in the literature [26]; they are computed as Sen = TP/(TP + FN), Spe = TN/(TN + FP), PPV = TP/(TP + FP), NPV = TN/(TN + FN) and Acc = (TP + TN)/(TP + TN + FP + FN), where TP (TN) is the number of records marked as pathological (non-pathological) by both the clinician and the classifier, and FP (FN) is the number of records marked as non-pathological (pathological) by the clinicians and as pathological (non-pathological) by the classifier. As mentioned in Section 3, segment preprocessing was applied prior to classification. Segments were initially set to 700 ms around the main peak. This segment is referred to as the non-normalized beat before normalization and the normalized beat after it. With the same criteria, segments of 140 ms around the main peak are hereafter called the non-normalized QRS when normalization is not yet applied and the normalized QRS once it is statistically normalized.
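From the TP, TN, FP and FN counts, the five figures of merit can be computed as in this sketch:

```python
import numpy as np

def merit_figures(y_true, y_pred):
    # 1 = pathological, 0 = non-pathological.
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    TP = int(((y_true == 1) & (y_pred == 1)).sum())
    TN = int(((y_true == 0) & (y_pred == 0)).sum())
    FP = int(((y_true == 0) & (y_pred == 1)).sum())
    FN = int(((y_true == 1) & (y_pred == 0)).sum())
    return {"Sen": TP / (TP + FN), "Spe": TN / (TN + FP),
            "PPV": TP / (TP + FP), "NPV": TN / (TN + FN),
            "Acc": (TP + TN) / (TP + TN + FP + FN)}

m = merit_figures([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])  # one FN, one FP
```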

Fragmentation Detection Based on Linear Models
The main goal of this experiment is to determine the linear classifier that best detects the fragmented activity in the ECG. The scheme of the tested methods is as follows. A linear classifier is first selected from the two main implementations of the SVM, namely, C-SVM and ν-SVM (the interested reader can see Reference [18] for details); the linear SVM was selected according to its behavior when working with high-dimensional spaces [27]. Then, the ECG segments of interest are computed as mentioned above. Finally, the input space for classification is computed according to the methods described in Section 3.
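The comparison between the two linear SVM implementations can be sketched with scikit-learn's `SVC` and `NuSVC`. The feature matrix below is a synthetic stand-in (the real inputs are the segment features of Section 3), and the label rule is purely illustrative:

```python
import numpy as np
from sklearn.svm import SVC, NuSVC
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the real feature matrix: 200 segments, 40 features.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 40))
y = (X[:, :5].sum(axis=1) > 0).astype(int)   # hypothetical fragmented/normal label

# C-SVM and nu-SVM, both with a linear kernel, scored by cross-validation.
for clf in (SVC(kernel='linear', C=1.0), NuSVC(kernel='linear', nu=0.5)):
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(type(clf).__name__, round(acc, 2))
```

The C and ν hyperparameters control the same margin/error trade-off under different parameterizations; ν is bounded in (0, 1] and upper-bounds the fraction of margin errors, which is why both variants are benchmarked separately.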

Features Relevance and New Fragmented Subrogated Model
In this section, we tested the statistical relevance of the features used in the best fragmentation detection methods described in the previous section. To do so, we performed a bootstrap resampling analysis, which allows us to estimate the probability density function of a parameter by computing it over resampled subsets of the population [28], setting the number of resamplings B to 100. When the confidence interval of an SVM weight does not overlap zero, the associated feature is identified as relevant for fragmentation detection. On the other hand, in the methods that use the signal as the input of the SVM, we searched for the frequency bands related to the fragmentation; to do so, we interpreted the SVM as a transversal linear filter whose weights are the coefficients of the filter.
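The bootstrap relevance test can be sketched as follows. This is an illustrative implementation, not the authors' code: the ν value and the percentile-interval construction are assumptions, while B = 100 and the non-overlapping-zero criterion come from the text.

```python
import numpy as np
from sklearn.svm import NuSVC

def bootstrap_weight_ci(X, y, B=100, alpha=0.05):
    """Refit a linear nu-SVM on B bootstrap resamples and return the
    (1 - alpha) percentile confidence interval of each weight.
    A feature is flagged relevant when its interval excludes zero."""
    rng = np.random.default_rng(0)
    n = len(y)
    W = []
    for _ in range(B):
        idx = rng.integers(0, n, n)            # resample with replacement
        if len(np.unique(y[idx])) < 2:         # need both classes to fit
            continue
        W.append(NuSVC(kernel='linear', nu=0.3).fit(X[idx], y[idx]).coef_[0])
    W = np.array(W)
    lo, hi = np.percentile(W, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0)
    relevant = (lo > 0) | (hi < 0)             # CI does not overlap zero
    return lo, hi, relevant
```

Plotting `lo` and `hi` per feature reproduces the kind of confidence-interval panels shown in Figure 5.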
The first presented classifier is the ν-SVM, combined with the principal components of the normalized QRS from the independent leads computed over the Sfrag-DB. Figure 5a shows the confidence interval at 95% for each SVM weight associated with the features described in Section 3. Each panel shows the confidence interval of the feature weights associated with one principal component. As can be seen in the first panel, which corresponds to the first principal component, the relevant features were features 11, 12, 13, and 14, which correspond to the mean, standard deviation, kurtosis, and skewness of the band-pass filtered component, respectively. As can also be observed, the relevant principal components are the last two. This is coherent with the results obtained in the companion paper [8], where we showed that using the detailed components, those with lower variance, enhances the fragmentation detection.
In Figure 5b, we can observe the confidence interval of the ν-SVM weights when the classifier is fed with the summation of the regionalized principal components power from FHCM-DB. The exhibited behavior of the coefficients is nearly periodic, with a frequency around 10 Hz, as seen in the upper panel around feature 15 and in the lower panel in the spectrum representation. The fragmentation waves in these records are visible to a clinician; for this reason, we conjecture that the appearance of these frequency bands is related to the minimum size of the fibrotic mass in the myocardium that originates fragmentation waves, since if the size of the fibrotic mass is smaller, the fragmentation waves become invisible to the clinician. Figure 5c shows the weights for the ν-SVM fed with the concatenation of the non-normalized QRS from the eight independent leads from FHCM-DB. We see the confidence interval of the SVM weights in columns 1 and 3, and the frequency behavior associated with these weights in columns 2 and 4. In this case, the periodic behavior is clearer than in the previous experiment, where the frequency behavior of Sfrag-DB was presented, and the main frequency is around 60 Hz. According to the previous results, we developed a new subrogated database, called SWfrag-DB, which takes these results into account in order to enhance the fit between our model and the real-world fragmentation. In the next experiment, we show the behavior of the proposed detection methods when they work with this new database. Figure 6a shows the Acc for the ν-SVM applied over SWfrag-DB using each signal selection. The best results for this database are achieved using the SVM with the statistics computed over the principal components of the non-normalized QRS and the statistics computed over the last three principal components of the normalized QRS. In both cases, the achieved Acc is 0.817. These values can be seen in Panels (1,1) and (2,1). The best results are obtained when the statistics are used as input for the SVM.
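The transversal-filter reading of the weights can be sketched with `scipy.signal.freqz`: the weight vector is treated as FIR filter coefficients and its magnitude response is inspected for dominant frequencies, as done for the 10 Hz and 60 Hz components above. The sampling rate and the sinusoidal stand-in weights are assumptions.

```python
import numpy as np
from scipy.signal import freqz

def weight_spectrum(w, fs=1000.0, n_points=512):
    """Treat a linear SVM weight vector as FIR filter coefficients and
    return its magnitude response versus frequency (fs is an assumption;
    the paper's recordings may use a different rate)."""
    freqs, h = freqz(w, worN=n_points, fs=fs)
    return freqs, np.abs(h)

# Illustrative weight vector: 140 samples oscillating at 60 Hz, mimicking
# the near-periodic coefficient pattern described for Figure 5c.
fs = 1000.0
t = np.arange(140) / fs
w = np.sin(2 * np.pi * 60 * t)
freqs, mag = weight_spectrum(w, fs)
print('dominant frequency (Hz):', round(freqs[np.argmax(mag)], 1))
```

The location of the spectral peak is what identifies the frequency band the classifier is exploiting, which in turn motivated the frequency range chosen for SWfrag-DB.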
On the other hand, the worst scenario is obtained using the aggregation, the aggregation of power, or the concatenation of components. Figure 6b shows the Acc for the C-SVM applied over SWfrag-DB using each signal selection. As can be observed, the best results for this database are achieved by using as SVM features the statistics computed over the principal components of the non-normalized QRS, achieving an Acc of 0.808. These values can be seen in Panel (1,1). The best results are obtained when the statistics are used as SVM features. On the other hand, the worst cases appear when the signal aggregation, the aggregation of power, or the concatenation of components is used as input.
These results are similar to those obtained for Sfrag-DB, and they are coherent with the results presented in [8], where the fragmented wave information tends to concentrate in the detailed components. Therefore, we must use the information from all the components. Table 1 shows the results for each signal selection and classifier for SWfrag-DB. The Acc for every combination is above 0.70. The maximum Acc achieved was 0.82. This value was reached using the ν-SVM fed with the statistics computed over the three last principal components of the non-normalized QRS and using the ν-SVM fed with the statistics computed over the principal components of the normalized QRS. In general, the use of the QRS exhibits a good performance, and the worst output appeared when ICA was computed over non-normalized beats. These results reinforce the conclusions of [8], where we stated that the use of PCA is better than ICA for fragmentation detection, because it is more stable in terms of component output, which is relevant for obtaining good results from the SVM.
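The best-performing input space, statistics over the lowest-variance principal components of a QRS segment, can be sketched as follows. The lead/sample dimensions and the choice of four moments per component are illustrative; only the PCA step and the focus on the last (detailed) components come from the text.

```python
import numpy as np
from scipy import stats
from sklearn.decomposition import PCA

def last_pc_stats(qrs_leads, n_last=3):
    """PCA across the leads of one QRS segment, then statistics on the
    lowest-variance ('detailed') components, where the companion paper
    reports the fragmentation information to concentrate."""
    X = np.asarray(qrs_leads).T          # samples x leads
    comps = PCA().fit_transform(X)       # columns sorted by decreasing variance
    feats = []
    for c in comps.T[-n_last:]:          # keep only the last components
        feats += [c.mean(), c.std(), stats.skew(c), stats.kurtosis(c)]
    return np.array(feats)

# Example: 8 independent leads, 140-sample QRS window (assumed sizes).
rng = np.random.default_rng(4)
qrs = rng.standard_normal((8, 140))
print(last_pc_stats(qrs).shape)
```

The resulting low-dimensional vector is what feeds the SVM in the "statistics over principal components" rows of Table 1.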

Fibrosis Detection Based on Linear Models and Statistical Relevance
In this subsection, we cover the results of the experiments applied over HCM-DB. As mentioned previously, and according to clinical criteria, it is not possible to visually identify this condition in the ECG, and MRI-LGE is currently required for diagnosis. Hence, given that fibrosis is physiopathologically related to misconduction in the heart, we tried to apply the same algorithmic approach. The scheme of the tested methods was as follows. A linear classifier was first selected from both SVM algorithms; then, the ECG segment of interest was selected from among the normalized beat, the normalized QRS, the non-normalized beat, and the non-normalized QRS; finally, the input space of the classifiers was computed according to Section 3. Figure 7a shows the Acc for the ν-SVM applied over HCM-DB using each signal selection. The best result for this database is achieved using the statistics computed over the QRS taken from the 12 leads as input for the SVM, reaching Acc = 0.68, see Panel (2,1). Figure 7b shows the Acc for the C-SVM applied over HCM-DB using each signal selection. As can be observed, the best result for this database is achieved by using as SVM features the concatenation of the non-normalized QRS from the 8 independent leads, achieving Acc = 0.683, see Panel (2,3). Table 2 shows the combinations that achieved the best results for each signal selection and classifier. As can be observed, the minimum Acc among the combinations was this time much lower, reaching only 0.64. The maximum Acc was 0.68, achieved by the ν-SVM fed with the statistics computed over the non-normalized QRS of the 12 leads, and by the concatenation of the non-normalized QRS of the 8 independent leads. The use of the non-normalized QRS exhibits a good performance, but the results are much lower than in the fragmentation analysis. Nevertheless, these results are relevant because they prove the existence of fibrosis markers in the standard ECG.
The goal of the second part of this experiment is to describe the statistical relevance of the features of the best fibrosis detection methods described in the previous experiment. Hence, we performed a bootstrap analysis with B equal to 100, after which we extracted the relevant features by selecting those whose weight confidence interval did not overlap zero. Figure 8 shows the confidence intervals of the features used in the linear model that exhibited the best Acc applied over the HCM-DB. In this case, we can observe that there is no non-overlapping-zero feature; hence, we can say that no computed feature is statistically significant in our linear model, which indicates that the relation between the used features and the fibrosis is not linear. Therefore, we need to explore the use of non-linear classifiers to enhance the fibrosis detection.

Fragmentation and Fibrosis Detection Based on Non-Linear Models
In this subsection, we experiment using the same databases and processing techniques, but now with non-linear classifiers. The benchmarked classifiers were the C-SVM with Gaussian kernel, the ν-SVM with Gaussian kernel, the MLP, the KNN, the DT, and the Gaussian NB. Table 3a shows the best algorithm of each class tested over the Sfrag-DB. As can be seen, the best algorithm was the NB using the statistics computed over the principal components of the non-normalized QRS, which reached Acc = 0.83. The algorithms provided different accuracies depending on the classifier used, and sorting them from top to bottom performers, they can be listed as follows: NB, the C-SVM and DT, the MLP and ν-SVM, and the KNN. These results indicate statistical independence among the features used for the fragmentation detection. Table 3b shows the best algorithm of each class tested over the SWfrag-DB. The best algorithm was again the NB using the statistics computed over the principal components of the non-normalized QRS, which achieved Acc = 0.83. According to their Acc, the classifiers can be sorted as: NB, C-SVM, ν-SVM, MLP, DT, and KNN. In these cases, the most used features were the statistics computed over the principal components. These results also prove the statistical independence among the features used for the wide-fragmentation detection. Table 3c shows the best algorithm of each class tested over the FHCM-DB. The best algorithm is the C-SVM fed with the aggregation of the normalized QRS from the eight independent leads, which achieved Acc = 0.91. In this case, all the classifiers reached a significantly positive Acc. According to their Acc, the classifiers can be sorted as: C- and ν-SVM, KNN, MLP, DT, and NB. Unlike with the subrogated databases used before, Sfrag-DB and SWfrag-DB, in the case of the real fragmented records the results exhibit a non-linear dependency between the used features and the existence of fragmentation.
The main goal of the second part of this experiment is to determine whether non-linear classifiers exhibit better results than linear models in the fibrosis detection. The classifiers tested were the C-SVM with Gaussian kernel, the ν-SVM with Gaussian kernel, the MLP, the KNN, the DT, and the Gaussian NB. Table 4 shows the behavior of the non-linear classifiers over HCM-DB, where we can observe that the best performance is achieved by the NB fed with the statistics computed from the eight independent leads; this algorithm presents Acc = 0.70. In this case, the overall performance is low compared with the other databases; moreover, the statistics computed over the non-normalized signals present the best performance. According to their accuracy, the classifiers can be divided into three groups, namely, the NB, which achieves the best results; the C-SVM, KNN, ν-SVM and MLP; and finally the DT, which achieves the worst results. These results show the statistical independence among the features used for the fibrosis detection. As seen in Table 5, the non-linear methods outperformed the linear methods in the detection of both fragmentation and fibrosis.

Discussion
The main goals of this study were two: first, the development of an algorithm allowing clinicians to automatically detect fragmentation in the twelve-lead ECG; second, the creation of an algorithm allowing clinicians to detect fibrosis early in the twelve-lead ECG. In accordance with the results obtained in Reference [8], where we used multivariate transforms, such as PCA and ICA, to enhance the presence of fragmentation in the ECG, we computed several features that can model this situation. In the case of fibrosis, since both conditions are similar, we followed the same strategy. The developed algorithms are based on linear and non-linear classifiers, namely, the linear SVM, the SVM with Gaussian kernel, the KNN, the MLP, the DT, and the Gaussian NB. The main advantage of linear methods is the interpretability of their results, but in general, their performance is lower than that of non-linear methods. On the other hand, the use of non-linear methods enhances the results obtained from linear methods, at the cost of losing the interpretability of their results.
If we examine the results obtained in the case of Sfrag-DB, which corresponds to the subrogated model that widely extends the number of frequencies of the synthetic fragmentation, the difference between linear and non-linear models is quite relevant. Results showed the NB as the best performing method, in terms of Acc, when applied over the statistics computed from the output of the PCA computed over a narrow window around the QRS and taking only the eight independent leads into the analysis. This method achieved 0.79 Sen, 0.77 Spe, 0.96 PPV, 0.80 NPV, and 0.87 Acc. It is relevant to mention here that, although the depicted figures are quite good, we are actually working with a synthetic formulation of the fragmentation wave. This model can help to understand the behavior of the proposed algorithms over well-known signals before applying them to real cases.
Moving now to the second database used, SWfrag-DB, which is a subrogated model that articulates a more restrictive range of frequencies to enhance the fit to the real fragmented signals, the results did not move away from the previous ones. For this case, the results in the best scenario were 0.76 Sen, 0.90 Spe, 0.89 PPV, 0.77 NPV, and 0.83 Acc. As happened with Sfrag-DB, according to the best classifier, the NB, the results showed statistical independence among the features selected for the fragmentation detection.
The third database used was FHCM-DB, which contains the real fragmented records. The results showed much better figures than those of the subrogated models. This can be because the set of free parameters in the synthetic model appears to be too extensive compared with the effective real case. The best method in this case was the C-SVM with Gaussian kernel over the aggregation of the normalized QRS from the eight independent leads. The final figures were 0.94 Sen, 0.88 Spe, 0.89 PPV, 0.93 NPV, and 0.91 Acc. The obtained values compare very positively with the results published in the literature, slightly improving on all existing reported results. They also prove a non-linear dependency between the selected features and the presence of fragmentation in the records.
A new and challenging situation was found by our team when the clinicians proposed to evaluate these very same techniques over the HCM-DB, which contains records from patients affected by fibrosis, for which, as mentioned earlier in this paper, very few references are found in the signal processing literature. This lack of references is justified by the fact that this condition is rarely visible in the ECG, and its diagnosis always requires an MRI-LGE; thus, academic researchers do not set their target on this condition when working with ECG processing. We hypothesize in this work that, although it might not be visible in the ECG, electrophysiological components related to the misconducting cells could eventually be present inside the signal. The provided results are relevant, not because of the figures of merit obtained, which are much lower than in the case of fragmentation, but because they are unique as a clinical reference for better diagnosis. The best proposed method consists of the NB classifier applied over the statistics computed from the real QRS of the eight independent leads, and the figures of merit obtained were 0.47 Sen, 0.91 Spe, 0.82 PPV, 0.66 NPV, and 0.70 Acc. As the reader can see, even though the results were not that good in terms of Sen, the presented algorithm achieved a high value of Spe. This is especially relevant, as nowadays no algorithm exists that allows clinicians to evaluate the presence of fibrosis based on the ECG.
As a general conclusion, we can state that the algorithms and techniques presented in this paper open a wide range of possible applications, such as the development of a risk assessment tool for the early diagnosis of misconducting conditions, namely, fibrosis, fragmentation, and so on.
We also think that this paper opens a new opportunity in the ECG processing field to analyze and eventually develop improved algorithms for fibrosis detection that enhance the Sen presented in this work. Additionally, we think that other transformation techniques not included in this paper, such as empirical mode decomposition or wavelets, could also be evaluated to improve the results presented here, which could end up creating a clinically validated score for misconducting conditions.