Artificial Neural Network for Atrial Fibrillation Identification in Portable Devices

Atrial fibrillation (AF) is a common cardiac disorder that can cause severe complications. AF diagnosis is typically based on the electrocardiogram (ECG) evaluation in hospitals or in clinical facilities. The aim of the present work is to propose a new artificial neural network for reliable AF identification in ECGs acquired through portable devices. A supervised fully connected artificial neural network (RSL_ANN), receiving 19 ECG features (11 morphological, 4 on F waves and 4 on heart-rate variability (HRV)) as input and discriminating between AF and non-AF classes as output, was created using the repeated structuring and learning (RSL) procedure. RSL_ANN was created and tested on 8028 (training: 4493; validation: 1125; testing: 2410) annotated ECGs belonging to the “AF Classification from a Short Single Lead ECG Recording” database and acquired with the portable KARDIA device by AliveCor. RSL_ANN performance was evaluated in terms of area under the curve (AUC) and confidence intervals (CIs) of the receiver operating characteristic. RSL_ANN performance was very good and very similar in training, validation and testing datasets. AUC was 91.1% (CI: 89.1–93.0%), 90.2% (CI: 86.2–94.3%) and 90.8% (CI: 88.1–93.5%) for the training, validation and testing datasets, respectively. Thus, RSL_ANN is a promising tool for reliable identification of AF in ECGs acquired by portable devices.


Introduction
Arrhythmias are among the most common cardiac disorders that can cause severe and sometimes fatal complications, even when asymptomatic [1,2]. Among the different kinds of serious cardiac arrhythmias, atrial fibrillation (AF) is the most common, affecting 1-2% of the worldwide population [3]. AF is associated with a high morbidity (especially stroke and heart failure) and mortality. Mortality (per 100,000 population), in particular, has shown an increasing trend with time; from 1990 to 2010 it increased from 0.8 to 1.6 in men, and from 0.9 to 1.7 in women, with peaks in developed countries reaching 2.7 and 2.4, respectively [3]. Thus, AF represents, worldwide, a significant public health problem with huge socio-economic repercussions.
AF is a supraventricular arrhythmia characterized by uncoordinated continuous atrial electrical activation, causing the deterioration of atrial functionality. In normal conditions, the contraction of the heart is initiated by an electrical impulse that, after having been generated by the sino-atrial node, propagates through all atrial myocardial cells, causing their electrical depolarization and mechanical contraction followed by the electrical repolarization and mechanical relaxation. Subsequently, the electrical impulse reaches the atrioventricular node, in which it is slowly conducted before propagating through all ventricular myocytes, causing their depolarization and contraction and subsequent repolarization and relaxation. The electrical phenomena associated with the propagation of this impulse through the heart result in typical waves of the electrocardiogram (ECG) measured at the body surface. Normally, the ECG is a pseudo-periodic signal (Figure 1A) constituted by the repetition of a pattern showing a sequence of typical waves (Figure 1B): the P wave, which reflects the atrial depolarization; the QRS complex, which reflects the ventricular depolarization and hides the atrial repolarization; and the T wave, which reflects the ventricular repolarization. In AF, the sino-atrial node is overruled by the continuous fibrillatory activity and is no longer able to provide its pseudo-periodic impulse, so the heart rhythm becomes irregular (Figure 1C) and the impulse propagates through the atria following chaotic pathways [4]. However, once the impulse reaches the atrioventricular node and finds it not refractory, the impulse propagates normally through the ventricles. The combination of the random nature of AF and the complex conduction/blocking properties of the atrioventricular node generates an irregular heart rate.
These abnormalities of the electrical activity of the heart are reflected in the ECG, which is no longer a pseudo-periodic signal but, rather, shows a high level of heart-rate variability (HRV) (Figure 1C). The P wave is no longer present; instead, continuous fibrillatory waves, also called F waves, are seen as rapid low-amplitude oscillations that reflect the continuous uncoordinated atrial depolarization (Figure 1D).
Figure 1. Panel (A) shows a normal pseudo-periodic electrocardiogram (ECG) tracing. Panel (B) shows a normal beat, constituted by a P wave (the smallest wave), a QRS complex (with R being the highest wave) and a T wave. Panel (C) shows an ECG tracing with atrial fibrillation (AF) and thus increased heart rate variability (HRV). Panel (D) shows a beat during AF with F waves but no P wave.
AF diagnosis is typically ECG-based and is usually made by a cardiologist, possibly supported by computerized applications [5][6][7][8][9][10][11][12][13][14][15][16][17][18], in hospitals or in clinical facilities. However, traditional medical ECG devices, even when used outside the hospital (such as the Holter ECG recorders), are given to a limited number of people, who are symptomatic or have cryptogenic stroke and, hence, for whom there is an indication for long-term monitoring. Because AF often develops insidiously and without symptoms, large-scale monitoring would be preferable, especially in the population above a certain age. The use of wearable devices (such as watches, patches and bands) and portable devices (such as smartphones and tablets) is becoming more and more common among the entire population worldwide. Modern devices are able to record the ECG and have thus opened the possibility of remotely monitoring AF in a large number of individuals. However, in order to be useful in the preventive diagnosis of AF, they have to be associated with reliable diagnostic software. As a result, several algorithms for automatic detection of AF have been proposed in the literature [5][6][7][8][9], several of which are based on machine and deep learning approaches [10][11][12][13][14][15][16][17][18]. Most of them claim very high performance but, when critically analyzed, show some common limitations. Firstly, the performance of some algorithms for AF identification has been tested only against sinus rhythm [5][6][7][8][9][10][11][13][14][15], without considering the main confounders, which are the level of noise affecting ECGs and the presence of other kinds of arrhythmias [12][16][17][18].
Secondly, most algorithms rely only on HRV to identify AF [5][6][7][8][9][10][13][14], even though high HRV is not AF-specific, being associated with many other arrhythmias [19,20], and AF is also associated with the absence of the electrocardiographic P wave and the presence of electrocardiographic F waves. Finally, some algorithms have been tested only on ECGs recorded by traditional medical devices [11][12][13][14][15][16][18] and not by modern wearable or portable devices; thus, their applicability to the latter remains to be demonstrated.
The aim of the present work is to propose a new artificial neural network (ANN) for a reliable identification of AF based on several input ECG features and to test it on ECG recordings acquired through portable devices, and thus typically affected by noise, made in healthy subjects and in cardiac patients exhibiting various types of abnormal cardiac rhythms. To this aim, a supervised fully connected artificial neural network was created using the repeated structuring and learning procedure [21] and tested on the "AF Classification from a Short Single Lead ECG Recording" database [19] by Physionet [22], consisting of thousands of short single-lead ECG recordings acquired with the portable KARDIA device by AliveCor [19].

Study Datasets
Data belong to the "AF Classification from a Short Single Lead ECG Recording" database by Physionet [19,22] (https://physionet.org). They include 8244 single lead ECGs (typically Einthoven lead I), collected with the portable KARDIA device by AliveCor (https://www.alivecor.com). ECG duration ranges from 9 s to 61 s (average: 33 s) and the sampling rate is 300 Hz. All ECG recordings were manually annotated by an expert as showing AF rhythms (738 recordings), normal rhythms (5050 recordings) or other rhythms (different from AF and normal rhythms, such as premature ventricular contraction; 2456 recordings) [19,22]. For the scope of this paper, these ECG recordings were classified into two reference classes, the AF class (738 recordings) and the non-AF class (7506 recordings).
All ECGs were characterized in terms of signal-to-noise ratio (SNR) (in dB, where the signal and noise amplitudes were defined as maximum signal amplitude and 4 times signal standard deviation, respectively) and submitted to an automatic algorithm for R-peak detection [23]. Only ECGs for which at least three consecutive R peaks could be identified were accepted for feature extraction and AF identification. Specifically, only the accepted ECGs were considered and grouped into three datasets, the training dataset, the validation dataset and the testing dataset. The training dataset and the validation dataset, including 55% and 15% of accepted ECGs, respectively, were used to create the ANN for AF identification, while the testing dataset, including the remaining 30% of accepted ECGs, was used to evaluate the created ANN performance. In all datasets, the prevalence of subjects in AF and non-AF classes was maintained unaltered.
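The class-preserving split described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `stratified_split`, the random shuffling and the seed are hypothetical and are not taken from the paper, which does not describe how recordings were assigned to the three datasets.

```python
import random

def stratified_split(labels, fractions=(0.55, 0.15, 0.30), seed=0):
    """Return three index lists (training, validation, testing).

    Indices are shuffled within each class and then cut at the requested
    fractions, so each dataset keeps roughly the class prevalence of the whole.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # the remainder goes to testing
    return train, val, test

# Toy labels with a prevalence close to that of the database (1 = AF, 0 = non-AF).
labels = [1] * 90 + [0] * 910
train_idx, val_idx, test_idx = stratified_split(labels)
```

Splitting each class separately, rather than the pooled recordings, is what keeps the AF prevalence approximately constant across the three datasets.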

ECG Processing and Feature Extraction
Initially, each ECG was prefiltered with a 6th order bidirectional Butterworth bandpass filter (cutoff frequencies of 0.5 Hz and 45 Hz) and R-peak positions were identified [23]. Then, several different signal processing steps were applied to obtain a set of 19 features from each ECG, 11 morphological features, 4 F-waves features and 4 HRV features. For an interpretive approach, the features were selected to include all those on which the criteria for AF diagnosis rely, namely P-wave disappearance, F-waves appearance and HRV increment, possibly quantified with different methods.
The 11 morphological features were extracted from the median ECG beat (MECGB), obtained as the median of the n ECG segments (with n being the number of beats in the recording) spanning from 250 ms before to 450 ms after each R peak. Specifically, the following 6 standard landmarks [24] were identified: P_p (position of the absolute maximum of |MECGB| to the left of the R wave; it corresponds to the P-peak position in the presence of the P wave or to the highest oscillation position in the presence of F waves); R_p (position of the absolute maximum of |MECGB|; it corresponds to the R-peak position); T_p (position of the absolute maximum of |MECGB| to the right of the R wave; it corresponds to the T-peak position); QRS_on (position of the point where the MECGB derivative changes its sign for the second-to-last time before R_p; it corresponds to the QRS-onset position); QRS_off (position of the point where the MECGB derivative changes its sign for the second time after R_p; it corresponds to the QRS-offset position or J point); and T_off (position of the point where the MECGB derivative changes its sign for the first time after T_p; it corresponds to the T-offset position). Using these 6 landmarks, 11 morphological features, namely 5 time intervals (P_pR_p, P_pQRS_off, QRS_onQRS_off, QRS_onT_off and QRS_offT_off) and 6 amplitudes (AP, AQRS_on, AQRS, AQRS_off, AT and AQRS/AP), were computed as described in Table 1. All amplitude features were computed with respect to the baseline level identified 80 ms before R_p [25].
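The median-beat construction can be illustrated with a minimal sketch. The function name `median_beat` and the handling of beats near the edges of the recording are assumptions made for the example; the text does not state how incomplete beats were treated, and the R-peak positions are assumed to be already available from the detector [23].

```python
from statistics import median

def median_beat(ecg, r_peaks, fs=300, pre_ms=250, post_ms=450):
    """Sample-wise median of the ECG segments around each R peak.

    Beats whose window would run past either end of the recording are skipped
    (an assumption; the text does not say how edge beats are handled).
    """
    pre = round(pre_ms * fs / 1000)
    post = round(post_ms * fs / 1000)
    segments = [ecg[r - pre:r + post + 1]
                for r in r_peaks
                if r - pre >= 0 and r + post < len(ecg)]
    if not segments:
        raise ValueError("no complete beat in the recording")
    # zip(*segments) walks the aligned beats sample by sample.
    return [median(samples) for samples in zip(*segments)]

# Toy example: three impulse-like "beats", the second one corrupted by a spike.
fs = 300
ecg = [0.0] * 1200
r_peaks = [300, 600, 900]
for r in r_peaks:
    ecg[r] = 1.0
ecg[620] = 5.0  # artefact present in the second beat only
mecgb = median_beat(ecg, r_peaks, fs)
```

Because the median is taken sample by sample across beats, an artefact that appears in only one beat does not survive in the median beat, whereas the R peak, present in every beat, does.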
The 4 F-waves features are based on the power spectral density estimation of the residual ECG, obtained by subtracting the dominant ECG waveform, computed using the segmented beat modulation method [26,27], from the original ECG. Specifically, the F-waves frequency ratio (FWFR, dimensionless) was computed as the ratio between the spectral area in the F-waves frequency band (4-10 Hz) and the total spectral area [27]. Since four different methods were used to estimate the power spectral density, 4 FWFR values (namely FWFR_FFT, FWFR_WLC, FWFR_YWK and FWFR_THM) were obtained as described in Table 1.
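The band-ratio definition of FWFR can be sketched as below. This is a hedged illustration, not the authors' implementation: the spectrum is estimated with a plain periodogram computed by a direct discrete Fourier transform, which is an assumption standing in for the four estimators of Table 1, and the function names `periodogram` and `fwfr` are hypothetical. The residual ECG is assumed to be already available.

```python
import cmath
import math

def periodogram(x, fs):
    """One-sided periodogram via a direct DFT (O(N^2), acceptable for a short sketch)."""
    n = len(x)
    mean = sum(x) / n
    x = [v - mean for v in x]  # remove the DC component
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x))
        # Interior bins are doubled to fold in the negative frequencies.
        scale = 1 if k == 0 or (n % 2 == 0 and k == n // 2) else 2
        freqs.append(k * fs / n)
        power.append(scale * abs(coeff) ** 2 / (fs * n))
    return freqs, power

def fwfr(residual, fs, band=(4.0, 10.0)):
    """Spectral area inside the F-wave band divided by the total spectral area."""
    freqs, power = periodogram(residual, fs)
    total = sum(power)
    in_band = sum(p for f, p in zip(freqs, power) if band[0] <= f <= band[1])
    # Bins are equally spaced, so the ratio of sums equals the ratio of areas.
    return in_band / total if total > 0 else 0.0

# Toy residuals: a 6 Hz oscillation (inside the band) and a 20 Hz one (outside).
fs = 300
n = 600  # 2 s of signal
f_wave_like = [math.sin(2 * math.pi * 6 * i / fs) for i in range(n)]
out_of_band = [math.sin(2 * math.pi * 20 * i / fs) for i in range(n)]
```

With these toy signals, `fwfr(f_wave_like, fs)` is close to 1 and `fwfr(out_of_band, fs)` is close to 0, which is the behaviour the feature relies on to flag fibrillatory activity.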

Artificial Neural Network Construction
The iterative repeated structuring and learning (RSL) procedure [21] was used to create a supervised fully connected artificial neural network (RSL_ANN). Details about the RSL procedure can be found in [21]. In the present study, RSL_ANN was designed according to the following specifications: (a) the input layer consists of 19 neurons (one for each extracted feature), the output layer consists of one neuron that provides a value between 0 and 1, with 0 representing the non-AF class and 1 representing the AF class, and all other neurons had weights and biases between −1 and +1 and a sigmoid activation function; (b) optimization was done with the scaled-conjugate-gradient algorithm [29]; (c) to avoid overfitting, the validation-based early stopping criterion was used [30]; and (d) the AF and non-AF classes were weighted according to the inverse of their prevalence in order to compensate their disproportionality [31]. The procedure dynamically alternated structuring and learning phases. The primitive RSL_ANN (initially composed of a neuron in a hidden layer) was upgraded in different alternatives according to the following rules: each alternative presented only an additional neuron in an existing layer or in a new layer; the number of neurons in a layer could not be larger than the number of neurons in the previous layer; the maximal number of layers was three; and initialized weights and bias of the additional neuron had to improve RSL_ANN performance after only one epoch. If one rule was not fulfilled, the alternative was not acceptable. Then, all alternatives were learnt, and their validation errors were compared with the validation error of the primitive RSL_ANN. The RSL_ANN with the smallest validation error was considered as the new primitive RSL_ANN, and the procedure started anew. 
The stopping criteria were the following: there were no acceptable alternatives; the same alternative was confirmed as primitive 10 consecutive times; or there were no misclassifications in both training and validation datasets. When one of the stopping criteria was met, the primitive RSL_ANN was taken as the final RSL_ANN. In order to avoid dependence on initialization, 100 different RSL_ANNs were created by considering 100 different initializations. The optimal RSL_ANN was selected as the one showing the smallest validation error.
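The structuring rules can be made concrete with a short sketch that enumerates the architectures reachable from a given primitive network. It is an illustration under assumptions, not the procedure of [21]: the names are hypothetical, the limit of three layers is read as three hidden layers, the "previous layer" of the first hidden layer is taken to be the 19-neuron input layer, and the rule requiring an improvement after one epoch is not modelled because it needs actual training.

```python
MAX_HIDDEN_LAYERS = 3  # assumption: the three-layer limit refers to hidden layers
N_INPUTS = 19          # one input neuron per extracted feature

def respects_rules(hidden, n_inputs=N_INPUTS):
    """Check the size rules: at most three hidden layers, none wider than the previous layer."""
    if not 1 <= len(hidden) <= MAX_HIDDEN_LAYERS:
        return False
    previous = n_inputs
    for width in hidden:
        if width < 1 or width > previous:
            return False
        previous = width
    return True

def alternatives(hidden, n_inputs=N_INPUTS):
    """Architectures obtained by adding exactly one neuron to the primitive network."""
    candidates = []
    for i in range(len(hidden)):
        grown = list(hidden)
        grown[i] += 1                      # one more neuron in an existing layer
        candidates.append(grown)
    candidates.append(list(hidden) + [1])  # one neuron in a new last hidden layer
    return [c for c in candidates if respects_rules(c, n_inputs)]

# Starting point of the procedure: a single neuron in a single hidden layer.
first_step = alternatives([1])
# A three-hidden-layer network with 6, 6 and 5 neurons.
late_step = alternatives([6, 6, 5])
```

Under these assumptions, `first_step` contains `[2]` and `[1, 1]`, while `late_step` contains only `[7, 6, 5]` and `[6, 6, 6]`, since widening the second layer or adding a fourth layer would break the rules.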

Statistics
Feature distributions over classes were described in terms of 50th [25th;75th] percentiles in all datasets and compared using the Wilcoxon rank-sum test for equal medians. The statistical significance level (p-value) was set at 0.05. RSL_ANN performance was evaluated by computing the receiver operating characteristic (ROC) curve, from which the area under the curve (AUC) and associated 95% confidence intervals (CIs) were computed. Finally, sensitivity (Se) and specificity (Sp) were determined for two specific operating points on the ROC curve of the testing dataset. The first operating point (Case 1) was that for which Se equals Sp; the second operating point (Case 2) was that for which Sp is set at 75% and Se is computed accordingly.
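The two operating points can be illustrated with a minimal sketch on toy classification scores. The function names and the toy data are hypothetical; the AUC is computed with the rank-based (Mann-Whitney) formulation, Case 2 is interpreted as the highest Se among thresholds with Sp of at least 75%, and the confidence intervals are not reproduced because the text does not state how they were estimated.

```python
def roc_points(scores, labels):
    """(threshold, Se, Sp) for each distinct score used as threshold (score >= threshold -> AF)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= thr)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < thr)
        points.append((thr, tp / n_pos, tn / n_neg))
    return points

def auc(scores, labels):
    """AUC as the probability that an AF score exceeds a non-AF score (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def case1(points):
    """Operating point where Se and Sp are closest to each other."""
    return min(points, key=lambda p: abs(p[1] - p[2]))

def case2(points, min_sp=0.75):
    """Highest Se among the operating points with Sp of at least min_sp."""
    eligible = [p for p in points if p[2] >= min_sp]
    return max(eligible, key=lambda p: p[1])

# Toy scores in [0, 1]; label 1 = AF, label 0 = non-AF.
scores = [0.10, 0.20, 0.30, 0.40, 0.35, 0.60, 0.70, 0.90]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
points = roc_points(scores, labels)
```

On this toy data the AUC is 0.9375, Case 1 falls at the threshold 0.40 (Se = Sp = 75%) and Case 2 at the threshold 0.35 (Se = 100% with Sp = 75%), showing how relaxing the balance between the two errors raises Se at a fixed minimum Sp.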

Results
Out of 8244 ECGs available in the Physionet "AF Classification from a Short Single Lead ECG Recording" database, 8028 (97.4%) were accepted for the study while the remaining 216 (2.6%) were rejected. Accepted ECGs were characterized by an SNR significantly higher than rejected ones (3.7 [1.2;4.3] dB vs. 0.1 [−2.5;2.5] dB, respectively; p-value < 0.05). Table 2 shows accepted ECGs grouped into training, validation and testing datasets. Feature distributions over datasets are reported in Table 3. Most features (15 out of 19) were found to be significantly different when statistically comparing the subjects in the AF and non-AF classes in all datasets. The optimal RSL_ANN had a three-hidden-layer architecture with 6 neurons in the first hidden layer, 6 neurons in the second hidden layer and 5 neurons in the third hidden layer (Figure 2). The ROC curves for the testing dataset obtained with the optimal RSL_ANN are depicted in Figure 3. The AUCs for the training, validation and testing datasets are 91.1% (CI: 89.1-93.0%), 90.2% (CI: 86.2-94.3%), and 90.8% (CI: 88.1-93.5%), respectively. Case 1 was characterized by values of Se and Sp both equal to 81.2% in the testing dataset. Finally, Case 2 was characterized by a value of Sp equal to 75.0% and a value of Se equal to 88.7% in the testing dataset.

Discussion
This work proposes RSL_ANN, a supervised fully connected artificial neural network created using the repeated structuring and learning procedure [21], for reliable AF identification in ECGs acquired with the portable KARDIA device by AliveCor, such as those used here and available in the Physionet "AF Classification from a Short Single Lead ECG Recording" database [19,22]. The repeated structuring and learning procedure should be regarded as a general method to construct ANNs, not one tied to a specific clinical application. The procedure [21] is particularly suitable for applications of neural networks to relatively small databases (and not only to big data, as is usually done), since it improves the loss function by iteratively alternating structuring and learning phases during training (activation functions are standard).
RSL_ANN was fed with a set of 19 input features automatically extracted from ECGs (Table 1). In line with the three criteria for AF diagnosis, the feature set includes standard morphological features of ECG waves (to reflect possible P-wave disappearance) as well as ECG features that typically characterize AF, namely F-waves features (to reflect possible F-waves appearance) and HRV features (to reflect possible HRV increment). Statistical analysis of feature distributions (Table 3) confirmed the known clinical observations that, in AF, the P wave disappears, F waves appear and HRV increases. P-wave disappearance and F-waves appearance are indicated by the finding that AP values are significantly higher in the non-AF class than in the AF class. AP values in the AF class are not 0 (as one would expect in the absence of the P wave) because they represent the F-waves amplitude and not the P-wave amplitude (see Section 2.2). F-waves appearance in AF is also indicated by the fact that all FWFR features were significantly higher in the AF class than in the non-AF class. Finally, the HRV increment in AF is indicated by the fact that all HRV features were significantly higher in the AF class than in the non-AF class. These findings, together with the observation that only two morphological and not AF-specific features out of 19 (both related to the QRS complex) were not significantly different in AF vs. non-AF classes (Table 3), confirm the reliability of the automatic feature extraction and the appropriateness of the feature selection.
RSL_ANN output is the ECG classification score, that is a value between 0 (indicating a subject not affected by AF) and 1 (indicating a subject affected by AF). No further stratification for cardiac rhythms other than AF was provided for the non-AF cases since optimal identification of a specific cardiac rhythm or pathology requires a specifically designed artificial neural network and proper selection of input ECG features (for example, in [21,32] optimal artificial neural networks for identification of heart failure in post-infarction patients and of ischemia in patients who underwent elective percutaneous coronary intervention are proposed, both obtained using the repeated structuring and learning procedure and a different set of 13 input ECG features).
As noted, using ECG features instead of raw data (as is sometimes done with long short-term memory networks, 1D convolutional neural networks and others [11,15,16,33,34]) at the input of RSL_ANN requires an additional ECG processing step for feature extraction before classification; however, it also allows the construction of a faster and simpler artificial neural network, with fewer hidden layers, from a smaller training dataset. In addition, since each feature, if well selected, reflects a specific physiologic phenomenon, the classification logic of the network is physiologically more understandable than when it is based on raw data, which is valuable in contexts in which interpretability of the model is desirable.
RSL_ANN was constructed and tested on the "AF Classification from a Short Single Lead ECG Recording" database [19] by Physionet [22]; this database was selected for several reasons. First, it contains more than 8000 short single-lead ECG recordings and thus represents a suitable database for the design of a tool based on artificial neural networks. Additionally, these ECGs were acquired using the KARDIA [19], which is a portable device by AliveCor, in healthy subjects and patients showing several types of cardiac rhythm besides AF. These characteristics of the database allowed us to test the proposed algorithm in relation to the two main confounders in automatic AF identification, which are the level of noise affecting ECGs acquired using portable devices and the presence of arrhythmias other than AF.
Less than 3% of the ECGs included in the database could not be used in this study, because of high levels of noise that jeopardized R-peak detection and not because of issues related to feature extraction or RSL_ANN construction. Consequently, all the observations that can be made on the ability of RSL_ANN to identify AF hold for ECGs affected by various levels of noise but in which the signal is dominant with respect to noise. Reliability of RSL_ANN in very noisy conditions remains to be demonstrated and requires the availability of an R-peak detector able to perform correctly in such adverse conditions. Performance of RSL_ANN was very good and very similar in all datasets, with AUC over 90%. This result confirms the ability of RSL_ANN to generalize correctly to the problem of AF identification. We chose to express RSL_ANN performance in terms of AUC so that Se and Sp (and thus the working point on the ROC curve and the threshold value of the output neuron) can be optimized according to the application. When RSL_ANN is applied to a subject with no history of AF, errors in AF and non-AF classifications should be equally probable. Consequently, the threshold should be chosen to have equal values of Se and Sp. This case corresponds to Case 1, in which Se and Sp are 81.2% in the testing dataset. Instead, if RSL_ANN is applied to a subject with a history of AF, AF occurrence is more likely and errors in AF identification should be minimized with respect to errors in non-AF identification. Consequently, the threshold should be chosen to have the maximum Se obtainable by setting Sp at its minimum acceptable value. This case corresponds to Case 2, in which a Se of 88.7% is obtained by setting Sp at 75.0%.
Our RSL_ANN uses the largest amount of data acquired by a portable device, considers all main confounders in AF identification and uses all AF diagnosis features. Some studies report values of Se (>90%) or AUC (>90%) higher than ours but involved discrimination of clinical ECGs (acquired with medical machines such as an electrocardiograph or a Holter ECG recorder) [11][12][13][14][15][16][18] showing AF rhythm from clinical ECGs showing normal sinus rhythm only [5][6][7][8][9][10][11][13][14][15]. These working conditions are much easier than those considered in this study, in which ECGs were acquired by a portable device and AF rhythms are discriminated not only from normal sinus rhythm but also from other arrhythmias. One work [17] created a classifier able to detect AF using the same database as our paper and obtained values of Se and Sp equal to 77.5% and 97.9%, respectively; thus, differently from us, it chose to optimize Sp over Se. In any event, we believe that the ROC curve should always be provided, since the choice of a threshold is a matter for specialists in medical decision making. Finally, some proceedings from Computing in Cardiology 2017 used the same database as the training dataset, but then validated their methods on another dataset, which however is not publicly available. Considering this discrepancy, a comparison between these studies and our work would be biased.
Finally, reliable identification of AF in ECGs acquired by portable or wearable devices is important for large-scale preventive screening of the worldwide population [35]. In this context, our RSL_ANN represents a reliable software application to be associated with such devices in order to counteract the socio-economic repercussions of AF due to its usual late diagnosis. Future studies are needed to definitively validate the use of RSL_ANN for large-scale AF screening.

Conclusions
Our proposed supervised fully connected artificial neural network created using the repeated structuring and learning procedure was able to reliably identify atrial fibrillation from the data acquired with the portable KARDIA device by AliveCor available in the Physionet "AF Classification from a Short Single Lead ECG Recording" database. Thus, our proposed artificial neural network represents a promising tool for a reliable identification of atrial fibrillation from ECGs acquired by portable devices, even when affected by other abnormal rhythms and corrupted by noise.