Personal Heart Health Monitoring Based on 1D Convolutional Neural Network

The automated detection of suspicious anomalies in electrocardiogram (ECG) recordings allows frequent personal heart health monitoring and can drastically reduce the number of ECGs that cardiologists need to examine manually by excluding those classified as normal, which facilitates healthcare decision-making and saves a considerable amount of time and money. In this paper, we present a system that automatically detects suspected cardiac pathologies in ECG signals from personal monitoring devices, with the aim of alerting the patient to send the ECG to a medical specialist for a correct diagnosis and proper therapy. The main contributions of this work are: (a) the implementation of a binary classifier based on a 1D-CNN architecture for detecting suspected anomalies in ECGs, regardless of the kind of cardiac pathology; (b) an analysis carried out on 21 classes of different cardiac pathologies classified as anomalous; and (c) the ability to classify anomalies even in ECG segments containing more than one class of cardiac pathology at the same time. Moreover, 1D-CNN based architectures have low computational complexity, which allows the system to be implemented on cheap smart devices. The system was tested on ECG signals from the MIT-BIH ECG Arrhythmia Database for the MLII derivation. Two different experiments were carried out, showing remarkable performance compared to other similar systems. The best result showed high accuracy and recall computed in terms of ECG segments, and even higher accuracy and recall in terms of patients alerted, that is, considering the detection of anomalies with respect to entire ECG recordings.


Introduction
The aging of the population is leading to an increase in patients suffering from cardiac pathologies, therefore requiring electrocardiographic monitoring. An electrocardiogram (ECG) is an easy, rapid, and non-invasive tool that traces the electrical activity of the heart [1] revealing the presence of cardiac pathologies such as conduction disease, channelopathies, structural heart disease, and previous ischemic injury [2]. On the other hand, investigating the altered acoustic characteristics of the cardiac tones, as an example, may allow the early identification of valve malfunction [3].
Systems able to support the doctors' work in the diagnosis of pathologies can facilitate healthcare decision-making, considerably reducing the expenditure of time and money [4][5][6][7][8][9][10][11][12]. The ECG has become the diagnostic procedure most commonly performed in clinical cardiology [13][14][15], and the diffusion of wearable and portable devices enables constant monitoring of cardiac activity, for example in elderly people through wireless sensor networks [16]. Cardiologists cannot examine the millions of ECGs recorded daily by portable devices. Thus, systems able to automatically detect suspicious anomalies in ECGs are required in order to reduce the number of ECGs that cardiologists need to examine manually, identifying those that need further examination and also the urgency of such examination. For this reason, such systems require high detection performance, both to avoid normal ECGs being incorrectly detected as anomalous and sent to a medical professional and, even more importantly, to ensure that an electrocardiographic alteration, which could be the indicator of a cardiac pathology, is recognized and does not escape the observation of the cardiologist. To ensure that anomalous ECGs are examined by the medical specialist and that the proper therapy is administered, the detection system should maximize recall for anomalous ECGs, that is, maximize the number of anomalous ECGs correctly classified as anomalous, even at the cost of some accuracy.
The goal of this paper is to implement a system able to automatically detect suspected cardiac pathologies in ECG signals to support personal monitoring devices. We propose a 1D-CNN architecture optimized to detect anomalous ECG recordings, regardless of the kind of cardiac pathology, including 21 classes of anomalies in the analysis.
The system presented here was designed to be implemented on devices for personal use, with the aim of sending to the cardiologist only those ECGs in which a cardiac alteration is suspected, for further examination; thus, no information about the specific class of anomaly is provided. The proposed system is therefore based on a binary classification model.
We want to make clear that the main goal of our study is not to classify different cardiac pathologies, but to make sure that a suspected pathology is detected and that the patient is alerted; a correct diagnosis can then be reached with specific tests and the intervention of medical staff. The cardiologist will examine all ECGs detected as anomalous, identifying the pathology and prescribing the proper treatment. The system has been implemented with the aim of achieving high levels of recall for anomalous ECGs, in order to minimize the possibility that any kind of cardiac alteration escapes the observation of the cardiologist. This paper is organized as follows: Section 2 reports a wide background; Section 3 describes ECG signals; Section 4 illustrates materials and methods; Section 5 presents results and discussions; and Section 6 sets out conclusions.

Background
Many studies have proposed the implementation of artificial neural networks and deep learning architectures for the development of automatic systems able to recognize the suspect of cardiac anomalies [40][41][42]. In the literature the detection of cardiac anomalies has been investigated analyzing both heart sounds acquired by digital stethoscopes and ECG signals from portable devices.
Meintjes et al. [43] implemented continuous wavelet transform (CWT) scalograms and convolutional neural networks for the correct classification of the fundamental heart sounds in recordings of normal and pathological heart sounds. They implemented a methodology in order to distinguish between the first and second heart sounds using CWT decomposition and convolutional neural network (CNN) features. Results show the high potential in the use of CWT and CNN in the analysis of heart sounds compared to support vector machine (SVM), and k-nearest neighbors (kNN) classifiers. In [44] authors propose the classification of heart sounds on short, unsegmented recordings and normalized spectral amplitude of 5 s duration phonocardiogram segments was determined by fast Fourier transform and wavelet entropy by wavelet analysis. Spectral amplitude and wavelet entropy features were then combined in a classification tree. They achieved accuracy comparable to other algorithms obtained without the complexity of segmentation. Redlarski et al. [17] presented a new heart sound classification technique combining linear predictive coding coefficients, used for feature extraction, with a classifier built upon combining support vector machine and the modified cuckoo search algorithm. It showed good performance of the diagnostic system, in terms of accuracy, complexity [45][46][47] and range of distinguishable heart sounds.
With the application of deep learning architectures, also the accuracy of ECG diagnostic analysis has achieved new high levels. The systems implemented using such techniques allow the automated interpretation of ECG signals from portable devices in real time. The common deep learning networks for the analysis of ECG signals are mainly based on recurrent neural networks (RNNs), convolutional neural networks (CNNs), and some other architectures [1].
Chauhan et al. [48] investigated the applicability of deep recurrent neural network architectures with long short-term memory (LSTM) for detecting cardiac arrhythmias in ECG signals. This approach is quite fast and requires neither preprocessing of the data [49], nor hand-coded features, nor prior information about the abnormal signal. The network was tested on the MIT-BIH Arrhythmia Database for the classification of four different types of arrhythmias, showing that LSTMs may be a viable candidate for anomaly detection in ECG signals.
Saadatnejad et al. [50] proposed an LSTM-based ECG classification algorithm for continuous cardiac monitoring on wearable devices. They preprocessed the data by extracting RR-interval and wavelet features from ECG samples. The ECG signal, along with the extracted features, was fed into multiple LSTM recurrent neural networks. The MIT-BIH ECG Arrhythmia Database was used for the classification of six different types of anomalies. The proposed algorithm achieved accurate LSTM-based ECG classification on wearable devices with low computational cost.
Thill et al. [51] presented an unsupervised time series anomaly detection algorithm to detect anomalies in ECG readings. They used a recurrent LSTM network to predict the normal time series behavior without using the anomaly class labels, building a multivariate normal error model for the nominal data. Anomalous events were detected with high probability through a high Mahalanobis distance. They classified six anomaly classes and obtained good performance, achieving high levels of precision and recall.
Although RNN architectures are suitable for processing time series data, they present some limitations. The major drawbacks of RNNs are the vanishing and exploding gradient problems, which make their training difficult and prevent the processing of very long sequences. Moreover, due to their recurrent nature, computation is slow. For this reason, some studies investigated the implementation of 1D-CNNs, with the main aim of designing systems with low computational complexity to support portable devices. Kiranyaz et al. [52] reviewed the state-of-the-art techniques used in signal processing applications such as patient-specific ECG classification, structural health monitoring, anomaly detection in power electronics circuitry, and motor-fault detection. In particular, they highlighted how adaptive and compact 1D-CNNs can achieve higher performance than conventional deep 2D CNNs with low computational complexity. Adaptive and compact 1D-CNNs can be efficiently trained with a limited data set of 1D signals instead of the massive data sets required by deep 2D CNNs. They can be applied directly to the raw signal without any pre- or post-processing, such as feature extraction, selection, or dimensionality reduction, and they are able to extract features from shorter segments of the overall data set. Moreover, due to their low computational requirements, 1D-CNNs are well suited for real-time and low-cost applications, especially on smart mobile devices, which can be proper tools for personal health monitoring [52,53]. Yıldırım et al. [54] proposed a new 1D convolutional neural network approach for the automatic classification of cardiac arrhythmia in long-duration electrocardiography (ECG) signals. The model was applied to 10-s fragments of ECG signals including 17 different classes of cardiac arrhythmia.
The model showed remarkable performance and could be implemented with low computational complexity on mobile devices and cloud computing for tele-medicine, e.g., patient self-monitoring and preventive health. Li et al. [55] proposed a 1D-CNN based model to classify ECG signals. The model consisted of five layers and classified five typical kinds of arrhythmia signals. It achieved promising classification accuracy and significantly outperformed several typical ECG classification methods. Zubair et al. [56] proposed an ECG beat classification system using a 1D-CNN model. The proposed classification system efficiently classified ECG beats into five different classes. Results showed that the model achieved significant classification accuracy and better computational efficiency than most state-of-the-art methods for ECG signal classification. Avanzato et al. [57] proposed a new neural architecture based on 1D-CNN for the development of automatic heart disease diagnosis systems using ECG signals. The model was applied to 30 s segments and classified three different classes of anomalies. It showed high performance and a low-complexity implementation. Kamaleswaran et al. [58] introduced a novel deep learning architecture for the detection of normal sinus rhythm, AF, other abnormal rhythms, and noise. They proposed an optimal 13-layer 1D-CNN model, which identified normal, AF, and other rhythms using single-lead short ECG recordings. The architecture was computationally fast and could also be used in real-time cardiac arrhythmia detection applications.

ECG Signal
The mechanical pumping activity of the heart muscle is determined by the rhythmic generation of an electrical impulse that originates at the level of the sinoatrial node and, through specialized conduction pathways, spreads to all cardiac muscle cells causing cycles of depolarization and repolarization underlying the contraction of single cells.
The ECG is the graphic reproduction of the electrical activity of the heart during its functioning, recorded at the surface of the body. The doctor, usually a specialist cardiologist, interprets the electrocardiographic recording by detecting the presence of cardiac arrhythmias, structural changes in the cardiac cavities (atria and/or ventricles), ischemia, myocardial infarction, and other cardiopathies characterized by an alteration of electrical conduction. A beat of the ECG signal is characterized by five waves, P, Q, R, S, and T [59], where each wave is related to a specific interval of the polarization-depolarization cycle. A normal ECG has a characteristic pattern that varies only in the presence of problems. The fundamental morphology of the ECG is given by three deflections (P, QRS, and T), which represent the formation and diffusion of the cardiac electrical impulse along the pathways of the conduction system (Figure 1).

Dataset
The proposed method was tested on the MIT-BIH Arrhythmia Database supplied by PhysioNet, a web resource for databases of complex physiologic signals [60]. The MIT-BIH Arrhythmia Database contains 48 half-hour excerpts of two-channel ambulatory ECG recordings, obtained from 47 subjects: 25 males aged between 32 and 89 years and 22 females aged between 23 and 89 years. As described by the authors in [60], the database consists of 23 recordings randomly selected from a set of 4000 24-h ambulatory ECG recordings collected from a mixed population of hospitalized (approximately 60%) and ambulatory (approximately 40%) patients at Beth Israel Hospital in Boston. The remaining 25 recordings were selected from the same set in order to include less common but clinically significant arrhythmias that would not be well represented in a small random sample. The recordings were digitized at 360 samples per second per channel with 11-bit resolution over a 10-mV range. For our analysis, we used the same data published on Kaggle in txt and csv format, since they were easier to process [61].
The database contains 22 classes, 1 for normal beat, and 21 for various kinds of anomalies in ECG recordings.

Data Organization
Data were structured in tabular form in order to be processed by the neural network. The ECGs are two-channel recordings whose leads vary among subjects. In most ECGs, one channel is a Modified-Lead II (MLII) and the other channel is generally V1, sometimes V2, V4, or V5, depending on the subject. Since the MLII is present in almost every ECG, we only considered this lead. Four ECG recordings, containing only leads V1 and V5, were excluded from the analysis. The 30-min ECG recordings were fragmented into segments of 15 s. Since the recordings were digitized at 360 samples per second, each segment consisted of 5400 samples (Tables 1 and 2). Each segment was included in the analysis and no data cleaning process was executed.
Based on the annotations assigned to the peaks present, each segment was labeled as follows: if in the segment all peaks were annotated as normal then the entire segment was labeled as normal; if in the segment at least one peak was annotated as anomalous, presenting any kind of anomalies, then the entire segment was labeled as anomalous.
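The segmentation and labeling rule described above can be illustrated with a minimal NumPy sketch. The function name `segment_and_label`, the use of the symbol `"N"` for a normal beat, the dropping of an incomplete final segment, and the choice to label a segment with no annotations as normal are assumptions made for illustration; they are not taken from the original implementation.

```python
import numpy as np

FS = 360                             # sampling rate in Hz (from the database description)
SEGMENT_SECONDS = 15                 # segment duration used in this work
SEGMENT_LEN = FS * SEGMENT_SECONDS   # 5400 samples per segment


def segment_and_label(signal, beat_positions, beat_symbols, normal_symbol="N"):
    """Cut a 1D signal into non-overlapping 15 s segments and label each one.

    A segment is labeled 1 (anomalous) if at least one annotation falling
    inside it differs from `normal_symbol`, otherwise 0 (normal).
    """
    signal = np.asarray(signal, dtype=float)
    n_segments = len(signal) // SEGMENT_LEN          # assumption: drop an incomplete tail
    segments = signal[: n_segments * SEGMENT_LEN].reshape(n_segments, SEGMENT_LEN)

    labels = np.zeros(n_segments, dtype=int)
    for pos, sym in zip(beat_positions, beat_symbols):
        idx = pos // SEGMENT_LEN                     # segment that contains this annotation
        if idx < n_segments and sym != normal_symbol:
            labels[idx] = 1
    return segments, labels


# Toy example: a synthetic signal slightly longer than three segments,
# with one non-normal annotation ("V") in the second segment.
rng = np.random.default_rng(0)
toy_signal = rng.normal(size=3 * SEGMENT_LEN + 100)
toy_positions = [100, 6000, 6100, 12000]
toy_symbols = ["N", "N", "V", "N"]
segments, labels = segment_and_label(toy_signal, toy_positions, toy_symbols)
print(segments.shape, labels.tolist())
```

Labeling at the segment level means that a single anomalous annotation is enough to mark the whole 15 s window, which matches the goal of raising an alert rather than identifying the specific pathology.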

In the dataset, each row represented a 15 s segment labeled as normal or anomalous. A total of 2105 segments were labeled as normal and 3175 as anomalous. Since the classes were clearly imbalanced, we undersampled the segments labeled as anomalous, keeping only a part of these data in order to balance the training set. The anomalous segments in excess were included in the test set.
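As an illustration of the balancing step, the following sketch undersamples the majority class at random and returns the excess indices separately. The function name `undersample_majority`, the random selection, and the fixed seed are assumptions; the text does not state how the retained anomalous segments were chosen.

```python
import random


def undersample_majority(labels, seed=0):
    """Return (kept, excess) index lists so that both classes are equally
    represented in `kept`; `excess` holds the surplus of the larger class."""
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for i, y in enumerate(labels):
        by_class[y].append(i)

    n_keep = min(len(by_class[0]), len(by_class[1]))
    kept, excess = [], []
    for idxs in by_class.values():
        shuffled = idxs[:]
        rng.shuffle(shuffled)            # assumption: random choice of retained segments
        kept.extend(shuffled[:n_keep])
        excess.extend(shuffled[n_keep:])
    return sorted(kept), sorted(excess)


# Toy example: 4 normal (0) and 6 anomalous (1) segments.
toy_labels = [0] * 4 + [1] * 6
kept, excess = undersample_majority(toy_labels)
print(len(kept), len(excess))
```

In this toy case the eight retained indices contain four segments per class, and the two surplus anomalous segments would be available for the test set.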
As will be detailed later, we carried out two distinct experiments, preprocessing data in different ways. In the first experimentation, segments were randomly included in the training set, using an equal proportion of segments for the two classes. The training set consisted of 2930 segments, 1465 labeled as normal, and 1465 labeled as anomalous. The training set was split in 70% for training and 30% for validation, keeping the same proportion of normal and anomalous segments. Segments contained in the test set presented a different proportion of the two classes, including 640 labeled as normal (60%) and 426 as anomalous (40%). Examples of normal and anomalous ECG recording are shown in Figure 2. Table 3 shows the number of ECG segments for each normal or anomalous class. The segments classified as anomalous can include one or more types of anomalies.
In the first experiment, following a procedure that is widespread in the literature and used in the studies we compared with our work, we randomly assigned segments extracted from the same ECG recordings to the training and test sets. In the second experiment, we carried out a patient-oriented analysis, assigning all segments extracted from a given ECG recording to the same set. The training set was built from the ECGs of 28 patients, since each ECG is related to a single patient. Segments extracted from the remaining 16 ECGs were assigned to the test set. In this way, segments of the same patient's ECG recording are not interleaved between the training and test sets. The training set consisted of 3360 segments, 1594 normal and 1766 anomalous, and the test set included 1800 segments, 435 normal and 1365 anomalous. With this selection procedure, no segment of the ECG recordings included in the test set had been presented to the network during the training phase. Thus, the evaluation is done only on ECG recordings of patients never seen by the model.
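A patient-oriented split of this kind can be sketched as follows. The function name `split_by_recording` and the random choice of which recordings go to the training set are assumptions for illustration; the text specifies only the counts (28 recordings for training, 16 for testing).

```python
import random


def split_by_recording(recording_ids, n_train_recordings, seed=0):
    """Split segment indices so that no recording contributes to both sets."""
    rng = random.Random(seed)
    unique_ids = sorted(set(recording_ids))
    rng.shuffle(unique_ids)              # assumption: recordings are chosen at random
    train_ids = set(unique_ids[:n_train_recordings])

    train_idx = [i for i, r in enumerate(recording_ids) if r in train_ids]
    test_idx = [i for i, r in enumerate(recording_ids) if r not in train_ids]
    return train_idx, test_idx


# Toy example: 44 recordings with 3 segments each, 28 recordings for training.
toy_recording_ids = [rec for rec in range(44) for _ in range(3)]
train_idx, test_idx = split_by_recording(toy_recording_ids, n_train_recordings=28)
train_recs = {toy_recording_ids[i] for i in train_idx}
test_recs = {toy_recording_ids[i] for i in test_idx}
print(len(train_recs), len(test_recs), train_recs.isdisjoint(test_recs))
```

Splitting on the recording identifier rather than on the segment index is what prevents segments of one patient from appearing on both sides of the evaluation.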

Model Architecture
Convolutional neural network is a special kind of artificial neural network developed for image classification in which the model normally processes two-dimensional spatial input data representing an image's pixels, in a process called feature learning. The same model can be used for one-dimensional sequence of data, such as an analysis of time series data, signal data, or natural language processing. The architecture of the model is described in [62]. The electrocardiogram signal is a time series data sequence that represents electrical impulses from the myocardium [63]. Thus, we propose a 1D convolutional neural network consisting of four convolutional blocks and one output block for the analysis of ECG signal data, in order to automatically identify normal and anomalous ECG recordings.
The convolutional block, as represented in Figure 3, consists of a 1D convolutional layer, a batch normalization layer, a 1D max pooling layer, and a rectified linear unit (ReLU) layer, while the output block contains a 1D average pooling layer, a flatten layer, a dense layer, and a Softmax layer. We chose an architecture with four convolutional blocks, since it showed a good trade-off between computational efficiency and accuracy of the results.
The 1D convolutional layer creates a convolution kernel that is convolved with the input layer over a single dimension to produce an output tensor. The kernel size was set to 80 in the first layer and decreased to 4 in the subsequent layers in order to reduce computational costs (Table 4). Batch normalization standardizes the input; it was applied after each convolutional layer and before the pooling layer in order to improve performance and stabilize the learning process of the deep neural network [59]. The output of the batch normalization layer was downsampled by means of a 1D max pooling layer with a pool size of 4.
Table 3. ECG segments for each ECG class in the training and test sets for the first experiment. Since our goal is to detect any kind of anomaly, we considered both rhythm and beat alterations with no distinction. We also considered signal-related annotations, since a low-quality, noisy signal could represent an issue for medical diagnosis.
The 1D max pooling layer downsamples the input representation by taking the maximum value over the window defined by the pool size. The stride specifies how far the pooling window moves at each pooling step. The pooling layer was placed before the ReLU to reduce overfitting. In the output block, the 1D average pooling layer performs the same operation as the 1D max pooling layer but takes the average value of the window. After the average pooling, the network has a flatten layer that reshapes the multi-dimensional feature maps obtained in the previous layer into a vector of the appropriate size for the subsequent layers of the network. The output of the flatten layer is the input of a fully connected dense layer, which uses the Softmax function to predict the output classes. Moreover, we used dropout to prevent overfitting. Dropout is a regularization technique that helps to reduce interdependent learning amongst the neurons: at each training step, a randomly chosen set of neurons is dropped out of the network. We used dropout in the dense layer with a fraction of 0.6. With the same aim of preventing overfitting, we also used weight regularization in the dense layer, namely an L2 regularization penalty (sum of the squared weights) with a hyperparameter equal to 0.001, in order to keep the weights of the dense layer small. The architecture of the proposed 1D-CNN model is shown in Figure 3 and Table 4.
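To make the order of operations concrete, the following NumPy sketch runs a forward pass through four convolutional blocks (convolution, batch normalization, max pooling, ReLU) followed by an output block (average pooling, flatten, dense, Softmax). It is only an illustration under stated assumptions: the number of filters (`N_FILTERS = 8`), stride 1 with no padding, a pooling stride equal to the pool size, an average pooling size of 4, and the random weights are all hypothetical, since the values of Table 4 are not reproduced here. Dropout and L2 regularization act only during training and are omitted, as are the learnable scale and shift of batch normalization.

```python
import numpy as np


def conv1d(x, kernels, bias):
    """1D convolution, stride 1, no padding.
    x: (batch, length, in_ch); kernels: (k, in_ch, out_ch); bias: (out_ch,)"""
    k = kernels.shape[0]
    # windows has shape (batch, length - k + 1, in_ch, k)
    windows = np.lib.stride_tricks.sliding_window_view(x, k, axis=1)
    return np.einsum("blck,kco->blo", windows, kernels) + bias


def batch_norm(x, eps=1e-5):
    """Standardize each channel over the batch and length axes."""
    mean = x.mean(axis=(0, 1), keepdims=True)
    var = x.var(axis=(0, 1), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def pool1d(x, pool_size, reduce):
    """Non-overlapping pooling along the length axis (stride = pool_size)."""
    b, length, ch = x.shape
    out_len = length // pool_size
    trimmed = x[:, : out_len * pool_size, :]
    return reduce(trimmed.reshape(b, out_len, pool_size, ch), axis=2)


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract the row maximum for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def forward(x, conv_params, dense_w, dense_b, pool_size=4):
    for kernels, bias in conv_params:
        x = conv1d(x, kernels, bias)
        x = batch_norm(x)
        x = pool1d(x, pool_size, np.max)   # max pooling placed before the ReLU
        x = np.maximum(x, 0.0)             # ReLU
    x = pool1d(x, pool_size, np.mean)      # average pooling in the output block
    x = x.reshape(x.shape[0], -1)          # flatten
    return softmax(x @ dense_w + dense_b)  # dense layer with Softmax


rng = np.random.default_rng(0)
N_FILTERS = 8                              # hypothetical filter count
conv_params, in_ch = [], 1
for k in [80, 4, 4, 4]:                    # kernel sizes stated in the text
    conv_params.append((rng.normal(scale=0.1, size=(k, in_ch, N_FILTERS)),
                        np.zeros(N_FILTERS)))
    in_ch = N_FILTERS

batch = rng.normal(size=(2, 5400, 1))      # two 15 s segments, one lead
# Length after each block: 5400 -> 1330 -> 331 -> 82 -> 19, then 4 after average pooling.
dense_w = rng.normal(scale=0.1, size=(4 * N_FILTERS, 2))
dense_b = np.zeros(2)
probs = forward(batch, conv_params, dense_w, dense_b)
print(probs.shape)
```

Under these assumptions, the large first kernel and the repeated pooling by 4 shrink a 5400-sample segment to a few dozen features before the dense layer, which is consistent with the low computational cost that motivates the 1D-CNN design.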

Validation and Performance Metrics
In order to evaluate the performance of our model, the training set was split into 70% for training and 30% for validation. Then, to validate the stability of the model and the generality of the results, a resampling procedure was performed on the training set. In particular, we implemented k-fold cross-validation with k = 10. Data were split into 10 groups; each group in turn was used as the validation set, with the remaining groups used as the training set. Data were split such that no observation could be included in both the training and test sets. The network was fitted on the training set and evaluated on the validation set 10 times. Results were summarized as the mean and standard deviation of the model performance metrics. Finally, the test set was used to verify the robustness of the neural network on data not included in the training set.
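The resampling procedure can be sketched in plain Python as below. The helper names `k_fold_indices` and `mean_std`, the shuffling with a fixed seed, and the absence of class stratification are assumptions for illustration; the sample count 2930 is the size of the training set in the first experiment.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]      # k disjoint groups
    for i in range(k):
        validation = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation


def mean_std(values):
    """Mean and (population) standard deviation of per-fold metric values."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance ** 0.5


splits = list(k_fold_indices(n_samples=2930, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

A model would be fitted on each `train` list and scored on the matching `validation` list, and `mean_std` would then summarize the ten scores.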
To evaluate the performance of the neural network, we computed the confusion matrix and the traditional classification metrics. In particular, given TP (true positives) and TN (true negatives), the numbers of events correctly classified as successes and failures, respectively, and FP (false positives) and FN (false negatives), the numbers of events incorrectly classified as successes and failures, respectively, we calculated

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FN + FP}, \tag{1}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN}, \tag{2}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \tag{3}$$

$$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}. \tag{4}$$
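The traditional classification metrics (accuracy, recall, precision, and F1-score) can be computed directly from the confusion-matrix counts. The following is a minimal sketch using their standard definitions; the function name `classification_metrics` and the toy counts are illustrative and are not the values of Table 5.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard metrics from confusion-matrix counts for the positive class."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}


# Toy counts (hypothetical): 80 TP, 90 TN, 10 FP, 20 FN.
toy_metrics = classification_metrics(tp=80, tn=90, fp=10, fn=20)
for name, value in toy_metrics.items():
    print(f"{name}: {value:.4f}")
```

Per-class figures, such as separate recall values for normal and anomalous segments, follow from calling the function twice while swapping which class is treated as positive.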

Results and Discussions
We performed the proposed 1D convolution neural network on the MIT-BIH Arrhythmia data processed as segments of 15 s consisting of 5400 samples including 21 different classes of anomalies.
In the first experiment, we split the dataset based on segments, as explained in Section 4.2. The network was trained on 70% of the set and validated on the remaining 30% for 200 epochs. Results for the validation performed on the training set are shown in
In order to validate the stability of the proposed method, we used k-fold cross-validation, as explained in Section 4.4. The average accuracy of the model was 89.3 ± 0.26% (mean ± standard deviation), with an average loss of 0.28 ± 0.06% and an average recall of 85.6 ± 0.03%.
In order to assess the performance, the network was evaluated on the test set. The confusion matrix and the related metrics were computed (Table 5). The network showed an accuracy of 89.51%, a recall of 91.09% for normal and 87.79% for anomalous segments, a precision of 91.81% for normal and 86.78% for anomalous segments, and an F1-score of 91.45% for normal and 87.28% for anomalous segments.

We compared the proposed network with state-of-the-art studies that applied a 1D convolutional neural network to the MIT-BIH Arrhythmia Database with the aim of implementing an automatic classification of cardiac pathologies based on ECG signals. The results of this comparison are reported in Table 6. Briefly summarizing, Yıldırım et al. [54] proposed a 1D convolutional neural network model for cardiac arrhythmia (17 classes) detection based on long-duration electrocardiography (ECG) signal analysis. They designed a complete end-to-end structure with neither hand-crafted feature extraction of the signals nor feature selection at any stage, using a 16-layer deep network structure including standard CNN layers. The network was tested on the MIT-BIH Arrhythmia Database considering 10 s ECG signal segments for one lead (MLII), for a total of 1000 ECG signal segments from 45 persons. The network achieved an accuracy of 91.33% in the detection of 17 cardiac arrhythmia disorders (classes). Li et al. [55] proposed a 1D convolutional neural network to classify ECG signals. The network consisted of five layers in addition to the input layer and the output layer, in particular, two convolution layers, two downsampling layers, and one fully connected layer. The model extracted the effective features from the data and classified the features automatically. The wavelet threshold method was used to filter the high-frequency noise, while the wavelet transform and reconstruction algorithm was used to correct the baseline drift, which is a low-frequency noise.
Subsequently, the segmentation of ECG signals and the reduction of dimensions were performed using R peaks located by means of the wavelet transform. The network achieved an accuracy of 97.5% in the detection of five typical kinds of arrhythmia signals. Zubair et al. [56] proposed an ECG beat classification system based on a 1D convolutional neural network. The model integrated the feature extraction and classification stages of ECG pattern recognition and consisted of three convolution layers, three pooling layers, a multilayer perceptron, and a Softmax layer. The classification was performed in three main steps: ECG beat detection, sample extraction, and classification. The ECG beat detection stage involves the detection of the individual beat signals from the 30-min ECG recording of each patient, using Modified-Lead II signals. An equal number of samples (100) was extracted on the right and left sides of the R-peak, and downsampling was performed to represent the raw data of each beat by 128 samples. The network classified five different kinds of anomalies with a detection accuracy of 92.7%. Avanzato et al. [57] proposed an automatic heart disease diagnosis system using ECG signals based on the direct application of a 1D convolutional neural network. The network consisted of three layers in addition to the input layer and the output layer. Unlike classic CNNs, which use fully connected neurons as their output layer, this network used a single average pooling layer followed by a Softmax and a natural logarithm. The neural network input consisted of 30-s segments, where every second of ECG recording was equivalent to 360 samples, for a total of 10,800 samples. The paper evaluated the performance of the network on three classes of anomalies, with an accuracy of 98.33%.
Table 6 shows that the proposed method performs remarkably well compared to the other studies, considering that it detects anomalies in ECG recordings spanning 21 different classes. In addition, the proposed method is able to classify anomalies even in the presence of more than one kind of cardiac pathology in the same ECG segment. For anomalous segments it achieves a recall of 87.79% and an F1-score of 87.28%, improving on the results obtained in [54], which considered only 17 classes of anomalies. These results were achieved with a detection accuracy of 89.51%, slightly lower than the accuracy obtained with the method proposed in [54], despite the increase in the number of different kinds of anomalies. The high recall, obtained at the cost of a small loss in accuracy, is consistent with the research goal of reducing the number of ECGs sent to the medical specialist for further examination. At the same time, we need to ensure that anomalous ECGs cannot escape medical examination and the prescription of the proper therapy. Although the results obtained in [54][55][56] showed higher performance, in those studies the analysis covered a smaller number of anomalies. In particular, [55,56] included five classes in the analysis and obtained accuracies of 97.5% and 92.7%, respectively. In [57] the analysis was conducted on three classes, with an accuracy of 98.33%. Note that the model in [54] was based on a 16-layer architecture (and therefore has a high computational complexity for deployment on, for example, wearable devices), whereas the models in [55][56][57], although based on four levels, operate only on a very small number of anomalies.
Analyzing the 52 segments that contained anomalies but were incorrectly classified as normal (false negatives, see Table 5), we observed that, for most of them, at least one other segment extracted from the same ECG recording was correctly classified as anomalous. Overall, only five ECG recordings from patients affected by cardiac pathologies were not detected at all.
In light of this, we carried out a second experiment in which the training set and the test set were built from disjoint ECG recordings, so that all the segments extracted from a given recording belonged to only one of the two sets, as explained in Section 4.2. Thus, the proposed 1D-CNN was tested on segments extracted from ECG recordings never presented to the network during training.
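As an illustration of this recording-wise partitioning, the following minimal Python sketch assigns whole recordings, rather than individual segments, to the training or the test set. It is not the authors' code: the function name `split_by_recording`, the random assignment of recordings, and the toy identifiers are assumptions made only for the example.

```python
import random
from collections import defaultdict


def split_by_recording(segments, n_train_recordings, seed=0):
    """Split (recording_id, segment) pairs so that no recording spans both sets.

    Hypothetical helper: recordings are assigned at random here, which is an
    assumption and not necessarily how the paper chose its training recordings.
    """
    by_recording = defaultdict(list)
    for recording_id, segment in segments:
        by_recording[recording_id].append(segment)

    # Shuffle recording identifiers, not segments, to avoid leakage.
    recording_ids = sorted(by_recording)
    random.Random(seed).shuffle(recording_ids)
    train_ids = set(recording_ids[:n_train_recordings])

    train = [(r, s) for r in recording_ids if r in train_ids
             for s in by_recording[r]]
    test = [(r, s) for r in recording_ids if r not in train_ids
            for s in by_recording[r]]
    return train, test


# Toy data: 5 hypothetical recordings with 4 segments each.
toy_segments = [(f"rec{r}", f"seg{r}_{i}") for r in range(5) for i in range(4)]
train_set, test_set = split_by_recording(toy_segments, n_train_recordings=3)
train_recordings = {r for r, _ in train_set}
test_recordings = {r for r, _ in test_set}
```

With this kind of split, every segment of a test recording is unseen by the network during training, which is the property the second experiment relies on.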
The training set consisted of 3360 segments (1594 normal and 1766 anomalous) extracted from 28 ECG recordings, whereas the remaining recordings were included in the test set. The test set contained 1800 segments (435 normal and 1365 anomalous). To evaluate the performance of the model, the network was trained on 70% of the training set and validated on the remaining 30% for 200 epochs. Figure 5 shows the learning process in terms of training and validation loss and accuracy. The model converged within the 200 training epochs. Also in this case, the learning curves of the training and validation loss stabilized around zero, and the learning curves of the training and validation accuracy stabilized around 90%.
To validate the stability of the proposed method, we used k-fold cross-validation, as explained in Section 4.4. Reported as mean ± standard deviation, the accuracy of the model was 88.2 ± 0.28%, the loss was 0.29 ± 0.07, and the recall was 87.6 ± 0.03%.
The network was then evaluated on the test set. The resulting confusion matrix and the related metrics are shown in Table 7.
The network showed an accuracy of 84.94%, and a recall of 55.40% for normal and 94.36% for anomalous segments, the precision of 75.79% for normal and 86.91% for anomalous segments, and F1-score 64.01% for normal and 90.48% for anomalous segments.
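For readers who want to see how these per-class figures follow from a binary confusion matrix, a minimal Python sketch is given here. The four counts are not copied from Table 7: they are inferred, as an assumption, from the test-set composition (435 normal and 1365 anomalous segments) and the reported recall values. The function name `per_class_metrics` is hypothetical.

```python
def per_class_metrics(tp, fn, fp, tn):
    """Accuracy plus recall, precision and F1 for both classes.

    The positive class is 'anomalous' and the negative class is 'normal'.
    """
    def f1(precision, recall):
        return 2 * precision * recall / (precision + recall)

    recall_anom = tp / (tp + fn)
    precision_anom = tp / (tp + fp)
    recall_norm = tn / (tn + fp)
    precision_norm = tn / (tn + fn)
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "recall_anomalous": recall_anom,
        "precision_anomalous": precision_anom,
        "f1_anomalous": f1(precision_anom, recall_anom),
        "recall_normal": recall_norm,
        "precision_normal": precision_norm,
        "f1_normal": f1(precision_norm, recall_norm),
    }


# Counts inferred from the reported figures (an assumption, see lead-in):
# 1365 anomalous segments = 1288 detected + 77 missed,
# 435 normal segments = 241 correctly passed + 194 false alarms.
metrics = per_class_metrics(tp=1288, fn=77, fp=194, tn=241)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

Under these inferred counts, the sketch reproduces the percentages reported for the second experiment to two decimal places.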
The model showed a higher recall for anomalous segments compared to the previous test. Analyzing the 77 segments containing anomalies and incorrectly classified as normal, we observed that, for all of them, at least one of the segments extracted from the same ECG recording was correctly classified as anomalous. This result showed a remarkable improvement in the performance of the proposed method since, thinking in terms of patients rather than segments, the system was able to detect 100% of ECG recordings from patients affected by cardiac pathologies.
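The patient-level reading of the results can be sketched as follows, assuming that a recording raises an alert as soon as at least one of its segments is classified as anomalous. This aggregation rule is our interpretation of the text above, and the function names and toy predictions are hypothetical.

```python
from collections import defaultdict


def recordings_alerted(predictions):
    """Map each recording to True if any of its segments was flagged.

    predictions: iterable of (recording_id, predicted_label), where
    label 1 means 'anomalous' and label 0 means 'normal'.
    """
    flagged = defaultdict(bool)
    for recording_id, label in predictions:
        flagged[recording_id] = flagged[recording_id] or label == 1
    return dict(flagged)


def recording_level_recall(predictions, anomalous_recordings):
    """Fraction of truly anomalous recordings that raised an alert."""
    flagged = recordings_alerted(predictions)
    detected = sum(1 for r in anomalous_recordings if flagged.get(r, False))
    return detected / len(anomalous_recordings)


# Toy example: recordings A and B are anomalous, C is normal.
toy_predictions = [
    ("A", 0), ("A", 1), ("A", 0),   # one missed segment pair, one detected
    ("B", 1), ("B", 1),
    ("C", 0), ("C", 0),
]
toy_recall = recording_level_recall(toy_predictions, ["A", "B"])
```

In the toy example, recording A contains segments missed by the classifier but is still alerted because one of its segments was flagged, which mirrors the situation described for the 77 false-negative segments.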
Considering this performance, we believe that the essential goal of detecting anomalous segments while minimizing missed alarms was reached. This claim must be read in the operating context for which the model is intended, in which the cost of a false alarm is considerably lower than that of a missed one.
Moreover, we want to highlight that these results were achieved in the second experiment, whose settings more closely represent how the model would actually be used. In the second experiment, the model was evaluated on ECG recordings of different patients never presented to the model during training. This experimental design is not common, since validation is usually carried out with the holdout method, in which segments from the same ECG recording can end up in both the training and the test sets. With holdout, there is a chance that some signal patterns in the test set have already been presented to the model in the training phase.
However, it should be noted that the system described here still presents a significant false alarm rate. We are currently investigating different strategies to reduce the workload caused by false alarms. One viable solution is to isolate and collect only the segments detected as anomalous; when they occur in significant quantities, they can be sent remotely for expert verification. Looking only at the segments classified as anomalous by our model can be a very quick job for an expert, certainly less demanding than examining an entire ECG, and even more so in the case of a Holter exam. This operational solution is also well suited to continual and active learning techniques, that is, periodically retraining the model on new annotations from an expert; in this way, as new examples come into the system, a gradual decrease in false alarm events is expected, with more balanced performance.
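A minimal Python sketch of this triage idea is shown here. The text does not specify what counts as a "significant quantity", so the threshold `min_flagged` and the function name `segments_to_forward` are purely hypothetical choices for illustration.

```python
def segments_to_forward(segment_predictions, min_flagged=3):
    """Return the flagged segment ids if there are enough of them, else [].

    segment_predictions: iterable of (segment_id, predicted_label), where
    label 1 means 'anomalous'. min_flagged is a hypothetical threshold.
    """
    flagged = [seg_id for seg_id, label in segment_predictions if label == 1]
    return flagged if len(flagged) >= min_flagged else []


# Toy recording with five segments, three of them flagged as anomalous.
toy_recording = [("s0", 0), ("s1", 1), ("s2", 1), ("s3", 0), ("s4", 1)]
to_review = segments_to_forward(toy_recording)
```

Only the flagged segments would be forwarded for expert verification, and the expert's annotations could later serve as new training examples.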

Conclusions
The diffusion of personal portable monitoring devices could lead to millions of ECG recordings needing to be reported every day. Systems able to support cardiologists in interpreting ECGs for the diagnosis of cardiac pathologies are required to facilitate healthcare decision-making, considerably reducing the expenditure of time and money. In fact, the automated detection of suspicious anomalies in ECG recordings can drastically reduce the number of ECGs that need to be manually examined by cardiologists, by excluding those classified as normal.
In the present paper, we propose a system able to automatically detect suspected cardiac pathologies in ECG signals from personal monitoring devices, using a 1D-CNN architecture. The 1D-CNN model avoids the vanishing and exploding gradient problems that make recurrent neural networks difficult to train. Moreover, the 1D-CNN model allows real-time, low-cost systems to be implemented, and it is characterized by low computational complexity and by feasible implementation on smart devices and in cloud computing. The system was optimized to detect suspected anomalies by classifying ECGs as normal or anomalous, regardless of the kind of cardiac pathology. The proposed model was tested on the MIT-BIH ECG Arrhythmia Database, which included 21 different classes of ECG anomalies. Two different experiments were carried out, showing remarkable performance compared to the other studies that tested 1D-CNN architectures on the MIT-BIH ECG Arrhythmia Database. In particular, the network achieved an accuracy of 84.94% and a recall of 94.36% computed with respect to the ECG signal segments, and an accuracy and recall of 100% when computed with respect to the patients, that is, considering the detection of anomalies in the entire ECG recordings.
We are now working on a possible personalization of the model, tunable toward a single person. We expect that the performance of this kind of model could be much better than a general one, with an additional cost of model calibration before the actual usage.
In the same study, we are also investigating a model trained on "normal" segments of healthy patients and abnormal segments of pathological patients (since it is not possible to obtain abnormal segments from a healthy patient). This will allow us to observe whether the model trained on this new dataset presents different characteristics from the model described in this paper. Here, we wanted to compare our study with studies that were as homogeneous as possible, at least in the dataset used, so we did not introduce any further changes regarding the training data.

Institutional Review Board Statement:
The dataset used was not recorded by the authors but originated from a 2001 study by Moody et al. [60]. The authors of the database stated that all ethical requirements had been followed. Moreover, the database has been available online for an extended period and has been used extensively in many studies.

Data Availability Statement:
The proposed method was tested on the MIT-BIH Arrhythmia Database supplied by PhysioNet, see https://www.physionet.org/content/mitdb/1.0.0/.