ECG-signal multi-classification model based on squeeze-and-excitation residual neural networks

Abstract


Background
The human heart has an electrical conduction system that spontaneously generates regular electrical signals and transmits these signals to the entire heart. Heart disease takes the lives of many all over the world [1,2,3]. However, arrhythmia, i.e., irregular heartbeats caused by changes or dysfunctions of this system, remains unfamiliar to the general public.
Arrhythmia can generally be diagnosed using a measured electrocardiogram (ECG), which is a record of the electrical activity in the heart, obtained through electrodes located on the skin of the chest and limbs.
An ECG usually refers to a 12-lead ECG, which gathers 12 different types of information from the heart.
To precisely classify a 12-lead ECG signal, doctors examine the ECG data and diagnose specific arrhythmias based on their medical knowledge and extensive experience. Unfortunately, judgment errors are likely to occur during this process. Even an experienced specialist requires considerable time to analyze the signals, and the accuracy may not be high [4,5]. In addition, in the case of a Holter monitor, the cardiologist cannot see the entire signal, which is usually recorded over several days. Thus, many scholars have attempted to classify 12-lead ECG signals automatically and accurately.
Thus far, rule-based algorithms for ECG signal classification have been unsuitable for use in practice, owing to their poor performance. In addition, this classification has been approached using various machine-learning methods, e.g., logistic regression [6], support vector machines (SVMs) [7], random forests [8], and K-nearest neighbors [9,10]. Deep-learning models are the closest to being usable in real hospitals because they exhibit much better ECG signal-classification performance than conventional classification algorithms and rule-based algorithms.
The most natural deep-learning research using ECG data involves creating a deep-learning model using 12-lead ECG information measured in a hospital. Smith et al. [11] found that a new deep-learning network with 13 convolutional layers and 3 fully connected layers, using 12-lead ECG data, achieved higher accuracy than a conventional algorithm. However, in most cases, rather than using all the ECG information, scholars have approached ECG signal classification using the information from one specific lead, e.g., lead I or lead II. Lee et al. [12] used a residual network (ResNet) with six residual blocks and an AlexNet to classify atrial fibrillation (Normal/Atrial Fibrillation), which provided accuracies of 99.9% and 99.7%, respectively. Rajpurkar et al. [4] and Hannun et al. [5] showed that a deep-learning model, a 34-layer ResNet, exceeded average cardiologists in discriminating 12 output rhythm classes (10 arrhythmias/Normal/Noise). It is important to note that the data they collected were large-scale and obtained from patients in actual hospitals. Their model used 91,232 modified lead-II ECG records from 53,549 patients, recorded using Zio cardiac monitors.
Recently, beyond simple convolutional neural network (CNN) structures, attempts have been made to find a better ECG signal-classification structure by using structures that produce good results for image classification. Kim et al. [13] used the visual DenseNet architecture with 34 layers for two-class classification (Normal/Abnormal), with lead-II ECG data measured in a hospital. This structure achieved an overall accuracy of 98.89% and an F1 score of 99.09%. Their results showed that a single-lead ECG, rather than the 12-lead ECGs measured in a hospital, was sufficient to distinguish between normal and abnormal.
In contrast to the methods mentioned thus far, some scholars have used short-time Fourier and wavelet transforms to convert ECG data into two-dimensional (frequency, time) data and used them as input for a deep neural network. Salem et al. [14] converted one-dimensional (1D) ECG signals from the MIT-BIH dataset and the European ST-T dataset into 2D spectrogram images. They also used a 161-layer DenseNet, pre-trained on millions of images, to extract abstract information and then applied an SVM for four-class classification (Normal Sinus/Atrial Fibrillation and Flutter/Ventricular Fibrillation/ST Segment Change). Their model's accuracy and F1 score were 97.23% and 97.35%, respectively.
Thus far, the deep-learning approach to ECG classification has been similar to that of image classification, with deeper layers and more complex structures. In this study, we followed this trend to find a structure suitable for ECG-signal multi-classification among the deeper and more complex structures that have provided good results for image classification.

ECG dataset description
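We constructed a large ECG-signal dataset that includes 28,308 lead-II ECGs collected from the Korea University Anam Hospital in South Korea. The collected data are meaningful in that they are not refined, but they include various types of actual data measured in hospitals. The data consist of the following 7 categories: normal sinus rhythm (Normal), atrial fibrillation (AF), atrial flutter (AFL), first-degree atrioventricular block (FAB), sinus bradycardia (SB), sinus tachycardia (ST), and premature ventricular contraction (PVC). Cardiologists in the Korea University Anam Hospital annotated the labels for these 7 classes. In this study, our model was designed to classify these seven rhythm classes (Normal/AF/AFL/FAB/SB/ST/PVC) from raw single-lead ECG data. The data ratios for each class were 34.48% (Normal), 33.86% (AF), 6.17% (AFL), 6.90% (FAB), 6.87% (SB), 6.20% (ST), and 5.53% (PVC).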
The ECG data were measured for 10 s at a sampling frequency of 200 Hz, giving 2,000 samples per record. The data we used were lead-II ECG data taken from 12-lead ECG data. In addition, the range of data values was adjusted with min-max normalization to enable the smooth learning of deep-learning models.
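The min-max normalization mentioned above can be illustrated with a minimal sketch. It assumes that each record is scaled independently to the range [0, 1]; the text does not state whether scaling was applied per record or over the whole dataset, and the function name and the synthetic waveform are hypothetical.

```python
import math


def min_max_normalize(signal):
    """Scale a 1D sequence of samples linearly into the range [0, 1]."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        # A flat signal has no range to scale; map it to all zeros.
        return [0.0 for _ in signal]
    return [(x - lo) / (hi - lo) for x in signal]


# Hypothetical 10 s record sampled at 200 Hz (2,000 samples).
# The sine wave is a stand-in, not real ECG data.
SAMPLING_RATE_HZ = 200
DURATION_S = 10
record = [0.8 * math.sin(2 * math.pi * 1.2 * n / SAMPLING_RATE_HZ) + 0.1
          for n in range(SAMPLING_RATE_HZ * DURATION_S)]
normalized = min_max_normalize(record)
```

Scaling all records to a common range keeps the input magnitudes comparable across patients, which is one common reason for this step.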
In this study, we used a squeeze-and-excitation residual network (SE-ResNet), which is a ResNet with an added squeeze-and-excitation (SE) block, to create a model for ECG-signal multi-classification. SE-ResNet is considered to be one of the most popular of the many CNN architectures because of its high performance on ImageNet for image classification. In addition, an SE network is easy to apply because it simply adds an SE block without changing the shape of the existing model. We used ResNet as the baseline model for our ECG-signal multi-classification model; ResNet is known as one of the best models for ECG-signal multi-classification [4,5].
Specifically, we used a modified ResNet [16], which uses pre-activated weight layers, instead of the original ResNet [17], which uses post-activated weight layers, because the modified ResNet has better performance than the original. Through these processes, we confirmed that SE-ResNet was an efficient ECG signal classifier that minimized human intervention. In addition, we conducted experiments to observe the changes in the model's classification performance with the number of layers, and we found an optimal model for classifying ECG-based heartbeats.

Classifier Model Architecture And Experiment
We developed an ECG-signal multi-classification model using SE-ResNet. SE-ResNet focuses on the interdependencies between the channels of its convolutional features, instead of investigating the spatial information. The SE block consists of a squeeze operation, which summarizes the overall information about each feature map, and an excitation operation, which scales the importance of each feature map.
That is, the squeeze operation summarizes each channel into a single value using global average pooling, and the excitation operation computes the inter-channel dependencies using a fully connected layer and a nonlinear function. The main difference between the proposed network and the original SE-ResNet on ImageNet is that the proposed network uses 1D convolutions instead of 2D convolutions. We modified the original SE-ResNet by changing the input from 224 × 224 × 3 to 1 × 2000 × 1 and the output from 1000 classes to seven classes. To verify the performance of our model using SE-ResNet, we chose the ResNet used in [4,5] as the baseline model.
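The squeeze and excitation operations on 1D feature maps can be sketched in plain Python as follows. This is an illustration under assumptions, not the configuration used in this study: it uses the commonly described SE form with two fully connected layers (ReLU, then sigmoid), and the function name, weights, and toy sizes are hypothetical. A real model would use a deep-learning framework.

```python
import math


def se_block_1d(feature_maps, w1, b1, w2, b2):
    """Apply squeeze-and-excitation to 1D feature maps.

    feature_maps: list of C channels, each a list of T values.
    w1 (H x C), b1 (H): first fully connected layer (C -> H).
    w2 (C x H), b2 (C): second fully connected layer (H -> C).
    """
    # Squeeze: global average pooling gives one summary value per channel.
    squeezed = [sum(ch) / len(ch) for ch in feature_maps]

    # Excitation, step 1: fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)) + b)
              for row, b in zip(w1, b1)]

    # Excitation, step 2: fully connected layer followed by a sigmoid,
    # producing one importance weight in (0, 1) per channel.
    scales = []
    for row, b in zip(w2, b2):
        z = sum(w * h for w, h in zip(row, hidden)) + b
        scales.append(1.0 / (1.0 + math.exp(-z)))

    # Scale: reweight every time step of each channel by its weight.
    rescaled = [[s * x for x in ch] for s, ch in zip(scales, feature_maps)]
    return rescaled, scales


# Toy input (assumed sizes): C = 4 channels, T = 5 time steps, H = 2.
x = [[1.0, 2.0, 3.0, 4.0, 5.0],
     [0.0, 0.0, 0.0, 0.0, 0.0],
     [-1.0, -1.0, -1.0, -1.0, -1.0],
     [2.0, 2.0, 2.0, 2.0, 2.0]]
w1 = [[0.5, 0.1, -0.3, 0.2], [-0.2, 0.4, 0.1, 0.3]]
b1 = [0.0, 0.0]
w2 = [[1.0, -1.0], [0.5, 0.5], [-0.7, 0.2], [0.3, 0.9]]
b2 = [0.0, 0.0, 0.0, 0.0]
y, channel_weights = se_block_1d(x, w1, b1, w2, b2)
```

The output has the same shape as the input, which is why an SE block can be added to an existing residual block without changing the rest of the network.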
We evaluated our model on the lead-II ECG signal dataset measured in the Korea University Anam Hospital; it consists of seven classes. We considered 28,308 10-s ECG signals. We first split our lead-II ECG dataset into a training dataset and a test dataset in the ratio of 8:2. After this, we set aside the test dataset and chose 80% of the training dataset to be the actual training dataset and the remaining 20% to be the validation dataset. We divided the dataset randomly but fixed the random seed so that the results could be compared. Thus, 64% of our total dataset was used to train the network, 16% to validate the model, and the remaining 20% to test the model. The number of training samples was 18,116. For the validation and test of the ECG signals, 4,530 and 5,662 samples were used, respectively.
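The two-stage split described above can be sketched as follows. The function name and the seed value are hypothetical, and rounding the held-out sizes up is an assumed convention; with that convention the sketch reproduces the reported sample counts.

```python
import math
import random


def split_indices(n_samples, test_fraction, val_fraction, seed):
    """Return (train, val, test) index lists for a two-stage random split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed gives a repeatable split

    # Stage 1: hold out the test set (size rounded up, an assumption).
    n_test = math.ceil(test_fraction * n_samples)
    test, remainder = indices[:n_test], indices[n_test:]

    # Stage 2: carve the validation set out of the remaining samples.
    n_val = math.ceil(val_fraction * len(remainder))
    val, train = remainder[:n_val], remainder[n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(28308, 0.2, 0.2, seed=42)
print(len(train_idx), len(val_idx), len(test_idx))  # 18116 4530 5662
```

Because the seed is fixed, every model variant is trained, validated, and tested on exactly the same records.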
As an optimizer, we selected the Adam optimizer presented by Kingma et al. [18] with an initial learning rate of 0.0001. The training process continued until the validation loss stopped decreasing for a certain number of steps.
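The stopping rule can be illustrated with a small sketch. The patience value is not given in the text, so `patience=3`, the function name, and the loss values below are all hypothetical.

```python
def early_stopping_epochs(val_losses, patience):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs; the best epoch marks the parameters to keep.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    # The sequence ended before the patience limit was reached.
    return best_epoch, len(val_losses) - 1


# Made-up validation-loss curve, for illustration only.
losses = [0.90, 0.62, 0.48, 0.41, 0.43, 0.40, 0.42, 0.44, 0.45, 0.39]
best_epoch, stop_epoch = early_stopping_epochs(losses, patience=3)
print(best_epoch, stop_epoch)  # 5 8
```

In this example the last value (0.39) is never seen, because training has already stopped at epoch 8; a larger patience trades extra training time for a smaller chance of stopping too early.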
Similar to other deep-learning classification models, we used categorical cross entropy as a loss function.
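For a single sample, categorical cross entropy reduces to the negative log of the probability that the model assigns to the true class. The sketch below shows this for the seven rhythm classes; the class ordering and the all-zero logits are hypothetical choices for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(probabilities, true_class):
    """Loss for one sample: negative log-probability of the true class."""
    return -math.log(probabilities[true_class])


CLASSES = ["Normal", "AF", "AFL", "FAB", "SB", "ST", "PVC"]
logits = [0.0] * len(CLASSES)  # hypothetical output of an untrained model
probs = softmax(logits)
loss = categorical_cross_entropy(probs, CLASSES.index("AF"))
```

With equal scores for all seven classes, each probability is 1/7, so the loss equals ln 7 (about 1.95); this is the value a model should improve on as training progresses.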
Under the above training setting, we investigated SE-ResNet and ResNet with 18/34/50/101/152 layers for seven-class classification. The test set was evaluated using the parameters that exhibited the best performance on the validation set. For the model evaluation, the accuracy and F1 score were used. The F1 score is the harmonic mean of the precision and recall. It is a more suitable criterion for model evaluation than accuracy when the class proportions are very different, as in the ECG signal dataset we cover in this study.
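The following sketch shows how per-class F1 scores can be computed from a confusion matrix, and why they can differ from accuracy on imbalanced data. The confusion matrix is made up and is not data from this study. The unweighted (macro) average is one possible way to aggregate the per-class scores; the text does not state which aggregation was used.

```python
def per_class_f1(confusion):
    """Compute the F1 score of each class from a square confusion matrix.

    confusion[i][j] counts samples with true class i and predicted class j.
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # wrongly predicted as k
        fn = sum(confusion[k]) - tp                       # true k that were missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        # F1 is the harmonic mean of precision and recall.
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return scores


# Made-up 3-class example with one dominant class.
confusion = [[90, 5, 5],
             [10, 8, 2],
             [10, 2, 8]]
f1 = per_class_f1(confusion)
accuracy = sum(confusion[k][k] for k in range(3)) / sum(map(sum, confusion))
macro_f1 = sum(f1) / len(f1)
```

In this example the accuracy is about 0.76, while the two minority classes each have an F1 score of about 0.46, so accuracy alone would hide the weak performance on the rare classes.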

Results
Tables 1 and 2 summarize the performance of the seven-class classification models on the testing data using SE-ResNet and ResNet, respectively. For all classes, our model has a higher F1 score than the baseline model, ResNet. The best result for the seven-class classification model using SE-ResNet was the 152-layer model with a 97.05% F1 score (Table 1). The best result for the seven-class classification model using ResNet was the 152-layer model with a 95.65% F1 score (Table 2).
From Tables 1 and 2, we can see that our methodology using SE-ResNet outperforms the baseline model. Specifically, the F1 score of the best SE-ResNet model for the seven-class ECG signal classification was 1.40 percentage points higher than that of the baseline model (97.05% for the 152-layer SE-ResNet versus 95.65% for the 152-layer ResNet).
A point to note about Table 1 is that the F1 scores of our model for AFL, PVC, SB, and FAB were relatively lower than those for Normal, AF, and ST. Furthermore, to check the results in more detail, we selected the 152-layer SE-ResNet model, which had the highest F1 score for the seven-class classification. We calculated the confusion matrix of this model, shown graphically in Fig. 1, which confirms the advantages and disadvantages of our model. As shown in Fig. 1, our model shows good overall performance for most classes; however, it had difficulty distinguishing between AF and AFL, FAB and PVC, and FAB and SB, which explains the lower F1 scores for AFL, PVC, SB, and FAB.

Discussion
We gathered a large lead-II ECG dataset from the Korea University Anam Hospital in South Korea and found a suitable ECG-signal multi-classification model for this dataset. Our SE-ResNet model surpassed the F1 score of the baseline model, ResNet, by 1.40 percentage points for the seven-class ECG signal classification. Thus, we confirmed that SE-ResNet is a good model for ECG-signal multi-classification with a high accuracy and F1 score. These results indicate that, with only a single-lead ECG signal, instead of the 12-lead ECG measured by the hospital, our SE-ResNet multi-classification model could classify ECG signals sufficiently correctly. In addition, our model is expected to be a tool to provide arrhythmia patients and the general public with information about arrhythmia, and to enable doctors to immediately perform the necessary treatment in the medical field.
However, our classifier has several limitations. First, the F1 scores of AFL, PVC, SB, and FAB were much lower than those of the other classes (Normal/AF/ST) in our model. This was because of the model's inability to distinguish between AF and AFL, FAB and PVC, and FAB and SB. This lower accuracy might be due to the insufficient data for AFL, PVC, SB, and FAB compared with Normal, AF, and ST. In the future, we plan to upgrade our model to increase the F1 scores of AFL, PVC, SB, and FAB.
Second, owing to the lack of ECG data, we created a classification model for only some often-observed arrhythmias. To expand this model to other arrhythmia conditions, we must accumulate additional arrhythmia data, e.g., junctional rhythm, SVT, VT, and Wenckebach, as presented in Rajpurkar et al. [4] and Hannun et al. [5].
Third, we built the model and conducted various experiments to determine the best F1 score and accuracy, without considering the model's capacity. To achieve higher accuracy, we followed the deep-learning trend of making deeper and more complicated networks. However, many real-world applications must run on computationally limited platforms. In the future, we will consider the model's file size and computation speed, as well as its accuracy and F1 score.
Finally, we could not explain why our model arrived at a specific decision for ECG classification. We must be able to fully understand how these decisions are being made so that we can trust the model's decision. We will consider this direction as a future research topic.

Conclusions
For ECG-signal multi-classification, SE-ResNet might be better than the ResNet baseline model, considering the F1 scores.

Abbreviations
SE-ResNet: Squeeze-and-Excitation Residual Network
SENet: Squeeze-and-Excitation Network
ResNet: Residual Network
ECG: Electrocardiogram
CNN: Convolutional Neural Network
RNN: Recurrent Neural Network
Normal: Normal sinus rhythm
AF: Atrial fibrillation
AFL: Atrial flutter
FAB: First-degree atrioventricular block
SB: Sinus bradycardia
ST: Sinus tachycardia
PVC: Premature ventricular contraction

Figures
Figure 1: Confusion matrix of the 152-layer SE-ResNet seven-class classification model.