Classifying Cardiac Arrhythmia from ECG Signal Using 1D CNN Deep Learning Model

Abstract: Blood circulation depends critically on electrical activation, where any disturbance in the orderly pattern of the heart's propagating wave of excitation can lead to arrhythmias. Diagnosis of arrhythmias using electrocardiograms (ECG) is widespread because the ECG is a fast, inexpensive, and non-invasive tool. However, the randomness of arrhythmic events and the susceptibility of ECGs to noise lead to misdiagnosis of arrhythmias. In addition, manually diagnosing cardiac arrhythmias using ECG data is time-intensive and error-prone. With proper training, deep learning (DL) could be a better alternative for fast and automatic classification. The present study introduces a novel deep learning architecture, specifically a one-dimensional convolutional neural network (1D-CNN), for the classification of cardiac arrhythmias. The model was trained and validated with real and noise-attenuated ECG signals from the MIT-BIH dataset. The main aim is to address the limitations of traditional electrocardiograms (ECG) in the diagnosis of arrhythmias, which can be affected by noise and the randomness of events, leading to misdiagnosis and errors. To evaluate the model performance, the confusion matrix is used to calculate the accuracy, precision, recall, f1-score, their averages, and the AUC-ROC. The experimental results demonstrate that the proposed model achieved outstanding performance, with accuracies of 1.00 and 0.99 on the training and testing datasets, respectively, and can be a fast and automatic alternative for the diagnosis of arrhythmias.


Introduction
Globally, cardiovascular diseases (CVDs) have the highest mortality rate [1]. The World Health Organization (WHO) identifies cardiovascular diseases (CVDs) as the leading cause of death worldwide, accounting for an estimated 31% of all fatalities annually [2]. The Middle East, Asia, and Russia have much higher rates than the rest of the globe [3,4]. Cardiovascular diseases fall into three major categories: electrical (cardiac arrhythmia), circulatory (blood vessel abnormality), and structural [5]. Cardiac arrhythmia refers to a collection of irregular heartbeats caused by a malfunction of the heart's electrical system.
The electrocardiogram (ECG) is a well-known diagnostic technique for cardiac arrhythmias that documents physiological heart activity over time [6]. According to a recent estimate, the annual global number of ECG recordings exceeds 300 million and is expected to increase in the future [7]. The main reason for the popularity of the ECG, compared to CT and MRI, is that it is a simpler and less expensive test. The ECG is non-invasive, requiring only the placement of electrodes on the skin, and it can provide information about the electrical activity of the heart, the heart rate, and the presence of certain conditions, such as arrhythmias or heart attacks [8]. An ECG can be performed easily and quickly, and in most cases it is painless [9]. It can also be conducted repeatedly to monitor the progress of some conditions. The ability of deep learning (DL) to automatically extract important features and self-learn from them to distinguish between classes makes it a promising alternative for classifying cardiac arrhythmia. Moreover, DL can reduce features automatically and handle large and noisy datasets [14,15]. Therefore, deep learning can be seen in various applications, including malware detection [16,17], image processing [18], and healthcare [19,20]. Furthermore, several recent studies have proposed classifying cardiac arrhythmias using deep learning techniques such as convolutional neural networks (CNNs) [5,21,22].
Convolutional neural networks (CNNs) are well-known deep learning techniques for 2D data, such as image processing [23] and segmentation [24], due to their ability to learn complex features and maintain spatial relationships between them.
This work presents a classification model based on DL by fine-tuning the hyperparameters of a 1D-CNN to address the limitations of traditional electrocardiograms (ECG) in the diagnosis of arrhythmias, which can be affected by noise and the randomness of events, leading to misdiagnosis and errors. The proposed DL model incorporates three blocks, and each block comprises two 1D-CNN layers, a max-pooling layer, a dropout layer, and a batch-normalization layer. The main goal is to develop a fast and automatic alternative for the diagnosis of arrhythmias. The model correctly detects four types of arrhythmias, namely fusion (F), normal (N), supraventricular-ectopic (S), and ventricular-ectopic (V) beats, from the ECG lead II signal. The confusion matrix and AUC-ROC curve were utilized to evaluate the model's performance. To support the model's validity, the CNN model's accuracy and recall (sensitivity) are compared to current research. The main contributions of this article are summarized as follows:

• Proposing a simple yet effective method for extracting the heartbeat from the signals by segmenting the signal centered on the R-peak point to ensure that all the critical features, such as the QRS complex, P-wave and T-wave, are correctly extracted.
• Developing a novel CNN architecture that shows outstanding performance in identifying four types of cardiac arrhythmias compared to existing work in the literature.
• Evaluating the optimal CNN hyperparameters in terms of filter, activation function, kernel size, and the number of layers.
The rest of the article is arranged in the following manner. Section 2 reviews related work in arrhythmia classification using ECG signals. The proposed 1D-CNN model for classifying cardiac arrhythmia is explained in Section 3. Section 4 discusses the results of the experiment in comparison with state-of-the-art works. Section 5 concludes the work presented in this article and provides some suggestions for future work.

Related Works
Various research works on classifying cardiac arrhythmia from ECG signals can be divided into two approaches: the non-deep learning approach (traditional machine learning) and the deep learning approach.
The traditional ML approach uses machine learning algorithms such as support vector machines (SVM), decision trees (DTs), and random forests (RF) to classify cardiac arrhythmia. For instance, [29] proposed a computational system for diagnosing cardiac arrhythmia using k-nearest neighbors (KNN) and DTs. The model was trained on 14 features extracted from the MIT-BIH dataset. DTs outperformed KNN with 0.96, 0.99, and 0.84 for accuracy, sensitivity, and specificity, respectively. The authors of [30] detected myocardial infarctions from 10 s ECG signals using an SVM. The model was trained using 14 features extracted by the principal component analysis (PCA) technique and achieved an overall accuracy of 0.96. The authors of [31] developed a model to detect the narrowing of three types of coronary arteries (CAD). The model was trained with an SVM, uses 25 features, and achieves an overall accuracy of 0.96, a sensitivity of 1.00, and a specificity of 0.88. The authors of [32] proposed a Naïve Bayes model to detect five types of cardiac arrhythmias from ECG signals. The best model performance was based on four features extracted using higher-order statistics (HOS); the model obtained an overall accuracy of 0.94, a specificity of 0.57, and a recall of 0.99. In [13], various tree-based ML algorithms, such as Logistic Model Trees, Naïve Bayes Trees, and RF, were trained to classify 11 arrhythmia classes from 23 recordings. The RF scored the best results, with an accuracy of 0.97, a specificity of 0.95, and a recall of 0.97. In [33], a genetic algorithm-based backpropagation neural network (GA-BPNN) technique for ECG identification was developed to categorize six distinct types of arrhythmias with an accuracy of 0.97.
Although these traditional machine learning models have performed well in classifying arrhythmias, they have some limitations, such as feature selection and poor classification performance on large datasets [34]. In other words, most machine learning algorithms require feature selection to reduce complexity and enhance performance. However, selecting important features requires additional work and might differ from one method to another.
On the other hand, deep learning (DL) overcomes these limitations and improves performance due to its ability to extract features automatically and handle large datasets. The authors of [6] trained a CNN-BiLSTM to classify five types of cardiac arrhythmia from the MIT-BIH dataset; the model obtained 0.98 accuracy, 0.91 sensitivity, and 0.91 specificity. The authors of [35] proposed a CNN algorithm to classify heartbeats into five classes and achieved an overall accuracy of 0.93. Two- and five-second ECG signals from the St. Petersburg and Fantasia datasets are used in [36] to build a CAD prediction model using a CNN. The proposed model can discriminate between arrhythmias with an accuracy of 0.94 for the two-second model and 0.95 for the five-second model. A study [37] proposed a network of CNNs and BiLSTM to categorize five ECG arrhythmias with a recognition accuracy of 0.96.
Although several works have been conducted on identifying cardiac arrhythmias using deep learning techniques, there is still a lack of an effective CNN method for cardiac arrhythmia classification.In this study, we develop a novel CNN architecture to efficiently classify four types of cardiac arrhythmia from ECG lead II signal.

Proposed Methodology
The proposed 1D-CNN approach for classifying cardiac arrhythmia comprises four phases: data acquisition, data preprocessing, training of the 1D-CNN for classifying cardiac arrhythmias, and performance evaluation. The proposed classification of cardiac arrhythmias based on the 1D-CNN algorithm is shown in Figure 2.

Dataset Description
In this work, the original and noise-attenuated ECG signals obtained from the MIT-BIH dataset [38] are used as a data source to classify four arrhythmias according to the Association for the Advancement of Medical Instrumentation (AAMI) standard EC57 [39]. The MIT-BIH dataset contains 48 thirty-minute ECG recordings obtained from 47 patients. Each record has two types of ECG signals: lead II and lead V5. The recordings were digitized across a 10 mV range at 360 Hz per channel with 11-bit resolution. In this experiment, the ECG lead II signals have been extracted, scaled, and segmented into four types of arrhythmias to train and test the proposed CNN model.

Dataset Preprocessing
The 48 ECG records were converted to NumPy arrays, and the signals from lead II were extracted to train the model. NumPy arrays are multi-dimensional arrays used for scientific computing in Python [40]. They are homogeneous, meaning each element in the array must be of the same type, and they are fixed-size at creation, unlike Python lists. Elements are indexed by a tuple of non-negative integers. NumPy is implemented mostly in C, and its arrays are stored in contiguous memory locations, which makes them faster and more powerful than Python lists.
Each extracted signal was scaled and segmented into heartbeats with a window length of 180 features per heartbeat, centered around the R-peak. The steps of extracting the heartbeats from the ECG lead II signal are described in Figure 3.
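The homogeneity and fixed-dtype behavior of NumPy arrays described above can be seen directly in a minimal example:

```python
import numpy as np

# Elements are coerced to a single dtype: the ints are promoted to float64.
a = np.array([1, 2.5, 3])
print(a.dtype)  # float64

# Elements are indexed by a tuple of non-negative integers.
m = np.array([[1, 2], [3, 4]])
print(m[1, 0])  # 3
```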

Heartbeat Extraction
To extract the heartbeats from the MIT-BIH lead II signals, we propose a simple yet effective method (a visual representation is shown in Figure 3).
The main steps of the extraction method are described as follows:

• Extracting the ECG lead II signals from the patient records and converting them into NumPy arrays.
• Scaling the extracted signals to ensure that all signals have the same mean and standard deviation, which helps classify several arrhythmias correctly.
• Detecting the R-peak positions using XQRS from the WFDB library to segment the heartbeats.
• Segmenting the signal into heartbeat windows with a length of 180 features in each window, centered around the R-peak position.
• Extracting each heartbeat class from the annotations given by cardiologists in the dataset.

Table 1 shows the categorization of the four types of ECG heartbeats, which are divided into four classes according to AAMI [39].
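The scaling and segmentation steps listed above can be sketched in NumPy as follows. This is a simplified illustration, not the authors' code: it assumes the R-peak indices have already been detected (e.g., by XQRS from the WFDB library), and the function name is our own.

```python
import numpy as np

def extract_heartbeats(signal, r_peaks, window=180):
    """Scale a lead II signal and cut fixed-length windows centered on each R-peak."""
    # Standardize so all signals share the same mean and standard deviation.
    scaled = (signal - signal.mean()) / signal.std()
    half = window // 2
    beats = []
    for r in r_peaks:
        # Keep only beats whose full 180-sample window fits inside the record.
        if r - half >= 0 and r + half <= len(scaled):
            beats.append(scaled[r - half:r + half])
    return np.array(beats)

# Synthetic example: a flat signal with two spikes standing in for R-peaks.
sig = np.zeros(1000)
sig[[300, 700]] = 1.0
beats = extract_heartbeats(sig, [300, 700])
print(beats.shape)  # (2, 180)
```

Each returned window has exactly 180 samples with the R-peak at its center, matching the fixed window size the model requires.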

It is essential to highlight that the proposed technique extracts all the key regions, such as the QRS complex, P-wave and T-wave, that are generally used by cardiologists to identify arrhythmia. Moreover, all extracted beats have exactly the same window size, which is crucial for training the model.

Data Preparation
After segmenting the signals into heartbeats, the total number of heartbeats is 99,774 for the 48 records. Each class contains a different number of heartbeat samples: 89,694 normal beats, 6487 ventricular-ectopic beats, 2814 supraventricular-ectopic beats, 779 fusion beats, and 24 unknown beats. The unknown beats were dropped because they contain unclassified beats and have a significantly lower number of instances [41]. The heartbeats were split into 75% (74,830 samples) for training and validation and 25% (24,944 samples) for testing. We used the stratify parameter to guarantee that the proportion of samples in each class is maintained consistently across the training and testing data. The training and validation set was further split into 80% (59,864 samples) for training and 20% (14,966 samples) for validation.
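The stratified split described above corresponds to scikit-learn's `train_test_split` with its `stratify` parameter; a minimal sketch with toy counts of our own (not the real beat counts):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy labels standing in for the four AAMI classes, with a deliberate imbalance.
y = np.array(['N'] * 80 + ['V'] * 12 + ['S'] * 8 + ['F'] * 4)
X = np.arange(len(y)).reshape(-1, 1)

# 75/25 split; stratify=y preserves each class's proportion in both parts.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

print(len(y_tr), len(y_te))  # 78 26
```

With these counts, exactly a quarter of each class lands in the test set (20 'N', 3 'V', 2 'S', 1 'F'), mirroring the class proportions of the full dataset.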
Table 1 shows a significant imbalance among the four classes (i.e., class 'N' has 89,694 samples, while class 'F' has only 779). An unbalanced dataset leads to a classification bias toward classes with more samples, which results in deficient performance when classifying categories with fewer samples [42]. Therefore, the class-weight estimation technique [43] is applied to balance the dataset. The class-weight technique adjusts the model's cost function such that misidentifying an instance belonging to a minority class incurs a more significant penalty than misidentifying an example belonging to a majority class [14]. This strategy can improve the model's accuracy by rebalancing the class distribution.
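One common way to estimate such class weights (the "balanced" heuristic, as used by scikit-learn's `compute_class_weight`) sets each weight to n_samples / (n_classes × n_class_samples); whether the authors used exactly this formula is an assumption on our part. A sketch with the beat counts from this dataset:

```python
import numpy as np

# Beat counts per class after dropping the unknown beats.
counts = {'N': 89_694, 'V': 6_487, 'S': 2_814, 'F': 779}
n_samples = sum(counts.values())
n_classes = len(counts)

# Minority classes receive proportionally larger weights in the loss.
class_weight = {c: n_samples / (n_classes * n) for c, n in counts.items()}
print({c: round(w, 2) for c, w in class_weight.items()})
```

The resulting dictionary gives the rare fusion class a weight roughly two orders of magnitude larger than the normal class, so each misclassified 'F' beat contributes far more to the loss.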

Training of One-Dimensional Convolutional Neural Network (1D-CNN) for Classifying Cardiac Arrhythmia
A convolutional neural network (CNN) is a deep learning network that can infer high-level features from input features [44]. A CNN is conceptually comparable to a multilayer perceptron (MLP), where each neuron possesses an activation function that translates the weighted inputs into the outputs. CNNs consist of three main layers: the convolutional layer, the pooling layer, and the fully connected layer. With proper training, CNNs can be utilized in many applications, including speech recognition [45], structural engineering [46,47], and image processing [48].
The 1D-CNN is a modified version of the CNN designed for 1D signals, especially for sparse data unsuitable for a traditional CNN [49]. 1D-CNNs are similar to 2D-CNNs but are used for processing one-dimensional data such as audio or text. Moreover, 1D-CNNs use 1D convolutional filters to extract features from the data, while 2D-CNNs use 2D convolutional filters. 1D-CNNs also have fewer parameters than 2D-CNNs, which makes them more computationally efficient [49]. They are particularly well suited to signal data with a temporal component, including time series, because they can extract local features from the signal that are robust to small shifts in time. Here are a few reasons why 1D CNNs are particularly well suited for signal data:

• Time-invariant feature learning: A 1D CNN is able to learn time-invariant features from a signal, meaning that it can extract features that are robust to small shifts in time. This is important because signals are often affected by noise, variations in the measurement scale, and non-stationarity. With a 1D CNN, the network can learn to extract relevant features that are robust to these variations, resulting in improved performance.
• Local feature extraction: The convolutional layers extract local features from the signal, which is important because signals are often composed of local patterns. These patterns could be variations in frequency, amplitude, or shape, which the CNN is able to learn and extract.
• Translation invariance: 1D CNNs are also translation invariant, which means that they can detect the same pattern regardless of its location in the signal. This is particularly useful for signals where the location of the pattern is not known.
• Multiscale feature learning: The combination of convolution and pooling operations in a 1D CNN allows the network to learn features at different scales and resolutions, which is useful because signals often have patterns at different scales.
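The translation-invariance point can be made concrete with a toy example: cross-correlating a signal with a matched filter yields the same peak response wherever the pattern occurs. The data and names here are our own, for illustration only.

```python
import numpy as np

pattern = np.array([0.0, 1.0, 2.0, 1.0, 0.0])  # a small "QRS-like" bump

def response_peak(signal, kernel):
    # 'valid' cross-correlation, as a 1D conv layer (without kernel flip) computes.
    r = np.correlate(signal, kernel, mode='valid')
    return int(r.argmax()), float(r.max())

# The same bump placed at two different positions in the signal.
sig_a = np.zeros(50); sig_a[10:15] = pattern
sig_b = np.zeros(50); sig_b[30:35] = pattern

loc_a, val_a = response_peak(sig_a, pattern)
loc_b, val_b = response_peak(sig_b, pattern)
print(loc_a, loc_b, val_a == val_b)  # 10 30 True
```

The filter's peak response follows the pattern's location while its magnitude stays identical, which is exactly the property that lets a 1D CNN detect a beat morphology anywhere in the window.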
These properties, along with the ability of CNNs to learn complex and abstract feature representations, make them a good fit for signal data. Algorithm 1 shows an example of the basic structure of a 1D CNN algorithm in pseudocode. This pseudocode assumes that the CNN class has already been implemented with functions for applying filters and combining output, and that the convolutional and fully connected layers are stored in lists within the CNN object. The input data are passed through the convolutional layers, with each layer applying a set of filters to the input using 1D convolution and producing output data. The output data from the final convolutional layer are then passed through the fully connected layers, with each layer combining the output from the previous layer into a single set of predictions. The final set of predictions is then returned as the output of the 1D CNN.
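Since the pseudocode of Algorithm 1 is not reproduced here, the structure it describes — a convolutional stage followed by fully connected layers — can be sketched in plain NumPy. All layer shapes and names below are our own illustrative choices, not the paper's implementation:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward_1d_cnn(x, conv_filters, fc_weights):
    """Minimal forward pass: a 1D conv layer, then fully connected layers."""
    # Convolutional stage: each filter slides over the signal (valid cross-correlation).
    feature_maps = [relu(np.correlate(x, f, mode='valid')) for f in conv_filters]
    out = np.concatenate(feature_maps)      # flatten all feature maps into one vector
    # Fully connected stage: combine features into a single set of predictions.
    for W in fc_weights[:-1]:
        out = relu(W @ out)
    return fc_weights[-1] @ out             # final predictions (pre-softmax)

rng = np.random.default_rng(0)
x = rng.standard_normal(180)                # one 180-sample heartbeat
filters = [rng.standard_normal(10) for _ in range(4)]
W1 = rng.standard_normal((16, 4 * 171))     # 171 = 180 - 10 + 1 valid positions
W2 = rng.standard_normal((4, 16))           # four arrhythmia classes
logits = forward_1d_cnn(x, filters, [W1, W2])
print(logits.shape)  # (4,)
```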
Due to its compact and simple configuration, the 1D-CNN is also suitable for real-time applications and low-cost hardware implementations. Equation (1) represents a single convolution of a signal $x_1^n = [x_1, x_2, \ldots, x_n]$, in which $n$ denotes the total number of points [50], $h$ denotes the activation function, $l$ denotes the layer index, $b_j^l$ denotes the bias of the $j$th feature map, $M$ denotes the kernel size, $W_j^m$ denotes the feature map's weight, and $m$ denotes the filter index:

$$y_{j,i}^{l} = h\!\left(b_j^l + \sum_{m=1}^{M} W_j^m \, x_{i+m-1}\right) \tag{1}$$
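Equation (1) can be checked in a few lines of NumPy; the function name is ours, and the arguments map directly onto the symbols above ($W$ is the kernel of size $M$, $b$ the bias, $h$ the activation):

```python
import numpy as np

def single_convolution(x, W, b, h):
    """y_i = h(b + sum_{m=1}^{M} W_m * x_{i+m-1}), as in Equation (1)."""
    M = len(W)
    return np.array([h(b + np.dot(W, x[i:i + M])) for i in range(len(x) - M + 1)])

x = np.array([1.0, 2.0, 3.0, 4.0])
W = np.array([0.5, -0.5])                   # kernel of size M = 2
y = single_convolution(x, W, b=0.0, h=lambda z: z)  # identity activation
print(y)  # [-0.5 -0.5 -0.5]
```

With an identity activation and zero bias this reduces to a plain valid-mode cross-correlation, producing n − M + 1 output points.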
This work developed a 1D CNN model to classify cardiac arrhythmias based on signals from ECG lead II. An intensive experiment was conducted to select the minimal model architecture with optimal parameters to improve the model performance. The architecture of the proposed model is represented in Figure 4.
Figure 4 shows that the model consists of three convolution blocks (A, B, C). Each block contains two 1D-CNN layers, a max-pooling layer, a dropout layer, and a batch-normalization layer. Each 1D-CNN layer contains 128 filters with a kernel-window size of 10 for each filter. Each 1D-CNN layer is activated using the rectified linear unit (ReLU) function. Activation functions are crucial to increase the expressiveness of neural networks and enhance the approximation capability between the network's different layers [34].
The max-pooling layer is applied after the 1D-CNN layers with a pooling size of two to highlight the most prominent feature by calculating the largest value in each patch. To accelerate the learning speed of the model structure, max-pooling down-samples the input representation by selecting the largest value inside a spatial region [34]. The network hyperparameters were determined through an empirical approach, involving experimentation with different values to find the optimal configuration.
As shown in Table 2, to reduce overfitting, a dropout layer of 0.20 and a batch-normalization layer were applied [49]. During training, the dropout layer randomly sets a fraction of the inputs to zero at each step. Dropout regularization reduces interdependence between layers by probabilistically dropping some of the nodes in the same layer; the dropped neuron weights are ignored, significantly improving the model's generalization capacity [51]. Normalization of the CNN layers accelerates model convergence during training and prevents gradient growth [52]. In addition, the batch-normalization layer guarantees that the transformation of the various batches remains within a specified range, stabilizing the learning process and accelerating the convergence of the parameters [34].
The fully connected layer combines the information from the preceding layers to create the final output. In the flatten layer, the preceding layer's output is transformed into a single vector to be used as input for the dense layer. Each neuron in the dense layer receives the outputs of all neurons in the layer beneath it and conducts a matrix-vector multiplication. Table 2 illustrates the proposed structure of the CNN model. The first dense layer has 512 nodes and a ReLU activation function, followed by a 0.20 dropout layer. The second dense layer consists of four nodes, each representing a class, and a SoftMax activation function to classify the output into four arrhythmia types. The model was compiled with sparse categorical cross-entropy as the loss function and the Adam optimizer, with a learning rate of 0.001 and a decay factor of 1 × 10⁻⁶. The model was fitted to the training and validation datasets with a batch size of 512 for 500 epochs.
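The original code is not provided, but under the assumption of a standard Keras/TensorFlow stack, the architecture and training configuration described above could be sketched roughly as follows (the padding mode and all variable names are our own assumptions):

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(input_len=180, n_classes=4):
    model = keras.Sequential()
    model.add(keras.Input(shape=(input_len, 1)))
    # Three blocks (A, B, C): two Conv1D layers, max-pooling, dropout, batch norm.
    for _ in range(3):
        model.add(layers.Conv1D(128, kernel_size=10, activation='relu', padding='same'))
        model.add(layers.Conv1D(128, kernel_size=10, activation='relu', padding='same'))
        model.add(layers.MaxPooling1D(pool_size=2))
        model.add(layers.Dropout(0.20))
        model.add(layers.BatchNormalization())
    # Fully connected head: flatten, 512-node ReLU dense, dropout, 4-class softmax.
    model.add(layers.Flatten())
    model.add(layers.Dense(512, activation='relu'))
    model.add(layers.Dropout(0.20))
    model.add(layers.Dense(n_classes, activation='softmax'))
    # The paper also applies a 1e-6 decay factor; its exact Keras form depends on version.
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

model = build_model()
print(model.output_shape)  # (None, 4)

# Training as described, with the estimated class weights:
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           epochs=500, batch_size=512, class_weight=class_weight)
```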

Performance Matrices
The confusion matrix and AUC-ROC curve, both frequently used to assess machine learning models, were used to evaluate the model's performance. Specifically, the model is evaluated using accuracy, precision, recall (sensitivity), f1-score, specificity, and the ROC curve, which are described below:

• Accuracy: how often the model's predictions are correct overall.
• Precision: how often a positive prediction for a class is actually correct.
• Recall: how many of the actual positive samples of a class the model correctly identifies.
• Specificity: the proportion of true negatives correctly identified by the model, also referred to as the true negative rate (TNR) or selectivity. It measures the model's ability to accurately identify negative examples.
• F1-score: the weighted (harmonic) average of recall and precision.
• AUC-ROC curve: the Area Under the Curve (AUC) measures the thresholds between the true and false-positive rates. The Receiver Operating Characteristic (ROC) curve visually illustrates the trade-off between sensitivity and specificity, where the x-axis represents the false-positive rate and the y-axis represents the true-positive rate. The AUC evaluates the capability of the ROC curve to distinguish between classes; the larger the AUC, the better the performance.
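All of these confusion-matrix metrics can be computed per class with a few lines of NumPy; the function name and the toy matrix below are ours, for illustration:

```python
import numpy as np

def per_class_metrics(cm):
    """cm[i, j] = count of true class i predicted as class j."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp            # predicted as the class, but wrong
    fn = cm.sum(axis=1) - tp            # belonging to the class, but missed
    tn = cm.sum() - tp - fp - fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)             # sensitivity
    specificity = tn / (tn + fp)        # true negative rate
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return accuracy, precision, recall, specificity, f1

# Toy two-class confusion matrix.
cm = np.array([[50, 2],
               [3, 45]])
acc, prec, rec, spec, f1 = per_class_metrics(cm)
print(round(acc, 2))  # 0.95
```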

Performance of the Proposed Method
The experimental dataset used in this work is the MIT-BIH, which is widely used in ECG research and contains accurate and thorough expert annotations [38,53]. As discussed in Section 3, the ECG lead II signals of the 47 patients were extracted, scaled, and segmented into heartbeats. The working environment for training the model consisted of one NVIDIA GeForce GTX 1070 GPU with 16 GB of RAM.
The initial phase of the experiments involved dividing the dataset into 75% (74,830 samples) for training and validation and 25% (24,944 samples) for testing. The model was trained for 500 epochs with a batch size of 512. The accuracy and loss curves of the conducted experiment are illustrated in Figure 5. The loss function in the 1D-CNN quantifies the discrepancy between the expected outcome and the outcome produced by the model; it measures how far an estimated value is from its true value [54]. In this experiment, the sparse categorical cross-entropy loss function is used for our multiclass classification task. This loss function computes the negative logarithm of the predicted probability at the output index indicated by the ground truth, so the loss is computed only once per instance and the summation over classes is omitted, leading to better performance [54]. For a single instance with ground-truth class index c and predicted class-probability vector ŷ, the sparse categorical cross-entropy loss is:

L(ŷ, c) = −log(ŷ_c)

Figure 5 shows that both the training and validation curves increased in a stable manner. Furthermore, the proposed model achieved remarkable precision, recall, f1-score, AUC, average accuracy, and loss on the training and testing datasets. Table 3 shows the performance metrics used to evaluate the model on both datasets. It is worth mentioning that all the numbers in the manuscript have been rounded to two decimal places.
The average in the table refers to the unweighted mean over all classes for a given metric, without considering the proportion of each class in the dataset. For example, on the training dataset, the model scores an average of 0.99 in recall and 0.98 in both precision and f1-score. Amongst the four classes, classes N and V secure perfect results of 1.00 in all metrics, whereas class F scores the worst, with 0.95, 0.97, and 0.96 in precision, recall, and f1-score, respectively. Figure 6 presents the number of correctly and incorrectly classified samples in the training and testing datasets, along with their percentages. AUC is another score used to evaluate the model's ability to discriminate between classes. It measures the trade-off between the true-positive and false-positive rates, which can be represented graphically by the ROC curve; Figure 7 shows the ROC curves on the training and testing datasets. Micro-average and macro-average are two ways of summarizing the information of the multiclass ROC curves. Micro-averaging aggregates the contributions from all the classes to compute the average metric; for example, the micro-averaged true-positive rate is

TPR_micro = Σ_i TP_i / Σ_i (TP_i + FN_i),

where TP_i and FN_i are the true positives and false negatives of class i. Macro-averaging instead calculates the metric individually for each class and then averages the results, hence treating all classes equally a priori.
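As a concrete check of the loss described above, a minimal NumPy sketch of sparse categorical cross-entropy follows; the probability values are illustrative, not model outputs from the paper.

```python
import numpy as np

def sparse_categorical_cross_entropy(probs, true_idx):
    """Mean over instances of -log p[true class].
    `probs` rows are softmax outputs; `true_idx` holds integer labels,
    so no one-hot encoding or summation over classes is needed."""
    eps = 1e-12  # guard against log(0)
    picked = probs[np.arange(len(true_idx)), true_idx]
    return float(-np.mean(np.log(picked + eps)))

probs = np.array([[0.7, 0.2, 0.05, 0.05],
                  [0.1, 0.8, 0.05, 0.05]])
y = np.array([0, 1])
loss = sparse_categorical_cross_entropy(probs, y)
# -(log 0.7 + log 0.8) / 2 ≈ 0.290
```

Because only the probability at the ground-truth index is read, each instance contributes a single logarithm, which is the efficiency advantage noted in the text.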

Comparison of the Proposed Method to Previous Works
Further comparison with existing work in the literature showed that our proposed network achieves superior performance in distinguishing the different classes, exceeding both traditional machine learning and deep learning models. The experiment in this paper achieved excellent accuracy compared to existing works, with overall training and testing accuracies of 1.00 and 0.99, respectively.
Compared to traditional machine learning models such as [29,30,32] and [13], the proposed 1D-CNN model exceeds them in accuracy by an estimated 0.02 to 0.05. Moreover, machine learning models tend to rely on feature engineering to select high-weight features and reduce complexity. For instance, [29] used fourteen features from ECG signals to train SVMs and DTs to diagnose cardiac arrhythmia, and the authors of [32] trained a Naïve Bayes model using only four features. Using such an approach with ECG signals risks losing important features and breaking up the temporal relationships between them [34]. Feature engineering also requires additional steps to choose the feature selection method and the best number of features for training the model.
On the other hand, the deep learning approach automatically reduces feature complexity while training the model. CNNs, for instance, can automatically learn the salient relationships between features. Therefore, CNNs are commonly used with spatially related data due to their ability to effectively model spatial localities using shared filter weights [55]. Table 4 summarizes selected state-of-the-art studies and compares them to our work; it is worth mentioning that all of the existing works in Table 4 used the MIT-BIH dataset. Compared to CNN models such as [62] and [5], the proposed model exceeds them in accuracy by an estimated 0.04 and 0.06, respectively, and scores higher specificity by an estimated 0.02 and 0.07. Similarly, [59] developed a model with thirteen 1D-CNN layers to predict five types of arrhythmias and scored an estimated 0.04 and 0.01 below our model in accuracy and sensitivity, respectively.
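To illustrate the shared-weight filtering that lets a CNN model spatial locality, here is a minimal NumPy sketch of a "valid" 1D convolution (as commonly implemented, cross-correlation); the signal and filter values are illustrative, not taken from the paper's network.

```python
import numpy as np

def conv1d_valid(x, w, b=0.0):
    """'Valid' 1D convolution (cross-correlation): one shared filter `w`
    slides over the signal, so every position reuses the same weights."""
    k = len(w)
    return np.array([np.dot(x[i:i + k], w) + b
                     for i in range(len(x) - k + 1)])

x = np.array([0., 0., 1., 1., 0., 0.])   # a step edge in a toy "signal"
w = np.array([1., -1.])                  # edge-detecting filter
out = conv1d_valid(x, w)
# out == [0., -1., 0., 1., 0.] — the response is nonzero only at the edges
```

Because the same two weights are applied at every position, the layer detects the same local pattern wherever it occurs in the beat, which is exactly the property exploited when stacking 1D-CNN blocks on ECG segments.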
Compared to hybrid models that combine two architectures, such as [56,58], our CNN model surpasses them by at least 0.07 in accuracy and sensitivity and 0.03 in specificity. For instance, [56] developed a model incorporating a densely connected convolutional neural network (DenseNet) and a gated recurrent unit network (GRU) and scored an estimated 0.07, 0.17, and 0.02 below our model in accuracy, sensitivity, and specificity, respectively. In [58], a deep learning model combining RNN and LSTM was developed to classify normal and abnormal beats from ECG signals; our model scores better in accuracy, sensitivity, and specificity by 0.11, 0.02, and 0.16, respectively. In [22], a convolutional neural network and extreme learning machine (CNN-ELM) was trained to classify four classes of arrhythmias and scored 0.01 less than our model in specificity. In [37], a CNN-BiLSTM model was developed to classify five types of arrhythmias and scored 0.03 less than our model in accuracy and specificity.

Conclusions
Classification of cardiac arrhythmias is essential to help physicians diagnose cardiovascular diseases. This work proposed a classification model containing three blocks of 1D CNN to classify cardiac arrhythmia from the ECG lead II signal. The proposed 1D-CNN model demonstrated its efficiency in predicting four arrhythmia classes with outstanding performance, and it can help physicians diagnose cardiovascular disease while reducing their workload. Although the proposed architecture performs excellently, certain limitations still have to be considered when interpreting the results. The distribution of categories in the MIT-BIH dataset used for training and testing is quite unbalanced, and even though we addressed the imbalance with the class weight approach, it still has some impact on model generalization. Furthermore, cardiac arrhythmias can vary greatly between patients; a larger dataset is therefore needed to train deep learning models to handle such variability and to generalize better to new cases, which will be our aim in future work.

Figure 1. Example ECG patterns for different heartbeat types.

Figure 2. The proposed classification of cardiac arrhythmias based on the 1D-CNN model.

Figure 5. Accuracy and loss curves of the 1D-CNN model.

Figure 6. Confusion matrix of the 1D-CNN model for arrhythmia categorization.

On the test dataset, the average over all classes decreased by roughly 0.05 compared to the training dataset, with overall 0.93, 0.94, and 0.93 in precision, recall, and f1-score, respectively. Class N is still in the lead with a perfect score in all performance metrics, while class V dropped 0.02 in precision and f1-score and 0.03 in recall. Class S dropped 0.07 in precision, 0.05 in recall, and 0.06 in f1-score. Even though the class weight technique was applied to adjust the cost function of the model, class F remains the worst class, at 0.85 in all metrics, a drop of 0.10. We can relate the low score of class F to its sample size: it has only 584 training and 195 testing samples, among which 15 and 29 samples were misclassified, respectively. AUC is another score used to evaluate the model's ability to discriminate between classes; it measures the trade-off between the true-positive and false-positive rates, which can be represented graphically by the ROC curve. Figure 7 shows the ROC curves on the training and testing datasets: all classes achieve a perfect score in both datasets except class F in the testing set, with 0.99.
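The per-class scores discussed above are derived from the confusion matrix in the standard way. A small NumPy sketch with a toy two-class matrix (illustrative counts, not the paper's results):

```python
import numpy as np

def per_class_metrics(cm):
    """Precision, recall, and f1-score per class from a confusion matrix
    (rows = true class, columns = predicted class)."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp   # predicted as class i but actually other classes
    fn = cm.sum(axis=1) - tp   # actually class i but predicted otherwise
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

cm = np.array([[50,  2],
               [ 4, 44]])
p, r, f1 = per_class_metrics(cm)
# e.g. recall of class 0 is 50/52, precision of class 0 is 50/54
```

With such a breakdown, a class that is rare in the dataset (like class F here) can be seen to drag its own precision and recall down even when the overall accuracy stays high.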

Figure 7. ROC curve of the 1D-CNN model for arrhythmia categorization.
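The micro- and macro-averaging used to summarize the multiclass ROC curves can be sketched with NumPy; the toy confusion matrix below (one large class, one small class) is illustrative, not the paper's data.

```python
import numpy as np

def micro_macro_recall(cm):
    """cm[i, j] = samples of true class i predicted as class j.
    Micro-averaging pools the raw counts over classes; macro-averaging
    averages the per-class recalls, weighting every class equally."""
    tp = np.diag(cm).astype(float)
    fn = cm.sum(axis=1) - tp
    micro = tp.sum() / (tp.sum() + fn.sum())
    macro = float(np.mean(tp / (tp + fn)))
    return micro, macro

# Imbalanced toy matrix (e.g. a large class N vs a small class F).
cm = np.array([[90, 10],
               [ 5,  5]])
micro, macro = micro_macro_recall(cm)
# micro = 95/110 ≈ 0.864; macro = (0.90 + 0.50)/2 = 0.70
```

The gap between the two values shows why both are reported: micro-averaging is dominated by the large class, while macro-averaging exposes poor performance on the rare one.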

(a) Training-based confusion matrix; (b) testing-based confusion matrix. [Mathematics 2023, 11, 562, 13 of 18]



Table 1. Arrhythmia classes, related annotation, and sample size of each class in the dataset.

Table 2. Parameter tuning of the proposed model.

Table 3. Confusion matrix report of the 1D-CNN model.

Table 4. Existing work for arrhythmia classification from ECG signals.