Reservoir Computing Based Echo State Networks for Ventricular Heart Beat Classification

Abstract: Abnormal conduction of cardiac activity in the lower chambers of the heart (the ventricles) can cause cardiac disease and sometimes leads to sudden death. In this paper, the author proposes Reservoir Computing (RC) based Echo State Networks (ESNs) for ventricular heartbeat classification based on a single electrocardiogram (ECG) lead. The Association for the Advancement of Medical Instrumentation (AAMI) standards were used to preprocess the ECG signals, a standardized diagnostic tool, under the inter-patient scheme. Despite the extensive efforts and notable experiments on machine learning techniques for heartbeat classification, ESNs have yet to be considered for this task as a fast, scalable, and reliable approach for real-time scenarios. The proposed method was designed specifically for Medical Internet of Things (MIoT) devices, for instance wearable wireless devices for ECG monitoring or ventricular heartbeat detection systems. The experiments were conducted on two public datasets, namely AHA and MIT-BIH-SVDM. The performance of the proposed model was also evaluated using the MIT-BIH-AR dataset. The positive predictive value and sensitivity are 98.98% and 98.98%, respectively, for the modified lead II (MLII), and 98.96% and 97.95%, respectively, for the V1 lead. By comparison, the state-of-the-art approaches, namely the patient-adaptable method, improved generalization, and the multiview learning approach, obtained positive predictive values of 92.8%, 87.0%, and 98.0%, respectively. These results indicate that the proposed method outperforms the existing approaches. We believe that the improved classification accuracy opens up the possibility of implementing this methodology in MIoT devices in order to improve e-health systems.


Introduction
According to the American Heart Association [1] and the World Health Organization [2], the mortality rate due to heart disease is growing rapidly worldwide. An estimated 17.7 million people died from cardiovascular disease (CVD) in 2015. A 2011 report from China [3] showed that around 230 million people were suffering from CVD and that 3 million deaths were due to CVD. Electrocardiogram (ECG) signals are a major tool that has been widely used for the analysis of heart disorders in many applications [4][5][6]. These ECG signals are collected by contemporary devices that communicate with the body surface through sensors or nodes [7]. The ECG signal reflects the activity of the heart and reveals whether the heart has a disease or an abnormal rhythm. An irregular or abnormal rhythm is also known as arrhythmia, which is a leading source of heart disease. It is not stated that all arrhythmias have annotated beats in five groups [36]: Ventricular ectopic beat (VEB), Supraventricular ectopic beat (SVEB), Fusion beat (F), Unclassified and paced beat (Q), and Non-ectopic beat (N). However, paced beat recordings were omitted from the datasets. These annotated ECG recordings are 30 min long, and the MIT-BIH-SVDM dataset has two types of lead recordings: the modified lead II (MLII) and V1; a few patients have V5 or V2 instead of V1.
The AHA dataset has no information about the lead recordings. This study's methodology is based on the classification of ventricular heartbeats. Therefore, in this paper we consider two classes of the AAMI/ANSI standards for our binary models to distinguish normal from abnormal beats: supraventricular beats are treated as normal morphology and labeled 0, whereas ventricular beats are treated as abnormal morphology and labeled 1.

ECG Signal Preprocessing
This study uses the ECG recordings discussed in Section 2.1. The signals were collected at a common sampling frequency of 360 Hz. To create a new feature set from the ECG signal dataset as input to the classifier, the steps summarized in Figure 1 were carried out.

ECG Signal Denoising
Contaminated signals contain numerous kinds of noise and artifacts, for instance baseline wandering and power line interference. There are several reasons for contamination; it sometimes occurs due to respiration, and it also occurs when the patient moves while the physiological signals are being recorded. Two kinds of noise are frequently highlighted in ECG signals, baseline wandering and power line interference [37], which are caused by respiration and by a 50 Hz component with a variation of 50% in peak-to-peak amplitude, respectively. These noises and artifacts can hinder the extraction of the information of interest (hidden features) from the raw ECG signal. Furthermore, a corrupted ECG signal can lead to a wrong diagnosis, and it also has a major effect on the performance of classification algorithms [37][38][39].
Many studies in the literature are dedicated to devising algorithms for filtering noisy physiological signals in order to maximize the performance of the proposed models, for instance [39,40]. This study applies the same filtering technique as [8,41], which address a similar classification aim. The author believes that this is the major decision that allows the performance of the current study to be compared with the existing methods. We therefore applied a median filter to the ECG signal x using a sliding window of length n. At every step i, the median value is calculated using this equation:

$$ y(i) = \begin{cases} \operatorname{median}\left\{ x\left(i - \frac{n-1}{2}\right), \ldots, x\left(i + \frac{n-1}{2}\right) \right\} & \text{if the value of } n \text{ is odd} \\ \operatorname{median}\left\{ x\left(i - \frac{n}{2}\right), x\left(i - \frac{n}{2} + 1\right), \ldots, x\left(i + \frac{n}{2} - 1\right) \right\} & \text{if the value of } n \text{ is even} \end{cases} \tag{1} $$

This study used two median filters, one of length n = 600 ms and one of length n = 200 ms, implemented with the Matlab medfilt1 function [42], which performs one-dimensional median filtering of the ECG signal. The main benefit of applying the medfilt1 function is that it eliminates unwanted distortion from the signal.
After applying the median filters to remove baseline wandering, this study applied a 12th-order finite impulse response (FIR) low-pass filter with cut-off frequency k = 35 Hz, designed with the Matlab fir1 function [43], to remove the outliers related to power line interference. The results of these filters are shown in Figure 2a.
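The two denoising steps can be sketched in plain Python as follows. This is a minimal illustration rather than the study's Matlab code: it assumes the 200 ms and 600 ms median filters are cascaded and that the resulting baseline estimate is subtracted from the signal (a common arrangement that the text does not spell out), that samples beyond the signal edges are treated as zero, and that the low-pass filter is a Hamming-windowed sinc design. The function names `median_filter`, `remove_baseline`, `lowpass_fir`, and `fir_filter` are hypothetical.

```python
import math
from statistics import median


def median_filter(x, n):
    """Sliding median of window length n, following Equation (1).

    Samples outside the signal are treated as zero (an assumption).
    """
    left = n // 2            # samples taken before position i
    right = n - 1 - left     # samples taken after position i
    out = []
    for i in range(len(x)):
        window = [x[j] if 0 <= j < len(x) else 0.0
                  for j in range(i - left, i + right + 1)]
        out.append(median(window))
    return out


def remove_baseline(x, fs=360):
    """Estimate the baseline with 200 ms then 600 ms medians and subtract it.

    Cascading and subtracting are assumptions, not stated in the text.
    """
    n_short = round(0.2 * fs)
    n_long = round(0.6 * fs)
    baseline = median_filter(median_filter(x, n_short), n_long)
    return [xi - bi for xi, bi in zip(x, baseline)]


def lowpass_fir(order, cutoff_hz, fs):
    """Hamming-windowed sinc low-pass taps, scaled to unit gain at 0 Hz."""
    fc = cutoff_hz / fs      # cutoff in cycles per sample
    taps = []
    for k in range(order + 1):
        t = k - order / 2
        ideal = 2 * fc if t == 0 else math.sin(2 * math.pi * fc * t) / (math.pi * t)
        hamming = 0.54 - 0.46 * math.cos(2 * math.pi * k / order)
        taps.append(ideal * hamming)
    gain = sum(taps)
    return [h / gain for h in taps]


def fir_filter(taps, x):
    """Causal FIR filtering by direct convolution with zero initial state."""
    return [sum(h * x[i - k] for k, h in enumerate(taps) if i - k >= 0)
            for i in range(len(x))]
```

With these helpers, a recording sampled at 360 Hz could be denoised as `fir_filter(lowpass_fir(12, 35.0, 360.0), remove_baseline(ecg))`, where `ecg` is a list of samples.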

High Frequency component (Peak) Detection
The second step is peak detection, which is needed for the subsequent steps. The high-frequency components are detected by applying the modified Pan Tompkins algorithm [44]. The algorithm consists of the following steps: first, a differentiation block is used to obtain the high slope values; second, the output values are squared to extract the R-peaks; finally, a summation is applied over all values of the R-wave slope. The values related to the R-wave, for instance the R-location and the amplitude of the R-wave, are stored in the n × m matrix R. Figure 3 shows an example of R-peak detection based on the Pan Tompkins algorithm. The modified Pan Tompkins algorithm was chosen because it adapts easily to variations in the signal and extracts the high-frequency component efficiently.
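The three steps described above (differentiate, square, sum) can be sketched as follows. This is a simplified illustration, not the modified Pan Tompkins algorithm of [44]: the fixed threshold, the 150 ms summation window, the 250 ms refractory period, and the function name `detect_r_peaks` are all assumptions made for the example.

```python
def detect_r_peaks(ecg, fs=360, window_s=0.15, refractory_s=0.25,
                   threshold_ratio=0.5):
    """Return a list of (R-location, R-amplitude) pairs.

    Window length, refractory period and threshold are illustrative
    assumptions, not values taken from the paper.
    """
    n = len(ecg)
    # Step 1: differentiation highlights the steep slopes of the QRS complex.
    slope = [0.0] * n
    for i in range(1, n - 1):
        slope[i] = (ecg[i + 1] - ecg[i - 1]) / 2.0
    # Step 2: squaring makes all values positive and emphasizes large slopes.
    squared = [s * s for s in slope]
    # Step 3: moving-window summation of the squared slope values.
    w = max(1, round(window_s * fs))
    integrated = []
    running = 0.0
    for i in range(n):
        running += squared[i]
        if i >= w:
            running -= squared[i - w]
        integrated.append(running)
    threshold = threshold_ratio * max(integrated)
    refractory = round(refractory_s * fs)
    peaks = []
    i = 0
    while i < n:
        if integrated[i] > threshold:
            # Take the largest raw sample near the detection as the R-peak.
            lo, hi = max(0, i - w), min(n, i + w)
            r = max(range(lo, hi), key=lambda j: ecg[j])
            peaks.append((r, ecg[r]))
            i = r + refractory   # skip ahead to avoid double detection
        else:
            i += 1
    return peaks
```

Each returned pair corresponds to one row of the matrix R that stores the R-location and R-amplitude.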
Appl. Sci. 2019, 9, x FOR PEER REVIEW

Heart Beat Segmentation
The aim of this subsection is to segment the heartbeats. After peak detection, the R-locations stored in the n × m matrix R are used as reference points. Once the position of an R-peak is determined, 200 samples (around 0.52 s) are separated into a window, which is considered a single heartbeat. The segmented area based on the QRS complex is shown in Figure 4. The start point and end point of the highlighted beat represent the Q wave and S wave, respectively, whereas the peak area is defined as the R wave. The combination of the Q, R, and S waves represents ventricular depolarization of the heart.
For the extraction of the P and T waves, this study divided the 200 samples into two parts: 75 points taken from the left side of the R-peak and 125 points taken from the right side. To avoid misleading extractions, all samples were collected from 0.2 s to 0.32 s. In this way, using the temporal locations and samples, the study extracted the Q, S, P, and T waves from each heartbeat.
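The windowing described above can be sketched in a few lines. The 75/125 split comes from the text; skipping beats whose window would run past the ends of the recording is an assumption made here, and `segment_beats` is a hypothetical name.

```python
def segment_beats(ecg, r_locations, left=75, right=125):
    """Cut one 200-sample window per R-peak: 75 samples before, 125 after.

    Beats too close to either end of the recording are skipped
    (an assumption; the paper does not say how they are handled).
    """
    beats = []
    for r in r_locations:
        start, end = r - left, r + right
        if start < 0 or end > len(ecg):
            continue
        beats.append(ecg[start:end])
    return beats
```

In each returned window the R-peak sits at index 75, so the P wave is searched in the samples before it and the T wave in the samples after it.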

Temporal Feature Extraction
The aim of this study is to design a reliable and optimal classifier for ventricular heartbeat classification, based on a single ECG lead, that can be useful for Medical IoT. The other main focus of this study is to extract features that have a low computational cost and are easy to implement in a real-time environment. Therefore, signals that contain contamination are not considered in the feature extraction phase.
The characteristics of normal heartbeats (N + SVB) and abnormal heartbeats (A + VB) are well known and are described by medical specialists in the literature [45,46]. (N + SVB) beats are distinguished by regular RR intervals, the presence of a P-wave, and a narrow QRS complex, whereas (A + VB) beats have a shorter RR interval, no P-wave, and a wider QRS complex.
In this part of feature extraction, this study follows two methods. The first is based on temporal feature extraction, which is inexpensive and easy to implement. These features are lead independent and useful for real-time implementation, but they are affected by inter-patient variability. To overcome this issue, this study used a second method, template matching. It is a simple, patient-adaptable approach that measures the similarity between the input signal beat and a template beat [47].
In the first method, the author computed the relevant attributes for the study's classifier; five temporal features were computed from each segmented beat.

1. The Previous R-R Interval is defined as the time duration between the current beat and the previous beat.

2. The Subsequent R-R Interval is defined as the time duration between the current beat and the subsequent beat.

3. The Standard Deviation of Successive Differences (SDSD) is a standard feature of physiological signals for arrhythmia classification. It is computed from the differences between 10 consecutive RR intervals [48].

4. The Average of RR-Interval Beat is defined as the average of 10 consecutive RR intervals.

5. The Average Derivative of RR Interval Beat is defined as the average of the derivative of the RR interval beat, where the derivative of the segmented beat is calculated using the central difference in Equation (2):

$$ x'(i) = \frac{x(i+1) - x(i-1)}{2h} \tag{2} $$

where x is the segmented beat and h is the sampling interval.
Based on clinical specialist advice, an abnormal ventricular beat has a wider QRS complex and also tends to rise and fall more than a normal beat. Therefore, the derivative area should have the lowest value.
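The five temporal features can be sketched as follows. This is an illustrative reading of the definitions above, with several assumptions: R-peak times are given in seconds, the 10 RR intervals are those ending at the current beat, SDSD is taken as the population standard deviation of their successive differences, and the derivative uses a unit sampling step (h = 1). The function names are hypothetical.

```python
from statistics import mean, pstdev


def rr_features(r_times, k, history=10):
    """Features 1 to 4 for the beat at index k of r_times (seconds).

    Requires history <= k < len(r_times) - 1. The choice of which
    10 intervals to use is an assumption.
    """
    previous_rr = r_times[k] - r_times[k - 1]
    subsequent_rr = r_times[k + 1] - r_times[k]
    # The last `history` RR intervals, ending at the current beat.
    rr = [r_times[j] - r_times[j - 1] for j in range(k - history + 1, k + 1)]
    successive = [b - a for a, b in zip(rr, rr[1:])]
    return {
        "previous_rr": previous_rr,
        "subsequent_rr": subsequent_rr,
        "sdsd": pstdev(successive),
        "average_rr": mean(rr),
    }


def average_derivative(beat):
    """Feature 5: mean of the central difference of Equation (2), h = 1."""
    derivative = [(beat[i + 1] - beat[i - 1]) / 2.0
                  for i in range(1, len(beat) - 1)]
    return mean(derivative)
```

For a perfectly regular rhythm the SDSD is zero and the previous, subsequent, and average RR intervals coincide, which matches the description of (N + SVB) beats as having regular RR intervals.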
In contrast, in the second method, template matching was used to address the lead and inter-patient variability issues. The approach is simple and patient adaptable: each ECG heartbeat is compared with the class-specific template in order to obtain the Pearson's correlation coefficient (PCC) between the input ECG beat and the template beat. The template beat was calculated by averaging at least 50% of the beats, of which 25% were normal beats and 25% were abnormal beats. After the class-specific template beats for normal and abnormal beats were calculated, the current heartbeat was compared with the template beat; an example is shown in Figure 5.
For the comparison of heartbeats in real-time scenarios, the template is calculated and updated through a real-time adaptation of the system. Assuming that doctors use 10 s ECG strips, each of which contains around 15 heartbeats, a new template is estimated from each strip and replaces the old beats in the template. To measure the similarity (PCC) between the template beat and the ECG signal heartbeat, the following features are extracted for our classifier:

6. PCC (Template beat, heart beat) is defined as the Pearson's correlation coefficient between the template beat and the ECG signal heartbeat.

7. PCC (Template beat′, heart beat′) is defined as the Pearson's correlation coefficient between the derivatives of the template beat and of the ECG signal segmented beat.

8. PCC (Template beat², heart beat²) is defined as the Pearson's correlation coefficient between the squares of the template beat and of the ECG signal segmented beat.

9. PCC ((Template beat′)², (heart beat′)²) is defined as the Pearson's correlation coefficient between the squares of the derivatives of the template beat and of the ECG signal segmented beat.

10. The Average of Template Beat is defined as the average of the template beat.

11. The Average Derivative of Template Beat is defined as the average of the derivatives of the template beat.
In total, 11 features were selected for the proposed classifier: five are temporal features of the ECG morphology, and the rest are based on the PCC obtained with the template matching approach. All these features are stored in the matrix denoted F(n).
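Features 6 to 11 can be sketched as follows, assuming the template and the heartbeat are equal-length lists of samples and that derivatives are taken with a unit-step central difference. Returning 0.0 when one of the inputs is constant is a choice made for this example only, and the names `pearson`, `central_difference`, and `template_features` are hypothetical.

```python
import math
from statistics import mean


def pearson(a, b):
    """Pearson's correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0   # undefined for constant input; 0.0 is an assumption
    return cov / math.sqrt(var_a * var_b)


def central_difference(x):
    """Central difference with unit step, as in Equation (2) with h = 1."""
    return [(x[i + 1] - x[i - 1]) / 2.0 for i in range(1, len(x) - 1)]


def template_features(template, beat):
    """Features 6 to 11: four PCC values and two template averages."""
    d_template = central_difference(template)
    d_beat = central_difference(beat)
    return {
        "pcc": pearson(template, beat),
        "pcc_derivative": pearson(d_template, d_beat),
        "pcc_square": pearson([v * v for v in template],
                              [v * v for v in beat]),
        "pcc_derivative_square": pearson([v * v for v in d_template],
                                         [v * v for v in d_beat]),
        "template_mean": mean(template),
        "template_derivative_mean": mean(d_template),
    }
```

Together with the five temporal features, these six values would form one 11-dimensional feature vector F(n) per heartbeat.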

The Configuration of the Echo State Network Based Reservoir Computing for Classification
The main idea of reservoir computing (RC) is based on the artificial neural network (ANN), because the functional elements of RC, namely the input weights, biases, and connection weights among the neurons, are similar to those of an ANN. RC is commonly divided into two methods: (1) Echo State Networks (ESNs) [32] and (2) liquid state machines (LSMs) [49]. ESNs tend to use sparsely connected sigmoid nodes, whereas LSMs are networks made up of spiking neurons in the reservoir model [50]. In this study, the author used ESNs, the main type of RC, to construct an efficient model for implementation in real-time scenarios (see Figure 6). The author used a cycle-based ESN architecture in which the neurons of the reservoir are connected in a ring, that is, with non-random links between neurons (see Figure 7). The non-random topology was chosen because such an ESN model can easily be implemented in hardware and allows low power consumption with efficient processing speed [51]. These advantages encouraged the author to use the ESN model for ventricular heartbeat classification. The activation vector of the echo state network is defined by Equation (3):

$$ s(n) = g\left(\vartheta W_{in} F(n) + \alpha W s(n-1)\right) \tag{3} $$

where $s(n) \in \mathbb{R}^{N_x}$ is the activation vector, or state, in which $N_x$ represents the number of neurons, and $W_{in} \in \mathbb{R}^{N_x \times N_d}$ is a random matrix typically drawn uniformly from [−1, 1], in which $N_d$ represents the dimension of the input vector. The connection weight matrix $W \in \mathbb{R}^{N_x \times N_x}$ is computed from the connections between the neurons in the network, $\vartheta$ and $\alpha$ are the input and connection scaling parameters, and $F(n)$ is the input feature vector of a heartbeat, which has dimensionality $N_d = 11$. The activation function $g$ of the ESN is the classic sigmoid function, shifted to become symmetric about 0. In general, the ESN state is initialized to the null state. The response of the ESN model, a linear combination of the $N_x$ activations $s(n)$, is calculated according to:

$$ y(n) = W_{out} s(n) $$

where $W_{out} \in \mathbb{R}^{N_{out} \times N_x}$ represents the weight matrix of the connections between the ESN neurons and the output nodes, and $N_{out}$ represents the number of readouts. We extended our ESN model with a bias weight and feedback between $y(n)$ and the reservoir.
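The state update of Equation (3) for a ring-shaped reservoir can be sketched as follows. Several details are assumptions made for illustration: every ring connection has unit weight (so W is a cyclic shift and α alone sets the recurrent strength), tanh stands in for the sigmoid shifted to be symmetric about 0, and the bias and feedback extension is omitted. The names `make_input_weights` and `update_state` are hypothetical.

```python
import math
import random


def make_input_weights(n_neurons, n_inputs, seed=0):
    """W_in with entries drawn uniformly from [-1, 1]."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
            for _ in range(n_neurons)]


def update_state(state, features, w_in, input_scale, ring_scale):
    """One step of s(n) = g(theta * W_in F(n) + alpha * W s(n-1)).

    W is assumed to be a unit-weight ring: neuron i receives only the
    previous state of neuron i - 1, and neuron 0 receives the last neuron.
    """
    new_state = []
    for i in range(len(state)):
        drive = input_scale * sum(w * f for w, f in zip(w_in[i], features))
        recurrent = ring_scale * state[i - 1]   # index -1 wraps to the end
        new_state.append(math.tanh(drive + recurrent))
    return new_state
```

Because each neuron has a single incoming reservoir connection, one update costs one multiplication per neuron for the recurrent term, which is consistent with the low computational cost the text attributes to the cycle-based topology.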

Learning Phase of The ESN Model
In the learning phase of this study, the author split the dataset using a 10-fold cross-validation technique. The larger portion of the data, around 80%, was used to train the classifier, whereas 20% was used for testing. In the training phase, the ESN model seeks the output weight matrix that minimizes the error between the output and target values over the training input and output sequences. The output matrix is $W_{out} \in \mathbb{R}^{N_{out} \times N_x}$, the reservoir states collected in the training phase are represented as $A \in \mathbb{R}^{N_x \times T}$, and the corresponding target outputs are represented as $B \in \mathbb{R}^{N_{out} \times T}$, where T denotes the duration of the training phase. The training is defined as follows:

$$ W_{out} A = B $$

It can be rewritten as

$$ W_{out} A A^{\top} = B A^{\top} $$

The solution of this equation is computed with a matrix inverse, so that $W_{out}$ is defined as:

$$ W_{out} = B A^{\top} \left(A A^{\top}\right)^{-1} $$

We used the Moore-Penrose pseudoinverse [52] for numerical stability, which is defined as:

$$ W_{out} = B A^{+} $$

To mitigate overfitting of the model, we used ridge regression [53], which limits the amplitude of $W_{out}$ according to:

$$ W_{out} = B A^{\top} \left(A A^{\top} + \beta I\right)^{-1} $$

where $\beta$ is the regularization factor, which has no absolute meaning, and $I \in \mathbb{R}^{N_x \times N_x}$ is the identity matrix.
At this point, the ratio of normal and abnormal values is estimated based on the scaling parameters. The maximum threshold value in this study is set to 0.4, and the parameters are scaled using this value to obtain accurate predictions. At the end of the training phase, the regression technique minimizes the quadratic error between the output values and the desired output values. The proposed classifier outperformed the existing methods, as discussed in the following sections.
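The ridge regression readout can be sketched without external libraries for the single-output case (N_out = 1), which matches the binary labels used in this study. Rather than forming the inverse explicitly, the sketch solves the equivalent linear system (A Aᵀ + βI) w = A bᵀ by Gaussian elimination; this solver choice, the symbol β, and the names `solve_linear` and `ridge_readout` are assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ridge_readout(states, targets, beta):
    """Output weights for one readout node.

    states:  list of T reservoir state vectors (the columns of A)
    targets: list of T desired outputs (the single row of B)
    beta:    regularization factor
    """
    n = len(states[0])
    # Gram matrix A A^T plus beta on the diagonal.
    gram = [[sum(s[i] * s[j] for s in states) + (beta if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    # Right-hand side A b^T.
    rhs = [sum(s[i] * t for s, t in zip(states, targets)) for i in range(n)]
    return solve_linear(gram, rhs)
```

A larger β shrinks the weights toward zero: in the small example used to check this sketch, the weights move from about (1, 0) with a negligible β to (0.625, 0.125) with β = 1.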

Performance Evaluation Parameters
To evaluate the performance of our ESN model, the study used three performance parameters recommended by the AAMI standards for evaluating learning algorithms: sensitivity (Se), classification accuracy (Acc), and error rate (Er):

$$ Se = \frac{TP}{TP + FN}, \qquad Acc = \frac{TP + TN}{TP + TN + FP + FN}, \qquad Er = \frac{FP + FN}{TP + TN + FP + FN} $$

where true negative (TN) is the number of normal records correctly classified as normal, true positive (TP) is the number of abnormal records correctly classified as abnormal, false positive (FP) is the number of normal records classified as abnormal, and false negative (FN) is the number of abnormal records classified as normal.
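As a worked illustration, the measures can be computed from the four confusion-matrix counts as follows. The positive predictive value reported in the abstract is included using its standard definition TP / (TP + FP); the function name `evaluate` and the counts in the usage note are made up for the example and are not results from the paper.

```python
def evaluate(tp, tn, fp, fn):
    """Sensitivity, accuracy, error rate and positive predictive value."""
    total = tp + tn + fp + fn
    return {
        "sensitivity": tp / (tp + fn),
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
        "positive_predictive_value": tp / (tp + fp),
    }
```

For instance, with hypothetical counts TP = 90, TN = 95, FP = 5, and FN = 10, the sensitivity is 90/100, the accuracy is 185/200, and the error rate is 15/200, so accuracy and error rate always sum to one.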

Evaluation
In this section, the performance of the proposed classifier is evaluated in terms of accuracy, sensitivity, and error rate. This evaluation reflects the final classifier performance, since the model had already been trained. Table 3 shows the performance obtained on ECG 1 and ECG 2 from the MIT-BIH SVDB recordings and on the AHA recordings. The ECG 1 lead achieved 98% accuracy, whereas the ECG 2 and AHA recordings obtained 97% and 96% accuracy, respectively. ECG 1 achieved the highest result because it is the modified lead, which distinguishes the beats very efficiently. The detailed performance of the proposed model is described in Figures 8-11, where the confusion matrix shows the total number of classified beats and the graph shows the performance statistics, in percent, in terms of accuracy and sensitivity.

Comparison with State-of-the-art Methods
In the literature [12,16,54,55], the author noticed that many studies used the MIT-BIH arrhythmia dataset for heartbeat classification, whereas this study used different datasets. A fair comparison between the current study and the state-of-the-art methods is therefore not possible on those datasets. To support a fair comparison, this study also applied the proposed non-random ESN model to heartbeat classification on the MIT-BIH arrhythmia dataset. The method focuses on single-lead ventricular heartbeat classification and therefore considers only two classes: (N + SVB) or (A + VB). The MLII and V1 leads were each taken individually to classify the ventricular heartbeats from the ECG signals. This methodology outperforms the other studies listed in Table 4. Moreover, some studies in the literature focused on only one lead, and the computational cost of their algorithms was not suitable for real-time application [9,56]. In contrast, this study classified both leads, and the computational cost and performance of the technique make it suitable for real-time implementation. Thus, the author compared this study with those state-of-the-art methods that used both leads for learning, and their results were extracted from the part of their confusion matrices related to ventricular heartbeats.

It is noted that the complexity of RC is higher with random connections, whereas the cycle-based ESN model has a low computational cost and is suitable for implementation in real-time scenarios. In the learning phase, this study used ridge regression to optimize the output weights and avoid overfitting, whereas the testing phase involves only sparse connectivity, which reduces the number of heavy multiplications and additions. All experimental work was conducted using MATLAB (2018a) on a desktop computer with 128 GB of RAM and 16 cores. Once all scaling parameters were adjusted in the model, the system required only eight minutes of training on a 30 min ECG dataset. In testing, feature extraction or selection from the ECG beats was observed to take 1.5 s, and the model takes 5 s to classify a heartbeat as either (N + SVB) or (A + VB). The total time is estimated at approximately 2.0 s, and the total time complexity is observed to be O(F(n)^2 E), where E denotes the number of training examples and F(n) is the number of features.
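The combination of a cycle-based reservoir and a ridge-regression readout can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's MATLAB implementation: the reservoir size, the ring weight `r`, the input weight `v`, the alternating input signs, the regularization value `lam`, and all function names are hypothetical choices made for the example.

```python
import math


def cycle_reservoir_states(inputs, n_units=8, r=0.5, v=0.3):
    """Run a ring-topology reservoir in which unit i only receives unit i-1."""
    # Deterministic alternating input signs (an assumption for this sketch).
    w_in = [v if i % 2 == 0 else -v for i in range(n_units)]
    x = [0.0] * n_units
    states = []
    for u in inputs:
        # x[i - 1] wraps to the last unit when i == 0, which closes the cycle.
        # Each step costs O(n_units) multiplications instead of O(n_units^2).
        x = [math.tanh(w_in[i] * u + r * x[i - 1]) for i in range(n_units)]
        states.append(x + [1.0])  # append a constant bias feature
    return states


def solve_linear(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(col + 1, n):
            f = m[k][col] / m[col][col]
            for j in range(col, n + 1):
                m[k][j] -= f * m[col][j]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w


def ridge_readout(states, targets, lam=1e-2):
    """Closed-form ridge regression: (X^T X + lam * I) w = X^T y."""
    d = len(states[0])
    xtx = [[sum(s[i] * s[j] for s in states) + (lam if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    xty = [sum(s[i] * t for s, t in zip(states, targets)) for i in range(d)]
    return solve_linear(xtx, xty)


def predict(states, w):
    """Apply the linear readout to each reservoir state."""
    return [sum(si * wi for si, wi in zip(s, w)) for s in states]
```

Only the readout weights are learned; the reservoir itself is fixed. Because each unit in the ring has a single incoming connection, a state update needs one multiplication per unit for the recurrent term, which is the kind of saving in multiplications and additions that sparse connectivity provides at test time.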
A limitation of the proposed ESN in implementation is that the classifier needs to wait until the reference (template) beat is computed. Once the template beat is computed, the system is ready to classify real-time beats. The system computes a new template beat once more than 10 beats have been classified properly; the model then retrains using the new template beat, which takes only 2.0 s.
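The waiting behaviour described above can be sketched with a small buffer that refreshes the template once more than 10 beats have been accepted. How the template beat is actually computed is not stated here, so the sample-wise mean used below is an assumption, and the class name `TemplateUpdater` and its methods are hypothetical.

```python
class TemplateUpdater:
    """Buffers accepted beats and refreshes the template once enough arrive."""

    def __init__(self, min_beats=10):
        self.min_beats = min_beats
        self.buffer = []
        self.template = None

    def add_beat(self, beat):
        """Add one beat (a list of samples); return True if the template was refreshed."""
        if self.buffer and len(beat) != len(self.buffer[0]):
            raise ValueError("all beats must have the same number of samples")
        self.buffer.append(list(beat))
        if len(self.buffer) <= self.min_beats:
            return False  # still waiting: more than min_beats beats are required
        n = len(self.buffer)
        # Assumption: the template is the sample-wise mean of the buffered beats.
        self.template = [sum(col) / n for col in zip(*self.buffer)]
        self.buffer.clear()
        return True
```

A caller would trigger retraining whenever `add_beat` returns `True`; until the first refresh, `template` stays `None`, which mirrors the limitation that classification cannot start before a reference beat exists.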

Comparison with State-of-the-art Methods
In the literature [12,16,54,55], many studies used the MIT-BIH arrhythmia dataset for heartbeat classification, whereas this study used different datasets. A fair comparison between the current study and state-of-the-art methods is therefore not possible on those datasets. To support a fair comparison, this study also applied the proposed non-random ESN model to the MIT-BIH arrhythmia dataset. The method focuses on single-lead ventricular heartbeat classification and therefore considers only two classes: (N + SVB) or (A + VB). The MLII and V1 leads were each used individually to classify the ventricular heartbeats from the ECG signals. The proposed methodology outperforms the other studies listed in Table 4. Moreover, some studies in the literature focused on only one lead, and the computational cost of their algorithms was not suitable for real-time application [9,56]. In contrast, this study evaluated both leads, and the computational cost and performance of our technique are noteworthy for real-time implementation. Thus, the comparison is made with those state-of-the-art methods that used both leads for learning, and their results were extracted from the parts of their confusion matrices related to ventricular heartbeats only.

Conclusions
The proposed algorithm is especially designed for implementation in medical wearable wireless gadgets, as it is fast and has low power consumption. The proposed methodology uses single-lead ECG signals to classify ventricular heartbeats, and the performance of the proposed classifier was compared with that of other existing methodologies. The system makes a significant contribution to the field of the Medical Internet of Things (MIoT) and can be trained on new datasets to enhance its performance. Its quality is sufficient for implementation in wearable wireless devices or MIoT gadgets. In future work, the algorithm will need to be modified to classify other annotated heartbeat types.