Article

Artificial Intelligence for Multiclass Rhythm Analysis for Out-of-Hospital Cardiac Arrest During Mechanical Cardiopulmonary Resuscitation

1
Department of Applied Mathematics, University of the Basque Country (UPV/EHU), 48013 Bilbao, Spain
2
Biocruces Bizkaia Health Research Institute, Cruces Plaza, 48903 Barakaldo, Spain
3
Department of Communications Engineering, University of the Basque Country (UPV/EHU), 48013 Bilbao, Spain
4
Department of Electronic Technology, University of the Basque Country (UPV/EHU), 20600 Eibar, Spain
5
Norwegian National Advisory Unit on Prehospital Emergency Medicine (NAKOS), Division of Prehospital Services, Oslo University Hospital, N-0424 Oslo, Norway
6
Doctor Car 119, Air Ambulance Department, Division of Prehospital Care, Oslo University Hospital, N-0424 Oslo, Norway
*
Author to whom correspondence should be addressed.
Mathematics 2025, 13(8), 1251; https://doi.org/10.3390/math13081251
Submission received: 22 February 2025 / Revised: 28 March 2025 / Accepted: 5 April 2025 / Published: 10 April 2025

Abstract

Load distributing band (LDB) mechanical chest compression (CC) devices are used to treat out-of-hospital cardiac arrest (OHCA) patients. Mechanical CCs induce artifacts in the electrocardiogram (ECG) recorded by defibrillators, potentially leading to inaccurate cardiac rhythm analysis. A reliable analysis of the cardiac rhythm is essential for guiding resuscitation treatment and understanding, retrospectively, the patients’ response to treatment. The aim of this study was to design a deep learning (DL)-based framework for automatic multiclass cardiac rhythm classification in the presence of CC artifacts during OHCA. Concretely, the classifier was designed to distinguish three types of rhythm: shockable (Sh), asystole (AS), and organized (OR). A total of 15,479 segments (2406 Sh, 5481 AS, and 7592 OR) were extracted from 2058 patients during LDB CCs, whereof 9666 were used to train the algorithms and 5813 to assess the performance. The proposed architecture consists of an adaptive filter for CC artifact suppression and a multiclass rhythm classifier. Two DL alternatives were considered for the multiclass classifier: convolutional neural networks (CNNs) and residual networks (ResNets). A traditional machine learning-based classifier, which incorporates the research conducted over the past two decades in ECG rhythm analysis using more than 90 state-of-the-art features, was used as a point of comparison. The unweighted mean of sensitivities, the unweighted mean of $F_1$-Scores, and the accuracy of the best method (ResNet) were 88.3%, 88.3%, and 88.2%, respectively. These results highlight the potential of DL-based methods to provide accurate cardiac rhythm diagnoses without interrupting mechanical CC therapy.

1. Introduction

Out-of-hospital cardiac arrest (OHCA) is one of the leading causes of death worldwide, with an annual incidence of 67 to 170 per 100,000 inhabitants in Europe, as well as survival rates at hospital discharge of around 8% (0% to 18%) [1]. OHCA survival depends on several crucial factors, including bystander cardiopulmonary resuscitation (CPR) with emphasis on chest compressions (CCs), early defibrillation, and the overall standard of medical care provided by the emergency medical services (EMS) [2].
Recognizing the patient’s cardiac rhythm throughout resuscitation is crucial for two key reasons: first, to guide therapy according to the treatment pathways defined by the international guidelines; and second, to retrospectively analyze the patient’s response to treatment. Regarding the first case, resuscitation guidelines emphasize the need for discriminating between shockable (Sh) rhythms, comprising ventricular fibrillation (VF) and pulseless ventricular tachycardia (VT), and non-shockable (NSh) rhythms, which include both organized (OR) and asystole (AS) rhythms. The Sh/NSh discrimination is the most crucial decision during resuscitation, as defibrillation is the only treatment capable of restoring the normal function of the heart when a Sh rhythm is present [2,3]. A finer classification of the cardiac rhythm may also be needed, especially within the NSh rhythm group, to determine other decisive therapies. For instance, the recommended treatment for AS consists of high-quality CPR, early administration of adrenaline, and identification of the underlying cause of the arrest [4], whereas the presence of an OR rhythm could be indicative of the return of spontaneous circulation, in which case the patient should be transported to hospital for post-resuscitation care and recovery [2]. As for the retrospective debriefing of resuscitation episodes, knowledge of the patient’s rhythm throughout the episode may offer valuable insights on the interaction between therapy and physiological response [5,6,7]; this may help identify optimal treatment strategies or clinical interventions that improve OHCA survival. One of the limitations for such retrospective studies is the lack of OHCA databases that include cardiac rhythm annotations by expert clinicians, which is mostly due to the expensive and time-consuming manual labor required.
Given all this, there is a clear need for the development of multiclass algorithms that automatically identify the patient’s cardiac rhythm, both in real time and retrospectively.
The state-of-the-art OHCA rhythm classification algorithms are mainly based on the analysis of the ECG, typically consisting of an ECG feature extraction stage followed by a machine learning (ML) classifier. ECG feature extraction has been approached in time [8,9], frequency [10,11], combined time–frequency [12], and complexity domains [13]. The ML approaches explored for the classification stage include K-nearest neighbors [14,15], support vector machines [16,17], artificial neural networks [15], and ensembles of decision trees [18,19]. Recently, OHCA rhythm classification has shifted toward deep learning (DL) techniques, such as convolutional neural networks (CNNs) [20] or residual networks (ResNet) [21], which avoid the knowledge-based feature extraction process of traditional ML models. As determining the need for defibrillation is crucial in OHCA, the discrimination between Sh and NSh rhythms has been the classification problem most commonly addressed by the aforementioned algorithms. However, a less simplistic cardiac rhythm classification is needed to determine other decisive therapies during CPR [22,23]. To address this, Rad et al. [15] introduced the first multiclass OHCA rhythm classifier, where a set of features derived from the discrete wavelet analysis of the ECG were fed into different ML-based classifiers.
These ML- and DL-based binary and multiclass algorithms have primarily focused on rhythm classification during interruptions in chest compressions (CCs), as the mechanical activity during CPR introduces ECG artifacts that hinder accurate rhythm detection. As a result, current commercial defibrillators require rescuers to pause CCs every 2 min for rhythm analysis. However, these interruptions reduce blood flow to vital organs, decreasing the chances of survival [24]. Over the past few decades, efforts have been made to develop more accurate rhythm analysis algorithms that could be applied during CCs, thus helping minimize CC interruptions [10,25]. These algorithms typically follow a similar structure to those used during non-CC intervals, but they include a preliminary filtering step to remove CC-induced artifacts. Such methods have proven successful in classifying both Sh/NSh and multiclass rhythms during OHCA [26,27].
While earlier methods primarily focused on manually delivered CCs, the use of mechanical compression devices in OHCA assistance has notably increased. These devices provide CCs at a constant rate and depth, and although evidence supporting improved survival remains inconclusive [28,29,30], their growing adoption underscores the potential benefits they offer. Mechanical devices help ensure the quality of CCs in line with current resuscitation guidelines, even in situations where manual CPR might be compromised, such as during transport [31,32], in confined spaces, or during prolonged resuscitation efforts when rescuer fatigue may impact performance [33]. In addition, these devices alleviate the physical burden on healthcare providers, allowing them to focus on other aspects of patient care. However, while some studies have addressed Sh/NSh rhythm classification in the context of mechanical CPR [34,35], there is no solution yet for multiclass rhythm classification during mechanical CPR.
In this study, we introduce the first DL solutions for reliable multiclass cardiac rhythm classification during mechanical CPR, which distinguish Sh, AS, and OR rhythms. The DL framework employs convolutional neural networks (CNNs) and residual networks (ResNet) for direct ECG classification. In order to know if deep learning techniques improve rhythm classification upon the existing state-of-the-art approaches during mechanical CPR, a traditional ML-based classifier was used as baseline. The ML framework consists of a feature extraction stage, in which more than 90 features are derived from the ECG rhythm classification research conducted over the past two decades, and a random forest (RF) classifier.
From a clinical perspective, a reliable multiclass rhythm classification during mechanical CPR would provide several benefits. It would enable more accurate identification of shockable rhythms, improving defibrillation timing and enhancing therapeutic decision making. Additionally, it would allow for more precise management of non-shockable rhythms, such as asystole, by ensuring timely and optimal treatment like adrenaline administration. This would not only improve resuscitation efficiency, but also enhance post-resuscitation care by reliably detecting the return of spontaneous circulation (OR), optimizing patient outcomes. Furthermore, a multiclass rhythm classifier would be useful for the retrospective annotation of rhythms, potentially offering valuable insights into the interaction between therapy and patient response. This could ultimately contribute to identifying optimal treatment strategies, refining resuscitation protocols, and improving survival chances in OHCA.

2. Materials

The data used in this study were collected from the Circulation Improving Resuscitation Care (CIRC) trial, which was designed to compare automated load distributing band CPR (LDB-CPR) with high-quality manual CPR (M-CPR) in terms of survival [28,36]. Data were gathered between 5 March 2009 and 11 January 2011 in a randomized, unblinded, and controlled group sequential trial of OHCA patients by three US (Fox Valley, Hillsborough, and Houston) and two European (Vienna and Nijmegen) EMS. After EMS providers initiated manual CCs, patients were randomized to receive either LDB-CPR or M-CPR. The LDB device (AutoPulse, ZOLL Medical, Chelmsford, MA, USA) delivered CCs in a fixed position, with a constant depth of 20% of the anterior-posterior diameter of the patient’s chest and at a constant rate of 80 min$^{-1}$ ($f_{\mathrm{LDB}} = 1.33$ Hz).
Anonymized data from Lifepak 12 and 15 monitor defibrillators were exported to MATLAB (MathWorks Inc., Natick, MA, USA) using Physio-Control’s CODE-STAT data review software, and they were then resampled to a sampling frequency of 250 Hz. The data included the ECG and thoracic impedance (TI) signals of each episode together with the CC instants detected by the CODE-STAT software. Figure 1 corresponds to a 70 s interval from an OHCA episode, where the ECG (corrupted by CCs) and TI signals are shown in Panels (a) and (c), respectively. The blue circles on the TI signal indicate the CC instants detected by the CODE-STAT software. As can be seen, each fluctuation in the TI signal corresponds to a CC that was administered by the EMS. Furthermore, in Figure 1, two series of CCs can be clearly distinguished in the 0–15 s and 47–70 s time intervals, corresponding to M-CPR and LDB-CPR, respectively. The middle interval, from 15 s to 47 s, corresponds to a segment without CCs and is, therefore, free of CC artifacts. Note the clear difference in the TI pattern during LDB-CPR and M-CPR, with a much larger amplitude and a more regular pattern for LDB-CPR due to the constant depth and rate of the mechanical CCs. Panel (b) of Figure 1 shows the instantaneous CC rate derived from the CC instants marked in Panel (c). It can be observed that the CC rate for manual CPR was variable and fluctuated around 140 min$^{-1}$, but once the LDB device was applied, the CC rate stabilized at 80 min$^{-1}$.
Episodes where LDB-CPR was administered were used to conduct this study, and the application of the LDB device was identified when the CC rate stabilized at the device’s fixed rate of 80 min$^{-1}$ for at least 20 s (notice the activation of the LDB device in Panel (b) of Figure 1). Then, 22 s signal segments were extracted, each corresponding to a single cardiac rhythm and comprising a 6 s CC-free interval and a 16 s interval corrupted by CC artifacts (refer to the highlighted segment in Figure 1). The intervals during CCs were used as inputs for the multiclass decision algorithms, while the artifact-free intervals were employed to annotate the real underlying rhythm of the patient. Rhythms were annotated as Sh, AS, or OR. The segments corresponding to the EMS of Hillsborough, Nijmegen, and Vienna were annotated by consensus of three biomedical engineers and subsequently audited by a clinician specialized in the resuscitation field. These segments were used to evaluate the performance of the multiclass decision algorithms. The segments corresponding to the remaining two EMS were not audited by a clinician and were, therefore, used to train the algorithms.
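The LDB detection rule described above (a CC rate that stays at 80 compressions per minute for at least 20 s) can be sketched in a few lines. This is only an illustration: the function name `detect_ldb_intervals`, the rate tolerance `tol`, and the synthetic compression instants are assumptions made here, not details reported in the study.

```python
def detect_ldb_intervals(cc_times, target=80.0, tol=2.0, min_dur=20.0):
    """Find intervals where the instantaneous CC rate stays near `target`.

    cc_times: sorted compression instants in seconds.
    tol: accepted deviation in compressions per minute (assumed value).
    Returns a list of (start, end) tuples lasting at least `min_dur` seconds.
    """
    intervals, start, end = [], None, None
    for prev, cur in zip(cc_times, cc_times[1:]):
        rate = 60.0 / (cur - prev)          # instantaneous rate in min^-1
        if abs(rate - target) <= tol:
            if start is None:
                start = prev                # a candidate LDB run begins
            end = cur
        else:
            if start is not None and end - start >= min_dur:
                intervals.append((start, end))
            start = None
    if start is not None and end - start >= min_dur:
        intervals.append((start, end))
    return intervals


# Synthetic episode: manual CCs at about 140 min^-1 (0-15 s), a pause,
# then mechanical CCs every 0.75 s (80 min^-1) from 47 s onwards.
manual = [i * 60.0 / 140.0 for i in range(36)]
mechanical = [47.0 + 0.75 * i for i in range(31)]
ldb_intervals = detect_ldb_intervals(manual + mechanical)
```

On this synthetic input only the mechanical series is reported, since the manual series runs at a different rate and the pause breaks the run.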
The final database consisted of 15,479 segments extracted from 2058 patients, whereof 9666 segments (1252 Sh, 3865 AS, and 4549 OR) from 1178 patients were used to train/develop the multiclass decision algorithms, and then 5813 segments (1154 Sh, 1616 AS, and 3043 OR) from 880 patients were used to test the performance.
All information regarding ethics approval, data collection procedures, patients’ inclusion/exclusion criteria, and related ethical considerations is thoroughly outlined in the original clinical trial papers, from which the data in this study were derived [28,36]. In summary, the collection of defibrillator files was approved by the Institutional Review Board (IRB) or ethics committee for the lead EMS agency at each study site, and it was conducted under Exception for Informed Consent (EFIC) for emergency research issued by the USA Food and Drug Administration and the applicable laws of The Netherlands and Austria.

3. Methods

This study proposes and evaluates different algorithms for the three-class (Sh, AS, and OR) classification of ECG segments corrupted by LDB CC artifacts. All the algorithms were composed of two main stages: (1) an adaptive filter based on the recursive least squares (RLS) algorithm to suppress CC artifacts in the ECG, and (2) a classification stage, for which RF, CNN, and ResNet models were optimized to classify the rhythm in the filtered ECG as Sh, AS, or OR. All the classification models were designed to analyze the filtered ECG in the 2–14 s interval (see the highlighted interval in Figure 2), which helps avoid the initial transient of the RLS filter. In what follows, $t = n \cdot T_s$, where $T_s = 4$ ms is the sampling period ($f_s = 250$ Hz) and $n$ is the sample index.

3.1. CPR Artifact Suppressing Filter

During CPR, the corrupted ECG signal, $s_{\mathrm{cor}}(n)$, recorded by the defibrillator, can be expressed as
$$s_{\mathrm{cor}}(n) = s_{\mathrm{ecg}}(n) + s_{\mathrm{cc}}(n),$$
where $s_{\mathrm{ecg}}(n)$ is the patient’s uncorrupted ECG, reflecting the actual underlying heart rhythm, and $s_{\mathrm{cc}}(n)$ represents the artifact introduced by CCs.
An adaptive RLS filter, tailored for removing periodic interferences [37,38], was used to obtain an estimate of the CC artifact, $\hat{s}_{\mathrm{cc}}(n)$, which was then subtracted from $s_{\mathrm{cor}}(n)$ to obtain the filtered ECG, $\hat{s}_{\mathrm{ecg}}(n)$, i.e., an estimate of the true underlying heart rhythm. In this approach, the artifact is assumed to be quasi-periodic, and it is modeled as a truncated Fourier series of $N$ terms:
$$s_{\mathrm{cc}}(n) = \sum_{k=1}^{N} c_k(n) \cos\big(k \omega_0 n + \theta_k(n)\big) = \sum_{k=1}^{N} \big[ a_k(n) \cos(k \omega_0 n) + b_k(n) \sin(k \omega_0 n) \big],$$
where $\omega_0$ is the fundamental discrete frequency of the CCs, which for an LDB device is constant at $\omega_0 = 2 \pi f_{\mathrm{LDB}} T_s$ with $f_{\mathrm{LDB}} = 1.33\ \mathrm{Hz} \equiv 80\ \mathrm{min}^{-1}$, and $T_s$ is the sampling period. Based on this model, the estimated artifact can be expressed in vector format as
$$\hat{s}_{\mathrm{cc}}(n) = \Theta^{T}(n-1)\, \Phi(n),$$
where vectors $\Theta(n)$ and $\Phi(n)$, respectively, define the time-varying coefficients and the in-phase and quadrature reference signals of the Fourier series:
$$\Theta(n) = [\, a_1(n) \;\; b_1(n) \;\; \cdots \;\; a_N(n) \;\; b_N(n) \,]^{T}$$
$$\Phi(n) = [\, \cos(\omega_0 n) \;\; \sin(\omega_0 n) \;\; \cdots \;\; \cos(N \omega_0 n) \;\; \sin(N \omega_0 n) \,]^{T}.$$
The RLS filter estimates the $a_k(n)$ and $b_k(n)$ coefficients adaptively over time so that the error between the corrupted ECG, $s_{\mathrm{cor}}(n)$, and the estimated artifact, $\hat{s}_{\mathrm{cc}}(n)$, is minimized at each iteration at the harmonics of the LDB device compression frequency, $f_{\mathrm{LDB}}$. Note that, in this configuration (Figure 2), the error signal corresponds to the filtered ECG, $\hat{s}_{\mathrm{ecg}}(n)$. The update equations of the filter are given by
$$\hat{s}_{\mathrm{ecg}}(n) = s_{\mathrm{cor}}(n) - \hat{s}_{\mathrm{cc}}(n)$$
$$F(n) = \frac{1}{\lambda} \left[ F(n-1) - \frac{F(n-1)\, \Phi(n)\, \Phi^{T}(n)\, F(n-1)}{\lambda + \Phi^{T}(n)\, F(n-1)\, \Phi(n)} \right]$$
$$\Theta(n) = \Theta(n-1) + F(n)\, \Phi(n)\, \hat{s}_{\mathrm{ecg}}(n),$$
where the gain matrix, $F(n)$, and the coefficient vector were initialized to $F(0) = 0.03\, I_{2N}$ and $\Theta(0) = \mathbf{0}^{T}$. As shown in the previous equations, there are two configurable parameters in the RLS filter: the number of harmonics, $N$, that model the artifact, and the forgetting factor, $\lambda$, which provides a trade-off between the filter’s adaptability and stability. These values were fixed to $N = 35$ and $\lambda = 0.989$, based on the optimal RLS configuration identified in [35], where the RLS filter was evaluated in terms of the performance of a Sh/NSh decision algorithm applied to the filtered signal. Panel (a) of Figure 2 shows the adaptive RLS filtering scheme, while Panel (b) displays the input and output signals. From top to bottom, the following are shown: the corrupted ECG, $s_{\mathrm{cor}}(n)$; the estimated CC artifact, $\hat{s}_{\mathrm{cc}}(n)$; and the filtered ECG, $\hat{s}_{\mathrm{ecg}}(n)$, revealing the underlying rhythm of the patient (which, in this case, corresponds to an OR rhythm).
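As a minimal sketch of the filter just described, the update equations can be written in plain Python. The function name `rls_cc_filter` and the synthetic test signal are assumptions introduced here; the demonstration uses only three harmonics and a flat underlying ECG so that it runs quickly, whereas the study fixed the number of harmonics to 35.

```python
import math


def rls_cc_filter(s_cor, f_cc=80 / 60, fs=250.0, n_harm=3, lam=0.989, f0=0.03):
    """Return (filtered ECG, estimated CC artifact) for a corrupted ECG list."""
    dim = 2 * n_harm
    w0 = 2 * math.pi * f_cc / fs                 # fundamental discrete frequency
    theta = [0.0] * dim                          # coefficients a_1, b_1, ..., a_N, b_N
    F = [[f0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    s_ecg, s_cc = [], []
    for n, x in enumerate(s_cor):
        # Reference vector: in-phase and quadrature terms of every harmonic.
        phi = []
        for k in range(1, n_harm + 1):
            phi += [math.cos(k * w0 * n), math.sin(k * w0 * n)]
        artifact = sum(t * p for t, p in zip(theta, phi))   # uses Theta(n-1)
        error = x - artifact                                # filtered ECG sample
        # g = F(n-1) Phi(n); F is symmetric, so F Phi Phi^T F equals g g^T.
        g = [sum(F[i][j] * phi[j] for j in range(dim)) for i in range(dim)]
        denom = lam + sum(p * gi for p, gi in zip(phi, g))
        F = [[(F[i][j] - g[i] * g[j] / denom) / lam for j in range(dim)]
             for i in range(dim)]
        gain = [sum(F[i][j] * phi[j] for j in range(dim)) for i in range(dim)]
        theta = [t + gi * error for t, gi in zip(theta, gain)]
        s_cc.append(artifact)
        s_ecg.append(error)
    return s_ecg, s_cc


# Synthetic check: a purely periodic artifact over a flat underlying ECG.
w0 = 2 * math.pi * (80 / 60) / 250.0
corrupted = [math.cos(w0 * n + 0.3) + 0.5 * math.sin(2 * w0 * n)
             + 0.2 * math.cos(3 * w0 * n - 1.0) for n in range(3000)]
filtered, estimate = rls_cc_filter(corrupted)
late_residual = max(abs(v) for v in filtered[2000:])
```

Because the synthetic artifact matches the harmonic model exactly, the residual in the later part of the segment should be close to zero once the initial transient has passed. Real ECG data would leave the underlying rhythm in the residual instead.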

3.2. Optimization and Evaluation

The training set, composed of 9666 segments derived from 1178 patients, was used in a 10-fold cross validation (CV) approach for hyperparameter optimization and, in the case of RF models, for feature selection; details on the hyperparameters and feature selection are provided in the upcoming sections corresponding to each model. Data were divided patient-wise, and it was also ensured that every fold retained a prevalence of each rhythm comparable to that of the entire dataset.
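The remaining 5813 segments, corresponding to 880 patients, were used to evaluate the performance of the classifiers. For each class $i \in \{\mathrm{Sh}, \mathrm{AS}, \mathrm{OR}\}$, the sensitivity ($\mathrm{Se}_i$), positive predictive value ($\mathrm{PPV}_i$), and $F_1$-Score$_i$ were computed; and the unweighted mean of all sensitivities (UMS), total accuracy (ACC), and unweighted mean of all the $F_1$-Scores (UMFS) were used as summarizing metrics:
$$\mathrm{Se}_i = \frac{\mathrm{TP}_i}{\mathrm{TP}_i + \mathrm{FN}_i}, \quad \mathrm{PPV}_i = \frac{\mathrm{TP}_i}{\mathrm{TP}_i + \mathrm{FP}_i}, \quad F_1\text{-Score}_i = \frac{2 \cdot \mathrm{PPV}_i \cdot \mathrm{Se}_i}{\mathrm{PPV}_i + \mathrm{Se}_i}$$
$$\mathrm{UMS} = \frac{1}{3} \sum_{i=1}^{3} \mathrm{Se}_i, \quad \mathrm{ACC} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}}, \quad \mathrm{UMFS} = \frac{1}{3} \sum_{i=1}^{3} F_1\text{-Score}_i$$
where $\mathrm{TP}_i$, $\mathrm{TN}_i$, $\mathrm{FP}_i$, and $\mathrm{FN}_i$ are the true positives, true negatives, false positives, and false negatives for class $i$, respectively.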
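The per-class and summary metrics can be computed from a three-class confusion matrix, as in the sketch below. The function name `multiclass_metrics` and the example matrix are illustrative assumptions. ACC is computed here as the fraction of correctly classified segments, which is one common reading of total accuracy in a multiclass setting.

```python
def multiclass_metrics(cm, labels=("Sh", "AS", "OR")):
    """cm[i][j] = number of segments of true class i predicted as class j."""
    n = len(labels)
    total = sum(sum(row) for row in cm)
    per_class = {}
    for i, name in enumerate(labels):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                          # true class i, predicted otherwise
        fp = sum(cm[r][i] for r in range(n)) - tp     # predicted i, true class differs
        se = tp / (tp + fn)
        ppv = tp / (tp + fp)
        per_class[name] = {"Se": se, "PPV": ppv, "F1": 2 * ppv * se / (ppv + se)}
    ums = sum(m["Se"] for m in per_class.values()) / n
    umfs = sum(m["F1"] for m in per_class.values()) / n
    acc = sum(cm[i][i] for i in range(n)) / total
    return per_class, ums, umfs, acc


# Made-up confusion matrix (rows: true Sh, AS, OR; columns: predicted).
example_cm = [[8, 1, 1],
              [2, 6, 2],
              [0, 2, 8]]
per_class, ums, umfs, acc = multiclass_metrics(example_cm)
```

The sketch assumes that every class appears at least once both as a true label and as a prediction; otherwise the ratios are undefined.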
In order to estimate the statistical distributions of the performance metrics, the test set was split into 20 patient-wise replicas, each of which included 44 patients. Performance metrics were reported as the mean (standard deviation, SD) as they passed the Kolmogorov–Smirnov normality test. Finally, a two-sample paired t-test was performed to test for equal means of the performance metrics in both the deep learning methods and the state-of-the-art method, RF, as well as in the ResNet and CNN models. A p-value of < 0.05 was considered statistically significant.
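The patient-wise replica scheme can be illustrated as follows. The random shuffling, the seed, and the helper names are assumptions made for this sketch, since the text does not state how patients were allocated to replicas. The sample standard deviation is likewise an assumed choice.

```python
import random
import statistics


def patient_wise_replicas(patient_ids, n_replicas=20, seed=0):
    """Split unique patient IDs into disjoint, equally sized replicas."""
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)         # assumed: random allocation
    size = len(ids) // n_replicas
    return [ids[r * size:(r + 1) * size] for r in range(n_replicas)]


def mean_sd(values):
    """Mean and sample standard deviation of a metric across replicas."""
    return statistics.mean(values), statistics.stdev(values)


replicas = patient_wise_replicas(range(880))   # 880 test patients
toy_mean, toy_sd = mean_sd([0.88, 0.90, 0.86])  # made-up metric values
```

All segments of a patient would then be evaluated within that patient's replica, so that no patient contributes to more than one estimate.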

3.3. Algorithm Based on CNNs

Figure 3a shows, in blue, the architecture of the multiclass OHCA rhythm classification algorithm based on CNNs. This architecture is based on the one proposed in [26] to discern Sh and NSh rhythms during manual CPR. The filtered ECG (a 1D signal of 3000 samples, i.e., 12 s at 250 Hz) was fed to a CNN composed of $B$ convolutional blocks (Figure 3 shows a three-block network as an example; $B$ is a tunable hyperparameter), which extract the high-level features of the signal, followed by two fully connected layers for the three-class classification. The $b$-th convolutional block is composed of a one-dimensional convolutional layer (Conv1D) with $J_b$ filters of width $I_b$, followed by a batch normalization (BN) layer, a rectified linear unit (ReLU), a max-pooling layer, and a dropout layer.
The input to the first convolutional block is defined as $s_0(n, 1) = \hat{s}_{\mathrm{ecg}}(n)$. The expression $s_{b-1}(n, m)$ will refer to the input of block $b$ or, equivalently, the output of block $b-1$, where $n$ and $m$ represent the time and filter index, respectively. The output of the Conv1D layer of the $b$-th convolutional block can be formulated as follows:
$$c_b(n, m) = b^{m} + \sum_{\rho=1}^{J_{b-1}} \sum_{i=1}^{I_b} w^{m}_{\rho, i}\, s_{b-1}(n + i - 1, \rho),$$
where the filter weights $w^{m}_{\rho, i}$ and the biases for channel shifting $b^{m}$ are the learnable parameters adjusted during training.
BN layers modify the output of the preceding layer to prevent complex weight interactions from altering the data distribution. This accelerates training by allowing for the use of larger learning rates and improves generalization while reducing overfitting [39]. For every training mini-batch $\mathcal{B}$, a BN layer calculates the channel-wise means $\mu_{\mathcal{B}, m}$ and variances $\sigma^{2}_{\mathcal{B}, m}$, and it then normalizes each channel via the following equation:
$$\hat{c}_b(n, m) = \frac{c_b(n, m) - \mu_{\mathcal{B}, m}}{\sqrt{\sigma^{2}_{\mathcal{B}, m} + \epsilon}},$$
where $\epsilon$ is a small value included to ensure numerical stability. The normalized channels are then adjusted through scaling and shifting to optimize the final ReLU layer. As a result, the outputs, $z_b(n, m)$, can be expressed as
$$z_b(n, m) = \gamma_m \cdot \hat{c}_b(n, m) + \beta_m,$$
where $\gamma_m$ and $\beta_m$ are trainable parameters. At inference, a moving average of the mini-batch means $\mu_{\mathcal{B}, m}$ and variances $\sigma^{2}_{\mathcal{B}, m}$ observed during training is typically used in the normalization equation above.
Max-pooling layers downsample input data by taking the maximum value from each block of $K$ elements along the time dimension $n$, so that the output for block $b$ can be represented as follows:
$$s_b(n, m) = \max \{\, z_b(k, m) : k = (n-1) \cdot K + 1, \dots, n \cdot K \,\}.$$
Finally, the ReLU layers add nonlinearity to the network through the activation function $f(x) = \max\{0, x\}$, enabling the model to learn intricate nonlinear mappings.
Zero-padding was applied before the convolution operations, so the only reduction in dimensionality was due to the max-pooling layers ($K = 3$). The dropout layer at the end of each block serves as a regularization mechanism, operating exclusively during training to prevent overfitting. This layer temporarily disables a randomly chosen fraction of the network’s adjustable parameters. The output of the last convolutional block was flattened and fed to a dense network composed of two fully connected layers with 10 and 3 neurons, respectively. Finally, a softmax layer transformed the output of the final 3 neurons into values in the range $[0, 1]$, representing the probabilities that a given segment corresponds to a Sh ($p_{\mathrm{Sh}}$), AS ($p_{\mathrm{AS}}$), or OR ($p_{\mathrm{OR}}$) rhythm.
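To make the data flow of one convolutional block concrete, the following sketch applies the convolution, batch normalization (with fixed inference statistics), ReLU, and max-pooling steps to a toy signal. The function names, the trailing position of the zero padding, and all numerical values are assumptions for illustration. Dropout is omitted because it is inactive at inference.

```python
import math


def conv1d(s_prev, weights, biases):
    """s_prev[n][rho] -> c[n][m]; weights[m][rho][i]; trailing zero padding (assumed)."""
    n_len, n_in, width = len(s_prev), len(s_prev[0]), len(weights[0][0])
    out = []
    for n in range(n_len):
        row = []
        for m, w_m in enumerate(weights):
            acc = biases[m]
            for rho in range(n_in):
                for i in range(width):
                    if n + i < n_len:        # samples past the end count as zeros
                        acc += w_m[rho][i] * s_prev[n + i][rho]
            row.append(acc)
        out.append(row)
    return out


def batch_norm(c, mean, var, gamma, beta, eps=1e-5):
    """Channel-wise normalization followed by scaling and shifting."""
    return [[gamma[m] * (v - mean[m]) / math.sqrt(var[m] + eps) + beta[m]
             for m, v in enumerate(row)] for row in c]


def relu_max_pool(z, k=3):
    """ReLU and max-pooling over non-overlapping windows of k samples."""
    return [[max(0.0, max(z[n * k + j][m] for j in range(k)))
             for m in range(len(z[0]))] for n in range(len(z) // k)]


# Toy input: six samples, one channel; one filter of width 2 (a first difference).
signal = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
c = conv1d(signal, weights=[[[1.0, -1.0]]], biases=[0.0])
z = batch_norm(c, mean=[0.0], var=[1.0], gamma=[1.0], beta=[0.0], eps=0.0)
block_out = relu_max_pool(z, k=3)   # [[0.0], [6.0]]
```

Since ReLU is monotonic, applying it before or after the maximum gives the same result, which is why the two steps are merged in `relu_max_pool`.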

3.4. Algorithm Based on ResNets

The third architecture explored was a ResNet, which mitigates the issue of performance deterioration as layers are added, enabling deeper networks [40]. The main components of a ResNet are residual blocks, which comprise a main path (including convolutional, batch normalization, and other typical CNN layers) and a shortcut path, which directly connects the input and the main path output. Let $x$ be the input to a residual block, and let $H(x)$ be the desired data transformation; instead of learning this transformation directly, residual blocks focus on learning the difference between the output and the input, which is called the residual, $F(x) = H(x) - x$. This is achieved by the simple addition of the main path and shortcut path outputs, and it makes it easier for the network to learn by focusing on refining the input rather than completely transforming it.
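The addition of the two paths can be shown with a very small sketch. The function `residual_block` and the lambda main paths are hypothetical stand-ins for the convolutional layers of a real block; only the identity shortcut is modelled.

```python
def residual_block(x, main_path):
    """Return H(x) = F(x) + x, where F is the main path and x the shortcut."""
    fx = main_path(x)
    if len(fx) != len(x):
        raise ValueError("an identity shortcut requires matching shapes")
    return [f + xi for f, xi in zip(fx, x)]


x = [0.5, -1.0, 2.0]
# If the main path learns a zero residual, the block reduces to the identity.
identity_out = residual_block(x, lambda v: [0.0] * len(v))
# A small residual only refines the input.
refined_out = residual_block(x, lambda v: [0.1 * vi for vi in v])
```

The first call illustrates why stacking such blocks does not have to degrade performance: a block can fall back to passing its input through unchanged.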
Figure 3b shows the layout of the ResNet architecture [40], which, similar to Jaureguibeitia et al. [21], was designed to replicate that of the CNN, thus deepening the network while maintaining a coherent structure. As with the CNN, networks of $B = 3, 4, 5,$ and $6$ blocks were considered, each block consisting of two residual blocks following the main path pre-activation configuration (conv-BN-ReLU-conv-BN) proposed by Han et al. [41]. The first block of the network was an exception to this rule and consisted of a single, much simpler conv-BN-ReLU configuration with no shortcut path. Pooling layers were replaced by strided convolutions, which skip every other step in the filtering process. When adjustments to length and depth are needed, the shortcut path of the first residual block includes a strided convolution to create a linear projection of the input. Finally, the hidden fully connected layer was replaced by a global average pooling layer, which outputs the mean value of each input channel [42].

3.5. CNN and ResNet Configurations

The CNN hyperparameters that were adjusted during the training phase were the following: the number of convolutional blocks, $B \in \{3, 4, 5, 6\}$; the width of the filters, $I \in \{2, 4, 8, 16, 32, 64, 92\}$ (the same filter width was considered along the $B$ blocks); and the number of filters, which varied from block to block, $\mathcal{L} = \{J_1, J_2, \dots, J_B\}$. Six filter configurations, with increasing number of filters (from sparse to dense), were studied: $\mathcal{L}_1 = (1, 2, 4, 8, 16, 32)$; $\mathcal{L}_2 = (2, 4, 8, 16, 32, 64)$; $\mathcal{L}_3 = (4, 8, 16, 32, 64, 128)$; $\mathcal{L}_4 = (6, 12, 24, 48, 96, 192)$; $\mathcal{L}_5 = (8, 16, 32, 64, 128, 256)$; and $\mathcal{L}_6 = (10, 20, 40, 80, 160, 320)$. The values in parentheses correspond to the number of filters $J_b$ for blocks $b = 1, \dots, 6$. For architectures with $B < 6$ blocks, central values (with upwards bias) were selected. Therefore, for 3, 4, and 5 blocks, the $\mathcal{L}_2$ configuration would be, for instance, as follows: (8, 16, 32), (4, 8, 16, 32), and (4, 8, 16, 32, 64), respectively. Table 1 shows the 6 specific filter configurations, $\mathcal{L}$, for each block number, $B$, applied to the CNN model. The final values of $B$, $I$, and configuration $\mathcal{L}$ were all optimized during the training phase.
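The rule for shorter architectures can be expressed compactly. The indexing formula below is an inference from the three examples given in the text, not a rule stated by the authors, and the names `FILTER_CONFIGS` and `filters_for_blocks` are hypothetical.

```python
# The six filter configurations listed in the text, indexed by configuration number.
FILTER_CONFIGS = {
    1: (1, 2, 4, 8, 16, 32),
    2: (2, 4, 8, 16, 32, 64),
    3: (4, 8, 16, 32, 64, 128),
    4: (6, 12, 24, 48, 96, 192),
    5: (8, 16, 32, 64, 128, 256),
    6: (10, 20, 40, 80, 160, 320),
}


def filters_for_blocks(config, n_blocks):
    """Pick n_blocks central values from a 6-value configuration, biased upwards."""
    extra = len(config) - n_blocks
    start = (extra + 1) // 2      # rounding up shifts the window towards larger values
    return config[start:start + n_blocks]


three = filters_for_blocks(FILTER_CONFIGS[2], 3)   # (8, 16, 32)
four = filters_for_blocks(FILTER_CONFIGS[2], 4)    # (4, 8, 16, 32)
five = filters_for_blocks(FILTER_CONFIGS[2], 5)    # (4, 8, 16, 32, 64)
```

The three calls reproduce the example given for the second configuration. Whether the same rule matches every entry of Table 1 cannot be confirmed from the text alone.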
As in the CNN, each convolutional layer in the ResNet used an identical filter width selected from $I \in \{2, 4, 8, 16, 32, 64, 92\}$. Similarly, the possible configurations of the number of filters were selected from $\mathcal{L}_1$ to $\mathcal{L}_6$, with the number of filters per block applied to each convolutional layer within that block. Table 2 shows the 6 specific filter configurations, $\mathcal{L}$, for each block number, $B$, applied to the ResNet model.
The selection of hyperparameters was performed in two phases. First, the CNN and ResNet models were evaluated for a fixed configuration, $\mathcal{L}_3$, and for all combinations of the number of blocks $B$ and filter width $I$. For each number of blocks tested, the optimal filter width, $I_{\mathrm{opt}}$, was selected as that of the best-performing model. Then, for each number of blocks, the models were evaluated using the corresponding $I_{\mathrm{opt}}$ and all the filter configurations $\mathcal{L}$, with the best-performing model being selected to analyze the test data. In both phases, the performance criterion was the average between UMS and UMFS.
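The two-phase procedure can be outlined as below. The callback `score` stands for the cross-validated average of UMS and UMFS of a model trained with a given number of blocks, filter width, and configuration index. It is a hypothetical placeholder, and `toy_score` is an arbitrary stand-in used only to exercise the search logic.

```python
def two_phase_search(score, blocks=(3, 4, 5, 6),
                     widths=(2, 4, 8, 16, 32, 64, 92),
                     configs=(1, 2, 3, 4, 5, 6), fixed_config=3):
    """Return the (blocks, width, config) triple with the highest score."""
    # Phase 1: best filter width for each number of blocks, configuration fixed.
    best_width = {b: max(widths, key=lambda w: score(b, w, fixed_config))
                  for b in blocks}
    # Phase 2: sweep the filter configurations at each block count's best width.
    candidates = [(b, best_width[b], c) for b in blocks for c in configs]
    return max(candidates, key=lambda t: score(*t))


def toy_score(b, w, c):
    """Arbitrary stand-in with a single optimum at b=4, w=16, c=5."""
    return -abs(b - 4) - abs(w - 16) / 16 - abs(c - 5) / 10


best = two_phase_search(toy_score)
```

With the grids above, the first phase compares 28 models and the second 24, instead of the 168 that an exhaustive search over all three hyperparameters would require.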
Both in the CNN and ResNet architectures, the weights and biases of every layer were optimized in order to minimize categorical cross-entropy using stochastic gradient descent with a momentum of 0.8. The initial learning rate was fixed at 0.02 and it was reduced by a factor of 0.8 at every epoch. The training process was conducted for 20 epochs with a batch size of 256 samples [43].
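For reference, the learning rate schedule implied by these settings can be tabulated as follows. The sketch assumes that the first epoch uses the initial rate and that the decay is applied once per subsequent epoch, which the text does not state explicitly.

```python
def learning_rates(initial=0.02, decay=0.8, epochs=20):
    """Learning rate used in each epoch under exponential decay."""
    return [initial * decay ** epoch for epoch in range(epochs)]


rates = learning_rates()
final_rate = rates[-1]   # roughly 2.9e-4 under the stated assumption
```

Under this reading, the learning rate falls by almost two orders of magnitude over the 20 training epochs.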

3.6. Comparison with the State of the Art: Classical Machine Learning

The performance of the CNN and ResNet models was compared with that of a state-of-the-art classical ML solution that was designed for multiclass OHCA rhythm classification during manual CPR [27]. In essence, the algorithm integrates a multi-resolution ECG analysis approach, employing the Stationary Wavelet Transform (SWT) for feature extraction and an RF classifier for the subsequent classification. The SWT decomposes the 12 s window into 7 detail coefficient sub-bands (d1–d7) using a Daubechies 4 mother wavelet. A denoised version of the ECG was also reconstructed using the detail coefficients d3 to d7, corresponding to an analysis frequency band of 0.98–31.25 Hz.
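The quoted analysis band follows from the dyadic structure of the wavelet decomposition, under the usual idealization that the detail coefficients at level $j$ cover the band from $f_s / 2^{j+1}$ to $f_s / 2^{j}$. The short sketch below reproduces the figures for a 250 Hz sampling frequency. The function names are chosen here for illustration.

```python
def detail_band(level, fs=250.0):
    """Nominal (low, high) band in Hz of the detail coefficients at a level."""
    return fs / 2 ** (level + 1), fs / 2 ** level


def reconstruction_band(first_level, last_level, fs=250.0):
    """Band covered when reconstructing from detail levels first..last."""
    return detail_band(last_level, fs)[0], detail_band(first_level, fs)[1]


low_hz, high_hz = reconstruction_band(3, 7)   # d3 to d7
```

The lower edge comes from d7 (250/256 Hz, about 0.98 Hz) and the upper edge from d3 (250/8 Hz = 31.25 Hz). Real wavelet filters overlap somewhat, so these limits are nominal.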
From the denoised ECG and the detail coefficients d3–d7, ninety-three features were extracted to characterize the OHCA rhythm subtypes, representing over 25 years of research in the field. These features were divided into five analysis domains. Time domain features included characteristics like the mean and the standard deviation of the heart rate [34]; spectral features included classical measures like VF leakage [44] or the power proportion concentrated around the VF-fibrillation band [45]; complexity analysis covered entropy measures like sample or Shannon entropy [46]; statistical analysis measures involved amplitude distribution characteristics; and, finally, phase space features utilized time-delay embedding to extract dynamics in the ECG. The detailed description of each of the 93 features can be found in [27].
The training set was divided using a 10-fold CV approach to select the best subset of $K$ features and to optimize RF hyperparameters. First, the optimal set of features was selected for each of the 10 training folds that constitute the 10-fold CV. Feature selection was based on a recursive feature elimination (RFE) approach using out-of-bag permutation importance as a ranking criterion [47,48]. Permutation importance is an inherent characteristic of the RF classifier that evaluates the significance of each feature by randomly shuffling its values in the training data of each tree in the forest and then measuring the resulting change in the out-of-bag error. In each iteration of the RFE algorithm, features were ranked, and the least important 3% were removed. This process was repeated until the optimal sets of $K$ features, with $K \in \{1, 3, 5, 7, 10 \cdot j\}$ and $j = 1, \dots, 9$, were selected for classification. Once the best $K$-feature subsets were selected in each of the 10 CV training folds, the RF classifier was optimized. Only one parameter of the RF classifier was considered for optimization: the minimum number of observations per leaf, $l_{\mathrm{size}}$, which controls the depth of the trees and was identified in [27] as essential for preventing overfitting. For every CV training fold and subset of $K$ features, models with different $l_{\mathrm{size}}$ values in the range $1 \le l_{\mathrm{size}} \le 200$ were trained, and these were then evaluated in the corresponding testing fold. The number of trees was set to $B = 500$, and the number of predictors per split was fixed to the default value, $\sqrt{K}$, for both feature selection and $l_{\mathrm{size}}$ optimization. $B = 500$ was found to be sufficient to stabilize accuracy without causing overfitting [49], and the default value of the number of predictors per split achieved by far the best performance in [27]. Finally, $l_{\mathrm{size}}$ was fixed to the default value, $l_{\mathrm{size}} = 1$, during the feature selection process.
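The elimination loop can be sketched independently of the classifier. In the sketch below, `rank` is a hypothetical callback that would, in practice, train an RF on the current feature subset and return the features ordered by out-of-bag permutation importance. Here it is replaced by a fixed toy ordering. How the 3% is rounded is also an assumption.

```python
import math


def recursive_feature_elimination(features, rank, keep_sizes, drop_frac=0.03):
    """Return {K: surviving features} for every target size K in keep_sizes.

    rank(features) must return the features sorted from most to least important.
    """
    selected, current = {}, list(features)
    for k in sorted(keep_sizes, reverse=True):
        while len(current) > k:
            ranked = rank(current)                           # re-rank every iteration
            n_drop = max(1, math.floor(drop_frac * len(ranked)))
            n_drop = min(n_drop, len(ranked) - k)            # do not overshoot K
            current = ranked[:len(ranked) - n_drop]
        selected[k] = list(current)
    return selected


# Target subset sizes from the text: 1, 3, 5, 7 and 10, 20, ..., 90.
K_GRID = [1, 3, 5, 7] + [10 * j for j in range(1, 10)]

# Toy example: 93 features whose importance equals their index.
subsets = recursive_feature_elimination(
    list(range(93)), rank=lambda fs: sorted(fs, reverse=True), keep_sizes=K_GRID)
```

Because the subsets are produced by successive elimination, each smaller subset is nested within the larger ones.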

4. Results

To set a reference for the CNN/ResNet results, the performance of the state-of-the-art classical ML algorithm was analyzed first. Figure 4 and Figure 5 show the results obtained by this algorithm on the training data. The left panel of Figure 4 shows the mean performance metrics (UMS, UMFS, and the average between UMS and UMFS) as a function of the number of features $K$; these were calculated as the average, across all $l_{\text{size}}(i)$ values considered, of the CV performance metrics obtained for the RF models with $K$ features and $l_{\text{size}} = l_{\text{size}}(i)$. The best compromise between model simplicity and performance was obtained for $K = 20$, as the metrics barely increased for larger values of $K$. The right panel of Figure 4 shows the CV performance metrics as a function of $l_{\text{size}}$ for RF models with $K = 20$ features, chosen on the basis of the previous results. In terms of the average between UMS and UMFS, the optimal range for $l_{\text{size}}$ was $1 \le l_{\text{size}} \le 9$, with a marked decline in performance for larger values of $l_{\text{size}}$. The value $l_{\text{size}} = 1$ was chosen as optimal because it produced the most balanced UMS and UMFS results. Figure 5 shows the 30 ECG features with the highest probability of selection for the CV RF models with $K = 20$ and $l_{\text{size}} = 1$. These probabilities were estimated by counting the number of times each feature was selected in the 10 iterations of the feature selection algorithm in the 10-fold CV loop. The nine most important features, i.e., those selected in 100% of the iterations, were highly heterogeneous: they were derived from all detail coefficients as well as from the denoised ECG, and they belonged to the complexity (SampEn), time (bCP), statistical (IQR, StdAbs1, Hmb, and Hcmp), and phase-space (SkewPSD) domains. The acronyms used in Figure 5 are the same as those in [27], where a detailed description of each feature is provided.
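The selection probabilities plotted in Figure 5 amount to counting how often each feature survives feature selection across the CV folds. The sketch below shows that calculation on toy input; the three folds and their feature sets are illustrative only (the study used ten folds and K = 20), and the function name is hypothetical.

```python
from collections import Counter

def selection_probability(selected_per_fold):
    """Fraction of CV folds in which each feature was selected."""
    n_folds = len(selected_per_fold)
    counts = Counter(f for fold in selected_per_fold for f in set(fold))
    return {name: c / n_folds for name, c in counts.items()}

# Toy input: three folds instead of ten; the feature sets are made up.
folds = [
    {"SampEn", "bCP", "IQR"},
    {"SampEn", "bCP"},
    {"SampEn", "IQR"},
]
probs = selection_probability(folds)
# Rank by decreasing probability, breaking ties by name.
ranked = sorted(probs.items(), key=lambda kv: (-kv[1], kv[0]))
```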
Given these results, a single optimal RF configuration was defined using $l_{\text{size}} = 1$ and the $K = 20$ most important features shown in Figure 5. In a 10-fold CV loop over the training data, this configuration obtained identical UMS and UMFS scores of 81.3%. As a single model trained on the complete training set and evaluated on the test set, it obtained a UMS and UMFS of 85.3% and 85.1%, respectively. The mean (SD) of the performance metrics obtained using the 20 replicas of the testing set can be found in Table 3.
Regarding the performance of the DL models, the impact of altering the main parameters of the CNN architecture is depicted in Figure 6. The top row shows the CV performance metrics (UMS, UMFS, and the average between UMS and UMFS) for a varying filter width $I$ and a fixed filter configuration $L_3$. The best performance in terms of the average between UMFS and UMS (the third column of both rows) was obtained for filter widths of $I = 64$, 32, 32, and 16 when $B = 3$, 4, 5, and 6 blocks were used, respectively. The bottom row shows the effect of changing the filter configuration with the filter width fixed at these optimal values. The $L_3$ configuration with $B = 5$ blocks and the $L_5$ configuration with $B = 6$ blocks achieved the best and very similar performances, with the six-block configuration being slightly better. Therefore, the optimal CNN model was composed of six convolutional blocks with 8, 16, 32, 64, 128, and 256 filters, all of width 16. In the training set, this model achieved a UMFS and UMS of 83.0% and 84.8%, respectively. In the entire testing set, the model obtained a UMFS and UMS of 86.1% and 87.5%, respectively. The mean (SD) of the performance metrics obtained using the 20 replicas of the testing set can be found in Table 3.
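To give a rough sense of how the filter counts and filter width drive model size, the sketch below counts the convolutional weights of a six-layer stack with the reported filter counts and width. It rests on several assumptions that the text does not confirm: one convolutional layer per block, a single-channel ECG input, and one bias per filter. Batch normalization and the classification layers are ignored, so the figure is not the parameter count of the published network.

```python
def conv_stack_parameters(filters, width, in_channels=1):
    """Count kernel weights and biases of a chain of 1-D convolutional layers."""
    total = 0
    channels = in_channels
    for n_filters in filters:
        total += channels * n_filters * width + n_filters  # kernels + biases
        channels = n_filters
    return total

# Filter counts reported for the selected CNN: 8 doubling up to 256, width 16.
cnn_filters = [8 * 2 ** b for b in range(6)]
cnn_params = conv_stack_parameters(cnn_filters, width=16)
```

Under these assumptions the last layer alone accounts for roughly three quarters of the convolutional weights, which illustrates why adding blocks with doubled filter counts increases complexity quickly.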
Figure 7 analyzes the effect of changing the parameters of the ResNet architecture. As with the CNN, the filter width was optimized first with the filter configuration fixed at $L_3$ (first row), and the filter configuration was then selected for the optimal filter widths (second row). The best classification results were obtained for four blocks. Adding a fifth block increased the complexity of the network (number of trainable parameters) and slightly decreased performance, whereas using only three blocks produced an overly simplistic model with a large decrease in performance. As the third column of the first row shows, the best performance in terms of the average between UMFS and UMS was obtained for $I = 32$ when $B \le 5$ and $I = 16$ when $B = 6$. With the filter width fixed, the second row of Figure 7 shows that the best performance was achieved for four blocks and the $L_4$ filter configuration. Thus, the optimal configuration of the ResNet architecture consisted of four blocks containing 6, 12, 24, and 48 filters, respectively, all of width 32; that is, the ResNet input block used six filters, and the three residual blocks used 12, 24, and 48 filters, respectively. In the training set, this configuration obtained a UMFS and UMS of 86.9% and 86.6%, respectively. In the entire testing set, the ResNet model obtained a value of 88.3% for both the UMFS and the UMS. The mean (SD) of the performance metrics obtained using the 20 replicas of the testing set can be found in Table 3.
Table 3 shows the results of the OHCA rhythm classification algorithms based on the classical ML, CNN, and ResNet models when the aforementioned optimal configurations were applied to the 20 replicas of the testing set. As the summary metrics show, the DL-based algorithms performed better than the traditional ML algorithm. The CNN model outperformed the RF model by 1.3 percentage points in UMFS, 2.3 in UMS, and 1.4 in ACC (p-value < 0.05). The ResNet was the best-performing model, outperforming the CNN model by 1.8 percentage points in UMFS, 0.6 in UMS, and 1.7 in ACC (p-value < 0.05). Consequently, the ResNet model outperformed the RF model by 3.1 percentage points in UMFS, 2.9 in UMS, and 3.1 in ACC (p-value < 0.05). These results show, for the first time, that deep learning models offer better performance than classical ML methods for multiclass cardiac rhythm classification during mechanical CCs.
Figure 8 shows the confusion matrices obtained for the optimal configuration of the three models using the entire testing set, which comprised 5813 segments obtained from 880 patients. The distinction between AS and OR proved to be the most challenging. For the ResNet, 15.1% of AS rhythms were incorrectly classified as OR, whereas 9.7% of OR rhythms were misclassified as AS. These findings are in line with those reported by Kwok et al., who demonstrated the first three-class rhythm classification algorithm during manual CPR on a limited set of patients [12]. In scenarios without CPR artifacts, the AS/OR discrimination is relatively simple and can be addressed using energy and heart rate measurements. During CCs, spiky filtering residuals may be mistaken for QRS complexes during AS (Panel (d) of Figure 9). Conversely, CPR artifact filtering may reduce R-peak amplitudes in OR rhythms, producing erroneous AS classifications (Panel (c) of Figure 9).
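The three summary metrics used throughout this section can all be derived from a confusion matrix such as those in Figure 8. The sketch below computes the unweighted mean of sensitivities (UMS), the unweighted mean of F1-scores (UMFS), and the accuracy (ACC). The counts in `toy_cm` are made up for illustration and are not the values of Figure 8; the function name is hypothetical.

```python
def multiclass_metrics(cm):
    """Return (UMS, UMFS, ACC) from a square confusion matrix.

    cm[i][j] is the number of segments of true class i predicted as class j.
    Every class is assumed to appear at least once among the true labels.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    sens, f1 = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                          # rest of the true-class row
        fp = sum(cm[r][c] for r in range(n)) - tp     # rest of the predicted column
        sens.append(tp / (tp + fn))
        f1.append(2 * tp / (2 * tp + fp + fn))
    ums = sum(sens) / n
    umfs = sum(f1) / n
    acc = sum(cm[c][c] for c in range(n)) / total
    return ums, umfs, acc

# Made-up counts for illustration (rows and columns ordered Sh, AS, OR).
toy_cm = [
    [90, 5, 5],
    [10, 80, 10],
    [0, 20, 80],
]
ums, umfs, acc = multiclass_metrics(toy_cm)
```

Because UMS and UMFS average over classes rather than over segments, they are not dominated by the most frequent rhythm, which matters here given the uneven numbers of Sh, AS, and OR segments.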
Given the importance of the Sh/NSh discrimination during resuscitation therapy, and as an additional experiment, the best-performing method (ResNet) was adapted for the binary Sh/NSh classification task. For this experiment, AS and OR rhythms were grouped into the NSh category, using 9666 segments (1252 Sh and 8414 NSh) to train and 5813 segments (1154 Sh and 4659 NSh) to test the model. The optimal ResNet architecture for this problem was selected in a manner analogous to that used for the three-class problem. Performance was evaluated in terms of sensitivity (SE, the proportion of correctly classified Sh rhythms) and specificity (SP, the proportion of correctly classified NSh rhythms), in line with the minimum performance requirements for the Sh/NSh discrimination recommended by the AHA; balanced accuracy (BAC, the mean of SE and SP) was chosen as the summary metric. Figure 10 shows the performance metrics obtained across the 10-fold CV loop in the training set. As in previous figures, the top row shows the impact of altering the filter width, $I$, while the bottom row shows the impact of altering the filter configuration, $L$, for a fixed filter width. As shown in the third panel of the bottom row, the $L_6$ configuration with three and four blocks and $I = 32$ achieved the maximum BAC, with the four-block configuration being slightly higher. This architecture obtained SE/SP/BAC scores of 85.5%/98.6%/92.0% and 90.6%/98.5%/94.6% in the training and testing sets, respectively.
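The grouping of AS and OR into NSh and the resulting SE, SP, and BAC can be sketched as follows. In the study, a separate binary ResNet was trained for this task; the code below only illustrates the label grouping and the metric definitions, using toy labels and hypothetical function names.

```python
def to_binary(label):
    """Map the three rhythm classes onto Sh / NSh."""
    return "Sh" if label == "Sh" else "NSh"

def binary_metrics(true_labels, predicted_labels):
    """Return (SE, SP, BAC) after merging AS and OR into NSh."""
    pairs = [(to_binary(t), to_binary(p))
             for t, p in zip(true_labels, predicted_labels)]
    tp = sum(1 for t, p in pairs if t == "Sh" and p == "Sh")
    fn = sum(1 for t, p in pairs if t == "Sh" and p == "NSh")
    tn = sum(1 for t, p in pairs if t == "NSh" and p == "NSh")
    fp = sum(1 for t, p in pairs if t == "NSh" and p == "Sh")
    se = tp / (tp + fn)   # proportion of Sh rhythms correctly classified
    sp = tn / (tn + fp)   # proportion of NSh rhythms correctly classified
    return se, sp, (se + sp) / 2

# Toy labels for illustration only.
y_true = ["Sh", "Sh", "AS", "OR", "OR", "AS"]
y_pred = ["Sh", "OR", "OR", "OR", "Sh", "AS"]
se, sp, bac = binary_metrics(y_true, y_pred)
```

Note that an AS segment predicted as OR counts as an error in the three-class task but as a correct NSh decision here, which is one reason the binary figures are higher than the multiclass ones.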

5. Discussion

The adoption of mechanical CPR devices has increased significantly in recent years, primarily through two main technologies: LDB and piston-driven mechanical CC devices. Mechanical CPR guarantees high-quality CCs when manual CPR (i) is subject to rescuer fatigue, (ii) is practically challenging, or (iii) cannot be delivered safely. However, mechanical CPR also introduces large artifacts in the ECG, hindering rhythm analysis, which is critical for clinical decision making. Given the negative effect of CPR interruptions on resuscitation outcomes, there is great interest in algorithms capable of rhythm analysis during ongoing CPR. To the best of our knowledge, this is the first study to address multiclass OHCA rhythm classification during mechanical CPR, and it is also the first to apply deep learning techniques in this context. Two specific DL architectures, CNN and ResNet, were tested and compared with a state-of-the-art classical ML approach [27].
DL-based algorithms outperformed the classical ML algorithm by at least 1.3 percentage points in UMFS, 2.3 in UMS, and 1.4 in ACC (Table 3). Considering that the classical ML algorithm relies on over 20 years of expert knowledge in ECG feature engineering for OHCA rhythm classification, these results highlight the ability of DL algorithms to learn discriminative features directly from the ECG. This simplifies the feature extraction process, saving time and, more importantly, improving the quality of the features extracted.
The ResNet-based algorithm offered the best performance, achieving a UMFS, UMS, and ACC of 88.3%, 88.3%, and 88.2%, respectively, for the three-class classification task. This performance is similar to that obtained in [27], a previous study that analyzed multiclass OHCA rhythm classification during manual CPR. The characteristics of manual compressions are rescuer-dependent, so the variability of the resulting artifacts makes filtering more challenging. However, manual artifacts showed significantly smaller amplitudes and fewer harmonic components (smaller bandwidths) than LDB artifacts, and this balance resulted in similar levels of accuracy [27,34,35]. For the Sh/NSh problem, the BAC was 94.6%, with an SE of 90.6% and an SP of 98.5%. This is a very important problem since it addresses shock advice decisions during CPR. Shock advice algorithms for defibrillators are normally tested on artifact-free data. In such a scenario, the AHA requires a minimum SE and SP of 90% and 95%, respectively [50]. Our solution met those requirements.
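The comparison with the AHA minimum values quoted above reduces to a simple threshold check, sketched below with the SE and SP values reported in this section. The constant and function names are hypothetical, and the thresholds are those cited in the text rather than a full encoding of the AHA statement [50].

```python
AHA_MIN_SE = 90.0  # minimum sensitivity for Sh rhythms (%), as cited in the text
AHA_MIN_SP = 95.0  # minimum specificity for NSh rhythms (%), as cited in the text

def meets_aha(se, sp):
    """True when both SE and SP (in %) reach the cited minimum values."""
    return se >= AHA_MIN_SE and sp >= AHA_MIN_SP

# Values reported for the binary ResNet in this section.
test_ok = meets_aha(90.6, 98.5)       # testing set
training_ok = meets_aha(85.5, 98.6)   # 10-fold CV in the training set
```

With these inputs the testing-set result passes both thresholds, while the cross-validated training result falls short in sensitivity, which shows how narrow the margin on SE is.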
The database used in this study was fully derived from OHCA cases. It is unclear whether the proposed algorithms would perform differently for in-hospital data; however, given that in-hospital resuscitation does not entail differences in LDB-CPR, it is tempting to argue that the analysis and results we have presented for OHCA patients will also be valid for in-hospital cardiac arrest and CPR.
Finally, some considerations about these results are worth noting. First, this study presents the first method for an automatic and clinically safe multiclass OHCA rhythm analysis during LDB-CPR. The proposed solution, together with already available solutions for piston-driven [34] and manual CCs [27], would cover rhythm analysis in every CPR scenario. This may open the possibility of a reliable multiclass OHCA rhythm analysis during CPR, helping to guide therapy while reducing no-flow intervals and thereby improving survival in OHCA. Second, while the proposed DL algorithms significantly outperformed the classical ML approach and also met the AHA requirements, they do not guarantee the best achievable performance. Due to computational constraints, only a finite set of network configurations was considered. Moreover, this study was limited to CNN and ResNet architectures, following previous studies on manual CPR [19,26,51,52,53]. More recent architectures such as Transformers (ViTs) [54] and Capsule Networks (CapsNets) [55] could result in improved performance, especially if more training data become available.
The results obtained in this study represent a meaningful and significant improvement over the current state-of-the-art techniques for rhythm classification during mechanical compressions. Future work will explore additional DL techniques to further optimize performance.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/math13081251/s1. The supplementary material outlines the steps required to use the trained neural network models (ResNet and CNN) for classifying cardiac rhythms in ECG segments.

Author Contributions

Conceptualization, L.W.; methodology, I.I. and X.J.; software, I.I.; validation, I.I., X.J. and E.A. (Erik Alonso); formal analysis, I.I., X.J., E.A. (Erik Alonso) and A.E.; investigation, I.I., X.J., E.A. (Erik Alonso) and A.E.; resources, E.A. (Erik Alonso) and E.A. (Elisabete Aramendi); data curation, I.I. and X.J.; writing – original draft preparation, I.I.; writing – review and editing, I.I., X.J., E.A. (Erik Alonso), E.A. (Elisabete Aramendi) and L.W.; visualization, I.I.; supervision, E.A. (Elisabete Aramendi) and L.W.; project administration, E.A. (Erik Alonso) and E.A. (Elisabete Aramendi); funding acquisition, E.A. (Erik Alonso) and E.A. (Elisabete Aramendi). All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially supported by MCIN/AEI/10.13039/501100011033 and by FEDER Una manera de hacer Europa through grant PID2021-122727OB-I00. Additional support was provided by the Basque Government through grants IT1717-22, 2023333042, and 2024333037; and by the University of the Basque Country (UPV/EHU) under grant EHU-N23/01.

Data Availability Statement

Two trained models were uploaded as Supplementary Materials, corresponding to the CNN and ResNet. Models were trained using the training dataset (9666 segments) and the optimal configurations. This allows other researchers to replicate our results using the same models and configurations.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Gräsner, J.T.; Herlitz, J.; Tjelmeland, I.B.; Wnent, J.; Masterson, S.; Lilja, G.; Bein, B.; Böttiger, B.W.; Rosell-Ortiz, F.; Nolan, J.P.; et al. European Resuscitation Council Guidelines 2021: Epidemiology of cardiac arrest in Europe. Resuscitation 2021, 161, 61–79.
  2. Soar, J.; Böttiger, B.W.; Carli, P.; Couper, K.; Deakin, C.D.; Djärv, T.; Lott, C.; Olasveengen, T.; Paal, P.; Pellis, T.; et al. European resuscitation council guidelines 2021: Adult advanced life support. Resuscitation 2021, 161, 115–151.
  3. Olasveengen, T.M.; Semeraro, F.; Ristagno, G.; Castren, M.; Handley, A.; Kuzovlev, A.; Monsieurs, K.G.; Raffay, V.; Smyth, M.; Soar, J.; et al. European resuscitation council guidelines 2021: Basic life support. Resuscitation 2021, 161, 98–114.
  4. Gough, C.J.; Nolan, J.P. The role of adrenaline in cardiopulmonary resuscitation. Crit. Care 2018, 22, 139.
  5. Kvaløy, J.T.; Skogvoll, E.; Eftestøl, T.; Gundersen, K.; Kramer-Johansen, J.; Olasveengen, T.M.; Steen, P.A. Which factors influence spontaneous state transitions during resuscitation? Resuscitation 2009, 80, 863–869.
  6. Nordseth, T.; Bergum, D.; Edelson, D.P.; Olasveengen, T.M.; Eftestøl, T.; Wiseth, R.; Abella, B.S.; Skogvoll, E. Clinical state transitions during advanced life support (ALS) in in-hospital cardiac arrest. Resuscitation 2013, 84, 1238–1244.
  7. Nordseth, T.; Niles, D.E.; Eftestøl, T.; Sutton, R.M.; Irusta, U.; Abella, B.S.; Berg, R.A.; Nadkarni, V.M.; Skogvoll, E. Rhythm characteristics and patterns of change during cardiopulmonary resuscitation for in-hospital paediatric cardiac arrest. Resuscitation 2019, 135, 45–50.
  8. Thakor, N.V.; Zhu, Y.S.; Pan, K.Y. Ventricular tachycardia and fibrillation detection by a sequential hypothesis testing algorithm. IEEE Trans. Biomed. Eng. 1990, 37, 837–843.
  9. Jekova, I.; Krasteva, V. Real time detection of ventricular fibrillation and tachycardia. Physiol. Meas. 2004, 25, 1167.
  10. Irusta, U.; Ruiz, J. An algorithm to discriminate supraventricular from ventricular tachycardia in automated external defibrillators valid for adult and paediatric patients. Resuscitation 2009, 80, 1229–1233.
  11. Neurauter, A.; Eftestøl, T.; Kramer-Johansen, J.; Abella, B.S.; Sunde, K.; Wenzel, V.; Lindner, K.H.; Eilevstjønn, J.; Myklebust, H.; Steen, P.A.; et al. Prediction of countershock success using single features from multiple ventricular fibrillation frequency bands and feature combinations using neural networks. Resuscitation 2007, 73, 253–263.
  12. Kwok, H.; Coult, J.; Drton, M.; Rea, T.D.; Sherman, L. Adaptive rhythm sequencing: A method for dynamic rhythm classification during CPR. Resuscitation 2015, 91, 26–31.
  13. Ristagno, G.; Mauri, T.; Cesana, G.; Li, Y.; Finzi, A.; Fumagalli, F.; Rossi, G.; Grieco, N.; Migliori, M.; Andreassi, A.; et al. Amplitude spectrum area to guide defibrillation: A validation on 1617 patients with ventricular fibrillation. Circulation 2015, 131, 478–487.
  14. Cabello, D.; Barro, S.; Salceda, J.; Ruiz, R.; Mira, J. Fuzzy K-nearest neighbor classifiers for ventricular arrhythmia detection. Int. J. Bio-Med. Comput. 1991, 27, 77–93.
  15. Rad, A.B.; Eftestøl, T.; Engan, K.; Irusta, U.; Kvaløy, J.T.; Kramer-Johansen, J.; Wik, L.; Katsaggelos, A.K. ECG-based classification of resuscitation cardiac rhythms for retrospective data analysis. IEEE Trans. Biomed. Eng. 2017, 64, 2411–2418.
  16. Cheng, P.; Dong, X. Life-threatening ventricular arrhythmia detection with personalized features. IEEE Access 2017, 5, 14195–14203.
  17. Li, Q.; Rajagopalan, C.; Clifford, G.D. Ventricular fibrillation and tachycardia classification using a machine learning approach. IEEE Trans. Biomed. Eng. 2013, 61, 1607–1613.
  18. Krasteva, V.; Jekova, I. Rhythm Analysis During Cardio-Pulmonary Resuscitation with Convolutional and Recurrent Neural Networks Using ECG and Optional Impedance Input. In Proceedings of the International Symposium on Bioinformatics and Biomedicine, Burgas, Bulgaria, 5–7 October 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 3–15.
  19. Jekova, I.; Krasteva, V. Optimization of end-to-end convolutional neural networks for analysis of out-of-hospital cardiac arrest rhythms during cardiopulmonary resuscitation. Sensors 2021, 21, 4105.
  20. Hajeb-M, S.; Cascella, A.; Valentine, M.; Chon, K. Deep neural network approach for continuous ECG-based automated external defibrillator shock advisory system during cardiopulmonary resuscitation. J. Am. Heart Assoc. 2021, 10, e019065.
  21. Jaureguibeitia, X.; Zubia, G.; Irusta, U.; Aramendi, E.; Chicote, B.; Alonso, D.; Larrea, A.; Corcuera, C. Shock decision algorithms for automated external defibrillators based on convolutional networks. IEEE Access 2020, 8, 154746–154758.
  22. Alwan, Y.; Cvetković, Z.; Curtis, M.J. Methods for improved discrimination between ventricular fibrillation and tachycardia. IEEE Trans. Biomed. Eng. 2017, 65, 2143–2151.
  23. Risdal, M.; Aase, S.O.; Kramer-Johansen, J.; Eftestol, T. Automatic identification of return of spontaneous circulation during cardiopulmonary resuscitation. IEEE Trans. Biomed. Eng. 2007, 55, 60–68.
  24. Cheskes, S.; Schmicker, R.H.; Christenson, J.; Salcido, D.D.; Rea, T.; Powell, J.; Edelson, D.P.; Sell, R.; May, S.; Menegazzi, J.J.; et al. Perishock pause: An independent predictor of survival from out-of-hospital shockable cardiac arrest. Circulation 2011, 124, 58–66.
  25. Ruiz de Gauna, S.; Irusta, U.; Ruiz, J.; Ayala, U.; Aramendi, E.; Eftestøl, T. Rhythm analysis during cardiopulmonary resuscitation: Past, present, and future. BioMed Res. Int. 2014, 2014, 386010.
  26. Isasi, I.; Irusta, U.; Aramendi, E.; Eftestøl, T.; Kramer-Johansen, J.; Wik, L. Rhythm analysis during cardiopulmonary resuscitation using convolutional neural networks. Entropy 2020, 22, 595.
  27. Isasi, I.; Irusta, U.; Rad, A.B.; Aramendi, E.; Zabihi, M.; Eftestøl, T.; Kramer-Johansen, J.; Wik, L. Automatic cardiac rhythm classification with concurrent manual chest compressions. IEEE Access 2019, 7, 115147–115159.
  28. Wik, L.; Olsen, J.A.; Persse, D.; Sterz, F.; Lozano, M., Jr.; Brouwer, M.A.; Westfall, M.; Souders, C.M.; Malzer, R.; van Grunsven, P.M.; et al. Manual vs. integrated automatic load-distributing band CPR with equal survival after out of hospital cardiac arrest. The randomized CIRC trial. Resuscitation 2014, 85, 741–748.
  29. Rubertsson, S.; Lindgren, E.; Smekal, D.; Östlund, O.; Silfverstolpe, J.; Lichtveld, R.A.; Boomars, R.; Ahlstedt, B.; Skoog, G.; Kastberg, R.; et al. Mechanical chest compressions and simultaneous defibrillation vs conventional cardiopulmonary resuscitation in out-of-hospital cardiac arrest: The LINC randomized trial. JAMA 2014, 311, 53–61.
  30. Krep, H.; Mamier, M.; Breil, M.; Heister, U.; Fischer, M.; Hoeft, A. Out-of-hospital cardiopulmonary resuscitation with the AutoPulse™ system: A prospective observational study with a new load-distributing band chest compression device. Resuscitation 2007, 73, 86–95.
  31. Ong, M.E.H.; Mackey, K.E.; Zhang, Z.C.; Tanaka, H.; Ma, M.H.M.; Swor, R.; Shin, S.D. Mechanical CPR devices compared to manual CPR during out-of-hospital cardiac arrest and ambulance transport: A systematic review. Scand. J. Trauma Resusc. Emerg. Med. 2012, 20, 39.
  32. Putzer, G.; Braun, P.; Zimmermann, A.; Pedross, F.; Strapazzon, G.; Brugger, H.; Paal, P. LUCAS compared to manual cardiopulmonary resuscitation is more effective during helicopter rescue – a prospective, randomized, cross-over manikin study. Am. J. Emerg. Med. 2013, 31, 384–389.
  33. Ashton, A.; McCluskey, A.; Gwinnutt, C.; Keenan, A. Effect of rescuer fatigue on performance of continuous external chest compressions over 3 min. Resuscitation 2002, 55, 151–155.
  34. Isasi, I.; Irusta, U.; Elola, A.; Aramendi, E.; Ayala, U.; Alonso, E.; Kramer-Johansen, J.; Eftestøl, T. A machine learning shock decision algorithm for use during piston-driven chest compressions. IEEE Trans. Biomed. Eng. 2018, 66, 1752–1760.
  35. Isasi, I.; Irusta, U.; Aramendi, E.; Olsen, J.; Wik, L. Shock decision algorithm for use during load distributing band cardiopulmonary resuscitation. Resuscitation 2021, 165, 93–100.
  36. Lerner, E.B.; Persse, D.; Souders, C.M.; Sterz, F.; Malzer, R.; Lozano, M., Jr.; Westfall, M.; Brouwer, M.A.; van Grunsven, P.M.; Whitehead, A.; et al. Design of the Circulation Improving Resuscitation Care (CIRC) Trial: A new state of the art design for out-of-hospital cardiac arrest research. Resuscitation 2011, 82, 294–299.
  37. Xiao, Y.; Ma, L.; Ward, R.K. Fast RLS Fourier analyzers capable of accommodating frequency mismatch. Signal Process. 2007, 87, 2197–2212.
  38. Irusta, U.; Ruiz, J.; de Gauna, S.R.; Eftestøl, T.; Kramer-Johansen, J. A least mean-square filter for the estimation of the cardiopulmonary resuscitation artifact based on the frequency of the compressions. IEEE Trans. Biomed. Eng. 2009, 56, 1052–1062.
  39. Ioffe, S. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv 2015, arXiv:1502.03167.
  40. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
  41. Han, D.; Kim, J.; Kim, J. Deep pyramidal residual networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 5927–5935.
  42. Lin, M.; Chen, Q.; Yan, S. Network in network. arXiv 2013, arXiv:1312.4400.
  43. Sutskever, I.; Martens, J.; Dahl, G.; Hinton, G. On the importance of initialization and momentum in deep learning. In Proceedings of the International Conference on Machine Learning, PMLR, Atlanta, GA, USA, 17–19 June 2013; pp. 1139–1147.
  44. Kuo, S. Computer detection of ventricular fibrillation. In Proceedings of the Computers in Cardiology; IEEE Computer Society: Washington, DC, USA, 1978; pp. 347–349.
  45. Jekova, I. Shock advisory tool: Detection of life-threatening cardiac arrhythmias and shock success prediction by means of a common parameter set. Biomed. Signal Process. Control 2007, 2, 25–33.
  46. Alonso-Atienza, F.; Morgado, E.; Fernandez-Martinez, L.; Garcia-Alberola, A.; Rojo-Alvarez, J.L. Detection of life-threatening arrhythmias using feature selection and support vector machines. IEEE Trans. Biomed. Eng. 2013, 61, 832–840.
  47. Pang, H.; George, S.L.; Hui, K.; Tong, T. Gene selection using iterative feature elimination random forests for survival outcomes. IEEE/ACM Trans. Comput. Biol. Bioinform. 2012, 9, 1422–1431.
  48. Shen, K.Q.; Ong, C.J.; Li, X.P.; Hui, Z.; Wilder-Smith, E.P. A feature selection method for multilevel mental fatigue EEG classification. IEEE Trans. Biomed. Eng. 2007, 54, 1231–1237.
  49. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  50. Kerber, R.E.; Becker, L.B.; Bourland, J.D.; Cummins, R.O.; Hallstrom, A.P.; Michos, M.B.; Nichol, G.; Ornato, J.P.; Thies, W.H.; White, R.D.; et al. Automatic external defibrillators for public access defibrillation: Recommendations for specifying and reporting arrhythmia analysis algorithm performance, incorporating new waveforms, and enhancing safety: A statement for health professionals from the American Heart Association Task Force on Automatic External Defibrillation, Subcommittee on AED Safety and Efficacy. Circulation 1997, 95, 1677–1682.
  51. Krasteva, V.; Didon, J.P.; Ménétré, S.; Jekova, I. Deep learning strategy for sliding ECG analysis during cardiopulmonary resuscitation: Influence of the hands-off time on accuracy. Sensors 2023, 23, 4500.
  52. Lee, S.; Jung, S.; Ahn, S.; Cho, H.; Moon, S.; Park, J.H. Comparison of Neural Network Structures for Identifying Shockable Rhythm During Cardiopulmonary Resuscitation. J. Clin. Med. 2025, 14, 738.
  53. Gong, Y.; Wei, L.; Yan, S.; Zuo, F.; Zhang, H.; Li, Y. Transfer learning based deep network for signal restoration and rhythm analysis during cardiopulmonary resuscitation using only the ECG waveform. Inf. Sci. 2023, 626, 754–772.
  54. Apostol, A.; Nutu, M. Arrhythmia Classification from 12-Lead ECG Signals Using Convolutional and Transformer-Based Deep Learning Models. arXiv 2025, arXiv:2502.17887.
  55. Jayasekara, H.; Jayasundara, V.; Athif, M.; Rajasegaran, J.; Jayasekara, S.; Seneviratne, S.; Rodrigo, R. Timecaps: Capturing time series data with capsule networks. arXiv 2019, arXiv:1911.11800.
Figure 1. A 70 s interval from an OHCA episode. From top to bottom: (a) ECG, (b) instantaneous CC rate, (c) thoracic impedance (TI), and (d) filtered ECG (only filtered in the interval corresponding to mechanical CCs in the database segment). CC instants are depicted as blue circles in the TI signal. Activity shows manual CPR (first 15 s), followed by a pause for the LDB device application, and then a resumption of mechanical CPR (last 20 s). The interval highlighted in green in Panel (a) corresponds to a 22 s segment included in the study dataset. The first 6 s (light green), free of CC artifacts, were used to annotate the ground truth rhythm (organized) of the patient, while the last 16 s with artifact (dark green) were used to develop the algorithms.
Figure 2. Panel (a) illustrates the adaptive RLS filtering scheme, while Panel (b) shows the input and output signals in the interval corresponding to the last 16 s of the database segment highlighted in dark green in Figure 1. From top to bottom: the corrupted ECG, $s_{\text{cor}}(n)$; the estimated CC artifact, $\hat{s}_{\text{cc}}(n)$; and the filtered ECG, $\hat{s}_{\text{ecg}}(n)$. The highlighted segment corresponds to the 12 s interval used for the development of the classification algorithms, thus avoiding filtering transients.
Figure 3. Architecture of (a) CNN-based and (b) ResNet-based rhythm classifiers. Panel (a) was extracted from [26].
Figure 4. The cross-validation performance metrics (UMFS, UMS, and the average between UMFS and UMS) for the classical machine learning algorithm as a function of the number of features $K$ (left panel) and the minimum number of observations per leaf $l_{\text{size}}$ (right panel). The right panel corresponds to $K = 20$.
Figure 4. The cross-validation performance metrics (UMFS, UMS, and the average between UMFS and UMS) for the classic machine learning algorithm as a function of the number of features K (left panel) and the minimum number of observations per leaf l size (right panel). The right panel corresponds to K = 20 .
Mathematics 13 01251 g004
Figure 5. Selection probability for the 30 most selected features by the RF classifier in the 10 iterations of the 10-fold CV loop.
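A selection probability such as the one plotted in Figure 5 can be computed as the fraction of cross-validation iterations in which a feature was selected. The sketch below assumes that definition; the helper name `selection_probability` and the feature names in the usage note are hypothetical.

```python
from collections import Counter

def selection_probability(selected_per_fold):
    """Fraction of CV iterations in which each feature was selected.

    selected_per_fold: one iterable of feature names per CV iteration.
    Returns a dict ordered from most to least frequently selected.
    """
    n_folds = len(selected_per_fold)
    # set() guards against a feature being listed twice within one fold.
    counts = Counter(f for fold in selected_per_fold for f in set(fold))
    return {name: c / n_folds for name, c in counts.most_common()}
```

For example, `selection_probability([["f1", "f2"], ["f1"]])` assigns `f1` a probability of 1.0 and `f2` a probability of 0.5.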
Figure 6. Performance of the CNN architecture as a function of the configurable parameters of the network: the number of blocks ($B$), the filter width ($I$), and the filter configuration ($L$). The first row shows the effect of the filter width, $I$, for networks with the $L_3$ configuration. The second row shows the effect of the filter configurations, from dense ($L_6$) to sparse ($L_1$), for the optimal $I$ values.
Figure 7. Performance of the ResNet architecture as a function of the configurable parameters of the network: the number of blocks ($B$), the filter width ($I$), and the filter configuration ($L$). The first row shows the effect of the filter width, $I$, for networks with the $L_3$ configuration. The second row shows the effect of the filter configurations, from dense ($L_6$) to sparse ($L_1$), for the optimal $I$ values.
Figure 8. Confusion matrices for the different models using the entire testing set, where 5813 segments were obtained from 880 patients. The sensitivities for each class and model are shown on the diagonals.
Figure 9. Examples of segments correctly (left: (a,b)) and incorrectly (right: (c,d)) classified by the ResNet model. The ground truth (Grd Truth) classification is shown in the interval without compressions, and the diagnoses of the ResNet model are shown in the interval with mechanical CCs. The wrong diagnoses of the ResNet were caused by the inability of the filter to properly remove artifacts. This led to very reduced QRS complexes that resembled an AS rhythm during an OR rhythm (c) or to spiky filtering residuals that looked like QRS complexes during AS (d).
Figure 10. Performance of the ResNet architecture for the binary Sh/NSh classification in terms of the configurable parameters of the network: the number of blocks ($B$), the filter width ($I$), and the filter configuration ($L$). The first row shows the effect of the filter width, $I$, for networks with the $L_3$ configuration. The second row shows the effect of the filter configurations, from dense ($L_6$) to sparse ($L_1$), for the optimal $I$ values.
Table 1. The six filter configurations, $L$, tested in a 10-fold cross-validation loop for each number of convolutional blocks, $B$, used in the CNN architecture.

| Configuration of the number of filters, $L$ | $B = 3$ | $B = 4$ | $B = 5$ | $B = 6$ |
|---|---|---|---|---|
| $L_1$ | (4, 8, 16) | (2, 4, 8, 16) | (2, 4, 8, 16, 32) | (1, 2, 4, 8, 16, 32) |
| $L_2$ | (8, 16, 32) | (4, 8, 16, 32) | (4, 8, 16, 32, 64) | (2, 4, 8, 16, 32, 64) |
| $L_3$ | (16, 32, 64) | (8, 16, 32, 64) | (8, 16, 32, 64, 128) | (4, 8, 16, 32, 64, 128) |
| $L_4$ | (24, 48, 96) | (12, 24, 48, 96) | (12, 24, 48, 96, 192) | (6, 12, 24, 48, 96, 192) |
| $L_5$ | (32, 64, 128) | (16, 32, 64, 128) | (16, 32, 64, 128, 256) | (8, 16, 32, 64, 128, 256) |
| $L_6$ | (40, 80, 160) | (20, 40, 80, 160) | (20, 40, 80, 160, 320) | (10, 20, 40, 80, 160, 320) |
Table 2. The six filter configurations, $L$, tested in the 10-fold cross-validation loop for each number of convolutional blocks, $B$, used in the ResNet architecture. Except for the first convolutional block, each convolutional block consisted of two residual blocks. Thus, for $B = 3$, $B = 4$, $B = 5$, and $B = 6$, there are 5, 7, 9, and 11 numbers in parentheses, respectively.

| $L$ | $B = 3$ | $B = 4$ | $B = 5$ | $B = 6$ |
|---|---|---|---|---|
| $L_1$ | (4, 8, 8, 16, 16) | (2, 4, 4, 8, 8, 16, 16) | (2, 4, 4, 8, 8, 16, 16, 32, 32) | (1, 2, 2, 4, 4, 8, 8, 16, 16, 32, 32) |
| $L_2$ | (8, 16, 16, 32, 32) | (4, 8, 8, 16, 16, 32, 32) | (4, 8, 8, 16, 16, 32, 32, 64, 64) | (2, 4, 4, 8, 8, 16, 16, 32, 32, 64, 64) |
| $L_3$ | (16, 32, 32, 64, 64) | (8, 16, 16, 32, 32, 64, 64) | (8, 16, 16, 32, 32, 64, 64, 128, 128) | (4, 8, 8, 16, 16, 32, 32, 64, 64, 128, 128) |
| $L_4$ | (24, 48, 48, 96, 96) | (12, 24, 24, 48, 48, 96, 96) | (12, 24, 24, 48, 48, 96, 96, 192, 192) | (6, 12, 12, 24, 24, 48, 48, 96, 96, 192, 192) |
| $L_5$ | (32, 64, 64, 128, 128) | (16, 32, 32, 64, 64, 128, 128) | (16, 32, 32, 64, 64, 128, 128, 256, 256) | (8, 16, 16, 32, 32, 64, 64, 128, 128, 256, 256) |
| $L_6$ | (40, 80, 80, 160, 160) | (20, 40, 40, 80, 80, 160, 160) | (20, 40, 40, 80, 80, 160, 160, 320, 320) | (10, 20, 20, 40, 40, 80, 80, 160, 160, 320, 320) |
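The entries of Tables 1 and 2 follow a regular pattern: within a configuration the number of filters doubles from block to block, and the first value depends only on the number of blocks and on a per-configuration multiplier. The sketch below reproduces both tables from that observation. The lookup tables `FIRST` and `SCALES` and the function names are read off the tabulated values for illustration; they are not taken from the authors' code.

```python
# First-block filter count of configuration L1 for each number of blocks B.
FIRST = {3: 4, 4: 2, 5: 2, 6: 1}
# Multiplier applied to L1 to obtain configurations L1..L6.
SCALES = {1: 1, 2: 2, 3: 4, 4: 6, 5: 8, 6: 10}

def cnn_filters(level, blocks):
    """Filters per convolutional block (Table 1): doubling sequence."""
    first = FIRST[blocks] * SCALES[level]
    return [first * 2**i for i in range(blocks)]

def resnet_filters(level, blocks):
    """Filters per layer group (Table 2).

    The first block appears once; every later block holds two
    residual blocks, so its filter count is listed twice.
    """
    base = cnn_filters(level, blocks)
    return base[:1] + [f for f in base[1:] for _ in range(2)]
```

For instance, `cnn_filters(3, 5)` gives `[8, 16, 32, 64, 128]`, and `resnet_filters(1, 3)` gives `[4, 8, 8, 16, 16]`, matching the corresponding table cells.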
Table 3. Performance metrics obtained by the OHCA multiclass algorithms on the 20 replicas of the testing set, reported as the mean (SD).

| Metric | Class | RF | CNN | ResNet |
|---|---|---|---|---|
| SE (%) | AS | 80.2 (4.4) | 85.0 (5.3) | 84.7 (5.9) |
| SE (%) | OR | 84.5 (6.4) | 83.4 (5.3) | 87.9 (4.3) |
| SE (%) | Sh | 90.8 (4.6) | 94.3 (3.2) | 91.5 (3.5) |
| PPV (%) | AS | 76.9 (6.9) | 78.0 (6.0) | 80.1 (5.8) |
| PPV (%) | OR | 87.0 (2.7) | 91.2 (3.0) | 90.2 (3.4) |
| PPV (%) | Sh | 90.5 (5.6) | 86.8 (6.8) | 93.7 (4.4) |
| $F_1$-Score (%) | AS | 78.2 (3.6) | 81.1 (3.5) | 82.1 (3.8) |
| $F_1$-Score (%) | OR | 85.6 (3.3) | 87.0 (3.1) | 89.0 (2.7) |
| $F_1$-Score (%) | Sh | 90.4 (3.1) | 90.2 (3.7) | 92.5 (2.7) |
| Sum. metrics (%) | UMFS | 84.8 (2.6) | 86.1 (2.4) | 87.9 (2.2) |
| Sum. metrics (%) | UMS | 85.2 (2.2) | 87.5 (2.2) | 88.1 (2.4) |
| Sum. metrics (%) | ACC | 84.6 (2.8) | 86.0 (2.5) | 87.7 (2.4) |
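The quantities in Table 3 can all be derived from a confusion matrix. The sketch below assumes the usual definitions: per-class sensitivity (SE) and positive predictive value (PPV), the $F_1$-score as their harmonic mean, UMS and UMFS as the unweighted means of the per-class SE and $F_1$ values, and ACC as the fraction of correctly classified segments. The function name `multiclass_metrics` and the matrix in the usage note are hypothetical and do not reproduce the study's data.

```python
import numpy as np

def multiclass_metrics(conf):
    """Summary metrics from a confusion matrix.

    conf[i, j] = number of segments of true class i predicted as class j.
    Assumes every class occurs and is predicted at least once.
    """
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)
    se = tp / conf.sum(axis=1)      # sensitivity: correct / all true in class
    ppv = tp / conf.sum(axis=0)     # PPV: correct / all predicted as class
    f1 = 2 * se * ppv / (se + ppv)  # harmonic mean of SE and PPV
    return {
        "SE": se,
        "PPV": ppv,
        "F1": f1,
        "UMS": se.mean(),           # unweighted mean of sensitivities
        "UMFS": f1.mean(),          # unweighted mean of F1-scores
        "ACC": tp.sum() / conf.sum(),
    }
```

Calling `multiclass_metrics([[8, 1, 1], [2, 6, 2], [0, 0, 10]])` on this toy three-class matrix returns per-class arrays in the row order of the matrix together with the three summary values. Because UMS and UMFS weight every class equally, they are less dominated by the most frequent rhythm than ACC.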
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Isasi, I.; Jaureguibeitia, X.; Alonso, E.; Elola, A.; Aramendi, E.; Wik, L. Artificial Intelligence for Multiclass Rhythm Analysis for Out-of-Hospital Cardiac Arrest During Mechanical Cardiopulmonary Resuscitation. Mathematics 2025, 13, 1251. https://doi.org/10.3390/math13081251
