Prediction of Sleep Apnea Occurrence from a Single-Lead Electrocardiogram Using Stacking Hybrid Architecture with Gated Recurrent Neural Network Architectures and Logistic Regression

Tan, Tan-Hsu; Chen, Guan-Hua; Liu, Shing-Hong; Chen, Wenxi

doi:10.3390/technologies14020092

¹ Innovation Frontier Institute of Research for Science and Technology, National Taipei University of Technology, Taipei 10608, Taiwan

² Department of Computer Science and Information Engineering, Chaoyang University of Technology, Taichung City 41349, Taiwan

³ Division of Information Systems, School of Computer Science and Engineering, The University of Aizu, Aizu-Wakamatsu City 965-8580, Fukushima, Japan

* Author to whom correspondence should be addressed.

Technologies 2026, 14(2), 92; https://doi.org/10.3390/technologies14020092

Submission received: 3 January 2026 / Revised: 20 January 2026 / Accepted: 24 January 2026 / Published: 1 February 2026

(This article belongs to the Special Issue AI-Enabled Smart Healthcare Systems)

Abstract

Obstructive sleep apnea (OSA) is a common sleep disorder that impacts patient health and imposes a burden on families and healthcare systems. The diagnosis of OSA is usually performed through overnight polysomnography (PSG) in a hospital setting. In recent years, OSA detection using a single-lead electrocardiogram (ECG) has been explored. The advantage of this method is that patients can be measured in home environments. Thus, the aim of this study was to predict occurrences of sleep apnea with parameters extracted from previous single-lead ECG measurements. The parameters were the R-R interval (RRI) and R-wave amplitude (RwA). The dataset was the single-lead ECG Apnea-ECG Database, and a stacking hybrid architecture (SHA) including three gated recurrent neural network architectures (GRNNAs) and logistic regression was proposed to improve the accuracy of OSA detection. Three GRNNAs used three different recurrent neural networks: Bidirectional Long Short-Term Memory (BiLSTM), Gated Recurrent Unit (GRU), and Bidirectional GRU (BiGRU). The challenge of this method was in exploring how many minutes of previous RRI and RwA measurements (n minutes) have the best performance in predicting occurrences of sleep apnea in the future (h minutes). The results showed that the SHA under an n of 20 min had the best performance in predicting occurrences of sleep apnea in the following 10 min: the SHA achieved a precision of 95.79%, sensitivity of 94.74%, specificity of 97.48%, F₁-score of 95.26%, and accuracy of 96.45%. The proposed SHA was successful in predicting future sleep apnea occurrence with a single-lead ECG. Thus, this approach could be used in the development of wearable sleep monitors for the management of sleep apnea.

Keywords: obstructive sleep apnea; electrocardiogram; gated recurrent neural network; logistic regression; stacking hybrid architecture; wearable device

1. Introduction

Sleep apnea is a condition characterized by recurrent pauses in breathing or shallow breathing during sleep. There are three main types of sleep apnea [1]: obstructive sleep apnea (OSA), central sleep apnea (CSA), and mixed sleep apnea (MSA). OSA is the most common type and occurs due to the soft tissues of the upper airway collapsing during sleep, leading to partial or complete airway obstruction and subsequent apnea. Sleep apnea, particularly OSA, has substantial clinical relevance due to its strong association with multisystem morbidity and increased mortality [2]. Recurrent upper airway obstruction during sleep leads to intermittent hypoxia, sleep fragmentation, and large intrathoracic pressure swings, which collectively contribute to sympathetic nervous system overactivation, systemic inflammation, oxidative stress, and endothelial dysfunction. Clinically, untreated sleep apnea is closely linked to cardiovascular diseases [3] such as hypertension, coronary artery disease, heart failure, atrial fibrillation, and stroke, as well as metabolic disorders, including insulin resistance and type 2 diabetes [4]. In addition, sleep apnea negatively affects neurocognitive function, resulting in excessive daytime sleepiness, impaired attention, reduced executive function, and an increased risk of occupational and traffic accidents. The early detection and effective management of sleep apnea therefore play a critical role in reducing long-term cardiovascular risk [5], improving metabolic control, enhancing quality of life, and lowering healthcare burden at both individual and population levels [6].

According to Benjafield et al. [7], more than 900 million adults worldwide may suffer from OSA. Young et al. [8] reported that the prevalence of OSA among adults is 24% in men and 9% in women, with higher incidence among males and middle-aged or older populations. OSA is also positively associated with hypertension, stroke, cardiovascular disease, diabetes, and cognitive impairment. Furthermore, Chowdhuri et al. [9] and Strollo et al. [10] report that the prevalence in men is twice that in women. Among postmenopausal women receiving hormone replacement therapy, the prevalence is similar to that of premenopausal women. However, for postmenopausal women who do not receive hormone therapy, the prevalence is significantly higher and approaches that of men. Data from Taiwan’s National Health Insurance Research Database [11] indicate that the prevalence of sleep apnea in Taiwan is approximately 0.49%, which suggests substantial underestimation. In addition, according to a research report from the National Health Research Institutes [11], older adults tend to present with symptoms such as insomnia or have no noticeable symptoms, and the causes are often related to unstable breathing patterns. Therefore, sleep apnea is easily overlooked in older adults who lack obvious symptoms such as excessive daytime sleepiness or snoring, which constitutes an important risk that should not be ignored.

At present, overnight polysomnography (PSG) is the gold standard measurement for diagnosing OSA [7,12]. PSG must be conducted in a sleep center, where multiple physiological signals are measured throughout the night, including electroencephalography (EEG), electrooculography (EOG), electrocardiography (ECG), electromyography (EMG), the respiratory effort of the thoracic and abdominal regions, oral–nasal airflow, blood pressure changes, blood oxygen saturation (SaO₂), heart rate, sleep position, and the apnea–hypopnea index (AHI). These data are used to determine whether sleep apnea is present. PSG performed in a professional sleep center provides detailed physiological measurements, and with continuous monitoring and interpretation by trained personnel, it can accurately diagnose various sleep disorders. However, the patient must stay overnight in a hospital or specialized sleep center and be connected to multiple sensors and wires during the examination. These measurement procedures may affect sleep quality, consequently influencing diagnostic accuracy. In addition, the high cost of PSG reduces the willingness of the general population to undergo the test.

Chazal et al. extracted the QRS complex to derive the R–R interval (RRI) and the ECG-derived respiration (EDR) signal as features. Using linear discriminant analysis and quadratic discriminant analysis, one-minute segments could be classified as apnea or normal breathing with accuracies of 90.4% and 90.6%, respectively [13]. Khandoker et al. [14] extracted the parameters of heart rate variability (HRV) and EDR using the wavelet transform, and applied a Support Vector Machine (SVM) to classify subjects as having OSA or normal breathing. The evaluation was conducted on 42 subjects, correctly identifying 24 out of 26 OSA patients and 15 out of 16 non-OSA individuals. The performance achieved an accuracy of 90.0%, a recall of 92.3%, and a specificity of 93.8%, demonstrating that the SVM model can effectively detect obstructive sleep apnea events. Mendez et al. [15] used the K-Nearest Neighbor (KNN) algorithm and a neural network (NN) to classify apnea and normal breathing segments. The KNN model achieved an accuracy of 88%, a recall of 85%, and a specificity of 90%; the NN model achieved an accuracy of 88%, a recall of 89%, and a specificity of 86%. Compared with deep-learning models, KNN and NN have lower computational complexity and are easier to implement in embedded or wearable devices for real-time execution. However, autoregressive models assume that signals follow a linear process, whereas ECG and respiratory dynamics exhibit strong nonlinearity, which may result in suboptimal performance for certain apnea conditions. Bahrami and Forouzanfar [16] used the RRI extracted from a single-lead ECG, and normalized the HRV parameters of each one-minute segment to a fixed length of 256 points using cubic spline interpolation. These sequences were then directly fed into a one-dimensional convolutional neural network (1D CNN) for training. Among 35 test cases, the model achieved an accuracy of 88.23%, a recall of 82.74%, a specificity of 91.62%, and an area under the curve (AUC) of approximately 0.9453 for classifying one-minute segments. In comparison, the same study reported an accuracy of 78.15% for an SVM using time-domain HRV parameters, 74.56% for an SVM using only frequency-domain parameters, and 80.91% for an SVM using both time- and frequency-domain parameters.

Chang et al. [17] employed a 1D CNN for feature extraction, an LSTM to capture temporal dynamics, and a fully connected DNN for classification. Using 10-fold cross-validation, the model achieved an accuracy of 87.9%, a recall of 81.1%, a specificity of 92.0%, and an F₁-score of 79.07%. Sharan et al. [18] also utilized a 1D CNN for OSA classification with a single-lead ECG signal. They used one-minute ECG segments as the classification unit, achieving an accuracy of 87.9%, a recall of 81.1%, a specificity of 92.0%, and an AUC of approximately 0.935. These previous studies follow two approaches. In the first, HRV parameters are passed to a machine-learning (ML) model that classifies one-minute segments as apnea or normal breathing. In the second, a 1D CNN extracts apnea-related features, and an ML model then classifies apnea and normal breathing. However, the second approach needs a larger model, which may not run on a resource-limited edge computing system.

Some researchers [19,20] proposed that portable systems for helping detect OSA have significant potential for future development. Their advantages are threefold. The first is real-time intervention. When an apnea event is detected, continuous positive airway pressure (CPAP) devices will adjust the pressure settings in real time to reduce the occurrence of hypoxemia. The second benefit is predictive treatment. If future OSA risk can be predicted using past real-time data, medical devices may proactively adjust relevant settings to address patient needs at an early stage. The third advantage is safety. When prolonged airway obstruction is anticipated, an alarm can be immediately activated to ensure patient safety during sleep.

Bahrami and Forouzanfar proposed an ensemble learning architecture to forecast the occurrence of sleep apnea from single-lead ECG. The results showed that the proposed method using past ten-minute data could forecast apnea events up to one minute in the future. An accuracy of up to 94.95% was achieved [21]. When using past five-minute data to forecast apnea events up to one minute in the future, an accuracy of up to 91.38% was achieved. Thus, the aim of this study was to predict the probability of OSA occurring further into the future with less input data. We proposed a stacking hybrid architecture (SHA) that includes three types of gated recurrent neural network, namely Bidirectional Long Short-Term Memory (BiLSTM), Gated Recurrent Unit (GRU), and Bidirectional GRU (BiGRU), combined with logistic regression (LR) [22] to improve the accuracy of OSA detection. The challenge of this method lies in exploring how much of a past time span (n minutes) is needed to forecast the occurrence of sleep apnea in the future (h minutes). We used the Apnea-ECG Database [23] to explore the optimal n and h. There were 70 subjects in this database, with 50% of subjects belonging to the sleep apnea group and the remainder belonging to the normal group. The ECG signal was segmented into one-minute intervals. The RRI and R-wave amplitude (RwA) were extracted from each segment as the input features. The RRI and RwA sequences were interpolated to a sampling rate of 3 Hz via cubic spline interpolation. Thus, each one-minute segment yielded 180 RRI points and 180 RwA points, giving an input vector of dimension 360. We designed n to be 1, 3, 5, 10, or 20 min, and h to be the first, third, fifth, eighth, or tenth minute. Thus, there were 25 conditions to be evaluated.

2. Materials and Methods

Figure 1 shows a flowchart of the proposed method, which included two phases. In the first phase, data preprocessing and feature engineering, the single-lead ECG signals were divided into one-minute segments, and the RRI and RwA sequences were extracted and resampled to 3 Hz signals using cubic spline interpolation. The input vector for each minute had 360 points, formed by concatenating the RRI signal with the RwA signal. In the second phase, a stacking hybrid architecture (SHA) was proposed to forecast occurrences of sleep apnea, including three types of gated recurrent neural network architecture (GRNNA): Bidirectional Long Short-Term Memory architecture (BiLSTMA), Gated Recurrent Unit architecture (GRUA), and Bidirectional GRU architecture (BiGRUA). Then, an LR was used to fuse the outputs of the three GRNNAs to predict occurrences of sleep apnea at a future time point. The output layer of each GRNNA has two nodes representing the probabilities of apnea and non-apnea occurrence. However, only the apnea probabilities of the three GRNNAs form the input vector of the LR, whose dimension is therefore 3. The output layer of the LR has one node representing the probability of apnea occurrence.

2.1. Apnea-ECG Database

The data used were obtained from the Apnea-ECG Database of PhysioNet [23], which included 70 subjects. Here, 50% of the subjects belong to the sleep apnea group, and the remaining subjects belong to the normal group. ECG measurements were taken at a 100 Hz sampling rate over 7–8 h. This database also provides an annotation file in which the apnea marks are synchronized with the ECG. Each one-minute segment is marked as A if the subject exhibits apnea according to AHI criteria; otherwise, it is marked as N. This dataset includes 35 files (a01–a20, b01–b05, c01–c10) in the training group, and 35 files (x01–x35) in the testing group.

2.2. Data Segment

The one-minute segments were organized according to three variables: k, n, and h. k is the index of the starting segment, n is the number of consecutive input segments, and h is the number of minutes after the last input segment at which the occurrence of sleep apnea is forecast. Figure 2 shows two examples to illustrate the relations between k, n, and h within a 10 min interval. Figure 2a shows the example of k = 0, n = 5, h = 3. The 10 min ECG was divided into 10 segments, from S₀ to S₉. The starting segment is the first segment (S₀), and the number of input segments is five (S₀ to S₄). These segments are used to forecast whether segment S₇ includes an occurrence of sleep apnea. Figure 2b shows the example of k = 1, n = 5, h = 3. The five segments from S₁ to S₅ are used to forecast whether segment S₈ includes an occurrence of sleep apnea.
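The indexing in the two Figure 2 examples can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `make_samples` is hypothetical, and sliding k forward one minute at a time is an assumption suggested by the k = 0 and k = 1 examples.

```python
from typing import List, Tuple


def make_samples(labels: List[str], n: int, h: int) -> List[Tuple[range, int, int]]:
    """Enumerate (input indices, target index, target label) triples.

    labels[i] is 'A' (apnea) or 'N' (normal) for one-minute segment S_i.
    For a starting index k, the inputs are S_k .. S_{k+n-1} and the target
    is S_{k+n-1+h}, which reproduces both Figure 2 examples.
    Assumption: k advances in steps of one minute.
    """
    samples = []
    k = 0
    while k + n - 1 + h < len(labels):
        target = k + n - 1 + h
        label = 1 if labels[target] == "A" else 0
        samples.append((range(k, k + n), target, label))
        k += 1
    return samples


# Ten one-minute segments S0..S9, with apnea marked only in S7.
labels = ["N"] * 10
labels[7] = "A"
samples = make_samples(labels, n=5, h=3)
```

With n = 5 and h = 3, the first sample uses S₀ to S₄ to forecast S₇ and the second uses S₁ to S₅ to forecast S₈, so the number of available samples shrinks as n or h grows.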

2.3. Extracting RRI and RwA

In each segment, R-waves were detected using the BioSPPy toolbox [24]. The duration of the QRS complex was set to ±0.05 s in this program. When R-waves are detected, two parameters are obtained: RRI and RwA. Because the heart rate is controlled by the autonomic nervous system (ANS) and is not constant, the RRI and RwA values within one minute form two unevenly spaced sequences rather than uniformly sampled signals. The RRI and RwA sequences were normalized to [0, 1] to reduce the interference of false positive or false negative R-waves and arrhythmic beats. Then, cubic spline interpolation [25,26] was used to resample the normalized RRI and RwA sequences as signals with a sampling rate of 3 Hz. Thus, in one segment (minute), there were 180 points each for the RRI and RwA. The input vector for one segment was the concatenation of the two signals, with a dimension of 360.
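The normalization and resampling steps described above might look like the following sketch. It assumes that R-wave times and amplitudes are already available, that min-max scaling is the normalization used, and that each RRI is time-stamped at its closing R-wave; none of these details are stated in the text, and `resample_minute` is a hypothetical helper name.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def resample_minute(r_times, r_amps, fs=3.0, duration=60.0):
    """Build the 360-point input vector for one one-minute segment.

    r_times: R-wave times in seconds (strictly increasing).
    r_amps:  R-wave amplitudes at those times.
    """
    r_times = np.asarray(r_times, dtype=float)
    r_amps = np.asarray(r_amps, dtype=float)

    rri = np.diff(r_times)   # one RRI per pair of consecutive R-waves
    t = r_times[1:]          # assumption: stamp each RRI at its closing R-wave
    rwa = r_amps[1:]         # keep the amplitudes aligned with the RRIs

    def min_max(x):
        # Assumed normalization: scale the sequence to [0, 1].
        span = x.max() - x.min()
        return (x - x.min()) / span if span > 0 else np.zeros_like(x)

    # 180 evenly spaced time points at 3 Hz.
    grid = np.arange(int(fs * duration)) / fs

    # CubicSpline extrapolates outside [t[0], t[-1]] by default, so values
    # near the segment edges may fall slightly outside [0, 1].
    rri_signal = CubicSpline(t, min_max(rri))(grid)
    rwa_signal = CubicSpline(t, min_max(rwa))(grid)

    # Concatenate the RRI signal and the RwA signal: 180 + 180 = 360 points.
    return np.concatenate([rri_signal, rwa_signal])


# Synthetic beats with a slowly varying interval, for illustration only.
beat_times = np.cumsum(0.8 + 0.05 * np.sin(np.arange(76)))
beat_amps = 1.0 + 0.1 * np.cos(np.arange(76))
vec = resample_minute(beat_times, beat_amps)
```

Whatever the heart rate, the output always has 360 points, which is what allows segments with different beat counts to share one fixed-size network input.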

2.4. Gated Recurrent Neural Network Architectures

We proposed three GRNNAs. Each GRNNA used a BiLSTM, GRU, or BiGRU as the first layer. Figure 3 shows their architectures. In BiLSTMA and BiGRUA, the first layer is the gated recurrent neural network, whose hidden state has 64 units each for the forward and backward directions, making a total of 128. In GRUA, the first layer has 64 units. The second layer has 32 nodes and uses the ReLU activation function. The output layer has 2 nodes and uses the Softmax activation function. A dropout rate of 30% was applied between the second layer and the output layer. The two nodes in the output layer represent the probability of sleep apnea occurring (p_Apnea) and of normal respiration (p_Normal) in the future.
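A minimal Keras sketch of the three architectures is given below, using the layer sizes stated above. The input shape is an assumption: the text does not say how the n one-minute vectors are arranged, so the sketch feeds them as a sequence of n time steps with 360 features each. The helper name `build_grnna` is hypothetical.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_grnna(kind: str, n_minutes: int, features: int = 360) -> tf.keras.Model:
    """Build one GRNNA: recurrent layer -> Dense(32, ReLU) -> Dropout -> Softmax(2)."""
    if kind == "bilstm":
        recurrent = layers.Bidirectional(layers.LSTM(64))  # 64 + 64 = 128 units
    elif kind == "gru":
        recurrent = layers.GRU(64)
    elif kind == "bigru":
        recurrent = layers.Bidirectional(layers.GRU(64))   # 64 + 64 = 128 units
    else:
        raise ValueError(f"unknown architecture: {kind}")

    return tf.keras.Sequential([
        # Assumed input layout: n time steps, one 360-point vector per minute.
        layers.Input(shape=(n_minutes, features)),
        recurrent,
        layers.Dense(32, activation="relu"),
        layers.Dropout(0.3),                    # 30% dropout before the output layer
        layers.Dense(2, activation="softmax"),  # two class probabilities
    ])


model = build_grnna("bigru", n_minutes=10)
```

The order of the two softmax outputs (which node is p_Apnea) depends on how the labels are encoded and is not fixed by this sketch.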

2.5. Logistic Regression

We used LR to fuse the outputs of the three GRNNAs to improve the accuracy, as shown in Figure 4. P[i] is the input vector for the ith sample, which consists of the probabilities of apnea or hypopnea (p_BiLSTM, p_GRU, p_BiGRU) produced by the three GRNNAs, as shown in Equation (1). The input layer has three nodes and the output layer has one node. The goal of LR is to forecast the occurrence of sleep apnea. The activation function (σ) is the sigmoid function, as shown in Equation (2), where \hat{p}(i) is the probability of sleep apnea occurrence for the ith sample, w is the weight vector of the fully connected layer, P_{ij} is the jth element of P[i], and b is the bias. Backpropagation is used to train w and b.

P[i] = [p_{BiLSTM}(i), p_{GRU}(i), p_{BiGRU}(i)]    (1)

\hat{p}(i) = \sigma(w^{T} P[i] + b) = \frac{1}{1 + \exp(-(\sum_{j=1}^{3} w_{j} \times P_{ij} + b))}    (2)

In this study, 0.5 was used as the decision threshold. If \hat{p}(i) ≥ 0.5, the sample was categorized as sleep apnea; otherwise, it was categorized as normal respiration.
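Equations (1) and (2) and the 0.5 threshold can be illustrated with a few lines of plain Python. The weights and bias below are hypothetical values chosen only for the example; in the study they are learned during training.

```python
import math
from typing import Sequence


def fuse(p: Sequence[float], w: Sequence[float], b: float) -> float:
    """Equation (2): sigmoid of the weighted sum of the three apnea probabilities."""
    z = sum(w_j * p_j for w_j, p_j in zip(w, p)) + b
    return 1.0 / (1.0 + math.exp(-z))


def classify(p_hat: float, threshold: float = 0.5) -> str:
    """Apply the decision threshold used in the study."""
    return "apnea" if p_hat >= threshold else "normal"


# Hypothetical meta-learner parameters (not taken from the paper).
w = [2.0, 2.0, 2.0]
b = -3.0

# P[i] = [p_BiLSTM(i), p_GRU(i), p_BiGRU(i)] for two illustrative samples.
p_hat_high = fuse([0.90, 0.80, 0.95], w, b)
p_hat_low = fuse([0.10, 0.20, 0.05], w, b)
```

Because the meta-learner only sees three numbers per sample, it adds very little computation on top of the three base networks.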

2.6. Metrics of the Model

A confusion matrix was used in this study and the following metrics were calculated as measures of overall performance: precision, sensitivity, specificity, accuracy, and F₁-score. True positive (TP) is a classification of apnea and an outcome of apnea. True negative (TN) is a classification of normal respiration and an outcome of normal respiration. False positive (FP) is a classification of apnea and an outcome of normal respiration. False negative (FN) is a classification of normal respiration and an outcome of apnea. Under fivefold cross-validation, the counts of TP, TN, FP, and FN in each fold are used to compute the evaluation metrics.

Precision is defined in Equation (3), sensitivity in Equation (4), specificity in Equation (5), accuracy in Equation (6), and F₁-score in Equation (7), where Precision and Sensitivity are the percentage values given by Equations (3) and (4).

Precision (%) = \frac{TP}{TP + FP} \times 100%    (3)

Sensitivity (%) = \frac{TP}{TP + FN} \times 100%    (4)

Specificity (%) = \frac{TN}{TN + FP} \times 100%    (5)

Accuracy (%) = \frac{TP + TN}{TP + TN + FP + FN} \times 100%    (6)

F_{1}-score (%) = 2 \times \frac{Precision \times Sensitivity}{Precision + Sensitivity}    (7)
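As a worked illustration of Equations (3) to (7), the sketch below computes the five metrics from confusion-matrix counts. The counts are invented for the example and are not results from this study; the function name `metrics` is hypothetical.

```python
def metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute the five evaluation metrics, all expressed in percent."""
    precision = tp / (tp + fp) * 100.0
    sensitivity = tp / (tp + fn) * 100.0
    specificity = tn / (tn + fp) * 100.0
    accuracy = (tp + tn) / (tp + tn + fp + fn) * 100.0
    # Precision and sensitivity are already percentages, so their harmonic
    # mean is a percentage too and needs no further factor of 100.
    f1 = 2.0 * precision * sensitivity / (precision + sensitivity)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Invented counts for illustration only.
m = metrics(tp=90, tn=95, fp=5, fn=10)
```

The F₁-score computed this way equals 2TP / (2TP + FP + FN) × 100%, which offers a quick cross-check on the other values.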

3. Results

This study involved experiments using a computer equipped with an Intel Core i5-13600K processor and an NVIDIA GeForce RTX 4090 graphics processing unit (NVIDIA, Santa Clara, CA, USA). Anaconda 3 was used as the development environment for Python 3.8, along with the Anaconda 3 deep-learning suite, which includes TensorFlow 2.10.0 and Keras 2.3.1. A Jupyter Notebook was used to perform the training and performance evaluation of the deep-learning models.

We explored the performance of the three GRNNAs and the SHA separately, using a grid search over the n and h variables. As outlined in Section 2.2, the numbers of training and testing samples depend on n and h. For the three GRNNAs, we set n as 5, 10, and 20 min, and h as the first, third, fifth, eighth, and tenth minute. Table 1 shows the numbers of samples from the training and testing files, and the total samples. Fivefold cross-validation was used to evaluate the metrics of the three GRNNAs and the SHA.

3.1. Metrics of Three GRNNAs

The metrics of the three GRNNAs, which used BiLSTM, GRU, and BiGRU as the first layer, were evaluated under the different n and h variables. Tables 2, 3, and 4 show the metrics of the GRNNAs using BiLSTM, GRU, and BiGRU, respectively, with different previous ECG signals (n of 5, 10, and 20 min) to predict occurrences of sleep apnea at different future time points (h of the first, third, fifth, eighth, and tenth minute). For BiLSTM (Table 2), the accuracy decreases as the future time increases: using the previous 5 min of ECG signals to predict sleep apnea at the first and tenth minute resulted in accuracies of 94.3% and 92.6%, respectively. However, when n is 20 min, the accuracy is independent of the future time. For GRU (Table 3), the accuracy decreases as the future time increases only with an n of 5 min, for which the accuracies are 94.4% and 92.1% under an h of the first and tenth minute, respectively. For the other values of n, the accuracies are independent of the h variable. For BiGRU (Table 4), the best accuracy, 96.04%, is obtained under an n of 10 min and h of the tenth minute, and the worst accuracy, 93.86%, is obtained when n is 5 min and h is the eighth minute. Apart from these extremes, the accuracies show little dependence on the n and h variables.

When the n variable increases, the precision, sensitivity, specificity, F₁-score, and accuracy of the three GRNNAs improve significantly. This result indicates that a longer span of input data leads to better performance for OSA prediction. Most of the improvement is obtained by the time the input data span 10 min, and extending the input data to 20 min yields only marginal performance gains. Thus, 10 min of input data strikes a favorable balance between predictive performance and computational cost. Table 5 presents the performance of the three GRNNAs with an n of 10 min and h of the first, fifth, and tenth minute.

For the h of the first minute, BiGRU achieves a higher precision of 95.08% and specificity of 97.06%, respectively. In contrast, GRU outperforms in terms of sensitivity (94.47%), F₁-score (94.62%), and accuracy (95.95%). For an h of the fifth minute, GRU demonstrates superior precision of 94.02% and specificity of 96.42%, whereas BiGRU attains higher sensitivity (94.67%), F₁-score (94.29%), and accuracy (95.68%). For an h of the tenth minute, BiGRU exhibits the best overall performance across all evaluation metrics, including precision of 94.71%, sensitivity of 94.77%, specificity of 96.80%, F₁-score of 94.74%, and accuracy of 96.04%.

3.2. Metrics of Stacking Hybrid Architecture

Table 6 shows the metrics of the LR under an n of 1, 3, 5, 10, and 20 min and h of the first, third, fifth, eighth, and tenth minute. Its metrics increase by about 1.2% to 2.2% when the input time (n) increases from 1 min to 10 min. When the input time is 20 min, its metrics only increase a small amount under an h of the first minute with an accuracy of 96.34% vs. 96.45%, and reduce slightly under an h of the third, fifth, eighth, and tenth minute with an accuracy of 96.07% vs. 95.95%, 96.08% vs. 96.07%, 96.26% vs. 96.18%, and 96.15% vs. 96.02%, respectively.

We used the receiver operating characteristic (ROC) curve to compare the performance of the LR and the three GRNNAs under an n of 5, 10, and 20 min, and h of the first and tenth minute. Figure 5 shows the ROC curves of the three GRNNAs and SHA under an n of 5 min. When h is the first minute, the AUC of SHA is larger than those of GRU, BiLSTM, and BiGRU, at 0.980 vs. 0.970, 0.976, and 0.976 (Figure 5a). When h is the tenth minute, the AUC of SHA is larger than those of GRU, BiLSTM, and BiGRU, at 0.967 vs. 0.944, 0.959, and 0.960 (Figure 5b). Figure 6 shows the ROC curves of the three GRNNAs and SHA under an n of 10 min. When h is the first minute, the AUC of SHA is larger than those of GRU, BiLSTM, and BiGRU, at 0.988 vs. 0.982, 0.984, and 0.985 (Figure 6a). When h is the tenth minute, the AUC of SHA is larger than those of GRU, BiLSTM, and BiGRU, at 0.985 vs. 0.977, 0.977, and 0.981 (Figure 6b). Figure 7 shows the ROC curves of the three GRNNAs and SHA under an n of 20 min. When h is the first minute, the AUC of SHA is larger than those of GRU, BiLSTM, and BiGRU, at 0.989 vs. 0.985, 0.984, and 0.987 (Figure 7a). When h is the tenth minute, the AUC of SHA is larger than those of GRU, BiLSTM, and BiGRU, at 0.989 vs. 0.985, 0.984, and 0.987 (Figure 7b). Table 7 shows the AUC values of the three GRNNAs and SHA. SHA has the largest AUC of 0.986 under an n of 20 min and h of the tenth minute.

4. Discussion

With the advancement of smart sleep wearables capable of unobtrusively monitoring multiple physiologic signals, forecasting sleep apnea occurrences has become increasingly feasible. Predicting the onset of sleep apnea can facilitate timely preventive interventions and the development of effective therapeutic and management strategies. In this study, we proposed a stacking hybrid architecture for modeling ECG-based physiological parameters, RRI and RwA, to forecast sleep apnea occurrences. Bahrami and Forouzanfar [21] used a deep-learning model whose best accuracy reached 94.95% ± 0.33% under an n of 20 min and h of the first minute. Stojanovski et al. used edge computing to detect sleep apnea with the same database, for which accuracy was only up to 82% [27]. Torres et al. used a Raspberry Pi for detecting sleep apnea [28], obtaining an accuracy of 90.2%. Under the same conditions, our proposed GRNNA with BiGRU and SHA achieved accuracies of 95.96% and 96.45%, respectively. Table 7 shows the AUC values of the three GRNNAs and SHA. SHA has the largest AUC of 0.989 under an n of 20 min and h of the first minute. Because the SHA fuses the outputs of the three GRNNAs, its metrics were better than those of any individual GRNNA. Many studies support this finding. Priyadharsini and Phamila used a deep hybrid architecture with stacked ensemble learning for the binary classification of retinal disease, achieving an accuracy of 92.3% [29]. Sharma et al. used a hybrid CNN and BiGRU-Attention-based deep-learning model for protein function prediction, achieving a molecular function prediction of 5.2% and biological process prediction of 1.2% [20]. Manna et al. proposed a fuzzy rank-based ensemble of CNN models for the classification of cervical cytology, with an accuracy of 95.43% [30]. Liu et al. proposed a SHA, a deep neural network (DNN) stacked on a CatBoost model, to estimate the median frequency (MDF) and root mean square (RMS) of a surface electromyogram (sEMG) using mechanomyograms. The Pearson correlation coefficients for MDF and RMS estimations were approximately 0.98 and 0.92, respectively [31].

In Table 7, different previous ECG signals (n of 5, 10, and 20 min) are presented. We find that the largest AUC values for forecasting apnea–respiration occurrences at different future time points (h of the first, fifth, and tenth minute) are for SHA. When using ECG signals from the previous 20 min to predict occurrences of sleep apnea at future time points, the AUC values of SHA reach 0.989, 0.985, and 0.986, respectively. Thus, a greater amount of previous time point data may lead to higher accuracy in forecasting occurrences of sleep apnea. Moreover, predicting the occurrence of sleep apnea in the very near future (h of the first minute) has the largest AUC values under different previous time point ECG signals (n of 5, 10, and 20 min), with AUC values of SHA reaching 0.980, 0.988, and 0.989, respectively. Predicting the occurrence of sleep apnea in the farthest future time point (h of the tenth minute) has the smallest AUC values under different ECG signals from previous time points (n of 5, 10, and 20 min), with AUC values of SHA of 0.967, 0.985, and 0.986, respectively. Thus, the further back in time the signals are obtained (n of 20 min), the greater the accuracy of predicting sleep apnea in the following minute (h of the first minute).

Because the RRI and RwA sequences are not sampled at a fixed rate, the numbers of RRI and RwA points in one minute depend on the heart rate. However, the dimensions of the input vector for the GRNNAs must be fixed. Thus, the RRI and RwA sequences must be resampled to generate RRI and RwA signals. When linear interpolation is used, the resulting RRI and RwA signals are not smooth. If high-order global polynomial interpolation is used, boundary oscillations occur at equidistant nodes and produce larger errors, which is known as the Runge phenomenon [25]. Cubic spline baseline estimation can effectively remove drift [26]. Cubic splines are piecewise cubic polynomials that maintain the continuity of the first and second derivatives at the nodes, and offer both smoothness and local stability. Therefore, they are particularly suitable for nonlinear time signals such as electrocardiograms.

To select ECG signals with good quality, we used two rules to classify the quality of the ECG in each segment. RRI and RwA signals were extracted only from good-quality segments to train and test the proposed models for forecasting sleep apnea. The first rule was that the RRI should lie between 300 ms and 2000 ms. Any RRI outside this range was deleted. The second rule was that a segment had to contain at least 10 RRI and RwA points. After deleting the segments with bad signal quality, the number of segments categorized as sleep apnea decreased from 6596 to 6299, and segments categorized as normal respiration decreased from 10,448 to 10,409. The loss rate was only 2.02%. Because this simple rule-based method needs almost no additional random-access memory (RAM) or flash memory, it can be embedded in edge computing systems with resource limitations. Liu et al. proposed 2D CNN models to classify PPG quality [32], and a stacking hybrid architecture, 1D CNN + GRU, to classify the quality of impedance plethysmograms and ballistocardiograms [33]. These learned models require considerably more RAM and flash memory and are therefore less suitable for embedding in edge computing systems.
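The two quality rules can be written as a short filter. This is a sketch under stated assumptions: the bounds are treated as inclusive, the RwA value paired with a deleted RRI is dropped with it, and the function name `clean_segment` is hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def clean_segment(
    rri_ms: Sequence[float],
    rwa: Sequence[float],
    lo: float = 300.0,
    hi: float = 2000.0,
    min_points: int = 10,
) -> Optional[Tuple[List[float], List[float]]]:
    """Apply the two quality rules to one one-minute segment.

    Rule 1: delete every RRI outside [lo, hi] ms (with its paired RwA).
    Rule 2: reject the whole segment if fewer than min_points remain.
    Returns the cleaned (RRI, RwA) lists, or None for a rejected segment.
    """
    kept = [(r, a) for r, a in zip(rri_ms, rwa) if lo <= r <= hi]
    if len(kept) < min_points:
        return None
    rri_clean, rwa_clean = zip(*kept)
    return list(rri_clean), list(rwa_clean)


# One implausible 2500 ms interval among twelve plausible 800 ms intervals.
good = clean_segment([800.0] * 12 + [2500.0], [1.0] * 13)
# Only nine plausible intervals remain, so the segment is rejected.
bad = clean_segment([800.0] * 9 + [100.0], [1.0] * 10)
```

The filter is a single pass of comparisons over at most a few hundred values per minute, which is why it suits memory-constrained devices.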

LR can be used to replace standard linear regression to enhance performance for a binary outcome [34]. Subasi and Ercelebi used two ML models, a multilayer perceptron neural network (MLPNN) and LR, to classify epileptic seizures using different DWT signals from an electroencephalogram. They found that MLPNN had better performance than LR [35]. Li et al. used MLPNN and LR models to classify heart sounds as normal or abnormal. MLPNN also had a better performance than LR [36]. Thus, when LR is used independently to perform classifications, its performance is poorer. However, meta learning has been used to fuse the outcomes of different ML or DL models to improve performance [37]. In this study, we proposed a stacking hybrid architecture that includes three GRNNAs to forecast the probability of sleep apnea occurrence, and used LR as the meta learning model to fuse the outcomes of the three GRNNAs and produce the final decision.

The respiratory and circulatory systems are tightly coupled through regulation by the autonomic nervous system. During inspiration, intrathoracic pressure decreases, leading to a reduction in atrial pressure and the diminished activation of baroreceptors. This results in a decrease in vagal tone and an increase in heart rate, reflected by shortened R–R intervals. Conversely, during expiration, intrathoracic pressure increases due to diaphragmatic relaxation and upward movement, which elevates atrial pressure and activates baroreceptors. This activation enhances vagal tone and leads to a decrease in heart rate, manifested as prolonged R–R intervals [38]. In addition, respiration-induced changes in the orientation of the cardiac electrical axis relative to the ECG electrodes modulate the amplitude of the ECG signal, resulting in variations in RwA [39]. Respiratory disorders can be identified by analyzing the variation in RRI in conjunction with RwA changes in ECG morphology [40]. This observation is consistent with previous studies demonstrating that features derived from QRS complex timing and ECG amplitude are effective for sleep apnea detection [41]. Thus, in this study, we used RRI and RwA signals to forecast occurrences of sleep apnea.

Recent surveys show that medical Vision–Language Models (VLMs) have progressed from early-stage “feature fusion” between images and clinical text toward general-purpose large multimodal models. These models support broader tasks such as report generation, visual question answering, retrieval-augmented diagnosis, and cross-modal grounding, shifting the field from task-specific pipelines to more unified, scalable paradigms [42,43,44,45]. In parallel, knowledge distillation (KD) and teacher–student frameworks are increasingly emphasized as practical enablers for clinical translation. They reduce model size and latency and improve deployability on resource-constrained devices (e.g., edge computing systems) while aiming to preserve diagnostic performance and calibration, which is especially relevant for real-world medical workflows [46]. Therefore, further study of these trends may help in developing more up-to-date methods for the assessment of sleep apnea.

5. Conclusions

This work focused on forecasting the occurrence of sleep apnea using RRI and RwA signals obtained at previous time points from a single-lead ECG. We proposed an SHA that achieved an AUC of 0.989 with a previous time point (n) of 20 min and a future time point (h) of 1 min. An ECG patch has been developed for long-term single-lead ECG monitoring [47]; embedding the SHA into such a patch could yield an inexpensive and easy-to-use wearable device for sleep and health management. This study has several limitations. First, the sample size was small, so the proposed SHA should be evaluated on a larger sleep dataset to confirm its performance. Second, we did not reduce the size of the SHA to fit a wearable device, and the performance of a compressed SHA may be lower than that of a PC-based SHA. Third, we only used RRI and RwA characteristics to forecast sleep apnea; exploring other characteristics derived from a single-lead ECG could be included in future work.

Author Contributions

Conceptualization, T.-H.T.; methodology, T.-H.T., G.-H.C. and S.-H.L.; software, T.-H.T. and G.-H.C.; validation, W.C.; investigation, T.-H.T.; data curation, G.-H.C.; writing—original draft preparation, G.-H.C. and S.-H.L.; writing—review and editing, S.-H.L. and W.C.; supervision, T.-H.T.; project administration, T.-H.T.; funding acquisition, T.-H.T. and S.-H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Science and Technology Council, Taiwan, under grant NSTC 114-2221-E-027-040. The APC was funded by NSTC 113-2923-E-324-001-MY3 and NSTC 114-2221-E-027-040.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The Apnea-ECG Database used in this study is publicly available on PhysioNet: https://physionet.org/content/apnea-ecg/1.0.0/ (accessed on 10 January 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

PSG	Polysomnography
ECG	Electrocardiogram
OSA	Obstructive Sleep Apnea
CSA	Central Sleep Apnea
MSA	Mixed Sleep Apnea
RRI	R-R Interval
RwA	R-Wave Amplitude
SHA	Stacking Hybrid Architecture
GRNNA	Gated Recurrent Neural Network Architecture
BiLSTM	Bidirectional Long Short-Term Memory
BiGRU	Bidirectional GRU
GRU	Gated Recurrent Unit
SaO₂	Blood Oxygen Saturation
AHI	Apnea Hypopnea Index
EDR	ECG-Derived Respiration
1D CNN	One-Dimensional Convolutional Neural Network
AUC	Area Under the Curve
ML	Machine Learning
CPAP	Continuous Positive Airway Pressure
LR	Logistic Regression
RAM	Random-Access Memory

References

1. Guilleminault, C.; Tilkian, A.; Dement, W.C. The sleep apnea syndromes. Annu. Rev. Med. 1976, 27, 465–484.
2. Leung, R.S.T.; Douglas, B.T. Sleep apnea and cardiovascular disease. Am. J. Respir. Crit. Care Med. 2001, 164, 2147–2165.
3. Peppard, P.E.; Young, T.; Barnet, J.H.; Palta, M.; Hagen, E.W.; Hla, K.M. Increased prevalence of sleep-disordered breathing in adults. Am. J. Epidemiol. 2013, 177, 1006–1014.
4. Punjabi, N.M. The epidemiology of adult obstructive sleep apnea. Proc. Am. Thorac. Soc. 2008, 5, 136–143.
5. Marin, J.M.; Carrizo, S.J.; Vicente, E.; Agusti, A.G.N. Long-term cardiovascular outcomes in men with obstructive sleep apnea-hypopnea with or without treatment with continuous positive airway pressure: An observational study. Lancet 2005, 365, 1046–1053.
6. Jordan, A.S.; McSharry, D.G.; Malhotra, A. Adult obstructive sleep apnoea. Lancet 2014, 383, 736–747.
7. Benjafield, A.V.; Ayas, N.T.; Eastwood, P.R.; Heinzer, R.; Ip, M.S.; Morrell, M.J.; Malhotra, A. Estimation of the global prevalence and burden of obstructive sleep apnea: A literature-based analysis. Lancet Respir. Med. 2019, 7, 687–698.
8. Young, T.; Palta, M.; Dempsey, J.; Skatrud, J.; Weber, S.; Badr, S. The occurrence of sleep-disordered breathing among middle-aged adults. N. Engl. J. Med. 1993, 328, 1230–1235.
9. Chowdhuri, S.; Patel, P.; Badr, M.S. Apnea in older adults. Sleep Med. Clin. 2018, 13, 21–37.
10. Strollo, P.J.; Rogers, R.M. Obstructive sleep apnea. N. Engl. J. Med. 1996, 334, 99–104.
11. Chen, T.Y.; Kuo, T.B.J.; Chung, C.H.; Tzeng, N.; Lai, H.; Chien, W.; Yang, C.C. Age and sex differences on the association between anxiety disorders and obstructive sleep apnea: A nationwide case-control study in Taiwan. Psychiatry Clin. Neurosci. 2022, 76, 251–259.
12. Rechtschaffen, A.; Kales, A. A Manual of Standardized Terminology, Techniques and Scoring System for Sleep Stages of Human Subjects; U.S. Government Printing Office: Washington, DC, USA, 1968.
13. De Chazal, P.; Heneghan, C.; Sheridan, E.; Reilly, R.; Nolan, P.; O’Malley, M. Automated processing of the single-lead electrocardiogram for the detection of obstructive sleep apnea. IEEE Trans. Biomed. Eng. 2003, 50, 686–696.
14. Khandoker, A.H.; Palaniswami, M.; Karmakar, C.K. Support Vector Machines for Automated Recognition of Obstructive Sleep Apnea Syndrome from ECG Recordings. IEEE Trans. Inf. Technol. Biomed. 2009, 13, 37–48.
15. Mendez, M.O.; Bianchi, A.M.; Matteucci, M.; Cerutti, S.; Penzel, T. Sleep apnea screening by autoregressive models from a single ECG lead. IEEE Trans. Biomed. Eng. 2009, 56, 2838–2850.
16. Bahrami, M.; Forouzanfar, M. Sleep apnea detection from single-lead ECG: A comprehensive analysis of machine learning and deep learning algorithms. IEEE Trans. Instrum. Meas. 2022, 71, 4003011.
17. Chang, H.Y.; Yeh, C.Y.; Lee, C.T.; Lin, C.C. A sleep apnea detection system based on a one-dimensional deep convolution neural network model using single-lead ECG. Sensors 2020, 20, 4157.
18. Sharan, R.V.; Berkovsky, S.; Xiong, H.; Coiera, E. ECG-derived heart rate variability interpolation and one-dimensional convolutional neural networks for detecting sleep apnea. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Montreal, QC, Canada, 20–24 July 2020; pp. 637–640.
19. Tran, N.T.; Tran, H.N.; Mai, A.T. A wearable device for at-home obstructive sleep apnea assessment: State-of-the-art and research challenges. Front. Neurol. 2023, 14, 1123227.
20. Sharma, L.; Deepak, A.; Ranjan, A.; Krishnasamy, G. A novel hybrid CNN and BiGRU-Attention based deep learning model for protein function prediction. Stat. Appl. Genet. Mol. Biol. 2023, 22, 20220057.
21. Bahrami, M.; Forouzanfar, M. Deep learning forecasts the occurrence of sleep apnea from single-lead ECG. Cardiovasc. Eng. Technol. 2022, 13, 809–815.
22. Ruczinski, I.; Kooperberg, C.; LeBlanc, M. Logic regression. J. Comput. Graph. Stat. 2003, 12, 475–511.
23. Penzel, T.; Moody, G.B.; Mark, R.G.; Goldberger, A.L.; Peter, J.H. The Apnea-ECG database. Comput. Cardiol. 2000, 27, 255–258.
24. PIA-Group. biosppy/signals/ecg.py. biosppy v2.2.2; GitHub, 2020. Available online: https://github.com/PIA-Group/BioSPPy/blob/master/biosppy/signals/ecg.py (accessed on 23 January 2026).
25. Runge, C.D.T. Über empirische Funktionen und die Interpolation zwischen äquidistanten Ordinaten. Z. Math. Phys. 1901, 46, 224–243.
26. Allen, J.; Anderson, J.M.; Dempsey, G.J.; Adgey, A.A.J. Efficient baseline wander removal for feature analysis of electrocardiographic body-surface maps. In Proceedings of the International Conference of the IEEE Engineering in Medicine and Biology Society, Baltimore, MD, USA, 3–6 November 1994; pp. 1316–1317.
27. Stojanovski, A.; Zdravevski, E.; Koceski, S.; Trajkovik, V. Real-time sleep apnea detection with one-channel ECG based on edge computing paradigm. ICT Innov. 2018, 2018, 124–134.
28. Torres, J.M.; Oliveira, S.; Sobral, P.; Moreira, R.S.; Soares, C. In-Home Sleep Monitoring using Edge Intelligence. SN Comput. Sci. 2024, 5, 538.
29. Priyadharsini, C.; Phamila, A.V. Deep hybrid architecture with stacked ensemble learning for binary classification of retinal disease. Results Eng. 2024, 24, 103219.
30. Manna, A.; Kundu, R.; Kaplun, D.; Sinitca, A.; Sarkar, R. A fuzzy rank-based ensemble of CNN models for classification of cervical cytology. Sci. Rep. 2021, 11, 14538.
31. Liu, S.H.; Chen, W.; Chang, K.M.; Pan, K.L. Using mechanomyograms for estimating median frequency and root mean square of electromyograms with hybrid deep neural network and CatBoost model. IEEE Access 2024, 12, 190629–190639.
32. Liu, S.H.; Li, R.X.; Wang, J.J.; Chen, W.; Su, C.H. Classification of photoplethysmographic signal quality with deep convolution neural networks for accurate measurement of cardiac stroke volume. Appl. Sci. 2020, 10, 4612.
33. Liu, S.H.; Sun, Y.; Wu, B.Y.; Chen, W.; Zhu, X. Using machine learning models for cuffless blood pressure estimation with ballistocardiogram and impedance plethysmogram. Front. Digit. Health 2025, 7, 1511667.
34. LaValley, M.P. Logistic regression. Circulation 2008, 117, 2395–2399.
35. Subasi, A.; Ercelebi, E. Classification of EEG signals using neural network and logistic regression. Comput. Methods Programs Biomed. 2005, 78, 87–99.
36. Li, L.; Wang, X.; Du, X.; Liu, Y.; Liu, C.; Qin, C.; Li, Y. Classification of heart sound signals with BP neural network and logistic regression. In Proceedings of the IEEE 2017 Chinese Automation Congress, Jinan, China, 20–22 October 2017; pp. 7380–7383.
37. Almohimeed, A.; Saad, R.M.A.; Mostafa, S.; El-Rashidy, N.M.; Farrag, S.; Gaballah, A.; Elaziz, M.A.; El-Sappagh, S.; Saleh, H. Explainable artificial intelligence of multi-level stacking ensemble for detection of Alzheimer’s disease based on particle swarm optimization and the sub-scores of cognitive biomarkers. IEEE Access 2023, 11, 123173–123193.
38. Hirsch, J.A.; Bishop, B. Respiratory sinus arrhythmia in humans: How breathing pattern modulates heart rate. Am. J. Physiol.-Heart Circ. Physiol. 1981, 241, H620–H629.
39. Pallas-Areny, R.; Colominas-Balague, J.; Rosell, F.J. The effect of respiration-induced heart movements on the ECG. IEEE Trans. Biomed. Eng. 1989, 36, 585–590.
40. Penzel, T.; Kantelhardt, J.W.; Bartsch, R.P.; Riedl, M.; Kraemer, J.F.; Wessel, N. Modulations of heart rate, ECG, and cardio-respiratory coupling observed in polysomnography. Front. Physiol. 2016, 7, 460.
41. Ahmad, S.; Batkin, I.; Kelly, O.; Dajani, H.R.; Bolic, M.; Groza, V. Multiparameter physiological analysis in obstructive sleep apnea simulated with Mueller maneuver. IEEE Trans. Instrum. Meas. 2013, 62, 2751–2762.
42. Li, X.; Li, L.; Jiang, Y.; Wang, H.; Qiao, X.; Feng, T.; Luo, H.; Zhao, Y. Vision-Language Models in medical image analysis: From simple fusion to general large models. Inf. Fusion 2025, 118, 102995.
43. Chen, Q.; Zhao, R.; Wang, S.; Phan, V.M.H.; van den Hengel, A.; Verjans, J.; Liao, Z.; To, M.-S.; Xia, Y.; Chen, J.; et al. A survey of medical vision-and-language applications and their techniques. arXiv 2024, arXiv:2411.12195.
44. Wu, J.; Wang, Y.; Zhong, Z.; Liao, W.; Trayanova, N.; Jiao, Z.; Bai, H.X. Vision-language foundation model for 3D medical imaging. npj Artif. Intell. 2025, 1, 17.
45. Urooj, B.; Fayaz, M.; Ali, S.; Dang, L.M.; Kim, K.W. Large language models in medical image analysis: A systematic survey and future directions. Bioengineering 2025, 12, 818.
46. Li, X.; Li, L.; Li, M.; Yan, P.; Feng, T.; Luo, H.; Zhao, Y.; Yin, S. Knowledge distillation and teacher-student learning in medical imaging: Comprehensive overview, pivotal role, and future directions. Med. Image Anal. 2025, 103819.
47. Zulqarnain, M.; Stanzione, S.; Rathinavel, G.; Smout, S.; Willegems, M.; Myny, K.; Cantatore, E. A flexible ECG patch compatible with NFC RF communication. npj Flex. Electron. 2020, 4, 13.

Figure 1. Flowchart of the proposed method, which has two phases. The first phase is “data preprocessing and feature engineering”. The single-lead ECG dataset is from the Apnea-ECG Database on PhysioNet [23]. During preprocessing, the single-lead ECG signal was divided into one-minute segments. RRI and RwA were extracted from each segment and resampled at 3 Hz via cubic spline interpolation. The input vector for each minute has 360 points: the resampled RRI signal padded with the resampled RwA signal (180 points each). The second phase forecasts occurrences of sleep apnea with the proposed SHA, which includes BiLSTMA, GRUA, and BiGRUA. A logistic regression is then used to fuse the apnea probabilities of the three GRNNAs.
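
The construction of the 360-point input vector can be sketched as below. Two things are assumptions rather than details from the study: linear interpolation is used here as a simple stand-in for cubic spline interpolation, and the RRI samples are placed before the RwA samples. The function names are hypothetical.

```python
from bisect import bisect_right

FS_HZ = 3                      # resampling rate given in the caption
SEGMENT_S = 60                 # one-minute segment
N_POINTS = FS_HZ * SEGMENT_S   # 180 samples per signal


def resample_linear(times_s, values, n_points=N_POINTS, fs_hz=FS_HZ):
    """Resample an unevenly sampled beat series onto a uniform 3 Hz grid.

    Linear interpolation stands in for the cubic spline named in the
    caption. Grid points before the first beat or after the last beat
    take the nearest observed value. times_s must be strictly increasing.
    """
    out = []
    for i in range(n_points):
        t = i / fs_hz
        if t <= times_s[0]:
            out.append(values[0])
        elif t >= times_s[-1]:
            out.append(values[-1])
        else:
            j = bisect_right(times_s, t)  # times_s[j-1] <= t < times_s[j]
            t0, t1 = times_s[j - 1], times_s[j]
            w = (t - t0) / (t1 - t0)
            out.append(values[j - 1] + w * (values[j] - values[j - 1]))
    return out


def build_input_vector(beat_times_s, rri_ms, rwa):
    """Return 180 RRI samples followed by 180 RwA samples (assumed order)."""
    return (resample_linear(beat_times_s, rri_ms)
            + resample_linear(beat_times_s, rwa))
```

Resampling is needed because beats occur at irregular times, whereas the recurrent layers expect a fixed-length vector for every minute.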

Figure 2. Two examples illustrate the relations between k, n, and h within 10 segments from S₀ to S₉: (a) k, n, and h are 0, 5, and 3, respectively. The starting segment is the first segment (S₀). The number of segments is five (S₀ to S₄). The seventh segment (S₇) is forecasted. (b) k, n, and h are 1, 5, and 3, respectively. The starting segment is the second segment (S₁). The number of segments is five (S₁ to S₅). The eighth segment (S₈) is forecasted.
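
The indexing in Figure 2 can be written out as a short sketch: for a starting index k, the input is S_k to S_{k+n-1} and the target is S_{k+n-1+h}. The function name is hypothetical, and the sketch assumes the segments form one gap-free sequence; how the study handled recording boundaries or deleted segments is not covered here.

```python
def make_samples(segments, labels, n, h):
    """Pair n consecutive segments with the label of the segment h steps ahead.

    For start index k, the input window is segments[k : k + n] and the
    target is labels[k + n - 1 + h], matching both examples in Figure 2.
    """
    samples = []
    last_start = len(segments) - (n + h)  # largest k with a valid target
    for k in range(last_start + 1):
        window = segments[k:k + n]
        target = labels[k + n - 1 + h]
        samples.append((window, target))
    return samples


# Reproduce Figure 2: ten segments S0..S9 with n = 5 and h = 3.
segments = [f"S{i}" for i in range(10)]
labels = list(range(10))  # the label index doubles as the segment index here
pairs = make_samples(segments, labels, n=5, h=3)
```

With ten segments, `pairs[0]` covers S0 to S4 with target index 7 (panel a), and `pairs[1]` covers S1 to S5 with target index 8 (panel b).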

Figure 3. Flowcharts of the three GRNNAs. The first and third rows are BiLSTMA and BiGRUA, whose first layers are BiLSTM and BiGRU, respectively. Their hidden states have 64 units in each of the forward and backward directions. The second layer has 32 nodes with the ReLU activation function. The output layer has 2 nodes with the Softmax activation function. A dropout rate of 30% is applied to the connections between the second and output layers. The two output nodes represent the future probability of sleep apnea (p_Apnea) and of normal respiration (p_Normal). The second row is GRUA, whose first layer is a GRU with a 64-unit hidden state. Its other layers are the same as those of BiGRUA.

Figure 4. The architecture of LR. The input vector is $P = [p_{\mathrm{BiLSTM}}, p_{\mathrm{GRU}}, p_{\mathrm{BiGRU}}]$ and the predicted output is $\hat{p}$. The activation function of the output node is the sigmoid function.

Figure 5. The ROC curves of SHA (red line) and three GRNNAs—GRU (blue line), BiLSTM (orange line), and BiGRU (green line)—under n of 5 min and h of (a) the first and (b) tenth minute.

Figure 6. The ROC curves of SHA (red line) and three GRNNAs—GRU (blue line), BiLSTM (orange line), and BiGRU (green line)—under n of 10 min and h of (a) the first and (b) tenth minute.

Figure 7. The ROC curves of SHA (red line) and three GRNNAs—GRU (blue line), BiLSTM (orange line), and BiGRU (green line)—under n of 20 min and h of (a) the first and (b) tenth minute.

Table 1. Number of samples in the training and testing groups, and total samples under n of 5, 10, and 20 min and h of the first, third, fifth, eighth, and tenth minute.

n (min)	h (min)	Number of Samples in Training Group	Number of Samples in Testing Group	Number of Total Samples
5	First	16,704	16,940	33,644
	Third	16,702	16,938	33,640
	Fifth	16,700	16,936	33,636
	Eighth	16,697	16,933	33,630
	Tenth	16,695	16,931	33,626
10	First	16,699	16,935	33,634
	Third	16,697	16,933	33,630
	Fifth	16,695	16,931	33,626
	Eighth	16,692	16,928	33,620
	Tenth	16,690	16,926	33,616
20	First	16,689	16,925	33,614
	Third	16,687	16,923	33,610
	Fifth	16,685	16,921	33,606
	Eighth	16,682	16,918	33,600
	Tenth	16,680	16,916	33,596

Table 2. Metrics of GRNNA using BiLSTM under different n and h variables.

n (min)	h (min)	Precision (%)	Sensitivity (%)	Specificity (%)	F₁-Score (%)	Accuracy (%)
5	First	93.71	91.07	96.3	92.37	94.33
	Third	92.3	90.97	95.41	91.63	93.73
	Fifth	92.12	90.69	95.3	91.4	93.56
	Eighth	91.47	91.23	94.85	91.35	93.48
	Tenth	89.9	90.57	93.84	90.23	92.61
10	First	94.41	94.36	96.61	94.38	95.76
	Third	93.37	94.18	95.95	93.77	95.28
	Fifth	93.99	93.99	96.36	93.99	95.47
	Eighth	93.45	94.24	96	93.84	95.33
	Tenth	93.92	93.86	96.32	93.89	95.4
20	First	94.8	92.8	96.92	93.79	95.37
	Third	94.46	92.63	96.72	93.54	95.18
	Fifth	93.32	94.23	95.92	93.77	95.29
	Eighth	93.7	93.81	96.19	93.75	95.29
	Tenth	93.16	94.59	95.8	93.87	95.35

Table 3. Metrics of GRNNA using GRU under different n and h variables.

n (min)	h (min)	Precision (%)	Sensitivity (%)	Specificity (%)	F₁-Score (%)	Accuracy (%)
5	First	93.05	92.12	95.83	92.58	94.43
	Third	92.28	89.6	95.46	90.92	93.25
	Fifth	90.4	91.7	94.1	91.05	93.2
	Eighth	91.65	89.18	95.08	90.4	92.86
	Tenth	89.24	89.84	93.44	89.54	92.08
10	First	94.77	94.47	96.85	94.62	95.95
	Third	94.21	93.16	96.53	93.68	95.26
	Fifth	94.02	92.99	96.42	93.5	95.13
	Eighth	93.45	94.1	96.01	93.77	95.29
	Tenth	93.04	94.37	95.73	93.7	95.22
20	First	95.11	93.86	97.08	94.48	95.87
	Third	93.86	94.09	96.28	93.98	95.45
	Fifth	94.19	94.12	96.49	94.16	95.6
	Eighth	93.79	94.66	96.21	94.22	95.63
	Tenth	94.36	93.79	96.61	94.08	95.55

Table 4. Metrics of GRNNA using BiGRU under different n and h variables.

n (min)	h (min)	Precision (%)	Sensitivity (%)	Specificity (%)	F₁-Score (%)	Accuracy (%)
5	First	93.33	93.09	95.97	93.21	94.89
	Third	92.48	92.73	95.43	92.61	94.42
	Fifth	92.28	91.87	95.35	92.08	94.04
	Eighth	91.29	92.55	94.65	91.91	93.86
	Tenth	92.14	92.29	95.24	92.22	94.12
10	First	95.08	93.84	97.06	94.46	95.85
	Third	94.56	94.21	96.72	94.38	95.77
	Fifth	93.92	94.67	96.29	94.29	95.68
	Eighth	93.62	95.59	96.06	94.59	95.88
	Tenth	94.71	94.77	96.8	94.74	96.04
20	First	94.43	94.88	96.61	94.65	95.96
	Third	94.76	93.43	96.87	94.09	95.58
	Fifth	93.94	94.64	96.31	94.29	95.68
	Eighth	94.89	94.35	96.93	94.62	95.96
	Tenth	93.57	94.73	96.06	94.15	95.56

Table 5. Performance of the three GRNNAs with an n of 10 min and h of the first, fifth, and tenth minute.

GRNNA	h (min)	Precision (%)	Sensitivity (%)	Specificity (%)	F₁-Score (%)	Accuracy (%)
BiLSTM	First	94.41	94.36	96.61	94.38	95.76
GRU	First	94.77	94.47	96.85	94.62	95.95
BiGRU	First	95.08	93.84	97.06	94.46	95.85
BiLSTM	Fifth	93.99	93.99	96.36	93.99	95.47
GRU	Fifth	94.02	92.99	96.42	93.50	95.13
BiGRU	Fifth	93.92	94.67	96.29	94.29	95.68
BiLSTM	Tenth	93.92	93.86	96.32	93.89	95.40
GRU	Tenth	93.04	94.37	95.73	93.70	95.22
BiGRU	Tenth	94.71	94.77	96.80	94.74	96.04

Table 6. Metrics of LR, the meta-learner that produces the final SHA output, under n of 5, 10, and 20 min and h of the first, third, fifth, eighth, and tenth minute.

h (min)	n (min)	Precision (%)	Sensitivity (%)	Specificity (%)	F₁-Score (%)	Accuracy (%)
First	5	94.7	93.36	96.84	94.03	95.53
	10	95.5	94.78	97.29	95.13	96.34
	20	95.79	94.74	97.48	95.26	96.45
Third	5	93.91	92.79	96.35	93.35	95.01
	10	95.12	94.41	97.07	94.76	96.07
	20	95	94.22	97	94.61	95.95
Fifth	5	93.69	92.59	96.22	93.14	94.86
	10	95.03	94.55	97	94.79	96.08
	20	94.92	94.66	96.94	94.79	96.07
Eighth	5	93.92	92.72	96.36	93.31	94.99
	10	94.9	95.2	96.9	95.05	96.26
	20	95.2	94.64	97.12	94.92	96.18
Tenth	5	93.52	92.32	96.13	92.92	94.69
	10	95.1	94.65	97.05	94.87	96.15
	20	94.76	94.66	96.84	94.71	96.02
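
The five metrics reported in Tables 2 to 6 can be computed from a confusion matrix as sketched below. The standard definitions are assumed, with sleep apnea as the positive class; the function names and the example counts are hypothetical.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute the five reported metrics from confusion-matrix counts.

    tp/fn count apnea segments predicted correctly/incorrectly;
    tn/fp count normal segments predicted correctly/incorrectly.
    """
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1_score(precision, sensitivity),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def f1_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity."""
    return 2 * precision * sensitivity / (precision + sensitivity)


# Consistency check on the Table 6 row for h = first, n = 20:
# precision 95.79% and sensitivity 94.74% should give the tabulated F1.
f1_check = round(f1_score(95.79, 94.74), 2)

# Hypothetical counts, used only to show the call.
example = classification_metrics(tp=90, fp=10, tn=180, fn=20)
```

The check reproduces the tabulated F₁-score of 95.26% from the precision and sensitivity in the same row.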

Table 7. AUC values of the three GRNNAs and SHA under different previous (n of 5, 10, and 20 min) and future time points (h of the first, fifth, and tenth minute).

h (min)	n (min)	Model	AUC
First	5	GRNNA with BiLSTM	0.976
		GRNNA with GRU	0.970
		GRNNA with BiGRU	0.976
		SHA	0.980
	10	GRNNA with BiLSTM	0.984
		GRNNA with GRU	0.982
		GRNNA with BiGRU	0.985
		SHA	0.988
	20	GRNNA with BiLSTM	0.984
		GRNNA with GRU	0.985
		GRNNA with BiGRU	0.987
		SHA	0.989
Fifth	5	GRNNA with BiLSTM	0.962
		GRNNA with GRU	0.959
		GRNNA with BiGRU	0.965
		SHA	0.971
	10	GRNNA with BiLSTM	0.979
		GRNNA with GRU	0.980
		GRNNA with BiGRU	0.980
		SHA	0.985
	20	GRNNA with BiLSTM	0.977
		GRNNA with GRU	0.978
		GRNNA with BiGRU	0.983
		SHA	0.985
Tenth	5	GRNNA with BiLSTM	0.959
		GRNNA with GRU	0.944
		GRNNA with BiGRU	0.960
		SHA	0.967
	10	GRNNA with BiLSTM	0.979
		GRNNA with GRU	0.977
		GRNNA with BiGRU	0.981
		SHA	0.985
	20	GRNNA with BiLSTM	0.980
		GRNNA with GRU	0.983
		GRNNA with BiGRU	0.983
		SHA	0.986
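
For reference, an AUC value such as those in Table 7 can be computed directly from predicted probabilities and true labels. The sketch below uses the pairwise (Mann-Whitney) formulation, which equals the area under the ROC curve; it is an illustration with hypothetical data, not the evaluation code of the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (apnea, normal) pairs in which the apnea
    segment receives the higher score; ties count as one half.
    labels use 1 for apnea and 0 for normal respiration.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores: three of the four (apnea, normal) pairs are ordered
# correctly, so the AUC is 0.75.
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

The double loop is quadratic in the number of segments, which is fine for illustration; a rank-based implementation would be preferable for the full test set.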

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Tan, T.-H.; Chen, G.-H.; Liu, S.-H.; Chen, W. Prediction of Sleep Apnea Occurrence from a Single-Lead Electrocardiogram Using Stacking Hybrid Architecture with Gated Recurrent Neural Network Architectures and Logistic Regression. Technologies 2026, 14, 92. https://doi.org/10.3390/technologies14020092
