Evaluation of Data Augmentation Under Label Scarcity for ECG-Based Detection of Sleep Apnea

Ryu, Semin; Koh, Jeonghwan; Jeong, In cheol

doi:10.3390/app152413231

This is an early access version; the complete PDF, HTML, and XML versions will be available soon.

Open Access Article


by Semin Ryu ¹,²,*, Jeonghwan Koh ¹,² and In cheol Jeong ¹,²,³,*

¹ Department of Artificial Intelligence Convergence, Hallym University, Chuncheon 24254, Republic of Korea

² Cerebrovascular Disease Research Center, Hallym University, Chuncheon 24254, Republic of Korea

³ Department of Population Health Science and Policy, Icahn School of Medicine, Mount Sinai, New York, NY 10029, USA

* Authors to whom correspondence should be addressed.

Appl. Sci. 2025, 15(24), 13231; https://doi.org/10.3390/app152413231

Submission received: 19 November 2025 / Revised: 15 December 2025 / Accepted: 16 December 2025 / Published: 17 December 2025

(This article belongs to the Section Biomedical Engineering)


Abstract

Supervised ECG-based sleep apnea detection typically depends on large and fully annotated datasets, yet the rarity and cost of labeling apneic events often lead to substantial annotation scarcity in practice. This study provides a controlled evaluation of how such scarcity degrades classification performance and, as a key contribution, investigates whether a constrained, morphology-preserving ECG augmentation framework can compensate for reduced apnea-label availability. Using the PhysioNet Apnea–ECG dataset, we simulated seven levels of label retention (r = 5–100%) and trained a lightweight CNN–BiLSTM model under both subject-dependent (SD) and subject-independent (SI) five-fold protocols. Offline augmentation was applied only to apnea segments and consisted of simple, physiologically motivated time-domain perturbations designed to retain realistic cardiac and respiratory dynamics. Across both evaluation settings, augmentation substantially mitigated performance loss in the low- and mid-scarcity regimes. Under SI evaluation, the mean F1-score improved from 0.57 to 0.72 at r = 5% and from 0.63 to 0.76 at r = 10%, with scores at r = 10–40% (0.75–0.77) approaching the full-label baseline of 0.79. Temporal and spectral analyses confirmed preservation of P–QRS–T morphology and respiratory modulation without distortion. These results demonstrate that simple and interpretable ECG augmentations provide an effective and reproducible baseline for data-efficient apnea screening and offer a practical path toward scalable annotation and robust single-lead deployment under label scarcity.
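The label-retention simulation and apnea-only augmentation described in the abstract could be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation. It assumes that retention r means keeping a random fraction r of the apnea-labeled training segments while keeping all normal segments. It also assumes three generic time-domain perturbations (mild amplitude scaling, a small circular time shift, and low-level additive Gaussian noise), because the abstract does not list the specific perturbations used. All function names and parameter values (`retain_apnea_labels`, `augment_segment`, `augment_apnea_only`, `scale_range`, `max_shift`, `noise_std`) are hypothetical.

```python
import numpy as np


def retain_apnea_labels(labels, r, rng):
    """Return sorted indices of segments kept at apnea-label retention r (0 < r <= 1).

    Assumption: every normal (0) segment is kept and a random fraction r
    of the apnea (1) segments is kept.
    """
    labels = np.asarray(labels)
    apnea_idx = np.flatnonzero(labels == 1)
    normal_idx = np.flatnonzero(labels == 0)
    # Keep at least one apnea segment, but never more than exist.
    n_keep = min(apnea_idx.size, max(1, int(round(r * apnea_idx.size))))
    kept_apnea = rng.choice(apnea_idx, size=n_keep, replace=False)
    return np.sort(np.concatenate([normal_idx, kept_apnea]))


def augment_segment(x, rng, scale_range=(0.9, 1.1), max_shift=10, noise_std=0.01):
    """Apply mild perturbations to one 1-D ECG segment (all values are assumed).

    scale_range: multiplicative amplitude scaling bounds
    max_shift:   maximum circular shift in samples (either direction)
    noise_std:   Gaussian noise level relative to the segment's std
    """
    scale = rng.uniform(*scale_range)
    shift = int(rng.integers(-max_shift, max_shift + 1))
    noise = rng.normal(0.0, noise_std * np.std(x), size=x.shape)
    return scale * np.roll(x, shift) + noise


def augment_apnea_only(X, y, n_copies, rng):
    """Append n_copies augmented versions of each apnea segment; normals are untouched."""
    apnea = X[y == 1]
    if len(apnea) == 0 or n_copies == 0:
        return X, y
    new = np.stack([augment_segment(s, rng) for s in apnea for _ in range(n_copies)])
    X_aug = np.concatenate([X, new])
    y_aug = np.concatenate([y, np.ones(len(new), dtype=y.dtype)])
    return X_aug, y_aug


# Synthetic stand-in data: 100 segments of 600 samples, 20 labeled apnea.
rng = np.random.default_rng(0)
y_train = np.array([0] * 80 + [1] * 20)
X_train = rng.normal(size=(100, 600))

kept = retain_apnea_labels(y_train, r=0.10, rng=rng)
X_aug, y_aug = augment_apnea_only(X_train[kept], y_train[kept], n_copies=3, rng=rng)
print(X_aug.shape, int(y_aug.sum()))
```

Because augmentation is offline and restricted to the training split, a subject-independent protocol would apply these functions inside each fold after the subjects have been partitioned, so that no augmented copy of a test subject's segment reaches training.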

Keywords: sleep apnea; ECG; data augmentation; label scarcity; spectral analysis; subject-independent evaluation

