Protocol

Integration of EHR and ECG Data for Predicting Paroxysmal Atrial Fibrillation in Stroke Patients

Alireza Vafaei Sadr, Manvita Mareboina, Diana Orabueze, Nandini Sarkar, Seyyed Sina Hejazian, Ajith Vemuri, Ravi Shah, Ankit Maheshwari, Ramin Zand and Vida Abedi
1 Department of Public Health Sciences, College of Medicine, Pennsylvania State University, Hershey, PA 17033, USA
2 Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
3 Penn State Hershey Medical Center, Penn State College of Medicine, Hershey, PA 17033, USA
4 Department of Neurology, College of Medicine, The Pennsylvania State University, Hershey, PA 17033, USA
5 Division of Cardiology, Heart and Vascular Institute, Penn State Hershey Medical Center, Hershey, PA 17033, USA
* Authors to whom correspondence should be addressed.
Bioengineering 2025, 12(9), 961; https://doi.org/10.3390/bioengineering12090961
Submission received: 17 July 2025 / Revised: 26 August 2025 / Accepted: 3 September 2025 / Published: 7 September 2025
(This article belongs to the Special Issue Machine Learning Technology in Predictive Healthcare)

Abstract

Predicting paroxysmal atrial fibrillation (PAF) is challenging due to its transient nature. Existing methods often rely solely on electrocardiogram (ECG) waveforms or electronic health record (EHR)-based clinical risk factors. We hypothesized that explicitly balancing the contributions of these heterogeneous data sources could improve prediction accuracy. We developed a Transformer-based deep learning model that integrates 12-lead ECG signals and 47 structured EHR variables from 189 patients with cryptogenic stroke, including 49 with PAF. By systematically varying the relative contributions of ECG and EHR data, we identified an optimal ratio for prediction. The best performance (accuracy: 0.70, sensitivity: 0.72, specificity: 0.87, area under the receiver operating characteristic curve (AUROC): 0.65, area under the precision-recall curve (AUPRC): 0.43) was achieved under 5-fold cross-validation when EHR data contributed one-third and ECG data two-thirds of the model's input. This multimodal approach outperformed unimodal models, improving accuracy by 35% over ECG-only and 5% over EHR-only methods. Our results support the value of combining ECG and structured EHR information to improve accuracy and sensitivity in this pilot cohort, motivating validation in larger studies.

1. Introduction

Paroxysmal atrial fibrillation (PAF), a major stroke risk factor characterized by transient arrhythmic episodes, poses significant diagnostic challenges because of its sporadic nature, with up to 40% of cases remaining asymptomatic [1,2]. Traditional detection methods, such as Holter monitoring, often fail to capture these fleeting events, leading to delayed diagnosis and an increased risk of recurrent stroke and systemic embolism [3,4]. Recent advances in machine learning have enabled novel approaches to PAF prediction, with convolutional neural networks (CNNs) achieving higher AUROCs than conventional machine learning methods using electrograms or 12-lead electrocardiograms (ECGs) alone [5,6,7,8,9]. However, traditional (unimodal) models face limitations: ECG-based approaches primarily detect electrophysiological anomalies, while electronic health record (EHR)-driven models rely on systemic risk factors such as hypertension and diabetes, which lack key temporal resolution [7,10].
Integrating ECG and EHR data could synergistically enhance predictive accuracy by contextualizing transient ECG abnormalities within longitudinal clinical profiles, a hypothesis supported by studies showing improved diagnosis prediction when these data sources are combined [11]. Multimodal deep learning frameworks, particularly those employing attention mechanisms, have demonstrated promise in cardiac applications; Transformer architectures excel at modeling long-range dependencies in sequential data, while CNNs capture localized ECG features such as P-wave morphology and QRS complex variations [12,13]. For instance, Tzou et al. [13] achieved high sensitivity for PAF prediction by analyzing P-wave dynamics and skin sympathetic nerve activity using a wavelet–CNN hybrid, while Tang et al. [7] reported strong performance for predicting PAF recurrence by fusing intracardiac electrograms with clinical variables. However, no study has systematically determined the optimal integration of raw ECG waveforms and structured EHR data for predicting PAF, a significant gap given the episodic and elusive nature of the condition [2,10].
To address this gap, our pilot study deliberately focuses on dissecting the relative contributions of ECG and EHR data. The primary objective was to develop a multimodal deep learning model for predicting PAF. We hypothesized that a multimodal deep learning model would outperform unimodal models by optimally balancing ECG and EHR contributions. We introduce a Transformer-based architecture that integrates denoised 12-lead ECG signals with 47 EHR parameters, including cardiac monitoring duration and hemoglobin levels, and that quantifies the optimal ratio between ECG and EHR inputs. Class imbalance is addressed through stochastic data augmentation techniques, including time warping, amplitude scaling, and Gaussian noise injection [14]. We further discuss implications for early intervention strategies, address limitations in sensitivity, and propose future directions, including latent EHR feature exploration and prospective and external validation. This work advances personalized PAF management by bridging electrophysiological and systemic risk assessment through explainable multimodal learning [12,15]. It is intended to support outpatient rhythm monitoring triage, not to replace clinician judgment.

2. Materials and Methods

2.1. Study Design and Data Collection

This study utilized a dataset of 189 cryptogenic stroke patients, collected by medical students at Penn State College of Medicine, an academic medical center, from January 2017 to May 2023 under an exempt Institutional Review Board protocol. The dataset included both ECG waveforms and EHR data. All data were validated by a cardiologist and a stroke neurologist to ensure diagnostic accuracy and clinical relevance. Reporting followed the TRIPOD-AI guideline; a completed checklist is provided in Appendix B (Table A1).

2.2. Inclusion/Exclusion

Inclusion criteria were adults aged ≥18 years with cryptogenic ischemic stroke. Patients were excluded if they had persistent or permanent AF before the index ECG, lacked a 12-lead ECG, or were missing core EHR variables.

2.3. Data Preprocessing

ECG waveforms were extracted from XML files. Each 12-lead ECG signal was normalized to a range of 0–1 and preprocessed using wavelet denoising and bandpass filtering (0.5–40 Hz). Multiple representations of ECG data were explored, including raw signals, denoised signals, and denoised–filtered signals. Data augmentation techniques were applied to the ECG signals to address class imbalance, including time warping, amplitude scaling, baseline wander addition, Gaussian noise injection, random permutation, and random shifting. The augmentation probability (Paug) and the number of augmented samples were optimized as hyperparameters.
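The preprocessing code is not included in the paper, so the following is a minimal sketch of the described steps (wavelet denoising, 0.5–40 Hz bandpass filtering, 0–1 normalization, and two of the six augmentations), assuming NumPy, SciPy, and PyWavelets. The sampling rate, wavelet family, decomposition level, and threshold rule are illustrative assumptions, not the authors' exact settings.

```python
# Illustrative ECG preprocessing sketch (not the authors' exact implementation).
# Assumptions: 500 Hz sampling rate, Daubechies-4 wavelet, soft universal threshold.
import numpy as np
import pywt
from scipy.signal import butter, filtfilt

FS = 500  # assumed sampling frequency (Hz)

def wavelet_denoise(sig, wavelet="db4", level=4):
    """Soft-threshold the detail coefficients using a universal threshold estimate."""
    coeffs = pywt.wavedec(sig, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745          # noise level estimate
    thr = sigma * np.sqrt(2 * np.log(len(sig)))
    coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(sig)]

def bandpass(sig, low=0.5, high=40.0, order=4):
    """Zero-phase 0.5-40 Hz band-pass filter, as described in the text."""
    b, a = butter(order, [low, high], btype="band", fs=FS)
    return filtfilt(b, a, sig)

def min_max(sig):
    """Normalize one lead to the 0-1 range."""
    rng = sig.max() - sig.min()
    return (sig - sig.min()) / rng if rng > 0 else np.zeros_like(sig)

def augment(sig, rng, p_aug=0.1):
    """Stochastic augmentation: Gaussian noise and amplitude scaling (two of the
    six listed techniques); each is applied with probability p_aug."""
    out = sig.copy()
    if rng.random() < p_aug:
        out = out + rng.normal(0.0, 0.01, size=out.shape)    # Gaussian noise injection
    if rng.random() < p_aug:
        out = out * rng.uniform(0.8, 1.2)                     # amplitude scaling
    return out

def preprocess_12lead(ecg):
    """ecg: array of shape (n_samples, 12); returns denoised, filtered, scaled leads."""
    return np.stack(
        [min_max(bandpass(wavelet_denoise(ecg[:, i]))) for i in range(12)], axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_ecg = rng.normal(size=(6000, 12))    # 12 s of synthetic 12-lead data
    clean = preprocess_12lead(fake_ecg)
    augmented = augment(clean, rng)
    print(clean.shape, augmented.shape)
```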
We selected 47 clinically relevant EHR parameters based on known PAF risk factors and predictors. We used information available at the index stroke encounter. Predictors included demographics (age, sex, race, ethnicity); BMI (body mass index); comorbidities/new diagnoses at index (hypertension, type 2 diabetes, hyperlipidemia, hypercoagulability, chronic kidney disease, liver disease, active cancer, prior ischemic/hemorrhagic stroke, myocardial infarction, coronary artery disease, peripheral arterial disease, systemic embolism, dementia, systolic/diastolic heart failure); social and family history (tobacco, alcohol, family history of stroke or AF); stroke severity/imaging (NIHSS on admission, ipsilateral ICA stenosis > 50%, intracranial arterial disease, lacunar pattern, stroke laterality); echocardiography (ejection fraction %, left atrial size, left atrial enlargement, cardiac shunt/PFO); risk score (CHA2DS2-VASc); and renal function and labs (eGFR, hemoglobin, platelet count, ALT, AST, HDL, LDL, HbA1c). Following manual chart review, no missing values were present for demographics and laboratory measures. For binary comorbidity indicators, records with unpopulated fields were adjudicated from notes/problem lists; when documentation did not support the condition, the indicator was coded absent. Non-numeric features were excluded, and numeric features were normalized to a range of 0–1. From an initial 223 cases reviewed based on their EHR, we identified 197 with ECG listed for extraction. Ultimately, 189 of these ECGs were successfully extracted and matched to the reviewed EHR record, forming our final study cohort. The sample size was fixed by cohort availability, with 49 PAF events.
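For illustration, the sketch below shows the described EHR handling (coding adjudicated binary comorbidities as absent when unsupported, dropping non-numeric fields, and 0–1 min-max scaling), assuming pandas; the column names are hypothetical and do not reflect the authors' exact variable names.

```python
# Illustrative EHR feature preparation sketch; column names are hypothetical.
import pandas as pd

def prepare_ehr(df: pd.DataFrame, binary_cols, numeric_cols) -> pd.DataFrame:
    out = df.copy()
    # Binary comorbidity indicators: unpopulated fields that chart review did not
    # support are coded as absent (0), as described in the text.
    out[binary_cols] = out[binary_cols].fillna(0).astype(int)
    # Keep only numeric predictors and scale each to the 0-1 range.
    out = out[list(binary_cols) + list(numeric_cols)]
    for col in numeric_cols:
        lo, hi = out[col].min(), out[col].max()
        out[col] = (out[col] - lo) / (hi - lo) if hi > lo else 0.0
    return out

# Example with hypothetical columns:
ehr = pd.DataFrame({"age": [71, 64], "hemoglobin": [13.2, 11.8], "hypertension": [1, None]})
features = prepare_ehr(ehr, binary_cols=["hypertension"], numeric_cols=["age", "hemoglobin"])
print(features)
```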

2.4. Deep Learning Model Architecture

The proposed model consisted of a hybrid architecture designed to integrate ECG and EHR data for binary classification of PAF risk. For ECG feature extraction, a series of CNN layers processed the ECG signals to extract temporal features, and multi-head attention layers captured long-range dependencies across leads. The extracted ECG features were then compressed to a fixed dimensionality (nECG) using dense layers. For EHR feature processing, EHR features were passed through dense layers to reduce their dimensionality (nEHR) while preserving critical information. To define the optimal balance between modalities, we systematically adjusted the compression dimensions of the ECG and EHR inputs to identify their relative contributions to predictive performance.
Compressed ECG and EHR features were concatenated and passed through additional dense layers for final classification. The model architecture was optimized through hyperparameter tuning, including the number of attention heads (4–8), compression dimensions (nECG and nEHR), learning rate (1 × 10⁻⁴ to 1 × 10⁻³), and augmentation strategies. The tuning process specifically aimed to identify the balance between ECG and EHR inputs that yielded the best performance. Models were implemented in TensorFlow [16].
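For concreteness, the sketch below shows this kind of hybrid architecture in TensorFlow/Keras, assuming a 600 × 12 ECG input (as in Table A3) and 47 EHR features; the specific layer widths and kernel sizes are illustrative choices, not the published configuration. The EHR share of the fused representation is nEHR/(nECG + nEHR), so nECG = 32 and nEHR = 16 corresponds to the ~1/3 EHR contribution examined in the Results.

```python
# Illustrative multimodal architecture sketch (assumed layer sizes; not the exact model).
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_model(n_ecg=32, n_ehr=16, n_heads=8, ecg_len=600, n_leads=12, n_ehr_feats=47):
    # --- ECG branch: CNN feature extraction followed by multi-head attention ---
    ecg_in = layers.Input(shape=(ecg_len, n_leads), name="ecg")
    x = layers.Conv1D(32, kernel_size=7, padding="same", activation="relu")(ecg_in)
    x = layers.MaxPooling1D(2)(x)
    x = layers.Conv1D(64, kernel_size=5, padding="same", activation="relu")(x)
    x = layers.MaxPooling1D(2)(x)
    att = layers.MultiHeadAttention(num_heads=n_heads, key_dim=16)(x, x)
    x = layers.LayerNormalization()(x + att)          # residual + norm, Transformer-style
    x = layers.GlobalAveragePooling1D()(x)
    ecg_latent = layers.Dense(n_ecg, activation="relu", name="ecg_latent")(x)

    # --- EHR branch: dense compression of the 47 structured predictors ---
    ehr_in = layers.Input(shape=(n_ehr_feats,), name="ehr")
    y = layers.Dense(32, activation="relu")(ehr_in)
    ehr_latent = layers.Dense(n_ehr, activation="relu", name="ehr_latent")(y)

    # --- Fusion and classification head ---
    fused = layers.Concatenate()([ecg_latent, ehr_latent])
    z = layers.Dense(32, activation="relu")(fused)
    out = layers.Dense(1, activation="sigmoid", name="paf_risk")(z)
    return Model(inputs=[ecg_in, ehr_in], outputs=out)

# n_ecg=32 and n_ehr=16 give an EHR share of 16 / (32 + 16) = 1/3 at the fusion layer.
model = build_model(n_ecg=32, n_ehr=16)
model.summary()
```

Varying (nECG, nEHR) while holding the rest of the architecture fixed is what the contribution analysis in the Results sweeps over.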

2.5. Training and Validation

The dataset was split into training and testing sets using 5-fold cross-validation to ensure robustness. Each experiment was repeated 10 times with different random seeds to assess variability in performance metrics. Data augmentation was performed during training with an augmentation probability (Paug = 0.1) chosen to improve generalization without overfitting. The model was trained on an NVIDIA RTX 6000 Ada GPU (NVIDIA Corporation, Santa Clara, CA, USA) using the Adam optimizer with an exponential learning rate decay schedule. Binary cross-entropy loss was used as the objective function. Test-time augmentation further improved prediction robustness by averaging predictions across augmented test samples. Total training time was 50 GPU-hours. Decision curve analysis was not performed in this pilot due to limited events; clinical utility was not assessed.
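A condensed sketch of this cross-validated training and test-time augmentation loop is shown below. It reuses build_model() and augment() from the earlier sketches and assumes hypothetical arrays ecg_data, ehr_data, and labels; the decay_steps/decay_rate values are illustrative, not the authors' settings.

```python
# Illustrative 5-fold training/evaluation sketch (not the authors' exact script).
# Assumes build_model() and augment() from the earlier sketches and hypothetical
# arrays: ecg_data (n, 600, 12), ehr_data (n, 47), labels (n,).
import numpy as np
import tensorflow as tf
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

def augment_batch(batch, rng, p_aug=0.1):
    """Apply the stochastic augmentation per sample (for test-time augmentation)."""
    return np.stack([augment(x, rng, p_aug) for x in batch])

def cross_validate(ecg_data, ehr_data, labels, n_tta=5, seed=0):
    rng = np.random.default_rng(seed)
    aucs = []
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    for train_idx, test_idx in skf.split(ehr_data, labels):
        model = build_model()
        # Adam with exponential learning-rate decay and binary cross-entropy loss.
        schedule = tf.keras.optimizers.schedules.ExponentialDecay(
            initial_learning_rate=1e-4, decay_steps=1000, decay_rate=0.9)
        model.compile(optimizer=tf.keras.optimizers.Adam(schedule),
                      loss="binary_crossentropy")
        model.fit([ecg_data[train_idx], ehr_data[train_idx]], labels[train_idx],
                  epochs=100, batch_size=32, verbose=0)
        # Test-time augmentation: average predictions over perturbed copies of the test ECGs.
        preds = np.mean(
            [model.predict([augment_batch(ecg_data[test_idx], rng), ehr_data[test_idx]],
                           verbose=0)
             for _ in range(n_tta)], axis=0)
        aucs.append(roc_auc_score(labels[test_idx], preds.ravel()))
    return float(np.mean(aucs)), float(np.std(aucs))
```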

2.6. Statistical Analysis

Model performance was evaluated using several metrics: accuracy, area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, precision, and F1 score. The statistical significance of performance differences between models was assessed using paired t-tests. We also conducted pairwise Wilcoxon tests comparing each EHR-contribution setting against the best overall setting, using the top-performing repeats across all metrics. Statistical significance was defined as p < 0.05. Feature importance analysis was conducted using Random Forest models to identify the most predictive variables in the EHR dataset. Additionally, the contribution of ECG versus EHR data was analyzed by varying their respective compression dimensions in the model and evaluating the resulting changes in performance.
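As a minimal sketch of these analyses, the snippet below runs a paired t-test between two models' repeated-run scores and fits a Random Forest on the EHR matrix to rank features; the data are synthetic stand-ins and the n_estimators value is an assumption, not the authors' configuration.

```python
# Illustrative statistics sketch with synthetic placeholder data.
import numpy as np
from scipy.stats import ttest_rel
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Paired t-test: e.g., multimodal vs. ECG-only accuracy across 10 repeated CV runs.
acc_multimodal = rng.normal(0.70, 0.04, size=10)
acc_ecg_only = rng.normal(0.52, 0.06, size=10)
t_stat, p_paired = ttest_rel(acc_multimodal, acc_ecg_only)
print(f"paired t-test: t={t_stat:.2f}, p={p_paired:.3g}")

# Random Forest feature importance over the structured EHR predictors.
X = rng.normal(size=(189, 47))        # stand-in for the 47 EHR features
y = rng.integers(0, 2, size=189)      # stand-in PAF labels
rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)
ranking = np.argsort(rf.feature_importances_)[::-1]
print("top-5 feature indices:", ranking[:5])
```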
The study pipeline is shown in Figure 1, which depicts data preparation, an example of a preprocessed 12-lead ECG waveform, and multimodal data processing in the deep learning pipeline for PAF prediction.

3. Results

The study included 189 cryptogenic stroke patients, of whom 49 (26%) had a diagnosis of PAF. The mean age of the cohort was 71.4 years. Patients with PAF were significantly older than those without (75.4 years vs. 70.0 years, p = 0.004). The cohort was predominantly female (57.7%) and White (82.5%), with no significant differences in sex or race between the PAF and non-AF groups. The average monitoring duration for the PAF group was longer than for the non-AF group (22.6 months vs. 18.3 months), though this difference was not statistically significant (p = 0.064) (Table 1, Figure A1).
The results of this study demonstrate the effectiveness of a multimodal deep learning model that integrates ECG (denoised and band-pass filtered at 0.5–40 Hz) and EHR data for predicting PAF. Using data from 189 stroke patients, the model was evaluated across multiple configurations and metrics, with a primary focus on accuracy, sensitivity, and specificity. The best performing configuration achieved an accuracy of 0.70 (SD: 0.04), sensitivity of 0.72 (SD: 0.42), and specificity of 0.87 (SD: 0.06) (Appendix A, Table A2). The large SD for some metrics reflects fold-to-fold variability driven by the small number of PAF events per fold.
This model compressed ECG to 32 and EHR to 16 latent dimensions and incorporated 8 attention heads into the Transformer architecture. Data augmentation with a probability of 0.1 further enhanced model generalization without overfitting. Figure 2 illustrates the overall distribution of performance metrics of different architectures and compares key performance metrics for the best model configuration.
Models using only EHR data achieved an accuracy of 0.67 (SD: 0.2; p < 0.05), a sensitivity of 0.72 (SD: 0.42; p < 0.05), and a specificity of 0.80 (SD: 0.32; p < 0.05), while those using only ECG data showed an accuracy of 0.52 (SD: 0.06; p < 0.05), a sensitivity of 0.51 (SD: 0.23; p < 0.05), and a specificity of 0.84 (SD: 0.07; p < 0.05). The integration of both data modalities not only enhanced overall predictive performance but also highlighted the critical role of achieving the right balance between ECG and EHR inputs. Systematic analysis demonstrated that predictive performance peaked when EHR data comprised approximately one-third of the model input (Figure 3, Table A3), underscoring the importance of balancing clinical data with electrophysiological information. Exploratory Wilcoxon tests versus the 33% EHR condition showed significant differences for nearly all comparisons (Appendix E, Table A4); the only non-significant pair was accuracy versus 67% (p = 0.33).
Figure 4 ranks EHR features by importance. The highest-ranked features included age, hemoglobin, and left atrial measures, consistent with known PAF risk correlates.

4. Discussion

The findings from this study highlight the potential of multimodal deep learning models to predict PAF by leveraging complementary information from ECG waveforms and EHR data. Although the dataset is small and imbalanced, we employed 5-fold cross-validation and repeated the experiments 10 times to mitigate potential overfitting. A novel aspect of this work is the direct comparison of ECG and EHR contributions, which reveals that an optimal balance is achieved when EHR data accounts for approximately 33% of the overall input. The observed improvement in predictive performance when combining these modalities aligns with prior studies [7,17,18]. For instance, Tang et al. demonstrated that integrating ECG with clinical features improved prediction of PAF recurrence after catheter ablation [7], and Khurshid et al. reported that combining ECG with clinical risk factors yielded complementary benefits for PAF prediction [19]. Recent studies also suggest links between brain tissue susceptibility and vascular/arrhythmic risk [20,21]. In clinical practice, multimodal risk scores could be used to prioritize patients for extended ambulatory ECG monitoring and earlier cardiology follow-up.
Although our model's accuracy is comparable to that of other studies, our key insight is that optimizing the EHR data contribution significantly enhances model specificity while maintaining clinical interpretability, a vital step for real-world applicability [6,22]. In this optimal contribution balance, the 33% input from EHR data emerges as a key factor in attaining this performance level. Remaining discrepancies with prior reports may be attributed to differences in dataset size, patient demographics, or the transient nature of PAF, which presents unique challenges for prediction.
The relatively low sensitivity observed in our study reflects the difficulty in detecting sporadic PAF episodes from limited data points, a limitation also noted in other works focusing on PAF [23]. Nonetheless, the high specificity indicates that our model is effective at ruling out low-risk cases, and our analysis suggests that the EHR contribution plays a pivotal role in enhancing specificity while maintaining an acceptable trade-off with sensitivity.
Regarding feature importance analysis (Figure 4 and Figure A2), hemoglobin levels emerged as an important feature, potentially reflecting underlying anemia, polycythemia [24], or other systemic conditions associated with PAF risk [18,25]. This further reinforces the value of the EHR component in our multimodal approach, underscoring that even a one-third contribution from EHR data can capture essential clinical nuances that improve overall model performance.
While our results are promising, several limitations warrant consideration. The primary limitation is the sample size of 189 patients from a single center, which may affect the generalizability of our findings. We observed instability in some metrics, attributable to the small pilot cohort and the rarity of positive cases per cross-validation fold. Accordingly, the retrospective nature of this pilot study requires our model to be validated in larger, more diverse populations, preferably through prospective studies. Given the pilot scope, we did not include full classical machine learning benchmarking; this is planned for a larger, prospective, multi-site validation. Furthermore, the relatively low sensitivity observed in our models reflects the inherent difficulty of predicting sporadic PAF episodes from limited data. Future work should focus on external validation to confirm our findings, refine the optimal 33% EHR data contribution as a benchmark, and explore strategies for improving predictive accuracy and sensitivity (e.g., class-weighted or focal loss) in different clinical settings. Also, medication exposures (e.g., aspirin, statin) were not used at index to avoid post-stroke initiation bias; they are planned for future prospective cohorts.

5. Conclusions

This pilot study provides evidence supporting the use of multimodal deep learning for predicting paroxysmal atrial fibrillation among stroke patients by combining ECG and EHR data. Our analysis uniquely reveals that optimal performance is achieved when EHR data contributes approximately 33% of the overall input, underscoring the critical importance of balancing heterogeneous data sources. By leveraging complementary information from these modalities, our approach offers a scalable solution for early risk stratification and intervention in clinical practice. Future research should validate our findings externally, dynamically optimize the balance between EHR and ECG data, and explore real-time clinical deployment to enhance early PAF detection, clinical decision making, and patient outcomes.

Author Contributions

All authors affirm that they contributed to the writing of the manuscript. A.V.S.: Writing—review and editing, Visualization, Conceptualization, Formal analysis, Methodology, Data curation. M.M., D.O. and N.S.: Data curation, Writing—review and editing. S.S.H. and A.V.: Writing—review and editing. R.S. and A.M.: Data curation, Writing—review and editing. V.A.: Writing—review and editing, Validation, Supervision. R.Z.: Writing—review and editing, Conceptualization, Supervision, Project administration. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

This study was conducted in accordance with the Declaration of Helsinki. The research protocol was reviewed by the Institutional Review Board of Penn State College of Medicine (protocol number STUDY00023096). The Human Research Protection Program determined that the research met the criteria for exempt research according to institutional policies and federal regulations and therefore waived the need for formal IRB review.

Informed Consent Statement

Patient consent was waived for this study. The research involved a retrospective analysis of existing data, and it was not practicable to obtain consent from the individuals whose data were used. The Institutional Review Board approved this waiver, as the study posed minimal risk to the subjects.

Data Availability Statement

The data that support the findings of this study are not publicly available due to institutional policies and to protect patient privacy and confidentiality.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AUROC: Area under the receiver operating characteristic curve
AUPRC: Area under the precision–recall curve
CNN: Convolutional neural network
ECG: Electrocardiogram
EHR: Electronic health record
PAF: Paroxysmal atrial fibrillation

Appendix A. Additional Evaluation and Interpretability

We include the receiver operating characteristic (ROC) and precision–recall (PR) curves for the best performing multimodal configuration (Figure A1). The observed areas match the summary in the manuscript (AUROC ≈ 0.65; AUPRC ≈ 0.43), providing a fuller view of threshold behavior in this imbalanced setting.
Figure A1. ROC and precision–recall curves for the best multimodal model. (a) ROC curve; the dashed diagonal indicates chance performance. (b) Precision–recall curve.
Beyond the EHR feature importance analysis already shown in the main text, we computed SHAP attributions for the multimodal model to characterize local (per-patient) contributions. A representative waterfall plot (Figure A2) illustrates how individual features push the prediction toward/away from PAF for a correctly classified case.
Figure A2. SHAP waterfall plot for a representative prediction. Local explanation showing how the top features cumulatively shift the log-odds toward PAF (right) or away from PAF (left).
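The SHAP computation itself is not shown in the paper. Below is a minimal sketch of how a waterfall plot like Figure A2 can be produced with the shap package; for simplicity it explains a scikit-learn stand-in fitted on synthetic EHR features rather than the multimodal model, whose exact SHAP wiring is not described.

```python
# Illustrative SHAP waterfall sketch on a stand-in EHR-only classifier
# (synthetic data; the paper's Figure A2 is for the multimodal model).
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(189, 47))            # stand-in EHR matrix
y = rng.integers(0, 2, size=189)          # stand-in PAF labels
clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)

predict_paf = lambda data: clf.predict_proba(data)[:, 1]
explainer = shap.Explainer(predict_paf, X[:100])   # background sample
shap_values = explainer(X[100:105])                # local explanations for 5 patients

# Waterfall plot for one patient, analogous to Figure A2.
shap.plots.waterfall(shap_values[0])
```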

Appendix B

The following is a TRIPOD-AI checklist.
Table A1. TRIPOD-AI checklist.

| Item | Evidence in Manuscript |
| --- | --- |
| Title/Abstract identify model type and purpose | Title and abstract specify multimodal deep learning to predict PAF in stroke patients; metrics reported. |
| Background and objectives (intended use/clinical context) | Rationale for combining ECG + EHR; objective to develop a multimodal model and examine relative contributions. |
| Source of data and study design/setting | Single-center retrospective cohort; Penn State COM; accrual Jan 2017–May 2023; cardiologist and stroke neurologist validated data. |
| Participants (eligibility, selection, numbers) | Cryptogenic stroke patients; flow from 223 reviewed → 197 with ECG → 189 final. |
| Outcome (definition, timing, how assessed, blinding) | Outcome is PAF; data validated by specialists. |
| Predictors (definitions, timing, measurement) | 47 EHR variables (demographics, labs, comorbidities); 12-lead ECG; preprocessing described. |
| Sample size (rationale) | n = 189; 49 events (26%). |
| Handling of missing data | No imputation; chart-adjudicated binary comorbidities coded 0. |
| Bias, drift, data splits (leakage prevention) | 5-fold CV; repeated with different seeds. |
| Modeling details (algorithms, hyperparameters, class imbalance) | Hybrid CNN + attention for ECG; MLP for EHR; concatenation; tuned heads 4–8, learning rate 1 × 10⁻⁴–1 × 10⁻³; augmentation probability 0.1; compression dimensions varied. |
| Internal validation | 5-fold cross-validation; 10 repeats; test-time augmentation. |
| Performance measures (discrimination, calibration, CIs) | Accuracy, AUROC, sensitivity, specificity, precision, F1; mean ± SD and p-values; AUROC/AUPRC reported (best values). |
| Explainability | Global EHR feature importance (Random Forest); figures. |
| Subgroups/fairness | Appendix C, Table A2. |
| Model presentation (final model, thresholds, access) | All code is shared at https://github.com/TheDecodeLab/AFib-multimodal.git (accessed on 4 September 2025). |
| Clinical utility | Not assessed (pilot; limited events). |
| Reproducibility (software, versions) | Python 3.11, TensorFlow 2.20. |
| Data availability | Data not publicly available due to privacy. |

Appendix C

Below is an exploratory subgroup analysis by age (≥52 vs. <52), sex, race, and ethnicity. Values are mean accuracy (±SD) across 5-fold × 2-repeat cross-validation.
Table A2. Subgroup accuracy (±SD) and p-values by age, sex, race, and ethnicity (5-fold × 10-repeat CV). p-values compare groups within each category. Age split at 52 years.

| Category | Group | n | Accuracy | SD | p-Value |
| --- | --- | --- | --- | --- | --- |
| Age | Old (≥52) | 178 | 0.66 | 0.27 | 0.6413 |
| | Young (<52) | 11 | 0.71 | 0.24 | |
| Sex | Male | 80 | 0.69 | 0.28 | 0.0997 |
| | Female | 109 | 0.65 | 0.26 | |
| Race | White | 156 | 0.67 | 0.27 | 0.6539 |
| | Black | 17 | 0.63 | 0.27 | |
| | Asian | 1 | 0.22 | - | |
| | Others | 15 | 0.62 | 0.27 | |
| Ethnicity | Hispanic | 3 | 0.62 | 0.33 | 0.98 |
| | Non-Hispanic | 186 | 0.67 | 0.27 | |

Appendix D

Table A3 summarizes the hyperparameters and cross-validation performance for the best model: eight attention heads, ECG latent dimension 32, EHR latent dimension 16 (≈33% EHR: 67% ECG at fusion), Adam optimizer, initial learning rate 1 × 10⁻⁴, batch size 32, 100 epochs, minority oversampling within training folds, and 5× test-time augmentation. A minimal sketch of the fold-wise oversampling follows Table A3.
Table A3. Final hyperparameters and summary performance for the best performing model (multimodal ECG + EHR).

| Component | Setting (Value) | Notes |
| --- | --- | --- |
| Architecture | CNN encoder → Multi-Head Attention × 1 → latent compressors → concat → classifier | Transformer-style attention stack |
| Attention heads | 8 | |
| ECG representation | Denoised + band-pass filtered 12-lead ECG (600 × 12) | Final choice used for the best model |
| EHR feature set | 47 structured predictors at index encounter | Demographics, comorbidities, labs, echo, stroke features |
| Fusion latent (ECG) | 32 | |
| Fusion latent (EHR) | 16 | ~33% EHR: 67% ECG contribution |
| Augmentation (train) | 0.1 | None used in the final model |
| Test-time augmentation | 5 replicates (averaged) | |
| Optimizer/Loss | Adam, binary cross-entropy | |
| Initial learning rate | 1 × 10⁻⁴ | Fixed across folds |
| Epochs/Batch size | 100/32 | Early stopping as implemented |
| Class imbalance handling | Minority oversampling in training folds | Evaluation on the original class mix |
| Cross-validation | 5-fold × 2 repeats | Model selection by mean CV AUROC |
| Performance (CV mean) | AUROC = 0.654 ± 0.071; Accuracy = 0.751 ± 0.092; Specificity = 0.859 ± 0.105; F1 = 0.487 ± 0.106 | Threshold metrics from CV; full ROC/PR in Figure A1 |
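The sketch below illustrates minority oversampling applied only inside training folds, so evaluation keeps the original class mix; the arrays and the resampling scheme (sampling with replacement up to the majority count) are illustrative assumptions rather than the authors' exact procedure.

```python
# Illustrative sketch of minority oversampling inside training folds only.
import numpy as np
from sklearn.model_selection import StratifiedKFold

def oversample_minority(indices, labels, rng):
    """Resample minority-class training indices (with replacement) to match the majority."""
    idx = np.asarray(indices)
    pos, neg = idx[labels[idx] == 1], idx[labels[idx] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = rng.choice(minority, size=len(majority) - len(minority), replace=True)
    return rng.permutation(np.concatenate([idx, extra]))

rng = np.random.default_rng(0)
labels = np.array([1] * 49 + [0] * 140)          # 49 PAF events out of 189
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in skf.split(np.zeros(len(labels)), labels):
    balanced_train = oversample_minority(train_idx, labels, rng)
    # Train on `balanced_train`; evaluate on the untouched `test_idx`.
    print(len(train_idx), "->", len(balanced_train), "| test:", len(test_idx))
```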

Appendix E

We compared performance across ECG–EHR contribution settings using pairwise tests versus the 33% EHR condition, which yielded the best overall performance. For each setting, we retained the top-performing model instances to focus on the best achievable operating regime per condition. We then performed two-sided Wilcoxon rank-sum tests across the comparisons for each metric; a minimal sketch of this procedure follows Table A4.
Table A4. Pairwise comparisons (p-values) versus the 33% EHR condition.

| Comparison with 33% EHR | AUROC | Accuracy | Sensitivity | Specificity | F1 Score |
| --- | --- | --- | --- | --- | --- |
| 0% | ≤0.05 | ≤0.05 | ≤0.05 | ≤0.05 | ≤0.05 |
| 11% | ≤0.05 | ≤0.05 | ≤0.05 | ≤0.05 | ≤0.05 |
| 20% | ≤0.05 | ≤0.05 | ≤0.05 | ≤0.05 | ≤0.05 |
| 50% | ≤0.05 | ≤0.05 | ≤0.05 | ≤0.05 | ≤0.05 |
| 67% | ≤0.05 | 0.33 | ≤0.05 | ≤0.05 | ≤0.05 |
| 80% | ≤0.05 | ≤0.05 | ≤0.05 | ≤0.05 | ≤0.05 |
| 100% | ≤0.05 | ≤0.05 | ≤0.05 | ≤0.05 | ≤0.05 |
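A minimal sketch of this comparison procedure is shown below: each EHR-contribution setting is tested against the 33% condition per metric with a two-sided Wilcoxon rank-sum test. The per-run scores here are synthetic placeholders standing in for the retained top-performing runs.

```python
# Illustrative sketch of the Table A4 comparisons (synthetic placeholder scores).
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(0)
metrics = ["AUROC", "Accuracy", "Sensitivity", "Specificity", "F1"]
settings = ["0%", "11%", "20%", "50%", "67%", "80%", "100%"]

# scores[setting][metric] -> array of per-run values (synthetic here).
scores = {s: {m: rng.normal(0.6, 0.05, size=10) for m in metrics}
          for s in settings + ["33%"]}

for s in settings:
    pvals = {m: ranksums(scores["33%"][m], scores[s][m]).pvalue for m in metrics}
    print(s, {m: round(p, 3) for m, p in pvals.items()})
```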

References

1. Boon, K.H.; Khalil-Hani, M.; Malarvili, M.B.; Sia, C.W. Paroxysmal Atrial Fibrillation Prediction Method with Shorter HRV Sequences. Comput. Methods Programs Biomed. 2016, 134, 187–196.
2. Linz, D.; Hermans, A.; Tieleman, R.G. Early Atrial Fibrillation Detection and the Transition to Comprehensive Management. Europace 2021, 23, ii46–ii51.
3. Miyazaki, Y.; Yamagata, K.; Ishibashi, K.; Inoue, Y.; Miyamoto, K.; Nagase, S.; Aiba, T.; Kusano, K. Paroxysmal Atrial Fibrillation as a Predictor of Pacemaker Implantation in Patients with Unexplained Syncope. J. Cardiol. 2022, 80, 28–33.
4. Nayak, T.; Lohrmann, G.; Passman, R. Controversies in Diagnosis and Management of Atrial Fibrillation. Cardiol. Rev. 2024.
5. Xia, Y.; Wulan, N.; Wang, K.; Zhang, H. Detecting Atrial Fibrillation by Deep Convolutional Neural Networks. Comput. Biol. Med. 2018, 93, 84–92.
6. Raghunath, S.; Pfeifer, J.M.; Ulloa-Cerna, A.E.; Nemani, A.; Carbonati, T.; Jing, L.; vanMaanen, D.P.; Hartzel, D.N.; Ruhl, J.A.; Lagerman, B.F.; et al. Deep Neural Networks Can Predict New-Onset Atrial Fibrillation From the 12-Lead ECG and Help Identify Those at Risk of Atrial Fibrillation-Related Stroke. Circulation 2021, 143, 1287–1298.
7. Tang, S.; Razeghi, O.; Kapoor, R.; Alhusseini, M.I.; Fazal, M.; Rogers, A.J.; Rodrigo Bort, M.; Clopton, P.; Wang, P.J.; Rubin, D.L.; et al. Machine Learning-Enabled Multimodal Fusion of Intra-Atrial and Body Surface Signals in Prediction of Atrial Fibrillation Ablation Outcomes. Circ. Arrhythm. Electrophysiol. 2022, 15, e010850.
8. Bhagubai, M.; Vandecasteele, K.; Swinnen, L.; Macea, J.; Chatzichristos, C.; De Vos, M.; Van Paesschen, W. The Power of ECG in Semi-Automated Seizure Detection in Addition to Two-Channel behind-the-Ear EEG. Bioengineering 2023, 10, 491.
9. Neri, L.; Gallelli, I.; Dall’Olio, M.; Lago, J.; Borghi, C.; Diemberger, I.; Corazza, I. Validation of a New and Straightforward Algorithm to Evaluate Signal Quality during ECG Monitoring with Wearable Devices Used in a Clinical Setting. Bioengineering 2024, 11, 222.
10. Murat, F.; Sadak, F.; Yildirim, O.; Talo, M.; Murat, E.; Karabatak, M.; Demir, Y.; Tan, R.-S.; Acharya, U.R. Review of Deep Learning-Based Atrial Fibrillation Detection Studies. Int. J. Environ. Res. Public Health 2021, 18, 11302.
11. Bhattacharya, A.; Sadasivuni, S.; Chao, C.-J.; Agasthi, P.; Ayoub, C.; Holmes, D.R.; Arsanjani, R.; Sanyal, A.; Banerjee, I. Multi-Modal Fusion Model for Predicting Adverse Cardiovascular Outcome Post Percutaneous Coronary Intervention. Physiol. Meas. 2022, 43, 124004.
12. Jo, Y.-Y.; Cho, Y.; Lee, S.Y.; Kwon, J.-M.; Kim, K.-H.; Jeon, K.-H.; Cho, S.; Park, J.; Oh, B.-H. Explainable Artificial Intelligence to Detect Atrial Fibrillation Using Electrocardiogram. Int. J. Cardiol. 2021, 328, 104–110.
13. Tzou, H.-A.; Lin, S.-F.; Chen, P.-S. Paroxysmal Atrial Fibrillation Prediction Based on Morphological Variant P-Wave Analysis with Wideband ECG and Deep Learning. Comput. Methods Programs Biomed. 2021, 211, 106396.
14. Machine Learning for Detecting Atrial Fibrillation from ECGs: Systematic Review and Meta-Analysis. Available online: https://www.imrpress.com/journal/RCM/25/1/10.31083/j.rcm2501008/htm (accessed on 2 February 2025).
15. Jabbour, G.; Nolin-Lapalme, A.; Tastet, O.; Corbin, D.; Jordà, P.; Sowa, A.; Delfrate, J.; Busseuil, D.; Hussin, J.G.; Dubé, M.-P.; et al. Prediction of Incident Atrial Fibrillation Using Deep Learning, Clinical Models, and Polygenic Scores. Eur. Heart J. 2024, 45, 4920–4934.
16. Abadi, M. TensorFlow: Learning Functions at Scale. In Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, Nara, Japan, 18–22 September 2016; Association for Computing Machinery: New York, NY, USA, 2016; p. 1.
17. Qiu, Y.; Guo, H.; Wang, S.; Yang, S.; Peng, X.; Xiayao, D.; Chen, R.; Yang, J.; Liu, J.; Li, M.; et al. Deep Learning-Based Multimodal Fusion of the Surface ECG and Clinical Features in Prediction of Atrial Fibrillation Recurrence Following Catheter Ablation. BMC Med. Inform. Decis. Mak. 2024, 24, 225.
18. Lin, F.; Zhang, P.; Chen, Y.; Liu, Y.; Li, D.; Tan, L.; Wang, Y.; Wang, D.W.; Yang, X.; Ma, F.; et al. Artificial-Intelligence-Based Risk Prediction and Mechanism Discovery for Atrial Fibrillation Using Heart Beat-to-Beat Intervals. Med 2024, 5, 414–431.e5.
19. Khurshid, S.; Friedman, S.; Reeder, C.; Di Achille, P.; Diamant, N.; Singh, P.; Harrington, L.X.; Wang, X.; Al-Alusi, M.A.; Sarma, G.; et al. ECG-Based Deep Learning and Clinical Risk Factors to Predict Atrial Fibrillation. Circulation 2022, 145, 122–133.
20. Uchida, Y.; Kan, H.; Kano, Y.; Onda, K.; Sakurai, K.; Takada, K.; Ueki, Y.; Matsukawa, N.; Hillis, A.E.; Oishi, K. Longitudinal Changes in Iron and Myelination Within Ischemic Lesions Associate With Neurological Outcomes: A Pilot Study. Stroke 2024, 55, 1041–1050.
21. Uchida, Y.; Kan, H.; Inoue, H.; Oomura, M.; Shibata, H.; Kano, Y.; Kuno, T.; Usami, T.; Takada, K.; Yamada, K.; et al. Penumbra Detection With Oxygen Extraction Fraction Using Magnetic Susceptibility in Patients With Acute Ischemic Stroke. Front. Neurol. 2022, 13, 752450.
22. Attia, Z.I.; Noseworthy, P.A.; Lopez-Jimenez, F.; Asirvatham, S.J.; Deshmukh, A.J.; Gersh, B.J.; Carter, R.E.; Yao, X.; Rabinstein, A.A.; Erickson, B.J.; et al. An Artificial Intelligence-Enabled ECG Algorithm for the Identification of Patients with Atrial Fibrillation during Sinus Rhythm: A Retrospective Analysis of Outcome Prediction. Lancet 2019, 394, 861–867.
23. Chen, W.; Zheng, P.; Bu, Y.; Xu, Y.; Lai, D. Achieving Real-Time Prediction of Paroxysmal Atrial Fibrillation Onset by Convolutional Neural Network and Sliding Window on R-R Interval Sequences. Bioengineering 2024, 11, 903.
24. Abstract 16068: Polycythemia Vera Is Associated with Increased Atrial Fibrillation Compared to the General Population: Results from the National Inpatient Sample Database. Circulation. Available online: https://www.ahajournals.org/doi/10.1161/circ.134.suppl_1.16068 (accessed on 13 July 2025).
25. Truong, E.T.; Lyu, Y.; Ihdayhid, A.R.; Lan, N.S.R.; Dwivedi, G. Beyond Clinical Factors: Harnessing Artificial Intelligence and Multimodal Cardiac Imaging to Predict Atrial Fibrillation Recurrence Post-Catheter Ablation. J. Cardiovasc. Dev. Dis. 2024, 11, 291.
Figure 1. Overview of the study pipeline, including (a) data preparation steps with patient inclusion numbers, (b) an example of a preprocessed 12-lead ECG waveform, and (c) the deep learning pipeline integrating multimodal ECG and EHR data for paroxysmal atrial fibrillation (PAF) prediction.
Figure 2. (a) Performance metrics across different deep learning architectures. (b) Spider plot illustrating the performance metrics (AUROC, sensitivity, specificity, precision, accuracy, and F1 score) for the best performing model configuration. The plot highlights the balanced trade-offs achieved across all six metrics.
Figure 3. Impact of varying the relative contribution of EHR versus ECG data on predictive performance metrics. Moving left along the x-axis increases ECG data’s contribution (reducing EHR), whereas moving right increases EHR data’s contribution. For comparison, the dashed line represents baseline performance (dummy classifier).
Figure 4. Feature importance analysis identifying key predictors in the EHR dataset. The error bars represent variability across training random seeds. The symbol ‘@’ indicates measurements obtained at the index stroke.
Table 1. Baseline demographic and clinical characteristics of the study population, stratified by paroxysmal atrial fibrillation (PAF) diagnosis. p-values compare the PAF and no-PAF groups.

| Variable | Total (n = 189) | No PAF (n = 140) | PAF (n = 49) | p-Value |
| --- | --- | --- | --- | --- |
| Age (years) | 71.4 ± 11.4 | 70.0 ± 11.6 | 75.4 ± 9.6 | 0.004 |
| Sex, n (%) | | | | 0.452 |
| Male | 80 (42.3) | 62 (44.3) | 18 (36.7) | |
| Female | 109 (57.7) | 78 (55.7) | 31 (63.3) | |
| Race, n (%) | | | | 0.205 |
| White | 156 (82.5) | 116 (82.9) | 40 (81.6) | |
| Black | 17 (9.0) | 13 (9.3) | 4 (8.2) | |
| Asian | 1 (0.5) | 0 (0.0) | 1 (2.0) | |
| Others | 14 (7.4) | 11 (7.9) | 3 (6.1) | |
| Ethnicity, n (%) | | | | 0.999 |
| Hispanic | 3 (1.6) | 2 (1.4) | 1 (2.0) | |
| Non-Hispanic | 186 (98.4) | 138 (98.6) | 48 (98.0) | |
| Monitoring (months) | 19.4 ± 13.9 | 18.3 ± 13.6 | 22.6 ± 14.5 | 0.064 |