Figure 1.
Representative physiological waveform samples from the MIMIC-III database showing synchronized PPG, ECG, and arterial blood pressure (ABP) signals recorded at 125 Hz sampling frequency.
Figure 2.
Training dynamics of the ResNet-Transformer model showing loss curves and performance metrics across epochs. The model achieved optimal performance at epoch 5, with subsequent training showing signs of overfitting.
Figure 3.
SHAP summary plot showing global feature importance rankings. Each point represents a sample, with color indicating feature value (red = high, blue = low) and horizontal position showing SHAP impact on prediction. ECG-derived features (ecg_r_amp_std, ecg_signal_quality) dominate the top rankings.
Figure 4.
Comparison of SHAP-based and permutation-based feature importance rankings. Both methods identify ECG R-wave amplitude variability and signal quality as the most predictive features, supporting the robustness of the importance estimates.
Figure 5.
Partial dependence plots for top predictive features showing the marginal effect on predicted systolic blood pressure. The plots reveal nonlinear relationships between feature values and blood pressure predictions. Blue lines represent partial dependence values, shaded regions indicate ±1 standard deviation bands, and vertical tick marks show feature value deciles.
Figure 6.
Feature interaction heatmap showing SHAP interaction values between top features. Darker colors indicate stronger interactions, revealing dependencies between ECG and PPG features in blood pressure prediction.
Figure 7.
Model performance comparison with 95% confidence intervals across all evaluated models. XGBoost achieves significantly lower error than all other models. Salmon bars represent Ridge, Lasso, ElasticNet, Linear Regression, SVR, and LightGBM; orange bars represent XGBoost, Gradient Boosting, Random Forest, and KNN. Grey dashed lines indicate the AAMI threshold for reference. Subplots show: (A) mean absolute error (MAE), (B) root mean square error (RMSE), and (C) coefficient of determination (R²).
Figure 8.
Bland–Altman plot for XGBoost predictions showing the relationship between prediction error and mean blood pressure. The plot reveals acceptable bias (−1.27 mmHg) but substantial variability (limits of agreement: −24.01 to 21.47 mmHg). The dark solid line indicates mean bias, red dashed lines indicate the 95% limits of agreement, and green dotted lines mark the ±5 mmHg clinical threshold.
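The limits of agreement quoted in this caption are consistent with the usual Bland–Altman definition, bias ± 1.96 × SD of the differences, when the SD of 11.60 mmHg reported for XGBoost in Table 8 is used. The following is a minimal sketch of that calculation, assuming errors are computed as predicted minus reference (the sign convention is not stated in the source); the function names are illustrative.

```python
import numpy as np


def limits_from_summary(bias, sd):
    """95% limits of agreement from a reported bias and SD of the differences."""
    return bias - 1.96 * sd, bias + 1.96 * sd


def bland_altman_limits(predicted, reference):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Assumption: difference = predicted - reference.
    """
    diff = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation of the differences
    lower, upper = limits_from_summary(bias, sd)
    return bias, lower, upper


# Check against the reported XGBoost values (bias -1.27 mmHg, SD 11.60 mmHg).
lower, upper = limits_from_summary(-1.27, 11.60)
print(round(lower, 2), round(upper, 2))  # -24.01 21.47
```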
Figure 9.
Statistical comparison heatmap showing p-values from pairwise Wilcoxon signed-rank tests. Darker colors indicate more significant differences. XGBoost (top row) shows highly significant differences from all other models.
Figure 10.
SHAP feature importance bar plot showing mean absolute SHAP values for all 55 features. ECG-derived features (blue bars) and PPG-derived features (orange bars) are distinguished by color. ECG features dominate the top rankings, with R-wave amplitude variability and signal quality emerging as the most predictive features.
Figure 11.
Ablation study results comparing MAE for PPG-only, ECG-only, and combined feature sets.
Figure 12.
SHAP summary plot showing global feature importance for SBP prediction.
Figure 13.
Aggregate SHAP importance by signal type: ECG (54.7%) vs. PPG (45.3%).
Figure 14.
Prediction error distribution across ablation study configurations.
Table 1.
Dataset characteristics and blood pressure distribution.
| Characteristic | Value |
|---|---|
| Source | MIMIC-III Waveform Database (PhysioNet) |
| Number of subjects | 1524 |
| Number of samples (SBP) | 61,232 |
| Number of samples (DBP) | 61,192 |
| Sampling frequency | 125 Hz |
| SBP (mean ± SD) | 108.79 ± 18.63 mmHg |
| DBP (mean ± SD) | 57.26 ± 11.14 mmHg |
Table 2.
Signal preprocessing parameters.
| Parameter | Value | Description |
|---|---|---|
| Sampling rate | 125 Hz | MIMIC waveform acquisition rate |
| Segment duration | 30 s | Pre-segmented in MIMIC-BP dataset |
| Bandpass filter (PPG) | 0.5–8.0 Hz | 3rd-order Butterworth filter |
| Bandpass filter (ECG) | 0.5–40.0 Hz | 3rd-order Butterworth filter |
| Peak detection | scipy.signal.find_peaks | distance = 42 samples |
| SBP plausibility range | 60–260 mmHg | Segments outside this range excluded |
| DBP plausibility range | 40–180 mmHg | Segments outside this range excluded |
| Heart rate plausibility | 30–250 bpm | Segments outside this range excluded |
| R-peak detection | Pan-Tompkins (NeuroKit2) | Standard QRS detection |
| Minimum cardiac cycles | 10 | Required for reliable PTT |
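The filtering and peak-detection parameters in Table 2 can be illustrated with SciPy. This is a rough sketch rather than the authors' pipeline: zero-phase filtering with `sosfiltfilt`, the synthetic test signal, and the helper names are all assumptions; only the sampling rate, filter order, passbands, and `distance = 42` come from the table.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, find_peaks

FS = 125  # Hz, sampling rate from Table 2


def bandpass(signal, low_hz, high_hz, fs=FS, order=3):
    """3rd-order Butterworth bandpass, applied forward and backward.

    Assumption: zero-phase filtering; the source does not say how the
    filter was applied.
    """
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, signal)


def detect_ppg_peaks(ppg, min_distance=42):
    """Systolic peak indices, at least `min_distance` samples apart."""
    peaks, _ = find_peaks(ppg, distance=min_distance)
    return peaks


def heart_rate_bpm(peaks, fs=FS):
    """Mean heart rate from peak-to-peak intervals."""
    intervals_s = np.diff(peaks) / fs
    return 60.0 / intervals_s.mean()


# Synthetic 30 s "PPG": a 1.2 Hz pulse (72 bpm) plus 20 Hz interference.
t = np.arange(0, 30, 1 / FS)
ppg = np.sin(2 * np.pi * 1.2 * t) + 0.3 * np.sin(2 * np.pi * 20 * t)

filtered = bandpass(ppg, 0.5, 8.0)   # PPG passband from Table 2
peaks = detect_ppg_peaks(filtered)
hr = heart_rate_bpm(peaks)

# Plausibility checks mirroring Table 2: 30-250 bpm, at least 10 cycles.
is_plausible = 30 <= hr <= 250 and len(peaks) >= 10
```

For ECG the same `bandpass` helper would be called with the 0.5–40.0 Hz passband. A minimum peak distance of 42 samples at 125 Hz corresponds to about 0.34 s between beats.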
Table 3.
Photoplethysmography (PPG) features extracted for blood pressure estimation (n = 21).
| Category | Features |
|---|---|
| Statistical (11) | Mean, Standard Deviation, Variance, Minimum, Maximum, Range, Median, IQR, Skewness, Kurtosis, RMS |
| Heart Rate (2) | HR Mean, HR Standard Deviation |
| RR Interval (3) | RR Mean, RR Standard Deviation, RMSSD |
| Morphological (4) | Pulse Amplitude, Pulse Width (50%), Fall Time, Dicrotic Index |
| Frequency (1) | Dominant Frequency |
Table 4.
Electrocardiography (ECG) features extracted for blood pressure estimation (n = 9).
| Category | Features |
|---|---|
| Heart Rate (5) | HR Mean, HR Standard Deviation, RR Mean, RR Standard Deviation, RR Coefficient of Variation |
| Morphological (3) | Total Power, R-wave Amplitude Mean, R-wave Amplitude Standard Deviation |
| Quality (1) | Signal Quality Index |
Table 5.
Machine learning models evaluated for blood pressure estimation.
| Category | Models |
|---|---|
| Linear | Linear Regression, Ridge, Lasso, ElasticNet |
| Instance-based | K-Nearest Neighbors (KNN) |
| Kernel | Support Vector Regression (SVR-RBF) |
| Tree Ensemble | Random Forest, Gradient Boosting, XGBoost, LightGBM |
Table 6.
ResNet-Transformer deep learning model configuration.
| Parameter | Value |
|---|---|
| Model | ResNet-Transformer hybrid (1D) |
| Input | Raw PPG/ECG waveforms |
| Total parameters | 6,078,850 |
| Optimizer | Adam (lr = 0.0001, weight decay = 1 × 10⁻⁵) |
| Early stopping | Patience = 15 epochs |
| Maximum epochs | 100 |
| Best epoch | 5 |
Table 7.
Clinical validation standards for blood pressure measurement devices.
| Standard | Criterion 1 | Criterion 2 | Criterion 3 |
|---|---|---|---|
| BHS Grade A | ≥60% within 5 mmHg | ≥85% within 10 mmHg | ≥95% within 15 mmHg |
| BHS Grade B | ≥50% within 5 mmHg | ≥75% within 10 mmHg | ≥90% within 15 mmHg |
| BHS Grade C | ≥40% within 5 mmHg | ≥65% within 10 mmHg | ≥85% within 15 mmHg |
| BHS Grade D | Below Grade C | - | - |
| AAMI | Mean error ≤ 5 mmHg | SD ≤ 8 mmHg | - |
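The grading rules can be expressed as a short function. The sketch below assumes a grade is awarded only when all three cumulative percentages meet that grade's thresholds, and that the AAMI check applies to the absolute value of the mean error; the function names are illustrative and not taken from the source.

```python
import numpy as np

# Minimum percentage of absolute errors within 5, 10, and 15 mmHg.
BHS_THRESHOLDS = {
    "A": (60.0, 85.0, 95.0),
    "B": (50.0, 75.0, 90.0),
    "C": (40.0, 65.0, 85.0),
}


def bhs_grade_from_percentages(p5, p10, p15):
    """Return the best BHS grade whose three thresholds are all met."""
    for grade, (t5, t10, t15) in BHS_THRESHOLDS.items():
        if p5 >= t5 and p10 >= t10 and p15 >= t15:
            return grade
    return "D"


def bhs_grade(errors):
    """Grade a set of prediction errors (predicted minus reference, mmHg)."""
    abs_err = np.abs(np.asarray(errors, dtype=float))
    percentages = [100.0 * np.mean(abs_err <= limit) for limit in (5, 10, 15)]
    return bhs_grade_from_percentages(*percentages)


def meets_aami(errors):
    """AAMI check: |mean error| <= 5 mmHg and SD <= 8 mmHg."""
    errors = np.asarray(errors, dtype=float)
    return bool(abs(errors.mean()) <= 5.0 and errors.std(ddof=1) <= 8.0)


# Percentages reported for XGBoost and KNN in Table 9.
print(bhs_grade_from_percentages(56.7, 77.2, 87.2))  # C
print(bhs_grade_from_percentages(46.8, 70.2, 83.3))  # D
```

XGBoost misses Grade B only on the 15 mmHg criterion (87.2% against the required 90%), while KNN misses Grade C on the same criterion (83.3% against 85%).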
Table 8.
Complete model performance comparison for systolic blood pressure estimation. Models ranked by MAE. CI: confidence interval; BHS: British Hypertension Society grade.
| Model | MAE, mmHg (95% CI) | RMSE (mmHg) | R² | Bias ± SD (mmHg) | BHS |
|---|---|---|---|---|---|
| XGBoost | 7.32 (6.59–8.07) | 11.67 | 0.621 | −1.27 ± 11.60 | C |
| KNN | 8.47 (7.73–9.25) | 12.57 | 0.560 | −1.45 ± 12.49 | D |
| Gradient Boosting | 8.77 (8.03–9.51) | 12.55 | 0.561 | −1.10 ± 12.50 | D |
| Random Forest | 10.39 (9.66–11.12) | 13.90 | 0.462 | −1.50 ± 13.82 | D |
| SVR | 12.24 (11.37–13.13) | 16.42 | 0.250 | −2.94 ± 16.15 | D |
| ResNet-Transformer | 12.78 (-) | 16.24 | 0.267 | 0.20 ± 16.23 | D |
| Linear Regression | 13.66 (12.79–14.50) | 17.52 | 0.146 | −1.46 ± 17.46 | D |
| LightGBM | 13.82 (12.98–14.69) | 17.40 | 0.157 | −1.40 ± 17.35 | D |
| Ridge | 13.90 (13.02–14.75) | 17.76 | 0.122 | −1.48 ± 17.70 | D |
| Lasso | 15.24 (14.34–16.17) | 19.02 | −0.007 | −1.64 ± 18.95 | D |
| ElasticNet | 15.24 (14.34–16.17) | 19.02 | −0.007 | −1.64 ± 18.95 | D |
Table 9.
British Hypertension Society (BHS) grading for blood pressure estimation models. Grade thresholds shown for reference.
| Model | ≤5 mmHg | ≤10 mmHg | ≤15 mmHg | Grade |
|---|---|---|---|---|
| XGBoost | 56.7% | 77.2% | 87.2% | C |
| KNN | 46.8% | 70.2% | 83.3% | D |
| Gradient Boosting | 43.7% | 70.3% | 82.8% | D |
| Random Forest | 32.2% | 59.5% | 78.0% | D |
| BHS Grade A threshold | ≥60% | ≥85% | ≥95% | - |
| BHS Grade C threshold | ≥40% | ≥65% | ≥85% | - |
Table 10.
Top 15 features ranked by SHAP importance for blood pressure prediction.
| Rank | Feature | SHAP Importance | Signal |
|---|---|---|---|
| 1 | ecg_r_amp_std | 0.654 | ECG |
| 2 | ecg_signal_quality | 0.482 | ECG |
| 3 | ppg_hr_mean | 0.418 | PPG |
| 4 | ppg_skewness | 0.397 | PPG |
| 5 | ecg_r_amp_mean | 0.395 | ECG |
| 6 | ecg_hr_mean | 0.381 | ECG |
| 7 | ppg_iqr | 0.374 | PPG |
| 8 | ppg_kurtosis | 0.329 | PPG |
| 9 | ppg_min | 0.235 | PPG |
| 10 | ppg_fall_time | 0.220 | PPG |
| 11 | ecg_rr_mean | 0.200 | ECG |
| 12 | ppg_dicrotic_idx | 0.200 | PPG |
| 13 | ppg_rr_rmssd | 0.195 | PPG |
| 14 | ppg_width_50 | 0.174 | PPG |
| 15 | ppg_amp_mean | 0.154 | PPG |
Table 11.
Ablation study results: feature source comparison (SBP).
| Model | PPG Only (MAE) | ECG Only (MAE) | Combined (MAE) | BHS Grade |
|---|---|---|---|---|
| LightGBM | 15.97 | 16.23 | 16.24 | D |
| XGBoost | 16.58 | 17.41 | 16.19 | D |
| Random Forest | 16.36 | 16.54 | 16.31 | D |
Table 12.
Statistical significance testing (Wilcoxon signed-rank test).
| Comparison | Model | p-Value | Significant? |
|---|---|---|---|
| PPG vs. ECG | LightGBM | 0.226 | No |
| PPG vs. ECG | XGBoost | 0.277 | No |
| PPG vs. ECG | Random Forest | 0.650 | No |
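A paired comparison of this kind can be run with `scipy.stats.wilcoxon`. The sketch below is illustrative only: the source does not state what was paired (per-sample absolute errors are assumed here), the significance level of 0.05 is an assumption, and the data are synthetic placeholders rather than study results.

```python
import numpy as np
from scipy.stats import wilcoxon


def compare_paired_errors(errors_a, errors_b, alpha=0.05):
    """Wilcoxon signed-rank test on paired absolute errors.

    Assumption: both error arrays refer to the same test samples,
    in the same order.
    """
    result = wilcoxon(np.abs(errors_a), np.abs(errors_b))
    return result.pvalue, bool(result.pvalue < alpha)


# Synthetic placeholder errors (mmHg) for two feature sets on the same samples.
rng = np.random.default_rng(0)
ppg_only_errors = rng.normal(0.0, 16.0, size=200)
ecg_only_errors = rng.normal(0.0, 16.0, size=200)

p_value, significant = compare_paired_errors(ppg_only_errors, ecg_only_errors)
print(f"p = {p_value:.3f}, significant: {significant}")
```

The test is non-parametric, so it does not require the paired differences to be normally distributed, which suits skewed absolute-error distributions.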
Table 13.
SHAP feature importance by signal type.
| Signal Type | % of Total Importance | Features in Top 10 |
|---|---|---|
| ECG Features | 54.7% | 4 |
| PPG Features | 45.3% | 6 |
Table 14.
Comparison of tree-based machine learning versus deep learning approaches for feature-based blood pressure estimation.
| Approach | Best MAE (mmHg) | Explanation |
|---|---|---|
| XGBoost (tree-based) | 7.32 | Excels at structured feature relationships and nonlinear interactions |
| ResNet-Transformer (DL) | 12.78 | Designed for raw waveforms, not pre-engineered features |
Table 15.
Comparison with selected blood pressure estimation studies from recent literature.
| Study | Method | MAE (SBP) | BHS | Key Difference |
|---|---|---|---|---|
| This study | XGBoost | 7.32 mmHg | C | Feature-based, ECG + PPG |
| TransfoRhythm (2024) [28] | Transformer | 1.37 mmHg | A | End-to-end raw signals |
| CNN-BiLSTM (2025) [29] | Hybrid DL | 1.88 mmHg | A | Raw waveform input |
| Kachuee et al. (2017) [7] | AdaBoost | 11.17 mmHg | - | PPG features only |