Background/Objectives: Prognostication in ICU patients with documented coma or unresponsiveness is a high-stakes task that informs escalation of care, goals-of-care discussions, and family counselling. Conventional scores are often based on static snapshots and may not reflect early physiological evolution in heterogeneous real-world ICU populations. Routine arterial blood gases (ABG) and SpO₂ are measured repeatedly during early ICU care and may capture clinically meaningful trajectories that explainable machine learning can leverage. We aimed to develop and internally validate exploratory, time-updated explainable machine-learning models of ICU outcome in patients with clinically documented coma or unresponsiveness, using routine ABG/SpO₂ measurements and physiological trajectories available at admission, 24 h, and 72 h, and to evaluate whether trajectory information adds prognostic value within a staged internal-validation framework.
Methods: We conducted a retrospective single-centre study of 108 adult ICU patients with clinically documented coma or unresponsiveness. Predictors included demographics, comorbidity burden, COVID-19 status, baseline ABG/SpO₂ at ICU admission, inflammatory and coagulation biomarkers, and derived ABG/SpO₂ trajectory variables at 24 h and 72 h. Trajectory variables were defined as changes from admission to 24 h and to 72 h and were retained as missing when follow-up measurements were unavailable. The primary ICU-course outcome was ICU death versus transfer to ward. Three staged models were evaluated: Model A using baseline variables, Model B adding 24 h trajectory features, and Model C adding 72 h trajectory features. For each stage, models were analyzed with and without the derived respiratory_support index; models excluding respiratory_support were treated as the main interpretive analyses. Logistic regression, random forest, and gradient boosting (XGBoost) classifiers were assessed using repeated stratified 5-fold cross-validation with 20 repeats and aligned out-of-fold predictions. Performance was reported using AUC-ROC, precision–recall AUC, Brier score, and operating-point metrics; clinical utility was examined with decision-curve analysis. Model interpretation used SHAP and partial dependence plots. Robustness analyses included feature-exclusion sensitivity analysis for respiratory_support and a label-permutation sanity check.
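To make the trajectory definition concrete, the following is a minimal sketch of how a change-from-admission variable might be derived while keeping unavailable follow-up values as missing. The function name `trajectory_delta`, the dictionary keys, the PaO₂ values, and the sign convention (follow-up minus admission) are illustrative assumptions and are not taken from the study.

```python
from typing import Optional


def trajectory_delta(admission: Optional[float],
                     follow_up: Optional[float]) -> Optional[float]:
    """Change from admission to a follow-up time point.

    Assumed convention: follow_up - admission. Returns None (missing)
    when either measurement is unavailable, rather than imputing a value.
    """
    if admission is None or follow_up is None:
        return None
    return follow_up - admission


# Hypothetical PaO2 values (mmHg); None marks an unavailable measurement.
pao2 = [
    {"adm": 80.0, "h24": 92.0, "h72": 75.0},
    {"adm": 65.0, "h24": None, "h72": 70.0},
    {"adm": 90.0, "h24": 88.0, "h72": None},
]

# One 24 h and one 72 h trajectory feature per patient.
deltas = [
    {
        "d24": trajectory_delta(p["adm"], p["h24"]),
        "d72": trajectory_delta(p["adm"], p["h72"]),
    }
    for p in pao2
]
```

Keeping the missing deltas as `None` instead of filling them in mirrors the statement that trajectory variables were retained as missing; how each classifier then handled those missing values is not specified in the abstract.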
Results: ICU mortality was 65.7% (71/108). Follow-up ABG completeness was 75.9% at 24 h and 61.1% at 72 h. Because respiratory_support summarized the highest support level during the first 72 h and strongly separated outcome groups, models excluding respiratory_support (NoRS) were treated as the primary interpretive analyses. In the primary NoRS logistic-regression models, discrimination was moderate-to-strong, with AUC-ROC 0.822 for Model A_noRS, 0.848 for Model B_noRS, and 0.895 for Model C_noRS; bootstrap 95% confidence intervals were 0.739–0.897, 0.766–0.919, and 0.830–0.951, respectively. Measurement-availability sensitivity analyses and simple benchmark models were added to contextualize trajectory-related performance. Respiratory_support-enriched models were retained only as secondary severity-aware analyses, not as admission-only prediction models. Label permutation reduced discrimination toward chance (AUC ≈ 0.55). SHAP and partial-dependence analyses identified oxygenation variables, inflammatory burden, acid–base status, and ΔPaO₂ at 72 h as clinically coherent contributors to predicted risk; when included, respiratory_support dominated feature attribution, consistent with its role as an organ-support intensity marker.
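The idea behind a label-permutation sanity check can be sketched as follows: if predicted risks carry real signal, AUC-ROC computed against shuffled outcome labels should fall toward 0.5. This is a simplified illustration that shuffles labels against fixed scores; the abstract does not describe how the check was implemented, and a full version would typically refit the model on each permuted label set. The labels, scores, and function names below are hypothetical.

```python
import random


def auc_roc(labels, scores):
    """AUC-ROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_permuted_auc(labels, scores, n_perm=500, seed=0):
    """Mean AUC-ROC after repeatedly shuffling the outcome labels."""
    rng = random.Random(seed)
    shuffled = list(labels)
    total = 0.0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        total += auc_roc(shuffled, scores)
    return total / n_perm


# Hypothetical outcomes (1 = ICU death) and predicted risks.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

observed_auc = auc_roc(labels, scores)
permuted_auc = mean_permuted_auc(labels, scores)
```

A permuted AUC that stays well above 0.5 would point to leakage or an optimistic evaluation pipeline, which is why the check serves as a sanity test rather than a performance estimate.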
Conclusions: In ICU patients with clinically documented coma or unresponsiveness, explainable machine-learning models using routine ABG/SpO₂ trajectories within the first 72 h are feasible and may provide time-updated prognostic information, but the incremental value of trajectory-enriched models over simpler admission-only benchmarks remains unproven. Trajectory-enriched NoRS models retained meaningful discrimination after removing organ-support severity, suggesting a possible physiologically meaningful signal beyond support intensity alone. These findings are exploratory and internally validated only: they do not establish a deployable ICU mortality score, do not demonstrate superiority over established ICU severity scores, and require external validation in larger multicentre cohorts before clinical deployment.