Early Prediction of Acute Respiratory Distress Syndrome in Critically Ill Polytrauma Patients Using Balanced Random Forest ML: A Retrospective Cohort Study

Nesrine Ben El Hadj Hassine; Sabri Barbaria; Omayma Najah; Halil İbrahim Ceylan; Muhammad Bilal; Lotfi Rebai; Raul Ioan Muntean; Ismail Dergaa; Hanene Boussi Rahmouni

doi:10.3390/jcm14248934

¹ Research Laboratory of Biophysics and Medical Technologies, Higher Institute of Medical Technologies of Tunis, Tunis El-Manar University, Tunis 1006, Tunisia

² Anesthesia and Intensive Care Department, Pierre-Wertheimer Neurological Hospital, 69000 Lyon, France

³ Physical Education and Sports Teaching Department, Faculty of Sports Sciences, Ataturk University, 25240 Erzurum, Türkiye

⁴ Centre for Responsible Innovation in Big Data and AI (BRAIN), Business School, Birmingham City University (BCU), Birmingham B15 3TN, UK

J. Clin. Med. 2025, 14(24), 8934; https://doi.org/10.3390/jcm14248934

This article belongs to the Section Respiratory Medicine

Abstract

Background/Objectives: Acute respiratory distress syndrome (ARDS) is a critical complication in polytrauma patients, characterized by diffuse lung inflammation and bilateral pulmonary infiltrates, with mortality rates reaching 45% in intensive care units (ICU). The heterogeneous nature of ARDS and its complex clinical presentation in severely injured patients pose substantial diagnostic challenges and call for early prediction tools that can guide timely interventions. Machine learning (ML) algorithms have emerged as promising approaches for clinical decision support and have outperformed traditional scoring systems at capturing complex patterns in high-dimensional medical data. To address the research gaps in early ARDS prediction for polytrauma populations, our study aimed to: (i) develop a balanced random forest (BRF) ML model for early ARDS prediction in critically ill polytrauma patients, (ii) identify the most predictive clinical features using ANOVA-based feature selection, and (iii) evaluate model performance using metrics suited to class imbalance.

Methods: This retrospective cohort study analyzed 407 polytrauma patients admitted to the ICU of the Center of Traumatology and Major Burns of Ben Arous, Tunisia, between 2017 and 2021. We implemented an ML pipeline that combines Tomek Links undersampling, ANOVA F-test selection of the top 10 predictive variables, and SMOTE oversampling with a conservative sampling rate of 0.3. The BRF classifier was trained with class weighting and evaluated using stratified 5-fold cross-validation. Performance metrics included AUROC, PR-AUC, sensitivity, specificity, F1-score, and Matthews correlation coefficient (MCC).

Results: Among 407 patients, 43 developed ARDS according to the Berlin definition, an incidence of 10.57%. The BRF model achieved an AUROC of 0.98, a sensitivity of 0.91, a specificity of 0.80, an F1-score of 0.84, and an MCC of 0.70. Precision–recall AUC reached 0.86, indicating robust performance despite class imbalance. During stratified cross-validation, AUROC values ranged from 0.93 to 0.99 across folds, indicating consistent model stability. The top 10 selected features were procalcitonin, PaO₂ at ICU admission, 24-h pH, massive transfusion, total fluid resuscitation, pneumothorax, alveolar hemorrhage, pulmonary contusion, hemothorax, and flail chest injury.

Conclusions: Our BRF model provides a robust, clinically applicable tool for early prediction of ARDS in polytrauma patients using readily available clinical parameters. The two-step resampling approach, combined with ANOVA-based feature selection, addressed class imbalance while maintaining high predictive accuracy. These findings support integrating ML approaches into critical care decision-making to improve patient outcomes and resource allocation. External validation in diverse populations remains essential to confirm generalizability before clinical implementation.

Keywords: acute respiratory distress syndrome; ARDS prediction; balanced random forest; class imbalance; feature selection; intensive care unit; ML; polytrauma; predictive modeling; SMOTE
