Prediction of Acute Respiratory Distress Syndrome in Traumatic Brain Injury Patients Based on Machine Learning Algorithms

Abstract Background: Acute respiratory distress syndrome (ARDS) commonly develops in traumatic brain injury (TBI) patients and is a risk factor for poor prognosis. We designed this study to evaluate the performance of several machine learning algorithms for predicting ARDS in TBI patients. Methods: TBI patients from the Medical Information Mart for Intensive Care-III (MIMIC-III) database were eligible for this study. ARDS was identified according to the Berlin definition. Included TBI patients were divided into a training cohort and a validation cohort at a ratio of 7:3. Several machine learning algorithms were used to develop predictive models for ARDS with five-fold cross validation, including extreme gradient boosting (XGBoost), light gradient boosting machine, random forest, adaptive boosting (AdaBoost), complement naïve Bayes, and support vector machine. The performance of the machine learning algorithms was evaluated by the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, accuracy and F score. Results: A total of 649 TBI patients from the MIMIC-III database were included, with an ARDS incidence of 49.5%. The random forest performed the best in predicting ARDS in the training cohort with an AUC of 1.000. XGBoost and AdaBoost ranked second and third with AUCs of 0.989 and 0.815, respectively, in the training cohort. The random forest still performed the best in predicting ARDS in the validation cohort with an AUC of 0.652. AdaBoost and XGBoost ranked second and third with AUCs of 0.631 and 0.620, respectively, in the validation cohort. Several top features shared by the random forest and AdaBoost were identified, including age, initial systolic blood pressure and heart rate, Abbreviated Injury Score chest, white blood cells, platelets, and international normalized ratio. Conclusions: The random forest and AdaBoost based models have stable and good performance for predicting ARDS in TBI patients. 
These models could help clinicians to evaluate the risk of ARDS in early stages after TBI and consequently adjust treatment decisions.


Introduction
Traumatic brain injury (TBI) is a major health concern that places a huge burden on society. It has been estimated that 69 million people suffer TBI annually around the world [1]. The poor prognosis of TBI is attributable not only to the severity of intracranial injury and concomitant extracranial trauma but also to various organ dysfunctions such as acute kidney injury, coagulopathy, respiratory failure and acute respiratory distress syndrome (ARDS) [2]. Previous studies have shown that ARDS is a common pulmonary complication in TBI patients, with an incidence ranging from 1% to 60% [2,3]. ARDS has also been confirmed in some studies as a risk factor for poor prognosis, including higher mortality, poorer neurological outcome and longer length of hospital stay [4][5][6][7]. Studies have explored risk factors for ARDS in TBI patients, including younger age, male sex, admission tachycardia, underlying respiratory and vascular diseases, pneumonia, head AIS, early crystalloids, early platelet transfusion and intracranial hypertension [6][7][8]. However, one recent meta-analysis indicated that age, male gender, white race, head AIS, Marshall CT score, GCS on admission, and increased intracranial pressure during hospitalization were not significant predictors of ARDS in TBI [9]. Exploring potential risk factors for ARDS after TBI and identifying patients at higher risk of ARDS in the early stage after injury is important for clinicians to devise optimal treatment strategies, including setting appropriate ventilator parameters. Avoiding the development or progression of ARDS in clinical practice may improve the prognosis of TBI patients. To our knowledge, no study has developed a model to evaluate the risk of ARDS in TBI patients. Machine learning algorithms perform well in predicting patient outcomes because of their advantages in dealing with complex data and nonlinear relationships. 
We designed this study to evaluate the performance of different machine learning algorithms when predicting ARDS in TBI patients.

Patients
This study included patients derived from the Medical Information Mart for Intensive Care-III (MIMIC-III) database. Produced by the computational physiology laboratory of Massachusetts Institute of Technology (MIT) (Cambridge, MA, USA), this freely available database collects electronic medical records of patients hospitalized in the Beth Israel Deaconess Medical Center (BIDMC) (Boston, MA, USA) between 2001 and 2012 and has received ethical approval from the institutional review boards of MIT and BIDMC. Patients in the MIMIC-III were deidentified and anonymized to protect personal privacy. Our study extracted patients diagnosed with TBI from the MIMIC-III based on ICD-9 codes (80000-80199; 80300-80499; 8500-85419). TBI patients were excluded from this study if they met any of the following criteria: (1) age < 18; (2) lacked records of Glasgow Coma Scale (GCS) on admission; (3) lacked records of vital signs and laboratory tests; (4) Abbreviated Injury Score (AIS) head < 3; (5) lacked records of arterial oxygen pressure (PaO2) and corresponding fraction of inspired oxygen (FiO2) (Figure 1). A total of 649 TBI patients were finally included in our study. The study was designed and conducted to comply with the ethical standards of the Helsinki declaration. The study design was approved by the ethical committee of West China Hospital (2021-1598).

Study Variables
Age, gender and underlying diseases including diabetes, hypertension, hyperlipidemia, coronary heart disease, liver disease, chronic renal disease, and malignancy were included. Initial vital signs including systolic blood pressure (SBP), diastolic blood pressure (DBP), heart rate, and respiratory rate were recorded. Glasgow Coma Score (GCS), Abbreviated Injury Score (AIS) of chest, and Injury Severity Score (ISS) were collected to reflect the severity of injuries. Laboratory tests from the first blood sample after admission were selected as features, including white blood cells (WBCs), platelets, red blood cells (RBCs), hemoglobin, glucose, blood urea nitrogen, serum creatinine, serum sodium, serum potassium, serum chloride, serum calcium, prothrombin time, and international normalized ratio (INR). Initial ventilation-related parameters including PaO2, FiO2, and the PaO2/FiO2 ratio were extracted. Intracranial injury types were recorded, including epidural hematoma (EDH), subdural hematoma (SDH), subarachnoid hemorrhage (SAH), and intraparenchymal hemorrhage (IPH). Medical treatments during the first day after admission were collected, including RBC transfusion, platelet transfusion, anticoagulant use, antiplatelet use and vasopressor use. Records of mechanical ventilation and neurosurgical operation were collected. A total of 40 features were finally included in the process of developing machine learning models. The outcome of this study was the development of ARDS, which was diagnosed based on the Berlin definition [10].
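The oxygenation part of the Berlin definition can be sketched in a few lines. This is a minimal illustration, not the extraction code used in the study: the function names are hypothetical, FiO2 is assumed to be recorded as a fraction, and the timing, imaging, origin-of-edema and PEEP criteria of the definition are deliberately not checked.

```python
from typing import Optional


def pf_ratio(pao2_mmhg: float, fio2_fraction: float) -> float:
    """Return PaO2/FiO2 in mmHg; FiO2 is a fraction in (0, 1], e.g. 0.4 for 40%."""
    if not 0.0 < fio2_fraction <= 1.0:
        raise ValueError("FiO2 must be a fraction in (0, 1]")
    return pao2_mmhg / fio2_fraction


def berlin_severity(pf: float) -> Optional[str]:
    """Map a PaO2/FiO2 ratio to the Berlin oxygenation category.

    Returns None when the ratio is above 300 mmHg (oxygenation criterion not met).
    Only the oxygenation thresholds are applied here (assumption of this sketch).
    """
    if pf <= 100:
        return "severe"
    if pf <= 200:
        return "moderate"
    if pf <= 300:
        return "mild"
    return None


# Hypothetical blood gas: PaO2 of 80 mmHg on 50% oxygen gives a ratio of 160.
print(berlin_severity(pf_ratio(80.0, 0.5)))  # moderate
```

A ratio alone is not a diagnosis; in a real pipeline the other Berlin criteria would have to be confirmed from the clinical record.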

Statistical Analysis
The normality of collected variables was determined by the Kolmogorov-Smirnov test. Variables with normal distribution and non-normal distribution were presented as mean ± standard deviation and median (interquartile range), respectively. Categorical variables were shown as counts (percentage). Differences in continuous variables between the ARDS group and the non-ARDS group were analyzed by the Student's t-test or the Mann-Whitney U test, as appropriate. Differences in categorical variables between the two groups were analyzed by the Chi-square test or the Fisher exact test. A p value < 0.05 was considered statistically significant.
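As an illustration of the categorical comparison, the sketch below computes the Pearson Chi-square statistic and its p value for a 2 × 2 table using only the standard library. The counts are hypothetical and the function name is an assumption; the study itself used a statistical platform, and a dedicated statistics library would normally be preferred in practice.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom, the upper tail of the
    chi-square distribution equals erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column needs at least one observation")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Hypothetical counts (NOT study data): a treatment given to 30/100 patients
# in one group and 15/100 patients in the other.
stat, p = chi_square_2x2(30, 70, 15, 85)
print(f"chi2 = {stat:.3f}, p = {p:.4f}, significant at 0.05: {p < 0.05}")
```

When expected cell counts are small, the Chi-square approximation is unreliable, which is the usual reason for switching to the Fisher exact test.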

Machine Learning Algorithms
TBI patients included from the MIMIC-III dataset were randomly divided into the training set and the validation set at a ratio of 7:3. Six machine learning algorithms were trained with five-fold cross validation in the training set to predict ARDS: extreme gradient boosting (XGBoost), light gradient boosting machine (light GBM), random forest, adaptive boosting (AdaBoost), complement naïve Bayes, and support vector machine (SVM). The optimal parameters of each machine learning algorithm were automatically explored during the cross-validation process. The trained predictive models were then verified in the validation set by evaluating multiple indexes, including area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and F1 score. The SHapley Additive exPlanations (SHAP) method was used to evaluate feature importance in the machine learning predictive models and to provide a visual explanation of the models. All statistical analyses and figures were produced using Extreme smart analysis, an online statistical analysis platform based on Python (Amsterdam, The Netherlands).
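The data partitioning described above can be sketched as follows. This is a minimal standard-library illustration under stated assumptions: the function names and the random seed are hypothetical, the split is a plain random shuffle, and the resulting cohort sizes come from rounding rather than from figures reported in the text.

```python
import random
from typing import List, Tuple


def split_7_3(n: int, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly split patient indices 0..n-1 into training (70%) and validation (30%)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(0.7 * n)
    return idx[:cut], idx[cut:]


def k_folds(indices: List[int], k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Return k (fit, held_out) index pairs; each index is held out exactly once."""
    folds = [indices[i::k] for i in range(k)]
    pairs = []
    for i in range(k):
        held_out = folds[i]
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        pairs.append((fit, held_out))
    return pairs


train, valid = split_7_3(649)            # 649 included patients
print(len(train), len(valid))            # sizes implied by rounding, not reported
for fit, held_out in k_folds(train, k=5):
    print(len(fit), len(held_out))       # hyperparameters are tuned on these folds only
```

The validation indices are never touched during cross validation, so the validation metrics remain an estimate on data the tuned model has not seen.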

Comparison between Final Included Patients and Those Lacking Records of PaO2 and FiO2
A total of 2031 patients were excluded from this study due to the following criteria: (1) age < 18 (n = 32); (2) lacked records of GCS on admission (n = 65); (3) lacked records of vital signs and laboratory tests (n = 116); (4) AIS head < 3 (n = 187); (5) lacked records of PaO2 and corresponding FiO2 (n = 1631) (Figure 1). Finally, 649 TBI patients were included, with an ARDS incidence of 49.5%. A large number of TBI patients (1631/2680, 60.86%) were excluded due to a lack of records of PaO2 and corresponding FiO2. The comparison between these patients and the finally included patients is shown in Supplementary Table S1. Compared with these excluded patients, the finally included patients mainly had severe TBI (GCS: 6 (3-9), median (quartiles)) and were younger (59.3 vs. 66.8, p < 0.001). The incidences of RBC transfusion (p < 0.001), antiplatelet transfusion (p < 0.001), vasopressor use (p < 0.001), mechanical ventilation (p < 0.001) and neurosurgery (p < 0.001) were higher in the finally included patients. They also had higher mortality than the excluded patients (29.4% vs. 13.1%, p < 0.001). These differences indicated that the included population of this study mainly had severe TBI.
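The patient flow figures can be checked with simple arithmetic. The snippet below uses only the counts stated in this section; the dictionary keys are shorthand labels introduced for illustration.

```python
# Exclusion counts as stated in the text (labels are shorthand for this sketch).
exclusions = {
    "age < 18": 32,
    "no GCS on admission": 65,
    "no vital signs or laboratory tests": 116,
    "AIS head < 3": 187,
    "no PaO2 with corresponding FiO2": 1631,
}
included = 649

excluded = sum(exclusions.values())
screened = excluded + included
share_missing_gas = exclusions["no PaO2 with corresponding FiO2"] / screened

print(excluded, screened, f"{share_missing_gas:.2%}")  # 2031 2680 60.86%
```

The totals agree with the reported 2031 excluded patients and the 1631/2680 (60.86%) proportion.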

Baseline Characteristics of Included TBI Patients
Among included patients, the age of the ARDS group was higher than that of the non-ARDS group (p = 0.027) (Table 1). Comorbidities did not significantly differ between the ARDS group and the non-ARDS group. The ARDS group had higher AIS chest (p < 0.001) and ISS (p = 0.009) than the non-ARDS group. The GCS did not differ significantly between the two groups (p = 0.724). Laboratory tests showed that platelets (p = 0.004) were lower in the ARDS group, while prothrombin time (p = 0.002) and INR (p < 0.001) were higher in the ARDS group. The initial PaO2 (p < 0.001) was lower in the ARDS group, while the initial FiO2 did not differ significantly between the two groups (p = 0.874). The initial PaO2/FiO2 ratios of the non-ARDS group and the ARDS group were 356 and 248, respectively. The percentages of mild, moderate and severe ARDS among all ARDS patients were 43.5%, 39.7% and 16.7%, respectively (Figure 2). Compared with the non-ARDS group, the ARDS group was more likely to receive platelet transfusion during the first day (p = 0.0027). Finally, the ARDS group had a longer length of ICU stay (p < 0.001) and length of hospital stay (p < 0.001).

Performance of Machine Learning Algorithms for Predicting ARDS in TBI
The random forest performed the best in predicting ARDS in the training cohort with an AUC value of 1.000 (Table 2) (Figure 3A). The accuracy, sensitivity, specificity, PPV, NPV, and F1 score of the random forest in the training cohort were 0.998, 1.000, 1.000, 1.000, 0.997 and 1.000, respectively. XGBoost and AdaBoost ranked second and third with AUCs of 0.989 and 0.815, respectively. The random forest still performed the best in predicting ARDS in the validation cohort with an AUC value of 0.652 (Table 3) (Figure 3B). The accuracy, sensitivity, specificity, PPV, NPV, and F1 score of the random forest in the validation cohort were 0.542, 0.719, 0.579, 0.767, 0.526, and 0.716, respectively. AdaBoost and XGBoost ranked second and third with AUCs of 0.631 and 0.620, respectively. Generally, the random forest performed well and stably in predicting ARDS in both the training cohort and the validation cohort. AdaBoost was second only to the random forest, while XGBoost showed markedly different performance between the training cohort and the validation cohort.
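For readers less familiar with the indexes reported above, the sketch below shows how they are derived from predicted labels and predicted scores. The labels, scores and the 0.5 threshold are hypothetical toy values, not study data, and the function names are illustrative.

```python
from typing import Dict, Sequence


def _ratio(num: float, den: float) -> float:
    """Safe division: NaN when the denominator is zero."""
    return num / den if den else float("nan")


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, PPV, NPV and F1 for binary labels (1 = ARDS)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


# Hypothetical toy example (NOT study data).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.5]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
print(confusion_metrics(y_true, y_pred))
print(auc(y_true, scores))  # 0.875
```

AUC depends only on how the scores rank patients, whereas the other indexes also depend on the chosen classification threshold.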

Important Features in Machine Learning Algorithms for Predicting ARDS in TBI
Feature importances derived from the random forest and AdaBoost are shown in Figure 4A,B, respectively. The SHAP values of all patients' outputs in the random forest model and the AdaBoost model are presented in Figure 4C,D. The common important features in the random forest and AdaBoost were analyzed. Figure 5A shows that 15 common features were found among the top 20 features of these two algorithms: platelet, INR, AIS chest, heart rate, DBP, WBC, age, serum chloride, hemoglobin, SBP, SDH, respiratory rate, serum sodium, GCS, and RBC. Figure 5B shows that seven common features were found among the top ten features of these two algorithms: platelet, INR, AIS chest, heart rate, WBC, age, and SBP.
Figure 5. (B) Venn diagram of the top ten features in random forest and AdaBoost. There were seven common features in these two algorithms including platelet, INR, AIS thoracic, heart rate, WBC, age, and SBP. INR, international normalized ratio; AIS, Abbreviated Injury Score; DBP, diastolic blood pressure; WBC, white blood cell; SBP, systolic blood pressure; SDH, subdural hematoma; GCS, Glasgow Coma Scale; RBC, red blood cell; CHD, coronary heart disease.
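The overlap analysis amounts to a set intersection of two ranked feature lists. In the sketch below, the helper name and the toy rankings are hypothetical; the two shared-feature sets are copied from the text. Any feature shared by both top-ten lists must also be shared by both top-20 lists, which gives a simple consistency check.

```python
from typing import List, Set


def common_top_k(rank_a: List[str], rank_b: List[str], k: int) -> Set[str]:
    """Features that appear in the top k of both importance rankings."""
    return set(rank_a[:k]) & set(rank_b[:k])


# Shared features as listed in the text.
shared_top20 = {
    "platelet", "INR", "AIS chest", "heart rate", "DBP", "WBC", "age",
    "serum chloride", "hemoglobin", "SBP", "SDH", "respiratory rate",
    "serum sodium", "GCS", "RBC",
}
shared_top10 = {"platelet", "INR", "AIS chest", "heart rate", "WBC", "age", "SBP"}
print(len(shared_top20), len(shared_top10), shared_top10 <= shared_top20)  # 15 7 True

# Toy rankings with placeholder names (hypothetical, not model output).
demo_a = ["a", "b", "c", "d"]
demo_b = ["b", "a", "e", "c"]
print(sorted(common_top_k(demo_a, demo_b, 2)))  # ['a', 'b']
```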

Discussion
The incidence of ARDS in the study was 49.5%, which falls within the previously reported incidence of ARDS in TBI ranging from 1% to 60% [2,3,9]. The actual incidence of ARDS in TBI patients from the MIMIC-III database may be lower than 49.5% because a large number of TBI patients that lacked relevant records were excluded from this study. The significant variation in the reported incidence of ARDS in TBI may be attributable to differences in injury severity, treatment strategy and healthcare level among medical centers. Compared with the non-ARDS group, the ARDS group in the study had a longer length of ICU stay and length of hospital stay. The 30-day mortality did not differ significantly between these two groups. This finding contradicts one meta-analysis, which indicated that the survival proportion was significantly higher in TBI patients without ARDS than in those with ARDS [9]. The lack of a significant survival difference between these two groups in our study may be caused by the exclusion of a large number of mild TBI patients. Due to the high prevalence of ARDS and its poor prognosis, exploring risk factors for ARDS and evaluating the risk of developing ARDS in the early phase after TBI is necessary to decrease the possibility of developing ARDS and to improve the prognosis of TBI patients.
In this study, among several machine learning algorithms, the random forest and AdaBoost achieved good and stable performance in predicting ARDS in both the training cohort and the validation cohort. Trained with the bagging method, the random forest is an ensemble classifier composed of multiple decision trees. It aggregates the votes of the individual trees and takes the category with the most votes as the final output. The boosting method combines many weak classifiers to produce a powerful classifier and thereby improve the predictive accuracy of the final model. As a classical boosting algorithm, AdaBoost has a high detection rate and is not prone to overfitting. A total of seven shared features were found among the top ten features in the random forest and AdaBoost: platelet, INR, AIS chest, heart rate, WBC, age, and SBP. Platelet count and INR are essential components of coagulation testing. Previous studies showed that coagulation disorders are prevalent in TBI, with the incidence of coagulopathy ranging from 13% to 54% [11][12][13][14][15]. Indeed, the coagulation system plays an important role in the pathophysiological process of ARDS [16,17]. The imbalance between inflammation and coagulation leads to an inflammatory response, formation of microthrombi and diffuse deposition of fibrin in the pulmonary capillary bed and alveoli [18,19]. As a key element in ARDS development, the process of immunothrombosis involves many kinds of cells, including platelets, neutrophils, and endothelial cells [18]. Correspondingly, the WBC is another important feature for predicting ARDS in our random forest and AdaBoost models.
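The two aggregation rules described at the start of the preceding paragraph can be made concrete with a short sketch. It assumes the classic two-class AdaBoost formulation with labels coded as -1 and +1; the function names, votes and error rates are hypothetical, and library implementations may use a different variant.

```python
import math
from collections import Counter
from typing import Sequence


def majority_vote(tree_votes: Sequence[int]) -> int:
    """Random-forest style aggregation: return the most frequent class label."""
    return Counter(tree_votes).most_common(1)[0][0]


def adaboost_alpha(weighted_error: float) -> float:
    """Vote weight of one weak learner in classic two-class AdaBoost."""
    if not 0.0 < weighted_error < 1.0:
        raise ValueError("weighted error must lie strictly between 0 and 1")
    return 0.5 * math.log((1.0 - weighted_error) / weighted_error)


def weighted_vote(preds: Sequence[int], alphas: Sequence[float]) -> int:
    """AdaBoost style aggregation for labels in {-1, +1}: sign of the weighted sum."""
    score = sum(a * p for a, p in zip(alphas, preds))
    return 1 if score >= 0 else -1


# Five hypothetical trees voting on one patient (1 = ARDS, 0 = no ARDS).
print(majority_vote([1, 0, 1, 1, 0]))  # 1

# Three hypothetical weak learners: one accurate (error 0.1), two weak (error 0.4).
alphas = [adaboost_alpha(e) for e in (0.1, 0.4, 0.4)]
print(weighted_vote([+1, -1, -1], alphas))  # 1
```

In the second example the single accurate learner outweighs the two weaker ones, whereas an unweighted majority vote over the same three predictions would return -1.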
In addition to the coagulation indexes and WBC, heart rate and SBP, which may collectively reflect tissue perfusion, were also important features in the machine learning based models. Shock would decrease the transport of blood and oxygen to pulmonary tissue and accelerate lung injury. Finally, AIS chest and age were important features in the machine learning based models. One previous study confirmed rib fracture as a risk factor for ARDS after mild TBI [20]. Thoracic trauma may cause direct mechanical damage to the pulmonary tissue or increase the risk of pneumonia by restricting respiratory amplitude. One epidemiological study with a large sample size found that younger age was significantly associated with a higher risk of ARDS in isolated severe TBI patients, while another study showed that elderly trauma patients had a higher risk of ARDS than non-elderly trauma patients [7,21]. The influence of age on ARDS occurrence and the corresponding mechanism in TBI patients is still worth investigating. Composed of the above-mentioned important features, random forest or AdaBoost based models may be effective in predicting the risk of ARDS in TBI patients.
This study has several limitations. Firstly, this was a single center database study, and a large number of patients were excluded due to a lack of records of included variables. This selection bias could not be avoided. Most of the excluded patients had mild to moderate TBI. Therefore, this study mainly investigated the incidence of ARDS in severe TBI, and the predictive models may be more suitable for use in severe TBI. Future studies with larger sample sizes are needed to evaluate the predictive performance of machine learning models in a more general TBI population. Secondly, machine learning models were developed and internally validated using the same dataset from a single medical center.
These models should be externally validated in other medical centers in future studies. Thirdly, although SHAP values provide a visual explanation of machine learning models, it is still difficult for clinicians to evaluate the risk of ARDS in clinical practice. It is worthwhile to develop a practical application incorporating the random forest or AdaBoost algorithm that could run on portable electronic devices and provide an estimated probability of ARDS for TBI patients.

Conclusions
Machine learning algorithms identified several predictive factors for ARDS in TBI, including age, initial systolic blood pressure and heart rate, AIS chest, WBCs, platelets, and INR. The random forest and AdaBoost based models perform efficiently and stably in the prediction of ARDS in TBI patients. These models could help clinicians to evaluate the risk of ARDS in the early stage after TBI, and consequently adjust treatment strategies to prevent the development of ARDS during hospitalization for TBI.