Prediction of Intraparenchymal Hemorrhage Progression and Neurologic Outcome in Traumatic Brain Injury Patients Using Radiomics Score and Clinical Parameters

(1) Background: Radiomics analysis of spontaneous intracerebral hemorrhages on computed tomography (CT) images has been proven effective in predicting hematoma expansion and poor neurologic outcome. In contrast, there is limited evidence on its predictive abilities for traumatic intraparenchymal hemorrhage (IPH). (2) Methods: A retrospective analysis of 107 traumatic IPH patients was conducted. Among them, 45 patients (42.1%) showed hemorrhagic progression of contusion (HPC) and 51 patients (47.7%) had poor neurological outcome. The IPH on the initial CT was manually segmented for radiomics analysis. After feature extraction, selection and repeatability evaluation, several machine learning algorithms were used to derive radiomics scores (R-scores) for the prediction of HPC and poor neurologic outcome. (3) Results: The AUCs for R-scores alone to predict HPC and poor neurologic outcome were 0.76 and 0.81, respectively. Clinical parameters were used to build comparison models. For HPC prediction, variables including age, multiple IPH, subdural hemorrhage, Injury Severity Score (ISS), international normalized ratio (INR) and IPH volume taken together yielded an AUC of 0.74, which was significantly (p = 0.022) increased to 0.83 after incorporation of the R-score in a combined model. For poor neurologic outcome prediction, clinical variables of age, Glasgow Coma Scale, ISS, INR and IPH volume showed high predictability with an AUC of 0.92, and further incorporation of the R-score did not improve the AUC. (4) Conclusion: The results suggest that radiomics analysis of IPH lesions on initial CT images has the potential to predict HPC and poor neurologic outcome in traumatic IPH patients. The clinical and R-score combined model further improves the performance of HPC prediction.


Introduction
Traumatic brain injury (TBI), most commonly caused by unintentional falls and motor vehicle crashes, is a serious condition that results in neuropsychiatric impairment, disability and death [1]. The annual incidence of TBI is estimated at up to 939 per 100,000 people worldwide [2]. At the moment of blunt head injury, primary brain injury occurs when the brain impacts the surrounding confines of the skull and dura, which ruptures neurons and glial cells [3]. The high kinetic energy also fractures the microvessels in the brain and causes extravasation, leading to the high-attenuating contusion hemorrhage observed on non-contrast computed tomography (CT) [4]. Patients with contusion hemorrhages, also called traumatic intraparenchymal hemorrhages (IPH), typically receive non-surgical management initially if there is no significant mass effect. However, hematomas may expand during hospitalization following a secondary brain injury process [5,6]. In addition to dysfunctional hemostasis, one proposed mechanism is the activation of specificity protein 1 and nuclear factor-κB and the upregulation of sulfonylurea receptor 1 on endothelial cells that received a lower kinetic energy [3]. A further increase in blood-brain barrier permeability aggravates vasogenic edema and causes oncotic cell death and capillary fragmentation, which leads to the hematoma expansion observed on subsequent CT images.
CT images not only provide radiological information for visual interpretation but also contain quantitative semantic and agnostic features that can be extracted with radiomics tools. Machine learning models generated from selected radiomics features have been shown to correlate well with clinically important diagnostic or prognostic outcomes [20]. For spontaneous intracerebral hemorrhages, extensive studies have demonstrated the good predictive capability of models derived from radiomics features for hematoma expansion [21-25] and poor neurological outcome [24,26-30]. Furthermore, increased c-statistics were consistently observed after the radiomics features were combined with clinical and radiological parameters.
Although the good prognostic value of radiomics has been reported for spontaneous IPH, research evidence on traumatic IPH is scarce [31,32]. The purpose of this study was to demonstrate the feasibility of radiomics to predict hemorrhagic progression of contusion (HPC) and poor neurologic outcome in traumatic IPH patients. Firstly, radiomics models were built from the IPH lesions shown on the initial CT examinations, and their predictions were compared to those obtained using conventional clinical parameters and lesion volume information. Secondly, the radiomics score (R-score) was added to the clinical parameters to build a combined model, and the performance was compared to assess the added value of the R-score in outcome prediction.

Patient Identification, Baseline Parameters, and Outcomes' Definitions
This retrospective cohort analysis was conducted by the identification of patients from the TBI database in Chi Mei Medical Center and its branch hospitals, which consisted of patients entering the emergency room (ER) due to head injuries from 2015 to 2017. The institutional review board approved this study and waived the requirement for informed consent. After reviewing the medical charts and images, there were initially 1110 adult patients who had CT-documented intracranial hemorrhages associated with blunt head trauma. Among them, 756 patients had extra-axial hemorrhages only and were excluded from the analysis. IPH was found on CT images of 354 patients. A follow-up CT was ordered either due to neurological deterioration or routinely on the following days, based on the decisions of neurosurgeons. Patients without a follow-up CT scan within 96 h from the initial CT scan were excluded. Furthermore, patients showing midline shifts more than 5 mm on initial CT, brain herniation syndromes, surgical intervention between the initial and follow-up CT scans or cerebral aneurysms were also excluded. Finally, a total of 107 patients who received non-operative management after initial CT scans were eligible for analysis. The patient selection flow diagram is shown in Figure 1.

Baseline patient characteristics included the following parameters: age, sex, cause of head injury (falling or motor vehicle collision), single or multiple IPH lesions, the total volume of IPH on initial CT, concurrent extra-axial hemorrhages (epidural, subdural, subarachnoid and intraventricular), laboratory data (platelet count, international normalized ratio (INR) and activated partial thromboplastin time ratio), systolic blood pressure at ER, Glasgow Coma Scale (GCS) at ER, Injury Severity Score (ISS), comorbid conditions (hypertension and diabetes mellitus) and antiplatelet medications. No patients were on anticoagulants. The distribution of baseline parameters is detailed in Tables 1 and 2.
The first outcome was the occurrence of HPC. HPC was determined from the two CT examinations and defined as a more than 30% relative volume increase or a more than 10 mL absolute volume increase on the follow-up CT compared to the initial CT in a patient, according to the criteria used in prior studies [11,14,16]. The second outcome was determined based on the Glasgow Outcome Scale (GOS) at three-month intervals, and further dichotomized as either poor (1 to 3) or good (4 and 5).
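The two outcome definitions above can be written down as a short sketch. The function names, argument names and the choice to take total IPH volumes in mL are illustrative assumptions and are not part of the study's software.

```python
def is_hpc(initial_ml, followup_ml, rel_threshold=0.30, abs_threshold_ml=10.0):
    """Return True when the volume change meets the HPC criteria in the text.

    HPC: more than 30% relative increase OR more than 10 mL absolute increase
    of the IPH volume on the follow-up CT compared to the initial CT.
    """
    if initial_ml <= 0:
        raise ValueError("initial IPH volume must be positive")
    absolute_increase = followup_ml - initial_ml
    relative_increase = absolute_increase / initial_ml
    return relative_increase > rel_threshold or absolute_increase > abs_threshold_ml


def is_poor_outcome(gos):
    """Dichotomize the Glasgow Outcome Scale: 1 to 3 is poor, 4 and 5 is good."""
    if gos not in (1, 2, 3, 4, 5):
        raise ValueError("GOS must be an integer from 1 to 5")
    return gos <= 3


if __name__ == "__main__":
    # Hypothetical volumes, for illustration only.
    print(is_hpc(10.0, 13.5))   # relative criterion met
    print(is_hpc(50.0, 61.0))   # absolute criterion met
    print(is_poor_outcome(3))
```

Both thresholds are strict inequalities, following the "more than" wording of the definition.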

CT Protocols
Multi-detector CT scanners (SOMATOM Definition AS, SOMATOM Sensation 64, and SOMATOM Emotion 16, Siemens Healthineers) were used for image acquisition. The standard brain CT was acquired with tube voltage and tube current between 80-120 kVp and 250-300 mAs, respectively. Image coverage was from the occipital bone to the vertex. Image matrix dimensions ranged from 512 × 512 × 28 to 512 × 512 × 46. The slice thickness ranged from 3.6 mm to 5.0 mm and the in-plane resolution varied from 0.38 × 0.38 mm² to 0.49 × 0.49 mm².

Image Segmentation
Segmentation was performed manually by tracing the hyperdense IPH region of interest (ROI) on every axial slice of the initial and follow-up CT scans with ImageJ software (National Institutes of Health). Efforts were made to ensure that the segmented ROI did not include any nearby hyperdense regions such as bone, dura or extra-axial hemorrhages. A board-certified neuroradiologist (Reader A, 6 years of experience) performed the segmentation and the results were verified by a senior board-certified neuroradiologist (Reader C, 21 years of experience). To evaluate the intra-reader and inter-reader agreements on segmentation results, the ROIs of 30 cases were delineated again by Reader A after an interval of 2 months, and independently by another neuroradiologist (Reader B, 4 years of experience). The Dice coefficient was used for comparison of the segmentation results.
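For reference, the Dice coefficient between two segmentations A and B is 2|A ∩ B| / (|A| + |B|). The sketch below is a minimal illustration that assumes each mask is given as a collection of voxel coordinates; the function name and the convention for two empty masks are assumptions, not the authors' implementation.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two collections of voxel coordinates."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    # Hypothetical toy masks, given as (slice, row, column) voxel indices.
    reader_a = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
    reader_b = {(0, 0, 1), (0, 1, 1), (0, 2, 1)}
    print(dice_coefficient(reader_a, reader_b))
```

A value of 1 means identical masks and 0 means no overlap, so the reported averages of 0.857 and 0.773 sit toward the high end of the scale.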

Radiomics Features' Extraction and Selection
To minimize the variability of extracted features, the Hounsfield units (HU) for each set of CT images were rescaled to the range from −1024 HU to 3071 HU. The CT images were converted to the nearly raw raster data (NRRD) format and processed in MATLAB 2020a (The MathWorks). Feature extraction was performed using PyRadiomics. The classes of features were selected from the PyRadiomics library, including the first-order statistics, the shape-based parameters and the second-order texture features of Grey-Level Co-Occurrence Matrix, Grey-Level Run Length Matrix, Grey-Level Size Zone Matrix and Grey-Level Difference Matrix. Finally, a total of 107 radiomics features were extracted from each lesion ROI. To identify the uncorrelated features with maximum relevance, feature selection was performed using a support vector machine (SVM) with the Gaussian kernel. The intraclass correlation coefficients (ICC) for the selected radiomics features were calculated.

Radiomics Score and Performance Evaluation
The selected radiomics features were used to build R-score models for prediction of HPC and poor GOS outcome, by using two methods: SVM with the Gaussian kernel and random subspace k-nearest neighbors (KNN) classifiers. The choices of the adopted models were made after testing various SVM, KNN, decision trees and discriminant algorithms. We conducted a 10-fold cross-validation process to prevent overfitting, whereby 90% of cases were randomly selected as the training set and the remaining 10% as the testing set. This procedure was repeated ten times to obtain the average results. The prediction thresholds for HPC and poor GOS were both set at R-score of ≥0.5. The receiver operating characteristic (ROC) curves and areas under the ROC curves (AUC) were used to evaluate the performance of the created radiomics models.
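The evaluation steps described above (splitting cases into ten folds, thresholding the R-score at 0.5 and computing the AUC) can be sketched in plain Python. This is an illustrative sketch only: the helper names and the fixed random seed are assumptions, the AUC uses the pairwise-ranking definition rather than any particular toolbox routine, and the classifier training that would produce the R-scores is deliberately left out.

```python
import random


def k_fold_indices(n_cases, k=10, seed=0):
    """Shuffle case indices and split them into k nearly equal test folds."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classify(scores, threshold=0.5):
    """Label a case positive when its R-score is at or above the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


if __name__ == "__main__":
    folds = k_fold_indices(107, k=10)
    print([len(fold) for fold in folds])  # each fold is the test set once
    # Hypothetical labels and scores, for illustration only.
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
    print(classify([0.72, 0.31]))
```

In each round, one fold would serve as the testing set and the remaining nine as the training set, with the per-fold results averaged at the end.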

Building of Combined Clinical-Radiomics Model
We performed multiple logistic regression analyses based on the R-score and the clinical parameters for the prediction of HPC and poor neurologic outcome. Baseline parameters that showed higher AUCs individually were selected as variables for clinical model establishment. We evaluated the predictive performances of clinical variables alone, in combination with initial total IPH volumes, and further in combination with R-score. Clinical parameters were modeled as categorical variables. Initial total IPH volume and R-score were modeled as continuous variables. Comparison of AUCs was evaluated with DeLong's test and p-values < 0.05 were considered significant. The overall analysis flowchart from segmentation, preprocessing, feature selection, model building and evaluation of their performances is shown in Figure 2.
Figure 2. Overall steps for the establishment of progressive hematoma and poor neurological outcome prediction models. Following manual segmentation of intraparenchymal hemorrhage on initial CT images and preprocessing, radiomics features were extracted, selected and modeled through machine learning algorithms. The performance of radiomics scores and combined clinical-volume models were analyzed with receiver operating characteristic curves. R-score, radiomics score.
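As a reading aid, the three nested logistic regression models compared in this section can be written as follows. The notation is an assumption introduced here for illustration: p is the predicted probability of the outcome, x_j are the categorical clinical variables, V is the initial total IPH volume, R is the R-score, and the β terms are fitted coefficients whose values are not given in the text.

```latex
\begin{aligned}
\text{clinical:}\quad & \ln\frac{p}{1-p} = \beta_0 + \sum_{j} \beta_j x_j \\
\text{clinical-volume:}\quad & \ln\frac{p}{1-p} = \beta_0 + \sum_{j} \beta_j x_j + \beta_V V \\
\text{combined:}\quad & \ln\frac{p}{1-p} = \beta_0 + \sum_{j} \beta_j x_j + \beta_V V + \beta_R R
\end{aligned}
```

Each model's predicted probabilities yield one ROC curve, and DeLong's test is then applied to pairs of AUCs.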

Baseline Patient Characteristics
As listed in Table 1, 45 patients (42.1%) in our studied population showed HPC. Patients showing HPC had significantly larger initial IPH volumes and more of them were in the ISS ≥ 25 group. As shown in Table 2, 51 patients (47.7%) had unfavorable outcomes of GOS 1 to 3. Older age, a falling injury, larger IPH volume, concurrent subarachnoid hemorrhage (SAH), concurrent intraventricular hemorrhage (IVH), thrombocytopenia, increased INR, lower GCS, higher ISS and history of hypertension were significantly associated with poor neurological outcome.

Repeatability of Segmentations
The average Dice coefficient was 0.857 for repeated segmentation of Reader A, and 0.773 for independent segmentation between the two readers. The selected radiomics features, and the intra- and inter-reader ICCs, are shown in the Supplementary Materials Table S1. When using the definition of ICC between 0.4 and 0.59 as fair, 0.60 to 0.74 as good and above 0.75 as excellent [33], all selected radiomics features had excellent intra-reader agreements. For inter-reader agreements, seven features were excellent, one was good and one was fair.

Performance of Radiomics Score with Case Examples
The average AUC obtained from 10-fold cross-validation for R-score alone to predict HPC was 0.7638, and the AUC for R-score alone to predict poor neurologic outcome was 0.8067. Details on accuracy, sensitivity and specificity are shown in Table 3. Two case examples of true positive and true negative R-score predictions for HPC are depicted in Figure 3. Another two case examples of false positive and false negative R-score predictions for HPC are depicted in Figure 4.

Combined Clinical-Radiomics Model for Prediction of Hemorrhagic Progression
Selected clinical variables including age, multiple IPH, concurrent subdural hemorrhage (SDH), ISS and INR for prediction of HPC showed an AUC of 0.7133. A clinical-volume model, created by adding initial total IPH volume as a variable, raised the AUC to 0.7412, a non-significant increase (p = 0.237). However, the further combination of the R-score with the clinical-volume model significantly increased the AUC to 0.8315 (p = 0.022). Figure 5 shows the ROC curves for the clinical-volume, R-score and combination models.

Prediction of Poor Neurologic Outcome
The clinical-volume model, built from selected clinical parameters of age, GCS, ISS, INR and initial total IPH volume for the prediction of poor neurologic outcome showed an AUC of 0.9247, which was higher than the R-score alone. The combination of the R-score with the clinical-volume model showed a non-significant change in the AUC at 0.9503 (p = 0.095). Figure 6 shows the ROC curves for the clinical-volume and R-score models.
Figure 5. ROC curves of the clinical-volume, the R-score, and the combined models for the prediction of hemorrhagic progression. * Denotes significant difference between the two models with p-value < 0.05; R-score, radiomics score; ROC, receiver operating characteristic.
Figure 6. ROC curves of the clinical-volume, the R-score and the combined models for the prediction of poor neurologic outcome; R-score, radiomics score; ROC, receiver operating characteristic.

Discussion
In this study, we set out to investigate the role of radiomics features in predicting hematoma progression and poor neurologic outcome in TBI patients. Our main finding is that the R-score, derived from selected radiomics features and modeled through SVM and KNN, can classify traumatic IPH patients as developing HPC or having poor neurological outcome with AUCs of 0.76 and 0.81, respectively. Further combining the R-score with the clinical and IPH volume parameters significantly increased the predictive performance for HPC, with an AUC of 0.83. The results demonstrate the predictive capability of radiomics for hematoma progression and poor outcome in the TBI setting.
Radiomics features are quantitative data obtained from medical images, including first-order statistics and shape-based and texture-related feature classes. The mathematically defined radiomics features represent the image characteristics of the ROI, and many of them cannot be discerned by visual interpretation. They have been shown to provide a signature of the lesion, and further modeling for diagnostic, prognostic or predictive purposes can be performed with machine learning algorithms based on the selected features [20]. Another approach is deep learning-based radiomics, which does not rely on mathematically predefined features but uses various neural networks to directly identify imaging information that is relevant to clinical problems. Compared with feature-based radiomics, deep learning-based radiomics may be advantageous in terms of repeatability and reproducibility. However, it requires a larger dataset, which is often unavailable on a clinical basis. The interpretability of the result is also problematic since the method is perceived as a "black box" [34]. Nevertheless, efforts are being made to overcome the shortcomings mentioned above. To ensure robustness of the selected features, we performed intra- and inter-reader analysis and demonstrated acceptable repeatability. For the choice of model-building algorithms, we conducted multiple tests using different machine learning algorithms to identify the best performing ones, using methods reported before [29].
The initial IPH volume is the simplest objective parameter that can be calculated from CT images. Other radiological features such as the black hole sign, swirl sign, heterogeneous density, blend sign, hypodensities, irregular shape and island sign are all known to be positively associated with hematoma expansion and poor neurological outcome [35]; however, the evaluations of these shape and density features are subjective. Radiomics, in contrast, can extract more objective features that are not retrievable by visual inspection and manual measurements. It has consistently been shown to be effective for the prediction of hematoma expansion and poor neurologic outcome in spontaneous IPH patients [21-30]. As for traumatic IPH, we herein demonstrated its usefulness to further expand the limited evidence currently available [31,32].
Since the R-score only considers features of the IPH lesion on CT images, it is paramount to incorporate other important clinical and radiological predictors. We selected pertinent variables generally recorded during clinical practice to avoid missing data and created a clinical-volume model for comparison. For HPC prediction, we found significant associations with higher ISS and larger volumes on univariate analyses. Multiple IPH, concurrent SDH, age and INR were also probable risk factors, although non-significant. A higher ISS indicates a more serious injury to the patient's whole body, which could be indirectly linked to more blood loss with dysfunctional hemostasis and the activation of systemic proinflammatory biomarkers related to neuroinflammation [36], and thus is a reported risk factor for HPC [8,13]. A larger initial IPH volume, as well as multiple IPH and concurrent SDH, indicates greater severity of the initial head trauma, thus increasing the volume of susceptible brain tissue for the secondary injury process and leading to hemorrhagic progression [6]. Therefore, these factors are known to pose higher risks of HPC [5,7,9,11-13,15-17]. Aging increases vulnerable brain tissue due to weakness of the microvasculature and decreased cerebral blood flow [5,37], which further adds to the risk of HPC [13,15-17,19]. Elevation of INR suggests coagulopathy and is positively correlated with HPC [10,17,18]. Our clinical-volume prediction model yielded an AUC of 0.74, which is similar to the results from others (0.72-0.77) [16,19]. The addition of the R-score significantly increased the predictive model's performance, demonstrating the favorable effect of adding radiomics features to predict HPC in traumatic IPH patients, as has been shown for spontaneous IPH patients [21,23-25].
Although the radiomics score can predict poor neurologic outcome with a decent AUC of 0.81, we found that it did not further improve the prediction of the clinical-volume model, which had already reached an AUC of 0.92 in our studied population. Age, GCS and ISS were notably strong prognostic determinants. Increased odds for poor outcome after TBI in elderly patients were frequently reported [38,39]. Comorbid conditions, reduced physical reserves and medication usage were possible reasons for poor outcomes. A lower GCS, mainly reflecting severe brain injury, was also observed to be associated with poor functional recovery [38,39]. ISS was reported to be an independent predictor for poor outcomes, which has been linked to respiratory failure [40]. Although not selected as parameters in our predictive model, comorbid conditions of hypertension and diabetes mellitus were associated with higher percentages of poor neurologic outcome in our studied population. Hypertension causes defective cerebral autoregulation; therefore, the cerebral blood flow decreases (ischemia) or increases (hyperemia) abnormally even with small changes in arterial pressure and aggravates the vulnerability of brain tissue [41]. Diabetes mellitus contributes to hyperglycemia, which is a modifiable risk factor for poor neurologic outcome. The mechanism is not clearly understood but associated lactic acidosis, electrolyte disturbances and inflammation are possible causes [42]. An intensive glycemic control target shows a small but statistically significant reduction in the risk of poor neurological outcome [43]. In general, clinical factors remained more important in terms of outcome prediction in patients with TBI.
Some limitations are noted in this study. The retrospective analysis contains biases related to patient selection. The manual segmentation process, the consistency of which was verified among readers, is time-intensive and currently not realistic to incorporate into clinical workflows. We tried to apply an automatic segmentation tool to our dataset; however, the segmentation results for traumatic IPH were suboptimal and extensive adjustments were still required. In contrast to spontaneous IPH, the multiplicity and lower imaging contrast due to close locations to bone and extra-axial hemorrhages make traumatic IPH lesions more difficult to segment automatically. Inconsistent performances were reported by recent studies, showing a wide range of Dice coefficient results for automatic traumatic IPH segmentation [44-46]; however, development in this field is rapid. With a finely tuned automatic segmentation tool for traumatic IPH, larger numbers of images could be processed in a timely manner for radiomics analysis in the near future. Lastly, due to limited cases, we could not perform validation using an independent dataset; therefore, the generalizability of our results needs to be further investigated.

Conclusions
We demonstrate the feasibility of radiomics analysis of initial CT images for the prediction of HPC and poor GOS in traumatic IPH patients. The combination of the R-score with clinical and lesion volume parameters showed significantly better predictive performance for HPC than clinical-volume information only. The results suggest that radiomics analysis of IPH lesions on initial CT images has the potential to predict the risk of progression and aid in clinical management for traumatic IPH patients. Nevertheless, well-designed prospective cohort studies or randomized controlled trials are still required to add evidence on the beneficial role of radiomics in the future.