Article

Predicting Mortality in Subarachnoid Hemorrhage Patients Using Big Data and Machine Learning: A Nationwide Study in Türkiye

1 Department of Industrial Engineering, Faculty of Engineering, Bilkent University, 06800 Ankara, Türkiye
2 National Magnetic Resonance Research Center (UMRAM), Bilkent University, 06800 Ankara, Türkiye
3 Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
4 Department of Neurosurgery, Faculty of Medicine, Hacettepe University, 06100 Ankara, Türkiye
5 Department of Neurological Surgery, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
6 General Directorate of Health Information System, Republic of Türkiye Ministry of Health, 06800 Ankara, Türkiye
7 Republic of Türkiye Ministry of Health, 06800 Ankara, Türkiye
8 Department of Neurosurgery, Dr. Abdurrahman Yurtaslan Oncology Research and Education Hospital, 06800 Ankara, Türkiye
9 Department of Radiology, Faculty of Medicine, Hacettepe University, 06230 Ankara, Türkiye
10 Department of Neurosurgery, School of Medicine, Yale University, New Haven, CT 06520, USA
11 Department of Neurosurgery, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
* Author to whom correspondence should be addressed.
J. Clin. Med. 2025, 14(4), 1144; https://doi.org/10.3390/jcm14041144
Submission received: 1 December 2024 / Revised: 22 January 2025 / Accepted: 24 January 2025 / Published: 10 February 2025
(This article belongs to the Special Issue Neurovascular Diseases: Clinical Advances and Challenges)

Abstract

Background/Objective: Subarachnoid hemorrhage (SAH) is associated with high morbidity and mortality rates, necessitating prognostic algorithms to guide decisions. Our study evaluates the use of machine learning (ML) models for predicting 1-month and 1-year mortality among SAH patients using a national electronic health records (EHR) system. Methods: A retrospective cohort of 29,274 SAH patients, identified through the national EHR system from January 2017 to December 2022, was analyzed, with mortality data obtained from the central civil registration system in Türkiye. Variables (n = 102) included pre-admission (n = 65) and post-admission (n = 37) data, such as patient demographics, clinical presentation, comorbidities, laboratory results, and complications. We employed logistic regression (LR), decision trees (DTs), random forests (RFs), and artificial neural networks (ANNs). Model performance was evaluated using area under the curve (AUC), average precision, and accuracy. Feature significance analysis was conducted using LR. Results: The mean age was 56.23 ± 16.45 years (47.8% female). The overall mortality rate was 22.8% at 1 month and 33.3% at 1 year. One-month mortality increased from 20.9% to 24.57% (p < 0.001), and 1-year mortality rose from 30.85% to 35.55% (p < 0.001) in the post-COVID period compared to the pre-COVID period. For 1-month mortality prediction, the ANN, LR, RF, and DT models achieved AUCs of 0.946, 0.942, 0.931, and 0.916, with accuracies of 0.905, 0.901, 0.893, and 0.885, respectively. For 1-year mortality, the AUCs were 0.941, 0.927, 0.926, and 0.907, with accuracies of 0.884, 0.875, 0.861, and 0.851, respectively. Key predictors of mortality included age, cardiopulmonary arrest, abnormal laboratory results (such as abnormal glucose and lactate levels) at presentation, and pre-existing comorbidities. Incorporating post-admission features (n = 37) alongside pre-admission features (n = 65) improved model performance for both 1-month and 1-year mortality predictions, with average AUC improvements of 0.093 ± 0.011 and 0.089 ± 0.012, respectively. Conclusions: Our study demonstrates the effectiveness of ML models in predicting mortality in SAH patients using big data. The robustness and interpretability of the LR models, together with the feature significance analysis, support their practical value. Including post-admission data significantly improved the performance of all models. Our results demonstrate the utility of big data analytics in population-level health outcomes studies.

1. Introduction

Subarachnoid hemorrhage (SAH) is a severe, acute neurological disorder characterized by bleeding into the subarachnoid space, often leading to significant morbidity and mortality, especially among patients with poor neurological grade at admission [1,2,3]. SAH remains an important public health problem, as patients usually arrive at emergency departments with severe neurological deterioration [4,5]. Timely prognosis prediction in patients with SAH is necessary for optimizing clinical care and allocating resources effectively, potentially enhancing patient outcomes and reducing hospital costs [6,7].
Recent machine learning (ML) advances have shown promise in predicting outcomes for various neurological conditions in surgical patients [8,9], including SAH. Studies have demonstrated the utility of decision tree (DT) models for predicting long-term outcomes after poor-grade aneurysmal SAH [10]. Moreover, complication and treatment-aware ML models have also shown improved prediction accuracy for functional consequences [11]. Metabolic evaluation of cerebrospinal fluid (CSF) and plasma for predicting SAH prognosis and delayed cerebral ischemia (DCI) has also provided insights into poor outcomes [12,13,14]. Artificial neural networks (ANNs) have been applied to predict clinical outcomes and DCI [15,16].
Despite extensive research, there remains a need for large-scale studies specifically targeting mortality predictions and poor prognosis factors in SAH patients. Türkiye’s e-Nabız system offers a comprehensive resource, encompassing nationwide electronic health records (EHR) from all hospitals in a country with a population of over 85 million. This extensive platform provides a unique opportunity to study disease-specific healthcare outcomes at the population level. Our study leverages this robust database to enhance the validity and applicability of our findings across all the geographical regions in Türkiye, making it the first comprehensive nationwide analysis.
We aim to utilize big data extracted via a nationwide EHR to develop predictive models for 1-month and 1-year mortality in SAH patients, identifying critical prognostic factors using advanced ML algorithms. The dataset’s comprehensiveness will allow us to integrate both pre- and post-admission clinical variables, enabling the identification of high-risk patients and parameters. This approach will facilitate the development of tailored interventions, ultimately improving patient outcomes, optimizing hospital resource allocation, and reducing healthcare costs. By applying ML techniques such as logistic regression (LR), decision trees (DTs), random forests (RFs), and ANNs, we seek to provide a nuanced understanding of SAH prognosis and enhance clinical decision-making.

2. Method

Our study was conducted with the approval of the Ministry of Health of the Republic of Türkiye, utilizing the e-Nabız system, which serves as Türkiye’s comprehensive national health information platform. Patient consent was not required, as the study was retrospective, and the data were anonymized. The e-Nabız system contains essential patient information, including diagnoses, imaging findings, examination, and treatment history. All methods were conducted in accordance with relevant guidelines and regulations. The dataset used in the analysis was acquired in compliance with the Personal Data Protection Law No. 6698, dated 24 March 2016, of the Republic of Türkiye. The data acquisition and analysis were also conducted according to the Ethics Committee for Public Servants guidelines under the Ministry’s Ethics Commission. The study was reviewed by commission officers, who confirmed that all data collection, analysis, and reporting processes adhered to scientific and ethical standards. The mortality times of patients were extracted from the Central Civil Registration System (MERNİS). The MERNİS system is considered a robust and reliable source for obtaining accurate mortality data in Türkiye.

2.1. Patient Selection

The study population comprised patients diagnosed with spontaneous SAH, confirmed using the ICD-10 code I60 (subarachnoid hemorrhage), who presented to emergency departments (EDs) in Türkiye between January 2017 and December 2022. To be eligible for inclusion, patients were required to have a confirmed SAH diagnosis and had to have undergone a computed tomography (CT) scan. Exclusion criteria included patients with traumatic SAH and incomplete medical records, specifically those lacking essential data points such as imaging results or follow-up information. Instances with unclear diagnostic codes or those lacking confirmatory imaging were also eliminated.

2.2. Feature Selection and Data Preprocessing

Feature selection was based on previous studies and clinical relevance, identifying a broad set of potential predictors. Comorbidities were identified based on ICD codes documented before the index emergency visit, while complications were identified through new ICD codes recorded after the initial emergency visit.
The pre-admission data include comprehensive clinical information collected from patients upon arrival at the ED. This dataset (n = 65 features) captures initial patient demographics, clinical presentations, and immediate laboratory results obtained during the ED visit. It encompasses critical early assessments such as the patient’s first-present condition, initial vital signs, and laboratory tests, including a complete blood count (CBC), basic metabolic panel, and blood gases. Additionally, pre-existing medical conditions documented in the patient’s history are included to provide a baseline for subsequent analysis.
The post-admission data encompass all clinical observations and outcomes recorded after patients were formally admitted to the hospital. This dataset (n = 37 features) includes detailed information on the medical interventions received, surgical or endovascular treatments, complications (e.g., infections, neurological events, cardiopulmonary conditions), and their management during hospitalization.
The study utilized these distinct data phases to develop predictive models. The pre-admission data provide a snapshot of the patient’s condition at initial medical contact, capturing early diagnostic indicators and potential risk factors. In contrast, the post-admission data offer a longitudinal view of patient outcomes and the impact of various interventions or management strategies, including detailed complications tracking and assessments of treatment efficacy. By incorporating both datasets, the analysis aims to comprehensively assess factors influencing patient prognosis, from initial presentation through hospitalization.
Data preprocessing involved applying a min–max scaler, where each feature was linearly transformed such that the minimum value was set to 0 and the maximum value to 1. This scaling, together with one-hot encoding of categorical variables, was implemented to enhance the data’s compatibility with ML algorithms. Both pre- and post-admission parameters were deliberately included in the predictive models to cover a broader spectrum of patient data and thereby improve the models’ predictive power.
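The two preprocessing steps described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' pipeline: the function names and the toy records are hypothetical, and the handling of a constant feature is an assumption. Scikit-learn provides `MinMaxScaler` and `OneHotEncoder` for the same purpose, although the text does not state which classes were used.

```python
def min_max_scale(values):
    """Linearly map values so the minimum becomes 0.0 and the maximum 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature is mapped to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot_encode(values):
    """Return (categories, rows) with one 0/1 indicator column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical toy records, not drawn from the study data.
ages = [34, 56, 78, 90]
hospital_type = ["government", "university", "private", "government"]

scaled_ages = min_max_scale(ages)
categories, encoded = one_hot_encode(hospital_type)
```

In practice the minimum and maximum would be taken from the training split only and then reused on the test split, so that no information from the test samples leaks into the scaling.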

2.3. Machine Learning Techniques

Two sets of predictive models were created to predict the likelihood of mortality at one month and one year after the index event of SAH. One set included pre-admission characteristics; the other incorporated pre- and post-admission parameters. The pre-admission model included age, gender, initial clinical presentation, and early laboratory findings. In contrast, the post-admission model included characteristics of the interventional process and post-admission neurological, laboratory, and clinical features.
Four ML techniques were utilized: LR, DT, RF, and ANN. For each model, the dataset was split 70:30, with 70% allocated to training and 30% to testing. Cross-validation was conducted to optimize model hyperparameters and mitigate overfitting. The performance of the four models was assessed on the test samples using AUC, average precision (AP), and accuracy, in line with previous studies leveraging machine learning for predictive modeling in clinical outcome research [17,18,19,20,21].
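To make the evaluation protocol concrete, the sketch below shows a 70:30 split and the three reported metrics computed from their definitions. It is an illustration under stated assumptions: the helper names, the random seed, the 0.5 decision threshold, and the toy labels and scores are all hypothetical, the average precision helper assumes no tied scores, and the cross-validation step is omitted.

```python
import random


def split_70_30(n_samples, seed=0):
    """Shuffle sample indices and return (train_idx, test_idx) in a 70:30 ratio."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(round(0.7 * n_samples))
    return indices[:cut], indices[cut:]


def accuracy(y_true, y_score, threshold=0.5):
    """Fraction of cases whose thresholded score matches the label."""
    hits = sum((s >= threshold) == bool(y) for y, s in zip(y_true, y_score))
    return hits / len(y_true)


def roc_auc(y_true, y_score):
    """Probability that a random positive case is scored above a random negative case."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Mean of the precision values at the rank of each positive case (no tied scores)."""
    ranked = sorted(zip(y_score, y_true), reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical labels (1 = died) and predicted probabilities for six test cases.
y_true = [1, 0, 1, 0, 0, 1]
y_score = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]

train_idx, test_idx = split_70_30(10)
acc = accuracy(y_true, y_score)           # 4 of 6 correct
auc = roc_auc(y_true, y_score)            # 5 of 9 positive/negative pairs ordered correctly
ap = average_precision(y_true, y_score)   # mean of 1/1, 2/3, 3/6
```

AUC and average precision depend only on how the cases are ranked, whereas accuracy depends on the chosen threshold, which is one reason the three metrics are reported together.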
(i) Logistic regression (LR) is a basic yet robust prediction algorithm that estimates the probability of a binary outcome based on a set of input features. It calculates a weighted sum of the input features using coefficients learned during training, then applies a logistic function to constrain the output to a range of 0 to 1. This value represents the likelihood of the positive class or event (in our case, “death within 1 month” and “death within 1 year”).
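The weighted sum and logistic function described in (i) can be written out directly. The sketch below is illustrative only: the coefficient values and the two-feature input are hypothetical and are not estimates from the study.

```python
import math


def predict_probability(features, weights, bias):
    """Apply the logistic function to the weighted sum of the input features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients for two min-max-scaled features (illustrative values only).
weights = [2.0, 1.5]
bias = -2.0

p_low = predict_probability([0.0, 0.0], weights, bias)   # weighted sum z = -2.0
p_high = predict_probability([1.0, 1.0], weights, bias)  # weighted sum z = 1.5
```

Because each coefficient multiplies a single feature, its sign and magnitude can be read as that feature's contribution to the log-odds, which is the basis of the interpretability attributed to LR.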
(ii) Decision trees (DT) work by recursively splitting the dataset into subsets based on specific feature values, forming a tree-like structure. Each split condition, represented by a branching node, determines the direction in which data points flow, either left or right. The terminal leaf nodes signify the final outcome, where the predicted class is derived from the majority class of the training data points at that leaf.
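Prediction with a fitted tree, as described in (ii), amounts to following split conditions until a leaf is reached. The sketch below uses a hand-built tree to show that traversal. The feature names, thresholds, and leaf labels are hypothetical and were not learned from the study data.

```python
def predict_with_tree(node, sample):
    """Follow split conditions from the root until a leaf node is reached."""
    while "label" not in node:
        go_left = sample[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]


# Hypothetical hand-built tree (1 = predicted death, 0 = predicted survival).
toy_tree = {
    "feature": "arrest_or_intubation",
    "threshold": 0.5,
    "left": {
        "feature": "age_scaled",
        "threshold": 0.8,
        "left": {"label": 0},
        "right": {"label": 1},
    },
    "right": {"label": 1},
}

prediction = predict_with_tree(
    toy_tree, {"arrest_or_intubation": 0, "age_scaled": 0.3}
)
```

In a trained tree, each leaf label would be the majority class of the training samples that reached that leaf.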
(iii) Random forest (RF) is an ensemble learning approach that builds multiple decision trees using random subsets of the training data and input features. The model aggregates the predictions from these individual trees, typically by averaging or voting, to generate the final output. This ensemble strategy enhances the model’s accuracy and resilience by mitigating the tendency to overfit.
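The two ingredients named in (iii), random resampling of the training data and aggregation of the individual trees, can be sketched as follows. The helper names and the per-tree outputs are hypothetical, and the random selection of input features at each split is not shown.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) training rows with replacement for one tree."""
    return rng.choices(rows, k=len(rows))


def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


def average_probability(tree_probabilities):
    """Return the mean of the per-tree probabilities of the positive class."""
    return sum(tree_probabilities) / len(tree_probabilities)


# Hypothetical outputs of five trees for a single patient.
votes = [1, 0, 1, 1, 0]
probabilities = [0.9, 0.2, 0.7, 0.8, 0.4]

sample = bootstrap_sample(list(range(10)), random.Random(0))
forest_class = majority_vote(votes)
forest_probability = average_probability(probabilities)
```

Averaging over many trees that each saw a slightly different sample is what reduces the variance of a single deep tree.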
(iv) Artificial neural networks (ANNs) in their simplest, feedforward form pass information in a single direction, from the input layer (representing features) through one or more hidden layers (where learning occurs) to the output layer (indicating the predicted outcome). Hidden layers are composed of interconnected nodes that perform linear transformations on their inputs, followed by nonlinear activation functions, enabling the network to identify and learn complex patterns in the training data.
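A single forward pass through such a network, as described in (iv), can be sketched without any library. The layer sizes, the weights, and the choice of ReLU and sigmoid activations below are assumptions made for illustration, since the text does not specify the architecture.

```python
import math


def relu(x):
    """Nonlinear activation assumed for the hidden layer."""
    return max(0.0, x)


def sigmoid(x):
    """Activation assumed for the output layer; maps any value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One layer: a linear transformation followed by a nonlinear activation."""
    return [
        activation(b + sum(w * x for w, x in zip(row, inputs)))
        for row, b in zip(weights, biases)
    ]


def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Pass the inputs through one hidden layer and a single output node."""
    hidden = dense(inputs, hidden_w, hidden_b, relu)
    return dense(hidden, out_w, out_b, sigmoid)[0]


# Hypothetical weights for 2 inputs, 2 hidden nodes, and 1 output node.
hidden_w = [[1.0, -1.0], [0.5, 0.5]]
hidden_b = [0.0, -1.0]
out_w = [[2.0, 1.0]]
out_b = [-1.0]

probability = forward([1.0, 0.5], hidden_w, hidden_b, out_w, out_b)
```

Training would adjust the weights and biases to reduce the prediction error on the training split, a step not shown here.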

2.4. Statistical Analysis

Descriptive statistics for parametric continuous variables were presented as means and standard deviations (±), while non-parametric continuous variables were summarized using medians with first and third quartiles [Q1–Q3]. Categorical variables were summarized as counts and percentages [n (%)]. Logistic regression was employed to determine the most influential predictors, and the statistical significance of each variable was evaluated using p-values. Variables with p-values below 0.05 were deemed statistically significant. Significant variables were further classified into three categories according to their p-values: (i) p < 0.001, (ii) p < 0.01, and (iii) p < 0.05.
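The three significance categories above amount to a simple threshold rule, sketched below. The function name and the example p-values are hypothetical, and the sketch assumes that each variable is assigned to the strictest category it satisfies.

```python
def significance_tier(p_value):
    """Map a p-value to the strictest tier it satisfies, or None if not significant."""
    if p_value < 0.001:
        return "p < 0.001"
    if p_value < 0.01:
        return "p < 0.01"
    if p_value < 0.05:
        return "p < 0.05"
    return None


# Hypothetical p-values for four predictors (illustrative values only).
example_p_values = {"feature_a": 0.0002, "feature_b": 0.004, "feature_c": 0.03, "feature_d": 0.2}
tiers = {name: significance_tier(p) for name, p in example_p_values.items()}
```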
We used Python (version 3.8) with Scikit-learn (version 1.3.2) for the analyses and R (version 4.2.1) for the statistical significance analysis. We compared the models by their performance indicators and selected the one with the highest AUC as the best-performing model.

3. Results

3.1. Patient Characteristics

As detailed in Table 1, this study included 29,274 patients diagnosed with SAH between January 2017 and December 2022. Our cohort had a mean age of 56.23 ± 16.45 years, with 52.2% male and 47.8% female patients. The median time from the initial emergency visit to the CT scan was 12 [4–114] minutes. Notably, 34.2% of the patients experienced cardiopulmonary arrest and/or required endotracheal intubation within the first 24 h, and 9.7% required external ventricular drain (EVD) placement within the first 72 h. Initial laboratory values at emergency presentation included a median hemoglobin level of 13 [11.4–14.4] g/dL, a white blood cell count of 12.01 [9.22–15.6] × 10³ cells/μL, and a blood glucose level of 131 [105–167] mg/dL. Comorbidities in the cohort included hypertension (55.7%), chronic ischemic heart disease (26.8%), and type 2 diabetes mellitus (25.5%). In total, 8220 patients (28.1%) were diagnosed with epilepsy. The mean length of hospitalization was 19.01 ± 15.94 days. The mortality rates within the first week, the first month, and the first year were 9.4%, 22.8%, and 33.3%, respectively.
Out of the total cohort, 37.8% of patients underwent primary intervention for aneurysms, with 22.7% undergoing clipping and 15.1% undergoing coiling. The median time from the initial emergency visit to intervention was 1.95 days [0.75–4.75 days], with clipping occurring at a median of 1.95 days [0.77–4.4 days] and coiling at 2.07 days [0.8–5.8 days].
Regarding the institutions where interventions for aneurysms were performed, the largest share occurred in government-owned hospitals (49.3%), followed by university hospitals (30.7%) and private institutions (19.7%). This distribution highlights the significant role of public healthcare institutions in managing aneurysmal interventions within the studied cohort.

3.2. First-Month Mortality Prediction

The first-month mortality rate in our cohort was 22.8%. As depicted in Table 2 and Figure 1, in the test sets, the pre-admission LR model achieved an AUC of 0.849, an average precision of 0.651, and an accuracy of 0.832. The DT model had an AUC of 0.835, an average precision of 0.636, and an accuracy of 0.826. The RF model demonstrated an AUC of 0.855, an average precision of 0.667, and an accuracy of 0.835. The ANN model achieved an AUC of 0.850, an average precision of 0.649, and an accuracy of 0.829.
When post-admission parameters were included, the predictive performance of the models significantly improved in the test sets. The LR model’s AUC increased to 0.942, with an average precision of 0.835 and an accuracy of 0.901. The DT model achieved an AUC of 0.916, an average precision of 0.755, and an accuracy of 0.885. The RF model’s AUC rose to 0.931, with an average precision of 0.813 and an accuracy of 0.893. The ANN model demonstrated the best performance, with an AUC of 0.946, an average precision of 0.844, and an accuracy of 0.905.
In addition to pre-admission features (n = 65), post-admission features (n = 37) significantly improved the model’s performance in terms of 1-month mortality prediction. The AUC for the ANN model increased from 0.850 to 0.946, RF from 0.855 to 0.931, DT from 0.835 to 0.916, and LR from 0.849 to 0.942. On average, the AUC improvement across all models was 0.093 ± 0.011, reflecting a consistent enhancement across all methods.

3.3. First-Year Mortality Prediction

The first-year mortality rate in our cohort was 33.3%. As shown in Table 3 and Figure 2, similar models were developed to predict first-year mortality. In the test sets, the pre-admission LR model achieved an AUC of 0.831, an average precision of 0.733, and an accuracy of 0.782. The DT model had an AUC of 0.825, an average precision of 0.717, and an accuracy of 0.777. The RF model demonstrated an AUC of 0.839, an average precision of 0.747, and an accuracy of 0.789. The ANN model achieved an AUC of 0.835, an average precision of 0.733, and an accuracy of 0.780.
Including post-admission parameters significantly enhanced the models’ performance. In the test sets, the LR model’s AUC increased to 0.927, with an average precision of 0.881 and an accuracy of 0.875. The DT model achieved an AUC of 0.907, an average precision of 0.848, and an accuracy of 0.851. The RF model’s AUC rose to 0.926, with an average precision of 0.877 and an accuracy of 0.861. The ANN model again demonstrated a slightly higher performance, with an AUC of 0.941, an average precision of 0.898, and an accuracy of 0.884.
In addition to pre-admission features (n = 65), the inclusion of post-admission features (n = 37) improved the AUC for the ANN model from 0.835 to 0.941, RF from 0.839 to 0.926, DT from 0.825 to 0.907, and LR from 0.831 to 0.927. The average AUC improvement across all models was 0.089 ± 0.012, showing consistent enhancements across all models with post-admission data.
In line with the current literature, a stratified analysis of mortality rates was conducted to assess the potential impact of the COVID-19 pandemic. The 1-month mortality rate increased from 20.9% in the pre-COVID period to 24.57% in the post-COVID period (p < 0.001). Similarly, the 1-year mortality rate rose from 30.85% to 35.55% (p < 0.001). Due to the unavailability of patient-level COVID-19 data, pandemic-specific effects were not incorporated into the predictive models.

3.4. Significance Analysis

As demonstrated in Table 4, in both 1-month and 1-year mortality predictions, several key prognostic factors were identified as highly significant (p < 0.001). Among these, age emerged as one of the strongest predictors, with older patients showing a significantly increased risk of mortality at both time points. Cardiopulmonary arrest and/or endotracheal intubation in the ED was another critical factor, demonstrating a strong association with mortality and further emphasizing the severity of the initial clinical presentation. The placement of an EVD within the first 72 h, while not a significant predictor for 1-month mortality, became a meaningful factor for 1-year mortality. Elevated glucose and lactate levels were also important markers of poor outcomes, while a higher white blood cell (WBC) count was strongly correlated with increased mortality risk. Pre-existing comorbidities, such as chronic heart disease, hypertension, and chronic kidney disease, were consistently significant across both 1-month and 1-year predictions, highlighting the influence of underlying health conditions on patient outcomes.
Additionally, postoperative complications, including sepsis, epilepsy, pneumonia, and deep vein thrombosis (DVT), were crucial determinants of mortality. Notably, postoperative pneumonia was among the most significant complications affecting survival. These findings underscore the importance of incorporating both pre- and post-admission data to enhance the predictive accuracy of mortality models in SAH patients, particularly in high-risk individuals with severe presentations and multiple comorbidities.

4. Discussion

Our study successfully developed and validated ML models to predict first-month and first-year mortality in patients with SAH using the comprehensive nationwide EHR system in Türkiye. Previous research has extensively explored various ML models for this purpose, and our findings align with and expand upon these efforts [22,23,24]. Unlike prior studies, our research utilizes Türkiye’s nationwide EHR system, marking the first extensive study of its kind within this unique national health information platform in SAH patients. Additionally, by focusing on first-month and first-year mortality and employing various advanced ML techniques, our study provides novel insights and contributes to the literature.
Our study utilized two distinct feature sets, with models trained on pre-admission features alone and on combined pre- and post-admission features. This approach serves two purposes: first, to enable early mortality prediction during hospital admission, facilitating timely discussions and planning with patients and their families; and second, to assess the impact of treatment decisions on mortality and morbidity, providing valuable insights into post-treatment outcomes. Including post-admission parameters substantially improved the performance of machine learning models in predicting mortality for SAH patients. For example, the ANN model’s AUC for 1-month mortality increased from 0.850 to 0.946 in the test set when incorporating post-admission data, with notable improvements in accuracy and average precision. These enhancements underscore the importance of comprehensive data integration, from initial presentation through post-admission care, in improving prognosis predictions. Additionally, post-admission complications such as epilepsy, stroke, and pneumonia emerged as critical factors influencing survival. The high prevalence of epilepsy diagnoses may be attributed to preventive antiepileptic prescriptions, which are often entered as diagnoses. Early EVD placement, while not significant for 1-month mortality, became a key predictor for 1-year outcomes, reflecting its association with neurological deterioration and long-term survival. These findings highlight the value of a holistic clinical approach, particularly in neurosurgical settings, where detailed patient management strategies can meaningfully impact outcomes.
While our study primarily emphasizes the statistical methods and the overall utility of the models, the key predictors identified through our analysis deserve special attention. Factors such as age, cardiopulmonary arrest, and endotracheal intubation status emerged as significant predictors of both short- and long-term mortality. Older patients and those requiring intubation or experiencing cardiopulmonary arrest typically have less physiological reserve to recover from surgery and subsequent complications [25,26,27]. Furthermore, laboratory markers like elevated glucose and lactate levels, as well as high WBC counts, were strongly correlated with poor outcomes, reflecting the severity of the initial clinical presentation. Post-admission complications, including pneumonia, sepsis, and epilepsy, further underscored the importance of continuous monitoring and management in improving survival rates. Integrating comprehensive data from initial presentation through post-admission care allows for more precise risk stratification and targeted interventions, leading to better patient management and clinical outcomes. The improvement in model performance with post-admission data underscores the necessity of continuous monitoring and management in neurosurgical and intensive care settings. While objective clinical scales like the Fisher grade, WFNS, Hunt and Hess, and Glasgow Coma Scale provide valuable insights [28], our study demonstrates that comprehensive tabular data can effectively serve as a proxy for these objective measures. Despite the absence of these specific scales, our extensive data analysis achieved superior predictive performance, exceeding the standard range of AUCs reported in studies utilizing these clinical scales. This highlights the enhanced prognostic power of integrating extensive pre- and post-admission big data in SAH outcome predictions. 
Utilizing our ML algorithms, we achieved high predictive accuracy, highlighting our dataset’s potential to capture the essential prognostic elements of these scales indirectly. This approach allows for a reliable and robust prediction model, which is especially valuable in clinical settings where these specific scales may not be readily available or consistently applied.
Interestingly, our study demonstrated a significant rise in mortality rates, with 1-month mortality increasing from 20.9% pre-COVID to 24.57% post-COVID and 1-year mortality rising from 30.85% to 35.55% (p < 0.001). Consistent with our findings, an increase in SAH mortality during the COVID-19 pandemic has been reported; this could potentially be attributed to delayed presentation, worse neurological status, associated comorbidities, and decreased hospital resource utilization [29,30,31,32,33,34]. Due to the unavailability of patient-level COVID-19 data, pandemic-specific effects were not incorporated into our predictive models.
The performance comparison revealed that the ANN and RF models achieved the highest accuracy and AUC scores, likely due to their ability to capture complex relationships within the data. ANN’s deep learning architecture facilitates extracting intricate features, resulting in superior predictive accuracy. However, its clinical applicability may be limited by its computational complexity, the need for technical expertise, and substantial processing power. Conversely, LR offers high precision and AUC with simplicity and interpretability, making it suitable for resource-limited settings. DT demonstrated competitive performance. LR provides similar success rates to ANN while being less complex and more interpretable, making it a suitable option for practical implementation, as shown in Figure 1 and Figure 2. Additionally, as was the case in our study, a significance analysis of the features can be conducted to further validate their importance in the model.
Recent research has demonstrated that ML algorithms can accurately predict in-hospital mortality and functional outcomes [19,35]. These models often outperform traditional statistical methods and can match clinicians’ predictions [36]. Multicenter retrospective cohort studies have effectively utilized ML to predict prognostic features and identify poor prognostic conditions related to SAH, such as DCI [15,37]. These studies have demonstrated the utility of various ML algorithms, including RF, gradient boosting DT, support vector machines, and DT, in predicting DCI and other complications and poor prognostic outcomes following SAH [7,10,11,38]. They also highlighted the importance of incorporating clinical variables to enhance predictive accuracy, a methodology we also adopted and found effective. Recent studies have reported average AUC values between 0.80 and 0.87 for predicting outcomes in SAH, demonstrating the efficacy of ML in enhancing prognostic accuracy. In comparison, our models achieved superior performance, with the ANN model reaching an AUC of 0.946 for 1-month mortality and 0.941 for 1-year mortality. This improvement underscores the impact of incorporating pre- and post-admission features and big data analysis.
Key predictors reported in the literature include age, serum glucose, neutrophil-to-lymphocyte ratio, and established clinical scales such as WFNS and Fisher’s scale [39]. Some researchers have developed simplified scorecards based on ML models to facilitate bedside use [6]. Additionally, others have applied ML to a nationwide electronic health record (EHR) database to predict mortality and poor outcomes in SAH patients, demonstrating that logistic regression and other ML models can achieve high predictive performance when leveraging comprehensive clinical datasets [19,40].
Our research is consistent with the findings of Farooqi et al., who emphasized the potential of AI and ML in refining grading and outcome predictions for aneurysmal SAH. They highlighted that integrating extensive datasets from nationwide EHR could significantly improve the accuracy and applicability of predictive models [41]. Similarly, other recent studies underscored the critical role of early clinical indicators and the application of ML algorithms in predictive modeling for SAH patients [42,43,44]. Integrating pre- and post-admission variables into our ML models significantly enhanced their predictive performance. This approach aligns with the findings of Shu et al., who used SHapley Additive exPlanations (SHAP) analysis to enhance model interpretability and demonstrated the crucial role of post-admission factors in accurately predicting patient outcomes after SAH [45]. Our results underscore the importance of comprehensive data collection throughout the patient’s clinical journey, from initial presentation to post-admission care.
A major strength of our study is the use of both the e-Nabız and MERNİS systems. The e-Nabız system provides a comprehensive and representative clinical dataset drawn from all hospitals in Türkiye, while the MERNİS system offers accurate and dependable mortality data. Together, these national platforms contribute significantly to the validity and generalizability of our findings. Applying several ML techniques, including ANN, DT, RF, and LR, also allowed for high predictive accuracy.
However, the study’s retrospective design and potential biases in data collection are notable limitations. Despite rigorous data preprocessing and imputation, some residual confounding may remain. Moreover, excluding patients with incomplete records may have introduced selection bias. A further limitation is the absence of an objective scale to assess patients’ condition at presentation (such as Hunt-Hess, mRS, or GCS scores). To address this, we categorized patients into “good” and “poor” presentation groups based on findings at their first ED presentation, such as arrest history and intubation status, and on later clinical and operational factors such as complications, epilepsy, or sepsis. This proxy measure allowed us to account for the severity of the presentation indirectly, although it does not fully replace standardized clinical scoring systems. Future studies should focus on the prospective validation of the models and consider incorporating additional clinical, imaging, and laboratory variables to enhance predictive accuracy.
Future research should validate our predictive models prospectively in different clinical settings to confirm their generalizability and reliability. Integrating more granular clinical data and molecular and biochemical markers could provide deeper insights into SAH prognosis and improve model performance. Exploring other ML algorithms and ensemble methods may yield better predictive accuracy. Incorporating continuous monitoring data, such as real-time physiological parameters, could enhance the timeliness and precision of outcome predictions [46]. Moreover, future nationwide studies should include established scales such as WFNS, Hunt and Hess, and Fisher, which were unavailable in our cohort, to further refine the predictive models and improve their applicability.

5. Conclusions

Our study developed and validated ML models to predict 1-month and 1-year mortality in patients with SAH using data from Türkiye’s nationwide EHR system. Notably, including both pre- and post-admission parameters significantly enhanced the models’ predictive performance, with ANN achieving the highest accuracy and AUC. However, LR also demonstrated strong performance, closely matching ANN while being more interpretable and easier to apply in clinical practice. Key predictors included age, cardiopulmonary arrest and/or endotracheal intubation, abnormal laboratory results (such as glucose and lactate levels) on presentation, and post-admission complications such as pneumonia and sepsis, which significantly impacted both short- and long-term mortality. These findings offer valuable clinical implications for the early identification of high-risk SAH patients, enabling timely and personalized interventions. Moreover, this study is the first comprehensive SAH analysis using a nationwide dataset in the Turkish population, underscoring its unique contribution to the field. Future research should focus on the prospective validation of these models and further integration of clinical variables to enhance predictive accuracy, potentially improving patient outcomes and optimizing resource allocation in neurosurgery.

Author Contributions

Conception or design of the work: T.K., A.I.I., A.A. and S.H. Data collection: T.K., S.C., A.I.I., A.A. and S.H. Data analysis and interpretation: T.K., E.C., N.N.G., A.A. and S.H. Manuscript Composition: T.K., E.C. and S.H. Critical revision of the article: T.K., N.A., M.M.U., S.B., A.I.I., A.B., A.A. and S.H. Study supervision: T.K., N.A., M.M.U., S.B., A.I.I., A.B., A.A. and S.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Ministry of Health of the Republic of Türkiye (Personal Data Protection Law No. 6698, dated 24 March 2016).

Informed Consent Statement

Informed consent was obtained from the Ministry of Health of the Republic of Türkiye.

Data Availability Statement

The data supporting the findings of this study, including all images and datasets, are available from the corresponding author, Sahin Hanalioglu, upon reasonable request. No additional external data repositories were used, and the data are not publicly archived due to the nature of this study. All relevant data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Ota, N.; Morita, A.; Tominari, S.; Nakayama, T.; Nozaki, K.; Tominaga, T.; Noda, K.; Kamiyama, H.; Tanikawa, R.; on behalf of the Japan Neurosurgical Society for UCAS Japan Investigators. Differences Between Subarachnoid Hemorrhage Seen in Daily Practice and Aneurysms That Rupture During Follow-Up. Stroke 2021, 52, e491–e493.
2. Ota, N.; Noda, K.; Chida, D.; Kiko, K.; Miyoshi, N.; Kondo, T.; Haraguchi, K.; Kamiyama, H.; Tokuda, S.; Tanikawa, R. Emergent Subarachnoid Clot Removal with Aneurysm Repair for Subarachnoid Hemorrhage Might Improves Clinical Outcome. World Neurosurg. 2022, 167, e100–e109.
3. Hanalioglu, S.; Sahin, B.; Sayyahmelli, S.; Ozaydin, B.; Erginoglu, U.; Aycan, A.; Baskaya, M.K. The role of microsurgery for poor-grade aneurysmal subarachnoid hemorrhages in the endovascular era. Acta Neurochir. 2022, 164, 781–793.
4. Yue, Q.; Liu, Y.; Leng, B.; Xu, B.; Gu, Y.; Chen, L.; Zhu, W.; Mao, Y. A Prognostic Model for Early Post-Treatment Outcome of Elderly Patients with Aneurysmal Subarachnoid Hemorrhage. World Neurosurg. 2016, 95, 253–261.
5. Hamouda, A.M.; Cwajna, M.; Elfil, M.; Derhab, M.; Desouki, M.T.; Kobeissi, H.; Ghozy, S.; Kallmes, D.F. Impact of frailty on post-operative outcomes following subarachnoid hemorrhage: A systematic review and meta-analysis. Clin. Neurol. Neurosurg. 2024, 244, 108413.
6. Zhang, Y.; Zeng, H.; Zhou, H.; Li, J.; Wang, T.; Guo, Y.; Cai, L.; Hu, J.; Zhang, X.; Chen, G. Predicting the Outcome of Patients with Aneurysmal Subarachnoid Hemorrhage: A Machine-Learning-Guided Scorecard. J. Clin. Med. 2023, 12, 7040.
7. de Jong, G.; Aquarius, R.; Sanaan, B.; Bartels, R.H.M.A.; Grotenhuis, J.A.; Henssen, D.J.H.A.; Boogaarts, H.D. Prediction Models in Aneurysmal Subarachnoid Hemorrhage: Forecasting Clinical Outcome with Artificial Intelligence. Neurosurgery 2021, 88, E427–E434.
8. Khaniyev, T.; Copenhaver, M.S.; Safavi, K.C.; Levi, R. A prescriptive optimization approach to identification of minimal barriers for surgical patients. medRxiv 2023.
9. Safavi, K.C.; Khaniyev, T.; Copenhaver, M.; Seelen, M.; Zenteno Langle, A.C.; Zanger, J.; Daily, B.; Levi, R.; Dunn, P. Development and Validation of a Machine Learning Model to Aid Discharge Processes for Inpatient Surgical Care. JAMA Netw. Open 2019, 2, e1917221.
10. Liu, J.; Xiong, Y.; Zhong, M.; Yang, Y.; Guo, X.; Tan, X.; Zhao, B. Predicting Long-Term Outcomes After Poor-Grade Aneurysmal Subarachnoid Hemorrhage Using Decision Tree Modeling. Neurosurgery 2020, 87, 523–529.
11. Maldaner, N.; Zeitlberger, A.M.; Sosnova, M.; Goldberg, J.; Fung, C.; Bervini, D.; May, A.; Bijlenga, P.; Schaller, K.; Roethlisberger, M.; et al. Development of a Complication- and Treatment-Aware Prediction Model for Favorable Functional Outcome in Aneurysmal Subarachnoid Hemorrhage Based on Machine Learning. Neurosurgery 2021, 88, E150–E157.
12. Appel, D.; Seeberger, M.; Schwedhelm, E.; Czorlich, P.; Goetz, A.E.; Böger, R.H.; Hannemann, J. Asymmetric and Symmetric Dimethylarginines are Markers of Delayed Cerebral Ischemia and Neurological Outcome in Patients with Subarachnoid Hemorrhage. Neurocrit. Care 2018, 29, 84–93.
13. Koch, M.; Acharjee, A.; Ament, Z.; Schleicher, R.; Bevers, M.; Stapleton, C.; Patel, A.; Kimberly, W.T. Machine Learning-Driven Metabolomic Evaluation of Cerebrospinal Fluid: Insights into Poor Outcomes After Aneurysmal Subarachnoid Hemorrhage. Neurosurgery 2021, 88, 1003–1011.
14. Stapleton, C.J.; Acharjee, A.; Irvine, H.J.; Wolcott, Z.C.; Patel, A.B.; Kimberly, W.T. High-throughput metabolite profiling: Identification of plasma taurine as a potential biomarker of functional outcome after aneurysmal subarachnoid hemorrhage. J. Neurosurg. 2020, 133, 1842–1849.
15. Ge, S.; Chen, J.; Wang, W.; Zhang, L.; Teng, Y.; Yang, C.; Wang, H.; Tao, Y.; Chen, Z.; Li, R.; et al. Predicting who has delayed cerebral ischemia after aneurysmal subarachnoid hemorrhage using machine learning approach: A multicenter, retrospective cohort study. BMC Neurol. 2024, 24, 177.
16. de Oliveira Souza, N.V.; Rouanet, C.; Solla, D.J.F.; de Lima, C.V.B.; de Souza, C.A.; Rezende, F.; Alves, M.M.; Manuel, A.L.d.O.; Neto, F.C.; Frudit, M.; et al. The Role of VASOGRADE as a Simple Grading Scale to Predict Delayed Cerebral Ischemia and Functional Outcome After Aneurysmal Subarachnoid Hemorrhage. Neurocrit. Care 2023, 38, 96–104.
17. Shamout, F.; Zhu, T.; Clifton, D.A. Machine Learning for Clinical Outcome Prediction. IEEE Rev. Biomed. Eng. 2021, 14, 116–126.
18. Senders, J.T.; Staples, P.C.; Karhade, A.V.; Zaki, M.M.; Gormley, W.B.; Broekman, M.L.D.; Smith, T.R.; Arnaout, O. Machine Learning and Neurosurgical Outcome Prediction: A Systematic Review. World Neurosurg. 2018, 109, 476–486.e1.
19. Zhu, G.; Yuan, A.; Yu, D.; Zha, A.; Wu, H. Machine learning to predict mortality for aneurysmal subarachnoid hemorrhage (aSAH) using a large nationwide EHR database. PLoS Digit. Health 2023, 2, e0000400.
20. Yuan, R.; Janzen, I.; Devnath, L.; Khattra, S.; Myers, R.; Lam, S.; MacAulay, C. MA19.11 Predicting Future Lung Cancer Risk with Low-Dose Screening CT Using an Artificial Intelligence Model. J. Thorac. Oncol. 2023, 18, S174.
21. Devnath, L.; Summons, P.; Luo, S.; Wang, D.; Shaukat, K.; Hameed, I.A.; Aljuaid, H. Computer-Aided Diagnosis of Coal Workers’ Pneumoconiosis in Chest X-ray Radiographs Using Machine Learning: A Systematic Literature Review. Int. J. Environ. Res. Public Health 2022, 19, 6439.
22. Ichim, C.; Pavel, V.; Mester, P.; Schmid, S.; Todor, S.B.; Stoia, O.; Anderco, P.; Kandulski, A.; Müller, M.; Heumann, P.; et al. Assessing Key Factors Influencing Successful Resuscitation Outcomes in Out-of-Hospital Cardiac Arrest (OHCA). J. Clin. Med. 2024, 13, 7399.
23. Warman, A.; Kalluri, A.L.; Azad, T.D. Machine learning predictive models in neurosurgery: An appraisal based on the TRIPOD guidelines. Systematic review. Neurosurg. Focus 2023, 54, E8.
24. Mourelo-Fariña, M.; Pértega, S.; Galeiras, R. A Model for Prediction of In-Hospital Mortality in Patients with Subarachnoid Hemorrhage. Neurocrit. Care 2021, 34, 508–518.
25. Daghistani, T.A.; Elshawi, R.; Sakr, S.; Ahmed, A.M.; Al-Thwayee, A.; Al-Mallah, M.H. Predictors of in-hospital length of stay among cardiac patients: A machine learning approach. Int. J. Cardiol. 2019, 288, 140–147.
26. Sundström, J.; Hedberg, J.; Thuresson, M.; Aarskog, P.; Johannesen, K.M.; Oldgren, J. Low-Dose Aspirin Discontinuation and Risk of Cardiovascular Events. Circulation 2017, 136, 1183–1192.
27. Charehsaz, A.; Vayisoglu, T.; Uyaniker, Z.A.; Cekic, E.; Ozturk, E.; Isikay, A.I.; Hanalioglu, S. Relative Cortical Atrophy Index as a Strong Predictor of Recurrence After Surgery for Chronic Subdural Hematoma. Neurosurgery 2024, 95, 1369–1377.
28. Veldeman, M.; Rossmann, T.; Weiss, M.; Conzen-Dilger, C.; Korja, M.; Hoellig, A.; Virta, J.J.; Satopää, J.; Luostarinen, T.; Clusmann, H.; et al. Aneurysmal Subarachnoid Hemorrhage in Hospitalized Patients on Anticoagulants—A Two Center Matched Case-Control Study. J. Clin. Med. 2023, 12, 1476.
29. Akbik, F.; Yang, C.; Howard, B.M.; Grossberg, J.A.; Danyluk, L.; Martin, K.S.; Alawieh, A.; Rindler, R.S.; Tong, F.C.; Barrow, D.L.; et al. Delayed Presentations and Worse Outcomes After Aneurysmal Subarachnoid Hemorrhage in the Early COVID-19 Era. Neurosurgery 2022, 91, 66–71.
30. Lintas, K.; Rohde, S.; Ellrichmann, G.; El-Hamalawi, B.; Sarge, R.; Müller, O. Subarachnoid hemorrhages and aneurysms during the SARS-CoV2-pandemia at a tertiary medical center—Analysis of incidence and outcome. Brain Spine 2023, 3, 101757.
31. Qureshi, A.I.; Baskett, W.I.; Huang, W.; Shyu, D.; Myers, D.; Lobanova, I.; Ishfaq, M.F.; Naqvi, S.H.; French, B.R.; Siddiq, F.; et al. Subarachnoid Hemorrhage and COVID-19: An Analysis of 282,718 Patients. World Neurosurg. 2021, 151, e615–e620.
32. Patel, S.D.; Balabhadra, A.; Otite, F.; Patel, N.; Tunguturi, A.; Bruno, C.; Sussman, E.; Ollenschleger, M.; Alberts, M.J.; Mehta, T. Abstract TP34: Outcomes and Trends of Subarachnoid Hemorrhage During the COVID-19 Pandemic. Stroke 2024, 55, ATP34.
33. Flores-Sanchez, J.D.; Perez-Chadid, D.A.; Vargas-Urbina, J.; Zumaeta, J.; Rodriguez, R.R.; Palacios, F.; Flores-Castillo, J. Pandemic impact on aneurysmal subarachnoid hemorrhage in Peru’s high COVID-19 lethality setting: A public institutional experience. Surg. Neurol. Int. 2023, 14, 440.
34. SVIN COVID-19 Global SAH Registry. Global impact of the COVID-19 pandemic on subarachnoid haemorrhage hospitalisations, aneurysm treatment and in-hospital mortality: 1-year follow-up. J. Neurol. Neurosurg. Psychiatry 2022, 93, 1028.
35. Deng, J.; He, Z. Characterizing Risk of In-Hospital Mortality Following Subarachnoid Hemorrhage Using Machine Learning: A Retrospective Study. Front. Surg. 2022, 9, 891984.
36. Savarraj, J.P.J.; Hergenroeder, G.W.; Zhu, L.; Chang, T.; Park, S.; Megjhani, M.; Vahidy, F.S.; Zhao, Z.; Kitagawa, R.S.; Choi, H.A. Machine Learning to Predict Delayed Cerebral Ischemia and Outcomes in Subarachnoid Hemorrhage. Neurology 2021, 96, e553–e562.
37. Hu, P.; Liu, Y.; Li, Y.; Guo, G.; Su, Z.; Gao, X.; Chen, J.; Qi, Y.; Xu, Y.; Yan, T.; et al. A Comparison of LASSO Regression and Tree-Based Models for Delayed Cerebral Ischemia in Elderly Patients With Subarachnoid Hemorrhage. Front. Neurol. 2022, 13, 791547.
38. Ramos, L.A.; van der Steen, W.E.; Sales Barros, R.; Majoie, C.B.L.M.; van den Berg, R.; Verbaan, D.; Vandertop, W.P.; Zijlstra, I.J.A.J.; Zwinderman, A.H.; Strijkers, G.J.; et al. Machine learning improves prediction of delayed cerebral ischemia in patients with subarachnoid hemorrhage. J. Neurointerv. Surg. 2019, 11, 497–502.
39. de Toledo, P.; Rios, P.M.; Ledezma, A.; Sanchis, A.; Alen, J.F.; Lagares, A. Predicting the Outcome of Patients With Subarachnoid Hemorrhage Using Machine Learning Techniques. IEEE Trans. Inf. Technol. Biomed. 2009, 13, 794–801.
40. Yu, D.; Williams, G.W.; Aguilar, D.; Yamal, J.; Maroufy, V.; Wang, X.; Zhang, C.; Huang, Y.; Gu, Y.; Talebi, Y.; et al. Machine learning prediction of the adverse outcome for nontraumatic subarachnoid hemorrhage patients. Ann. Clin. Transl. Neurol. 2020, 7, 2178–2185.
41. Farooqi, H.A.; Safwan, Z.; Nabi, R. Advancing grading and outcome prediction in aneurysmal subarachnoid hemorrhage: Harnessing artificial intelligence and machine learning for precision healthcare. Neurosurg. Rev. 2024, 47, 326.
42. Nath, S.; Koziarz, A.; Badhiwala, J.H.; Almenawer, S.A. Predicting outcomes in aneurysmal subarachnoid haemorrhage. BMJ 2018, 360, k102.
43. Andersen, C.R.; Fitzgerald, E.; Delaney, A.; Finfer, S. A Systematic Review of Outcome Measures Employed in Aneurysmal Subarachnoid Hemorrhage (aSAH) Clinical Research. Neurocrit. Care 2019, 30, 534–541.
44. Jaja, B.N.R.; Cusimano, M.D.; Etminan, N.; Hanggi, D.; Hasan, D.; Ilodigwe, D.; Lantigua, H.; Le Roux, P.; Lo, B.; Louffat-Olivares, A.; et al. Clinical Prediction Models for Aneurysmal Subarachnoid Hemorrhage: A Systematic Review. Neurocrit. Care 2013, 18, 143–153.
45. Shu, L.; Yan, H.; Wu, Y.; Yan, T.; Yang, L.; Zhang, S.; Chen, Z.; Liao, Q.; Yang, L.; Xiao, B.; et al. Explainable machine learning in outcome prediction of high-grade aneurysmal subarachnoid hemorrhage. Aging 2024, 16, 4654–4669.
46. Cekic, E.; Pinar, E.; Pinar, M.; Dagcinar, A. Deep Learning-Assisted Segmentation and Classification of Brain Tumor Types on Magnetic Resonance and Surgical Microscope Images. World Neurosurg. 2023, 182, e196–e204.
Figure 1. Precision–recall and Receiver Operating Characteristic (ROC) curves for predicting mortality in subarachnoid hemorrhage patients. Using various feature sets, the curves illustrate the predictive performance of different machine learning models for first-month mortality outcomes. (A) shows the precision–recall curve for first-month mortality prediction using preoperative features alone. (B) presents the precision–recall curve for first-month mortality prediction incorporating both postoperative and preoperative features. (C) displays the ROC curve for first-month mortality prediction using only preoperative features. (D) highlights the ROC curve for first-month mortality prediction with the addition of postoperative features.
Figure 2. Precision–recall and Receiver Operating Characteristic (ROC) curves for predicting first-year mortality in subarachnoid hemorrhage patients. Using various feature sets, the curves illustrate the predictive performance of different machine learning models for first-year mortality outcomes. (A) shows the precision–recall curve for first-year mortality prediction using preoperative features alone. (B) presents the precision–recall curve for first-year mortality prediction incorporating both postoperative and preoperative features. (C) displays the ROC curve for first-year mortality prediction using only preoperative features, while (D) highlights the ROC curve for first-year mortality prediction by adding postoperative features.
Table 1. Characteristics of patients: demographics, clinical presentation, interventions, and outcomes.
| Variable | Value |
|---|---|
| Number of patients | 29,274 |
| Gender | |
| - Female | 14,002 (47.8) |
| - Male | 15,272 (52.2) |
| Age, mean ± SD | 56.23 ± 16.45 |
| Age, median [IQR] | 57 [46–68] |
| Time from initial emergency visit to CT scan, minutes | 12 [4–114] |
| Time from initial emergency visit to diagnosis, hours | 3 [0–23] |
| Intubation in the first 24 h | 10,017 (34.2) |
| EVD in the first 72 h | 2840 (9.7) |
| Lab values in the initial emergency visit | |
| - Hemoglobin (g/dL) | 13 [11.4–14.4] |
| - RBC (×10³ cells/μL) | 4.5 [4–4.9] |
| - WBC (×10³ cells/μL) | 12.01 [9.2–15.6] |
| - Platelets (×10³ cells/μL) | 230 [184–282] |
| - Lymphocyte (×10³ cells/μL) | 4.3 [1.4–10.5] |
| - Neutrophil (×10³ cells/μL) | 22.34 [10–82.9] |
| - Sodium (mmol/L) | 138.8 [136–141] |
| - Glucose (mg/dL) | 131 [105–167] |
| - Lactate (mmol/L) | 1.91 [1.3–3.2] |
| Comorbidities | |
| - Type 2 Diabetes Mellitus (DM) | 7460 (25.5) |
| - Hyperlipidemia | 6968 (23.8) |
| - Atherosclerosis | 356 (1.2) |
| - Hypertension | 16,313 (55.7) |
| - Acute ischemic heart disease | 2994 (10.2) |
| - Chronic ischemic heart disease | 7845 (26.8) |
| - Cerebrovascular disease | 5565 (19) |
| - Peripheral artery disease | 2563 (8.8) |
| - Renal failure | 1402 (4.8) |
| - Malignancy | 1118 (3.8) |
| - Inflammatory disease | 1110 (3.8) |
| - Rheumatic heart diseases | 640 (2.2) |
| - Liver disease | 836 (2.9) |
| - Chronic obstructive pulmonary disease (COPD) | 3529 (12.1) |
| - Other aneurysms | 682 (2.3) |
| - Obesity | 553 (1.9) |
| - Pregnancy | 788 (2.7) |
| - Neurocutaneous disorders | 8 (0.02) |
| - Coagulation disorders | 554 (1.9) |
| Primary intervention for aneurysm | |
| - Yes | 11,068 (37.8) |
| - - Clipping | 6643 (22.7) |
| - - Coiling | 4425 (15.1) |
| - No | 18,206 (62.2) |
| Time from initial emergency visit to intervention for aneurysm, days | 1.95 [0.75–4.75] |
| - Clipping | 1.95 [0.77–4.4] |
| - Coiling | 2.07 [0.8–5.8] |
| Facilities in which intervention for aneurysm was performed | |
| - Government-owned hospitals | 5461 (49.3) |
| - University hospitals | 3396 (30.7) |
| - Private hospitals | 1674 (15.1) |
| - Private university hospitals | 537 (4.6) |
| Interventions for SAH complications | |
| - Yes | 8323 (28.4) |
| - - Decompressive craniectomy | 648 (2.2) |
| - - Epidural hematoma evacuation | 125 (0.4) |
| - - Subdural hematoma evacuation | 938 (3.2) |
| - - Intracerebral hemorrhage evacuation | 1233 (4.2) |
| - - CSF drainage | |
| - - - EVD and/or ELD placement | 3090 (10.6) |
| - - - VPS placement | 345 (1.2) |
| - - Tracheostomy | 747 (2.6) |
| - - PEG placement | 522 (1.8) |
| - No | 20,951 (71.6) |
| Length of hospitalization, days, mean ± SD | 19.01 ± 15.94 |
| Length of hospitalization, days, median [IQR] | 15.9 [8.6–24.8] |
| Emergency revisit within 90 days of discharge | 5100 (17.4) |
| Death within 7 days | 2757 (9.4) |
| Death within 30 days | 6668 (22.8) |
| Death within the first year | 9737 (33.3) |
| Post-treatment complications | |
| - Acute respiratory distress syndrome (ARDS) | 331 (1.1) |
| - Respiratory failure | 4502 (15.4) |
| - Acute ischemic heart disease | 1781 (6.1) |
| - Sepsis | 1380 (4.7) |
| - Meningitis | 294 (1) |
| - Encephalitis | 48 (0.2) |
| - Intracranial and intraspinal abscess | 69 (0.2) |
| - Urinary tract infection (UTI) | 3691 (12.6) |
| - Epilepsy | 8220 (28.1) |
| - Hydrocephalus | 960 (3.3) |
| - Cerebral edema | 1937 (6.6) |
| - Pulmonary thromboembolism (PTE) | 506 (1.7) |
| - Deep vein thrombosis (DVT) | 1137 (3.9) |
| - Pneumonia | 4914 (16.8) |
| - Paralysis | 3148 (10.8) |
| - Status epilepticus | 216 (0.7) |
| - Decubitus ulcer | 1382 (4.7) |
| - Cerebral Ischemia | 1191 (4.1) |
Table 2. Machine learning prediction results for first-month mortality in SAH patients.
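| Input Features | Sample | Metric | Logistic Regression | Decision Tree | Random Forest | Artificial Neural Network |
|---|---|---|---|---|---|---|
| Pre-admission | Training | AUC | 0.845 | 0.861 | 0.882 | 0.855 |
| Pre-admission | Training | Average Precision | 0.654 | 0.692 | 0.750 | 0.673 |
| Pre-admission | Training | Accuracy | 0.830 | 0.842 | 0.852 | 0.834 |
| Pre-admission | Test | AUC | 0.849 | 0.835 | 0.855 | 0.850 |
| Pre-admission | Test | Average Precision | 0.651 | 0.636 | 0.667 | 0.649 |
| Pre-admission | Test | Accuracy | 0.832 | 0.826 | 0.835 | 0.829 |
| Pre-admission + Post-admission | Training | AUC | 0.940 | 0.937 | 0.942 | 0.952 |
| Pre-admission + Post-admission | Training | Average Precision | 0.825 | 0.797 | 0.840 | 0.858 |
| Pre-admission + Post-admission | Training | Accuracy | 0.895 | 0.886 | 0.892 | 0.906 |
| Pre-admission + Post-admission | Test | AUC | 0.942 | 0.916 | 0.931 | 0.946 |
| Pre-admission + Post-admission | Test | Average Precision | 0.835 | 0.755 | 0.813 | 0.844 |
| Pre-admission + Post-admission | Test | Accuracy | 0.901 | 0.885 | 0.893 | 0.905 |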
Table 3. Machine learning prediction results for first-year mortality in SAH patients.
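| Input Features | Sample | Metric | Logistic Regression | Decision Tree | Random Forest | Artificial Neural Network |
|---|---|---|---|---|---|---|
| Pre-admission | Training | AUC | 0.837 | 0.853 | 0.863 | 0.846 |
| Pre-admission | Training | Average Precision | 0.744 | 0.772 | 0.796 | 0.755 |
| Pre-admission | Training | Accuracy | 0.786 | 0.801 | 0.807 | 0.793 |
| Pre-admission | Test | AUC | 0.831 | 0.825 | 0.839 | 0.835 |
| Pre-admission | Test | Average Precision | 0.733 | 0.717 | 0.747 | 0.733 |
| Pre-admission | Test | Accuracy | 0.782 | 0.777 | 0.789 | 0.780 |
| Pre-admission + Post-admission | Training | AUC | 0.929 | 0.933 | 0.939 | 0.949 |
| Pre-admission + Post-admission | Training | Average Precision | 0.885 | 0.889 | 0.898 | 0.914 |
| Pre-admission + Post-admission | Training | Accuracy | 0.874 | 0.864 | 0.865 | 0.893 |
| Pre-admission + Post-admission | Test | AUC | 0.927 | 0.907 | 0.926 | 0.941 |
| Pre-admission + Post-admission | Test | Average Precision | 0.881 | 0.848 | 0.877 | 0.898 |
| Pre-admission + Post-admission | Test | Accuracy | 0.875 | 0.851 | 0.861 | 0.884 |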
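The effect of adding post-admission features can be read directly from the test-set AUC rows of Tables 2 and 3. The sketch below simply redoes that arithmetic. The AUC values are transcribed from the two tables, while the dictionary layout and function names are illustrative assumptions.

```python
# Test-set AUC values transcribed from Table 2 (first month) and Table 3 (first year).
TEST_AUC = {
    "first_month": {
        "pre": {"LR": 0.849, "DT": 0.835, "RF": 0.855, "ANN": 0.850},
        "pre_post": {"LR": 0.942, "DT": 0.916, "RF": 0.931, "ANN": 0.946},
    },
    "first_year": {
        "pre": {"LR": 0.831, "DT": 0.825, "RF": 0.839, "ANN": 0.835},
        "pre_post": {"LR": 0.927, "DT": 0.907, "RF": 0.926, "ANN": 0.941},
    },
}


def auc_gain(horizon: str) -> dict:
    """Gain in test AUC per model when post-admission features are added."""
    pre = TEST_AUC[horizon]["pre"]
    both = TEST_AUC[horizon]["pre_post"]
    return {model: round(both[model] - pre[model], 3) for model in pre}


def best_model(horizon: str, feature_set: str) -> str:
    """Model with the highest test AUC for a given horizon and feature set."""
    aucs = TEST_AUC[horizon][feature_set]
    return max(aucs, key=aucs.get)


for horizon in TEST_AUC:
    print(horizon, auc_gain(horizon), best_model(horizon, "pre_post"))
```

With the combined feature set, every model gains roughly 0.08 to 0.11 in test AUC, and the gap between LR and ANN stays small (0.004 for first-month and 0.014 for first-year mortality).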
Table 4. Results of significance analysis for key predictors.
| Predictor | First month: Estimate | First month: Std. Error | First month: z value | First month: p value | First year: Estimate | First year: Std. Error | First year: z value | First year: p value |
|---|---|---|---|---|---|---|---|---|
| Age | 2.350475 | 0.151 | 15.48 | <0.001 | 4.188651 | 0.143 | 29.097 | <0.001 |
| Cardiopulmonary Arrest | 4.138183 | 0.234 | 17.613 | <0.001 | 4.488835 | 0.328 | 13.684 | <0.001 |
| Endotracheal Intubation | −2.117409 | 0.234 | −9.026 | <0.001 | −2.797108 | 0.328 | −8.524 | <0.001 |
| EVD (First 72 h) | −0.10164 | 0.116 | −0.87 | | −0.452086 | 0.107 | −4.187 | <0.001 |
| Glucose | 5.339468 | 0.599 | 8.913 | <0.001 | 6.429198 | 0.582 | 11.036 | <0.001 |
| Lactate | 5.014475 | 1.330 | 3.77 | <0.001 | 4.807589 | 1.233 | 3.896 | <0.001 |
| Neutrophil | 1.006599 | 0.325 | 3.093 | <0.01 | 0.228911 | 0.305 | 0.748 | |
| Basophil | 3.633494 | 1.309 | 2.774 | <0.01 | 2.950229 | 1.216 | 2.425 | <0.05 |
| White Blood Cell Count | 7.284652 | 0.523 | 13.917 | <0.001 | 6.646928 | 0.510 | 13.023 | <0.001 |
| Eosinophil | −3.618516 | 1.089 | −3.321 | <0.001 | −2.604965 | 0.879 | −2.961 | <0.01 |
| Hematocrit | −0.351574 | 0.129 | −2.719 | <0.01 | −0.339932 | 0.117 | −2.901 | <0.01 |
| Erythrocyte | 0.246341 | 0.481 | 0.511 | | −1.133348 | 0.436 | −2.598 | <0.01 |
| Mean Platelet Volume | 0.528842 | 0.239 | 2.209 | <0.05 | 0.635336 | 0.221 | 2.872 | <0.01 |
| Platelet Distribution Width | 1.124736 | 0.248 | 4.528 | <0.001 | 0.835278 | 0.225 | 3.710 | <0.001 |
| Platelet | −0.409396 | 0.168 | −2.424 | <0.05 | −0.413981 | 0.153 | −2.705 | <0.01 |
| Platelet to WBC Ratio | −5.862795 | 0.908 | −6.452 | <0.001 | −1.629248 | 0.736 | −2.213 | <0.05 |
| Pre-existing Hypertension | 0.290937 | 0.053 | 5.415 | <0.001 | 0.31136 | 0.047 | 6.528 | <0.001 |
| Pre-existing Chronic Heart Disease | 0.227944 | 0.065 | 3.496 | <0.001 | 0.408031 | 0.058 | 6.960 | <0.001 |
| Pre-existing Stroke | 0.089569 | 0.063 | 1.409 | | 0.327823 | 0.056 | 5.841 | <0.001 |
| Pre-existing Pulmonary Arterial Hypertension | 0.188741 | 0.081 | 2.313 | <0.05 | 0.20692 | 0.072 | 2.844 | <0.01 |
| Pre-existing Chronic Kidney Disease | 0.583693 | 0.10 | 5.832 | <0.001 | 0.627275 | 0.092 | 6.812 | <0.001 |
| Pre-existing Malignancy | 0.124178 | 0.110 | 1.123 | | 0.831856 | 0.095 | 8.754 | <0.001 |
| Pre-existing Rheumatic Vascular Disease | 0.574583 | 0.146 | 3.935 | <0.001 | 0.212816 | 0.133 | 1.596 | |
| Pre-existing Chronic Liver Failure | 0.284246 | 0.088 | 3.212 | <0.01 | 0.558313 | 0.081 | 6.883 | <0.001 |
| Pre-existing Epilepsy | 0.272037 | 0.118 | 2.301 | <0.05 | 0.736211 | 0.098 | 7.444 | <0.001 |
| Pre-existing Hemiplegia/Paraplegia | 0.769522 | 0.170 | 4.51 | <0.001 | 0.829921 | 0.146 | 5.677 | <0.001 |
| Institution Type | −3.231709 | 0.433 | −7.462 | <0.001 | −3.539716 | 0.630 | −5.618 | <0.001 |
| Procedure-related Complications | −0.005564 | 0.143 | −0.039 | | 1.231947 | 0.114 | 10.728 | <0.001 |
| Decompressive Craniectomy | 0.941764 | 0.174 | 5.404 | <0.001 | 0.446239 | 0.157 | 2.839 | <0.01 |
| Subdural Hematoma | −0.097749 | 0.180 | −0.54 | | −0.704535 | 0.149 | −4.727 | <0.001 |
| Intracerebral Hemorrhage | 0.598519 | 0.159 | 3.741 | <0.001 | 0.064577 | 0.137 | 0.469 | |
| Tracheostomy | −1.026597 | 0.137 | −7.454 | <0.001 | 0.510161 | 0.114 | 4.447 | <0.001 |
| Percutaneous Endoscopic Gastrostomy | −2.665337 | 0.257 | −10.352 | <0.001 | 0.283103 | 0.129 | 2.190 | <0.05 |
| EVD (anytime during the hospitalization) | 0.726111 | 0.181 | 3.99 | <0.001 | 0.470255 | 0.161 | 2.916 | <0.01 |
| Postoperative Sepsis | −0.099639 | 0.106 | −0.932 | | 1.04247 | 0.088 | 11.825 | <0.001 |
| Postoperative Epilepsy | −2.163765 | 0.081 | −26.651 | <0.001 | −1.755134 | 0.054 | −32.30 | <0.001 |
| Postoperative Hemiplegia/Paraplegia | −1.732357 | 0.141 | −12.216 | <0.001 | −1.111446 | 0.077 | −14.29 | <0.001 |
| Postoperative Hydrocephalus | −0.821742 | 0.169 | −4.856 | <0.001 | −0.388568 | 0.115 | −3.351 | <0.001 |
| Postoperative Brain Edema | 0.345695 | 0.080 | 4.286 | <0.001 | 0.422068 | 0.075 | 5.598 | <0.001 |
| Postoperative Pulmonary Thromboembolism | −0.833607 | 0.264 | −3.155 | <0.01 | −0.040277 | 0.157 | −0.256 | |
| Postoperative Deep Vein Thrombosis | −1.830364 | 0.277 | −6.607 | <0.001 | −0.836692 | 0.128 | −6.490 | <0.001 |
| Postoperative Stroke | −0.662205 | 0.044 | −15.043 | <0.001 | −0.572621 | 0.040 | −14.004 | <0.001 |
| Postoperative Pneumonia | −1.23275 | 0.051 | −24.065 | <0.001 | −0.979274 | 0.042 | −22.84 | <0.001 |
| Postoperative Urinary Tract Infection (UTI) | −1.681855 | 0.101 | −16.526 | <0.001 | −1.311541 | 0.062 | −20.96 | <0.001 |
| Postoperative Decubitus Ulcers | −2.255048 | 0.221 | −10.172 | <0.001 | 0.421185 | 0.092 | 4.552 | <0.001 |
| Postoperative Respiratory Failure | 0.479471 | 0.059 | 8.024 | <0.001 | 0.930121 | 0.053 | 17.294 | <0.001 |
| Postoperative Chronic Heart Disease | −1.715899 | 0.081 | −20.984 | <0.001 | −1.838105 | 0.062 | −29.58 | <0.001 |
| 30-day Emergency Re-admission After Discharge | −1.259397 | 0.123 | −10.163 | <0.001 | 0.31285 | 0.106 | 2.925 | <0.01 |
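Each row of Table 4 lists an estimate, its standard error, a test statistic, and a significance bracket. The sketch below shows how these quantities relate, assuming the test statistic is a Wald z value (estimate divided by standard error) with a two-sided normal p-value. The reported numbers are consistent with this up to rounding, but the source does not state it explicitly, and the function names are illustrative.

```python
from math import erfc, sqrt


def wald_z(estimate: float, std_error: float) -> float:
    """Wald statistic: coefficient estimate divided by its standard error."""
    return estimate / std_error


def two_sided_p(z: float) -> float:
    """Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|))."""
    return erfc(abs(z) / sqrt(2.0))


def significance_label(p: float) -> str:
    """Bracket a p-value the way Table 4 does; blank means p >= 0.05."""
    if p < 0.001:
        return "<0.001"
    if p < 0.01:
        return "<0.01"
    if p < 0.05:
        return "<0.05"
    return ""


# Age, first-month model: the rounded standard error gives z of about 15.57,
# close to the reported 15.48 (which was presumably computed before rounding).
print(wald_z(2.350475, 0.151))

# Reported z values from Table 4 mapped back to significance brackets.
for name, z in [("EVD (First 72 h)", -0.87),
                ("Mean Platelet Volume", 2.209),
                ("Neutrophil", 3.093),
                ("Age", 15.48)]:
    print(name, significance_label(two_sided_p(z)))
```

A blank p-value cell in the table therefore corresponds to a z value whose two-sided p-value is at least 0.05, as with EVD in the first 72 h for first-month mortality.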

Khaniyev, T.; Cekic, E.; Gecici, N.N.; Can, S.; Ata, N.; Ulgu, M.M.; Birinci, S.; Isikay, A.I.; Bakir, A.; Arat, A.; et al. Predicting Mortality in Subarachnoid Hemorrhage Patients Using Big Data and Machine Learning: A Nationwide Study in Türkiye. J. Clin. Med. 2025, 14, 1144. https://doi.org/10.3390/jcm14041144