Interpretable Machine Learning Model Predicting Early Neurological Deterioration in Ischemic Stroke Patients Treated with Mechanical Thrombectomy: A Retrospective Study

Yang, Tongtong; Hu, Yixing; Pan, Xiding; Lou, Sheng; Zou, Jianjun; Deng, Qiwen; Zhang, Qingxiu; Zhou, Junshan; Zhu, Junrong

doi:10.3390/brainsci13040557

Open AccessArticle

Interpretable Machine Learning Model Predicting Early Neurological Deterioration in Ischemic Stroke Patients Treated with Mechanical Thrombectomy: A Retrospective Study

by

Tongtong Yang

^1,†,

Yixing Hu

^1,†,

Xiding Pan

¹,

Sheng Lou

¹,

Jianjun Zou

¹,

Qiwen Deng

²

,

Qingxiu Zhang

³,

Junshan Zhou

^2,* and

Junrong Zhu

^1,*

¹

Department of Pharmacy, Nanjing First Hospital, China Pharmaceutical University, Nanjing 210009, China

²

Department of Neurology, Nanjing Hospital Affiliated to Nanjing Medical University, Nanjing 210009, China

³

Department of Neurology, Drum Tower Hospital, Nanjing University, Nanjing 210009, China

^*

Authors to whom correspondence should be addressed.

^†

These authors contributed equally to this work.

Brain Sci. 2023, 13(4), 557; https://doi.org/10.3390/brainsci13040557

Submission received: 24 February 2023 / Revised: 16 March 2023 / Accepted: 23 March 2023 / Published: 26 March 2023

(This article belongs to the Special Issue Management of Cerebrovascular Disease and Its Complications: Advances and Challenges)

Download

Browse Figures

Versions Notes

Abstract

:

Early neurologic deterioration (END) is a common and feared complication for acute ischemic stroke (AIS) patients treated with mechanical thrombectomy (MT). This study aimed to develop an interpretable machine learning (ML) model for individualized prediction to predict END in AIS patients treated with MT. The retrospective cohort of AIS patients who underwent MT was from two hospitals. ML methods applied include logistic regression (LR), random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGBoost). The area under the receiver operating characteristic curve (AUC) was the main evaluation metric used. We also used Shapley Additive Explanation (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME) to interpret the result of the prediction model. A total of 985 patients were enrolled in this study, and the development of END was noted in 157 patients (15.9%). Among the used models, XGBoost had the highest prediction power (AUC = 0.826, 95% CI 0.781–0.871). The Delong test and calibration curve indicated that XGBoost significantly surpassed those of the other models in prediction. In addition, the AUC in the validating set was 0.846, which showed a good performance of the XGBoost. The SHAP method revealed that blood glucose was the most important predictor variable. The constructed interpretable ML model can be used to predict the risk probability of END after MT in AIS patients. It may help clinical decision making in the perioperative period of AIS patients treated with MT.

Keywords:

machine learning; acute ischemic stroke; early neurological deterioration; mechanical thrombectomy

1. Introduction

Recent clinical trials have shown that mechanical thrombectomy (MT) became the first-line standard treatment for acute ischemic stroke (AIS) patients caused by large vessel occlusion (LVO) [1,2,3,4,5]. However, AIS patients are susceptible to a common and feared complication of early neurologic deterioration (END) after MT [6], which is consistently associated with three-month unfavorable outcomes [7,8,9]. Therefore, it is desirable to predict the risk probability of END in AIS patients after MT and assist doctors in making more accurate treatment decisions for AIS patients.

Over the years, there have been many studies concerning exploring the predictors of END after MT in AIS patients, and some predictors have been found indeed, such as the blood glucose [10,11] baseline stroke severity [12], AIS patients with atrial fibrillation [13], diabetes mellitus [14,15], and so on. There are also several studies on predictive models for END after MT in AIS patients [16,17,18]. To some extent, these prediction models can help clinicians make more accurate treatment decisions for patients. However, these prediction models are based on traditional regression algorithms, which have certain limitations in solving nonlinear problems between various prognostic factors with outcomes. In addition, the variable selection based on the traditional statistical technique might obtain false positive variables or exclude false negative variables, which may debase predictive power. For overcoming the above problem, an alternative and effective technique is necessary for constructing precise prediction models.

Machine learning (ML) has emerged as a promising predictive tool in medicine and has been applied in many medical fields, such as ischemic stroke outcome prediction [19,20,21,22], biomedical research [23] and rehabilitation for chronic stroke survivors [24], and so on. Because in the process of modeling, machine learning can fit the complex relationship in multi-dimensional data, extract subtle information, and automatically summarize and generalize to obtain new knowledge. Machine learning has been limited in the medical field due to its black-box nature [25,26]. However, the latest machine learning models have been made interpretable by Shapley Additive Explanations (SHAP), which prompts the application of ML to be further developed [19,20]. The SHAP is a novel, cutting-edge machine learning algorithm which can visualize the relationship between each feature and the related predictive ability and can more intuitively understand the importance of features and enhance clinical interpretability.

We aimed to develop an interpretable machine learning model using patient preoperative and intraoperative relevant variables and demographics. The model was used to predict the probability of postoperative neurological deterioration and assist doctors to make more accurate treatment decisions for patients.

2. Methods

2.1. Study Population

This retrospective study was based on the clinical data of the patients with LVO who were treated with MT at the National Advanced Stroke Center of Nanjing First Hospital (China), from July 2015 to December 2021. We also used clinical data of the patients between January 2019 and December 2021 from Nanjing Drum Tower Hospital (China) for external validation to optimize the model. Inclusion criteria for this study can be listed as follows: (I) patients aged older than 18 years; (II) treated with MT; (III) anterior or posterior circulation large vessel occlusion state was verified by digital subtracted angiography, magnetic resonance angiography, or computed tomographic angiography. We excluded patients with uncertain 24 h neurological deterioration and lack of NIHSS score on admission. We also excluded patients with lacking vital data (i.e., age, sex and vascular recanalization status) and existing extreme outliers.

2.2. Patient Variables and Data Definitions

Patient information that was available for analyses included demographic characteristics, previous anti-thrombosis treatment, risk factors, past ischemic events, course data of stroke in 24 h, evaluation of the situation of stroke on admission with National Institutes of Health Stroke Scale [NIHSS] score, premorbid modified Rankin Scale [mRS] score, and laboratory findings (Table 1).

The symptomatic intracranial hemorrhage (sICH) happened within 24 h of admission, according to the Heidelberg Bleeding Classification [27]. Vascular recanalization was evaluated by modified Thrombolysis in Cerebral Infarction [mTICI] ≥ 2b [28]. The study was conducted in accordance with the Nanjing First Hospital, Nanjing Drum Tower Hospital, and approved by the document number of the ethics committee’s approval: ChiCTR-OCH-14004382. The outcome was END. END was defined as the NIHSS score with an increase of at least 4 points from baseline to 24 h of the stroke event [29].

2.3. Statistical Analysis

Categorical variables were expressed as numbers (percentages) and continuous variables were expressed as medians (quartile). Comparisons of the baseline characteristics between END and without END used the Student t-test or Mann–Whitney U test for continuous variables relying on the sample normality of distribution and Pearson’s test or Fisher’s exact test for categorical characteristics relying on the sample amount.

2.4. Data Processing and Feature Selection

Variables for which less than 20% of data were missing were included in the study, and missing variables were computed by using the median for continuous variables or the most common value for categorical variables, as the case may be. Before we started the modeling process, the dataset was split into training set (80%) and testing set (20%) with a stratified random sampling method. The data from Nanjing Drum Tower Hospital was taken as an external validation set. The training set was applied to the step of feature selection and training models. To overcome the imbalance of the training set, we applied stratified random sampling to generate positive cases, which may effectively prevent the overfitting problem. The testing set was used for evaluating model performance during and after training, while the validation set was used for evaluating the generalization of the model.

Redundant and extraneous factors would draw too much detail and noise, which may be prone to form a model over-fitting and decrease the predictive ability of the model, respectively. Therefore, we utilized the Least Absolute Selection and Shrinkage Operator (LASSO) [30] to exclude non-interference variables and developed ML models. It has the advantage of combining packaging method with machine learning algorithm and also has the advantage of high computational efficiency of filtering method. In brief, the LASSO is a regularized regression technique commonly used for reducing a high-dimensional feature space. Its principle is to shrink some coefficients to exactly zero and thus perform variable selection.

2.5. Modeling Strategies

Modeling processes were shown in Figure 1. The END ML model was built with data from the training set. We initially attempted to use four ML algorithms for constructing models, and these four models were named logistic regression (LR), random forest (RF), extreme gradient boosting (XGBoost), and support vector machine (SVM). We used 10-fold cross-validation technique to improve generality, and grid search algorithm was used for tuning hyperparameters for each model.

2.6. Model Evaluation

Model performance was assessed by area under the receiver operating characteristic curve (AUC) using the independent testing set. The Delong test and calibration curves were used for comparing the ROC curves in different models, which could identify the optimal model. Calibration curves of the models were evaluated by using the Brier score method (range: 0–1), with lower scores reflecting better model calibration, which showed the agreement between the model’s predicted values and the cohort’s observed outcome [31]. We also input the validation set to assess the generalization of model. The sensitivity, specificity, accuracy (ACC), and Youden index were also analyzed.

2.7. Explanation of the Model

Although the above algorithms are robust and well-performing ML methods that have been very popular in the medical field, it is difficult to interpret and display black-box characters. Thus, based on the training set, we introduced the SHAP to our prediction model to make up for the issue of black-box character, which can obtain analyses of the features and make patient-specific predictions rationally [32]. The SHAP was inspired by cooperative game theory, which could visually represent the importance ranking of features and calculate each feature Shapley values in the prediction model [33]. In addition, to achieve a better interpretation of the prediction model in individual patients, we also introduced Local Interpretable Model-Agnostic Explanations (LIME) to exhibit the impact of vital variables at the level of the individual. Briefly, based on a local linear model, LIME represents a detailed interpretation of a classifier by approximating weights to the disturbance input. Thus, we used LIME to explain two specific instances in the best predictive behavior of the model. Python version 3.7 was used in this present study and related packages include XGBoost, Shap, and Scikit-learn environment.

3. Results

3.1. Study Population

Table 1 describes the characteristics of the study population from Nanjing First Hospital. A total of 985 AIS patients with LVO and 36 features were taken into the study, comprising 157 who developed END during hospitalization and 828 without END. The univariate analysis revealed that coronary artery disease, sICH, Stent retriever only, Stent retriever/aspiration with rescue therapy, blood glucose, and glycated hemoglobin were associated with the risk of END. Supplementary Table S1 describes the demographics and clinical characteristics of the external validation set: there are 56 positive cases and 177 negative cases.

3.2. Feature Selection

The data set was randomly split into the training set (n = 690) and the testing set (n = 295). In the LASSO algorithm, 9 variables, namely, blood glucose, NIHSS at baseline, interval from groin puncture to recanalization, serum creatinine, interval from onset to treatment, systolic blood pressure, diastolic blood pressure, platelets, and uric acid were selected as crucial variables, which were determined by the result of LASSO regularization process based on 690 patients in the training set.

3.3. Model Building and Evaluation

We applied the following ML algorithm with 9 crucial variables as input variables, including LR, XGBoost, SVM, RFC, and DNN to predict the risk of END. Figure 2 shows the ROC of each model on the training set and testing set. Table 2 shows the evaluation indexes of each model on the testing set, including AUC, sensitivity, specificity, accuracy, and the Youden index. Overall, among the four models, XGBoost had the highest prediction power (AUC = 0.826, 95% CI 0.781–0.871), whereas SVM showed the poorest prediction performance (AUC = 0.643, 95% CI 0.584–0.702). The Delong test was used for comparing the performance of the 4 models on the testing set, and p < 0.05 were considered statistically significant. Supplementary Table S2 shows that RF and XGBoost significantly surpassed those of the other models in prediction. We further determined the optimal model from the calibration curve; the plot showed that the XGBoost had the lowest brier score, which means the model has the best calibration (Figure 3). Therefore, the XGBoost was selected to be the optimal model.

As shown in Table 3, we validated the XGBoost model in an external validation cohort with 233 patients, and the AUC in the validating set was 0.846. We further calculated the confusion matrix, the sensitivity, specificity, and overall accuracy were 0.750, 0.836, and 0.815, which means that among the 56 cases with END, 46 cases were correctly predicted by the model, while among the 177 cases without END, 148 cases were correctly predicted by the model.

3.4. Explanation of the Model at the Feature Level

The SHAP algorithm has been applied to the XGBoost model to obtain each variable’s importance for the END prediction. The importance distribution for each variable in descending order was plotted in Figure 4A. Blood glucose was the strongest predictor of END, followed closely by NIHSS at baseline, interval from groin puncture to recanalization, serum creatinine, and interval from onset to treatment. In addition, in order to distinguish the relationship between the target result and positive and negative predictors, SHAP values were used for uncovering the END risk factors. As shown in Figure 4B, each row represents a feature, and each dot represents a sample; the redder the color, the greater the value of the feature; the bluer the color, the smaller the value of the feature. What can be found is that increases in the concentration of blood glucose are a positive impact and are more likely to develop END, whereas increases in NIHSS at baseline have a negative impact and is less likely to develop END.

3.5. Explanation of the Model at the Individual Level

We next used LIME to analyze the contribution level of features of new instances for the prediction of END in LVO patients. As illustrated in Figure 5, the true values of the nine main features (right), the overall predicted probability of END and No-END (left), and the classification details (middle) of the two instances were exhibited in the LIME plot. For instance, in patient 23, the predicted probability for END was high (0.70) due to the number of positive conditions, consisting of a high concentration of blood glucose, low NIHSS at baseline, high systolic blood pressure and high platelets. In contrast, the END probability in patient 100 was low (0.29) due to few positive conditions, only high serum creatinine.

4. Discussion

As far as we know, this study is the first attempt to use the explainable machine learning model to predict the risk of END after MT for AIS patients. Moreover, the results of the Delong test showed that our model had good discrimination. Additionally, the prediction accuracy of our model was high. Among the 4 ML prediction models, XGBoost had the best prediction effect, with an AUC as high as 0.826. The XGBoost algorithm can control the complexity of the model by adding regular terms, which is more conducive to preventing overfitting. Furthermore, it can better deal with classified and multi-dimensional data sets and improve the computational power and generalization ability of the model, which makes it suitable for clinical application. We also used the data of Nanjing Drum Tower Hospital to externally validate our prediction model, and the AUC value was as high as 0.846, indicating that our model has a high potential for prediction of END in a wider range of the Chinese population at least. We hope that more data can be used to further validate our model in the future. In our prediction model, some variables associated with END were not previously highlighted, such as the interval from onset to treatment, platelet, uric acid, and so on. In our interpretable ML model, the variables most associated with END were visualized, which helps to more intuitively understand the risk factors associated with END.

Our study is consistent with previous findings and validates the value of the ML model in predicting END after MT in AIS patients. The fasting blood glucose level [34,35,36,37] and the baseline NIHSS score [35,37] have been reported as risk factors for END. These risk factors have been reported in previous studies. Following previous work has shown that it may be related to vascular endothelial dysfunction caused by impaired blood glucose control [38]. Recent studies have shown that there is a close link between pre-existing hyperglycemia and increased cerebral ischemia/reperfusion injury in the field of stroke. In addition, previous studies have shown that clinical features alone that are observed at stroke onset can help to distinguish cardioembolic from atherothrombotic infarction [39]. However, in our study, although the incidence of END in patients with atherosclerotic infarcts is higher than that in patients with cardioembolic stroke, this is not a statistically significant difference (p > 0.05). This may require us to expand the sample size and do further research in the future.

The thromboinflammation will be further developed mainly due to the high blood glucose, which creates a harmful environment in the body [40]. The study by Girot et al. in the field of END showed that a low NIHSS score on admission was an independent risk factor [17]. Our findings are consistent with Girot et al. At first glance, this result may be surprising. It is possible that those with higher NIHSS scores are less prone to increase their score compared to those with lower NIHSS scores. It may be that patients with lower NIHSS scores are prone to vascular injuries in the acute phase and hypoperfusion areas that lead to infarction progression. Rohini et al. [18] showed that higher SBP is highly associated with END, which is consistent with our findings. It could be explained that patients with high SBP may have more severe strokes. In addition, previous studies have shown that elevated baseline SBP in AIS patients may be associated with death and dependence [41]. In our study, the time from groin puncture to revascularization and the time from symptom onset to treatment were also highly correlated with END. The time from onset to revascularization was an independent predictor of END after EVT for acute basilar artery occlusion, which was reported by Zhong et al. [42].

The complexity of ML models makes it difficult to determine the reasons behind their predictions, which may hinder clinical application. However, in this study, the SHAP algorithm was used to interpret the prediction at different levels, which ensured the performance and clinical interpretability of the model and was demonstrated to the user through friendly visualization tools. Clinicians will have a better understanding of the decision-making process of the model, which is conducive to the clinical use of the prediction results. Furthermore, it is demonstrated in our study that interpretable machine learning methods are capable of predicting END and individualizing predictions in the context of patients. Previous studies on END mainly focused on its pathophysiological explanation and each risk factor, which lacked the combined usage of large samples in clinical practice [43,44,45]. Furthermore, there is no uniformly accepted risk stratification algorithm for predicting END. Therefore, the strength of our ML model is that relevant variables can be extracted from real-world clinical data to predict END.

Our study has some limitations. First, although we have data for external validation, this is retrospective data, and the data comes from one center. The retrospective data might have led to recall, and selection bias to a certain degree. Therefore, in order to promote the model, more data sets and prospective multicenter clinical trials are needed to verify our results and the accuracy of the model. Second, to retain more available data, we only considered medical records in which END occurred within 24 h after stroke, although END is also known to occur within days of the initial time. In addition, some anesthetic drugs were not completely metabolized in the patient, and the patient was still intubated, which would lead to a risk of bias in the NIHSS score. Third, the collateral status may affect END, but this variable was not collected. In addition, in the process of building the model, feature selection was only applied once on the combined training validation set without being incorporated into the cross-validation splits. Therefore, it is possible that some feature importance variabilities were unaccounted for. Finally, the participants are all from the Chinese population, so the results cannot be easily extrapolated to other groups.

5. Conclusions

In this study, the interpretable ML model constructed can be used to predict the risk probability of END after MT in AIS patients. In addition, the model was externally validated and achieved good predictive performance. However, the amount of data to validate the model is small, and more clinical data and prospective clinical studies are needed for further verification. Furthermore, the results from this study may guide clinical decision making in the selection and intraoperative and postoperative management of AIS patients who had undergone MT with a high risk of END.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/brainsci13040557/s1, Table S1: Demographics and clinical characteristics of external validation set; Table S2: The p-values of pairwise comparisons of AUCs on the testing set for different models with the Delong test.

Author Contributions

Conceptualization, T.Y. and Y.H.; methodology, T.Y. and Y.H.; software, X.P.; formal analysis, Q.D.; data curation, S.L. and J.Z. (Jianjun Zou); writing—original draft preparation, Y.H. and T.Y.; writing—review and editing, Q.Z.; visualization, J.Z. (Junshan Zhou); supervision, J.Z. (Junrong Zhu); project administration, J.Z. (Junshan Zhou) and J.Z. (Junrong Zhu). All authors have read and agreed to the published version of the manuscript.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Institutional Review Board Statement

This study was conducted in accordance with the Nanjing First Hospital, Nanjing Drum Tower Hospital, and approved by the document number of the ethics committee’s approval: ChiCTR-OCH-14004382.

Informed Consent Statement

Patient consent was waived because this is a retrospective study.

Data Availability Statement

The datasets used in this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

Saver, J.L.; Goyal, M.; Bonafe, A.; Diener, H.C.; Levy, E.I.; Pereira, V.M.; Albers, G.W.; Cognard, C.; Cohen, D.J.; Hacke, W.; et al. Stent-retriever thrombectomy after intravenous t-PA vs. t-PA alone in stroke. N. Engl. J. Med. 2015, 372, 2285–2295. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Powers, W.J.; Rabinstein, A.A.; Ackerson, T.; Adeoye, O.M.; Bambakidis, N.C.; Becker, K.; Biller, J.; Brown, M.; Demaerschalk, B.M.; Hoh, B.; et al. Guidelines for the Early Management of Patients With Acute Ischemic Stroke: 2019 Update to the 2018 Guidelines for the Early Management of Acute Ischemic Stroke: A Guideline for Healthcare Professionals From the American Heart Association/American Stroke Association. Stroke 2019, 50, e344–e418. [Google Scholar] [CrossRef] [PubMed]
Goyal, M.; Demchuk, A.M.; Menon, B.K.; Eesa, M.; Rempel, J.L.; Thornton, J.; Roy, D.; Jovin, T.G.; Willinsky, R.A.; Sapkota, B.L.; et al. Randomized assessment of rapid endovascular treatment of ischemic stroke. N. Engl. J. Med. 2015, 372, 1019–1030. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Campbell, B.C.; Mitchell, P.J.; Kleinig, T.J.; Dewey, H.M.; Churilov, L.; Yassi, N.; Yan, B.; Dowling, R.J.; Parsons, M.W.; Oxley, T.J.; et al. Endovascular therapy for ischemic stroke with perfusion-imaging selection. N. Engl. J. Med. 2015, 372, 1009–1018. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Berkhemer, O.A.; Fransen, P.S.; Beumer, D.; van den Berg, L.A.; Lingsma, H.F.; Yoo, A.J.; Schonewille, W.J.; Vos, J.A.; Nederkoorn, P.J.; Wermer, M.J.; et al. A randomized trial of intraarterial treatment for acute ischemic stroke. N. Engl. J. Med. 2015, 372, 11–20. [Google Scholar] [CrossRef] [Green Version]
Saver, J.L.; Altman, H. Relationship between neurologic deficit severity and final functional outcome shifts and strengthens during first hours after onset. Stroke 2012, 43, 1537–1541. [Google Scholar] [CrossRef] [Green Version]
Mori, M.; Naganuma, M.; Okada, Y.; Hasegawa, Y.; Shiokawa, Y.; Nakagawara, J.; Furui, E.; Kimura, K.; Yamagami, H.; Kario, K.; et al. Early neurological deterioration within 24 hours after intravenous rt-PA therapy for stroke patients: The Stroke Acute Management with Urgent Risk Factor Assessment and Improvement rt-PA Registry. Cerebrovasc. Dis. 2012, 34, 140–146. [Google Scholar] [CrossRef]
Dharmasaroja, P.A.; Muengtaweepongsa, S.; Dharmasaroja, P. Early outcome after intravenous thrombolysis in patients with acute ischemic stroke. Neurol. India 2011, 59, 351–354. [Google Scholar] [CrossRef]
Davalos, A.; Toni, D.; Iweins, F.; Lesaffre, E.; Bastianello, S.; Castillo, J. Neurological deterioration in acute ischemic stroke: Potential predictors and associated factors in the European cooperative acute stroke study (ECASS) I. Stroke 1999, 30, 2631–2636. [Google Scholar] [CrossRef] [Green Version]
Simonsen, C.Z.; Schmitz, M.L.; Madsen, M.H.; Mikkelsen, I.K.; Chandra, R.V.; Leslie-Mazwi, T.; Andersen, G. Early neurological deterioration after thrombolysis: Clinical and imaging predictors. Int. J. Stroke 2016, 11, 776–782. [Google Scholar] [CrossRef] [Green Version]
Bourcier, R.; Goyal, M.; Muir, K.W.; Desal, H.; Dippel, D.W.J.; Majoie, C.; van Zwam, W.H.; Jovin, T.G.; Mitchell, P.J.; Demchuk, A.M.; et al. Risk factors of unexplained early neurological deterioration after treatment for ischemic stroke due to large vessel occlusion: A post hoc analysis of the HERMES study. J. Neurointerv. Surg. 2022, 15, 221–226. [Google Scholar] [CrossRef]
Haeusler, K.G.; Gerischer, L.M.; Vatankhah, B.; Audebert, H.J.; Nolte, C.H. Impact of hospital admission during nonworking hours on patient outcomes after thrombolysis for stroke. Stroke 2011, 42, 2521–2525. [Google Scholar] [CrossRef]
Gong, P.; Zhang, X.; Gong, Y.; Liu, Y.; Wang, S.; Li, Z.; Chen, W.; Zhou, F.; Zhou, J.; Jiang, T.; et al. A novel nomogram to predict early neurological deterioration in patients with acute ischaemic stroke. Eur. J. Neurol. 2020, 27, 1996–2005. [Google Scholar] [CrossRef]
Tanaka, R.; Ueno, Y.; Miyamoto, N.; Yamashiro, K.; Tanaka, Y.; Shimura, H.; Hattori, N.; Urabe, T. Impact of diabetes and prediabetes on the short-term prognosis in patients with acute ischemic stroke. J. Neurol. Sci. 2013, 332, 45–50. [Google Scholar] [CrossRef]
Kim, B.J.; Park, J.M.; Kang, K.; Lee, S.J.; Ko, Y.; Kim, J.G.; Cha, J.K.; Kim, D.H.; Nah, H.W.; Han, M.K.; et al. ERRATUM: Table Correction: Case Characteristics, Hyperacute Treatment, and Outcome Information from the Clinical Research Center for Stroke-Fifth Division Registry in South Korea. J. Stroke 2015, 17, 377–378. [Google Scholar] [CrossRef] [Green Version]
Sun, D.; Tong, X.; Huo, X.; Jia, B.; Raynald; Wang, A.; Ma, G.; Ma, N.; Gao, F.; Mo, D.; et al. Unexplained early neurological deterioration after endovascular treatment for acute large vessel occlusion: Incidence, predictors, and clinical impact: Data from ANGEL-ACT registry. J. Neurointerv. Surg. 2022, 14, 875–880. [Google Scholar] [CrossRef]
Girot, J.B.; Richard, S.; Gariel, F.; Sibon, I.; Labreuche, J.; Kyheng, M.; Gory, B.; Dargazanli, C.; Maier, B.; Consoli, A.; et al. Predictors of Unexplained Early Neurological Deterioration After Endovascular Treatment for Acute Ischemic Stroke. Stroke 2020, 51, 2943–2950. [Google Scholar] [CrossRef]
Bhole, R.; Nouer, S.S.; Tolley, E.A.; Turk, A.; Siddiqui, A.H.; Alexandrov, A.V.; Arthur, A.S.; Mocco, J.; COMPASS Investigators. Predictors of early neurologic deterioration (END) following stroke thrombectomy. J. Neurointerv. Surg. 2022. [Google Scholar] [CrossRef]
Yao, Z.; Mao, C.; Ke, Z.; Xu, Y. An explainable machine learning model for predicting the outcome of ischemic stroke after mechanical thrombectomy. J. Neurointerv. Surg. 2022, jnis-2022-019598. [Google Scholar] [CrossRef]
Jabal, M.S.; Joly, O.; Kallmes, D.; Harston, G.; Rabinstein, A.; Huynh, T.; Brinjikji, W. Interpretable Machine Learning Modeling for Ischemic Stroke Outcome Prediction. Front. Neurol. 2022, 13, 884693. [Google Scholar] [CrossRef]
Mistry, E.A.; Yeatts, S.; de Havenon, A.; Mehta, T.; Arora, N.; De Los Rios La Rosa, F.; Starosciak, A.K.; Siegler, J.E., 3rd; Mistry, A.M.; Yaghi, S.; et al. Predicting 90-Day Outcome After Thrombectomy: Baseline-Adjusted 24-Hour NIHSS Is More Powerful Than NIHSS Score Change. Stroke 2021, 52, 2547–2553. [Google Scholar] [CrossRef] [PubMed]
Hu, Y.; Yang, T.; Zhang, J.; Wang, X.; Cui, X.; Chen, N.; Zhou, J.; Jiang, F.; Zhu, J.; Zou, J. Dynamic Prediction of Mechanical Thrombectomy Outcome for Acute Ischemic Stroke Patients Using Machine Learning. Brain Sci. 2022, 12, 938. [Google Scholar] [CrossRef]
Zorman, M.; Verlic, M. Explanatory approach for evaluation of machine learning-induced knowledge. J. Int. Med. Res. 2009, 37, 1543–1551. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Chae, S.H.; Kim, Y.; Lee, K.S.; Park, H.S. Development and Clinical Evaluation of a Web-Based Upper Limb Home Rehabilitation System Using a Smartwatch and Machine Learning Model for Chronic Stroke Survivors: Prospective Comparative Study. JMIR Mhealth Uhealth 2020, 8, e17216. [Google Scholar] [CrossRef] [PubMed]
Yu, M.K.; Ma, J.; Fisher, J.; Kreisberg, J.F.; Raphael, B.J.; Ideker, T. Visible Machine Learning for Biomedicine. Cell 2018, 173, 1562–1565. [Google Scholar] [CrossRef] [Green Version]
Tjoa, E.; Guan, C. A Survey on Explainable Artificial Intelligence (XAI): Toward Medical XAI. IEEE Trans. Neural. Netw. Learn Syst. 2021, 32, 4793–4813. [Google Scholar] [CrossRef]
von Kummer, R.; Broderick, J.P.; Campbell, B.C.; Demchuk, A.; Goyal, M.; Hill, M.D.; Treurniet, K.M.; Majoie, C.B.; Marquering, H.A.; Mazya, M.V.; et al. The Heidelberg Bleeding Classification: Classification of Bleeding Events After Ischemic Stroke and Reperfusion Therapy. Stroke 2015, 46, 2981–2986. [Google Scholar] [CrossRef] [Green Version]
Zhang, X.; Yuan, K.; Wang, H.; Gong, P.; Jiang, T.; Xie, Y.; Sheng, L.; Liu, D.; Liu, X.; Xu, G. Nomogram to Predict Mortality of Endovascular Thrombectomy for Ischemic Stroke Despite Successful Recanalization. J. Am. Heart Assoc. 2020, 9, e014899. [Google Scholar] [CrossRef]
Siegler, J.E.; Boehme, A.K.; Kumar, A.D.; Gillette, M.A.; Albright, K.C.; Martin-Schild, S. What change in the National Institutes of Health Stroke Scale should define neurologic deterioration in acute ischemic stroke? J. Stroke Cerebrovasc. Dis. 2013, 22, 675–682. [Google Scholar] [CrossRef] [Green Version]
Tibshirani, R. Regression shrinkage and selection via the lasso: A retrospective. J. R. Stat. Soc. Ser. B-Stat. Methodol. 2011, 73, 273–282. [Google Scholar] [CrossRef]
Jelovsek, J.E.; Hill, A.J.; Chagin, K.M.; Kattan, M.W.; Barber, M.D. Predicting Risk of Urinary Incontinence and Adverse Events After Midurethral Sling Surgery in Women. Obstet. Gynecol. 2016, 127, 330–340. [Google Scholar] [CrossRef] [PubMed]
Rodriguez-Perez, R.; Bajorath, J. Interpretation of Compound Activity Predictions from Complex Machine Learning Models Using Local Approximations and Shapley Values. J. Med. Chem. 2020, 63, 8761–8777. [Google Scholar] [CrossRef]
Thorsen-Meyer, H.C.; Nielsen, A.B.; Nielsen, A.P.; Kaas-Hansen, B.S.; Toft, P.; Schierbeck, J.; Strom, T.; Chmura, P.J.; Heimann, M.; Dybdahl, L.; et al. Dynamic and explainable machine learning prediction of mortality in patients in the intensive care unit: A retrospective study of high-frequency data in electronic patient records. Lancet Digit. Health 2020, 2, e179–e191. [Google Scholar] [CrossRef]
Yu, Q.; Mao, X.; Fu, Z.; Luo, S.; Huang, Q.; Chen, Q.; Li, S.; Zhang, J.; Qiu, Y.; Wu, Y.; et al. Fasting blood glucose as a predictor of progressive infarction in men with acute ischemic stroke. J. Int. Med. Res. 2022, 50, 3000605221132416. [Google Scholar] [CrossRef] [PubMed]
Seners, P.; Turc, G.; Oppenheim, C.; Baron, J.C. Incidence, causes and predictors of neurological deterioration occurring within 24 h following acute ischaemic stroke: A systematic review with pathophysiological implications. J. Neurol. Neurosurg. Psychiatry 2015, 86, 87–94. [Google Scholar] [CrossRef] [PubMed]
Kim, J.S.; Kim, R.Y.; Cha, J.K.; Rha, H.W.; Kang, M.J.; Kim, D.H.; Park, H.S.; Choi, J.H.; Huh, J.T.; Lee, I.K. Pre-stroke glycemic control is associated with early neurologic deterioration in acute atrial fibrillation-related ischemic stroke. eNeurologicalSci 2017, 8, 17–21. [Google Scholar] [CrossRef] [PubMed]
Duan, Z.; Guo, W.; Tang, T.; Tao, L.; Gong, K.; Zhang, X. Relationship between high-sensitivity C-reactive protein and early neurological deterioration in stroke patients with and without atrial fibrillation. Heart Lung 2020, 49, 193–197. [Google Scholar] [CrossRef]
Jamwal, S.; Sharma, S. Vascular endothelium dysfunction: A conservative target in metabolic disorders. Inflamm. Res. 2018, 67, 391–405. [Google Scholar] [CrossRef]
Arboix, A.; Oliveres, M.; Massons, J.; Pujades, R.; Garcia-Eroles, L. Early differentiation of cardioembolic from atherothrombotic cerebral infarction: A multivariate analysis. Eur. J. Neurol. 1999, 6, 677–683. [Google Scholar] [CrossRef]
Desilles, J.P.; Syvannarath, V.; Ollivier, V.; Journe, C.; Delbosc, S.; Ducroux, C.; Boisseau, W.; Louedec, L.; Di Meglio, L.; Loyau, S.; et al. Exacerbation of Thromboinflammation by Hyperglycemia Precipitates Cerebral Infarct Growth and Hemorrhagic Transformation. Stroke 2017, 48, 1932–1940. [Google Scholar] [CrossRef]
Petersen, N.H.; Ortega-Gutierrez, S.; Wang, A.; Lopez, G.V.; Strander, S.; Kodali, S.; Silverman, A.; Zheng-Lin, B.; Dandapat, S.; Sansing, L.H.; et al. Decreases in Blood Pressure During Thrombectomy Are Associated with Larger Infarct Volumes and Worse Functional Outcome. Stroke 2019, 50, 1797–1804. [Google Scholar] [CrossRef] [PubMed]
Zhong, X.; Tong, X.; Sun, X.; Gao, F.; Mo, D.; Wang, Y.; Miao, Z. Early Neurological Deterioration Despite Recanalization in Basilar Artery Occlusion Treated by Endovascular Therapy. Front. Neurol. 2020, 11, 592003. [Google Scholar] [CrossRef] [PubMed]
Sun, W.; Liu, W.; Zhang, Z.; Xiao, L.; Duan, Z.; Liu, D.; Xiong, Y.; Zhu, W.; Lu, G.; Liu, X. Asymmetrical cortical vessel sign on susceptibility-weighted imaging: A novel imaging marker for early neurological deterioration and unfavorable prognosis. Eur. J. Neurol. 2014, 21, 1411–1418. [Google Scholar] [CrossRef]
Seo, W.K.; Seok, H.Y.; Kim, J.H.; Park, M.H.; Yu, S.W.; Oh, K.; Koh, S.B.; Park, K.W. C-reactive protein is a predictor of early neurologic deterioration in acute ischemic stroke. J. Stroke Cerebrovasc. Dis. 2012, 21, 181–186. [Google Scholar] [CrossRef]
Kwon, H.M.; Lee, Y.S.; Bae, H.J.; Kang, D.W. Homocysteine as a predictor of early neurological deterioration in acute ischemic stroke. Stroke 2014, 45, 871–873. [Google Scholar] [CrossRef]

Figure 1. Modeling procedure. Machine learning involved feature selection by LASSO, model building with five algorithms. LASSO, Least Absolute Selection and Shrinkage Operator; LR, Logistic Regression; RF, Random Forest; SVM, Support Vector Machine; XGBoost, Extreme Gradient Boosting.

Figure 2. The receiver operating characteristic curve (ROC) of five models on training set (A) and ROC of five models on testing set (B). LR, logistic regression; RF, random forest; SVM, support vector machine; XGBoost, extreme gradient boosting.

Figure 3. The calibration curves and the Brier score of machine learning models on the testing set of models. LR, logistic regression; RF, random forest; SVM, support vector machine; XGBoost, extreme gradient boosting.

Figure 4. Feature importance ranking based on Shapley Additive Explanation (SHAP) values. Red indicates that the value of the feature is high, and blue indicates that the value of the feature is low; the x-axis represents the SHAP values; the features are ranked according to the sum of the SHAP values for all patients (A). Standard bar charts were drawn and sorted using the average absolute value of the shape values of each feature in model (B).

Figure 5. LIME plot for individual case explanation on two random patients from the testing set of the XGBoost model. LIME plot included the patient from the “true positive” group explained by LIME algorithm (A) and a patient from the “true negative” group explained by LIME algorithm (B).

Table 1. Demographics and clinical characteristics according to patients with and without early neurological deterioration.

Variables	All Patients, n = 985	with END, n = 157	without END, n = 828	p Value
Demographic characteristics
Age, years, median (IQR)	71.0(63.0–79.0)	71.0(63.5–80.0)	71.0(63.0–79.0)	0.620
Female, n (%)	377(38.3)	61(38.9)	316(38.2)	0.871
Vascular risk factors, n (%)
Hypertension	677(68.7)	117(74.5)	560(67.6)	0.088
Diabetes mellitus	232(23.6)	46(29.3)	186(22.5)	0.064
Hyperlipidemia	17(1.7)	3(1.9)	14(1.7)	0.846
Coronary artery disease	142(14.4)	13(8.3)	129(15.6)	0.017
Atrial fibrillation	311(31.6)	43(27.4)	268(32.4)	0.219
Previous stroke or TIA	215(21.8)	30(19.1)	185(22.3)	0.368
Smoking	303(30.8)	45(28.7)	258(31.2)	0.524
Drinking	232(23.6)	28(17.8)	204(24.6)	0.065
Clinical data, median (IQR)
Systolic blood pressure, mmHg	138.0(123.0–155.0)	140.0(125.0–160.0)	138.0(123.0–154.0)	0.196
Diastolic blood pressure, mmHg	82.0(73.0–93.0)	82.0(71.5–93.5)	82.5(73.0–93.0)	0.991
NIHSS at baseline	14.0(11.0–18.0)	14.0(8.5–18.0)	14.0(11.0–18.0)	0.186
Interval from onset to treatment, min	270.0(190.0–410.0)	300.0(189.5–500.0)	270.0(190.0–404.3)	0.426
Interval from groin puncture to recanalization, min	70.0(50.0–95.0)	72.0(55.0–104.5)	68.0(50.0–94.3)	0.089
Cause of stroke, n (%)				0.067
Atherosclerotic	448(45.5)	76(48.4)	372(44.9)
Cardioembolic	433(44)	58(36.9)	375(45.3)
Others	104(10.6)	23(14.6)	81(9.8)
Endovascular therapy, n (%)
Intravenous thrombolysis	379(38.5)	64(40.8)	315(38)	0.521
Tirofiban	425(43.1)	74(47.1)	351(42.4)	0.271
sICH	139(14.1)	39(24.8)	100(12.1)	0.000
Recanalization	883(89.6)	134(85.4)	749(90.5)	0.054
Lesion location, n (%)
Anterior circulation	790(80.2)	117(74.5)	673(81.3)	0.051
Posterior circulation	197(20)	40(25.5)	157(19)	0.061
Procedural modes
Aspiration only, n (%)	34(3.5)	8(5.1)	26(3.1)	0.218
Stent retriever only, n (%)	740(75.1)	104(66.2)	636(76.8)	0.005
Stent retriever/aspiration with rescue therapy, n (%)	228(23.1)	47(29.9)	181(21.9)	0.028
Passes of Stent retriever, median (IQR)	2.0(1.0–3.0)	2.0(1.0–3.0)	2.0(1.0–3.0)	0.830
Laboratory data, median (IQR)
Platelets, μmol/L	180.0(145.0–220.0)	185.0(145.5–228.5)	180.0(145.0–219.0)	0.433
Serum creatinine, μmol/L	69.0(58.0–83.5)	69.7(55.8–90.5)	69.0(58.0–83.0)	0.825
Blood glucose, mmol/L	6.6(5.4–8.2)	7.9(6.1–9.8)	6.4(5.3–7.8)	0.000
Total cholesterol, mmol/L	4.1(3.4–5.0)	4.1(3.5–5.0)	4.1(3.4–4.9)	0.476
Triglyceride, mmol/L	1.0(0.8–1.5)	1.1(0.8–1.5)	1.0(0.7–1.4)	0.074
High-density lipoprotein, mmol/L	1.1(0.9–1.3)	1.1(0.9–1.3)	1.1(0.9–1.3)	0.391
Low-density lipoprotein, mmol/L	2.5(1.9–3.1)	2.4(1.8–3.2)	2.5(1.9–3.1)	0.973
UA, μmol/L	314.0(241.5–383.0)	321.0(246.0–382.5)	313.0(241.0–383.0)	0.604
Glycated hemoglobin, mmol/L	5.9(5.6–6.6)	6.2(5.7–6.8)	5.9(5.6–6.5)	0.004
Homocysteine, μmol/L	13.1(10.9–15.7)	13.1(10.4–15.7)	13.1(11.0–15.7)	0.697

Data are presented as median (IQR) or number (%). Abbreviations: IQR, interquartile range; TIA, transient cerebral ischemia; NIHSS, National Institute of Health stroke scale; sICH, symptomatic intracranial hemorrhage; UA, uric acid.

Table 2. Discrimination and calibration of each machine learning algorithm on the testing set.

Model	AUC	Sensitivity	Specificity	Accuracy	Brier	Youden Index
LR	0.644	0.387	0.793	0.587	0.234	0.565
RF	0.819	0.827	0.689	0.759	0.190	0.480
SVM	0.643	0.482	0.713	0.596	0.235	0.520
XGBoost	0.826	0.798	0.713	0.756	0.184	0.509

AUC, area under the curve; LR, logistic regression; RF, random forest; SVM, support vector machine; XGBoost, extreme gradient boosting.

Table 3. Prediction performance for the external validation cohort.

Model	AUC	Sensitivity	Specificity	Accuracy
XGBoost	0.846	0.750	0.836	0.815

AUC, area under the curve; XGBoost, extreme gradient boosting.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Yang, T.; Hu, Y.; Pan, X.; Lou, S.; Zou, J.; Deng, Q.; Zhang, Q.; Zhou, J.; Zhu, J. Interpretable Machine Learning Model Predicting Early Neurological Deterioration in Ischemic Stroke Patients Treated with Mechanical Thrombectomy: A Retrospective Study. Brain Sci. 2023, 13, 557. https://doi.org/10.3390/brainsci13040557

AMA Style

Yang T, Hu Y, Pan X, Lou S, Zou J, Deng Q, Zhang Q, Zhou J, Zhu J. Interpretable Machine Learning Model Predicting Early Neurological Deterioration in Ischemic Stroke Patients Treated with Mechanical Thrombectomy: A Retrospective Study. Brain Sciences. 2023; 13(4):557. https://doi.org/10.3390/brainsci13040557

Chicago/Turabian Style

Yang, Tongtong, Yixing Hu, Xiding Pan, Sheng Lou, Jianjun Zou, Qiwen Deng, Qingxiu Zhang, Junshan Zhou, and Junrong Zhu. 2023. "Interpretable Machine Learning Model Predicting Early Neurological Deterioration in Ischemic Stroke Patients Treated with Mechanical Thrombectomy: A Retrospective Study" Brain Sciences 13, no. 4: 557. https://doi.org/10.3390/brainsci13040557

APA Style

Yang, T., Hu, Y., Pan, X., Lou, S., Zou, J., Deng, Q., Zhang, Q., Zhou, J., & Zhu, J. (2023). Interpretable Machine Learning Model Predicting Early Neurological Deterioration in Ischemic Stroke Patients Treated with Mechanical Thrombectomy: A Retrospective Study. Brain Sciences, 13(4), 557. https://doi.org/10.3390/brainsci13040557

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Interpretable Machine Learning Model Predicting Early Neurological Deterioration in Ischemic Stroke Patients Treated with Mechanical Thrombectomy: A Retrospective Study

Abstract

1. Introduction

2. Methods

2.1. Study Population

2.2. Patient Variables and Data Definitions

2.3. Statistical Analysis

2.4. Data Processing and Feature Selection

2.5. Modeling Strategies

2.6. Model Evaluation

2.7. Explanation of the Model

3. Results

3.1. Study Population

3.2. Feature Selection

3.3. Model Building and Evaluation

3.4. Explanation of the Model at the Feature Level

3.5. Explanation of the Model at the Individual Level

4. Discussion

5. Conclusions

Supplementary Materials

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI