Toward Better Risk Stratification for Implantable Cardioverter-Defibrillator Recipients: Implications of Explainable Machine Learning Models

Deng, Yu; Cheng, Sijing; Huang, Hao; Liu, Xi; Yu, Yu; Gu, Min; Cai, Chi; Chen, Xuhua; Niu, Hongxia; Hua, Wei

doi:10.3390/jcdd9090310

The Cardiac Arrhythmia Center, State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences & Peking Union Medical College, No.167 North Lishi Road, Beijing 100037, China

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

J. Cardiovasc. Dev. Dis. 2022, 9(9), 310; https://doi.org/10.3390/jcdd9090310

Submission received: 16 August 2022 / Revised: 8 September 2022 / Accepted: 15 September 2022 / Published: 17 September 2022

(This article belongs to the Special Issue Machine Learning and Artificial Intelligence in Cardiovascular Medicine)

Abstract

Background: Current guideline-based implantable cardioverter-defibrillator (ICD) implants fail to meet the demands for precision medicine. Machine learning (ML) designed for survival analysis might facilitate personalized risk stratification. We aimed to develop explainable ML models predicting mortality and the first appropriate shock and compare these to standard Cox proportional hazards (CPH) regression in ICD recipients. Methods and Results: Forty-five routine clinical variables were collected. Four fine-tuned ML approaches (elastic net Cox regression, random survival forests, survival support vector machine, and XGBoost) were applied and compared with the CPH model on the test set using Harrell’s C-index. Of 887 adult patients enrolled, 199 patients died (5.0 per 100 person-years) and 265 first appropriate shocks occurred (12.4 per 100 person-years) during the follow-up. Patients were randomly split into training (75%) and test (25%) sets. Among ML models predicting death, XGBoost achieved the highest accuracy and outperformed the CPH model (C-index: 0.794 vs. 0.760, p < 0.001). For appropriate shock, survival support vector machine showed the highest accuracy, although not statistically different from the CPH model (0.621 vs. 0.611, p = 0.243). The feature contribution of ML models assessed by SHAP values at individual and overall levels was in accordance with established knowledge. Accordingly, a bi-dimensional risk matrix integrating death and shock risk was built. This risk stratification framework further classified patients with different likelihoods of benefiting from ICD implant. Conclusions: Explainable ML models offer a promising tool to identify different risk scenarios in ICD-eligible patients and aid clinical decision making. Further evaluation is needed.

Keywords:

implantable cardioverter-defibrillators; machine learning; mortality; first appropriate shock; Shapley Additive exPlanations values; personalized risk stratification

1. Introduction

Since their introduction into clinical practice four decades ago, implantable cardioverter-defibrillators (ICDs) have been recognized as an effective modality to prevent sudden cardiac death (SCD), which affects millions of people worldwide each year [1,2,3]. Nonetheless, patients are heterogeneous, and a high proportion of real-world ICD recipients receive no appropriate device therapy during long-term follow-up [4,5,6]. Moreover, patients implanted with an ICD may suffer from procedure-related complications, inappropriate ICD shocks, and psychological disorders [7]. Personalized risk assessment is therefore warranted to identify those most likely to benefit from ICD therapy.

Efforts to optimize risk stratification for ICD candidates have never ceased. Previously established models, such as the Seattle Heart Failure Model–D [8], the Seattle Proportional Risk model [9,10], and other models [11,12,13,14,15,16,17] that integrate the risks of both ventricular arrhythmias and pump failure death, have shown acceptable discrimination in identifying potential beneficiaries. These tools, built with traditional statistical modeling strategies, are easy to interpret and use. However, statistical modeling is not well suited to handling the complex interactions and high-order nonlinear relationships that exist in real situations [18]. In contrast, machine learning (ML), which is inherently designed to deal with these problems, may be a feasible way to improve risk prediction [18]. Despite its high flexibility and accuracy, machine learning has been questioned for its low interpretability, which is of particular importance in healthcare settings. With advances in explainable ML tools, such as Shapley Additive exPlanations (SHAP) values, it is now possible to gain a thorough understanding of ML models [19].

In this study, we compared machine learning approaches incorporating time-to-event analysis to traditional Cox proportional hazards (CPH) modeling for all-cause death and appropriate shock prediction. We also used SHAP values to explain each variable’s contribution to outcome events. Finally, we suggested a feasible framework for risk stratification by integrating those two dimensions of risk of death and shock.

2. Materials and Methods

2.1. Patient Population

We retrospectively enrolled 1417 adult patients who received initial single- or dual-chamber ICD implantation from 1 January 2010 through 31 December 2020 at our institution. We excluded patients with specialized indications for ICDs (n = 447), including cardiac channelopathies, hypertrophic cardiomyopathy, and congenital heart disease due to their distinctive pathophysiology, and patients who did not meet the current indication for ICD implant [3]. We also excluded patients without any visit data after implant (n = 83). After these exclusions, 887 patients were retained for analysis. This study complied with the Declaration of Helsinki and was approved by the Ethics Committee of Fuwai Hospital. All patients provided informed consent. The flowchart of the study, including patient selection, is illustrated in Figure 1.

2.2. Outcome Definition

All-cause death and the first appropriate ICD shock were both predefined endpoints. Survival status was obtained from hospital records, death certificates, or phone calls to patients’ relatives. Patients were required to undergo device interrogation every 6–12 months and shortly after perceiving device therapies. Appropriate ICD shock was defined as a shock delivered for ventricular tachycardia or ventricular fibrillation and was adjudicated by a trained electrophysiologist, although the programming of therapy zones was at the discretion of the treating physicians. The censoring dates for the two endpoints were not necessarily the same because survival status was further ascertained even after the last device interrogation.
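
The two endpoints described above can be encoded as separate right-censored (time, event) pairs. The following is a minimal sketch of that idea, not the study's code: the `FollowUp` record, its field names, and the dates are hypothetical and chosen only to show why the same patient can have different censoring times for death and for shock.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class FollowUp:
    """Hypothetical per-patient follow-up record (field names are illustrative)."""
    implant: date
    last_interrogation: date            # last device check
    last_vital_status: date             # last date survival status was ascertained
    first_shock: Optional[date] = None  # first appropriate shock, if any
    death: Optional[date] = None        # date of death, if any


def years_between(start: date, end: date) -> float:
    return (end - start).days / 365.25


def death_endpoint(p: FollowUp) -> Tuple[float, bool]:
    """(time in years, event flag) for all-cause death."""
    if p.death is not None:
        return years_between(p.implant, p.death), True
    # Censored at the last date on which the patient was known to be alive.
    return years_between(p.implant, p.last_vital_status), False


def shock_endpoint(p: FollowUp) -> Tuple[float, bool]:
    """(time in years, event flag) for the first appropriate shock."""
    if p.first_shock is not None:
        return years_between(p.implant, p.first_shock), True
    # Censored at the last device interrogation.
    return years_between(p.implant, p.last_interrogation), False


patient = FollowUp(
    implant=date(2015, 1, 1),
    last_interrogation=date(2018, 1, 1),
    last_vital_status=date(2020, 1, 1),
)
death_time, died = death_endpoint(patient)
shock_time, shocked = shock_endpoint(patient)
```

In this made-up case the patient is censored for shock at about 3 years but for death at about 5 years, mirroring the different censoring dates noted in the text.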

2.3. Data Collection and Preprocessing

A total of 45 variables, including patient demographics, laboratory values, comorbidities, medications, electrocardiogram findings, and echocardiographic indices were collected from electronic medical records. Fourteen variables with missing data are shown in Table S1. All these variables had missing rates less than 5%. Data preprocessing was performed using the scikit-learn (version 1.0.2) module in Python 3.7.13. Dichotomous variables were encoded using the OneHotEncoder function. The New York Heart Association (NYHA) functional classification was transformed to an ordinal variable using the OrdinalEncoder function. Continuous variables were scaled to normal distribution through Box-Cox transformation and standardization using the PowerTransformer function, to diminish the impact of variable variances on model performance. Missing values were imputed with the mean value of each variable by the SimpleImputer function. Raw data and the corresponding transformed values are provided in Table S2.
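
As a rough illustration of the preprocessing steps listed above, the sketch below uses only the Python standard library rather than scikit-learn. It covers ordinal encoding of NYHA class, mean imputation, and standardization, and it deliberately omits the Box-Cox step. The variable values and the `NYHA_ORDER` mapping are assumptions made for this example.

```python
from statistics import mean, pstdev

# Assumed ordinal mapping for NYHA class (illustrative, not taken from Table S2).
NYHA_ORDER = {"I": 0, "II": 1, "III": 2, "IV": 3}


def fit_scaler(train_values):
    """Learn the mean and (population) standard deviation from observed training values."""
    observed = [v for v in train_values if v is not None]
    return mean(observed), pstdev(observed)


def transform(values, mu, sigma):
    """Impute missing entries with the training mean, then standardize."""
    return [((mu if v is None else v) - mu) / sigma for v in values]


# Hypothetical LVEF (%) values; None marks a missing measurement.
train_lvef = [30.0, 35.0, None, 40.0, 25.0]
test_lvef = [None, 45.0]

mu, sigma = fit_scaler(train_lvef)             # statistics come from training data only
train_scaled = transform(train_lvef, mu, sigma)
test_scaled = transform(test_lvef, mu, sigma)  # test data reuse the training statistics
nyha_encoded = [NYHA_ORDER[c] for c in ["II", "III", "I"]]
```

Fitting the statistics on the training values and reusing them for the test values is the usual way to keep test information out of the preprocessing step; a mean-imputed entry ends up at exactly 0 after standardization.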

2.4. Modeling Strategies

All-cause death and appropriate shock were modeled independently. All enrolled patients were randomly split into a training set (n = 665, 75%) and a test set (n = 222, 25%) using stratified sampling based on outcome events. To avoid data leakage, data preprocessing was performed after data partitioning. We modeled the outcomes as right-censored data. Four ML algorithms, namely elastic net Cox regression (EN-Cox), random survival forests (RSF), survival support vector machine (SSVM), and eXtreme Gradient Boosting (XGBoost), were applied to build prediction models using the scikit-survival [20] (version 0.17.1) and XGBoost [21] (version 1.5.2) modules. To ensure the best performance of each model, optimal hyperparameters were obtained by grid search with five-fold cross-validation. The CPH model, in contrast, was fitted through stepwise backward Cox regression based on Akaike’s information criterion, entering variables related to outcomes at a p < 0.10 level, using the ‘rms’ package of R software 4.1.2. It served as the benchmark for model comparison. The best ML model for each outcome was selected to construct the risk scores predicting death and shock. Subsequently, patients were divided into three equal-sized risk groups (low, intermediate, high) for death and for shock. Accordingly, a 3-by-3 risk assessment matrix accounting for both risks was built. This risk stratification system ultimately identified patients with different likelihoods of benefiting from ICD implant.
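
The stratified 75/25 partition can be sketched as follows. This is an illustrative standard-library version under stated assumptions (a fixed seed and simple rounding within each stratum), not the procedure actually run in the study. The synthetic labels only reproduce the reported counts for the death endpoint.

```python
import random


def stratified_split(events, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx), sampling each event stratum separately."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in (False, True):
        stratum = [i for i, e in enumerate(events) if e == label]
        rng.shuffle(stratum)
        n_test = round(len(stratum) * test_fraction)
        test_idx.extend(stratum[:n_test])
        train_idx.extend(stratum[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Synthetic labels matching the reported counts: 199 deaths among 887 patients.
events = [True] * 199 + [False] * (887 - 199)
train_idx, test_idx = stratified_split(events)

test_event_rate = sum(events[i] for i in test_idx) / len(test_idx)
train_event_rate = sum(events[i] for i in train_idx) / len(train_idx)
```

With this rounding convention the split yields 665 training and 222 test patients, and the event proportion is nearly identical in the two sets, which is the point of stratifying on the outcome.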

2.5. Model Interpretability

SHAP values, based on coalitional game theory and introduced by Lundberg and Lee [19], were used to explain each variable’s contribution to the prediction (risk scores in this study). They provide both global interpretability (the extent to which each predictor contributes positively or negatively to the outcome event on average) and local interpretability (the contribution of each predictor to the outcome in an individual). SHAP summary plots and dependence plots were used to illustrate global interpretability, whereas force plots were used to illustrate local interpretability.
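
To make the game-theoretic idea concrete, the sketch below computes exact Shapley values for a three-feature toy risk score by enumerating all coalitions. It is an assumption-laden illustration: the toy `risk_score`, the single all-zero reference point, and the patient vector `x` are invented here, and practical SHAP implementations rely on faster model-specific or sampling-based estimators and a background dataset instead of brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a coalition value function over feature indices."""
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley average.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Toy setup: x is one hypothetical patient, baseline is the reference point.
x = [2.0, 1.0, 3.0]
baseline = [0.0, 0.0, 0.0]


def risk_score(v):
    """Made-up risk score with one interaction term (features 0 and 2)."""
    return 1.5 * v[0] - 0.5 * v[1] + 0.2 * v[0] * v[2]


def value_fn(coalition):
    """Score when features in the coalition take the patient's values."""
    mixed = [x[j] if j in coalition else baseline[j] for j in range(len(x))]
    return risk_score(mixed)


phi = shapley_values(value_fn, len(x))
# phi is approximately [3.6, -0.5, 0.6]; the interaction is shared equally
# between features 0 and 2.
```

The contributions sum to the difference between the patient's score and the baseline score (3.7 here), which is the additivity property that makes per-patient force plots possible.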

2.6. Statistical Analysis

Continuous variables were summarized as either the mean ± standard deviation or the median with the interquartile range (IQR), as appropriate; categorical variables were presented as frequencies and percentages. Baseline characteristics were compared using Student’s t, Mann–Whitney U, chi-square, or Fisher exact tests, as appropriate. Survival curves were plotted using the Kaplan–Meier estimator. Log-rank tests were used to compare unadjusted differences between groups. Model accuracy was assessed on the test set using Harrell’s concordance index (C-index) owing to its independence of the proportional-hazards assumption. The C-index ranges from 0 to 1: a value of 1 represents perfect discrimination between subjects who do and do not experience the outcome, 0.5 corresponds to random ranking, and 0 represents a model whose rankings are completely reversed. The confidence intervals of the C-index were estimated by the bootstrap method with 100 resamples. The overall difference in the C-index was tested using one-way ANOVA. Furthermore, the C-index of each ML model was compared with that of the CPH model using Dunnett’s test. In the sensitivity analysis, we imputed missing values by multivariable imputation using Bayesian ridge regression in an iterative manner. A two-sided p-value ≤ 0.05 was considered statistically significant.
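
Harrell's C-index and a bootstrap interval can be sketched in a few lines. The version below is a simplified illustration, not the scikit-survival implementation: it ignores tied event times, and the percentile interval is an assumption because the text does not state which bootstrap variant was used. The five-subject dataset is invented.

```python
import random


def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher risk score fails earlier.

    A pair (i, j) is comparable when subject i had the event and times[i] < times[j].
    Tied risk scores count as half concordant; tied times are ignored here.
    """
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def bootstrap_interval(times, events, risk_scores, n_resamples=100, seed=0):
    """Percentile 95% interval of the C-index over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(times)
    estimates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            estimates.append(harrell_c_index(
                [times[k] for k in idx],
                [events[k] for k in idx],
                [risk_scores[k] for k in idx],
            ))
        except ValueError:
            continue  # this resample had no comparable pairs
    estimates.sort()
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[min(len(estimates) - 1, int(0.975 * len(estimates)))]
    return lower, upper


times = [2, 4, 6, 8, 10]
events = [True, True, False, True, False]
scores = [0.9, 0.7, 0.8, 0.3, 0.1]

c = harrell_c_index(times, events, scores)                        # 0.875
c_reversed = harrell_c_index(times, events, [-s for s in scores])  # 0.125
ci_lower, ci_upper = bootstrap_interval(times, events, scores)
```

Seven of the eight comparable pairs are ranked correctly, giving 0.875; negating the scores flips every pair and gives 0.125, which illustrates why values below 0.5 indicate reversed rather than merely uninformative rankings.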

3. Results

3.1. Patient Characteristics

A total of 887 patients who underwent initial ICD implantation were included in the study. The mean age at ICD implantation was 59.0 ± 13.0 years. Patients were predominantly male (75.2%), had a secondary prevention indication (72.9%), and had no pacing indication (93.3%). Approximately half had ischemic cardiomyopathy (48.8%). Most patients were prescribed β-blockers and renin–angiotensin–aldosterone system (RAAS) inhibitors, while only 10.6% were prescribed calcium channel blockers. Patient characteristics before and after the partition are summarized in Table 1, with all covariates included in the prediction models. For the prediction of death, the training and test sets did not differ significantly except for higher dual-chamber ICD use and blood urea nitrogen (BUN) levels in the training set. For the prediction of appropriate shock, patients in both sets had similar characteristics across all variables.

3.2. Outcome Events

During the study period, 199 patients died, with a median follow-up duration of 4.8 (IQR, 3.0–7.1) years (incidence rate of 5.0 per 100 person-years); 265 patients received the first appropriate shock with a median follow-up of 2.7 (IQR, 1.1–5.1) years (12.4 per 100 person-years).
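
The incidence rates quoted above are events divided by total time at risk. As a quick worked check, the sketch below back-calculates the person-years implied by the reported counts and rates. The resulting totals are approximations derived from rounded figures, not values given in the article, and the function names are illustrative.

```python
def rate_per_100_person_years(n_events, person_years):
    """Incidence rate expressed per 100 person-years."""
    return 100.0 * n_events / person_years


def implied_person_years(n_events, rate_per_100):
    """Person-years at risk implied by an event count and a rate per 100 person-years."""
    return 100.0 * n_events / rate_per_100


death_py = implied_person_years(199, 5.0)   # 3980.0 person-years
shock_py = implied_person_years(265, 12.4)  # about 2137 person-years
```

The smaller implied time at risk for shock is consistent with the shorter median follow-up for that endpoint, since follow-up for a first shock ends when the shock occurs or at the last device interrogation.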

3.3. Prediction of All-Cause Death and Appropriate Shock

ML and CPH models were developed in the training set. Table 2 shows the parameter search space and optimal parameters for the ML models. Model performance in the test set is shown in Figure 2. For the prediction of death, the CPH model achieved a C-index of 0.760 (95% CI 0.752–0.768) on the test set. Among ML algorithms, XGBoost and RSF both showed a significantly greater C-index than the CPH model (C-index difference of 0.034 and 0.021, respectively, both p < 0.001). The EN-Cox model showed a nonsignificant trend toward better performance than the CPH model (C-index difference of 0.011, p = 0.178). For the prediction of appropriate shock, SSVM had the highest C-index of 0.621 (95% CI 0.613–0.628), but it was not significantly higher than that of the CPH model (C-index difference of 0.009, p = 0.243). EN-Cox had discrimination similar to that of the CPH model. In contrast, XGBoost and RSF performed significantly worse than the CPH model (C-index difference of −0.023 and −0.022, respectively, both p < 0.001). Model performance in the training set is provided in Table S3.

3.4. Explainability Based on SHAP Values

We computed the SHAP values of CPH and ML models in the test set. As illustrated in Figure 3 and Figure S1, for the XGBoost model predicting death, increased N-terminal pro-brain natriuretic peptide (NT-proBNP), left ventricular end-diastolic diameter (LVEDD), the New York Heart Association (NYHA) functional classification, left atrial diameter (LAD), high-sensitivity C-reactive protein (hs-CRP), age, BUN, and abnormal systolic blood pressure (< 110 mm Hg or > 140 mm Hg) were the most important risk factors. In contrast, increased left ventricular ejection fraction (LVEF) and hemoglobin level were among the most protective factors. For SSVM predicting shock risk, the primary prevention indication, increasing age, previous myocardial infarction, and higher LVEF, body mass index, and systolic blood pressure were the most protective factors, while male sex, usage of RAAS inhibitors, increased LAD, and LVEDD were the greatest risk factors (Figure 3 and Figure S2). SHAP summary plots show that each predictor in CPH models had the same effect direction as the regression coefficients (Table S4). The summary plots of other ML algorithms are shown in Figure S3. In addition, Figure 4 illustrates how each variable contributes to the outcome prediction in a single patient.

3.5. Establishment of Bi-Dimensional Risk Profiles

XGBoost and SSVM were chosen to construct the risk models for all-cause death and appropriate shock, respectively. Patients were classified into three increasing risk categories for each outcome (Figure 5). Accordingly, 3 × 3 risk profiles were developed (Figure 6). Patients with the highest risk of death (16.58 per 100 person-years) and lowest risk of shock (3.99 per 100 person-years) may not benefit from ICD implant. Conversely, patients with the lowest risk of death (0.39 per 100 person-years) and highest risk of shock (16.99 per 100 person-years) may benefit from implant. For patients with a high risk of both death and shock, or with a low risk of death and a marginally high risk of shock, shared decision-making between patients and clinicians is needed. Strategies should be tailored to the risk scenario.
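
The construction of the 3 × 3 profiles can be sketched as a cross-tabulation of two tertile assignments. The example below is illustrative only: the nine risk scores are invented, and ranking-based tertiles are an assumption about how equal-sized groups might be formed.

```python
from collections import Counter

LABELS = ("low", "intermediate", "high")


def tertile_groups(scores):
    """Assign 0 (low), 1 (intermediate), or 2 (high) by rank, giving equal-sized groups."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    groups = [0] * len(scores)
    for rank, i in enumerate(order):
        groups[i] = rank * 3 // len(scores)
    return groups


def risk_matrix(death_scores, shock_scores):
    """Count patients in each (death risk, shock risk) cell of the 3 x 3 matrix."""
    d = tertile_groups(death_scores)
    s = tertile_groups(shock_scores)
    return Counter((LABELS[a], LABELS[b]) for a, b in zip(d, s))


# Hypothetical risk scores for nine patients (one entry per patient).
death_scores = [0.1, 0.9, 0.5, 0.2, 0.8, 0.4, 0.3, 0.7, 0.6]
shock_scores = [0.9, 0.1, 0.5, 0.8, 0.2, 0.6, 0.7, 0.3, 0.4]

matrix = risk_matrix(death_scores, shock_scores)
# Cell ("low", "high"): low death risk, high shock risk (likely to benefit).
# Cell ("high", "low"): high death risk, low shock risk (unlikely to benefit).
```

In this toy data three patients fall in each of the cells ("low", "high"), ("intermediate", "intermediate"), and ("high", "low"); in a real cohort all nine cells would typically be populated and observed event rates would be reported per cell.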

3.6. Sensitivity Analysis

Instead of mean imputation, Bayesian ridge regression was used to impute missing values. Data partitioning, preprocessing, and hyperparameter search spaces were kept the same. The results supported the primary analysis. The best hyperparameters (Table S5) selected did not substantially deviate from the primary analysis. Model performance was compared and is shown in Figure S4. For the prediction of death, XGBoost also showed better performance than the CPH model (C-index difference of 0.025, p < 0.001). For the prediction of shock, SSVM also had better performance than the CPH model, although not statistically different (C-index difference of 0.011, p = 0.103). The explanation of the models predicting death and shock is shown in Figure S5.

4. Discussion

We leveraged a single-center ICD cohort with a follow-up averaging nearly 5 years to create CPH and ML models that predict all-cause death and appropriate shock. We demonstrated that optimized ML models had comparable or better performance than CPH models. SHAP plots further showed that traditional risk factors in ML models had explainable predictive value consistent with established knowledge. Ultimately, we proposed a feasible framework to classify ICD patients into a 3 × 3 matrix of risk scenarios, which may facilitate individualized risk stratification. To the best of our knowledge, this is the first head-to-head study comparing survival ML algorithms with the statistical CPH model in ICD patients.

Risk scores using statistical modeling strategies have shown satisfactory performance in ICD benefit prediction [8,9,10,11,12,13,14,15,16,17,22,23]. Survival analysis using CPH regression is a standard paradigm for identifying risk factors and predicting prognosis. However, it only works under two key assumptions: the proportional hazard assumption and the linearity assumption. In comparison, ML approaches do not rely on prior hypotheses and assumptions and therefore are highly flexible [18]. In addition, ML algorithms can handle complex interactions in large datasets in which CPH regression may fail to converge [24]. To date, ML algorithms have been widely applied to cardiovascular disease, from arrhythmia identification to electro-anatomical mapping, clinical decision support, and prognosis prediction [18]. In this study, ML-based models also showed excellent performance in the outcome prediction of ICD recipients.

We demonstrated that XGBoost had the highest discrimination among ML methods in all-cause death prediction and outperformed the CPH model. We further utilized SHAP values to interpret the model. As it showed, older age, higher NT-proBNP, LVEDD, LAD, hs-CRP, BUN, and worsening heart functional status assessed by NYHA were among the most important risk factors. Conversely, higher LVEF and hemoglobin were the most protective factors. These results were in line with previous findings, as deteriorated heart and renal function were related to an increased risk of death [11,16,17,25,26,27]. Additionally, hs-CRP, a biomarker of acute inflammation, was an established risk factor [28]. Systolic blood pressure was also one of the most predictive factors, with a U-shaped relationship shown in the SHAP dependence plot. Death risk increased at both low and high systolic blood pressure, highlighting the role of blood pressure control on mortality. This phenomenon was overlooked in previous studies [11,25].

On the other hand, predicting appropriate ICD shock was much harder. Although SSVM achieved the highest accuracy, its discrimination was only modest. As expected, prevention indication ranked first among the clinical characteristics, underlining its importance to SCD risk [3]. Consistent with previous studies, male sex, higher LAD, and higher LVEDD were related to an increased risk of shock [9,11,15,22], while myocardial infarction, increasing age, LVEF, body mass index, and systolic blood pressure were related to a decreased risk [9,11,17,29]. However, contrary to the general consensus [30], the use of RAAS inhibitors was associated with an increased shock risk. This might partly be attributed to the data-driven nature of ML algorithms and does not necessarily represent a causal relationship. Of note, this effect was not found in the other ML algorithms predicting shock risk (Figure S3D–F). As a result, caution is needed when choosing a model for the clinical setting.

Our results outlined the difficulties and complexities of predicting ICD shock, which was a surrogate for life-threatening ventricular arrhythmias. It has been widely accepted that the development of ventricular tachycardia/ventricular fibrillation is a dynamic and evolving process involving the participation of multiple pathophysiological processes [22,31,32]. Abnormal heart function, electrical instability, genetic mutations, autonomic dysregulation, and comorbidities have been found to contribute to an increased risk of SCD [7,33,34]. Nonetheless, evidence was inconclusive because of conflicting results, limiting their utilization in clinical settings [7,33,34]. Specifically, several factors may contribute to the low accuracy of this study. First, our cohorts mainly comprised patients with secondary prevention. Previous studies have demonstrated the failure to identify risk factors and establish a risk model in these patients [35,36]. Second, a paucity of cardiac magnetic resonance imaging (MRI) data also impaired the performance. Myocardial replacement fibrosis detected by late gadolinium enhancement is more strongly associated with SCD than LVEF in both ischemic and nonischemic cardiomyopathy [23,29,32,37]. Furthermore, a broader scar zone size is related to a greater SCD risk [27,29,37]. T1-mapping techniques might also add information to arrhythmogenesis [33,34]. Third, ICD programming was left at the discretion of the operators. A standard protocol was not applied. In conclusion, ML is not a panacea. On the contrary, model performance largely depends on prior knowledge and data quality.

Disregarding the mode of death (pump failure or sudden death) may lead to incomplete risk assessment and, in turn, biased clinical decisions on ICD candidacy. Therefore, we developed a framework integrating the risk of all-cause death and shock and identified nine risk profiles. Patients with the highest risk of death and the lowest risk of shock may not benefit from ICD implant, so an ICD may be deferred in this scenario. On the other hand, for those with the lowest risk of death but the highest risk of shock, an ICD implant is justified. For patients with the highest risk of both shock and death, shared decisions between healthcare providers and patients are encouraged, as an ICD can only treat shockable arrhythmic events and not non-shockable rhythms or pump failure death. Additionally, a comprehensive evaluation must be carried out before decision-making in other scenarios.

In fact, this risk assessment framework was first introduced by Buxton et al. [38]. A total of 25 baseline variables, including the electrophysiological study result, were collected in 674 patients with coronary artery disease enrolled in the MUSTT (Multicenter Unsustained Tachycardia Trial) study [38]. Risk scores of all-cause death and arrhythmic death were built, with satisfactory C-indexes of 0.78 and 0.70, respectively. Later, Lee et al. [11] developed similar dual risk stratification models in 3445 primary prevention ICD patients using competing risk analysis. More recently, Reeder et al. [17] and Younis et al. [15] built such models using data from the SCD-HeFT (Sudden Cardiac Death in Heart Failure Trial) and MADIT (Multicenter Automatic Defibrillator Implantation Trial) trials. The former had a C-index for the non-arrhythmic mortality of 0.68 and ventricular tachycardia/ventricular fibrillation of 0.71 [15]. The latter had an area under the curve at 5 years for death of 0.79 and ICD shock of 0.65 [17]. Our study was inspired by these landmark studies and further demonstrated the potential of ML to reach a higher plateau than traditional statistical modeling. As the amount of clinical data is increasing faster than ever before, it is of vital importance to bring ML into daily practice. ML can efficiently and accurately identify complex patterns from big data. Moreover, it can be easily integrated into electronic medical record systems and be updated consistently and automatically.

Our study had several limitations. First, all patients were enrolled in a single center with predominantly secondary prevention indication. Therefore, it may not necessarily be applicable to other populations. Moreover, due to the absence of key variables, we cannot validate and thus make comparisons with established models mentioned before. Nevertheless, as a proof-of-concept study, our primary goal was to illustrate the capacities and advantages of ML and the feasibility of constructing a bi-dimensional risk framework instead of building an out-of-the-box model. Second, appropriate shock may not be a suitable surrogate for life-threatening arrhythmias. Landmark studies have demonstrated only an approximately 50% reduction in SCD in the ICD group compared with placebo [39]. Furthermore, adopting an ICD programming strategy of delayed therapy and high-rate cutoff was associated with reduced inappropriate shocks, unnecessary shocks, and mortality [40,41]. Therefore, appropriate ICD shock may be affected by programming setting and does not necessarily equal life-threatening arrhythmias. However, until now, it has remained the best surrogate endpoint for SCD in clinical research [11,17]. Third, CMR-derived parameters, electrocardiographic measurements, and electrophysiological study results were not available in the study, which showed incremental value in addition to traditional risk factors [22,23,27,32,34,37]. Still, our results showed the capacity of routinely available clinical parameters to derive efficient predictive models. Last, model calibration was not evaluated in the study due to the difficulties of XGBoost and SSVM algorithms in estimating the baseline or cumulative hazard function. This is also a downside of many ML methods implementing survival analysis. 

In summary, improvements must be made before using these algorithms in clinical settings, including but not limited to increasing sample size, adding more clinically relevant parameters, fine-tuning and optimizing the algorithms, and validating performance in external datasets.

5. Conclusions

In this head-to-head comparison of ML and traditional CPH modeling in the risk stratification of ICD recipients, we demonstrated that optimized ML is at least as good as or even better than CPH. A bi-dimensional risk matrix integrating death and shock risk using ML algorithms could facilitate clinical decision-making. This study is exploratory, and further refinement and adjustment are needed before translation into clinical practice.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/jcdd9090310/s1, Table S1: Distribution and proportion of missing variables. Table S2: Data before and after Box-Cox transformation and standardization. Table S3: Model performance in the training set. Table S4: Univariable and multivariable Cox regression of all-cause death and first appropriate shock. Table S5: Parameter search space for each model in sensitivity analysis. Figure S1: Continuous variables’ contribution to the outcome in the XGBoost model predicting all-cause death. Figure S2: Continuous variables’ contribution to the outcome in the SSVM model predicting the first appropriate shock. Figure S3: Model interpretability using SHAP values. Figure S4: Comparison of C-index between CPH and ML algorithms in sensitivity analysis. Figure S5: Model interpretability using SHAP values in sensitivity analysis.

Author Contributions

Conceptualization: Y.D., S.C., H.H. and W.H.; data curation: Y.D., S.C., H.H., X.L. and Y.Y.; formal analysis: Y.D., S.C. and H.H.; funding acquisition: W.H.; investigation: Y.D., S.C., H.H., X.L. and Y.Y.; methodology: Y.D., S.C. and H.H.; resources: M.G., C.C., X.C., H.N. and W.H.; software: Y.D., S.C. and H.H.; supervision: H.N. and W.H.; validation: Y.D., S.C. and H.H.; visualization: Y.D.; writing—original draft: Y.D., S.C. and H.H.; writing—review and editing: W.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of Fuwai Cardiovascular Hospital (protocol code: 2010-272, 2 June 2010).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The datasets used and/or analyzed during the current study are not publicly available due to the regulation of Fuwai Hospital.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Kiguchi, T.; Okubo, M.; Nishiyama, C.; Maconochie, I.; Ong, M.E.H.; Kern, K.B.; Wyckoff, M.H.; McNally, B.; Christensen, E.F.; Tjelmeland, I.; et al. Out-of-hospital cardiac arrest across the World: First report from the International Liaison Committee on Resuscitation (ILCOR). Resuscitation 2020, 152, 39–49.
2. Tsao, C.W.; Aday, A.W.; Almarzooq, Z.I.; Alonso, A.; Beaton, A.Z.; Bittencourt, M.S.; Boehme, A.K.; Buxton, A.E.; Carson, A.P.; Commodore-Mensah, Y.; et al. Heart Disease and Stroke Statistics-2022 Update: A Report From the American Heart Association. Circulation 2022, 145, e153–e639.
3. Al-Khatib, S.M.; Stevenson, W.G.; Ackerman, M.J.; Bryant, W.J.; Callans, D.J.; Curtis, A.B.; Deal, B.J.; Dickfeld, T.; Field, M.E.; Fonarow, G.C. 2017 AHA/ACC/HRS guideline for management of patients with ventricular arrhythmias and the prevention of sudden cardiac death: A report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines and the Heart Rhythm Society. J. Am. Coll. Cardiol. 2018, 72, e91–e220.
4. Merchant, F.M.; Jones, P.; Wehrenberg, S.; Lloyd, M.S.; Saxon, L.A. Incidence of defibrillator shocks after elective generator exchange following uneventful first battery life. J. Am. Heart Assoc. 2014, 3, e001289.
5. Zabel, M.; Willems, R.; Lubinski, A.; Bauer, A.; Brugada, J.; Conen, D.; Flevari, P.; Hasenfuss, G.; Svetlosak, M.; Huikuri, H.V.; et al. Clinical effectiveness of primary prevention implantable cardioverter-defibrillators: Results of the EU-CERT-ICD controlled multicentre cohort study. Eur. Heart J. 2020, 41, 3437–3447.
6. Briongos-Figuero, S.; Garcia-Alberola, A.; Rubio, J.; Segura, J.M.; Rodriguez, A.; Peinado, R.; Alzueta, J.; Martinez-Ferrer, J.B.; Vinolas, X.; Fernandez de la Concha, J.; et al. Long-Term Outcomes Among a Nationwide Cohort of Patients Using an Implantable Cardioverter-Defibrillator: UMBRELLA Study Final Results. J. Am. Heart Assoc. 2021, 10, e018108.
7. Nielsen, J.C.; Lin, Y.-J.; de Oliveira Figueiredo, M.J.; Sepehri Shamloo, A.; Alfie, A.; Boveda, S.; Dagres, N.; Di Toro, D.; Eckhardt, L.L.; Ellenbogen, K. European Heart Rhythm Association (EHRA)/Heart Rhythm Society (HRS)/Asia Pacific Heart Rhythm Society (APHRS)/Latin American Heart Rhythm Society (LAHRS) expert consensus on risk assessment in cardiac arrhythmias: Use the right tool for the right outcome, in the right population. EP Europace 2020, 22, 1147–1148.
8. Levy, W.C.; Lee, K.L.; Hellkamp, A.S.; Poole, J.E.; Mozaffarian, D.; Linker, D.T.; Maggioni, A.P.; Anand, I.; Poole-Wilson, P.A.; Fishbein, D.P.; et al. Maximizing survival benefit with primary prevention implantable cardioverter-defibrillator therapy in a heart failure population. Circulation 2009, 120, 835–842.
9. Shadman, R.; Poole, J.E.; Dardas, T.F.; Mozaffarian, D.; Cleland, J.G.; Swedberg, K.; Maggioni, A.P.; Anand, I.S.; Carson, P.E.; Miller, A.B.; et al. A novel method to predict the proportional risk of sudden cardiac death in heart failure: Derivation of the Seattle Proportional Risk Model. Heart Rhythm 2015, 12, 2069–2077.
10. Levy, W.C.; Li, Y.; Reed, S.D.; Zile, M.R.; Shadman, R.; Dardas, T.; Whellan, D.J.; Schulman, K.A.; Ellis, S.J.; Neilson, M.; et al. Does the Implantable Cardioverter-Defibrillator Benefit Vary With the Estimated Proportional Risk of Sudden Death in Heart Failure Patients? JACC Clin. Electrophysiol. 2017, 3, 291–298.
11. Lee, D.S.; Hardy, J.; Yee, R.; Healey, J.S.; Birnie, D.; Simpson, C.S.; Crystal, E.; Mangat, I.; Nanthakumar, K.; Wang, X.; et al. Clinical Risk Stratification for Primary Prevention Implantable Cardioverter Defibrillators. Circ. Heart Fail. 2015, 8, 927–937.
12. Bilchick, K.C.; Wang, Y.; Cheng, A.; Curtis, J.P.; Dharmarajan, K.; Stukenborg, G.J.; Shadman, R.; Anand, I.; Lund, L.H.; Dahlstrom, U.; et al. Seattle Heart Failure and Proportional Risk Models Predict Benefit From Implantable Cardioverter-Defibrillators. J. Am. Coll. Cardiol. 2017, 69, 2606–2618.
13. Bergau, L.; Willems, R.; Sprenkeler, D.J.; Fischer, T.H.; Flevari, P.; Hasenfuss, G.; Katsaras, D.; Kirova, A.; Lehnart, S.E.; Luthje, L.; et al. Differential multivariable risk prediction of appropriate shock versus competing mortality-A prospective cohort study to estimate benefits from ICD therapy. Int. J. Cardiol. 2018, 272, 102–107.
Kristensen, S.L.; Levy, W.C.; Shadman, R.; Nielsen, J.C.; Haarbo, J.; Videbaek, L.; Bruun, N.E.; Eiskjaer, H.; Wiggers, H.; Brandes, A.; et al. Risk Models for Prediction of Implantable Cardioverter-Defibrillator Benefit: Insights From the DANISH Trial. JACC Heart Fail. 2019, 7, 717–724. [Google Scholar] [CrossRef]
Younis, A.; Goldberger, J.J.; Kutyifa, V.; Zareba, W.; Polonsky, B.; Klein, H.; Aktas, M.K.; Huang, D.; Daubert, J.; Estes, M.; et al. Predicted benefit of an implantable cardioverter-defibrillator: The MADIT-ICD benefit score. Eur. Heart J. 2021, 42, 1676–1684. [Google Scholar] [CrossRef] [PubMed]
Deng, Y.; Zhang, N.; Hua, W.; Cheng, S.; Niu, H.; Chen, X.; Gu, M.; Cai, C.; Liu, X.; Huang, H.; et al. Nomogram predicting death and heart transplantation before appropriate ICD shock in dilated cardiomyopathy. ESC Heart Fail. 2022, 9, 1269–1278. [Google Scholar] [CrossRef] [PubMed]
Reeder, H.T.; Shen, C.; Buxton, A.E.; Haneuse, S.J.; Kramer, D.B. Joint Shock/Death Risk Prediction Model for Patients Considering Implantable Cardioverter-Defibrillators. Circ. Cardiovasc. Qual. Outcomes 2019, 12, e005675. [Google Scholar] [CrossRef]
Feeny, A.K.; Chung, M.K.; Madabhushi, A.; Attia, Z.I.; Cikes, M.; Firouznia, M.; Friedman, P.A.; Kalscheur, M.M.; Kapa, S.; Narayan, S.M.; et al. Artificial Intelligence and Machine Learning in Arrhythmias and Cardiac Electrophysiology. Circ. Arrhythm. Electrophysiol. 2020, 13, e007952. [Google Scholar] [CrossRef]
Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Processing Syst. 2017, 30, 4768–4777. [Google Scholar]
Pölsterl, S. scikit-survival: A Library for Time-to-Event Analysis Built on Top of scikit-learn. J. Mach. Learn. Res. 2020, 21, 1–6. [Google Scholar]
Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd Acm Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
Stolfo, D.; Ceschia, N.; Zecchin, M.; De Luca, A.; Gobbo, M.; Barbati, G.; Gigli, M.; Mase, M.; Pinamonti, B.; Pivetta, A.; et al. Arrhythmic Risk Stratification in Patients With Idiopathic Dilated Cardiomyopathy. Am. J. Cardiol. 2018, 121, 1601–1609. [Google Scholar] [CrossRef] [PubMed]
Di Marco, A.; Brown, P.F.; Bradley, J.; Nucifora, G.; Claver, E.; de Frutos, F.; Dallaglio, P.D.; Comin-Colet, J.; Anguera, I.; Miller, C.A.; et al. Improved Risk Stratification for Ventricular Arrhythmias and Sudden Death in Patients With Nonischemic Dilated Cardiomyopathy. J. Am. Coll. Cardiol. 2021, 77, 2890–2905. [Google Scholar] [CrossRef] [PubMed]
Ambale-Venkatesh, B.; Yang, X.; Wu, C.O.; Liu, K.; Hundley, W.G.; McClelland, R.; Gomes, A.S.; Folsom, A.R.; Shea, S.; Guallar, E.; et al. Cardiovascular Event Prediction by Machine Learning: The Multi-Ethnic Study of Atherosclerosis. Circ. Res. 2017, 121, 1092–1101. [Google Scholar] [CrossRef]
Levy, W.C.; Mozaffarian, D.; Linker, D.T.; Sutradhar, S.C.; Anker, S.D.; Cropp, A.B.; Anand, I.; Maggioni, A.; Burton, P.; Sullivan, M.D.; et al. The Seattle Heart Failure Model: Prediction of survival in heart failure. Circulation 2006, 113, 1424–1433. [Google Scholar] [CrossRef] [PubMed]
Deng, Y.; Cheng, S.J.; Hua, W.; Cai, M.S.; Zhang, N.X.; Niu, H.X.; Chen, X.H.; Gu, M.; Cai, C.; Liu, X.; et al. N-Terminal Pro-B-Type Natriuretic Peptide in Risk Stratification of Heart Failure Patients With Implantable Cardioverter-Defibrillator. Front. Cardiovasc. Med. 2022, 9, 823076. [Google Scholar] [CrossRef]
Gulati, A.; Jabbour, A.; Ismail, T.F.; Guha, K.; Khwaja, J.; Raza, S.; Morarji, K.; Brown, T.D.; Ismail, N.A.; Dweck, M.R.; et al. Association of fibrosis with mortality and sudden cardiac death in patients with nonischemic dilated cardiomyopathy. JAMA 2013, 309, 896–908. [Google Scholar] [CrossRef] [PubMed]
Cheng, A.; Zhang, Y.; Blasco-Colmenares, E.; Dalal, D.; Butcher, B.; Norgard, S.; Eldadah, Z.; Ellenbogen, K.A.; Dickfeld, T.; Spragg, D.D.; et al. Protein biomarkers identify patients unlikely to benefit from primary prevention implantable cardioverter defibrillators: Findings from the Prospective Observational Study of Implantable Cardioverter Defibrillators (PROSE-ICD). Circ. Arrhythm. Electrophysiol. 2014, 7, 1084–1091. [Google Scholar] [CrossRef] [PubMed]
Zegard, A.; Okafor, O.; De Bono, J.; Kalla, M.; Lencioni, M.; Marshall, H.; Hudsmith, L.; Qiu, T.; Steeds, R.; Stegemann, B. Myocardial fibrosis as a predictor of sudden death in patients with coronary artery disease. J. Am. Coll. Cardiol. 2021, 77, 29–41. [Google Scholar] [CrossRef]
Packer, M. What causes sudden death in patients with chronic heart failure and a reduced ejection fraction? Eur. Heart J. 2020, 41, 1757–1763. [Google Scholar] [CrossRef]
Rohde, L.E.; Vaduganathan, M.; Claggett, B.L.; Polanczyk, C.A.; Dorbala, P.; Packer, M.; Desai, A.S.; Zile, M.; Rouleau, J.; Swedberg, K.; et al. Dynamic changes in cardiovascular and systemic parameters prior to sudden cardiac death in heart failure with reduced ejection fraction: A PARADIGM-HF analysis. Eur. J. Heart Fail. 2021, 23, 1346–1356. [Google Scholar] [CrossRef]
Wu, K.C.; Wongvibulsin, S.; Tao, S.; Ashikaga, H.; Stillabower, M.; Dickfeld, T.M.; Marine, J.E.; Weiss, R.G.; Tomaselli, G.F.; Zeger, S.L. Baseline and Dynamic Risk Predictors of Appropriate Implantable Cardioverter Defibrillator Therapy. J. Am. Heart Assoc. 2020, 9, e017002. [Google Scholar] [CrossRef] [PubMed]
Halliday, B.P.; Cleland, J.G.F.; Goldberger, J.J.; Prasad, S.K. Personalizing Risk Stratification for Sudden Death in Dilated Cardiomyopathy: The Past, Present, and Future. Circulation 2017, 136, 215–231. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Hammersley, D.J.; Halliday, B.P. Sudden Cardiac Death Prediction in Non-ischemic Dilated Cardiomyopathy: A Multiparametric and Dynamic Approach. Curr. Cardiol. Rep. 2020, 22, 85. [Google Scholar] [CrossRef] [PubMed]
Borleffs, C.J.; van Erven, L.; Schotman, M.; Boersma, E.; Kies, P.; van der Burg, A.E.; Zeppenfeld, K.; Bootsma, M.; van der Wall, E.E.; Bax, J.J.; et al. Recurrence of ventricular arrhythmias in ischaemic secondary prevention implantable cardioverter defibrillator recipients: Long-term follow-up of the Leiden out-of-hospital cardiac arrest study (LOHCAT). Eur. Heart J. 2009, 30, 1621–1626. [Google Scholar] [CrossRef] [PubMed]
Schaer, B.; Kuhne, M.; Reichlin, T.; Osswald, S.; Sticherling, C. Incidence of and predictors for appropriate implantable cardioverter-defibrillator therapy in patients with a secondary preventive implantable cardioverter-defibrillator indication. Europace 2016, 18, 227–231. [Google Scholar] [CrossRef]
Klem, I.; Klein, M.; Khan, M.; Yang, E.Y.; Nabi, F.; Ivanov, A.; Bhatti, L.; Hayes, B.; Graviss, E.A.; Nguyen, D.T.; et al. Relationship of LVEF and Myocardial Scar to Long-Term Mortality Risk and Mode of Death in Patients With Nonischemic Cardiomyopathy. Circulation 2021, 143, 1343–1358. [Google Scholar] [CrossRef]
Buxton, A.E.; Lee, K.L.; Hafley, G.E.; Pires, L.A.; Fisher, J.D.; Gold, M.R.; Josephson, M.E.; Lehmann, M.H.; Prystowsky, E.N.; Investigators, M. Limitations of ejection fraction for prediction of sudden death risk in patients with coronary artery disease: Lessons from the MUSTT study. J. Am. Coll. Cardiol. 2007, 50, 1150–1157. [Google Scholar] [CrossRef]
Oscar, O.; Enrique, R.; Andres, B. Subanalyses of secondary prevention implantable cardioverter-defibrillator trials: Antiarrhythmics versus implantable defibrillators (AVID), Canadian Implantable Defibrillator Study (CIDS), and Cardiac Arrest Study Hamburg (CASH). Curr. Opin. Cardiol. 2004, 19, 26–30. [Google Scholar] [CrossRef]
Moss, A.J.; Schuger, C.; Beck, C.A.; Brown, M.W.; Cannom, D.S.; Daubert, J.P.; Estes, N.A., III; Greenberg, H.; Hall, W.J.; Huang, D.T.; et al. Reduction in inappropriate therapy and mortality through ICD programming. N. Engl. J. Med. 2012, 367, 2275–2283. [Google Scholar] [CrossRef]
Aktas, M.K.; Bennett, A.L.; Younis, A.; Kutyifa, V.; Polonsky, B.; McNitt, S.; Zareba, W.; Rosero, S.; Goldenberg, I. Implantable cardioverter-defibrillator programming after first occurrence of ventricular tachycardia in the Multicenter Automatic Defibrillator Implantation Trial-Reduce Inappropriate Therapy (MADIT-RIT). Heart Rhythm O2 2020, 1, 77–82. [Google Scholar] [CrossRef]

Figure 1. Study flow diagram. ICD, implantable cardioverter-defibrillator.

Figure 2. Comparison of the C-index between CPH and ML algorithms for all-cause death (A) and first appropriate shock (B) in the test set. For predicting all-cause death, the C-indices of CPH, EN-Cox, RSF, SSVM, and XGBoost were 0.760 (95% CI 0.752–0.768), 0.771 (95% CI 0.763–0.779), 0.781 (95% CI 0.773–0.788), 0.767 (95% CI 0.759–0.775), and 0.794 (95% CI 0.786–0.802), respectively. For predicting first appropriate shock, the C-indices of CPH, EN-Cox, RSF, SSVM, and XGBoost were 0.611 (95% CI 0.604–0.618), 0.608 (95% CI 0.601–0.615), 0.589 (95% CI 0.581–0.597), 0.621 (95% CI 0.613–0.628), and 0.588 (95% CI 0.580–0.596), respectively. CI, confidence interval; CPH, Cox proportional hazards regression; other abbreviations as in Table 2.
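
To make the metric in Figure 2 concrete, the following is a minimal sketch of Harrell's C-index in plain Python. It is an illustration only, not the authors' implementation: the function name `harrell_c_index` and the toy data are hypothetical, and pairs with tied event times are simply skipped, which full implementations handle more carefully.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs that the risk score ranks correctly.

    times: follow-up time per patient
    events: 1 if the outcome was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's event or censoring time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event, higher score: concordant
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort of five patients (not study data).
times = [2, 4, 6, 8, 10]
events = [1, 0, 1, 1, 0]
risk = [0.9, 0.3, 0.4, 0.6, 0.1]
c_index = harrell_c_index(times, events, risk)
print(round(c_index, 3))
```

In this toy cohort, six of the seven comparable pairs are ranked correctly. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the values in Figure 2 should be read.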

Figure 3. Model interpretability using SHAP values. CPH model (A) and XGBoost (B) for prediction of death; CPH model (C) and SSVM (D) for prediction of appropriate shock. The X-axis shows the SHAP value, and the predictors are ordered on the Y-axis by importance (the higher the position, the more important the predictor). Only the 10 most important predictors are shown. Each point on the summary plot represents the SHAP value of one predictor for one patient. Overlapping points are jittered along the Y-axis. Red and blue indicate higher and lower values of a predictor, respectively. For example, a high NT-proBNP value increases the risk score in the CPH model predicting death, making it a risk factor. Conversely, a high diastolic blood pressure reduces the death risk score, making it a protective factor. SHAP, SHapley Additive exPlanations; other abbreviations as in Figure 2 and Table 1.
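
The per-patient points in Figure 3 are Shapley-type attributions. As a rough sketch of the idea, the code below computes exact Shapley values for a toy three-predictor risk score by averaging marginal contributions over all predictor orderings. Everything here is an assumption for illustration: the coefficients, the predictor values, and the use of a single all-zero baseline (the SHAP library instead averages over a background dataset and uses faster approximations).

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each predictor for one patient.

    Predictors are switched from their baseline value to the patient's value
    one at a time, and the resulting change in the prediction is averaged
    over every possible ordering.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for k in order:
            current[k] = x[k]
            updated = predict(current)
            phi[k] += updated - previous
            previous = updated
    return [value / len(orderings) for value in phi]


def toy_risk_score(features):
    # Hypothetical score on standardized predictors; NOT a model from the study.
    nt_probnp, diastolic_bp, lvef = features
    return 0.8 * nt_probnp - 0.5 * diastolic_bp - 0.3 * lvef + 0.2 * nt_probnp * lvef


patient = [1.5, -1.0, 0.5]      # high NT-proBNP, low diastolic BP (toy values)
baseline = [0.0, 0.0, 0.0]      # assumed reference patient
phi = shapley_values(toy_risk_score, patient, baseline)
print([round(v, 3) for v in phi])
```

The attributions add up to the difference between the patient's score and the baseline score, which is the additivity property that allows force plots such as those in Figure 4 to stack individual contributions.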

Figure 4. SHAP force plots for XGBoost predicting all-cause death (A) and SSVM predicting shock (B) in a single patient. Red and blue bars represent, respectively, the positive and negative effects of each predictor on the occurrence of the outcome. The size of each bar represents the extent of its impact. Bold values represent the predicted risk scores. Note that the displayed values have been transformed; raw values are given in Table S2. Abbreviations as in Table 1, Table 2, and Figure 3.

Figure 5. Cumulative incidence curves of all-cause death (A) and first appropriate shock (B) by risk stratum.

Figure 6. Construction of bi-dimensional risk profiles. (A) Cumulative incidence curves. (B) Bar plot of the annual incidence rate. The X-axis and Y-axis represent the three increasing risk strata of appropriate shock and all-cause death, respectively. A 3 × 3 risk profile was built accordingly, from which three different treatment strategies might be considered.
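
The 3 × 3 layout in Figure 6 can be sketched as a cross-tabulation of two risk strata per patient. The code below is a hedged illustration: it assumes the strata are tertiles of each model's predicted risk score (the caption does not state the cut points), and all function names and scores are hypothetical. The mapping from cells to treatment strategies is deliberately left out because the caption does not define it.

```python
LABELS = ["low", "intermediate", "high"]


def tertile_cutoffs(scores):
    """Return two cut points that split the scores into three equal-sized groups."""
    ordered = sorted(scores)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def stratum(score, cutoffs):
    """Map a score to stratum 0 (low), 1 (intermediate) or 2 (high)."""
    low_cut, high_cut = cutoffs
    if score < low_cut:
        return 0
    if score < high_cut:
        return 1
    return 2


def risk_profile(death_score, shock_score, death_cuts, shock_cuts):
    """Return the (death stratum, shock stratum) cell for one patient."""
    return (LABELS[stratum(death_score, death_cuts)],
            LABELS[stratum(shock_score, shock_cuts)])


# Hypothetical predicted scores for nine patients (not study data).
death_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
shock_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
death_cuts = tertile_cutoffs(death_scores)
shock_cuts = tertile_cutoffs(shock_scores)

# Rows index the death stratum, columns index the shock stratum.
matrix = [[0] * 3 for _ in range(3)]
for d, s in zip(death_scores, shock_scores):
    matrix[stratum(d, death_cuts)][stratum(s, shock_cuts)] += 1

profile = risk_profile(0.85, 0.15, death_cuts, shock_cuts)
print(profile, matrix)
```

A patient in the high-death, low-shock cell, as in the usage line above, is the kind of profile the bi-dimensional matrix is meant to separate from patients with low death risk and high shock risk.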

Table 1. Demographics of the training and test sets.

Characteristics	All Patients (n = 887)	All-Cause Death: Training Set (n = 665)	All-Cause Death: Test Set (n = 222)	All-Cause Death: p-Value	First Appropriate Shock: Training Set (n = 665)	First Appropriate Shock: Test Set (n = 222)	First Appropriate Shock: p-Value
Demographics
Age (years)	59.0 ± 13.0	59.3 ± 12.8	58.3 ± 13.7	0.361	59.0 ± 13.1	59.1 ± 13.0	0.894
Male sex	667 (75.2%)	504 (75.8%)	163 (73.4%)	0.537	498 (74.9%)	169 (76.1%)	0.779
Body mass index (kg/m²)	24.7 ± 3.6	24.8 ± 3.5	24.5 ± 3.8	0.284	24.8 ± 3.7	24.6 ± 3.3	0.631
Ischemic etiology	433 (48.8%)	324 (48.7%)	109 (49.1%)	0.984	317 (47.7%)	116 (52.3%)	0.269
Family history of sudden death	25 (2.8%)	20 (3.0%)	5 (2.3%)	0.723	16 (2.4%)	9 (4.1%)	0.293
Clinical characteristics
Smoking	416 (46.9%)	316 (47.5%)	100 (45.0%)	0.574	314 (47.2%)	102 (45.9%)	0.802
Primary prevention	240 (27.1%)	185 (27.8%)	55 (24.8%)	0.425	179 (26.9%)	61 (27.5%)	0.940
Dual-chamber ICD	303 (34.2%)	240 (36.1%)	63 (28.4%)	0.044	230 (34.6%)	73 (32.9%)	0.703
Systolic BP (mmHg)	120.5 ± 16.6	120.9 ± 16.4	119.5 ± 17.2	0.298	120.8 ± 16.9	119.7 ± 15.7	0.377
Diastolic BP (mmHg)	73.5 ± 10.3	73.8 ± 10.1	72.9 ± 10.9	0.292	73.5 ± 10.5	73.7 ± 9.8	0.736
NYHA class				0.396			0.498
I	239 (26.9%)	176 (26.5%)	63 (28.4%)		184 (27.7%)	55 (24.8%)
II	326 (36.8%)	255 (38.3%)	71 (32.0%)		249 (37.4%)	77 (34.7%)
III	260 (29.3%)	189 (28.4%)	71 (32.0%)		188 (28.3%)	72 (32.4%)
IV	62 (7.0%)	45 (6.8%)	17 (7.7%)		44 (6.6%)	18 (8.1%)
Echocardiogram
LVEDD (mm)	60.3 ± 10.9	60.4 ± 10.8	59.9 ± 11.2	0.606	60.0 ± 11.0	61.2 ± 10.5	0.158
LVEF (%)	43.1 ± 14.6	43.2 ± 14.3	43.0 ± 15.3	0.897	43.4 ± 14.7	42.2 ± 14.1	0.282
LAD (mm)	42.5 ± 8.1	42.5 ± 7.9	42.5 ± 8.8	0.941	42.3 ± 8.0	43.0 ± 8.4	0.306
IVS (mm)	9.4 ± 2.3	9.4 ± 2.2	9.5 ± 2.7	0.555	9.4 ± 2.4	9.4 ± 2.1	0.747
RVD (mm)	22.6 ± 4.3	22.5 ± 4.2	22.9 ± 4.5	0.281	22.7 ± 4.5	22.6 ± 3.7	0.780
Tricuspid valve regurgitation	84 (9.5%)	61 (9.2%)	23 (10.4%)	0.696	67 (10.1%)	17 (7.7%)	0.351
Mitral valve regurgitation	169 (19.1%)	121 (18.2%)	48 (21.6%)	0.305	127 (19.1%)	42 (18.9%)	1.000
Electrocardiogram findings
Heart rate (beats per minute)	69.0 ± 13.6	68.6 ± 13.7	70.1 ± 13.4	0.163	68.6 ± 13.8	70.1 ± 13.3	0.167
CLBBB	48 (5.4%)	32 (4.8%)	16 (7.2%)	0.232	38 (5.7%)	10 (4.5%)	0.604
CRBBB	53 (6.0%)	42 (6.3%)	11 (5.0%)	0.564	39 (5.9%)	14 (6.3%)	0.939
Frequent PVCs	371 (41.8%)	283 (42.6%)	88 (39.6%)	0.494	281 (42.3%)	90 (40.5%)	0.711
Pacing indication	59 (6.7%)	45 (6.8%)	14 (6.3%)	0.934	44 (6.6%)	15 (6.8%)	1.000
Comorbidities
Myocardial infarction	345 (38.9%)	266 (40.0%)	79 (35.6%)	0.276	256 (38.5%)	89 (40.1%)	0.732
Atrial fibrillation	259 (29.2%)	190 (28.6%)	69 (31.1%)	0.531	189 (28.4%)	70 (31.5%)	0.425
Hypertension	383 (43.2%)	291 (43.8%)	92 (41.4%)	0.599	285 (42.9%)	98 (44.1%)	0.797
Diabetes	179 (20.2%)	134 (20.2%)	45 (20.3%)	1.000	134 (20.2%)	45 (20.3%)	1.000
Hyperlipidemia	431 (48.6%)	324 (48.7%)	107 (48.2%)	0.954	322 (48.4%)	109 (49.1%)	0.922
Stroke	58 (6.5%)	42 (6.3%)	16 (7.2%)	0.758	48 (7.2%)	10 (4.5%)	0.208
Hyperuricemia	78 (8.8%)	64 (9.6%)	14 (6.3%)	0.169	59 (8.9%)	19 (8.6%)	0.995
Laboratory tests
NT-proBNP (pg/mL)	788.9 (302.0,1779.0)	765.8 (299.2,1761.8)	853.3 (330.8,1794.2)	0.479	743.6 (299.5,1714.0)	874.8 (316.8,1904.3)	0.268
Hemoglobin (g/L)	140.3 ± 18.1	140.1 ± 17.9	141.0 ± 19.0	0.530	140.8 ± 18.3	138.9 ± 17.8	0.174
Creatinine (μmol/L)	88.0 (75.2,103.7)	87.7 (75.3,104.0)	88.0 (75.0,102.6)	0.955	87.7 (75.0,104.0)	88.3 (75.7,102.9)	0.982
BUN (mmol/L)	6.6 (5.3,8.6)	6.7 (5.4,8.7)	6.0 (4.9,8.3)	0.015	6.6 (5.3,8.6)	6.5 (4.9,8.7)	0.428
hs-CRP (mg/L)	1.9 (0.8,4.6)	2.0 (0.8,4.7)	1.9 (0.8,4.2)	0.962	2.0 (0.8,4.3)	1.7 (0.8,5.6)	0.856
Medications
ACEI/ARB/ARNI	573 (64.6%)	426 (64.1%)	147 (66.2%)	0.617	433 (65.1%)	140 (63.1%)	0.637
Amiodarone	461 (52.0%)	354 (53.2%)	107 (48.2%)	0.222	344 (51.7%)	117 (52.7%)	0.862
Beta-blockers	747 (84.2%)	567 (85.3%)	180 (81.1%)	0.170	564 (84.8%)	183 (82.4%)	0.462
Calcium channel blockers	94 (10.6%)	74 (11.1%)	20 (9.0%)	0.446	72 (10.8%)	22 (9.9%)	0.796
Diuretics	564 (63.6%)	428 (64.4%)	136 (61.3%)	0.453	422 (63.5%)	142 (64.0%)	0.956
MRA	524 (59.1%)	396 (59.5%)	128 (57.7%)	0.676	394 (59.2%)	130 (58.6%)	0.919
Digitalis	196 (22.1%)	143 (21.5%)	53 (23.9%)	0.520	150 (22.6%)	46 (20.7%)	0.633
Statin	449 (50.6%)	330 (49.6%)	119 (53.6%)	0.342	332 (49.9%)	117 (52.7%)	0.523
Antiplatelet	322 (36.3%)	248 (37.3%)	74 (33.3%)	0.326	239 (35.9%)	83 (37.4%)	0.758
Anticoagulants	163 (18.4%)	115 (17.3%)	48 (21.6%)	0.180	120 (18.0%)	43 (19.4%)	0.733

Values are presented as the mean ± standard deviation, median (interquartile range), or frequency (%). ACEI/ARB/ARNI, angiotensin-converting enzyme inhibitor/angiotensin receptor blocker/angiotensin receptor-neprilysin inhibitor; BP, blood pressure; BUN, blood urea nitrogen; CLBBB, complete left bundle branch block; CRBBB, complete right bundle branch block; hs-CRP, high-sensitivity C-reactive protein; ICD, implantable cardioverter-defibrillator; IVS, interventricular septum thickness; LAD, left atrial diameter; LVEDD, left ventricular end-diastolic diameter; LVEF, left ventricular ejection fraction; MRA, mineralocorticoid receptor antagonist; NT-proBNP, N-terminal pro-brain natriuretic peptide; NYHA, New York Heart Association; PVC, premature ventricular contractions; RVD, right ventricular diameter.
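
As a worked example of how the p-values for categorical rows in Table 1 can be read, the sketch below applies a Pearson chi-square test with Yates continuity correction to the dual-chamber ICD row of the all-cause death split (240 of 665 in the training set versus 63 of 222 in the test set). The choice of test is an assumption made for illustration, since the test used is not stated in this part of the article; the function name is hypothetical.

```python
import math


def chi_square_2x2_yates(a, b, c, d):
    """Chi-square test with Yates correction for the 2x2 table [[a, b], [c, d]].

    Returns the test statistic and the p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Yates correction: shrink |ad - bc| by n/2 before squaring.
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Dual-chamber ICD: yes / no counts in the training and test sets.
chi2, p = chi_square_2x2_yates(240, 665 - 240, 63, 222 - 63)
print(round(chi2, 3), round(p, 3))
```

Under this assumption the computed p-value is about 0.044, in line with the tabulated value for that row. With 45 variables compared across two splits, an occasional p-value below 0.05 is expected by chance alone.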

Table 2. Parameter search space and optimal parameters for each model.

Algorithms	Parameter	Search Space	Optimal Parameter for Death Prediction	Optimal Parameter for Shock Prediction
EN-Cox	l1 ratio	0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0	0.9	0.2
	alpha	Log distribution from 0.0001 to 1	0.0233	0.0339
RSF	number of trees	100, 200, 300, 400, 500	400	500
	maximum depth	2, 3, 4, 5, 6, 7	4	7
	minimum samples required to split	10, 14, 28, 22, 40, 50	22	40
	minimum samples required at leaf nodes	5, 7, 9, 11, 20, 25	5	9
SSVM	alpha	0.1, 1, 10, 100	0.1	0.1
	gamma	1, 0.1, 0.01, 0.001	1	0.001
	kernel	rbf, poly, linear, sigmoid, cosine	poly	rbf
	degree (poly kernels only)	2, 3, 4, 5	4	-
XGBoost	loss function	CoxPH	-	-
	learning rate	0.01, 0.05, 0.10	0.1	0.1
	number of trees	20, 25, 30	30	30
	maximum depth	1, 2	2	2
	fraction of samples	0.4, 0.5	0.4	0.4
	fraction of variables	0.4, 0.5	0.5	0.4
	minimum samples required to split	1, 2	1	1

EN-Cox, elastic net Cox regression; RSF, random survival forests; SSVM, survival support-vector machine; XGBoost, eXtreme Gradient Boosting.
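
To give a sense of the size of the search spaces in Table 2, the sketch below enumerates the XGBoost rows as a full grid. This is an assumption for illustration: Table 2 lists the candidate values but does not say whether they were searched exhaustively or sampled, and the dictionary keys are descriptive labels taken from the table rather than actual XGBoost parameter names.

```python
from itertools import product

# Candidate values copied from the XGBoost rows of Table 2.
xgboost_space = {
    "learning rate": [0.01, 0.05, 0.10],
    "number of trees": [20, 25, 30],
    "maximum depth": [1, 2],
    "fraction of samples": [0.4, 0.5],
    "fraction of variables": [0.4, 0.5],
    "minimum samples required to split": [1, 2],
}


def enumerate_grid(space):
    """Return every combination of candidate values as a list of dicts."""
    keys = list(space)
    return [dict(zip(keys, values))
            for values in product(*(space[key] for key in keys))]


candidates = enumerate_grid(xgboost_space)

# Optimal setting reported in Table 2 for death prediction.
death_optimum = {
    "learning rate": 0.1,
    "number of trees": 30,
    "maximum depth": 2,
    "fraction of samples": 0.4,
    "fraction of variables": 0.5,
    "minimum samples required to split": 1,
}
print(len(candidates), death_optimum in candidates)
```

The grid contains 3 × 3 × 2 × 2 × 2 × 2 = 144 combinations, and the reported optimum for death prediction is one of them.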

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
