Prediction of All-Cause Mortality Following Percutaneous Coronary Intervention in Bifurcation Lesions Using Machine Learning Algorithms

Stratifying prognosis following coronary bifurcation percutaneous coronary intervention (PCI) is an unmet clinical need that may be fulfilled through the adoption of machine learning (ML) algorithms to refine outcome predictions. We sought to develop an ML-based risk stratification model built on clinical, anatomical, and procedural features to predict all-cause mortality following contemporary bifurcation PCI. Multiple ML models to predict all-cause mortality were tested on a cohort of 2393 patients (training, n = 1795; internal validation, n = 598) undergoing bifurcation PCI with contemporary stents from the real-world RAIN registry. Twenty-five commonly available patient-/lesion-related features were selected to train ML models. The best model was validated in an external cohort of 1701 patients undergoing bifurcation PCI from the DUTCH PEERS and BIO-RESORT trial cohorts. On receiver operating characteristic (ROC) curve analysis, the area under the curve (AUC) for the prediction of 2-year mortality was 0.79 (0.74–0.83) in the overall population, 0.74 (0.62–0.85) at internal validation, and 0.71 (0.62–0.79) at external validation. Risk ranking analysis, k-center cross-validation, and continual learning confirmed the generalizability of the models, which are also available through an online interface. The RAIN-ML prediction model represents the first tool combining clinical, anatomical, and procedural features to predict all-cause mortality among patients undergoing contemporary bifurcation PCI with reliable performance.


Introduction
The evolution of both stent technology and implantation techniques has translated into improved clinical outcomes following percutaneous coronary intervention (PCI) in complex anatomical and procedural settings, such as coronary bifurcation lesions [1][2][3][4][5].
However, real-life contemporary registries [6,7] still report a considerable risk of adverse events in this high-risk subset, warranting precise prognostication.
Available risk scores to predict adverse events associated with PCI are based on study populations with a small proportion of bifurcation lesions [8][9][10][11]. The external performance of such models in this lesion setting remains modest [12]. The absence of dedicated algorithms to predict long-term outcomes of PCI in coronary bifurcations contrasts with the abundant evidence demonstrating a significantly poorer short- and long-term prognosis of these lesions compared to the overall PCI population, most likely as a result of multifaceted differences in procedural technique, hemorheology, and vessel healing that make PCI in bifurcations less forgiving than in other anatomical subsets [5][13][14][15][16]. While the risk stratification of ischemic endpoints is pivotal to informing patient management and therapeutic choices, mortality prediction remains an important goal to improve physician-patient communication, orientate follow-up and clinical decision making, and allow comparative effectiveness research in order to guide procedural strategy and further technical advances [17,18]. However, to date, there is no available tool to predict long-term mortality following bifurcation PCI.
In the clinical research field of risk prediction following PCI, available studies favored either clinical [8] or anatomy/procedure [11,19] focused approaches to assess residual risk rather than exploiting the multidimensional nature of risk, which may be better determined by integrating these factors, especially in the bifurcation setting. Moreover, traditional prognostic risk assessment is constructed upon a limited selection of variables based on a priori assumptions, potentially omitting routinely assessed, powerful outcome predictors. This potential limit of classical inferential statistics could be overcome by machine learning (ML) that adopts a radically different approach, focusing on algorithmic representations of data and their classification in order to establish and quantify the relationships among variables [20]. Indeed, ML has emerged as a powerful approach to circumvent the limitations of current methods by applying computational algorithms to large datasets with numerous multiparametric variables, capturing high-dimensional, non-linear relationships among clinical features to make data-driven outcome predictions [21]. The effectiveness of this strategy has been shown across several cardiovascular applications, where ML was superior to validated traditional risk stratification tools, including the prediction of adverse events among patients with coronary artery disease or heart failure undergoing cardiac resynchronization therapy [22][23][24]. Thus, we developed an ML-based risk stratification model integrating clinical, anatomical, and procedural features to predict mortality following bifurcation PCI, utilizing a large international cohort of patients undergoing coronary bifurcation PCI with contemporary stents [5]. The model was then validated in a large population derived from two randomized trials of contemporary stents.

Study Population
The present study includes 4094 patients with a coronary bifurcation lesion treated with contemporary, very thin drug-eluting stents. The ML-based model was developed and internally validated using the RAIN (very thin stents for patients with left main or bifurcation in real life, NCT03544294) registry population (defined as the discovery cohort in the present study). RAIN is a multicenter retrospective registry including consecutive patients undergoing unprotected left main or coronary bifurcation PCI from 2014 to 2017 at 23 institutions worldwide [5]. Patients undergoing ostial/mid-shaft left main PCI or patients with incomplete clinical, angiographic, procedural, and outcome data were excluded from this study. Thus, 2393 patients undergoing bifurcation PCI with very thin strut stents comprised the final discovery cohort. The discovery cohort was randomly divided into a training cohort (n = 1795) and an internal validation cohort (n = 598).
The external validation cohort for the ML-based model was obtained from the merged DUTCH PEERS (Durable polymer-based stent Challenge of Promus ElemEnt versus Resolute integrity: TWENTE II) trial and BIO-RESORT trial patient cohorts [2,25,26]. More specifically, 1701 patients with a coronary bifurcation lesion treated with very thin stents (n = 465 from DUTCH PEERS, n = 1236 from BIO-RESORT) with follow-up truncated at 2 years comprised the external validation cohort.
Detailed descriptions of the study cohorts are provided as Supplementary Materials. Cardiovascular risk factors, clinical presentation, angiographic features, use of intravascular imaging, bifurcation technique details, characteristics of the treated lesion, and implanted stents were collected in a dedicated database. The primary endpoint of the study was all-cause mortality at two years, while all-cause death at 30 days and at one year were evaluated as secondary endpoints. The study complies with the Declaration of Helsinki, all of the patients provided informed consent for inclusion in the registries, and local institutional review board approval was obtained by each center.

Model Development
An overview of model development is provided in Figure 1A. The model was trained and validated according to TRIPOD guidelines. The discovery cohort was randomly split into a training (n = 1795) and an internal validation (n = 598) dataset. The Fisher score was used for feature selection in the training cohort, and the variables with a coefficient >0.75 were retained (Figure S1); the selected variables were used to develop predictive models. A grid search including 5 different ML classifiers (linear discriminant analysis [LDA], random forest [RF] regressor, support vector machine [SVM] with a linear or Gaussian kernel, and isolation forest) and 3 algorithms for data imbalance correction (synthetic minority over-sampling technique [SMOTE], SMOTE and nearest neighbors, and random oversampling methods) was performed on the training cohort, generating 13 models for mortality prediction after bifurcation PCI. Data imbalance correction algorithms were applied to avoid the accuracy paradox (a falsely high accuracy due to over-prediction of the most represented class); oversampling techniques generate simulated patient data, starting from real patients of the discovery cohort, in the virtual space defined by patient parameters, so as to balance the number of patients with and without death occurrence during model training.
LDA applies a linear combination approach; the predicted endpoint is derived from the following rule: "Endpoint (all-cause mortality) = LDAcoeff_1 × Variable_1 + LDAcoeff_2 × Variable_2 + ... + LDAcoeff_n × Variable_n > tested thresholds". The coefficients are generated by the algorithm to maximize the separation between groups (Death vs. No events), increasing precision estimates by variance reduction; variables represent patients' features, selected as described above.
The RF algorithm generates a pre-defined set of classification trees ("n" classification trees) with a fixed maximum number of splits for each tree. The predicted endpoint results from the outcome of each classification tree of the forest; if at least "(n/2) + 1" of the "n" trees of the RF predict death as an outcome, then this endpoint is assigned to the patient.
Linear SVM builds a classification model to assign patients to their outcome given a linear boundary, while Gaussian SVM allows the patients to be divided using a non-linear boundary. The model equations are: "SVMcoeff_0 + SVMcoeff_1 × Variable_1 + SVMcoeff_2 × Variable_2 + ... + SVMcoeff_n × Variable_n" and "SVMcoeff_0 + SVMcoeff_1 × f(Variable_1) + SVMcoeff_2 × f(Variable_2) + ... + SVMcoeff_n × f(Variable_n)", respectively, where "f" is an exponential function. Isolation forest is a particular type of RF that uses unsupervised learning to discriminate anomalies (in this case, patients with death occurrence) from normal data (patients without events).
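The majority rule described for the RF can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each tree is reduced to a function returning 1 (death) or 0 (no event), "(n/2) + 1" is read with integer division, and the three toy trees with their thresholds are hypothetical and are not taken from the RAIN-ML model.

```python
from typing import Callable, Sequence

# A "tree" is any callable mapping a patient's features to 1 (death) or 0 (no event).
Tree = Callable[[dict], int]


def forest_predicts_death(trees: Sequence[Tree], patient: dict) -> bool:
    """Assign death when at least (n // 2) + 1 of the n trees vote for it."""
    votes = sum(tree(patient) for tree in trees)
    return votes >= len(trees) // 2 + 1


# Hypothetical single-split trees, for illustration only.
toy_forest = [
    lambda p: int(p["ckd"]),        # chronic kidney disease present
    lambda p: int(p["age"] > 80),   # assumed age threshold
    lambda p: int(p["ef"] < 35),    # assumed ejection fraction threshold
]

patient_a = {"ckd": True, "age": 85, "ef": 50}   # two of three trees vote death
patient_b = {"ckd": True, "age": 60, "ef": 50}   # one of three trees votes death
print(forest_predicts_death(toy_forest, patient_a))
print(forest_predicts_death(toy_forest, patient_b))
```

With three trees the threshold is two votes, so the first toy patient is assigned the endpoint and the second is not.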
A random forest regressor algorithm with random oversampling correction yielded the highest accuracy for the prediction of death occurrence, and it is referred to throughout the manuscript as the RAIN-ML prediction model. A 10-fold cross-validation was used to select and tune the hyper-parameters (number of classification trees and number of splits) of the RAIN-ML model in the training cohort; the hyper-parameters reaching the highest accuracy in outcome prediction were selected. Thereafter, its performance was assessed by K-center cross-validation, risk stratification analysis, continual learning, and both internal and external validation. Overfitting bias was defined as the difference between the accuracy obtained during training and the accuracy during the internal or external validation. The model was developed to predict 2-year all-cause mortality; its performance was then also assessed at different time points (30 days and 1 year). A detailed description of the model development is provided in the Supplementary Materials (Extended Methods section).
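As a rough illustration of two ideas in this paragraph, the sketch below shows random oversampling in its usual sense (duplicating randomly chosen minority-class rows until the classes are balanced) and the overfitting bias as a simple accuracy difference. The function names are hypothetical, labels are assumed to be coded 0/1 with both classes present, and the authors' actual implementation (described in their Extended Methods) may differ.

```python
import random


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until both classes have equal size."""
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    minority = min(by_class, key=lambda c: len(by_class[c]))
    majority = 1 - minority
    deficit = len(by_class[majority]) - len(by_class[minority])
    # Sample with replacement from the existing minority rows.
    extra = [rng.choice(by_class[minority]) for _ in range(deficit)]
    return list(rows) + extra, list(labels) + [minority] * deficit


def overfitting_gap(train_accuracy, validation_accuracy):
    """Overfitting bias as defined in the text: training minus validation accuracy."""
    return train_accuracy - validation_accuracy


# Toy data: one death (label 1) among ten patients.
rows, labels = random_oversample(list(range(10)), [1] + [0] * 9)
print(labels.count(0), labels.count(1))
print(round(overfitting_gap(81.1, 79.8), 1))
```

In a cross-validated setting, oversampling would normally be applied to the training folds only, so that duplicated patients never leak into the fold used for evaluation.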
A user-friendly online interface was designed to facilitate the application of the RAIN-ML prediction model in clinical practice (available at https://rain.hpc4ai.it; accessed on 12 May 2022).

Statistical Analysis
Categorical variables were reported as count and percentage and analyzed by chi-square test. Continuous variables were reported as median and interquartile range and analyzed by Mann-Whitney U-test. The analysis of the receiver operating characteristic (ROC) curves was performed to calculate the area under the curve (AUC) and to derive the best cut-off by evaluation of the Youden Index (J = sensitivity + specificity − 1). A two-tailed p-value of less than 0.05 was considered significant. Analyses were performed using IBM SPSS Statistics 26 (IBM, New York, NY, USA), Python 3.5 (library, scikit-learn), and GraphPad PRISM 8.0 (La Jolla, CA, USA).
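The Youden Index cut-off selection can be illustrated with a short Python sketch. It assumes that a patient is predicted positive when the model score is at or above the candidate threshold and that both outcome classes are present; the function name and the toy scores are hypothetical.

```python
def youden_cutoff(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A patient is counted as predicted positive when score >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_threshold, best_j = None, -1.0
    for threshold in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        true_neg = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
        j = true_pos / positives + true_neg / negatives - 1
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Toy scores that separate the two classes perfectly at 0.8.
toy_scores = [0.1, 0.2, 0.3, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 1]
print(youden_cutoff(toy_scores, toy_labels))
```

Each distinct score is tried as a threshold, and ties in J are resolved in favor of the lowest threshold.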

Characteristics of the Discovery Cohort
The discovery cohort (n = 2393) was used to develop and internally validate the RAIN-ML prediction model. The baseline characteristics of the patients undergoing bifurcation PCI (median age 69 [interquartile range, IQR: 61-77] years, male sex 76.0%) from the discovery cohort are reported in Table S1, stratified by death occurrence. The discovery cohort was randomly split into a training and an internal validation dataset to develop and test the predictive models (Table 1). There were no differences between the training and internal validation cohorts (Table S2). After a median follow-up of 274 (IQR 52-434) days, 137 (5.7%) patients died (103 and 34 from the training and internal validation cohort, respectively; 30-day, 1-year, and 2-year mortalities were 1.2%, 3.7%, and 5.2%, respectively).

Development and Internal Validation of the RAIN-ML Prediction Model
Several patient and lesion-related parameters differed significantly between patients in whom the all-cause death endpoint was or was not reached at 2-year follow-up (Table S1). Features associated with all-cause mortality were selected by Fisher score (see Methods and Extended Methods sections) in the training cohort. Of the 38 patient and lesion-related parameters, 13 were excluded, leading to a final set of 25 input variables (13 related to patient history and clinical presentation, seven to coronary anatomy, and five to the PCI procedure; Figure S1). Chronic kidney disease (CKD, defined as a glomerular filtration rate < 60 mL/min/1.73 m²) was the best predictor of all-cause mortality, followed by the indication for PCI, first lesion vessel, diabetes, diffuse coronary disease, left ventricular ejection fraction (EF, %), kind of bifurcation, and age (Figure 1B). Among the 13 trained models, a random forest regressor algorithm with random oversampling correction yielded the highest accuracy in predicting 2-year all-cause mortality and was selected as the RAIN-ML prediction model (Table S3). A representative classification tree of the random forest RAIN-ML model is shown in Figure 1C. After tuning (Table S4), the RAIN-ML model displayed an accuracy of 81.1% at training (AUC 0.791; 95% CI 0.742-0.840), and 79.8% at internal validation (AUC 0.768; 95% CI 0.669-0.868), with an overfitting effect of 1.3% (Figure 1D). The sensitivity and specificity were 82.5/81.0% during training and 67.6/80.5% during internal validation, with 85 of 103 and 23 of 34 patients who experienced the endpoint being correctly classified.
The performance of the RAIN-ML model in all-cause mortality prediction was assessed at different time points (the model was developed to predict 2-year all-cause mortality as the primary endpoint, and its performance was then also assessed at 30-day follow-up, at 1-year follow-up, and including all events; Figure 2). After 1 year, the AUC was 0.777 (95% CI 0.721-0.834) during training and 0.718 (95% CI 0.586-0.850) during internal validation (Figure 2B). After 2 years, the AUC was 0.799 (95% CI 0.745-0.852) during training and 0.736 (95% CI 0.624-0.847) during internal validation (Figure 2C).
To further confirm the generalizability of the RAIN-ML prediction model, we applied a K-center cross-validation approach to the discovery cohort (n = 2393). The analysis confirmed an acceptable performance in each of the 23 participating institutions, with a mean accuracy, sensitivity, and specificity of 75.3%, 60.6%, and 76.2%.
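One plausible reading of K-center cross-validation is a leave-one-center-out scheme, in which each institution is held out in turn and the model is trained on the remaining centers. The sketch below shows only the index splitting under that assumption; the function name and the toy center labels are hypothetical, and the exact procedure used by the authors is described in their Supplementary Materials.

```python
from collections import defaultdict


def leave_one_center_out(center_ids):
    """Yield (held_out_center, train_indices, test_indices), one fold per center."""
    by_center = defaultdict(list)
    for index, center in enumerate(center_ids):
        by_center[center].append(index)
    for center in sorted(by_center):
        test = by_center[center]
        # Training set: every patient enrolled at any other center.
        train = sorted(i for c, idx in by_center.items() if c != center for i in idx)
        yield center, train, test


# Toy example: five patients enrolled at three centers.
toy_centers = ["A", "B", "A", "C", "B"]
for center, train, test in leave_one_center_out(toy_centers):
    print(center, train, test)
```

Per-center accuracy, sensitivity, and specificity could then be averaged across folds to obtain summary figures of the kind reported above.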

Risk Stratification Analysis
In the mixed discovery and external validation cohorts, increasing coefficients of the RAIN-ML prediction model were directly correlated with the proportion of subjects with death occurrence (Figure 3A-D). Patient stratification according to the RAIN-ML model and the occurrence of death at follow-up are reported in Table S6. The lowest risk patients with an ML model coefficient of 0.10-0.19 displayed an all-cause mortality risk of 0.3%, 0.9%, and 2.3%, after 30 days, 1-year, and 2-year follow-up, respectively. On the other hand, the highest risk patients with an ML model coefficient of 0.90-1.00 displayed an all-cause mortality risk of 8.1%, 23.7%, and 72.2%, respectively, after 30 days, 1-year, and 2-year follow-up. Using cut-offs derived by ROC curve analysis to optimize sensitivity and specificity, we then stratified patients according to the predicted risk of all-cause mortality. For the RAIN-ML prediction model, a coefficient of less than 0.21 identified a low-risk subgroup of patients with a risk of 1.4% of all-cause mortality (35 of 2444 subjects), patients with a coefficient ranging between 0.21 and 0.70 showed an intermediate risk of 4.5% (52 of 1166 subjects), while a coefficient higher than 0.70 identified a risk of 18.4% (high-risk group; 89 of 484 subjects; Figure 3E). As compared to low risk, being categorized as intermediate risk and high risk was associated with increased (3.2-fold and 13.1-fold, respectively, both p < 0.001) mortality. The risk ranking approach (after the exclusion of patients at intermediate risk) led to a sensitivity/specificity of 71.8/85.9% with an overall accuracy of 85.3% when low-risk patients were compared to those classified as high risk.
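The risk-ranking figures above can be reproduced from the reported group counts. The following sketch assumes that "high risk" is treated as a positive prediction and "low risk" as a negative one, with intermediate-risk patients excluded; the helper names are hypothetical.

```python
# (deaths, subjects) per risk group, as reported in the text.
low = (35, 2444)
intermediate = (52, 1166)
high = (89, 484)


def mortality_pct(group):
    """Observed all-cause mortality in a risk group, in percent."""
    deaths, subjects = group
    return 100 * deaths / subjects


def risk_ranking_metrics(low_group, high_group):
    """Sensitivity, specificity, and accuracy (%) of high- versus low-risk labeling."""
    true_pos = high_group[0]                    # deaths labeled high risk
    false_pos = high_group[1] - high_group[0]   # survivors labeled high risk
    false_neg = low_group[0]                    # deaths labeled low risk
    true_neg = low_group[1] - low_group[0]      # survivors labeled low risk
    sensitivity = 100 * true_pos / (true_pos + false_neg)
    specificity = 100 * true_neg / (true_neg + false_pos)
    accuracy = 100 * (true_pos + true_neg) / (low_group[1] + high_group[1])
    return sensitivity, specificity, accuracy


print([round(mortality_pct(g), 1) for g in (low, intermediate, high)])
# prints [1.4, 4.5, 18.4]
print([round(value, 1) for value in risk_ranking_metrics(low, high)])
# prints [71.8, 85.9, 85.3]
```

The three groups sum to 4094 patients, matching the combined discovery and external validation cohorts.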

External Validation of the RAIN-ML Model
The patients from the external validation cohort were younger (65 [IQR 57-72] years), with a lower prevalence of cardiovascular risk factors, prior myocardial infarction, and coronary revascularizations compared to the discovery cohort (Tables 1 and S5). At external validation, 1312 of 1701 patients were correctly classified according to death occurrence.
The predictive performances of the RAIN-ML model are summarized in Table S7.

Discussion
A reliable and clinically relevant patient risk stratification is a prerequisite for adequate treatment selection, informed consent, and improved care, all key elements of modern personalized medicine. Following this guiding principle, we developed and validated a prediction model based on supervised machine learning algorithms to identify long-term all-cause mortality in patients undergoing PCI on coronary bifurcations.
Our model was first developed using the largest available bifurcation PCI registry reflective of contemporary practice, encompassing a wide range of very thin second-generation drug-eluting stents and procedural techniques, applied to a variety of clinical scenarios among all-comers at 23 institutions worldwide. Subsequently, our findings were externally validated using a large bifurcation PCI population derived from two randomized trials of second-generation drug-eluting stents [2,25,26].
The RAIN-ML model, correctly classifying 3245 of 4094 patients, showed a good discriminative capability for all-cause mortality prediction, confirmed at both internal and external validation, and by K-center cross-validation, with an accuracy of 81.1% during training and ranging between 77.1% and 79.8% during validation. According to the RAIN-ML prediction model, following bifurcation PCI, 6% of the patients displayed a 1-year mortality risk > 20%, while about 60% of them carried a 1-year mortality risk below 2%. The adoption of the RAIN-ML model in contemporary practice may allow for the performance of reliable and clinically relevant risk stratification to inform, personalize, and improve care. An accurate risk stratification would allow, on the one hand, targeted strategies in patients with the highest risk of death through a comprehensive evaluation and a tailored approach and, on the other hand, less intensive follow-up for those at low risk.
Despite several differences in the characteristics of the study cohorts, external validation yielded a good performance, suggesting good generalizability and robustness of the RAIN-ML model. In particular, the cohorts of the BIO-RESORT and DUTCH PEERS trials were composed of patients with a lower burden of cardiovascular risk factors and previous cardiovascular events, translating into lower death rates at follow-up as compared to the real-world clinical setting of the RAIN cohort. Moreover, in the external validation cohorts, "left ventricular ejection fraction", a powerful feature of the RAIN-ML model, was coded dichotomically, and the variable "diffuse coronary disease" was unavailable. The model performed well despite these missing data and potentially limiting factors that might have affected model discrimination. Importantly, the model performed well at internal validation, suggesting its applicability in the real-world setting and its potential usefulness in daily clinical practice.

Rationale of the Study and Related Work
In this study, we focused on all-cause mortality as the primary endpoint to offer a comprehensive evaluation of the biological risk of this patient subset, which displays unique features as compared to the overall PCI population. Specifically, patients undergoing bifurcation PCI have higher short- and long-term all-cause mortality as compared to patients with non-bifurcation PCI [13,27], pointing to a peculiar association and possibly causal link of bifurcation lesions with mortality. If, on the one hand, the presence of bifurcation lesions is a potential proxy for a more severe atherosclerotic burden, on the other hand, it may also represent a subset biologically more prone to adverse events due to the peculiar rheological characteristics [14,15]. Moreover, specific procedural aspects are associated with long-term mortality in bifurcation PCI, highlighting the importance, beyond the lesion's natural history, of PCI-related factors in determining the prognosis of this population [5,16]. However, there is presently no tool reflective of current clinical practice to predict long-term mortality following bifurcation PCI. For these reasons, we focused on a multifaceted approach to residual risk, based on a comprehensive evaluation of patient-, anatomy-, and procedure-related factors integrated by a machine learning approach able to handle multidimensional information and produce data-driven outcome prediction. A key advantage of this approach is that investigators do not generally need to specify which potential predictor variables to consider and in which combinations. A multidimensional approach may be highly relevant in this setting, which is characterized by high anatomical and procedural complexity. Indeed, previous scores to predict adverse events focusing on either patient-related or anatomy/procedure-related factors performed only modestly in patients with bifurcation PCI [28].
Specifically, in a previous analysis of the RAIN registry, the PCI complexity definition proposed by Giustino et al. [11] using validated and guideline-endorsed criteria (reflecting anatomy/procedure-related factors) was unable to discriminate post-procedural mortality (AUC 0.49), and the PARIS risk score [8] (reflecting patient-related factors) displayed only a modest discrimination capacity (AUC 0.65). More importantly, these tools showed potential for an accurate event prediction when combined, thus suggesting that a comprehensive evaluation of clinical, anatomical, and procedural features may better reflect residual risk [28].
Nevertheless, the aim of the RAIN-ML prediction model was not to guide bifurcation PCI based on pre-procedural risk but rather to provide the patient and the treating physician with information that also integrates procedural outcome predictors that may significantly modify the patient's prognostic trajectory and, consequently, their clinical management and follow-up.

Perspectives
When compared to traditional scores, ML-based models do not always improve performance in terms of accuracy [29]. However, several advantages may become apparent in the long term [30][31][32][33][34][35][36][37]. Specifically, compared to the static nature of traditional scores, the performance of the RAIN-ML prediction model is dynamic: its continual learning feature allows the model to refine its classification algorithm as enrollment time and the number of recruited patients increase.
An example of continual learning applied to the discovery cohort of the RAIN-ML model is presented in Figure 4: the model was trained at each time point on an increasing number of patients. From 3 months to 33 months of enrollment, the accuracy increased from 67.9% to 78.7% at validation. We therefore plan, through a dedicated anonymized system integrated with the freely available online interface (https://rain.hpc4ai.it; accessed on 12 May 2022), to prospectively continue RAIN-ML training in order to continually improve outcome prediction. Moreover, future work should assess whether the integration of more granular features, such as the characteristics of plaque vulnerability evaluated by intravascular imaging, might improve outcome prediction. Similarly, features of non-cardiovascular comorbidities, which may be relevant to the mortality endpoint in this complex patient population, have not been evaluated in the present work: future studies may integrate these features with the RAIN-ML model to possibly improve its discriminative capability. Finally, prospective validation of the RAIN-ML model in an external real-world population remains desirable.

Limitations
This study has some limitations to be acknowledged. First, the RAIN registry was retrospective. However, the good generalizability demonstrated in the external validation cohort constituted by bifurcation PCI patients from two randomized trials is reassuring. Second, the model performance in the cohorts and at different time points ranged from moderate to good. As discussed, the integration of more granular technical and anatomical features along with data on non-cardiovascular comorbidities might have potentially improved outcome prediction. However, this would have limited the model adoption as these features are not yet routinely assessed in everyday clinical practice.
Third, our ML model requires several input variables that might discourage its use. However, all these variables are generally readily available, and the user-friendly online interface makes risk estimation at the different evaluated time-points an easy and quick procedure.

Conclusions
The RAIN-ML prediction model represents the first developed tool combining clinical, anatomical, and procedural features through a machine learning approach to predict all-cause mortality among patients undergoing contemporary coronary bifurcation PCI, with robust performance and generalizability for mortality prediction across different clinical scenarios and at different time points. The adoption of the RAIN-ML model has the potential to improve doctor-to-patient communication, patient management, and clinical research.

Institutional Review Board Statement:
The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board (ethic committee approval code: 00253/2021).

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data that support the findings of this study are available on reasonable request from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.