Important Risk Factors in Patients with Nonvalvular Atrial Fibrillation Taking Dabigatran Using Integrated Machine Learning Scheme—A Post Hoc Analysis

Our study aims to develop an effective integrated machine learning (ML) scheme to predict vascular events and bleeding in patients with nonvalvular atrial fibrillation taking dabigatran and to identify important risk factors. This study is a post-hoc analysis of the Randomized Evaluation of Long-Term Anticoagulant Therapy trial database. One traditional prediction method, logistic regression (LGR), and four ML techniques (naive Bayes, random forest (RF), classification and regression tree, and extreme gradient boosting (XGBoost)) were combined to construct our scheme. The area under the receiver operating characteristic curve (AUC) of RF (0.780) and XGBoost (0.717) was higher than that of LGR (0.674) in predicting vascular events. In predicting bleeding, the AUC of RF (0.684) and XGBoost (0.618) was also higher than that of LGR (0.605). Our integrated ML feature selection scheme, based on these two better-performing techniques, identified age, history of congestive heart failure and myocardial infarction, smoking, kidney function, and body mass index as major variables of vascular events, and age, kidney function, smoking, bleeding history, concomitant use of specific drugs, and dabigatran dosage as major variables of bleeding. ML is an effective approach for analyzing complex medical data. Our results may provide preliminary direction for precision medicine.


Introduction
Stroke is a leading cause of death and disability worldwide [1]. Cardioembolic stroke is a primary subtype, and nonvalvular atrial fibrillation (NVAF) is one of the most common risk factors for cardioembolic stroke, with a global prevalence of 1-2% [2]. In recent decades, anticoagulant treatment for these patients has shifted from the traditional vitamin K antagonist warfarin to non-vitamin K antagonist oral anticoagulants (NOACs) [3][4][5]. Because of the significant increase in the clinical demand for NOACs, off-label use, especially in dosage selection, has become an important issue in recent years. In real-world studies, off-label low-dose NOACs were prescribed to approximately 9-31% of patients with NVAF [6,7]. Adverse effects including a higher risk of ischemic stroke and systemic embolism have been observed in these patients [8,9].
Dabigatran etexilate, the only direct thrombin inhibitor, is an NOAC with two approved doses based on the Randomized Evaluation of Long-Term Anticoagulant Therapy (RE-LY) trial [10]. In this trial, low-dose dabigatran (110 mg twice daily) prevented vascular events as effectively as warfarin, with lower rates of major hemorrhage.
High-dose dabigatran (150 mg twice daily) was associated with lower rates of vascular events but similar rates of major hemorrhage. The dosage adjustment plan of dabigatran was based on previous studies and expert opinions (European label) [11], which suggested that clinicians could decrease the dosage of dabigatran among patients aged >80 years, those aged 75-80 years with a high risk of bleeding, or those with concomitant use of verapamil. Physicians must balance the risk of recurrent stroke against bleeding tendency in clinical practice. Currently, two scores are used for this purpose. The CHA2DS2-VASc score (congestive heart failure, hypertension, age ≥ 75 years [doubled], diabetes mellitus, prior stroke, transient ischemic attack or thromboembolism [doubled], vascular disease, age 65-74 years, and sex category) [12] is used to calculate the risk of recurrent ischemic stroke, and the HAS-BLED score (hypertension, abnormal renal/liver function, stroke, bleeding history or predisposition, labile international normalized ratio, elderly [age ≥ 65 years], drugs/alcohol concomitantly) [13] is used to assess bleeding risk. However, these tools share several grading factors: old age, hypertension, and stroke history. This may lead to a clinical dilemma, i.e., one patient could score high in both scoring systems. Although the CHA2DS2-VASc score has been widely used for years with convenience and reliability [14,15], insufficient prediction performance (C statistic of 0.679) has remained a concern [16]. Machine learning (ML) methods have recently been used as well-constructed analytical, classification, and prediction tools for medical problems [17][18][19][20][21][22]. Their ability to capture complex relationships between risk factors and outcomes and to extract important information hidden in vast amounts of medical data has made them an emerging research topic. Kamel et al. [23] and Chun et al. [24] have confirmed the feasibility of predicting vascular events using ML methods.
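The overlap between the two scores described above can be made concrete with a short sketch. The following Python example is illustrative only: the `Patient` record and its field names are hypothetical, each HAS-BLED item is assumed to contribute one point, and the items are simplified to yes/no flags using the thresholds quoted in the text (for example, age ≥ 65 years for "elderly").

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical patient record; field names are illustrative only."""
    age: int
    female: bool = False
    chf: bool = False
    hypertension: bool = False
    diabetes: bool = False
    prior_stroke: bool = False       # stroke, TIA, or thromboembolism
    vascular_disease: bool = False
    abnormal_renal: bool = False
    abnormal_liver: bool = False
    bleeding_history: bool = False
    labile_inr: bool = False
    drugs: bool = False              # e.g. antiplatelet agents or NSAIDs
    alcohol: bool = False


def cha2ds2_vasc(p: Patient) -> int:
    """Stroke-risk score; age >= 75 and prior stroke count double."""
    score = 0
    score += 1 if p.chf else 0
    score += 1 if p.hypertension else 0
    score += 2 if p.age >= 75 else (1 if p.age >= 65 else 0)
    score += 1 if p.diabetes else 0
    score += 2 if p.prior_stroke else 0
    score += 1 if p.vascular_disease else 0
    score += 1 if p.female else 0
    return score


def has_bled(p: Patient) -> int:
    """Bleeding-risk score; one point per item (simplified flags)."""
    items = [
        p.hypertension, p.abnormal_renal, p.abnormal_liver, p.prior_stroke,
        p.bleeding_history, p.labile_inr, p.age >= 65, p.drugs, p.alcohol,
    ]
    return sum(items)


# An older hypertensive patient with a prior stroke scores on both scales,
# because age, hypertension, and stroke history appear in each of them.
example = Patient(age=78, hypertension=True, prior_stroke=True)
print(cha2ds2_vasc(example), has_bled(example))
```

In this hypothetical case every point on both scales comes from the three shared factors, which is the dilemma the text describes.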
Prediction models that use only one ML technique might not provide complete, adequate, and stable feature selection results. Our study therefore developed an integrated ML feature selection scheme that offers stable and balanced performance. Our method may reveal important variables influencing the efficacy and safety of dabigatran and thereby support precision-medicine guidance on dose selection and risk control for patients with different characteristics.

Study Population
This study is a post-hoc analysis based on the RE-LY trial dataset. This study was reviewed and approved by the Research Ethics Review Committee of the Fu Jen Catholic University Hospital. The requirement for informed consent was waived, since the data contain only de-identified information.
In the RE-LY trial, >18,000 patients with newly diagnosed arrhythmia and indications of secondary prevention with an anticoagulant were randomized to receive dabigatran 110 or 150 mg twice daily or an adjusted dose of warfarin with a median follow-up period of approximately 2 years. Exclusion criteria included a history of severe heart valve disorders, a recent stroke, and renal insufficiency. The primary outcome was stroke or systemic embolism and the primary safety outcome was major hemorrhage. The definitions and results of other secondary outcomes have been described in detail and published. We collected the data of patients taking dabigatran with complete follow-up in the RE-LY trial for the present analysis.

Proposed Integrated Machine Learning Scheme
We proposed an integrated ML feature selection scheme for predicting vascular events and bleeding in patients with NVAF taking dabigatran and for identifying important risk factors. Figure 1 shows the process of establishing the proposed scheme. First step: Identify risk factors as predictor variables and define target variables. For risk factors we referred to the recommendations in the guidelines of the American Heart Association and the European Society of Cardiology [11,25], which included sex; age; body mass index (BMI); body weight; ethnicity; kidney function abnormality; concomitant use of specific drugs; history of hypertension, stroke, previous bleeding, myocardial infarction (MI), diabetes mellitus (DM), congestive heart failure (CHF), or systemic embolism; smoking; and liver function abnormality. Boundaries of subgroups in most variables followed the definition of CHA2DS2-VASc and HAS-BLED scores. BMI was classified according to the definition of the World Health Organization [26]. Moderate and severe kidney function abnormality was labeled according to the United States Food and Drug Administration (USFDA) [27].
For analyzing the influence of these factors on efficacy and safety, we selected two target variables including vascular events (P1: stroke, MI, systemic embolism, and vascular death) and major bleeding (P2: major bleeding defined as blood loss with a decrease in hemoglobin level of ≥2 g/dL (1.2 mmol/L), transfusion of ≥2 packed red blood cells, or symptomatic bleeding in a critical area or organ). Our presumed important variables and prognostic outcomes were individually categorized according to the definition shown in Table 1.
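The two target variables defined above can be written as simple predicates. This is a minimal sketch, assuming hypothetical argument names and a per-patient record of the relevant findings; it is not the coding used in the RE-LY database.

```python
def is_vascular_event(stroke: bool, mi: bool,
                      systemic_embolism: bool, vascular_death: bool) -> bool:
    """P1: any of stroke, MI, systemic embolism, or vascular death."""
    return bool(stroke or mi or systemic_embolism or vascular_death)


def is_major_bleeding(hb_drop_g_dl: float, transfused_units: int,
                      critical_site_bleed: bool) -> bool:
    """P2: hemoglobin drop >= 2 g/dL, >= 2 units of packed red blood
    cells transfused, or symptomatic bleeding in a critical area or organ."""
    return bool(hb_drop_g_dl >= 2.0
                or transfused_units >= 2
                or critical_site_bleed)
```

Each patient then receives two binary labels, one per target variable, and the same predictor set is used for both.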
Subjects were identified according to participants' characteristics and laboratory data collected during their enrolment in the RE-LY trial. Only patients with available complete information were included in our analysis. Two independent investigators confirmed prognostic outcomes according to the criteria mentioned in the trial.
Abbr.: BMI, body mass index; GFR, glomerular filtration rate; TIA, transient ischemic attack; NSAID, nonsteroidal anti-inflammatory drug; COX, cyclooxygenase; MI, myocardial infarction; DM, diabetes mellitus; CHF, congestive heart failure.
# Liver function abnormality defined as a medical history of cirrhosis or abnormal biochemical data when the patients were enrolled (bilirubin level more than two times the upper limit of normal, plus one or more of aspartate transaminase, alanine transaminase, or alkaline phosphatase level more than three times the upper limit of normal).
† Vascular events defined as stroke, myocardial infarction, systemic embolism, and vascular death.
* Major bleeding was defined as blood loss with a decrease in hemoglobin level of ≥2 g/dL (1.2 mmol/L), transfusion of ≥2 packed red blood cells, or symptomatic bleeding in a critical area or organ. Critical areas were intraocular, intracranial (including hemorrhagic stroke), intraspinal, intramuscular with compartment syndrome, retroperitoneal, intraarticular, or pericardial.
The study protocol included one traditional prediction method, logistic regression (LGR), and four ML techniques, viz., naive Bayes (NB), random forest (RF), classification and regression tree (CART), and extreme gradient boosting (XGBoost). NB is a popular ML model used for classification tasks. Based on Bayes' theorem, it sorts objects according to specific characteristics and variables by calculating the probability that an object belongs to each presumed group [28]. RF is an ensemble learning method developed by constructing several decision trees. It draws numerous random samples of the data and variables to train the individual trees, which alleviates the tendency of single decision trees to overfit. Each tree in the RF outputs its own prediction, and the class with the most votes becomes the model's prediction [29]. CART is a classification ML algorithm that constructs a decision tree based on the Gini impurity index. The decision tree structure comprises root, internal, and leaf nodes, which may represent training data and decision-making points. The CART prediction model is constructed by repeatedly picking variables and evaluating candidate splits until an appropriate tree is produced [30]. XGBoost is an optimized distributed gradient boosting system that implements ML algorithms based on the gradient boosting framework. It uses a regularization term to control model complexity and uses first- and second-order derivatives to perform a second-order Taylor expansion of the loss function [31]. These ML methods are interpretable prediction and classification tools that perform well on vast amounts of unprocessed data, and they have been widely applied to cerebrovascular and cardiovascular disease problems [32][33][34][35]. Meanwhile, LGR, a widely accepted analytic method in medical research, served as the benchmark in our study.
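Two of the building blocks mentioned above, the Gini impurity used by CART to evaluate splits and the majority vote used by RF to combine trees, are simple enough to sketch directly. The function names below are illustrative and do not correspond to any particular library.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_impurity(left, right):
    """Size-weighted Gini impurity of the two child nodes of a split.
    A tree-growing procedure prefers the candidate split that minimizes it."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)


def majority_vote(tree_predictions):
    """Random-forest style aggregation: the most frequent class wins."""
    return Counter(tree_predictions).most_common(1)[0][0]


# A perfectly mixed binary node has impurity 0.5; a pure node has 0.0.
print(gini_impurity([0, 0, 1, 1]), gini_impurity([1, 1, 1]))
# A split that separates the classes completely has weighted impurity 0.0.
print(split_impurity([0, 0], [1, 1]))
# Three trees voting 1, 0, 1 yield class 1.
print(majority_vote([1, 0, 1]))
```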
Second step: Train NB, RF, CART, and XGBoost models and evaluate their predictive performance. The models were trained using two combinations of predictor and target variables. One combination used 18 variables (V1-V18) as predictors and vascular events (P1) as the target variable; the other used V1-V18 as predictors and bleeding (P2) as the target variable. In training these models, the data of recruited patients were randomly separated into 90% training and 10% testing datasets according to the 10-fold cross-validation (CV) method. Our scheme applied the 10-fold nested CV method to enhance stability when estimating the best performance of each model [36]. This process consisted of a 10-fold inner CV, which tuned and then determined the best hyperparameter set of each method (model selection), and a 10-fold outer CV, which evaluated the predictive performance of the best model of each method (model evaluation).
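The nested CV procedure described above can be sketched in plain Python. This is a minimal illustration rather than the study's implementation: `fit_and_score` is a hypothetical callback that trains a model with the given hyperparameters on the training indices and returns a validation score such as the AUC, and the folds here are formed by simple shuffling without stratification by outcome.

```python
import random


def kfold_indices(indices, k, seed):
    """Shuffle `indices` and yield (train, test) index lists for k folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def nested_cv(n_samples, candidates, fit_and_score,
              k_outer=10, k_inner=10, seed=0):
    """Return one (best_hyperparameter, outer_score) pair per outer fold."""
    results = []
    for outer_train, outer_test in kfold_indices(range(n_samples), k_outer, seed):

        def inner_mean(params):
            # Inner CV: mean validation score on the outer-training data only.
            scores = [fit_and_score(params, tr, va)
                      for tr, va in kfold_indices(outer_train, k_inner, seed + 1)]
            return sum(scores) / len(scores)

        # Model selection: hyperparameter with the best inner-CV score.
        best = max(candidates, key=inner_mean)
        # Model evaluation: score the selected setting on the held-out fold.
        results.append((best, fit_and_score(best, outer_train, outer_test)))
    return results
```

The held-out outer fold is never seen during tuning, so the outer scores estimate the performance of the whole tune-then-fit procedure rather than of one fixed hyperparameter set.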
The performance of these models was evaluated based on the mean and standard deviation of their accuracy, sensitivity, specificity, and area under the receiver operating characteristic (ROC) curve (AUC) [37]. Sensitivity is the proportion of true positive predictions among all patients who had events. Specificity is the proportion of true negative predictions among all patients who did not have events. Accuracy is the proportion of correct predictions (both true positives and true negatives) among the total number of patients examined. The ROC curve is a graphic performance measurement of a classification model at various classification thresholds. AUC is the area under the ROC curve, which provides an aggregate performance measure across all possible classification thresholds. For each method, the hyperparameters with the leading validation performance based on the AUC value were chosen to construct the tuned best NB, RF, CART, and XGBoost models. The best models with AUC values exceeding those of LGR formed the basis of our prediction models for vascular events and bleeding.
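For reference, the four metrics can be computed as follows, using the standard definitions: sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), and accuracy = (TP + TN) / N. The AUC is computed here through its rank interpretation, i.e., the probability that a randomly chosen patient with an event receives a higher score than a randomly chosen patient without one. The data in the example are made up for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fn, fp, tn) for binary labels coded as 1 = event, 0 = no event."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn


def sensitivity(y_true, y_pred):
    tp, fn, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true, y_pred):
    _, _, fp, tn = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def accuracy(y_true, y_pred):
    tp, fn, fp, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fn + fp + tn)


def auc(y_true, scores):
    """AUC as the fraction of (event, non-event) pairs ranked correctly;
    tied scores count as half a correct pair."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative data: predicted probabilities thresholded at 0.5.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
print(sensitivity(y_true, y_pred), specificity(y_true, y_pred),
      accuracy(y_true, y_pred), auc(y_true, scores))
```

Unlike the other three metrics, the AUC does not depend on the chosen threshold, which is one reason it is convenient for comparing models on imbalanced outcomes.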
Third step: Importance ranking of risk factors. The "caret" R package version 6.0-90 [38] was applied for each of the four methods to generate each variable's importance value. Within each model, we ranked the variables by priority, with rank 1 as the most critical factor and rank 18 as the least critical factor. Because of the 10-fold outer CV, each model was fitted 10 times, and we averaged the ranking of each variable over these runs to obtain more reliable results. Individual ML methods may produce different importance rankings owing to their distinct characteristics. Ensemble machine learning based on a combination of multiple models' outputs is widely accepted and has produced good results in recent years [39]. An integrated ML feature selection scheme might combine the predictive power of these methods. We summarized the major important variables from the average ranking of each risk factor based on the identified convincing ML models to enhance stability and integrity.
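The ranking step can be sketched as follows. The study obtained importance values from the "caret" R package; the Python functions below are only an illustration of converting importance values to ranks and averaging the ranks over CV runs. The variable names and importance values are made up, and the sketch uses three variables and two runs instead of 18 variables and 10 runs.

```python
def rank_from_importance(importance):
    """Map each variable to its rank; rank 1 is the largest importance value."""
    ordered = sorted(importance, key=importance.get, reverse=True)
    return {var: i + 1 for i, var in enumerate(ordered)}


def average_rank(rank_dicts):
    """Average each variable's rank over several runs (e.g. outer CV folds)."""
    variables = rank_dicts[0].keys()
    return {v: sum(r[v] for r in rank_dicts) / len(rank_dicts)
            for v in variables}


# Illustrative importance values from two runs of one model.
run1 = {"age": 0.5, "smoking": 0.3, "bmi": 0.2}
run2 = {"age": 0.4, "smoking": 0.1, "bmi": 0.5}
ranks = [rank_from_importance(run1), rank_from_importance(run2)]
print(average_rank(ranks))
```

Working with ranks rather than raw importance values makes results from different methods comparable, since each method reports importance on its own scale.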
According to the individual priority of the variables presented in the predictive models of vascular events and bleeding, we may establish an instruction concept for patients with NVAF taking dabigatran. In the final stage, we summarized our significant findings and discussed them in light of previous concepts.

Results
There were 12,091 patients randomized to take dabigatran in the RE-LY trial. After excluding 289 patients with missing data, 11,802 patients were enrolled in our study. Subjects' demographic data are outlined in Table 2. There were 318 (2.69%) patients with vascular events, and 2238 (18.96%) patients had bleeding within the first year of follow-up when taking dabigatran, while others were event-free. Table 3 shows the hyperparameter values used to train the best NB, RF, CART, and XGBoost models, i.e., those with the leading AUC values. The performances of LGR, NB, RF, CART, and XGBoost methods in predicting vascular events and bleeding are listed in Table 4. The ROC curve of each model is presented in Figure 2. In predicting vascular events, RF (AUC = 0.780) and XGBoost (AUC = 0.717) showed higher AUC values than LGR (AUC = 0.674). In predicting bleeding, RF (0.684) and XGBoost (0.618) showed higher AUC values than LGR (0.605). In contrast, NB and CART showed inferior performance to LGR in predicting vascular events and bleeding. Therefore, we selected RF and XGBoost as the basis of our integrated ML feature selection model.
Table 3. Summary of the values of the hyperparameters which train the best NB, RF, CART, and XGBoost models.

Method | Hyperparameter | Best value | Meaning
CART | minsplit | 20 | The minimum number of observations that must exist in a node for a split to be attempted.
Table 5 shows the overall importance ranking of each risk factor in predicting vascular events based on RF and XGBoost. The average rankings of the two models over the 10-fold cross-validation are reported as "Average Ranking of 10 Times RF" and "Average Ranking of 10 Times XGBoost". The different methods generated individual importance rankings according to their own analytic rules. For a more comprehensive view, we weighted the findings of the two models equally in our integrated ML feature selection scheme. We obtained the "Average ranking of the 2 Models" by simply averaging the average ranking values from the RF and XGBoost models. To clarify the ranking, we then ranked these averaged values starting from 1 and listed the result as the "Final ranking in predicting vascular events". Age; history of CHF, MI, DM, and stroke; smoking; kidney function; BMI; ethnicity; and dabigatran dosage were the major predictor variables of vascular events.
Abbr.: RF, random forest; XGBoost, eXtreme gradient boosting; BMI, body mass index.
Table 6 presents the overall importance ranking of each risk factor in predicting bleeding. By averaging the rank values of the RF and XGBoost methods, we concluded that age, kidney function, smoking, bleeding history, concomitant use of specific drugs, dabigatran dosage, BMI, MI and CHF history, and ethnicity were the major predictor variables of bleeding.
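The integration of the two models used for the final rankings can be sketched as a simple average followed by re-ranking. The numbers below are invented for illustration and are not taken from Table 5 or Table 6; breaking ties alphabetically is an assumption of this sketch, since the text does not state a tie rule.

```python
def final_ranking(avg_rank_rf, avg_rank_xgb):
    """Average the two models' mean ranks with equal weight, then rank
    the variables from 1 (most important) by that averaged value."""
    combined = {v: (avg_rank_rf[v] + avg_rank_xgb[v]) / 2 for v in avg_rank_rf}
    # Sort by averaged rank; ties are broken alphabetically (assumption).
    ordered = sorted(combined, key=lambda v: (combined[v], v))
    return [(v, combined[v], i + 1) for i, v in enumerate(ordered)]


# Hypothetical mean ranks over the 10 outer CV folds for three variables.
rf_ranks = {"age": 1.2, "smoking": 3.0, "bmi": 2.4}
xgb_ranks = {"age": 1.0, "smoking": 2.0, "bmi": 3.6}
for variable, mean_rank, final in final_ranking(rf_ranks, xgb_ranks):
    print(final, variable, round(mean_rank, 2))
```

Equal weighting treats the two models as equally credible; weighting by each model's AUC would be an alternative design, but it is not what the text describes.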

Discussion
To our knowledge, this is the first study attempting to analyze risk factors in patients with NVAF taking dabigatran using integrated ML feature selection methods. RF and XGBoost demonstrated predictive performance exceeding that of LGR. We derived the ranking of essential risk factors in these patients by averaging the results of these two methods. To balance simplicity and practicality against precision, we selected the top nine important variables for discussion, based on physicians' judgment (Table 7). In most predictive models, an age of >65 years is a standard variable that predicts ischemic stroke and bleeding. As expected, age was the leading predictor of vascular events and bleeding in our study.
Smoking induces atherosclerosis and endothelial dysfunction, simultaneously resulting in more ischemic insults and hemorrhage [44][45][46]. Smoking also contributes to an increased probability of developing arrhythmia via several metabolic factors and underlying diseases [47]. Smoking cessation is a well-documented strategy to prevent vascular disease either with or without arrhythmia. In the medical management of patients with atrial fibrillation, however, smoking has received insufficient attention. In a consensus statement, smoking was reported to increase warfarin clearance, influencing the drug effects [48]. No similar concern was raised when anticoagulation shifted to NOACs. However, in our study, smoking appeared to be a more important variable than other common systemic diseases in patients taking dabigatran.
In the CHA2DS2-VASc score, a history of ischemic stroke carries more weight than MI because a prior stroke adds two points to the score. However, in our study MI, rather than stroke, was the more prominent risk factor for vascular events. CHF that might result from ischemic heart disease or hypertension complications also has a significant impact in most evaluation tools [49]. Cardiomegaly caused by these underlying diseases leads to left ventricular hypokinesia, the major cause of thrombus formation [50][51][52]. In addition, endothelial dysfunction and cerebral autoregulation disturbance are consequences of CHF [53]. ML models can comprehensively analyze several variables with different interactions; hence, CHF and MI ranked highest in the prediction of vascular events among all underlying diseases.
Kidney dysfunction has infrequently been mentioned as a major risk factor for ischemic stroke. Severe kidney impairment (estimated creatinine clearance <30 mL/min/1.73 m²) was an exclusion criterion for most NOACs [10][54][55][56]. Delayed drug clearance may increase the possibility of bleeding [57]. In a study on the Danish population, kidney function impairment was found to contribute to a high tendency of developing vascular events and bleeding [58]. In individuals with impaired kidney function, the levels of inflammatory and procoagulant factors including C-reactive protein, fibrinogen, factor VIIc, and factor VIIIc were high [59]. Furthermore, hemostatic dysfunction including decreased glycoprotein IIb and IIIa levels, reduced von Willebrand factor activity, and altered arachidonic acid metabolism was detected in older individuals with renal insufficiency [60]. These physiological alterations may explain the double-sided adverse effects, and kidney dysfunction is also a result of end-organ damage from hypertension and diabetes. Our scheme selected it as a significant representative variable of both vascular events and bleeding.
In general, a high BMI may be associated with metabolic syndrome and hypertension [61]. High BMI increases the prevalence of cerebrovascular and heart diseases [62]. However, this trend is controversial in patients with arrhythmia. Meta-analyses and real-world cohort studies have revealed a lower prevalence of ischemic stroke and bleeding in patients with high BMI [63,64]. The all-cause death rate was higher in underweight patients. BMI was critical for predicting vascular events and bleeding in our study.
Because of a competitive relationship in the CYP3A4 and P-glycoprotein inhibition pathway [65], the recommended dabigatran dose in the European label is 110 mg if a patient is on verapamil. In the United States, the USFDA recommended that clinicians use dabigatran with caution when patients are on long-term use of nonsteroidal anti-inflammatory drugs (NSAIDs), antithrombotic agents, or medicines that may elevate the blood levels and effects of dabigatran, such as dronedarone or ketoconazole [66]. Observational studies conducted in the US and Taiwan have indicated that concomitant use of these drugs enhances bleeding risk in patients taking dabigatran [67,68]. Combining antithrombotic and antiplatelet agents is a well-known therapy, but its use is limited to certain conditions owing to the high probability of bleeding [69]. Our results confirm that these drug interactions have an important effect on bleeding risk.
This study attempted to solve the dilemma of dabigatran dose selection, i.e., how to obtain the maximum preventive benefit while avoiding side effects in patients with various physiological conditions and comorbidities. Dabigatran dose was also defined as a variable in our model. Although it had a noticeable influence on both vascular events and bleeding, it was not a major factor in either result. This issue remains a complex problem that our study could not solve completely, because three of the top five risk factors for vascular events and for bleeding were the same (Table 7). Nevertheless, we could identify essential factors and provide practical suggestions using this model. First, smoking cessation and maintaining an appropriate body weight are vital for patients prescribed dabigatran. Second, CHF and MI imply a high risk of thrombotic events during secondary prevention with dabigatran, so intensive medical control and a standard dabigatran dose are essential. In contrast, a previous bleeding history and concomitant use of antithrombotic agents, NSAIDs, and medicines with effects on CYP3A4 increase the risk of bleeding and should be weighed when a low dabigatran dose is considered. However, older age and kidney function impairment have double-sided adverse effects, causing more vascular events and more bleeding simultaneously. Other methods are needed to determine the cutoff of each factor, if one exists.

Limitations
Our study has several limitations. First, our findings must be applied with caution in clinical practice considering the inclusion and exclusion criteria of the RE-LY trial. The trial population comprised subjects with relatively low CHA2DS2-VASc scores (2.1 ± 1.1), and patients with certain comorbidities were excluded. Second, the trial participants were regularly followed up for two years with good compliance, which might be infrequent among our outpatients. Specific effects of other systemic diseases might therefore be alleviated. Third, we intended to establish a prediction model for patients with NVAF taking dabigatran; vascular events including stroke, MI, systemic embolism, and vascular death were defined as the primary outcome. Because we studied only one NOAC rather than an antiplatelet agent or combined therapy, and an antiplatelet agent could prevent atherosclerosis, these outcomes might be affected by risk factors including dyslipidemia, lifestyle, and genetics, which were not included in our study. Nevertheless, our study design was suitable for clinical practice when considering the secondary prevention of cardioembolic stroke.

Conclusions
NOACs could replace warfarin owing to their similar protective effects and better safety profile in real-world studies. Appropriate dose selection and intensive risk factor control are necessary to achieve high-quality secondary prevention. In our research, RF and XGBoost generated higher accuracies and AUC values than LGR in predicting both vascular events and bleeding, even with the disproportionate prevalence of the two outcomes. Furthermore, these methods remained relatively stable between their sensitivities and specificities in the imbalanced data. This integrated ML feature selection scheme shows promise for analyzing complex medical data. Although further evaluation is indicated, our study might provide a preliminary direction of precision medicine for secondary prevention in patients with arrhythmia.

Informed Consent Statement:
The requirement for informed consent was waived, since the data contain only de-identified information.
Data Availability Statement: The data are available through application to Boehringer-Ingelheim on the research data sharing platform (https://vivli.org/; accessed on 28 July 2021). Restrictions apply to the availability of these data, which were used under license for this study.