A Comparison of the Impact of Pharmacological Treatments on Cardioversion, Rate Control, and Mortality in Data-Driven Atrial Fibrillation Phenotypes in Critical Care

Critical care physicians are commonly faced with patients exhibiting atrial fibrillation (AF), a cardiac arrhythmia with multifaceted origins. Recent investigations shed light on the heterogeneity among AF patients by uncovering unique AF phenotypes, characterized by differing treatment strategies and clinical outcomes. In this retrospective study encompassing 9401 AF patients in an intensive care cohort, we sought to identify differences in average treatment effects (ATEs) across different patient groups. We extract data from the MIMIC-III database, use hierarchical agglomerative clustering to identify patients’ phenotypes, and assign them to treatment groups based on their initial drug administration during AF episodes. The treatment options examined included beta blockers (BBs), potassium channel blockers (PCBs), calcium channel blockers (CCBs), and magnesium sulfate (MgS). Utilizing multiple imputation and inverse probability of treatment weighting, we estimate ATEs related to rhythm control, rate control, and mortality, approximated as hourly and daily rates (%/h, %/d). Our analysis unveiled four distinctive AF phenotypes: (1) postoperative hypertensive, (2) non-cardiovascular multimorbid, (3) cardiovascular multimorbid, and (4) valvulopathy atrial dilation. PCBs showed the highest cardioversion rates across phenotypes, ranging from 11.6%/h [9.35–13.3] to 7.69%/h [5.80–9.22]. While CCBs demonstrated the highest effectiveness in controlling ventricular rates within the overall patient cohort, PCBs and MgS outperformed them in specific phenotypes. PCBs exhibited the most favorable mortality outcomes overall, except for the non-cardiovascular multimorbid cluster, where BBs displayed a lower mortality rate of 1.33%/d [1.04–1.93] compared to PCBs’ 1.68%/d [1.10–2.24].
The results of this study underscore the significant diversity in ATEs among individuals with AF and suggest that phenotype-based classification could be a valuable tool for physicians, providing personalized insights to inform clinical decision making.


Introduction
Atrial fibrillation (AF) is the most prevalent cardiac arrhythmia, affecting more than 33 million patients worldwide [1]. It is commonly encountered in critically ill patients, with incidences ranging from 4.5% to 15% in intensive care units (ICUs) [2], where it is associated with higher healthcare costs, prolonged hospitalization duration, increased risk of thromboembolism, and increased mortality [3,4].
AF is a heterogeneous disease with diverse causes and mechanisms. It may be driven by cardiac and non-cardiac comorbidities, such as pulmonary, metabolic, and endocrine disorders, genetic factors, or inflammatory states [5,6]. The abundance of pathophysiological processes driving AF has led to the realization that AF is a complex arrhythmia with significant inter-patient heterogeneity [7]. To address this heterogeneity, data-driven methods such as cluster analysis have been applied to AF cohorts, identifying clinically relevant AF phenotypes with different treatment patterns and outcomes.
Hemodynamic compromise resulting from AF often makes an urgent conversion to sinus rhythm necessary in critically ill patients. In hemodynamically stable patients, AF is often observed until it terminates spontaneously, or the ventricular rate is controlled to avoid potential adverse effects associated with cardioverting antiarrhythmic drugs. Such adverse effects include thyroid disorders, hypotension, pulmonary fibrosis, and proarrhythmic effects, with some antiarrhythmics being associated with higher mortality than others [8].
Within the ICU, the management of AF is mostly limited to pharmacological treatments [5,9]. Beta blockers, magnesium sulphate, and calcium channel blockers are primarily aimed at reducing the ventricular rate by reducing atrioventricular node conduction. Potassium channel blockers are used to restore and maintain sinus rhythm by prolonging atrial refractory periods, thereby preventing re-entrant activity. For a comprehensive overview of the mechanisms of AF, and the molecular mechanisms of antiarrhythmic drugs, we refer the reader to [10].
The wide spectrum of treatment options coupled with a heterogeneous patient population makes treatment selection a complex endeavor. As a result, strong evidence for the optimal treatment strategy is missing [9], and AF treatment in ICUs varies across clinical institutions. Nonetheless, treatment strategies with antiarrhythmic drugs have been shown to impact patient outcomes in the short as well as the long term [11,12]. A recent multicenter survey on treatment preferences among physicians revealed a lack of consensus on whether to choose a rate control or a rhythm control strategy, a lack of consensus in the choice of antiarrhythmic agent, and a disregard for patients' underlying pathophysiological presentation in treatment selection in 75% of respondents [13]. Even though some tendencies for treatment selection exist, they are often derived from outpatient guidelines, and are not directly applicable to ICU populations due to different AF mechanisms, risks, and effectiveness of treatments [14,15].
Previous studies have employed cluster analysis to identify and characterize different AF phenotypes in community cohorts. The first such application was performed by Inohara et al. [16], who identified four recognizable phenotypes based on 60 clinical variables. The authors observed significant differences in the use of pharmacological treatments, and rates of major adverse cardiovascular or neurological events, new-onset heart failure, hospitalization, major bleeding, and mortality. Further studies [17-21] incorporated different clinical variables and identified varying numbers of clusters, reporting inter-cluster differences in clinical outcomes. A recurring conclusion of previous works was that cluster analysis was able to identify clinically meaningful phenotypes, which may potentially guide treatment decisions and improve patient outcomes.
We further explore the applicability of cluster analysis and phenotype classification in AF management by assessing its ability to identify clusters with varying average treatment effects (ATEs). Hierarchical agglomerative clustering is employed to identify distinct AF phenotypes in an intensive care cohort. Phenotypes' properties are described, and the efficacy of pharmacological interventions is evaluated, demonstrating differences in ATEs on rhythm control, rate control, as well as in-hospital mortality.

Data
This study performs a retrospective analysis of a large single-center intensive care database, the Medical Information Mart for Intensive Care (MIMIC-III) [22]. The MIMIC-III database contains electronic health records from 55,423 distinct ICU admissions of 46,520 patients in the critical care units of the Beth Israel Deaconess Medical Center in the years from 2001 to 2012. The data include vital signs, medications, laboratory measurements, periodically charted observations, medical procedures, diagnoses, and free-text clinical notes.

Cohort Definition
We include patients at least 18 years of age with a diagnostic code indicating AF (ICD-9-CM: 427.31). For patients who exhibited more than one ICU admission, only the first admission with an AF diagnosis is considered. Patients with an ICU stay shorter than 24 h or age below 18 years are excluded from the analysis.
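The inclusion and exclusion rules above can be sketched as a small filter. This is a minimal illustration, not the study's extraction code: the `IcuStay` record, its field names, and the dot-free code string `42731` are assumptions, and the sketch assumes that the first AF admission is selected before the exclusion criteria are applied.

```python
from dataclasses import dataclass

AF_ICD9 = "42731"  # ICD-9-CM 427.31 written without the dot (assumed storage format)


@dataclass(frozen=True)
class IcuStay:
    """Hypothetical, simplified record of one ICU admission."""
    patient_id: int
    admit_hour: float      # admission time in hours since an arbitrary origin
    length_hours: float    # ICU length of stay
    age_years: float
    icd9_codes: frozenset  # diagnostic codes attached to the admission


def select_cohort(stays):
    """Keep each patient's first admission with an AF code, then apply exclusions."""
    first_af_stay = {}
    for stay in sorted(stays, key=lambda s: s.admit_hour):
        if AF_ICD9 in stay.icd9_codes and stay.patient_id not in first_af_stay:
            first_af_stay[stay.patient_id] = stay
    return [
        stay for stay in first_af_stay.values()
        if stay.age_years >= 18 and stay.length_hours >= 24
    ]
```

If the exclusions were instead applied before picking the first admission, a patient whose first AF stay was shorter than 24 h could still enter the cohort through a later stay; the text does not say which order was used.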

Variables
Patients are described using 34 clinical variables which include comorbidities, laboratory measurements, observations, and medical history. The clinical variables were gathered based on a systematic literature review, in which clinical variables predictive of patient outcomes were identified. The employed database was screened for the presence of the variables identified in the systematic review, resulting in the 34 clinical variables presented in Table A1. Within the scope of this study, we use the earliest available record for each variable, if more than one value is available. Continuous variables are transformed into z-scores for analysis.

Outcomes
The primary outcomes are (i) conversion to sinus rhythm, and (ii) achievement of rate control, defined as a heart rate < 100 beats per minute [23]. In MIMIC-III patients, heart rates and heart rhythms were recorded by nurses at regular intervals, and have previously been shown to be accurate and precise to within 1 h [24]. It is assumed that a registered rhythm is maintained until a different rhythm is recorded. The secondary outcome is in-hospital mortality. The primary outcomes are censored at 24 h, and the secondary outcome is censored at 30 days [11]. For all outcomes, we consider the time from the first treatment administration until the corresponding outcome is observed.
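The outcome definitions can be illustrated with a short sketch that carries charted values forward and derives a censored time-to-event pair. The function names, the `(time, value)` tuple layout, and the `followup_end` argument are assumptions made for illustration; the study's own extraction code is not shown in the text.

```python
def state_at(observations, t):
    """Last charted value at or before time t (carry-forward), or None if none exists.

    observations: list of (time_h, value) tuples sorted by time.
    """
    current = None
    for obs_time, value in observations:
        if obs_time > t:
            break
        current = value
    return current


def time_to_event(observations, treatment_time, is_event, followup_end, horizon=24.0):
    """Return (duration_h, event_observed) measured from the first treatment.

    The outcome is censored at `horizon` hours, or earlier if follow-up ends first.
    """
    limit = min(horizon, followup_end - treatment_time)
    for obs_time, value in observations:
        elapsed = obs_time - treatment_time
        if elapsed < 0:
            continue  # charted before the treatment was given
        if elapsed > limit:
            break
        if is_event(value):
            return elapsed, True
    return limit, False
```

For example, conversion to sinus rhythm would use a predicate on the charted rhythm label, rate control a predicate such as `lambda hr: hr < 100`, and in-hospital mortality a horizon of `30 * 24` hours.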

Treatment Groups
The MIMIC-III database is transformed into the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) [25] using the code provided by [26] to identify patient exposures to different pharmaceutical agents. Treatments are captured based on the ingredients in administered drugs, which are accompanied by the timestamp indicating when the treatment was initiated. Within the scope of this study, four treatment groups are defined: beta blockers (BBs), potassium channel blockers (PCBs), and calcium channel blockers (CCBs), corresponding to classes 2-4 of the Vaughan Williams classification [27], as well as magnesium sulphate (MgS). The treatment groups with the corresponding drug ingredients are shown in Table 1. For the outcomes conversion to sinus rhythm and in-hospital mortality, treatment group assignment is determined based on the first observed drug exposure during an AF episode, while for rate control, it is determined based on the first observed drug exposure during an AF episode with a rapid ventricular response (>100 beats per minute).
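The assignment rule can be sketched as follows. The ingredient map is an illustrative subset chosen for this example and does not reproduce Table 1; the function name and the `(time, ingredient)` layout are likewise assumptions.

```python
# Illustrative subset only; the study's full ingredient list is given in its Table 1.
INGREDIENT_TO_GROUP = {
    "metoprolol": "BB",
    "amiodarone": "PCB",
    "diltiazem": "CCB",
    "magnesium sulfate": "MgS",
}


def assign_treatment_group(exposures, episode_start, episode_end):
    """Return (time_h, group) of the first study drug given during the episode.

    exposures: iterable of (start_time_h, ingredient_name) tuples.
    Returns None when no study drug was started within the episode.
    """
    candidates = [
        (start, INGREDIENT_TO_GROUP[name.lower()])
        for start, name in exposures
        if episode_start <= start <= episode_end and name.lower() in INGREDIENT_TO_GROUP
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda item: item[0])
```

For the rate control outcome, the same function would be called with the bounds of an AF episode with a rapid ventricular response rather than the bounds of any AF episode.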

Multiple Imputation
We handled missing data through multiple imputation [28]. To achieve this, linear regression models were utilized to generate imputed datasets. This process involved resampling the original dataset with replacement, leading to the creation of 60 bootstrapped datasets. To fit linear regression models to these bootstrapped datasets, we employed chained equations [29], a technique that effectively accounts for the interrelationships among descriptive variables. Subsequently, the original dataset underwent repeated imputation using these models, resulting in a total of 60 imputed datasets. With a fraction of missing information amounting to 7.42%, the number of imputed datasets adheres to Bodner's rule [30].
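The bootstrap-then-impute idea can be illustrated in a deliberately reduced form. The sketch below handles a single incomplete variable predicted from a single complete one, so it omits the iteration over variables that chained equations perform and adds no residual noise to the imputed values. All names are hypothetical, and this is not the authors' implementation.

```python
import random


def fit_line(pairs):
    """Ordinary least squares fit of y on x; returns (slope, intercept)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bootstrap_impute(rows, n_imputations=60, seed=0):
    """rows: list of (x, y) where y may be None. Returns a list of imputed datasets."""
    if len({x for x, y in rows if y is not None}) < 2:
        raise ValueError("need at least two complete rows with distinct x values")
    rng = random.Random(seed)
    datasets = []
    for _ in range(n_imputations):
        # Resample the original rows with replacement; redraw in the rare case
        # that the bootstrap sample cannot identify a regression line.
        while True:
            sample = [row for row in (rng.choice(rows) for _ in rows) if row[1] is not None]
            if len({x for x, _ in sample}) >= 2:
                break
        slope, intercept = fit_line(sample)
        # Impute the ORIGINAL dataset using the model fitted on the bootstrap sample.
        datasets.append(
            [(x, y if y is not None else intercept + slope * x) for x, y in rows]
        )
    return datasets
```

Because every imputed dataset comes from a model fitted to a different resample, the spread of the imputed values across datasets reflects uncertainty in the regression coefficients.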

Inverse Probability of Treatment Weighting
Given the nature of our retrospective cohort analysis, it is important to consider that patients may have received treatments in a non-randomized manner, based on their specific pathophysiological presentations. Failing to account for this selection bias during the evaluation of treatment effects could lead to biased effect estimates [31].
To address the influence of confounding variables, we utilize a statistical method called inverse probability of treatment weighting (IPTW), implemented through the Twang toolkit [32]. Within each imputed dataset, we compute the likelihood of patients being allocated to their specific treatment categories based on their descriptive attributes, employing gradient boosted logistic regression models. The inverse of this likelihood score for each patient serves as a weighting factor in subsequent analyses to mitigate the impact of confounding variables.
We evaluate the balance of covariates across treatment groups by computing the maximum absolute pairwise standardized mean differences. This measure allows us to assess the degree of covariate imbalance between the treatment groups, with smaller differences indicating improved balance.
The hyperparameters employed for the IPTW method were determined empirically, taking into consideration the total computation time and obtained covariate balance.
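The weighting and the balance check can be sketched in a few lines. This illustration takes the propensity estimates as given instead of fitting gradient boosted models, and it standardizes mean differences by the unweighted standard deviation of the covariate in the whole sample, which is an assumption of this sketch, not a documented choice of the study.

```python
from itertools import combinations
from statistics import pstdev


def iptw_weights(treatments, propensities):
    """Weight each patient by 1 / P(received treatment | covariates).

    propensities[i] maps every treatment group to its estimated probability
    for patient i (assumed to come from a previously fitted model).
    """
    return [1.0 / propensities[i][t] for i, t in enumerate(treatments)]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def max_abs_smd(covariate, treatments, weights):
    """Maximum absolute pairwise standardized mean difference of one covariate."""
    scale = pstdev(covariate)  # unweighted SD of the full sample (assumption)
    means = {}
    for group in set(treatments):
        idx = [i for i, t in enumerate(treatments) if t == group]
        means[group] = weighted_mean(
            [covariate[i] for i in idx], [weights[i] for i in idx]
        )
    return max(
        abs(means[a] - means[b]) / scale for a, b in combinations(sorted(means), 2)
    )
```

Calling `max_abs_smd` once with unit weights and once with the IPTW weights shows how much imbalance the weighting removes for a given covariate.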

Cluster Analysis
Patient phenotypes are identified using hierarchical agglomerative clustering. We use a complete linkage criterion for agglomeration, and Gower's distance metric [33] to account for the combination of continuous and categorical covariates. Pairwise distances are computed for each imputed dataset, and averaged to a single distance matrix, which is used in the clustering algorithm. The number of clusters is chosen manually as a compromise between resolution and clinical explainability, such that recognizable patient groups are apparent, blinded to treatments and outcomes. Patient characteristics are compared among the different clusters, and statistical differences are assessed using the Kruskal-Wallis test. Patient covariates are described in terms of their medians and interquartile ranges (IQRs) for continuous covariates, while categorical covariates are reported as counts and percentages. Each cluster is described in terms of its most prominent properties to provide an intuitive characterization.
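The clustering pipeline described above (Gower distances per imputed dataset, averaging, complete-linkage agglomeration) can be sketched on toy data as follows. This is a naive, unoptimized illustration with hypothetical function names; at the scale of the study cohort a library implementation would be needed.

```python
def gower_distance(a, b, ranges):
    """Gower distance between two records.

    ranges[k] is the observed range of numeric feature k, or None when
    feature k is categorical (simple mismatch is used in that case).
    """
    total = 0.0
    for x, y, feature_range in zip(a, b, ranges):
        if feature_range is None:
            total += 0.0 if x == y else 1.0
        elif feature_range > 0:
            total += abs(x - y) / feature_range
    return total / len(ranges)


def distance_matrix(records, ranges):
    return [[gower_distance(a, b, ranges) for b in records] for a in records]


def average_matrices(matrices):
    """Element-wise mean of the distance matrices of all imputed datasets."""
    n = len(matrices[0])
    return [
        [sum(m[i][j] for m in matrices) / len(matrices) for j in range(n)]
        for i in range(n)
    ]


def complete_linkage(dist, n_clusters):
    """Merge clusters until n_clusters remain, using the maximum pairwise distance."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(dist[p][q] for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # j > i, so index i is unaffected
    return clusters
```

Averaging the distance matrices before clustering yields a single partition for all imputed datasets, so cluster labels do not have to be reconciled across imputations.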

Statistical Analysis
The key component of the statistical analysis is the estimation of ATEs. Different approaches may be taken to analyze and present such data, such as univariate and multivariate Cox analyses and multiparametric and exponential survival models [34]. To provide the highest degree of interpretability, we approximate ATEs using weighted exponential survival models with weights obtained from IPTW.
The survival models are fitted to each imputed dataset 100 times, utilizing Bayesian bootstrapping as proposed by Rubin [35]. This process results in a total of 6000 event rate estimates for each ATE. The estimates of constant event rates are subsequently presented as probability distributions. We report both the mode and the 95% highest density intervals of these distributions.
ATEs are computed for the complete cohort and for individual clusters. This allows us to examine the effects of the treatment both overall and within specific clusters. We assess differences in ATEs using Bayes factors (BFs) [36] to provide a quantifiable uncertainty estimate in the effects of different treatments.
Our results are reported as hourly rates (%/h) for the primary outcomes, and as daily rates (%/d) for the secondary outcome. This allows for a clear and direct comparison of the effects of different treatments over time. All statistical analyses were performed using Python 3.7 and R Core v4.1.2. A secondary analysis using the KMeans algorithm can be found in Section S1 of the Supplementary Material.
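A constant-rate (exponential) survival model with right censoring has a closed-form weighted maximum likelihood estimate: the weighted number of events divided by the weighted time at risk. The sketch below combines this estimate with Bayesian bootstrap weights and a highest density interval. It is an illustration under these assumptions, not the authors' code; estimating the mode would additionally require a density estimate and is omitted.

```python
import math
import random


def exponential_rate(durations, events, weights):
    """Weighted MLE of a constant event rate under right censoring."""
    weighted_events = sum(w * e for w, e in zip(weights, events))
    weighted_time = sum(w * t for w, t in zip(weights, durations))
    return weighted_events / weighted_time


def bayesian_bootstrap_rates(durations, events, iptw, n_draws=100, seed=0):
    """Refit the model n_draws times with Dirichlet(1, ..., 1) observation weights.

    Exp(1) draws are proportional to Dirichlet weights; normalizing them is
    unnecessary because the rate estimate is invariant to rescaling the weights.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(n_draws):
        draws = [rng.expovariate(1.0) for _ in durations]
        weights = [d * w for d, w in zip(draws, iptw)]
        rates.append(exponential_rate(durations, events, weights))
    return rates


def highest_density_interval(samples, mass=0.95):
    """Shortest interval that contains the requested share of the samples."""
    ordered = sorted(samples)
    n_inside = max(1, math.ceil(mass * len(ordered)))
    start = min(
        range(len(ordered) - n_inside + 1),
        key=lambda i: ordered[i + n_inside - 1] - ordered[i],
    )
    return ordered[start], ordered[start + n_inside - 1]
```

With durations in hours, multiplying a rate by 100 expresses it in %/h. Pooling 100 draws from each of the 60 imputed datasets gives the 6000 estimates per treatment effect mentioned above.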

Results
Of the 46,520 patients in the database, 10,277 have a diagnostic code indicating AF. After excluding patients with age below 18 years and a hospital stay shorter than 24 h, a total of 9401 patients were included in the analysis. The cohort characteristics are shown in Table 2. The percentages of missing values for each characteristic are shown in Table A1 in Appendix A. Patients were followed for an average of 11.3 days (IQR, 5.29-13.9), and were either discharged after 5.28 days (IQR, 10.2-12.0), or expired after 5.35 days (IQR, 12.4-15.9). The share of in-hospital mortality was 49.7%.

Patient Clusters
A total of four clusters were identified based on the hierarchical clustering dendrogram presented in Figure 1. The clinical variables of the identified phenotypes are shown in Table A2 in Appendix A. The dominant characteristics of the identified patient clusters are as follows.

Cluster 4: Valvulopathy Atrial Dilation (n = 2335)
This cluster is characterized by the highest rate of valvulopathies (45.2%) and a high rate of left and right atrial dilation (91.0%, 76.0%). Patients in this cluster further have the highest rates of cor pulmonale (14.3%) and a high rate of COPD (16.0%).

Treatment Effects
IPTW resulted in well-matched covariates between treatment groups within all outcomes. Variable means and maximum absolute pairwise standardized mean differences for treatment groups are shown in Supplementary Tables S2 and S3. The complete list of treatment effects is available in Supplementary Tables S4-S6.
The mortality rates varied across the identified clusters. While PCBs were associated with the lowest mortality rates in the cohort analysis, this did not hold true within the postoperative hypertensive cluster, where mortality rates with PCBs were comparable to CCBs (0.52%/d [0.28-0.76] vs. 0.44%/d [0.05-1.15], BF = 1.02). In the non-cardiovascular multimorbid cluster, PCBs were associated with a higher mortality rate than BBs (1.68%/d [1.10-2.24] vs. 1.33%/d [1.04-1.93]).
The results of the secondary analysis can be found in the Supplementary Material, and include the cluster characteristics of the secondary analysis (Table S1) and probability distributions of adjusted treatment effects for the obtained clusters (Figures S1-S3).

Discussion
This study investigated clustering-derived AF phenotypes and their treatment effects in an ICU cohort. The main findings of the study are as follows. (i) The heterogeneity in the AF population, previously reported in community cohorts, can also be observed in the analyzed ICU population. (ii) ATEs differ between phenotypes and are often different from those observed when the treatment effect is averaged across the entire population.
In accordance with previous cluster analyses of AF populations, the analysis of the ICU cohort revealed recognizable patient groups, for example, the postoperative hypertensive cluster, which was characterized by young age and predominantly male patients. With postoperative conditions being known for triggering AF by causing inflammatory states, patients in this cluster had the lowest rates of arrhythmia history, and, thus, the highest proportion of new-onset AF.
Previous studies have utilized cluster analysis to explore the differences in treatment patterns and clinical outcomes in AF cohorts. Our work expands on this research by demonstrating that AF phenotypes derived from cluster analysis also exhibit heterogeneity in terms of ATEs. This finding is consistent with previous studies that have criticized the reporting of overall mean effects in clinical research, as such an approach may overlook important treatment effects that are specific to certain patient subgroups [37]. The observed heterogeneity in ATEs in this study carries major significance for clinical research. ATEs are commonly derived from the analysis of entire cohorts, so heterogeneous treatment effects go unnoticed. For example, a previous study reported benefits in mortality of BBs versus CCBs [38], which we can confirm when analyzing the ATE for the complete cohort, but we find evidence of the opposite effect within two of the identified clusters. Similarly, we can confirm previous results showing that CCBs outperform PCBs in controlling ventricular rate [39] when considering the ATE on the entire cohort. Evaluating individual phenotypes, however, reveals evidence for the opposite effect for patients with valvulopathies and dilated atria. The identification of AF phenotypes may provide insight for further study design, and provide a method to evaluate heterogeneous cohorts such that heterogeneous treatment effects are not overlooked.
While heterogeneities in ATEs were observed among the identified phenotypes, several relationships did not appear as would be expected. Several contraindications of the investigated treatments are known, such as CCBs being contraindicated in patients with systolic heart failure, and amiodarone (PCB) in patients with thyroid disorders. Ideally, one would expect clusters characterized by such contraindications to emerge, and the expected ATEs to be reflected in the results. To what extent such clusters may emerge when a more fine-grained clustering is performed must be evaluated. Given that our analysis does not reveal these existing treatment effects, it must be understood that we do not propose a formal classification.
Nonetheless, the presented results may have major clinical implications. A recent survey has shown that 75% of clinical decision-makers treating AF in the ICU would not change their intervention strategy depending on an underlying pathophysiological condition [13]. Our results underline the necessity of considering the pathophysiological presentation of patients during treatment selection, to maximize treatment utility and to minimize risk. Overlooking the heterogeneity in AF patients may result in inadequate treatment and lead to suboptimal patient outcomes.
Numerous studies [16,18-21] have suggested that a phenotype-driven approach for the management of AF may improve patient outcomes by either providing insights that drive further research, or by guiding treatment directly. However, some studies have reported conflicting results, such as phenotypes with high rates of anticoagulation being associated with increased incidence of ischemic events [16,18]. We have therefore provided ATEs in this work to further expand the idea of a phenotype classification for AF management. Such an approach can help quantify treatment effects for specific patient phenotypes, which can aid in selecting appropriate treatments to maximize desired effects while minimizing the probability of adverse outcomes. However, in order to implement such a quantification approach, additional studies are needed to validate our findings in prospective multi-center cohorts and determine the extent to which the results can be generalized beyond the specific cohort used in our study.

Limitations
The results of this study should be interpreted within the context of several limitations. First, the database used does not provide adequate temporal resolution for procedural and diagnostic codes, which are generated when a patient is discharged. It must therefore be understood that the obtained results inevitably suffer from look-ahead bias. Second, the selection of descriptive variables has a profound impact on the results of cluster analysis. While care was taken to select relevant variables, candidate variables had to be removed due to data sparsity or were completely unavailable in the database. The inclusion of further variables may impact the results and reduce residual confounding. Third, the present study defined treatment groups according to the first treatment received during an AF episode. In practice, treatments may be administered in combinations, or incrementally escalated until the desired effect is achieved. Further, we have only considered a limited number of treatments due to relatively infrequent use of alternative antiarrhythmic drugs. A more extensive dataset may provide insights into the treatment effects of further antiarrhythmic agents. Fourth, while the employed unsupervised clustering algorithm has previously been shown to identify clinically relevant patient phenotypes, our approach of deciding the number of clusters was primarily based on investigator discretion. A common practice in unsupervised clustering would be the identification of the optimal number of clusters using a clustering metric, such as the silhouette score or the Calinski-Harabasz index. Such approaches have, however, not shown any usable results in our dataset, indicating a lack of structure in the covariate space. Other studies [40,41] have proposed semi-supervised cluster analysis to determine the appropriate number of clusters. Such methods form clusters that correlate with outcomes, identifying patient phenotypes with increased clinical significance. Finally, while the obtained results show significant differences in treatment effects, the obtained distributions were often too wide to give conclusive results. The use of a larger dataset may allow for more discriminatory results.

Conclusions
Cluster analysis of the employed ICU cohort identified four recognizable AF phenotypes defined by unique characteristics. Phenotypes showed different treatment effects, highlighting the heterogeneity of AF patients in critical care settings. Our results support the idea of a phenotype classification approach to support clinical decision making by quantifying treatment effects for individual patient phenotypes, and provide a basis for the design of further studies. Future work should consider the application of semi-supervised clustering methods to emphasize cohesive treatment effects in the formation of clusters, thereby maximizing their clinical significance.

Figure 1. Hierarchical clustering dendrogram showing the four identified clusters. Green: Postoperative Hypertensive, red: Non-Cardiovascular Multimorbid, purple: Cardiovascular Multimorbid, yellow: Valvulopathy Atrial Dilation. The dashed horizontal line represents the stopping criterion used for the final cluster definition.

Figure 2. Probability distributions of adjusted hourly rhythm control rates. Black dots represent the distribution modes. Black lines represent the 95% highest density intervals. BBs, beta blockers; PCBs, potassium channel blockers; CCBs, calcium channel blockers; MgS, magnesium sulphate.

Figure 3. Probability distributions of adjusted hourly rate control rates. Black dots represent the distribution modes. Black lines represent the 95% highest density intervals. BBs, beta blockers; PCBs, potassium channel blockers; CCBs, calcium channel blockers; MgS, magnesium sulphate.

Figure 4. Probability distributions of adjusted daily mortality rates. Black dots represent the distribution modes. Black lines represent the 95% highest density intervals. BBs, beta blockers; PCBs, potassium channel blockers; CCBs, calcium channel blockers; MgS, magnesium sulphate.

Table 1. Treatment groups and their corresponding drug ingredients.

Table A1. Characteristics of the cohort, and the identified clusters.