Machine Learning Approach to Understand Worsening Renal Function in Acute Heart Failure

Acute heart failure (AHF) is a common and severe condition with a poor prognosis. Its course is often complicated by worsening renal function (WRF), which further worsens the outcome. The population of AHF patients experiencing WRF is heterogeneous, and novel methods for its analysis have recently emerged. Clustering is a machine learning (ML) technique that divides a population into distinct subgroups based on the similarity of cases (patients). We therefore used clustering to find subgroups within the AHF population that differ in terms of WRF occurrence. We evaluated data from 312 AHF patients hospitalized in our institution who had creatinine assessed four times during hospitalization. Eighty-six variables evaluated at admission were included in the analysis. The k-medoids algorithm was used for clustering, and the quality of the procedure was judged by the Davies–Bouldin index. Three clinically and prognostically different clusters were distinguished. The groups had significantly (p = 0.004) different incidences of WRF. Within the AHF population, we thus identified three groups that differed in renal prognosis. Our results provide novel insight into the AHF and WRF interplay and can be valuable for future trial construction and more tailored treatment.


Introduction
Acute heart failure (AHF) remains a significant problem with high mortality and a massive financial burden for healthcare providers [1,2]. AHF is a multidimensional state with a complex interplay between the cardiovascular and other systems, including the renal system. The pathological condition of simultaneous dysfunction of the kidneys and heart, in which the disorder of one organ induces damage to the other, is called cardiorenal syndrome [3]. One of the clinical manifestations of cardiorenal syndrome is worsening renal function (WRF), which can be defined as, e.g., an increase in serum creatinine and/or a decrease in urine output in a specified period [4]. WRF is a frequent complication of AHF, especially in intensive cardiac care units [5], and is associated with prolonged hospitalization and diminished survival [4]. The population of AHF patients at risk of WRF is heterogeneous, and so is the postulated impact of WRF on prognosis. Different authors have reported conflicting evidence that WRF has a negative, neutral, or even positive effect [4,6,7]. Considering this uncertainty, we presumed that the existing classifications describing the risk of WRF are not well established and do not reflect significant clinical differences between AHF patients. Thus, we decided to analyse the heterogeneity of the AHF population using novel methods of data analysis, aiming to describe different risk groups of WRF and, further, its impact on prognosis. Importantly, we have only included variables that are standard-of-care parameters routinely assessed during AHF patient monitoring.
Data science algorithms, especially Machine Learning (ML), enable novel, clinically important insight into existing data and distinguish previously unrecognized patterns [8].
Clustering is an unsupervised ML technique that organizes a set of data into internally similar subgroups. We presumed that this technique, which has been successfully leveraged in marketing [9], could also prove its value in cardiovascular research. We therefore decided to apply clustering to the AHF population to better understand the occurrence and significance of WRF.

Study Population
We retrospectively analysed 312 acute heart failure (AHF) patients from two registries conducted in our institution in 2010-2012 and 2016-2017. Our previous papers described the eligibility criteria in both registries [10]. The diagnosis of heart failure was established by the responsible physician according to the ESC guidelines current at the time [11,12]. To ensure a complete creatinine course in every patient and avoid missing values in the analysis, we only included patients who had serum creatinine assessed at four points, i.e., at admission, after 24 and 48 h of hospitalization, and at discharge.

Worsening of the Renal Function Evaluation
As there was a significant lack of data about diuresis and GFR or the parameters indispensable for its calculation, we based the diagnosis of worsening renal function (WRF) and acute kidney injury (AKI) on creatinine assessment only. AKI was defined according to the KDIGO guidelines as a ≥0.3 mg/dL increase in serum creatinine within 48 h [13]. WRF was defined as a ≥0.3 mg/dL increase in serum creatinine at any point during hospitalization. We decided to analyse both of these phenomena in order to capture as many renal endpoints as possible. Throughout the paper we use the term WRF, as it is the broader category.
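The two creatinine-based definitions above can be sketched as a small function. This is an illustrative sketch rather than the procedure actually used in the study: the function name `classify_renal_events` is hypothetical, and it assumes that every rise is measured against the admission value, which the text does not specify.

```python
THRESHOLD_MG_DL = 0.3  # absolute creatinine rise used for both AKI and WRF


def classify_renal_events(admission, h24, h48, discharge):
    """Flag AKI and WRF from four serum creatinine values (mg/dL).

    Assumption (not stated in the text): rises are measured against
    the admission value.
    """
    # Round to 2 decimals so that binary floating-point noise
    # (e.g. 1.2 - 0.9 = 0.2999...) does not hide a rise of exactly 0.3.
    rise_48h = round(max(h24, h48) - admission, 2)
    rise_any = round(max(h24, h48, discharge) - admission, 2)
    return {
        "aki": rise_48h >= THRESHOLD_MG_DL,  # rise within the first 48 h
        "wrf": rise_any >= THRESHOLD_MG_DL,  # rise at any measured point
    }


# Hypothetical patient: creatinine rises only by the time of discharge.
late_rise = classify_renal_events(1.0, 1.1, 1.2, 1.4)
```

Under these assumptions every AKI case is also a WRF case, which matches the description of WRF as the broader category.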

Clustering and Data Analysis
Variables included in the analysis are shown in Table 1. Initially, we chose 86 variables regarding the patient's clinical status, i.e., HF subtype, aetiology, comorbidities, symptomatology, and biochemical presentation. All parameters were assessed at patient admission to the hospital. Variables were manually screened to eliminate potential errors, e.g., anomalies and single values out of range. The dataset was imported into RapidMiner and autocleaning was performed. Variables with over 90% stability, over 10% missing values, or a correlation of at least r = 0.6 with another variable were to be removed, but none of the variables fulfilled these criteria. Missing values were replaced by average values, as clustering algorithms cannot proceed with missing values. Further, nominal values were converted into numerical ones, and all numerical parameters were normalized to the range from 0 to 1, so that each variable had the same impact on the calculated distance.
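The imputation and normalization steps described above can be illustrated for a single variable. This is a minimal sketch, not RapidMiner's implementation; the helper names `impute_mean` and `min_max_scale` and the example values are hypothetical.

```python
from typing import List, Optional


def impute_mean(column: List[Optional[float]]) -> List[float]:
    """Replace missing entries (None) with the mean of the observed ones."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


def min_max_scale(column: List[float]) -> List[float]:
    """Rescale values linearly so that min -> 0 and max -> 1."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant variable: no spread to rescale
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Hypothetical admission values of one variable with one missing entry.
sodium = [138.0, None, 132.0, 144.0]
scaled = min_max_scale(impute_mean(sodium))
```

Scaling every variable to the same range keeps variables measured in large units from dominating the distance between two patients.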
Clustering is a widely used descriptive data analysis method on the border between statistical analysis and data mining with a relatively long history. The goal of clustering (also called segmentation) is to identify groups of similar examples. Thus, the critical issue in clustering is a proper definition of similarity or distance. There are several clustering methods and algorithms that can be divided into various types, such as hierarchical versus partitional, exclusive versus overlapping versus fuzzy, and complete versus partial [14].
We used the k-medoids algorithm in our experiments. K-medoids is a partitional method that creates non-overlapping clusters. The number of resulting groups must be specified in advance. The algorithm repeatedly re-assigns the examples to the given number of clusters by minimizing their distance to the cluster centre and then recomputes the centres. Unlike k-means clustering, where cluster centroids are computed by averaging the values of the examples in a given cluster, each cluster in k-medoids clustering is represented by an existing, most representative example (the medoid). This makes the results of k-medoids clustering easier to interpret. The implementation in RapidMiner offers the option to tune the hyperparameters of the algorithm automatically. In our case, we tuned the number of clusters and the similarity measure. The process of the clusters' calculation performed in RapidMiner is displayed in Figure 1, and the file is attached in Supplementary Materials File S1.
Abbreviations (Table 1): Hb-fraction of oxygenated haemoglobin, FHHb-fraction of deoxyhemoglobin in total hemoglobin, ctHb-total hemoglobin, Lac-lactates, mOsm-milliosmoles, HGB-hemoglobin, HCT-hematocrit, RBC-red blood count, MCV-mean corpuscular volume, MCH-mean corpuscular hemoglobin, MCHC-mean corpuscular hemoglobin concentration, RDW-red cell distribution width, WBC-white blood count, LYMPH-lymphocytes percentage, MONO-monocytes, NEUTR-neutrophiles, PLT-platelets count, Ast-aspartate aminotransferase, Alt-alanine transaminase, CRP-C-reactive protein, GGTP-gamma-glutamyl transpeptidase, NTproBNP-N-terminal prohormone of brain natriuretic peptide, INR-international normalized ratio, Fe-total iron amount in blood, TIBC-total iron-binding capacity, Tsat-transferrin saturation, sTfR-soluble transferrin receptor, IL-6-interleukin 6, eGFR-estimated glomerular filtration rate.
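The alternating assign-and-update procedure of k-medoids described above can be sketched in a few lines. This is a simplified illustration and not the RapidMiner operator used in the study: the Euclidean distance, the naive initialisation with the first k points, and the toy data are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def k_medoids(points: List[Point], k: int, max_iter: int = 100) -> Tuple[List[int], List[int]]:
    """Return (indices of the medoids, cluster label of each point)."""
    medoids = list(range(k))  # assumption: start from the first k points
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid.
        labels = [
            min(range(k), key=lambda c: euclidean(p, points[medoids[c]]))
            for p in points
        ]
        # Update step: the new medoid of a cluster is the member with the
        # smallest total distance to all other members of that cluster.
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # empty cluster: keep the previous medoid
                new_medoids.append(medoids[c])
                continue
            best = min(
                members,
                key=lambda i: sum(euclidean(points[i], points[j]) for j in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:  # converged: medoids no longer change
            break
        medoids = new_medoids
    return medoids, labels


# Toy data: two well-separated groups of normalized "patients".
toy = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (0.9, 1.0), (1.0, 0.9)]
medoid_idx, cluster_labels = k_medoids(toy, k=2)
```

Each returned medoid is the index of an actual data point, so a cluster can be described by pointing at one concrete, representative patient.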
We assessed the quality of clustering using the Davies-Bouldin index [15]. This index evaluates the quality of clustering considering the intra-cluster distance (that should be low) and inter-cluster distance (that should be high). The lower the value of the Davies-Bouldin index, the better the clustering.
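For illustration, the index can be computed as the average, over all clusters, of the worst-case ratio between summed within-cluster scatter and between-centre distance. The sketch below follows the common centroid-based formulation with Euclidean distance; whether RapidMiner uses exactly this variant is an assumption, and the function names and toy data are hypothetical.

```python
from typing import Dict, List, Sequence

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def centroid(members: List[Point]) -> List[float]:
    return [sum(col) / len(members) for col in zip(*members)]


def davies_bouldin(points: List[Point], labels: List[int]) -> float:
    """Davies-Bouldin index; requires at least two clusters."""
    clusters: Dict[int, List[Point]] = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    centres = {lab: centroid(m) for lab, m in clusters.items()}
    # Scatter: mean distance of the members to their own centre.
    scatter = {
        lab: sum(euclidean(p, centres[lab]) for p in m) / len(m)
        for lab, m in clusters.items()
    }
    total = 0.0
    for i in clusters:
        # Worst (largest) similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / euclidean(centres[i], centres[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


# Toy one-dimensional data: a compact, well-separated split versus a mixed one.
data = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = davies_bouldin(data, [0, 0, 1, 1])
bad = davies_bouldin(data, [0, 1, 0, 1])
```

In this toy case the compact, well-separated split yields the lower index, in line with the rule that lower values indicate better clustering.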
Associations between clusters and clinical variables were evaluated. Normality was checked using the Kolmogorov-Smirnov, Shapiro-Wilk, and Lilliefors tests. Parameters with normal distributions are shown as means ± standard deviations. Non-normal variables are displayed as medians and interquartile ranges. Categorical variables are shown as numbers and percentages (Table 2). Statistical significance was evaluated using analysis of variance; a p-value below 0.05 was considered statistically significant. Clustering was performed in RapidMiner 9.1 (RapidMiner GmbH, Dortmund, Germany), and the statistical assessment was conducted in STATISTICA 12 (StatSoft Polska Sp. z o.o., Krakow, Poland).

Clustering
The population was segmented into three clusters, numbered from 0 to 2. The groups comprised 158, 110, and 44 patients, respectively.

Cluster 0
Cluster 0 was the most numerous one. It comprised the highest proportion of chronic HF with reduced ejection fraction, with the underlying cause of coronary artery disease. Patients usually had a history of PCI/CABG and electrical device implantation. COPD and insulin-dependent diabetes were most frequently reported. Clinical status comprised common pulmonary congestion, moderate limb oedema, and the lowest heart rate. In laboratory parameters, they presented the lowest Ast, Alt, ferritin, IL-6, and NT-proBNP.

Cluster 1
Compared with the other clusters, this group was composed predominantly of older women. They presented with the first manifestation of HF, with preserved ejection fraction and a high comorbidity burden, i.a., diabetes and hypertension. Their clinical presentation was characterized by the most frequent NYHA IV, the least frequent lower limb oedema and pulmonary congestion, and the highest blood pressure. In laboratory measurements, they had the lowest haemoglobin, HCO3, bilirubin, and GGTP, and the highest serum sodium and potassium concentrations, serum osmolarity, glucose, Ast, Alt, IL-6, and ferritin.

Cluster 2
The last group was the youngest, with the highest proportion of males and the lowest ejection fraction. They reported the highest stroke history and presented with frequent ascites and hepatomegaly. In laboratory parameters, they had the highest HGB, HCT, MCV, bilirubin, GGTP, Fe, NT-proBNP, urine creatinine, and urea and the lowest albumin. They were also the most frequent active alcohol users and smokers.
The most important clinical features of each cluster are shown in Table 3 and Figure 2.
Table 3, cluster 1. Highest: % of females, age, ejection fraction, % of de novo HF and preserved EF, valvular and hypertension aetiology, hypertension, diabetes, RR, mOsm, Na, K, glucose, Ast, Alt. Lowest: ascites, hepatomegaly, HGB, HCT, MCH, pH, HCO3, urine creatinine and urea. Non-significant, highest: NYHA IV, limbs oedema I, JVP I, no pulmonary oedema, pCO2, IL-6, ferritin, creatinine, urine Na. Summary: first manifestation of HFpEF in older women, with high inflammatory markers, creatinine and osmolarity, highest AKI and WRF occurrence, and moderate one-year mortality.

Outcome
The overall one-year mortality in the studied group was 24% (74 events). The mortality did not differ significantly between the clusters (22% vs. 22% vs. 34% for clusters 0, 1, and 2, respectively; p = 0.2). Cox regression was performed, but none of the clusters' hazard ratios reached statistical significance (p = 0.35, p = 0.75, p = 0.09), and neither did the Kaplan-Meier estimation (p = 0.21).
Clusters differed in terms of the time of hospitalization, AKI, and WRF occurrence. Patients in cluster 2 were the least likely to develop AKI or WRF and were hospitalized for the longest time.
The outcomes and findings are summarised in Table 4. Abbreviations: WRF-worsening of the renal function, AKI-acute kidney injury, HF-heart failure.

Discussion
WRF and AKI are common complications of AHF and are associated with ominous outcomes [4]. The occurrence of AKI has been estimated at 9-13% of AHF patients [16,17]. The underlying causes of WRF in AHF are complex and not fully understood; the most prominent hypotheses include the impact of, i.a., congestion [18]. Given this lack of specific evidence, we decided to analyse the heterogeneity of the AHF population in the context of WRF occurrence and the possible clinical phenotypes which determine it.
ML-based analysis is gaining popularity in cardiovascular research [19]. There have been several notable attempts to implement ML in the HF population [20][21][22][23][24][25][26]. Yagi tried to identify distinct phenotypes among AHF patients who experienced WRF [27]. Nevertheless, our study is the first to incorporate clustering into the analysis of the HF population with the aim of distinguishing subgroups that differ in terms of WRF. The clustering techniques were able to distinguish three interesting clinical subtypes with different pathophysiology and implications for the outcome.

Cluster 0
This cluster represents the population of older men with chronic HF. We can assume that these patients have a relatively long history of cardiovascular treatment, as they frequently have an implanted electrical device and have undergone coronary intervention. They are also burdened with comorbidities, i.e., end-stage insulin-dependent diabetes and COPD. As these patients represent a chronic and fragile population, therapeutic interventions should be targeted at stable heart failure and comorbidities management [28][29][30].

Cluster 1
Cluster 1 is mainly composed of females. It is the oldest population with the first manifestation of HF, non-ischaemic aetiology, and preserved ejection fraction. They present signs of minimal congestion. In the biochemical assessment, patients in cluster 1 had the highest serum creatinine, sodium, potassium, and osmolarity. This phenotype corresponds with the described HFpEF phenotype [31]. Cluster 1 had the highest concentration of selected inflammatory biomarkers (IL-6, ferritin), and high activation of inflammatory pathways was reported to be unique to HFpEF [32]. Recent studies showed that higher osmolarity correlates with the incidence of WRF in AHF [33]. Importantly, this group had the highest incidence of AKI and WRF but moderate mortality; a possible explanation is discussed below. As the HFpEF population currently suffers from the lack of evidence-based treatment, therapeutic interventions should focus on comorbidities management and lifestyle changes [2]. Some hope for efficient pharmacotherapy is provided by the recent trials on SGLT-2 inhibitors [34][35][36].

Cluster 2
Cluster 2 seems to be the most interesting. It consists almost exclusively of men. They represent the youngest population with chronic HF and the lowest ejection fraction, with an aetiology described as "other". These patients had the lowest burden of comorbidities, which can be explained by their younger age and probable underdiagnosis due to low commitment to their health management. They can be described as having a toxic aetiology. They have the highest frequency of active smokers and alcohol users and the highest values of GGTP and bilirubin, which reflect impaired liver function [37]. Moreover, they had the highest mean value of MCV, which might be associated with alcohol abuse [38]. In the clinical assessment, they manifest frequent and massive peripheral oedema, i.e., the highest incidence of limb oedema III, hepatomegaly, and ascites, but somewhat limited pulmonary congestion. This discrepancy between the severity of oedema in different vascular areas should be further evaluated. Laboratory signs of congestion, e.g., NT-proBNP, are also the highest among the clusters. Notably, patients in cluster 2 had the lowest pCO2, which can be a sign of heightened chemosensitivity, a predictor of an unfavourable outcome [39].
Notably, the cluster with the lowest incidence of AKI and WRF (cluster 2) was the one with the highest one-year mortality (non-significant). In our opinion, that can be explained by two intertwined hypotheses. First, creatinine is the late marker of kidney function [40] and has limited value in assessing renal damage [41]. Some authors distinguish true and pseudo-WRF based on the concentration of so-called new renal biomarkers, i.e., NGAL, KIM-1, and cystatin-c [42]. Considering this, the isolated increase in serum creatinine can be insufficient for an accurate kidney assessment. Secondly, creatinine can rise during decongestive therapy [43,44]. It was reported that the transient rise of creatinine during decongestive treatment could even be a promising sign, as it reflects the exhaustiveness of the decongestion [45]. Thus, increased creatinine during diuretic treatment does not necessarily indicate genuine kidney injury, which would worsen the outcome, but it can be a sign of diminishing volume overload. The incompleteness of the decongestion was shown to be an important prognostic factor of mortality in AHF [46], which, in our case, could explain why the cluster with the lowest WRF incidence reaches the highest mortality.
The proposed novel classification may complement the classical ways of AHF patient profiling and has significant clinical implications. Each of the extracted clusters has a different suggested pathophysiology and, therefore, a different therapeutic pathway; e.g., cluster 0-uptitration of the evidence-based HFrEF medical therapy, cluster 1-comorbidities management, and cluster 2-substance abuse counselling and harm reduction. Focusing on these aspects should lead to more accurate treatment tailoring and eventually optimization of therapy. The efficiency of the proposed cluster-based approach to therapy adjustment should be evaluated in prospective studies. Notably, clustering does not reveal unexpected relationships; the uncovered connections are clear to an experienced cardiologist. The value of the presented analysis is that it provides tangible evidence for the existence of such phenogroups. Potentially, clustering could immediately categorize a patient into one of the groups and suggest to the physician a relevant course of action, which can sometimes be omitted due to overwork or lack of experience.

Limitations
Our study is not free from limitations. Our data come from single-centre registries gathered in 2010-2012 and 2016-2017. Patients in these registries were treated according to the ESC guidelines current at the time, which did not mention modern drugs, i.a., SGLT-2 inhibitors. This limits the extrapolation of our results to the present AHF population. Further, we did not assess the novel kidney markers, which would increase the thoroughness of the renal status evaluation. However, the presented assessment model relies on commonly used, well-understood variables. Importantly, we only included patients who had their creatinine evaluated at four time points, including discharge. Thus, we only included patients who survived the hospitalization. We also prespecified the number of clusters, as we wanted to avoid over-fragmentation of the data; the choice of three clusters follows previous papers about clustering in HF [27,47,48]. All the issues mentioned above should be addressed in further trials.

Conclusions
Machine learning techniques provided fresh insights into the existing medical datasets. We were able to distinguish three clinically and prognostically different phenotypes. Importantly, these phenotypes are different in terms of the AKI or WRF occurrence. These groups constitute valuable insight into AHF and WRF interplay and may be leveraged for future trial construction and more tailored treatments. Our data provides further evidence for the hypothesis that the serum creatinine concentration should be analysed in the broader context in the population of decongested patients and that its increase is not necessarily prognostically worrying.
Notably, we used the k-medoids algorithm instead of the more popular k-means algorithm because k-medoids represents the centres of clusters as existing data points (patients in our case). This makes the results easier to interpret. The k-medoids algorithm is also more robust to outliers than the k-means algorithm [49], which is meaningful in medical data.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/biom12111616/s1, File S1: RapidMiner process.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
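The difference in robustness can be seen on a single one-dimensional cluster. The sketch below compares a k-means-style centre (the mean) with a k-medoids-style centre (the medoid); the values are hypothetical and serve only as an illustration.

```python
from typing import List


def mean_centre(values: List[float]) -> float:
    """Centre as used by k-means: the arithmetic mean."""
    return sum(values) / len(values)


def medoid(values: List[float]) -> float:
    """Centre as used by k-medoids: the existing value with the
    smallest total distance to all other values."""
    return min(values, key=lambda v: sum(abs(v - w) for w in values))


# Hypothetical creatinine values (mg/dL); the last one is an outlier.
creatinine = [1.0, 1.1, 0.9, 1.2, 9.0]
centre_kmeans = mean_centre(creatinine)
centre_kmedoids = medoid(creatinine)
```

Here the mean is pulled above every typical value by the single outlier, whereas the medoid remains one of the observed, typical measurements.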

Data Availability Statement:
The data presented in this study are available within the article. Further data are available on request from the corresponding author.