Differences between Kidney Transplant Recipients from Deceased Donors with Diabetes Mellitus as Identified by Machine Learning Consensus Clustering

Clinical outcomes of deceased donor kidney transplants coming from diabetic donors currently remain inconsistent, possibly due to high heterogeneity in this population. Our study aimed to cluster recipients of diabetic deceased donor kidney transplants using an unsupervised machine learning approach in order to identify subgroups with high risk of inferior outcomes and potential variables associated with these outcomes. Consensus cluster analysis was performed based on recipient-, donor-, and transplant-related characteristics in 7876 recipients of diabetic deceased donor kidney transplants from 2010 to 2019 in the OPTN/UNOS database. We determined the important characteristics of each assigned cluster and compared the post-transplant outcomes between the clusters. Consensus cluster analysis identified three clinically distinct clusters. Recipients in cluster 1 (n = 2903) were characterized by the oldest age (64 ± 8 years) and the highest rate of comorbid diabetes mellitus (55%). They were more likely to receive kidney allografts from donors that were older (58 ± 6.3 years), had hypertension (89%), met expanded criteria donor (ECD) status (78%), had a high rate of cerebrovascular death (63%), and carried a high kidney donor profile index (KDPI). Recipients in cluster 2 (n = 687) were younger (49 ± 13 years) and all were re-transplant patients with higher panel reactive antibodies (PRA) (88 [IQR 46, 98]) who received kidneys from younger (44 ± 11 years), non-ECD deceased donors (88%) with low numbers of HLA mismatch (4 [IQR 2, 5]). The cluster 3 cohort was characterized by first-time kidney transplant recipients (100%) who received kidney allografts from younger (42 ± 11 years), non-ECD deceased donors (98%).
Compared to cluster 3, cluster 1 had higher incidence of primary non-function, delayed graft function, patient death and death-censored graft failure, whereas cluster 2 had higher incidence of delayed graft function and death-censored graft failure but comparable primary non-function and patient death. An unsupervised machine learning approach characterized diabetic donor kidney transplant patients into three clinically distinct clusters with differing outcomes. Our data highlight opportunities to improve utilization of high KDPI kidneys coming from diabetic donors in recipients with survival-limiting comorbidities such as those observed in cluster 1.


Introduction
End-stage kidney disease (ESKD) is a common and highly morbid disorder that affects both quality of life and survival. Kidney transplantation is the best option for most ESKD patients and is superior to other renal replacement therapies such as hemodialysis and peritoneal dialysis in terms of long-term mortality risk [1,2]. In the United States, the number of kidney transplants has increased each year since 2015; however, only a quarter of patients on waitlists received a deceased donor kidney within 5 years due to the ongoing severe shortage of available kidney grafts [3]. Within this context, the transplant community has continued to seek opportunities to increase the size of the available donor pool so as to improve opportunities for transplant. Kidney allografts coming from donors with a higher kidney donor profile index (KDPI) or meeting expanded criteria donor (ECD) status are often imperfect in quality; however, they provide survival and quality of life benefit to patients awaiting transplantation [4]. Utilization of kidneys from donors with diabetes mellitus (DM) has increased remarkably in the last two decades [5]. In the United States, individuals with DM are generally considered ineligible to donate a kidney as living donors due to concern for long-term kidney health. Some studies have demonstrated that deceased donor kidney allografts coming from diabetic donors are at increased risk for delayed graft function, rejection, and inferior graft quality [5][6][7][8]. Despite these risks, studies have shown that transplantation with a diabetic donor kidney offers a greater survival benefit than remaining on the waitlist [6].
Although transplant centers remain cautious when using kidney allografts from diabetic and high KDPI donors, there are many advantages in expanding utilization. One advantage is an increased availability of donor organs, which can be critical during the current shortage of donor organs. Utilizing diabetic donor kidneys may also reduce wait times for transplantation because there may be fewer candidates on the waiting list willing to accept such kidneys. Transplant professionals currently determine the utilization of kidneys from donors with DM on a case-by-case basis, considering factors such as the recipient's kidney disease severity, the availability of appropriate organs, and the quality of the donor organ. The inconsistent outcomes reported for diabetic donor kidney transplants are most likely caused by the highly heterogeneous clinical presentations of both recipients and donors. As such, subgroup analysis of diabetic donor kidney transplant recipients may help identify the specific population and guide selection of candidates suitable for transplantation.
Unsupervised machine learning (ML) is a type of artificial intelligence that enables the analysis of intricate data sets and recognition of patterns without explicit guidance or labeling [7,8]. In the case of kidney transplant recipients, unsupervised ML can be a useful tool for healthcare professionals to identify complex and subtle relationships between patient characteristics, donor characteristics, medical history, treatment history, and outcomes that may not have been apparent using traditional statistical methods [9,10]. The potential applications of unsupervised ML in kidney transplantation include identifying subpopulations of kidney transplant recipients that may benefit from specific interventions. For example, unsupervised ML algorithms can identify groups of kidney transplant recipients with similar clinical characteristics and medical histories but different kidney transplant outcomes [10]. This enables healthcare professionals to identify subpopulations of patients who may benefit from specific treatments or interventions such as adjusting immunosuppressive medication regimens or modifying lifestyle factors. ML consensus clustering is a technique used in data clustering that involves combining the results of multiple clustering algorithms to improve the quality and robustness of clustering results [11,12]. This can help healthcare professionals develop more targeted and effective strategies for managing kidney transplant recipients, leading to better patient outcomes. The utilization of an ML consensus clustering approach could potentially offer healthcare professionals a new perspective on the different phenotypes of kidney transplant recipients who received a kidney from a diabetic donor and have varying outcomes.
The objective of this study was to utilize an unsupervised ML consensus clustering approach to cluster kidney transplant patients who received kidneys from deceased diabetic donors based on a wide range of recipient-, donor-, and transplant-related variables, and subsequently examine the post-transplant outcomes among the distinct clusters identified through the ML algorithm.

Data Source and Study Population
The study population consisted of adult kidney transplant recipients in the United States between 2010 and 2019, identified through the Organ Procurement and Transplantation Network (OPTN)/United Network for Organ Sharing (UNOS) database. The inclusion criteria encompassed kidney transplants from deceased donors with diabetes. The database did not specify the type of diabetes in donors; therefore, any diabetic donors, regardless of diabetic type, were included. Approval for this study was obtained from the Mayo Clinic Institutional Review Board (IRB number 21-007698).

Data Collection
A comprehensive set of recipient-, donor-, and transplant-related characteristics, as outlined in Table 1, was extracted from the OPTN/UNOS database for subsequent incorporation into the clustering analysis. All variables had less than 10% missing data. Missing values were imputed using the multiple imputation by chained equations (MICE) method [13].

Clustering Analysis
In this study, an unsupervised machine learning (ML) technique was employed to categorize the clinical phenotypes of deceased diabetic donor kidney transplant patients [14]. To ensure meaningful clinical clusters, a consensus clustering approach was utilized. The clustering process involved applying a pre-specified subsampling parameter of 80% with 100 iterations, and considering a range of potential clusters (k) from 2 to 10. The objective was to avoid generating an excessive number of clusters that lacked clinical significance. The determination of the optimal number of clusters was based on several factors, including the examination of the consensus matrix (CM) heat map, the cumulative distribution function (CDF), cluster-consensus plots incorporating within-cluster consensus scores, and the proportion of ambiguously clustered pairs (PAC). The within-cluster consensus score, ranging from 0 to 1, was calculated as the average consensus value among pairs of individuals belonging to the same cluster [12]. A higher score indicated greater cluster stability. In contrast, the PAC, also ranging from 0 to 1, represented the proportion of sample pairs with consensus values falling within predetermined boundaries [15]. A lower PAC value indicated better cluster stability [15]. For further details regarding the consensus cluster algorithms employed in this study for reproducibility purposes, please refer to the Supplementary Materials.
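The stability measures described above can be illustrated with a minimal sketch. It is not the ConsensusClusterPlus implementation used in this study: the function names, the toy subsampling runs, and the PAC boundaries of 0.1 and 0.9 are assumptions chosen for illustration only. The sketch builds a consensus matrix from repeated subsampled clusterings, then computes the within-cluster consensus score and the PAC from it.

```python
from itertools import combinations


def consensus_matrix(n_items, runs):
    """Build a consensus matrix from subsampled clustering runs.

    `runs` is a list of dicts, one per iteration, mapping the index of each
    subsampled item to the cluster label it received in that iteration.
    Entry (i, j) is the fraction of runs containing both i and j in which
    the two items were placed in the same cluster.
    """
    together = [[0] * n_items for _ in range(n_items)]
    sampled = [[0] * n_items for _ in range(n_items)]
    for labels in runs:
        for i, j in combinations(sorted(labels), 2):
            sampled[i][j] += 1
            if labels[i] == labels[j]:
                together[i][j] += 1
    cm = [[0.0] * n_items for _ in range(n_items)]
    for i, j in combinations(range(n_items), 2):
        if sampled[i][j]:
            cm[i][j] = cm[j][i] = together[i][j] / sampled[i][j]
    for i in range(n_items):
        cm[i][i] = 1.0
    return cm


def within_cluster_consensus(cm, assignment, cluster):
    """Average consensus value over all pairs assigned to `cluster`."""
    members = [i for i, c in enumerate(assignment) if c == cluster]
    pairs = list(combinations(members, 2))
    return sum(cm[i][j] for i, j in pairs) / len(pairs)


def pac(cm, lower=0.1, upper=0.9):
    """Proportion of item pairs whose consensus lies strictly between the
    boundaries (assumed here to be 0.1 and 0.9)."""
    values = [cm[i][j] for i, j in combinations(range(len(cm)), 2)]
    return sum(1 for v in values if lower < v < upper) / len(values)


# Toy example: 4 recipients, 3 subsampled runs (item 3 is left out of run 2).
runs = [
    {0: 0, 1: 0, 2: 1, 3: 1},
    {0: 0, 1: 0, 2: 1},
    {0: 1, 1: 1, 2: 0, 3: 1},
]
cm = consensus_matrix(4, runs)
final_assignment = [0, 0, 1, 1]  # hypothetical final cluster labels
print(within_cluster_consensus(cm, final_assignment, 0))
print(pac(cm))
```

In this toy example, recipients 0 and 1 always cluster together, so their cluster has a high consensus score, whereas pairs involving recipient 3 are split across runs and contribute to the PAC.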

Outcomes
The study evaluated various outcomes following kidney transplantation from diabetic donors, including primary non-function, delayed graft function, acute rejection within 1 year, patient death, and death-censored graft failure within 1 and 5 years post-transplant. Death-censored graft failure was defined as the requirement for dialysis or re-transplantation, with patients censored at death or at the most recent follow-up date available in the OPTN/UNOS database.
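The death-censored definition above can be made concrete with a short sketch. The function name and the date fields are hypothetical and do not correspond to OPTN/UNOS variable names; the sketch only shows how each recipient could be coded as a time-to-event pair under this definition.

```python
from datetime import date


def death_censored_graft_event(transplant, last_follow_up,
                               graft_failure=None, death=None):
    """Return (days_from_transplant, event) for death-censored graft failure.

    event = 1 if graft failure (dialysis or re-transplantation) occurred
    on or before death; otherwise the recipient is censored (event = 0)
    at death or, if alive, at the most recent follow-up date.
    All argument names are hypothetical.
    """
    if graft_failure is not None and (death is None or graft_failure <= death):
        return (graft_failure - transplant).days, 1
    end = death if death is not None else last_follow_up
    return (end - transplant).days, 0


tx = date(2015, 1, 1)
# Graft failed 100 days after transplant: counted as an event.
failed = death_censored_graft_event(tx, date(2019, 1, 1),
                                    graft_failure=date(2015, 4, 11))
# Died with a functioning graft: censored at the date of death.
died = death_censored_graft_event(tx, date(2019, 1, 1),
                                  death=date(2016, 1, 1))
# Alive with a functioning graft: censored at last follow-up.
alive = death_censored_graft_event(tx, date(2016, 1, 1))
print(failed, died, alive)
```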

Statistical Analysis
Following the assignment of each kidney transplant recipient from diabetic donors into clusters through consensus clustering analysis, a comparative analysis of clinical characteristics and post-transplant outcomes was conducted among the assigned clusters. The differences in clinical characteristics among the clusters were assessed using the Chi-squared test for categorical variables and analysis of variance (ANOVA) for continuous variables. The key characteristics of each cluster were identified by measuring the standardized mean difference (SMD) between each cluster and the overall cohort, with a predetermined cutoff of >0.3. Rather than relying solely on p-values, we chose this threshold based on the effect size considered clinically significant for the variables under examination. By adopting this approach [10], we aimed to capture substantial differences among the clusters while taking into account the magnitude of these differences. Patient survival and death-censored graft survival were estimated using Kaplan-Meier analysis, and comparisons among the clusters were made using the log-rank test. Hazard ratios (HR) for patient death and death-censored graft failure were calculated using Cox proportional hazard analysis, while odds ratios (OR) for primary non-function, delayed graft function, and 1-year rejection were determined using logistic regression analysis. Since the consensus clustering approach deliberately generated clinically distinct clusters, the associations of the assigned clusters with post-transplant outcomes were not adjusted for patient characteristics. All statistical analyses were performed using R, version 4.0.3 (RStudio, Inc., Boston, MA, USA; http://www.rstudio.com/, accessed on 18 January 2023), including the ConsensusClusterPlus package (version 1.46.0) for consensus clustering analysis and the mice package in R for multiple imputation by chained equations [13].
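As an illustration of the SMD screening step, the sketch below computes an SMD for a continuous and for a binary characteristic and applies the >0.3 cutoff. The exact SMD formula is not stated in the text; the pooled-standard-deviation form used here is an assumed, commonly used definition, and the function names and example values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def smd_continuous(cluster_values, cohort_values):
    """SMD between a cluster and the overall cohort for a continuous variable.

    Assumed form: difference in means divided by the square root of the
    average of the two sample variances.
    """
    pooled_sd = sqrt((variance(cluster_values) + variance(cohort_values)) / 2)
    return (mean(cluster_values) - mean(cohort_values)) / pooled_sd


def smd_binary(p_cluster, p_cohort):
    """SMD for a binary characteristic given the two proportions (same assumed form)."""
    pooled_sd = sqrt((p_cluster * (1 - p_cluster) + p_cohort * (1 - p_cohort)) / 2)
    if pooled_sd == 0:
        return 0.0
    return (p_cluster - p_cohort) / pooled_sd


def is_key_characteristic(smd, cutoff=0.3):
    """Flag a characteristic whose absolute SMD exceeds the cutoff."""
    return abs(smd) > cutoff


# Hypothetical example: a donor characteristic present in 78% of a cluster
# versus 40% of the overall cohort (the 40% figure is made up).
ecd_smd = smd_binary(0.78, 0.40)
print(is_key_characteristic(ecd_smd))
```

Because the SMD is scaled by the spread of the variable rather than by the sample size, a large registry cohort does not by itself push a small difference over the cutoff, which is the motivation given above for preferring it to p-values.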

Clinical Characteristics of Each Cluster of Kidney Transplant Recipients from Diabetic Donors
During the study period from 2010 to 2019, 158,367 adult patients underwent kidney transplants. Of these, 7876 (5%) received kidney transplants from a deceased donor with diabetes. Among these donors, 69% had concurrent history of hypertension, and 31% had a KDPI of ≥85%. Therefore, consensus clustering analysis was performed in 7876 kidney transplant recipients from diabetic deceased donors. Figure 1A displays the cumulative distribution function (CDF) plot, illustrating the consensus distributions for each cluster of kidney transplant patients from deceased donors with diabetes. The delta area plot (Figure 1B) represents the relative changes in the area under the CDF curve. The most significant alterations in the area occurred between k = 2 and k = 5, after which the relative increase in the area became less pronounced. The CM heat map (Figure 1C, Figures S1-S9) demonstrates the identification of cluster 3 with distinct boundaries by the machine learning algorithm, indicating robust cluster stability across multiple iterations. The mean cluster consensus score was found to be highest in cluster 3 (Figure 2A). Additionally, a favorably low proportion of ambiguously clustered pairs (PAC) was observed for the three clusters (Figure 2B). Thus, the consensus clustering analysis identified three clusters that best represented the data pattern of diabetic deceased donor kidney transplant patients.
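The delta area plot can be read with the help of a minimal sketch. It follows one common convention (the first value is the area under the CDF for the smallest k, and later values are the relative change from the previous k) and may differ in detail from the ConsensusClusterPlus computation; the function names and area values below are hypothetical.

```python
def cdf_area(consensus_values, bins=100):
    """Approximate the area under the empirical CDF of pairwise consensus
    values on [0, 1] using a right-endpoint rectangle rule."""
    n = len(consensus_values)
    area = 0.0
    for b in range(1, bins + 1):
        x = b / bins
        cdf_at_x = sum(1 for v in consensus_values if v <= x) / n
        area += cdf_at_x / bins
    return area


def delta_area(areas):
    """Relative change in CDF area between consecutive values of k.

    `areas` is ordered by increasing k; the first entry is reported as-is.
    """
    deltas = [areas[0]]
    for previous, current in zip(areas, areas[1:]):
        deltas.append((current - previous) / previous)
    return deltas


# Hypothetical CDF areas for k = 2, 3, 4 (not values from this study).
example_deltas = delta_area([0.5, 0.75, 0.9])
print(example_deltas)
```

A flattening of these relative changes as k increases is what is meant above by the increase in area becoming less pronounced.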

Discussion
We have successfully categorized kidney transplant recipients who received deceased donor kidneys from diabetic donors into 3 clusters with distinct clinical features and outcomes by using an unsupervised ML approach. Compared to clusters 1 and 2, cluster 3 recipients had superior outcomes specific to death-censored graft survival and patient survival at both 1 and 5 years. Compared to the other clusters, cluster 3 recipients had more favorable recipient and donor characteristics. All recipients in cluster 3 were first-time kidney transplant patients who received kidney allografts from younger donors with a lower KDPI score.
Cluster 1 recipients accounted for ~40% of this study cohort. Cluster 1 recipients had the least favorable recipient and donor characteristics. These characteristics included older age and presence of diabetes. Donor-recipient pairing was evident in cluster 1 and recipients in cluster 1 were more likely to receive higher KDPI allografts. Transplant outcomes, including survival, are known to be reduced in older recipients with comorbidities [16,17]. Use of higher KDPI kidney allografts in older recipients highlights opportunities to improve access and equity in kidney transplantation [18]. This complements OPTN data, which have shown that outcomes are better with transplantation compared with remaining on the waitlist [6]. Use of older, higher KDPI, diabetic deceased donor kidney allografts is likely of greatest benefit to older recipients with minimal qualifying time, lack of available living donors, and presence of survival-limiting comorbidities. Although use of allografts with high KDPI characteristics is unlikely to be advantageous for younger recipients, diabetic kidney allografts from younger donors are likely to be of suitable quality. As suggested by findings from this unsupervised ML, kidney allografts with donor characteristics observed in clusters 2 and 3 should be considered for all waitlist patients, including those who are younger in age. Careful screening of the allograft, including consideration of a procurement biopsy, can help guide clinical decision making on the appropriate recipient. Overall, outcomes in cluster 1 highlight opportunities to increase utilization of higher KDPI kidneys in older recipients with significant survival-limiting comorbidities. Patient survival, independent of graft quality, was the largest factor responsible for influencing outcomes. Use of a higher quality deceased donor is unlikely to have yielded differences in patient survival.
Recipients and donors in cluster 3 had the most favorable characteristics including younger age. The favorable patient and death-censored graft survival observed in cluster 3 highlights the variability observed in diabetic deceased donor kidney allografts. Duration of diabetes and medical management influence allograft quality. The importance of procurement biopsies is better established in high KDPI ECD donors; however, use of procurement biopsies in standard KDPI donors (KDPI < 85%) can also be of value, particularly for donors with diabetes.
Although recipients in cluster 2 were younger, and also had more favorable characteristics, the cluster 2 cohort had decreased death-censored graft survival. All recipients in cluster 2 were re-transplant patients and this likely played a role in the graft survival observed. Similar to cluster 3, cluster 2 donors also had more favorable characteristics, such as younger age and lower KDPI, despite the presence of diabetes. It is likely that factors such as infection and rejection, often associated with re-transplantation, may have been responsible for the reduced graft survival observed.
Overall, the findings from this unsupervised ML approach highlight opportunities to further improve the utilization of deceased donor kidney allografts coming from diabetic donors. Although diabetic donor kidneys are increasingly accepted, our data still show that only a small percentage (5%) of transplanted kidneys were diabetic donor kidneys, which is similar to other reports ranging from 3.5% to 8.8% [5,19-23]. The data from the UNOS registry suggested that approximately 40-50% of diabetic donor kidneys were discarded each year from 2008 to 2019, and diabetic ECD donor kidneys had higher discard rates, approximately 60% [3,24]. The current findings suggest that factors intrinsic to the recipient, including diabetes status and re-transplantation, more significantly impact graft and patient survival. Our study identified three different subpopulations of diabetic donor kidney recipients with distinct features and outcomes, which helps us identify the recipients at high risk. These findings prompt us to better understand, evaluate, and allocate diabetic donor kidneys to shorten the waitlist and reduce the risk of death on the waitlist, although the impact of diabetic donor kidneys on patient and graft survival should be further investigated in large, prospective, and randomized studies.
The findings from this ML approach suggest that there are opportunities to further improve the utilization of deceased donor kidney allografts coming from diabetic donors. Factors intrinsic to the recipient, including diabetes status and re-transplantation, rather than presence of diabetes in the donor, appear to play more significant roles in influencing graft and patient survival. Unsupervised ML has the potential to provide healthcare professionals with a novel viewpoint on the distinct phenotypes of kidney transplant recipients who received kidneys from donors with diabetes and experienced disparate outcomes, and thus may help improve the care and outcomes for these complex patients.
Limited biopsy data and missing reported data from the UNOS are important limitations of this study. Biopsies play a crucial role in determining the quality and suitability of kidneys for transplantation, including the utilization of kidneys from diabetic donors. However, despite their importance, biopsy data are often incomplete or unavailable. The quality of a kidney from a donor with diabetes can be influenced by several factors, such as the duration and severity of the donor's diabetes, the presence of diabetic complications, and other comorbidities. Kidneys from diabetic donors may have microscopic changes, such as arteriolar hyalinosis and interstitial fibrosis, which can reduce the functional capacity of the kidney. Additionally, donors who had well-managed blood sugar levels and exhibited no signs or symptoms of renal impairment, such as elevated levels of protein in their urine or decreased kidney function, may not have had diabetic kidney disease. Our dataset had certain limitations, including the absence of information on coronary artery disease, post-transplant diabetes, and CMV prophylaxis management [25]. These factors are important considerations in the context of kidney transplantation outcomes. Therefore, future research endeavors should focus on incorporating these crucial pieces of information to gain a more comprehensive understanding of the characteristics of both recipients and donors in the UNOS database. By including these data, a more detailed analysis can be conducted to assess their impact on transplant outcomes and potentially enhance patient care in the field of kidney transplantation. Additional studies could expand upon the information that is currently available, allowing for a more comprehensive understanding of recipient and donor-related factors and their impact on transplant outcomes.
Additionally, we utilized the SMD technique with a predetermined threshold exceeding 0.3 to identify the distinct characteristics associated with each cluster [26][27][28][29]. Instead of solely relying on p-values, our decision to use this cutoff value was informed by the effect size that was deemed clinically significant for the variables being investigated. This methodological approach was implemented to ensure that we captured meaningful differences among the clusters while considering the magnitude of these differences [10]. For instance, working income did not emerge as a distinct clinical characteristic specific to each assigned cluster. However, employment status and its associated factors, such as working income, may indeed play a role in post-transplant outcomes and adherence [30]. Future studies that incorporate data on employment status and post-transplant adherence could provide further insights into the relationship between working income, adherence, and transplant outcomes. Finally, our dataset did not contain waitlisted patients [31]. For instance, it is noteworthy that the impact of type 2 diabetes on the survival of waitlisted individuals is substantial [32]. As a result, it may be necessary to modify allocation policies for type 2 diabetes patients to account for the heightened risk of mortality and the possibility of their waitlist being suspended due to the presence of other medical conditions. Future studies to apply this machine learning approach in waitlisted patients would be of interest to better identify waitlisted patients who would have more survival benefit from receiving kidney transplant from donors with diabetes, high KDPI, or ECD status.

Conclusions
Our study utilized an unsupervised machine learning approach to cluster recipients of diabetic deceased donor kidney transplants into three clinically distinct groups based on recipient-, donor-, and transplant-related characteristics. Our findings suggest that recipients in Cluster 1, who were characterized by older age, comorbid diabetes mellitus, and high-KDPI kidneys, were associated with poorer post-transplant outcomes. This highlights the need for improved utilization of high-KDPI kidneys coming from diabetic donors in recipients with survival-limiting comorbidities. Additionally, Cluster 2 recipients, who were younger re-transplant patients with higher PRA, were associated with higher incidence of delayed graft function and death-censored graft failure. These results may aid in identifying subgroups of recipients who require closer monitoring and tailored interventions to improve their post-transplant outcomes.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/jpm13071094/s1. Figure S1. Consensus matrix heat map (k = 2) depicting consensus values on a white to blue color scale of each cluster; Figure S2. Consensus matrix heat map (k = 3) depicting consensus values on a white to blue color scale of each cluster; Figure S3. Consensus matrix heat map (k = 4) depicting consensus values on a white to blue color scale of each cluster; Figure S4. Consensus matrix heat map (k = 5) depicting consensus values on a white to blue color scale of each cluster; Figure S5. Consensus matrix heat map (k = 6) depicting consensus values on a white to blue color scale of each cluster; Figure S6. Consensus matrix heat map (k = 7) depicting consensus values on a white to blue color scale of each cluster; Figure S7. Consensus matrix heat map (k = 8) depicting consensus values on a white to blue color scale of each cluster; Figure S8. Consensus matrix heat map (k = 9) depicting consensus values on a white to blue color scale of each cluster; Figure S9. Consensus matrix heat map (k = 10) depicting consensus values on a white to blue color scale of each cluster. References [33][34][35][36][37][38]