Is It Time for Machine Learning Algorithms to Predict the Risk of Kidney Failure in Patients with Chronic Kidney Disease?

Chronic kidney disease (CKD) is a common clinical problem affecting more than 800 million people with different kidney diseases [1]. CKD is associated with poor clinical outcomes including cardiovascular complications, a reduced quality of life, increased healthcare resource utilization and death [1,2]. In a few cases, CKD may progress to end-stage kidney disease (ESKD), leading to even higher morbidity and mortality [3]. A widely used, readily applicable and extensively validated risk prediction tool is the kidney failure risk equation (KFRE) [3,4]. This equation uses age, sex, urine albumin-to-creatinine ratio (ACR) and estimated glomerular filtration rate (eGFR) to predict the need for kidney replacement therapy (KRT), including dialysis or kidney transplantation, and showed excellent predictive performance in a Canadian population [4].
Subsequently, the KFRE has been externally validated in more than 30 countries [3,5–10]. Recently, Hallan et al. [10] studied the KFRE model in a Norwegian cohort of patients aged >65 years with an eGFR <45 mL/min/1.73 m², and found it to be well calibrated with excellent discrimination (C-statistic of 0.93). This was comparable with the findings in a primary care setting in the United Kingdom [11]. Additional studies that either validated the KFRE or implemented a similar approach incorporating the KFRE have reported comparable findings, including in a cohort in Korea (C-statistic of 0.86 in stage 3 CKD, 0.80 in stage 4 CKD and 0.84 in stage 5 CKD) [12], a cohort in Japan (C-statistic of 0.84 for a 3-variable model and 0.88 for an 8-variable model) [13] and a population in Singapore (C-statistic of 0.93 for a 4-variable model) [14].
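To make the C-statistics quoted above more concrete, the following is a minimal sketch of how a concordance statistic can be computed for a binary outcome. It assumes each patient has a predicted risk and a simple yes/no kidney failure indicator, and it ignores censoring and follow-up time, which the cited validation studies would normally handle with a time-to-event version of the statistic. The function name and the data are hypothetical and are not taken from any of the cited cohorts.

```python
from itertools import product


def c_statistic(risks, events):
    """Share of (event, non-event) patient pairs in which the patient who
    developed kidney failure had the higher predicted risk; ties count 0.5."""
    cases = [r for r, e in zip(risks, events) if e == 1]
    controls = [r for r, e in zip(risks, events) if e == 0]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for risk_case, risk_control in product(cases, controls):
        if risk_case > risk_control:
            score += 1.0
        elif risk_case == risk_control:
            score += 0.5
    return score / (len(cases) * len(controls))


# Hypothetical predicted risks and outcomes (1 = kidney failure), for illustration only.
predicted_risk = [0.90, 0.70, 0.40, 0.35, 0.20, 0.05]
kidney_failure = [1, 1, 0, 1, 0, 0]
print(round(c_statistic(predicted_risk, kidney_failure), 3))
```

A value of 0.5 corresponds to discrimination no better than chance, and a value of 1.0 means every patient who reached kidney failure was ranked above every patient who did not.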
While the KFRE can provide excellent predictive performance, it does not account for the effect of health-related behaviors on the risk of CKD progression [15]. In addition, the risks of kidney failure predicted by the KFRE have been shown to be greater than the observed risks across the different etiologies of CKD, with the exception of patients with polycystic kidney disease (PKD) [16]. Furthermore, each specific cause of CKD has additional important prognostic factors, such as total kidney volume (TKV) in patients with PKD [17], immunosuppression in patients with glomerulonephritis [18] and the presence of rejection in kidney transplant recipients [19].
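The overprediction described above is a calibration problem, and one simple way to summarize it is the ratio of the observed event rate to the mean predicted risk. The sketch below assumes binary outcomes over a fixed horizon with no censoring. The function name and the numbers are hypothetical, chosen only to illustrate the arithmetic.

```python
def observed_to_expected(predicted_risks, events):
    """Observed event rate divided by the mean predicted risk.
    A ratio below 1 suggests the model overpredicts risk in this group."""
    if len(predicted_risks) != len(events) or not events:
        raise ValueError("inputs must be non-empty and of equal length")
    expected = sum(predicted_risks) / len(predicted_risks)
    observed = sum(events) / len(events)
    return observed / expected


# Hypothetical subgroup: mean predicted risk 0.50, observed event rate 0.25.
predicted = [0.2, 0.4, 0.6, 0.8]
outcome = [0, 0, 1, 0]
ratio = observed_to_expected(predicted, outcome)
print(round(ratio, 2))  # below 1, i.e. predicted risk exceeds observed risk
```

Computing this ratio separately for each CKD etiology is one way such etiology-specific miscalibration could be made visible.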
Recent progress in big data from electronic medical records (EMRs) has greatly accelerated the development of machine learning (ML) [1], which uses computer algorithms to identify patterns in large datasets with many complex factors. ML can produce more precise prediction models by modeling linear and non-linear interactions among large numbers of diverse variables, surpassing the capacity of traditional statistical approaches. Recent studies have demonstrated the potential application of ML approaches for CKD patients, including the identification and monitoring of CKD [20]. With the advancement of natural language processing technology, clinical notes may also offer an opportunity to discover previously unknown risk factors for CKD progression. While the data on the use of ML algorithms for CKD detection and monitoring are promising, data on prognostic algorithms to predict the risk of kidney failure in CKD patients are limited [20].
Recently, the Chronic Renal Insufficiency Cohort (CRIC) study investigators applied an unsupervised ML algorithm with consensus clustering to 72 baseline characteristics among 2696 CKD patients [21]. With this approach, the investigators identified distinct CKD subgroups that were associated with different risks of clinical outcomes [21]. These findings support the development of an ML risk prediction model for CKD progression. While the KFRE has already been validated and has achieved excellent predictive performance [3,5–11,13], ML approaches can improve further through experience and the incorporation of more up-to-date data in order to provide better healthcare and precision medicine [22].
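As an illustration of the consensus clustering idea mentioned above, the sketch below repeatedly clusters random subsamples of patients and records how often each pair of patients lands in the same cluster. It is only a toy version under stated assumptions: the base algorithm here is a small hand-written k-means, which is an assumption for illustration and not necessarily what the CRIC investigators used, and the two-dimensional points stand in for the 72 baseline characteristics. All names and data are hypothetical.

```python
import random


def kmeans(points, k, rng, n_iter=20):
    """Tiny k-means on tuples of floats; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; keep it if the cluster is empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


def consensus_matrix(points, k, n_runs=50, frac=0.8, seed=0):
    """Entry [i][j] is the share of runs, among those where i and j were both
    sampled, in which they were assigned to the same cluster."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(n_runs):
        idx = rng.sample(range(n), max(k, int(frac * n)))
        labels = kmeans([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical patients described by two standardized features, in two loose groups.
patients = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
matrix = consensus_matrix(patients, k=2)
print(matrix[0][1], matrix[0][3])  # within-group pair vs. between-group pair
```

Pairs with consensus values near 1 are stably grouped together across subsamples, which is the property used to argue that the resulting subgroups are not an artifact of a single clustering run.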
In recent years, there have been significant advances in research on ML model interpretability and explainability; these have helped alleviate concerns about "black boxes", especially in neural network ML models [1].
In summary, the time is ripe to investigate the use of ML approaches in nephrology. Applying ML to CKD patients could allow the development of models that better identify patients at risk of CKD progression and improve the management of CKD, especially in primary care settings.

Conflicts of Interest:
We do not have any financial or non-financial potential conflicts of interest.