A Machine Learning-Based Prediction Model for Diabetic Kidney Disease in Korean Patients with Type 2 Diabetes Mellitus
Abstract
1. Introduction
2. Methods
2.1. Study Design and Data Sources
2.2. Data Overview
2.2.1. Data Extraction
2.2.2. Feature Selection and Preprocessing
2.3. ML Models
2.3.1. Logistic Regression Models
2.3.2. Tree-Based Models
2.3.3. Support Vector Machines
2.3.4. Deep Learning Models (Neural Network)
2.4. Hyperparameter Tuning
2.5. Model Evaluation
3. Results
3.1. Demographic Characteristics of Patients from JBUH, AUMC, KHMC, KWMC, Sejong-BCN, and WKUH
3.2. Comparisons of Prediction Model Performance
3.3. Other Performance Metrics of Each Machine Learning Model
3.4. Feature Importance Analysis
4. Discussion
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
AI | artificial intelligence |
AUC | area under the curve |
AUMC | Ajou University Hospital |
CDM | common data model |
CKD | chronic kidney disease |
DKD | diabetic kidney disease |
eGFR | estimated glomerular filtration rate |
EHR | electronic health record |
ESKD | end-stage kidney disease |
HbA1c | hemoglobin A1c |
JBUH | Jeonbuk National University Hospital |
KHMC | Kyunghee University Hospital |
KWMC | Kangwon National University Hospital |
ML | machine learning |
SD | standard deviation |
Sejong-BCN | Bucheon Sejong Hospital |
SVM | support vector machine |
T2DM | type 2 diabetes mellitus |
WKUH | Wonkwang University Hospital |
XGBoost | eXtreme Gradient Boosting |
Feature | Overall (n = 5120) | No DKD (n = 3759) | DKD (n = 1361) |
---|---|---|---|
Sex (male), n (%) | 2125 (42) | 1521 (40) | 604 (44) |
Age (years), mean (SD) | 61.53 (12.42) | 59.16 (12.11) | 68.09 (10.82) |
Systolic blood pressure (mmHg), mean (SD) | 137.87 (22.44) | 137.12 (22.16) | 139.83 (23.05) |
Diastolic blood pressure (mmHg), mean (SD) | 85.52 (15.35) | 85.49 (15.32) | 85.60 (15.45) |
Hypertension, n (%) | 859 (17) | 550 (15) | 309 (23) |
Dyslipidemia, n (%) | 509 (10) | 383 (10) | 126 (9) |
Cardiac diseases, n (%) | 1175 (23) | 776 (21) | 399 (29) |
Stroke, n (%) | 380 (7) | 260 (7) | 120 (9) |
Insulin, n (%) | 725 (14) | 517 (14) | 208 (15) |
ARB or ACEi, n (%) | 741 (14) | 517 (14) | 224 (16) |
Diuretics, n (%) | 349 (7) | 218 (6) | 131 (10) |
Statin, n (%) | 1162 (23) | 835 (22) | 327 (24) |
Creatinine (mg/dL), mean (SD) | 0.75 (0.20) | 0.72 (0.19) | 0.82 (0.20) |
eGFR (mL/min/1.73 m²), mean (SD) | 95.30 (16.81) | 98.97 (15.95) | 85.16 (14.85) |
HbA1c (%), mean (SD) | 7.41 (1.56) | 7.41 (1.55) | 7.42 (1.61) |
Hemoglobin (g/dL), mean (SD) | 13.70 (1.90) | 13.93 (1.84) | 13.05 (1.92) |
Feature | Overall (n = 4193) | No DKD (n = 3246) | DKD (n = 947) |
---|---|---|---|
Sex (male), n (%) | 1633 (39) | 1255 (39) | 378 (40) |
Age (years), mean (SD) | 60.16 (12.58) | 57.88 (12.10) | 67.99 (10.97) |
Systolic blood pressure (mmHg), mean (SD) | 134.65 (19.86) | 133.52 (19.44) | 138.50 (20.76) |
Diastolic blood pressure (mmHg), mean (SD) | 81.88 (12.06) | 81.71 (11.99) | 82.44 (12.30) |
Hypertension, n (%) | 1471 (35) | 1027 (32) | 444 (47) |
Dyslipidemia, n (%) | 1076 (26) | 771 (24) | 305 (32) |
Cardiac diseases, n (%) | 1895 (45) | 1325 (41) | 570 (60) |
Stroke, n (%) | 476 (11) | 305 (9) | 171 (18) |
Insulin, n (%) | 143 (3) | 105 (3) | 38 (4) |
ARB or ACEi, n (%) | 1044 (25) | 752 (23) | 292 (31) |
Diuretics, n (%) | 413 (10) | 273 (8) | 140 (15) |
Statin, n (%) | 413 (10) | 273 (8) | 140 (15) |
Creatinine (mg/dL), mean (SD) | 0.80 (0.18) | 0.78 (0.17) | 0.87 (0.18) |
eGFR (mL/min/1.73 m²), mean (SD) | 93.64 (16.60) | 96.98 (15.73) | 82.20 (14.28) |
HbA1c (%), mean (SD) | 6.74 (1.45) | 6.76 (1.46) | 6.70 (1.45) |
Hemoglobin (g/dL), mean (SD) | 14.09 (1.71) | 14.24 (1.67) | 13.61 (1.76) |
Model | AUC ROC | Sensitivity | Specificity | Accuracy | Precision | F1 Score |
---|---|---|---|---|---|---|
Lasso | 0.7977 | 0.6985 | 0.7726 | 0.7529 | 0.5263 | 0.6003 |
Ridge | 0.7971 | 0.7022 | 0.7646 | 0.7480 | 0.5190 | 0.5969 |
Elastic Net | 0.7978 | 0.7022 | 0.7673 | 0.7500 | 0.5219 | 0.5988 |
Random Forest | 0.8019 | 0.7647 | 0.7354 | 0.7432 | 0.5111 | 0.6127 |
XGBoost | 0.8099 | 0.7316 | 0.7566 | 0.7500 | 0.5209 | 0.6085 |
Neural Network | 0.7393 | 0.4081 | 0.8564 | 0.7373 | 0.5068 | 0.4521 |
SVC | 0.7978 | 0.7243 | 0.7566 | 0.7480 | 0.5184 | 0.6043 |
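The F1 score in the table above is the harmonic mean of precision and sensitivity (recall), so each row can be checked for internal consistency. The following is a minimal sketch of that check, not the authors' evaluation code: the numbers are copied from the table, while the names `REPORTED`, `f1_score`, and `check_table` and the rounding tolerance are assumptions introduced here for illustration.

```python
# Consistency check: F1 = 2 * precision * sensitivity / (precision + sensitivity).
# Values are (precision, sensitivity, reported F1) copied from the table above.
# All names and the tolerance below are illustrative assumptions.
REPORTED = {
    "Lasso": (0.5263, 0.6985, 0.6003),
    "Ridge": (0.5190, 0.7022, 0.5969),
    "Elastic Net": (0.5219, 0.7022, 0.5988),
    "Random Forest": (0.5111, 0.7647, 0.6127),
    "XGBoost": (0.5209, 0.7316, 0.6085),
    "Neural Network": (0.5068, 0.4081, 0.4521),
    "SVC": (0.5184, 0.7243, 0.6043),
}


def f1_score(precision: float, sensitivity: float) -> float:
    """Return the harmonic mean of precision and sensitivity."""
    if precision + sensitivity == 0:
        return 0.0
    return 2 * precision * sensitivity / (precision + sensitivity)


def check_table(rows: dict, tol: float = 5e-4) -> dict:
    """Map each model name to True if the recomputed F1 matches the reported one.

    The tolerance allows for the four-decimal rounding of the table entries.
    """
    return {
        name: abs(f1_score(p, r) - f1) <= tol
        for name, (p, r, f1) in rows.items()
    }


if __name__ == "__main__":
    for name, ok in check_table(REPORTED).items():
        p, r, f1 = REPORTED[name]
        print(f"{name}: recomputed {f1_score(p, r):.4f}, reported {f1:.4f}, match={ok}")
```

The same pattern extends to other derived metrics if the underlying confusion-matrix counts are available; those counts are not given in the table, so they are left out here.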
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lee, K.A.; Kim, J.S.; Kim, Y.J.; Goak, I.S.; Jin, H.Y.; Park, S.; Kang, H.; Park, T.S. A Machine Learning-Based Prediction Model for Diabetic Kidney Disease in Korean Patients with Type 2 Diabetes Mellitus. J. Clin. Med. 2025, 14, 2065. https://doi.org/10.3390/jcm14062065