Predicting Acute Kidney Injury: A Machine Learning Approach Using Electronic Health Records
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Design and Setting
2.2. Workflow
2.3. Data Sources
2.4. Cohort Entry Criteria
2.5. Input Features
2.6. Outcome: Identification of AKI
2.7. Data Preprocessing
2.8. Analysis Using Machine Learning Techniques
2.8.1. Ensemble Methods
Support Vector Machine
Decision Tree
2.8.2. Logistic Regression
2.8.3. XGBoost
2.9. Tools and Technologies
3. Results
3.1. Cohort Characteristics
3.2. Classification Results
4. Discussion
5. Limitations and Future Work
6. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Appendix A
Data Source | Description | Study Purpose |
---|---|---|
Canadian Institute for Health Information Discharge Abstract Database and National Ambulatory Care Reporting System | The Canadian Institute for Health Information Discharge Abstract Database and National Ambulatory Care Reporting System collect diagnostic and procedural variables for inpatient stays and emergency department (ED) visits, respectively. Diagnostic and inpatient procedural coding uses the Canadian modification of the International Classification of Diseases, 10th Revision (after 2002). | Cohort creation, description, exposure and outcome estimation |
Ontario Drug Benefits | The Ontario Drug Benefits database includes a wide range of outpatient prescription medications available to all Ontario citizens over the age of 65. The error rate in the Ontario Drug Benefits database is less than 1%. | Medication prescriptions, description and exposure |
Registered Persons Database | The Registered Persons Database captures demographic (sex, date of birth, postal code) and vital status information on all Ontario residents. Relative to the Canadian Institute for Health Information Discharge Abstract Database in-hospital death flag, the Registered Persons Database has a sensitivity of 94% and a positive predictive value of 100%. | Cohort creation, description and exposure |
Ontario Health Insurance Plan | The Ontario Health Insurance Plan database contains information on Ontario physician billing claims for medical services using fee and diagnosis codes outlined in the Ontario Health Insurance Plan Schedule of Benefits. These codes capture information on outpatient, inpatient and laboratory services rendered to a patient. | Cohort creation, stratification, description, exposure and outcome |
Variable | Database | Code Set | Code |
---|---|---|---|
Major cancer | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 9th Revision | 150, 154, 155, 157, 162, 174, 175, 185, 203, 204, 205, 206, 207, 208, 2303, 2304, 2307, 2330, 2312, 2334 |
Major cancer | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 10th Revision | 971, 980, 982, 984, 985, 986, 987, 988, 989, 990, 991, 993, C15, C18, C19, C20, C22, C25, C34, C50, C56, C61, C82, C83, C85, C91, C92, C93, C94, C95, D00, D010, D011, D012, D022, D075, D05 |
Major cancer | Ontario Health Insurance Plan | Diagnosis | 203, 204, 205, 206, 207, 208, 150, 154, 155, 157, 162, 174, 175, 183, 185 |
Chronic liver disease | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 9th Revision | 4561, 4562, 070, 5722, 5723, 5724, 5728, 573, 7824, V026, 571, 2750, 2751, 7891, 7895 |
Chronic liver disease | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 10th Revision | B16, B17, B18, B19, I85, R17, R18, R160, R162, B942, Z225, E831, E830, K70, K713, K714, K715, K717, K721, K729, K73, K74, K753, K754, K758, K759, K76, K77 |
Chronic liver disease | Ontario Health Insurance Plan | Diagnosis | 571, 573, 070 |
Chronic liver disease | Ontario Health Insurance Plan | Fee code | Z551, Z554 |
Coronary artery disease (excluding angina) | Canadian Institute for Health Information Discharge Abstract Database | Canadian Classification of Diagnostic, Therapeutic and Surgical Procedures | 4801, 4802, 4803, 4804, 4805, 481, 482, 483 |
Coronary artery disease (excluding angina) | Canadian Institute for Health Information Discharge Abstract Database | Canadian Classification of Health Interventions | 1IJ50, 1IJ76 |
Coronary artery disease (excluding angina) | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 9th Revision | 412, 410, 411 |
Coronary artery disease (excluding angina) | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 10th Revision | I21, I22, Z955, T822 |
Coronary artery disease (excluding angina) | Ontario Health Insurance Plan | Diagnosis | 410, 412 |
Coronary artery disease (excluding angina) | Ontario Health Insurance Plan | Fee code | R741, R742, R743, G298, E646, E651, E652, E654, E655, Z434, Z448 |
Diabetes | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 9th Revision | 250 |
Diabetes | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 10th Revision | E10, E11, E13, E14 |
Diabetes | Ontario Health Insurance Plan | Diagnosis | 250 |
Diabetes | Ontario Health Insurance Plan | Fee code | Q040, K029, K030, K045, K046 |
Heart failure | Canadian Institute for Health Information Discharge Abstract Database | Canadian Classification of Diagnostic, Therapeutic and Surgical Procedures | 4961, 4962, 4963, 4964 |
Heart failure | Canadian Institute for Health Information Discharge Abstract Database | Canadian Classification of Health Interventions | 1HP53, 1HP55, 1HZ53GRFR, 1HZ53LAFR, 1HZ53SYFR |
Heart failure | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 9th Revision | I500, I501, I509, I255, J81 |
Heart failure | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 10th Revision | I21, I22, Z955, T822 |
Heart failure | Ontario Health Insurance Plan | Diagnosis | 428 |
Heart failure | Ontario Health Insurance Plan | Fee code | R701, R702, Z429 |
Hypertension | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 9th Revision | 401, 402, 403, 404, 405 |
Hypertension | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 10th Revision | I10, I11, I12, I13, I15 |
Hypertension | Ontario Health Insurance Plan | Diagnosis | 401, 402, 403 |
Kidney stones | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 9th Revision | 5920, 5921, 5929, 5940, 5941, 5942, 5948, 5949, 27411 |
Kidney stones | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 10th Revision | N200, N201, N202, N209, N210, N211, N218, N219, N220, N228 |
Peripheral vascular disease | Canadian Institute for Health Information Discharge Abstract Database | Canadian Classification of Diagnostic, Therapeutic and Surgical Procedures | 5125, 5129, 5014, 5016, 5018, 5028, 5038, 5126, 5159 |
Peripheral vascular disease | Canadian Institute for Health Information Discharge Abstract Database | Canadian Classification of Health Interventions | 1KA76, 1KA50, 1KE76, 1KG50, 1KG57, 1KG76MI, 1KG87, 1IA87LA, 1IB87LA, 1IC87LA, 1ID87LA, 1KA87LA, 1KE57 |
Peripheral vascular disease | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 9th Revision | 4402, 4408, 4409, 5571, 4439, 444 |
Peripheral vascular disease | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 10th Revision | I700, I702, I708, I709, I731, I738, I739, K551 |
Peripheral vascular disease | Ontario Health Insurance Plan | Fee code | R787, R780, R797, R804, R809, R875, R815, R936, R783, R784, R785, E626, R814, R786, R937, R860, R861, R855, R856, R933, R934, R791, E672, R794, R813, R867, E649 |
Cerebrovascular disease (stroke or transient ischemic attack) | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 9th Revision | 430, 431, 432, 4340, 4341, 4349, 435, 436, 3623 |
Cerebrovascular disease (stroke or transient ischemic attack) | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 10th Revision | I62, I630, I631, I632, I633, I634, I635, I638, I639, I64, H341, I600, I601, I602, I603, I604, I605, I606, I607, I609, I61, G450, G451, G452, G453, G458, G459, H340 |
Chronic kidney disease | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 9th Revision | 4030, 4031, 4039, 4040, 4041, 4049, 585, 586, 5888, 5889, 2504 |
Chronic kidney disease | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 10th Revision | E102, E112, E132, E142, I12, I13, N08, N18, N19 |
Chronic kidney disease | Ontario Health Insurance Plan | Diagnosis | 403, 585 |
Variable | Database | Code Set | Code |
---|---|---|---|
Dialysis | Canadian Institute for Health Information Discharge Abstract Database | Canadian Classification of Diagnostic, Therapeutic and Surgical Procedures | 5127, 5142, 5143, 5195, 6698 |
Dialysis | Canadian Institute for Health Information Discharge Abstract Database | Canadian Classification of Health Interventions | 1PZ21, 1OT53DATS, 1OT53HATS, 1OT53LATS, 1SY55LAFT, 7SC59QD, 1KY76, 1KG76MZXXA, 1KG76MZXXN, 1JM76NC, 1JM76NCXXN |
Dialysis | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 9th Revision | V451, V560, V568, 99673 |
Dialysis | Canadian Institute for Health Information Discharge Abstract Database | International Classification of Diseases 10th Revision | T824, Y602, Y612, Y622, Y841, Z49, Z992 |
Dialysis | Ontario Health Insurance Plan | Fee code | R850, G324, G336, G327, G862, G865, G099, R825, R826, R827, R833, R840, R841, R843, R848, R851, R946, R943, R944, R945, R941, R942, Z450, Z451, Z452, G864, R852, R853, R854, R885, G333, H540, H740, R849, G323, G325, G326, G860, G863, G866, G330, G331, G332, G861, G082, G083, G085, G090, G091, G092, G093, G094, G095, G096, G294, G295 |
Kidney transplant | Canadian Institute for Health Information Discharge Abstract Database | Canadian Classification of Health Interventions | 1PC85 |
Kidney transplant | Ontario Health Insurance Plan | Fee code | S435, S434 |
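The code-set tables above lend themselves to a simple lookup. The following is a minimal Python sketch of how a patient's recorded diagnosis codes might be checked against a few of the ICD-10 code sets listed above. The prefix-matching rule, the `normalize` helper, and the function and dictionary names are all assumptions made for illustration; the tables do not state how the authors matched codes.

```python
from __future__ import annotations

from typing import Iterable

# ICD-10 code sets copied from the tables above (a small subset, for illustration).
ICD10_CODE_SETS: dict[str, tuple[str, ...]] = {
    "diabetes": ("E10", "E11", "E13", "E14"),
    "hypertension": ("I10", "I11", "I12", "I13", "I15"),
    "chronic_kidney_disease": (
        "E102", "E112", "E132", "E142", "I12", "I13", "N08", "N18", "N19",
    ),
}


def normalize(code: str) -> str:
    """Upper-case a code and drop dots and spaces, e.g. 'e11.9' -> 'E119'."""
    return code.replace(".", "").replace(" ", "").upper()


def flag_comorbidities(codes: Iterable[str]) -> set[str]:
    """Return the conditions whose code set contains a prefix of any recorded code.

    Prefix matching is an assumption of this sketch, not a rule taken from the tables.
    """
    cleaned = [normalize(code) for code in codes]
    return {
        condition
        for condition, prefixes in ICD10_CODE_SETS.items()
        # str.startswith accepts a tuple and returns True if any prefix matches.
        if any(code.startswith(prefixes) for code in cleaned)
    }
```

Under these assumptions, a record carrying `E11.9` and `I12` would be flagged for diabetes, hypertension and chronic kidney disease, because `I12` appears in both the hypertension and the chronic kidney disease code sets.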
Characteristics (patients admitted to hospital or who visited the emergency department) | Total Patients | AKI | No AKI |
---|---|---|---|
Cohort size | 905,442 | 5993 | 899,449 |
Age, yr | |||
65 to <70 | 181,088 (20%) | 589 | 180,499 |
70 to <80 | 371,231 (41%) | 1911 | 369,320 |
80 to <90 | 269,147 (30%) | 2485 | 269,147 |
≥90 | 81,489 (9%) | 1008 | 80,481 |
Sex | |||
Women | 507,047 (56%) | 2901 | 504,146 |
Year of cohort entry (index date) | |||
2014–2015 | 588,537 (65%) | 3987 | 584,550 |
2015–2016 | 316,904 (34%) | 2006 | 314,898 |
Location | |||
Rural residence | 144,870 (16%) | 501 | 144,369 |
LTC | |||
Long-term care | 36,217 (4%) | 745 | 35,472 |
Income Quintile | |||
1 (lowest) | 172,035 (19%) | 1306 | 170,729 |
2 | 189,143 (21%) | 1318 | 187,825 |
3 | 182,588 (20%) | 1173 | 181,415 |
4 | 181,086 (20%) | 1154 | 179,932 |
5 (highest) | 180,590 (20%) | 1043 | 179,547 |
Comorbid conditions (by codes) | |||
Hypertension | 814,604 (88%) | 5784 | 808,820 |
Diabetes | 358,472 (38%) | 3306 | 355,166 |
Heart failure | 125,136 (14%) | 1821 | 123,315 |
Coronary artery disease | 239,437 (26%) | 2005 | 237,432 |
Chronic liver disease | 33,359 (4%) | 297 | 33,062 |
Cancer | 145,286 (16%) | 1016 | 144,270 |
Chronic kidney disease | 86,442 (9%) | 1854 | 84,588 |
Kidney stones | 12,457 (1%) | 93 | 12,364 |
Peripheral vascular disease | 13,197 (2%) | 158 | 13,039 |
Cerebrovascular disease | 25,835 (3%) | 282 | 25,553 |
Hospital Diagnosis Codes | |||
Disorders of fluid, electrolyte and acid-base balance (E87) | 13,563 (1%) | 962 | 12,601 |
Delirium (F05) | 4996 (1%) | 342 | 4654 |
Atrial fibrillation (I48.91) | 34,120 (4%) | 1978 | 32,142 |
Mycoplasma pneumoniae (B96) | 6197 (1%) | 434 | 5763 |
Anaemia (D64.9) | 11,814 (1%) | 791 | 11,023 |
Valve disorders (I35) | 1261 (1%) | 186 | 1075 |
Fracture of femur (S72) | 7263 (1%) | 231 | 7032 |
Atherosclerotic cardiovascular disease (I25.10) | 21,472 (2%) | 1256 | 20,216 |
Volume depletion (E86.9) | 3739 (1%) | 240 | 3499 |
Diseases of the digestive system (K00-K95) | 4552 (1%) | 264 | 4288 |
Abnormal functions of organs and systems (R94.8) | 11,348 (2%) | 725 | 10,623 |
Chronic pulmonary (J81.1) | 24,217 (3%) | 971 | 23,246 |
Hyperplasia of prostate (N40.1) | 5047 (1%) | 153 | 4894 |
Certain infectious and parasitic diseases (A00-B99) | 1191 (1%) | 105 | 1086 |
Dementia (F03.90) | 8714 (1%) | 390 | 8324 |
Glomerular disorders (N08) | 3988 (1%) | 569 | 3419 |
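The counts in the cohort table can be turned into crude proportions to make the class imbalance explicit. The sketch below, in Python, uses only the cohort size, the AKI count and the age rows copied from the table; the function names are hypothetical, and the proportions are unadjusted descriptive figures rather than results reported by the authors.

```python
# Counts copied from the cohort table above: (total patients, AKI cases).
AGE_BANDS = {
    "65 to <70": (181_088, 589),
    "70 to <80": (371_231, 1_911),
    "80 to <90": (269_147, 2_485),
    ">=90": (81_489, 1_008),
}
COHORT_SIZE = 905_442
AKI_CASES = 5_993


def crude_aki_proportion(total: int, aki: int) -> float:
    """Unadjusted share of patients in a group who developed AKI."""
    return aki / total


def imbalance_ratio(cohort_size: int, positives: int) -> float:
    """Number of negative (no AKI) patients per positive (AKI) patient."""
    return (cohort_size - positives) / positives


if __name__ == "__main__":
    for band, (total, aki) in AGE_BANDS.items():
        print(f"{band}: {100 * crude_aki_proportion(total, aki):.2f}% with AKI")
    ratio = imbalance_ratio(COHORT_SIZE, AKI_CASES)
    print(f"Patients without AKI per patient with AKI: {ratio:.0f}")
```

With the cohort counts above, the ratio works out to roughly 150 patients without AKI for each patient with AKI, which is the kind of imbalance that resampling-based ensembles such as SMOTEBoost, UnderBagging and RUSBoost are designed to handle.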
Ensemble-Based Methods | Machine Learning Techniques | Sensitivity | Specificity | AUROC |
---|---|---|---|---|
NA | Logistic Regression | 0.79 | 0.72 | 0.77 ± 0.038 |
SMOTEBoost | Classification and Regression Trees (CART) | 0.77 | 0.69 | 0.74 ± 0.039 |
SMOTEBoost | C5.0 | 0.84 | 0.78 | 0.83 ± 0.036 |
SMOTEBoost | NB (Naïve Bayes) | 0.61 | 0.89 | 0.75 ± 0.038 |
SMOTEBoost | Support Vector Machine (SVM) (linear) | 0.84 | 0.74 | 0.79 ± 0.035 |
SMOTEBoost | SVM (polynomial) | 0.78 | 0.82 | 0.81 ± 0.033 |
SMOTEBoost | SVM (sigmoid) | 0.76 | 0.85 | 0.84 ± 0.035 |
SMOTEBoost | SVM (radial) | 0.70 | 0.83 | 0.82 ± 0.034 |
SMOTE-Bagging | CART | 0.60 | 0.71 | 0.68 ± 0.041 |
SMOTE-Bagging | C5.0 | 0.62 | 0.84 | 0.79 ± 0.036 |
SMOTE-Bagging | NB | 0.69 | 0.73 | 0.72 ± 0.039 |
SMOTE-Bagging | SVM (linear) | 0.76 | 0.84 | 0.81 ± 0.031 |
SMOTE-Bagging | SVM (polynomial) | 0.82 | 0.73 | 0.80 ± 0.033 |
SMOTE-Bagging | SVM (sigmoid) | 0.84 | 0.71 | 0.81 ± 0.030 |
SMOTE-Bagging | SVM (radial) | 0.90 | 0.74 | 0.86 ± 0.029 |
UnderBagging | CART | 0.71 | 0.83 | 0.79 ± 0.035 |
UnderBagging | C5.0 | 0.88 | 0.76 | 0.85 ± 0.032 |
UnderBagging | NB | 0.58 | 0.72 | 0.61 ± 0.041 |
UnderBagging | SVM (linear) | 0.77 | 0.84 | 0.83 ± 0.035 |
UnderBagging | SVM (polynomial) | 0.85 | 0.71 | 0.84 ± 0.037 |
UnderBagging | SVM (sigmoid) | 0.89 | 0.71 | 0.85 ± 0.034 |
UnderBagging | SVM (radial) | 0.79 | 0.90 | 0.86 ± 0.033 |
RUSBoost | CART | 0.78 | 0.74 | 0.76 ± 0.039 |
RUSBoost | C5.0 | 0.84 | 0.77 | 0.82 ± 0.028 |
RUSBoost | NB | 0.68 | 0.72 | 0.71 ± 0.039 |
RUSBoost | SVM (linear) | 0.84 | 0.78 | 0.83 ± 0.035 |
RUSBoost | SVM (polynomial) | 0.74 | 0.85 | 0.82 ± 0.037 |
RUSBoost | SVM (sigmoid) | 0.90 | 0.79 | 0.88 ± 0.029 |
RUSBoost | SVM (radial) | 0.71 | 0.87 | 0.85 ± 0.034 |
XGBoost | Tree boosting | 0.89 | 0.81 | 0.88 ± 0.031 |
XGBoost | Linear boosting | 0.86 | 0.77 | 0.84 ± 0.033 |
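To make the sensitivity and specificity columns concrete, the Python sketch below shows the standard definitions in terms of confusion-matrix counts, plus the positive predictive value implied by a given sensitivity, specificity and prevalence (Bayes' rule). The function names are hypothetical. Applying the tabulated values at the whole-cohort AKI prevalence is an assumption of this example, since the table does not state the class mix of the evaluation data.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of actual AKI cases that the classifier flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of non-AKI patients that the classifier correctly leaves unflagged."""
    return true_neg / (true_neg + false_pos)


def ppv_at_prevalence(sens: float, spec: float, prevalence: float) -> float:
    """Positive predictive value implied by sensitivity, specificity and prevalence."""
    flagged_true = sens * prevalence                 # true positives per patient
    flagged_false = (1 - spec) * (1 - prevalence)    # false positives per patient
    return flagged_true / (flagged_true + flagged_false)


if __name__ == "__main__":
    # XGBoost tree boosting row from the table above; prevalence from the cohort counts.
    # ASSUMPTION: the reported metrics carry over unchanged to this prevalence.
    prevalence = 5_993 / 905_442
    ppv = ppv_at_prevalence(sens=0.89, spec=0.81, prevalence=prevalence)
    print(f"Implied PPV at cohort prevalence: {ppv:.3f}")
```

Because AKI is rare in this cohort, even a classifier with high sensitivity and specificity would produce many more false alarms than true alerts under this assumption, which is why the two metrics are best read together with prevalence.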
Intervals | Readmission with AKI |
---|---|
1–3 days | 415 |
4–7 days | 534 |
8–14 days | 888 |
15–30 days | 1517 |
31–60 days | 3579 |
61–90 days | 1499 |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Abdullah, S.S.; Rostamzadeh, N.; Sedig, K.; Garg, A.X.; McArthur, E. Predicting Acute Kidney Injury: A Machine Learning Approach Using Electronic Health Records. Information 2020, 11, 386. https://doi.org/10.3390/info11080386