Clinical Context Is More Important than Data Quantity to the Performance of an Artificial Intelligence-Based Early Warning System
Abstract
1. Introduction
2. Materials and Methods
3. Results
4. Discussion
Limitations
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Characteristic | Overall (n = 24,359) | High-CCI (n = 12,139) | Moderate/Low-CCI (n = 12,220) | p-Value
---|---|---|---|---
Age, median ± IQR, yr | 69.0 ± 22.0 | 78.0 ± 14.0 | 57.0 ± 23.0 | <0.001
Sex, n (%): F | 12,303 (50.5) | 5456 (44.9) | 6847 (56.0) | <0.001
Sex, n (%): M | 12,056 (49.5) | 6683 (55.1) | 5373 (44.0) |
BMI, median ± IQR, kg/m² | 23.67 ± 5.2 | 22.94 ± 5.0 | 24.28 ± 5.1 | <0.001
DBP, median ± IQR, mmHg | 78.0 ± 12.0 | 75.0 ± 12.0 | 80.0 ± 15.0 | <0.001
Pulse, median ± IQR | 78.0 ± 18.0 | 79.0 ± 19.0 | 78.0 ± 18.0 | 0.006
Respiration, median ± IQR | 20.0 ± 2.0 | 20.0 ± 2.0 | 20.0 ± 2.0 | <0.001
SBP, median ± IQR, mmHg | 125.0 ± 29.0 | 127.0 ± 28.0 | 123.0 ± 27.0 | <0.001
SpO₂ (%), median ± IQR | 97.0 ± 2.0 | 97.0 ± 3.0 | 97.0 ± 2.0 | <0.001
Temperature, median ± IQR, °C | 36.8 ± 0.6 | 36.8 ± 0.5 | 36.8 ± 0.5 | <0.001
Missing laboratory values, n (%) | | | |
Total bilirubin | 5550 (22.78) | 2048 (16.87) | 3502 (28.66) | <0.001
Lactate | 24,038 (98.68) | 11,894 (97.98) | 12,144 (99.38) | <0.001
pH | 20,956 (86.03) | 9937 (81.86) | 11,019 (90.17) | <0.001
Sodium | 5039 (20.69) | 1723 (14.19) | 3316 (27.14) | <0.001
Potassium | 5045 (20.71) | 1725 (14.21) | 3320 (27.17) | <0.001
Creatinine | 4926 (20.22) | 1686 (13.89) | 3240 (26.51) | <0.001
Hematocrit | 3300 (13.55) | 1551 (12.78) | 1749 (14.31) | <0.001
White blood cell count | 3303 (13.56) | 1554 (12.80) | 1749 (14.31) | 0.001
HCO₃⁻ | 20,956 (86.03) | 9937 (81.86) | 11,019 (90.17) | <0.001
Platelet | 3300 (13.55) | 1551 (12.78) | 1749 (14.31) | <0.001
C-reactive protein | 6832 (28.05) | 2667 (21.97) | 4165 (34.08) | <0.001
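
The missingness percentages in the table follow directly from the counts and group sizes, and the between-group difference can be checked with a simple contingency test. The sketch below reproduces the total bilirubin row. It assumes a Pearson chi-square test without continuity correction, which is an illustrative choice; this page does not state which test the authors used. The names `missing_pct` and `chi2_2x2` are hypothetical helpers, not part of the study's code.

```python
import math

# Counts copied from the "Total bilirubin" row of the table above.
GROUP_SIZES = {"high_cci": 12139, "moderate_low_cci": 12220}
MISSING_BILIRUBIN = {"high_cci": 2048, "moderate_low_cci": 3502}


def missing_pct(missing: int, total: int) -> float:
    """Share of patients with a missing value, in percent."""
    return 100.0 * missing / total


def chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    ASSUMPTION: the choice of test is illustrative, not taken from the paper.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


pct_high = missing_pct(MISSING_BILIRUBIN["high_cci"], GROUP_SIZES["high_cci"])
pct_low = missing_pct(
    MISSING_BILIRUBIN["moderate_low_cci"], GROUP_SIZES["moderate_low_cci"]
)
pct_overall = missing_pct(
    sum(MISSING_BILIRUBIN.values()), sum(GROUP_SIZES.values())
)

# Rows: group; columns: missing vs. observed.
chi2, p_value = chi2_2x2(
    MISSING_BILIRUBIN["high_cci"],
    GROUP_SIZES["high_cci"] - MISSING_BILIRUBIN["high_cci"],
    MISSING_BILIRUBIN["moderate_low_cci"],
    GROUP_SIZES["moderate_low_cci"] - MISSING_BILIRUBIN["moderate_low_cci"],
)

print(f"High-CCI missing: {pct_high:.2f}%")
print(f"Moderate/Low-CCI missing: {pct_low:.2f}%")
print(f"Overall missing: {pct_overall:.2f}%")
print(f"chi2 = {chi2:.1f}, p < 0.001: {p_value < 0.001}")
```

Under these assumptions the recomputed percentages match the tabulated 16.87%, 28.66%, and 22.78%, and the p-value falls below the 0.001 threshold reported in the table.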
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Sim, T.; Cho, E.; Kim, J.; Kim, H.G.; Kim, S.-J. Clinical Context Is More Important than Data Quantity to the Performance of an Artificial Intelligence-Based Early Warning System. J. Clin. Med. 2025, 14, 4444. https://doi.org/10.3390/jcm14134444