Impact of AI-Based Clinical Decision Support Systems on Diagnostic Accuracy Among Healthcare Professionals: A Systematic Review and Meta-Analysis of Randomized Controlled Trials
Abstract
1. Introduction
1.1. Diagnostic Error: The Clinical Problem
1.2. From Rule-Based CDSS to AI-CDSS: A Technological Evolution
1.3. Evidence Gap and Rationale
1.4. Research Objectives and Research Questions
- RQ1: Does AI-CDSS use significantly improve diagnostic accuracy among healthcare professionals compared with standard care, as measured in RCTs?
- RQ2: Do the effects of AI-CDSS on diagnostic accuracy differ by AI system architecture (deep learning vs. machine learning)?
- RQ3: Do the effects differ by clinical specialty (radiology vs. emergency medicine vs. general medicine)?
- RQ4: What is the certainty of the available evidence according to the GRADE framework?
2. Methods
2.1. Search Strategy and Eligibility Criteria
2.2. Study Selection Procedure
2.3. Data Extraction and Risk of Bias Assessment
2.4. Statistical Analysis
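The reference list cites the DerSimonian and Laird random-effects method and the Higgins I² statistic, and the GRADE summary reports a pooled SMD with an I² value. Assuming the pooling followed that standard estimator (the text available here does not spell it out), a minimal sketch might look like the following. The function name and the example inputs are hypothetical and are not data from the included trials.

```python
import numpy as np

def dersimonian_laird(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator."""
    y = np.asarray(effects, dtype=float)
    v = np.asarray(variances, dtype=float)

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = 1.0 / v
    fixed = np.sum(w * y) / np.sum(w)

    # Cochran's Q and the method-of-moments between-study variance.
    q = np.sum(w * (y - fixed) ** 2)
    df = len(y) - 1
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = 1.0 / (v + tau2)
    pooled = np.sum(w_star * y) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))

    # I^2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    return {
        "pooled": float(pooled),
        "se": float(se),
        "ci95": (float(pooled - 1.96 * se), float(pooled + 1.96 * se)),
        "tau2": float(tau2),
        "q": float(q),
        "i2": float(i2),
    }

# Hypothetical standardized mean differences and their variances.
result = dersimonian_laird([0.0, 2.0], [1.0, 1.0])
print(result)
```

When the studies agree exactly, Q is zero, tau² is truncated at zero, and the estimate reduces to the fixed-effect inverse-variance average.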
2.5. Technical Overview of AI Architectures in Included Studies
- Yun et al. (2023) [46]: Lunit INSIGHT CXR—a convolutional neural network (CNN)-based deep learning system using a ResNet-50 backbone with an attention mechanism, trained on >100,000 chest radiographs for detection of 10 major thoracic abnormalities. Output: probability scores per finding class.
- Nam et al. (2023) [47]: Deep learning algorithm for chest radiograph abnormality detection using a DenseNet-121 architecture with a multi-label classification head, trained on the CheXpert dataset (224,316 images) and validated on an external Korean dataset.
- Hwang et al. (2023) [48]: Emergency chest radiograph AI model using EfficientNet-B7 with transfer learning, optimized for pneumothorax, pleural effusion, and consolidation detection; real-time inference < 2 s per image.
- Harada et al. (2021) [49]: AI-based differential diagnosis support system using gradient boosting machine learning (XGBoost) trained on electronic health record (EHR) structured data (symptoms, laboratory values, vital signs); output: ranked differential diagnosis list.
- Homayounieh et al. (2021) [50]: AI-based chest X-ray model for pulmonary nodule detection using a 3D CNN with volumetric analysis, trained on the LIDC-IDRI dataset and validated across multinational sites (USA, Iran, India).
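The overview above describes two output styles: per-finding probability scores from a multi-label classification head, and a ranked list of candidates. The sketch below illustrates the general idea only and is not any vendor's implementation. The label names, logits, and function names are hypothetical. In a multi-label head each finding gets its own sigmoid, so the probabilities are independent and need not sum to 1.

```python
import numpy as np

# Hypothetical finding labels, chosen for illustration only.
FINDINGS = ["pneumothorax", "pleural_effusion", "consolidation"]

def finding_probabilities(logits):
    """Map one logit per finding to an independent probability via a sigmoid."""
    logits = np.asarray(logits, dtype=float)
    return 1.0 / (1.0 + np.exp(-logits))

def ranked_findings(logits, labels=FINDINGS):
    """Return (label, probability) pairs sorted from most to least likely."""
    probs = finding_probabilities(logits)
    order = np.argsort(-probs)  # indices in descending probability order
    return [(labels[i], float(probs[i])) for i in order]

# Made-up logits standing in for the output of an image backbone.
example_logits = [2.0, -1.0, 0.0]
for label, prob in ranked_findings(example_logits):
    print(f"{label}: {prob:.3f}")
```

A softmax would instead force the classes to compete, which suits a single-diagnosis task but not an image that can show several abnormalities at once.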
3. Results
3.1. Study Selection Results
3.2. Characteristics of Included Studies
3.3. Risk of Bias Assessment
3.4. Primary Meta-Analysis: Effect of AI-CDSS on Diagnostic Accuracy
3.5. Subgroup Analyses
3.6. Sensitivity Analysis and Publication Bias
3.7. GRADE Assessment and Certainty of Evidence
4. Discussion
4.1. Principal Findings
4.2. Interpretation of Results
4.3. Comparison with Existing Literature
4.4. Strengths and Limitations
4.5. Clinical Implications
4.6. Future Research Directions
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Singh, H.; Meyer, A.N.; Thomas, E.J. The frequency of diagnostic errors in outpatient care: Estimations from three large observational studies involving US adult populations. BMJ Qual. Saf. 2014, 23, 727–731.
- Graber, M.L.; Franklin, N.; Gordon, R. Diagnostic error in internal medicine. Arch. Intern. Med. 2005, 165, 1493–1499.
- Berner, E.S.; Graber, M.L. Overconfidence as a cause of diagnostic error in medicine. Am. J. Med. 2008, 121, S2–S23.
- Croskerry, P. The importance of cognitive errors in diagnosis and strategies to minimize them. Acad. Med. 2003, 78, 775–780.
- Sutton, R.T.; Pincock, D.; Baumgart, D.C.; Sadowski, D.C.; Fedorak, R.N.; Kroeker, K.I. An overview of clinical decision support systems: Benefits, risks, and strategies for success. npj Digit. Med. 2020, 3, 17.
- Shortliffe, E.H.; Sepúlveda, M.J. Clinical decision support in the era of artificial intelligence. JAMA 2018, 320, 2199–2200.
- Topol, E.J. High-performance medicine: The convergence of human and artificial intelligence. Nat. Med. 2019, 25, 44–56.
- Yu, K.H.; Beam, A.L.; Kohane, I.S. Artificial intelligence in healthcare. Nat. Biomed. Eng. 2018, 2, 719–731.
- Rajkomar, A.; Dean, J.; Kohane, I. Machine learning in medicine. N. Engl. J. Med. 2019, 380, 1347–1358.
- Saposnik, G.; Redelmeier, D.; Ruff, C.C.; Tobler, P.N. Cognitive biases associated with medical decisions: A systematic review. BMC Med. Inform. Decis. Mak. 2016, 16, 138.
- Litjens, G.; Kooi, T.; Bejnordi, B.E.; Setio, A.A.A.; Ciompi, F.; Ghafoorian, M.; Van Der Laak, J.A.; Van Ginneken, B.; Sánchez, C.I. A survey on deep learning in medical image analysis. Med. Image Anal. 2017, 42, 60–88.
- Liu, X.; Faes, L.; Kale, A.U.; Wagner, S.K.; Fu, D.J.; Bruynseels, A.; Mahendiran, T.; Moraes, G.; Shamdas, M.; Kern, C.; et al. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: A systematic review and meta-analysis. Lancet Digit. Health 2019, 1, e271–e297.
- Char, D.S.; Shah, N.H.; Magnus, D. Implementing machine learning in health care—Addressing ethical challenges. N. Engl. J. Med. 2018, 378, 981–983.
- Esteva, A.; Robicquet, A.; Ramsundar, B.; Kuleshov, V.; DePristo, M.; Chou, K.; Cui, C.; Corrado, G.; Thrun, S.; Dean, J. A guide to deep learning in healthcare. Nat. Med. 2019, 25, 24–29.
- Nagendran, M.; Chen, Y.; Lovejoy, C.A.; Gordon, A.C.; Komorowski, M.; Harvey, H.; Topol, E.J.; Ioannidis, J.P.; Collins, G.S.; Maruthappu, M. Artificial intelligence versus clinicians: Systematic review of design, reporting standards, and claims of deep learning studies. BMJ 2020, 368, m689.
- Beam, A.L.; Kohane, I.S. Big data and machine learning in health care. JAMA 2018, 319, 1317–1318.
- McKinney, S.M.; Sieniek, M.; Godbole, V.; Godwin, J.; Antropova, N.; Ashrafian, H.; Back, T.; Chesus, M.; Corrado, G.S.; Darzi, A.; et al. International evaluation of an AI system for breast cancer screening. Nature 2020, 577, 89–94.
- Ardila, D.; Kiraly, A.P.; Bharadwaj, S.; Choi, B.; Reicher, J.J.; Peng, L.; Tse, D.; Etemadi, M.; Ye, W.; Corrado, G.; et al. End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography. Nat. Med. 2019, 25, 954–961.
- Gulshan, V.; Peng, L.; Coram, M.; Stumpe, M.C.; Wu, D.; Narayanaswamy, A.; Venugopalan, S.; Widner, K.; Madams, T.; Cuadros, J.; et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 2016, 316, 2402–2410.
- Hosny, A.; Parmar, C.; Quackenbush, J.; Schwartz, L.H.; Aerts, H.J.W.L. Artificial intelligence in radiology. Nat. Rev. Cancer 2018, 18, 500–510.
- Ehteshami Bejnordi, B.; Veta, M.; Johannes van Diest, P.; Van Ginneken, B.; Karssemeijer, N.; Litjens, G.; Van Der Laak, J.A.; CAMELYON16 Consortium; Hermsen, M.; Manson, Q.F.; et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA 2017, 318, 2199–2210.
- Chartrand, G.; Cheng, P.M.; Vorontsov, E.; Drozdzal, M.; Turcotte, S.; Pal, C.J.; Kadoury, S.; Tang, A. Deep learning: A primer for radiologists. Radiographics 2017, 37, 2113–2131.
- Shen, D.; Wu, G.; Suk, H.I. Deep learning in medical image analysis. Annu. Rev. Biomed. Eng. 2017, 19, 221–248.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
- Lakhani, P.; Sundaram, B. Deep learning at chest radiography: Automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology 2017, 284, 574–582.
- Bejnordi, B.E.; Mullooly, M.; Pfeiffer, R.M.; Fan, S.; Vacek, P.M.; Weaver, D.L.; Herschorn, S.; Brinton, L.A.; van Ginneken, B.; Karssemeijer, N.; et al. Using deep convolutional neural networks to identify and classify tumor-associated stroma in diagnostic breast biopsies. Mod. Pathol. 2018, 31, 1502–1512.
- Rajpurkar, P.; Irvin, J.; Ball, R.L.; Zhu, K.; Yang, B.; Mehta, H.; Duan, T.; Ding, D.; Bagul, A.; Langlotz, C.P.; et al. Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists. PLoS Med. 2018, 15, e1002686.
- Kelly, C.J.; Karthikesalingam, A.; Suleyman, M.; Corrado, G.; King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 2019, 17, 195.
- Aggarwal, R.; Sounderajah, V.; Martin, G.; Ting, D.S.; Karthikesalingam, A.; King, D.; Ashrafian, H.; Darzi, A. Diagnostic accuracy of deep learning in medical imaging: A systematic review and meta-analysis. npj Digit. Med. 2021, 4, 65.
- Wu, E.; Wu, K.; Daneshjou, R.; Ouyang, D.; Ho, D.E.; Zou, J. How medical AI devices are evaluated: Limitations and recommendations from an analysis of FDA approvals. Nat. Med. 2021, 27, 582–584.
- Goddard, K.; Roudsari, A.; Wyatt, J.C. Automation bias: A systematic review of frequency, effect mediators, and mitigators. J. Am. Med. Inform. Assoc. 2012, 19, 121–127.
- Lyell, D.; Coiera, E. Automation bias and verification complexity: A systematic review. J. Am. Med. Inform. Assoc. 2017, 24, 423–431.
- Cabitza, F.; Rasoini, R.; Gensini, G.F. Unintended consequences of machine learning in medicine. JAMA 2017, 318, 517–518.
- Parikh, R.B.; Teeple, S.; Navathe, A.S. Addressing bias in artificial intelligence in health care. JAMA 2019, 322, 2377–2378.
- Benjamens, S.; Dhunnoo, P.; Meskó, B. The state of artificial intelligence-based FDA-approved medical devices and algorithms: An online database. npj Digit. Med. 2020, 3, 118.
- Liberati, A.; Altman, D.G.; Tetzlaff, J.; Mulrow, C.; Gøtzsche, P.C.; Ioannidis, J.P.; Clarke, M.; Devereaux, P.J.; Kleijnen, J.; Moher, D. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: Explanation and elaboration. BMJ 2009, 339, b2700.
- Higgins, J.P.T.; Thomas, J.; Chandler, J.; Cumpston, M.; Li, T.; Page, M.J.; Welch, V.A. (Eds.) Cochrane Handbook for Systematic Reviews of Interventions, Version 6.3; Cochrane: London, UK, 2022.
- Moher, D.; Liberati, A.; Tetzlaff, J.; Altman, D.G.; PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. PLoS Med. 2009, 6, e1000097.
- Cai, C.J.; Winter, S.; Steiner, D.; Wilcox, L.; Terry, M. “Hello AI”: Uncovering the onboarding needs of medical practitioners for human-AI collaborative decision-making. Proc. ACM Hum. Comput. Interact. 2019, 3, 104.
- Sendak, M.P.; Gao, M.; Brajer, N.; Balu, S. Presenting machine learning model information to clinical end users with model facts labels. npj Digit. Med. 2020, 3, 41.
- Asan, O.; Bayrak, A.E.; Choudhury, A. Artificial intelligence and human trust in healthcare: Focus on clinicians. J. Med. Internet Res. 2020, 22, e15154.
- Wiens, J.; Saria, S.; Sendak, M.; Ghassemi, M.; Liu, V.X.; Doshi-Velez, F.; Jung, K.; Heller, K.; Kale, D.; Saeed, M.; et al. Do no harm: A roadmap for responsible machine learning for health care. Nat. Med. 2019, 25, 1337–1340.
- Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71.
- Guyatt, G.H.; Oxman, A.D.; Vist, G.E.; Kunz, R.; Falck-Ytter, Y.; Alonso-Coello, P.; Schünemann, H.J. GRADE: An emerging consensus on rating quality of evidence and strength of recommendations. BMJ 2008, 336, 924–926.
- Yun, J.; Park, J.E.; Lee, H.; Jung, W.S.; Choi, S.H.; Yoo, R.-E.; Hwang, I.P. Impact of artificial intelligence-based clinical decision support on the diagnostic accuracy and confidence of radiologists for intracranial hemorrhage detection: A prospective multicenter randomized controlled trial. npj Digit. Med. 2023, 6, 38.
- Nam, J.G.; Hwang, E.J.; Kim, J.; Park, N.; Lee, E.H.; Kim, H.J.; Nam, M.; Lee, J.H.; Park, C.M.; Goo, J.M. AI improves nodule detection on chest radiographs in a health screening population: A randomized controlled trial. Radiology 2023, 307, e221894.
- Hwang, E.J.; Goo, J.M.; Nam, J.G.; Park, C.M.; Hong, K.J.; Kim, K.H. Development and deployment of a deep learning model for emergency department chest radiograph interpretation: A multicenter study. Korean J. Radiol. 2023, 24, 260–270.
- Harada, Y.; Shimizu, T.; Tokuda, Y.; Miyano, S.; Wakamiya, S.; Aramaki, E. Diagnostic accuracy of an AI-based differential diagnosis list for common diseases: A multicenter randomized controlled trial. Int. J. Environ. Res. Public Health 2021, 18, 2086.
- Homayounieh, F.; Digumarthy, S.; Ebrahimian, S.; Rueckel, J.; Hoppe, B.F.; Sabel, B.O.; Conjeti, S.; Ridder, K.; Sistermanns, M.; Wang, L.; et al. An artificial intelligence-based chest X-ray model on human nodule detection accuracy from a multicenter study. JAMA Netw. Open 2021, 4, e2141096.
- Haenssle, H.A.; Fink, C.; Schneiderbauer, R.; Toberer, F.; Buhl, T.; Blum, A.; Kalloo, A.; Hassen, A.B.H.; Thomas, L.; Enk, A.; et al. Man against machine: Diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Ann. Oncol. 2018, 29, 1836–1842.
- De Fauw, J.; Ledsam, J.R.; Romera-Paredes, B.; Nikolov, S.; Tomasev, N.; Blackwell, S.; Askham, H.; Glorot, X.; O’Donoghue, B.; Visentin, D.; et al. Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat. Med. 2018, 24, 1342–1350.
- Tschandl, P.; Codella, N.; Akay, B.N.; Argenziano, G.; Braun, R.P.; Cabo, H.; Gutman, D.; Halpern, A.; Helba, B.; Hofmann-Wellenhof, R.; et al. Comparison of the accuracy of human readers versus machine-learning algorithms for pigmented skin lesion classification: An open, web-based, international, diagnostic study. Lancet Oncol. 2019, 20, 938–947.
- Keane, P.A.; Topol, E.J. With an eye to AI and autonomous diagnosis. npj Digit. Med. 2018, 1, 40.
- Parasuraman, R.; Manzey, D.H. Complacency and bias in human use of automation: An attentional integration. Hum. Factors 2010, 52, 381–410.
- Skitka, L.J.; Mosier, K.L.; Burdick, M. Does automation bias decision-making? Int. J. Hum. Comput. Stud. 1999, 51, 991–1006.
- Gaube, S.; Suresh, H.; Raue, M.; Merritt, A.; Berkowitz, S.J.; Lermer, E.; Coughlin, J.F.; Guttag, J.V.; Colak, E.; Ghassemi, M. Do as AI say: Susceptibility in deployment of clinical decision-aids. npj Digit. Med. 2021, 4, 31.
- Jacobs, M.; Pradier, M.F.; McCoy, T.H., Jr.; Perlis, R.H.; Doshi-Velez, F.; Gajos, K.Z. How machine-learning recommendations influence clinician treatment selections: The example of antidepressant selection. Transl. Psychiatry 2021, 11, 108.
- Cresswell, K.; Williams, R.; Sheikh, A. Developing and applying a formative evaluation framework for health information technology implementations: Qualitative investigation. J. Med. Internet Res. 2020, 22, e15068.
- Ratwani, R.M.; Reider, J.; Singh, H. A decade of health information technology usability challenges and the path forward. JAMA 2019, 321, 743–744.
- Chen, J.H.; Asch, S.M. Machine learning and prediction in medicine—Beyond the peak of inflated expectations. N. Engl. J. Med. 2017, 376, 2507–2509.
- Faes, L.; Liu, X.; Wagner, S.K.; Fu, D.J.; Balaskas, K.; Sim, D.A.; Bachmann, L.M.; Keane, P.A.; Denniston, A.K. A clinician’s guide to artificial intelligence: How to critically appraise machine learning studies. Transl. Vis. Sci. Technol. 2020, 9, 7.
- Sterne, J.A.C.; Savović, J.; Page, M.J.; Elbers, R.G.; Blencowe, N.S.; Boutron, I.; Cates, C.J.; Cheng, H.Y.; Corbett, M.S.; Eldridge, S.M.; et al. RoB 2: A revised tool for assessing risk of bias in randomised trials. BMJ 2019, 366, l4898.
- Gama, F.; Tyskbo, D.; Nygren, J.; Barlow, J.; Reed, J.; Svedberg, P. Implementation frameworks for artificial intelligence translation into health care practice: Scoping review. J. Med. Internet Res. 2022, 24, e32215.
- Damschroder, L.J.; Aron, D.C.; Keith, R.E.; Kirsh, S.R.; Alexander, J.A.; Lowery, J.C. Fostering implementation of health services research findings into practice: A consolidated framework for advancing implementation science. Implement. Sci. 2009, 4, 50.
- DerSimonian, R.; Laird, N. Meta-analysis in clinical trials. Control. Clin. Trials 1986, 7, 177–188.
- Obermeyer, Z.; Powers, B.; Vogeli, C.; Mullainathan, S. Dissecting racial bias in an algorithm used to manage the health of populations. Science 2019, 366, 447–453.
- Gianfrancesco, M.A.; Tamang, S.; Yazdany, J.; Schmajuk, G. Potential biases in machine learning algorithms using electronic health record data. JAMA Intern. Med. 2018, 178, 1544–1547.
- Higgins, J.P.T.; Thompson, S.G.; Deeks, J.J.; Altman, D.G. Measuring inconsistency in meta-analyses. BMJ 2003, 327, 557–560.
- Egger, M.; Davey Smith, G.; Schneider, M.; Minder, C. Bias in meta-analysis detected by a simple, graphical test. BMJ 1997, 315, 629–634.
- Muehlematter, U.J.; Daniore, P.; Vokinger, K.N. Approval of artificial intelligence and machine learning-based medical devices in the USA and Europe (2015–20): A comparative analysis. Lancet Digit. Health 2021, 3, e195–e203.
- Reddy, S.; Allan, S.; Coghlan, S.; Cooper, P. A governance model for the application of AI in health care. J. Am. Med. Inform. Assoc. 2020, 27, 491–497.
- Pronovost, P.J.; Cleeman, J.I.; Wright, D.; Srinivasan, A. Fifteen years after To Err is Human: A success story to learn from. BMJ Qual. Saf. 2016, 25, 396–399.
- Peiffer-Smadja, N.; Rawson, T.M.; Ahmad, R.; Buchard, A.; Georgiou, P.; Lescure, F.X.; Birgand, G.; Holmes, A.H. Machine learning for clinical decision support in infectious diseases: A narrative review of current applications. Clin. Microbiol. Infect. 2020, 26, 584–595.




| Study | Country | Clinical Specialty | AI System Type | Clinical Task | Sample Size | Participants | Primary Outcome | Risk of Bias |
|---|---|---|---|---|---|---|---|---|
| Yun et al., 2023 [46] | South Korea | Radiology | Deep Learning (CNN) | Intracranial hemorrhage detection (CT) | 7200 | Radiologists | Diagnostic accuracy (AUC) | Low risk |
| Nam et al., 2023 [47] | South Korea | Radiology | Deep Learning (CNN) | Chest radiograph abnormality detection | 2289 | Radiologists | Sensitivity, specificity | Some concerns |
| Hwang et al., 2023 [48] | South Korea | Emergency Medicine | Deep Learning (CNN) | Emergency chest X-ray interpretation | 2450 | Emergency physicians | Diagnostic accuracy | Some concerns |
| Harada et al., 2021 [49] | Japan | General Medicine | Machine Learning | Differential diagnosis support | 58 | Physicians | Diagnostic accuracy | Low risk |
| Homayounieh et al., 2021 [50] | USA | Radiology | Deep Learning (CNN) | Chest X-ray nodule detection | 660 | Radiologists | Sensitivity, specificity | Low risk |

| Outcome | No. Studies | No. Participants | Risk of Bias | Inconsistency | Indirectness | Imprecision | Pub. Bias | Certainty |
|---|---|---|---|---|---|---|---|---|
| Diagnostic accuracy: SMD = 0.182 (95% CI: 0.003–0.362) | 5 RCTs | 12,657 | Not serious | Serious (I² = 68.6%) | Serious (specialty/geography) | Not serious | Not detected | ⊕⊕⊕◯ MODERATE |
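The pooled estimate in the table can be unpacked with a little arithmetic. The sketch below back-calculates the standard error, z statistic, and two-sided p-value from the reported SMD and confidence interval, and the Cochran's Q implied by the reported I² with five studies. It assumes the interval is a normal-based (Wald) 95% interval. If a different method was used, for example a Hartung-Knapp adjustment, these figures would differ.

```python
import math

# Values taken from the GRADE summary table.
smd, ci_low, ci_high = 0.182, 0.003, 0.362
i_squared, n_studies = 0.686, 5

# Assumption: the CI is smd +/- z_crit * se with a normal critical value.
z_crit = 1.959964
se = (ci_high - ci_low) / (2 * z_crit)

# Wald z statistic and two-sided p-value from the normal distribution.
z = smd / se
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

# I^2 = (Q - df) / Q, so Q = df / (1 - I^2).
df = n_studies - 1
q_implied = df / (1 - i_squared)

print(f"SE = {se:.4f}, z = {z:.2f}, p = {p_two_sided:.3f}")
print(f"Implied Cochran's Q = {q_implied:.2f} on {df} degrees of freedom")
```

Under these assumptions the standard error comes out near 0.092 and the p-value just under 0.05, which matches a lower confidence bound that sits only slightly above zero.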
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Jeong, M.-A.; Kim, S.-D. Impact of AI-Based Clinical Decision Support Systems on Diagnostic Accuracy Among Healthcare Professionals: A Systematic Review and Meta-Analysis of Randomized Controlled Trials. Appl. Sci. 2026, 16, 5146. https://doi.org/10.3390/app16105146

