Biomedicines
  • Article
  • Open Access

21 December 2025

Comparison of Artificial Intelligence and Radiologists in MRI-Based Prostate Cancer Diagnosis: A Meta-Analysis of Accuracy and Effectiveness

1. Graduate Program of Population Health Sciences, Weill Cornell Medicine, Cornell University, New York, NY 10065, USA
2. Graduate Program of Translational & Clinical Investigation, Weill Cornell Graduate School of Medical Sciences, Cornell University, New York, NY 10065, USA
3. Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065, USA
4. Department of Medicine, Clinical & Translational Science Center, Weill Cornell Medicine, New York, NY 10065, USA
Biomedicines 2026, 14(1), 20; https://doi.org/10.3390/biomedicines14010020
This article belongs to the Section Cancer Biology and Oncology

Abstract

Background: Prostate cancer remains a leading cause of cancer mortality in men, making early, accurate detection crucial for timely intervention. While radiologists use the Prostate Imaging Reporting and Data System (PI-RADS) to interpret prostate MRI, variation in reader expertise and inter-reader disagreement can affect diagnostic accuracy. Artificial intelligence (AI) has emerged as a promising tool for automated detection, with the potential to achieve diagnostic performance comparable to that of radiologists in identifying clinically significant prostate cancer (csPCa), streamline workflows, and reduce unnecessary biopsies. However, its real-world performance relative to expert radiologists remains a topic of ongoing debate. Purpose: This meta-analysis evaluates whether AI can achieve diagnostic performance comparable to that of radiologists in MRI-based prostate cancer detection by comparing diagnostic accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC). Methods: Following PRISMA 2020 guidelines, we searched PubMed for studies directly comparing AI and radiologists in MRI-based detection of csPCa. Ten studies (20,423 patients) were included, and study quality was assessed using QUADAS-2. Analyses included forest plots of diagnostic sensitivity and specificity, funnel plots of AUROC to assess publication bias, and paired AUROC difference plots to directly compare diagnostic accuracy. Results: Pooled sensitivity was 0.87 (95% CI: 0.81–0.94) for AI and 0.85 (95% CI: 0.77–0.94) for radiologists; pooled specificity was 0.61 (95% CI: 0.51–0.72) for AI and 0.63 (95% CI: 0.54–0.71) for radiologists. Funnel plots of AUROC against standard error showed no strong visual evidence of publication bias. Paired AUROC difference analysis demonstrated no significant performance difference between AI and radiologists, with a pooled difference of 0.018 (p = 0.378). Conclusions: AI systems demonstrated diagnostic performance comparable to that of radiologists for MRI-based detection of csPCa, with slightly but nonsignificantly higher pooled sensitivity and AUROC. AI also has the potential to improve workflow speed and consistency across expertise levels, and hybrid AI-radiologist approaches may reduce unnecessary biopsies. Large-scale, prospective trials with standardized protocols are needed to assess AI's effectiveness across diverse clinical settings.
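
To make the pooling and comparison steps concrete, the sketch below illustrates, with hypothetical numbers rather than the study data, how a pooled sensitivity with a 95% CI and a paired AUROC difference test of the kind reported above can be computed. It pools logit-transformed per-study sensitivities under a DerSimonian-Laird random-effects model and combines per-study (AI minus radiologist) AUROC differences by inverse-variance weighting with a two-sided z-test; the abstract does not specify the exact model used, so this is one standard approach, not the paper's confirmed method.

```python
import numpy as np
from scipy import stats

def pool_logit_proportions(events, totals):
    """DerSimonian-Laird random-effects pooling of proportions on the logit scale."""
    p = events / totals
    y = np.log(p / (1 - p))                       # logit-transformed study proportions
    v = 1.0 / events + 1.0 / (totals - events)    # approximate within-study variances
    w = 1.0 / v                                   # fixed-effect (inverse-variance) weights
    y_fe = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - y_fe) ** 2)               # Cochran's Q heterogeneity statistic
    k = len(y)
    tau2 = max(0.0, (q - (k - 1)) / (np.sum(w) - np.sum(w ** 2) / np.sum(w)))
    w_re = 1.0 / (v + tau2)                       # random-effects weights
    y_re = np.sum(w_re * y) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    expit = lambda x: 1.0 / (1.0 + np.exp(-x))    # back-transform to the proportion scale
    return expit(y_re), (expit(y_re - 1.96 * se), expit(y_re + 1.96 * se))

# Hypothetical per-study counts: true positives and csPCa-positive patients.
tp = np.array([170.0, 88.0, 240.0, 61.0, 132.0])
pos = np.array([200.0, 100.0, 280.0, 70.0, 150.0])
sens, (lo, hi) = pool_logit_proportions(tp, pos)
print(f"Pooled sensitivity: {sens:.2f} (95% CI {lo:.2f}-{hi:.2f})")

# Paired AUROC comparison: inverse-variance pooling of per-study
# (AI - radiologist) AUROC differences, with a two-sided z-test.
d = np.array([0.03, -0.01, 0.05, 0.00, 0.02])     # hypothetical differences
se_d = np.array([0.02, 0.03, 0.04, 0.02, 0.03])   # hypothetical standard errors
w_d = 1.0 / se_d ** 2
d_pool = np.sum(w_d * d) / np.sum(w_d)
z = d_pool / np.sqrt(1.0 / np.sum(w_d))
p_value = 2.0 * (1.0 - stats.norm.cdf(abs(z)))
print(f"Pooled AUROC difference: {d_pool:.3f} (p = {p_value:.3f})")
```

A pooled difference near zero with a large p-value, as in the reported result of 0.018 (p = 0.378), is consistent with no detectable performance gap between AI and radiologists under this framework.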
