This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
Article

Do Foundation Models Truly Outperform Domain-Specific Models? Evidence from Digital Pathology

AI Innovation Lab, Weill Cornell Medicine, Doha P.O. Box 24144, Qatar
* Authors to whom correspondence should be addressed.
Mach. Learn. Knowl. Extr. 2026, 8(6), 164; https://doi.org/10.3390/make8060164 (registering DOI)
Submission received: 24 March 2026 / Revised: 7 June 2026 / Accepted: 8 June 2026 / Published: 12 June 2026

Abstract

Foundation models (FMs) are increasingly proposed as general-purpose solutions for computational pathology, with the potential to simplify clinical artificial intelligence deployment by reducing the need for task-specific architectures. However, their reliability across cancer domains with distinct morphological characteristics remains unclear, limiting confidence in real-world clinical use. We benchmarked seven general-purpose pathology FMs and three domain-specific FMs across eleven patch-level datasets spanning three clinically relevant domains: pediatric hematology, prostate cancer, and breast cancer, using both linear probing and last-layer fine-tuning adaptation strategies. By jointly evaluating pediatric leukemia, male-predominant prostate cancer, and female-predominant breast cancer, this study is, to our knowledge, the first to explicitly examine specialist-versus-generalist FM behavior across age- and sex-stratified cancer populations. Performance differences were strongly domain dependent. In hematology, the specialist FM DINOBloom matched and, in several datasets, marginally exceeded leading generalist models (AUC 0.990–0.999 vs. GigaPath 0.981–1.000), suggesting advantages for highly distinctive cellular morphology. In prostate cancer grading, the generalist FM UNI2-h consistently outperformed the specialist HistoEncoder (AUC 0.956–0.977 vs. 0.908–0.964). In breast cancer, UNI2-h achieved the best overall performance across all tasks. No publicly available breast-cancer-specific FM currently exists for direct comparison; therefore, breast cancer results characterize general FM transferability rather than specialist-versus-generalist differences. Importantly, cross-dataset experiments revealed substantial performance degradation under dataset shift in both prostate and breast cancer, indicating that current FMs are not yet robust enough for heterogeneous multi-site clinical use. 
These findings support the use of generalist FMs as efficient backbones for well-characterized single-site, patch-level tasks, while challenging the assumption that high benchmark performance necessarily reflects true clinical readiness and demonstrating that pathology FMs are not uniformly superior to specialist models.
Keywords: artificial intelligence; foundation models; transfer learning; breast cancer; prostate cancer; leukemia; hematology; computational pathology; digital pathology

Share and Cite

MDPI and ACS Style

Rabah, C.B.; Serag, A. Do Foundation Models Truly Outperform Domain-Specific Models? Evidence from Digital Pathology. Mach. Learn. Knowl. Extr. 2026, 8, 164. https://doi.org/10.3390/make8060164

