Open Access Article
Do Foundation Models Truly Outperform Domain-Specific Models? Evidence from Digital Pathology
by Chaima Ben Rabah * and Ahmed Serag *
AI Innovation Lab, Weill Cornell Medicine, Doha P.O. Box 24144, Qatar
* Authors to whom correspondence should be addressed.
Mach. Learn. Knowl. Extr. 2026, 8(6), 164; https://doi.org/10.3390/make8060164 (registering DOI)
Submission received: 24 March 2026 / Revised: 7 June 2026 / Accepted: 8 June 2026 / Published: 12 June 2026
Abstract
Foundation models (FMs) are increasingly proposed as general-purpose solutions for computational pathology, with the potential to simplify clinical artificial intelligence deployment by reducing the need for task-specific architectures. However, their reliability across cancer domains with distinct morphological characteristics remains unclear, limiting confidence in real-world clinical use. We benchmarked seven general-purpose pathology FMs and three domain-specific FMs across eleven patch-level datasets spanning three clinically relevant domains: pediatric hematology, prostate cancer, and breast cancer, using both linear probing and last-layer fine-tuning adaptation strategies. By jointly evaluating pediatric leukemia, male-predominant prostate cancer, and female-predominant breast cancer, this study is, to our knowledge, the first to explicitly examine specialist-versus-generalist FM behavior across age- and sex-stratified cancer populations. Performance differences were strongly domain dependent. In hematology, the specialist FM DINOBloom matched and, in several datasets, marginally exceeded leading generalist models (AUC 0.990–0.999 vs. GigaPath 0.981–1.000), suggesting advantages for highly distinctive cellular morphology. In prostate cancer grading, the generalist FM UNI2-h consistently outperformed the specialist HistoEncoder (AUC 0.956–0.977 vs. 0.908–0.964). In breast cancer, UNI2-h achieved the best overall performance across all tasks. No publicly available breast-cancer-specific FM currently exists for direct comparison; therefore, breast cancer results characterize general FM transferability rather than specialist-versus-generalist differences. Importantly, cross-dataset experiments revealed substantial performance degradation under dataset shift in both prostate and breast cancer, indicating that current FMs are not yet robust enough for heterogeneous multi-site clinical use. 
These findings support the use of generalist FMs as efficient backbones for well-characterized single-site, patch-level tasks, while challenging the assumption that high benchmark performance necessarily reflects true clinical readiness and demonstrating that pathology FMs are not uniformly superior to specialist models.
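As a rough illustration of the linear-probing protocol named in the abstract, the sketch below trains a logistic-regression head on fixed feature vectors and scores it by AUC on held-out samples. It is not the authors' pipeline: the embeddings are synthetic Gaussian vectors standing in for frozen FM features, the task is reduced to the binary case, and the names `fit_linear_probe`, `auc_score`, and `run_demo`, along with all hyperparameters, are hypothetical choices made for this example.

```python
import numpy as np


def auc_score(y_true, scores):
    """Binary ROC AUC via the Mann-Whitney U statistic (tied scores share an average rank)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    # Replace the ranks of tied scores with their mean rank.
    for s in np.unique(scores):
        mask = scores == s
        ranks[mask] = ranks[mask].mean()
    n_pos = int((y_true == 1).sum())
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def fit_linear_probe(X, y, lr=0.1, epochs=500, l2=1e-3):
    """Fit a logistic-regression head on fixed embeddings X of shape (n, d).

    Only the head (w, b) is trained; the encoder that produced X is assumed
    frozen, which is what distinguishes linear probing from fine-tuning.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        w -= lr * (X.T @ (p - y) / n + l2 * w)  # gradient step with L2 penalty
        b -= lr * np.mean(p - y)
    return w, b


def run_demo(seed=0):
    """Train a probe on synthetic 'embeddings' and return the held-out AUC."""
    rng = np.random.default_rng(seed)
    n, d = 400, 16
    y = rng.integers(0, 2, size=n)
    # Assumption: class 1 embeddings are shifted by +1 in every dimension.
    X = rng.normal(size=(n, d)) + y[:, None] * 1.0
    w, b = fit_linear_probe(X[:300], y[:300])
    return auc_score(y[300:], X[300:] @ w + b)


print(f"held-out AUC: {run_demo():.3f}")
```

In a real benchmark the synthetic matrix would be replaced by patch embeddings extracted once from each frozen model, so that differences in AUC reflect the quality of the features rather than the capacity of the head. Scoring the same head on a dataset from a different site would be one way to probe the dataset-shift degradation the abstract reports.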