Analytical Validation of Multimodal AI Test Predicting Breast Cancer Recurrence Risk (Ataraxis Breast RISK)
Abstract
1. Introduction
2. Materials and Methods
2.1. Ataraxis Breast RISK Overview
2.2. Study Design and Procedures
2.3. Repeatability and Reproducibility
2.4. Clinical Validation Bridging Study
2.5. Robustness to Realistic Data Variability
2.6. Statistical Methods
3. Results
3.1. Primary Experiments
3.2. Clinical Validation Bridging Study
3.3. Clinicopathologic Data Variability
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| AI | Artificial intelligence |
| ATX | Ataraxis Breast RISK |
| AV | Analytical validation |
| CAP | College of American Pathologists |
| CI | Confidence interval |
| CLIA | Clinical Laboratory Improvement Amendments |
| ER | Estrogen receptor |
| H&E | Hematoxylin and eosin |
| HER2 | Human epidermal growth factor receptor 2 |
| ICC | Intraclass correlation coefficient |
| IDC | Invasive ductal carcinoma |
| ILC | Invasive lobular carcinoma |
| LDT | Laboratory-developed test |
| LOB | Limit of blank |
| LOD | Limit of detection |
| PR | Progesterone receptor |
| QC | Quality control |
| SOP | Standard operating procedure |


| Analytical Validation Cohort (N = 160) | |
|---|---|
| Age at diagnosis (Years) | |
| Median [IQR] | 57.5 [50.0–66.0] |
| Race | |
| Black or African American | 50 (31.25%) |
| Hispanic or Latino | 1 (0.62%) |
| White | 42 (26.25%) |
| Unknown | 67 (41.88%) |
| ER status | |
| Negative | 1 (0.62%) |
| Positive | 159 (99.38%) |
| PR status | |
| Negative | 8 (5.00%) |
| Positive | 152 (95.00%) |
| HER2 status (by immunohistochemistry) | |
| Equivocal (2+) | 1 (0.62%) |
| Negative (0, 1+) | 159 (99.38%) |
| Pathologic T stage | |
| T1a | 3 (1.88%) |
| T1b | 23 (14.38%) |
| T1c | 68 (42.50%) |
| T2 | 51 (31.88%) |
| T3 | 4 (2.50%) |
| T4 | 1 (0.62%) |
| Unknown | 10 (6.25%) |
| Pathologic N stage | |
| N0 | 108 (67.50%) |
| N1 | 36 (22.50%) |
| N2 | 2 (1.25%) |
| N3 | 2 (1.25%) |
| Unknown | 12 (7.50%) |
| Recurrence | |
| No | 141 (88.12%) |
| Yes | 19 (11.88%) |
| Death | |
| No | 157 (98.12%) |
| Yes | 3 (1.88%) |
| Follow-up time (Years) | |
| Median [IQR] | 4.7 [3.9–6.1] |

| Comparison | ICC (95% CI) | Agreement (%) |
|---|---|---|
| Intra-operator | 0.99 (0.98–1.00) | 100.0% |
| Inter-operator | 0.99 (0.98–1.00) | 100.0% |
| Inter-laboratory | 0.97 (0.96–0.98) | 94.7% |
| Clinicopathologic Data Perturbations | 0.85 (0.82–0.88) | 90.0% |
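
The two columns of the table above can be illustrated with a minimal sketch. It assumes the ICC is the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1) in the Shrout and Fleiss notation, and that agreement means the percentage of cases whose repeated runs fall on the same side of a risk cutoff. Neither assumption is confirmed by the table itself; the function names, the synthetic scores, and the 0.30 cutoff are hypothetical and chosen only for illustration.

```python
import numpy as np


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `scores` has one row per case and one column per repeated run
    (operator, laboratory, or perturbed input).
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)
    col_means = x.mean(axis=0)
    # Mean squares from the two-way ANOVA decomposition.
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)   # between cases
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)   # between runs
    resid = x - row_means[:, None] - col_means[None, :] + grand
    mse = np.sum(resid ** 2) / ((n - 1) * (k - 1))         # residual
    return float((msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n))


def category_agreement(scores, threshold):
    """Percentage of cases whose runs all fall on the same side of `threshold`."""
    x = np.asarray(scores, dtype=float)
    high = x >= threshold
    same = high.all(axis=1) | (~high).all(axis=1)
    return float(100.0 * same.mean())


# Synthetic risk scores (assumption): five cases, two repeated runs each.
runs = np.array([
    [0.10, 0.11],
    [0.42, 0.40],
    [0.75, 0.77],
    [0.29, 0.31],  # straddles the hypothetical 0.30 cutoff
    [0.55, 0.54],
])

print(f"ICC(2,1): {icc_2_1(runs):.3f}")
print(f"Agreement: {category_agreement(runs, threshold=0.30):.1f}%")
```

The fourth synthetic case shows how the two measures can diverge: its scores differ by only 0.02, which barely affects the ICC, yet they straddle the cutoff and so count as a categorical disagreement.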
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Citation
Dantone, M.; Lacsamana, M.; Zeng, K.G.; Kenny, P.A.; Geras, K.J.; Witowski, J. Analytical Validation of Multimodal AI Test Predicting Breast Cancer Recurrence Risk (Ataraxis Breast RISK). Diagnostics 2026, 16, 1023. https://doi.org/10.3390/diagnostics16071023

