Breathprints for Breast Cancer: Evaluating a Non-Invasive Approach to BI-RADS 4 Risk Stratification in a Preliminary Study
Simple Summary
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Design and Population
2.1.1. Study Design
2.1.2. Study Population
2.2. Device Description
2.3. Breath Sampling Protocol
- Ambient Sampling Phase (30 s): The device first sampled ambient air through the mouthpiece to establish a stable baseline response to the room environment. This step calibrated the sensor array to the room’s background VOC composition, supporting accurate differential detection during breath sampling. In Figure 3, this phase is referred to as “Baselining”.
- Breath Sampling Phase (5–15 s): With the participant’s nose gently occluded to prevent nasal breathing, a single full exhalation was performed into the mouthpiece. The integrated capnography [23] module automatically identified the end-tidal (alveolar) portion of the breath and triggered its capture in the buffer chamber. In Figure 3, this phase is referred to as “Capturing”.
- Sensor Recovery Phase (250 s): Following sample capture, ambient air was drawn through the system to desorb VOCs from the sensor surfaces, allowing the array to return to its ambient baseline in preparation for the next measurement. In Figure 3, this phase is referred to as “Recovery”.
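The timing of the three phases above can be summarized in a short sketch. The phase names and durations come from the protocol; the `Phase` class and `cycle_duration_bounds` helper are hypothetical names introduced only for illustration and do not describe the device firmware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str          # label used for the phase in Figure 3
    min_seconds: int   # shortest stated duration
    max_seconds: int   # longest stated duration


# Durations as stated in the sampling protocol; the data structure is illustrative.
PROTOCOL = (
    Phase("Baselining", 30, 30),   # ambient sampling
    Phase("Capturing", 5, 15),     # single full exhalation, end-tidal capture
    Phase("Recovery", 250, 250),   # sensor desorption with ambient air
)


def cycle_duration_bounds(phases=PROTOCOL):
    """Return (shortest, longest) duration of one full sampling cycle, in seconds."""
    shortest = sum(p.min_seconds for p in phases)
    longest = sum(p.max_seconds for p in phases)
    return shortest, longest


shortest, longest = cycle_duration_bounds()
print(f"One sampling cycle takes between {shortest} and {longest} s")
```

Under these stated durations, most of each cycle is spent in sensor recovery rather than in breath capture.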
2.4. Data Preprocessing and Model Building
2.4.1. Data Preprocessing
2.4.2. Model Architecture and Clinically Optimized Training
- $L_{\text{Task}}$: the error for performing the malignancy classification task
- $L_{\text{Breathprint}}$: the error for decoding the breathprint from the latent vector
- $L_{\text{BI-RADS}}$: the error for decoding the BI-RADS score from the latent vector
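One plausible way to combine the three terms listed above is a weighted sum. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: binary cross-entropy for the task term, mean squared error for both decoding terms, a numeric encoding of the BI-RADS score, and the weights `w_breathprint` and `w_birads` are all assumptions introduced here.

```python
import math


def mse(target, decoded):
    """Mean squared error between a target vector and its decoded estimate."""
    return sum((t - d) ** 2 for t, d in zip(target, decoded)) / len(target)


def binary_cross_entropy(label, prob, eps=1e-12):
    """Cross-entropy for one sample; label is 0 (benign) or 1 (malignant)."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def combined_loss(label, prob, breathprint, breathprint_hat,
                  birads, birads_hat, w_breathprint=1.0, w_birads=1.0):
    """Weighted sum of the task loss and the two decoding losses.

    The weights and the choice of MSE/BCE are assumptions for illustration.
    """
    l_task = binary_cross_entropy(label, prob)
    l_breathprint = mse(breathprint, breathprint_hat)
    l_birads = mse([birads], [birads_hat])  # BI-RADS encoded as a number (assumed)
    return l_task + w_breathprint * l_breathprint + w_birads * l_birads


# Example: perfect reconstruction, uninformative classifier output.
loss = combined_loss(1, 0.5, [0.1, 0.2], [0.1, 0.2], 4.0, 4.0)
print(f"combined loss = {loss:.4f}")
```

With both decoding terms at zero, the total reduces to the classification term alone, which makes the contribution of each component easy to inspect.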
2.4.3. Model Cross-Validation
3. Results
3.1. Study Population and Data Distribution
3.2. Predictive Performance in the BI-RADS 4 Cohort
3.2.1. Specificity and Sensitivity Trade-Offs Across Subcategories
3.2.2. Summary Metrics and Negative Predictive Value
3.2.3. Additional Observations
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
Appendix A
Appendix A.1. Secure Data Transmission and Storage
Appendix A.2. Data Preprocessing Details
Appendix A.3. Model Architecture and Training Strategy
- The model’s training objective minimizes a combined loss function that includes the standard reconstruction loss (ensuring accurate representation of the breath signal and BI-RADS score) and the classification loss.
- To introduce a crucial clinical bias toward detection, the classification loss employs class-weighted cross-entropy, which selectively up-weights the malignant class during training.
- The models are optimized to maximize the F2-score, which prioritizes sensitivity (recall) over precision, with a ratio of 2.0 for sensitivity and 0.8 for precision, given the dataset's skew towards biopsy-confirmed benign cases. Overfitting is penalized by subtracting the training–validation performance gap from this objective. The coefficients are chosen empirically.
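The training strategy listed above can be sketched in a few lines. This is an illustration under stated assumptions: the function names, the `malignant_weight` and `gap_weight` parameters, and the exact form of the gap penalty are hypothetical, and the sketch uses the standard F-beta definition with beta = 2 because this section does not specify how the 2.0 and 0.8 coefficients enter the objective.

```python
import math


def weighted_cross_entropy(labels, probs, malignant_weight, eps=1e-12):
    """Mean binary cross-entropy with the malignant class (label 1) up-weighted."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        weight = malignant_weight if y == 1 else 1.0
        total += -weight * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def f_beta(tp, fp, fn, beta=2.0):
    """Standard F-beta score; beta > 1 favors recall (sensitivity) over precision."""
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def selection_score(train_f2, val_f2, gap_weight=1.0):
    """Validation F2 minus a penalty for the training-validation gap (assumed form)."""
    return val_f2 - gap_weight * max(0.0, train_f2 - val_f2)


# A missed malignant case costs more than an equally confident false alarm.
missed_cancer = weighted_cross_entropy([1], [0.1], malignant_weight=3.0)
false_alarm = weighted_cross_entropy([0], [0.9], malignant_weight=3.0)
print(f"missed cancer: {missed_cancer:.3f}, false alarm: {false_alarm:.3f}")

# F2 for 8 true positives, 4 false positives, 2 false negatives.
print(f"F2 = {f_beta(tp=8, fp=4, fn=2):.3f}")
```

Because the penalty only applies when training performance exceeds validation performance, a model that generalizes well is scored by its validation F2 alone.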
Appendix A.4. Model Parameters
Appendix A.5. Model Validation Protocol
Appendix A.6. Model Performance Evaluation Metrics
References
- Caswell-Jin, J.L.; Sun, L.P.; Munoz, D.; Lu, Y.; Li, Y.; Huang, H.; Hampton, J.M.; Song, J.; Jayasekera, J.; Schechter, C.; et al. Analysis of breast cancer mortality in the US-1975 to 2019. JAMA 2024, 331, 233–241.
- Ellison, L.F.; Saint-Jacques, N. Five-year cancer survival by stage at diagnosis in Canada. Health Rep. 2023, 34, 3–15.
- Sung, H.; Ferlay, J.; Siegel, R.L.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 2021, 71, 209–249.
- Liberman, L.; Menell, J.H. Breast imaging reporting and data system (BI-RADS). Radiol. Clin. N. Am. 2002, 40, 409–430.
- Spak, D.; Plaxco, J.; Santiago, L.; Dryden, M.; Dogan, B. BI-RADS® fifth edition: A summary of changes. Diagn. Interv. Imaging 2017, 98, 179–190.
- Elezaby, M.; Li, G.; Bhargavan-Chatfield, M.; Burnside, E.S.; DeMartini, W.B. ACR BI-RADS assessment category 4 subdivisions in diagnostic mammography: Utilization and outcomes in the National Mammography Database. Radiology 2018, 287, 416–422.
- Liu, C.; Sun, M.; Arefan, D.; Zuley, M.; Sumkin, J.; Wu, S. Deep learning of mammogram images to reduce unnecessary breast biopsies: A preliminary study. Breast Cancer Res. 2024, 26, 82.
- Meng, M.; Li, H.; Zhang, M.; He, G.; Wang, L.; Shen, D. Reducing the number of unnecessary biopsies for mammographic BI-RADS 4 lesions through a deep transfer learning method. BMC Med. Imaging 2023, 23, 82.
- Shen, Y.; Shamout, F.E.; Oliver, J.R.; Witowski, J.; Kannan, K.; Park, J.; Wu, N.; Huddleston, C.; Wolfson, S.; Millet, A.; et al. Artificial intelligence system reduces false-positive findings in the interpretation of breast ultrasound exams. Nat. Commun. 2021, 12, 5645.
- Miglioretti, D.L.; Abraham, L.; Sprague, B.L.; Lee, C.I.; Bissell, M.C.; Ho, T.-Q.H.; Bowles, E.J.; Henderson, L.M.; Hubbard, R.A.; Tosteson, A.N.; et al. Association between false-positive results and return to screening mammography in the Breast Cancer Surveillance Consortium cohort. Ann. Intern. Med. 2024, 177, 1297–1307.
- Brodersen, J.; Siersma, V.D. Long-term psychosocial consequences of false-positive screening mammography. Ann. Fam. Med. 2013, 11, 106–115.
- Chubak, J.; Boudreau, D.M.; Fishman, P.A.; Elmore, J.G. Cost of breast-related care in the year following false positive screening mammograms. Med. Care 2010, 48, 815–820.
- Sun, X.; Shao, K.; Wang, T. Detection of volatile organic compounds (VOCs) from exhaled breath as noninvasive methods for cancer diagnosis. Anal. Bioanal. Chem. 2016, 408, 2759–2780.
- Leemans, M.; Bauër, P.; Cuzuel, V.; Audureau, E.; Fromantin, I. Volatile Organic Compounds Analysis as a Potential Novel Screening Tool for Breast Cancer: A Systematic Review. Biomark. Insights 2022, 17, 11772719221100709.
- Yockell-Lelièvre, H.; Philip, R.; Kaushik, P.; Masilamani, A.; Meterissian, S. Breathomics: A non-invasive approach for the diagnosis of breast cancer. Bioengineering 2025, 12, 411.
- Haworth, J.J.; Pitcher, C.K.; Ferrandino, G.; Hobson, A.R.; Pappan, K.L.; Lawson, J.L.D. Breathing new life into clinical testing and diagnostics: Perspectives on volatile biomarkers from breath. Crit. Rev. Clin. Lab. Sci. 2022, 59, 353–372.
- Nakhleh, M.K.; Haick, H.; Humbert, M.; Cohen-Kaminsky, S. Volatolomics of breath as an emerging frontier in pulmonary arterial hypertension. Eur. Respir. J. 2017, 49, 1601897.
- Nakhleh, M.K.; Amal, H.; Jeries, R.; Broza, Y.Y.; Aboud, M.; Gharra, A.; Ivgi, H.; Khatib, S.; Badarneh, S.; Har-Shai, L.; et al. Diagnosis and classification of 17 diseases from 1404 subjects via pattern analysis of exhaled molecules. ACS Nano 2017, 11, 112–125.
- Rufo, J.C.; Madureira, J.; Fernandes, E.O.; Moreira, A. Volatile organic compounds in asthma diagnosis: A systematic review and meta-analysis. Allergy 2016, 71, 175–188.
- Van Berkel, J.; Dallinga, J.; Möller, G.; Godschalk, R.; Moonen, E.; Wouters, E.; Van Schooten, F. A profile of volatile organic compounds in breath discriminates COPD patients from controls. Respir. Med. 2010, 104, 557–563.
- Dixit, K.; Fardindoost, S.; Ravishankara, A.; Tasnim, N.; Hoorfar, M. Exhaled Breath Analysis for Diabetes Diagnosis and Monitoring: Relevance, Challenges and Possibilities. Biosensors 2021, 11, 476.
- Buszewski, B.; Ulanowska, A.; Ligor, T.; Denderz, N.; Amann, A. Analysis of exhaled breath from smokers, passive smokers and non-smokers by solid-phase microextraction gas chromatography/mass spectrometry. Biomed. Chromatogr. 2009, 23, 551–556.
- Gravenstein, J.S.; Jaffe, M.B.; Gravenstein, N. (Eds.) Capnography; Cambridge University Press & Assessment: Cambridge, UK, 2011.
- Lourenço, C.; Turner, C. Breath analysis in disease diagnosis: Methodological considerations and applications. Metabolites 2014, 4, 465–498.
- Rahman, H.; Hooper, J.K.; Wardeh, A.; Masilamani, A.P.; Yockell-Lelièvre, H.; Kandathil, J.O.; Abadi, M.K. Confounder-Invariant Representation Learning (CIRL) for robust olfaction with scarce aroma sensor data: Mitigating humidity effects in breath analysis. Sensors 2025, 25, 6839.
- Levaray, N.; Ozhikandathil, J.; Masilamani, A.P.; Panarello, T. Sensing Elements Comprising Gold Nanoparticle-Grafted Carbon Black. U.S. Patent No. 11,788,985, 17 October 2023.
- Ryan, M.; Zhou, H.; Buehler, M.; Manatt, K.; Mowrey, V.; Jackson, S.; Kisor, A.; Shevade, A.; Homer, M. Monitoring space shuttle air quality using the Jet Propulsion Laboratory electronic nose. IEEE Sensors J. 2004, 4, 337–347.
- Shevade, A.V.; Ryan, M.A.; Homer, M.L.; Manfreda, A.M.; Zhou, H.; Manatt, K.S. Molecular modeling of polymer composite-analyte interactions in electronic nose sensors. Sens. Actuators B Chem. 2003, 93, 84–91.
- Henderson, B.; Ruszkiewicz, D.M.; Wilkinson, M.; Beauchamp, J.D.; Cristescu, S.M.; Fowler, S.J.; Salman, D.; Di Francesco, F.; Koppen, G.; Langejürgen, J.; et al. A Benchmarking Protocol for Breath Analysis: The Peppermint Experiment. J. Breath Res. 2020, 14, 046008.
- Ryan, M.A.; Manatt, K.S.; Gluck, S.; Shevade, A.V.; Kisor, A.K.; Zhou, H.; Lara, L.M.; Homer, M.L. The JPL electronic nose: Monitoring air in the U.S. Lab on the International Space Station. In Proceedings of the 2010 IEEE Sensors, Waikoloa, Hawaii, 1–4 November 2010; pp. 1242–1247.
- Cawley, G.C.; Talbot, N.L. On over-fitting in model selection and subsequent selection bias in performance evaluation. J. Mach. Learn. Res. 2010, 11, 2079–2107.
- Hinton, G.E.; Salakhutdinov, R.R. Reducing the dimensionality of data with neural networks. Science 2006, 313, 504–507.
- Meterissian, S.H.; Abadi, M.K.; Wardeh, A.; Kaushik, P.; Philip, R.; Bassel, M.A.; Graham, G.; Masilamani, A. Breast cancer detection using a realtime breath analyzer: A pilot study. J. Clin. Oncol. 2025, 43, e13040.
- Baltrušaitis, T.; Ahuja, C.; Morency, L.-P. Multimodal machine learning: A survey and taxonomy. arXiv 2017, arXiv:1705.09406.
- Huang, S.-C.; Pareek, A.; Seyyedi, S.; Banerjee, I.; Lungren, M.P. Fusion of medical imaging and electronic health records using deep learning: A systematic review and implementation guidelines. npj Digit. Med. 2020, 3, 136.





| | Group 1 Benign Lesion | Group 2 Biopsy-Confirmed Breast Cancer | Total |
|---|---|---|---|
| Initial enrolment | 110 participants (363 samples) | 66 participants (181 samples) | 176 participants (544 samples) |
| Post-exclusion | 72 participants (270 samples) | 53 participants (167 samples) | 125 participants (437 samples) |
| BI-RADS Category | | | |
| 3 | 2 participants (7 samples) | 0 participants (0 samples) | 2 participants (7 samples) |
| 5 | 2 participants (7 samples) | 36 participants (114 samples) | 38 participants (121 samples) |
| 4A | 26 participants (103 samples) | 2 participants (7 samples) | 28 participants (110 samples) |
| 4B | 34 participants (124 samples) | 6 participants (18 samples) | 40 participants (142 samples) |
| 4C | 8 participants (29 samples) | 9 participants (28 samples) | 17 participants (57 samples) |
| 4A + 4B + 4C | 68 participants (256 samples) | 17 participants (53 samples) | 85 participants (309 samples) |

| BI-RADS Category | Sensitivity | NPV | Specificity | PPV | Malignancy Rate |
|---|---|---|---|---|---|
| 4A | 86 ± 5% | 99 ± 0% | 83 ± 7% | 28 ± 8% | 6% |
| 4B | 82 ± 5% | 96 ± 1% | 70 ± 8% | 29 ± 5% | 13% |
| 4C | 92 ± 4% | 90 ± 4% | 67 ± 8% | 73 ± 4% | 49% |
| 4 (A + B + C) | 88 ± 3% | 97 ± 1% | 75 ± 7% | 43 ± 6% | 17% |
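As a consistency check on the performance table, PPV and NPV can be approximated from sensitivity, specificity, and prevalence using Bayes' rule. The sketch below takes the prevalence of each subcategory as the sample-level malignancy rate from the study population table (for example, 7 of 110 samples in 4A, about 6%). The function name and data layout are hypothetical, and because the reported values are means with a spread, the recomputed figures are only expected to be close to the table, not identical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    tp = sensitivity * prevalence              # true-positive fraction
    fn = (1 - sensitivity) * prevalence        # false-negative fraction
    tn = specificity * (1 - prevalence)        # true-negative fraction
    fp = (1 - specificity) * (1 - prevalence)  # false-positive fraction
    return tp / (tp + fp), tn / (tn + fn)


# (malignant samples, total samples, mean sensitivity, mean specificity)
ROWS = {
    "4A": (7, 110, 0.86, 0.83),
    "4B": (18, 142, 0.82, 0.70),
    "4C": (28, 57, 0.92, 0.67),
    "4 (A + B + C)": (53, 309, 0.88, 0.75),
}

for category, (malignant, total, sens, spec) in ROWS.items():
    prevalence = malignant / total
    ppv, npv = predictive_values(sens, spec, prevalence)
    print(f"{category}: prevalence {prevalence:.0%}, PPV {ppv:.0%}, NPV {npv:.0%}")
```

The check also shows why PPV is much lower in 4A and 4B than in 4C despite similar sensitivity: at low prevalence, false positives from the large benign group outnumber the true positives.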
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Masilamani, A.P.; Hooper, J.K.; Rahman, M.H.; Philip, R.; Kaushik, P.; Graham, G.; Yockell-Lelievre, H.; Khomami Abadi, M.; Meterissian, S.H. Breathprints for Breast Cancer: Evaluating a Non-Invasive Approach to BI-RADS 4 Risk Stratification in a Preliminary Study. Cancers 2026, 18, 226. https://doi.org/10.3390/cancers18020226

