Cross-Disease Breathomics by PTR-TOF-MS: Multiclass Machine Learning and Network Remodeling Across Asthma, COPD, Cystic Fibrosis, and Lymphangioleiomyomatosis
Abstract
1. Introduction
2. Results
2.1. Dataset Composition
2.2. Multiclass Classification Performance and Assessment of Confounding Effects
2.3. Comparative Characteristics of VOCs Between Groups
2.4. Network Remodeling of VOC Interactions
3. Discussion
3.1. Principal Findings and Clinical Framing
3.2. Biological Interpretation of the Most Significant VOCs
3.3. Network Remodeling and Centrality as Complementary Signal
3.4. Methodological Considerations and Limitations
4. Materials and Methods
4.1. Study Cohorts and Ethical Approval
4.2. PTR-TOF-MS Breath Sampling and Instrumental Setup
4.3. Putative VOC Annotation
4.4. Data Splitting, Preprocessing, and Normalization
4.5. Statistical Data Analysis
4.5.1. Ensemble Feature Selection
4.5.2. Identifying VOC Predictors and Their Relationship to Endpoints
4.5.3. Multiclass Model Development and Validation
4.5.4. Between-Group Comparisons of VOC Abundance
4.5.5. Network Analysis of VOC Interactions
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations

| Abbreviation | Definition |
|---|---|
| PTR-TOF-MS | proton-transfer reaction time-of-flight mass spectrometry |
| eVOCs | exhaled volatile organic compounds |
| BA | bronchial asthma |
| COPD | chronic obstructive pulmonary disease |
| CF | cystic fibrosis |
| LAM | lymphangioleiomyomatosis |
| ATS | American Thoracic Society |
| ERS | European Respiratory Society |
| AUC | area under the curve |
| BMI | body mass index |
| FEV1 | forced expiratory volume in 1 s |
| FVC | forced vital capacity |
| FEF75 | forced expiratory flow when 75% of FVC has been exhaled |
| mMRC | Modified Medical Research Council |
| OvO | one-vs-one |
| OvR | one-vs-rest |




| Characteristic | BA | COPD | CF | LAM | Control | p-Value |
|---|---|---|---|---|---|---|
| Number of subjects | 160 | 128 | 102 | 51 | 402 | |
| Male sex, n (%) | 56 (35.0%) | 111 (86.7%) | 54 (52.9%) | 0 (0%) | 154 (38%) | <0.01 |
| Age, years | 58.4 ± 17.1 | 66.6 ± 10.3 | 25.6 ± 7.8 | 48.7 ± 10.9 | 39.1 ± 14.0 | <0.01 |
| BMI, kg/m² | 29.3 ± 8.5 | 25.9 ± 5.8 | 19.8 ± 3.6 | 24.7 ± 5.1 | 25.0 ± 5.2 | 0.02 |
| Smoking status, n (%) | | | | | | |
| Never | 85 (53%) | 25 (19.5%) | 102 (100%) | 50 (98%) | 281 (69.9%) | 0.06 |
| Former | 54 (33.9%) | 43 (33.6%) | 0 | 0 | 107 (26.6%) | <0.01 |
| Current | 21 (13.1%) | 60 (46.9%) | 0 | 1 (2%) | 14 (3.5%) | <0.01 |
| mMRC score | 2.1 ± 0.9 | 2.6 ± 0.9 | 1.1 ± 0.8 | 1.5 ± 1.1 | 0.0 ± 0.1 | <0.01 |
| FVC, % pred | 78.1 ± 18.7 | 72.6 ± 22.3 | 76.0 ± 21.2 | 89.0 ± 18.7 | 98.8 ± 11.3 | <0.01 |
| FEV1, % pred | 61.9 ± 17.1 | 49.7 ± 21.6 | 57.6 ± 24.6 | 72.9 ± 28.6 | 99.2 ± 11.0 | <0.01 |
| FEV1/FVC, % | 62.4 ± 10.0 | 51.1 ± 11.7 | 62.4 ± 13.1 | 63.8 ± 17.1 | 82.7 ± 6.2 | <0.01 |
| FEF75, % pred | 61.8 ± 27.0 | 52.5 ± 27.0 | 29.6 ± 27.8 | 66.3 ± 44.2 | 123.0 ± 52.1 | <0.01 |

| m/z | VOC Name ** | BA | COPD | CF | LAM | Control |
|---|---|---|---|---|---|---|
| 71.055 | 2-Pentanone, Fragments of C5-compounds | 0.211 ± 0.503 | 0.360 ± 0.617 | 0.125 ± 0.299 | 0.058 ± 0.049 | 0.077 ± 0.092 |
| 73.065 | Butanal or 2-butanone fragment | 0.069 ± 0.070 | 0.166 ± 0.549 | 0.043 ± 0.051 | 0.049 ± 0.019 | 0.045 ± 0.021 |
| 79.054 | Protonated benzene | 0.022 ± 0.023 | 0.038 ± 0.031 | 0.013 ± 0.008 | 0.019 ± 0.028 | 0.014 ± 0.011 |
| 83.086 | Cyclohexene | 0.046 ± 0.113 | 0.049 ± 0.044 | 0.033 ± 0.214 | 0.034 ± 0.089 | 0.043 ± 0.199 |
| 85.070 | Cyclopentenone | 0.056 ± 0.051 | 0.054 ± 0.037 | 0.016 ± 0.005 | 0.023 ± 0.016 | 0.028 ± 0.028 |
| 95.054 | Phenol | 0.187 ± 0.144 | 0.141 ± 0.137 | 0.050 ± 0.086 | 0.284 ± 0.175 | 0.213 ± 0.130 |
| 97.092 | Cycloheptene | 0.025 ± 0.022 | 0.037 ± 0.034 | 0.013 ± 0.019 | 0.020 ± 0.028 | 0.017 ± 0.024 |
| 109.071 | Methionol (3-methylthiopropanol) | 0.032 ± 0.022 | 0.034 ± 0.023 | 0.014 ± 0.005 | 0.031 ± 0.028 | 0.029 ± 0.052 |
| 118.071 | Indole | 0.037 ± 0.044 | 0.023 ± 0.036 | 0.005 ± 0.009 | 0.060 ± 0.051 | 0.041 ± 0.040 |
| 149.104 | Diethanolamine | 0.017 ± 0.067 | 0.011 ± 0.010 | 0.014 ± 0.008 | 0.018 ± 0.018 | 0.025 ± 0.113 |
| 159.071 | Thiazole/thiazolium derivatives | 0.032 ± 0.173 | 0.060 ± 0.284 | 0.005 ± 0.002 | 0.112 ± 0.389 | 0.066 ± 0.340 |
| 181.014 | Chlorinated amino acid derivative | 0.008 ± 0.017 | 0.006 ± 0.006 | 0.009 ± 0.007 | 0.006 ± 0.008 | 0.020 ± 0.082 |
| 235.211 | Terpene derivatives | 0.006 ± 0.005 | 0.006 ± 0.005 | 0.012 ± 0.027 | 0.011 ± 0.008 | 0.015 ± 0.056 |

| Centrality Metric | Lower Threshold (5th Percentile) | Upper Threshold (95th Percentile) |
|---|---|---|
| Weighted Degree | −0.295 | +0.448 |
| Betweenness | −0.426 | +0.271 |
| Eigenvector | −0.306 | +0.449 |
| Katz | −0.303 | +0.446 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Citation: Mustafina, M.; Silantyev, A.; Suvorov, A.; Krasovskiy, S.; Makarova, M.; Chernyak, A.; Suvorova, O.; Shmidt, A.; Gognieva, D.; Bykova, A.; Gogiberidze, N.; Akselrod, A.; Belevskiy, A.; Avdeev, S.; Betelin, V.; Syrkin, A.; Kopylov, P. Cross-Disease Breathomics by PTR-TOF-MS: Multiclass Machine Learning and Network Remodeling Across Asthma, COPD, Cystic Fibrosis, and Lymphangioleiomyomatosis. Int. J. Mol. Sci. 2026, 27, 3483. https://doi.org/10.3390/ijms27083483

