Peer-Review Record

Integrated Exhaled VOC and Clinical Biomarker Profiling for Predicting Bronchodilator Responsiveness in Asthma and COPD Patients

Diagnostics 2025, 15(21), 2738; https://doi.org/10.3390/diagnostics15212738

by Malika Mustafina¹,²,³,*, Artemiy Silantyev⁴, Aleksander Suvorov⁴, Alexander Chernyak², Olga Suvorova⁵, Anna Shmidt⁵, Anastasia Gordeeva⁵, Maria Vergun⁴, Daria Gognieva¹,⁴, Sergey Avdeev²,⁵, Vladimir Betelin³ and Philipp Kopylov¹,³,⁴

Reviewer 1: José Leija-Martínez

Reviewer 2: Anonymous

Submission received: 20 September 2025 / Revised: 19 October 2025 / Accepted: 27 October 2025 / Published: 28 October 2025

(This article belongs to the Section Clinical Diagnosis and Prognosis)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Dear authors,

This study integrates exhaled volatile organic compound (VOC) profiling using PTR-TOF-MS with clinical biomarkers (FeNO, eosinophils, IgE) and spirometry to differentiate asthma, COPD, and healthy controls, and to predict bronchodilator responsiveness (BDR). It employs machine-learning algorithms (XGBoost) and reports high diagnostic accuracy (AUC up to 1.000) for predicting BDR.

The manuscript addresses a clinically relevant and innovative topic within precision respiratory medicine and metabolomics.

Major Comments:

1. Study Design and Population

The cross-sectional design limits the causal interpretation of observed associations between VOCs, biomarkers, and bronchodilator responsiveness. The authors should acknowledge this limitation earlier (in the Abstract or Introduction) and justify why this design is appropriate for exploratory biomarker discovery.

Exclusion of asthma-COPD overlap (ACO) is appropriate, but it should be discussed how this exclusion may affect generalizability, as overlap phenotypes are common in real-world settings.

The control group characteristics (age and smoking history) differ significantly from those of the patients, introducing potential confounding. The post-hoc propensity matching is mentioned, but the results should be summarised in the main text (currently only in Supplementary Tables S3–S4).

2. Statistical and Machine-Learning Analysis

The feature-selection process using XGBoost and resampling is well described; however, external validation or independent testing is lacking. A model achieving AUC = 1.000 without external validation suggests potential overfitting. The authors should explicitly state the measures taken to minimise this risk and propose how future work will validate the model.

The sample size for each subgroup and for BDR prediction should be justified with a power calculation, or at least discussed in terms of model robustness, given the number of predictors used.

Clarify whether cross-validation was stratified by disease group to prevent data leakage.
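To make the request concrete, the following is a minimal sketch of fold assignment stratified by disease group, so that each fold keeps roughly the same proportion of asthma, COPD, and control samples. The group sizes, the function name `stratified_folds`, and the seed are illustrative assumptions, not details taken from the manuscript.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k, seed=0):
    """Assign sample indices to k folds, balancing each label across folds."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for idx, label in enumerate(labels):
        by_group[label].append(idx)
    folds = [[] for _ in range(k)]
    for group in sorted(by_group):
        indices = by_group[group]
        rng.shuffle(indices)
        # Deal the shuffled indices of this group round-robin over the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds

# Hypothetical cohort: group sizes are made up for illustration.
labels = ["asthma"] * 10 + ["copd"] * 10 + ["control"] * 5
folds = stratified_folds(labels, k=5)
```

Stratification only balances the groups. To avoid leakage, any feature selection or hyperparameter tuning would also need to be repeated inside each training split rather than performed once on the full dataset.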

It would be helpful to provide a feature-importance plot or SHAP visualisation to visualise predictor contributions to the final model.

In addition, the reported BDR performance is improbably high and suggests potential target leakage. The BDR model reports AUC = 1.000 (tidal) and 0.970 (forced), which is exceptional and unusual in clinical prediction.

Please confirm that no post-bronchodilator variables (e.g., FEF25-75 post-BD) or any transforms that “peek” at post-intervention data entered the predictor set before determining BDR status. Your Table 4 lists post-BD flows among predictors, which likely leaks outcome information and explains perfect/near-perfect discrimination. Restrict predictors to pre-bronchodilator data plus baseline VOCs/biomarkers; then re-estimate performance.
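One simple way to enforce this restriction is to filter the predictor list before any model is fitted. The sketch below assumes a hypothetical column-naming convention in which post-bronchodilator measurements carry a `post_bd` marker; the column names are invented for illustration and do not come from the manuscript's Table 4.

```python
def pre_bd_predictors(columns, post_markers=("post_bd", "post-bd", "postbd")):
    """Keep only columns whose names do not mark a post-bronchodilator value."""
    return [
        name for name in columns
        if not any(marker in name.lower() for marker in post_markers)
    ]

# Hypothetical predictor names (assumed naming scheme).
columns = [
    "fev1_pre_bd", "fef2575_pre_bd", "fef2575_post_bd",
    "feno", "eosinophils", "mz_79_054", "mz_101_039",
]
allowed = pre_bd_predictors(columns)
```

A name-based filter is only as reliable as the naming scheme, so a manual check of the final predictor list would still be advisable.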

Provide calibration (slope/intercept), precision-recall curves, decision-curve analysis, and confidence intervals for AUC, sensitivity, and specificity; avoid a fixed 0.5 threshold unless justified.
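As an illustration of the confidence intervals requested here, the sketch below computes AUC from the rank definition (the probability that a positive case scores higher than a negative one, counting ties as half) and a percentile bootstrap interval. The labels and scores are made-up toy values, and the percentile bootstrap is one possible choice among several interval methods.

```python
import random

def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap CI; resamples lacking one class are redrawn."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # both classes must be present
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy data, invented for illustration only.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.90, 0.60]
point = auc(labels, scores)
lower, upper = bootstrap_auc_ci(labels, scores)
```

Note that when every positive case outranks every negative case (AUC = 1.000), each resample also yields 1.0, so this interval collapses to a single point and cannot by itself reveal overfitting.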

Inconsistent definition of positive BDR: In Results, BDR is described as “FEV1 ≥ 10% from baseline,” while Methods cite ERS/ATS 2022 criteria as ΔFEV1 or ΔFVC > 10% predicted—these are not equivalent. Align definitions across sections, re-label all tables/figures accordingly, and re-analyse if needed.
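A small worked example shows why the two definitions are not interchangeable. The sketch below assumes FEV1 only (the FVC criterion is omitted for brevity) and uses invented spirometry values; the function names are hypothetical.

```python
def bdr_vs_baseline(pre_l, post_l, threshold_pct=10.0):
    """Positive if FEV1 rises by at least threshold_pct of the baseline value."""
    change_pct = (post_l - pre_l) / pre_l * 100.0
    return change_pct >= threshold_pct

def bdr_vs_predicted(pre_l, post_l, predicted_l, threshold_pct=10.0):
    """Positive if the FEV1 rise exceeds threshold_pct of the predicted value."""
    change_pct = (post_l - pre_l) / predicted_l * 100.0
    return change_pct > threshold_pct

# Invented patient: FEV1 1.50 L before, 1.68 L after, 3.00 L predicted.
pre, post, predicted = 1.50, 1.68, 3.00
baseline_positive = bdr_vs_baseline(pre, post)               # change is 12% of baseline
predicted_positive = bdr_vs_predicted(pre, post, predicted)  # change is 6% of predicted
```

With these numbers the same patient is BDR-positive under the baseline-relative definition and BDR-negative under the predicted-relative one, so the choice of definition changes the outcome labels the model is trained on.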

3. VOC Annotation and Chemical Identification

The discussion correctly highlights the limitations of VOC annotation; however, the lack of MS/MS confirmation weakens the chemical interpretation. Summarising putative compounds (m/z, possible chemical identity, biological relevance, literature support) in the main text would improve readability.

The authors refer to “formic acid-related fragments” and “protonated propylene glycol” but do not provide exact references for these identifications. They should clarify whether these compounds derive from endogenous metabolism or exogenous exposures (e.g., inhaler propellants).

4. Clinical Relevance and Interpretation

The manuscript should emphasise how VOCs could be translated into clinical decision-making. For example, how could VOC measurement complement or replace FeNO and eosinophil testing?

Discuss potential biological mechanisms linking VOCs (e.g., m/z 79, 101) with airway inflammation and bronchodilator responsiveness.

The authors should address whether BMI differences between BDR-positive and negative patients might partially explain VOC variation, since obesity modifies exhaled metabolite profiles.

5. Presentation and Figures

Some tables are dense and contain redundant P-values (e.g., all “1.000” comparisons). The presentation would benefit from summarising only significant comparisons.

Include confidence intervals for AUC, sensitivity, and specificity values in Figures 1 and S1.

The study design figure (Fig. 2) could be improved with a more precise depiction of data collection, analysis pipeline, and modelling workflow.

Minor Comments:

Abstract: The phrase “exceptional accuracy (AUC = 1.000)” should be toned down; such perfect performance suggests overfitting and should be cautiously phrased (“very high accuracy in internal validation”).

Keywords: Please remove the hyphen in “ma-chine learning.”

Introduction: Replace “The search for new bi-omarkers may help identify…” with “The search for novel biomarkers may help identify…” for smoother flow.

Tables: Replace commas with dots in numerical P-values (e.g., “0,021” → “0.021”) to comply with journal formatting.

Section 2.3: Correct the phrase “FeNO, blood eosinophil 10*9/L” to “Blood eosinophils (×10⁹/L)”.

Figure references: In Results, “Figure 2” is cited before being introduced; ensure sequential numbering.

English polishing: Minor grammatical adjustments would enhance clarity (e.g., “patients did not differ in smoking status (p = 0.065)” → “Smoking status did not differ significantly between groups (p = 0.065)”).

Supplementary materials: Ensure all tables/figures (S1–S4) are correctly referenced in the text with short explanatory sentences.

Comments on the Quality of English Language

Dear authors,

The manuscript is generally well-written and understandable, but it would benefit from a light-to-moderate language edit to enhance clarity, precision, and consistency with the journal's style.

Specific recommendations

Terminology & consistency

Use one form throughout for key terms (e.g., bronchodilator response [BDR], tidal vs forced breathing, asthma vs BA).

Keep biomarker and spirometry acronyms consistent (FeNO, FEV₁, FEF₂₅–₇₅, IgE, eosinophils).

Abbreviations

Define each abbreviation at first mention in the Abstract and again in the main text; avoid redefining later.

Do not introduce abbreviations that are used fewer than three times.

Tense & voice

Methods/Participants: past tense ("we measured…", "samples were analysed…").

Results: past tense.

Conclusions/Implications: present tense.

Prefer explicit, active constructions where possible.

Units, numbers, and symbols

Use SI units and consistent formatting: a space between the number and unit (e.g., 50 mL, 10%), leading zeros for values <1 (e.g., 0.05), and decimal points (not commas).

Write P values as P < 0.001, P = 0.032 (capital P; no trailing zeros).

Standardise eosinophils (×10⁹/L) and IgE (IU/mL).

Hyphenation & typography

Hyphenate compound adjectives before a noun: "post-bronchodilator FEV₁," "machine-learning model," "case–control design."

Use en dashes for numeric ranges (e.g., 25–75).

Spell m/z consistently.

Figures and tables

Ensure captions are self-contained (expand acronyms at first use in each caption).

Standardise decimal places and align columns; avoid p-values written as "1.000".

Sentence economy & clarity

Remove filler ("It should be noted that…", "In fact,").

Prefer precise claims linked to evidence: replace "significantly increased" with the estimate and CI.

Common micro-edits (apply globally)

"data are" (not "data is").

"composed of" (not "comprised of" for part–whole).

"patients who" (not "patients which").

"Compared with" is preferred in formal writing.

Avoid anthropomorphism where unnecessary ("the model indicates…" rather than "the model thinks…").

Use consistent spelling (British or American) throughout; do not mix.

Examples of style tightening

Weak: "It should be noted that the results were significantly better." Stronger: "Results were better (AUC 0.86; 95% CI …)."

Weak: "There was a trend which was close to significance." Stronger: "The association was imprecise (P = 0.065; 95% CI overlaps the null)."

Recommendation
After methodological revisions, a professional copy-edit or careful author revision focusing on the points above will make the prose clear and publication-ready.

Author Response

Thank you very much for taking the time to review this manuscript. This document provides detailed responses, as well as relevant corrections to the manuscript. The text has been checked by a native English speaker; the corresponding certificate is attached.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The authors present a manuscript which uses PTR-MS to differentiate between healthy controls and lung disease patients specifically when looking at bronchodilator responsiveness. I think this is an interesting study with a good sample size.

Abstract - these are not strictly VOCs although they may be masses associated with specific VOCs. "Specific VOCs (m/z 79, 101) were significantly associated with a positive BDR test."

Also I'm not sure VOC footprints is common terminology?

Could the authors provide information on the model's ability to discriminate the specific diseases from healthy etc. also in the abstract?

Introduction - I think the introduction is well written and has a tight focus on the specific study being undertaken. There may be scope to introduce some additional references regarding the utility of VOCs in the clinical diagnosis and management of the conditions studied.

Results - There are some confounding factors in the patient demographics table such as predominance of males in COPD and predominance of smokers in this group also.

Could be more specific about the separation into training and test sets "For the separate analysis of normal and forced expiration, the data were divided into training and test sets."

If you have the exact mass can you not predict the molecular formula? "m/z=79.054, m/z=95.054, m/z=44.991 (presumably corresponds to formic acid-related fragment), m/z=71.055, and m/z=53.037"

This footnote in the table is not clear "*m/z=44.991 presumably corresponds to the formic acid-related fragment, 141 ppm mass error (the putative chemical was identified using Ionicon libraries, the Human Metabolome Database and literature data)"

What is the rationale for selecting 11 VOCs in the models. Does this introduce the possibility of overfitting within the models. How does this compare to the total number of features identified for each condition.

I don't really understand the significance of highlighting certain VOCs - could all the fragments not be associated with potential VOCs? "*mass per charge value (m/z). **m/z=77.059 presumably corresponds to protonated propylene glycol (the putative chemical was identified using Ionicon libraries, the Human Metabolome Database and literature data)"

Where are the AUC analyses for the BDR? These are mentioned in the abstract, whereas the ones presented in the results are not mentioned in the abstract. Are the AUC results stated based on VOCs combined with other markers?

Discussion

There was quite a lot of overlap between asthma and COPD in terms of m/z fragments. Was this expected? What are the likely reasons for this? Similar inflammatory response etc.?

I don't understand this point as if you had accurate mass then at least you could predict the formula and confirm the likely fragments identified in the results?

"Unfortunately, the annotation of ions in our study using the Ionicon libraries and the HMBD, had limitations, and the chemical origin of the ions m/z 53, 71, 79, 95 was not determined."

I would say it is more likely to be acetaldehyde? This is not clear in terms of which compound the reference refers to "VOC with m/z 45 probably corresponds to formic acid or acetaldehyde, related to food consumption and associated with COPD in the literature [13]."

Experimental

Overall there is a good level of scientific detail throughout the section.

In this section you state that you worked out the molecular formula for the m/z fragments - so I don't understand why this isn't reported in the results as this information would be useful for other researchers?

Author Response

Thank you for receiving our manuscript ID: diagnostics-3914224 «Integrated Exhaled VOC and Clinical Biomarker Profiling for Predicting Bronchodilator Responsiveness in Asthma and COPD Patients» and considering it for review. Thank you for your high appreciation of our work.

There are responses to the reviewer comments below.

Responses to Reviewer 2 comments

Abstract

Comments 1.1 These are not strictly VOCs although they may be masses associated with specific VOCs. "Specific VOCs (m/z 79, 101) were significantly associated with a positive BDR test."

Response 1.1 We have corrected the phrase to «Specific mass spectral features (m/z 79, m/z 101) were significantly associated with a positive BDR test».

Comments 1.2 Also I'm not sure VOC footprints is common terminology?

Response 1.2 We agree with the comment and have corrected the phrase to "VOC signatures".

Comments 1.3 Could the authors provide information on the model's ability to discriminate the specific diseases from healthy etc. also in the abstract?

Response 1.3 More detailed information on the differences between asthma and COPD and controls has been added to the abstract.

Introduction

Comments 2.1 I think the introduction is well written and has a tight focus on the specific study being undertaken. There may be scope to introduce some additional references regarding the utility of VOCs in the clinical diagnosis and management of the conditions studied.

Response 2.1 References to the Introduction section of the article have been added.

Results

Comments 3.1 There are some confounding factors in the patient demographics table such as predominance of males in COPD and predominance of smokers in this group also.

Response 3.1 Thank you for your comment. We noted these factors in the Study Limitations section and attempted to mitigate them by presenting the data in Supplementary materials.

Comments 3.2 Could be more specific about the separation into training and test sets "For the separate analysis of normal and forced expiration, the data were divided into training and test sets."

Response 3.2 We have added a note about the statistical analysis in the Materials and Methods section.

Comments 3.3 If you have the exact mass can you not predict the molecular formula? "m/z=79.054, m/z=95.054, m/z=44.991 (presumably corresponds to formic acid-related fragment), m/z=71.055, and m/z=53.037"

Response 3.3 Unfortunately, we were unable to find precise identifications for these ions using the IONICON and HMDB libraries and literature data, taking into account an error of 200 ppm. Therefore, we only make assumptions about the chemical nature of these VOCs in the Discussion and Limitations sections.

Comments 3.4 This footnote in the table is not clear "*m/z=44.991 presumably corresponds to the formic acid-related fragment, 141 ppm mass error (the putative chemical was identified using Ionicon libraries, the Human Metabolome Database and literature data)"

Response 3.4 We indicated that in this case we were able to accurately identify this ion with m/z = 44.991 as a fragment of formic acid; we also indicated the error, which is within the acceptable limits for identification up to 200 ppm.
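The 200 ppm tolerance discussed in these responses can be illustrated with a short calculation of mass error against candidate ion formulas. This is a sketch under stated assumptions: the two candidate cations (CHO2+ as a formic acid-related fragment and C2H5O+ as protonated acetaldehyde) are illustrative guesses rather than the authors' assignments, and the result need not match the 141 ppm quoted in the table footnote, which depends on the species and reference mass the authors used.

```python
# Monoisotopic masses in unified atomic mass units.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
ELECTRON = 0.00054858

def cation_mz(formula):
    """m/z of a singly charged cation with the given elemental composition."""
    return sum(MONOISOTOPIC[el] * n for el, n in formula.items()) - ELECTRON

def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

# Illustrative candidates only (assumption, not the authors' annotation).
candidates = {
    "CHO2+": {"C": 1, "H": 1, "O": 2},
    "C2H5O+": {"C": 2, "H": 5, "O": 1},
}
observed = 44.991
errors = {name: ppm_error(observed, cation_mz(f)) for name, f in candidates.items()}
within_200ppm = {name: abs(err) <= 200.0 for name, err in errors.items()}
```

Under these assumptions only the first candidate falls inside the 200 ppm window, which is the kind of check Reviewer 2 appears to be asking for when suggesting that an accurate mass should at least constrain the molecular formula.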

Comments 3.5 What is the rationale for selecting 11 VOCs in the models. Does this introduce the possibility of overfitting within the models. How does this compare to the total number of features identified for each condition.

Response 3.5 As we mentioned in the article: For the separate analysis of normal and forced expiration, the data were divided into training and test sets. The training set was used to select the hyperparameters of the subsequent boosting classifier model. After selecting hyperparameters, features were selected separately for each type of breathing. Ranking was performed using the feature importance indicator for the boosting classifier. The top 10% of predictors by feature importance were selected.
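The selection step described in this response can be sketched as follows. The importance scores are invented, the feature names are placeholders, and rounding the 10% cut-off up to the next whole feature is an assumption, since the response does not say how fractional counts were handled.

```python
def top_percent(importances, percent=10):
    """Return the names of the top `percent` of features by importance."""
    n_features = len(importances)
    # Integer ceiling division: round the cut-off up, keep at least one feature.
    n_keep = max(1, -(-n_features * percent // 100))
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:n_keep]]

# Hypothetical importances for 20 placeholder features.
importances = {f"feature_{i:02d}": 1.0 / (i + 1) for i in range(20)}
selected = top_percent(importances)
```

Because the number of retained features scales with the total feature count, reporting that total alongside the retained set would help readers judge the overfitting risk raised in the comment.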

Comments 3.6 I don't really understand the significance of highlighting certain VOCs - could all the fragments not be associated with potential VOCs? "*mass per charge value (m/z). **m/z=77.059 presumably corresponds to protonated propylene glycol (the putative chemical was identified using Ionicon libraries, the Human Metabolome Database and literature data)".

Response 3.6 In our study, we attempted to determine the chemical nature of the most significant predictors of asthma and COPD diseases, as well as those associated with bronchial hyperreactivity, in order to determine their possible pathophysiological and clinical significance.

Comments 3.7 Where are the AUC analyses for the BDR? These are mentioned in the abstract, whereas the ones presented in the results are not mentioned in the abstract. Are the AUC results stated based on VOCs combined with other markers?

Response 3.7 A graphical representation of the AUC results for BDR is provided in the Supplementary Materials. We have also added AUC data for differences from the control groups of asthma and COPD patients to the abstract.

Discussion

Comments 4.1 There was quite a lot of overlap between asthma and COPD in terms of m/z fragments. Was this expected? What are the likely reasons for this? Similar inflammatory response etc.?

Response 4.1 The observed overlap in VOC profiles between asthma and COPD was, to a significant extent, expected and can be attributed to several pathophysiological factors. While asthma and COPD are distinct diseases, they are both chronic inflammatory disorders of the airways. This shared nature leads to common biochemical processes that are reflected in the exhaled metabolome: oxidative stress and airway remodeling. Both diseases are characterized by a significant burden of oxidative stress, driven by reactive oxygen species (ROS) from inflammatory cells (e.g., neutrophils, eosinophils, macrophages) and environmental exposures (e.g., cigarette smoke). Lipid peroxidation, a key consequence of oxidative stress, generates a range of volatile aldehydes (e.g., hexanal, heptanal, nonanal) and other hydrocarbons that can contribute to overlapping VOC signals. For instance, compounds related to benzene or toluene derivatives (potentially reflected in our m/z 79 and m/z 95 signals) are known markers of oxidative stress and have been reported in both conditions. Both asthma and COPD involve structural changes in the airways, including basement membrane thickening, fibrosis, and angiogenesis. These processes involve the breakdown and synthesis of extracellular matrix components, which can release volatile metabolites that are not disease-specific.

Comments 4.2 I don't understand this point as if you had accurate mass then at least you could predict the formula and confirm the likely fragments identified in the results?

Response 4.2 We were unable to identify the presented ions with an accuracy of 200 ppm using the IONICON and HMDB libraries and literature data; we can therefore only make assumptions about the chemical nature of the identified compounds.

Comments 4.3 I would say it is more likely to be acetaldehyde? This is not clear in terms of which compound the reference refers to "VOC with m/z 45 probably corresponds to formic acid or acetaldehyde, related to food consumption and associated with COPD in the literature [13]."

Response 4.3 Unfortunately, we are unable to accurately annotate this ion; it was not found in the IONICON library, HMDB, or literature data. Therefore, we speculate that this ion with m/z = 45 may be a fragment of formic acid or acetaldehyde.

Experimental

Comments 5.1 In this section you state that you worked out the molecular formula for the m/z fragments - so I don't understand why this isn't reported in the results as this information would be useful for other researchers?

Response 5.1 We have added Table 5 to the Discussion section, listing the putative chemical nature of the ion predictors we identified, so that their clinical significance can be determined in future studies.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Dear authors,

The revised version of your manuscript represents a considerable improvement over the original submission. The authors have successfully addressed the previous reviewer's concerns and enhanced the paper's scientific and editorial quality. The integration of methodological refinements and additional analyses has strengthened the robustness of the findings. The study is now suitable for publication in its current form.

1) Enhancements for Major Comments:

Study Design and Methodology

The study design is now clearly defined as cross-sectional, clarifying the exploratory nature and discovery purpose of this work (v2, Introduction, lines 98-101).

The inclusion of a propensity-score-matching analysis to control for confounding factors such as age, sex, and smoking status significantly improves the internal validity of comparisons between groups. This enhancement was not present in version 1.

Statistical and Analytical Refinement

The new version includes multivariate regression analyses (Supplementary Table S4) and propensity-matched cohort characteristics (Supplementary Table S3), which substantially increase the analytical depth and reproducibility of the results.

The AUC values for asthma and COPD models have been corrected and aligned between text and figures. ROC performance values are now coherent (AUC = 0.747 / 0.710 for asthma and 0.821 / 0.856 for COPD).

The authors clarified the contribution of each predictor variable (FeNO, eosinophils, IgE, and VOCs at m/z 79.054 and 101.039) to the model’s performance, thereby improving transparency.

Results and Data Presentation

Table 1 and related text now include consistent statistical notation and clear group comparisons using standardised P-values.

The expanded description of the machine-learning approach (XGBoost) with explicit feature-importance outputs strengthens the interpretation of predictive biomarkers.

The figures and legends are more precise and better aligned with the results.

Discussion and Clinical Interpretation

The discussion has been expanded with mechanistic insights linking the identified VOCs (m/z 79.054, 101.039) to oxidative and inflammatory pathways in airway disease.

The revised version more effectively highlights the translational potential of integrating VOCs and biomarkers for the personalised management of asthma and COPD.

2) Enhancements for Minor Comments

Please ensure consistency in the use of abbreviations (e.g., “BA” vs. “asthma”) throughout the text.

Figure legends could more clearly indicate whether models were based on tidal or forced breathing.

A final proofreading step for typographical uniformity and punctuation in references is recommended.
