Article
Peer-Review Record

An Innovative Deep Learning Approach for Ventilator-Associated Pneumonia (VAP) Prediction in Intensive Care Units—Pneumonia Risk Evaluation and Diagnostic Intelligence via Computational Technology (PREDICT)

J. Clin. Med. 2025, 14(10), 3380; https://doi.org/10.3390/jcm14103380
by Geoffray Agard 1,2,3,*, Christophe Roman 3, Christophe Guervilly 1,*, Jean-Marie Forel 1,2, Véronica Orléans 2,4, Damien Barrau 1,2, Pascal Auquier 2, Mustapha Ouladsine 3, Laurent Boyer 2,4 and Sami Hraiech 1,2
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 1 April 2025 / Revised: 4 May 2025 / Accepted: 8 May 2025 / Published: 13 May 2025
(This article belongs to the Special Issue Innovations in Perioperative Anesthesia and Intensive Care)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors
  1. Did the authors consider different VAP prevention protocols (such as head-of-bed elevation, oral care, or sedation protocols) used in ICU during the years 2008 to 2019? These can reduce VAP, and if they changed over the 11 years of data, the prediction might not be accurate. The authors did not adjust for this bias in clinical practice variation. They must explain this or add it as a limitation.
  2. Was there any adjustment or stratification based on prior antibiotic use before VAP onset? If patients had already received antibiotics, this could delay or mask VAP, affecting the model's ability to predict real infections. However, the study did not adjust for this, which introduces antibiotic exposure bias. The authors need to include this point or explain why these data were not used.
  3. The study does not say how the model’s predictions will be used. A tool that gives frequent VAP alerts could make doctors give more antibiotics, not less. This could create over-prescription problems and resistance. The authors should suggest a guideline on how to respond to the model’s alerts or state this as a risk.
  4. Why did the authors only use five basic vital signs? Deep learning is powerful and can learn from more data, like labs or X-rays. By using only vitals they risk underfitting and missing important clinical signals. They should explain this choice more clearly or call it a model limitation.
  5. Can the PREDICT model run on an ICU monitor or on a hospital server? There is no mention of model size, computing time or memory use. These are technical barriers that the authors must describe or admit as future work needed.
  6. How does the model improve key ICU goals, like reducing ventilation days or saving antibiotics? The study shows high prediction accuracy but no real outcome changes, so it is hard for readers to know whether the model helps in real life. This gap must be noted as a limitation, along with the need for future prospective testing.
  7. Did the model account for type of airway access (endotracheal vs tracheostomy) and ventilation mode? Tracheostomy versus endotracheal tube and different ventilator settings affect lung infections, but the study does not include this. This is an omission, because ventilation-type bias is well known in VAP studies. The authors must add this missing factor to the limitations.
  8. How did the authors handle temporal trends in ICU care over 11 years of data, from 2008 to 2019? Medical care changed a lot during this time, and this could affect VAP rates. If this was not accounted for, the model might be learning patterns from old practices rather than current ones. The authors must say whether they tested for secular trends. If not, they must acknowledge this as a source of bias.
  9. Why did the authors not perform an external validation on another dataset? The model was trained and tested only on MIMIC-IV, from one hospital. We don’t know if it works in Europe, Asia, or even another American hospital. This is a major flaw. The authors must say this is a future step and add it clearly in the discussion as a limitation.
  10. The authors did not present calibration plots or Brier scores. They used only AUPRC and AUROC. These metrics show whether the model can rank patients correctly, but not whether the predicted probabilities are accurate. For example, a model can report an 80% VAP risk when it is only 20%. Without calibration, this is misleading. Calibration is especially important in clinical models. The authors must provide calibration plots (e.g., reliability curves) and Brier scores. If they cannot do this now, they must say so clearly in the statistical limitations and plan for it in future versions.
  11. How did the authors control for label leakage between the observation and prediction windows? If signs of early VAP already appear in the observation window and the model predicts VAP in the next 6 hours, then it is not really “predicting” the future, it is just detecting the present. This causes over-optimistic results. The authors must clarify how they avoided this, perhaps with a sensitivity analysis. If this was not done, it is a methodological flaw and must be stated as a study limitation.

Author Response

1. Did the authors consider different VAP prevention protocols (such as head-of-bed elevation, oral care, or sedation protocols) used in ICU during the years 2008 to 2019? These can reduce VAP, and if they changed over the 11 years of data, the prediction might not be accurate. The authors did not adjust for this bias in clinical practice variation. They must explain this or add it as a limitation.

We thank the reviewer for highlighting this important point. Compliance with ventilator-associated pneumonia (VAP) prevention bundles in the ICU indeed represents a potential source of bias over the long study period (2008–2019).

To address this concern, we reviewed recent literature analyzing the MIMIC-IV cohort, notably the study by Leong et al. (2024) [Leong YH, et al., Anesthesiology and Perioperative Science], which specifically evaluated compliance with VAP prevention protocols within the MIMIC-IV database. Their results demonstrated that compliance with the Institute for Healthcare Improvement (IHI) ventilator care bundle was extremely low overall: only 0.3% of patients achieved full bundle compliance, and only head-of-bed elevation was routinely implemented (89% compliance), while other interventions such as sedation protocols, oral care, or prophylaxis measures had very low adherence rates.

These findings suggest that throughout the 2008–2019 period, major changes or systematic improvements in VAP prevention bundle compliance were unlikely to significantly impact the general incidence of VAP recorded in MIMIC-IV. Consequently, the low and stable compliance likely limits the bias introduced by evolving prevention protocols on our predictive model.

However, we acknowledge that even low variations in clinical practice could influence VAP risk to some extent. Therefore, we have added a discussion of this point in the limitations section of the revised manuscript, explicitly stating that variations in bundle compliance and ICU protocols over time represent a potential residual confounder that could not be fully adjusted for in our model.

Reference added:
Leong YH, Khoo YL, Abdullah HR, Ke Y. Compliance to ventilator care bundles and its association with ventilator-associated pneumonia. Anesthesiology and Perioperative Science. 2024;2:20. doi:10.1007/s44254-024-00059-1.

Addition to the manuscript in the discussion section:

“Another limitation of our study is that we did not adjust for potential variations in VAP prevention practices over the 2008–2019 period. Changes in clinical protocols, such as increased use of ventilator care bundles (including head-of-bed elevation, sedation interruption, oral care, and prophylaxis strategies), could theoretically influence VAP incidence and thus affect the model’s learning. However, a recent analysis of the MIMIC-IV cohort by Leong et al. (2024) [31] showed that compliance with VAP prevention bundles was consistently low across this period, with only head-of-bed elevation being routinely applied. Compliance with the full bundle remained extremely rare (<1%), suggesting that systematic practice changes were limited during the study timeframe. Nonetheless, we recognize that even small shifts in clinical behavior could introduce residual confounding, and this should be taken into account when interpreting the model's generalizability.”

 

 

2. Was there any adjustment or stratification based on prior antibiotic use before VAP onset? If patients had already received antibiotics, this could delay or mask VAP, affecting the model's ability to predict real infections. However, the study did not adjust for this, which introduces antibiotic exposure bias. The authors need to include this point or explain why these data were not used.

We thank the reviewer for this insightful comment. We acknowledge that prior antibiotic exposure could potentially influence the timing of microbiological confirmation of VAP. However, our model is specifically based on the analysis of variations in vital signs — such as respiratory rate, oxygen saturation, temperature, heart rate, and mean arterial pressure — rather than on microbiological or therapeutic data.

The goal of the PREDICT model is to detect clinical patterns suggestive of the physiological onset of pneumonia, independently of microbiological documentation or antibiotic administration. Even if antibiotic exposure delays bacterial growth or masks microbiological confirmation, the underlying physiological alterations typically associated with VAP (such as increased respiratory rate, impaired oxygenation, and temperature variations) would still develop and be detectable through vital sign monitoring.

Therefore, while prior antibiotic use may affect the timing of traditional VAP diagnosis, it is unlikely to alter the emergence of the clinical patterns that the model is trained to recognize. For these reasons, we did not adjust for prior antibiotic exposure in the model. Nonetheless, we have added a sentence in the limitations section to acknowledge that antibiotic exposure could introduce a degree of diagnostic uncertainty, which may impact annotation but not necessarily the clinical signal detection.

Addition to the manuscript in the discussion section:

“Our model did not adjust for prior antibiotic use before VAP onset. Although antibiotic exposure could delay microbiological confirmation of infection, PREDICT is based solely on vital sign patterns and aims to detect the physiological manifestations of VAP. Nonetheless, prior antibiotic therapy could introduce some diagnostic uncertainty, which may impact annotation accuracy and represents a potential source of bias.”

 

3. The study does not say how the model’s predictions will be used. A tool that gives frequent VAP alerts could make doctors give more antibiotics, not less. This could create over-prescription problems and resistance. The authors should suggest a guideline on how to respond to the model’s alerts or state this as a risk.

We appreciate this important point. We agree that inappropriate or excessive use of the model's alerts could paradoxically increase antibiotic prescription. As suggested, we added in the discussion that any clinical deployment of the model should be accompanied by strict diagnostic stewardship protocols to mitigate this risk, and that uncontrolled use could worsen antimicrobial resistance.

Addition to the manuscript in the discussion section:

“Although our algorithm offers promising predictive capabilities, its use must be carefully integrated into clinical workflows to avoid promoting unnecessary antibiotic use. Model outputs should support, not replace, clinical judgment, and guidelines for interpreting alerts will be necessary to prevent overprescription and resistance development. Rather than directly prompting antibiotic initiation, a high-risk prediction should lead to clinical reassessment and consideration of additional diagnostics, such as respiratory sampling. Stewardship protocols must frame its use to avoid overtreatment or unnecessary antimicrobial exposure.”

 

4. Why did the authors only use five basic vital signs? Deep learning is powerful and can learn from more data, like labs or X-rays. By using only vitals they risk underfitting and missing important clinical signals. They should explain this choice more clearly or call it a model limitation.

Our choice to use only five basic vital signs was intentional. The objective was to develop a simple, easily implementable model based solely on routinely monitored parameters, without needing additional laboratory or imaging data. This design choice aimed to facilitate bedside integration. However, we acknowledge that the exclusion of broader clinical data could limit the model's predictive potential, and we have addressed this point in the discussion section.

Addition to the manuscript in the discussion section:

“In this work we deliberately restricted the model to five basic vital signs to enhance feasibility for real-time deployment without relying on comprehensive data integration. While this design choice favors simplicity and broad applicability, it may limit the model’s ability to capture additional clinical signals available from laboratory, imaging, or ventilatory data, which could be explored in future model iterations.”

 

5. Can the PREDICT model run on an ICU monitor or on a hospital server? There is no mention of model size, computing time or memory use. These are technical barriers that the authors must describe or admit as future work needed.

We thank the reviewer for this comment. The PREDICT model is relatively lightweight, and inference times are compatible with real-time bedside application. However, we agree that a formal evaluation of computational requirements and latency was not performed in this study. We have added a sentence in the discussion section addressing the feasibility of implementing the algorithm at the patient's bedside.

Addition to the manuscript in the discussion section:

“Indeed, the model’s compact architecture achieves inference times on the order of a few seconds, making bedside application feasible with modest computational resources.”

 

6. How does the model improve key ICU goals, like reducing ventilation days or saving antibiotics? The study shows high prediction accuracy but no real outcome changes, so it is hard for readers to know whether the model helps in real life. This gap must be noted as a limitation, along with the need for future prospective testing.

We recognize that although PREDICT shows excellent predictive performance, its actual impact on clinical outcomes (ventilation duration, antibiotic consumption, mortality) remains untested. We have added in the discussion that future prospective interventional studies are necessary to demonstrate real-world clinical benefits.

Addition to the manuscript in the discussion section:

“High predictive accuracy alone does not ensure clinical benefit, and there is a risk that systematic application of predictive alerts could inadvertently lead to overtreatment or prolong mechanical ventilation if not integrated with appropriate stewardship strategies. Therefore, beyond demonstrating technical performance, future prospective studies must evaluate whether PREDICT effectively reduces ventilation days, optimizes antibiotic use, and improves patient-centered outcomes without introducing new harms.”

 

7. Did the model account for type of airway access (endotracheal vs tracheostomy) and ventilation mode? Tracheostomy versus endotracheal tube and different ventilator settings affect lung infections, but the study does not include this. This is an omission, because ventilation-type bias is well known in VAP studies. The authors must add this missing factor to the limitations.

We thank the reviewer for this important observation. The training cohort included patients who underwent both forms of invasive ventilation — via endotracheal tube and tracheostomy — reflecting real-world ICU populations. While we agree that both airway interface and ventilation mode may influence the risk and pathophysiology of VAP, these variables were not incorporated into the model, primarily to maintain the focus on physiological signal analysis and maximize implementation simplicity. The central aim of this study was to assess whether variations in basic vital signs alone could predict VAP with high accuracy, which was confirmed by the model's strong performance metrics. Nevertheless, we recognize the potential added value of these clinical parameters, and we plan to explore their contribution in future work by integrating airway interface and ventilation mode into subsequent model iterations. A corresponding statement has been added to the Limitations section of the manuscript.

Addition to the manuscript in the discussion section:

“Although both endotracheal and tracheostomized patients were included in the training cohort, we did not account for airway type or ventilation mode in the model. While these factors are known to influence VAP risk, our objective was to evaluate the predictive value of physiological signals alone. Future work will explore whether incorporating these variables can further improve prediction accuracy.”

 

8. How did the authors handle temporal trends in ICU care over 11 years of data, from 2008 to 2019? Medical care changed a lot during this time, and this could affect VAP rates. If this was not accounted for, the model might be learning patterns from old practices rather than current ones. The authors must say whether they tested for secular trends. If not, they must acknowledge this as a source of bias.

We thank the reviewer for this relevant observation. Our dataset spans an 11-year period (2008–2019), during which clinical practices in ICU may have evolved, including VAP prevention strategies, antibiotic stewardship, and ventilator management. We did not include temporal variables or stratify model training by admission year, and we acknowledge that secular trends could introduce bias, especially if patterns learned by the model reflect outdated practices. However, the choice to rely exclusively on vital signs — core physiological data — was intended to minimize dependence on temporally variable practices. We have now added this point explicitly as a limitation in the revised manuscript.

Addition to the manuscript in the discussion section:

“The question also arises of how changes in critical care practices over this period may have affected the model. We did not adjust for temporal trends, which could introduce bias if clinical changes over time influenced VAP incidence or diagnostic patterns. However, the exclusive use of physiological signals may mitigate this effect, as these core variables remain consistent across practice eras.”

 

9. Why did the authors not perform an external validation on another dataset? The model was trained and tested only on MIMIC-IV, from one hospital. We don’t know if it works in Europe, Asia, or even another American hospital. This is a major flaw. The authors must say this is a future step and add it clearly in the discussion as a limitation.

We thank the reviewer for this important point. The initial objective of this study was to establish a proof of concept demonstrating that temporal variations in vital signs alone could serve as a reliable basis for early VAP prediction using deep learning. To that end, we focused on a single-center dataset (MIMIC-IV) to ensure data consistency and reduce confounding variability in this initial phase. However, we fully acknowledge that the monocentric nature of the dataset limits the generalizability of our findings. External and multicenter validation will be essential to assess the robustness and transferability of the model across diverse patient populations and care settings. This necessity has now been clearly stated in the revised manuscript.

Addition to the manuscript in the discussion section:

“Algorithm learning was initially based on a single-center dataset as a proof of concept to assess the predictive value of vital signs alone using deep learning. Additionally, the data available in MIMIC-IV may not fully capture the complexity of clinical decisions. To overcome these challenges, future work should include retraining the model with data from multiple hospitals, incorporating diverse patient populations and clinical practices.”

 

10. The authors did not present calibration plots or Brier scores. They used only AUPRC and AUROC. These metrics show whether the model can rank patients correctly, but not whether the predicted probabilities are accurate. For example, a model can report an 80% VAP risk when it is only 20%. Without calibration, this is misleading. Calibration is especially important in clinical models. The authors must provide calibration plots (e.g., reliability curves) and Brier scores. If they cannot do this now, they must say so clearly in the statistical limitations and plan for it in future versions.

We thank the reviewer for this important remark. In response, we have computed and included calibration plots (reliability curves) and Brier scores for the three prediction horizons (6, 12, and 24 hours). These analyses provide additional insight into the reliability of the predicted probabilities and complement the discrimination metrics already reported. The corresponding methodology and results have been added to the manuscript.

Addition to the manuscript in the materials and methods section:

“Model Calibration

To assess the reliability of the predicted probabilities, we produced calibration plots (reliability curves) and computed the Brier score for each prediction horizon (6, 12, and 24 hours). The calibration curve compares the average predicted probability to the observed event frequency across bins, reflecting how well predicted risks align with actual outcomes. The Brier score, defined as the mean squared difference between predicted probabilities and true labels, provides a global measure of both calibration and accuracy.”
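For illustration only (not taken from the manuscript), a minimal Python sketch of the calibration analysis described above: a reliability curve and a Brier score computed with scikit-learn. The arrays `y_true` and `y_prob` are placeholders standing in for one prediction horizon's labels and predicted probabilities, not the study's data.

```python
# Minimal sketch of a calibration analysis: reliability curve + Brier score.
# `y_true` and `y_prob` are simulated placeholders, not the study's data.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve
from sklearn.metrics import brier_score_loss

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)                               # placeholder labels
y_prob = np.clip(0.7 * y_true + rng.normal(0.2, 0.2, 1000), 0, 1)    # placeholder probabilities

# Reliability curve: mean predicted probability vs. observed event frequency per bin
frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=10, strategy="quantile")

# Brier score: mean squared difference between predicted probabilities and labels
brier = brier_score_loss(y_true, y_prob)

plt.plot(mean_pred, frac_pos, marker="o", label=f"Model (Brier = {brier:.3f})")
plt.plot([0, 1], [0, 1], linestyle="--", label="Perfect calibration")
plt.xlabel("Mean predicted probability")
plt.ylabel("Observed event frequency")
plt.legend()
plt.show()
```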

Addition to the manuscript in the results section:

“Calibration performance

Calibration curves for the three prediction horizons showed good agreement between predicted probabilities and observed VAP incidence. The Brier scores were 0.04, 0.06, and 0.1 for the 6-hour, 12-hour, and 24-hour predictions, respectively. These results support the good probabilistic calibration of the PREDICT model across all tested time horizons.”

Addition to the manuscript in the Appendix D section:

Figure D7. Calibration plot for PREDICT algorithm for 6h VAP prediction

Figure D8. Calibration plot for PREDICT algorithm for 12h VAP prediction

Figure D9. Calibration plot for PREDICT algorithm for 24h VAP prediction

 

11. How did the authors control for label leakage between the observation and prediction windows? If signs of early VAP already appear in the observation window and the model predicts VAP in the next 6 hours, then it is not really “predicting” the future, it is just detecting the present. This causes over-optimistic results. The authors must clarify how they avoided this, perhaps with a sensitivity analysis. If this was not done, it is a methodological flaw and must be stated as a study limitation.

We thank the reviewer for raising this important point. Due to the use of overlapping sliding temporal windows, some observation windows inevitably occur very close to the annotated VAP onset and may include early physiological changes associated with infection. However, our training set also includes a wide range of windows located much earlier in time, often several hours before the VAP event. This distribution allows the model to learn patterns from both early and late phases of infection progression, reducing the risk that its performance is driven solely by immediate pre-diagnostic signs. Additionally, only vital signs were used as inputs, and no microbiological or treatment-based indicators (such as culture results or antibiotic administration) were included. Nonetheless, we recognize that a portion of the model's performance may reflect early detection rather than true long-range prediction. We have added this point as a methodological limitation, and future work will include sensitivity analyses based on the distance between observation and outcome windows.
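For illustration only, a short Python sketch (assumptions: hourly vital-sign samples, placeholder window lengths, and a hypothetical `make_windows` helper; this is not the authors' actual pipeline) showing how sliding observation windows can be paired with a future prediction horizon, with an optional gap between the two, the kind of lever a leakage sensitivity analysis could vary:

```python
# Illustrative windowing sketch: pair each observation window of vitals with a
# future label horizon, optionally separated by a "gap" of excluded hours.
import numpy as np

def make_windows(signals, labels, obs_len=24, horizon=6, gap=0, step=1):
    """signals: (T, n_features) hourly vitals; labels: (T,) 1 if VAP onset at hour t."""
    X, y = [], []
    T = len(signals)
    for start in range(0, T - obs_len - gap - horizon + 1, step):
        obs_end = start + obs_len
        pred_start = obs_end + gap            # gap hours excluded from both windows
        pred_end = pred_start + horizon
        X.append(signals[start:obs_end])
        y.append(int(labels[pred_start:pred_end].any()))  # VAP within the horizon?
    return np.stack(X), np.array(y)

# Placeholder data: 72 hours of 5 vitals, VAP onset annotated at hour 60
vitals = np.random.rand(72, 5)
onset = np.zeros(72, dtype=int)
onset[60] = 1
X, y = make_windows(vitals, onset, obs_len=24, horizon=6, gap=2)
print(X.shape, int(y.sum()))
```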

Addition to the manuscript in the discussion section:

“Due to the use of rolling time windows, some observation periods were located close to the annotated VAP event, possibly capturing early manifestations of infection. While this could lead to a detection effect, the model was also trained on earlier windows distributed across a wide temporal range. This variability helps reduce bias, but further sensitivity analyses are warranted to assess performance across different prediction distances.”

Reviewer 2 Report

Comments and Suggestions for Authors


  1. The author(s) used LSTM and traditional machine learning classifiers (Random Forest-RF, XGBoost-XGB, and Logistic Regression-LR) to see how well they could detect VAP early with three different data sets; however, because the MIMIC-IV dataset has some limitations, checking the results with another dataset could be worth investigating.
  2. Figure 3 exhibits blurriness. Random Forest and XGBoost provided very competitive results except for 24 hours; I think that is logical. Using 18 hours of data would clarify the results.
  3. On the other hand, the proposed model maintains a consistent AUC value of 1 across the intervals. Why is this happening?
  4. The Wilcoxon rank-sum test has been employed to categorize the features. The author(s) did not provide the reference. I'm referring to an article, https://doi.org/10.1117/12.3045828
  5. Regarding results and the curves in Appendix D, the model may have overfitted when trained for more epochs. Why does the number of epochs vary for each interval? 
  6. The author(s) should make sure the model did not overfit because the training data are highly imbalanced. In that case, reporting performances may not be accurate. Authors may apply cross-validation, provide balanced accuracy, or use the Matthews correlation coefficient.
  7. Pages 69-71 required a relevant reference about DL and interpretation; here is a good example,  https://doi.org/10.3390/ijerph191811193
  8. Finally, the author must discuss briefly why PREDICT outperformed state-of-the-art ML algorithms, RF, and XGB. What distinguishing properties of PREDICT did the ML classifier lack?

Author Response

1. The author(s) used LSTM and traditional machine learning classifiers (Random Forest-RF, XGBoost-XGB, and Logistic Regression-LR) to see how well they could detect VAP early with three different data sets; however, because the MIMIC-IV dataset has some limitations, checking the results with another dataset could be worth investigating.

We fully agree with the reviewer. This study was designed as a proof of concept to assess whether temporal patterns in vital signs alone could accurately predict VAP using deep learning. For this reason, the analysis was limited to MIMIC-IV, a widely used, high-quality critical care dataset. However, we acknowledge that generalizability may be limited due to the monocentric nature of MIMIC-IV. External validation using multicenter datasets will be essential to confirm the robustness of our findings. This point has been clearly added to the discussion and limitations.

Addition to the manuscript in the discussion section:

“Algorithm learning was initially based on a single-center dataset as a proof of concept to assess the predictive value of vital signs alone using deep learning. Additionally, the data available in MIMIC-IV may not fully capture the complexity of clinical decisions. To overcome these challenges, future work should include retraining the model with data from multiple hospitals, incorporating diverse patient populations and clinical practices.”

 

2. Figure 3 exhibits blurriness. Random Forest and XGBoost provided very competitive results except for 24 hours; I think that is logical. Using 18 hours of data would clarify the results.

We thank the reviewer for this remark. Figure 3 will be updated in high resolution in the final version to ensure visual clarity. Regarding the suggestion to explore an 18-hour prediction window, this is indeed an interesting midpoint between short- and long-range forecasts. While not initially implemented, we agree it could offer valuable insight into model dynamics. We plan to include this as part of future model extensions and testing.

 

3. On the other hand, the proposed model maintains a consistent AUC value of 1 across the intervals. Why is this happening?

We thank the reviewer for raising this point. The AUROC values reported in Figure 3 were rounded and truncated for clarity, leading to a perceived AUC of 1.0. In reality, the AUROC values were slightly below 1 and varied across thresholds (e.g., 0.997–0.999). We will update the table with exact values to avoid this misinterpretation.

Addition to the manuscript in the results section:

“Figure 3 legend: Please note that the reported AUROC values were rounded for clarity. Exact values (e.g., 0.998) are slightly below 1 and will be shown with greater precision in the appendix section.”

 

4. The Wilcoxon rank-sum test has been employed to categorize the features. The author(s) did not provide the reference. I'm referring to an article, https://doi.org/10.1117/12.3045828

We thank the reviewer for the observation. The Wilcoxon rank-sum test was not used for feature selection or categorization purposes, but solely to compare the distribution of continuous variables between the VAP and non-VAP groups (as shown in Table 1). This nonparametric test was chosen due to the non-normal distribution of several variables.
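For illustration, a minimal Python sketch of this kind of group comparison using SciPy's `ranksums`; the respiratory-rate arrays are simulated placeholders, not study data:

```python
# Minimal sketch: Wilcoxon rank-sum test comparing a continuous variable
# (e.g., respiratory rate) between VAP and non-VAP groups. Placeholder data only.
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(42)
rr_vap = rng.normal(24, 4, size=150)       # hypothetical respiratory rates, VAP group
rr_no_vap = rng.normal(20, 4, size=850)    # hypothetical respiratory rates, non-VAP group

stat, p_value = ranksums(rr_vap, rr_no_vap)
print(f"Wilcoxon rank-sum statistic = {stat:.2f}, p = {p_value:.3g}")
```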

 

5. Regarding results and the curves in Appendix D, the model may have overfitted when trained for more epochs. Why does the number of epochs vary for each interval? 

The number of epochs was optimized independently for each prediction horizon to account for differences in dataset complexity and class balance. Early stopping based on validation loss was systematically applied to prevent overfitting. We observed slight performance degradation with prolonged training in some configurations, which supports the reviewer's concern and validates the use of early stopping. We clarify this in the Appendix B2.

Addition to the manuscript in Appendix B2:

“For each time horizon, the number of training epochs was optimized using early stopping based on validation loss. This approach minimized overfitting and allowed dynamic adaptation to the temporal prediction difficulty.”
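As an illustrative sketch only (the manuscript does not specify the framework or hyperparameters), early stopping on validation loss can be configured as follows in a Keras-style workflow; the architecture, window shape, and patience value are placeholders:

```python
# Generic sketch of early stopping on validation loss; all values are placeholders.
import tensorflow as tf

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",         # stop when validation loss stops improving
    patience=5,                 # tolerated epochs without improvement (placeholder)
    restore_best_weights=True,  # roll back to the best-validation epoch
)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(24, 5)),   # 24 hourly steps x 5 vital signs (placeholder)
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# X_train, y_train, X_val, y_val would come from the windowing step:
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           epochs=100, callbacks=[early_stop])
```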

 

6. The author(s) should make sure the model did not overfit because the training data are highly imbalanced. In that case, reporting performances may not be accurate. Authors may apply cross-validation, provide balanced accuracy, or use the Matthews correlation coefficient.

We thank the reviewer for this important remark. We fully acknowledge the challenges posed by class imbalance in supervised learning. To address this, we implemented SMOTE-based oversampling during training and applied 5-fold stratified cross-validation to monitor model stability. Additionally, instead of relying on AUROC alone, we prioritized the Area Under the Precision-Recall Curve (AUPRC), which is more informative and reliable in imbalanced settings. AUPRC values were consistently above 94% across all prediction horizons. Nevertheless, to provide a more comprehensive evaluation, we have now included additional metrics such as balanced accuracy and the Matthews correlation coefficient (MCC) in Appendix E (Table E2: Balanced accuracy and Matthews correlation coefficient for the PREDICT algorithm).
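For illustration, a short scikit-learn sketch of the supplementary metrics mentioned above (AUPRC, balanced accuracy, MCC), computed on simulated imbalanced placeholder predictions rather than the study's data:

```python
# Minimal sketch of imbalance-aware metrics on placeholder predictions.
import numpy as np
from sklearn.metrics import (average_precision_score, balanced_accuracy_score,
                             matthews_corrcoef)

rng = np.random.default_rng(1)
y_true = (rng.random(1000) < 0.05).astype(int)                        # ~5% positives
y_prob = np.clip(0.8 * y_true + rng.normal(0.1, 0.15, 1000), 0, 1)    # placeholder scores
y_pred = (y_prob >= 0.5).astype(int)                                  # thresholded predictions

print("AUPRC:            ", average_precision_score(y_true, y_prob))
print("Balanced accuracy:", balanced_accuracy_score(y_true, y_pred))
print("MCC:              ", matthews_corrcoef(y_true, y_pred))
```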

 

7. Pages 69-71 required a relevant reference about DL and interpretation; here is a good example,  https://doi.org/10.3390/ijerph191811193

We thank the reviewer for this suggestion and have added the recommended reference to support our discussion on model explainability.

Reference added: Devnath L, Fan Z, Luo S, Summons P, Wang D. Detection and Visualisation of Pneumoconiosis Using an Ensemble of Multi-Dimensional Deep Features Learned from Chest X-rays. International Journal of Environmental Research and Public Health. 2022;19(18):11193.

 

8. Finally, the author must discuss briefly why PREDICT outperformed state-of-the-art ML algorithms, RF, and XGB. What distinguishing properties of PREDICT did the ML classifier lack?

We appreciate this insightful comment. The superior performance of PREDICT likely stems from its ability to model temporal dependencies within sequential data through LSTM architecture. Unlike traditional ML models that rely on manually aggregated features, PREDICT learns latent temporal dynamics directly from raw time-series input. This allows the model to capture early nonlinear interactions and temporal deterioration patterns that are missed by static classifiers. We have added this rationale to the discussion.

Addition to the manuscript in the discussion section:

“PREDICT outperformed conventional ML models likely due to its capacity to capture nonlinear temporal dependencies within sequential vital signs. Its LSTM-based architecture allows dynamic feature learning across time, in contrast to the static nature of tree-based models.”
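For illustration only (an assumption-laden sketch, not the authors' implementation), the contrast described above: a sequence model ingests the raw (time, feature) window directly, whereas a static classifier such as Random Forest is fed pre-aggregated summary features that discard the temporal ordering:

```python
# Illustrative contrast between a sequence model and a static classifier.
# Shapes and data are placeholders (200 windows of 24 hourly steps x 5 vitals).
import numpy as np
import tensorflow as tf
from sklearn.ensemble import RandomForestClassifier

X_seq = np.random.rand(200, 24, 5)
y = np.random.randint(0, 2, 200)

# Sequence model: temporal dynamics are learned directly from the raw series
lstm = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(24, 5)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
lstm.compile(optimizer="adam", loss="binary_crossentropy")
lstm.fit(X_seq, y, epochs=1, verbose=0)

# Static model: the time axis is collapsed into summary statistics beforehand
X_flat = np.concatenate([X_seq.mean(axis=1), X_seq.std(axis=1),
                         X_seq.min(axis=1), X_seq.max(axis=1)], axis=1)
rf = RandomForestClassifier(n_estimators=100).fit(X_flat, y)
```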

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors have addressed and corrected all the points raised in the previous review with explicit modifications in the revised manuscript. Thank you!

Author Response

The authors have addressed and corrected all the points raised in the previous review with explicit modifications in the revised manuscript. Thank you!

The authors would like to warmly thank the reviewer for the time, expertise, and thoughtful comments provided throughout the review process. The reviewer’s input has greatly contributed to improving the scientific rigor, clarity, and overall quality of the manuscript. The final version has undoubtedly benefited from this constructive and precise evaluation.

Reviewer 2 Report

Comments and Suggestions for Authors

In response to my comments 2:

The author(s) have agreed to investigate the 18-hour window in the future. Considering this part in the manuscript as a future goal will increase citations.

In response to my comments 4:

So they employed the Wilcoxon test to compare the difference between the two groups of data. They SHOULD have used a relevant reference in the first place (e.g., line 251) where Wilcoxon was introduced, because there are no further details about this model. Moreover, it's important compared to the reference of the relationship between precision-recall and the ROC curve.

Overall quality has improved.

Author Response

In response to my comments 2: The author(s) have agreed to investigate the 18-hour window in the future. Considering this part in the manuscript as a future goal will increase citations.

We thank the reviewer for this insightful suggestion. As recommended, we have acknowledged in the discussion that evaluating an intermediate 18-hour prediction window represents a relevant extension of this work. We agree that incorporating such analysis may offer added value in balancing predictive lead time and clinical utility, and could help position the model more effectively for future real-world deployment.

Addition to the manuscript in the discussion section:

"Future work will investigate intermediate prediction windows, such as 18 hours, which may offer an optimal trade-off between early detection and clinical actionability. This additional time horizon could help refine the temporal resolution of the model and further support its clinical integration."

 

In response to my comments 4: So they employed the Wilcoxon test to compare the difference between the two groups of data. They SHOULD have used a relevant reference in the first place (e.g., line 251) where Wilcoxon was introduced, because there are no further details about this model. Moreover, it's important compared to the reference of the relationship between precision-recall and the ROC curve.

We thank the reviewer for highlighting the importance of methodological transparency. We have now added a reference where the Wilcoxon rank-sum test is first introduced (line 251) to clarify its use for comparing non-normally distributed continuous variables between VAP and non-VAP groups.

Addition to the manuscript in the materials and methods section:

"Continuous variables were summarized as medians with interquartile ranges and compared using the Wilcoxon rank-sum test[23], a nonparametric alternative to the t-test appropriate for non-normally distributed data." 

New reference added:

Nonparametric Statistical Methods (Wiley Series in Probability and Statistics). Available at: https://onlinelibrary-wiley-com.lama.univ-amu.fr/doi/book/10.1002/9781119196037