Use of CPAP Failure Score to Predict the Risk of Helmet-CPAP Support Failure in COVID-19 Patients: A Retrospective Study

(1) Background: the aim of this study was to create a score for the early prediction of CPAP failure in COVID-19 patients. (2) Methods: in this retrospective observational study, we included all consecutive adult patients admitted between February and April 2021. The main outcome was the failure of CPAP support (intubation or death). (3) Results: two hundred and sixty-three COVID-19 patients were managed with CPAP. The population was divided into short-CPAP (CPAP days ≤ 10; 72.6%) and long-CPAP (>10; 27.4%) groups. After balancing the entire population using a stabilized inverse probability of treatment weighting (IPTW) method, we applied a multivariable logistic regression analysis to identify the risk factors for CPAP failure. We used the identified covariates to create a mathematical model, the CPAP Failure Score (CPAP-FS). The multivariable logistic regression analysis identified four variables: SpO2 (OR = 0.86; p-value = 0.001), P/F ratio (OR = 0.99; p-value = 0.008), the Call Score (OR = 1.44; p-value = 0.02), and a pre-existing chronic lung disease (OR = 3.08; p-value = 0.057). The beta-coefficients obtained were used to develop the CPAP-FS, whose diagnostic ability outperformed other relevant COVID-19-related parameters (AUC = 0.87; p-value < 0.0001). We validated the CPAP-FS using a 10-fold internal cross-validation method, which confirmed the observed results (AUCs 0.76–0.80; p-values < 0.0001). (4) Conclusions: the CPAP-FS can identify early the COVID-19 patients who are at risk of CPAP failure.


Introduction
Patients affected by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection are asymptomatic or show mild symptoms in about 80% of cases. However, 15% of patients develop severe acute respiratory failure, and 5% become critically ill and are admitted to the intensive care unit (ICU) [1]. Optimal respiratory management to improve hypoxia and to preserve the lungs includes adequate mechanical ventilation, timing and duration of respiratory support, and ventilation mode. Strategies are controversial, and evidence is rapidly evolving [2][3][4]. To prevent self-inflicted lung injury in severe coronavirus disease 2019 (COVID-19) pneumonia, initial recommendations suggested early intubation and invasive mechanical ventilation. However, the time from ICU admission to intubation was not associated with increased survival in patients with COVID-19 [5][6][7]. Furthermore, COVID-19 patients who were treated with non-invasive ventilation (NIV) in the ICU had a 2-fold higher risk of failure than patients affected by non-COVID-19 acute hypoxemic respiratory failure [8].
During the recent COVID-19 pandemic, to address the dramatic bed saturation in the ICU, respiratory support with helmet continuous positive airway pressure (CPAP) has been used to treat patients with less severe respiratory failure outside the ICU [9]. The physiological rationale for a positive end-expiratory pressure (PEEP), delivered by a non-invasive device, has been discussed from the beginning of the pandemic, but the prompt recognition of patients at risk for non-invasive respiratory support failure remains challenging [10]. Several scores have been proposed to predict the optimal timing for intubation in patients affected by acute hypoxic respiratory failure, yet none of them have been validated in COVID-19 [11][12][13].
The aim of this study is to create a score, the CPAP Failure Score (CPAP-FS), that could predict the incidence of CPAP failure in COVID-19 patients at the time of hospital presentation with respiratory compromise.

Materials and Methods
This retrospective observational study was approved by the Local Ethics Board of Policlinico Umberto I, Rome, Italy (No. 109/2020). Between February and April 2021, we included all consecutive adult patients (≥18 years) admitted to the Policlinico Umberto I of Rome for bilateral interstitial pneumonia due to SARS-CoV-2 infection and treated with helmet-CPAP during the hospital stay. The population was evaluated by reviewing electronic health records of prospectively collected data. Exclusion criteria were an age < 18 years, support with NIV different from helmet-CPAP, and patients still hospitalized in the ICU at the time of data analysis. The last follow-up date was 31 May 2021. All patients tested positive for SARS-CoV-2 on polymerase chain reaction testing of a nasopharyngeal sample. Helmet-CPAP was considered in patients with arterial oxygen pressure (PaO2) < 60 mmHg and/or respiratory rate (RR) > 30/min on maximal oxygenation therapy for 15 min (Venturi mask with oxygen flow 15 L/min). Criteria for intubation were persistent or worsening acute respiratory failure (SpO2 < 88%, RR > 30/min) despite CPAP set to FiO2 100% and PEEP 10 cmH2O. The decision not to intubate after CPAP failure was made by the treating physician.
The main outcome of the study was the failure of CPAP support. CPAP failure was defined as a composite variable comprising the need for post-CPAP orotracheal intubation and/or death during the hospital stay.
The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines were followed to create the study. The guarantor of the data quality was the Data Manager of the Study Group (F.A.). Errors and missing data were identified throughout the database and solved, when possible, with specific queries. A detailed table reporting the missing data present in the database is shown in the Supplementary Materials (Table S1).

Statistical Analysis
Continuous variables were reported as medians and interquartile ranges (IQR). Categorical variables were described as numbers and percentages. Comparisons between groups were made using Fisher's exact test or the chi-squared test for categorical variables, as appropriate, and the Mann-Whitney U test for continuous variables. Missing data relative to study covariates (Table S1) always involved less than 10% of patients. In all cases, missing data were handled with a single imputation method, namely a median of nearby points imputation. The median was preferred to the mean because of the skewed distribution of the variables involved [14].
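As an illustration of the imputation step, the following Python sketch fills each missing value with the median of the nearest observed values around it. It is not the SPSS routine used in the study: the function name `impute_median_nearby`, the default span of two observed values on each side, and the choice to use whatever neighbours exist near the edges of the series are all assumptions made for this example.

```python
from statistics import median
from typing import Optional, Sequence


def impute_median_nearby(values: Sequence[Optional[float]], span: int = 2) -> list:
    """Replace each None with the median of up to `span` observed values
    before it and up to `span` observed values after it (record order).

    Assumption: near the edges, the neighbours that do exist are used.
    """
    if span < 1:
        raise ValueError("span must be at least 1")
    filled = []
    for i, value in enumerate(values):
        if value is not None:
            filled.append(value)
            continue
        # Nearest observed values on each side of the gap.
        before = [x for x in values[:i] if x is not None][-span:]
        after = [x for x in values[i + 1:] if x is not None][:span]
        neighbours = before + after
        if not neighbours:
            raise ValueError("no observed values available for imputation")
        filled.append(median(neighbours))
    return filled


# Hypothetical SpO2 readings with one missing entry.
example = impute_median_nearby([93, 95, None, 90, 97])
```

In this hypothetical series the gap is filled with the median of 93, 95, 90, and 97.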
The entire population was preliminarily divided in two groups according to the length of CPAP support.
We identified the best cut-off to divide the population using the Receiver Operating Characteristic (ROC) analysis. As specified in the data reported in Table S2, different cut-offs for the risk of CPAP failure were tested. Youden's index was used to evaluate the diagnostic accuracy of each threshold. We selected 10 days as the cut-off (approximately corresponding to the 70th centile of the population) because of its higher Youden's index value. With the intent to compensate for the non-randomized design of this retrospective study, the two groups were "balanced" using a stabilized inverse probability of treatment weighting (IPTW). We generated a propensity score for each patient on the original population of 263 patients. The score was created using a multivariable logistic regression model considering CPAP failure (no vs. yes) as the dependent variable. We selected 20 possible clinically relevant confounders as covariates: COVID-19 first wave, age, male sex, arterial hypertension, type-2 diabetes mellitus, cardiovascular comorbidity, liver comorbidity, asthma, chronic lung disease, renal comorbidity, neurological comorbidity, obesity, HIV or malignancy, need for ICU stay, CT scan lung damage %, P/F ratio, Call Score, SpO2, C-reactive protein, and D-dimer.
All the covariates were available at the beginning of CPAP support to avoid the risk of a possible immortal time bias in covariate selection. Weighting artificially inflates the sample size of the pseudo-population and, therefore, the type I error rate (i.e., the number of false positives). To limit this effect, we used stabilized weights (SW) according to the formula:
$$\mathrm{SW} = \frac{p}{\mathrm{PS}} \ \text{(study group)}, \qquad \mathrm{SW} = \frac{1 - p}{1 - \mathrm{PS}} \ \text{(control group)} \tag{1}$$
where p is the marginal probability of belonging to the study group (i.e., without considering covariates), and PS is the propensity score [15].
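The stabilized weights of Formula (1) can be sketched in a few lines of Python. This is only an illustration of the formula: the function name `stabilized_weight`, the toy cohort, and the propensity scores are hypothetical, and the marginal probability p is taken here as the observed share of patients in the study group.

```python
def stabilized_weight(in_study_group: bool, propensity: float, p_marginal: float) -> float:
    """Stabilized IPTW weight: p / PS in the study group,
    (1 - p) / (1 - PS) in the control group."""
    if not 0.0 < propensity < 1.0:
        raise ValueError("propensity score must lie strictly between 0 and 1")
    if in_study_group:
        return p_marginal / propensity
    return (1.0 - p_marginal) / (1.0 - propensity)


# Hypothetical cohort: (belongs to study group, propensity score).
cohort = [(True, 0.60), (True, 0.30), (False, 0.20), (False, 0.50)]

# Marginal probability p: share of patients in the study group.
p = sum(1 for in_group, _ in cohort if in_group) / len(cohort)

weights = [stabilized_weight(in_group, ps, p) for in_group, ps in cohort]
```

Because the numerator is p (or 1 − p) rather than 1, the weights stay close to 1 on average, which is what keeps the size of the weighted pseudo-population near that of the original sample.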
Because p-values can be biased by population size, results from the comparisons between covariate subgroups were reported as effect sizes (Cohen's d value): values lower than |0.1| indicated very small differences between means, values between |0.1| and |0.3| indicated small differences, values between |0.3| and |0.5| indicated moderate differences, and values greater than |0.5| indicated considerable differences [16].
A multivariable logistic regression analysis was conducted in the post-IPTW population to identify the risk factors for CPAP failure. The same 20 covariates used to calculate the IPTW were investigated. The best model was obtained by selecting the most statistically significant covariates with a backward conditional approach. The model was adjusted for age and sex. Odds ratios (OR) and 95.0% confidence intervals (95.0% CI) were reported for significant variables. OR and 95.0% CI were based on 1000 bootstrap samples.
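To make the effect-size thresholds concrete, the sketch below computes Cohen's d for two groups and maps it to the categories listed above. It assumes the usual unweighted pooled-standard-deviation form of Cohen's d, which the text does not spell out, and it assigns boundary values (exactly 0.1, 0.3, or 0.5) to the higher category; the function names and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (assumed definition)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def label_effect(d):
    """Map |d| to the categories used in the text."""
    size = abs(d)
    if size < 0.1:
        return "very small"
    if size < 0.3:
        return "small"
    if size < 0.5:
        return "moderate"
    return "considerable"


# Hypothetical ages in two groups.
d = cohens_d([61, 64, 70, 72], [60, 66, 69, 74])
category = label_effect(d)
```

A weighted version of the means and variances would be needed to reproduce the post-IPTW comparison; this sketch covers only the unweighted case.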
Using the beta-coefficients of the identified covariates, a mathematical model was created, namely the CPAP-FS.
The diagnostic ability of the proposed score was compared with other clinically relevant parameters using the c-statistical analysis. Area under the curve (AUC) and 95.0% CI were reported for significant variables. The analyses were performed in the post-IPTW pseudo population (used to obtain the score) and validated in the initial pre-IPTW unbalanced population.
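For readers less familiar with the c-statistic, it equals the AUC and can be read as the probability that a randomly chosen patient with CPAP failure has a higher score than a randomly chosen patient without failure, with ties counted as one half. The Python sketch below computes it by direct pairwise comparison; the function name `c_statistic` and the example scores are hypothetical and are not study data.

```python
def c_statistic(scores_failure, scores_no_failure):
    """AUC as the share of (failure, no-failure) pairs in which the
    failure patient has the higher score; ties count as 0.5."""
    if not scores_failure or not scores_no_failure:
        raise ValueError("both groups must contain at least one score")
    concordant = 0.0
    for failed in scores_failure:
        for not_failed in scores_no_failure:
            if failed > not_failed:
                concordant += 1.0
            elif failed == not_failed:
                concordant += 0.5
    return concordant / (len(scores_failure) * len(scores_no_failure))


# Hypothetical score values for three patients in each outcome group.
auc = c_statistic([0.5, 1.2, -0.3], [-2.0, -1.0, 0.5])
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two outcome groups.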
With the intent to further validate the observed results, a 10-fold internal cross-validation method was adopted, randomly splitting the original pre-IPTW sample of 263 cases into 10 equally sized (n = 200) sub-samples. The prognostic ability of the CPAP-FS was then tested again in each sub-sample.
Survival analyses were performed using the Kaplan-Meier method, and the log-rank test was adopted to compare the obtained survivals.
Variables with a p < 0.05 were considered statistically significant. Statistical analyses and plots were run using the SPSS statistical package version 27.0 (SPSS Inc., Chicago, IL, USA).

Results
Between February and April 2021, 263 hospitalized COVID-19 patients were managed with CPAP support at Policlinico Umberto I of Rome. The characteristics and the severity of COVID-19 are reported in Table 1. The entire population was divided into two groups: short-CPAP (CPAP days ≤ 10; n = 191, 72.6%) and long-CPAP (CPAP days > 10; n = 72, 27.4%). Patients in the two groups were similar for comorbidities such as arterial hypertension, diabetes, asthma, and chronic lung disease. Age and gender were not significantly different between the two groups. The two groups were also similar in several radiological, clinical, and biochemical parameters of COVID-19 severity. For example, the median percentage of pulmonary damage evaluated by CT scan, the P/F ratio, and the C-reactive protein did not differ significantly. The SpO2 recorded at hospital admission was lower in the short-CPAP group (93% vs. 95%; p-value = 0.02), indicating a potentially more severe COVID-19 pneumonia in this group. The incidence of ICU admission was higher in the short-CPAP group than in the long-CPAP group (29.3 vs. 11.1%; p-value = 0.002). Furthermore, the short-CPAP group accounted for most intubations (41/43, 95.3%; p-value < 0.0001). As expected, the median hospital stay was longer in the long-CPAP group than in the short-CPAP group (31 vs. 16 days; p-value < 0.0001).

Stabilized IPTW Effect
With the intent to minimize the effect of selection biases caused by the non-randomized design of this retrospective study, the entire population was "artificially" balanced using a stabilized IPTW method. As reported in Table 2, the population was effectively "balanced" for the 20 potential confounders adopted. In detail, before the IPTW, 10 variables showed very small differences, seven showed small differences, two showed moderate differences, and one variable showed a considerable difference. After the IPTW, 12 variables showed very small differences, and eight showed small differences. Although the stabilized IPTW was adopted with the intent to minimize the potential reduction of the sample size of the initial population, the post-IPTW pseudo-population was reduced to 168 cases.

CPAP-FS
A multivariable logistic regression model was fitted on the post-IPTW population with the intent to identify the risk factors for CPAP failure. As reported in Table 3, three variables measured at hospital admission were independent predictors of the risk of CPAP failure: SpO2 (OR = 0.86; p-value = 0.001), P/F ratio (OR = 0.99; p-value = 0.008), and the Call Score (OR = 1.44; p-value = 0.02). A pre-existing chronic lung disease only neared statistical significance (OR = 3.08; p-value = 0.057). According to the obtained beta-coefficients, we proposed the CPAP Failure Score: CPAP-FS = 7.315 + 0.512 (if male) + 0.044 × age + 1.124 (if chronic lung disease) + 0.365 × Call Score − 0.153 × SpO2 − 0.008 × P/F ratio. Table 4 shows the characteristics of the population divided into two groups: CPAP failure and no CPAP failure.
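The score formula above translates directly into code. The following Python sketch is only a transcription of the published coefficients: the function and parameter names are hypothetical, SpO2 is assumed to be entered as a percentage (e.g., 93) and age in years, and the patient values in the example are invented for illustration.

```python
def cpap_failure_score(male: bool, age_years: float, chronic_lung_disease: bool,
                       call_score: float, spo2_percent: float, pf_ratio: float) -> float:
    """CPAP-FS computed from the coefficients reported in the text."""
    return (7.315
            + (0.512 if male else 0.0)
            + 0.044 * age_years
            + (1.124 if chronic_lung_disease else 0.0)
            + 0.365 * call_score
            - 0.153 * spo2_percent
            - 0.008 * pf_ratio)


# Hypothetical patient: 65-year-old man, no chronic lung disease,
# Call Score 10, SpO2 93%, P/F ratio 250.
score = cpap_failure_score(True, 65, False, 10, 93, 250)
```

For this hypothetical patient the terms add up to about −1.89; higher (less negative) values correspond to a higher risk of CPAP failure, since lower SpO2 and P/F ratio and a higher Call Score all push the score upward.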

Diagnostic Ability
The CPAP-FS was superior to other relevant COVID-19-related parameters (patient age, D-dimer, P/F ratio, Call Score, SpO2, and the presence of comorbidities or chronic lung disease), with an AUC = 0.87 (p-value < 0.0001) (Table 5). The score was further validated by exploring its diagnostic ability in the entire pre-IPTW unbalanced population of 263 cases. Also in this case, the CPAP-FS had a better diagnostic ability when compared to other parameters connected to COVID-19 severity (AUC = 0.78, p-value < 0.0001).
In addition, we performed an internal validation of the score. In the 10-fold internal validation, the CPAP-FS was always superior to all the other variables in terms of diagnostic ability (AUCs 0.76–0.80, p-values < 0.0001) (Table 6).
Table 6. Diagnostic ability of the CPAP Failure Score compared with other relevant clinical factors for the potential failure of CPAP approach: 10-fold internal validation of the model in the pre-IPTW population.

CPAP-FS and CPAP Use
Stratifying the entire population in four quartiles (Table S3), we identified four different sub-classes showing significantly different rates of CPAP failure and a different temporal distribution of the risk of failure.
In detail, the patients within the first quartile of the score (value ≤ −2.90) presented a low one-month risk of CPAP failure (13.1%). In the second quartile (value −2.90 to −1.61), the failure rate was 36.1%. Finally, in the third (value −1.60 to −0.11) and fourth quartile (value ≥ −0.10), the failure rates were higher, 52.8% and 75.3%, respectively (Figure 1 and Table S4). Interestingly, the patients belonging to the first and second quartiles developed CPAP failure predominantly during the first 10 days of CPAP support (n = 25). After this period, the incidence of CPAP failure was only anecdotal (n = 3). Conversely, in the third and fourth quartile patients, the number of failures was substantial not only during the first 10 days, but also after day 20 of non-invasive support (Figure 2).
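The quartile boundaries and failure rates reported above can be turned into a simple lookup. The Python sketch below is illustrative only: the function name `cpap_fs_quartile` is hypothetical, and because the published boundaries are rounded to two decimals, values falling between −1.61 and −1.60 or between −0.11 and −0.10 are assigned here to the lower quartile, which is an assumption.

```python
# One-month CPAP failure rates (%) reported for each quartile of the score.
FAILURE_RATE_BY_QUARTILE = {1: 13.1, 2: 36.1, 3: 52.8, 4: 75.3}


def cpap_fs_quartile(score: float) -> int:
    """Return the quartile (1-4) of a CPAP-FS value using the reported cut-offs."""
    if score <= -2.90:
        return 1
    if score < -1.60:   # assumption: rounding gap assigned to quartile 2
        return 2
    if score < -0.10:   # assumption: rounding gap assigned to quartile 3
        return 3
    return 4


# Hypothetical score value.
quartile = cpap_fs_quartile(-1.892)
reported_rate = FAILURE_RATE_BY_QUARTILE[quartile]
```

The rates are those observed in this single-center cohort and should be read as descriptive, not as calibrated individual probabilities.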

Discussion
Data from this retrospective observational study suggest that the CPAP-FS predicts the incidence of CPAP failure in COVID-19 patients affected by acute hypoxemic respiratory failure and treated with helmet-CPAP. The severity of the score correlates with the risk of failure and stratifies its temporal distribution. In patients with a lower CPAP-FS (first and second quartile), failures occur predominantly within the first 10 days of CPAP, whereas more severe patients (third and fourth quartile) show an elevated risk of failure even beyond 10 days of treatment (especially after 20 days).
All COVID-19 patients enrolled in the study received non-invasive support through helmet-CPAP. Over the years, this device has shown several advantages in the management of severe acute respiratory failure compared to other non-invasive devices. It provides a constant and stable PEEP delivered by a free-flow system and a PEEP valve [17] without the need for a ventilator, a crucial characteristic for its use in non-ICU settings [18]. Helmet-CPAP has shown a lower risk of environmental contamination and nosocomial transmission of infections thanks to a lower level of leaks than nasal and face masks [19]. Furthermore, a face mask interface tends to be less well tolerated, leading to a high failure rate and the need for intubation and invasive mechanical ventilation [20]. These advantages have led to a better outcome in recent studies when helmet-CPAP was compared to other non-invasive respiratory devices. In a meta-analysis by Ferreyro et al., respiratory support with helmet-CPAP was compared to high-flow nasal cannula (HFNC) and NIV delivered by face mask, and was associated with a lower risk of endotracheal intubation and death [21]. In addition, regarding the management of COVID-19 respiratory failure, the secondary outcomes of the HENIVOT randomized controlled trial showed that the use of helmet NIV, when compared to HFNC, led to a significantly lower incidence of intubation and a higher number of invasive mechanical-ventilation-free days at 28 days [22].
Many multiparametric scores have been designed and validated to predict the risk of intubation and invasive mechanical ventilation in ICU patients. Of these, only a few can be used specifically to evaluate the risk of NIV failure in COVID-19 patients. In a previous retrospective study by Liu et al., an online calculator was validated to predict non-invasive respiratory support failure in a cohort of 652 COVID-19 patients [13]. Among those, only 286 patients were treated with NIV; the remaining 366 patients received HFNC. The variables used to develop this nomogram were age, number of comorbidities, ROX index ((SpO 2 /FiO 2 )/respiratory rate), Glasgow coma scale, and the use of vasopressors during the first day of NIV support.
Compared to the CPAP-FS, the nomogram proposed by Liu et al. includes patients managed with devices other than helmet-CPAP. Most of the patients were treated with HFNC, which has shown poorer important secondary outcomes in the recent HENIVOT trial on COVID-19 patients [22].
The HACOR score, previously used to predict NIV failure in hypoxemic patients, has been evaluated in COVID-19 by Guia et al. [23]. It is a bedside scoring system including five parameters: heart rate, acidosis, consciousness, oxygenation, and respiratory rate. Guia et al. demonstrated that after one hour of NIV, a score > 5 predicted NIV failure in COVID-19 patients with a diagnostic accuracy of 82%. Unlike the HACOR score that considered parameters recorded before NIV initiation, the CPAP-FS can be administered at hospital admission, and combines general and respiratory variables with the CALL score, a measure of disease progression [24]. Furthermore, the diagnostic ability of the CPAP-FS is higher than the accuracy of other parameters typically associated with COVID-19 severity, and the IPTW balancing has further validated the score, mitigating the potential bias due to the retrospective nature of the study.
Our analysis demonstrates that the CPAP-FS can be used to identify early the patients who benefit from non-invasive treatment, and to distinguish them from those who have a greater risk of failure. This could be extremely useful during times of resource constraint, such as COVID-19 pandemic waves, when a considerable number of patients with severe acute respiratory failure present to the Emergency Department [25]. This score could help allocate patients to either the ICU or the medical ward based on the risk stratification. More specifically, patients who have a lower risk of CPAP failure could be safely managed in a non-ICU setting. On the other hand, patients with a high CPAP-FS warrant stricter monitoring and could be considered for early intubation [4]. Therefore, early identification of CPAP failure is a promising strategy to improve outcome.
Patients presenting with a low CPAP-FS have a low risk of CPAP failure within the first 10 days of the support; past this cut-off, the risk becomes negligible. In these patients, respiratory support with helmet-CPAP is particularly indicated to overcome the virus-induced acute pulmonary insult. Patients belonging to the more severe classes show an elevated risk of failure after 20 days from the introduction of CPAP. This could probably be associated with the onset of complications related to the prolonged treatment, such as superinfections or self-inflicted lung injury [26,27].
This study has some limitations. First, it is a retrospective, single-center study. Although the statistical analysis was designed to minimize bias, a prospective study is needed to confirm the reliability of the CPAP-FS. Second, adding other specific parameters might further improve the sensitivity and specificity of the score. We considered pulmonary CT scan involvement at hospital admission, but we found it unreliable in predicting CPAP failure, probably because it is a static parameter unable to capture the evolution of the disease. Third, the retrospective nature of the study limited our ability to collect the data necessary to calculate other predictive models of CPAP failure. Therefore, we could not compare the CPAP-FS with other available scores.

Conclusions
During the recent pandemic, the treatment of acute hypoxemic respiratory failure associated with COVID-19 has been an incredible challenge for clinicians and healthcare systems because of the overwhelming number of patients requiring respiratory support. The CPAP-FS can be an easy tool to identify early the COVID-19 patients presenting with respiratory compromise who are at risk of CPAP failure. Nevertheless, prospective studies are needed to better identify the cohort of COVID-19 patients who can benefit from CPAP support.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/jcm11092593/s1: Table S1: Missing data reported in the cohort of 263 patients treated with CPAP after COVID-19 infection; Table S2: Identification of the best CPAP threshold to use for identifying the risk of CPAP failure after COVID-19 infection; Table S3: Diagnostic ability of the CPAP Failure Score compared to other relevant clinical factors for the potential failure of CPAP support: validation of the model in the pre- and post-IPTW population; Table S4: Stratification of the CPAP-FS in quartiles and deciles.
Author Contributions: F.A. and A.T. contributed to the conception and drafted the manuscript. Q.L. analyzed, interpreted the data, and reviewed the paper. C.A. analyzed and interpreted the data. C.C., F.D.L. and A.C. collected the data. F.P. provided a critical review of the manuscript. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.

Institutional Review Board Statement:
The study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board (or Ethics Committee) of Sapienza University of Rome (protocol code 109/2020, and date of approval 7 April 2020).

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: Data available on request due to restrictions e.g., privacy or ethical. The data presented in this study are available on request from the corresponding author. The data are not publicly available due to the lack of a link.