The RALE Score Versus the CT Severity Score in Invasively Ventilated COVID-19 Patients—A Retrospective Study Comparing Their Prognostic Capacities

Background: Quantitative radiological scores for the extent and severity of pulmonary infiltrates based on chest radiography (CXR) and computed tomography (CT) scan are increasingly used in critically ill invasively ventilated patients. This study aimed to determine and compare the prognostic capacity of the Radiographic Assessment of Lung Edema (RALE) score and the chest CT Severity Score (CTSS) in a cohort of invasively ventilated patients with acute respiratory distress syndrome (ARDS) due to COVID-19. Methods: This was a two-center retrospective observational study that included consecutive invasively ventilated COVID-19 patients. Trained scorers calculated the RALE score of the first available CXR and the CTSS of the first available CT scan. The primary outcome was ICU mortality; secondary outcomes were duration of ventilation in survivors, length of stay in ICU, and hospital-, 28-, and 90-day mortality. Prognostic accuracy for ICU death was expressed using odds ratios and Area Under the Receiver Operating Characteristic curves (AUROC). Results: A total of 82 patients were enrolled. The median RALE score (22 [15–37] vs. 26 [20–39]; p = 0.34) and the median CTSS (18 [16–21] vs. 21 [18–23]; p = 0.022) were both lower in ICU survivors compared to ICU non-survivors, although only the difference in CTSS reached statistical significance. While no association was observed between ICU mortality and the RALE score (OR 1.35 [95%CI 0.64–2.84]; p = 0.417; AUC 0.50 [0.44–0.56]), an association was observed for the CTSS (OR 2.31 [1.22–4.38]; p = 0.010), although with poor prognostic capacity (AUC 0.64 [0.57–0.69]). The correlation between the RALE score and CTSS was weak (r² = 0.075; p = 0.012). Conclusions: Despite poor prognostic capacity, only CTSS was associated with ICU mortality in our cohort of COVID-19 patients.


Introduction
Quantitative scores for pulmonary infiltrates and consolidations on lung images such as chest radiography (CXR) and chest computed tomography (CT) are increasingly used in critically ill invasively ventilated patients. Such visual scores may have comparable diagnostic and prognostic capacities [1][2][3]. The advantages of a CXR-based score over a chest CT-based score are that CXRs are easier and cheaper to obtain, require no patient transport outside the ICU, and involve lower radiation exposure than a chest CT scan.
The Radiographic Assessment of Lung Edema (RALE) score quantifies both the extent and the severity of parenchymal abnormalities on the CXR [4]. The RALE score has been shown to have excellent diagnostic accuracy for acute respiratory distress syndrome (ARDS) [5][6][7], and early changes in the RALE score have been found to be associated with outcome in critically ill invasively ventilated patients [4,8,9]. This was confirmed in invasively ventilated patients with coronavirus disease 2019 (COVID-19) [1]. A recent analysis of invasively ventilated patients showed that higher RALE scores were associated with worse parameters of respiratory mechanics and higher levels of pro-inflammatory biomarkers [10]. The CT severity score (CTSS) quantifies the extent of parenchymal involvement on the chest CT scan by scoring the percentage of affected lung tissue in each lung lobe and summing the lobar scores [11]. The CTSS has been used successfully in the diagnostic process in COVID-19 patients and has been shown to correlate well with disease severity and laboratory parameters [12]. The CTSS may even correlate with short-term outcome [13,14].
In the current study, we aimed to determine the association with mortality of the RALE score and the CTSS in critically ill invasively ventilated COVID-19 patients and compare their prognostic capacity for ICU mortality. We also sought to determine the correlation between the RALE score and the CTSS. We hypothesized that the RALE score and the CTSS have comparable prognostic capacity for outcome.

Study Design
This was a retrospective observational study in critically ill invasively ventilated COVID-19 patients admitted to the intensive care units (ICUs) of two tertiary centers, the Academic Medical Center, and the Free University Medical Center, Amsterdam, The Netherlands. The Institutional Review Boards of both hospitals approved the study protocol (approval W20_494 # 20.546). The need for individual patient informed consent was waived because the data used for this analysis were collected as part of standard care. The study is registered at clinicaltrials.gov (NCT05047653).

Inclusion and Exclusion Criteria
Consecutive patients admitted between 1 March 2020 and 1 June 2020 (the first wave of the national outbreak in the Netherlands) and between 1 October 2020 and 31 December 2020 (the first 3 months of the second wave) were screened for participation. Patients were eligible if they had received invasive ventilation for acute hypoxemic respiratory failure due to COVID-19 that was confirmed by reverse transcriptase-polymerase chain reaction for SARS-CoV-2. Patients aged <18 years, patients with an alternate diagnosis, and patients receiving other forms of oxygen support, such as high flow nasal oxygen (HFNO), noninvasive ventilation, or continuous positive airway pressure, were excluded. We also excluded patients whose first CXR and first chest CT scan were obtained more than 24 h apart.

Data Collection
An online case report form created with Castor (www.castoredc.com (accessed on 29 July 2022)) was used to collect baseline and demographic characteristics, the Acute Physiology and Chronic Health Evaluation (APACHE) II score, and typical ventilation parameters at the moment of lung imaging, including inhaled oxygen fraction (FiO2), positive end-expiratory pressure (PEEP), maximum airway pressure (Pmax), respiratory rate (RR), tidal volume (VT), and the blood gas analysis results.

Imaging Scores
CXRs and chest CT scans were collected from the electronic imaging systems in each hospital and uploaded in Joint Photographic Experts Group (JPEG) format into the database. Then, CXRs were scored by at least two independent scorers who were extensively trained in calculating RALE scores. Each scorer was trained in RALE scoring by one of the investigators (C.Z.), who had been trained during a 1-month focused period by the team that developed the RALE score [6]. An intraclass correlation coefficient (ICC) > 0.8 between the trainer and the other scorers on a training sample of 22 CXRs from a separate set of CXRs of patients with ARDS was a prerequisite for scoring CXRs in the study dataset. A third scorer was involved only if the difference in numeric RALE score between the two scorers was >25%, in order to reach a final consensus by discussion. Chest CT scans were scored by a radiologist experienced in chest CT.
For the RALE score, the chest was divided into four quadrants by a vertical line over the spine and a horizontal line at the level of the first branch of the left main bronchus; each quadrant was then scored for extent of alveolar opacities (consolidation score, from 0 to 4), and the corresponding density of alveolar opacities (density score, from 1 to 3), and the final score was the sum of the product of the consolidation and density scores for each quadrant. The RALE score thus ranged from 0 (no abnormalities) to 48 (maximum abnormalities) [4]. For details on RALE score computation, see Figure S1.
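The arithmetic of the RALE score can be illustrated with a minimal Python sketch. It is not the authors' scoring tool; the function name `rale_score` and the input layout (four `(consolidation, density)` pairs) are assumptions made for illustration, and the density of a quadrant without consolidation is simply ignored.

```python
def rale_score(quadrants):
    """Sum of consolidation x density over the four chest quadrants.

    quadrants: four (consolidation, density) pairs, with consolidation
    in 0..4 and density in 1..3. Hypothetical input layout.
    """
    quadrants = list(quadrants)
    if len(quadrants) != 4:
        raise ValueError("expected exactly four quadrants")
    total = 0
    for consolidation, density in quadrants:
        if not 0 <= consolidation <= 4:
            raise ValueError("consolidation score must be between 0 and 4")
        # Density only matters when some consolidation is present.
        if consolidation > 0 and not 1 <= density <= 3:
            raise ValueError("density score must be between 1 and 3")
        total += consolidation * density
    return total


# Maximum abnormalities: 4 quadrants x (4 x 3) = 48
print(rale_score([(4, 3)] * 4))
```

With every quadrant at the maximum consolidation and density, the sketch returns the upper bound of 48 stated above.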
For the CTSS, the percentage of involvement of each lung lobe was scored; the final score was the sum of the five lobar scores, which could range from 0 (no lung involvement) to 25 (maximum involvement, when all five lobes show more than 75% involvement) [15]. For details, see Table S1.
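The CTSS sum can be sketched in the same way. Table S1 is not reproduced here, so the intermediate percentage bands below (<5%, 5-25%, 26-50%, 51-75%) are an assumption based on the commonly used lobar banding; only the end points (0 for no involvement, 5 for more than 75%) follow directly from the text. The function names are hypothetical.

```python
def lobe_score(percent_involved):
    """Map the percentage of involvement of one lobe to a 0..5 score.

    Intermediate cut-offs are assumed, not taken from Table S1.
    """
    if not 0 <= percent_involved <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_involved == 0:
        return 0
    if percent_involved < 5:
        return 1
    if percent_involved <= 25:
        return 2
    if percent_involved <= 50:
        return 3
    if percent_involved <= 75:
        return 4
    return 5


def ctss(lobe_percentages):
    """Sum the lobar scores of the five lung lobes (range 0..25)."""
    lobes = list(lobe_percentages)
    if len(lobes) != 5:
        raise ValueError("expected exactly five lobes")
    return sum(lobe_score(p) for p in lobes)


# All five lobes with more than 75% involvement give the maximum of 25.
print(ctss([80, 90, 76, 100, 85]))
```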

Outcomes
The primary outcome was ICU mortality. Secondary outcomes were duration of ventilation in survivors, length of stay in ICU, and hospital-, 28-, and 90-day mortality.

Power Calculation
We did not perform a formal sample size calculation. Instead, the available patients served as the sample size for this study.

Statistical Analysis
Demographic data and clinical and outcome variables were summarized as medians (interquartile range) for continuous variables and as frequencies (percentage) for categorical variables. Normally distributed variables were compared between groups with t-test or ANOVA. Not normally distributed variables were compared between groups with Wilcoxon signed rank tests or Mann-Whitney U test. Categorical variables were compared between groups by Wilcoxon signed rank test.
To test the association of the radiological scores with outcomes, we fitted univariable and multivariable logistic regression models, introducing the RALE score and the CTSS alternately. As covariates for the logistic regression models, we used age, body mass index, and the APACHE II score [16]. We performed a sensitivity analysis introducing PEEP as a covariate, as PEEP can influence imaging scores. In a further sensitivity analysis, we used a spline-fitted model to take into consideration the non-normal distribution of the CTSS. For this purpose, linear tail restricted cubic splines with three knots and three degrees of freedom were fitted, using the default values of the 'rcs' function of the 'rms' R package. For both the univariable and the multivariable logistic regression models, a receiver operating characteristic (ROC) curve was constructed, from which the area under the curve (AUROC) was calculated. In order to assess the prognostic potential of both scores, ROC curves were estimated on the averaged terms after applying a repeated (5 times) 10-fold cross-validation algorithm to the dataset, and 95% confidence intervals on ROC estimates were obtained by 500 bootstrap repetitions. The prognostic capacity was considered excellent for an AUROC of 0.9-1.0, good for 0.8-0.9, fair for 0.7-0.8, and poor for 0.6-0.7. The De Long test was used for the comparison of the AUROCs [17]. A non-significant result would confirm that the two scores are comparable in terms of prognostic capability for ICU mortality. For the correlation between the RALE score and the CTSS, we used the coefficient of determination obtained from a simple linear regression model (r²), using RALE as the independent variable and CTSS as the dependent variable.
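As an illustration of the AUROC, its bootstrap confidence interval, and the prognostic categories, a minimal standard-library Python sketch follows. It does not reproduce the authors' R analysis: there is no logistic model, no cross-validation, and no De Long test. The percentile bootstrap, the function names, and the label used for an AUROC below 0.6 are assumptions.

```python
import random


def auroc(scores, labels):
    """Probability that a random non-survivor (label 1) has a higher score
    than a random survivor (label 0); ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case in each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (assumed method)."""
    if len(set(labels)) < 2:
        raise ValueError("need at least one case in each class")
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # keep only resamples containing both classes
            estimates.append(auroc([scores[i] for i in idx], ys))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def prognostic_label(auc):
    """Categories from the text; the label below 0.6 is an assumption."""
    if auc >= 0.9:
        return "excellent"
    if auc >= 0.8:
        return "good"
    if auc >= 0.7:
        return "fair"
    if auc >= 0.6:
        return "poor"
    return "none"
```

For example, `prognostic_label(0.64)` gives "poor", which matches how an AUROC of 0.64 is classified under the thresholds above.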
To assess continuous outcomes, such as duration of ventilation and ICU length of stay in survivors, we fitted a linear regression model for each score with the same covariate structure as the aforementioned logistic models. We used R² to compare which score performed best in the model.
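For a model with a single predictor, the coefficient of determination equals the squared Pearson correlation. The following Python sketch shows that single-predictor case only, as used for the relationship between two scores; it does not cover the multivariable models with covariates, and the function name `r_squared` is hypothetical.

```python
def r_squared(x, y):
    """Coefficient of determination of a simple linear regression of y on x.

    With one predictor this is the squared Pearson correlation.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return (sxy * sxy) / (sxx * syy)


# A perfectly linear relationship explains all of the variance.
print(r_squared([1, 2, 3], [2, 4, 6]))
```

An r² of 0.075 therefore means that about 7.5% of the variance in one score is explained by the other in a simple linear model.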
All analyses were performed using a two-sided superiority hypothesis test, with a significance level of 0.05 and presented with two-sided 95% confidence intervals. No corrections were performed for multiple comparisons across secondary clinical outcomes, thus the findings should be considered as exploratory. All analyses were performed using R (version 4.0.2, R Core Team, 2016, Vienna, Austria).

Patients
From 1 March 2020 through 1 June 2020 (the first 3 months of the first wave) and from 1 October 2020 through 31 December 2020 (the first 3 months of the second wave), we screened 254 patients (Figure 1). We excluded 172 patients for various reasons, the most frequent one being a missing CXR or CT scan within the predefined timespan of 24 h. Patient demography and ventilation characteristics and outcomes are shown in Tables 1 and 2. The median age was 65 [60-72] years, with the most common comorbidities being hypertension and diabetes. There were no significant differences in frequencies in demographic characteristics and comorbidities between ICU survivors and ICU nonsurvivors except for hypertension. Furthermore, ventilation parameters were not different between the same groups and most patients had moderate or severe ARDS. ICU mortality was 42.7%.

Prognostic Capacity for ICU Death
The RALE score had no association with ICU mortality (OR, 1.35 [95%CI 0.64-2.84]; p = 0.42) (Table 3), nor did it show prognostic capacity, with an area under the ROC curve (AUROC) for ICU mortality of 0.50 [0.44-0.56] (Figure 4), although the calibration of the fitted model was poor (Figure S2A). The CTSS had an association with ICU mortality (OR, 2.31 [95%CI 1.22-4.38]; p = 0.01), with an adequate calibration of the fitted model (Figure S2B). The prognostic capacity of the CTSS was poor, with an AUROC for ICU mortality of 0.64 [0.57-0.69] (Figure 4), yet significantly superior to that of the RALE score (p value for De Long test = 0.006) (Table 4). The CTSS had an association with hospital, 28-day, and 90-day mortality, but not with length of stay or duration of ventilation (Table 3). The RALE score was not associated with any of the secondary outcomes.

Sensitivity Analyses
The addition of PEEP to the model and the use of spline-fitted models did not modify the findings for the primary outcome (Figure S3). Neither score was associated with ICU mortality when only the CXRs of the first 3 days were taken into consideration (Table 3).

Discussion
The main findings of this study in COVID-19 patients with ARDS can be summarized as follows: (1) the first available CTSS is associated with patient mortality, albeit with a poor prognostic capacity; (2) the first available RALE score has no association with outcome and no prognostic capacity; (3) there is a lack of correlation between the two radiological scores; (4) neither score can predict ICU length of stay or duration of mechanical ventilation.
In contrast to the study hypothesis, the two radiological scores showed a different behavior with regards to prognostication of patient outcome. The CT-based score was significantly higher in non-survivors and showed an association with all mortality outcomes. Although the CTSS was initially developed to discriminate the severity of disease [15,18], subsequent studies did show a consistent association with outcome. In a validation study performed in the Netherlands, the CTSS was associated with 30-day mortality, although that study was performed in the emergency department and recruited suspected rather than confirmed COVID-19 cases [14]. In ICU patients, the CTSS was shown to predict the composite outcome of death or ICU stay for more than 30 days [19].
Despite showing an association with mortality and a superior predictive potential as compared to the RALE score, the prognostic capability of the CTSS was minimal and unfit for clinical purposes. This highlights the complexity of the trajectory of COVID-19 ARDS, where the extent of pulmonary impairment as assessed by imaging at baseline has a limited impact on survival. The poor prognostic capacity of the CTSS was also observed in the emergency department [14]. Although chest CT manages to identify progression of COVID-19 ground-glass opacities towards consolidations and absorption [20], the baseline score seems to provide scarce predictive information. The failure to predict mortality using baseline scores has also been shown for lung ultrasound [21] and chest X-ray [1]. The use of early changes in radiological scores seems to outperform the use of baseline values for prognostication purposes [1,8,14].
The RALE score consistently had no association with any of the mortality outcomes. This echoes the finding of a recent international multicenter study performed on 139 patients with COVID-19 ARDS [1] and another study in five German ICUs [22]. Yet, these findings are in contrast with other studies conducted in less severe cohorts or in patients outside of the ICU that did find an association between the extent of pulmonary impairment estimated by the RALE score and adverse outcomes [8,23,24,25]. The RALE score provides a reliable interpretation of signs of lung edema on chest radiographs and has been validated as a good predictor of ARDS [4,8]. The RALE score may also be associated with long-term diffusion impairment, as recently shown in a cohort of patients who underwent CXR six months after discharge [26]. COVID-19 boosted the development of machine learning solutions to assist specialists in early diagnostic detection and treatment of ICU patients [27]. For instance, the RALE score was also used as a benchmark to validate a fully automated segmentation and intensity quantification method in CXRs for COVID-19 patients [28]. To date, the overall evidence is against the routine use of the baseline RALE score for the prediction of outcome in mechanically ventilated ICU patients with COVID-19 ARDS.
Pulmonary vascular dysfunctions described in COVID-19 ARDS are not captured by the CTSS and RALE score [29][30][31][32][33][34]. Aside from thrombo-embolic complications, mortality may also be driven by other factors, such as bacterial and fungal infectious complications [35,36] and ICU-acquired weakness. The initial ventilator management was also shown to moderate the prognostic value of other predictive parameters [37]. These mechanisms help to explain the absent or scarce prognostic potential of baseline radiological scores in COVID-19 ARDS.
The negligible correlation between the CXR-based RALE score and the chest CT-based CTSS was unexpected and does not have a single explanation [38,39]. In fact, several conceptual differences may impede interchangeable use of the two scores. First, the RALE score attempts to quantify both the extent and the severity of alveolar infiltrates of pulmonary edema, while the CTSS solely estimates the extent of lung involvement [13,14,15,18]. Second, the different distributions observed suggest that caution should be taken before comparing a CXR-based score with a CT-derived one. For instance, the CTSS was skewed towards higher values, while the RALE score showed a normal distribution. For the bedside clinician, chest radiographs remain easier to perform, require a lower radiation dose, and are safer compared to CT. The possible benefits of routine CT examinations (fast triage, high resolution imaging, association with mortality) may not outweigh the harms, such as overuse of medical resources and higher radiation dose [14]. Considering these shortcomings and the poor prognostic capacity of the CTSS, the use of early changes in the RALE score or lung ultrasound score could be more useful as a first line imaging technique [1,8,19,21].
The study was designed to minimize bias by strictly adhering to a predefined statistical analysis plan and by systematic training of the RALE scorers. We had a low interobserver variability between the scorers [6,7,9,40]. We collected data in both the first and the second wave in the Netherlands, with minimal loss to follow-up.
This study has several limitations. First, the sample size of this study was relatively small in combination with a high overall mortality, and the retrospective design of this study limits the inclusion of all potential confounders. Secondly, the inclusion criteria of the study could have resulted in selection bias and patient-level differences in terms of treatments received. This study only included mechanically ventilated patients with both a CT scan and a CXR within the same timeframe and an ICU length of stay of at least 24 h. This may potentially create a systematic bias towards patients with higher severity or generate a risk of attrition bias. Finally, despite the high ICC among scorers, there was still significant variability in one out of five CXRs, requiring a third scorer. This additional scorer was not blinded to the results of the previous assessments, and this could have generated scoring bias.

Conclusions
In this cohort of invasively ventilated patients with ARDS due to COVID-19, the CTSS of the first available chest CT scan was associated with short- and medium-term mortality outcomes, albeit with poor prognostic capacity. The CXR-based RALE score was not associated with any mortality outcome and had no prognostic capacity. No correlation was found between the CXR-based and CT-based scores, a finding to be validated in future studies. Neither score could predict duration of ventilation or length of stay in ICU.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/diagnostics12092072/s1, Figure S1: The RALE scoring sheet, showing the total score and the score for each of the four quadrants for a representative study patient; Figure S2: Calibration plots of the fitted model of (A) the RALE score and (B) the CTSS; Figure S3: Spline graphical association of the RALE score (panel A) and CTSS (panel B) versus ICU mortality, adjusted for a median APACHE II score of 12 and BMI of 29; Table S1: The CT severity score per lobe. A system for scoring ground-glass opacity, interstitial opacity, and air trapping on thin-section CT scan.

Institutional Review Board Statement:
The study was conducted in accordance with the Declaration of Helsinki. This was a retrospective observational study in critically ill invasively ventilated COVID-19 patients admitted to the intensive care units (ICUs) of two tertiary centers, the Academic Medical Center, and the Free University Medical Center, Amsterdam, The Netherlands. The Institutional Review Boards of both hospitals approved the study protocol (approval W20_494 # 20.546).

Informed Consent Statement:
The need for individual patient informed consent was waived because the data used for this analysis were collected as part of standard care. The study is registered at clinicaltrials.gov (NCT05047653).

Data Availability Statement:
Requests for the data should be sent to Claudio Zimatore; email address: claudiozimatore@gmail.com.

Conflicts of Interest:
The authors declare no conflict of interest.