Can Deep Learning-Based Volumetric Analysis Predict Oxygen Demand Increase in Patients with COVID-19 Pneumonia?

Background and Objectives: This study aimed to investigate whether predictive indicators for the deterioration of respiratory status can be derived from the deep learning data analysis of initial chest computed tomography (CT) scans of patients with coronavirus disease 2019 (COVID-19). Materials and Methods: Out of 117 CT scans of 75 patients with COVID-19 admitted to our hospital between April and June 2020, we retrospectively analyzed 79 CT scans that had a definite time of onset and were performed prior to any medication intervention. Patients were grouped according to the presence or absence of increased oxygen demand after CT scan. Quantitative volume data of lung opacity were measured automatically using a deep learning-based image analysis system. The sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) of the opacity volume data were calculated to evaluate the accuracy of the system in predicting the deterioration of respiratory status. Results: All 79 CT scans were included (median age, 62 years [interquartile range, 46–77 years]; 56 [70.9%] were male). The volume of opacity was significantly higher for the increased oxygen demand group than for the nonincreased oxygen demand group (585.3 vs. 132.8 mL, p < 0.001). The sensitivity, specificity, and AUC were 76.5%, 68.2%, and 0.737, respectively, in the prediction of increased oxygen demand. Conclusion: Deep learning-based quantitative analysis of the affected lung volume in the initial CT scans of patients with COVID-19 can predict the deterioration of respiratory status to improve treatment and resource management.


Introduction
The novel coronavirus disease 2019 (COVID-19) has rapidly spread worldwide since it was first reported in Wuhan, Hubei Province, China, in December 2019. Worldwide, more than 226,800,000 people have been infected by 17 September 2021, and more than 571,000 people have been affected daily. In Japan, 1,663,024 people have been infected, of which 17,030 have died [1,2].
While many patients with COVID-19 are asymptomatic or present with mild symptoms and do not require hospitalization, the increasing number of patients with moderate-to-severe respiratory conditions (requiring oxygenation or mechanical ventilation) depletes medical and human resources and disrupts daily practices [3]. Only a few facilities have the capacity to treat a high number of patients with severe COVID-19 who require mechanical ventilation or intensive care. Therefore, it is important to assess the risk of exacerbation in individual cases to determine the optimal distribution of patients and allocation of medical and human resources [4]. Older age, obesity, chronic obstructive pulmonary disease (COPD), serious heart disease, and several other comorbidities exacerbate the medical risks. Unfortunately, imaging techniques for predicting such exacerbations in individual cases have not yet been developed [5,6].
Computed tomography (CT) is a sensitive imaging modality for the detection of pneumonia. Nonsegmented pure ground-glass opacities (GGOs) in the peripheral side of the lung are typical in the images of early-stage COVID-19 pneumonia. As the disease progresses, the number of lesions increases, and a crazy-paving pattern can be observed in the GGOs. GGOs become mixed with consolidation, and consolidation increases as the condition worsens. At the time of healing, consolidation decreases and becomes faint with a cord-like structure [7][8][9]. Knowledge of these typical processes and the resulting imaging patterns may be used as a tool for estimating the approximate stage of an individual patient based on CT findings. The area of the lesion usually shrinks during the healing phase and expands as it becomes more severe [10,11]. Therefore, we examined whether the area of the GGOs or consolidation could serve as a predictive indicator of the severity of pneumonia. In addition, CT findings are known to dynamically change over time, and previous studies revealed that they peak approximately 9-13 days after onset [10] (6-11 days in a study by Wang et al. [8]). We thought that stratification of the time from onset would increase the accuracy of predicting aggravation.
In recent years, artificial intelligence (AI), particularly deep learning, has advanced significantly and has been applied to medical image classification, object detection, semantic segmentation, etc. [12][13][14][15][16]. Moreover, it has been reported that the use of AI technology can greatly improve the accuracy of tasks that are difficult and time-consuming for humans to perform alone and, thus, save reading time. Several studies have been conducted on the detection, region extraction, and classification of COVID-19 pneumonia lesions on CT images, and the clinical usefulness of these approaches has been verified [17][18][19][20][21].
Therefore, this study aimed to quantitatively analyze the chest CT scans of patients with COVID-19 using a deep learning-based system to investigate the association between image data and the deterioration of the patient's respiratory status.

Study Population
The medical ethics committee of our hospital approved this retrospective study and waived the requirement for written informed consent. In the present study, the inclusion criteria for patient enrollment were as follows: (a) Inpatient diagnosed as positive for COVID-19 by one or more reverse transcription polymerase chain reaction (RT-PCR) test and (b) underwent chest CT from April to June 2020. The exclusion criteria were as follows: (a) younger than 20 years of age, (b) treated with drugs for COVID-19 before CT scan, and (c) the onset was unclear (Figure 1).
After reviewing the database of radiology reports and clinical records at our institute, two board-certified radiologists with 6 and 10 years of imaging experience and a medical student extracted chest CT images and clinical data, including sex, age, symptoms, time of onset, time of CT scan, comorbidities (e.g., chronic kidney disease, COPD, obesity, serious heart disease, and diabetes), blood tests (white blood cell count [WBC], lymphocyte count % [LYM%], C-reactive protein [CRP], and lactate dehydrogenase [LDH]), and treatment (oxygen administration and mechanical ventilation). The radiologists extracted the cases to be registered, and the medical student was mainly in charge of entering the specified data into the sheet. In total, 79 CT scans of patients with COVID-19 were included in this study.

Chest CT Imaging
CT scans were performed based on the clinical judgment of the attending physician, with the patient in the supine position and images acquired in the craniocaudal direction. In total, 47 of the CT examinations were conducted at our institution using a SOMATOM Edge Plus scanner (Siemens Healthcare GmbH, Erlangen, Germany) with 64 detector rows. The remaining 32 CT examinations were conducted at outside institutions using several scanners with 4 to 320 detector rows. The acquisition parameters at our hospital were as follows: 120-kV tube voltage with automatic tube current modulation (150 mAs); tube rotation time, 0.28 s; beam collimation, 128 ch × 0.6 mm; and beam pitch, 1.5. By default, 2.0 mm chest CT images without interslice gap were reconstructed using a sharp tissue kernel (Bl57) and the filtered back-projection technique. At outside institutions, the slice thickness of the reconstructed images ranged from 1.25 to 5 mm.

COVID-19 Pneumonia Analysis Using Deep Learning System
The deep learning-based pneumonia analysis system (CT Pneumonia Analysis prototype, Siemens Healthcare GmbH, Erlangen, Germany) was used to quantitatively analyze the area of COVID-19 pneumonia on chest CT images. The system has been trained and tested using a dataset of 9749 three-dimensional chest CT volumes to automatically perform three-dimensional segmentation and quantification of the anomalous CT pattern GGOs (low opacities) and consolidations (high opacities) that are commonly present in COVID-19. Figure 2 presents an example of a segmented lung and pneumonia region on a CT image. The total opacity volume (mL), low opacity volume (mL), and high opacity volume (mL) were obtained.
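The internals of the analysis prototype are not described here, but the step from a three-dimensional segmentation to a volume in mL can be illustrated with a minimal sketch. It assumes a binary voxel mask and known voxel spacing; the function name `opacity_volume_ml`, the nested-list mask layout, and the toy values are hypothetical and are not taken from the system used in this study.

```python
def opacity_volume_ml(mask, spacing_mm):
    """Return the volume in mL of the voxels flagged 1 in a 3-D binary mask.

    mask: nested list indexed [slice][row][col] with 0/1 entries (assumed layout).
    spacing_mm: (slice_thickness, row_spacing, col_spacing) in millimetres.
    """
    voxel_count = sum(v for sl in mask for row in sl for v in row)
    dz, dy, dx = spacing_mm
    voxel_mm3 = dz * dy * dx
    return voxel_count * voxel_mm3 / 1000.0  # 1 mL = 1000 mm^3


# Toy mask (2 slices x 2 rows x 3 columns) containing 5 opacity voxels,
# with 2.0 mm slices and 0.7 mm in-plane pixels. Values are illustrative only.
toy_mask = [
    [[1, 1, 0], [0, 1, 0]],
    [[0, 0, 0], [1, 1, 0]],
]
toy_volume = opacity_volume_ml(toy_mask, (2.0, 0.7, 0.7))
```

Under this sketch, total opacity volume would be the sum of the low and high opacity volumes obtained from their respective masks.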

Statistical Analysis
The statistical analyses in this study were conducted using EZR version 1.31 (Saitama Medical Center, Jichi Medical University, Saitama, Japan) [22] and IBM SPSS Statistics version 24 (International Business Machines Corporation, Armonk, NY, USA).
Descriptive statistics were used to express categorical variables as counts and percentages, and numeric or ordered variables were expressed as medians and 25th-75th percentiles. The variables were selected based on a clinical perspective. We compared the deterioration of the patient's respiratory status and clinical and radiological factors (e.g., age, WBC, LYM%, CRP, and LDH); the continuous variables were compared using the Mann-Whitney U test, and the categorical variables were compared using the χ² test.
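As an illustration of the group comparison for continuous variables, the following is a minimal sketch of a two-sided Mann-Whitney U test. It uses midranks for ties and the normal approximation with tie correction and no continuity correction, which is an assumption; EZR and SPSS may use exact or continuity-corrected variants. The function name `mann_whitney_u` and the toy values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U, p) where U is the smaller of the two U statistics.
    """
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < n and combined[j + 1][0] == combined[i][0]:
            j += 1
        midrank = (i + j) / 2.0 + 1.0  # average of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    u2 = n1 * n2 - u1
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all observations identical: no evidence of a difference
        return min(u1, u2), 1.0
    z = (u1 - n1 * n2 / 2.0) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return min(u1, u2), p


# Toy opacity volumes in mL (illustrative only, not study data).
u_stat, p_value = mann_whitney_u([1, 2, 3], [4, 5, 6])
```

With samples as small as this toy example, an exact test would be preferable to the normal approximation.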
To investigate the volume of pneumonia (total opacity, low opacity, and high opacity) as measured by the deep learning system and its relationship with oxygen demand and the deterioration of the patients' respiratory status, a receiver operating characteristic (ROC) analysis was conducted. We calculated the sensitivity, specificity, and area under the ROC curve (AUC) to predict the deterioration of the patient's respiratory status. An optimal cut-off value that was closest to the upper left corner was derived (the cut-off value with the highest sum of sensitivity and specificity). In each case, a p-value of < 0.05 was considered statistically significant.
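The ROC analysis described above can be sketched as follows. This is a minimal illustration that assumes a case is called positive when its opacity volume is at or above the threshold. The function names and toy data are hypothetical. Both cut-off criteria mentioned in the text are shown, since the point closest to the upper left corner and the point with the highest sum of sensitivity and specificity usually coincide but are not guaranteed to.

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for each distinct score.

    A case is classified as positive when score >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for thr in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= thr)
        tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < thr)
        points.append((thr, tp / n_pos, tn / n_neg))
    return points


def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(points):
    """Cut-off with the highest sum of sensitivity and specificity."""
    return max(points, key=lambda t: t[1] + t[2])


def corner_cutoff(points):
    """Cut-off closest to the upper left corner (sensitivity = specificity = 1)."""
    return min(points, key=lambda t: (1 - t[1]) ** 2 + (1 - t[2]) ** 2)


# Toy opacity volumes (mL) and outcomes (1 = increased oxygen demand).
# Illustrative values only, not study data.
volumes = [50, 120, 300, 450, 600, 900]
outcome = [0, 0, 1, 1, 0, 1]
points = roc_points(volumes, outcome)
best = youden_cutoff(points)
area = auc(volumes, outcome)
```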
Moreover, we divided the cases into three subgroups according to the time from onset to CT scan, early period (0-5 days), middle period (6-10 days), and late period (≥11 days), and analyzed each group separately. The day of onset was defined as the day when symptoms, such as fever, malaise, and respiratory symptoms, appeared.
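The stratification by time from onset can be expressed as a small helper. This is a sketch under the stated boundaries (0-5, 6-10, and ≥11 days); the function name `onset_period` and the example day values are hypothetical.

```python
from collections import Counter


def onset_period(days_from_onset):
    """Map days from symptom onset to CT scan onto the three study periods."""
    if days_from_onset < 0:
        raise ValueError("CT scan cannot precede the day of onset")
    if days_from_onset <= 5:
        return "early"
    if days_from_onset <= 10:
        return "middle"
    return "late"


# Illustrative day values only, not study data.
example_days = [0, 3, 5, 6, 9, 10, 11, 14]
period_counts = Counter(onset_period(d) for d in example_days)
```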
Logistic regression analysis was performed with the dependent variable being the presence or absence of an increase in oxygen demand and the independent variables being volume of high opacity, low opacity, and overall opacity. Calibration was examined by the Hosmer-Lemeshow goodness-of-fit test (a non-significant test indicates good calibration) and by graphically examining the deviation between mean observed and mean predicted probabilities for increased oxygen in 10 equally sized groups of predicted risks.
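The calibration check can be illustrated with a minimal sketch of the Hosmer-Lemeshow statistic computed from observed outcomes and predicted probabilities, which are assumed to come from an already fitted logistic model. Cases are sorted by predicted risk and split into equally sized groups, and the statistic is referred to a chi-square distribution with (groups - 2) degrees of freedom. The closed-form tail probability used here only holds for even degrees of freedom, which covers the 10-group case. Function names and toy data are hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of the chi-square distribution for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form implemented for positive even df only")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(y, p, groups=10):
    """Return (statistic, p_value) for outcomes y (0/1) and predicted risks p.

    Assumes at least one case per group and group mean risks strictly
    between 0 and 1.
    """
    order = sorted(range(len(y)), key=lambda i: p[i])
    n = len(order)
    stat = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        size = len(idx)
        observed = sum(y[i] for i in idx)
        expected = sum(p[i] for i in idx)
        mean_p = expected / size
        stat += (observed - expected) ** 2 / (size * mean_p * (1 - mean_p))
    return stat, chi2_sf_even_df(stat, groups - 2)


# Toy example: predicted risk 0.5 everywhere and exactly half of the cases
# in every group are events, so observed equals expected in each group.
toy_y = [1, 0, 1, 0, 1, 0, 1, 0]
toy_p = [0.5] * 8
hl_stat, hl_p = hosmer_lemeshow(toy_y, toy_p, groups=4)
```

A non-significant p-value from this test is read as no evidence of poor calibration, in line with the interpretation given above.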

Patient Characteristics
Patients' demographic data are presented in Table 1. The median age was 62 years, and 56 (70.9%) patients were male. The median duration between onset and the CT scan was 9 days [6,13].
Oxygen demand increased after the CT scan in 45 cases (57.0%) (increased oxygen demand group). The median age was higher for the increased oxygen demand group (65 years [54, 77] for the increased oxygen demand group and 51.5 years [36, 71.5] for the nonincreased oxygen demand group). In the increased oxygen demand group, 26 (57.8%) patients did not require oxygen at the time of the CT scan; however, oxygen demand increased afterward. In the nonincreased oxygen demand group, 30 (88.2%) patients never required oxygenation, and the remaining four (11.8%) patients required a maximum of 3 L/min of oxygen at the time of the CT scan, but the demand did not increase thereafter. In the increased oxygen demand group, 19 (42.2%) patients required mechanical ventilation during treatment, and extracorporeal membrane oxygenation was introduced for four (8.9%) patients.
Table 1 notes. Continuous variables are expressed as median and interquartile range in brackets and were compared between the two groups using the Mann-Whitney U test. Categorical variables are expressed as numbers (%) and were compared between the groups using the χ² test. Abbreviations: CT, computed tomography; WBC, white blood cell count; LYM, lymphocyte; CRP, C-reactive protein; LDH, lactate dehydrogenase; ECMO, extracorporeal membrane oxygenation.

Volume of Opacities as Predictive Factors
The opacity volume was significantly higher in the increased oxygen demand group than in the nonincreased oxygen demand group (585.3 mL vs. 132.8 mL, p < 0.001), as were the volume of high opacity (71.4 mL vs. 25.1 mL, p = 0.006) and the volume of low opacity (440.3 mL vs. 95.1 mL, p = 0.001). Table 2 and Figure 3 present the time course for the opacity volume of each type on chest CT from the onset of symptoms. The numbers of cases in the early, middle, and late periods were 17, 35, and 27, respectively. In addition, we compared the increased oxygen demand and nonincreased oxygen demand groups at each period. No significant difference was observed between the two groups in any type of opacity during the early period. The low opacity volume peaked in the middle period and decreased in the late period for both the increased oxygen demand and the nonincreased oxygen demand groups. Conversely, the high opacity volume increased during the late period for the increased oxygen demand group but decreased for the nonincreased oxygen demand group. Significant differences were observed in the low opacity volume during the middle period (551.80 mL vs. 330.69 mL, p = 0.044) and the high opacity volume in the late period (129.74 mL vs. 5.28 mL, p = 0.018).
Figure 3. Change in the volume of each type of opacity on the chest CT from the time of the initial onset of symptoms. The low opacity volume peaked during the middle period and decreased during the late period for both the increased oxygen demand and the nonincreased oxygen demand groups. Conversely, the high opacity volume increased during the late period for the increased oxygen demand group but decreased for the nonincreased oxygen demand group.
Figure 4 presents the ROC curves of the opacity volume as a predictor of oxygen demand. The AUC was also estimated. The sensitivity, specificity, and AUC of the volume of opacity were 76.5%, 68.2%, and 0.737 (95% confidence interval [CI], 0.624-0.851) in predicting the increased oxygen demand; 50.0%, 79.5%, and 0.691 (95% CI, 0.572-0.811) for high opacity; and 76.5%, 62.2%, and 0.722 (95% CI, 0.607-0.837) for low opacity, respectively.
Medicina 2021, 57
Figure 4. ROC curves of the opacity volume as a predictor of oxygen demand. The sensitivity, specificity, and AUC of the volume of opacity were 76.5%, 68.2%, and 0.737 in predicting increased oxygen demand (a); 50.0%, 79.5%, and 0.691 for high opacity (b); and 76.5%, 62.2%, and 0.722 for low opacity (c), respectively.
Table 3 describes the AUC for each type of opacity volume for increased oxygen demand in each period. The AUC values were low in all groups during the early period (0.45-0.65). The low opacity volume demonstrated a relatively high AUC value, but the high opacity volume showed a better value during the late period. The AUC value of the high opacity volume before the middle period did not reach 0.5, which was very low.
Table 3. AUC of each type of opacity volume as a predictor of oxygen demand in each period.
Figure 5 shows the calibration plots for volume of high, low, and overall opacity, respectively. Volume of high opacity and overall opacity are poorly calibrated compared to volume of low opacity because they have some groups with over- or under-predicted risk. In the Hosmer-Lemeshow goodness-of-fit test between the measured and predicted values, the p-value for volume of low opacity (p = 0.056) was higher than those for volume of high opacity (p = 0.022) and overall opacity (p = 0.021).
Figure 5. The calibration plots for volume of high (a), low (b), and overall (c) opacity. In the Hosmer-Lemeshow goodness-of-fit test between the measured and predicted values, the p-value for volume of low opacity (p = 0.056) (b) was higher than those for volume of high opacity (p = 0.022) (a) and overall opacity (p = 0.021) (c).

Discussion
In this single-center, retrospective study, we used collected data to investigate the correlation between the opacity volume of patients with COVID-19 pneumonia on CT image before treatment and the subsequent increase in oxygen demand.
In the comparison of the increased oxygen demand group and the nonincreased oxygen demand group, the median time from onset to CT was 3.5 days shorter for the increased oxygen demand group, whereas the opacity volume was 4.4 times larger despite the earlier imaging. Individual differences may exist in the degree of symptoms that trigger a visit to the hospital and subsequent CT imaging, although CT imaging is often performed earlier in severe cases. This suggests that the early onset of symptoms due to the rapid spread of pneumonia after infection promoted early visits. Compared with high opacity volume, low opacity volume was considerably different between the increased oxygen demand group and the nonincreased oxygen demand group during the whole study period. The rapidly increasing GGOs may be associated with early-stage symptoms. Rorat et al. retrospectively analyzed 61 patients with COVID-19 who underwent a CT scan due to suspicious symptoms of pneumonia during deterioration of health [23]. Quantitative CT was performed using deep learning and revealed a significantly higher severity of changes in GGOs and consolidation in patients with severe disease than in those with nonsevere disease. Although their results tend to agree with ours, the opacity volume in their cohort was much greater than in ours. Possible causes are the difference in the median time from onset to CT (12 days, compared with 9 days in our study) and the likelihood that many of their patients had poor respiratory status owing to the strict definition of severe illness (room air oxygen saturation < 90%).
We also showed that volume of high opacity, low opacity, and overall opacity all proved to be significant variables and the risk of increases in oxygen demand increased as the volume of opacity became higher. In the Hosmer-Lemeshow goodness-of-fit test, the best fit between measured and predicted values was found for low opacity volume. Since the CT findings of pneumonia in COVID-19 change from low to high opacity as the disease worsens, measuring the volume of low opacity may be appropriate for predicting increases in oxygen demand.
In our subgroup study, the period was divided into three groups every 5 days, taking into consideration the ease of use when applied in actual clinical practice. There are subtle differences in previous studies regarding how to divide the cases into subgroups. Pan et al. [10] classified multiple patients with COVID-19 into four stages based on the quartiles of patients and degree of lung involvement. In the study by Wang et al. [8], numerous patients underwent CT several times. The time period was divided every 6 days as the median scan-to-scan interval was 6 days. In our cohort, the opacity volume peaked during the middle period. This time course was similar to that of previous studies [8,10,11]. Even if there were small differences in the grouping method, each study demonstrated a similar feature in the progression of pneumonia.
No significant difference was observed in the opacity volumes during the early period between the two groups, and the AUC for the increased oxygen demand was low. Thus, exacerbation could not be predicted by CT at the initial stage.
The AUC of the low opacity volume during the middle to late period exhibited moderate accuracy. The low opacity volume was significantly larger in the increased oxygen demand group during the middle period. It was still larger (eight times larger than that of the nonincreased oxygen demand group) during the late period but not significantly (p = 0.056). Conversely, the high opacity volume significantly increased during the late period in the increased oxygen demand group, whereas it decreased in the nonincreased oxygen demand group. The AUC of the high opacity volume also increased in the late period and exhibited moderate accuracy. Based on these data, low opacity is associated with exacerbation of the respiratory function from 6 days after onset, whereas exacerbation from 11 days after onset might be correlated with the high opacity volume. The cause of the exacerbation of respiratory illness may be the appearance of new lesions due to the inability to control the illness or the nonreduction of existing GGOs/consolidation.
In this study, deep learning was adopted to automatically extract and quantify lung lesions from the CT scans of patients with COVID-19. Deep learning provides an efficient method for the detection and segmentation of the affected area on CT images. Several papers suggest the usefulness of deep learning-based CT evaluation of COVID-19, but each has different end points and analysis methods [24,25]. Additionally, most of the studies focus on detection and diagnosis, and those that consider severity and prognosis are relatively rare [23]. The advantages of our method are as follows:
1. Predictions are based on a single CT, not a comparison of multiple CTs taken during the course of illness. Liu et al. conducted the first cohort study to predict outcomes in patients with COVID-19 using quantitative CT measurements, and they found that the change in the CT image from day 0 to day 4 predicts progression to severe illness [24]. However, performing CT safely for patients with COVID-19 requires considerable human resources and occupancy of the CT room. Therefore, performing multiple CTs is not recommended, and the ability to predict with a single CT is valuable.
2. The time from onset to imaging is considered. Although several papers have examined the predictive ability of quantitative analysis, variation in CT scan timing makes it difficult to say that such analysis can contribute to the proper allocation of patients at an early stage.
3. The method focuses on quantitative data on the pneumonia area and is easy to implement clinically, as it does not require an overly complex program.
To the best of our knowledge, there are no studies that deal with a large number of early CTs in severe COVID-19 cases or objectively assess the predictive value of opacity, particularly in Japan.
Predictive techniques are required to prevent the spread of COVID-19 infection and effectively utilize limited medical resources. Due to the relative availability of RT-PCR in numerous hospitals and clinics, major interest has been directed toward predicting aggravation and early intervention. It is important to distribute severe patients to the appropriate facilities. As CT has become widespread in Japan, CT scans can be performed at small hospitals in the city. If a CT can predict the exacerbation of COVID-19 pneumonia, risk-aware treatment will be possible, and transfers for specialized treatment will be smoothly managed at the appropriate time.
This study has several limitations. First, the number of cases is small, and the number in each group became even smaller in the subgroup analysis. Second, the details of the CT imaging protocol and acquisition parameters at other hospitals are uncertain. Third, this is a retrospective, single-center study, and it may not reflect the current or future situations for all societies because our hospital mainly dealt with patients with COVID-19 who had moderate-to-severe symptoms.

Conclusions
Deep learning-based quantitative analysis of the affected lung volume in the initial CT scans of patients with COVID-19 offers a predictive factor for determining the potential deterioration of the patient's respiratory condition, which can guide treatment management and optimization of limited resources during the high demand of an outbreak.