Association of Clinical and Radiological Features with Disease Severity of Symptomatic Immune Checkpoint Inhibitor-Related Pneumonitis

Objectives: To investigate the ability of clinical and chest computed tomography (CT) features to predict the severity of symptomatic immune checkpoint inhibitor-related pneumonitis (CIP). Methods: This study included 34 patients diagnosed with symptomatic CIP (grades 2–5), who were divided into mild (grade 2) and severe (grades 3–5) CIP groups. The clinical and chest CT features of the two groups were analyzed. Three manual scores (extent, image finding, and clinical symptom scores) were constructed, and their diagnostic performance was evaluated alone and in combination. Results: There were 20 cases of mild CIP and 14 cases of severe CIP. Severe CIP occurred more often within 3 months than after 3 months (11 vs. 3 cases, p = 0.038). Severe CIP was significantly associated with fever (p < 0.001) and the acute interstitial pneumonia/acute respiratory distress syndrome pattern (p = 0.001). The diagnostic performance of the chest CT scores (extent score and image finding score) was better than that of the clinical symptom score. The combination of the three scores demonstrated the best diagnostic value, with an area under the receiver operating characteristic curve of 0.948. Conclusions: Clinical and chest CT features have important application value in assessing the disease severity of symptomatic CIP. We recommend the routine use of chest CT in a comprehensive clinical evaluation.


Introduction
Recently, immune checkpoint inhibitors (ICIs) have emerged as a critical therapeutic approach for various malignancy types and have greatly improved clinical outcomes [1][2][3][4]. However, overactivation of the immune system leads to immune-related adverse events that can affect virtually any organ [5][6][7]. Immune checkpoint inhibitor-related pneumonitis (CIP) is a potentially fatal toxicity [7,8]. Clinically, CIP is graded according to the Common Terminology Criteria for Adverse Events (CTCAE) [9]. Grade 1 CIP is asymptomatic and detected incidentally on computed tomography (CT) imaging; it has relatively limited clinical significance because no further intervention is required. Therefore, we focused on symptomatic CIP (grades 2–5), which is defined as the occurrence of new or worsening respiratory symptoms, such as dyspnea and cough, together with new inflammatory lesions on chest CT imaging after ICI treatment, after excluding pulmonary infection, tumor progression, and other causes [10,11]. Symptomatic CIP leads to the delay and termination of ICI treatment [12]. Patients with severe disease may experience acute respiratory failure and treatment-related death [7,8], which is of great concern in clinical practice.
Although CTCAE is the standard for evaluating CIP severity based on clinical features, guidelines have been updated to include radiological indicators as critical factors. The American Society of Clinical Oncology/National Comprehensive Cancer Network guideline [13] originally included the extent of pneumonitis on chest CT as a grading indicator, and the same approach was used in subsequent guidelines [14,15]. Corticosteroid therapy is the basic treatment for CIP, and the appropriate dose and duration of corticosteroids differ between mild (grade 2) and severe (grades 3–5) CIP. Insufficient corticosteroid administration can worsen pneumonitis, whereas excessive corticosteroid use can result in severe opportunistic infections. Therefore, the evaluation of CIP severity is essential for clinical decision-making. Notably, radiological features on chest CT are more intuitive and easier to quantify, which may be of great value.
This study aimed to retrospectively analyze the clinical features and chest CT imaging differences between mild and severe CIP. Additionally, we established three manual scores and examined the diagnostic performance of the scores alone and in combination.

Patient Selection
Patients who received immunotherapy for various malignancies in our hospital between January 2017 and April 2021 were retrospectively reviewed. CIP was defined as the development of new pulmonary inflammatory lesions after immunotherapy that attending physicians considered to be associated with immunotherapy. The following cases were excluded: (a) patients lacking chest CT scans before and during ICI treatment, (b) patients who underwent immunotherapy combined with thoracic radiotherapy, (c) patients likely to have active lung infection, and (d) patients likely to have cancer progression or malignant lung infiltration. The CIP grade was determined using the CTCAE version 5.0 [9]. This study focused on symptomatic CIP (grades 2-5) because asymptomatic CIP (grade 1) has relatively limited clinical significance. The patients were categorized into mild (grade 2) and severe (grades 3-5) CIP groups. The onset time of CIP was defined as the time from the first use of ICIs to the occurrence of pneumonitis. Complete medical records were retrospectively collected.

Chest CT Examination
Serial chest CT images at CIP diagnosis were viewed by two radiologists (Q.Z. and X.L.T., with 3 and 16 years of experience in chest imaging, respectively). When the two radiologists had disagreements, a third radiologist (N.W., with 40 years of experience in chest imaging) reviewed the images independently and made the final assessment. We also consulted an expert in the field of pneumonitis, and disagreements were resolved by consensus. The radiologists had access to clinical details and previous images to ensure the correct differentiation of the target CIP region and to determine if patients had received previous radiotherapy, had lesions of cancer progression, or had any pre-existing lung abnormalities (such as chronic obstructive pulmonary disease or prior interstitial lung disease). Notably, the radiologists were blinded to the CIP grade when viewing the chest CT images.
All patients underwent high-resolution CT under our hospital's standard clinical protocol on scanners with ≥64 detector rows. Because of the retrospective nature of the study, the CT scanning protocols varied between unenhanced and enhanced scans. Axial CT images were reconstructed with 1.25-mm or 5-mm thickness, and CT images were viewed at the lung window setting (width, 1500 HU; level, −650 HU).
Furthermore, we designed three manual scores with the following semi-quantitative measurement parameters: (a) extent score: extent in the ratio of the volume in the upper (above the carina), middle (below the carina up to the inferior pulmonary vein), and lower (below the inferior pulmonary vein) lung zones (overall, six lung zones were assigned using a 6-point scale [0 point: none, 1 point: ≤5%, 2 points: 6-25%, 3 points: 26-50%, 4 points: 51-75%, and 5 points: 76-100% of lung parenchyma involved], and the total score ranged from 0 to 30 points); (b) image finding score: image findings, including GGO, consolidation, reticular opacities, interlobular septal thickening, honeycombing, and pleural effusion were assigned 1 point (yes) or 0 point (no), with the total score ranging from 0 to 6 points; and (c) clinical symptom score: clinical symptoms, including dyspnea, cough, wheezing, fever, chest tightness, chest pain, and hemoptysis, were assigned 1 point (yes) or 0 point (no), with the total score ranging from 0 to 7 points.
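The three manual scores described above can be sketched in code as follows. This is a minimal illustration and not the authors' software: the function names are hypothetical, and the handling of values that fall between the stated bands (for example, 5.5%) is an assumption, since the text lists integer percentage bands only.

```python
# Minimal sketch of the three manual scores; names are hypothetical.
IMAGE_FINDINGS = (
    "GGO", "consolidation", "reticular opacities",
    "interlobular septal thickening", "honeycombing", "pleural effusion",
)
CLINICAL_SYMPTOMS = (
    "dyspnea", "cough", "wheezing", "fever",
    "chest tightness", "chest pain", "hemoptysis",
)


def zone_points(percent_involved):
    """Map the involved percentage of one lung zone to 0-5 points."""
    if not 0 <= percent_involved <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_involved == 0:
        return 0
    # Upper bounds of the 1- to 5-point bands (<=5%, 6-25%, 26-50%, 51-75%, 76-100%).
    # Assumption: a value between two bands is assigned to the higher band.
    for points, upper in enumerate((5, 25, 50, 75, 100), start=1):
        if percent_involved <= upper:
            return points


def extent_score(zone_percentages):
    """Sum the points of the six lung zones (total range 0-30)."""
    if len(zone_percentages) != 6:
        raise ValueError("expected six lung zones (upper, middle, lower; both lungs)")
    return sum(zone_points(p) for p in zone_percentages)


def presence_score(present, allowed):
    """Count 1 point for each listed item that is present."""
    present = set(present)
    unknown = present - set(allowed)
    if unknown:
        raise ValueError(f"unknown items: {sorted(unknown)}")
    return len(present)


# Example: symptoms of dyspnea, wheezing, fever, and hemoptysis give 4 points.
symptom_points = presence_score(
    ["dyspnea", "wheezing", "fever", "hemoptysis"], CLINICAL_SYMPTOMS
)
```

With all six zones at 76-100% involvement, `extent_score` returns the maximum of 30 points, and a patient showing every listed CT finding receives an image finding score of 6.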

Statistical Analyses
Statistical tests were conducted using SPSS v. 25.0 (IBM Corp.). We descriptively analyzed the clinical and chest CT differences between patients with mild and severe CIP. Categorical data are presented as numbers (percentages), and continuous data are presented as medians (ranges). The differences between groups were evaluated using Fisher's exact test for categorical variables and the Mann-Whitney U test for continuous variables. Subsequently, logistic regression analysis was used to create a combination score based on the extent, image finding, and clinical symptom scores. Receiver operating characteristic (ROC) curves were constructed to assess the performance of the extent, image finding, clinical symptom, and combination scores. For the combination score, the ROC curve was constructed using the sum of the three scores, each weighted by its coefficient from stepwise logistic regression analysis. Cutoff values were chosen based on optimal sensitivity. The area under the ROC curve (AUC), sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were calculated. All p-values were based on a two-sided hypothesis, and p < 0.05 indicated statistical significance.
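The performance measures listed above can be illustrated with a short sketch. The analysis itself was run in SPSS, so the code below is only an illustration of the definitions. The function names and the toy score values are hypothetical and are not study data, and treating a score at or above the cutoff as a positive (severe) call is an assumption.

```python
def auc(scores_severe, scores_mild):
    """AUC as the probability that a randomly chosen severe case scores
    higher than a randomly chosen mild case (ties count as one half)."""
    wins = 0.0
    for s in scores_severe:
        for m in scores_mild:
            if s > m:
                wins += 1.0
            elif s == m:
                wins += 0.5
    return wins / (len(scores_severe) * len(scores_mild))


def metrics_at_cutoff(scores_severe, scores_mild, cutoff):
    """Diagnostic metrics when score >= cutoff is called severe (assumption)."""
    tp = sum(s >= cutoff for s in scores_severe)  # severe, called severe
    fn = len(scores_severe) - tp                  # severe, called mild
    fp = sum(m >= cutoff for m in scores_mild)    # mild, called severe
    tn = len(scores_mild) - fp                    # mild, called mild
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "ppv": tp / (tp + fp) if tp + fp else nan,
        "npv": tn / (tn + fn) if tn + fn else nan,
    }


# Hypothetical toy scores, for illustration only (not data from this study).
toy_severe = [12, 18, 30, 9]
toy_mild = [6, 4, 11, 12]

toy_auc = auc(toy_severe, toy_mild)
toy_metrics = metrics_at_cutoff(toy_severe, toy_mild, cutoff=12)
```

The pairwise definition of the AUC used here is equivalent to the Mann-Whitney U statistic divided by the product of the two group sizes, which is why it needs no explicit ROC curve.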

Radiological Features

Diagnostic Performance of Three Manual Scores and the Combination Score for Severe CIP
Table 3 shows the comparison of diagnostic performance to differentiate between mild and severe CIP, and Figure 2 illustrates the ROC curves. Our results indicated that the extent score (AUC 0.857) and image finding score (AUC 0.843) performed better than the clinical symptom score (AUC 0.782). The combination of the three manual scores (−10.562 + 0.132 × Extent score + 0.748 × Image finding score + 2.154 × Clinical symptom score) demonstrated better diagnostic value than any of the three scores alone, with an AUC of 0.948. Additionally, Figure 3 shows the CT images of three representative patients with mild, severe, and fatal CIP.

Figure 3. [...] The patient presented symptoms of dyspnea, cough, and wheezing. The extent, image finding, and clinical symptom scores were 6, 3, and 3 points for the whole lung, respectively, and CIP demonstrates significant absorption after oral corticosteroid treatment. (D-F) Severe CIP with an NSIP pattern. A 64-year-old male patient with squamous cell lung carcinoma was treated with a PD-1 inhibitor combined with chemotherapy. Chest CT on day 38 after ICI initiation demonstrates a diffuse development of GGOs, consolidation, reticular opacities, interlobular septal thickening, honeycombing, and a small amount of pleural effusion, with a predominantly subpleural distribution. The patient presented symptoms of dyspnea, wheezing, fever, and hemoptysis. The extent, image finding, and clinical symptom scores were 18, 6, and 4 points for the whole lung, respectively. CIP shows significant absorption after intravenous corticosteroid treatment. (G,H) Fatal CIP with an AIP/ARDS pattern. A 60-year-old male patient with duodenal cancer was treated with a herpes simplex virus type II coding for the granulocyte-macrophage colony-stimulating factor combined with a PD-1 inhibitor. Chest CT on day 78 after ICI initiation demonstrates a diffuse development of GGOs, consolidation, reticular opacities, interlobular septal thickening, and a small amount of pleural effusion, presenting the unusual manifestation of the "crazy paving" sign. The patient presented symptoms of dyspnea, cough, and hemoptysis. The extent, image finding, and clinical symptom scores were 30, 5, and 3 points for the whole lung, respectively. The patient died early of respiratory failure, and no follow-up CT scan was available. Abbreviations: CIP, checkpoint inhibitor-related pneumonitis; PD-1, programmed cell death-1; ICIs, immune checkpoint inhibitors; GGO, ground-glass opacity; OP, organizing pneumonia; NSIP, nonspecific interstitial pneumonia; AIP/ARDS, acute interstitial pneumonia/acute respiratory distress syndrome.
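As a worked illustration, the combination score reported in the Results can be evaluated for the three representative patients in Figure 3. The sketch below uses the published coefficients and the score values given in the figure caption. The function and variable names are hypothetical, and converting the linear score to a probability with the logistic function is an assumption based on the score being derived from logistic regression. No cutoff is applied here because none is stated in this section.

```python
import math

# Coefficients of the combination score as reported in the Results.
INTERCEPT = -10.562
W_EXTENT = 0.132
W_IMAGE_FINDING = 0.748
W_CLINICAL_SYMPTOM = 2.154


def combination_score(extent, image_finding, clinical_symptom):
    """Weighted sum of the three manual scores plus the intercept."""
    return (
        INTERCEPT
        + W_EXTENT * extent
        + W_IMAGE_FINDING * image_finding
        + W_CLINICAL_SYMPTOM * clinical_symptom
    )


def logistic(x):
    """Assumption: the linear score is a logit, so this maps it to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# (extent, image finding, clinical symptom) scores from the Figure 3 caption.
figure3_patients = {
    "first patient (panels before D)": (6, 3, 3),
    "severe CIP, NSIP pattern (D-F)": (18, 6, 4),
    "fatal CIP, AIP/ARDS pattern (G,H)": (30, 5, 3),
}

results = {
    label: combination_score(*scores)
    for label, scores in figure3_patients.items()
}

for label, value in results.items():
    print(f"{label}: score = {value:.3f}, logistic = {logistic(value):.3f}")
```

For these three patients the linear score works out to −1.064, 4.918, and 3.600, respectively, so the two severe cases fall well above the first patient on this scale.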

Discussion
Severe CIP is a potentially fatal toxicity. It poses challenges in clinical decision-making, including the choice of corticosteroid dose and duration and the decision on ICI rechallenge. It is crucial to perform risk stratification of CIP patients in the early stages to determine the severity and enable timely diagnosis and treatment. Our study aimed to analyze the clinical and chest CT imaging differences between mild and severe CIP. Additionally, we established three manual scores and examined the diagnostic performance of the three scores alone and in combination when assessing CIP severity. Our results revealed that severe CIP occurred more often within 3 months than after 3 months. Severe CIP was significantly associated with fever and the AIP/ARDS pattern. The diagnostic performance of the chest CT scores (extent score and image finding score) was better than that of the clinical symptom score. The combination of chest CT features and clinical symptoms demonstrated the best diagnostic performance. These findings demonstrate the importance of chest CT, which plays a significant role in comprehensive clinical assessment.
CIP is a diagnosis of exclusion based on consensus experience [13][14][15]. Proof of drug administration, temporal eligibility (symptom development following drug initiation), and an appropriate latency period between drug administration and the development of symptoms help raise suspicion of CIP. Diagnostic confirmation requires the exclusion of infection, tumor progression, carcinomatous lymphangitis, radiation-related pneumonitis, thromboembolism, and pulmonary edema. It also requires a comprehensive consideration of clinical symptoms, chest CT imaging, sputum, blood, and urine cultures, and nasal swab testing. In general, routine lung biopsies are not recommended. However, if there is clinical or radiological doubt about the etiology of pulmonary infiltrates, a lung biopsy may provide an answer.
This study observed a positive correlation between CIP severity and the extent, image finding, and clinical symptom scores. First, the extent of involvement could indicate the severity of CIP, and an extent score over 11 points indicated lesion deterioration in our study. Clinical guidelines indicate that the extent of lung parenchyma involvement is <50% for mild CIP and >50% for severe CIP [13][14][15]. The extent also reflects the severity of radiation pneumonitis, viral pneumonia, and other drug-associated pneumonitis [23][24][25]. Second, the image finding score quantified CT findings. We speculate that as the disease worsens, reflected by a higher score, CT shows more diverse and complex pulmonary opacities in addition to GGOs and consolidation. Third, clinical symptoms are subjective and influenced by patient status, age, tumor stage, and previous pulmonary disease. Our study found that the clinical symptom score had the lowest sensitivity (0.429) and that the diagnostic performance of the chest CT scores was better than that of the clinical symptom score. Therefore, chest CT assessment might be more important when patients have mild symptoms in the early stage. Fourth, the combination of the three scores demonstrated better diagnostic value than any of the three scores alone, indicating that the comprehensive evaluation of chest CT and clinical symptoms is of great value in clinical application. Finally, chest CT could help monitor disease progression during the follow-up period. Improvement of CIP manifests as total or partial resolution of pneumonitis and a decreased score, whereas progression to a more severe phase manifests as an increased score.
This study revealed that severe CIP occurred more often within 3 months than after 3 months (11 vs. 3 cases, p = 0.038). This finding is in line with that of the study conducted by Huang et al. [26], with 6 weeks as the dividing line (92.9% vs. 7.1%, p < 0.05). Huang et al. reported that early-onset CIP had higher radiologic severity and a poorer prognosis, with an OP pattern as the dominant radiographic pattern, while late-onset CIP had lower radiologic severity and a better prognosis, with an NSIP pattern as the dominant radiographic pattern [26]. A possible explanation is that overactivation of the immune system in patients with early-onset CIP might cause more violent inflammatory cytokine cascade responses, or even a cytokine storm in severe and fatal cases. However, a study by Delaunay et al. [16] found no apparent correlation between the time of occurrence and clinical severity (p = 0.32). The possibility of an early onset of severe CIP therefore requires further verification. Nevertheless, our finding is clinically relevant: physicians should perform the necessary follow-ups to ensure the detection of severe CIP in the early stages, especially within 3 months. Our study indicated that fever was a significant predictor of severe CIP (p < 0.001). Although the relationship between fever and CIP severity is poorly understood, we speculate that fever reflects the spread of inflammatory factors throughout the body, which might indicate a more severe systemic response than localized respiratory symptoms. In clinical practice, fever also plays a role in distinguishing CIP from infectious pneumonia. CIP is less prone to fever and more prone to cough and dyspnea, and fever tends to be mild to moderate if it occurs [12,16,17,19,27,28].
Radiological features could reflect CIP severity to a certain extent. We observed that severe CIP was associated with increased lung lobe involvement, diffuse distribution, reticular opacities, interlobular septal thickening, honeycombing, and pleural effusion. Previous reports on severe CIP cases demonstrated similar CT characteristics of extensive and diffuse findings in both lungs [17,29,30]. These results indicated that as the severity of CIP increased, pneumonitis evolved from localized lung injury to a wider inflammatory response and showed complex pulmonary opacities. The AIP/ARDS pattern has been associated with the most severe clinical course and was a risk factor for CIP-related deaths [17,22]. The AIP/ARDS pattern was a significant predictor of severe CIP (p = 0.001) in our study. All seven patients with the AIP/ARDS pattern had severe CIP, and two died during follow-up. This finding is consistent with the pathological characteristics of severe and fatal CIP, and it indicates the extension and deterioration of the disease course. A pathological study revealed that two patients with fatal CIP had total lung injury [31]: one showed diffuse alveolar damage with foamy macrophage accumulation and eosinophilic hyaline membranes, and the other showed acute fibrinous pneumonitis with alveolar septal edema, abundant foamy macrophages in the airspaces, intra-alveolar fibrin, and reactive pneumocyte hyperplasia with vacuolization. Previous studies reported that the OP pattern (some studies called it the COP pattern) was the most frequent radiographic pattern of CIP [12,16,17,19,27,28], which is consistent with our study. Nishino et al. [17] also reported that the OP pattern was the most common in different tumor types and in both monotherapy and combination therapy. Notably, this finding is similar to that for other drug-related pneumonitis and radiation pneumonitis [23,24,32]. These results might indicate that pneumonitis is a general manifestation of the lung's response to various injuries, with the OP pattern as the most common form.
This study has some limitations. First, this was a single-center retrospective study with a relatively small sample size; hence, further studies with larger sample sizes are warranted for verification. Second, some patients received combination therapy, such as chemotherapy and antiangiogenic therapy, and these additional agents might have affected pneumonitis. Patients with tumors, particularly those with advanced tumors after multiline treatment, had complex clinical courses. However, our retrospective study might be more closely related to actual clinical situations than a prospective cohort with ICI monotherapy. Third, subjective assessment of radiological features could lead to interobserver bias, which we attempted to reduce by reaching consensus among three radiologists and consulting an expert. Finally, the assessment of the lesion volume on CT was based on visual and semi-quantitative measurements. Accurate and quantitative segmentation using artificial intelligence software is a future research direction.

Conclusions
Severe CIP occurred more often within 3 months than after 3 months. Severe CIP was significantly associated with fever and the AIP/ARDS pattern. The diagnostic performance of the chest CT scores was better than that of the clinical symptom score in assessing CIP severity. The combination of chest CT features and clinical symptoms had the best diagnostic performance. These findings demonstrate the importance of chest CT, which plays a significant role in comprehensive clinical assessment. Based on this evidence, we recommend that chest CT be routinely used in a comprehensive clinical evaluation.

Informed Consent Statement:
The need for patient consent was waived owing to the retrospective nature of this study.

Data Availability Statement:
Data are contained within the article. Data supporting the reported results may be provided upon reasonable request.