The Role of Chest CT Radiomics in Diagnosis of Lung Cancer or Tuberculosis: A Pilot Study

In many low-income countries, the poor availability of lung biopsy leads to delayed diagnosis of lung cancer (LC), which can appear radiologically similar to tuberculosis (TB). To assess the ability of CT radiomics to differentiate between TB and LC, and to evaluate the potential predictive role of clinical parameters, patients with a histological diagnosis of TB or LC who underwent chest CT from March 2020 to September 2021 were retrospectively enrolled. Exclusion criteria were: availability of only enhanced CT scans, previous lung surgery, and significant CT motion artefacts. After manual 3D segmentation of unenhanced CT scans, two radiologists, in consensus, extracted and compared radiomic features (t-test or Mann–Whitney test) and tested their performance in differentiating LC from TB via receiver operating characteristic (ROC) curves. Forty patients (28 LC and 12 TB) were finally enrolled; 31 were male, and the mean age was 59 ± 13 years. Significant differences were found in normal WBC count (p < 0.019), which was more frequent in the LC group (89% vs. 58%), and in age (p < 0.001), with an older population in the LC group. Significant differences were found in 16/107 radiomic features (all p < 0.05). LargeDependenceEmphasis and LargeAreaLowGrayLevelEmphasis showed the best performance in discriminating LC from TB (AUC: 0.92, sensitivity: 85.7%, specificity: 91.7%, p < 0.0001; AUC: 0.92, sensitivity: 75%, specificity: 100%, p < 0.0001, respectively). Radiomics may be a non-invasive imaging tool for differentiating LC from TB in many low-income countries, with a pivotal role in improving the management of oncological patients; however, future prospective studies will be necessary to validate these initial findings.


Introduction
Tuberculosis (TB) and lung cancer (LC) are the most common lung diseases contributing to mortality in developing nations, and both are associated with the human development index of the country or region [1]. LC is the second most commonly diagnosed cancer, accounting for 11.4% of new cases and 18% of cancer deaths, according to the Global cancer statistics 2020-GLOBOCAN estimates [2]. LC and TB are known to coexist frequently. In fact, some studies have suggested that the chronic inflammation caused by TB is carcinogenic [3,4]. Previous studies have documented the coexistence of TB and LC in a small percentage of patients [5][6][7]. Furthermore, lung malignancy and the different drugs used for its treatment are associated with immunosuppression, which often leads to mycobacterial infection [8,9].
Due to the high prevalence of TB in endemic areas, lung consolidation is presumptively treated with antibiotics and antituberculosis drugs with a wait and watch policy [10]. Only the lack of response to treatment is considered an indication to suspect malignancy; thus, cancer treatment may be delayed [11]. This often results in LC presenting at an advanced stage, with a dramatic impact on survival [12]. Limited access to pulmonary biopsy usually results in upstaging of the malignancy or, when the diagnosis is uncertain or a watch and wait strategy is adopted, in increased mortality and morbidity [13].
Imaging examinations, such as computed tomography (CT), have proven to be key tools in the differential diagnosis between TB and LC [14]. Nevertheless, in clinical practice, due to the radiological similarities between these pathologies, even expert radiologists relying on CT data are often subject to misdiagnosis [15]. In particular, chest CT findings such as consolidation and a nodular pattern, especially those with spiculations and irregular margins seen in many tuberculous lesions, can mimic primary LC [16]. Conversely, in a case of known malignancy, a tubercular non-calcified granuloma is sometimes misdiagnosed as a metastatic nodule [17]. Thus, a non-invasive diagnostic alternative is required to improve the discrimination between TB and LC.
In this context, Radiomics is an emerging medical imaging tool that turns the qualitative analysis of multimodal medical images into quantitative data [18,19]. After imaging feature extraction with dedicated software, trained algorithms provide diagnostic aid or extract quantitative information by processing the extracted features [20]. Thus, Radiomics can reflect biological information regarding the analyzed lung lesions, such as cell morphology, internal heterogeneity, and molecular and gene expression, which can provide a more accurate differential diagnosis for indeterminate masses in a non-invasive way [21][22][23].
In recent literature, only a few studies have investigated the role of quantitative radiomic features to differentiate lung TB from LC [24][25][26].
The aim of the study was to identify and compare the CT radiomic features of both TB and LC, as well as identify the best ones, in order to demonstrate the potential key role of radiomics in differentiating between these two diseases, allowing the accurate assessment and early diagnosis of LC in developing nations.

Patient Population and Study Design
This retrospective observational study was conducted in accordance with the Declaration of Helsinki. All participants provided informed consent, and the approval of the Institutional Review Board was not necessary for this observational non-interventional retrospective study.
Sixty patients admitted to Apollo Adlux Hospital, Kerala, India, from March 2020 to September 2021, were selected according to the following inclusion criteria: (a) histological diagnosis of TB or LC and (b) chest CT performed during hospitalization.
Exclusion criteria were: (a) negative chest CT for pulmonary consolidation, (b) availability of only CT scans with contrast media administration, (c) previous surgical pulmonary resection, and (d) significant motion artefacts on chest CT.
Patients' demographic characteristics, clinical findings, and laboratory results, including sex, age, comorbidities, smoking habits, respiratory symptoms, fever, erythrocyte sedimentation rate (ESR), white blood cell (WBC) count, hemoglobin (HB), and histopathology results, were also retrieved from the internal hospital records and analyzed.

CT Acquisition Technique
All patients underwent unenhanced chest CT scans during hospitalization. Chest CT acquisitions were obtained with the patients in the supine position during end-inspiration, without contrast medium injection, and with scanning in the cranio-caudal direction. CT exams were performed on a 160-slice CT scanner (CANON Aquilion SP Prime Scanner, Canon Medical Systems Corporation, Otawara, Japan). The following technical parameters were set: tube voltage 120 kV; tube current modulation 300 mAs; spiral pitch factor 0.98; collimation 64 × 0.625 mm; rotation time 0.5 s. Standard soft tissue reconstruction, with iterative reconstruction, was used for all CT images at a slice thickness of 0.5 mm.

CT Scans Evaluation and Segmentation Analysis
Digital Imaging and Communications in Medicine (DICOM) data were transferred into a picture archiving and communication system (PACS) workstation (Centricity Universal Viewer, version 6.0; GE Medical Systems, Boston, Massachusetts, United States). Two radiologists, in consensus (G.G. and D.C., with 5 years and 15 years of experience in thoracic imaging, respectively), evaluated the CT scans eligible for segmentation analysis and then performed the segmentation. The volumetric lung segmentation of each CT scan was performed by using the open-source 3D Slicer software (version 4.11.20210226, http://www.slicer.org, accessed on 28 February 2021). Slice by slice, a volumetric region of interest (VOI) was manually drawn on mediastinal window scans, with the goal of covering the total consolidation volume while avoiding pulmonary vessels, bronchi, and cavitations.

Radiomic Features Extraction
The 3D Slicer Radiomics extension (pyradiomics library [18]) was used to extract 107 radiomic features from the mediastinal window of unenhanced chest CT scans, including first and second order features: 19 first order statistics features, 13 2D and 3D shape features, 16

Statistical Analysis
Continuous data are expressed as mean ± standard deviation (SD). Categorical variables are described as counts and percentages. Gaussian distribution was tested with the Shapiro-Wilk test: normally distributed continuous variables were compared by using the Student t-test, while non-normally distributed variables were compared with the Mann-Whitney U test. Statistical significance was set at p < 0.05. For inferential comparisons, correction for multiple testing was done with the Holm-Bonferroni method; that is, the smallest p value was compared to 0.05/107 ≈ 0.00047 [27]. Statistical analysis was performed using MedCalc Statistical Software version 20.013 (MedCalc Software bvba, Ostend, Belgium).
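As an illustration of the multiple-testing correction described above, the following is a minimal sketch of the Holm-Bonferroni step-down procedure in plain Python. It is not the MedCalc implementation used in the study; the function name `holm_bonferroni` and the example p values are hypothetical and serve only to show how the threshold starts at 0.05/107 for the smallest p value and is relaxed step by step.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return one boolean per p value: True where the null hypothesis is rejected.

    Step-down rule: the k-th smallest p value (k = 1..m) is compared with
    alpha / (m - k + 1); testing stops at the first non-significant result.
    """
    m = len(p_values)
    # Indices of the p values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):  # rank is 0-based, so the divisor is m - rank
        if p_values[idx] <= alpha / (m - rank):
            rejected[idx] = True
        else:
            break  # every larger p value is also declared non-significant
    return rejected


# Hypothetical example: 107 tests, three of them with very small p values.
example_p = [0.0002, 0.0003, 0.0004] + [0.5] * 104
example_rejected = holm_bonferroni(example_p)
first_threshold = 0.05 / len(example_p)  # threshold applied to the smallest p value
```

In this toy input only the first three hypotheses are rejected, because the fourth smallest p value (0.5) exceeds 0.05/104 and the procedure stops there.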
Receiver operating characteristic (ROC) curves were plotted, and the areas under the curve (AUCs) were calculated, to test the performance of chest CT radiomic features in differentiating LC from TB; sensitivity and specificity were also evaluated.
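To make the ROC analysis concrete, the sketch below computes an AUC and a sensitivity/specificity pair for a single feature in plain Python. It rests on several assumptions that are not stated in the study: the cut-off is chosen by maximizing the Youden index, higher feature values are taken to indicate LC (a feature that is lower in LC would be negated first), and the function names and toy values are hypothetical.

```python
def roc_auc(pos_scores, neg_scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def best_cutoff(pos_scores, neg_scores):
    """Return (cutoff, sensitivity, specificity) maximizing the Youden index
    J = sensitivity + specificity - 1, with 'score >= cutoff' read as positive."""
    best = None
    for cutoff in sorted(set(pos_scores) | set(neg_scores)):
        sens = sum(s >= cutoff for s in pos_scores) / len(pos_scores)
        spec = sum(s < cutoff for s in neg_scores) / len(neg_scores)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical feature values (not study data): LC is the positive class.
lc_values = [0.9, 0.8, 0.7, 0.4]
tb_values = [0.5, 0.3, 0.2]
auc = roc_auc(lc_values, tb_values)
cutoff, sensitivity, specificity = best_cutoff(lc_values, tb_values)
```

With these toy values, 11 of the 12 LC/TB pairs are ranked correctly, so the AUC is 11/12, and the Youden-optimal cut-off of 0.7 gives a sensitivity of 0.75 and a specificity of 1.0.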

Study Population and Patients Data
From an initial population of 60 patients, ten (17%) were excluded for the absence of pulmonary consolidation at chest CT examination, three (5%) for previous pulmonary surgery, two (3%) for the availability of only CT scans with contrast media administration, and five (8%) for the presence of severe motion artifacts on chest CT images.
Thus, the final population comprised forty patients, of whom 31 were male (77%) and 9 were female (23%), with a mean age of 59 ± 13 (SD) years and an age range of 21-82. Among those, 28 patients (70%) were affected by LC, and 12 patients (30%) were affected by TB. The enrollment flowchart of the study is shown in Figure 1. There were 19 (48%) patients who had a current smoking habit, and 21 (52%) had never smoked. Additionally, 29 (73%) participants had at least one underlying comorbidity, with diabetes mellitus and hypertension being the most commonly reported. Among the patients, 33 (82%) reported symptoms such as cough (16/40), dyspnea (7/40), fever (11/40), and weight loss (9/40), while seven patients (18%) were asymptomatic. The comparison of patients' data and clinical parameters between the two patient groups showed significant differences in normal WBC count (p < 0.019), which was more frequent in the LC group (89% vs. 58%), and in age (p < 0.001), with an older population in the LC group. No other statistically significant differences were found, although LC patients presented with cough more often than the TB group (13 vs. 3).
Full data about patients' demographics, clinical records, and laboratory findings are reported in Table 1.

3D Segmentation and Radiomic Features
From the volumetric segmentation of lung parenchyma, the software extracted 107 radiomic features from chest CT scans. In the comparison between LC patients and TB patients, 16 radiomic parameters showed significantly different results after correction for multiple testing (Table 2). In particular, among Shape features, only Surface Volume Ratio was able to significantly differentiate between the two patient groups (p = 0.0003) after adjustment for multiple testing, with a good performance (AUC: 0.868, sensitivity: 85.7%, specificity: 83.3%, p < 0.0001) in discriminating LC patients.
Among First Order features, two (10Percentile and Mean) significantly differentiated LC and TB patients (p = 0.0002 and p = 0.0003, respectively); 10Percentile showed the best performance (AUC: 0.881, sensitivity: 75%, specificity: 91.7%, p < 0.0001) for the identification of LC patients. After adjustment for multiple testing, no Neighboring Gray Tone Difference Matrix (NGTDM) features were able to discriminate significantly between the LC and TB groups. Figure 6 shows the best AUCs for each category of radiomic features.

Discussion
This study investigated radiomic features of pulmonary tuberculosis (TB) and lung cancer (LC), to assess the potential role of Radiomics in differentiating between these two diseases, in 40 patients living in a developing country. Significant differences were found after adjustment for multiple testing in 16/107 radiomic features extracted: 1 Shape, 2 First Order, 2 GLCM, 2 GLDM, 5 GLRLM, and 4 GLSZM, all with p < 0.05. Furthermore, Large Dependence Emphasis (GLDM feature) and Large Area Low Gray Level Emphasis (GLSZM feature) showed the best performance in discriminating LC from TB (AUC: 0.92, sensitivity: 85.7%, specificity: 91.7%, p < 0.0001; AUC: 0.92, sensitivity: 75%, specificity: 100%, p < 0.0001, respectively). The comparison of patients' data and clinical parameters showed significant differences in normal WBC count (p < 0.019) and age (p < 0.001), with an older population in the LC group.
Our results, in line with previous studies, reinforce the idea that Radiomics could have a future role in the management of patients affected by a pulmonary mass, particularly in low-income nations, to reach the diagnosis earlier and to avoid treatment delay.
To date, only a few studies have investigated the role of Radiomics in the differential diagnosis between LC and TB. In particular, E.N. Cui and colleagues [25] validated a radiomics method for distinguishing pulmonary TB from LC, based on CT images, by analyzing peritumoral regions with good discrimination (AUC: 0.91 for the training cohort and 0.90 for the validation cohort).
In a remarkable study conducted by Feng B. et al. [26], in line with our results, 426 patients were enrolled to investigate the radiomics nomogram's differential diagnostic performance in discriminating between tuberculous granuloma and lung adenocarcinoma, appearing as solitary pulmonary solid nodules. Individualized radiomics nomograms, incorporating the radiomics features and clinical factors, were constructed to validate the diagnostic ability. Radiomics signature, age, and spiculation sign, being independent predictors, were used to build the radiomics nomogram, which showed higher diagnostic accuracy than each single model (AUCs: 0.966, 0.934, and 0.906 for training, internal validation, and external validation cohorts, respectively).
Similarly, some studies have focused on the performance of Radiomics in discriminating between LC and atypical granulomas. In detail, Yang X. et al. investigated the ability of quantitative CT radiomics to preoperatively differentiate solitary atypical granulomatous nodules from lung adenocarcinoma, in 302 patients, by analyzing the predictive performance of combined CT-based radiomics and clinical risk factors with three models, in both enhanced and unenhanced chest CT. Their study showed that the model combining unenhanced CT-based radiomics and clinical risk factors performed better than simple radiomics models (AUCs: 0.935 vs. 0.843). Moreover, the authors found that LC patients had larger CT-size and were more likely to be older (>50 years old) than patients with granulomas, similarly to our results.
Along the same lines, Dennie C. and colleagues [28] reported that CT texture analysis achieved good accuracy in differentiating LC and granulomatous lesions (AUC: 0.90, sensitivity of 88%, and specificity of 92%). These encouraging results may reflect the stronger heterogeneity of tumors compared to granulomas.
Future perspectives on LC and TB discrimination were provided by the promising results of Feng B. and colleagues [29], who investigated the diagnostic performance of a CT-based deep learning nomogram (DLN) in 550 patients with solitary solid pulmonary nodules. In their study, deep learning signature, gender, age, and lobulated shape, as independent predictors, were used to build the DLN; this combined model showed better diagnostic accuracy than any single model (AUCs 0.889, 0.879, and 0.809, in the training, internal validation, and external validation cohorts, respectively).
Prompt differential diagnosis between TB and LC is crucial to provide appropriate management and to avoid delayed diagnosis and treatment of patients affected by LC, which frequently leads to poor outcomes and survival [11]. In developing countries, there is limited access to invasive lung biopsy; thus, a non-invasive diagnostic method is often required. In this context, Radiomics has emerged in the last decades, in the imaging field, as a supporting tool for clinicians in the proper management and workup of oncologic patients [20].
This study has several limitations: first, the small sample size, partly due to exclusions for motion artefacts on CT, which could be avoided in a prospective study; second, the retrospective nature of the study; third, the intrinsic limits of Radiomics, such as the lack of standardization of image acquisition, the lack of uniformity in image processing, and operators' subjectivity; finally, the lack of a validation cohort.
In the future, these initial findings need to be validated by further prospective studies, in order to overcome these drawbacks and to establish Radiomics as an imaging biomarker with good reproducibility.
In conclusion, Radiomics may be a non-invasive imaging tool in many developing countries for differentiating LC from TB, and it may have a pivotal role in avoiding delayed diagnosis of LC and improving the management of oncological patients.

Informed Consent Statement:
Written informed consent has been obtained from the patients to publish this paper.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.