Automated Classification of Lung Cancer Subtypes Using Deep Learning and CT-Scan Based Radiomic Analysis
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Description
2.2. Semi-Automated Segmentation and Manual Inspection
2.3. Radiomic Feature Extraction
2.4. Radiomic Model Building
3. Results
3.1. Patient Demographics
3.2. Segmentation Accuracy
3.3. Radiomic Model Analysis Using SMOTE
3.4. Radiomic Analysis: Center and Center-Offset Slices
3.5. Radiomic Feature Analysis: Whole Tumor
3.6. Radiomic Feature Analysis: Three-Class Classification with Whole Tumor Features
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest

Subgroup | A: Adenocarcinoma | B: Small Cell Carcinoma | E: Large Cell Carcinoma | C: Squamous Cell Carcinoma | p-Value |
---|---|---|---|---|---|
Sex | | | | | <0.01 |
M | 118 | 21 | 4 | 47 | |
F | 133 | 17 | 1 | 14 | |
Age | | | | | 0.604 |
Median | 62 | 63.5 | 63 | 61 | |
Range | 28–63 | 32–77 | 41–72 | 47–90 | |
Smoking History | | | | | <0.01 |
S | 91 | 18 | 3 | 43 | |
NS | 160 | 28 | 2 | 18 | |
T-Stage | | | | | <0.01 |
T1 | 139 | 10 | 1 | 19 | |
T2 | 74 | 13 | 0 | 19 | |
T3 | 29 | 11 | 1 | 16 | |
T4 | 9 | 4 | 3 | 7 | |
N-Stage | | | | | <0.01 |
N0 | 145 | 5 | 1 | 33 | |
N1 | 58 | 12 | 1 | 14 | |
N2 | 5 | 0 | 1 | 2 | |
N3 | 43 | 21 | 2 | 12 | |
M-Stage | | | | | <0.01 |
M0 | 161 | 25 | 0 | 44 | |
M1 | 90 | 13 | 5 | 17 | |

Histotype Symbol | Group Size | Group Description | Pre-Restriction Successes | Post-Restriction Successes | Percentage Increase |
---|---|---|---|---|---|
A | 251 | Adenocarcinoma | 127 | 292 | 130% |
B | 38 | Small cell carcinoma | 27 | 47 | 74% |
E | 5 | Large cell carcinoma | 4 | 5 | 25% |
C | 61 | Squamous cell carcinoma | 37 | 73 | 97% |

T-Stage | Group Size | Group Description | Pre-Restriction Successes | Post-Restriction Successes | Percentage Increase |
---|---|---|---|---|---|
1a | 12 | Tumor smaller than 1 cm | 1 | 22 | 2100% |
1b | 29 | Tumor smaller than 2 cm | 8 | 33 | 312% |
1c | 128 | Tumor smaller than 3 cm | 52 | 158 | 203% |
2 | 106 | Tumor smaller than 5 cm | 69 | 116 | 68% |
3 | 57 | Tumor smaller than 7 cm | 46 | 67 | 45% |
4 | 23 | Tumor larger than 7 cm | 19 | 21 | 10% |
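
The Percentage Increase column appears to be the relative change in successful segmentations, that is, (post minus pre) divided by pre, times 100. The table itself does not state the formula, so the following Python sketch assumes that definition and recomputes each row from the pre- and post-restriction counts. The row labels `T1a` through `T4` are shorthand introduced here for the T-stage rows.

```python
# Success counts copied from the table above: (pre, post, reported % increase).
rows = {
    "A": (127, 292, 130),
    "B": (27, 47, 74),
    "E": (4, 5, 25),
    "C": (37, 73, 97),
    "T1a": (1, 22, 2100),
    "T1b": (8, 33, 312),
    "T1c": (52, 158, 203),
    "T2": (69, 116, 68),
    "T3": (46, 67, 45),
    "T4": (19, 21, 10),
}


def percentage_increase(pre, post):
    """Relative change from pre to post, expressed in percent (assumed definition)."""
    return 100.0 * (post - pre) / pre


computed = {name: percentage_increase(pre, post) for name, (pre, post, _) in rows.items()}

for name, (pre, post, reported) in rows.items():
    print(f"{name}: {computed[name]:.1f}% (table: {reported}%)")
```

Under this assumption, every recomputed value agrees with the reported figure to within one percentage point. The small differences (for example 203.8% against a reported 203%) suggest the reported figures were rounded or truncated to whole percentages.
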

Dataset | A | B | C | Total |
---|---|---|---|---|
Before SMOTE | 226 | 38 | 60 | 324 |
After SMOTE | 226 | 224 | 222 | 672 |
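
The class counts above show the minority classes B and C being oversampled to roughly the size of class A. As an illustration only, the following Python sketch implements the core SMOTE idea (interpolating between a minority sample and one of its nearest minority-class neighbours) with NumPy. It is not the implementation used in the study. The function name `smote`, the neighbour count `k=5`, the four-feature random data, and the per-class targets passed in are all assumptions chosen so that the resulting counts match the table.

```python
import numpy as np


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority-class neighbours."""
    rng = np.random.default_rng(seed)
    # Pairwise Euclidean distances within the minority class.
    diff = minority[:, None, :] - minority[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]  # indices of the k nearest samples

    base = rng.integers(0, len(minority), size=n_new)  # randomly chosen starting samples
    pick = rng.integers(0, k, size=n_new)  # which of the k neighbours to pair with
    mate = neighbours[base, pick]
    gap = rng.random((n_new, 1))  # interpolation weight in [0, 1)
    return minority[base] + gap * (minority[mate] - minority[base])


# Placeholder features (random, hypothetical); only the class sizes come from the table.
rng = np.random.default_rng(42)
counts_before = {"A": 226, "B": 38, "C": 60}
targets = {"B": 224, "C": 222}
features = {
    label: rng.normal(loc=i, size=(n, 4))
    for i, (label, n) in enumerate(counts_before.items())
}

balanced = dict(features)
for label, target in targets.items():
    synthetic = smote(features[label], target - len(features[label]))
    balanced[label] = np.vstack([features[label], synthetic])

sizes = {label: len(x) for label, x in balanced.items()}
print(sizes, sum(sizes.values()))  # {'A': 226, 'B': 224, 'C': 222} 672
```

Because each synthetic row is a convex combination of two real minority samples, it stays inside the range already spanned by that class. One general caution with this technique: if oversampling is applied before the cross-validation split, synthetic points derived from a held-out sample can leak into the training folds.
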

Model | Iteration 1 | Iteration 2 | Iteration 3 | Iteration 4 | Iteration 5 |
---|---|---|---|---|---|
Tree | 80.10% | 80.10% | 78.30% | 79.60% | 79.30% |
Discriminant | 73.20% | 72.90% | 74.40% | 73.10% | 70.40% |
Naïve Bayes | 71.40% | 70.80% | 73.10% | 69.90% | 71.30% |
SVM | 92.70% | 93.20% | 91.70% | 92.40% | 87.80% |
KNN | 89.00% | 90.30% | 89.90% | 89.90% | 89.60% |
Ensemble | 89.00% | 90.60% | 90.30% | 89.90% | 88.10% |
Narrow Neural Network | 83.00% | 83.80% | 83.00% | 84.20% | 83.80% |

Two-Class Classification | ||||
---|---|---|---|---|
Classification Model | Pre-SMOTE CV Accuracy | Post-SMOTE CV Accuracy | Pre-SMOTE AUC | Post-SMOTE AUC |
Tree | 85.10% | 74.40% | 0.50 | 0.73 |
Discriminant | 83.60% | 77.60% | 0.73 | 0.87 |
Naïve Bayes | 81.60% | 74.00% | 0.61 | 0.82 |
SVM | 85.10% | 77.60% | 0.50 | 0.84 |
KNN | 85.10% | 80.50% | 0.71 | 0.90 |
Ensemble | 85.10% | 78.00% | 0.50 | 0.86 |
Narrow Neural Network | 77.10% | 78.30% | 0.48 | 0.80 |
Three-Class Classification | ||||
Classification Model | Pre-SMOTE CV Accuracy | Post-SMOTE CV Accuracy | Pre-SMOTE AUC* | Post-SMOTE AUC* |
Tree | 70.10% | 73.30% | 0.75 | 0.81 |
Discriminant | 70.60% | 77.50% | 0.82 | 0.88 |
Naïve Bayes | 68.20% | 68.90% | 0.81 | 0.85 |
SVM | 72.10% | 85.00% | 0.82 | 0.84 |
KNN | 71.60% | 87.00% | 0.84 | 0.97 |
Ensemble | 71.60% | 84.30% | 0.77 | 0.93 |
Narrow Neural Network | 60.70% | 77.50% | 0.65 | 0.78 |

Two-Class Classification | ||||
---|---|---|---|---|
Classification Model | Pre-SMOTE CV Accuracy | Post-SMOTE CV Accuracy | Pre-SMOTE AUC | Post-SMOTE AUC |
Tree | 88.30% | 82.30% | 0.48 | 0.85 |
Discriminant | 87.30% | 82.50% | 0.75 | 0.90 |
Naïve Bayes | 84.00% | 79.60% | 0.71 | 0.86 |
SVM | 88.30% | 92.60% | 0.69 | 0.98 |
KNN | 88.30% | 88.70% | 0.69 | 0.90 |
Ensemble | 88.30% | 92.60% | 0.48 | 0.98 |
Narrow Neural Network | 82.10% | 89.90% | 0.61 | 0.92 |
Three-Class Classification | ||||
Classification Model | Pre-SMOTE CV Accuracy | Post-SMOTE CV Accuracy | Pre-SMOTE AUC* | Post-SMOTE AUC* |
Tree | 76.50% | 80.10% | 0.84 | 0.87 |
Discriminant | 73.50% | 73.20% | 0.86 | 0.87 |
Naïve Bayes | 74.70% | 71.40% | 0.83 | 0.89 |
SVM | 76.50% | 92.70% | 0.86 | 0.97 |
KNN | 89.10% | 89.00% | 0.87 | 0.86 |
Ensemble | 77.20% | 89.00% | 0.86 | 0.96 |
Narrow Neural Network | 67.60% | 83.00% | 0.71 | 0.86 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Dunn, B.; Pierobon, M.; Wei, Q. Automated Classification of Lung Cancer Subtypes Using Deep Learning and CT-Scan Based Radiomic Analysis. Bioengineering 2023, 10, 690. https://doi.org/10.3390/bioengineering10060690