Machine Learning Model of ResNet50-Ensemble Voting for Malignant–Benign Small Pulmonary Nodule Classification on Computed Tomography Images
Simple Summary
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Source
2.1.1. Inclusion/Exclusion Criteria
Inclusion criteria:
- Adult subjects (age ≥ 18 years);
- At least 2 CT images containing nodules per patient, to ensure that the information in the lung nodule images was complete;
- A clear physician’s diagnostic report was available;
- Small pulmonary nodules less than 20 mm in size for which a definitive pathologic diagnosis cannot be made.

Exclusion criteria:
- Patients treated with chemo-radiotherapy or surgery;
- Nodule images that were difficult to segment;
- Lung nodules larger than 20 mm.
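The criteria above can be read as a simple eligibility filter. The following is a minimal sketch under stated assumptions, not the authors' screening code: the `Candidate` record and all of its field names are hypothetical, and only the size part of the pathology-related criterion is modelled. The list says "less than 20 mm" for inclusion and "above 20 mm" for exclusion, so a nodule of exactly 20 mm is not covered explicitly; this sketch treats it as ineligible.

```python
from dataclasses import dataclass

MAX_NODULE_SIZE_MM = 20.0  # size threshold stated in the criteria


@dataclass
class Candidate:
    """Hypothetical per-patient record; field names are illustrative only."""
    age_years: int
    nodule_image_count: int          # CT images that contain a nodule
    has_diagnostic_report: bool      # clear physician's report available
    nodule_size_mm: float
    prior_chemoradiotherapy_or_surgery: bool
    hard_to_segment: bool


def is_eligible(c: Candidate) -> bool:
    """Apply the inclusion criteria, then the exclusion criteria."""
    meets_inclusion = (
        c.age_years >= 18
        and c.nodule_image_count >= 2
        and c.has_diagnostic_report
        and c.nodule_size_mm < MAX_NODULE_SIZE_MM
    )
    excluded = (
        c.prior_chemoradiotherapy_or_surgery
        or c.hard_to_segment
        or c.nodule_size_mm > MAX_NODULE_SIZE_MM
    )
    return meets_inclusion and not excluded
```

Keeping inclusion and exclusion as two separate expressions mirrors the two halves of the list, so each bullet maps to exactly one condition.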
2.1.2. Diagnostic Criteria
2.2. Research Design Process
2.3. Image Preprocessing
2.4. Deep Learning Algorithm
2.4.1. ResNet50
2.4.2. VGG16
2.5. Machine Learning Classifiers
2.5.1. Ensemble Voting
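As a rough illustration of the general idea behind ensemble voting, the sketch below applies hard (majority) voting to the labels predicted by several base classifiers. It is an assumption-laden example rather than the configuration used in this study: the choice of hard voting, the set of member classifiers, the label coding (0 = benign, 1 = malignant), and the tie-breaking rule are all hypothetical.

```python
from collections import Counter


def hard_vote(labels):
    """Return the most frequent label among the base classifiers' votes.

    On a tie, the label seen first wins; this is an arbitrary choice
    made for the sketch, not a recommendation.
    """
    return Counter(labels).most_common(1)[0][0]


def vote_all(per_model_predictions):
    """Combine predictions sample by sample.

    per_model_predictions maps a model name to its list of predicted
    labels, one label per sample, all lists in the same sample order.
    """
    per_sample_votes = zip(*per_model_predictions.values())
    return [hard_vote(list(votes)) for votes in per_sample_votes]


# Hypothetical predictions for three nodules (0 = benign, 1 = malignant).
example_predictions = {
    "xgboost": [1, 0, 1],
    "random_forest": [1, 0, 0],
    "svm": [1, 1, 0],
    "naive_bayes": [0, 0, 0],
}
ensemble_labels = vote_all(example_predictions)
```

With an even number of members, ties are possible, so a real pipeline would need an explicit rule for them (or soft voting over predicted probabilities).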
2.5.2. Random Forest
2.5.3. XGBoost
2.5.4. SVM
2.5.5. Naïve Bayes
2.6. Feature Visualization
2.7. Statistical Analysis
3. Results
3.1. Combined Machine Learning Models
3.2. Feature Visualization
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Database | Subjects (n, %) | Images (n, %) |
---|---|---|
Beijing Chest Hospital | 43 (10.86) | 89 (10.67) |
Beijing Cancer Hospital | 106 (26.77) | 228 (27.37) |
Xuanwu Hospital | 96 (24.24) | 204 (24.46) |
Beijing Physical Examination Center | 79 (19.95) | 175 (20.98) |
TCGA Public Database | 26 (6.57) | 51 (6.12) |
LIDC-IDRI | 46 (11.62) | 87 (10.44) |
Total | 396 (100.00) | 834 (100.00) |
Clinic Information | Benign (n = 154) | Malignant (n = 242) | p |
---|---|---|---|
Age (years, mean ± SD) | 61.43 ± 12.38 | 68.42 ± 10.29 | 0.057 a |
Gender (n, proportion) | | | 0.903 b |
Male | 81 (0.53) | 135 (0.56) | |
Female | 73 (0.47) | 107 (0.44) | |
Models | Accuracy | Sensitivity | Specificity | PPV | NPV | AUC | MAE | F1-Score |
---|---|---|---|---|---|---|---|---|
ResNet50 | 0.75 (0.73, 0.77) | 0.82 | 0.66 | 0.78 | 0.71 | 0.81 | 0.27 | 0.80 |
VGG16 | 0.61 (0.59, 0.63) | 0.37 | 0.90 | 0.82 | 0.54 | 0.61 | 0.40 | 0.51 |
VGG16-Ensemble Voting | 0.88 (0.88, 0.89) | 0.95 | 0.78 | 0.74 | 0.54 | 0.77 | 0.11 | 0.91 |
VGG16-XGBoost | 0.74 (0.72, 0.77) | 0.86 | 0.57 | 0.73 | 0.68 | 0.76 | 0.25 | 0.80 |
VGG16-Random Forest | 0.73 (0.71, 0.75) | 0.89 | 0.49 | 0.72 | 0.74 | 0.79 | 0.27 | 0.80 |
VGG16-SVM | 0.72 (0.70, 0.75) | 0.90 | 0.46 | 0.72 | 0.76 | 0.78 | 0.27 | 0.80 |
VGG16-Naïve Bayes | 0.63 (0.61, 0.66) | 0.69 | 0.54 | 0.69 | 0.55 | 0.66 | 0.37 | 0.69 |
ResNet50-Ensemble Voting | 0.94 (0.93, 0.94) | 0.96 | 0.91 | 0.85 | 0.63 | 0.88 | 0.06 | 0.95 |
ResNet50-XGBoost | 0.82 (0.80, 0.83) | 0.89 | 0.70 | 0.82 | 0.81 | 0.90 | 0.18 | 0.86 |
ResNet50-Random Forest | 0.82 (0.80, 0.84) | 0.92 | 0.66 | 0.81 | 0.86 | 0.89 | 0.19 | 0.86 |
ResNet50-SVM | 0.83 (0.82, 0.85) | 0.93 | 0.69 | 0.82 | 0.86 | 0.91 | 0.17 | 0.87 |
ResNet50-Naïve Bayes | 0.71 (0.69, 0.73) | 0.75 | 0.66 | 0.77 | 0.63 | 0.75 | 0.29 | 0.76 |
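For readers who want to relate the columns of the table above to one another, the following sketch computes the standard confusion-matrix metrics. It assumes malignant is the positive class and uses the usual textbook definitions; the counts passed to `classification_metrics` in any example are hypothetical, since the underlying confusion matrices are not given here. The helper `f1_from` shows that F1 is the harmonic mean of PPV and sensitivity, which can be checked against the reported ResNet50 row (PPV 0.78, sensitivity 0.82, F1 0.80).

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    tp/fn count malignant cases (positive class, an assumption here),
    tn/fp count benign cases.
    """
    sensitivity = tp / (tp + fn)   # recall on malignant cases
    specificity = tn / (tn + fp)   # recall on benign cases
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1_from(ppv, sensitivity),
    }


def f1_from(ppv, sensitivity):
    """F1-score as the harmonic mean of PPV and sensitivity."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Consistency check against the rounded ResNet50 row of the table.
resnet50_f1 = round(f1_from(ppv=0.78, sensitivity=0.82), 2)
```

Because the table reports values rounded to two decimals, F1 recomputed this way can differ from the reported value by about 0.01 for some rows.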
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, W.; Yu, S.; Yang, R.; Tian, Y.; Zhu, T.; Liu, H.; Jiao, D.; Zhang, F.; Liu, X.; Tao, L.; Gao, Y.; Li, Q.; Zhang, J.; Guo, X. Machine Learning Model of ResNet50-Ensemble Voting for Malignant–Benign Small Pulmonary Nodule Classification on Computed Tomography Images. Cancers 2023, 15, 5417. https://doi.org/10.3390/cancers15225417