A Machine Learning-Based Predictive Model of Epidermal Growth Factor Mutations in Lung Adenocarcinomas
Simple Summary
Abstract
1. Introduction
2. Materials and Methods
2.1. Patient Data
2.2. Case Selection
2.3. Patient Characteristics
2.4. CT Examination
2.5. Region of Interest (ROI) Labeling
2.6. Feature Engineering
2.7. Feature Selection and Modeling
2.8. Application of Shapley Value Algorithm to PCA
2.9. Model Establishment and Statistics Analysis
3. Results
3.1. Clinical Characteristics Statistics Analysis
3.2. Radiomics Features Selection
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References

Characteristics | Groups | Overall, n | Overall, % |
---|---|---|---|
Gender | Male | 441 | 58.2% |
Gender | Female | 317 | 41.8% |
Smoking History | Yes | 358 | 47.2% |
Smoking History | No | 400 | 52.8% |
Stage | IV | 602 | 79.4% |
Stage | IIIA | 71 | 9% |
Stage | IIIB | 55 | 7.2% |
Stage | IIA | 7 | 0.9% |
Stage | IIIC | 5 | 0.7% |
Stage | IB | 6 | 0.8% |
Stage | IIB | 6 | 0.8% |
Stage | IA | 4 | 0.5% |
Stage | II | 2 | 0.3% |
Family History | Yes | 79 | 10.4% |
Family History | No | 679 | 89.6% |
Tumor Location | Central Type | 239 | 31.5% |
Tumor Location | Peripheral Type | 513 | 67.7% |
Tumor Location | Unknown | 6 | 0.8% |
EGFR Mutation | wild-type | 396 | 52.2% |
EGFR Mutation | exon 19 | 218 | 28.8% |
EGFR Mutation | exon 21 | 115 | 15.2% |
EGFR Mutation | exon 20.21 | 12 | 1.6% |
EGFR Mutation | exon 18 | 6 | 0.8% |
EGFR Mutation | exon 19.20 | 5 | 0.7% |
EGFR Mutation | exon 19.21 | 3 | 0.4% |
EGFR Mutation | exon 18.20.21 | 1 | 0.1% |
EGFR Mutation | exon 18.20 | 1 | 0.1% |
EGFR Mutation | exon 15 | 1 | 0.1% |

Variables | Groups | Overall | Mutation | Wild-Type | p-Value |
---|---|---|---|---|---|
Median Age (Range) | – | 55.6 ± 10 (23–85) | 55.2 ± 9.9 (23–85) | 55.9 ± 10.1 (29–83) | 0.34 |
Gender | Female | 317 | 188 | 129 | <0.001 |
Gender | Male | 441 | 174 | 267 | |
Smoking History | No | 400 | 236 | 164 | <0.001 |
Smoking History | Yes | 358 | 126 | 232 | |
Family History | No | 679 | 326 | 353 | 0.633 |
Family History | Yes | 79 | 36 | 43 | |
Stage | I–II | 25 | 9 | 16 | 0.309 |
Stage | III–IV | 733 | 352 | 381 | |
Tumor Location | C-Type | 239 | 110 | 129 | 0.583 |
Tumor Location | P-Type | 513 | 248 | 265 | |
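
The p-values in the table above compare mutation and wild-type groups. As a rough illustration of how such a value can arise for a 2×2 comparison, the sketch below recomputes the association between gender and mutation status from the tabulated counts. It assumes a Pearson chi-square test without continuity correction; the statistical test actually used by the authors is not stated in this part of the document, and the function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Assumption: this is an illustrative test choice, not necessarily the
    one used in the study.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function equals
    # erfc(sqrt(chi2 / 2)), so no external statistics library is needed.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Gender rows of the table: (mutation, wild-type) counts for female, then male.
chi2_gender, p_gender = chi_square_2x2(188, 129, 174, 267)
print(f"chi2 = {chi2_gender:.2f}, p = {p_gender:.2e}")
```

Under these assumptions the gender comparison falls well below the 0.001 threshold reported in the table. Other rows need not match the published p-values exactly, since the result depends on the test and correction chosen.
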

Groups | Feature Names | Coefficients |
---|---|---|
Shape | Convex | −0.02970268 |
Shape | Orientation | 0.00904001 |
Shape | MeanBreadth | −0.0272518 |
GLCM | -333-4Correlation | −0.0582105 |
GLCM | 135-4InformationMeasureCorr1 | −0.01293069 |
GLCM | 45-7SumVariance | 0.005674244 |
GLCM | 135-7ClusterTendendcy | 0.00998332 |
GLCM | 90-7DifferenceEntropy | −0.02322079 |
GLCM | 0-4InverseVariance | 0.06418816 |
NID | Busyness | 0.03406641 |

Model Name | Train ROC/AUC Mean | Test ROC/AUC Mean | p | Train Acc Mean | Test Acc Mean | p | Train F1 Mean | Test F1 Mean | p |
---|---|---|---|---|---|---|---|---|---|
LGBM | 0.99 | 0.64 | 0.03 | 0.95 | 0.61 | 0.03 | 0.95 | 0.63 | 0.02 |
RF | 1 | 0.65 | 0.03 | 1 | 0.61 | 0.02 | 1 | 0.62 | 0.04 |
SVC | 0.72 | 0.65 | 0.03 | 0.64 | 0.6 | 0.01 | 0.65 | 0.61 | 0.01 |
KNN | 0.76 | 0.58 | 0.01 | 0.7 | 0.57 | 0.02 | 0.7 | 0.57 | 0.02 |
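
One way to read the performance table above is to look at the difference between training and test scores for each model. The sketch below computes that train-test AUC gap from the tabulated means. The dictionary layout and the helper name `auc_gap` are illustrative assumptions, not part of the original analysis. A large gap is commonly taken as a sign of overfitting.

```python
# (train AUC mean, test AUC mean) copied from the table above.
auc = {
    "LGBM": (0.99, 0.64),
    "RF": (1.00, 0.65),
    "SVC": (0.72, 0.65),
    "KNN": (0.76, 0.58),
}


def auc_gap(scores):
    """Return the train-minus-test AUC difference per model, rounded to 2 decimals."""
    return {name: round(train - test, 2) for name, (train, test) in scores.items()}


gaps = auc_gap(auc)
# List models from the largest gap to the smallest.
for name, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: train-test AUC gap = {gap:.2f}")
```

In this table the tree-based models (LGBM, RF) fit the training data almost perfectly yet score about the same as SVC on the test data, so their gaps are the largest.
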

Groups | Feature Names |
---|---|
Shape | Roundness |
Shape | Convex |
Shape | Orientation |
Shape | MeanBreadth |
NID | TextureStrength1 |
NID | TextureStrength |
GLCM | 90-1InverseVariance |
GLCM | 270-1InverseVariance |
GLCM | -333-1InverseVariance |
GLCM | 135-4Correlation |

Feature Type | Feature Name |
---|---|
Radiomics | Convex |
Radiomics | MeanBreadth |
Radiomics | Orientation |
Radiomics | TextureStrength |
Radiomics | Compactness1 |
Radiomics | 270-7Correlation |
Clinical | SmokingHistory |
Clinical | Age |
Clinical | Gender |
Clinical | FamilyHistory |

Model Name | Train ROC/AUC Mean | Test ROC/AUC Mean | p | Train Acc Mean | Test Acc Mean | p | Train F1 Mean | Test F1 Mean | p |
---|---|---|---|---|---|---|---|---|---|
LGBM | 1 | 0.926 | 0.084 | 1 | 0.88 | 0.11 | 1 | 0.72 | 0.32 |
RF | 1 | 0.91 | 0.079 | 1 | 0.832 | 0.1 | 1 | 0.66 | 0.28 |
SVC | 0.897 | 0.87 | 0.09 | 0.87 | 0.831 | 0.12 | 0.75 | 0.64 | 0.22 |
KNN | 0.947 | 0.84 | 0.096 | 0.9 | 0.81 | 0.09 | 0.79 | 0.63 | 0.18 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
He, R.; Yang, X.; Li, T.; He, Y.; Xie, X.; Chen, Q.; Zhang, Z.; Cheng, T. A Machine Learning-Based Predictive Model of Epidermal Growth Factor Mutations in Lung Adenocarcinomas. Cancers 2022, 14, 4664. https://doi.org/10.3390/cancers14194664