Effectiveness of Artificial Intelligence Models in Predicting Lung Cancer Recurrence: A Gene Biomarker-Driven Review
Simple Summary
Abstract
1. Introduction
2. Materials and Methods
2.1. Search Strategy
2.2. Eligibility Criteria
2.3. Study Selection
2.4. Data Extraction
2.5. Data Synthesis
3. Results
3.1. Demographic Characteristics
3.2. Machine Learning Techniques
Model Performance
- Gradient Boosting and XGBoost Model
- Support Vector Machines and Regression Models
- Hybrid and Feature-Enriched Models
3.3. Deep Learning and Multi-Omics Integration
3.4. Key Features and Gene Biomarkers
3.4.1. Biomarkers for LUAD/NSCLC Prediction
- Zhong et al., 2019 [7] identified PDIA3, MYH11, PDK1, SDC3, RPE65, LAMC3, BTK, and UPK1B as key predictive biomarkers.
- Jones et al., 2021 [28] found that SMARCA4, TP53, and genomic alterations measured by the Fraction of Genome Altered (FGA) were significant recurrence predictors.
- Luo et al., 2020 [29] reported that CpG methylation markers, including ART4, KCNK9, FAM83A, and C6orf10, provided valuable prognostic insights.
- Xu et al., 2020 [21] identified a 13-gene signature (ACTR2, ALDH2, FBP1, HIRA, ITGB2, MLF1, P4HA1, S100A10, S100B, SARS, SCGB1A1, SERPIND1, and VSIG4) that demonstrated strong predictive capability.
3.4.2. Immune-Related Markers
- Jiang et al., 2021 [8] found that FOXP3 expression and PD-L1 on tumor-infiltrating lymphocytes (TILs) played crucial roles in predicting SCLC recurrence.
- Rakaee et al., 2023 [31] reported that STK11 and KEAP1 co-mutations were associated with distinct immune phenotypes, impacting the recurrence risk.
3.4.3. Multi-Omics Approaches
- Xu et al., 2024 [37] demonstrated that the integration of PSMC1, PSMD11, PRKCB, CCNE1, NRG1, ZNF521, and NGF significantly improved the predictive accuracy.
- Zhou et al., 2023 [38] identified the long non-coding RNAs (lncRNAs) LINC00675 and MEG3 as critical recurrence predictors.
- Shi et al., 2021 [33] reported CPS1, CCR2, NT5E, ANLN, and ABCC2 as biomarkers with strong prognostic value.
3.4.4. Tumor and Immune Markers
3.4.5. Integration with AI for Enhanced Prediction
3.5. Validation and Generalizability
3.6. Clinical Relevance
3.7. Adverse Events and Bias
3.8. Effective Therapies
3.8.1. Surgical Resection
3.8.2. Adjuvant Therapy
Chemotherapy
Immunotherapy
Radiotherapy
3.9. Key Biomarkers and Their Predictive Value
3.9.1. Recurring Biomarkers in Predictive Models
3.9.2. Distinct Biomarkers Linked to Recurrence
3.9.3. Implications for AI-Driven Models
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
TNM | Tumor, Nodes, and Metastases |
AI | Artificial Intelligence |
PRISMA | Preferred Reporting Items for Systematic Reviews and Meta-Analyses |
ML | Machine Learning |
SVM | Support Vector Machines |
AUC | Area Under the Curve |
NK cells | Natural Killer cells |
CT | Computed Tomography |
MMPs | Matrix Metalloproteinases |
NGS | Next-Generation Sequencing |
ICB | Immune Checkpoint Blockade |
ANNs | Artificial Neural Networks |
PET | Positron Emission Tomography |
DEGs | Differentially Expressed Genes |
LUAD | Lung Adenocarcinoma |
RFS | Recurrence-Free Survival |
CIBERSORT | Cell-type Identification by Estimating Relative Subsets of RNA Transcripts |
TIICs | Tumor-Infiltrating Immune Cells |
LASSO | Least Absolute Shrinkage and Selection Operator |
ROC | Receiver Operating Characteristic |
SCLC | Small Cell Lung Cancer |
CPE | Concordance Probability Estimate |
LUSC | Lung Squamous Cell Carcinoma |
IBPGNET | Interpretable Biological Pathway Graph Neural Networks |
CNVs | Copy Number Variations |
SNVs | Single Nucleotide Variants |
FGA | Fraction of Genome Altered |
TILs | Tumor-Infiltrating Lymphocytes |
TSI | Tumor Stemness Index |
EHRs | Electronic Health Records |
ctDNA | Circulating Tumor DNA |
References
1. Kratzer, T.B.; Bandi, P.; Freedman, N.D.; Smith, R.A.; Travis, W.D.; Jemal, A.; Siegel, R.L. Lung cancer statistics, 2023. Cancer 2024, 130, 1330–1348.
2. Zhu, X.; Kudo, M.; Huang, X.; Sui, H.; Tian, H.; Croce, C.M.; Cui, R. Frontiers of MicroRNA Signature in Non-small Cell Lung Cancer. Front. Cell Dev. Biol. 2021, 9, 643942.
3. Hawrysz, I.; Wadolowska, L.; Slowinska, M.A.; Czerwinska, A.; Golota, J.J. Lung Cancer Risk in Men and Compliance with the 2018 WCRF/AICR Cancer Prevention Recommendations. Nutrients 2022, 14, 4295.
4. Liu, J.; Yu, Q.; Wang, X.S.; Shi, Q.; Wang, J.; Wang, F.; Ren, S.; Jin, J.; Han, B.; Zhang, W.; et al. Compound Kushen Injection Reduces Severe Toxicity and Symptom Burden Associated with Curative Radiotherapy in Patients with Lung Cancer. J. Natl. Compr. Cancer Netw. 2023, 21, 821–830.e3.
5. Moon, S.; Choi, D.; Lee, J.-Y.; Kim, M.H.; Hong, H.; Kim, B.-S.; Choi, J.-H. Machine learning-powered prediction of recurrence in patients with non-small cell lung cancer using quantitative clinical and radiomic biomarkers. In Medical Imaging 2020: Computer-Aided Diagnosis; SPIE: Houston, TX, USA, 2020.
6. Huang, P.; Illei, P.B.; Franklin, W.; Wu, P.H.; Forde, P.M.; Ashrafinia, S.; Hu, C.; Khan, H.; Vadvala, H.V.; Shih, I.M.; et al. Lung Cancer Recurrence Risk Prediction through Integrated Deep Learning Evaluation. Cancers 2022, 14, 4150.
7. Zhong, J.; Chen, J.M.; Chen, S.L.; Yi, Y.F. Constructing a Risk Prediction Model for Lung Cancer Recurrence by Using Gene Function Clustering and Machine Learning. Comb. Chem. High Throughput Screen. 2019, 22, 266–275.
8. Jiang, M.; Wu, C.; Zhang, L.; Sun, C.; Wang, H.; Xu, Y.; Sun, H.; Zhu, J.; Zhao, W.; Fang, Q.; et al. FOXP3-based immune risk model for recurrence prediction in small-cell lung cancer at stages I-III. J. Immunother. Cancer 2021, 9, e002339.
9. Depeursinge, A.; Yanagawa, M.; Leung, A.N.; Rubin, D.L. Predicting adenocarcinoma recurrence using computational texture models of nodule components in lung CT. Med. Phys. 2015, 42, 2054–2063.
10. Libling, W.A.; Korn, R.; Weiss, G.J. Review of the use of radiomics to assess the risk of recurrence in early-stage non-small cell lung cancer. Transl. Lung Cancer Res. 2023, 12, 1575–1589.
11. Ai, Y.; Li, Y.; Chen, Y.-W.; Aonpong, P.; Han, X. ResMLP_GGR: Residual Multilayer Perceptrons-Based Genotype-Guided Recurrence Prediction of Non-small Cell Lung Cancer. J. Image Graph. 2023, 11, 185–194.
12. Ai, Y.; Liu, J.; Li, Y.; Wang, F.; Du, X.; Jain, R.K.; Lin, L.; Chen, Y.W. SAMA: A Self-and-Mutual Attention Network for Accurate Recurrence Prediction of Non-Small Cell Lung Cancer Using Genetic and CT Data. IEEE J. Biomed. Health Inform. 2024, 29, 3220–3233.
13. Nakagawa, M.; Uramoto, H.; Shimokawa, H.; Onitsuka, T.; Hanagiri, T.; Tanaka, F. Insulin-like growth factor receptor-1 expression predicts postoperative recurrence in adenocarcinoma of the lung. Exp. Ther. Med. 2011, 2, 585–590.
14. Liu, C.-H. The Role of Chronic Inflammation in Lung Tumorigenesis and the Identification of Potential Biomarkers for Lung Cancer Treatment. Ph.D. Thesis, University of Pittsburgh, Pittsburgh, PA, USA, 2019.
15. Roberto, M.; Arrivi, G.; Pilozzi, E.; Montori, A.; Balducci, G.; Mercantini, P.; Laghi, A.; Ierinò, D.; Panebianco, M.; Marinelli, D.; et al. The Potential Role of Genomic Signature in Stage II Relapsed Colorectal Cancer (CRC) Patients: A Mono-Institutional Study. Cancer Manag. Res. 2022, 14, 1353–1369.
16. Wadowska, K.; Bil-Lula, I.; Trembecki, Ł.; Śliwińska-Mossoń, M. Genetic Markers in Lung Cancer Diagnosis: A Review. Int. J. Mol. Sci. 2020, 21, 4569.
17. Niknafs, N.; Conroy, M.; Anagnostou, V. Tracing the genetic fingerprints of tumour evolution: The pursuit of identifying mutations with differential weights within the overall tumour mutation burden and their role in therapeutic responses with immune checkpoint blockade. Clin. Transl. Med. 2023, 13, e1287.
18. Anita, M.; Ambhika, C.; Anish, T. Exploring the Landscape of Artificial Intelligence in Healthcare Applications. In AI Healthcare Applications and Security, Ethical, and Legal Considerations; IGI Global: Hershey, PA, USA, 2024; pp. 29–48.
19. Lorenc, A.; Romaszko-Wojtowicz, A.; Jaśkiewicz, Ł.; Doboszyńska, A.; Buciński, A. Exploring the efficacy of artificial neural networks in predicting lung cancer recurrence: A retrospective study based on patient records. Transl. Lung Cancer Res. 2023, 12, 2083–2097.
20. Ramtohul, T.; Challier, L.; Servois, V.; Girard, N. Pretreatment Tumor Growth Rate and Radiological Response as Predictive Markers of Pathological Response and Survival in Patients with Resectable Lung Cancer Treated by Neoadjuvant Treatment. Cancers 2023, 15, 4158.
21. Xu, S.; Zhou, J.; Liu, K.; Chen, Z.; He, Z. A Recurrence-Specific Gene-Based Prognosis Prediction Model for Lung Adenocarcinoma through Machine Learning Algorithm. BioMed Res. Int. 2020, 2020, 9124792.
22. Wang, Q.; Zhou, D.; Wu, F.; Liang, Q.; He, Q.; Peng, M.; Yao, T.; Hu, Y.; Qian, B.; Tang, J.; et al. Immune Microenvironment Signatures as Biomarkers to Predict Early Recurrence of Stage Ia-b Lung Cancer. Front. Oncol. 2021, 11, 680287.
23. Shen, Y.; Goparaju, C.; Yang, Y.; Babu, B.A.; Gai, W.; Pass, H.; Jiang, G. Recurrence prediction of lung adenocarcinoma using an immune gene expression and clinical data trained and validated support vector machine classifier. Transl. Lung Cancer Res. 2023, 12, 2055–2067.
24. Iriso, P.; Boulahfa, J.; Afshar, M.; Attignon, V.; Bouaoud, J.; Boussageon, M.; Fayette, J.; Foy, J.-P.; Karabajakian, A.; Kindermans, M.; et al. Artificial intelligence–based biomarkers of response to immunotherapy in patients with non–small-cell lung cancer considering previous lines of treatment. J. Clin. Oncol. 2023, 41, e14652.
25. Aonpong, P.; Iwamoto, Y.; Han, X.-H.; Lin, L.; Chen, Y.-W. Genotype-guided radiomics signatures for recurrence prediction of non-small cell lung cancer. IEEE Access 2021, 9, 90244–90254.
26. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71.
27. Abdu-Aljabar, R.D.; Awad, O.A. Improving Lung Cancer Relapse Prediction Using the Developed Optuna_XGB Classification Model. Int. J. Intell. Eng. Syst. 2023, 16, 131–141.
28. Jones, G.D.; Brandt, W.S.; Shen, R.; Sanchez-Vega, F.; Tan, K.S.; Martin, A.; Zhou, J.; Berger, M.; Solit, D.B.; Schultz, N. A genomic-pathologic annotated risk model to predict recurrence in early-stage lung adenocarcinoma. JAMA Surg. 2021, 156, e205601.
29. Luo, R.; Song, J.; Xiao, X.; Xie, Z.; Zhao, Z.; Zhang, W.; Miao, S.; Tang, Y.; Ran, L. Identifying CpG methylation signature as a promising biomarker for recurrence and immunotherapy in non–small-cell lung carcinoma. Aging 2020, 12, 14649.
30. Miao, R.; Xu, Z.; Han, T.; Liu, Y.; Zhou, J.; Guo, J.; Xing, Y.; Bai, Y.; He, Z.; Wu, J. Based on machine learning, CDC20 has been identified as a biomarker for postoperative recurrence and progression in stage I & II lung adenocarcinoma patients. Front. Oncol. 2024, 14, 1351393.
31. Rakaee, M.; Andersen, S.; Giannikou, K.; Paulsen, E.-E.; Kilvær, T.K.; Busund, L.-T.; Berg, T.; Richardsen, E.; Lombardi, A.P.; Adib, E. Machine learning-based immune phenotypes correlate with STK11/KEAP1 co-mutations and prognosis in resectable NSCLC: A sub-study of the TNM-I trial. Ann. Oncol. 2023, 34, 578–588.
32. Senthil, S.; Shubha, B.A. Improving the performance of lung cancer detection at an earlier stage and prediction of reoccurrence using the neural networks and ant lion optimizer. Int. J. Recent Technol. Eng. 2019, 8, 6378–6391.
33. Shi, H.; Han, L.; Zhao, J.; Wang, K.; Xu, M.; Shi, J.; Dong, Z. Tumor stemness and immune infiltration synergistically predict response of radiotherapy or immunotherapy and relapse in lung adenocarcinoma. Cancer Med. 2021, 10, 8944–8960.
34. Timilsina, M.; Buosi, S.; Fey, D.; Janik, A.; Torrente, M.; Provencio, M.; Bermu, A.C.; Carcereny, E.; Costabello, L.; Rodr, D. Integration of clinical information and imputed aneuploidy scores to enhance relapse prediction in early stage lung cancer patients. In Proceedings of the AMIA Annual Symposium Proceedings, Washington, DC, USA, 5–11 May 2022; American Medical Informatics Association: Washington, DC, USA; p. 1062.
35. Timilsina, M.; Fey, D.; Buosi, S.; Janik, A.; Costabello, L.; Carcereny, E.; Abreu, D.R.; Cobo, M.; Castro, R.L.; Bernabé, R. Synergy between imputed genetic pathway and clinical information for predicting recurrence in early stage non-small cell lung cancer. J. Biomed. Inform. 2023, 144, 104424.
36. Wang, H.; Lu, X.; Chen, J. Construction and experimental validation of an acetylation-related gene signature to evaluate the recurrence and immunotherapeutic response in early-stage lung adenocarcinoma. BMC Med. Genom. 2022, 15, 254.
37. Xu, Z.; Liao, H.; Huang, L.; Chen, Q.; Lan, W.; Li, S. IBPGNET: Lung adenocarcinoma recurrence prediction based on neural network interpretability. Brief. Bioinform. 2024, 25, bbae080.
38. Zhou, X.; Ji, L.; Ma, Y.; Tian, G.; Lv, K.; Yang, J. Intratumoral microbiota-host interactions shape the variability of lung adenocarcinoma and lung squamous cell carcinoma in recurrence and metastasis. Microbiol. Spectr. 2023, 11, e0373822.
39. Yang, Y.; Xu, L.; Sun, L.; Zhang, P.; Farid, S.S. Machine learning application in personalised lung cancer recurrence and survivability prediction. Comput. Struct. Biotechnol. J. 2022, 20, 1811–1820.
40. Alam, M.R.; Seo, K.J.; Abdul-Ghafar, J.; Yim, K.; Lee, S.H.; Jang, H.-J.; Jung, C.K.; Chong, Y. Recent application of artificial intelligence on histopathologic image-based prediction of gene mutation in solid cancers. Brief. Bioinform. 2023, 24, bbad151.
41. McAleese, J.; Taylor, A.; Walls, G.; Hanna, G. Differential relapse patterns for non-small cell lung cancer subtypes adenocarcinoma and squamous cell carcinoma: Implications for radiation oncology. Clin. Oncol. 2019, 31, 711–719.
42. Yin, W.; Chen, G.; Li, Y.; Li, R.; Jia, Z.; Zhong, C.; Wang, S.; Mao, X.; Cai, Z.; Deng, J. Identification of a 9-gene signature to enhance biochemical recurrence prediction in primary prostate cancer: A benchmarking study using ten machine learning methods and twelve patient cohorts. Cancer Lett. 2024, 588, 216739.
43. Ji, J.-H.; Ahn, S.G.; Yoo, Y.; Park, S.-Y.; Kim, J.-H.; Jeong, J.-Y.; Park, S.; Lee, I. Prediction of a Multi-Gene Assay (Oncotype DX and Mammaprint) Recurrence Risk Group Using Machine Learning in Estrogen Receptor-Positive, HER2-Negative Breast Cancer—The BRAIN Study. Cancers 2024, 16, 774.
44. Huang, J.; Zhang, J.-L.; Ang, L.; Li, M.-C.; Zhao, M.; Wang, Y.; Wu, Q. Proposing a novel molecular subtyping scheme for predicting distant recurrence-free survival in breast cancer post-neoadjuvant chemotherapy with close correlation to metabolism and senescence. Front. Endocrinol. 2023, 14, 1265520.
45. Wang, Z.; Ma, C.; Teng, Q.; Man, J.; Zhang, X.; Liu, X.; Zhang, T.; Chong, W.; Chen, H.; Lu, M. Identification of a ferroptosis-related gene signature predicting recurrence in stage II/III colorectal cancer based on machine learning algorithms. Front. Pharmacol. 2023, 14, 1260697.
46. Wu, J.; Liu, S.; Chen, X.; Xu, H.; Tang, Y. Machine learning identifies two autophagy-related genes as markers of recurrence in colorectal cancer. J. Int. Med. Res. 2020, 48, 0300060520958808.
47. Kim, W.; Kim, K.S.; Lee, J.E.; Noh, D.-Y.; Kim, S.-W.; Jung, Y.S.; Park, M.Y.; Park, R.W. Development of a novel breast cancer recurrence prediction model using support vector machine. J. Breast Cancer 2012, 15, 230–238.
48. Obrzut, B.; Kusy, M.; Semczuk, A.; Obrzut, M.; Kluska, J. Prediction of 5–year overall survival in cervical cancer patients treated with radical hysterectomy using computational intelligence methods. BMC Cancer 2017, 17, 840.
49. Lee, B.; Chun, S.H.; Hong, J.H.; Woo, I.S.; Kim, S.; Jeong, J.W.; Kim, J.J.; Lee, H.W.; Na, S.J.; Beck, K.S. DeepBTS: Prediction of recurrence-free survival of non-small cell lung cancer using a time-binned deep neural network. Sci. Rep. 2020, 10, 1952.
50. Shtivelman, E.; Hensing, T.; Simon, G.R.; Dennis, P.A.; Otterson, G.A.; Bueno, R.; Salgia, R. Molecular pathways and therapeutic targets in lung cancer. Oncotarget 2014, 5, 1392.
51. Uramoto, H.; Tanaka, F. Recurrence after surgery in patients with NSCLC. Transl. Lung Cancer Res. 2014, 3, 242.
52. Brambilla, E.; Gazdar, A. Pathogenesis of lung cancer signalling pathways: Roadmap for therapies. Eur. Respir. J. 2009, 33, 1485–1497.
53. Park, H.K.; Choi, Y.D.; Yun, J.-S.; Song, S.-Y.; Na, K.-J.; Yoon, J.Y.; Yoon, C.-S.; Oh, H.-J.; Kim, Y.-C.; Oh, I.-J. Genetic alterations and risk factors for recurrence in patients with non-small cell lung cancer who underwent complete surgical resection. Cancers 2023, 15, 5679.
54. Yang, S.; Liu, Y.; Li, M.-Y.; Ng, C.S.H.; Yang, S.-L.; Wang, S.; Zou, C.; Dong, Y.; Du, J.; Long, X.; et al. FOXP3 promotes tumor growth and metastasis by activating Wnt/β-catenin signaling pathway and EMT in non-small cell lung cancer. Mol. Cancer 2017, 16, 124.
55. Kratz, J.R.; Li, J.Z.; Tsui, J.; Lee, J.C.; Ding, V.W.; Rao, A.A.; Mann, M.J.; Chan, V.; Combes, A.J.; Krummel, M.F. Genetic and immunologic features of recurrent stage I lung adenocarcinoma. Sci. Rep. 2021, 11, 23690.
56. Leung, E.L.-H.; Fiscus, R.R.; Tung, J.W.; Tin, V.P.-C.; Cheng, L.C.; Sihoe, A.D.-L.; Fink, L.M.; Ma, Y.; Wong, M.P. Non-small cell lung cancer cells expressing CD44 are enriched for stem cell-like properties. PLoS ONE 2010, 5, e14062.
57. Schoenfeld, A.J.; Bandlamudi, C.; Lavery, J.A.; Montecalvo, J.; Namakydoust, A.; Rizvi, H.; Egger, J.; Concepcion, C.P.; Paul, S.; Arcila, M.E. The genomic landscape of SMARCA4 alterations and associations with outcomes in patients with lung cancer. Clin. Cancer Res. 2020, 26, 5701–5708.
58. Emmanouilidi, A.; Falasca, M. Targeting PDK1 for chemosensitization of cancer cells. Cancers 2017, 9, 140.
59. Broët, P.; Dalmasso, C.; Tan, E.H.; Alifano, M.; Zhang, S.; Wu, J.; Lee, M.H.; Régnard, J.-F.; Lim, D.; Koong, H.N. Genomic profiles specific to patient ethnicity in lung adenocarcinoma. Clin. Cancer Res. 2011, 17, 3542–3550.
60. Wang, K.; Li, H.; Chen, R.; Zhang, Y.; Sun, X.-X.; Huang, W.; Bian, H.; Chen, Z.-N. Combination of CALR and PDIA3 is a potential prognostic biomarker for non-small cell lung cancer. Oncotarget 2017, 8, 96945.
61. Santos, N.J.; Barquilha, C.N.; Barbosa, I.C.; Macedo, R.T.; Lima, F.O.; Justulin, L.A.; Barbosa, G.O.; Carvalho, H.F.; Felisbino, S.L. Syndecan family gene and protein expression and their prognostic values for prostate cancer. Int. J. Mol. Sci. 2021, 22, 8669.
62. Ai, Y.; Aonpong, P.; Wang, W.; Li, Y.; Iwamoto, Y.; Han, X.; Chen, Y.W. Residual Multilayer Perceptrons for Genotype-Guided Recurrence Prediction of Non-Small Cell Lung Cancer. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2022, 2022, 447–450.
63. Abbaker, N.; Minervini, F.; Guttadauro, A.; Solli, P.; Cioffi, U.; Scarci, M. The future of artificial intelligence in thoracic surgery for non-small cell lung cancer treatment a narrative review. Front. Oncol. 2024, 14, 1347464.
64. Herington, J.; McCradden, M.D.; Creel, K.; Boellaard, R.; Jones, E.C.; Jha, A.K.; Rahmim, A.; Scott, P.J.; Sunderland, J.J.; Wahl, R.L.; et al. Ethical Considerations for Artificial Intelligence in Medical Imaging: Deployment and Governance. J. Nucl. Med. 2023, 64, 1509–1515.
65. Herington, J.; McCradden, M.D.; Creel, K.; Boellaard, R.; Jones, E.C.; Jha, A.K.; Rahmim, A.; Scott, P.J.; Sunderland, J.J.; Wahl, R.L.; et al. Ethical Considerations for Artificial Intelligence in Medical Imaging: Data Collection, Development, and Evaluation. J. Nucl. Med. 2023, 64, 1848–1854.
Author/Year | Country | Type of Study | Number of Samples | Study Duration | Cancer Type | Machine Learning Techniques and Tools | Gender—Age Range |
---|---|---|---|---|---|---|---|
Zhong et al., 2019 [7] | China | Analytical and predictive study | Train 156/Test 83/Val 530 | - | non-small cell lung cancer | Support vector machine (SVM)/recursive feature elimination (RFE) by R packages | - |
Jones et al., 2021 [28] | USA | Prospective cohort | 426 patients | 10 years | early-stage lung adenocarcinoma | PRecur using gradient-boosting survival regression by the MSK-IMPACT sequencing platform | 140 M/286 F—69 (62–75) |
Jiang et al., 2021 [8] | China | Observational histology study | 102 patients | 5 years | Small cell lung cancer | XGBoost by R packages | 84 M/18 F—63.5 (38–81) |
Luo et al., 2020 [29] | China | Retrospective observational study | 827 TCGA NSCLC and 60 GSE | - | Non-small cell lung carcinoma | LASSO-Logistic regression and Random Forest method by R packages like limma, edgeR, and GSVA | - |
Senthil et al., 2019 [32] | India | Analytical and predictive study | - | - | Non-small cell lung cancer and small cell lung cancer | Back Propagation Network (BPN) optimized with an Ant Lion Optimization (ALO) algorithm | - |
Xu et al., 2020 [21] | China | Retrospective cohort study | 426 patients | - | Lung adenocarcinoma | LASSO Cox regression and multivariate Cox analyses by R packages like limma and DESeq2, and GSEA software | 37 M/5 F—62.9 (39–85) |
Wang et al., 2022 [36] | France, Japan, Sweden, Canada, South Korea, China | Analytical and predictive study of multiple cohorts | 334 LUAD patients/59 normal | - | Early-stage lung adenocarcinoma | Lasso regression and univariate Cox regression by R packages, the STRING database, Cytoscape software (version 3.8.0), X-tile software, and GSEA software | - |
Timilsina et al., 2023 [35] | France, Japan, Sweden, Canada, South Korea, China | Analytical and predictive study of multiple cohorts | 1348 patients | - | Early-stage lung adenocarcinoma | Lasso regression, univariate and multivariate Cox regression by R software, X-tile software, the STRING database, and Cytoscape software | 1010 M/338 F—65.7–65.9 (31–118) |
Shen et al., 2023 [23] | USA | Retrospective cohort study | 41 patients | 7 years | Early-stage lung adenocarcinoma | Support vector machine (SVM) with recursive feature elimination (SVM-RFE) by the CIBERSORT algorithm, nSolver 3.0 software, and NanoString | 12 M/29 F 65.0 recurrence/69.5 non-recurrence |
Abdu-Aljabar et al., 2023 [27] | Iraq | Analytical and predictive study | 487 patients | - | Non-Small Cell Lung Cancer | Optuna_XGB classification model and a comparison with original XGBoost, PSO, Hyperopt, Deep Forest, KNN, SVM, and Naive Bayes algorithms by Optuna optimization | - |
Timilsina et al., 2022 [34] | Ireland, UK, Spain, Czech Republic | Analytical and predictive study | 1348 patients | - | Early-stage lung adenocarcinoma | Support vector classification, logistic regression, Random Forest classification, gradient boosting machine classifier, and multi-layer perceptron classifier | 1010 M/338 F—65.7–65.9 (31–118) |
Yang et al., 2022 [39] | UK, China | Analytical and predictive study | 511 LUAD, 487 LUSC | - | Lung adenocarcinoma, Lung squamous cell carcinoma (non-small cell lung cancer) | Decision tree methods, neural networks, and support vector machines by MATLAB (R2017) | - |
Aonpong et al., 2021 [25] | Japan | Analytical and predictive study | 88 | 4 years | NSCLC | Deep neural network (DNN), ANN, stochastic gradient descent (SGD) | 64 M/24 F—69 (46–85) |
Xu et al., 2024 [37] | China | Analytical and predictive study | 134/371 | - | Lung adenocarcinoma | Interpretable Biological Pathway Graph Neural Networks (IBPGNET) | - |
Zhou et al., 2023 [38] | China | Analytical and predictive study | 123 LUAD 110 LUSC | 3 years | Lung adenocarcinoma and lung squamous cell carcinoma | Random Forest (RF), Gaussian naive Bayes (NB), and Adaboost (Ada) | - |
Shi et al., 2021 [33] | China | Analytical and predictive study | 484 | - | Lung adenocarcinoma | LASSO Cox regression | 226 M/258 F |
Miao et al., 2024 [30] | China | Analytical and predictive study | 279 | - | Lung adenocarcinoma | Random Forest, Random Survival Forest, Kaplan–Meier tool | - |
Rakaee et al., 2023 [31] | Denmark and Norway | Prospective study (TNM-I trial), retrospective study (UNN cohort) | 934 | 4/20 years | Non-small cell lung cancer (NSCLC), lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC) | Supervised machine learning, artificial neural networks, and multilayer perceptron | 523 M/411 F—(39–86) |
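Several studies in the table above build gene signatures with LASSO Cox regression. The sketch below illustrates how such a signature is commonly applied once fitted, assuming the usual linear risk score (a coefficient-weighted sum of expression values) and a cohort-median cutoff. The gene names, coefficients, expression values, and function names are hypothetical placeholders, not values from any reviewed study.

```python
from statistics import median

# Hypothetical coefficients for a three-gene signature, as a fitted
# LASSO Cox model might produce. Illustrative values only.
COEFFICIENTS = {"GENE_A": 0.42, "GENE_B": -0.31, "GENE_C": 0.18}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Linear predictor: sum of coefficient * expression over signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def stratify(patients, coefficients=COEFFICIENTS):
    """Label each patient high- or low-risk relative to the cohort median score."""
    scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low") for pid, score in scores.items()}


# Hypothetical normalized expression values for three patients.
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 0.5},
    "P2": {"GENE_A": 0.5, "GENE_B": 2.0, "GENE_C": 1.0},
    "P3": {"GENE_A": 1.0, "GENE_B": 1.0, "GENE_C": 1.0},
}
groups = stratify(patients)
```

The median cutoff is only one convention. Some studies in the table report using X-tile software, which suggests cutoffs may instead be chosen by optimization, so the threshold here should be read as an assumption.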
Author/Year | Feature Selection/Extraction Method | Model Training | AUC/ROC | Sensitivity/Specificity | Accuracy/Precision | Dice Score/F1 Score | Data Characteristics |
---|---|---|---|---|---|---|---|
Zhong 2019 [7] | Differential gene expression | Support vector machine (SVM) | AUC = 0.95 (training, internal CV) | Sensitivity (Recall, Recurrence): 0.88 Specificity (Recall, Nonrecurrence): 0.90 | Precision (Recurrence): 0.79 Precision (Nonrecurrence): 0.95 Average accuracy: 0.89 | F1-score (Recurrence): 0.77 F1-score (Nonrecurrence): 0.93 Average F1-score: 0.89 | Public gene expression datasets (GEO/TCGA) Probe counts: 22,284 (train), 17,386 (test) Balanced recurrent/nonrecurrent in training Standardized (Z-score) values, gene symbol mapping |
Jones 2021 [28] | Cox regression | Gradient boosting survival regression | CPE: 0.73 | - | - | - | 426 LUAD patients (stages I–III): broad-panel next-generation sequencing data, clinicopathologic data |
Jiang 2021 [8] | Cox regression | eXtreme gradient boosting (XGBoost) | AUC = 0.715 | - | - | - | 102 SCLC patients (stages I–III): clinical data, immunohistochemistry (IHC), gene expression data |
Luo 2020 [29] | LASSO-Logistic regression, Random Forest, LASSO-Cox regression, univariate/multivariate Cox regression | LASSO and Random Forest | AUC = 0.965 | - | - | - | 901 NSCLC samples: DNA methylation levels, RNA-seq data, clinical characteristics obtained from TCGA |
Senthil 2019 [32] | Principal Component Analysis (PCA) | BPN optimized with ALO | - | Sensitivity: up to 88.6% Specificity: up to 96.8% | Up to 99.1% accuracy | - | UCI Machine Learning Repository |
Xu 2020 [21] | LASSO-Cox regression, multivariate Cox regression | LASSO Cox regression | AUC = 96.3% | - | - | - | LUAD tissues: gene expression data obtained from TCGA and GEO |
Wang 2022 [36] | Lasso regression, univariate Cox regression | Multivariate Cox regression | AUC: up to 0.679 | - | - | - | 334 early-stage LUAD patients: transcriptome sequencing data obtained from TCGA and GEO |
Timilsina 2023 [35] | Aneuploidy score imputation, identification of overlapping features | Support Vector Classification (SVC), logistic regression (LR), Random Forest (RF), gradient boosting machine (GBM), multilayer perceptron classifier (NNC) | ROC-AUC: 0.80 | - | Accuracy: 0.76 | F1 score: 0.61 | 1348 early-stage NSCLC patients: clinical and genomic data obtained from TCGA |
Shen 2023 [23] | Recursive feature elimination (RFE) | Support vector machine (SVM) | Training set: 92.0% Validation set: 91.7% | Training set sensitivity: 89.5% Training set specificity: 62.5% Validation set sensitivity: 75.0% Validation set specificity: 100.0% | Training set accuracy: 91.2% Validation set accuracy: 90.0% | - | 41 early-stage LUAD patients: gene expression data, clinical data |
Abdu-Aljabar 2023 [27] | eXtreme gradient boosting (XGBoost) | Optuna-optimized eXtreme gradient boosting (Optuna_XGBoost) | GSE8894 dataset: 0.93 GSE68465 dataset: 0.79 | GSE8894 dataset: Sensitivity: 1.00 Specificity: 0.86 GSE68465 dataset: Sensitivity: 0.90 Specificity: 0.68 | Accuracy: GSE8894 dataset: 0.93 GSE68465 dataset: 0.81 | F1 Score for the GSE8894 dataset: 0.93 F1 Score for the GSE68465 dataset: 0.84 | Gene expression data |
Timilsina 2022 [34] | Aneuploidy score imputation, identification of overlapping features | Support Vector Classification, logistic regression, Random Forest classification, gradient boosting machine classifier, and multilayer perceptron classifier | ROC-AUC score: 0.79 | - | - | - | 1348 early-stage NSCLC patients: clinical data, imputed aneuploidy scores |
Yang 2022 [39] | ANOVA | Decision trees (CART) artificial neural networks (feedforward neural network) support vector machines (least-squares SVM) | AUC = 0.82 | - | - | - | 511 LUAD samples and 487 LUSC samples: demographic, clinical, and genomic data obtained from TCGA |
Aonpong 2021 [25] | Gray level co-occurrence matrix (GLCM), ResNet50 model/LASSO, F-test (ANOVA), CHI-2 | Deep neural network (DNN) regression, artificial neural network (ANN) | AUC = 0.7667 | Sensitivity: 0.95 Specificity: 0.59 | Accuracy: 83.28% | - | 88 NSCLC patients: CT images + gene expression data |
Xu 2024 [37] | Chi-square test for omics data (top 3,000 features per dataset) | Interpretable Biological Pathway Graph Neural Networks (IBPGNET), 5-fold cross-validation repeated 5× | AUC = 0.88 | Not explicitly reported | Accuracy: 0.82 AUPR (Precision-Recall): 0.790 | F1 score: 0.68 | LUAD patients: multi-omics data (SNV, AMP_CNV, DEL_CNV), clinical data High-dimensional (18,498–19,645 features/omics type) Class imbalance (134 vs. 371) |
Zhou 2023 [38] | DESeq2 | Random Forest (RF), Gaussian naive Bayes (NB), Adaboost (Ada) classifiers | AUC 0.81 | - | Accuracy = 0.78 | - | 123 LUAD and 110 LUSC patients: transcriptome data, microbiome data, clinical data |
Shi 2021 [33] | LASSO—multivariate Cox regression | LASSO Cox regression | AUC = up to 0.856 | - | - | - | 484 LUAD patients: clinical and genomic data obtained from TCGA |
Miao 2024 [30] | Weighted Gene Co-expression Network Analysis (WGCNA) | Random forest, random survival forest | AUC = 0.948 | Sensitivity: 0.93 Specificity: 0.94 | - | - | LUAD tissues: gene expression data, clinical data obtained from TCGA and GEO |
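The performance columns in the table above (sensitivity, specificity, precision, accuracy, F1 score, AUC) all derive from the same underlying predictions. As a minimal sketch of how they relate, the example below computes each from confusion-matrix counts and computes AUC as the fraction of recurrent/non-recurrent pairs ranked correctly. The counts, labels, scores, and function names are hypothetical and do not reproduce any study's results.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Derive the table's threshold-based metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall for the recurrence class
    specificity = tn / (tn + fp)   # recall for the non-recurrence class
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


def roc_auc(labels, scores):
    """AUC as the probability that a recurrent case scores above a
    non-recurrent one, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical cohort: 50 recurrent and 100 non-recurrent patients.
m = confusion_metrics(tp=40, fp=10, tn=90, fn=10)

# Hypothetical predicted risks for six patients (1 = recurrence).
auc = roc_auc([1, 1, 0, 0, 1, 0], [0.9, 0.6, 0.4, 0.7, 0.8, 0.2])
```

Because AUC is threshold-free while the other metrics depend on a chosen cutoff, a model can show a high AUC alongside a low specificity, as some rows in the table do.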
Category | Study (Year) | Key Biomarkers |
---|---|---|
LUAD/NSCLC Prediction | Zhong et al., 2019 [7] | PDIA3, MYH11, PDK1, SDC3, RPE65, LAMC3, BTK, UPK1B |
LUAD/NSCLC Prediction | Jones et al., 2021 [28] | SMARCA4, TP53, Fraction of Genome Altered (FGA) |
LUAD/NSCLC Prediction | Luo et al., 2020 [29] | CpG methylation markers: ART4, KCNK9, FAM83A, C6orf10 |
LUAD/NSCLC Prediction | Xu et al., 2020 [21] | 13-gene signature: ACTR2, ALDH2, FBP1, HIRA, ITGB2, MLF1, P4HA1, S100A10, S100B, SARS, SCGB1A1, SERPIND1, VSIG4 |
Immune-Related Markers | Jiang et al., 2021 [8] | FOXP3 expression, PD-L1 on tumor-infiltrating lymphocytes (TILs) |
Immune-Related Markers | Rakaee et al., 2023 [31] | STK11 and KEAP1 co-mutations |
Multi-Omics Approaches | Xu et al., 2024 [37] | PSMC1, PSMD11, PRKCB, CCNE1, NRG1, ZNF521, NGF |
Multi-Omics Approaches | Zhou et al., 2023 [38] | Long non-coding RNAs: LINC00675, MEG3 |
Multi-Omics Approaches | Shi et al., 2021 [33] | CPS1, CCR2, NT5E, ANLN, ABCC2 |
Tumor and Immune Markers | Shen et al., 2023 [23] | MR1, BCL6, CCL13 (tumor tissue), TBX21, IL-17RB, GZMB (buffy coat) |
Tumor and Immune Markers | Abdu-Aljabar et al., 2023 [27] | BTBD6, KLHL7, BMPR1A |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Pourakbar, N.; Motamedi, A.; Pashapour, M.; Sharifi, M.E.; Sharabiani, S.S.; Fazlollahi, A.; Abdollahi, H.; Rahmim, A.; Rezaei, S. Effectiveness of Artificial Intelligence Models in Predicting Lung Cancer Recurrence: A Gene Biomarker-Driven Review. Cancers 2025, 17, 1892. https://doi.org/10.3390/cancers17111892