AI-Driven Models for Diagnosing and Predicting Outcomes in Lung Cancer: A Systematic Review and Meta-Analysis

Simple Summary
This research explores the potential of artificial intelligence (AI) in the early detection of lung cancer. Through a systematic review and meta-analysis, this study evaluates the effectiveness of AI models as a promising avenue for improving diagnostic accuracy. Among 1024 identified records, 39 studies were selected and analyzed following the PRISMA guidelines. The findings highlight significant strides in AI's role and emphasize the need for standardized protocols. Despite the observed heterogeneity, this study underscores AI's promising impact on lung cancer screening, laying the groundwork for future advancements in clinical practice. This research contributes insights for healthcare professionals and researchers alike, aiming to enhance the early diagnosis and management of lung cancer.

Abstract
(1) Background: Lung cancer's high mortality due to late diagnosis highlights a need for early detection strategies. Artificial intelligence (AI) in healthcare, particularly for lung cancer, offers promise by analyzing medical data for early identification and personalized treatment. This systematic review evaluates AI's performance in early lung cancer detection, analyzing its techniques, strengths, limitations, and comparative edge over traditional methods. (2) Methods: This systematic review and meta-analysis followed the PRISMA guidelines, outlining a comprehensive protocol and employing tailored search strategies across diverse databases. Two reviewers independently screened studies based on predefined criteria to select high-quality data relevant to AI's role in lung cancer detection. The extraction of key study details and performance metrics, followed by quality assessment, supported the analysis, which was performed using R software (Version 4.3.0). The selection process is depicted in a PRISMA flow diagram. (3) Results: From 1024 records, 39 studies met the inclusion criteria, showcasing diverse AI model applications for lung cancer detection with varying strengths among the studies. These findings underscore AI's potential for early lung cancer diagnosis but highlight the need for standardization amidst study variations. The pooled sensitivity and specificity were both 0.87, indicating AI's accuracy in identifying true positives and true negatives, despite the observed heterogeneity attributed to diverse study parameters. (4) Conclusions: AI demonstrates promise in early lung cancer detection, showing high accuracy levels in this systematic review. However, study variations underline the need for standardized protocols to fully leverage AI's potential in early diagnosis, ultimately benefiting patients and healthcare professionals. As the field progresses, AI models validated in large-scale prospective studies will greatly benefit clinical practice and patient care.


Introduction
Lung cancer remains a formidable global health challenge, claiming the lives of millions of individuals each year [1]. The high mortality rate associated with lung cancer is primarily attributed to the advanced stage at which it is often diagnosed [2]. Lung cancer is notorious for its asymptomatic early stages, making it extremely difficult to diagnose until it has reached an advanced, often incurable, stage. The later the diagnosis, the more limited the treatment options, and the grimmer the prognosis for patients. In contrast, when lung cancer is detected at an early stage, the chances of successful treatment and long-term survival increase significantly. Consequently, there is a pressing need for innovative strategies to enable early detection, as this could significantly improve the prognosis and overall survival rates of lung cancer patients [3]. In recent years, the field of artificial intelligence (AI) has emerged as a promising avenue for achieving this goal. In the field of healthcare, AI has shown promise in improving diagnostic accuracy, predicting disease outcomes, and personalizing treatment plans [4]. In the context of lung cancer, AI systems can analyze vast datasets of medical images, patient records, and genetic information to identify patterns and abnormalities that may elude human perception. These systems can not only detect lung cancer at earlier stages, but also assist in risk assessment and treatment planning [5].
Current methods for early lung cancer detection include screening programs such as low-dose computed tomography (LDCT) and the analysis of biomarkers. While these approaches have demonstrated some success, they are not without limitations [6]. LDCT, for instance, may lead to overdiagnosis and increased healthcare costs. AI systems can potentially enhance the effectiveness of these methods by providing more precise and efficient analysis, reducing false positives and false negatives, and offering a complementary approach to existing techniques. Despite the potential benefits of AI in early lung cancer detection, several challenges and considerations must be addressed [7,8]. The performance of AI models can vary depending on the quality and diversity of the data used for training [9,10]. Therefore, the selection and curation of data are fundamental to the success of AI-based systems in this context.
This systematic review and meta-analysis provides a comprehensive evaluation of the performance of AI systems for the early detection of lung cancer. It analyzes the current state of AI applications in lung cancer detection, including the various techniques and approaches being utilized. Furthermore, it critically assesses the advantages and limitations of AI-based methods compared to traditional approaches.

Materials and Methods
In conducting this systematic review and meta-analysis evaluating the performance of AI systems for the early detection of lung cancer, we adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 guidelines. The following subsections outline the steps taken in this review.

Protocol Development
The research question was formulated to assess the performance of AI systems in early lung cancer detection. A detailed protocol was developed, outlining the inclusion and exclusion criteria, search strategy, and methods for data extraction and analysis. In accordance with the journal's guidelines, this systematic review was not registered in any specific database prior to its initiation. While journals encourage registration of systematic reviews, it is not a universally mandatory requirement in the field. This decision is consistent with established practice in this domain, given the extensive body of systematic reviews published in reputable peer-reviewed journals without prior registration.

Literature Search
Comprehensive searches were conducted in electronic databases, including PubMed, Google Scholar, ScienceDirect, and Embase, to identify relevant studies published up to October 2023. The inclusion of Google Scholar helped us to identify grey literature, conference papers, and other non-traditional sources of information.

Search Strategy
The search strategy included a combination of keywords and Medical Subject Heading (MeSH) terms related to lung cancer, artificial intelligence, and early detection. The search strategy was tailored to each database to account for variations in syntax and indexing, with the aim of minimizing the risk of missing relevant studies.

Study Selection
Two independent reviewers screened titles and abstracts for eligibility based on predefined criteria. This initial screening rapidly identified studies that met the inclusion criteria and eliminated those that did not. Full-text articles of potentially relevant studies were then retrieved and reviewed to ensure that the selection process was rigorous and that the final dataset was of high quality.

Eligibility Criteria

Inclusion Criteria
Studies evaluating the performance of AI systems for the early detection of lung cancer.
Original research articles published in English.
Studies reporting sensitivity, specificity, and other relevant diagnostic performance metrics.
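
The diagnostic performance metrics named in the inclusion criteria can be illustrated with a minimal Python sketch. This is only an illustration of the standard definitions, not the extraction procedure used in this review; the `ConfusionCounts` class and the example counts are hypothetical and do not come from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 diagnostic counts for one study (hypothetical structure)."""
    tp: int  # cancers correctly flagged by the AI system
    fp: int  # non-cancers incorrectly flagged
    fn: int  # cancers missed
    tn: int  # non-cancers correctly cleared


def sensitivity(c: ConfusionCounts) -> float:
    """Proportion of true cancer cases that were detected."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Proportion of non-cancer cases that were correctly cleared."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Proportion of all cases classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = ConfusionCounts(tp=90, fp=20, fn=10, tn=80)
print(sensitivity(example), specificity(example), accuracy(example))
```

Because accuracy is a prevalence-weighted average of sensitivity and specificity, it always falls between the two, which is a useful sanity check when extracting reported values.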

Exclusion Criteria
Studies lacking sufficient data on AI system performance.
Reviews, commentaries, and conference abstracts without primary data.

Data Extraction
Relevant data, including study characteristics, AI system details, validation methods, and diagnostic performance metrics, were extracted using a standardized data extraction form. Two reviewers independently extracted data from selected studies. Discrepancies were resolved through discussion or consultation with a third reviewer.
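
As a rough illustration of the dual-extraction step described above, the following Python sketch compares two reviewers' extraction records and lists the fields that would need discussion or referral to a third reviewer. The field names and values are hypothetical; the actual extraction form used in this review is not reproduced here.

```python
def find_discrepancies(reviewer_a: dict, reviewer_b: dict) -> dict:
    """Return the fields on which two extraction records differ.

    Each differing field maps to a (reviewer_a_value, reviewer_b_value) pair.
    A field missing from one record is reported with None on that side.
    """
    fields = sorted(set(reviewer_a) | set(reviewer_b))
    return {
        field: (reviewer_a.get(field), reviewer_b.get(field))
        for field in fields
        if reviewer_a.get(field) != reviewer_b.get(field)
    }


# Hypothetical extraction records for one study (illustrative values only).
record_a = {"study_id": "S01", "design": "retrospective",
            "sensitivity": 0.91, "specificity": 0.85}
record_b = {"study_id": "S01", "design": "retrospective",
            "sensitivity": 0.91, "specificity": 0.58}

conflicts = find_discrepancies(record_a, record_b)
for field, (a_value, b_value) in conflicts.items():
    print(f"{field}: reviewer A = {a_value}, reviewer B = {b_value}")
```

In this example only the specificity field is flagged, which is the kind of transcription mismatch that independent double extraction is intended to catch.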

Quality Assessment
The quality of the included studies was assessed using appropriate tools, considering factors such as study design, patient selection, and AI system evaluation. A risk-of-bias graph and summary were generated to visually represent the methodological quality of the included studies.

Data Synthesis and Analysis
The meta-analysis was carried out using R software (Version 4.3.0, Vienna, Austria) along with the RStudio interface (Version 2023.03.0, Boston, MA, USA). The meta and metafor packages were used to calculate key performance metrics, including pooled sensitivity and specificity, along with their 95% confidence intervals. Moreover, heterogeneity among the included studies was evaluated using a chi-square test and the I² statistic.
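
To make the pooling step concrete, the sketch below shows one common way to pool proportions such as sensitivity: a DerSimonian-Laird random-effects model on the logit scale, with Cochran's Q (the chi-square heterogeneity statistic) and I². This is a Python illustration under stated assumptions, not the R code used in this review. The exact model settings applied in meta and metafor are not reported here, and the function name `pool_proportions` and the example counts are hypothetical.

```python
import math


def inv_logit(x: float) -> float:
    """Map a logit value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_proportions(events, totals, z=1.96):
    """Pool per-study proportions with a DerSimonian-Laird random-effects model.

    events[i] / totals[i] is the per-study proportion (for sensitivity:
    true positives / all diseased). Returns the pooled proportion, its
    confidence interval, Cochran's Q, I^2 (in percent), and tau^2.
    """
    y, v = [], []
    for e, n in zip(events, totals):
        if e == 0 or e == n:  # continuity correction for 0% or 100%
            e, n = e + 0.5, n + 1.0
        y.append(math.log(e / (n - e)))    # logit of the proportion
        v.append(1.0 / e + 1.0 / (n - e))  # approximate variance of the logit
    w = [1.0 / vi for vi in v]             # fixed-effect (inverse-variance) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(y) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (vi + tau2) for vi in v]  # random-effects weights
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": inv_logit(mu),
        "ci_low": inv_logit(mu - z * se),
        "ci_high": inv_logit(mu + z * se),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical true-positive counts and diseased totals for three studies.
result = pool_proportions(events=[95, 70, 88], totals=[100, 100, 100])
print({key: round(value, 3) for key, value in result.items()})
```

Pooling on the logit scale keeps the back-transformed confidence interval inside the 0 to 1 range, which is why this transformation is a frequent choice for sensitivity and specificity.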

Reporting
A PRISMA flow diagram was used to illustrate the study selection process, including the number of studies identified, screened, assessed for eligibility, and included in the final analysis.

Results
The flow diagram in Figure 1 shows that the researchers identified 1024 records from the databases, but only 116 records were assessed for eligibility. At the identification stage, 326 records were excluded due to duplication. During the screening stage, 28 records were excluded because they were not in English. Records that lacked the essential data required for the systematic review and records without available full-text versions were also excluded. Review articles, which summarize and analyze existing research, were excluded during the screening stage. After completing the identification and screening stages, the research team identified 39 studies that met the inclusion criteria and were relevant to the systematic review. These studies formed the basis for the subsequent data extraction and analysis.
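
The record counts reported above can be cross-checked with simple arithmetic. The Python sketch below uses only the numbers stated in the text and assumes a strictly linear flow (identification, de-duplication, screening, eligibility, inclusion). The derived totals are therefore implied values under that assumption, not figures reported by the review, and the variable names are illustrative.

```python
# Counts stated in the text.
identified = 1024
duplicates_removed = 326
assessed_for_eligibility = 116
included = 39

# Derived counts, assuming a strictly linear PRISMA flow (an assumption).
after_deduplication = identified - duplicates_removed
excluded_during_screening = after_deduplication - assessed_for_eligibility
excluded_at_eligibility = assessed_for_eligibility - included

print("Records after de-duplication:", after_deduplication)
print("Records excluded during screening:", excluded_during_screening)
print("Records excluded at eligibility:", excluded_at_eligibility)
```

Under this assumption, the 28 non-English records would be a subset of the records excluded during screening.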
Table 1 presents an overview of the characteristics of the studies included in our systematic review, each focusing on the application of AI models for the early detection of lung cancer. The table encompasses a diverse range of studies conducted across different countries and utilizing various AI models and data sources. When comparing and contrasting the results of these studies, several key insights emerge. Studies such as Wu (2020) achieved exceptionally high sensitivity, minimizing the risk of missing cancer cases [4,7,10]. The study by Chen (2022), on the other hand, showcases the effectiveness of AI models, specifically CNN and RNN, in improving the overall accuracy of lung cancer prediction [8]. Notably, Huang et al. (2018) integrated sensor array technology with machine learning, demonstrating its promise in the precise identification of lung cancer, especially when compared to traditional models [11].
Li et al. (2019) conducted a retrospective study in China using 3D deep learning technology on CT scans [9]. Their AI system achieved a sensitivity of 75% and a specificity of 82%, with a reported overall accuracy of 88.8%. This research emphasized the potential of AI as a diagnostic tool capable of providing more precise and unbiased outcomes in the diagnosis of pulmonary nodules, ultimately reducing the interpretation time for radiologists. Choi et al. (2018), from the USA, conducted a retrospective study using Support Vector Machine (SVM) and LASSO on LIDC-IDRI data [5]. Their AI model achieved an accuracy of 84.6%, which was 12.4% higher than the accuracy of Lung-RADS. This result demonstrated the potential of AI to substantially improve the accuracy of lung cancer detection. Another study from China employed a 3D CMixNet model on the LUNA-16 and LIDC-IDRI datasets. This system achieved a sensitivity of 94.0% and a specificity of 91.0%, showing better results than existing methods for lung cancer detection.
These variations in results highlight the trade-offs between sensitivity and specificity, as well as the distinct strengths of different AI models and approaches. While some studies emphasize the potential of AI in overcoming specific challenges, such as PD-L1 assessment or eligibility assessment, others underscore the efficiency and reliability of AI in lung cancer screening. Collectively, these findings underscore the transformative role AI can play in improving the accuracy, efficiency, and reliability of lung cancer diagnosis, ultimately benefiting patients and healthcare professionals. Our systematic review incorporates these findings to offer a holistic understanding of the state of AI in lung cancer detection and of the potential of these technologies in the field of oncology.

Cross-Validation
The performance of the observers in evaluating the risk of malignancy was slightly higher than the performance of fusion AI algorithms. Figures 2 and 3 present forest plots of the pooled sensitivity and specificity of AI models for the early diagnosis of lung cancer. The pooled sensitivity and specificity of AI models across the included studies were 0.87 (95% CI: 0.82-0.90) and 0.87 (95% CI: 0.80-0.91), respectively. These results indicate that AI models demonstrated a high level of accuracy in correctly identifying true positives and true negatives, showing promising results for the early diagnosis of lung cancer. However, heterogeneity was observed among the included studies. This heterogeneity may be attributed to variations in study populations, data sources, and model specifications. The results of the quality assessment are presented in Figure 4.

Discussion
Lung cancer is one of the most prevalent diseases worldwide and the leading cause of cancer-associated deaths, with an estimated 2.2 million new cases and 1.8 million deaths in 2020 [44]. Currently, a CT scan of the chest is the most frequent method of lung cancer screening. Its high resolution can show the relationships among surrounding organs and blood vessels more clearly, and it plays a significant role in the early detection of lung cancer [45]. However, the accuracy of this method can be influenced by benign lesions such as necrosis, inflammation, and tuberculosis, by various textures in lung images, and by several other factors such as the experience of radiologists, potentially leading to misdiagnosis and omissions [46]. With the implementation of AI-assisted diagnostic systems in clinical practice, a new era has begun in the field of lung cancer diagnosis. Recent studies have documented the growing and widespread utilization of AI models in clinical diagnosis and treatment [47-49]. AI models primarily focus on diagnosing and evaluating various medical images, including skin lesions, pathological microscopic images, and radiological data. AI models are notable for their ability to enhance diagnostic accuracy, stability, and work efficiency.
This review documented promising results, indicating that AI models for the early diagnosis of lung cancer demonstrated a high level of accuracy, with pooled sensitivity and specificity values of 0.87 (95% CI: 0.82-0.90) and 0.87 (95% CI: 0.80-0.91), respectively. These findings suggest that AI models exhibit significant capability in correctly identifying true positives and true negatives. Liu et al. recently conducted a systematic review and meta-analysis in which they also demonstrated the commendable performance of AI models in predicting lung cancer, with a pooled sensitivity and specificity of 89% and 87%, respectively [46]. Robust performance is crucial in lung cancer, where early detection substantially impacts patient outcomes. Accuracy of more than 90% was reported in several of the included studies [7,8,10,12,14,15,26,32], which aligns with the broader trend in the medical literature supporting the effectiveness of AI in diagnostic settings. However, lower pooled accuracy, ranging from 67% to 75%, was reported in studies focused mainly on lung cancer screening, both across all such studies [20,30,34,42] and among those in which a CNN model was employed [20,30,42]. Despite these lower figures, we are confident that AI models are a valuable resource for radiologists in detecting lung cancer.
Different AI-assisted diagnostic systems produce different diagnostic outcomes. One study reported that a 3D CNN model offered greater advantages in detecting lung cancer than other AI models [18]. However, in two studies, ANN achieved high diagnostic performance that could be useful for the detection of lung cancer [32,33]. Distinct algorithms demonstrate diverse diagnostic capabilities, notably in radiomics and deep learning, which not only assist in predicting the benign or malignant nature of lung nodules, but also help to predict the prognosis of small-cell lung cancer [50,51]. The utilization of AI models in clinical practice is promising; however, validation remains a critical step for generalizability. Among the 39 articles, 15 performed cross-validation to assess the effectiveness and reliability of AI models [5,6,12,14,15,19,21,23,26,28-30,36,42,50].
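
For readers less familiar with cross-validation, the following Python sketch shows how k-fold cross-validation partitions a dataset so that every sample is used for testing exactly once. It is a minimal illustration of the general technique only. The included studies may have used different variants (for example, shuffled or stratified folds), and the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples: int, k: int):
    """Yield (train_indices, test_indices) pairs for k contiguous folds.

    No shuffling or stratification is applied; in practice the sample
    order is usually randomized before splitting.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Example: 10 samples split into 5 folds of 2 test samples each.
folds = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

A model is trained on each training split and evaluated on the matching test split, and the k performance estimates are then averaged, which gives a less optimistic estimate than evaluating on the training data itself.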
Despite the overall positive outcomes, heterogeneity was observed among the included studies. Therefore, future research should focus on refining AI models while addressing the sources of this heterogeneity. Collaborative efforts among researchers, clinicians, and policymakers are essential to establish guidelines and standards for the development and evaluation of AI systems in lung cancer screening. By addressing these challenges collectively, the field can progress toward the implementation of AI technologies in clinical settings, ultimately improving the early diagnosis and management of lung cancer.
The current study has certain limitations that should be acknowledged: (1) The exclusion of studies lacking complete diagnostic data may have altered the results. (2) Only English-language articles were included, potentially introducing language bias. (3) The high heterogeneity among the included studies may be attributed to variations in study populations, data sources, and model specifications, and these results warrant further investigation. (4) The included studies were mainly retrospective in design, which may have affected the overall quality of the systematic review and meta-analysis. Despite the initial verification of AI models' effectiveness in lung cancer screening, most AI-based approaches are still at the laboratory research stage and have not yet been implemented in clinical practice. Limitations are evident in data integration, image data quality, the definition of legal liability, the diagnosis of complex pathology, and the cost of use. However, a large number of experienced healthcare professionals, especially radiologists and pathologists, are becoming involved in AI-assisted lung cancer detection. It is anticipated that AI models will play a significant role in the early detection of lung cancer.

Conclusions
This systematic review and meta-analysis reported the promising outcomes of AI models in the early detection of lung cancer. The pooled sensitivity and specificity values of 0.87 (95% CI: 0.82-0.90) and 0.87 (95% CI: 0.80-0.91) showed the potential of AI models in identifying true positives and true negatives. Given the observed heterogeneity among the included studies, these findings highlight the need for standardized protocols in the development of AI models for lung cancer screening. As the medical field continues to evolve, healthcare professionals and patients will benefit from the integration of AI models in clinical practice once these models have been validated in large-scale prospective studies.
et al. (2022) and Alexander et al. (2020) achieved notably high specificity levels, suggesting the potential for reducing false positives in clinical settings, Baldwin et al. (

Figure 1. Flowchart of included studies.

Table 1. Characteristics of included studies.