Systematic Review

Artificial Intelligence in Laryngeal Cancer Detection: A Systematic Review and Meta-Analysis

1 Department of Otolaryngology, University Hospitals of Leicester, Leicester LE1 5WW, UK
2 Independent Researcher, Leicester LE2 2AD, UK
3 Department of Maxillofacial Surgery, University Hospitals of Leicester, Leicester LE1 5WW, UK
4 Department of Otolaryngology, University Hospitals of Derby and Burton, Derby DE22 3NE, UK
* Author to whom correspondence should be addressed.
Curr. Oncol. 2025, 32(6), 338; https://doi.org/10.3390/curroncol32060338
Submission received: 3 May 2025 / Revised: 29 May 2025 / Accepted: 5 June 2025 / Published: 9 June 2025
(This article belongs to the Section Head and Neck Oncology)

Abstract

(1) Background: The early detection of laryngeal cancer is crucial for achieving superior patient outcomes and preserving laryngeal function. Artificial intelligence (AI) methodologies can expedite the triage of suspicious laryngeal lesions, thereby shortening the critical timeframe to clinical intervention. (2) Methods: We conducted a systematic search across five major databases (MEDLINE, EMCARE, EMBASE, PubMed, and the Cochrane Library) for studies published up to February 2025. Fifteen studies, with a total of 17,559 patients, were included. A risk of bias assessment was performed using the QUADAS-2 tool, and data synthesis was conducted using the Meta-Disc 1.4 software. (3) Results: The meta-analysis revealed that AI demonstrated high sensitivity (78%) and specificity (86%), with a pooled diagnostic odds ratio of 53.77 (95% CI: 27.38 to 105.62) for detecting laryngeal cancer. A subset analysis revealed that CNN-based AI models are superior to non-CNN-based models in image analysis and lesion detection. (4) Conclusions: Given its diagnostic accuracy, high sensitivity, and specificity, AI is suitable for use in real-world settings.

1. Introduction

The larynx is an important anatomical structure with roles in phonation, respiration, and deglutition. Its physiological function significantly impacts quality of life: it acts as a crucial gateway, directing air into the lungs for breathing and food into the oesophagus on its way to the stomach. Despite its functional importance, the larynx is highly vulnerable to pathology, and laryngeal carcinoma is a central malignancy of the upper aerodigestive tract. It comprises approximately 30–40% of head and neck malignancies and is the most common malignancy in otolaryngology [1]. With the incidence estimated to exceed 24,500 cases per year by 2030 [2], early-stage diagnosis is all the more critical for reducing patient mortality and preserving both laryngeal anatomy and vocal fold function [3]. However, laryngeal cancer diagnosis can be challenging: a study published in the Canadian Association of Radiologists Journal identified missed opportunities for an earlier diagnosis of head and neck cancers on prior CT or MRI scans in 4% of cases. Imaging modalities such as ultrasound, computed tomography (CT), and magnetic resonance imaging (MRI) play a crucial role in the staging process and aid in the assessment of tumour size, local invasion, cartilage involvement, and regional lymph node spread [4]. Each tool has advantages and disadvantages that shape its use. The diagnosis of laryngeal lesions begins primarily with indirect laryngoscopy, preferably with endoscopy equipment [5].
Therefore, there is growing interest in utilising artificial intelligence (AI) to enhance clinical outcomes. AI-assisted laryngoscopy offers significant advantages, such as facilitating earlier and more accurate detection of malignant lesions by even non-expert clinicians, which may allow for timely interventions and better prognoses [6].
In this systematic review and meta-analysis, we aim to evaluate the diagnostic accuracy and effectiveness of artificial intelligence in detecting laryngeal cancers by analysing images captured by a laryngeal endoscope during the evaluation of patients with laryngeal lesions. Furthermore, we review limitations such as biases and research gaps.

The Role of AI in Medical Imaging Analysis

Artificial intelligence (AI) is the simulation of human intelligence in machines that are programmed to think and learn like humans [7]. The role of AI in medicine is constantly expanding. AI promises to revolutionise patient care in the coming years [8].
In his survey, “How Artificial Intelligence Is Shaping Medical Imaging Technology: A Survey of Innovations and Applications” [9], Luis Pinto-Coelho highlighted current and potential applications of AI in medical imaging, such as the interpretation of brain, breast, and other images. He found that the fusion of medical imaging and AI has led to significant advancements in healthcare, from early disease detection to personalised diagnosis and therapy.
Khalifa and Albadawy, in their review, “AI in Diagnostic Imaging: Revolutionising Accuracy and Efficiency” [10], stated that AI is revolutionising diagnostic imaging, enhancing the accuracy and efficiency of medical image interpretation, and significantly impacting healthcare delivery.
AI remains a controversial subject, particularly concerning safety and confidentiality. However, these challenges can be managed by implementing basic safeguards. The above two papers underscore the transformative impact of AI on diagnostic imaging, enhancing accuracy, efficiency, and personalisation in patient care. As AI technologies evolve, their integration into clinical workflows is poised to become a cornerstone of modern medicine, shaping the future of diagnostics and treatment planning.

2. Materials and Methods

2.1. Preregistration

This review was registered in PROSPERO with ID CRD420250656619 on 21 March 2025.

2.2. Search Strategy and Selection Criteria

We followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines in conducting this systematic review and meta-analysis. A careful search was performed across five databases: MEDLINE, EMBASE, EMCARE, PubMed, and the Cochrane Library (Figure 1). The search aimed to identify studies that used AI models of different architectures and tested their sensitivity, specificity, and diagnostic accuracy in detecting laryngeal cancer. The search period was from 1 January 2000 to 1 February 2025. A study was excluded if it was not in English, was a case report, was a systematic review or meta-analysis, had insufficient data, or did not involve cancer detection through imaging. We included original research focused on cancer detection using laryngoscope images that was written in English and provided sufficient data for meta-analysis. All included patients were clinically suspicious; no cases selected on radiological suspicion alone were included. We formed our search strategy using Boolean operators, combining three core concepts: (1) laryngeal cancer, using the terms “laryngeal cancer”, “laryngeal carcinoma”, and “cancer of the larynx”; (2) artificial intelligence, using terms such as “artificial Intelligence”, “AI”, “machine learning”, “deep learning”, “neural network”, “CNN”, and “computer-aided diagnosis”; and (3) diagnostic application, incorporating keywords like “diagnosis”, “detection”, and “classification”. These were further combined with modality-specific terms such as “endoscopy”, “laryngoscopy”, “medical imaging”, “image analysis”, “video endoscopy”, “voice signal”, and “narrowband imaging”. The final Boolean search query was (“laryngeal cancer” OR “laryngeal carcinoma” OR “cancer of the larynx”) AND (“artificial Intelligence” OR “AI” OR “machine learning” OR “deep learning” OR “neural network” OR “CNN” OR “computer-aided diagnosis”) AND (“diagnosis” OR “detection” OR “classification”) AND (“endoscopy” OR “laryngoscopy” OR “medical imaging” OR “image analysis” OR “video endoscopy” OR “voice signal” OR “narrowband imaging”).
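For illustration, the concept-wise structure of this query can be assembled programmatically. The following is a minimal sketch, not part of the original search workflow; the term lists are copied from above, and the helper function or_block is our own naming:

```python
# Minimal sketch: build the Boolean query from the three core concepts plus
# the modality-specific terms described above. Term lists are taken verbatim
# from the search strategy; the helper function is illustrative only.
cancer_terms = ["laryngeal cancer", "laryngeal carcinoma", "cancer of the larynx"]
ai_terms = ["artificial Intelligence", "AI", "machine learning", "deep learning",
            "neural network", "CNN", "computer-aided diagnosis"]
task_terms = ["diagnosis", "detection", "classification"]
modality_terms = ["endoscopy", "laryngoscopy", "medical imaging", "image analysis",
                  "video endoscopy", "voice signal", "narrowband imaging"]

def or_block(terms):
    """Join a concept's synonyms with OR and wrap the block in parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

# AND the four OR-blocks together, mirroring the final query quoted above.
query = " AND ".join(or_block(t) for t in (cancer_terms, ai_terms, task_terms, modality_terms))
print(query)
```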
Two independent reviewers (AA and RB) screened all titles and abstracts identified through the database search to determine whether the studies met the inclusion criteria. Full-text articles were retrieved for studies deemed potentially eligible. Both reviewers assessed each full-text report independently against the predefined inclusion and exclusion criteria. Any disagreements were resolved through discussion and, if necessary, consultation with a third reviewer (MM) for an opinion. No automation tools were used in the screening or selection process.

2.3. Data Extraction and Analysis

For eligible studies, we extracted data on sensitivity, specificity, sample size, accuracy, and complication rates (Table 1). Studies included in the review compared the diagnostic performance of various machine learning algorithms, including logistic regression, decision trees, and neural networks, in identifying laryngeal cancer. Each study had to report quantitative metrics, such as sensitivity, specificity, accuracy, and the area under the receiver operating characteristic (ROC) curve. We excluded case reports and studies lacking reported sensitivity or specificity values. The risk of bias in the included studies was assessed using the QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies-2) tool, which evaluates four key domains: patient selection, index test, reference standard, and flow and timing. Two reviewers (AA and RB) assessed each study independently using the QUADAS-2 criteria, and their assessments were compared to ensure consistency. Any discrepancies were resolved through discussion or, if necessary, consultation with a third reviewer (MM). The results of the quality assessment are illustrated in Figure 2. No automated tools were used in the bias assessment process.

2.4. Statistical Analysis and Synthesis Methods

To determine eligibility for synthesis, we first tabulated the key characteristics of each included study, including study design, type of AI model used, imaging modality (e.g., laryngeal endoscopy), target condition (laryngeal cancer), and reported diagnostic performance metrics. These characteristics were compared against the predefined inclusion criteria and the planned outcomes for synthesis.
Only studies that evaluated the use of artificial intelligence for detecting laryngeal cancer based on imaging data and reported or allowed for calculation of sensitivity and specificity were included in the quantitative synthesis. Studies that did not provide sufficient diagnostic data or focused on non-cancerous laryngeal conditions were excluded from the meta-analysis but were described narratively when relevant. Two reviewers (AA and RB) carried out this process independently, and any disagreements were resolved by discussion.
We used the Meta-Disc 1.4 software to compute the overall pooled sensitivity, specificity, and diagnostic accuracy values and to generate forest plots for these values. Two reviewers (AA and RB) independently extracted data using a standardised data collection form. For each included study, relevant information such as study design, population characteristics, AI model details, and diagnostic performance metrics was recorded. In cases where true positive, true negative, false positive, and false negative values were not explicitly reported, these values were calculated from the available data, including sample size, number of cancer cases, and reported sensitivity and specificity. No automation tools were used in the data extraction process. When necessary, efforts were made to contact study authors to clarify or obtain missing data. Discrepancies between reviewers were resolved through discussion and consensus. Forest plots were generated to illustrate the variability in sensitivity and specificity estimates, accompanied by 95% confidence intervals (CIs) (Figure 3).
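The 2 × 2 reconstruction described above follows directly from the definitions of sensitivity and specificity. The sketch below illustrates the arithmetic; the function name and the example counts are ours, not taken from any included study. Note that it can yield fractional cell counts, which is why some entries in Table 1 contain decimals.

```python
# Sketch of the 2x2 reconstruction: given a study's number of cancer-positive
# and cancer-negative cases plus its reported sensitivity and specificity,
# recover the TP/TN/FP/FN cells. Rounded sensitivity/specificity values can
# produce fractional cells, as seen for some models in Table 1.
def reconstruct_2x2(n_pos, n_neg, sensitivity, specificity):
    tp = sensitivity * n_pos   # sensitivity = TP / (TP + FN)
    fn = n_pos - tp
    tn = specificity * n_neg   # specificity = TN / (TN + FP)
    fp = n_neg - tn
    return tp, tn, fp, fn

# Hypothetical example: 200 cancer images and 400 benign images with a
# reported sensitivity of 0.78 and specificity of 0.86.
tp, tn, fp, fn = reconstruct_2x2(200, 400, 0.78, 0.86)
print(f"TP={tp:.1f} TN={tn:.1f} FP={fp:.1f} FN={fn:.1f}")
# -> TP=156.0 TN=344.0 FP=56.0 FN=44.0
```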

3. Results

We carried out a thorough search across five databases: MEDLINE (25 records), EMBASE (28 records), EMCARE (11 records), PubMed (26 records), and the Cochrane Library (1 record), yielding a total of 91 records. After removing 36 duplicates and excluding 32 records at the title and abstract stage, 23 studies remained for eligibility screening. Of these, eight were excluded on full-text review, leaving 15 studies included in the final analysis. The complete search process is outlined in the PRISMA flow diagram (Figure 1).
We excluded eight studies due to insufficient data [24,26,27,28,29,30,31,32].
All included studies [11,12,13,14,15,16,17,18,19,20,21,22,23,24,25] were retrospective experimental studies, were published in English, and provided sufficient data to calculate true positive, true negative, false positive, and false negative rates. These studies focused on detecting laryngeal cancer through laryngoscope videos or photos and comparing the effectiveness of AI in distinguishing cancerous lesions from benign ones. The majority of studies utilised flexible nasoendoscopy, with only one study employing rigid endoscopy. Regarding the imaging modalities used in the included studies, twelve employed white light imaging (WLI), two utilised a combination of WLI and narrow band imaging (NBI), and one study used NBI alone.
The data collected are presented in Table 1.

3.1. Overall Pooled Analysis

This review was conducted on 15 studies, including 22,842 images of the larynx taken from 13,570 patients. The analysis yielded an overall pooled sensitivity of 78% (95% CI: 77–78%), showing a strong ability to correctly identify positive cases. The pooled specificity was 86% (95% CI: 86–87%), indicating strong performance in correctly identifying negative cases. The diagnostic odds ratio (DOR), a global measure of test effectiveness, had a pooled value of 53.77 (95% CI: 27.38–105.62), suggesting consistent overall discriminatory capacity. The SROC curve produced an AUC of 0.9380 and a Q index of 0.8750, emphasising the excellent overall diagnostic accuracy of the AI-based models under evaluation. The results are illustrated in Figure 3.
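To make the DOR figures concrete: each study's DOR is (TP × TN)/(FP × FN), and studies are combined on the log-odds scale. The sketch below is a simplified, fixed-effect illustration of this pooling with made-up 2 × 2 tables; Meta-Disc's random-effects pooling and continuity handling differ in detail, so it will not reproduce the 53.77 reported above.

```python
import math

# Simplified sketch of pooled diagnostic odds ratio (DOR) estimation.
# Per study: DOR = (TP * TN) / (FP * FN); 0.5 is added to every cell to
# guard against zero counts. Studies are combined on the log scale with
# inverse-variance (Woolf) weights. Fixed-effect pooling is shown for
# brevity; Meta-Disc applies a random-effects model when heterogeneity
# is present, so results will differ from the review's pooled DOR.
studies = [  # illustrative (TP, FP, FN, TN) tuples, not the reviewed data
    (90, 10, 20, 180),
    (150, 30, 40, 300),
    (75, 12, 25, 160),
]

log_dors, weights = [], []
for tp, fp, fn, tn in studies:
    tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))  # continuity correction
    log_dors.append(math.log((tp * tn) / (fp * fn)))
    weights.append(1.0 / (1/tp + 1/fp + 1/fn + 1/tn))     # inverse Woolf variance

pooled_log = sum(w * d for w, d in zip(weights, log_dors)) / sum(weights)
se = math.sqrt(1.0 / sum(weights))
low, high = pooled_log - 1.96 * se, pooled_log + 1.96 * se
print(f"pooled DOR = {math.exp(pooled_log):.2f} "
      f"(95% CI: {math.exp(low):.2f}-{math.exp(high):.2f})")
```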

3.2. Subset Analysis: Diagnostic Accuracy of CNN-Based Models vs. Non-CNN Models

We conducted a subset analysis to compare the diagnostic accuracy of Convolutional Neural Network (CNN)-based models with that of non-CNN-based models used in our included studies. The meta-analysis compares the diagnostic performance of non-CNN-based and CNN-based models using several forest plots and SROC curves. The non-CNN group yielded a pooled sensitivity of 0.65 (95% CI: 0.64–0.66) and specificity of 0.80 (95% CI: 0.79–0.81), with high heterogeneity (I² = 96.7% and 97.9%, respectively). The corresponding symmetric SROC curve showed an AUC of 0.9044, indicating good overall accuracy (Figure 4). In contrast, CNN-based models demonstrated superior diagnostic accuracy, with a pooled sensitivity of 0.85 (95% CI: 0.84–0.86) and specificity of 0.90 (95% CI: 0.89–0.90), both with high but slightly lower heterogeneity than the non-CNN models. The AUC of the SROC for CNN-based models was 0.9307, suggesting enhanced discriminative ability. Furthermore, the diagnostic odds ratio (DOR) was markedly higher in the CNN group (pooled DOR = 51.04, 95% CI: 37.53–69.41) than in the non-CNN group (pooled DOR = 15.13, 95% CI: 10.09–22.67), emphasising the significant improvement in diagnostic performance with CNN-based approaches (Figure 5). Overall, the results support the superior accuracy and diagnostic strength of CNN-based models in medical image classification tasks.
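The I² values quoted above derive from Cochran's Q, the weighted sum of squared deviations of each study's estimate from the pooled estimate, via I² = max(0, (Q − df)/Q). A minimal sketch of this calculation follows; the function name and the example logit-scale estimates and variances are ours, purely for illustration.

```python
# Sketch of the heterogeneity statistics behind the I^2 values above.
# Cochran's Q compares each study's estimate with the inverse-variance
# pooled estimate; I^2 = max(0, (Q - df) / Q) is the share of total
# variation attributable to between-study heterogeneity rather than chance.
def i_squared(estimates, variances):
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    return max(0.0, (q - df) / q) if q > 0 else 0.0

# Hypothetical logit-sensitivities and their variances for five studies.
estimates = [1.2, 0.8, 1.9, 0.4, 1.5]
variances = [0.010, 0.020, 0.015, 0.010, 0.030]
print(f"I^2 = {100 * i_squared(estimates, variances):.1f}%")
```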

4. Discussion

The rapid development of machine learning and artificial intelligence, with their potent analytical abilities, has raised several concerns and questions: will AI imminently replace medical expertise? [33]. The answer is difficult to predict; however, AI remains a valuable tool for improving our practice. In their article, Davenport T. and Kalakota R. explain the implications of AI for the healthcare workforce and how it can improve, but not replace, medical practice. The healthcare jobs most likely to be affected by AI are those that involve dealing with digital information, such as radiology and pathology [34]. Among AI’s most prominent and promising uses is radiology, where deep learning excels in lesion detection and medical image analysis [35]. Beyond radiology, comparable developments are being seen in histopathology, where AI has the potential to alleviate the workload of human experts, make reporting more consistent and unbiased, and promote better clinical outcomes [36]. Our findings further support these developments by demonstrating that AI can accurately assist in diagnosing laryngeal cancer and reduce the proportion of cases missed through lack of experience. Beyond greater diagnostic accuracy, AI reduces delays by automating image and video interpretation, triaging suspicious lesions faster. For laryngeal cancer, where early-stage disease is often more suitable for organ-preserving therapies such as transoral laser microsurgery or radiotherapy, early diagnosis significantly influences treatment choices and long-term voice and airway health. AI also helps standardise the terminology used during image analysis: by reducing inter-observer variability, AI systems can provide more uniform care across institutions and geographic regions.

4.1. Limitations

This meta-analysis has several limitations that warrant consideration. The included studies were highly heterogeneous in their data sources, image quality, and deep learning algorithms, introducing selection bias. Future work should employ improved study designs with standardised imaging modalities and clearly specified deep learning models to improve reliability. Moreover, most included studies lacked external validation datasets, which limits generalisability to real-world practice settings. There is a clear need for high-quality prospective studies with comprehensive external validation cohorts to address this gap and strengthen the current evidence base. Finally, while many papers demonstrated the superior diagnostic performance of deep learning models compared with clinicians, particularly those with less experience, insufficient data were provided on clinician performance with and without AI support, making an aggregated objective appraisal difficult. Furthermore, the majority of these studies focused on Chinese populations; data from ethnically and geographically distinct populations are needed to maximise the applicability and generalisability of the findings.

4.2. Future Application

The evidence presented in this review elucidates the exceptional diagnostic performance of artificial intelligence (AI) models in the identification of laryngeal lesions, with individual studies demonstrating performance levels that are equal to or exceed those of experienced clinicians. Consequently, this warrants the integration of AI software to support routine otolaryngology clinics, particularly in settings with limited resources or areas lacking specialised expertise. AI-assisted laryngoscopy has the potential to facilitate earlier diagnoses of malignancies, standardise assessments, and minimise diagnostic delays. Nonetheless, it is imperative for clinicians to acknowledge the limitations of these models and to provide appropriate clinical oversight to mitigate the risk of over-reliance on such technologies.

5. Conclusions

We have found that AI has excellent diagnostic accuracy, with high sensitivity and specificity, and can be used in real-world settings and otolaryngology clinics. Substantial room for improvement remains. Our review suggests that studies trained on larger image datasets achieve better outcomes, which leads us to propose a single centralised international database of laryngeal lesion images on which AI models can be trained for better performance.

Author Contributions

Conceptualization, A.A., M.M. and M.H.A.-K.; Methodology, A.A., M.M., M.E. and W.K.; Software, R.A.-B., W.K., M.E. and F.A.; Validation, M.H.A.-K., S.A.-D., M.E. and F.A.; Formal Analysis, A.A., M.E. and M.M.; Investigation, A.A., R.B.S. and T.S.R.; Resources, R.B.S., W.K. and T.S.R.; Data Curation, R.A.-B.; Writing—Original Draft Preparation, A.A. and R.B.S.; Writing—Review and Editing, F.A., T.S.R. and R.B.S.; Visualization, S.A.-D.; Supervision, M.M. All authors have read and agreed to the published version of the manuscript.

Funding

This study received no external funding.

Data Availability Statement

All data supporting this study are included in the main manuscript; no additional raw data are available.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AI: Artificial Intelligence
ML: Machine Learning
CNN: Convolutional Neural Network

References

  1. Igissin, N.; Zatonskikh, V.; Telmanova, Z.; Tulebaev, R.; Moore, M. Laryngeal cancer: Epidemiology, etiology, and prevention: A narrative review. Iran. J. Public Health 2023, 52, 2248. [Google Scholar] [CrossRef]
  2. Gupta, B.; Johnson, N.W.; Kumar, N. Global epidemiology of head and neck cancers: A continuing challenge. Oncology 2016, 91, 13–23. [Google Scholar] [CrossRef]
  3. Araújo, T.; Santos, C.P.; De Momi, E.; Moccia, S. Learned and handcrafted early-stage laryngeal SCC diagnosis features. Med. Biol. Eng. Comput. 2019, 57, 2683–2692. [Google Scholar] [CrossRef] [PubMed]
  4. Lu, F.; Lysack, J.T. Lessons Learned From Commonly Missed Head and Neck Cancers on Cross-Sectional Imaging. Can. Assoc. Radiol. J. 2022, 73, 595–597. [Google Scholar] [CrossRef] [PubMed]
  5. Krausert, C.R.; Olszewski, A.E.; Taylor, L.N.; McMurray, J.S.; Dailey, S.H.; Jiang, J.J. Mucosal wave measurement and visualization techniques. J. Voice 2011, 25, 395–405. [Google Scholar] [CrossRef]
  6. Alonso-Coello, P.; Rigau, D.; Sanabria, A.J.; Plaza, V.; Miravitlles, M.; Martinez, L. Quality and strength: The GRADE system for formulating recommendations in clinical practice guidelines. Arch. Bronconeumol. (Engl. Ed.) 2013, 49, 261–267. [Google Scholar] [CrossRef]
  7. Obuchowicz, R.; Strzelecki, M.; Piórkowski, A. Clinical applications of artificial intelligence in medical imaging and image processing—A review. Cancers 2024, 16, 1870. [Google Scholar] [CrossRef]
  8. Ahmad, Z.; Rahim, S.; Zubair, M.; Abdul-Ghafar, J. Artificial intelligence [AI] in medicine, current applications and future role with special emphasis on its potential and promise in pathology: Present and future impact, obstacles including costs and acceptance among pathologists, practical and philosophical considerations. A comprehensive review. Diagn. Pathol. 2021, 16, 24. [Google Scholar]
  9. Pinto-Coelho, L. How artificial intelligence shapes medical imaging technology: A survey of innovations and applications. Bioengineering 2023, 10, 1435. [Google Scholar] [CrossRef]
  10. Khalifa, M.; Albadawy, M. AI in diagnostic imaging: Revolutionising accuracy and efficiency. Comput. Methods Programs Biomed. Update 2024, 5, 100146. [Google Scholar] [CrossRef]
  11. Baldini, C.; Azam, M.A.; Sampieri, C.; Ioppi, A.; Ruiz-Sevilla, L.; Vilaseca, I.; Alegre, B.; Tirrito, A.; Pennacchi, A.; Peretti, G.; et al. An automated approach for real-time informative frames classification in laryngeal endoscopy using deep learning. Eur. Arch. Oto-Rhino-Laryngol. 2024, 281, 4255–4264. [Google Scholar] [CrossRef] [PubMed]
  12. Yao, P.; Witte, D.; Gimonet, H.; German, A.; Andreadis, K.; Cheng, M.; Sulica, L.; Elemento, O.; Barnes, J.; Rameau, A. Automatic classification of informative laryngoscopic images using deep learning. Laryngoscope Investig. Otolaryngol. 2022, 7, 460–466. [Google Scholar] [CrossRef] [PubMed]
  13. Ren, J.; Jing, X.; Wang, J.; Ren, X.; Xu, Y.; Yang, Q.; Ma, L.; Sun, Y.; Xu, W.; Yang, N.; et al. Automatic recognition of laryngoscopic images using a deep-learning technique. Laryngoscope 2020, 130, E686–E693. [Google Scholar] [CrossRef]
  14. FEM, C. CARS 2023—Computer Assisted Radiology and Surgery. Int. J. CARS 2023, 18 (Suppl. S1), S1–S123. [Google Scholar]
  15. Xu, Z.H.; Fan, D.G.; Huang, J.Q.; Wang, J.W.; Wang, Y.; Li, Y.Z. Computer-aided diagnosis of laryngeal cancer based on deep learning with laryngoscopic images. Diagnostics 2023, 13, 3669. [Google Scholar] [CrossRef]
  16. Xiong, H.; Lin, P.; Yu, J.G.; Ye, J.; Xiao, L.; Tao, Y.; Jiang, Z.; Lin, W.; Liu, M.; Xu, J.; et al. Computer-aided diagnosis of laryngeal cancer via deep learning based on laryngoscopic images. EBioMedicine 2019, 48, 92–99. [Google Scholar] [CrossRef]
  17. Zhao, W.; Zhi, J.; Zheng, H.; Du, J.; Wei, M.; Lin, P.; Li, L.; Wang, W. Construction of prediction model of early glottic cancer based on machine learning. Acta Oto-Laryngol. 2025, 145, 72–80. [Google Scholar] [CrossRef] [PubMed]
  18. Wellenstein, D.J.; Woodburn, J.; Marres, H.A.; van den Broek, G.B. Detection of laryngeal carcinoma during endoscopy using artificial Intelligence. Head Neck 2023, 45, 2217–2226. [Google Scholar] [CrossRef]
  19. Fang, S.; Fu, J.; Du, C.; Lin, T.; Yan, Y. Identifying laryngeal neoplasms in laryngoscope images via deep learning based object detection: A case study on an extremely small data set. Irbm 2023, 44, 100799. [Google Scholar] [CrossRef]
  20. Mamidi, I.S.; Dunham, M.E.; Adkins, L.K.; McWhorter, A.J.; Fang, Z.; Banh, B.T. Laryngeal cancer screening during flexible video laryngoscopy using large computer vision models. Ann. Otol. Rhinol. Laryngol. 2024, 133, 720–728. [Google Scholar] [CrossRef]
  21. Kang, Y.F.; Yang, L.; Hu, Y.F.; Xu, K.; Cai, L.J.; Hu, B.B.; Lu, X. Self-Attention Mechanisms-Based Laryngoscopy Image Classification Technique for Laryngeal Cancer Detection. Head Neck 2025, 47, 944–955. [Google Scholar] [CrossRef] [PubMed]
  22. Wang, M.L.; Tie, C.W.; Wang, J.H.; Zhu, J.Q.; Chen, B.H.; Li, Y.; Zhang, S.; Liu, L.; Guo, L.; Yang, L.; et al. Multi-instance learning based artificial intelligence model to assist vocal fold leukoplakia diagnosis: A multicentre diagnostic study. Am. J. Otolaryngol. 2024, 45, 104342. [Google Scholar] [CrossRef]
  23. Esmaeili, N.; Davaris, N.; Boese, A.; Illanes, A.; Navab, N.; Friebe, M.; Arens, C. Contact Endoscopy–Narrow Band Imaging (CE-NBI) data set for laryngeal lesion assessment. Sci. Data 2023, 10, 733. [Google Scholar] [CrossRef]
  24. Yan, P.; Li, S.; Zhou, Z.; Liu, Q.; Wu, J.; Ren, Q.; Chen, Q.; Chen, Z.; Chen, Z.; Chen, S.; et al. Automated detection of glottic laryngeal carcinoma in laryngoscopic images from a multicentre database using a convolutional neural network. Clin. Otolaryngol. 2023, 48, 436–441. [Google Scholar] [CrossRef] [PubMed]
  25. Dunham, M.E.; Kong, K.A.; McWhorter, A.J.; Adkins, L.K. Optical biopsy: Automated classification of airway endoscopic findings using a convolutional neural network. Laryngoscope 2022, 132 (Suppl. S4), S1–S8. [Google Scholar] [CrossRef]
  26. Bur, A.M.; Zhang, T.; Chen, X.; Kavookjian, H.; Kraft, S.; Karadaghy, O.; Farrokhian, N.; Mussatto, C.; Penn, J.; Wang, G. Interpretable computer vision to detect and classify structural laryngeal lesions in digital flexible laryngoscopic images. Otolaryngol.–Head Neck Surg. 2023, 169, 1564–1572. [Google Scholar] [CrossRef]
  27. Baldini, C.; Migliorelli, L.; Berardini, D.; Azam, M.A.; Sampieri, C.; Ioppi, A.; Srivastava, R.; Peretti, G.; Mattos, L.S. Improving real-time detection of laryngeal lesions in endoscopic images using a decoupled super-resolution enhanced YOLO. Comput. Methods Programs Biomed. 2025, 260, 108539. [Google Scholar] [CrossRef] [PubMed]
  28. Azam, M.A.; Sampieri, C.; Ioppi, A.; Benzi, P.; Giordano, G.G.; De Vecchi, M.; Campagnari, V.; Li, S.; Guastini, L.; Paderno, A.; et al. Videomics of the upper aero-digestive tract cancer: Deep learning applied to white light and narrow band imaging for automatic segmentation of endoscopic images. Front. Oncol. 2022, 12, 900451. [Google Scholar] [CrossRef]
  29. Azam, M.A.; Sampieri, C.; Ioppi, A.; Africano, S.; Vallin, A.; Mocellin, D.; Fragale, M.; Guastini, L.; Moccia, S.; Piazza, C.; et al. Deep learning applied to white light and narrow band imaging videolaryngoscopy: Toward real-time laryngeal cancer detection. Laryngoscope 2022, 132, 1798–1806. [Google Scholar] [CrossRef]
  30. Zhao, Q.; He, Y.; Wu, Y.; Huang, D.; Wang, Y.; Sun, C.; Ju, J.; Wang, J.; Mahr, J.J. Vocal cord lesions classification based on deep convolutional neural network and transfer learning. Med. Phys. 2022, 49, 432–442. [Google Scholar] [CrossRef]
  31. Moccia, S.; Vanone, G.O.; De Momi, E.; Laborai, A.; Guastini, L.; Peretti, G.; Mattos, L.S. Learning-based classification of informative laryngoscopic frames. Comput. Methods Programs Biomed. 2018, 158, 21–30. [Google Scholar] [CrossRef] [PubMed]
  32. Parker, F.; Brodsky, M.B.; Akst, L.M.; Ali, H. Machine learning in laryngoscopy analysis: A proof of concept observational study for the identification of post-extubation ulcerations and granulomas. Ann. Otol. Rhinol. Laryngol. 2021, 130, 286–291. [Google Scholar] [CrossRef] [PubMed]
  33. Popa, S.L.; Ismaiel, A.; Brata, V.D.; Turtoi, D.C.; Barsan, M.; Czako, Z.; Pop, C.; Muresan, L.; Stanculete, M.F.; Dumitrascu, D.I. Artificial Intelligence and medical specialties: Support or substitution? Med. Pharm. Rep. 2024, 97, 409. [Google Scholar] [CrossRef]
  34. Davenport, T.; Kalakota, R. The potential for artificial intelligence in healthcare. Future Healthc. J. 2019, 6, 94–98. [Google Scholar] [CrossRef] [PubMed]
  35. Hosny, A.; Parmar, C.; Quackenbush, J.; Schwartz, L.H.; Aerts, H.J. Artificial intelligence in radiology. Nat. Rev. Cancer 2018, 18, 500–510. [Google Scholar] [CrossRef]
  36. Shmatko, A.; Ghaffari Laleh, N.; Gerstung, M.; Kather, J.N. Artificial intelligence in histopathology: Enhancing cancer research and clinical oncology. Nat. Cancer 2022, 3, 1026–1038. [Google Scholar] [CrossRef]
Figure 1. PRISMA flow diagram.
Figure 2. (A) Risk of bias and applicability concerns summary [11,12,13,14,15,16,17,18,19,20,21,22,23,24,25]; (B) risk of bias and applicability concerns graphs.
Figure 3. (A) Sensitivity, (B) specificity, (C) SROC, and (D) diagnostic accuracy of the included studies.
Figure 4. (A) Sensitivity, (B) specificity, (C) SROC, and (D) diagnostic accuracy of non-CNN-based models.
Figure 5. (A) Sensitivity, (B) specificity, (C) SROC, and (D) diagnostic accuracy of CNN-based models.
Table 1. Summary of data collected from the included studies.

Study | AI Model | TP | TN | FP | FN
Baldini et al. [11] | INT shallow CNN | 264 | 313 | 45 | 82
Baldini et al. [11] | INT ResNet-50 | 334 | 339 | 19 | 12
Baldini et al. [11] | INT MobileNetv2 | 333 | 339 | 19 | 13
Baldini et al. [11] | EXT ResNet-50 | 272 | 331 | 9 | 43
Yao et al. [12] | CNN | 3277 | 612 | 193 | 356
Ren et al. [13] | CNN | 90 | 393 | 7 | 10
Lee et al. [14] | YOLOv5 | 137 | 392 | 8 | 63
Lee et al. [14] | YOLOv6 | 148 | 390 | 10 | 52
Xu et al. [15] | Densenet201 internal | 230 | 220 | 18 | 21
Xu et al. [15] | Densenet201 external | 222 | 230 | 36 | 36
Xu et al. [15] | Alexnet internal | 214 | 194 | 43 | 37
Xu et al. [15] | Alexnet external | 198 | 201 | 64 | 60
Xu et al. [15] | Inception v3 internal | 220 | 213 | 24 | 31
Xu et al. [15] | Inception v3 external | 224 | 189 | 76 | 34
Xu et al. [15] | Mnasnet internal | 214 | 231 | 7 | 37
Xu et al. [15] | Mnasnet external | 212 | 263 | 2 | 46
Xu et al. [15] | Mobilenet v3 internal | 228 | 132 | 105 | 23
Xu et al. [15] | Mobilenet v3 external | 156 | 212 | 53 | 102
Xu et al. [15] | Resnet152 internal | 216 | 217 | 20 | 35
Xu et al. [15] | Resnet152 external | 188 | 248 | 18 | 70
Xu et al. [15] | Squeezenet1 internal | 222 | 207 | 30 | 29
Xu et al. [15] | Squeezenet1 external | 202 | 212 | 53 | 56
Xu et al. [15] | Vgg19 internal | 235 | 207 | 30 | 16
Xu et al. [15] | Vgg19 external | 224 | 243 | 22 | 34
Xiong et al. [16] | DCNN | 628 | 1815 | 166 | 220
Zhao et al. [17] | RF | 74 | 118 | 8 | 0
Zhao et al. [17] | DV | 58 | 110 | 16 | 16
Zhao et al. [17] | SVM | 70 | 112 | 14 | 4
Wellenstein et al. [18] | YOLOv5s | 69 | 303 | 23 | 28
Wellenstein et al. [18] | YOLOv5m | 74 | 284 | 42 | 23
Wellenstein et al. [18] | YOLOv5sl | 70 | 295 | 37 | 21
Fang et al. [19] | Faster R-CNN | 35 | 213 | 16 | 13
Mamidi et al. [20] | ViT | 127 | 40 | 12 | 3
Kang et al. [21] | ILCDS external | 311 | 87 | 9.47 | 5
Kang et al. [21] | ILCDS internal | 184 | 979 | 23.97 | 43
Wang et al. [22] | LR | 627 | 723 | 187.46 | 411
Wang et al. [22] | SVM | 609 | 740 | 170.17 | 429
Wang et al. [22] | RandomForest | 722 | 623 | 286.65 | 316
Wang et al. [22] | ExtraTrees | 548 | 765 | 144.69 | 490
Wang et al. [22] | XGBoost | 733 | 621 | 289.38 | 305
Wang et al. [22] | LightGBM | 723 | 627 | 283.01 | 315
Wang et al. [22] | MLP | 634 | 723 | 187.46 | 404
Esmaeili et al. [23] | DenseNet121 | 563 | 1383 | 152 | 123
Esmaeili et al. [23] | EfficientNetB0V2 | 564 | 1386 | 149 | 122
Esmaeili et al. [23] | ResNet50V2 | 581 | 1434 | 101 | 105
Esmaeili et al. [23] | Ensemble model | 602 | 1461 | 74 | 84
Yan et al. [24] | R-CNNs | 66 | 503 | 137 | 23
Dunham et al. [25] | CNN | 46 | 47 | 4 | 3

TP: true positive, TN: true negative, FP: false positive, FN: false negative.

