ChatGPT in Oral Pathology: Bright Promise or Diagnostic Mirage
Abstract
1. Introduction
2. Materials and Methods
2.1. Ethical Considerations
2.2. Identification of Clinical Images
2.3. Generation of Responses with ChatGPT
2.4. Establishment of the Gold Standard
2.5. Data Analysis
3. Results
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| OSCC | oral squamous cell carcinoma |
| OL | oral leukoplakia |
| OLP | oral lichen planus |
| AI | artificial intelligence |
| LLMs | large language models |
| OPMDs | oral potentially malignant disorders |
| PPV | positive predictive value |
| NPV | negative predictive value |
| AUC | area under the ROC curve |
| Condition | Reference Diagnosis | ChatGPT Diagnosis: Present | ChatGPT Diagnosis: Absent |
|---|---|---|---|
| OSCC A | Present | 39 | 21 |
| | Absent | 6 | 204 |
| | Total | 45 | 225 |
| OL B | Present | 54 | 36 |
| | Absent | 0 | 30 |
| | Total | 54 | 66 |
| OLP C | Present | 30 | 90 |
| | Absent | 7 | 173 |
| | Total | 37 | 263 |
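
The counts in the contingency table above are enough to recompute the standard diagnostic accuracy measures. The following is a minimal sketch, not the authors' analysis code: the names `Counts`, `TABLE`, and `diagnostic_metrics` are hypothetical, and the `single_point_auc` entry assumes the ROC area is taken at a single binary operating point, where it equals the mean of sensitivity and specificity. This excerpt does not state how the ROC area was actually obtained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """2x2 counts with the reference diagnosis as ground truth."""
    tp: int  # reference present, ChatGPT present
    fn: int  # reference present, ChatGPT absent
    fp: int  # reference absent, ChatGPT present
    tn: int  # reference absent, ChatGPT absent


# Counts copied from the contingency table above.
TABLE = {
    "OSCC": Counts(tp=39, fn=21, fp=6, tn=204),
    "OL": Counts(tp=54, fn=36, fp=0, tn=30),
    "OLP": Counts(tp=30, fn=90, fp=7, tn=173),
}


def diagnostic_metrics(c: Counts) -> dict[str, float]:
    """Return proportions in [0, 1]; assumes no row or column total is zero."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "accuracy": (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn),
        # Assumption: area under a ROC curve with one operating point.
        "single_point_auc": (sensitivity + specificity) / 2,
    }


if __name__ == "__main__":
    for name, counts in TABLE.items():
        row = diagnostic_metrics(counts)
        print(name, {k: f"{100 * v:.2f}%" for k, v in row.items()})
```

For OSCC this gives a sensitivity of 39/60 = 65.00% and a specificity of 204/210 = 97.14%, and the mean of the two, 81.07%, coincides with the ROC area reported for OSCC.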

| Metric | OSCC | 95% CI | OL | 95% CI | OLP | 95% CI |
|---|---|---|---|---|---|---|
| Sensitivity | 65.00% | 51.60%–76.87% | 60.00% | 49.13%–70.19% | 25.00% | 17.55%–33.73% |
| Specificity | 97.14% | 93.89%–98.94% | 100.00% | 88.43%–100.00% | 96.11% | 92.15%–98.42% |
| Positive predictive value | 86.67% | 73.21%–94.95% | 100.00% | 93.40%–100.00% | 81.08% | 64.84%–92.04% |
| Negative predictive value | 90.67% | 86.09%–94.13% | 45.45% | 33.14%–58.19% | 65.78% | 59.70%–71.50% |
| False positive rate | 2.86% | 1.06%–6.11% | 0.00% | 0.00%–11.57% | 3.89% | 1.58%–7.85% |
| False negative rate | 35.00% | 23.13%–48.40% | 40.00% | 29.81%–50.87% | 75.00% | 66.27%–82.45% |
| Correctly classified | 90.00% | 85.78%–93.31% | 70.00% | 60.96%–78.02% | 67.67% | 62.05%–72.93% |
| ROC area | 81.07% | 74.88%–87.26% | 80.00% | 74.91%–85.09% | 60.56% | 56.42%–64.70% |
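
This excerpt does not name the method used for the 95% confidence intervals of the proportions. As a sketch under the assumption that exact binomial (Clopper-Pearson) intervals were used, the code below finds each bound by bisection on the binomial distribution, using only the standard library. The function names `binom_cdf` and `exact_ci` are hypothetical. The assumption has some support: when all n cases are classified correctly, the exact lower bound has the closed form (α/2)^(1/n), and for the OL specificity of 30/30 this gives 0.025^(1/30) ≈ 88.43%, the value shown in the table.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_of_increasing(g, iterations: int = 100) -> float:
    """Bisection on [0, 1] for a function that increases from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Clopper-Pearson interval for k successes out of n trials."""
    if not 0 <= k <= n or n == 0:
        raise ValueError("need n > 0 and 0 <= k <= n")
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _root_of_increasing(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _root_of_increasing(
        lambda p: alpha / 2 - binom_cdf(k, n, p)
    )
    return lower, upper


if __name__ == "__main__":
    # OL specificity: 30 of 30 reference-negative images classified as absent.
    low, high = exact_ci(30, 30)
    print(f"{100 * low:.2f}% to {100 * high:.2f}%")
```

The same call with the complementary count, `exact_ci(0, 30)`, gives an upper bound of 1 − 0.025^(1/30) ≈ 11.57%, which matches the upper limit listed for the OL false positive rate. The ROC area intervals are a different kind of estimate and are not covered by this sketch.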
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Published by MDPI on behalf of the Lithuanian University of Health Sciences. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Suárez, A.; Freire, Y.; Díaz-Flores García, V.; Santamaría Laorden, A.; Orejas Pérez, J.; Suárez Ajuria, M.; Algar, J.; Martín Carreras-Presas, C. ChatGPT in Oral Pathology: Bright Promise or Diagnostic Mirage. Medicina 2025, 61, 1744. https://doi.org/10.3390/medicina61101744

