Artificial Intelligence in Medical Diagnosis and Prognosis

Topical Collection Information

Dear Colleagues,

This is a collection of high-quality open access papers by Editorial Board Members or by authors invited by the Editorial Office. This Topical Collection aims to publish high-quality articles in the field of artificial intelligence in medical diagnosis and prognosis. Papers should be long research or review articles presenting full and detailed accounts of the authors' own work to date. Please note that selected full papers will still be subject to thorough and rigorous peer review. All papers will be published on an ongoing basis.

Prof. Dr. Tim Duong
Collection Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the collection website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Diagnostics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence
  • diagnosis
  • prognosis
  • biomedical imaging
  • radiology

Published Papers (2 papers)

2024

16 pages, 1101 KiB  
Article
Development and Evaluation of a GPT4-Based Orofacial Pain Clinical Decision Support System
by Charlotte Vueghs, Hamid Shakeri, Tara Renton and Frederic Van der Cruyssen
Diagnostics 2024, 14(24), 2835; https://doi.org/10.3390/diagnostics14242835 - 17 Dec 2024
Cited by 1 | Viewed by 872
Abstract
Background: Orofacial pain (OFP) encompasses a complex array of conditions affecting the face, mouth, and jaws, often leading to significant diagnostic challenges and high rates of misdiagnosis. Artificial intelligence, particularly large language models like GPT4 (OpenAI, San Francisco, CA, USA), offers potential as a diagnostic aid in healthcare settings. Objective: To evaluate the diagnostic accuracy of GPT4 in OFP cases as a clinical decision support system (CDSS) and compare its performance against treating clinicians, expert evaluators, medical students, and general practitioners. Methods: A total of 100 anonymized patient case descriptions involving diverse OFP conditions were collected. GPT4 was prompted to generate primary and differential diagnoses for each case using the International Classification of Orofacial Pain (ICOP) criteria. Diagnoses were compared to gold-standard diagnoses established by treating clinicians, and a scoring system was used to assess accuracy at three hierarchical ICOP levels. A subset of 24 cases was also evaluated by two clinical experts, two final-year medical students, and two general practitioners for comparative analysis. Diagnostic performance and interrater reliability were calculated. Results: GPT4 achieved the highest accuracy level (ICOP level 3) in 38% of cases, with an overall diagnostic performance score of 157 out of 300 points (52%). The model provided accurate differential diagnoses in 80% of cases (400 out of 500 points). In the subset of 24 cases, the model’s performance was comparable to non-expert human evaluators but was surpassed by clinical experts, who correctly diagnosed 54% of cases at level 3. GPT4 demonstrated high accuracy in specific categories, correctly diagnosing 81% of trigeminal neuralgia cases at level 3. Interrater reliability between GPT4 and human evaluators was low (κ = 0.219, p < 0.001), indicating variability in diagnostic agreement. 
Conclusions: GPT4 shows promise as a CDSS for OFP by improving diagnostic accuracy and offering structured differential diagnoses. While not yet outperforming expert clinicians, GPT4 can augment diagnostic workflows, particularly in primary care or educational settings. Effective integration into clinical practice requires adherence to rigorous guidelines, thorough validation, and ongoing professional oversight to ensure patient safety and diagnostic reliability.
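The interrater reliability reported in the abstract (κ = 0.219) is Cohen's kappa, which corrects raw agreement between two raters for agreement expected by chance. A minimal sketch of the computation, using hypothetical binary labels (1 = correct at ICOP level 3, 0 = not) rather than the study's actual data:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: fraction of items both raters labelled identically.
    po = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's label marginals, summed over labels.
    ca, cb = Counter(rater_a), Counter(rater_b)
    pe = sum(ca[k] * cb[k] for k in ca.keys() | cb.keys()) / (n * n)
    return (po - pe) / (1 - pe)

# Hypothetical case-level labels for GPT4 and a human evaluator.
gpt4   = [1, 1, 0, 1, 0, 0, 1, 0]
expert = [1, 0, 0, 1, 1, 0, 1, 0]
print(cohens_kappa(gpt4, expert))  # 0.5
```

A kappa near 0.2, as reported, indicates agreement only slightly above chance despite moderate raw agreement.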

16 pages, 1344 KiB  
Article
Evaluating Large Language Model (LLM) Performance on Established Breast Classification Systems
by Syed Ali Haider, Sophia M. Pressman, Sahar Borna, Cesar A. Gomez-Cabello, Ajai Sehgal, Bradley C. Leibovich and Antonio Jorge Forte
Diagnostics 2024, 14(14), 1491; https://doi.org/10.3390/diagnostics14141491 - 11 Jul 2024
Cited by 9 | Viewed by 2607
Abstract
Medical researchers are increasingly utilizing advanced LLMs like ChatGPT-4 and Gemini to enhance diagnostic processes in the medical field. This research focuses on their ability to comprehend and apply complex medical classification systems for breast conditions, which can significantly aid plastic surgeons in making informed decisions for diagnosis and treatment, ultimately leading to improved patient outcomes. Fifty clinical scenarios were created to evaluate the classification accuracy of each LLM across five established breast-related classification systems. Scores from 0 to 2 were assigned to LLM responses to denote incorrect, partially correct, or completely correct classifications. Descriptive statistics were employed to compare the performances of ChatGPT-4 and Gemini. Gemini exhibited superior overall performance, achieving 98% accuracy compared to ChatGPT-4’s 71%. While both models performed well in the Baker classification for capsular contracture and UTSW classification for gynecomastia, Gemini consistently outperformed ChatGPT-4 in other systems, such as the Fischer Grade Classification for gender-affirming mastectomy, Kajava Classification for ectopic breast tissue, and Regnault Classification for breast ptosis. With further development, integrating LLMs into plastic surgery practice will likely enhance diagnostic support and decision making.
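The percentage accuracies in this abstract follow directly from the 0–2 scoring: with 50 scenarios at 2 points each, the maximum is 100 points, and percent accuracy is total points over that maximum. A minimal sketch with hypothetical per-scenario scores (not the study's data):

```python
def percent_accuracy(scores, max_per_case=2):
    """Total points earned as a percentage of the maximum possible score."""
    return 100 * sum(scores) / (max_per_case * len(scores))

# Hypothetical scores: 0 = incorrect, 1 = partially correct, 2 = correct.
scores = [2, 2, 1, 0, 2]
print(percent_accuracy(scores))  # 70.0
```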
