This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

Flexing ChatGPT-4o’s Diagnostic Muscle: Detection of Fractures in the Ossifying Pediatric Elbow on Radiographs

by Jonathan Kia-Sheng Phua and Timothy Shao Ern Tan *
Department of Diagnostic and Interventional Imaging, KK Women’s and Children’s Hospital, 100 Bukit Timah Road, Singapore 229899, Singapore
* Author to whom correspondence should be addressed.
Diagnostics 2025, 15(22), 2882; https://doi.org/10.3390/diagnostics15222882 (registering DOI)
Submission received: 25 September 2025 / Revised: 2 November 2025 / Accepted: 12 November 2025 / Published: 13 November 2025
(This article belongs to the Special Issue Applications of Artificial Intelligence in Orthopedics)

Abstract

Background/Objectives: Elbow fractures are the most common injuries in children and are frequently evaluated with plain radiographs in the acute setting. Because dedicated pediatric radiology services are not widely available, fracture diagnosis can be delayed. Since 2023, ChatGPT-4 has offered image analysis capabilities, whose potential for radiographic analysis remains untapped. This study is the first evaluation of ChatGPT-4o, a multimodal large language model, in interpreting pediatric elbow radiographs for fracture detection, and it demonstrates the model's potential as a generalist AI tool distinct from domain-specific pediatric models. Methods: In this case–control study, a curated set of 200 pediatric elbow radiographs (100 normal and 100 abnormal with at least one fracture site; 105 right elbow and 95 left elbow) acquired between October 2023 and March 2024 at a tertiary pediatric hospital was analyzed. Each anonymized radiograph was evaluated by ChatGPT-4o via a standardized prompt. ChatGPT-4o’s prediction outputs (fracture vs. no fracture) were then compared against verified radiology reports (ground truth). Sensitivity, specificity, accuracy, positive predictive value (PPV), negative predictive value (NPV), and F1 score were calculated. Results: ChatGPT-4o achieved an overall accuracy of 85% in detecting elbow fractures on pediatric radiographs, with a sensitivity of 87% and a specificity of 82%. PPV and NPV were 83% and 86%, respectively. The F1 score was 0.85. ChatGPT-4o correctly identified the fracture site in 68 (78%) of the 87 studies in which it had correctly detected a fracture. Cohen’s kappa coefficient was 0.69, indicating substantial agreement with the ground-truth diagnoses.
Conclusions: This study highlights the potential of ChatGPT-4o as a point-of-care tool to aid the detection of pediatric elbow fractures in emergency settings, particularly where specialist access is limited.
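The reported metrics can be cross-checked with a short calculation. The sketch below is illustrative only and is not the authors' analysis code. It assumes a 2×2 confusion matrix inferred from the abstract rather than stated in it: with 100 fracture and 100 normal radiographs, a sensitivity of 87% and a specificity of 82% imply 87 true positives, 13 false negatives, 82 true negatives, and 18 false positives. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard binary diagnostic metrics from confusion-matrix counts."""
    n = tp + fn + tn + fp
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    f1 = 2 * tp / (2 * tp + fp + fn)

    # Cohen's kappa: observed agreement corrected for chance agreement,
    # where chance agreement comes from the row and column marginals.
    p_observed = accuracy
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
        "kappa": kappa,
    }


# Counts inferred from the abstract (an assumption, see the text above).
metrics = diagnostic_metrics(tp=87, fn=13, tn=82, fp=18)

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Under these inferred counts, the computed values round to the figures in the abstract: accuracy 0.845 (85%), PPV 0.829 (83%), NPV 0.863 (86%), F1 0.849 (0.85), and kappa 0.690 (0.69). Because the design is balanced (100 cases, 100 controls), PPV and NPV reflect a 50% fracture prevalence and would differ in a population with a different prevalence.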
Keywords: elbow fractures; pediatric elbow fractures; supracondylar fractures; ChatGPT; ChatGPT-4o; artificial intelligence; large language model; emergency department; acute radiology; pediatric radiology
