Comparison of ChatGPT-5 and DeepSeek V3 for Artificial Intelligence-Assisted Patient Education in Foot and Ankle Disorders
Abstract
1. Introduction
2. Material and Method
2.1. Study Design
2.2. Reliability and Quality Assessment
2.3. Readability Assessment
2.4. Content Quality Assessment
2.5. Statistical Analysis
3. Results
3.1. Interobserver Agreement Assessment
3.2. Quality and Reliability Assessment
3.3. Readability Assessment
3.4. Content Quality Assessment
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest


| Measure | ChatGPT-5 | DeepSeek V3 | p Value |
|---|---|---|---|
| DISCERN 1–8 | 31.3 ± 2.3 | 28.8 ± 4.3 | 0.04 |
| DISCERN 9–15 | 30.9 ± 2.3 | 29.2 ± 3.1 | 0.032 |
| DISCERN 16 | 4.5 ± 0.5 | 3.9 ± 0.9 | 0.013 |
| DISCERN Total | 66.7 ± 2.3 | 62 ± 6.7 | 0.044 |
| PEMAT-P understandability | 82.7 ± 6.8 | 78.9 ± 6.2 | 0.045 |
| PEMAT-P actionability | 81.6 ± 6.2 | 78.5 ± 5.5 | 0.043 |
| PEMAT-P Total | 82.1 ± 5.3 | 78.7 ± 3.2 | 0.015 |
| GQS | 4.46 ± 0.36 | 4.13 ± 0.21 | 0.01 |

| Measure | ChatGPT-5 | DeepSeek V3 | p Value |
|---|---|---|---|
| Word count | 623 ± 73 | 565 ± 84 | 0.021 |
| FKGL | 10.1 ± 1.5 | 8.5 ± 1.1 | 0.022 |
| FRE | 42.9 ± 9.6 | 54.5 ± 9.2 | 0.038 |
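
The FKGL and FRE rows above come from the standard Flesch-Kincaid Grade Level and Flesch Reading Ease formulas, which depend only on average sentence length and average syllables per word. The following is a minimal sketch of how such scores can be computed, not the tool used in the study (which is not stated here). The function names `count_syllables` and `readability` are hypothetical, and the syllable counter is a rough vowel-group heuristic, so results will differ somewhat from dedicated readability calculators.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, discount a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent, except in '-le' endings such as "ankle".
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> dict:
    """Return word count, Flesch Reading Ease, and Flesch-Kincaid Grade Level."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    return {
        "word_count": len(words),
        # Flesch Reading Ease: higher scores mean easier text.
        "fre": 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word,
        # Flesch-Kincaid Grade Level: approximate US school grade.
        "fkgl": 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59,
    }


if __name__ == "__main__":
    sample = "Rest the ankle. Apply ice for twenty minutes. See a doctor if pain persists."
    print(readability(sample))
```

Because FRE rises and FKGL falls as text gets simpler, a response with a higher FRE and a lower FKGL is the easier one to read.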

| Measure | ChatGPT-5 | DeepSeek V3 | p Value |
|---|---|---|---|
| Completeness | 4.56 ± 0.5 | 4.24 ± 0.43 | 0.022 |
| Lack of false information | 5 | 4.76 ± 0.5 | 0.01 |
| Evidence | 3.76 ± 0.43 | 3.48 ± 0.5 | 0.043 |
| Appropriateness | 4.52 ± 0.5 | 4 | 0.001 |
| Relevance | 5 | 4.52 ± 0.5 | 0.001 |
| CLEAR Total | 22.8 ± 1.3 | 21 ± 0.9 | 0.001 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Published by MDPI on behalf of the American Podiatric Medical Association (APMA). Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Karagoz, B.; Bayrak, H.C.; Keçeci, T. Comparison of ChatGPT-5 and DeepSeek V3 for Artificial Intelligence-Assisted Patient Education in Foot and Ankle Disorders. J. Am. Podiatr. Med. Assoc. 2026, 116, 33. https://doi.org/10.3390/japma116030033
Karagoz B, Bayrak HC, Keçeci T. Comparison of ChatGPT-5 and DeepSeek V3 for Artificial Intelligence-Assisted Patient Education in Foot and Ankle Disorders. Journal of the American Podiatric Medical Association. 2026; 116(3):33. https://doi.org/10.3390/japma116030033
Chicago/Turabian Style: Karagoz, Bekir, Hünkar Cagdas Bayrak, and Tolga Keçeci. 2026. "Comparison of ChatGPT-5 and DeepSeek V3 for Artificial Intelligence-Assisted Patient Education in Foot and Ankle Disorders" Journal of the American Podiatric Medical Association 116, no. 3: 33. https://doi.org/10.3390/japma116030033
APA Style: Karagoz, B., Bayrak, H. C., & Keçeci, T. (2026). Comparison of ChatGPT-5 and DeepSeek V3 for Artificial Intelligence-Assisted Patient Education in Foot and Ankle Disorders. Journal of the American Podiatric Medical Association, 116(3), 33. https://doi.org/10.3390/japma116030033

