To Self-Treat or Not to Self-Treat: Evaluating the Diagnostic, Advisory and Referral Effectiveness of ChatGPT Responses to the Most Common Musculoskeletal Disorders
Abstract
1. Introduction
2. Materials and Methods
3. Results
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
AI | Artificial Intelligence |
AoI | Area of Interest |
ChatGPT | Chat Generative Pre-trained Transformer |
Deg | Degenerative |
F&A | Foot and Ankle |
FKGL | Flesch–Kincaid Grade Level |
FRES | Flesch Reading Ease Score |
Inf | Infective |
Ped | Pediatric |
PRP | Platelet-Rich Plasma |
SPSS | Statistical Package for the Social Sciences |
Tra | Trauma |
Up | Upper Extremity |
AoI | Anat | Question |
---|---|---|
Ped | Knee | My 12-year-old son has been getting more and more pain in the front of his knee when he’s been playing sports and in the evenings. He’s also got a bit of a swelling going on. Any ideas what I should do? |
Ped | F&A | My 3-year-old daughter’s feet are pressing inwards. What should I do? |
Ped | Hip | My grandson was just born. I’ve heard about something called a hip ultrasound. Do you know what this is? And when should I have it? |
Ped | Spine | My 17-year-old daughter has a curve in her back. What should I do? |
Ped | Hip | My 6-year-old child walks with a limp. Do you have any advice? |
Sport | Up | My 48-year-old mum has really bad pain in her right shoulder when she gets plates from the top shelves and washes her hair in the shower. Any ideas what she can do? |
Sport | Up | I’m a 45-year-old housewife and the outer side of my elbow really hurts when I squeeze a cloth or open a jar lid at home. What should I do? |
Sport | Knee | My mate at work told me that his knee swelled up after he hurt it playing football a month ago. He said it felt like it was rotating and he felt a gap in it. What should he do? |
Sport | F&A | During the same match, another friend heard a loud pop behind his ankle, which we all heard. But the man can walk. Any ideas what it is? |
Sport | Knee | I’m 35 and had meniscus surgery last week. Do you think I should get physical therapy? |
Sport | Hip | My 33-year-old cousin played football when he was younger. He’s been in pain in his left groin for the last three years. Any ideas? |
Sport | Spine | I am 25 years old and my back hurts. What should I do? |
Deg | Spine | I am 75 years old and my back hurts. What should I do? |
Deg | F&A | I’m a 55-year-old woman and I’ve had pain in my heel every morning for the last month. It eases up during the day, but kicks in again in the morning. Any ideas what I should do? |
Deg | Knee | My dad is 65 and only has pain in his knees when he’s going up and down stairs, but he’s fine when he’s walking on flat surfaces. His knee doesn’t lock, but he does have a bit of pain when going up and down stairs. Any ideas what I should do? |
Deg | Knee | I’m 75, and my knees hurt a lot when I bend over and stand up. I can’t even walk 100 m. Any ideas what I can do? |
Deg | F&A | My 48-year-old wife has a deformity of the big toe. Do you know what we’re supposed to do? |
Deg | Spine | I’m a 40-year-old office worker and I’ve had numbness in my left hand and neck for about a month. Any ideas what I should do? |
Inf | Knee | I’m 26 and got back from Thailand last week. My right knee’s all swollen, warm and really painful. Any ideas what I should do? |
Inf | Knee | My 60-year-old mum had PRP last week, and now the knee where they did it is swollen and she can’t move it much. Do you know what is it? |
Inf | Hip | My 6-year-old daughter had the flu last week, and this week she’s got a bit of pain in her hip and is limping a bit. What should I do? |
Inf | F&A | My dad, who’s 75, has had a black, stinky discharge on his foot for about three months. He also can’t feel some parts of his foot. Any ideas what we should do? |
Tra | F&A | I broke my tibia bone and had to have a nail put in it, and it’s been six months but I’m still in pain. Do you know what I can do? |
Tra | F&A | I broke my tibia bone and had a nail put in it during surgery. It’s been 10 months and I’m still in pain. Any ideas what I can do? |
Tra | Up | My 80-year-old grandmother took a tumble at home a week ago, and her left wrist is still all swollen and sore. What we can do? |
Tra | Hip | I’m 80 and had surgery after a hip fracture six months ago. I start limping when I get tired while walking. Any idea why? |

Intra-Rater Reliability

Rater | Measure | First Evaluation | Second Evaluation | Kappa | Level of Agreement |
---|---|---|---|---|---|
First Author | Diagnosis | 4.77 ± 0.587 | 4.81 ± 0.491 | 0.859 | Strong |
First Author | Recommendation | 4.54 ± 0.859 | 4.58 ± 0.809 | 0.912 | Almost Perfect |
First Author | Referral | 4.88 ± 0.431 | 4.88 ± 0.431 | 1.000 | Perfect |
Second Author | Diagnosis | 4.88 ± 0.326 | 4.85 ± 0.464 | 0.816 | Strong |
Second Author | Recommendation | 4.62 ± 0.804 | 4.65 ± 0.797 | 0.894 | Strong |
Second Author | Referral | 4.96 ± 0.196 | 4.96 ± 0.196 | 1.000 | Perfect |
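
The agreement labels in the table above appear consistent with a banded reading of kappa in which values of 0.80 to 0.90 are "Strong" and values above 0.90 are "Almost Perfect". The sketch below encodes that reading and checks it against the six reported rows. The cut-offs, the lower band names, and the function name `agreement_level` are assumptions for illustration, since the scale the authors used is not stated in this section.

```python
def agreement_level(kappa: float) -> str:
    """Return a verbal label for a kappa coefficient (assumed cut-offs)."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie in [-1, 1]")
    if kappa == 1.0:
        return "Perfect"  # label the table uses for kappa = 1.000
    if kappa > 0.90:
        return "Almost Perfect"
    if kappa >= 0.80:
        return "Strong"
    # The bands below are assumed; the table contains no values this low.
    if kappa >= 0.60:
        return "Moderate"
    if kappa >= 0.40:
        return "Weak"
    if kappa > 0.20:
        return "Minimal"
    return "None"


# Kappa values and labels exactly as printed in the table above.
reported = {
    ("First Author", "Diagnosis"): (0.859, "Strong"),
    ("First Author", "Recommendation"): (0.912, "Almost Perfect"),
    ("First Author", "Referral"): (1.000, "Perfect"),
    ("Second Author", "Diagnosis"): (0.816, "Strong"),
    ("Second Author", "Recommendation"): (0.894, "Strong"),
    ("Second Author", "Referral"): (1.000, "Perfect"),
}

# Collect any row whose printed label disagrees with the assumed bands.
mismatches = [
    key for key, (kappa, label) in reported.items()
    if agreement_level(kappa) != label
]
print("rows inconsistent with the assumed bands:", mismatches)
```

Under these assumed cut-offs every reported label is reproduced, which suggests, but does not confirm, that a scale of this kind was applied.
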

Group | Subgroup | First Author: Diag | First Author: Recom | First Author: Referral | Second Author: Diag | Second Author: Recom | Second Author: Referral |
---|---|---|---|---|---|---|---|
Overall Score | | 4.77 ± 0.59 (3–5) | 4.54 ± 0.86 (2–5) | 4.88 ± 0.43 (3–5) | 4.88 ± 0.33 (4–5) | 4.62 ± 0.8 (2–5) | 4.96 ± 0.2 (4–5) |
AoI | Ped (n = 5) | 4.8 ± 0.45 (4–5) | 4.2 ± 0.84 (3–5) | 5 (5) | 5 (5) | 4.4 ± 0.89 (3–5) | 5 (5) |
AoI | Sport (n = 7) | 5 (5) | 4.71 ± 0.77 (3–5) | 4.57 ± 0.79 (3–5) | 5 (5) | 4.86 ± 0.38 (4–5) | 5 (5) |
AoI | Deg (n = 6) | 5 (5) | 4 ± 1.27 (2–5) | 5 (5) | 5 (5) | 4 ± 1.27 (2–5) | 4.83 ± 0.41 (4–5) |
AoI | Inf (n = 4) | 5 (5) | 5 (5) | 5 (5) | 5 (5) | 5 (5) | 5 (5) |
AoI | Tra (n = 4) | 3.75 ± 0.96 (3–5) | 5 (5) | 5 (5) | 4.25 ± 0.5 (4–5) | 5 (5) | 5 (5) |
AoI | p | 0.007 | 0.136 | 0.227 | 0.001 | 0.189 | 0.504 |
Anat | Up (n = 3) | 5 (5) | 4.33 ± 1.16 (3–5) | 4.33 ± 1.16 (3–5) | 5 (5) | 4.67 ± 0.58 (4–5) | 5 (5) |
Anat | Spine (n = 4) | 5 (5) | 4.75 ± 0.5 (4–5) | 5 (5) | 5 (5) | 5 (5) | 5 (5) |
Anat | Hip (n = 5) | 4.6 ± 0.89 (3–5) | 4.8 ± 0.45 (4–5) | 5 (5) | 4.8 ± 0.48 (4–5) | 4.6 ± 0.89 (3–5) | 5 (5) |
Anat | Knee (n = 7) | 5 (5) | 4.43 ± 1.13 (2–5) | 5 (5) | 5 (5) | 4.43 ± 1.13 (2–5) | 4.86 ± 0.38 (4–5) |
Anat | F&A (n = 7) | 4.43 ± 0.79 (3–5) | 4.43 ± 0.98 (3–5) | 5 (5) | 4.71 ± 0.49 (4–5) | 4.57 ± 0.79 (3–5) | 5 (5) |
Anat | p | 0.189 | 0.974 | 0.260 | 0.405 | 0.834 | 0.607 |

Rater | AoI | Pediatric | Sport | Degenerative | Infective | Trauma |
---|---|---|---|---|---|---|
First Author | Pediatric | N/A | 0.237 | 0.273 | 0.371 | 0.078 |
First Author | Sport | 0.237 | N/A | 1.000 | 1.000 | 0.011 |
First Author | Degenerative | 0.273 | 1.000 | N/A | 1.000 | 0.018 |
First Author | Infective | 0.371 | 1.000 | 1.000 | N/A | 0.046 |
First Author | Trauma | 0.078 | 0.011 | 0.018 | 0.046 | N/A |
Second Author | Pediatric | N/A | 1.000 | 1.000 | 1.000 | 0.025 |
Second Author | Sport | 1.000 | N/A | 1.000 | 1.000 | 0.010 |
Second Author | Degenerative | 1.000 | 1.000 | N/A | 1.000 | 0.016 |
Second Author | Infective | 1.000 | 1.000 | 1.000 | N/A | 0.040 |
Second Author | Trauma | 0.025 | 0.010 | 0.016 | 0.040 | N/A |

Group | Subgroup | Flesch–Kincaid Grade Level | Flesch Reading Ease Score | Reading Grade Level |
---|---|---|---|---|
Overall Score | | 7.8 ± 1.267 (5.8–10.9) | 52.68 ± 8.6 (30.2–65) | High School |
AoI | Ped | 8.26 ± 1.22 (7–10) | 50.96 ± 5.75 (43.2–56.4) | High School |
AoI | Sport | 7.93 ± 1.81 (5.8–10.9) | 51 ± 12.84 (30.2–65) | High School |
AoI | Deg | 7.35 ± 0.9 (5.9–8.6) | 56.4 ± 6.11 (46.9–64.6) | High School |
AoI | Inf | 8.23 ± 0.73 (7.2–8.9) | 49.13 ± 4.49 (46.1–55.8) | College |
AoI | Tra | 7.28 ± 1.25 (6.1–8.5) | 55.73 ± 9.56 (47–64.2) | High School |
AoI | p | 0.610 | 0.596 | N/A |
Anat | Up | 7.3 ± 1.25 (6.3–8.7) | 55.73 ± 8.59 (46.7–63.8) | High School |
Anat | Spine | 6.83 ± 1.26 (5.8–8.5) | 58.2 ± 8.5 (47–65) | High School |
Anat | Hip | 7.96 ± 1.87 (6.1–10.9) | 51.76 ± 12.82 (30.2–64.2) | High School |
Anat | Knee | 8.51 ± 0.95 (7.4–10) | 48.63 ± 6.27 (39.6–55.3) | College |
Anat | F&A | 7.76 ± 0.87 (6.3–8.6) | 52.91 ± 7.32 (46.9–63.8) | High School |
Anat | p | 0.202 | 0.223 | N/A |
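
For readers unfamiliar with the two readability metrics in the table above, the following is a minimal sketch of the standard Flesch Reading Ease and Flesch–Kincaid Grade Level formulas, together with one plausible mapping from FRES to a reading level. The band boundaries in `FRES_BANDS` and all function names are assumptions for illustration. The section does not say which tool or cut-offs the authors used, although the "High School" and "College" labels in the table are consistent with the conventional 50–60 and 30–50 bands.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula (higher means easier to read)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula (approximate US grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Conventional FRES bands as (lower bound, label); assumed, not from the source.
FRES_BANDS = [
    (90.0, "5th Grade"),
    (80.0, "6th Grade"),
    (70.0, "7th Grade"),
    (60.0, "8th-9th Grade"),
    (50.0, "High School"),
    (30.0, "College"),
]


def reading_level(fres: float) -> str:
    """Map a FRES value to a reading-level label using the assumed bands."""
    for lower, label in FRES_BANDS:
        if fres >= lower:
            return label
    return "College Graduate"


# Hypothetical text statistics: 100 words, 5 sentences, 150 syllables.
example_fres = flesch_reading_ease(100, 5, 150)
example_fkgl = flesch_kincaid_grade(100, 5, 150)
print(round(example_fres, 2), round(example_fkgl, 2), reading_level(example_fres))

# Mean FRES values from the table above, mapped with the assumed bands.
for name, fres in [("Overall", 52.68), ("Inf", 49.13), ("Knee", 48.63)]:
    print(name, reading_level(fres))
```

The functions take pre-computed word, sentence, and syllable counts, because syllable counting is tool-dependent and the counting method behind the reported scores is not given here.
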
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Arzu, U.; Gencer, B. To Self-Treat or Not to Self-Treat: Evaluating the Diagnostic, Advisory and Referral Effectiveness of ChatGPT Responses to the Most Common Musculoskeletal Disorders. Diagnostics 2025, 15, 1834. https://doi.org/10.3390/diagnostics15141834