Enhancing Patient Understanding of Perianal Fistula MRI Findings Using ChatGPT: A Randomized, Single Centre Study
Abstract
1. Introduction
2. Materials and Methods
2.1. Patient and Public Involvement (PPI)
2.2. Study Design and Overview
- Prompt engineering and pilot testing (results published in the Journal of Imaging [37])
- Clinical evaluation of LLM outputs
- Patient evaluation: Randomised patient study assessing readability, comprehensibility, perceived utility, trustworthiness and follow-up questions.
2.3. Setting and Participants
2.4. Phase 1: Prompt Engineering & Pilot Phase
Sample Size Calculation
2.5. Phase 2: Clinical Evaluation of AI Summaries
- Fidelity to original report:
  - Yes
  - No, but not clinically significant
  - No, clinically significant (hallucination)
- Quantitative ratings (1–5):
  - Overall impression
  - Strength of recommendations
  - Confidence in report
- Hallucination detection:
  - Presence and severity on the harm scale [38]. Hallucinations were classified according to the WHO/NHS harm-severity framework, using the International Classification for Patient Safety categories (none, mild, moderate, severe, death) to assess potential patient impact.
2.6. Phase 3: Randomised Patient Evaluation
- Readability
- Comprehensibility
- Perceived utility
- Likelihood of follow-up questions
- Trustworthiness
2.7. Data Analysis
2.7.1. Quantitative Analysis
2.7.2. Qualitative Analysis
2.8. Ethical Approval and Data Availability
3. Results
3.1. Phase 1: Prompt Engineering & Pilot Phase
3.2. Phase 2: Clinician Evaluation of 250 Consecutive MRI Fistula Reports
3.3. Phase 3: Patient Evaluation of AI-Generated Summaries
3.4. Patient Evaluation of Original Reports vs. AI-Generated Patient Summaries
4. Discussion
4.1. General Findings
4.2. Study Limitations
4.3. Clinical Implications and Future Directions
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Chung, E.M.; Zhang, S.C.; Nguyen, A.T.; Atkins, K.M.; Sandler, H.M.; Kamrava, M. Feasibility and Acceptability of ChatGPT Generated Radiology Report Summaries for Cancer Patients. Digit. Health 2023, 9, 20552076231221620.
2. Kuckelman, I.J.; Wetley, K.; Yi, P.H.; Ross, A.B. Translating Musculoskeletal Radiology Reports into Patient-Friendly Summaries Using ChatGPT-4. Skelet. Radiol. 2024, 53, 1621–1624.
3. Jeblick, K.; Schachtner, B.; Dexl, J.; Mittermeier, A.; Stüber, A.T.; Topalis, J.; Weber, T.; Wesp, P.; Sabel, B.O.; Ricke, J.; et al. ChatGPT Makes Medicine Easy to Swallow: An Exploratory Case Study on Simplified Radiology Reports. Eur. Radiol. 2024, 34, 2817–2825.
4. Amin, K.S.; Davis, M.A.; Doshi, R.; Haims, A.H.; Khosla, P.; Forman, H.P. Accuracy of ChatGPT, Google Bard, and Microsoft Bing for Simplifying Radiology Reports. Radiology 2023, 309, e232561.
5. Gencer, A. Readability Analysis of ChatGPT’s Responses on Lung Cancer. Sci. Rep. 2024, 14, 17234.
6. Stephan, D.; Bertsch, A.S.; Schumacher, S.; Puladi, B.; Burwinkel, M.; Al-Nawas, B.; Kämmerer, P.W.; Thiem, D.G. Improving Patient Communication by Simplifying AI-Generated Dental Radiology Reports With ChatGPT: Comparative Study. J. Med. Internet Res. 2025, 27, e73337.
7. Shieh, A.; Tran, B.; He, G.; Kumar, M.; Freed, J.A.; Majety, P. Assessing ChatGPT 4.0’s Test Performance and Clinical Diagnostic Accuracy on USMLE STEP 2 CK and Clinical Case Reports. Sci. Rep. 2024, 14, 9330.
8. Spinelli, A.; Yanai, H.; Girardi, P.; Milicevic, S.; Carvello, M.; Maroli, A.; Avedano, L. The Impact of Crohn’s Perianal Fistula on Quality of Life: Results of an International Patient Survey. Crohns Colitis 360 2023, 5, otad036.
9. Maaser, C.; Sturm, A.; Vavricka, S.R.; Kucharzik, T.; Fiorino, G.; Annese, V.; Calabrese, E.; Baumgart, D.C.; Bettenworth, D.; Borralho Nunes, P.; et al. ECCO-ESGAR Guideline for Diagnostic Assessment in IBD Part 1: Initial Diagnosis, Monitoring of Known IBD, Detection of Complications. J. Crohn’s Colitis 2019, 13, 144–164K.
10. Kucharzik, T.; Taylor, S.; Allocca, M.; Burisch, J.; Ellul, P.; Iacucci, M.; Maaser, C.; Baldin, P.; Bhatnagar, G.; Ben-Horin, S. ECCO-ESGAR-ESP-IBUS Guideline on Diagnostics and Monitoring of Patients with Inflammatory Bowel Disease: Part 1: Initial diagnosis, monitoring of known inflammatory bowel disease, detection of complications. J. Crohn’s Colitis 2025, 19, jjaf106.
11. Barnes, E.L.; Long, M.D.; Kappelman, M.D.; Martin, C.F.; Sandler, R.S. High Patient Activation Is Associated With Remission in Patients With Inflammatory Bowel Disease. Inflamm. Bowel Dis. 2019, 25, 1248–1254.
12. Barello, S.; Guida, E.; Leone, S.; Previtali, E.; Graffigna, G. Does Patient Engagement Affect IBD Patients’ Health-Related Quality of Life? Findings from a Cross-Sectional Study among People with Inflammatory Bowel Diseases. Health Qual. Life Outcomes 2021, 19, 77.
13. Anand, E.; Devi, J.; Antoniou, A.; Joshi, S.; Stoker, J.; Lung, P.; Hart, A.; Tozer, P.; Ballard, D.H.; Deepak, P. Patient’s Attitudes to Magnetic Resonance Imaging in Perianal Fistulising Crohn’s Disease: A Global Survey. Crohn’s Colitis 360 2025, 7, otaf015.
14. Choudhry, A.J.; Younis, M.; Ray-Zack, M.D.; Glasgow, A.E.; Haddad, N.N.; Habermann, E.B.; Jenkins, D.H.; Heller, S.F.; Schiller, H.J.; Zielinski, M.D. Enhanced Readability of Discharge Summaries Decreases Provider Telephone Calls and Patient Readmissions in the Posthospital Setting. Surgery 2019, 165, 789–794.
15. Jindal, P.; MacDermid, J.C. Assessing Reading Levels of Health Information: Uses and Limitations of Flesch Formula. Educ. Health 2017, 30, 84–88.
16. Woodcock, C.; Mittelstadt, B.; Busbridge, D.; Blank, G. The Impact of Explanations on Layperson Trust in Artificial Intelligence-Driven Symptom Checker Apps: Experimental Study. J. Med. Internet Res. 2021, 23, e29386.
17. Tan, T.C.; Roslan, N.E.B.; Li, J.W.; Zou, X.; Chen, X.; Ratnasari; Santosa, A. Patient Acceptability of Symptom Screening and Patient Education Using a Chatbot for Autoimmune Inflammatory Diseases: Survey Study. JMIR Form. Res. 2023, 7, e49239.
18. Lau, C.W.Y.; Kupiec, K.; Livermore, P. Exploring the Acceptance and Opportunities of Using a Specific Generative AI Chatbot to Assist Parents in Managing Pediatric Rheumatological Chronic Health Conditions: Mixed Methods Study. JMIR Pediatr. Parent. 2025, 8, e70409.
19. Shekar, S.; Pataranutaporn, P.; Sarabu, C.; Cecchi, G.A.; Maes, P. People Overtrust AI-Generated Medical Advice despite Low Accuracy. NEJM AI 2025, 2, Aloa2300015.
20. Bayrak, S.; Yucel, E.; Takci, H. Epilepsy Radiology Reports Classification Using Deep Learning Networks. CMC 2021, 70, 3589–3607.
21. NHS England. Artificial Intelligence (AI) and Machine Learning. Available online: https://www.england.nhs.uk/long-read/artificial-intelligence-ai-and-machine-learning/ (accessed on 3 August 2025).
22. Korom, R.; Kiptinness, S.; Adan, N.; Said, K.; Ithuli, C.; Rotich, O.; Kimani, B.; King’ori, I.; Kamau, S.; Atemba, E.; et al. AI-Based Clinical Decision Support for Primary Care: A Real-World Study. arXiv 2025, arXiv:2507.16947.
23. Embí, P.J.; Rhew, D.C.; Peterson, E.D.; Pencina, M.J. Launching the Trustworthy and Responsible AI Network (TRAIN): A Consortium to Facilitate Safe and Effective AI Adoption. JAMA 2025, 333, 1481–1482.
24. Kinross, J.; Davies, J. The Creation of an AI Taskforce for Colorectal Surgery in the United Kingdom and Ireland. Color. Dis. 2024, 26, 1869–1870.
25. El-Hussuna, A.; Pellino, G.; Soriano, A.; Sahnan, K.; Mookhoek, A.; Sinonque, P.; Allocca, M.; Carter, D.; Ensari, A.; Iacucci, M.; et al. Results of the Ninth Scientific Workshop of the European Crohn’s and Colitis Organisation (ECCO): Artificial Intelligence in IBD Surgery: Opportunities and Limitations. J. Crohn’s Colitis 2025, jjaf135.
26. Mookhoek, A.; Sinonque, P.; Allocca, M.; Carter, D.; Ensari, A.; Iacucci, M.; Kopylov, U.; Verstockt, B.; Baumgart, D.C.; Noor, N.M.; et al. Results of the 9th Scientific Workshop of the European Crohn’s and Colitis Organisation (ECCO): Artificial Intelligence in Endoscopy, Radiology and Histology in IBD Diagnostics. J. Crohn’s Colitis 2025, jjaf133.
27. Rao, A.; Kim, J.; Kamineni, M.; Pang, M.; Lie, W.; Dreyer, K.J.; Succi, M.D. Evaluating GPT as an Adjunct for Radiologic Decision Making: GPT-4 Versus GPT-3.5 in a Breast Imaging Pilot. J. Am. Coll. Radiol. 2023, 20, 990–997.
28. Doshi, R.; Amin, K.S.; Khosla, P.; Bajaj, S.S.; Chheang, S.; Forman, H.P. Quantitative Evaluation of Large Language Models to Streamline Radiology Report Impressions: A Multimodal Retrospective Analysis. Radiology 2024, 310, e231593.
29. Lyu, Q.; Tan, J.; Zapadka, M.E.; Ponnatapura, J.; Niu, C.; Myers, K.J.; Wang, G.; Whitlow, C.T. Translating Radiology Reports into Plain Language Using ChatGPT and GPT-4 with Prompt Learning: Results, Limitations, and Potential. Vis. Comput. Ind. Biomed. Art 2023, 6, 9.
30. Adams, L.C.; Truhn, D.; Busch, F.; Kader, A.; Niehues, S.M.; Makowski, M.R.; Bressem, K.K. Leveraging GPT-4 for Post Hoc Transformation of Free-Text Radiology Reports into Structured Reporting: A Multilingual Feasibility Study. Radiology 2023, 307, e230725.
31. Bobba, P.S.; Sailer, A.; Pruneski, J.A.; Beck, S.; Mozayan, A.; Mozayan, S.; Arango, J.; Cohan, A.; Chheang, S. Natural Language Processing in Radiology: Clinical Applications and Future Directions. Clin. Imaging 2023, 97, 55–61.
32. Li, S. Exploring the Clinical Capabilities and Limitations of ChatGPT: A Cautionary Tale for Medical Applications. Int. J. Surg. 2023, 109, 2865–2867.
33. McCoy, L.G.; Manrai, A.K.; Rodman, A. Large Language Models and the Degradation of the Medical Record. N. Engl. J. Med. 2024, 391, 1561–1564.
34. Elkassem, A.A.; Smith, A.D. Potential Use Cases for ChatGPT in Radiology Reporting. AJR Am. J. Roentgenol. 2023, 221, 373–376.
35. Sun, Z.; Ong, H.; Kennedy, P.; Tang, L.; Chen, S.; Elias, J.; Lucas, E.; Shih, G.; Peng, Y. Evaluating GPT-4 on Impressions Generation in Radiology Reports. Radiology 2023, 307, e231259.
36. Park, J.; Oh, K.; Han, K.; Lee, Y.H. Patient-Centered Radiology Reports with Generative Artificial Intelligence: Adding Value to Radiology Reporting. Sci. Rep. 2024, 14, 13218.
37. Anand, E.; Ghersin, I.; Lingam, G.; Pelly, T.; Singer, D.; Tomlinson, C.; Munro, R.E.; Capstick, R.; Antoniou, A.; Hart, A.L.; et al. AI-Generated Patient-Friendly MRI Fistula Summaries: A Pilot Randomised Study. J. Imaging 2025, 11, 302.
38. Cooper, J.; Williams, H.; Hibbert, P.; Edwards, A.; Butt, A.; Wood, F.; Parry, G.; Smith, P.; Sheikh, A.; Donaldson, L.; et al. Classification of Patient-Safety Incidents in Primary Care. Bull. World Health Organ. 2018, 96, 498–505.
39. Braun, V.; Clarke, V. Using Thematic Analysis in Psychology. Qual. Res. Psychol. 2006, 3, 77–101.
40. Iqbal, N.; Sackitey, C.; Gupta, A.; Tolan, D.; Plumb, A.; Godfrey, E.; Grierson, C.; Williams, A.; Brown, S.; Maxwell-Armstrong, C.; et al. The development of a minimum dataset for MRI reporting of anorectal fistula: A multi-disciplinary, expert consensus process. Eur. Radiol. 2022, 32, 8306–8316.
41. Hughes, S.; Bae, M.; Li, M. Vectara Hallucination Leaderboard. 2023. Available online: https://github.com/vectara/hallucination-leaderboard (accessed on 20 August 2025).
42. Butler, J.J.; Acosta, E.; Kuna, M.C.; Harrington, M.C.; Rosenbaum, A.J.; Mulligan, M.T.; Kennedy, J.G. Decoding Radiology Reports: Artificial Intelligence-Large Language Models Can Improve the Readability of Hand and Wrist Orthopedic Radiology Reports. Hand 2024, 20, 1144–1152.
43. Chelli, M.; Descamps, J.; Lavoué, V.; Trojani, C.; Azar, M.; Deckert, M.; Raynier, J.-L.; Clowez, G.; Boileau, P.; Ruetsch-Chelli, C. Hallucination Rates and Reference Accuracy of ChatGPT and Bard for Systematic Reviews: Comparative Analysis. J. Med. Internet Res. 2024, 26, e53164.
44. Busch, F.; Hoffmann, L.; Rueger, C.; van Dijk, E.H.; Kader, R.; Ortiz-Prado, E.; Makowski, M.R.; Saba, L.; Hadamitzky, M.; Kather, J.N.; et al. Current Applications and Challenges in Large Language Models for Patient Care: A Systematic Review. Commun. Med. 2025, 5, 26.
45. Sedano, R.; Nguyen, T.M.; Almradi, A.; Rieder, F.; Parker, C.E.; Shackelton, L.M.; D’Haens, G.; Sandborn, W.J.; Feagan, B.G.; Ma, C.; et al. Disease Activity Indices for Pouchitis: A Systematic Review. Inflamm. Bowel Dis. 2021, 28, 622–638.
46. Rajpurkar, P.; Topol, E.J. Beyond Assistance: The Case for Role Separation in AI-Human Radiology Workflows. Radiology 2025, 316, e250477.
47. Zhang, K.; Zhou, R.; Adhikarla, E.; Yan, Z.; Liu, Y.; Yu, J.; Liu, Z.; Chen, X.; Davison, B.D.; Ren, H.; et al. A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks. Nat. Med. 2024, 30, 3129–3141.
48. Rundo, L.; Militello, C. Image Biomarkers and Explainable AI: Handcrafted Features versus Deep Learned Features. Eur. Radiol. Exp. 2024, 8, 130.
49. Verkade, C.; Brouwers, L.; Stijns, J.; van Dal, V.; Wasowicz, D.K.; de Kiefte, M.; van Tilborg, F.; Zimmerman, D.D.E. Validation of a 3D-Printed Model of Cryptoglandular Perianal Fistulas. Tech. Coloproctology 2024, 28, 59.
50. Jeri-McFarlane, S.; García-Granero, Á.; Ochogavía-Seguí, A.; Pellino, G.; Oseira-Reigosa, A.; Gil-Catalan, A.; Brogi, L.; Ginard-Vicens, D.; Gamundi-Cuesta, M.; Gonzalez-Argente, F.X. Three-Dimensional Modelling as a Novel Interactive Tool for Preoperative Planning for Complex Perianal Fistulas in Crohn’s Disease. Color. Dis. 2023, 25, 1279–1284.
51. Topol, E.J. High-Performance Medicine: The Convergence of Human and Artificial Intelligence. Nat. Med. 2019, 25, 44–56.





| Characteristic | Category | Frequency | Percent (%) |
|---|---|---|---|
| Aetiology of fistulae | Cryptoglandular | 158 | 63.2 |
| | Crohn’s | 69 | 27.6 |
| | Pouch-related fistula | 19 | 7.6 |
| | Rectovaginal fistula | 4 | 1.6 |
| Parks’ classification | Subsphincteric | 9 | 3.6 |
| | Intersphincteric | 63 | 25.2 |
| | Transsphincteric | 153 | 61.2 |
| | Suprasphincteric | 8 | 3.2 |
| | Extrasphincteric | 17 | 6.8 |
| Horseshoe | No horseshoe | 211 | 84.4 |
| | Horseshoe collection | 39 | 15.6 |
| Extension | No extension | 181 | 72.4 |
| | Extension | 69 | 27.6 |
| Abscess | No abscess | 211 | 84.4 |
| | Abscess | 39 | 15.6 |

| Objective Metric | Original (Mean (SD)) | AI-Generated Summary (Mean (SD)) | t | p |
|---|---|---|---|---|
| Flesch-Kincaid Score | 26.25 (9.1) | 65.83 (5.0) | 58.243 | <0.001 |
| FK Grade | 14.1 (2.7) | 7.93 (1.03) | −31.92 | <0.001 |
| Word Count | 172 (57.35) | 345 (55.03) | 48.52 | <0.001 |

| Subjective Metric | Combined Clinician Assessment |
|---|---|
| AI reflects original report | Yes: 212/250 (84.8%); No, but not clinically significant +: 38/250 (15.2%) |
| Overall impression (1–5) | 4.33 * |
| Strength of recommendations (1–5) * | 4.39 * |
| Confidence in report (1–5) * | 4.68 * |
| Hallucinations present | No: 221/250 (88.4%); Yes: 29/250 (11.6%) |
| Severity of hallucinations | 2.4% of cases considered ‘mild harm’ |
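
The objective readability metrics in the table above are of the kind produced by the standard Flesch formulas. As a minimal sketch, assuming the row labelled "Flesch-Kincaid Score" is the Flesch Reading Ease score (higher means easier to read) and "FK Grade" is the Flesch-Kincaid Grade Level, the Python below computes both together with a word count. The function names are hypothetical and the syllable counter is a crude vowel-group heuristic, so it illustrates the formulas rather than the tooling used in the study, and its output should not be expected to match the reported values.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count vowel groups (heuristic, an assumption)."""
    word = word.lower()
    groups = re.findall(r"[aeiouy]+", word)
    n = len(groups)
    # Treat a trailing silent "e" as non-syllabic, except in "-le" endings.
    if word.endswith("e") and not word.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)


def readability(text: str) -> dict:
    """Return Flesch Reading Ease, Flesch-Kincaid Grade Level and word count."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return {
        # Flesch Reading Ease: higher scores indicate easier text.
        "reading_ease": 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word,
        # Flesch-Kincaid Grade Level: approximate US school grade.
        "grade": 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59,
        "word_count": len(words),
    }


if __name__ == "__main__":
    # Illustrative sentences only; these are not taken from study reports.
    technical = "Transsphincteric fistula demonstrating posterior horseshoe extension."
    plain = "You have a small tunnel near your back passage. It looks better than last time."
    for label, text in (("technical", technical), ("plain", plain)):
        print(label, readability(text))
```

Both formulas depend only on average sentence length and average syllables per word, which is why shorter sentences and simpler vocabulary can raise the reading-ease score even when the summary is longer overall.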
| Patient Demographics | Mean (SD)/n (%) |
|---|---|
| Age | |
| Mean (SD) | 48 (13.7) |
| Sex | |
| Male | 36 (59%) |
| Female | 25 (41%) |
| Ethnicity | |
| Asian | 11 (18.0%) |
| Mixed | 1 (1.6%) |
| Arab | 6 (9.8%) |
| White | 43 (70.5%) |
| Level of Education | |
| Primary School | 1 (1.6%) |
| Secondary School | 12 (19.7%) |
| Vocational or similar | 13 (21.3%) |
| University bachelor’s degree | 19 (31.1%) |
| Graduate or professional | 16 (26.2%) |
| English—Reading and Writing | |
| Fluent (near-native proficiency) | 42 (68.9%) |
| Advanced (comfortable with complex reading and writing) | 16 (26.2%) |
| Intermediate (can read and write but with some difficulty) | 3 (4.9%) |
| Fistula Aetiology | |
| Crohn’s Disease | 23 (37.7%) |
| Cryptoglandular Disease | 30 (49.2%) |
| Ulcerative Colitis | 1 (1.6%) |
| Pouch-related Fistula | 3 (4.9%) |
| Rectovaginal Fistula | 4 (6.6%) |
| Number of investigations since fistula diagnosis | Mean (SD) |
| MRI Fistula | 6.3 (6.1) |

| Question | Response Category | n (%) |
|---|---|---|
| Do you currently use the NHS app? | Yes | 52 (85.2%) |
| | No | 9 (14.8%) |
| How often do you access your medical records on this app? | Never | 3 (4.9%) |
| | Once a month | 7 (11.5%) |
| | A few times a year | 20 (32.8%) |
| | Prior to clinical appointments | 11 (18.0%) |
| | A few times a month | 20 (32.8%) |
| How easy do you find it to read your diagnostic reports? | Extremely difficult | 7 (11.5%) |
| | Slightly difficult | 17 (27.9%) |
| | Neither easy nor difficult | 17 (27.9%) |
| | Slightly easy | 11 (18.0%) |
| | Extremely easy | 9 (14.8%) |
| How useful would you find a patient-friendly AI-generated summary of your diagnostic report? | Extremely useful | 28 (45.9%) |
| | Very useful | 20 (32.8%) |
| | Moderately useful | 9 (14.8%) |
| | Slightly useful | 3 (4.9%) |
| | Not at all useful | 1 (1.6%) |
| How useful would you find a ‘next steps’ or ‘action plan’ statement? | Extremely useful | 32 (52.5%) |
| | Very useful | 18 (29.5%) |
| | Moderately useful | 8 (13.1%) |
| | Slightly useful | 2 (3.3%) |
| | Not at all useful | 1 (1.6%) |

Report A: Simple Fistula (n = 31)

| Variable | Mean Original (SD) | Mean AI (SD) | Mean Difference | t | p-Value |
|---|---|---|---|---|---|
| Readability | 3.26 (1.21) | 4.45 (1.03) | 1.19 | 4.27 | <0.001 |
| Comprehensibility | 3.29 (1.10) | 4.48 (0.93) | 1.19 | 5.21 | <0.001 |
| Perceived Utility | 3.45 (1.15) | 4.29 (0.97) | 0.84 | 3.47 | <0.001 |
| Follow Up | 4.19 (0.95) | 4.00 (1.18) | −0.19 | −0.90 | 0.19 |
| Trustworthiness | 4.06 (1.00) | 4.23 (1.12) | 0.16 | 0.74 | 0.23 |

Report B: Complex Fistula (n = 30)

| Variable | Mean Original (SD) | Mean AI (SD) | Mean Difference | t | p-Value |
|---|---|---|---|---|---|
| Readability | 2.70 (1.26) | 4.50 (0.94) | 1.80 | 5.5 | <0.001 |
| Comprehensibility | 2.73 (1.29) | 4.33 (1.16) | 1.60 | 4.7 | <0.001 |
| Perceived Utility | 2.93 (1.14) | 4.37 (1.16) | 1.43 | 5.0 | <0.001 |
| Follow Up | 4.23 (0.97) | 3.73 (1.20) | −0.50 | −1.53 | 0.14 |
| Trustworthiness | 3.97 (0.93) | 4.13 (1.11) | 0.17 | 0.93 | 0.36 |

Comparison of AI Across Differing Complexities (Option A (AI) vs. Option B (AI))

| Variable | A: Mean AI (SD) | B: Mean AI (SD) | Mean Difference | t | p-Value |
|---|---|---|---|---|---|
| Readability | 4.45 (1.03) | 4.50 (0.94) | −0.05 | −0.19 | 0.85 |
| Comprehensibility | 4.48 (0.93) | 4.33 (1.16) | 0.15 | 0.56 | 0.58 |
| Perceived Utility | 4.29 (0.97) | 4.37 (1.16) | −0.08 | −0.28 | 0.78 |
| Follow Up | 4.00 (1.18) | 3.73 (1.20) | 0.27 | 0.87 | 0.39 |
| Trustworthiness | 4.23 (1.12) | 4.13 (1.11) | 0.09 | 0.33 | 0.75 |
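
The comparison of AI summaries across the two report complexities involves two separate patient groups (n = 31 and n = 30), so a t statistic can be recomputed from the published means and standard deviations alone. The sketch below assumes an unequal-variance (Welch) two-sample test; the test actually used is not stated in this section, and the function name `welch_t` is hypothetical. For the readability row it gives a t of roughly −0.2, which is in line with the tabulated −0.19.

```python
import math


def welch_t(mean_a: float, sd_a: float, n_a: int,
            mean_b: float, sd_b: float, n_b: int) -> tuple[float, float]:
    """Welch's t statistic and degrees of freedom from summary statistics."""
    # Squared standard error contributed by each group.
    var_a = sd_a ** 2 / n_a
    var_b = sd_b ** 2 / n_b
    t = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return t, df


if __name__ == "__main__":
    # Readability of AI summaries: Report A (n = 31) vs. Report B (n = 30).
    t, df = welch_t(4.45, 1.03, 31, 4.50, 0.94, 30)
    print(f"t = {t:.2f}, df = {df:.1f}")
```

The within-report comparisons (original vs. AI summary) cannot be checked this way if they were paired, because a paired test needs the standard deviation of the per-patient differences, which the table does not report.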

| Domain | Positive Feedback | Points for Improvement |
|---|---|---|
| Anatomy | | |
| Lay AI-generated summary & Patient-Focused Language | | |
| Clinical Recommendations | | |
| Healing & Disease Trajectory | | |
| Structured Reporting | | |
| Content Accuracy & Consistency | | |
| Generic Recommendations | | |
| Omissions & Missed Findings | | |

Suggested Next Steps (not directive, for discussion):

🔍 Your MRI Results Explained
✅ What we found: You have a tunnel (called a fistula) near your back passage. It starts [near the skin/inside the bowel] and may have other branches or small fluid areas.
🔄 Is it healing?: Compared to your last scan, [it looks better/has stayed the same/may need more care].
💬 What does this mean for you?: Your care team might suggest keeping a small soft thread (called a seton) in place, or planning surgery. This depends on your symptoms and type of disease.
If you have Crohn’s disease: This might be part of your condition. Speak to your IBD nurse or colorectal specialist about long-term care.
📅 Next steps: Ask your team about follow-up, treatment options, and whether more scans or procedures are needed.

“discuss with your colorectal surgeon whether a new seton is needed or if the fistula can be closed”.

Next Steps:
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Anand, E.; Ghersin, I.; Lingam, G.; Devlin, K.; Pelly, T.; Singer, D.; Tomlinson, C.; Munro, R.E.J.; Capstick, R.; Antoniou, A.; Hart, A.L.; Tozer, P.; Sahnan, K.; Lung, P. Enhancing Patient Understanding of Perianal Fistula MRI Findings Using ChatGPT: A Randomized, Single Centre Study. Diagnostics 2026, 16, 72. https://doi.org/10.3390/diagnostics16010072

