Completeness and Quality of Neurology Referral Letters Generated by a Large Language Model for Standardized Scenarios
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Design
2.2. ChatGPT Letter Generation
2.3. Evaluation Rubric
2.4. Scoring Procedure
2.5. Statistical Analysis
2.6. Ethical Considerations
3. Results
3.1. Overall Performance
3.2. Letter Completeness
3.3. Letter Quality
3.4. Summary of Findings
4. Discussion
5. Conclusions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
List of Abbreviations
| Abbreviation | Definition |
|---|---|
| AI | Artificial intelligence |
| CC | Chief complaint |
| DaT | Dopamine transporter |
| EHR | Electronic health record |
| HPI | History of present illness |
| LLM | Large language model |
| Mx | Management |
| PE | Physical exam findings |
| Q | Consultation questions |
| SD | Standard deviation |
| TIA | Transient ischemic attack |
Appendix A. Standardized Clinical Scenarios
- Migraine headaches since adolescence, previously well-controlled with sumatriptan.
- Seasonal allergies.
- No history of asthma, diabetes, hypertension, or cardiac disease.
- Tonsillectomy and adenoidectomy at age 5.
- Smokes 1 pack of cigarettes per day for the past 10 years.
- Drinks alcohol socially on weekends.
- Denies illicit drug use.
- Employed as a teacher.
- Lives in an apartment with her husband and two children.
- Mother has a history of migraine headaches.
- Father has a history of hypertension and hyperlipidemia.
- General: No weight loss or fever.
- HEENT: Headaches as described above. No visual disturbances, hearing loss, or tinnitus.
- Cardiovascular: No chest pain, palpitations, or shortness of breath.
- Pulmonary: No cough, sputum production, or dyspnea.
- Gastrointestinal: Occasional constipation.
- Genitourinary: No urinary frequency, urgency, or dysuria.
- Musculoskeletal: No joint pain, stiffness, or weakness.
- Neurological: Headaches as described above. No numbness, tingling, weakness, or difficulty speaking.
- Psychiatric: Occasional anxiety and insomnia.
- General: Well-developed, well-nourished female in no acute distress.
- Vitals: Temperature 37.0 °C, pulse 80 bpm, respirations 16 breaths/min, blood pressure 120/80 mmHg, oxygen saturation 99% on room air.
- HEENT: Pupils equal, round, reactive to light and accommodation. No fundoscopic abnormalities.
- Neck: Supple, no lymphadenopathy, no jugular venous distention.
- Cardiovascular: Regular rate and rhythm, no murmurs, rubs, or gallops.
- Pulmonary: Clear to auscultation bilaterally.
- Abdomen: Soft, non-tender, non-distended, no organomegaly.
- Neurological:
  - Cranial nerves II–XII intact.
  - Motor strength 5/5 bilaterally in all extremities.
  - Sensation intact to light touch, pinprick, and vibration.
  - Reflexes 2+ symmetrically.
  - Gait normal.
  - No cerebellar signs.
- Mental Status: Alert and oriented x3.
- Cranial Nerves:
  - II: Visual acuity 20/20 in both eyes. Visual fields full to confrontation.
  - III, IV, VI: Extraocular movements intact.
  - V: Facial sensation intact to light touch and pinprick.
  - VII: Facial symmetry intact.
  - VIII: Hearing intact to whispered voice.
  - IX, X: Uvula midline, gag reflex present.
  - XI: Shoulder shrug and head turn strength equal bilaterally.
  - XII: Tongue midline, no fasciculations.
- Motor:
  - Upper and lower extremity strength 5/5 bilaterally.
  - No pronator drift.
- Sensory:
  - Light touch, pinprick, vibration, and proprioception intact in all extremities.
- Reflexes:
  - Biceps, triceps, brachioradialis, patellar, and Achilles reflexes 2+ bilaterally.
  - Plantar reflexes flexor bilaterally.
- Coordination:
  - Finger-to-nose, heel-to-shin, and rapid alternating movements normal.
  - Romberg test negative.
  - Gait normal.
- Chronic daily headaches, likely migraine.
- Possible medication overuse headache.
- Rule out other secondary causes of headache.
- Migraine headaches.
- Medication overuse headache.
- Tension-type headaches.
- Cluster headaches.
- Sinus headaches.
- Subarachnoid hemorrhage.
- Brain tumor.
- Giant cell arteritis.
- Diagnostic Studies:
  - Complete blood count (CBC).
  - Comprehensive metabolic panel (CMP).
  - Erythrocyte sedimentation rate (ESR).
  - C-reactive protein (CRP).
  - Urinalysis.
  - Consider imaging studies (CT head or MRI head) if red flags are present.
- Treatment:
  - Medications:
    - Discontinue any over-the-counter pain medications.
    - Start prophylactic migraine medications (e.g., propranolol, topiramate).
    - Prescribe acute migraine medications (e.g., sumatriptan, triptans).
  - Lifestyle modifications:
    - Stress management techniques (e.g., relaxation, yoga).
    - Regular sleep schedule.
    - Avoidance of triggers (e.g., caffeine, alcohol, stress, bright lights).
    - Regular exercise.
- Follow-up:
  - Schedule follow-up appointment in 1–2 weeks to assess headache response to treatment.
- Alzheimer’s disease.
- Vascular dementia.
- Frontotemporal dementia.
- Lewy body dementia.
- Depression.
- Medication side effects.
- Vitamin B12 deficiency.
- Hypothyroidism.
- Complete blood count (CBC), comprehensive metabolic panel (CMP), thyroid function tests (TSH), vitamin B12 and folate levels, and lipid profile.
- Neuropsychological testing to further evaluate cognitive function.
- Brain imaging studies, such as MRI or CT scan of the head.
- Referral to a neurologist for further evaluation and management.
- Transient ischemic attack (TIA): This diagnosis is strongly suspected given the acute onset of focal neurological deficits (left-sided weakness, dysarthria) followed by complete resolution within 24 h.
- Possible risk factors: Hypertension, hyperlipidemia, diabetes, smoking history, and family history of cardiovascular disease.
- Stroke: While complete resolution of symptoms suggests TIA, a small ischemic stroke cannot be entirely ruled out.
- Migraine with aura: This can present with transient neurological symptoms, but the typical features of headache and visual disturbances are absent in this case.
- Hypoglycemia: This can cause transient neurological symptoms, but the patient denies any history of diabetes-related hypoglycemia.
- Immediate evaluation:
  - Non-contrast head CT: To rule out intracranial hemorrhage.
  - Electrocardiogram (ECG): To assess for cardiac arrhythmias.
  - Blood tests: Complete blood count, comprehensive metabolic panel, lipid profile, and coagulation studies.
  - Carotid ultrasound: To assess for carotid artery stenosis.
  - Echocardiogram: To evaluate for cardiac sources of embolism.
- Antiplatelet therapy: Initiate aspirin or clopidogrel immediately to reduce the risk of subsequent stroke.
- Risk factor modification:
  - Blood pressure control: Optimize antihypertensive medications.
  - Glycemic control: Tightly control blood sugar levels.
  - Lipid management: Initiate or adjust statin therapy.
  - Smoking cessation counseling: If applicable.
  - Dietary and lifestyle modifications: Encourage a healthy diet and regular exercise.
- Urgent neurology consultation: For further evaluation and management.
- Hypertension, well-controlled on lisinopril.
- Hyperlipidemia, well-controlled on atorvastatin.
- Osteoarthritis of the knees.
- Hypothyroidism, well-controlled on levothyroxine.
- Hysterectomy in 2005.
- Cholecystectomy in 2010.
- Cataract surgery bilaterally in 2018.
- Retired school teacher.
- Lives independently in a single-family home.
- Denies smoking or illicit drug use.
- Occasional social alcohol consumption (1–2 glasses of wine per week).
- No significant travel history.
- Mother: History of hypertension and hyperlipidemia.
- Father: History of Parkinson’s disease.
- Siblings: No significant medical history.
- General: No weight loss, fever, or fatigue.
- HEENT: No visual disturbances, hearing loss, tinnitus, epistaxis, sore throat, hoarseness, or dental problems.
- Cardiovascular: No chest pain, palpitations, shortness of breath, or edema.
- Respiratory: No cough, sputum production, or shortness of breath.
- Gastrointestinal: No abdominal pain, nausea, vomiting, diarrhea, or constipation.
- Genitourinary: No urinary incontinence, frequency, or urgency. No dysuria.
- Musculoskeletal: Occasional joint pain in knees, relieved with over-the-counter pain relievers.
- Neurological: Tremors in both upper extremities as described above. No headache, dizziness, vertigo, syncope, seizures, weakness, numbness, or tingling.
- Psychiatric: No anxiety, depression, or insomnia.
- General: Well-nourished, alert, and oriented x3.
- Vital Signs: Blood pressure 128/78 mmHg, pulse 72 beats/min regular, respirations 16 breaths/min, temperature 98.6 °F (37.0 °C), oxygen saturation 99% on room air.
- Neurological Examination:
  - Mental Status: Alert and oriented x3. Speech clear and fluent. No cognitive deficits.
  - Cranial Nerves: II–XII grossly intact.
  - Motor:
    - Strength: 5/5 in all extremities.
    - Tone: Normal.
    - Coordination:
      - Finger-to-nose: Difficulty due to tremor.
      - Rapid alternating movements: Slow and clumsy.
      - Heel-to-shin: Intact.
    - Gait: Normal.
    - Posture: Upright.
    - Resting tremor: Present in both hands at rest, pill-rolling in appearance.
    - Action tremor: Minimal.
  - Sensory: Light touch, pinprick, vibration, and proprioception intact.
  - Reflexes: 2+ symmetrically in upper and lower extremities.
  - Babinski sign: Absent bilaterally.
- Tremor, likely Parkinson’s disease.
- Essential tremor.
- Parkinson’s disease.
- Cerebellar tremor.
- Dystonia.
- Medication-induced tremor (e.g., beta-blockers, theophylline).
- Hyperthyroidism.
- Wilson’s disease.
- Diagnostic Workup:
  - Laboratory tests: Complete blood count, comprehensive metabolic panel, thyroid function tests, liver function tests, and ceruloplasmin level.
  - Neuroimaging: Brain MRI with and without contrast.
  - DaTscan: To assess dopamine transporter function.
- Treatment:
  - Medical management:
    - If confirmed Parkinson’s disease, initiate treatment with levodopa/carbidopa.
    - Consider beta-blockers (e.g., propranolol) for essential tremor.
  - Non-pharmacological interventions:
    - Occupational therapy for adaptive strategies to cope with tremors.
    - Physical therapy for exercise and balance training.
- Follow-up:
  - Schedule a follow-up appointment in 1–2 weeks to discuss diagnostic results and further management options.
- Conservative management:
  - Pain management: Over-the-counter pain relievers such as acetaminophen or ibuprofen. Consider referral to physical therapy for modalities such as heat therapy, massage, and core strengthening exercises.
  - Lifestyle modifications: Encourage weight loss if overweight or obese. Advise the patient to maintain good posture and avoid prolonged sitting or standing.
- Imaging studies: Obtain lumbar spine MRI to confirm the diagnosis and rule out other potential causes.
- Further evaluation: If conservative management fails to improve symptoms, consider further evaluation with a neurologist or neurosurgeon.
Appendix B. Sample AI-Generated Neurology Referral Letters
Appendix B.1. High-Quality Referral Letter (TIA Case, Letter 3.10, Score: 30/30)
- General: Well-nourished, alert, and oriented male in no acute distress.
- Vital Signs: BP 150/95 mmHg, HR 78 bpm, RR 16, BMI 27.
- Cardiac & Pulmonary: Regular rate and rhythm; clear to auscultation.
- Neurological Exam: Normal on arrival; no motor or sensory deficits.
- Stroke (though complete symptom resolution suggests TIA).
- Migraine aura without headache.
- Hypoglycemia (unlikely given normal blood glucose).
- Immediate imaging: Non-contrast head CT to rule out hemorrhage.
- Vascular studies: Carotid ultrasound for stenosis evaluation.
- Cardiac assessment: Echocardiogram, ECG to assess embolic sources.
- Risk factor management: Initiation of aspirin/clopidogrel, blood pressure control.
Appendix B.2. Low-Quality Referral Letter (Radiculopathy Case, Letter 5.2, Score: 10/30)
- Presenting Complaint: [Details of symptoms, onset, progression, severity, any related factors]
- Past Medical History: [Relevant conditions, surgeries, or major illnesses]
- Medications: [Current medications, dosages, recent changes]
- Family History: [Pertinent neurological or other conditions]
- General Appearance: [Patient’s general condition, distress, etc.]
- Neurological Examination: [Summary of findings: mental status, cranial nerves, motor function, reflexes, gait]
- Letter 3.10 demonstrates strong organization, completeness, and clinical clarity, scoring 30/30.
- Letter 5.2 lacks key details in the history, physical examination, and plan because the LLM left bracketed placeholder fields unfilled, scoring 10/30.
| Category | Description | Max Score |
|---|---|---|
| Demographics | Includes patient age, gender, handedness, and occupation | 2 |
| Chief complaint (CC) | Clearly states the main reason for medical visit | 2 |
| History of present illness (HPI) | Details about symptom onset, progression, triggers, impact, etc. | 5 |
| Physical examination findings (PE) | Findings from physical/neurological exams relevant to the case | 5 |
| Management (Mx) | Description of current or proposed management plan (investigation, treatment, referral, etc.) | 3 |
| Consultation questions (Q) | Specific questions for the specialist | 3 |
| Language level | Professional, clear, and concise language | 5 |
| Overall structure | Logical organization and separation of sections | 3 |
| Letter length | Adheres to the recommended word count range (300–400 words = 2; <250 or >450 words = 0) | 2 |
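
As a minimal sketch of how the rubric above could combine into a 30-point total (20 for completeness, 10 for quality), the snippet below validates each category score against its maximum and sums the subtotals. The dictionary keys and function names are illustrative assumptions, not part of the study. The 1-point band for letter length is also an assumption: the rubric as printed defines only the 2-point and 0-point ranges.

```python
# Maximum points per rubric category, copied from the rubric table.
COMPLETENESS_MAX = {
    "demographics": 2,
    "chief_complaint": 2,
    "hpi": 5,
    "physical_exam": 5,
    "management": 3,
    "consultation_questions": 3,
}
QUALITY_MAX = {"language_level": 5, "overall_structure": 3, "letter_length": 2}


def letter_length_score(word_count: int) -> int:
    """Score letter length on the rubric's 2-point scale.

    300-400 words = 2 and <250 or >450 words = 0 come from the rubric.
    Awarding 1 point for the ranges in between is an ASSUMPTION.
    """
    if 300 <= word_count <= 400:
        return 2
    if word_count < 250 or word_count > 450:
        return 0
    return 1  # assumed intermediate band (250-299 or 401-450 words)


def total_score(scores: dict) -> dict:
    """Check each score against its maximum and return the subtotals."""
    all_max = {**COMPLETENESS_MAX, **QUALITY_MAX}
    for name, value in scores.items():
        if name not in all_max:
            raise KeyError(f"unknown rubric category: {name}")
        if not 0 <= value <= all_max[name]:
            raise ValueError(f"{name} must be between 0 and {all_max[name]}")
    # Categories missing from `scores` count as 0 points.
    completeness = sum(scores.get(k, 0) for k in COMPLETENESS_MAX)
    quality = sum(scores.get(k, 0) for k in QUALITY_MAX)
    return {
        "completeness": completeness,
        "quality": quality,
        "total": completeness + quality,
    }


# A letter with full marks in every category reaches the 30-point maximum.
full_marks = total_score({**COMPLETENESS_MAX, **QUALITY_MAX})
```

Under these assumptions, a letter with full marks in every category totals 30, the score reported for Letter 3.10 in Appendix B.1.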
| Component | Case 1: Headache | Case 2: Memory Problems | Case 3: TIA | Case 4: Tremor | Case 5: Radiculopathy | Overall Score |
|---|---|---|---|---|---|---|
| Demographics (2 pts) | 1.9 (95%) | 1.9 (95%) | 1.9 (95%) | 1.9 (95%) | 1.9 (95%) | 1.92 (96%) |
| CC (2 pts) | 1.9 (95%) | 1.9 (95%) | 1.9 (95%) | 1.9 (95%) | 1.9 (95%) | 1.9 (95%) |
| HPI (5 pts) | 4.3 (86%) | 4.2 (84%) | 4.3 (86%) | 4.4 (88%) | 4.5 (90%) | 4.42 (88.4%) |
| PE (5 pts) | 4.1 (82%) | 4.0 (80%) | 4.3 (86%) | 4.5 (90%) | 4.3 (86%) | 4.24 (84.8%) |
| Mx (3 pts) | 2.1 (70%) | 2.0 (67%) | 2.3 (77%) | 2.4 (80%) | 2.1 (70%) | 2.18 (72.67%) |
| Q (3 pts) | 2.8 (93%) | 2.7 (90%) | 2.8 (93%) | 2.9 (97%) | 2.8 (93%) | 2.8 (93.33%) |
| Overall completeness score (20 pts) | 17.4 (87%) | 16.7 (83%) | 17.8 (89%) | 18.0 (90%) | 17.5 (88%) | 17.46 (87.3%) |
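
Each percentage in the table above is a mean score divided by the category maximum. The short sketch below reproduces that arithmetic for the Overall Score column only, using the values as printed; the variable and function names are illustrative assumptions.

```python
# (mean score, maximum points) for each completeness component,
# copied from the Overall Score column of the table above.
OVERALL = {
    "Demographics": (1.92, 2),
    "CC": (1.9, 2),
    "HPI": (4.42, 5),
    "PE": (4.24, 5),
    "Mx": (2.18, 3),
    "Q": (2.8, 3),
}


def percent_of_max(mean: float, max_points: int) -> float:
    """Express a mean score as a percentage of the maximum, to 2 decimals."""
    return round(100 * mean / max_points, 2)


# Percentage of maximum for each component.
percentages = {
    name: percent_of_max(mean, max_points)
    for name, (mean, max_points) in OVERALL.items()
}

# The component means add up to the overall completeness score out of 20.
overall_mean = round(sum(mean for mean, _ in OVERALL.values()), 2)
overall_percent = percent_of_max(overall_mean, 20)
```

With the printed values, the component means sum to 17.46 out of 20, which corresponds to 87.3%.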
| Quality Metric | Case 1: Headache | Case 2: Memory Problems | Case 3: TIA | Case 4: Tremor | Case 5: Radiculopathy | Overall Score |
|---|---|---|---|---|---|---|
| Language level (5 pts) | 4.6 (92%) | 4.5 (90%) | 4.7 (94%) | 4.6 (92%) | 4.5 (90%) | 4.58 (91.6%) |
| Structure and organization (3 pts) | 2.7 (90%) | 2.6 (87%) | 2.8 (93%) | 2.9 (97%) | 2.7 (90%) | 2.7 (90%) |
| Letter length compliance (2 pts) | 1.0 (50%) | 1.1 (55%) | 1.2 (60%) | 1.0 (50%) | 1.0 (50%) | 1.02 (51%) |
| Total quality score (10 pts) | 8.3 (83%) | 8.2 (82%) | 8.7 (87%) | 8.5 (85%) | 8.2 (82%) | 8.38 (83.8%) |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the author. Published by MDPI on behalf of the Lithuanian University of Health Sciences. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Rattananan, W. Completeness and Quality of Neurology Referral Letters Generated by a Large Language Model for Standardized Scenarios. Medicina 2025, 61, 1931. https://doi.org/10.3390/medicina61111931

