Using Large Language Models to Simulate History Taking: Implications for Symptom-Based Medical Education
Abstract
1. Introduction
2. Materials and Methods
2.1. Dataset Construction and Assessment
- Question: What does your chest pain feel like?
- Response options: squeezing or tightening; sharp or stabbing; pressure-like and heavy; burning or aching; tearing or ripping.
2.1.1. Checklist-Based Assessment of Medical Appropriateness
2.1.2. Analysis of Diagnostic Diversity
2.2. Exploration of Potential Educational Applications
2.2.1. Visualization of Diagnostic Pathways
2.2.2. Analysis of Age-Specific and Sex-Specific Dialogue Patterns
2.2.3. Identifying High-Impact Questions for Diagnosis
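As an illustration of how a question's diagnostic impact can be quantified with information gain (IG), the following minimal sketch computes the reduction in Shannon entropy over final diagnoses once the answer to a single question is known. The function names (`entropy`, `information_gain`), the record layout, and the four toy records are hypothetical and are not drawn from the study's dataset; the exact IG formulation used in the analysis may differ.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of diagnosis labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(records, question):
    """IG(question) = H(diagnosis) - H(diagnosis | answer to question)."""
    diagnoses = [r["diagnosis"] for r in records]

    # Group the diagnoses by the answer each simulated patient gave.
    by_answer = defaultdict(list)
    for r in records:
        by_answer[r["answers"][question]].append(r["diagnosis"])

    # Weighted average entropy of the diagnoses within each answer group.
    conditional = sum(
        len(group) / len(records) * entropy(group)
        for group in by_answer.values()
    )
    return entropy(diagnoses) - conditional


# Hypothetical toy dialogues (invented for illustration only).
records = [
    {"answers": {"quality": "tearing", "onset": "sudden"}, "diagnosis": "aortic dissection"},
    {"answers": {"quality": "tearing", "onset": "sudden"}, "diagnosis": "aortic dissection"},
    {"answers": {"quality": "burning", "onset": "sudden"}, "diagnosis": "GERD"},
    {"answers": {"quality": "burning", "onset": "sudden"}, "diagnosis": "GERD"},
]

ig_quality = information_gain(records, "quality")  # answer separates the diagnoses
ig_onset = information_gain(records, "onset")      # same answer for every record
```

In this toy set, the pain-quality question fully separates the two diagnoses, so it recovers the entire entropy of the diagnosis distribution, whereas the onset question receives the same answer in every record and contributes nothing. Ranking questions by such a score is one way to surface high-impact questions for learners.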
2.3. Statistical Analysis
3. Results
3.1. Evaluation of LLM-Generated History-Taking Dialogues
3.2. Exploration of Potential Educational Applications
3.2.1. Sankey Diagram
3.2.2. Age-Specific and Sex-Specific Dialogues
3.2.3. IG Analysis
4. Discussion
4.1. Can the Fine-Tuned Gemma-3-27B Perform Effective History Taking?
4.2. How Can LLM-Generated Dialogues Support Medical Students in Learning History Taking?
4.3. Limitations
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
LLMs | Large language models |
AI | Artificial intelligence |
CPC | Chest Pain Checklist |
IG | Information gain |
No. | Item | Description |
---|---|---|
1 | Quality | Character or nature of the chest pain (e.g., sharp, dull) |
2 | Location | Site of pain on the chest |
3 | Radiation | Site of radiated pain other than the chest (e.g., arm, jaw) |
4 | Onset | Timing of chest pain onset |
5 | Duration | How long a chest pain episode lasts |
6 | Aggravating or relieving factors | Factors that worsen or soothe the pain |
7 | Associated symptoms | Other symptoms present (e.g., shortness of breath, nausea) |
8 | Past medical history | Relevant pre-existing medical or surgical conditions |
9 | Patient’s activity at onset | What the patient was doing at the time of chest pain onset (e.g., exertion, fall on the chest) |
10 | Severity | Intensity of the pain |
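
A checklist such as the one tabulated above lends itself to a simple coverage score per dialogue. The sketch below assumes that each dialogue has already been annotated with the checklist items its questions address; that annotation step, the function name `cpc_coverage`, and the example input are hypothetical and are not the authors' scoring procedure.

```python
# The ten checklist items, in the order listed in the table.
CPC_ITEMS = [
    "Quality",
    "Location",
    "Radiation",
    "Onset",
    "Duration",
    "Aggravating or relieving factors",
    "Associated symptoms",
    "Past medical history",
    "Patient's activity at onset",
    "Severity",
]


def cpc_coverage(covered_items):
    """Return (score, missing) for one dialogue.

    `covered_items` is an iterable of checklist item names that the
    dialogue's questions were judged to address (assumed annotation).
    """
    covered = set(covered_items)
    unknown = covered - set(CPC_ITEMS)
    if unknown:
        raise ValueError(f"Not a checklist item: {sorted(unknown)}")
    missing = [item for item in CPC_ITEMS if item not in covered]
    return len(CPC_ITEMS) - len(missing), missing


# Hypothetical dialogue that addressed four of the ten items.
score, missing = cpc_coverage(["Quality", "Location", "Onset", "Severity"])
```

Returning the missing items alongside the score makes it possible to show a learner which history-taking domains a dialogue left unexplored, rather than only a number out of ten.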
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Huh, C.Y.; Lee, J.; Kim, G.; Jang, Y.; Ko, H.-s.; Suh, M.J.; Hwang, S.; Son, H.J.; Song, J.; Kim, S.-J.; et al. Using Large Language Models to Simulate History Taking: Implications for Symptom-Based Medical Education. Information 2025, 16, 653. https://doi.org/10.3390/info16080653
APA style: Huh, C. Y., Lee, J., Kim, G., Jang, Y., Ko, H.-s., Suh, M. J., Hwang, S., Son, H. J., Song, J., Kim, S.-J., Kim, K. J., Kim, S. I., Kim, C. O., & Ko, Y. G. (2025). Using Large Language Models to Simulate History Taking: Implications for Symptom-Based Medical Education. Information, 16(8), 653. https://doi.org/10.3390/info16080653