A Review of Large Language Models in Medical Education, Clinical Decision Support, and Healthcare Administration
Abstract
1. Introduction
2. Materials and Methods
- PubMed (https://pubmed.ncbi.nlm.nih.gov/, accessed on 1 October 2024)
- ArXiv (https://arxiv.org/, accessed on 5 October 2024)
- IEEE Xplore (https://ieeexplore.ieee.org/Xplore/home.jsp, accessed on 2 February 2024)
- Articles discussing the use of LLMs in medical or healthcare settings
- Studies describing or evaluating LLM-based interventions or workflows in education, clinical decision-making, or administration
- Publications available in English
The following exclusion criteria were applied:
- Papers lacking explicit mention of an LLM or focusing solely on AI methods not relevant to LLMs (e.g., non-transformer-based models)
- Abstracts without sufficient methodological detail
- Commentaries or opinion pieces that did not provide any empirical or explicit conceptual data
3. Large Language Models in Medical Education
4. LLMs in Clinical Decision Support and Knowledge Retrieval
Title | Authors/Year | Key Findings | Limitations |
---|---|---|---|
Augmenting Black-box LLMs with Medical Textbooks for Clinical Question Answering | [28] | Augmenting LLMs with comprehensive RAG pipelines leads to improved performance and reduced hallucinations in medical QA (a minimal RAG sketch follows the table below) | The study only evaluated the system on three medical QA tasks without testing its performance on more complex clinical scenarios or real-world medical applications. Additionally, while the system showed improved accuracy, it relied on existing medical textbooks, which may become outdated, and the study did not assess the system’s ability to handle emerging medical knowledge or novel clinical cases. |
Leveraging Large Language Models for Decision Support in Personalized Oncology | [12] | LLMs show potential in personalized oncology, albeit still not matching human expert-level quality | The study was restricted to only 10 fictional cancer cases and used only four LLMs (ChatGPT, Galactica, Perplexity, and BioMedLM) for evaluation. |
Evaluation and mitigation of the limitations of large language models in clinical decision-making | [30] | The researchers found that current state-of-the-art LLMs perform significantly worse than clinicians in diagnosing patients, fail to follow diagnostic and treatment guidelines, and struggle with basic tasks like interpreting laboratory results, concluding that LLMs are not yet ready for autonomous clinical decision-making and require extensive clinician supervision. | The study was restricted to only four common abdominal pathologies and used data from a single database (MIMIC) with a clear US-centric bias, as the data were gathered in an American hospital following American guidelines. Additionally, the study used only open-sourced Llama-2 based models. |
Exploring the landscape of AI-assisted decision-making in head and neck cancer treatment: a comparative analysis of NCCN guidelines and ChatGPT responses | [32] | ChatGPT shows promise in providing treatment suggestions for Head and Neck cancer aligned with NCCN Guidelines | The study was restricted to hypothetical cases rather than real patient scenarios and only evaluated ChatGPT’s performance against NCCN Guidelines without considering other treatment guidelines or real-world clinical complexities. |
Large language model (ChatGPT) as a support tool for breast tumor board | [33] | ChatGPT-3.5 provides good recommendations when evaluated as a decision support tool in breast cancer boards. | The study used a very small sample size of only 10 consecutive patients, relied on only two senior radiologists for evaluation, and was limited to using ChatGPT-3.5 accessed on a single day (9 February 2023). |
Exploring the Potential of ChatGPT-4 in Predicting Refractive Surgery Categorizations: Comparative Study | [34] | ChatGPT-4 achieves significant agreement with clinicians in predicting refractive surgery categorizations | The study relied on a single human rater for comparison and used a small sample size of only 100 consecutive patients. |
Almanac—Retrieval-Augmented Language Models for Clinical Medicine | [35] | Almanac, a RAG-LLM system, significantly outperforms standard LLMs in ClinicalQA, while also providing correct citations and handling adversarial prompts | The evaluation was limited to a panel of only 10 healthcare professionals (8 board-certified clinicians and 2 healthcare practitioners) and compared Almanac against only three standard LLMs (ChatGPT-4, Bing, and Bard). While the study used 314 clinical questions across nine medical specialties, it focused primarily on guidelines and treatment recommendations without evaluating other aspects of clinical decision-making. |
Enhancing Large Language Models for Clinical Decision Support by Incorporating Clinical Practice Guidelines | [29] | Binary Decision Tree (BDT), Program-Aided Graph Construction (PAGC), and Chain-of-Thought–Few-Shot Prompting (CoT-FSP) improved performance in both automated and human evaluations. | The methods were tested only on a relatively small synthetic dataset of 39 patients. |
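The retrieval-augmented generation (RAG) systems summarized above [28,35] share a common pattern: retrieve passages from a curated corpus (medical textbooks or clinical guidelines), then condition the model’s answer on that evidence and ask it to cite its sources. The following is a minimal sketch of this pattern, not the pipeline of any cited study: the tiny in-memory corpus, the TF-IDF retriever (standing in for the dense retrievers typically used), and the placeholder `generate_answer` function are all illustrative assumptions.

```python
# Minimal retrieval-augmented clinical QA sketch (illustrative only).
# Assumptions: a tiny in-memory corpus stands in for an indexed medical textbook,
# TF-IDF stands in for the dense retrievers used in the cited systems, and
# generate_answer() is a placeholder for the actual LLM call.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

CORPUS = [
    "Metformin is first-line pharmacotherapy for most adults with type 2 diabetes.",
    "ACE inhibitors are recommended for hypertension in patients with chronic kidney disease.",
    "Community-acquired pneumonia in healthy outpatients is commonly treated with amoxicillin.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k corpus snippets most similar to the question."""
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(CORPUS + [question])
    sims = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    return [CORPUS[i] for i in sims.argsort()[::-1][:k]]

def build_prompt(question: str, passages: list[str]) -> str:
    """Assemble an evidence-grounded prompt that asks the model to cite passages."""
    evidence = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer the clinical question using only the evidence below, citing passage numbers.\n"
        f"Evidence:\n{evidence}\n\nQuestion: {question}\nAnswer:"
    )

def generate_answer(prompt: str) -> str:
    # Placeholder: a real pipeline would send the prompt to the chosen LLM here.
    return "<model response>"

question = "What is the first-line drug for type 2 diabetes?"
prompt = build_prompt(question, retrieve(question))
print(prompt)
print(generate_answer(prompt))
```

In practice, the corpus would be an indexed snippet store built from vetted sources and kept up to date, and `generate_answer` would call the chosen LLM with the assembled prompt; grounding answers in retrieved, citable passages is what the cited studies report as the main driver of reduced hallucination.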
5. LLMs in Healthcare Administration
Title | Authors/Year | Key Findings | Limitations |
---|---|---|---|
A critical assessment of using ChatGPT for extracting structured data from clinical notes | [36] | ChatGPT-3.5 demonstrated high accuracy in extracting pathological classifications from lung cancer and pediatric osteosarcoma pathology reports, outperforming traditional NLP methods and achieving accuracy rates of 89% to 100% across different datasets. | The study only evaluated ChatGPT’s performance on two specific types of pathology reports (lung cancer and pediatric osteosarcoma), which may not be representative of its effectiveness across other medical domains or different types of clinical notes. |
Adapted large language models can outperform medical experts in clinical text summarization. | [38] | Summaries from the best-adapted LLMs were deemed either equivalent (45%) or superior (36%) to those produced by medical experts. | The study only focused on four specific types of clinical summarization tasks (radiology reports, patient questions, progress notes, and doctor–patient dialog), which may not encompass the full range of clinical documentation scenarios. |
Extracting symptoms from free-text responses using ChatGPT among COVID-19 cases in Hong Kong | [37] | GPT-4 achieved high specificity (0.947–1.000) for all symptoms and high sensitivity for common symptoms (0.853–1.000), with moderate sensitivity for less common symptoms (0.200–1.000) using zero-shot prompting (a sketch of the per-symptom metric computation follows the table below). | The performance evaluation was limited to common symptoms (>10% prevalence) and less common symptoms (2–10% prevalence), potentially missing rare but clinically significant symptoms. |
Exploring the potential of ChatGPT in medical dialogue summarization: a study on consistency with human preferences | [39] | ChatGPT’s summaries were favored more by human medical experts in manual evaluations, demonstrating better readability and overall quality. | The research relied heavily on automated metrics (ROUGE and BERTScore), which could be inadequate for evaluating LLM-generated medical summaries. |
Generative Artificial Intelligence to Transform Inpatient Discharge Summaries to Patient-Friendly Language and Format. | [40] | LLM-transformed discharge summaries were significantly more readable and understandable when compared to original summaries. | Small sample size of only 50 discharge summaries from a single institution (NYU Langone Health). |
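For extraction tasks such as the COVID-19 symptom study [37], performance is typically reported per symptom as sensitivity and specificity against clinician-annotated labels. A minimal sketch of that computation is given below; it assumes the LLM’s extractions and the gold annotations are already available as binary flags per case, and the example data are invented purely for illustration.

```python
# Per-symptom sensitivity and specificity for LLM-extracted symptom flags,
# computed against gold-standard clinician annotations. Data below are invented.

gold = [  # clinician-annotated symptom flags per case
    {"fever": 1, "cough": 1, "anosmia": 0},
    {"fever": 0, "cough": 1, "anosmia": 1},
    {"fever": 1, "cough": 0, "anosmia": 0},
]
predicted = [  # flags extracted from free text by the LLM
    {"fever": 1, "cough": 1, "anosmia": 0},
    {"fever": 0, "cough": 0, "anosmia": 1},
    {"fever": 1, "cough": 0, "anosmia": 1},
]

def per_symptom_metrics(gold, predicted, symptom):
    """Build a per-symptom confusion matrix and return (sensitivity, specificity)."""
    tp = fp = tn = fn = 0
    for g, p in zip(gold, predicted):
        if g[symptom] and p[symptom]:
            tp += 1
        elif not g[symptom] and p[symptom]:
            fp += 1
        elif not g[symptom] and not p[symptom]:
            tn += 1
        else:
            fn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

for symptom in ("fever", "cough", "anosmia"):
    sens, spec = per_symptom_metrics(gold, predicted, symptom)
    print(f"{symptom}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

The same confusion-matrix logic extends directly to the common versus less common symptom strata reported in the study.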
6. Mitigating Current LLM Limitations in Healthcare
7. Ethical Considerations and Regulatory Challenges
8. Conclusions and Future Directions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Conflicts of Interest
References
- OpenAI. GPT-4 Technical Report. 2023. Available online: https://openai.com/research/gpt-4 (accessed on 10 October 2024).
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems 30 (NIPS 2017), Annual Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
- Minaee, S.; Mikolov, T.; Nikzad, N.; Chenaghlu, M.; Socher, R.; Amatriain, X.; Gao, J. Large language models: A survey. arXiv 2024, arXiv:2402.06196. [Google Scholar]
- Yin, S.; Fu, C.; Zhao, S.; Li, K.; Sun, X.; Xu, T.; Chen, E. A survey on multimodal large language models. arXiv 2023, arXiv:2306.13549. [Google Scholar] [CrossRef]
- Zhao, W.X.; Zhou, K.; Li, J.; Tang, T.; Wang, X.; Hou, Y.; Min, Y.; Zhang, B.; Zhang, J.; Dong, Z. A survey of large language models. arXiv 2023, arXiv:2303.18223. [Google Scholar]
- Lemos, J.I.; Resstel, L.B.; Guimarães, F.S. Involvement of the prelimbic prefrontal cortex on cannabidiol-induced attenuation of contextual conditioned fear in rats. Behav. Brain Res. 2010, 207, 105–111. [Google Scholar] [CrossRef]
- Hendrycks, D.; Burns, C.; Basart, S.; Zou, A.; Mazeika, M.; Song, D.; Steinhardt, J. Measuring massive multitask language understanding. arXiv 2020, arXiv:2009.03300. [Google Scholar]
- Gilson, A.; Safranek, C.W.; Huang, T.; Socrates, V.; Chi, L.; Taylor, R.A.; Chartash, D. How Does ChatGPT Perform on the United States Medical Licensing Examination (USMLE)? The Implications of Large Language Models for Medical Education and Knowledge Assessment. JMIR Med. Educ. 2023, 9, e45312. [Google Scholar] [CrossRef]
- Nori, H.; King, N.; McKinney, S.M.; Carignan, D.; Horvitz, E. Capabilities of gpt-4 on medical challenge problems. arXiv 2023, arXiv:2303.13375. [Google Scholar]
- Clusmann, J.; Kolbinger, F.R.; Muti, H.S.; Carrero, Z.I.; Eckardt, J.-N.; Laleh, N.G.; Löffler, C.M.L.; Schwarzkopf, S.-C.; Unger, M.; Veldhuizen, G.P.; et al. The future landscape of large language models in medicine. Commun. Med. 2023, 3, 141. [Google Scholar] [CrossRef]
- Mehandru, N.; Miao, B.Y.; Almaraz, E.R.; Sushil, M.; Butte, A.J.; Alaa, A. Evaluating large language models as agents in the clinic. npj Digit. Med. 2024, 7, 84. [Google Scholar] [CrossRef]
- Benary, M.; Wang, X.D.; Schmidt, M.; Soll, D.; Hilfenhaus, G.; Nassir, M.; Sigler, C.; Knödler, M.; Keller, U.; Beule, D.; et al. Leveraging Large Language Models for Decision Support in Personalized Oncology. JAMA Netw. Open 2023, 6, e2343689. [Google Scholar] [CrossRef]
- Xiong, G.; Jin, Q.; Lu, Z.; Zhang, A. Benchmarking retrieval-augmented generation for medicine. arXiv 2024, arXiv:2402.13178. [Google Scholar]
- Gao, Y.; Xiong, Y.; Gao, X.; Jia, K.; Pan, J.; Bi, Y.; Dai, Y.; Sun, J.; Wang, H. Retrieval-augmented generation for large language models: A survey. arXiv 2023, arXiv:2312.10997. [Google Scholar]
- StatPearls [Database]; StatPearls Publishing: Treasure Island, FL, USA, 2024.
- Sandmann, S.; Riepenhausen, S.; Plagwitz, L.; Varghese, J. Systematic analysis of ChatGPT, Google search and Llama 2 for clinical decision support tasks. Nat. Commun. 2024, 15, 2050. [Google Scholar] [CrossRef]
- Abd-Alrazaq, A.; AlSaad, R.; Alhuwail, D.; Ahmed, A.; Healy, P.M.; Latifi, S.; Aziz, S.; Damseh, R.; Alabed Alrazak, S.; Sheikh, J. Large Language Models in Medical Education: Opportunities, Challenges, and Future Directions. JMIR Med. Educ. 2023, 1, 48291. [Google Scholar] [CrossRef]
- Skryd, A.; Lawrence, K. ChatGPT as a Tool for Medical Education and Clinical Decision-Making on the Wards: Case Study. JMIR Form. Res. 2024, 8, 51346. [Google Scholar] [CrossRef]
- Béchard, P.; Ayala, O.M. Reducing hallucination in structured outputs via Retrieval-Augmented Generation. arXiv 2024, arXiv:2404.08189. [Google Scholar]
- Loaiza-Bonilla, A.; Thaker, N.G.; Redjal, N.; Doria, C.; Showalter, T.; Penberthy, D.; Dicker, A.P.; Choudhri, A.; Williamson, S.; Shah, C.; et al. Large language foundation models encode clinical radiation oncology domain knowledge: Performance on the American College of Radiology Standardized Examination. J. Clin. Oncol. 2024, 42, e13585. [Google Scholar] [CrossRef]
- Benítez, T.M.; Xu, Y.; Boudreau, J.D.; Kow, A.W.C.; Bello, F.; Van Phuoc, L.; Wang, X.; Sun, X.; Leung, G.K.; Lan, Y.; et al. Harnessing the potential of large language models in medical education: Promise and pitfalls. J. Am. Med. Inf. Assoc. 2024, 31, 776–783. [Google Scholar] [CrossRef]
- Magalhães Araujo, S.; Cruz-Correia, R. Incorporating ChatGPT in Medical Informatics Education: Mixed Methods Study on Student Perceptions and Experiential Integration Proposals. JMIR Med. Educ. 2024, 20, 51151. [Google Scholar] [CrossRef]
- Ali, K.; Barhom, N.; Tamimi, F.; Duggal, M. ChatGPT-A double-edged sword for healthcare education? Implications for assessments of dental students. Eur. J. Dent. Educ. 2024, 28, 206–211. [Google Scholar] [CrossRef]
- Ow, G.; Rodman, A.; Stetson, G.V. MedEdMENTOR AI: Can artificial intelligence help medical education researchers select theoretical constructs? medRxiv 2023, medRxiv:2023.11.16.23298661. [Google Scholar]
- Klang, E.; Portugez, S.; Gross, R.; Kassif Lerner, R.; Brenner, A.; Gilboa, M.; Ortal, T.; Ron, S.; Robinzon, V.; Meiri, H.; et al. Advantages and pitfalls in utilizing artificial intelligence for crafting medical examinations: A medical education pilot study with GPT-4. BMC Med. Educ. 2023, 23, 772. [Google Scholar] [CrossRef]
- Cheung, B.H.H.; Lau, G.K.K.; Wong, G.T.C.; Lee, E.Y.P.; Kulkarni, D.; Seow, C.S.; Wong, R.; Co, M.T. ChatGPT versus human in generating medical graduate exam multiple choice questions-A multinational prospective study (Hong Kong S.A.R., Singapore, Ireland, and the United Kingdom). PLoS ONE 2023, 18, e0290691. [Google Scholar] [CrossRef]
- Siriwardhana, S.; Weerasekera, R.; Wen, E.; Kaluarachchi, T.; Rana, R.; Nanayakkara, S. Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering. Trans. Assoc. Comput. Linguist. 2023, 11, 1–17. [Google Scholar] [CrossRef]
- Wang, Y.; Ma, X.; Chen, W. Augmenting black-box llms with medical textbooks for clinical question answering. arXiv 2023, arXiv:2309.02233. [Google Scholar]
- Oniani, D.; Wu, X.; Visweswaran, S.; Kapoor, S.; Kooragayalu, S.; Polanska, K.; Wang, Y. Enhancing Large Language Models for Clinical Decision Support by Incorporating Clinical Practice Guidelines. arXiv 2024, arXiv:2401.11120. [Google Scholar]
- Hager, P.; Jungmann, F.; Holland, R.; Bhagat, K.; Hubrecht, I.; Knauer, M.; Vielhauer, J.; Makowski, M.; Braren, R.; Kaissis, G.; et al. Evaluation and mitigation of the limitations of large language models in clinical decision-making. Nat. Med. 2024, 30, 2613–2622. [Google Scholar] [CrossRef]
- MetaAI. Introducing Llama 3.1. 2024. Available online: https://ai.meta.com/blog/meta-llama-3-1/ (accessed on 10 October 2024).
- Marchi, F.; Bellini, E.; Iandelli, A.; Sampieri, C.; Peretti, G. Exploring the landscape of AI-assisted decision-making in head and neck cancer treatment: A comparative analysis of NCCN guidelines and ChatGPT responses. Eur. Arch. Otorhinolaryngol. 2024, 281, 2123–2136. [Google Scholar] [CrossRef]
- Sorin, V.; Klang, E.; Sklair-Levy, M.; Cohen, I.; Zippel, D.B.; Balint Lahat, N.; Konen, E.; Barash, Y. Large language model (ChatGPT) as a support tool for breast tumor board. npj Breast Cancer 2023, 9, 44. [Google Scholar] [CrossRef]
- Ćirković, A.; Katz, T. Exploring the Potential of ChatGPT-4 in Predicting Refractive Surgery Categorizations: Comparative Study. JMIR Form. Res. 2023, 28, 51798. [Google Scholar] [CrossRef]
- Zakka, C.; Shad, R.; Chaurasia, A.; Dalal, A.R.; Kim, J.L.; Moor, M.; Fong, R.; Phillips, C.; Alexander, K.; Ashley, E.; et al. Almanac—Retrieval-Augmented Language Models for Clinical Medicine. NEJM AI 2024, 1, 25. [Google Scholar] [CrossRef] [PubMed]
- Huang, J.; Yang, D.M.; Rong, R.; Nezafati, K.; Treager, C.; Chi, Z.; Wang, S.; Cheng, X.; Guo, Y.; Klesse, L.J.; et al. A critical assessment of using ChatGPT for extracting structured data from clinical notes. npj Digit. Med. 2024, 7, 024–01079. [Google Scholar] [CrossRef] [PubMed]
- Wei, W.I.; Leung, C.L.K.; Tang, A.; McNeil, E.B.; Wong, S.Y.S.; Kwok, K.O. Extracting symptoms from free-text responses using ChatGPT among COVID-19 cases in Hong Kong. Clin. Microbiol. Infect. 2024, 30, 8. [Google Scholar] [CrossRef] [PubMed]
- Van Veen, D.; Van Uden, C.; Blankemeier, L.; Delbrouck, J.-B.; Aali, A.; Bluethgen, C.; Pareek, A.; Polacin, M.; Reis, E.P.; Seehofnerová, A.; et al. Adapted large language models can outperform medical experts in clinical text summarization. Nat. Med. 2024, 30, 1134–1142. [Google Scholar] [CrossRef]
- Liu, Y.; Ju, S.; Wang, J. Exploring the potential of ChatGPT in medical dialogue summarization: A study on consistency with human preferences. BMC Med. Inf. Decis. Mak. 2024, 24, 024–02481. [Google Scholar] [CrossRef]
- Zaretsky, J.; Kim, J.M.; Baskharoun, S.; Zhao, Y.; Austrian, J.; Aphinyanaphongs, Y.; Gupta, R.; Blecker, S.B.; Feldman, J. Generative Artificial Intelligence to Transform Inpatient Discharge Summaries to Patient-Friendly Language and Format. JAMA Netw. Open 2024, 7, e240357. [Google Scholar] [CrossRef]
- Poulain, R.; Fayyaz, H.; Beheshti, R. Bias patterns in the application of LLMs for clinical decision support: A comprehensive study. arXiv 2024, arXiv:2404.15149. [Google Scholar]
- OpenAI. Introducing OpenAI o1-Preview. 2024. Available online: https://openai.com/index/introducing-openai-o1-preview/ (accessed on 12 December 2024).
- Xie, Y.; Wu, J.; Tu, H.; Yang, S.; Zhao, B.; Zong, Y.; Jin, Q.; Xie, C.; Zhou, Y. A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor? arXiv 2024, arXiv:2409.15277. [Google Scholar]
- Silver, D.; Huang, A.; Maddison, C.J.; Guez, A.; Sifre, L.; van den Driessche, G.; Schrittwieser, J.; Antonoglou, I.; Panneershelvam, V.; Lanctot, M.; et al. Mastering the game of Go with deep neural networks and tree search. Nature 2016, 529, 484–489. [Google Scholar] [CrossRef]
- Christophe, C.; Kanithi, P.K.; Munjal, P.; Raha, T.; Hayat, N.; Rajan, R.; Al-Mahrooqi, A.; Gupta, A.; Salman, M.U.; Gosal, G. Med42--Evaluating Fine-Tuning Strategies for Medical LLMs: Full-Parameter vs. Parameter-Efficient Approaches. arXiv 2024, arXiv:2404.14779. [Google Scholar]
Title | Authors/Year | Key Findings | Limitations |
---|---|---|---|
ChatGPT as a Tool for Medical Education and Clinical Decision-Making on the Wards: Case Study | [18] | ChatGPT showed potential for addressing medical knowledge gaps and building differential diagnoses during ward rounds | Single-site exploratory evaluation with a small sample conducted over only 7 days at one urban academic medical center, using only ChatGPT-3.5 and relying primarily on qualitative phenomenological inquiry methods for analysis. |
Large Language Models in Medical Education: Opportunities, Challenges, and Future Directions | [17] | LLMs offer a wide range of applications, including acting as virtual patients and tutors, generating medical cases, and creating personalized study plans | A perspective/opinion paper that primarily draws on professional experience rather than empirical evidence, lacking systematic data collection or analysis to support its conclusions about LLMs in medical education. |
Large language foundation models encode clinical radiation oncology domain knowledge: Performance on the American College of Radiology Standardized Examination | [20] | GPT-4-turbo performed best on clinical radiation oncology questions, outperforming some resident physicians | The study only evaluated performance on a single year’s (2021) ACR Radiation Oncology In-Training Exam, lacking longitudinal assessment across multiple exam versions. Additionally, the evaluation was limited to multiple-choice questions without assessing the models’ reasoning capabilities or ability to handle more complex clinical scenarios. |
Harnessing the potential of large language models in medical education: promise and pitfalls | [21] | LLMs like OpenAI’s ChatGPT can transform education by enhancing student learning and faculty innovation, though challenges include academic misconduct, AI overreliance, reduced critical thinking, content accuracy concerns, and impacts on teaching staff. | A narrative review without a systematic methodology for literature selection and analysis, lacking empirical data to support its conclusions. |
Incorporating ChatGPT in Medical Informatics Education: Mixed Methods Study on Student Perceptions and Experiential Integration Proposals | [22] | The study found that most students were satisfied with ChatGPT, citing benefits for content generation, brainstorming, and rewriting text, with proposals to integrate it into master’s courses for enhancing learning and assisting in various academic tasks. | A low number of questionnaire responses, which may affect the generalizability of the findings and the robustness of student perceptions regarding ChatGPT’s use in medical informatics education. |
ChatGPT-A double-edged sword for healthcare education? Implications for assessments of dental students | [23] | The study evaluated ChatGPT’s accuracy on various healthcare education assessments, finding it provided accurate responses to most text-based questions but struggled with image-based questions and critical literature appraisal, highlighting the need for educators to adapt teaching and assessments to integrate AI while mitigating dishonest use. | The study used only text-based questions without image processing capabilities, relied on the free version of ChatGPT with word count restrictions, and lacked validation of the assessment items’ quality or difficulty level before testing them with ChatGPT. |
MedEdMENTOR AI: Can artificial intelligence help medical education researchers select theoretical constructs? | [24] | MedEdMENTOR AI accurately recommended the actual theoretical constructs for 55% of qualitative studies from 24 core medical educational journals. | The study only evaluated MedEdMENTOR AI’s performance on theoretical construct recommendations for qualitative studies from a 6-month period in selected journals, with a relatively small sample size (53 studies), and only tested the system’s ability to match previously used theories rather than assessing the appropriateness or innovation of its recommendations. |
Advantages and pitfalls in utilizing artificial intelligence for crafting medical examinations: a medical education pilot study with GPT-4 | [25] | GPT-4 demonstrated the ability to rapidly generate a large number of multiple-choice questions for medical examinations with a low rate of outright errors (0.5%) but still required human expert review to address issues such as outdated terminology, demographic insensitivities, and methodological flaws in about 15% of the questions. | The study was limited to a single examination format (210 MCQs) based on an existing template, relied on only five specialist physicians for evaluation, and was conducted over a brief two-month period (March–April 2023). Additionally, the study lacked a comparison of time and resource efficiency between traditional question-writing methods and GPT-4-generated questions. |
ChatGPT versus human in generating medical graduate exam multiple choice questions-A multinational prospective study (Hong Kong S.A.R., Singapore, Ireland, and the United Kingdom) | [26] | ChatGPT demonstrated the ability to generate multiple-choice questions for medical graduate examinations that were comparable in quality to those created by university professoriate staff, with only minor differences in relevance, while producing these questions in a fraction of the time required by human examiners (a structural-validation sketch follows the table below). | The study used a relatively small sample size of only 50 MCQs per group, relied on just two human examiners for comparison, and was restricted to questions based on only two medical textbooks. Additionally, the wider range of scores in AI-generated questions suggests inconsistent quality, and the study did not assess the long-term reliability or validity of the AI-generated questions. |
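The exam-generation studies above [25,26] stress that model-drafted multiple-choice questions still require structural checks and expert content review before use. The sketch below covers only the automated structural step; the JSON response format (keys `stem`, `options`, `answer`), the prompt wording, and the sample item are assumptions made for illustration, not the cited studies’ actual workflow.

```python
# Sketch of basic structural validation for an LLM-drafted MCQ.
# The JSON schema (stem / options / answer) and the sample item are assumptions;
# any drafted item would still go to a subject-matter expert for content review.
import json

PROMPT_TEMPLATE = (
    "Write one multiple-choice question on {topic} for a medical graduate exam. "
    "Return JSON with keys 'stem', 'options' (exactly 5 strings), and 'answer' "
    "(the correct option, copied verbatim from 'options')."
)

def validate_mcq(raw_json: str) -> list[str]:
    """Return a list of structural problems; an empty list means the item passes."""
    problems = []
    try:
        item = json.loads(raw_json)
    except json.JSONDecodeError:
        return ["response is not valid JSON"]
    if not item.get("stem"):
        problems.append("missing question stem")
    options = item.get("options", [])
    if len(options) != 5:
        problems.append(f"expected 5 options, got {len(options)}")
    if len(set(options)) != len(options):
        problems.append("duplicate options")
    if item.get("answer") not in options:
        problems.append("answer key does not match any option")
    return problems

# Example: a fabricated model response containing one structural error.
draft = json.dumps({
    "stem": "Which drug is first-line for newly diagnosed type 2 diabetes?",
    "options": ["Metformin", "Insulin glargine", "Gliclazide", "Empagliflozin"],
    "answer": "Metformin",
})
print(validate_mcq(draft))  # -> ['expected 5 options, got 4']
```

Items passing such checks would still go to subject-matter experts, which is where the cited studies found problems such as outdated terminology and flawed distractors.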