Intelligent Support for Radiotherapy: A Review of Clinical Applications for Large Language Models
Abstract
1. Introduction
2. Literature Search and Selection
3. Automated Target Volume Contouring
4. Dose Prediction and Automation of RT Planning
4.1. DoseGNN Model
4.2. Multi-Agent Collaborative Automated Planning System
4.3. Multimodal Fusion Automated Planning Framework
4.4. Efficiency Analysis in Automated Planning/Hallucinations Mitigation
5. Patient Education and Communication
6. Clinical Decision Support
7. Information Extraction and Tumor Prognosis Application
7.1. Information Extraction
7.2. Prognosis Assessment
8. Limitations and Future Research Paths
- Restrictive study design undermines clinical extrapolation: Most foundational studies on LLMs in RT adopt small-sample, single-center, retrospective designs, introducing inherent selection and center-specific biases. Limited sample sizes (e.g., 35 patients for LLM-based automated CC RT planning [26]; 668 single-center prostate cancer cases for Medformer validation [13]) fail to capture the clinical diversity of tumor heterogeneity, patient variability, and equipment differences in real-world RT practice, leading to poor performance reproducibility. Additionally, models trained on single-center data (e.g., Radformer, developed on 2985 single-center head and neck cancer patients [16]) are prone to overfitting to local clinical workflows, delineation standards, and dose-planning preferences—factors that vary widely across institutions. GPT-Plan and GPT-RadPlan [23,24] lack large-scale prospective and multicenter validation, precluding verification of their long-term stability and allowing for overestimation of performance due to retrospective data biases. This represents the primary barrier to translating LLM research into cross-center clinical practice.
- Core technical bottlenecks pose clinical safety risks: “Hallucination” is the most critical technical risk for LLM application in RT. For instance, Gemini-1.5-Flash produced clinically infeasible CC RT plans due to hallucinations [26], which could lead to severe clinical consequences such as target underdosing or excessive normal tissue irradiation. In clinical decision support, some LLMs proposed guideline-inconsistent and potentially harmful recommendations, such as unnecessary amputation for sarcoma patients [34]. Comparable limitations have been reported elsewhere in the LLM literature [40,41]. These cases demonstrate that “hallucination” is not a theoretical concern but a technical flaw with direct potential for clinical harm. Furthermore, most LLM-RT models are customized for single tumor types or isolated clinical tasks and generalize poorly to complex scenarios (concurrent multi-cancer RT, rare tumor treatment, special populations) because representative training data are insufficient. These limitations introduce safety risks that are difficult to control and fall short of the high reliability requirements for medical AI in RT.
- Heterogeneous evaluation systems impede comparative validation and translation: A standardized, clinically oriented evaluation system for LLMs in RT is currently lacking. Existing studies differ substantially in metric selection: target delineation research uses either a single metric (DSC [17]) or a composite panel (DSC, IOU, HD95 [14]), while automated planning research prioritizes either DVH optimization [23] or gamma pass rates [26]. This heterogeneity precludes direct cross-study comparison of model performance. More importantly, evaluations rely heavily on technical surrogate indicators (e.g., DSC, MAE, gamma pass rates) with no established correlation to core RT clinical endpoints (e.g., treatment-related complication rates, local tumor control, long-term patient survival) [42]. This disconnect between technical optimization and clinical benefit limits the scientific validation of LLM clinical value and creates a barrier to regulatory approval and clinical translation.
- Non-technical barriers create systemic obstacles to clinical implementation: Even with technical/design improvements, LLM translation faces unresolved non-technical challenges in medical liability, regulation, data governance, and clinical governance. LLM-RT systems lack unified regulatory classification, approval, and access standards, as existing medical device frameworks fail to adapt to the iterative and cross-modal characteristics of LLMs. Previous studies have proposed the establishment of a standardized clinical governance system incorporating hierarchical review, strict quality control, and mandatory human verification [43,44]. It is also essential to clarify that clinicians bear the ultimate legal responsibility [43,44,45] and to strictly comply with regulatory requirements concerning the approval and classification of AI-powered medical devices [33]. RT clinical data (imaging, text, pathology) contain sensitive patient information, yet LLM training requires large-scale multicenter data sharing—a conflict unresolved by the absence of RT-specific standardized datasets and privacy-preserving computing frameworks. Finally, no full-lifecycle clinical governance system exists for LLM-based RT tools, including standardized operating procedures, mandatory manual verification nodes, or long-term performance monitoring, compromising the stability and safety of clinical applications.

Future research should prioritize the following directions:

- Technical optimization and performance enhancement: Address the “hallucination” issue through refined RAG and multi-stage self-reflection mechanisms to improve output accuracy, and overcome limitations of single-data-type models to enhance generalizability in complex clinical scenarios (e.g., multi-cancer concurrent RT, rare tumor treatments).
- Deepening clinical application scenarios: Explore LLM applications in RT quality control and multi-center collaborative quality assurance studies. Standardized workflows and intelligent verification will enhance treatment consistency.
- Data security and ethical framework development: Establish specialized datasets and annotation standards for RT, balancing data sharing with privacy protection. Formulate ethical guidelines and access criteria for clinical LLM applications to ensure compliance with medical safety regulations.
- Cross-disciplinary collaborative innovation: Strengthen cooperation among LLM researchers, RT physicists, clinicians, and computer scientists to develop specialized niche models (e.g., pediatric RT education models). Address complex challenges such as dynamic dose optimization and adaptive RT to provide robust technological support for precision, efficiency, and personalization in RT.
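
The technical surrogate indicators named in this section (DSC, IOU, MAE) are simple to compute, which partly explains their popularity despite the missing link to clinical endpoints. The following is a minimal sketch of the standard definitions, not the implementation used by any of the reviewed studies. It assumes contours are represented as sets of voxel indices and doses as equal-length sequences in Gy; the function names and the toy masks are hypothetical.

```python
from typing import Sequence, Set, Tuple

Voxel = Tuple[int, int, int]  # assumed representation: (x, y, z) voxel index


def dice_coefficient(pred: Set[Voxel], ref: Set[Voxel]) -> float:
    """DSC = 2|A ∩ B| / (|A| + |B|); taken as 1.0 when both masks are empty."""
    if not pred and not ref:
        return 1.0
    return 2.0 * len(pred & ref) / (len(pred) + len(ref))


def intersection_over_union(pred: Set[Voxel], ref: Set[Voxel]) -> float:
    """IOU = |A ∩ B| / |A ∪ B|; taken as 1.0 when both masks are empty."""
    if not pred and not ref:
        return 1.0
    return len(pred & ref) / len(pred | ref)


def mean_absolute_error(pred_dose: Sequence[float], ref_dose: Sequence[float]) -> float:
    """MAE between predicted and reference dose values sampled at the same points."""
    if len(pred_dose) != len(ref_dose) or len(pred_dose) == 0:
        raise ValueError("dose sequences must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(pred_dose, ref_dose)) / len(pred_dose)


if __name__ == "__main__":
    # Toy example: a 4 x 4 reference contour and a prediction shifted by one voxel.
    reference = {(x, y, 0) for x in range(4) for y in range(4)}
    predicted = {(x, y, 0) for x in range(1, 5) for y in range(4)}
    print(dice_coefficient(predicted, reference))         # 0.75
    print(intersection_over_union(predicted, reference))  # 0.6
    print(mean_absolute_error([50.0, 48.0, 20.0], [50.0, 50.0, 18.0]))
```

For any pair of masks the two overlap metrics are related by DSC = 2·IOU / (1 + IOU), so reporting both adds no independent information; neither says anything about where the disagreement lies relative to critical structures.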
9. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| LLMs | Large language models |
| RT | Radiotherapy |
| OARs | Organs at risk |
| LD | Linear dichroism |
| PTV | Planning target volume |
| DSC | Dice similarity coefficient |
| IOU | Intersection over union |
| HD95 | 95th percentile Hausdorff distance |
| SWIN | Shifted window |
| GPT-4 | Generative Pretrained Transformer 4 |
| AAPM | American Association of Physicists in Medicine |
| IMRT | Intensity-modulated radiotherapy |
| VMAT | Volumetric modulated arc therapy |
| MAE | Mean absolute error |
| Dmax | Maximum dose |
| Dmean | Mean dose |
| D95 | Dose received by 95% of the volume |
| D1 | Dose received by 1% of the volume |
| OPs | Optimisation parameters |
| TPS | Treatment planning system |
| SD | Standard deviation |
| RAG | Retrieval-augmented generation |
| MTB | Molecular Tumor Board |
| HR-CTV | High-risk clinical target volume |
| CC | Cervical cancer |
| CTV | Clinical target volume |
| GTV | Gross target volume |
| DVH | Dose–volume histogram |
| HI | Homogeneity index |
| ePRO | Electronic patient-reported outcome |
| AI | Artificial intelligence |
| RTOG | Radiation Therapy Oncology Group |
| CT | Computed tomography |
| MRI | Magnetic resonance imaging |
| PET-CT | Positron emission tomography–computed tomography |
| BO | Bayesian optimization |
| AUC | Area under the curve |
| MTB | Multinational Sarcoma Board |
References
1. Stapleton, S.; Jaffray, D.; Milosevic, M. Radiation effects on the tumor microenvironment: Implications for nanomedicine delivery. Adv. Drug Deliv. Rev. 2017, 109, 119–130.
2. Wang, K.; Tepper, J.E. Radiation therapy-associated toxicity: Etiology, management, and prevention. CA Cancer J. Clin. 2021, 71, 437–454.
3. Kui, X.; Liu, F.; Yang, M.; Wang, H.; Liu, C.; Huang, D.; Li, Q.; Chen, L.; Zou, B. A review of dose prediction methods for tumor radiation therapy. Meta-Radiology 2024, 2, 100057.
4. Breedveld, R.H.B.; Craft, D.; Van Haveren, R.; Heijmen, B. Multi-criteria optimization and decision-making in radiotherapy. Eur. J. Oper. Res. 2019, 277, 1–19.
5. Kyroudi, A.; Petersson, K.; Ozsahin, E.; Bourhis, J.; Bochud, F.; Moeckli, R. Exploration of clinical preferences in treatment planning of radiotherapy for prostate cancer using Pareto fronts and clinical grading analysis. Phys. Imaging Radiat. Oncol. 2020, 14, 82–86.
6. Jones, M.P.; Martin, J.; Foo, K.; Estoesta, P.; Holloway, L.; Jameson, M. The impact of contour variation on tumour control probability in anal cancer. Radiat. Oncol. 2018, 13, 97.
7. Vinod, S.K.; Jameson, M.G.; Min, M.; Holloway, L.C. Uncertainties in volume delineation in radiation oncology: A systematic review and recommendations for future studies. Radiother. Oncol. 2016, 121, 169–179.
8. Lewis, P.; Court, L.E.; Lievens, Y.; Aggarwal, A. Structure and processes of existing practice in radiotherapy peer review: A systematic review of the literature. Clin. Oncol. 2021, 33, 248–260.
9. Yin, S.; Luo, X.; Yang, Y.; Shao, Y.; Ma, L.; Lin, C.; Yang, Q.; Wang, D.; Luo, Y.; Mai, Z.; et al. Development and validation of a deep-learning model for detecting brain metastases on 3D post-contrast MRI: A multi-center multi-reader evaluation study. Neuro-Oncology 2022, 24, 1559–1570.
10. Teng, L.; Zhao, Z.; Huang, J.; Cao, Z.; Meng, R.; Shi, F.; Shen, D. Knowledge-Guided Prompt Learning for Lifespan Brain MR Image Segmentation. In Medical Image Computing and Computer Assisted Intervention; Springer: Cham, Switzerland, 2024.
11. Hua, R.; Huo, Q.; Gao, Y.; Sui, H.; Zhang, B.; Sun, Y.; Mo, Z.; Shi, F. Segmenting Brain Tumor Using Cascaded V-Nets in Multimodal MR Images. Front. Comput. Neurosci. 2020, 14, 9.
12. Muneer, A.; Waqas, M.; Saad, M.B.; Showkatian, E.; Bandyopadhyay, R.; Xu, H.; Li, W.; Chang, J.Y.; Liao, Z.; Haymaker, C.; et al. From Classical Machine Learning to Emerging Foundation Models: Review on Multimodal Data Integration for Cancer Research. arXiv 2025, arXiv:2507.09028v2.
13. Rajendran, P.; Chen, Y.; Qiu, L.; Niedermayr, T.; Liu, W.; Buyyounouski, M.; Bagshaw, H.; Han, B.; Yang, Y.; Kovalchuk, N.; et al. Autodelineation of Treatment Target Volume for Radiation Therapy Using Large Language Model-Aided Multimodal Learning. Int. J. Radiat. Oncol. Biol. Phys. 2025, 121, 230–240.
14. Zhang, J.; Huang, J.; Jin, S.; Lu, S. Vision-Language Models for Vision Tasks: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 5625–5644.
15. Paszke, A.; Lerer, A.; Killeen, T.; Antiga, L.; Yang, E.; Tejani, A.; Fang, L.; Gross, S.; Bradbury, J.; Lin, Z. PyTorch: An Imperative Style, High-Performance Deep Learning Library; Curran Associates Inc.: Red Hook, NY, USA, 2019; pp. 8–14.
16. Rajendran, P.; Yang, Y.; Niedermayr, T.R.; Gensheimer, M.; Beadle, B.; Le, Q.T.; Xing, L.; Dai, X. Large language model-augmented learning for auto-delineation of treatment targets in head-and-neck cancer radiotherapy. Radiother. Oncol. J. Eur. Soc. Ther. Radiol. Oncol. 2025, 205, 110740.
17. Oh, Y.; Park, S.; Byun, H.K.; Cho, Y.; Lee, I.J.; Kim, J.S.; Ye, J.C. LLM-driven multimodal target volume contouring in radiation oncology. Nat. Commun. 2024, 15, 9186.
18. Wang, J.; Zhang, J.; Yang, K.; Ghavidel, B.B.; Khajetash, B.; Sarikhani, A.; Houshyari, M.; Liu, T.; Lei, Y.; Tavakoli, M.; et al. Enhancing auto-contouring with large language model in high-dose rate brachytherapy for cervical cancers. Med. Phys. 2025, 52, e70034.
19. Liu, C.; Liu, Z.; Holmes, J.; Zhang, L.; Zhang, L.; Ding, Y.; Shu, P.; Wu, Z.; Dai, H.; Li, Y.; et al. Artificial general intelligence for radiation oncology. Meta-Radiology 2023, 1, 100045.
20. Moore, K.L.; Brame, R.S.; Low, D.A.; Mutic, S. Experience-Based Quality Control of Clinical Intensity-Modulated Radiotherapy Planning. Int. J. Radiat. Oncol. Biol. Phys. 2011, 81, 545–551.
21. Hansen, C.R.; Bertelsen, A.; Hazell, I.; Zukauskaite, R.; Gyldenkerne, N.; Johansen, J.; Eriksen, J.G.; Brink, C. Automatic treatment planning improves the clinical quality of head and neck cancer treatment plans. Clin. Transl. Radiat. Oncol. 2016, 1, 2–8.
22. Dong, Z.; Chen, Y.; Gay, H.; Hao, Y.; Hugo, G.D.; Samson, P.; Zhao, T. Large-language-model empowered 3D dose prediction for intensity-modulated radiotherapy. Med. Phys. 2025, 52, 619–632.
23. Wang, Q.; Wang, Z.; Li, M.; Ni, X.; Tan, R.; Zhang, W.; Wubulaishan, M.; Wang, W.; Yuan, Z.; Zhang, Z.; et al. A feasibility study of automating radiotherapy planning with large language model agents. Phys. Med. Biol. 2025, 70, 075007.
24. Liu, S.; Pastor-Serrano, O.; Chen, Y.; Gopaulchan, M.; Liang, W.; Buyyounouski, M.; Pollom, E.; Le, Q.T.; Gensheimer, M.; Dong, P.; et al. Automated radiotherapy treatment planning guided by GPT-4Vision. Phys. Med. Biol. 2025, 70, 155002.
25. Hanna, T.P.; King, W.D.; Thibodeau, S.; O’Sullivan, D.E.; Booth, C.M.; Sullivan, R.; Aggarwal, A. Mortality due to cancer treatment delay: Systematic review and meta-analysis. BMJ 2020, 371, m4087.
26. Wei, S.; Hu, A.; Liang, Y.; Yang, J.; Yu, L.; Li, W.; Yang, B.; Qiu, J. Feasibility study of automatic radiotherapy treatment planning for cervical cancer using a large language model. Radiat. Oncol. 2025, 20, 77.
27. Shinn, N.; Cassano, F.; Berman, E.; Gopinath, A.; Narasimhan, K.; Yao, S. Reflexion: Language Agents with Verbal Reinforcement Learning. Adv. Neural Inf. Process. Syst. 2023, 36, 8634–8652.
28. Dehelean, D.C.; Maier, S.H.; Altay-Langguth, A.; Nitschmann, A.; Schmeling, M.; Fleischmann, D.F.; Rogowski, P.; Trapp, C.; Corradini, S.; Belka, C.; et al. Evaluating large language models as an educational tool for meningioma patients: Patient and clinician perspectives. Radiat. Oncol. 2025, 20, 101.
29. Wawrzuta, D.; Napieralska, A.; Ludwikowska, K.; Jaruševičius, L.; Trofimoviča-Krasnorucka, A.; Rausis, G.; Szulc, A.; Pędziwiatr, K.; Poláchová, K.; Klejdysz, J.; et al. Large language models for pretreatment education in pediatric radiation oncology: A comparative evaluation study. Clin. Transl. Radiat. Oncol. 2025, 51, 100914.
30. Richlitzki, C.; Mansoorian, S.; Käsmann, L.; Stoleriu, M.G.; Kovacs, J.; Sienel, W.; Kauffmann-Guerrero, D.; Duell, T.; Schmidt-Hegemann, N.S.; Belka, C.; et al. Assessing ChatGPT’s Educational Potential in Lung Cancer Radiotherapy From Clinician and Patient Perspectives: Content Quality and Readability Analysis. JMIR Cancer 2025, 11, e69783.
31. Hung, T.K.W.; Kuperman, G.J.; Sherman, E.J.; Ho, A.L.; Weng, C.; Pfister, D.G.; Mao, J.J. Performance of Retrieval-Augmented Large Language Models to Recommend Head and Neck Cancer Clinical Trials. J. Med. Internet Res. 2024, 26, e60695.
32. Lammert, J.; Dreyer, T.; Mathes, S.; Kuligin, L.; Borm, K.J.; Schatz, U.A.; Kiechle, M.; Lörsch, A.M.; Jung, J.; Lange, S.; et al. Expert-Guided Large Language Models for Clinical Decision Support in Precision Oncology. JCO Precis. Oncol. 2024, 8, e2400478.
33. Miao, B.; Sun, Q.; Wang, P.; Shao, R.; Ding, Y.; Chen, Y.; Ying, R. Exploring the use of large language models for classification, clinical interpretation, and treatment recommendation in breast tumor patient records. Sci. Rep. 2025, 15, 31450.
34. Li, C.P.; Kalisa, A.T.; Roohani, S.; Hummedah, K.; Menge, F.; Reißfelder, C.; Albertsmeier, M.; Kasper, B.; Jakob, J.; Yang, C. The imitation game: Large language models versus multidisciplinary tumor boards: Benchmarking AI against 21 sarcoma centers from the ring trial. J. Cancer Res. Clin. Oncol. 2025, 151, 248.
35. Wei, S.; Hu, A.; Wang, Z.; Meng, X.; Yu, L.; Yang, B.; Qiu, J. RTPhy-ChatBot: A RAG-Based intelligent assistant for radiotherapy physics using LLaMA3 and AAPM reports. J. Appl. Clin. Med. Phys. 2025, 26, e70263.
36. Choi, H.S.; Song, J.Y.; Shin, K.H.; Chang, J.H.; Jang, B.S. Developing prompts from large language model for extracting clinical information from pathology and ultrasound reports in breast cancer. Radiat. Oncol. J. 2023, 41, 209–216.
37. Fink, M.A.; Bischoff, A.; Fink, C.A.; Jonas, M.M.; Dulz, K.L.; Heußel, C.P.; Kauczor, H.U.; Weber, T.F. Potential of ChatGPT and GPT-4 for Data Mining of Free-Text CT Reports on Lung Cancer. Radiology 2023, 308, e231362.
38. Chao, P.J.; Chang, C.H.; Wu, J.J.; Liu, Y.H.; Shiau, J.P.; Shih, H.H.; Lin, G.Z.; Lee, S.H.; Lee, T.F. Improving Prediction of Complications Post-Proton Therapy in Lung Cancer Using Large Language Models and Meta-Analysis. Cancer Control 2024, 31, 10732748241286749.
39. Liao, C.; Chu, C.; Lin, T.; Chou, T.; Tsai, M. Enhancing Patient Outcomes in Head and Neck Cancer Radiotherapy: Integration of Electronic Patient-Reported Outcomes and Artificial Intelligence-Driven Oncology Care Using Large Language Models. Cancers 2025, 17, 2345.
40. Hadi, M.U.; Al Tashi, Q.; Qureshi, R.; Shah, A.; Muneer, A.; Irfan, M.; Zafar, A.; Shaikh, M.B.; Akhtar, N.; Hassan, S.Z.; et al. A Survey on Large Language Models: Applications, Challenges, Limitations, and Practical Usage. TechRxiv 2023.
41. Muneer, A.; Zhang, K.; Hamdi, I.; Qureshi, R.; Waqas, M.; Fouad, S.; Ali, H.; Anwar, S.M.; Wu, J. Foundation Models in Biomedical Imaging: Turning Hype into Reality. arXiv 2025, arXiv:2512.15808.
42. Esmaeilzadeh, P. Challenges and strategies for wide-scale artificial intelligence (AI) deployment in healthcare practices: A perspective for healthcare organizations. Artif. Intell. Med. 2024, 151, 102861.
43. Yi, P.H.; Haver, H.L.; Jeudy, J.J.; Kim, W.; Kitamura, F.C.; Oluyemi, E.T.; Smith, A.D.; Moy, L.; Parekh, V.S. Best Practices for the Safe Use of Large Language Models and Other Generative AI in Radiology. Radiology 2025, 316, e241516.
44. D’Antonoli, T.A.; Stanzione, A.; Bluethgen, C.; Vernuccio, F.; Ugga, L.; Klontzas, M.E.; Cuocolo, R.; Cannella, R.; Koçak, B. Large language models in radiology: Fundamentals, applications, ethical considerations, risks, and future directions. Diagn. Interv. Radiol. 2024, 30, 80–90.
45. Zitu, M.M.; Le, T.D.; Duong, T.; Haddadan, S.; Garcia, M.; Amorrortu, R.; Zhao, Y.; Rollison, D.E.; Thieu, T. Large language models in cancer: Potentials, risks, and safeguards. BJR Artif. Intell. 2024, 2, ubae019.



| Research Type | Dataset Size | Category of Model | Ground Truth | Core Advantage | Process Time | Iterations |
|---|---|---|---|---|---|---|
| retrospective | 12 lung cancer IMRT cases and 5 CC VMAT cases | GPT-Plan [23] | D95 increased by 4.75%, HI decreased by 49.52%, and mean lung dose decreased by 14.31% (p < 0.05). The efficacy of the GPT-Plan was comparable to that of senior physicists; OAR protection was superior to that achieved by junior physicists in CC cases. | A pioneering multi-agent system for automated RT planning optimization | - | - |
| retrospective | 5 CC VMAT cases | GPT-Plan (with Retriever) [22] | The efficiency of the system is comparable to that of senior human planners. | Simultaneously enhances the stability and efficiency of optimization. | - | 2–4 (average 3.2) |
| retrospective | 17 prostate cancer VMAT cases and 13 head and neck cancer VMAT cases | GPT-RadPlan based on GPT-4V [24] | Reduced OAR doses by an average of 5 Gy (15% reduction in prostate cancer, 10–15% reduction in head and neck cancer) | No additional training required, with strong adaptability and excellent target coverage. Superior OAR protection. | - | 3–6 |
| retrospective | 35 cases of CC (VMAT) | Qwen-2.5-Max (Alibaba Cloud, Hangzhou, China) [26] | The gamma pass rate > 0.995 (2 mm/2%) | Rapid generation of an acceptable clinical plan | 16.3 ± 5.0 min | <18 |
| retrospective | 35 cases of CC (VMAT) | Gemini-1.5-Flash (Google, Mountain View, CA, USA) [26] | Gemini-1.5-Flash generated hallucinations, leading to confusion in OAR dose | Cannot generate a clinically acceptable plan | - | - |
| retrospective | 35 cases of CC (VMAT) | Llama-3.2 (Meta AI, Menlo Park, CA, USA) [26] | The gamma pass rate > 0.995 (2 mm/2%) | Rapid generation of an acceptable clinical plan | 9.8 ± 2.1 min | <11 |
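
The gamma pass rate reported above with 2 mm/2% criteria combines a distance-to-agreement tolerance with a dose-difference tolerance. The sketch below illustrates the idea on a 1D dose profile and is only a simplified illustration: it assumes global normalization to the maximum reference dose, searches sampled points only (no interpolation), and applies no low-dose threshold, whereas clinical tools typically work on 2D or 3D dose grids. The function name and the toy profiles are hypothetical and do not come from the cited study.

```python
import math
from typing import Sequence


def gamma_pass_rate(
    positions_mm: Sequence[float],
    reference_dose: Sequence[float],
    evaluated_dose: Sequence[float],
    dta_mm: float = 2.0,
    dose_tol_fraction: float = 0.02,
) -> float:
    """Fraction of reference points whose gamma index is <= 1 (1D, global criterion)."""
    n = len(positions_mm)
    if n == 0 or len(reference_dose) != n or len(evaluated_dose) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    max_ref = max(reference_dose)
    if max_ref <= 0:
        raise ValueError("reference dose must contain a positive value")
    # Global dose tolerance: a fixed fraction of the maximum reference dose.
    dose_tol = dose_tol_fraction * max_ref

    passed = 0
    for r_pos, r_dose in zip(positions_mm, reference_dose):
        # Gamma is the minimum combined distance/dose deviation over evaluated points.
        gamma_sq = min(
            ((e_pos - r_pos) / dta_mm) ** 2 + ((e_dose - r_dose) / dose_tol) ** 2
            for e_pos, e_dose in zip(positions_mm, evaluated_dose)
        )
        if math.sqrt(gamma_sq) <= 1.0:
            passed += 1
    return passed / n


if __name__ == "__main__":
    positions = [0.0, 1.0, 2.0, 3.0, 4.0]        # sample positions in mm
    reference = [10.0, 50.0, 100.0, 50.0, 10.0]  # reference profile
    shifted = [5.0, 10.0, 50.0, 100.0, 50.0]     # same profile shifted by 1 mm
    print(gamma_pass_rate(positions, reference, reference))  # 1.0
    print(gamma_pass_rate(positions, reference, shifted))    # 0.8
```

In the shifted example, four of the five reference points find a matching dose within 2 mm and pass; the last point has no evaluated counterpart within tolerance and fails, which is why the pass rate drops to 0.8.
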

| Research Type | Extraction Source | Dataset Size | Category of Model | Core Conclusions |
|---|---|---|---|---|
| Retrospective | Pathology report [36] | 340 breast cancer patients | GPT-3.5-turbo | The overall accuracy rate was 87.7%, and the lymphatic vessel infiltration extraction accuracy rate was 98.2%. |
| Retrospective | Radiology report [37] | 3523 lung cancer CT reports | GPT-4/ChatGPT | The GPT-4 model outperforms ChatGPT; GPT-4 achieves 98.6% accuracy in lesion parameter extraction and a 96% correct reporting extraction rate. |
| Retrospective | Studies [38] | 1569 articles | ChatGPT | Evaluation efficiency is 3229 times higher than manual assessment. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Fu, J.; Cheng, Y.; Li, Z.; Fu, J. Intelligent Support for Radiotherapy: A Review of Clinical Applications for Large Language Models. J. Clin. Med. 2026, 15, 2531. https://doi.org/10.3390/jcm15072531

