AI and Ethics: A Systematic Review of the Ethical Considerations of Large Language Model Use in Surgery Research
Abstract
1. Introduction
2. Materials and Methods
2.1. Search Strategy and Database Search
2.2. Study Eligibility and Selection Process
2.3. Data Collection and Analysis
3. Results
3.1. Characteristics of Included Studies
3.2. Ethical Considerations
3.3. Representation of Ethical Principles
4. Discussion
4.1. Representation of Ethical Principles
4.2. Ethical Concerns
4.2.1. Accuracy
4.2.2. Fabricated Content and “Hallucinations”
4.2.3. Informed Consent
4.2.4. Patient Privacy and Data Security
4.2.5. Bias and Inequity
4.2.6. Transparency
4.2.7. Responsibility and Liability
4.2.8. Authorship and Plagiarism
4.2.9. Patient Trust and Patient-Physician Relationship
4.2.10. Replacement of Physicians
4.3. LLMs and Other AI Advancements in Surgery
4.4. Limitations of the Study
4.5. Steps for Future Research
- Improving accuracy and reliability. Further LLM development is essential for improving accuracy. This will likely involve expanding training datasets to include reliable and comprehensive medical data, and updating those datasets continuously so that outputs remain aligned with current medical practice guidelines. Refinement of the underlying AI algorithms should improve accuracy further.
- Quantifying and mitigating bias. Robust methods need to be developed and validated for detecting and quantifying bias in LLM outputs, particularly with respect to patient demographics, socioeconomic status, and medical conditions. Effective strategies must be explored and implemented to mitigate bias in both LLM training data and algorithms, ensuring equitable and non-discriminatory medical advice and decision making.
- Transparency and explainability. Enhancing LLM transparency requires the development of interpretable AI frameworks that explain the reasoning behind LLM outputs and decisions. Users must be able to understand the limitations of LLMs if trust is to be built. Effective tools for visualizing and communicating LLM reasoning and limitations to patients and healthcare professionals should be investigated and adopted in order to foster informed decision making and collaboration.
- Responsibility and liability. Clear legal frameworks and guidelines that define responsibility and liability for LLM-related errors or adverse events in healthcare need to be established to ensure accountability and fairness. In-depth research is needed on the legal and ethical implications of LLM implementation, taking into account the perspectives of the various stakeholders and promoting responsible development and deployment.
- Patient trust and communication. The impact of LLM use on the patient-physician relationship needs to be investigated in order to identify strategies for maintaining trust, open communication, and patient autonomy when LLMs are involved in care. Ethical guidelines and best practices for communication among healthcare professionals, patients, and LLMs need to be developed to preserve informed consent, shared decision making, and optimal patient care.
- Long-term societal impact. Exploring the potential economic and social consequences of LLM integration in healthcare is critical. Concerns about job displacement and about equitable access to AI-powered healthcare solutions need to be addressed. It is essential to conduct ethical impact assessments of LLM implementation that consider broader societal implications and potential unintended consequences, and to formulate responsible governance frameworks.
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
AI | Artificial intelligence |
BERT | Bidirectional Encoder Representations from Transformers |
ChatGPT | Chat Generative Pre-Trained Transformer |
CINAHL | Cumulative Index to Nursing and Allied Health Literature |
EMBASE | Excerpta Medica Database |
HIPAA | Health Insurance Portability and Accountability Act |
HLEG | High-Level Expert Group |
LaMDA | Language Model for Dialogue Applications |
LLaMA | Large Language Model Meta AI |
LLM | Large language model |
NLP | Natural language processing |
PaLM | Pathways Language Model |
PRISMA | Preferred Reporting Items for Systematic Reviews and Meta-Analyses |
RoBERTa | Robustly Optimized BERT Pretraining Approach |
T5 | Text-To-Text Transfer Transformer |
Appendix A
Database | Search String |
---|---|
CINAHL (EBSCO) | ((“large language model*”) OR (“language model*”) OR (“OpenAI”) OR (“GPT”) OR (“chatbot”) OR ((“Google”) AND (“PaLM”)) OR ((“Google”) AND (“BERT”)) OR ((“Anthropic”) AND (“Claude”)) OR ((“Meta”) AND (“LLaMA”)) OR ((“Stanford”) AND (“Alpaca”))) AND ((“plastic surger*”) OR (“reconstructive surger*”) OR (“aesthetic surger*”) OR (“cosmetic surg*”) OR (“pediatric surger*”) OR (“orthopedic surger*”) OR (“otolaryngology surger*”) OR (“ENT surger*”) OR (“head and neck surger*”) OR (“cardiothoracic surger*”) OR (“thoracic surger*”) OR (“oncologic surger*”) OR (“surgical oncology”) OR (“burn surger*”) OR (“endocrine surger*”) OR (“ophthalmic surger*”) OR (“gynecological surger*”) OR (“vascular surger*”) OR (“colorectal surger*”) OR (“trauma surger*”) OR (“oral and maxillofacial surger*”) OR (“oromaxillofacial surger*”) OR (“transplant surger*”) OR (“bariatric surger*”) OR (“hand surger*”) OR (“hepatobiliary surger*”) OR (“minimally invasive surger*”) OR (“fetal surger*”) OR (“robotic surger*”) OR (“urology”) OR (“urologic surger*”) OR (“neurologic surger*”) OR (“neurosurg*”) OR (“elbow surger*”) OR (“spine surger*”) OR (“craniofacial surger*”) OR (“podiatric surger*”) OR (“surgical critical care”) OR (“organ transplant”) OR (“obstetrics and gynecology”) OR (“obstetrics and gynaecology”) OR (“OBGYN”) OR (“OB/GYN”) OR (“OB GYN”) OR (“surgery”) OR (“surgical”) OR (“surgeries”) OR (“surgeon*”)) |
EMBASE (Ovid) | ((“large language model*”) OR (“language model*”) OR (“OpenAI”) OR (“GPT”) OR (“chatbot”) OR ((“Google”) AND (“PaLM”)) OR ((“Google”) AND (“BERT”)) OR ((“Anthropic”) AND (“Claude”)) OR ((“Meta”) AND (“LLaMA”)) OR ((“Stanford”) AND (“Alpaca”))) AND ((“plastic surger*”) OR (“reconstructive surger*”) OR (“aesthetic surger*”) OR (“cosmetic surg*”) OR (“pediatric surger*”) OR (“orthopedic surger*”) OR (“otolaryngology surger*”) OR (“ENT surger*”) OR (“head and neck surger*”) OR (“cardiothoracic surger*”) OR (“thoracic surger*”) OR (“oncologic surger*”) OR (“surgical oncology”) OR (“burn surger*”) OR (“endocrine surger*”) OR (“ophthalmic surger*”) OR (“gynecological surger*”) OR (“vascular surger*”) OR (“colorectal surger*”) OR (“trauma surger*”) OR (“oral and maxillofacial surger*”) OR (“oromaxillofacial surger*”) OR (“transplant surger*”) OR (“bariatric surger*”) OR (“hand surger*”) OR (“hepatobiliary surger*”) OR (“minimally invasive surger*”) OR (“fetal surger*”) OR (“robotic surger*”) OR (“urology”) OR (“urologic surger*”) OR (“neurologic surger*”) OR (“neurosurg*”) OR (“elbow surger*”) OR (“spine surger*”) OR (“craniofacial surger*”) OR (“podiatric surger*”) OR (“surgical critical care”) OR (“organ transplant”) OR (“obstetrics and gynecology”) OR (“obstetrics and gynaecology”) OR (“OBGYN”) OR (“OB/GYN”) OR (“OB GYN”) OR (“surgery”) OR (“surgical”) OR (“surgeries”) OR (“surgeon*”)) |
PubMed (NIH) | ((“large language model*”) OR (“language model*”) OR (“OpenAI”) OR (“GPT”) OR (“chatbot”) OR ((“Google”) AND (“PaLM”)) OR ((“Google”) AND (“BERT”)) OR ((“Anthropic”) AND (“Claude”)) OR ((“Meta”) AND (“LLaMA”)) OR ((“Stanford”) AND (“Alpaca”))) AND ((“plastic surger*”) OR (“reconstructive surger*”) OR (“aesthetic surger*”) OR (“cosmetic surg*”) OR (“pediatric surger*”) OR (“orthopedic surger*”) OR (“otolaryngology surger*”) OR (“ENT surger*”) OR (“head and neck surger*”) OR (“cardiothoracic surger*”) OR (“thoracic surger*”) OR (“oncologic surger*”) OR (“surgical oncology”) OR (“burn surger*”) OR (“endocrine surger*”) OR (“ophthalmic surger*”) OR (“gynecological surger*”) OR (“vascular surger*”) OR (“colorectal surger*”) OR (“trauma surger*”) OR (“oral and maxillofacial surger*”) OR (“oromaxillofacial surger*”) OR (“transplant surger*”) OR (“bariatric surger*”) OR (“hand surger*”) OR (“hepatobiliary surger*”) OR (“minimally invasive surger*”) OR (“fetal surger*”) OR (“robotic surger*”) OR (“urology”) OR (“urologic surger*”) OR (“neurologic surger*”) OR (“neurosurg*”) OR (“elbow surger*”) OR (“spine surger*”) OR (“craniofacial surger*”) OR (“podiatric surger*”) OR (“surgical critical care”) OR (“organ transplant”) OR (“obstetrics and gynecology”) OR (“obstetrics and gynaecology”) OR (“OBGYN”) OR (“OB/GYN”) OR (“OB GYN”) OR (“surgery”) OR (“surgical”) OR (“surgeries”) OR (“surgeon*”)) |
Scopus (Elsevier) | ((“large language model*”) OR (“language model*”) OR (“OpenAI”) OR (“GPT”) OR (“chatbot”) OR ((“Google”) AND (“PaLM”)) OR ((“Google”) AND (“BERT”)) OR ((“Anthropic”) AND (“Claude”)) OR ((“Meta”) AND (“LLaMA”)) OR ((“Stanford”) AND (“Alpaca”))) AND ((“plastic surger*”) OR (“reconstructive surger*”) OR (“aesthetic surger*”) OR (“cosmetic surg*”) OR (“pediatric surger*”) OR (“orthopedic surger*”) OR (“otolaryngology surger*”) OR (“ENT surger*”) OR (“head and neck surger*”) OR (“cardiothoracic surger*”) OR (“thoracic surger*”) OR (“oncologic surger*”) OR (“surgical oncology”) OR (“burn surger*”) OR (“endocrine surger*”) OR (“ophthalmic surger*”) OR (“gynecological surger*”) OR (“vascular surger*”) OR (“colorectal surger*”) OR (“trauma surger*”) OR (“oral and maxillofacial surger*”) OR (“oromaxillofacial surger*”) OR (“transplant surger*”) OR (“bariatric surger*”) OR (“hand surger*”) OR (“hepatobiliary surger*”) OR (“minimally invasive surger*”) OR (“fetal surger*”) OR (“robotic surger*”) OR (“urology”) OR (“urologic surger*”) OR (“neurologic surger*”) OR (“neurosurg*”) OR (“elbow surger*”) OR (“spine surger*”) OR (“craniofacial surger*”) OR (“podiatric surger*”) OR (“surgical critical care”) OR (“organ transplant”) OR (“obstetrics and gynecology”) OR (“obstetrics and gynaecology”) OR (“OBGYN”) OR (“OB/GYN”) OR (“OB GYN”) OR (“surgery”) OR (“surgical”) OR (“surgeries”) OR (“surgeon*”)) |
Web of Science (Clarivate) | ((“large language model*”) OR (“language model*”) OR (“OpenAI”) OR (“GPT”) OR (“chatbot”) OR ((“Google”) AND (“PaLM”)) OR ((“Google”) AND (“BERT”)) OR ((“Anthropic”) AND (“Claude”)) OR ((“Meta”) AND (“LLaMA”)) OR ((“Stanford”) AND (“Alpaca”))) AND ((“plastic surger*”) OR (“reconstructive surger*”) OR (“aesthetic surger*”) OR (“cosmetic surg*”) OR (“pediatric surger*”) OR (“orthopedic surger*”) OR (“otolaryngology surger*”) OR (“ENT surger*”) OR (“head and neck surger*”) OR (“cardiothoracic surger*”) OR (“thoracic surger*”) OR (“oncologic surger*”) OR (“surgical oncology”) OR (“burn surger*”) OR (“endocrine surger*”) OR (“ophthalmic surger*”) OR (“gynecological surger*”) OR (“vascular surger*”) OR (“colorectal surger*”) OR (“trauma surger*”) OR (“oral and maxillofacial surger*”) OR (“oromaxillofacial surger*”) OR (“transplant surger*”) OR (“bariatric surger*”) OR (“hand surger*”) OR (“hepatobiliary surger*”) OR (“minimally invasive surger*”) OR (“fetal surger*”) OR (“robotic surger*”) OR (“urology”) OR (“urologic surger*”) OR (“neurologic surger*”) OR (“neurosurg*”) OR (“elbow surger*”) OR (“spine surger*”) OR (“craniofacial surger*”) OR (“podiatric surger*”) OR (“surgical critical care”) OR (“organ transplant”) OR (“obstetrics and gynecology”) OR (“obstetrics and gynaecology”) OR (“OBGYN”) OR (“OB/GYN”) OR (“OB GYN”) OR (“surgery”) OR (“surgical”) OR (“surgeries”) OR (“surgeon*”)) (Title) or ((“large language model*”) OR (“language model*”) OR (“OpenAI”) OR (“GPT”) OR (“chatbot”) OR ((“Google”) AND (“PaLM”)) OR ((“Google”) AND (“BERT”)) OR ((“Anthropic”) AND (“Claude”)) OR ((“Meta”) AND (“LLaMA”)) OR ((“Stanford”) AND (“Alpaca”))) AND ((“plastic surger*”) OR (“reconstructive surger*”) OR (“aesthetic surger*”) OR (“cosmetic surg*”) OR (“pediatric surger*”) OR (“orthopedic surger*”) OR (“otolaryngology surger*”) OR (“ENT surger*”) OR (“head and neck surger*”) OR (“cardiothoracic surger*”) OR (“thoracic surger*”) OR (“oncologic surger*”) OR (“surgical oncology”) OR (“burn surger*”) OR (“endocrine surger*”) OR (“ophthalmic surger*”) OR (“gynecological surger*”) OR (“vascular surger*”) OR (“colorectal surger*”) OR (“trauma surger*”) OR (“oral and maxillofacial surger*”) OR (“oromaxillofacial surger*”) OR (“transplant surger*”) OR (“bariatric surger*”) OR (“hand surger*”) OR (“hepatobiliary surger*”) OR (“minimally invasive surger*”) OR (“fetal surger*”) OR (“robotic surger*”) OR (“urology”) OR (“urologic surger*”) OR (“neurologic surger*”) OR (“neurosurg*”) OR (“elbow surger*”) OR (“spine surger*”) OR (“craniofacial surger*”) OR (“podiatric surger*”) OR (“surgical critical care”) OR (“organ transplant”) OR (“obstetrics and gynecology”) OR (“obstetrics and gynaecology”) OR (“OBGYN”) OR (“OB/GYN”) OR (“OB GYN”) OR (“surgery”) OR (“surgical”) OR (“surgeries”) OR (“surgeon*”)) (Abstract) |
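The search strings in this table share one Boolean shape: an OR-group of LLM terms (including vendor AND model pairs) joined by AND to an OR-group of surgical terms. As a minimal sketch of how a string of that shape could be assembled, the Python below rebuilds it from short term lists. The function names and the abbreviated term lists are illustrative assumptions rather than part of the review's methods, and straight quotes replace the typographic quotes printed in the table.

```python
# Illustrative sketch only: helper names and the shortened term lists are
# assumptions; the complete term lists are those printed in the table.

def quoted(term: str) -> str:
    """Wrap one search term as ("term")."""
    return f'("{term}")'


def or_group(parts: list[str]) -> str:
    """Join already-formatted parts with OR inside one pair of parentheses."""
    return "(" + " OR ".join(parts) + ")"


def build_query(llm_terms, vendor_model_pairs, surgery_terms) -> str:
    """Combine the LLM block and the surgery block with AND."""
    llm_parts = [quoted(t) for t in llm_terms]
    # Model names are paired with their vendor via AND, as in the table.
    llm_parts += [f"({quoted(v)} AND {quoted(m)})" for v, m in vendor_model_pairs]
    surgery_parts = [quoted(t) for t in surgery_terms]
    return f"{or_group(llm_parts)} AND {or_group(surgery_parts)}"


query = build_query(
    llm_terms=["large language model*", "GPT", "chatbot"],
    vendor_model_pairs=[("Google", "PaLM"), ("Anthropic", "Claude")],
    surgery_terms=["plastic surger*", "neurosurg*", "surgeon*"],
)
print(query)
```

Keeping the term lists separate from the assembly step makes it easier to apply an identical string to every database and to add field restrictions, such as the Title and Abstract tags used for Web of Science, in one place.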
Inclusion Criteria | Exclusion Criteria |
---|---|
Articles published between 1 January 2018 and 31 October 2023 (the date of the search). | Articles published before 1 January 2018. |
Articles written in English. | Articles not written in English. |
Articles published in a peer-reviewed medical/scientific journal. | Articles that are unpublished (pre-print), published in a non-peer-reviewed journal, or non-journal articles (meeting abstracts, conference proceedings, etc.). |
Full-text journal articles. | Unretrievable articles. |
Sufficient discussion of ethical limitations. | Insufficient discussion of ethics. |
Articles targeted toward a surgery audience and/or published in a surgical journal. | Articles neither targeted toward a surgery audience nor published in a surgical journal. |
Articles focused on LLMs. | Articles that are not focused on LLMs. |
| | Duplicate articles. |
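The eligibility criteria in this table can be read as a checklist applied to each retrieved record. The Python below is a minimal sketch of that checklist, not the authors' screening tool. The `Record` fields and reason labels are hypothetical, and the boolean flags stand in for judgments (for example, whether the discussion of ethics is sufficient) that reviewers would make by hand.

```python
from dataclasses import dataclass
from datetime import date

# Illustrative sketch only: field names and reason labels are hypothetical.
WINDOW_START = date(2018, 1, 1)
WINDOW_END = date(2023, 10, 31)


@dataclass
class Record:
    published: date
    language: str
    peer_reviewed_journal: bool
    full_text_retrievable: bool
    sufficient_ethics_discussion: bool
    surgical_audience_or_journal: bool
    llm_focused: bool
    is_duplicate: bool = False


def exclusion_reasons(r: Record) -> list[str]:
    """Return every exclusion criterion the record meets (empty list = include)."""
    checks = [
        (not (WINDOW_START <= r.published <= WINDOW_END), "outside publication window"),
        (r.language != "English", "not written in English"),
        (not r.peer_reviewed_journal, "not peer-reviewed journal article"),
        (not r.full_text_retrievable, "unretrievable"),
        (not r.sufficient_ethics_discussion, "insufficient discussion of ethics"),
        (not r.surgical_audience_or_journal, "not surgical audience or journal"),
        (not r.llm_focused, "not focused on LLMs"),
        (r.is_duplicate, "duplicate article"),
    ]
    return [reason for failed, reason in checks if failed]


eligible = Record(date(2023, 5, 1), "English", True, True, True, True, True)
preprint = Record(date(2017, 12, 31), "English", False, True, True, True, True,
                  is_duplicate=True)
print(exclusion_reasons(eligible))
print(exclusion_reasons(preprint))
```

Returning all applicable reasons, rather than stopping at the first, keeps enough detail to fill in a PRISMA-style flow diagram of exclusions.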
References
- Hamet, P.; Tremblay, J. Artificial intelligence in medicine. Metabolism 2017, 69, S36–S40.
- Laird, J.E.; Lebiere, C.; Rosenbloom, P.S. A Standard Model of the Mind: Toward a Common Computational Framework across Artificial Intelligence, Cognitive Science, Neuroscience, and Robotics. AI Mag. 2017, 38, 13–26.
- Mikolov, T.; Karafiát, M.; Burget, L.; Cernocký, J.; Khudanpur, S. Recurrent neural network based language model. In Proceedings of the Interspeech, Chiba, Japan, 26–30 September 2010; pp. 1045–1048.
- Jin, Z. Analysis of the Technical Principles of ChatGPT and Prospects for Pre-trained Large Models. In Proceedings of the 2023 IEEE 3rd International Conference on Information Technology, Big Data and Artificial Intelligence (ICIBA), Chongqing, China, 26–28 May 2023; pp. 1755–1758.
- OpenAI. ChatGPT. Available online: https://chat.openai.com/chat (accessed on 31 October 2023).
- Aljindan, F.K.; Shawosh, M.H.; Altamimi, L.; Arif, S.; Mortada, H. Utilization of ChatGPT-4 in Plastic and Reconstructive Surgery: A Narrative Review. Plast. Reconstr. Surg. Glob. Open 2023, 11, e5305.
- Gupta, R.; Park, J.B.; Bisht, C.; Herzog, I.; Weisberger, J.; Chao, J.; Chaiyasate, K.; Lee, E.S. Expanding Cosmetic Plastic Surgery Research with ChatGPT. Aesthetic Surg. J. 2023, 43, 930–937.
- Sharma, S.C.; Ramchandani, J.P.; Thakker, A.; Lahiri, A. ChatGPT in Plastic and Reconstructive Surgery. Indian J. Plast. Surg. 2023, 56, 320–325.
- Abi-Rafeh, J.; Xu, H.H.; Kazan, R.; Tevlin, R.; Furnas, H. Large Language Models and Artificial Intelligence: A Primer for Plastic Surgeons on the Demonstrated and Potential Applications, Promises, and Limitations of ChatGPT. Aesthetic Surg. J. 2024, 44, 329–343.
- Xiao, D.; Meyers, P.; Upperman, J.S.; Robinson, J.R. Revolutionizing Healthcare with ChatGPT: An Early Exploration of an AI Language Model’s Impact on Medicine at Large and its Role in Pediatric Surgery. J. Pediatr. Surg. 2023, 58, 2410–2415.
- Lebhar, M.S.; Velazquez, A.; Goza, S.; Hoppe, I.C. Dr. ChatGPT: Utilizing Artificial Intelligence in Surgical Education. Cleft Palate Craniofacial J. 2023; online ahead of print.
- Oh, N.; Choi, G.S.; Lee, W.Y. ChatGPT goes to the operating room: Evaluating GPT-4 performance and its potential in surgical education and training in the era of large language models. Ann. Surg. Treat. Res. 2023, 104, 269–273.
- Loftus, T.J.; Tighe, P.J.; Filiberto, A.C.; Efron, P.A.; Brakenridge, S.C.; Mohr, A.M.; Rashidi, P.; Upchurch, G.R., Jr.; Bihorac, A. Artificial Intelligence and Surgical Decision-making. JAMA Surg. 2020, 155, 148–158.
- Nazer, L.H.; Zatarah, R.; Waldrip, S.; Ke, J.X.C.; Moukheiber, M.; Khanna, A.K.; Hicklen, R.S.; Moukheiber, L.; Moukheiber, D.; Ma, H.; et al. Bias in artificial intelligence algorithms and recommendations for mitigation. PLoS Digit. Health 2023, 2, e0000278.
- Oleck, N.C.; Naga, H.I.; Nichols, D.S.; Morris, M.X.; Dhingra, B.; Patel, A. Navigating the Ethical Landmines of ChatGPT: Implications of Intelligent Chatbots in Plastic Surgery Clinical Practice. Plast. Reconstr. Surg. Glob. Open 2023, 11, e5290.
- Redrup Hill, E.; Mitchell, C.; Brigden, T.; Hall, A. Ethical and legal considerations influencing human involvement in the implementation of artificial intelligence in a clinical pathway: A multi-stakeholder perspective. Front. Digit. Health 2023, 5, 1139210.
- Alonso, A.; Siracuse, J.J. Protecting patient safety and privacy in the era of artificial intelligence. Semin. Vasc. Surg. 2023, 36, 426–429.
- Keskinbora, K.H. Medical ethics considerations on artificial intelligence. J. Clin. Neurosci. 2019, 64, 277–282.
- Jeyaraman, M.; Ramasubramanian, S.; Balaji, S.; Jeyaraman, N.; Nallakumarasamy, A.; Sharma, S. ChatGPT in action: Harnessing artificial intelligence potential and addressing ethical challenges in medicine, education, and scientific research. World J. Methodol. 2023, 13, 170–178.
- AI HLEG. Ethics Guidelines for Trustworthy Artificial Intelligence; High-Level Expert Group on Artificial Intelligence: Brussels, Belgium, 2019; p. 8.
- Beauchamp, T.L.; Childress, J.F. Principles of Biomedical Ethics, 8th ed.; Oxford University Press: New York, NY, USA, 2019.
- Paola, F.; Barten, S.S. An ‘ethics gap’ in writing about bioethics: A quantitative comparison of the medical and the surgical literature. J. Med. Ethics 1995, 21, 84–88.
- Wall, A.; Angelos, P.; Brown, D.; Kodner, I.J.; Keune, J.D. Ethics in surgery. Curr. Probl. Surg. 2013, 50, 99–134.
- Tung, T.; Organ, C.H. Ethics in surgery: Historical perspective. Arch. Surg. 2000, 135, 10–13.
- Ward, C. Ethics in surgery. Ann. R. Coll. Surg. Engl. 1994, 76, 223.
- Liebe, H.; Hunter, C.J. Ethical considerations of academic surgical research. Semin. Pediatr. Surg. 2021, 30, 151097.
- Cobianchi, L.; Verde, J.M.; Loftus, T.J.; Piccolo, D.; Dal Mas, F.; Mascagni, P.; Garcia Vazquez, A.; Ansaloni, L.; Marseglia, G.R.; Massaro, M.; et al. Artificial Intelligence and Surgery: Ethical Dilemmas and Open Issues. J. Am. Coll. Surg. 2022, 235, 268–275.
- Page, M.J.; Moher, D.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. PRISMA 2020 explanation and elaboration: Updated guidance and exemplars for reporting systematic reviews. BMJ 2021, 372, n160.
- Chung, K.C.; Pushman, A.G.; Bellfi, L.T. A systematic review of ethical principles in the plastic surgery literature. Plast. Reconstr. Surg. 2009, 124, 1711–1718.
- Chappell, A.G.; Kane, R.L.; Wood, S.M.; Wescott, A.B.; Chung, K.C. Representation of Ethics in the Plastic Surgery Literature: A Systematic Review. Plast. Reconstr. Surg. 2021, 148, 289e–298e.
- Seyferth, A.V.; Wood, S.M.; Kane, R.L.; Chung, K.C. Representation of Ethics in COVID-19 Research: A Systematic Review. Plast. Reconstr. Surg. 2022, 149, 1237–1244.
- Allen, J.W.; Earp, B.D.; Koplin, J.; Wilkinson, D. Consent-GPT: Is it ethical to delegate procedural consent to conversational AI? J. Med. Ethics 2023, 50, 77–83.
- Cocci, A.; Pezzoli, M.; Lo Re, M.; Russo, G.I.; Asmundo, M.G.; Fode, M.; Cacciamani, G.; Cimino, S.; Minervini, A.; Durukan, E. Quality of information and appropriateness of ChatGPT outputs for urology patients. Prostate Cancer Prostatic Dis. 2023, 27, 103–108.
- Javid, M.; Reddiboina, M.; Bhandari, M. Emergence of artificial generative intelligence and its potential impact on urology. Can. J. Urol. 2023, 30, 11588–11598.
- Li, W.; Fu, M.; Liu, S.; Yu, H. Revolutionizing Neurosurgery with GPT-4: A Leap Forward or Ethical Conundrum? Ann. Biomed. Eng. 2023, 51, 2105–2112.
- Li, W.; Zhang, Y.; Chen, F. ChatGPT in Colorectal Surgery: A Promising Tool or a Passing Fad? Ann. Biomed. Eng. 2023, 51, 1892–1897.
- Varas, J.; Coronel, B.V.; Villagrán, I.; Escalona, G.; Hernandez, R.; Schuit, G.; Durán, V.; Lagos-Villaseca, A.; Jarry, C.; Neyem, A.; et al. Innovations in surgical training: Exploring the role of artificial intelligence and large language models (LLM). Rev. Colégio Bras. Cir. 2023, 50, e2023360.
- Luo, S.; Deng, L.; Chen, Y.; Zhou, W.; Canavese, F.; Li, L. Revolutionizing pediatric orthopedics: GPT-4, a groundbreaking innovation or just a fleeting trend? Int. J. Surg. 2023, 109, 3694–3697.
- Park, I.; Joshi, A.S.; Javan, R. Potential role of ChatGPT in clinical otolaryngology explained by ChatGPT. Am. J. Otolaryngol. 2023, 44, 103873.
- Garcia Valencia, O.A.; Thongprayoon, C.; Jadlowiec, C.C.; Mao, S.A.; Miao, J.; Cheungpasitporn, W. Enhancing Kidney Transplant Care through the Integration of Chatbot. Healthcare 2023, 11, 2518.
- Reis, L.O. ChatGPT for medical applications and urological science. Int. Braz. J. Urol. 2023, 49, 652–656.
- Ramamurthi, A.; Are, C.; Kothari, A.N. From ChatGPT to Treatment: The Future of AI and Large Language Models in Surgical Oncology. Indian J. Surg. Oncol. 2023, 14, 537–539.
- Sahiner, B.; Chen, W.; Samala, R.K.; Petrick, N. Data drift in medical machine learning: Implications and potential remedies. Br. J. Radiol. 2023, 96, 20220878.
- Atallah, S.B.; Banda, N.R.; Banda, A.; Roeck, N.A. How large language models including generative pre-trained transformer (GPT) 3 and 4 will impact medicine and surgery. Tech. Coloproctol. 2023, 27, 609–614.
- Iannantuono, G.M.; Bracken-Clarke, D.; Floudas, C.S.; Roselli, M.; Gulley, J.L.; Karzai, F. Applications of large language models in cancer care: Current evidence and future perspectives. Front. Oncol. 2023, 13, 1268915.
- Roman, A.; Al-Sharif, L.; Gharyani, M.A.L. The Expanding Role of ChatGPT (Chat-Generative Pre-Trained Transformer) in Neurosurgery: A Systematic Review of Literature and Conceptual Framework. Cureus 2023, 15, e43502.
- Kunze, K.N.; Jang, S.J.; Fullerton, M.A.; Vigdorchik, J.M.; Haddad, F.S. What’s all the chatter about? Bone Jt. J. 2023, 105, 587–589.
- Laios, A.; Theophilou, G.; De Jong, D.; Kalampokis, E. The Future of AI in Ovarian Cancer Research: The Large Language Models Perspective. Cancer Control 2023, 30, 10732748231197915.
- Merrell, L.A.; Fisher, N.D.; Egol, K.A. Large Language Models in Orthopaedic Trauma: A Cutting-Edge Technology to Enhance the Field. J. Bone Jt. Surg. Am. 2023, 105, 1383–1387.
- Chen, T.C.; Kaminski, E.; Koduri, L.; Singer, A.; Singer, J.; Couldwell, M.; Delashaw, J.; Dumont, A.; Wang, A. Chat GPT as a Neuro-Score Calculator: Analysis of a Large Language Model’s Performance on Various Neurological Exam Grading Scales. World Neurosurg. 2023, 179, e342–e347.
- Jayakumar, P.; Oude Nijhuis, K.D.; Oosterhoff, J.H.F.; Bozic, K.J. Value-based Healthcare: Can Generative Artificial Intelligence and Large Language Models be a Catalyst for Value-based Healthcare? Clin. Orthop. Relat. Res. 2023, 481, 1890–1894.
- Kim, J.K.; Chua, M.; Rickard, M.; Lorenzo, A. ChatGPT and large language model (LLM) chatbots: The current state of acceptability and a proposal for guidelines on utilization in academic medicine. J. Pediatr. Urol. 2023, 19, 598–604.
- Tay, J.Q. ChatGPT and the future of plastic surgery research: Evolutionary tool or revolutionary force in academic publishing? Eur. J. Plast. Surg. 2023, 46, 643–644.
- Lim, B.; Seth, I.; Bulloch, G.; Xie, Y.; Hunter-Smith, D.J.; Rozen, W.M. Evaluating the efficacy of major language models in providing guidance for hand trauma nerve laceration patients: A case study on Google’s AI BARD, Bing AI, and ChatGPT. Plast. Aesthetic Res. 2023, 10, 43.
- Puladi, B.; Gsaxner, C.; Kleesiek, J.; Hölzle, F.; Röhrig, R.; Egger, J. The impact and opportunities of large language models like ChatGPT in oral and maxillofacial surgery: A narrative review. Int. J. Oral Maxillofac. Surg. 2023, 53, 78–88.
- Weidman, A.A.; Valentine, L.; Chung, K.C.; Lin, S.J. OpenAI’s ChatGPT and Its Role in Plastic Surgery Research. Plast. Reconstr. Surg. 2023, 151, 1111–1113.
- Liu, J.Y.; Zheng, J.Q.; Cai, X.T.; Wu, D.D.; Yin, C.L. A descriptive study based on the comparison of ChatGPT and evidence-based neurosurgeons. iScience 2023, 26, 107590.
- Hallock, G.R.; Hallock, G.G. ChatEd.Mgr.com/SAP. Ann. Plast. Surg. 2023, 91, 632–633.
- Lower, K.; Seth, I.; Lim, B.; Seth, N. ChatGPT-4: Transforming Medical Education and Addressing Clinical Exposure Challenges in the Post-Pandemic Era. Indian J. Orthop. 2023, 57, 1527–1544.
- Qu, R.W.; Qureshi, U.; Petersen, G.; Lee, S.C. Diagnostic and Management Applications of ChatGPT in Structured Otolaryngology Clinical Scenarios. OTO Open 2023, 7, e67.
- Kleebayoon, A.; Wiwanitkit, V. Letter: I Asked a ChatGPT to Write an Editorial About How We Can Incorporate Chatbots Into Neurosurgical Research and Patient Care. Neurosurgery 2023, 93, E77.
- Rawashdeh, B.; Kim, J.; AlRyalat, S.A.; Prasad, R.; Cooper, M. ChatGPT and Artificial Intelligence in Transplantation Research: Is It Always Correct? Cureus 2023, 15, e42150.
- Amann, J.; Blasimme, A.; Vayena, E.; Frey, D.; Madai, V.I. Explainability for artificial intelligence in healthcare: A multidisciplinary perspective. BMC Med. Inform. Decis. Mak. 2020, 20, 310.
- D’Amico, R.S.; White, T.G.; Shah, H.A.; Langer, D.J. In Reply: I Asked a ChatGPT to Write an Editorial about How We Can Incorporate Chatbots into Neurosurgical Research and Patient Care. Neurosurgery 2023, 93, E78.
- Liu, H.Y.; Alessandri-Bonetti, M.; Arellano, J.A.; Egro, F.M. Can ChatGPT be the Plastic Surgeon’s New Digital Assistant? A Bibliometric Analysis and Scoping Review of ChatGPT in Plastic Surgery Literature. Aesthetic Plast. Surg. 2023; online ahead of print.
- Palacios, J.F.; Bastidas, N. Man, or Machine? Artificial Intelligence Language Systems in Plastic Surgery. Aesthetic Surg. J. 2023, 43, NP918–NP923.
- Ray, P.P. Revisiting the need for the use of GPT in surgery and medicine. Tech. Coloproctol. 2023, 27, 959–960.
- Ishaaq, N.; Sohail, S.S. Correspondence on “Assessing the Accuracy of Responses by the Language Model ChatGPT to Questions Regarding Bariatric Surgery”. Obes. Surg. 2023, 33, 4159.
- Esplugas, M. The use of artificial intelligence (AI) to enhance academic communication, education and research: A balanced approach. J. Hand Surg. Eur. Vol. 2023, 48, 819–822.
- Lechien, J.R.; Gorton, A.; Robertson, J.; Vaira, L.A. Is ChatGPT-4 Accurate in Proofread a Manuscript in Otolaryngology–Head and Neck Surgery? Otolaryngol. Head Neck Surg. 2023; online ahead of print.
- Dutton, J.J. Artificial Intelligence and the Future of Computer-Assisted Medical Research and Writing. Ophthalmic Plast. Reconstr. Surg. 2023, 39, 203–205.
- Kuang, Y.R.; Zou, M.X.; Niu, H.Q.; Zheng, B.Y.; Zhang, T.L.; Zheng, B.W. ChatGPT encounters multiple opportunities and challenges in neurosurgery. Int. J. Surg. 2023, 109, 2886–2891.
- Lawson McLean, A. Artificial Intelligence in Surgical Documentation: A Critical Review of the Role of Large Language Models. Ann. Biomed. Eng. 2023, 51, 2641–2642.
- Najafali, D.; Reiche, E.; Camacho, J.M.; Morrison, S.D.; Dorafshar, A.H. Let’s Chat About Chatbots: Additional Thoughts on ChatGPT and Its Role in Plastic Surgery Along With Its Ability to Perform Systematic Reviews. Aesthetic Surg. J. 2023, 43, NP591–NP592.
- Seth, I.; Lower, K.; Bulloch, G.; Seth, N. Letter to the Editor: Editorial: Artificial Intelligence Applications and Scholarly Publication in Orthopaedic Surgery. Clin. Orthop. Relat. Res. 2023, 481, 1652–1653.
- Bassiri-Tehrani, B.; Cress, P.E. Unleashing the Power of ChatGPT: Revolutionizing Plastic Surgery and Beyond. Aesthetic Surg. J. 2023, 43, 1395–1399.
- Masic, I. Plagiarism in scientific publishing. Acta Inform. Med. 2012, 20, 208–213.
- Tools such as ChatGPT threaten transparent science; here are our ground rules for their use. Nature 2023, 613, 612.
- Thorp, H.H. ChatGPT is fun, but not an author. Science 2023, 379, 313.
- Seth, I.; Cox, A.; Xie, Y.; Bulloch, G.; Hunter-Smith, D.J.; Rozen, W.M.; Ross, R.J. Evaluating Chatbot Efficacy for Answering Frequently Asked Questions in Plastic Surgery: A ChatGPT Case Study Focused on Breast Augmentation. Aesthetic Surg. J. 2023, 43, 1126–1135.
- Uruthiralingam, U.; Rea, P.M. Augmented and Virtual Reality in Anatomical Education—A Systematic Review. Adv. Exp. Med. Biol. 2020, 1235, 89–101.
- Ayoub, A.; Pulijala, Y. The application of virtual reality and augmented reality in Oral & Maxillofacial Surgery. BMC Oral Health 2019, 19, 238.
- Mishra, R.; Narayanan, M.D.K.; Umana, G.E.; Montemurro, N.; Chaurasia, B.; Deora, H. Virtual Reality in Neurosurgery: Beyond Neurosurgical Planning. Int. J. Environ. Res. Public Health 2022, 19, 1719.
- Ghaednia, H.; Fourman, M.S.; Lans, A.; Detels, K.; Dijkstra, H.; Lloyd, S.; Sweeney, A.; Oosterhoff, J.H.F.; Schwab, J.H. Augmented and virtual reality in spine surgery, current applications and future potentials. Spine J. 2021, 21, 1617–1625.
- Van Leeuwen, F.W.B.; van der Hage, J.A. Where Robotic Surgery Meets the Metaverse. Cancers 2022, 14, 6161.
- Matwala, K.; Shakir, T.; Bhan, C.; Chand, M. The surgical metaverse. Cir. Esp. 2023; online ahead of print.
- Seddon, I.; Rosenberg, E.; Houston, S.K., 3rd. Future of virtual education and telementoring. Curr. Opin. Ophthalmol. 2023, 34, 255–260.
- Sun, P.; Zhao, S.; Yang, Y.; Liu, C.; Pan, B. How do Plastic Surgeons use the Metaverse: A Systematic Review. J. Craniofac. Surg. 2023, 34, 548–550.
- Kaddoura, S.; Al Husseiny, F. The rising trend of Metaverse in education: Challenges, opportunities, and ethical considerations. PeerJ Comput. Sci. 2023, 9, e1252.
- Lareyre, F.; Maresch, M.; Chaudhuri, A.; Raffort, J. Ethics and Legal Framework for Trustworthy Artificial Intelligence in Vascular Surgery. EJVES Vasc. Forum 2023, 60, 42–44.

Principle | Definition [21] | Examples |
---|---|---|
Autonomy | Respect for an individual’s right to informed medical decision making | Informed consent; full disclosure and discussion regarding the risks and benefits of surgical intervention; full disclosure and discussion regarding the involvement of an LLM in a patient’s care; respecting a patient’s right to privacy and confidentiality; and respecting a researcher’s right to have proper recognition and attribution for their work. |
Beneficence | Maximization of benefit to the patient (“do good”), while minimizing harm | Training of highly skilled, patient-focused surgeons; practice of evidence-based medicine; and supervision and verification to ensure LLM-generated content meets quality standards and is beneficial. |
Nonmaleficence | Avoidance of patient harm (“do no harm”) | Ongoing efforts by surgeons to minimize surgical complications; the avoidance of unnecessary procedures; the identification and rectification of inaccurate, incomplete, or outdated information that can lead to potentially harmful recommendations; and stringent security measures to maintain patient confidentiality and prevent harmful repercussions from unauthorized disclosures or data breaches. |
Justice | Fair distribution of healthcare resources | Conscious efforts to minimize bias that can widen healthcare disparities and inequalities; establishing infrastructure to support equitable resource allocation; and identifying and rectifying training dataset bias to avoid the production of biased content and recommendations. |

Characteristic | No. | % |
---|---|---|
Surgical specialty | | |
Plastic surgery | 14 | 26.4 |
Orthopedic surgery/foot and ankle surgery | 9 | 17.0 |
Neurosurgery | 7 | 13.2 |
Nonspecific/general surgery | 4 | 7.5 |
Urology | 4 | 7.5 |
Otolaryngology/oral and maxillofacial surgery | 4 | 7.5 |
Colorectal surgery | 3 | 5.7 |
Ophthalmology | 3 | 5.7 |
Surgical oncology | 3 | 5.7 |
Transplant surgery | 2 | 3.8 |
Pediatric surgery | 1 | 1.9 |
Obstetrics and gynecology | 1 | 1.9 |
Bariatric surgery | 1 | 1.9 |
Cited Large Language Models (Developer) | | |
ChatGPT (OpenAI) | 53 | 100.0 |
Bard (Google) | 11 | 20.8 |
LaMDA (Google) | 5 | 9.4 |
Bing Chat (Microsoft) | 4 | 7.5 |
PaLM/Med-PaLM-2 (Google) | 3 | 5.7 |
BERT (Google) | 3 | 5.7 |
T5 (Google) | 3 | 5.7 |
LLaMA (Meta) | 3 | 5.7 |
RoBERTa (Meta) | 2 | 3.8 |
Claude (Anthropic) | 2 | 3.8 |

Ethical Consideration | Most Relevant Ethical Principle(s) | No. | % |
---|---|---|---|
Accuracy | Beneficence, nonmaleficence | 45 | 84.9 |
Bias | Justice, nonmaleficence | 39 | 73.6 |
Privacy, security, confidentiality | Autonomy, nonmaleficence | 36 | 67.9 |
Responsibility and liability | Justice | 35 | 66.0 |
Supervision/human oversight | Beneficence, nonmaleficence | 23 | 43.4 |
Fabricated content/“hallucinations” | Nonmaleficence | 22 | 41.5 |
Authorship | Autonomy, justice | 21 | 39.6 |
Plagiarism | Autonomy, justice | 20 | 37.7 |
Obsolescence/outdated content | Nonmaleficence, beneficence | 19 | 35.8 |
Transparency | Autonomy, justice | 19 | 35.8 |
Inequity and healthcare disparities | Justice, nonmaleficence | 18 | 34.0 |
Reliability | Beneficence, nonmaleficence | 18 | 34.0 |
Replacement/substitution of physicians | Justice | 13 | 24.5 |
Patient trust and patient-physician relationship | Autonomy, beneficence | 13 | 24.5 |
Informed consent | Autonomy, nonmaleficence | 10 | 18.9 |
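
The percentages in the table above appear to be each count divided by the number of included studies. The denominator of 53 is an inference from the row that reports ChatGPT in 53 studies as 100.0%, not a figure stated in this table. A minimal Python sketch of that arithmetic, using a few counts copied from the table; the names `N_STUDIES` and `percent_of_studies` are illustrative and do not come from the article:

```python
# Assumed denominator: 53 included studies (inferred from "ChatGPT | 53 | 100.0").
N_STUDIES = 53


def percent_of_studies(count: int, total: int = N_STUDIES) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, 1)


# A few counts copied from the table of ethical considerations.
counts = {
    "Accuracy": 45,
    "Bias": 39,
    "Reliability": 18,
    "Informed consent": 10,
}

for consideration, count in counts.items():
    pct = percent_of_studies(count)
    print(f"{consideration}: {count}/{N_STUDIES} = {pct:.1f}%")
```

Because a single study can raise several considerations, the counts are not mutually exclusive and the percentages are not expected to sum to 100.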

Ethical Principle | No. | % |
---|---|---|
All | 3 | 5.7 |
Autonomy | 10 | 18.9 |
Nonmaleficence | 4 | 7.5 |
Justice | 4 | 7.5 |
Beneficence | 3 | 5.7 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Pressman, S.M.; Borna, S.; Gomez-Cabello, C.A.; Haider, S.A.; Haider, C.; Forte, A.J. AI and Ethics: A Systematic Review of the Ethical Considerations of Large Language Model Use in Surgery Research. Healthcare 2024, 12, 825. https://doi.org/10.3390/healthcare12080825