Comparative Evaluation of Artificial Intelligence Models for Contraceptive Counseling
Abstract
1. Introduction
2. Methods
3. Results
4. Comment
4.1. Principal Findings
4.2. Results in the Context of What Is Known
4.3. Clinical Implications
4.4. Research Implications
4.5. Strengths and Limitations
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Question | AAFP | GPT-4.0 | GPT-3.5 | Google Bard | Microsoft Bing |
---|---|---|---|---|---|
What forms of emergency contraception are effective? | 3.5 | 3 | 3.1 | 3.1 | 2.5 |
Are fertility awareness methods of contraception effective? | 3.4 | 3.3 | 3.5 | 3.4 | 2.5 |
What contraceptive methods are less safe for people with migraines? | 3.5 | 3.6 | 3.5 | 3.2 | 3.2 |
How long does long-acting reversible contraception remain effective? | 3.5 | 3.5 | 3.3 | 3.1 | 2.8 |
Can depot medroxyprogesterone acetate be self-administered subcutaneously? What are the effects on bone mineral density? | 3.4 | 3.3 | 3.6 | 4 | 2.9 |
What are contraception considerations in transgender and gender-diverse people with a uterus? | 3.4 | 3.2 | 3.4 | 3.3 | 3.3 |
Readability Metric | GPT-4.0 | GPT-3.5 | Google Bard | Microsoft Bing |
---|---|---|---|---|
Gunning–Fog Index | 13.80 | 13.32 | 10.10 | 13.91 |
Flesch–Kincaid Grade Level | 14.50 | 12.57 | 9.55 | 12.45 |
Automated Readability Index | 16.95 | 15.22 | 11.35 | 15.40 |
Coleman–Liau Index | 16.49 | 13.47 | 11.37 | 14.96 |
Average Reading Grade Level | 15.44 | 13.65 | 10.59 | 14.18 |
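
The Average Reading Grade Level row appears to be the arithmetic mean of the four readability indices listed above it. The following minimal sketch checks that reading against the tabulated values. The averaging rule and the half-up rounding to two decimals are assumptions inferred from the numbers, not a description of the authors' stated procedure, and the names `INDEX_SCORES`, `REPORTED_AVERAGE`, and `average_grade_level` are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Index values copied from the readability table, per model, in the order:
# Gunning-Fog, Flesch-Kincaid Grade Level, Automated Readability, Coleman-Liau.
INDEX_SCORES = {
    "GPT-4.0": ["13.80", "14.50", "16.95", "16.49"],
    "GPT-3.5": ["13.32", "12.57", "15.22", "13.47"],
    "Google Bard": ["10.10", "9.55", "11.35", "11.37"],
    "Microsoft Bing": ["13.91", "12.45", "15.40", "14.96"],
}

# Values from the "Average Reading Grade Level" row of the same table.
REPORTED_AVERAGE = {
    "GPT-4.0": Decimal("15.44"),
    "GPT-3.5": Decimal("13.65"),
    "Google Bard": Decimal("10.59"),
    "Microsoft Bing": Decimal("14.18"),
}


def average_grade_level(scores):
    """Arithmetic mean of the index scores, rounded half-up to two decimals.

    Assumption: the table's average is a plain mean of the four indices.
    Decimal is used so that the rounding is exact rather than float-based.
    """
    total = sum(Decimal(s) for s in scores)
    mean = total / Decimal(len(scores))
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


COMPUTED_AVERAGE = {
    model: average_grade_level(scores) for model, scores in INDEX_SCORES.items()
}

if __name__ == "__main__":
    for model, value in COMPUTED_AVERAGE.items():
        print(f"{model}: computed {value}, reported {REPORTED_AVERAGE[model]}")
```

Under these assumptions, the computed means agree with the reported row for all four models, which supports reading that row as a simple average of the four indices.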
Criteria | AAFP | GPT-4.0 | GPT-3.5 | Google Bard | Microsoft Bing |
---|---|---|---|---|---|
Was the question responded to? | 4.92 | 4.67 | 4.67 | 4.67 | 4.42 |
Was the response evidence based? | 5.00 | 2.83 | 3.17 | 2.83 | 2.92 |
Was the response complete, or did it contain extraneous information? | 3.33 | 2.75 | 2.67 | 2.67 | 2.42 |
Did the response refer the user to a healthcare provider or other resources? | 3.00 | 4.58 | 5.00 | 5.00 | 2.33 |
Does the response speak in absolutes? | 1.00 | 1.75 | 1.50 | 1.58 | 2.25 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Patel, A.V.; Jasani, S.; AlAshqar, A.; Doshi, R.H.; Amin, K.; Panakam, A.; Patil, A.; Sheth, S.S. Comparative Evaluation of Artificial Intelligence Models for Contraceptive Counseling. Digital 2025, 5, 10. https://doi.org/10.3390/digital5020010