Search Results (24)

Search Parameters:
Keywords = Google’s Bard

14 pages, 826 KiB  
Systematic Review
Current Applications of Chatbots Powered by Large Language Models in Oral and Maxillofacial Surgery: A Systematic Review
by Vincenzo Ronsivalle, Simona Santonocito, Umberto Cammarata, Eleonora Lo Muzio and Marco Cicciù
Dent. J. 2025, 13(6), 261; https://doi.org/10.3390/dj13060261 - 11 Jun 2025
Viewed by 588
Abstract
Background/Objectives: In recent years, interest has grown in the clinical applications of artificial intelligence (AI)-based chatbots powered by large language models (LLMs) in oral and maxillofacial surgery (OMFS). However, there are conflicting opinions regarding the accuracy and reliability of the information they provide, raising questions about their potential role as support tools for both clinicians and patients. This systematic review aims to analyze the current literature on the use of conversational agents powered by LLMs in the field of OMFS. Methods: The review was conducted following PRISMA guidelines and the Cochrane Handbook for Systematic Reviews of Interventions. Original studies published between 2023 and 2024 in peer-reviewed English-language journals were included. Sources were identified through major electronic databases, including PubMed, Scopus, Google Scholar, and Web of Science. The risk of bias in the included studies was assessed using the ROBINS-I tool, which evaluates potential bias in study design and conduct. Results: A total of 49 articles were identified, of which 4 met the inclusion criteria. One study showed that ChatGPT provided the most accurate responses compared to Microsoft Copilot (ex-Bing) and Google Gemini (ex-Bard) for questions related to OMFS. Other studies highlighted that ChatGPT-4 can assist surgeons with quick and relevant information, though responses may vary depending on the quality of the questions. Conclusions: Chatbots powered by LLMs can enhance efficiency and decision-making in OMFS routine clinical cases. However, based on the limited number of studies included in this review (four), their performance remains constrained in complex clinical scenarios and in managing emotionally sensitive patient interactions. Further research on clinical validation, prompt formulation, and ethical oversight is essential to safely integrating LLM technologies into OMFS practices. Full article
(This article belongs to the Special Issue Artificial Intelligence in Oral Rehabilitation)

9 pages, 707 KiB  
Article
Use of Artificial Intelligence in Vesicoureteral Reflux Disease: A Comparative Study of Guideline Compliance
by Mehmet Sarikaya, Fatma Ozcan Siki and Ilhan Ciftci
J. Clin. Med. 2025, 14(7), 2378; https://doi.org/10.3390/jcm14072378 - 30 Mar 2025
Viewed by 479
Abstract
Objective: This study aimed to evaluate the compliance of four different artificial intelligence applications (ChatGPT-4.0, Bing AI, Google Bard, and Perplexity) with the American Urological Association (AUA) vesicoureteral reflux (VUR) management guidelines. Materials and Methods: Fifty-one questions derived from the AUA guidelines were asked of each AI application. Two experienced paediatric surgeons independently scored the responses using a five-point Likert scale. Inter-rater agreement was analysed using the intraclass correlation coefficient (ICC). Results: ChatGPT-4.0, Bing AI, Google Bard, and Perplexity received mean scores of 4.91, 4.85, 4.75 and 4.70 respectively. There was no statistically significant difference between the accuracy of the AI applications (p = 0.223). The inter-rater ICC values were above 0.9 for all platforms, indicating a high level of consistency in scoring. Conclusions: The evaluated AI applications agreed highly with the AUA VUR management guidelines. These results suggest that AI applications may be a potential tool for providing guideline-based recommendations in paediatric urology. Full article
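For context, the intraclass correlation coefficient used above to quantify inter-rater agreement is, in its simplest one-way single-rater form, the share of total variance attributable to differences between the items being rated (the abstract does not state which ICC variant was applied):

\[ \mathrm{ICC} = \frac{MS_B - MS_W}{MS_B + (k - 1)\,MS_W} \]

Here \(MS_B\) and \(MS_W\) are the between- and within-subject mean squares from a one-way ANOVA and \(k\) is the number of raters (two in this study). Values above 0.9, as reported for all four platforms, are conventionally interpreted as excellent agreement.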
(This article belongs to the Special Issue Clinical Advances in Artificial Intelligence in Urology)

8 pages, 176 KiB  
Article
Comparative Evaluation of Artificial Intelligence Models for Contraceptive Counseling
by Anisha V. Patel, Sona Jasani, Abdelrahman AlAshqar, Rushabh H. Doshi, Kanhai Amin, Aisvarya Panakam, Ankita Patil and Sangini S. Sheth
Digital 2025, 5(2), 10; https://doi.org/10.3390/digital5020010 - 25 Mar 2025
Cited by 1 | Viewed by 1021
Abstract
Background: As digital health resources become increasingly prevalent, assessing the quality of information provided by publicly available AI tools is vital for evidence-based patient education. Objective: This study evaluates the accuracy and readability of responses from four large language models—ChatGPT 4.0, ChatGPT 3.5, Google Bard, and Microsoft Bing—in providing contraceptive counseling. Methods: A cross-sectional analysis was conducted using standardized contraception questions, established readability indices, and a panel of blinded OB/GYN physician reviewers comparing model responses to an AAFP benchmark. Results: The models varied in readability and evidence adherence; notably, ChatGPT 3.5 provided more evidence-based responses than GPT-4.0, although all outputs exceeded the recommended 6th-grade reading level. Conclusions: Our findings underscore the need for the further refinement of LLMs to balance clinical accuracy with patient-friendly language, supporting their role as a supplement to clinician counseling. Full article
8 pages, 1881 KiB  
Article
Responses of Artificial Intelligence Chatbots to Testosterone Replacement Therapy: Patients Beware!
by Herleen Pabla, Alyssa Lange, Nagalakshmi Nadiminty and Puneet Sindhwani
Soc. Int. Urol. J. 2025, 6(1), 13; https://doi.org/10.3390/siuj6010013 - 12 Feb 2025
Cited by 1 | Viewed by 937
Abstract
Background/Objectives: Using chatbots to seek healthcare information is becoming more popular. Misinformation and gaps in knowledge exist regarding the risks and benefits of testosterone replacement therapy (TRT). We aimed to assess and compare the quality and readability of responses generated by four AI chatbots. Methods: ChatGPT, Google Bard, Bing Chat, and Perplexity AI were asked the same eleven questions regarding TRT. The responses were evaluated by four reviewers using the DISCERN and Patient Education Materials Assessment Tool (PEMAT) questionnaires. Readability was assessed using the Readability Scoring system v2.0 to calculate the Flesch Reading Ease Score (FRES) and the Flesch–Kincaid Grade Level (FKGL). Kruskal–Wallis statistics were completed using GraphPad Prism V10.1.0. Results: Google Bard received the highest DISCERN (56.5) and PEMAT (96% understandability and 74% actionability) scores, demonstrating the highest quality. The readability scores ranged from eleventh-grade level to college level, with Perplexity outperforming the other chatbots. Significant differences were found in understandability between Bing and Google Bard, DISCERN scores between Bing and Google Bard, FRES between ChatGPT and Perplexity, and FKGL scoring between ChatGPT and Perplexity AI. Conclusions: ChatGPT and Google Bard were the top performers based on their quality, understandability, and actionability. Despite Perplexity scoring higher in readability, the generated text still maintained an eleventh-grade complexity. Perplexity stood out for its extensive use of citations; however, it offered repetitive answers despite the diversity of questions posed to it. Google Bard demonstrated a high level of detail in its answers, offering additional value through visual aids. These AI chatbots may improve as the underlying technology advances; until then, patients and providers should be aware of the strengths and shortcomings of each. Full article
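For reference, the two readability indices cited above are the standard Flesch formulas (the study used a scoring tool rather than computing them by hand), with counts taken over the full response text:

\[ \mathrm{FRES} = 206.835 - 1.015 \left( \frac{\text{total words}}{\text{total sentences}} \right) - 84.6 \left( \frac{\text{total syllables}}{\text{total words}} \right) \]

\[ \mathrm{FKGL} = 0.39 \left( \frac{\text{total words}}{\text{total sentences}} \right) + 11.8 \left( \frac{\text{total syllables}}{\text{total words}} \right) - 15.59 \]

Higher FRES values indicate easier reading, while FKGL maps the same counts onto a U.S. school grade level, which is why an eleventh-grade result still counts as relatively complex for patient-facing material.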

30 pages, 1127 KiB  
Article
ChatGPT-4 vs. Google Bard: Which Chatbot Better Understands the Italian Legislative Framework for Worker Health and Safety?
by Martina Padovan, Alessandro Palla, Riccardo Marino, Francesco Porciatti, Bianca Cosci, Francesco Carlucci, Gianluca Nerli, Armando Petillo, Gabriele Necciari, Letizia Dell’Amico, Vincenzo Carmelo Lucisano, Sergio Scarinci and Rudy Foddis
Appl. Sci. 2025, 15(3), 1508; https://doi.org/10.3390/app15031508 - 1 Feb 2025
Cited by 1 | Viewed by 2137
Abstract
Large language models, such as ChatGPT-4 and Google Bard, have demonstrated potential in healthcare. This study explores their utility in occupational medicine, a field where decisions rely on compliance with specific workplace health and safety regulations. A dataset of questions encompassing key occupational health topics derived from the Italian Legislative Decree 81/08, which governs workplace health and safety, was utilized. Responses from ChatGPT-4 with contextual information (ChatGPT-4+context) and Google Bard were evaluated for accuracy and completeness, with error categorization used to identify common issues. Subcategories of the topics of the regulations were analyzed as well. In total, 433 questions were included in our analysis. ChatGPT-4+context surpasses Bard in terms of accuracy and completeness in responses, with a lower error rate in the categories analyzed, except for the percentage of missed responses. In the subcategories analyzed, Bard is superior to ChatGPT-4+context only in the areas of the manual handling of loads and physical hazards. ChatGPT-4+context outperformed Bard in providing answers about Italian regulations on health and safety at work. This study highlights the potential and limitations of large language models as decision-support tools in occupational medicine and underscores the importance of regulatory context in enhancing their reliability. Full article

20 pages, 942 KiB  
Systematic Review
Evaluating the Performance of Artificial Intelligence-Based Large Language Models in Orthodontics—A Systematic Review and Meta-Analysis
by Farraj Albalawi, Sanjeev B. Khanagar, Kiran Iyer, Nora Alhazmi, Afnan Alayyash, Anwar S. Alhazmi, Mohammed Awawdeh and Oinam Gokulchandra Singh
Appl. Sci. 2025, 15(2), 893; https://doi.org/10.3390/app15020893 - 17 Jan 2025
Cited by 3 | Viewed by 1996
Abstract
Background: In recent years, there has been remarkable growth in AI-based applications in healthcare, with a significant breakthrough marked by the launch of large language models (LLMs) such as ChatGPT and Google Bard. Patients and health professional students commonly utilize these models due to their accessibility. The increasing use of LLMs in healthcare necessitates an evaluation of their ability to generate accurate and reliable responses. Objective: This study assessed the performance of LLMs in answering orthodontic-related queries through a systematic review and meta-analysis. Methods: A comprehensive search of PubMed, Web of Science, Embase, Scopus, and Google Scholar was conducted up to 31 October 2024. The quality of the included studies was evaluated using the Prediction model Risk of Bias Assessment Tool (PROBAST), and R Studio software (Version 4.4.0) was employed for meta-analysis and heterogeneity assessment. Results: Out of 278 retrieved articles, 10 studies were included. The most commonly used LLM was ChatGPT (10/10, 100% of papers), followed by Google’s Bard/Gemini (3/10, 30% of papers), and Microsoft’s Bing/Copilot AI (2/10, 20% of papers). Accuracy was primarily evaluated using Likert scales, while the DISCERN tool was frequently applied for reliability assessment. The meta-analysis indicated that the LLMs, such as ChatGPT-4 and other models, do not significantly differ in generating responses to queries related to the specialty of orthodontics. The forest plot revealed a Standard Mean Deviation of 0.01 [CI: 0.42–0.44]. No heterogeneity was observed between the experimental group (ChatGPT-3.5, Gemini, and Copilot) and the control group (ChatGPT-4). However, most studies exhibited a high PROBAST risk of bias due to the lack of standardized evaluation tools. Conclusions: ChatGPT-4 has been extensively used for a variety of tasks and has demonstrated advanced and encouraging outcomes compared to other LLMs, and thus can be regarded as a valuable tool for enhancing educational and learning experiences. While LLMs can generate comprehensive responses, their reliability is compromised by the absence of peer-reviewed references, necessitating expert oversight in healthcare applications. Full article
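As background for the pooled effect reported above, forest plots in meta-analyses of this kind typically summarize a standardized mean difference between the compared groups; in its simplest (Cohen's d) form, with group means, standard deviations, and sample sizes \((\bar{x}_i, s_i, n_i)\), it is

\[ d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}, \qquad s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \]

A pooled value near zero, as reported here for the experimental models versus the ChatGPT-4 control, indicates essentially no detectable difference in response quality.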

37 pages, 1529 KiB  
Article
Differences in User Perception of Artificial Intelligence-Driven Chatbots and Traditional Tools in Qualitative Data Analysis
by Boštjan Šumak, Maja Pušnik, Ines Kožuh, Andrej Šorgo and Saša Brdnik
Appl. Sci. 2025, 15(2), 631; https://doi.org/10.3390/app15020631 - 10 Jan 2025
Cited by 1 | Viewed by 3563
Abstract
Qualitative data analysis (QDA) tools are essential for extracting insights from complex datasets. This study investigates researchers’ perceptions of the usability, user experience (UX), mental workload, trust, task complexity, and emotional impact of three tools: Taguette 1.4.1 (a traditional QDA tool), ChatGPT (GPT-4, December 2023 version), and Gemini (formerly Google Bard, December 2023 version). Participants (N = 85), Master’s students from the Faculty of Electrical Engineering and Computer Science with prior experience in UX evaluations and familiarity with AI-based chatbots, performed sentiment analysis and data annotation tasks using these tools, enabling a comparative evaluation. The results show that AI tools were associated with lower cognitive effort and more positive emotional responses compared to Taguette, which caused higher frustration and workload, especially during cognitively demanding tasks. Among the tools, ChatGPT achieved the highest usability score (SUS = 79.03) and was rated positively for emotional engagement. Trust levels varied, with Taguette preferred for task accuracy and ChatGPT rated highest in user confidence. Despite these differences, all tools performed consistently in identifying qualitative patterns. These findings suggest that AI-driven tools can enhance researchers’ experiences in QDA while emphasizing the need to align tool selection with specific tasks and user preferences. Full article
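For readers unfamiliar with the usability figure quoted above, the System Usability Scale (SUS) score is derived from ten 5-point items: positively worded odd-numbered items contribute their rating minus one, negatively worded even-numbered items contribute five minus their rating, and the total is scaled to a 0–100 range:

\[ \mathrm{SUS} = 2.5 \left[ \sum_{i \in \{1,3,5,7,9\}} (x_i - 1) + \sum_{i \in \{2,4,6,8,10\}} (5 - x_i) \right] \]

Scores around 68 are conventionally treated as average usability, so ChatGPT's 79.03 sits comfortably above that benchmark.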

35 pages, 3798 KiB  
Article
An AI-Based Evaluation Framework for Smart Building Integration into Smart City
by Mustafa Muthanna Najm Shahrabani and Rasa Apanaviciene
Sustainability 2024, 16(18), 8032; https://doi.org/10.3390/su16188032 - 13 Sep 2024
Cited by 8 | Viewed by 4191
Abstract
The integration of smart buildings (SBs) into smart cities (SCs) is critical to urban development, with the potential to improve SCs’ performance. Artificial intelligence (AI) applications have emerged as a promising tool to enhance SB and SC development. The authors apply an AI-based methodology, particularly Large Language Models of OpenAI ChatGPT-3 and Google Bard as AI experts, to uniquely evaluate 26 criteria that represent SB services across five SC infrastructure domains (energy, mobility, water, waste management, and security), emphasizing their contributions to the integration of SB into SC and quantifying their impact on the efficiency, resilience, and environmental sustainability of SC. The framework was then validated through two rounds of the Delphi method, leveraging human expert knowledge and an iterative consensus-building process. The framework’s efficiency in analyzing complicated information and generating important insights is demonstrated via five case studies. These findings contribute to a deeper understanding of the effects of SB services on SC infrastructure domains, highlighting the intricate nature of SC, as well as revealing areas that require further integration to realize the SC performance objectives. Full article

18 pages, 494 KiB  
Article
Impact of Motivation Factors for Using Generative AI Services on Continuous Use Intention: Mediating Trust and Acceptance Attitude
by Sangbum Kang, Yongjoo Choi and Boyoung Kim
Soc. Sci. 2024, 13(9), 475; https://doi.org/10.3390/socsci13090475 - 9 Sep 2024
Cited by 11 | Viewed by 9042
Abstract
This study aims to empirically analyze the relationship between the motivational factors of generative AI users and the intention to continue using the service. Accordingly, the motives of users who use generative AI services are defined as individual, social, and technical motivation factors. The research verified the effect of these factors on the intention to continue using the services and tested the mediating effect of trust and acceptance attitude. An online survey was conducted among users of language-based generative AI services such as OpenAI’s ChatGPT, Google Bard, Microsoft Bing, and Meta’s Llama, and a structural equation analysis was conducted on a total of 356 responses. The analysis showed that individual, social, and technical motivational factors all had a positive (+) effect on trust and on the attitude toward accepting generative AI services. Among them, individual motivations such as self-efficacy, innovation orientation, and playful desire had the greatest influence on the formation of the acceptance attitude, while social factors had the greatest influence on trust in the use of generative AI services. In other words, it was confirmed that social reputation or awareness directly affects trust in the usability of generative AI. Full article
(This article belongs to the Special Issue Technology, Digital Transformation and Society)

10 pages, 2432 KiB  
Article
Replies to Queries in Gynecologic Oncology by Bard, Bing and the Google Assistant
by Edward J. Pavlik, Dharani D. Ramaiah, Taylor A. Rives, Allison L. Swiecki-Sikora and Jamie M. Land
BioMedInformatics 2024, 4(3), 1773-1782; https://doi.org/10.3390/biomedinformatics4030097 - 24 Jul 2024
Cited by 2 | Viewed by 1452
Abstract
When women receive a diagnosis of a gynecologic malignancy, they can have questions about their diagnosis or treatment that can result in voice queries to virtual assistants for more information. Recent advancement in artificial intelligence (AI) has transformed the landscape of medical information accessibility. The Google virtual assistant (VA) outperformed Siri, Alexa and Cortana in voice queries presented prior to the explosive implementation of AI in early 2023. The efforts presented here focus on determining if advances in AI in the last 12 months have improved the accuracy of Google VA responses related to gynecologic oncology. Previous questions were utilized to form a common basis for queries prior to 2023 and responses in 2024. Correct answers were obtained from the UpToDate medical resource. Responses related to gynecologic oncology were obtained using Google VA, as well as the generative AI chatbots Google Bard/Gemini and Microsoft Bing-Copilot. The AI narrative responses varied in length and positioning of answers within the response. Google Bard/Gemini achieved an 87.5% accuracy rate, while Microsoft Bing-Copilot reached 83.3%. In contrast, the Google VA’s accuracy in audible responses improved from 18% prior to 2023 to 63% in 2024. While the accuracy of the Google VA has improved in the last year, it underperformed Google Bard/Gemini and Microsoft Bing-Copilot so there is considerable room for further improved accuracy. Full article
(This article belongs to the Special Issue Feature Papers in Computational Biology and Medicine)

23 pages, 736 KiB  
Review
A Systematic Review and Comprehensive Analysis of Pioneering AI Chatbot Models from Education to Healthcare: ChatGPT, Bard, Llama, Ernie and Grok
by Ketmanto Wangsa, Shakir Karim, Ergun Gide and Mahmoud Elkhodr
Future Internet 2024, 16(7), 219; https://doi.org/10.3390/fi16070219 - 22 Jun 2024
Cited by 13 | Viewed by 12228
Abstract
AI chatbots have emerged as powerful tools for providing text-based solutions to a wide range of everyday challenges. Selecting the appropriate chatbot is crucial for optimising outcomes. This paper presents a comprehensive comparative analysis of five leading chatbots: ChatGPT, Bard, Llama, Ernie, and Grok. The analysis is based on a systematic review of 28 scholarly articles. The review indicates that ChatGPT, developed by OpenAI, excels in educational, medical, humanities, and writing applications but struggles with real-time data accuracy and lacks open-source flexibility. Bard, powered by Google, leverages real-time internet data for problem solving and shows potential in competitive quiz environments, albeit with performance variability and inconsistencies in responses. Llama, an open-source model from Meta, demonstrates significant promise in medical contexts, natural language processing, and personalised educational tools, yet it requires substantial computational resources. Ernie, developed by Baidu, specialises in Chinese language tasks, thus providing localised advantages that may not extend globally due to restrictive policies. Grok, developed by Xai and still in its early stages, shows promise in providing engaging, real-time interactions, humour, and mathematical reasoning capabilities, but its full potential remains to be evaluated through further development and empirical testing. The findings underscore the context-dependent utility of each model and the absence of a singularly superior chatbot. Future research should expand to include a wider range of fields, explore practical applications, and address concerns related to data privacy, ethics, security, and the responsible deployment of these technologies. Full article

19 pages, 239 KiB  
Article
Google Bard and ChatGPT in Orthopedics: Which Is the Better Doctor in Sports Medicine and Pediatric Orthopedics? The Role of AI in Patient Education
by Riccardo Giorgino, Mario Alessandri-Bonetti, Matteo Del Re, Fabio Verdoni, Giuseppe M. Peretti and Laura Mangiavini
Diagnostics 2024, 14(12), 1253; https://doi.org/10.3390/diagnostics14121253 - 13 Jun 2024
Cited by 13 | Viewed by 2348
Abstract
Background: This study evaluates the potential of ChatGPT and Google Bard as educational tools for patients in orthopedics, focusing on sports medicine and pediatric orthopedics. The aim is to compare the quality of responses provided by these natural language processing (NLP) models, addressing concerns about the potential dissemination of incorrect medical information. Methods: Ten ACL- and flat foot-related questions from a Google search were presented to ChatGPT-3.5 and Google Bard. Expert orthopedic surgeons rated the responses using the Global Quality Score (GQS). The study minimized bias by clearing chat history before each question, maintaining respondent anonymity and employing statistical analysis to compare response quality. Results: ChatGPT-3.5 and Google Bard yielded good-quality responses, with average scores of 4.1 ± 0.7 and 4 ± 0.78, respectively, for sports medicine. For pediatric orthopedics, Google Bard scored 3.5 ± 1, while the average score for responses generated by ChatGPT was 3.8 ± 0.83. In both cases, no statistically significant difference was found between the platforms (p = 0.6787, p = 0.3092). Despite ChatGPT’s responses being considered more readable, both platforms showed promise for AI-driven patient education, with no reported misinformation. Conclusions: ChatGPT and Google Bard demonstrate significant potential as supplementary patient education resources in orthopedics. However, improvements are needed for increased reliability. The study underscores the evolving role of AI in orthopedics and calls for continued research to ensure a conscientious integration of AI in healthcare education. Full article
(This article belongs to the Special Issue Artificial Intelligence in Orthopedic Surgery and Sport Medicine)
12 pages, 1249 KiB  
Article
Comparative Analysis of Artificial Intelligence Virtual Assistant and Large Language Models in Post-Operative Care
by Sahar Borna, Cesar A. Gomez-Cabello, Sophia M. Pressman, Syed Ali Haider, Ajai Sehgal, Bradley C. Leibovich, Dave Cole and Antonio Jorge Forte
Eur. J. Investig. Health Psychol. Educ. 2024, 14(5), 1413-1424; https://doi.org/10.3390/ejihpe14050093 - 15 May 2024
Cited by 7 | Viewed by 2748
Abstract
In postoperative care, patient education and follow-up are pivotal for enhancing the quality of care and satisfaction. Artificial intelligence virtual assistants (AIVA) and large language models (LLMs) like Google BARD and ChatGPT-4 offer avenues for addressing patient queries using natural language processing (NLP) techniques. However, the accuracy and appropriateness of the information vary across these platforms, necessitating a comparative study to evaluate their efficacy in this domain. We conducted a study comparing AIVA (using Google Dialogflow) with ChatGPT-4 and Google BARD, assessing the accuracy, knowledge gap, and response appropriateness. AIVA demonstrated superior performance, with significantly higher accuracy (mean: 0.9) and lower knowledge gap (mean: 0.1) compared to BARD and ChatGPT-4. Additionally, AIVA’s responses received higher Likert scores for appropriateness. Our findings suggest that specialized AI tools like AIVA are more effective in delivering precise and contextually relevant information for postoperative care compared to general-purpose LLMs. While ChatGPT-4 shows promise, its performance varies, particularly in verbal interactions. This underscores the importance of tailored AI solutions in healthcare, where accuracy and clarity are paramount. Our study highlights the necessity for further research and the development of customized AI solutions to address specific medical contexts and improve patient outcomes. Full article

19 pages, 311 KiB  
Article
Will Artificial Intelligence Affect How Cultural Heritage Will Be Managed in the Future? Responses Generated by Four genAI Models
by Dirk H. R. Spennemann
Heritage 2024, 7(3), 1453-1471; https://doi.org/10.3390/heritage7030070 - 11 Mar 2024
Cited by 19 | Viewed by 7021
Abstract
Generative artificial intelligence (genAI) language models have become firmly embedded in public consciousness. Their abilities to extract and summarise information from a wide range of sources in their training data have attracted the attention of many scholars. This paper examines how four genAI large language models (ChatGPT, GPT4, DeepAI, and Google Bard) responded to prompts, asking (i) whether artificial intelligence would affect how cultural heritage will be managed in the future (with examples requested) and (ii) what dangers might emerge when relying heavily on genAI to guide cultural heritage professionals in their actions. The genAI systems provided a range of examples, commonly drawing on and extending the status quo. Without a doubt, AI tools will revolutionise the execution of repetitive and mundane tasks, such as the classification of some classes of artifacts, or allow for the predictive modelling of the decay of objects. Important examples were used to assess the purported power of genAI tools to extract, aggregate, and synthesize large volumes of data from multiple sources, as well as their ability to recognise patterns and connections that people may miss. An inherent risk in the ‘results’ presented by genAI systems is that the presented connections are ‘artifacts’ of the system rather than being genuine. Since present genAI tools are unable to purposively generate creative or innovative thoughts, it is left to the reader to determine whether any text that is provided by genAI that is out of the ordinary is meaningful or nonsensical. Additional risks identified by the genAI systems were that some cultural heritage professionals might use AI systems without the required level of AI literacy and that overreliance on genAI systems might lead to a deskilling of general heritage practitioners. Full article
12 pages, 249 KiB  
Project Report
ChatGPT and Bard in Plastic Surgery: Hype or Hope?
by Ania Labouchère and Wassim Raffoul
Surgeries 2024, 5(1), 37-48; https://doi.org/10.3390/surgeries5010006 - 16 Jan 2024
Cited by 4 | Viewed by 3036
Abstract
Online artificial intelligence (AI) tools have recently gained in popularity. So-called “generative AI” chatbots unlock new opportunities to access vast realms of knowledge when being prompted by users. Here, we test the capabilities of two such AIs in order to determine the benefits for plastic surgery while also assessing the potential risks. Future developments are outlined. We used the online portals of OpenAI’s ChatGPT (version 3.5) and Google’s Bard to ask a set of questions and give specific commands. The results provided by the two tools were compared and analyzed by a committee. For professional plastic surgeons, we found that ChatGPT and Bard can be of help when it comes to conducting scientific reviews and helping with scientific writing but are of limited use due to the superficiality of their answers in specific domains. For medical students, in addition to the above, they provide useful educational material with respect to surgical methods and exam preparation. For patients, they can help when it comes to preparing for an intervention, weighing the risks and benefits, while providing guidance on optimal post-operative care. ChatGPT and Bard open widely accessible data to every internet user. While they might create a sense of “magic” due to their chatbot interfaces, they nonetheless can help to increase productivity. For professional surgeons, they produce superficial answers—for now—albeit providing help with scientific writing and literature reviews. For medical students, they are great tools to deepen their knowledge about specific topics such as surgical methods and exam preparation. For patients, they can help in translating complicated medical jargon into understandable lingo and provide support for pre-operative as well as post-operative care. Such AI tools should be used cautiously, as their answers are not always precise or accurate, and should always be used in combination with expert medical guidance. Full article