Unlocking the Potentials of Large Language Models in Orthodontics: A Scoping Review

Zheng, Jie; Ding, Xiaoqian; Pu, Jingya Jane; Chung, Sze Man; Ai, Qi Yong H.; Hung, Kuo Feng; Shan, Zhiyi

doi:10.3390/bioengineering11111145

Open Access Review

by Jie Zheng ¹, Xiaoqian Ding ², Jingya Jane Pu ³, Sze Man Chung ², Qi Yong H. Ai ⁴, Kuo Feng Hung ⁵,* and Zhiyi Shan ²,*

¹ Department of Biomedical Sciences, City University of Hong Kong, Hong Kong, China
² Division of Paediatric Dentistry and Orthodontics, Faculty of Dentistry, The University of Hong Kong, Hong Kong, China
³ Division of Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong, China
⁴ Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong, China
⁵ Applied Oral Science & Community Dental Care, Faculty of Dentistry, The University of Hong Kong, Hong Kong, China
* Authors to whom correspondence should be addressed.

Bioengineering 2024, 11(11), 1145; https://doi.org/10.3390/bioengineering11111145

Submission received: 9 October 2024 / Revised: 30 October 2024 / Accepted: 11 November 2024 / Published: 13 November 2024

(This article belongs to the Special Issue AI-Powered Diagnosis and Treatment Plans in Dentistry and Orofacial Fields)

Abstract

(1) Background: In recent years, large language models (LLMs) such as ChatGPT have gained significant attention in various fields, including dentistry. This scoping review aims to examine the current applications and explore potential uses of LLMs in the orthodontic domain, shedding light on how they might improve dental healthcare. (2) Methods: We carried out a comprehensive search in five electronic databases, namely PubMed, Scopus, Embase, ProQuest and Web of Science. Two authors independently screened articles and performed data extraction according to the eligibility criteria, following the PRISMA-ScR guideline. The main findings from the included articles were synthesized and analyzed narratively. (3) Results: A total of 706 articles were retrieved, and 12 papers were eventually included. The applications of LLMs include improving diagnostic and treatment efficiency in orthodontics as well as enhancing communication with patients. (4) Conclusions: There is emerging research in countries worldwide on the use of LLMs in orthodontics, suggesting an upward trend in their acceptance within this field. However, the application of LLMs remains at an early stage, with a noticeable lack of extensive studies and of tailored products that address specific clinical needs.

Keywords: large language models; orthodontics; scoping review; generative AI; LLMs; chatbot; ChatGPT; GPT; artificial intelligence

1. Introduction

As a sub-branch of computer science, the study of artificial intelligence (AI) began as early as 1943 [1], although the term “artificial intelligence” was not introduced until 1956 [2]. AI is a broad field encompassing computer programs designed to imitate human learning and reasoning, as well as technologies such as machine learning, deep learning, and natural language processing. It is gradually becoming capable of performing tasks that traditionally require human intelligence, significantly improving work efficiency [3,4,5]. The development of AI has brought significant changes across many fields: in finance [6], manufacturing [7], and transportation [8,9], it has enhanced efficiency and driven innovation. Similarly, in the medical field, the applications of AI are extensive and diverse. In the 1970s, the development of the MYCIN system marked a significant advance in using AI to diagnose bacterial infections and recommend antibiotic treatments. Since then, a growing number of studies have used AI to analyze medical images and manage electronic medical records [10,11,12,13]. In recent years, the maturation of AI technologies such as machine learning and deep learning has further promoted the development of the medical field [14,15,16,17,18].

LLMs are a type of advanced AI model built on neural network architectures and deep learning techniques. They undergo pre-training and fine-tuning on large-scale text data, including articles and internet content, which allows them to perform a variety of natural language processing tasks such as text generation and question answering [19,20,21]. LLMs have demonstrated remarkable abilities in understanding human language and generating human-like dialogue, leading to their widespread use [22,23]. Some of these models, such as GPT-4 (OpenAI, 2023), Bard (Google, 2023), and Llama (Meta AI, 2023), underpin generative AI applications that produce human-like text and responses and have been applied in a broad range of situations [24,25,26,27]. Although these models have shown notable potential in dialogue and question-answering tasks, their content accuracy and ethical issues still require strict supervision and management [28,29]. Moreover, the financial, educational, and medical sectors are increasingly adopting LLM-based applications, such as virtual tutors and automated medical documentation [30,31].

Numerous studies have shown that LLMs and their related software are widely used in the medical field. For example, LLMs can be employed for routine communication with patients to provide relatively professional responses that enhance patient engagement and satisfaction [32]. They also assist doctors in analyzing large amounts of clinical data to support decision making and improve overall healthcare outcomes [33]. Additionally, LLMs play an important role in the education of medical students by enhancing learning efficiency through personalized and interactive educational tools [34,35,36]. Although LLMs have been widely used in many medical fields, their research and application in orthodontics remain relatively limited. Therefore, there is still a lack of comprehensive understanding of their use in clinical practice and scientific research in orthodontics.

Orthodontics is a specialized branch of dentistry, and orthodontic treatment is extensively utilized in clinical practice. The work in [37] indicates that malocclusion is currently considered one of the significant factors affecting oral health, with a highly variable prevalence among children and adolescents, estimated to range from 39% to 93%. Orthodontic treatment provides notable intervention effects in correcting this condition [38] and in addressing certain facial asymmetries [39]. Orthodontics primarily concentrates on diagnosing, preventing, and correcting misaligned teeth and jaws, so it faces unique challenges and requirements that differ from those of other medical fields. In terms of diagnosis, orthodontics needs to combine clinical examinations with various imaging modalities to comprehensively analyze the relationships between teeth, bones, and soft tissues. Additionally, orthodontic treatment plans are highly individualized: they depend not only on the specific situation of each patient but also on the long treatment period and the patient’s psychological factors [40,41,42]. LLMs offer substantial potential to enhance patient comprehension of treatment plans and can assist orthodontists in analyzing and interpreting imaging examinations. They could also monitor patient treatment progress and provide timely reminders for patient evaluations. Thus, the application of LLMs in orthodontics has significant potential and deserves in-depth research and exploration.

This scoping review intends to systematically collate the currently published literature on the application of LLMs in orthodontics to provide a comprehensive overview covering the applications, advantages, and challenges of LLMs in orthodontics. Through identifying gaps in current research, this review also aims to provide potential directions for future research and clinical practice.

2. Materials and Methods

In this scoping review, we performed a comprehensive review and synthesis of the published literature following the PRISMA-ScR guidelines [43] to ensure transparency and rationality in the whole process. Our research question was as follows: What are the applications of LLMs in orthodontics so far? All studies related to implementing LLMs in orthodontics were included in this review. The search strategy was structured based on “orthodontics” and “LLMs”, together with their synonyms (Appendix A). The eligibility criteria were constructed according to the PICOS format [44].

P (Population):	Human subjects including patients, dental professionals, or laypeople
I (Intervention):	LLMs, such as ChatGPT (OpenAI), Gemini (Google), and Copilot (Microsoft), that were implemented in orthodontic domains
C (Comparator):	Conventional healthcare approach or blank control
O (Outcomes):	Assessment of the orthodontic outcomes in terms of diagnostic accuracy, treatment efficacy, speed of action, etc.
S (Study design):	Original studies published in English within peer-reviewed academic journals between 2017 and 30 June 2024 were included.
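
As an illustration only, the study-design criterion above could be expressed as a simple filter. This is a minimal sketch, not the procedure the review used: the `Record` class, its field names, and the example records are hypothetical, and the start of the window is assumed to be 1 January 2017 because the criterion states only the year.

```python
from dataclasses import dataclass
from datetime import date

# Date window from the "S (Study design)" criterion.
# Assumption: "2017" is read as 1 January 2017.
SEARCH_START = date(2017, 1, 1)
SEARCH_END = date(2024, 6, 30)


@dataclass
class Record:
    """Hypothetical bibliographic record; field names are illustrative."""
    title: str
    language: str
    peer_reviewed: bool
    original_study: bool
    published: date


def meets_study_design(record: Record) -> bool:
    """Return True if the record satisfies the study-design criterion."""
    return (
        record.language.lower() == "english"
        and record.peer_reviewed
        and record.original_study
        and SEARCH_START <= record.published <= SEARCH_END
    )


# Made-up records used only to exercise the filter.
examples = [
    Record("In-window original study", "English", True, True, date(2024, 3, 1)),
    Record("Published after the cut-off", "English", True, True, date(2024, 7, 15)),
    Record("Non-English report", "German", True, True, date(2023, 5, 2)),
]
eligible = [r.title for r in examples if meets_study_design(r)]
print(eligible)
```

The remaining PICOS elements (population, intervention, comparator, outcomes) call for human judgement on the abstract or full text, so they are deliberately left out of this sketch.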

After an extensive search was performed across five major electronic databases (PubMed, Scopus, Web of Science, ProQuest, Embase), all records were imported into EndNote 20 software for screening. Initially, all duplicate literature records were removed. Then, two authors independently screened the title and abstract of each study based on the eligibility criteria. For those studies that met the eligibility criteria or where there was uncertainty, full text was retrieved for detailed evaluation. Any discrepancies that occurred during the screening process were resolved by discussion until consensus was reached.
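
The deduplication and dual-screening steps described above can be sketched in a few lines. This is a hedged illustration, not EndNote's matching algorithm: the record layout, the DOI-then-title matching rule, and the helper names `normalise_title`, `deduplicate`, and `find_disagreements` are all assumptions made for the example.

```python
def normalise_title(title: str) -> str:
    """Lower-case a title and keep only letters and digits for matching."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first record seen for each DOI, or normalised title if no DOI."""
    seen: set[str] = set()
    unique = []
    for rec in records:
        key = (rec.get("doi") or "").lower() or normalise_title(rec["title"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def find_disagreements(reviewer_a: dict[str, bool], reviewer_b: dict[str, bool]) -> list[str]:
    """Return record IDs where the two reviewers' include/exclude votes differ."""
    return sorted(rid for rid in reviewer_a if reviewer_a[rid] != reviewer_b[rid])


# Made-up records: r2 duplicates r1 by DOI, r4 duplicates r3 by title.
records = [
    {"id": "r1", "doi": "10.1000/abc", "title": "LLMs in orthodontics"},
    {"id": "r2", "doi": "10.1000/ABC", "title": "LLMs in Orthodontics."},
    {"id": "r3", "doi": None, "title": "Chatbots and clear aligners"},
    {"id": "r4", "doi": None, "title": "Chatbots and Clear Aligners"},
]
unique = deduplicate(records)

# Independent title/abstract votes from two reviewers (True = include).
votes_a = {"r1": True, "r3": False}
votes_b = {"r1": True, "r3": True}
to_discuss = find_disagreements(votes_a, votes_b)
print([r["id"] for r in unique], to_discuss)
```

Records flagged in `to_discuss` correspond to the discrepancies that the two authors resolved by discussion.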

The data extraction was independently conducted by two reviewers and pooled for analysis. The extracted data included characteristics of articles (authors, country, and publication year), LLMs (name and company), as well as the application (the performance/effectiveness) of the studies. This information was organized into a summary table of the included studies.

Next, we carried out a comprehensive analysis of the extracted data. A descriptive analysis of the basic information was performed, detailing the current state of published research by publication time and country and considering likely future research trends. Subsequently, we analyzed the research topics in depth, discussing the specific applications of LLMs in different areas of orthodontics. This included an assessment of current use, as well as an exploration of applications that may have potential in the future. Finally, we described and discussed the challenges identified, aiming to offer valuable insights for future research and applications.

3. Results

The search obtained a total of 706 articles including 472 articles from PubMed, 16 articles from Embase, 35 articles from ProQuest, 62 articles from Scopus, and 121 articles from Web of Science. After the removal of 96 duplicate articles, 533 articles were excluded based on title screening. Following the abstract review, 50 articles did not meet the inclusion criteria, leaving 26 for full-text evaluation. Ultimately, only 12 articles fulfilled the eligibility criteria. The detailed flowchart is presented in Figure 1. The characteristics of the included studies are shown in Table 1.
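
As a quick arithmetic check, the per-database counts reported in the paragraph above can be summed and expressed as shares of the total. The snippet is only a sketch; the variable names are illustrative.

```python
# Per-database record counts as reported in the Results text.
records_by_database = {
    "PubMed": 472,
    "Embase": 16,
    "ProQuest": 35,
    "Scopus": 62,
    "Web of Science": 121,
}

total = sum(records_by_database.values())

# Share of the total contributed by each database, in percent.
shares = {db: round(100 * n / total, 1) for db, n in records_by_database.items()}

print(total)
print(shares)
```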

Of the 12 articles that met the inclusion criteria, all were published within the past two years (4 studies in 2023 and 8 studies in 2024); 9/12 (75%) articles evaluated the accuracy and validity of the different LLMs regarding the answers to questions related to the field of orthodontics through quantitative analysis; 1/12 (8.33%) article used the comparative mixed method and 2/12 (16.67%) articles discussed the application of LLMs. The distribution of these articles is illustrated in Figure 2.
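
The proportions quoted above follow directly from the study counts. A minimal sketch of the calculation, with illustrative variable names:

```python
# Number of included studies by type and by publication year, as reported above.
study_types = {"quantitative": 9, "mixed-method": 1, "discussion": 2}
studies_by_year = {2023: 4, 2024: 8}

total = sum(study_types.values())

# Percentage of included studies in each category, rounded to two decimals.
percentages = {kind: round(100 * n / total, 2) for kind, n in study_types.items()}

print(total, percentages)
print(sum(studies_by_year.values()))
```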

Based on the regional distribution of these 12 articles, it can be seen that they were published across various countries. Specifically, the distribution includes three articles from Turkey, and one article each from Greece, Italy, the United States, Brazil, Slovakia, and Cyprus. The Asian region includes two articles from China and one from Japan. The detailed distribution of these articles is illustrated in Figure 3 (source: http://gisgeography.com). While this broad geographic distribution reflects the high level of interest among scholars worldwide in the application of LLMs in orthodontics, it also implies a relatively limited amount of research in this area.

The included articles were categorized by research methodology, and the majority of them are quantitative studies. Through statistical analyses, nine articles evaluated the effectiveness and accuracy of different LLMs in orthodontics. Four of these studies [48,51,52,53] demonstrate that LLMs perform well in providing accurate answers and effective assistance. However, the other five articles [49,50,54,55,56] show that the accuracy of these LLMs is not yet optimal and that they occasionally generate incorrect answers. Moreover, two discussion studies [46,47] presented the specific application of LLMs in the CephGPT-4 model and DM software (Dental Monitoring Co., Paris, France), respectively, noting that they have enhanced the efficiency of orthodontic clinical treatment but that further research and development of these technologies are still needed. Furthermore, a mixed-method study [45] combining quantitative and qualitative approaches comprehensively assessed the accuracy of responses provided by LLMs, with the results indicating that their overall accuracy needs improvement.
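
The methodological grouping described above can be checked for consistency: every included study, cited as [45] to [56], should appear in exactly one category. The sketch below encodes the citation numbers from the text; the category labels are shorthand chosen for the example.

```python
# Citation numbers of the included studies, grouped as in the text.
categories = {
    "quantitative, accurate answers": [48, 51, 52, 53],
    "quantitative, suboptimal accuracy": [49, 50, 54, 55, 56],
    "discussion": [46, 47],
    "mixed-method": [45],
}

all_refs = [ref for refs in categories.values() for ref in refs]

# No study is assigned to more than one category.
no_overlap = len(all_refs) == len(set(all_refs))

# Together the categories cover references 45 through 56.
covers_all = sorted(all_refs) == list(range(45, 57))

counts = {name: len(refs) for name, refs in categories.items()}
print(no_overlap, covers_all, counts)
```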

Additionally, all twelve articles addressed the application of different LLMs related to orthodontics. Surovková et al. [47] discussed how DM software can facilitate remote communication and monitoring between patients and dentists, noting its potential to provide personalized real-time analysis and feedback that help patients better understand the treatment process and enhance treatment efficiency. L. Ma et al. [46] discussed the CephGPT-4 model, which is trained on the basis of MiniGPT-4 to generate diagnostic reports by automatically analyzing cephalometric landmarks. Two articles [45,55] compared the performance of four LLMs (Bard, ChatGPT-3.5, ChatGPT-4.0, and Bing) in response to orthodontic questions; both showed that the LLMs occasionally generated suboptimal answers. The study conducted by S. Abu Arqub et al. [49] highlighted that the overall accuracy of ChatGPT-3.5 concerning clear aligners was limited by a deficiency in relevant citations. In contrast, two other articles [48,53] demonstrated that ChatGPT-4.0 exhibits a high level of accuracy and completeness in answering orthodontic questions. Furthermore, two articles [50,51] compared the accuracy of ChatGPT-3.5 and Bard; both indicate that the quality of ChatGPT-3.5 responses is superior to that of Bard. Another two articles [52,54] compared the performance of ChatGPT-3.5 and ChatGPT-4.0; both indicate that ChatGPT-4.0 responses are more reliable but more complex in terms of readability. Apart from these studies, M. Morishita et al. [56] evaluated the capabilities of ChatGPT-4V, with a particular focus on its performance in processing image inputs and providing orthodontic analysis; however, its overall accuracy in responding to image-based questions was only 35%.

Overall, we classified and discussed the included studies according to publication date, country of origin, and the application of different LLMs in various areas of orthodontics. This classification not only reveals the various applications and evaluations of LLMs, but also highlights their growing influence in this specialized field. For the specific classifications and studies, refer to Table 2.

4. Discussion

The rapid development of LLMs has created a revolution in the field of artificial intelligence, enabling them to be widely used in fields such as communication, education and healthcare [26,58,59,60]. LLMs, as an important part of artificial intelligence, are trained on large text datasets using deep learning techniques that enable them to learn and understand the patterns and structures of human language [61,62,63,64]. This ability gives LLMs great potential in clinical practice, scientific research and other areas. In the medical field, LLMs can be applied to assist in diagnosis and personalized treatment planning, which can effectively improve the efficiency and quality of medical services [65,66,67].

In this scoping review, a total of 12 articles on the application of LLMs in orthodontics were collected, all of which were published during the last two years [45,46,47,48,49,50,51,52,53,54,55,56]. Based on our results, there are few studies on LLMs in orthodontics and a limited number of applied studies. Most of the articles evaluated the accuracy and speed of answering of different GPT models [45,50,51,52,54,55]. In addition, the included studies show an upward trend of research in this field over the years, and they were conducted by researchers from different countries around the globe. The increasing number of studies from diverse regions in the past two years reflects the growing global attention and interest this field has attracted among scholars. However, our scoping review has some limitations. It was restricted to English-language literature, and only five databases were searched, which may introduce selection bias and limit the comprehensiveness of the evidence provided. There is therefore still plenty of scope for further exploration and research in this area.

The potential benefits of LLMs in various healthcare systems and cultural contexts are increasingly recognized, which provides strong support for the future development and application of LLMs in the field of orthodontics. For example, models based on the GPT architecture could analyze extensive datasets from the electronic health records (EHRs) of orthodontic patients to predict patient outcomes after treatment. This predictive ability not only helps dentists formulate personalized treatment plans but also facilitates the early identification of potential risks, thereby enhancing patient prognosis [36,68,69]. Furthermore, LLMs can be combined with multimodal models, such as image recognition models, which leverage both text and image data to assist dentists in providing more comprehensive and accurate diagnoses. In the study by Morishita et al. [56], the image recognition capabilities and question-answering accuracy of ChatGPT-4V were evaluated, revealing that challenges still exist in processing complex images and providing accurate answers. In the future, more advanced machine learning models can be trained to further improve image processing capabilities, thereby providing better services to dentists and patients. Since some studies have shown that malocclusion may affect pronunciation [70,71,72], the combination of LLMs and specialized speech analysis models could assist in identifying potential risks of oral problems such as open bite and deep overbite by analyzing speech data. Such preliminary screening could prompt further clinical examination and diagnosis, and the resulting early detection and timely orthodontic intervention could effectively improve the oral health and quality of life of patients.

LLMs also show great potential in the field of patient care. Surovková et al. [47] have introduced and discussed the changes that AI technology can bring to doctors, nurses, and patients during orthodontic practice. In daily medical practice, patients often have questions about diagnostic reports and treatment plans. LLMs can assist healthcare staff in effectively conveying complex medical knowledge, helping patients better understand their treatment plans and increasing their compliance with treatment [73]. They can also help patients better understand their health conditions and enhance their health awareness by providing personalized educational content and resources. In addition, integrating LLMs with general risk models can assist in the triage of orthodontic patients. By synthesizing and analyzing multiple data sources, LLMs can provide precise risk assessments. This application can optimize the allocation of medical resources by categorizing patients based on the time required for their visits, thus enabling more rational resource distribution and improving overall treatment efficiency and effectiveness [74]. LLMs also show distinct advantages in virtual simulated clinical trials [75]. By creating virtual clinical trial environments, LLMs can evaluate the effects and risks of different orthodontic products and treatment plans. This not only accelerates the development of new products and therapies, but also provides a comprehensive analysis, thereby supporting researchers in making more informed decisions.

Additionally, in the field of education, LLMs play an important role in translating, explaining, summarizing, and synthesizing content [76,77]. They can break down complex content into multiple parts, making it easier for students to understand. At the same time, LLMs can guide students to ask questions step by step, promoting independent thinking and logical reasoning [78,79,80]. Furthermore, studies have shown that integrating LLMs into educational platforms can effectively optimize online teaching, improving the interactivity of learning and simulating clinical scenarios to train medical students [81]. By analyzing and processing a large amount of orthodontic case data, LLMs can provide students with more accurate learning materials and simulated cases that help them gain a deeper understanding of complex orthodontic concepts and techniques. Although the application of LLMs in orthodontic education can be regarded as an important direction for future research and development, continuous optimization and improvement are still needed to ensure the accuracy and reliability of the models and to avoid misinformation. Despite these challenges, the potential of LLMs in the field of education cannot be ignored. With the continuous development of technology and the expansion of application scenarios, LLMs are expected to play an important role in orthodontic education, promoting the innovation and development of educational models.

Even though the application of LLMs in the field of orthodontics can improve the accuracy of clinical diagnosis, assist in personalized treatment planning, and increase the efficiency of student learning, there are several limitations. First, LLMs not only require a large amount of high-quality training data to ensure the accuracy and reliability of generated information, but also need a rapid and efficient data update mechanism so that the model can learn the latest medical developments in a timely manner. This is crucial because research and innovation in the medical field are continuous, and a lack of up-to-date content may lead to inaccurate information. Second, LLMs present certain challenges in language understanding and generation. They may produce text that seems reasonable and coherent but is not always accurate. In fields like medicine, where highly precise content is required, improving the reasoning and comprehension abilities of models becomes particularly crucial. Third, it is essential to take measures to ensure patient data privacy and safety. Finally, continuous improvements must be made by verifying the accuracy and reliability of the outputs generated by LLMs.

5. Conclusions

In this scoping review, we present a thorough analysis of studies that explore the applications of LLMs in the field of orthodontics. Although the number of studies and enthusiasm for LLMs have gradually increased in recent years, there is still a gap in their application in the field of orthodontics. Existing research indicates that it is necessary to further improve the accuracy of GPT models in answering orthodontics-related questions, so that orthodontists can refer to their answers when formulating orthodontic treatment plans and patients can better understand questions arising during diagnosis and treatment. In the future, it may be possible to train GPT models with a large amount of data to improve the precision of their answers, and to develop LLM-based clinical applications that assist in treatment, which could significantly improve treatment efficiency. For the education of dental students, applications based on LLMs can also be developed to enhance the efficiency and motivation of students. Issues related to privacy and ethics in artificial intelligence may be addressed in future research and development.

Author Contributions

Conceptualization, Z.S. and K.F.H.; methodology, J.Z.; validation, X.D.; formal analysis, J.Z. and X.D.; investigation, J.Z. and X.D.; writing—original draft preparation, J.Z. and S.M.C.; writing—review and editing, J.J.P., Q.Y.H.A. and K.F.H.; supervision, Z.S. and K.F.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original data presented in the study are openly available in the references.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Search strategy

PubMed: 471

#1 ((((((((Large Language Model*) OR (LLM)) OR (ChatGPT)) OR (GPT)) OR (generative pre trained transformer*)) OR (pretraining language model*)) OR (AI language model)) OR (chatbot)) OR (gpt-*) AND (2017:2024[pdat])

#2 (((((((orthodon*) OR (cephalo*)) OR (orthodontic treatment)) OR (Dental braces)) OR (Aligners)) OR (Teeth straightening)) OR (Dental correction)) OR (orthodontic procedure) AND (2017:2024[pdat])

#1 AND #2

Embase: 16

#1 (‘large language model*’:ti,ab,kw OR llm:ti,ab,kw OR chatgpt:ti,ab,kw OR gpt:ti,ab,kw OR ‘generative pre trained transformer*’:ti,ab,kw OR ‘pretraining language model*’:ti,ab,kw OR ‘ai language model’:ti,ab,kw OR chatbot:ti,ab,kw OR gpt*:ti,ab,kw) AND [2017–2024]/py

#2 (orthodon*:ti,ab,kw OR cephalo*:ti,ab,kw OR ‘orthodontic treatment’:ti,ab,kw OR ‘dental braces’:ti,ab,kw OR aligners:ti,ab,kw OR ‘teeth straightening’:ti,ab,kw OR ‘dental correction’:ti,ab,kw OR ‘orthodontic procedure’:ti,ab,kw) AND [2017–2024]/py

#3 #1 AND #2

ProQuest: 35

#1 noft(Large Language Model*) OR noft(LLM) OR noft(ChatGPT) OR noft(GPT) OR noft(generative pre trained transformer*) OR noft(pretraining language model*) OR noft(AI language model) OR noft(chatbot) OR noft(gpt*)

#2 noft(orthodon*) OR noft(cephalo*) OR noft(orthodontic treatment) OR noft(Dental braces) OR noft(Aligners) OR noft(Teeth straightening) OR noft(Dental correction) OR noft(orthodontic procedure)

#3 #1 AND #2

Scopus: 62

#1 (TITLE-ABS-KEY (large AND language AND model*) OR TITLE-ABS-KEY (llm) OR TITLE-ABS-KEY (chatgpt) OR TITLE-ABS-KEY (gpt) OR TITLE-ABS-KEY (generative AND pre AND trained AND transformer*) OR TITLE-ABS-KEY (pretraining AND language AND model*) OR TITLE-ABS-KEY (ai AND language AND model) OR TITLE-ABS-KEY (chatbot) OR TITLE-ABS-KEY (gpt*)) AND PUBYEAR > 2016 AND PUBYEAR < 2025

#2 (TITLE-ABS-KEY (orthodon*) OR TITLE-ABS-KEY (cephalo*) OR TITLE-ABS-KEY (orthodontic AND treatment) OR TITLE-ABS-KEY (dental AND braces) OR TITLE-ABS-KEY (aligners) OR TITLE-ABS-KEY (teeth AND straightening) OR TITLE-ABS-KEY (dental AND correction) OR TITLE-ABS-KEY (orthodontic AND procedure)) AND PUBYEAR > 2016 AND PUBYEAR < 2025

#3 #1 AND #2

Web of Science: 121

#1 ((((((((TS=(Large Language Model*)) OR TS=(LLM)) OR TS=(ChatGPT)) OR TS=(GPT)) OR TS=(generative pre trained transformer*)) OR TS=(pretraining language model*)) OR TS=(AI language model)) OR TS=(chatbot)) OR TS=(gpt*)

#2 (((((((TS=(orthodon*)) OR TS=(cephalo*)) OR TS=(orthodontic treatment)) OR TS=(Dental braces)) OR TS=(Aligners)) OR TS=(Teeth straightening)) OR TS=(Dental correction)) OR TS=(orthodontic procedure)

#3 #1 AND #2
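
The strategies in this appendix share one pattern: the synonyms for each concept are joined with OR, and the two concept blocks are then combined with AND. The sketch below assembles a PubMed-style string from the term lists given above. It is an illustration only: the helper names `or_block` and `build_query` are hypothetical, and the output is a flattened form with explicit outer parentheses rather than a character-for-character copy of the nested query actually run.

```python
# Term lists copied from the PubMed strategy in this appendix.
llm_terms = [
    "Large Language Model*", "LLM", "ChatGPT", "GPT",
    "generative pre trained transformer*", "pretraining language model*",
    "AI language model", "chatbot", "gpt-*",
]
ortho_terms = [
    "orthodon*", "cephalo*", "orthodontic treatment", "Dental braces",
    "Aligners", "Teeth straightening", "Dental correction",
    "orthodontic procedure",
]


def or_block(terms: list[str]) -> str:
    """Join the synonyms for one concept with OR, each term in parentheses."""
    return "(" + " OR ".join(f"({term})" for term in terms) + ")"


def build_query(concepts: list[list[str]], date_filter: str) -> str:
    """Combine the concept blocks and the date filter with AND."""
    parts = [or_block(terms) for terms in concepts] + [f"({date_filter})"]
    return " AND ".join(parts)


query = build_query([llm_terms, ortho_terms], "2017:2024[pdat]")
print(query)
```

Keeping the term lists as data makes it easier to see that the same synonyms were used across databases, with only the field tags and date syntax changing from one interface to the next.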

References

McCulloch, W.S.; Pitts, W. A logical calculus of the ideas immanent in nervous activity. 1943. Bull. Math. Biol. 1990, 52, 99–115; discussion 173–197. [Google Scholar] [CrossRef] [PubMed]
Negnevitsky, M. Artificial Intelligence: A Guide to Intelligent Systems; Pearson education: London, UK, 2005. [Google Scholar]
Tran, B.X.; Vu, G.T.; Ha, G.H.; Vuong, Q.-H.; Ho, M.-T.; Vuong, T.-T.; La, V.-P.; Ho, M.-T.; Nghiem, K.-C.P.; Nguyen, H.L.T.; et al. Global evolution of research in artificial intelligence in health and medicine: A bibliometric study. J. Clin. Med. 2019, 8, 360. [Google Scholar] [CrossRef] [PubMed]
Mintz, Y.; Brodie, R. Introduction to artificial intelligence in medicine. Minim. Invasive Ther. Allied Technol. 2019, 28, 73–81. [Google Scholar] [CrossRef] [PubMed]
Kaul, V.; Enslin, S.; Gross, S.A. History of artificial intelligence in medicine. Gastrointest. Endosc. 2020, 92, 807–812. [Google Scholar] [CrossRef]
Cao, L. Ai in finance: Challenges, techniques, and opportunities. ACM Comput. Surv. (CSUR) 2022, 55, 1–38. [Google Scholar]
Arinez, J.F.; Chang, Q.; Gao, R.X.; Xu, C.; Zhang, J. Artificial intelligence in advanced manufacturing: Current status and future outlook. J. Manuf. Sci. Eng. 2020, 142, 110804. [Google Scholar] [CrossRef]
Mandal, V.; Mussah, A.R.; Jin, P.; Adu-Gyamfi, Y. Artificial intelligence-enabled traffic monitoring system. Sustainability 2020, 12, 9177.
Abduljabbar, R.; Dia, H.; Liyanage, S.; Bagloee, S.A. Applications of artificial intelligence in transport: An overview. Sustainability 2019, 11, 189.
Davenport, T.; Kalakota, R. The potential for artificial intelligence in healthcare. Future Healthc. J. 2019, 6, 94–98.
Jiang, F.; Jiang, Y.; Zhi, H.; Dong, Y.; Li, H.; Ma, S.; Wang, Y.; Dong, Q.; Shen, H.; Wang, Y. Artificial intelligence in healthcare: Past, present and future. Stroke Vasc. Neurol. 2017, 2, 230–243.
Caffery, L.J.; Clunie, D.; Curiel-Lewandrowski, C.; Malvehy, J.; Soyer, H.P.; Halpern, A.C. Transforming dermatologic imaging for the digital era: Metadata and standards. J. Digit. Imaging 2018, 31, 568–577.
Esteva, A.; Kuprel, B.; Novoa, R.A.; Ko, J.; Swetter, S.M.; Blau, H.M.; Thrun, S. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017, 542, 115–118.
Al Kuwaiti, A.; Nazer, K.; Al-Reedy, A.; Al-Shehri, S.; Al-Muhanna, A.; Subbarayalu, A.V.; Al Muhanna, D.; Al-Muhanna, F.A. A Review of the Role of Artificial Intelligence in Healthcare. J. Pers. Med. 2023, 13, 951.
Du-Harpur, X.; Watt, F.; Luscombe, N.; Lynch, M. What is AI? Applications of artificial intelligence to dermatology. Br. J. Dermatol. 2020, 183, 423–430.
Basu, K.; Sinha, R.; Ong, A.; Basu, T. Artificial intelligence: How is it changing medical sciences and its future? Indian J. Dermatol. 2020, 65, 365–370.
Amann, J.; Blasimme, A.; Vayena, E.; Frey, D.; Madai, V.I.; Precise4Q Consortium. Explainability for artificial intelligence in healthcare: A multidisciplinary perspective. BMC Med. Inform. Decis. Mak. 2020, 20, 310.
Habehh, H.; Gohel, S. Machine learning in healthcare. Curr. Genom. 2021, 22, 291.
Aggarwal, R.; Sounderajah, V.; Martin, G.; Ting, D.S.; Karthikesalingam, A.; King, D.; Ashrafian, H.; Darzi, A. Diagnostic accuracy of deep learning in medical imaging: A systematic review and meta-analysis. NPJ Digit. Med. 2021, 4, 65.
Chang, Y.; Wang, X.; Wang, J.; Wu, Y.; Yang, L.; Zhu, K.; Chen, H.; Yi, X.; Wang, C.; Wang, Y.; et al. A survey on evaluation of large language models. ACM Trans. Intell. Syst. Technol. 2024, 15, 1–45.
Chen, M.; Tworek, J.; Jun, H.; Yuan, Q.; Pinto, H.P.D.O.; Kaplan, J.; Edwards, H.; Burda, Y.; Joseph, N.; Brockman, G.; et al. Evaluating large language models trained on code. arXiv 2021, arXiv:2107.03374.
Shanahan, M. Talking about large language models. Commun. ACM 2024, 67, 68–79.
Tseng, R.; Verberne, S.; van der Putten, P. ChatGPT as a commenter to the news: Can LLMs generate human-like opinions? In Proceedings of the Multidisciplinary International Symposium on Disinformation in Open Online Media, Amsterdam, The Netherlands, 21–22 November 2023; pp. 160–174.
Garon, J. A practical Introduction to Generative AI, Synthetic Media, and the Messages Found in the Latest Medium (March 14, 2023). 2023. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4388437 (accessed on 10 July 2024).
Wu, T.; He, S.; Liu, J.; Sun, S.; Liu, K.; Han, Q.-L.; Tang, Y. A brief overview of ChatGPT: The history, status quo and potential future development. IEEE/CAA J. Autom. Sin. 2023, 10, 1122–1136.
Kasneci, E.; Seßler, K.; Küchemann, S.; Bannert, M.; Dementieva, D.; Fischer, F.; Gasser, U.; Groh, G.; Günnemann, S.; Hüllermeier, E.; et al. ChatGPT for good? On opportunities and challenges of large language models for education. Learn. Individ. Differ. 2023, 103, 102274.
Akhtar, Z.B. Unveiling the evolution of generative AI (GAI): A comprehensive and investigative analysis toward LLM models (2021–2024) and beyond. J. Electr. Syst. Inf. Technol. 2024, 11, 22.
Seth, I.; Cox, A.; Xie, Y.; Bulloch, G.; Hunter-Smith, D.J.; Rozen, W.M.; Ross, R.J. Evaluating chatbot efficacy for answering frequently asked questions in plastic surgery: A ChatGPT case study focused on breast augmentation. Aesthetic Surg. J. 2023, 43, 1126–1135.
Lim, Z.W.; Pushpanathan, K.; Yew, S.M.E.; Lai, Y.; Sun, C.-H.; Lam, J.S.H.; Chen, D.Z.; Goh, J.H.L.; Tan, M.C.J.; Sheng, B.; et al. Benchmarking large language models’ performances for myopia care: A comparative analysis of ChatGPT-3.5, ChatGPT-4.0, and Google Bard. EBioMedicine 2023, 95, 104770.
Chen, Z.Z.; Ma, J.; Zhang, X.; Hao, N.; Yan, A.; Nourbakhsh, A.; Yang, X.; McAuley, J.; Petzold, L.; Wang, W.Y. A Survey on Large Language Models for Critical Societal Domains: Finance, Healthcare, and Law. arXiv 2024, arXiv:2405.01769.
Karabacak, M.; Margetis, K. Embracing large language models for medical applications: Opportunities and challenges. Cureus 2023, 15, e39305.
Geantă, M.; Bădescu, D.; Chirca, N.; Nechita, O.C.; Radu, C.G.; Rascu, S.; Rădăvoi, D.; Sima, C.; Toma, C.; Jinga, V. The Potential Impact of Large Language Models on Doctor–Patient Communication: A Case Study in Prostate Cancer. Healthcare 2024, 12, 1548.
Garg, R.K.; Urs, V.L.; Agarwal, A.A.; Chaudhary, S.K.; Paliwal, V.; Kar, S.K. Exploring the role of ChatGPT in patient care (diagnosis and treatment) and medical research: A systematic review. Health Promot. Perspect. 2023, 13, 183.
Thirunavukarasu, A.J.; Ting, D.S.J.; Elangovan, K.; Gutierrez, L.; Tan, T.F.; Ting, D.S.W. Large language models in medicine. Nat. Med. 2023, 29, 1930–1940.
Meskó, B.; Topol, E.J. The imperative for regulatory oversight of large language models (or generative AI) in healthcare. NPJ Digit. Med. 2023, 6, 120.
Yang, X.; Chen, A.; PourNejatian, N.; Shin, H.C.; Smith, K.E.; Parisien, C.; Compas, C.; Martin, C.; Costa, A.B.; Flores, M.G.; et al. A large language model for electronic health records. NPJ Digit. Med. 2022, 5, 194.
Cenzato, N.; Nobili, A.; Maspero, C. Prevalence of Dental Malocclusions in Different Geographical Areas: Scoping Review. Dent. J. 2021, 9, 117.
Jamilian, A.; Kiaee, B.; Sanayei, S.; Khosravi, S.; Perillo, L. Orthodontic treatment of malocclusion and its impact on oral health-related quality of life. Open Dent. J. 2016, 10, 236.
Ko, E.W.-C.; Huang, C.S.; Lin, C.-H.; Chen, Y.-R. Orthodontic perspective for face asymmetry correction. Symmetry 2022, 14, 1822.
Kahn, S.; Ehrlich, P.; Feldman, M.; Sapolsky, R.; Wong, S. The jaw epidemic: Recognition, origins, cures, and prevention. Bioscience 2020, 70, 759–771.
Caruso, S.; Caruso, S.; Pellegrino, M.; Skafi, R.; Nota, A.; Tecco, S. A knowledge-based algorithm for automatic monitoring of orthodontic treatment: The dental monitoring system. Two cases. Sensors 2021, 21, 1856.
Littlewood, S.J.; Mitchell, L. An Introduction to Orthodontics; Oxford University Press: Oxford, UK, 2019.
Tricco, A.C.; Lillie, E.; Zarin, W.; O’Brien, K.K.; Colquhoun, H.; Levac, D.; Moher, D.; Peters, M.D.J.; Horsley, T.; Weeks, L.; et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist and Explanation. Ann. Intern. Med. 2018, 169, 467–473.
Amir-Behghadami, M.; Janati, A. Population, Intervention, Comparison, Outcomes and Study (PICOS) design as a framework to formulate eligibility criteria in systematic reviews. Emerg. Med. J. 2020, 37, 387.
Giannakopoulos, K.; Kavadella, A.; Aaqel Salim, A.; Stamatopoulos, V.; Kaklamanos, E.G. Evaluation of the performance of generative AI large language models ChatGPT, Google Bard, and Microsoft Bing Chat in supporting evidence-based dentistry: Comparative mixed methods study. J. Med. Internet Res. 2023, 25, e51580.
Ma, L.; Han, J.; Wang, Z.; Zhang, D. CephGPT-4: An interactive multimodal cephalometric measurement and diagnostic system with visual large language model. arXiv 2023, arXiv:2307.07518.
Surovková, J.; Haluzová, S.; Strunga, M.; Urban, R.; Lifková, M.; Thurzo, A. The New Role of the Dental Assistant and Nurse in the Age of Advanced Artificial Intelligence in Telehealth Orthodontic Care with Dental Monitoring: Preliminary Report. Appl. Sci. 2023, 13, 5212.
Tanaka, O.M.; Gasparello, G.G.; Hartmann, G.C.; Casagrande, F.A.; Pithon, M.M. Assessing the reliability of ChatGPT: A content analysis of self-generated and self-answered questions on clear aligners, TADs and digital imaging. Dent. Press J. Orthod. 2023, 28, e2323183.
Abu Arqub, S.; Al-Moghrabi, D.; Allareddy, V.; Upadhyay, M.; Vaid, N.; Yadav, S. Content analysis of AI-generated (ChatGPT) responses concerning orthodontic clear aligners. Angle Orthod. 2024, 94, 263–272.
Arslan, C.; Kahya, K.; Cesur, E.; Cakan, D.G. An evaluation of orthodontic information quality regarding artificial intelligence (AI) chatbot technologies: A comparison of ChatGPT and Google Bard. Australas. Orthod. J. 2024, 40, 149–157.
Daraqel, B.; Wafaie, K.; Mohammed, H.; Cao, L.; Mheissen, S.; Liu, Y.; Zheng, L. The performance of artificial intelligence models in generating responses to general orthodontic questions: ChatGPT vs Google Bard. Am. J. Orthod. Dentofac. Orthop. 2024, 165, 652–662.
Demir, G.B.; Süküt, Y.; Duran, G.S.; Topsakal, K.G.; Görgülü, S. Enhancing systematic reviews in orthodontics: A comparative examination of GPT-3.5 and GPT-4 for generating PICO-based queries with tailored prompts and configurations. Eur. J. Orthod. 2024, 46, cjae011.
Hatia, A.; Doldo, T.; Parrini, S.; Chisci, E.; Cipriani, L.; Montagna, L.; Lagana, G.; Guenza, G.; Agosta, E.; Vinjolli, F.; et al. Accuracy and Completeness of ChatGPT-Generated Information on Interceptive Orthodontics: A Multicenter Collaborative Study. J. Clin. Med. 2024, 13, 735.
Kılınç, D.D.; Mansız, D. Examination of the reliability and readability of Chatbot Generative Pretrained Transformer’s (ChatGPT) responses to questions about orthodontics and the evolution of these responses in an updated version. Am. J. Orthod. Dentofac. Orthop. 2024, 165, 546–555.
Makrygiannakis, M.A.; Giannakopoulos, K.; Kaklamanos, E.G. Evidence-based potential of generative artificial intelligence large language models in orthodontics: A comparative study of ChatGPT, Google Bard, and Microsoft Bing. Eur. J. Orthod. 2024; ahead of print.
Morishita, M.; Fukuda, H.; Muraoka, K.; Nakamura, T.; Hayashi, M.; Yoshioka, I.; Ono, K.; Awano, S. Evaluating GPT-4V’s performance in the Japanese national dental examination: A challenge explored. J. Dent. Sci. 2024, 19, 1595–1600.
Strunga, M.; Urban, R.; Surovková, J.; Thurzo, A. Artificial intelligence systems assisting in the assessment of the course and retention of orthodontic treatment. Healthcare 2023, 11, 683.
Demszky, D.; Yang, D.; Yeager, D.S.; Bryan, C.J.; Clapper, M.; Chandhok, S.; Eichstaedt, J.C.; Hecht, C.; Jamieson, J.; Johnson, M.; et al. Using large language models in psychology. Nat. Rev. Psychol. 2023, 2, 688–701.
Qureshi, R.; Irfan, M.; Gondal, T.M.; Khan, S.; Wu, J.; Hadi, M.U.; Heymach, J.; Le, X.; Yan, H.; Alam, T. AI in drug discovery and its clinical relevance. Heliyon 2023, 9, e17575.
Shoham, O.B.; Rappoport, N. CPLLM: Clinical prediction with large language models. arXiv 2023, arXiv:2309.11295.
Esteva, A.; Robicquet, A.; Ramsundar, B.; Kuleshov, V.; DePristo, M.; Chou, K.; Cui, C.; Corrado, G.; Thrun, S.; Dean, J. A guide to deep learning in healthcare. Nat. Med. 2019, 25, 24–29.
Huang, J.; Chang, K.C.-C. Towards reasoning in large language models: A survey. arXiv 2022, arXiv:2212.10403.
Awais, M.; Naseer, M.; Khan, S.; Anwer, R.M.; Cholakkal, H.; Shah, M.; Yang, M.-H.; Khan, F.S. Foundational models defining a new era in vision: A survey and outlook. arXiv 2023, arXiv:2307.13721.
Yang, J.; Jin, H.; Tang, R.; Han, X.; Feng, Q.; Jiang, H.; Zhong, S.; Yin, B.; Hu, X. Harnessing the power of LLMs in practice: A survey on ChatGPT and beyond. ACM Trans. Knowl. Discov. Data 2024, 18, 1–32.
Rasmy, L.; Xiang, Y.; Xie, Z.; Tao, C.; Zhi, D. Med-BERT: Pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction. NPJ Digit. Med. 2021, 4, 86.
Safranek, C.W.; Sidamon-Eristoff, A.E.; Gilson, A.; Chartash, D. The role of large language models in medical education: Applications and implications. JMIR Med. Educ. 2023, 9, e50945.
Ríos-Hoyo, A.; Shan, N.L.; Li, A.; Pearson, A.T.; Pusztai, L.; Howard, F.M. Evaluation of large language models as a diagnostic aid for complex medical cases. Front. Med. 2024, 11, 1380148.
Yang, X.; Chen, A.; PourNejatian, N.; Shin, H.C.; Smith, K.E.; Parisien, C.; Compas, C.; Martin, C.; Flores, M.G.; Zhang, Y.; et al. GatorTron: A large clinical language model to unlock patient information from unstructured electronic health records. arXiv 2022, arXiv:2203.03540.
Kraljevic, Z.; Bean, D.; Shek, A.; Bendayan, R.; Hemingway, H.; Yeung, J.A.; Deng, A.; Baston, A.; Ross, J.; Idowu, E.; et al. Foresight--generative pretrained transformer (GPT) for modelling of patient timelines using EHRs. arXiv 2022, arXiv:2212.08072.
Keyser, M.M.B.; Lathrop, H.; Jhingree, S.; Giduz, N.; Bocklage, C.; Couldwell, S.; Oliver, S.; Moss, K.; Frazier-Bowers, S.; Phillips, C.; et al. Impacts of Skeletal Anterior Open Bite Malocclusion on Speech. FACE 2022, 3, 339–349.
Handoko, H.; Yohana, N. Speech production and malocclusion: A review. JURNAL ARBITRER 2023, 10, 107–115.
Al-Huwaizi, A. Occlusal Features, Perception of Occlusion, Orthodontic Treatment Need and Demand Among 13 Year Old Iraqi Students. Ph.D. Thesis, University of Baghdad, Baghdad, Iraq, 2002.
Tripathi, S.; Sukumaran, R.; Cook, T.S. Efficient healthcare with large language models: Optimizing clinical workflow and enhancing patient care. J. Am. Med. Inform. Assoc. 2024, 31, 1436–1440.
Arora, A.; Arora, A. The promise of large language models in health care. Lancet 2023, 401, 641.
Askin, S.; Burkhalter, D.; Calado, G.; El Dakrouni, S. Artificial Intelligence Applied to clinical trials: Opportunities and challenges. Health Technol. 2023, 13, 203–213.
Radford, A.; Kim, J.W.; Xu, T.; Brockman, G.; McLeavey, C.; Sutskever, I. Robust speech recognition via large-scale weak supervision. In Proceedings of the International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2023; pp. 28492–28518.
Zhu, W.; Liu, H.; Dong, Q.; Xu, J.; Huang, S.; Kong, L.; Chen, J.; Li, L. Multilingual machine translation with large language models: Empirical results and analysis. arXiv 2023, arXiv:2304.04675.
Nori, H.; King, N.; McKinney, S.M.; Carignan, D.; Horvitz, E. Capabilities of GPT-4 on medical challenge problems. arXiv 2023, arXiv:2303.13375.
Kumar, H.; Musabirov, I.; Reza, M.; Shi, J.; Kuzminykh, A.; Williams, J.J.; Liut, M. Impact of guidance and interaction strategies for LLM use on Learner Performance and perception. arXiv 2023, arXiv:2310.13712.
Kung, T.H.; Cheatham, M.; Medenilla, A.; Sillos, C.; De Leon, L.; Elepaño, C.; Madriaga, M.; Aggabao, R.; Diaz-Candido, G.; Maningo, J.; et al. Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models. PLoS Digit. Health 2023, 2, e0000198.
Ahuja, A.S.; Polascik, B.W.; Doddapaneni, D.; Byrnes, E.S.; Sridhar, J. The digital metaverse: Applications in artificial intelligence, medical education, and integrative health. Integr. Med. Res. 2023, 12, 100917.

Figure 1. PRISMA-ScR flowchart.

Figure 2. Number of articles distributed by year and methodology.

Figure 3. Country distribution of studies on the use of large language models.

Table 1. The characteristics of included studies.

Author and Year	Country	Objective	Assessment Method	Effects/Results
K. Giannakopoulos et al. (2023) [45]	Cyprus	To evaluate the accuracy of the answers provided by Google Bard, ChatGPT-3.5, ChatGPT-4, and Microsoft Bing Chat to two orthodontic questions: (1) Is early orthodontic treatment in two phases for children with prominent upper teeth more beneficial compared to treatment that is provided in one phase in adolescence? (2) Does orthodontic treatment affect airway function?	Comparative Mixed Methods Study	Within the field of orthodontics, both quantitative and qualitative analyses showed that ChatGPT-4 and ChatGPT-3.5 performed significantly better than Google Bard and Microsoft Bing Chat.
L. Ma et al. (2023) [46]	China	To propose CephGPT-4, a novel multimodal cephalometric analysis and diagnostic dialogue model.	Discussion	CephGPT-4 can improve the efficiency and accuracy of orthodontic measurements by automatically analyzing cephalometric landmarks and generating diagnostic reports, but it still needs further validation and evaluation.
J. Surovková et al. (2023) [47]	Slovakia	To introduce Dental Monitoring (DM), orthodontic software that uses AI and knowledge-based algorithms to provide accurate treatment tracking and can answer questions from patients, and to evaluate DM’s clinical application within the daily workflow of orthodontic treatment.	Discussion	The use of DM can significantly improve treatment effectiveness by reducing both the average number of treatments and the overall duration, but it faces challenges from imperfections in daily application and inconsistent scan evaluations. Although the use of LLMs in DM is limited, it can improve communication efficiency by semi-automatically answering questions from patients.
O. M. Tanaka et al. (2023) [48]	Brazil	To evaluate the accuracy of ChatGPT in answering a total of 45 questions on clear aligners, TADs and digital imaging.	Quantitative analysis	Of 225 evaluations by 5 evaluators, 11 (4.9%) rated the answers as very poor, 4 (1.8%) as poor, 15 (6.7%) as acceptable, 34 (15.1%) as good, and 161 (71.6%) as very good (Fleiss’s Kappa = 0.004). ChatGPT proved effective in providing quality answers related to these 3 domains.
S. Abu Arqub et al. (2024) [49]	USA	To assess the accuracy of ChatGPT’s answers to 111 questions concerning orthodontic clear aligners.	Quantitative analysis	ChatGPT provided correct responses to approximately 76% of the inquiries regarding orthodontic clear aligners but failed to provide the correct corresponding reference sources.
C. Arslan et al. (2024) [50]	Turkey	To assess the accuracy of the answers provided by ChatGPT and Google Bard to 24 questions about conventional braces, clear aligners, orthognathic surgery and orthodontic retainers.	Quantitative analysis	Generally, both models provided satisfactory responses to the common orthodontic inquiries. ChatGPT’s answers surpassed those of Google Bard in quality. (Average number of references provided per answer: ChatGPT, 2.13 ± 1.51; Bard, 1.96 ± 1.76.)
B. Daraqel et al. (2024) [51]	China	To evaluate and compare the performance of ChatGPT-3.5 and Google Bard in terms of response accuracy, completeness, generation time, and response length when answering 100 general orthodontic questions.	Quantitative analysis	The median accuracy score was 9 (out of a total score of 10) for ChatGPT and 8 for Bard. The median completeness score was 8 for both ChatGPT and Bard. Bard’s response generation time was shorter than ChatGPT’s by 10.4 s/question. Response length was the same in the two models.
G. B. Demir et al. (2024) [52]	Turkey	To compare the effectiveness of ChatGPT-3.5 and ChatGPT-4 in completing systematic reviews.	Quantitative analysis	The accuracy of PICO generation was higher for GPT-4 than for GPT-3.5, especially in the P and C parts (initial accuracy rates: GPT-4: P = 98%, C = 86%; GPT-3.5: P = 93%, I = 63%; second accuracy rates: GPT-4: P = 100%, C = 81%; GPT-3.5: P = 88%, C = 42%). Both ChatGPT-3.5 and ChatGPT-4 can be pivotal tools for generating PICO-driven queries in orthodontics.
A. Hatia et al. (2024) [53]	Italy	To investigate the accuracy and completeness of ChatGPT in answering questions and solving clinical scenarios related to interceptive orthodontics, using twenty-one questions.	Quantitative analysis	For open-ended questions, the overall median score was 4.9/6 for accuracy and 2.4/3 for completeness. For clinical cases, the overall median score was 4.9/6 for accuracy and 2.5/3 for completeness.
D. D. Kılınç and D. Mansız (2024) [54]	Turkey	To assess the reliability and readability of the responses of two versions of ChatGPT to 34 questions about orthodontics.	Quantitative analysis	ChatGPT’s responses showed some improvement in reliability during the second evaluation (p = 0.001). The response texts of the new version were more difficult to read (p = 0.001).
M. A. Makrygiannakis et al. (2024) [55]	Greece	To assess and compare the answers provided by Google’s Bard, OpenAI’s ChatGPT-3.5 and ChatGPT-4, and Microsoft’s Bing to ten questions about orthodontics.	Quantitative analysis	Bing achieved the highest answer score (Bing = 7.1, ChatGPT-4 = 4.7, Google Bard = 4.6, ChatGPT-3.5 = 3.8; scores range from 0 to 10). All models occasionally produced answers lacking comprehensiveness, accuracy, clarity and relevance.
M. Morishita et al. (2024) [56]	Japan	To assess the capabilities of ChatGPT-4V in answering image-based questions, using a total of 160 questions, including 20 specifically in the field of orthodontics.	Quantitative analysis	ChatGPT-4V has some limitations: its overall correct response rate was 35% (57.1% for compulsory questions, 43.6% for general questions, and 28.6% for practical questions). In the field of orthodontics, the correct answer rate was 25%. Of the 22 unanswered questions, 36.4% were in orthodontics.
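
The rating distribution reported for Tanaka et al. [48] in Table 1 can be checked with simple arithmetic. The following is a minimal Python sketch and is not part of the reviewed study: the variable names are illustrative, and reading the 225 evaluations as 45 questions each scored by 5 evaluators is an assumption inferred from the table row.

```python
# Counts per rating category as reported in Table 1 for Tanaka et al. [48].
ratings = {
    "very poor": 11,
    "poor": 4,
    "acceptable": 15,
    "good": 34,
    "very good": 161,
}

# Assumption: 45 questions x 5 evaluators = 225 evaluations in total.
total = 45 * 5
assert sum(ratings.values()) == total

# Percentage share of each rating category, rounded to one decimal place.
shares = {label: round(100 * count / total, 1) for label, count in ratings.items()}

for label, share in shares.items():
    print(f"{label}: {share}%")
```

Under these assumptions, the rounded shares agree with the percentages listed in the table (4.9%, 1.8%, 6.7%, 15.1% and 71.6%), and the five counts sum to the stated 225 evaluations.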

Table 2. Specific classifications and studies.

Classification	Category	Studies
Published Year	2023	K. Giannakopoulos et al. (2023) [45]; L. Ma et al. (2023) [46]; J. Surovková et al. (2023) [47]; O. M. Tanaka et al. (2023) [48]
Published Year	2024	S. Abu Arqub et al. (2024) [49]; C. Arslan et al. (2024) [50]; B. Daraqel et al. (2024) [51]; G. B. Demir et al. (2024) [52]; A. Hatia et al. (2024) [53]; D. D. Kılınç and D. Mansız (2024) [54]; M. A. Makrygiannakis et al. (2024) [55]; M. Morishita et al. (2024) [56]
Methodological	Discussion	L. Ma et al. (2023) [46]; J. Surovková et al. (2023) [47]
Methodological	Comparative mixed method	K. Giannakopoulos et al. (2023) [45]
Methodological	Quantitative analysis	O. M. Tanaka et al. (2023) [48]; S. Abu Arqub et al. (2024) [49]; C. Arslan et al. (2024) [50]; B. Daraqel et al. (2024) [51]; G. B. Demir et al. (2024) [52]; A. Hatia et al. (2024) [53]; D. D. Kılınç and D. Mansız (2024) [54]; M. A. Makrygiannakis et al. (2024) [55]; M. Morishita et al. (2024) [56]
Region Distribution	Europe	K. Giannakopoulos et al. (2023) [45]; J. Surovková et al. (2023) [47]; C. Arslan et al. (2024) [50]; G. B. Demir et al. (2024) [52]; A. Hatia et al. (2024) [53]; D. D. Kılınç and D. Mansız (2024) [54]; M. A. Makrygiannakis et al. (2024) [55]
Region Distribution	Asia	L. Ma et al. (2023) [46]; B. Daraqel et al. (2024) [51]; M. Morishita et al. (2024) [56]
Region Distribution	South America	O. M. Tanaka et al. (2023) [48]
Region Distribution	North America	S. Abu Arqub et al. (2024) [49]
LLMs	ChatGPT	K. Giannakopoulos et al. (2023) [45]; O. M. Tanaka et al. (2023) [48]; S. Abu Arqub et al. (2024) [49]; C. Arslan et al. (2024) [50]; B. Daraqel et al. (2024) [51]; G. B. Demir et al. (2024) [52]; A. Hatia et al. (2024) [53]; D. D. Kılınç and D. Mansız (2024) [54]; M. A. Makrygiannakis et al. (2024) [55]; M. Morishita et al. (2024) [56]
LLMs	Bard	K. Giannakopoulos et al. (2023) [45]; C. Arslan et al. (2024) [50]; B. Daraqel et al. (2024) [51]; M. A. Makrygiannakis et al. (2024) [55]
LLMs	Bing	K. Giannakopoulos et al. (2023) [45]; M. A. Makrygiannakis et al. (2024) [55]
LLMs	MiniGPT-4	L. Ma et al. (2023) [46]
The Field of Orthodontics	Clinical Applications: Assisting treatment and question answering	K. Giannakopoulos et al. (2023) [45]; L. Ma et al. (2023) [46]; J. Surovková et al. (2023) [47]; O. M. Tanaka et al. (2023) [48]; S. Abu Arqub et al. (2024) [49]; C. Arslan et al. (2024) [50]; B. Daraqel et al. (2024) [51]; G. B. Demir et al. (2024) [52]; A. Hatia et al. (2024) [53]; D. D. Kılınç and D. Mansız (2024) [54]; M. A. Makrygiannakis et al. (2024) [55]
The Field of Orthodontics	Automatic diagnosis and image analysis	M. Morishita et al. (2024) [56]
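
The classifications in Table 2 can also be represented as a small data structure to confirm that each single-label dimension assigns every included study to exactly one category. The following is a minimal Python sketch under stated assumptions: studies are identified by their reference numbers [45] to [56], the names `table2` and `is_partition` are hypothetical, and the LLMs dimension is omitted because a study may evaluate several models (and one study lists none).

```python
# Reference numbers of the 12 included studies, as cited in Tables 1 and 2.
included = set(range(45, 57))

# Table 2 classifications, keyed by dimension and then by category.
table2 = {
    "Published Year": {
        "2023": {45, 46, 47, 48},
        "2024": {49, 50, 51, 52, 53, 54, 55, 56},
    },
    "Methodological": {
        "Discussion": {46, 47},
        "Comparative mixed method": {45},
        "Quantitative analysis": {48, 49, 50, 51, 52, 53, 54, 55, 56},
    },
    "Region Distribution": {
        "Europe": {45, 47, 50, 52, 53, 54, 55},
        "Asia": {46, 51, 56},
        "South America": {48},
        "North America": {49},
    },
    "The Field of Orthodontics": {
        "Assisting treatment and question answering": set(range(45, 56)),
        "Automatic diagnosis and image analysis": {56},
    },
}


def is_partition(categories, universe):
    """Return True if the category sets are disjoint and together cover the universe."""
    members = [ref for refs in categories.values() for ref in refs]
    return len(members) == len(set(members)) and set(members) == universe


# Number of studies per category, and whether each dimension partitions the 12 studies.
counts = {dim: {cat: len(refs) for cat, refs in cats.items()} for dim, cats in table2.items()}
checks = {dim: is_partition(cats, included) for dim, cats in table2.items()}

for dim in table2:
    print(dim, counts[dim], "partition:", checks[dim])
```

Sets are used here so that overlaps and omissions are easy to detect; a dimension that allowed multiple labels per study, such as the LLMs evaluated, would need a different check.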

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

