1. Introduction
Undoubtedly, the potential of AI has raised great expectations in industries such as healthcare. This is because the challenges faced by humanity in the field of health are diverse and of varying nature. The case of COVID-19 alone has underscored how vulnerable humanity is to such challenges, where failing to address them results in significant costs, including the loss of lives. Therefore, AI emerges as a crucial resource in humanity’s endeavor to confront these challenges and improve living conditions.
AI has not entered medical research only in the last four years, a period of sharply increased interest; its integration has been ongoing for decades. A prime example is the research project conducted by Weber et al. [1]. In this early phase, these scholars focused on issues related to interactive graphics within computer-mediated medical seminars. Similarly, a few years later, Trappl [2] evaluated the importance of AI interaction in another field, psychotherapy, examining the evolving perspective on psychotherapeutic interaction between humans and computers.
The last term is also the prevailing one in the relevant literature, evolving with different definitions over the years. Joubert et al. [3], Bolc et al. [4], Anbar and Anbar [5], and Torasso et al. [6] introduced the term “Man-Machine Interaction” in the 1980s. Agah and Tanie [7] in the late 1990s distinguished between “Human-Machine Interactions” and “Human-Robot Interactions”. More recently, several scholars have adopted the term “human-artificial intelligence interaction”, including Gaczek et al. [8], Su et al. [9], Van Berkel et al. [10], Wiebelitz et al. [11], and Sivaraman et al. [12].
The diversity of the conceptual definition, however, is not accidental. Given the multidimensionality of medical issues, AI has all of the elements needed to address long-standing and entrenched challenges in various medical disciplines. Special emphasis has clearly been given to the treatment of various forms of cancer and oncology [13,14,15], geriatrics [16,17], cardiology [18], and even dentistry [19].
The specific aim of this research is to analytically describe and critically assess the scope and depth of AI’s utilization in different medical fields, identifying technological advancements as well as areas needing further exploration. For this purpose, this study introduces a novel bibliometric approach by comprehensively tracing the evolution and integration of AI within these sectors, highlighting key trends and shifts not previously synthesized in the literature.
Many researchers have attempted to review the integration of AI in healthcare [20,21]; however, these reviews have not comprehensively covered the entire period from the initial stages of AI, nor have they provided detailed information per medical department. Our research fills this critical gap by offering new insights into AI’s role and its developmental trajectory in healthcare, and pinpointing specific applications and impacts across various medical specialties.
Hence, this research seeks to address the following questions:
What are the most influential countries, institutions, sources, and authors in the field of AI interaction with healthcare?
What are the main thematic areas of research in AI interaction with healthcare?
Which medical departments have integrated AI according to each thematic area of research?
The structure of the paper is as follows: We begin by detailing our research methodology. We then proceed to the analysis phase, starting with an explanation of our bibliometric and cluster-based content analysis. Following this, we identify the thematic subjects of the relevant literature. This leads us into the discussion section, where we elaborate on the advancements made using AI over the years in various medical areas, organized by thematic area. Finally, the conclusion section summarizes the paper, considers the study’s limitations, and outlines potential avenues for future research.
2. Research Methodology
To address the aforementioned research questions, we employed the bibliometric analysis technique as our methodological approach. Within this framework, the careful selection of databases for sourcing information, along with the criteria used to determine the eligibility of studies for analysis, was deemed crucial.
With regard to the first parameter, our objective was to encompass a broad spectrum of academic publications from databases widely recognized within the scientific community. Consequently, we opted for the Scopus and PubMed databases, renowned for their extensive collection of academic publications pertaining to AI and medicine.
However, these two sources are rarely combined for bibliometric analysis due to significant disparities in the information they provide. Specifically, the CSV file obtained from PubMed lacks certain data such as the number of citations, affiliations, abstracts, and index keywords, which would need to be manually extracted by the researcher from each article’s PubMed page. Moreover, the categorizations used by the two databases at the publication type level do not align. For instance, Scopus classifies documents as “Conference Paper, Book Chapter, Book, Review, Conference Review”, while the corresponding PubMed categories include “Clinical Conference, Consensus Development Conference, NIH, Books and Papers, Review, Systematic Review”.
We treated this mismatch carefully so that the research process could continue and the two databases could be integrated. This was particularly important because our search focused on studies published in scientific journals, books, and conference proceedings, while excluding newspapers, opinion pieces, and news articles.
Furthermore, from our sample, we deemed it appropriate to exclude all kinds of reviews, aiming primarily to identify the bibliometric characteristics of original articles. The practice of omitting reviews from the bibliometric analysis was also adopted in other studies assessing the role of AI in healthcare, as demonstrated by Saheb et al. [22] in their study on ethics and the use of AI in the sector. The rationale behind excluding review articles was based on their tendency to over-represent certain topics through the aggregation of multiple primary studies, which could skew the results of our thematic analysis. The detailed screening criteria for the exclusion of reviews are presented in Table 1.
Based on the rationale outlined above, we initially retrieved data from the two databases on 29 March 2024. We used the keywords “medical” or “healthcare”, “interaction”, and “artificial intelligence” in the title, abstract, and keywords (TITLE-ABS-KEY) field of the Scopus database, identifying a total of 2241 articles. Similarly, in the PubMed database, employing the same terms resulted in 319 articles. Table 2 summarizes the queried databases and the specific search terms used. As an inclusion criterion for both databases, articles needed to be in English.
The searches in both databases were followed by comparing the results to identify and eliminate duplicate entries, that is, articles included in both databases. Each potential duplicate was manually checked to confirm its status as a duplicate. This involved verifying the titles, authors, publication years, and other relevant details to ensure accuracy. Consequently, a total of 195 articles found in both Scopus and PubMed were removed from our sample. Within this group, we identified an article by Krishnamoorthy et al. [23] that was listed twice in the Scopus database, as well as instances where the same article appeared with different publication years in the two databases [24,25,26,27,28,29]. After identifying these duplicates, each entry was meticulously reviewed to confirm its uniqueness in terms of content and publication details, ensuring the removal of any erroneous duplicates. Thus, after merging the samples from both databases (2241 + 319 − 195), we obtained a combined total of 2365 articles.
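The duplicate screening described above can be sketched in code. The following is a minimal illustration only, assuming that candidate duplicates are flagged by a normalized title and then passed to manual checking; the function names and the toy records are hypothetical and do not reproduce the procedure actually used in this study.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase a title and collapse punctuation and whitespace so that
    near-identical titles from different databases compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_records(scopus, pubmed):
    """Merge two record lists, keeping the first record seen per normalized
    title. Returns (merged_records, flagged_pairs); every flagged pair is a
    candidate duplicate that still needs a manual check."""
    seen = {}
    flagged = []
    for record in list(scopus) + list(pubmed):
        key = normalize_title(record["title"])
        if key in seen:
            flagged.append((seen[key], record))
        else:
            seen[key] = record
    return list(seen.values()), flagged


# Toy records (hypothetical, for illustration only).
scopus = [
    {"title": "AI-Assisted Triage: A Pilot Study", "year": 2021, "source": "Scopus"},
    {"title": "Chatbots in Primary Care", "year": 2020, "source": "Scopus"},
]
pubmed = [
    # Same article, but indexed with a different publication year.
    {"title": "AI-assisted triage: a pilot study.", "year": 2022, "source": "PubMed"},
]

merged, flagged = merge_records(scopus, pubmed)
print(len(merged), len(flagged))
```

Matching on the title alone, rather than on title plus year, is a deliberate choice in this sketch: it also flags the case noted above in which the same article carries different publication years in the two databases.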
In these articles, a thorough examination of their content was conducted by two scholars to identify any review articles that were not detected through the initial search queries. As a result, 304 articles were excluded from our sample, yielding a final count of 2061 articles. This process highlights our commitment to maintaining a high standard of research integrity by ensuring that our findings are based on the most pertinent and original research available.
Table 3 summarizes the process we followed to select the final sample for analysis.
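The selection flow can be re-traced arithmetically from the counts reported for each step. The short sketch below uses only the figures stated in the text; the variable names are illustrative.

```python
# Counts reported in the text for each step of the selection process.
scopus_hits = 2241
pubmed_hits = 319
duplicates_removed = 195
reviews_removed = 304

# Records remaining after merging the two databases and removing duplicates.
after_merge = scopus_hits + pubmed_hits - duplicates_removed

# Records remaining after the manual exclusion of review articles.
final_sample = after_merge - reviews_removed

print(after_merge, final_sample)
```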
3. Results
Next, we present the results of the bibliometric analysis. Firstly, we start by listing the outcomes of a bibliographic-coupling analysis of influential elements, encompassing countries, authors, institutions, sources, and documents. Following that, we proceed to a co-occurrence analysis of keywords, extracting the main thematic fields in which the literature concerning the interaction of AI in the medical field has focused.
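As a rough illustration of the first technique, bibliographic coupling links two documents through the references they share: the more references two documents have in common, the stronger their coupling. The sketch below counts shared references for every document pair. The document identifiers and reference lists are invented, and the code is not VOSviewer's implementation.

```python
from itertools import combinations


def coupling_strengths(references):
    """Return {(doc_a, doc_b): shared_reference_count} for every pair of
    documents that cite at least one reference in common."""
    strengths = {}
    for doc_a, doc_b in combinations(sorted(references), 2):
        shared = len(set(references[doc_a]) & set(references[doc_b]))
        if shared:
            strengths[(doc_a, doc_b)] = shared
    return strengths


# Hypothetical documents and cited-reference IDs, for illustration only.
references = {
    "doc1": ["r1", "r2", "r3"],
    "doc2": ["r2", "r3", "r4"],
    "doc3": ["r5"],
}

strengths = coupling_strengths(references)
print(strengths)
```

The same counting idea extends to authors, institutions, sources, or countries by pooling the reference lists of all documents that belong to each unit.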
3.1. Key Influential Factors through Bibliographic-Coupling Analysis
3.1.1. Annual Publication Volume
One of the factors considered in bibliometric analysis pertained to the volume of publications per year. This is aimed at understanding the emergence and growth of scientific interest in the relevant literature.
As seen in Figure 1, the earliest articles on the interaction of AI in the medical field date back to the 1970s, but their frequency remained low until the end of the millennium. Starting in 2005, the number of publications began to rise significantly, showing an almost linear upward trend until 2018, when there was a sharp increase in research interest in this topic.
Notably, from 2018 to 2019, the number of publications surged by approximately 51%, and this growth rate remained high between 2020 and 2021 (50%). In the subsequent two years, the annual increase remained above 35% (2022: 325 records, 2023: 485 records). Despite 2024 data only covering the first three months, the rapid growth appeared to continue. This trend confirmed that the use of AI in the medical field, especially in terms of interaction, is attracting increasing research interest. This is likely due to the capabilities offered by processing large volumes of data multifactorially in a short time, which can provide insights into complex medical issues.
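The year-over-year growth rates quoted above follow the usual percentage-change formula. As a small worked example, the sketch below applies it to the two annual record counts given in the text (2022 and 2023); the function name is illustrative.

```python
def yoy_growth(previous: int, current: int) -> float:
    """Percentage change from one year's record count to the next."""
    return (current - previous) / previous * 100.0


# Annual record counts stated in the text.
records = {2022: 325, 2023: 485}

growth_2023 = yoy_growth(records[2022], records[2023])
print(f"{growth_2023:.1f}%")
```

For these two counts the increase is roughly 49%, consistent with the statement that annual growth remained above 35%.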
An additional observation from Figure 1 concerns the distribution of publications across the two selected databases, Scopus and PubMed. The green line, representing duplicates, shows the number of articles that appear in both databases annually, helping to visualize the overlap and the growing convergence of literature in these two major databases. Notably, no conference papers or books were found in PubMed during our specific search.
3.1.2. Geographic Distribution of Productivity Rates
Regarding the distribution of documents by geographic location, it is important to note that the VOSviewer analysis considers the location of the authors’ affiliations rather than that of the publishers. This approach ensures that our data reflect the actual regions where the research is being conducted, providing a more accurate representation of global research activities.
Table 4 and Table 5 illustrate the distribution of publications by continent and country, respectively. It is observed that Europe and Asia are the continents with the highest productivity, accounting for more than 72% of the total published studies.
At the country level, the most productive are the United States and China, followed by India, the United Kingdom, Germany, and Italy. These six countries together account for half of the total publications (approximately 50.5%).
Figure 2 illustrates the interconnected networks between countries regarding publications in the literature on AI’s role in the medical industry. In our analysis, to rectify inaccuracies concerning the countries of publication, we created a thesaurus file. This file facilitated the merging of terms referring to the same country. For instance, “U.S.A.” was merged with “USA” and “United States”; “Republic of Ireland” with “Ireland”; “west ger”, “West Germany”, “Hamburg Germany” and “Deutschland” with “Germany”; “Ind” with “India”; and “U.A.E.” with “United Arab Emirates”. The network of relations encompassed a total of 106 countries. However, five countries were excluded as they lacked any developed relations with other countries.
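A thesaurus file of this kind can be generated programmatically. To our understanding, VOSviewer reads thesaurus files as tab-delimited text with the column headers "label" and "replace by". The sketch below builds such a file from the country variants mentioned above. Note the assumptions: the text does not say which spelling was kept for the United States, so "United States" is chosen here arbitrarily, and the function name is hypothetical.

```python
import csv
import io

# Variant spellings mapped to one canonical label. The variants are those
# named in the text; the canonical form for the USA is an assumption.
COUNTRY_SYNONYMS = {
    "U.S.A.": "United States",
    "USA": "United States",
    "Republic of Ireland": "Ireland",
    "west ger": "Germany",
    "West Germany": "Germany",
    "Hamburg Germany": "Germany",
    "Deutschland": "Germany",
    "Ind": "India",
    "U.A.E.": "United Arab Emirates",
}


def build_thesaurus(mapping):
    """Return the thesaurus as tab-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["label", "replace by"])
    for label, replacement in mapping.items():
        writer.writerow([label, replacement])
    return buffer.getvalue()


thesaurus_text = build_thesaurus(COUNTRY_SYNONYMS)
print(thesaurus_text)
```

In practice the returned text would be saved to a plain-text file and selected when loading the data into VOSviewer.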
Through analysis using the VOSviewer software (version 1.6.20), we identified 14 distinct network clusters of cooperating countries at the publication level. Many of these clusters consisted of neighboring countries.
Table 6 displays countries organized by cluster and color, mirroring the map’s presentation. Countries in bold indicate the highest level of contribution in terms of publications compared to all other countries in the same cluster.
3.1.3. Distribution of Publications across Various Types and Publishers
Utilizing the formulated search terms, the bibliometric analysis focused on the three primary types of publications that gathered the highest number of entries in the relevant literature. Regarding the interaction of AI in the medical field, as shown in Table 7, there was a notably higher preference for publications in journals (1019) compared to books (516) and conference papers (526).
Table 7 also displays the sources of academic publications that have contributed most significantly to the specific literature, along with their publishers. It shows various types of sources that have supplied more than 10 publications to the literature on the interaction of AI in the medical field.
In the realm of academic journal publications, the majority were contributed by “Artificial Intelligence in Medicine” and “International Journal of Medical Informatics”. These journals, published by Elsevier, together accounted for approximately 80 studies. “Journal of Biomedical Informatics”, published by Academic Press Inc., also made a significant contribution with 25 papers, while “IEEE Access” from IEEE Inc. added another 19 articles. The group of academic journals with more than 10 publications also included “Journal of Medical Internet Research” (JMIR Publications Inc.) and “ACS Applied Materials and Interfaces” (ACS Publications).
Focusing on the sources of publication with the most significant contributions outside of academic journals, we find that both books and conference papers feature different publishing houses playing a pivotal role. Springer dominates in book publications, particularly with its “Lecture Notes in Computer Science” series, which boasts 173 publications and significantly enriches the reviewed bibliography. In the realm of books, IOS Press stands out with “Studies in Health Technology and Informatics”, contributing 75 publications. For conference proceedings, the prominent publisher ACM leads with its “ACM International Conference Proceeding Series” and the “Conference on Human Factors in Computing Systems”, both of which are influential in this domain.
Additionally, to distinguish the networks formed among different sources, we used VOSviewer again. Out of 1108 distinct source titles, 797 formed the largest set of connected items. As depicted in Figure 3, dense yellow clusters are identified among the most significant sources. The following sources are notable because their density color is closest to yellow. “Lecture Notes in Computer Science” ranks first with a Total Link Strength (TLS) of 1319. “Artificial Intelligence in Medicine” ranks second with TLS 1123. “Journal of Biomedical Informatics” holds the third position with TLS 707. “International Journal of Medical Informatics” secures fourth place with TLS 634. Lastly, “Studies in Health Technology and Informatics” ranks fifth with TLS 468.
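The TLS values reported above are, in VOSviewer's terminology, total link strengths: for each item, the sum of the strengths of all links that connect it to other items. The sketch below shows this summation on an invented three-source network; the source names and link weights are placeholders, not values from our data.

```python
from collections import defaultdict


def total_link_strength(links):
    """Sum, for each item, the strengths of all links that touch it.

    `links` maps an unordered pair (item_a, item_b) to a link strength."""
    totals = defaultdict(int)
    for (item_a, item_b), strength in links.items():
        totals[item_a] += strength
        totals[item_b] += strength
    return dict(totals)


# Hypothetical link strengths between three sources, for illustration only.
links = {
    ("Source A", "Source B"): 4,
    ("Source A", "Source C"): 1,
    ("Source B", "Source C"): 2,
}

tls = total_link_strength(links)
print(tls)
```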
3.1.4. Dissemination of Publications among Authors and Institutions
To accurately discern the authors who have contributed significantly to the relevant literature, we initially processed the data extracted from two databases. This step was crucial because VOSviewer’s dual options can yield inconsistent results when analyzing a vast corpus of articles. Specifically, this software conducts author-level analysis either by using full names or by abbreviating them to initials. Choosing between these options can lead to varying outcomes, with the latter proving problematic for authors from countries like China and Japan, due to the commonality of initials. To enhance the precision of our analysis, we utilized the Author ID feature from the Scopus database. For articles in the PubMed database lacking a corresponding column, we linked authors present in both databases to their Scopus Author ID. For authors without a Scopus publication, we manually created and assigned a unique ID code.
During this process, we identified 8750 unique author codes, which differed from the 8842 individuals VOSviewer (version 1.6.20) recognized when the file was loaded. This discrepancy indicated that to accurately extract the network of connections between authors, we needed to rectify the 92 additional records. These records corresponded to existing authors but were listed under slightly different names.
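The discrepancy (8842 − 8750 = 92) corresponds to name spellings that map to an author ID already in use. One way to surface such records is to group the spellings by ID and list every ID with more than one spelling. The following sketch assumes a simple list of (author ID, name) rows; the IDs and names are placeholders, and the function name is hypothetical.

```python
from collections import defaultdict


def surplus_name_records(author_rows):
    """Group name spellings by author ID.

    Returns (variants, surplus): `variants` maps each ID that has several
    spellings to the sorted spellings, and `surplus` counts the extra
    spellings beyond one per ID."""
    names_by_id = defaultdict(set)
    for author_id, name in author_rows:
        names_by_id[author_id].add(name)
    variants = {
        author_id: sorted(names)
        for author_id, names in names_by_id.items()
        if len(names) > 1
    }
    surplus = sum(len(names) - 1 for names in variants.values())
    return variants, surplus


# Placeholder rows (hypothetical IDs and names, for illustration only).
rows = [
    ("ID-001", "Doe, Jane"),
    ("ID-001", "Doe, J."),
    ("ID-002", "Roe, Richard"),
    ("ID-002", "Roe, Richard"),
]

variants, surplus = surplus_name_records(rows)
print(variants, surplus)
```

Each listed variant can then be mapped to a single spelling so that the software counts one author per ID.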
Table 8 enumerates the most prolific authors, highlighting those who have made significant contributions to the literature on AI interaction in the medical field. Notably, despite the large number of authors, only 21 have published five or more papers in this area. Included in this group are two authors with discrepancies in their name entries: Francisco Jose, who was also recorded as Francisco J., and Pedro Ignacio Dorado-Diaz, who was listed once as Dorado-Diaz, P. Ignacio.
It is notable that Terenziani P. and Piovesan L., along with Michalowski M. and Michalowski W., collaborated on the majority of their articles. Specifically, Terenziani P., a significant contributor to research on AI interaction in healthcare, began with a seminal publication in 1989, focusing on the interaction between humans and machines in diagnostic systems [6]. He remained active, particularly from 2010 to 2020, collaborating with Piovesan L. on investigating how different clinical guidelines may interact or conflict during patient care, especially concerning comorbid patients, to ensure that combined recommendations avoid negative consequences or conflicting advice. Similarly, Michalowski also concentrated on related areas, such as developing and refining methods for managing and implementing multiple clinical practice guidelines, particularly for patients with comorbidities. From 2011 to 2021, their research focused on using constraint logic programming to identify inconsistencies in clinical guidelines and applying interactive methods and qualitative measures to resolve these inconsistencies. Their work evolved into creating an integrated framework for the moderated implementation of multiple guidelines, resulting in the development of MitPlan and MitPlan 2.0, which provide enhanced support for managing patients with multiple morbidities through innovative design approaches.
This collaboration among the most prolific authors is evident in Figure 4, which emerges when we restrict the analysis to authors with at least five publications.
In examining the institutions contributing the most to the literature on the interaction of AI in the medical industry, the bibliometric analysis revealed that, in terms of publication count, Harvard Medical School (U.S.A.) stands out with five documents among a total of 5721 institutions. Following closely are McGill University (Canada), the Poznan University of Technology (Poland), and the University of Pennsylvania (U.S.A.), each with four documents. In total, only 140 different organizations have published more than one document in this literature.
However, at the citation level, a different set of organizations stands out. In this case, Columbia University (U.S.A.), LinkedIn Corporation (U.S.A.), Microsoft Research (India), and Microsoft (U.S.A.) account for 982 citations, while Fudan University, Department of Physics (China), the Institute of Biochemistry and Cell Biology (China), the State Key Laboratory for Modification of Chemical Fibers and Polymer Materials and College of Chemistry, Donghua University (China), and the State Key Laboratory of Molecular Engineering of Polymers, Fudan University (China) have 721 citations.
Table 9 displays the list of organizations whose documents have garnered more than 500 citations.
The combined citation records among various organizations essentially reflect their collaboration on publications. The four organizations topping the list in citations jointly contributed to the publication titled “Intelligible Models for Healthcare: Predicting Pneumonia Risk and Hospital 30-Day Readmission” [30]. As its title indicates, this article concerns intelligible (interpretable) predictive models for healthcare, applied to pneumonia risk and 30-day hospital readmission. Meanwhile, the second group of organizations contributed to a significant study in the relevant literature titled “A Bioinspired Mineral Hydrogel as a Self-Healable, Mechanically Adaptable Ionic Skin for Highly Sensitive Pressure Sensing” [31]. This publication focused on the development of a novel type of mechanically adaptable ionic skin sensor, with promising applications in various fields such as AI, wearable devices, and soft robotics.
3.2. Analysis of Keyword Co-Occurrence and Content Clustering
An important yet challenging aspect of the bibliometric analysis involved identifying the subject areas where the interaction of AI has occurred within the field of medicine. VOSviewer proved to be a valuable tool for this task; however, ensuring that the results accurately represented the relevant literature required careful attention. Neither the literature nor the user manual of the software [32] provides a specific rule regarding the minimum number of occurrences required for a keyword to be included in the analysis. Given this, and considering the substantial volume of material to be analyzed (more than 12,000 different keywords), we implemented a multi-stage process to derive meaningful results.
Initially, we conducted the analysis without modifying our data and without utilizing a thesaurus file for correction. In this scenario, we identified 13,015 distinct keywords, classified into 73 clusters based on their content. The most frequently occurring keywords are presented in Table 10. Additionally, Figure 5 depicts the network among all of these keywords.
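For readers unfamiliar with co-occurrence analysis, the underlying count is simple: two keywords co-occur whenever they are attached to the same document, and the link between them is weighted by the number of such documents. The sketch below illustrates this counting on three invented keyword lists; it does not reproduce VOSviewer's normalization or clustering.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_lists):
    """Count, for each unordered keyword pair, the number of documents in
    which both keywords appear."""
    pairs = Counter()
    for keywords in keyword_lists:
        unique = sorted(set(keywords))  # ignore repeats within a document
        pairs.update(combinations(unique, 2))
    return pairs


# Hypothetical keyword lists, one per document, for illustration only.
docs = [
    ["artificial intelligence", "decision support", "human"],
    ["artificial intelligence", "human"],
    ["deep learning", "human"],
]

pairs = cooccurrence_counts(docs)
print(pairs.most_common(1))
```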
As depicted in Table 10, instances like “Algorithm—Algorithms” and “Human—Humans” are essentially duplicates and should be consolidated. Conversely, keywords such as “Artificial intelligence systems” and “Health Care” serve as broad search terms and do not provide specific insights into the directions AI has taken in the medical industry. Therefore, following Caldarelli’s suggestion [33], they should be omitted to prevent biasing our results. Additionally, terms such as “article”, “priority journal”, “male”, “female”, “adult”, and “controlled study” can be disregarded for our analysis. This exclusion will provide a clearer understanding of the relationships among essential keywords in the relevant literature and the networks they form.
To apply these configurations to VOSviewer’s analysis, it was necessary to create a thesaurus file that reflects these changes. With this file, we can observe a distinction in the frequency of keywords within the literature on the interaction of AI in healthcare (see Table 11).
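The effect of these corrections on keyword frequencies can be illustrated outside VOSviewer. The sketch below merges singular and plural variants and drops the generic terms named above before counting. The direction of the merge (plural to singular), the lowercasing, and the sample keyword list are assumptions made for illustration.

```python
from collections import Counter

# Variants to consolidate (assumed direction: plural -> singular).
MERGE = {"algorithms": "algorithm", "humans": "human"}

# Broad or generic terms named in the text as candidates for omission.
DROP = {
    "artificial intelligence systems",
    "health care",
    "article",
    "priority journal",
    "male",
    "female",
    "adult",
    "controlled study",
}


def clean_keywords(keywords):
    """Lowercase, merge known variants, and remove generic terms."""
    cleaned = []
    for keyword in keywords:
        label = keyword.strip().lower()
        label = MERGE.get(label, label)
        if label not in DROP:
            cleaned.append(label)
    return cleaned


# Hypothetical raw keywords, for illustration only.
raw = ["Algorithms", "Algorithm", "Humans", "Human", "Article", "Male",
       "Decision support systems"]

frequencies = Counter(clean_keywords(raw))
print(frequencies)
```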
These are the keywords that we anticipated would contribute significantly to the formation of distinct clusters by VOSviewer during the co-occurrence analysis. This anticipation is based on the VOSviewer manual [32], which suggests that keyword frequency and total link strength are among the characteristics considered to some extent for clustering.
However, as illustrated in Figure 6, while the 18 specific keywords with 100 or more occurrences can create an initial network of relationships among themselves, such a network is insufficient to provide a comprehensive picture of the main thematic directions of the relevant literature. The question that remains is what the minimum occurrence level should be for including keywords in the analysis and clustering configuration. As mentioned above, the VOSviewer manual does not provide any specific guidelines to follow, and neither does the literature on studies using this software for bibliometric analysis in AI healthcare. For example, Saheb et al. [22] chose a minimum of three occurrences in their attempt to map ethical issues arising from the use of artificial intelligence in the medical field, while Kumari et al. [34] set a threshold of 10 occurrences for keywords related to the role of machine learning (ML) and deep learning (DL) in healthcare with big data analysis.
To systematically determine a minimum number of keyword occurrences that effectively reflects the number of clusters, enabling an informed assessment of the main subject areas concerning the interaction of AI in the medical field, we utilized a well-established methodology known as the elbow rule. This method can be traced back, to some extent, to Thorndike’s research study in 1953 [35]. It offers a solution for determining the optimal number of clusters by operating on the principle that as the number of clusters increases, the internal (within-cluster) variance of the data decreases while the external (between-cluster) variance increases. The inflection point, referred to as the ‘elbow point’, marks where the decrease in internal variance slows sharply, so that adding further clusters yields little improvement, and thus indicates the appropriate number of clusters in the data [36].
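To make the elbow idea concrete, the sketch below locates the elbow of a decreasing curve with one common heuristic: it picks the point farthest from the straight line joining the first and last points. This heuristic is an assumption chosen for illustration, not necessarily the reading of the elbow applied in this study, and the variance values are invented.

```python
def elbow_index(values):
    """Return the index of the point farthest from the straight line that
    joins the first and last points of the curve."""
    last = len(values) - 1
    x0, y0 = 0, values[0]
    x1, y1 = last, values[-1]
    best_index, best_distance = 0, -1.0
    for i, y in enumerate(values):
        # Numerator of the point-to-line distance; the denominator is the
        # same for every point, so it can be skipped when comparing.
        distance = abs((y1 - y0) * i - (x1 - x0) * y + x1 * y0 - y1 * x0)
        if distance > best_distance:
            best_index, best_distance = i, distance
    return best_index


# Invented within-cluster variance for k = 1..6 clusters (toy data).
within_variance = [100.0, 55.0, 30.0, 25.0, 22.0, 20.0]

k_elbow = elbow_index(within_variance) + 1  # +1 because k starts at 1
print(k_elbow)
```

On this toy curve the variance falls steeply up to the third point and only marginally afterwards, so the heuristic selects that point as the elbow.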
In our study, the variables of interest were the following three: the number of keywords, the minimum number of occurrences (threshold), and the number of generated clusters. By adjusting the occurrence threshold in our analysis, we obtained a different number of keywords and a specific count of clusters each time. In total, 89 distinct threshold values were identified, serving as data points for constructing the elbow graph. Across these points, we observed nine alternative cluster numbers.
Table 12 lists the points at which the number of clusters varied, while Figure 7 displays the various network connections among keywords for different cluster counts.
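The first half of this sweep, counting how many keywords survive each minimum-occurrence threshold, is easy to reproduce in code. The sketch below does so for an invented frequency table. The cluster count at each threshold comes from running VOSviewer and is therefore not computed here; the function name and the sample thresholds are hypothetical.

```python
def keywords_at_threshold(frequencies, threshold):
    """Return the keywords whose occurrence count meets the threshold."""
    return sorted(k for k, count in frequencies.items() if count >= threshold)


# Invented keyword frequencies, for illustration only.
frequencies = {
    "human": 120,
    "algorithm": 45,
    "machine learning": 30,
    "telemedicine": 8,
    "ontology": 3,
}

# Number of keywords retained at each candidate threshold.
sweep = {t: len(keywords_at_threshold(frequencies, t)) for t in (3, 10, 50, 100)}
print(sweep)
```

Pairing each threshold with the cluster count reported by the software yields the data points of an elbow chart.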
Figure 8 illustrates what is commonly referred to as an elbow chart. In this chart, each dot represents a data point defined by three attributes: the number of occurrences (horizontal axis), the number of keywords (color-coded), and the number of clusters (vertical axis read from the right side). As observed, the majority of data points are concentrated within the three and four cluster categories, indicating a significant concentration of keywords and occurrences in these groups before the data begin to diverge significantly.
This pattern is even more pronounced in Figure 9, which focuses specifically on the 0–100 range for both the number of keywords and occurrences. The intersection points of these two variables suggest that the optimal number of clusters, according to the elbow rule, falls within the range of three to four clusters. To decide which of the two cluster scenarios to focus on for our analysis, it was beneficial to examine the keywords that define each cluster. Table 13 provides this relevant information, where keywords not included in the case of three clusters are highlighted in red, and those that change clusters are marked in purple.
Comparatively, between the two scenarios, the arrangement with three clusters provided a clearer understanding of the areas within which AI is applied in the medical field. Specifically, with three clusters, the keywords indicated that the first cluster focuses on support for clinical decisions and life-cycle approaches; the second one primarily addresses the analysis of visual information; and the third cluster concentrates on newer ‘learning’ technologies, which are closely associated with what the literature identifies as human–computer interaction (HCI).
The three key areas within the medical field, as illustrated in Figure 10, are analyzed in detail in the discussion section that follows.
4. Discussion
Owing to VOSviewer’s inability to sort articles by cluster, we manually categorized them according to their keywords, titles, and abstracts. Moreover, the impact of the articles, as reflected by their citation count, was given particular consideration in the detailed discussion of each topic.
4.1. Medical Informatics and Clinical Decision Support Systems
This cluster contains the largest number of items and the highest keyword frequencies of the three clusters, making it the primary target for the integration of AI in the medical industry. Medical informatics interventions, such as clinical decision support (Gude et al. [37]), offer significant support to health professionals operating within a highly complex environment. Consequently, researchers endeavored from very early on to evaluate models aimed at improving clinical decision-making, such as those proposed by Roach et al. [38], Brenner et al. [39], and Schecke et al. [40]. Roach et al. [38], for example, developed a critical expert system that identifies drug interactions in their eventual combinations, aiming to prevent adverse effects in patients and helping physicians avoid choices that could lead to such outcomes. The study by Schecke et al. [40], this time in the context of a surgical procedure, also aimed to assist doctors’ decisions. With the proposed AES-2 (Anesthesia Decision Support System) model, they attempted to offer intelligent alerts and appropriate treatment recommendations concerning anesthesia for cardiac surgery operations. In the research conducted by Brenner et al. [39], we encounter an approach that still appears quite risky today: the proposition of a microcomputer-based personal medical advisor and reference system designed to assist medically inexperienced computer users in accessing general medical information, answers to specific medical queries, and details related to either existing or hypothetical medical issues. It even suggests the possibility of consulting a doctor.
In the next decade, many more scholars followed these early ideas. Rau et al. [41] proposed an enhanced clinical decision support system (DSS) for patient anesthesia during surgery, which demonstrated superiority in terms of user interface and interaction with the computer system. Additionally, Clark et al. [42] made a more sophisticated attempt in the area of drug interaction and clinical decision-making than Roach et al. [38]. For the first time, they described an electronic drug prescribing system that, through central administration and a set of rules, enables improved decision-making for each patient by considering allergies and interactions in real time.
However, at this juncture, the potential of AI in the medical industry was being recognized to the extent that it can extend beyond assisting individual departments and functions. The development of separate information systems on a case-by-case basis has created complexity in the cooperation between the different agencies involved. Thus, we encounter a pioneering effort by Cimino et al. [
43] to implement the “Interactive Query Workstation” for multi-resource querying from various types of databases—a clinical and bibliographic database, a cancer database, a drug interaction database, and a medical knowledge base. Additionally, Mori [
44] endeavored, through a new class of software, to support the interaction between health professionals and the specialized tasks they perform. This software aimed to properly manage terminological diversity without imposing uniformity, instead facilitating spontaneous convergence among controlled vocabularies. In the same vein, Glasspool et al. [
45] discussed DSS aimed at facilitating the planning of care actions to ensure they do not conflict with each other. Xiao et al. [
46,
47] introduced the term “Distributed DSS” to describe systems with such characteristics. They emphasized that, for such systems to function effectively, protected interaction must be ensured through the delineation of clinical user roles and universally applied policies.
Beyond connecting the different stakeholders and care providers, the integration of AI in clinical decision-making also bears on the interaction between doctor and patient. Frize et al. [
48] focused on improving this relationship through the integration of smart monitoring components in various medical environments, from intensive care units to rheumatoid arthritis wards. Douali et al. [
49] later described a DSS aimed at helping physicians provide personalized care with greater accuracy, quality, and efficiency through the development of a semantic web. Meanwhile, Khattak et al. [
50] emphasized personalized care for elderly patients through the use of an innovative, dynamic DSS service that aligns nutritional intake with information from the patients’ daily activities.
The elderly are a population group that is quite sensitive and prone to experiencing adverse drug reactions, so it is particularly interesting to integrate AI into drug monitoring systems and, by extension, into decision-making systems [
51]. For this reason, various scholars have focused on the development of a suitably designed decision-making system, such as Thum et al. [
52], Johansson et al. [
53], and Gómez-Sebastià et al. [
54].
AI in DSSs also plays a significant role in another sensitive area of care: the emergency department (ED). Prolonged waiting times there can lead to very negative outcomes. Therefore, it is particularly important to understand the flow of patients in the nursing departments and their behaviors. Wu et al. [
55] utilized a rule-based data-mining approach to investigate the relationship between various types of patient behaviors and their length of stay (LOS), and to construct a model for predicting patient LOS. Their primary objective was to develop an interactive DSS. Liu et al. [
56], on the other hand, utilized an agent-based model in an effort to enhance the understanding of the complexity, evaluate policies, and improve the effectiveness of EDs.
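To make the idea of rule-based mining of patient behavior against length of stay more concrete, the sketch below computes two standard association-rule measures, support and confidence, for a single candidate rule. It is an illustration under stated assumptions rather than the procedure of Wu et al.: the attribute names, the sample visits, and the function `rule_metrics` are all hypothetical.

```python
def rule_metrics(records, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent.

    records: list of sets of categorical attributes, one set per ED visit.
    antecedent, consequent: sets of attributes.
    """
    n = len(records)
    with_antecedent = [r for r in records if antecedent <= r]
    with_both = [r for r in with_antecedent if consequent <= r]
    # Support: share of all visits in which the whole rule holds.
    support = len(with_both) / n if n else 0.0
    # Confidence: share of antecedent visits that also show the consequent.
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


# Hypothetical visits (illustrative only, not data from any cited study).
visits = [
    {"arrived_by_ambulance", "long_stay"},
    {"arrived_by_ambulance", "long_stay"},
    {"arrived_by_ambulance", "short_stay"},
    {"walk_in", "short_stay"},
]

support, confidence = rule_metrics(visits, {"arrived_by_ambulance"}, {"long_stay"})
```

On these four invented visits, the rule holds in two of four records (support 0.5) and in two of the three ambulance arrivals (confidence about 0.67). A rule-mining approach would typically enumerate many such candidate rules and keep those above chosen thresholds.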
Similarly, the recent emergence of the COVID-19 pandemic presented a significant healthcare emergency. The scenario quickly led to a depletion of hospital resources and necessitated critical clinical decisions. In this context, AI and medical informatics played crucial roles in identifying significant parameters for making informed decisions. Snowdon et al. [
57] began their work by constructing a conceptual model that captured the diverse ways in which information and technology can support the public health response to a pandemic. Meanwhile, Suraj et al. [
58] introduced the SMART COVID Navigator, a clinical decision support tool designed for treating COVID-19. This web-based application enabled clinicians to access patients’ electronic health records and analyze disease interactions from a wide range of observational studies, which influenced the understanding of severity and mortality rates associated with COVID-19.
Also noteworthy is the contribution of AI to the teaching of medical informatics and of appropriate decision-making within academic education. The characteristics of an intelligent DSS play an important role in the effectiveness of medical education [
59]. Eliot et al. [
60] discussed an intelligent teaching system that customizes the level of knowledge to meet individual student needs, providing feedback on any misconceptions they may have. A particular area of interest for them was the teaching of cardiopulmonary resuscitation techniques, where decisions about how to teach were distinguished from decisions about what to teach. With the emergence of ChatGPT, medical informatics education through AI is advancing. Considering students’ perspectives, Magalhães Araujo and Cruz-Correia [
61] focused on identifying suitable prompts to enhance medical teaching and learning.
The above illustrates the contribution of AI to clinical decision support and medical informatics, which is clearly evolving both in a more general context and within specific departments of the medical field. In
Table 14, the sampled publications are grouped by medical department to give a clearer picture of AI’s contribution in these terms.
4.2. Advanced Medical Imaging and Diagnosis Systems: Algorithms and Automation
A second avenue through which AI has been integrated into the medical field is image analysis for diagnostic purposes. The first examples of this technology date back to the 1980s, when the initial studies and systems emerged. One of the earliest implementations was EMERGE, developed by Hudson and Cohen [
86], which was specifically designed for emergency rooms. This expert system utilized medical criteria maps and a scoring system to efficiently evaluate chest pain, demonstrating how rule-based AI could improve the speed and accuracy of diagnostics in acute medical settings. Concurrently, research by Zinder [
87] highlighted the integration of AI with clinical laboratory processes, emphasizing the growing reliance on technologies such as computerized tomography. This work underscored the crucial role of AI in managing and interpreting the voluminous and complex diagnostic data, proving particularly beneficial in fields like oncology and internal medicine where precision and rapid results are paramount. Following this, the development of a comprehensive AI system for perinatal monitoring by Hernández and Gómez [
88] significantly advanced obstetrics by automating the diagnosis and prognosis of fetal conditions during labor. Utilizing syntactic pattern recognition methods, this system provided real-time, accurate assessments, vastly improving upon traditional manual monitoring methods. The latter part of the decade saw further innovations with the creation of PUPA, a Pulse Programming Assistant for Nuclear Magnetic Resonance Imaging by Foxvog et al. [
89], which automated the complex process of creating pulse programs for magnetic resonance imaging (MRI) experiments. This advancement was particularly transformative for departments like radiology and neurology, enhancing the functionality and application of MRI technology. Additionally, the research on the role of metadata in medical expert systems by Al-Zobaidie and Grimson [
90] demonstrated how effectively managing metadata could streamline the integration between databases and AI systems, thereby enhancing diagnostic processes. By facilitating more sophisticated data handling and interpretation capabilities, this approach significantly improved the accuracy and speed of diagnoses across various medical departments.
During the 1990s, the field of medical imaging and computer-assisted diagnosis experienced continued innovation, building upon the foundational AI technologies of the 1980s. Significant improvements were seen in systems designed for specific medical tasks. The development of the VIA-RAD system, as documented by Rogers [
91] in 1995, exemplifies the evolution of diagnostic radiology tools. VIA-RAD, a blackboard-based system, integrated computer-displayed radiological images with cooperative computerized assistance for decision-making, representing an advanced version of the image interpretation systems first introduced in the previous decade. This system utilized extensive data collection and cognitive modeling to improve the interaction between perception and problem-solving in radiological assessments, indicating a sophisticated leap forward in how AI could enhance the accuracy and efficiency of radiology departments. In the same vein, significant progress was made in the automated identification of anatomical structures. The research by Brown et al. [
92] on a model-based assessment of lung structures showcased the use of a sophisticated inferencing and control system that identified major lung structures from medical images. This technology advanced the applications seen in earlier MRI innovations like PUPA [
89], by adding a level of automated feature recognition and diagnostic support that was not previously available. During this decade, new advancements further refined diagnostic capabilities in medicine. One notable development was the QUAWDS system, introduced by Weintraub et al. [
93], which utilized advanced pattern recognition and an abductive hypothesis assembler to analyze human gait dynamically, marking a shift from traditional static image analysis to more complex, motion-oriented diagnostics. Also, Olabarriaga et al. [
94] further advanced the field by developing an intelligent interactive segmentation method that employed a piece-wise deformable model for analyzing complex medical images, such as those used in diagnosing osteoarthritic ankles. This method required minimal user intervention and significantly improved the efficiency and accuracy of segmenting intricate anatomical structures, demonstrating the decade’s push towards more sophisticated and user-friendly diagnostic tools.
Continuing into the 2000s, the integration of AI in medical imaging and diagnostics further evolved with groundbreaking advancements that built upon the technological momentum of the previous decades. In the realm of medical imaging, we observed a leap forward with the work of Chuang and Lie [
95] in 2004, who developed an object segmentation algorithm that utilized an extended gradient vector flow field model. This technology advanced the image segmentation processes that were fundamental in the 1980s and 1990s by introducing a system that required no human interaction, signifying a substantial progression toward full automation in medical image analysis. Another key development was the system presented by Olabarriaga et al. [
96] for the segmentation of thrombus in abdominal aortic aneurysms from computed tomography angiography scans. This system showcased a novel application of nonparametric statistical grey level modeling in the medical imaging domain, a marked advancement from the texture analysis and pattern recognition methodologies that began emerging in the 1990s. It represented a refined approach to dealing with the complexity of medical image data by providing a robust automated segmentation with minimal user input.
Progress in diagnostic capabilities was also evident in the improved understanding of image data. The Medical Imaging Interaction Toolkit, introduced by Wolf et al. [
97], extended beyond the algorithmic capacities of medical imaging to include interaction and visualization. This toolkit was a notable enhancement over the foundational imaging technologies from previous decades as it integrated algorithms with visualization, allowing for more interactive applications in medical image analysis. Furthermore, the period saw significant advancements in ultrasound imaging technology, both in methodology and application. In 2008, Rossi et al. [
98] introduced an algorithm that greatly improved the automatic recognition of the common carotid artery in longitudinal ultrasound B-mode scans, showcasing a move away from the labor-intensive manual processes of the past. This was indicative of the burgeoning trend towards automation within medical imaging analysis. Complementing this trend, Wein et al. [
99] made a parallel leap in the same year with their development of a method for automatic computed tomography (CT)–ultrasound registration for diagnostic imaging and image-guided interventions. Their system, which utilized a novel real-time simulation of medical ultrasound from CT data coupled with a robust similarity measure, enabled the alignment of 3D ultrasound sweeps with corresponding tomographic modalities without the need for manual input.
The period between 2011 and the COVID-19 outbreak was characterized by significant improvements and the emergence of new technologies that further propelled the capabilities of AI in medicine, refining the accuracy and efficiency of diagnostic processes. For instance, in 2011, Ababneh et al. [
100] introduced an innovative, fully automated system for the segmentation of bones from knee MRI images, which was particularly impactful for osteoarthritis research. Their system utilized graph-cut-based segmentation algorithms to identify imaging biomarkers for this debilitating joint disease, which demonstrated the potential of AI in automating and improving the diagnostic workflow. Song et al. [
101] in the same year made strides in the field with their development of a surface-region context in optimal multi-object graph-based segmentation for the robust delineation of pulmonary tumors, enhancing the precision of lung cancer treatment planning. Furthermore, by 2014, Roy et al. [
102] had contributed significantly to the field of content-based image retrieval systems for 3D medical datasets. Their work demonstrated the potential for such systems to provide a rapid and accurate retrieval of medical images, aiding radiologists in their diagnostic tasks and enhancing the efficiency of medical workflows. In 2017, Prasad et al. [
103] showcased the versatility of image-based diagnostic tools with their deployment on Android devices for plant species identification based on leaf images, suggesting potential applications of medical image-based diagnostic tools beyond traditional healthcare settings.
During the COVID-19 pandemic, the application of AI in medical imaging and diagnosis saw significant innovation and deployment, as detailed in the works of Aouad et al. [
104] and Kuang et al. [
105]. The rapid spread of the virus and the urgent need for efficient diagnostic protocols meant that traditional methods needed to be augmented with smarter, faster technology. Aouad et al. [
104] discuss the integration of smart city technologies to monitor and control the epidemic spread, harnessing the power of the Internet of Things (IoT) to quickly diagnose COVID-19, thereby reducing human-to-human interaction, and enhancing response times to contain outbreaks. Kuang et al. [
105] highlight the potential of AI-driven segmentation methods, particularly for acute ischemic stroke lesions on non-contrast CT scans, which may have parallels in imaging techniques for viral infections, showing how AI helped to address the challenges of low contrast and artifacts common in rapid, high-volume testing environments.
In the contemporary post-pandemic landscape, research by Vázquez-Ingelmo et al. [
106] and Batista et al. [
107], among others, has shown a trend towards more integrated, intelligent, and user-centric medical diagnostic systems. Vázquez-Ingelmo et al. presented the CARTIER-IA platform, which enhances the usability of medical data management by integrating various types of data and enabling the application of AI algorithms through a web application, catering to diverse roles in medical research and practice. Batista et al. discussed how the evolution to 6G technology will amplify the capabilities of smart health services, making them more efficient and accessible, while also highlighting the need to address the security and privacy challenges that accompany these advancements. These developments signal a transition to a future where healthcare is more connected, data-driven, and patient-centric, supported by sophisticated AI tools for better diagnosis and management.
In
Table 15 below, we present a summary of the publications within the sample that explore the intersection of AI with medical imaging and computer-assisted diagnosis, categorized by medical specialty.
4.3. Human–Computer Interaction and the Importance of Learning Systems
Another vibrant domain where AI has significantly penetrated is the interface between humans and computers, particularly through the lens of learning systems. This convergence has been pivotal in advancing HCI, making systems more intuitive, responsive, and capable of learning from user interactions. The genesis of this synergy can be traced back to the development of early neural networks that aimed to mimic human thinking patterns.
Beginning in the early 2000s, Lau et al. [
122] developed a framework for mining patterns of dyspepsia symptoms across time points, highlighting the potential of constraint-based association rule mining to aid domain experts in medical data analysis. Moving forward, Bayro-Corrochano et al. [
123] and Watanabe [
124] explored geometric algebra in neural networks and the symmetrical properties of training and generalization errors in learning machines, respectively, pushing the boundaries of what ML could achieve in complex applications. By 2007, efforts by Haddawy et al. [
125] demonstrated the feasibility of understanding anatomical sketches, pointing towards the potential of integrating more intuitive forms of HCI in medical training. In the same year, Flores et al. [
126] discussed the AMPLIA system which used pedagogic negotiation in medical education, blending learning environments with AI.
The 2010s saw further advancements in this area, with Swangnetr et al. [
127] and Gholami et al. [
128] further exploring the adaptation of robotic and learning technologies for healthcare applications, demonstrating the integration of emotional state classification in patient–robot interactions and neonate pain assessment, respectively. By 2015, the focus had slightly shifted towards more direct patient care applications. This shift is illustrated by Rasmusson and Irvine [
129], who explored the neurobiology of executive function under stress and its optimization in intense military training, and by Biglari et al. [
130], who developed a haptics-enabled surgical training system integrated with DL. This period also saw advancements in human–computer interfaces for rehabilitation, as shown by Novak and Riener [
131], who presented a machine-learning algorithm for predicting targets of human reaching motions with an arm rehabilitation exoskeleton. Additionally, Senadeera et al. [
132] discussed turning gaming electroencephalography peripherals into trainable brain–computer interfaces, highlighting the potential for broader applications in interactive environments.
In the wake of the COVID-19 pandemic, research pivoted to address immediate needs. For instance, Luo et al. [
133] introduced a textile-based tactile learning platform that could record, monitor, and learn human–environment interactions using ML techniques to adjust sensor performance. This work underscores the progressive integration of AI in direct patient interaction and training. Additionally, Saxena et al. [
134] presented an integrated network for real-time facial expression recognition, enhancing the capabilities of human–robot interaction in healthcare, demonstrating the evolving role of AI in understanding complex human expressions for better patient care.
Recently, the convergence of learning systems and HCI has been marked by significant research efforts that demonstrate innovative applications of technology in educational and interactive settings. A notable advancement was reported by Ahuja et al. [
135], who developed a robot for eldercare that combines AI, ML, and the IoT. This robot not only assists the elderly with daily tasks but also incorporates learning algorithms that adapt to the user’s behavior, enhancing both independence and safety.
Similarly, Kovalev et al. [
136] introduced the Augmented Mirror Hand (MIRANDA), a virtual reality-based system for training users with prosthetic limbs. This system leverages ML to adjust to the specific movements of an individual, offering personalized training that improves the efficacy of prosthetic usage, thus facilitating faster and more effective rehabilitation. Mehr et al. [
137] explored the potential of AI-powered lower limb assistive devices designed for home care. Their work focuses on developing adaptable central pattern generators and a divergent component of motion for personalized motion planning. This technology is particularly promising for enhancing the mobility of individuals with disabilities in their home environments, thereby extending learning-based human–machine collaboration into daily living activities.
On the educational front, Liu [
138] focused on motivating medical students’ active learning through an autonomous learning environment that integrates AI to support interactive and adaptive learning experiences. This approach aimed to transform traditional medical education by leveraging AI to create dynamic educational content that responds to the cognitive and emotional needs of students. Additionally, Ryan et al. [
139] tackled the integration of fairness in the software design process. Their study emphasized the need for HCI and ML experts to collaborate closely to ensure that AI-driven systems are not only effective but also equitable. This research underlines the importance of incorporating ethical considerations into the learning processes of AI systems to ensure they are aligned with societal values.
As with the analysis of the other two clusters,
Table 16 provides a summary of the publications in the sample that investigate the integration of AI within HCI and learning technologies, organized by medical department.
5. Conclusions
This bibliometric analysis represents a comprehensive effort to systematically map the integration of AI in healthcare. We utilized data from two databases instead of the usual single source, acknowledging the challenges of their combined evaluation. The review of 2061 articles initially focused on the rising number of publications and the productive factors contributing significantly to the relevant literature. We observed a substantial increase in publications, particularly in recent years, with numerous authors and organizations from various parts of the world actively participating. This underscores the global recognition of AI’s importance across multiple healthcare areas.
We then identified the main subject areas where AI has significantly evolved and improved medicine. Using keyword co-occurrence analysis and a novel bibliographic technique, we determined the optimal number of thematic clusters, which yielded three: (1) medical information and clinical DSSs, (2) medical diagnosis and advanced medical imaging, and (3) HCI and learning systems. These areas have seen significant advancements due to AI adoption. Clinical DSSs can now process vast amounts of patient data, providing healthcare professionals with real-time, evidence-based recommendations. These systems enhance patient safety and outcomes by predicting drug interactions and recommending alternative treatments. Additionally, the integration of AI in medical information systems has streamlined administrative processes, significantly reducing the burden on healthcare staff. Moreover, AI’s capability to process and analyze imaging data has enhanced diagnostic accuracy and efficiency in specialties like radiology and oncology. Innovations like automated image segmentation and enhanced pattern recognition have reduced medical staff workload and improved patient outcomes by enabling faster, more accurate diagnoses. Furthermore, the intersection of AI with HCI and learning systems has led to significant advancements in medical technology and educational methodologies. From early neural networks to sophisticated applications like the AMPLIA system and haptics-enabled surgical training, AI has transformed human–computer interfaces, facilitating more effective medical training and care, and extending to applications such as rehabilitation and eldercare.
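For readers unfamiliar with keyword co-occurrence analysis, the counting step that underlies it can be sketched as follows. This is a generic illustration and not the pipeline used in this study: the function name `cooccurrence_counts` and the sample keyword lists are hypothetical, and the normalization (trimming and lowercasing) is an assumed preprocessing choice.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_lists):
    """Count how many articles each unordered keyword pair appears in together."""
    counts = Counter()
    for keywords in keyword_lists:
        # Normalize and de-duplicate so a pair is counted once per article.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        counts.update(combinations(unique, 2))
    return counts


# Hypothetical author keywords for three articles.
articles = [
    ["AI", "Radiology", "Diagnosis"],
    ["ai", "diagnosis"],
    ["AI"],
]
counts = cooccurrence_counts(articles)
```

The resulting pair counts can be read as edge weights of a keyword network, on which a clustering method is then applied to obtain thematic groups.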
In light of the above, the study can serve as a reference on the state of AI diffusion in healthcare as reflected in publication data. Furthermore, it assists scholars and legislators, as well as practitioners, in better understanding the evolution of healthcare-related AI research and the prerequisites for the responsible use of AI in healthcare settings.
Despite the contributions of this research, the authors acknowledge several limitations. The analysis focused exclusively on articles from two databases and was limited to those written in English, potentially overlooking significant literature outside this dataset. Additionally, since it takes time for articles to accumulate citations, high-quality recent publications may not yet have reached a representative citation count, which may bias the results.
To address these limitations and continue advancing our understanding of AI in healthcare, future research should consider incorporating articles from multiple languages and additional databases, such as the Web of Science. Further refinements could include the adoption of advanced data analysis techniques such as latent semantic analysis (LSA) and latent Dirichlet allocation (LDA). These methods are well-suited for identifying deeper semantic patterns that might not be evident through traditional keyword co-occurrence analysis, thereby providing a more comprehensive view of the literature across diverse linguistic contexts. The implementation of these techniques would enhance the global perspective and inclusivity of the analysis. Furthermore, ongoing studies could focus on the real-time tracking of AI advancements and their immediate impact on healthcare practices, ensuring that the latest innovations are quickly and accurately reflected in the literature. This proactive stance will help keep healthcare professionals well-informed and prepared to integrate state-of-the-art AI tools into their practices, ultimately improving patient care worldwide.
Moreover, this study did not explore other significant factors that may affect the integration of AI in healthcare, such as policies, regulations, and the overall healthcare system infrastructure. Future research could be designed as follow-up projects to investigate these areas. Such studies could examine how different regulatory frameworks and policy environments influence the adoption and implementation of AI technologies in healthcare settings. Additionally, assessing the readiness and capacity of healthcare systems to integrate AI solutions, including evaluating technological infrastructure and the preparedness of healthcare professionals, would provide a more comprehensive understanding. Exploring the ethical, legal, and social implications of AI, as well as its economic impact on healthcare costs and benefits, would further enrich the field. These areas of research are crucial for developing strategies and policies to support the effective and responsible implementation of AI in healthcare systems globally.