Informatics, Volume 11, Issue 2 (June 2024) – 29 articles

Cover Story: The increasing interest in MS profiling for arthropod identification has led to the creation of guidelines for sample preparation and for assessing MS spectra quality, an element that is currently lacking. In the present work, a bioinformatics tool, MSProfileR, was created that integrates a quality control system for detecting and excluding outlier MS profiles and optimises the process of MS spectra analysis, including the addition of spectra metadata. It was developed in an R environment and offers a user-friendly web interface built with the R Shiny framework. Its application to two arthropod spectra datasets in our study highlights its superiority to manual classification. MSProfileR is open-source software that can be used by the scientific community, particularly entomologists, without any need for programming expertise.
12 pages, 1449 KiB  
Review
Identifying Long COVID Definitions, Predictors, and Risk Factors in the United States: A Scoping Review of Data Sources Utilizing Electronic Health Records
by Rayanne A. Luke, George Shaw, Jr., Geetha Saarunya and Abolfazl Mollalo
Informatics 2024, 11(2), 41; https://doi.org/10.3390/informatics11020041 - 14 Jun 2024
Abstract
This scoping review explores the potential of electronic health records (EHR)-based studies to characterize long COVID. We screened all peer-reviewed publications in the English language from PubMed/MEDLINE, Scopus, and Web of Science databases until 14 September 2023, to identify the studies that defined or characterized long COVID based on data sources that utilized EHR in the United States, regardless of study design. We identified only 17 articles meeting the inclusion criteria. Respiratory conditions were consistently significant in all studies, followed by poor well-being features (n = 14, 82%) and cardiovascular conditions (n = 12, 71%). Some articles (n = 7, 41%) used a long COVID-specific marker to define the study population, relying mainly on ICD-10 codes and clinical visits for post-COVID-19 conditions. Among studies exploring plausible long COVID (n = 10, 59%), the most common methods were RT-PCR and antigen tests. The time delay for EHR data extraction post-test varied, ranging from four weeks to more than three months; however, most studies considering plausible long COVID used a waiting period of 28 to 31 days. Our findings suggest a limited utilization of EHR-derived data sources in defining long COVID, with only 59% of these studies incorporating a validation step.

17 pages, 2966 KiB  
Article
Analysis of the Epidemic Curve of the Waves of COVID-19 Using Integration of Functions and Neural Networks in Peru
by Oliver Amadeo Vilca Huayta, Adolfo Carlos Jimenez Chura, Carlos Boris Sosa Maydana and Alioska Jessica Martínez García
Informatics 2024, 11(2), 40; https://doi.org/10.3390/informatics11020040 - 7 Jun 2024
Abstract
The coronavirus (COVID-19) pandemic continues to claim victims. According to the World Health Organization, in the 28 days leading up to 25 February 2024 alone, the number of deaths from COVID-19 was 7141. In this work, we aimed to model the waves of COVID-19 through artificial neural networks (ANNs) and the sigmoidal–Boltzmann model. The study variable was the global cumulative number of deaths according to days, based on the Peru dataset. Additionally, the variables were adapted to determine the correlation between social isolation measures and death rates, which constitutes a novel contribution. A quantitative methodology was used that implemented a non-experimental, longitudinal, and correlational design. The study was retrospective. The results show that the sigmoidal and ANN models were reasonably representative and could help to predict the spread of COVID-19 over the course of multiple waves. Furthermore, the results were precise, with a Pearson correlation coefficient greater than 0.999. The computational sigmoidal–Boltzmann model was also time-efficient. Moreover, the Spearman correlation between social isolation measures and death rates was 0.77, which is acceptable considering that the social isolation variable is qualitative. Finally, we concluded that social isolation measures had a significant effect on reducing deaths from COVID-19.
(This article belongs to the Section Health Informatics)
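The sigmoidal–Boltzmann model named in this abstract can be fitted to a cumulative-death curve with ordinary nonlinear least squares. A minimal sketch on synthetic data (not the Peru dataset; all parameter values below are illustrative assumptions):

```python
import numpy as np
from scipy.optimize import curve_fit

def boltzmann(t, a1, a2, t0, dt):
    """Sigmoidal-Boltzmann curve: rises from plateau a1 to plateau a2,
    with midpoint t0 and transition width dt."""
    return a2 + (a1 - a2) / (1.0 + np.exp((t - t0) / dt))

# Synthetic cumulative-death series for a single epidemic wave
t = np.arange(0, 200, dtype=float)
rng = np.random.default_rng(0)
y = boltzmann(t, 0.0, 10000.0, 90.0, 12.0) + rng.normal(0.0, 50.0, t.size)

# Fit the four parameters; p0 gives rough starting guesses
params, _ = curve_fit(boltzmann, t, y, p0=[0.0, y.max(), t.mean(), 10.0])
a1, a2, t0, dt = params
```

The fitted midpoint `t0` locates the inflection of the wave and `a2` estimates its final cumulative toll, which is how such a model can summarize one wave at a time.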

25 pages, 1047 KiB  
Article
MSProfileR: An Open-Source Software for Quality Control of Matrix-Assisted Laser Desorption Ionization–Time of Flight Spectra
by Refka Ben Hamouda, Bertrand Estellon, Khalil Himet, Aimen Cherif, Hugo Marthinet, Jean-Marie Loreau, Gaëtan Texier, Samuel Granjeaud and Lionel Almeras
Informatics 2024, 11(2), 39; https://doi.org/10.3390/informatics11020039 - 6 Jun 2024
Abstract
In the early 2000s, matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) emerged as a performant and relevant tool for identifying micro-organisms. Since then, it has become practically essential for identifying bacteria in microbiological diagnostic laboratories. In the last decade, it was successfully applied to arthropod identification, allowing researchers to distinguish vectors from non-vectors of infectious diseases. However, identification failures are not rare, hampering its wide use. Failure is generally attributed either to the absence of MS spectra of the corresponding species in the database or to the insufficient quality of query MS spectra (i.e., lower intensity and diversity of detected MS peaks). To avoid matching errors due to non-compliant spectra, a strategy for detecting and excluding outlier MS profiles became necessary. To this end, we created MSProfileR, an R package that provides, via a simple installation, a bioinformatics tool integrating a quality control system for MS spectra and an analysis pipeline including peak detection and MS spectra comparisons. MSProfileR can also add metadata concerning the sample from which the spectra are derived. MSProfileR was developed in the R environment and offers a user-friendly web interface using the R Shiny framework. It runs on Microsoft Windows as a web browser application, accessible via the package's GitHub link (v.3.10.0). MSProfileR is therefore accessible to non-computer specialists and is freely available to the scientific community. We evaluated MSProfileR using two datasets consisting exclusively of MS spectra from arthropods. In addition to coherent sample classification, outlier MS spectra were detected in each dataset, confirming the value of MSProfileR.

32 pages, 2505 KiB  
Article
Chatbot Technology Use and Acceptance Using Educational Personas
by Fatima Ali Amer jid Almahri, David Bell and Zameer Gulzar
Informatics 2024, 11(2), 38; https://doi.org/10.3390/informatics11020038 - 3 Jun 2024
Abstract
Chatbots are computer programs that mimic human conversation using text, voice, or both. Users' acceptance of chatbots is strongly influenced by their persona. As users interact with chatbots, they develop a sense of familiarity, which makes the chatbots more approachable, fosters favorable opinions of the technology, and encourages further interaction. In this study, we examine the moderating effects of persona traits on students' acceptance and use of chatbot technology at higher education institutions in the UK, using an Extended Unified Theory of Acceptance and Use of Technology (Extended UTAUT2). Through a self-administered questionnaire survey, data were collected from 431 undergraduate and postgraduate computer science students. The study employed a Likert scale to measure the variables associated with chatbot acceptance. To evaluate the gathered data, Structural Equation Modelling (SEM) coupled with multi-group analysis (MGA) in SmartPLS3 was used. The estimated Cronbach's alpha supported the reliability of the findings. The results showed that the factors influencing students' adoption and use of chatbot technology were habit, effort expectancy, and performance expectancy. Additionally, grades and educational level were found not to moderate the correlations in the Extended UTAUT2 model. These results are important for improving user experience, and they have implications for academics, researchers, and organizations, especially in the context of native chatbots.
(This article belongs to the Section Human-Computer Interaction)
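The Cronbach's alpha reliability check cited in this abstract follows a simple formula: alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). A small self-contained sketch (the score matrix is invented for illustration):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, k_items) score matrix.

    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score)).
    Values near 1 indicate high internal consistency of the scale.
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_vars / total_var)

# Two identical items (perfect internal consistency) -> alpha is exactly 1
scores = np.array([[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]])
alpha = cronbach_alpha(scores)  # -> 1.0
```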

14 pages, 611 KiB  
Article
Analysing the Impact of Generative AI in Arts Education: A Cross-Disciplinary Perspective of Educators and Students in Higher Education
by Sara Sáez-Velasco, Mario Alaguero-Rodríguez, Vanesa Delgado-Benito and Sonia Rodríguez-Cano
Informatics 2024, 11(2), 37; https://doi.org/10.3390/informatics11020037 - 3 Jun 2024
Abstract
Generative AI refers specifically to a class of Artificial Intelligence models that use existing data to create new content that reflects the underlying patterns of real-world data. This contribution presents a study that aims to show what the current perception of arts educators and students of arts education is with regard to generative Artificial Intelligence. It is a qualitative research study using focus groups as a data collection technique in order to obtain an overview of the participating subjects. The research design consists of two phases: (1) generation of illustrations from prompts by students, professionals and a generative AI tool; and (2) focus groups with students (N = 5) and educators (N = 5) of artistic education. In general, the perception of educators and students coincides in the usefulness of generative AI as a tool to support the generation of illustrations. However, they agree that the human factor cannot be replaced by generative AI. The results obtained allow us to conclude that generative AI can be used as a motivating educational strategy for arts education.
(This article belongs to the Topic AI Chatbots: Threat or Opportunity?)

15 pages, 1056 KiB  
Communication
Safety of Human–Artificial Intelligence Systems: Applying Safety Science to Analyze Loopholes in Interactions between Human Organizations, Artificial Intelligence, and Individual People
by Stephen Fox and Juan G. Victores
Informatics 2024, 11(2), 36; https://doi.org/10.3390/informatics11020036 - 29 May 2024
Abstract
Loopholes involve misalignments between rules about what should be done and what is actually done in practice. The focus of this paper is loopholes in interactions between human organizations' implementations of task-specific artificial intelligence and individual people. The importance of identifying and addressing loopholes is recognized in safety science and in applications of AI. Here, an examination is provided of loophole sources in interactions between human organizations and individual people. Then, it is explained how the introduction of task-specific AI applications can introduce new sources of loopholes. Next, an analytical framework, which is well-established in safety science, is applied to analyses of loopholes in interactions between human organizations, artificial intelligence, and individual people. The example used in the analysis is human–artificial intelligence systems in gig economy delivery driving work.
(This article belongs to the Section Human-Computer Interaction)

25 pages, 5139 KiB  
Article
Improving Minority Class Recall through a Novel Cluster-Based Oversampling Technique
by Takorn Prexawanprasut and Thepparit Banditwattanawong
Informatics 2024, 11(2), 35; https://doi.org/10.3390/informatics11020035 - 28 May 2024
Abstract
In this study, we propose an approach to address the pressing issue of false negative errors by enhancing minority class recall within imbalanced data sets commonly encountered in machine learning applications. Through the utilization of a cluster-based oversampling technique in conjunction with an information entropy evaluation, our approach effectively targets areas of ambiguity inherent in the data set. An extensive evaluation across a diverse range of real-world data sets characterized by inter-cluster complexity demonstrates the superior performance of our method compared to that of existing oversampling techniques. Particularly noteworthy is its significant improvement within the Delinquency Telecom data set, where it achieves a remarkable increase of up to 30.54 percent in minority class recall compared to the original data set. This notable reduction in false negative errors underscores the importance of our methodology in accurately identifying and classifying instances from underrepresented classes, thereby enhancing model performance in imbalanced data scenarios.
(This article belongs to the Section Machine Learning)
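The general idea of cluster-based oversampling can be sketched independently of the paper's specific entropy-guided method (which is not reproduced here): cluster the minority class first, then resample each cluster in proportion to its size, so the minority class's internal structure is preserved rather than flattened by uniform duplication. A minimal sketch on synthetic data:

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_oversample(X_min, n_target, n_clusters=2, seed=0):
    """Oversample minority samples cluster by cluster: each KMeans cluster
    contributes resampled points in proportion to its original size."""
    rng = np.random.default_rng(seed)
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(X_min)
    chunks = []
    for c in range(n_clusters):
        cluster = X_min[labels == c]
        k = round(n_target * len(cluster) / len(X_min))  # proportional share
        idx = rng.integers(0, len(cluster), size=k)      # sample with replacement
        chunks.append(cluster[idx])
    return np.vstack(chunks)

# Synthetic minority class made of two well-separated sub-groups (20 + 30 points)
X_min = np.vstack([np.random.default_rng(1).normal(0.0, 1.0, (20, 2)),
                   np.random.default_rng(2).normal(5.0, 1.0, (30, 2))])
X_res = cluster_oversample(X_min, n_target=200)
```

Uniform random duplication would let the larger sub-group dominate by chance; proportional per-cluster sampling keeps both sub-groups represented in the balanced set.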

18 pages, 2236 KiB  
Article
An Intelligent Model and Methodology for Predicting Length of Stay and Survival in a Critical Care Hospital Unit
by Enrique Maldonado Belmonte, Salvador Oton-Tortosa, Jose-Maria Gutierrez-Martinez and Ana Castillo-Martinez
Informatics 2024, 11(2), 34; https://doi.org/10.3390/informatics11020034 - 17 May 2024
Abstract
This paper describes the design and methodology for the development and validation of an intelligent model in the healthcare domain. The generated model relies on artificial intelligence techniques, aiming to predict the length of stay and survival rate of patients admitted to a critical care hospitalization unit with better results than predictive systems using scoring. The proposed methodology is based on the following stages: preliminary data analysis, analysis of the architecture and systems integration model, the big data model approach, information structure and process development, and the application of machine learning techniques. This investigation substantiates that automated machine learning models significantly surpass traditional prediction techniques for patient outcomes within critical care settings. Specifically, the machine learning-based model attained an F1 score of 0.351 for mortality forecast and 0.615 for length of stay, in contrast to the traditional scoring model's F1 scores of 0.112 for mortality and 0.412 for length of stay. These results strongly support the advantages of integrating advanced computational techniques in critical healthcare environments. It is also shown that the use of integration architectures allows for improving the quality of the information by providing a data repository large enough to generate intelligent models. From a clinical point of view, obtaining more accurate results in the estimation of the ICU stay and survival offers the possibility of expanding the uses of the model to the identification and prioritization of patients who are candidates for admission to the ICU, as well as the management of patients with specific conditions.
(This article belongs to the Section Health Informatics)

25 pages, 6739 KiB  
Article
QUMA: Quantum Unified Medical Architecture Using Blockchain
by Akoramurthy Balasubramaniam and B. Surendiran
Informatics 2024, 11(2), 33; https://doi.org/10.3390/informatics11020033 - 17 May 2024
Abstract
A significant increase in the demand for quality healthcare has resulted from people becoming more aware of health issues. With blockchain, healthcare providers may safely share patient information electronically, which is especially important given the sensitive nature of the data contained inside them. However, flaws in the current blockchain design have surfaced since the dawn of quantum computing systems. The study proposes a novel quantum-inspired blockchain system (Qchain) and constructs a unique entangled quantum medical record (EQMR) system with an emphasis on privacy and security. This Qchain relies on entangled states to connect its blocks. The automated production of the chronology indicator reduces storage capacity requirements by connecting entangled BloQ (blocks with quantum properties) to controlled activities. We use one qubit to store the hash value of each block. A lot of information regarding the quantum internet is included in the protocol for the entangled quantum medical record (EQMR). The EQMR can be accessed in Medical Internet of Things (M-IoT) systems that are kept private and secure, and their whereabouts can be monitored in the event of an emergency. The protocol also uses quantum authentication in place of more conventional methods like encryption and digital signatures. Mathematical research shows that the quantum converged blockchain (QCB) is highly safe against attacks such as external attacks, intercept-measure-repeat attacks, and entanglement-measure attacks. We present the reliability and auditability evaluations of the entangled BloQ, along with the quantum circuit design for computing the hash value. There is also a comparison between the suggested approach and several other quantum blockchain designs.
(This article belongs to the Section Health Informatics)

13 pages, 2602 KiB  
Article
Performance Evaluation of Deep Learning Models for Classifying Cybersecurity Attacks in IoT Networks
by Fray L. Becerra-Suarez, Victor A. Tuesta-Monteza, Heber I. Mejia-Cabrera and Juan Arcila-Diaz
Informatics 2024, 11(2), 32; https://doi.org/10.3390/informatics11020032 - 17 May 2024
Abstract
The Internet of Things (IoT) presents great potential in various fields such as home automation, healthcare, and industry, among others, but its infrastructure, the use of open source code, and lack of software updates make it vulnerable to cyberattacks that can compromise access to data and services, thus making it an attractive target for hackers. The complexity of cyberattacks has increased, posing a greater threat to public and private organizations. This study evaluated the performance of deep learning models for classifying cybersecurity attacks in IoT networks, using the CICIoT2023 dataset. Three architectures based on DNN, LSTM, and CNN were compared, highlighting their differences in layers and activation functions. The results show that the CNN architecture outperformed the others in accuracy and computational efficiency, with an accuracy rate of 99.10% for multiclass classification and 99.40% for binary classification. The importance of data standardization and proper hyperparameter selection is emphasized. These results demonstrate that the CNN-based model emerges as a promising option for detecting cyber threats in IoT environments, supporting the relevance of deep learning in IoT network security.

17 pages, 987 KiB  
Article
ACME: A Classification Model for Explaining the Risk of Preeclampsia Based on Bayesian Network Classifiers and a Non-Redundant Feature Selection Approach
by Franklin Parrales-Bravo, Rosangela Caicedo-Quiroz, Elianne Rodríguez-Larraburu and Julio Barzola-Monteses
Informatics 2024, 11(2), 31; https://doi.org/10.3390/informatics11020031 - 17 May 2024
Abstract
While preeclampsia is the leading cause of maternal death in Guayas province (Ecuador), its causes have not yet been studied in depth. The objective of this research is to build a Bayesian network classifier that diagnoses cases of preeclampsia while facilitating the understanding of the causes that generate this disease. Data for the years 2017 through 2023 were gathered retrospectively from the medical histories of patients treated at the "IESS Los Ceibos" hospital in Guayaquil, Ecuador. The Naïve Bayes (NB), Chow–Liu Tree-Augmented Naïve Bayes (TANcl), and Semi Naïve Bayes (FSSJ) algorithms were considered for building explainable classification models, and a Non-Redundant Feature Selection approach (NoReFS) is proposed to perform the feature selection task. The model trained with TANcl and NoReFS was the best of them, with an accuracy close to 90%. According to the best model, patients who are above 35 years of age, have a severe vaginal infection, live in a rural area, use tobacco, have a family history of diabetes, and have a personal history of hypertension are at high risk of developing preeclampsia.
(This article belongs to the Section Health Informatics)

19 pages, 1405 KiB  
Article
Investigating User Experience of VR Art Exhibitions: The Impact of Immersion, Satisfaction, and Expectation Confirmation
by Lin Cheng, Junping Xu and Younghwan Pan
Informatics 2024, 11(2), 30; https://doi.org/10.3390/informatics11020030 - 16 May 2024
Abstract
As an innovative form in the digital age, VR art exhibitions have attracted increasing attention. This study aims to explore the key factors that influence visitors' continuance intention toward VR art exhibitions, drawing on the expectation confirmation model and experience economy theory, and to explore ways to enhance visitor immersion in virtual environments. We conducted a quantitative study of 235 art professionals and enthusiasts, using partial least squares structural equation modeling (PLS-SEM) to examine the relationships between Confirmation (CON), Perceived Usefulness (PU), Aesthetic Experiences (AE), Escapist Experiences (EE), Satisfaction (SAT), and Continuance Intention (CI). The results show that confirmation plays a key role in shaping PU, AE, and EE, which in turn positively affect visitors' SAT. Among these factors, AE positively impacts PU, but EE has no such impact. A comprehensive theoretical model was then constructed based on the findings. This research provides empirical support for designing and improving VR art exhibitions. It also sheds light on the application of expectation confirmation theory and experience economy theory in the art field to improve user experience, and it provides theoretical guidance for the sustainable development of virtual digital art environments.
(This article belongs to the Topic Theories and Applications of Human-Computer Interaction)

11 pages, 558 KiB  
Article
Fuzzy Classification Approach to Select Learning Objects Based on Learning Styles in Intelligent E-Learning Systems
by Ibtissam Azzi, Abdelhay Radouane, Loubna Laaouina, Adil Jeghal, Ali Yahyaouy and Hamid Tairi
Informatics 2024, 11(2), 29; https://doi.org/10.3390/informatics11020029 - 15 May 2024
Abstract
In e-learning systems, although the automatic detection of learning styles is considered the key element in the adaptation process, it is not the end goal of that process. To accomplish adaptation, it is also necessary to automatically select learning objects according to the detected styles. Classification techniques are the most widely used means of automatically selecting learning objects by processing data derived from learning object metadata. With these techniques, considerable results have been obtained by several approaches that map learning objects to different teaching strategies and then map those strategies to the identified learning styles. However, these approaches have limitations related to robustness: they do not directly map learning object metadata elements to learning style dimensions, and they do not consider the fuzzy nature of learning objects, since any learning object can suit different learning styles to varying degrees. This highlights the need to remedy this shortcoming. Our work concerns the automatic selection of learning objects, and we propose an approach that uses fuzzy classification to select learning objects based on learning styles. In this approach, the metadata of each learning object, compliant with the Institute of Electrical and Electronics Engineers (IEEE) standard, are stored in a database as an Extensible Markup Language (XML) file. The Fuzzy C-Means algorithm is used, on the one hand, to assign fuzzy suitability rates to the stored learning objects and, on the other hand, to cluster them into the categories of the Felder and Silverman learning styles model. The experimental results show the performance of our approach.
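The Fuzzy C-Means idea underlying this approach is that, unlike hard clustering, each item receives a membership degree in every cluster, so one learning object can suit several style categories at varying degrees. A minimal textbook-style implementation (the one-dimensional "learning object" features below are invented for illustration):

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, n_iter=100, seed=0):
    """Plain Fuzzy C-Means. Returns cluster centers and a membership matrix U,
    where U[i, j] is the degree to which sample i belongs to cluster j
    (each row sums to 1), i.e. a fuzzy 'suitability rate' per cluster."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)          # random valid memberships
    for _ in range(n_iter):
        W = U ** m                              # fuzzified weights
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2.0 / (m - 1.0)))      # standard membership update
        U /= U.sum(axis=1, keepdims=True)
    return centers, U

# Two obvious groups of one-dimensional feature vectors
X = np.array([[0.10], [0.20], [0.15], [0.90], [1.00], [0.95]])
centers, U = fuzzy_c_means(X, c=2)
```

Here `U` makes the fuzziness explicit: a borderline object would show comparable membership in both clusters instead of being forced into one.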

15 pages, 1551 KiB  
Review
Variations in Using Diagnosis Codes for Defining Age-Related Macular Degeneration Cohorts
by Fritz Gerald Paguiligan Kalaw, Jimmy S. Chen and Sally L. Baxter
Informatics 2024, 11(2), 28; https://doi.org/10.3390/informatics11020028 - 1 May 2024
Abstract
Data harmonization is vital for secondary electronic health record data analysis, especially when combining data from multiple sources. Currently, there is a gap in knowledge as to how studies identify cohorts of patients with age-related macular degeneration (AMD), a leading cause of blindness. We hypothesize that there is variation in using medical condition codes to define cohorts of AMD patients that can lead to either the under- or overrepresentation of such cohorts. This study identified articles studying AMD using the International Classification of Diseases (ICD-9, ICD-9-CM, ICD-10, and ICD-10-CM). The data elements reviewed included the year of publication; dataset origin (Veterans Affairs, registry, national or commercial claims database, and institutional EHR); total number of subjects; and ICD codes used. A total of thirty-seven articles were reviewed. Six (16%) articles used cohort definitions from two ICD terminologies. The Medicare database was the most used dataset (14, 38%), and there was a noted increase in the use of other datasets in the last few years. We identified substantial variation in the use of ICD codes for AMD. For the studies that used ICD-10 terminologies, 7 (out of 9, 78%) defined the AMD codes correctly, whereas, for the studies that used ICD-9 and 9-CM terminologies, only 2 (out of 30, 7%) defined and utilized the appropriate AMD codes (p = 0.0001). Of the 43 cohort definitions used from 37 articles, 31 (72%) had missing or incomplete AMD codes used, and only 9 (21%) used the exact codes. Additionally, 13 articles (35%) captured ICD codes that were not within the scope of AMD diagnosis. Efforts to standardize data are needed to provide a reproducible research output.
(This article belongs to the Special Issue Health Informatics: Feature Review Papers)
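The under- and overrepresentation problem the review describes comes down to set differences between code lists. A toy sketch: H35.31 (nonexudative AMD) and H35.32 (exudative AMD) are real ICD-10-CM code families, but the three "study" definitions below are invented for illustration, not taken from the reviewed articles:

```python
# Hypothetical cohort definitions expressed as ICD-10-CM code sets
study_a = {"H35.31", "H35.32"}            # dry + wet AMD: complete definition
study_b = {"H35.31"}                       # omits wet AMD -> undercounts cohort
study_c = {"H35.31", "H35.32", "H35.30"}   # adds unspecified degeneration -> may overcount

def compare(reference, candidate):
    """Report which reference codes a candidate definition misses or adds."""
    return {"missing": sorted(reference - candidate),
            "extra": sorted(candidate - reference)}

print(compare(study_a, study_b))  # {'missing': ['H35.32'], 'extra': []}
print(compare(study_a, study_c))  # {'missing': [], 'extra': ['H35.30']}
```

Auditing definitions this way, against an agreed reference code set, is one concrete form the "efforts to standardize data" called for above could take.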

15 pages, 2703 KiB  
Article
Prompt Design through ChatGPT’s Zero-Shot Learning Prompts: A Case of Cost-Sensitive Learning on a Water Potability Dataset
by Kokisa Phorah, Malusi Sibiya and Mbuyu Sumbwanyambe
Informatics 2024, 11(2), 27; https://doi.org/10.3390/informatics11020027 - 28 Apr 2024
Viewed by 1170
Abstract
Datasets used in AI applications for human health require careful selection. In healthcare, machine learning (ML) models are fine-tuned to reduce errors, and our study focuses on minimizing errors by generating code snippets for cost-sensitive learning using water potability datasets. Water potability ensures [...] Read more.
Datasets used in AI applications for human health require careful selection. In healthcare, machine learning (ML) models are fine-tuned to reduce errors, and our study focuses on minimizing errors by generating code snippets for cost-sensitive learning using water potability datasets. Water potability ensures safe drinking water through various scientific methods, with our approach using ML algorithms for prediction. We preprocess data with ChatGPT-generated code snippets and aim to demonstrate how zero-shot learning prompts in ChatGPT can produce reliable code snippets that cater to cost-sensitive learning. Our dataset is sourced from Kaggle. We compare model performance metrics of logistic regressors and gradient boosting classifiers without additional code fine-tuning to check the accuracy. Other classifier performance metrics are compared with results of the top 5 code authors on the Kaggle scoreboard. Cost-sensitive learning is crucial in domains like healthcare to prevent misclassifications with serious consequences, such as type II errors in water potability assessment. Full article
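The abstract does not reproduce the generated snippets, but the decision rule that cost-sensitive learning ultimately implements can be sketched without any libraries. In this hypothetical illustration (the cost values and function names are ours, not the paper's), the classification threshold is shifted according to the relative costs of false positives and false negatives:

```python
def cost_sensitive_threshold(cost_fp: float, cost_fn: float) -> float:
    """Bayes-optimal probability threshold for binary classification when
    false positives and false negatives carry unequal misclassification costs.
    Derivation: predict positive iff (1 - p) * cost_fp < p * cost_fn."""
    return cost_fp / (cost_fp + cost_fn)

def classify(prob_potable: float, cost_fp: float, cost_fn: float) -> int:
    """Predict potable (1) only when the model's probability clears the
    cost-adjusted threshold; otherwise predict not potable (0)."""
    return 1 if prob_potable >= cost_sensitive_threshold(cost_fp, cost_fn) else 0

# Illustrative costs: declaring unsafe water potable (a false positive) is
# assumed to cost ten times the reverse error, pushing the threshold
# from 0.5 up to ~0.91, so a 0.80 score is rejected.
print(classify(0.80, cost_fp=10, cost_fn=1))
```

With equal costs the rule reduces to the familiar 0.5 threshold, which is why cost-sensitive learning matters precisely in domains like potability assessment where the two error types are not symmetric.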

25 pages, 31666 KiB  
Article
Every Thing Can Be a Hero! Narrative Visualization of Person, Object, and Other Biographies
by Jakob Kusnick, Eva Mayr, Kasra Seirafi, Samuel Beck, Johannes Liem and Florian Windhager
Informatics 2024, 11(2), 26; https://doi.org/10.3390/informatics11020026 - 26 Apr 2024
Viewed by 1419
Abstract
Knowledge communication in cultural heritage and digital humanities currently faces two challenges, which this paper addresses: On the one hand, data-driven storytelling in these fields has mainly focused on human protagonists, while other essential entities (such as artworks and artifacts, institutions, or places) [...] Read more.
Knowledge communication in cultural heritage and digital humanities currently faces two challenges, which this paper addresses: On the one hand, data-driven storytelling in these fields has mainly focused on human protagonists, while other essential entities (such as artworks and artifacts, institutions, or places) have been neglected. On the other hand, storytelling tools rarely support the larger chains of data practices, which are required to generate and shape the data and visualizations needed for such stories. This paper introduces the InTaVia platform, which has been developed to bridge these gaps. It supports the practices of data retrieval, creation, curation, analysis, and communication with coherent visualization support for multiple types of entities. We illustrate the added value of this open platform for storytelling with four case studies, focusing on (a) the life of Albrecht Dürer (person biography), (b) the Saliera salt cellar by Benvenuto Cellini (object biography), (c) the artist community of Lake Tuusula (group biography), and (d) the history of the Hofburg building complex in Vienna (place biography). Numerous suggestions for future research arise from this undertaking. Full article
(This article belongs to the Special Issue Digital Humanities and Visualization)

38 pages, 917 KiB  
Article
A Survey of Vision-Based Methods for Surface Defects’ Detection and Classification in Steel Products
by Alaa Aldein M. S. Ibrahim and Jules-Raymond Tapamo
Informatics 2024, 11(2), 25; https://doi.org/10.3390/informatics11020025 - 23 Apr 2024
Viewed by 1506
Abstract
In the competitive landscape of steel-strip production, ensuring the high quality of steel surfaces is paramount. Traditionally, human visual inspection has been the primary method for detecting defects, but it suffers from limitations such as reliability, cost, processing time, and accuracy. Visual inspection [...] Read more.
In the competitive landscape of steel-strip production, ensuring the high quality of steel surfaces is paramount. Traditionally, human visual inspection has been the primary method for detecting defects, but it suffers from limitations in reliability, cost, processing time, and accuracy. Visual inspection technologies, particularly automation techniques, have been introduced to address these shortcomings. This paper conducts a thorough survey examining vision-based methodologies for detecting and classifying surface defects on steel products. These methodologies encompass statistical, spectral, and texture-segmentation-based methods, as well as machine-learning-driven approaches. Furthermore, various classification algorithms, categorized into supervised, semi-supervised, and unsupervised techniques, are discussed. Additionally, the paper outlines future directions for research. Full article
(This article belongs to the Special Issue New Advances in Semantic Recognition and Analysis)

27 pages, 978 KiB  
Article
Machine Learning and Deep Learning Sentiment Analysis Models: Case Study on the SENT-COVID Corpus of Tweets in Mexican Spanish
by Helena Gomez-Adorno, Gemma Bel-Enguix, Gerardo Sierra, Juan-Carlos Barajas and William Álvarez
Informatics 2024, 11(2), 24; https://doi.org/10.3390/informatics11020024 - 23 Apr 2024
Viewed by 1301
Abstract
This article presents a comprehensive evaluation of traditional machine learning and deep learning models in analyzing sentiment trends within the SENT-COVID Twitter corpus, curated during the COVID-19 pandemic. The corpus, filtered by COVID-19 related keywords and manually annotated for polarity, is a pivotal [...] Read more.
This article presents a comprehensive evaluation of traditional machine learning and deep learning models in analyzing sentiment trends within the SENT-COVID Twitter corpus, curated during the COVID-19 pandemic. The corpus, filtered by COVID-19 related keywords and manually annotated for polarity, is a pivotal resource for conducting sentiment analysis experiments. Our study investigates various approaches, including classic vector-based systems such as word2vec, doc2vec, and diverse phrase modeling techniques, alongside Spanish pre-trained BERT models. We assess the performance of readily available sentiment analysis libraries for Python users, including TextBlob, VADER, and Pysentimiento. Additionally, we implement and evaluate traditional classification algorithms such as Logistic Regression, Naive Bayes, Support Vector Machines, and simple neural networks like Multilayer Perceptron. Throughout the research, we explore different dimensionality reduction techniques. This methodology enables a precise comparison among classification methods, with BETO-uncased achieving the highest accuracy of 0.73 on the test set. Our findings underscore the efficacy and applicability of traditional machine learning and deep learning models in analyzing sentiment trends within the context of low-resource Spanish language scenarios and emerging topics like COVID-19. Full article
(This article belongs to the Section Machine Learning)

24 pages, 1669 KiB  
Review
Systematic Review of English/Arabic Machine Translation Postediting: Implications for AI Application in Translation Research and Pedagogy
by Lamis Ismail Omar and Abdelrahman Abdalla Salih
Informatics 2024, 11(2), 23; https://doi.org/10.3390/informatics11020023 - 22 Apr 2024
Viewed by 2291
Abstract
The twenty-first century has witnessed an extensive evolution in translation practice thanks to the accelerated progress in machine translation tools and software. With the increased scalability and availability of machine translation software empowered by artificial intelligence, translation students and practitioners have continued to [...] Read more.
The twenty-first century has witnessed an extensive evolution in translation practice thanks to the accelerated progress in machine translation tools and software. With the increased scalability and availability of machine translation software empowered by artificial intelligence, translation students and practitioners have continued to show an unwavering reliance on automatic translation systems. Academically, there is little recognition of the need to develop machine translation skillsets amongst translation learners in English/Arabic translation programs. This study provides a systematic review of machine translation postediting with reference to English/Arabic machine translation. Using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, the paper reviewed 60 studies conducted since the beginning of the twenty-first century and classified them by different metrics to identify relevant trends and research gaps. The results showed that research on the topic has been primarily prescriptive, concentrating on evaluating and developing machine translation software while neglecting aspects related to translators’ skillsets and competencies. The paper highlights the significance of postediting as an important digital literacy skill to be developed among Arabic translation students and the need to bridge the existing research and pedagogic gap in MT education. Full article
(This article belongs to the Special Issue Digital Humanities and Visualization)

23 pages, 1989 KiB  
Article
Optimization of Obstructive Sleep Apnea Management: Novel Decision Support via Unsupervised Machine Learning
by Arthur Pinheiro de Araújo Costa, Adilson Vilarinho Terra, Claudio de Souza Rocha Junior, Igor Pinheiro de Araújo Costa, Miguel Ângelo Lellis Moreira, Marcos dos Santos, Carlos Francisco Simões Gomes and Antonio Sergio da Silva
Informatics 2024, 11(2), 22; https://doi.org/10.3390/informatics11020022 - 19 Apr 2024
Viewed by 1277
Abstract
This study addresses Obstructive Sleep Apnea (OSA), which impacts around 936 million adults globally. The research introduces a novel decision support method named the Communalities on Ranking and Objective Weights Method (CROWM), which employs principal component analysis (PCA), an unsupervised machine learning technique, and Multicriteria [...] Read more.
This study addresses Obstructive Sleep Apnea (OSA), which impacts around 936 million adults globally. The research introduces a novel decision support method named the Communalities on Ranking and Objective Weights Method (CROWM), which employs principal component analysis (PCA), an unsupervised machine learning technique, together with Multicriteria Decision Analysis (MCDA) to calculate performance criteria weights for Continuous Positive Airway Pressure (CPAP) devices, which are key in managing OSA, and to evaluate these devices. Uniquely, CROWM incorporates non-beneficial criteria in PCA and employs communalities to accurately represent the performance evaluation of alternatives within each resulting principal factor, allowing for a more accurate and robust analysis of alternatives and variables. This article aims to employ CROWM to evaluate CPAP devices for effectiveness in combating OSA, considering six performance criteria: resources, warranty, noise, weight, cost, and maintenance. Validated by established tests and sensitivity analysis against traditional methods, CROWM proves its consistency, efficiency, and superiority in decision-making support. This method is poised to influence assertive decision-making significantly, aiding healthcare professionals, researchers, and patients in selecting optimal CPAP solutions, thereby advancing patient care in an interdisciplinary research context. Full article
(This article belongs to the Topic Decision Science Applications and Models (DSAM))

28 pages, 706 KiB  
Article
Digital Transformation in Omani Higher Education: Assessing Student Adoption of Video Communication during the COVID-19 Pandemic
by Fatima Amer jid Almahri, Islam Elbayoumi Salem, Ahmed Mohamed Elbaz, Hassan Aideed and Zameer Gulzar
Informatics 2024, 11(2), 21; https://doi.org/10.3390/informatics11020021 - 19 Apr 2024
Cited by 1 | Viewed by 1178
Abstract
The COVID-19 pandemic has influenced many fields, such as communication, commerce, and education, and pushed business entities to adopt innovative technologies to continue their business operations. Students need to do the same, so it is essential to understand their acceptance of these technologies [...] Read more.
The COVID-19 pandemic has influenced many fields, such as communication, commerce, and education, and pushed business entities to adopt innovative technologies to continue their business operations. Students need to do the same, so it is essential to understand their acceptance of these technologies to make them more usable for students. This paper employs the Unified Theory of Acceptance and Use of Technology 2 (UTAUT2) to identify the factors that influenced students’ acceptance and use of different online communication services as the primary tool for learning during the COVID-19 pandemic. Six factors of UTAUT2 were used to measure the acceptance and use of video communication services at the Business College of the University of Technology and Applied Sciences. Two hundred students completed our online survey. The results demonstrated that social influence, facilitating conditions, hedonic motivation, and habit affect behavioral intention positively, while performance expectancy and effort expectancy have no effect on behavioral intention. Full article
(This article belongs to the Section Human-Computer Interaction)

20 pages, 3041 KiB  
Article
Artificial Intelligence Chatbots in Chemical Information Seeking: Narrative Educational Insights via a SWOT Analysis
by Johannes Pernaa, Topias Ikävalko, Aleksi Takala, Emmi Vuorio, Reija Pesonen and Outi Haatainen
Informatics 2024, 11(2), 20; https://doi.org/10.3390/informatics11020020 - 18 Apr 2024
Viewed by 1416
Abstract
Artificial intelligence (AI) chatbots are next-word predictors built on large language models (LLMs). There is great interest within the educational field for this new technology because AI chatbots can be used to generate information. In this theoretical article, we provide educational insights into [...] Read more.
Artificial intelligence (AI) chatbots are next-word predictors built on large language models (LLMs). There is great interest within the educational field in this new technology because AI chatbots can be used to generate information. In this theoretical article, we provide educational insights into the possibilities and challenges of using AI chatbots. These insights were produced by designing chemical information-seeking activities for chemistry teacher education, which were analyzed via the SWOT approach. The analysis revealed several internal and external possibilities and challenges. The key insight is that AI chatbots will change the way learners interact with information. For example, they enable the building of personal learning environments with ubiquitous access to information and AI tutors. Their ability to support chemistry learning is impressive. However, processing chemical information exposes a key limitation of current AI chatbots: they cannot yet handle multimodal chemical information. There are also ethical issues to address. Despite the benefits, wider educational adoption will take time. The diffusion can be supported by integrating LLMs into curricula, relying on open-source solutions, and training teachers with modern information literacy skills. This research presents theory-grounded examples of how to support the development of modern information literacy skills in the context of chemistry teacher education. Full article
(This article belongs to the Topic AI Chatbots: Threat or Opportunity?)

33 pages, 4209 KiB  
Article
A Machine Learning as a Service (MLaaS) Approach to Improve Marketing Success
by Ivo Pereira, Ana Madureira, Nuno Bettencourt, Duarte Coelho, Miguel Ângelo Rebelo, Carolina Araújo and Daniel Alves de Oliveira
Informatics 2024, 11(2), 19; https://doi.org/10.3390/informatics11020019 - 15 Apr 2024
Viewed by 1127
Abstract
The exponential growth of data in the digital age has led to a significant demand for innovative approaches to assess data in a manner that is both effective and efficient. Machine Learning as a Service (MLaaS) is a category of services that offers [...] Read more.
The exponential growth of data in the digital age has led to a significant demand for innovative approaches to assess data in a manner that is both effective and efficient. Machine Learning as a Service (MLaaS) is a category of services that offers considerable potential for organisations to extract valuable insights from their data while reducing the requirement for heavy technical expertise. This article explores the use of MLaaS within the realm of marketing applications. In this study, we provide a comprehensive analysis of MLaaS implementations and their benefits within the domain of marketing. Furthermore, we present a platform that possesses the capability to be customised and expanded to address marketing’s unique requirements. Three modules are introduced: Churn Prediction, One-2-One Product Recommendation, and Send Frequency Prediction. When applied to marketing, the proposed MLaaS system exhibits considerable promise for use in applications such as automated detection of client churn prior to its occurrence, individualised product recommendations, and send time optimisation. Our study revealed that AI-driven campaigns can improve both the Open Rate and Click Rate. This approach has the potential to enhance customer engagement and retention for businesses while enabling well-informed decisions by leveraging insights derived from consumer data. This work contributes to the existing body of research on MLaaS in marketing and offers practical insights for businesses seeking to utilise this approach to enhance their competitive edge in the contemporary data-oriented marketplace. Full article
(This article belongs to the Section Machine Learning)

12 pages, 504 KiB  
Article
Variations in Pattern of Social Media Engagement between Individuals with Chronic Conditions and Mental Health Conditions
by Elizabeth Ayangunna, Gulzar Shah, Kingsley Kalu, Padmini Shankar and Bushra Shah
Informatics 2024, 11(2), 18; https://doi.org/10.3390/informatics11020018 - 14 Apr 2024
Viewed by 967
Abstract
The use of the internet and supported apps is at historically unprecedented levels for the exchange of health information. The increasing use of the internet and social media platforms can affect patients’ health behavior. This study aims to assess the variations in patterns [...] Read more.
The use of the internet and supported apps for the exchange of health information is at historically unprecedented levels. The increasing use of the internet and social media platforms can affect patients’ health behavior. This study aims to assess the variations in patterns of social media engagement between individuals diagnosed with either chronic diseases or mental health conditions. Data from four iterations of the Health Information National Trends Survey Cycle 4 from 2017 to 2020 were used for this study, with a sample size of N = 16,092. To analyze the association between the independent variables, reflecting the presence of chronic conditions or mental health conditions, and various levels of social media engagement, descriptive statistics and logistic regression were conducted. Respondents who had at least one chronic condition were more likely to join an internet-based support group (adjusted odds ratio, AOR = 1.5; confidence interval, CI = 1.11–1.93) and watch a health-related video on YouTube (AOR = 1.2; CI = 1.01–1.36); respondents with a mental health condition were less likely to visit and share health information on social media, join an internet-based support group, and watch a health-related video on YouTube. Race, age, and educational level also influenced the likelihood of watching a health-related video on YouTube. Understanding patterns of engagement with health-related content on social media, and how online behavior differs according to patients’ medical conditions, can lead to the development of more effective and tailored public health interventions that leverage social media platforms. Full article
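The adjusted odds ratios reported above come from the study's multivariable logistic regression. As a simplified, self-contained illustration of how an (unadjusted) odds ratio is read, the following sketch uses hypothetical counts, not the survey's data:

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Unadjusted odds ratio from a 2x2 contingency table:
        a = exposed with outcome,     b = exposed without outcome,
        c = unexposed with outcome,   d = unexposed without outcome.
    The ratio compares the odds of the outcome between the two groups."""
    return (a / b) / (c / d)

# Hypothetical counts: 150 of 1000 respondents with a chronic condition
# joined an online support group, versus 100 of 1000 respondents without.
or_value = odds_ratio(150, 850, 100, 900)
print(round(or_value, 2))  # 1.59: odds roughly 1.6x higher in the exposed group
```

An OR above 1 (as with the AOR = 1.5 reported for joining an internet-based support group) indicates higher odds of the behavior in the exposed group; the adjusted version additionally controls for covariates such as race, age, and education.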
(This article belongs to the Section Health Informatics)

18 pages, 2301 KiB  
Article
Governors in the Digital Era: Analyzing and Predicting Social Media Engagement Using Machine Learning during the COVID-19 Pandemic in Japan
by Salama Shady, Vera Paola Shoda and Takashi Kamihigashi
Informatics 2024, 11(2), 17; https://doi.org/10.3390/informatics11020017 - 7 Apr 2024
Viewed by 1211
Abstract
This paper presents a comprehensive analysis of the social media posts of prefectural governors in Japan during the COVID-19 pandemic. It investigates the correlation between social media activity levels, governors’ characteristics, and engagement metrics. To predict citizen engagement of a specific tweet, machine [...] Read more.
This paper presents a comprehensive analysis of the social media posts of prefectural governors in Japan during the COVID-19 pandemic. It investigates the correlation between social media activity levels, governors’ characteristics, and engagement metrics. To predict citizen engagement of a specific tweet, machine learning models (MLMs) are trained using three feature sets. The first set includes variables representing profile- and tweet-related features. The second set incorporates word embeddings from three popular models, while the third set combines the first set with one of the embeddings. Additionally, seven classifiers are employed. The best-performing model utilizes the first feature set with FastText embedding and the XGBoost classifier. This study aims to collect governors’ COVID-19-related tweets, analyze engagement metrics, investigate correlations with governors’ characteristics, examine tweet-related features, and train MLMs for prediction. This paper’s main contributions are twofold. Firstly, it offers an analysis of social media engagement by prefectural governors during the COVID-19 pandemic, shedding light on their communication strategies and citizen engagement outcomes. Secondly, it explores the effectiveness of MLMs and word embeddings in predicting tweet engagement, providing practical implications for policymakers in crisis communication. The findings emphasize the importance of social media engagement for effective governance and provide insights into factors influencing citizen engagement. Full article
(This article belongs to the Section Social Informatics and Digital Humanities)

15 pages, 1407 KiB  
Article
Key Industry 4.0 Organisational Capability Prioritisation towards Organisational Transformation
by Stefan Smuts and Alta van der Merwe
Informatics 2024, 11(2), 16; https://doi.org/10.3390/informatics11020016 - 2 Apr 2024
Viewed by 1085
Abstract
Industry 4.0 aids organisational transformation powered by innovative technologies and connectivity. In addition to navigating complex Industry 4.0 concepts and characteristics, organisations must also address organisational consequences related to fast-paced organisational transformation and resource efficacy. The optimal allocation of organisational resources and capabilities [...] Read more.
Industry 4.0 aids organisational transformation powered by innovative technologies and connectivity. In addition to navigating complex Industry 4.0 concepts and characteristics, organisations must also address organisational consequences related to fast-paced organisational transformation and resource efficacy. The optimal allocation of organisational resources and capabilities to large transformational programs, as well as the significant capital investment associated with digital transformation, compel organisations to prioritise their efforts. Hence, this study investigates how key Industry 4.0 organisational capabilities could be prioritised towards organisational digital transformation. Data were collected from 49 participants who had completed a questionnaire containing 26 statement actions aligned to the sensing, seizing, transforming and supporting organisational capability domains. By analysing the data, statement actions were prioritised and operationalised into a prototyped checklist. Two organisations applied the prototyped checklist, illustrating unique profiles and transformative actions. The operationalisation of the checklist highlighted its utility in establishing where an organisation stands in terms of digital transformation, as well as what additional steps might be followed to improve its capability prioritisation based on low checklist scores. By understanding the prioritisation of Industry 4.0 capabilities, organisations could ensure that resources are allocated optimally for business value creation. Full article

14 pages, 2421 KiB  
Article
Detecting Structured Query Language Injections in Web Microservices Using Machine Learning
by Edwin Peralta-Garcia, Juan Quevedo-Monsalbe, Victor Tuesta-Monteza and Juan Arcila-Diaz
Informatics 2024, 11(2), 15; https://doi.org/10.3390/informatics11020015 - 2 Apr 2024
Viewed by 1353
Abstract
Structured Query Language (SQL) injections pose a constant threat to web services, highlighting the need for efficient detection to address this vulnerability. This study compares machine learning algorithms for detecting SQL injections in web microservices trained using a public dataset of 22,764 records. [...] Read more.
Structured Query Language (SQL) injections pose a constant threat to web services, highlighting the need for efficient detection to address this vulnerability. This study compares machine learning algorithms for detecting SQL injections in web microservices, trained using a public dataset of 22,764 records. Additionally, a software architecture based on the microservices approach was implemented, in which the trained models and the web application were deployed to validate requests and detect attacks. A literature review was conducted to identify types of SQL injections and machine learning algorithms. The results of random forest, decision tree, and support vector machine were compared for detecting SQL injections. The findings show that random forest performed best, with a precision and accuracy of 99%, a recall of 97%, and an F1 score of 98%. The decision tree achieved a precision of 92%, a recall of 86%, and an F1 score of 97%. The Support Vector Machine (SVM) presented an accuracy, precision, and F1 score of 98%, with a recall of 97%. Full article
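The abstract does not detail the feature set the classifiers were trained on. As a minimal, library-free sketch of the kind of pattern features such a detector might compute over incoming query strings (the patterns and function names here are illustrative assumptions, not the paper's):

```python
import re

# Illustrative signatures of common injection techniques:
# union-based extraction, tautologies, comment truncation, stacked DROP.
SQLI_PATTERNS = [
    r"(?i)\bunion\b.*\bselect\b",
    r"(?i)\bor\b\s*['\"]?\w+['\"]?\s*=\s*['\"]?\w+",
    r"--|#|/\*",
    r"(?i);\s*drop\b",
]

def extract_features(query: str) -> list:
    """One binary feature per pattern; a vector like this could feed
    any of the compared classifiers (random forest, decision tree, SVM)."""
    return [1 if re.search(p, query) else 0 for p in SQLI_PATTERNS]

def looks_like_injection(query: str) -> bool:
    """Crude rule-based baseline: flag the input if any pattern fires."""
    return any(extract_features(query))

print(extract_features("1' OR '1'='1"))
```

A learned model generalizes beyond such hand-written rules, which is the motivation for the ML comparison in the paper; the sketch only shows where feature extraction fits in the pipeline.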
(This article belongs to the Section Machine Learning)

27 pages, 1241 KiB  
Article
Computational Ensemble Gene Co-Expression Networks for the Analysis of Cancer Biomarkers
by Julia Figueroa-Martínez, Dulcenombre M. Saz-Navarro, Aurelio López-Fernández, Domingo S. Rodríguez-Baena and Francisco A. Gómez-Vela
Informatics 2024, 11(2), 14; https://doi.org/10.3390/informatics11020014 - 28 Mar 2024
Viewed by 1645
Abstract
Gene networks have become a powerful tool for the comprehensive examination of gene expression patterns. Thanks to these networks generated by means of inference algorithms, it is possible to study different biological processes and even identify new biomarkers for such diseases. These biomarkers [...] Read more.
Gene networks have become a powerful tool for the comprehensive examination of gene expression patterns. Thanks to these networks generated by means of inference algorithms, it is possible to study different biological processes and even identify new biomarkers for such diseases. These biomarkers are essential for the discovery of new treatments for genetic diseases such as cancer. In this work, we introduce an algorithm for genetic network inference based on an ensemble method that improves the robustness of the results by combining two main steps: first, the evaluation of the relationship between pairs of genes using three different co-expression measures, and, subsequently, a voting strategy. The utility of this approach was demonstrated by applying it to a human dataset encompassing breast and prostate cancer-associated stromal cells. Two gene networks were computed using microarray data, one for breast cancer and one for prostate cancer. The results obtained revealed, on the one hand, distinct stromal cell behaviors in breast and prostate cancer and, on the other hand, a list of potential biomarkers for both diseases. In the case of breast tumor, ST6GAL2, RIPOR3, COL5A1, and DEPDC7 were found, and in the case of prostate tumor, the genes were GATA6-AS1, ARFGEF3, PRR15L, and APBA2. These results demonstrate the usefulness of the ensemble method in the field of biomarker discovery. Full article
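The ensemble step described above, scoring each gene pair with several co-expression measures and then voting, can be sketched as follows. The thresholds, vote count, and scores are illustrative assumptions rather than the paper's parameters; the gene names are taken from the biomarkers listed in the abstract purely as labels:

```python
def vote_edges(scores_by_measure, threshold=0.7, min_votes=2):
    """Keep a gene-pair edge when at least `min_votes` co-expression
    measures score the pair above `threshold` in absolute value."""
    votes = {}
    for measure in scores_by_measure:
        for pair, score in measure.items():
            if abs(score) >= threshold:
                votes[pair] = votes.get(pair, 0) + 1
    return {pair for pair, n in votes.items() if n >= min_votes}

# Three hypothetical measures (e.g., Pearson, Spearman, Kendall) scoring
# two gene pairs; only the pair supported by two measures survives.
pearson  = {("ST6GAL2", "COL5A1"): 0.92, ("RIPOR3", "DEPDC7"): 0.20}
spearman = {("ST6GAL2", "COL5A1"): 0.85, ("RIPOR3", "DEPDC7"): 0.30}
kendall  = {("ST6GAL2", "COL5A1"): 0.40, ("RIPOR3", "DEPDC7"): 0.95}
print(vote_edges([pearson, spearman, kendall]))
```

Requiring agreement between measures is what gives the ensemble its robustness: an edge supported by only one correlation statistic is treated as noise rather than co-expression.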

15 pages, 4633 KiB  
Article
The Research Interest in ChatGPT and Other Natural Language Processing Tools from a Public Health Perspective: A Bibliometric Analysis
by Giuliana Favara, Martina Barchitta, Andrea Maugeri, Roberta Magnano San Lio and Antonella Agodi
Informatics 2024, 11(2), 13; https://doi.org/10.3390/informatics11020013 - 22 Mar 2024
Cited by 1 | Viewed by 1269
Abstract
Background: Natural language processing tools, such as ChatGPT, demonstrate growing potential across numerous research scenarios, also raising interest in their applications in public health and epidemiology. Here, we applied a bibliometric analysis for a systematic assessment of the current literature related to the applications [...] Read more.
Background: Natural language processing tools, such as ChatGPT, demonstrate growing potential across numerous research scenarios, also raising interest in their applications in public health and epidemiology. Here, we applied a bibliometric analysis for a systematic assessment of the current literature related to the applications of ChatGPT in epidemiology and public health. Methods: A bibliometric analysis was conducted with the Biblioshiny web app, collecting original articles indexed in the Scopus database between 2010 and 2023. Results: Of a total of 3431 original medical articles, the document types “Article” and “Conference paper” constituted most of the retrieved documents, and the term “ChatGPT” emerged as a topic of interest from 2023. The annual publications escalated from 39 in 2010 to 719 in 2023, with an average annual growth rate of 25.1%. In terms of country production over time, the USA led with the highest overall production from 2010 to 2023. Concerning citations, the most frequently cited countries were the USA, UK, and China. Interestingly, Harvard Medical School emerged as the leading contributor, accounting for 18% of all articles among the top ten affiliations. Conclusions: Our study provides an overall examination of the existing research interest in ChatGPT’s applications for public health by outlining pivotal themes and uncovering emerging trends. Full article
