AI, Volume 6, Issue 3 (March 2025) – 21 articles

Cover Story: This research presents a novel multimodal framework integrating Retrieval-Augmented Generation (RAG) with Multimodal Large Language Models (MLLMs) to enhance the analysis of Eurobarometer survey data. Traditional approaches struggle with the scale and complexity of these surveys, which include textual and visual elements. Our framework enables targeted queries for trend and visualization analysis, improving accessibility and interpretability for policymakers, researchers, and citizens. Its modular design supports applications such as survey studies, comparative analyses, and domain-specific investigations, and its scalability and reproducibility make it suitable for e-governance and public-sector deployment, facilitating informed decision-making and advanced research in public opinion analysis. The source code is publicly available.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the tables of contents of newly released issues.
  • Papers are published in both HTML and PDF formats; the PDF is the official version. To view a paper in PDF format, click the "PDF Full-text" link and open it with the free Adobe Reader.
14 pages, 655 KiB  
Perspective
AI-Driven Telerehabilitation: Benefits and Challenges of a Transformative Healthcare Approach
by Rocco Salvatore Calabrò and Sepehr Mojdehdehbaher
AI 2025, 6(3), 62; https://doi.org/10.3390/ai6030062 - 17 Mar 2025
Cited by 1 | Viewed by 1693
Abstract
Artificial intelligence (AI) has revolutionized telerehabilitation by integrating machine learning (ML), big data analytics, and real-time feedback to create adaptive, patient-centered care. AI-driven systems enhance telerehabilitation by analyzing patient data to personalize therapy, monitor progress, and suggest adjustments, eliminating the need for constant clinician oversight. The benefits of AI-powered telerehabilitation include increased accessibility, especially for remote or mobility-limited patients, and greater convenience, allowing patients to perform therapies at home. However, challenges persist, such as data privacy risks, the digital divide, and algorithmic bias. Robust encryption protocols, equitable access to technology, and diverse training datasets are critical to addressing these issues. Ethical considerations also arise, emphasizing the need for human oversight and maintaining the therapeutic relationship. AI also aids clinicians by automating administrative tasks and facilitating interdisciplinary collaboration. Innovations like 5G networks, the Internet of Medical Things (IoMT), and robotics further enhance telerehabilitation’s potential. By transforming rehabilitation into a dynamic, engaging, and personalized process, AI and telerehabilitation together represent a paradigm shift in healthcare, promising improved outcomes and broader access for patients worldwide. Full article

27 pages, 7182 KiB  
Article
Detection of Leaf Diseases in Banana Crops Using Deep Learning Techniques
by Nixon Jiménez, Stefany Orellana, Bertha Mazon-Olivo, Wilmer Rivas-Asanza and Iván Ramírez-Morales
AI 2025, 6(3), 61; https://doi.org/10.3390/ai6030061 - 17 Mar 2025
Viewed by 1060
Abstract
Leaf diseases, such as Black Sigatoka and Cordana, represent a growing threat to banana crops in Ecuador. These diseases spread rapidly, impacting both leaf and fruit quality. Early detection is crucial for effective control measures. Recently, deep learning has proven to be a powerful tool in agriculture, enabling more accurate analysis and identification of crop diseases. This study applied the CRISP-DM methodology, consisting of six phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. A dataset of 900 banana leaf images was collected—300 of Black Sigatoka, 300 of Cordana, and 300 of healthy leaves. Three pre-trained models (EfficientNetB0, ResNet50, and VGG19) were trained on this dataset and demonstrated the ability to identify banana leaf diseases, with accuracies of 88.33%, 88.90%, and 87.22%, respectively. To improve performance, data augmentation was applied using TensorFlow Keras’s ImageDataGenerator class, expanding the dataset to 9000 images; because of the high computational demands of ResNet50 and VGG19, only EfficientNetB0 was trained on the augmented data, reaching 87.83% accuracy, so the augmentation did not significantly improve its performance. These findings highlight the value of deep learning techniques for early disease detection in banana crops, enhancing diagnostic accuracy and efficiency. Full article
(This article belongs to the Special Issue Artificial Intelligence in Agriculture)
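
For readers who want a concrete starting point, the sketch below shows the kind of transfer-learning setup the abstract describes: Keras's ImageDataGenerator for augmentation feeding a frozen EfficientNetB0 backbone with a three-class head. The directory layout, image size, and hyperparameters are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch: EfficientNetB0 transfer learning with ImageDataGenerator augmentation.
# Directory layout, image size, and hyperparameters are assumptions for illustration.
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=20, zoom_range=0.2, horizontal_flip=True,
    width_shift_range=0.1, height_shift_range=0.1, validation_split=0.2)

train = datagen.flow_from_directory(
    "banana_leaves/", target_size=(224, 224), batch_size=32,
    class_mode="categorical", subset="training")
val = datagen.flow_from_directory(
    "banana_leaves/", target_size=(224, 224), batch_size=32,
    class_mode="categorical", subset="validation")

base = tf.keras.applications.EfficientNetB0(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained backbone

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(3, activation="softmax"),  # Black Sigatoka, Cordana, healthy
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(train, validation_data=val, epochs=10)
```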

15 pages, 3081 KiB  
Article
Antiparasitic Pharmacology Goes to the Movies: Leveraging Generative AI to Create Educational Short Films
by Benjamin Worthley, Meize Guo, Lucas Sheneman and Tyler Bland
AI 2025, 6(3), 60; https://doi.org/10.3390/ai6030060 - 17 Mar 2025
Viewed by 666
Abstract
Medical education faces the dual challenge of addressing cognitive overload and sustaining student engagement, particularly in complex subjects such as pharmacology. This study introduces Cinematic Clinical Narratives (CCNs) as an innovative approach to teaching antiparasitic pharmacology, combining generative artificial intelligence (genAI), edutainment, and mnemonic-based learning. The intervention involved two short films, Alien: Parasites Within and Wormquest, designed to teach antiparasitic pharmacology to first-year medical students. A control group of students only received traditional text-based clinical cases, while the experimental group engaged with the CCNs in an active learning environment. Students who received the CCN material scored an average of 8% higher on exam questions related to the material covered by the CCN compared to students in the control group. Results also showed that the CCNs improved engagement and interest among students, as evidenced by significantly higher scores on the Situational Interest Survey for Multimedia (SIS-M) compared to traditional methods. Notably, students preferred CCNs for their storytelling, visuals, and interactive elements. This study underscores the potential of CCNs as a supplementary educational tool, and suggests the potential for broader applications across other medical disciplines outside of antiparasitic pharmacology. By leveraging genAI and edutainment, CCNs represent a scalable and innovative approach to enhancing the medical learning experience. Full article
(This article belongs to the Special Issue Exploring the Use of Artificial Intelligence in Education)

25 pages, 340 KiB  
Article
Clinical Applicability of Machine Learning Models for Binary and Multi-Class Electrocardiogram Classification
by Daniel Nasef, Demarcus Nasef, Kennette James Basco, Alana Singh, Christina Hartnett, Michael Ruane, Jason Tagliarino, Michael Nizich and Milan Toma
AI 2025, 6(3), 59; https://doi.org/10.3390/ai6030059 - 14 Mar 2025
Cited by 2 | Viewed by 831
Abstract
Background: This study investigates the application of machine learning models to classify electrocardiogram signals, addressing challenges such as class imbalances and inter-class overlap. In this study, “normal” and “abnormal” refer to electrocardiogram findings that either align with or deviate from a standard electrocardiogram, warranting further evaluation. “Borderline” indicates an electrocardiogram that requires additional assessment to distinguish benign variations from pathology. Methods: A hierarchical framework reformulated the multi-class problem into two binary classification tasks—distinguishing “Abnormal” from “Non-Abnormal” and “Normal” from “Non-Normal”—to enhance performance and interpretability. Convolutional neural networks, deep neural networks, and tree-based models, including Gradient Boosting Classifier and Random Forest, were trained and evaluated using standard metrics (accuracy, precision, recall, and F1 score) and learning curve convergence analysis. Results: Results showed that convolutional neural networks achieved the best balance between generalization and performance, effectively adapting to unseen data and variations without overfitting. They exhibit strong convergence and robust feature importance rankings, with ventricular rate, QRS duration, and P-R interval identified as key predictors. Tree-based models, despite their high performance metrics, demonstrated poor convergence, raising concerns about their reliability on unseen data. Deep neural networks achieved high sensitivity but suffered from overfitting, limiting their generalizability. Conclusions: The hierarchical binary classification approach demonstrated clinical relevance, enabling nuanced diagnostic insights. Furthermore, the study emphasizes the critical role of learning curve analysis in evaluating model reliability, beyond performance metrics alone. Future work should focus on optimizing model convergence and exploring hybrid approaches to improve clinical applicability in electrocardiogram signal classification. Full article
(This article belongs to the Section Medical & Healthcare AI)
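
A minimal sketch of the hierarchical reformulation described in the abstract: the three-class problem is recast as two binary tasks trained on the same tabular features. The feature columns, file name, and gradient-boosting choice are illustrative assumptions rather than the study's exact pipeline.

```python
# Minimal sketch of the hierarchical binary reformulation:
# one classifier for "Abnormal vs. Non-Abnormal", one for "Normal vs. Non-Normal".
# Feature columns, file name, and model choice are illustrative assumptions.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

df = pd.read_csv("ecg_features.csv")          # hypothetical file with tabular ECG features
X = df[["ventricular_rate", "qrs_duration", "pr_interval"]]
y = df["label"]                                # values: "Normal", "Borderline", "Abnormal"

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Task 1: Abnormal vs. everything else
abn = GradientBoostingClassifier().fit(X_tr, (y_tr == "Abnormal").astype(int))
# Task 2: Normal vs. everything else
nrm = GradientBoostingClassifier().fit(X_tr, (y_tr == "Normal").astype(int))

print(classification_report((y_te == "Abnormal").astype(int), abn.predict(X_te)))
print(classification_report((y_te == "Normal").astype(int), nrm.predict(X_te)))
```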

25 pages, 4169 KiB  
Article
Leveraging Spectral Neighborhood Information for Corn Yield Prediction with Spatial-Lagged Machine Learning Modeling: Can Neighborhood Information Outperform Vegetation Indices?
by Efrain Noa-Yarasca, Javier M. Osorio Leyton, Chad B. Hajda, Kabindra Adhikari and Douglas R. Smith
AI 2025, 6(3), 58; https://doi.org/10.3390/ai6030058 - 13 Mar 2025
Viewed by 681
Abstract
Accurate and reliable crop yield prediction is essential for optimizing agricultural management, resource allocation, and decision-making, while also supporting farmers and stakeholders in adapting to climate change and increasing global demand. This study introduces an innovative approach to crop yield prediction by incorporating spatially lagged spectral data (SLSD) through the spatial-lagged machine learning (SLML) model, an enhanced version of the spatial lag X (SLX) model. The research aims to show that SLSD improves prediction compared to traditional vegetation index (VI)-based methods. Conducted on a 19-hectare cornfield at the ARS Grassland, Soil, and Water Research Laboratory during the 2023 growing season, this study used five-band multispectral image data and 8581 yield measurements ranging from 1.69 to 15.86 Mg/Ha. Four predictor sets were evaluated: Set 1 (spectral bands), Set 2 (spectral bands + neighborhood data), Set 3 (spectral bands + VIs), and Set 4 (spectral bands + top VIs + neighborhood data). These were evaluated using the SLX model and four decision-tree-based SLML models (RF, XGB, ET, GBR), with performance assessed using R2 and RMSE. Results showed that incorporating spatial neighborhood data (Set 2) outperformed VI-based approaches (Set 3), emphasizing the importance of spatial context. SLML models, particularly XGB, RF, and ET, performed best with 4–8 neighbors, while excessive neighbors slightly reduced accuracy. In Set 3, VIs improved predictions, but a smaller subset (10–15 indices) was sufficient for optimal yield prediction. Set 4 showed slight gains over Sets 2 and 3, with XGB and RF achieving the highest R2 values. Key predictors included spatially lagged spectral bands (e.g., Green_lag, NIR_lag, RedEdge_lag) and VIs (e.g., CREI, GCI, NCPI, ARI, CCCI), highlighting the value of integrating neighborhood data for improved corn yield prediction. This study underscores the importance of spatial context in corn yield prediction and lays the foundation for future research across diverse agricultural settings, focusing on optimizing neighborhood size, integrating spatial and spectral data, and refining spatial dependencies through localized search algorithms. Full article
(This article belongs to the Special Issue Artificial Intelligence in Agriculture)
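
The core idea of spatially lagged spectral data can be sketched in a few lines: average each point's spectral bands over its k nearest neighbors and append the result as extra predictors. Column names, the neighborhood size, and the random forest choice are illustrative assumptions, not the SLML implementation.

```python
# Minimal sketch: build spatially lagged spectral features from the k nearest
# neighbors of each yield point, then fit a tree-based regressor.
# File name, column names, and k are illustrative assumptions.
import numpy as np
import pandas as pd
from scipy.spatial import cKDTree
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

df = pd.read_csv("corn_yield_points.csv")           # hypothetical point data
bands = ["blue", "green", "red", "rededge", "nir"]
coords = df[["x", "y"]].to_numpy()
X_bands = df[bands].to_numpy()

k = 6                                               # neighborhood size (4-8 worked best in the paper)
tree = cKDTree(coords)
_, idx = tree.query(coords, k=k + 1)                # first neighbor is the point itself
lagged = X_bands[idx[:, 1:]].mean(axis=1)           # spatially lagged band means

X = np.hstack([X_bands, lagged])                    # "Set 2": bands + neighborhood information
scores = cross_val_score(RandomForestRegressor(n_estimators=300, random_state=0),
                         X, df["yield_mg_ha"], scoring="r2", cv=5)
print("mean R^2:", scores.mean())
```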

27 pages, 10632 KiB  
Article
Integration of YOLOv8 Small and MobileNet V3 Large for Efficient Bird Detection and Classification on Mobile Devices
by Axel Frederick Félix-Jiménez, Vania Stephany Sánchez-Lee, Héctor Alejandro Acuña-Cid, Isaul Ibarra-Belmonte, Efraín Arredondo-Morales and Eduardo Ahumada-Tello
AI 2025, 6(3), 57; https://doi.org/10.3390/ai6030057 - 13 Mar 2025
Viewed by 1408
Abstract
Background: Bird species identification and classification are crucial for biodiversity research, conservation initiatives, and ecological monitoring. However, conventional identification techniques used by biologists are time-consuming and susceptible to human error. The integration of deep learning models offers a promising alternative to automate and enhance species recognition processes. Methods: This study explores the use of deep learning for bird species identification in the city of Zacatecas. Specifically, we implement YOLOv8 Small for real-time detection and MobileNet V3 for classification. The models were trained and tested on a dataset comprising five bird species: Vermilion Flycatcher, Pine Flycatcher, Mexican Chickadee, Arizona Woodpecker, and Striped Sparrow. The evaluation metrics included precision, recall, and computational efficiency. Results: The findings demonstrate that both models achieve high accuracy in species identification. YOLOv8 Small excels in real-time detection, making it suitable for dynamic monitoring scenarios, while MobileNet V3 provides a lightweight yet efficient classification solution. These results highlight the potential of artificial intelligence to enhance ornithological research by improving monitoring accuracy and reducing manual identification efforts. Full article
(This article belongs to the Special Issue Artificial Intelligence-Based Image Processing and Computer Vision)
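
A rough sketch of how the two models could be chained at inference time: YOLOv8 Small proposes bird bounding boxes and MobileNet V3 Large classifies each crop. The fine-tuned weight files and the five-class head are hypothetical placeholders; the published models are not reproduced here.

```python
# Sketch of a two-stage detect-then-classify pipeline. The weight files named
# below are hypothetical fine-tuned checkpoints, assumed for illustration only.
import torch
from PIL import Image
from ultralytics import YOLO
from torchvision import transforms
from torchvision.models import mobilenet_v3_large

SPECIES = ["Vermilion Flycatcher", "Pine Flycatcher", "Mexican Chickadee",
           "Arizona Woodpecker", "Striped Sparrow"]

detector = YOLO("yolov8s_birds.pt")                              # hypothetical fine-tuned detector
classifier = mobilenet_v3_large(num_classes=len(SPECIES))
classifier.load_state_dict(torch.load("mobilenetv3_birds.pt"))   # hypothetical checkpoint
classifier.eval()

prep = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

img = Image.open("field_photo.jpg")
for box in detector(img)[0].boxes.xyxy.tolist():                 # [x1, y1, x2, y2] per detection
    crop = prep(img.crop(tuple(int(v) for v in box))).unsqueeze(0)
    with torch.no_grad():
        pred = classifier(crop).argmax(1).item()
    print(SPECIES[pred], box)
```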

22 pages, 1390 KiB  
Article
Emotion-Aware Embedding Fusion in Large Language Models (Flan-T5, Llama 2, DeepSeek-R1, and ChatGPT 4) for Intelligent Response Generation
by Abdur Rasool, Muhammad Irfan Shahzad, Hafsa Aslam, Vincent Chan and Muhammad Ali Arshad
AI 2025, 6(3), 56; https://doi.org/10.3390/ai6030056 - 13 Mar 2025
Cited by 1 | Viewed by 1725
Abstract
Empathetic and coherent responses are critical in automated chatbot-facilitated psychotherapy. This study addresses the challenge of enhancing the emotional and contextual understanding of large language models (LLMs) in psychiatric applications. We introduce Emotion-Aware Embedding Fusion, a novel framework integrating hierarchical fusion and attention mechanisms to prioritize semantic and emotional features in therapy transcripts. Our approach combines multiple emotion lexicons, including NRC Emotion Lexicon, VADER, WordNet, and SentiWordNet, with state-of-the-art LLMs such as Flan-T5, Llama 2, DeepSeek-R1, and ChatGPT 4. Therapy session transcripts, comprising over 2000 samples, are segmented into hierarchical levels (word, sentence, and session) using neural networks, while hierarchical fusion combines these features with pooling techniques to refine emotional representations. Attention mechanisms, including multi-head self-attention and cross-attention, further prioritize emotional and contextual features, enabling the temporal modeling of emotional shifts across sessions. The processed embeddings, computed using BERT, GPT-3, and RoBERTa, are stored in the Facebook AI similarity search vector database, which enables efficient similarity search and clustering across dense vector spaces. Upon user queries, relevant segments are retrieved and provided as context to LLMs, enhancing their ability to generate empathetic and contextually relevant responses. The proposed framework is evaluated across multiple practical use cases to demonstrate real-world applicability, including AI-driven therapy chatbots. The system can be integrated into existing mental health platforms to generate personalized responses based on retrieved therapy session data. The experimental results show that our framework enhances empathy, coherence, informativeness, and fluency, surpassing baseline models while improving LLMs’ emotional intelligence and contextual adaptability for psychotherapy. Full article
(This article belongs to the Special Issue Multimodal Artificial Intelligence in Healthcare)
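
The retrieval step at the heart of the framework can be illustrated compactly: embed transcript segments, index them with FAISS, and return the nearest segments as context for the response-generating LLM. The embedding model and the sample segments are illustrative assumptions, not the paper's full hierarchical fusion pipeline.

```python
# Minimal retrieval sketch in the spirit of the framework's vector-search step.
# The encoder choice and segment data are illustrative assumptions.
import faiss
from sentence_transformers import SentenceTransformer

segments = [
    "I feel anxious before every meeting at work.",
    "Sleep has improved since we started the breathing exercises.",
    "Sometimes the sadness comes back without any clear trigger.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
emb = encoder.encode(segments, normalize_embeddings=True).astype("float32")

index = faiss.IndexFlatIP(emb.shape[1])          # inner product == cosine on normalized vectors
index.add(emb)

query = encoder.encode(["The user reports recurring anxiety at work"],
                       normalize_embeddings=True).astype("float32")
scores, ids = index.search(query, 2)
context = [segments[i] for i in ids[0]]
print(context)   # would be passed as context to the response-generating LLM
```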

16 pages, 2166 KiB  
Article
Integrating Pose Features and Cross-Relationship Learning for Human–Object Interaction Detection
by Lang Wu, Jie Li, Shuqin Li, Yu Ding, Meng Zhou and Yuntao Shi
AI 2025, 6(3), 55; https://doi.org/10.3390/ai6030055 - 12 Mar 2025
Viewed by 665
Abstract
Background: The main challenge in human–object interaction (HOI) detection is reasoning accurately about ambiguous, complex, and difficult-to-recognize interactions. Existing methods rely on relatively uniform model structures, and occlusions in the input image can prevent accurate recognition. Methods: In this paper, we design a Pose-Aware Interaction Network (PAIN) based on a transformer architecture and human posture to address these issues through two innovations: a new feature fusion method that fuses human pose features and image features early, before the encoder, to improve feature expressiveness, with individual motion-related features additionally strengthened in the human branch; and a Cross-Attention Relationship fusion Module (CARM) that better fuses the three-branch output and captures detailed HOI relationship information. Results: The proposed method achieves 64.51% AP_role (Scenario 1) and 66.42% AP_role (Scenario 2) on the public V-COCO dataset and 30.83% AP on HICO-DET, recognizing HOI instances more accurately. Full article
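
As a schematic of the early-fusion idea (pose features merged with image features before the encoder), the sketch below projects both modalities to a shared width, concatenates them as tokens, and runs a transformer encoder. Dimensions and layer counts are illustrative assumptions, not the PAIN architecture.

```python
# Schematic of early pose/image feature fusion ahead of a transformer encoder.
# All sizes are illustrative assumptions, not the published architecture.
import torch
import torch.nn as nn

class EarlyPoseFusion(nn.Module):
    def __init__(self, img_dim=256, pose_dim=34, d_model=256, nhead=8, layers=2):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, d_model)
        self.pose_proj = nn.Linear(pose_dim, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)

    def forward(self, img_tokens, pose_feats):
        # img_tokens: (B, N, img_dim) flattened CNN feature map
        # pose_feats: (B, P, pose_dim) one vector per person (17 keypoints x/y)
        tokens = torch.cat([self.img_proj(img_tokens), self.pose_proj(pose_feats)], dim=1)
        return self.encoder(tokens)               # fused tokens for downstream HOI heads

fused = EarlyPoseFusion()(torch.randn(2, 49, 256), torch.randn(2, 3, 34))
print(fused.shape)   # torch.Size([2, 52, 256])
```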

23 pages, 3001 KiB  
Review
A Bibliometric Analysis on Artificial Intelligence in the Production Process of Small and Medium Enterprises
by Federico Briatore, Marco Tullio Mosca, Roberto Nicola Mosca and Mattia Braggio
AI 2025, 6(3), 54; https://doi.org/10.3390/ai6030054 - 12 Mar 2025
Viewed by 841
Abstract
Industry 4.0 represents the main paradigm currently bringing great innovation in the field of automation and data exchange among production technologies, according to the principles of interoperability, virtualization, decentralization and production flexibility. The Fourth Industrial Revolution is driven by structural changes in the manufacturing sector, such as the demand for customized products, market volatility and sustainability goals, and the integration of artificial intelligence and Big Data. This work presents a bibliometric analysis, based on journal papers indexed in Scopus with no time limitation, of the existing literature on the application of AI in SMEs, which are crucial elements of the industrial and economic fabric of many countries. However, despite the positive effects obtained in large organizations, the adoption of modern technologies, particularly AI, can be challenging for SMEs due to the intrinsic structure of this type of enterprise. Full article

16 pages, 1040 KiB  
Article
Trade-Offs in Navigation Problems Using Value-Based Methods
by Petra Csereoka and Mihai V. Micea
AI 2025, 6(3), 53; https://doi.org/10.3390/ai6030053 - 10 Mar 2025
Viewed by 597
Abstract
Deep Q-Networks (DQNs) have shown remarkable results over the last decade in scenarios ranging from simple 2D fully observable short episodes to partially observable, graphically intensive, and complex tasks. However, the base architecture of a vanilla DQN presents several shortcomings, some of which were mitigated by new variants focusing on increased stability, faster convergence, and time dependencies. These additions, on the other hand, bring increased costs in terms of the required memory and lengthier training times. In this paper, we analyze the performance of state-of-the-art DQN families in a simple partially observable mission created in Minecraft and try to determine the optimal architecture for such problem classes in terms of cost and accuracy. To the best of our knowledge, the analyzed methods have not been tested on the same scenario before, and hence a more in-depth comparison is required to better understand the real performance improvement they provide. This manuscript also offers a detailed overview of state-of-the-art DQN methods, together with the training heuristics and performance metrics registered during the proposed mission, allowing researchers to select better-suited models for solving future problems. Our experiments show that Double DQN networks are capable of handling partially observable scenarios gracefully while maintaining a low hardware footprint, Recurrent Double DQNs can be a good candidate even when resources are restricted, and double-dueling DQNs are a well-performing middle ground in terms of cost and performance. Full article
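
The Double DQN variant highlighted in the conclusions differs from vanilla DQN only in how the bootstrap target is built: the online network selects the next action and the target network evaluates it. A minimal sketch, with network shapes and hyperparameters as illustrative assumptions:

```python
# Minimal sketch of the Double DQN target computation.
# Network shapes, gamma, and batch size are illustrative assumptions.
import torch
import torch.nn as nn

def double_dqn_targets(online, target, rewards, next_states, dones, gamma=0.99):
    with torch.no_grad():
        next_actions = online(next_states).argmax(dim=1, keepdim=True)      # action selection
        next_q = target(next_states).gather(1, next_actions).squeeze(1)     # action evaluation
        return rewards + gamma * next_q * (1.0 - dones)

obs_dim, n_actions = 8, 4
make_net = lambda: nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
online, target = make_net(), make_net()

batch = 32
y = double_dqn_targets(online, target,
                       rewards=torch.zeros(batch),
                       next_states=torch.randn(batch, obs_dim),
                       dones=torch.zeros(batch))
print(y.shape)   # torch.Size([32])
```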

24 pages, 23486 KiB  
Article
Influence of Model Size and Image Augmentations on Object Detection in Low-Contrast Complex Background Scenes
by Harman Singh Sangha and Matthew J. Darr
AI 2025, 6(3), 52; https://doi.org/10.3390/ai6030052 - 5 Mar 2025
Cited by 1 | Viewed by 1096
Abstract
Background: Bigger and more complex models are often developed for challenging object detection tasks, and image augmentations are used to train a robust deep learning model for small image datasets. Previous studies have suggested that smaller models provide better performance compared to bigger models for agricultural applications, and not all image augmentation methods contribute equally to model performance. An important part of these studies was also to define the scene of the image. Methods: A standard definition was developed to describe scenes in real-world agricultural datasets by reviewing various image-based machine-learning applications in the agriculture literature. This study primarily evaluates the effects of model size in both one-stage and two-stage detectors on model performance for low-contrast complex background applications. It further explores the influence of different photo-metric image augmentation methods on model performance for standard one-stage and two-stage detectors. Results: For one-stage detectors, a smaller model performed better than a bigger one, whereas for two-stage detectors, model performance increased with model size. Among the image augmentations, some methods considerably improved model performance, while others provided no improvement or reduced performance relative to the baseline for both one-stage and two-stage detectors. Full article
(This article belongs to the Special Issue Artificial Intelligence in Agriculture)
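
A minimal example of a photo-metric augmentation pipeline of the kind compared in the study, using torchvision; the specific transforms and their ranges are illustrative assumptions, not the paper's evaluated set.

```python
# Sketch of a photo-metric augmentation pipeline; transforms and ranges are
# illustrative assumptions, not the study's evaluated configuration.
from torchvision import transforms

photometric = transforms.Compose([
    transforms.ColorJitter(brightness=0.3, contrast=0.3, saturation=0.3, hue=0.05),
    transforms.RandomAdjustSharpness(sharpness_factor=2, p=0.5),
    transforms.RandomAutocontrast(p=0.5),
    transforms.ToTensor(),
])

# usage: augmented = photometric(pil_image)   # applied per training image
```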

20 pages, 682 KiB  
Article
Sentence Interaction and Bag Feature Enhancement for Distant Supervised Relation Extraction
by Wei Song and Qingchun Liu
AI 2025, 6(3), 51; https://doi.org/10.3390/ai6030051 - 4 Mar 2025
Viewed by 719
Abstract
Background: Distant supervision employs external knowledge bases to automatically match with text, allowing for the automatic annotation of sentences. Although this method effectively tackles the challenge of manual labeling, it inevitably introduces noisy labels. Traditional approaches typically employ sentence-level attention mechanisms, assigning lower weights to noisy sentences to mitigate their impact. But this approach overlooks the critical importance of information flow between sentences. Additionally, previous approaches treated an entire bag as a single classification unit, giving equal importance to all features within the bag. However, they failed to recognize that different dimensions of features have varying levels of significance. Method: To overcome these challenges, this study introduces a novel network that incorporates sentence interaction and a bag-level feature enhancement (ESI-EBF) mechanism. We concatenate sentences within a bag into a continuous context, allowing information to flow freely between them during encoding. At the bag level, we partition the features into multiple groups based on dimensions, assigning an importance coefficient to each sub-feature within a group. This enhances critical features while diminishing the influence of less important ones. In the end, the enhanced features are utilized to construct high-quality bag representations, facilitating more accurate classification by the classification module. Result: The experimental findings from the New York Times (NYT) and Wiki-20m datasets confirm the efficacy of our suggested encoding approach and feature improvement module. Our method also outperforms state-of-the-art techniques on these datasets, achieving superior relation extraction accuracy. Full article
(This article belongs to the Section AI Systems: Theory and Applications)
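
The bag-level enhancement idea can be sketched as a small module that splits the bag representation into groups along the feature dimension and rescales each group by a learned importance coefficient. Sizes and the scoring network are illustrative assumptions, not the ESI-EBF configuration.

```python
# Schematic of bag-level feature enhancement: per-group importance coefficients
# rescale sub-features of the bag representation. Sizes are illustrative assumptions.
import torch
import torch.nn as nn

class BagFeatureEnhancement(nn.Module):
    def __init__(self, dim=768, groups=8):
        super().__init__()
        assert dim % groups == 0
        self.groups = groups
        self.scorer = nn.Sequential(nn.Linear(dim, groups), nn.Sigmoid())

    def forward(self, bag_repr):                       # (B, dim)
        B, dim = bag_repr.shape
        weights = self.scorer(bag_repr)                # (B, groups) importance coefficients
        grouped = bag_repr.view(B, self.groups, dim // self.groups)
        enhanced = grouped * weights.unsqueeze(-1)     # emphasize informative groups
        return enhanced.reshape(B, dim)

out = BagFeatureEnhancement()(torch.randn(4, 768))
print(out.shape)   # torch.Size([4, 768])
```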

26 pages, 4614 KiB  
Article
A Multimodal Framework Embedding Retrieval-Augmented Generation with MLLMs for Eurobarometer Data
by George Papageorgiou, Vangelis Sarlis, Manolis Maragoudakis and Christos Tjortjis
AI 2025, 6(3), 50; https://doi.org/10.3390/ai6030050 - 3 Mar 2025
Viewed by 3444
Abstract
This study introduces a multimodal framework integrating retrieval-augmented generation (RAG) with multimodal large language models (MLLMs) to enhance the accessibility, interpretability, and analysis of Eurobarometer survey data. Traditional approaches often struggle with the diverse formats and large-scale nature of these datasets, which include textual and visual elements. The proposed framework leverages multimodal indexing and targeted retrieval to enable focused queries, trend analysis, and visualization across multiple survey editions. The integration of LLMs facilitates advanced synthesis of insights, providing a more comprehensive understanding of public opinion trends. The proposed framework offers prospective benefits for different types of stakeholders, including policymakers, journalists, nongovernmental organizations (NGOs), researchers, and citizens, while highlighting the need for performance assessment to evaluate its effectiveness based on specific business requirements and practical applications. The framework’s modular design supports applications such as survey studies, comparative analyses, and domain-specific investigations, while its scalability and reproducibility make it suitable for e-governance and public sector deployment. The results indicate potential enhancements in data interpretation and analysis by giving stakeholders the capability not only to use raw text data for knowledge extraction but also to conduct image analysis based on indexed content, paving the way for informed policymaking and advanced research in the social sciences. They also emphasize the need for performance assessment to validate the framework’s output and functionality, based on the selected architectural components. Future research will explore expanded functionalities and real-time applications, ensuring the framework remains adaptable to evolving needs in public opinion analysis and multimodal data integration. Full article

22 pages, 699 KiB  
Article
Integration of Artificial Intelligence in K-12: Analysis of a Three-Year Pilot Study
by Boško Lišnić, Goran Zaharija and Saša Mladenović
AI 2025, 6(3), 49; https://doi.org/10.3390/ai6030049 - 1 Mar 2025
Viewed by 1334
Abstract
A three-year pilot study investigated the effectiveness of artificial intelligence (AI) as a motivational tool for teaching programming concepts within the Croatian Informatics curriculum. The study was conducted in schools through the extracurricular activity EDIT CodeSchool with the Development of Intelligent Web Applications (RIWA) module. Twelve schools in Split-Dalmatia County in the Republic of Croatia participated, resulting in 112 successfully completed student projects. The program consisted of two phases: (1) theoretical instruction with examples and exercises, and (2) project-based learning, where students developed final projects using JavaScript and the ml5.js library. The study employed project analysis and semi-structured student interviews to assess learning outcomes. Findings suggest that AI-enhanced learning can effectively support programming education without increasing instructional hours, providing insights for integrating AI concepts into existing curricula. Full article
(This article belongs to the Topic AI Trends in Teacher and Student Training)

17 pages, 1149 KiB  
Article
Applying Decision Transformers to Enhance Neural Local Search on the Job Shop Scheduling Problem
by Constantin Waubert de Puiseau, Fabian Wolz, Merlin Montag, Jannik Peters, Hasan Tercan and Tobias Meisen
AI 2025, 6(3), 48; https://doi.org/10.3390/ai6030048 - 1 Mar 2025
Viewed by 874
Abstract
Background: The job shop scheduling problem (JSSP) and its solution algorithms have been of enduring interest in both academia and industry for decades. In recent years, machine learning (ML) has been playing an increasingly important role in advancing existing solutions and building new heuristic solutions for the JSSP, aiming to find better solutions in shorter computation times. Methods: In this study, we built on top of a state-of-the-art deep reinforcement learning (DRL) agent, called Neural Local Search (NLS), which can efficiently and effectively control a large local neighborhood search on the JSSP. In particular, we developed a method for training the decision transformer (DT) algorithm on search trajectories taken by a trained NLS agent to further improve upon the learned decision-making sequences. Results: Our experiments showed that the DT successfully learns local search strategies that are different and, in many cases, more effective than those of the NLS agent itself. In terms of the tradeoff between solution quality and acceptable computational time needed for the search, the DT is particularly superior in application scenarios where longer computational times are acceptable. In this case, it makes up for the longer inference times required per search step, which are caused by the larger neural network architecture, through better quality decisions per step. Conclusions: Therefore, the DT achieves state-of-the-art results for solving the JSSP with ML-enhanced local search. Full article
(This article belongs to the Section AI Systems: Theory and Applications)

14 pages, 7455 KiB  
Article
Swamped with Too Many Articles? GraphRAG Makes Getting Started Easy
by Joëd Ngangmeni and Danda B. Rawat
AI 2025, 6(3), 47; https://doi.org/10.3390/ai6030047 - 1 Mar 2025
Viewed by 1292
Abstract
Background: Both early researchers, such as new graduate students, and experienced researchers face the challenge of sifting through vast amounts of literature to find their needle in a haystack. This process can be time-consuming, tedious, or frustratingly unproductive. Methods: Using only abstracts and titles of research articles, we compare three retrieval methods—Bibliographic Indexing/Databasing (BI/D), Retrieval-Augmented Generation (RAG), and Graph Retrieval-Augmented Generation (GraphRAG)—which reportedly offer promising solutions to these common challenges. We assess their performance using two sets of Large Language Model (LLM)-generated queries: one set of queries with context and the other set without context. Our study evaluates six sub-models—four from Light Retrieval-Augmented Generation (LightRAG) and two from Microsoft’s Graph Retrieval-Augmented Generation (MGRAG). We examine these sub-models across four key criteria—comprehensiveness, diversity, empowerment, and directness—as well as the overall combination of these factors. Results: After three separate experiments, we observe that MGRAG has a slight advantage over LightRAG, naïve RAG, and BI/D for answering queries that require a semantic understanding of our data pool. The results (displayed in grouped bar charts) provide clear and accessible comparisons to help researchers quickly make informed decisions on which method best suits their needs. Conclusions: Supplementing BI/D with RAG or GraphRAG pipelines would positively impact the way both beginners and experienced researchers find and parse through volumes of potentially relevant information. Full article

37 pages, 1551 KiB  
Article
Deep Reinforcement Learning: A Chronological Overview and Methods
by Juan Terven
AI 2025, 6(3), 46; https://doi.org/10.3390/ai6030046 - 24 Feb 2025
Viewed by 4472
Abstract
Introduction: Deep reinforcement learning (deep RL) integrates the principles of reinforcement learning with deep neural networks, enabling agents to excel in diverse tasks ranging from playing board games such as Go and Chess to controlling robotic systems and autonomous vehicles. By leveraging foundational concepts of value functions, policy optimization, and temporal difference methods, deep RL has rapidly evolved and found applications in areas such as gaming, robotics, finance, and healthcare. Objective: This paper seeks to provide a comprehensive yet accessible overview of the evolution of deep RL and its leading algorithms. It aims to serve both as an introduction for newcomers to the field and as a practical guide for those seeking to select the most appropriate methods for specific problem domains. Methods: We begin by outlining fundamental reinforcement learning principles, followed by an exploration of early tabular Q-learning methods. We then trace the historical development of deep RL, highlighting key milestones such as the advent of deep Q-networks (DQN). The survey extends to policy gradient methods, actor–critic architectures, and state-of-the-art algorithms such as proximal policy optimization, soft actor–critic, and emerging model-based approaches. Throughout, we discuss the current challenges facing deep RL, including issues of sample efficiency, interpretability, and safety, as well as open research questions involving large-scale training, hierarchical architectures, and multi-task learning. Results: Our analysis demonstrates how critical breakthroughs have driven deep RL into increasingly complex application domains. We highlight existing limitations and ongoing bottlenecks, such as high data requirements and the need for more transparent, ethically aligned systems. Finally, we survey potential future directions, highlighting the importance of reliability and ethical considerations for real-world deployments. Full article
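
The tabular Q-learning rule the survey builds from is compact enough to state directly: Q(s, a) ← Q(s, a) + α(r + γ max_a' Q(s', a') − Q(s, a)). A minimal sketch with illustrative state and action counts:

```python
# Tabular Q-learning update. State/action counts, alpha, and gamma are
# illustrative assumptions for a toy environment.
import numpy as np

n_states, n_actions = 16, 4
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.99

def q_update(s, a, r, s_next, done):
    target = r if done else r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])

q_update(s=0, a=2, r=1.0, s_next=5, done=False)
print(Q[0])
```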

22 pages, 2188 KiB  
Article
Morphological Analysis and Subtype Detection of Acute Myeloid Leukemia in High-Resolution Blood Smears Using ConvNeXT
by Mubarak Taiwo Mustapha and Dilber Uzun Ozsahin
AI 2025, 6(3), 45; https://doi.org/10.3390/ai6030045 - 24 Feb 2025
Viewed by 769
Abstract
(1) Background: Acute Myeloid Leukemia (AML) is a complex hematologic malignancy where accurate subtype classification is crucial for targeted treatment and improved patient outcomes. Automated AML subtype detection is especially important for underrepresented subtypes to ensure equitable diagnostics; (2) Methods: This study explores the potential of ConvNeXt, an advanced convolutional neural network architecture, for classifying high-resolution peripheral blood smear images into AML subtypes. A deep learning pipeline was developed, integrating Stochastic Weight Averaging (SWA) for model stability, Mixup data augmentation to enhance generalization, and Grad-CAM for model interpretability, ensuring biologically meaningful feature visualization. Various models, including ResNet50 and Vision Transformers, were benchmarked for comparative performance analysis; (3) Results: ConvNeXt outperformed ResNet50, achieving a classification accuracy of 95% compared to 91% for ResNet50 and 81% for transformer-based models (Vision Transformers). Grad-CAM visualizations provided biologically interpretable heatmaps, enhancing trust in computational predictions and bridging the gap between AI-driven diagnostics and clinical decision-making. Ablation studies highlighted the contributions of data augmentation, optimizer selection, and hyperparameter tuning, demonstrating the robustness and adaptability of the model; (4) Conclusions: This study advances AI’s role in hematopathology by combining high classification performance, explainability, and scalability. ConvNeXt offers a robust, interpretable, and scalable solution for AML subtype classification, improving diagnostic precision and supporting clinical decision-making. These results underscore the potential for AI-driven advancements in equitable and efficient AML diagnostics. Full article
(This article belongs to the Section Medical & Healthcare AI)
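
Of the training ingredients listed above, Mixup is easy to illustrate: convex-combine pairs of images and weight their losses by the mixing coefficient. Alpha and the tensor shapes below are illustrative assumptions, not the paper's training configuration.

```python
# Minimal Mixup sketch: mix image pairs and weight the two losses by lambda.
# Alpha and tensor shapes are illustrative assumptions.
import numpy as np
import torch
import torch.nn.functional as F

def mixup_batch(x, y, alpha=0.2):
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(x.size(0))
    return lam * x + (1.0 - lam) * x[perm], y, y[perm], lam

def mixup_loss(logits, y_a, y_b, lam):
    return lam * F.cross_entropy(logits, y_a) + (1.0 - lam) * F.cross_entropy(logits, y_b)

x, y = torch.randn(8, 3, 224, 224), torch.randint(0, 5, (8,))
mixed_x, y_a, y_b, lam = mixup_batch(x, y)
# loss = mixup_loss(model(mixed_x), y_a, y_b, lam)   # inside the training loop
```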

25 pages, 1066 KiB  
Review
Artificial Intelligence Adoption in Public Administration: An Overview of Top-Cited Articles and Practical Applications
by Matej Babšek, Dejan Ravšelj, Lan Umek and Aleksander Aristovnik
AI 2025, 6(3), 44; https://doi.org/10.3390/ai6030044 - 21 Feb 2025
Cited by 1 | Viewed by 4508
Abstract
Background: The adoption of artificial intelligence (AI) in public administration (PA) has the potential to enhance transparency, efficiency, and responsiveness, ultimately creating greater public value. However, the integration of AI into PA faces challenges, including conceptual ambiguities and limited knowledge of the practical applications. This study addresses these gaps by offering an overview and categorization of AI research and applications in PA. Methods: Using a dataset of 3149 documents from the Scopus database, this study identifies the top 200 most-cited articles based on citations per year. It conducts descriptive and content analyses to identify the existing state, applications, and challenges regarding AI adoption. Additionally, selected AI use cases from the European Commission’s database are categorized, focusing on their contributions to public value. The analysis centers on three governance dimensions: internal processes, service delivery, and policymaking. Results: The findings provide a categorized understanding of AI concepts, types, and applications in PA, alongside a discussion of best practices and challenges. Conclusions: This study serves as a resource for researchers seeking a comprehensive overview of the current state of AI in PA and offers policymakers and practitioners insights into leveraging AI technologies to improve service delivery and operational efficiency. Full article

20 pages, 3066 KiB  
Article
GeNetFormer: Transformer-Based Framework for Gene Expression Prediction in Breast Cancer
by Oumeima Thaalbi and Moulay A. Akhloufi
AI 2025, 6(3), 43; https://doi.org/10.3390/ai6030043 - 21 Feb 2025
Viewed by 874
Abstract
Background: Histopathological images are often used to diagnose breast cancer and have shown high accuracy in classifying cancer subtypes. Prediction of gene expression from whole-slide images and spatial transcriptomics data is important for cancer treatment in general and breast cancer in particular. This topic has been a challenge in numerous studies. Method: In this study, we present a deep learning framework called GeNetFormer. We evaluated eight advanced transformer models including EfficientFormer, FasterViT, BEiT v2, and Swin Transformer v2, and tested their performance in predicting gene expression using the STNet dataset. This dataset contains 68 H&E-stained histology images and transcriptomics data from different types of breast cancer. We followed a detailed process to prepare the data, including filtering genes and spots, normalizing stain colors, and creating smaller image patches for training. The models were trained to predict the expression of 250 genes using different image sizes and loss functions. GeNetFormer achieved the best performance using the MSELoss function and a resolution of 256 × 256 while integrating EfficientFormer. Results: It predicted nine out of the top ten genes with a higher Pearson Correlation Coefficient (PCC) compared to the retrained ST-Net method. For cancer biomarker genes such as DDX5 and XBP1, the PCC values were 0.7450 and 0.7203, respectively, outperforming ST-Net, which scored 0.6713 and 0.7320, respectively. In addition, our method gave better predictions for other genes such as FASN (0.7018 vs. 0.6968) and ERBB2 (0.6241 vs. 0.6211). Conclusions: Our results show that GeNetFormer provides improvements over other models such as ST-Net and show how transformer architectures are capable of analyzing spatial transcriptomics data to advance cancer research. Full article
(This article belongs to the Section Medical & Healthcare AI)
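
The evaluation described above reduces to a Pearson Correlation Coefficient per gene between predicted and measured expression; a minimal sketch with illustrative array shapes and gene names (the values shown are random placeholders, not the paper's results):

```python
# Per-gene Pearson Correlation Coefficient between predictions and measurements.
# Array shapes and gene list are illustrative assumptions; data here is random.
import numpy as np
from scipy.stats import pearsonr

genes = ["DDX5", "XBP1", "FASN", "ERBB2"]
y_true = np.random.rand(100, len(genes))   # measured expression per spot
y_pred = np.random.rand(100, len(genes))   # model predictions per spot

for j, gene in enumerate(genes):
    pcc, _ = pearsonr(y_true[:, j], y_pred[:, j])
    print(f"{gene}: PCC = {pcc:.4f}")
```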

16 pages, 3435 KiB  
Article
A Combined Windowing and Deep Learning Model for the Classification of Brain Disorders Based on Electroencephalogram Signals
by Dina Abooelzahab, Nawal Zaher, Abdel Hamid Soliman and Claude Chibelushi
AI 2025, 6(3), 42; https://doi.org/10.3390/ai6030042 - 20 Feb 2025
Viewed by 964
Abstract
Background: The electroencephalogram (EEG) is essential for diagnosing and classifying brain disorders, enabling early medical intervention. Its ability to identify brain abnormalities has increased its clinical use in assessing changes in brain activity. Recent advancements in deep learning have introduced effective methods for interpreting EEG signals, utilizing large datasets for enhanced accuracy. Objective: This study presents a deep learning-based model designed to classify EEG data with better accuracy compared to existing approaches. Methods: The model consists of three key components: data selection, feature extraction, and classification. Data selection employs a windowing technique, while the feature extraction and classification stages use a deep learning framework combining a convolutional neural network (CNN) and a Long Short-Term Memory (LSTM) network. The resulting architecture includes up to 18 layers. The model was evaluated using the Temple University Hospital (TUH) dataset, comprising data from 2785 patients, ensuring its applicability to real-world scenarios. Results: Comparative performance analysis shows that this approach surpasses existing methods in accuracy, sensitivity, and specificity. Conclusions: This study highlights the potential of deep learning in enhancing EEG signal interpretation, offering a pathway to more accurate and efficient diagnoses of brain disorders for clinical applications. Full article
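
The windowing-plus-CNN-LSTM idea can be outlined briefly: slice each recording into fixed-length windows, extract local features with Conv1D layers, and model temporal structure with an LSTM. Window length, channel count, and layer sizes are illustrative assumptions, not the paper's 18-layer architecture.

```python
# Sketch of EEG windowing followed by a small CNN-LSTM classifier.
# Window length, channel count, and layer sizes are illustrative assumptions.
import numpy as np
import tensorflow as tf

def make_windows(signal, win=256, step=128):
    """signal: (timesteps, channels) -> (n_windows, win, channels)"""
    return np.stack([signal[i:i + win]
                     for i in range(0, signal.shape[0] - win + 1, step)])

n_channels, n_classes = 19, 3
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(256, n_channels)),
    tf.keras.layers.Conv1D(32, 7, activation="relu"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(64, 5, activation="relu"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
```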
