AI, Volume 6, Issue 10 (October 2025) – 28 articles

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view the papers in PDF format, click on the "PDF Full-text" link, and use the free Adobe Reader to open them.
22 pages, 3339 KB  
Article
An AutoML Algorithm: Multiple-Steps Ahead Forecasting of Correlated Multivariate Time Series with Anomalies Using Gated Recurrent Unit Networks
by Ying Su and Morgan C. Wang
AI 2025, 6(10), 267; https://doi.org/10.3390/ai6100267 - 14 Oct 2025
Abstract
Multiple time series forecasting is critical in domains such as energy management, economic analysis, web traffic prediction and air pollution monitoring to support effective resource planning. Traditional statistical learning methods, including Vector Autoregression (VAR) and Vector Autoregressive Integrated Moving Average (VARIMA), struggle with nonstationarity, temporal dependencies, inter-series correlations, and data anomalies such as trend shifts, seasonal variations, and missing data. Furthermore, their effectiveness in multi-step ahead forecasting is often limited. This article presents an Automated Machine Learning (AutoML) framework that provides an end-to-end solution for researchers who lack in-depth knowledge of time series forecasting or advanced programming skills. This framework utilizes Gated Recurrent Unit (GRU) networks, a variant of Recurrent Neural Networks (RNNs), to tackle multiple correlated time series forecasting problems, even in the presence of anomalies. To reduce complexity and facilitate the AutoML process, many model parameters are pre-specified, thereby requiring minimal tuning. This design enables efficient and accurate multi-step forecasting while addressing issues including missing values and structural shifts. We also examine the advantages and limitations of GRU-based RNNs within the AutoML system for multivariate time series forecasting. Model performance is evaluated using multiple accuracy metrics across various forecast horizons. The empirical results confirm our proposed approach’s ability to capture inter-series dependencies and handle anomalies in long-range forecasts. Full article
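To make the setup concrete, the following minimal sketch shows a GRU network of the kind the abstract describes, mapping a window of correlated series to multi-step-ahead forecasts. Layer sizes, window length, and horizon are illustrative assumptions, not the authors' configuration:

```python
import torch
import torch.nn as nn

class GRUForecaster(nn.Module):
    """Maps a window of n_series correlated inputs to `horizon` steps ahead."""
    def __init__(self, n_series=4, hidden=64, horizon=12):
        super().__init__()
        self.gru = nn.GRU(n_series, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_series * horizon)
        self.n_series, self.horizon = n_series, horizon

    def forward(self, x):                 # x: (batch, window, n_series)
        _, h = self.gru(x)                # h: (layers, batch, hidden)
        out = self.head(h[-1])            # last layer's final hidden state
        return out.view(-1, self.horizon, self.n_series)

model = GRUForecaster()
window = torch.randn(8, 48, 4)            # batch of 8 windows, 48 past steps
forecast = model(window)                  # (8, 12, 4): 12 steps x 4 series
```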
24 pages, 1699 KB  
Article
Efficient Sparse MLPs Through Motif-Level Optimization Under Resource Constraints
by Xiaotian Chen, Hongyun Liu and Seyed Sahand Mohammadi Ziabari
AI 2025, 6(10), 266; https://doi.org/10.3390/ai6100266 - 9 Oct 2025
Viewed by 298
Abstract
We study motif-based optimization for sparse multilayer perceptrons (MLPs), where weights are shared and updated at the level of small neuron groups (‘motifs’) rather than individual connections. Building on Sparse Evolutionary Training (SET), our approach reduces the number of unique parameters and redundant multiply–accumulate operations by exploiting block-structured sparsity. Across Fashion-MNIST and a lung X-ray dataset, our Motif-SET improves training/inference efficiency with modest accuracy trade-offs, and we provide a principled recipe to choose motif size based on accuracy and efficiency budgets. We further compare against representative modern sparse training and compression methods, analyze failure modes such as overly large motifs, and outline real-world constraints on mobile/embedded targets. Our results and ablations indicate that motif size m=2 often offers a strong balance between compute and accuracy under resource constraints. Full article
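One simple realization of motif-level weight sharing, assuming each m x m block of a dense weight matrix shares a single trainable parameter (the paper's exact SET-based scheme may differ):

```python
import torch
import torch.nn as nn

class MotifLinear(nn.Module):
    """Linear layer whose weights are shared within m x m neuron-group blocks,
    so the number of unique parameters shrinks by a factor of m**2."""
    def __init__(self, in_f, out_f, m=2):
        super().__init__()
        assert in_f % m == 0 and out_f % m == 0
        self.m = m
        self.block = nn.Parameter(torch.randn(out_f // m, in_f // m) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_f))

    def forward(self, x):
        # Expand each scalar to an m x m block (one shared weight per motif).
        w = self.block.repeat_interleave(self.m, 0).repeat_interleave(self.m, 1)
        return x @ w.t() + self.bias

layer = MotifLinear(784, 256, m=2)  # 784*256 connections, 784*256/4 unique weights
y = layer(torch.randn(32, 784))
```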
20 pages, 3126 KB  
Article
Few-Shot Image Classification Algorithm Based on Global–Local Feature Fusion
by Lei Zhang, Xinyu Yang, Xiyuan Cheng, Wenbin Cheng and Yiting Lin
AI 2025, 6(10), 265; https://doi.org/10.3390/ai6100265 - 9 Oct 2025
Viewed by 355
Abstract
Few-shot image classification seeks to recognize novel categories from only a handful of labeled examples, but conventional metric-based methods that rely mainly on global image features often produce unstable prototypes under extreme data scarcity, while local-descriptor approaches can lose context and suffer from inter-class local-pattern overlap. To address these limitations, we propose a Global–Local Feature Fusion network that combines a frozen, pretrained global feature branch with a self-attention based multi-local feature fusion branch. Multiple random crops are encoded by a shared backbone (ResNet-12), projected to Query/Key/Value embeddings, and fused via scaled dot-product self-attention to suppress background noise and highlight discriminative local cues. The fused local representation is concatenated with the global feature to form robust class prototypes used in a prototypical-network style classifier. On four benchmarks, our method achieves strong improvements: Mini-ImageNet 70.31% ± 0.20 (1-shot)/85.91% ± 0.13 (5-shot), Tiered-ImageNet 73.37% ± 0.22/87.62% ± 0.14, FC-100 47.01% ± 0.20/64.13% ± 0.19, and CUB-200-2011 82.80% ± 0.18/93.19% ± 0.09, demonstrating consistent gains over competitive baselines. Ablation studies show that (1) naive local averaging improves over global-only baselines, (2) self-attention fusion yields a large additional gain (e.g., +4.50% in 1-shot on Mini-ImageNet), and (3) concatenating global and fused local features gives the best overall performance. These results indicate that explicitly modeling inter-patch relations and fusing multi-granularity cues produces markedly more discriminative prototypes in few-shot regimes. Full article
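A minimal sketch of the fusion step described above: scaled dot-product self-attention over per-crop embeddings, followed by concatenation with the global feature. The random projections and dimensions are toy placeholders for the learned Query/Key/Value mappings:

```python
import torch
import torch.nn.functional as F

def fuse_local_features(crops, d=64):
    """Scaled dot-product self-attention over per-crop embeddings,
    mean-pooled into one fused local descriptor per image."""
    B, n_crops, dim = crops.shape
    Wq, Wk, Wv = (torch.randn(dim, d) for _ in range(3))  # toy projections
    Q, K, V = crops @ Wq, crops @ Wk, crops @ Wv
    attn = F.softmax(Q @ K.transpose(1, 2) / d ** 0.5, dim=-1)
    return (attn @ V).mean(dim=1)         # (B, d)

global_feat = torch.randn(4, 640)         # e.g., frozen-backbone embeddings
local_feats = torch.randn(4, 8, 640)      # 8 random crops per image
fused = fuse_local_features(local_feats)
prototype_input = torch.cat([global_feat, fused], dim=-1)  # prototype feature
```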
15 pages, 3254 KB  
Article
Rodent Social Behavior Recognition Using a Global Context-Aware Vision Transformer Network
by Muhammad Imran Sharif, Doina Caragea and Ahmed Iqbal
AI 2025, 6(10), 264; https://doi.org/10.3390/ai6100264 - 8 Oct 2025
Viewed by 384
Abstract
Animal behavior recognition is an important research area that provides insights into areas such as neural functions, gene mutations, and drug efficacy, among others. The manual coding of behaviors based on video recordings is labor-intensive and prone to inconsistencies and human error. Machine learning approaches have been used to automate the analysis of animal behavior with promising results. Our work builds on existing developments in animal behavior analysis and state-of-the-art approaches in computer vision to identify rodent social behaviors. Specifically, our proposed approach, called Vision Transformer for Rat Social Interactions (ViT-RSI), leverages the existing Global Context Vision Transformer (GC-ViT) architecture to identify rat social interactions. Experimental results using five behaviors of the publicly available Rat Social Interaction (RatSI) dataset show that the ViT-RSI approach can accurately identify rat social interaction behaviors. When compared with prior results from the literature, the ViT-RSI approach achieves the best results for four out of five behaviors, specifically the “Approaching”, “Following”, “Moving away”, and “Solitary” behaviors, with F1 scores of 0.81, 0.81, 0.86, and 0.94, respectively. Full article
(This article belongs to the Special Issue AI in Bio and Healthcare Informatics)
20 pages, 1740 KB  
Article
Cross-Modal Alignment Enhancement for Vision–Language Tracking via Textual Heatmap Mapping
by Wei Xu, Gu Geng, Xinming Zhang and Di Yuan
AI 2025, 6(10), 263; https://doi.org/10.3390/ai6100263 - 8 Oct 2025
Viewed by 406
Abstract
Single-object vision–language tracking has become an important research topic due to its potential in applications such as intelligent surveillance and autonomous driving. However, existing cross-modal alignment methods typically rely on contrastive learning and struggle to effectively address semantic ambiguity or the presence of multiple similar objects. This study explores how to achieve more robust vision–language alignment under these challenging conditions, thereby achieving accurate object localization. To this end, we propose a textual heatmap mapping (THM) module that enhances the spatial guidance of textual cues in tracking. The THM module integrates visual and language features and generates semantically aware heatmaps, enabling the tracker to focus on the most relevant regions while suppressing distractors. The framework, developed based on UVLTrack, combines a visual transformer with a pre-trained language encoder. The proposed method is evaluated on benchmark datasets such as OTB99, LaSOT, and TNL2K. The main contributions of this paper are the introduction of a novel spatial alignment mechanism for multimodal tracking and the demonstration of its effectiveness on various tracking benchmarks. Results demonstrate that the THM-based tracker improves robustness to semantic ambiguity and multi-instance interference, outperforming baseline frameworks. Full article
27 pages, 9738 KB  
Article
Machine Learning Recognition and Phase Velocity Estimation of Atmospheric Gravity Waves from OI 557.7 nm All-Sky Airglow Images
by Rady Mahmoud, Moataz Abdelwahab, Kazuo Shiokawa and Ayman Mahrous
AI 2025, 6(10), 262; https://doi.org/10.3390/ai6100262 - 7 Oct 2025
Viewed by 450
Abstract
Atmospheric gravity waves (AGWs) are treated as density structure perturbations of the atmosphere and play an important role in atmospheric dynamics. Utilizing All-Sky Airglow Imagers (ASAIs) with the OI 557.7 nm filter, AGW phase velocities and propagation directions were extracted from images classified by visual inspection, where airglow images were collected from the OMTI network at Shigaraki (34.85 N, 134.11 E) from October 1998 to October 2002. However, a large dataset of airglow images must be processed and classified to study AGW seasonal variation in the middle atmosphere. In this article, a machine learning-based approach for image recognition of AGWs from ASAIs is proposed. To this end, three convolutional neural networks (CNNs), namely AlexNet, GoogLeNet, and ResNet-50, are considered. Out of 13,201 deviated images, 1192 very weak/unclear AGW signatures were eliminated during the quality control process. All networks were trained and tested on 12,007 classified images, which approximately cover the solar cycle maximum during the time period mentioned above. In the testing phase, AlexNet achieved the highest accuracy of 98.41%. Subsequently, estimation of AGW zonal and meridional phase velocities in the mesosphere region by a cascade forward neural network (CFNN) is presented. The CFNN was trained and tested on AGW and neutral wind data. AGW data were extracted from the classified AGW images by event and spectral methods, while wind data were extracted from the Horizontal Wind Model (HWM) as well as the middle and upper atmosphere radar in Shigaraki. As a result, the estimated phase velocities were determined with a correlation coefficient (R) above 0.89 in all training and testing phases. Finally, a comparison with existing studies confirms the accuracy of our proposed approaches, in addition to AGW velocity forecasting. Full article
19 pages, 1858 KB  
Article
Color Space Comparison of Isolated Cervix Cells for Morphology Classification
by Irari Jiménez-López, José E. Valdez-Rodríguez and Marco A. Moreno-Armendáriz
AI 2025, 6(10), 261; https://doi.org/10.3390/ai6100261 - 7 Oct 2025
Viewed by 276
Abstract
Cervical cytology processing involves the morphological analysis of cervical cells to detect abnormalities. In recent years, machine learning and deep learning algorithms have been explored to automate this process. This study investigates the use of color space transformations as a preprocessing technique to reorganize visual information and improve classification performance using isolated cell images. Twelve color space transformations were compared, including RGB, CMYK, HSV, Grayscale, CIELAB, YUV, the individual RGB channels, and combinations of these channels (RG, RB, and GB). Two classification strategies were employed: binary classification (normal vs. abnormal) and five-class classification. The SIPaKMeD dataset was used, with images resized to 256×256 pixels via zero-padding. Data augmentation included random flipping and ±10° rotations applied with a 50% probability, followed by normalization. A custom CNN architecture was developed, comprising four convolutional layers followed by two fully connected layers and an output layer. The model achieved average precision, recall, and F1-score values of 91.39%, 91.34%, and 91.31% for the five-class case, respectively, and 99.69%, 96.68%, and 96.89% for the binary classification, respectively; these results were compared with a VGG-16 network. Furthermore, CMYK, HSV, and the RG channel combination consistently outperformed other color spaces, highlighting their potential to enhance classification accuracy. Full article
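For reference, most of the compared color space transformations can be produced with OpenCV; CMYK needs a manual approximation since OpenCV lacks a built-in conversion. The file name is a placeholder, and this is only a plausible preprocessing sketch, not the authors' code:

```python
import cv2
import numpy as np

img = cv2.imread("cell.png")                       # BGR uint8, as OpenCV loads it

spaces = {
    "HSV":  cv2.cvtColor(img, cv2.COLOR_BGR2HSV),
    "LAB":  cv2.cvtColor(img, cv2.COLOR_BGR2LAB),  # CIELAB
    "YUV":  cv2.cvtColor(img, cv2.COLOR_BGR2YUV),
    "GRAY": cv2.cvtColor(img, cv2.COLOR_BGR2GRAY),
    "RG":   img[:, :, [2, 1]],                     # red + green channels only
}

# Common normalized CMYK approximation from RGB:
rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
k = 1 - rgb.max(axis=2)
denom = np.clip(1 - k, 1e-6, None)
c, m, y = [(1 - rgb[:, :, i] - k) / denom for i in range(3)]
cmyk = np.dstack([c, m, y, k])
```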
(This article belongs to the Special Issue AI in Bio and Healthcare Informatics)
80 pages, 7623 KB  
Systematic Review
From Illusion to Insight: A Taxonomic Survey of Hallucination Mitigation Techniques in LLMs
by Ioannis Kazlaris, Efstathios Antoniou, Konstantinos Diamantaras and Charalampos Bratsas
AI 2025, 6(10), 260; https://doi.org/10.3390/ai6100260 - 3 Oct 2025
Viewed by 580
Abstract
Large Language Models (LLMs) exhibit remarkable generative capabilities but remain vulnerable to hallucinations—outputs that are fluent yet inaccurate, ungrounded, or inconsistent with source material. To address the lack of methodologically grounded surveys, this paper introduces a novel method-oriented taxonomy of hallucination mitigation strategies in text-based LLMs. The taxonomy organizes over 300 studies into six principled categories: Training and Learning Approaches, Architectural Modifications, Input/Prompt Optimization, Post-Generation Quality Control, Interpretability and Diagnostic Methods, and Agent-Based Orchestration. Beyond mapping the field, we identify persistent challenges such as the absence of standardized evaluation benchmarks, attribution difficulties in multi-method systems, and the fragility of retrieval-based methods when sources are noisy or outdated. We also highlight emerging directions, including knowledge-grounded fine-tuning and hybrid retrieval–generation pipelines integrated with self-reflective reasoning agents. This taxonomy provides a methodological framework for advancing reliable, context-sensitive LLM deployment in high-stakes domains such as healthcare, law, and defense. Full article
(This article belongs to the Section AI Systems: Theory and Applications)
21 pages, 899 KB  
Article
Gated Fusion Networks for Multi-Modal Violence Detection
by Bilal Ahmad, Mustaqeem Khan and Muhammad Sajjad
AI 2025, 6(10), 259; https://doi.org/10.3390/ai6100259 - 3 Oct 2025
Viewed by 359
Abstract
Public safety and security require an effective monitoring system to detect violence through visual, audio, and motion data. However, current methods often fail to utilize the complementary benefits of visual and auditory modalities, thereby reducing their overall effectiveness. To enhance violence detection, we present a novel multimodal method in this paper that exploits motion, audio, and visual information from the input to recognize violence. We designed a framework comprising two specialized components, a gated fusion module and a multi-scale transformer, which together enable the efficient detection of violence in multimodal data. To ensure a seamless and effective integration of features, the gated fusion module dynamically adjusts the contribution of each modality. At the same time, the multi-scale transformer utilizes multiple instance learning (MIL) to identify violent behaviors more accurately from input data by capturing complex temporal correlations. Our model fully integrates multi-modal information using these techniques, improving the accuracy of violence detection. In this study, we found that our approach outperformed state-of-the-art methods with an accuracy of 86.85% on the XD-Violence dataset, demonstrating the potential of multi-modal fusion in detecting violence. Full article
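A minimal sketch of a gated fusion module in the spirit described above, where learned sigmoid gates weight each modality's features before they are merged; dimensions and the exact gating form are assumptions:

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Learns per-sample gates that weight each modality's contribution
    before the streams are summed into one fused representation."""
    def __init__(self, dim=256, n_modalities=3):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(dim * n_modalities, n_modalities),
                                  nn.Sigmoid())

    def forward(self, feats):                 # feats: list of (B, dim) tensors
        g = self.gate(torch.cat(feats, -1))   # (B, n_modalities) in [0, 1]
        return sum(g[:, i:i + 1] * f for i, f in enumerate(feats))

rgb, flow, audio = (torch.randn(4, 256) for _ in range(3))
fused = GatedFusion()([rgb, flow, audio])     # (4, 256) fused representation
```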
45 pages, 7902 KB  
Review
Artificial Intelligence-Guided Supervised Learning Models for Photocatalysis in Wastewater Treatment
by Asma Rehman, Muhammad Adnan Iqbal, Mohammad Tauseef Haider and Adnan Majeed
AI 2025, 6(10), 258; https://doi.org/10.3390/ai6100258 - 3 Oct 2025
Viewed by 637
Abstract
Artificial intelligence (AI), when integrated with photocatalysis, has demonstrated high predictive accuracy in optimizing photocatalytic processes for wastewater treatment using a variety of catalysts such as TiO2, ZnO, CdS, Zr, WO2, and CeO2. The progress of research in this area is greatly enhanced by advancements in data science and AI, which enable rapid analysis of large datasets in materials chemistry. This article presents a comprehensive review and critical assessment of AI-based supervised learning models, including support vector machines (SVMs), artificial neural networks (ANNs), and tree-based algorithms. Their predictive capabilities have been evaluated using statistical metrics such as the coefficient of determination (R2), root mean square error (RMSE), and mean absolute error (MAE), with numerous investigations documenting R2 values greater than 0.95 and RMSE values as low as 0.02 in forecasting pollutant degradation. To enhance model interpretability, Shapley Additive Explanations (SHAP) have been employed to prioritize the relative significance of input variables, illustrating, for example, that pH and light intensity frequently exert the most substantial influence on photocatalytic performance. These AI frameworks not only attain dependable predictions of degradation efficiency for dyes, pharmaceuticals, and heavy metals, but also contribute to economically viable optimization strategies and the identification of novel photocatalysts. Overall, this review provides evidence-based guidance for researchers and practitioners seeking to advance wastewater treatment technologies by integrating supervised machine learning with photocatalysis. Full article
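As an illustration of the SHAP workflow the review discusses, the following sketch ranks synthetic photocatalysis-style inputs by mean absolute SHAP value using a tree-based model; the data and feature roles are invented stand-ins:

```python
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

# Toy stand-ins for photocatalysis inputs: pH, light intensity, dose, time.
rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 4))
y = 0.6 * X[:, 0] + 0.3 * X[:, 1] + 0.1 * rng.normal(size=500)  # synthetic target

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
print(np.abs(shap_values).mean(axis=0))  # mean |SHAP| per feature: global ranking
```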
40 pages, 2282 KB  
Review
Data Preprocessing and Feature Engineering for Data Mining: Techniques, Tools, and Best Practices
by Paraskevas Koukaras and Christos Tjortjis
AI 2025, 6(10), 257; https://doi.org/10.3390/ai6100257 - 2 Oct 2025
Viewed by 408
Abstract
Data preprocessing and feature engineering play key roles in data mining initiatives, as they have a significant impact on the accuracy, reproducibility, and interpretability of analytical results. This review presents an analysis of state-of-the-art techniques and tools that can be used in data input preparation and data manipulation to be processed by mining tasks in diverse application scenarios. Additionally, basic preprocessing techniques are discussed, including data cleaning, normalisation, and encoding, as well as more sophisticated approaches regarding feature construction, selection, and dimensionality reduction. This work considers manual and automated methods, highlighting their integration in reproducible, large-scale pipelines by leveraging modern libraries. We also discuss assessment methods of preprocessing effects on precision, stability, and bias–variance trade-offs for models, as well as pipeline integrity monitoring, when operating environments vary. We focus on emerging issues regarding scalability, fairness, and interpretability, as well as future directions involving adaptive preprocessing and automation guided by ethically sound design philosophies. This work aims to benefit both professionals and researchers by shedding light on best practices, while acknowledging existing research questions and innovation opportunities. Full article
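Many of the basic steps discussed (imputation, scaling, encoding) compose naturally into a reproducible pipeline; a small scikit-learn sketch with hypothetical column names:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric = ["age", "income"]          # hypothetical column names
categorical = ["city"]

preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                      ("encode", OneHotEncoder(handle_unknown="ignore"))]),
     categorical),
])

df = pd.DataFrame({"age": [31, None, 45], "income": [40e3, 52e3, None],
                   "city": ["Thessaloniki", "Athens", None]})
X = preprocess.fit_transform(df)     # clean, scaled, encoded feature matrix
```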
23 pages, 1370 KB  
Article
The PacifAIst Benchmark: Do AIs Prioritize Human Survival over Their Own Objectives?
by Manuel Herrador
AI 2025, 6(10), 256; https://doi.org/10.3390/ai6100256 - 2 Oct 2025
Viewed by 590
Abstract
As artificial intelligence transitions from conversational agents to autonomous actors in high-stakes environments, a critical gap emerges: how to ensure AI prioritizes human safety when its core objectives conflict with human well-being. Current safety benchmarks focus on harmful content, not behavioral alignment during instrumental goal conflicts. To address this, we introduce PacifAIst, a benchmark of 700 scenarios testing self-preservation, resource acquisition, and deception. We evaluated eight state-of-the-art large language models, revealing a significant performance hierarchy. Google’s Gemini 2.5 Flash demonstrated the strongest human-centric alignment (90.31%), while the highly anticipated GPT-5 scored lowest (79.49%), indicating potential risks. These findings establish an urgent need to shift the focus of AI safety evaluation from what models say to what they would do, ensuring that autonomous systems are not just helpful in theory but are provably safe in practice. Full article
9 pages, 452 KB  
Article
Diagnostic Performance of AI-Assisted Software in Sports Dentistry: A Validation Study
by André Júdice, Diogo Brandão, Carlota Rodrigues, Cátia Simões, Gabriel Nogueira, Vanessa Machado, Luciano Maia Alves Ferreira, Daniel Ferreira, Luís Proença, João Botelho, Peter Fine and José João Mendes
AI 2025, 6(10), 255; https://doi.org/10.3390/ai6100255 - 1 Oct 2025
Viewed by 682
Abstract
Artificial Intelligence (AI) applications in sports dentistry have the potential to improve early detection and diagnosis. We aimed to validate the diagnostic performance of AI-assisted software in detecting dental caries, periodontitis, and tooth wear using panoramic radiographs in elite athletes. This cross-sectional validation study included secondary data from 114 elite athletes from the Sports Dentistry department at Egas Moniz Dental Clinic. The AI software’s performance was compared to clinically validated assessments. Dental caries and tooth wear were inspected clinically and confirmed radiographically. Periodontitis was registered through self-reports. We calculated sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), as well as the area under the curve and respective 95% confidence intervals. Inter-rater agreement was assessed using Cohen’s kappa statistic. The AI software showed high reproducibility, with kappa values of 0.82 for caries, 0.91 for periodontitis, 0.96 for periapical lesions, and 0.76 for tooth wear. Sensitivity was highest for periodontitis (1.00; AUC = 0.84), moderate for caries (0.74; AUC = 0.69), and lower for tooth wear (0.53; AUC = 0.68). Full agreement between AI and clinical reference was achieved in 86.0% of cases. The software generated a median of 3 AI-specific suggestions per case (range: 0–16). In 21.9% of cases, AI’s interpretation of periodontal level was deemed inadequate; among these, only 2 cases were clinically confirmed as periodontitis. Of the 34 false positives for periodontitis, 32.4% were misidentified by the AI. The AI-assisted software demonstrated substantial agreement with clinical diagnosis, particularly for periodontitis and caries. The relatively high false-positive rate for periodontitis and limited sensitivity for tooth wear underscore the need for cautious clinical integration, supervision, and further model refinements. However, this software did show overall adequate performance for application in Sports Dentistry. Full article
20 pages, 5721 KB  
Article
Support Vector Machines to Propose a Ground Motion Prediction Equation for the Particular Case of the Bojorquez Intensity Measure INp
by Edén Bojórquez, Omar Payán-Serrano, Juan Bojórquez, Ali Rodríguez-Castellanos, Sonia E. Ruiz, Alfredo Reyes-Salazar, Robespierre Chávez, Herian Leyva and Fernando Velarde
AI 2025, 6(10), 254; https://doi.org/10.3390/ai6100254 - 1 Oct 2025
Viewed by 351
Abstract
This study proposes the first ground motion prediction equation (GMPE) for the parameter INp, an intensity measure based on the spectral shape. A Machine Learning Algorithm based on Support Vector Machines (SVMs) was employed due to its robustness towards outliers, which is a key advantage over ordinary linear regression. INp also offers a more robust measure of the ground motion intensity than the traditionally used spectral acceleration at the first mode of vibration of the structure Sa(T1). The SVM algorithm, configured for regression (SVR), was applied to derive the prediction coefficients of INp for diverse vibration periods. Furthermore, the complete dataset was analyzed to develop a unified, generalized expression applicable across all the periods considered. To validate the model’s reliability and its ability to generalize, a cross-validation analysis was performed. The results from this rigorous validation confirm the model’s robustness and demonstrate that its predictive accuracy is not dependent on a specific data split. The numerical results show that the newly developed GMPE reveals high predictive accuracy for periods shorter than 3 s and acceptable accuracy for longer periods. The generalized equation exhibits an acceptable coefficient of determination and Mean Squared Error (MSE) for periods from 0.1 to 5 s. This work not only highlights the powerful potential of machine learning in seismic engineering but also introduces a more sophisticated and effective tool for predicting ground motion intensity. Full article
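A minimal sketch of the SVR-for-regression setup the abstract describes, fit on synthetic magnitude/distance stand-ins rather than the authors' ground motion records:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-ins for GMPE predictors: magnitude and log-distance.
rng = np.random.default_rng(1)
M = rng.uniform(5.0, 8.0, 400)
logR = np.log(rng.uniform(10.0, 200.0, 400))
ln_inp = 1.2 * M - 1.5 * logR + rng.normal(0, 0.4, 400)  # toy attenuation form

X = np.column_stack([M, logR])
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
print(cross_val_score(model, X, ln_inp, cv=5, scoring="r2").mean())
```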
90 pages, 29362 KB  
Review
AI for Wildfire Management: From Prediction to Detection, Simulation, and Impact Analysis—Bridging Lab Metrics and Real-World Validation
by Nicolas Caron, Hassan N. Noura, Lise Nakache, Christophe Guyeux and Benjamin Aynes
AI 2025, 6(10), 253; https://doi.org/10.3390/ai6100253 - 1 Oct 2025
Viewed by 1145
Abstract
Artificial intelligence (AI) offers several opportunities in wildfire management, particularly for improving short- and long-term fire occurrence forecasting, spread modeling, and decision-making. When properly adapted beyond research into real-world settings, AI can significantly reduce risks to human life, as well as ecological and economic damages. However, despite increasingly sophisticated research, the operational use of AI in wildfire contexts remains limited. In this article, we review the main domains of wildfire management where AI has been applied—susceptibility mapping, prediction, detection, simulation, and impact assessment—and highlight critical limitations that hinder practical adoption. These include challenges with dataset imbalance and accessibility, the inadequacy of commonly used metrics, the choice of prediction formats, and the computational costs of large-scale models, all of which reduce model trustworthiness and applicability. Beyond synthesizing existing work, our survey makes four explicit contributions: (1) we provide a reproducible taxonomy supported by detailed dataset tables, emphasizing both the reliability and shortcomings of frequently used data sources; (2) we propose evaluation guidance tailored to imbalanced and spatial tasks, stressing the importance of using accurate metrics and format; (3) we provide a complete state of the art, highlighting important issues and recommendations to enhance models’ performances and reliability from susceptibility to damage analysis; (4) we introduce a deployment checklist that considers cost, latency, required expertise, and integration with decision-support and optimization systems. By bridging the gap between laboratory-oriented models and real-world validation, our work advances prior reviews and aims to strengthen confidence in AI-driven wildfire management while guiding future research toward operational applicability. Full article
16 pages, 7297 KB  
Article
Attention-Based Multi-Agent RL for Multi-Machine Tending Using Mobile Robots
by Abdalwhab Bakheet Mohamed Abdalwhab, Giovanni Beltrame, Samira Ebrahimi Kahou and David St-Onge
AI 2025, 6(10), 252; https://doi.org/10.3390/ai6100252 - 1 Oct 2025
Viewed by 479
Abstract
Robotics can help address the growing worker shortage in the manufacturing industry. In particular, machine tending is a task that collaborative robots can tackle and that can greatly boost productivity. Nevertheless, existing robotics systems deployed in that sector rely on a fixed single-arm setup, whereas mobile robots can provide more flexibility and scalability. We introduce a multi-agent multi-machine-tending learning framework for mobile robots based on multi-agent reinforcement learning (MARL) techniques, including the design of a suitable observation and reward. Moreover, we integrate an attention-based encoding mechanism into the Multi-Agent Proximal Policy Optimization (MAPPO) algorithm to boost its performance for machine-tending scenarios. Our model (AB-MAPPO) outperforms MAPPO in this new challenging scenario in terms of task success, safety, and resource utilization. Furthermore, we provide an extensive ablation study to support our design decisions. Full article
18 pages, 966 KB  
Article
Deep Learning Approaches for Classifying Aviation Safety Incidents: Evidence from Australian Data
by Aziida Nanyonga, Keith Francis Joiner, Ugur Turhan and Graham Wild
AI 2025, 6(10), 251; https://doi.org/10.3390/ai6100251 - 1 Oct 2025
Viewed by 361
Abstract
Aviation safety remains a critical area of research, requiring accurate and efficient classification of incident reports to enhance risk assessment and accident prevention strategies. This study evaluates the performance of three deep learning models, BERT, Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM), for classifying incidents based on injury severity levels: Nil, Minor, Serious, and Fatal. The dataset, drawn from Australian Transport Safety Bureau (ATSB) records covering the years 2013 to 2023, consists of 53,273 records. The models were trained using a standardized preprocessing pipeline, with hyperparameter tuning to optimize performance. Model performance was evaluated using metrics such as F1-score, accuracy, recall, and precision. Results revealed that BERT outperformed both LSTM and CNN across all metrics, achieving near-perfect scores (1.00) for precision, recall, F1-score, and accuracy in all classes. In comparison, LSTM achieved an accuracy of 99.01%, with strong performance in the “Nil” class but less favorable results for the “Minor” class. CNN, with an accuracy of 98.99%, excelled in the “Fatal” and “Serious” classes, though it showed moderate performance in the “Minor” class. BERT’s flawless performance highlights the strengths of the transformer architecture in handling sophisticated text classification problems. These findings underscore the strengths and limitations of traditional deep learning models versus transformer-based approaches, providing valuable insights for future research in aviation safety analysis. Future work will explore integrating ensemble methods, domain-specific embeddings, and model interpretability to further improve classification performance and transparency in aviation safety prediction. Full article
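For orientation, a minimal sketch of BERT-based severity classification with the Hugging Face transformers API; the checkpoint is generic and the classification head is untrained here, so it would need fine-tuning on labeled reports before the printed prediction is meaningful:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

labels = ["Nil", "Minor", "Serious", "Fatal"]
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(labels))  # fine-tune before real use

report = "During descent the aircraft encountered windshear; no injuries reported."
inputs = tok(report, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(labels[logits.argmax(-1).item()])  # untrained head: random until fine-tuned
```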
(This article belongs to the Topic Big Data and Artificial Intelligence, 3rd Edition)
15 pages, 10305 KB  
Article
Convolutional Neural Network for Automatic Detection of Segments Contaminated by Interference in ECG Signal
by Veronika Kalousková, Pavel Smrčka, Radim Kliment, Tomáš Veselý, Martin Vítězník, Adam Zach and Petr Šrotýř
AI 2025, 6(10), 250; https://doi.org/10.3390/ai6100250 - 1 Oct 2025
Viewed by 323
Abstract
Various types of interfering signals are an integral part of ECGs recorded using wearable electronics, especially during field monitoring outside the controlled environment of a medical doctor’s office or laboratory. The frequency spectrum of several types of interfering signals overlaps significantly with the ECG signal, making effective filtration impossible without losing clinically relevant information. In this article, we proceed from the practical assumption that, in real long-term recordings, it is unnecessary to analyze the entire ECG signal. Instead, in the preprocessing phase, it is necessary to detect unreadable segments of the ECG signal. This paper proposes a novel method for automatically detecting unreadable segments distorted by superimposed interference in ECG recordings. The method is based on a convolutional neural network (CNN) and is comparable in quality to annotation performed by a medical expert, but incomparably faster. In a series of controlled experiments, the ECG signal was recorded during physical activities of varying intensities, and individual segments of the recordings were manually annotated based on visual assessment by a medical expert, i.e., divided into four classes based on the intensity of distortion of the useful ECG signal. A deep convolutional model was designed and evaluated, exhibiting an 87.62% accuracy score and the same F1-score in automatic recognition of segments distorted by superimposed interference. Furthermore, the model exhibits an accuracy and F1-score of 98.70% in correctly identifying segments with visually detectable and non-detectable heart rate. The proposed interference detection procedure appears to be sufficiently effective despite its simplicity. It facilitates subsequent automatic analysis of undisturbed ECG waveform segments, which is crucial in ECG monitoring using wearable electronics. Full article
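A minimal 1D-CNN sketch for segment-level classification of the kind described, with four distortion classes; the architecture and segment length are illustrative assumptions, not the published model:

```python
import torch
import torch.nn as nn

class ECGSegmentCNN(nn.Module):
    """Classifies fixed-length ECG segments into 4 distortion classes."""
    def __init__(self, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x):                 # x: (batch, 1, n_samples)
        return self.classifier(self.features(x).squeeze(-1))

segments = torch.randn(16, 1, 2500)       # e.g., 10 s at 250 Hz
logits = ECGSegmentCNN()(segments)        # (16, 4) class scores
```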
19 pages, 4717 KB  
Article
Benchmarking Psychological Lexicons and Large Language Models for Emotion Detection in Brazilian Portuguese
by Thales David Domingues Aparecido, Alexis Carrillo, Chico Q. Camargo and Massimo Stella
AI 2025, 6(10), 249; https://doi.org/10.3390/ai6100249 - 1 Oct 2025
Viewed by 447
Abstract
Emotion detection in Brazilian Portuguese is less studied than in English. We benchmarked a large language model (Mistral 24B), a language-specific transformer model (BERTimbau), and the lexicon-based EmoAtlas for classifying emotions in Brazilian Portuguese text, with a focus on eight emotions derived from Plutchik’s model. Evaluation covered four corpora: 4000 stock-market tweets, 1000 news headlines, 5000 GoEmotions Reddit comments translated by LLMs, and 2000 DeepSeek-generated headlines. While BERTimbau achieved the highest average scores (accuracy 0.876, precision 0.529, and recall 0.423), an overlap with Mistral (accuracy 0.831, precision 0.522, and recall 0.539) and notable performance variability suggest there is no single top performer; however, both transformer-based models outperformed the lexicon-based EmoAtlas (accuracy 0.797) but required up to 40 times more computational resources. We also introduce a novel “emotional fingerprinting” methodology using a synthetically generated dataset to probe emotional alignment, which revealed an imperfect overlap in the emotional representations of the models. While LLMs deliver higher overall scores, EmoAtlas offers superior interpretability and efficiency, making it a cost-effective alternative. This work delivers the first quantitative benchmark for interpretable emotion detection in Brazilian Portuguese, with open datasets and code to foster research in multilingual natural language processing. Full article
18 pages, 3163 KB  
Article
A Multi-Stage Deep Learning Framework for Antenna Array Synthesis in Satellite IoT Networks
by Valliammai Arunachalam, Luke Rosen, Mojisola Rachel Akinsiku, Shuvashis Dey, Rahul Gomes and Dipankar Mitra
AI 2025, 6(10), 248; https://doi.org/10.3390/ai6100248 - 1 Oct 2025
Viewed by 433
Abstract
This paper presents an innovative end-to-end framework for conformal antenna array design and beam steering in Low Earth Orbit (LEO) satellite-based IoT communication systems. We propose a multi-stage learning architecture that integrates machine learning (ML) for antenna parameter prediction with reinforcement learning (RL) for adaptive beam steering. The ML module predicts optimal geometric and material parameters for conformal antenna arrays based on mission-specific performance requirements such as frequency, gain, coverage angle, and satellite constraints with an accuracy of 99%. These predictions are then passed to a Deep Q-Network (DQN)-based offline RL model, which learns beamforming strategies to maximize gain toward dynamic ground terminals, without requiring real-time interaction. To enable this, a synthetic dataset grounded in statistical principles and a static dataset is generated using CST Studio Suite and COMSOL Multiphysics simulations, capturing the electromagnetic behavior of various conformal geometries. The results from both the machine learning and reinforcement learning models show that the predicted antenna designs and beam steering angles closely align with simulation benchmarks. Our approach demonstrates the potential of combining data-driven ensemble models with offline reinforcement learning for scalable, efficient, and autonomous antenna synthesis in resource-constrained space environments. Full article
31 pages, 7395 KB  
Article
Creativeable: Leveraging AI for Personalized Creativity Enhancement
by Ariel Kreisberg-Nitzav and Yoed N. Kenett
AI 2025, 6(10), 247; https://doi.org/10.3390/ai6100247 - 1 Oct 2025
Viewed by 923
Abstract
Creativity is central to innovation and problem-solving, yet scalable training solutions remain limited. This study evaluates Creativeable, an AI-powered creativity training program that provides automated feedback and adjusts creative story writing task difficulty without human intervention. A total of 385 participants completed five rounds of creative story writing using semantically distant word prompts across four conditions: (1) feedback with adaptive difficulty (F/VL); (2) feedback with constant difficulty (F/CL); (3) no feedback with adaptive difficulty (NF/VL); (4) no feedback with constant difficulty (NF/CL). Before and after using Creativeable, participants were assessed for their creativity, via the alternative uses task, as well as undergoing a control semantic fluency task. While creativity improvements were evident across conditions, the degree of effectiveness varied. The F/CL condition led to the most notable gains, followed by the NF/CL and NF/VL conditions, while the F/VL condition exhibited comparatively smaller improvements. These findings highlight the potential of AI to democratize creativity training by offering scalable, personalized interventions, while also emphasizing the importance of balancing structured feedback with increasing task complexity to support sustained creative growth. Full article
42 pages, 4717 KB  
Article
Intelligent Advanced Control System for Isotopic Separation: An Adaptive Strategy for Variable Fractional-Order Processes Using AI
by Roxana Motorga, Vlad Mureșan, Mihaela-Ligia Ungureșan, Mihail Abrudean, Honoriu Vǎlean and Valentin Sita
AI 2025, 6(10), 246; https://doi.org/10.3390/ai6100246 - 1 Oct 2025
Viewed by 348
Abstract
This paper provides the modeling, implementation, and simulation of the fractional-order processes associated with the production of the enriched 13C isotope through chemical exchange between carbamate and CO2. To simulate the process effectively, a new approximating solution for fractional-order systems had to be implemented, which has become possible through the use of advanced AI methods. As the separation process exhibits extremely strong nonlinearity and fractional-order behavior, it was likewise necessary to use fractional-order system theory to mathematically model the operation, which consists of comparing its output with that of an integrator function. The parameters of the dynamic structure of the derived fractional-order model are learned by neural networks, which are AI-based domain solutions. Thanks to the approximations performed, the concentration dynamics of the enriched 13C isotope can be simulated and predicted with a high level of precision. The effectiveness of the solutions is corroborated by comparing the model’s response with the response of the actual process. The current implementation uses neural networks trained specifically for this purpose. Furthermore, since isotopic separation processes have long settling times, this paper proposes control strategies developed for the 13C isotopic separation process in order to improve system performance and avoid the loss of enriched product. The adaptive controllers were tuned by requiring them to follow the output of a first-order-type transfer function, using a PI or a PID controller. Finally, the paper confirms that AI solutions can successfully support the system throughout a range of responses, which paves the way for an efficient design of the automatic control of the 13C isotope concentration. Such systems can similarly be implemented in other industrial processes. Full article
21 pages, 527 KB  
Article
Block-CITE: A Blockchain-Based Crowdsourcing Interactive Trust Evaluation
by Jiaxing Li, Lin Jiang, Haoxian Liang, Tao Peng, Shaowei Wang and Huanchun Wei
AI 2025, 6(10), 245; https://doi.org/10.3390/ai6100245 - 1 Oct 2025
Viewed by 297
Abstract
Industrial trademark examination enables users to apply for and manage their trademarks efficiently, promoting industrial and commercial economic development. However, there still exist many challenges, e.g., how to customize a blockchain-based crowdsourcing method for interactive trust evaluation, how to decentralize the functionalities of a centralized entity to nodes in a blockchain network instead of removing the entity directly, how to design a protocol for the method and prove its security, etc. In order to overcome these challenges, in this paper, we propose the Blockchain-based Crowdsourcing Interactive Trust Evaluation (Block-CITE for short) method to improve the efficiency and security of the current industrial trademark management schemes. Specifically, Block-CITE adopts a dual-blockchain structure and a crowdsourcing technique to record operations and store relevant data in a decentralized way. Furthermore, Block-CITE customizes a protocol for blockchain-based crowdsourced industrial trademark examination and algorithms of smart contracts to run the protocol automatically. In addition, Block-CITE analyzes the threat model and proves the security of the protocol. Security analysis shows that Block-CITE is able to defend against the malicious entities and attacks in the blockchain network. Experimental analysis shows that Block-CITE has a higher transaction throughput and lower network latency and storage overhead than the baseline methods. Full article
19 pages, 819 KB  
Article
Efficient CNN Accelerator Based on Low-End FPGA with Optimized Depthwise Separable Convolutions and Squeeze-and-Excite Modules
by Jiahe Shen, Xiyuan Cheng, Xinyu Yang, Lei Zhang, Wenbin Cheng and Yiting Lin
AI 2025, 6(10), 244; https://doi.org/10.3390/ai6100244 - 1 Oct 2025
Viewed by 407
Abstract
With the rapid development of artificial intelligence technology in the field of intelligent manufacturing, convolutional neural networks (CNNs) have shown excellent performance and generalization capabilities in industrial applications. However, the huge computational and resource requirements of CNNs have brought great obstacles to their deployment on low-end hardware platforms. To address this issue, this paper proposes a scalable CNN accelerator that can operate on low-performance Field-Programmable Gate Arrays (FPGAs), which is aimed at tackling the challenge of efficiently running complex neural network models on resource-constrained hardware platforms. This study specifically optimizes depthwise separable convolution and the squeeze-and-excite module to improve their computational efficiency. The proposed accelerator allows for the flexible adjustment of hardware resource consumption and computational speed through configurable parameters, making it adaptable to FPGAs with varying performance and different application requirements. By fully exploiting the characteristics of depthwise separable convolution, the accelerator optimizes the convolution computation process, enabling flexible and independent module stackings at different stages of computation. This results in an optimized balance between hardware resource consumption and computation time. Compared to ARM CPUs, the proposed approach yields at least a 1.47× performance improvement, and compared to other FPGA solutions, it saves over 90% of Digital Signal Processors (DSPs). Additionally, the optimized computational flow significantly reduces the accelerator’s reliance on internal caches, minimizing data latency and further improving overall processing efficiency. Full article
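The efficiency argument for depthwise separable convolution is easy to verify in a few lines: splitting a standard convolution into a depthwise and a pointwise stage shrinks the parameter count by roughly an order of magnitude at equal output shape:

```python
import torch
import torch.nn as nn

def count(m):
    return sum(p.numel() for p in m.parameters())

cin, cout, k = 64, 128, 3
standard = nn.Conv2d(cin, cout, k, padding=1)
separable = nn.Sequential(
    nn.Conv2d(cin, cin, k, padding=1, groups=cin),  # depthwise: one filter per channel
    nn.Conv2d(cin, cout, 1),                        # pointwise: 1x1 channel mixing
)

x = torch.randn(1, cin, 56, 56)
assert standard(x).shape == separable(x).shape
print(count(standard), count(separable))            # 73,856 vs 8,960 parameters
```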
34 pages, 11521 KB  
Article
Explainable AI-Driven 1D-CNN with Efficient Wireless Communication System Integration for Multimodal Diabetes Prediction
by Radwa Ahmed Osman
AI 2025, 6(10), 243; https://doi.org/10.3390/ai6100243 - 25 Sep 2025
Viewed by 606
Abstract
The early detection of diabetes risk and effective management of patient data are critical for avoiding serious consequences and improving treatment success. This research describes a two-part architecture that combines an energy-efficient wireless communication technology with an interpretable deep learning model for diabetes categorization. In Phase 1, a unique wireless communication model is created to assure the accurate transfer of real-time patient data from wearable devices to medical centers. Using Lagrange optimization, the model identifies the best transmission distance and power needs, lowering energy usage while preserving communication dependability. This contribution is especially essential since effective data transport is a necessary condition for continuous monitoring in large-scale healthcare systems. In Phase 2, the transmitted multimodal clinical, genetic, and lifestyle data are evaluated using a one-dimensional Convolutional Neural Network (1D-CNN) with Bayesian hyperparameter tuning. The model beat traditional deep learning architectures like LSTM and GRU. To improve interpretability and clinical acceptance, SHAP and LIME were used to find global and patient-specific predictors. This approach tackles technological and medicinal difficulties by integrating energy-efficient wireless communication with interpretable predictive modeling. The system ensures dependable data transfer, strong predictive performance, and transparent decision support, boosting trust in AI-assisted healthcare and enabling individualized diabetes control. Full article
27 pages, 4687 KB  
Article
Comparative Study of Vibration-Based Machine Learning Algorithms for Crack Identification and Location in Operating Wind Turbine Blades
by Adolfo Salgado-Ancona, Perla Yazmín Sevilla-Camacho, José Billerman Robles-Ocampo, Juvenal Rodríguez-Reséndiz, Sergio De la Cruz-Arreola and Edwin Neptalí Hernández-Estrada
AI 2025, 6(10), 242; https://doi.org/10.3390/ai6100242 - 25 Sep 2025
Viewed by 514
Abstract
The growing energy demand has increased the number of wind turbines, raising the need to monitor blade health. Since blades are prone to damage that can cause severe failures, early detection is crucial. Machine learning-based monitoring systems can identify and locate cracks without interrupting energy production, enabling timely maintenance. This study provides a comparative analysis and approach to the application and effectiveness of different vibration-based machine learning algorithms to detect the presence of cracks, identify the cracked blade, and locate the zone where the crack occurs in rotating blades of a small wind turbine. The datasets comprise root vibration signals, derived from healthy and cracked blades of a wind turbine in operational conditions. In this study, the blades are not considered identical. The sampling set dimension and the number of features were variables considered during the development and assessment of different models based on decision tree (DT), support vector machine (SVM), k-nearest neighbors (KNN), and multilayer perceptron algorithms (MLP). Overall, the KNN models are the clear winners in terms of training efficiency, even as the sample size increases. DT is the most efficient algorithm in terms of test speed, followed by SVM, MLP, and KNN. Full article
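The compared algorithms map directly onto scikit-learn estimators; a small sketch benchmarking them on synthetic stand-ins for the blade-root vibration features:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for features extracted from blade-root vibration signals.
X, y = make_classification(n_samples=1000, n_features=20, n_classes=4,
                           n_informative=8, random_state=0)

models = {
    "DT":  DecisionTreeClassifier(random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "MLP": make_pipeline(StandardScaler(),
                         MLPClassifier(max_iter=1000, random_state=0)),
}
for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=5).mean().round(3))
```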
21 pages, 10100 KB  
Article
Real-Time Identification of Mixed and Partly Covered Foreign Currency Using YOLOv11 Object Detection
by Nanda Fanzury and Mintae Hwang
AI 2025, 6(10), 241; https://doi.org/10.3390/ai6100241 - 24 Sep 2025
Viewed by 558
Abstract
Background: This study presents a real-time mobile system for identifying mixed and partly covered foreign coins and banknotes using the You Only Look Once version 11 (YOLOv11) deep learning framework. The proposed system addresses practical challenges faced by travelers and visually impaired individuals when handling multiple currencies. Methods: The system introduces three novel aspects: (i) simultaneous recognition of both coins and banknotes from multiple currencies within a single image, even when items are overlapping or occluded; (ii) a hybrid inference strategy that integrates an embedded TensorFlow Lite (TFLite) model for on-device detection with an optional server-assisted mode for higher accuracy; and (iii) an integrated currency conversion module that provides real-time value translation based on current exchange rates. A purpose-built dataset containing 46 denomination classes across four major currencies, US Dollar (USD), Euro (EUR), Chinese Yuan (CNY), and Korean Won (KRW), was used for training, including challenging cases of overlap, folding, and partial coverage. Results: Experimental evaluation demonstrated robust performance under diverse real-world conditions. The system achieved high detection accuracy and low latency, confirming its suitability for practical deployment on consumer-grade smartphones. Conclusions: These findings confirm that the proposed approach achieves an effective balance between portability, robustness, and detection accuracy, making it a viable solution for real-time mixed-currency recognition in everyday scenarios. Full article
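For context, single-image inference with the Ultralytics YOLO API looks roughly like the sketch below; the weights file, image name, and class label are hypothetical placeholders for a model fine-tuned on the 46 denomination classes:

```python
from ultralytics import YOLO

# "currency_yolo11.pt" is a hypothetical weights file fine-tuned on the
# denomination classes described above, not a published checkpoint.
model = YOLO("currency_yolo11.pt")

results = model("mixed_coins_and_notes.jpg", conf=0.5)  # single-image inference
for box in results[0].boxes:
    cls_name = results[0].names[int(box.cls)]
    print(cls_name, float(box.conf))  # e.g., "EUR_2_coin 0.91" (hypothetical label)
```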
(This article belongs to the Section AI Systems: Theory and Applications)
30 pages, 2461 KB  
Article
RAGMed: A RAG-Based Medical AI Assistant for Improving Healthcare Delivery
by Rajvardhan Patil, Manideep Abbidi and Sherri Fannon
AI 2025, 6(10), 240; https://doi.org/10.3390/ai6100240 - 24 Sep 2025
Viewed by 1070
Abstract
Electronic Health Records (EHRs) have enhanced access to medical information but have also introduced challenges for healthcare providers, such as increased documentation workload and reduced face-to-face interaction with patients. To mitigate these issues, we propose RAGMed, a Retrieval-Augmented Generation (RAG)-based AI assistant designed to deliver automated and clinically grounded responses to frequently asked patient questions. The system combines a vector database for semantic retrieval with the generative capabilities of a large language model to provide accurate, reliable answers without requiring direct physician involvement. In addition to patient-facing support, the assistant facilitates appointment scheduling and assists clinicians by summarizing clinical notes, thereby streamlining healthcare workflows. To evaluate the influence of retrieval quality on overall system performance, we compare two embedding models, gte-large and all-MiniLM-L6-v2, using real-world medical queries. The models are assessed within the RAG-Triad Framework, focusing on context relevance, answer relevance, and factual groundedness. The results indicate that gte-large, owing to its higher-dimensional embeddings, retrieves more informative context, resulting in more accurate and trustworthy responses. These findings underscore not only the potential of RAG-based systems to alleviate physician workload and enhance the efficiency and accessibility of healthcare delivery, but also the importance of the dimensionality of the models used to generate embeddings, as this directly influences the relevance, accuracy, and contextual understanding of retrieved information. This prototype is intended for retrieval-augmented answering of medical FAQs and general informational queries, and is not designed for diagnostic use or treatment recommendations without professional validation. Full article
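The retrieval half of such a RAG pipeline can be sketched with one of the embedding models the paper compares (all-MiniLM-L6-v2); the FAQ passages are invented, and a full system would pass the retrieved context to an LLM for answer generation:

```python
from sentence_transformers import SentenceTransformer, util

# Toy FAQ store; in the paper's system the retrieved context would then be
# passed to a large language model to generate the grounded answer.
docs = [
    "Fasting is required for 8 hours before a lipid panel blood test.",
    "Ibuprofen should be taken with food to reduce stomach irritation.",
    "Appointments can be rescheduled up to 24 hours in advance.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # one of the two compared models
doc_emb = encoder.encode(docs, convert_to_tensor=True)

query = "Do I need to fast before my blood test?"
q_emb = encoder.encode(query, convert_to_tensor=True)
scores = util.cos_sim(q_emb, doc_emb)[0]
print(docs[int(scores.argmax())])                  # best-matching context passage
```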
(This article belongs to the Section Medical & Healthcare AI)