AI, Volume 6, Issue 9 (September 2025) – 44 articles

Cover Story: Uterine fibroids are one of the leading health concerns for women worldwide, with an economic burden of over USD 42 billion annually. Although recent advances have improved diagnosis and treatment, the current standard of care still faces limitations, particularly the need for personalized approaches. To address this challenge, this study provides an objective analysis of factors influencing procedure success and introduces a scalable, interpretable artificial intelligence (AI) system to support clinical decision-making. Our models predict the probability of treatment success and symptom relief, as well as the individualized likelihood of each fibroid responding to treatment. By offering predictions at both the patient and fibroid levels, this system could enhance referral accuracy and improve treatment planning.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers, which are published in both HTML and PDF forms. To view a paper in PDF format, click the "PDF Full-text" link and open it with the free Adobe Reader.
16 pages, 291 KB  
Article
SVM, BERT, or LLM? A Comparative Study on Multilingual Instructed Deception Detection
by Daichi Azuma, René Meléndez, Michal Ptaszynski, Fumito Masui, Lara Aslan and Juuso Eronen
AI 2025, 6(9), 239; https://doi.org/10.3390/ai6090239 - 22 Sep 2025
Abstract
The automated detection of deceptive language is a crucial challenge in computational linguistics. This study provides a rigorous comparative analysis of three tiers of machine learning models for detecting instructed deception: traditional machine learning (SVM), fine-tuned discriminative models (BERT), and in-context learning with generalist Large Language Models (LLMs). Using the “cross-cultural deception detection” dataset, our findings reveal a clear performance hierarchy. While SVM performance is inconsistent, fine-tuned BERT models achieve substantially superior accuracy. Notably, a multilingual BERT model improves cross-topic accuracy on Spanish text to 90.14%, a gain of over 22 percentage points from its monolingual counterpart (67.20%). In contrast, modern LLMs perform poorly in zero-shot settings and fail to surpass the SVM baseline even with few-shot prompting, underscoring the effectiveness of task-specific fine-tuning. By transparently addressing the limitations of the solicited, low-stakes deception dataset, we establish a robust methodological baseline that clarifies the strengths of different modeling paradigms and informs future research into more complex, real-world deception phenomena. Full article
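The abstract's lowest tier, a lexical classifier over word features, can be sketched without any ML dependencies. This is a minimal illustration only: the toy sentences are invented, and a nearest-centroid rule stands in for the paper's SVM.

```python
from collections import Counter
import math

def tf_vector(text):
    """Bag-of-words term-frequency vector for a lowercased text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    num = sum(a[t] * b[t] for t in a)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def train_centroids(examples):
    """examples: list of (text, label). Returns label -> summed TF centroid."""
    cents = {}
    for text, label in examples:
        cents.setdefault(label, Counter()).update(tf_vector(text))
    return cents

def predict(text, cents):
    """Assign the label whose centroid is most similar to the text."""
    v = tf_vector(text)
    return max(cents, key=lambda lab: cosine(v, cents[lab]))
```

A fine-tuned BERT replaces these fixed lexical features with learned contextual ones, which is where the abstract's reported accuracy gains come from.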
32 pages, 1238 KB  
Article
GRU-BERT for NILM: A Hybrid Deep Learning Architecture for Load Disaggregation
by Annysha Huzzat, Ahmed S. Khwaja, Ali A. Alnoman, Bhagawat Adhikari, Alagan Anpalagan and Isaac Woungang
AI 2025, 6(9), 238; https://doi.org/10.3390/ai6090238 - 22 Sep 2025
Abstract
Non-Intrusive Load Monitoring (NILM) aims to disaggregate a household’s total aggregated power consumption into appliance-level usage, enabling intelligent energy management without the need for intrusive metering. While deep learning has improved NILM significantly, existing NILM models struggle to capture load patterns across both longer time intervals and subtle timings for appliances involving brief or overlapping usage patterns. In this paper, we propose a novel GRU+BERT hybrid architecture, exploring both unidirectional (GRU+BERT) and bidirectional (Bi-GRU+BERT) variants. Our model combines Gated Recurrent Units (GRUs) to capture sequential temporal dependencies with Bidirectional Encoder Representations from Transformers (BERT), which is a transformer-based model that captures rich contextual information across the sequence. The bidirectional variant (Bi-GRU+BERT) processes input sequences in both forward (past to future) and backward (future to past) directions, enabling the model to learn relationships between power consumption values at different time steps more effectively. The unidirectional variant (GRU+BERT) provides an alternative suited for appliances with structured, sequential multi-phase usage patterns, such as dishwashers. By placing the Bi-GRU or GRU layer before BERT, our models first capture local time-based load patterns and then use BERT’s self-attention to understand the broader contextual relationships. This design addresses key limitations of both standalone recurrent and transformer-based models, offering improved performance on transient and irregular appliance loads. Evaluated on the UK-DALE and REDD datasets, the proposed Bi-GRU+BERT and GRU+BERT models show competitive performance compared to several state-of-the-art NILM models while maintaining a comparable model size and training time, demonstrating their practical applicability for real-time energy disaggregation, including potential edge and cloud deployment scenarios. 
Full article
(This article belongs to the Section AI Systems: Theory and Applications)
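The GRU-before-BERT ordering described above can be illustrated with a minimal scalar GRU cell in pure Python. The weights and input sequence below are made up, and the BERT self-attention stage that would consume the hidden states is only indicated in a comment; this is a sketch of the recurrence, not the paper's model.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(h, x, W):
    """One GRU update for a scalar hidden state and scalar input.
    W holds six illustrative scalar weights (update gate z, reset gate r, candidate n)."""
    z = sigmoid(W["wz"] * x + W["uz"] * h)   # update gate: how much to overwrite
    r = sigmoid(W["wr"] * x + W["ur"] * h)   # reset gate: how much history to use
    n = math.tanh(W["wn"] * x + W["un"] * (r * h))  # candidate state
    return (1 - z) * h + z * n

def encode(seq, W):
    """Run the GRU over a power-reading sequence and return all hidden states.
    In the full GRU+BERT model, these states would feed BERT's self-attention."""
    h, states = 0.0, []
    for x in seq:
        h = gru_step(h, x, W)
        states.append(h)
    return states
```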
26 pages, 3391 KB  
Article
Improving Remote Access Trojans Detection: A Comprehensive Approach Using Machine Learning and Hybrid Feature Engineering
by AlsharifHasan Mohamad Aburbeian, Manuel Fernández-Veiga and Ahmad Hasasneh
AI 2025, 6(9), 237; https://doi.org/10.3390/ai6090237 - 21 Sep 2025
Abstract
Remote Access Trojans (RATs) pose a serious cybersecurity risk due to their stealthy control over compromised systems. This study presents a detection framework that integrates host, network, and newly engineered behavioral features to enhance the identification of RATs. Two sets of experiments were performed: (i) using the original dataset only, and (ii) using an extended dataset with ten engineered features and importance analysis. The framework was evaluated on a public Kaggle dataset of RAT and benign traffic. Eight machine learning classifiers were tested, including three baseline methods, four ensemble approaches, and a neural network. Results show that the engineered hybrid feature set substantially improves detection performance. Among the tested algorithms, Random Forest and MLP achieved the strongest performance, with accuracies of 98% and 97%, respectively, while Gradient Boosting and LightGBM also performed competitively. Performance was assessed using multiple metrics, and to gain deeper insight into model learning behavior, learning curves and Precision–Recall curves were analyzed. The results demonstrate that hybrid feature modeling, neural networks, and ensemble machine learning techniques can improve RAT identification. In future work, explainable ML methods could be explored to further improve detection capabilities. Full article
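The kind of behavioral feature engineering the abstract describes can be sketched as simple derivations from raw flow fields. The field names and the three features below are hypothetical illustrations, not the paper's actual engineered feature set.

```python
def engineer_features(flow):
    """Derive hypothetical behavioral features from one network-flow record.
    Input keys (bytes_sent, bytes_recv, packets, duration) are assumed names."""
    sent, recv = flow["bytes_sent"], flow["bytes_recv"]
    total = sent + recv
    feats = dict(flow)
    # Upload/download asymmetry: exfiltration-style flows often send more than they receive.
    feats["byte_ratio"] = sent / recv if recv else float(sent > 0)
    # Packet rate: RAT command channels tend to be slow but persistent.
    feats["pkts_per_sec"] = flow["packets"] / flow["duration"] if flow["duration"] else 0.0
    # Beacon-like, low-volume traffic flag.
    feats["small_flow"] = int(total < 1024)
    return feats
```

Features like these would be appended to the host and network columns before training the ensemble classifiers.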
30 pages, 1643 KB  
Article
Destination (Un)Known: Auditing Bias and Fairness in LLM-Based Travel Recommendations
by Hristo Andreev, Petros Kosmas, Antonios D. Livieratos, Antonis Theocharous and Anastasios Zopiatis
AI 2025, 6(9), 236; https://doi.org/10.3390/ai6090236 - 19 Sep 2025
Abstract
Large language-model chatbots such as ChatGPT and DeepSeek are quickly gaining traction as an easy, first-stop tool for trip planning because they offer instant, conversational advice that once required sifting through multiple websites or guidebooks. Yet little is known about the biases that shape the destination suggestions these systems provide. This study conducts a controlled, persona-based audit of the two models, generating 6480 recommendations for 216 traveller profiles that vary by origin country, age, gender identity and trip theme. Six observable bias families (popularity, geographic, cultural, stereotype, demographic and reinforcement) are quantified using tourism rankings, Hofstede scores, a 150-term cliché lexicon and information-theoretic distance measures. Findings reveal measurable bias in every bias category. DeepSeek is more likely than ChatGPT to suggest off-list cities and recommends domestic travel more often, while both models still favour mainstream destinations. DeepSeek also points users toward culturally more distant destinations on all six Hofstede dimensions and employs a denser, superlative-heavy cliché register; ChatGPT shows wider lexical variety but remains strongly promotional. Demographic analysis uncovers moderate gender gaps and extreme divergence for non-binary personas, tempered by a “protective” tendency to guide non-binary travellers toward countries with higher LGBTQI acceptance. Reinforcement bias is minimal, with over 90 percent of follow-up suggestions being novel in both systems. These results confirm that unconstrained LLMs are not neutral filters but active amplifiers of structural imbalances. 
The paper proposes a public-interest re-ranking layer, hosted by a body such as UN Tourism, that balances exposure fairness, seasonality smoothing, low-carbon routing, cultural congruence, safety safeguards and stereotype penalties, transforming conversational AI from an opaque gatekeeper into a sustainability-oriented travel recommendation tool. Full article
(This article belongs to the Special Issue AI Bias in the Media and Beyond)
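One of the information-theoretic distance measures such an audit can use is the KL divergence between a model's recommendation distribution and a reference distribution. The destination names and the choice of bits (log base 2) below are illustrative assumptions, not the paper's exact instrumentation.

```python
import math

def recommendation_distribution(recs):
    """Normalize a list of recommended destinations into a probability dict."""
    counts = {}
    for d in recs:
        counts[d] = counts.get(d, 0) + 1
    n = len(recs)
    return {d: c / n for d, c in counts.items()}

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in bits over a shared destination vocabulary.
    eps guards against destinations missing from q."""
    return sum(pi * math.log2((pi + eps) / (q.get(d, 0.0) + eps))
               for d, pi in p.items() if pi > 0)
```

A large divergence from, say, a tourism-ranking baseline would quantify the popularity bias the study reports.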
22 pages, 28286 KB  
Article
RA-CottNet: A Real-Time High-Precision Deep Learning Model for Cotton Boll and Flower Recognition
by Rui-Feng Wang, Yi-Ming Qin, Yi-Yi Zhao, Mingrui Xu, Iago Beffart Schardong and Kangning Cui
AI 2025, 6(9), 235; https://doi.org/10.3390/ai6090235 - 18 Sep 2025
Abstract
Cotton is the most important natural fiber crop worldwide, and its automated harvesting is essential for improving production efficiency and economic benefits. However, cotton boll detection faces challenges such as small target size, fine-grained category differences, and complex background interference. This study proposes RA-CottNet, a high-precision object detection model with both directional awareness and attention-guided capabilities, and develops an open-source dataset containing 4966 annotated images. Based on YOLOv11n, RA-CottNet incorporates ODConv and SPDConv to enhance directional and spatial representation, while integrating CoordAttention, an improved GAM, and LSKA to improve feature extraction. Experimental results showed that RA-CottNet achieves 93.683% Precision, 86.040% Recall, 93.496% mAP50, 72.857% mAP95, and 89.692% F1-score, maintaining stable performance under multi-scale and rotation perturbations. The proposed approach demonstrated high accuracy and real-time capability, making it suitable for deployment on agricultural edge devices and providing effective technical support for automated cotton boll harvesting and yield estimation. Full article
17 pages, 2619 KB  
Article
AE-DD: Autoencoder-Driven Dictionary with Matching Pursuit for Joint ECG Denoising, Compression, and Morphology Decomposition
by Fars Samann and Thomas Schanze
AI 2025, 6(9), 234; https://doi.org/10.3390/ai6090234 - 17 Sep 2025
Abstract
Background: Electrocardiogram (ECG) signals are crucial for cardiovascular diagnosis, but their analysis faces challenges from noise contamination, compression difficulties due to their non-stationary nature, and the inherent complexity of their morphological components, particularly the low-amplitude P- and T-waves obscured by noise. Methodology: This study proposes a novel, multi-stage framework for ECG signal denoising, compression, and component decomposition. The proposed framework leverages the sparsity of the ECG signal to denoise and compress these signals using an autoencoder-driven dictionary (AE-DD) with matching pursuit. In this work, a data-driven dictionary was developed using a regularized autoencoder. The trained weights, together with matching pursuit, were used to compress the denoised ECG segments. This study explored different weight regularization techniques: L1- and L2-regularization. Results: The proposed framework achieves remarkable performance in simultaneous ECG denoising, compression, and morphological decomposition. The L1-DAE model delivers superior noise suppression (SNR improvement of up to 18.6 dB at 3 dB input SNR) and near-lossless reconstruction (MSE < 10⁻⁵). The L1-AE dictionary enables high-fidelity compression (CR = 28:1, MSE ≈ 0.58×10⁻⁵, PRD = 2.1%), outperforming non-regularized models and traditional dictionaries (DCT/wavelets), while its trained weights naturally decompose into interpretable sub-dictionaries for the P-wave, QRS complex, and T-wave, enabling precise, label-free analysis of ECG components. Moreover, the learned sub-dictionaries decompose into interpretable P-wave, QRS complex, and T-wave components with high accuracy, yielding strong correlation with the original ECG (r = 0.98, r = 0.99, and r = 0.95, respectively) and very low MSE (1.93×10⁻⁵, 9.26×10⁻⁴, and 3.38×10⁻⁴, respectively). 
Conclusions: This study introduces a novel autoencoder-driven framework that simultaneously performs ECG denoising, compression, and morphological decomposition. By leveraging L1-regularized autoencoders with matching pursuit, the method effectively enhances signal quality while enabling direct decomposition of ECG signals into clinically relevant components without additional processing. This unified approach offers significant potential for improving automated ECG analysis and facilitating efficient long-term cardiac monitoring. Full article
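The matching-pursuit step at the core of the framework can be sketched in a few lines. The tiny hand-built dictionary below stands in for the learned autoencoder atoms, which are assumed (as in standard matching pursuit) to be unit-norm.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def matching_pursuit(signal, atoms, n_iter=3):
    """Greedy sparse coding: at each step pick the unit-norm atom most
    correlated with the residual and subtract its projection.
    Returns the sparse coefficient vector and the final residual."""
    residual = list(signal)
    code = [0.0] * len(atoms)
    for _ in range(n_iter):
        corrs = [dot(residual, a) for a in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(corrs[i]))
        code[k] += corrs[k]
        residual = [r - corrs[k] * ak for r, ak in zip(residual, atoms[k])]
    return code, residual
```

With a dictionary whose atoms separate into P-wave, QRS, and T-wave groups, the nonzero coefficients directly attribute signal energy to those components.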
26 pages, 1061 KB  
Article
EEViT: Efficient Enhanced Vision Transformer Architectures with Information Propagation and Improved Inductive Bias
by Rigel Mahmood, Sarosh Patel and Khaled Elleithy
AI 2025, 6(9), 233; https://doi.org/10.3390/ai6090233 - 17 Sep 2025
Abstract
The Transformer architecture has been the foundational cornerstone of the recent AI revolution, serving as the backbone of Large Language Models, which have demonstrated impressive language understanding and reasoning capabilities. When pretrained on large amounts of data, Transformers have also been shown to be highly effective in image classification via the advent of the Vision Transformer. However, they still lag behind Convolutional Neural Networks (CNNs) in vision application performance, since CNNs offer translational invariance whereas Transformers lack inductive bias. Further, the Transformer relies on the attention mechanism, which, despite increasing the receptive field, makes it computationally inefficient due to its quadratic time complexity. In this paper, we enhance the Transformer architecture, focusing on these two shortcomings. We propose two efficient Vision Transformer architectures that significantly reduce computational complexity without sacrificing classification performance. Our first enhanced architecture, EEViT-PAR, combines features from two recently proposed designs, PerceiverAR and CaiT. This enhancement leads to our second architecture, EEViT-IP, which provides implicit windowing capabilities akin to the SWIN Transformer and implicitly improves the inductive bias, while being extremely memory- and computation-efficient. We perform detailed experiments on multiple image datasets to show the effectiveness of our architectures. Our best performing EEViT outperforms existing SOTA ViT models in execution efficiency and surpasses or provides competitive classification accuracy on different benchmarks. Full article
32 pages, 4887 KB  
Article
Emerging Threat Vectors: How Malicious Actors Exploit LLMs to Undermine Border Security
by Dimitrios Doumanas, Alexandros Karakikes, Andreas Soularidis, Efstathios Mainas and Konstantinos Kotis
AI 2025, 6(9), 232; https://doi.org/10.3390/ai6090232 - 15 Sep 2025
Abstract
The rapid proliferation of Large Language Models (LLMs) has democratized access to advanced generative capabilities while raising urgent concerns about misuse in sensitive security domains. Border security, in particular, represents a high-risk environment where malicious actors may exploit LLMs for document forgery, synthetic identity creation, logistics planning, or disinformation campaigns. Existing studies often highlight such risks in theory, yet few provide systematic empirical evidence of how state-of-the-art LLMs can be exploited. This paper introduces the Silent Adversary Framework (SAF), a structured pipeline that models the sequential stages by which obfuscated prompts can covertly bypass safeguards. We evaluate ten high-risk scenarios using five leading models—GPT-4o, Claude 3.7 Sonnet, Gemini 2.5 Flash, Grok 3, and Runway Gen-2—and assess outputs through three standardized metrics: Bypass Success Rate (BSR), Output Realism Score (ORS), and Operational Risk Level (ORL). Results reveal that, while all models exhibited some susceptibility, vulnerabilities were heterogeneous. Claude showed greater resistance in chemistry-related prompts, whereas GPT-4o and Gemini generated highly realistic outputs in identity fraud and logistics optimization tasks. Document forgery attempts produced only partially successful templates that lacked critical security features. These findings highlight the uneven distribution of risks across models and domains. By combining a reproducible adversarial framework with empirical testing, this study advances the evidence base on LLM misuse and provides actionable insights for policymakers and border security agencies, underscoring the need for stronger safeguards and oversight in the deployment of generative AI. Full article
21 pages, 2917 KB  
Article
Intelligent Decision-Making Analytics Model Based on MAML and Actor–Critic Algorithms
by Xintong Zhang, Beibei Zhang, Haoru Li, Helin Wang and Yunqiao Huang
AI 2025, 6(9), 231; https://doi.org/10.3390/ai6090231 - 14 Sep 2025
Abstract
Traditional Reinforcement Learning (RL) struggles in dynamic decision-making due to data dependence, limited generalization, and imbalanced subjective/objective factors. This paper proposes an intelligent model combining the Model-Agnostic Meta-Learning (MAML) framework with the Actor–Critic algorithm to address these limitations. The model integrates the AHP-CRITIC weighting method to quantify strategic weights from both subjective expert experience and objective data, achieving balanced decision rationality. The MAML mechanism enables rapid generalization with minimal samples in dynamic environments via cross-task parameter optimization, drastically reducing retraining costs upon environmental changes. Evaluated on enterprise indicator anomaly decision-making, the model achieves significantly higher task reward values than traditional Actor–Critic, PG, and DQN using only 10–20 samples. It improves time efficiency by up to 97.23%. A proposed Balanced Performance Index confirms superior stability and adaptability. Currently integrated into an enterprise platform, the model provides efficient support for dynamic, complex scenarios. This research offers an innovative solution for intelligent decision-making under data scarcity and subjective-objective conflicts, demonstrating both theoretical value and practical potential. Full article
(This article belongs to the Section AI Systems: Theory and Applications)
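The objective half of the AHP-CRITIC weighting the abstract mentions can be sketched directly from the standard CRITIC definition (contrast times conflict). The decision matrix in the test is invented, and the subjective AHP side is omitted; this is a sketch of the weighting idea, not the paper's implementation.

```python
import math

def _mean(xs):
    return sum(xs) / len(xs)

def _std(xs):
    m = _mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def _pearson(xs, ys):
    mx, my = _mean(xs), _mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = _std(xs) * _std(ys) * len(xs)
    return num / den if den else 0.0

def critic_weights(matrix):
    """Objective criterion weights from a decision matrix (rows = alternatives,
    columns = criteria): information_j = std_j * sum_k (1 - corr(j, k))."""
    cols = list(zip(*matrix))
    info = []
    for j, cj in enumerate(cols):
        conflict = sum(1 - _pearson(cj, ck) for k, ck in enumerate(cols) if k != j)
        info.append(_std(cj) * conflict)
    total = sum(info)
    return [c / total for c in info]
```

In the hybrid scheme, these objective weights would be blended with AHP weights elicited from expert judgments.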
20 pages, 7210 KB  
Article
Toward Reliable Models for Distinguishing Epileptic High-Frequency Oscillations (HFOs) from Non-HFO Events Using LSTM and Pre-Trained OWL-ViT Vision–Language Framework
by Sahbi Chaibi and Abdennaceur Kachouri
AI 2025, 6(9), 230; https://doi.org/10.3390/ai6090230 - 14 Sep 2025
Abstract
Background: Over the past two decades, high-frequency oscillations (HFOs) between 80 and 500 Hz have emerged as valuable biomarkers for delineating and tracking epileptogenic brain networks. However, inspecting HFO events in lengthy EEG recordings remains a time-consuming visual process and mainly relies on experienced clinicians. Extensive recent research has emphasized the value of introducing deep learning (DL) and generative AI (GenAI) methods to automatically identify epileptic HFOs in iEEG signals. Owing to the ongoing issue of the noticeable incidence of spurious or false HFOs, a key question remains: which model is better able to distinguish epileptic HFOs from non-HFO events, such as artifacts and background noise? Methods: In this regard, our study addresses two main objectives: (i) proposing a novel HFO classification approach using a prompt engineering framework with OWL-ViT, a state-of-the-art large vision–language model designed for multimodal image understanding guided by optimized natural language prompts; and (ii) comparing a range of existing deep learning and generative models, including our proposed one. Main results: Notably, our quantitative and qualitative analysis demonstrated that the LSTM model achieved the highest classification accuracy of 99.16% among the time-series methods considered, while our proposed method consistently performed best among the different approaches based on time–frequency representation, achieving an accuracy of 99.07%. Conclusions and significance: The present study highlights the effectiveness of LSTM and prompted OWL-ViT models in distinguishing genuine HFOs from spurious non-HFO oscillations with respect to the gold-standard benchmark. These advancements constitute a promising step toward more reliable and efficient diagnostic tools for epilepsy. Full article
(This article belongs to the Section Medical & Healthcare AI)
15 pages, 4635 KB  
Article
GLNet-YOLO: Multimodal Feature Fusion for Pedestrian Detection
by Yi Zhang, Qing Zhao, Xurui Xie, Yang Shen, Jinhe Ran, Shu Gui, Haiyan Zhang, Xiuhe Li and Zhen Zhang
AI 2025, 6(9), 229; https://doi.org/10.3390/ai6090229 - 12 Sep 2025
Abstract
In the field of modern computer vision, pedestrian detection holds significant importance in applications such as intelligent surveillance, autonomous driving, and robot navigation. However, single-modal images struggle to achieve high-precision detection in complex environments. To address this limitation, this study proposes a GLNet-YOLO framework based on cross-modal deep feature fusion, aiming to improve pedestrian detection performance in complex environments by fusing feature information from visible light and infrared images. By extending the YOLOv11 architecture, the framework adopts a dual-branch network structure to process visible light and infrared modal inputs, respectively, and introduces the FM module to realize global feature fusion and enhancement, as well as the DMR module to accomplish local feature separation and interaction. Experimental results show that on the LLVIP dataset, compared to the single-modal YOLOv11 baseline, our fused model improves the mAP@50 by 9.2% over the visible-light-only model and 0.7% over the infrared-only model. This significantly improves detection accuracy under low-light and complex background conditions and enhances the robustness of the algorithm; its effectiveness is further verified on the KAIST dataset. Full article
33 pages, 5048 KB  
Article
Beyond DOM: Unlocking Web Page Structure from Source Code with Neural Networks
by Irfan Prazina, Damir Pozderac and Vensada Okanović
AI 2025, 6(9), 228; https://doi.org/10.3390/ai6090228 - 12 Sep 2025
Abstract
We introduce a code-only approach for modeling web page layouts directly from their source code (HTML and CSS only), bypassing rendering. Our method employs a neural architecture with specialized encoders for style rules, CSS selectors, and HTML attributes. These encodings are then aggregated in another neural network that integrates hierarchical context (sibling and ancestor information) to form rich representational vectors for each element of a web page. Using these vectors, our model predicts eight spatial relationships between pairs of elements, focusing on edge-based proximity in a multilabel classification setup. For scalable training, labels are automatically derived from the Document Object Model (DOM) data for each web page, but the model operates independently of the DOM during inference: it does not use bounding boxes or any other DOM-derived information, relying solely on the source code as input. This approach facilitates structure-aware visual analysis in a lightweight and fully code-based way. Our model demonstrates alignment with human judgment in the evaluation of web page similarity, suggesting that code-only layout modeling offers a promising direction for scalable, interpretable, and efficient web interface analysis. The evaluation metrics show that our method achieves similar performance despite relying on less information. Full article
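The automatic derivation of spatial-relationship labels from rendered geometry can be sketched over plain (x, y, w, h) boxes. The eight relation names and the pixel tolerance below are illustrative assumptions, not the paper's exact label set.

```python
def spatial_relations(a, b, tol=5):
    """Multilabel edge-proximity relations between two boxes (x, y, w, h),
    as could be auto-derived from DOM geometry to create training labels.
    Relation names and the tolerance are illustrative, not the paper's."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]  # right and bottom edges of a
    bx2, by2 = b[0] + b[2], b[1] + b[3]  # right and bottom edges of b
    return {
        "left_of":  ax2 <= b[0],
        "right_of": bx2 <= a[0],
        "above":    ay2 <= b[1],
        "below":    by2 <= a[1],
        "h_aligned_top":  abs(a[1] - b[1]) <= tol,
        "v_aligned_left": abs(a[0] - b[0]) <= tol,
        "h_adjacent": abs(b[0] - ax2) <= tol or abs(a[0] - bx2) <= tol,
        "v_adjacent": abs(b[1] - ay2) <= tol or abs(a[1] - by2) <= tol,
    }
```

At training time these booleans become the multilabel targets; at inference the model must predict them from source code alone.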
19 pages, 325 KB  
Review
Artificial Intelligence in Medical Education: A Narrative Review on Implementation, Evaluation, and Methodological Challenges
by Annalisa Roveta, Luigi Mario Castello, Costanza Massarino, Alessia Francese, Francesca Ugo and Antonio Maconi
AI 2025, 6(9), 227; https://doi.org/10.3390/ai6090227 - 11 Sep 2025
Abstract
Artificial Intelligence (AI) is rapidly transforming medical education by enabling adaptive tutoring, interactive simulation, diagnostic enhancement, and competency-based assessment. This narrative review explores how AI has influenced learning processes in undergraduate and postgraduate medical training, focusing on methodological rigor, educational impact, and implementation challenges. The literature reveals promising results: large language models can generate didactic content and foster academic writing; AI-driven simulations enhance decision-making, procedural skills, and interprofessional communication; and deep learning systems improve diagnostic accuracy in visually intensive tasks such as radiology and histology. Despite promising findings, the existing literature is methodologically heterogeneous. A minority of studies use controlled designs, while the majority focus on short-term effects or are confined to small, simulated cohorts. Critical limitations include algorithmic opacity, generalizability concerns, ethical risks (e.g., GDPR compliance, data bias), and infrastructural barriers, especially in low-resource contexts. Additionally, the unregulated use of AI may undermine critical thinking, foster cognitive outsourcing, and compromise pedagogical depth if not properly supervised. In conclusion, AI holds substantial potential to enhance medical education, but its integration requires methodological robustness, human oversight, and ethical safeguards. Future research should prioritize multicenter validation, longitudinal evaluation, and AI literacy for learners and educators to ensure responsible and sustainable adoption. Full article
(This article belongs to the Special Issue Exploring the Use of Artificial Intelligence in Education)
29 pages, 651 KB  
Systematic Review
Retrieval-Augmented Generation (RAG) in Healthcare: A Comprehensive Review
by Fnu Neha, Deepshikha Bhati and Deepak Kumar Shukla
AI 2025, 6(9), 226; https://doi.org/10.3390/ai6090226 - 11 Sep 2025
Viewed by 1773
Abstract
Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by integrating external knowledge retrieval to improve factual consistency and reduce hallucinations. Despite growing interest, its use in healthcare remains fragmented. This paper presents a Systematic Literature Review (SLR) following PRISMA guidelines, synthesizing 30 peer-reviewed studies on RAG in clinical domains, focusing on three of its most prevalent and promising applications: diagnostic support, electronic health record (EHR) summarization, and medical question answering. We synthesize the existing architectural variants (naïve, advanced, and modular) and examine their deployment across these applications. Persistent challenges are identified, including retrieval noise (irrelevant or low-quality retrieved information), domain shift (performance degradation when models are applied to data distributions different from their training set), generation latency, and limited explainability. Evaluation strategies are compared using both standard metrics and clinically specific metrics (FactScore, RadGraph-F1, and MED-F1), which are particularly critical for ensuring factual accuracy, medical validity, and clinical relevance. This synthesis offers a domain-focused perspective to guide researchers, healthcare providers, and policymakers in developing reliable, interpretable, and clinically aligned AI systems, laying the groundwork for future innovation in RAG-based healthcare solutions. Full article
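The retrieve-then-generate pattern this review surveys can be sketched in a few lines. The keyword-overlap retriever and stubbed generator below are illustrative stand-ins (all names are invented for this sketch), not components of any system covered by the review; a real pipeline would use dense retrieval over a medical corpus and an LLM for generation.

```python
def retrieve(query, corpus, k=1):
    """Toy retriever: rank documents by word overlap with the query."""
    q = set(query.lower().split())
    ranked = sorted(corpus,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def generate(query, context):
    """Stub generator: a real RAG system would prompt an LLM with the
    retrieved context to ground its answer."""
    return f"Answer to '{query}', grounded in: {context[0]}"

corpus = [
    "Metformin is a first-line treatment for type 2 diabetes.",
    "Aspirin reduces the risk of cardiovascular events.",
]
docs = retrieve("first-line diabetes treatment", corpus)
print(generate("first-line diabetes treatment", docs))
```

The "retrieval noise" challenge noted above corresponds to the ranked list containing irrelevant documents; the clinical metrics compared in the review score the generated answer's factual grounding in the retrieved context.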
45 pages, 2364 KB  
Systematic Review
Advances and Optimization Trends in Photovoltaic Systems: A Systematic Review
by Luis Angel Iturralde Carrera, Gendry Alfonso-Francia, Carlos D. Constantino-Robles, Juan Terven, Edgar A. Chávez-Urbiola and Juvenal Rodríguez-Reséndiz
AI 2025, 6(9), 225; https://doi.org/10.3390/ai6090225 - 10 Sep 2025
Viewed by 534
Abstract
This article presents a systematic review of optimization methods applied to enhance the performance of photovoltaic (PV) systems, with a focus on critical challenges such as system design and spatial layout, maximum power point tracking (MPPT), energy forecasting, fault diagnosis, and energy management. The emphasis is on the integration of classical and algorithmic approaches. Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology, 314 relevant publications from 2020 to 2025 were analyzed to identify current trends, methodological advances, and practical applications in the optimization of PV performance. The principal novelty of this review lies in its integrative critical analysis, which systematically contrasts the applicability, performance, and limitations of deterministic classical methods with emerging stochastic metaheuristic and data-driven artificial intelligence (AI) techniques, highlighting the growing dominance of hybrid models that synergize their strengths. Traditional techniques such as analytical modeling, numerical simulation, linear and dynamic programming, and gradient-based methods are examined in terms of their efficiency and scope. In parallel, the study evaluates the growing adoption of metaheuristic algorithms, including particle swarm optimization, genetic algorithms, and ant colony optimization, as well as machine learning (ML) and deep learning (DL) models applied to tasks such as MPPT, spatial layout optimization, energy forecasting, and fault diagnosis. A key contribution of this review is the identification of hybrid methodologies that combine metaheuristics with ML/DL models, demonstrating superior results in energy yield, robustness, and adaptability under dynamic conditions. The analysis highlights both the strengths and limitations of each paradigm, emphasizing challenges related to data availability, computational cost, and model interpretability. 
Finally, the study proposes future research directions focused on explainable AI, real-time control via edge computing, and the development of standardized benchmarks for performance evaluation. The findings contribute to a deeper understanding of current capabilities and opportunities in PV system optimization, offering a strategic framework for advancing intelligent and sustainable solar energy technologies. Full article
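Among the classical techniques the review contrasts with metaheuristic and AI-based methods, perturb-and-observe (P&O) MPPT is a standard baseline. The sketch below is a minimal illustration of the hill-climbing idea only: the quadratic P-V curve and its peak at 30 V are invented toy values, not data from the review.

```python
def panel_power(v):
    """Toy P-V curve with its maximum power at v = 30 (illustrative only)."""
    return max(0.0, 200.0 - (v - 30.0) ** 2)

def perturb_and_observe(v=20.0, step=0.5, iters=100):
    """Classic P&O hill-climb: perturb the operating voltage and keep the
    direction while power rises; reverse when power falls."""
    p = panel_power(v)
    direction = 1.0
    for _ in range(iters):
        v_new = v + direction * step
        p_new = panel_power(v_new)
        if p_new < p:          # power dropped: reverse the perturbation
            direction = -direction
        v, p = v_new, p_new
    return v

print(perturb_and_observe())  # settles near the maximum-power voltage
```

The fixed step size causes the steady-state oscillation around the maximum power point that adaptive and metaheuristic MPPT methods discussed in the review aim to eliminate.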
24 pages, 5198 KB  
Article
A Markerless Vision-Based Physical Frailty Assessment System for the Older Adults
by Muhammad Huzaifa, Wajiha Ali, Khawaja Fahad Iqbal, Ishtiaq Ahmad, Yasar Ayaz, Hira Taimur, Yoshihisa Shirayama and Motoyuki Yuasa
AI 2025, 6(9), 224; https://doi.org/10.3390/ai6090224 - 10 Sep 2025
Viewed by 1232
Abstract
The geriatric syndrome known as frailty is characterized by diminished physiological reserves and heightened susceptibility to unfavorable health consequences. As the world’s population ages, it is crucial to detect frailty early and accurately in order to reduce hazards, including falls, hospitalization, and death. In particular, functional tests are frequently used to evaluate physical frailty. However, current evaluation techniques are limited in their scalability and are prone to inconsistency due to their heavy reliance on subjective interpretation and manual observation. In this paper, we provide a completely automated, impartial, and comprehensive frailty assessment system that employs computer vision techniques for assessing physical frailty tests. Machine learning models have been specifically designed to analyze each clinical test. In order to extract significant features, our system analyzes the depth and joint coordinate data for important physical performance tests such as the Walking Speed Test, Timed Up and Go (TUG) Test, Functional Reach Test, Seated Forward Bend Test, Standing on One Leg Test, and Grip Strength Test. The proposed system offers a comprehensive system with consistent measurements, intelligent decision-making, and real-time feedback, in contrast to current systems, which lack real-time analysis and standardization. Strong model accuracy and conformity to clinical benchmarks are demonstrated by the experimental outcomes. The proposed system can be considered a scalable and useful tool for frailty screening in clinical and distant care settings by eliminating observer dependency and improving accessibility. Full article
(This article belongs to the Special Issue Multimodal Artificial Intelligence in Healthcare)
15 pages, 889 KB  
Article
Transformer Models Enhance Explainable Risk Categorization of Incidents Compared to TF-IDF Baselines
by Carlos Ramon Hölzing, Patrick Meybohm, Oliver Happel, Peter Kranke and Charlotte Meynhardt
AI 2025, 6(9), 223; https://doi.org/10.3390/ai6090223 - 9 Sep 2025
Viewed by 795
Abstract
Background: Critical Incident Reporting Systems (CIRS) play a key role in improving patient safety but face limitations due to the unstructured nature of narrative data. Systematic analysis of such data to identify latent risk patterns remains challenging. While artificial intelligence (AI) shows promise in healthcare, its application to CIRS analysis is still underexplored. Methods: This study presents a transformer-based approach to classify incident reports into predefined risk categories and support clinical risk managers in identifying safety hazards. We compared a traditional TF-IDF/logistic regression model with a transformer-based German BERT (GBERT) model using 617 anonymized CIRS reports. Reports were categorized manually into four classes: Organization, Treatment, Documentation, and Consent/Communication. Models were evaluated using stratified 5-fold cross-validation. Interpretability was ensured via Shapley Additive Explanations (SHAP). Results: GBERT outperformed the baseline across all metrics, achieving a macro-averaged F1 of 0.44 and a weighted F1 of 0.75, versus 0.35 and 0.71 for the baseline. SHAP analysis revealed clinically plausible feature attributions. Conclusions: In summary, transformer-based models such as GBERT improve classification of incident report data and enable interpretable, systematic risk stratification. These findings highlight the potential of explainable AI to enhance learning from critical incidents. Full article
(This article belongs to the Special Issue Adversarial Learning and Its Applications in Healthcare)
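The TF-IDF representation used as the baseline in this comparison weights a term by its in-document frequency, discounted by how many documents contain it. A pure-Python sketch (illustrative only; the study's pipeline pairs TF-IDF with logistic regression, and the example reports are invented):

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: tf-idf} map per tokenized document.
    tf = term count / document length; idf = log(N / document frequency)."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append({t: (tf[t] / len(d)) * math.log(n / df[t]) for t in tf})
    return vectors

reports = [
    "patient chart missing signature".split(),
    "wrong dose documented in chart".split(),
]
vecs = tfidf(reports)
# Terms occurring in every report (here "chart") get idf = log(1) = 0,
# so only discriminative terms carry weight into the classifier.
```

Because such weights are tied to individual surface tokens, SHAP attributions for the TF-IDF baseline are straightforward but miss the contextual cues a transformer like GBERT can exploit.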
25 pages, 4660 KB  
Article
Dual-Stream Former: A Dual-Branch Transformer Architecture for Visual Speech Recognition
by Sanghun Jeon, Jieun Lee and Yong-Ju Lee
AI 2025, 6(9), 222; https://doi.org/10.3390/ai6090222 - 9 Sep 2025
Viewed by 877
Abstract
This study proposes Dual-Stream Former, a novel architecture that integrates a Video Swin Transformer and Conformer designed to address the challenges of visual speech recognition (VSR). The model captures spatiotemporal dependencies, achieving a state-of-the-art character error rate (CER) of 3.46%, surpassing traditional convolutional neural network (CNN)-based models, such as 3D-CNN + DenseNet-121 (CER: 5.31%), and transformer-based alternatives, such as vision transformers (CER: 4.05%). The Video Swin Transformer captures multiscale spatial representations with high computational efficiency, whereas the Conformer back-end enhances temporal modeling across diverse phoneme categories. Evaluation of a high-resolution dataset comprising 740,000 utterances across 185 classes highlighted the effectiveness of the model in addressing visually confusing phonemes, such as diphthongs (/ai/, /au/) and labio-dental sounds (/f/, /v/). Dual-Stream Former achieved phoneme recognition error rates of 10.39% for diphthongs and 9.25% for labiodental sounds, surpassing those of CNN-based architectures by more than 6%. Although the model’s large parameter count (168.6 M) poses resource challenges, its hierarchical design ensures scalability. Future work will explore lightweight adaptations and multimodal extensions to increase deployment feasibility. These findings underscore the transformative potential of Dual-Stream Former for advancing VSR applications such as silent communication and assistive technologies by achieving unparalleled precision and robustness in diverse settings. Full article
18 pages, 495 KB  
Article
Optimizing NFL Draft Selections with Machine Learning Classification
by Akshaj Enaganti and George Pappas
AI 2025, 6(9), 221; https://doi.org/10.3390/ai6090221 - 9 Sep 2025
Viewed by 659
Abstract
The National Football League draft is one of the most important events in the creation of a successful franchise in professional American football. Selecting players as part of the draft process, however, is difficult, as a multitude of factors affect decisions to opt for one player over another; a few of these include collegiate statistics, team need and fit, and physical potential. In this paper, we utilize a machine learning approach, with various types of models, to optimize the NFL draft and, in turn, enhance team performances. We compare the selections made by the system to the real athletes selected, and assess which of the picks would have been more impactful for the respective franchise. The specific investigation allows for further research by altering the weighting of specific factors and their significance in this decision-making process to land on the ideal player based on what a specific team desires. Using artificial intelligence in this process can produce more consistent results than high-risk traditional methods. Our approach extends beyond a basic Random Forest classifier by simulating complete draft scenarios with player attributes and team needs weighted. This allows comparison of different draft strategies (best-player-available vs. need-based) and demonstrates improved prediction accuracy over conventional methods. Full article
17 pages, 4523 KB  
Article
Self-Emotion-Mediated Exploration in Artificial Intelligence Mirrors: Findings from Cognitive Psychology
by Gustavo Assuncao, Miguel Castelo-Branco and Paulo Menezes
AI 2025, 6(9), 220; https://doi.org/10.3390/ai6090220 - 9 Sep 2025
Viewed by 506
Abstract
Background: Exploration of the physical environment is an indispensable precursor to information acquisition and knowledge consolidation for living organisms. Yet, current artificial intelligence models lack these autonomy capabilities during training, hindering their adaptability. This work proposes a learning framework for artificial agents to obtain an intrinsic exploratory drive, based on epistemic and achievement emotions triggered during data observation. Methods: This study proposes a dual-module reinforcement framework, where data analysis scores dictate pride or surprise, in accordance with psychological studies on humans. A correlation between these states and exploration is then optimized for agents to meet their learning goals. Results: Causal relationships between states and exploration are demonstrated by the majority of agents. A 15.4% mean increase is noted for surprise, with a 2.8% mean decrease for pride. Resulting correlations of ρ_surprise = 0.461 and ρ_pride = 0.237 are obtained, mirroring previously reported human behavior. Conclusions: These findings lead to the conclusion that bio-inspiration for AI development can be of great use. This can incur benefits typically found in living beings, such as autonomy. Further, it empirically shows how AI methodologies can corroborate human behavioral findings, showcasing major interdisciplinary importance. Ramifications are discussed. Full article
20 pages, 2020 KB  
Article
MST-DGCN: Multi-Scale Temporal–Dynamic Graph Convolutional with Orthogonal Gate for Imbalanced Multi-Label ECG Arrhythmia Classification
by Jie Chen, Mingfeng Jiang, Xiaoyu He, Yang Li, Jucheng Zhang, Juan Li, Yongquan Wu and Wei Ke
AI 2025, 6(9), 219; https://doi.org/10.3390/ai6090219 - 8 Sep 2025
Viewed by 530
Abstract
Multi-label arrhythmia classification from 12-lead ECG signals is a challenging problem, involving spatiotemporal feature extraction, feature fusion, and class imbalance. To address these issues, a multi-scale temporal–dynamic graph convolutional with orthogonal gates method, termed MST-DGCN, is proposed for ECG arrhythmia classification. In this method, a temporal–dynamic graph convolution with dynamic adjacency matrices is used to learn spatiotemporal patterns jointly, and an orthogonal gated fusion mechanism is used to eliminate redundancy, so as to strengthen their complementarity and independence by dynamically adjusting the significance of features. Moreover, a multi-instance learning strategy is proposed to alleviate class imbalance by adjusting the proportion of minority arrhythmia samples through adaptive label allocation. After validation on the St Petersburg INCART dataset under stringent inter-patient settings, the experimental results show that the proposed MST-DGCN method can achieve the best classification performance with an F1-score of 73.66% (+6.2% over prior baseline methods), with concurrent improvements in AUC (70.92%) and mAP (85.24%), while maintaining computational efficiency. Full article
15 pages, 1304 KB  
Article
Conv-ScaleNet: A Multiscale Convolutional Model for Federated Human Activity Recognition
by Xian Wu Ting, Ying Han Pang, Zheng You Lim, Shih Yin Ooi and Fu San Hiew
AI 2025, 6(9), 218; https://doi.org/10.3390/ai6090218 - 8 Sep 2025
Viewed by 441
Abstract
Background: Artificial Intelligence (AI) techniques have been extensively deployed in sensor-based Human Activity Recognition (HAR) systems. Recent advances in deep learning, especially Convolutional Neural Networks (CNNs), have advanced HAR by enabling automatic feature extraction from raw sensor data. However, these models often struggle to capture multiscale patterns in human activity, limiting recognition accuracy. Additionally, traditional centralized learning approaches raise data privacy concerns, as personal sensor data must be transmitted to a central server, increasing the risk of privacy breaches. Methods: To address these challenges, this paper introduces Conv-ScaleNet, a CNN-based model designed for multiscale feature learning and compatibility with federated learning (FL) environments. Conv-ScaleNet integrates a Pyramid Pooling Module to extract both fine-grained and coarse-grained features and employs sequential Global Average Pooling layers to progressively capture abstract global representations from inertial sensor data. The model supports federated learning by training locally on user devices, sharing only model updates rather than raw data, thus preserving user privacy. Results: Experimental results demonstrate that the proposed Conv-ScaleNet achieves approximately 98% and 96% F1-scores on the WISDM and UCI-HAR datasets, respectively, confirming its competitiveness in FL environments for activity recognition. Conclusions: The proposed Conv-ScaleNet model addresses key limitations of existing HAR systems by combining multiscale feature learning with privacy-preserving training. Its strong performance, data protection capability, and adaptability to decentralized environments make it a robust and scalable solution for real-world HAR applications. Full article
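The federated-learning setup described above can be reduced to one server-side step: each client trains locally, and only model weights (never raw sensor data) are sent back and averaged. A minimal sketch of size-weighted federated averaging (FedAvg-style; weight vectors as plain lists, all values invented):

```python
def fed_avg(client_weights, client_sizes):
    """Average client weight vectors, weighted by local dataset size,
    to produce the next global model."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * s for w, s in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Two clients with different amounts of local HAR sensor data:
global_w = fed_avg([[1.0, 0.0], [0.0, 1.0]], client_sizes=[3, 1])
print(global_w)  # pulled toward the client with more data: [0.75, 0.25]
```

In the Conv-ScaleNet setting, `client_weights` would hold each device's locally trained CNN parameters; privacy is preserved because only these updates, not the inertial recordings, leave the device.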
23 pages, 904 KB  
Article
Unplugged Activities for Teaching Decision Trees to Secondary Students—A Case Study Analysis Using the SOLO Taxonomy
by Konstantinos Karapanos, Vassilis Komis, Georgios Fesakis, Konstantinos Lavidas, Stavroula Prantsoudi and Stamatios Papadakis
AI 2025, 6(9), 217; https://doi.org/10.3390/ai6090217 - 5 Sep 2025
Viewed by 2230
Abstract
The integration of Artificial Intelligence (AI) technologies in students’ lives necessitates the systematic incorporation of foundational AI literacy into educational curricula. Students are challenged to develop conceptual understanding of computational frameworks such as Machine Learning (ML) algorithms and Decision Trees (DTs). In this context, unplugged (i.e., computer-free) pedagogical approaches have emerged as complementary to traditional coding-based instruction in AI education. This study examines the pedagogical effectiveness of an instructional intervention employing unplugged activities to facilitate conceptual understanding of DT algorithms among 47 9th-grade students within a Computer Science (CS) curriculum in Greece. The study employed a quasi-experimental design, utilizing the Structure of Observed Learning Outcomes (SOLO) taxonomy as the theoretical framework for assessing cognitive development and conceptual mastery of DT principles. Quantitative analysis of pre- and post-intervention assessments demonstrated statistically significant improvements in student performance across all evaluated SOLO taxonomy levels. The findings provide empirical support for the hypothesis that unplugged pedagogical interventions constitute an effective and efficient approach for introducing AI concepts to secondary education students. Based on these outcomes, the authors recommend the systematic implementation of developmentally appropriate unplugged instructional interventions for DTs and broader AI concepts across all educational levels, to optimize AI literacy acquisition. Full article
45 pages, 990 KB  
Review
Large Language Models in Cybersecurity: A Survey of Applications, Vulnerabilities, and Defense Techniques
by Niveen O. Jaffal, Mohammed Alkhanafseh and David Mohaisen
AI 2025, 6(9), 216; https://doi.org/10.3390/ai6090216 - 5 Sep 2025
Viewed by 2072
Abstract
Large Language Models (LLMs) are transforming cybersecurity by enabling intelligent, adaptive, and automated approaches to threat detection, vulnerability assessment, and incident response. With their advanced language understanding and contextual reasoning, LLMs surpass traditional methods in tackling challenges across domains such as the Internet of Things (IoT), blockchain, and hardware security. This survey provides a comprehensive overview of LLM applications in cybersecurity, focusing on two core areas: (1) the integration of LLMs into key cybersecurity domains, and (2) the vulnerabilities of LLMs themselves, along with mitigation strategies. By synthesizing recent advancements and identifying key limitations, this work offers practical insights and strategic recommendations for leveraging LLMs to build secure, scalable, and future-ready cyber defense systems. Full article
21 pages, 471 KB  
Review
Long Short-Term Memory Networks: A Comprehensive Survey
by Moez Krichen and Alaeddine Mihoub
AI 2025, 6(9), 215; https://doi.org/10.3390/ai6090215 - 5 Sep 2025
Viewed by 1129
Abstract
Long Short-Term Memory (LSTM) networks have revolutionized the field of deep learning, particularly in applications that require the modeling of sequential data. Originally designed to overcome the limitations of traditional recurrent neural networks (RNNs), LSTMs effectively capture long-range dependencies in sequences, making them suitable for a wide array of tasks. This survey aims to provide a comprehensive overview of LSTM architectures, detailing their unique components, such as cell states and gating mechanisms, which facilitate the retention and modulation of information over time. We delve into the various applications of LSTMs across multiple domains, including the following: natural language processing (NLP), where they are employed for language modeling, machine translation, and sentiment analysis; time series analysis, where they play a critical role in forecasting tasks; and speech recognition, significantly enhancing the accuracy of automated systems. By examining these applications, we illustrate the versatility and robustness of LSTMs in handling complex data types. Additionally, we explore several notable variants and improvements of the standard LSTM architecture, such as Bidirectional LSTMs, which enhance context understanding, and Stacked LSTMs, which increase model capacity. We also discuss the integration of attention mechanisms with LSTMs, which have further advanced their performance in various tasks. Despite their strengths, LSTMs face several challenges, including high computational complexity, extensive data requirements, and difficulties in training, which can hinder their practical implementation. This survey addresses these limitations and provides insights into ongoing research aimed at mitigating these issues. In conclusion, we highlight recent advances in LSTM research and propose potential future directions that could lead to enhanced performance and broader applicability of LSTM networks. This survey serves as a foundational resource for researchers and practitioners seeking to understand the current landscape of LSTM technology and its future trajectory. Full article
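The cell states and gating mechanisms this survey details follow the standard LSTM formulation. For reference, the forward pass of a single cell at time step t, with σ the logistic sigmoid and ⊙ elementwise multiplication, is:

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```

The additive cell-state update c_t is what lets gradients flow across many time steps, which is the mechanism behind the long-range dependency capture discussed above.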
26 pages, 3073 KB  
Article
From Detection to Decision: Transforming Cybersecurity with Deep Learning and Visual Analytics
by Saurabh Chavan and George Pappas
AI 2025, 6(9), 214; https://doi.org/10.3390/ai6090214 - 4 Sep 2025
Viewed by 511
Abstract
Objectives: The persistent evolution of software vulnerabilities—spanning novel zero-day exploits to logic-level flaws—continues to challenge conventional cybersecurity mechanisms. Static rule-based scanners and opaque deep learning models often lack the precision and contextual understanding required for both accurate detection and analyst interpretability. This paper presents a hybrid framework for real-time vulnerability detection that improves both robustness and explainability. Methods: The framework integrates semantic encoding via Bidirectional Encoder Representations from Transformers (BERT), structural analysis using Deep Graph Convolutional Neural Networks (DGCNNs), and lightweight prioritization through Kernel Extreme Learning Machines (KELMs). The architecture incorporates Minimum Intermediate Representation (MIR) learning to reduce false positives and fuses multi-modal data (source code, execution traces, textual metadata) for robust, scalable performance. Explainable Artificial Intelligence (XAI) visualizations—combining SHAP-based attributions and CVSS-aligned pair plots—serve as an analyst-facing interpretability layer. The framework is evaluated on benchmark datasets, including VulnDetect and the NIST Software Reference Library (NSRL, version 2024.12.1, used strictly as a benign baseline for false positive estimation). Results: Evaluation on precision, recall, AUPRC, MCC, and calibration (ECE/Brier score) demonstrated improved robustness and fewer false positives than the baselines. An internal interpretability validation was conducted to align SHAP/GNNExplainer outputs with known vulnerability features; formal usability testing with practitioners is left as future work. Conclusions: Designed with DevSecOps integration in mind, the framework is packaged in containerized modules (Docker/Kubernetes) and outputs SIEM-compatible alerts, enabling potential compatibility with Splunk, GitLab CI/CD, and similar tools. While full enterprise deployment was not performed, these deployment-oriented design choices support scalability and practical adoption. Full article
50 pages, 2995 KB  
Review
A Survey of Traditional and Emerging Deep Learning Techniques for Non-Intrusive Load Monitoring
by Annysha Huzzat, Ahmed S. Khwaja, Ali A. Alnoman, Bhagawat Adhikari, Alagan Anpalagan and Isaac Woungang
AI 2025, 6(9), 213; https://doi.org/10.3390/ai6090213 - 3 Sep 2025
Viewed by 776
Abstract
To cope with the increasing global demand for energy and significant energy wastage caused by the use of different home appliances, smart load monitoring is considered a promising solution to promote proper activation and scheduling of devices and reduce electricity bills. Instead of installing a sensing device on each electric appliance, non-intrusive load monitoring (NILM) enables the monitoring of each individual device using the total power reading of the home smart meter. However, for high-accuracy load monitoring, efficient artificial intelligence (AI) and deep learning (DL) approaches are needed. To that end, this paper thoroughly reviews traditional AI and DL approaches, as well as emerging AI models proposed for NILM. Unlike existing surveys that are usually limited to a specific approach or a subset of approaches, this review paper presents a comprehensive survey of an ensemble of topics and models, including deep learning, generative AI (GAI), emerging attention-enhanced GAI, and hybrid AI approaches. Another distinctive feature of this work compared to existing surveys is that it also reviews actual cases of NILM system design and implementation, covering a wide range of technical enablers including hardware, software, and AI models. Furthermore, a range of future research directions and challenges is discussed, such as the heterogeneity of energy sources, data uncertainty, privacy and safety, cost and complexity reduction, and the need for a standardized comparison. Full article
(This article belongs to the Section AI Systems: Theory and Applications)
38 pages, 13994 KB  
Article
Post-Heuristic Cancer Segmentation Refinement over MRI Images and Deep Learning Models
by Panagiotis Christakakis and Eftychios Protopapadakis
AI 2025, 6(9), 212; https://doi.org/10.3390/ai6090212 - 2 Sep 2025
Viewed by 770
Abstract
In recent years, deep learning methods have greatly improved the accuracy of brain-tumor segmentation, yet slice-wise inconsistencies still limit their reliable use in clinical practice. While volume-aware 3D convolutional networks achieve high accuracy, their memory footprint and inference time may limit clinical adoption. This study proposes [...] Read more.
In recent years, deep learning methods have greatly improved the accuracy of brain-tumor segmentation, yet slice-wise inconsistencies still limit their reliable use in clinical practice. While volume-aware 3D convolutional networks achieve high accuracy, their memory footprint and inference time may limit clinical adoption. This study proposes a resource-conscious pipeline for lower-grade glioma delineation in axial FLAIR MRI that combines a 2D Attention U-Net with a guided post-processing refinement step. Two segmentation backbones, a vanilla U-Net and an Attention U-Net, are trained on 110 TCGA-LGG axial FLAIR patient volumes under various loss and activation functions. The Attention U-Net, optimized with Dice loss, delivers the strongest baseline, achieving a mean Intersection-over-Union (mIoU) of 0.857. To mitigate slice-wise inconsistencies inherent to 2D models, a White-Area Overlap (WAO) voting mechanism quantifies the tumor footprint shared by neighboring slices. The WAO curve is smoothed with a Gaussian filter to locate its peak, after which a percentile-based heuristic selectively relabels the most ambiguous softmax pixels. Cohort-level analysis shows that removing merely 0.1–0.3% of ambiguous low-confidence pixels lifts the post-processing mIoU above the baseline while improving segmentation for two-thirds of patients. The proposed refinement strategy offers a practical route for integrating deep learning segmentation into routine clinical workflows with minimal computational overhead. Full article
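The abstract's post-processing steps (neighbor-slice overlap, Gaussian smoothing of the WAO curve, percentile-based relabeling of low-confidence pixels) can be sketched in NumPy. This is an illustrative reading of the described pipeline, not the authors' code: the function name, defaults, and handling of edge cases are assumptions.

```python
import numpy as np

def wao_refine(masks, probs, drop_pct=0.2, sigma=2.0):
    """Illustrative sketch of WAO-guided refinement for a stack of 2D slices.

    masks: (S, H, W) binary per-slice predictions
    probs: (S, H, W) softmax foreground probabilities
    drop_pct: percentile of least-confident foreground pixels to relabel
              (the study reports removing roughly 0.1-0.3% of pixels)
    """
    masks = np.asarray(masks, dtype=bool)
    # White-Area Overlap: tumor area shared by each pair of neighboring slices
    wao = np.array([(masks[i] & masks[i + 1]).sum()
                    for i in range(len(masks) - 1)], dtype=float)

    # Gaussian smoothing of the WAO curve to locate its peak slice region
    r = max(1, min(int(3 * sigma), (len(wao) - 1) // 2))
    x = np.arange(-r, r + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    peak_slice = int(np.argmax(np.convolve(wao, kernel, mode="same")))

    # Percentile-based heuristic: relabel the most ambiguous foreground pixels
    refined = masks.copy()
    fg_probs = probs[masks]
    if fg_probs.size:
        thr = np.percentile(fg_probs, drop_pct)
        refined[masks & (probs < thr)] = False
    return refined, peak_slice
```

A usage pass would feed the per-slice softmax maps of the 2D Attention U-Net through this step after thresholding, keeping the 3D consistency gain without a 3D network's memory cost.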
22 pages, 47099 KB  
Article
Deciphering Emotions in Children’s Storybooks: A Comparative Analysis of Multimodal LLMs in Educational Applications
by Bushra Asseri, Estabrag Abaker, Maha Al Mogren, Tayef Alhefdhi and Areej Al-Wabil
AI 2025, 6(9), 211; https://doi.org/10.3390/ai6090211 - 2 Sep 2025
Viewed by 729
Abstract
Emotion recognition capabilities in multimodal AI systems are crucial for developing culturally responsive educational technologies, yet remain underexplored for Arabic-language contexts, where culturally appropriate learning tools are critically needed. This study evaluated the emotion recognition performance of two advanced multimodal large language [...] Read more.
Emotion recognition capabilities in multimodal AI systems are crucial for developing culturally responsive educational technologies, yet remain underexplored for Arabic-language contexts, where culturally appropriate learning tools are critically needed. This study evaluated the emotion recognition performance of two advanced multimodal large language models, GPT-4o and Gemini 1.5 Pro, when processing Arabic children’s storybook illustrations. We assessed both models across three prompting strategies (zero-shot, few-shot, and chain-of-thought) using 75 images from seven Arabic storybooks, comparing model predictions with human annotations based on Plutchik’s emotional framework. GPT-4o consistently outperformed Gemini across all conditions, achieving the highest macro F1-score of 59% with chain-of-thought prompting compared to Gemini’s best performance of 43%. Error analysis revealed systematic misclassification patterns, with valence inversions accounting for 60.7% of errors, while both models struggled with culturally nuanced emotions and ambiguous narrative contexts. These findings highlight fundamental limitations in current models’ cultural understanding and emphasize the need for culturally sensitive training approaches to develop effective emotion-aware educational technologies for Arabic-speaking learners. Full article
(This article belongs to the Special Issue Exploring the Use of Artificial Intelligence in Education)
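The 59% vs. 43% comparison above rests on the macro-averaged F1-score, which weights every emotion class equally regardless of how often it occurs. A minimal implementation for reference (the label strings are illustrative Plutchik-style classes, not the study's annotation set):

```python
from collections import defaultdict

def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute per-class F1, then average with equal class weight."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but the true label was t
            fn[t] += 1  # true label t was missed
    f1s = []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)
```

Because rare classes count as much as frequent ones, a model that collapses onto the majority emotion is penalized heavily, which is why macro F1 is the natural metric when class balance across emotions cannot be assumed.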
21 pages, 1406 KB  
Article
Neural Network-Based Weight Loss Prediction: Behavioral Integration of Stress and Sleep in AI Decision Support
by Mayra Cruz Fernandez, Francisco Antonio Castillo-Velásquez, Omar Rodriguez-Abreo, Enriqueta Ortiz-Moctezuma, Luis Angel Iturralde Carrera, Adyr A. Estévez-Bén, José M. Álvarez-Alvarado and Juvenal Rodríguez-Reséndiz
AI 2025, 6(9), 210; https://doi.org/10.3390/ai6090210 - 2 Sep 2025
Viewed by 687
Abstract
This study evaluates the effect of incorporating behavioral variables, sleep quality (SQ) and stress level (SL), into neural network models for predicting weight loss. An artificial neural network (ANN) was trained using data from 100 adults aged 18 to 60, integrating demographic, physiological, [...] Read more.
This study evaluates the effect of incorporating behavioral variables, sleep quality (SQ) and stress level (SL), into neural network models for predicting weight loss. An artificial neural network (ANN) was trained using data from 100 adults aged 18 to 60, integrating demographic, physiological, and behavioral inputs. The findings emphasize that weight change is a multifactorial process influenced not only by caloric intake, basal metabolic rate, and physical activity, but also by psychological and behavioral factors such as sleep and stress. From a medical perspective, the inclusion of SQ and SL aligns with the biopsychosocial model of obesity, acknowledging the metabolic consequences of chronic stress and poor sleep. This integration allows for the development of low-cost, non-invasive, and personalized weight management tools based on self-reported data, especially valuable in resource-limited healthcare settings. Behavioral-aware AI systems such as the one proposed have the potential to support clinical decision-making, enable early risk detection, and guide the development of digital therapeutics. Quantitative results demonstrate that the best-performing architecture achieved a Root Mean Square Error (RMSE) of 1.98%; when SQ was excluded, the RMSE increased to 4.39% (1.8-fold), when SL was excluded it rose to 4.69% (1.95-fold), and when both were removed, the error reached 6.02% (2.5-fold), confirming the substantial predictive contribution of these behavioral variables. Full article
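The reported RMSE increases (1.8-fold without SQ, 1.95-fold without SL) come from a feature-ablation protocol: retrain with a variable removed and compare errors. A hedged sketch of that protocol, using synthetic data and a least-squares model as a stand-in for the paper's ANN; all feature names, coefficients, and values here are illustrative:

```python
import numpy as np

def fit_predict(X, y):
    # Least-squares stand-in for the ANN: fit weights + bias, predict in-sample
    A = np.c_[X, np.ones(len(X))]
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ w

def rmse(y, yhat):
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

# Synthetic cohort: demographic/physiological inputs plus the two behavioral ones
feature_names = ["calories", "bmr", "activity", "sleep_quality", "stress_level"]
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))
y = X @ np.array([0.5, 0.3, 0.4, 0.6, -0.7]) + rng.normal(scale=0.1, size=100)

# Ablation loop: drop one behavioral feature at a time and re-measure RMSE
results = {}
for ablated in (None, "sleep_quality", "stress_level"):
    keep = [i for i, n in enumerate(feature_names) if n != ablated]
    results[ablated] = rmse(y, fit_predict(X[:, keep], y))
```

On data where the behavioral variables genuinely carry signal, dropping either one raises the RMSE relative to the full model, mirroring the pattern the study reports.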