AI, Volume 6, Issue 11 (November 2025) – 24 articles

Cover Story: Existing anomaly detection methods typically perform well only under single-class settings, which substantially limits their applicability and deployment efficiency in real-world multi-class scenarios. To overcome this limitation, we propose a dynamic visual adaptation framework for multi-class anomaly detection. The framework incorporates a HyperAD plug-in module that dynamically adjusts model parameters according to the input, enabling adaptive feature extraction. By integrating the Mamba block, CNN block, and HyperAD plug-in, the proposed approach jointly captures global, local, and dynamic representations. Experimental results demonstrate that our framework achieves state-of-the-art performance on both the MVTec AD and VisA datasets, yielding image-level mAU-ROC scores of 98.8% and 95.1%, respectively.
22 pages, 3725 KB  
Article
An Enhanced Machine Learning Framework for Network Anomaly Detection
by Oumaima Chentoufi, Mouad Choukhairi and Khalid Chougdali
AI 2025, 6(11), 299; https://doi.org/10.3390/ai6110299 - 20 Nov 2025
Viewed by 1048
Abstract
Given the increasing volume and sophistication of cyber-attacks, there is a persistent need for improved, adaptive real-time intrusion detection systems (IDSs). Machine learning algorithms offer a promising approach for enhancing their capabilities. This research investigates the impact of different dimensionality reduction approaches on performance, pairing both Batch PCA and Incremental PCA with Logistic Regression, SVM, and Decision Tree classifiers. We first applied the machine learning algorithms directly to the pre-processed data, then applied the same algorithms to the reduced data. Incremental PCA with Decision Tree yielded an accuracy of 98.61% and an F1-score of 98.64% with a prediction time of only 0.09 s. Batch PCA with SVM obtained an accuracy of 98.44% and an F1-score of 98.47% with a prediction time of 0.04 s, and Incremental PCA with Logistic Regression reached an accuracy of 98.47% and an F1-score of 98.51% with a prediction time of 0.05 s. The findings demonstrate that Incremental PCA enables near real-time IDS deployment in large networks. Full article
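The Incremental PCA plus Decision Tree pipeline the abstract reports can be sketched with scikit-learn. The synthetic dataset, component count, and batch size below are illustrative stand-ins, not the paper's actual data or configuration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import IncrementalPCA
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for pre-processed network-traffic features
X, y = make_classification(n_samples=2000, n_features=40, n_informative=12,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# IncrementalPCA fits in mini-batches, so the full data matrix never has to
# reside in memory -- the property that suits streaming, large-network IDS use.
ipca = IncrementalPCA(n_components=10, batch_size=200)
ipca.fit(X_tr)

clf = DecisionTreeClassifier(random_state=0)
clf.fit(ipca.transform(X_tr), y_tr)
accuracy = clf.score(ipca.transform(X_te), y_te)
```

On streamed data, `ipca.partial_fit(batch)` can replace the single `fit` call so the projection is updated batch by batch.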

17 pages, 6551 KB  
Article
AdaLite: A Distilled AdaBins Model for Depth Estimation on Resource-Limited Devices
by Mohammed Chaouki Ziara, Mohamed Elbahri, Nasreddine Taleb, Kidiyo Kpalma and Sid Ahmed El Mehdi Ardjoun
AI 2025, 6(11), 298; https://doi.org/10.3390/ai6110298 - 20 Nov 2025
Viewed by 878
Abstract
This paper presents AdaLite, a knowledge distillation framework for monocular depth estimation designed for efficient deployment on resource-limited devices, without relying on quantization or pruning. While large-scale depth estimation networks achieve high accuracy, their computational and memory demands hinder real-time use. To address this problem, a large model is adopted as a teacher, and a compact encoder–decoder student with few trainable parameters is trained under a dual-supervision scheme that aligns its predictions with both teacher feature maps and ground-truth depths. AdaLite is evaluated on the NYUv2, SUN-RGBD and KITTI benchmarks using standard depth metrics and deployment-oriented measures, including inference latency. The distilled model achieves a 94% reduction in size and reaches 1.02 FPS on a Raspberry Pi 2 (2 GB CPU), while preserving 96.8% of the teacher’s accuracy (δ1) and providing over 11× faster inference. These results demonstrate the effectiveness of distillation-driven compression for real-time depth estimation in resource-limited environments. The code is publicly available. Full article
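The dual-supervision scheme can be written as a single loss combining a feature-alignment term and a ground-truth term. This is a minimal sketch of the idea; the function name, the MSE choice for both terms, and the `alpha` weighting are our assumptions, not AdaLite's published loss:

```python
import numpy as np

def dual_supervision_loss(student_feat, teacher_feat,
                          student_depth, gt_depth, alpha=0.5):
    """Hypothetical sketch of dual supervision: the student is pulled
    toward the teacher's feature maps AND toward ground-truth depth.
    `alpha` balances the two terms (illustrative choice)."""
    feat_term = np.mean((student_feat - teacher_feat) ** 2)
    depth_term = np.mean((student_depth - gt_depth) ** 2)
    return alpha * feat_term + (1.0 - alpha) * depth_term
```

In a real training loop the same combination would be computed on framework tensors and backpropagated through the student only.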

14 pages, 1169 KB  
Article
Can Open-Source Large Language Models Detect Medical Errors in Real-World Ophthalmology Reports?
by Ante Kreso, Bosko Jaksic, Filip Rada, Zvonimir Boban, Darko Batistic, Donald Okmazic, Lara Veldic, Ivan Luksic, Ljubo Znaor, Sandro Glumac, Josko Bozic and Josip Vrdoljak
AI 2025, 6(11), 297; https://doi.org/10.3390/ai6110297 - 20 Nov 2025
Viewed by 968
Abstract
Accurate documentation is critical in ophthalmology, yet clinical notes often contain subtle errors that can affect decision-making. This study prospectively compared contemporary large language models (LLMs) for detecting clinically salient errors in emergency ophthalmology encounter notes and generating actionable corrections. A total of 129 de-identified notes, each seeded with a predefined target error, were independently audited by four LLMs (o3 (OpenAI, closed-source), DeepSeek-v3-r1 (DeepSeek, open-source), MedGemma-27B (Google, open-source), and GPT-4o (OpenAI, closed-source)) using a standardized prompt. Two masked ophthalmologists graded error localization, relevance of additional issues, and overall recommendation quality, with within-case analyses applying appropriate nonparametric tests. Performance varied significantly across models (Cochran’s Q = 71.13, p = 2.44 × 10⁻¹⁵). o3 achieved the highest error localization accuracy at 95.7% (95% CI, 89.5–98.8), followed by DeepSeek-v3-r1 (90.3%), MedGemma-27B (80.9%), and GPT-4o (53.2%). Ordinal outcomes similarly favored o3 and DeepSeek-v3-r1 (both p < 10⁻⁹ vs. GPT-4o), with mean recommendation quality scores of 3.35, 3.05, 2.54, and 2.11, respectively. These findings demonstrate that LLMs can serve as accurate “second-eyes” for ophthalmology documentation. A proprietary model led on all metrics, while a strong open-source alternative approached its performance, offering potential for privacy-preserving on-premise deployment. Clinical translation will require oversight, workflow integration, and careful attention to ethical considerations. Full article
(This article belongs to the Section Medical & Healthcare AI)
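Cochran’s Q, the test the study uses to compare the four models on the same 129 cases, has a compact closed form for an n-cases × k-models matrix of 0/1 success indicators. A minimal implementation (the toy matrices in the checks are ours, not study data):

```python
import numpy as np

def cochrans_q(successes):
    """Cochran's Q statistic for k related binary samples.
    `successes` is an (n_cases, k_models) 0/1 matrix: entry (i, j)
    is 1 if model j handled case i correctly.
    Q = (k-1) * [k * sum(G_j^2) - N^2] / (k*N - sum(L_i^2)),
    with G_j the column totals, L_i the row totals, N the grand total."""
    X = np.asarray(successes)
    n, k = X.shape
    G = X.sum(axis=0)          # per-model success counts
    L = X.sum(axis=1)          # per-case success counts
    N = G.sum()
    return (k - 1) * (k * (G ** 2).sum() - N ** 2) / (k * N - (L ** 2).sum())
```

For k = 2 the statistic reduces to McNemar’s (b − c)²/(b + c) on the discordant pairs, which is a handy sanity check.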

21 pages, 2856 KB  
Article
Modeling Dynamic Risk Perception Using Large Language Model (LLM) Agents
by He Wen, Mojtaba Parsaee and Zaman Sajid
AI 2025, 6(11), 296; https://doi.org/10.3390/ai6110296 - 19 Nov 2025
Viewed by 1441
Abstract
Background: Understanding how accident risk escalates during unfolding industrial events is essential for developing intelligent safety systems. This study proposes a large language model (LLM)-based framework that simulates human-like risk reasoning over sequential accident precursors. Methods: Using 100 investigation reports from the U.S. Chemical Safety Board (CSB), two Generative Pre-trained Transformer (GPT) agents were developed: (1) an Accident Precursor Extractor to identify and classify time-ordered events, and (2) a Subjective Probability Estimator to update perceived accident likelihood as precursors unfold. Results: The subjective accident probability increases near-linearly, with an average escalation of 8.0% ± 0.9% per precursor (p<0.05). A consistent tipping point occurs at the fourth precursor, marking a perceptual shift to high-risk awareness. Across 90 analyzed cases, Agent 1 achieved 0.88 precision and 0.84 recall, while Agent 2 reproduced human-like probabilistic reasoning within ±0.08 of expert baselines. The magnitude of escalation differed across precursor types. Organizational factors were perceived as the highest risk (median = 0.56), followed by human error (median = 0.47). Technical and environmental factors demonstrated comparatively smaller effects. Conclusions: These findings confirm that LLM agents can emulate Bayesian-like updating in dynamic risk perception, offering a scalable and explainable foundation for adaptive, sequence-aware safety monitoring in safety-critical systems. Full article
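The reported dynamics, roughly an 8-point rise in perceived accident probability per precursor with a tipping point around the fourth, can be captured in a toy escalation model. The step size, starting prior, and high-risk threshold below are illustrative parameters, not values from the paper:

```python
def escalate_risk(prior, n_precursors, step=0.08, threshold=0.5):
    """Toy model of near-linear risk-perception escalation: perceived
    accident probability rises ~8 points per observed precursor and is
    flagged high-risk once it crosses `threshold` (illustrative values)."""
    p = prior
    trace = []
    for _ in range(n_precursors):
        p = min(1.0, p + step)       # near-linear escalation, capped at 1
        trace.append(round(p, 4))
    # index (1-based) of the first precursor that crosses the threshold
    tipping = next((i + 1 for i, v in enumerate(trace) if v >= threshold), None)
    return trace, tipping
```

With a prior of 0.2 this sketch crosses the 0.5 threshold at the fourth precursor, mirroring the tipping behavior the abstract describes; a Bayesian variant would update odds rather than add a fixed step.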

27 pages, 657 KB  
Review
Artificial Intelligence in Finance: From Market Prediction to Macroeconomic and Firm-Level Forecasting
by Flavius Gheorghe Popa and Vlad Muresan
AI 2025, 6(11), 295; https://doi.org/10.3390/ai6110295 - 17 Nov 2025
Cited by 1 | Viewed by 3991
Abstract
This review surveys how contemporary machine learning is reshaping financial and economic forecasting across markets, macroeconomics, and corporate planning. We synthesize evidence on model families, such as regularized linear methods, tree ensembles, and deep neural architectures, and explain their optimization (with gradient-based training) and design choices (activation and loss functions). Across tasks, Random Forest and gradient-boosted trees emerge as robust baselines, offering strong out-of-sample accuracy and interpretable variable importance. For sequential signals, recurrent models, especially LSTM ensembles, consistently improve directional classification and volatility-aware predictions, while transformer-style attention is a promising direction for longer contexts. Practical performance hinges on aligning losses with business objectives (for example, cross-entropy vs. RMSE/MAE), handling class imbalance, and avoiding data leakage through rigorous cross-validation. In high-dimensional settings, regularization (such as ridge/lasso/elastic-net) stabilizes estimation and enhances generalization. We compile task-specific feature sets for macro indicators, market microstructure, and firm-level data, and distill implementation guidance covering hyperparameter search, evaluation metrics, and reproducibility. We conclude with open challenges (accuracy–interpretability trade-off, limited causal insight) and outline a research agenda combining econometrics with representation learning and data-centric evaluation. Full article
(This article belongs to the Special Issue AI in Finance: Leveraging AI to Transform Financial Services)
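The regularization the review highlights for high-dimensional settings has, in the ridge case, a closed-form solution that makes the stabilizing effect easy to see. A small sketch (the synthetic regression problem is ours, for illustration):

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression: w = (X'X + lam*I)^{-1} X'y.
    The L2 penalty `lam` shrinks coefficients toward zero, which
    stabilizes estimation when features are many or collinear."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 3.0])
y = X @ w_true

w_ols = ridge_fit(X, y, lam=0.0)      # no penalty: recovers the true weights
w_shrunk = ridge_fit(X, y, lam=1e6)   # heavy penalty: coefficients shrink to ~0
```

Lasso and elastic-net behave analogously but have no closed form; they trade the `solve` call for coordinate descent and can zero out coefficients entirely.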

29 pages, 9355 KB  
Article
AI-Delphi: Emulating Personas Toward Machine–Machine Collaboration
by Lucas Nóbrega, Luiz Felipe Martinez, Luísa Marschhausen, Yuri Lima, Marcos Antonio de Almeida, Alan Lyra, Carlos Eduardo Barbosa and Jano Moreira de Souza
AI 2025, 6(11), 294; https://doi.org/10.3390/ai6110294 - 14 Nov 2025
Viewed by 1206
Abstract
Recent technological advancements have made Large Language Models (LLMs) easily accessible through apps such as ChatGPT, Claude.ai, Google Gemini, and HuggingChat, allowing text generation on diverse topics with a simple prompt. Considering this scenario, we propose three machine–machine collaboration models that leverage the extensive knowledge of LLMs to streamline and accelerate Delphi execution. We then applied one of these models—the Iconic Minds Delphi—to run Delphi questionnaires focused on the future of work and higher education in Brazil. To do so, we prompted ChatGPT to assume the role of well-known public figures from various knowledge areas. To validate the effectiveness of this approach, we asked one of the emulated experts to evaluate his responses. Although this individual validation was not sufficient to generalize the approach’s effectiveness, it revealed an 85% agreement rate, suggesting a promising alignment between the emulated persona and the real expert’s opinions. Our work contributes to leveraging Artificial Intelligence (AI) in Futures Research, emphasizing LLMs’ potential as collaborators in shaping future visions while discussing their limitations. In conclusion, our research demonstrates the synergy between Delphi and LLMs, providing a glimpse into a new method for exploring central themes, such as the future of work and higher education. Full article
(This article belongs to the Topic Generative AI and Interdisciplinary Applications)

50 pages, 837 KB  
Article
FedEHD: Entropic High-Order Descent for Robust Federated Multi-Source Environmental Monitoring
by Koffka Khan, Winston Elibox, Treina Dinoo Ramlochan, Wayne Rajkumar and Shanta Ramnath
AI 2025, 6(11), 293; https://doi.org/10.3390/ai6110293 - 14 Nov 2025
Viewed by 914
Abstract
We propose Federated Entropic High-Order Descent (FedEHD), a drop-in client optimizer that augments local SGD with (i) an entropy (sign) term and (ii) quadratic and cubic gradient components for drift control and implicit clipping. Across non-IID CIFAR-10 and CIFAR-100 benchmarks (100 clients, 10% sampled per round), FedEHD achieves faster and higher convergence than strong baselines including FedAvg, FedProx, SCAFFOLD, FedDyn, MOON, and FedAdam. On CIFAR-10, it reaches 70% accuracy in approximately 80 rounds (versus 100 for MOON and 130 for SCAFFOLD) and attains a final accuracy of 72.5%. On CIFAR-100, FedEHD surpasses 60% accuracy by about 150 rounds (compared with 250 for MOON and 300 for SCAFFOLD) and achieves a final accuracy of 68.0%. In an environmental monitoring case study involving four distributed air-quality stations, FedEHD yields the highest macro AUC/F1 and improved calibration (ECE 0.183 versus 0.186–0.210 for competing federated methods) without additional communication and with only O(d) local overhead. The method further provides scale-invariant coefficients with optional automatic adaptation, theoretical guarantees for surrogate descent and drift reduction, and convergence curves that illustrate smooth and stable learning dynamics. Full article
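The abstract describes the client optimizer as local SGD augmented with a sign (entropy) term and quadratic and cubic gradient components, but does not give the exact update rule. A speculative sketch of one plausible functional form (coefficients, names, and the sign-preserving quadratic term are all our assumptions):

```python
import numpy as np

def ehd_client_step(w, grad, lr=0.01, c1=1e-3, c2=1e-2, c3=1e-2):
    """Speculative sketch of an 'entropic high-order' local step:
    plain SGD plus a sign term (c1) and sign-preserving quadratic (c2)
    and cubic (c3) gradient components. Illustrative only -- the paper's
    actual update rule may differ."""
    g = np.asarray(grad)
    update = g + c1 * np.sign(g) + c2 * np.abs(g) * g + c3 * g ** 3
    return w - lr * update
```

In a federated round, each sampled client would run several such steps locally before the server averages the resulting models, exactly as in FedAvg; only the local step changes, which is why the method adds no communication and only O(d) overhead.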

15 pages, 851 KB  
Article
Attitudes Toward Artificial Intelligence in Organizational Contexts
by Silvia Marocco, Diego Bellini, Barbara Barbieri, Fabio Presaghi, Elena Grossi and Alessandra Talamo
AI 2025, 6(11), 292; https://doi.org/10.3390/ai6110292 - 14 Nov 2025
Viewed by 1962
Abstract
The adoption of artificial intelligence (AI) is reshaping organizational practices, yet workers’ attitudes remain crucial for its successful integration. This study examines how perceived organizational ethical culture, organizational innovativeness, and job performance influence workers’ attitudes towards AI. A survey was administered to 356 workers across diverse sectors, with analyses focusing on 154 participants who reported prior AI use. Measures included the Attitudes Towards Artificial Intelligence at Work (AAAW), Corporate Ethical Virtues (CEV), Inventory of Organizational Innovativeness (IOI), and an adapted version of the In-Role Behaviour Scale. Hierarchical regression analyses revealed that ethical culture dimensions, particularly Clarity and Feasibility, significantly predicted attitudes towards AI, such as anxiety and job insecurity, with Feasibility also associated with the attribution of human-like traits to AI. Supportability, reflecting a cooperative work environment, was linked to lower perceptions of AI human-likeness and adaptability. Among innovation dimensions, only Raising Projects, the active encouragement of employees’ ideas, was positively related to perceptions of AI adaptability, highlighting the importance of participatory innovation practices over abstract signals. Most importantly, perceived job performance improvements through AI predicted more positive attitudes, including greater perceived quality, utility, and reduced anxiety. Overall, this study contributes to the growing literature on AI in organizations by offering an exploratory yet integrative framework that captures the multifaceted nature of AI acceptance in the workplace. Full article

23 pages, 6147 KB  
Article
Super-Resolution Reconstruction Approach for MRI Images Based on Transformer Network
by Xin Liu, Chuangxin Huang, Jianli Meng, Qi Chen, Wuzheng Ji and Qiuliang Wang
AI 2025, 6(11), 291; https://doi.org/10.3390/ai6110291 - 14 Nov 2025
Viewed by 1822
Abstract
Magnetic Resonance Imaging (MRI) serves as a pivotal medical diagnostic technique widely deployed in clinical practice, yet high-resolution reconstruction frequently introduces motion artifacts and degrades signal-to-noise ratios. To enhance imaging efficiency and improve reconstruction quality, this study proposes a Transformer network-based super-resolution framework for MRI images. The methodology integrates Nonuniform Fast Fourier Transform (NUFFT) with a hybrid-attention Transformer network to achieve high-fidelity reconstruction. The embedded NUFFT module adaptively applies density compensation to k-space data based on sampling trajectories, while the Mixed Attention Block (MAB) activates broader pixel engagement to amplify feature extraction capabilities. The Interactive Attention Block (IAB) facilitates cross-window information fusion via overlapping windows, effectively suppressing artifacts. Evaluated on the fastMRI dataset under 4× radial undersampling, the network demonstrates 3.52 dB higher PSNR and 0.21 SSIM improvement over baselines, outperforming state-of-the-art methods across quantitative metrics. Visual assessments further confirm superior detail preservation and artifact suppression. This work establishes an effective pipeline for high-quality radial MRI reconstruction, providing a novel technical pathway for low-field MRI systems with significant research and application value. Full article

28 pages, 20548 KB  
Article
KGGCN: A Unified Knowledge Graph-Enhanced Graph Convolutional Network Framework for Chinese Named Entity Recognition
by Xin Chen, Liang He, Weiwei Hu and Sheng Yi
AI 2025, 6(11), 290; https://doi.org/10.3390/ai6110290 - 13 Nov 2025
Viewed by 881
Abstract
Recent advances in Chinese Named Entity Recognition (CNER) have integrated lexical features and factual knowledge into pretrained language models. However, existing lexicon-based methods often inject knowledge as restricted, isolated token-level information, lacking rich semantic and structural context. Knowledge graphs (KGs), comprising relational triples, offer explicit relational semantics and reasoning capabilities, while Graph Convolutional Networks (GCNs) effectively capture complex sentence structures. We propose KGGCN, a unified KG-enhanced GCN framework for CNER. KGGCN introduces external factual knowledge without disrupting the original word order, employing a novel end-append serialization scheme and a visibility matrix to control interaction scope. The model further utilizes a two-phase GCN stack, combining a standard GCN for robust aggregation with a multi-head attention GCN for adaptive structural refinement, to capture multi-level structural information. Experiments on four Chinese benchmark datasets demonstrate KGGCN’s superior performance. It achieves the highest F1-scores on MSRA (95.96%) and Weibo (71.98%), surpassing previous bests by 0.26 and 1.18 percentage points, respectively. Additionally, KGGCN obtains the highest Recall on OntoNotes (84.28%) and MSRA (96.14%), and the highest Precision on MSRA (95.82%), Resume (96.40%), and Weibo (72.14%). These results highlight KGGCN’s effectiveness in leveraging structured knowledge and multi-phase graph modeling to enhance entity recognition accuracy and coverage across diverse Chinese texts. Full article
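The end-append serialization with a visibility matrix, i.e. knowledge tokens appended after the sentence so the original word order is untouched, with attention restricted to each token's own span and its anchor entity, can be sketched as a boolean mask. The function name and exact visibility scheme are our reading of the abstract, not the paper's code:

```python
import numpy as np

def visibility_matrix(n_text, knowledge_spans):
    """Illustrative visibility mask for end-appended knowledge tokens.
    `knowledge_spans` is a list of (anchor_text_idx, span_len) pairs;
    spans are appended after the n_text original tokens, in order.
    Text tokens all see each other; each knowledge span sees itself
    and only the entity token it annotates (scheme is an assumption)."""
    spans = list(knowledge_spans)
    n = n_text + sum(length for _, length in spans)
    vis = np.zeros((n, n), dtype=bool)
    vis[:n_text, :n_text] = True             # original word order untouched
    pos = n_text
    for anchor, length in spans:
        s = slice(pos, pos + length)
        vis[s, s] = True                     # knowledge tokens see their span
        vis[s, anchor] = True                # ...and their anchor entity
        vis[anchor, s] = True                # anchor can attend back
        pos += length
    return vis
```

Such a mask is applied by setting attention logits to −∞ wherever `vis` is False, so injected triples inform their entity without leaking into unrelated positions.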

14 pages, 992 KB  
Article
DVAD: A Dynamic Visual Adaptation Framework for Multi-Class Anomaly Detection
by Han Gao, Huiyuan Luo, Fei Shen and Zhengtao Zhang
AI 2025, 6(11), 289; https://doi.org/10.3390/ai6110289 - 8 Nov 2025
Viewed by 1393
Abstract
Despite the superior performance of existing anomaly detection methods, they are often limited to single-class detection tasks, requiring separate models for each class. This constraint hinders their detection performance and deployment efficiency when applied to real-world multi-class data. In this paper, we propose a dynamic visual adaptation framework for multi-class anomaly detection, enabling the dynamic and adaptive capture of features based on multi-class data, thereby enhancing detection performance. Specifically, our method introduces a network plug-in, the Hyper AD Plug-in, which dynamically adjusts model parameters according to the input data to extract dynamic features. By leveraging the collaboration between the Mamba block, the CNN block, and the proposed Hyper AD Plug-in, we extract global, local, and dynamic features simultaneously. Furthermore, we incorporate the Mixture-of-Experts (MoE) module, which achieves a dynamic balance across different features through its dynamic routing mechanism and multi-expert collaboration. As a result, the proposed method achieves leading accuracy on the MVTec AD and VisA datasets, with image-level mAU-ROC scores of 98.8% and 95.1%, respectively. Full article
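The MoE module's dynamic routing, a gate scoring each expert per input and mixing their outputs, can be shown in a few lines. This is a generic dense-gating sketch with illustrative shapes, not the paper's architecture:

```python
import numpy as np

def moe_combine(x, expert_weights, gate_weights):
    """Minimal mixture-of-experts routing sketch: a softmax gate scores
    each expert per input, and the output is the gate-weighted sum of
    linear expert outputs. Shapes/naming are illustrative."""
    logits = x @ gate_weights                      # (batch, n_experts)
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    gates = np.exp(logits)
    gates /= gates.sum(axis=1, keepdims=True)      # dynamic routing weights
    # expert_weights: (n_experts, d_in, d_out); apply every expert to x
    expert_out = np.einsum('bi,eio->beo', x, expert_weights)
    return np.einsum('be,beo->bo', gates, expert_out), gates
```

Sparse variants keep only the top-k gate entries per input, so most experts are skipped at inference; the dense version above makes the "dynamic balance across different features" easiest to see.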

23 pages, 8644 KB  
Article
Understanding What the Brain Sees: Semantic Recognition from EEG Responses to Visual Stimuli Using Transformer
by Ahmed Fares
AI 2025, 6(11), 288; https://doi.org/10.3390/ai6110288 - 7 Nov 2025
Viewed by 1480
Abstract
Understanding how the human brain processes and interprets multimedia content represents a frontier challenge in neuroscience and artificial intelligence. This study introduces a novel approach to decode semantic information from electroencephalogram (EEG) signals recorded during visual stimulus perception. We present DCT-ViT, a spatial–temporal transformer architecture that pioneers automated semantic recognition from brain activity patterns, advancing beyond conventional brain state classification to interpret higher level cognitive understanding. Our methodology addresses three fundamental innovations: First, we develop a topology-preserving 2D electrode mapping that, combined with temporal indexing, generates 3D spatial–temporal representations capturing both anatomical relationships and dynamic neural correlations. Second, we integrate discrete cosine transform (DCT) embeddings with standard patch and positional embeddings in the transformer architecture, enabling frequency-domain analysis that quantifies activation variability across spectral bands and enhances attention mechanisms. Third, we introduce the Semantics-EEG dataset comprising ten semantic categories extracted from visual stimuli, providing a benchmark for brain-perceived semantic recognition research. The proposed DCT-ViT model achieves 72.28% recognition accuracy on Semantics-EEG, substantially outperforming LSTM-based and attention-augmented recurrent baselines. Ablation studies demonstrate that DCT embeddings contribute meaningfully to model performance, validating their effectiveness in capturing frequency-specific neural signatures. Interpretability analyses reveal neurobiologically plausible attention patterns, with visual semantics activating occipital–parietal regions and abstract concepts engaging frontal–temporal networks, consistent with established cognitive neuroscience models. To address systematic misclassification between perceptually similar categories, we develop a hierarchical classification framework with boundary refinement mechanisms. This approach substantially reduces confusion between overlapping semantic categories, elevating overall accuracy to 76.15%. Robustness evaluations demonstrate superior noise resilience, effective cross-subject generalization, and few-shot transfer capabilities to novel categories. This work establishes the technical foundation for brain–computer interfaces capable of decoding semantic understanding, with implications for assistive technologies, cognitive assessment, and human–AI interaction. Both the Semantics-EEG dataset and DCT-ViT implementation are publicly released to facilitate reproducibility and advance research in neural semantic decoding. Full article
(This article belongs to the Special Issue AI in Bio and Healthcare Informatics)
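The DCT embedding component, projecting each temporal patch into the frequency domain before it enters the transformer, can be illustrated with SciPy's type-II DCT. The truncation to a fixed number of low-frequency coefficients is our illustrative choice, not the paper's configuration:

```python
import numpy as np
from scipy.fft import dct

def dct_embedding(patch, n_coeffs=8):
    """Sketch of a frequency-domain patch embedding: an orthonormal
    type-II DCT of a 1-D temporal patch, truncated to the lowest
    `n_coeffs` coefficients (truncation choice is illustrative)."""
    coeffs = dct(np.asarray(patch, dtype=float), norm='ortho')
    return coeffs[:n_coeffs]
```

Because the DCT is orthonormal here, coefficient magnitudes directly quantify activation variability per spectral band, the property the abstract says enhances the attention mechanism.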

22 pages, 1071 KB  
Article
Development and Validation of a Questionnaire to Evaluate AI-Generated Summaries for Radiologists: ELEGANCE (Expert-Led Evaluation of Generative AI Competence and ExcelleNCE)
by Yuriy A. Vasilev, Anton V. Vladzymyrskyy, Olga V. Omelyanskaya, Yulya A. Alymova, Dina A. Akhmedzyanova, Yuliya F. Shumskaya, Maria R. Kodenko, Ivan A. Blokhin and Roman V. Reshetnikov
AI 2025, 6(11), 287; https://doi.org/10.3390/ai6110287 - 5 Nov 2025
Viewed by 1257
Abstract
Background/Objectives: Large language models (LLMs) are increasingly considered for use in radiology, including the summarization of patient medical records to support radiologists in processing large volumes of data under time constraints. This task requires not only accuracy and completeness but also clinical applicability. Automatic metrics and general-purpose questionnaires fail to capture these dimensions, and no standardized tool currently exists for the expert evaluation of LLM-generated summaries in radiology. Here, we aimed to develop and validate such a tool. Methods: Items for the questionnaire were formulated and refined through focus group testing with radiologists. Validation was performed on 132 LLM-generated summaries of 44 patient records, each independently assessed by radiologists. Criterion validity was evaluated through known-group differentiation and construct validity through confirmatory factor analysis. Results: The resulting seven-item instrument, ELEGANCE (Expert-Led Evaluation of Generative AI Competence and Excellence), demonstrated excellent internal consistency (Cronbach’s α = 0.95). It encompasses seven dimensions: relevance, completeness, applicability, falsification, satisfaction, structure, and correctness of language and terminology. Confirmatory factor analysis supported a two-factor structure (content and form), with strong fit indices (RMSEA = 0.079, CFI = 0.989, TLI = 0.982, SRMR = 0.029). Criterion validity was confirmed by significant between-group differences (p < 0.001). Conclusions: ELEGANCE is the first validated tool for expert evaluation of LLM-generated medical record summaries for radiologists, providing a standardized framework to ensure quality and clinical utility. Full article
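The internal-consistency figure reported for ELEGANCE (Cronbach's α = 0.95) comes from a standard formula over the raters × items score matrix, easy to compute directly (the example matrices below are synthetic, not study data):

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_respondents, k_items) score matrix:
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))."""
    X = np.asarray(scores, dtype=float)
    k = X.shape[1]
    item_vars = X.var(axis=0, ddof=1).sum()     # per-item sample variances
    total_var = X.sum(axis=1).var(ddof=1)       # variance of summed scores
    return k / (k - 1) * (1.0 - item_vars / total_var)
```

Perfectly redundant items give α = 1, while items that do not covary drive α down (it can even go negative), which is why a value of 0.95 across seven dimensions indicates a highly coherent instrument.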

28 pages, 2892 KB  
Article
“In Metaverse Cryptocurrencies We (Dis)Trust?”: Mediators and Moderators of Blockchain-Enabled Non-Fungible Token (NFT) Adoption in AI-Powered Metaverses
by Seunga Venus Jin
AI 2025, 6(11), 286; https://doi.org/10.3390/ai6110286 - 4 Nov 2025
Viewed by 1017
Abstract
Metaverses have been hailed as the next arena for a wide spectrum of technovation and business opportunities. This research (∑ N = 714) focuses on the three underexplored areas of virtual commerce in AI-enabled metaverses: blockchain-powered cryptocurrencies, non-fungible tokens (NFTs), and AI-powered virtual influencers. Study 1 reports the mediating effects of (dis)trust in AI-enabled blockchain technologies and the moderating effects of consumers’ technopian perspectives in explaining the relationship between blockchain transparency perception and intention to use cryptocurrencies in AI-powered metaverses. Study 1 also reports the mediating effects of Neo-Luddism perspectives regarding metaverses and the moderating effects of consumers’ social phobia in explaining the relationship between AI-algorithm awareness and behavioral intention to engage with AI-powered virtual influencers in metaverses. Study 2 reports the serial mediating effects of general perception of NFT ownership and psychological ownership of NFTs as well as the moderating effects of the investment value of NFTs in explaining the relationship between acknowledgment of the nature of NFTs and intention to use NFTs in AI-enabled metaverses. Theoretical contributions to the literature on digital materiality and psychological ownership of blockchain/cryptocurrency-powered NFTs as emerging forms of digital consumption objects are discussed. Practical implications for NFT-based branding/entrepreneurship and creative industries in blockchain-enabled metaverses are provided. Full article
Show Figures

Figure 1

19 pages, 2582 KB  
Review
From Black Box to Glass Box: A Practical Review of Explainable Artificial Intelligence (XAI)
by Xiaoming Liu, Danni Huang, Jingyu Yao, Jing Dong, Litong Song, Hui Wang, Chao Yao and Weishen Chu
AI 2025, 6(11), 285; https://doi.org/10.3390/ai6110285 - 3 Nov 2025
Cited by 1 | Viewed by 6399
Abstract
Explainable Artificial Intelligence (XAI) has become essential as machine learning systems are deployed in high-stakes domains such as security, finance, and healthcare. Traditional models often act as “black boxes”, limiting trust and accountability. However, most existing reviews treat explainability either as a technical problem or a philosophical issue, without connecting interpretability techniques to their real-world implications for security, privacy, and governance. This review fills that gap by integrating theoretical foundations with practical applications and societal perspectives. We define transparency and interpretability as core concepts and introduce new economics-inspired notions of marginal transparency and marginal interpretability to highlight diminishing returns in disclosure and explanation. Methodologically, we examine model-agnostic approaches such as LIME and SHAP, alongside model-specific methods including decision trees and interpretable neural networks. We also address ante hoc vs. post hoc strategies, local vs. global explanations, and emerging privacy-preserving techniques. To contextualize XAI’s growth, we integrate capital investment and publication trends, showing that research momentum has remained resilient despite market fluctuations. Finally, we propose a roadmap for 2025–2030, emphasizing evaluation standards, adaptive explanations, integration with Zero Trust architectures, and the development of self-explaining agents supported by global standards. By combining technical insights with societal implications, this article provides both a scholarly contribution and a practical reference for advancing trustworthy AI. Full article
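The model-agnostic idea behind explainers such as LIME that the review surveys can be sketched in a few lines: perturb the input around one instance, query the black box, and fit a proximity-weighted linear surrogate whose coefficients serve as local feature importances. This is an illustrative toy (the `black_box` function, perturbation scale, and kernel are invented for the example), not the LIME library itself:

```python
import numpy as np

rng = np.random.default_rng(0)

def black_box(X):
    # Stand-in for a trained model; invented for this example.
    return 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 0] * X[:, 1]

x0 = np.array([1.0, 1.0, 1.0])                   # instance to explain
Z = x0 + rng.normal(scale=0.5, size=(500, 3))    # 1. perturb around x0
y = black_box(Z)                                 # 2. query the black box
w = np.exp(-np.sum((Z - x0) ** 2, axis=1))       # 3. proximity weights (RBF kernel)

# 4. Weighted least-squares fit of a local linear surrogate (centered at x0,
#    with an intercept column); its slopes are the local explanation.
A = np.hstack([Z - x0, np.ones((500, 1))])
coef = np.linalg.solve(A.T @ (w[:, None] * A), A.T @ (w * y))
print(coef[:3])   # local importance of each feature near x0
```

The third feature never influences `black_box`, so its local importance comes out near zero, while the first two recover the function's local slopes at `x0`.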
(This article belongs to the Section AI Systems: Theory and Applications)
Show Figures

Figure 1

22 pages, 9577 KB  
Article
YOLOv11-4ConvNeXtV2: Enhancing Persimmon Ripeness Detection Under Visual Challenges
by Bohan Zhang, Zhaoyuan Zhang and Xiaodong Zhang
AI 2025, 6(11), 284; https://doi.org/10.3390/ai6110284 - 1 Nov 2025
Viewed by 960
Abstract
Reliable and efficient detection of persimmons provides the foundation for precise maturity evaluation. Persimmon ripeness detection remains challenging due to small target sizes, frequent occlusion by foliage, and motion- or focus-induced blur that degrades edge information. This study proposes YOLOv11-4ConvNeXtV2, an enhanced detection framework that integrates a ConvNeXtV2 backbone with Fully Convolutional Masked Auto-Encoder (FCMAE) pretraining, Global Response Normalization (GRN), and Single-Head Self-Attention (SHSA) mechanisms. We present a comprehensive persimmon dataset featuring sub-block segmentation that preserves local structural integrity while expanding dataset diversity. The model was trained on 4921 annotated images (original 703 + 6 × 703 augmented) collected under diverse orchard conditions and optimized for 300 epochs using the Adam optimizer with early stopping. Comprehensive experiments demonstrate that YOLOv11-4ConvNeXtV2 achieves 95.9% precision and 83.7% recall, with mAP@0.5 of 88.4% and mAP@0.5:0.95 of 74.8%, outperforming state-of-the-art YOLO variants (YOLOv5n, YOLOv8n, YOLOv9t, YOLOv10n, YOLOv11n, YOLOv12n) by 3.8–6.3 percentage points in mAP@0.5:0.95. The model demonstrates superior robustness to blur, occlusion, and varying illumination conditions, making it suitable for deployment in challenging maturity detection environments. Full article
Show Figures

Figure 1

26 pages, 720 KB  
Review
Ethical Bias in AI-Driven Injury Prediction in Sport: A Narrative Review of Athlete Health Data, Autonomy and Governance
by Zbigniew Waśkiewicz, Kajetan J. Słomka, Tomasz Grzywacz and Grzegorz Juras
AI 2025, 6(11), 283; https://doi.org/10.3390/ai6110283 - 1 Nov 2025
Cited by 1 | Viewed by 3190
Abstract
The increasing use of artificial intelligence (AI) in athlete health monitoring and injury prediction presents both technological opportunities and complex ethical challenges. This narrative review critically examines 24 empirical and conceptual studies focused on AI-driven injury forecasting systems across diverse sports disciplines, including professional, collegiate, youth, and Paralympic contexts. Applying an IMRAD framework, the analysis identifies five dominant ethical concerns: privacy and data protection, algorithmic fairness, informed consent, athlete autonomy, and long-term data governance. While studies commonly report the effectiveness of AI models—such as those employing decision trees, neural networks, and explainability tools like SHAP and HiPrCAM—few offer robust ethical safeguards or athlete-centered governance structures. Power asymmetries persist between athletes and institutions, with limited recognition of data ownership, transparency, and the right to contest predictive outputs. The findings highlight that ethical risks vary by sport type and competitive level, underscoring the need for sport-specific frameworks. Recommendations include establishing enforceable data rights, participatory oversight mechanisms, and regulatory protections to ensure that AI systems align with principles of fairness, transparency, and athlete agency. Without such frameworks, the integration of AI in sports medicine risks reinforcing structural inequalities and undermining the autonomy of those it intends to support. Full article
Show Figures

Figure 1

25 pages, 1436 KB  
Article
Scaling Swarm Coordination with GNNs—How Far Can We Go?
by Gianluca Aguzzi, Davide Domini, Filippo Venturini and Mirko Viroli
AI 2025, 6(11), 282; https://doi.org/10.3390/ai6110282 - 1 Nov 2025
Viewed by 1411
Abstract
The scalability of coordination policies is a critical challenge in swarm robotics, where agent numbers may vary substantially between deployment scenarios. Reinforcement learning (RL) offers a promising avenue for learning decentralized policies from local interactions, yet a fundamental question remains: can policies trained on one swarm size transfer to different population scales without retraining? This zero-shot transfer problem is particularly challenging because traditional RL approaches learn fixed-dimensional representations tied to specific agent counts, making them brittle to population changes at deployment time. While existing work addresses scalability through population-aware training (e.g., mean-field methods) or multi-size curricula (e.g., population transfer learning), these approaches either impose restrictive assumptions or require explicit exposure to varied team sizes during training. Graph Neural Networks (GNNs) offer a fundamentally different path. Their permutation invariance and ability to process variable-sized graphs suggest potential for zero-shot generalization across swarm sizes, where policies trained on a single population scale could deploy directly to larger or smaller teams. However, this capability remains largely unexplored in the context of swarm coordination. We therefore investigate this question empirically by combining GNNs with deep Q-learning in cooperative swarms, introducing Deep Graph Q-Learning (DGQL), which embeds agent-neighbor graphs into Q-learning and trains on fixed-size swarms. We focus on well-established 2D navigation tasks that are commonly used in the swarm robotics literature to study coordination and scalability, providing a controlled yet meaningful setting for our analysis. Across two benchmarks (goal reaching and obstacle avoidance), we deploy teams up to three times larger than at training. DGQL preserves functional coordination without retraining, but efficiency degrades with size: the final goal distance grows monotonically (15–29 agents) and worsens beyond roughly twice the training size (20 agents), with task-dependent trade-offs. Our results quantify the scalability limits of GNN-enhanced DQL and suggest architectural and training strategies to better sustain performance across scales. Full article
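The permutation-invariance argument can be made concrete with a minimal sketch: in a mean-aggregation message-passing layer with a linear Q-head, the weight shapes depend only on the per-agent feature and action dimensions, never on the swarm size, so one set of parameters serves any number of agents. All dimensions, weights, and graphs below are invented for illustration; this is not the authors' DGQL implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical dimensions: 4 state features per agent, 5 discrete actions.
F, A, H = 4, 5, 16
W_self = rng.normal(scale=0.1, size=(F, H))   # encodes the agent's own state
W_nbr  = rng.normal(scale=0.1, size=(F, H))   # encodes aggregated neighbours
W_out  = rng.normal(scale=0.1, size=(H, A))   # maps hidden state to Q-values

def q_values(states, adjacency):
    """One mean-aggregation GNN layer followed by a linear Q-head.

    states:    (n, F) per-agent features; n can be ANY swarm size
    adjacency: (n, n) 0/1 neighbour matrix (no self-loops)
    returns:   (n, A) Q-values, one row per agent
    """
    deg = adjacency.sum(axis=1, keepdims=True).clip(min=1)
    nbr_mean = (adjacency @ states) / deg          # permutation-invariant mean
    h = np.tanh(states @ W_self + nbr_mean @ W_nbr)
    return h @ W_out

# The same fixed-size weights serve swarms of 15 and 45 agents.
for n in (15, 45):
    s = rng.normal(size=(n, F))
    adj = (rng.random((n, n)) < 0.2).astype(float)
    np.fill_diagonal(adj, 0.0)
    q = q_values(s, adj)
    actions = q.argmax(axis=1)                     # greedy decentralized policy
    print(n, q.shape, actions.shape)
```

Because each agent only consumes its own features plus a neighbourhood mean, training on 15 agents and deploying on 45 requires no architectural change, which is exactly the zero-shot setting the paper studies.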
(This article belongs to the Section AI in Autonomous Systems)
Show Figures

Figure 1

46 pages, 4421 KB  
Systematic Review
Artificial Neural Network, Attention Mechanism and Fuzzy Logic-Based Approaches for Medical Diagnostic Support: A Systematic Review
by Noel Zacarias-Morales, Pablo Pancardo, José Adán Hernández-Nolasco and Matias Garcia-Constantino
AI 2025, 6(11), 281; https://doi.org/10.3390/ai6110281 - 1 Nov 2025
Viewed by 1990
Abstract
Accurate medical diagnosis is essential for informed decision making and the delivery of effective treatment. Traditionally, this process relies on clinical judgment, integrating data and medical expertise to inform decision making. In recent years, artificial neural networks (ANNs) have proven to be valuable tools for diagnostic support. Attention mechanisms have enhanced ANN performance, while fuzzy logic has contributed to managing uncertainty inherent in clinical data. This systematic review analyzes how the integration of these three approaches enhances computational models for medical diagnostic support. Following PRISMA 2020 guidelines, a comprehensive search was conducted across five scientific databases (IEEE Xplore, ScienceDirect, Web of Science, SpringerLink, and ACM Digital Library) for studies published between 2020 and 2025 that implemented the combined use of ANNs, attention mechanisms, and fuzzy logic for medical diagnostic support. Inclusion and exclusion criteria were applied, along with a quality assessment. Data extraction and synthesis were conducted independently by two reviewers and verified by a third. Out of 269 initially identified articles, 32 met the inclusion criteria. The findings consistently indicate that the integration of ANNs, attention mechanisms, and fuzzy logic significantly improves the performance of diagnostic models. ANNs effectively capture complex data patterns, attention mechanisms prioritize the most relevant features, and fuzzy logic provides robust handling of ambiguity and imprecise information through continuous degrees of membership. This integration leads to more accurate and interpretable diagnostic models. Future research should focus on leveraging multimodal data, enhancing model generalization, reducing computational complexity, and exploring novel fuzzy logic techniques and training paradigms to improve adaptability in real-world clinical settings. Full article
(This article belongs to the Section Medical & Healthcare AI)
Show Figures

Figure 1

38 pages, 1164 KB  
Article
From Initialization to Convergence: A Three-Stage Technique for Robust RBF Network Training
by Ioannis G. Tsoulos, Vasileios Charilogis and Dimitrios Tsalikakis
AI 2025, 6(11), 280; https://doi.org/10.3390/ai6110280 - 1 Nov 2025
Viewed by 851
Abstract
A parametric machine learning tool with many applications is the radial basis function (RBF) network, which has been incorporated into various classification and regression problems. A key component of these networks is their radial functions. These networks acquire adaptive capabilities through a technique that consists of two stages. The centers and variances are computed in the first stage, and in the second stage, which involves solving a linear system of equations, the external weights for the radial functions are adjusted. Nevertheless, in numerous instances, this training approach has led to decreased performance, either because of instability in arithmetic computations or due to the method’s difficulty in escaping local minima of the error function. In this manuscript, a three-stage method is suggested to address the above problems. In the first phase, an initial estimation of the value ranges for the machine learning model parameters is performed. During the second phase, the network parameters are fine-tuned within the intervals determined in the first phase. Finally, in the third phase of the proposed method, a local optimization technique is applied to achieve the final adjustment of the network parameters. The proposed method was evaluated on several machine learning problems from the related literature and compared with the original RBF training approach. This method has been successfully applied to a wide range of related problems reported in recent studies. Also, a comparison was made in terms of classification and regression error. It should be noted that although the proposed methodology had very good results in the above measurements, it requires significant computational execution time due to the use of three phases of processing and adaptation of the network parameters. Full article
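For reference, the classical two-stage pipeline that the three-stage method builds on can be sketched as follows: stage one places the centers (here via a few plain k-means iterations) and fixes a shared width, and stage two solves a linear least-squares system for the output weights. The toy data, number of units, and width heuristic are illustrative choices, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy regression data: y = sin(x) on [-3, 3].
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0])

k = 10  # number of radial units (illustrative)

# Stage 1: place centers with a few k-means iterations, then set a shared
# width from the maximum inter-center distance (a common heuristic).
centers = X[rng.choice(len(X), k, replace=False)]
for _ in range(20):
    d = np.linalg.norm(X[:, None, :] - centers[None], axis=2)  # (n, k)
    assign = d.argmin(axis=1)
    for j in range(k):
        pts = X[assign == j]
        if len(pts):
            centers[j] = pts.mean(axis=0)
sigma = np.linalg.norm(centers[:, None] - centers[None], axis=2).max() / np.sqrt(2 * k)

# Stage 2: with centers and sigma fixed, the output weights are the solution
# of a linear least-squares system on the Gaussian design matrix.
Phi = np.exp(-np.linalg.norm(X[:, None, :] - centers[None], axis=2) ** 2
             / (2 * sigma ** 2))                      # (n, k)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)

pred = Phi @ w
print("train RMSE:", np.sqrt(np.mean((pred - y) ** 2)))
```

The instability the abstract mentions typically enters at the `lstsq` step, where poorly placed centers make `Phi` ill-conditioned; the proposed three-stage method instead bounds and fine-tunes all parameters before a final local optimization.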
Show Figures

Figure 1

24 pages, 1614 KB  
Article
Severity-Aware Drift Adaptation for Cost-Efficient Model Maintenance
by Khrystyna Shakhovska and Petro Pukach
AI 2025, 6(11), 279; https://doi.org/10.3390/ai6110279 - 23 Oct 2025
Viewed by 1612
Abstract
Objectives: This paper introduces an adaptive learning framework for handling concept drift in data by dynamically adjusting model updates based on the severity of detected drift. Methods: The proposed method combines multiple statistical measures to quantify distributional changes between recent and historical data windows. The resulting severity score drives a three-tier adaptation policy: minor drift is ignored, moderate drift triggers incremental model updates, and severe drift initiates full model retraining. Results: This approach balances stability and adaptability, reducing unnecessary computation while preserving model accuracy. The framework is applicable to both single-model and ensemble-based systems, offering a flexible and efficient solution for real-time drift management. Also, different transformation methods were reviewed, and quantile transformation was tested. By applying a quantile transformation, the Kolmogorov–Smirnov (KS) statistic decreased from 0.0559 to 0.0072, demonstrating effective drift adaptation. Full article
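The detect-score-act loop described above can be sketched as follows. The hand-rolled KS statistic, severity thresholds, and quantile-matching step are illustrative stand-ins, not the paper's exact measures or implementation:

```python
import numpy as np

rng = np.random.default_rng(3)

def ks_stat(a, b):
    """Two-sample Kolmogorov–Smirnov statistic: maximum ECDF gap."""
    allv = np.sort(np.concatenate([a, b]))
    cdf_a = np.searchsorted(np.sort(a), allv, side="right") / len(a)
    cdf_b = np.searchsorted(np.sort(b), allv, side="right") / len(b)
    return np.abs(cdf_a - cdf_b).max()

def adapt_policy(severity, minor=0.05, severe=0.3):
    # Three-tier policy from the abstract: ignore / incremental update / retrain.
    # Thresholds here are invented for illustration.
    if severity < minor:
        return "ignore"
    if severity < severe:
        return "incremental_update"
    return "full_retrain"

reference = rng.normal(0.0, 1.0, 5000)     # historical window
recent = rng.normal(1.5, 2.0, 5000)        # drifted recent window

before = ks_stat(reference, recent)

# Quantile transformation: map recent values onto the reference distribution
# by matching empirical quantiles, shrinking the distributional gap.
q = (np.argsort(np.argsort(recent)) + 0.5) / len(recent)   # ranks -> quantiles
recent_qt = np.quantile(reference, q)
after = ks_stat(reference, recent_qt)

print(f"KS before={before:.3f} after={after:.3f} -> {adapt_policy(before)}")
```

As in the abstract, the severity score alone decides whether any computation is spent: minor drift costs nothing, and full retraining is reserved for large distributional shifts.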
Show Figures

Figure 1

23 pages, 3532 KB  
Review
Generative Artificial Intelligence in Healthcare: A Bibliometric Analysis and Review of Potential Applications and Challenges
by Vanita Kouomogne Nana and Mark T. Marshall
AI 2025, 6(11), 278; https://doi.org/10.3390/ai6110278 - 23 Oct 2025
Viewed by 3096
Abstract
The remarkable progress of artificial intelligence (AI) in recent years has significantly extended its application possibilities within the healthcare domain. AI has become more accessible to a wider range of healthcare personnel and service users, in particular due to the proliferation of Generative AI (GenAI). This study presents a bibliometric analysis of GenAI in healthcare. By analysing the academic literature in the Scopus database, our study explores the knowledge structure, emerging trends, and challenges of GenAI in healthcare. The results showed that GenAI is increasingly being adopted in developed countries, with major US institutions leading the way, and a large number of papers are being published on the topic in top-level academic venues. Our findings also show that there is a focus on particular areas of healthcare, with medical education and clinical decision-making showing active research, while areas such as emergency medicine remain poorly explored. Our results also show that while there is a focus on the benefits of GenAI for the healthcare industry, its limitations need to be acknowledged and addressed to facilitate its integration in clinical settings. The findings of this study can serve as a foundation for understanding the field, allowing academics, healthcare practitioners, educators, and policymakers to better understand the current focus within GenAI for healthcare, as well as highlighting potential application areas and challenges around accuracy, privacy, and ethics that must be taken into account when developing healthcare-focused GenAI applications. Full article
Show Figures

Figure 1

19 pages, 4001 KB  
Article
ConvNeXt with Context-Weighted Deep Superpixels for High-Spatial-Resolution Aerial Image Semantic Segmentation
by Ziran Ye, Yue Lin, Muye Gan, Xiangfeng Tan, Mengdi Dai and Dedong Kong
AI 2025, 6(11), 277; https://doi.org/10.3390/ai6110277 - 22 Oct 2025
Viewed by 1245
Abstract
Semantic segmentation of high-spatial-resolution (HSR) aerial imagery is critical for applications such as urban planning and environmental monitoring, yet challenges, including scale variation, intra-class diversity, and inter-class confusion, persist. This study proposes a deep learning framework that integrates convolutional neural networks (CNNs) with context-enhanced superpixel generation, using ConvNeXt as the backbone for feature extraction. The framework incorporates two key modules, namely, a deep superpixel module (Spixel) and a global context modeling module (GC-module), which synergistically generate context-weighted superpixel embeddings to enhance scene–object relationship modeling, refining local details while maintaining global semantic consistency. The introduced approach achieves mIoU scores of 84.54%, 90.59%, and 64.46% on diverse HSR aerial imagery benchmark datasets (Vaihingen, Potsdam, and UV6K), respectively. Ablation experiments were conducted to further validate the contributions of the global context modeling and deep superpixel modules, highlighting their synergy in improving segmentation results. This work facilitates precise spatial detail preservation and semantic consistency in HSR aerial imagery interpretation tasks, particularly for small objects and complex land cover classes. Full article
Show Figures

Figure 1

20 pages, 2894 KB  
Article
End-to-End Swallowing Event Localization via Blue-Channel-to-Depth Substitution in RGB-D: GRNConvNeXt-Modified AdaTAD with KAN-Chebyshev Decoder
by Derek Ka-Hei Lai, Zi-An Zhao, Andy Yiu-Chau Tam, Jing Li, Jason Zhi-Shen Zhang, Duo Wai-Chi Wong and James Chung-Wai Cheung
AI 2025, 6(11), 276; https://doi.org/10.3390/ai6110276 - 22 Oct 2025
Viewed by 914
Abstract
Background: Swallowing is a complex biomechanical process, and its impairment (dysphagia) poses major health risks for older adults. Current diagnostic methods such as the videofluoroscopic swallowing study (VFSS) and fiberoptic endoscopic evaluation of swallowing (FEES) are effective but invasive, resource-intensive, and unsuitable for continuous monitoring. This study proposes a novel end-to-end RGB–D framework for automated swallowing event localization in continuous video streams. Methods: The framework enhances the AdaTAD backbone through three key innovations: (i) finding the optimal strategy to integrate depth information to capture subtle neck movements, (ii) examining the best adapter design for efficient temporal feature adaptation, and (iii) introducing a Kolmogorov–Arnold Network (KAN) decoder that leverages Chebyshev polynomials for non-linear temporal modeling. Evaluation on a proprietary swallowing dataset comprising 641 clips and 3153 annotated events demonstrated the effectiveness of the proposed framework. We analysed and compared modification strategies across adapter designs, decoders, input channel combinations, regression methods, and patch embedding techniques. Results: The optimized configuration (VideoMAE + GRNConvNeXtAdapter + KAN + RGD + boundary regression + sinusoidal embedding) achieved an average mAP of 83.25%, significantly surpassing the baseline I3D + RGB + MLP model (61.55%). Ablation studies further confirmed that each architectural component contributed incrementally to the overall improvement. Conclusions: These results establish the feasibility of accurate, non-invasive, and automated swallowing event localization using depth-augmented video. The proposed framework paves the way for practical dysphagia screening and long-term monitoring in clinical and home-care environments. Full article
Show Figures

Figure 1
