Journal Description
AI is an international, peer-reviewed, open access journal on artificial intelligence (AI), including broad aspects of cognition and reasoning, perception and planning, machine learning, intelligent robotics, and applications of AI, published monthly online by MDPI.
- Open Access: free for readers, with article processing charges (APCs) paid by authors or their institutions.
- High Visibility: indexed within ESCI (Web of Science), Scopus, EBSCO, and other databases.
- Journal Rank: JCR - Q1 (Computer Science, Interdisciplinary Applications) / CiteScore - Q2 (Artificial Intelligence)
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 20.7 days after submission; the time from acceptance to publication is 3.9 days (median values for papers published in this journal in the first half of 2025).
- Recognition of Reviewers: APC discount vouchers, optional signed peer review, and reviewer names published annually in the journal.
Impact Factor: 5.0 (2024); 5-Year Impact Factor: 4.6 (2024)
Latest Articles
Destination (Un)Known: Auditing Bias and Fairness in LLM-Based Travel Recommendations
AI 2025, 6(9), 236; https://doi.org/10.3390/ai6090236 - 19 Sep 2025
Abstract
Large language-model chatbots such as ChatGPT and DeepSeek are quickly gaining traction as an easy, first-stop tool for trip planning because they offer instant, conversational advice that once required sifting through multiple websites or guidebooks. Yet little is known about the biases that shape the destination suggestions these systems provide. This study conducts a controlled, persona-based audit of the two models, generating 6480 recommendations for 216 traveller profiles that vary by origin country, age, gender identity and trip theme. Six observable bias families (popularity, geographic, cultural, stereotype, demographic and reinforcement) are quantified using tourism rankings, Hofstede scores, a 150-term cliché lexicon and information-theoretic distance measures. Findings reveal measurable bias in every bias category. DeepSeek is more likely than ChatGPT to suggest off-list cities and recommends domestic travel more often, while both models still favour mainstream destinations. DeepSeek also points users toward culturally more distant destinations on all six Hofstede dimensions and employs a denser, superlative-heavy cliché register; ChatGPT shows wider lexical variety but remains strongly promotional. Demographic analysis uncovers moderate gender gaps and extreme divergence for non-binary personas, tempered by a “protective” tendency to guide non-binary travellers toward countries with higher LGBTQI acceptance. Reinforcement bias is minimal, with over 90 percent of follow-up suggestions being novel in both systems. These results confirm that unconstrained LLMs are not neutral filters but active amplifiers of structural imbalances. 
The paper proposes a public-interest re-ranking layer, hosted by a body such as UN Tourism, that balances exposure fairness, seasonality smoothing, low-carbon routing, cultural congruence, safety safeguards and stereotype penalties, transforming conversational AI from an opaque gatekeeper into a sustainability-oriented travel recommendation tool.
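The abstract quantifies bias with unspecified "information-theoretic distance measures". As a minimal sketch only, one standard choice is the Jensen-Shannon divergence between the destination distributions recommended to two persona groups; the measure chosen here and the city samples are assumptions for illustration, not the paper's actual setup.

```python
from collections import Counter
from math import log2

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so bounded in [0, 1]) between two
    discrete distributions given as dicts mapping outcome -> probability."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}
    def kl(a, b):
        # Kullback-Leibler divergence; terms with zero mass contribute nothing
        return sum(a[k] * log2(a[k] / b[k]) for k in a if a[k] > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def to_dist(recommendations):
    """Turn a list of recommended cities into an empirical distribution."""
    counts = Counter(recommendations)
    total = sum(counts.values())
    return {city: n / total for city, n in counts.items()}

# Hypothetical recommendation samples for two persona groups
group_a = to_dist(["Paris", "Paris", "Rome", "Tokyo"])
group_b = to_dist(["Paris", "Rome", "Rome", "Bangkok"])
print(round(js_divergence(group_a, group_b), 3))  # → 0.311
```

A divergence of 0 would mean the two personas receive identical destination mixes; values approaching 1 indicate almost disjoint recommendations.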
(This article belongs to the Special Issue AI Bias in the Media and Beyond)
Open Access Article
RA-CottNet: A Real-Time High-Precision Deep Learning Model for Cotton Boll and Flower Recognition
by
Rui-Feng Wang, Yi-Ming Qin, Yi-Yi Zhao, Mingrui Xu, Iago Beffart Schardong and Kangning Cui
AI 2025, 6(9), 235; https://doi.org/10.3390/ai6090235 - 18 Sep 2025
Abstract
Cotton is the most important natural fiber crop worldwide, and its automated harvesting is essential for improving production efficiency and economic benefits. However, cotton boll detection faces challenges such as small target size, fine-grained category differences, and complex background interference. This study proposes RA-CottNet, a high-precision object detection model with both directional awareness and attention-guided capabilities, and develops an open-source dataset containing 4966 annotated images. Based on YOLOv11n, RA-CottNet incorporates ODConv and SPDConv to enhance directional and spatial representation, while integrating CoordAttention, an improved GAM, and LSKA to improve feature extraction. Experimental results showed that RA-CottNet achieves scores of 93.683%, 86.040%, 93.496%, 72.857%, and 89.692% on its evaluation metrics, maintaining stable performance under multi-scale and rotation perturbations. The proposed approach demonstrated high accuracy and real-time capability, making it suitable for deployment on agricultural edge devices and providing effective technical support for automated cotton boll harvesting and yield estimation.
(This article belongs to the Special Issue Leveraging Simulation and Deep Learning for Enhanced Health and Safety)
Open Access Article
AE-DD: Autoencoder-Driven Dictionary with Matching Pursuit for Joint ECG Denoising, Compression, and Morphology Decomposition
by
Fars Samann and Thomas Schanze
AI 2025, 6(9), 234; https://doi.org/10.3390/ai6090234 - 17 Sep 2025
Abstract
Background: Electrocardiogram (ECG) signals are crucial for cardiovascular diagnosis, but their analysis faces challenges from noise contamination, compression difficulties due to their non-stationary nature, and the inherent complexity of their morphological components, particularly for low-amplitude P- and T-waves obscured by noise. Methodology: This study proposes a novel, multi-stage framework for ECG signal denoising, compression, and component decomposition. The framework leverages the sparsity of ECG signals to denoise and compress them using an autoencoder-driven dictionary (AE-DD) with matching pursuit. A data-driven dictionary was developed using a regularized autoencoder, and the trained weights, together with matching pursuit, were used to compress the denoised ECG segments. Two weight regularization techniques were explored: L1- and L2-regularization. Results: The proposed framework achieves strong performance in simultaneous ECG denoising, compression, and morphological decomposition. The L1-DAE model delivers superior noise suppression (SNR improvement of up to 18.6 dB) and near-lossless reconstruction. The L1-AE dictionary enables high-fidelity compression (CR = 28:1, PRD = 2.1%), outperforming non-regularized models and traditional dictionaries (DCT/wavelets), while its trained weights naturally decompose into interpretable sub-dictionaries for the P-wave, QRS complex, and T-wave, enabling precise, label-free analysis of ECG components with strong correlation to the original ECG and very low MSE. Conclusions: This study introduces a novel autoencoder-driven framework that simultaneously performs ECG denoising, compression, and morphological decomposition.
By leveraging L1-regularized autoencoders with matching pursuit, the method effectively enhances signal quality while enabling direct decomposition of ECG signals into clinically relevant components without additional processing. This unified approach offers significant potential for improving automated ECG analysis and facilitating efficient long-term cardiac monitoring.
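The matching pursuit step at the heart of the AE-DD framework can be sketched in a few lines. This is a toy illustration only: a random orthonormal dictionary and a synthetic sparse signal stand in for the paper's trained autoencoder weights and real ECG segments.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy matching pursuit: approximate `signal` as a sparse combination
    of columns (atoms) of `dictionary`. Atoms are assumed unit-norm."""
    residual = signal.astype(float).copy()
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_atoms):
        correlations = dictionary.T @ residual          # match each atom
        k = int(np.argmax(np.abs(correlations)))        # best-matching atom
        coeffs[k] += correlations[k]                    # record its weight
        residual = residual - correlations[k] * dictionary[:, k]
    return coeffs, residual

# Toy example: orthonormal atoms, hypothetical signal sparse by construction
rng = np.random.default_rng(0)
D = np.linalg.qr(rng.standard_normal((8, 8)))[0]  # Q factor: orthonormal columns
x = 3.0 * D[:, 2] - 1.5 * D[:, 5]
coeffs, residual = matching_pursuit(x, D, n_atoms=2)
print(np.allclose(residual, 0))  # exact recovery with orthonormal atoms → True
```

Storing only the few nonzero coefficients (plus atom indices) instead of the full signal is what yields the compression ratio; with a learned, non-orthogonal dictionary the recovery is approximate rather than exact.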
Open Access Article
EEViT: Efficient Enhanced Vision Transformer Architectures with Information Propagation and Improved Inductive Bias
by
Rigel Mahmood, Sarosh Patel and Khaled Elleithy
AI 2025, 6(9), 233; https://doi.org/10.3390/ai6090233 - 17 Sep 2025
Abstract
The Transformer architecture has been the foundational cornerstone of the recent AI revolution, serving as the backbone of Large Language Models, which have demonstrated impressive language understanding and reasoning capabilities. When pretrained on large amounts of data, Transformers have also been shown to be highly effective in image classification via the advent of the Vision Transformer. However, they still lag behind Convolutional Neural Networks (CNNs) in vision application performance, since CNNs offer translational invariance whereas Transformers lack inductive bias. Further, the Transformer relies on the attention mechanism, which, despite increasing the receptive field, makes it computationally inefficient due to its quadratic time complexity. In this paper, we enhance the Transformer architecture, focusing on these two shortcomings. We propose two efficient Vision Transformer architectures that significantly reduce computational complexity without sacrificing classification performance. Our first enhanced architecture, EEViT-PAR, combines features from two recently proposed designs, PerceiverAR and CaiT. This enhancement leads to our second architecture, EEViT-IP, which provides implicit windowing capabilities akin to the SWIN Transformer and implicitly improves the inductive bias, while being extremely memory- and compute-efficient. We perform detailed experiments on multiple image datasets to show the effectiveness of our architectures. Our best-performing EEViT outperforms existing SOTA ViT models in execution efficiency and surpasses or provides competitive classification accuracy on different benchmarks.
Open Access Article
Emerging Threat Vectors: How Malicious Actors Exploit LLMs to Undermine Border Security
by
Dimitrios Doumanas, Alexandros Karakikes, Andreas Soularidis, Efstathios Mainas and Konstantinos Kotis
AI 2025, 6(9), 232; https://doi.org/10.3390/ai6090232 - 15 Sep 2025
Abstract
The rapid proliferation of Large Language Models (LLMs) has democratized access to advanced generative capabilities while raising urgent concerns about misuse in sensitive security domains. Border security, in particular, represents a high-risk environment where malicious actors may exploit LLMs for document forgery, synthetic identity creation, logistics planning, or disinformation campaigns. Existing studies often highlight such risks in theory, yet few provide systematic empirical evidence of how state-of-the-art LLMs can be exploited. This paper introduces the Silent Adversary Framework (SAF), a structured pipeline that models the sequential stages by which obfuscated prompts can covertly bypass safeguards. We evaluate ten high-risk scenarios using five leading models—GPT-4o, Claude 3.7 Sonnet, Gemini 2.5 Flash, Grok 3, and Runway Gen-2—and assess outputs through three standardized metrics: Bypass Success Rate (BSR), Output Realism Score (ORS), and Operational Risk Level (ORL). Results reveal that, while all models exhibited some susceptibility, vulnerabilities were heterogeneous. Claude showed greater resistance in chemistry-related prompts, whereas GPT-4o and Gemini generated highly realistic outputs in identity fraud and logistics optimization tasks. Document forgery attempts produced only partially successful templates that lacked critical security features. These findings highlight the uneven distribution of risks across models and domains. By combining a reproducible adversarial framework with empirical testing, this study advances the evidence base on LLM misuse and provides actionable insights for policymakers and border security agencies, underscoring the need for stronger safeguards and oversight in the deployment of generative AI.
Open Access Article
Intelligent Decision-Making Analytics Model Based on MAML and Actor–Critic Algorithms
by
Xintong Zhang, Beibei Zhang, Haoru Li, Helin Wang and Yunqiao Huang
AI 2025, 6(9), 231; https://doi.org/10.3390/ai6090231 - 14 Sep 2025
Abstract
Traditional Reinforcement Learning (RL) struggles in dynamic decision-making due to data dependence, limited generalization, and imbalanced subjective/objective factors. This paper proposes an intelligent model combining the Model-Agnostic Meta-Learning (MAML) framework with the Actor–Critic algorithm to address these limitations. The model integrates the AHP-CRITIC weighting method to quantify strategic weights from both subjective expert experience and objective data, achieving balanced decision rationality. The MAML mechanism enables rapid generalization with minimal samples in dynamic environments via cross-task parameter optimization, drastically reducing retraining costs upon environmental changes. Evaluated on enterprise indicator anomaly decision-making, the model achieves significantly higher task reward values than traditional Actor–Critic, PG, and DQN using only 10–20 samples. It improves time efficiency by up to 97.23%. A proposed Balanced Performance Index confirms superior stability and adaptability. Currently integrated into an enterprise platform, the model provides efficient support for dynamic, complex scenarios. This research offers an innovative solution for intelligent decision-making under data scarcity and subjective-objective conflicts, demonstrating both theoretical value and practical potential.
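The MAML mechanism the model builds on can be illustrated on a deliberately tiny problem. The sketch below uses scalar quadratic tasks with analytic gradients, a far simpler setting than the paper's Actor-Critic integration; the task centers and step sizes are invented for the example.

```python
# Minimal scalar MAML sketch on toy quadratic tasks L_i(t) = (t - c_i)^2.
# Inner loop: one gradient step per task. Outer loop: meta-gradient descent
# on the post-adaptation losses (exact second-order term for a quadratic).
def maml(centers, theta=0.0, alpha=0.1, beta=0.05, steps=200):
    for _ in range(steps):
        meta_grad = 0.0
        for c in centers:
            grad = 2.0 * (theta - c)            # inner-loop task gradient
            theta_i = theta - alpha * grad      # one adaptation step
            # d L_i(theta_i) / d theta = 2 (theta_i - c) * (1 - 2 * alpha)
            meta_grad += 2.0 * (theta_i - c) * (1.0 - 2.0 * alpha)
        theta -= beta * meta_grad / len(centers)
    return theta

centers = [-1.0, 0.5, 2.0]                      # hypothetical task family
theta_star = maml(centers)
# For symmetric quadratics the meta-optimum is the mean of the task centers
print(round(theta_star, 3))  # → 0.5
```

From the meta-learned initialization, a single inner step adapts well to any task in the family, which is the "rapid generalization with minimal samples" property the abstract describes.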
(This article belongs to the Section AI Systems: Theory and Applications)
Open Access Article
Toward Reliable Models for Distinguishing Epileptic High-Frequency Oscillations (HFOs) from Non-HFO Events Using LSTM and Pre-Trained OWL-ViT Vision–Language Framework
by
Sahbi Chaibi and Abdennaceur Kachouri
AI 2025, 6(9), 230; https://doi.org/10.3390/ai6090230 - 14 Sep 2025
Abstract
Background: Over the past two decades, high-frequency oscillations (HFOs) between 80 and 500 Hz have emerged as valuable biomarkers for delineating and tracking epileptogenic brain networks. However, inspecting HFO events in lengthy EEG recordings remains a time-consuming visual process and mainly relies on experienced clinicians. Extensive recent research has emphasized the value of introducing deep learning (DL) and generative AI (GenAI) methods to automatically identify epileptic HFOs in iEEG signals. Owing to the ongoing issue of the noticeable incidence of spurious or false HFOs, a key question remains: which model is better able to distinguish epileptic HFOs from non-HFO events, such as artifacts and background noise? Methods: In this regard, our study addresses two main objectives: (i) proposing a novel HFO classification approach using a prompt engineering framework with OWL-ViT, a state-of-the-art large vision–language model designed for multimodal image understanding guided by optimized natural language prompts; and (ii) comparing a range of existing deep learning and generative models, including our proposed one. Main results: Notably, our quantitative and qualitative analysis demonstrated that the LSTM model achieved the highest classification accuracy of 99.16% among the time-series methods considered, while our proposed method consistently performed best among the different approaches based on time–frequency representation, achieving an accuracy of 99.07%. Conclusions and significance: The present study highlights the effectiveness of LSTM and prompted OWL-ViT models in distinguishing genuine HFOs from spurious non-HFO oscillations with respect to the gold-standard benchmark. These advancements constitute a promising step toward more reliable and efficient diagnostic tools for epilepsy.
(This article belongs to the Section Medical & Healthcare AI)
Open Access Article
GLNet-YOLO: Multimodal Feature Fusion for Pedestrian Detection
by
Yi Zhang, Qing Zhao, Xurui Xie, Yang Shen, Jinhe Ran, Shu Gui, Haiyan Zhang, Xiuhe Li and Zhen Zhang
AI 2025, 6(9), 229; https://doi.org/10.3390/ai6090229 - 12 Sep 2025
Abstract
In the field of modern computer vision, pedestrian detection holds significant importance in applications such as intelligent surveillance, autonomous driving, and robot navigation. However, single-modal images struggle to achieve high-precision detection in complex environments. To address this, the present study proposes GLNet-YOLO, a framework based on cross-modal deep feature fusion that aims to improve pedestrian detection in complex environments by fusing feature information from visible-light and infrared images. Extending the YOLOv11 architecture, the framework adopts a dual-branch network structure to process the visible-light and infrared modal inputs, respectively, and introduces the FM module to realize global feature fusion and enhancement, as well as the DMR module to accomplish local feature separation and interaction. Experimental results show that on the LLVIP dataset, compared to the single-modal YOLOv11 baseline, the fused model improves mAP@50 by 9.2% over the visible-light-only model and 0.7% over the infrared-only model. This significantly improves detection accuracy under low-light and complex background conditions and enhances the robustness of the algorithm; its effectiveness is further verified on the KAIST dataset.
(This article belongs to the Special Issue Deep Learning Technologies and Their Applications in Image Processing, Computer Vision, and Computational Intelligence)
Open Access Article
Beyond DOM: Unlocking Web Page Structure from Source Code with Neural Networks
by
Irfan Prazina, Damir Pozderac and Vensada Okanović
AI 2025, 6(9), 228; https://doi.org/10.3390/ai6090228 - 12 Sep 2025
Abstract
We introduce a code-only approach for modeling web page layouts directly from their source code (HTML and CSS only), bypassing rendering. Our method employs a neural architecture with specialized encoders for style rules, CSS selectors, and HTML attributes. These encodings are then aggregated in another neural network that integrates hierarchical context (sibling and ancestor information) to form rich representational vectors for each element of a web page. Using these vectors, our model predicts eight spatial relationships between pairs of elements, focusing on edge-based proximity in a multilabel classification setup. For scalable training, labels are automatically derived from the Document Object Model (DOM) data for each web page, but the model operates independently of the DOM during inference: it does not use bounding boxes or any information found in the DOM, relying solely on the source code as input. This approach facilitates structure-aware visual analysis in a lightweight and fully code-based way. Our model demonstrates alignment with human judgment in the evaluation of web page similarity, suggesting that code-only layout modeling offers a promising direction for scalable, interpretable, and efficient web interface analysis; evaluation metrics show that it yields comparable performance despite relying on less information.
Open Access Review
Artificial Intelligence in Medical Education: A Narrative Review on Implementation, Evaluation, and Methodological Challenges
by
Annalisa Roveta, Luigi Mario Castello, Costanza Massarino, Alessia Francese, Francesca Ugo and Antonio Maconi
AI 2025, 6(9), 227; https://doi.org/10.3390/ai6090227 - 11 Sep 2025
Abstract
Artificial Intelligence (AI) is rapidly transforming medical education by enabling adaptive tutoring, interactive simulation, diagnostic enhancement, and competency-based assessment. This narrative review explores how AI has influenced learning processes in undergraduate and postgraduate medical training, focusing on methodological rigor, educational impact, and implementation challenges. The literature reveals promising results: large language models can generate didactic content and foster academic writing; AI-driven simulations enhance decision-making, procedural skills, and interprofessional communication; and deep learning systems improve diagnostic accuracy in visually intensive tasks such as radiology and histology. Despite promising findings, the existing literature is methodologically heterogeneous. A minority of studies use controlled designs, while the majority focus on short-term effects or are confined to small, simulated cohorts. Critical limitations include algorithmic opacity, generalizability concerns, ethical risks (e.g., GDPR compliance, data bias), and infrastructural barriers, especially in low-resource contexts. Additionally, the unregulated use of AI may undermine critical thinking, foster cognitive outsourcing, and compromise pedagogical depth if not properly supervised. In conclusion, AI holds substantial potential to enhance medical education, but its integration requires methodological robustness, human oversight, and ethical safeguards. Future research should prioritize multicenter validation, longitudinal evaluation, and AI literacy for learners and educators to ensure responsible and sustainable adoption.
(This article belongs to the Special Issue Exploring the Use of Artificial Intelligence in Education)
Open Access Systematic Review
Retrieval-Augmented Generation (RAG) in Healthcare: A Comprehensive Review
by
Fnu Neha, Deepshikha Bhati and Deepak Kumar Shukla
AI 2025, 6(9), 226; https://doi.org/10.3390/ai6090226 - 11 Sep 2025
Abstract
Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by integrating external knowledge retrieval to improve factual consistency and reduce hallucinations. Despite growing interest, its use in healthcare remains fragmented. This paper presents a Systematic Literature Review (SLR) following PRISMA guidelines, synthesizing 30 peer-reviewed studies on RAG in clinical domains, focusing on three of its most prevalent and promising applications in diagnostic support, electronic health record (EHR) summarization, and medical question answering. We synthesize the existing architectural variants (naïve, advanced, and modular) and examine their deployment across these applications. Persistent challenges are identified, including retrieval noise (irrelevant or low-quality retrieved information), domain shift (performance degradation when models are applied to data distributions different from their training set), generation latency, and limited explainability. Evaluation strategies are compared using both standard metrics and clinical-specific metrics, FactScore, RadGraph-F1, and MED-F1, which are particularly critical for ensuring factual accuracy, medical validity, and clinical relevance. This synthesis offers a domain-focused perspective to guide researchers, healthcare providers, and policymakers in developing reliable, interpretable, and clinically aligned AI systems, laying the groundwork for future innovation in RAG-based healthcare solutions.
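The naive RAG variant the review surveys can be sketched as a retrieve-then-read pipeline: score documents against the query, then prepend the best match to the generator's prompt. The toy corpus, bag-of-words retrieval, and query below are illustrative placeholders, not a clinical system.

```python
from collections import Counter
from math import sqrt

# Hypothetical document store; real systems would use dense embeddings
docs = {
    "ehr_note": "patient reports chest pain and shortness of breath",
    "guideline": "aspirin is indicated for suspected myocardial infarction",
    "admin": "the billing code for an outpatient visit was updated",
}

def bow(text):
    """Bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def retrieve(query, k=1):
    """Rank documents by similarity to the query; return the top-k ids."""
    q = bow(query)
    ranked = sorted(docs, key=lambda d: cosine(q, bow(docs[d])), reverse=True)
    return ranked[:k]

# The retrieved passage would be prepended to the LLM prompt to ground it
print(retrieve("treatment for myocardial infarction"))  # → ['guideline']
```

Retrieval noise, one of the challenges the review identifies, corresponds here to the ranker surfacing an irrelevant document (e.g. the billing note) ahead of the clinically relevant one.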
Open Access Systematic Review
Advances and Optimization Trends in Photovoltaic Systems: A Systematic Review
by
Luis Angel Iturralde Carrera, Gendry Alfonso-Francia, Carlos D. Constantino-Robles, Juan Terven, Edgar A. Chávez-Urbiola and Juvenal Rodríguez-Reséndiz
AI 2025, 6(9), 225; https://doi.org/10.3390/ai6090225 - 10 Sep 2025
Abstract
This article presents a systematic review of optimization methods applied to enhance the performance of photovoltaic (PV) systems, with a focus on critical challenges such as system design and spatial layout, maximum power point tracking (MPPT), energy forecasting, fault diagnosis, and energy management. The emphasis is on the integration of classical and algorithmic approaches. Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, 314 relevant publications from 2020 to 2025 were analyzed to identify current trends, methodological advances, and practical applications in the optimization of PV performance. The principal novelty of this review lies in its integrative critical analysis, which systematically contrasts the applicability, performance, and limitations of deterministic classical methods with emerging stochastic metaheuristic and data-driven artificial intelligence (AI) techniques, highlighting the growing dominance of hybrid models that synergize their strengths. Traditional techniques such as analytical modeling, numerical simulation, linear and dynamic programming, and gradient-based methods are examined in terms of their efficiency and scope. In parallel, the study evaluates the growing adoption of metaheuristic algorithms, including particle swarm optimization, genetic algorithms, and ant colony optimization, as well as machine learning (ML) and deep learning (DL) models applied to tasks such as MPPT, spatial layout optimization, energy forecasting, and fault diagnosis. A key contribution of this review is the identification of hybrid methodologies that combine metaheuristics with ML/DL models, demonstrating superior results in energy yield, robustness, and adaptability under dynamic conditions. The analysis highlights both the strengths and limitations of each paradigm, emphasizing challenges related to data availability, computational cost, and model interpretability.
Finally, the study proposes future research directions focused on explainable AI, real-time control via edge computing, and the development of standardized benchmarks for performance evaluation. The findings contribute to a deeper understanding of current capabilities and opportunities in PV system optimization, offering a strategic framework for advancing intelligent and sustainable solar energy technologies.
(This article belongs to the Special Issue The Application of Machine Learning and AI Technology Towards the Sustainable Development Goals)
Open Access Article
A Markerless Vision-Based Physical Frailty Assessment System for the Older Adults
by
Muhammad Huzaifa, Wajiha Ali, Khawaja Fahad Iqbal, Ishtiaq Ahmad, Yasar Ayaz, Hira Taimur, Yoshihisa Shirayama and Motoyuki Yuasa
AI 2025, 6(9), 224; https://doi.org/10.3390/ai6090224 - 10 Sep 2025
Abstract
The geriatric syndrome known as frailty is characterized by diminished physiological reserves and heightened susceptibility to unfavorable health consequences. As the world’s population ages, it is crucial to detect frailty early and accurately in order to reduce hazards, including falls, hospitalization, and death. In particular, functional tests are frequently used to evaluate physical frailty. However, current evaluation techniques are limited in their scalability and are prone to inconsistency due to their heavy reliance on subjective interpretation and manual observation. In this paper, we provide a completely automated, impartial, and comprehensive frailty assessment system that employs computer vision techniques for assessing physical frailty tests. Machine learning models have been specifically designed to analyze each clinical test. In order to extract significant features, our system analyzes the depth and joint coordinate data for important physical performance tests such as the Walking Speed Test, Timed Up and Go (TUG) Test, Functional Reach Test, Seated Forward Bend Test, Standing on One Leg Test, and Grip Strength Test. The proposed system offers a comprehensive system with consistent measurements, intelligent decision-making, and real-time feedback, in contrast to current systems, which lack real-time analysis and standardization. Strong model accuracy and conformity to clinical benchmarks are demonstrated by the experimental outcomes. The proposed system can be considered a scalable and useful tool for frailty screening in clinical and distant care settings by eliminating observer dependency and improving accessibility.
Full article
(This article belongs to the Special Issue Multimodal Artificial Intelligence in Healthcare)
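The frailty-assessment abstract above derives measurements such as gait speed from depth and joint-coordinate data. A minimal sketch of one such feature, mean walking speed from tracked hip positions, assuming hypothetical (x, z) coordinates in metres and timestamps in seconds (not the paper's actual implementation):

```python
# Hypothetical sketch: estimating walking speed for a Walking Speed Test
# from tracked joint coordinates. Function name, coordinate convention,
# and the sample data are illustrative assumptions.

def walking_speed(hip_positions, timestamps):
    """Estimate mean gait speed (m/s) from a sequence of hip-center
    (x, z) coordinates in metres and their timestamps in seconds."""
    if len(hip_positions) < 2:
        raise ValueError("need at least two samples")
    # Path length: sum of straight-line distances between consecutive samples.
    dist = 0.0
    for (x0, z0), (x1, z1) in zip(hip_positions, hip_positions[1:]):
        dist += ((x1 - x0) ** 2 + (z1 - z0) ** 2) ** 0.5
    elapsed = timestamps[-1] - timestamps[0]
    return dist / elapsed

# Example: 1 m covered each second for 4 seconds.
positions = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
times = [0.0, 1.0, 2.0, 3.0, 4.0]
print(walking_speed(positions, times))  # 1.0
```

In a real pipeline the coordinates would come from a depth camera's skeleton tracker and would need smoothing before differencing.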
Open Access Article
Transformer Models Enhance Explainable Risk Categorization of Incidents Compared to TF-IDF Baselines
by
Carlos Ramon Hölzing, Patrick Meybohm, Oliver Happel, Peter Kranke and Charlotte Meynhardt
AI 2025, 6(9), 223; https://doi.org/10.3390/ai6090223 - 9 Sep 2025
Abstract
Background: Critical Incident Reporting Systems (CIRS) play a key role in improving patient safety but face limitations due to the unstructured nature of narrative data. Systematic analysis of such data to identify latent risk patterns remains challenging. While artificial intelligence (AI) shows promise in healthcare, its application to CIRS analysis is still underexplored. Methods: This study presents a transformer-based approach to classify incident reports into predefined risk categories and support clinical risk managers in identifying safety hazards. We compared a traditional TF-IDF/logistic regression model with a transformer-based German BERT (GBERT) model using 617 anonymized CIRS reports. Reports were categorized manually into four classes: Organization, Treatment, Documentation, and Consent/Communication. Models were evaluated using stratified 5-fold cross-validation. Interpretability was ensured via Shapley Additive Explanations (SHAP). Results: GBERT outperformed the baseline across all metrics, achieving a macro-averaged F1 of 0.44 and a weighted F1 of 0.75, versus 0.35 and 0.71 for the baseline. SHAP analysis revealed clinically plausible feature attributions. Conclusions: In summary, transformer-based models such as GBERT improve classification of incident report data and enable interpretable, systematic risk stratification. These findings highlight the potential of explainable AI to enhance learning from critical incidents.
Full article
(This article belongs to the Special Issue Adversarial Learning and Its Applications in Healthcare)
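The study above compares a TF-IDF/logistic regression baseline against GBERT. A toy sketch of the baseline's first half — TF-IDF weighting over tokenized reports, with a nearest-centroid classifier standing in for the logistic-regression head; the four miniature "reports" and their labels are invented for illustration:

```python
import math
from collections import Counter

def fit_idf(docs):
    """Inverse document frequency from a list of token lists."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return {t: math.log(n / c) for t, c in df.items()}

def tfidf(doc, idf):
    """Sparse {term: tf-idf weight} vector for one tokenized document."""
    tf = Counter(doc)
    return {t: (c / len(doc)) * idf[t] for t, c in tf.items() if t in idf}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vecs):
    out = Counter()
    for v in vecs:
        for t, w in v.items():
            out[t] += w / len(vecs)
    return out

reports = [
    ("medication dose error on ward", "Treatment"),
    ("missing consent form signature", "Consent/Communication"),
    ("wrong dose given to patient", "Treatment"),
    ("consent not documented before surgery", "Consent/Communication"),
]
docs = [r.split() for r, _ in reports]
idf = fit_idf(docs)
by_label = {}
for d, (_, label) in zip(docs, reports):
    by_label.setdefault(label, []).append(tfidf(d, idf))
cents = {lbl: centroid(vs) for lbl, vs in by_label.items()}

query = tfidf("dose error".split(), idf)
pred = max(cents, key=lambda lbl: cosine(query, cents[lbl]))
print(pred)  # Treatment
```

The real baseline would use a learned linear decision boundary rather than centroids, but the vectorization step is the same in spirit.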
Open Access Article
Dual-Stream Former: A Dual-Branch Transformer Architecture for Visual Speech Recognition
by
Sanghun Jeon, Jieun Lee and Yong-Ju Lee
AI 2025, 6(9), 222; https://doi.org/10.3390/ai6090222 - 9 Sep 2025
Abstract
This study proposes Dual-Stream Former, a novel architecture that integrates a Video Swin Transformer and a Conformer, designed to address the challenges of visual speech recognition (VSR). The model captures spatiotemporal dependencies, achieving a state-of-the-art character error rate (CER) of 3.46%, surpassing traditional convolutional neural network (CNN)-based models, such as 3D-CNN + DenseNet-121 (CER: 5.31%), and transformer-based alternatives, such as vision transformers (CER: 4.05%). The Video Swin Transformer captures multiscale spatial representations with high computational efficiency, whereas the Conformer back-end enhances temporal modeling across diverse phoneme categories. Evaluation on a high-resolution dataset comprising 740,000 utterances across 185 classes highlighted the effectiveness of the model in addressing visually confusable phonemes, such as diphthongs (/ai/, /au/) and labiodental sounds (/f/, /v/). Dual-Stream Former achieved phoneme recognition error rates of 10.39% for diphthongs and 9.25% for labiodental sounds, outperforming CNN-based architectures by more than 6%. Although the model’s large parameter count (168.6 M) poses resource challenges, its hierarchical design ensures scalability. Future work will explore lightweight adaptations and multimodal extensions to increase deployment feasibility. These findings underscore the transformative potential of Dual-Stream Former for advancing VSR applications such as silent communication and assistive technologies by achieving unparalleled precision and robustness in diverse settings.
Full article
Open Access Article
Optimizing NFL Draft Selections with Machine Learning Classification
by
Akshaj Enaganti and George Pappas
AI 2025, 6(9), 221; https://doi.org/10.3390/ai6090221 - 9 Sep 2025
Abstract
The National Football League draft is one of the most important events in building a successful franchise in professional American football. Selecting players during the draft, however, is difficult, as a multitude of factors affect the decision to opt for one player over another; these include collegiate statistics, team need and fit, and physical potential. In this paper, we utilize a machine learning approach, with various types of models, to optimize NFL draft selections and, in turn, enhance team performance. We compare the selections made by the system to the athletes actually selected and assess which picks would have been more impactful for the respective franchise. This investigation enables further research into altering the weighting of specific factors in the decision-making process to identify the ideal player for a given team’s needs. Using artificial intelligence in this process can produce more consistent results than high-risk traditional methods. Our approach extends beyond a basic Random Forest classifier by simulating complete draft scenarios with weighted player attributes and team needs. This allows comparison of different draft strategies (best-player-available vs. need-based) and demonstrates improved prediction accuracy over conventional methods.
Full article
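The abstract above contrasts best-player-available and need-based draft strategies. A minimal sketch of that contrast — not the paper's model — where need-based selection reweights an overall grade by a team-need multiplier; the player names, grades, and need weights are invented:

```python
# Illustrative sketch of two draft strategies. All data is hypothetical.

players = [
    {"name": "QB Prospect A", "pos": "QB", "grade": 92},
    {"name": "EDGE Prospect B", "pos": "EDGE", "grade": 90},
    {"name": "WR Prospect C", "pos": "WR", "grade": 88},
]
# Multiplier > 1 means a bigger positional need for this team.
team_needs = {"EDGE": 1.3, "WR": 1.1, "QB": 0.8}

def best_player_available(pool):
    """Pick the highest overall grade, ignoring team context."""
    return max(pool, key=lambda p: p["grade"])

def need_based(pool, needs):
    """Pick the highest grade after scaling by positional need."""
    return max(pool, key=lambda p: p["grade"] * needs.get(p["pos"], 1.0))

print(best_player_available(players)["name"])   # QB Prospect A
print(need_based(players, team_needs)["name"])  # EDGE Prospect B
```

A full simulation would repeat this pick-by-pick for every team, removing each selected player from the pool and updating the needs table.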
Open Access Article
Self-Emotion-Mediated Exploration in Artificial Intelligence Mirrors Findings from Cognitive Psychology
by
Gustavo Assuncao, Miguel Castelo-Branco and Paulo Menezes
AI 2025, 6(9), 220; https://doi.org/10.3390/ai6090220 - 9 Sep 2025
Abstract
Background: Exploration of the physical environment is an indispensable precursor to information acquisition and knowledge consolidation for living organisms. Yet current artificial intelligence models lack such autonomy during training, hindering their adaptability. This work proposes a learning framework for artificial agents to acquire an intrinsic exploratory drive, based on epistemic and achievement emotions triggered during data observation. Methods: This study proposes a dual-module reinforcement framework, where data analysis scores dictate pride or surprise, in accordance with psychological studies on humans. A correlation between these states and exploration is then optimized for agents to meet their learning goals. Results: Causal relationships between states and exploration are demonstrated by the majority of agents. A mean increase is noted for surprise, with a mean decrease for pride. Resulting correlations of and are obtained, mirroring previously reported human behavior. Conclusions: These findings support the conclusion that bio-inspiration can be of great use in AI development, conferring benefits typically found in living beings, such as autonomy. Further, the work empirically shows how AI methodologies can corroborate human behavioral findings, demonstrating major interdisciplinary importance. Ramifications are discussed.
Full article
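The abstract above links self-emotion states to exploratory behavior. A toy sketch of one half of that idea — surprise (prediction error) driving the exploration rate of a simple bandit-style agent; the update rule, constants, and class name are illustrative assumptions, and the pride channel is omitted:

```python
import random

# Hypothetical sketch: exploration probability tied to a "surprise" state
# that tracks recent prediction error. Not the paper's framework.

random.seed(0)

class CuriousAgent:
    def __init__(self, n_actions):
        self.estimates = [0.0] * n_actions   # running value estimates
        self.surprise = 1.0                  # intrinsic state in [0, 1]

    def act(self):
        # Explore with probability equal to the current surprise level.
        if random.random() < self.surprise:
            return random.randrange(len(self.estimates))
        return max(range(len(self.estimates)), key=self.estimates.__getitem__)

    def observe(self, action, reward):
        error = abs(reward - self.estimates[action])
        self.estimates[action] += 0.3 * (reward - self.estimates[action])
        # Surprise is an exponential moving average of prediction error.
        self.surprise = max(0.0, min(1.0, 0.9 * self.surprise + 0.1 * error))

agent = CuriousAgent(3)
for _ in range(200):
    a = agent.act()
    agent.observe(a, 1.0 if a == 2 else 0.0)  # action 2 is rewarding

# As the environment becomes predictable, surprise (and exploration) decays.
print(agent.surprise < 0.5)  # True
```

The reported framework additionally models pride from achievement scores and optimizes the state–exploration correlation rather than hard-wiring it.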
Open Access Article
MST-DGCN: Multi-Scale Temporal–Dynamic Graph Convolutional with Orthogonal Gate for Imbalanced Multi-Label ECG Arrhythmia Classification
by
Jie Chen, Mingfeng Jiang, Xiaoyu He, Yang Li, Jucheng Zhang, Juan Li, Yongquan Wu and Wei Ke
AI 2025, 6(9), 219; https://doi.org/10.3390/ai6090219 - 8 Sep 2025
Abstract
Multi-label arrhythmia classification from 12-lead ECG signals is a challenging problem, involving spatiotemporal feature extraction, feature fusion, and class imbalance. To address these issues, a multi-scale temporal–dynamic graph convolutional method with orthogonal gates, termed MST-DGCN, is proposed for ECG arrhythmia classification. In this method, temporal–dynamic graph convolution with dynamic adjacency matrices is used to learn spatiotemporal patterns jointly, and an orthogonal gated fusion mechanism is used to eliminate redundancy, strengthening the complementarity and independence of features by dynamically adjusting their significance. Moreover, a multi-instance learning strategy is proposed to alleviate class imbalance by adjusting the proportion of minority arrhythmia samples through adaptive label allocation. Validated on the St Petersburg INCART dataset under stringent inter-patient settings, the experimental results show that the proposed MST-DGCN method achieves the best classification performance with an F1-score of 73.66% (+6.2% over prior baseline methods), with concurrent improvements in AUC (70.92%) and mAP (85.24%), while maintaining computational efficiency.
Full article
(This article belongs to the Special Issue Artificial Intelligence in Biomedical Engineering: Challenges and Developments)
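The abstract above mentions dynamic adjacency matrices learned from the data rather than a fixed lead-wiring graph. As a rough illustration of that general idea (not the paper's construction), an adjacency matrix can be derived from per-lead feature similarity; the cosine-plus-softmax recipe and toy three-lead features below are assumptions:

```python
import math

# Hypothetical sketch: a "dynamic" adjacency built from feature similarity.

def dynamic_adjacency(features):
    """Softmax-normalized cosine-similarity matrix over per-lead features."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def norm(a):
        return math.sqrt(dot(a, a)) or 1.0

    n = len(features)
    sim = [[dot(features[i], features[j]) / (norm(features[i]) * norm(features[j]))
            for j in range(n)] for i in range(n)]
    # Softmax each row so it can act as a row-stochastic graph weight matrix.
    adj = []
    for row in sim:
        exps = [math.exp(v) for v in row]
        total = sum(exps)
        adj.append([e / total for e in exps])
    return adj

leads = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # toy features for 3 "leads"
A = dynamic_adjacency(leads)
print(all(abs(sum(row) - 1.0) < 1e-9 for row in A))  # rows sum to 1
```

In a learned model the similarity would be computed from trainable projections of intermediate features and recomputed per input, which is what makes the graph "dynamic".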
Open Access Article
Conv-ScaleNet: A Multiscale Convolutional Model for Federated Human Activity Recognition
by
Xian Wu Ting, Ying Han Pang, Zheng You Lim, Shih Yin Ooi and Fu San Hiew
AI 2025, 6(9), 218; https://doi.org/10.3390/ai6090218 - 8 Sep 2025
Abstract
Background: Artificial Intelligence (AI) techniques have been extensively deployed in sensor-based Human Activity Recognition (HAR) systems. Recent advances in deep learning, especially Convolutional Neural Networks (CNNs), have advanced HAR by enabling automatic feature extraction from raw sensor data. However, these models often struggle to capture multiscale patterns in human activity, limiting recognition accuracy. Additionally, traditional centralized learning approaches raise data privacy concerns, as personal sensor data must be transmitted to a central server, increasing the risk of privacy breaches. Methods: To address these challenges, this paper introduces Conv-ScaleNet, a CNN-based model designed for multiscale feature learning and compatibility with federated learning (FL) environments. Conv-ScaleNet integrates a Pyramid Pooling Module to extract both fine-grained and coarse-grained features and employs sequential Global Average Pooling layers to progressively capture abstract global representations from inertial sensor data. The model supports federated learning by training locally on user devices, sharing only model updates rather than raw data, thus preserving user privacy. Results: Experimental results demonstrate that the proposed Conv-ScaleNet achieves approximately 98% and 96% F1-scores on the WISDM and UCI-HAR datasets, respectively, confirming its competitiveness in FL environments for activity recognition. Conclusions: The proposed Conv-ScaleNet model addresses key limitations of existing HAR systems by combining multiscale feature learning with privacy-preserving training. Its strong performance, data protection capability, and adaptability to decentralized environments make it a robust and scalable solution for real-world HAR applications.
Full article
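The Conv-ScaleNet abstract above combines fine- and coarse-grained features via a Pyramid Pooling Module. A minimal sketch of pyramid pooling over a 1-D sensor window — the bin sizes, window values, and function names are illustrative assumptions, not the paper's architecture:

```python
# Hypothetical sketch: multiscale average pooling over a 1-D signal window.

def avg_pool_bins(signal, n_bins):
    """Average-pool a 1-D signal into n_bins equal segments."""
    size = len(signal) / n_bins
    out = []
    for b in range(n_bins):
        seg = signal[int(b * size): int((b + 1) * size)]
        out.append(sum(seg) / len(seg))
    return out

def pyramid_features(signal, levels=(1, 2, 4)):
    """Concatenate coarse-to-fine pooled views of the same window."""
    feats = []
    for n in levels:
        feats.extend(avg_pool_bins(signal, n))
    return feats

window = [0, 0, 0, 0, 4, 4, 4, 4]  # toy accelerometer window
print(pyramid_features(window))  # [2.0, 0.0, 4.0, 0.0, 0.0, 4.0, 4.0]
```

The concatenated vector carries both the global average (level 1) and progressively finer local averages, which is the multiscale property the abstract describes; in the actual model this is applied to learned CNN feature maps rather than raw samples.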
Open Access Article
Unplugged Activities for Teaching Decision Trees to Secondary Students—A Case Study Analysis Using the SOLO Taxonomy
by
Konstantinos Karapanos, Vassilis Komis, Georgios Fesakis, Konstantinos Lavidas, Stavroula Prantsoudi and Stamatios Papadakis
AI 2025, 6(9), 217; https://doi.org/10.3390/ai6090217 - 5 Sep 2025
Abstract
The integration of Artificial Intelligence (AI) technologies in students’ lives necessitates the systematic incorporation of foundational AI literacy into educational curricula. Students are challenged to develop conceptual understanding of computational frameworks such as Machine Learning (ML) algorithms and Decision Trees (DTs). In this context, unplugged (i.e., computer-free) pedagogical approaches have emerged as complementary to traditional coding-based instruction in AI education. This study examines the pedagogical effectiveness of an instructional intervention employing unplugged activities to facilitate conceptual understanding of DT algorithms among 47 9th-grade students within a Computer Science (CS) curriculum in Greece. The study employed a quasi-experimental design, utilizing the Structure of Observed Learning Outcomes (SOLO) taxonomy as the theoretical framework for assessing cognitive development and conceptual mastery of DT principles. Quantitative analysis of pre- and post-intervention assessments demonstrated statistically significant improvements in student performance across all evaluated SOLO taxonomy levels. The findings provide empirical support for the hypothesis that unplugged pedagogical interventions constitute an effective and efficient approach for introducing AI concepts to secondary education students. Based on these outcomes, the authors recommend the systematic implementation of developmentally appropriate unplugged instructional interventions for DTs and broader AI concepts across all educational levels, to optimize AI literacy acquisition.
Full article
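The study above teaches decision trees through unplugged (computer-free) activities. The kind of tree such an activity builds can be sketched as a few yes/no questions leading to a decision; the weather/play example below is illustrative, not taken from the study:

```python
# Hypothetical sketch: an "unplugged-style" decision tree as nested dicts.
# Each internal node asks a yes/no question; each leaf is a decision.

tree = {
    "question": "Is it raining?",
    "yes": {
        "question": "Is it windy?",
        "yes": "stay inside",
        "no": "take an umbrella",
    },
    "no": "play outside",
}

def classify(node, answers):
    """Walk the tree using a dict of {question: 'yes'/'no'} answers."""
    while isinstance(node, dict):
        node = node[answers[node["question"]]]
    return node

print(classify(tree, {"Is it raining?": "yes", "Is it windy?": "no"}))
# take an umbrella
```

In the classroom the same structure is drawn on paper and walked by hand, which is precisely what makes the concept accessible without a computer.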
Highly Accessed Articles
Latest Books
E-Mail Alert
News
3 September 2025
Join Us at the MDPI at the University of Toronto Career Fair, 23 September 2025, Toronto, ON, Canada

1 September 2025
MDPI INSIGHTS: The CEO’s Letter #26 – CUJS, Head of Ethics, Open Peer Review, AIS 2025, Reviewer Recognition
Topics
Topic in
AI, Data, Economies, Mathematics, Risks
Advanced Techniques and Modeling in Business and Economics
Topic Editors: José Manuel Santos-Jaén, Ana León-Gomez, María del Carmen Valls Martínez
Deadline: 30 September 2025
Topic in
AI, Energies, Entropy, Sustainability
Game Theory and Artificial Intelligence Methods in Sustainable and Renewable Energy Power Systems
Topic Editors: Lefeng Cheng, Pei Zhang, Anbo Meng
Deadline: 31 October 2025
Topic in
AI, Algorithms, Diagnostics, Emergency Care and Medicine
Trends of Artificial Intelligence in Emergency and Critical Care Medicine
Topic Editors: Zhongheng Zhang, Yucai Hong, Wei Shao
Deadline: 30 November 2025
Topic in
AI, Drones, Electronics, Mathematics, Sensors
AI and Data-Driven Advancements in Industry 4.0, 2nd Edition
Topic Editors: Teng Huang, Yan Pang, Qiong Wang, Jianjun Li, Jin Liu, Jia Wang
Deadline: 15 December 2025

Conferences
Special Issues
Special Issue in
AI
AI and the Evolution of Work: Redefining Project Management across Disciplines
Guest Editor: Jose Berengueres
Deadline: 30 September 2025
Special Issue in
AI
Artificial Intelligence for Network Management
Guest Editors: Stephen Ojo, Agbotiname Lucky Imoize, Lateef Adesola Akinyemi
Deadline: 30 September 2025
Special Issue in
AI
Development and Design of Autonomous Robot
Guest Editors: Tayab Din Memon, Kamran Shaukat, Sufyan Ali Memon
Deadline: 24 October 2025
Special Issue in
AI
Adversarial Learning and Its Applications in Healthcare
Guest Editors: Min Xian, Aleksandar Vakanski
Deadline: 27 October 2025