Mach. Learn. Knowl. Extr., Volume 7, Issue 4 (December 2025) – 13 articles

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the tables of contents of newly released issues.
  • Papers are published in both HTML and PDF forms; PDF is the official format. To view a paper in PDF format, click on the "PDF Full-text" link and open it with the free Adobe Reader.
26 pages, 3453 KB  
Article
Hybrid Deep Learning Approaches for Accurate Electricity Price Forecasting: A Day-Ahead US Energy Market Analysis with Renewable Energy
by Md. Saifur Rahman and Hassan Reza
Mach. Learn. Knowl. Extr. 2025, 7(4), 120; https://doi.org/10.3390/make7040120 - 15 Oct 2025
Abstract
Forecasting day-ahead electricity prices is a crucial research area. Both wholesale and retail sectors highly value improved forecast accuracy. Renewable energy sources have grown more influential and effective in the US power market. However, current forecasting models have shortcomings, including inadequate consideration of renewable energy impacts and insufficient feature selection. Many studies lack reproducibility, clear presentation of input features, and proper integration of renewable resources. This study addresses these gaps by incorporating a comprehensive set of input features engineered to capture complex market dynamics. The model’s unique aspect is its inclusion of renewable-related inputs, such as temperature data for solar energy effects and wind speed for wind energy impacts on US electricity prices. The research also employs data preprocessing techniques like windowing, cleaning, normalization, and feature engineering to enhance input data quality and relevance. We developed four advanced hybrid deep learning models to improve electricity price prediction accuracy and reliability. Our approach combines variational mode decomposition (VMD) with four deep learning (DL) architectures: dense neural networks (DNNs), convolutional neural networks (CNNs), long short-term memory (LSTM) networks, and bidirectional LSTM (BiLSTM) networks. This integration aims to capture complex patterns and time-dependent relationships in electricity price data. Among these, the VMD-BiLSTM model consistently outperformed the others across all window implementations. Using 24 input features, this model achieved a remarkably low mean absolute error of 0.2733 when forecasting prices in the MISO market. Our research advances electricity price forecasting, particularly for the US energy market. These hybrid deep neural network models provide valuable tools and insights for market participants, energy traders, and policymakers. Full article
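
As an illustration of the hybrid pipeline summarized above, the sketch below builds a windowed BiLSTM forecaster in Keras. The window length, layer sizes, and the assumption that the VMD step has already been applied (e.g., with a package such as vmdpy) are illustrative choices, not the authors' implementation; random arrays stand in for the preprocessed market features.

```python
# Minimal sketch, assuming the VMD decomposition and feature engineering have
# already produced a windowed feature tensor for the day-ahead market.
import numpy as np
from tensorflow.keras import layers, models

WINDOW = 24      # assumed hours of history per sample
N_FEATURES = 24  # 24 input features, as reported in the abstract

def build_bilstm(window=WINDOW, n_features=N_FEATURES):
    model = models.Sequential([
        layers.Input(shape=(window, n_features)),
        layers.Bidirectional(layers.LSTM(64, return_sequences=True)),
        layers.Bidirectional(layers.LSTM(32)),
        layers.Dense(16, activation="relu"),
        layers.Dense(1),                             # next-step day-ahead price
    ])
    model.compile(optimizer="adam", loss="mae")      # MAE matches the reported metric
    return model

# toy usage: random arrays stand in for VMD-decomposed, normalized market data
X = np.random.rand(256, WINDOW, N_FEATURES).astype("float32")
y = np.random.rand(256, 1).astype("float32")
build_bilstm().fit(X, y, epochs=2, batch_size=32, verbose=0)
```
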
15 pages, 2232 KB  
Article
Image-Based Deep Learning for Brain Tumour Transcriptomics: A Benchmark of DeepInsight, Fotomics, and Saliency-Guided CNNs
by Ali Alyatimi, Vera Chung, Muhammad Atif Iqbal and Ali Anaissi
Mach. Learn. Knowl. Extr. 2025, 7(4), 119; https://doi.org/10.3390/make7040119 - 15 Oct 2025
Abstract
Classifying brain tumour transcriptomic data is crucial for precision medicine but remains challenging due to high dimensionality and limited interpretability of conventional models. This study benchmarks three image-based deep learning approaches, DeepInsight, Fotomics, and a novel saliency-guided convolutional neural network (CNN), for transcriptomic classification. DeepInsight utilises dimensionality reduction to spatially arrange gene features, while Fotomics applies Fourier transforms to encode expression patterns into structured images. The proposed method transforms each single-cell gene expression profile into an RGB image using PCA, UMAP, or t-SNE, enabling CNNs such as ResNet to learn spatially organised molecular features. Gradient-based saliency maps are employed to highlight gene regions most influential in model predictions. Evaluation is conducted on two biologically and technologically different datasets: single-cell RNA-seq from glioblastoma GSM3828672 and bulk microarray data from medulloblastoma GSE85217. Outcomes demonstrate that image-based deep learning methods, particularly those incorporating saliency guidance, provide a robust and interpretable framework for uncovering biologically meaningful patterns in complex high-dimensional omics data. For instance, ResNet-18 achieved the highest accuracies of 97.25% on the GSE85217 dataset and 91.02% on GSM3828672, outperforming other baseline models across multiple metrics. Full article
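
A minimal sketch of the expression-to-image idea, assuming a PCA embedding of genes and a single-channel raster (the paper builds RGB images and also uses UMAP or t-SNE); random values stand in for real transcriptomic profiles.

```python
# Minimal sketch: rasterise one sample's gene-expression profile into an image
# by placing genes at 2-D PCA coordinates (UMAP or t-SNE could be swapped in).
import numpy as np
from sklearn.decomposition import PCA

def expression_to_image(X, sample_idx=0, size=64):
    """X: (n_samples, n_genes) expression matrix; returns a size x size image."""
    # embed genes (columns) in 2-D using their expression pattern across samples
    coords = PCA(n_components=2).fit_transform(X.T)
    # normalise coordinates to pixel indices
    mins, maxs = coords.min(0), coords.max(0)
    pix = ((coords - mins) / (maxs - mins + 1e-9) * (size - 1)).astype(int)
    img = np.zeros((size, size), dtype=np.float32)
    # accumulate the chosen sample's expression values at each gene's pixel
    np.add.at(img, (pix[:, 0], pix[:, 1]), X[sample_idx])
    return img / (img.max() + 1e-9)

# toy usage: 100 cells x 500 genes of random expression data
X = np.abs(np.random.randn(100, 500))
image = expression_to_image(X, sample_idx=0)
print(image.shape)  # (64, 64); a CNN such as ResNet would then consume the images
```
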
16 pages, 10961 KB  
Article
Exploratory Proof-of-Concept: Predicting the Outcome of Tennis Serves Using Motion Capture and Deep Learning
by Gustav Durlind, Uriel Martinez-Hernandez and Tareq Assaf
Mach. Learn. Knowl. Extr. 2025, 7(4), 118; https://doi.org/10.3390/make7040118 - 14 Oct 2025
Abstract
Tennis serves heavily impact match outcomes, yet analysis by coaches is limited by human vision. The design of an automated tennis serve analysis system could facilitate enhanced performance analysis. As serve location and serve success are directly correlated, predicting the outcome of a serve could provide vital information for performance analysis. This article proposes a tennis serve analysis system powered by Machine Learning, which classifies the outcome of serves as “in”, “out” or “net”, and predicts the coordinate outcome of successful serves. Additionally, this work details the collection of three-dimensional spatio-temporal data on tennis serves, using marker-based optoelectronic motion capture. The classification uses a Stacked Bidirectional Long Short-Term Memory architecture, whilst a 3D Convolutional Neural Network architecture is harnessed for serve coordinate prediction. The proposed method achieves 89% accuracy for tennis serve classification, outperforming the current state-of-the-art whilst performing finer-grained classification. The coordinate prediction achieves an accuracy of 63%, with a mean absolute error of 0.59 and a root mean squared error of 0.68, exceeding the current state-of-the-art with a new method. The system contributes towards the long-term goal of designing a non-invasive tennis serve analysis system that functions in training and match conditions. Full article
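
For the coordinate-prediction branch, a minimal 3D CNN regression sketch under assumed input dimensions and layer sizes is shown below; it is not the authors' architecture, and random volumes stand in for the motion-capture data.

```python
# Minimal sketch: a small 3D CNN regressing the (x, y) landing coordinates of
# a successful serve from a voxelised motion-capture clip (toy dimensions).
import numpy as np
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(16, 32, 32, 1)),        # assumed time x height x width volume
    layers.Conv3D(16, 3, activation="relu"),
    layers.MaxPooling3D(2),
    layers.Conv3D(32, 3, activation="relu"),
    layers.GlobalAveragePooling3D(),
    layers.Dense(2),                            # predicted court coordinates
])
model.compile(optimizer="adam", loss="mae")     # MAE is one of the reported metrics

X = np.random.rand(8, 16, 32, 32, 1).astype("float32")
y = np.random.rand(8, 2).astype("float32")
model.fit(X, y, epochs=1, verbose=0)
```
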
28 pages, 3456 KB  
Article
Learning to Partition: Dynamic Deep Neural Network Model Partitioning for Edge-Assisted Low-Latency Video Analytics
by Yan Lyu, Likai Liu, Xuezhi Wang, Zhiyu Fan, Jinchen Wang and Guanyu Gao
Mach. Learn. Knowl. Extr. 2025, 7(4), 117; https://doi.org/10.3390/make7040117 - 13 Oct 2025
Abstract
In edge-assisted low-latency video analytics, a critical challenge is balancing on-device inference latency against the high bandwidth costs and network delays of offloading. Ineffectively managing this trade-off degrades performance and hinders critical applications like autonomous systems. Existing solutions often rely on static partitioning or greedy algorithms that optimize for a single frame. These myopic approaches adapt poorly to dynamic network and workload conditions, leading to high long-term costs and significant frame drops. This paper introduces a novel partitioning technique driven by a Deep Reinforcement Learning (DRL) agent on a local device that learns to dynamically partition a video analytics Deep Neural Network (DNN). The agent learns a farsighted policy to dynamically select the optimal DNN split point for each frame by observing the holistic system state. By optimizing for a cumulative long-term reward, our method significantly outperforms competitor methods, demonstrably reducing overall system cost and latency while nearly eliminating frame drops in our real-world testbed evaluation. The primary limitation is the initial offline training phase required by the DRL agent. Future work will focus on extending this dynamic partitioning framework to multi-device and multi-edge environments. Full article
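
The per-frame decision the agent faces can be framed as choosing a split index that trades on-device latency against transfer delay and edge latency. The sketch below enumerates that cost for one frame under assumed per-layer latencies and output sizes; the paper's contribution is to replace this myopic, greedy choice with a learned, farsighted DRL policy over the same decision space.

```python
# Minimal sketch of the per-frame split decision space (latencies, sizes, and
# bandwidth are assumed values); the DRL agent in the paper learns a long-term
# policy over this space rather than greedily optimising each frame.
def best_split(local_ms, edge_ms, out_kb, input_kb, bandwidth_kB_s):
    """local_ms/edge_ms: per-layer latencies; out_kb: per-layer output sizes (KB)."""
    n = len(local_ms)
    best = (None, float("inf"))
    for k in range(n + 1):                       # first k layers run on-device
        if k == n:
            transfer_ms = 0.0                    # fully local: nothing is sent
        else:
            sent_kb = input_kb if k == 0 else out_kb[k - 1]
            transfer_ms = sent_kb / bandwidth_kB_s * 1000
        cost = sum(local_ms[:k]) + transfer_ms + sum(edge_ms[k:])
        if cost < best[1]:
            best = (k, cost)
    return best                                   # (split index, estimated ms)

# toy usage: a 5-layer model, slow device, fast edge server
local = [30, 25, 40, 35, 20]                      # ms per layer on-device
edge = [3, 2, 4, 3, 2]                            # ms per layer on the edge
sizes = [400, 200, 100, 50, 1]                    # KB of each layer's output
print(best_split(local, edge, sizes, input_kb=600, bandwidth_kB_s=2000))
```
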
22 pages, 3708 KB  
Article
Faithful Narratives from Complex Conceptual Models: Should Modelers or Large Language Models Simplify Causal Maps?
by Tyler J. Gandee and Philippe J. Giabbanelli
Mach. Learn. Knowl. Extr. 2025, 7(4), 116; https://doi.org/10.3390/make7040116 - 7 Oct 2025
Abstract
(1) Background: Comprehensive conceptual models can result in complex artifacts, consisting of many concepts that interact through multiple mechanisms. This complexity can be acceptable and even expected when generating rich models, for instance to support ensuing analyses that find central concepts or decompose models into parts that can be managed by different actors. However, complexity can become a barrier when the conceptual model is used directly by individuals. A ‘transparent’ model can support learning among stakeholders (e.g., in group model building) and it can motivate the adoption of specific interventions (i.e., using a model as evidence base). Although advances in graph-to-text generation with Large Language Models (LLMs) have made it possible to transform conceptual models into textual reports consisting of coherent and faithful paragraphs, turning a large conceptual model into a very lengthy report would only displace the challenge. (2) Methods: We experimentally examine the implications of two possible approaches: asking the text generator to simplify the model, either via abstractive (LLMs) or extractive summarization, or simplifying the model through graph algorithms and then generating the complete text. (3) Results: We find that the two approaches have similar scores on text-based evaluation metrics including readability and overlap scores (ROUGE, BLEU, Meteor), but faithfulness can be lower when the text generator decides on what is an interesting fact and is tasked with creating a story. These automated metrics capture textual properties, but they do not assess actual user comprehension, which would require an experimental study with human readers. (4) Conclusions: Our results suggest that graph algorithms may be preferable to support modelers in scientific translations from models to text while minimizing hallucinations. Full article
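
As a sketch of the graph-algorithm route, the snippet below reduces a causal map to its most central concepts before any text is generated; the choice of betweenness centrality and the number of retained concepts are assumptions, not the paper's specific simplification algorithm.

```python
# Minimal sketch: simplify a causal map by keeping only the most central
# concepts, then hand the reduced graph to the text generator.
import networkx as nx

def simplify_causal_map(G, keep=10):
    # rank concepts by betweenness centrality (an assumed choice of measure)
    centrality = nx.betweenness_centrality(G)
    top = sorted(centrality, key=centrality.get, reverse=True)[:keep]
    return G.subgraph(top).copy()

# toy usage: a random directed graph stands in for a real causal map
G = nx.gnp_random_graph(40, 0.1, directed=True, seed=1)
print(simplify_causal_map(G, keep=10))
```
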
38 pages, 3764 KB  
Review
AI-Enabled IoT Intrusion Detection: Unified Conceptual Framework and Research Roadmap
by Antonio Villafranca, Kyaw Min Thant, Igor Tasic and Maria-Dolores Cano
Mach. Learn. Knowl. Extr. 2025, 7(4), 115; https://doi.org/10.3390/make7040115 - 6 Oct 2025
Abstract
The Internet of Things (IoT) revolutionizes connectivity, enabling innovative applications across healthcare, industry, and smart cities but also introducing significant cybersecurity challenges due to its expanded attack surface. Intrusion Detection Systems (IDSs) play a pivotal role in addressing these challenges, offering tailored solutions to detect and mitigate threats in dynamic and resource-constrained IoT environments. Through a rigorous analysis, this study classifies IDS research based on methodologies, performance metrics, and application domains, providing a comprehensive synthesis of the field. Key findings reveal a paradigm shift towards integrating artificial intelligence (AI) and hybrid approaches, surpassing the limitations of traditional, static methods. These advancements highlight the potential for IDSs to enhance scalability, adaptability, and detection accuracy. However, unresolved challenges, such as resource efficiency and real-world applicability, underline the need for further research. By contextualizing these findings within the broader landscape of IoT security, this work emphasizes the critical importance of developing IDS solutions that ensure the reliability, privacy, and security of interconnected systems, contributing to the sustainable evolution of IoT ecosystems. Full article
37 pages, 3463 KB  
Article
Enhancing Cancer Classification from RNA Sequencing Data Using Deep Learning and Explainable AI
by Haseeb Younis and Rosane Minghim
Mach. Learn. Knowl. Extr. 2025, 7(4), 114; https://doi.org/10.3390/make7040114 - 1 Oct 2025
Abstract
Cancer is one of the deadliest diseases, costing millions of lives and billions of USD every year. There are different ways to identify the biomarkers that can be used to detect cancer types and subtypes. RNA sequencing is steadily taking the lead as the method of choice due to its ability to access global gene expression in biological samples and facilitate more flexible methods and robust analyses. Numerous studies have employed artificial intelligence (AI) and specifically machine learning techniques to detect cancer in its early stages. However, most of the models provided are very specific to particular cancer types and do not generalize. This paper proposes a deep learning and explainable AI (XAI) combined approach to classifying cancer subtypes and a deep learning-based approach for the classification of cancer types using BARRA:CuRDa, an RNA-seq database with 17 datasets for seven cancer types. One architecture is designed to classify cancer subtypes with around 100% accuracy, precision, recall, F1 score, and G-Mean. This architecture outperforms the previous methodologies for all individual datasets. The second architecture is designed to classify multiple cancer types; it classifies eight types with around 87% validation accuracy, precision, recall, F1 score, and G-Mean. Within the same process, we employ XAI, which identifies 99 genes out of 58,735 input genes that could be potential biomarkers for different cancer types. We also perform Pathway Enrichment Analysis and Visual Analysis to establish the significance and robustness of our methodology. The proposed methodology can classify cancer types and subtypes with robust results and can be extended to other cancer types. Full article
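
A minimal sketch of how an explainability step can shortlist candidate biomarker genes: a toy dense classifier is trained, then genes are ranked by mean absolute input gradient. The toy sizes, the architecture, and gradient saliency as the XAI method are assumptions; the paper's pipeline, XAI technique, and gene counts differ.

```python
# Minimal sketch, assuming gradient saliency as the explainability step and a
# toy dense classifier; random data stands in for RNA-seq expression profiles.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

n_genes, n_types = 500, 8                 # toy sizes; the paper uses 58,735 genes / 8 types
model = models.Sequential([
    layers.Input(shape=(n_genes,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(n_types, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

X = np.abs(np.random.randn(64, n_genes)).astype("float32")
y = np.random.randint(0, n_types, size=64)
model.fit(X, y, epochs=1, verbose=0)

# saliency: mean |d predicted-class score / d gene| over samples, then take top genes
x = tf.convert_to_tensor(X)
with tf.GradientTape() as tape:
    tape.watch(x)
    probs = model(x)
    top_class = tf.reduce_max(probs, axis=1)
grads = tape.gradient(top_class, x)
scores = tf.reduce_mean(tf.abs(grads), axis=0).numpy()
print("candidate biomarker indices:", np.argsort(scores)[::-1][:10])
```
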
20 pages, 14055 KB  
Article
TL-Efficient-SE: A Transfer Learning-Based Attention-Enhanced Model for Fingerprint Liveness Detection Across Multi-Sensor Spoof Attacks
by Archana Pallakonda, Rayappa David Amar Raj, Rama Muni Reddy Yanamala, Christian Napoli and Cristian Randieri
Mach. Learn. Knowl. Extr. 2025, 7(4), 113; https://doi.org/10.3390/make7040113 - 1 Oct 2025
Abstract
Fingerprint authentication systems encounter growing threats from presentation attacks, making strong liveness detection crucial. This work presents a deep learning-based framework integrating EfficientNetB0 with a Squeeze-and-Excitation (SE) attention approach, using transfer learning to enhance feature extraction. The LivDet 2015 dataset, composed of real and fake fingerprints captured with four optical sensors, with spoofs made from PlayDoh, Ecoflex, and Gelatine, is used to train and test the model architecture. Stratified splitting is performed once the input images have been scaled and normalized to conform to EfficientNetB0’s format. The SE module adaptively enhances relevant features to differentiate live from fake inputs. The classification head comprises fully connected layers, dropout, batch normalization, and a sigmoid output. Empirical results exhibit accuracy between 98.50% and 99.50%, with an AUC varying from 0.978 to 0.9995, providing high precision and recall for genuine users, and robust generalization across unseen spoof types. Compared to existing methods like Slim-ResCNN and HyiPAD, the novelty of our model lies in the Squeeze-and-Excitation mechanism, which enhances feature discrimination by adaptively recalibrating the channels of the feature maps, thereby improving the model’s ability to differentiate between live and spoofed fingerprints. This model has practical implications for deployment in real-time biometric systems, including mobile authentication and secure access control, presenting an efficient solution for protecting against sophisticated spoofing methods. Future research will focus on sensor-invariant learning and adaptive thresholds to further enhance resilience against varying spoofing attacks. Full article
(This article belongs to the Special Issue Advances in Machine and Deep Learning)
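
A minimal sketch of an EfficientNetB0 backbone followed by a squeeze-and-excitation block and a sigmoid live/spoof head, in the spirit of the description above; the reduction ratio, head sizes, and the use of untrained weights (the paper's transfer-learning setup would start from pretrained ImageNet weights) are assumptions.

```python
# Minimal sketch, assuming a standard SE block placed on top of the backbone.
import tensorflow as tf
from tensorflow.keras import layers, models

def se_block(x, ratio=16):
    ch = x.shape[-1]
    s = layers.GlobalAveragePooling2D()(x)             # squeeze to (B, C)
    s = layers.Dense(ch // ratio, activation="relu")(s)
    s = layers.Dense(ch, activation="sigmoid")(s)      # excitation weights
    s = layers.Reshape((1, 1, ch))(s)
    return layers.Multiply()([x, s])                   # recalibrate channels

# weights=None avoids a download here; transfer learning would use "imagenet"
base = tf.keras.applications.EfficientNetB0(include_top=False, weights=None,
                                            input_shape=(224, 224, 3))
x = se_block(base.output)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dropout(0.3)(x)
x = layers.BatchNormalization()(x)
out = layers.Dense(1, activation="sigmoid")(x)         # live vs. spoof
model = models.Model(base.input, out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```
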
20 pages, 646 KB  
Article
Adversarial Attacks Detection Method for Tabular Data
by Łukasz Wawrowski, Piotr Biczyk, Dominik Ślęzak and Marek Sikora
Mach. Learn. Knowl. Extr. 2025, 7(4), 112; https://doi.org/10.3390/make7040112 - 1 Oct 2025
Abstract
Adversarial attacks involve malicious actors introducing intentional perturbations to machine learning (ML) models, causing unintended behavior. This poses a significant threat to the integrity and trustworthiness of ML models, necessitating the development of robust detection techniques to protect systems from potential threats. The paper proposes a new approach for detecting adversarial attacks using a surrogate model and diagnostic attributes. The method was tested on 22 tabular datasets on which four different ML models were trained. Furthermore, various attacks were conducted, which led to obtaining perturbed data. The proposed approach is characterized by high efficiency in detecting known and unknown attacks—balanced accuracy was above 0.94, with very low false negative rates (0.02–0.10) for binary detection. Sensitivity analysis shows that classifiers trained based on diagnostic attributes can detect even very subtle adversarial attacks. Full article
(This article belongs to the Section Learning)
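
A minimal sketch of the surrogate-model idea: a surrogate mimics the target model, simple diagnostic attributes (prediction confidence and surrogate/target disagreement) are computed per sample, and a detector is trained on them. The attribute set, models, and noise-based "attack" are stand-ins, not the paper's method.

```python
# Minimal sketch, assuming toy diagnostic attributes and a noise perturbation
# standing in for real adversarial attacks on tabular data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

def diagnostic_attributes(target, surrogate, X):
    p_t = target.predict_proba(X)
    p_s = surrogate.predict_proba(X)
    conf = p_t.max(axis=1)                         # target confidence
    disagree = np.abs(p_t - p_s).sum(axis=1)       # surrogate/target disagreement
    return np.column_stack([conf, disagree])

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10)); y = (X[:, 0] + X[:, 1] > 0).astype(int)
target = RandomForestClassifier(random_state=0).fit(X, y)
surrogate = LogisticRegression().fit(X, target.predict(X))   # mimics the target

X_adv = X + rng.normal(scale=0.8, size=X.shape)               # crude stand-in "attack"
A = np.vstack([diagnostic_attributes(target, surrogate, X),
               diagnostic_attributes(target, surrogate, X_adv)])
labels = np.array([0] * len(X) + [1] * len(X_adv))            # 1 = attacked sample
detector = LogisticRegression().fit(A, labels)
print("detector training accuracy:", detector.score(A, labels))
```
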
33 pages, 5405 KB  
Article
Transfer Learning for Generalized Safety Risk Detection in Industrial Video Operations
by Luciano Radrigan, Sebastián E. Godoy and Anibal S. Morales
Mach. Learn. Knowl. Extr. 2025, 7(4), 111; https://doi.org/10.3390/make7040111 - 30 Sep 2025
Abstract
This paper proposes a transfer learning-based approach to enhance video-driven safety risk detection in industrial environments, addressing the critical challenge of limited generalization across diverse operational scenarios. Conventional deep learning models trained on specific operational contexts often fail when applied to new environments with different lighting, camera angles, or machinery configurations, exhibiting a significant drop in performance (e.g., F1-score declining below 0.85). To overcome this issue, an incremental feature transfer learning strategy is introduced, enabling efficient adaptation of risk detection models using only small amounts of data from new scenarios. This approach leverages prior knowledge from pre-trained models to reduce the reliance on large labeled datasets, particularly valuable in industrial settings where rare but critical safety risk events are difficult to capture. Additionally, training efficiency is improved compared with a classic approach, supporting deployment on resource-constrained edge devices. The strategy involves incremental retraining using video segments with average durations of approximately 2.5 to 25 min (corresponding to 5–50% of new-scenario data), enabling scalable generalization across multiple forklift-related risk activities. Interpretability is enhanced through SHAP-based analysis, which reveals a redistribution of feature relevance toward critical components, thereby improving model transparency and reducing annotation demands. Experimental results confirm that the transfer learning strategy significantly improves detection accuracy, robustness, and adaptability, making it a practical and scalable solution for safety monitoring in dynamic industrial environments. Full article
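
A minimal sketch of incremental fine-tuning on a small slice of new-scenario data with a frozen pretrained backbone; MobileNetV2, the input size, and the random frames are stand-ins for the paper's video model and data.

```python
# Minimal sketch, assuming a frame-level classifier fine-tuned on a small
# fraction of new-scenario data while the backbone stays frozen.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# weights=None avoids a download; a real transfer setup would load pretrained weights
base = tf.keras.applications.MobileNetV2(include_top=False, weights=None,
                                          input_shape=(96, 96, 3), pooling="avg")
base.trainable = False                                  # keep prior knowledge fixed
model = models.Sequential([base,
                           layers.Dense(64, activation="relu"),
                           layers.Dense(1, activation="sigmoid")])   # risk / no risk
model.compile(optimizer="adam", loss="binary_crossentropy")

# fine-tune on a small slice (the paper uses roughly 5-50% of new-scenario data)
X_new = np.random.rand(40, 96, 96, 3).astype("float32")
y_new = np.random.randint(0, 2, size=40)
model.fit(X_new, y_new, epochs=2, batch_size=8, verbose=0)
```
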
21 pages, 5230 KB  
Article
Attention-Guided Differentiable Channel Pruning for Efficient Deep Networks
by Anouar Chahbouni, Khaoula El Manaa, Yassine Abouch, Imane El Manaa, Badre Bossoufi, Mohammed El Ghzaoui and Rachid El Alami
Mach. Learn. Knowl. Extr. 2025, 7(4), 110; https://doi.org/10.3390/make7040110 - 29 Sep 2025
Abstract
Deploying deep learning (DL) models in real-world environments remains a major challenge, particularly under resource-constrained conditions where achieving both high accuracy and compact architectures is essential. While effective, conventional pruning methods often suffer from high computational overhead, accuracy degradation, or disruption of the end-to-end training process, limiting their practicality for embedded and real-time applications. We present Dynamic Attention-Guided Pruning (DAGP), a dynamic attention-guided soft channel pruning framework that overcomes these limitations by embedding learnable, differentiable pruning masks directly within convolutional neural networks (CNNs). These masks act as implicit attention mechanisms, adaptively suppressing non-informative channels during training. A progressively scheduled L1 regularization, activated after a warm-up phase, enables gradual sparsity while preserving early learning capacity. Unlike prior methods, DAGP is retraining-free, introduces minimal architectural overhead, and supports optional hard pruning for deployment efficiency. Joint optimization of classification and sparsity objectives ensures stable convergence and task-adaptive channel selection. Experiments on CIFAR-10 (VGG16, ResNet56) and PlantVillage (custom CNN) achieve up to 98.82% FLOPs reduction with accuracy gains over baselines. Real-world validation on an enhanced PlantDoc dataset for agricultural monitoring achieves 60 ms inference with only 2.00 MB RAM on a Raspberry Pi 4, confirming efficiency under field conditions. These results illustrate DAGP’s potential to scale beyond agriculture to diverse edge-intelligent systems requiring lightweight, accurate, and deployable models. Full article
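
A minimal sketch of a learnable, differentiable per-channel gate with an L1 sparsity penalty, the general mechanism behind soft channel pruning; the gate parameterization and penalty weight are assumptions, and the paper's warm-up schedule for the L1 term is omitted.

```python
# Minimal sketch, assuming a sigmoid-parameterised per-channel gate with a
# constant L1 penalty (the paper schedules this penalty after a warm-up phase).
import tensorflow as tf
from tensorflow.keras import layers, models

class ChannelGate(layers.Layer):
    """Learnable soft mask over channels with an L1 penalty on the gate values."""
    def __init__(self, l1=1e-4, **kwargs):
        super().__init__(**kwargs)
        self.l1 = l1

    def build(self, input_shape):
        ch = int(input_shape[-1])
        self.gate = self.add_weight(name="gate", shape=(ch,),
                                    initializer="ones", trainable=True)

    def call(self, x):
        g = tf.nn.sigmoid(self.gate)                # soft mask values in (0, 1)
        self.add_loss(self.l1 * tf.reduce_sum(g))   # sparsity pressure on the gates
        return x * g                                # rescale (softly prune) channels

inp = layers.Input(shape=(32, 32, 3))
x = layers.Conv2D(64, 3, padding="same", activation="relu")(inp)
x = ChannelGate(l1=1e-4)(x)                         # prune-able channel block
x = layers.GlobalAveragePooling2D()(x)
out = layers.Dense(10, activation="softmax")(x)
model = models.Model(inp, out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```
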
36 pages, 35564 KB  
Article
Enhancing Soundscape Characterization and Pattern Analysis Using Low-Dimensional Deep Embeddings on a Large-Scale Dataset
by Daniel Alexis Nieto Mora, Leonardo Duque-Muñoz and Juan David Martínez Vargas
Mach. Learn. Knowl. Extr. 2025, 7(4), 109; https://doi.org/10.3390/make7040109 - 24 Sep 2025
Abstract
Soundscape monitoring has become an increasingly important tool for studying ecological processes and supporting habitat conservation. While many recent advances focus on identifying species through supervised learning, there is growing interest in understanding the soundscape as a whole while considering patterns that extend beyond individual vocalizations. This broader view requires unsupervised approaches capable of capturing meaningful structures related to temporal dynamics, frequency content, spatial distribution, and ecological variability. In this study, we present a fully unsupervised framework for analyzing large-scale soundscape data using deep learning. We applied a convolutional autoencoder (Soundscape-Net) to extract acoustic representations from over 60,000 recordings collected across a grid-based sampling design in the Rey Zamuro Reserve in Colombia. These features were initially compared with other audio characterization methods, showing superior performance in multiclass classification, with accuracies of 0.85 for habitat cover identification and 0.89 for time-of-day classification across 13 days. For the unsupervised study, optimized dimensionality reduction methods (Uniform Manifold Approximation and Projection and Pairwise Controlled Manifold Approximation and Projection) were applied to project the learned features, achieving trustworthiness scores above 0.96. Subsequently, clustering was performed using KMeans and Density-Based Spatial Clustering of Applications with Noise (DBSCAN), with evaluations based on metrics such as the silhouette, where scores above 0.45 were obtained, thus supporting the robustness of the discovered latent acoustic structures. To interpret and validate the resulting clusters, we combined multiple strategies: spatial mapping through interpolation, analysis of acoustic index variance to understand the cluster structure, and graph-based connectivity analysis to identify ecological relationships between the recording sites. Our results demonstrate that this approach can uncover both local and broad-scale patterns in the soundscape, providing a flexible and interpretable pathway for unsupervised ecological monitoring. Full article
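
A minimal sketch of the downstream analysis: embeddings are projected to 2-D and clustered, and the silhouette is reported. PCA and KMeans stand in for the UMAP/PaCMAP projections and clustering setup described above, and random vectors stand in for Soundscape-Net features.

```python
# Minimal sketch, assuming precomputed autoencoder embeddings; PCA stands in
# for UMAP/PaCMAP and random vectors for the learned acoustic features.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

embeddings = np.random.rand(1000, 128)              # stand-in for Soundscape-Net features
proj = PCA(n_components=2).fit_transform(embeddings)  # low-dimensional projection
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(proj)
print("silhouette:", silhouette_score(proj, labels))  # cluster-quality check
```
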
23 pages, 2613 KB  
Article
Learning to Balance Mixed Adversarial Attacks for Robust Reinforcement Learning
by Mustafa Erdem and Nazım Kemal Üre
Mach. Learn. Knowl. Extr. 2025, 7(4), 108; https://doi.org/10.3390/make7040108 - 24 Sep 2025
Abstract
Reinforcement learning agents are highly susceptible to adversarial attacks that can severely compromise their performance. Although adversarial training is a common countermeasure, most existing research focuses on defending against single-type attacks targeting either observations or actions. This narrow focus overlooks the complexity of real-world mixed attacks, where an agent’s perceptions and resulting actions are perturbed simultaneously. To systematically study these threats, we introduce the Action and State-Adversarial Markov Decision Process (ASA-MDP), which models the interaction as a zero-sum game between the agent and an adversary attacking both states and actions. Using this framework, we show that agents trained conventionally or against single-type attacks remain highly vulnerable to mixed perturbations. Moreover, we identify a key challenge in this setting: a naive mixed-type adversary often fails to effectively balance its perturbations across modalities during training, limiting the agent’s robustness. To address this, we propose the Action and State-Adversarial Proximal Policy Optimization (ASA-PPO) algorithm, which enables the adversary to learn a balanced strategy, distributing its attack budget across both state and action spaces. This, in turn, enhances the robustness of the trained agent against a wide range of adversarial scenarios. Comprehensive experiments across diverse environments demonstrate that policies trained with ASA-PPO substantially outperform baselines—including standard PPO and single-type adversarial methods—under action-only, observation-only, and, most notably, mixed-attack conditions. Full article
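
A minimal sketch of a mixed state-and-action perturbation whose budget is split between the two modalities, the kind of adversary the abstract describes; the uniform noise model and the fixed split parameter are assumptions (ASA-PPO learns this balance rather than fixing it).

```python
# Minimal sketch, assuming a uniform-noise adversary with a fixed budget split
# between observation and action perturbations.
import numpy as np

def mixed_attack(obs, action, budget, alpha, rng):
    """alpha in [0, 1] splits the perturbation budget between state and action."""
    obs_adv = obs + rng.uniform(-alpha * budget, alpha * budget, size=obs.shape)
    act_adv = action + rng.uniform(-(1 - alpha) * budget,
                                   (1 - alpha) * budget, size=action.shape)
    return obs_adv, act_adv

# toy usage on a 4-D observation and 2-D continuous action
rng = np.random.default_rng(0)
obs, action = rng.normal(size=4), rng.normal(size=2)
print(mixed_attack(obs, action, budget=0.1, alpha=0.6, rng=rng))
```
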