Search Results (3,101)

Search Parameters:
Keywords = learning agent

27 pages, 1707 KB  
Article
Joint Optimization of Microservice and Database Orchestration in Edge Clouds via Multi-Stage Proximal Policy
by Xingfeng He, Mingwei Luo, Dengmu Liu, Zhenhua Wang, Yingdong Liu, Chen Zhang, Jiandong Wang, Jiaxiang Xu and Tianping Deng
Symmetry 2026, 18(1), 136; https://doi.org/10.3390/sym18010136 - 9 Jan 2026
Abstract
Microservices, as an emerging architectural approach, have been widely applied in the development of online applications. However, in large-scale service systems, frequent data communications, complex invocation dependencies, and strict latency requirements pose significant challenges to efficient microservice orchestration. In addition, microservices need to frequently access the database to achieve data persistence, creating a mutual dependency between the two, and this symmetry further increases the complexity of service orchestration and coordinated deployment. In this context, the strong coupling of service deployment, database layout, and request routing makes effective local optimization difficult. However, existing research often overlooks the impact of databases, fails to achieve joint optimization among databases, microservice deployments, and routing, or lacks fine-grained orchestration strategies for multi-instance models. To address the above limitations, this paper proposes a joint optimization framework based on the Database-as-a-Service (DaaS) paradigm. It performs fine-grained multi-instance queue modeling based on queuing theory to account for delays in data interaction, request queuing, and processing. Furthermore, this paper proposes a proximal policy optimization algorithm based on multi-stage joint decision-making to address the orchestration problem of microservices and database instances. In this algorithm, the action space is symmetrical between microservices and database deployment, enabling the agent to leverage this characteristic and improve representation learning efficiency through shared feature extraction layers. The algorithm incorporates a two-layer agent policy stability control to accelerate convergence and a three-level experience replay mechanism to achieve efficient training on high-dimensional decision spaces.
Experimental results demonstrate that the proposed algorithm effectively reduces service request latency under diverse workloads and network conditions, while maintaining global resource load balancing. Full article
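The multi-stage proximal policy optimization described above builds on PPO's standard clipped surrogate objective. As an illustrative sketch of that clipping step (not the authors' implementation), in numpy:

```python
import numpy as np

def ppo_clip_loss(ratio, advantage, eps=0.2):
    """Negative clipped PPO surrogate: -E[min(r*A, clip(r, 1-eps, 1+eps)*A)]."""
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return -np.minimum(unclipped, clipped).mean()

# A probability ratio far above 1+eps is clipped, capping its update incentive.
loss = ppo_clip_loss(np.array([0.9, 1.0, 1.5]), np.ones(3))
```

Clipping caps how far a single update can move the policy, which is the kind of stability control the abstract's "two-layer agent policy stability control" extends.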
19 pages, 1855 KB  
Article
CLIP-RL: Closed-Loop Video Inpainting with Detection-Guided Reinforcement Learning
by Meng Wang, Jing Ren, Bing Wang and Xueping Tang
Sensors 2026, 26(2), 447; https://doi.org/10.3390/s26020447 - 9 Jan 2026
Abstract
Existing video inpainting methods typically combine optical flow propagation with Transformer architectures, achieving promising inpainting results. However, they lack adaptive inpainting strategy optimization in diverse scenarios, and struggle to capture high-level temporal semantics, causing temporal inconsistencies and quality degradation. To address these challenges, we make one of the first attempts to introduce reinforcement learning into the video inpainting domain, establishing a closed-loop framework named CLIP-RL that enables adaptive strategy optimization. Specifically, video inpainting is reformulated as an agent–environment interaction, where the inpainting module functions as the agent’s execution component, and a pre-trained inpainting detection module provides real-time quality feedback. Guided by a policy network and a composite reward function that incorporates a weighted temporal alignment loss, the agent dynamically selects actions to adjust the inpainting strategy and iteratively refines the inpainting results. Compared to ProPainter, CLIP-RL improves PSNR from 34.43 to 34.67 and SSIM from 0.974 to 0.986 on the YouTube-VOS dataset. Qualitative analysis demonstrates that CLIP-RL excels in detail preservation and artifact suppression, validating its superiority in video inpainting tasks. Full article
(This article belongs to the Section Intelligent Sensors)
19 pages, 2856 KB  
Article
Applying Dual Deep Deterministic Policy Gradient Algorithm for Autonomous Vehicle Decision-Making in IPG-Carmaker Simulator
by Ali Rizehvandi, Shahram Azadi and Arno Eichberger
World Electr. Veh. J. 2026, 17(1), 33; https://doi.org/10.3390/wevj17010033 - 9 Jan 2026
Abstract
Automated driving technologies can significantly increase road safety by decreasing accidents and improving travel efficiency. This research presents a decision-making strategy for automated vehicles that models both lane-changing and double lane-changing maneuvers and is supported by a Deep Reinforcement Learning (DRL) algorithm. To capture realistic driving challenges, a highway driving scenario was designed in the professional multi-body simulation tool IPG Carmaker (version 11), with rainy weather represented by incorporating vehicles with explicitly reduced tire–road friction while the ego vehicle attempts to perform safe and efficient maneuvers on the highway and during merges. The hierarchical control system provides an operational structure for planning and decision-making in highway maneuvers and mediates between higher-level driving decisions and lower-level autonomous motion control. A Duel Deep Deterministic Policy Gradient (Duel-DDPG) agent was designed, built in MATLAB (version 2021), and tested as the DRL approach to decision-making in adverse driving conditions. The study thoroughly explains both the Duel-DDPG and standard Deep Deterministic Policy Gradient (DDPG) algorithms and provides a direct comparative performance analysis. Simulation experiments involving complex traffic and uncertain weather conditions demonstrate the effectiveness of the Duel-DDPG algorithm. Full article
(This article belongs to the Section Automated and Connected Vehicles)
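DDPG-family agents like the one described above conventionally stabilize training with slowly tracking target networks updated by Polyak averaging. A minimal numpy sketch of that soft-update step (an assumption about the algorithm family, not code from the paper):

```python
import numpy as np

def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging for DDPG targets: target <- (1 - tau)*target + tau*online."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]

# Target weights drift slowly toward the online network's weights.
target = soft_update([np.zeros(3)], [np.ones(3)], tau=0.1)
```

The small tau keeps the bootstrapped critic targets nearly stationary between updates, which is what makes deterministic policy gradients trainable in practice.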
17 pages, 459 KB  
Article
Adaptive Credit Card Fraud Detection: Reinforcement Learning Agents vs. Anomaly Detection Techniques
by Houda Ben Mekhlouf, Abdellatif Moussaid and Fadoua Ghanimi
FinTech 2026, 5(1), 9; https://doi.org/10.3390/fintech5010009 - 9 Jan 2026
Abstract
Credit card fraud detection remains a critical challenge for financial institutions, particularly due to extreme class imbalance and the continuously evolving nature of fraudulent behavior. This study investigates two complementary approaches: anomaly detection based on multivariate normal distribution and deep reinforcement learning using a Deep Q-Network. While anomaly detection effectively identifies deviations from normal transaction patterns, its static nature limits adaptability in real-time systems. In contrast, the DQN reinforcement learning model continuously learns from every transaction, autonomously adapting to emerging fraud strategies. Experimental results demonstrate that, although initial performance metrics of the DQN are modest compared to anomaly detection, its capacity for online learning and policy refinement enables long-term improvement and operational scalability. This work highlights reinforcement learning as a highly promising paradigm for dynamic, high-volume fraud detection, capable of evolving with the environment and achieving near-optimal detection rates over time. Full article
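The online-learning behavior this abstract attributes to the DQN rests on the standard one-step temporal-difference target and epsilon-greedy exploration. A minimal sketch of those two pieces (illustrative, not the authors' code):

```python
import numpy as np

def td_target(reward, next_q_values, gamma=0.99, done=False):
    """One-step TD target used to train a DQN: r + gamma * max_a' Q(s', a')."""
    return reward + (0.0 if done else gamma * float(np.max(next_q_values)))

def epsilon_greedy(q_values, epsilon, rng):
    """Explore a random action with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))
```

Because the target is recomputed for every incoming transaction, the Q-network can keep adapting as fraud patterns drift, which is the adaptability argument the abstract makes against static anomaly detectors.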
33 pages, 857 KB  
Review
Deep Reinforcement Learning in the Era of Foundation Models: A Survey
by Ibomoiye Domor Mienye, Ebenezer Esenogho and Cameron Modisane
Computers 2026, 15(1), 40; https://doi.org/10.3390/computers15010040 - 9 Jan 2026
Abstract
Deep reinforcement learning (DRL) and large foundation models (FMs) have reshaped modern artificial intelligence (AI) by enabling systems that learn from interaction while leveraging broad generalization and multimodal reasoning capabilities. This survey examines the growing convergence of these paradigms and reviews how reinforcement learning from human feedback (RLHF), reinforcement learning from AI feedback (RLAIF), world-model pretraining, and preference-based optimization refine foundation model capabilities. We organize existing work into a taxonomy of model-centric, RL-centric, and hybrid DRL–FM integration pathways, and synthesize applications across language and multimodal agents, autonomous control, scientific discovery, and societal and ethical alignment. We also identify technical, behavioral, and governance challenges that hinder scalable and reliable DRL–FM integration, and outline emerging research directions that suggest how reinforcement-driven adaptation may shape the next generation of intelligent systems. This review provides researchers and practitioners with a structured overview of the current state and future trajectory of DRL in the era of foundation models. Full article
22 pages, 5187 KB  
Article
Adaptive Policy Switching for Multi-Agent ASVs in Multi-Objective Aquatic Cleaning Environments
by Dame Seck, Samuel Yanes-Luis, Manuel Perales-Esteve, Sergio Toral Marín and Daniel Gutiérrez-Reina
Sensors 2026, 26(2), 427; https://doi.org/10.3390/s26020427 - 9 Jan 2026
Abstract
Plastic pollution in aquatic environments is a major ecological problem requiring scalable autonomous solutions for cleanup. This study addresses the coordination of multiple Autonomous Surface Vehicles by formulating the problem as a Partially Observable Markov Game and decoupling the mission into two tasks: exploration to maximize coverage and cleaning to collect trash. These tasks share navigation requirements but present conflicting goals, motivating a multi-objective learning approach. The proposed multi-agent deep reinforcement learning framework uses a single Multitask Deep Q-network shared by all agents, with a convolutional backbone and two heads, one dedicated to exploration and the other to cleaning. Parameter sharing and egocentric state design leverage agent homogeneity and enable experience aggregation across tasks. An adaptive mechanism governs task switching, combining task-specific rewards with a weighted aggregation and selecting tasks via a reward-greedy strategy. This enables the construction of Pareto fronts capturing non-dominated solutions. The framework demonstrates improvements over fixed-phase approaches, improving hypervolume and uniformity metrics by 14% and 300%, respectively. It also adapts to diverse initial trash distributions, providing decision-makers with a portfolio of effective and adaptive strategies for autonomous plastic cleanup. Full article
(This article belongs to the Special Issue Advances in Wireless Sensor and Mobile Networks)
12 pages, 1441 KB  
Article
Development of an Exploratory Simulation Tool: Using Predictive Decision Trees to Model Chemical Exposure Risks and Asthma-like Symptoms in Professional Cleaning Staff in Laboratory Environments
by Hayden D. Hedman
Laboratories 2026, 3(1), 2; https://doi.org/10.3390/laboratories3010002 - 9 Jan 2026
Abstract
Exposure to chemical irritants in laboratory and medical environments poses significant health risks to workers, particularly in relation to asthma-like symptoms. Routine cleaning practices, which often involve the use of strong chemical agents to maintain hygienic settings, have been shown to contribute to respiratory issues. Laboratories, where chemicals such as hydrochloric acid and ammonia are frequently used, represent an underexplored context in the study of occupational asthma. While much of the research on chemical exposure has focused on industrial and high-risk occupations or large cohort populations, less attention has been given to the risks in laboratory and medical environments, particularly for professional cleaning staff. Given the growing reliance on cleaning agents to maintain sterile and safe workspaces in scientific research and healthcare facilities, this gap is concerning. This study developed an exploratory simulation tool, using a simulated cohort based on key demographic and exposure patterns from foundational research, to assess the impact of chemical exposure from cleaning products in laboratory environments. Four supervised machine learning models were applied to evaluate the relationship between chemical exposures and asthma-like symptoms: (1) Decision Trees, (2) Random Forest, (3) Gradient Boosting, and (4) XGBoost. High exposures to hydrochloric acid and ammonia were found to be significantly associated with asthma-like symptoms, and workplace type also played a critical role in determining asthma risk. This research provides a data-driven framework for assessing and predicting asthma-like symptoms in professional cleaning workers exposed to cleaning agents and highlights the potential for integrating predictive modeling into occupational health and safety monitoring. 
Future work should explore dose–response relationships and the temporal dynamics of chemical exposure to further refine these models and improve understanding of long-term health risks. Full article
18 pages, 418 KB  
Article
AnonymAI: An Approach with Differential Privacy and Intelligent Agents for the Automated Anonymization of Sensitive Data
by Marcelo Nascimento Oliveira Soares, Leonardo Barbosa Oliveira, Antonio João Gonçalves Azambuja, Jean Phelipe de Oliveira Lima and Anderson Silva Soares
Future Internet 2026, 18(1), 41; https://doi.org/10.3390/fi18010041 - 9 Jan 2026
Abstract
Data governance for responsible AI systems remains challenged by the lack of automated tools that can apply robust privacy-preserving techniques without destroying analytical value. We propose AnonymAI, a novel methodological framework that integrates LLM-based intelligent agents, the mathematical guarantees of differential privacy, and an automated workflow to generate anonymized datasets for analytical applications. This framework produces data tables with formally verifiable privacy protection, dramatically reducing the need for manual classification and the risk of human error. Focusing on the protection of tabular data containing sensitive personal information, AnonymAI is designed as a generalized, replicable pipeline adaptable to different regulations (e.g., General Data Protection Regulation) and use-case scenarios. The novelty lies in combining the contextual classification capabilities of LLMs with the mathematical rigor of differential privacy, enabling an end-to-end pipeline from raw data to a protected, analysis-ready dataset. The efficiency and formal guarantees of this approach offer significant advantages over conventional anonymization methods, which are often manual, inconsistent, and lack the verifiable protections of differential privacy. Validation studies, covering both controlled experiments on four types of synthetic datasets and broader tests on 19 real-world public tables from various domains, confirmed the applicability of the framework, with the agent-based classifier achieving high overall accuracy in identifying confidential columns. The results demonstrate that the protected data maintains high value for statistical analysis and machine learning models, highlighting AnonymAI’s potential to advance responsible data sharing. This work paves the way for trustworthy and scalable data governance in AI through a rigorously engineered automated anonymization pipeline. Full article
(This article belongs to the Special Issue Intelligent Agents and Their Application)
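The formal guarantees the AnonymAI abstract invokes are conventionally obtained with the Laplace mechanism of differential privacy, which calibrates noise to the query's sensitivity and the privacy budget epsilon. A minimal sketch under that assumption (not AnonymAI's actual pipeline):

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Epsilon-DP release of a numeric query: add Laplace(0, sensitivity/epsilon) noise."""
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Releasing a count query (sensitivity 1) under a privacy budget epsilon = 1.0.
rng = np.random.default_rng(0)
noisy_count = laplace_mechanism(42.0, sensitivity=1.0, epsilon=1.0, rng=rng)
```

Smaller epsilon means larger noise scale and stronger privacy, which is the trade-off against analytical value the abstract's validation studies measure.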
28 pages, 2702 KB  
Article
Adaptive and Sustainable Smart Environments Using Predictive Reasoning and Context-Aware Reinforcement Learning
by Abderrahim Lakehal, Boubakeur Annane, Adel Alti, Philippe Roose and Soliman Aljarboa
Future Internet 2026, 18(1), 40; https://doi.org/10.3390/fi18010040 - 8 Jan 2026
Abstract
Smart environments play a key role in improving user comfort, energy efficiency, and sustainability through intelligent automation. Nevertheless, real-world deployments still face major challenges, including network instability, delayed responsiveness, inconsistent AI decisions, and limited adaptability under dynamic conditions. Many existing approaches lack advanced context-awareness, effective multi-agent coordination, and scalable learning, leading to high computational cost and reduced reliability. To address these limitations, this paper proposes MACxRL, a lightweight Multi-Agent Context-Aware Reinforcement Learning framework for autonomous smart-environment control. The system adopts a three-tier architecture consisting of real-time context acquisition, lightweight prediction, and centralized RL-based decision learning. Local agents act quickly at the edge using rule-based reasoning, while a shared CxRL engine refines actions for global coordination, combining fast responsiveness with continuous adaptive learning. Experiments show that MACxRL reduces energy consumption by 45–60%, converges faster, and achieves more stable performance than standard and deep RL baselines. Future work will explore self-adaptive reward tuning and extend deployment to multi-room environments toward practical real-world realization. Full article
26 pages, 1013 KB  
Article
AoI-Aware Data Collection in Heterogeneous UAV-Assisted WSNs: Strong-Agent Coordinated Coverage and Vicsek-Driven Weak-Swarm Control
by Lin Huang, Lanhua Li, Songhan Zhao, Daiming Qu and Jing Xu
Sensors 2026, 26(2), 419; https://doi.org/10.3390/s26020419 - 8 Jan 2026
Abstract
Unmanned aerial vehicle (UAV) swarms offer an efficient solution for data collection from widely distributed ground users (GUs). However, incomplete environment information and frequent changes make it challenging for standard centralized planning or pure reinforcement learning approaches to simultaneously maintain global solution quality and local flexibility. We propose a hierarchical data collection framework for heterogeneous UAV-assisted wireless sensor networks (WSNs). A small set of high-capability UAVs (H-UAVs), equipped with substantial computational and communication resources, coordinate regional coverage, trajectory planning, and uplink transmission control for numerous resource-constrained low-capability UAVs (L-UAVs) across power-Voronoi-partitioned areas using multi-agent deep reinforcement learning (MADRL). Specifically, we employ Multi-Agent Deep Deterministic Policy Gradient (MADDPG) to enhance H-UAVs’ decision-making capabilities and enable coordinated actions. The partitions are dynamically updated based on GUs’ data generation rates and L-UAV density to balance workload and adapt to environmental dynamics. Concurrently, a large number of L-UAVs with limited onboard resources perform self-organized data collection from GUs and execute opportunistic relaying to a remote access point (RAP) via H-UAVs. Within each Voronoi cell, L-UAV motion follows a weighted Vicsek model that incorporates GUs’ age of information (AoI), link quality, and congestion avoidance. This spatial decomposition combined with decentralized weak-swarm control enables scalability to large-scale L-UAV deployments. Experiments demonstrate that the proposed strong and weak agent MADDPG (SW-MADDPG) scheme reduces AoI by 30% and 21% compared to No-Voronoi and Heuristic-HUAV baselines, respectively. Full article
(This article belongs to the Section Communications)
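The weighted Vicsek model governing L-UAV motion aligns each vehicle's heading with a weighted average of its neighbors' heading vectors. A minimal numpy sketch of that alignment step, with generic weights standing in for the AoI, link-quality, and congestion terms (those weightings are assumptions here, not the paper's formulas):

```python
import numpy as np

def vicsek_heading(neighbor_angles, weights):
    """Weighted Vicsek alignment: heading of the weighted mean of unit vectors."""
    a = np.asarray(neighbor_angles, dtype=float)
    w = np.asarray(weights, dtype=float)
    return float(np.arctan2(np.sum(w * np.sin(a)), np.sum(w * np.cos(a))))

# Two equally weighted neighbors at 0 and 90 degrees pull the agent to 45 degrees.
heading = vicsek_heading([0.0, np.pi / 2], [1.0, 1.0])
```

Averaging unit vectors rather than raw angles avoids wrap-around artifacts at the ±180° boundary, which is why Vicsek-style models update headings this way.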
26 pages, 2527 KB  
Article
Coordinated Scheduling of BESS–ASHP Systems in Zero-Energy Houses Using Multi-Agent Reinforcement Learning
by Jing Li, Yang Xu, Yunqin Lu and Weijun Gao
Buildings 2026, 16(2), 274; https://doi.org/10.3390/buildings16020274 - 8 Jan 2026
Abstract
This paper addresses the critical challenge of multi-objective optimization in residential Home Energy Management Systems (HEMS) by proposing a novel framework based on an Improved Multi-Agent Proximal Policy Optimization (MAPPO) algorithm. The study specifically targets the low convergence efficiency of Multi-Agent Deep Reinforcement Learning (MADRL) for coupled Battery Energy Storage System (BESS) and Air Source Heat Pump (ASHP) operation. The framework synergistically integrates an action constraint projection mechanism with an economic-performance-driven dynamic learning rate modulation strategy, thereby significantly enhancing learning stability. Simulation results demonstrate that the algorithm improves training convergence speed by 35–45% compared to standard MAPPO. Economically, it delivers a cumulative cost reduction of 15.77% against rule-based baselines, outperforming both Independent Proximal Policy Optimization (IPPO) and standard MAPPO benchmarks. Furthermore, the method maximizes renewable energy utilization, achieving nearly 100% photovoltaic self-consumption under favorable conditions while ensuring robustness in extreme scenarios. Temporal analysis reveals the agents’ capacity for anticipatory decision-making, effectively learning correlations among generation, pricing, and demand to achieve seamless seasonal adaptability. These findings validate the superior performance of the proposed centralized training architecture, providing a robust solution for complex residential energy management. Full article
43 pages, 10782 KB  
Article
Nested Learning in Higher Education: Integrating Generative AI, Neuroimaging, and Multimodal Deep Learning for a Sustainable and Innovative Ecosystem
by Rubén Juárez, Antonio Hernández-Fernández, Claudia Barros Camargo and David Molero
Sustainability 2026, 18(2), 656; https://doi.org/10.3390/su18020656 - 8 Jan 2026
Abstract
Industry 5.0 challenges higher education to adopt human-centred and sustainable uses of artificial intelligence, yet many current deployments still treat generative AI as a stand-alone tool, neurophysiological sensing as largely laboratory-bound, and governance as an external add-on rather than a design constraint. This article introduces Nested Learning as a neuro-adaptive ecosystem design in which generative-AI agents, IoT infrastructures and multimodal deep learning orchestrate instructional support while preserving student agency and a “pedagogy of hope”. We report an exploratory two-phase mixed-methods study as an initial empirical illustration. First, a neuro-experimental calibration with 18 undergraduate students used mobile EEG while they interacted with ChatGPT in problem-solving tasks structured as challenge–support–reflection micro-cycles. Second, a field implementation at a university in Madrid involved 380 participants (300 students and 80 lecturers), embedding the Nested Learning ecosystem into regular courses. Data sources included EEG (P300) signals, interaction logs, self-report measures of engagement, self-regulated learning and cognitive safety (with strong internal consistency; α/ω ≥ 0.82), and open-ended responses capturing emotional experience and ethical concerns. In Phase 1, P300 dynamics aligned with key instructional micro-events, providing feasibility evidence that low-cost neuro-adaptive pipelines can be sensitive to pedagogical flow in ecologically relevant tasks. In Phase 2, participants reported high levels of perceived nested support and cognitive safety, and observed associations between perceived Nested Learning, perceived neuro-adaptive adjustments, engagement and self-regulation were moderate to strong (r = 0.41–0.63, p < 0.001). Qualitative data converged on themes of clarity, adaptive support and non-punitive error culture, alongside recurring concerns about privacy and cognitive sovereignty.
We argue that, under robust ethical, data-protection and sustainability-by-design constraints, Nested Learning can strengthen academic resilience, learner autonomy and human-centred uses of AI in higher education. Full article
23 pages, 3238 KB  
Article
Agricultural Injury Severity Prediction Using Integrated Data-Driven Analysis: Global Versus Local Explainability Using SHAP
by Omer Mermer, Yanan Liu, Charles A. Jennissen, Milan Sonka and Ibrahim Demir
Safety 2026, 12(1), 6; https://doi.org/10.3390/safety12010006 - 8 Jan 2026
Abstract
Despite the agricultural sector’s consistently high injury rates, formal reporting is often limited, leading to sparse national datasets that hinder effective safety interventions. To address this, our study introduces a comprehensive framework leveraging advanced ensemble machine learning (ML) models to predict and interpret the severity of agricultural injuries. We use a unique, manually curated dataset of over 2400 agricultural incidents from AgInjuryNews, a public repository of news reports detailing incidents across the United States. We evaluated six ensemble models, including Gradient Boosting (GB), eXtreme Gradient Boosting (XGB), Light Gradient Boosting Machine (LightGBM), Adaptive Boosting (AdaBoost), Histogram-based Gradient Boosting Regression Trees (HistGBRT), and Random Forest (RF), for their accuracy in classifying injury outcomes as fatal or non-fatal. A key contribution of our work is the novel integration of explainable artificial intelligence (XAI), specifically SHapley Additive exPlanations (SHAP), to overcome the “black-box” nature of complex ensemble models. The models demonstrated strong predictive performance, with most achieving an accuracy of approximately 0.71 and an F1-score of 0.81. Through global SHAP analysis, we identified key factors influencing injury severity across the dataset, such as helmet use, victim age, and the type of injury agent. Additionally, our application of local SHAP analysis revealed how specific variables like location and the victim’s role can have varying impacts depending on the context of the incident. These findings provide actionable, context-aware insights for developing targeted policy and safety interventions for a range of stakeholders, from first responders to policymakers, offering a powerful tool for a more proactive approach to agricultural safety. Full article
(This article belongs to the Special Issue Farm Safety, 2nd Edition)
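SHAP attributes a prediction to features via Shapley values: each feature's average marginal contribution across all feature orderings. A toy exact computation of that definition (illustrative only; the `helmet`/`age` value function is hypothetical, and practical SHAP relies on efficient approximations rather than this factorial enumeration):

```python
from itertools import permutations

def shapley_values(features, value_fn):
    """Exact Shapley values: average marginal contribution over all orderings."""
    phi = {f: 0.0 for f in features}
    orders = list(permutations(features))
    for order in orders:
        coalition = set()
        for f in order:
            before = value_fn(frozenset(coalition))
            coalition.add(f)
            phi[f] += value_fn(frozenset(coalition)) - before
    return {f: total / len(orders) for f, total in phi.items()}

# Hypothetical additive value function over two features.
toy_value = lambda s: (2.0 if "helmet" in s else 0.0) + (1.0 if "age" in s else 0.0)
phi = shapley_values(["helmet", "age"], toy_value)
```

The attributions always sum to the gap between the full model output and the empty baseline (the efficiency property), which is what lets the paper read global and local SHAP values as an exact decomposition of each severity prediction.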
23 pages, 1101 KB  
Article
A Reinforcement Learning-Based Optimization Strategy for Noise Budget Management in Homomorphically Encrypted Deep Network Inference
by Chi Zhang, Fenhua Bai, Jinhua Wan and Yu Chen
Electronics 2026, 15(2), 275; https://doi.org/10.3390/electronics15020275 - 7 Jan 2026
Abstract
Homomorphic encryption provides a powerful cryptographic solution for privacy-preserving deep neural network inference, enabling computation on encrypted data. However, the practical application of homomorphic encryption is fundamentally constrained by the noise budget, a core component of homomorphic encryption schemes. The substantial multiplicative depth of modern deep neural networks rapidly consumes this budget, necessitating frequent, computationally expensive bootstrapping operations to refresh the noise. This bootstrapping process has emerged as the primary performance bottleneck. Current noise management strategies are predominantly static, triggering bootstrapping at pre-defined, fixed intervals. This approach is sub-optimal for deep, complex architectures, leading to excessive computational overhead and potential accuracy degradation due to cumulative precision loss. To address this challenge, we propose a Deep Network-aware Adaptive Noise-budget Management mechanism, which formulates noise budget allocation as a sequential decision problem optimized via reinforcement learning. The proposed mechanism comprises two core components. First, we construct a layer-aware noise consumption prediction model to accurately estimate the heterogeneous computational costs and noise accumulation across different network layers. Second, we design a Deep Q-Network-driven optimization algorithm. This Deep Q-Network agent is trained to derive a globally optimal policy, dynamically determining the optimal timing and network location for executing bootstrapping operations, based on the real-time output of the noise predictor and the current network state. This approach shifts from a static, pre-defined strategy to an adaptive, globally optimized one.
Experimental validation on several typical deep neural network architectures demonstrates that the proposed mechanism significantly outperforms state-of-the-art fixed strategies, markedly reducing redundant bootstrapping overhead while maintaining model performance.
(This article belongs to the Special Issue Security and Privacy in Artificial Intelligence Systems)
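The noise-budget bookkeeping that the mechanism above optimizes can be sketched with a toy per-layer cost model. Everything here is illustrative: the layer costs and budget are made up, and a simple look-ahead threshold stands in for the learned Deep Q-Network policy, while a fixed-interval schedule plays the role of the static baseline.

```python
# Hedged sketch of noise-budget accounting for encrypted inference.
# LAYER_COSTS and BUDGET are hypothetical; the paper learns the bootstrap
# policy with a Deep Q-Network, whereas a threshold heuristic stands in here.

LAYER_COSTS = [2, 2, 2, 2, 10, 2, 2, 2]   # hypothetical noise cost per layer
BUDGET = 14                                # budget restored by a bootstrap

def fixed_interval(costs, budget, every=2):
    """Bootstrap after every `every` layers, ignoring the remaining budget."""
    boots, remaining = 0, budget
    for i, c in enumerate(costs):
        if i and i % every == 0:
            remaining, boots = budget, boots + 1
        remaining -= c
        assert remaining >= 0, "budget exhausted: schedule is infeasible"
    return boots

def adaptive(costs, budget):
    """Bootstrap only when the next layer would overrun the budget."""
    boots, remaining = 0, budget
    for c in costs:
        if c > remaining:
            remaining, boots = budget, boots + 1
        remaining -= c
    return boots

print("fixed-interval bootstraps:", fixed_interval(LAYER_COSTS, BUDGET))
print("adaptive bootstraps:", adaptive(LAYER_COSTS, BUDGET))
```

On this toy trace the adaptive trigger needs fewer bootstraps than the fixed schedule, which is the kind of redundant-bootstrap reduction the abstract reports; the learned policy additionally weighs where in the network a refresh is cheapest.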

21 pages, 988 KB  
Article
Study of Performance from Hierarchical Decision Modeling in IVAs Within a Greedy Context
by Francisco Federico Meza-Barrón, Nelson Rangel-Valdez, María Lucila Morales-Rodríguez, Claudia Guadalupe Gómez-Santillán, Juan Javier González-Barbosa, Guadalupe Castilla-Valdez, Nohra Violeta Gallardo-Rivas and Ana Guadalupe Vélez-Chong
Math. Comput. Appl. 2026, 31(1), 8; https://doi.org/10.3390/mca31010008 - 7 Jan 2026
Abstract
This study examines decision-making in intelligent virtual agents (IVAs) and formalizes the distinction between tactical decisions (individual actions) and strategic decisions (composed of sequences of tactical actions) using a mathematical model based on set theory and the Bellman equation. Although the equation itself is not modified, the analysis reveals that the discount factor (γ) influences the type of decision: low values favor tactical decisions, while high values favor strategic ones. The model was implemented and validated in a proof-of-concept simulated environment, namely the Snake Coin Change Problem (SCCP), using a Deep Q-Network (DQN) architecture, showing significant differences between agents with different decision profiles. These findings suggest that adjusting γ can serve as a useful mechanism to regulate both tactical and strategic decision-making processes in IVAs, thus offering a conceptual basis that could facilitate the design of more intelligent and adaptive agents in domains such as video games, and potentially in robotics and artificial intelligence, as future research directions.
(This article belongs to the Special Issue Numerical and Evolutionary Optimization 2025)
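The effect of the discount factor described in the abstract above can be shown with discounted returns on two illustrative reward sequences. This is a minimal sketch under assumed rewards, not the paper's SCCP environment or DQN agent: a small immediate reward (tactical) versus a larger delayed reward (strategic).

```python
# Hedged sketch: how the discount factor gamma tilts the Bellman-style
# discounted return toward tactical (immediate) or strategic (delayed)
# choices. The reward sequences are illustrative, not the SCCP environment.

def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t, the quantity a Bellman-optimal agent maximizes."""
    return sum(gamma**t * r for t, r in enumerate(rewards))

TACTICAL = [1.0, 0.0, 0.0]    # small reward immediately
STRATEGIC = [0.0, 0.0, 3.0]   # larger reward after a two-step delay

for gamma in (0.3, 0.9):
    v_t = discounted_return(TACTICAL, gamma)
    v_s = discounted_return(STRATEGIC, gamma)
    choice = "tactical" if v_t > v_s else "strategic"
    print(f"gamma={gamma}: tactical={v_t:.2f}, strategic={v_s:.2f} -> {choice}")
```

With γ = 0.3 the delayed reward shrinks to 3·γ² = 0.27 and the tactical option wins; with γ = 0.9 it is worth 2.43 and the strategic option wins, matching the abstract's claim that low γ favors tactical and high γ favors strategic behavior.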
