Deep Reinforcement Learning (DRL) algorithms often exhibit significant performance variability across different training runs, even with identical settings. This paper investigates the hypothesis that a key contributor to this variability is the divergence in the observation spaces explored by individual learning agents. We conducted an empirical study using Proximal Policy Optimization (PPO) agents trained on eight Atari environments. We analyzed the collected agent trajectories by qualitatively visualizing and quantitatively measuring the divergence in their explored observation spaces. Furthermore, we cross-evaluated the learned actor and value networks, measuring the average absolute TD-error, the RMSE of value estimates, and the KL divergence between policies to assess their functional similarity. We also conducted experiments where agents were trained from identical network initializations to isolate the source of this divergence. Our findings reveal a strong correlation: environments with low-performance variance (e.g., Freeway) showed high similarity in explored observation spaces and learned networks across agents. Conversely, environments with high-performance variability (e.g., Boxing, Qbert) demonstrated significant divergence in both explored states and network functionalities. This pattern persisted even when agents started with identical network weights. These results suggest that differences in experiential trajectories, driven by the stochasticity of agent–environment interactions, lead to specialized agent policies and value functions, thereby contributing substantially to the observed inconsistencies in DRL performance.

(This article belongs to the Special Issue Advancements and Applications in Reinforcement Learning)
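
The three cross-evaluation metrics named in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the discount factor, and the choice of reference values for the RMSE are assumptions made here for illustration, since the abstract does not specify them.

```python
import math

def mean_abs_td_error(rewards, values, next_values, dones, gamma=0.99):
    """Average |r + gamma * V(s') - V(s)| over a batch of transitions.

    gamma is an assumed discount factor; terminal transitions do not bootstrap.
    """
    errors = []
    for r, v, nv, done in zip(rewards, values, next_values, dones):
        target = r + (0.0 if done else gamma * nv)
        errors.append(abs(target - v))
    return sum(errors) / len(errors)

def rmse(estimates, references):
    """Root-mean-square error between value estimates and reference values.

    What serves as the reference (e.g. empirical returns) is an assumption.
    """
    squared = [(e - t) ** 2 for e, t in zip(estimates, references)]
    return math.sqrt(sum(squared) / len(squared))

def categorical_kl(p, q, eps=1e-12):
    """KL(p || q) for two discrete action distributions over the same actions."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )
```

In a cross-evaluation setting, one agent's networks would be scored on states collected by another agent; larger values of these metrics would indicate lower functional similarity.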

23 pages, 13739 KiB

Open Access Article

Traffic Accident Rescue Action Recognition Method Based on Real-Time UAV Video

by Bo Yang, Jianan Lu, Tao Liu, Bixing Zhang, Chen Geng, Yan Tian and Siyu Zhang

Drones 2025, 9(8), 519; https://doi.org/10.3390/drones9080519 - 24 Jul 2025

Abstract

Low-altitude drones, which are unimpeded by traffic congestion or urban terrain, have become a critical asset in emergency rescue missions. To address the current lack of emergency rescue data, UAV aerial videos were collected to create an experimental dataset for action classification and localization annotation. A total of 5082 keyframes were labeled with 1–5 targets each, and 14,412 instances of data were prepared (including flight altitude and camera angles) for action classification and position annotation. To mitigate the challenges posed by high-resolution drone footage with excessive redundant information, we propose the SlowFast-Traffic (SF-T) framework, a spatio-temporal sequence-based algorithm for recognizing traffic accident rescue actions. For more efficient extraction of target–background correlation features, we introduce the Actor-Centric Relation Network (ACRN) module, which employs temporal max pooling to enhance the time-dimensional features of static backgrounds, significantly reducing redundancy-induced interference. Additionally, smaller ROI feature map outputs are adopted to boost computational speed. To tackle class imbalance in incident samples, we integrate a Class-Balanced Focal Loss (CB-Focal Loss) function, effectively resolving rare-action recognition in specific rescue scenarios. We replace the original Faster R-CNN with YOLOX-s to improve the target detection rate. On our proposed dataset, the SF-T model achieves a mean average precision (mAP) of 83.9%, which is 8.5% higher than that of the standard SlowFast architecture while maintaining a processing speed of 34.9 tasks/s. Both accuracy-related metrics and computational efficiency are substantially improved. The proposed method demonstrates strong robustness and real-time analysis capabilities for modern traffic rescue action recognition.

(This article belongs to the Special Issue Cooperative Perception for Modern Transportation)
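
As a rough illustration of the class-balanced focal loss mentioned in the abstract above, the sketch below follows the commonly used formulation in which each class is weighted by the inverse of its "effective number" of samples and the cross-entropy is scaled by a focusing term. The softmax-style variant, the values of `beta` and `gamma`, and the function names are assumptions; the abstract does not state which variant the authors use.

```python
import math

def class_balanced_weights(samples_per_class, beta=0.999):
    """Per-class weights proportional to (1 - beta) / (1 - beta ** n).

    Weights are rescaled to sum to the number of classes (an assumed convention).
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in samples_per_class]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]

def cb_focal_loss(probs, label, weights, gamma=2.0, eps=1e-12):
    """Class-balanced focal loss for one sample.

    probs: predicted class probabilities; label: index of the true class.
    The (1 - p_t) ** gamma factor down-weights well-classified samples.
    """
    p_t = probs[label]
    return -weights[label] * (1.0 - p_t) ** gamma * math.log(p_t + eps)
```

With this weighting, a class with few samples receives a larger weight than a frequent one, so errors on rare rescue actions contribute more to the total loss.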

19 pages, 2893 KiB

Open Access Article

Reactive Power Optimization of a Distribution Network Based on Graph Security Reinforcement Learning

by Xu Zhang, Xiaolin Gui, Pei Sun, Xing Li, Yuan Zhang, Xiaoyu Wang, Chaoliang Dang and Xinghua Liu

Appl. Sci. 2025, 15(15), 8209; https://doi.org/10.3390/app15158209 - 23 Jul 2025

Abstract

With the increasing integration of renewable energy, the secure operation of distribution networks faces significant challenges, such as voltage limit violations and increased power losses. To address the issue of reactive power and voltage security under renewable generation uncertainty, this paper proposes a graph-based security reinforcement learning method. First, a graph-enhanced neural network is designed to extract both topological and node-level features from the distribution network. Then, a primal-dual approach is introduced to incorporate voltage security constraints into the agent’s critic network by constructing a cost critic to guide safe policy learning. Finally, a dual-critic framework is adopted to train the actor network and derive an optimal policy. Experiments conducted on real load profiles demonstrated that the proposed method reduced the voltage violation rate to 0%, compared to 4.92% with the Deep Deterministic Policy Gradient (DDPG) algorithm and 5.14% with the Twin Delayed DDPG (TD3) algorithm. Moreover, the average node voltage deviation was effectively controlled within 0.0073 per unit.

(This article belongs to the Special Issue IoT Technology and Information Security)
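
The primal-dual idea described in the abstract above is often realized with a Lagrange multiplier that trades off the reward critic against the cost critic. The following is a generic sketch of that pattern, not the paper's formulation: the function names, the learning rate, and the cost limit are hypothetical.

```python
def update_multiplier(lmbda, avg_cost, cost_limit, lr=0.05):
    """Dual ascent step: raise the multiplier when the average constraint
    cost exceeds its limit, lower it otherwise, and keep it non-negative."""
    return max(0.0, lmbda + lr * (avg_cost - cost_limit))

def actor_objective(reward_q, cost_q, lmbda):
    """Primal objective the actor would maximize: the reward critic's value
    minus the multiplier-weighted value from the cost critic."""
    return reward_q - lmbda * cost_q
```

Under this scheme, persistent voltage violations would increase the multiplier, which in turn would push the actor toward actions the cost critic rates as safer.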

18 pages, 1138 KiB

Open Access Article

Intelligent Priority-Aware Spectrum Access in 5G Vehicular IoT: A Reinforcement Learning Approach

by Adeel Iqbal, Tahir Khurshaid and Yazdan Ahmad Qadri

Sensors 2025, 25(15), 4554; https://doi.org/10.3390/s25154554 - 23 Jul 2025

Abstract

Efficient and intelligent spectrum access is crucial for meeting the diverse Quality of Service (QoS) demands of Vehicular Internet of Things (V-IoT) systems in next-generation cellular networks. This work proposes RL-PASM, a novel reinforcement learning (RL)-based priority-aware spectrum management framework that is centralized, self-learning, and operates through Roadside Units (RSUs). RL-PASM dynamically allocates spectrum resources across three traffic classes: high-priority (HP), low-priority (LP), and best-effort (BE). This work compares four RL algorithms: Q-Learning (QL), Double Q-Learning (DQL), Deep Q-Network (DQN), and Actor-Critic (AC) methods. The environment is modeled as a discrete-time Markov Decision Process (MDP), and a context-sensitive reward function guides fairness-preserving decisions for access, preemption, coexistence, and hand-off. Extensive simulations conducted under realistic vehicular load conditions evaluate the performance across key metrics, including throughput, delay, energy efficiency, fairness, blocking, and interruption probability. Unlike prior approaches, RL-PASM introduces a unified multi-objective reward formulation and centralized RSU-based control to support adaptive priority-aware access for dynamic vehicular environments. Simulation results confirm that RL-PASM balances throughput, latency, fairness, and energy efficiency, demonstrating its suitability for scalable and resource-constrained deployments. The results also demonstrate that DQN achieves the highest average throughput, followed by vanilla QL. DQL and AC maintain high fairness and a low average interruption probability. QL demonstrates the lowest average delay and the highest energy efficiency, making it a suitable candidate for edge-constrained vehicular deployments. With the appropriate RL method selected, RL-PASM offers a robust and adaptable solution for scalable, intelligent, and priority-aware spectrum access in vehicular communication infrastructures.

(This article belongs to the Special Issue Emerging Trends in Next-Generation mmWave Cognitive Radio Networks)

22 pages, 5966 KiB

Open Access Article

Road-Adaptive Precise Path Tracking Based on Reinforcement Learning Method

by Bingheng Han and Jinhong Sun

Sensors 2025, 25(15), 4533; https://doi.org/10.3390/s25154533 - 22 Jul 2025

Viewed by 31

Abstract

This paper proposes a speed-adaptive autonomous driving path-tracking framework based on the soft actor–critic (SAC) and pure pursuit (PP) methods, named the SACPP controller. The framework first analyzes the obstacles around the vehicle and plans an obstacle-free reference path with the minimum curvature using the hybrid A* algorithm. Next, based on the generated reference path, the current state of the vehicle, and the vehicle motor energy efficiency diagram, the optimal speed is calculated in real time, and the vehicle dynamics preview point at the future moment (specifically, the look-ahead distance) is predicted. This process relies on the learning of the SAC network structure. Finally, PP is used to generate the front wheel angle control value by combining the current speed and the predicted preview point. In the second layer, we carefully designed the evaluation function in the tracking process based on the uncertainties and performance requirements that may occur during vehicle driving. This design ensures that the autonomous vehicle can not only quickly and accurately track the path, but also effectively avoid surrounding obstacles, while keeping the motor running in the high-efficiency range, thereby reducing energy loss. In addition, since the entire framework uses a lightweight network structure and a geometry-based method to generate the front wheel angle, the computational load is significantly reduced, and computing resources are saved. Tests on an i7 CPU show that the control frequency of the framework exceeds 100 Hz.

(This article belongs to the Special Issue AI-Driving for Autonomous Vehicles)
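
The final pure pursuit step in the abstract above can be sketched with the textbook steering law, which steers a bicycle-model vehicle along the arc through the preview point. This is the standard geometric formula rather than the authors' implementation; the function name, the rear-axle pose convention, and the assumption that the preview point is already given (in the paper it comes from the SAC-predicted look-ahead distance) are all illustrative.

```python
import math

def pure_pursuit_steering(x, y, yaw, target_x, target_y, wheelbase):
    """Front wheel angle (rad) that steers toward the preview point.

    (x, y, yaw): rear-axle pose of the vehicle (assumed convention).
    (target_x, target_y): preview point on the reference path.
    Uses delta = atan(2 * L * sin(alpha) / l_d), where alpha is the heading
    error to the preview point and l_d is the look-ahead distance.
    """
    dx, dy = target_x - x, target_y - y
    lookahead = math.hypot(dx, dy)
    alpha = math.atan2(dy, dx) - yaw
    return math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)
```

Because the steering angle is a closed-form function of the preview point, this step adds very little computation on top of the learned speed and look-ahead prediction.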

19 pages, 1371 KiB

Open Access Article

The Structure and Driving Mechanisms of the Departmental Collaborative Network in Primary-Level Social Risk Prevention and Control: A Network Study of J City, China

by Lirong Zhang, Haixing Zhang and Qingzhi Jiang

Systems 2025, 13(8), 617; https://doi.org/10.3390/systems13080617 - 22 Jul 2025

Viewed by 69

Abstract

Primary-level social risk prevention and control is a complex, systemic endeavor that requires close cooperation among various local government departments. Within this context, addressing bureaucratic segmentation and strengthening interdepartmental collaboration are critical issues in primary-level social risk governance. This study uses social network analysis and the exponential random graph model to examine the collaborative network structure and driving mechanisms among government departments engaged in risk prevention, with J City as a network study. The findings reveal that (1) while a collaborative governance framework exists, the network has low overall density, strong localized clustering, and a clear core-periphery structure, indicating the need for improved coordination and more refined collaborative mechanisms; (2) the formation of the risk prevention network is influenced by both endogenous structural factors and exogenous actor attributes. Endogenously, reciprocity and transitivity play significant roles in tie formation; exogenously, departments with similar resource mobilization capacities are more likely to collaborate, while those with strong communication, digital technology, and resource mobilization capabilities are more likely to initiate collaborations, and those with high communication capacity are more likely to accept collaborative offers. This study offers insights into the dynamics and formation mechanisms of departmental collaborative networks in primary-level social risk governance.

(This article belongs to the Section Systems Practice in Social Science)

33 pages, 304 KiB

Open Access Article

LEADER Territorial Cooperation in Rural Development: Added Value, Learning Dynamics, and Policy Impacts

by Giuseppe Gargano and Annalisa Del Prete

Land 2025, 14(7), 1494; https://doi.org/10.3390/land14071494 - 18 Jul 2025

Viewed by 299

Abstract

This study examines the added value of territorial cooperation within the LEADER approach, a key pillar of the EU’s rural development policy. Both interterritorial and transnational cooperation projects empower Local Action Groups (LAGs) to tackle common challenges through innovative and community-driven strategies. Drawing on over 3000 projects since 1994, LEADER cooperation has proven its ability to deliver tangible results (such as joint publications, pilot projects, and shared digital platforms) alongside intangible benefits like knowledge exchange, improved governance, and stronger social capital. By facilitating experiential learning and inter-organizational collaboration, cooperation enables stakeholders to work across territorial boundaries and build networks that respond to both national and transnational development issues. The interaction among diverse actors often fosters innovative responses to local and regional problems. Using a mixed-methods approach, including case studies of Italian LAGs, this research analyses the dynamics, challenges, and impacts of cooperation, with a focus on learning processes, capacity building, and long-term sustainability. Accordingly, this study focuses not only on project outcomes but also on the processes and learning dynamics that generate added value through cooperation. The findings highlight how territorial cooperation promotes inclusivity, fosters cross-border dialogue, and supports the development of context-specific solutions, ultimately enhancing rural resilience and innovation. In conclusion, LEADER cooperation contributes to a more effective, participatory, and sustainable model of rural development, offering valuable insights for the broader EU cohesion policy.

(This article belongs to the Special Issue Rural Development Strategies in the EU: Strengthening Agricultural Sectors and Local Economies)

11 pages, 274 KiB

Open Access Essay

Connecting the Dots: Applying Network Theories to Enhance Integrated Paramedic Care for People Who Use Drugs

by Jennifer L. Bolster, Polly Ford-Jones, Elizabeth A. Donnelly and Alan M. Batt

Systems 2025, 13(7), 605; https://doi.org/10.3390/systems13070605 - 18 Jul 2025

Viewed by 545

Abstract

The evolving role of paramedics presents a unique opportunity to enhance care for people who use drugs, a population disproportionately affected by systemic barriers and inequities. In fragmented healthcare systems, paramedics are well-positioned to improve access through initiatives such as social prescribing and harm reduction. This theory-driven commentary explores how Network Theory and Actor Network Theory provide valuable theoretical underpinnings to conceptualize and strengthen the integration of paramedics into care networks. By emphasizing the centrality of paramedics and their connections with both human and non-human actors, these theories illuminate the relational dynamics that influence effective care delivery. We argue that leveraging paramedics’ positionality can address gaps in system navigation, improve patient outcomes, and inform policy reforms. Future work should examine the roles of other key actors, strengthen paramedic advocacy, and identify strategies to overcome barriers to care for people who use drugs.

(This article belongs to the Section Systems Theory and Methodology)

16 pages, 493 KiB

Open Access Article

Techno-Pessimistic Shock and the Banning of Mobile Phones in Secondary Schools: The Case of Madrid

by Joaquín Paredes-Labra, Isabel Solana-Domínguez, Marco Ramos-Ramiro and Ada Freitas-Cortina

Soc. Sci. 2025, 14(7), 441; https://doi.org/10.3390/socsci14070441 - 18 Jul 2025

Viewed by 477

Abstract

Over a three-year R&D project, the perception of mobile phone use in Spanish secondary schools shifted from initial tolerance to increasingly prohibitive policies. Drawing on the Actor–Network Theory, this study examines how mobile phones, alongside institutional discourses and school and family concerns, acted as dynamic actants, shaping public and political responses. The research adopted a qualitative design combining policy and media document analysis, nine semi-structured interviews with key stakeholders, ten regional case studies, and twelve focus groups. The study concluded with a public multiplier event that engaged the broader educational community. The Madrid region, among the first to adopt a restrictive stance, contributed two school-based case studies and three focus groups with teachers, students, and families. Findings suggest that the turn toward prohibition was motivated less by pedagogical evidence than by cultural anxieties, consistent with what the study conceptualizes as a techno-pessimistic shock. This shift mirrors the historical patterns of societal reaction to disruption and technological saturation. Rather than reinforcing binary framings of promotion versus prohibition, such moments invite critical reflection. The study argues for nuanced, evidence-based, and multilevel governance strategies to address the complex role of mobile technologies in education.

(This article belongs to the Special Issue Educational Technology for a Multimodal Society)

15 pages, 216 KiB

Open Access Article

Freedom as Social Practice: Reconstructing Religious Freedom in Everyday Life

by Michele Garau and Giacomo Bazzani

Religions 2025, 16(7), 914; https://doi.org/10.3390/rel16070914 - 16 Jul 2025

Viewed by 183

Abstract

This article examines how religious freedom is enacted and redefined through everyday practices in pluralistic urban settings. Moving beyond the classical notion of negative liberty as non-interference, it explores the social conditions that enable or constrain the practical expression of religious life. Drawing on forty-three qualitative interviews with religious leaders and civic actors in Florence, Italy, the study analyses how religious freedom is experienced across institutional contexts such as hospitals, schools, prisons, workplaces, and sport facilities. The findings reveal a persistent tension between formal legal rights and their uneven implementation in daily life. While legal guarantees are generally upheld, structural barriers and discretionary practices often hinder access to religious expression. At the same time, informal interactions, local networks, and dialogical engagement play a key role in supporting the concrete exercise of religious freedom. The article argues that freedom is not simply a legal status but a social process, realized through relational and institutional arrangements. By foregrounding the role of everyday interaction in shaping the conditions of freedom, this study contributes to broader sociological debates on pluralism, normativity, and the social foundations of institutional life.

(This article belongs to the Special Issue Urban Governance of Interreligious Dialogue and Freedom of/from Religion)

25 pages, 732 KiB

Open Access Article

Accuracy-Aware MLLM Task Offloading and Resource Allocation in UAV-Assisted Satellite Edge Computing

by Huabing Yan, Hualong Huang, Zijia Zhao, Zhi Wang and Zitian Zhao

Drones 2025, 9(7), 500; https://doi.org/10.3390/drones9070500 - 16 Jul 2025

Viewed by 240

Abstract

This paper presents a novel framework for optimizing multimodal large language model (MLLM) inference through task offloading and resource allocation in UAV-assisted satellite edge computing (SEC) networks. MLLMs leverage transformer architectures to integrate heterogeneous data modalities for IoT applications, particularly real-time monitoring in remote areas. However, cloud computing dependency introduces latency, bandwidth, and privacy challenges, while IoT device limitations require efficient distributed computing solutions. SEC, utilizing low-earth orbit (LEO) satellites and unmanned aerial vehicles (UAVs), extends mobile edge computing to provide ubiquitous computational resources for remote IoT devices (IoTDs). We formulate the joint optimization of MLLM task offloading and resource allocation as a mixed-integer nonlinear programming (MINLP) problem, minimizing latency and energy consumption while optimizing offloading decisions, power allocation, and UAV trajectories. To address the dynamic SEC environment characterized by satellite mobility, we propose an action-decoupled soft actor–critic (AD-SAC) algorithm with discrete–continuous hybrid action spaces. The simulation results demonstrate that our approach significantly outperforms conventional deep reinforcement learning baselines in convergence and system cost reduction.

21 pages, 1847 KiB

Open Access Article

Global Division of Responsibility Sharing: How Refugee Systems Operate Through the Economic Management of Mobility and Immobility

by Austin H. Vo and Michelle S. Dromgold-Sermen

Soc. Sci. 2025, 14(7), 434; https://doi.org/10.3390/socsci14070434 - 15 Jul 2025

Viewed by 198

Abstract

In 2023, there were approximately 32 million refugees globally. Nine of the ten countries from which the most refugees originated were in the Global South; conversely, only three of the ten countries hosting the highest numbers of refugees were in the Global North. In this study, we introduce the conceptual framework of a global division of responsibility sharing to describe how functions of Global North countries as permanent “resettlement” countries and Global South countries as perpetual countries of “asylum” and “transit” constitute unequal burdens with unequal protections for refugees. We illustrate, theoretically and empirically, how the structural positions of state actors in a global network introduce and reify a global division in refugee flows. Empirically, we test and develop this framework with network analysis of refugee flows to countries of asylum from 1990 to 2015 in addition to employing data on monetary donations to the United Nations High Commissioner for Refugees (UNHCR) from 2017 to 2021. We (1) provide evidence of the structure and role of intermediary countries in refugee flows and (2) examine how UNHCR monetary aid conditions intermediary countries’ role of routing and transit. We illustrate how network constraints and monetary donations affect and constitute a global division in the management of historic and contemporary international refugee flows and explore the consequences of this global division for refugees’ access to resources and social and human rights.

(This article belongs to the Special Issue Migration, Citizenship and Social Rights)

28 pages, 1727 KiB

Open Access Article

Detecting Jamming in Smart Grid Communications via Deep Learning

by Muhammad Irfan, Aymen Omri, Javier Hernandez Fernandez, Savio Sciancalepore and Gabriele Oligeri

J. Cybersecur. Priv. 2025, 5(3), 46; https://doi.org/10.3390/jcp5030046 - 15 Jul 2025

Viewed by 263

Abstract

Power-Line Communication (PLC) allows data transmission through existing power lines, thus avoiding the expensive deployment of ad hoc network infrastructures. However, power line networks remain largely unattended, which allows tampering by malicious actors. In fact, an attacker can easily inject a malicious signal (jamming) with the aim of disrupting ongoing communications. In this paper, we propose a new solution to detect jamming attacks before they significantly affect the quality of the communication link, thus allowing the detection of a jammer (geographically) far away from a receiver. We consider two scenarios as a function of the receiver’s ability to know in advance the impact of the jammer on the received signal. In the first scenario (jamming-aware), we leverage a classifier based on a Convolutional Neural Network, which has been trained on both jammed and non-jammed signals. In the second scenario (jamming-unaware), we consider a one-class classifier based on autoencoders, allowing us to address the challenge of jamming detection as a classical anomaly detection problem. Our proposed solution can detect jamming attacks on PLC networks with an accuracy greater than 99% even when the jammer is 68 m away from the receiver while requiring training only on traffic acquired during the regular operation of the target PLC network.

22 pages, 2108 KiB

Open Access Article

Deep Reinforcement Learning for Real-Time Airport Emergency Evacuation Using Asynchronous Advantage Actor–Critic (A3C) Algorithm

by Yujing Zhou, Yupeng Yang, Bill Deng Pan, Yongxin Liu, Sirish Namilae, Houbing Herbert Song and Dahai Liu

Mathematics 2025, 13(14), 2269; https://doi.org/10.3390/math13142269 - 15 Jul 2025

Viewed by 273

Abstract

Emergencies can occur unexpectedly and require immediate action, especially in aviation, where time pressure and uncertainty are high. This study focused on improving emergency evacuation in airport and aircraft scenarios using real-time decision-making support. A system based on the Asynchronous Advantage Actor–Critic (A3C) algorithm, an advanced deep reinforcement learning method, was developed to generate faster and more efficient evacuation routes compared to traditional models. The A3C model was tested in various scenarios, including different environmental conditions and numbers of agents, and its performance was compared with the Deep Q-Network (DQN) algorithm. The results showed that A3C achieved evacuations 43.86% faster on average and converged in fewer episodes (100 vs. 250 for DQN). In dynamic environments with moving threats, A3C also outperformed DQN in maintaining agent safety and adapting routes in real time. As the number of agents increased, A3C maintained high levels of efficiency and robustness. These findings demonstrate A3C’s strong potential to enhance evacuation planning through improved speed, adaptability, and scalability. The study concludes by highlighting the practical benefits of applying such models in real-world emergency response systems, including significantly faster evacuation times, real-time adaptability to evolving threats, and enhanced scalability for managing large crowds in high-density environments including airport terminals. The A3C-based model offers a cost-effective alternative to full-scale evacuation drills by enabling virtual scenario testing, supports proactive safety planning through predictive modeling, and contributes to the development of intelligent decision-support tools that improve coordination and reduce response time during emergencies.

(This article belongs to the Special Issue Future Technologies and Models for Integrated Transportation and Intelligent Transportation Networks)

24 pages, 1076 KiB

Open Access Article

Visual–Tactile Fusion and SAC-Based Learning for Robot Peg-in-Hole Assembly in Uncertain Environments

by Jiaxian Tang, Xiaogang Yuan and Shaodong Li

Machines 2025, 13(7), 605; https://doi.org/10.3390/machines13070605 - 14 Jul 2025

Viewed by 248

Abstract

Robotic assembly, particularly peg-in-hole tasks, presents significant challenges in uncertain environments where pose deviations, varying peg shapes, and environmental noise can undermine performance. To address these issues, this paper proposes a novel approach combining visual–tactile fusion with reinforcement learning. By integrating multimodal data (RGB image, depth map, tactile force information, and robot body pose data) via an autoencoder-based fusion network, we provide the robot with a more comprehensive perception of its environment. Furthermore, we enhance the robot’s assembly skill by using the Soft Actor–Critic (SAC) reinforcement learning algorithm, which allows the robot to adapt its actions to dynamic environments. We evaluate our method through experiments, which showed clear improvements in three key aspects: higher assembly success rates, reduced task completion times, and better generalization across diverse peg shapes and environmental conditions. The results suggest that the combination of visual and tactile feedback with SAC-based learning provides a viable and robust solution for robotic assembly in uncertain environments, paving the way for scalable and adaptable industrial robots.

(This article belongs to the Section Robotics, Mechatronics and Intelligent Machines)
