Open Access Article
Towards Scalable Intelligence: A Low-Complexity Multi-Agent Soft Actor–Critic for Large-Model-Driven UAV Swarms
by Zhaoyu Liu 1, Wenchu Cheng 2, Liang Zeng 2,* and Xinxin He 3
1 School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
2 School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing 100081, China
3 School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China
* Author to whom correspondence should be addressed.
Drones 2025, 9(11), 788; https://doi.org/10.3390/drones9110788 (registering DOI)
Submission received: 11 October 2025 / Revised: 8 November 2025 / Accepted: 10 November 2025 / Published: 12 November 2025
Abstract
Heterogeneous unmanned aerial vehicle (UAV) swarms are becoming critical components of next-generation non-terrestrial networks, enabling tasks such as communication relay, spectrum monitoring, cooperative sensing, and navigation. Yet their heterogeneity and multifunctionality pose severe challenges for task allocation and resource scheduling, where traditional multi-agent reinforcement learning methods often suffer from high algorithmic complexity, lengthy training times, and deployment difficulties on resource-constrained nodes. To address these issues, this paper proposes a low-complexity multi-agent soft actor–critic (MASAC) framework that combines parameter sharing (shared actor with device embeddings and shared-backbone twin critics), lightweight network design (fixed-width residual MLP with normalization), and robust training mechanisms (minimum-bias twin-critic updates and entropy scheduling) within the centralized training with decentralized execution (CTDE) paradigm. Simulation results show that the proposed framework achieves more than 14-fold parameter compression and a reduction of more than 93% in training time, while maintaining or improving performance in terms of the delay–energy utility function. These advances substantially reduce computational overhead and accelerate convergence, providing a practical pathway for deploying multi-agent reinforcement learning in large-scale heterogeneous UAV clusters and supporting diverse mission scenarios under stringent resource and latency constraints.
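To make the parameter-sharing idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It counts the parameters of one plain MLP actor per agent against a single shared actor that receives a learned per-device embedding, and it evaluates a soft target that takes the minimum of two critic values. All names and dimensions (`mlp_params`, `emb_dim`, 16 agents, width 128, and so on) are hypothetical. The plain MLP stands in for the paper's residual, normalized network, and the target follows the standard soft actor-critic form, which is assumed here to be what "minimum-bias twin-critic updates" refers to.

```python
def mlp_params(sizes):
    """Count weights and biases of a fully connected net with these layer sizes."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


def independent_actor_params(n_agents, obs_dim, hidden, act_dim, depth):
    """One separate actor per agent (the non-shared baseline)."""
    sizes = [obs_dim] + [hidden] * depth + [act_dim]
    return n_agents * mlp_params(sizes)


def shared_actor_params(n_agents, obs_dim, emb_dim, hidden, act_dim, depth):
    """One actor for all agents; each agent adds only an emb_dim-sized embedding
    that is concatenated to its observation (hypothetical design)."""
    sizes = [obs_dim + emb_dim] + [hidden] * depth + [act_dim]
    return n_agents * emb_dim + mlp_params(sizes)


def soft_target(reward, q1_next, q2_next, logp_next, alpha, gamma, done):
    """Standard SAC target: use the smaller of the two critic estimates and
    subtract the entropy term alpha * log pi(a'|s')."""
    min_q = min(q1_next, q2_next)
    return reward + gamma * (1.0 - done) * (min_q - alpha * logp_next)


# Hypothetical sizes, chosen only for illustration (not taken from the paper).
independent = independent_actor_params(16, 32, 128, 8, depth=2)
shared = shared_actor_params(16, 32, 8, 128, 8, depth=2)
ratio = independent / shared
target = soft_target(1.0, 2.0, 3.0, -1.0, alpha=0.2, gamma=0.9, done=0.0)
```

With these illustrative sizes the shared actor is roughly 15 times smaller than the independent baseline. The ratio depends entirely on the assumed dimensions and should not be read as a reproduction of the reported 14-fold compression.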