Towards Scalable Intelligence: A Low-Complexity Multi-Agent Soft Actor–Critic for Large-Model-Driven UAV Swarms

Liu, Zhaoyu; Cheng, Wenchu; Zeng, Liang; He, Xinxin

doi:10.3390/drones9110788

This is an early access version; the complete PDF, HTML, and XML versions will be available soon.

Open Access Article

¹ School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China

² School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing 100081, China

³ School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China

* Author to whom correspondence should be addressed.

Drones 2025, 9(11), 788; https://doi.org/10.3390/drones9110788 (registering DOI)

Submission received: 11 October 2025 / Revised: 8 November 2025 / Accepted: 10 November 2025 / Published: 12 November 2025

(This article belongs to the Special Issue Advances in AI Large Models for Unmanned Aerial Vehicles)

Abstract

Heterogeneous unmanned aerial vehicle (UAV) swarms are becoming critical components of next-generation non-terrestrial networks, enabling tasks such as communication relay, spectrum monitoring, cooperative sensing, and navigation. Yet their heterogeneity and multifunctionality pose severe challenges for task allocation and resource scheduling, where traditional multi-agent reinforcement learning methods often suffer from high algorithmic complexity, lengthy training times, and deployment difficulties on resource-constrained nodes. To address these issues, this paper proposes a low-complexity multi-agent soft actor–critic (MASAC) framework that combines parameter sharing (a shared actor with device embeddings and shared-backbone twin critics), lightweight network design (a fixed-width residual MLP with normalization), and robust training mechanisms (minimum-bias twin-critic updates and entropy scheduling) within the centralized training–decentralized execution (CTDE) paradigm. Simulation results show that the proposed framework achieves more than 14-fold parameter compression and a reduction of over 93% in training time, while maintaining or improving performance as measured by the delay–energy utility function. These advances substantially reduce computational overhead and accelerate convergence, providing a practical pathway for deploying multi-agent reinforcement learning in large-scale heterogeneous UAV clusters and supporting diverse mission scenarios under stringent resource and latency constraints.

Keywords: multi-agent soft actor–critic (MASAC); low complexity; unmanned aerial vehicle (UAV) cluster; resource allocation; centralized training–decentralized execution (CTDE)
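
The abstract names three ingredients without giving formulas: a shared actor with device embeddings, a fixed-width residual MLP with normalization, and twin-critic updates. The NumPy sketch below illustrates one plausible reading of each and is not the authors' implementation. All function names, layer sizes, and hyperparameters (`gamma`, `alpha`, the 0.1 initialization scale) are hypothetical, and the twin-critic target shown is the standard soft actor–critic form that takes the minimum of two critics, assumed here as a stand-in for the paper's "minimum-bias" update.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each row to zero mean and unit variance (no learned scale/shift)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def residual_block(x, w, b):
    """Fixed-width residual block: x + ReLU(LayerNorm(x) @ w + b)."""
    return x + np.maximum(layer_norm(x) @ w + b, 0.0)

def init_params(rng, obs_dim, emb_dim, width, n_blocks, n_device_types):
    """Hypothetical parameter set for ONE actor shared by every agent."""
    scale = 0.1  # assumed initialization scale, not from the paper
    return {
        "embedding": scale * rng.standard_normal((n_device_types, emb_dim)),
        "w_in": scale * rng.standard_normal((obs_dim + emb_dim, width)),
        "b_in": np.zeros(width),
        "blocks": [
            (scale * rng.standard_normal((width, width)), np.zeros(width))
            for _ in range(n_blocks)
        ],
    }

def shared_actor_features(obs, device_ids, params):
    """Run all agents through the same weights; the embedding marks the device type."""
    emb = params["embedding"][device_ids]                # (n_agents, emb_dim)
    h = np.concatenate([obs, emb], axis=-1) @ params["w_in"] + params["b_in"]
    for w, b in params["blocks"]:
        h = residual_block(h, w, b)
    return h                                             # (n_agents, width)

def count_params(params):
    """Total number of scalars in the shared actor."""
    total = params["embedding"].size + params["w_in"].size + params["b_in"].size
    return total + sum(w.size + b.size for w, b in params["blocks"])

def soft_td_target(reward, done, q1_next, q2_next, logp_next, gamma=0.99, alpha=0.2):
    """Standard SAC target: r + gamma * (1 - done) * (min(Q1', Q2') - alpha * log pi)."""
    min_q = np.minimum(q1_next, q2_next)  # pessimistic estimate from the twin critics
    return reward + gamma * (1.0 - done) * (min_q - alpha * logp_next)
```

Under these assumptions the shared actor's parameter count does not grow with the number of agents; only the embedding table grows, and only with the number of device types. Entropy scheduling is left out because the abstract does not specify the schedule.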
