Collaborative Search Algorithm for Multi-UAVs Under Interference Conditions: A Multi-Agent Deep Reinforcement Learning Approach
Abstract
1. Introduction
- A collaborative search model for multiple UAVs under interference conditions is established. Based on this model, an optimization problem is formulated that minimizes the total search time by jointly optimizing the UAVs' channel selection and search heading angles. To reduce the complexity of solving it, the problem is decomposed into two subproblems: spectrum collaboration under interference conditions and interference-free search collaboration.
- An MAPPO-based spectrum collaboration algorithm is proposed that requires only the received signal strength indicator (RSSI) of the channels. Each UAV can therefore complete the collaboration without knowing the channel state information between other nodes, which improves the algorithm's real-time responsiveness.
- An MAPPO-based search collaboration algorithm is designed that uses the target probability map (TPM) of the task area as a shared message to optimize the UAVs' headings. Action masking based on the current and previous states is employed in the search collaboration to speed up convergence, and the policy network uses two convolutional neural networks to match the two-dimensional TPM.
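The action masking mentioned in the last contribution can be illustrated with a minimal sketch. The masking rules are not spelled out in this part of the paper, so the sketch assumes two plausible ones: a heading is disallowed if it would leave the task area (a rule derived from the current state) or if it would return the UAV to the cell it just left (a rule derived from the previous state). The eight-heading grid discretisation and the names `HEADINGS`, `action_mask`, and `masked_policy` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical action set: 8 grid headings as (row, col) offsets.
# The paper's actual heading discretisation is not given here.
HEADINGS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def action_mask(pos, prev_pos, grid_shape):
    """Return a boolean mask over HEADINGS; True marks an allowed heading."""
    rows, cols = grid_shape
    mask = np.zeros(len(HEADINGS), dtype=bool)
    for k, (dr, dc) in enumerate(HEADINGS):
        nxt = (pos[0] + dr, pos[1] + dc)
        # Assumed rule from the current state: stay inside the task area.
        inside = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
        # Assumed rule from the previous state: do not step straight back.
        backtrack = prev_pos is not None and nxt == tuple(prev_pos)
        mask[k] = inside and not backtrack
    return mask


def masked_policy(logits, mask):
    """Softmax over the logits with masked-out actions forced to probability 0."""
    if not mask.any():
        raise ValueError("at least one action must remain available")
    masked = np.where(mask, logits, -np.inf)
    masked = masked - masked[mask].max()  # shift for numerical stability
    exp = np.exp(masked)                  # exp(-inf) evaluates to 0.0
    return exp / exp.sum()


if __name__ == "__main__":
    mask = action_mask(pos=(0, 0), prev_pos=(0, 1), grid_shape=(5, 5))
    print(mask)
    print(masked_policy(np.zeros(len(HEADINGS)), mask))
```

Because masked actions receive zero probability, the policy never samples them, so no training steps are spent on moves that are invalid by construction. This is the usual reason masking speeds up convergence.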
2. System Model
2.1. UAV Search Model
2.2. Target Probability Map
2.3. Channel Model
2.4. UAV Communication Model
3. Proposed Algorithm
3.1. Structure of MARL
3.2. PPO-Based MARL
3.3. MAPPO-Based Spectrum Collaboration
Algorithm 1: MAPPO-based spectrum collaboration under jamming conditions.
3.4. MAPPO-Based Search Collaboration
Algorithm 2: MAPPO-based search collaboration.
3.5. Computational Complexity Analysis
4. Simulation and Results
4.1. Parameter Setting
4.2. Performance Benchmark
4.3. Performance Evaluation
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Hu, W.; Yu, Y.; Liu, S.; She, C.; Guo, L.; Vucetic, B.; Li, Y. Multi-UAV Coverage Path Planning: A Distributed Online Cooperation Method. IEEE Trans. Veh. Technol. 2023, 72, 11727–11740.
- Lou, Z.; Wang, R.; Belmekki, B.E.Y.; Kishk, M.A.; Alouini, M.S. Terrain-Based UAV Deployment: Providing Coverage for Outdoor Users. IEEE Trans. Veh. Technol. 2024, 73, 8988–9002.
- Li, X.; Lu, X.; Chen, W.; Ge, D.; Zhu, J. Research on UAVs Reconnaissance Task Allocation Method Based on Communication Preservation. IEEE Trans. Consum. Electron. 2024, 70, 684–695.
- Zhang, B.; Lin, X.; Zhu, Y.; Tian, J.; Zhu, Z. Enhancing Multi-UAV Reconnaissance and Search Through Double Critic DDPG With Belief Probability Maps. IEEE Trans. Intell. Veh. 2024, 9, 3827–3842.
- Wu, J.; Sun, Y.; Li, D.; Shi, J.; Li, X.; Gao, L.; Yu, L.; Han, G.; Wu, J. An Adaptive Conversion Speed Q-Learning Algorithm for Search and Rescue UAV Path Planning in Unknown Environments. IEEE Trans. Veh. Technol. 2023, 72, 15391–15404.
- Ribeiro, R.G.; Cota, L.P.; Euzébio, T.A.M.; Ramírez, J.A.; Guimarães, F.G. Unmanned-Aerial-Vehicle Routing Problem With Mobile Charging Stations for Assisting Search and Rescue Missions in Postdisaster Scenarios. IEEE Trans. Syst. Man Cybern. Syst. 2022, 52, 6682–6696.
- Bai, Y.; Zhao, H.; Zhang, X.; Chang, Z.; Jäntti, R.; Yang, K. Toward Autonomous Multi-UAV Wireless Network: A Survey of Reinforcement Learning-Based Approaches. IEEE Commun. Surv. Tutor. 2023, 25, 3038–3067.
- Sun, G.; He, L.; Sun, Z.; Wu, Q.; Liang, S.; Li, J.; Niyato, D.; Leung, V.C.M. Joint Task Offloading and Resource Allocation in Aerial-Terrestrial UAV Networks With Edge and Fog Computing for Post-Disaster Rescue. IEEE Trans. Mob. Comput. 2024, 23, 8582–8600.
- Wang, Y.; Su, Z.; Xu, Q.; Li, R.; Luan, T.H.; Wang, P. A Secure and Intelligent Data Sharing Scheme for UAV-Assisted Disaster Rescue. IEEE/ACM Trans. Netw. 2023, 31, 2422–2438.
- Qian, M.; Chen, W.; Sun, R. A Maneuvering Target Tracking Algorithm Based on Cooperative Localization of Multi-UAVs with Bearing-Only Measurements. IEEE Trans. Instrum. Meas. 2024, 73, 1–11.
- Xu, K.; Song, C.; Xie, Y.; Pan, L.; Gan, X.; Huang, G. RMT-YOLOv9s: An Infrared Small Target Detection Method Based on UAV Remote Sensing Images. IEEE Geosci. Remote Sens. Lett. 2024, 21, 1–5.
- Lei, X.; Hu, X.; Wang, G.; Luo, H. A multi-UAV deployment method for border patrolling based on Stackelberg game. J. Syst. Eng. Electron. 2023, 34, 99–116.
- Yang, D.; Wang, J.; Wu, F.; Xiao, L.; Xu, Y.; Zhang, T. Energy Efficient Transmission Strategy for Mobile Edge Computing Network in UAV-Based Patrol Inspection System. IEEE Trans. Mob. Comput. 2024, 23, 5984–5998.
- Xu, W.; Wang, C.; Xie, H.; Liang, W.; Dai, H.; Xu, Z.; Wang, Z.; Guo, B.; Das, S.K. Reward Maximization for Disaster Zone Monitoring With Heterogeneous UAVs. IEEE/ACM Trans. Netw. 2024, 32, 890–903.
- Huang, H.; Savkin, A.V.; Huang, C. Decentralized Autonomous Navigation of a UAV Network for Road Traffic Monitoring. IEEE Trans. Aerosp. Electron. Syst. 2021, 57, 2558–2564.
- Lun, Y.; Wang, H.; Wu, J.; Liu, Y.; Wang, Y. Target Search in Dynamic Environments With Multiple Solar-Powered UAVs. IEEE Trans. Veh. Technol. 2022, 71, 9309–9321.
- Shen, G.; Lei, L.; Zhang, X.; Li, Z.; Cai, S.; Zhang, L. Multi-UAV Cooperative Search Based on Reinforcement Learning With a Digital Twin Driven Training Framework. IEEE Trans. Veh. Technol. 2023, 72, 8354–8368.
- Zhang, C.; Yao, W.; Zuo, Y.; Gui, J.; Zhang, C. Multi-Objective Optimization of Dynamic Communication Network for Multi-UAVs System. IEEE Trans. Veh. Technol. 2024, 73, 4081–4094.
- Peng, H.; Huo, M.l.; Liu, Z.z.; Xu, W. Simulation analysis of cooperative target search strategies for multiple UAVs. In Proceedings of the 27th Chinese Control and Decision Conference (2015 CCDC), Qingdao, China, 23–25 May 2015; pp. 4855–4859.
- Zhang, Y.; Yuan, J.; Bao, H. Optimization algorithm and application of sand cat swarm based on sparrow warning mechanism and spiral search strategy. In Proceedings of the 2024 39th Youth Academic Annual Conference of Chinese Association of Automation (YAC), Dalian, China, 7–9 June 2024; pp. 452–457.
- Zheng, J.; Ding, M.; Sun, L.; Liu, H. Distributed Stochastic Algorithm Based on Enhanced Genetic Algorithm for Path Planning of Multi-UAV Cooperative Area Search. IEEE Trans. Intell. Transp. Syst. 2023, 24, 8290–8303.
- Yang, F.; Ji, X.; Yang, C.; Li, J.; Li, B. Cooperative search of UAV swarm based on improved ant colony algorithm in uncertain environment. In Proceedings of the 2017 IEEE International Conference on Unmanned Systems (ICUS), Beijing, China, 27–29 October 2017; pp. 231–236.
- Wu, J.; Luo, J.; Jiang, C.; Gao, L. Multi-UAV Cooperative Search in Multi-Layered Aerial Computing Networks: A Multi-Agent Deep Reinforcement Learning Approach. In Proceedings of the ICC 2024—IEEE International Conference on Communications, Denver, CO, USA, 9–13 June 2024; pp. 2791–2796.
- Ma, T.; Jiang, J.; Liu, X.; Liu, R.; Sun, H. Target Search of UAV Swarm Based on Improved Wolf Pack Algorithm. In Proceedings of the 2023 6th International Symposium on Autonomous Systems (ISAS), Nanjing, China, 23–25 June 2023; pp. 1–6.
- Hou, Y.; Zhao, J.; Zhang, R.; Cheng, X.; Yang, L. UAV Swarm Cooperative Target Search: A Multi-Agent Reinforcement Learning Approach. IEEE Trans. Intell. Veh. 2024, 9, 568–578.
- Gao, Y.; Jin, G.; Guo, Y.; Zhu, G.; Yang, Q.; Yang, K. Weighted area coverage of maritime joint search and rescue based on multi-agent reinforcement learning. In Proceedings of the 2019 IEEE 3rd Advanced Information Management Communicates, Electronic and Automation Control Conference (IMCEC), Chongqing, China, 11–13 October 2019; pp. 593–597.
- Ni, J.; Tang, G.; Mo, Z.; Cao, W.; Yang, S.X. An Improved Potential Game Theory Based Method for Multi-UAV Cooperative Search. IEEE Access 2020, 8, 47787–47796.
- Fei, B.; Bao, W.; Zhu, X.; Liu, D.; Men, T.; Xiao, Z. Autonomous Cooperative Search Model for Multi-UAV With Limited Communication Network. IEEE Internet Things J. 2022, 9, 19346–19361.
- Yanmaz, E.; Balanji, H.M.; Güven, İ. Dynamic Multi-UAV Path Planning for Multi-Target Search and Connectivity. IEEE Trans. Veh. Technol. 2024, 73, 10516–10528.
- Balanji, H.M.; Yanmaz, E. Priority-Based Dynamic Multi-UAV Positioning for Multi-Target Search and Connectivity*. In Proceedings of the 2024 IEEE Wireless Communications and Networking Conference (WCNC), Dubai, United Arab Emirates, 21–24 April 2024; pp. 1–6.
- Devaraju, S.; Ihler, A.; Kumar, S. A Deep Q-Learning Connectivity-Aware Pheromone Mobility Model for Autonomous UAV Networks. In Proceedings of the 2023 International Conference on Computing, Networking and Communications (ICNC), Honolulu, HI, USA, 20–22 February 2023; pp. 575–580.
- Wang, F.; Zhang, Z.; Zhou, L.; Shang, T.; Zhang, R. Robust Multi-UAV Cooperative Trajectory Planning and Power Control for Reliable Communication in the Presence of Uncertain Jammers. Drones 2024, 8, 558.
- Meng, K.; He, X.; Wu, Q.; Li, D. Multi-UAV Collaborative Sensing and Communication: Joint Task Allocation and Power Optimization. IEEE Trans. Wirel. Commun. 2023, 22, 4232–4246.
- Zhang, H.; Ma, H.; Mersha, B.W.; Zhang, X.; Jin, Y. Distributed cooperative search method for multi-UAV with unstable communications. Appl. Soft Comput. 2023, 148, 110592.
- Chai, S.; Yang, Z.; Huang, J.; Li, X.; Zhao, Y.; Zhou, D. Cooperative UAV search strategy based on DMPC-AACO algorithm in restricted communication scenarios. Def. Technol. 2024, 31, 295–311.
- Shen, Y.; Wei, C.; Sun, Y.; Duan, H. Bird Flocking Inspired Methods for Multi-UAV Cooperative Target Search. IEEE Trans. Circuits Syst. II Express Briefs 2024, 71, 702–706.
- Yan, K.; Xiang, L.; Yang, K. Cooperative Target Search Algorithm for UAV Swarms With Limited Communication and Energy Capacity. IEEE Commun. Lett. 2024, 28, 1102–1106.
- Luo, Q.; Luan, T.H.; Shi, W.; Fan, P. Deep Reinforcement Learning Based Computation Offloading and Trajectory Planning for Multi-UAV Cooperative Target Search. IEEE J. Sel. Areas Commun. 2023, 41, 504–520.
- Kong, X.; Yang, J.; Chai, X.; Zhou, Y. An advantage duPLEX dueling multi-agent Q-learning algorithm for multi-UAV cooperative target search in unknown environments. Simul. Model. Pract. Theory 2025, 142, 103118.
- Senthilnath, J.; Harikumar, K.; Sundaram, S. Metacognitive Decision-Making Framework for Multi-UAV Target Search Without Communication. IEEE Trans. Syst. Man Cybern. Syst. 2024, 54, 3195–3206.
- Li, L.; Zhang, X.; Yue, W.; Liu, Z. Cooperative search for dynamic targets by multiple UAVs with communication data losses. ISA Trans. 2021, 114, 230–241.
- Day, R.; Salmon, J. Spatiotemporal Analysis of Multi-UAV Persistent Search and Retrieval with Stochastic Target Appearance. Drones 2025, 9, 152.
- Zhao, Z.; Zhang, X.; Fang, H.; Yang, Q. Distributed Formation Planning for Unmanned Aerial Vehicles. Drones 2025, 9, 306.
- Xu, S.; Zhou, Z.; Li, J.; Wang, L.; Zhang, X.; Gao, H. Communication-Constrained UAVs’ Coverage Search Method in Uncertain Scenarios. IEEE Sens. J. 2024, 24, 17092–17101.
- Qian, Y.; Sheng, K.; Ma, C.; Li, J.; Ding, M.; Hassan, M. Path Planning for the Dynamic UAV-Aided Wireless Systems Using Monte Carlo Tree Search. IEEE Trans. Veh. Technol. 2022, 71, 6716–6721.
- Sun, L.; Wang, J.; Wan, L.; Li, K.; Wang, X.; Lin, Y. Human-UAV Interaction Assisted Heterogeneous UAV Swarm Scheduling for Target Searching in Communication Denial Environment. IEEE Trans. Autom. Sci. Eng. 2025, 22, 4457–4472.
- Yao, P.; Wei, X. Multi-UAV Information Fusion and Cooperative Trajectory Optimization in Target Search. IEEE Syst. J. 2022, 16, 4325–4333.
- Pérez-Carabaza, S.; Scherer, J.; Rinner, B.; López-Orozco, J.A.; Besada-Portas, E. UAV trajectory optimization for Minimum Time Search with communication constraints and collision avoidance. Eng. Appl. Artif. Intell. 2019, 85, 357–371.
- Khan, A.; Yanmaz, E.; Rinner, B. Information merging in multi-UAV cooperative search. In Proceedings of the 2014 IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, 31 May–7 June 2014; pp. 3122–3129.
- Hu, J.; Xie, L.; Lum, K.Y.; Xu, J. Multiagent Information Fusion and Cooperative Control in Target Search. IEEE Trans. Control Syst. Technol. 2013, 21, 1223–1235.
- Zeng, Y.; Wu, Q.; Zhang, R. Accessing From the Sky: A Tutorial on UAV Communications for 5G and Beyond. Proc. IEEE 2019, 107, 2327–2375.
- Mozaffari, M.; Saad, W.; Bennis, M.; Nam, Y.H.; Debbah, M. A Tutorial on UAVs for Wireless Networks: Applications, Challenges, and Open Problems. IEEE Commun. Surv. Tutor. 2019, 21, 2334–2360.
- Liang, L.; Ye, H.; Li, G.Y. Spectrum Sharing in Vehicular Networks Based on Multi-Agent Reinforcement Learning. IEEE J. Sel. Areas Commun. 2019, 37, 2282–2292.
- Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O. Proximal Policy Optimization Algorithms. arXiv 2017, arXiv:1707.06347.
- Yu, C.; Velu, A.; Vinitsky, E.; Gao, J.; Wang, Y.; Bayen, A.; Wu, Y. The Surprising Effectiveness of PPO in Cooperative, Multi-Agent Games. arXiv 2022, arXiv:2103.01955.
Study | Spectrum Collaboration | Search Collaboration | Jammer Resilience | Real-Time Operation | Joint Optimization |
---|---|---|---|---|---|
[28,29,30,31] | ✔ | ✔ | ✘ | ✘ | ✘ |
[32,33] | ✔ | ✘ | ✘ | ✔ | ✘ |
[34,35,38,39,40,41,43,44,45,46,47,48] | ✘ | ✔ | ✘ | ✔ | ✘ |
[36,37,42] | ✘ | ✔ | ✘ | ✘ | ✘ |
Our work | ✔ | ✔ | ✔ | ✔ | ✔ |
Parameter | Description | Value |
---|---|---|
N | Number of UAVs | 3 |
M | Number of targets | 15 |
h | UAV flight altitude | 100 m |
 | Detection probability | 0.9 |
 | False alarm probability | 0.1 |
 | Thresholds for determining the absence and presence of a target | 0.05, 0.95 |
 | Carrier frequency | 2 GHz |
B | Channel bandwidth | 1 MHz |
 | UAV power | 23 dBm |
 | Jammer power | 60 dBm |
 | Environmental constants | 0.13, 11.9 |
 | NLoS link’s additional attenuation factor | 20 dB |
 | White noise power spectral density | −100 dBm/MHz |
 | Threshold for the information transmitted to the GCS | 0.9 |
 | Hyperparameter for channel switching | 0.1 |
 | Discount factor | 0.99 |
 | GAE parameter | 0.95 |
 | Clip factor | 0.2 |
 | Policy/critic network learning rate of MAPPO-based spectrum collaboration | 5 × 10⁻⁴, 5 × 10⁻⁴ |
 | Policy/critic network learning rate of MAPPO-based search collaboration | 5 × 10⁻⁴, 5 × 10⁻³ |
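To give a feel for how the detection probability, false alarm probability, and decision thresholds in the parameter table interact, the following sketch applies a standard Bayesian update to a single TPM cell. The update rule itself is not reproduced in this part of the paper, so the form used here (Bayes' rule with a binary sensor) is an assumption, as are the function names `update_cell` and `classify` and the uninformative prior of 0.5.

```python
P_D = 0.9        # detection probability (from the parameter table)
P_F = 0.1        # false alarm probability (from the parameter table)
ABSENT_T = 0.05  # at or below this, the cell is judged to hold no target
PRESENT_T = 0.95 # at or above this, the cell is judged to hold a target


def update_cell(p, detected):
    """Assumed Bayes update of one cell's target-existence probability p."""
    if detected:
        num = P_D * p
        den = P_D * p + P_F * (1.0 - p)
    else:
        num = (1.0 - P_D) * p
        den = (1.0 - P_D) * p + (1.0 - P_F) * (1.0 - p)
    return num / den


def classify(p):
    """Map a cell probability to a decision using the two thresholds."""
    if p >= PRESENT_T:
        return "present"
    if p <= ABSENT_T:
        return "absent"
    return "uncertain"


if __name__ == "__main__":
    p = 0.5  # assumed uninformative prior
    for step in range(2):
        p = update_cell(p, detected=True)
        print(step + 1, round(p, 4), classify(p))
```

Under these assumptions, one positive observation moves a 0.5 prior to 0.9, which is still below the 0.95 threshold, and a second consecutive positive observation moves it to about 0.988, which crosses it. Two consecutive negative observations likewise bring the cell below 0.05, so a cell needs at least two consistent looks before it is resolved.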
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, W.; Chen, Y.; Zhang, Y.; Chen, Y.; Du, Y. Collaborative Search Algorithm for Multi-UAVs Under Interference Conditions: A Multi-Agent Deep Reinforcement Learning Approach. Drones 2025, 9, 445. https://doi.org/10.3390/drones9060445