A Deep Reinforcement Learning-Driven Seagull Optimization Algorithm for Solving Multi-UAV Task Allocation Problem in Plateau Ecological Restoration
Abstract
1. Introduction
2. Related Work
3. Problem Formulation
3.1. Scenario Description
3.2. Restrictive Conditions
3.3. Objective Functions
4. A Deep Reinforcement Learning-Driven Seagull Optimization Algorithm
4.1. Seagull Optimization Algorithm
4.1.1. Migration Behavior
- (1) Avoid collision. The specific formula is
- (2) Move in the direction of the optimal seagull position. This is shown in detail in Equation (9):
- (3) Constantly approach the optimal seagull position. The specific formula is
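For reference, the three migration steps are usually written as follows in the original seagull optimization algorithm of Dhiman and Kumar. The symbols below follow that formulation and are an assumption about the notation used in this section; the equation numbering of this paper is not reproduced.

$$
\begin{aligned}
\vec{C}_s &= A \times \vec{P}_s(x), & A &= f_c - x \cdot \frac{f_c}{Max_{iteration}} \\
\vec{M}_s &= B \times \left(\vec{P}_{bs}(x) - \vec{P}_s(x)\right), & B &= 2 \times A^2 \times rd \\
\vec{D}_s &= \left|\vec{C}_s + \vec{M}_s\right|
\end{aligned}
$$

Here $\vec{P}_s(x)$ is the current position at iteration $x$, $\vec{P}_{bs}(x)$ is the best position found so far, $f_c$ controls the linear decay of $A$ (commonly set to 2), and $rd$ is a random number in $[0, 1]$.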
4.1.2. Attack Behavior
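As an illustration, the sketch below combines the migration step with the spiral attack of the standard SOA (Dhiman and Kumar): $x' = r\cos k$, $y' = r\sin k$, $z' = rk$, $r = u e^{kv}$, and the new position $D_s \cdot x' y' z' + P_{bs}$. The function name `soa_update` and the default values of `fc`, `u`, and `v` are assumptions made for this example, not values taken from the paper.

```python
import math
import random


def soa_update(position, best, iteration, max_iter, fc=2.0, u=1.0, v=1.0, rng=None):
    """One standard SOA position update (migration followed by spiral attack)."""
    rng = rng or random.Random()

    # Migration: A decays linearly from fc to 0 over the run; B balances
    # exploration and exploitation.
    A = fc - iteration * (fc / max_iter)
    B = 2.0 * A * A * rng.random()

    # Attack: spiral motion with random angle k in [0, 2*pi].
    k = rng.uniform(0.0, 2.0 * math.pi)
    r = u * math.exp(k * v)
    spiral = (r * math.cos(k)) * (r * math.sin(k)) * (r * k)

    new_position = []
    for p, pb in zip(position, best):
        c = A * p              # collision avoidance
        m = B * (pb - p)       # movement toward the best position
        d = abs(c + m)         # distance to the best position
        new_position.append(d * spiral + pb)
    return new_position
```

At the final iteration `A` and `B` are both zero, so the update collapses onto the best position; earlier iterations spread candidates around it.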
4.2. Theoretical Framework for DRL-SOA
Algorithm 1 DRL-SOA
Input: the parameter configuration
Output: the globally optimal individual position
Initialize algorithm-related parameters
Initialize the seagull population and calculate individual fitness
Set initial parameters: population size
while … do
    for … to … do
        DQNAgent1 selects the decay strategy for parameter A
        Update the seagull migration positions
        DQNAgent2 selects the local search strategy and updates the seagull individual's best attack position
        DQNAgent3 selects other strategies to further optimize and update the seagull's position
        Update the best position and fitness of the seagull individual within the current cycle
    end for
end while
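The control flow of Algorithm 1 can be sketched as below. This is a simplified illustration under several assumptions: the three DQN agents are replaced by a much simpler epsilon-greedy selector over running mean rewards (`StrategySelector`), the strategy pools `DECAYS`, `LOCAL_STEPS`, and `EXTRAS` are hypothetical, the attack step is a random move around the best position rather than the full spiral, and the objective is a toy sphere function rather than the task allocation model.

```python
import random


class StrategySelector:
    """Stand-in for a DQN agent: epsilon-greedy over running mean rewards."""

    def __init__(self, n_actions, rng, epsilon=0.2):
        self.values = [0.0] * n_actions
        self.counts = [0] * n_actions
        self.rng = rng
        self.epsilon = epsilon

    def select(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def update(self, action, reward):
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


# Hypothetical strategy pools (assumptions, not the paper's strategies).
DECAYS = [lambda t, T: 2.0 * (1 - t / T), lambda t, T: 2.0 * (1 - t / T) ** 2]
LOCAL_STEPS = [0.5, 0.05]
EXTRAS = [lambda x, rng: x, lambda x, rng: [v + rng.gauss(0.0, 0.01) for v in x]]


def sphere(x):
    """Toy objective to minimize."""
    return sum(v * v for v in x)


def drl_soa(fitness, dim=2, pop_size=10, max_iter=50, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(pop_size)]
    fit = [fitness(x) for x in pop]
    best_i = min(range(pop_size), key=lambda i: fit[i])
    best, best_fit = pop[best_i][:], fit[best_i]
    agents = [StrategySelector(len(s), rng) for s in (DECAYS, LOCAL_STEPS, EXTRAS)]
    history = []
    for t in range(max_iter):
        for i in range(pop_size):
            a1, a2, a3 = (agent.select() for agent in agents)
            # Agent 1: decay strategy for A, then the migration distance.
            A = DECAYS[a1](t, max_iter)
            B = 2.0 * A * A * rng.random()
            dist = [abs(A * p + B * (b - p)) for p, b in zip(pop[i], best)]
            # Agent 2: local search step around the best position.
            step = LOCAL_STEPS[a2]
            cand = [b + step * d * rng.uniform(-1.0, 1.0) for d, b in zip(dist, best)]
            # Agent 3: optional extra refinement.
            cand = EXTRAS[a3](cand, rng)
            cand_fit = fitness(cand)
            # Reward is the fitness improvement of this individual.
            reward = fit[i] - cand_fit
            for agent, action in zip(agents, (a1, a2, a3)):
                agent.update(action, reward)
            if cand_fit < fit[i]:
                pop[i], fit[i] = cand, cand_fit
            if cand_fit < best_fit:
                best, best_fit = cand[:], cand_fit
        history.append(best_fit)
    return best, history
```

Calling `drl_soa(sphere)` returns the best position found and the best fitness after each outer iteration. Rewarding each selector with the fitness improvement is one plausible design; the reward-and-punishment function of Section 4.3.3 may differ.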
4.3. Deep Reinforcement Learning
4.3.1. State Set Design
4.3.2. Action Space Design
4.3.3. Reward-and-Punishment Function Design
4.3.4. Action Selection Strategy
4.3.5. The Process of DQNAgent Adjusting the Optimization Strategies
5. Experiments and Analysis
5.1. Design of Experiments
5.2. Hyperparameter Tuning
5.3. Algorithm Performance Testing
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Scope of Work | Number of UAVs | Task Size | Range of Task Positions |
---|---|---|---|
3 km × 3 km | 6 | 20 | (100, 100) to (3000, 3000) |
3 km × 3 km | 6 | 30 | (100, 100) to (3000, 3000) |
4 km × 4 km | 10 | 40 | (100, 100) to (4000, 4000) |
4 km × 4 km | 10 | 50 | (100, 100) to (4000, 4000) |
Item | Description |
---|---|
Processor | Intel® Core™ i5-8300H CPU @ 2.30 GHz (Intel, Santa Clara, CA, USA) |
RAM | 8 GB |
OS | Windows 11 (64-bit) |
Python version | Python 3.9 |
Experiment Number | α | γ |
---|---|---|
1 | 0.001 | 0.95 |
2 | 0.001 | 0.9 |
3 | 0.001 | 0.85 |
4 | 0.001 | 0.8 |
5 | 0.01 | 0.95 |
6 | 0.01 | 0.9 |
7 | 0.01 | 0.85 |
8 | 0.01 | 0.8 |
9 | 0.1 | 0.95 |
10 | 0.1 | 0.9 |
11 | 0.1 | 0.85 |
12 | 0.1 | 0.8 |
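The twelve experiments in the table above form a full grid over three values of α and four values of γ. A minimal sketch of how such a grid could be enumerated is shown below; reading α as the learning rate and γ as the discount factor is an assumption, and the names `ALPHAS`, `GAMMAS`, and `build_grid` are introduced only for this example.

```python
from itertools import product

# Values taken from the table; their interpretation (learning rate, discount
# factor) is an assumption.
ALPHAS = [0.001, 0.01, 0.1]
GAMMAS = [0.95, 0.9, 0.85, 0.8]


def build_grid():
    """Map experiment number (1-based) to its (alpha, gamma) setting."""
    return {
        number: {"alpha": alpha, "gamma": gamma}
        for number, (alpha, gamma) in enumerate(product(ALPHAS, GAMMAS), start=1)
    }


GRID = build_grid()
```

With this ordering, `GRID[6]` corresponds to α = 0.01 and γ = 0.9, matching experiment 6 in the table.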
Experiment Number | Number of Non-Dominated Solutions |
---|---|
1 | 2 |
2 | 3 |
3 | 7 |
4 | 6 |
5 | 5 |
6 | 24 |
7 | 10 |
8 | 3 |
9 | 3 |
10 | 2 |
11 | 4 |
12 | 4 |
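The counts in the table above are read here as numbers of non-dominated (Pareto-optimal) solutions per experiment; that reading is an assumption. The sketch below shows one simple way to count such solutions, assuming all objectives are minimized, and then picks the experiment with the largest count. The helper names `dominates` and `non_dominated` are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points):
    """Return the points that no other point dominates (minimization)."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


# Counts copied from the table (experiment number -> count).
COUNTS = {1: 2, 2: 3, 3: 7, 4: 6, 5: 5, 6: 24, 7: 10, 8: 3, 9: 3, 10: 2, 11: 4, 12: 4}

# Experiment with the most solutions in the table.
BEST_EXPERIMENT = max(COUNTS, key=COUNTS.get)
```

For example, `non_dominated([(1, 2), (2, 1), (2, 2), (3, 3)])` keeps only `(1, 2)` and `(2, 1)`, and `BEST_EXPERIMENT` evaluates to 6.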
Task Size | Test Case Number | DRL-SOA | INSGA-II-MTO | EMoSOA | AWPSO | MOSOS | LeCMPSO |
---|---|---|---|---|---|---|---|
20 | 1 | 1.063 | 0.979 | 0.841 | 0.711 | 0.596 | 0.612 |
20 | 2 | 0.966 | 0.936 | 0.745 | 0.643 | 0.571 | 0.528 |
30 | 1 | 1.048 | 0.827 | 0.727 | 0.772 | 0.603 | 0.635 |
30 | 2 | 0.955 | 0.903 | 0.886 | 0.804 | 0.678 | 0.674 |
40 | 1 | 0.931 | 0.968 | 0.739 | 0.785 | 0.692 | 0.659 |
40 | 2 | 0.950 | 0.830 | 0.693 | 0.632 | 0.552 | 0.691 |
50 | 1 | 1.097 | 0.986 | 0.801 | 0.791 | 0.613 | 0.648 |
50 | 2 | 0.974 | 0.944 | 0.846 | 0.663 | 0.564 | 0.663 |
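The comparison table can be summarized with a few lines of arithmetic. The sketch below computes the mean value per algorithm and lists the test cases in which a given algorithm has the largest value. The table does not name its metric here, so the code makes no claim about whether larger values are better; `ROWS`, `mean_by_algorithm`, and `rows_led_by` are names introduced for this example.

```python
ALGORITHMS = ["DRL-SOA", "INSGA-II-MTO", "EMoSOA", "AWPSO", "MOSOS", "LeCMPSO"]

# (task size, test case number) -> values in the column order of ALGORITHMS.
ROWS = {
    (20, 1): [1.063, 0.979, 0.841, 0.711, 0.596, 0.612],
    (20, 2): [0.966, 0.936, 0.745, 0.643, 0.571, 0.528],
    (30, 1): [1.048, 0.827, 0.727, 0.772, 0.603, 0.635],
    (30, 2): [0.955, 0.903, 0.886, 0.804, 0.678, 0.674],
    (40, 1): [0.931, 0.968, 0.739, 0.785, 0.692, 0.659],
    (40, 2): [0.950, 0.830, 0.693, 0.632, 0.552, 0.691],
    (50, 1): [1.097, 0.986, 0.801, 0.791, 0.613, 0.648],
    (50, 2): [0.974, 0.944, 0.846, 0.663, 0.564, 0.663],
}


def mean_by_algorithm(rows):
    """Mean table value per algorithm, rounded to three decimals."""
    n = len(rows)
    return {
        name: round(sum(values[j] for values in rows.values()) / n, 3)
        for j, name in enumerate(ALGORITHMS)
    }


def rows_led_by(rows, name):
    """Test cases in which the named algorithm has the largest value."""
    j = ALGORITHMS.index(name)
    return [key for key, values in rows.items() if values[j] == max(values)]
```

On these numbers, DRL-SOA has the largest value in seven of the eight test cases; the exception is task size 40, test case 1, where INSGA-II-MTO reports 0.968 against 0.931.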
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Qin, L.; Zhou, Z.; Liu, H.; Yan, Z.; Dai, Y. A Deep Reinforcement Learning-Driven Seagull Optimization Algorithm for Solving Multi-UAV Task Allocation Problem in Plateau Ecological Restoration. Drones 2025, 9, 436. https://doi.org/10.3390/drones9060436