This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

Multi-Agent Reinforcement Learning for Multi-UAV Pursuit with Full Planar Motion and a Limited Detectable Region

Department of Mechanical Engineering, Chung-Ang University, Seoul 06974, Republic of Korea
* Author to whom correspondence should be addressed.
Machines 2026, 14(4), 413; https://doi.org/10.3390/machines14040413
Submission received: 27 February 2026 / Revised: 28 March 2026 / Accepted: 6 April 2026 / Published: 8 April 2026

Abstract

Previous studies have considered sensing constraints and UAV dynamics, but most have relied on unrealistic sensing limitations and simplified dynamic models. As a result, their simulation results can differ significantly from real-world deployment. To address this issue, this study incorporates high-fidelity sensing constraints and UAV dynamics into a multi-agent reinforcement learning approach, focusing on the practical interplay between field-of-view (FOV) limitations and pursuit strategies. First, the proposed reward accounts for the sensing constraints through a gaze-alignment reward, which varies with the FOV condition, and a capturability reward, which encourages transitions toward a capturable region. Second, realistic UAV dynamics, including lateral motion, forward motion, and yawing, are modeled in the simulation environment to reduce the sim-to-real gap. Quantitative evaluations show that the proposed formulation significantly improves capture performance under diverse sensing conditions. The capturability reward increases the capture success rate by 11.4%. When the maximum speed of the evading UAV was 2 m/s higher than that of the pursuing UAVs, every capture trial failed without lateral motion. With lateral motion enabled, the success rate rose to 99.2%, highlighting the need for lateral motion.
Keywords: multi-UAV; cooperative pursuit; multi-agent reinforcement learning; MAPPO; pursuit evasion; limited detectable region
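
The abstract names two ingredients without giving their equations: planar motion with forward, lateral, and yaw components, and a gaze-alignment reward that depends on whether the target is inside the field of view. The Python sketch below illustrates one plausible reading of both. Everything in it is an assumption made for illustration: the purely kinematic update (the paper models realistic dynamics, which this does not reproduce), the names `PlanarState`, `step`, `in_fov`, and `gaze_alignment_reward`, the parameters `half_fov` and `max_range`, and the reward shape itself. It is not the authors' formulation.

```python
import math
from dataclasses import dataclass


@dataclass
class PlanarState:
    x: float    # world-frame position [m] (hypothetical state layout)
    y: float
    yaw: float  # heading [rad]


def wrap_angle(a: float) -> float:
    """Wrap an angle to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def step(s: PlanarState, v_fwd: float, v_lat: float,
         yaw_rate: float, dt: float) -> PlanarState:
    """One explicit-Euler kinematic step.

    v_fwd and v_lat are body-frame velocities; they are rotated into the
    world frame by the current yaw. This is an assumed simplification,
    not the dynamic model used in the paper.
    """
    c, sn = math.cos(s.yaw), math.sin(s.yaw)
    return PlanarState(
        x=s.x + (v_fwd * c - v_lat * sn) * dt,
        y=s.y + (v_fwd * sn + v_lat * c) * dt,
        yaw=wrap_angle(s.yaw + yaw_rate * dt),
    )


def bearing_error(p: PlanarState, tx: float, ty: float) -> float:
    """Signed angle from the pursuer's heading to the target direction."""
    return wrap_angle(math.atan2(ty - p.y, tx - p.x) - p.yaw)


def in_fov(p: PlanarState, tx: float, ty: float,
           half_fov: float, max_range: float) -> bool:
    """True if the target lies inside an assumed sector-shaped detectable region."""
    dist = math.hypot(tx - p.x, ty - p.y)
    return dist <= max_range and abs(bearing_error(p, tx, ty)) <= half_fov


def gaze_alignment_reward(p: PlanarState, tx: float, ty: float,
                          half_fov: float, max_range: float) -> float:
    """Hypothetical reward that switches form with the FOV condition."""
    err = abs(bearing_error(p, tx, ty))
    if in_fov(p, tx, ty, half_fov, max_range):
        return math.cos(err)   # target visible: reward centering it
    return -err / math.pi      # target not visible: penalize misalignment
```

In this sketch, a pursuer with nonzero `v_lat` can sidestep while keeping its yaw, and therefore its detectable region, pointed at the target; a model without lateral motion would have to turn away from the target to move sideways.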

Share and Cite

MDPI and ACS Style

Huh, S.; Lim, S.; Jang, H.; Byun, W.; Yu, S.; Nam, W. Multi-Agent Reinforcement Learning for Multi-UAV Pursuit with Full Planar Motion and a Limited Detectable Region. Machines 2026, 14, 413. https://doi.org/10.3390/machines14040413
