Multi-Agent Reinforcement Learning for Multi-UAV Pursuit with Full Planar Motion and a Limited Detectable Region

Huh, Soobin; Lim, Sungwon; Jang, Hyeokjae; Byun, Woohyun; Yu, Suhyeong; Nam, Woochul

doi:10.3390/machines14040413

This is an early access version; the complete PDF, HTML, and XML versions will be available soon.

Open Access Article

by Soobin Huh, Sungwon Lim, Hyeokjae Jang, Woohyun Byun, Suhyeong Yu and Woochul Nam *

Department of Mechanical Engineering, Chung-Ang University, Seoul 06974, Republic of Korea

* Author to whom correspondence should be addressed.

Machines 2026, 14(4), 413; https://doi.org/10.3390/machines14040413

Submission received: 27 February 2026 / Revised: 28 March 2026 / Accepted: 6 April 2026 / Published: 8 April 2026

(This article belongs to the Special Issue Advanced Planning, Perception, and Control for Autonomous Vehicles and Robots)

Abstract

Although previous studies have considered sensing constraints and UAV dynamics, most have relied on unrealistic sensing limitations and simplified dynamic models. As a result, these approaches can suffer from a significant discrepancy between simulation results and real-world deployment. To address this issue, this study incorporates high-fidelity sensing constraints and UAV dynamics into a multi-agent reinforcement learning approach, focusing on the practical interplay between field-of-view (FOV) limitations and pursuit strategies. First, the proposed reward accounts for the sensing constraints through a gaze-alignment reward, which varies with the FOV condition, and a capturability reward, which encourages transitions toward a capturable region. Second, realistic UAV dynamics, including lateral motion, forward motion, and yawing, are modeled in a simulation environment to reduce the sim-to-real gap. Quantitative evaluations demonstrated that the proposed formulation significantly improved capture performance under diverse sensing conditions. The capturability reward increased the capture success rate by 11.4%. When the maximum speed of the evading UAV was 2 m/s higher than that of the pursuing UAVs, all capture trials failed without lateral motion. With lateral motion enabled, however, the success rate rose to 99.2%, highlighting the need for lateral motion.

Keywords: multi-UAV; cooperative pursuit; multi-agent reinforcement learning; MAPPO; pursuit evasion; limited detectable region
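As a rough illustration of the two ingredients named in the abstract, the sketch below shows one possible way to model full planar motion (forward, lateral, and yaw commands) and a limited detectable region with a gaze-alignment term. It is not the authors' formulation: the function names, the first-order kinematic update, the sector-shaped detectable region, and the reward shape are all assumptions made for illustration, since the abstract does not give the actual equations.

```python
import math


def wrap_angle(angle):
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def step_planar(x, y, yaw, v_forward, v_lateral, yaw_rate, dt):
    """Advance a planar pose by one time step (hypothetical kinematic model).

    v_forward and v_lateral are body-frame velocities; positive lateral
    velocity points to the left of the heading. This simple Euler update is
    an assumption, not the dynamics model used in the paper.
    """
    # Rotate the body-frame velocity into the world frame.
    vx = v_forward * math.cos(yaw) - v_lateral * math.sin(yaw)
    vy = v_forward * math.sin(yaw) + v_lateral * math.cos(yaw)
    return x + vx * dt, y + vy * dt, wrap_angle(yaw + yaw_rate * dt)


def bearing_error(x, y, yaw, tx, ty):
    """Signed angle between the heading and the line of sight to the target."""
    return wrap_angle(math.atan2(ty - y, tx - x) - yaw)


def in_fov(x, y, yaw, tx, ty, half_fov, max_range):
    """Check whether the target lies in an assumed sector-shaped detectable region."""
    distance = math.hypot(tx - x, ty - y)
    return distance <= max_range and abs(bearing_error(x, y, yaw, tx, ty)) <= half_fov


def gaze_alignment_reward(x, y, yaw, tx, ty, half_fov, max_range):
    """Illustrative reward that changes with the FOV condition (assumed shape).

    Inside the detectable region the reward is the cosine of the bearing
    error; outside it, a penalty proportional to the bearing error nudges
    the agent to turn back toward the target.
    """
    error = bearing_error(x, y, yaw, tx, ty)
    if in_fov(x, y, yaw, tx, ty, half_fov, max_range):
        return math.cos(error)
    return -abs(error) / math.pi
```

With lateral velocity available, an agent in this sketch can sidestep while keeping its heading on the target, which is the kind of behavior the abstract credits for the higher capture rate against a faster evader.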
