Drones
  • This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
  • Article
  • Open Access

23 January 2026

From Human Teams to Autonomous Swarms: A Reinforcement Learning-Based Benchmarking Framework for Unmanned Aerial Vehicle Search and Rescue Missions

1 Department of Data Science, FH Kufstein Tirol, 6330 Kufstein, Austria
2 Department of Mathematics and Informatics, University of Passau, 94030 Passau, Germany
3 Institute of Mountain Emergency Medicine, Eurac Research, 39100 Bolzano, Italy
4 Department of Sport Science–Medical Section, University of Innsbruck, 6020 Innsbruck, Austria
Drones 2026, 10(2), 79; https://doi.org/10.3390/drones10020079
(registering DOI)
This article belongs to the Special Issue Drone Communication, Networking, and Trajectory Control in Urban Environments

Abstract

The adoption of novel technologies such as Unmanned Aerial Vehicles (UAVs) in Search and Rescue (SAR) operations remains limited, so their full potential is not yet realized. Although UAVs have been deployed on an ad hoc basis, typically under manual control by dedicated operators, assisted and fully autonomous configurations remain largely unexplored. In this study, three SAR configurations are systematically evaluated within a unified benchmarking framework: conventional ground missions, UAV-assisted missions, and fully autonomous UAV operations. Target localization time served as the key performance indicator and as the basis for comparing the three configurations. The conventional and assisted configurations were tested experimentally with physical hardware in a controlled outdoor setting, in which rescue teams carried out simulated callouts. The autonomous swarm was evaluated in simulation using a multi-agent Reinforcement Learning (RL) method based on the Proximal Policy Optimization (PPO) algorithm, which allowed decentralized cooperative actions to be optimized for efficient exploration of a partially observed three-dimensional environment. The results showed that the autonomous swarm significantly outperformed the conventional and assisted approaches in terms of speed and coverage. Finally, a detailed description of the framework’s integration into an operational system is provided.
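
The abstract names PPO as the training algorithm but gives no implementation detail. For orientation only, the following is a minimal sketch of the clipped surrogate objective on which PPO is generally based. It is not the authors' code: the function name `ppo_clipped_objective`, the default `clip_eps=0.2`, and the use of plain Python lists are assumptions made for illustration, and the multi-agent and environment aspects of the study are not modeled here.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    All names and the default clip_eps are illustrative assumptions.
    """
    if not (len(new_logp) == len(old_logp) == len(advantages)):
        raise ValueError("inputs must have equal length")
    if not advantages:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the policy far from the one that collected the data within a single update, which is the usual motivation given for PPO's stability. In a decentralized swarm setting, such an objective would typically be evaluated per agent on that agent's own partial observations, although the abstract does not state how the authors arranged this.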
