Open Access Article

Sim-to-Real Quadrotor Landing via Sequential Deep Q-Networks and Domain Randomization

by Riccardo Polvara, Massimiliano Patacchiola, Marc Hanheide and Gerhard Neumann

1 Lincoln Centre for Autonomous Systems, University of Lincoln, Lincoln LN6 7TS, UK
2 School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, UK
3 Karlsruhe Institute of Technology, 76131 Karlsruhe, Germany
4 Bosch Center for Artificial Intelligence, 72076 Tübingen, Germany
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Robotics 2020, 9(1), 8; https://doi.org/10.3390/robotics9010008
Received: 28 January 2020 / Revised: 21 February 2020 / Accepted: 22 February 2020 / Published: 25 February 2020
(This article belongs to the Section Intelligent Robots and Mechatronics)
The autonomous landing of an Unmanned Aerial Vehicle (UAV) on a marker is one of the most challenging problems in robotics. Many solutions have been proposed, with the best results achieved via customized geometric features and external sensors. This paper discusses for the first time the use of deep reinforcement learning as an end-to-end learning paradigm to find a policy for autonomous UAV landing. Our method is based on a divide-and-conquer paradigm that splits the task into sequential sub-tasks, each assigned to a Deep Q-Network (DQN), hence the name Sequential Deep Q-Network (SDQN). Each DQN in an SDQN is activated by an internal trigger and represents one component of a high-level control policy that navigates the UAV towards the marker. Several technical solutions were implemented, such as combining vanilla and double DQNs and introducing a partitioned replay buffer to address the problem of sample efficiency. One of the main contributions of this work is showing that an SDQN trained in simulation via domain randomization can effectively generalize to real-world scenarios of increasing complexity. The performance of SDQNs is comparable to that of a state-of-the-art algorithm and of human pilots, while being quantitatively better in noisy conditions.
Keywords: deep reinforcement learning; aerial vehicles; Sim-to-Real
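
The abstract's central mechanism, a high-level policy assembled from sub-task DQNs that hand control to one another via internal triggers, can be illustrated with a short sketch. The following is a minimal, hypothetical Python sketch and not the authors' implementation: the DQNPolicy and SequentialDQN classes, the two-stage split (align, then descend), and the trigger thresholds are all assumptions, and random action selection stands in for a trained Q-network.

```python
import random
from dataclasses import dataclass


@dataclass
class DQNPolicy:
    """Stand-in for a trained Deep Q-Network over a discrete action set."""
    name: str
    actions: tuple

    def act(self, observation):
        # A trained policy would return argmax_a Q(observation, a); a random
        # choice keeps the sketch self-contained and dependency-free.
        return random.choice(self.actions)


def aligned(observation):
    # Hypothetical internal trigger: the UAV is centred over the marker.
    x, y, _ = observation
    return abs(x) < 0.1 and abs(y) < 0.1


def touched_down(observation):
    # Hypothetical internal trigger: the UAV has reached the ground.
    return observation[2] <= 0.05


class SequentialDQN:
    """Chains sub-task DQNs; each trigger hands control to the next network."""

    def __init__(self):
        self.stages = [
            (DQNPolicy("align", ("left", "right", "forward", "backward", "hover")),
             aligned),
            (DQNPolicy("descend", ("down", "hover")), touched_down),
        ]
        self.stage = 0

    def act(self, observation):
        policy, trigger = self.stages[self.stage]
        # When the current sub-task's trigger fires, activate the next DQN.
        if trigger(observation) and self.stage < len(self.stages) - 1:
            self.stage += 1
            policy, _ = self.stages[self.stage]
        return policy.act(observation)


controller = SequentialDQN()
print(controller.act((0.4, -0.2, 1.5)))  # (x, y, altitude) w.r.t. the marker
```

The switching logic, rather than the networks themselves, is what "sequential" in SDQN refers to: control flows through the sub-policies in a fixed order, gated by the internal triggers.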
MDPI and ACS Style

Polvara, R.; Patacchiola, M.; Hanheide, M.; Neumann, G. Sim-to-Real Quadrotor Landing via Sequential Deep Q-Networks and Domain Randomization. Robotics 2020, 9, 8. https://doi.org/10.3390/robotics9010008

AMA Style

Polvara R, Patacchiola M, Hanheide M, Neumann G. Sim-to-Real Quadrotor Landing via Sequential Deep Q-Networks and Domain Randomization. Robotics. 2020; 9(1):8. https://doi.org/10.3390/robotics9010008

Chicago/Turabian Style

Polvara, Riccardo, Massimiliano Patacchiola, Marc Hanheide, and Gerhard Neumann. 2020. "Sim-to-Real Quadrotor Landing via Sequential Deep Q-Networks and Domain Randomization." Robotics 9, no. 1: 8. https://doi.org/10.3390/robotics9010008

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.
