Open Access Article

Learning Reward Function with Matching Network for Mapless Navigation

by Qichen Zhang 1,2, Meiqiang Zhu 1,2,*, Liang Zou 1,2, Ming Li 1,2 and Yong Zhang 1,2
1 Engineering Research Center of Intelligent Control for Underground Space, Ministry of Education, China University of Mining and Technology, Xuzhou 221116, China
2 The School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
* Author to whom correspondence should be addressed.
Sensors 2020, 20(13), 3664; https://doi.org/10.3390/s20133664
Received: 16 May 2020 / Revised: 24 June 2020 / Accepted: 24 June 2020 / Published: 30 June 2020
(This article belongs to the Section Sensor Networks)
Deep reinforcement learning (DRL) has been successfully applied to mapless navigation. An important issue in DRL is the design of a reward function for evaluating the actions of agents. However, designing a robust and suitable reward function depends heavily on the designer's experience and intuition. To address this concern, we employ reward shaping from trajectories on similar navigation tasks without human supervision, and propose a general reward function based on a matching network (MN). The MN-based reward function gains experience by pre-training on trajectories from different navigation tasks and accelerates DRL training on new tasks. The proposed reward function keeps the optimal strategy of DRL unchanged. Simulation results on two static maps show that, with the learned reward function, DRL converges in fewer iterations than state-of-the-art mapless navigation methods. The proposed method also performs well in dynamic maps in which some obstacles move. Even when the test maps differ from the training maps, the proposed strategy is able to complete the navigation tasks without additional training.
Keywords: deep reinforcement learning; reward shaping; matching network; navigation
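
The abstract states that the shaped reward leaves the optimal strategy unchanged. One standard way to obtain that property is potential-based reward shaping, where the shaping term is γΦ(s') − Φ(s). The following is a minimal sketch under that assumption, not the authors' implementation: the hand-written `potential` (negative Euclidean distance to the goal) is a hypothetical stand-in for whatever the matching network learns, and all function names and the toy 2-D trajectories are illustrative.

```python
import math

def potential(state, goal):
    # Hypothetical potential: negative Euclidean distance to the goal.
    # In the paper this role would be played by a learned model (assumption).
    return -math.dist(state, goal)

def shaped_reward(reward, state, next_state, goal, gamma=0.99, done=False):
    # Potential-based shaping: r + gamma * phi(s') - phi(s).
    # The potential of a terminal state is taken to be zero.
    phi_next = 0.0 if done else potential(next_state, goal)
    return reward + gamma * phi_next - potential(state, goal)

def discounted_return(rewards, gamma=0.99):
    # Sum of gamma^t * r_t over one episode.
    return sum(gamma ** t * r for t, r in enumerate(rewards))

def shaped_return(trajectory, rewards, goal, gamma=0.99):
    # trajectory has len(rewards) + 1 states; the last state is terminal.
    last = len(rewards) - 1
    shaped = [
        shaped_reward(r, trajectory[t], trajectory[t + 1], goal, gamma, done=(t == last))
        for t, r in enumerate(rewards)
    ]
    return discounted_return(shaped, gamma)

goal = (1.0, 1.0)
path_a = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
path_b = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
env_rewards = [0.0, 1.0]  # toy environment rewards: 1 on reaching the goal

# The shaping terms telescope, so both paths gain the same offset -phi(s0).
offset_a = shaped_return(path_a, env_rewards, goal) - discounted_return(env_rewards)
offset_b = shaped_return(path_b, env_rewards, goal) - discounted_return(env_rewards)
```

Because every episode that starts in the same state receives the same offset, the ranking of policies is the same under the shaped and unshaped returns; only the density of the learning signal changes.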

MDPI and ACS Style

Zhang, Q.; Zhu, M.; Zou, L.; Li, M.; Zhang, Y. Learning Reward Function with Matching Network for Mapless Navigation. Sensors 2020, 20, 3664.

