Open Access Article

Completing Explorer Games with a Deep Reinforcement Learning Framework Based on Behavior Angle Navigation

College of Information and Communication, Harbin Engineering University, Harbin 150001, China
* Author to whom correspondence should be addressed.
Electronics 2019, 8(5), 576; https://doi.org/10.3390/electronics8050576
Received: 7 May 2019 / Revised: 20 May 2019 / Accepted: 20 May 2019 / Published: 25 May 2019
(This article belongs to the Section Systems & Control Engineering)
PDF [9525 KB, uploaded 25 May 2019]

Abstract

In cognitive electronic warfare, when a typical combat vehicle, such as an unmanned combat air vehicle (UCAV), uses radar sensors to explore an unknown space, target searching often fails because of an inefficient servoing/tracking system. To solve this problem, we developed an autonomous reasoning search method that generates efficient decision-making actions and guides the UCAV to the target area as early as possible. In the high-dimensional continuous action space, the UCAV’s maneuvering strategies are subject to physical constraints. We first record the path histories of the UCAV as a sample set for supervised experiments and then construct a grid cell network using long short-term memory (LSTM) to generate a new displacement prediction that replaces the target location estimation. Finally, we enable a variety of continuous-control-based deep reinforcement learning algorithms to output optimal/sub-optimal decision-making actions. All these tasks are performed in a three-dimensional target-searching simulator, i.e., the Explorer game. Notably, we use the behavior angle (BHA) for the first time as the main factor in the reward shaping of the deep reinforcement learning framework, and the trained UCAV achieves a 99.96% target destruction rate, i.e., the game win rate, with a 0.1 s operating cycle.
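The abstract does not give the exact reward formula, but the core idea of behavior-angle-based reward shaping can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: it assumes the behavior angle is the angle between the UCAV's velocity vector and the line of sight to the target, and maps it linearly to a dense reward that peaks when the agent flies straight at the target. The function names (`behavior_angle`, `shaped_reward`) and the scaling constant `k` are hypothetical.

```python
import math


def behavior_angle(velocity, to_target):
    """Angle (radians) between the agent's velocity vector and the
    line-of-sight vector pointing from the agent toward the target."""
    dot = sum(v * t for v, t in zip(velocity, to_target))
    nv = math.sqrt(sum(v * v for v in velocity))
    nt = math.sqrt(sum(t * t for t in to_target))
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_a = max(-1.0, min(1.0, dot / (nv * nt)))
    return math.acos(cos_a)


def shaped_reward(velocity, agent_pos, target_pos, k=1.0):
    """Dense shaping term: +k when heading directly at the target
    (behavior angle 0), 0 when perpendicular, -k when flying away."""
    to_target = [t - a for a, t in zip(agent_pos, target_pos)]
    angle = behavior_angle(velocity, to_target)
    return k * (1.0 - 2.0 * angle / math.pi)
```

A term like this would typically be added to the sparse win/loss reward of the Explorer game so that every 0.1 s control step gives the continuous-control policy a gradient toward the target, rather than only a terminal signal.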
Keywords: target-searching; cognitive electronic warfare; deep reinforcement learning; continuous control-based navigation optimization; behavior angle
This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Share & Cite This Article

MDPI and ACS Style

You, S.; Diao, M.; Gao, L. Completing Explorer Games with a Deep Reinforcement Learning Framework Based on Behavior Angle Navigation. Electronics 2019, 8, 576.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Electronics EISSN 2079-9292, published by MDPI AG, Basel, Switzerland.