Open Access Article
Sensors 2018, 18(11), 3859; https://doi.org/10.3390/s18113859

Reinforcement Learning-Based Multi-AUV Adaptive Trajectory Planning for Under-Ice Field Estimation

1 Department of Electrical and Computer Engineering, Michigan Technological University, Houghton, MI 49931, USA
2 Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ 07030, USA
3 Department of Mechanical Engineering-Engineering Mechanics, Michigan Technological University, Houghton, MI 49931, USA
* Author to whom correspondence should be addressed.
Received: 1 October 2018 / Revised: 26 October 2018 / Accepted: 4 November 2018 / Published: 9 November 2018
(This article belongs to the Special Issue Underwater Sensing, Communication, Networking and Systems)

Abstract

This work studies online learning-based trajectory planning for multiple autonomous underwater vehicles (AUVs) to estimate a water parameter field of interest in the under-ice environment. A centralized system is considered, in which several fixed access points on the ice layer serve as gateways for communications between the AUVs and a remote data fusion center. We model the water parameter field of interest as a Gaussian process with unknown hyper-parameters. The AUV sampling trajectories are determined on an epoch-by-epoch basis. At the end of each epoch, the access points relay the field samples observed by all the AUVs to the fusion center, which computes the posterior distribution of the field via Gaussian process regression and estimates the field hyper-parameters. The optimal trajectories of all the AUVs in the next epoch are then determined to maximize a long-term reward defined in terms of field uncertainty reduction and AUV mobility cost, subject to kinematic, communication, and sensing-area constraints. We formulate the adaptive trajectory planning problem as a Markov decision process (MDP) and design a reinforcement learning-based online learning algorithm to determine the optimal AUV trajectories in a constrained continuous space. Simulation results show that the proposed learning-based trajectory planning algorithm performs comparably to a benchmark method that assumes perfect knowledge of the field hyper-parameters.
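
To make the reward structure described in the abstract concrete, the following minimal Python sketch (not the authors' implementation) computes a one-epoch reward as the reduction in Gaussian-process posterior field variance achieved by the new samples, minus a weighted travel-distance (mobility) cost. The RBF kernel, the hyper-parameter values (length_scale, sigma_f, sigma_n), the mobility_weight, and the function names are illustrative assumptions; the paper treats the hyper-parameters as unknown and estimates them online at the fusion center.

import numpy as np

def rbf_kernel(A, B, length_scale=1.0, sigma_f=1.0):
    # Squared-exponential covariance between two sets of 2-D sample locations.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return sigma_f ** 2 * np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior_variance(X_sampled, X_grid, length_scale, sigma_f, sigma_n):
    # Posterior variance of the Gaussian-process field on a grid of points,
    # conditioned on noisy samples taken at X_sampled.
    K = rbf_kernel(X_sampled, X_sampled, length_scale, sigma_f) \
        + sigma_n ** 2 * np.eye(len(X_sampled))
    K_star = rbf_kernel(X_grid, X_sampled, length_scale, sigma_f)
    prior_var = sigma_f ** 2 * np.ones(len(X_grid))
    reduction = np.einsum("ij,ij->i", K_star, np.linalg.solve(K, K_star.T).T)
    return np.clip(prior_var - reduction, 0.0, None)

def epoch_reward(X_sampled, X_new, X_grid, auv_positions, waypoints,
                 length_scale=1.0, sigma_f=1.0, sigma_n=0.1, mobility_weight=0.05):
    # Reward for one epoch: total reduction in posterior field variance from
    # the newly collected samples, minus a weighted travel-distance cost.
    var_before = gp_posterior_variance(X_sampled, X_grid,
                                       length_scale, sigma_f, sigma_n).sum()
    var_after = gp_posterior_variance(np.vstack([X_sampled, X_new]), X_grid,
                                      length_scale, sigma_f, sigma_n).sum()
    travel_cost = np.linalg.norm(waypoints - auv_positions, axis=1).sum()
    return (var_before - var_after) - mobility_weight * travel_cost

In the paper, the next-epoch waypoints are chosen by a reinforcement-learning policy over an MDP in a constrained continuous action space, with the hyper-parameters re-estimated each epoch; the sketch above only illustrates how such a reward could be evaluated.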
Keywords: underwater communication networks; under-ice exploration; field estimation; AUVs; adaptive trajectory planning; reinforcement learning

This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Cite This Article

MDPI and ACS Style

Wang, C.; Wei, L.; Wang, Z.; Song, M.; Mahmoudian, N. Reinforcement Learning-Based Multi-AUV Adaptive Trajectory Planning for Under-Ice Field Estimation. Sensors 2018, 18, 3859.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Sensors (EISSN 1424-8220) is published by MDPI AG, Basel, Switzerland.