Open Access Article
Mach. Learn. Knowl. Extr. 2018, 1(1), 2; doi:10.3390/make1010002

Learning to Teach Reinforcement Learning Agents

Department of Informatics, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
Borealis AI, University of Alberta, CCIS 3-232, Edmonton, AB T6G 2M9, Canada
Author to whom correspondence should be addressed.
Received: 19 September 2017 / Revised: 17 November 2017 / Accepted: 1 December 2017 / Published: 6 December 2017


In this article, we study the transfer learning model of action advice under a budget. We focus on reinforcement learning teachers that provide action advice to heterogeneous students playing the game of Pac-Man under a limited advice budget. First, we examine several critical factors affecting advice quality in this setting, such as the teacher's average performance, the variance of that performance, and the importance of reward discounting in advising. The experiments show that the best performers are not always the best teachers and reveal the non-trivial importance of the coefficient of variation (CV), a statistic that relates variance to the corresponding mean, for choosing policies that generate advice. Second, we study policy learning for distributing advice under a budget. Whereas most methods in the relevant literature rely on heuristics for advice distribution, we formulate the problem as a learning problem and propose a novel reinforcement learning algorithm capable of learning when to advise. The proposed algorithm can advise even when it has no knowledge of the student's intended action, and it needs significantly less training time than previous learning approaches. Finally, we argue that learning to advise under a budget is an instance of a more generic learning problem: Constrained Exploitation Reinforcement Learning.
Keywords: machine learning; reinforcement learning; transfer learning; action advice; machine teaching
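The abstract describes the coefficient of variation (CV) as a statistic relating variance to the corresponding mean. As a minimal sketch, assuming the standard definition (standard deviation divided by mean), the following computes the CV of per-episode scores for two candidate teacher policies. The policy names, the scores, and the "lowest CV" selection rule are hypothetical illustrations, not data or a procedure taken from the article.

```python
import statistics


def coefficient_of_variation(scores):
    """Return the population standard deviation divided by the mean."""
    mean = statistics.mean(scores)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.pstdev(scores) / mean


# Hypothetical per-episode game scores for two candidate teachers.
teacher_scores = {
    "teacher_a": [3000.0, 1000.0, 5000.0, 3000.0],  # higher mean, high spread
    "teacher_b": [2600.0, 2800.0, 2700.0, 2700.0],  # lower mean, low spread
}

cv_by_teacher = {
    name: coefficient_of_variation(scores)
    for name, scores in teacher_scores.items()
}

# Assumed selection rule for illustration only: prefer the lowest CV.
most_consistent = min(cv_by_teacher, key=cv_by_teacher.get)
```

In this made-up data, teacher_a has the higher average score but also the higher CV, which shows how ranking candidate teachers by mean performance and ranking them by CV can disagree.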

Figure 1

This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


MDPI and ACS Style

Fachantidis, A.; Taylor, M.E.; Vlahavas, I. Learning to Teach Reinforcement Learning Agents. Mach. Learn. Knowl. Extr. 2018, 1, 2.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Mach. Learn. Knowl. Extr. EISSN 2504-4990. Published by MDPI AG, Basel, Switzerland.