Goal-Directed Planning and Goal Understanding by Extended Active Inference: Evaluation through Simulated and Physical Robot Experiments
Abstract
1. Introduction
2. Related Studies
3. The Proposed Model
3.1. Model Architecture
3.2. Learning
3.3. Online Goal-Directed Action Plan Generation
3.4. Goal Inference
4. Experiments
4.1. Experiment 1: Simulated Mobile Agent in a 2D Space
- Generalization in learning for goal-directed plan generation;
- Goal-directed plan generation for different types of goals;
- Goal understanding from sensory observation for different types of goals;
- Rational plan generation.
4.1.1. Experiment 1A: Generalization in Plan Generation by Learning
4.1.2. Experiment 1B: Goal-Directed Plan Generation for Different Types of Goals
4.1.3. Experiment 1C: Goal Inference by Sensory Observation
4.1.4. Experiment 1D: Goal-Directed Planning Enforcing the Well-Posed Condition
4.2. Experiment 2: Object Manipulation by a Physical Humanoid Robot
4.2.1. Experiment 2A: Goal-Directed Plan Generation and Execution
4.2.2. Experiment 2B: Goal Understanding
5. Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
| Layer | 1 | 2 | 3 |
|---|---|---|---|
| | 60 | 40 | 20 |
| | 6 | 4 | 2 |
| | 2 | 4 | 8 |
| w | 0.0001 | 0.0005 | 0.001 |
| | 1.0 | 1.0 | 1.0 |
| | Reaching | Cycling |
|---|---|---|
| NRMSD | 0.033241 | 0.028158 |
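
The NRMSD values in the table above are normalized root-mean-square deviations. The following is a minimal sketch of how such a value could be computed, assuming the RMSD is normalized by the range of the target values; the table itself does not state which normalization was used, and the function name `nrmsd` and the sample sequences are hypothetical.

```python
import math


def nrmsd(predicted, target):
    """RMSD between two sequences, divided by the range of the target.

    Assumption: range normalization. Other conventions (for example,
    dividing by the mean of the target) also exist.
    """
    if len(predicted) != len(target) or not target:
        raise ValueError("sequences must be non-empty and of equal length")
    # Mean squared error over all paired samples.
    mse = sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)
    value_range = max(target) - min(target)
    if value_range == 0:
        raise ValueError("target range is zero; NRMSD is undefined")
    return math.sqrt(mse) / value_range


# Hypothetical one-dimensional trajectories, for illustration only.
target = [0.0, 0.5, 1.0, 0.5]
predicted = [0.1, 0.5, 0.9, 0.5]
print(round(nrmsd(predicted, target), 4))
```

Under this definition, a value of 0 means the predicted sequence matches the target exactly, and larger values indicate a larger deviation relative to the spread of the target.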
| Layer | 1 | 2 | 3 |
|---|---|---|---|
| | 60 | 40 | 20 |
| | 6 | 4 | 2 |
| | 2 | 10 | 20 |
| w | 0.0001 | 0.0005 | 0.001 |
| | 1.0 | 1.0 | 1.0 |
| | Grasping-Placing | Grasping-Swinging |
|---|---|---|
| NRMSD | 0.10053 | 0.01514 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Matsumoto, T.; Ohata, W.; Benureau, F.C.Y.; Tani, J. Goal-Directed Planning and Goal Understanding by Extended Active Inference: Evaluation through Simulated and Physical Robot Experiments. Entropy 2022, 24, 469. https://doi.org/10.3390/e24040469