Gradient-Free De Novo Learning
Abstract
1. Introduction
1.1. Agents and Attractors
1.2. Model Selection
1.3. Rewards and Punishment
1.4. Overview
2. Active Inference, Learning and Selection
2.1. Generative Models
2.2. Variational Free Energy and Inference
2.3. Active Inference
2.4. Active Learning
2.5. Active Selection
3. Attractor Learning
3.1. Phase 1: Structure Learning
3.2. Phase 2: Attractor Learning
3.3. Phase 3: Continual Learning
4. A Worked Example
4.1. Phase 1: Structure Learning
4.2. Phase 2: Attractor Learning
4.3. Phase 3: Continual Learning
5. Conclusions
Limitations and Future Directions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A. Frequently Used Notation and Quantities
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Friston, K.; Parr, T.; Heins, C.; Da Costa, L.; Salvatori, T.; Tschantz, A.; Koudahl, M.; Van de Maele, T.; Buckley, C.; Verbelen, T. Gradient-Free De Novo Learning. Entropy 2025, 27, 992. https://doi.org/10.3390/e27090992