The Information Dynamics of Generative Diffusion
Abstract
1. Introduction
- Information theory and entropy-based perspectives
- Phase transitions, associative memories, and symmetry breaking
- Thermodynamics and the role of entropy
- Contributions
- Entropy production as a signature of symmetry breaking: We review the known expression for conditional entropy production in diffusion dynamics and provide an intuitive interpretation (see [10,12] for a practical use of the entropy production expression). We then connect this information-theoretic quantity to the symmetry-breaking phenomena reported in [15,16], emphasizing that bifurcations manifest as pronounced, ensemble-level changes in information measures. In particular, symmetry breaking induces a transient loss of identifiability of the data $x_0$, which appears as a peak in the conditional entropy rate.
- Noise-driven decisions via posterior geometry: We interpret these entropy-rate signatures through the geometry of the score field. Near low-curvature directions, the score weakens and temporarily loses its ability to suppress stochastic fluctuations, so noise becomes the effective selector of the branch that the generative trajectory follows. This provides a unified, mechanism-level explanation for “decision” events during generation.
- Local divergence of trajectories under reverse-time dynamics: We show that the same loss of curvature is reflected in the local linearization of the reverse-time flow. The Jacobian of the score develops expanding directions, implying local exponential amplification of small differences between nearby generative trajectories over finite horizons. This explains how noise fluctuations can propagate and shape the final generative outcome (see the sketch following this list).
- Variance peak as a speciation-time marker: Motivated by tools from stochastic thermodynamics, we introduce the variance of pathwise conditional entropy. Along individual trajectories, the pathwise conditional entropy need not decrease monotonically and may transiently increase, reflecting temporary ambiguity in the resolution of uncertainty about $x_0$. This trajectory heterogeneity produces a pronounced peak in the variance, and we show how this peak concentrates around the speciation time [24].
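The mechanism in the second and third bullets can be made concrete in one dimension. The following sketch is our own illustration rather than code from the paper: it assumes a symmetric two-point data distribution at $\pm\mu$ smoothed by Gaussian noise of scale $\sigma$ (all parameter values illustrative), for which the score has a closed form, and it tracks the sign of the score Jacobian at the symmetric point $x = 0$.

```python
# Illustrative sketch (not the paper's code): score curvature at the origin
# for a 1D two-point mixture smoothed by Gaussian noise,
#   p_sigma(x) = 0.5 * N(x; +mu, sigma^2) + 0.5 * N(x; -mu, sigma^2),
# whose exact score is s(x) = (mu * tanh(mu * x / sigma^2) - x) / sigma^2.
# Analytically, s'(0) = (mu^2 / sigma^2 - 1) / sigma^2, so the symmetric
# fixed point x = 0 turns from attracting to repelling at sigma = mu.

import numpy as np

mu = 1.0  # mixture means sit at +/- mu (assumed value)

def score(x, sigma):
    """Exact score of the Gaussian-smoothed two-point mixture."""
    return (mu * np.tanh(mu * x / sigma**2) - x) / sigma**2

for sigma in [2.0, 1.5, 1.1, 0.9, 0.7, 0.4]:
    h = 1e-5  # central finite difference around the symmetric point
    jac0 = (score(h, sigma) - score(-h, sigma)) / (2.0 * h)
    regime = "expanding (symmetry broken)" if jac0 > 0 else "contracting"
    print(f"sigma = {sigma:.1f}: score Jacobian at x=0 = {jac0:+8.3f} -> {regime}")
```

When the Jacobian turns positive (here at $\sigma = \mu$), the reverse-time drift amplifies small perturbations around the origin, so whichever fluctuation a trajectory happens to receive selects the branch it follows: the noise-driven "decision" event described above.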
2. Information Theory
3. Diffusion Models
4. Generative Information Transfer in Score Matching Diffusion
4.1. Score Norm and Posterior Concentration
4.2. Conditional Entropy Production as Optimal Error
4.3. Generative Bandwidth
5. Statistical Physics, Order Parameters and Phase Transitions
Branching Paths of Fixed-Points and Spontaneous Symmetry Breaking
6. Dynamics of the Generative Trajectories
6.1. The Global Perspective on Generative Bifurcations
6.2. Information Geometry
7. A Stochastic Thermodynamic Perspective
7.1. Variance of Pathwise Conditional Entropy as a Signature of Symmetry Breaking
Connection with the Speciation Time
8. Discussion and Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A. Proof That the Variance Peaks at the Speciation Time
Appendix A.1. The Speciation Time
Appendix A.2. Mixture of Data Points
- Infinite SNR ($\mathrm{SNR} \to \infty$; early times): In (A9), the deterministic term dominates the Gaussian fluctuation term. As a result, the posterior concentrates on the true class (with probability tending to 1), so the pathwise conditional entropy converges to 0 in probability. Hence its variance vanishes.
- Zero SNR ($\mathrm{SNR} \to 0$; late times): In (A9), all logits vanish and the posterior tends to the prior over components. Thus the pathwise conditional entropy converges deterministically to the entropy of the prior, so its variance again vanishes (illustrated numerically in the sketch below).
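To make the two limits concrete, here is a minimal Monte Carlo sketch. It is our own construction under assumed toy settings (two equiprobable data points at $\pm 1$, with the noise scale $\sigma$ standing in for the SNR schedule of (A9)), not the paper's code.

```python
# Illustrative sketch (assumed toy setup, not the paper's code): variance of
# the pathwise conditional entropy H(y | x_t) for a two-point mixture,
# x0 = +/-1 with equal prior, observed as x_t = x0 + sigma * eps, eps ~ N(0,1).
# High SNR (small sigma): posterior concentrates, H -> 0, so Var[H] -> 0.
# Zero SNR (large sigma): posterior -> uniform prior, H -> log 2, Var[H] -> 0.

import numpy as np

rng = np.random.default_rng(0)
mus = np.array([-1.0, 1.0])  # the two data points, equal prior
n = 200_000                  # Monte Carlo trajectories per noise level

for sigma in [0.05, 0.3, 0.7, 1.0, 2.0, 5.0]:
    x0 = rng.choice(mus, size=n)
    xt = x0 + sigma * rng.standard_normal(n)
    # Posterior log-odds between the two Gaussian likelihoods reduces to
    # 2 * xt / sigma^2 for means at +/- 1; clip to avoid overflow in exp.
    z = np.clip(2.0 * xt / sigma**2, -500.0, 500.0)
    p1 = 1.0 / (1.0 + np.exp(-z))
    # Binary entropy of the posterior, with the 0 * log 0 = 0 convention.
    tiny = 1e-300
    H = -(p1 * np.log(np.maximum(p1, tiny))
          + (1.0 - p1) * np.log(np.maximum(1.0 - p1, tiny)))
    print(f"sigma = {sigma:4.2f}: E[H] = {H.mean():.3f} nats, Var[H] = {H.var():.4f}")
```

At intermediate $\sigma$, some trajectories have already committed to a component (entropy near 0) while others remain undecided (entropy near $\log 2$); this heterogeneity across trajectories is what produces the variance peak located at the speciation time.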
Appendix A.3. Mixture of Gaussians
Appendix A.4. Variance Peak in the EDM Case

References
1. Sohl-Dickstein, J.; Weiss, E.A.; Maheswaranathan, N.; Ganguli, S. Deep Unsupervised Learning Using Nonequilibrium Thermodynamics. arXiv 2015, arXiv:1503.03585.
2. Ho, J.; Jain, A.; Abbeel, P. Denoising Diffusion Probabilistic Models. In Advances in Neural Information Processing Systems 33; MIT Press: Cambridge, MA, USA, 2020; pp. 6840–6851.
3. Dhariwal, P.; Nichol, A. Diffusion Models Beat GANs on Image Synthesis. arXiv 2021, arXiv:2105.05233.
4. Kong, Z.; Ping, W.; Huang, J.; Zhao, K.; Catanzaro, B. DiffWave: A Versatile Diffusion Model for Audio Synthesis. In Proceedings of the International Conference on Learning Representations (ICLR), Virtual Event, 3–7 May 2021.
5. Singer, U.; Polyak, A.; Hayes, T.; Yin, X.; An, J.; Zhang, S.; Hu, Q.; Yang, H.; Ashual, O.; Gafni, O.; et al. Make-A-Video: Text-to-Video Generation without Text-Video Data. arXiv 2022, arXiv:2209.14792.
6. Ho, J.; Salimans, T. Classifier-Free Diffusion Guidance. arXiv 2022, arXiv:2207.12598.
7. Rombach, R.; Blattmann, A.; Lorenz, D.; Esser, P.; Ommer, B. High-Resolution Image Synthesis with Latent Diffusion Models. arXiv 2022, arXiv:2112.10752.
8. Song, Y.; Sohl-Dickstein, J.; Kingma, D.P.; Kumar, A.; Ermon, S.; Poole, B. Score-Based Generative Modeling through Stochastic Differential Equations. arXiv 2021, arXiv:2011.13456.
9. Song, J.; Meng, C.; Ermon, S. Denoising Diffusion Implicit Models. arXiv 2022, arXiv:2010.02502.
10. Kong, X.; Brekelmans, R.; Ver Steeg, G. Information-Theoretic Diffusion. arXiv 2023, arXiv:2302.03792.
11. Kong, X.; Liu, O.; Li, H.; Yogatama, D.; Ver Steeg, G. Interpretable Diffusion via Information Decomposition. arXiv 2023, arXiv:2310.07972.
12. Stancevic, D.; Handke, F.; Ambrogioni, L. Entropic Time Schedulers for Generative Diffusion Models. arXiv 2025, arXiv:2504.13612.
13. Dieleman, S.; Sartran, L.; Roshannai, A.; Savinov, N.; Ganin, Y.; Richemond, P.H.; Doucet, A.; Strudel, R.; Dyer, C.; Durkan, C.; et al. Continuous Diffusion for Categorical Data. arXiv 2022, arXiv:2211.15089.
14. Franzese, G.; Martini, M.; Corallo, G.; Papotti, P.; Michiardi, P. Latent Abstractions in Generative Diffusion Models. Entropy 2025, 27, 371.
15. Raya, G.; Ambrogioni, L. Spontaneous Symmetry Breaking in Generative Diffusion Models. arXiv 2023, arXiv:2305.19693.
16. Ambrogioni, L. The Statistical Thermodynamics of Generative Diffusion Models: Phase Transitions, Symmetry Breaking and Critical Instability. arXiv 2024, arXiv:2310.17467.
17. Biroli, G.; Mézard, M. Generative Diffusion in Very Large Dimensions. J. Stat. Mech. Theory Exp. 2023, 2023, 093402.
18. Alaoui, A.E.; Montanari, A.; Sellke, M. Sampling from Mean-Field Gibbs Measures via Diffusion Processes. arXiv 2023, arXiv:2310.08912.
19. Huang, B.; Montanari, A.; Pham, H.T. Sampling from Spherical Spin Glasses in Total Variation via Algorithmic Stochastic Localization. arXiv 2024, arXiv:2404.15651.
20. Montanari, A. Sampling, Diffusions, and Stochastic Localization. arXiv 2023, arXiv:2305.10690.
21. Benton, J.; De Bortoli, V.; Doucet, A.; Deligiannidis, G. Nearly d-Linear Convergence Bounds for Diffusion Models via Stochastic Localization. In Proceedings of the International Conference on Learning Representations (ICLR), Vienna, Austria, 7–11 May 2024.
22. Sclocchi, A.; Favero, A.; Wyart, M. A Phase Transition in Diffusion Models Reveals the Hierarchical Nature of Data. Proc. Natl. Acad. Sci. USA 2025, 122, e2408799121.
23. Sclocchi, A.; Favero, A.; Levi, N.I.; Wyart, M. Probing the Latent Hierarchical Structure of Data via Diffusion Models. J. Stat. Mech. Theory Exp. 2025, 2025, 084005.
24. Biroli, G.; Bonnaire, T.; de Bortoli, V.; Mézard, M. Dynamical Regimes of Diffusion Models. Nat. Commun. 2024, 15, 9957.
25. Bonnaire, T.; Urfin, R.; Biroli, G.; Mézard, M. Why Diffusion Models Don’t Memorize: The Role of Implicit Dynamical Regularization in Training. arXiv 2025, arXiv:2505.17638.
26. Achilli, B.; Ambrogioni, L.; Lucibello, C.; Mézard, M.; Ventura, E. Memorization and Generalization in Generative Diffusion under the Manifold Hypothesis. J. Stat. Mech. Theory Exp. 2025, 2025, 073401.
27. Achilli, B.; Ventura, E.; Silvestri, G.; Pham, B.; Raya, G.; Krotov, D.; Lucibello, C.; Ambrogioni, L. Losing Dimensions: Geometric Memorization in Generative Diffusion. arXiv 2024, arXiv:2410.08727.
28. Ambrogioni, L. In Search of Dispersed Memories: Generative Diffusion Models Are Associative Memory Networks. Entropy 2024, 26, 381.
29. Hoover, B.; Strobelt, H.; Krotov, D.; Hoffman, J.; Kira, Z.; Chau, D.H. Memory in Plain Sight: A Survey of the Uncanny Resemblances Between Diffusion Models and Associative Memories. In Proceedings of the NeurIPS 2023 Workshop “Associative Memory & Hopfield Networks in 2023”, New Orleans, LA, USA, 2023.
30. Hess, J.; Morris, Q. Associative Memory and Generative Diffusion in the Zero-Noise Limit. arXiv 2025, arXiv:2506.05178.
31. Jeon, D.; Kim, D.; No, A. Understanding Memorization in Generative Models via Sharpness in Probability Landscapes. arXiv 2024, arXiv:2412.04140.
32. Pham, B.; Raya, G.; Negri, M.; Zaki, M.J.; Ambrogioni, L.; Krotov, D. Memorization to Generalization: Emergence of Diffusion Models from Associative Memory. arXiv 2025, arXiv:2505.21777.
33. Premkumar, A. Neural Entropy. arXiv 2024, arXiv:2409.03817.
34. Seifert, U. Entropy production along a stochastic trajectory and an integral fluctuation theorem. Phys. Rev. Lett. 2005, 95, 040602.
35. Ikeda, K.; Uda, T.; Okanohara, D.; Ito, S. Speed-accuracy relations for diffusion models: Wisdom from nonequilibrium thermodynamics and optimal transport. Phys. Rev. X 2025, 15, 031031.
36. Lou, A.; Meng, C.; Ermon, S. Discrete Diffusion Language Modeling by Estimating the Ratios of the Data Distribution. arXiv 2023, arXiv:2305.14627.
37. Sahoo, S.; Arriola, M.; Schiff, Y.; Gokaslan, A.; Marroquin, E.; Chiu, J.; Rush, A.; Kuleshov, V. Simple and Effective Masked Diffusion Language Models. In Advances in Neural Information Processing Systems 37; MIT Press: Cambridge, MA, USA, 2024; pp. 130136–130184.
38. Karras, T.; Aittala, M.; Aila, T.; Laine, S. Elucidating the Design Space of Diffusion-Based Generative Models. In NIPS ’22: Proceedings of the 36th International Conference on Neural Information Processing Systems; Curran Associates Inc.: Red Hook, NY, USA, 2022.
39. Anderson, B.D. Reverse-time diffusion equation models. Stoch. Process. Their Appl. 1982, 12, 313–326.
40. Hyvärinen, A. Estimation of Non-Normalized Statistical Models by Score Matching. J. Mach. Learn. Res. 2005, 6, 695–709.
41. Vincent, P. A connection between score matching and denoising autoencoders. Neural Comput. 2011, 23, 1661–1674.
42. Carreira-Perpiñán, M.A. Mode-finding for mixtures of Gaussian distributions. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 1318–1323.
43. Carreira-Perpiñán, M.A.; Williams, C.K. On the number of modes of a Gaussian mixture. In Scale Space Methods in Computer Vision; Springer: Berlin/Heidelberg, Germany, 2003.
44. Améndola, C.; Engström, A.; Haase, C. Maximum number of modes of Gaussian mixtures. Inf. Inference J. IMA 2020, 9, 587–600.
45. Amari, S.-i. Information Geometry and Its Applications; Applied Mathematical Sciences (AMS); Springer: Tokyo, Japan, 2016; Volume 194.
46. Kadkhodaie, Z.; Guth, F.; Simoncelli, E.P.; Mallat, S. Generalization in diffusion models arises from geometry-adaptive harmonic representations. arXiv 2023, arXiv:2310.02557.
47. Stanczuk, J.P.; Batzolis, G.; Deveney, T.; Schönlieb, C.B. Diffusion Models Encode the Intrinsic Dimension of Data Manifolds. In Proceedings of the International Conference on Machine Learning (ICML), Vienna, Austria, 21–27 July 2024.
48. Ventura, E.; Achilli, B.; Silvestri, G.; Lucibello, C.; Ambrogioni, L. Manifolds, Random Matrices and Spectral Gaps: The Geometric Phases of Generative Diffusion. arXiv 2024, arXiv:2410.05898.
