Affine Calculus for Constrained Minima of the Kullback–Leibler Divergence
Abstract
1. Introduction and Notations
1.1. Prerequisites
1.2. Summary of Content
2. Total Natural Gradient of the KL-Divergence
2.1. Total Natural Gradient of the KL-Divergence
2.2. Natural Gradient of the Entropy and Total Natural Gradient of the Cross Entropy
2.3. Total Natural Gradient of the Jensen–Shannon Divergence
3. Product Sample Space
3.1. Product Sample Space: Marginalization
3.2. Product Sample Space: Mean-Field Approximation
3.3. Product Sample Space: Kantorovich and Schrödinger
3.4. Product Sample Space: Conditional Probability Function
3.5. Variational Bayes
4. Discussion
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Pistone, G. Affine Calculus for Constrained Minima of the Kullback–Leibler Divergence. Stats 2025, 8, 25. https://doi.org/10.3390/stats8020025