Search Results (16)

Search Parameters:
Keywords = Bregman geometry

21 pages, 3816 KiB  
Article
A K-Means Clustering Algorithm with Total Bregman Divergence for Point Cloud Denoising
by Xiaomin Duan, Anqi Mu, Xinyu Zhao and Yuqi Wu
Symmetry 2025, 17(8), 1186; https://doi.org/10.3390/sym17081186 - 24 Jul 2025
Viewed by 281
Abstract
Point cloud denoising is essential for improving 3D data quality, yet traditional K-means methods relying on Euclidean distance struggle with non-uniform noise. This paper proposes a K-means algorithm leveraging Total Bregman Divergence (TBD) to better model geometric structures on manifolds, enhancing robustness against noise. Specifically, TBDs—Total Logarithm, Exponential, and Inverse Divergences—are defined on symmetric positive-definite matrices, each tailored to capture distinct local geometries. Theoretical analysis demonstrates the bounded sensitivity of TBD-induced means to outliers via influence functions, while anisotropy indices quantify structural variations. Numerical experiments validate the method’s superiority over Euclidean-based approaches, showing effective noise separation and improved stability. This work bridges geometric insights with practical clustering, offering a robust framework for point cloud preprocessing in vision and robotics applications.
(This article belongs to the Section Mathematics)
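
As a rough illustration of the clustering loop described in the abstract, the sketch below runs a Lloyd-style K-means in which assignments use a total Bregman divergence. It is only a minimal sketch under simplifying assumptions: it uses the generator f(x) = ||x||²/2 on plain vectors and arithmetic means as cluster centers, whereas the paper works with Total Logarithm, Exponential, and Inverse divergences on symmetric positive-definite matrices and their induced t-centers, none of which are reproduced here.

```python
import numpy as np

def bregman(x, y):
    # ordinary Bregman divergence for f(x) = 0.5*||x||^2, i.e. 0.5*||x - y||^2
    return 0.5 * np.sum((x - y) ** 2)

def total_bregman(x, y):
    # the total Bregman divergence rescales by the gradient norm at the second
    # argument (here grad f(y) = y), which damps the influence of outliers
    return bregman(x, y) / np.sqrt(1.0 + np.dot(y, y))

def tbd_kmeans(points, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        labels = np.array([min(range(k), key=lambda j: total_bregman(p, centers[j]))
                           for p in points])
        # NOTE: the exact TBD center is a gradient-weighted "t-center";
        # the arithmetic mean below is a simplification for this sketch
        centers = np.array([points[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels, centers

# two well-separated synthetic blobs, purely for demonstration
pts = np.vstack([np.random.default_rng(1).normal(0.0, 1.0, (50, 3)),
                 np.random.default_rng(2).normal(6.0, 1.0, (50, 3))])
labels, centers = tbd_kmeans(pts, k=2)
print(centers)
```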

24 pages, 2044 KiB  
Article
Bregman–Hausdorff Divergence: Strengthening the Connections Between Computational Geometry and Machine Learning
by Tuyen Pham, Hana Dal Poz Kouřimská and Hubert Wagner
Mach. Learn. Knowl. Extr. 2025, 7(2), 48; https://doi.org/10.3390/make7020048 - 26 May 2025
Viewed by 949
Abstract
The purpose of this paper is twofold. On the technical side, we propose an extension of the Hausdorff distance from metric spaces to spaces equipped with asymmetric distance measures. Specifically, we focus on extending it to the family of Bregman divergences, which includes the popular Kullback–Leibler divergence (also known as relative entropy). The resulting dissimilarity measure is called a Bregman–Hausdorff divergence and compares two collections of vectors—without assuming any pairing or alignment between their elements. We propose new algorithms for computing Bregman–Hausdorff divergences based on a recently developed Kd-tree data structure for nearest neighbor search with respect to Bregman divergences. The algorithms are surprisingly efficient even for large inputs with hundreds of dimensions. As a benchmark, we use the new divergence to compare two collections of probabilistic predictions produced by different machine learning models trained using the relative entropy loss. In addition to introducing this technical concept, we provide a survey that outlines the basics of Bregman geometry and motivates the Kullback–Leibler divergence using concepts from information theory. We also describe computational geometric algorithms that have been extended to this geometry, focusing on algorithms relevant for machine learning.
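
The directed construction sketched in the abstract can be mimicked in a few lines: take the worst case, over one collection, of the best-matching divergence to the other collection. The sketch below assumes the Kullback–Leibler divergence and uses brute-force nearest-neighbor search; the paper's Kd-tree acceleration and its exact choice of sided or symmetrized variants are not reproduced, so the precise definition used here is an assumption.

```python
import numpy as np

def kl(p, q):
    # relative entropy D_KL(p || q); assumes strictly positive probability vectors
    return float(np.sum(p * (np.log(p) - np.log(q))))

def directed_bregman_hausdorff(A, B, div=kl):
    # worst case over A of the best match in B, under an asymmetric divergence
    return max(min(div(a, b) for b in B) for a in A)

rng = np.random.default_rng(0)
A = rng.dirichlet(np.ones(5), size=100)   # predictions of hypothetical model 1
B = rng.dirichlet(np.ones(5), size=120)   # predictions of hypothetical model 2
print(directed_bregman_hausdorff(A, B), directed_bregman_hausdorff(B, A))
```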

11 pages, 441 KiB  
Article
Symplectic Bregman Divergences
by Frank Nielsen
Entropy 2024, 26(12), 1101; https://doi.org/10.3390/e26121101 - 16 Dec 2024
Viewed by 1067
Abstract
We present a generalization of Bregman divergences in finite-dimensional symplectic vector spaces that we term symplectic Bregman divergences. Symplectic Bregman divergences are derived from a symplectic generalization of the Fenchel–Young inequality, which relies on the notion of symplectic subdifferentials. The symplectic Fenchel–Young inequality is obtained using the symplectic Fenchel transform, which is defined with respect to the symplectic form. Since symplectic forms can be built generically from pairings of dual systems, we obtain a generalization of Bregman divergences in dual systems via equivalent symplectic Bregman divergences. In particular, when the symplectic form is derived from an inner product, we show that the corresponding symplectic Bregman divergences amount to ordinary Bregman divergences with respect to composite inner products. Some potential applications of symplectic divergences in geometric mechanics, information geometry, and learning dynamics in machine learning are touched upon.
(This article belongs to the Special Issue Information Geometry for Data Analysis)
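
For context, the classical objects that the paper generalizes can be written as follows; these are standard definitions, not results specific to this article. The symplectic construction replaces the duality pairing below with a symplectic form.

```latex
% Bregman divergence of a strictly convex, differentiable generator F,
% and the Fenchel--Young inequality whose gap it measures:
B_F(x : y) \;=\; F(x) - F(y) - \langle x - y, \nabla F(y) \rangle \;\geq\; 0,
\qquad
F(x) + F^{*}(y^{*}) \;\geq\; \langle x, y^{*} \rangle,
% with equality if and only if y^* = \nabla F(x); choosing y^* = \nabla F(y)
% recovers B_F(x : y) as the Fenchel--Young gap.
```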

30 pages, 1927 KiB  
Article
Fast Proxy Centers for the Jeffreys Centroid: The Jeffreys–Fisher–Rao Center and the Gauss–Bregman Inductive Center
by Frank Nielsen
Entropy 2024, 26(12), 1008; https://doi.org/10.3390/e26121008 - 22 Nov 2024
Cited by 1 | Viewed by 1039
Abstract
The symmetric Kullback–Leibler centroid, also called the Jeffreys centroid, of a set of mutually absolutely continuous probability distributions on a measure space provides a notion of centrality which has proven useful in many tasks, including information retrieval, information fusion, and clustering. However, the Jeffreys centroid is not available in closed form for sets of categorical or multivariate normal distributions, two widely used statistical models, and thus needs to be approximated numerically in practice. In this paper, we first propose the new Jeffreys–Fisher–Rao center, defined as the Fisher–Rao midpoint of the sided Kullback–Leibler centroids, as a plug-in replacement for the Jeffreys centroid. This Jeffreys–Fisher–Rao center admits a generic formula for uni-parameter exponential family distributions and a closed-form formula for categorical and multivariate normal distributions; it matches the Jeffreys centroid exactly for same-mean normal distributions and is observed experimentally to be close to the Jeffreys centroid in practice. Second, we define a new type of inductive center generalizing the principle of the Gauss arithmetic–geometric double sequence mean for pairs of densities of any given exponential family. This new Gauss–Bregman center is shown experimentally to approximate the Jeffreys centroid very well and is suggested as a replacement for the Jeffreys centroid when the Jeffreys–Fisher–Rao center is not available in closed form. Furthermore, this inductive center always converges and matches the Jeffreys centroid for sets of same-mean normal distributions. We report on our experiments, which first demonstrate how well the closed-form formula of the Jeffreys–Fisher–Rao center for categorical distributions approximates the costly numerical Jeffreys centroid, which relies on the Lambert W function, and second show the fast convergence of the Gauss–Bregman double sequences, which can closely approximate the Jeffreys centroid when truncated to the first few iterations. Finally, we conclude this work by reinterpreting these fast proxy Jeffreys–Fisher–Rao and Gauss–Bregman centers of Jeffreys centroids under the lens of dually flat spaces in information geometry.
(This article belongs to the Special Issue Information Theory in Emerging Machine Learning Techniques)
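
The "inductive center" terminology refers to Gauss's double-sequence construction of the arithmetic–geometric mean, which the abstract generalizes to Bregman-type means of exponential-family parameters. Below is a minimal sketch of the classical scalar case only, not the paper's Gauss–Bregman iteration itself.

```python
def arithmetic_geometric_mean(a, g, tol=1e-12):
    # iterate the two coupled means until they agree; both sequences
    # converge quadratically to a common limit, the AGM
    while abs(a - g) > tol:
        a, g = 0.5 * (a + g), (a * g) ** 0.5
    return a

print(arithmetic_geometric_mean(1.0, 2.0))  # approximately 1.456791
```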

23 pages, 7837 KiB  
Article
Understanding Higher-Order Interactions in Information Space
by Herbert Edelsbrunner, Katharina Ölsböck and Hubert Wagner
Entropy 2024, 26(8), 637; https://doi.org/10.3390/e26080637 - 27 Jul 2024
Cited by 4 | Viewed by 2061
Abstract
Methods used in topological data analysis naturally capture higher-order interactions in point cloud data embedded in a metric space. This methodology was recently extended to data living in an information space, by which we mean a space measured with an information theoretical distance. One such setting is a finite collection of discrete probability distributions embedded in the probability simplex measured with the relative entropy (Kullback–Leibler divergence). More generally, one can work with a Bregman divergence parameterized by a different notion of entropy. While theoretical algorithms exist for this setup, there is a paucity of implementations for exploring and comparing geometric-topological properties of various information spaces. The interest of this work is therefore twofold. First, we propose the first robust algorithms and software for geometric and topological data analysis in information space. Perhaps surprisingly, despite working with Bregman divergences, our design reuses robust libraries for the Euclidean case. Second, using the new software, we take the first steps towards understanding the geometric-topological structure of these spaces. In particular, we compare them with the more familiar spaces equipped with the Euclidean and Fisher metrics.

16 pages, 656 KiB  
Article
Divergences Induced by the Cumulant and Partition Functions of Exponential Families and Their Deformations Induced by Comparative Convexity
by Frank Nielsen
Entropy 2024, 26(3), 193; https://doi.org/10.3390/e26030193 - 23 Feb 2024
Cited by 1 | Viewed by 2049
Abstract
Exponential families are statistical models which are the workhorses in statistics, information theory, and machine learning, among others. An exponential family can either be normalized subtractively by its cumulant or free energy function, or equivalently normalized divisively by its partition function. Both the cumulant and partition functions are strictly convex and smooth functions inducing corresponding pairs of Bregman and Jensen divergences. It is well known that skewed Bhattacharyya distances between the probability densities of an exponential family amount to skewed Jensen divergences induced by the cumulant function between their corresponding natural parameters, and that in limit cases the sided Kullback–Leibler divergences amount to reverse-sided Bregman divergences. In this work, we first show that the α-divergences between non-normalized densities of an exponential family amount to scaled α-skewed Jensen divergences induced by the partition function. We then show how comparative convexity with respect to a pair of quasi-arithmetical means allows both convex functions and their arguments to be deformed, thereby defining dually flat spaces with corresponding divergences when ordinary convexity is preserved.
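
As a reminder of the standard identities the abstract builds on (textbook facts, not results specific to this paper): for an exponential family with cumulant function F, the skewed Bhattacharyya distance between two members equals the skewed Jensen divergence of F evaluated at their natural parameters, and the Kullback–Leibler divergence equals a reverse-sided Bregman divergence of F.

```latex
% Exponential family with natural parameter theta, sufficient statistic t(x),
% and cumulant function F:
p_\theta(x) \;=\; \exp\!\big(\langle \theta, t(x) \rangle - F(\theta)\big)\, h(x).
% Skewed Bhattacharyya distance = skewed Jensen divergence of F:
B_\alpha(p_{\theta_1} : p_{\theta_2})
  \;=\; \alpha F(\theta_1) + (1-\alpha) F(\theta_2)
        - F\big(\alpha \theta_1 + (1-\alpha) \theta_2\big)
  \;=\; J_{F,\alpha}(\theta_1, \theta_2).
% Kullback--Leibler divergence = reverse-sided Bregman divergence of F:
\mathrm{KL}(p_{\theta_1} \,\|\, p_{\theta_2}) \;=\; B_F(\theta_2 : \theta_1).
```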

37 pages, 548 KiB  
Review
Survey of Optimization Algorithms in Modern Neural Networks
by Ruslan Abdulkadirov, Pavel Lyakhov and Nikolay Nagornov
Mathematics 2023, 11(11), 2466; https://doi.org/10.3390/math11112466 - 26 May 2023
Cited by 64 | Viewed by 21541
Abstract
The main goal of machine learning is the creation of self-learning algorithms in many areas of human activity. This allows artificial intelligence to replace humans in tasks, with the aim of expanding production. The theory of artificial neural networks, which have already replaced humans in many problems, remains the most widely used branch of machine learning. Thus, one must select appropriate neural network architectures, data processing, and advanced applied mathematics tools. A common challenge for these networks is achieving the highest accuracy in a short time. This problem is usually addressed by modifying networks and improving data pre-processing, where accuracy increases along with training time. By using optimization methods, one can improve the accuracy without increasing the time. In this review, we consider the existing optimization algorithms used in neural networks. We present modifications of optimization algorithms of first, second, and information-geometric order, the latter related to information geometry for Fisher–Rao and Bregman metrics. These optimizers have significantly influenced the development of neural networks through geometric and probabilistic tools. We present applications of all the given optimization algorithms, considering the types of neural networks. After that, we show ways to develop optimization algorithms in further research using modern neural networks. Fractional-order, bilevel, and gradient-free optimizers can replace classical gradient-based optimizers. Such approaches are applied in graph, spiking, complex-valued, quantum, and wavelet neural networks. Besides pattern recognition, time series prediction, and object detection, there are many other applications in machine learning: quantum computations, partial differential and integro-differential equations, and stochastic processes.
(This article belongs to the Special Issue Mathematical Foundations of Deep Neural Networks)
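
To make the "Bregman metric" family of optimizers concrete, here is a minimal mirror-descent sketch with the negative-entropy mirror map on the probability simplex (exponentiated gradient). It is a textbook instance of the geometry-aware optimizers the survey covers, not an algorithm taken from the survey itself, and the toy objective is purely illustrative.

```python
import numpy as np

def exponentiated_gradient(grad, x0, lr=0.5, steps=200):
    # mirror descent with the negative-entropy mirror map: the multiplicative
    # update is the Bregman (KL) analogue of a gradient step and keeps the
    # iterate on the probability simplex
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x * np.exp(-lr * grad(x))
        x /= x.sum()
    return x

# toy objective: f(x) = 0.5 * ||x - target||^2 restricted to the simplex
target = np.array([0.7, 0.2, 0.1])
grad = lambda x: x - target
print(exponentiated_gradient(grad, np.ones(3) / 3))  # approaches target
```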

35 pages, 988 KiB  
Article
Revisiting Chernoff Information with Likelihood Ratio Exponential Families
by Frank Nielsen
Entropy 2022, 24(10), 1400; https://doi.org/10.3390/e24101400 - 1 Oct 2022
Cited by 13 | Viewed by 5998
Abstract
The Chernoff information between two probability measures is a statistical divergence measuring their deviation, defined as their maximally skewed Bhattacharyya distance. Although the Chernoff information was originally introduced for bounding the Bayes error in statistical hypothesis testing, the divergence has found many other applications, ranging from information fusion to quantum information, due to its empirical robustness. From the viewpoint of information theory, the Chernoff information can also be interpreted as a minmax symmetrization of the Kullback–Leibler divergence. In this paper, we first revisit the Chernoff information between two densities of a measurable Lebesgue space by considering the exponential families induced by their geometric mixtures: the so-called likelihood ratio exponential families. Second, we show how to (i) solve exactly the Chernoff information between any two univariate Gaussian distributions or get a closed-form formula using symbolic computing, (ii) report a closed-form formula of the Chernoff information of centered Gaussians with scaled covariance matrices, and (iii) use a fast numerical scheme to approximate the Chernoff information between any two multivariate Gaussian distributions.
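
For discrete distributions, the defining quantity in the abstract can be brute-forced directly: the Chernoff information is the maximum over the skew parameter of the skewed Bhattacharyya distance. The grid-search sketch below only illustrates that definition; the paper's exact Gaussian solutions and its fast numerical scheme are not reproduced.

```python
import numpy as np

def skewed_bhattacharyya(p, q, alpha):
    # -log sum_i p_i^alpha * q_i^(1-alpha), for strictly positive probability vectors
    return -np.log(np.sum(p ** alpha * q ** (1.0 - alpha)))

def chernoff_information(p, q, grid=1000):
    # maximally skewed Bhattacharyya distance, found by a coarse grid search
    alphas = np.linspace(1e-3, 1.0 - 1e-3, grid)
    return max(skewed_bhattacharyya(p, q, a) for a in alphas)

p = np.array([0.6, 0.3, 0.1])
q = np.array([0.1, 0.3, 0.6])
print(chernoff_information(p, q))
```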

34 pages, 1942 KiB  
Article
On Voronoi Diagrams on the Information-Geometric Cauchy Manifolds
by Frank Nielsen
Entropy 2020, 22(7), 713; https://doi.org/10.3390/e22070713 - 28 Jun 2020
Cited by 13 | Viewed by 6435
Abstract
We study the Voronoi diagrams of a finite set of Cauchy distributions and their dual complexes from the viewpoint of information geometry by considering the Fisher–Rao distance, the Kullback–Leibler divergence, the chi-square divergence, and a flat divergence derived from Tsallis entropy related to the conformal flattening of the Fisher–Rao geometry. We prove that the Voronoi diagrams of the Fisher–Rao distance, the chi-square divergence, and the Kullback–Leibler divergences all coincide with a hyperbolic Voronoi diagram on the corresponding Cauchy location-scale parameters, and that the dual Cauchy hyperbolic Delaunay complexes are Fisher orthogonal to the Cauchy hyperbolic Voronoi diagrams. The dual Voronoi diagrams with respect to the dual flat divergences amount to dual Bregman Voronoi diagrams, and their dual complexes are regular triangulations. The primal Bregman Voronoi diagram is the Euclidean Voronoi diagram and the dual Bregman Voronoi diagram coincides with the Cauchy hyperbolic Voronoi diagram. In addition, we prove that the square root of the Kullback–Leibler divergence between Cauchy distributions yields a metric distance which is Hilbertian for the Cauchy scale families.
(This article belongs to the Special Issue Information Geometry III)

24 pages, 1604 KiB  
Article
On a Generalization of the Jensen–Shannon Divergence and the Jensen–Shannon Centroid
by Frank Nielsen
Entropy 2020, 22(2), 221; https://doi.org/10.3390/e22020221 - 16 Feb 2020
Cited by 103 | Viewed by 15203
Abstract
The Jensen–Shannon divergence is a renowned bounded symmetrization of the Kullback–Leibler divergence which does not require probability densities to have matching supports. In this paper, we introduce a vector-skew generalization of the scalar α-Jensen–Bregman divergences and thereby derive the vector-skew α-Jensen–Shannon divergences. We prove that the vector-skew α-Jensen–Shannon divergences are f-divergences and study the properties of these novel divergences. Finally, we report an iterative algorithm to numerically compute the Jensen–Shannon-type centroids for a set of probability densities belonging to a mixture family; this includes the case of the Jensen–Shannon centroid of a set of categorical distributions or normalized histograms.
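
For reference, the ordinary Jensen–Shannon divergence that the paper generalizes is simple to compute. The sketch below handles plain (non-skewed) two-distribution mixtures only; the vector-skew weighting introduced in the paper is not implemented here.

```python
import numpy as np

def kl(p, q):
    # relative entropy with the convention 0 * log(0/q) = 0
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def jensen_shannon(p, q):
    m = 0.5 * (p + q)                        # mixture midpoint
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)   # symmetric and bounded by log 2

p = np.array([0.9, 0.1, 0.0])
q = np.array([0.2, 0.3, 0.5])
print(jensen_shannon(p, q))
```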

14 pages, 5219 KiB  
Article
Convex Optimization via Symmetrical Hölder Divergence for a WLAN Indoor Positioning System
by Osamah Abdullah
Entropy 2018, 20(9), 639; https://doi.org/10.3390/e20090639 - 25 Aug 2018
Cited by 7 | Viewed by 3597
Abstract
Modern indoor positioning services are important technologies that play vital roles in modern life, providing services such as recruiting emergency healthcare providers and supporting security. Several large companies, such as Microsoft, Apple, Nokia, and Google, have researched location-based services. Wireless indoor localization is key for pervasive computing applications and network optimization. Different approaches have been developed for this technique using WiFi signals. WiFi fingerprinting-based indoor localization has been widely used due to its simplicity, and algorithms that fingerprint WiFi signals at separate locations can achieve accuracy within a few meters. However, a major drawback of WiFi fingerprinting is the variance in received signal strength (RSS), which fluctuates over time and with the changing environment. As the signal changes, so does the fingerprint database, which can change the distribution of the RSS (a multimodal distribution). Thus, in this paper, we propose the use of the symmetrical Hölder divergence, an entropy-based statistical measure that encapsulates both the skew Bhattacharyya divergence and the Cauchy–Schwarz divergence; these admit closed-form formulas that can measure the statistical dissimilarity between members of the same exponential family for signals with multivariate distributions. The Hölder divergence is asymmetric, so we used both left-sided and right-sided data so that the centroid can be symmetrized to obtain the minimizer of the proposed algorithm. The experimental results showed that the symmetrized Hölder divergence consistently outperformed the traditional k-nearest-neighbor and probabilistic neural network approaches. In addition, with the proposed algorithm, the position error accuracy was about 1 m in buildings.

20 pages, 368 KiB  
Review
Information Geometry of κ-Exponential Families: Dually-Flat, Hessian and Legendre Structures
by Antonio M. Scarfone, Hiroshi Matsuzoe and Tatsuaki Wada
Entropy 2018, 20(6), 436; https://doi.org/10.3390/e20060436 - 5 Jun 2018
Cited by 12 | Viewed by 4326
Abstract
In this paper, we present a review of recent developments on κ-deformed statistical mechanics in the framework of information geometry. Three different geometric structures are introduced in the κ-formalism, obtained starting from three non-equivalent divergence functions corresponding to the κ-deformed versions of the Kullback–Leibler, “Kerridge”, and Bregman divergences. The first statistical manifold, derived from the κ-Kullback–Leibler divergence, forms an invariant geometry with a positive curvature that vanishes in the κ → 0 limit. The other two statistical manifolds are related to each other by means of a scaling transform and are both dually flat. They have a dualistic Hessian structure endowed with a deformed Fisher metric and an affine connection that are consistent with a statistical scalar product based on the κ-escort expectation. These flat geometries admit dual potentials corresponding to the thermodynamic Massieu and entropy functions that induce a Legendre structure of κ-thermodynamics in the picture of information geometry.
(This article belongs to the Special Issue Theoretical Aspect of Nonlinear Statistical Physics)
15 pages, 1088 KiB  
Article
Information Geometry for Covariance Estimation in Heterogeneous Clutter with Total Bregman Divergence
by Xiaoqiang Hua, Yongqiang Cheng, Hongqiang Wang and Yuliang Qin
Entropy 2018, 20(4), 258; https://doi.org/10.3390/e20040258 - 8 Apr 2018
Cited by 13 | Viewed by 4743
Abstract
This paper presents a covariance matrix estimation method based on information geometry in heterogeneous clutter. In particular, the problem of covariance estimation is reformulated as the computation of a geometric median of the covariance matrices estimated from the secondary data set. A new class of total Bregman divergences is presented on the Riemannian manifold of Hermitian positive-definite (HPD) matrices, which is the foundation of information geometry. On the basis of these divergences, total Bregman divergence medians are derived and used instead of the sample covariance matrix (SCM) of the secondary data. Unlike the SCM, which relies only on the statistical characteristics of the sample data, our proposed estimators take the geometric structure of the matrix space into account, and the performance can thereby be improved in heterogeneous clutter. At the analysis stage, numerical results are given to validate the detection performance of an adaptive normalized matched filter with our estimator compared with existing alternatives.
(This article belongs to the Special Issue Radar and Information Theory)

15 pages, 1133 KiB  
Article
Information Geometry for Radar Target Detection with Total Jensen–Bregman Divergence
by Xiaoqiang Hua, Haiyan Fan, Yongqiang Cheng, Hongqiang Wang and Yuliang Qin
Entropy 2018, 20(4), 256; https://doi.org/10.3390/e20040256 - 6 Apr 2018
Cited by 25 | Viewed by 3666
Abstract
This paper proposes a radar target detection algorithm based on information geometry. In particular, the correlation of the sample data is modeled as a Hermitian positive-definite (HPD) matrix. Moreover, a class of total Jensen–Bregman divergences, including the total Jensen square loss, the total Jensen log-determinant divergence, and the total Jensen von Neumann divergence, is proposed to be used as the distance-like function on the space of HPD matrices. On the basis of these divergences, definitions of their corresponding median matrices are given. Finally, a target detection decision rule is formed by comparing the total Jensen–Bregman divergence between the median of the reference cells and the matrix of the cell under test against a given threshold. The performance analysis on both simulated and real radar data confirms the superiority of the proposed detection method over its conventional counterparts and existing alternatives.
(This article belongs to the Section Information Theory, Probability and Statistics)
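
The decision rule described in the abstract, comparing a divergence between the reference-cell center and the cell under test against a threshold, can be sketched as follows. Many simplifying assumptions are made: the divergence used here is the plain (untotalized) Jensen log-determinant divergence between HPD matrices, the reference center is approximated by an arithmetic mean rather than the paper's median matrices, and the threshold is a placeholder.

```python
import numpy as np

def jensen_logdet(A, B):
    # Jensen log-determinant divergence between HPD matrices:
    # log det((A+B)/2) - 0.5*(log det A + log det B)  >= 0
    _, ld_mid = np.linalg.slogdet(0.5 * (A + B))
    _, ld_a = np.linalg.slogdet(A)
    _, ld_b = np.linalg.slogdet(B)
    return ld_mid - 0.5 * (ld_a + ld_b)

def detect(reference_cells, cell_under_test, threshold):
    # stand-in center: arithmetic mean of the reference HPD matrices
    # (the paper uses divergence-specific median matrices instead)
    center = sum(reference_cells) / len(reference_cells)
    return jensen_logdet(center, cell_under_test) > threshold

rng = np.random.default_rng(0)
def random_hpd(n=4):
    X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return X @ X.conj().T + n * np.eye(n)

refs = [random_hpd() for _ in range(8)]
print(detect(refs, random_hpd(), threshold=0.05))  # placeholder threshold
```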

44 pages, 438 KiB  
Article
The Information Geometry of Bregman Divergences and Some Applications in Multi-Expert Reasoning
by Martin Adamčík
Entropy 2014, 16(12), 6338-6381; https://doi.org/10.3390/e16126338 - 1 Dec 2014
Cited by 19 | Viewed by 10685
Abstract
The aim of this paper is to develop a comprehensive study of the geometry involved in combining Bregman divergences with pooling operators over closed convex sets in a discrete probabilistic space. A particular connection we develop leads to an iterative procedure which is similar to the alternating projection procedure by Csiszár and Tusnády. Although such iterative procedures are well studied over much more general spaces than the one we consider, only a few authors have investigated combining projections with pooling operators. We aspire to achieve here a comprehensive study of such a combination. Moreover, pooling operators that combine the opinions of several rational experts allow us to discuss possible applications in multi-expert reasoning.
(This article belongs to the Special Issue Maximum Entropy Applied to Inductive Logic and Reasoning)
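
A well-known instance of the alternating-projection idea mentioned in the abstract is Sinkhorn's iteration, which alternates Kullback–Leibler (Bregman) projections onto two convex sets of matrices with prescribed row and column sums. It is offered here only as a familiar example of a Csiszár–Tusnády-style procedure, not as the paper's own construction.

```python
import numpy as np

def sinkhorn(K, row_sums, col_sums, iters=500):
    # alternate KL projections onto {P : P 1 = row_sums} and {P : P^T 1 = col_sums};
    # each projection reduces to a simple rescaling of rows or columns
    P = K.astype(float).copy()
    for _ in range(iters):
        P *= (row_sums / P.sum(axis=1))[:, None]   # project onto row constraints
        P *= (col_sums / P.sum(axis=0))[None, :]   # project onto column constraints
    return P

K = np.array([[1.0, 2.0], [3.0, 4.0]])
P = sinkhorn(K, row_sums=np.array([0.5, 0.5]), col_sums=np.array([0.3, 0.7]))
print(P, P.sum(axis=1), P.sum(axis=0))
```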
