Divergence Measures: Mathematical Foundations and Applications in Information-Theoretic and Statistical Problems

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Information Theory, Probability and Statistics".

Deadline for manuscript submissions: closed (1 September 2021) | Viewed by 35531

Printed Edition Available!
A printed edition of this Special Issue is available here.

Special Issue Editor


Prof. Igal Sason
Guest Editor
1. Andrew & Erna Viterbi Faculty of Electrical and Computer Engineering, Technion—Israel Institute of Technology, Haifa 3200003, Israel
2. Faculty of Mathematics, Technion—Israel Institute of Technology, Haifa 3200003, Israel
Interests: information theory; coding theory; probability theory; combinatorics

Special Issue Information

Dear Colleagues,

Information theory and probability, statistical learning theory, statistical signal processing, and other related disciplines greatly benefit from non-negative measures of dissimilarity (i.e., divergence measures) between pairs of probability measures defined on the same measurable space. Exploring the mathematical foundations of divergence measures (e.g., Bregman, Rényi, and f-divergences) and their potential applications in new information-theoretic and statistical problems is of interest, and many interesting results rely on these generalized divergence measures.

This Special Issue encourages research and survey papers on the mathematical properties and applications of divergence measures from an information-theoretic perspective.

Prof. Igal Sason
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts are available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Divergence measures
  • Bregman divergences
  • f-divergences
  • Rényi divergence
  • Relative entropy (Kullback–Leibler divergence)
  • Information projections
  • Strong data processing inequalities
  • Hypothesis testing
  • Guessing
  • Coding theorems based on divergence measures
  • Concentration of measure inequalities

Published Papers (9 papers)

Editorial

5 pages, 206 KiB  
Editorial
Divergence Measures: Mathematical Foundations and Applications in Information-Theoretic and Statistical Problems
by Igal Sason
Entropy 2022, 24(5), 712; https://doi.org/10.3390/e24050712 - 16 May 2022
Cited by 4 | Viewed by 2579
Abstract
Data science, information theory, probability theory, statistical learning, statistical signal processing, and other related disciplines greatly benefit from non-negative measures of dissimilarity between pairs of probability measures [...]

Research

26 pages, 1548 KiB  
Article
Discriminant Analysis under f-Divergence Measures
by Anmol Dwivedi, Sihui Wang and Ali Tajer
Entropy 2022, 24(2), 188; https://doi.org/10.3390/e24020188 - 27 Jan 2022
Cited by 2 | Viewed by 2510
Abstract
In statistical inference, the information-theoretic performance limits can often be expressed in terms of a statistical divergence between the underlying statistical models (e.g., in binary hypothesis testing, the error probability is related to the total variation distance between the statistical models). As the data dimension grows, computing the statistics involved in decision-making and the attendant performance limits (divergence measures) faces complexity and stability challenges. Dimensionality reduction addresses these challenges at the expense of performance (by the data-processing inequality, the divergence can only decrease). This paper considers linear dimensionality reduction such that the divergence between the models is maximally preserved. Specifically, this paper focuses on Gaussian models where we investigate discriminant analysis under five f-divergence measures (Kullback–Leibler, symmetrized Kullback–Leibler, Hellinger, total variation, and χ²). We characterize the optimal design of the linear transformation of the data onto a lower-dimensional subspace for zero-mean Gaussian models and employ numerical algorithms to find the design for general Gaussian models with non-zero means. There are two key observations for zero-mean Gaussian models. First, projections are not necessarily along the largest modes of the covariance matrix of the data, and, in some situations, they can even be along the smallest modes. Second, under specific regimes, the optimal design of subspace projection is identical under all the f-divergence measures considered, lending a degree of universality to the design, independent of the inference problem of interest.
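As a rough illustration of the first observation for zero-mean Gaussian models, the sketch below makes two simplifying assumptions that are not taken from the paper: both covariance matrices are diagonal, and only projections onto a single coordinate axis are compared. Under these assumptions the Kullback–Leibler divergence is a sum of one-dimensional terms, so each single-coordinate projection retains at most the full divergence, in line with the data-processing inequality. The function names and the variance values are hypothetical.

```python
import math

def kl_zero_mean_gauss_1d(var_p, var_q):
    """KL divergence D(N(0, var_p) || N(0, var_q)) in nats."""
    ratio = var_p / var_q
    return 0.5 * (ratio - 1.0 - math.log(ratio))

def per_coordinate_kl(vars_p, vars_q):
    """One KL term per coordinate for diagonal zero-mean Gaussians."""
    return [kl_zero_mean_gauss_1d(a, b) for a, b in zip(vars_p, vars_q)]

# Hypothetical variances: coordinate 0 is the largest mode under both
# models, coordinate 1 is the smallest.
vars_p = [4.0, 0.1]
vars_q = [4.4, 1.0]

terms = per_coordinate_kl(vars_p, vars_q)

# With diagonal covariances, the full divergence is the sum of the terms.
full_kl = sum(terms)

# Index of the coordinate axis that preserves the most divergence.
best_axis = max(range(len(terms)), key=terms.__getitem__)

print("per-coordinate KL:", terms)
print("full KL:", full_kl)
print("best single axis:", best_axis)
```

With these values the most informative axis is the low-variance one: the two models nearly agree along the high-variance coordinate and differ mainly along the low-variance coordinate.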

52 pages, 769 KiB  
Article
Error Exponents and α-Mutual Information
by Sergio Verdú
Entropy 2021, 23(2), 199; https://doi.org/10.3390/e23020199 - 5 Feb 2021
Cited by 9 | Viewed by 3562
Abstract
Over the last six decades, the representation of error exponent functions for data transmission through noisy channels at rates below capacity has seen three distinct approaches: (1) through Gallager’s E₀ functions (with and without cost constraints); (2) in large-deviations form, in terms of conditional relative entropy and mutual information; (3) through the α-mutual information and the Augustin–Csiszár mutual information of order α derived from the Rényi divergence. While a fairly complete picture has emerged in the absence of cost constraints, there have remained gaps in the interrelationships between the three approaches in the general case of cost-constrained encoding. Furthermore, no systematic approach has been proposed to solve the attendant optimization problems by exploiting the specific structure of the information functions. This paper closes those gaps and proposes a simple method to maximize Augustin–Csiszár mutual information of order α under cost constraints by means of the maximization of the α-mutual information subject to an exponential average constraint.

15 pages, 297 KiB  
Article
Minimum Divergence Estimators, Maximum Likelihood and the Generalized Bootstrap
by Michel Broniatowski
Entropy 2021, 23(2), 185; https://doi.org/10.3390/e23020185 - 31 Jan 2021
Cited by 4 | Viewed by 1742
Abstract
This paper states that most commonly used minimum divergence estimators are MLEs for suited generalized bootstrapped sampling schemes. Optimality in the sense of Bahadur for associated tests of fit under such sampling is considered.

20 pages, 328 KiB  
Article
Strongly Convex Divergences
by James Melbourne
Entropy 2020, 22(11), 1327; https://doi.org/10.3390/e22111327 - 21 Nov 2020
Cited by 6 | Viewed by 2045
Abstract
We consider a sub-class of the f-divergences satisfying a stronger convexity property, which we refer to as strongly convex, or κ-convex divergences. We derive new and old relationships, based on convexity arguments, between popular f-divergences.

26 pages, 394 KiB  
Article
A Two-Moment Inequality with Applications to Rényi Entropy and Mutual Information
by Galen Reeves
Entropy 2020, 22(11), 1244; https://doi.org/10.3390/e22111244 - 1 Nov 2020
Cited by 2 | Viewed by 2097
Abstract
This paper explores some applications of a two-moment inequality for the integral of the rth power of a function, where 0 < r < 1. The first contribution is an upper bound on the Rényi entropy of a random vector in terms of two different moments. When one of the moments is the zeroth moment, these bounds recover previous results based on maximum entropy distributions under a single moment constraint. More generally, evaluation of the bound with two carefully chosen nonzero moments can lead to significant improvements with a modest increase in complexity. The second contribution is a method for upper bounding mutual information in terms of certain integrals with respect to the variance of the conditional density. The bounds have a number of useful properties arising from the connection with variance decompositions.

36 pages, 522 KiB  
Article
On Relations Between the Relative Entropy and χ²-Divergence, Generalizations and Applications
by Tomohiro Nishiyama and Igal Sason
Entropy 2020, 22(5), 563; https://doi.org/10.3390/e22050563 - 18 May 2020
Cited by 12 | Viewed by 4361
Abstract
The relative entropy and the chi-squared divergence are fundamental divergence measures in information theory and statistics. This paper is focused on a study of integral relations between the two divergences, the implications of these relations, their information-theoretic applications, and some generalizations pertaining to the rich class of f-divergences. Applications that are studied in this paper refer to lossless compression, the method of types and large deviations, strong data-processing inequalities, bounds on contraction coefficients and maximal correlation, and the convergence rate to stationarity of a type of discrete-time Markov chain.

39 pages, 1011 KiB  
Article
Conditional Rényi Divergences and Horse Betting
by Cédric Bleuler, Amos Lapidoth and Christoph Pfister
Entropy 2020, 22(3), 316; https://doi.org/10.3390/e22030316 - 11 Mar 2020
Cited by 6 | Viewed by 3028
Abstract
Motivated by a horse betting problem, a new conditional Rényi divergence is introduced. It is compared with the conditional Rényi divergences that appear in the definitions of the dependence measures by Csiszár and Sibson, and the properties of all three are studied with emphasis on their behavior under data processing. In the same way that Csiszár’s and Sibson’s conditional divergences lead to the respective dependence measures, so does the new conditional divergence lead to the Lapidoth–Pfister mutual information. Moreover, the new conditional divergence is also related to the Arimoto–Rényi conditional entropy and to Arimoto’s measure of dependence. In the second part of the paper, the horse betting problem is analyzed where, instead of Kelly’s expected log-wealth criterion, a more general family of power-mean utility functions is considered. The key role in the analysis is played by the Rényi divergence, and in the setting where the gambler has access to side information, the new conditional Rényi divergence is key. The setting with side information also provides another operational meaning to the Lapidoth–Pfister mutual information. Finally, a universal strategy for independent and identically distributed races is presented that, without knowing the winning probabilities or the parameter of the utility function, asymptotically maximizes the gambler’s utility function.

24 pages, 1604 KiB  
Article
On a Generalization of the Jensen–Shannon Divergence and the Jensen–Shannon Centroid
by Frank Nielsen
Entropy 2020, 22(2), 221; https://doi.org/10.3390/e22020221 - 16 Feb 2020
Cited by 63 | Viewed by 11195
Abstract
The Jensen–Shannon divergence is a renowned bounded symmetrization of the Kullback–Leibler divergence which does not require probability densities to have matching supports. In this paper, we introduce a vector-skew generalization of the scalar α-Jensen–Bregman divergences and derive from it the vector-skew α-Jensen–Shannon divergences. We prove that the vector-skew α-Jensen–Shannon divergences are f-divergences and study the properties of these novel divergences. Finally, we report an iterative algorithm to numerically compute the Jensen–Shannon-type centroids for a set of probability densities belonging to a mixture family; this includes the case of the Jensen–Shannon centroid of a set of categorical distributions or normalized histograms.
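The opening claim of this abstract can be checked numerically with a minimal sketch of the standard (unskewed) Jensen–Shannon divergence for categorical distributions. This is not the vector-skew generalization or the centroid algorithm of the paper; the function names and the example distributions are assumptions made for illustration.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats for finite distributions."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                # p puts mass where q has none: the divergence is infinite.
                return math.inf
            total += pi * math.log(pi / qi)
    return total

def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL divergence to the midpoint mixture."""
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical example: two distributions with disjoint supports.
p = [1.0, 0.0]
q = [0.0, 1.0]

kl_pq = kl_divergence(p, q)  # infinite, since the supports do not match
js_pq = js_divergence(p, q)  # finite, equal to log 2 for disjoint supports

print("KL(p || q) =", kl_pq)
print("JS(p, q)  =", js_pq)
```

Because the midpoint mixture is positive wherever either distribution is positive, both KL terms inside `js_divergence` stay finite, and the result never exceeds log 2 in nats.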