# Recent Progresses in Characterising Information Inequalities

## Abstract


## 1. Introduction

## 2. Notations

## 3. A Framework for Information Inequalities

#### 3.1. Geometric Framework

A function $h$ is called **weakly entropic** if there exists $\delta >0$ such that $\delta \cdot h$ is entropic, and is called **almost entropic** if it is the limit of a sequence of weakly entropic functions. Let ${\Gamma}^{*}\left(\mathcal{N}\right)$ be the set of all entropic functions and ${\overline{\Gamma}}^{*}\left(\mathcal{N}\right)$ be its closure. Then ${\overline{\Gamma}}^{*}\left(\mathcal{N}\right)$ is a closed and convex cone, and is in fact the set of all almost entropic functions. Compared to ${\Gamma}^{*}\left(\mathcal{N}\right)$, its closure ${\overline{\Gamma}}^{*}\left(\mathcal{N}\right)$ is more manageable; indeed, for many applications it is sufficient to consider ${\overline{\Gamma}}^{*}\left(\mathcal{N}\right)$. The following theorem shows that characterising all linear information inequalities is equivalent to characterising the set ${\overline{\Gamma}}^{*}\left(\mathcal{N}\right)$.
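As a concrete illustration, an entropic function is simply the vector of joint entropies $H(X_\alpha)$, one coordinate per nonempty subset $\alpha \subseteq \mathcal{N}$, computed from a joint probability mass function. A minimal Python sketch (all function and variable names are our own, not from the paper):

```python
import itertools
import math

def entropic_vector(pmf, n):
    """Map a joint pmf on n discrete variables to its entropic function:
    the point (H(X_alpha)) indexed by nonempty alpha, a vector in
    R^(2^n - 1)."""
    h = {}
    for r in range(1, n + 1):
        for alpha in itertools.combinations(range(n), r):
            # marginal distribution of (X_i, i in alpha)
            marg = {}
            for outcome, p in pmf.items():
                key = tuple(outcome[i] for i in alpha)
                marg[key] = marg.get(key, 0.0) + p
            h[alpha] = -sum(p * math.log2(p) for p in marg.values() if p > 0)
    return h

# Example: X1 a uniform bit and X2 = X1 (fully dependent),
# so H(X1) = H(X2) = H(X1, X2) = 1 bit.
pmf = {(0, 0): 0.5, (1, 1): 0.5}
h = entropic_vector(pmf, 2)
```

Every such vector is by definition a point of ${\Gamma}^{*}\left(\mathcal{N}\right)$; the hard problem the section describes is characterising which points arise this way.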

**Theorem 1 (Yeung [10]).** An information inequality ${\sum}_{\alpha \subseteq \mathcal{N}}{c}_{\alpha}H\left({X}_{\alpha}\right)\ge 0$ is valid (i.e., holds for all discrete random variables) if and only if

$${\overline{\Gamma}}^{*}\left(\mathcal{N}\right)\subseteq \left\{h:\sum _{\alpha \subseteq \mathcal{N}}{c}_{\alpha}h\left(\alpha \right)\ge 0\right\}.$$
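Theorem 1 reduces validity to the containment of the cone ${\overline{\Gamma}}^{*}\left(\mathcal{N}\right)$ in a half-space. Proving containment requires analytic tools, but the necessary direction can be probed numerically by sampling entropic points and checking the half-space condition. The sketch below (the sampling scheme and names are our own) tests the Shannon inequality $I(X_1;X_2)\ge 0$ this way:

```python
import itertools
import math
import random

def joint_entropy(pmf, alpha):
    """Joint entropy (in bits) of the coordinates listed in alpha."""
    marg = {}
    for outcome, p in pmf.items():
        key = tuple(outcome[i] for i in alpha)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

def random_pmf(n_vars):
    """A random joint pmf on n_vars binary variables."""
    outcomes = list(itertools.product((0, 1), repeat=n_vars))
    w = [random.random() for _ in outcomes]
    s = sum(w)
    return {o: x / s for o, x in zip(outcomes, w)}

# Coefficients c_alpha of I(X1;X2) = H(X1) + H(X2) - H(X1,X2) >= 0
coeffs = {(0,): 1.0, (1,): 1.0, (0, 1): -1.0}

random.seed(0)
ok = all(
    sum(c * joint_entropy(p, a) for a, c in coeffs.items()) >= -1e-9
    for p in (random_pmf(2) for _ in range(200))
)
```

A single violating sample would certify that an inequality is invalid; passing samples is of course only evidence, never a proof of validity.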

#### 3.2. Non-Shannon Inequalities

**Theorem 2 (Non-Shannon inequality [1]).** Let $\{{X}_{1},{X}_{2},{X}_{3},{X}_{4}\}$ be random variables. Then

$$2I\left({X}_{3};{X}_{4}\right)\le I\left({X}_{1};{X}_{2}\right)+I\left({X}_{1};{X}_{3},{X}_{4}\right)+3I\left({X}_{3};{X}_{4}|{X}_{1}\right)+I\left({X}_{3};{X}_{4}|{X}_{2}\right).$$

**Sketch of proof of Theorem 2**:

**Remark:** In the above proof of Theorem 2, the non-Shannon inequality is proved by invoking only a sequence of Shannon inequalities. This seems impossible at first glance since, by definition, non-Shannon inequalities are precisely those that are not implied by the Shannon inequalities. The trick, however, is to apply the Shannon inequalities over a larger set of random variables.
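For intuition, the Zhang–Yeung inequality of Theorem 2, in the common form $2I(X_3;X_4)\le I(X_1;X_2)+I(X_1;X_3,X_4)+3I(X_3;X_4|X_1)+I(X_3;X_4|X_2)$, can be checked numerically on random four-variable distributions. A hedged Python sketch (helper names are our own):

```python
import itertools
import math
import random

def H(pmf, alpha):
    """Joint entropy (in bits) of the coordinates listed in alpha."""
    marg = {}
    for o, p in pmf.items():
        k = tuple(o[i] for i in alpha)
        marg[k] = marg.get(k, 0.0) + p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

def I(pmf, a, b, c=()):
    """Conditional mutual information I(X_a; X_b | X_c)."""
    u = lambda *ts: tuple(sorted(set(sum(ts, ()))))
    return H(pmf, u(a, c)) + H(pmf, u(b, c)) - H(pmf, c) - H(pmf, u(a, b, c))

def random_pmf(n):
    outcomes = list(itertools.product((0, 1), repeat=n))
    w = [random.random() for _ in outcomes]
    s = sum(w)
    return {o: x / s for o, x in zip(outcomes, w)}

def zy_gap(p):
    """RHS - LHS of the Zhang-Yeung inequality; nonnegative for every pmf.
    Variables X1..X4 are coordinates 0..3."""
    return (I(p, (0,), (1,)) + I(p, (0,), (2, 3))
            + 3 * I(p, (2,), (3,), (0,)) + I(p, (2,), (3,), (1,))
            - 2 * I(p, (2,), (3,)))

random.seed(7)
min_gap = min(zy_gap(random_pmf(4)) for _ in range(100))
```

Such a numerical sweep cannot prove the inequality, but it distinguishes it sharply from a false candidate, for which random search typically finds a violation quickly.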

- ${\Gamma}^{*}\left(\mathcal{M}\right)\subseteq \Upsilon$;
- For any $g\in {\Gamma}^{*}\left(\mathcal{N}\right)$, there exists $h\in {\Gamma}^{*}\left(\mathcal{M}\right)\cap \mathcal{C}$ such that $g={\mathrm{proj}}_{\mathcal{N}}\left[h\right]$; equivalently, ${\Gamma}^{*}\left(\mathcal{N}\right)\subseteq {\mathrm{proj}}_{\mathcal{N}}[{\Gamma}^{*}\left(\mathcal{M}\right)\cap \mathcal{C}]$.

**Remark:** Instead of verifying whether a given information inequality is valid, we can also use the Fourier–Motzkin elimination method to find all linear inequalities that define the cone ${\mathrm{proj}}_{\mathcal{N}}(\mathcal{C}\cap \Upsilon )$. Clearly, each such inequality corresponds to a valid information inequality over $\{{X}_{i},i\in \mathcal{N}\}$.

#### 3.3. Non-Polyhedral Property

**Remark:** The non-polyhedral property of ${\overline{\Gamma}}^{*}\left({\mathcal{N}}_{4}\right)$ was later used in [16] to show that the set of achievable tuples of a network is in general also non-polyhedral. As a result, this proved that the Linear Programming bound is not tight in general.

**Theorem 3 (Matúš [9]).** The cone ${\overline{\Gamma}}^{*}\left({\mathcal{N}}_{4}\right)$ is not polyhedral.

**Remark:** Using a single nonlinear inequality, it can be proved that the set of all almost entropic functions is not polyhedral.

**Theorem 4 (Quadratic information inequality [17]).** Let $g\in {\overline{\Gamma}}^{*}\left(\mathcal{N}\right)$,

**Remark:** Subject to the constraint $b\left(g\right)>2a\left(g\right)$, the series of linear inequalities (20) is implied by the Shannon inequalities. Therefore, the constraint $b\left(g\right)>2a\left(g\right)$ imposed in Theorem 4 is not critical.

**Conjecture 1**

## 4. Equivalent Frameworks

#### 4.1. Differential Entropy

**Definition 1 (Differential entropies).** The **differential entropy** of $({X}_{i},i\in \alpha )$ is denoted by

**Remark:** For notational simplicity, we abuse notation by using $H\left(X\right)$ to denote both discrete and differential entropies; the exact meaning should be clear from the context.

**Definition 2 (Balanced inequalities).** An information inequality is called **balanced** if for all $n\in \mathcal{N}$, ${\sum}_{\alpha \subseteq \mathcal{N}:n\in \alpha}{c}_{\alpha}=0$.

**Example 1**

**Proposition 1 (Necessity and sufficiency of balanced inequalities [6]).** An information inequality ${\sum}_{\alpha \subseteq \mathcal{N}}{c}_{\alpha}H\left({X}_{\alpha}\right)\ge 0$ is a valid discrete information inequality if and only if

1. its residual weights satisfy ${r}_{n}\ge 0$ for all $n$, and
2. its balanced counterpart
$$\sum _{\alpha \subseteq \mathcal{N}}{c}_{\alpha}H\left({X}_{\alpha}\right)-\sum _{n\in \mathcal{N}}{r}_{n}H\left({X}_{n}|{X}_{i},i\ne n\right)\ge 0$$
is also valid.
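Residual weights are straightforward to compute: ${r}_{n}={\sum}_{\alpha \ni n}{c}_{\alpha}$. A small Python sketch (function names are our own) checks whether an inequality, given as a map from subsets to coefficients, is balanced:

```python
def residual_weights(coeffs, n_vars):
    """r_n = sum of c_alpha over all alpha containing n (Definition 2).
    coeffs maps tuples of variable indices to coefficients c_alpha."""
    return {n: sum(c for a, c in coeffs.items() if n in a)
            for n in range(n_vars)}

def is_balanced(coeffs, n_vars):
    """Balanced means every residual weight is zero."""
    return all(abs(r) < 1e-12 for r in residual_weights(coeffs, n_vars).values())

# H(X1) >= 0 over two variables: r_1 = 1, so it is NOT balanced.
coeffs = {(0,): 1.0}
r = residual_weights(coeffs, 2)

# Its balanced counterpart H(X1) - H(X1|X2) = I(X1;X2) >= 0 IS balanced.
mi_coeffs = {(0,): 1.0, (1,): 1.0, (0, 1): -1.0}
```

For instance, $H(X_1)\ge 0$ is unbalanced while $I(X_1;X_2)\ge 0$ is balanced, matching Example 1's territory: both are valid discretely, but only the balanced one survives in the continuous setting.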

**Theorem 5 (Equivalence [6]).** All information inequalities for continuous random variables are balanced. Furthermore, a balanced information inequality

#### 4.2. Inequalities for Kolmogorov Complexity

In the frameworks considered so far, the objects of interest are **random** variables. However, in the following Kolmogorov complexity framework, the objects of interest are **deterministic** strings instead.

**Theorem 6 (Equivalence [3]).** An information inequality (for discrete random variables) ${\sum}_{\alpha}{c}_{\alpha}H({X}_{i},i\in \alpha )\ge 0$ is valid if and only if the corresponding Kolmogorov complexity inequality defined below

#### 4.3. A Group-Theoretic Framework

**Definition 3 (Group-theoretic construction of random variables)**

**Theorem 7 (Group characterisable random variables [4]).** Let $G$ be a finite group and $\{{G}_{i},i\in \mathcal{N}\}$ be a set of subgroups of $G$. For each $i\in \mathcal{N}$, let ${X}_{i}$ be the random variable induced by the subgroup ${G}_{i}$ as defined above. Then for any $\alpha \subseteq \mathcal{N}$,

1. $H({X}_{i},i\in \alpha )=\log \left(\left|G\right|/|{\cap}_{i\in \alpha}{G}_{i}|\right)$,
2. $|\lambda ({X}_{i},i\in \alpha )|=\left|G\right|/|{\cap}_{i\in \alpha}{G}_{i}|$,
3. $({X}_{i},i\in \alpha )$ is uniformly distributed over its support. In other words, the probability distribution function of $({X}_{i},i\in \alpha )$ takes either the value zero or a single constant value.
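Theorem 7 can be made concrete with a toy example. The sketch below (a hypothetical instance, not from the paper) takes $G=\mathbb{Z}_6$ with subgroups $G_1=\{0,2,4\}$ and $G_2=\{0,3\}$, and lets $X_i$ be the coset of $G_i$ containing a uniformly drawn group element $\lambda$:

```python
import math

# Toy example: G = Z_6, G1 = {0,2,4}, G2 = {0,3}, so G1 ∩ G2 = {0}.
G = range(6)
G1, G2 = {0, 2, 4}, {0, 3}

def coset(g, Gi):
    """The coset g + Gi in Z_6 (written additively; Z_6 is abelian)."""
    return frozenset((g + x) % 6 for x in Gi)

def joint_entropy(*subgroups):
    """H(X_i, i in alpha) for the group-induced random variables.
    The tuple (X_i) is uniform over its support (Theorem 7, item 3),
    so its entropy is log2 of the number of distinct coset tuples."""
    values = {tuple(coset(g, Gi) for Gi in subgroups) for g in G}
    return math.log2(len(values))

# Theorem 7, item 1: H(X_i, i in alpha) = log(|G| / |∩ G_i|)
h1 = joint_entropy(G1)        # log2(6/3) = 1 bit
h2 = joint_entropy(G2)        # log2(6/2) = log2(3)
h12 = joint_entropy(G1, G2)   # log2(6/1) = log2(6)
```

The computed entropies match Theorem 7 exactly: the joint entropy is governed by the index of the intersection of the chosen subgroups.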

**Definition 4.** A function $h$ is **group characterisable** if it is the entropy function of a set of random variables $\left\{{X}_{1},\dots ,{X}_{n}\right\}$ induced by a finite group $G$ and its subgroups $\left\{{G}_{1},\dots ,{G}_{n}\right\}$. Furthermore, $h$ is

1. **representable** if $\{G,{G}_{1},\dots ,{G}_{n}\}$ are all vector spaces, and
2. **abelian** if $G$ is abelian.

**Theorem 8 (Group-theoretic inequalities [4]).** Let

**Theorem 9 (Converse [4]).** The information inequality (30) is valid if it is satisfied by all random variables induced by groups, or equivalently, if the group-theoretic inequality (32) is valid.

**Lemma 1 (Properties of group induced random variables)**

1. **(Functional dependency)** $H\left({X}_{l}|{X}_{i},i\in \alpha \right)=0$ (i.e., ${X}_{l}$ is a function of ${X}_{\alpha}$) if and only if ${\cap}_{i\in \alpha}{G}_{i}\subseteq {G}_{l}$. Hence, functional dependency is equivalent to the subset relation;
2. **(Independency)** $I({X}_{i};{X}_{j}|{X}_{l})=0$ if and only if
$$|{G}_{i}\cap {G}_{l}|\,|{G}_{j}\cap {G}_{l}|=|{G}_{l}|\,|{G}_{i}\cap {G}_{j}\cap {G}_{l}|;$$
3. **(Conditioning preserves group characterisation)** for any fixed $\alpha \subseteq \mathcal{N}$, the group $K\triangleq {\cap}_{i\in \alpha}{G}_{i}$ and its subgroups ${K}_{i}\triangleq K\cap {G}_{i}$ for $i\in \mathcal{N}$ induce a set of random variables $\{{Y}_{i},i\in \mathcal{N}\}$ such that
$$H({Y}_{i},i\in \beta )=H({X}_{i},i\in \beta |{X}_{j},j\in \alpha ),$$
or equivalently, $g\left(\beta \right)=h\left(\beta |\alpha \right)$.

**Proposition 2 (Duality [19]).** Let $\left\{{V}_{1},\dots ,{V}_{n}\right\}$ be a set of vector subspaces of $V\triangleq {\mathbb{F}}^{m}$ over the finite field $\mathbb{F}$. Define the following subspace ${W}_{i}$ for $i\in \mathcal{N}$:

**Remark:** While ${W}^{\perp}$ and $W$ are both subspaces of $V$ and $\dim W+\dim {W}^{\perp}=\dim V$, in general $\langle W,{W}^{\perp}\rangle \ne V$. If $\mathbb{F}=\mathbb{R}$, then ${W}_{i}$ (defined as in (34)) is the orthogonal complement of ${V}_{i}$.

**Example 2 (Group-theoretic Proof)**

#### 4.4. Combinatorial Perspective

**Definition 5 (Quasi-uniform random variables)**

**Definition 6 (Box assignment)**

**Definition 7 (Quasi-uniform box assignment).** A box assignment $\mathcal{A}$ is **quasi-uniform** if for any $\alpha \subseteq \mathcal{N}$, the cardinality of ${\mathcal{A}}_{\mathcal{N}|\alpha}\left({a}_{\alpha}\right)$ is the same for all ${a}_{\alpha}\in {\mathcal{A}}_{\alpha}$. For simplicity, we denote this constant by $|{\mathcal{A}}_{\mathcal{N}|\alpha}|$.
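Quasi-uniformity of a finite box assignment can be checked mechanically: for every projection, count how many points of the assignment sit over each projected value and verify the counts agree. A minimal Python sketch (names are our own):

```python
from itertools import combinations

def is_quasi_uniform(support, n):
    """Check Definition 7: for every proper nonempty alpha, each projected
    value a_alpha is hit by the same number of points of the support."""
    for r in range(1, n):
        for alpha in combinations(range(n), r):
            counts = {}
            for a in support:
                key = tuple(a[i] for i in alpha)
                counts[key] = counts.get(key, 0) + 1
            if len(set(counts.values())) > 1:
                return False
    return True

# The support of (X1, X2, X1 XOR X2) with X1, X2 uniform bits is
# quasi-uniform: every 1- or 2-coordinate projection is hit evenly.
parity = {(x, y, x ^ y) for x in (0, 1) for y in (0, 1)}

# A lopsided assignment is not: the value 0 of the first coordinate
# is hit twice, the value 1 only once.
lopsided = {(0, 0), (0, 1), (1, 0)}
```

This is the combinatorial counterpart of the quasi-uniform random variables of Definition 5, obtained by placing the uniform distribution on the box assignment.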

**Proposition 3 (Equivalence [7]).** Let $\left\{{X}_{1},\dots ,{X}_{n}\right\}$ be a set of quasi-uniform random variables and let $\mathcal{A}$ be the support of its probability distribution. Then $\mathcal{A}$ is a quasi-uniform box assignment in ${\prod}_{i\in \mathcal{N}}{\mathcal{X}}_{i}$. Furthermore, for all $\alpha \subseteq \mathcal{N}$,

**Theorem 10 (Combinatorial interpretation [7]).** An information inequality

**Example 3 (Combinatorial proof)**

#### 4.5. Coding Perspective

**Example 4**

**Example 5 (Linear codes)**

**Definition 8 (Weight enumerator)**

**Theorem 12 (Generalised Greene's Theorem [20]).** Let $C$ be a quasi-uniform code and $\left\{{X}_{1},\dots ,{X}_{n}\right\}$ be its induced quasi-uniform random variables. Suppose that $\rho$ is the entropy function of $\left\{{X}_{1},\dots ,{X}_{n}\right\}$; in other words, $\rho \left(\alpha \right)=H({X}_{i},i\in \alpha )$. Then

**Remark:** Greene's Theorem is a special case of Theorem 12 in which the code $C$ is a linear code.
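To make the setup of Theorem 12 concrete, the entropy function $\rho$ induced by a code is easy to compute directly: draw a codeword uniformly at random and let $X_i$ be its $i$-th coordinate. A short sketch (function names are our own), using the $[3,2]$ binary even-weight code:

```python
import math
from itertools import combinations

def code_entropy_function(C, n):
    """Entropy function rho(alpha) = H(X_i, i in alpha) for the random
    variables induced by drawing a codeword of C uniformly at random."""
    rho = {}
    total = len(C)
    for r in range(1, n + 1):
        for alpha in combinations(range(n), r):
            counts = {}
            for c in C:
                key = tuple(c[i] for i in alpha)
                counts[key] = counts.get(key, 0) + 1
            rho[alpha] = -sum((m / total) * math.log2(m / total)
                              for m in counts.values())
    return rho

# The [3,2] binary even-weight (single parity-check) code
C = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
rho = code_entropy_function(C, 3)
```

Because this code is linear, the induced variables are quasi-uniform, and each $\rho(\alpha)$ equals the base-2 logarithm of the size of the projection of $C$ onto $\alpha$: here 1 bit for each single coordinate and 2 bits for each pair and for the whole codeword.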

**Example 6**

## 5. Constrained Information Inequalities

#### 5.1. Rank Inequalities

**Theorem 13 (Ingleton inequality)**
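The Ingleton inequality, in one common form $r(12)+r(13)+r(14)+r(23)+r(24)\ge r(1)+r(2)+r(34)+r(123)+r(124)$ for the rank function $r$ of four subspaces (writing $r(12)$ for the rank of the span of $V_1$ and $V_2$), can be checked mechanically for concrete subspaces. A hedged Python sketch over $\mathbb{F}_2$ (helper names are our own):

```python
def gf2_rank(vectors):
    """Rank over GF(2) of vectors encoded as integer bitmasks,
    via incremental reduction against a leading-bit-indexed basis."""
    basis = {}  # highest set bit -> reduced basis vector
    rank = 0
    for v in vectors:
        while v:
            hb = v.bit_length() - 1
            if hb not in basis:
                basis[hb] = v
                rank += 1
                break
            v ^= basis[hb]
    return rank

def r(*subspaces):
    """Rank of the span (join) of the given subspaces."""
    return gf2_rank([g for Vs in subspaces for g in Vs])

# Four subspaces of F_2^3, each given by generators (one bitmask per
# basis vector): V1 = <100>, V2 = <010>, V3 = <110>, V4 = <001>.
V1, V2, V3, V4 = [0b100], [0b010], [0b110], [0b001]

lhs = r(V1, V2) + r(V1, V3) + r(V1, V4) + r(V2, V3) + r(V2, V4)
rhs = r(V1) + r(V2) + r(V3, V4) + r(V1, V2, V3) + r(V1, V2, V4)
# Ingleton requires lhs >= rhs for every representable rank function.
```

For this choice the inequality holds with slack one; as the footnote on Ingleton inequalities notes, it is not a valid information inequality in general, only a constraint satisfied by representable functions.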

**Theorem 14 (Kinser [23]).** Suppose $\mathcal{X}=\left\{{X}_{1},\dots ,{X}_{n}\right\}$ and $h$ is representable over $\mathcal{X}$. Then

**Theorem 15 (Dougherty et al. [24]).** Suppose $\mathcal{X}=\{A,B,{C}_{1},\dots ,{C}_{n}\}$ and $h$ is representable over $\mathcal{X}$. Then

**Remark:** In addition to the inequalities obtained in Theorem 15, the work [24] found all subspace rank inequalities in five variables (called DFZ inequalities) and many more new inequalities in six variables.

**Definition 9 (ϵ-truncation)**

**Definition 10 (Truncation-preserving inequalities)**

**Theorem 16 (Insufficiency of truncation-preserving inequalities [5]).** Let ${\Delta}_{n}$ be the set of all subspace rank inequalities involving $n$ variables (or subspaces). Then for sufficiently large $n$, ${\Delta}_{n}$ is not truncation-preserving.

#### 5.2. Determinantal Inequalities

**Definition 11 (Gaussian polymatroid).** A polymatroid $h$ is **Gaussian** if there exists a set of jointly Gaussian random variables $\{{Y}_{j},j\in \mathcal{K}\}$ with a $\left|\mathcal{K}\right|\times \left|\mathcal{K}\right|$ covariance matrix and a partition of $\mathcal{K}$ into $n$ disjoint nonempty subsets ${\beta}_{1},\dots ,{\beta}_{n}$ such that for any $\alpha \subseteq \mathcal{N}$,

$h$ is **weakly Gaussian** if there exists $\delta >0$ such that $\delta h$ is Gaussian, and **almost Gaussian** if $h$ is the limit of a sequence of weakly Gaussian functions.

- (**Hadamard inequality**) Let $K$ be a positive definite matrix. Then
$$\det K\le \prod _{i=1}^{\left|\mathcal{K}\right|}{K}_{i,i}$$
or equivalently, in terms of entropies,
$$H\left({Y}_{1},\dots ,{Y}_{k}\right)\le \sum _{i=1}^{k}H\left({Y}_{i}\right).$$
- (**Szasz inequality**) For any $1\le l<k$,
$${\left(\prod _{\beta :\left|\beta \right|=l}\det \left({K}_{\beta}\right)\right)}^{1/\binom{k-1}{l-1}}\ge {\left(\prod _{\beta :\left|\beta \right|=l+1}\det \left({K}_{\beta}\right)\right)}^{1/\binom{k-1}{l}}$$
or equivalently, in terms of entropies,
$$\frac{1}{\binom{k}{l}}\sum _{\beta :\left|\beta \right|=l}\frac{H({Y}_{i},i\in \beta )}{l}\ge \frac{1}{\binom{k}{l+1}}\sum _{\beta :\left|\beta \right|=l+1}\frac{H({Y}_{i},i\in \beta )}{l+1}.$$
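The determinant form of the Hadamard inequality is simple to verify numerically. A hedged sketch (pure Python, all helper names our own) samples positive definite matrices $K=AA^{T}+I$ and checks $\det K\le \prod_{i}K_{i,i}$:

```python
import random

def det(M):
    """Determinant of a small dense matrix via Gaussian elimination
    with partial pivoting."""
    n = len(M)
    M = [row[:] for row in M]
    d = 1.0
    for i in range(n):
        p = max(range(i, n), key=lambda q: abs(M[q][i]))
        if abs(M[p][i]) < 1e-12:
            return 0.0
        if p != i:
            M[i], M[p] = M[p], M[i]
            d = -d
        d *= M[i][i]
        for q in range(i + 1, n):
            f = M[q][i] / M[i][i]
            for c in range(i, n):
                M[q][c] -= f * M[i][c]
    return d

def random_pd(n, rng):
    """K = A A^T + I is positive definite for any real A."""
    A = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(n)]
    return [[sum(A[i][k] * A[j][k] for k in range(n)) + (i == j)
             for j in range(n)] for i in range(n)]

rng = random.Random(1)
hadamard_ok = all(
    det(K) <= K[0][0] * K[1][1] * K[2][2] + 1e-9
    for K in (random_pd(3, rng) for _ in range(200))
)
```

Via the entropy formula for jointly Gaussian vectors, this is exactly the statement that joint entropy never exceeds the sum of marginal entropies, i.e., the entropy form displayed above.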

## 6. Summary and Conclusions

## Acknowledgements

## References

1. Zhang, Z.; Yeung, R.W. On the characterization of entropy function via information inequalities. IEEE Trans. Inform. Theory **1998**, 44, 1440–1452.
2. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. **1948**, 27, 379–423, 623–656.
3. Hammer, D.; Romashchenko, A.; Shen, A.; Vereshchagin, N. Inequalities for Shannon entropy and Kolmogorov complexity. J. Comput. Syst. Sci. **2000**, 60, 442–464.
4. Chan, T.H.; Yeung, R.W. On a relation between information inequalities and group theory. IEEE Trans. Inform. Theory **2002**, 48, 1992–1995.
5. Chan, T.H.; Grant, A.; Kern, D. Novel technique in characterising representable polymatroids. IEEE Trans. Inform. Theory **2009**, submitted for publication.
6. Chan, T.H. Balanced information inequalities. IEEE Trans. Inform. Theory **2003**, 49, 3261–3267.
7. Chan, T.H. A combinatorial approach to information inequalities. Commun. Inform. Syst. **2001**, 1, 1–14.
8. Dougherty, R.; Freiling, C.; Zeger, K. Six new non-Shannon information inequalities. In Proceedings of the IEEE International Symposium on Information Theory, 2006; pp. 233–236.
9. Matúš, F. Infinitely many information inequalities. In Proceedings of ISIT 2007, Nice, France, June 2007.
10. Yeung, R.W. A framework for linear information inequalities. IEEE Trans. Inform. Theory **1997**, 43, 1924–1934.
11. Yeung, R.W.; Yan, Y. Information Theoretic Inequality Prover. Available online: http://user-www.ie.cuhk.edu.hk/ITIP/ (accessed on 27 January 2011).
12. Yeung, R.W. A First Course in Information Theory; Kluwer Academic/Plenum Publishers: New York, NY, USA, 2002.
13. Yeung, R.W.; Zhang, Z. A class of non-Shannon-type information inequalities and their applications. Commun. Inform. Syst. **2001**, 1, 87–100.
14. Sason, I. Identification of new classes of non-Shannon type constrained information inequalities and their relation to finite groups. In Proceedings of the 2002 IEEE International Symposium on Information Theory, Lausanne, Switzerland, 30 June–5 July 2002.
15. Makarychev, K.; Makarychev, Y.; Romashchenko, A.; Vereshchagin, N. A new class of non-Shannon-type inequalities for entropies. Commun. Inform. Syst. **2002**, 2, 147–165.
16. Chan, T.H.; Grant, A. Dualities between entropy functions and network codes. IEEE Trans. Inform. Theory **2008**, 54, 4470–4487.
17. Chan, T.H.; Grant, A. Non-linear information inequalities. Entropy **2008**, 10, 765–775.
18. Strictly speaking, the Kolmogorov complexity of a string depends on the chosen "computer model". However, the choice of the computer model affects the resulting Kolmogorov complexity only up to an additive constant (because different computer models can emulate each other). Asymptotically, such a difference is not significant.
19. Chan, T.H.; Grant, A. Linear programming bounds for network coding. IEEE Trans. Inform. Theory **2011**, submitted for publication.
20. Chan, T.H.; Grant, A.; Britz, T. Properties of quasi-uniform codes. In Proceedings of the 2010 IEEE International Symposium on Information Theory, Austin, TX, USA, June 2010.
21. Ingleton inequalities are not valid information inequalities, as there exist almost entropic polymatroids violating the inequalities.
22. Guille, L.; Chan, T.H.; Grant, A. The minimal set of Ingleton inequalities. IEEE Trans. Inform. Theory **2009**.
23. Kinser, R. New inequalities for subspace arrangements. J. Combin. Theory Ser. A **2010**.
24. Dougherty, R.; Freiling, C.; Zeger, K. Linear rank inequalities on five or more variables. arXiv preprint, **2009**.
25. Each $X_i$ can be a vector of jointly distributed Gaussian random variables as defined in (61).
26. Lutwak, E.; Yang, D.; Zhang, G. Cramér–Rao and moment-entropy inequalities for Rényi entropy and generalized Fisher information. IEEE Trans. Inform. Theory **2005**, 51, 473–478.
27. Lutwak, E.; Yang, D.; Zhang, G. Moment-entropy inequalities for a random vector. IEEE Trans. Inform. Theory **2007**, 53, 1603–1607.

© 2011 by the author; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license http://creativecommons.org/licenses/by/3.0/.

## Share and Cite

**MDPI and ACS Style**

Chan, T. Recent Progresses in Characterising Information Inequalities. *Entropy* **2011**, *13*, 379-401.
https://doi.org/10.3390/e13020379
