# Approximating Information Measures for Fields

## Abstract


## 1. Introduction

**Definition 1.**

**Theorem 1.** Let $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$, and $\mathcal{D}$ be arbitrary fields. Then:

1. $I(\mathcal{A};\mathcal{B}|\mathcal{C})=I(\mathcal{A};\sigma(\mathcal{B})|\mathcal{C})=I(\mathcal{A};\mathcal{B}|\sigma(\mathcal{C}))$ (invariance of completion);
2. $I(\mathcal{A};\mathcal{B}\wedge \mathcal{C}|\mathcal{D})=I(\mathcal{A};\mathcal{B}|\mathcal{D})+I(\mathcal{A};\mathcal{C}|\mathcal{B}\wedge \mathcal{D})$ (chain rule).
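Both identities can be checked numerically in the simplest setting, where each field is generated by a coordinate of a discrete random vector. The helper `cond_mi` below is an illustrative sketch (not from the paper); it computes $I(X;Y|Z)$ directly from a joint pmf.

```python
import math
import random

def cond_mi(pmf, xi, yi, zi):
    """I(X;Y|Z) in nats for a joint pmf over outcome tuples.
    xi, yi, zi are tuples of coordinate indices selecting X, Y, Z."""
    def marg(idx):
        m = {}
        for o, p in pmf.items():
            k = tuple(o[i] for i in idx)
            m[k] = m.get(k, 0.0) + p
        return m
    pxyz, pxz, pyz, pz = (marg(xi + yi + zi), marg(xi + zi),
                          marg(yi + zi), marg(zi))
    total = 0.0
    for o, p in pmf.items():
        num = pxyz[tuple(o[i] for i in xi + yi + zi)] * pz[tuple(o[i] for i in zi)]
        den = pxz[tuple(o[i] for i in xi + zi)] * pyz[tuple(o[i] for i in yi + zi)]
        total += p * math.log(num / den)
    return total

# A random strictly positive joint pmf over four binary coordinates (A, B, C, D).
random.seed(0)
outcomes = [(a, b, c, d) for a in (0, 1) for b in (0, 1)
            for c in (0, 1) for d in (0, 1)]
w = [random.random() + 0.1 for _ in outcomes]
pmf = {o: x / sum(w) for o, x in zip(outcomes, w)}

# Chain rule: I(A; B ∧ C | D) = I(A; B | D) + I(A; C | B ∧ D).
lhs = cond_mi(pmf, (0,), (1, 2), (3,))
rhs = cond_mi(pmf, (0,), (1,), (3,)) + cond_mi(pmf, (0,), (2,), (1, 3))
assert abs(lhs - rhs) < 1e-12
```

Since the chain rule is an identity, the assertion holds for any joint distribution; the random pmf merely avoids accidental structure.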

## 2. Proofs

**Theorem 2.** Let $\mathcal{A}$, $\mathcal{B}$, ${\mathcal{B}}_{1}$, ${\mathcal{B}}_{2}$, ${\mathcal{B}}_{n}$, and $\mathcal{C}$ be fields. Then:

1. $I(\mathcal{A};\mathcal{B}|\mathcal{C})=I(\mathcal{B};\mathcal{A}|\mathcal{C})$;
2. $I(\mathcal{A};\mathcal{B}|\mathcal{C})\ge 0$, with equality if and only if $P(A\cap B|\mathcal{C})=P(A|\mathcal{C})P(B|\mathcal{C})$ almost surely for all $A\in \mathcal{A}$ and $B\in \mathcal{B}$;
3. $I(\mathcal{A};\mathcal{B}|\mathcal{C})\le \min(H(\mathcal{A}|\mathcal{C}),H(\mathcal{B}|\mathcal{C}))$;
4. $I(\mathcal{A};{\mathcal{B}}_{1}|\mathcal{C})\le I(\mathcal{A};{\mathcal{B}}_{2}|\mathcal{C})$ if ${\mathcal{B}}_{1}\subset {\mathcal{B}}_{2}$;
5. $I(\mathcal{A};{\mathcal{B}}_{n}|\mathcal{C})\uparrow I(\mathcal{A};\mathcal{B}|\mathcal{C})$ for ${\mathcal{B}}_{n}\uparrow \mathcal{B}$.
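For finite partitions, these properties reduce to familiar inequalities for discrete random variables and can be verified numerically. The helpers below are an illustrative sketch, not part of the paper's formal development; the field inclusion ${\mathcal{B}}_{1}\subset {\mathcal{B}}_{2}$ is modeled by letting ${\mathcal{B}}_{2}$ be generated by a strictly larger set of coordinates.

```python
import math
import random

def marg(pmf, idx):
    """Marginal pmf of the coordinates listed in idx."""
    m = {}
    for o, p in pmf.items():
        k = tuple(o[i] for i in idx)
        m[k] = m.get(k, 0.0) + p
    return m

def entropy(pmf, idx):
    return -sum(p * math.log(p) for p in marg(pmf, idx).values() if p > 0)

def cond_entropy(pmf, xi, zi):
    return entropy(pmf, xi + zi) - entropy(pmf, zi)

def cond_mi(pmf, xi, yi, zi):
    # I(X;Y|Z) = H(X|Z) + H(Y|Z) - H(X,Y|Z)
    return (cond_entropy(pmf, xi, zi) + cond_entropy(pmf, yi, zi)
            - cond_entropy(pmf, xi + yi, zi))

random.seed(1)
outcomes = [(a, b, c, d) for a in (0, 1) for b in (0, 1)
            for c in (0, 1) for d in (0, 1)]
w = [random.random() + 0.1 for _ in outcomes]
pmf = {o: x / sum(w) for o, x in zip(outcomes, w)}

i_abc = cond_mi(pmf, (0,), (1,), (2,))
tol = 1e-12
assert abs(i_abc - cond_mi(pmf, (1,), (0,), (2,))) < tol      # 1. symmetry
assert i_abc >= -tol                                          # 2. nonnegativity
assert i_abc <= min(cond_entropy(pmf, (0,), (2,)),
                    cond_entropy(pmf, (1,), (2,))) + tol      # 3. entropy bound
assert (cond_mi(pmf, (0,), (1,), (3,))
        <= cond_mi(pmf, (0,), (1, 2), (3,)) + tol)            # 4. monotonicity
```

Property 5 (continuity from below) concerns increasing sequences of infinite fields and has no finite analogue to test here.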

**Theorem 3.** For any field $\mathcal{K}$ and any event $G\in \sigma (\mathcal{K})$, there is a sequence of events ${K}_{1},{K}_{2},\dots \in \mathcal{K}$ such that
$$\lim_{n\to \infty}P(G\triangle {K}_{n})=0.$$

**Proof.** Let $\mathcal{G}$ be the class of events $G\in \sigma (\mathcal{K})$ for which there exist ${K}_{1},{K}_{2},\dots \in \mathcal{K}$ with $\lim_{n\to \infty}P(G\triangle {K}_{n})=0$. We verify that $\mathcal{G}$ is a $\sigma$-field containing $\mathcal{K}$, whence $\mathcal{G}=\sigma (\mathcal{K})$.

- We have $\Omega \in \mathcal{K}$. Hence, $\Omega \in \mathcal{G}$.
- For $A\in \mathcal{G}$, consider ${K}_{1},{K}_{2},\dots \in \mathcal{K}$ such that $\lim_{n\to \infty}P(A\triangle {K}_{n})=0$. Then $A\triangle {K}_{n}={A}^{c}\triangle {K}_{n}^{c}$, where ${K}_{1}^{c},{K}_{2}^{c},\dots \in \mathcal{K}$. Hence, ${A}^{c}\in \mathcal{G}$.
- For ${A}_{1},{A}_{2},\dots \in \mathcal{G}$, consider events ${K}_{i}^{n}\in \mathcal{K}$ such that $P({A}_{i}\triangle {K}_{i}^{n})\le {2}^{-n}$. Then,
$$P\left(\left(\bigcap_{i=1}^{n}{A}_{i}\right)\triangle \left(\bigcap_{i=1}^{n}{K}_{i}^{i+n}\right)\right)\le \sum_{i=1}^{n}P({A}_{i}\triangle {K}_{i}^{i+n})\le {2}^{-n}.$$
Moreover,
$$P\left(\left(\bigcap_{i=1}^{\infty}{A}_{i}\right)\triangle \left(\bigcap_{i=1}^{n}{A}_{i}\right)\right)=P\left(\bigcap_{i=1}^{n}{A}_{i}\right)-P\left(\bigcap_{i=1}^{\infty}{A}_{i}\right).$$
By the triangle inequality for the symmetric difference,
$$\begin{aligned}
&P\left(\left(\bigcap_{i=1}^{\infty}{A}_{i}\right)\triangle \left(\bigcap_{i=1}^{n}{K}_{i}^{i+n}\right)\right)\\
&\quad \le P\left(\left(\bigcap_{i=1}^{\infty}{A}_{i}\right)\triangle \left(\bigcap_{i=1}^{n}{A}_{i}\right)\right)+P\left(\left(\bigcap_{i=1}^{n}{A}_{i}\right)\triangle \left(\bigcap_{i=1}^{n}{K}_{i}^{i+n}\right)\right)\\
&\quad \le P\left(\bigcap_{i=1}^{n}{A}_{i}\right)-P\left(\bigcap_{i=1}^{\infty}{A}_{i}\right)+{2}^{-n},
\end{aligned}$$
which tends to $0$ as $n\to \infty$. Hence, $\bigcap_{i=1}^{\infty}{A}_{i}\in \mathcal{G}$, so $\mathcal{G}$ is closed under countable intersections. □
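To make the approximation concrete, take $P$ to be Lebesgue measure on $[0,1)$ and $\mathcal{K}$ the field of finite unions of dyadic intervals. The event $G=[0,1/3)$ lies in $\sigma (\mathcal{K})$ but not in $\mathcal{K}$, yet its dyadic truncations ${K}_{n}=[0,\lfloor {2}^{n}/3\rfloor /{2}^{n})$ satisfy $P(G\triangle {K}_{n})\le {2}^{-n}$. A quick numerical check (an illustration of the theorem, not part of the proof):

```python
# P = Lebesgue measure on [0, 1); K = finite unions of dyadic intervals.
# G = [0, 1/3) is in sigma(K) but not in K; approximate it from inside by
# K_n = [0, floor(2^n / 3) / 2^n), a dyadic interval belonging to K.
for n in range(1, 40):
    k_n = (2 ** n // 3) / 2 ** n       # right endpoint of K_n (exact in binary)
    sym_diff = 1 / 3 - k_n             # P(G △ K_n), since K_n is a subset of G
    assert 0 <= sym_diff <= 2 ** -n    # the approximation error vanishes
```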

**Theorem 4.** Fix an $\epsilon \in (0,{e}^{-1}]$ and a field $\mathcal{C}$. For finite partitions $\alpha ={\left\{{A}_{i}\right\}}_{i=1}^{I}$ and ${\alpha}^{\prime}={\left\{{A}_{i}^{\prime}\right\}}_{i=1}^{I}$ such that $P({A}_{i}\triangle {A}_{i}^{\prime})\le \epsilon$ for all $i\in \left\{1,\dots ,I\right\}$, we have

**Proof.**
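The displayed bound is elided above, but the phenomenon it captures, continuity of entropy under small perturbations of a partition, can be illustrated numerically. The restriction $\epsilon \le {e}^{-1}$ suggests a bound expressed through $\eta (\epsilon)=-\epsilon \log \epsilon$, which is increasing on $(0,{e}^{-1}]$; the comparison against $I\cdot \eta (\epsilon)$ below is an assumption for illustration only, not the theorem's exact statement, and $\mathcal{C}$ is taken trivial.

```python
import math

def eta(x):
    """eta(x) = -x log x, increasing on (0, 1/e]."""
    return -x * math.log(x) if x > 0 else 0.0

def entropy(probs):
    return sum(eta(p) for p in probs)

# A two-cell partition {A1, A2} perturbed by moving mass eps from A1 to A2,
# so that P(A_i triangle A_i') <= eps for both cells.
I = 2  # number of cells
for eps in (0.001, 0.01, 0.05, 0.1, 0.3):   # all within (0, 1/e]
    h = entropy([0.5, 0.5])
    h_pert = entropy([0.5 - eps, 0.5 + eps])
    # assumed illustrative bound of order I * eta(eps); not the paper's bound
    assert abs(h - h_pert) <= I * eta(eps)
```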

**Proof of Theorem 1.1 (invariance of completion).** Consider some measurable fields $\mathcal{A}$, $\mathcal{B}$, and $\mathcal{C}$. We are going to demonstrate

**Theorem 5.** Let $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$, and $\mathcal{D}$ be subfields of $\mathcal{J}$. We have

**Proof.**

**Theorem 6.** Let $\alpha ={\left\{{A}_{i}\right\}}_{i=1}^{I}$ be a finite partition and let $\mathcal{C}$ be a field. For each $\epsilon >0$, there exists a finite partition ${\gamma}^{\prime}\subset \sigma \left(\mathcal{C}\right)$ such that for any partition $\gamma \subset \sigma \left(\mathcal{C}\right)$ finer than ${\gamma}^{\prime}$ we have

**Proof.**

**Proof of Theorem 1.2 (chain rule).** Let $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$, and $\mathcal{D}$ be arbitrary fields, and let $\alpha$, $\beta$, $\gamma$, and $\delta$ be finite partitions. Our point of departure is the chain rule for finite partitions [9] (Equation 2.60)
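For reference, the finite-partition chain rule invoked here can be written in random-variable form (a standard identity; cf. [9]):

```latex
% Chain rule for (conditional) mutual information of discrete variables:
\[
  I(X; Y, Z \mid W) = I(X; Y \mid W) + I(X; Z \mid Y, W),
\]
% which, applied to the discrete variables indexing the cells of the
% finite partitions alpha, beta, gamma, delta, yields the partition
% analogue of the chain rule in Theorem 1.
```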

## 3. Applications


## References

1. Dębowski, Ł. A general definition of conditional information and its application to ergodic decomposition. Stat. Probab. Lett. **2009**, 79, 1260–1268.
2. Dębowski, Ł. On the Vocabulary of Grammar-Based Codes and the Logical Consistency of Texts. IEEE Trans. Inf. Theory **2011**, 57, 4589–4599.
3. Dębowski, Ł. Is Natural Language a Perigraphic Process? The Theorem about Facts and Words Revisited. Entropy **2018**, 20, 85.
4. Gelfand, I.M.; Kolmogorov, A.N.; Yaglom, A.M. Towards the general definition of the amount of information. Dokl. Akad. Nauk SSSR **1956**, 111, 745–748. (In Russian)
5. Dobrushin, R.L. A general formulation of the fundamental Shannon theorems in information theory. Uspekhi Mat. Nauk **1959**, 14, 3–104. (In Russian)
6. Pinsker, M.S. Information and Information Stability of Random Variables and Processes; Holden-Day: San Francisco, CA, USA, 1964.
7. Wyner, A.D. A definition of conditional mutual information for arbitrary ensembles. Inf. Control **1978**, 38, 51–59.
8. Billingsley, P. Probability and Measure; John Wiley: New York, NY, USA, 1979.
9. Cover, T.M.; Thomas, J.A. Elements of Information Theory; John Wiley: New York, NY, USA, 1991.
10. Crutchfield, J.P.; Feldman, D.P. Regularities unseen, randomness observed: The entropy convergence hierarchy. Chaos **2003**, 15, 25–54.
11. Birkhoff, G.D. Proof of the ergodic theorem. Proc. Natl. Acad. Sci. USA **1932**, 17, 656–660.
12. Rokhlin, V.A. On the fundamental ideas of measure theory. Am. Math. Soc. Transl. Ser. 1 **1962**, 10, 1–54.
13. Gray, R.M.; Davisson, L.D. The ergodic decomposition of stationary discrete random processes. IEEE Trans. Inf. Theory **1974**, 20, 625–636.
14. Löhr, W. Properties of the Statistical Complexity Functional and Partially Deterministic HMMs. Entropy **2009**, 11, 385–401.
15. Crutchfield, J.P.; Marzen, S. Signatures of infinity: Nonergodicity and resource scaling in prediction, complexity, and learning. Phys. Rev. E **2015**, 91, 050106.
16. Hilberg, W. Der bekannte Grenzwert der redundanzfreien Information in Texten—eine Fehlinterpretation der Shannonschen Experimente? Frequenz **1990**, 44, 243–248.
17. Shannon, C. Prediction and entropy of printed English. Bell Syst. Tech. J. **1951**, 30, 50–64.
18. Takahira, R.; Tanaka-Ishii, K.; Dębowski, Ł. Entropy Rate Estimates for Natural Language—A New Extrapolation of Compressed Large-Scale Corpora. Entropy **2016**, 18, 364.
19. Herdan, G. Quantitative Linguistics; Butterworths: London, UK, 1964.
20. Heaps, H.S. Information Retrieval—Computational and Theoretical Aspects; Academic Press: New York, NY, USA, 1978.
21. Hahn, M.; Futrell, R. Estimating Predictive Rate-Distortion Curves via Neural Variational Inference. Entropy **2019**, 21, 640.
22. Braverman, M.; Chen, X.; Kakade, S.M.; Narasimhan, K.; Zhang, C.; Zhang, Y. Calibration, Entropy Rates, and Memory in Language Models. arXiv **2019**, arXiv:1906.05664.
23. Dębowski, Ł. Mixing, Ergodic, and Nonergodic Processes with Rapidly Growing Information between Blocks. IEEE Trans. Inf. Theory **2012**, 58, 3392–3401.
24. Dębowski, Ł. On Hidden Markov Processes with Infinite Excess Entropy. J. Theor. Probab. **2014**, 27, 539–551.
25. Travers, N.F.; Crutchfield, J.P. Infinite Excess Entropy Processes with Countable-State Generators. Entropy **2014**, 16, 1396–1413.
26. Dębowski, Ł. Maximal Repetition and Zero Entropy Rate. IEEE Trans. Inf. Theory **2018**, 64, 2212–2219.

© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Dębowski, Ł. Approximating Information Measures for Fields. *Entropy* **2020**, *22*, 79.
https://doi.org/10.3390/e22010079
