# Information-Theoretic Analysis of Memoryless Deterministic Systems

## Abstract

## 1. Introduction

#### Related Work

## 2. Definition and Elementary Properties of Information Loss and Relative Information Loss

#### 2.1. Notation and Preliminaries

#### 2.2. Information Loss

Definition 1 (Information Loss)

Proposition 1 (Information Loss of a Cascade)

**Proof.**

**Proposition 2.** Let $g\colon \mathcal{X}\to \mathcal{Y}$, $\mathcal{X}\subseteq \mathbb{R}^{N}$, and let the input RV X be such that its probability measure $P_{X}$ has an absolutely continuous component $P_{X}^{ac}\ll \lambda^{N}$ which is supported on $\mathcal{X}$. If there exists a set $B\subseteq \mathcal{Y}$ of positive $P_{Y}$-measure such that the preimage $g^{-1}(y)$ is uncountable for every $y\in B$, then

#### 2.3. Relative Information Loss

Definition 2 (Relative Information Loss)

**Definition 3.** The information dimension of an RV X is

**Proposition 3.** Let X be an N-dimensional RV with positive information dimension $d(X)$ and finite $H(\widehat{X}^{(0)})$. If $H(\widehat{X}^{(0)}|Y=y)<\infty$ and $d(X|Y=y)$ exists $P_{Y}$-a.s., then the relative information loss equals

#### 2.4. Interplay between Information Loss and Relative Information Loss

**Proposition 4.** Let X be such that $H(X)=\infty$ and let $l(X\to Y)>0$. Then, $L(X\to Y)=\infty$.
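The following is only a sketch of why a statement of this kind can hold, not the authors' proof. It assumes the quantization-based forms of the two quantities, which are not reproduced in this outline: $L(X\to Y)=\lim_{n\to\infty}H(\widehat{X}^{(n)}|Y)$ and $l(X\to Y)=\lim_{n\to\infty}H(\widehat{X}^{(n)}|Y)/H(\widehat{X}^{(n)})$, with $H(\widehat{X}^{(n)})$ finite for each $n$ and $H(\widehat{X}^{(n)})\to H(X)$ as the quantization is refined.

$$
L(X\to Y)
=\lim_{n\to\infty} H(\widehat{X}^{(n)}|Y)
=\lim_{n\to\infty}\underbrace{\frac{H(\widehat{X}^{(n)}|Y)}{H(\widehat{X}^{(n)})}}_{\to\, l(X\to Y)\,>\,0}\;\underbrace{H(\widehat{X}^{(n)})}_{\to\, H(X)\,=\,\infty}
=\infty .
$$

Under these assumptions, a strictly positive fraction of an unboundedly growing entropy is itself unbounded.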

## 3. Information Loss for Piecewise Bijective Functions

Definition 4 (Piecewise Bijective Function)

#### 3.1. Information Loss in PBFs

**Proposition 5.** The information loss induced by a PBF is given as

#### 3.2. Upper Bounds on the Information Loss

**Proposition 6.** The information loss induced by a PBF can be upper bounded by the following ordered set of inequalities:

#### 3.3. Reconstruction and Reconstruction Error Probability

Definition 5 (Reconstructor & Reconstruction Error)

Proposition 7 (MAP Reconstructor)

**Proof.**

Definition 6 (Bijective Part)

Proposition 8 (Fano-Type Bound)

**Proof.**

**Proposition 9.** The information loss $L(X\to Y)$ in a PBF is lower bounded by the error probability $P_{e}$ of a MAP reconstructor via

## 4. Information Loss for Systems that Reduce Dimensionality

#### 4.1. Relative Information Loss for Continuous Input RVs

Proposition 10 (Relative Information Loss in Dimensionality Reduction)

**Proof.**

**Corollary 1.**

**Corollary 2.**

#### 4.2. Reconstruction and Reconstruction Error Probability

**Proposition 11.**

**Proof.**

## 5. Some Examples from Signal Processing and Communications

#### 5.1. Quantizer

#### 5.2. Center Clipper

#### 5.3. Adding Two RVs

#### 5.4. Square-Law Device and Gaussian Input

#### 5.5. Polynomials

#### 5.6. Energy Detection of Communication Signals

#### 5.7. Principal Components Analysis and Dimensionality Reduction

## 6. Discussion and Outlook

## Abbreviations

Abbreviation | Meaning |
---|---|
MSRE | mean squared reconstruction error |
RV | random variable |
PDF | probability density function |
MMSE | minimum mean squared error |
PBF | piecewise bijective function |
MAP | maximum a posteriori |
PCA | principal components analysis |

## Appendix A. Proof of Proposition 8

## Appendix B. Proof of Proposition 10

## Appendix C. Proof of Proposition 11

**Figure 1.** Two different outputs of the rectifier, a (**left**) and b (**right**) with $a>b$. Both outputs lead to the same uncertainty about the input (and to the same reconstruction error probability), but to different mean squared reconstruction errors (MSREs): Assuming both possible inputs are equally probable, the MSREs are $2a^{2}>2b^{2}$. Energy and information behave differently.
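The numbers in this caption can be checked with a small sketch. It assumes a rectifier $y=|x|$ and a reconstructor that always guesses the positive candidate $+y$; the function name `rectifier_stats` and the example values of a and b are hypothetical and chosen only for illustration.

```python
import math


def rectifier_stats(y, p_pos=0.5):
    """Uncertainty (bits), error probability and MSRE after observing y = |x|.

    Assumption: the reconstructor always guesses the positive candidate +y.
    """
    p_neg = 1.0 - p_pos
    # Entropy of the sign of x given |x| = y (independent of the value of y).
    uncertainty = -sum(p * math.log2(p) for p in (p_pos, p_neg) if p > 0)
    # The guess +y is wrong exactly when x = -y.
    error_prob = p_neg
    # A wrong guess is off by 2y; a correct guess has zero error.
    msre = p_neg * (2.0 * y) ** 2
    return uncertainty, error_prob, msre


a, b = 3.0, 1.0  # hypothetical outputs with a > b
print(rectifier_stats(a))  # uncertainty and error probability do not depend on a
print(rectifier_stats(b))  # only the MSRE (2 * b**2 here) changes
```

With equally probable inputs, both outputs give one bit of uncertainty and an error probability of 0.5, while the MSRE scales with the squared output value.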

**Figure 2.** Definition and properties of information loss. (**a**) Model for computing the information loss in a memoryless input-output system g. Q is a quantizer with partition $\mathcal{P}_{n}$; (**b**) The information loss of the cascade equals the sum of the individual information losses of the constituent systems.

**Figure 3.** The center clipper: an example of a system with infinite information loss and infinite information transfer.

**Figure 4.** Third-order polynomial of Section 5.5. (**a**) The function and its MAP reconstructor indicated with a thick red line; (**b**) Information loss as a function of input variance $\sigma^{2}$.

**Figure 5.** Constellation diagrams used in the example in Section 5.6: (**a**) 16-PSK; (**b**) 16-QAM; and (**c**) circular 16-QAM.

**Figure 6.** Mutual information between the constellation points of Figure 5 (normalized to unit energy) and the noisy output of the energy detector $Y_{1}=\int_{0}^{T_{I}}(R(t)+N(t))^{2}\,dt$. $N(t)$ is a Gaussian noise signal with standard deviation σ (see text). Note that the maximum mutual information in the noiseless case is bounded by $4-L((A,B)\to Y_{1})=4-L((A,B)\to Y)$ according to Table 1. The mutual information for 16-PSK with $T_{I}=1$ is zero and hence not depicted.

**Table 1.** Information loss $L((A,B)\to (Y_{1},\cdots ,Y_{1/T_{I}}))$ in the energy detector as a function of the constellation and the integration time $T_{I}$.

$T_{I}$ | 1, 1/2, 1/4 | 1/3 |
---|---|---|
16-PSK | 4 | 1.75 |
16-QAM | 2.5 | 2 |
Circular 16-QAM | 2 | 1.5 |
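Two of the $T_{I}=1$ entries can be reproduced with a short sketch under several assumptions that are not stated in this outline: the symbols are equiprobable, the noiseless detector output for $T_{I}=1$ is determined by the symbol energy $A^{2}+B^{2}$, 16-PSK lies on the unit circle, 16-QAM uses the standard grid with coordinates in $\{\pm 1,\pm 3\}$, and for a discrete input the information loss reduces to $H(X)-H(Y)$. The helper names are hypothetical. Circular 16-QAM is left out because its geometry is not specified here.

```python
import math
from collections import Counter


def entropy_bits(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def energy_detector_loss(points):
    """H(X) - H(Y) for equiprobable points (a, b) and Y = a^2 + b^2."""
    n = len(points)
    # Round so that energies differing only by floating-point error are merged.
    counts = Counter(round(a * a + b * b, 9) for a, b in points)
    h_x = math.log2(n)
    h_y = entropy_bits(c / n for c in counts.values())
    return h_x - h_y


# Assumed constellation geometries (see lead-in).
psk16 = [(math.cos(2 * math.pi * k / 16), math.sin(2 * math.pi * k / 16))
         for k in range(16)]
qam16 = [(a, b) for a in (-3, -1, 1, 3) for b in (-3, -1, 1, 3)]

print("16-PSK loss:", energy_detector_loss(psk16))
print("16-QAM loss:", energy_detector_loss(qam16))
```

Under these assumptions all 16-PSK points share one energy level, so all 4 bits are lost, while 16-QAM has three energy levels with 4, 8 and 4 points, which leaves 1.5 bits at the output and a loss of 2.5 bits.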

**Table 2.** Comparison of results for some examples from Section 5. While there is a close connection between information loss and the reconstruction error probability (cf. Propositions 8 and 11), there is no apparent connection between information loss and the MSRE: energy and information behave inherently differently.

Example | MSRE | $L(X\to Y)$ | $l(X\to Y)$ | $P_{e}$ |
---|---|---|---|---|
$Y=\widehat{X}^{(n)}$, $P_{X}\ll \lambda$ | $\approx 2^{-2n}/12$ | ∞ | 1 | 100% |
Center Clipper, $P_{X}\ll \lambda$ | $P_{X}(\mathcal{C})\,\mathbb{E}\left(X^{2}\mid X\in \mathcal{C}\right)$ | ∞ | $P_{X}(\mathcal{C})$ | $P_{X}(\mathcal{C})$ |
$Y=X_{1}+X_{2}$, $P_{X_{1},X_{2}}\ll \lambda^{2N}$ | – | ∞ | $1/2$ | 100% |
$Y=X^{2}$, $f_{X}(x)=f_{X}(-x)$ | $\mathbb{E}\left(X^{2}\right)$ | 1 | 0 | 50% |
$Y=X^{3}-100X$, X Gaussian | – | Figure 4b | 0 | $2Q\left(\frac{10}{\sqrt{3}\sigma}\right)-2Q\left(\frac{20}{\sqrt{3}\sigma}\right)$ |
PCA, $P_{X}\ll \lambda^{N}$ | min | ∞ | $\frac{N-M}{N}$ | 100% |
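The error probability listed for the third-order polynomial can be evaluated numerically. The sketch below assumes a zero-mean Gaussian input with standard deviation σ and takes $Q$ to be the standard Gaussian tail probability; the function names are hypothetical. The two thresholds have a direct reading: $\pm 10/\sqrt{3}$ are the local extrema of $x^{3}-100x$, and $\pm 20/\sqrt{3}$ are the points where the outer branches reach the same output values as those extrema.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def map_error_probability(sigma):
    """P_e expression from Table 2 for Y = X^3 - 100 X.

    Assumption: X is zero-mean Gaussian with standard deviation sigma.
    """
    inner = 10.0 / (math.sqrt(3.0) * sigma)  # local extrema of the polynomial
    outer = 20.0 / (math.sqrt(3.0) * sigma)  # end of the ambiguous input range
    return 2.0 * q_function(inner) - 2.0 * q_function(outer)


for sigma in (1.0, 5.0, 10.0, 100.0):
    print(f"sigma = {sigma:6.1f}  P_e = {map_error_probability(sigma):.4f}")
```

The expression is the probability that $|X|$ falls between the two thresholds, so it vanishes both for very small σ (the input stays on the middle branch) and for very large σ (the input mostly falls outside the ambiguous range).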

