# Generalised Measures of Multivariate Information Content

## Abstract

## 1. Introduction

## 2. Mutual Information Content

## 3. Marginal Information Sharing

**Theorem 1.**

**Proof.**

## 4. Synergistic Information Content

## 5. Properties of the Union and Intersection Information Content

**Property 1.** The union and intersection information content are idempotent.

**Property 2.** The union and intersection information content are commutative.

**Property 3.** The union and intersection information content are associative.

**Property 4.** The union and intersection information content are connected by absorption.

**Property 5.** The union and intersection information content distribute over each other.
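Properties 1 to 5 are the usual distributive-lattice laws. The sketch below checks them numerically under the assumption, taken from the caption of Figure 3, that for a single realisation the union information content is the larger of the two marginal information contents and the intersection information content is the smaller. The function names and the sample values are hypothetical and chosen only for illustration.

```python
import itertools

# Assumption (from the Figure 3 caption): per realisation, the union
# information content is the maximum of the marginal information contents
# and the intersection information content is the minimum.
def union(a, b):
    return max(a, b)

def intersection(a, b):
    return min(a, b)

def check_lattice_laws(values):
    """Return True if max/min satisfy Properties 1-5 on all triples of values."""
    for a, b, c in itertools.product(values, repeat=3):
        # Property 1: idempotence
        assert union(a, a) == a and intersection(a, a) == a
        # Property 2: commutativity
        assert union(a, b) == union(b, a)
        assert intersection(a, b) == intersection(b, a)
        # Property 3: associativity
        assert union(a, union(b, c)) == union(union(a, b), c)
        assert intersection(a, intersection(b, c)) == intersection(intersection(a, b), c)
        # Property 4: absorption
        assert union(a, intersection(a, b)) == a
        assert intersection(a, union(a, b)) == a
        # Property 5: each operation distributes over the other
        assert union(a, intersection(b, c)) == intersection(union(a, b), union(a, c))
        assert intersection(a, union(b, c)) == union(intersection(a, b), intersection(a, c))
    return True

# Hypothetical information contents in bits.
sample_values = [0.0, 0.5, 1.0, 2.5]
laws_hold = check_lattice_laws(sample_values)
```

Because no arithmetic is performed on the values, the equalities are exact even for floating-point inputs.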

**Theorem 2.**

**Theorem 3.**

**Proof.**

**Property 6.** The union and intersection information content are given by at least one of

## 6. Generalised Marginal Information Sharing

**Theorem 4.**

**Proof.**

## 7. Multivariate Information Decomposition

**Theorem 5.**

**Proof.**

## 8. Union and Intersection Mutual Information

## 9. Conclusions

## Appendix A

**Lemma A1.**

**Proof.**

**Lemma A2.**

**Proof.**

**Lemma A3.**

**Proof.**

**Lemma A4.**

**Proof.**

**Lemma A5.**

**Proof.**

**Lemma A6.**

**Proof.**

**Lemma A7.**

**Proof.**

**Lemma A8.**

**Proof.**

**Lemma A9.**

**Proof.**

## References

**Figure 1.** (**Top left**) When depicting a measure on the union of two sets $\mu(A \cup B)$, the area of each section can be used to represent the inequality (5), and hence the values $\mu(A \backslash B)$, $\mu(B \backslash A)$ and $\mu(A \cap B)$ correspond to the area of each section. This correspondence can be generalised to consider an arbitrary number of sets. (**Bottom left**) When depicting the joint entropy $H(X,Y)$, the area of each section can also be used to represent the inequality (1), and hence the values $H(X|Y)$, $H(Y|X)$ and $I(X;Y)$ correspond to the area of each section. However, this correspondence does not generalise beyond two variables. (**Right**) For example, when considering the entropy of three variables, the multivariate mutual information $I(X;Y;Z)$ cannot be accurately represented using an area since, as represented by the hatching, it is not non-negative.

**Figure 2.** (**Left**) Indiana assumes that Alice’s information $h(x)$ is independent of Bob’s information $h(y)$ such that her information is given by $h(x)+h(y)$. (**Middle**) Johnny knows the joint distribution $p_{XY}$, and hence his information is given by the joint information content $h(x,y)$. (**Right**) There is no inequality that requires Johnny’s information to be no greater than Indiana’s assumed information, or vice versa. On the one hand, Johnny can have more information than Indiana since a joint realisation can be more surprising than both of the individual marginal realisations. On the other hand, Indiana can have more information than Johnny since a joint realisation can be less surprising than both of the individual marginal realisations occurring independently. Thus, as represented by the hatching, the mutual information content $i(x;y)$ is not non-negative.
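The comparison between Indiana and Johnny can be made concrete with a small numerical sketch. It assumes the standard definitions $h(x) = -\log_2 p(x)$ and $i(x;y) = h(x) + h(y) - h(x,y)$; the joint distribution and all function names are hypothetical and serve only as an illustration.

```python
import math

# Hypothetical joint distribution p_XY over two binary variables.
p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def marginal(p_joint, axis):
    """Sum the joint distribution over the other variable."""
    out = {}
    for outcome, p in p_joint.items():
        out[outcome[axis]] = out.get(outcome[axis], 0.0) + p
    return out

def info(p):
    """Information content (surprisal) in bits."""
    return -math.log2(p)

p_x = marginal(p_xy, 0)
p_y = marginal(p_xy, 1)

def mutual_information_content(x, y):
    indiana = info(p_x[x]) + info(p_y[y])  # assumes independence
    johnny = info(p_xy[(x, y)])            # uses the joint distribution
    return indiana - johnny

for (x, y) in sorted(p_xy):
    print((x, y), round(mutual_information_content(x, y), 3))
```

For this distribution the realisation $(0,0)$ is less surprising to Johnny than to Indiana, so $i(x;y)$ is positive, whereas $(0,1)$ is more surprising to Johnny, so $i(x;y)$ is negative.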

**Figure 3.** (**Left**) If Alice’s information $h(x)$ is greater than Bob’s information $h(y)$, then Eve’s information $h(x \sqcup y)$ is equal to Alice’s information $h(x)$. In effect, Eve is pessimistically assuming that information provided by the least surprising marginal realisation $h(x \sqcap y)$ is already provided by the most surprising marginal realisation $h(x \sqcup y)$, i.e., Bob’s information $h(y)$ is a subset of Alice’s information $h(x)$. From this perspective, Eve gets unique information from Alice relative to Bob $h(x \backslash y)$, but does not get any unique information from Bob relative to Alice $h(y \backslash x)=0$. (**Right**) Although for each realisation Eve can only get unique information from either Alice or Bob, it is possible that Eve can expect to get unique information from both Alice and Bob on average. (Do not confuse this representation of the union entropy with the diagram that represents the joint entropy in Figure 1.)
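A minimal sketch of Eve’s perspective follows. It assumes that the union and intersection information contents are the maximum and minimum of the marginal information contents, as the left panel describes, and that the unique information content is the union information content minus the other observer’s information content, so that $h(x \backslash y) = h(x \sqcup y) - h(y)$. The joint distribution and the function names are hypothetical.

```python
import math

# Hypothetical joint distribution over two binary variables.
p_xy = {(0, 0): 0.6, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.2}

def marginal(p_joint, axis):
    out = {}
    for outcome, p in p_joint.items():
        out[outcome[axis]] = out.get(outcome[axis], 0.0) + p
    return out

def info(p):
    return -math.log2(p)

p_x = marginal(p_xy, 0)
p_y = marginal(p_xy, 1)

def eve_decomposition(x, y):
    """Return (union, intersection, unique from Alice, unique from Bob)."""
    hx, hy = info(p_x[x]), info(p_y[y])
    union = max(hx, hy)          # assumed h(x ⊔ y)
    intersection = min(hx, hy)   # assumed h(x ⊓ y)
    unique_alice = union - hy    # assumed h(x \ y)
    unique_bob = union - hx      # assumed h(y \ x)
    return union, intersection, unique_alice, unique_bob

# Averaging over realisations gives the expected unique information.
expected_unique_alice = sum(p * eve_decomposition(x, y)[2] for (x, y), p in p_xy.items())
expected_unique_bob = sum(p * eve_decomposition(x, y)[3] for (x, y), p in p_xy.items())
```

For every realisation at least one of the two unique information contents is zero, yet both expected values are positive for this distribution, which is the situation depicted in the right panel.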

**Figure 4.** (**Left**) This Venn diagram shows how the synergistic information $h(x \oplus y)$ can be defined by comparing the joint information content $h(x,y)$ from Figure 2 to the union information content $h(x \sqcup y)$ from Figure 3. Note that, for this particular realisation, we are assuming that $h(x) > h(y)$. It also provides a visual representation of the decomposition (40) of the joint information content $h(x,y)$. (**Right**) By rearranging the marginal entropies such that they match Figure 2 (albeit with different sizes here), it is easy to see why the mutual information content $i(x;y)$ is equal to the intersection information content $h(x \sqcap y)$ minus the synergistic information content $h(x \oplus y)$, cf. (38).
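The identity described in the right panel can be sketched in a few lines. The derivation assumes, consistent with Figure 3, that the union and intersection information contents are the larger and smaller of the two marginal information contents, so that together they sum to $h(x)+h(y)$, and that the synergistic information content is the joint information content minus the union information content. The source's own equation numbers are not reproduced here.

$$
\begin{aligned}
i(x;y) &= h(x) + h(y) - h(x,y) \\
       &= h(x \sqcup y) + h(x \sqcap y) - h(x,y) \\
       &= h(x \sqcap y) - \bigl( h(x,y) - h(x \sqcup y) \bigr) \\
       &= h(x \sqcap y) - h(x \oplus y).
\end{aligned}
$$

Since $p(x,y) \le \min\bigl(p(x), p(y)\bigr)$, the joint information content is never smaller than the union information content, so under these assumptions $h(x \oplus y) \ge 0$ and the sign of $i(x;y)$ depends only on whether the intersection exceeds the synergy.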

**Figure 6.** (**Bottom right**) The distributive lattices $\langle \mathit{x}, h(\,\sqcup\,), h(\,\sqcap\,) \rangle$ of information contents for two and three observers. It is also important to note that, by replacing $h$, $x$, $y$ and $z$ with $H$, $X$, $Y$ and $Z$, respectively, we can obtain the distributive lattices for entropy. In fact, this is crucial since Property 6 enables us to reduce the distributive lattice of information contents to a mere total order; however, this property does not apply to the entropies, and hence we cannot further simplify the lattice of entropies. (**Top left**) By the fundamental theorem of distributive lattices, the distributive lattices of marginal information contents have a one-to-one correspondence with the lattice of sets. Notice that the lattice for two sets corresponds to the Venn diagram for entropies in Figure 3.

**Figure 7.** (**Left**) The total order of marginal information contents for two and three observers, whereby we have assumed that Alice’s information $h(x)$ is greater than Bob’s information $h(y)$, which is greater than Charlie’s information $h(z)$. It is important to note that taking the expectation value of these information contents over all realisations, each of which may have a different total order, yields entropies that are merely partially ordered. It is for this reason that Property 6 does not apply to entropies. (**Right**) The Venn diagrams corresponding to the total order for two and three observers and their corresponding information contents. Notice that the total order for two sets corresponds to the Venn diagram for information contents in Figure 3.
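The loss of the total order under expectation can be illustrated numerically. The sketch below assumes that the union information content of a realisation is the maximum of the marginal information contents and that the union entropy is its expectation over the joint distribution. The distribution and the names are hypothetical.

```python
import math

# Hypothetical joint distribution over two binary variables.
p_xy = {(0, 0): 0.6, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.2}

def marginal(p_joint, axis):
    out = {}
    for outcome, p in p_joint.items():
        out[outcome[axis]] = out.get(outcome[axis], 0.0) + p
    return out

def info(p):
    return -math.log2(p)

def entropy(p_marginal):
    return sum(p * info(p) for p in p_marginal.values())

p_x = marginal(p_xy, 0)
p_y = marginal(p_xy, 1)

# Per realisation, the union equals one of the two marginal information
# contents, so the information contents are totally ordered.
pointwise_unions = {
    (x, y): max(info(p_x[x]), info(p_y[y])) for (x, y) in p_xy
}

# The expectation mixes realisations with different orderings.
union_entropy = sum(p_xy[xy] * pointwise_unions[xy] for xy in p_xy)
entropy_x = entropy(p_x)
entropy_y = entropy(p_y)
```

Here `union_entropy` is strictly larger than both `entropy_x` and `entropy_y`, so it is not given by either marginal entropy, even though each pointwise union is given by one of the marginal information contents.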

**Figure 8.** Similar to Figure 4, this Venn diagram shows how the synergistic information $h(x \oplus y \oplus z)$ can be defined by comparing the joint information content $h(x,y,z)$ to the union information content $h(x \sqcup y \sqcup z)$. Note that, for this particular realisation, we are assuming that $h(x) > h(y) > h(z)$.

**Figure 9.** (**Top-middle and left**) The join semi-lattice $\langle \mathit{h}\,; (\,,) \rangle$ for $n=2$ and $n=3$ marginal observers. Johnny’s information is always given by the joint information content at the top of the semi-lattice, while the information content of individuals such as Alice, Bob and Charlie, who observe single realisations, is found at the bottom of the semi-lattice. The information content of joint marginal observers such as Joanna, Jonas and Joan is found in between these two extremities. (**Bottom-middle and right**) The meet semi-lattice $\langle \mathit{h}\,; \sqcap \rangle$ for $n=2$ and $n=3$ marginal observers. Since these two semi-lattices are not connected by absorption, their combined structure is not a lattice.

**Figure 10.** (**Top left**) The redundancy lattices $\langle \mathcal{A}(\mathit{x}), \preceq \rangle$ of information contents for two and three observers. Each node in the lattice corresponds to an element in $\mathcal{A}(\mathit{x})$ from (69), while the ordering between elements is given by $\preceq$ from (70). (**Bottom right**) The partial information contents $h_{\partial}(\alpha)$ corresponding to the redundancy lattices of information contents for two and three observers.
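The nodes of a redundancy lattice can be enumerated directly. The sketch below assumes that $\mathcal{A}(\mathit{x})$ is the usual set of antichains from partial information decomposition, that is, non-empty collections of non-empty subsets of the observers in which no subset contains another, and that $\alpha \preceq \beta$ holds when every element of $\beta$ contains some element of $\alpha$. Definitions (69) and (70) are not reproduced in this text, so both are assumptions, and the function names are hypothetical.

```python
from itertools import combinations

def nonempty_subsets(sources):
    """All non-empty subsets of the given sources, as frozensets."""
    items = list(sources)
    return [
        frozenset(c)
        for r in range(1, len(items) + 1)
        for c in combinations(items, r)
    ]

def is_antichain(collection):
    """True if no member of the collection is a proper subset of another."""
    return all(not (a < b or b < a) for a, b in combinations(collection, 2))

def redundancy_lattice_nodes(sources):
    """Enumerate the assumed node set A(x) for the given sources."""
    subsets = nonempty_subsets(sources)
    return [
        frozenset(collection)
        for r in range(1, len(subsets) + 1)
        for collection in combinations(subsets, r)
        if is_antichain(collection)
    ]

def precedes(alpha, beta):
    """Assumed ordering: every B in beta contains some A in alpha."""
    return all(any(a <= b for a in alpha) for b in beta)

nodes_two = redundancy_lattice_nodes("xy")
nodes_three = redundancy_lattice_nodes("xyz")
print(len(nodes_two), len(nodes_three))
```

Under these assumptions the enumeration yields 4 nodes for two observers and 18 nodes for three observers, with the collection $\{\{x\},\{y\}\}$ at the bottom and $\{\{x,y\}\}$ at the top of the two-observer lattice.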

**Figure 11.** This Venn diagram provides a visual representation of the decomposition of the joint entropy $H(X,Y,Z)$. This decomposition is given by replacing $x$, $y$, $z$ and $h$ with $X$, $Y$, $Z$ and $H$ in (85), respectively.

Finn, C.; Lizier, J.T. Generalised Measures of Multivariate Information Content. *Entropy* **2020**, *22*, 216.
https://doi.org/10.3390/e22020216
