2. Shannon’s Equation Revisited
Shannon’s equation and a consolidated tradition have led many scientists and laymen to consider information as some kind of physical, additive quantity. The standard formula that all schoolboys learn is the familiar sum of products between probabilities and their logarithms [1]:

$$H = -\sum_{i=1}^{N} p_i \log_2 p_i \qquad (1)$$

where $p_i$ is the probability of the $i$-th of the $N$ possible outcomes of a random variable, $X$.
The aim of this paper is to show that the above function (1) can be rewritten in terms of standard probability theory, and that it thus does not entail any ontological commitment. Namely, it is possible to show that:

$$H = -\frac{1}{M} \log_2 \prod_{j=1}^{M} q_j \qquad (2)$$

where the set of probabilities, $\{q_j\}_{j=1}^{M}$, maps the probabilities, $p_i$, of the random variable, $X$, using a method that will be described below. It can be shown that Equations (1) and (2) are mathematically equivalent (in the sense of providing the same set of solutions). Since Equation (2) is a function of a product of probabilities, it follows that $H$ expresses, logarithmically, the probability of independent events, i.e., $P(AB) = P(A)P(B)$. Thus, by Ockham’s principle, if Equation (2) were valid, it would drain all ontological commitments from Equation (1). Information as a real quantity could be dismissed.
A few steps show how Equation (2) can be derived from Equation (1). First, Equation (1) can be rewritten as an exponent, and thus the sum can be reformulated as a product with exponents:

$$2^{-H} = \prod_{i=1}^{N} p_i^{\,p_i} \qquad (3)$$
Equation (3) begins to take some of the spell away. The next step consists of removing exponents and suggesting suitable independent events. Let $d$ be the greatest common factor among all the $p_i$ (assuming $p_i \in \mathbb{Q}$, $p_i > 0$; $d$ is the greatest rational such that $m_i = p_i/d \in \mathbb{N}$ for every $i$). By means of $d$, Equation (3) can then be rewritten as (Figure 1):

$$2^{-H} = \prod_{i=1}^{N} p_i^{\,m_i d} = \left(\prod_{i=1}^{N} p_i^{\,m_i}\right)^{d} \qquad (4)$$
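As a concrete illustration of this step, the following minimal Python sketch (with illustrative names, not part of the derivation itself) computes $d$ and the multiplicities $m_i$, assuming the probabilities are given as exact rationals:

```python
from fractions import Fraction
from math import gcd, lcm  # math.lcm requires Python 3.9+

def common_factor(probs):
    """Greatest rational d such that every p_i is an integer multiple of d."""
    den = lcm(*(p.denominator for p in probs))                      # common denominator
    num = gcd(*(p.numerator * den // p.denominator for p in probs))
    return Fraction(num, den)

probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
d = common_factor(probs)
m = [int(p / d) for p in probs]  # each m_i = p_i / d is a natural number
print(d, m)                      # -> 1/6 [3, 2, 1]
```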
whereas each factor, $p_i^{\,p_i}$, is rewritten as a product, $\left(p_i^{\,m_i}\right)^{d}$, where $m_i = p_i/d$. Of course, if $p_i = d$, then $m_i = 1$. Thus, each $p_i^{\,p_i}$ can be revisited by, and thus substituted with, a series of $m_i$ independent events, each having a probability, $p_i$. The set of probabilities, $\{q_j\}$, is built by repeating each $p_i$, $m_i$ times. Equation (3) can be rewritten as:

$$2^{-H} = \left(\prod_{j=1}^{M} q_j\right)^{d} \qquad (5)$$

and, taking the logarithm of both sides,

$$H = -d \log_2 \prod_{j=1}^{M} q_j \qquad (6)$$
Additionally, then, since $M = \sum_i m_i = \sum_i p_i/d = 1/d$, it is trivial to obtain Equation (2). The derivation from Equation (1) to Equation (2) might appear complicated, but it is not. For instance, if the elements are equiprobable symbols ($p_i = 1/N$, so that $d = 1/N$ and $M = N$), the familiar Hartley expression, $H = \log_2 N$, will follow [2]. An important caveat: $H$ does not correspond to a probability ($2^{-H}$ does not either, in general); $P = \prod_{j=1}^{M} q_j = 2^{-MH}$ does. In fact, $P$ is the probability of any sequence of $M$ events whose frequency matches exactly that of the outcomes of $X$ (Figure 2). In the future, it would be interesting to check whether this proposal might be extended to the continuous case, thereby providing an even more compelling argument.
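To make the claimed equivalence concrete, here is a minimal numerical sketch (again in Python, with illustrative function names) that computes $H$ both ways; it assumes the probabilities are exact rationals so that $d$ and $M$ are well defined:

```python
from fractions import Fraction
from math import gcd, lcm, log2, prod  # math.lcm / math.prod need Python 3.9+ / 3.8+

def shannon_eq1(probs):
    """Equation (1): H = -sum_i p_i * log2(p_i)."""
    return -sum(p * log2(p) for p in probs)

def shannon_eq2(probs):
    """Equation (2): H = -(1/M) * log2(prod_j q_j)."""
    den = lcm(*(p.denominator for p in probs))
    num = gcd(*(p.numerator * den // p.denominator for p in probs))
    d = Fraction(num, den)                              # greatest common factor
    qs = [p for p in probs for _ in range(int(p / d))]  # repeat each p_i, m_i times
    return -log2(prod(float(q) for q in qs)) / len(qs)  # len(qs) == M == 1/d
```

For any rational distribution, the two functions agree to within floating-point error.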
Consider a handful of numerical examples.
First, suppose having two outcomes with probabilities $p_1 = p_2 = 1/2$; then $d = 1/2$, since $1/2$ is their greatest common factor, i.e., $d$ is the rational number by which $p_i = m_i d$, with $m_1 = m_2 = 1$ and $M = 2$. In fact, $H = -\frac{1}{2}\log_2\left(\frac{1}{2}\cdot\frac{1}{2}\right) = 1$.
Second, suppose having three outcomes, $\{x_1, x_2, x_3\}$, with probabilities $\{1/2, 1/3, 1/6\}$, $N = 3$. Their greatest common factor is $d = 1/6$. By multiplying all elements according to the ratio $m_i = p_i/d$, we obtain a set, $\{q_j\} = \{1/2, 1/2, 1/2, 1/3, 1/3, 1/6\}$, with $M = 6$. Equation (1) obtains $H \approx 1.459$. Equation (2) obtains $H = -\frac{1}{6}\log_2\left(\frac{1}{2}\cdot\frac{1}{2}\cdot\frac{1}{2}\cdot\frac{1}{3}\cdot\frac{1}{3}\cdot\frac{1}{6}\right) = -\frac{1}{6}\log_2\frac{1}{432} \approx 1.459$.
Finally, suppose having two outcomes, $\{0, 1\}$, with probabilities $\{3/4, 1/4\}$, $N = 2$. Then, $d = 1/4$, $M = 4$, and $\{q_j\} = \{3/4, 3/4, 3/4, 1/4\}$. The product $P = \prod_j q_j = \left(\frac{3}{4}\right)^3\cdot\frac{1}{4} = \frac{27}{256} \approx 0.105$, and $H = -\frac{1}{4}\log_2\frac{27}{256} \approx 0.811$. The probability $P$ represents the probability of any sequence of $M$ outcomes of the kind 0001, 0100, and 1000 (or whatever matches X’s profile).
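Under the same rational-probability assumption, the three examples can be checked against the sketch given after the derivation:

```python
from fractions import Fraction

# Reusing shannon_eq1 and shannon_eq2 from the earlier sketch.
examples = ([Fraction(1, 2), Fraction(1, 2)],
            [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)],
            [Fraction(3, 4), Fraction(1, 4)])
for probs in examples:
    print(round(shannon_eq1(probs), 6), round(shannon_eq2(probs), 6))
# -> 1.0 1.0
# -> 1.459148 1.459148
# -> 0.811278 0.811278
```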
3. Discussion
Is information something real or not? In many fields, the notion that information is a physical quantity has led scientists to debatable conclusions about both its nature and role [3,4]. Many arguments have been advanced both in favour of and against the reality of information. It is fair to maintain that information cannot be directly measured, as can be done with other, less problematic physical quantities (say, mass and charge). It is impossible to tell how much information is contained in a physical system unless more knowledge about the relation between that physical system and another system is available. It is also disputable whether information has any real causal efficacy, or whether it is an epiphenomenal notion, as are centres of mass. The well-known argument in favour of information as a physical quantity has not reached any definitive conclusion [5]. In this context, much has been derived from Equation (1), which, as Shannon himself stated, was introduced mostly because “the logarithmic measure is more convenient […] it is practically more useful […] it is mathematically more suitable […] it is nearer to our intuitive feelings” [1]. Shannon hardly provides final arguments for the physical existence of information.
An interesting outcome of Equation (2) is that it is achieved by remapping events with different probabilities onto multiple equiprobable events. This suggests that, as happens with entropy, there are no intrinsically more probable configurations; rather, there are many indistinguishable configurations that are mapped onto convenient “more probable” states. Therefore, Equation (2) (and the steps from Equations (3) to (6)) is a way to unpack one’s limited knowledge or perspective about physical states and to represent them in a neutral way in which each state is equally probable.
This is not to say that the notion of information is not a very successful and useful one. Yet, the existence of an alternative formula that does not require any ontological commitment to information makes a strong case against its existence: the existence of information does not seem to make any difference. Equations (1) and (2) causally overdetermine what a system does, and, since Equation (1) is ontologically less parsimonious, by Ockham’s razor Equation (2) is to be preferred.
In this contribution, Equation (2) expresses $H$ in terms of the product of the probabilities of independent events. Therefore, if Equation (2) were used in place of Equation (1) (despite being more cumbersome), there would be no need to add any new entity: information is simply a convenient way to express the probability of different configurations of a physical system. Since Equation (2) is mathematically equivalent to Equation (1), and it requires nothing but probabilities, it follows that Equation (2) provides a mathematical proof that information does not exist except as a useful conceptual tool.