1. Introduction
The Kolmogorov complexity $K(x)$ measures the amount of information contained in an individual object (usually a string) $x$ by the size of the smallest program that generates it. It naturally characterizes a probability distribution over $\Sigma^*$ (the set of all finite binary strings), assigning a probability of $c \cdot 2^{-K(x)}$ to any string $x$, where $c$ is a constant. This probability distribution is called the universal probability distribution and is denoted by $\mathbf{m}$.
The Shannon entropy $H(X)$ of a random variable $X$ is a measure of its average uncertainty. It is the smallest number of bits required, on average, to describe $x$, the output of the random variable $X$.
Kolmogorov complexity and Shannon entropy are conceptually different, as the former is based on the length of programs and the latter on probability distributions. However, for any recursive probability distribution (i.e., a distribution that is computable by a Turing machine), the expected value of the Kolmogorov complexity equals the Shannon entropy, up to a constant term depending only on the distribution (see [1]).
Several information measures or entropies have been introduced since Shannon’s seminal paper [2]. We are interested in two different generalizations of Shannon entropy:
- Rényi entropy [3], an additive measure based on a specific form of mean of the elementary information gain: instead of using the arithmetic mean, Rényi used the Kolmogorov-Nagumo mean associated with the function $f(x) = c_1 b^{(1-\alpha)x} + c_2$, where $c_1$ and $c_2$ are constants, $b$ is a real greater than 1 and $\alpha$ is a non-negative parameter;
- Tsallis entropy [4], a non-additive measure, often called a non-extensive measure, in which the probabilities are scaled by a positive power $\alpha$ that may reinforce either the large (if $\alpha > 1$) or the small (if $\alpha < 1$) probabilities.
Let $R_\alpha(P)$ and $T_\alpha(P)$ denote, respectively, the Rényi and Tsallis entropies associated with the probability distribution $P$. Both are continuous functions of the parameter $\alpha$ and both are (quite different) generalizations of the Shannon entropy, in the sense that $R_1(P) = T_1(P) = H(P)$ (see [5]). It is well known (see [6]) that for any recursive probability distribution $P$ over $\Sigma^*$, the average value of $K(x)$ and the Shannon entropy $H(P)$ are close, in the sense that
$$0 \le \sum_x P(x)K(x) - H(P) \le K(P) \qquad (1)$$
where $K(P)$ is the length of the shortest program that describes the distribution $P$. We study whether this property also holds for Rényi and Tsallis entropies. The answer is no. If we replace $H$ by $R$ or by $T$, the inequalities (1) are no longer true (unless $\alpha = 1$). We also analyze the validity of the relationship (1) when Kolmogorov complexity is replaced by its time-bounded version, proving that it holds for distributions whose cumulative probability distribution is computable in an allotted time. So, for these distributions, Shannon entropy equals the expected value of the time-bounded Kolmogorov complexity. We also study the convergence of Tsallis and Rényi entropies of the universal time-bounded distribution $\mathbf{m}^t$, proving that both entropies converge if and only if $\alpha > 1$. Finally, we prove the uniform continuity of Tsallis, Rényi and Shannon entropies.
The rest of this paper is organized as follows. In the next section, we present the notions and results that will be used. In Section 3, we study whether the inequalities (1) can be generalized to Rényi and Tsallis entropies and we also establish a similar relationship for the time-bounded Kolmogorov complexity. In Section 4, we analyze the entropies of the universal time-bounded distribution. In Section 5, we establish the uniform continuity of the entropies, a mathematical result that may be useful in further research.
  2. Preliminaries
$\Sigma^* = \{0,1\}^*$ is the set of all finite binary strings. The empty string is denoted by ϵ. $\Sigma^n$ is the set of strings of length $n$ and $|x|$ denotes the length of the string $x$. Strings are lexicographically ordered. The logarithm of $x$ in base 2 is denoted by $\log(x)$. The real interval between $a$ and $b$, including $a$ and excluding $b$, is represented by $[a, b)$. A sequence of real numbers $r_n$ is denoted by $(r_n)_{n \in \mathbb{N}}$.
  2.1. Kolmogorov Complexity
Kolmogorov complexity was introduced independently in the 1960s by Solomonoff [7], Kolmogorov [8], and Chaitin [9]. Only essential definitions and basic results are given here; for further details see [1]. The model of computation used is the prefix-free Turing machine, i.e., a Turing machine with a prefix-free domain. A set of strings $A$ is prefix-free if no string in $A$ is a prefix of another string of $A$. Kraft’s inequality guarantees that for any prefix-free set $A$, $\sum_{x \in A} 2^{-|x|} \le 1$. In this work all resource bounds $t$ considered are time constructible, i.e., there is a Turing machine whose running time is exactly $t(n)$ on every input of size $n$.
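As a small illustration of the prefix-free condition and of Kraft's inequality, the following Python sketch checks both for finite sets of binary strings. The helper names `is_prefix_free` and `kraft_sum` and the sample sets `A` and `B` are assumptions introduced only for this example.

```python
from fractions import Fraction


def is_prefix_free(codewords):
    """Return True if no string in the set is a proper prefix of another."""
    words = sorted(set(codewords))
    # In lexicographic order, a string that is a prefix of some other
    # string is always a prefix of its immediate successor as well.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


def kraft_sum(codewords):
    """Exact value of the sum of 2^(-|x|) over the strings x in the set."""
    return sum(Fraction(1, 2 ** len(w)) for w in set(codewords))


A = ["0", "10", "110", "111"]  # prefix-free
B = ["0", "01", "1"]           # "0" is a prefix of "01"

print(is_prefix_free(A), kraft_sum(A))
print(is_prefix_free(B), kraft_sum(B))
```

For the prefix-free set `A` the Kraft sum is at most 1, as the inequality requires. The set `B` has a Kraft sum above 1, so no prefix-free set can have the same multiset of string lengths.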
Definition 2.1. Kolmogorov complexity. Let $U$ be a fixed prefix-free universal Turing machine. For any two strings $x, y \in \Sigma^*$, the Kolmogorov complexity of $x$ given $y$ is $K(x|y) = \min_p \{|p| : U(p, y) = x\}$, where $U(p, y)$ is the output of the program $p$ with auxiliary input $y$ when it is run in the machine $U$. For any time constructible $t$, the $t$-time-bounded Kolmogorov complexity of $x$ given $y$ is $K^t(x|y) = \min_p \{|p| : U(p, y) = x \text{ in at most } t(|x|) \text{ steps}\}$.
The default value for $y$, the auxiliary input, is the empty string ϵ; for simplicity, we denote $K(x|\epsilon)$ and $K^t(x|\epsilon)$ by $K(x)$ and $K^t(x)$, respectively. The choice of the universal Turing machine affects the running time of a program by, at most, a logarithmic factor and the program length by, at most, a constant number of extra bits.
Definition 2.2. Let $c$ be a non-negative integer. We say that $x \in \Sigma^n$ is c-Kolmogorov random if $K(x) \ge n - c$.
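Proposition 2.3. For all strings $x, y \in \Sigma^*$, we have:
- $K(x) \le K^t(x) \le |x| + O(1)$;
- $K(x|y) \le K(x) + O(1)$ and $K^t(x|y) \le K^t(x) + O(1)$;
- There are at least $2^n(1 - 2^{-c})$ c-Kolmogorov random strings of length $n$.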
As Solovay [10] observed, for infinitely many $x$, the time-bounded version of Kolmogorov complexity equals the unbounded version up to an additive constant. Formally, we have:
Theorem 2.4. (Solovay [10]) For all time-bounds $t(n) \ge n + O(1)$ we have $K^t(x) = K(x) + O(1)$ for infinitely many $x$. As a consequence of this result, there is a string $x$ of arbitrarily large Kolmogorov complexity such that $K^t(x) = K(x) + O(1)$.
Definition 2.5. A semi-measure over a discrete set $X$ is a function $f : X \to [0, 1]$ such that $\sum_{x \in X} f(x) \le 1$. We say that a semi-measure is a measure if the equality holds. A semi-measure is constructive if it is semi-computable from below, i.e., for each $x$, there is a Turing machine that produces a monotone increasing sequence of rationals converging to $f(x)$.
An important constructive semi-measure based on Kolmogorov complexity is defined by $\mathbf{m}(x) = 2^{-K(x)}$. This semi-measure dominates any other constructive semi-measure $\mu$ (see [11,12]), in the sense that there is a constant $c_\mu = 2^{-K(\mu)}$ such that, for all $x$, $\mathbf{m}(x) \ge c_\mu\,\mu(x)$. For this reason, this semi-measure is called universal. Since it is natural to consider time-bounds on Kolmogorov complexity, we can define $\mathbf{m}^t(x)$, a time-bounded version of $\mathbf{m}(x)$.
Definition 2.6. We say that a function $f$ is computable in time $t$ if there is a Turing machine that on the input $x$ computes the output $f(x)$ in exactly $t(|x|)$ steps.
Definition 2.7. The $t$-time-bounded universal distribution is $\mathbf{m}^t(x) = c \cdot 2^{-K^t(x)}$, where $c$ is a real number such that $\sum_{x \in \Sigma^*} \mathbf{m}^t(x) = 1$.
In [1], the authors proved that $\mathbf{m}^{t'}$ dominates distributions computable in time $t$, where $t'$ is a time-bound that only depends on $t$. Formally:
Theorem 2.8. (Claim 7.6.1 in [1]) If $\mu^*$, the cumulative probability distribution of $\mu$, is computable in time $t(n)$ then for all $x \in \Sigma^*$, $2^{K^{t'}(\mu)}\,\mathbf{m}^{t'}(x) \ge \mu(x)$, where $t'(n) = n\,t(n)\log(n\,t(n))$.
  2.2. Entropies
Information Theory was introduced in 1948 by C.E. Shannon [2]. Shannon entropy quantifies the uncertainty of the results of an experiment; it quantifies the average number of bits necessary to describe the outcome of an experiment.
Definition 2.9. (Shannon entropy [2]) Let $\mathcal{X}$ be a finite or countably infinite set and let $X$ be a random variable taking values in $\mathcal{X}$ with distribution $P$. The Shannon entropy of the random variable $X$ is
$$H(X) = -\sum_{x \in \mathcal{X}} P(x) \log P(x).$$
Definition 2.10. (Rényi entropy [3]) Let $\mathcal{X}$ be a finite or countably infinite set and let $X$ be a random variable taking values in $\mathcal{X}$ with distribution $P$ and let $\alpha \ne 1$ be a non-negative real number. The Rényi entropy of order $\alpha$ of the random variable $X$ is
$$R_\alpha(X) = \frac{1}{1-\alpha} \log \left( \sum_{x \in \mathcal{X}} P(x)^\alpha \right).$$
Definition 2.11. (Tsallis entropy [4]) Let $\mathcal{X}$ be a finite or countably infinite set and let $X$ be a random variable taking values in $\mathcal{X}$ with distribution $P$ and let $\alpha \ne 1$ be a non-negative real number. The Tsallis entropy of order $\alpha$ of the random variable $X$ is
$$T_\alpha(X) = \frac{1 - \sum_{x \in \mathcal{X}} P(x)^\alpha}{\alpha - 1}.$$
It is easy to prove that
$$\lim_{\alpha \to 1} R_\alpha(X) = \lim_{\alpha \to 1} T_\alpha(X) = H(X). \qquad (2)$$
Given the conceptual differences in the definitions of Kolmogorov complexity and Shannon entropy, it is interesting that under some weak restrictions on the distribution of the strings, they are related, in the sense that the value of Shannon entropy equals the expected value of Kolmogorov complexity, up to a constant term that only depends on the distribution.
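For a finite distribution the three entropies can be evaluated numerically. The sketch below assumes the standard forms $H(P) = -\sum_x P(x)\log P(x)$, $R_\alpha(P) = \frac{1}{1-\alpha}\log\sum_x P(x)^\alpha$ and $T_\alpha(P) = \frac{1 - \sum_x P(x)^\alpha}{\alpha - 1}$; the function names, the `base` parameter and the sample distribution `P` are choices made only for this illustration. It also shows the behaviour near $\alpha = 1$: the Rényi value approaches the Shannon entropy in the same logarithm base, while the Tsallis form used here (which contains no logarithm) approaches the Shannon entropy measured with the natural logarithm.

```python
import math


def shannon(p, base=2.0):
    """Shannon entropy of a finite distribution, in the given log base."""
    return -sum(q * math.log(q, base) for q in p if q > 0)


def renyi(p, alpha, base=2.0):
    """Rényi entropy of order alpha (alpha >= 0, alpha != 1)."""
    if alpha == 1:
        raise ValueError("alpha = 1 is the Shannon case; use shannon()")
    return math.log(sum(q ** alpha for q in p if q > 0), base) / (1.0 - alpha)


def tsallis(p, alpha):
    """Tsallis entropy of order alpha (alpha >= 0, alpha != 1)."""
    if alpha == 1:
        raise ValueError("alpha = 1 is the Shannon case; use shannon()")
    return (1.0 - sum(q ** alpha for q in p if q > 0)) / (alpha - 1.0)


P = [0.5, 0.25, 0.125, 0.125]

print("H (bits):", shannon(P))
print("H (nats):", shannon(P, math.e))
for alpha in (0.5, 0.999999, 1.000001, 2.0):
    print(alpha, renyi(P, alpha), tsallis(P, alpha))
```

Zero-probability outcomes are skipped in every sum, which matches the usual convention $0 \log 0 = 0$ and avoids evaluating $0^0$ when $\alpha = 0$.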
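Theorem 2.12. Let $P(x)$ be a recursive probability distribution such that $H(P) < \infty$. Then,
$$0 \le \sum_x P(x)K(x) - H(P) \le K(P).$$
Proof. (Sketch, see [1] for details). The first inequality follows directly from the Noiseless Coding Theorem, which, for these distributions, states that $H(P) \le \sum_x P(x)K(x)$. Since $\mathbf{m}$ is universal, $\mathbf{m}(x) \ge 2^{-K(P)}P(x)$ for all $x$, which is equivalent to $\log\frac{1}{P(x)} \ge K(x) - K(P)$. Thus, we have:
$$\sum_x P(x)K(x) - H(P) = \sum_x P(x)\left(K(x) + \log P(x)\right) \le \sum_x P(x)K(P) = K(P).$$
☐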
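Kolmogorov complexity is not computable, so the quantity $\sum_x P(x)K(x)$ cannot be evaluated directly. As a computable stand-in for the lower bound only, the following sketch uses Shannon code lengths $\ell(x) = \lceil -\log P(x) \rceil$, which satisfy Kraft's inequality and therefore correspond to a prefix-free code. By the Noiseless Coding Theorem the expected length of any such code is at least $H(P)$, and for these particular lengths it is below $H(P) + 1$. The function names and the distribution `P` are assumptions made for the example; the code lengths are not Kolmogorov complexities.

```python
import math


def shannon_entropy(p):
    """Shannon entropy in bits of a finite distribution with positive entries."""
    return -sum(q * math.log2(q) for q in p)


def shannon_code_lengths(p):
    """Code lengths ceil(-log2 P(x)); these always satisfy Kraft's inequality."""
    return [math.ceil(-math.log2(q)) for q in p]


def expected_length(p, lengths):
    """Average code length under the distribution p."""
    return sum(q * l for q, l in zip(p, lengths))


P = [0.4, 0.3, 0.2, 0.1]
lengths = shannon_code_lengths(P)
kraft = sum(2.0 ** -l for l in lengths)
H = shannon_entropy(P)
L = expected_length(P, lengths)

print("lengths:", lengths)
print("Kraft sum:", kraft)
print("H(P) =", H, " expected length =", L)
```

The gap between the expected length and $H(P)$ stays below one bit here, whereas in Theorem 2.12 the corresponding gap for $K$ is bounded by $K(P)$.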
   3. Kolmogorov Complexity and Entropy: How Close?
Since Rényi and Tsallis entropies are generalizations of Shannon entropy, we now study whether Theorem 2.12 can be generalized to these entropies. Then, we prove that for distributions such that the cumulative probability distribution is computable in time $t(n)$, Shannon entropy equals, up to a constant, the expected value of the time-bounded Kolmogorov complexity.
First, we observe that the interval $[0, K(P)]$ of the inequalities of Theorem 2.12 is tight up to a constant term that only depends on the universal Turing machine chosen as reference. A similar study has been included in [1] and in [13]. The following examples illustrate the tightness of this interval. We present a probability distribution that satisfies
$$\sum_x P(x)K(x) - H(P) = K(P) - O(1), \quad \text{with } K(P) \approx n,$$
and a probability distribution that satisfies
$$\sum_x P(x)K(x) - H(P) = O(1), \quad \text{with } K(P) \approx n.$$
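Example 3.1. Fix $x_0 \in \Sigma^n$. Consider the distribution concentrated in $x_0$, i.e.,
$$P_n(x) = \begin{cases} 1 & \text{if } x = x_0 \\ 0 & \text{otherwise.} \end{cases}$$
Notice that describing this distribution is equivalent to describing $x_0$. So, $K(P_n) = K(x_0) + O(1)$. On the other hand, $\sum_x P_n(x)K(x) - H(P_n) = K(x_0)$. Thus, if $x_0$ is c-Kolmogorov random, i.e., $K(x_0) \ge n - c$, then $K(P_n) \approx n$ and $\sum_x P_n(x)K(x) - H(P_n) \approx n$.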
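Example 3.2. Let $y$ be a string of length $n$ that is c-Kolmogorov random, i.e., $K(y) \ge n - c$, and consider the following probability distribution over $\Sigma^*$:
$$P_n(x) = \begin{cases} 0.y & \text{if } x = x_0 \\ 1 - 0.y & \text{if } x = x_1 \\ 0 & \text{otherwise} \end{cases}$$
where $0.y$ represents the real number between 0 and 1 whose binary representation is $y$. Notice that we can choose $x_0$ and $x_1$ such that $K(x_0) = K(x_1) \le \lambda$, where $\lambda$ is a constant greater than 1 and hence does not depend on $n$. Thus,
- $K(P_n) \approx n$, since describing $P_n$ is essentially equivalent to describing $x_0$, $x_1$ and $y$;
- $\sum_x P_n(x)K(x) = (0.y)K(x_0) + (1 - 0.y)K(x_1) \le \lambda$;
- $H(P_n) = -0.y \log 0.y - (1 - 0.y)\log(1 - 0.y) \le 1$.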
Now we address the question of whether an analogue of Theorem 2.12 holds for Rényi and Tsallis entropies. We show that the Shannon entropy is the only entropy that simultaneously satisfies both inequalities of Theorem 2.12 and thus is the only one suitable to deal with information.
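For every $\varepsilon$ with $0 < \varepsilon < 1$ and any probability distribution $P$ with finite support (see [5]), we have:
$$R_{1+\varepsilon}(P) \le H(P) \le R_{1-\varepsilon}(P).$$
Thus,
- For $\alpha \ge 1$, $0 \le \sum_x P(x)K(x) - R_\alpha(P)$;
- For $\alpha \le 1$, $\sum_x P(x)K(x) - R_\alpha(P) \le K(P)$.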
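It is known that, for a given probability distribution with finite support and for each $\alpha$, the Rényi and Tsallis entropies are monotonically increasing functions of each other (see [14]). Thus, for every $\varepsilon$ with $0 < \varepsilon < 1$ and every probability distribution $P$ with finite support, we also have a similar relationship for the Tsallis entropy, i.e.,
$$T_{1+\varepsilon}(P) \le H(P) \le T_{1-\varepsilon}(P).$$
Hence, it follows that:
- For $\alpha \ge 1$, $0 \le \sum_x P(x)K(x) - T_\alpha(P)$;
- For $\alpha \le 1$, $\sum_x P(x)K(x) - T_\alpha(P) \le K(P)$.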
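The ordering of the entropies around $\alpha = 1$ can be checked numerically on a finite distribution. The sketch below assumes the standard definitions of the three entropies; the function names, the tolerance `tol` and the sample distribution are illustrative choices. One assumption deserves emphasis: the Rényi values are compared with the Shannon entropy in bits, while the Tsallis values are compared with the Shannon entropy in nats, because the Tsallis form $\frac{1 - \sum_x P(x)^\alpha}{\alpha - 1}$ tends to the natural-logarithm entropy as $\alpha \to 1$.

```python
import math


def shannon(p, base=2.0):
    """Shannon entropy of a finite distribution, in the given log base."""
    return -sum(q * math.log(q, base) for q in p if q > 0)


def renyi(p, alpha, base=2.0):
    """Rényi entropy of order alpha (alpha != 1)."""
    return math.log(sum(q ** alpha for q in p if q > 0), base) / (1.0 - alpha)


def tsallis(p, alpha):
    """Tsallis entropy of order alpha (alpha != 1)."""
    return (1.0 - sum(q ** alpha for q in p if q > 0)) / (alpha - 1.0)


def sandwich_holds(p, eps, tol=1e-12):
    """Check entropy(1 + eps) <= Shannon <= entropy(1 - eps) for both families."""
    h_bits = shannon(p)           # reference value for Rényi (base 2)
    h_nats = shannon(p, math.e)   # reference value for Tsallis (assumption: nats)
    renyi_ok = (renyi(p, 1 + eps) <= h_bits + tol
                and h_bits <= renyi(p, 1 - eps) + tol)
    tsallis_ok = (tsallis(p, 1 + eps) <= h_nats + tol
                  and h_nats <= tsallis(p, 1 - eps) + tol)
    return renyi_ok and tsallis_ok


P = [0.5, 0.25, 0.125, 0.125]
for eps in (0.1, 0.5, 0.9):
    print(eps, sandwich_holds(P, eps))
```

Both families are non-increasing in $\alpha$, which is what places the Shannon value between the orders $1 + \varepsilon$ and $1 - \varepsilon$.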
In the next result we show that the inequalities above are, in general, false for different values of α.
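Proposition 3.3. There are recursive probability distributions $P$ such that:
- $\sum_x P(x)K(x) - R_\alpha(P) > K(P)$, where $\alpha > 1$;
- $\sum_x P(x)K(x) - R_\alpha(P) < 0$, where $\alpha < 1$;
- $\sum_x P(x)K(x) - T_\alpha(P) > K(P)$, where $\alpha > 1$;
- $\sum_x P(x)K(x) - T_\alpha(P) < 0$, where $\alpha < 1$.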
 Proof. For 
, consider the following probability distribution:
 It is clear that this distribution is recursive. We use this distribution for some specific 
n to prove all items. 
We generalize this result, obtaining the following theorem, whose proof is similar to that of the previous proposition.
Theorem 3.4. For every  and  there are recursive probability distributions P such that, - ; 
- ; 
- ; 
- . 
 The previous results show that only the Shannon entropy satisfies the inequalities of Theorem 2.12. Now, we analyze if a similar relationship holds in a time-bounded Kolmogorov complexity scenario.
If, instead of considering $K(P)$ and $K(x)$ in the inequalities of Theorem 2.12, we use their time-bounded versions and impose some computational restrictions on the distributions, we obtain a similar result. Notice that, for the class of distributions mentioned in the following theorem, the entropy equals (up to a constant) the expected value of time-bounded Kolmogorov complexity.
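Theorem 3.5. Let $P$ be a probability distribution over $\Sigma^n$ such that $P^*$, the cumulative probability distribution of $P$, is computable in time $t(n)$. Setting $t'(n) = O(n\,t(n)\log(n\,t(n)))$, we have,
$$0 \le \sum_x P(x)K^{t'}(x) - H(P) \le K^{t'}(P).$$
Proof. The first inequality follows directly from the similar inequality of Theorem 2.12 and from the fact that $K^{t'}(x) \ge K(x)$.
From Theorem 2.8, if $P$ is a probability distribution such that $P^*$ is computable in time $t(n)$, then for all $x \in \Sigma^n$ and $t'(n) = O(n\,t(n)\log(n\,t(n)))$, $K^{t'}(x) + \log P(x) \le K^{t'}(P)$. Then, multiplying by $P(x)$ and summing over all $x$, we get:
$$\sum_x P(x)\left(K^{t'}(x) + \log P(x)\right) \le \sum_x P(x)K^{t'}(P) = K^{t'}(P),$$
which is equivalent to $\sum_x P(x)K^{t'}(x) - H(P) \le K^{t'}(P)$. ☐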
  4. On the Entropies of the Time-Bounded Universal Distribution
If the time that a program can use to produce a string is bounded, we get the so-called time-bounded universal distribution, $\mathbf{m}^t(x) = c \cdot 2^{-K^t(x)}$. In this section, we study the convergence of the three entropies with respect to this distribution.
Theorem 4.1. The Shannon entropy of the distribution $\mathbf{m}^t$ diverges.
Proof. If 
 then 
 is a decreasing function. Let 
A be the set of strings such that 
; this set is recursively enumerable. 
 So, if we prove that 
 diverges, the result follows. Assume, by contradiction, that 
 for some 
. Then, considering 
 we conclude that 
r is a semi-measure. Thus, there is a constant 
 such that, for all 
x, 
. Hence, for 
, we have 
 So, , which is a contradiction since by Theorem 2.4, A contains infinitely many strings x of time-bounded Kolmogorov complexity larger than a constant such that . The contradiction follows from the assumption that the sum  converges. So,  diverges.
Now we show that, similarly to the behavior of Rényi and Tsallis entropies of universal distribution 
 (see [
15]), we have 
 if 
 and 
 if 
. First observe that:
- If , ; 
- If , . 
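Theorem 4.2. The Tsallis entropy of order $\alpha$ of the time-bounded universal distribution $\mathbf{m}^t$ converges iff $\alpha > 1$.
Proof. From Theorem 8 of [15], we have that $\sum_x (\mathbf{m}(x))^\alpha$ converges if $\alpha > 1$. Since $\mathbf{m}^t$ is a probability distribution, there is a constant $\lambda$ such that, for all $x$, $\mathbf{m}^t(x) \le \lambda\,\mathbf{m}(x)$. So, $(\mathbf{m}^t(x))^\alpha \le (\lambda\,\mathbf{m}(x))^\alpha$, which implies that
$$\sum_x (\mathbf{m}^t(x))^\alpha \le \lambda^\alpha \sum_x (\mathbf{m}(x))^\alpha,$$
from which we conclude that for $\alpha > 1$, $T_\alpha(\mathbf{m}^t)$ converges.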
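For $\alpha < 1$, the proof is analogous to the proof of Theorem 4.1. Suppose that $\sum_x (\mathbf{m}^t(x))^\alpha < d$ for some $d \in \mathbb{R}$. Hence, $r(x) = \frac{1}{d}(\mathbf{m}^t(x))^\alpha$ is a constructive semi-measure. Then, there is a constant $\tau$ such that for all $x \in \Sigma^*$,
$$r(x) = \frac{1}{d}\left(c \cdot 2^{-K^t(x)}\right)^\alpha \le \tau\,2^{-K(x)},$$
which is equivalent to
$$\frac{c^\alpha}{d\,\tau} \le 2^{\alpha K^t(x) - K(x)}.$$
By Theorem 2.4, there are infinitely many strings $x$ such that $K^t(x) = K(x) + O(1)$. Then it would follow that for these strings $\frac{c^\alpha}{d\,\tau} \le 2^{(\alpha - 1)K(x) + O(1)}$, which is false since these particular strings can have arbitrarily large Kolmogorov complexity. ☐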
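Theorem 4.3. The Rényi entropy of order $\alpha$ of the time-bounded universal distribution $\mathbf{m}^t$ converges iff $\alpha > 1$.
Proof. For $\alpha > 1$, since $(\mathbf{m}^t(x))^\alpha \le \mathbf{m}^t(x)$, we have $\sum_x (\mathbf{m}^t(x))^\alpha \le \sum_x \mathbf{m}^t(x) = 1$. Thus, $R_\alpha(\mathbf{m}^t)$ converges.
For $\alpha < 1$, suppose without loss of generality that $\alpha$ is rational (otherwise take a rational slightly larger than $\alpha$). Assume that $\sum_x (\mathbf{m}^t(x))^\alpha < \infty$. Then by universality of $\mathbf{m}$ and since $(\mathbf{m}^t(x))^\alpha$ is computable, we would have $\mathbf{m}(x) \ge c'\,(\mathbf{m}^t(x))^\alpha$ for some constant $c' > 0$, which, by taking logarithms, is equivalent to $K^t(x) \ge \frac{1}{\alpha}K(x) + O(1)$. Since $\frac{1}{\alpha} > 1$, this would contradict Solovay’s Theorem (Theorem 2.4). ☐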
  5. Uniform Continuity of the Entropies
Shannon, Rényi, and Tsallis entropies are very useful measures in Information Theory and Physics. In this section we look at these entropies as functions of the corresponding probability distribution, and establish an important property: all three entropies are uniformly continuous. We think that this mathematical result may be useful in future research on this topic. It should be mentioned that the uniform continuity of the Tsallis entropy for certain values of $\alpha$ has already been proved in [16].
  5.1. Tsallis Entropy
In order to prove the uniform continuity of the Tsallis entropy, we need some technical lemmas.
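Lemma 5.1. Let $0 \le b < a \le 1$ and $0 < \alpha < 1$. We have,
$$(a - b)^\alpha \ge a^\alpha - b^\alpha.$$
Proof. Consider the function $f(x) = x^\alpha$ for $0 < \alpha < 1$ and $0 \le x \le 1$. So, $f'(x) = \alpha x^{\alpha - 1}$ is positive for all $x \in (0, 1]$ and $f''(x) = \alpha(\alpha - 1)x^{\alpha - 2}$ is negative. We want to prove that:
$$f(a - b) - f(0) \ge f(a) - f(b).$$
If we shift $a$ and $b$ to the left by $\varepsilon$, it is easy to show, since $f'$ is decreasing, that the value of $f(a - \varepsilon) - f(b - \varepsilon)$ (for $0 \le \varepsilon \le b$) is greater than or equal to $f(a) - f(b)$. In particular, if $\varepsilon = b$, $f(a - b) - f(0) \ge f(a) - f(b)$, which means that $(a - b)^\alpha \ge a^\alpha - b^\alpha$. □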
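Lemma 5.2. Let $0 \le b < a \le 1$, and $\alpha > 1$. We have,
$$a^\alpha - b^\alpha \le \alpha(a - b).$$
Proof. Consider the function $f(x) = x^\alpha$ for $\alpha > 1$ and $0 \le x \le 1$. Let $a, b \in [0, 1]$, $b < a$. The function $f$ is continuous in $[b, a]$ and differentiable in $(b, a)$. So, by Lagrange’s Theorem, there is $c \in (b, a)$ such that
$$a^\alpha - b^\alpha = f'(c)(a - b).$$
Note that $f'(c) = \alpha c^{\alpha - 1} \le \alpha$. Thus, $a^\alpha - b^\alpha \le \alpha(a - b)$. □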
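The two technical lemmas can be sanity-checked numerically. The sketch below assumes they are the elementary power inequalities $(a - b)^\alpha \ge a^\alpha - b^\alpha$ for $0 < \alpha < 1$ and $a^\alpha - b^\alpha \le \alpha(a - b)$ for $\alpha > 1$, both with $0 \le b < a \le 1$. The function name, the grid, the chosen values of $\alpha$ and the tolerance are assumptions made for this illustration, and a grid check is of course not a proof.

```python
import itertools


def check_power_lemmas(grid_size=50,
                       alphas_low=(0.1, 0.5, 0.9),
                       alphas_high=(1.1, 2.0, 5.0),
                       tol=1e-12):
    """Test both inequalities on all pairs b < a of an even grid in [0, 1]."""
    points = [i / grid_size for i in range(grid_size + 1)]
    # combinations() of an increasing list yields pairs with b < a.
    for b, a in itertools.combinations(points, 2):
        for alpha in alphas_low:
            # Subadditivity-type bound for the concave case 0 < alpha < 1.
            if (a - b) ** alpha < a ** alpha - b ** alpha - tol:
                return False
        for alpha in alphas_high:
            # Mean-value bound for alpha > 1: f'(c) = alpha * c^(alpha-1) <= alpha.
            if a ** alpha - b ** alpha > alpha * (a - b) + tol:
                return False
    return True


print(check_power_lemmas())
```

The small tolerance only absorbs floating-point rounding in the cases where the two sides are equal, such as $b = 0$ in the first inequality.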
 Theorem 5.3. Let P and Q be two probability distributions over . Let  and  be the Tsallis entropy of distributions P and Q, respectively, where , . Then, 
 Proof. If 
, we have:
 Thus, it is sufficient to consider 
.
 If 
, we have:
 Thus, it is sufficient to consider 
. □
  5.2. Shannon Entropy
Lemma 5.4. The function $x \log x$ (with the convention $0 \log 0 = 0$) is uniformly continuous in $[0, 1]$.
Proof. The function is continuous in $[0, 1]$ and this is a compact set, so the function is uniformly continuous. □
Theorem 5.5. Let P and Q be two probability distributions over . Let  and  be the Shannon entropy of distributions P and Q, respectively. Then,  Proof. By Lemma 5.4, we have that:
 So, 
 It is sufficient to consider 
 and consider 
. □
   5.3. Rényi Entropy
Lemma 5.6. Let a and b be two real numbers greater than 1. Then,  Proof. Consider the function 
 and 
. Let 
. The function 
f is continuous in 
 and differentiable in 
. So, by Lagrange’s Theorem, 
 Note that 
. Thus, 
. □
 Lemma 5.7. The function  is uniformly continuous in . Thus,  Proof. The function is continuous in  and this is a compact set, so the function is uniformly continuous. □
Theorem 5.8. Let P and Q be two probability distributions over . Let  and  be the Rényi entropy of distributions P and Q respectively, where , . Then, 
 Proof. If :
Consider . By Lemma 5.7, we know that there is β such that if  then . So, consider .
We have to show that if 
 then 
:
If :
One of the 
 is at least 
 so that 
 and 
; assume that 
. We have 
 But, by Lemma 5.2 we have 
 Thus, 
 We get 
 It is sufficient to consider 
. □
  6. Conclusions
We proved that, among the three entropies we have studied, Shannon entropy is the only one that satisfies the relationship with the expected value of Kolmogorov complexity stated in [1], by exhibiting a probability distribution for which the relationship fails for some values of $\alpha$ of Tsallis and Rényi entropies. Furthermore, under the assumption that the cumulative probability distribution is computable in an allotted time, a time-bounded version of the same relationship holds for the Shannon entropy. Since it is natural to define a probability distribution based on time-bounded Kolmogorov complexity, we studied the convergence of this distribution under Shannon entropy and its two generalizations: Tsallis and Rényi entropies. We also proved that the three entropies considered in this work are uniformly continuous.