On the connections of the generalized entropies and Kolmogorov-Sinai entropies

We consider the concept of generalized measure-theoretic entropy, where instead of the Shannon entropy function we consider an arbitrary concave function defined on the unit interval and vanishing at the origin. Under mild assumptions on this function we show that this isomorphism invariant is linearly dependent on the Kolmogorov-Sinai entropy.


Introduction
Dynamical and measure-theoretic entropies (the latter also called Kolmogorov-Sinai entropy) are basic tools for investigating dynamical systems (see e.g. [5,9]). They have been extensively studied and successfully applied, among others, in statistical physics and quantum information, and have proved to be exceptionally powerful tools for exploring nonlinear systems. One of the biggest advantages of the Kolmogorov-Sinai entropy is that it makes it possible to distinguish formally regular systems (those with measure-theoretic entropy equal to zero) from chaotic ones (with positive entropy, which implies positivity of topological entropy [10]).
The Kolmogorov-Sinai entropy of a given transformation T acting on a probability space (X, Σ, µ) is defined as the supremum over all finite measurable partitions P of the dynamical entropy of T with respect to P, denoted by h(T, P). As a dynamical counterpart of Shannon entropy, the entropy of the transformation T with respect to a given partition P is defined as the limit of the sequence $\left(\frac{1}{n} H(P_n)\right)_{n=1}^{\infty}$, where $H(P_n) = \sum_{A \in P_n} \eta(\mu(A))$, η is the Shannon function given by η(x) = −x log x for x > 0 with η(0) := 0, and $P_n$ is the join of the partitions $T^{-i}P$ for i = 0, ..., n − 1.
The existence of the limit in the definition of the dynamical entropy follows from the subadditivity of η. The most common interpretation of this quantity is the average (over time and the phase space) one-step gain of information about the initial state. Taking the supremum over all finite partitions we obtain an isomorphism invariant which measures the rate at which the system produces randomness (chaos). Since Shannon's seminal paper [18] many generalizations of the concept of Shannon static entropy have been considered; see Arimoto [1], Rényi [16] and Csiszár's survey article [4]. Their dynamical and measure-theoretic counterparts have been considered by only a few authors. De Paly [14] proposed generalized dynamical entropies based on the concept of relative static entropies. Unfortunately, it turned out that, apart from some special cases [14,15], explicit calculation of this invariant may not be possible. Grassberger and Procaccia proposed in [6] a dynamical counterpart of a well-known generalization of Shannon entropy, the Rényi entropy, and its measure-theoretic counterpart was considered by Takens and Verbitski. They showed that for ergodic transformations with positive measure-theoretic entropy, the Rényi entropies of a measure-theoretic transformation are either infinite or equal to the measure-theoretic entropy [20]. The answer for non-ergodic aperiodic transformations is different: Rényi entropies of order α > 1 are equal to the essential infimum of the measure-theoretic entropies of the measures forming the decomposition of the given measure into ergodic components, while for α < 1 they are still infinite [21]. In particular, this means that Rényi entropies of order α > 1 are metric invariants sensitive to ergodicity. A similar generalization was made by Mesón and Vericat [11,12] for the so-called Havrda-Charvát-Tsallis entropy [7], and their results were similar to those obtained by Takens and Verbitski in [20].
Our approach is based on Arimoto's generalization applied to the dynamical case. Instead of the Shannon function η we consider a concave function g : [0, 1] → R such that $\lim_{x \to 0^+} g(x) = g(0) = 0$ and define the dynamical g-entropy of the finite partition P as $$h(g, T, P) = \limsup_{n \to \infty} \frac{1}{n} \sum_{A \in P_n} g(\mu(A)).$$
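The behaviour of the quotient g(x)/η(x) as x converges to zero appears to be crucial for our considerations. Namely, defining $$\mathrm{Ci}(g) := \liminf_{x \to 0^+} \frac{g(x)}{\eta(x)} \quad\text{and}\quad \mathrm{Cs}(g) := \limsup_{x \to 0^+} \frac{g(x)}{\eta(x)},$$ we will prove that $$\mathrm{Ci}(g)\, h(T, P) \le h(g, T, P) \le \mathrm{Cs}(g)\, h(T, P).$$ In the case of Ci(g) = ∞ we will show that in every aperiodic system and for every γ ≥ 0 there exists a finite partition P such that h(g, T, P) ≥ γ.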
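To get a feel for the role of the quotient g(x)/η(x) near zero, the following Python sketch evaluates it along x = 10^(−k) for a few concave functions vanishing at zero. The functions x(1 − x), 2η and √x are illustrative choices made here, not examples taken from the paper, and natural logarithms are assumed.

```python
import math

def eta(x: float) -> float:
    """Shannon function eta(x) = -x log x, with eta(0) := 0."""
    return 0.0 if x == 0.0 else -x * math.log(x)

# Illustrative concave functions on [0, 1] with g(0) = 0 (assumed examples).
CANDIDATES = {
    "x(1-x)": lambda x: x * (1.0 - x),   # quotient decreases towards 0
    "2*eta": lambda x: 2.0 * eta(x),     # quotient is constantly 2
    "sqrt": math.sqrt,                   # quotient grows without bound
}

def quotient_along(g, exponents=(2, 4, 8, 16, 32)):
    """Values of g(x)/eta(x) at x = 10**(-k) for the given exponents k."""
    return [g(10.0 ** -k) / eta(10.0 ** -k) for k in exponents]

if __name__ == "__main__":
    for name, g in CANDIDATES.items():
        print(name, ["%.3g" % q for q in quotient_along(g)])
```

The three functions were chosen so that the quotient tends to zero, stays finite and positive, and diverges, respectively; these are the three regimes distinguished in the text.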
Taking the supremum over all partitions we obtain a Kolmogorov-entropy-like isomorphism invariant, which we will call the measure-theoretic g-entropy of a transformation with respect to an invariant measure. One might ask whether this invariant gives any new information about the system. We will prove (Theorem 3.2) that for g with Cs(g) < ∞ this new invariant is linearly dependent on the Kolmogorov-Sinai entropy. This means that the Shannon entropy function is in fact the most natural one: not only does it have all the properties which an entropy function should have [5], but considering different entropy functions does not yield an essentially different invariant. This result has another interpretation. Ornstein and Weiss showed in [13] that every finitely observable invariant for the class of all ergodic processes has to be a continuous function of the entropy. It is easy to see that any continuous function of the entropy is finitely observable: one simply composes the entropy estimators with the continuous function itself. In other words, an isomorphism invariant is finitely observable if and only if it is a continuous function of the Kolmogorov-Sinai entropy. Therefore our result implies that the generalized measure-theoretic entropy is in fact finitely observable. It should be possible to give a more direct proof of the finite observability of the generalized measure-theoretic entropy, but the proof cannot be easier¹ than the proof that entropy itself is finitely observable, see [22].
The paper is organized as follows: in the next section we give a formal definition of the dynamical g-entropy and establish its basic properties. The subsequent section is devoted to the construction of a zero dynamical entropy process with a given positive g-entropy. Finally, in the last section, we define the measure-theoretic g-entropy of a transformation and show connections between this new invariant and the Kolmogorov-Sinai entropy.

Basic facts and definitions
Let (X, Σ, µ) be a Lebesgue space and let g : [0, 1] → R be a concave function with $g(0) = \lim_{x \to 0^+} g(x) = 0$.² By $G_0$ we will denote the set of all such functions. Every $g \in G_0$ is subadditive, i.e. g(x + y) ≤ g(x) + g(y) for every x, y ∈ [0, 1], and quasihomogeneous, i.e. $\phi_g : (0, 1] \to \mathbb{R}$ defined by $\phi_g(x) := g(x)/x$ is decreasing (see [17]).³ For a given finite partition P we define the g-entropy of the partition P as $$H(g, P) := \sum_{A \in P} g(\mu(A)).$$ For g = η the latter is equal to the Shannon entropy of the partition P. For two finite partitions P and Q of the space X we define a new partition P ∨ Q (the join of P and Q) consisting of the sets of the form B ∩ C, where B ∈ P and C ∈ Q. The join of more than two partitions is defined similarly.
¹ Benjamin Weiss, personal communication.
² We might assume only that g(0) = 0, but then the idea of the dynamical g-entropy would fail, since if $P_{n+1} \neq P_n$ for every n and $\lim_{x \to 0^+} g(x) \neq 0$, then the dynamical g-entropy of the partition P would be infinite. Therefore, if g is not well-defined at zero we will assume that $g(0) := \lim_{x \to 0^+} g(x)$.
³ If g is fixed we will omit the index, writing just ϕ.
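The static notions above can be tried out on a finite probability space. The sketch below is only an illustration: the four-point space with uniform measure, the helper names `g_entropy` and `join`, and the choice g(x) = √x are assumptions made here, not taken from the paper. It computes the static g-entropy as the sum of g(µ(A)) over the atoms A of a partition and forms the join of two partitions from their nonempty intersections.

```python
import math
from itertools import product

def eta(x: float) -> float:
    """Shannon function eta(x) = -x log x, with eta(0) := 0."""
    return 0.0 if x == 0.0 else -x * math.log(x)

def measure(block, mu):
    """Measure of a block (a set of points) under the point masses mu."""
    return sum(mu[point] for point in block)

def g_entropy(g, partition, mu):
    """Static g-entropy: sum of g(mu(A)) over the atoms A of the partition."""
    return sum(g(measure(block, mu)) for block in partition)

def join(p, q):
    """Join of two partitions: all nonempty intersections B & C."""
    return [b & c for b, c in product(p, q) if b & c]

# Hypothetical example: four points with uniform measure.
MU = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
P = [frozenset({0, 1}), frozenset({2, 3})]
Q = [frozenset({0, 2}), frozenset({1, 3})]

if __name__ == "__main__":
    pq = join(P, Q)
    print(g_entropy(eta, P, MU), g_entropy(eta, pq, MU))
    print(g_entropy(math.sqrt, P, MU), g_entropy(math.sqrt, pq, MU))
```

For g = η the two values are the Shannon entropies log 2 and log 4. Dropping empty intersections in `join` does not change any g-entropy, because g(0) = 0.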

Dynamical g-entropies.
For an automorphism T : X → X and a partition $P = \{E_1, \dots, E_k\}$ we put $$P_n := \bigvee_{i=0}^{n-1} T^{-i} P.$$ Now for a given $g \in G_0$ and a finite partition P we can define the dynamical g-entropy of the transformation T with respect to P as $$(2)\quad h_\mu(g, T, P) = \limsup_{n \to \infty} \frac{1}{n} H(g, P_n).$$
Alternatively we will call it the g-entropy of the process (X, Σ, µ, T, P). If the dynamical system (X, Σ, T, µ) is fixed then we omit T, writing just h(g, P). As in the case of Shannon dynamical entropies, we are interested in the existence of the limit of $\left(\frac{1}{n} H(g, P_n)\right)_{n=1}^{\infty}$. If g = η, we obtain the Shannon dynamical entropy h(T, P). However, in the general case we cannot replace the upper limit in (2) by the limit, since the latter might not exist. Existence of the limit in the case of the Shannon function follows from the subadditivity of the static Shannon entropy. Every subderivative function, i.e. a function for which the inequality g(xy) ≤ xg(y) + yg(x) holds for any x, y ∈ [0, 1], has this property, but it does not hold in general (an appropriate example will be given in Section 2.2). Therefore we propose more general classes of functions for which the limit exists. It exists if g belongs to one of two following classes: It is easy to show that if g is subderivative then the limit $\lim_{x \to 0^+} g(x)/\eta(x)$ is finite. Moreover, we will see that the values of dynamical g-entropies depend on the behaviour of g in the neighbourhood of zero. We will prove that if $g \in G_0^0 \cup G_0^{Sh}$, then there is a linear dependence between the dynamical g-entropy and the Shannon dynamical entropy of a given partition. Before we give the general result (Theorem 2.1) we state a few facts which we will use in the proof of this theorem. We give the following lemmas, omitting their elementary proofs. 1] g(x).
The following lemma states that the value of the dynamical g-entropy is given by the behaviour of g in the neighbourhood of zero.
Proof. Let P ∈ B and let $g_1, g_2 \in G_0$, c > 0 fulfil the assumptions. Since $g \in G_0$ is bounded we have Dividing by n and letting n tend to infinity we obtain We may now state the main theorem of this section.
and $h(g_2, P) < \infty$, then If additionally lim sup
Remark 2.1. Whenever $g_2 : [0, 1] \to \mathbb{R}$ is a nonnegative concave function satisfying $g_2(0) = 0$ and $g_2'(0) = \infty$, we can have any pair 0 < a ≤ b ≤ ∞ as limit inferior and limit superior of $g_1/g_2$ at 0, choosing a suitable function $g_1$. The idea is as follows: construct $g_1$ piecewise linear. To do so, define inductively a strictly decreasing sequence $x_k \to 0$ and a decreasing sequence of values $y_k = g_1(x_k) \to 0$, thus defining intervals $J_k := [x_{k+1}, x_k]$ on which $g_1$ is affine. The only constraint to get a concave function is that the slope of $g_1$ on each interval $J_k$ has to be smaller than $y_k/x_k$ and increasing with respect to k; this is not an obstruction to approaching any limit inferior and limit superior of $g_1(x)/g_2(x)$, provided that $x_{k+1} > 0$ is chosen small enough.
Proof of Theorem 2.1. Let P ∈ B. Suppose that $g \in G_0$ and $g'(0) < \infty$. Then which completes the proof of point 1. To prove point 2 let $g_1, g_2 \in G_0$ be such that We will assume that lim sup The estimate of the lower bound for $h(g_1, P)$ remains correct if we omit this assumption. Since g is subadditive, the sequence $(H(g, P_n))_{n=1}^{\infty}$ is nondecreasing and there exists the limit $\lim_{n \to \infty} H(g_2, P_n)$. If it is finite, then $h(g_2, P) = 0$ and by (3) and Lemma 2.1 we have Since lim sup Therefore we can assume that lim Using (3) for every n > 0 we get and A∈Pn: µ(A)<δ and from (4) we obtain Letting n tend to infinity we obtain: Therefore we obtain the assertion. In the case of an infinite limit superior of the quotient $g_1(x)/g_2(x)$ we can repeat the above reasoning, simply omitting the upper bound for the considered expressions.
g 2 (x) and using similar arguments we obtain point 3.
Using similar arguments we obtain the answer in the case of an infinite limit $\lim_{x \to 0^+} g_1(x)/g_2(x)$ and positive dynamical $g_2$-entropy:
Theorem 2.2. Let $g_1, g_2 \in G_0$ be such that $\lim_{x \to 0^+} g_1(x)/g_2(x) = \infty$ and let a finite partition P have positive $g_2$-entropy. Then $h(g_1, P)$ is infinite.

2.2. Case of $g \in G_0^\infty$. We will prove that for every $g \in G_0^\infty$, any aperiodic automorphism T and every γ ∈ R there exists a partition P ∈ B such that h(g, P) ≥ γ. Since we omit the assumption of ergodicity, we will use different techniques, mainly based on the well-known Rokhlin Lemma, which guarantees the existence of so-called Rokhlin towers of a given height covering a sufficiently large part of X. Using such towers we will find lower estimates for the g-entropy of a process, similar to those obtained by Frank Blume in [2], [3], where he proposed, for a given sequence $(a_n)_{n=1}^{\infty}$ converging to infinity slower than n, a construction of a partition P into two sets for which $\lim_{n \to \infty} H(P_n)/a_n = \infty$.
If $M_0, \dots, M_{n-1} \subset X$ are pairwise disjoint sets of equal measure, then $\tau = (M_0, M_1, \dots, M_{n-1})$ is called a tower. If additionally $M_k = T^{-(n-k-1)} M_{n-1}$ for k = 1, ..., n − 1, then τ is called a Rokhlin tower.⁴ By the same bold letter τ we will denote the set $\bigcup_{k=0}^{n-1} M_k$. Obviously $\mu(\tau) = n\mu(M_{n-1})$. The integer n is called the height of the tower τ. Moreover for i < j we define a subtower In aperiodic systems there exist Rokhlin towers of a given height covering a sufficiently large part of X:
Lemma 2.4 (Rokhlin Lemma). If T is an aperiodic and surjective transformation of a Lebesgue space (X, Σ, µ), then for every ε > 0 and every integer n ≥ 2 there exists a Rokhlin tower τ of height n with µ(τ) > 1 − ε.
Our goal is to find a lower bound for the dynamical g-entropy of a given partition. For this purpose we will use Rokhlin towers and calculate the dynamical g-entropy with respect to a given Rokhlin tower. This leads us to the following definition: let P be a finite partition of X and F ∈ Σ; then we define the (static) g-entropy of P restricted to F as $$H_F(g, P) := \sum_{B \in P} g(\mu(B \cap F)).$$
The following lemma gives an estimate of H(g, P) from below by the value of the g-entropy restricted to a subset of X.
Lemma 2.5. Let $g \in G_0$. Let P be a finite partition such that there exists a set E ∈ P with 0 < µ(E) < 1. If F ∈ Σ, then Proof. By the mean value theorem we have , for any set of measure smaller than or equal to 1/2, where $x_A^0 \in (\mu(A \cap F), \mu(A))$.
The following lemma will play an important role in the proof of the main theorem of this section where α is some positive number. Then there exist δ > 0 and s ∈ (0, α) such that Proof. Nonnegativity of g on [0, α] and its concavity imply that there exists s ∈ (0, α) such that g is nondecreasing on [0, s]. Fix n ∈ N and E ∈ Σ. There exists δ ∈ (0, s) such that Let F ∈ Σ be such that µ(E△F) ≤ δ. Then for every $A \in P_n^E$ and $B \in P_n^F$ we have (6) |µ It is easy to see that for x, y ∈ [0, s] the monotonicity and subadditivity of g imply that (7) |g(y) − g(x)| ≤ g(|y − x|).
Define $D_s = \{i \in \{1, \dots, m\} \mid \mu(A_i) < s \text{ and } \mu(B_i) < s\}$. From (5), (6), (7) and the monotonicity of g on [0, s] we obtain To find a lower bound for the g-entropy of a partition we will construct so-called independent sets. We construct such a set in the following way. Let τ be a tower of height m. We divide the highest level of this tower ($M_{m-1}$) into two sets of equal measure, say $I^{(m-1)}$ and $M_{m-1} \setminus I^{(m-1)}$. Next we consider $T^{-1} I^{(m-1)}$ and $T^{-1}(M_{m-1} \setminus I^{(m-1)})$. We divide each of them into two sets of equal measure and obtain sets I  We can make this construction since no aperiodic system has atoms of positive measure, and in every non-atomic Lebesgue space, for every measurable set A and every α ∈ [0, µ(A)] there exists B ⊂ A such that µ(B) = α. Now we give an estimate from below for the g-entropy restricted to a given Rokhlin tower. First, by $P^I$ we will denote the partition into two sets {I, X\I}, for a measurable set I. Then the following lemma is true.
Theorem 2.3. Let g ∈ G ∞ 0 and T be an aperiodic, surjective automorphism of a Lebesgue space (X, Σ, µ) and let γ ∈ R. Then there exists a partition P ∈ B such that h(g, P) ≥ γ.
Proof. We will prove that for any γ > 0 there exists a partition $P^E = \{E, X \setminus E\}$ such that $h(g, P^E) \ge \gamma$. We define recursively a sequence of sets $E_n \in \Sigma$. Let Let n > 0 and assume that we have already defined $E_{n-1}$, $N_{n-1}$ and $\delta_{n-1}$. Using Lemma 2.6 we can choose $\delta_n > 0$ such that for any F ∈ Σ for which $\mu(E_{n-1} \triangle F) < 2\delta_n$. Since we can choose $N_n \in \mathbb{N}$ such that $$(10)\quad \frac{\phi(\delta_n 2^{-N_n-1})}{\phi_\eta(\delta_n 2^{-N_n-1})} > \frac{2\gamma}{\delta_n \log 2}.$$
By Lemma 2.4 there exists $M_n \in \Sigma$ such that $\tau_n = (M_n, TM_n, \dots, T^{2N_n-1}M_n)$ is a Rokhlin tower of measure $\mu(\tau_n) = \delta_n$. Let $I_n \subset \tau_n$ be an independent set in $\tau_n$ and $E_n := (E_{n-1} \setminus \tau_n) \cup I_n$. Then for all positive integers n. By (8) we have $\delta_n < 2^{-n}$ and we conclude that $(1_{E_n})_{n=0}^{\infty}$ is a Cauchy sequence in $L^1(X)$. Therefore there exists E ∈ Σ such that $1_{E_n}$ converges to $1_E$. For this set we have Since $E_n \cap \tau_n = I_n$, applying (9) and Lemmas 2.5, 2.7 we obtain that for $N_n$ such that $\delta_n \cdot 2^{-N_n-1} < s$: From (10) we obtain that

For any s ≤ t and block $[\omega_0, \dots, \omega_{t-s}]$ with $\omega_i \in A$ we define a cylinder $C_s^t(\omega_0, \dots, \omega_{t-s}) = \{x \in X : x_i = \omega_{i-s} \text{ for } i = s, \dots, t\}$. We consider the Borel σ-algebra with respect to the metric given by $d(x, y) = 2^{-N}$, where $N = \min\{|i| : x_i \neq y_i\}$. One can show that the Borel σ-algebra is the minimal σ-algebra containing all cylinder sets. Let $p = (p_1, \dots, p_k)$ be a probability vector, i.e. $p_i \ge 0$ for every i and $\sum_{i=1}^{k} p_i = 1$. We define a measure ρ = ρ(p) on A by setting $\rho(\{i\}) = p_i$. Then $\mu_p$ is the corresponding product measure on $X = A^{\mathbb{Z}}$. Thus, for the partition $P^A = \{[1], [2], \dots, [k]\}$, the static g-entropy of $P_n^A$ is equal to $$H_{\mu_p}(g, P_n^A) = \sum_{\omega \in A^n} g\big(\mu_p(C_0^{n-1}(\omega_0, \dots, \omega_{n-1}))\big) = \sum_{\omega \in A^n} g(p_{\omega_0} \cdots p_{\omega_{n-1}}),$$ where $\omega = (\omega_0, \dots, \omega_{n-1})$. By the concavity of the function g we have $$H_{\mu_p}(g, P_n^A) \le k^n g(k^{-n}),$$ where equality holds only when $p = p^* = (1/k, \dots, 1/k)$. Before calculating the dynamical g-entropy of the partition $P^A$ with respect to the measure $\mu_{p^*}$, we give the following lemma, whose proof will be given later: and for any κ > 1.
Therefore, applying Lemma 2.8 for the partition $P^A$ and κ = k we obtain
Remark 2.2. If we considered the lower limit instead of the upper limit we would obtain lim inf Therefore we cannot replace the upper limit by the limit in the definition of the dynamical g-entropy.
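For small n, the static g-entropy of the n-block partition under a Bernoulli measure can be checked by brute force, since it is the sum of g(p_{ω_0} ··· p_{ω_{n−1}}) over all words ω of length n. The sketch below is an illustration under assumptions made here: the function names, the probability vector (0.5, 0.3, 0.2) and the test function g(x) = √x are hypothetical choices, and enumerating all kⁿ words is feasible only for small n.

```python
import math
from itertools import product

def eta(x: float) -> float:
    """Shannon function eta(x) = -x log x, with eta(0) := 0."""
    return 0.0 if x == 0.0 else -x * math.log(x)

def bernoulli_g_entropy(g, p, n):
    """Sum of g(p_w0 * ... * p_w(n-1)) over all words w of length n."""
    return sum(g(math.prod(word)) for word in product(p, repeat=n))

def normalized_uniform(g, k, n):
    """(1/n) * k**n * g(k**-n): normalized g-entropy for the uniform vector."""
    return k ** n * g(float(k) ** -n) / n

if __name__ == "__main__":
    p = (0.5, 0.3, 0.2)          # assumed non-uniform probability vector
    uniform = (1 / 3,) * 3       # p* for k = 3
    for n in (1, 2, 4, 8):
        print(n,
              bernoulli_g_entropy(math.sqrt, p, n),
              bernoulli_g_entropy(math.sqrt, uniform, n),
              normalized_uniform(eta, 3, n))
```

For g = η the sum is additive in n, so it equals n times the Shannon entropy of p, and for any concave g the uniform vector gives the largest value; the printed columns make both facts visible for the assumed inputs.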
Proof of Lemma 2.8. We will show the equality for the upper limit. The proof of the equality for the lower limit is similar. Let $(x_n)_{n=1}^{\infty}$ and $(m_n)_{n=1}^{\infty}$ be such that $\limsup_{n \to \infty} g(x_n)/\eta(x_n) = c$ and $x_n \in (\kappa^{-m_n}, \kappa^{-m_n+1})$ for every n ∈ N. Then $-\log x_n \ge -\log \kappa^{-m_n+1}$. Every function $g \in G_0$ is quasihomogeneous, so for every positive integer n we have $g(x_n)$

Kolmogorov-Sinai entropy like invariant
A basic tool in ergodic theory is the Kolmogorov-Sinai entropy, defined as the supremum of Shannon dynamical entropies over all finite partitions: $$h_\mu(T) = \sup_{P \text{ finite}} h_\mu(T, P).$$ It is invariant under metric isomorphism. Following Kolmogorov's idea, we take the supremum of the dynamical g-entropies over all finite partitions. For a given system (X, Σ, µ, T) we define $$(12)\quad h_\mu(g, T) = \sup_{P \text{ finite}} h_\mu(g, T, P)$$ and call it the measure-theoretic g-entropy of the transformation T with respect to the measure µ.
It is easy to see that it is an isomorphism invariant. Ornstein and Weiss [13] showed the striking result that measure-theoretic entropy is the only finitely observable invariant for the class of all ergodic processes. More precisely, every finitely observable invariant for the class of all ergodic processes is a continuous function of the entropy. Of course, in the case of $g \in G_0^0 \cup G_0^{Sh}$, by Corollary 2.2 we have We will show that for a wider class of functions, namely for functions for which for any ergodic transformation T. This shows that the measure-theoretic g-entropy is in fact finitely observable: one might simply compose the entropy estimators [22] with the linear function itself. Our proof will be similar to the proof of [20, Thm 1.1], where Takens and Verbitski showed that for ergodic transformations the supremum over all finite partitions of the dynamical Rényi entropies of order α > 1 is equal to the measure-theoretic entropy of T with respect to the measure µ. Let us introduce the necessary definitions. Let $T_i$ be automorphisms of Lebesgue spaces $(X_i, \Sigma_i, \mu_i)$ for i = 1, 2, respectively. We say that $T_2$ is a factor of the transformation $T_1$ if there exists a homomorphism $\varphi : X_1 \to X_2$ such that $\varphi T_1 = T_2 \varphi$ $\mu_1$-a.e. on $X_1$.
Suppose that $T_2$ is a factor of $T_1$ under a homomorphism φ. Then for an arbitrary finite partition P of $X_2$ we have Hence $h(g, T_2, P) = h(g, T_1, \varphi^{-1}P)$. Therefore This implies the following proposition:
Proposition 3.1. If $T_2$ is a factor of $T_1$, then for every function $g \in G_0$ we have $h_\mu(g, T_2) \le h_\mu(g, T_1)$.

3.1. Measure-theoretic g-entropies for Bernoulli automorphisms. An automorphism T on (X, Σ, µ) is called a Bernoulli automorphism if it is isomorphic to some Bernoulli shift. A crucial role in the proof of the main theorem of this section (Theorem 3.2) is played by a well-known theorem due to Sinai:
Theorem 3.1 (Sinai [19]). Let T be an arbitrary ergodic automorphism of some Lebesgue space (X, Σ, µ). Then each Bernoulli automorphism $T_1$ with $h_\mu(T_1) \le h_\mu(T)$ is a factor of the automorphism T.
The following proposition will play a crucial role in our considerations: It is easy to see that h µ (σ) = log M. From Theorem 3.1 we conclude that σ is a factor of T . Therefore applying formula (11) we obtain Applying Lemma 2.8 completes the proof.

3.2. Main theorem. Our goal in this section is the following result:
Theorem 3.2. Let T be an ergodic automorphism of a Lebesgue space (X, Σ, µ), and let $g \in G_0$ be such that Cs(g) ∈ (0, ∞). Then $h_\mu(g, T) = \mathrm{Cs}(g) \cdot h_\mu(T)$. If $g \in G_0^0$, then $h_\mu(g, T) = 0$. If $g \in G_0$ is such that Cs(g) = ∞ and T has positive measure-theoretic entropy, then $h_\mu(g, T) = \infty$.
To prove Theorem 3.2 we first need a few preliminary lemmas.

Therefore $\liminf_{k \to \infty} c_k \le m \limsup_{n \to \infty} a_n$, and this is equivalent to the statement $h(g, T^m, P) \le m\, h(g, T, P)$.
Taking the supremum over all finite partitions we obtain the assertion.
The next lemma is just a weaker version of Theorem 3.2.
Lemma 3.2. If an automorphism $T^m$ of a Lebesgue space (X, Σ, µ) is ergodic for every m ∈ N, then for every function $g \in G_0$ such that Cs(g) < ∞ we have $h_\mu(g, T) = \mathrm{Cs}(g) \cdot h_\mu(T)$.
Proof. The case of $g \in G_0^0$ follows from Corollary 2.2. Suppose that there exists $g \in G_0 \setminus G_0^0$ which fulfils the assumptions of the lemma and for which we have Then applying Lemma 3.1 to the transformation $T^m$ and using the equality $h_\mu(T^m) = m h_\mu(T)$ (see [9, Thm 4.3.16]) we obtain Therefore for sufficiently large m there exists an integer M for which $$(15)\quad h_\mu(g, T^m) \le m h_\mu(g, T) < \mathrm{Cs}(g) \log M \le m\, \mathrm{Cs}(g) h_\mu(T) = \mathrm{Cs}(g) h_\mu(T^m).$$
Proposition 3.2 applied to the transformation $T^m$ guarantees that for every $g \in G_0$ with positive (finite) Cs(g) we have $$(16)\quad h_\mu(g, T^m) \ge \mathrm{Cs}(g) \log M.$$
Comparing (15) and (16) we obtain a contradiction, which implies that
Proof of Theorem 3.2. If $h_\mu(T) = 0$ the theorem is true, since for any partition P we have 0 ≤ h(g, P) ≤ Cs(g)h(P) = 0.
Since T′ is a factor of T, Proposition 3.1 implies that $\mathrm{Cs}(g) h_\mu(T) = \mathrm{Cs}(g) h_\mu(T') = h_\mu(g, T') \le h_\mu(g, T) \le \mathrm{Cs}(g) h_\mu(T)$, which completes the proof in the case of finite $h_\mu(T)$. If $h_\mu(T) = \infty$, then Proposition 3.2 implies that $h_\mu(g, T) \ge \mathrm{Cs}(g) \log M$ for every M > 0, and the theorem is proved.
3.3. Generator theorem counterpart. In the case of $g \in G_0^\infty$ there is no counterpart of the Kolmogorov-Sinai generator theorem, which says that the measure-theoretic entropy of the transformation T is realised on every generator of the σ-algebra Σ. Let us consider Sturmian shifts, i.e. shifts which model translations of the circle T = [0, 1). Let β ∈ [0, 1) and consider the translation $\varphi_\beta : [0, 1) \to [0, 1)$ defined by $\varphi_\beta(x) = x + \beta \pmod 1$. Let P denote the partition of [0, 1) given by P = {[0, β), [β, 1)}. Then we associate a binary sequence to each t ∈ [0, 1) according to its itinerary relative to P; that is, we associate to t ∈ [0, 1) the bi-infinite sequence x defined by $x_i = 0$ if $\varphi_\beta^i(t) \in [0, \beta)$ and $x_i = 1$ if $\varphi_\beta^i(t) \in [\beta, 1)$. The set of such sequences is not necessarily closed, but it is shift-invariant, and so its closure is a shift space called a Sturmian shift. If β is irrational, then the Sturmian shift is minimal, i.e. it has no proper subshift. Moreover, for a minimal Sturmian shift the number of n-blocks which occur in the shift space is exactly n + 1. Therefore, for the zero-coordinate partition $P^A$, which is a finite generator of the σ-algebra Σ, and for any function $g \in G_0$ we have $$H(g, P_n^A) = \sum_{A \in P_n^A} g(\mu_S(A)) \le \phi\left(\frac{1}{n+1}\right),$$ where $\mu_S$ is the unique invariant measure for the Sturmian shift. Thus $$h(g, P^A) \le \limsup_{n \to \infty} \frac{n+1}{n}\, g\left(\frac{1}{n+1}\right) = 0.$$
On the other hand, since the Sturmian shift is strictly ergodic (and thus aperiodic), Theorem 2.3 implies that for any function $g \in G_0^\infty$ we have $h_\mu(g, T) = \infty$. Therefore we have a finite generator for which the supremum is not attained.
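The block count n + 1 used in the Sturmian example can be observed numerically. The sketch below is a floating-point illustration rather than a proof: it codes a finite piece of the orbit of t = 0 under the rotation by β with respect to {[0, β), [β, 1)} and counts the distinct n-blocks in that window. The rotation number β = (√5 − 1)/2, the window length, the function names and the choice g(x) = √x in the usage are assumptions made here.

```python
import math

def rotation_coding(beta, t, length):
    """Itinerary of t under x -> x + beta (mod 1) w.r.t. {[0, beta), [beta, 1)}."""
    return [0 if (t + i * beta) % 1.0 < beta else 1 for i in range(length)]

def block_count(seq, n):
    """Number of distinct n-blocks occurring in the finite sequence seq."""
    return len({tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)})

def generator_bound(g, n):
    """The quantity (1/n) * (n + 1) * g(1 / (n + 1)) bounding (1/n) H(g, P_n)."""
    return (n + 1) * g(1.0 / (n + 1)) / n

BETA = (math.sqrt(5.0) - 1.0) / 2.0   # assumed irrational rotation number

if __name__ == "__main__":
    seq = rotation_coding(BETA, 0.0, 2000)
    for n in range(1, 11):
        print(n, block_count(seq, n), generator_bound(math.sqrt, n))
```

With only n + 1 atoms in the n-th join of the generator, the bound computed by `generator_bound` tends to zero for any g with g(x) → 0 as x → 0, which is the mechanism behind the vanishing g-entropy of the generator.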