
# Tsallis Entropy and Generalized Shannon Additivity

by Sonja Jäckle and Karsten Keller *
Institute of Mathematics, University of Lübeck, D-23562 Lübeck, Germany
* Author to whom correspondence should be addressed.
Axioms 2017, 6(2), 14; https://doi.org/10.3390/axioms6020014
Received: 19 May 2017 / Revised: 8 June 2017 / Accepted: 10 June 2017 / Published: 14 June 2017

## Abstract

The Tsallis entropy given for a positive parameter $\alpha$ can be considered as a generalization of the classical Shannon entropy. For the latter, corresponding to $\alpha = 1$, there exist many axiomatic characterizations. One of them, based on the well-known Khinchin-Shannon axioms, has been simplified several times and adapted to Tsallis entropy; here the axiom of (generalized) Shannon additivity plays a central role. The main aim of this paper is to discuss this axiom in the context of Tsallis entropy. We show that it is sufficient for characterizing Tsallis entropy, with the exception of the cases $\alpha = 1, 2$, which are discussed separately.

## 1. Introduction

Some history. In 1988, Tsallis [1] generalized the Boltzmann-Gibbs entropy
$S = -k_B \sum_{i=1}^{n} p_i \ln p_i,$
describing classical thermodynamical ensembles with microstates of probabilities $p_i$, by the entropy
$S_\alpha = k_B \, \frac{1 - \sum_{i=1}^{n} p_i^\alpha}{\alpha - 1}$
for $0 < \alpha \neq 1$, in the sense that $\lim_{\alpha \to 1} S_\alpha = S$. Here $k_B$ is the Boltzmann constant (a multiplicative constant neglected in the following). Many physicists argue that generalizing the classical entropy was a breakthrough in thermodynamics, since the extension allows a better description of systems out of equilibrium and of systems with strong correlations between microstates. There is, however, also criticism of the application of Tsallis' concept (compare [2,3]). In information theory, pioneered by Shannon, the Boltzmann-Gibbs entropy is one of the central concepts; we follow the usual practice of calling it Shannon entropy. Note also that Tsallis' entropy concept coincides, up to a constant, with the Havrda-Charvát entropy [4], given in 1967 in an information-theoretical context. Besides information theory, entropies are used in many fields, among them dynamical systems, data analysis (see e.g., [5]), and fractal geometry [6].
Many axiomatic characterizations of Tsallis entropy have been given, originating in those of the classical Shannon entropy (see below). One important axiom, called (generalized) Shannon additivity, is discussed extensively in this paper and shown to be sufficient in a certain sense.
Tsallis entropy. In the following, let $▵_n = \{(p_1, p_2, \dots, p_n) \in (\mathbb{R}^+)^n \,:\, \sum_{i=1}^{n} p_i = 1\}$ for $n \in \mathbb{N}$ be the set of all n-dimensional stochastic vectors and $▵ = \bigcup_{n \in \mathbb{N}} ▵_n$ be the set of all stochastic vectors, where $\mathbb{N} = \{1, 2, 3, \dots\}$ and $\mathbb{R}^+$ are the sets of natural numbers and of nonnegative real numbers, respectively. Given $\alpha > 0$ with $\alpha \neq 1$, the Tsallis entropy of a stochastic vector $(p_1, p_2, \dots, p_n)$ of some dimension n is defined by
$H(p_1, p_2, \dots, p_n) = \frac{1 - \sum_{i=1}^{n} p_i^\alpha}{\alpha - 1}.$
In the case $\alpha = 1$, the value $H(p_1, \dots, p_n)$ is not defined, but its limit as $\alpha$ approaches 1 is
$H(p_1, p_2, \dots, p_n) = -\sum_{i=1}^{n} p_i \ln p_i,$
which is the classical Shannon entropy. Insofar, Tsallis entropy can be considered as a generalization of the Shannon entropy, and so it is not surprising that there have been many attempts to generalize various axiomatic characterizations of the latter to the Tsallis entropy.
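This limit behavior can be checked numerically; the following sketch (ours, not part of the paper, with $k_B = 1$ and function names chosen for illustration) compares the Tsallis entropy for $\alpha$ close to 1 with the Shannon entropy:

```python
import math

def tsallis(p, alpha):
    """Tsallis entropy of a stochastic vector p for alpha != 1 (k_B = 1)."""
    return (1 - sum(q**alpha for q in p)) / (alpha - 1)

def shannon(p):
    """Classical Shannon entropy with the natural logarithm."""
    return -sum(q * math.log(q) for q in p if q > 0)

p = [0.5, 0.3, 0.2]
for alpha in (1.1, 1.01, 1.001):
    print(alpha, tsallis(p, alpha))  # values approach shannon(p) as alpha -> 1
print(shannon(p))
```

The printed values illustrate the convergence $\lim_{\alpha \to 1} S_\alpha = S$ stated above.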
Axiomatic characterizations. One line of characterizations, mainly followed by Suyari [7] and discussed in this paper, has its origin in the Shannon-Khinchin axioms of Shannon entropy (see [8,9]). Note that other characterizations of Tsallis entropy are due to dos Santos [10], Abe [11] and Furuichi [12]. For a general discussion of the axiomatization of entropies, see [13].
A map $H : ▵ \to \mathbb{R}^+$ is the Shannon entropy up to a multiplicative positive constant if it satisfies the following axioms:
Axiom (S4), called Shannon additivity, plays a key role in the characterization of the Shannon entropy, and an interesting result by Suyari [7] says that its generalization
for $\alpha \neq 1$ provides the Tsallis entropy for this $\alpha$.
More precisely, if $H : ▵ \to \mathbb{R}^+$ satisfies (S1), (S2), (S3) and (GS4), then $c(\alpha) H$ is the Tsallis entropy for some positive constant $c(\alpha)$. The full result of Suyari, slightly corrected by Ilić et al. [14], includes a characterization of the map $\alpha \mapsto c(\alpha)$ under the assumption that H also depends continuously on $\alpha \in \mathbb{R}^+ \setminus \{0\}$. We do not discuss this characterization, but we note here that the results below also provide an immediate simplification of the whole result of Suyari and Ilić et al.
Given $\alpha$, the constant $c(\alpha)$ is determined by any positive value $H(p_1, p_2, \dots, p_n)$ of some stochastic vector $(p_1, p_2, \dots, p_n)$. If this reference vector is, for example, given by $(\frac{1}{2}, \frac{1}{2})$, one easily sees that $c(\alpha) = \frac{2^{1-\alpha} - 1}{(1 - \alpha)\, H(\frac{1}{2}, \frac{1}{2})}$ and $c(1) = \frac{\ln 2}{H(\frac{1}{2}, \frac{1}{2})}$.
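As a sanity check (ours, with hypothetical helper names): if H is itself the Tsallis entropy, the formula for $c(\alpha)$ must return 1 for every admissible $\alpha$:

```python
def tsallis(p, alpha):
    """Tsallis entropy with k_B = 1."""
    return (1 - sum(q**alpha for q in p)) / (alpha - 1)

def c(alpha, H_half):
    """Constant c(alpha) determined by the reference value H(1/2, 1/2)."""
    return (2**(1 - alpha) - 1) / ((1 - alpha) * H_half)

# With H equal to the Tsallis entropy itself, c(alpha) must be 1.
for alpha in (0.5, 2.0, 3.0):
    print(alpha, c(alpha, tsallis([0.5, 0.5], alpha)))  # -> 1.0
```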
The main result. In this paper, we study the role of generalized Shannon additivity in characterizing Tsallis entropy, where for $\alpha \in \mathbb{R}^+ \setminus \{0\}$ and $H : ▵ \to \mathbb{R}$ we also consider the slightly relaxed property that
It turns out that this property is essentially enough for characterizing the Tsallis entropy for $\alpha \in \mathbb{R}^+ \setminus \{0, 1, 2\}$, and enough with a further weak assumption in the cases $\alpha = 1, 2$. As already mentioned, statement (iii) for $\alpha = 1$ is an immediate consequence of a characterization of Shannon entropy by Diderrich [15], simplifying an axiomatization given by Faddeev [16] (see below).
Theorem 1.
Let $H : ▵ \to \mathbb{R}$ be given with (GS4) or, slightly weaker, with (GS4'), for $\alpha \in \mathbb{R}^+ \setminus \{0\}$. Then the following holds:
(i)
If $\alpha \neq 1, 2$, then
(ii)
If $\alpha = 2$, then the following statements are equivalent:
(a)
It holds
(b)
H is bounded on $▵_2$,
(c)
H is continuous on $▵_2$,
(d)
H is symmetric on $▵_2$,
(e)
H does not change sign on $▵_2$.
(iii)
If $\alpha = 1$, then the following statements are equivalent:
(a)
It holds
(b)
H is bounded on $▵_2$.
Note that statement (iii) is given here only for reasons of completeness. It follows from a result of Diderrich [15].
The paper is organized as follows. Section 2 is devoted to the proof of the main result. It will turn out that most of the substantial work concerns stochastic vectors contained in $▵_2 \cup ▵_3$, and that generalized Shannon additivity acts as a bridge to longer stochastic vectors. Section 3 completes the discussion. In particular, the Tsallis entropy for $\alpha = 1, 2$ on rational vectors is discussed, and an open problem is formulated.

## 2. Proof of the Main Result

We start by investigating the relationship between $H(p_1, p_2)$ and $H(p_2, p_1)$ for $(p_1, p_2) \in ▵_2$.
Lemma 1.
Let $\alpha \in \mathbb{R}^+ \setminus \{0\}$ and let $H : ▵ \to \mathbb{R}$ satisfy (GS4'). Then for all $(p_1, p_2) \in ▵_2$ it follows that
$(1 - 3 \cdot 2^{-\alpha})\, H(p_1, p_2) + 2^{-\alpha} H(p_2, p_1) = H(\tfrac{1}{2}, \tfrac{1}{2})\,(1 - p_1^\alpha - p_2^\alpha),$
in particular for $\alpha = 1$
$H(p_1, p_2) = H(p_2, p_1)$
and for $\alpha = 2$
$H(p_1, p_2) + H(p_2, p_1) = 4\, H(\tfrac{1}{2}, \tfrac{1}{2})\,(1 - p_1^2 - p_2^2).$
Moreover, it holds that
$H(1) = 0.$
Proof.
First of all, note that (5) is an immediate consequence of (GS4'), which implies
$H(1, 0) = H(1) + 1^\alpha H(1, 0).$
Further, two different applications of (GS4') to $H(\tfrac{1}{2}, \tfrac{1}{2}, 0)$ provide
$H(\tfrac{1}{2}, \tfrac{1}{2}) + (\tfrac{1}{2})^\alpha H(1, 0) = H(\tfrac{1}{2}, \tfrac{1}{2}, 0) = H(1, 0) + 1^\alpha H(\tfrac{1}{2}, \tfrac{1}{2}).$
Therefore $H(1, 0) = 0$, and since one similarly gets $H(0, 1) = 0$, we can assume in the following that $p_1, p_2 \neq 0$.
Applying (GS4') three times, one obtains
$H(p_1, p_2) + (p_1^\alpha + p_2^\alpha)\, H(\tfrac{1}{2}, \tfrac{1}{2}) = H(\tfrac{p_1}{2}, \tfrac{p_1}{2}, \tfrac{p_2}{2}, \tfrac{p_2}{2}) = H(\tfrac{p_1}{2}, \tfrac{1}{2}, \tfrac{p_2}{2}) + 2^{-\alpha} H(p_1, p_2)$
and in the same way
$H(\tfrac{p_1}{2}, \tfrac{1}{2}, \tfrac{p_2}{2}) + 2^{-\alpha} H(p_2, p_1) = H(\tfrac{p_1}{2}, \tfrac{p_2}{2}, \tfrac{p_1}{2}, \tfrac{p_2}{2}) = H(\tfrac{1}{2}, \tfrac{1}{2}) + 2^{1-\alpha} H(p_1, p_2).$
Solving (7) for the term $H(\tfrac{p_1}{2}, \tfrac{1}{2}, \tfrac{p_2}{2})$ and substituting it into (6) provides
$H(p_1, p_2) + (p_1^\alpha + p_2^\alpha)\, H(\tfrac{1}{2}, \tfrac{1}{2}) = H(\tfrac{1}{2}, \tfrac{1}{2}) + 3 \cdot 2^{-\alpha} H(p_1, p_2) - 2^{-\alpha} H(p_2, p_1),$
which is equivalent to (2). Statements (3) and (4) follow immediately from Equation (2). ☐
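Since the Tsallis entropy itself satisfies (GS4'), it must fulfill Equation (2); the following numerical spot check (our own sketch, function names are ours) confirms this:

```python
def tsallis2(p1, p2, alpha):
    """Tsallis entropy of (p1, p2) with k_B = 1."""
    return (1 - p1**alpha - p2**alpha) / (alpha - 1)

def lemma1_gap(p1, p2, alpha):
    """Left-hand side minus right-hand side of Equation (2)."""
    lhs = (1 - 3 * 2**(-alpha)) * tsallis2(p1, p2, alpha) \
        + 2**(-alpha) * tsallis2(p2, p1, alpha)
    rhs = tsallis2(0.5, 0.5, alpha) * (1 - p1**alpha - p2**alpha)
    return lhs - rhs

print(lemma1_gap(0.7, 0.3, 1.5))  # -> 0.0 up to rounding
```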
In the case $\alpha = 1$, condition (GS4') implies that the order of the components of a stochastic vector makes no difference for H:
Lemma 2.
Let $H : ▵ \to \mathbb{R}$ satisfy (GS4') for $\alpha = 1$. Then H is permutation-invariant, meaning that $H(p_1, p_2, \dots, p_n) = H(p_{\pi(1)}, p_{\pi(2)}, \dots, p_{\pi(n)})$ for each $(p_1, p_2, \dots, p_n) \in ▵$, $n \in \mathbb{N}$, and each permutation $\pi$ of $\{1, 2, \dots, n\}$.
Proof.
It suffices to show that
For $n \leq 2$ this has been shown in Lemma 1 (see (3)); for $n \geq 3$ it follows directly from (GS4') and from Lemma 1. ☐
The following lemma provides the substantial part of the proof of Theorem 1.
Lemma 3.
For $H : ▵ \to \mathbb{R}$ satisfying (GS4') with $\alpha \in \mathbb{R}^+ \setminus \{0, 1\}$, the following holds:
(i)
If $\alpha \neq 2$, then
(ii)
If $\alpha = 2$, then the following statements are equivalent:
(a)
It holds
(b)
H is symmetric on $▵_2$, meaning that $H(p_1, p_2) = H(p_2, p_1)$ for all $(p_1, p_2) \in ▵_2$,
(c)
H is continuous on $▵_2$,
(d)
H is bounded on $▵_2$,
(e)
H is nonnegative or nonpositive on $▵_2$.
Proof.
We first show (i). Let $\alpha \neq 2$ and $(p_1, p_2) \in ▵_2$. Interchanging the roles of $p_1$ and $p_2$ in (2), by Lemma 1 one obtains
$(1 - 3 \cdot 2^{-\alpha})\, H(p_2, p_1) = H(\tfrac{1}{2}, \tfrac{1}{2})\,(1 - p_1^\alpha - p_2^\alpha) - 2^{-\alpha} H(p_1, p_2).$
Moreover, one easily sees that (2) transforms to
$(1 - 3 \cdot 2^{-\alpha})\, H(p_2, p_1) = 2^\alpha (1 - 3 \cdot 2^{-\alpha})\, H(\tfrac{1}{2}, \tfrac{1}{2})\,(1 - p_1^\alpha - p_2^\alpha) - 2^\alpha (1 - 6 \cdot 2^{-\alpha} + 9 \cdot 2^{-2\alpha})\, H(p_1, p_2).$
Subtracting (9) from (8) and multiplying by $-2^{-\alpha}$ provides
$(1 - 2^{2-\alpha})\, H(\tfrac{1}{2}, \tfrac{1}{2})\,(1 - p_1^\alpha - p_2^\alpha) = (1 - 3 \cdot 2^{1-\alpha} + 2^{3-2\alpha})\, H(p_1, p_2).$
Since $1 - 3 \cdot 2^{1-\alpha} + 2^{3-2\alpha} = (1 - 2^{2-\alpha})(1 - 2^{1-\alpha})$ and $\alpha \neq 2$, it follows that
$H(p_1, p_2) = \frac{1 - p_1^\alpha - p_2^\alpha}{1 - 2^{1-\alpha}}\, H(\tfrac{1}{2}, \tfrac{1}{2}).$
In order to show (ii), let $\alpha = 2$ and define maps $f : [\tfrac{1}{2}, 1] \to [\tfrac{1}{2}, 1]$ and $D : [\tfrac{1}{2}, 1] \to [0, \infty)$ by
$f(p) = \max\left\{\frac{1 - p}{p},\, 1 - \frac{1 - p}{p}\right\}$
and
$D(p) = |H(p, 1 - p) - H(1 - p, p)|$
for $p \in [\tfrac{1}{2}, 1]$.
By (4) in Lemma 1, (a) is equivalent both to (b) and to $D(p) = 0$ for all $p \in [\tfrac{1}{2}, 1]$. Statement (c) implies (d) by compactness of $▵_2$, and the validity of the implications (a) ⇒ (c) and (a) ⇒ (e) is obvious.
From
$H(p, 1 - p) + p^2 H\left(\frac{1 - p}{p},\, 1 - \frac{1 - p}{p}\right) = H(1 - p, 2p - 1, 1 - p) = H(1 - p, p) + p^2 H\left(1 - \frac{1 - p}{p},\, \frac{1 - p}{p}\right)$
for $p \in [\tfrac{1}{2}, 1]$ one obtains
$D(p) = p^2 D(f(p))$
and by induction
$D(p) = \left(\prod_{k=0}^{n-1} f^{\circ k}(p)^2\right) D(f^{\circ n}(p)),$
where $f^{\circ k}$ denotes the k-fold iterate of f, i.e., $f^{\circ 0}(p) = p$ and $f^{\circ k}(p) = f(f^{\circ (k-1)}(p))$.
For $p \in [\tfrac{2}{3}, 1]$ it holds that $f(p) = 2 - \frac{1}{p}$; hence f maps the interval $[\tfrac{2}{3}, 1]$ onto the interval $[\tfrac{1}{2}, 1]$. Since $p - f(p) = \frac{(p - 1)^2}{p} > 0$ for all $p \in [\tfrac{2}{3}, 1)$, the following holds:
Moreover, applying (10) to $p = \tfrac{1}{2}$ yields $0 = D(\tfrac{1}{2}) = \tfrac{1}{4} D(1)$, hence
$D(1) = 0.$
Assuming (d), by use of (11), (12) and (13) one obtains $D(p) = 0$ for all $p \in [\tfrac{1}{2}, 1]$, hence (a). If (e) is valid, then by (4) in Lemma 1
$D(r) \leq 4\, |H(\tfrac{1}{2}, \tfrac{1}{2})|\,(1 - r^2 - (1 - r)^2) \leq 4\, |H(\tfrac{1}{2}, \tfrac{1}{2})|$
for all $r \in [\tfrac{1}{2}, 1]$, providing (d). By what has already been shown, (a), (b), (c), (d) and (e) are equivalent. ☐
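The mechanism behind the implication from boundedness to $D = 0$ can be illustrated numerically: along the orbit of f, the product of squares in front of $D(f^{\circ n}(p))$ tends to zero, so a bounded D must vanish at p. A sketch (ours; the starting point $p = 1/\sqrt{2}$, whose orbit is a 2-cycle, is chosen just for illustration):

```python
import math

def f(p):
    """The interval map from the proof of Lemma 3 (ii)."""
    return max((1 - p) / p, 1 - (1 - p) / p)

p = 1 / math.sqrt(2)
prod = 1.0
for _ in range(50):
    prod *= p * p   # factor in front of D(f^n(p)) after n steps
    p = f(p)
print(p, prod)  # prod is essentially 0 after 50 steps
```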
Now we are able to complete the proof of Theorem 1. Assuming (GS4'), we first show (1) for $\alpha \neq 1, 2$ and for bounded H in the case $\alpha = 2$. This provides statement (i) and, together with Lemma 3 (ii), statement (ii) of Theorem 1.
Statement (1) is valid for all $(p_1, p_2, \dots, p_n) \in ▵_1 \cup ▵_2$ by Lemma 3. In order to prove it for $n > 2$, we use induction. Assuming validity of (1) for all $(p_1, p_2, \dots, p_n) \in ▵$ with $n = k$, where $k \in \mathbb{N} \setminus \{1\}$, let $(p_1, p_2, \dots, p_k, p_{k+1}) \in ▵$. Choose some $j \in \{1, 2, \dots, k\}$ with $p_j + p_{j+1} > 0$. Then by (GS4') and Lemma 3 we have
$H(p_1, \dots, p_{j-1}, p_j, p_{j+1}, p_{j+2}, \dots, p_{k+1}) = H(p_1, \dots, p_{j-1}, p_j + p_{j+1}, p_{j+2}, \dots, p_{k+1}) + (p_j + p_{j+1})^\alpha\, H\left(\frac{p_j}{p_j + p_{j+1}}, \frac{p_{j+1}}{p_j + p_{j+1}}\right)$
$= H(\tfrac{1}{2}, \tfrac{1}{2})\, \frac{1 - \sum_{i=1}^{j-1} p_i^\alpha - (p_j + p_{j+1})^\alpha - \sum_{i=j+2}^{k+1} p_i^\alpha}{1 - 2^{1-\alpha}} + H(\tfrac{1}{2}, \tfrac{1}{2})\, \frac{(p_j + p_{j+1})^\alpha}{1 - 2^{1-\alpha}} \left(1 - \left(\frac{p_j}{p_j + p_{j+1}}\right)^\alpha - \left(\frac{p_{j+1}}{p_j + p_{j+1}}\right)^\alpha\right)$
$= H(\tfrac{1}{2}, \tfrac{1}{2})\, \frac{1 - \sum_{i=1}^{k+1} p_i^\alpha}{1 - 2^{1-\alpha}}.$
So (1) holds for all $( p 1 , p 2 , … , p n ) ∈ ▵$ with $n = k + 1$.
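The induction step can be replayed numerically: merging two neighboring components via (GS4') and applying the closed formula to the shorter vectors reproduces the formula for the longer one. A sketch with our own helper names (H here is the Tsallis entropy itself):

```python
def tsallis(p, alpha):
    """Tsallis entropy of a stochastic vector p with k_B = 1."""
    return (1 - sum(q**alpha for q in p)) / (alpha - 1)

def merge_step(p, j, alpha):
    """Right-hand side of (GS4'): merge components j and j+1 of p."""
    s = p[j] + p[j + 1]
    merged = p[:j] + [s] + p[j + 2:]
    return tsallis(merged, alpha) + s**alpha * tsallis([p[j] / s, p[j + 1] / s], alpha)

p = [0.4, 0.3, 0.2, 0.1]
print(tsallis(p, 1.5), merge_step(p, 1, 1.5))  # both values coincide
```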
In order to see (iii), recall a result of Diderrich [15] stating that $H : ▵ \to \mathbb{R}$ is a multiple of the Shannon entropy if H is bounded and permutation-invariant on $▵_2$ and satisfies
which is weaker than (GS4') with $\alpha = 1$. Since under (GS4') the map H is permutation-invariant by Lemma 2, Diderrich's axioms are satisfied, and we are done. ☐

## 3. Further Discussion

Our discussion suggests that the case $\alpha = 2$ is more complicated than the general one. In order to get some further insight, particularly into the case $\alpha = 2$, let us consider only rational stochastic vectors. So in the following let $▵^{\mathbb{Q}} = \bigcup_{n \in \mathbb{N}} ▵_n^{\mathbb{Q}}$ with $▵_n^{\mathbb{Q}} = ▵_n \cap \mathbb{Q}^n$ for $n \in \mathbb{N}$, where $\mathbb{Q}$ denotes the set of rationals. The following proposition states that for $\alpha \neq 1$ the 'rational' generalized Shannon additivity in principle provides the Tsallis entropy on the rationals, which in particular yields a proof of the implication (c) ⇒ (a) in Theorem 1 (ii).
Proposition 1.
Let $H : ▵^{\mathbb{Q}} \to \mathbb{R}$ be given with (GS4) for $▵^{\mathbb{Q}}$ instead of $▵$ and $\alpha \in \mathbb{R}^+ \setminus \{0, 1\}$. Then it holds
Proof.
For the uniform vectors $(\tfrac{1}{m}, \dots, \tfrac{1}{m}), (\tfrac{1}{n}, \dots, \tfrac{1}{n}) \in ▵$ with $m, n \in \mathbb{N}$, we get from axiom (GS4)
$H(\tfrac{1}{m}, \dots, \tfrac{1}{m}) + m\, (\tfrac{1}{m})^\alpha H(\tfrac{1}{n}, \dots, \tfrac{1}{n}) = H(\tfrac{1}{mn}, \dots, \tfrac{1}{mn}) = H(\tfrac{1}{n}, \dots, \tfrac{1}{n}) + n\, (\tfrac{1}{n})^\alpha H(\tfrac{1}{m}, \dots, \tfrac{1}{m}),$
implying
$H(\tfrac{1}{m}, \dots, \tfrac{1}{m}) = H(\tfrac{1}{n}, \dots, \tfrac{1}{n}) \cdot \frac{1 - (\tfrac{1}{m})^{\alpha - 1}}{1 - (\tfrac{1}{n})^{\alpha - 1}}.$
Now consider any rational vector $(p_1, p_2, \dots, p_n) \in ▵^{\mathbb{Q}}$ with $p_1 = \tfrac{a_1}{b}, p_2 = \tfrac{a_2}{b}, \dots, p_n = \tfrac{a_n}{b}$ for $b, a_1, \dots, a_n \in \mathbb{N}$ satisfying $\sum_{i=1}^{n} a_i = b$. With (GS4) we get
$H(p_1, \dots, p_n) + \sum_{i=1}^{n} p_i^\alpha \cdot H(\tfrac{1}{a_i n}, \dots, \tfrac{1}{a_i n}) = H(\tfrac{1}{bn}, \dots, \tfrac{1}{bn}) = H(\tfrac{1}{n}, \dots, \tfrac{1}{n}) + n \cdot (\tfrac{1}{n})^\alpha \cdot H(\tfrac{1}{b}, \dots, \tfrac{1}{b}).$
Using (15), we obtain
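Identity (15) can be spot-checked against the Tsallis entropy of uniform vectors; the following sketch and its helper names are ours:

```python
def tsallis_uniform(m, alpha):
    """Tsallis entropy of the uniform vector (1/m, ..., 1/m)."""
    return (1 - m * (1 / m)**alpha) / (alpha - 1)

def ratio(m, n, alpha):
    """The factor on the right-hand side of (15)."""
    return (1 - (1 / m)**(alpha - 1)) / (1 - (1 / n)**(alpha - 1))

m, n, alpha = 6, 4, 2.5
print(tsallis_uniform(m, alpha))
print(tsallis_uniform(n, alpha) * ratio(m, n, alpha))  # same value
```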
Let us finally compare (ii) and (iii) in Theorem 1 and ask about the role of (c), (d) and (e) of (ii) in (iii). Symmetry is already given by Lemma 2 when only (S4) is satisfied; moreover, (S4) and nonnegativity are not sufficient for characterizing Shannon entropy, as shown in [17]. To our knowledge, there is no proof that (S4) and continuity are enough, but (S4) and analyticity are. For showing the latter, an argument reducing everything to the rationals, as above, has been used in [18].
We want to conclude with the open problem of whether the further assumptions for $\alpha = 2$ in Theorem 1 are necessary.
Problem 1.
Is (1) in Theorem 1 also valid for $α = 2$?

## Author Contributions

Sonja Jäckle provided most of the results and material, with the exception of the proof of Lemma 3 (ii). On the basis of this material, Karsten Keller wrote the paper.

## Conflicts of Interest

The authors declare no conflict of interest.

## References

1. Tsallis, C. Possible generalization of Boltzmann-Gibbs statistics. J. Stat. Phys. 1988, 52, 479–487.
2. Cartwright, J. Roll over, Boltzmann. Phys. World 2014, 27, 31–35.
3. Tsallis, C. Approach of complexity in nature: Entropic nonuniqueness. Axioms 2016, 5, 20.
4. Havrda, J.; Charvát, F. Quantification method of classification processes. Concept of structural α-entropy. Kybernetika 1967, 3, 30–35.
5. Amigó, J.M.; Keller, K.; Unakafova, V.A. On entropy, entropy-like quantities, and applications. Discrete Contin. Dyn. Syst. B 2015, 20, 3301–3343.
6. Guariglia, E. Entropy and fractal antennas. Entropy 2016, 18, 1–17.
7. Suyari, H. Generalization of Shannon-Khinchin axioms to nonextensive systems and the uniqueness theorem for the nonextensive entropy. IEEE Trans. Inf. Theory 2004, 50, 1783–1787.
8. Khinchin, A.I. Mathematical Foundations of Information Theory; Dover: New York, NY, USA, 1957.
9. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423, 623–656.
10. Dos Santos, R.J.V. Generalization of Shannon's theorem for Tsallis entropy. J. Math. Phys. 1997, 38, 4104–4107.
11. Abe, S. Tsallis entropy: How unique? Contin. Mech. Thermodyn. 2004, 16, 237–244.
12. Furuichi, S. On uniqueness theorems for Tsallis entropy and Tsallis relative entropy. IEEE Trans. Inf. Theory 2005, 51, 3638–3645.
13. Csiszár, I. Axiomatic characterizations of information measures. Entropy 2008, 10, 261–273.
14. Ilić, V.M.; Stanković, M.S.; Mulalić, E.H. Comments on "Generalization of Shannon-Khinchin axioms to nonextensive systems and the uniqueness theorem for the nonextensive entropy". IEEE Trans. Inf. Theory 2013, 59, 6950–6952.
15. Diderrich, G.T. The role of boundedness in characterizing Shannon entropy. Inf. Control 1975, 29, 140–161.
16. Faddeev, D.F. On the concept of entropy of a finite probability scheme. Uspehi Mat. Nauk 1956, 11, 227–231. (In Russian)
17. Daróczy, Z.; Maksa, D. Nonnegative information functions. Analytic function methods in probability theory. Colloq. Math. Soc. János Bolyai 1982, 21, 67–78.
18. Nambiar, K.K.; Varma, P.K.; Saroch, V. An axiomatic definition of Shannon's entropy. Appl. Math. Lett. 1992, 5, 45–46.
