
On a simple derivation of a family of nonextensive entropies from information content

Takuya Yamano
Department of Physics, Ochanomizu University, 2-1-1 Otsuka, Bunkyo-ku, Tokyo 112-8610, Japan
Entropy 2004, 6(4), 364-374; https://doi.org/10.3390/e6040364
Submission received: 1 June 2004 / Accepted: 31 July 2004 / Published: 1 August 2004

Abstract
The nonextensive entropy of Tsallis can be seen as a consequence of postulates on a self-information, namely the constant ratio of the first derivative of the self-information per unit probability to its curvature (second variation). This constancy holds if we regard the probability distribution as the gradient of the self-information. Considering the form of the nth derivative of the self-information while keeping this ratio constant, we arrive at a general class of nonextensive entropies. Some properties of the series of entropies constructed in this picture are investigated.

1 Introduction

Tsallis statistics [1] can be seen as a formalism based on a pair of deformed versions of the usual exponential and logarithmic functions [2,3]. These deformed functions are called the q-exponential and the q-logarithmic functions, respectively. The q-exponential function is defined as $e_q^x = [1 + (1-q)x]_+^{1/(1-q)}$, where $[x]_+ = \max\{x, 0\}$, and the q-logarithmic function is $\ln_q x = (x^{1-q} - 1)/(1-q)$, where $x \ge 0$. The derivation of the deformed (or generalized) form of the usual logarithmic entropy and the mathematical structure based on it have attracted much attention. Furthermore, studies on deriving probability distribution functions consistent with these generalized entropies from microscopic dynamics have been carried out [4,5], and some statistical mechanical foundations of the Tsallis distribution are summarized in Ref. [6] (and references therein). One question that arises concerns the origin of the nonextensive entropy, and its clarification is desirable for further study. Although the Tsallis entropy was devised in physics in a heuristic way motivated by multifractals [1], its use has spread to neighboring scientific fields such as information theory [7] and econophysics [8], to name a few. Therefore, it is important to understand how the entropy is connected to existing quantities. In this paper, we provide a partial answer to that question from the information theoretical point of view and present a general form of entropies.
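As a concrete illustration of these two deformed functions, the short Python sketch below (our own illustration; the names q_log and q_exp are not from the original text) checks numerically that they are mutually inverse and reduce to the ordinary logarithm and exponential as q → 1:

```python
import math

def q_log(x, q):
    """q-logarithm: ln_q(x) = (x**(1-q) - 1)/(1-q); tends to ln(x) as q -> 1."""
    if q == 1.0:
        return math.log(x)
    return (x**(1.0 - q) - 1.0) / (1.0 - q)

def q_exp(x, q):
    """q-exponential: e_q^x = [1 + (1-q)x]_+^{1/(1-q)}; tends to exp(x) as q -> 1."""
    if q == 1.0:
        return math.exp(x)
    base = max(1.0 + (1.0 - q) * x, 0.0)   # the [.]_+ cutoff
    return base ** (1.0 / (1.0 - q))

q = 1.3
print(q_exp(q_log(2.5, q), q))               # ~2.5: e_q and ln_q are inverse to each other
print(q_log(2.5, 1.000001), math.log(2.5))   # nearly equal: the q -> 1 limit
```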
Let us review the elicitation procedure of the Tsallis entropy. The Tsallis entropy can be derived from the following set of postulates and a notation on the self-information with the standard average:
Postulate 1: For p ∈ [0, 1], h(p) is a continuous function
Postulate 2: For two different probabilities p1 and p2, h(p1p2) = h(p1)+ h(p2)+ λh(p1)h(p2), where λ ≠ 0
Postulate 3 (Notation): h(0) = 1/(q − 1), where q ∈ ℝ.
Since λ ≠ 0, the expression in Postulate 2 can be rewritten in the factorized form $s(p_1 p_2) = s(p_1)s(p_2)$ with $s(p) = 1 + \lambda h(p)$. The trivial solution s(p) = 0, which gives a constant function h(p) (= −1/λ), is not appropriate, since it provides constant information irrespective of the associated probabilities. Therefore we may take the general solution as $s(p) = p^{q-1}$ [9], which leads to the form $h(p) = (p^{q-1} - 1)/\lambda$. From the Notation, we determine λ = 1 − q; hence the linear average of h(p) (i.e., $\sum_k p_k h(p_k)$) leads to the Tsallis entropy. Since the Notation fixes the value of λ, if we replace it with h(1/2) = 1, the resulting entropy becomes the type β generalization of the Shannon entropy by Havrda-Charvat [10] and Daroczy [11]. In this case the factor λ becomes $2^{1-\beta} - 1$, where β ≠ 1. The difference between the Tsallis entropy and the type β entropy stems from the arbitrary factor which fixes the form of the self-information. It should be noted that, with the nonadditive self-information $h(p) = (p^{q-1} - 1)/(1-q)$, we obtain finite information for q ≠ 1 even if the least probable event occurs [7], whereas we get infinite information with the ordinary logarithmic definition of the self-information (see the Appendix for the definition).
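The pseudo-additive composition of Postulate 2 and the resulting Tsallis form can be verified numerically. The sketch below is our own illustration (not part of the original derivation), using the self-information $h(p) = (p^{q-1}-1)/(1-q)$ and its linear average:

```python
def self_info(p, q):
    """Nonadditive self-information h(p) = (p**(q-1) - 1)/(1-q) = ln_q(1/p)."""
    return (p**(q - 1.0) - 1.0) / (1.0 - q)

def tsallis_entropy(probs, q):
    """Linear average of h over the distribution: sum_k p_k h(p_k)."""
    return sum(p * self_info(p, q) for p in probs if p > 0)

q = 1.5
lam = 1.0 - q
p1, p2 = 0.3, 0.6
# Postulate 2: h(p1*p2) = h(p1) + h(p2) + lambda*h(p1)*h(p2) with lambda = 1 - q
lhs = self_info(p1 * p2, q)
rhs = self_info(p1, q) + self_info(p2, q) + lam * self_info(p1, q) * self_info(p2, q)
print(abs(lhs - rhs) < 1e-12)                              # True

# The linear average reproduces the familiar closed form (1 - sum p^q)/(q - 1)
probs = [0.2, 0.3, 0.5]
closed = (1.0 - sum(p**q for p in probs)) / (q - 1.0)
print(abs(tsallis_entropy(probs, q) - closed) < 1e-12)     # True
```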
In view of the fact that alternative entropies and discussions based on them have recently been attracting much attention for their potential uses and applications [12,13,14], our motivation in this study is to investigate a certain type of hierarchical structure embedded in the generalized entropies, starting with a self-information that characterizes the associated entropy. This paper presents a procedure for obtaining a series of generalized classes of (nonextensive) entropies, which include some existing entropies as special cases.
In the next section, we give an elementary prescription for obtaining a series of nonextensive entropies including the Tsallis one. In Section 3, we briefly discuss properties of the basic ingredients of the nonextensive entropies. We present our conclusions in the last section.

2 A method to construct a generalized class of entropies

Information content is a basic quantity for measuring the information of systems considered in information theory. This notion is used to construct entropies in our study. As a starting point, let us consider a probability distribution under an α-rescaling transformation defined on ℝ, where α (α ≠ 1) is a positive real number. The transformed probability distribution $p_\alpha(x)$ is defined as $p_\alpha(x) = \alpha p(x/\alpha)$. In general, the value of α determines the broadness and the size of the distribution: when α < 1, $p_\alpha(x)$ shrinks, whereas when α > 1 it enlarges; however, its shape is maintained. In the present consideration, we focus on a special case of this rescaling in which the probability function is invariant under the transformation:
$$\alpha\, p\!\left(\frac{x}{\alpha}\right) = p(x). \qquad (1)$$
This relation expresses that the probability distribution is a homogeneous function of degree one. It can be shown that, when the probability distribution is continuous on [0, 1], the ratio p(x)/x is constant over this range. For positive integers $n_i$ (i = 1, …, 4) satisfying $n_1/n_2\,(= r_1),\ n_3/n_4\,(= r_2) \in [0, 1]$, repeated use of Eq. (1) yields
$$p(n_1/n_2) = n_1\, p\!\left(\tfrac{1}{n_1}\cdot\tfrac{n_1}{n_2}\right) = n_1\, p(1/n_2) = n_1 n_4\, p\!\left(\tfrac{1}{n_2}\cdot\tfrac{1}{n_4}\right) = n_1 n_4\, p\!\left((n_2 n_3)^{-1}\cdot\tfrac{n_3}{n_4}\right) = \tfrac{n_1}{n_2}\cdot\tfrac{n_4}{n_3}\, p(n_3/n_4),$$
which leads to $p(r_1)/r_1 = p(r_2)/r_2$. Therefore, the ratio p(x)/x takes a constant value. This fact constitutes a crucial basis for the following consideration. We now show that a constant ratio of the change of the self-information per unit probability to its curvature (second derivative) leads to the Tsallis entropy. For this purpose, we add the following postulate:
Postulate 4: The ratio $\dfrac{h'(x)/x}{h''(x)}$ is invariant under the transformation Eq. (1).
This postulate implies $(h'(x)/x)/h''(x) = \mathrm{const.}$, whose general solution can be expressed as $h'(x) = ax^b$ (a, b ∈ ℝ). Then, choosing a = −1 and b = q − 2, and setting the integration constant to 1/(q − 1), we obtain the form $h(x) = \ln_q(x^{-1})$. This form leads to the Tsallis entropy if we use the standard mean value $\int x\,h(x)\,dx$ (if we employ the escort average [15], we have the normalized Tsallis entropy [7,16,17]). We immediately generalize the above procedure to the nth derivative of the self-information. That is,
Postulate 4': The ratio $\dfrac{h^{(n)}(x)/x}{h^{(n+1)}(x)}$ is invariant under the transformation Eq. (1), where h(x) is a $C^{n+1}$ class function.
The general solution of the above expression can be put as $h^{(n)}(x) = ax^{q-(n+1)} + b$. Therefore, we obtain the solutions:
$$h(x) = \begin{cases} \displaystyle\sum_{k=1}^{n+1} \frac{c_{k-1}\,x^{k-1}}{(k-1)!}, \quad c_n = a + b & (\text{if } q = n+1)\\[2ex] \displaystyle\frac{a\,x^{q-1}}{C(q;n)} + \sum_{k=1}^{n+1} \frac{c_{k-1}\,x^{k-1}}{(k-1)!}, \quad c_n = b & (\text{if } q \neq n+1 \text{ and } q \neq n)\\[2ex] \displaystyle\frac{a\,x^{n-1}}{(n-1)!}\ln x + \sum_{k=1}^{n+1} \frac{c_{k-1}\,x^{k-1}}{(k-1)!}, \quad c_n = b & (\text{if } q = n), \end{cases}$$
where $C(q;n) = (q-1)(q-2)\cdots(q-n)$ and $c_0, \ldots, c_n$ are integration constants. We find that the Shannon entropy (the self-information by Hartley [18]) is recovered when n = 1, a = −1, b = 1 and $c_k = 0$, ∀k, in the third line of the above expression. Since the form for the case $q \neq n+1$ and $q \neq n$ in the second line can be considered as a generic self-information, we shall consider its properties in the next section. The values of a and the $c_k$'s are constrained by the features of h(x) in addition to the sign of C(q;n). Using the standard definition of the mean value, the resulting average of the self-information (i.e., the entropy) can be expressed as follows:
$$\frac{a}{C(q;n)}\sum_i p_i^q + \sum_i\sum_{k=1}^{n+1}\frac{c_{k-1}\,p_i^k}{(k-1)!}, \qquad (n \ge 1).$$
We denote this entropy by $H_q^n(p;\{c_k\})$, where the set of probabilities is denoted simply by p. Some existing entropies can be expressed as particular cases of $H_q^n(p;\{c_k\})$:
When a = −1:
(1) the Rényi entropy of order α, $\ln\!\big(\sum_i p_i^\alpha\big)/(1-\alpha)$, is expressed as $\ln\!\big[(1-\alpha)H_\alpha^1(p;0,0)\big]/(1-\alpha)$;
(2) $H_q^1\!\big(p;\tfrac{1}{q-1},0\big)$ is equivalent to the Tsallis entropy $\big(\sum_i p_i^q - 1\big)/(1-q)$;
(3) the quadratic information $1 - \sum_i p_i^2$ (e.g., [19]) is given by $H_2^1(p;1,0)$;
(4) the type β entropy is $H_\beta^1\!\big(p;\tfrac{1}{1-2^{1-\beta}},0\big)$.
If we allow a to take other values, we can express further entropies with $H_q^n(p;\{c_k\})$. For example, the cubic information $1 - \sum_i p_i^3$ [20] can be represented by $H_3^2(p;1,0,1)$ with a = −1, by $H_3^2(p;1,0,-2)$ with a = 0, or by $H_3^1(p;1,0,0)$ with a = −2.
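Under our reading of the averaged expression above, several of these special cases can be checked numerically. The sketch below is illustrative only, with a straightforward implementation of $H_q^n(p;\{c_k\})$ of our own; it reproduces the Rényi relation, the Tsallis entropy and the quadratic information:

```python
import math

def C(q, n):
    """C(q; n) = (q-1)(q-2)...(q-n)."""
    prod = 1.0
    for j in range(1, n + 1):
        prod *= (q - j)
    return prod

def H(probs, q, c, a=-1.0):
    """Generalized entropy H_q^n(p; {c_k}) with n = len(c) - 1."""
    n = len(c) - 1
    power_term = (a / C(q, n)) * sum(p**q for p in probs)
    poly_term = sum(c[k - 1] * p**k / math.factorial(k - 1)
                    for p in probs for k in range(1, n + 2))
    return power_term + poly_term

probs = [0.1, 0.2, 0.3, 0.4]
q = 1.8

# (2) Tsallis entropy: H_q^1(p; 1/(q-1), 0)
tsallis = (sum(p**q for p in probs) - 1.0) / (1.0 - q)
print(abs(H(probs, q, [1.0 / (q - 1.0), 0.0]) - tsallis) < 1e-12)                  # True

# (3) quadratic information 1 - sum p^2 is H_2^1(p; 1, 0)
print(abs(H(probs, 2.0, [1.0, 0.0]) - (1.0 - sum(p**2 for p in probs))) < 1e-12)   # True

# (1) Renyi entropy of order alpha via ln[(1-alpha) H_alpha^1(p; 0, 0)]/(1-alpha)
alpha = 1.8
renyi = math.log(sum(p**alpha for p in probs)) / (1.0 - alpha)
recovered = math.log((1.0 - alpha) * H(probs, alpha, [0.0, 0.0])) / (1.0 - alpha)
print(abs(recovered - renyi) < 1e-12)                                              # True
```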
In the subsequent discussion, we set a = −1, taking into account that the conventional entropy is recovered in this case. If we put $a = -k_B$, where $k_B$ is the Boltzmann constant, the quantity can be regarded as an entropy in statistical mechanics; however, recall that we usually put $k_B = 1$ to make the entropy dimensionless.
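Returning to Postulates 4 and 4', the constancy of the ratio can also be seen numerically for the n = 1 solution $h(x) = \ln_q(x^{-1})$. In the finite-difference check below (our own illustration), the ratio $(h'(x)/x)/h''(x)$ comes out independent of x and equal to $1/(q-2)$, as follows from $h'(x) = -x^{q-2}$:

```python
def q_log(x, q):
    return (x**(1.0 - q) - 1.0) / (1.0 - q)

def h(x, q):
    """Self-information h(x) = ln_q(1/x) of the n = 1 case."""
    return q_log(1.0 / x, q)

def ratio(x, q, eps=1e-4):
    """(h'(x)/x) / h''(x) estimated with central finite differences."""
    d1 = (h(x + eps, q) - h(x - eps, q)) / (2.0 * eps)
    d2 = (h(x + eps, q) - 2.0 * h(x, q) + h(x - eps, q)) / eps**2
    return (d1 / x) / d2

q = 1.7
print([round(ratio(x, q), 4) for x in (0.1, 0.3, 0.5, 0.9)])
# all ~ 1/(q - 2) = -3.3333, i.e. independent of x as Postulate 4 requires
```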

3 Some general properties of $H_q^n(p;\{c_k\})$

There are many properties characterizing an entropy. Among these, we shall investigate ingredients that clarify the relation to the physical entropy, which is expected to possess concavity [21], for instance. Our approach is to determine, or to give constraints on, the parameters contained in $H_q^n(p;\{c_k\})$ so that the generalized entropies can possess the fundamental features below.
Certainty and monotonicity with respect to the number of states:
In order to guarantee the certainty condition $H_q^n(1,0,\ldots,0;\{c_k\}) = 0$, we need the relation $C(q;n)^{-1} = \sum_{k=1}^{n+1} c_{k-1}/(k-1)!$ (a = −1). This provides a strong constraint on the values of the $c_k$'s and q for a given order n. Whether or not the entropy at the equiprobability $p_i = w^{-1}$ is an increasing function of w depends, in general, in a complicated way on the range of q and the values of the $c_k$'s. As a specific example, let us mention the case n = 1. The behavior of $H_q^1(w^{-1};c_0,c_1)$ is shown in Figure 1, where the relation that ensures the certainty condition described above has been used. It increases monotonically with respect to w when 1 < q < 2 for all values of $c_0$. For other regions of q, $H_q^1(w^{-1};c_0,c_1)$ monotonically decreases as w grows. For a specific choice of the integration constants, i.e., when $c_k = 1$ ∀k, $dH_q^n(w^{-1};1,\ldots,1)/dw$ is written as $w^{1-q}/C(q;n) + \sum_{k=1}^{n} w^{-k}/k!$ (a = −1) for $q \neq n$. This means that $H_q^n(w^{-1};1,\ldots,1)$ can be an increasing function of w (when the values of n and q are chosen appropriately), which is one of the desirable behaviors for an entropy at the equiprobability.
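A small numerical illustration of the certainty and monotonicity discussion for n = 1 is sketched below (our own reduction: for a = −1 and equiprobability $p_i = w^{-1}$ the entropy becomes $-w^{1-q}/(q-1) + c_0 + c_1/w$, and the certainty condition fixes $c_0 = 1/(q-1) - c_1$; the parameter values are our own choice with 1 < q < 2). The values vanish at w = 1 and grow with w:

```python
def H_equiprob(w, q, c1):
    """H_q^1 at equiprobability p_i = 1/w, a = -1, with c0 fixed by the certainty condition."""
    c0 = 1.0 / (q - 1.0) - c1
    return -w**(1.0 - q) / (q - 1.0) + c0 + c1 / w

q, c1 = 1.5, 0.5
print([round(H_equiprob(w, q, c1), 3) for w in range(1, 7)])
# [0.0, 0.336, 0.512, 0.625, 0.706, 0.767] -- zero for the certain case and growing with w
```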
Two-state system:
One simple examination of the behavior of the generalized entropy is for a two-state system (w = 2), where the probabilities are defined as $p_1 = p$ and $p_2 = 1 - p$, respectively. From the definition, a straightforward calculation yields
$$H_q^2(p;c_0,c_1) = \frac{1 - p^q - (1-p)^q}{(q-1)(q-2)} + 2c_1\,p(p-1),$$
where we set a = −1 and used the end-point condition (i.e., the entropy should vanish at p = 0 and p = 1, leading to the relation $c_0 + c_1 = [(q-1)(q-2)]^{-1}$). Figure 2 shows the shape of $H_q^2(p;c_0,c_1)$ for different values of q, where the end-point condition and concavity are both satisfied. Note that the concavity of the entropy holds only in the parameter range $c_1 < \frac{q}{q-1}2^{1-q}$.
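The two-state expression can be probed numerically; in the sketch below (the parameter values q = 3 and $c_1 = 0.3$ are our own choice for illustration) the end-point condition and a concave shape are confirmed:

```python
def H_two_state(p, q, c1):
    """Two-state entropy above with a = -1 and c0 + c1 = 1/((q-1)(q-2))."""
    return (1.0 - p**q - (1.0 - p)**q) / ((q - 1.0) * (q - 2.0)) + 2.0 * c1 * p * (p - 1.0)

q, c1 = 3.0, 0.3
print(H_two_state(0.0, q, c1), H_two_state(1.0, q, c1))   # 0.0 0.0: end-point condition
# crude concavity probe: the curve lies above the chord at the midpoint
for p1, p2 in [(0.1, 0.5), (0.2, 0.9), (0.4, 0.6)]:
    mid = 0.5 * (p1 + p2)
    chord = 0.5 * (H_two_state(p1, q, c1) + H_two_state(p2, q, c1))
    print(H_two_state(mid, q, c1) >= chord)                # True for each pair
```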
Expansibility:
$H_q^n(p;\{c_k\})$ is expansible for all q and n, i.e.,
$$H_q^n(p_1,\ldots,p_w;\{c_k\}) = H_q^n(p_1,\ldots,p_w,0;\{c_k\}),$$
where w denotes the number of states. Regarding the entropy for uniform probabilities and certainty, we shall add comments in the next section.
Transformation from discrete entropy to continuous one:
If we define a discrete probability as $p_i = p(\bar{x}_i)\Delta x$ (∀i, i = 1, …, m), where $\bar{x}_i \in [x_{i-1}, x_i]$, the bins have length $\Delta x = x_i - x_{i-1}$ with $x_0$ and $x_m$ being the end-points of the support, and p(x) is continuous within the bins, then the corresponding discrete entropy can be expressed as
$$H_q^n(p;\{c_k\}) = \sum_{i=1}^{m}\frac{\Delta x\, p^q(\bar{x}_i)}{C(q;n)}(\Delta x)^{q-1} + \sum_{i=1}^{m}\sum_{k=1}^{n+1}\frac{c_{k-1}}{(k-1)!}\,\Delta x\, p^k(\bar{x}_i)\,(\Delta x)^{k-1}.$$
If we put $\Delta x = 1$ as explained in Ref. [22], the first and second terms approach the integrals of $p^q(x)/C(q;n)$ and $\sum_{k=1}^{n+1} c_{k-1}\,p^k(x)/(k-1)!$, respectively, by Riemann integrability.

4 Concluding remarks

It is noteworthy that, for a general form of entropy such as $\sum_i f(p_i)$ (in our case $f(p_i) = p_i h(p_i)$), the additivity of the entropy holds only when the relation $[p_i f''(p_i)]' = 0$ is satisfied. This relation straightforwardly leads to the Shannon form $f(p_i) = -p_i\ln p_i$ [23]. In this sense, the family of generalized entropies constructed in this paper can be considered intrinsically nonadditive.
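This nonadditivity is easy to exhibit numerically: for an independent (product) distribution the Shannon entropy is strictly additive, whereas the Tsallis entropy obeys the pseudo-additive rule $S_q(A,B) = S_q(A) + S_q(B) + (1-q)S_q(A)S_q(B)$. The sketch below is our own check of these standard identities:

```python
import math

def shannon(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)

def tsallis(probs, q):
    return (1.0 - sum(p**q for p in probs)) / (q - 1.0)

pA = [0.2, 0.8]
pB = [0.5, 0.3, 0.2]
joint = [a * b for a in pA for b in pB]        # independent (product) distribution

# Shannon: strictly additive
print(abs(shannon(joint) - (shannon(pA) + shannon(pB))) < 1e-12)   # True
# Tsallis (q != 1): pseudo-additive, S(A,B) = S(A) + S(B) + (1-q) S(A) S(B)
q = 1.5
lhs = tsallis(joint, q)
rhs = tsallis(pA, q) + tsallis(pB, q) + (1.0 - q) * tsallis(pA, q) * tsallis(pB, q)
print(abs(lhs - rhs) < 1e-12)                                      # True
```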
Since the generalized form of entropy presented in this study contains a number of parameters (integration constants) depending on the order n, properties satisfied by known generalized entropies can be used as constraints on the parameters. Among the preferable properties that an entropy should possess, the behavior at equiprobability and the certainty condition have been mentioned. As for obtaining an analytical expression for the equilibrium distribution $p_i$ by the Lagrange multiplier method, it seems difficult to do so for the general form. This is mainly because the entropy functional constrained (by energies), which should be solved, becomes a polynomial in $p_i$.
In this paper, we have presented an elementary procedure to obtain a series of nonextensive entropies from postulates on the self-information. As a main ingredient, the ratio $(h^{(n)}(x)/x)/h^{(n+1)}(x)$ has been assumed to be constant. Our formulation shows that the Tsallis nonextensive entropy can be viewed as a special realization within a large class of generalized entropies. In fact, this view is not a new perspective; some authors seem to have shared it in developing generalized entropies [14,24,25]. We believe that the discussion presented in this study serves to clarify the structure of entropies and to give deeper insight into a number of (generalized) entropies, rather than starting from a heuristic definition. We note that applications of these entropies to systems both in statistical mechanics and in information theory are potentially fruitful, but for further studies we need to know what aspect of the system relates to the nonextensivity parameters.

Acknowledgements

The author acknowledges the Research Fellowships from JSPS and support in part from Grant-in-Aid for Scientific Research No. 06632 from the Ministry of Education, Science, Sports and Culture of Japan.

Appendix

A rudiment of information theory is how to define the self-information. Let Ω be a certain set or phase space: $\Omega = \{\omega_1, \omega_2, \ldots\}$. The phase space consists of an ensemble of elementary events (microstates in the language of statistical mechanics). We consider here the discrete case, which enables us to take the probability space $(\Omega, \mathcal{F}, P)$, where $\mathcal{F}$ is a σ-algebra over Ω and P is a probability measure $P: \mathcal{F} \to [0, 1]$ given by $P(F) = \sum_{\omega_i \in F} p_i$, where $p_i = p(\{\omega_i\})$.

Definition

A real-valued function I defined on $\mathcal{F}$ is a self-information, or information content, on $\mathcal{F}$ if the axioms below are satisfied:
(A1)
The impossible event contains finite information θ(q), determined by the system's intrinsic parameter q, whereas the certain event does not contain any information:
I(φ) = θ(q), I(Ω) = 0.
(A2)
For any two events $\omega_i$ and $\omega_j$ belonging to $\mathcal{F}$ such that $\omega_j \subset \omega_i$, we have
$I(\omega_i) < I(\omega_j).$

Proposition

Let $I(\omega_i)$ be an information content on $\mathcal{F}$. If the function f satisfies the conditions
$f^{-1}: \mathbb{R}^+ \to [0, 1], \qquad f^{-1}(0) = 1, \qquad f^{-1}(\theta(q)) = 0,$
then the set function
$$\mu_f^{(q)}(\omega_i) = f^{-1}(I(\omega_i)) \qquad (q \in \mathbb{R})$$
for every event $\omega_i \in \mathcal{F}$ is a probability measure on $\mathcal{F}$.
From the two axioms we conclude $I(\omega_i) \ge 0$ (∀$\omega_i$). Then we immediately have the following statement on the self-information:
If the information independence of events belonging to $\mathcal{F}$ does not coincide with the probability independence, in the sense that $I(\omega_i\omega_j) = I(\omega_i) + I(\omega_j) + (q-1)I(\omega_i)I(\omega_j)$ (additivity violation), then we have $f(x) = -k\ln_q x$ ($k > 0$). Hence the probability measure is given as $\mu_f^{(q)}(\omega_i) = e_q^{-I(\omega_i)/k}$, which is equivalent to
$$I(\omega_i) = -k\ln_q \mu_f^{(q)}(\omega_i).$$
Proof
By the condition of the above statement, for any two events ωi and ωj we have
$$\mu_f^{(q)}(\omega_i\omega_j) = \mu_f^{(q)}(\omega_i)\,\mu_f^{(q)}(\omega_j)$$
only when $I(\omega_i\omega_j) = I(\omega_i) + I(\omega_j) + (q-1)I(\omega_i)I(\omega_j)$. Let us put $x_i = I(\omega_i)$ and $x_j = I(\omega_j)$.
Then
$$f^{-1}\big(x_i + x_j + (q-1)x_i x_j\big) = f^{-1}(I(\omega_i\omega_j)) = \mu_f^{(q)}(\omega_i\omega_j) = f^{-1}(I(\omega_i))\,f^{-1}(I(\omega_j)).$$
The functional equation obtained is $f^{-1}(x_i + x_j + (q-1)x_i x_j) = f^{-1}(x_i)f^{-1}(x_j)$. When we put $\varphi(x) = \ln_q f^{-1}(x)$, the functional equation to be satisfied is
$$\varphi(x_i + x_j + (q-1)x_i x_j) = \varphi(x_i) + \varphi(x_j) + (q-1)\varphi(x_i)\varphi(x_j),$$
whose solution is $\varphi(x) = kx$. Therefore we obtain $f^{-1}(x) = e_q^{kx}$. When we replace the arbitrary constant k with $-1/k$, the functional form $f(x) = -k\ln_q x$ ($k > 0$) is concluded. ☐
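The step from the multiplicative equation to the equation for φ rests on the pseudo-additivity of the deformed pair, $\ln_q(xy) = \ln_q x + \ln_q y + (1-q)\ln_q x \ln_q y$ and, dually, $e_q^{a}\,e_q^{b} = e_q^{\,a+b+(1-q)ab}$. A quick numerical confirmation of these standard identities (our own illustration) is:

```python
def q_log(x, q):
    return (x**(1.0 - q) - 1.0) / (1.0 - q)

def q_exp(x, q):
    return max(1.0 + (1.0 - q) * x, 0.0) ** (1.0 / (1.0 - q))

q, x, y = 1.4, 0.3, 0.8
# pseudo-additivity of ln_q: ln_q(xy) = ln_q x + ln_q y + (1-q) ln_q x ln_q y
lhs = q_log(x * y, q)
rhs = q_log(x, q) + q_log(y, q) + (1.0 - q) * q_log(x, q) * q_log(y, q)
print(abs(lhs - rhs) < 1e-12)   # True
# the dual statement for e_q: e_q^a * e_q^b = e_q^(a + b + (1-q) a b)
a, b = 0.2, -0.4
print(abs(q_exp(a, q) * q_exp(b, q) - q_exp(a + b + (1.0 - q) * a * b, q)) < 1e-12)  # True
```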

References

  1. Tsallis, C. J. Stat. Phys. 1988, 52, 479.
  2. Naudts, J. Physica A 2002, 316, 323.
  3. Tsallis, C. Quimica Nova 1994, 17, 468; Yamano, T. Physica A 2002, 305, 486.
  4. Beck, C. Phys. Rev. Lett. 2001, 87, 180601.
  5. Sattin, J. J. Phys. A 2003, 36, 1583.
  6. Rajagopal, A.K.; Abe, S. Statistical mechanical foundations of power-law distributions. eprint, [cond-mat/0303064].
  7. Yamano, T. Phys. Rev. E 2001, 63, 46105; Yamano, T. Physica A 2002, 305, 190.
  8. Tsallis, C.; Anteneodo, C.; Borland, L.; Osorio, R. Physica A 2003, 324, 89.
  9. Hardy, G.H.; Littlewood, J.E.; Polya, G. Inequalities; Cambridge University Press, 1973.
  10. Havrda, J.; Charvat, F. Kybernetika 1967, 3, 30.
  11. Daroczy, Z. Inf. and Control 1970, 16, 36.
  12. Kaniadakis, G. Phys. Rev. E 2002, 66, 056125.
  13. Anteneodo, C.; Plastino, A.R. J. Phys. A 1999, 32, 1089.
  14. Papa, A.R.R. J. Phys. A 1998, 31, 5271.
  15. Beck, C.; Schlögl, F. Thermodynamics of Chaotic Systems; Cambridge University Press: Cambridge, 1993.
  16. Landsberg, P.T.; Vedral, V. Phys. Lett. A 1998, 247, 211.
  17. Rajagopal, A.K.; Abe, S. Phys. Rev. Lett. 1999, 83, 1711.
  18. Hartley, R.V.L. Bell Syst. Tech. J. 1928, 7, 535.
  19. Watanabe, S. Knowing and Guessing, a Quantitative Study of Inference and Information; Wiley: New York, 1969.
  20. Chen, C.H. Inform. Sci. 1976, 10, 159.
  21. The concavity of the present family of entropies is highly parameter dependent. The function $f(x) = -x^q/C(q;n) + \sum_{k=1}^{n+1} c_{k-1}x^k/(k-1)!$ can be convex for $c_k > 0$, ∀k, if n satisfies C(q;n) < 0 (> 0) when q > 1 (q < 1). Then, for every $p_i, p'_i \in [0, 1]$ and $\lambda_1, \lambda_2$ ($\lambda_1 + \lambda_2 = 1$, $\lambda_1 > 0$, $\lambda_2 > 0$), the difference between the entropy for the intermediate probability $p''_i$ ($p''_i = \lambda_1 p_i + \lambda_2 p'_i$) and the sum of the weighted entropies for the two end-point probabilities p and p', which can be calculated as $\Delta_n^q = H_q^n(p'';\{c_k\}) - \lambda_1 H_q^n(p;\{c_k\}) - \lambda_2 H_q^n(p';\{c_k\})$, can be shown to be positive ($\Delta_n^q \ge 0$) with Jensen's inequality. The equality holds if $c_k = 0$ for all k except k = 1.
  22. Van der Lubbe, J.C.A. Information Theory; Cambridge University Press, 1997.
  23. Rossignoli, R.; Canosa, N. Phys. Lett. A 1999, 264, 148.
  24. Sharma, B.D.; Mittal, D.P. J. Math. Sci. 1975, 10, 28.
  25. Van der Lubbe, J.C.A.; Boxma, Y.; Boekee, D.E. Inform. Sci. 1984, 32, 187.
Figure 1. Entropy for a two-state system $H_q^2(p;c_0,c_1)$ as a function of p for different values of q with $c_1 = 0.7$. Note that the value of $c_0$ is determined from the values of q and $c_1$ such that $(c_0 + c_1)^{-1} = (q-1)(q-2)$, due to the requirements $H_q^2(0;c_0,c_1) = 0$ and $H_q^2(1;c_0,c_1) = 0$. From the value of the second derivative of the entropy at p = 1/2, it is concave when $c_1 < \frac{q}{q-1}2^{1-q}$.
Figure 2. $H_q^2(w^{-1};c_0,c_1)$ as a function of w for different values of q when $c_0 = 1.3$. Note that the value of $c_1$ is determined from the values of q and $c_0$.
