Abstract
Conditions are highlighted for generalized entropies to allow for non-trivial time-averaged entropy rates for a large class of random sequences, including Markov chains and continued fractions. The conditions, free of axiomatic requirements, arise from the behavior of the marginal entropy of the sequence. Apart from the well-known Shannon and Rényi cases, only logarithmic versions of Sharma–Taneja–Mittal entropies may fulfill these conditions. Their main properties are detailed.
1. Introduction
Quantifying the information, or uncertainty, of a sequence of random variables has been considered since the foundation of information theory in [1], where the entropy rate of a random sequence $\mathbf{X} = (X_n)_{n\geq 1}$ is defined as the limit $\lim_{n\to\infty} \frac{1}{n} S(X_1,\dots,X_n)$, with $S(X_1,\dots,X_n)$ denoting the Shannon entropy of the n first coordinates of $\mathbf{X}$, for n a positive integer. Shannon's original concept of entropy naturally generalizes to various entropies, introduced in the literature for a better fit to certain complex systems through proper axiomatic requirements. Most of the classical examples given in Table 1 (first column) belong to the class of (h,φ)-entropies
$$ S_{h,\varphi}(X) = h\Big(\sum_{x\in E} \varphi(m(x))\Big), \qquad (1) $$
where h and φ are real functions satisfying compatibility assumptions and m is the distribution of a discrete random variable X on E; see Section 2 below and also [2,3]. When h is the identity, the corresponding (h,φ)-entropy is a φ-entropy, denoted by $S_\varphi$.
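As a concrete illustration of (1), here is a minimal numerical sketch in Python; the function names and the normalizations chosen for the classical special cases are ours, not notation from the paper:

```python
import numpy as np

def h_phi_entropy(m, h, phi):
    """(h,phi)-entropy of a discrete distribution m: h applied to sum_x phi(m(x))."""
    m = np.asarray(m, dtype=float)
    m = m[m > 0]  # null masses contribute phi(0) = 0 for the usual choices of phi
    return h(np.sum(phi(m)))

# Classical special cases (standard normalizations):
shannon = lambda m: h_phi_entropy(m, h=lambda y: y, phi=lambda x: -x * np.log(x))
renyi = lambda m, s: h_phi_entropy(m, h=lambda y: np.log(y) / (1 - s), phi=lambda x: x ** s)
tsallis = lambda m, q: h_phi_entropy(m, h=lambda y: y, phi=lambda x: (x - x ** q) / (q - 1))

m = [0.5, 0.25, 0.25]
print(shannon(m), renyi(m, 2.0), tsallis(m, 2.0))
```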
Ref. [1] established the existence and a closed-form expression for the entropy rate of homogeneous ergodic Markov chains. This relies on the chain rule (strong additivity property) of the Shannon entropy, $S(X_1,\dots,X_n) = S(X_1,\dots,X_{n-1}) + S(X_n \mid X_1,\dots,X_{n-1})$, where $S(X_n \mid X_1,\dots,X_{n-1})$ denotes the Shannon entropy of the distribution of $X_n$ conditional to $(X_1,\dots,X_{n-1})$. This chain rule implies that, for ergodic Markov chains, the sequence of marginal entropies grows (asymptotically) linearly with n.
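To make the closed form concrete, the following sketch (our helper, assuming an ergodic chain so that the stationary distribution is unique) computes the Shannon entropy rate $-\sum_i \pi_i \sum_j P_{ij}\log P_{ij}$ from the transition matrix P:

```python
import numpy as np

def shannon_entropy_rate(P):
    """Closed-form Shannon entropy rate -sum_i pi_i sum_j P_ij log P_ij
    of an ergodic Markov chain with transition matrix P."""
    P = np.asarray(P, dtype=float)
    # stationary distribution pi: left eigenvector of P for eigenvalue 1
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    pi = pi / pi.sum()
    logP = np.zeros_like(P)
    mask = P > 0
    logP[mask] = np.log(P[mask])
    return -np.sum(pi[:, None] * P * logP)

print(shannon_entropy_rate([[0.9, 0.1], [0.4, 0.6]]))
```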
This type of linearity appears as an interesting property for a generalized entropy to measure, in a meaningful way, the information carried by a random process: all the variables contribute equally to the global information of the sequence. Below, an (h,φ)-entropy will be said to be linear for a random sequence $\mathbf{X} = (X_n)_{n\geq 1}$ if the marginal entropy sequence $(S_{h,\varphi}(X_1^n))_n$ grows linearly with n, where $X_1^n = (X_1,\dots,X_n)$. Furthermore, Ciuperca et al. [4] show that linearity is required for obtaining a non-trivial time-averaged (h,φ)-entropy rate
$$ \lim_{n\to\infty} \frac{1}{n}\, S_{h,\varphi}(X_1^n). \qquad (2) $$
Self-evidently, the Shannon entropy is linear for independent and identically distributed (i.i.d.) sequences and for most ergodic Markov chains. A natural attempt to prove the linearity of other entropies consists in investigating some additive functional identity replacing the chain rule. Regnault et al. [5] establish the linearity of Rényi entropies for homogeneous ergodic Markov chains, based on an additive identity involving the s-escort distributions of the marginals, where the s-escort distribution of a variable X with distribution m is $m^{*s}(x) = m(x)^s / \sum_{y\in E} m(y)^s$; explicit weighted expressions for the Rényi entropy rates follow. Most likely, such an approach cannot be systematically generalized to other (h,φ)-entropies, by lack of a proper additive functional identity.
Ciuperca et al. [4] successfully explore an axiom-free alternative approach, originating in analytic combinatorics. It consists of deriving the asymptotic behavior of $S_{h,\varphi}(X_1^n)$ from the behavior of the associated Dirichlet series
$$ \Lambda_n(s) = \sum_{x_1^n \in E^n} \mathbb{P}(X_1^n = x_1^n)^s, \qquad (3) $$
where $s > 0$ and $\mathbb{P}(X_1^n \in \cdot)$ is the distribution of $X_1^n$. A random sequence $\mathbf{X}$ is a quasi-power (QP) sequence if $\max_{x_1^n \in E^n} \mathbb{P}(X_1^n = x_1^n)$ converges to zero as n tends to infinity, and if a real number $\sigma_0 < 1$ and strictly positive analytic functions c and λ exist, with λ strictly decreasing and $\lambda(1) = 1$, such that, for all real numbers $s > \sigma_0$, some $\rho(s) < 1$ exists such that
$$ \Lambda_n(s) = c(s)\,\lambda(s)^n\,\big[1 + O(\rho(s)^n)\big] \qquad (4) $$
for all n; see [6]. For any (h,φ)-entropy that is linear for QP sequences, the entropy rate (2) is proven to reduce to either the Shannon or the Rényi entropy rate, up to a constant factor. The large class of QP sequences includes i.i.d. sequences, most ergodic Markov chains with atomic state spaces, and more complex ergodic dynamical systems, such as random continued fractions.
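For an i.i.d. sequence, the QP property holds with $c(s) = 1$ and $\lambda(s) = \sum_{x} m(x)^s$, since independence factorizes the Dirichlet series; a short sketch (our naming):

```python
import numpy as np

def dirichlet_series_iid(m, n, s):
    """Lambda_n(s) = sum over all words x_1..x_n of P(X_1^n = x_1^n)^s; for an
    i.i.d. sequence it factorizes exactly as (sum_x m(x)^s) ** n."""
    return float(np.sum(np.asarray(m, dtype=float) ** s)) ** n

m = [0.5, 0.3, 0.2]
for s in (0.5, 1.0, 2.0):
    lam = sum(p ** s for p in m)  # lambda(s): equals 1 at s = 1, decreases in s
    print(s, lam, dirichlet_series_iid(m, 10, s))
```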
The present paper identifies only three types of (h,φ)-entropies that are linear, up to obvious equivalences, for QP sequences: the Shannon and Rényi entropies, plus a third one, the logarithm of the Taneja entropy, which we call the log-Taneja entropy. This result, valid for all QP sequences, extends [7], dedicated to Markov chains. As stated in Theorem 2, it relies on the asymptotic behavior of (h,φ)-entropies given in Theorem 1. In Section 4, we highlight a classification of (h,φ)-entropies into five exclusive classes, depending on the growth rate of the marginal entropies: linear, over-linear, sub-linear, contracting, and constant.
Further, a pertinent choice of the function h changes the class of the φ-entropy. A well-known example is provided by the linear Rényi entropy, which appears as the logarithm of the Tsallis entropy (up to constants), while the latter is either constant or over-linear, depending on the parameter. Another interesting case considered below is the Sharma–Taneja–Mittal (STM) entropy
$$ T_{s,r}(X) = \frac{1}{r-s}\Big(\sum_{x\in E} m(x)^s - \sum_{x\in E} m(x)^r\Big), \qquad (5) $$
where $s \neq r$ and s and r are two positive parameters; see [8]. Frank and Plastino [9] show that STM entropy is the only entropy that gives rise to a thermostatistics based on escort mean values, while [10] and the references therein highlight connections to statistical mechanics. The φ-entropy
$$ T_s(X) = -\sum_{x\in E} m(x)^s \log m(x), \qquad (6) $$
where $s > 0$, called Taneja entropy in the literature, appears as the limit of $T_{s,r}$ as r tends to s, with s fixed; Shannon entropy is obtained for $s = 1$. The logarithm transforms both of these contracting or over-linear entropies into linear ones. In other words, their associated entropy rates (2) will be non-trivial for QP sequences. Since, to our knowledge, the log-STM and log-Taneja entropies have never been considered in the literature, this paper ends by studying their main properties.
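A small numerical sketch of (5) and (6), under the normalizations written above (our function names); the last lines check that the STM entropy approaches the Taneja entropy as r tends to s, and evaluate the log-Taneja entropy introduced in Section 5:

```python
import numpy as np

def stm_entropy(m, s, r):
    """STM entropy (sum m^s - sum m^r) / (r - s), assuming the normalization
    of (5); positive e.g. for 0 < s < 1 < r."""
    m = np.asarray(m, dtype=float)
    return (np.sum(m ** s) - np.sum(m ** r)) / (r - s)

def taneja_entropy(m, s):
    """Taneja entropy -sum m^s log m of (6), the limit of STM as r -> s."""
    m = np.asarray(m, dtype=float)
    m = m[m > 0]
    return -np.sum(m ** s * np.log(m))

m = [0.5, 0.3, 0.2]
print(stm_entropy(m, 0.5, 1.5), taneja_entropy(m, 0.5))
print(stm_entropy(m, 0.5, 0.5001))     # close to the Taneja value
print(np.log(taneja_entropy(m, 0.5)))  # log-Taneja entropy of Definition 1
```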
2. Quasi-Power Log Entropies
Generally speaking, the (h,φ)-entropies defined by (1) are such that either φ is concave and h increases, or φ is convex and h decreases. Moreover, with h and φ properly normalized, $S_{h,\varphi}$ takes only non-negative values. For most functionals of interest in the literature, h and φ are locally equivalent to products of power and logarithmic functions, which led in [4] to the definition of the class of quasi-power log entropies.
A φ-entropy is quasi-power log (QPL) if real numbers $a \neq 0$, $s \geq 0$, and $t \geq 0$ exist such that φ is QPL at 0, in the sense that
$$ \varphi(x) \underset{x\to 0^+}{\sim} a\, x^s (-\log x)^t. \qquad (7) $$
Further, an (h,φ)-entropy is QPL if (7) is satisfied and real numbers $b \neq 0$, $s'$, and $t'$ exist such that h is QPL, of the analogous power-log form (8), at the limit point of its argument $\sum_{x}\varphi(m(x))$, where $\varepsilon(a)$ denotes the sign of a; see [11]. Of course, (8) makes sense only if h is well defined in a neighborhood of this limit point, inducing restrictions on the parameters that are detailed in [11]. Note that for $s' = 0$ and $t' = 0$, the (h,φ)-entropy collapses to a constant. Column 1 of Table 1 shows the most classical entropies in the literature that are QPL.
3. Asymptotic Behavior of the Marginal Entropies of QP Sequences
The asymptotic behaviors of the sequences of marginal QPL entropies of a random sequence are controlled by its Dirichlet series (3), as made explicit in the next theorem. This special case of [11] (Theorem 1), where the asymptotic behaviors of divergence functionals are addressed, deserves to be detailed for the entropies of QP sequences. Note that Markov chains are considered in [7] in order to obtain weighted rate formulas involving Perron–Frobenius eigenvalues and eigenvectors.
Theorem 1.
Let $\mathbf{X}$ be a QP sequence. For any QPL φ-entropy,
$$ S_\varphi(X_1^n) \underset{n\to\infty}{\sim} \mathcal{S}_\varphi(\mathbf{X})\; n^t\,\lambda(s)^n, \qquad (9) $$
with
$$ \mathcal{S}_\varphi(\mathbf{X}) = \lim_{n\to\infty} \frac{S_\varphi(X_1^n)}{n^t\,\lambda(s)^n} = a\, c(s) \Big(-\frac{\lambda'(s)}{\lambda(s)}\Big)^{t}. \qquad (10) $$
Further, for any QPL (h,φ)-entropy,
$$ S_{h,\varphi}(X_1^n) \underset{n\to\infty}{\sim} \mathcal{S}_{h,\varphi}(\mathbf{X})\; v_n, \qquad (11) $$
where
$$ \mathcal{S}_{h,\varphi}(\mathbf{X}) = \lim_{n\to\infty} \frac{S_{h,\varphi}(X_1^n)}{v_n} = b\, \big[\mathcal{S}_\varphi(\mathbf{X})\big]^{s'} \big|\log \lambda(s)\big|^{t'} \qquad (12) $$
and
$$ v_n = n^{\,t s' + t'}\,\lambda(s)^{\,n s'} $$
in the generic case $\lambda(s) \neq 1$; Table 2 details all cases.
Proof. The result follows by specializing [11] (Theorem 1), where the asymptotic behaviors of general divergence functionals of QP sequences are established, to QPL entropies; the parameters of the QPL properties (7) and (8) combine with the quasi-power behavior (4) of the Dirichlet series. □
In [7,11], the first quantities in (10) and (12) are called the rescaled φ-entropy and (h,φ)-entropy rates of $\mathbf{X}$. They constitute proper (non-degenerate) information measures associated with φ- or (h,φ)-entropies. The rescaled entropy rate depends on $\mathbf{X}$ only through the parameters of the QP property; see [7] for an interpretation of these parameters for Markov chains. Note that the averaging sequence $v_n$ is defined only up to asymptotic equivalence.
Table 1 presents, for the classical entropies, the limit values, the averaging sequences $v_n$, and the rescaled entropy rates, depending on the values of the parameters. Table 2 details the dependencies for (h,φ)-entropies.
Table 2.
Limit marginal (h,φ)-entropy, averaging sequence $v_n$, and rescaled entropy rate, according to the values of a, s, and t, and to the sign of a.
4. Classification of QPL (h,φ)-Entropies: Dependence on h
Classification of all QPL (h,φ)-entropies into five exclusive classes derives from Theorem 1, for QP sequences, according to the asymptotic behavior of the marginal entropy or, equivalently, to the type of rescaling sequence $v_n$. The important point is that this classification depends only on the functions φ and h (through the parameters $(a, s, t)$ and $(b, s', t')$ involved in the QPL properties (7) and (8)), and not on the specific dynamics of the QP sequence $\mathbf{X}$. In particular, the classification is valid for a wide class of atomic Markov chains, including finite ones, as shown in [7].
The classification of φ-entropies into four classes easily derives from either (9) or Table 2. Indeed, four types of behaviors are possible for the sequences of marginal entropies, according to the values of s and t. First, for $s = 1$ and $t = 1$ (Shannon entropy or equivalent), the marginal entropy increases linearly with n and, hence, $v_n = n$; it is the only linear QPL φ-entropy. Second, the QP property (4) requires λ to be strictly decreasing with $\lambda(1) = 1$. So, for $s < 1$, $\lambda(s) > 1$ and the marginal entropy explodes exponentially fast, up to the polynomial factor $n^t$, with rescaling sequences $\lambda(s)^n$ for $t = 0$ and $n\lambda(s)^n$ for $t = 1$; such an entropy is called over-linear or expanding. Third, for $s > 1$ (with either $t = 0$ or $t = 1$), $\lambda(s) < 1$, so both $\lambda(s)^n$ and the marginal entropy decrease (in absolute value) exponentially fast to 0; such an entropy is called contracting. Finally, for $s = 1$ and $t = 0$ (Tsallis entropy or equivalent), the marginal entropy converges to some finite value, and the rescaling sequence is constant.
The discussion easily extends to (h,φ)-entropies through the asymptotic behaviors (11) of the marginal entropies or, equivalently, through the rescaling sequence $v_n$ in (12). The following classification of (h,φ)-entropies ensues, shown in Table 3 according to the parameters; a numerical illustration for i.i.d. sequences is given after the list below. The last column of Table 1 shows the classes of the classical entropies.
- Contracting case: $v_n \to 0$. The marginal entropy decreases to 0, typically exponentially fast.
- Constant case: $v_n$ constant. The marginal entropy converges to a non-degenerate value.
- Sub-linear case: $v_n \to \infty$ and $v_n/n \to 0$. The marginal entropy increases slower than n, typically as $n^\gamma$ or $\log n$, with $0 < \gamma < 1$.
- Linear case: $v_n \sim C\,n$, with $C > 0$. The marginal entropy increases linearly with the number of variables.
- Over-linear (or expanding) case: $v_n/n \to \infty$. The marginal entropy increases faster than n, typically exponentially fast.
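As announced above, the sketch below illustrates four of the classes on an i.i.d. sequence, for which the marginal entropies have closed product forms (a simplification of the general QP setting; the naming is ours). The printed columns grow linearly (Shannon), stay bounded (Tsallis, q = 2), contract (Taneja, s = 2), and expand (Taneja, s = 0.5):

```python
import numpy as np

# Closed forms of marginal entropies for an i.i.d. sequence with marginal m;
# the general QP setting replaces lam(u)**n by c(u) * lam(u)**n.
m = np.array([0.7, 0.2, 0.1])
lam = lambda u: float(np.sum(m ** u))
shannon_1 = -float(np.sum(m * np.log(m)))
taneja_1 = lambda s: -float(np.sum(m ** s * np.log(m)))

for n in (1, 10, 20, 40):
    shannon_n = n * shannon_1                                     # linear class
    tsallis_n = 1.0 - lam(2.0) ** n                               # constant class (q = 2)
    taneja_contracting = n * lam(2.0) ** (n - 1) * taneja_1(2.0)  # s = 2 > 1
    taneja_expanding = n * lam(0.5) ** (n - 1) * taneja_1(0.5)    # s = 0.5 < 1
    print(n, shannon_n, tsallis_n, taneja_contracting, taneja_expanding)
```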
Transforming a φ-entropy into an (h,φ)-entropy with a QPL function h leads to a change of class in several cases. Any transformation h with $s' < 0$ exchanges the over-linear and contracting classes, while the choice of $t'$ has no impact on these two classes. Taking $h(y) = y^{s'}$ with $s' > 1$ would transform the linear Shannon entropy into an intricate entropy of the over-linear class.
5. Linear QPL Entropies for QP Sequences
Since both STM and Taneja entropies are positive (due to the pertinent choice of parameters), their logarithms are well defined, which allows us to derive new (h,φ)-entropies and, hence, to complete the linear class beyond the Shannon and Rényi entropies.
Definition 1.
The log-STM entropy and the log-Taneja entropy are defined, for any random variable X taking values in a set E with distribution m, respectively, by $\log T_{s,r}(X)$ and $\log T_s(X)$, with $T_{s,r}$ and $T_s$ the STM and Taneja entropies given by (5) and (6).
In particular, the Rényi entropy appears as a log-STM entropy for a suitable choice of the parameters, up to multiplicative and additive constants, here omitted for the sake of simplicity.
The classification and ensuing discussion presented in Section 4 lead to the following characterization of the class of linear QPL (h,φ)-entropies.
Theorem 2.
Up to obvious equivalences (such as replacing φ and h by proportional functions) and multiplicative constants, exactly three types of linear QPL (h,φ)-entropies exist for QP sequences: the Shannon entropy, with $(s,t) = (1,1)$; the Rényi entropies, with $s \neq 1$ and $t = 0$; and the log-Taneja entropies, with $s \neq 1$ and $t = 1$.
Moreover, the entropy rates associated with these entropies are either the Shannon entropy rate (for the Shannon entropy) or the Rényi entropy rate (for both the Rényi and log-Taneja entropies), up to a constant factor.
Obviously, all log-STM entropies are also linear and equivalent to Rényi entropies since, for $s < r$,
$$ \log T_{s,r}(X_1^n) = (1-s)\,R_s(X_1^n) + \log\frac{1 - \Lambda_n(r)/\Lambda_n(s)}{r-s}, $$
where the last term remains bounded for QP sequences, $\Lambda_n(r)/\Lambda_n(s)$ being of order $(\lambda(r)/\lambda(s))^n$ with $\lambda(r) < \lambda(s)$.
To our knowledge, log-Taneja entropies and log-STM entropies have never been considered in the literature, so we will now establish some of their properties.
Various properties of STM entropies are studied in [12]. STM and Taneja entropies are symmetric, continuous, and positive. Symmetry and continuity are preserved by composition with the logarithm, yielding the symmetry and continuity of log-STM and log-Taneja entropies. However, the logarithm is not positive on (0, 1), so that log-STM and log-Taneja entropies may take negative values.
For a number N of equiprobable states, the STM entropy is $(N^{1-s} - N^{1-r})/(r-s)$, while the Taneja entropy is $N^{1-s}\log N$. Both increase with N, except when $s > 1$ (and, for the STM entropy, $r > 1$ too), in which case they eventually decrease to 0. The property transfers to the logarithmic versions, the logarithm being increasing.
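These uniform-case values follow from a one-line computation with $m(x) = 1/N$, sketched here in the notation of (5) and (6):
$$ T_{s,r}(m) = \frac{N\cdot N^{-s} - N\cdot N^{-r}}{r-s} = \frac{N^{1-s} - N^{1-r}}{r-s}, \qquad T_s(m) = -N\cdot N^{-s}\,\log\frac{1}{N} = N^{1-s}\log N. $$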
Borges and Roditi [12] show that STM entropies are concave for $s \leq 1 \leq r$ and convex for another range of the parameters, all other cases presenting no definite concavity. Since the logarithm is a concave and increasing function, the log-STM entropies are also concave for $s \leq 1 \leq r$, the other cases remaining indefinite.
Scarfone [13] gives the maximum STM entropy distribution, while Beitollahi and Azhdari [14] compute the Taneja entropy of numerous classical distributions, including the Bernoulli and geometric distributions. Both results transfer to the logarithmic versions.
STM entropies are axiomatically characterized as the family of φ-entropies satisfying the functional identity
$$ T_{s,r}(p\otimes q) = T_{s,r}(p)\sum_{y\in E} q(y)^s + T_{s,r}(q)\sum_{x\in E} p(x)^r, \qquad (13) $$
where p and q denote arbitrary probability distributions on E; see [15] and the references therein. From a probabilistic point of view, (13) states that the entropy of a pair (X, Y) of independent variables, with distributions p and q, satisfies, in terms of the Dirichlet series defined by (3) for a single variable, $\Lambda_X(s) = \sum_{x\in E} p(x)^s$,
$$ T_{s,r}(X,Y) = \Lambda_Y(s)\,T_{s,r}(X) + \Lambda_X(r)\,T_{s,r}(Y). \qquad (14) $$
For any fixed s and r tending to s, (14) yields, for the Taneja entropy of a pair of independent variables, $T_s(X,Y) = \Lambda_Y(s)\,T_s(X) + \Lambda_X(s)\,T_s(Y)$. In particular, if $\Lambda_X(s) = \Lambda_Y(s) = \Lambda(s)$, then $T_s(X,Y) = \Lambda(s)\,[T_s(X) + T_s(Y)]$, that is, a "weighted additivity" rule for independent variables with equal Dirichlet series at s, or, equivalently, with equal Rényi entropy of parameter s.
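The identity (14) is straightforward to check numerically, since the Dirichlet series of a product distribution factorizes; a short sketch (our naming):

```python
import numpy as np

def lam(m, u):  # Dirichlet series Lambda(u) = sum_x m(x)^u of a distribution
    return float(np.sum(np.asarray(m, dtype=float) ** u))

def stm(m, s, r):  # STM entropy as in (5)
    return (lam(m, s) - lam(m, r)) / (r - s)

p = np.array([0.6, 0.4])
q = np.array([0.5, 0.3, 0.2])
pq = np.outer(p, q).ravel()  # distribution of a pair of independent variables
s, r = 0.5, 1.5
lhs = stm(pq, s, r)
rhs = lam(q, s) * stm(p, s, r) + lam(p, r) * stm(q, s, r)
print(lhs, rhs)  # equal up to floating-point error
```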
The axiomatic property (13) for STM entropies yields an alternative proof of the linearity of log-STM entropies for i.i.d. sequences of random variables. For the sake of simplicity, let us detail it for log-Taneja entropies. Let $\mathbf{X}$ be an i.i.d. sequence with common distribution m. Some simple calculation from the definition gives
$$ \log T_s(X_1) = (1-s)\,R_s(X_1) + \log\Big(-\sum_{x\in E} m^{*s}(x)\log m(x)\Big), \qquad (15) $$
where $m^{*s}$ denotes the s-escort distribution associated with m. The escort transformation is known to preserve independence, in the sense that $(p\otimes q)^{*s} = p^{*s}\otimes q^{*s}$; see, e.g., [5] and the references therein. Moreover, both the Shannon and Rényi entropies are well known to be additive for independent variables, and hence $R_s(X_1^n) = n\,R_s(X_1)$ and $-\sum (m^{*s})^{\otimes n}\log m^{\otimes n} = -\,n\sum_{x\in E} m^{*s}(x)\log m(x)$. Applying (15) to $X_1^n$ thus yields
$$ \log T_s(X_1^n) = n(1-s)\,R_s(X_1) + \log n + \log\Big(-\sum_{x\in E} m^{*s}(x)\log m(x)\Big). $$
Obviously, the right-hand terms of the sum are negligible with respect to n, so that $\log T_s(X_1^n) \sim n(1-s)\,R_s(X_1)$. Finally, the entropy rate of the sequence reduces to the Rényi entropy of the common distribution, up to the constant factor $1-s$.
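A quick numerical check of this linearity, using the closed form $T_s(X_1^n) = n\,\Lambda(s)^{n-1}\,T_s(X_1)$ that follows from (14) by induction for i.i.d. variables; the time-averaged log-Taneja entropy approaches $(1-s)$ times the Rényi entropy of the marginal:

```python
import numpy as np

m = np.array([0.5, 0.3, 0.2])
s = 2.0
lam_s = float(np.sum(m ** s))                  # Lambda(s) of the marginal
taneja_1 = -float(np.sum(m ** s * np.log(m)))  # T_s(X_1)
renyi_s = np.log(lam_s) / (1.0 - s)            # Renyi entropy R_s(X_1)

for n in (10, 100, 1000):
    # log T_s(X_1^n) = (n - 1) log Lambda(s) + log n + log T_s(X_1)
    log_taneja_n = (n - 1) * np.log(lam_s) + np.log(n) + np.log(taneja_1)
    print(n, log_taneja_n / n, (1.0 - s) * renyi_s)  # averaged value -> (1-s) R_s
```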
Author Contributions
Both authors have contributed equally to all aspects of this manuscript. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–423.
- Menéndez, M.L.; Morales, D.; Pardo, L.; Salicrú, M. (h,ϕ)-entropy differential metric. Appl. Math. 1997, 42, 81–98.
- Basseville, M. Divergence measures for statistical data processing—An annotated bibliography. Signal Process. 2013, 93, 621–633.
- Ciuperca, G.; Girardin, V.; Lhote, L. Computation and Estimation of Generalized Entropy Rates for Denumerable Markov Chains. IEEE Trans. Inf. Theory 2011, 57, 4026–4034.
- Regnault, P.; Girardin, V.; Lhote, L. Weighted Closed Form Expressions Based on Escort Distributions for Rényi Entropy Rates of Markov Chains. In Geometric Science of Information; Lecture Notes in Computer Science; Nielsen, F., Barbaresco, F., Eds.; Springer International Publishing: Cham, Switzerland, 2017; pp. 648–656.
- Vallée, B. Dynamical sources in information theory: Fundamental intervals and word prefixes. Algorithmica 2001, 29, 262–306.
- Girardin, V.; Lhote, L.; Regnault, P. Different Closed-Form Expressions for Generalized Entropy Rates of Markov Chains. Methodol. Comput. Appl. Probab. 2019, 21, 1431–1452.
- Sharma, B.D.; Taneja, I.J. Entropy of type (α,β) and other generalized measures in information theory. Metrika 1975, 22, 205–215.
- Frank, T.; Plastino, A. Generalized thermostatistics based on the Sharma–Mittal entropy and escort mean values. Eur. Phys. J. B Condens. Matter Complex Syst. 2002, 30, 543–549.
- Kolesnichenko, A.V. Two-parameter Sharma–Taneja–Mittal entropy as the basis of a family of equilibrium thermodynamics of nonextensive systems. Prepr. Keldysh Inst. Appl. Math. 2020, 36, 35.
- Girardin, V.; Lhote, L. Rescaling Entropy and Divergence Rates. IEEE Trans. Inf. Theory 2015, 61, 5868–5882.
- Borges, E.P.; Roditi, I. A family of nonextensive entropies. Phys. Lett. A 1998, 246, 399–402.
- Scarfone, A.M. A Maximal Entropy Distribution Derivation of the Sharma–Taneja–Mittal Entropic Form. Open Syst. Inf. Dyn. 2018, 25, 1850002.
- Beitollahi, A.; Azhdari, P. Exponential family and Taneja's entropy. Appl. Math. Sci. 2010, 4, 2013–2019.
- Suyari, H.; Ohara, A.; Wada, T. Mathematical Aspects of Generalized Entropies and their Applications. J. Phys. Conf. Ser. 2010, 201, 011001.
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).