On the uniqueness theorem for pseudo-additive entropies

We discuss the idea that the Tsallis-type (q-additive) entropic chain rule allows for a wider class of entropic functionals than previously thought. In particular, we point out that the ensuing entropy solutions (e.g., Tsallis entropy) can be determined uniquely only when one fixes the prescription for handling conditional entropies. Our point is substantiated with Darótzy's mapping theorem and the DeFinetti-Kolmogorov theorem for escort distributions, and is illustrated with a number of examples. The connection with Landsberg's classification of non-extensive thermodynamical systems is also briefly discussed.


Introduction
During the last two decades, the study of complex dynamical systems has undergone an important conceptual shift. The catalyst was the infusion of new ideas from the theory of critical phenomena (scaling laws), (multi)fractals and trees, the renormalization group, random matrix theory and information theory. At the same time, the usual Boltzmann-Gibbs statistics (BGS) has proven to be grossly inadequate in this context. While successful in describing stationary systems characterized by ergodicity or metric transitivity, BGS fails to reproduce the statistical behavior of many real-world systems in biology, astrophysics, geology, and the economic and social sciences. In recent years the use of a new paradigm known as generalized statistics has become very popular in complex systems. The notion "generalized statistics" refers to statistical systems that are described via broad, (semi)heavy-tail distributions. Examples include generalized hyperbolic distributions, Meixner distributions, Weibull distributions, and various power-law tail distributions (e.g., Zipf-Pareto, Lévy or Tsallis-type distributions). In particular, the statistics associated with power-law tails accounts for a rich class of phenomena often observed in complex systems ranging from financial markets, physics and biology to geoscience (see, e.g., Refs. [1,2] and citations therein).
Currently there are a number of key techniques for dealing with generalized-statistics-based complex systems. Among these one can especially mention: a) innate general statistics approaches such as the q-deformed thermostatistics of Tsallis [1], the Superstatistics of Beck et al. [3] and various generalized non-extensive entropies [4][5][6][7][8]; b) information-theoretic concepts such as the information theory of Rényi [9,10], transfer entropies [11,12], complexity theory [1,13], or information geometry [14]. In particular, the concept of entropy, in both its information-theoretic and combinatorial guise, plays a central rôle here. It is, however, clear that entropy can provide a meaningful and technically sound modus operandi only if there is a good guiding principle behind its formulation. Admittedly, from a mathematical point of view, the most satisfactory prescriptions are based on an axiomatization of entropy or (even better) on some operationally viable coding theorem. It is the purpose of this paper to focus on the axiomatic underpinning of entropies with a special emphasis on the axiomatics of the so-called q-additive entropies. We shall show, by extending our previous argument [15], that commonly used axioms for q-additive entropies are prone to admit a multiplicity of distinct solutions, with the culprit residing in the way conditional entropies are handled. Since the existence of different solutions is intimately related to Darótzy's mapping theorem and Kolmogorov-Nagumo's quasi-linear means, we study the class of entropic functionals thus generated, all of which satisfy the same q-additive entropic chain rule. We will also briefly touch upon the possibility of yet another mechanism, which is implied by the DeFinetti-Kolmogorov theorem for escort distributions.
The plan of the paper is as follows. After this Introduction, we present in Section 2 a concise overview of some standard (pseudo-)additive entropic rules which yield degenerate solutions for entropic functionals. This will be important in Section 3, where the same phenomenon will be observed in the context of entropic chain rules. Despite their higher restrictiveness, the chain rules still lead to degenerate solutions, and this persists even for the (more sophisticated) q-extensive entropic chain rules, as demonstrated in Section 4. In Section 5 we identify the root cause of the degeneracy in the way conditional entropies are handled. In particular, we prove that with the help of Darótzy's mapping theorem and Kolmogorov-Nagumo's quasi-linear means one may easily satisfy the same q-extensive entropic chain rule with two completely distinct yet perfectly legitimate entropic functionals. Finally, Section 6 summarizes our results and discusses further consequences.

Some (pseudo-)additivity rules
Suppose $X$ and $Y$ are two events, e.g. messages, having $W_X$ and $W_Y$ possible states, respectively. The two events taken together form a joint event that we denote as $XY$. If $X$ and $Y$ are independent of each other, every combination of $X$ and $Y$ values is a possible joint event, and so $W_{XY} = W_X W_Y$. In this case the entropy of the composite bipartite system $XY$, the joint entropy $H(X, Y)$, is additive, i.e.
$$H(X, Y) = H(X) + H(Y). \tag{1}$$
On the other hand, if the events are not independent (and do not interact), it may be that some combinations of $X$ and $Y$ are not allowed (i.e., $W_{XY} < W_X W_Y$), so that the joint entropy $H(X, Y)$ may be less than $H(X) + H(Y)$, a property known as the subadditivity of the entropy. We will have more to say about this in Section 5.
In practice, the entropy behavior in a number of independent bipartite systems cannot be easily grasped via the simple additivity prescription (1). The reason might be that some marginal states (e.g., surface states) are neglected, or that long-range correlations invalidate the assumption of independence, so that the concept of independence is used only as a convenient approximation. Whatever the reason, the modus operandi in these cases is to resort to some simple one-parameter deformation of the additivity rule that captures, in one way or another, the non-additive contributions. Such deformations are known as pseudo-additive entropic forms. The most commonly used versions are: Tsallis-type additivity [1],
$$H(X, Y) = H(X) + H(Y) + (1 - q) H(X) H(Y),$$
Landsberg-type additivity [16], (AdS/CFT) δ-additivity [1,17], Masi-Czachor-type supra-additivity [18], etc. Let us point out that the above (pseudo-)additivity rules might also hold for more generic joint systems for which $W_{XY} \neq W_X W_Y$. In such cases one speaks about (pseudo-)extensivity [6]. Since (pseudo-)extensivity is more a trait of the actual interaction/correlation between $X$ and $Y$ than of the entropy itself, we shall not dwell on this issue here. Let us just mention that the actual form of non-extensivity and the values of the parameters can often be connected to some physical phenomena [8,19]. In Section 3 we will see that, from a strictly mathematical standpoint, it is logically more satisfactory to deal with dependent but non-interacting systems. In such cases the above (pseudo-)additivity rules will be replaced with more restrictive entropy chain rules.

Degeneracy in solutions
It is well known that there are a number of "logically consistent" but form-inequivalent entropic functionals satisfying the simple additivity rule (1). Examples are provided by the Shannon entropy (and the ensuing Gibbs and von Neumann entropies) or the Rényi entropy. In this connection, we should emphasize the perhaps less known fact that a similar situation holds also for pseudo-additivity. For instance, assuming that $X$ has the associated probability distribution $P_X = \{p_i\}_{i=1}^{n}$, the Tsallis-type additivity rule is satisfied not only by the Tsallis entropy, as one would expect, but also by Landsberg's entropy [16] or by the Behara-Chawla γ-entropy [20]. On the same footing, the (AdS/CFT) δ-additivity is satisfied by a class of the so-called Segal entropies [21].
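The degeneracy for independent events can be checked numerically. The following is a minimal sketch, assuming the standard textbook forms of the Shannon, Rényi and Tsallis entropies (base-2 logarithms and the 1/(1 − q) normalization for Tsallis); the distributions `px`, `py` and the value of `q` are arbitrary illustrative choices, not taken from the text.

```python
import math
from itertools import product

def shannon(p):
    """Shannon entropy in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def renyi(p, q):
    """Renyi entropy of order q (q != 1), base-2 logarithm."""
    return math.log2(sum(pi ** q for pi in p if pi > 0)) / (1.0 - q)

def tsallis(p, q):
    """Tsallis entropy with the 1/(1 - q) normalization (assumed form)."""
    return (sum(pi ** q for pi in p if pi > 0) - 1.0) / (1.0 - q)

# Illustrative marginals (hypothetical numbers) and their independent joint.
px = [0.5, 0.3, 0.2]
py = [0.6, 0.4]
q = 0.7
pxy = [a * b for a, b in product(px, py)]

# Simple additivity: satisfied by both Shannon and Renyi entropies.
shannon_gap = shannon(pxy) - (shannon(px) + shannon(py))
renyi_gap = renyi(pxy, q) - (renyi(px, q) + renyi(py, q))

# Tsallis-type additivity: H(X,Y) = H(X) + H(Y) + (1 - q) H(X) H(Y).
tx, ty = tsallis(px, q), tsallis(py, q)
tsallis_gap = tsallis(pxy, q) - (tx + ty + (1.0 - q) * tx * ty)

print(shannon_gap, renyi_gap, tsallis_gap)
```

All three gaps vanish up to floating-point rounding, which illustrates that the additivity rule alone does not single out one functional.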

First bite on uniqueness
At this stage there are two points that should be noted. First, the above (pseudo-)additive forms (1)-(4) hold only for independent events. Second, one could entertain the idea that when the entropic chain rules (which also employ the conditional entropy $H(Y|X)$) are used instead, their higher restrictiveness could remove the unwanted degeneracy and lead to a single unique solution. This might seem a good strategy, particularly because the entropic chain rule with the ensuing $H(Y|X)$ is of practical relevance in a number of areas. In this connection one can mention applications in: 1. information theory, where it describes the input-output information of a (quantum) communication channel and enters the data processing inequality [22]; 2. non-equilibrium and complex dynamics, where it enters the May-Wigner criterion for the stability of dynamical systems and helps to estimate the connectivity of the network of system exchanges [23]; 3. time series and data analysis, where it describes transfer entropies in bivariate time series and the degree of synchronization between two signals [12].
In the following two sections we will see what the entropic chain rules and their pseudo-additive extensions can really say about the uniqueness of entropic functionals.

Entropic chain rule
3.1. Additive entropic chain rule: some fundamentals
Let us consider two generally dependent events. The elementary (Shannon-type) entropic chain rule for the ensuing random variables $X$ and $Y$ reads
$$H(X, Y) = H(X) + H(Y|X), \tag{8}$$
where $H(X)$ is the single-random-variable entropy, $H(X, Y)$ is the joint entropy, and $H(Y|X)$ represents the related conditional entropy of $Y$ given $X$. The meaning of the chain rule (8) is depicted in Figure 1. By induction one can generalize the previous relation (8) to
$$H(X_1, X_2, \ldots, X_n) = \sum_{i=1}^{n} H(X_i | X_1, X_2, \ldots, X_{i-1}).$$
At this stage we can ask ourselves the following question. By imposing simple consistency conditions (i.e., Kolmogorov axioms 1-3, see, e.g. [5]), such as: 1. continuity, i.e., $H(P_X)$ is a continuous function of all its arguments, 2. maximality, i.e., for given $n$, $H(P_X)$ is maximal for $P_X = (1/n, \ldots, 1/n)$, and 3. expansibility, i.e., $H(p_1, \ldots, p_n, 0) = H(p_1, \ldots, p_n)$, to what extent is $H(X)$ (with the above chain rule) unique? It might perhaps come as a surprise that uniqueness is not guaranteed, at least not unless one prescribes how to handle conditional entropies.
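For instance, in the case of Shannon's entropy the conditional entropy of $X$ given $Y$ is defined as
$$H(X|Y) = \sum_{y \in Y} p(y)\, H(X|Y = y),$$
i.e., $H(X|Y)$ is defined via the linear mean. In contrast, in the case of Rényi's entropy the conditional entropy of $X$ given $Y$ is defined by means of the quasi-linear Kolmogorov-Nagumo (KN) mean [24,25] in the following two-step sequence [26]:
$$I_q(X|Y = y) = \frac{1}{1-q} \log_2 \sum_{x \in X} p^q(x|y), \qquad I_q(X|Y) = \phi^{-1}\Big( \sum_{y \in Y} \rho_q(y)\, \phi\big(I_q(X|Y = y)\big) \Big). \tag{11}$$
Here, the KN function $\phi$ reads [9,25]
$$\phi(x) = 2^{(1-q)x}.$$
Particularly noteworthy is the appearance of the so-called escort distribution [26,27],
$$\rho_q(y) = \frac{p^q(y)}{\sum_{y' \in Y} p^q(y')},$$
in Eq. (11). Interestingly, the need for $\rho_q(y)$ in the definition of $I_q(X|Y)$ was already observed by A. Rényi in the 1960s [26], some 30 years before the escort distribution was officially introduced in Ref. [27].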
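The role of the averaging prescription can be made concrete with a small numerical experiment. The sketch below assumes the KN function $\phi(x) = 2^{(1-q)x}$ and escort weights $\rho_q(y) = p^q(y)/\sum_{y'} p^q(y')$; the joint distribution `joint` (indexed as `joint[y][x]`) and the value of `q` are hypothetical illustrative choices. It compares the escort-weighted KN mean with a plain linear mean of the same conditional Rényi entropies.

```python
import math

def renyi(p, q):
    """Renyi entropy of order q (q != 1), base-2 logarithm."""
    return math.log2(sum(pi ** q for pi in p if pi > 0)) / (1.0 - q)

def escort(p, q):
    """Escort distribution p_i^q / sum_j p_j^q."""
    z = sum(pi ** q for pi in p)
    return [pi ** q / z for pi in p]

def renyi_conditional_kn(joint, q):
    """I_q(X|Y) via the KN mean with phi(x) = 2**((1-q)x) and escort weights."""
    py = [sum(row) for row in joint]
    rho = escort(py, q)
    mean = 0.0
    for row, p_y, w in zip(joint, py, rho):
        cond = [v / p_y for v in row]            # p(x|y)
        mean += w * 2.0 ** ((1.0 - q) * renyi(cond, q))
    return math.log2(mean) / (1.0 - q)           # phi^{-1}

def renyi_conditional_linear(joint, q):
    """Plain linear mean of I_q(X|Y=y) weighted by p(y), for comparison."""
    py = [sum(row) for row in joint]
    return sum(p_y * renyi([v / p_y for v in row], q)
               for row, p_y in zip(joint, py))

# Hypothetical dependent joint distribution, joint[y][x] = p(x, y).
joint = [[0.30, 0.10, 0.05],
         [0.05, 0.20, 0.30]]
q = 1.5
py = [sum(row) for row in joint]
flat = [v for row in joint for v in row]

# Additive chain rule I_q(X,Y) = I_q(Y) + I_q(X|Y) under each prescription.
kn_gap = renyi(flat, q) - (renyi(py, q) + renyi_conditional_kn(joint, q))
linear_gap = renyi(flat, q) - (renyi(py, q) + renyi_conditional_linear(joint, q))

print(kn_gap, linear_gap)
```

With the KN prescription the chain rule closes to rounding accuracy, whereas the linear mean leaves a nonzero residue for this example.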

Second bite on uniqueness
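In analogy with the preceding section one defines the q-extensive entropic chain rule for two generally dependent random variables $X$ and $Y$ as
$$H(X, Y) = H(X) + H(Y|X) + (1 - q) H(X) H(Y|X).$$
By induction one can generalize this to an n-partite system as
$$1 + (1-q) H(X_1, \ldots, X_n) = \prod_{i=1}^{n} \big[ 1 + (1-q) H(X_i | X_1, \ldots, X_{i-1}) \big].$$
We can ask again a similar question as before, namely to what extent is $H(X) \equiv H(P_X)$ (subject to the above chain rule) unique.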
As before, we should not be surprised to learn that uniqueness is not guaranteed unless $H(X|Y)$ is specified. Indeed, the q-extensive entropic chain rule together with the consistency conditions is not enough to fix a unique form of $H(X)$. Examples include: a) Tsallis' entropy, where the conditional entropy of $X$ given $Y$ is defined as [28]
$$S_q(X|Y = y) = \frac{1}{1-q}\Big( \sum_{x \in X} p^q(x|y) - 1 \Big), \qquad S_q(X|Y) = \sum_{y \in Y} \rho_q(y)\, S_q(X|Y = y),$$
i.e., $S_q(X|Y)$ is defined via the linear (escort) mean; b) the Frank-Daffertshofer entropy [29], where the conditional entropy is defined with the KN function $\phi(x) = \log_r e_q^{x}$, i.e., $S_{FD}(X|Y)$ is obtained via a quasi-linear KN mean; c) the Sharma-Mittal entropy [30], where the conditional entropy is likewise given by a KN mean. Here $\delta = 2^{1-q} - 1$ and $\phi(x) = \log_r e_{\delta-1}^{x}$, i.e., $S_{SM}(X|Y)$ is again defined via a quasi-linear KN mean.
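Case a) can be verified directly. The following sketch assumes the Tsallis entropy in the 1/(1 − q) normalization and a conditional entropy built as the escort-weighted linear mean of $S_q(X|Y = y)$; the joint distribution and `q` are hypothetical illustrative values.

```python
def tsallis(p, q):
    """Tsallis entropy with the 1/(1 - q) normalization (assumed form)."""
    return (sum(pi ** q for pi in p if pi > 0) - 1.0) / (1.0 - q)

def tsallis_conditional(joint, q):
    """S_q(X|Y) as the escort-weighted linear mean of S_q(X|Y=y)."""
    py = [sum(row) for row in joint]
    z = sum(p_y ** q for p_y in py)              # escort normalization
    return sum((p_y ** q / z) * tsallis([v / p_y for v in row], q)
               for row, p_y in zip(joint, py))

# Hypothetical dependent joint distribution, joint[y][x] = p(x, y).
joint = [[0.30, 0.10, 0.05],
         [0.05, 0.20, 0.30]]
q = 0.7
py = [sum(row) for row in joint]
flat = [v for row in joint for v in row]

s_y = tsallis(py, q)
s_cond = tsallis_conditional(joint, q)
s_xy = tsallis(flat, q)

# q-extensive chain rule: S(X,Y) = S(Y) + S(X|Y) + (1 - q) S(Y) S(X|Y).
chain_gap = s_xy - (s_y + s_cond + (1.0 - q) * s_y * s_cond)

print(chain_gap)
```

The residue vanishes to rounding accuracy even though the variables are dependent, which is what distinguishes the chain rule from the mere q-additivity for independent events.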

Two simple theorems
Let us now state two simple theorems that are pertinent to the discussion of q-extensive entropic rules. The first one involves Darótzy's mapping $h(x) = (2^{\lambda x} - 1)/\gamma$, with $\lambda\gamma > 0$ for $\lambda \neq 0$. The explicit form of the function $\Phi(X, q)$ depends on the type of the q-deformation. For instance, for q-extensivity we have $\Phi(X, q) = 1 + (1 - q)H(X)$.

Statement: Consider the additive chain rule with $H(X|Y)$ defined via the KN function $\phi(x)$. Darótzy's mapping $h$ then generates q-extensivity (provided we set $\gamma = 1 - q$), where the new $H(X|Y)$ is defined via the KN function $\phi \circ h^{-1}$.
Proof of Theorem 1.
The actual values of a and λ are immaterial for the q-extensivity as the first line of the proof does not depend on them. The second line shows that the ensuing q-extensive entropy represents a two-parametric functional (apart from parameters originating from the KN function φ).
So, the core idea of this theorem is that if there exist two different entropy functionals for a given additive chain rule with two different KN-related conditional entropies, then one can use Darótzy's mapping to map these two solutions to another two solutions that solve the q-extensive chain rule. Simple examples of this mechanism are provided by the Frank-Daffertshofer and Sharma-Mittal entropies (17)-(18), with Darótzy's mappings $h_\delta(x) = (2^{(1-q)x} - 1)/\delta$ applied to the Rényi entropy $I_r$ with $\delta = 2^{1-q} - 1$ and $\delta = 1 - q$, respectively. For some further details on this issue see, e.g., Ref. [4].
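The mechanism behind the mapping is elementary and can be illustrated numerically. The sketch below assumes the mapping in the form $h(x) = (2^{\lambda x} - 1)/\gamma$ and shows two things: that $h$ turns a sum into the pseudo-additive combination with deformation parameter $\gamma$, and that the choice $\lambda = \gamma = 1 - q$ maps the Rényi entropy onto the Tsallis entropy. The function name `darotzy` and all numerical values are illustrative assumptions.

```python
import math

def darotzy(x, lam, gamma):
    """Mapping h(x) = (2**(lam*x) - 1) / gamma, assumed form with lam*gamma > 0."""
    return (2.0 ** (lam * x) - 1.0) / gamma

def renyi(p, q):
    """Renyi entropy of order q (q != 1), base-2 logarithm."""
    return math.log2(sum(pi ** q for pi in p if pi > 0)) / (1.0 - q)

def tsallis(p, q):
    """Tsallis entropy with the 1/(1 - q) normalization (assumed form)."""
    return (sum(pi ** q for pi in p if pi > 0) - 1.0) / (1.0 - q)

# 1) h(a + b) = h(a) + h(b) + gamma * h(a) * h(b), independently of lam.
a, b = 0.8, 1.3                 # two arbitrary "entropy values"
lam, gamma = 0.9, 0.3
ha, hb = darotzy(a, lam, gamma), darotzy(b, lam, gamma)
pseudo_additivity_gap = darotzy(a + b, lam, gamma) - (ha + hb + gamma * ha * hb)

# 2) With lam = gamma = 1 - q the image of the Renyi entropy is Tsallis entropy.
p = [0.5, 0.3, 0.2]
q = 0.7
renyi_to_tsallis_gap = darotzy(renyi(p, q), 1.0 - q, 1.0 - q) - tsallis(p, q)

print(pseudo_additivity_gap, renyi_to_tsallis_gap)
```

Since only $\gamma$ enters the pseudo-additive combination, different admissible values of $\lambda$ give different functionals obeying the same rule.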
The second theorem highlights some tricky points related to joint escort distributions. Here $\tilde{R}_{kl}(q)$ is the correct would-be joint escort distribution. Note that $\tilde{R}_{kl}(q)$ is not the escort of $r_{kl}$, and $\tilde{R}_{kl}(q) = R_{kl}(q)$ iff the events are independent [15].
An important upshot of this theorem is that some entropies (particularly those born out of escort distributions) might satisfy q-additivity but not necessarily the q-extensive entropic chain rule. The reason is that the more restrictive chain rule is sensitive to the failure of the DeFinetti-Kolmogorov relation for joint escort distributions. A typical example is provided by the JA entropy [15,31,32]
$$S_{JA,q}(X) = \frac{1}{1-q}\Big( 2^{-(1-q)\sum_k \rho_k(q) \log_2 p_k} - 1 \Big).$$
Because $-\sum_k \rho_k(q) \log_2 p_k$ satisfies the simple additivity rule for independent events, Darótzy's mapping ensures that $S_{JA,q}(X)$ satisfies q-additivity. On the other hand, $-\sum_k \rho_k(q) \log_2 p_k$ does not satisfy the simple chain rule due to the failure of the DeFinetti-Kolmogorov relation for joint events. Consequently, $S_{JA,q}(X)$ is not q-extensive.
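Both halves of this argument can be probed numerically. The sketch below assumes the JA entropy in the form $\frac{1}{1-q}\big(2^{-(1-q)\sum_k \rho_k(q)\log_2 p_k} - 1\big)$ and compares the escort of a joint distribution with the product of the marginal escort and the escort of the conditional distribution. The helper `escort_factorization_gap` and all numerical values are hypothetical, introduced only for illustration.

```python
import math

def escort(p, q):
    """Escort distribution p_i^q / sum_j p_j^q."""
    z = sum(pi ** q for pi in p)
    return [pi ** q / z for pi in p]

def ja_entropy(p, q):
    """JA entropy in the assumed form (2**((1-q) * x) - 1) / (1 - q),
    with x = -sum_k rho_k(q) * log2(p_k)."""
    rho = escort(p, q)
    x = -sum(r * math.log2(pi) for r, pi in zip(rho, p) if pi > 0)
    return (2.0 ** ((1.0 - q) * x) - 1.0) / (1.0 - q)

def escort_factorization_gap(joint, q):
    """Largest deviation between the escort of the joint distribution and
    rho_q(y) times the escort of p(.|y), over all joint states."""
    flat = [v for row in joint for v in row]
    joint_escort = escort(flat, q)
    py = [sum(row) for row in joint]
    rho_y = escort(py, q)
    product_form = []
    for row, p_y, w in zip(joint, py, rho_y):
        cond_escort = escort([v / p_y for v in row], q)
        product_form.extend(w * c for c in cond_escort)
    return max(abs(u - v) for u, v in zip(joint_escort, product_form))

q = 0.7
px, py = [0.5, 0.3, 0.2], [0.6, 0.4]
independent = [[b * a for a in px] for b in py]   # joint[y][x] = p(y) p(x)
dependent = [[0.30, 0.10, 0.05],
             [0.05, 0.20, 0.30]]

# q-additivity of the JA entropy for independent events.
sx, sy = ja_entropy(px, q), ja_entropy(py, q)
sxy = ja_entropy([v for row in independent for v in row], q)
q_additivity_gap = sxy - (sx + sy + (1.0 - q) * sx * sy)

# Factorization of the joint escort: exact only in the independent case.
gap_independent = escort_factorization_gap(independent, q)
gap_dependent = escort_factorization_gap(dependent, q)

print(q_additivity_gap, gap_independent, gap_dependent)
```

For the independent joint distribution both the q-additivity residue and the factorization gap vanish to rounding accuracy, while the dependent example leaves a nonzero factorization gap.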
Landsberg concluded that there are only 6 logically consistent thermodynamic classes (with S denoting superadditivity, H homogeneity and C concavity, and an overbar denoting the absence of the property): $SHC$, $S\bar{H}C$, $S\bar{H}\bar{C}$, $\bar{S}\bar{H}C$, $\bar{S}\bar{H}\bar{C}$ and $\bar{S}H\bar{C}$. Types $\bar{S}HC$ and $SH\bar{C}$ are logically impossible [16], see Figure 2. This classification is, however, based entirely on entropies for independent events. Since the entropic (pseudo-)additivity rule and the chain rule for a given system do not need to follow the same pattern (e.g., they do not need to be both q-additive), the entropy class might change when conditional probabilities are included. For instance, according to Landsberg, $S_{JA,q}(X)$ should, for $q < 1$, belong to the SHC class, while the corresponding chain-rule generalization allows for flipping between $S$ and $\bar{S}$, see [15].

Conclusions
In this paper we have demonstrated that any given q-extensive entropic chain rule allows for a wider class of entropic functionals than previously thought. This degeneracy in solutions also plagues q-additive entropic rules, but one would expect that the higher restrictiveness of the q-extensive rule could perhaps remove such a degeneracy. This is not the case. The culprit is the way $H(X|Y)$ is handled: there is flexibility in the definition of a permissible $H(X|Y)$ through the use of quasi-linear (or KN) means. These results raise a question: what is the typical trademark of non-extensive statistics? Is it a) pseudo-additivity (as indicated by Landsberg's classification), b) pseudo-extensivity (i.e., a chain rule with conditional entropies), or c) power-law-type entropy maximizers?