Non-extensive Entropic Distance Based on Diffusion: Restrictions on Parameters in Entropy Formulae

Based on a diffusion-like master equation we propose a formula using the Bregman divergence for measuring entropic distance in terms of different non-extensive entropy expressions. We obtain the non-extensivity parameter range for a universal approach to the stationary distribution by simple diffusive dynamics for the Tsallis and the Kaniadakis entropies, for the Hanel–Thurner generalization, and finally for a recently suggested log-log type entropy formula which belongs to diverging variance in the inverse temperature superstatistics.


Introduction
Over the last decades, there have been several suggestions for generalizations of the Boltzmann-Gibbs-Shannon (BGS) entropy formula [1–6]. Most formulas can be grouped into categories either by their mathematical form (trace form or a function of the trace form) [7], or by the scaling properties for large systems, usually providing large entropies, $S$, even if not necessarily proportional to the logarithm of the number of states, $\ln W$ [8–11]. Such entropy formulas are often presented based on axioms, like the Khinchin or Shannon axioms for the best known logarithmic entropy formula due to Boltzmann and Gibbs. Non-extensive entropy then abandons the additivity axiom and replaces it with a more general assumption. For the group entropy, the assumption of the associativity of the entropy composition rule is used [7]. Another way to obtain a generalized entropy formula starts with physical properties, like the universality of the thermal equilibrium between two systems as described by the zeroth law of thermodynamics [12,13], and by this it also includes the associativity assumption [14]. It is also noteworthy that starting with a general, non-associative composition prescription one arrives asymptotically at an associative one, simply by repeating it in small steps and reconstructing the effective composition formula in the continuous scaling limit [15].
Deformations of the original Boltzmann-Gibbs-Shannon entropy formula can also be achieved by tracing it back from the canonical distribution, exponential in the subsystem energy, when parameters of this distribution are treated as stochastic ones, or otherwise randomized, and hence integrated over another distribution, the superstatistical one [16–18]. Instead of treating the temperature or its inverse as a fluctuating quantity, one may also consider the number of degrees of freedom, and hence the dimensionality of phase space, as a fluctuating quantity. In [19–23] it has been shown that simple fluctuation patterns of the particle number, $N$, influencing the phase space volume $\Omega$ occupied by an ideal gas, can lead either to an exponential distribution, $\exp(-\omega/T)$, for a Poisson distribution of $N$, or to a cut power-law Tsallis-Pareto distribution for a negative binomial distribution of $N$. In general, a trace form entropy,
$$ S = \sum_n P_n \, K(-\ln P_n), $$
emerges with $K(S)$ satisfying a second order differential equation, the additivity restoration condition (ARC) [24]. Connecting to a more traditional notation, the $K$ function is related to the deformed logarithm function as
$$ \ln_{\mathrm{def}}(P_n) = -K(-\ln P_n). $$
This form includes the one parameter deformed logarithms as well as the generalizations with more parameters [12,13,25–27].
In the present paper we investigate whether such entropy formulas define an entropic distance between two probability distributions which has the following useful properties: 1. it is positive for any two different distributions; 2. it is zero when any distribution is compared with itself; 3. it is symmetric.
A symmetric distance measure is most naturally formulated as a sum of relative divergences, $\rho(P, Q) = \sigma(P|Q) + \sigma(Q|P)$. For deformed logarithms two basic forms of suggestions occur in the literature, one based on the ratio $P_n/Q_n$ and one based on the difference of their logarithms, cf. Refs. [28–31]. Furthermore, we are interested in a definition which describes a distance that is always shrinking between a dynamically evolving distribution and the one belonging to maximal entropy under a spontaneous dynamics.
In the following we show this behavior for the traditional logarithmic entropy formula and a generally state-dependent nearest neighbour master equation (defined below), and then propose a generalization of the symmetrized entropic distance measure based on the deformed logarithm function. We conclude that some proposals ensure a shrinking of the entropic distance during an approach to the stationary distribution only for a restricted range of the non-extensivity parameter(s) used in the entropy formula.

Probability Distributions
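The Poisson distribution is ubiquitous in statistical phenomena where the discreteness of the basic variable, $n$, and its positivity are natural constraints. This distribution,
$$ P_n = \frac{a^n}{n!} \, e^{-a}, $$
is parameterized by a single parameter, coinciding with the mean value of the variable $n$: $a = \langle n \rangle$. The logarithm of its characteristic function,
$$ \ln \langle e^{i\theta n} \rangle = a \left( e^{i\theta} - 1 \right), $$
generates all cumulants of the distribution. Notably, due to $\langle \delta n^2 \rangle = \langle n \rangle$ the relative width of the Poisson distribution, $\sqrt{\langle \delta n^2 \rangle}/\langle n \rangle = 1/\sqrt{\langle n \rangle}$, narrows for growing mean value of $n$.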
This distribution plays a central role in different areas of physics: it occurs as the statistics of spontaneous radioactive decays during given time intervals, as the number of photons in a Glauber coherent state, or as the number distribution of randomly produced hadrons in high energy experiments when uniformly filling the available phase space. In the latter case, a given total energy, $E$, is distributed among $n$ newly produced particles in an experiment. The statistical weight to find a single particle with individual energy $\omega$ among the products is roughly proportional to the relative phase space described by a ratio of corresponding $n$-dimensional spheres. Its average over the Poisson distribution is given by the following remarkable formula
$$ \sum_{n=0}^{\infty} P_n \left( 1 - \frac{\omega}{E} \right)^n = e^{-\langle n \rangle \omega/E} = e^{-\omega/T}. $$
Several authors use the power $n - 1$ instead of $n$ in this formula. The difference traces back to different concepts of the hyperspheric phase space volume vs. surface. Since for large $n$ it does not matter, we do not go into details here. This result allows for a Gibbsian interpretation of this statistical factor with the kinetic temperature $T = E/\langle n \rangle$.
Beyond the Poissonian, sub- and super-Poissonian bosonic states also occur in experiments, with the corresponding Bernoulli or negative binomial multiplicity distributions. The above ideal phase space ratio averages in those cases to a statistical factor generalizing the exponential to a cut power-law form, e.g., for the negative binomial distribution with $\langle n \rangle = p(k + 1)$
$$ \sum_{n=0}^{\infty} P_n \left( 1 - \frac{\omega}{E} \right)^n = \left( 1 + \frac{1}{k+1} \, \frac{\omega}{T} \right)^{-(k+1)}. \qquad (6) $$
The above Tsallis-Pareto distribution, Equation (6), can also be obtained as a canonical distribution to the Tsallis entropy formula [3–5], with $q = 1 + 1/(k + 1)$ and $T = E/\langle n \rangle$. These and similar experiences suggest that different entropy formulas can be used as a basis for defining the entropic distance. Earlier attempts mostly defined a divergence formula based on the deformed logarithm, $\ln_{\mathrm{def}}$, of the ratio of the two respective probabilities [31,32]. An alternative approach, which we propose to follow, uses the difference of the deformed logarithms as a definition; it is based on a Bregman type divergence. While for the traditional logarithm function these alternative definitions coincide (due to $\ln(x/y) = \ln x - \ln y$), they do differ appreciably for the deformed logarithm functions in general use.
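The two averages quoted above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the parametrization $P_n = \binom{n+k}{n} p^n (1+p)^{-n-k-1}$ for the negative binomial distribution (chosen here because it reproduces $\langle n \rangle = p(k+1)$), truncates the infinite sums at a hypothetical cut-off `n_max`, and uses illustrative function names.

```python
import math

def poisson_average(omega_over_E, mean_n, n_max=400):
    """Sum_n P_n (1 - omega/E)^n for a Poisson P_n with mean mean_n."""
    x = 1.0 - omega_over_E
    term = math.exp(-mean_n)              # n = 0 term: P_0 * x^0
    total = 0.0
    for n in range(n_max):
        total += term
        term *= mean_n * x / (n + 1)      # P_{n+1} x^{n+1} from P_n x^n
    return total

def negative_binomial_average(omega_over_E, mean_n, k, n_max=4000):
    """Same average for the assumed form P_n = C(n+k, n) p^n (1+p)^(-n-k-1)."""
    p = mean_n / (k + 1.0)                # from <n> = p (k + 1)
    x = 1.0 - omega_over_E
    term = (1.0 + p) ** (-(k + 1.0))      # n = 0 term
    total = 0.0
    for n in range(n_max):
        total += term
        term *= (n + k + 1.0) / (n + 1.0) * p / (1.0 + p) * x
    return total

def tsallis_pareto(omega_over_T, q):
    """Cut power-law (1 + (q - 1) omega/T)^(-1/(q - 1))."""
    return (1.0 + (q - 1.0) * omega_over_T) ** (-1.0 / (q - 1.0))

# Illustrative numbers: <n> = 10 and omega/E = 0.05, hence omega/T = 0.5.
poisson_value = poisson_average(0.05, 10.0)            # compare with exp(-0.5)
nbd_value = negative_binomial_average(0.05, 10.0, 3)   # compare with q = 1.25
```

With $k = 3$ the Tsallis parameter is $q = 1 + 1/(k+1) = 1.25$, and the truncated sum should agree with `tsallis_pareto(0.5, 1.25)` up to rounding.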

Master Equation
Let us denote the probability of having exactly $n$ quanta in the observed system (part of phase space) at the time $t$ generally by $P_n(t)$. This quantity may depend on further parameters; these are suppressed in the notation for now. However, in the case of any blind choice it is proportional to $1/W_n$, the reciprocal of the number of possible ways to realize the state with $n$ quanta.
A wide range of phenomena can be described by a simplified dynamics, assuming that only one quantum can be exchanged in a time instant. In this case, the evolution of $P_n(t)$ and $W_n$ depends only on the state probabilities of having one more or one less quantum. We consider here only the linearized version of this dynamics, the so-called master equation:
$$ \dot{P}_n = \left[ (\lambda P)_{n+1} - (\lambda P)_n \right] + \left[ (\mu P)_{n-1} - (\mu P)_n \right]. \qquad (9) $$
Here, $\lambda_n$ is the transition (decay) rate from a state with $n$ quanta to the state $n - 1$, and $\mu_n$ is the corresponding reverse (growth) rate from $n$ to $n + 1$. In this case the occurrence of state $n$ in a huge parallel ensemble (Gibbs ensemble) of systems is fed by both the $(n + 1) \to n$ and $(n - 1) \to n$ processes and it is diminished by the reverse processes. It is of special interest to investigate processes when $\lambda$ and $\mu$ are related by symmetry principles, like time reversal invariance or subsystem-reservoir homogeneity.
We shall refer to the general detailed balance solution of Equation (9), $P_n^{\mathrm{eq}} = Q_n$, as the equilibrium distribution. Since the equation is homogeneous and linear in the $Q_n$-s, the overall normalization is not fixed by it. For the equilibrium distribution all time derivatives $\dot{Q}_n$ must vanish. That condition is fulfilled only if
$$ (\lambda Q)_{n+1} = (\mu Q)_n, \qquad (10) $$
from which it follows that also
$$ (\lambda Q)_n = (\mu Q)_{n-1} $$
holds, annulling all evolution. Based on this observation, the detailed balance distribution satisfies
$$ Q_n = Q_0 \prod_{j=1}^{n} \frac{\mu_{j-1}}{\lambda_j}, $$
where $Q_0$ can be obtained from the normalization condition
$$ \sum_n Q_n = 1. $$
The normalization is kept during the evolution, since $\sum_n \dot{P}_n = 0$ by Equation (9).
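The detailed balance construction can be sketched in a few lines. The example below is an illustration under stated assumptions: the state space is cut at a finite size, and the rates $\lambda_n = n$, $\mu_n = a$ are hypothetical choices (not taken from the text) for which the product formula reproduces a Poisson distribution with mean $a$.

```python
import math

def detailed_balance(lam, mu):
    """Q_n = Q_0 * prod_{j=1..n} mu[j-1] / lam[j], normalized to sum 1."""
    q = [1.0]
    for n in range(1, len(lam)):
        q.append(q[-1] * mu[n - 1] / lam[n])
    norm = sum(q)
    return [value / norm for value in q]

def master_rhs(p, lam, mu):
    """dP_n/dt = (lam P)_{n+1} - (lam P)_n + (mu P)_{n-1} - (mu P)_n."""
    size = len(p)
    rhs = []
    for n in range(size):
        from_above = lam[n + 1] * p[n + 1] if n + 1 < size else 0.0
        from_below = mu[n - 1] * p[n - 1] if n > 0 else 0.0
        rhs.append(from_above + from_below - (lam[n] + mu[n]) * p[n])
    return rhs

# Hypothetical rates: decay lam_n = n, constant growth mu_n = a, cut at n = 59.
a, size = 3.0, 60
lam = [float(n) for n in range(size)]   # lam[0] = 0: no state below n = 0
mu = [a] * (size - 1) + [0.0]           # mu[-1] = 0: no state above the cut
Q = detailed_balance(lam, mu)
residual = max(abs(value) for value in master_rhs(Q, lam, mu))
```

Here `residual` should vanish up to rounding, and `sum(master_rhs(p, lam, mu))` should vanish for any `p`, reflecting the conservation of the normalization.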

Entropic Distance
Based on the Boltzmann-Gibbs entropy, a distance measure from a reference distribution, $Q_n$, is based on the quantity
$$ \sigma(P|Q) = \sum_n P_n \ln \frac{P_n}{Q_n}. $$
By this definition $\sigma(Q|Q) = 0$, and it gives a positive Kullback-Leibler divergence [33–35] between the normalized distributions $P_n$ and $Q_n$. This can be easily proved by using the Jensen inequality, $\prod_i A_i^{P_i} \le \sum_i P_i A_i$, applied to $A_i = Q_i/P_i$. The symmetric combination, based on a sum of Bregman type divergences [28–31],
$$ \rho(P, Q) = \sigma(P|Q) + \sigma(Q|P) = \sum_n (P_n - Q_n) \ln \frac{P_n}{Q_n}, $$
is then non-negative term by term. For a fixed, time-independent $Q_n$ the master Equation (9) causes $\sigma(P|Q)$ to decrease:
$$ \dot{\sigma}(P|Q) = \sum_n \dot{P}_n \ln \frac{P_n}{Q_n} = \sum_n \left[ (\lambda P)_{n+1} - (\lambda P)_n + (\mu P)_{n-1} - (\mu P)_n \right] \ln \frac{P_n}{Q_n}. $$
The sum can be re-arranged by summation index re-definition to
$$ \dot{\sigma}(P|Q) = \sum_n \left[ (\lambda P)_{n+1} - (\mu P)_n \right] \ln \frac{P_n \, Q_{n+1}}{Q_n \, P_{n+1}}. $$
This expression is non-positive for arbitrary $P_n(t)$ values only if
$$ (\lambda Q)_{n+1} = (\mu Q)_n. $$
Namely, one arrives solely in this case at the $(1 - x) \ln x$ type expression under the sum, with $x = (\mu P)_n/(\lambda P)_{n+1}$. But this means that the reference distribution, to which any initial distribution approaches if evolving according to the master equation, is the detailed balance distribution! It can be shown that by virtue of the master equation the distance $\sigma(Q|P)$ is also decreasing term by term. The symmetrically summed Kullback-Leibler distance to the detailed balance distribution is therefore also reduced: $\dot{\rho}(P, Q) \le 0$.
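The shrinking of the symmetrized Kullback-Leibler distance can be illustrated numerically. The sketch below is not the method of the text: it replaces the continuous-time master equation by explicit Euler steps (an assumption), and it uses hypothetical rates $\lambda_n = n$, $\mu_n = a$ on a truncated state space. For a time step with $\Delta t \, (\lambda_n + \mu_n) \le 1$ each Euler step is a stochastic map that leaves $Q$ invariant, so the distance cannot grow.

```python
import math

def kullback_leibler(p, q):
    """sigma(P|Q) = sum_n P_n ln(P_n / Q_n); all entries must be positive."""
    return sum(pn * math.log(pn / qn) for pn, qn in zip(p, q))

def euler_step(p, lam, mu, dt):
    """One explicit Euler step of the nearest neighbour master equation."""
    size = len(p)
    new_p = []
    for n in range(size):
        from_above = lam[n + 1] * p[n + 1] if n + 1 < size else 0.0
        from_below = mu[n - 1] * p[n - 1] if n > 0 else 0.0
        loss = (lam[n] + mu[n]) * p[n]
        new_p.append(p[n] + dt * (from_above + from_below - loss))
    return new_p

def symmetric_distance_history(p, q, lam, mu, dt, steps):
    """rho(P, Q) = sigma(P|Q) + sigma(Q|P) recorded before each step."""
    history = []
    for _ in range(steps):
        history.append(kullback_leibler(p, q) + kullback_leibler(q, p))
        p = euler_step(p, lam, mu, dt)
    return history

# Hypothetical setup: lam_n = n, mu_n = a, twenty states, uniform start.
size, a = 20, 3.0
lam = [float(n) for n in range(size)]
mu = [a] * (size - 1) + [0.0]
weights = [a**n / math.factorial(n) for n in range(size)]
Q = [w / sum(weights) for w in weights]      # detailed balance for these rates
P0 = [1.0 / size] * size
history = symmetric_distance_history(P0, Q, lam, mu, dt=0.01, steps=300)
```

The list `history` is expected to be non-increasing; the chosen `dt = 0.01` satisfies the stochastic-map bound because the largest total rate in this setup is 21.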
In principle different distance measures (based on deformed entropy formulas) can also be used to investigate this property of a linear master equation. If one considers an entropy defined by a deformed logarithm function, $S = -\sum_n P_n \ln_{\mathrm{def}} P_n$, then the Kullback-Leibler divergence and the definition of the distance between probability distributions can be modified accordingly as
$$ \rho(P, Q) = \sum_n (P_n - Q_n) \left[ \ln_{\mathrm{def}} P_n - \ln_{\mathrm{def}} Q_n \right]. $$
This measure is zero only if $P_n = Q_n$ for all $n$; otherwise it is positive if $\ln_{\mathrm{def}} P$ is a strictly increasing function, i.e., $\ln'_{\mathrm{def}} P > 0$ on the whole interval $(0, 1)$.
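The condition of the convergence to the $P_n = Q_n$ detailed balance solution then reads as
$$ \dot{\rho} = \sum_n \dot{P}_n R_n \le 0, \qquad R_n = \ln_{\mathrm{def}} P_n - \ln_{\mathrm{def}} Q_n + (P_n - Q_n) \ln'_{\mathrm{def}} P_n. \qquad (23) $$
On the other hand, from the master equation one obtains
$$ \dot{P}_n = \Gamma_n (x_{n+1} - x_n) - \Gamma_{n-1} (x_n - x_{n-1}) \qquad (25) $$
with $x_n = P_n/Q_n$ and $\Gamma_n = (\lambda Q)_{n+1} = (\mu Q)_n$. Substituting Equation (25) into Equation (23) and re-arranging the summation indices, we finally arrive at the requirement
$$ \dot{\rho} = -\sum_n \Gamma_n (x_{n+1} - x_n)(R_{n+1} - R_n) \le 0. $$
This is satisfied if $R_n$ is a strictly increasing function of $P_n$, which means that $R'_n = \partial R_n/\partial P_n > 0$ for all possible $P_n$ and $Q_n$ values; $\dot{\rho} = 0$ being realized only if $x_{n+1} = x_n = 1$ (the last equality is due to the normalization $\sum_n P_n = \sum_n Q_n = 1$). The condition for approaching the detailed balance distribution then finally reads as
$$ 2 \ln'_{\mathrm{def}} P_n + (P_n - Q_n) \ln''_{\mathrm{def}} P_n > 0. \qquad (27) $$
It is clear that this is a more detailed restriction than just the concavity of the entropy formula. The $Q$-independent part of this expression equals $(P \ln_{\mathrm{def}} P)''$, whose positivity is the well-known concavity requirement for the entropy formula. In cases when also $\ln''_{\mathrm{def}} P < 0$, this concavity suffices for the uniform approach to the stationary distribution. In the opposite case we arrive at the nontrivial constraint $(P \ln_{\mathrm{def}} P)'' > \ln''_{\mathrm{def}} P$ at $Q_n = 1$.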
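A simple way to probe the evolution condition for a given deformed logarithm is to scan it on a grid of $(P_n, Q_n)$ values. The following sketch assumes the condition takes the form $2 \ln'_{\mathrm{def}} P + (P - Q) \ln''_{\mathrm{def}} P > 0$, as implied by the statement that its $Q$-independent part equals $(P \ln_{\mathrm{def}} P)''$; the helper names and the grid resolution are illustrative choices, and a grid scan can only suggest, not prove, that the condition holds everywhere.

```python
def evolution_condition(p, q, dlog, d2log):
    """2 ln'_def(P) + (P - Q) ln''_def(P), required to be positive."""
    return 2.0 * dlog(p) + (p - q) * d2log(p)

def minimum_on_grid(dlog, d2log, points=200):
    """Smallest value of the condition for P in (0, 1] and Q in [0, 1]."""
    smallest = float("inf")
    for i in range(1, points + 1):
        p = i / points
        for j in range(points + 1):
            q = j / points
            smallest = min(smallest, evolution_condition(p, q, dlog, d2log))
    return smallest

# Classical logarithm: ln' P = 1/P and ln'' P = -1/P^2,
# so the condition reduces to (P + Q) / P^2, which is positive.
classical_minimum = minimum_on_grid(lambda p: 1.0 / p, lambda p: -1.0 / p**2)
```

For the classical logarithm the minimum on the grid is attained at $P = 1$, $Q = 0$, where the expression equals one.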
Let us now test whether some familiar suggestions for deformed logarithms satisfy this constraint. The Tsallis logarithm is defined as
$$ \ln_{\mathrm{def}} P_n = \frac{P_n^{q-1} - 1}{q - 1}. \qquad (28) $$
It results in $\ln'_{\mathrm{def}} P_n = P_n^{q-2} > 0$ and, due to Equation (27), the final constraint is
$$ q P_n + (2 - q) Q_n > 0. $$
This is fulfilled for arbitrary $(P, Q)$ probability distribution pairs if $0 \le q \le 2$; otherwise it might be violated. The classical logarithm is reconstructed when $q = 1$.
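The Tsallis case can be spot-checked numerically. The sketch below assumes the convention $\ln_{\mathrm{def}} P = (P^{q-1} - 1)/(q - 1)$, which is consistent with the quoted derivative $P^{q-2}$, and compares the general condition $2 \ln'_{\mathrm{def}} P + (P - Q) \ln''_{\mathrm{def}} P$ with the closed form $P^{q-3} [q P + (2 - q) Q]$; the function names and the grid are illustrative.

```python
def tsallis_condition(p_n, q_n, q):
    """2 ln'_def(P) + (P - Q) ln''_def(P) for the Tsallis logarithm."""
    first = p_n ** (q - 2.0)                    # ln'_def P
    second = (q - 2.0) * p_n ** (q - 3.0)       # ln''_def P
    return 2.0 * first + (p_n - q_n) * second

def tsallis_closed_form(p_n, q_n, q):
    """Equivalent form P^(q-3) * (q P + (2 - q) Q)."""
    return p_n ** (q - 3.0) * (q * p_n + (2.0 - q) * q_n)

def holds_on_grid(q, points=100):
    """True if the condition is positive for all grid pairs (P, Q)."""
    for i in range(1, points + 1):
        for j in range(points + 1):
            if tsallis_condition(i / points, j / points, q) <= 0.0:
                return False
    return True

inside = [holds_on_grid(q) for q in (0.5, 1.0, 1.5)]    # within 0 <= q <= 2
outside = [holds_on_grid(q) for q in (-0.5, 2.5)]       # outside that range
```

On this grid the condition holds for the sampled values inside $0 \le q \le 2$ and fails for the two sampled values outside: for $q = 2.5$ at small $P_n$ with $Q_n = 1$, and for $q = -0.5$ at $Q_n = 0$.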
The kappa-exponential, promoted by Kaniadakis [36], belongs to the deformed logarithm
$$ \ln_{\mathrm{def}} P_n = \frac{P_n^{\kappa} - P_n^{-\kappa}}{2\kappa}. $$
In this case one obtains
$$ P_n^{\kappa-2} \left[ (1 + \kappa) P_n + (1 - \kappa) Q_n \right] + P_n^{-\kappa-2} \left[ (1 - \kappa) P_n + (1 + \kappa) Q_n \right] > 0 \qquad (31) $$
as the condition for the correct evolution towards the detailed balance solution by the linear master equation. This is fulfilled universally (i.e., for arbitrary $P_n$ and $Q_n$ in the physical interval $[0, 1]$) as long as $\kappa^2 \le 1$. One obtains this result as follows. Both square bracket expressions in Equation (31) are non-negative if $-1 \le \kappa \le +1$, and so is their weighted sum. Otherwise there are $P_n$ values near zero for which one of the summands, either due to $(1 - \kappa) < 0$ or due to $(1 + \kappa) < 0$, diverges to negative infinity, spoiling the inequality in this way. The classical logarithm is reconstructed for $\kappa = 0$, and in order to have the same power-law tail in canonical distributions $\kappa = q - 1$. The condition for universal evolution translates to $0 \le q \le 2$.
By now there is a much bigger variety of suggested entropy formulas, many of them having the trace form using a deformed logarithm function [37–39]. Here we analyse one of them, the $(c, d)$ entropy suggested by Hanel and Thurner [8], since it includes a number of different entropy formulas suggested earlier as particular cases. It is given as
$$ S_{c,d} = \sum_i \left[ A \, \Gamma(d + 1, 1 - c \ln p_i) - B \, p_i \right] $$
with
$$ \Gamma(d + 1, z) = \int_z^{\infty} t^{d} e^{-t} \, dt $$
being the incomplete gamma function. The coefficients $A$ and $B$ can be determined by considering the equiprobability case, $p_i = 1/W$, resulting in $S^{\mathrm{eq.prob.}}_{c,d} = K(\ln W)$, and applying the conditions $\ln_{\mathrm{def}} 1 = -K(0) = 0$ and $\ln'_{\mathrm{def}} 1 = K'(0) = 1$. By this procedure one obtains $A = 1/(\Gamma_1 + c \Gamma'_1)$ and $B = A \Gamma_1$, with $\Gamma_1 = \Gamma(d + 1, 1)$ and $\Gamma'_1 = -1/e$. The corresponding deformed logarithm of the probability, based on the form Equation (21), is
$$ \ln_{\mathrm{def}} P_n = B - \frac{A}{P_n} \, \Gamma(d + 1, 1 - c \ln P_n). $$
Here, for instance, the $c = q$, $d = 0$ choice gives the Tsallis logarithm, Equation (28), with $A = e/(1 - c)$ and $B = 1/(1 - c)$.
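The reduction of the $(c, d)$ logarithm to the Tsallis logarithm at $d = 0$ can be verified with a few lines of code. This is a sketch restricted to $d = 0$, where $\Gamma(1, z) = e^{-z}$ and no special-function library is needed. It assumes the form $\ln_{\mathrm{def}} P = B - A \, \Gamma(d + 1, 1 - c \ln P)/P$, which follows from $\ln_{\mathrm{def}} P = -K(-\ln P)$ together with the quoted coefficients, and the Tsallis convention $(P^{q-1} - 1)/(q - 1)$.

```python
import math

def ht_constants_d0(c):
    """Coefficients A, B of the (c, d) entropy for d = 0 (requires c != 1)."""
    gamma_1 = math.exp(-1.0)            # Gamma(1, 1) = exp(-1)
    dgamma_1 = -math.exp(-1.0)          # derivative in the second argument at 1
    A = 1.0 / (gamma_1 + c * dgamma_1)
    B = A * gamma_1
    return A, B

def ht_log_d0(p, c):
    """Assumed (c, d) deformed logarithm at d = 0: B - A * Gamma(1, z) / P."""
    A, B = ht_constants_d0(c)
    z = 1.0 - c * math.log(p)
    return B - A * math.exp(-z) / p     # Gamma(1, z) = exp(-z)

def tsallis_log(p, q):
    """Tsallis logarithm (P^(q-1) - 1) / (q - 1)."""
    return (p ** (q - 1.0) - 1.0) / (q - 1.0)

# Largest deviation between the two logarithms for a few sample values.
max_deviation = max(
    abs(ht_log_d0(p, c) - tsallis_log(p, c))
    for c in (0.3, 0.7)
    for p in (0.1, 0.5, 0.9)
)
```

The deviation should vanish up to rounding, and `ht_constants_d0(c)` should return $A = e/(1 - c)$ and $B = 1/(1 - c)$. For general $d$ an incomplete gamma routine from a numerical library would be needed, which is left out of this sketch.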
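The first and second derivatives of the deformed logarithm are given as
$$ \ln'_{\mathrm{def}} P_n = \frac{A}{P_n^2} \left( \Gamma + c \, \Gamma' \right) $$
and
$$ \ln''_{\mathrm{def}} P_n = -\frac{A}{P_n^3} \left( 2 \Gamma + 3 c \, \Gamma' + c^2 \Gamma'' \right). $$
Here we used the abbreviations $\Gamma$, $\Gamma'$ and $\Gamma''$ without explicitly writing out the arguments $d + 1$ and $1 - c \ln P_n$; the primes denote derivatives with respect to the second argument. All this leads to the following criterion for the term by term approach to the stationary distribution $Q_n$ by a general $P_n$ via the diffusion-like master equation:
$$ \frac{A}{P_n^3} \left[ Q_n \left( 2 \Gamma + 3 c \, \Gamma' + c^2 \Gamma'' \right) - c P_n \left( \Gamma' + c \, \Gamma'' \right) \right] > 0. \qquad (39) $$
Suppose now that $A > 0$. This requires $e \Gamma(d + 1, 1) > c$. (In the above $d = 0$ example that translates to $c < 1$.) Furthermore, considering the extreme case $Q_n = 0$ (equivalent to the concavity condition) we deal with
$$ -c \left( \Gamma' + c \, \Gamma'' \right) > 0. $$
This is fulfilled for $0 < c < 1$ and $d > 1 - 1/c$. Once this is so, observing that the left-hand side of the inequality Equation (39) is linear in $Q_n$, the condition has to be checked only at the other endpoint of the $Q_n$ interval. At $Q_n = 1$ we have:
$$ 2 \Gamma + 3 c \, \Gamma' + c^2 \Gamma'' - c P_n \left( \Gamma' + c \, \Gamma'' \right) > 0. $$
Although this criterion looks involved, we note that at $P_n = 1$ it simplifies to $1/A > 0$, i.e., $\Gamma(d + 1, 1) > c/e$, as supposed above. A similar analysis can be done for the opposite, $A < 0$ case.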
Finally, the recently suggested doubly logarithmic entropy formula, designed for extremely large fluctuations in a reservoir by Biro et al. [23,24], considers
$$ \ln_{\mathrm{def}} P_n = -\ln(1 - \ln P_n) $$
as the deformed logarithm, leading to
$$ 2 P_n - (P_n + Q_n) \ln P_n > 0 $$
for the evolution condition $\dot{\rho} < 0$. This is fulfilled for any pair of distributions, since $\ln P_n < 0$ is always true.
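The log-log case can be checked in the same spirit. The sketch below assumes the deformed logarithm $\ln_{\mathrm{def}} P = -\ln(1 - \ln P)$, which satisfies $\ln_{\mathrm{def}} 1 = 0$ and $\ln'_{\mathrm{def}} 1 = 1$, and the general condition $2 \ln'_{\mathrm{def}} P + (P - Q) \ln''_{\mathrm{def}} P > 0$. Under these assumptions the condition equals $[2P - (P + Q) \ln P]/[P (1 - \ln P)]^2$, which is positive because $\ln P \le 0$.

```python
import math

def loglog(p):
    """Assumed doubly logarithmic deformed logarithm -ln(1 - ln P)."""
    return -math.log(1.0 - math.log(p))

def loglog_condition(p, q):
    """2 ln'_def(P) + (P - Q) ln''_def(P) with analytic derivatives."""
    big_l = 1.0 - math.log(p)                   # L = 1 - ln P >= 1 on (0, 1]
    first = 1.0 / (p * big_l)                   # ln'_def P
    second = math.log(p) / (p * big_l) ** 2     # ln''_def P
    return 2.0 * first + (p - q) * second

def loglog_closed_form(p, q):
    """(2 P - (P + Q) ln P) / (P (1 - ln P))^2."""
    big_l = 1.0 - math.log(p)
    return (2.0 * p - (p + q) * math.log(p)) / (p * big_l) ** 2

points = 100
grid_minimum = min(
    loglog_condition(i / points, j / points)
    for i in range(1, points + 1)
    for j in range(points + 1)
)
```

On the grid the minimum stays positive, in line with the statement that no parameter restriction arises for this formula.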

Conclusions
By now non-extensive entropy formulas are widely used. A review of all areas of physics where a power-law tailed canonical distribution occurs can be found in [5]. Furthermore, applications to non-Gaussian velocity distributions in dusty plasmas can be found in [40,41]. In this paper we have concentrated on a mathematical problem in the background: testing generalized entropy formulas with respect to whether they serve as a basis for a well-behaved distance measure which describes an approach to the stationary distribution of state-dependent diffusion-like master equations. We have found that, in some parameter range of the suggested modern entropy formulas, the distance measure based on the difference of deformed logarithms of the respective distributions behaves well, shrinking uniformly, term by term, during the time evolution towards the stationary distribution of general, state-dependent diffusion-like master equations. In particular, while the Tsallis entropy and other generalized entropy formulas (involving further deformed logarithm functions) guarantee the above term by term approach only in a restricted range of parameters, the extreme case belonging to a diverging variance in the temperature superstatistics, with a log-log form entropy formula, behaves universally well in this respect.