Some Non-Obvious Consequences of Non-Extensiveness of Entropy

Non-additive (or non-extensive) entropies have long been intensively studied and used in various fields of scientific research. This was due to the desire to describe the commonly observed quasi-power rather than the exponential nature of various distributions of the variables of interest when considered in the full available space of their variability. In this work we will concentrate on the example of high energy multiparticle production processes and will limit ourselves to only one form of non-extensive entropy, namely the Tsallis entropy. We will discuss some points not yet fully clarified and present some non-obvious consequences of non-extensiveness of entropy when applied to production processes.


Introduction
Entropy plays an important role in the study of the production mechanism of elementary particles observed in hadronic and nuclear collisions. This is the case both in the modelling of these processes based on thermodynamics (that is, on the description of distributions of all kinds of observables characterizing multiparticle production processes) and in their description in the language of statistical models (i.e., mainly on the description of their multiplicity distributions).
Over time, more and more new experimental results appeared, which began clearly to indicate that the originally used Boltzmann entropy (in the first case) or Shannon entropy (treated as a measure of information in the second), did not describe the results in the entire range of measured values. Experimentally observed distributions depart from the expected exponential form (in the first case) and from the Poissonian distribution (in the second) [1,2]. This was generally taken as an indication that different mechanisms operate, resulting in the occurrence of various types of correlations and fluctuations, and these do not fit into the scheme of equilibrium thermodynamics or the Shannon information measure [3]. This meant that it was necessary either to add appropriate conditions to the definition of the Boltzmann-Shannon entropy used, or to extend the very concept of entropy so that in its new form it could be applied to more complex systems without any additional conditions (their operation would be replaced by a new form of the entropy formula and by some new parameters appearing in it).
A multitude of new definitions of entropy and related measures of information have appeared in various fields of science (see, for example, [3][4][5][6][7] and references cited therein). In most cases, their distinguishing feature is their non-extensiveness. Here we will consider only the case of the Tsallis entropy [5], $S_q$, which for $q = 1$ becomes the Boltzmann-Shannon entropy, $S = S_{q=1}$:

$$S_q = \frac{1}{q-1}\left[1 - \int dx\, f^q(x)\right] \;\xrightarrow{\,q \to 1\,}\; S = -\int dx\, f(x) \ln f(x), \qquad (1)$$

and which is currently the most widely used to describe the particle production processes mentioned above. (In fact, the Tsallis entropy had been introduced independently earlier and was then rediscovered by Tsallis in thermodynamics [8,9]. It should also be mentioned that, from the point of view of information theory, the entropies $S = S_{q=1}$ and $S_q$ are related to different, specific ways of collecting information about the object of interest [10]. This observation has recently been used in cognitive science [11].) The reason for this is the quasi-power nature of the Tsallis distribution $f_q(x)$ that is obtained from it,

$$f_q(x) = (2-q)\exp_q(-x) = (2-q)\left[1 - (1-q)x\right]^{\frac{1}{1-q}}, \qquad (2)$$

and, as was shown a long time ago in [12][13][14], it is this type of distribution that is most suitable for describing the distributions of various variables in the full observable range of their variability. In fact, there are a variety of systems that do not comply with the standard equilibrium theory and that fit the description based on non-extensive entropy, which suggests that the entropic index $q$ could be a convenient way of quantifying some relevant aspects of complexity [5].

The Tsallis distribution is obtained by maximizing the Tsallis entropy using some constraints imposed on the distribution function sought. It turns out that in the commonly used version this procedure leads to a rather surprising result, namely that the non-extensiveness parameter $q$ appearing in the definition of entropy is, in a sense, dual to the non-extensiveness parameter $q'$ obtained from the description of the observed distributions.
As we show in Section 2, this result is confirmed by the simultaneous analysis of multiparticle production processes in nucleon and nuclear collisions. In Section 3 we show how this problem of duality can be avoided by properly redefining the functions $\exp_q(x)$ and $\ln_q(y)$.
The Tsallis entropy $S_q$ is nonadditive, namely

$$S_q(A, B) = S_q(A) + S_q(B) + (1-q)\, S_q(A)\, S_q(B), \qquad (3)$$

where $A$ and $B$ are two systems independent in the sense that $f(AB) = f(A) f(B)$, and the parameter $q$ is simply a measure of the degree of this non-additivity (note that we tacitly assume here and in all subsequent considerations that $q$ is the same in both systems). If, hypothetically, we extended this reasoning to a system of $\nu$ independent components (again, with the same $q$), then we would have some kind of non-linear non-additivity (in the parameter $q$), because now

$$S_q^{(\nu)} = \sum_{k=1}^{\nu} \binom{\nu}{k} (1-q)^{k-1} \left[S_q\right]^k = \frac{\left[1 + (1-q) S_q\right]^{\nu} - 1}{1-q}. \qquad (4)$$

To better understand the role of the parameter $q$, let us additionally consider the non-additive versions of conditional probability and conditional entropy. Let us say that the considered system can be divided into two subsystems, $A$ and $B$, and that $p_{ij}(A, B)$ is the joint normalized probability of finding $A$ in state $i$ and $B$ in state $j$. Then the conditional probability of $B$, given that $A$ is in the $i$-th state, $p_{ij}(B|A)$, is given by Bayes' multiplication law,

$$p_{ij}(A, B) = p_i(A)\, p_{ij}(B|A),$$

and the corresponding conditional Shannon entropy is

$$S(B|A) = S(A, B) - S(A).$$

By analogy to Equation (3) we can now write the corresponding conditional non-additive Tsallis entropy as

$$S_q(A, B) = S_q(A) + S_q(B|A) + (1-q)\, S_q(A)\, S_q(B|A),$$

where $S_q(B|A)$ denotes the conditional Tsallis entropy of $B$ given $A$ (note that because $S_q(B|A) \le S_q(B)$ one must have $q \ge 1$). This allows us to interpret the nonextensivity parameter $q$ in terms of the conditional entropy as

$$q = 1 + \frac{S_q(A) + S_q(B|A) - S_q(A, B)}{S_q(A)\, S_q(B|A)},$$

which turns out to be crucial for nonadditive (quantum) information theory [15].

In practical applications, the non-extensiveness of the entropy manifests itself in the quasi-power character of the distributions obtained from it, i.e., in the case considered here, in the appearance of the non-extensiveness parameter in the Tsallis distribution. However, there is a problem here that we discuss in Sections 2 and 3, namely that for a certain type of constraints, the parameter $q$ in the definition of entropy and the parameter $q'$ in the Tsallis distribution are not identical but dual to each other, i.e., $q + q' = 2$.
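The non-additivity described above can be checked numerically. The following is only a minimal sketch: it assumes the standard discrete form $S_q = (1 - \sum_i p_i^q)/(q - 1)$, and the probability vectors and the value of $q$ are arbitrary illustrative choices, not values taken from any data. It compares the entropy of a product distribution with the pseudo-additive rule for two systems, and the entropy of three identical independent copies with the closed form $\{[1 + (1-q) S_q]^{\nu} - 1\}/(1-q)$.

```python
import math


def tsallis_entropy(p, q):
    """Discrete Tsallis entropy S_q = (1 - sum p_i^q) / (q - 1); Shannon for q = 1."""
    if abs(q - 1.0) < 1e-12:
        return -sum(x * math.log(x) for x in p if x > 0)
    return (1.0 - sum(x ** q for x in p)) / (q - 1.0)


def joint_independent(pa, pb):
    """Joint distribution of two independent systems: p_ij = p_i(A) * p_j(B)."""
    return [a * b for a in pa for b in pb]


def composed_entropy(s_single, q, nu):
    """Entropy of nu identical independent components (repeated pseudo-additivity)."""
    return ((1.0 + (1.0 - q) * s_single) ** nu - 1.0) / (1.0 - q)


# Illustrative (hypothetical) inputs.
q = 0.8
pa = [0.5, 0.3, 0.2]
pb = [0.6, 0.4]

sa = tsallis_entropy(pa, q)
sb = tsallis_entropy(pb, q)

# Two independent systems: direct evaluation versus the pseudo-additive rule.
s_joint = tsallis_entropy(joint_independent(pa, pb), q)
s_rule = sa + sb + (1.0 - q) * sa * sb

# Three identical independent copies of system A.
pa3 = joint_independent(joint_independent(pa, pa), pa)
s_three_direct = tsallis_entropy(pa3, q)
s_three_rule = composed_entropy(sa, q, 3)

print("two systems  :", s_joint, s_rule)
print("three copies :", s_three_direct, s_three_rule)
```

For $q < 1$ the joint entropy exceeds the sum of the individual entropies, and for $q > 1$ it falls below it; changing `q` in the sketch shows both cases.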
Usually, the meaning of the non-extensiveness parameter is related to Tsallis distributions rather than to entropy as above. These, in turn, can be obtained in many ways, depending on the details of the described physical process and even from the Shannon entropy, if only the appropriate constraints are applied. We discuss this issue in more detail in Section 4. Section 5 contains our summary and conclusions.

From Tsallis Entropy to Tsallis Distribution
The Tsallis distribution (2) (valid for $0 \le x < \infty$ and $1 \le q \le 3/2$) is obtained by maximizing the Tsallis entropy (1) using the following constraints [16]:

$$\int dx\, f(x) = 1, \qquad \int dx\, x\, f^q(x) = \langle x \rangle_q. \qquad (10)$$

In most cases, it is this form of distribution that is used phenomenologically to describe the various distributions measured in high-energy multiparticle production experiments (with $x = X/T$, where the scaling factor $T$ is usually identified with the temperature and $X$ denotes the energy or momentum of the measured particles; $T$ also appears in the normalization as $1/T$). As shown in Figure 1, using this form of the Tsallis distribution one obtains $q > 1$ from measurements of different observables (rapidity, multiplicity and transverse momentum) at high enough energies (at low energies, conservation laws are important and they can sometimes push the parameter $q$ into the $q < 1$ region). In addition, note that the values of $q$ obtained from different observables are different (but always $q > 1$). These differences are due to two factors. The first is whether $q$ is estimated from the temperature fluctuations obtained from data already averaged over other fluctuations or from data that take other fluctuations into account as well. The second is that in different analyses $q$ is obtained in different regions of the phase space.

Figure 1. $\sqrt{s}$ dependence of the parameter $q$ obtained from different observables. Squares: $q$ obtained from multiplicity distributions $f(N)$ [17,18] (fitted by $q = 0.88 + 0.063 \ln\sqrt{s}$). Circles: $q$ obtained from different analyses of the transverse momentum distributions $f(p_T)$; data points are, respectively, from a compilation of $p + p$ data (full symbols) [19] and from CMS data (half-filled circles at high energies) [20,21] (fitted by $q = 0.95 + 0.021 \ln\sqrt{s}$). Triangles: $q$ obtained from analyses of rapidity distributions $f(y)$ [22,23], also shown with a fitted curve.

However, this is not the only possible choice of constraints.
Instead, using constraints in the form

$$\int dx\, f(x) = 1, \qquad \int dx\, x\, f(x) = \langle x \rangle, \qquad (11)$$

which seems to be more natural from the point of view of physical interpretation (the constraint is imposed on the ordinary mean value of $x$), we obtain [16]

$$f(x) = q\left[1 - (q-1)x\right]^{\frac{1}{q-1}}. \qquad (12)$$

These two different definitions pertain to two different schemes of nonextensive statistical mechanics [24]. It should be noted that [25] proposes a parametric technique that shows the equivalence of different schemes (including those discussed here), and [26] once again shows the relationship of both averaging schemes (i.e., Equations (10) and (11)) with the duality $q \leftrightarrow 1/q$. Now note that for

$$q' = 2 - q \qquad (13)$$

the distribution $f(x)$ from Equation (12) becomes $f(x)$ from Equation (2), with $q'$ playing the role of the non-extensiveness parameter. (In addition to the additive duality represented by Equation (13), the multiplicative duality, $q \leftrightarrow 1/q$, was also considered [27,28], including the potential physical application of a combination of both types of duality to the study of cosmic ray physics.) This means that the imposition of these constraints leads to a situation in which the non-extensiveness parameter $q$ appearing in the definition of entropy is dual to the non-extensiveness parameter $q'$ obtained from describing the observed distributions.

The problem of this duality has been raised many times (for example in [29][30][31]), but it does not seem to have been put to an experimental test yet, at least not in the field of multiparticle production. It turns out, however, that experiments measuring the multiplicities and distributions of particles produced in nuclear (AA) and nucleon (nn) collisions are very useful for this purpose. They simultaneously measure the multiplicities (enabling an estimate of the entropy produced) and the particle distributions, and thus allow for the simultaneous determination and comparison of the two non-extensiveness parameters mentioned above, and hence for a verification of the hypothesis of their duality. Nuclear collisions are usually described by increasingly complex statistical models that try to account for all possible collective effects [32][33][34].
For our purposes, however, it is the mutual relation between the entropies of AA and nn collisions that matters. To estimate the entropy in a nuclear collision it is therefore more convenient to use a phenomenological description based on the assumption that the collision can be described as a certain superposition of collisions of single nucleons (taking into account only nucleons that collided at least once and assuming that their collisions are independent; these are the so-called "wounded nucleons") [35]. (The reason for this choice is that, despite its apparent simplicity, this model is still able to describe a surprisingly large number of experimental results [36,37].)
In this approach, the total observed multiplicity $N$ is the sum of the multiplicities $n_{i=1,\dots,\nu}$ of particles emitted from $\nu$ individual sources, and the average total multiplicity $\langle N \rangle$ is the product of the average number of sources, $\langle \nu \rangle$, and the average multiplicity from a source, $\langle n_i \rangle$ (which here is assumed to be the same for each source):

$$\langle N \rangle = \langle \nu \rangle \langle n_i \rangle. \qquad (14)$$

The identity of the sources assumed here means that their entropies are equal, so using the relationship (4) the entropy of $\nu$ such sources is

$$S_q^{(\nu)} = \frac{\left[1 + (1-q) S_q\right]^{\nu} - 1}{1-q}. \qquad (15)$$

In further considerations, $\nu$ will denote the number $N_P$ of nucleons of the incident nucleus participating in the collision (i.e., participants), with $\nu = N_W/2$, where $N_W$ is the number of wounded nucleons. Continuing in the same vein and assuming that the total entropy is proportional to the average multiplicity of particles produced in the collision, we can relate the average multiplicities in nuclear (AA) and nucleon (NN) collisions (Equation (17)). This simple dependence already allows for some preliminary assessment of the parameter $q$. It turns out that the observed $\langle N_{AA} \rangle$ grows non-linearly with $N_P$, $\langle N_{AA} \rangle > N_P \langle N_{pp} \rangle$ [38]. Considering this observation from the point of view of entropy, it is clear that we must have $q < 1$ here (for $q < 1$ the entropy of $\nu$ sources in Equation (15) exceeds $\nu$ times the entropy of a single source). However, this is only a very rough estimate, because, strictly speaking, formula (17) is not fully correct with respect to the $S_q$ entropy.

We will therefore return to Equation (15), denoting now the entropy for the whole particle production process by $s$ and the corresponding non-extensiveness parameter by $\tilde{q}$, and their equivalents for nucleon collisions by $S$ and $q$, respectively. The relation (15) for $N$ particles now looks like this:

$$s_{\tilde{q}}^{(N)} = \frac{\left[1 + (1-\tilde{q})\, \alpha\right]^{N} - 1}{1-\tilde{q}},$$

where $s^{(1)}_{\tilde{q}} = \alpha$ is the entropy for a single particle. In an A + A collision with $\nu$ nucleons participating, Equation (15) results in

$$S_q^{(\nu)} = \frac{\left[1 + (1-q) S_q\right]^{\nu} - 1}{1-q},$$

where $S_q$ is the entropy for a single nucleon.
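Denoting the multiplicity in a single N + N collision by $n$, one can write the respective entropy as

$$s_{\tilde{q}}^{(n)} = \frac{\left[1 + (1-\tilde{q})\, \alpha\right]^{n} - 1}{1-\tilde{q}},$$

whereas the entropy in A + A collisions for $N$ produced particles is

$$s_{\tilde{q}}^{(N)} = \frac{\left[1 + (1-\tilde{q})\, \alpha\right]^{N} - 1}{1-\tilde{q}}.$$

Together with the preceding relations, this connects the nucleon entropy $S_q$ with the entropy of the produced particles (Equation (22)). The parameters $q$ and $\tilde{q}$ are usually not the same. However, from the analyses in [38,39] one obtains that for NN collisions (where $N_P = 1$) $\tilde{q} = 1$. On the other hand, for $\tilde{q} = q$ Equation (22) corresponds to the situation encountered in superpositions, as now one obtains

$$N = \nu \cdot n.$$

In the general case, we obtain a formula for the ratio $N/(\nu \cdot n)$ expressed through two quantities, $c_1$ and $c_2$, which for $N = \langle N_{AA} \rangle$, $n = \langle N_{pp} \rangle$ and $\nu = N_P$ is presented in Figure 2 for different reactions (see [40] for more details). Note that for energies $\sqrt{s} > 7$ GeV one has $c_1 > 1$. This means that $\tilde{q} < 1$ and (because $c_2 > 0$) also $q < 1$, thereby confirming the previous estimates based on Equation (17).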
This, however, is as much as can be said for sure: while the distributions can give exact values of the parameter $q'$, the same cannot be said about $q$, except that $q < 1$ (at least in a certain energy range). We still have too many free parameters here, e.g., the a priori unknown entropy $s^{(1)}_{\tilde{q}}$. Therefore, while the statement that we mostly have $q' > 1$ and $q < 1$ seems certain, it is not known how exactly (if at all) the duality $q + q' = 2$ (13) is satisfied.

More Thorough Screening of Duality
We will now deal with the problem of duality in more detail. Figure 3 shows the entropies $S_q$ obtained from the distributions (12) for $0.5 < q \le 1$ (Equation (26); here, $q$ was changed to $2q - 1$) and for $1 \le q < 1.5$ (Equation (27)). Let us note that for values of $q$ outside the range of variability declared for a given entropy, $S_q < 1$, i.e., it is always lower than unity and therefore lower than the Shannon entropy. From Figure 3 it can be seen that an entropy formula $S_q$ that could be used in the entire allowable range of the parameter $q$, describing both the $q$ cases and the $2 - q$ cases dual to them, must contain both elements of (26) and (27) (Equation (28)). The corresponding Tsallis distribution (Equation (29)) is then defined for $0.5 < q < 1.5$.
A natural question arises as to what should be modified in such a case, and how. What we would like to suggest here is the use of appropriately modified definitions of the functions $\exp_q(x)$ and $\ln_q(x)$, namely to replace $\exp_q(x)$ defined in Equation (2) by the form given in Equation (30) and, accordingly, $\ln_q(x)$ by the form given in Equation (31). This form works for all values of $x$ and $q$, and there are no additional restrictions on the admissible values of the parameter $q$ depending on whether $x > 0$ or $x < 0$. Formally, this corresponds to replacing $q \to q' = 2 - q$ when changing the sign of $x$. Figure 4 shows the behaviour of the functions $\exp_q(x)$ and $\ln_q(x)$. Note that using this form we now have

$$\exp_q(-x) \cdot \exp_q(x) = 1, \qquad (32)$$

and the occupation numbers of particles, $n_q(x)$, and antiparticles, $n_q(-x)$, satisfy the relation

$$n_q(-x) + n_q(x) = -\zeta \qquad (33)$$

for all values of $q$ ($\zeta = +1$ for bosons and $-1$ for fermions). A naive replacement of the Euler exponential with another, deformed exponential function (namely the one given by Equation (2)) can lose the particle-hole symmetry inherent in the traditional Fermi distribution above and below the Fermi level. Previously, these relationships had a dual form, $\exp_q(-x) \cdot \exp_{2-q}(x) = 1$ and $n_q(x) + n_{2-q}(-x) = -\zeta$. This means that such an approach not only avoids the problem of duality discussed earlier in Section 2, but also preserves the particle-hole symmetry of the distribution above and below the Fermi level, which is fundamental in field theory and was discussed in [42,43].
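The dual relation quoted above for the original definitions, $\exp_q(-x) \cdot \exp_{2-q}(x) = 1$, is easy to verify numerically. The sketch below assumes the standard forms $\exp_q(x) = [1 + (1-q)x]^{1/(1-q)}$ and $\ln_q(y) = (y^{1-q} - 1)/(1-q)$; the values of $q$ and $x$ are arbitrary illustrative choices. It deliberately does not implement the modified functions proposed in this section, only the standard ones they are meant to replace.

```python
import math


def exp_q(x, q):
    """Standard q-exponential [1 + (1 - q) x]^(1 / (1 - q)); exp(x) for q = 1."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        raise ValueError("x lies outside the domain of exp_q for this q")
    return base ** (1.0 / (1.0 - q))


def ln_q(y, q):
    """Standard q-logarithm (y^(1 - q) - 1) / (1 - q); log(y) for q = 1."""
    if abs(q - 1.0) < 1e-12:
        return math.log(y)
    return (y ** (1.0 - q) - 1.0) / (1.0 - q)


# Illustrative (hypothetical) values.
q = 1.2
x = 2.0

# Dual relation: the partner function carries the index 2 - q.
dual_product = exp_q(-x, q) * exp_q(x, 2.0 - q)

# With the same index on both factors the product differs from unity.
same_q_product = exp_q(-x, q) * exp_q(x, q)

# ln_q is the inverse of exp_q for the same index.
round_trip = ln_q(exp_q(x, q), q)

print("exp_q(-x) * exp_{2-q}(x) =", dual_product)
print("exp_q(-x) * exp_q(x)     =", same_q_product)
print("ln_q(exp_q(x))           =", round_trip)
```

The second product shows why the standard deformed exponential alone does not reproduce the relation $\exp(-x) \cdot \exp(x) = 1$ of the ordinary exponential.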
In the above considerations, we must remember that the modified functions $\exp_q(x)$ and $\ln_q(y)$ are not differentiable everywhere, because the functions $\mathrm{sign}(x)$ (in the first case) and $\mathrm{sign}(1 - y)$ (in the second) have a discontinuity at $x = 0$ and $y = 1$, respectively. Therefore, by their derivatives at $x = 0$ (or $y = 1$) we understand their limits for $x \to 0$ (or $y \to 1$). In this approach, the first derivatives of $\exp_q(x)$ and $\ln_q(y)$ at $x = 0$ and $y = 1$ are the same as the first derivatives of $\exp(x)$ and $\ln(y)$, while the limits of their $n$-th derivatives, $\lim_{x \to 0} d^n \exp_q(x)/dx^n$ and $\lim_{y \to 1} d^n \ln_q(y)/dy^n$, already depend on $q$.

Other Sources of Tsallis Distribution
Note that since Equation (2) describes the data in the entire measured area of phase space, i.e., both those associated with the thermal approach and those associated with hard collisions, the justification of this formula cannot be reduced to the Tsallis entropy only. It is worth noting that for each probability distribution an appropriate form of entropy can be given, and for each probability distribution one can also give the constraints which, when used together with the Shannon entropy, lead to this probability distribution [44]. For our considerations, it is important to note that when selecting the constraints in such a way that they best take into account the most important dynamic features of the examined system, one could basically stop at the Shannon entropy [45]. For example, the condition $\langle x \rangle = \mathrm{const}$ leads to the usual exponential distribution, $\langle x^2 \rangle = \mathrm{const}$ gives a Gaussian distribution, $\langle \ln(x) \rangle = \mathrm{const}$ a gamma distribution, whereas $\langle \ln(1 + x^2) \rangle = \mathrm{const}$ gives a Cauchy distribution. In general, for some function $h(x)$, the maximum entropy density $f(x)$ satisfying the constraint $\int dx\, f(x) h(x) = \mathrm{const}$ has the form $f(x) = \exp[\lambda_0 + \lambda h(x)]$, where the parameters $\lambda_0$ and $\lambda$ are fixed by the requirement of normalization for $f(x)$ and by the above constraint. To obtain the Tsallis distribution in this way, we need to use a constraint of the form $\langle \ln[1 - (1-q)x] \rangle = \mathrm{const}$.

The Tsallis distribution understood as a quasi-power distribution can also be obtained in many ways without referring to any form of entropy [46]. We will now discuss a few of them in more detail.
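Returning briefly to the Shannon-entropy recipe $f(x) = \exp[\lambda_0 + \lambda h(x)]$, it may help to see how a quasi-power law emerges from it. The following is a short worked sketch rather than a derivation taken from the cited works; the choice $h(x) = \ln[1 + (q-1)x]$ and the range $1 < q < 2$, $0 \le x < \infty$ are assumptions made here for illustration.

```latex
\begin{aligned}
f(x) &= \exp\!\left[\lambda_0 + \lambda \ln\bigl(1 + (q-1)x\bigr)\right]
      = e^{\lambda_0}\left[1 + (q-1)x\right]^{\lambda}, \\[4pt]
\lambda &= -\frac{1}{q-1}
  \;\;\Longrightarrow\;\;
  \int_0^{\infty} dx\,\left[1 + (q-1)x\right]^{-\frac{1}{q-1}} = \frac{1}{2-q}, \\[4pt]
e^{\lambda_0} &= 2-q
  \;\;\Longrightarrow\;\;
  f(x) = (2-q)\left[1 - (1-q)x\right]^{\frac{1}{1-q}} .
\end{aligned}
```

In this reading the multiplier $\lambda$ conjugate to the logarithmic constraint fixes the power index, and hence $q$, while $\lambda_0$ only supplies the normalization factor $2 - q$.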
Superstatistics. This approach extends the exponential description, $f(E) = \frac{1}{T} \exp\left(-\frac{E}{T}\right)$, characterized by some scale parameter $T$, by allowing fluctuations of this parameter [47]. In particular, if they are described by a gamma distribution, the total result is a Tsallis distribution [29,48],

$$f_q(E) = \frac{2-q}{T}\left[1 - (1-q)\frac{E}{T}\right]^{\frac{1}{1-q}},$$

where the parameter $q$ characterizing the strength of the fluctuations of the scale parameter is given by their relative variance, $q = 1 + \omega_T^2$. Since in thermal models $\omega_T^2$ is related to the heat capacity $C_V$, one possible meaning of the parameter $q$ is its relationship to the heat capacity, $q = 1 + 1/C_V$ (note that here always $q > 1$). Other classes of generalized statistics can also be obtained, and for a small variance of the fluctuations they all behave universally [47].
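The mechanism behind superstatistics can be illustrated with a small numerical sketch. It rests on assumptions that go beyond what is stated above: the fluctuating quantity is taken to be $\beta = 1/T$, it is gamma distributed with mean $\beta_0$ and shape $1/(q-1)$, and only the Boltzmann factor $\exp(-\beta E)$ (without the $1/T$ prefactor) is averaged. Under these assumptions the average is the quasi-power factor $[1 + (q-1)\beta_0 E]^{-1/(q-1)}$, and the relative variance of $\beta$ equals $q - 1$. The function names and the numerical values are hypothetical.

```python
import math


def gamma_pdf(beta, shape, scale):
    """Gamma density, evaluated through logarithms for numerical stability."""
    log_pdf = ((shape - 1.0) * math.log(beta) - beta / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def gamma_average(func, q, beta0, n=20000, span=12.0):
    """Midpoint-rule average of func(beta) over a gamma-distributed beta.

    The gamma distribution has mean beta0 and shape 1/(q - 1), so that its
    relative variance equals q - 1. The integration is cut at span * beta0,
    which is far in the tail for values of q close to 1.
    """
    shape = 1.0 / (q - 1.0)
    scale = beta0 / shape
    h = span * beta0 / n
    total = 0.0
    for k in range(n):
        beta = (k + 0.5) * h
        total += gamma_pdf(beta, shape, scale) * func(beta)
    return total * h


def tsallis_factor(energy, q, beta0):
    """Quasi-power factor [1 + (q - 1) beta0 E]^(-1 / (q - 1))."""
    return (1.0 + (q - 1.0) * beta0 * energy) ** (-1.0 / (q - 1.0))


# Illustrative (hypothetical) values.
q = 1.1
beta0 = 1.0
energy = 3.0

mixed = gamma_average(lambda b: math.exp(-b * energy), q, beta0)
expected = tsallis_factor(energy, q, beta0)

mean_beta = gamma_average(lambda b: b, q, beta0)
mean_beta2 = gamma_average(lambda b: b * b, q, beta0)
relative_variance = mean_beta2 / mean_beta ** 2 - 1.0

print("averaged Boltzmann factor:", mixed)
print("Tsallis factor           :", expected)
print("relative variance of beta:", relative_variance, "vs q - 1 =", q - 1.0)
```

A plain midpoint rule is used instead of a library integrator only to keep the sketch dependency-free.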
Preferential attachment. This approach describes a situation where the scale parameter depends linearly on the variable under consideration, as is the case when preferential attachment correlations are encountered in the system, e.g., when $T \to T(x) = T_0 + (q-1)x$. This changes the equation defining the distribution, $df(x)/dx = -f(x)/T(x)$, resulting in the Tsallis distribution with $q > 1$ [49,50],

$$f(x) = \frac{2-q}{T_0}\left[1 - (1-q)\frac{x}{T_0}\right]^{\frac{1}{1-q}}.$$

Tsallis distribution from multiplicative noise. The Tsallis distribution may also mean that the described process has a stochastic character defined by additive, $\xi(t)$, and multiplicative, $\gamma(t)$, noise and described by a Langevin equation. The corresponding Fokker-Planck equation is then considered for stationary solutions (Equation (44)). When both noises are uncorrelated (i.e., when $\mathrm{Cov}(\xi, \gamma) = 0$) and when there is no drift caused by the additive noise (i.e., $E(\xi) = 0$), the solution to Equation (44) is the Tsallis distribution in $p^2$ [51]. The Tsallis distribution in $p$ (as in Equation (2)), and not in $p^2$, is obtained for the more complicated case of $T = T(q)$ [46]. Note that $T$ now depends non-linearly on $q$, which makes the Tsallis distribution significantly more flexible, allowing for the analysis and comparison of various types of processes (cf. [46]). At this point, it is worth noting that there is a relationship between the type of noise and the condition imposed in MaxEnt. In the case of the Shannon entropy, a condition imposed on the arithmetic mean corresponds to additive noise, while a condition imposed on the geometric mean corresponds to multiplicative noise and leads to a power distribution [52].
Conditional probability. The methods for obtaining the Tsallis distribution presented so far are basically limited to cases with $q > 1$. Cases with $q < 1$ can only be observed in constrained systems. Consider for example $N$ independent energies, $E_{i=1,\dots,N}$, each of which follows the Boltzmann distribution, $g_i(E_i) = \frac{1}{\lambda} \exp\left(-\frac{E_i}{\lambda}\right)$, and their sum, $E = \sum_{i=1}^{N} E_i$. However, if the available energy is bounded, $E = N\alpha = \mathrm{const}$, these energies will no longer be independent and will be described by conditional probabilities in the form of Tsallis distributions with $q < 1$:

$$f(E_i \,|\, E = N\alpha) = \frac{N-1}{N\alpha}\left(1 - \frac{E_i}{N\alpha}\right)^{N-2}, \qquad q = \frac{N-3}{N-2} < 1.$$

One could obtain a Tsallis-like distribution with $q > 1$ only if the scale parameter $\lambda$ fluctuates in the same way as in the case of superstatistics.

Statistical physics. A Tsallis distribution with $q < 1$ also follows from statistical physics. Consider an isolated system with energy $U = \mathrm{const}$ and $\nu$ degrees of freedom (particles). We choose one of them, with energy $E \ll U$; the rest of the system then has energy $E_r = U - E$. If this particle is in one well-defined state, then the number of states of the entire system is $\Omega(E_r)$, and the probability that the energy of the selected particle is $E$ is $P(E) \propto \Omega(U - E)$. Expanding $\ln \Omega(U - E)$ around $U$ and keeping only the first two terms, one obtains

$$P(E) \propto \exp\left(-\frac{E}{T}\right),$$

that is, a Boltzmann distribution with

$$\frac{1}{T} = \frac{\partial \ln \Omega(U)}{\partial U}.$$

However, it is usually expected that $\Omega(E_r) \propto (E_r/\nu)^{\alpha_1 \nu - \alpha_2}$ with $\alpha_1, \alpha_2 \sim O(1)$. Choosing $\alpha_1 = 1$ and $\alpha_2 = 2$ (because the number of states in the reservoir has decreased by one), one therefore has $\Omega(E_r) \propto (E_r/\nu)^{\nu - 2}$. This allows us to write the probability of selecting the energy $E$ as

$$P(E) \propto \left(1 - \frac{E}{U}\right)^{\nu - 2},$$

that is, in the form of the Tsallis distribution with $q = 1 - \frac{1}{\nu - 2} \le 1$, as in the case of conditional probability above.
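The appearance of $q < 1$ in a constrained system can be illustrated by a small simulation. The sketch below relies on a standard property that is assumed here rather than quoted from the text: independent exponential energies rescaled to a fixed sum are uniformly distributed on the simplex, so the rescaled values are samples conditioned on the total energy. The single-energy density is then $\frac{N-1}{E}\left(1 - \frac{x}{E}\right)^{N-2}$, whose tail probability is $\left(1 - \frac{x}{E}\right)^{N-1}$. The function names, sample size and seed are hypothetical choices.

```python
import random


def constrained_energy_samples(n_parts, total, n_samples, seed=1):
    """Samples of E_1 for n_parts exponential energies with a fixed sum `total`.

    Independent exponentials divided by their sum are uniform on the simplex,
    so rescaling to `total` gives draws conditioned on the total energy.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        energies = [rng.expovariate(1.0) for _ in range(n_parts)]
        samples.append(total * energies[0] / sum(energies))
    return samples


def tsallis_tail(x, n_parts, total):
    """P(E_1 > x) for the density (N-1)/E * (1 - x/E)^(N-2)."""
    return (1.0 - x / total) ** (n_parts - 1)


# Illustrative (hypothetical) setup: N = 5 energies sharing a total energy of 5.
n_parts = 5
total = 5.0
samples = constrained_energy_samples(n_parts, total, n_samples=50000)

threshold = 1.0
empirical_tail = sum(1 for e in samples if e > threshold) / len(samples)
predicted_tail = tsallis_tail(threshold, n_parts, total)
sample_mean = sum(samples) / len(samples)

# Power index N - 2 = 1 / (1 - q) corresponds to q = 1 - 1 / (N - 2).
q_effective = 1.0 - 1.0 / (n_parts - 2)

print("empirical tail :", empirical_tail)
print("predicted tail :", predicted_tail)
print("sample mean    :", sample_mean, "(expected", total / n_parts, ")")
print("effective q    :", q_effective)
```

The hard cutoff at $E_1 = E$ is what distinguishes this $q < 1$ case from the long-tailed $q > 1$ distributions obtained from fluctuations of the scale parameter.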

Summary and Conclusions
Entropy has always played an important role in the study of the production mechanisms of particles produced in high-energy hadronic and nuclear collisions, either in their description based on thermodynamics [2] or in descriptions using elements of information theory [4].
In the application of the non-extensive approach, we encounter the problem of a certain duality, manifested in the parallel occurrence of the parameters $q$ and $2 - q$, which is best illustrated by the parallel description of particle production processes in nucleon and nuclear collisions discussed in Section 2. The second manifestation of duality appears in an attempt at a non-extensive description of quantum statistical distributions. As suggested by the results of [42,43], they are inconsistent with the conventional description using Tsallis distributions (and prefer the nonextensive Kaniadakis distribution). The point here is the necessity to preserve the particle-hole symmetry, which requires that $\exp(-x) \cdot \exp(x) = 1$, whereas the original Tsallis q-exponential leads to $\exp_q(-x) \cdot \exp_{2-q}(x) = 1$. In Section 3 we propose a new formula defining the non-extensive function $\exp_q(x)$, which restores this symmetry in its nonextensive version, $\exp_q(-x) \cdot \exp_q(x) = 1$.
From a more technical perspective, it is worth noting that both the Shannon and the Tsallis entropies have the same generating function, $f(x) = \sum_i p_i^x$, and that the difference in their forms is just due to the form of the adopted differentiation operator. For standard first-order differentiation, $df(x)/dx$, we obtain the Shannon entropy, whereas adopting the Jackson q-derivative, $D_q f(x) = \frac{f(qx) - f(x)}{qx - x}$, yields the Tsallis entropy (in both cases the derivative is taken at $x = 1$ and with a minus sign). In fact, other expressions for entropy can be obtained by using yet other forms of differentiation operators [7].
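The role of the differentiation operator can be made concrete with a short numerical sketch. It assumes that both derivatives of the generating function $f(x) = \sum_i p_i^x$ are evaluated at $x = 1$ and taken with a minus sign, and it compares them with the standard discrete Shannon and Tsallis expressions; the probability vector and the value of $q$ are arbitrary illustrative choices.

```python
import math


def generating_function(p, x):
    """Generating function f(x) = sum_i p_i^x."""
    return sum(pi ** x for pi in p if pi > 0)


def jackson_derivative(f, x, q):
    """Jackson q-derivative D_q f(x) = [f(q x) - f(x)] / (q x - x)."""
    return (f(q * x) - f(x)) / (q * x - x)


def ordinary_derivative(f, x, h=1e-6):
    """Central finite-difference approximation of df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Illustrative (hypothetical) inputs.
p = [0.5, 0.3, 0.2]
q = 1.3


def f(x):
    return generating_function(p, x)


# Entropies obtained from the generating function at x = 1.
shannon_from_f = -ordinary_derivative(f, 1.0)
tsallis_from_f = -jackson_derivative(f, 1.0, q)

# Reference values from the usual definitions.
shannon_direct = -sum(pi * math.log(pi) for pi in p)
tsallis_direct = (1.0 - sum(pi ** q for pi in p)) / (q - 1.0)

print("Shannon:", shannon_from_f, shannon_direct)
print("Tsallis:", tsallis_from_f, tsallis_direct)
```

In the limit $q \to 1$ the Jackson derivative reduces to the ordinary one, which mirrors the reduction of the Tsallis entropy to the Shannon entropy.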
Author Contributions: G.W. and Z.W. contributed equally to all stages of this work: conceived the problem, calculations and preparation of the manuscript. All authors have read and agreed to the published version of the manuscript.