Approach of complexity in nature: Entropic nonuniqueness

In the 1870s, Boltzmann introduced a logarithmic measure for the connection between the thermodynamical entropy and the probabilities of the microscopic configurations of the system. His entropic functional for classical systems was extended by Gibbs to the entire phase space of a many-body system, and by von Neumann to cover quantum systems as well. Finally, it was used by Shannon within the theory of information. The simplest expression of this functional corresponds to a discrete set of $W$ microscopic possibilities, and is given by $S_{BG}= -k\sum_{i=1}^W p_i \ln p_i$ ($k$ is a positive universal constant; {\it BG} stands for {\it Boltzmann-Gibbs}). This relation enables the construction of BG statistical mechanics. The BG theory has provided countless important applications. Its application to physical systems is legitimate whenever the hypothesis of {\it ergodicity} is satisfied. However, {\it what can we do when ergodicity and similar simple hypotheses are violated}, as indeed happens in very many natural, artificial and social complex systems? In 1988, the possibility was advanced of generalizing BG statistical mechanics through a family of nonadditive entropies, namely $S_q=k\frac{1-\sum_{i=1}^W p_i^q}{q-1}$, which recovers the additive $S_{BG}$ entropy in the $q \to 1$ limit. The index $q$ is to be determined from mechanical first principles. Over three decades, this idea has evolved intensively worldwide (see Bibliography in \url{http://tsallis.cat.cbpf.br/biblio.htm}), and has led to a plethora of predictions, verifications, and applications in physical systems and elsewhere. As expected whenever a {\it paradigm shift} is explored, some controversy naturally emerged as well in the community. The present status of the general picture is described here, starting from its dynamical and thermodynamical foundations, and ending with its most recent physical applications.


Introduction
In light of contemporary physics, the qualitative and quantitative study of nature may be done at various levels, which here we refer to as microcosmos, mesocosmos and macrocosmos. At the macroscopic level, we have thermodynamics; at the microscopic level, we have mechanics (classical, quantum, relativistic mechanics, quantum chromodynamics) and the laws of electromagnetism, which in principle enable the full description of all of the degrees of freedom of the system; at the mesoscopic level, we focus on the degrees of freedom of a typical particle, representing, in one way or another, the behavior of most of the degrees of freedom of the system. The laws that govern the microcosmos, together with the theory of probabilities, are the basic constituents of statistical mechanics, a theory that establishes the connections between these three levels of description of nature. At the microscopic level, we typically address classical or quantum equations of evolution with time, trajectories in phase space, Hamiltonians, Lagrangians, among other mathematical objects. At the mesoscopic level, we address Langevin-like, master-like and Fokker-Planck-like equations. Finally, at the macroscopic level, we address the laws of thermodynamics with their concomitant Legendre transformations between the appropriate variables.
In all of these theoretical approaches, the thermodynamical entropy $S$, introduced by Clausius in 1865 [1], and its corresponding entropic functional $S(\{p_i\})$ play a central role. In a stroke of genius, the first adequate entropic functional was introduced (for what we nowadays call classical systems) by Boltzmann in the 1870s [2,3] for a one-body phase space and was later on extended by Gibbs [4] to the entire many-body phase space. Half a century later, in 1932, von Neumann [5] extended the Boltzmann-Gibbs (BG) entropic functional to quantum systems. Finally, in 1948, Shannon showed [6] the crucial role that this functional plays in the theory of communication. The simplest expression of this functional is that corresponding to a single discrete random variable admitting $W$ possibilities with nonvanishing probabilities $\{p_i\}$, namely:
$$S_{BG} = -k\sum_{i=1}^W p_i \ln p_i \qquad \Bigl(\sum_{i=1}^W p_i = 1\Bigr), \qquad (1)$$
where $k$ is a conventional positive constant (in physics, typically taken to be the Boltzmann constant $k_B$). This expression enables, as is well known, the construction of what is usually referred to as (BG) statistical mechanics, a theory that is notably consistent with thermodynamics. To be more precise, what is well established is that the BG thermostatistics is sufficient for satisfying the principles and structure of thermodynamics. Whether or not it is also necessary is a most important question that we shall address later on in the present paper. This crucial issue and its interconnections with the Boltzmann and the Einstein viewpoints have been emphatically addressed by E.G.D. Cohen in his acceptance lecture of the 2004 Boltzmann Award [7]. On various occasions, generalizations of the expression (1) have been advanced and studied in the realm of information theory. In 1988 [8] (see also [9,10]), the generalization of the BG statistical mechanics itself was proposed through the expression:
$$S_q = k\,\frac{1-\sum_{i=1}^W p_i^q}{q-1} = k\sum_{i=1}^W p_i \ln_q \frac{1}{p_i} \qquad (q \in \mathbb{R};\ S_1 = S_{BG}),$$
where the $q$-logarithmic function is defined through $\ln_q z \equiv \frac{z^{1-q}-1}{1-q}$ ($\ln_1 z = \ln z$).
Various predecessors of $S_q$, $q$-exponentials and $q$-Gaussians abound in the literature within specific historical contexts (see, for instance, [11] for a list with brief comments).

Definitions
An entropic functional $S(\{p_i\})$ is said to be additive (we are adopting Oliver Penrose's definition [12]) if, for any two probabilistically independent systems $A$ and $B$ (i.e., $p_{ij}^{A+B} = p_i^A p_j^B$, $\forall (i,j)$), we have:
$$S(A+B) = S(A) + S(B).$$
It can be straightforwardly proven that $S_q$ satisfies:
$$\frac{S_q(A+B)}{k} = \frac{S_q(A)}{k} + \frac{S_q(B)}{k} + (1-q)\,\frac{S_q(A)}{k}\,\frac{S_q(B)}{k}. \qquad (4)$$
Consequently, $S_{BG} = S_1$ is additive, whereas $S_q$ is non-additive for $q \neq 1$. The definition of extensivity is much more subtle and follows thermodynamics. A specific entropic functional $S(\{p_i\})$ of a specific system (or a specific class of systems, with its $N$ elements with their corresponding correlations) is said to be extensive if:
$$0 < \lim_{N\to\infty} \frac{S(N)}{N} < \infty,$$
i.e., if $S(N)$ grows like $N$ for $N \gg 1$, where $N \propto L^d$, $d$ being the integer or fractal dimension of the system, and $L$ its linear size. Let us emphasize that determining whether an entropic functional is additive is a very simple mathematical task (due to the hypothesis of independence), whereas determining whether it is extensive for a specific system can be a very heavy one, sometimes even intractable.
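The non-additive composition rule of Equation (4) is easy to check numerically. The following is a minimal sketch, assuming $k = 1$ and two arbitrarily chosen illustrative distributions; the function names (`tsallis_entropy`, `joint_independent`) are ours and purely hypothetical.

```python
import numpy as np

def tsallis_entropy(p, q, k=1.0):
    """S_q = k (1 - sum_i p_i^q) / (q - 1); the q = 1 case is -k sum_i p_i ln p_i."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # only nonvanishing probabilities contribute
    if np.isclose(q, 1.0):
        return -k * np.sum(p * np.log(p))
    return k * (1.0 - np.sum(p ** q)) / (q - 1.0)

def joint_independent(pa, pb):
    """Joint distribution of two independent systems: p_ij = p_i^A p_j^B."""
    return np.outer(pa, pb).ravel()

# Illustrative (assumed) distributions for systems A and B
pa = np.array([0.5, 0.3, 0.2])
pb = np.array([0.6, 0.4])
q = 0.7

s_a = tsallis_entropy(pa, q)
s_b = tsallis_entropy(pb, q)
s_ab = tsallis_entropy(joint_independent(pa, pb), q)

# Right-hand side of Equation (4) with k = 1
pseudo_additive = s_a + s_b + (1.0 - q) * s_a * s_b

# For q = 1 the cross term vanishes and plain additivity is recovered
bg_ab = tsallis_entropy(joint_independent(pa, pb), 1.0)
bg_sum = tsallis_entropy(pa, 1.0) + tsallis_entropy(pb, 1.0)
```

Here `s_ab` should coincide with `pseudo_additive` up to rounding, and `bg_ab` with `bg_sum`. Checking extensivity, in contrast, requires knowledge of how the probabilities of the specific system depend on $N$.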

Probabilistic Illustrations
If all nonzero-probability events of a system constituted by $N$ elements are equally probable, we have $p_i = 1/W(N)$ for each of them. In that case, $S_{BG}(N) = k \ln W(N)$ and $S_q(N) = k \ln_q W(N)$. Therefore, if the system satisfies $W(N) \propto \mu^N$ ($\mu > 1$; $N \to \infty$) (e.g., for independent coins, we have $W(N) = 2^N$), referred to as the exponential class, we have that the additive entropy $S_{BG}$ is also extensive. Indeed, $S_{BG}(N) \propto N$. For all other values of $q \neq 1$, we have that the non-additive entropy $S_q$ is nonextensive.
However, if we have instead a system such that $W(N) \propto N^\rho$ ($\rho > 0$; $N \to \infty$), referred to as the power-law class, we have that the non-additive entropy $S_q$ is extensive for:
$$q = 1 - \frac{1}{\rho}.$$
Indeed, $S_{1-1/\rho}(N) \propto N$. For all other values of $q$ (including $q = 1$), we have that $S_q$ is nonextensive for this class; the extensive entropy corresponding to the limit $\rho \to \infty$ is precisely the additive $S_{BG}$.
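The two equal-probability classes above can be illustrated with a few lines of code. This is a minimal sketch, assuming $k = 1$, exact equalities $W(N) = 2^N$ and $W(N) = N^\rho$ (rather than mere proportionality), and an arbitrarily chosen $\rho = 2$; the helper names are hypothetical.

```python
import math

def ln_q(z, q):
    """q-logarithm: ln_q z = (z^(1-q) - 1) / (1 - q), with ln_1 z = ln z."""
    if abs(q - 1.0) < 1e-12:
        return math.log(z)
    return (z ** (1.0 - q) - 1.0) / (1.0 - q)

def entropy_per_element(w, n, q):
    """S_q(N) / (k N) for W(N) = w equally probable states."""
    return ln_q(w, q) / n

rho = 2.0                  # assumed exponent of the power-law class
q_ext = 1.0 - 1.0 / rho    # value of q for which S_q should be extensive
sizes = (10**2, 10**4, 10**6)

# Power-law class, W(N) = N^rho: S_q/N tends to a finite constant for q = 1 - 1/rho ...
power_q_ext = [entropy_per_element(float(n) ** rho, n, q_ext) for n in sizes]
# ... whereas S_BG/N vanishes like (ln N)/N
power_bg = [entropy_per_element(float(n) ** rho, n, 1.0) for n in sizes]

# Exponential class, W(N) = 2^N: S_BG/N = ln 2 for every N
expo_bg = [entropy_per_element(2.0 ** n, n, 1.0) for n in (10, 100, 1000)]
```

For the power-law class, the ratio `power_q_ext` approaches $\rho$ while `power_bg` decays to zero; for the exponential class, `expo_bg` stays at $\ln 2$.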
Let us now mention another, more subtle, case where the nonzero probabilities are not equal [13]. We consider a triangle of $N$ ($N = 2, 3, 4, \dots$) correlated binary random variables, say $n$ heads and $(N - n)$ tails ($n = 0, 1, 2, \dots, N$). The probabilities $p_{N,n}$ ($\sum_{n=0}^N p_{N,n} = 1$, $\forall N$) are different from zero only within a strip of width $d$ (more precisely, for $n = 0, 1, 2, \dots, d$) and vanish everywhere else. This specific probabilistic model is asymptotically scale-invariant (i.e., it satisfies the so-called Leibniz triangle rule for $N \to \infty$): see [13] for full details. For this strongly-correlated model, the non-additive entropy $S_q$ is extensive for a unique value of $q$, namely:
$$q = 1 - \frac{1}{d}.$$
We see that the extensive entropy corresponding to the limit $d \to \infty$ is precisely the additive $S_{BG}$. These examples transparently show the important difference between entropic additivity and entropic extensivity. What has historically occurred is that, for 140 years, most physicists have been focusing on systems that belong to the exponential class, typically either non-interacting systems (ideal gas, ideal paramagnet) or short-range-interacting ones (e.g., $d$-dimensional Ising, XY and Heisenberg ferromagnets with first-neighbor interactions). Since for this class, but not for many others, the additive BG entropic functional is also extensive, a frequent confusion has emerged among very many people and in textbooks, which has led, and is unfortunately still leading, to additive and extensive being treated as synonyms, which is definitely false (this error is so easy to make that, by inadvertence, the book [14] by Gell-Mann and myself was entitled Nonextensive Entropy, whereas it should have been entitled Non-additive Entropy; obviously, we definitely regret this misnomer).

Physical Illustrations
The entropic index $q$ is to be determined from first principles, namely from the time evolution (in phase space, Hilbert space and analogous) of the state of the full system. This is typically an analytically hard task. Nevertheless, it has been accomplished in a few cases. Let us briefly review some of them: 1. The logistic map at its Feigenbaum point; 2. The entropy of a subsystem of a (1 + 1)-dimensional system characterized by a central charge $c$ at its quantum critical point; 3. The entropy of a subsystem of a (1 + 1)-dimensional generalized isotropic Lipkin-Meshkov-Glick model at its quantum critical point.
For the logistic map $x_{t+1} = 1 - a x_t^2$ ($x_t \in [-1, 1]$; $a \in [0, 2]$), we have that a value of $q$ exists, such that $S_q$ asymptotically increases linearly with time, where the value of $q$ is dictated by the Lyapunov exponent being positive or zero, which in turn depends on the value of the external parameter $a$. To be more precise, we assume the interval $[-1, 1]$ of $x$ divided into $W$ tiny intervals (identified with $i = 1, 2, \dots, W$); we then place in one of those intervals a large number $M$ of initial conditions (with $M \gg W$); and finally, we iterate the map for each of these initial conditions. The numbers of points $M_i(t)$ that are located in the $i$-th interval satisfy $\sum_{i=1}^W M_i(t) = M$, $\forall t$. We define next the probabilities $p_i(t) \equiv M_i(t)/M$, which enable the evaluation of the entropy $S_q(t)/k = \frac{1 - \sum_{i=1}^W [p_i(t)]^q}{q-1}$. It can be shown that a unique value of $q$ exists such that $K_q \equiv \lim_{t\to\infty} \lim_{W\to\infty} \lim_{M\to\infty} \frac{S_q(t)}{k\,t}$ is finite. For any value of $q$ above this special one, the ratio $K_q$ vanishes, and for any value of $q$ below this special one, the ratio $K_q$ diverges.
For all values of $a$ such that the Lyapunov exponent $\lambda_1$ is positive (i.e., in the presence of strong chaos, where the sensitivity to the initial conditions $\xi \equiv \lim_{\Delta x(0)\to 0} \frac{\Delta x(t)}{\Delta x(0)}$ increases exponentially with time, $\xi = e^{\lambda_1 t}$), we have that $q = 1$, and the ratio precisely equals the Lyapunov exponent (i.e., $K_1 = \lambda_1$). In contrast, at the edge of chaos, i.e., for the value of $a$ where successive bifurcations accumulate (sometimes referred to as the Feigenbaum point), i.e., $a = 1.401155189092\dots$, we have that the Lyapunov exponent vanishes, and consistently [30,31], $q = 0.2444877\dots$ (in fact, 1018 exact digits are numerically known nowadays [32]; see [11] for full details). At such special values of $a$, we verify that $\xi = e_q^{\lambda_q t}$, and a $q$-generalized version of the Pesin-like identity has been rigorously established [31]. The edge of chaos of logistic-like maps provides a remarkable connection of $q$-statistics with multifractals [30]. This is particularly welcome because the postulate of the entropy $S_q$ as a basis for generalizing BG statistics was inspired precisely by the structure of multifractals. The present status of our knowledge strongly suggests that a BG system typically "lives" in a smoothly-occupied phase-space, whereas the systems obeying $q$-statistics "live" in hierarchically-occupied phase-spaces.
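The distinction between strong chaos and the edge of chaos can be explored through a direct numerical estimate of the Lyapunov exponent. The sketch below assumes the form $x_{t+1} = 1 - a x_t^2$ of the logistic map and uses the standard time average of $\ln|f'(x_t)|$ with $f'(x) = -2 a x$; the function name, the initial condition and the iteration counts are illustrative assumptions.

```python
import math

def lyapunov_exponent(a, x0=0.1, n_transient=1000, n_steps=100_000):
    """Finite-time estimate of lambda_1 for x_{t+1} = 1 - a x_t^2.

    lambda_1 is approximated by the time average of ln|f'(x_t)| = ln|2 a x_t|
    along one orbit, after discarding an initial transient.
    """
    x = x0
    for _ in range(n_transient):
        x = 1.0 - a * x * x
    total = 0.0
    for _ in range(n_steps):
        # The floor avoids log(0) if the orbit ever lands exactly on x = 0
        total += math.log(max(abs(2.0 * a * x), 1e-300))
        x = 1.0 - a * x * x
    return total / n_steps

lam_periodic = lyapunov_exponent(0.5)           # attracting fixed point: negative
lam_chaos = lyapunov_exponent(2.0)              # strong chaos: positive
lam_edge = lyapunov_exponent(1.401155189092)    # Feigenbaum point: expected near zero
```

For $a = 0.5$, the orbit converges to the fixed point $x^* = \sqrt{3} - 1$, so the estimate should approach $\ln(\sqrt{3} - 1)$. For $a = 2$, it should come out close to $\ln 2$, whereas at the Feigenbaum point it is expected to be close to zero, consistent with a sensitivity to initial conditions that is no longer exponential. Being a finite-time estimate, the last two values carry statistical fluctuations.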
Let us now address the entropy of an $L$-sized block of an $N$-sized quantum system at its quantum critical point, belonging to the universality class characterized by a central charge $c$ (e.g., the universality classes of the short-range Ising and the short-range isotropic XY ferromagnets correspond respectively to $c = 1/2$ and $c = 1$). It has been shown [33] that $S_q$ is extensive for:
$$q = \frac{\sqrt{9 + c^2} - 3}{c}.$$
We verify that $c \to \infty$ yields $q = 1$ (BG). Finally, let us address the generalized isotropic Lipkin-Meshkov-Glick model [34], characterized by $(m, k)$, where $m$ is the number of states of the model (e.g., if the system is constituted by $s$-sized spins, we have $m = 2s + 1$, $s = 1/2, 1, 3/2, \dots$), and $k$ ($k = 0, 1, 2, \dots$) is the number of vanishing magnon densities. The entropy $S_q$ is extensive for:
$$q = 1 - \frac{2}{m - k}.$$
Notice that, in the limit $s \to \infty$, $q = 1$ (BG). Numerical results are available as well in the literature. For example, for a random antiferromagnet with $s$-sized spins, the corresponding value of $q$ has been obtained numerically [35]. Before we proceed with analyzing thermodynamical aspects, let us stress that we have addressed here two different types of linearities, the thermodynamical one (i.e., $S_q(N) \propto N$) and the dynamical one (i.e., $S_q(t) \propto t$). Although the nature of these linearities is different, and even the values of $q$ that guarantee them may be different (although possibly related), there are reasons to expect both to be satisfied on similar grounds: this question was in fact (preliminarily) addressed in [36] and elsewhere.
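As a quick numerical illustration of the central-charge case, the sketch below evaluates $q$ for the two universality classes mentioned above. It takes the relation reported in [33] to be $q = (\sqrt{9 + c^2} - 3)/c$, which is an assumption of this example; the function name is hypothetical.

```python
import math

def q_from_central_charge(c):
    """Index q for which S_q of a block is extensive, assuming
    q = (sqrt(9 + c^2) - 3) / c for central charge c > 0."""
    if c <= 0:
        raise ValueError("the central charge c must be positive")
    return (math.sqrt(9.0 + c * c) - 3.0) / c

q_ising = q_from_central_charge(0.5)    # short-range Ising class, c = 1/2
q_xy = q_from_central_charge(1.0)       # short-range isotropic XY class, c = 1
q_large_c = q_from_central_charge(1e6)  # approaches the BG value q = 1
```

Under this assumption, both `q_ising` and `q_xy` lie well below 1, and the BG value is only approached for very large $c$.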

Renyi Entropy versus q-Entropy
Let us address here a question that frequently appears in the literature, generating some degree of confusion. We refer to the discussion of Renyi entropy versus $q$-entropy on thermodynamical and dynamical grounds. The Renyi entropy [16] is defined as:
$$S_q^R = \frac{\ln \sum_{i=1}^W p_i^q}{1-q},$$
hence:
$$S_q^R = \frac{\ln\left[1 + (1-q)\,S_q/k\right]}{1-q}.$$
It is straightforward to verify that $S_q^R(S_q)$ is a monotonic function of $S_q$, $\forall q$. Consequently, under the same constraints, the extremization of $S_q^R$ yields precisely the same distribution as the extremization of $S_q$ (in total analogy with the trivial fact that maximizing, under the same constraints, $S_{BG}$ or, say, $[S_{BG}]^3$ yields one and the same BG exponential weight). This mathematical triviality is at the basis of considerable confusion in the minds of some members of the community. Thermodynamics and statistical mechanics are much more than a mere probability distribution, and the reader has surely never seen, and this for more than one good reason, a successful theory such as thermodynamics constructed by using, say, $[S_{BG}]^3$ instead of $S_{BG}$.
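The functional relation between the two entropies, and its monotonicity, can be verified directly. The following is a minimal sketch, assuming $k = 1$, a Renyi entropy written without the prefactor $k$, and an arbitrary illustrative distribution; all names are hypothetical.

```python
import numpy as np

def tsallis_entropy(p, q):
    """S_q / k = (1 - sum_i p_i^q) / (q - 1), for q != 1."""
    p = np.asarray(p, dtype=float)
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def renyi_entropy(p, q):
    """S_q^R = ln(sum_i p_i^q) / (1 - q), for q != 1."""
    p = np.asarray(p, dtype=float)
    return np.log(np.sum(p ** q)) / (1.0 - q)

def renyi_from_tsallis(s_q, q):
    """S_q^R expressed through S_q: ln[1 + (1 - q) S_q] / (1 - q)."""
    return np.log1p((1.0 - q) * s_q) / (1.0 - q)

p = np.array([0.4, 0.3, 0.2, 0.1])  # assumed illustrative distribution (W = 4)
q = 0.5

direct = renyi_entropy(p, q)
via_sq = renyi_from_tsallis(tsallis_entropy(p, q), q)

# S_q ranges over [0, ln_q W] = [0, 2] for W = 4 and q = 1/2
s_grid = np.linspace(0.0, 2.0, 50)
is_increasing = bool(np.all(np.diff(renyi_from_tsallis(s_grid, q)) > 0))
```

The values `direct` and `via_sq` should agree up to rounding, and `is_increasing` confirms monotonicity on the sampled grid. This is precisely why both functionals are extremized by the same distribution, and also why this fact alone cannot discriminate between them on physical grounds.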
To make things more precise, let us list now several important differences between S q and S R q (see, for instance, [11] and the references therein).
(i) Additivity: If $A$ and $B$ are two arbitrary probabilistically-independent systems, $S_q^R$ is additive, $\forall q$, whereas $S_q$ satisfies the non-additive property in Equation (4).
(ii) Concavity: $S_q$ is concave for all $q > 0$, whereas $S_q^R$ is concave only for $0 < q \le 1$. Both $S_q$ and $S_q^R$ are convex for $q < 0$. These properties have consequences for characterizing the thermodynamic stability of the system.
(iii) Lesche stability: $S_q$ is Lesche-stable $\forall q > 0$, whereas $S_q^R$ is Lesche-stable only for $q = 1$. Lesche stability characterizes the experimental reproducibility of the entropy of a system.
(iv) Pesin-like identity: For many physically important low-dimensional conservative or dissipative nonlinear dynamical systems with zero Lyapunov exponent, it is verified that, in the $t \to \infty$ limit, $S_q(t) \propto t$ for a unique special value of $q \neq 1$. This linearity property for $t \gg 1$ is lost for $S_q^R(t)$; indeed, for those systems, it can be easily verified that $S_q^R(t) \propto \ln t$ ($\forall q$). No dynamical systems are yet known for which $S_q^R(t)$ is linear for $q \neq 1$. The linearity of $S_q(t)$ enables, $\forall q$, a natural connection with the coefficient (Lyapunov exponent for the $q = 1$ systems) that characterizes the dynamically meaningful sensitivity to the initial conditions.
(v) Thermodynamical extensivity: For various $N$-sized quantum systems, it can be shown that a fixed value of $q \neq 1$ exists, such that, in the $N \to \infty$ limit, $S_q(N) \propto N$, thus satisfying the necessary thermodynamic extensivity for the entropy. For those systems, $S_q^R(N) \propto \ln N$ ($\forall q$), which violates thermodynamics. For this statement, we have of course assumed that a (physically meaningful) limit $q \neq 1$ exists in the $N \to \infty$ limit. Various papers exist in the literature that focus on situations such that a phenomenological index $q$ can be defined, which depends on $N$ (see, for instance, [37,38] and the references therein), but they remain out of the present scope, since their $N \to \infty$ limit yields $q = 1$.
(vi) The likelihood function that satisfies Einstein's requirement of factorizability coincides with the function that extremizes the entropic functional of the system (specifically, the inverse function of the generalized logarithm that characterizes that precise entropic functional). For $q = 1$ systems, the factorizable likelihood function is well known to be $W \propto e^{S_{BG}/k}$, the exponential function being the inverse of $S_{BG}/k = \ln W$ (for equal probabilities), and for appropriate constraints, it maximizes the entropy $S_{BG}$. For $q \neq 1$, we have [39] $W \propto e_q^{S_q/k}$, where the $q$-exponential function is precisely the inverse of $S_q/k = \ln_q W$ (for equal probabilities), and for appropriate constraints, it extremizes the entropy $S_q$. In contrast with this property, the factorizable likelihood function for the Renyi entropy is $e^{S_q^R}$, where the exponential function is the inverse of $S_q^R = \ln W$ (for equal probabilities), but it differs from the $q$-exponential function, which is the one that extremizes $S_q^R$. These properties plausibly have consequences for the large deviation theory of these systems (see the discussion about this theory below).

Why Must the Entropic Extensivity Be Preserved in All Circumstances?
Since we are ready to permit the entropic functional to be non-additive, should we not also allow for possible entropic nonextensivity? This question is surely a most interesting one, but to the best of our understanding, the answer is no. Indeed, there exist at least two important reasons for always demanding the physical (thermodynamical) entropy of a given system to be extensive. One of them is based on the Legendre transformation structure of thermodynamics; the other is suggested by the large deviations in some anomalous probabilistic models where the limiting distributions are $q$-Gaussians.

Thermodynamics
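This argument has been developed in [11] and more recently in [15] (which we follow now). We briefly review it here. Let us first write a general Legendre transformation form of a thermodynamical energy $G$ of a generic $d$-dimensional system ($d$ being an integer or fractal dimension):
$$G(V,T,p,\mu,H,\dots) = U(V,T,p,\mu,H,\dots) - T\,S(V,T,p,\mu,H,\dots) + pV - \mu\,N(V,T,p,\mu,H,\dots) - H\,M(V,T,p,\mu,H,\dots) - \cdots, \qquad (15)$$
where $T, p, \mu, H$ are the temperature, pressure, chemical potential and external magnetic field, and $U, S, V, N, M$ are the internal energy, entropy, volume, number of particles and magnetization. The variables $(S, V, N, M, \dots)$ scale with $L^d$, those characterizing the external conditions $(T, p, \mu, H, \dots)$ scale with $L^\theta$, and the energies $(G, U)$ scale with $L^{d+\theta}$, where $L$ is a characteristic linear size and $\theta$ is a system-dependent parameter. If we divide Equation (15) by $L^{d+\theta}$ and consider now the thermodynamical $L \to \infty$ limit, we obtain:
$$g(\tilde T, \tilde p, \tilde\mu, \tilde H, \dots) = u - \tilde T s + \tilde p - \tilde\mu\, n - \tilde H m - \cdots, \qquad (16)$$
where, using a compact notation, $(g, u) \equiv \lim_{L\to\infty} (G, U)/L^{d+\theta}$, $(s, n, m) \equiv \lim_{L\to\infty} (S, N, M)/L^d$ and $(\tilde T, \tilde p, \tilde\mu, \tilde H) \equiv \lim_{L\to\infty} (T, p, \mu, H)/L^\theta$ (with $V = L^d$). The standard case corresponds to $\theta = 0$, for which $(T, p, \mu, H)$ are intensive and the energies $(G, U)$ scale like $L^d$, as do $(S, V, N, M)$, i.e., the usual extensive variables; this is of course the case found in the textbooks of thermodynamics. The thermodynamic relations (15) and (16) put on an equal footing the entropy $S$, the volume $V$ and the number of elements $N$, and the extensivity of the latter two variables is guaranteed by definition. In fact, a similar analysis can be performed using $N$ instead of $V$ since $V \propto N$.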
An example of a nonstandard system with $\theta \neq 0$ is the classical Hamiltonian discussed in what follows. We consider two-body interactions decaying with distance $r$ like $1/r^\alpha$ ($\alpha \ge 0$). For this system, we have $\theta = d - \alpha$ whenever $0 \le \alpha < d$ (see, for example, Figure 1 of [40]). This peculiar scaling occurs because the potential is not integrable, i.e., the integral $\int_{\text{constant}}^\infty dr\, r^{d-1}\, r^{-\alpha}$ diverges for $0 \le \alpha \le d$; therefore, the Boltzmann-Gibbs canonical partition function itself diverges. Gibbs was aware of this kind of problem and pointed out [4] that whenever the partition function diverges, the BG theory cannot be used because, in his words, "the law of distribution becomes illusory". The divergence of the total potential energy occurs for $\alpha \le d$, which is referred to as long-range interactions. If $\alpha > d$, which is the case of the $d = 3$ Lennard-Jones potential, whose attractive part corresponds to $\alpha = 6$, the integral does not diverge, and we recover the standard behavior of short-range-interacting systems with the $\theta = 0$ scaling. Nevertheless, it is worth recalling that nonstandard thermodynamical behavior is not necessarily associated with long-range interactions in the classical sense just discussed. A more meaningful description would then be long-range correlations (spatial or temporal), because for strongly quantum-entangled systems, correlations are not necessarily connected with the interaction range. However, the picture of long- versus short-range interactions in the classical sense, directly related to the distance $r$, has the advantage of illustrating clearly the thermodynamic relations (15) and (16) for the different scaling regimes, as shown in Figure 1.
Figure 1. Representation of the different scaling regimes of Equation (16) for classical $d$-dimensional systems. For attractive long-range interactions (i.e., $0 \le \alpha/d \le 1$, $\alpha$ characterizing the interaction range in a potential with the form $1/r^\alpha$), we may distinguish three classes of thermodynamic variables, namely, those scaling with $L^\theta$, named pseudo-intensive ($L$ is a characteristic linear length; $\theta$ is a system-dependent parameter), those scaling with $L^{d+\theta}$, the pseudo-extensive ones (the energies), and those scaling with $L^d$ (which are always extensive). For short-range interactions (i.e., $\alpha > d$), we have $\theta = 0$, and the energies recover their standard $L^d$ extensive scaling, falling in the same class as $S$, $N$, $V$, etc., whereas the previous pseudo-intensive variables become truly intensive ones (independent of $L$); this is the region with two classes of variables that is covered by the traditional textbooks of thermodynamics. From [15].
To summarize this crucial subsection, we may insist that what is thermodynamically relevant is that the entropy of a given system must be extensive, not that the entropic functional ought to be additive. This is consistent with the fact that Einstein's principle for the factorizability of the likelihood function is satisfied not only for the additive BG entropic functional, but also for nonadditive ones [39,41].

Large Deviation Theory
The so-called large deviation theory (LDT) [42] constitutes the mathematical counterpart of the heart of BG statistical mechanics, namely the famous canonical-ensemble BG factor $e^{-\beta H(N)} = e^{-N[\beta h(N)]}$ with $h(N) \equiv H(N)/N$. Since, for short-range interactions, $\beta h(N)$ is a thermodynamically-intensive quantity in the limit $N \to \infty$, we see that the BG weight represents an exponential decay with $N$. This exponential dependence is to be associated [42][43][44][45][46] with the LDT probability $P(N; x) \simeq e^{-N r_1(x)}$. The meaning of the subindex 1 in the rate function $r_1(x)$ will soon become clear.
Since $r_1(x)$ is directly related to a relative entropy per particle (see, for instance, [43]), the quantity $N r_1(x)$ plays the role of an extensive entropy.
If we focus now on, say, a $d$-dimensional classical system involving two-body interactions whose potential asymptotically decays at long distance $r$ like $-A/r^\alpha$ ($A > 0$; $\alpha \ge 0$), the canonical BG partition function converges whenever the potential is integrable, i.e., for $\alpha/d > 1$ (short-range interactions), and diverges whenever it is non-integrable, i.e., for $0 \le \alpha/d \le 1$ (long-range interactions). In the latter case, the use of the BG weight becomes unjustified ("illusory" in Gibbs' words [4]) because of the divergence of the BG partition function; this is the case, say, of Newtonian gravitation, which in the present notation corresponds to $(\alpha, d) = (1, 3)$; hence, $\alpha/d = 1/3$. We might therefore expect the emergence of some function $f(H_N)$ different from the exponential one, in order to describe some specific stationary (or quasi-stationary) states differing from thermal equilibrium. The Hamiltonian $H_N$ generically scales like $N\tilde N$ with $\tilde N \equiv \frac{N^{1-\alpha/d} - 1}{1-\alpha/d} \equiv \ln_{\alpha/d} N$ (with the $q$-logarithmic function defined as $\ln_q z \equiv \frac{z^{1-q}-1}{1-q}$; $z > 0$; $\ln_1 z = \ln z$). Notice that, for $N \to \infty$ and $0 \le \alpha/d < 1$, $\tilde N \sim N^{1-\alpha/d}/(1 - \alpha/d)$. The particular case $\alpha = 0$ yields $\tilde N \sim N$, thus recovering the usual prefactor of mean field theories. The quantity $\beta H_N$ can be rewritten as $[(\beta\tilde N) H_N/(N\tilde N)]\,N = [\tilde\beta H_N/(N\tilde N)]\,N$, where $\tilde\beta \equiv \beta\tilde N \equiv 1/(k_B \tilde T) = \tilde N/(k_B T)$ plays the role of an intensive variable. The correctness of all of these scalings has been profusely verified in various kinds of thermal, diffusive and geometrical (percolation) systems (see [11,45]). We see that, not only for the usual case of short-range interactions, but also for long-range ones, $[\tilde\beta H_N/(N\tilde N)]$ plays a role analogous to an intensive variable. The $q$-exponential function $e_q^z \equiv [1 + (1-q)z]^{\frac{1}{1-q}}$ ($e_1^z = e^z$) (and its associated $q$-Gaussian) has already emerged, in a considerable number of nonextensive and similar systems, as the appropriate generalization of the exponential one (and its associated Gaussian).
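The behavior of the scaling factor $\tilde N = \ln_{\alpha/d} N$ in the different interaction regimes, as well as the inverse relation between the $q$-logarithm and the $q$-exponential, can be checked with a short sketch. The function names and the sample values of $(\alpha, d, N)$ are illustrative assumptions.

```python
import math

def ln_q(z, q):
    """q-logarithm: ln_q z = (z^(1-q) - 1) / (1 - q), with ln_1 z = ln z (z > 0)."""
    if abs(q - 1.0) < 1e-12:
        return math.log(z)
    return (z ** (1.0 - q) - 1.0) / (1.0 - q)

def exp_q(z, q):
    """q-exponential: e_q^z = [1 + (1 - q) z]^(1/(1-q)), with e_1^z = e^z."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(z)
    base = 1.0 + (1.0 - q) * z
    if base <= 0.0:
        raise ValueError("1 + (1 - q) z must be positive in this sketch")
    return base ** (1.0 / (1.0 - q))

def n_tilde(n, alpha, d):
    """Scaling factor N~ = ln_{alpha/d} N."""
    return ln_q(float(n), alpha / d)

mean_field = n_tilde(1000, 0.0, 3.0)    # alpha = 0: N~ = N - 1, i.e., N~ ~ N
gravity = n_tilde(10**6, 1.0, 3.0)      # alpha/d = 1/3: N~ ~ N^(2/3) / (2/3)
marginal = n_tilde(10**6, 3.0, 3.0)     # alpha/d = 1: N~ = ln N
short_range = n_tilde(10**12, 6.0, 3.0) # alpha/d = 2: N~ tends to the constant 1/(alpha/d - 1)

round_trip = exp_q(ln_q(5.0, 0.3), 0.3) # the q-exponential inverts the q-logarithm
```

For short-range interactions, $\tilde N$ saturates at a constant, so that $H_N \propto N$ and $\tilde\beta$ differs from $\beta$ only by a constant factor. For long-range interactions, $\tilde N$ grows with $N$, which is what makes the rescaled variable $\tilde\beta$ necessary.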
Therefore, it appears rather natural to conjecture that, in some sense that remains to be precisely defined, the LDT expression $e^{-r_1 N}$ becomes generalized into something close to $e_q^{-r_q N}$ ($q \in \mathbb{R}$), where the generalized rate function $r_q$ is expected to be some generalized entropic quantity per particle. As shown in Figures 2 and 3 (see the details in [45]), it is precisely this $e_q^{-r_q N}$ behavior that emerges in a strongly correlated nontrivial model [43,45]. Since, as for the $q = 1$ case, $r_q N$ appears to play the role of a total entropy, this specific illustration is consistent with an extensive entropy.
Figure 2. ..., where $(a(x), r_q(x))$ are positive quantities. From [45].
Figure 3. The data of Figure 2 in ($q$-log)-linear representation. Let us stress that the unique asymptotically-power-law function that provides straight lines at all scales of a ($q$-log)-linear representation is the $q$-exponential function. The inset shows the results corresponding to $N$ up to 50. From [45].

Further Applications and Final Words
A regularly-updated bibliography on the present subject can be found at [47]. At the same site, a selected list of theoretical, experimental, observational and computational papers can be found, as well. From these very many papers, let us briefly mention here a few recent ones.
For those systems that may be well described by a specific class of nonlinear homogeneous $d = 1$ Fokker-Planck equations, a prediction was advanced [48] in 1996, namely the scaling $\mu = 2/(3 - q)$, where $\mu$ is the exponent that characterizes the scaling between space and time (specifically the fact that $x^2$ scales like $t^\mu$) and $q$ is the index of the $q$-Gaussian, which describes the paradigmatic solution of the equation. Notice that $q = 1$ yields the well-known Einstein 1905 result $\mu = 1$ for Brownian motion. The prediction was experimentally verified (within a 2% precision along an entire experimental decade) in 2015 [49] for confined granular material. It would surely be interesting to also verify, for higher-dimension confined granular matter, the $d$-dimensional generalization of that scaling, namely $\mu = \frac{2}{2 + d(1-q)}$ [50]; hence, once again $\mu = 1$ for $q = 1$. For an area-preserving two-dimensional map, namely the standard map, it was neatly shown [51] how $q$-statistics, or BG statistics, or even a combination of both emerges as a function of the unique external parameter ($K$) of the map. This and various other occurrences of $q$-Gaussian and $q$-exponential distributions in many natural, artificial and social complex systems are most probably connected with $q$-generalizations of the central limit theorem (see, for instance, [52][53][54][55][56][57][58][59][60][61][62][63]).
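The scaling relation between $\mu$ and $q$ is simple enough to be tabulated directly. The sketch below evaluates $\mu = 2/[2 + d(1-q)]$, which reduces to $\mu = 2/(3-q)$ for $d = 1$; the function name and the sample values of $q$ are illustrative assumptions and do not correspond to any measured data.

```python
def anomalous_diffusion_exponent(q, d=1):
    """Exponent mu in x^2 ~ t^mu, using mu = 2 / (2 + d (1 - q))."""
    denom = 2.0 + d * (1.0 - q)
    if denom <= 0.0:
        raise ValueError("the relation requires 2 + d (1 - q) > 0")
    return 2.0 / denom

mu_brownian = anomalous_diffusion_exponent(1.0)        # normal diffusion, q = 1
mu_super = anomalous_diffusion_exponent(1.5)           # q > 1 gives mu > 1
mu_sub = anomalous_diffusion_exponent(0.5)             # q < 1 gives mu < 1
mu_three_dim = anomalous_diffusion_exponent(1.5, d=3)  # d-dimensional generalization
```

In this relation, $q > 1$ corresponds to superdiffusion ($\mu > 1$) and $q < 1$ to subdiffusion ($\mu < 1$), with $\mu = 1$ recovered at $q = 1$ in any dimension.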
Another $q$-statistical connection that is certainly interesting is the one with the so-called (asymptotically) scale-free networks. Indeed, their degree distribution has been shown in many cases to be given by $p(k) \propto e_q^{-k/\kappa}$ ($k$ being the number of links joining a given node), which plays the role of the Boltzmann-Gibbs factor for short-range-interacting Hamiltonian systems. This connection was established in the literature about a decade ago (see, for instance, [64,65]). Moreover, it has been recently shown [66] that neither $q$ nor $\kappa$ depends separately on the dimensionality $d$ and on the exponent $\alpha$ characterizing the range of the interaction; interestingly enough, they depend only on the ratio $\alpha/d$. Very many papers focus on the degree distributions of (asymptotically) scale-free networks from a variety of standpoints. For example, an interesting exactly solvable master-equation approach is available in [67]. The novelty that we recall in this mini-review is that the $q$-exponential degree distribution is here obtained from a simple entropic variational principle (under a constraint where the average degree plays the role of the internal energy in statistical mechanics).
Many other systems (e.g., related to those mentioned in [102][103][104][105]) await approaches along the above and similar lines. They would be very welcome. Even so, we may say that the present status of the theory described herein is at a reasonably satisfactory stage of physical and mathematical understanding.