Polyadic Entropy, Synergy and Redundancy among Statistically Independent Processes in Nonlinear Statistical Physics with Microphysical Codependence

The information shared among observables representing processes of interest is traditionally evaluated in terms of macroscale measures characterizing aggregate properties of the underlying processes and their interactions. Traditional information measures are grounded in the assumption that the observable represents a memoryless process without any interaction among microstates. Generalized entropy measures have been formulated in non-extensive statistical mechanics aiming to take microphysical codependence into account in entropy quantification. Taking them into consideration when formulating information measures raises the question of whether, and if so how much, information permeates across scales to impact the macroscale information measures. The present study investigates and quantifies the macroscale information emerging from codependence within the microphysics. In order to isolate the information emergence coming solely from the nonlinearly interacting microphysics, redundancy and synergy are evaluated among macroscale variables that are statistically independent from each other but not necessarily so within their own microphysics. Synergistic and redundant information are found when microphysical interactions take place, even if the statistical distributions are factorable. These findings stress the added value of nonlinear statistical physics to information theory in coevolutionary systems.


General Motivation
Traditional information-theoretical measures quantify the amount of macroscale information, i.e., the statistical properties required to characterize a system (Information Entropy, [1,2]); the amount of information shared, i.e., redundant, among processes (Redundant Information, e.g., the measures in [2,3]); and the amount of innovative information emerging from the non-redundant cooperation among processes (Synergistic Interaction Information, e.g., the measures in [4,5]).
The information-theoretical measures of entropy, redundancy and synergy are traditionally formulated assuming independence among the microstates living within a macroscale variable of interest. The latter is then assumed to represent a random process in which each observational event is an independent realization within a distribution embodying the statistical envelope aggregating the aforementioned microstates. That is, while statistical relationships among macroscale variables are evaluated, the microphysical interactions within them, at the statistical-mechanical level, remain elusive. If no such interactions take place, e.g., in a perfect gas, the traditional information metrics will be accurate. However, that may no longer be the case when the assumption of independent microphysics no longer holds, such as in complex systems with nonlocal interactions leading to long-range codependence and multifractal scaling characteristics.

Statistical and Information Entropy
One core assumption in statistical physics resides in the probabilistic interpretation of Entropy. In short, that view interprets the energy densities of a system in phase space as being probabilities that the system exhibits the configurations enclosed by the volume elements where those densities are characterized.
For systems with numerable independent microstates of cardinality M, this brings the original Boltzmann-Gibbs formulations to the popular definition of statistical entropy [6,7]:

S = −k ∑_{m=1}^{M} p_m ln p_m    (1)

where p_m is the probability associated to the microstate indexed as m and k is a positive constant, which in the original thermodynamic context is given by the Boltzmann constant.
In the context of Shannon's mathematical theory of communication [1], the microstates indexed as m are represented as messages characterizing a system and p_m are the probabilities assigned to them. The lower the probability of a state is, the more information will be gained about the system through the associated message. Information Entropy is then formulated in that sense as the canonical information requirements to fully characterize the state of a system, as a function of how likely its states are. The functional form is equivalent to that of Equation (1) with k = 1:

H = −∑_{m=1}^{M} p_m log_a p_m    (2)

For the remainder of the paper, statistical entropy shall be represented by the information-theoretical notation H in order to remain clear that we are working with a scaled Boltzmann constant k = 1 in Equation (1).
When a = 2, Equation (2) corresponds to the Shannon Entropy used in computing and most information-theoretical applications, with units in bit. For instance, the Information Entropy of a fair coin toss is 1 bit, corresponding to the single binary (e.g., yes/no) question required to find its actual outcome. If, instead, the natural logarithm a = e is taken, the Information Entropy is given in nat.
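As a quick numerical illustration of Equation (2) and of the coin-toss example above, the sketch below computes the Information Entropy of a discrete distribution (a minimal sketch; the function name is ours):

```python
import math

def shannon_entropy(probs, base=2):
    """Information Entropy as in Equation (2): H = -sum_m p_m log_a p_m.

    base=2 yields bits; base=math.e yields nats. Zero-probability
    states contribute nothing (the limit of p log p as p -> 0 is 0).
    """
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A fair coin toss carries exactly 1 bit of information:
print(shannon_entropy([0.5, 0.5]))               # → 1.0
# The same toss measured with the natural logarithm (a = e):
print(shannon_entropy([0.5, 0.5], base=math.e))  # → ln 2 ≈ 0.6931 nat
```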
In statistical terms, the aforementioned entropies are functionally equivalent and can be denoted as Boltzmann-Gibbs-Shannon (BGS) Entropy [8].

Information Redundancy
Information-theoretical measures of redundancy quantify the amount of information shared (and thus redundant) among a set of macroscale variables assuming that the underlying microphysics are disentangled. A redundancy-based complex system network will thus entail codependences among macroscale processes whilst the eventual codependences within microphysics remain elusive. An example of such a measure is Multi-Information [3], a non-negative quantity that can be expressed in the following form [5]:

I(Y) = ∑_{i=1}^{N} H(Y_i) − H(Y_1, · · · , Y_N)

Multi-Information measures the global statistical dependence among the N components Y_i of Y. A well-known particular case is obtained for N = 2, leading to the Mutual Information [2] between generic components Y_i and Y_j of Y:

I(Y_i; Y_j) = H(Y_i) + H(Y_j) − H(Y_i, Y_j)

which interestingly has a non-null, positive lower bound (general theoretical proof in [9], and for finite samples in [10]).
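In terms of entropies, these redundancy measures can be sketched numerically as follows (function names and joint distributions are illustrative choices of ours):

```python
import math
import numpy as np

def H(p, base=2):
    """Shannon entropy (Equation (2)) of a probability array of any shape."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum() / math.log(base))

def mutual_information(pxy):
    """I(Y_i; Y_j) = H(Y_i) + H(Y_j) - H(Y_i, Y_j) from a joint matrix."""
    return H(pxy.sum(axis=1)) + H(pxy.sum(axis=0)) - H(pxy)

# Factorable (independent) joint distribution: the redundancy vanishes.
px, py = np.array([0.5, 0.5]), np.array([0.25, 0.75])
print(mutual_information(np.outer(px, py)))   # ~0 (floating-point epsilon)

# Perfectly correlated joint: MI equals the 1-bit marginal entropy.
pxy = np.array([[0.5, 0.0], [0.0, 0.5]])
print(mutual_information(pxy))                # → 1.0
```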

Information Synergy
Information-theoretical measures of synergy quantify the amount of information present in the system as a whole that is not present in any of its strict subsets. As such, synergy refers to the information gain stemming from the collective cooperation among system constituents.
One such measure is Interaction Information, which can be expressed as [11]:

I_t(Y) = −∑_{T ⊆ {1, · · · , N}, T ≠ ∅} (−1)^{N−|T|} H(Y_T)

where Y_T is the sub-vector formed by the components Y_i of Y with indices i in the non-empty subset T, and N denotes the number of components. The Interaction Information is Synergistic across the components of Y when I_t > 0 for dim(Y) ≥ 3. Conversely, when I_t < 0 the Information is Redundant (a negative synergy entails redundancy), becoming equivalent to Multi-Information.
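The subset-based expression of Interaction Information can be checked on the classical XOR triad, a purely synergistic system (a sketch under the sign convention in which positive values mark synergy, which we assume here; function names are ours):

```python
from itertools import combinations
import numpy as np

def H(p):
    """Shannon entropy (bits) of a joint probability array."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def interaction_information(p):
    """I_t = -sum over non-empty subsets T of (-1)^(N-|T|) H(Y_T),
    with N = p.ndim; I_t > 0 marks synergy for N >= 3 (assumed convention)."""
    N = p.ndim
    It = 0.0
    for k in range(1, N + 1):
        for T in combinations(range(N), k):
            marg = p.sum(axis=tuple(a for a in range(N) if a not in T))
            It -= (-1) ** (N - k) * H(marg)
    return It

# XOR triad: Z = X xor Y with X, Y fair and independent. All pairs are
# independent, yet the triple carries 1 bit of purely synergistic information.
p = np.zeros((2, 2, 2))
for x in (0, 1):
    for y in (0, 1):
        p[x, y, x ^ y] = 0.25
print(interaction_information(p))   # → 1.0 (synergistic)
```

For N = 2 the same expression reduces to the Mutual Information, consistent with redundancy being a negative synergy.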
A necessary condition for a non-redundant synergistic polyad to emerge within Y lies in the non-Gaussianity of its joint distribution. That is, from a macroscopic standpoint, only statistical moments of order n > 2 can enclose synergistic information. Therefore, the Interaction Information can also be expressed directly in terms of such higher-order moments. Detailed formulation details are found in [5].

The Fundamentals
Our ultimate aim is to express macroscale information theoretical measures in terms of non-extensive statistical mechanics. For that purpose, consider the following functional class of generalized entropy measures equivalent to those in [8,12]:

H_{q,r}(Y) = k ∑_{m=1}^{M} (p_m^r − p_m^q) / (q − r)    (8)

where p_m ≡ p_Y(y_m) is the probability of occurrence of the microstate y_m in the macroscale variable Y, M is the number of microstates, and k is a positive constant reflecting the fact that entropy can only be defined up to a multiplicative constant (i.e., only entropy differences can be fully determined). The parameters q and r are associated to the existence of nonlinear codependencies among microstates within Y, e.g., with respect to the probability of the state at which the system lies and to that of the neighbouring states [13]. In fact, the presence of these two parameters in the entropy functional stems from the dependence, relative to two different powers of the probability distribution, of the nonlinear interaction terms in the associated nonlinear Fokker-Planck Equation [14]. Anomalous diffusion with coupling among two scaling regimes and nonlinear cross-scale coevolution in complex dynamical systems are among the physically relevant examples where this entropy functional can play an important role in eliciting macroscale thermodynamic effects of nonlinear statistical mechanics. In a coevolutionary system setting, the aforementioned parameters can be interpreted as coevolution indices [15] or geometric parameters of the coevolution manifold [16], entailing evolutionary dynamic codependence among mutually influencing observables. Particular cases of Equation (8) include the well-known Tsallis entropy [17] by taking either q = 1 or r = 1 (but not both), and the seminal Boltzmann-Gibbs-Shannon (BGS) entropies by taking both parameters as unity, i.e., q = r = 1, a situation representing the absence of microphysical codependencies.
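For concreteness, the sketch below implements a two-parameter functional of this class, assuming the Borges-Roditi-type form H_{q,r} = k ∑_m (p_m^r − p_m^q)/(q − r); this explicit form is our assumption, chosen because it reproduces the stated limits (Tsallis for q = 1 or r = 1, BGS for q = r = 1):

```python
import math
import numpy as np

def H_qr(p, q, r, k=1.0):
    """Two-parameter generalized entropy (Borges-Roditi-type form, assumed):
    H_{q,r} = k * sum_m (p_m^r - p_m^q) / (q - r), with the q -> r limit
    -k * sum_m p_m^q ln p_m (recovering BGS at q = r = 1)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    if math.isclose(q, r):
        return float(-k * (p ** q * np.log(p)).sum())
    return float(k * ((p ** r - p ** q) / (q - r)).sum())

coin = [0.5, 0.5]
# Tsallis limit (r = 1, q = 2) for a fair coin: (1 - sum p^2)/(2 - 1) = 0.5
print(H_qr(coin, q=2, r=1))   # → 0.5
# The functional is symmetric in q and r:
print(H_qr(coin, q=1, r=2))   # → 0.5
# BGS limit q = r = 1 recovers Shannon entropy in nats:
print(H_qr(coin, q=1, r=1))   # → ln 2 ≈ 0.6931
```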
In a joint multivariate Y, say Y = (Y 1 , · · · , Y N ) of N components, there can be microphysical interactions across its macroscale components or dimensions rather than solely within each marginal, which means that the joint entropy of Y will not necessarily be the sum of the entropies of its components, i.e., entropy will not be additive when there are codependencies among subsystems. Additivity only holds when microphysical interactions are confined to within each marginal or macroscale component of the multivariate Y.
In order to isolate the contribution that microphysical interactions bring to the macrophysical entropy, we formulate the joint entropy of a system represented by Y built from N statistically independent components or subsystems Y 1 , · · · , Y N .
The a priori statistical independence of the subsystems corresponds formally to the probability factorization

p_Y(y_1, · · · , y_N) = ∏_{i=1}^{N} p_{Y_i}(y_i)    (9)

which dictates not only bilateral but also multilateral statistical independence. However, it does not imply factorization of the entropies, since the mixing of subsystems can introduce cross-dependencies among microstates belonging to different subsystems. Whether and under which circumstances that is the case is investigated by evaluating the difference between the entropy of the overall system Y and the sum of the entropies of its subsystems taken in isolation.

Dyadic Systems
The entropy functional in Equation (8) for a bivariate system Y = (Y_i, Y_j) is hereby derived under the factorization constraint in Equation (9), yielding:

H_{q,r}(Y_i, Y_j) = H_{q,r}(Y_i) + H_{q,r}(Y_j) + [(1 − r)² H_r(Y_i) H_r(Y_j) − (1 − q)² H_q(Y_i) H_q(Y_j)] / [k (q − r)]    (10)

where H_x(·) denotes the single-parameter (Tsallis) entropy of index x, a result consistent with [12]. This means that the total entropy of the system [lhs of Equation (10)] is not fully assessed by the entropy of its parts when they were separate entities [first two rhs terms of Equation (10)], rather depending on nonlinear terms which are a function of the microphysical codependence parameters q and r. Once coming together in the form of Y, and notwithstanding their statistical independence, Y_i and Y_j bring about a nonlinear contribution to the joint entropy arising from the microphysical interactions represented by the parameters q and r. When these are unit valued, the joint entropy reduces to the sum of the entropies that the parts held as separate entities.
Whether the joint entropy will be higher or lower than the sum of that from the separate components will depend on what kind of microphysical interaction is at play. For instance, when both q and r are lower than one, the joint entropy will be higher than the sum of the entropies of the separate subsystems (the whole will be more than the sum of the parts, entailing a dyadic synergy). When the aforementioned parameters are higher than one, the converse happens, with the entropy of the combined system being lower than the sum of the entropies of the subsystems prior to mixing, implying the existence of entropy-reducing factors, i.e., redundancy, among those subsystems.
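These sign regimes can be verified numerically by comparing the joint entropy of a factorable bivariate distribution with the sum of its marginal entropies (a sketch assuming the Borges-Roditi-type form for the two-parameter functional; the distributions are illustrative):

```python
import numpy as np

def H_qr(p, q, r, k=1.0):
    """Two-parameter generalized entropy (Borges-Roditi-type form, assumed)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    if np.isclose(q, r):
        return float(-k * (p ** q * np.log(p)).sum())
    return float(k * ((p ** r - p ** q) / (q - r)).sum())

pi = np.array([0.5, 0.5])
pj = np.array([0.2, 0.8])
joint = np.outer(pi, pj)   # factorable joint: macroscale statistical independence

for q, r in [(1.0, 1.0), (0.7, 0.5), (2.0, 3.0)]:
    gap = H_qr(joint, q, r) - (H_qr(pi, q, r) + H_qr(pj, q, r))
    print(q, r, round(gap, 6))
# q = r = 1  ->  gap ≈ 0  (BGS: additive)
# q, r < 1   ->  gap > 0  (whole exceeds sum of parts: dyadic synergy)
# q, r > 1   ->  gap < 0  (entropy-reducing links: redundancy)
```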
These are then macroscale footprints of microphysical links being established across subsystems once they become connected within a bivariate whole.
The notion that the entropy of a combined system can be lower than the sum of the entropies of its constituents taken separately can be physically understood by taking into consideration that, upon combining previously separated subsystems, these can develop microphysical links resulting in a loss of dynamic freedom and hence of entropy relative to the pre-mixing stage. The connection is then captured through the nonlinear terms in the joint entropy functional. Only in the absence of such connections must the entropy be non-decreasing upon combination.

Triadic Systems
The joint entropy of a triadic system Y = (Y_1, Y_2, Y_3) is hereby formulated with the entropy functional Equation (8) as follows: where i, j, l range over the permutations of the indices 1, 2, 3, including the identity. All nonlinear terms in Equations (14)-(18) are trivially null for BGS entropies, resulting in the additive decomposability of the joint entropy in that case, reflecting the non-existence of microphysical interactions across different subsystems.
For q = r, the contributions from triadic products of entropies [terms in Equations (16)-(18)] will always be non-negative since (1 − q)² ≥ 0, ∀q ∈ ℝ. This means that triadic products will always be synergistic in this regard. This is consistent with mechanistic results that triadic interactions yield statistical and dynamical synergies in complex dynamical systems, e.g., [5,16,18].
When the parameters differ (q ≠ r), the terms in Equations (14) and (15) admit negative outcomes, namely when q > 1 ∧ r < 1 or q < 1 ∧ r > 1. In this case, the contribution to the total entropy is negative, entailing redundancy (negative synergy). The term in Equation (18) will always be null when q = 1 or r = 1, i.e., in the particular conditions associated to the Tsallis entropy. This entropy will thus entail only non-negative triadic contributions to the joint entropy, i.e., capture only synergistic triads (whilst leaving out redundant, codependent ones). In doing so, the measure provides only non-redundant information, avoiding the overestimation of information content.

Synergy and Redundancy Emerging among Statistically Independent Variables
With polyadic entropy functionals at hand, we are in a position to quantify the synergy and redundancy associated to the nonlinear microphysical footprint onto the macrophysics (i.e., to non-trivial scaling parameters q and r), with special interest in the cross-variable microphysical interactions that develop once the once-separate variables come together forming the joint system polyad.
In the current study, we consider triads involving variables Y_i, Y_j, Y_l that are statistically independent a priori, i.e., for which:

p_{Y_i,Y_j,Y_l}(y_i, y_j, y_l) = p_{Y_i}(y_i) p_{Y_j}(y_j) p_{Y_l}(y_l)    (19)

Under these conditions the Mutual Information (MI), a known measure of statistical redundancy, is null by definition. MI characterizes statistical redundancy at the macroscopic level, i.e., information shared by probability distributions. As such, while it has a thermodynamic (macroscale) relevance, it does not inform on whether the microphysics are independent as well.
The Polyadic Synergy S_N among statistically independent subsystems Y_α, α ∈ {i_1, · · · , i_N}, enrolling in a combined N-component system Y is hereby defined as the difference between the entropy of the combined system, wherein the subsystems are allowed to communicate (i.e., to interact) at the microphysical level whilst retaining their macroscale statistical independence by preserving factorability as in Equation (19), and the sum of the entropies of each subsystem prior to combination (the total entropy of a juxtaposition of separate subsystems).
Mathematically, this definition is hereby expressed as:

S_N(Y) = H_{q,r}(Y) − ∑_{α} H_{q,r}(Y_α)    (20)

This definition leads to a trivially null synergy for the BGS entropies, which is natural since they do not admit microphysical interactions within the macroscale distribution, i.e., they assume the variables under evaluation to be memoryless random processes.
In general, the synergy will be positive when the entropy of the combined system Y exceeds the sum of the entropies of the intervening components Y α . Physically, this means that, upon combination, the Y α cooperate to produce emerging dynamics not present in any of them a priori.
An example of a process with emerging dynamics is triadic wave resonance, wherein from a pair of statistically independent primary waves ω_i and ω_j of frequencies f_i and f_j, respectively, a secondary wave ω_l emerges with frequency f_l = f_i + f_j, whilst preserving statistical independence between ω_i and ω_j. The spectrum of the resulting wave system is richer than the sum of the spectra of the primary waves, thus requiring more information to characterize it. The spectral entropy of the resulting system is thus necessarily higher. Note that when the secondary or child wave ω_l is dependent upon the primary or parent waves, Equation (19) does not hold, since the waves are pairwise independent but not triadically so. In a pure triad all three intervening waves will be pairwise and triadically independent, as expressed in the factorability condition Equation (19).
Conversely, the synergy will be negative when the entropy of Y is lower than the sum of the parts Y α . At first sight, it might appear to be thermodynamically counter-intuitive. However, such negative synergies can emerge if there are redundancies or constraints developing from the establishment of microphysical links across the once-separated parts. Such links entail freedom loss in the dynamics, with associated reduction in the total entropy relative to the a priori detached configuration. Note also that a Negative Synergy is fundamentally Redundancy, consistent with the information-theoretical counterparts presented in Sections 1.2 and 1.3.
By imposing that the probability distribution of the combined system remain factorable as in Equation (19), macroscale information-theoretical measures of redundancy vanish as noted above; therefore, any emerging redundancy can be attributed to the microphysical interactions, the macroscale footprint of which is carried in the entropy terms involving the parameters q and r.
In order to illustrate the explicit role of these parameters in the synergy and redundancy among independent variables, we consider the two- and three-component forms (dyadic and triadic, respectively).

Dyadic Form
The dyadic synergy S_2 naturally follows from Equation (20) as:

S_2 = H_{q,r}(Y_i, Y_j) − H_{q,r}(Y_i) − H_{q,r}(Y_j)

By decomposing the joint entropy as in Equation (10), S_2 becomes:

S_2 = [(1 − r)² H_r(Y_i) H_r(Y_j) − (1 − q)² H_q(Y_i) H_q(Y_j)] / [k (q − r)]

where H_x(·) denotes the Tsallis entropy of index x. Particular forms of S_2 of interest are trivially obtained for entropy functionals under the Tsallis entropy, where either q = 1 or r = 1 (but not both), leading to:

S_2^[T] = (1 − q) H_q(Y_i) H_q(Y_j) / k    for r = 1

or, equivalently,

S_2^[T] = (1 − r) H_r(Y_i) H_r(Y_j) / k    for q = 1

which means that when the sole parameter q or r is lower (higher) than 1 there is a positive (negative) synergy between Y_i and Y_j when they combine to form Y_{i,j}. The notational superscript [T] is intended to stress that this is the particular case under the Tsallis entropy functional. The trivially null case then occurs in the BGS entropy functional case q = r = 1, leading to the known result that neither synergistic nor redundant information can be found among memoryless statistically independent processes.
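The dyadic Tsallis case can be checked numerically: for a factorable joint, Tsallis pseudo-additivity makes S_2 equal to (1 − q) H_q(Y_i) H_q(Y_j) with k = 1 (a sketch; function names and distributions are ours):

```python
import numpy as np

def tsallis(p, q):
    """Tsallis entropy H_q = (1 - sum_m p_m^q)/(q - 1), with k = 1."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    if np.isclose(q, 1.0):
        return float(-(p * np.log(p)).sum())   # BGS limit
    return float((1.0 - (p ** q).sum()) / (q - 1.0))

def dyadic_synergy(pi, pj, q):
    """S_2 = H_q(Y_i, Y_j) - H_q(Y_i) - H_q(Y_j) for a factorable joint."""
    return tsallis(np.outer(pi, pj), q) - tsallis(pi, q) - tsallis(pj, q)

pi, pj = np.array([0.5, 0.5]), np.array([0.3, 0.7])
for q in (0.5, 1.0, 2.0):
    s2 = dyadic_synergy(pi, pj, q)
    predicted = (1.0 - q) * tsallis(pi, q) * tsallis(pj, q)
    print(q, round(s2, 6), round(predicted, 6))   # the two columns match
# q < 1 -> positive synergy; q = 1 -> null (BGS); q > 1 -> redundancy.
```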
Overall, whether the synergy will be positive or negative will depend not only on the entropies per se but crucially on the relative weight of the parameters representing nonlinear microphysical interactions. A process mixing that is not captured at all by traditional information-theoretic metrics, and that leads to a diagnostic of positive synergy under the Tsallis entropy functional (e.g., q < 1), may actually entail redundancy if the second parameter outweighs the cooperative (synergistic) role of q.

Triadic Form
Similarly to the dyadic form, the triadic synergy associated to the joint yet factorable (Y_i, Y_j, Y_l), relative to its disjoint a priori terms, is hereby obtained by taking N = 3 in Equation (20) and decomposing the triadic entropies as in Equations (13)-(18), yielding: where i, j, l range over the permutations of the indices 1, 2, 3, including the identity. Again, the synergy is null under the BGS entropies, as all of its terms vanish when both parameters are unity, q = r = 1.
The triadic synergy for the one-parameter Tsallis entropy case then becomes:

S_3^[T] = (1 − q) [H_q(Y_i) H_q(Y_j) + H_q(Y_i) H_q(Y_l) + H_q(Y_j) H_q(Y_l)] / k + (1 − q)² H_q(Y_i) H_q(Y_j) H_q(Y_l) / k²    (31)

by taking r = 1 in Equation (25), with H_q denoting the Tsallis entropy of index q. An equivalent expression is obtained by taking q = 1 instead. For q ≠ 1 there will always be a positive contribution from the triadic product to the synergy in Equation (31). Whether the overall synergy will be positive or negative (redundancy) will depend on the behaviour of the lower-order terms.
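The triadic Tsallis case can likewise be verified numerically, comparing the directly computed triadic synergy with its pairwise-plus-triadic-product expansion under Tsallis pseudo-additivity (our derivation; a sketch with k = 1 and illustrative distributions):

```python
from itertools import combinations
import numpy as np

def tsallis(p, q):
    """Tsallis entropy H_q = (1 - sum_m p_m^q)/(q - 1), with k = 1."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float((1.0 - (p ** q).sum()) / (q - 1.0))

# Three statistically independent subsystems (factorable triadic joint):
dists = [np.array([0.5, 0.5]), np.array([0.2, 0.8]), np.array([0.4, 0.6])]
joint = np.einsum('i,j,l->ijl', *dists)

q = 0.5
H = [tsallis(p, q) for p in dists]
S3_direct = tsallis(joint, q) - sum(H)

# Pairwise dyadic products plus the always non-negative triadic product:
pairs = sum(H[a] * H[b] for a, b in combinations(range(3), 2))
S3_expanded = (1.0 - q) * pairs + (1.0 - q) ** 2 * H[0] * H[1] * H[2]

print(abs(S3_direct - S3_expanded) < 1e-9)   # → True
print(S3_direct > 0)                         # → True (synergy for q < 1)
```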

Concluding Remarks
The formulation of generalized entropy functionals where microphysical nonlinearities are taken into account (albeit for now in a parametric manner) is relevant for the evaluation of entropy functionals in systems the microphysics of which are nonlinearly codependent, e.g., for anomalous diffusion, nonlinear coevolution and mixing among heterogeneous yet interconnected media, e.g., across a boundary permeable to momentum and/or heat transfer. While a diversity of studies have explored such functionals and their associated mathematical properties and physical consistency, the evaluation of higher-order functionals had been largely elusive, along with the explicit formulation of the synergies arising from bringing separate subsystems together into an interconnected whole. A new measure of polyadic synergy has been introduced and discussed that takes into account the microphysical codependence through the entropic parameters present in generalized "non-extensive" entropies. These effects have been isolated by considering the synergy among statistically independent variables. As expected, when the microphysics are uncoupled the synergy is null, in line with the classical information-theoretical results valid for i.i.d. variables. However, when codependence exists within the microphysics, and despite the macroscale statistical independence (via the associated probability factorability), a synergistic term emerges in the macrophysics, expressing the macroscale footprint of interconnected microphysics: cooperative/constructive when the synergy is positive, constraining/redundant when the synergy is negative.
Fundamentally, we can interpret the non-additivity of generalized entropies as coming down to the comparison between pre- and post-mixing subsystems. If we sum the marginal entropies of all components already involved in a composite system, then they will naturally add up to that of the overall system, because entropy is fundamentally extensive at each stage of the system evolution. We can further interpret the non-extensiveness as arising from comparing system states at different stages of their evolution. In fact, the entropy of each component that will enter a system can change upon involvement in that system, even if the macroscale statistics were initially independent, since microphysical interactions across the sub-system divide can now take place that could not when the subsystems were not communicating (i.e., when they were separate).
The present study formulated generalized information-theoretical metrics beyond the traditional assumption of memoryless microphysics, by taking nonlinear statistical physics into account. As such, it is aimed at sharing foundational ideas, concepts and developments, igniting discussion and opening windows of opportunity for further exploration by the community. Follow-up studies shall delve into further properties and applications to the characterization of information metrics and predictability among processes exhibiting non-trivial internal memory within their microphysics.
All in all, the study brings out the following take-home messages:
• Factorable probabilities do not necessarily lead to additive entropies.
• Microscale codependence does not necessarily lead to macroscale codependence.
• Macroscale independence does not necessarily imply microscale independence.
This is consistent with the knowledge that statistical independence does not imply dynamic independence [16]. Moreover, the findings of the present study stress the relevance of taking nonlinear microphysical interactions into account when formulating information-theoretical measures, especially when a system is undergoing mixing among subsystems such as in thermodynamic coevolutionary settings.