Speed Gradient and MaxEnt Principles for Shannon and Tsallis Entropies

In this paper we consider the dynamics of non-stationary processes that follow the MaxEnt principle. We derive a set of equations describing the dynamics of a system for the Shannon and Tsallis entropies. Systems with a discrete probability distribution are considered under mass conservation and energy conservation constraints. The existence and uniqueness of the solution are established, and asymptotic stability of the equilibrium is proved. The equations are derived from the speed-gradient principle, which originated in control theory.


Introduction
The notion of entropy is widely used in modern statistical physics, thermodynamics, information theory, engineering, etc.
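In 1948, Claude Shannon [1] introduced the entropy of a probability distribution $P(X)$ as

$$S(X) = -\sum_{i=1}^{n} P(x_i) \log P(x_i), \tag{1}$$

where $X$ is a discrete random variable with possible values $\{x_1, \ldots, x_n\}$.
The Tsallis entropy [2] generalizes the Shannon entropy:

$$S(X, q) = \frac{1}{q-1}\left(1 - \sum_{i=1}^{n} P(x_i)^q\right), \tag{2}$$

where $q \neq 1$ is any real number. It was shown that the Tsallis entropy tends to the Shannon entropy when $q \to 1$.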
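The two definitions and the limit q → 1 can be checked numerically. The following is a minimal sketch, assuming the natural logarithm in the Shannon entropy and the form (1 − Σ p_i^q)/(q − 1) for the Tsallis entropy; the function names and the sample distribution are illustrative only.

```python
import math

def shannon_entropy(p):
    """Shannon entropy -sum p_i log p_i (natural log); zero entries are skipped."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def tsallis_entropy(p, q):
    """Tsallis entropy (1 - sum p_i^q) / (q - 1); falls back to Shannon at q == 1."""
    if q == 1:
        return shannon_entropy(p)
    return (1.0 - sum(pi ** q for pi in p if pi > 0)) / (q - 1.0)

# Illustrative distribution (an assumption, not taken from the paper).
p = [0.5, 0.3, 0.2]
for q in (0.5, 0.9, 0.999, 1.001, 2.0):
    print(q, tsallis_entropy(p, q))
print("Shannon:", shannon_entropy(p))
```

As q approaches 1 from either side, the printed Tsallis values approach the Shannon value.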
It is noted by Alexander N. Gorban et al. [3] that the Tsallis entropy coincides with the Cressie-Read (CR) entropy [4,5]. The Tsallis entropy was introduced four years later than the CR entropy. However, the Tsallis entropy has become very popular in statistical mechanics and thermodynamics. It has also found many applications in various scientific fields such as chemistry, biology, medicine, economics, and geophysics. Many works use and analyze the Tsallis entropy [6].
Since the seminal works of Edwin T. Jaynes (1957) [7,8] and up to recent years [9][10][11], the maximum entropy (MaxEnt) principle has attracted strong interest from researchers. The MaxEnt principle states that a system tends to its state with maximum entropy under constraints defined by other physical laws.
Despite a large number of publications studying the maximum entropy states, the dynamics of evolution and the transient behavior of the system are still not well investigated. The MaxEnt principle defines the asymptotic behavior of the system, but says nothing about how the system moves to that asymptotic behavior.
We use the speed-gradient (SG) principle, which has already been successfully applied in [12][13][14], to obtain equations of the dynamics for discrete systems. The applicability of the SG-principle is experimentally tested in [14] for systems with a finite number of particles simulated with the molecular dynamics method. In this paper we apply the SG-approach to describe the dynamics of systems that tend to maximize the Tsallis entropy.
The evolution law of the system for the Shannon entropy is formulated in the following general form: where the logarithm of the vector is understood componentwise, $I$ is an identity operator, $\Psi$ is a symmetric matrix, and $\Gamma > 0$ is a constant gain. For the Tsallis entropy the evolution law is By means of the SG-principle the distribution corresponding to the maximum value of entropy is found.
Note that there is a well-known general form of the time-evolution equations for non-equilibrium systems, abbreviated as GENERIC (general equation for the non-equilibrium reversible-irreversible coupling) [15,16]. The SG-principle has some similarities with the GENERIC equations for the case when the goal function is the entropy and the constraints are specified by the energy. Nevertheless, the SG-principle can be considered more general, since any smooth functional may be taken as a goal function, not only the entropy functional. The connection between GENERIC and the SG-principle is examined in Section 2.3.
Along with the Tsallis entropy, more general forms of relative entropies and divergences, such as the CR entropy or the Csiszár-Morimoto conditional entropies (f-divergences) [17], can also be considered from the SG-principle perspective.
The paper is organized as follows. Section 2 formulates the SG-principle with some illustrative examples. Section 3 introduces Jaynes' formalism. The dynamics equation for a system that tends to maximize the Shannon entropy is described in Section 4. Section 5 derives the dynamics equation for the Tsallis entropy; equilibrium stability and asymptotic convergence are proved. Section 6 extends the results of Section 5 to the internal energy conservation constraint.

Speed-Gradient Principle
Let us consider a class of open physical systems whose dynamics are described by the system of nonlinear differential equations

$$\dot{x} = f(x, u, t), \tag{5}$$

where $x \in \mathbb{C}^n$ is the system state vector, $u$ is the vector of input (free) variables, and $t \ge 0$. The problem is to derive the law of variation (evolution) of $u(t)$ that satisfies some criterion of "naturalness" of its behavior, so that the model has the features characterizing a real physical system. A typical approach to deriving such a criterion from variational principles starts with specifying some integral functional (for example, the action functional of the least action principle [18]). Minimization of the functional defines probable trajectories of the system $\{x(t), u(t)\}$ as points in the corresponding functional space. An alternative approach is based on local functions depending on the current system state. According to M. Planck [19], local principles have some advantage over integral ones because the current state and motion of the system do not depend on its later states and motions. Following [12,20], let us formulate the speed-gradient local variational principle, which allows one to synthesize the laws of system dynamics.
The speed-gradient principle: of all possible motions, the system implements those for which the input variables vary in the direction of the speed-gradient of some "goal" functional $Q_t$. If constraints are imposed on the motion of the system, then the direction is the projection of the speed-gradient vector onto the set of admissible directions (those that satisfy the constraints).
The SG equation of motion is formed as the feedback law in the finite form

$$u = -\Gamma \nabla_u \dot{Q}_t \tag{6}$$

or in the differential form

$$\frac{du}{dt} = -\Gamma \nabla_u \dot{Q}_t, \tag{7}$$

where $\dot{Q}_t$ is the rate of change of the goal functional along the trajectory of the system (5), and $\Gamma$ is a positive constant gain that may be a positive definite matrix.
Let us describe the application of the SG-principle in the simplest (yet the most important) case, where the class of models of the dynamics (5) is specified as the relation

$$\dot{x} = u. \tag{8}$$

The relation (8) just means that we are deriving the law of variation of the velocity of the system state. In accordance with the SG-principle, the goal functional (function) $Q(x)$ needs to be specified first. $Q(x)$ should be based on the physics of the real system and reflect its tendency to decrease the current value $Q(x(t))$. After that, the law of dynamics can be expressed as (6).
The SG-principle is also applicable to developing models of the dynamics of distributed systems described on infinite-dimensional state spaces. In particular, $x$ can be a vector in a Hilbert space $X$ and $f(x, u, t)$ a nonlinear operator defined on a dense set $D_F \subset X$; in this case, the solutions of Equation (5) are understood in the generalized sense.
To illustrate how the SG-principle works in a physical sense, we consider several examples.

Example 1: Motion of a Particle in the Potential Field
In this case the vector $x = (x_1, x_2, x_3)^T$ consists of the coordinates $x_1, x_2, x_3$ of a particle. Choose a smooth $Q(x)$ as the potential energy of the particle and derive the speed-gradient law in the differential form. To this end, calculate the speed

$$\dot{Q} = [\nabla_x Q(x)]^T u \tag{9}$$

and the speed-gradient

$$\nabla_u \dot{Q} = \nabla_x Q(x). \tag{10}$$

Then, choosing the diagonal positive definite gain matrix $\Gamma = m^{-1} I_3$, where $m > 0$ is a parameter and $I_3$ is the $3 \times 3$ identity matrix, we arrive at Newton's law $\dot{u} = -m^{-1} \nabla_x Q(x)$, or $m\ddot{x} = -\nabla_x Q(x)$. Note that speed-gradient laws with non-diagonal gain matrices $\Gamma$ can be incorporated if a non-Euclidean metric in the space of inputs is introduced by the matrix $\Gamma^{-1}$. Admitting dependence of the metric matrix $\Gamma$ on $x$, one can obtain evolution laws for complex mechanical systems described by the Lagrangian or Hamiltonian formalism.
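The differential-form law of this example can be integrated numerically. The sketch below assumes a quadratic potential Q(x) = k|x|²/2 (chosen only for illustration) and a semi-implicit Euler scheme; the names `grad_potential`, `sg_differential_step` and the parameter values are hypothetical and not part of the paper.

```python
def grad_potential(x, k):
    """Gradient of the assumed potential Q(x) = 0.5 * k * |x|^2."""
    return [k * xi for xi in x]

def sg_differential_step(x, u, mass, k, dt):
    """One semi-implicit Euler step of du/dt = -(1/mass) * grad Q(x), dx/dt = u."""
    g = grad_potential(x, k)
    u = [ui - dt * gi / mass for ui, gi in zip(u, g)]
    x = [xi + dt * ui for xi, ui in zip(x, u)]
    return x, u

def total_energy(x, u, mass, k):
    """Kinetic plus potential energy of the particle."""
    kinetic = 0.5 * mass * sum(ui * ui for ui in u)
    potential = 0.5 * k * sum(xi * xi for xi in x)
    return kinetic + potential

# Illustrative initial state and parameters (assumptions).
x, u = [1.0, 0.0, 0.0], [0.0, 0.5, 0.0]
mass, k, dt = 2.0, 3.0, 1e-3
e0 = total_energy(x, u, mass, k)
for _ in range(10000):
    x, u = sg_differential_step(x, u, mass, k, dt)
e1 = total_energy(x, u, mass, k)
print(e0, e1)
```

The total energy stays close to its initial value, which is the reversible behavior expected from the differential form of the SG law with gain Γ = I/m.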
The SG-principle applies not only to finite-dimensional systems, but also to infinite-dimensional (distributed) ones. In particular, $x$ may be a vector of a functional space $X$ and $f(x, u, t)$ may be a nonlinear differential operator (in such a case the solutions of (5) should be understood as generalized ones). We omit the mathematical details for simplicity.

Example 2: Wave, Diffusion and Heat Transfer Equations
Let $x = x(r)$, $r = (r_1, r_2, r_3)^T \in \Omega$, be the temperature field or the concentration field of a substance defined in the domain $\Omega \subset \mathbb{R}^3$. Choose the goal functional evaluating the non-uniformity of the field as

$$Q_t(x) = \frac{1}{2} \int_\Omega |\nabla_r x(r, t)|^2 \, dr,$$

where $\nabla_r x(r, t)$ is the spatial gradient of the field and the boundary conditions are assumed zero for simplicity. Calculation of the speed $\dot{Q}_t$ and then of the speed-gradient yields

$$\dot{Q}_t = -\int_\Omega \Delta x(r, t)\, u(r, t) \, dr, \qquad \nabla_u \dot{Q}_t = -\Delta x(r, t),$$

where $u = \partial x / \partial t$ and $\Delta = \sum_{i=1}^{3} \partial^2 / \partial r_i^2$ is the Laplace operator. Therefore, with a scalar gain $\Gamma = \gamma > 0$, the speed-gradient law in differential form (7) is

$$\frac{\partial^2}{\partial t^2} x(r, t) = \gamma \Delta x(r, t),$$

which corresponds to the D'Alembert wave equation. The SG-law in finite form (6) reads

$$\frac{\partial}{\partial t} x(r, t) = \gamma \Delta x(r, t)$$

and coincides with the diffusion or heat transfer equation. Note that the differential form of the speed-gradient laws often corresponds to reversible processes, while the finite form generates irreversible ones. For modelling more complex dynamics a combination of finite and differential SG-laws may be useful.
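The irreversible character of the finite-form law can be seen on a one-dimensional grid. This is a minimal sketch, assuming a uniform grid on the unit interval with zero boundary values, a standard three-point Laplacian and explicit Euler time stepping; all names and parameters are illustrative assumptions.

```python
import math

def dirichlet_energy(x, h):
    """Discrete analogue of Q = 1/2 * integral |dx/dr|^2 dr with zero boundary values."""
    padded = [0.0] + list(x) + [0.0]
    return 0.5 * sum((padded[k + 1] - padded[k]) ** 2
                     for k in range(len(padded) - 1)) / h

def laplacian(x, h):
    """Three-point discrete Laplacian with zero boundary values."""
    padded = [0.0] + list(x) + [0.0]
    return [(padded[k - 1] - 2.0 * padded[k] + padded[k + 1]) / h ** 2
            for k in range(1, len(padded) - 1)]

def sg_finite_step(x, h, gamma, dt):
    """One explicit Euler step of the finite-form law dx/dt = gamma * Laplacian(x)."""
    return [xi + dt * gamma * li for xi, li in zip(x, laplacian(x, h))]

n = 49                       # interior grid points (assumption)
h = 1.0 / (n + 1)
gamma = 1.0
dt = 0.4 * h * h / gamma     # below the explicit stability limit h^2 / (2 * gamma)
x = [math.sin(math.pi * (k + 1) * h) + 0.5 * math.sin(3 * math.pi * (k + 1) * h)
     for k in range(n)]

energies = [dirichlet_energy(x, h)]
for _ in range(200):
    x = sg_finite_step(x, h, gamma, dt)
    energies.append(dirichlet_energy(x, h))
print(energies[0], energies[-1])
```

The goal functional decreases at every step, as expected for a gradient flow of Q.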
In a similar way, dynamical equations for many other mechanical, electrical and thermodynamic systems can be recovered. The SG-principle applies to a broad class of physical systems subjected to potential and/or dissipative forces. This paper is aimed at applying the SG-principle to entropy-driven systems.

GENERIC and SG-Principle
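The GENERIC time-evolution equation can be written as

$$\frac{dx}{dt} = L(x) \frac{\delta E(x)}{\delta x} + M(x) \frac{\delta S(x)}{\delta x}, \tag{11}$$

where $x$ represents the state of the system, $E$ is the total energy functional and $S$ is the entropy functional.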
There is a connection between the GENERIC and SG equations. Indeed, suppose we have to maximize the entropy function $S(x)$ of a system under the constraint that the total energy is constant, $E(x) = E = \mathrm{const}$. The Lagrangian for this problem can be defined as: where $\lambda$ is a real Lagrange multiplier.
According to (9) and (10) we can rewrite Equation (13) as: We can see that the dynamics equation obtained from the SG-principle (14) coincides with the GENERIC Equation (11) for $dx/dt = u$, $L(x) = -\Gamma\lambda$ and $M(x) = -\Gamma$. GENERIC is based on two "potentials", the total energy and the entropy. The SG-principle can use any functional that has to be maximized (minimized) as a goal function, not only the Lagrangian (12) or the entropy functional, and it can handle various functionals and constraints. In this sense it can be treated as a more general approach. Nevertheless, GENERIC is also a general equation: its parametrized matrices $L(x)$ and $M(x)$ make it applicable to a wide range of time-evolution systems.

Jaynes's Maximum Entropy Principle
The approach proposed by Jaynes [7,8] has become a foundation of modern statistical physics. Its main ideas are described below.
Let $P(x)$ be the probability distribution of a discrete random variable $X$. This distribution is unknown and needs to be determined on the basis of the available information about the system. Suppose that the average values $\bar{H}_m$ of some quantities $H_m(x)$ are known a priori:

$$\bar{H}_m = \sum_i H_m(x_i) P(x_i). \tag{15}$$

The normalization condition also holds for the probability distribution:

$$\sum_i P(x_i) = 1. \tag{16}$$

Conditions (15) and (16) are in general insufficient to determine $P(x)$. In this case, according to Jaynes, maximization of the information entropy $S_I$ (1) is the most objective method to define the distribution.
The maximum under the additional conditions (15) and (16) is found by using Lagrange multipliers; it leads to

$$P(x_i) = \frac{1}{Z} \exp\Big(-\sum_m \lambda_m H_m(x_i)\Big), \qquad Z = \sum_i \exp\Big(-\sum_m \lambda_m H_m(x_i)\Big),$$

where $\lambda_m$ can be derived from conditions (15). In the equilibrium case these formulas show that the maximum of the information entropy coincides with the Boltzmann-Gibbs entropy and can be identified with the thermodynamic entropy.
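For a single constraint (a prescribed mean energy), the Lagrange multiplier can be found numerically. The following is a minimal sketch, assuming the exponential form P(x_i) ∝ exp(−λ E_i) and a bisection search on λ; the energy levels, the target mean and the helper names are illustrative assumptions.

```python
import math

def gibbs(energies, lam):
    """Exponential (Gibbs-type) distribution with multiplier lam."""
    weights = [math.exp(-lam * e) for e in energies]
    z = sum(weights)  # normalizing constant Z
    return [w / z for w in weights]

def mean_energy(energies, lam):
    """Average energy under the Gibbs-type distribution."""
    return sum(pi * e for pi, e in zip(gibbs(energies, lam), energies))

def solve_multiplier(energies, target, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection on lam; the mean energy is decreasing in lam."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(energies, mid) > target:
            lo = mid   # mean too high: a larger multiplier is needed
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Illustrative energy levels and prescribed mean (assumptions).
energies = [0.0, 1.0, 2.0, 3.0]
target = 1.2
lam = solve_multiplier(energies, target)
p = gibbs(energies, lam)
print(lam, p, mean_energy(energies, lam))
```

Because the prescribed mean lies below the mean of the uniform distribution, the resulting multiplier is positive and lower energy levels receive higher probability.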

Maximization of the Shannon Entropy with the Speed-Gradient Method
Consider a discrete system which consists of $N$ identical particles distributed over $m$ cells. When the mass conservation constraint holds, $\sum_{i=1}^{m} N_i = N$. It can be normalized as: Particles can migrate from one cell to another. We are interested in both the steady-state and the transient behavior of the system. According to the MaxEnt principle, the limit (steady-state) behavior of the system maximizes its entropy when nothing else is known [8].
To obtain the transient behavior, we first apply the SG-principle, choosing the Shannon entropy (1) as the goal function to be maximized.
For simplicity we assume that the motion is continuous in time and that the numbers $N_i$ change continuously. Then the law of motion can be represented as

$$\dot{N}_i = u_i, \quad i = 1, \ldots, m, \tag{20}$$

where $u_i = u_i(t)$ are control functions (6) which have to be determined. This approach has already been introduced in [13], so we omit the details of the derivation and give the final form of the system dynamics:

$$\dot{N}_i = -\Gamma \log N_i + \frac{\Gamma}{m} \sum_{j=1}^{m} \log N_j, \quad i = 1, \ldots, m. \tag{21}$$

When the internal energy constraint also holds,

$$\sum_{i=1}^{m} E_i N_i = E, \tag{22}$$

where $E_i$ is the energy of a particle in the $i$th cell and the total energy $E$ does not change, the evolution law has the form

$$\dot{N} = -\Gamma (I - \Psi_S) \log N, \tag{23}$$

where the logarithm of the vector is understood componentwise, $N = (N_1, \ldots, N_m)^T$, $I$ is the identity matrix, and $\Psi_S$ is a symmetric $m \times m$ matrix defined as follows:

$$(\Psi_S)_{ij} = \frac{1}{m} + \frac{\tilde{E}_i \tilde{E}_j}{\sum_{k=1}^{m} \tilde{E}_k^2},$$

where $\tilde{E}_i = E_i - \frac{1}{m} \sum_{j=1}^{m} E_j$ are the components of the vector of centered energies. It is shown in [13] that the limit probability distribution is unique and can be obtained from Jaynes's MaxEnt principle.
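The transient behavior under mass conservation alone can be simulated directly. The sketch below assumes the law takes the projected form dN_i/dt = −Γ(log N_i − (1/m) Σ_j log N_j), which follows from applying the SG-principle to the Shannon entropy with the constraint Σ u_i = 0; the explicit Euler scheme, the initial state and the function names are illustrative assumptions.

```python
import math

def shannon_sg_velocity(n, gain):
    """Projected speed-gradient: -gain * (log N_i - mean_j log N_j)."""
    logs = [math.log(ni) for ni in n]
    mean_log = sum(logs) / len(logs)
    return [-gain * (li - mean_log) for li in logs]

def simulate_shannon(n0, gain=1.0, dt=0.01, steps=5000):
    """Explicit Euler integration of dN_i/dt = u_i."""
    n = list(n0)
    for _ in range(steps):
        u = shannon_sg_velocity(n, gain)
        n = [ni + dt * ui for ni, ui in zip(n, u)]
    return n

# Illustrative initial occupation numbers, normalized so that N = 1 (assumption).
n0 = [0.7, 0.2, 0.1]
n_final = simulate_shannon(n0)
print(n_final, sum(n_final))
```

The velocities sum to zero, so the total mass is preserved at every step, and the state approaches the uniform distribution N_i = N/m.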

The Speed-Gradient Dynamics of the Tsallis Entropy Maximization Process
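We extend the approach introduced in the previous section to the case of the Tsallis entropy. To get the evolution law under the mass conservation constraint (19), we apply the SG-principle, choosing the Tsallis entropy as the goal function to be maximized:

$$S(X, q) = \frac{1}{q-1} \left( 1 - \sum_{i=1}^{m} \frac{N_i^q}{N^q} \right),$$

where $q$ is any real number and $X = (N_1, \ldots, N_m)^T$ is the state vector of the system. We will also use the law of motion in the form (20). The evaluation scheme is as follows. First, the speed of entropy change is evaluated:

$$\dot{S}(q) = \frac{q}{(1-q) N^q} \sum_{i=1}^{m} N_i^{q-1} \dot{N}_i.$$

Then evaluate the gradient of the speed with respect to the vector of controls $u_i$, considered as frozen parameters:

$$\nabla_{u_i} \dot{S}(q) = \frac{q}{(1-q) N^q} N_i^{q-1}.$$

Finally, define the actual controls proportionally to the projection of the speed-gradient onto the surface of constraints (19), according to the SG Equation (6):

$$u_i = \frac{\Gamma q}{(1-q) N^q} N_i^{q-1} + \lambda, \quad i = 1, \ldots, m,$$

where $\Gamma > 0$ is a scalar gain.
Now we can evaluate the Lagrange multiplier $\lambda$ from the condition $\sum_{i=1}^{m} u_i = 0$:

$$\lambda = -\frac{\Gamma q}{(1-q)\, m N^q} \sum_{j=1}^{m} N_j^{q-1}.$$

The final form of the system dynamics law is as follows:

$$\dot{N}_i = \frac{\Gamma q}{(1-q) N^q} \left( N_i^{q-1} - \frac{1}{m} \sum_{j=1}^{m} N_j^{q-1} \right), \quad i = 1, \ldots, m. \tag{25}$$

Let us find the equilibrium mode, which corresponds to the asymptotic behavior of the variables $N_i$. In this mode $\dot{N}_i = 0$. Based on (25) this means that $m N_i^{q-1} = \sum_{j=1}^{m} N_j^{q-1}$, which is possible only when all $N_i$ are equal. According to constraint (19) we have $N_i = N/m$. This result corresponds to the maximum state of the classical entropy and agrees with thermodynamics.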
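The convergence to the uniform state can be observed numerically. This is a minimal sketch, assuming the law dN_i/dt = Γq/((1−q)N^q) · (N_i^{q−1} − (1/m) Σ_j N_j^{q−1}) obtained by projecting the speed-gradient onto Σ u_i = 0, with q > 0, q ≠ 1; the Euler scheme, initial state and names are illustrative assumptions.

```python
def tsallis_sg_velocity(n, q, gain):
    """Projected speed-gradient of the Tsallis entropy under mass conservation."""
    m = len(n)
    total = sum(n)
    coeff = gain * q / ((1.0 - q) * total ** q)
    powers = [ni ** (q - 1.0) for ni in n]
    mean_power = sum(powers) / m
    return [coeff * (p - mean_power) for p in powers]

def simulate_tsallis(n0, q, gain=1.0, dt=0.01, steps=5000):
    """Explicit Euler integration of dN_i/dt = u_i."""
    n = list(n0)
    for _ in range(steps):
        u = tsallis_sg_velocity(n, q, gain)
        n = [ni + dt * ui for ni, ui in zip(n, u)]
    return n

# Illustrative initial state with N = 1 (assumption); one case q < 1 and one q > 1.
n0 = [0.7, 0.2, 0.1]
final_sub = simulate_tsallis(n0, q=0.5)
final_super = simulate_tsallis(n0, q=2.0)
print(final_sub, final_super)
```

For both values of q the trajectory keeps the total mass and settles at N_i = N/m.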

Equilibrium Stability
Let us examine the stability of the equilibrium mode. We introduce the Lyapunov function $V(X, q) = S_{\max}(q) - S(q)$, where $S_{\max}(q)$ is the maximum possible value of the Tsallis entropy with parameter $q$.
Evaluation of $\dot{V}$ along the trajectories of (25) yields

$$\dot{V}(q) = -\dot{S}(q) = -\frac{\Gamma q^2}{m (1-q)^2 N^{2q}} \left[ m \sum_{i=1}^{m} N_i^{2(q-1)} - \Big( \sum_{i=1}^{m} N_i^{q-1} \Big)^2 \right].$$

Based on the Cauchy-Bunyakovsky-Schwarz (CBS) inequality for the vectors $a = (1, \ldots, 1)$ and $b = (N_1^{q-1}, \ldots, N_m^{q-1})$ we have $\dot{V}(q) \le 0$. The equality $\dot{V}(q) = 0$ holds if and only if all the values $N_i$ are equal. This is the maximum entropy state. Thus the law (25) provides global asymptotic stability of the maximum entropy state. The physical meaning of this law is motion along the direction of the maximum entropy production rate (the direction of the fastest entropy growth).
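The sign of the Lyapunov derivative can be spot-checked on random states. The sketch below assumes the expression dV/dt = −Γc²(Σ b_i² − (Σ b_i)²/m) with c = q/((1−q)N^q) and b_i = N_i^{q−1}, which is what the CBS argument bounds; the sampling ranges, seed and function name are illustrative assumptions.

```python
import random

def lyapunov_rate(n, q, gain=1.0):
    """dV/dt = -gain * c^2 * (sum b_i^2 - (sum b_i)^2 / m), with b_i = N_i^(q-1)."""
    m = len(n)
    total = sum(n)
    c = q / ((1.0 - q) * total ** q)
    b = [ni ** (q - 1.0) for ni in n]
    return -gain * c * c * (sum(bi * bi for bi in b) - sum(b) ** 2 / m)

random.seed(0)
worst = max(
    lyapunov_rate([random.uniform(0.05, 1.0) for _ in range(5)], q)
    for q in (0.5, 1.5, 2.0, 3.0)
    for _ in range(1000)
)
print(worst)                               # largest sampled value of dV/dt
print(lyapunov_rate([0.25] * 4, 2.0))      # uniform state
```

In this sketch the largest sampled value is not positive, and the rate vanishes at the uniform state, in line with the Cauchy-Bunyakovsky-Schwarz bound.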

Internal Energy Constraint
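The case of more than one constraint can be treated in the same fashion. Suppose the energy conservation law (22) holds in addition to the mass conservation law (19).
Then the evolution law should have the form

$$u_i = \frac{\Gamma q}{(1-q) N^q} N_i^{q-1} + \lambda_1 E_i + \lambda_2, \quad i = 1, \ldots, m, \tag{27}$$

where $\lambda_1$ and $\lambda_2$ are Lagrange multipliers that can be determined by substituting (27) into (19) and (22), that is, from the conditions $\sum_i u_i = 0$ and $\sum_i E_i u_i = 0$.
The solutions for $\lambda_1$ and $\lambda_2$ are given by the formulas

$$\lambda_1 = \frac{\Gamma q}{(1-q) N^q} \cdot \frac{\sum_i E_i \sum_i N_i^{q-1} - m \sum_i E_i N_i^{q-1}}{m \sum_i E_i^2 - \left( \sum_i E_i \right)^2}, \qquad \lambda_2 = \frac{\Gamma q}{(1-q) N^q} \cdot \frac{\sum_i E_i \sum_i E_i N_i^{q-1} - \sum_i E_i^2 \sum_i N_i^{q-1}}{m \sum_i E_i^2 - \left( \sum_i E_i \right)^2}. \tag{28}$$

This solution is defined for

$$m \sum_{i=1}^{m} E_i^2 - \Big( \sum_{i=1}^{m} E_i \Big)^2 \neq 0.$$

It holds for all cases except the degenerate case when all $E_i$ are equal.
The general form of the evolution law can be obtained by substituting $\lambda_1$ and $\lambda_2$ from (28) into Equation (27). In abbreviated form we represent this law as

$$\dot{N} = \frac{q}{1-q} \tilde{\Gamma} (I - \Psi_{Ts}) N^{q-1}, \tag{29}$$

where $N = (N_1, \ldots, N_m)^T$, the power of the vector is understood componentwise, $I$ is the $m \times m$ identity matrix, $\tilde{\Gamma} = \Gamma / N^q$ (here $N^q$ refers to the total number of particles), and $\Psi_{Ts}$ is a symmetric $m \times m$ matrix defined as follows:

$$\Psi_{Ts} = \frac{m E E^T - (\mathbb{1}^T E)(E \mathbb{1}^T + \mathbb{1} E^T) + (E^T E)\, \mathbb{1} \mathbb{1}^T}{m E^T E - (\mathbb{1}^T E)^2},$$

where $\mathbb{1} = (1, \ldots, 1)^T$ and $E = (E_1, \ldots, E_m)^T$ is the vector of energies.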
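The two-constraint law can be explored with a short simulation. This is a minimal sketch, assuming the controls have the form u_i = g_i + λ1 E_i + λ2 with g_i = Γq N_i^{q−1}/((1−q)N^q), and that λ1, λ2 are obtained from Σ u_i = 0 and Σ E_i u_i = 0; the energy levels, initial state, Euler scheme and function names are illustrative assumptions.

```python
def tsallis_entropy(n, q):
    """Tsallis entropy of the normalized occupation numbers N_i / N."""
    total = sum(n)
    return (1.0 - sum((ni / total) ** q for ni in n)) / (q - 1.0)

def sg_velocity(n, energies, q, gain):
    """Speed-gradient controls projected onto mass and energy conservation."""
    m = len(n)
    total = sum(n)
    g = [gain * q / ((1.0 - q) * total ** q) * ni ** (q - 1.0) for ni in n]
    s_g = sum(g)
    s_e = sum(energies)
    s_eg = sum(e * gi for e, gi in zip(energies, g))
    s_ee = sum(e * e for e in energies)
    det = m * s_ee - s_e ** 2
    if det == 0:
        raise ValueError("degenerate case: all energies are equal")
    lam1 = (s_e * s_g - m * s_eg) / det      # multiplier of the energy constraint
    lam2 = (s_e * s_eg - s_ee * s_g) / det   # multiplier of the mass constraint
    return [gi + lam1 * e + lam2 for gi, e in zip(g, energies)]

def simulate(n0, energies, q, gain=1.0, dt=0.01, steps=2000):
    """Explicit Euler integration; returns the final state and the entropy history."""
    n = list(n0)
    history = [tsallis_entropy(n, q)]
    for _ in range(steps):
        u = sg_velocity(n, energies, q, gain)
        n = [ni + dt * ui for ni, ui in zip(n, u)]
        history.append(tsallis_entropy(n, q))
    return n, history

# Illustrative data (assumptions): four cells, N = 1, q = 2.
energies = [0.0, 1.0, 2.0, 3.0]
n0 = [0.1, 0.5, 0.3, 0.1]
n_final, history = simulate(n0, energies, q=2.0)
print(n_final)
print(sum(n_final), sum(e * ni for e, ni in zip(energies, n_final)))
```

Both the total mass and the total energy stay at their initial values while the entropy grows; for q = 2 the limit state is linear in E_i, which is the form of the Tsallis distribution for that q.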

Equilibrium Stability
As before, it can be shown that $V(X, q) = S_{\max}(q) - S(X, q)$ is a Lyapunov function and that there is a unique stable equilibrium state of the system in non-degenerate cases. Let us demonstrate it.
We substitute $\lambda_1$ and $\lambda_2$ from (28) into Equation (27), and substitute the resulting expression for $u_i$ into (30). The resulting expression is We introduce a new scalar product function for two vectors as: $\lambda_2 E_i$ and taking into account Equation (39), Equation (37) can be transformed to where $\beta = -\frac{1}{q-1} \frac{\lambda_1}{\lambda_2}$. We can see that (40) coincides with the Tsallis distribution (35). As mentioned in [21], $\beta$ in Equation (35) is not the Lagrange multiplier associated with the internal energy constraint (which is $\lambda_1$ in our notation). Following the notation of C. Tsallis (see Equation (10) in [2]) we have $\lambda_1 = -\lambda_2 \beta (q - 1)$. This explains the variable substitution in (39).
It is evident that (40) satisfies the normalization constraint (19). Let us check that the second constraint, for the energy (22), is also satisfied.
Let us substitute $N_i$ from (36) into (22). Then we get After substituting (41) into (36) we get This means that the internal energy constraint (22) holds for (40).

Conclusions
We have investigated non-stationary states of processes that follow the MaxEnt principle for the Shannon and the Tsallis entropies. We have derived Equations (21), (23), (25) and (29), which describe the dynamics of a system that tends to the state with maximum entropy. Systems with a discrete probability distribution of states were considered under mass conservation and energy conservation constraints. We have shown that the limit distribution is unique and corresponds to the Gibbs-Jaynes distribution in the case of the Shannon entropy and to the Tsallis distribution in the case of the Tsallis entropy.
Both the Shannon and the Tsallis entropies can also be defined for continuous probability distributions. The methods described in this paper can be extended to probability density functions (pdf).
The key point of our approach is the use of the SG-method with the goal function chosen as the entropy of the system. The SG-principle generates equations for the transient (non-stationary) states of the system operation, i.e., it answers the question "How will the system evolve?" This distinguishes the SG-principle from the principle of maximum entropy, the principle of maximum Fisher information and others, which characterize steady-state processes and answer the questions "To where?" and "How far?"
There are many entropies used in the literature [22]. The most popular are the Rényi entropy [23], the Tsallis entropy [2], the Cressie-Read family [4,5] and the Burg entropy [24]. General forms of entropies based on Lyapunov functionals are reviewed in [3,25], where entropy is understood as a measure of uncertainty which increases in Markov processes. Investigating the dynamics for these entropies on the basis of the SG-principle seems promising for future research.