Principle of Minimum Discrimination Information and Replica Dynamics

The dynamics of many complex systems can be described by replicator equations (RE). Here we present an effective method for solving a wide class of RE based on reduction theorems for models of inhomogeneous communities. The solutions of the RE minimize the discrimination information of the initial and current distributions at each point of the system trajectory, not only at the equilibrium, under time-dependent constraints. Applications to inhomogeneous versions of some conceptual models of mathematical biology (logistic and Ricker models of populations and Volterra models of communities) are given.


Introduction
Replicator equations describe the dynamics of distributions in heterogeneous populations and communities under selective forces, when the heterogeneity implies the existence of selective differences between individuals. One of the first replicator equations was used by Fisher, Haldane, and Wright to study the evolution of multi-allelic one-locus gene frequencies under the force of natural selection (for more information see [1]). Another well-known source of replicator equations is evolutionary game theory [2,3].

A very high or even infinite system dimensionality is one of the most fundamental difficulties in the study of replicator equations. Another approach to the inference of an unknown distribution $p$ subject to some given testable information (constraints) about the system can be based on the Principle of minimum discrimination information, MinxEnt (in equivalent terms, the Principle of maximum information entropy, MaxEnt). The divergence between the distribution $p$ and a reference distribution $m$ can be measured with the discrimination information $I[p:m]$ (also known as the KL-divergence between $p$ and $m$; see s.2 for definitions). The MinxEnt principle [4] states that, given new information, a new distribution $p$ should be chosen in such a way as to minimize $I[p:m]$; see also [5,6] for the rationale and applications of the MaxEnt principle.
A grave objection against this approach is that the MaxEnt principle does not follow from the basic laws and fundamental theories and hence may or may not be postulated as an independent assertion. The problem can be eliminated for some particular systems if one can derive the MaxEnt principle from the system dynamics.
Generally, in applications to dynamical systems the MinxEnt principle is used to estimate the unknown distribution $p$ at given constraints when the system is in equilibrium. Recently it was shown that the MinxEnt principle is valid as an exact theorem for a wide class of selection systems not only in equilibrium states but also at every point along the system trajectory. More precisely, it was proven in [7] that: 1) some complex models of selection systems can be reduced to an escort system of ordinary differential equations; 2) the solutions of the corresponding RE have the form of time-dependent Boltzmann distributions (in other terms, they belong to the exponential family of distributions), and conversely, every time-dependent Boltzmann distribution satisfies a replicator equation. For a simplified model of the selection system it was shown in [8] that 3) the Dynamical principle of minimum discrimination information (MinxEnt) is valid: the solution to the RE minimizes the KL-divergence of the initial and current distributions under some natural constraints at every instant; these constraints can in turn be computed explicitly at every moment from the system dynamics. The obtained results were illustrated on some simple models of freely growing Malthusian inhomogeneous populations.
In this paper we consider a more general version of selection systems with self-regulation than the systems studied in [7]; it allows us to account for a possible dependence of the reproduction rate on different statistical characteristics of the set of traits given in the model, such as mean values, covariances, and higher moments of the system distribution.
We formulate the reduction theorem (Section 3) and the Dynamical MinxEnt principle for such systems (Section 4); the main results of these sections are similar to those obtained in [7,8], but they apply to more general selection systems. In Section 5 the results are applied to inhomogeneous versions of the classical logistic and Ricker population models. In Section 6 we give the "conjugate" description of the solution to the RE based on the Dynamical MinxEnt principle; we show that the KL-divergence is the Legendre transform of the logarithm of the partition function for the corresponding time-dependent Boltzmann distributions; we also show that the solution to the escort system and the current mean values of the "traits" accounted for by the selection system are conjugate variables. In Section 7 we extend the reduction theorem and the Dynamical MinxEnt principle to models of inhomogeneous communities and apply them to some classical Volterra-type models of mathematical biology.
Overall, three fundamental mathematical objects are under consideration in this paper: the MinxEnt principle, exponential families of distributions, and replica dynamics (selection systems). There exist close interconnections between these objects (loosely speaking, they are in some sense equivalent). In particular:
(A) Kullback's theorem (or Jaynes' theorem, for MaxEnt) states that the MinxEnt distribution belongs to the exponential family, implying that MinxEnt => Exponential family;
(B) It is almost evident (and was mentioned in the literature [8,9]) that if a distribution belongs to the exponential family, then it satisfies the MinxEnt principle, suggesting the inverse implication: Exponential family => MinxEnt;
(C) Any time-dependent exponential distribution solves the corresponding replicator equation, so the implication time-dependent Exponential family => Replicator equation becomes trivial;
(D) The major focus of this paper is on the inverse implication: Replica dynamics (selection system) => time-dependent Exponential family. It is composed of the reduction theorem and formulas for the system distribution and current constraints.
Not only the Dynamical MinxEnt principle but also the reduction theorem follows from the fact that the solution to the replicator equation belongs to the exponential family. In this case all current statistical characteristics of the system can be computed with the help of the moment generating function (or, more generally, the generating functional) of the initial distribution. This implies that one can construct a closed escort system for auxiliary variables whose time derivatives are equal to the "weights" of the traits that define the reproduction rate. These variables coincide with the Lagrange multipliers for the time-dependent MinxEnt distribution. The dimensionality of the escort system does not depend on the dimensionality of the initial model and is equal to the number of traits. Exact formulations are given in s.3 and the Mathematical Appendix.

The KL-Divergence and MinxEnt Principle
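In the case of continuous distributions the discrimination information, or the KL-divergence of the distribution $p$ from $m$, is:
$$I[p:m] = \int p(x) \ln \frac{p(x)}{m(x)}\,dx.$$
The distribution that minimizes $I[p:m]$ subject to normalization and to given mean values $A_s = \int \varphi_s(x)\, p(x)\,dx$ of some functions $\varphi_s$ has the form:
$$p(x) = \frac{m(x)}{Z} \exp\Big(\sum_s \lambda_s \varphi_s(x)\Big), \qquad (2.2)$$
where $Z = \int m(x) \exp\big(\sum_s \lambda_s \varphi_s(x)\big)\,dx$ is known as a partition function and the Lagrange multipliers $\lambda_s$ solve the system:
$$\frac{\partial \ln Z}{\partial \lambda_s} = A_s. \qquad (2.3)$$
Distributions of the form (2.2) belong to the exponential family of distributions [10]. The minimized value of the discrimination information is:
$$I[p:m] = \sum_s \lambda_s A_s - \ln Z.$$
In the MinxEnt distribution (2.2), the information concerning the constraints $A_s$ is encoded in the set of Lagrange multipliers $\lambda_s$ via equation (2.3), given the reference pdf $m$; conversely, if the $\lambda_s$ in (2.2) are known, then the MinxEnt distribution is also known and hence the mean values $A_s$ can be computed. In other words, the MinxEnt principle implies the equivalence of the description of the distribution by the set of constraints and by the set of Lagrange multipliers. Below (see Section 6) we show that for replica dynamics this equivalence is universal and does not depend on the MinxEnt principle.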
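As an illustration, the following minimal Python sketch computes a MinxEnt distribution on a small discrete grid. It assumes the exponential-family form $p(x) \propto m(x)\exp(\lambda \varphi(x))$ with a single mean-value constraint; the grid, the uniform reference distribution, the target mean, and the function names `minxent` and `kl` are hypothetical choices made only for this example. The multiplier is found by bisection, which is possible because the tilted mean is an increasing function of $\lambda$.

```python
import math

def minxent(xs, m, phi, target, lo=-50.0, hi=50.0, iters=200):
    """Tilt reference weights m on the grid xs so that the mean of phi equals target."""
    def tilted(lam):
        w = [mi * math.exp(lam * phi(x)) for x, mi in zip(xs, m)]
        z = sum(w)  # partition function Z(lam)
        return [wi / z for wi in w], z

    def mean(lam):
        p, _ = tilted(lam)
        return sum(pi * phi(x) for x, pi in zip(xs, p))

    # The tilted mean increases with lam (its derivative is a variance),
    # so the constraint equation can be solved by bisection.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean(mid) < target:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    p, z = tilted(lam)
    return lam, p, z

def kl(p, m):
    """Discrimination information I[p:m] for discrete distributions."""
    return sum(pi * math.log(pi / mi) for pi, mi in zip(p, m) if pi > 0)

# Hypothetical example: uniform reference on {0,...,4}, required mean 3.
xs = [0, 1, 2, 3, 4]
m = [0.2] * 5
A = 3.0
lam, p, z = minxent(xs, m, lambda x: x, A)
i_min = kl(p, m)
```

In this sketch the computed divergence `i_min` should agree with $\lambda A - \ln Z$ up to rounding, which is the discrete analogue of the minimized value of the discrimination information.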

Selection Systems with Self-regulation and the Reduction Theorem
The selection system is a mathematical model of an inhomogeneous population, in which every individual is characterized by a vector-parameter $a = (a_1, a_2, \ldots)$ that takes values in a set $A$. The parameter $a$ specifies an individual's inherited invariant properties and does not change with time; the set of all individuals with a given value of the vector-parameter $a$ in the population is called an $a$-clone. Let $l(t,a)$ be the density of the population at the moment $t$ over the parameter $a$, so that the total population size is $N(t) = \int_A l(t,a)\,da$, and let $F(t,a)$ be the per capita reproduction rate (Malthusian fitness) at the moment $t$. We suppose that the reproduction rate of every $a$-clone does not depend on other clones but can depend on $a$ and on some general population characteristics such as the total population size. These quantities evolve with time, providing some self-regulation of the system dynamics. For example, the reproduction rate for the logistic model is proportional to $1 - N(t)/B$, where $B$ is the upper bound of the population size; the reproduction rate of the Ricker model is proportional to $\exp(-\beta N(t))$. An abstract selection system (or, in the author's terms, a system with inheritance) was studied in [11] (see also references to earlier work therein), where a general selection theorem was proven.
In [7] a class of selection systems with self-regulation was studied and a reduction theorem was proved; the theorem gives an effective algorithm for investigation of the selection systems and corresponding replicator equations.Below we formulate a more general version of this theorem.
It was assumed in [7] that the individual reproduction rate can depend on two types of integral characteristics of the system ("regulators"): the extensive characteristics, which depend on the total size of the system (as in most population models) and intensive characteristics, which do not depend on the total size but only on the population frequencies (as in most genetic models).The intensive characteristics are of the form: The mathematical form of the fitness (3.5) suggests (from a biological point of view) that the individual fitness depends on a given finite set of traits.
The function $\varphi_i(a)$ in (3.5) may describe the quantitative contribution of a particular $i$-th trait to the total fitness, and then $u_i(t, G)$ describes the relative importance (weight) of the trait contributions, which at every time moment can depend on the state of the environment, the population size, the mean, variance, covariance, and other statistical characteristics of the traits. We emphasize that the model accounts for the interactions between the traits only with the help of a given set of regulators. For example, if one needs to account for all moments up to the 2nd order, the following set of regulators should be used: Then, the covariance between the traits $\varphi_i, \varphi_k$ at the moment $t$ is a function of these regulators: Clearly, in this way one can account for the dependence of the fitness on mixed moments of any order; however, the approach described below is truly useful only when one considers just a few regulators.
In model (3.5)-(3.6) the regulators, and hence the reproduction rate $F(t,a)$, are not given explicitly but should be computed using the current pdf $P(t,a)$ at each time moment, so in the general case the model is a nonlinear equation of infinite dimensionality. Nevertheless, it can be reduced to a Cauchy problem for the escort system of ODE. For a less general version of the model (which allows dependence of the functions $u_i$ on a single regulator only) the reduction theorem was proven in [7]. Below we formulate a more general version of this theorem, which gives an effective algorithm for the investigation of selection systems and the corresponding replicator equations.
Introduce the generating functional: , and $r(a)$ is a measurable function on $A$.
Define auxiliary variables as a solution to the escort system of differential equations: , and: . Then the functions: In particular the total size of the population: As a corollary, we obtain the central formula for the current distribution of the system: for any (measurable) function.
Equality (3.12) shows that pdf (3.15) belongs to the exponential family of distributions [10]. In more "physical" terms, the pdf is the time-dependent Boltzmann distribution with the Boltzmann factor $\exp(B)$, where: and the partition function: Note that in our case the partition function is completely known, given the initial pdf $P(0,a)$ and the solution to the Cauchy problem (3.10). Within the framework of selection system (3.5)-(3.6) the partition function has a clear biological meaning: it is proportional to the current population size, which follows from formula (3.14).
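The idea of the reduction can be illustrated with a minimal Python sketch. It assumes, purely for illustration, a single-trait logistic-type fitness $F(t,a) = a\,(1 - N(t)/B)$, a finite set of clones, and explicit Euler time stepping; none of these choices, nor the function names, are taken from the text above. The full model has one equation per clone, whereas the escort system in this special case is the single equation $dq/dt = 1 - N(0) M(q)/B$, $q(0) = 0$, where $M$ is the mgf of the initial distribution of $a$; the densities are then recovered as $l(t,a) = l(0,a)\exp(a\,q(t))$.

```python
import math

def simulate_full(a, l0, B, dt, steps):
    """Euler integration of dl(t,a)/dt = a * l(t,a) * (1 - N(t)/B), one ODE per clone."""
    l = list(l0)
    for _ in range(steps):
        u = 1.0 - sum(l) / B  # common self-regulation factor
        l = [li + dt * ai * li * u for ai, li in zip(a, l)]
    return l

def simulate_escort(a, l0, B, dt, steps):
    """Euler integration of the single escort ODE dq/dt = 1 - N(0)*M(q)/B, q(0) = 0."""
    n0 = sum(l0)
    p0 = [li / n0 for li in l0]  # initial distribution of the trait a

    def mgf(q):
        # Moment generating function of the initial distribution.
        return sum(pi * math.exp(ai * q) for ai, pi in zip(a, p0))

    q = 0.0
    for _ in range(steps):
        q += dt * (1.0 - n0 * mgf(q) / B)
    # Clone densities recovered from the auxiliary variable q.
    return q, [li * math.exp(ai * q) for ai, li in zip(a, l0)]

# Hypothetical parameters chosen only for this example.
a = [0.5, 1.0, 1.5]
l0 = [1.0, 1.0, 1.0]
B, dt, steps = 100.0, 2e-4, 50000

l_full = simulate_full(a, l0, B, dt, steps)
q_end, l_escort = simulate_escort(a, l0, B, dt, steps)
```

Under these assumptions the two computations should agree up to the discretization error of the Euler scheme, although the escort computation integrates one equation regardless of the number of clones.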

Dynamical MinxEnt Principle
Comparing the distribution (3.15) with the MinxEnt distribution (2.2), one can conclude that, under time-dependent constraints, the solution to replicator equation (3.6) minimizes the KL-divergence of the initial and current distributions not only at the equilibrium but also at each point of the system trajectory. These constraints in turn can be computed explicitly at every instant from the system dynamics. The following theorems collect together the corresponding mathematical assertions. Theorem 2) The values of the constraints evolve according to escort system (3.10) and at each time moment can be computed using the formula: and can be computed using the following formulas:
In the following section we demonstrate how Theorems 1-3 can be applied to some classical population models. A similar theory can be developed for replicator equations with discrete time and the corresponding selection systems (maps) [13].
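The dynamical principle can be checked numerically in the simplest setting. The minimal Python sketch below assumes a freely growing Malthusian population with $dl(t,a)/dt = a\,l(t,a)$ on three clones, for which the auxiliary variable is simply $q(t) = t$; the trait values, the initial distribution, and the perturbation are hypothetical choices for this example only. It compares the current Boltzmann distribution with a rival distribution that has the same normalization and the same mean trait value.

```python
import math

def boltzmann(a, p0, q):
    """Time-dependent Boltzmann distribution P(0,a) * exp(a*q) / M(q)."""
    w = [pi * math.exp(ai * q) for ai, pi in zip(a, p0)]
    z = sum(w)  # mgf of the initial distribution evaluated at q
    return [wi / z for wi in w], z

def kl(p, m):
    """Discrimination information I[p:m] for discrete distributions."""
    return sum(pi * math.log(pi / mi) for pi, mi in zip(p, m) if pi > 0)

# Hypothetical data: three clones with Malthusian parameters a.
a = [0.0, 1.0, 2.0]
p0 = [0.5, 0.3, 0.2]
t = 0.7  # for dl/dt = a*l the auxiliary variable is q(t) = t

p_t, z = boltzmann(a, p0, t)
mean_t = sum(ai * pi for ai, pi in zip(a, p_t))
kl_t = kl(p_t, p0)

# A rival distribution with the same total mass and the same mean of a:
# the perturbation (eps, -2*eps, eps) is orthogonal to both (1,1,1) and a.
eps = 0.01
rival = [p_t[0] + eps, p_t[1] - 2 * eps, p_t[2] + eps]
kl_rival = kl(rival, p0)
```

In this example `kl_t` should equal $t\,E_t[a] - \ln M(t)$ and should be smaller than `kl_rival`, in line with the statement that the current distribution minimizes the discrimination information under the current mean-value constraint.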

Inhomogeneous Logistic Model
Many particular models of selection systems have the form of inhomogeneous logistic equation: The general solution to this equation with distributed parameters was obtained in [7], example 5.

Let $M(\lambda_1, \lambda_2) = E[\exp(\lambda_1 \beta + \lambda_2 \mu)]$ be the mgf of the joint initial distribution of $\beta$ and $\mu$. Then where $q_1, q_2$ solve the escort system: The total population size and the current distribution: Now we are able to apply the results of s.4. The discrimination information at moment $t$: The distribution (5.3) provides the minimum of discrimination information, which is equal to (5.4), at each time moment among all distributions subject to the given mean values of the birth and death rates at this moment: A particular case of equation (5.1) was studied in [12] for independent uniformly distributed parameters. It was proven in [7] The current mean values of the birth and death rates: In the general case, the solution to (5.6) is given by (5.3). The asymptotic behavior of the solution to equation (5.6) varies dramatically depending on the initial distribution. Let the positive parameters $\beta$, $\mu$ be independent again, let the initial distributions of both parameters be exponential, and let $N(0) = 1$ for simplicity. Then Hence, the solution to equation (5.6) provides the minimum of the discrimination information over all the distributions subject to the mean values of the birth and death rates (5.10), and this minimum is equal to (5.11).

Inhomogeneous Ricker Model
Let us consider the inhomogeneous version of the well-known Ricker equation: (5.12) The general solution to this equation with distributed parameters was obtained in [7], example 6. Let [...] Applying the results of s.4 we can compute the discrimination information at moment $t$: The current mean values of the parameters are given by: (5.17) The pdf (5.15) provides the minimum of the discrimination information at every time moment subject to the constraints (5.17), and this minimum is equal to (5.16).
For example, let the parameters $\beta$ and $\mu$ be independent and exponentially distributed in $[0,\infty)$ with the means $s_1$ and $s_2$ at the initial instant. Then , and: (5.18) This equation has a stable state $q = s_1$. As $t \to \infty$, $q(t) \to s_1$, the total population size tends to infinity, and the population density concentrates at the value $\mu = 0$ of the parameter $\mu$ and vanishes in any finite interval of values of the parameter $\beta$. The distribution: provides the minimum of the discrimination information subject to the constraints: and this minimum is equal to: (5.21)

"Conjugative" Approach to the Selection System Dynamics
In the previous section we presented solutions to the inhomogeneous logistic and Ricker models using the corresponding auxiliary variables and escort systems. These solutions minimize the discrimination information under certain constraints and hence can be found by solving a constrained optimization problem. Similar results can be obtained for other models of inhomogeneous populations. Let us clarify the interconnections between these two approaches.
Let us first come back to logistic model (5.1). Formally, the values $q_1(t), q_2(t)$ at moment $t$ can be found independently of system (5.2) by minimization of the discrimination information (5.4); the values obtained in this way are equal to the solution to system (5.2). Practically this means that we can solve equations (5.5) for $q_1(t), q_2(t)$, and this solution must coincide with the solution to system (5.2). To be more specific, let us consider a simple model (5.6) with exponentially distributed birth and death rates at the initial moment. For this model: where $\alpha_i^*$ is the value at which the right-hand side of (6.2) reaches its supremum.
It follows from the theorem that the dynamics of the selection system and the corresponding replicator equation can be equally described either in terms of the auxiliary variables $q_i(t)$ or in terms of the constraints, i.e., the current mean values of the traits, $E_t[\varphi_i]$, and this equivalence does not depend on the MinxEnt principle. Technically, the former approach is more appropriate, as the auxiliary variables can be found from the escort system. The latter approach is of fundamental importance, because it shows that in order to completely determine the dynamics of system (3.5) and its distribution at any time moment it is enough to know only the mean values of the traits at this moment together with the initial distribution.
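The conjugacy between the auxiliary variable and the current mean value can be illustrated with a minimal Python sketch. It assumes a single trait whose initial distribution is exponential with mean $s$, so that $\ln M(\alpha) = -\ln(1 - s\alpha)$ for $\alpha < 1/s$; the numerical values and the crude grid search are hypothetical choices for this example, not the procedure used in the text. Given the current mean $A$, the sketch maximizes $\alpha A - \ln M(\alpha)$ and compares the maximizer with the auxiliary variable $q$ and the maximum with the closed-form KL-divergence between two exponential distributions.

```python
import math

def log_mgf(alpha, s):
    """ln M(alpha) for an exponential distribution with mean s (requires alpha < 1/s)."""
    return -math.log(1.0 - s * alpha)

def legendre_sup(A, s, n=20000):
    """Grid search for sup over alpha of alpha*A - ln M(alpha), alpha in [-5/s, 1/s)."""
    best_val, best_alpha = -math.inf, None
    for k in range(n):
        alpha = (-5.0 + 6.0 * k / n) / s
        val = alpha * A - log_mgf(alpha, s)
        if val > best_val:
            best_val, best_alpha = val, alpha
    return best_val, best_alpha

# Hypothetical values: initial mean s and current auxiliary variable q.
s, q = 2.0, 0.3
# Current mean of the trait, d ln M / dq; the current pdf is again exponential.
A = s / (1.0 - s * q)
# Closed-form KL-divergence of Exp(mean A) from Exp(mean s).
kl_closed = math.log(s / A) + A / s - 1.0

best_val, best_alpha = legendre_sup(A, s)
```

In this example the maximizer `best_alpha` should be close to `q` and the maximum `best_val` close to `kl_closed`, so the current mean value alone, together with the initial distribution, is enough to recover both the auxiliary variable and the discrimination information.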

Reduction Theorem and Dynamical MinxEnt
Consider the model of a community consisting of $r$ interacting populations. We suppose again that every individual is characterized by its own value of the vector-parameter $a$. Let $l_j(t,a)$ be the density of the $j$-th population at moment $t$. In this section we consider the model of an inhomogeneous community where the reproduction rates can depend on current characteristics of every population in the community, which compose a "regulator". Formally, we consider the set of $m$ regulators, each of which is an $r$-dimensional vector-function where: Each regulator corresponds to an appropriate weight function. The theory for the inhomogeneous community model (7.1)-(7.4) is similar to the theory presented in Sections 3 and 4 for inhomogeneous populations, up to more complex technical details. Theorem 5 (see MA for the complete formulation) reduces the complex model (7.1)-(7.3) to an escort system of ordinary non-autonomous equations of dimension $r \times n$ and gives the solution to replicator equation (7.4). Theorem 6 (MA) establishes the Dynamical MinxEnt Principle for the inhomogeneous community model and gives explicit formulas for the discrimination information and the constraints at each time moment. Let us apply the general theory to some classical models of biological communities consisting of interacting inhomogeneous populations.

Inhomogeneous Prey-predator Volterra Model
The prey-predator Volterra model in its simplest form reads:
$$\frac{dx}{dt} = a_1 x - a_2 x y, \qquad \frac{dy}{dt} = -a_3 y + a_4 x y, \qquad (7.5)$$
where $x$ and $y$ denote prey and predator densities, $a_1$ is the reproduction rate of the prey population, $a_2$ is the per capita rate of the consumption of prey by the predators, $a_3$ is the death rate of the predator, and $a_4/a_2$ is the fraction of prey biomass that is converted into predator biomass. Let us consider the inhomogeneous version of this classical model, supposing that the parameters $a_1$, $a_2$, and $a_3$ are distributed and the ratio $a_4/a_2$ is fixed (and hence could be chosen equal to 1). We also suppose that the reproduction and death processes are specific for each subpopulation, while the consumption is driven by the interaction of the prey (predator) subpopulation with the entire predator (prey) population. Let be the total sizes of the populations. The initial population sizes and the initial distributions of the parameters are assumed to be given. The total rate of consumption is equal to: Assuming the "proportional distribution" of prey among the predators, we can write the inhomogeneous version of the Volterra model in the form: Theorem 5 gives a method for studying this model and a more general model (7.2); the principal step is a reduction of the model to the escort system of ODE. It is instructive to deduce the escort system and the main results informally to clarify the main idea of the method in application to community models.
It is natural to suppose that the parameter $a_3$ is stochastically independent of the parameters $a_1$ and $a_2$. Let $M$ be the mgf of the initial distribution of the parameter $a_3$. Introduce the auxiliary variables as a solution to the Cauchy problem: Then system (7.8) Finally, we obtain a closed system of non-autonomous equations: Now that we have a solution to the Cauchy problem for this system with zero initial values, we can get explicit formulas for the total population sizes (7.12) and the current distribution of the parameters (7.13), which completely solve the problem. In particular, the current mean values of the parameters: One can check that the obtained formulas coincide with the formulas that follow from Theorem 5, MA. The current discrimination information for the inhomogeneous Volterra model: The distributions (7.13) provide the minimum of the discrimination information, equal to (7.17), over all distributions compatible with constraints (7.16).
Integrating the equations of system (7.8) over the parameters we obtain the system: These equations for the total sizes of inhomogeneous populations have the same form as the initial Volterra system (7.5); the difference is that now the parameter values are not constants but vary over time according to formulas (7.16). The phase-parametric portrait of the "homogeneous" Volterra model is well known (see, e.g., [14]). The dynamics of system (7.18) is determined by the parametric point, which moves across the parametric portrait of model (7.5). This phenomenon, which may be referred to as "traveling across the parametric portrait of a homogeneous model", is a common feature of the corresponding inhomogeneous models. It was clearly observed in the example of discrete-time models [13]. For a Volterra-type model of two inhomogeneous populations with logistic reproduction rates and a ratio-dependent predator functional response the phenomenon was studied in detail in [15].

Competition of Two Inhomogeneous Populations
The dynamics of two populations competing for a common resource can be described by the following logistic-like model (see, e.g., [14]): where we denoted:
[...] distribution, whose parameters solve the escort system. With the solution to the replicator equation we can compute the current mean values of the traits at any instant. Then, treating these mean values as constraints, we can show that the "MinxEnt distribution" coincides with the solution of the replicator equation, which was obtained independently of the MinxEnt algorithm. Hence, the principle of minimum discrimination information can be considered as the variational principle that governs the selection system dynamics. Overall, both the reduction theorem and the Dynamical MinxEnt principle stem from the fact that the solution to the replicator equation belongs to the exponential family.
On the other hand, it is easy to show that any Boltzmann distribution with time-dependent parameters solves the corresponding replicator equation [8] and hence provides the minimum of discrimination information for the distribution of the associated selection system.It means that the replica dynamics is the "natural habitat" for the MinxEnt principle; within the framework of selection systems, we cannot choose whether or not to ascribe the property of minimization of the information discrimination to the selection system.It is an intrinsic property of any solution to the replicator equations that is fulfilled due to the system dynamics at any point of the system trajectory.More generally, the MinxEnt principle is an internal property of the process of natural selection at every moment of the system evolution.
We showed that the discrimination information is the Legendre transform of the logarithm of the partition function; the auxiliary variables and the constraints are conjugate variables under this transformation. This assertion clarifies the meaning of the auxiliary variables and the role of the escort system. It also implies that the dynamics of the selection system can be equally well described either in terms of the auxiliary variables or in terms of the constraints, which are the current mean values of the traits. Hence, the solution to the replicator equation is completely determined by the current mean values of the traits, subject to the important condition that the initial distribution is known.
Our approach is illustrated for inhomogeneous versions of some conceptual models of mathematical biology, namely, the logistic and Ricker models of populations and the Volterra models of communities. Formally, the inhomogeneous versions can be written in the same form as the initial models, with their current mean values substituted for the fixed parameters, as was done in s.7 for the Volterra system. As a result, one obtains a complex non-linear integro-differential system, which can hardly be studied directly. The developed approach allows us to reduce this complex model to a system of two non-autonomous differential equations, which is specific for every initial distribution of the parameters. The same method works for other models and allows us to find the distribution of the systems at any time moment.
Looking beyond the formal solutions, we can now reveal a general optimization principle that governs the replica dynamics in many problems in mathematical biology, evolutionary game theory [3], the Eigen quasispecies theory [18], etc. As a result of the selection process, these systems evolve in such a way that the discrimination information is minimized at each time moment, given the mean values of the traits accounted for. The values of the constraints and the corresponding minimal value of the discrimination information can be determined for each specific system by the developed method; we can conclude that for this type of model the MinxEnt principle follows from the system dynamics.

Distribution (5.7) provides the minimum of discrimination information at each time moment among all distributions concentrated in the rectangle, subject to the mean values of the birth and death rates (5.8).
denote prey and predator densities,
Distributions (7.32) provide the minimum (7.34) of discrimination information for model (7.21) at each time moment among all distributions subject to the mean values (7.33).

Dynamics of the constraints are determined by the covariance equation:
and the population vanishes in any finite interval of values of both parameters, $\beta$ and $\mu$.
can be written formally as: The constraint values can be computed at each time moment by the formula: