
Principle of Minimum Discrimination Information and Replica Dynamics

Lockheed Martin MSD, National Institutes of Health, Bldg. 38A, Rm. 5N511N, 8600 Rockville Pike, Bethesda, MD 20894, USA
Entropy 2010, 12(7), 1673-1695; https://doi.org/10.3390/e12071673
Received: 4 March 2010 / Revised: 15 June 2010 / Accepted: 18 June 2010 / Published: 28 June 2010
(This article belongs to the Special Issue Entropy in Model Reduction)

Abstract

Dynamics of many complex systems can be described by replicator equations (RE). Here we present an effective method for solving a wide class of RE based on reduction theorems for models of inhomogeneous communities. The solutions of the RE minimize the discrimination information of the initial and current distributions at each point of the system trajectory, not only at the equilibrium, under time-dependent constraints. Applications to inhomogeneous versions of some conceptual models of mathematical biology (logistic and Ricker models of populations and Volterra models of communities) are given.
Keywords: replicator equation; selection system; model reduction; discrimination information; cross-entropy

1. Introduction

Replicator equations describe the dynamics of distributions in heterogeneous populations and communities under selective forces, when the heterogeneity implies existence of selective differences between individuals. One of the first replicator equations was used by Fisher, Haldane, and Wright to study the evolution of multi-allelic one-locus gene frequencies under the force of natural selection (for more information see [1]). Another well-known source of replicator equations comes from the evolutionary game theory [2,3].
A very high or even infinite system dimensionality is one of the most fundamental difficulties in the study of replicator equations. Another approach to inference of an unknown distribution $p$ subject to some given testable information (constraints) about the system can be based on the Principle of minimum discrimination information, MinxEnt (in equivalent terms, the Principle of maximum information entropy, MaxEnt). The divergence between the distribution $p$ and a reference distribution $m$ can be measured with the discrimination information $I[p:m]$ (also known as the KL-divergence between $p$ and $m$; see Section 2 for definitions). The MinxEnt principle [4] states that, given new information, a new distribution $p$ should be chosen in such a way as to minimize $I[p:m]$; see also [5,6] for rationalization and applications of the MaxEnt principle.
A grave objection against this approach is that the MaxEnt principle does not follow from basic laws and fundamental theories and hence may or may not be postulated as an independent assertion. The problem can be eliminated for some particular systems if one can derive the MaxEnt principle from the system dynamics.
Generally, in applications to dynamical systems the MinxEnt principle is used to estimate the unknown distribution $p$ under given constraints when the system is in equilibrium. Recently it was shown that the MinxEnt principle is valid as an exact theorem for a wide class of selection systems, not only in equilibrium states but also at every point along the system trajectory. More precisely, it was proven in [7] that: 1) some complex models of selection systems can be reduced to an escort system of ordinary differential equations; 2) the solutions of the corresponding RE have the form of time-dependent Boltzmann distributions (in other terms, they belong to the exponential family of distributions), and conversely, every time-dependent Boltzmann distribution satisfies a replicator equation. For a simplified model of the selection system it was shown in [8] that 3) the Dynamical principle of minimum discrimination information (MinxEnt) is valid: the solution to the RE minimizes the KL-divergence of the initial and current distributions under some natural constraints at every instant; these constraints can in turn be computed explicitly at every moment from the system dynamics. The obtained results were illustrated on some simple models of freely growing Malthusian inhomogeneous populations.
In this paper we consider a more general version of selection systems with self-regulation than the systems studied in [7]; it allows us to account for a possible dependence of the reproduction rate on different statistical characteristics of the set of traits given in the model, such as mean values, covariances and higher moments of the system distribution.
We formulate the reduction theorem (Section 3) and the Dynamical MinxEnt principle for such systems (Section 4); the main results of these sections are similar to those obtained in [7,8], but they are relevant for more general selection systems. In Section 5 the results are applied to inhomogeneous versions of the classical logistic and Ricker population models. In Section 6 we give the “conjugate” description of the solution to the RE based on the Dynamical MinxEnt principle; we show that the KL-divergence is the Legendre transform of the logarithm of the partition function for the corresponding time-dependent Boltzmann distributions; we also show that the solution to the escort system and the current mean values of the “traits” accounted for by the selection system are conjugate variables. In Section 7 we extend the reduction theorem and the Dynamical MinxEnt principle to models of inhomogeneous communities and apply them to some classical Volterra-type models of mathematical biology. Overall, three fundamental mathematical objects are under consideration in this paper:
MinxEnt principle, Exponential families of distributions, and Replica dynamics (selection system). There exist close interconnections between these objects (loosely speaking, they are in some sense equivalent). In particular,
(A) The Kullback (or Jaynes, for MaxEnt) theorem states that the MinxEnt distribution belongs to the Exponential family, implying that MinxEnt => Exponential family;
(B) It is almost evident (and was mentioned in the literature [8,9]) that if the distribution belongs to the exponential family, then it satisfies the MinxEnt principle, suggesting the inverse implication: Exponential family => MinxEnt;
(C) Any time-dependent exponential distribution solves the corresponding replicator equation, so the implication time-dependent Exponential family => Replicator equation becomes trivial;
(D) The major focus of this paper is on the inverse implication: Replica dynamics (selection system) => time-dependent Exponential family. It is composed of the reduction theorem and formulas for the system distribution and current constraints. Not only the Dynamical MinxEnt principle but also the reduction theorem follows from the fact that the solution to the replicator equation belongs to the exponential family. In this case all current statistical characteristics of the system can be computed with the help of the moment generating function (or, more generally, the generating functional) for the initial distribution. This implies that one can construct a closed escort system for auxiliary variables, whose time derivatives are equal to the “weights” of the traits that define the reproduction rate. These variables coincide with the Lagrange multipliers for the time-dependent MinxEnt distribution. The dimensionality of the escort system does not depend on the dimensionality of the initial model and is equal to the number of traits. Exact formulations are given in Section 3 and the Mathematical Appendix.

2. The KL-Divergence and MinxEnt Principle

In the case of continuous distributions the discrimination information, or the KL-divergence of the distribution $p$ from $m$, is:
$$I[p:m] = \int_A p(x) \ln\frac{p(x)}{m(x)}\,dx = E_p\!\left[\ln\frac{p}{m}\right] \tag{2.1}$$
The value $S[p:m] = -I[p:m]$ is also known as the relative entropy. The probability density function (pdf) $m$ is assumed to be given. The inference of $p$ by minimizing $I[p:m]$ (equivalently, by maximizing $S[p:m]$) is known as the Principle of minimum discrimination information, MinxEnt (accordingly, the Principle of maximum relative entropy, or cross-entropy, MaxEnt).
Assume that the expected values $A_s$ of some variables $\varphi_s$, $s = 1, \ldots, n$, over the pdf $p$ are known: $E_p[\varphi_s] = A_s$. Then the “MinxEnt distribution” $p^*$ that minimizes the discrimination information $I[p:m]$ subject to these constraints is:
$$p^*(a) = \frac{1}{Z}\, m(a) \exp\!\Big(\sum_{s=1}^{n} \lambda_s \varphi_s(a)\Big) \tag{2.2}$$
where $Z(\lambda) = E_m\big[\exp\big(\sum_{s=1}^{n} \lambda_s \varphi_s\big)\big]$ is known as the partition function and the Lagrange multipliers $\lambda_s$ solve the system:
$$\frac{\partial \ln Z}{\partial \lambda_s} = A_s \tag{2.3}$$
Distributions of the form (2.2) belong to the exponential family of distributions [10]. The minimized value of the discrimination information is:
$$I[p^*:m] = -\ln Z(\lambda) + \sum_{s=1}^{n} \lambda_s A_s \tag{2.4}$$
In the MinxEnt distribution (2.2), the information concerning the constraints $A_s$ is encoded in the set of Lagrange multipliers $\lambda_s$ via equation (2.3), given the reference pdf $m$; conversely, if the $\lambda_s$ in (2.2) are known, then the MinxEnt distribution is also known and hence the mean values $A_s$ can be computed. In other words, the MinxEnt principle implies the equivalence of the description of the distribution by the set of constraints and by the set of Lagrange multipliers. Below (see Section 6) we show that for replica dynamics this equivalence is universal and does not depend on the MinxEnt principle.
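As a numerical illustration of (2.2) and (2.3), the following minimal sketch finds the MinxEnt distribution on a finite support with a single mean constraint. The support, the reference distribution `m`, the constrained variable `phi`, the target mean `A`, and the helper names `minxent` and `kl` are all assumptions chosen for illustration; they do not come from the paper. The multiplier is found by bisection, which relies on the tilted mean being increasing in the multiplier.

```python
import math

def minxent(m, phi, A, lo=-50.0, hi=50.0, tol=1e-12):
    """MinxEnt distribution on a finite support with one mean constraint.

    m   : reference probabilities (positive, summing to 1)
    phi : values of the constrained variable on the support
    A   : prescribed mean, strictly between min(phi) and max(phi)
    Returns (p_star, lam, Z) with p_star[k] = m[k] * exp(lam * phi[k]) / Z.
    """
    def tilt(lam):
        w = [mk * math.exp(lam * fk) for mk, fk in zip(m, phi)]
        Z = sum(w)
        mean = sum(wk * fk for wk, fk in zip(w, phi)) / Z
        return w, Z, mean

    # d ln Z / d lam equals the tilted mean, which increases with lam,
    # so the multiplier can be found by bisection.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tilt(mid)[2] < A:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    w, Z, _ = tilt(lam)
    return [wk / Z for wk in w], lam, Z

def kl(p, m):
    """Discrimination information I[p:m] for discrete distributions."""
    return sum(pk * math.log(pk / mk) for pk, mk in zip(p, m) if pk > 0)

m = [0.25, 0.25, 0.25, 0.25]      # uniform reference distribution (assumed)
phi = [0.0, 1.0, 2.0, 3.0]        # constrained variable (assumed)
A = 2.0                           # prescribed mean of phi (assumed)
p_star, lam, Z = minxent(m, phi, A)

print("lambda =", lam)
print("I[p*:m] =", kl(p_star, m), " -ln Z + lam*A =", -math.log(Z) + lam * A)
```

The two printed values of the discrimination information should agree, and any other distribution with the same mean of `phi` should have a KL-divergence from `m` that is at least as large.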

3. Selection Systems with Self-regulation and the Reduction Theorem

The selection system is a mathematical model of an inhomogeneous population, in which every individual is characterized by a vector-parameter $a = (a_1, \ldots, a_n)$ that takes values in a set $A$. The parameter $a$ specifies an individual’s inherited invariant properties and does not change with time; the set of all individuals with a given value of the vector-parameter $a$ in the population is called an $a$-clone. Let $l(t,a)$ be the density of the population at the moment $t$ over the parameter $a$, so that the total population size is:
$$N(t) = \int_A l(t,a)\,da \tag{3.1}$$
and the current population distribution is $P(t,a) = l(t,a)/N(t)$. Denote by $F(t,a)$ the per capita reproduction rate (Malthusian fitness) at the moment $t$. We suppose that the reproduction rate of every $a$-clone does not depend on other clones but can depend on $a$ and on some general population characteristics such as the total population size. These quantities evolve with time, providing some self-regulation of the system dynamics. For example, the reproduction rate for the logistic model is proportional to $(1 - N(t)/B)$, where $B$ is the upper bound of the population size; the reproduction rate of the Ricker model is proportional to $\exp(-\beta N(t))$.
An abstract selection system (or, in the author’s terms, a system with inheritance) was studied in [11] (see also references to earlier work therein) where a general selection theorem was proven.
In [7] a class of selection systems with self-regulation was studied and a reduction theorem was proved; the theorem gives an effective algorithm for investigation of the selection systems and corresponding replicator equations. Below we formulate a more general version of this theorem.
It was assumed in [7] that the individual reproduction rate can depend on two types of integral characteristics of the system (“regulators”): the extensive characteristics, which depend on the total size of the system (as in most population models), and the intensive characteristics, which do not depend on the total size but only on the population frequencies (as in most genetic models). The intensive characteristics are of the form:
$$H_k(t) = \int_A h_k(a) P(t,a)\,da = E_t[h_k] \tag{3.2}$$
and the extensive characteristics:
$$G_i(t) = \int_A g_i(a)\, l(t,a)\,da = N(t)\, E_t[g_i] \tag{3.3}$$
where $g_i$, $h_k$ are appropriate weight functions.
Overall, we specify for each model a finite set of the extensive regulators $G(t) = (G_1(t), \ldots, G_m(t))$, which contains the total population size; we assume that the individual reproduction rate can depend on this set of regulators at each time moment.
If we assume overlapping generations and smoothness of $l(t,a)$ in $t$ for each $a \in A$, then the population dynamics can be described by the following master model:
$$\frac{dl(t,a)}{dt} = l(t,a)\, F(t,a) \tag{3.4}$$
$$F(t,a) = \sum_{i=1}^{n} u_i(t,G)\, \varphi_i(a) \tag{3.5}$$
where $u_i(t,G)$ are continuous functions.
The initial distribution $P(0,a)$ and the initial population size $N(0)$ are assumed to be given. The current system distribution $P(t,a) = l(t,a)/N(t)$ solves the replicator equation:
$$\frac{dP(t,a)}{dt} = P(t,a)\,\big(F(t,a) - E_t[F]\big) \tag{3.6}$$
The mathematical form of the fitness (3.5) suggests (from a biological point of view) that the individual fitness depends on a given finite set of traits.
The function $\varphi_i(a)$ in (3.5) may describe the quantitative contribution of a particular $i$-th trait to the total fitness; then $u_i(t,G)$ describes the relative importance (weight) of the trait contribution, which at every time moment can depend on the state of the environment, the population size, and the mean, variance, covariance, and other statistical characteristics of the traits. We emphasize that the model accounts for the interactions between the traits only with the help of a given set of regulators. For example, if one needs to account for all moments up to the 2nd order, the following set of regulators should be used:
$$N(t) = \int_A l(t,a)\,da, \quad G_i(t) = \int_A \varphi_i(a)\, l(t,a)\,da, \quad G_{ik}(t) = \int_A \varphi_i(a)\varphi_k(a)\, l(t,a)\,da \tag{3.7}$$
Then the covariance between the traits $\varphi_i$, $\varphi_k$ at the moment $t$ is a function of these regulators:
$$u_i(t,G) = \mathrm{Cov}[\varphi_i,\varphi_k](t) = \frac{G_{ik}(t)}{N(t)} - \frac{G_i(t)\,G_k(t)}{N^2(t)} \tag{3.8}$$
Clearly, in this way one can account for the dependence of the fitness on mixed moments of any order; however, the approach described below is truly useful only when one considers just a few regulators.
In model (3.5)-(3.6) the regulators, and hence the reproduction rate $F(t,a)$, are not given explicitly but should be computed using the current pdf $P(t,a)$ at each time moment, so in the general case the model is a nonlinear equation of infinite dimensionality. Nevertheless, it can be reduced to a Cauchy problem for the escort system of ODE. For a less general version of the model (which allows dependence of the functions $u_i$ on a single regulator only) the reduction theorem was proven in [7]. Below we formulate a more general version of this theorem, which gives an effective algorithm for investigation of the selection systems and the corresponding replicator equations.
Introduce the generating functional:
$$\Phi(r;\lambda) = \int_A r(a) \exp\!\Big(\sum_{i=1}^{n} \lambda_i \varphi_i(a)\Big) P(0,a)\,da \tag{3.9}$$
where $\lambda = (\lambda_1, \ldots, \lambda_n)$, and $r(a)$ is a measurable function on $A$.
Define auxiliary variables as a solution to the escort system of differential equations:
$$\frac{dq_i}{dt} = u_i(t, G^*(t)), \quad q_i(0) = 0, \quad i = 1, \ldots, n \tag{3.10}$$
where $G^*(t) = (G_1^*(t), \ldots, G_m^*(t))$, and:
$$G_k^*(t) = N(0)\,\Phi(g_k; q(t)), \quad q(t) = (q_1(t), \ldots, q_n(t)) \tag{3.11}$$
Denote:
$$K_t(a) = \exp\!\Big(\sum_{i=1}^{n} q_i(t)\, \varphi_i(a)\Big) \tag{3.12}$$
Theorem 1.
Let $0 < T \le \infty$ be the maximal value of $t$ such that Cauchy problem (3.10) has a unique global solution $\{q(t)\}$ at $t \in [0,T)$. Then the functions:
$$l(t,a) = l(0,a)\, K_t(a), \qquad G_k(t) = G_k^*(t) = N(0)\,\Phi(g_k; q(t)) \tag{3.13}$$
satisfy system (3.4)-(3.5) at $t \in [0,T)$.
In particular, the total size of the population is:
$$N(t) = N(0)\,\Phi(1; q(t)) = N(0)\, E_0[K_t] \tag{3.14}$$
As a corollary, we obtain the central formula for the current distribution of the system:
$$P(t,a) = P(0,a)\, K_t(a) / E_0[K_t] \tag{3.15}$$
In particular, $E_t[f] = E_0[f K_t] / E_0[K_t]$ for any measurable function $f$.
Equality (3.12) shows that pdf (3.15) belongs to the exponential family of distributions [10]. In more “physical” terms, the pdf is the time-dependent Boltzmann distribution of the form $P(t,a) = P(0,a)\exp(B)/Z$ with the Boltzmann factor $\exp(B)$, where:
$$B(q(t); a) = \sum_{i=1}^{n} q_i(t)\, \varphi_i(a) \tag{3.16}$$
and the partition function:
$$Z(q(t)) = E_0[\exp(B(q(t); a))] = E_0[K_t] \tag{3.17}$$
Note that in our case the partition function is completely known, given the initial pdf $P(0,a)$ and the solution to the Cauchy problem (3.10). Within the framework of selection system (3.5)-(3.6) the partition function has a clear biological meaning: $Z(q(t)) = N(t)/N(0)$ is proportional to the current population size, which follows from formula (3.14).
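The reduction in Theorem 1 can be checked numerically on a toy case. The sketch below assumes a population of three clones with a single trait $\varphi(a) = a$ and a logistic weight $u(t,G) = 1 - N(t)/B$; the trait values, initial densities, and the value of `B` are hypothetical. It integrates the full clone-by-clone system (3.4)-(3.5) and, separately, the one-dimensional escort equation (3.10), and then compares the densities reconstructed through (3.13) with the directly integrated ones.

```python
import math

a = [0.5, 1.0, 1.5]      # trait values of the clones (hypothetical)
l0 = [1.0, 2.0, 1.0]     # initial clone densities (hypothetical)
B = 100.0                # upper bound of the population size (hypothetical)
N0 = sum(l0)
P0 = [x / N0 for x in l0]

def rk4(rhs, y, t_end, steps):
    """Classical Runge-Kutta integration of the autonomous system y' = rhs(y)."""
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs([yi + 0.5 * h * k for yi, k in zip(y, k1)])
        k3 = rhs([yi + 0.5 * h * k for yi, k in zip(y, k2)])
        k4 = rhs([yi + h * k for yi, k in zip(y, k3)])
        y = [yi + h * (c1 + 2 * c2 + 2 * c3 + c4) / 6
             for yi, c1, c2, c3, c4 in zip(y, k1, k2, k3, k4)]
    return y

def full_rhs(l):
    # Master model (3.4)-(3.5): one equation per clone.
    N = sum(l)
    return [li * ai * (1.0 - N / B) for li, ai in zip(l, a)]

def escort_rhs(q):
    # Escort system (3.10) with N*(t) = N(0) * E_0[exp(q a)], see (3.14).
    N = N0 * sum(p * math.exp(q[0] * ai) for p, ai in zip(P0, a))
    return [1.0 - N / B]

t_end, steps = 5.0, 2000
l_direct = rk4(full_rhs, l0, t_end, steps)
q = rk4(escort_rhs, [0.0], t_end, steps)[0]
l_escort = [li * math.exp(q * ai) for li, ai in zip(l0, a)]
max_gap = max(abs(x - y) for x, y in zip(l_direct, l_escort))
print("q(t_end) =", q, " max |direct - escort| =", max_gap)
```

With only three clones the saving is nil, but the escort equation stays one-dimensional however many clones (or a continuum of them) the initial distribution contains, which is the point of the theorem.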

4. Dynamical MinxEnt Principle

Comparing the distribution (3.15) with the MinxEnt distribution (2.2), one can conclude that under the time-dependent constraints $A_i = E_t[\varphi_i]$ the solution to replicator equation (3.6) minimizes the KL-divergence of the initial and current distributions not only at the equilibrium but also at each point of the system trajectory. These constraints in turn can be computed explicitly at every instant from the system dynamics. The following theorems collect the corresponding mathematical assertions.
Theorem 2.
1) Let $P_t = P(t,a)$ be the solution (3.15) of replicator equation (3.6). Then at every moment $t$ the distribution $P_t$ provides the minimum of $I[P_t:P_0]$ over all probability distributions compatible with the constraints $A_i(t) = E_t[\varphi_i]$, $i = 1, \ldots, n$;
2) The values of the constraints evolve according to escort system (3.10) and at each time moment can be computed using the formula:
$$A_i(t) = E_0[\varphi_i K_t] / E_0[K_t] = \Phi(\varphi_i; q(t)) / \Phi(1; q(t)) \tag{4.1}$$
3) The dynamics of the constraints are determined by the covariance equation:
$$\frac{dA_i(t)}{dt} = \mathrm{Cov}_t[F, \varphi_i] \tag{4.2}$$
Theorem 3.
The discrimination information $I[P_t:P_0]$ solves the covariance equation:
$$\frac{dI[P_t:P_0]}{dt} = \mathrm{Cov}_t[B, F] \tag{4.3}$$
and can be computed using the following formulas:
$$I[P_t:P_0] = \frac{E_0[B K_t]}{E_0[K_t]} - \ln E_0[K_t], \qquad I[P_t:P_0] = E_t[B] - \ln\big(N(t)/N(0)\big) \tag{4.4}$$
where $F = F(t,a)$, $B = B(q(t); a)$.
In the following section we demonstrate how Theorems 1-3 can be applied to some classical population models. A similar theory can be developed for replicator equations with discrete time and corresponding selection systems (maps) [13].
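The statements of Theorems 2 and 3 are easy to test in the simplest setting. The sketch below assumes a Malthusian population with one trait $\varphi(a) = a$ and constant weight $u = 1$, so that $q(t) = t$, $F = a$ and $B = ta$; the three trait values and the initial distribution are hypothetical. It compares the discrimination information computed from its definition with the formula $E_t[B] - \ln E_0[K_t]$, and checks the two covariance equations by central finite differences.

```python
import math

a = [0.0, 1.0, 3.0]     # trait values (hypothetical)
P0 = [0.5, 0.3, 0.2]    # initial distribution (hypothetical)

def Z(t):
    """Partition function E_0[K_t] with K_t(a) = exp(t * a)."""
    return sum(p * math.exp(t * x) for p, x in zip(P0, a))

def P(t):
    """Current distribution (3.15)."""
    z = Z(t)
    return [p * math.exp(t * x) / z for p, x in zip(P0, a)]

def mean(t):
    return sum(p * x for p, x in zip(P(t), a))

def var(t):
    mu = mean(t)
    return sum(p * (x - mu) ** 2 for p, x in zip(P(t), a))

def info(t):
    """I[P_t : P_0] computed directly from the definition."""
    return sum(p * math.log(p / p0) for p, p0 in zip(P(t), P0))

t, h = 0.7, 1e-5
dA_dt = (mean(t + h) - mean(t - h)) / (2 * h)   # finite-difference dA/dt
dI_dt = (info(t + h) - info(t - h)) / (2 * h)   # finite-difference dI/dt
print("I direct:", info(t), " E_t[B] - ln Z:", t * mean(t) - math.log(Z(t)))
print("dA/dt:", dA_dt, " Cov_t[F, phi]:", var(t))
print("dI/dt:", dI_dt, " Cov_t[B, F]:", t * var(t))
```

In this special case $\mathrm{Cov}_t[F,\varphi]$ reduces to the current variance of the trait and $\mathrm{Cov}_t[B,F]$ to $t$ times that variance, which is what the last two printed lines compare.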

5. The Ricker and Logistic Inhomogeneous Models and the Dynamical MinxEnt Principle

5.1. Inhomogeneous Logistic Model

Many particular models of selection systems have the form of an inhomogeneous logistic equation:
$$\frac{dl(t;\beta,\mu)}{dt} = l(t;\beta,\mu)\,\big[\beta f_1(N(t)) - \mu f_2(N(t))\big] \tag{5.1}$$
The general solution to this equation with distributed parameters was obtained in [7], Example 5. Let $M(\lambda_1,\lambda_2) = E_0[\exp(\lambda_1\beta + \lambda_2\mu)]$ be the mgf of the joint initial distribution of $\beta$ and $\mu$. Then $l(t;\beta,\mu) = l(0;\beta,\mu)\exp(q_1(t)\beta + q_2(t)\mu)$, where $q_1$, $q_2$ solve the escort system:
$$\begin{aligned} \frac{dq_1}{dt} &= f_1\big(N(0)\, M(q_1,q_2)\big), \\ \frac{dq_2}{dt} &= -f_2\big(N(0)\, M(q_1,q_2)\big), \\ q_1(0) &= 0, \quad q_2(0) = 0 \end{aligned} \tag{5.2}$$
The total population size is $N(t) = N(0)\, M(q_1(t), q_2(t))$ and the current distribution is:
$$P(t;\beta,\mu) = P(0;\beta,\mu)\exp(q_1(t)\beta + q_2(t)\mu) / M(q_1(t), q_2(t)) \tag{5.3}$$
Now we are able to apply the results of Section 4. The discrimination information at moment $t$ is:
$$I[P_t:P_0] = \frac{E_0\big[(q_1(t)\beta + q_2(t)\mu)\, e^{q_1(t)\beta + q_2(t)\mu}\big]}{E_0\big[e^{q_1(t)\beta + q_2(t)\mu}\big]} - \ln E_0\big[e^{q_1(t)\beta + q_2(t)\mu}\big] \tag{5.4}$$
The distribution (5.3) provides the minimum of discrimination information, which is equal to (5.4), at each time moment among all distributions subject to the given mean values of the birth and death rates at this moment:
$$E_t[\beta] = \frac{E_0[\beta \exp(q_1(t)\beta + q_2(t)\mu)]}{E_0[\exp(q_1(t)\beta + q_2(t)\mu)]}, \qquad E_t[\mu] = \frac{E_0[\mu \exp(q_1(t)\beta + q_2(t)\mu)]}{E_0[\exp(q_1(t)\beta + q_2(t)\mu)]} \tag{5.5}$$
A particular case of equation (5.1):
$$\frac{dl(t;\beta,\mu)}{dt} = l(t;\beta,\mu)\,\big(\beta - \mu N(t)\big) \tag{5.6}$$
was studied in [12] for independent uniformly distributed parameters $\beta$ and $\mu$, $\beta \in [a_1,b_1]$, $\mu \in [a_2,b_2]$ and $P(0;\beta,\mu) = 1/((b_1-a_1)(b_2-a_2))$. It was proven in [7] that then $P(t;\beta,\mu) = P_1(t;\beta)\,P_2(t;\mu)$, where:
$$\begin{aligned} P_1(t;\beta) &= \frac{\exp(t\beta)}{(b_1-a_1)\,E_0[\exp(t\beta)]} = \frac{t\exp(t\beta)}{\exp(tb_1) - \exp(ta_1)}, \\ P_2(t;\mu) &= \frac{\exp(q_2(t)\mu)}{(b_2-a_2)\,E_0[\exp(q_2(t)\mu)]} = \frac{q_2(t)\exp(q_2(t)\mu)}{\exp(q_2(t)b_2) - \exp(q_2(t)a_2)} \end{aligned} \tag{5.7}$$
The current mean values of the birth and death rates, together with their limits as $t \to \infty$, are:
$$\begin{aligned} E_t[\beta] &= b_1 - \frac{b_1-a_1}{1 - \exp((b_1-a_1)t)} - \frac{1}{t} \to b_1, \\ E_t[\mu] &= b_2 - \frac{b_2-a_2}{1 - \exp((b_2-a_2)q_2(t))} - \frac{1}{q_2(t)} \to a_2 \end{aligned} \tag{5.8}$$
Distribution (5.7) provides the minimum of discrimination information at each time moment among all distributions concentrated in the rectangle $[a_1,b_1] \times [a_2,b_2]$ subject to the mean values of the birth and death rates (5.8).
In the general case, the solution to (5.6) is given by (5.3) with $q_1(t) = t$ and $q_2(t)$ that solves the equation $dq_2/dt = -N(0)\, M_0(t, q_2)$, $q_2(0) = 0$. The asymptotic behavior of the solution to equation (5.6) varies dramatically depending on the initial distribution. Let the positive parameters $\beta$, $\mu$ be independent again, and let the initial distributions of both parameters be exponential, $P_i(x) = s_i \exp(-x s_i)$. Let $s_1 = T$, $s_2 = 1$, and $N(0) = 1$ for simplicity. Then $q_1(t) = t$, $q_2(t) = 1 - \sqrt{1 - 2T\ln(1 - t/T)}$. The current system distribution is $P(t;\beta,\mu) = P_1(t;\beta)\,P_2(t;\mu)$, where both marginal distributions are again exponential:
$$\begin{aligned} P_1(t;\beta) &= (T-t)\exp(-\beta(T-t)), \\ P_2(t;\mu) &= \sqrt{1 - 2T\ln(1-t/T)}\,\exp\!\Big(-\mu\sqrt{1 - 2T\ln(1-t/T)}\Big) \end{aligned} \tag{5.9}$$
The corresponding mean values are:
$$E_t[\beta] = \frac{1}{T - q_1(t)} = \frac{1}{T-t}, \qquad E_t[\mu] = \frac{1}{1 - q_2(t)} = \frac{1}{\sqrt{1 - 2T\ln(1-t/T)}} \tag{5.10}$$
Hence, the solution to equation (5.6) with the initial exponential distribution exists only up to the moment $t = T$. The total population size $N(t) = N(0) / \big\{(1 - t/T)\sqrt{1 - 2T\ln(1-t/T)}\big\}$ tends to infinity as $t \to T$, and the population vanishes in any finite interval of values of both parameters, $\beta$ and $\mu$.
The discrimination information for $P_t = P_1(t;\beta)\,P_2(t;\mu)$ is:
$$I[P_t:P_0] = \frac{t}{T-t} + \frac{1}{\sqrt{1 - 2T\ln(1-t/T)}} - 1 + \ln(1 - t/T) + \ln\sqrt{1 - 2T\ln(1-t/T)} \tag{5.11}$$
The exponential distribution $P(t;\beta,\mu) = P_1(t;\beta)\,P_2(t;\mu)$ provides the minimum of the discrimination information over all the distributions subject to the mean values of the birth and death rates (5.10), and this minimum is equal to (5.11).
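The closed-form expressions of this example can be cross-checked numerically. The sketch below assumes $T = 2$ (an arbitrary choice) and the setting of the text, $s_1 = T$, $s_2 = 1$, $N(0) = 1$, so that the mgf of the initial distribution is $T/((T-\lambda_1)(1-\lambda_2))$. It integrates the escort equation for $q_2$ with a Runge-Kutta scheme and compares the result with the closed form; it also compares (5.11) with the sum of the KL-divergences between the current and initial exponential marginals, using the standard formula $\ln(r_1/r_0) + r_0/r_1 - 1$ for exponential densities with rates $r_1$ and $r_0$. The function names are illustrative only.

```python
import math

T = 2.0   # s1 = T (assumed value); s2 = 1 and N(0) = 1 as in the text

def q2_closed(t):
    return 1.0 - math.sqrt(1.0 - 2.0 * T * math.log(1.0 - t / T))

def escort_rhs(t, q2):
    # dq2/dt = -N(0) M(t, q2), with M(l1, l2) = T / ((T - l1) * (1 - l2))
    return -T / ((T - t) * (1.0 - q2))

def q2_numeric(t_end, steps=4000):
    """Runge-Kutta integration of the escort equation from 0 to t_end < T."""
    h, t, q = t_end / steps, 0.0, 0.0
    for _ in range(steps):
        k1 = escort_rhs(t, q)
        k2 = escort_rhs(t + 0.5 * h, q + 0.5 * h * k1)
        k3 = escort_rhs(t + 0.5 * h, q + 0.5 * h * k2)
        k4 = escort_rhs(t + h, q + h * k3)
        q += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return q

def info_closed(t):
    """Discrimination information (5.11)."""
    v = math.sqrt(1.0 - 2.0 * T * math.log(1.0 - t / T))
    return t / (T - t) + 1.0 / v - 1.0 + math.log(1.0 - t / T) + math.log(v)

def kl_exponential(r1, r0):
    """KL-divergence of Exp(rate r1) from Exp(rate r0)."""
    return math.log(r1 / r0) + r0 / r1 - 1.0

t = 1.0
v = 1.0 - q2_closed(t)          # current rate of the distribution of mu
print("q2 numeric:", q2_numeric(t), " q2 closed form:", q2_closed(t))
print("I (5.11):", info_closed(t),
      " sum of marginal KLs:", kl_exponential(T - t, T) + kl_exponential(v, 1.0))
```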

5.2. Inhomogeneous Ricker Model

Let us consider the inhomogeneous version of the well-known Ricker equation:
$$\frac{dl(t;\beta,\mu)}{dt} = l(t;\beta,\mu)\,\big[\beta\exp(-cN(t)) - \mu\big] \tag{5.12}$$
The general solution to this equation with distributed parameters was obtained in [7], example 6. Let M 0 ( λ 1 , λ 2 ) be the mgf of the joint initial distribution of β and μ . Then:
$$l(t;\beta,\mu) = l(0;\beta,\mu)\exp(q(t)\beta - t\mu) \tag{5.13}$$
where the auxiliary variable $q(t)$ solves the Cauchy problem:
$$\frac{dq}{dt} = \exp\big(-cN(0)\,M_0(q,-t)\big), \quad q(0) = 0 \tag{5.14}$$
The total population size is $N(t) = N(0)\,M_0(q(t),-t)$, and the system distribution is:
$$P(t;\beta,\mu) = P(0;\beta,\mu)\exp(q(t)\beta - t\mu) / M_0(q(t),-t) \tag{5.15}$$
Applying the results of Section 4 we can compute the discrimination information at moment $t$:
$$I[P_t:P_0] = \frac{E_0\big[(q(t)\beta - t\mu)\exp(q(t)\beta - t\mu)\big]}{M_0(q(t),-t)} - \ln M_0(q(t),-t) \tag{5.16}$$
The current mean values of the parameters are given by:
$$E_t[\beta] = \frac{E_0[\beta\exp(q(t)\beta - t\mu)]}{E_0[\exp(q(t)\beta - t\mu)]}, \qquad E_t[\mu] = \frac{E_0[\mu\exp(q(t)\beta - t\mu)]}{E_0[\exp(q(t)\beta - t\mu)]} \tag{5.17}$$
The pdf (5.15) provides the minimum of the discrimination information at every time moment subject to the constraints (5.17), and this minimum is equal to (5.16).
For example, let the parameters $\beta$ and $\mu$ be independent and exponentially distributed in $[0,\infty)$ with rate parameters $s_1$ and $s_2$ (that is, with means $1/s_1$ and $1/s_2$) at the initial instant. Then $M_0(q,-t) = s_1 s_2 / ((s_1 - q)(s_2 + t))$, and:
$$\frac{dq}{dt} = \exp\!\Big(-\frac{cN(0)\,s_1 s_2}{(s_1 - q)(s_2 + t)}\Big), \quad q(0) = 0 \tag{5.18}$$
This equation has a stable state $q = s_1$. As $t \to \infty$, $q(t) \to s_1$, the total population size tends to infinity, and the population density concentrates at the value $\mu = 0$ of the parameter $\mu$ and vanishes in any finite interval of values of the parameter $\beta$. The distribution:
$$P(t;\beta,\mu) = P(0;\beta,\mu)\exp(q(t)\beta - t\mu)\,\frac{(s_1 - q(t))(s_2 + t)}{s_1 s_2} \tag{5.19}$$
provides the minimum of the discrimination information subject to the constraints:
$$E_t[\beta] = \frac{1}{s_1 - q(t)}, \qquad E_t[\mu] = \frac{1}{s_2 + t} \tag{5.20}$$
and this minimum, computed from (5.16) with the mean values (5.20), is equal to:
$$I[P_t:P_0] = \frac{q(t)}{s_1 - q(t)} - \frac{t}{s_2 + t} + \ln(s_2 + t) + \ln(s_1 - q(t)) - \ln(s_1 s_2) \tag{5.21}$$
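A short numerical sketch of this Ricker example follows. The values `c = 1`, `N0 = 1`, `s1 = s2 = 1` and the final time are hypothetical, and the exponential initial distributions are parametrized by their rates, so that their joint mgf is $s_1 s_2/((s_1-\lambda_1)(s_2-\lambda_2))$. The code integrates the Cauchy problem for $q(t)$ and evaluates the discrimination information as $E_t[B] - \ln M_0(q,-t)$ with $E_t[\beta] = 1/(s_1-q)$ and $E_t[\mu] = 1/(s_2+t)$, comparing it with the sum of the KL-divergences of the two exponential marginals.

```python
import math

c, N0 = 1.0, 1.0        # hypothetical parameter values
s1, s2 = 1.0, 1.0       # rates of the initial exponential distributions (hypothetical)

def rhs(t, q):
    # dq/dt = exp(-c N(t)), with N(t) = N0 * s1 * s2 / ((s1 - q) * (s2 + t))
    return math.exp(-c * N0 * s1 * s2 / ((s1 - q) * (s2 + t)))

def solve(t_end, steps=4000):
    """Runge-Kutta integration of the escort equation from 0 to t_end."""
    h, t, q = t_end / steps, 0.0, 0.0
    for _ in range(steps):
        k1 = rhs(t, q)
        k2 = rhs(t + 0.5 * h, q + 0.5 * h * k1)
        k3 = rhs(t + 0.5 * h, q + 0.5 * h * k2)
        k4 = rhs(t + h, q + h * k3)
        q += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return q

def info(t, q):
    """I[P_t : P_0] = q E_t[beta] - t E_t[mu] - ln M_0(q, -t)."""
    return (q / (s1 - q) - t / (s2 + t)
            + math.log(s2 + t) + math.log(s1 - q) - math.log(s1 * s2))

def kl_exponential(r1, r0):
    """KL-divergence of Exp(rate r1) from Exp(rate r0)."""
    return math.log(r1 / r0) + r0 / r1 - 1.0

t_end = 5.0
q_end = solve(t_end)
I_end = info(t_end, q_end)
I_marginals = kl_exponential(s1 - q_end, s1) + kl_exponential(s2 + t_end, s2)
print("q(t_end) =", q_end, " I =", I_end, " sum of marginal KLs =", I_marginals)
```

The auxiliary variable stays below $s_1$ over the integration interval, in line with $q = s_1$ being the limiting state of the equation.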

6. “Conjugative” Approach to the Selection System Dynamics

In the previous section we presented solutions to the inhomogeneous logistic and Ricker models using the corresponding auxiliary variables and escort systems. These solutions minimize the information discrimination under certain constraints and hence can be found through solving a constrained optimization problem. Similar results can be obtained for other models of inhomogeneous populations. Let us clarify the interconnections between these two approaches.
Let us first come back to logistic model (5.1). Formally, the values $q_1(t)$, $q_2(t)$ at moment $t$ can be found independently of system (5.2) by minimization of the discrimination information (5.4) subject to the mean values $E_t[\beta]$, $E_t[\mu]$. We should emphasize that these constraints cannot be assigned arbitrarily but are completely defined by the system dynamics at the given initial distribution. Hence, if $E_t[\beta]$, $E_t[\mu]$ can be estimated independently, e.g., by data processing, then they must coincide with (5.5).
It implies that if a distribution $P_t$ is such that $I[P_t:P_0]$ is equal to (5.4) and the mean values of the traits $\beta$, $\mu$ over $P_t$ are equal to the estimated ones, then $P_t$ coincides with (5.3) and hence the values $q_1(t)$, $q_2(t)$ are equal to the solution to system (5.2). Practically this means that we can solve equations (5.5) for $q_1(t)$, $q_2(t)$, and this solution must coincide with the solution to system (5.2). To be more specific, let us consider the simple model (5.6) with exponentially distributed birth and death rates at the initial moment. For this model:
$$q_1(t) = t = T - \frac{1}{E_t[\beta]}, \qquad q_2(t) = 1 - \sqrt{1 - 2T\ln(1 - t/T)} = 1 - \frac{1}{E_t[\mu]} \tag{6.1}$$
We see that the mean values $E_t[\beta]$, $E_t[\mu]$, if known, completely determine the auxiliary variables $q_1(t)$, $q_2(t)$ and hence the solution to selection system (5.6) and the corresponding replicator equation, given the initial distribution.
These arguments can be made rigorous for any selection system (3.5) and replicator equation (3.6). The following theorem states that the information discrimination $I[P_t:P_0]$ is the Legendre transform of the logarithm of partition function (3.17); the auxiliary variables $q_i(t)$ and the constraints $A_i(t) = E_t[\varphi_i]$ are conjugate under this transform (see the Mathematical Appendix for some definitions and the proof).
Theorem 4.
i) The information discrimination $I[P_t:P_0]$ as a function of the constraints $A_i(t)$ is the Legendre transform of $W = \ln E_0[K_t]$ as a function of the variables $q_i(t)$, and conversely;
ii) the variables $q(t)$ are conjugate to the constraints $A(t)$;
iii) for given constraints $A(t)$, the values of the variables $q(t)$ and $I[P_t:P_0]$ can be found as a solution to the optimization problem:
$$I[P_t:P_0] = \sup_{\alpha}\Big\{\sum_{i=1}^{n} \alpha_i A_i(t) - W(\alpha)\Big\} \tag{6.2}$$
and $q_i(t) = \alpha_i^*$, where $\alpha^*$ is the value at which the right-hand side of (6.2) reaches its supremum.
It follows from the theorem that the dynamics of the selection system and the corresponding replicator equation can be equally described either in terms of the auxiliary variables $q_i(t)$ or in terms of the constraints, i.e., the current mean values of the traits, $E_t[\varphi_i]$, and this equivalence does not depend on the MinxEnt principle. Technically, the former approach is more appropriate, as the auxiliary variables can be found from the escort system. The latter approach is of principal importance, because it shows that in order to completely determine the dynamics of system (3.5) and its distribution at any time moment it is enough to know only the mean values of the traits at this moment together with the initial distribution.
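The conjugacy in Theorem 4 can be illustrated on the birth-rate marginal of model (5.6). The sketch below assumes an exponential initial distribution of $\beta$ with rate $T = 2$ and takes the current mean $A = E_t[\beta] = 1/(T-t)$ at $t = 0.5$; both numbers are arbitrary. For this distribution $W(\alpha) = -\ln(1 - \alpha/T)$ for $\alpha < T$. The supremum in (6.2) is located by a ternary search, which is adequate here because the objective is concave; the search bounds are an assumption of the sketch.

```python
import math

T = 2.0             # rate of the initial exponential distribution of beta (assumed)
t = 0.5             # current time; in model (5.6) q1(t) = t
A = 1.0 / (T - t)   # current mean E_t[beta], see (5.10)

def W(alpha):
    # W(alpha) = ln E_0[exp(alpha * beta)] = -ln(1 - alpha / T), alpha < T
    return -math.log(1.0 - alpha / T)

def objective(alpha):
    # Expression under the supremum in (6.2) for a single trait.
    return alpha * A - W(alpha)

# The objective is concave in alpha, so ternary search finds the supremum.
lo, hi = -20.0, T - 1e-9
for _ in range(200):
    m1 = lo + (hi - lo) / 3.0
    m2 = hi - (hi - lo) / 3.0
    if objective(m1) < objective(m2):
        lo = m1
    else:
        hi = m2
alpha_star = 0.5 * (lo + hi)
I_star = objective(alpha_star)

print("alpha* =", alpha_star, " expected q1(t) =", t)
print("sup value =", I_star, " t/(T-t) + ln(1 - t/T) =", t / (T - t) + math.log(1.0 - t / T))
```

The maximizer recovers the auxiliary variable $q_1(t) = t$ from the mean value alone, and the supremum reproduces the $\beta$-part of the discrimination information (5.11).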

7. Inhomogeneous Models of Communities

7.1. Reduction Theorem and Dynamical MinxEnt

Consider the model of a community consisting of $r$ interacting populations. We suppose again that every individual is characterized by their own value of the vector-parameter $a$. Let $l_j(t,a)$ be the density of the $j$-th population at moment $t$. In this section we consider the model of an inhomogeneous community where the reproduction rates can depend on current characteristics of every population in the community, which together compose a “regulator”. Formally, we consider the set of m regulators, each of which is an $r$-dimensional vector-function $G_i(t) = (G_{i1}(t), \ldots, G_{ir}(t))$, $i = 1, \ldots, n$, where:
$$G_{ij}(t) = \int_A g_i(a)\, l_j(t,a)\,da \tag{7.1}$$
Each regulator corresponds to an appropriate weight function $g_i$. A finite set of the regulators corresponds to each specific model; we denote this set as $G(t) = (G_1(t), \ldots, G_n(t))$.
The current population sizes $N_j(t) = \int_A l_j(t,a)\,da$ compose a regulator of special importance, $N(t) = (N_1(t), \ldots, N_r(t))$. We assume that $N(t)$ is included in the set of the model’s regulators. The distribution of the $j$-th population in the community is by definition $P_j(t,a) = l_j(t,a)/N_j(t)$.
The model of an inhomogeneous community considered here is of the form:
$$\frac{dl_j(t,a)}{dt} = l_j(t,a)\, F_j(t,a) \tag{7.2}$$
$$F_j(t,a) = \sum_{i=1}^{n} u_{ij}(t,G)\, \varphi_i(a) \tag{7.3}$$
where the functions $u_{ij}$ can be specific for each trait and each population. The initial pdfs $P_j(0,a)$ and the initial population sizes $N_j(0)$ are assumed to be given. The current pdf $P_j(t,a)$ solves the replicator equation:
$$\frac{dP_j(t,a)}{dt} = P_j(t,a)\,\big(F_j(t,a) - E_t^j[F_j]\big) \tag{7.4}$$
The theory for the inhomogeneous community model (7.1)-(7.4) is similar to the theory presented in Section 3 and Section 4 for inhomogeneous populations, up to more complex technical details. Theorem 5 (see the Mathematical Appendix for the complete formulation) reduces the complex model (7.1)-(7.3) to an escort system of non-autonomous ordinary differential equations of dimension $r \times n$ and gives the solution to replicator equation (7.4). Theorem 6 (Mathematical Appendix) establishes the Dynamical MinxEnt Principle for the inhomogeneous community model and gives explicit formulas for the discrimination information and the constraints at each time moment. Let us apply the general theory to some classical models of biological communities consisting of interacting inhomogeneous populations.

7.2. Inhomogeneous Prey-predator Volterra Model

The prey-predator Volterra model in its simplest form reads:
$$\frac{dx}{dt} = a_1 x - a_2 xy, \qquad \frac{dy}{dt} = -a_3 y + a_4 xy \tag{7.5}$$
where $x(t)$ and $y(t)$ denote prey and predator densities, $a_1$ is the reproduction rate of the prey population, $a_2$ is the per capita rate of consumption of prey by the predators, $a_3$ is the death rate of the predator, and $a_4/a_2$ is the fraction of prey biomass that is converted into predator biomass.
Let us consider the inhomogeneous version of this classical model, supposing that the parameters $a_1$, $a_2$, and $a_3$ are distributed and the ratio $a_4/a_2$ is fixed (and hence can be chosen equal to 1). We also suppose that the reproduction and death processes are specific for each subpopulation, while the consumption is driven by the interaction of the prey (predator) subpopulation with the entire predator (prey) population. Let $x(t;a_1,a_2)$ and $y(t;a_3)$ be the densities of the prey and predator populations over the parameters $a_1, a_2$ and $a_3$, respectively, and let:
$$X(t) = \int_A x(t;a_1,a_2)\,da_1\,da_2, \qquad Y(t) = \int_A y(t;a_3)\,da_3 \tag{7.6}$$
be the total sizes of the populations. The initial population sizes and the initial distributions $P_1(0;a_1,a_2)$, $P_2(0;a_3)$ are assumed to be given. The total rate of consumption is equal to:
$$Y(t)\int_A a_2\,x(t;a_1,a_2)\,da_1\,da_2 = Y(t)\,X(t)\,E_t^1[a_2] \tag{7.7}$$
Assuming the “proportional distribution” of prey among the predators, we can write the inhomogeneous version of the Volterra model in the form:
$$\frac{dx(t;a_1,a_2)}{dt} = x(t;a_1,a_2)\left(a_1 - a_2 Y(t)\right), \qquad \frac{dy(t;a_3)}{dt} = y(t;a_3)\left(G(t) - a_3\right) \tag{7.8}$$
where $G(t) = \int_A a_2\,x(t;a_1,a_2)\,da_1\,da_2 = X(t)\,E_t^1[a_2]$.
Theorem 5 gives a method for studying this model and a more general model (7.2); the principal step is the reduction of the model to the escort system of ODEs. It is instructive to deduce the escort system and the main results informally in order to clarify the main idea of the method as applied to community models.
It is natural to suppose that the parameter $a_3$ is stochastically independent of the parameters $a_1, a_2$. Let $M_1(\lambda_1,\lambda_2)$ be the mgf of the initial joint distribution of the parameters $a_1, a_2$, and let $M_2(\lambda_3)$ be the mgf of the initial distribution of the parameter $a_3$.
Introduce the auxiliary variables $q_1(t)$, $q_2(t)$ as the solution to the Cauchy problem:
$$\frac{dq_1}{dt} = Y(t), \qquad \frac{dq_2}{dt} = G(t) = X(t)\,E_t^1[a_2], \qquad q_1(0) = q_2(0) = 0 \tag{7.9}$$
Then system (7.8) can be written formally as:
$$\frac{dx(t;a_1,a_2)}{dt} = x(t;a_1,a_2)\left(a_1 - a_2\frac{dq_1}{dt}\right), \qquad \frac{dy(t;a_3)}{dt} = y(t;a_3)\left(\frac{dq_2}{dt} - a_3\right) \tag{7.10}$$
Its solution is:
$$x(t;a_1,a_2) = x(0;a_1,a_2)\exp\left(a_1 t - a_2 q_1(t)\right), \qquad y(t;a_3) = y(0;a_3)\exp\left(q_2(t) - a_3 t\right) \tag{7.11}$$
Now we can express all values of interest with the help of the mgfs of the initial distributions and the auxiliary variables:
$$\begin{aligned}
X(t) &= X(0)\int_A \exp\left(a_1 t - a_2 q_1(t)\right) P_1(0;a_1,a_2)\,da_1\,da_2 = X(0)\,M_1(t,-q_1(t)),\\
Y(t) &= Y(0)\int_A \exp\left(q_2(t) - a_3 t\right) P_2(0;a_3)\,da_3 = Y(0)\exp(q_2(t))\,M_2(-t)
\end{aligned} \tag{7.12}$$
$$\begin{aligned}
P_1(t;a_1,a_2) &= \frac{x(t;a_1,a_2)}{X(t)} = \frac{\exp\left(a_1 t - a_2 q_1(t)\right)}{M_1(t,-q_1(t))}\,P_1(0;a_1,a_2),\\
P_2(t;a_3) &= \frac{y(t;a_3)}{Y(t)} = \frac{\exp(-a_3 t)}{M_2(-t)}\,P_2(0;a_3)
\end{aligned} \tag{7.13}$$
$$E_t^1[a_2] = \frac{\int_A a_2 \exp\left(a_1 t - a_2 q_1(t)\right) P_1(0;a_1,a_2)\,da_1\,da_2}{M_1(t,-q_1(t))},$$
hence
$$E_t^1[a_2]\,X(t) = X(0)\,\partial_{\lambda_2} M_1(t,-q_1(t)) \tag{7.14}$$
Here $\partial_{\lambda_2} M_1(t,-q_1(t))$ denotes the partial derivative of $M_1(\lambda_1,\lambda_2)$ with respect to $\lambda_2$, evaluated at $\lambda_1 = t$, $\lambda_2 = -q_1(t)$.
Finally, we obtain a closed system of non-autonomous equations:
$$\frac{dq_1}{dt} = Y(0)\exp(q_2(t))\,M_2(-t), \qquad \frac{dq_2}{dt} = X(0)\,\partial_{\lambda_2} M_1(t,-q_1(t)) \tag{7.15}$$
Once we have the solution to the Cauchy problem for this system with zero initial values, we obtain explicit formulas for the total population sizes (7.12) and the current distributions of the parameters (7.13), which completely solve the problem. In particular, the current mean values of the parameters are:
$$E_t^1[a_1] = \partial_{\lambda_1}\ln M_1(t,-q_1(t)), \qquad E_t^1[a_2] = \partial_{\lambda_2}\ln M_1(t,-q_1(t)), \qquad E_t^2[a_3] = \partial_{\lambda_3}\ln M_2(-t) \tag{7.16}$$
One can check that the obtained formulas coincide with the formulas that follow from Theorem 5, MA. The current discrimination information for the inhomogeneous Volterra model is:
$$\begin{aligned}
I[P_t^1:P_0^1] &= \frac{E_0^1\left[\left(a_1 t - a_2 q_1(t)\right)\exp\left(a_1 t - a_2 q_1(t)\right)\right]}{M_1(t,-q_1(t))} - \ln M_1(t,-q_1(t)),\\
I[P_t^2:P_0^2] &= \frac{E_0^2\left[(-a_3 t)\exp(-a_3 t)\right]}{M_2(-t)} - \ln M_2(-t)
\end{aligned} \tag{7.17}$$
The distributions (7.13) attain the minimum of the discrimination information, equal to (7.17), over all distributions compatible with the constraints (7.16).
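As a worked illustration of the predator formulas, suppose (purely as an assumption made for this example) that the death rate $a_3$ is initially exponentially distributed with rate $s > 0$. Then the mgf, the current pdf, the current mean, and the discrimination information, computed directly from its definition $I[P_t^2:P_0^2] = E_t^2[\ln(P_2(t;a_3)/P_2(0;a_3))]$, can be written in closed form:

$$\begin{aligned}
P_2(0;a_3) &= s\,e^{-s a_3},\ a_3 \ge 0, \qquad M_2(\lambda) = \frac{s}{s-\lambda},\ \lambda < s, \qquad M_2(-t) = \frac{s}{s+t},\\
P_2(t;a_3) &= \frac{e^{-a_3 t}}{M_2(-t)}\,s\,e^{-s a_3} = (s+t)\,e^{-(s+t)a_3},\\
E_t^2[a_3] &= \left.\frac{d}{d\lambda}\ln M_2(\lambda)\right|_{\lambda=-t} = \frac{1}{s+t},\\
I[P_t^2:P_0^2] &= -t\,E_t^2[a_3] - \ln M_2(-t) = \ln\frac{s+t}{s} - \frac{t}{s+t}
\end{aligned}$$

In this example the predator distribution stays exponential, its rate grows linearly in time, and the mean death rate decays as $1/(s+t)$; the last expression agrees with the standard formula for the Kullback-Leibler divergence between two exponential distributions with rates $s+t$ and $s$.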
Integrating the equations of system (7.8) over the parameters, we obtain the system:
$$\frac{dX}{dt} = X\left(E_t^1[a_1] - E_t^1[a_2]\,Y\right), \qquad \frac{dY}{dt} = Y\left(E_t^1[a_2]\,X - E_t^2[a_3]\right) \tag{7.18}$$
These equations for the total sizes of the inhomogeneous populations have the same form as the initial Volterra system (7.5); the difference is that the parameter values are no longer constants but vary over time according to formulas (7.16). The phase-parametric portrait of the “homogeneous” Volterra model is well known (see, e.g., [14]). The dynamics of system (7.18) is determined by the parametric point $(E_t^1[a_1], E_t^1[a_2], E_t^2[a_3])$, which moves across the parametric portrait of model (7.5). This phenomenon, which may be referred to as “traveling across the parametric portrait of a homogeneous model”, is a common feature of the corresponding inhomogeneous models. It has been clearly observed in discrete-time models [13]. For a Volterra-type model of two inhomogeneous populations with logistic reproduction rates and a ratio-dependent predator functional response, the phenomenon was studied in detail in [15].
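The reduction described in this subsection can be tried numerically. The following is a minimal Python sketch, not the author's implementation: it assumes hypothetical initial distributions concentrated on finitely many parameter values (so that the mgfs are finite sums), and it integrates the two-dimensional escort system for $q_1, q_2$ with a hand-written fixed-step Runge-Kutta scheme. All names (`PREY`, `PRED`, `solve`, and so on) and all numerical values are illustrative assumptions.

```python
import math

# Hypothetical initial data (assumptions for illustration only):
# prey types as (a1, a2, weight), predator types as (a3, weight).
PREY = [(1.0, 0.5, 0.5), (0.6, 0.2, 0.5)]
PRED = [(0.4, 0.5), (0.8, 0.5)]
X0, Y0 = 1.0, 0.5  # initial total sizes X(0), Y(0)

def M1(l1, l2):
    """mgf of the initial joint distribution of (a1, a2)."""
    return sum(p * math.exp(l1 * a1 + l2 * a2) for a1, a2, p in PREY)

def dM1_dl2(l1, l2):
    """Partial derivative of M1 with respect to its second argument."""
    return sum(p * a2 * math.exp(l1 * a1 + l2 * a2) for a1, a2, p in PREY)

def M2(l3):
    """mgf of the initial distribution of a3."""
    return sum(p * math.exp(l3 * a3) for a3, p in PRED)

def dM2(l3):
    """Derivative of M2 with respect to its argument."""
    return sum(p * a3 * math.exp(l3 * a3) for a3, p in PRED)

def escort_rhs(t, q):
    """Right-hand side of the escort system for the auxiliary variables (q1, q2)."""
    q1, q2 = q
    return [Y0 * math.exp(q2) * M2(-t), X0 * dM1_dl2(t, -q1)]

def rk4(f, y, t_end, steps):
    """Fixed-step classical Runge-Kutta scheme for y' = f(t, y) on [0, t_end]."""
    h, t = t_end / steps, 0.0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, [u + h / 2 * k for u, k in zip(y, k1)])
        k3 = f(t + h / 2, [u + h / 2 * k for u, k in zip(y, k2)])
        k4 = f(t + h, [u + h * k for u, k in zip(y, k3)])
        y = [u + h / 6 * (a + 2 * b + 2 * c + d)
             for u, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

def solve(t_end, steps=2000):
    """Return X(t), Y(t) and the current means of a2 and a3 at t = t_end."""
    q1, q2 = rk4(escort_rhs, [0.0, 0.0], t_end, steps)
    m1 = M1(t_end, -q1)
    X = X0 * m1
    Y = Y0 * math.exp(q2) * M2(-t_end)
    return X, Y, dM1_dl2(t_end, -q1) / m1, dM2(-t_end) / M2(-t_end)
```

For a finite number of types the full model is itself a finite ODE system, so the escort solution can be compared against a direct integration of all subpopulation densities; for continuous initial distributions only the mgf functions would need to be replaced.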

7.3. Competition of Two Inhomogeneous Populations

The dynamics of two populations competing for a common resource can be described by the following logistic-like model (see, e.g., [14], ch. 4):
$$\frac{dx}{dt} = a x\left(1 - \frac{x + \alpha y}{A}\right), \qquad \frac{dy}{dt} = b y\left(1 - \frac{y + \beta x}{B}\right) \tag{7.19}$$
where $A$, $B$ are the capacities of the ecological niches for the two populations, and $\alpha$, $\beta$ are the coefficients of interspecies competition. A more general Allee-like model has the form:
$$\frac{dx}{dt} = a x\left((x - L)(A - x) - \alpha y\right), \qquad \frac{dy}{dt} = b y\left((y - M)(B - y) - \beta x\right) \tag{7.20}$$
where $L$, $M$ are the lower threshold sizes for the two populations.
Consider the inhomogeneous versions of these models, supposing that the reproduction rates $a$, $b$ are distributed and the competition is defined by the total sizes $X(t)$, $Y(t)$ of the populations.
Then instead of the logistic model we obtain the following model:
$$\frac{dx(t,a)}{dt} = a\,x(t,a)\left(1 - \frac{X(t) + \alpha Y(t)}{A}\right), \qquad \frac{dy(t,b)}{dt} = b\,y(t,b)\left(1 - \frac{Y(t) + \beta X(t)}{B}\right) \tag{7.21}$$
and correspondingly:
$$\begin{aligned}
\frac{dx(t,a)}{dt} &= a\,x(t,a)\left((X(t) - L)(A - X(t)) - \alpha Y(t)\right),\\
\frac{dy(t,b)}{dt} &= b\,y(t,b)\left((Y(t) - M)(B - Y(t)) - \beta X(t)\right)
\end{aligned} \tag{7.22}$$
instead of the Allee-type model. Both systems have the form:
$$\frac{dx(t,a)}{dt} = a\,x(t,a)\left(u(X(t)) - \alpha Y(t)\right), \qquad \frac{dy(t,b)}{dt} = b\,y(t,b)\left(v(Y(t)) - \beta X(t)\right) \tag{7.23}$$
where $u$, $v$ are appropriate functions.
Let $P_1(0;a)$ and $P_2(0;b)$ be the initial distributions of the Malthusian rates $a$, $b$, and let $M_1(\lambda)$, $M_2(\lambda)$ be the corresponding mgfs. In order to study model (7.23), we apply Theorem 5, MA, and consider the four-dimensional escort system:
$$\begin{aligned}
\frac{dq_1^1}{dt} &= u\left(X(0)\,M_1(q_1^1 - \alpha q_2^1)\right), & \frac{dq_2^1}{dt} &= Y(0)\,M_2(q_1^2 - \beta q_2^2),\\
\frac{dq_1^2}{dt} &= v\left(Y(0)\,M_2(q_1^2 - \beta q_2^2)\right), & \frac{dq_2^2}{dt} &= X(0)\,M_1(q_1^1 - \alpha q_2^1),\\
q_1^1(0) &= q_2^1(0) = q_1^2(0) = q_2^2(0) = 0
\end{aligned} \tag{7.24}$$
Suppose that the Cauchy problem (7.24) has a unique global solution for $t \in [0,T)$, $0 < T \le \infty$.
Define $K_t^1(a) = \exp\left(a\,(q_1^1(t) - \alpha q_2^1(t))\right)$, $K_t^2(b) = \exp\left(b\,(q_1^2(t) - \beta q_2^2(t))\right)$.
Then the solution to model (7.23) is:
$$\begin{aligned}
x(t,a) &= x(0,a)\,K_t^1(a) = x(0,a)\exp\left(a\,(q_1^1(t) - \alpha q_2^1(t))\right),\\
y(t,b) &= y(0,b)\,K_t^2(b) = y(0,b)\exp\left(b\,(q_1^2(t) - \beta q_2^2(t))\right),\\
X(t) &= X(0)\,M_1(q_1^1(t) - \alpha q_2^1(t)),\\
Y(t) &= Y(0)\,M_2(q_1^2(t) - \beta q_2^2(t))
\end{aligned} \tag{7.25}$$
The mean values of the Malthusian rates at moment $t$:
$$E_t^1[a] = \partial_\lambda \ln M_1(q_1^1(t) - \alpha q_2^1(t)), \qquad E_t^2[b] = \partial_\lambda \ln M_2(q_1^2(t) - \beta q_2^2(t)) \tag{7.26}$$
The current pdfs:
$$\begin{aligned}
P_1(t;a) &= P_1(0;a)\,\frac{\exp\left(a\,(q_1^1(t) - \alpha q_2^1(t))\right)}{M_1(q_1^1(t) - \alpha q_2^1(t))},\\
P_2(t;b) &= P_2(0;b)\,\frac{\exp\left(b\,(q_1^2(t) - \beta q_2^2(t))\right)}{M_2(q_1^2(t) - \beta q_2^2(t))}
\end{aligned} \tag{7.27}$$
The discrimination information:
$$\begin{aligned}
I[P_t^1:P_0^1] &= \frac{E_0^1\left[a\,Q_1(t)\exp(a\,Q_1(t))\right]}{E_0^1\left[\exp(a\,Q_1(t))\right]} - \ln E_0^1\left[\exp(a\,Q_1(t))\right],\\
I[P_t^2:P_0^2] &= \frac{E_0^2\left[b\,Q_2(t)\exp(b\,Q_2(t))\right]}{E_0^2\left[\exp(b\,Q_2(t))\right]} - \ln E_0^2\left[\exp(b\,Q_2(t))\right]
\end{aligned} \tag{7.28}$$
where we denoted:
$$Q_1(t) = q_1^1(t) - \alpha q_2^1(t), \qquad Q_2(t) = q_1^2(t) - \beta q_2^2(t) \tag{7.29}$$
The distributions $P_t^j$ (7.27) for $j = 1, 2$ at each moment $t$ provide the minimum of $I[P_t^j:P_0^j]$, given by (7.28), over all probability distributions compatible with the constraints (7.26).
The simpler logistic model (7.21) can be reduced to a two-dimensional escort system:
$$\frac{dq_1}{dt} = X(0)\,M_1\!\left(t - \frac{q_1 + \alpha q_2}{A}\right), \qquad \frac{dq_2}{dt} = Y(0)\,M_2\!\left(t - \frac{\beta q_1 + q_2}{B}\right), \qquad q_1(0) = q_2(0) = 0 \tag{7.30}$$
The solution to model (7.21) is given by the following formulas:
$$\begin{aligned}
x(t,a) &= x(0,a)\exp\left(a\left(t - \frac{q_1(t) + \alpha q_2(t)}{A}\right)\right), & y(t,b) &= y(0,b)\exp\left(b\left(t - \frac{\beta q_1(t) + q_2(t)}{B}\right)\right),\\
X(t) &= X(0)\,M_1\!\left(t - \frac{q_1(t) + \alpha q_2(t)}{A}\right), & Y(t) &= Y(0)\,M_2\!\left(t - \frac{\beta q_1(t) + q_2(t)}{B}\right)
\end{aligned} \tag{7.31}$$
The current pdfs:
$$P_1(t;a) = P_1(0;a)\,\frac{\exp(a\,Q_1(t))}{M_1(Q_1(t))}, \qquad P_2(t;b) = P_2(0;b)\,\frac{\exp(b\,Q_2(t))}{M_2(Q_2(t))} \tag{7.32}$$
The mean values of the Malthusian rates at moment $t$:
$$E_t^1[a] = \partial_\lambda \ln M_1(Q_1(t)), \qquad E_t^2[b] = \partial_\lambda \ln M_2(Q_2(t)) \tag{7.33}$$
The discrimination information:
$$\begin{aligned}
I[P_t^1:P_0^1] &= \frac{E_0^1\left[a\,Q_1(t)\exp(a\,Q_1(t))\right]}{E_0^1\left[\exp(a\,Q_1(t))\right]} - \ln E_0^1\left[\exp(a\,Q_1(t))\right],\\
I[P_t^2:P_0^2] &= \frac{E_0^2\left[b\,Q_2(t)\exp(b\,Q_2(t))\right]}{E_0^2\left[\exp(b\,Q_2(t))\right]} - \ln E_0^2\left[\exp(b\,Q_2(t))\right]
\end{aligned} \tag{7.34}$$
where in (7.32)-(7.34) we denoted:
$$Q_1(t) = t - \frac{q_1(t) + \alpha q_2(t)}{A}, \qquad Q_2(t) = t - \frac{\beta q_1(t) + q_2(t)}{B} \tag{7.35}$$
The distributions (7.32) provide the minimum (7.34) of the discrimination information for model (7.21) at each time moment among all distributions subject to the mean values (7.33).
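The two-dimensional reduction of the logistic competition model can be sketched in a few lines of Python. The sketch below is illustrative only: it assumes, purely for convenience, that the initial Malthusian rates are normally distributed (so that the mgf, the current mean, and the discrimination information have closed forms; note that a normal distribution formally allows negative rates), and all parameter values and function names are hypothetical. For a normal initial distribution $N(\mu, \sigma^2)$ the current mean is $\mu + \sigma^2 Q(t)$ and the discrimination information is $\sigma^2 Q(t)^2/2$, where $Q_1(t) = t - (q_1 + \alpha q_2)/A$ and $Q_2(t) = t - (\beta q_1 + q_2)/B$.

```python
import math

# Illustrative assumptions: a ~ N(MU1, S1^2) and b ~ N(MU2, S2^2) at t = 0.
MU1, S1, MU2, S2 = 1.0, 0.2, 0.8, 0.3
A, B = 10.0, 8.0          # niche capacities
ALPHA, BETA = 0.5, 0.4    # competition coefficients
X0, Y0 = 1.0, 1.0         # initial total sizes X(0), Y(0)

def mgf(lam, mu, s):
    """mgf of the normal distribution N(mu, s^2)."""
    return math.exp(mu * lam + 0.5 * (s * lam) ** 2)

def Q(t, q):
    """Arguments Q1(t), Q2(t) of the mgfs in the reduced logistic model."""
    q1, q2 = q
    return t - (q1 + ALPHA * q2) / A, t - (BETA * q1 + q2) / B

def rhs(t, q):
    """Escort system: dq1/dt = X(t), dq2/dt = Y(t)."""
    Q1, Q2 = Q(t, q)
    return [X0 * mgf(Q1, MU1, S1), Y0 * mgf(Q2, MU2, S2)]

def state(t, q):
    """Total sizes, current mean rates and discrimination information at time t."""
    Q1, Q2 = Q(t, q)
    return {"t": t,
            "X": X0 * mgf(Q1, MU1, S1), "Y": Y0 * mgf(Q2, MU2, S2),
            "mean_a": MU1 + S1 ** 2 * Q1, "mean_b": MU2 + S2 ** 2 * Q2,
            "I1": 0.5 * (S1 * Q1) ** 2, "I2": 0.5 * (S2 * Q2) ** 2}

def trajectory(t_end, steps):
    """Integrate the escort system with fixed-step RK4 and record the state."""
    h, t, q = t_end / steps, 0.0, [0.0, 0.0]
    out = [state(t, q)]
    for _ in range(steps):
        k1 = rhs(t, q)
        k2 = rhs(t + h / 2, [u + h / 2 * k for u, k in zip(q, k1)])
        k3 = rhs(t + h / 2, [u + h / 2 * k for u, k in zip(q, k2)])
        k4 = rhs(t + h, [u + h * k for u, k in zip(q, k3)])
        q = [u + h / 6 * (c1 + 2 * c2 + 2 * c3 + c4)
             for u, c1, c2, c3, c4 in zip(q, k1, k2, k3, k4)]
        t += h
        out.append(state(t, q))
    return out
```

A call such as `trajectory(20.0, 4000)` returns the whole path, so the "travel" of the mean rates and the growth of the discrimination information can be inspected along the trajectory rather than only at the final time.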

8. Discussion and Conclusions

In this paper we develop a method for solving selection systems that describe replica dynamics, based on the reduction theorem, and show that the solutions obey the principle of minimum discrimination information (MinxEnt).
The selection system is a dynamical model of an inhomogeneous biological community or a population of individuals, each of which is characterized by a set of qualitative traits; the values of these traits determine the reproduction rate of the individual. The model takes into account the processes of replication and selection, but not mutation and immigration (at least, not explicitly). The main problem of interest is the dynamics of joint distribution of these traits in each population depending on the initial distribution, as well as on correlations between the traits and interconnections between the populations. The dynamics of a distribution of a selection system is governed by the replicator equation.
MinxEnt (or, in an equivalent form, the principle of maximum information entropy or cross-entropy, MaxEnt) has been successfully applied to various statistical, physical, and biological problems as a method for inferring an unknown distribution subject to given constraints. MinxEnt offers an efficient algorithm for constructing the minimum discrimination information probability distributions; the “MinxEnt distributions” are well known in physics as the Boltzmann-Gibbs distributions and in mathematics as the exponential family of distributions. The method can incorporate interactions between different traits and variables, i.e., in the form of moments of their joint distribution, and requires only testable information in the form of mean values of the traits.
There exists an “observer-dependent” view of the cross-entropy concepts (defended by Jaynes [5,6] and, subsequently, by many other authors). Briefly, these authors claim that entropy is a property of our description of a system rather than a property of the system itself. Then, if MaxEnt is fundamentally an algorithm of Bayesian statistical inference from partial information, it is not clear why one can expect it to work as a description of nature. This problem has been discussed in many papers over the last 50 years. An acceptable answer was formulated by Dewar [16]: “…MaxEnt predicts that behavior, which is selected reproducibly by nature under the imposed constraints”. In the recent paper [9], the author suggested that MaxEnt “is not a physical principle but, rather, an inference algorithm that passively translates physical assumptions into macroscopic predictions”.
Nevertheless, the problem cannot be considered to be completely solved. Actually, an approach to obtain a satisfactory solution was pointed out by A. Einstein [17], who argued that the statistics of a system should follow from its dynamics and, in principle, could not be postulated a priori.
In this paper we show that, for a wide class of models of replica dynamics (the selection systems), the dynamical version of the principle of minimal information production is neither an inference algorithm nor an external principle but a mathematical assertion that can be derived from the system dynamics instead of being postulated.
We provide an algorithmic approach to finding the solution to the replicator equation through reduction of the initial model (which may have infinite dimensionality) to a corresponding escort system of ordinary differential equations, which can be of significantly smaller dimensionality than the original system; the actual dimensionality is equal to the number of traits that define the reproduction rate. The solution to any replicator equation from the considered class is a time-dependent Boltzmann distribution whose parameters solve the escort system. With the solution to the replicator equation we can compute the current mean values of the traits at any instant. Then, treating these mean values as constraints, we can show that the “MinxEnt distribution” coincides with the solution of the replicator equation, which was obtained independently of the MinxEnt algorithm. Hence, the principle of minimum discrimination information can be considered as the variational principle that governs the selection system dynamics. Overall, both the reduction theorem and the Dynamical MinxEnt principle stem from the fact that the solution to the replicator equation belongs to the exponential family.
On the other hand, it is easy to show that any Boltzmann distribution with time-dependent parameters solves the corresponding replicator equation [8] and hence provides the minimum of discrimination information for the distribution of the associated selection system. It means that the replica dynamics is the “natural habitat” for the MinxEnt principle; within the framework of selection systems, we cannot choose whether or not to ascribe the property of minimization of the information discrimination to the selection system. It is an intrinsic property of any solution to the replicator equations that is fulfilled due to the system dynamics at any point of the system trajectory. More generally, the MinxEnt principle is an internal property of the process of natural selection at every moment of the system evolution.
We showed that the discrimination information is the Legendre transform of the logarithm of the partition function; the auxiliary variables and the constraints are conjugate variables under this transformation. This assertion clarifies the meaning of the auxiliary variables and the role of the escort system. It also implies that the dynamics of a selection system can be equally well described either in terms of the auxiliary variables or in terms of the constraints, which are the current mean values of the traits. Hence, the solution to the replicator equation is completely determined by the current mean values of the traits, subject to the important condition that the initial distribution is known.
Our approach is illustrated for inhomogeneous versions of some conceptual models of mathematical biology, namely, the logistic and Ricker models of populations and the Volterra models of communities. Formally, the inhomogeneous versions can be written in the same form as the initial models, with the current mean values substituted for the fixed parameters, as was done in Section 7 for the Volterra system. As a result, one obtains a complex non-linear integro-differential system, which can hardly be studied directly. The developed approach allows us to reduce this complex model to a system of two non-autonomous differential equations, which is specific for every initial distribution of the parameters. The same method works for other models and allows us to find the distribution of the system at any time moment.
Looking beyond the formal solutions, we can now reveal a general optimization principle that governs the replica dynamics in many problems in mathematical biology, evolutionary game theory [3], the Eigen quasispecies theory [18], etc. As a result of the selection process, these systems evolve in such a way that the discrimination information is minimized at each time moment, given the mean values of the accounting traits. The values of the constraints and corresponding minimal value of discrimination information can be determined for each specific system by the developed method; we can conclude that for this type of models the MinxEnt principle follows from the system dynamics.

Acknowledgements

I thank the four anonymous referees and the Guest Editor of the volume, Alexander Gorban, for their valuable comments and criticisms, from which the paper has greatly benefited.

Mathematical Appendix

1. Information Discrimination and the Legendre Transform

Let us recall some definitions and statements (see, e.g., [19], Section 12, for rigorous definitions and theorems). Let $W(\alpha)$ be a convex function of the $r$-dimensional vector $\alpha$. The Legendre transform of the function $W(\alpha)$ is the function:
$$L(\beta) = \sup_{\alpha}\left\{(\alpha,\beta) - W(\alpha)\right\} \tag{A.1}$$
where $(\alpha,\beta) = \sum_{i=1}^{r} \alpha_i\beta_i$. The function $L(\beta)$ is again a convex function, and its Legendre transform is $W(\alpha)$. The functions $W(\alpha)$ and $L(\beta)$ are conjugate.
If the function $W(\alpha)$ is smooth, then the transform (A.1) coincides with the classical Legendre transform defined as follows. Let $\alpha = \alpha(\beta)$ be the solution of the equation:
$$\nabla W(\alpha) = \beta, \quad \text{i.e.,} \quad \frac{\partial W(\alpha)}{\partial \alpha_i} = \beta_i.$$
Then:
$$L(\beta) = (\alpha(\beta),\beta) - W(\alpha(\beta))$$
The variables $\alpha$, $\beta$ are conjugate. Let
$$W(\lambda) = \ln E_0\left[\exp\left(\sum_{i=1}^{n} \lambda_i\varphi_i(a)\right)\right]$$
where $\lambda = (\lambda_1,\ldots,\lambda_n)$. It is a smooth convex function of $\lambda$. Then:
$$\beta_j = \frac{\partial W(\lambda)}{\partial \lambda_j} = \frac{E_0\left[\varphi_j\exp\left(\sum_{i=1}^{n}\lambda_i\varphi_i\right)\right]}{E_0\left[\exp\left(\sum_{i=1}^{n}\lambda_i\varphi_i\right)\right]}$$
Now, let $\lambda_i = q_i(t)$. Then:
$$W(q(t)) = \ln E_0\left[\exp\left(\sum_{i=1}^{n} q_i(t)\varphi_i\right)\right] = \ln E_0[K_t]$$
$$\beta_j = \frac{E_0[\varphi_j K_t]}{E_0[K_t]} = E_t[\varphi_j] = A_j(t) \quad \text{for all } j \le n$$
Hence, taking $\beta_j = A_j(t)$, we obtain:
$$\sum_{i=1}^{n}\lambda_i(\beta)\,\beta_i = \sum_{i=1}^{n} q_i(t)\,E_t[\varphi_i] = E_t[B(q(t))]$$
$$W(\lambda(\beta)) = \ln E_0[K_t]$$
and finally:
$$L(\beta) = (\lambda(\beta),\beta) - W(\lambda(\beta)) = E_t[B(q(t))] - \ln E_0[K_t] = I[P_t:P_0]$$
So $I[P_t:P_0]$, at the given constraints $E_t[\varphi_j] = A_j(t)$, solves the optimization problem:
$$I[P_t:P_0] = \sup_{\alpha}\left\{\sum_{i=1}^{n}\alpha_i A_i(t) - W(\alpha)\right\} \tag{A.2}$$
and hence $q_i(t) = \alpha_i^*$, where the $\alpha_i^*$ provide the supremum in (A.2).
We have thus proven Theorem 4.
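The Legendre-transform identity can be checked numerically in the simplest setting. The following Python sketch is only an illustration under stated assumptions: a hypothetical discrete initial distribution with a single trait $\varphi(a) = a$, a hypothetical value of the auxiliary variable $q$, and a crude grid search in place of an exact supremum. It compares the discrimination information computed from its definition with $qA - W(q)$ and with the grid approximation of $\sup_\alpha\{\alpha A - W(\alpha)\}$.

```python
import math

# Hypothetical discrete initial distribution P_0 on trait values a_k; phi(a) = a.
VALUES = [0.0, 1.0, 2.5]
P0 = [0.5, 0.3, 0.2]

def W(lam):
    """Log-partition function W(lambda) = ln E_0[exp(lambda * a)]."""
    return math.log(sum(p * math.exp(lam * a) for a, p in zip(VALUES, P0)))

def tilted(q):
    """Exponentially tilted distribution P_t = P_0 * exp(q a) / E_0[exp(q a)]."""
    w = [p * math.exp(q * a) for a, p in zip(VALUES, P0)]
    z = sum(w)
    return [x / z for x in w]

def mean(dist):
    """Mean value of the trait under the distribution dist."""
    return sum(a * p for a, p in zip(VALUES, dist))

def kl(p, p0):
    """Discrimination information I[p : p0] = sum p ln(p / p0)."""
    return sum(x * math.log(x / y) for x, y in zip(p, p0) if x > 0)

def legendre(beta, lo=-10.0, hi=10.0, n=20001):
    """Grid approximation of sup_alpha {alpha*beta - W(alpha)}: (value, argmax)."""
    grid = (lo + (hi - lo) * k / (n - 1) for k in range(n))
    return max((alpha * beta - W(alpha), alpha) for alpha in grid)
```

For example, with `q = 0.7` one can compute `A = mean(tilted(q))` and compare `kl(tilted(q), P0)`, `q * A - W(q)`, and `legendre(A)`; the maximizing grid point should lie close to `q`, in line with the statement that the auxiliary variable and the constraint are conjugate.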

2. Inhomogeneous Models of Communities: the Main Theorems

Below we formulate the main theorems about the model of an inhomogeneous community (7.1)-(7.4). We omit the proofs because these theorems are similar to the corresponding theorems for inhomogeneous populations ([7,8]), up to more complex technical details.
The model (7.1)-(7.4) can be reduced to an escort system of non-autonomous ordinary differential equations of dimensionality $r \times n$:
$$\frac{dq_i^j}{dt} = u_i^j\left(t, G_i^{*}(t)\right), \qquad q_i^j(0) = 0, \qquad j = 1,\ldots,r, \quad i = 1,\ldots,n \tag{A.3}$$
Here:
$$q^j(t) = \left(q_1^j(t),\ldots,q_n^j(t)\right)$$
$$G_i^{*j}(t) = N^j(0)\,\Phi^j(g_i; q^j(t))$$
in particular:
$$N^{*j}(t) = N^j(0)\,\Phi^j(1; q^j(t))$$
and $\Phi^j(r;\lambda)$ is the generating functional (2.5) for the initial distribution $P^j(0,a)$.
Denote $K_t^j(a) = \exp\left(\sum_{i=1}^{n} q_i^j(t)\,\varphi_i(a)\right)$.
Theorem 5.
Suppose that the Cauchy problem (A.3) has a unique global solution for $t \in [0,T)$, $0 < T \le \infty$. Then the functions:
$$l^j(t,a) = l^j(0,a)\,K_t^j(a)$$
$$G_i^j(t) = N^j(0)\,\Phi^j(g_i; q^j(t))$$
$$N^j(t) = N^j(0)\,\Phi^j(1; q^j(t))$$
satisfy system (7.1)-(7.3) for $t \in [0,T)$. The pdf:
$$P^j(t,a) = \frac{P^j(0,a)\,K_t^j(a)}{E_0^j[K_t^j]} = \frac{P^j(0,a)\,K_t^j(a)}{\Phi^j(1; q^j(t))} \tag{A.4}$$
solves the replicator equations (7.4).
It follows from Theorem 5 that:
$$\frac{dN^j}{dt} = N^j\left(\sum_{i=1}^{n} u_i^j(t, G_i)\,E_t^j[\varphi_i]\right)$$
where $E_t^j[\varphi_i] = \Phi^j(\varphi_i; q^j(t))\,/\,\Phi^j(1; q^j(t))$.
Theorem 6.
i) Let $P_t^j = P^j(t,a)$ be the solution (A.4) to the replicator equation (7.4). Then at each moment $t$ the distribution $P_t^j$ provides the minimum of $I[P_t^j:P_0^j]$ over all probability distributions compatible with the constraints $A_i^j(t) = E_t^j[\varphi_i]$, $i = 1,\ldots,n$;
ii) The constraint values can be computed at each time moment by the formula:
$$A_i^j(t) = \frac{E_0^j[\varphi_i K_t^j]}{E_0^j[K_t^j]} = \frac{\Phi^j(\varphi_i; q^j(t))}{\Phi^j(1; q^j(t))}$$
iii) The discrimination information $I[P_t^j:P_0^j]$ can be computed with the help of the following formulas:
$$I[P_t^j:P_0^j] = E_t^j[B^j(t,a)] - \ln\frac{N^j(t)}{N^j(0)}$$
$$I[P_t^j:P_0^j] = \frac{E_0^j[B^j K_t^j]}{E_0^j[K_t^j]} - \ln E_0^j[K_t^j]$$
where $B^j(t,a) = \sum_{i=1}^{n} q_i^j(t)\,\varphi_i(a)$.
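Statement i) of Theorem 6 can be illustrated numerically for a single population and a single trait. The Python sketch below is a toy check under assumptions that are not part of the text: a hypothetical initial distribution on four trait values, $\varphi(a) = a$, and a hand-picked perturbation direction `D` that changes neither the total probability nor the mean. Any distribution obtained by moving along `D` satisfies the same mean constraint as the exponentially tilted distribution, and its discrimination information relative to $P_0$ should be larger.

```python
import math

# Hypothetical initial distribution P_0 on four trait values; phi(a) = a.
VALUES = [0.0, 1.0, 2.0, 3.0]
P0 = [0.4, 0.3, 0.2, 0.1]
# Direction with sum(D) = 0 and sum(a * D) = 0: it preserves mass and mean.
D = [1.0, -2.0, 1.0, 0.0]

def tilted(q):
    """Solution of the replicator equation: P_t = P_0 * exp(q a) / E_0[exp(q a)]."""
    w = [p * math.exp(q * a) for a, p in zip(VALUES, P0)]
    z = sum(w)
    return [x / z for x in w]

def mean(dist):
    """Mean value of the trait, i.e., the constraint A(t)."""
    return sum(a * p for a, p in zip(VALUES, dist))

def kl(p, p0):
    """Discrimination information I[p : p0] = sum p ln(p / p0)."""
    return sum(x * math.log(x / y) for x, y in zip(p, p0) if x > 0)

def perturbed(p, eps):
    """Another distribution with the same mass and the same mean as p."""
    return [x + eps * d for x, d in zip(p, D)]
```

The excess `kl(alt, P0) - kl(pt, P0)` for a competitor `alt = perturbed(pt, eps)` equals `kl(alt, pt)`, which is the usual way to see why the tilted distribution is the unique minimizer under the mean constraint.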

References

  1. Schuster, P.; Sigmund, K. Replicator dynamics. J. Theor. Biol. 1983, 100, 533–538.
  2. Akin, E. Exponential Families and game dynamics. Can. J. Math. 1982, XXXIV, 374–405.
  3. Hofbauer, J.; Sigmund, K. Evolutionary game dynamics. Bull. Am. Math. Soc. 2003, 40, 479–519.
  4. Kullback, S. Information Theory and Statistics; John Wiley: New York, NY, USA, 1959.
  5. Jaynes, E.T. Information theory and statistical mechanics I. Phys. Rev. 1957, 106, 620–630.
  6. Jaynes, E.T. Probability Theory: The Logic of Science; Cambridge University Press: Cambridge, UK, 2003.
  7. Karev, G.P. On mathematical theory of selection: continuous time population dynamics. J. Math. Biol. 2010, 60, 107–129.
  8. Karev, G.P. Replicator equations and the principle of minimal production of information. Bull. Math. Biol. 2010.
  9. Dewar, R.C. Maximum entropy production as an inference algorithm that translates physical assumptions into macroscopic predictions: don’t shoot the messenger. Entropy 2009, 11, 931–944.
  10. Chentsov, N.N. Statistical Decision Rules and Optimal Inference; AMS: Providence, RI, USA, 1982.
  11. Gorban, A.N. Selection theorem for systems with inheritance. Math. Model. Nat. Phenom. 2007, 2, 1–45.
  12. Ackleh, A.S.; Marshall, D.F.; Heatherly, H.E.; Fitzpatrick, B.G. Survival of the fittest in a generalized logistic model. Math. Model. Meth. Appl. Sci. 1999, 9, 1379–1391.
  13. Karev, G.P. Inhomogeneous maps and mathematical theory of selection. JDEA 2008, 14, 31–58.
  14. Bazykin, A.D. Nonlinear Dynamics of Interacting Populations; World Scientific: Singapore, 1998.
  15. Karev, G.; Novozhilov, A.; Koonin, E. Mathematical modeling of tumor therapy with oncolytic viruses: Effects of parametric heterogeneity on cell dynamics. Biol. Dir. 2006, 3, 1–30.
  16. Dewar, R.C. Maximum entropy production and the fluctuation theorem. J. Phys. A: Math. Gen. 2005, 38, L371–L381.
  17. Einstein, A. The Collected Papers of Albert Einstein, Volume 3: The Swiss Years: Writings, 1909-1911; Klein, M.J., Kox, A.J., Renn, J., Schulmann, R., Eds.; Princeton University Press: Princeton, NJ, USA, 1993; pp. 286–312.
  18. Eigen, M.; McCaskill, J.; Schuster, P. The molecular quasi-species. Adv. Chem. Phys. 1989, 75, 149–263.
  19. Rockafellar, R.T. Convex Analysis; Princeton University Press: Princeton, NJ, USA, 1970.