Far-from-Equilibrium Time Evolution between two Gamma Distributions

Many systems in nature and laboratories are far from equilibrium and exhibit significant fluctuations, invalidating the key assumptions of small fluctuations and short memory time in or near equilibrium. A full knowledge of Probability Distribution Functions (PDFs), especially time-dependent PDFs, becomes essential in understanding far-from-equilibrium processes. We consider a stochastic logistic model with multiplicative noise, which has gamma distributions as stationary PDFs. We numerically solve the transient relaxation problem, and show that as the strength of the stochastic noise increases the time-dependent PDFs increasingly deviate from gamma distributions. For sufficiently strong noise a transition occurs whereby the PDF never reaches a stationary state, but instead forms a peak that becomes ever more narrowly concentrated at the origin. The addition of an arbitrarily small amount of additive noise regularizes these solutions, and re-establishes the existence of stationary solutions. In addition to diagnostic quantities such as mean value, standard deviation, skewness and kurtosis, the transitions between different solutions are analyzed in terms of entropy and information length, the total number of statistically distinguishable states that a system passes through in time.


I. INTRODUCTION
In classical statistical mechanics, the Gaussian (or normal) distribution and mean-field type theories based on such distributions have been widely used to describe equilibrium or near equilibrium phenomena. The ubiquity of the Gaussian distribution stems from the central limit theorem that random variables governed by different distributions tend to follow the Gaussian distribution in the limit of large sample size [1][2][3]. In such a limit, fluctuations are small and have a short correlation time, and mean values and variance completely describe all different moments, greatly facilitating analysis.
Many systems in nature and laboratories are however far from equilibrium, exhibiting significant fluctuations. Examples are found not only in turbulence in astrophysical and laboratory plasmas, but also in forest fires, the stock market, and biological ecosystems [4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23]. Specifically, anomalous (much larger than average values) transport associated with large fluctuations in fusion plasmas can degrade the confinement, potentially even terminating fusion operation [6]. Tornadoes are rare, large amplitude events, but can cause very substantial damage when they do occur. In biology, the pioneering work of Delbrück on bacteriophages showed that viruses replicate in strongly fluctuating bursts [24]. The fluctuations of the burst amplitudes were explained mathematically by stochastic autocatalytic reaction models first introduced in [25]. Delbrück's autocatalytic models predict discrete negative-binomial distributions, that can be well approximated by gamma distributions when the average number of particles is large. Furthermore, gene expression and protein productions, which used to be thought of as smooth processes, have also been observed to occur in bursts leading to negative binomial and gamma distributed protein copy numbers (e.g. [19][20][21][22][23]).
Such rare events of large amplitude (called intermittency) can dominate the entire transport even if they occur infrequently [8,28]. They thus invalidate the assumption of small fluctuations with short correlation time, making mean value and variances meaningless. Therefore, to understand the dynamics of a system far from equilibrium, it is crucial to have a full knowledge of Probability Distribution Functions (PDFs), including time-dependent PDFs [29].
The consequences of strong fluctuations in far-from-equilibrium systems are multiple. In physics, far-from-equilibrium fluctuations produce dissipative patterns, shift or wipe out phase transitions, etc. In economics, finance and actuarial science, strong fluctuations are important issues of risk evaluation.
In biology, strong fluctuations generate phenotypic heterogeneity that helps multicellular organisms or microbial populations to adapt to changes of the environment by so-called "bet-hedging" strategies. In such a strategy, only a part of the cell population adapts upon emergence of new environmental conditions. The remaining part retains the memory of the old conditions and is thus already adapted if environmental conditions revert to previous ones [26].
Exceptional behavior can also rescue cell subpopulations from drug-induced lethal conditions, thus generating drug resistance [27]. In particular, because of the skewness and exponential tail of the gamma distribution, gamma distributed populations contain a significant proportion of individuals with exceptionally high phenotype. Bet-hedging being a dynamic phenomenon, it is important, for biological studies, to be able to predict not only steady-state but also time-dependent distributions.
Obtaining a good quality of PDFs is often very challenging, as it requires a sufficiently large number of simulations or observations. Therefore, a PDF is usually constructed by averaging data from a long time series, and is thus stationary (independent of time). Unfortunately, such stationary PDFs miss crucial information about the dynamics/evolution of non-equilibrium processes (e.g. tumour evolution). Theoretical prediction of time-dependent PDFs has proven to be no less challenging due to the limitation in our understanding of nonlinear stochastic dynamical systems as well as the complexity in the required analysis. Spectral analysis, for example, using theoretical tools similar to those used in quantum mechanics (e.g. raising and lower operators) is useful (e.g. [1]), but the summation of all eigenfunctions is necessary for time-dependent PDFs far from equilibrium. Various different methodologies have also been developed to obtain approximate PDFs, such as the variational principle, the rate equation method, or moment method [30][31][32][33][34][35]. In particular, the rate equation method [31,32] assumes that the form of a time-dependent PDF during the relaxation is similar to that of the stationary PDF, and thus approximates a time-dependent PDF during transient relaxation by a PDF having the same functional form as a stationary PDF, but with time-varying parameters.
In this work we show that this assumption is not always appropriate. We consider a stochastic logistic model with multiplicative noise. We show that for fixed parameter values the stationary PDFs are always gamma distributions (e.g. [36,37]), one of the most popular distributions used in fitting experimental data. However, we find numerically that the time-dependent PDFs in transitioning from one set of parameter values to another are significantly different from gamma distributions, especially for strong stochastic noise. For sufficiently strong multiplicative noise it is necessary to introduce additive noise as well to obtain stationary distributions at all. We note that in inferential statistics, gamma distributions facilitate Bayesian model learning from data, as a gamma distribution is a conjugate prior to many likelihood functions. It is therefore interesting to test whether models with stationary gamma distributions also have timedependent gamma distributions.

II. STOCHASTIC LOGISTIC MODEL
We consider the logistic growth with a multiplicative noise given by the following Langevin where x is a random variable, and ξ is a stochastic forcing, which for simplicity can be taken as a short-correlated random forcing as follows: In Eq. (2), the angular brackets represent the average over ξ, ξ = 0, and D is the strength of the forcing. γ is the control parameter in the positive feedback, representing the growth rate of x, while ǫ represents the efficiency in self-regulation by a negative feedback. γx − ǫx 2 can be considered as the gradient of the potential V as Thus, V has its minimum value at x = γ ǫ . When ξ = 0, x = γ ǫ (the carrying capacity) is a stable equilibrium point since  [25], but is different from these in many ways, the most important being the lack of discreteness and the possibility of reaching a steady state due to the finite capacity of logistic growth. We will show in the following that in spite of these differences, our model is capable of producing large fluctuations.
By using the Stratonovich calculus [2,3,38], we can obtain the following Fokker-Planck equation for the PDF p(x, t) (see Appendix A for details): corresponding to the Langevin equation (1). By setting ∂ t p = 0, we can analytically solve for the stationary PDFs as which is the well-known gamma distribution. The two parameters a and b are given by a = γ/D and b = ǫ/D. The mean value and variance of the gamma distribution are found to be: where σ = Var(x) is the standard deviation. We recognise x as the carrying capacity for a deterministic system with ξ = 0. It is useful to note that x is given by the linear growth rate scaled by ǫ, while Var(x) is given by the product of the linear growth rate and the diffusion coefficient, each scaled by ǫ. That is, the effect of stochasticity should be measured relative to the linear growth rate.
Therefore, the case of small fluctuations is modelled by values of D small compared with γ and ǫ. In such a limit, a and b are large, making Var(x) ≪ x in Eq. (5). That is, the width of the PDF is much smaller than its mean value. In this limit, Eq. (4) reduces to a Gaussian distribution. To show this, we express Eq. (4) in the following form: where f (x) = bx − (a − 1) ln x. For large b, we expand f (x) around the stationary point )/x up to the second order in x − x p to find: Here a ≫ 1 was used. Using Eq. (8) in Eq. (6) then gives us which is a Gaussian PDF with mean value x . Here β = 1/Var(x) is the inverse temperature and the variance Var(x) is given by Eq. (5). Therefore, for a sufficiently small D, the gamma distribution is approximated as a Gaussian PDF, which is consistent with the central limit theorem as small D corresponds to small fluctuations and large system size. See also [39] for a different derivation.
As D increases, the Gaussian approximation becomes increasingly less valid. Indeed, even the gamma distribution becomes invalid asymptotically, when t → ∞, if D > γ; according to Eq. (4) having a < 1 yields lim x→0 p = lim x→0 ∂p ∂x = ∞. However, from the full time-dependent Fokker-Planck equation (3) one finds that if the initial condition satisfies p = 0 at x = 0, then p(x = 0) will remain 0 for all later times. As we will see, the resolution to this seeming paradox is that no stationary distribution is ever reached for D > γ, but instead the peak approaches ever closer to x = 0, without ever reaching it.
If we are interested in obtaining stationary solutions even when D > γ, one way to achieve that is to return to the original Langevin equation (1), but now include a further additive stochastic noise η: where ξ and η are uncorrelated, and η satisfies η(t)η(t ′ ) = 2Qδ(t − t ′ ). The new version of the Fokker-Planck equation (3) then becomes: which has stationary solutions given by This integral can be evaluated analytically, but the final form is not particularly illuminating. The only point to note is that for non-zero Q the denominator is never 0 even for x → 0, which avoids any possible singularities at the origin. For γ > D and Q ≪ D the solutions are also essentially indistinguishable from the previous gamma distribution (4).
The only significant effect of including η therefore is to avoid the previous difficulties at the origin when D > γ.
As we have seen, both Fokker-Planck equations (3) and (11)  and similar to previous work [40][41][42]. The only point that requires discussion are the boundary conditions. As noted above, for (3) the equation itself states that p = 0 at x = 0 is the appropriate boundary condition, provided only that the initial condition also satisfies this. In contrast, for (11) the appropriate boundary condition is ∂ ∂x p = 0 at x = 0. To derive this boundary condition for (11), we simply integrate (11) over the range x = [0, ∞] and require that the total probability should always remain 1, so that d dt p dx = 0. Regarding the outer boundary, choosing some moderately large outer value for x, and then imposing p = 0 there was sufficient. Resolutions up to 10 6 grid points were used, and results were carefully checked to ensure they were independent of the grid size, time step, and precise choice of outer boundary.

III. DIAGNOSTICS
Once the time-dependent solutions are computed, we can analyze them using a number of diagnostics. First, we can evaluate the mean value x and standard deviation σ from (5).
Next, to explore the extent to which the time-dependent PDFs differ from gamma distribu-tions, we can simply compare them with 'equivalent' gamma distributions and compute the difference. That is, given x and σ, the gamma distribution p equiv having the same mean and variance would have as its two parameters a = x 2 /σ 2 and b = x /σ 2 . With these values, we define to measure how different the actual time-dependent PDF is from its equivalent gamma distribution.
Two other familiar quantities often useful in analyzing PDFs are the skewness and kurtosis, defined by Skewness measures the extent to which a PDF is asymmetric about its peak, whereas kurtosis measures how concentrated a PDF is in the peak versus the tails, relative to a Gaussian having the same variance. (The −3 is included in the definition of the kurtosis to ensure that a Gaussian would yield 0.) For gamma distributions one finds analytically that the skewness is 2 D/γ, and the kurtosis is 6D/γ. Comparing the skewness and kurtosis of the time-dependent PDFs with these formulas is therefore another useful way of quantifying how similar or different they are from gamma distributions.
Another quantity that can be useful is the so-called differential entropy as a measure of order versus disorder (as entropy always is): In particular, we expect S to be small for localised PDFs, and large for spread out ones (e.g. [40][41][42][43]). For unimodal PDFs as the ones studied here, entropy and standard deviation are typically comparably good measures of localization, but for bimodal peaks entropy can be significantly better [42]. For the gamma distribution in Eq. (4), the differential entropy can be shown to be given by where ψ(a) = d ln (Γ(x)) dx x=a is the double gamma function.
Our final diagnostic quantity is what is known as information length. Unlike all the previous diagnostics, which are simply evaluated at any instant in time but otherwise do not involve t, information length is the Lagrangian quantity, explicitly concerned with the full time-history of the evolution of a given PDF. It is thus ideally suited to understanding time-dependent PDFs. Very briefly, we begin by defining Note how τ has units of time, and quantifies the correlation time over which the PDF changes, thereby serving as a time unit in statistical space. Alternatively, 1/τ quantifies the (average) rate of change of information in time. E is due to the change in either width (variance) of the PDF or the mean value, which are determined by γ, D and ǫ for the gamma distribution (e.g. see Eq. (4)). In the standard Brownian motion, the mean value is zero so that E is due to the change in the variance of a PDF.
The total change in information between initial and final times, 0 and t respectively, is then defined by measuring the total elapsed time in units of τ as: This information length L measures the total number of statistically distinguishable states that a system evolves through, thereby establishing a distance between the initial and final PDFs in the statistical space. Note that L by construction is a continuous variable, and thus measures the total 'number' of statistically different states as a continuous number. See also [40][41][42][43][44][45][46][47][48] for further applications and theoretical background of E and L.
Keeping ǫ and D fixed, we then switch γ back and forth between two values, in the following sense: Take the gamma distribution (4) corresponding to one value, call it γ 1 , and use that as the initial condition to solve (3) with the other value, call it γ 2 . We then interchange γ 1 and γ 2 to complete the pair of 'inward' and 'outward' processes. Such a pair can be thought of as an order/disorder phase transition [40,41], caused for example by cyclically adjusting temperature in an experiment. This protocol is also inspired from adaptation of a biological system. During adaptation a model parameter can be abruptly changed in response to the change of environmental conditions, for instance a particle replication parameter γ, but the resulting changes can be extremely heterogeneous in the population.    The first panel in figure 3 shows the previous quantities x and L · D 1/2 , but now plotted against each other rather than separately against time. The behaviour is exactly as one might expect, with L growing more or less linearly with distance from the initial position. The right panel in figure 3 shows the entropy (16), again as a function of x rather than time, to emphasize the cyclic nature of the two processes. The significance is indeed as claimed above, with more localized PDFs having smaller entropy values. Note how γ = 0.5 → 0.05, which had the narrower PDFs, has lower entropy values than the reverse process. Note also how reducing D by a factor of two, thereby making the PDFs narrower, causes the entire cyclic pattern to shift downward by an essentially constant amount. D is greater than γ.) Figure 6 shows the resulting PDFs, and how they approach ever closer to the origin, but never actually achieve the x −1 blowup that would be implied by Eq. (4) for a = γ/D = 0.
The peak amplitude simply increases indefinitely, as t 1/2 . The widths correspondingly also decrease; the apparent increase is an illusion caused by the logarithmic scale for x.
The dashed lines also show the equivalent gamma distributions, as before. Note how the difference becomes increasingly noticeable; in line with the fact that the equivalent gamma distribution is tending toward its singular behaviour as x decreases, but the actual PDFs must always have p(0) = 0. Figure 7 is the equivalent of figure 2, and directly compares γ = 0.5 → 0 here with the previous γ = 0.5 → 0.05. We see that x starts out very similarly, but instead of equilibrating to 0.05, it now tends to 0 as t −1 . E again starts out similarly, but ultimately tends to 0 much slower, as t −3 instead of exponentially. This t −3 scaling for E has an  interesting consequence for L, namely that L does saturate to a finite value L ∞ (since t −3/2 dt remains bounded for t → ∞) even though the PDF itself never settles to a stationary state. x , 2/ x and 6/ x , respectively, and indicate the behaviour expected for exact gamma distributions. ever smaller as the peak moves toward the origin. Skewness and kurtosis seem to follow the expected gamma distribution relationship extremely well, even though we saw before in figure 6 that the PDFs are actually different from gamma distributions. As x → 0, both skewness and kurtosis thus become indefinitely large.
Finally, we turn to the Fokker-Planck equation (11) with additive noise included, and use it to explore the two questions that could not be addressed otherwise. First, how does a process like γ = 0.5 → 0 then equilibrate to a stationary solution? Second, what does the reverse process γ = 0 → 0.5 look like?
We will keep D = 10 −3 and Q = 10 −5 fixed throughout this section. Since the effective diffusion coefficients in (11) are Dx 2 and Q [recall also the denominator of Eq. (12)], this means that Q is dominant only within x ≤ 0.1; any stationary solutions with peaks much beyond that are effectively pure gamma distributions. Figure 9 shows the same type of inward/outward process as before in figure 1, only now switching γ between 0.5 and 0.1. Comparing with figure 1, we see that the dynamics are very similar, just with all the peaks considerably narrower, which is to be expected if this peak is now seeing just as much diffusion from Q as from D, it is not surprising that it spreads out somewhat more, and is correspondingly somewhat lower than a pure gamma distribution would be. Figure 10 shows the fundamentally new case, namely switching γ between 0.5 and 0.
The inward process γ = 0.5 → 0 is again very similar to either figure 1 or 9. The only difference to figure 6 is that the process does actually equilibrate to a stationary solution now, as given by Eq. (12). The reverse process γ = 0 → 0.5 is rather different though. The initial central peak now broadens far more than previously seen in figures 1 and 9.
One interesting consequence of this extreme broadening for γ = 0 → 0.5 is on the total information length L ∞ . In figure 9 these values are 25 and 16, respectively, whereas in figure 10 they are 35 and 9.5. That is, in both cases decreasing γ yields larger L ∞ values than increasing γ does, consistent with the peaks being narrower, and hence passing through for γ = 0.5 → 0, this is exactly as one might expect: having the peak travel somewhat further yields extra information length. However, comparing 16 for γ = 0.1 → 0.5 versus 9.5 for γ = 0 → 0.5 is puzzling then! The peak has further to travel, but accomplishes it with less information length. The reason is precisely this extreme broadening, which substantially reduces the number of distinguishable states along the way. See also [40,41], where the same effect was studied for Gaussian PDFs, and values of D as small as 10 −7 , leading to fundamentally different scalings of L ∞ with D for inward and outward processes.
Returning to the central question of this paper, namely how close the time-dependent PDFs are to gamma distributions, the results for figure 9 are similar to the previous ones.
In particular, we recall that before in figure 5 we had the difference scaling as D 1/2 , so a smaller D here means a smaller difference. These results are approaching the small D regime where gamma distributions become very close to Gaussians anyway, which generally remain close to Gaussian as they move.
However, for the γ = 0 → 0.5 process in figure 10, the intermediate stages do not look much like gamma distributions. (The final equilibrium is indistinguishable from a gamma distribution though, consistent with Q being completely negligible for these values of x.) For the intermediate stages, these were found to be so different from gamma distributions that attempting to fit a gamma distribution having the same x and σ made little sense; this extreme broadening and long tail trailing behind the peak meant that both x and σ were too different from the normal expectation that they should be measures of 'peak' and 'width'.
Instead, we simply asked the question, which values of a and b would minimize the quantity |p − p bf | dx, where p is the time-dependent PDF to be fitted, and p bf is the best-fit gamma distribution. Unlike our previous difference formula, this does not yield simple analytic formulas for the a and b to choose, but is numerically still straightforward to implement. Figure 11 shows the results, for two of the intermediate stages in the γ = 0 → 0.5 process. We can see that the fit is rather poor, indicating that these PDFs are significantly different from gamma distributions.
This misfit is also not caused by the inclusion of Q; if this or any similar central peak is evolved for either small or zero Q in the Fokker-Planck equation, the result is always similar to here. As explained also in [40,41], the dynamics of how central peaks move away from the origin is simply different from how peaks already away from the origin move, regardless of whether the final states are Gaussians as in [40,41], or gamma distributions as here.

V. CONCLUSION
Gamma distributions are among the most popular choices for modelling a broad range of experimentally determined PDFs. It is often assumed that time-dependent PDFs can then simply be modelled as gamma distributions with time-varying parameters a and b. In this work we have demonstrated that one should be cautious with such an approach. By numerically solving the full time-dependent Fokker-Planck equation, we found that there are three sets of circumstances where the PDFs can differ significantly from gamma distributions: • If D < γ, so that stationary solutions exist, but D is also sufficiently close to γ that a gamma distribution differs significantly from a Gaussian, then the time-dependent PDFs will also differ significantly from gamma distributions. • If D > γ, stationary gamma distributions do not exist at all. Instead, peaks move ever closer to the origin, and in the process increasingly differ from gamma distributions.
• If the initial condition is a peak right on the origin -either as a result of adding additive noise to produce stationary solutions even for D > γ, or simply as an arbitrary initial condition -then any evolution away from the origin will differ significantly from gamma distributions. Unlike the previous two items, which become more pronounced for larger D, this effect is most clearly visible for smaller D, where the mismatch between the naturally narrower peaks and the extreme broadening seen in figure 11 becomes increasingly significant.
In summary, our results show that a simple Langevin equation model mimics the strong fluctuations of far-from-equilibrium systems. This model has gamma distributions as steady-state solutions, but the time-dependent solutions can deviate considerably from this law. This makes tasks such as Bayesian and frequentist inference of the model from data more complicated. On the other hand, the model shows complex asymptotic dynamics with situations when a steady state is reached or not, different from the one of the deterministic logistic model that invariably evolves to the maximum capacity. The studied model is general enough and can be applied to many practical situations in biology, economics, finance, and physics. Future work will apply some of these ideas to fitting actual data.

Appendix A: Derivation of the Fokker-Planck Equations
In order to derive the Fokker-Planck equation (3) from the Langevin equation (1), it is useful to introduce a generating function Z: Then, by definition of 'average', the average of Z is related to the PDF, p(x, t), as Z = dx Z p(x, t) = dx e iλx(t) p(x, t).
Thus, we see that Z is the Fourier transform of p(x, t). The inverse Fourier transform of Z then gives p(x, t): p(x, t) = 1 2π dλ e −iλx Z .
We note that Eq. (A3) can be written as p(x, t) = 1 2π dλ e iλ(x−x(t)) = δ(x − x(t)) , which is another form of p(x, t). To obtain the equation for p(x, t), we first derive the equation for x and then take the inverse Fourier transform as summarised in the following.
By using the Stratonovich calculus [2,3,38], the solution to Eq. (B1) is found as y(t) = y 0 e −(γt+B(t)) + ǫe −(γt+B(t)) t 0 dt 1 e (γt 1 +B(t 1 )) , where y 0 = y(t = 0) and B(t) = t 0 dt 1 ξ(t 1 ) is the Brownian motion. Therefore, is the geometric Brownian motion with a drift (e.g. [2]). The time integral of the latter is used in understanding stochastic processes in financial mathematics and many other areas [49,50]. In particular, in the long time limit, its PDF can be shown to be a gamma distribution. However, this PDF of x is not particularly useful as it involves complicated summations and integrals that cannot be evaluated in closed form [49,50].