Allelic Entropy and Times until Loss or Fixation of Neutral Polymorphisms

In the diffusion approximation of the neutral Wright–Fisher model, the expected time until fixation or loss of a neutral allele is proportional to the initial entropy of the distribution of the allele in the population. No explanation is known for this coincidence. In this paper, we show that the rate of entropy dissipation is proportional to the number of segregating alleles. Since the final fixed state has zero entropy, the expected lifetime of segregating alleles is proportional to the initial entropy in the system. We show that classical formulae on the average time to loss of segregating alleles and the expected time to fixation of the last segregating allele stem from these properties of the diffusion process. We also extend our results to the case of population size changing in time. The dissipation of heterozygosity and entropy shows that superlinear population growth leads to infinite expected fixation times, i.e., neutral alleles in fast-growing populations could segregate forever without ever becoming fixed or disappearing by genetic drift.


Introduction
Fixation times of variants in natural populations have been an important object of study since the birth of modern population genetics (see Ewens [1] and citations therein). They have been widely studied in standard discrete-time neutral models describing the evolution of neutral alleles under genetic drift, such as the Wright-Fisher [2] and Moran models [1], especially in their diffusion approximation [3]. A classic result for the expected time $t_{FL}$ until fixation or loss of a neutral allele, derived from the diffusion approximation of the Wright-Fisher model, is

$$E[t_{FL}] = -2N \left[ p \ln p + (1-p) \ln (1-p) \right] \qquad (1)$$

where $N$ is the population size and $p$ is the initial frequency of the allele. A striking observation is that the quantity on the right-hand side has the mathematical form of an entropy. It could be the Shannon entropy associated with the Bernoulli sampling of an individual with or without that allele from the population, or the Boltzmann entropy of the system in terms of the number of microstates compatible with a macroscopic frequency $p$ for that allele, since the two are closely related [4]. This formal correspondence between Equation (1), i.e., the time until loss or fixation of a neutral allele, and entropy was noticed en passant by Buss and Clote [5], who proved Equation (1) in a weaker form for a more general class of models with finite population size. However, no simple explanation is known for this coincidence.
In this paper, we discuss a clear connection between the allelic entropy of variants in a neutral population and the time until loss or fixation of the alleles. As a consequence of the diffusion dynamics of neutral alleles and the definition of entropy, the average rate of entropy dissipation is proportional to the number of segregating alleles minus one. Since the entropy of the final state with a single fixed allele is zero, the overall change in entropy equals the initial entropy. This is the basis of the relation between the lifetime of segregating alleles and the initial entropy. We use this approach to (re)derive formulae for the average time until loss of a segregating allele in a multiallelic population, as well as the expected time to fixation of the last surviving allele.
In the last sections, we tackle the case of nonconstant population size $N(t)$. We prove that there are two different cases, depending on whether the integral $\int_0^\infty dt/N(t)$ is infinite or finite. In the first case, i.e., for populations with sublinear or linear growth, all polymorphisms become fixed or extinct in the long term, and there is still a correspondence between the initial entropy and the expectation values of nonlinear rescalings of the times until fixation/loss. In the second case, i.e., for exponential growth or other forms of superlinear growth, the expected times to fixation or loss are infinite. This corresponds to alleles segregating forever in the population with a nonzero probability.

Entropy Dissipation and Times to Fixation
Throughout this paper, we consider a population of large size N evolving according to the diffusion approximation of the neutral Wright-Fisher model without mutations.
Assume that the population is polymorphic with $A$ possible types (alleles). Let $k_1, k_2, \ldots, k_A$ be the counts of the different alleles in the population (with $\sum_{i=1}^{A} k_i = N$) and $f_1, f_2, \ldots, f_A$ the corresponding frequencies, i.e., $f_i = k_i/N$. Then, the (Boltzmann) entropy of the polymorphism in the population is defined as $S_{Boltzmann} = \ln \binom{N}{k_1, k_2, \ldots, k_A}$. For large $N$, $S_{Boltzmann} \approx S$, defined as

$$S = -N \sum_{i=1}^{A} f_i \ln f_i$$

This also corresponds to $N$ times the Shannon entropy of the sampling distribution of an allele from the population. This entropy is non-negative, and it is zero if and only if the population is monomorphic (i.e., for some $i$ we have $f_i = 1$ and $f_j = 0$ for all $j \neq i$).
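The large-$N$ agreement between the Boltzmann entropy and $S = -N \sum_i f_i \ln f_i$ can be checked numerically. The following is a minimal sketch; the function names and the allele counts are illustrative assumptions, not taken from the paper.

```python
import math

def boltzmann_entropy(counts):
    """Log of the multinomial coefficient N! / (k_1! ... k_A!)."""
    n = sum(counts)
    return math.lgamma(n + 1) - sum(math.lgamma(k + 1) for k in counts)

def allelic_entropy(counts):
    """S = -N * sum_i f_i ln f_i, with the convention 0 ln 0 = 0."""
    n = sum(counts)
    return -n * sum((k / n) * math.log(k / n) for k in counts if k > 0)

counts = [500, 300, 200]  # hypothetical allele counts, N = 1000
s_boltzmann = boltzmann_entropy(counts)
s_shannon = allelic_entropy(counts)
# Relative gap between the two definitions; it shrinks as N grows.
relative_gap = (s_shannon - s_boltzmann) / s_shannon
print(s_boltzmann, s_shannon, relative_gap)
```

The gap comes from the subleading (logarithmic) terms of Stirling's formula, so it becomes negligible relative to $S$ for large populations.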

Biallelic Entropy and Time until Fixation/Loss
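The Wright-Fisher process on the frequency $f$ of a neutral allele satisfies [1,3]

$$E[\Delta f] = 0, \qquad E[(\Delta f)^2] = \frac{f(1-f)}{N}$$

where $\Delta f$ is the variation in frequency in a single generation. The scale and speed of the corresponding one-dimensional diffusion process are $f$ and $N/[f(1-f)]$, respectively [1]. From the Kolmogorov equations, for two alleles of frequency $f$ and $1-f$, the entropy dissipated in a single generation is

$$E[\Delta S \mid f] = \frac{1}{2} E[(\Delta f)^2] \frac{\partial^2 S}{\partial f^2} = \frac{f(1-f)}{2N} \left( -\frac{N}{f(1-f)} \right)$$

for $0 < f < 1$, hence the final result

$$E[\Delta S \mid f] = -\frac{1}{2} \delta[0 < f < 1]$$

where $\delta[X]$ is the indicator function of $X$, i.e., a function that is 1 if $X$ is true and 0 otherwise.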

Multiallelic Entropy and Time until Loss
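Consider a Wright-Fisher process with $A_0$ initial alleles of initial frequencies $f_{1,0}, f_{2,0}, \ldots, f_{A_0,0}$. Denote the number of segregating alleles in time as

$$A(t) = \sum_{i=1}^{A_0} \delta[f_i(t) > 0]$$

The diffusion approximation for this system implies [1]

$$E[\Delta f_i] = 0, \qquad E[\Delta f_i \Delta f_j] = \frac{f_i (\delta_{ij} - f_j)}{N}$$

therefore, the decrease in entropy in a generation is

$$E[\Delta S] = \frac{1}{2} \sum_{i,j} E[\Delta f_i \Delta f_j] \frac{\partial^2 S}{\partial f_i \partial f_j} = -\frac{1}{2} \sum_{i: f_i > 0} (1 - f_i)$$

hence

$$E[\Delta S] = -\frac{A(t) - 1}{2}$$

We denote by $j_1, \ldots, j_{A_0}$ the allele indices ordered by survival and by $t_{L,j_1} < t_{L,j_2} < \ldots < t_{L,j_{A_0}}$ the corresponding lifetimes (both indices and times are random variables). Note that in the absence of mutations, the $j_{A_0}$th allele is fixed forever in the population and $t_{L,j_{A_0}} = \infty$. By integrating Equation (27), and since the final monomorphic state has zero entropy, we obtain

$$S(0) = \frac{1}{2} E\left[ \int_0^\infty (A(t) - 1) \, dt \right] = \frac{1}{2} \sum_{k=1}^{A_0 - 1} E[t_{L,j_k}]$$

We can rewrite this in terms of the time of loss $t_L$ of a random allele among the $A_0 - 1$ alleles not going to fixation. The expected time until loss is

$$E[t_L] = \frac{2 S(0)}{A_0 - 1} = -\frac{2N}{A_0 - 1} \sum_{i=1}^{A_0} f_{i,0} \ln f_{i,0}$$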
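The statement that the expected entropy decreases by about $(A - 1)/2$ per generation, where $A$ is the number of segregating alleles, can be checked without simulation. The sketch below assumes the entropy $S = -N \sum_i f_i \ln f_i$ and uses the fact that, under multinomial Wright-Fisher sampling, each allele count is marginally binomial; because $S$ is a sum of single-allele terms, its exact one-generation expectation follows from the marginals alone. Function names and parameter values are illustrative.

```python
import math

def xlogx(x):
    """x ln x with the convention 0 ln 0 = 0."""
    return x * math.log(x) if x > 0 else 0.0

def binomial_pmf(n, k, p):
    """Binomial probability computed in log space to avoid overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def expected_entropy_change(freqs, n):
    """Exact E[S(t+1) - S(t)] for one Wright-Fisher generation of size n."""
    change = 0.0
    for f in freqs:
        if f <= 0.0 or f >= 1.0:
            continue  # absent or fixed alleles contribute zero entropy forever
        expected_term = sum(binomial_pmf(n, k, f) * xlogx(k / n)
                            for k in range(n + 1))
        change += -n * (expected_term - xlogx(f))
    return change

# Hypothetical example: three segregating alleles, so the rate should be near -1.
rate_three = expected_entropy_change([0.5, 0.3, 0.2], 1000)
# Two segregating alleles: the rate should be near -1/2.
rate_two = expected_entropy_change([0.4, 0.6], 1000)
print(rate_three, rate_two)
```

The small deviations from $-(A-1)/2$ are finite-$N$ corrections to the diffusion approximation; they grow when some allele is close to loss.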

Time until Fixation
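A more interesting result can be obtained by combining the results above. The sum of the expected times until loss/fixation of each allele is

$$\sum_{i=1}^{A_0} E[t_{FL,i}] = -2N \sum_{i=1}^{A_0} \left[ f_{i,0} \ln f_{i,0} + (1 - f_{i,0}) \ln (1 - f_{i,0}) \right]$$

but the same sum can be decomposed as the sum of the expected time to fixation of the last allele and the sum of the times of loss of the others:

$$\sum_{i=1}^{A_0} E[t_{FL,i}] = E[t_F] + \sum_{k=1}^{A_0 - 1} E[t_{L,j_k}] = E[t_F] + 2 S(0)$$

The expected time to fixation of the last surviving allele is therefore

$$E[t_F] = -2N \sum_{i=1}^{A_0} (1 - f_{i,0}) \ln (1 - f_{i,0})$$

For a slightly different point of view on these results, define the quantities $S_{C,i} = -N (1 - f_i) \ln (1 - f_i)$. Their expected change per generation is $E[\Delta S_{C,i}] = -\frac{1}{2} f_i \, \delta[0 < f_i < 1]$, so that

$$S_{C,i}(0) - E[S_{C,i}(\infty)] = \frac{1}{2} E\left[ \int_0^\infty f_i(t) \, \delta[0 < f_i(t) < 1] \, dt \right]$$

where the last integral can be reinterpreted in terms of the fixation time $t_{F,i}$ (conditional on fixation) and the fixation probability $P_{fix,i} = f_i$ as follows:

$$E\left[ \int_0^\infty f_i(t) \, \delta[0 < f_i(t) < 1] \, dt \right] = P_{fix,i} \, E[t_{F,i}] = f_{i,0} \, E[t_{F,i}]$$

Once we notice that $E[S_{C,i}(\infty)] = 0$, this formula is equivalent to the classical results of Kimura and Ohta [6] on the expected times to fixation for alleles of initial frequency $f_i$. Equation (30) can then be found by weighting $E[t_{F,i}]$ by the neutral probabilities of fixation $P_{fix,i} = f_{i,0}$.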
From yet another point of view, the sum $S_C = \sum_{i=1}^{A} S_{C,i}$ has a constant rate of dissipation per generation, as long as some alleles are still segregating: $E[\Delta S_C] = -\frac{1}{2} \delta[A(t) > 1]$. Hence $E[t_F] = 2 \left( S_C(0) - E[S_C(\infty)] \right)$, which is equivalent to Equation (30), since fixation or loss occurs in finite time.
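The decomposition used in this section can be illustrated numerically. The sketch below assumes the diffusion-approximation closed forms for a population of size $N$: $E[t_{FL,i}] = -2N[f_i \ln f_i + (1-f_i)\ln(1-f_i)]$ for a given allele, $E[t_L] = 2S(0)/(A_0-1)$ for a random allele doomed to loss, and $E[t_F] = -2N\sum_i (1-f_i)\ln(1-f_i)$ for the last surviving allele. The function names are hypothetical, and every frequency passed in is assumed to be strictly positive.

```python
import math

def xlogx(x):
    """x ln x with the convention 0 ln 0 = 0."""
    return x * math.log(x) if x > 0 else 0.0

def initial_entropy(freqs, n):
    """S(0) = -N * sum_i f_i ln f_i."""
    return -n * sum(xlogx(f) for f in freqs)

def time_fixation_or_loss(f, n):
    """Expected time until a given allele of frequency f is fixed or lost."""
    return -2 * n * (xlogx(f) + xlogx(1 - f))

def time_loss_random_allele(freqs, n):
    """Expected lifetime of a random allele among those that are lost."""
    return 2 * initial_entropy(freqs, n) / (len(freqs) - 1)

def time_fixation_last_allele(freqs, n):
    """Expected time until the last surviving allele is fixed."""
    return -2 * n * sum(xlogx(1 - f) for f in freqs)

freqs, n = [0.5, 0.3, 0.2], 1000  # hypothetical initial frequencies and size
total = sum(time_fixation_or_loss(f, n) for f in freqs)
decomposed = (time_fixation_last_allele(freqs, n)
              + (len(freqs) - 1) * time_loss_random_allele(freqs, n))
print(total, decomposed)
```

With two alleles, loss of one allele is fixation of the other, so all three functions return the same value.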

Variable Population Size

Entropy and Rescaled Times to Fixation/Loss
Given the population size $N_0$ at the initial time, we define the relative population size $\lambda(t) = N(t)/N_0$. We also define the rescaled entropy as $s(t) = S(t)/\lambda(t)$ (and $s_C(t) = S_C(t)/\lambda(t)$). These quantities change only when the allele frequencies change, not when the population size alone changes.
By the same reasoning as before, we find that the rescaled entropy satisfies the generalized equation

$$E[\Delta s] = -\frac{A(t) - 1}{2 \lambda(t)}$$

If we change variable from $t$ to $\tau = \int_0^t du/\lambda(u)$, we find that the average decay of rescaled entropy is linear in $\tau$. Note also that $s(0) = S(0)$. Therefore, provided that $\int_0^\infty dt/\lambda(t) = +\infty$ (which ensures that all alleles go to fixation or loss in a finite time, as we discuss in the next section), the formulae of the previous sections retain their validity once each time is replaced by the corresponding rescaled time.
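The rescaled time $\tau(t) = \int_0^t du/\lambda(u)$ is easy to evaluate numerically for a given demographic history. The sketch below uses a midpoint rule and two hypothetical growth laws (exponential and linear, both with rate 0.1, chosen only for illustration) to show the two regimes: $\tau$ saturates at $1/\alpha$ for exponential growth, while it keeps growing logarithmically for linear growth.

```python
import math

def rescaled_time(lam, t, steps=100_000):
    """tau(t) = integral from 0 to t of du / lam(u), by the midpoint rule."""
    du = t / steps
    return sum(du / lam((i + 0.5) * du) for i in range(steps))

alpha, beta = 0.1, 0.1  # hypothetical growth rates

def exponential(u):
    return math.exp(alpha * u)   # lambda(u) = e^{alpha u}

def linear(u):
    return 1.0 + beta * u        # lambda(u) = 1 + beta u

tau_exp = rescaled_time(exponential, 200.0)
tau_lin = rescaled_time(linear, 200.0)
# Closed forms for comparison: (1 - e^{-alpha t}) / alpha and ln(1 + beta t) / beta.
print(tau_exp, (1 - math.exp(-alpha * 200.0)) / alpha)
print(tau_lin, math.log(1 + beta * 200.0) / beta)
```

Extending the upper limit beyond 200 generations leaves `tau_exp` essentially unchanged, whereas `tau_lin` continues to increase without bound.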

Superlinear Growth and Infinite Fixation Times
For the case of superlinear growth, $\int_0^\infty dt/\lambda(t) < +\infty$, and the following theorem applies.
Theorem 1. Consider a population with effective size $N(t) = \lambda(t) N_0$ and a finite number of alleles $A_0 > 1$ at time $t = 0$, evolving according to the diffusion process in Equation (6). Denote the probability that alleles are segregating in the population at time $t$ by $P[A(t) > 1]$.
(i) If $\int_0^\infty dt/\lambda(t) = +\infty$, $P[A(t) > 1]$ converges to 0 for $t \to +\infty$, i.e., there are no segregating alleles at $t = +\infty$.
(ii) If $\int_0^\infty dt/\lambda(t) < +\infty$, there is a nonzero probability that alleles are still segregating in the population at time $t = +\infty$.
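To prove statement (ii), consider the heterozygosity $h = 1 - \sum_i f_i^2$. There are segregating alleles in the population if and only if $h > 0$. Its expected value $E[h]$ satisfies the well-known relation

$$E[\Delta h] = -\frac{h}{N(t)}$$

leading to an exponential decay of the expected heterozygosity

$$E[h(t)] = h_0 \exp\left( -\int_0^t \frac{du}{N(u)} \right) = h_0 \, e^{-\tau(t)/N_0}$$

If $\int_0^\infty dt/\lambda(t)$ is finite, the expected heterozygosity converges to a positive limit, so alleles are still segregating at $t = +\infty$ with nonzero probability. The second part of the theorem leads to this corollary.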

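Corollary 1. If $\int_0^\infty dt/\lambda(t) < +\infty$, the expected times to fixation/loss are infinite: $E[t_L] = E[t_F] = +\infty$.

For growing populations with $\tau_{max} = \int_0^\infty dt/\lambda(t) < +\infty$, the expected values of $\tau_L$ and $\tau_F$ are always finite and satisfy the equations

$$E[\tau_L] = \frac{2}{A_0 - 1} \left( S(0) - E[s(\infty)] \right), \qquad E[\tau_F] = 2 \left( S_C(0) - E[s_C(\infty)] \right)$$

Since $\tau_L, \tau_F \le \tau_{max}$, we can obtain lower bounds on the final expected values of the rescaled entropy-like measures:

$$E[s(\infty)] \ge S(0) - \frac{A_0 - 1}{2} \tau_{max}, \qquad E[s_C(\infty)] \ge S_C(0) - \frac{\tau_{max}}{2}$$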

Constant Population Size
The neutral evolution of a biallelic variant implies a constant dissipation of allelic entropy in time:

$$E[\Delta S \mid f] = -\frac{1}{2} \delta[0 < f < 1]$$

The rate of dissipation is independent of $S$ and $f$, provided that the alleles are still segregating. Note that the only functions of $f$ with this property are linear combinations of the entropy, the frequency $f$ (which is a martingale under neutral evolution) and a constant.
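The uniqueness statement can be verified with a short calculation. The following sketch assumes the diffusion approximation with $E[\Delta f] = 0$ and $E[(\Delta f)^2] = f(1-f)/N$, and looks for all functions $g(f)$ whose expected change per generation equals a constant $-c$; the symbols $g$, $a$, $b$ and $c$ are introduced here only for illustration.

```latex
\begin{align*}
E[\Delta g \mid f] &= \frac{f(1-f)}{2N}\, g''(f) = -c
  \quad\Longrightarrow\quad
  g''(f) = -2Nc \left( \frac{1}{f} + \frac{1}{1-f} \right), \\
g'(f) &= -2Nc \left[ \ln f - \ln(1-f) \right] + a, \\
g(f) &= -2Nc \left[ f \ln f + (1-f) \ln(1-f) \right] + a f + b
      = 2c\, S(f) + a f + b .
\end{align*}
```

For $c = 1/2$ this gives $g = S + a f + b$: the entropy is determined up to the martingale $f$ and an additive constant.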
We present an alternative derivation of the classical Formula (1) for the expected time to fixation based on the rate of entropy dissipation. Denote by $t_{FL}$ the time to fixation or loss of one of the alleles from initial frequencies $f_0$ and $1 - f_0$. The time $t_{FL}$ is a random variable with probability density denoted by $p[t_{FL} = t]$. Note that

$$\int_t^\infty p[t_{FL} = t'] \, dt' = P[t_{FL} > t] = E\left[ \delta[0 < f(t) < 1] \right]$$

since all terms are equivalent to the probability that the alleles are still segregating at time $t$.
The expected time to fixation or loss is then

$$E[t_{FL}] = \int_0^\infty t \, p[t_{FL} = t] \, dt = \int_0^\infty P[t_{FL} > t] \, dt = -2 \int_0^\infty \frac{dE[S]}{dt} \, dt = 2 \left( S(f_0) - E[S(f(\infty))] \right)$$

where we used Equation (24) and the average of Equation (23) over the realizations of the process. Since this quantity is finite, the probability that the alleles are still segregating at $t = \infty$ is zero, hence $f(\infty) \in \{0, 1\}$ and the entropy $S(f(\infty)) = 0$. This way, we obtain the final result

$$E[t_{FL}] = 2 S(f_0) = -2N \left[ f_0 \ln f_0 + (1 - f_0) \ln (1 - f_0) \right]$$

This derivation shows that the relation between fixation time and entropy is a consequence of the linear decay of the average entropy of the system. For multiple alleles, the decay in entropy does not depend on the allele frequencies, but only on the number $A(t)$ of alleles segregating at time $t$:

$$E[\Delta S] = -\frac{A(t) - 1}{2}$$

Interestingly, this implies that the initial entropy of the system is related to the average lifetime of a random allele doomed to extinction:

$$E[t_L] = \frac{2 S(0)}{A_0 - 1} = -\frac{2N}{A_0 - 1} \sum_{i=1}^{A_0} f_{i,0} \ln f_{i,0}$$

For $A_0 = 2$, extinction of one of the two alleles means fixation of the other, hence this result reduces to Equation (26). Note that the time until loss or fixation of a given allele (e.g., the $i$th allele) is instead

$$E[t_{FL,i}] = -2N \left[ f_{i,0} \ln f_{i,0} + (1 - f_{i,0}) \ln (1 - f_{i,0}) \right]$$

as derived by pooling all the other alleles together (which does not change their evolution in frequency, since they are interchangeable under neutrality) and using Equation (26).
The expected time to fixation of the last surviving allele is

$$E[t_F] = -2N \sum_{i=1}^{A_0} (1 - f_{i,0}) \ln (1 - f_{i,0})$$

Note that, despite its appearance, the right-hand side of this equation has no direct interpretation as an entropy. It is rather related to a difference in entropies conditioned on different macroscopic variables.
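The biallelic relation between the expected time to fixation or loss and the initial entropy can be compared with the exact discrete Wright-Fisher chain for a small population. The sketch below assumes the diffusion formula $E[t_{FL}] = -2N[p \ln p + (1-p)\ln(1-p)]$ and computes the exact expected absorption times by iterating $t \leftarrow 1 + Qt$, where $Q$ is the binomial transition matrix restricted to the segregating states. The population size $N = 30$ and the iteration count are arbitrary illustrative choices.

```python
import math

def binomial_pmf(n, k, p):
    """Binomial probability computed in log space."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def exact_absorption_times(n, iterations=1500):
    """Expected generations until fixation or loss from each count 1..n-1."""
    interior = range(1, n)
    # Transition probabilities among segregating states only.
    q = [[binomial_pmf(n, j, i / n) for j in interior] for i in interior]
    times = [0.0] * (n - 1)
    for _ in range(iterations):
        # Fixed-point iteration of t = 1 + Q t; it converges geometrically
        # because every row of Q sums to less than one.
        times = [1.0 + sum(q_ij * t_j for q_ij, t_j in zip(row, times))
                 for row in q]
    return times

def diffusion_time(n, p):
    """Diffusion approximation, equal to twice the initial entropy."""
    return -2 * n * (p * math.log(p) + (1 - p) * math.log(1 - p))

n = 30
exact = exact_absorption_times(n)[n // 2 - 1]  # start from count n/2, p = 0.5
approx = diffusion_time(n, 0.5)
print(exact, approx)
```

The two values are expected to agree to within a few percent at this population size, with the residual difference coming from the states close to the boundaries where the diffusion approximation is least accurate.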

Variable Population Size
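The previous results can be generalized to models with variable population size $N(t)$ by a well-known time rescaling [7]. In fact, we find that the average decay of rescaled entropy is linear in the rescaled time $\tau = \int_0^t du/\lambda(u)$, where $\lambda(t) = N(t)/N_0$ is the population size relative to its initial value. We denote by $\tau_{max} = \int_0^\infty du/\lambda(u)$ the limiting value of the rescaled time.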
Therefore, for most demographic dynamics, the formulae of the previous sections retain their validity once each time is replaced by the corresponding rescaled time. For example, Equation (28) can be generalized for $E[\tau_L]$:

$$E[\tau_L] = -\frac{2 N_0}{A_0 - 1} \sum_{i=1}^{A_0} f_{i,0} \ln f_{i,0}$$

and Equation (30) can be generalized for $E[\tau_F]$:

$$E[\tau_F] = -2 N_0 \sum_{i=1}^{A_0} (1 - f_{i,0}) \ln (1 - f_{i,0})$$

However, this is not the case for populations that experience superlinear growth (for example, an exponential growth $N(t)/N_0 = e^{\alpha t}$ with $\alpha > 0$). In this case, genetic drift is effectively frozen after a finite time. This implies that there could be segregating alleles at arbitrarily long times with finite probability. This is actually what happens:

Theorem 1. Consider a population with time-varying effective size, initially harboring a finite number of neutral alleles.
(i) If $\tau_{max} = +\infty$, which includes populations growing linearly or sublinearly, then all alleles will either be fixed or disappear in finite time.
(ii) If $\tau_{max} < +\infty$, which includes populations growing exponentially or faster, there is a nonzero probability that alleles will segregate forever in the population.

The second part of the theorem leads to this corollary:

Corollary 1. The expected times to fixation/loss for neutral alleles in exponentially growing populations (and other populations with $\tau_{max} < +\infty$) are infinite.
This is true also for the relevant case of exponential growth. Denote the probability that alleles are segregating in the population at time $t$ by $P[A(t) > 1]$. For a population growing exponentially at rate $\alpha > 0$, the expected times to fixation/loss are infinite, and the probability of observing segregating alleles at time $t = +\infty$ is at least

$$P[A(\infty) > 1] \ge \frac{A_0}{A_0 - 1} \, h_0 \, e^{-1/(\alpha N_0)}$$

in terms of the initial number of alleles $A_0$ and heterozygosity $h_0$. However, even for rapidly growing populations, the expected values of $\tau_L$ and $\tau_F$ are always finite and bounded by

$$E[\tau_L] \le \tau_{max}, \qquad E[\tau_F] \le \tau_{max}$$

This suggests that populations experiencing more extreme growth (i.e., smaller $\tau_{max}$) are guaranteed to maintain a higher fraction of the initial genetic diversity, as confirmed by the decay of heterozygosity $E[h_\infty] = h_0 e^{-\tau_{max}/N_0}$. In the simplest case of exponential growth with rate $\alpha$, we have $\tau_{max} = 1/\alpha$ and the final (rescaled) entropy satisfies the bounds

$$S(0) - \frac{A_0 - 1}{2 \alpha} \le E[s(\infty)] \le S(0)$$
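The persistence of diversity under exponential growth can be illustrated with the expected heterozygosity alone. The sketch below assumes the standard Wright-Fisher recursion $E[h(t+1)] = (1 - 1/N(t)) \, E[h(t)]$ with $N(t) = N_0 e^{\alpha t}$, and compares its limit with the diffusion value $h_0 e^{-1/(\alpha N_0)}$. It also evaluates one elementary lower bound on the probability of segregation, based on $h \le 1 - 1/A_0$ whenever alleles segregate; this bound is an assumption of the sketch and not necessarily the expression used by the authors. All parameter values are hypothetical.

```python
import math

def expected_heterozygosity(h0, n0, alpha, generations):
    """Iterate E[h(t+1)] = (1 - 1/N(t)) E[h(t)] with N(t) = n0 * exp(alpha*t)."""
    h = h0
    for t in range(generations):
        h *= 1.0 - 1.0 / (n0 * math.exp(alpha * t))
    return h

h0, n0, alpha, a0 = 0.5, 100, 0.1, 2  # hypothetical initial state and growth
h_limit = expected_heterozygosity(h0, n0, alpha, 2000)
# Diffusion prediction with tau_max = 1/alpha for exponential growth.
h_diffusion = h0 * math.exp(-1.0 / (alpha * n0))
# Since h <= 1 - 1/a0 while alleles segregate and h = 0 otherwise,
# E[h] <= (1 - 1/a0) * P[A > 1], which gives a lower bound on P[A > 1].
p_segregating_lower_bound = h_limit / (1.0 - 1.0 / a0)
print(h_limit, h_diffusion, p_segregating_lower_bound)
```

The discrete and diffusion values differ slightly because the sum over generations of $1/N(t)$ is not exactly the integral $1/(\alpha N_0)$; the difference shrinks for smaller growth rates per generation.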

Discussion
Our results show that entropy plays an unexpected role in the Kolmogorov equations describing the diffusion approximation to the Wright-Fisher model.The identification of the times until fixation and loss with the initial entropy of the system and related measures stems from this special role.
While in this work we connected the fixation times with the properties of allelic entropy and its dissipation under neutrality, it is not clear if these special properties arise by chance or if they are a consequence of more general relations between population genetics, partial differential equations, statistical mechanics and information theory.Entropy dissipation methods for the study of partial differential equations and nonlinear systems [8] do not consider the entropy of a random variable but rather the entropy of its distribution.For similar reasons, even mappings between population genetics and statistical mechanics (see e.g., Sella and Hirsh [9], Mustonen and Lässig [10]) do not seem to offer an immediate explanation of the linear decay of the entropy function.
For populations of variable size, entropy decay is time-dependent and generally nonlinear in time. In populations with linear or sublinear growth, expected fixation times are finite and related to entropy through a nonlinear time transformation. On the other hand, superlinear growth (i.e., $\int_0^\infty dt/N(t) < +\infty$) is associated with infinite expected fixation times when the dynamics is neutral. These results open a series of questions on the interpretation of fixation and extinction probabilities and times in exponentially growing populations and on the meaning of neutrality and positive selection. The theory of fixation probabilities in growing populations [11] states that exponential expansion greatly increases the fixation probabilities of beneficial mutations. At the same time, our results prove that it strongly increases the lifetime of neutral polymorphisms, reducing fixation probabilities. "Neutrality" here must be interpreted in a very strict sense, since in the long term the large population size would enhance any slightly beneficial or deleterious selective effect; in fact, the effective time-dependent population-scaled selection coefficient is $N_e(t)$ times the individual-level fitness effect. Rapid unlimited growth is also unrealistic as a demographic dynamics on long time scales; most natural populations experience exponential expansion only as a transient phase, e.g., colonization of a new niche or infection of a new host. The biological relevance of these asymptotic results should be revisited with an eye to the typical times of expansion and fixation in each case study.