d'Alembert's direct and inertial forces acting on populations: the Price equation and the fundamental theorem of natural selection

I develop a framework for interpreting the forces that act on any population described by frequencies. The conservation of total frequency, or total probability, shapes the characteristics of force. I begin with Fisher's fundamental theorem of natural selection. That theorem partitions the total evolutionary change of a population into two components. The first component is the partial change caused by the direct force of natural selection, holding constant all aspects of the environment. The second component is the partial change caused by the changing environment. I demonstrate that Fisher's partition of total change into the direct force of selection and the forces from the changing environmental frame of reference is identical to d'Alembert's principle of mechanics, which separates the work done by the direct forces from the work done by the inertial forces associated with the changing frame of reference. In d'Alembert's principle, there exist inertial forces from a change in the frame of reference that exactly balance the direct forces. I show that the conservation of total probability strongly shapes the form of the balance between the direct and inertial forces. I then use the strong results for conserved probability to obtain general results for the change in any system quantity, such as biological fitness or energy. Those general results derive from simple coordinate changes between frequencies and system quantities. Ultimately, d'Alembert's separation of direct and inertial forces provides deep conceptual insight into the interpretation of forces and the unification of disparate fields of study.


INTRODUCTION
The fundamental theorem of natural selection divides total evolutionary change into two components [1]. The first component is the partial change caused by the direct force of natural selection. The second component is the partial change caused by all other forces.
The theorem states that the change in fitness caused by the direct force of natural selection equals the genetic variance in fitness. We can interpret "genetic variance" to mean the component of variance associated with things that are transmitted through time. Natural selection is the force that changes the frequencies of those transmissible things.
Fisher wrote clearly about the distinction between the direct force of natural selection and the other evolutionary forces [1,2]. Yet much confusion followed in the history of the subject. Essentially all commentators considered only the total evolutionary change, rather than Fisher's split into two partial components.
A correct interpretation of Fisher's partial components eventually developed, starting with Price [3] and Ewens [4]. However, both of those authors concluded that Fisher's split of total change into components provided little value.
In this article, I show that Fisher's split of evolutionary change is equivalent to d'Alembert's split of the general causes of dynamics into direct and inertial forces. d'Alembert's principle is the foundation for essentially all of the key results of theoretical physics, starting with Newton's laws and leading to the subsequent generalizations via Lagrangian and Hamiltonian mechanics. Lanczos [5], in his great synthesis of the variational principles of mechanics, elevates d'Alembert's principle to the key insight that ties together the whole subject. To Lanczos, the tremendous value of d'Alembert's principle follows from the fact that it "focuses attention on the forces, not on the moving body ..." In the same way, Fisher's goal was to isolate and interpret the force of natural selection, rather than to emphasize the dynamics of total change.
a) doi: 10.3390/e17107087 in Entropy; b) web: http://stevefrank.org
The study and interpretation of force requires separating the action of a force from the frame of reference. A force affects change, and the measurement and interpretation of that change depends on the changing frame of reference of the system. To understand the force as distinct from the frame of reference, force and frame of reference must be separated.
That separation between force and frame of reference is exactly what Fisher did and was exactly how he discussed his analysis. I argue here that connecting Fisher's theorem to d'Alembert's principle will help to clarify the separation of direct force and frame of reference.
In Fisher's analysis, he was vague about the mathematical form of the changes associated with the frame of reference. Here, by using the Price equation, I make explicit the connections between Fisher's theorem and d'Alembert's principle.
My argument follows three steps. First, I derive the general form of the Price equation. Second, I connect the Price equation to d'Alembert's principle. Third, I discuss the fundamental theorem of natural selection in the context of d'Alembert's separation of the direct forces and the inertial forces associated with the changing frame of reference. By d'Alembert's separation, we obtain a partition of total evolutionary change in fitness into the change by the direct force of natural selection and the change by the inertial forces of the changing environmental frame of reference.
The analysis is much more general and powerful than a theorem limited to natural selection. Instead, we find a broad analysis of the dynamics of any population or aggregation that can be characterized by frequencies. The conservation of total frequency, or total probability, establishes a symmetry that defines many of the characteristics of aggregate dynamics. Those characteristics of aggregate dynamics apply to natural selection, to many problems in mechanics, and to any analysis of the changes in probability distributions.

THE PRICE EQUATION
The Price equation [6,7] describes the change in an average value obtained over some aggregation or population. Each component of the population has a weighting, $q$, and a value, $z$. Begin with a discrete analog of the product rule for differentiation
$$\Delta(qz) = (q + \Delta q)(z + \Delta z) - qz = (\Delta q)z + (q + \Delta q)\Delta z = (\Delta q)z + q'\Delta z,$$
in which $q' = q + \Delta q$ and $z' = z + \Delta z$. The same rule can be applied to vectors. By using dot product notation, we obtain an abstract form of the Price equation [7-9]
$$\Delta(q \cdot z) = \Delta q \cdot z + q' \cdot \Delta z, \tag{1}$$
in which a dot product is understood in the usual way as $q \cdot z = \sum q_i z_i$. This equation can be interpreted in various ways. For our purposes, we can take $q_i$ to be the frequency associated with a subset, $i$, of the initial population, such that the total frequency is $\sum q_i = 1$. Thus, $\bar{z} = \sum q_i z_i$ is the average of $z$, in which $z_i$ is a function that maps $i$ to some value. Similarly, we have a second population, with frequencies $q'_i$ and values $z'_i$, in which $\sum q'_i = 1$. One can use various rules for the relations between $q_i$ and $q'_i$ and between $z_i$ and $z'_i$, allowing a wide variety of different perspectives on the transformations that relate the two populations [7]. For our purposes, we can operate abstractly and not worry about the particular rules. Our only restriction is that we can map the index $i$ between the two populations.
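The exact form of eqn 1 can be checked numerically. The following is a minimal sketch rather than part of the original analysis: the frequencies and values are hypothetical numbers, and the names `frequency_term` and `value_term` are labels introduced here only for the two terms on the right side of eqn 1.

```python
# Minimal numerical check of the exact (discrete) Price equation.
# All numbers are hypothetical and chosen only for illustration.

def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

q = [0.5, 0.3, 0.2]            # frequencies in the first population (sum to 1)
q_prime = [0.4, 0.35, 0.25]    # frequencies in the second population (sum to 1)
z = [1.0, 2.0, 4.0]            # values in the first population
z_prime = [1.5, 1.8, 4.4]      # values in the second population

dq = [b - a for a, b in zip(q, q_prime)]   # change in frequencies
dz = [b - a for a, b in zip(z, z_prime)]   # change in values

total_change = dot(q_prime, z_prime) - dot(q, z)   # change in the average of z
frequency_term = dot(dq, z)                        # (delta q) . z
value_term = dot(q_prime, dz)                      # q' . (delta z)

print(total_change, frequency_term + value_term)
```

The two printed numbers should agree up to floating point rounding, whatever values are chosen, because eqn 1 is an identity that requires only a shared index between the two populations.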

FITNESS AS A CHANGE IN FREQUENCY
The function $z_i$ can map subset $i$ to any value. When studying frequency changes, let us rename the variable as $m \equiv z$, and choose
$$m_i = \log\frac{q'_i}{q_i}$$
to describe the ratio of frequencies between the two populations associated with $i$. We can think of $m_i$ as a growth rate, or as a kind of force that moves the system from $q_i$ to $q'_i$. In particular, the above expression is equivalent to exponential growth driven by $m_i$ as
$$q'_i = q_i e^{m_i}.$$
We may call $m_i$ fitness, because it expresses the relative growth of the weighting associated with $i$. The term $m_i$ is, in effect, a growth rate relative to an unspecified underlying scale of change. We can take $m_i$ as a given force of growth and derive $q'_i$, or we can take the outcome $q'_i$ as given, and derive the effective force, $m_i$, that is consistent with the outcome.
If we thought of $i$ as a particular individual or a particular type, then $m_i$ would express the growth rate associated with that individual or type between the two populations. However, the equations allow us simply to make the definition that relates $q_i$ to $q'_i$, and not restrict ourselves to a particular interpretation of what $i$ means in those terms.
I confine my analysis to small differences, $\Delta q_i \to dq_i \equiv \dot{q}_i$, in which $\dot{q}_i = q'_i - q_i$ is small. For small differences we have (see Methods for assumptions)
$$m_i = \log\frac{q'_i}{q_i} \to \frac{\dot{q}_i}{q_i}.$$
Using this definition and the substitution $m_i \equiv z_i$ in the Price equation eqn 1 from the prior section, we obtain a general expression for the total change in fitness as
$$\dot{\bar{m}} = \dot{q} \cdot m + q \cdot \dot{m},$$
in which we ignore the second order term $\dot{q} \cdot \dot{m}$ in this description of small changes, with $\Delta z \to dz \equiv \dot{m}$.
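The relation between the logarithmic definition of fitness and its small-change form can be illustrated with a short sketch. The frequencies below are hypothetical, and the names `m_small` and `max_gap` are introduced here only for the comparison.

```python
import math

# Hypothetical frequencies for two nearby populations (each sums to 1).
q = [0.5, 0.3, 0.2]
q_prime = [0.501, 0.2994, 0.1996]

# Fitness as the log ratio of frequencies.
m = [math.log(b / a) for a, b in zip(q, q_prime)]

# Exponential growth driven by m_i recovers the second population.
recovered = [a * math.exp(mi) for a, mi in zip(q, m)]

# Small-change form: (change in q_i) / q_i.
m_small = [(b - a) / a for a, b in zip(q, q_prime)]

# The two forms differ only at second order in the size of the change.
max_gap = max(abs(x - y) for x, y in zip(m, m_small))
print(m, m_small, max_gap)
```

With relative changes of about 0.2 percent, as here, the gap between the two forms is of the order of the square of the relative change, which is why the text can treat them as equivalent for small differences.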
With the definition of fitness as a growth rate, $m_i = \dot{q}_i/q_i$, average fitness is
$$\bar{m} = q \cdot m = \sum q_i \frac{\dot{q}_i}{q_i} = \sum \dot{q}_i = 0.$$
This equation expresses the conservation of total probability or total frequency. It follows that the change in average fitness, $\dot{\bar{m}}$, must also be zero
$$\dot{\bar{m}} = \dot{q} \cdot m + q \cdot \dot{m} = 0. \tag{2}$$
The term $\dot{q} \cdot m$ has a wide variety of interpretations related to information theory and classical mechanics. For example, this term expresses entropy momentum or Fisher information [10,11], as
$$\dot{q} \cdot m = \sum \dot{q}_i \, \dot{\log}\, q_i = \sum \frac{\dot{q}_i^2}{q_i}.$$
The term $m_i = \dot{\log}\, q_i = \log(q'_i/q_i)$ is the change in entropy in each dimension, $i$, describing an entropy velocity or nondimensional entropy momentum relative to an unspecified underlying scale of change. Thus, $\dot{q} \cdot m$ may be interpreted as the gain in entropy momentum, which must be balanced by the loss of entropy momentum in the second term, $q \cdot \dot{m}$, to achieve overall conservation, $\dot{\bar{m}} = 0$. Note that I have used $-\log q_i$ as the entropy in each dimension, consistent with the information theory concept of self-information or surprise as $-\log q_i$. That definition leads to system entropy as the expectation over the different dimensions, $-\sum q_i \log q_i$. Some people prefer to define the entropy in each dimension as $-q_i \log q_i$, and system entropy as the sum over each dimension, in which case my usage of entropy or information momentum does not make sense.
The term $\dot{q}_i^2/q_i$ is widely used as the Fisher information metric, particularly in the study of information geometry [11]. Thus, the first term in $\dot{\bar{m}} = 0$ is the gain in Fisher information, and the second term is an exact balancing loss in Fisher information. The balance leads to an overall conservation of Fisher information, as emphasized by Frieden [10].
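The balance between the gain and loss terms can be illustrated with a small sketch. It assumes, purely for illustration and not as part of the original analysis, that frequencies follow $q_i(t) \propto w_i e^{a_i t}$ with constant hypothetical rates $a_i$, so that $m_i = a_i - \bar{a}$ and $\dot{m}_i = -d\bar{a}/dt$. The names `gain`, `loss`, and `fisher` are labels introduced here.

```python
# Sketch: conservation of total probability under hypothetical exponential growth.
# Assumption (not from the text): q_i(t) is proportional to w_i * exp(a_i * t),
# so that q_dot_i = q_i * (a_i - a_bar) with constant rates a_i.

def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

q = [0.5, 0.3, 0.2]     # current frequencies (sum to 1)
a = [0.1, 0.4, -0.2]    # hypothetical constant growth rates

a_bar = dot(q, a)                            # frequency-weighted mean rate
m = [ai - a_bar for ai in a]                 # fitness m_i = q_dot_i / q_i
q_dot = [qi * mi for qi, mi in zip(q, m)]    # small change in each frequency
a_bar_dot = dot(q_dot, a)                    # rate of change of the mean rate
m_dot = [-a_bar_dot for _ in q]              # m_dot_i = -d(a_bar)/dt in every dimension

mean_fitness = dot(q, m)     # average fitness, expected to vanish
gain = dot(q_dot, m)         # first term of the change in average fitness
loss = dot(q, m_dot)         # second term, expected to cancel the first
fisher = sum(qd * qd / qi for qd, qi in zip(q_dot, q))   # sum of q_dot_i^2 / q_i

print(mean_fitness, gain, loss, fisher)
```

Under these assumptions the gain term equals the sum of $\dot{q}_i^2/q_i$, and the loss term is its exact negative, so the two terms sum to zero.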
We have transcended our original formulation of biological fitness in these descriptions of probability, information, and entropy. The expressions here apply to any problem that can be expressed in terms of changing frequencies in populations or aggregates, subject to the conservation of total frequency.

D'ALEMBERT'S PRINCIPLE
We may write d'Alembert's principle [5] as
$$(F + I)\dot{q} = 0.$$
Here, all terms are vectors, and the implicit dot product with $\dot{q}$ distributes over the parentheses. The vector $q$ locates the system, and $\dot{q}$ is a virtual displacement of the system from its current location to a nearby location. A virtual displacement is like an imaginary displacement, in which the system is held fixed in its current state, and then one moves its location without changing anything else. All forces and the frame of reference for measurement are held constant [5].
A virtual displacement must be consistent with all forces of constraint. In our case, the primary force of constraint on a virtual displacement, $\dot{q}$, is that the sum of the frequencies is one. Thus, $\sum \dot{q}_i = 0$ expresses the force of constraint set by the conservation of total frequency or probability. Because a virtual displacement must be consistent with the forces of constraint, we need only analyze those forces that are in addition to the forces of constraint. In particular, we need to track the direct forces, $F$, and inertial forces, $I$.
The term $F$ is the vector of direct forces acting on the system, and the term $I$ is the vector of inertial forces that balance the direct forces to achieve no net change. d'Alembert's principle can be thought of as a generalization of Newton's second law of motion [5], in which $\tilde{F} = \mu\tilde{A}$ is read as the total force, $\tilde{F}$, equals mass, $\mu$, times total acceleration, $\tilde{A}$. Total force and total acceleration must include forces of constraint. If we write total inertial force as $\tilde{I} = -\mu\tilde{A}$, then Newton's law is $\tilde{F} + \tilde{I} = 0$.
When we study an actual system, we are usually interested in how the direct, or applied, forces influence dynamics. To do that, we need to separate the direct forces from the constraining forces. For example, in studying the frequency dynamics and evolutionary change caused by natural selection, we usually wish to analyze the direct force of growth rate, or fitness, separately from the force of constraint imposed by the conservation of total probability.
In d'Alembert's formulation, the direct and inertial forces typically do not sum to zero, $F + I \neq 0$, because those terms do not include the constraining forces. Instead, in d'Alembert's expression $(F + I)\dot{q} = 0$, the term $\dot{q} \cdot F$ combines the direct and constraining forces, and the term $\dot{q} \cdot I$ combines all inertial forces, including any forces of constraint. Newton's law is a special case of the more general principle of d'Alembert [5].

INTERPRETATION OF D'ALEMBERT'S PRINCIPLE
Here is a simple intuitive description of d'Alembert's principle [12]. You are sitting in a car at rest, and the car suddenly accelerates. You feel thrown back into the seat. But, even as the car gains speed, you effectively do not move in relation to the frame of reference of the car: your velocity relative to the car remains zero. That net zero velocity can be thought of as the balance between the direct force of the seat pushing on you and the inertial force sending you back as the car accelerates forward.
As long as your frame of reference moves with you, then your net motion in your frame of reference is zero. Put another way, there is always a changing frame of reference that zeroes net change by balancing the work of direct forces on a system against the work of a balancing inertial force. Although the system is a dynamic expression of changing components, it also has an overall static, equilibrium quality that aids analysis. As Lanczos [5] emphasizes, d'Alembert's principle "focuses attention on the forces, not on the moving body ..."

D'ALEMBERT AND THE CONSERVATION OF TOTAL PROBABILITY
This section transforms the conservation of total probability expressed by eqn 2 into a form of d'Alembert's principle. We first note that (see Methods for $\dot{\log}\, m$ notation)
$$q \cdot \dot{m} = q \cdot \left(m \circ \frac{\dot{m}}{m}\right) = (q \circ m) \cdot \frac{\dot{m}}{m} = \dot{q} \cdot \dot{\log}\, m.$$
The symbol "$\circ$" denotes element-wise multiplication of vectors, the ratio denotes element-wise division, and dot products distribute over parentheses. With this expression, we can rewrite our general result in eqn 2 for the conservation of total probability, or the change in fitness, in the general form of d'Alembert, $(F + I)\dot{q} = 0$, as
$$\left(m + \dot{\log}\, m\right)\dot{q} = 0. \tag{3}$$
We equate this expression with d'Alembert by interpreting m ≡ F as the force of growth, or fitness, or, more generally, the direct forces acting on frequency change.
We interpret $\dot{\log}\, m \equiv I$ as the inertial forces, which typically are described in terms of acceleration with respect to the frame of reference.

DIRECT AND INERTIAL FORCES
The expression in eqn 3 describes d'Alembert's principle for systems that follow conservation of total probability. This section considers how we should interpret $(F + I)\dot{q} = 0$ for the direct and inertial forces in terms of Newtonian concepts of force and acceleration.
The dot product expression in eqn 3 can be written as a sum over the individual dimensions of the system
$$\dot{q} \cdot m + \dot{q} \cdot \dot{\log}\, m = \sum \dot{q}_i m_i + \sum \dot{q}_i \, \dot{\log}\, m_i.$$
The first term on each side, $\dot{q} \cdot m \equiv \dot{q} \cdot F$, is the virtual displacement times the direct force. We may call this term the virtual work of the direct forces, because physical work is displacement times force. We can write this component of virtual work solely in terms of frequencies from our prior definition of $m_i = \dot{q}_i/q_i$. The second term on each side, $\dot{q} \cdot \dot{\log}\, m \equiv \dot{q} \cdot I$, is the virtual work of the inertial forces. To interpret the inertial forces with respect to acceleration, it is useful to express $\dot{\log}\, m$ as
$$\dot{\log}\, m_i = \frac{\dot{m}_i}{m_i} = \frac{\ddot{q}_i}{\dot{q}_i} - \frac{\dot{q}_i}{q_i}. \tag{4}$$
The term $\ddot{q}_i$ is the second order infinitesimal change, or acceleration. Thus, $I \equiv \dot{\log}\, m$ expresses how the changing frame of reference, arising from changed frequencies, leads to inertial forces that are accelerations. We can now write d'Alembert's principle under the conservation of total probability solely in terms of the probabilities, or frequencies, as
$$\sum \left(\frac{\dot{q}_i}{q_i} + \frac{\ddot{q}_i}{\dot{q}_i} - \frac{\dot{q}_i}{q_i}\right)\dot{q}_i = 0. \tag{5}$$
Distributing the virtual displacement, $\dot{q}_i$, across the parentheses in the sum and splitting the sum into direct and inertial components yields
$$\sum \frac{\dot{q}_i^2}{q_i} + \sum \left(\ddot{q}_i - \frac{\dot{q}_i^2}{q_i}\right) = 0. \tag{6}$$
The sum of $\ddot{q}_i$ is zero because $\sum \dot{q}_i = 0$ by conservation of total probability, and thus the accelerations, $\ddot{q}_i$, also sum to zero. However, in a particular dimension, there may be an imbalance between direct and inertial force, $\ddot{q}_i$. That imbalance arises because the force of constraint on total probability differs across dimensions.
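The per-dimension imbalance and the total balance can be illustrated numerically. This sketch again assumes hypothetical exponential growth, $q_i(t) \propto w_i e^{a_i t}$ with constant rates $a_i$, which is not part of the original analysis. It uses the inertial force in the form $\ddot{q}_i/\dot{q}_i - \dot{q}_i/q_i$, which follows from $m_i = \dot{q}_i/q_i$ and $\dot{\log}\, m_i = \dot{m}_i/m_i$. The names `direct_work`, `inertial_work`, and `imbalance` are labels introduced here.

```python
# Sketch: direct and inertial virtual work in each dimension.
# Assumption (not from the text): q_dot_i = q_i * (a_i - a_bar), constant a_i.

def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

q = [0.5, 0.3, 0.2]     # current frequencies (sum to 1)
a = [0.1, 0.4, -0.2]    # hypothetical constant growth rates

a_bar = dot(q, a)
m = [ai - a_bar for ai in a]                   # direct force, m_i = q_dot_i / q_i
q_dot = [qi * mi for qi, mi in zip(q, m)]      # virtual displacement
m_dot = [-dot(q_dot, a) for _ in q]            # change in m_i
# Acceleration from differentiating q_dot_i = q_i * m_i.
q_ddot = [qd * mi + qi * md for qd, mi, qi, md in zip(q_dot, m, q, m_dot)]

# Inertial force written only in terms of frequencies.
inertial_force = [qdd / qd - qd / qi for qdd, qd, qi in zip(q_ddot, q_dot, q)]

direct_work = [qd * mi for qd, mi in zip(q_dot, m)]
inertial_work = [qd * fi for qd, fi in zip(q_dot, inertial_force)]
imbalance = [d + i for d, i in zip(direct_work, inertial_work)]

print(imbalance, q_ddot, sum(direct_work), sum(inertial_work))
```

In this sketch the imbalance in each dimension equals the acceleration $\ddot{q}_i$, which is generally nonzero, whereas the totals of direct and inertial work cancel because the accelerations sum to zero.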

UNITARY COORDINATES AND PATH LENGTHS
From eqns 5 and 6, we may express d'Alembert's balance between the total direct and inertial components as
$$\sum \frac{\dot{q}_i^2}{q_i} - \sum \frac{\dot{q}_i^2}{q_i} = 0. \tag{7}$$
The $\dot{q}_i^2/q_i$ terms can be understood as distances by considering the curvature caused by the constraining force of the conservation of total probability. To get a proper sense of distance in that curved geometric space, we need to change the coordinates.
Let the new coordinates be $r = \sqrt{q}$. Then the total Euclidean length of the vector $r$ is the square root of the sum of squares in each dimension, which is
$$|r| = \sqrt{\sum r_i^2} = \sqrt{\sum q_i} = 1.$$
Vector lengths in the new coordinates are always one, which provides a pure expression of the conservation of total probability. In general, the $q$ may be arbitrary weightings, such that $\sum q_i$ is conserved, and thus $\sum \dot{q}_i = 0$. Here, I focus on conserved probability, in which the $q_i$ are positive and sum to one.
The path lengths of motion take on simple interpretations in terms of distance in the unitary coordinates. The transformed coordinates yield
$$\sum \frac{\dot{q}_i^2}{q_i} = 4\sum \dot{r}_i^2,$$
which shows the simple Euclidean interpretation of squared distance in the $r$ coordinates as a sum of squared differences. This expression of distance is also equivalent to the Fisher information metric [10,11]. However, geometry is perhaps more fundamental than information, because the distance arises inevitably from curvature of paths caused by analyzing probability displacement subject to unitary conservation of total probability.
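The unitary coordinates can be checked with a brief sketch. The frequencies and displacement below are hypothetical, the step size `h` is an arbitrary small number chosen here, and $\dot{r}_i = \dot{q}_i / (2\sqrt{q_i})$ follows from differentiating $r_i = \sqrt{q_i}$.

```python
import math

# Hypothetical frequencies and a hypothetical displacement that sums to zero.
q = [0.5, 0.3, 0.2]
q_dot = [-0.015, 0.081, -0.066]

# Unitary coordinates r = sqrt(q) have Euclidean length one.
r = [math.sqrt(qi) for qi in q]
length_r = math.sqrt(sum(ri * ri for ri in r))

# Velocity in the new coordinates, from differentiating r_i = sqrt(q_i).
r_dot = [qd / (2.0 * math.sqrt(qi)) for qd, qi in zip(q_dot, q)]

path_q = sum(qd * qd / qi for qd, qi in zip(q_dot, q))   # sum of q_dot_i^2 / q_i
path_r = 4.0 * sum(rd * rd for rd in r_dot)              # 4 * sum of r_dot_i^2

# Finite-difference check: squared Euclidean distance between sqrt(q) and
# sqrt(q + h * q_dot), rescaled by 4 / h^2, approaches path_q for small h.
h = 1e-4
r_shift = [math.sqrt(qi + h * qd) for qi, qd in zip(q, q_dot)]
dist_sq = sum((b - a) ** 2 for a, b in zip(r, r_shift))
rescaled = 4.0 * dist_sq / (h * h)

print(length_r, path_q, path_r, rescaled)
```

The agreement between `path_q` and `rescaled` illustrates that the $\dot{q}_i^2/q_i$ terms measure ordinary squared Euclidean distance once frequencies are expressed as square roots.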

GEOMETRY
This section briefly reviews the geometry of frequency change dynamics that follow from two assumptions. The first assumption is that direct force, $m_i$, causes exponential growth
$$q'_i = q_i e^{m_i}.$$
This growth expression establishes a natural logarithmic scaling for comparing frequencies, because
$$m_i = \log\frac{q'_i}{q_i}.$$
When changes are small, $m_i = \dot{\log}\, q_i = \dot{q}_i/q_i$. We could interpret those changes with respect to $\log q_i$ as entropy or information. But the geometry of force and growth may be a better way to think about the fundamental nature of these expressions. The second assumption is that total frequency or probability is conserved, $\sum \dot{q}_i = 0$. That conservation imposes a constraint on paths of change. The constraint may be expressed by the geometry of the unitary coordinates, $r = \sqrt{q}$, which yields a conserved length $|r| = 1$.
The path lengths for virtual displacements times direct or inertial forces are $\sum \dot{q}_i^2/q_i = 4\sum \dot{r}_i^2$. The essential geometry arising from growth and from conservation of total probability sets the form of the distances.

CANONICAL COORDINATES AND CONSERVATION IN EACH DIMENSION
Hamiltonian expressions in canonical coordinates often provide the deepest insight into the symmetries of a system [13]. To obtain the Hamiltonian, the use of $r = \sqrt{q}$ coordinates was a first step, because we can rewrite d'Alembert's principle in eqn 7 as
$$4\sum \dot{r}_i^2 - 4\sum \dot{r}_i^2 = 0.$$
However, the net balance only applies to the total system rather than separately in each dimension. If we can find the proper canonical coordinates, then the forces of constraint will appear independently in each dimension, and the balance of direct and inertial forces will also appear independently in each dimension. In a Hamiltonian formulation, we assign two values to each component, usually considered as position and momentum [13]. In our nondimensional system, our primary factor is the conservation of total probability, which we express through the unitary coordinates $r = \sqrt{q}$, such that the length of $r$ is always one
$$|r|^2 = \sum r_i^2 = \sum q_i = 1.$$
If, for each point, we take $r_i = \sqrt{q_i}$ for position and $p_i = \sqrt{q_i}$ for momentum, then $r \cdot p = 1$, and the conserved Hamiltonian is
$$H = \dot{r} \cdot p - r \cdot \dot{p} = 0.$$
This expression satisfies the requirements for Hamiltonian canonical coordinates of position and momentum, which are that $\partial H/\partial r_i = -\dot{p}_i$ and $\partial H/\partial p_i = \dot{r}_i$. The differential of the Hamiltonian often provides a useful expression $\dot{H} = \ddot{r} \cdot p - r \cdot \ddot{p} = 0$, which, in each separate dimension, is zero because $r_i = p_i = \sqrt{q_i}$, and thus we can write the Hamiltonian in each dimension as
$$\dot{H}_i = \ddot{r}_i p_i - r_i \ddot{p}_i = 0. \tag{8}$$
Here, the curvature from the force of constraint is divided into equal and opposite contributions in the direct and inertial force components, recovering a Newtonian $\tilde{F}_i - \mu\tilde{A}_i = 0$ perspective independently in each dimension.
We can rewrite eqn 8 as a d'Alembert's principle expression
$$\dot{H} = \left(-p \circ \dot{\log}\, \dot{r} + r \circ \dot{\log}\, \dot{p}\right)\dot{r} = 0$$
for virtual displacement $\dot{r}$, direct force $F = -p \circ \dot{\log}\, \dot{r}$, and inertial force $I = r \circ \dot{\log}\, \dot{p}$. The symbol "$\circ$" denotes element-wise multiplication of vectors, and dot products distribute over parentheses. Thus, $\dot{H} = (F + I)\dot{r} = 0$, with the Newtonian equality $F_i + I_i = 0$ satisfied in each dimension.

COORDINATES FOR QUANTITIES CORRELATED WITH FORCE
We can analyze any quantitative system property by transforming coordinates. We start with the general results for the conservation of total probability and information momentum, $\dot{\bar{m}} = 0$. We then obtain an expression for the change in the system quantity, $\dot{\bar{z}}$, by the change in coordinates $(m, \dot{m}) \to (z, \dot{z})$, in which the different coordinates now have an arbitrary relation rather than the earlier equivalence. That change in coordinates generalizes the $\dot{\bar{m}}$ form of the Price equation (eqn 2), to give the change in the average value of $z$ as
$$\dot{\bar{z}} = \dot{q} \cdot z + q \cdot \dot{z}.$$
The $z_i$ values are the averages of $z$ in each dimension, $i$. Because $z$ can be any quantity, calculated in any way, this equation gives the most general expression for $\dot{\bar{z}}$, the change in the average of $z$. One can think of $\bar{z} = \sum q_i z_i$ as a functional of the arbitrary function, $z$, that maps $i \to z_i$. The only restriction on the expression for $\dot{\bar{z}}$ shown here is that changes be small. For large changes, the exact form of the Price equation in eqn 1 should be used.
We can relate $\dot{\bar{m}}$ to $\dot{\bar{z}}$ by writing the change in coordinates, $m \to z$ and $\dot{m} \to \dot{z}$, as the regression equations
$$z = \beta_{zm} m + \epsilon$$
$$\dot{z} = \beta_{\dot{z}\dot{m}} \dot{m} + \gamma,$$
in which the regression coefficients, $\beta$, are obtained by minimizing the length of the "error" vector. To analyze the length of the error vector, we can use standard identities from the theory of least squares for regression [14].
In particular, the first regression equation follows from choosing $\beta_{zm}$ to minimize $|\epsilon_q|^2 = \sum q_i \epsilon_i^2$, in which $\epsilon_q = \sqrt{q_i}\,\epsilon_i$ denotes a $\sqrt{q}$ weighted vector. Choosing $\beta_{zm}$ to minimize the length of $\epsilon_q$ leads to $m_q \cdot \epsilon_q = 0$, because the minimum length of $\epsilon_q$ occurs when that vector is orthogonal to $m_q$. Note that $\dot{q}_i = q_i m_i$, thus
$$\dot{q} \cdot \epsilon = \sum q_i m_i \epsilon_i = m_q \cdot \epsilon_q = 0.$$
In the equation for $\dot{z}$, minimizing $|\gamma_q|^2$ sets $\beta_{\dot{z}\dot{m}}$. We also have, by standard theory, $q \cdot \gamma = 0$. Using these identities,
$$\dot{q} \cdot z = \beta_{zm}\, \dot{q} \cdot m, \qquad q \cdot \dot{z} = \beta_{\dot{z}\dot{m}}\, q \cdot \dot{m},$$
from which we obtain the change $\dot{\bar{z}}$ in terms of the original coordinates for $\dot{\bar{m}}$ as
$$\dot{\bar{z}} = \beta_{zm}\, \dot{q} \cdot m + \beta_{\dot{z}\dot{m}}\, q \cdot \dot{m} = \left(\beta_{zm} - \beta_{\dot{z}\dot{m}}\right) \dot{q} \cdot m,$$
the right expression arising from the fact that $\dot{q} \cdot m + q \cdot \dot{m} = 0$. The total change, $\dot{\bar{z}}$, is split into the virtual work term, $\beta_{zm}\, \dot{q} \cdot m$, and the inertial force term, $\beta_{\dot{z}\dot{m}}\, q \cdot \dot{m}$. The regression coefficients rescale coordinates $(m, \dot{m}) \to (z, \dot{z})$. If $\bar{z}$ is a conserved quantity, or the system is at an equilibrium with respect to $\bar{z}$, then $\dot{\bar{z}} = 0$. We can write a d'Alembert form
$$\dot{\bar{z}} = \left(\beta_{zm} m - \beta_{\dot{z}\dot{m}} m\right)\dot{q} = 0,$$
which, when $\dot{q} \cdot m \neq 0$, implies $\beta_{zm} = \beta_{\dot{z}\dot{m}}$, and the d'Alembert equality holds separately in each dimension. In this case, the dynamics of $z$ are influenced by both the conservation of probability and by additional constraints set by the conservation of $\bar{z}$. We may, of course, choose the changing reference frame, $\dot{z}$, such that $\dot{\bar{z}} \neq 0$, in which case the direct and inertial forces do not completely balance.
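The least squares identities used for the first regression equation can be illustrated with a short sketch. The values of `q`, `m`, and `z` are hypothetical, with `m` chosen so that average fitness is zero. The regression is fitted without an intercept, following the form $z = \beta_{zm} m + \epsilon$, and the closed-form coefficient used below is the standard weighted least squares solution for that form.

```python
# Sketch: frequency-weighted least squares for z = beta_zm * m + eps.
# All numbers are hypothetical and chosen only for illustration.

def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

q = [0.5, 0.3, 0.2]           # frequencies (sum to 1)
m = [-0.03, 0.27, -0.33]      # fitness values with zero frequency-weighted mean
z = [2.0, 3.5, 1.0]           # arbitrary system quantity in each dimension

q_dot = [qi * mi for qi, mi in zip(q, m)]    # q_dot_i = q_i * m_i

# Coefficient minimizing sum_i q_i * eps_i^2 for a fit without intercept.
beta_zm = sum(qi * mi * zi for qi, mi, zi in zip(q, m, z)) / sum(
    qi * mi * mi for qi, mi in zip(q, m)
)
eps = [zi - beta_zm * mi for zi, mi in zip(z, m)]   # residual vector

orthogonality = dot(q_dot, eps)              # expected to vanish
direct_change = dot(q_dot, z)                # q_dot . z
rescaled_work = beta_zm * dot(q_dot, m)      # beta_zm * (q_dot . m)

print(beta_zm, orthogonality, direct_change, rescaled_work)
```

The vanishing of `orthogonality` is what allows the change in any quantity $z$ caused by frequency change to be written as a rescaling of the virtual work term by the regression coefficient.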

THE FUNDAMENTAL THEOREM
We may set $\beta_{\dot{z}\dot{m}} = 0$, either because the changing value of $\bar{z}$ is unaffected by the changing reference frame, or because the effects of the changing reference frame are ignored by assumption. We then have an expression for the partial change caused by the direct forces, holding constant the frame of reference
$$\dot{\bar{z}}_s = \dot{q} \cdot z = \beta_{zm}\, \dot{q} \cdot m,$$
in which the $s$ subscript emphasizes that this is a partial change ascribed to the direct forces, or the forces of selection. This form includes, as special cases, Fisher's fundamental theorem of natural selection, the breeder's equation of genetics, and other common expressions for the change in populations caused by natural selection.
Note that $\dot{q} \cdot m = V_m$, the variance of $m$, because
$$\dot{q} \cdot m = \sum q_i m_i^2 = \sum q_i (m_i - \bar{m})^2,$$
which is the variance of $m$, because $\bar{m} = 0$. If we take $z \equiv m$ in order to study the change in fitness caused by the direct forces, then $\dot{\bar{m}}_s = V_m$, the change in mean fitness caused by selection, $\dot{\bar{m}}_s$, is the variance in fitness, $V_m$. Fisher was interested in the transmissible change in $\bar{m}$ associated with genetic factors, $g$, thus he partitioned fitness as $m = g + \delta$. Here, the genetic factors are partial regressions associated with particular genes, such that $g$ is chosen to maximize the amount of the total variance in fitness, $V_m$, associated with the transmissible genes [4,9,15,16]. The $\delta$ terms are residuals in the regression, such that one gets the additive partition of total variance from classical regression theory as
$$V_m = V_g + V_\delta.$$
The change in fitness caused by the direct forces can now be written as
$$\dot{\bar{m}}_s = V_m = V_g + V_\delta,$$
and thus the transmissible change in fitness caused by natural selection and associated with genetic factors is
$$\dot{\bar{m}}_g = V_g,$$
in which $V_g$ is the variance in the transmissible effects of the genetic factors on fitness, or the genetic variance in fitness. That partial change in fitness caused by direct forces and associated with transmissible factors is what Fisher emphasized in his fundamental theorem of natural selection. By defining the genetic factors, $g$, as the only direct forces of interest, the residual forces of selection, $\delta$, are added to the other inertial forces that define the changing frame of reference.
In models of evolutionary change, Fisher chose to ascribe the direct force of change associated with $g$ to natural selection, and all other forces to the inertial frame that he called environmental causes. That d'Alembert interpretation of the split between direct and inertial forces provides a clear way in which to understand Fisher's fundamental theorem of natural selection. There is, of course, an arbitrary aspect to such a partition, because the split between direct and inertial forces depends entirely on how one chooses to define the frames of reference. For example, a change in how one defines the set of potentially transmissible factors, $g$, alters how one splits forces between direct and inertial components [15].
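The variance partition can be illustrated with a small sketch. It is not Fisher's full genetic model: the single predictor `x` is a hypothetical stand-in for a transmissible factor (for example, a count of a particular allele), `g` is taken as the frequency-weighted linear regression of fitness on that predictor, and the change associated with `g` is read here as $\dot{q} \cdot g$.

```python
# Sketch: partition of fitness m = g + delta by weighted linear regression.
# Assumption (not from the text): one hypothetical predictor x stands in
# for the transmissible genetic factors.

def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

q = [0.5, 0.3, 0.2]           # frequencies (sum to 1)
m = [-0.03, 0.27, -0.33]      # fitness values with zero frequency-weighted mean
x = [0.0, 1.0, 2.0]           # hypothetical predictor in each dimension

q_dot = [qi * mi for qi, mi in zip(q, m)]    # q_dot_i = q_i * m_i

# Frequency-weighted regression of m on x (deviations from the mean of x).
x_bar = dot(q, x)
x_dev = [xi - x_bar for xi in x]
slope = sum(qi * xd * mi for qi, xd, mi in zip(q, x_dev, m)) / sum(
    qi * xd * xd for qi, xd in zip(q, x_dev)
)
g = [slope * xd for xd in x_dev]               # predicted (transmissible) part
delta = [mi - gi for mi, gi in zip(m, g)]      # residual part

def variance(v):
    """Frequency-weighted variance of v."""
    mean = dot(q, v)
    return sum(qi * (vi - mean) ** 2 for qi, vi in zip(q, v))

V_m, V_g, V_delta = variance(m), variance(g), variance(delta)
change_direct = dot(q_dot, m)     # partial change from all direct forces
change_genetic = dot(q_dot, g)    # part associated with the predictor

print(V_m, V_g + V_delta, change_direct, change_genetic)
```

Under these assumptions the total partial change equals $V_m$, and the part associated with the predictor equals $V_g$, because the regression residuals are uncorrelated with the predicted values.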

CONCLUSIONS
The fundamental equations for change are identical between many laws of physics and evolutionary change by natural selection. However, the different histories of those subjects and the long and confused debates in biology about Fisher's fundamental theorem have obscured the simple, common basis of the underlying theory. I unified different theories by combining d'Alembert's conceptual frame with the abstract expressions of the Price equation. That combination led to a simple and very general basis for understanding populations or aggregations, in which one can interpret total frequency or total probability as a conserved quantity. By combining conservation of total frequency with a notion of change based on exponential growth, I showed the geometric and algebraic forms of change that arise from d'Alembert's partition of direct and inertial forces. I also provided an elegant Hamiltonian expression in canonical coordinates, which recovers the Newtonian balance of force and acceleration independently in each dimension for the corresponding direct and inertial forces of d'Alembert.
Finally, I showed that arbitrary system quantities, such as biological traits, or any total system quantity such as energy, can be interpreted through two steps. First, begin with the universal results that arise from conservation of total probability and the notion of change as exponential growth. Second, apply a simple coordinate transformation between frequency change and system quantities to obtain general expressions for the change in system quantities.

METHODS
The assumption of small changes associated with the overdot notation does not imply that forces are weak. Instead, the scale of change is small, in the sense typically associated with continuous time derivatives in differential equations. However, I have avoided classical derivative notation and differential equations in order to retain the more general form of the abstract Price equation [7,8].
For example, in the definition $m_i = \dot{q}_i/q_i$, the overdot notation can be interpreted as a small change in $q_i$, such that $\dot{q}_i \equiv dq_i$. Fitness in biology is sometimes given as an absolute number or as a nondimensional change in frequency, consistent with $m_i$, and sometimes as a rate or Malthusian parameter, which might be given as
$$\frac{m_i}{d\tau} = \frac{1}{q_i}\frac{dq_i}{d\tau}.$$
Here, $d\tau$ is the underlying scale of change, which is typically a small change in time. However, we can take $d\tau$ as an abstraction of the underlying scale of change, which may have any units or be nondimensional. If we take the units on $\tau$ as the square of time, then we move toward traditional definitions of force or acceleration. Because $d\tau$ is small, the quantities of rates, forces, or accelerations may be large.
In the text, we are always looking at equivalences between left and right hand sides of equations. So we can always multiply or divide by various functions of $d\tau$ interpreted with respect to arbitrary dimensions. The abstraction in the text is intentional, because the interdisciplinary connections between seemingly different subjects and results arise only when one focuses on the abstract structure of the key results. For example, the need for such abstraction arose elsewhere when studying the relation between Fisher's fundamental theorem and Fisher information [7,8,17].
The abstract structure shows the unity among a broad array of fundamental expressions in mechanics, in biology, in information theory and information geometry, and in many other kinds of problems that can be cast in variational form.
I have made the assumption that the scale of change is small, and thus all quantities with overdots are small. In biology, that assumption is often associated with models of populations with overlapping generations described in continuous time differential equations [16]. In mechanics, that assumption corresponds to the classical differential equation expressions in continuous time.
The analysis of discrete changes that are not small, typically associated with discrete time models, remains an open problem. The exact Price expression in eqn 1 gives a hint at how to proceed when changes are not small. The connection to the continuous expressions of mechanics and d'Alembert might be achieved by careful use of differential geometry and construction of discrete changes as sums of small changes along continuous paths. Some results based on the analysis of the exact, discrete Price equation may provide a point of departure [7,8].
The $\dot{\log}\, m_i$ notation is interpreted as
$$\dot{\log}\, m_i = \frac{\dot{m}_i}{m_i},$$
which is the change in the relative distance of $m_i$ from zero. This interpretation is consistent with the expression of $\dot{\log}\, m_i$ in terms of the changes in $q_i$ given in eqn 4.