1. Introduction
The fundamental theorem of natural selection divides total evolutionary change into two components [1]. The first component is the partial change caused by the direct force of natural selection. The second component is the partial change caused by all other forces.
The theorem states that the change in fitness caused by the direct force of natural selection equals the genetic variance in fitness. We can interpret “genetic variance” to mean the component of variance associated with things that are transmitted through time. Natural selection is the force that changes the frequencies of those transmissible things.
Fisher wrote clearly about the distinction between the direct force of natural selection and the other evolutionary forces [1,2]. Yet much confusion followed in the history of the subject. Essentially all commentators considered only the total evolutionary change, rather than Fisher’s split into two partial components.
A correct interpretation of Fisher’s partial components eventually developed, starting with Price [3] and Ewens [4]. However, both of those authors concluded that Fisher’s split of total change into components provided little value.
In this article, I show that Fisher’s split of evolutionary change is equivalent to d’Alembert’s split of the general causes of dynamics into direct and inertial forces. d’Alembert’s principle is the foundation for essentially all of the key results of theoretical physics, starting with Newton’s laws and leading to the subsequent generalizations via Lagrangian and Hamiltonian mechanics.
Lanczos [5], in his great synthesis of the variational principles of mechanics, elevates d’Alembert’s principle to the key insight that ties together the whole subject. To Lanczos, the tremendous value of d’Alembert’s principle follows from the fact that it “focuses attention on the forces, not on the moving body …” In the same way, Fisher’s goal was to isolate and interpret the force of natural selection, rather than to emphasize the dynamics of total change.
The study and interpretation of force requires separating the action of a force from the frame of reference. A force affects change, and the measurement and interpretation of that change depends on the changing frame of reference of the system. To understand the force as distinct from the frame of reference, force and frame of reference must be separated.
That separation between force and frame of reference is exactly what Fisher did and was exactly how he discussed his analysis. I argue here that connecting Fisher’s theorem to d’Alembert’s principle will help to clarify the separation of direct force and frame of reference.
In Fisher’s analysis, he was vague about the mathematical form of the changes associated with the frame of reference. Here, by using the Price equation, I make explicit the connections between Fisher’s theorem and d’Alembert’s principle.
My argument follows three steps. First, I derive the general form of the Price equation. Second, I connect the Price equation to d’Alembert’s principle. Third, I discuss the fundamental theorem of natural selection in the context of d’Alembert’s separation of the direct forces and the inertial forces associated with the changing frame of reference. By d’Alembert’s separation, we obtain a partition of total evolutionary change in fitness into the change by the direct force of natural selection and the change by the inertial forces of the changing environmental frame of reference.
The analysis is much more general and powerful than a theorem limited to natural selection. Instead, we find a broad analysis of the dynamics of any population or aggregation that can be characterized by frequencies. The conservation of total frequency, or total probability, establishes a symmetry that defines many of the characteristics of aggregate dynamics. Those characteristics of aggregate dynamics apply to natural selection, to many problems in mechanics, and to any analysis of the changes in probability distributions.
2. The Price Equation
The Price equation [6,7] describes the change in an average value obtained over some aggregation or population. Each component of the population has a weighting, $q$, and a value, $z$. Begin with a discrete analog of the chain rule for differentiation of a product
$$\Delta(qz) = q'z' - qz = (q' - q)z + q'(z' - z) = (\Delta q)z + q'\Delta z,$$
in which $\Delta q = q' - q$ and $\Delta z = z' - z$. The same chain rule can be applied to vectors. By using dot product notation, we obtain an abstract form of the Price equation [7,8,9]
$$\Delta(\mathbf{q}\cdot\mathbf{z}) = \Delta\mathbf{q}\cdot\mathbf{z} + \mathbf{q}'\cdot\Delta\mathbf{z}, \tag{1}$$
in which a dot product is understood in the usual way as $\mathbf{q}\cdot\mathbf{z} = \sum_i q_i z_i$.
This equation can be interpreted in various ways. For our purposes, we can take $q_i$ to be the frequency associated with a subset, $i$, of the initial population, such that the total frequency is $\sum_i q_i = 1$. Thus, $\mathbf{q}\cdot\mathbf{z} = \bar{z}$ is the average of $z$, in which $z_i$ is a function that maps $i$ to some value. Similarly, we have a second population, with frequencies $q'_i$ and values $z'_i$, in which $\mathbf{q}'\cdot\mathbf{z}' = \bar{z}'$.
One can use various rules for the relations between $q_i$ and $q'_i$ and between $z_i$ and $z'_i$, allowing a wide variety of different perspectives on the transformations that relate the two populations [7]. For our purposes, we can operate abstractly and not worry about the particular rules. Our only restriction is that we can map the index $i$ between the two populations.
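The exact form of the Price equation can be checked numerically. The following is a minimal sketch in Python, assuming made-up frequencies and values for a population with three subsets; the numbers and variable names are illustrative assumptions, not taken from the article.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical illustrative values (assumptions, not from the article).
q = [0.5, 0.3, 0.2]          # frequencies in the first population
q_new = [0.4, 0.35, 0.25]    # frequencies in the second population
z = [1.0, 2.0, 4.0]          # values in the first population
z_new = [1.5, 2.0, 3.0]      # values in the second population

dq = [b - a for a, b in zip(q, q_new)]   # Delta q
dz = [b - a for a, b in zip(z, z_new)]   # Delta z

total_change = dot(q_new, z_new) - dot(q, z)   # Delta(q . z)
frequency_term = dot(dq, z)                    # Delta q . z
value_term = dot(q_new, dz)                    # q' . Delta z

# The two sides of the Price equation agree up to rounding error.
print(total_change, frequency_term + value_term)
```

The identity holds for any choice of values, because it is only the discrete product rule applied term by term. Note that the second term uses the new frequencies, `q_new`, which is what makes the partition exact for changes of any size.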
3. Fitness as a Change in Frequency
The function $z_i$ can map subset $i$ to any value. When studying frequency changes, let us rename the variable as $z_i \equiv m_i$, and choose $m_i = \log(q'_i/q_i)$ to describe the ratio of frequencies between the two populations associated with $i$. We can think of $m_i$ as a growth rate, or as a kind of force that moves the system from $q_i$ to $q'_i$. In particular, the above expression is equivalent to exponential growth driven by $m_i$ as
$$q'_i = q_i e^{m_i}.$$
We may call $m_i$ fitness, because it expresses the relative growth of the weighting associated with $i$. The term $m_i$ is, in effect, a growth rate relative to an unspecified underlying scale of change. We can take $m_i$ as a given force of growth and derive $q'_i$, or we can take the outcome $q'_i$ as given, and derive the effective force, $m_i$, that is consistent with the outcome.
If we thought of $i$ as a particular individual or a particular type, then $m_i$ would express the growth rate associated with that individual or type between the two populations. However, the equations allow us simply to make the definition that relates $q_i$ to $q'_i$, and not restrict ourselves to a particular interpretation of what $i$ means in those terms.
I confine my analysis to small differences, $q'_i = q_i + \Delta q_i$, in which $\Delta q_i/q_i$ is small. For small differences we have (see Methods for assumptions)
$$m_i = \log\frac{q'_i}{q_i} \approx \frac{\Delta q_i}{q_i}.$$
Using this definition and the substitution $\mathbf{z} \equiv \mathbf{m}$ in the Price equation, Equation (1) from the prior section, we obtain a general expression for the total change in fitness as
$$\Delta\bar{m} = \Delta\mathbf{q}\cdot\mathbf{m} + \mathbf{q}\cdot\Delta\mathbf{m},$$
in which we ignore the second order term $\Delta\mathbf{q}\cdot\Delta\mathbf{m}$ in this description of small changes, with $\mathbf{q}' = \mathbf{q} + \Delta\mathbf{q}$.
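The relation between the logarithmic definition of fitness and its small-change approximation can be illustrated numerically. The sketch below assumes hypothetical frequencies and small frequency changes (illustrative values only) and compares $\log(q'_i/q_i)$ with $\Delta q_i/q_i$.

```python
import math

# Hypothetical illustrative values (assumptions, not from the article).
q = [0.5, 0.3, 0.2]             # initial frequencies
dq = [0.002, -0.001, -0.001]    # small changes that sum to zero
q_new = [a + d for a, d in zip(q, dq)]

# Fitness as a log ratio, and its small-change approximation.
m_log = [math.log(b / a) for a, b in zip(q, q_new)]
m_ratio = [d / a for a, d in zip(q, dq)]

# Exponential growth driven by m recovers the second population.
q_back = [a * math.exp(m) for a, m in zip(q, m_log)]

# Largest discrepancy between the two definitions of fitness.
max_gap = max(abs(x - y) for x, y in zip(m_log, m_ratio))
print(m_log, m_ratio, max_gap)
```

The discrepancy is second order in $\Delta q_i/q_i$, so it shrinks quadratically as the changes become smaller.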
4. Conservation of Total Probability, Entropy Momentum, and Fisher Information
With the definition of fitness as a growth rate, $m_i = \Delta q_i/q_i$, average fitness is
$$\bar{m} = \mathbf{q}\cdot\mathbf{m} = \sum_i q_i\frac{\Delta q_i}{q_i} = \sum_i \Delta q_i = 0.$$
This equation expresses the conservation of total probability or total frequency. It follows that the change in average fitness, $\Delta\bar{m}$, must also be zero
$$\Delta\bar{m} = \Delta\mathbf{q}\cdot\mathbf{m} + \mathbf{q}\cdot\Delta\mathbf{m} = 0. \tag{2}$$
The term $\Delta\mathbf{q}\cdot\mathbf{m}$ has a wide variety of interpretations related to information theory and classical mechanics. For example, this term expresses entropy momentum or Fisher information [10,11], as
$$\Delta\mathbf{q}\cdot\mathbf{m} = \sum_i \Delta q_i\frac{\Delta q_i}{q_i} = \sum_i q_i\left(\frac{\Delta q_i}{q_i}\right)^2 \approx \sum_i q_i\left(\Delta\log q_i\right)^2.$$
The term $\Delta\log q_i$ is the change in entropy in each dimension, $i$, describing an entropy velocity or nondimensional entropy momentum relative to an unspecified underlying scale of change. Thus, $\Delta\mathbf{q}\cdot\mathbf{m}$ may be interpreted as the gain in entropy momentum, which must be balanced by the loss of entropy momentum in the second term, $\mathbf{q}\cdot\Delta\mathbf{m}$, to achieve overall conservation, $\Delta\bar{m} = 0$.
Note that I have used $\log q_i$ as the entropy in each dimension, consistent with the information theory concept of self-information or surprise as $-\log q_i$. That definition leads to system entropy as the expectation over the different dimensions, $-\sum_i q_i\log q_i$. Some people prefer to define the entropy in each dimension as $-q_i\log q_i$, and system entropy as the sum over each dimension, in which case my usage of entropy or information momentum does not make sense.
The term $\sum_i q_i(\Delta\log q_i)^2$ is widely used as the Fisher information metric, particularly in the study of information geometry [11]. Thus, the first term in $\Delta\bar{m} = \Delta\mathbf{q}\cdot\mathbf{m} + \mathbf{q}\cdot\Delta\mathbf{m} = 0$ is the gain in Fisher information, and the second term is an exact balancing loss in Fisher information. The balance leads to an overall conservation of Fisher information, as emphasized by Frieden [10].
We have transcended our original formulation of biological fitness in these descriptions of probability, information, and entropy. The expressions here apply to any problem that can be expressed in terms of changing frequencies in populations or aggregates, subject to the conservation of total frequency.
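The conservation statements of this section can be checked with a few lines of arithmetic. The following sketch assumes hypothetical frequencies and frequency changes (illustrative values only). It computes average fitness under the definition $m_i = \Delta q_i/q_i$ and confirms that the first term of the change in average fitness, $\sum_i \Delta q_i m_i$, equals the Fisher information form $\sum_i q_i(\Delta q_i/q_i)^2$.

```python
# Hypothetical illustrative values (assumptions, not from the article).
q = [0.5, 0.3, 0.2]              # frequencies, summing to one
dq = [0.02, -0.005, -0.015]      # frequency changes, summing to zero

# Fitness as the relative change in frequency.
m = [d / a for a, d in zip(q, dq)]

# Average fitness q . m reduces to the sum of the changes, which is zero.
mean_fitness = sum(a * mi for a, mi in zip(q, m))

# First Price term, Delta q . m, which is the "gain" term.
gain = sum(d * mi for d, mi in zip(dq, m))

# The same quantity written as a frequency-weighted squared relative change.
fisher_form = sum(a * (d / a) ** 2 for a, d in zip(q, dq))

print(mean_fitness, gain, fisher_form)
```

Because `gain` is a weighted sum of squares, it is never negative; conservation of total frequency then requires the second term of the Price equation to be equal in size and opposite in sign.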
5. d’Alembert’s Principle
We may write d’Alembert’s principle [5] as
$$(\mathbf{F} + \mathbf{I})\cdot\delta\mathbf{r} = 0.$$
Here, all terms are vectors, and the implicit dot product with $\delta\mathbf{r}$ distributes over the parentheses. The vector $\mathbf{r}$ locates the system, and $\delta\mathbf{r}$ is a virtual displacement of the system from its current location to a nearby location. A virtual displacement is like an imaginary displacement, in which the system is held fixed in its current state, and then one moves its location without changing anything else. All forces and the frame of reference for measurement are held constant [5].
A virtual displacement must be consistent with all forces of constraint. In our case, the primary force of constraint on a virtual displacement, $\delta\mathbf{r} \equiv \Delta\mathbf{q}$, is that the sum of the frequencies is one. Thus, $\sum_i \Delta q_i = 0$ expresses the force of constraint set by the conservation of total frequency or probability. Because a virtual displacement must be consistent with the forces of constraint, we need only analyze those forces that are in addition to the forces of constraint. In particular, we need to track the direct forces, $\mathbf{F}$, and inertial forces, $\mathbf{I}$.
The term $\mathbf{F}$ is the vector of direct forces acting on the system, and the term $\mathbf{I}$ is the vector of inertial forces that balance the direct forces to achieve no net change. d’Alembert’s principle can be thought of as a generalization of Newton’s second law of motion [5], in which $\tilde{\mathbf{F}} = \mu\tilde{\mathbf{A}}$ is read as the total force, $\tilde{\mathbf{F}}$, equals mass, $\mu$, times total acceleration, $\tilde{\mathbf{A}}$. Total force and total acceleration must include forces of constraint. If we write total inertial force as $\tilde{\mathbf{I}} = -\mu\tilde{\mathbf{A}}$, then Newton’s law is $\tilde{\mathbf{F}} + \tilde{\mathbf{I}} = 0$.
When we study an actual system, we are usually interested in how the direct, or applied, forces influence dynamics. To do that, we need to separate the direct forces from the constraining forces. For example, in studying the frequency dynamics and evolutionary change caused by natural selection, we usually wish to analyze the direct force of growth rate, or fitness, separately from the force of constraint imposed by the conservation of total probability.
In d’Alembert’s formulation, the direct and inertial forces typically do not sum to zero, $\mathbf{F} + \mathbf{I} \ne 0$, because those terms do not include the constraining forces. Instead, in d’Alembert’s expression $(\mathbf{F} + \mathbf{I})\cdot\delta\mathbf{r} = 0$, the term $\mathbf{F}\cdot\delta\mathbf{r}$ combines the direct and constraining forces, and the term $\mathbf{I}\cdot\delta\mathbf{r}$ combines all inertial forces, including any forces of constraint. Newton’s law is a special case of the more general principle of d’Alembert [5].
6. Interpretation of d’Alembert’s Principle
Here is a simple intuitive description of d’Alembert’s principle [12]. You are sitting in a car at rest, and the car suddenly accelerates. You feel thrown back into the seat. But, even as the car gains speed, you effectively do not move in relation to the frame of reference of the car: your velocity relative to the car remains zero. That net zero velocity can be thought of as the balance between the direct force of the seat pushing on you and the inertial force sending you back as the car accelerates forward.
As long as your frame of reference moves with you, then your net motion in your frame of reference is zero. Put another way, there is always a changing frame of reference that zeroes net change by balancing the work of direct forces on a system against the work of a balancing inertial force. Although the system is a dynamic expression of changing components, it also has an overall static, equilibrium quality that aids analysis. As Lanczos [5] emphasizes, d’Alembert’s principle “focuses attention on the forces, not on the moving body …”
7. d’Alembert and the Conservation of Total Probability
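This section transforms the conservation of total probability expressed by Equation (2) into a form of d’Alembert’s principle. We first note that (see Methods for notation)
$$\mathbf{q}\cdot\Delta\mathbf{m} = \left(\frac{\mathbf{q}}{\Delta\mathbf{q}}\odot\Delta\mathbf{m}\right)\cdot\Delta\mathbf{q} = \frac{\Delta\mathbf{m}}{\mathbf{m}}\cdot\Delta\mathbf{q},$$
in which the last step uses $\mathbf{m} = \Delta\mathbf{q}/\mathbf{q}$. The symbol “⊙” denotes element-wise multiplication of vectors, the ratio denotes element-wise division, and dot products distribute over parentheses. With this expression, we can rewrite our general result in Equation (2) for the conservation of total probability, or the change in fitness, in the general form of d’Alembert, $(\mathbf{F} + \mathbf{I})\cdot\delta\mathbf{r} = 0$, as
$$\Delta\bar{m} = \left(\mathbf{m} + \frac{\Delta\mathbf{m}}{\mathbf{m}}\right)\cdot\Delta\mathbf{q} = 0. \tag{3}$$
We equate this expression with d’Alembert by interpreting $\mathbf{F} \equiv \mathbf{m}$ as the force of growth, or fitness, or, more generally, the direct forces acting on frequency change. We interpret $\mathbf{I} \equiv \Delta\mathbf{m}/\mathbf{m}$ as the inertial forces, which typically are described in terms of acceleration with respect to the frame of reference.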
8. Direct and Inertial Forces
The expression in Equation (3) describes d’Alembert’s principle for systems that follow conservation of total probability. This section considers how we should interpret $\mathbf{F} + \mathbf{I}$ for the direct and inertial forces in terms of Newtonian concepts of force and acceleration.
The dot product expression in Equation (3) can be written as a sum over the individual dimensions of the system
$$\sum_i \left(\Delta q_i F_i + \Delta q_i I_i\right) = \sum_i \left(\Delta q_i m_i + q_i \Delta m_i\right) = 0.$$
The first term on each side, $\Delta q_i F_i = \Delta q_i m_i$, is the virtual displacement times the direct force. We may call this term the virtual work of the direct forces, because physical work is displacement times force. We can write this component of virtual work solely in terms of frequencies from our prior definition of $m_i = \Delta q_i/q_i$.
The second term on each side, $\Delta q_i I_i = q_i \Delta m_i$, is the virtual work of the inertial forces. To interpret the inertial forces with respect to acceleration, it is useful to express $\Delta m_i$ as
$$\Delta m_i = \Delta\!\left(\frac{\Delta q_i}{q_i}\right) = \frac{\Delta^2 q_i}{q_i} - \left(\frac{\Delta q_i}{q_i}\right)^2.$$
The term $\Delta^2 q_i$ is the second order infinitesimal change, or acceleration. Thus, $\Delta q_i I_i = q_i\Delta m_i = \Delta^2 q_i - (\Delta q_i)^2/q_i$ expresses how the changing frame of reference, arising from changed frequencies, leads to inertial forces that are accelerations.
We can now write d’Alembert’s principle under the conservation of total probability solely in terms of the probabilities, or frequencies, as
$$\sum_i \Delta q_i\left(\frac{\Delta q_i}{q_i} + \frac{\Delta^2 q_i}{\Delta q_i} - \frac{\Delta q_i}{q_i}\right) = 0.$$
Distributing the virtual displacement, $\Delta q_i$, across the parentheses in the sum and splitting the sum into direct and inertial components yields
$$\sum_i \frac{(\Delta q_i)^2}{q_i} + \sum_i \left(\Delta^2 q_i - \frac{(\Delta q_i)^2}{q_i}\right) = 0.$$
The sum of $\Delta q_i$ is zero because $\sum_i q_i = 1$ by conservation of total probability, and thus the accelerations, $\Delta^2 q_i$, also sum to zero. However, in a particular dimension, there may be an imbalance between direct and inertial force, $F_i + I_i \ne 0$. That imbalance arises because the force of constraint on total probability differs across dimensions.
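The balance between direct and inertial components can be illustrated with three successive snapshots of a population. The sketch below assumes hypothetical frequencies (illustrative values only). It also assumes the exact Price form for the inertial term, $q'_i \Delta m_i$, rather than the small-change form $q_i \Delta m_i$, so that the identities hold to rounding error; the two differ only by the second order term.

```python
# Three hypothetical snapshots of frequencies (assumptions, not from the article).
q0 = [0.5, 0.3, 0.2]
q1 = [0.52, 0.295, 0.185]
q2 = [0.535, 0.29, 0.175]

dq0 = [b - a for a, b in zip(q0, q1)]     # Delta q at the first step
dq1 = [b - a for a, b in zip(q1, q2)]     # Delta q at the second step
d2q = [b - a for a, b in zip(dq0, dq1)]   # second difference, the "acceleration"

m0 = [d / a for a, d in zip(q0, dq0)]     # fitness at the first step
m1 = [d / a for a, d in zip(q1, dq1)]     # fitness at the second step
dm = [b - a for a, b in zip(m0, m1)]      # Delta m

# Virtual work of the direct forces in each dimension: Delta q_i * m_i.
direct = [d * m for d, m in zip(dq0, m0)]
# Virtual work of the inertial forces in each dimension (exact form): q'_i * Delta m_i.
inertial = [qn * x for qn, x in zip(q1, dm)]

# Net work per dimension equals the acceleration in that dimension.
net = [a + b for a, b in zip(direct, inertial)]

print(sum(direct) + sum(inertial))   # total balance, zero up to rounding
print(net, d2q)                      # per-dimension imbalance
```

In this sketch the total over dimensions vanishes, while each dimension retains a residual equal to its second difference in frequency. That residual is the per-dimension imbalance attributed in the text to the force of constraint.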
9. Unitary Coordinates and Path Lengths
From Equations (5) and (6), we may express d’Alembert’s balance between the total direct and inertial components as
$$\sum_i \frac{(\Delta q_i)^2}{q_i} - \sum_i \frac{(\Delta q_i)^2}{q_i} = 0. \tag{7}$$
The $(\Delta q_i)^2/q_i$ terms can be understood as distances by considering the curvature caused by the constraining force of the conservation of total probability. To get a proper sense of distance in that curved geometric space, we need to change the coordinates.
Let the new coordinates be $u_i = \sqrt{q_i}$. Then the total Euclidean length of the vector $\mathbf{u}$ is the square root of the sum of squares in each dimension, which is
$$\lVert\mathbf{u}\rVert = \sqrt{\sum_i u_i^2} = \sqrt{\sum_i q_i} = 1.$$
Vector lengths in the new coordinates are always one, which provides a pure expression of the conservation of total probability. In general, the $q_i$ may be arbitrary weightings, such that $\sum_i q_i$ is conserved, and thus the length $\lVert\mathbf{u}\rVert$ is constant. Here, I focus on conserved probability, in which the $q_i$ are positive and sum to one.
The path lengths of motion take on simple interpretations in terms of distance in the unitary coordinates. The transformed coordinates yield
$$\sum_i \frac{(\Delta q_i)^2}{q_i} = 4\sum_i (\Delta u_i)^2,$$
which shows the simple Euclidean interpretation of squared distance in the $\mathbf{u}$ coordinates as a sum of squared differences. This expression of distance is also equivalent to the Fisher information metric [10,11]. However, geometry is perhaps more fundamental than information, because the distance arises inevitably from curvature of paths caused by analyzing probability displacement subject to unitary conservation of total probability.
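The change to square-root coordinates can be illustrated numerically. The following sketch assumes hypothetical frequencies and small changes (illustrative values only, with `u` as an assumed name for the coordinates $\sqrt{q_i}$). It checks that the vector of square roots has unit length and that four times the squared Euclidean displacement in those coordinates approximates $\sum_i (\Delta q_i)^2/q_i$ when changes are small.

```python
import math

# Hypothetical illustrative values (assumptions, not from the article).
q = [0.5, 0.3, 0.2]
dq = [0.002, -0.001, -0.001]
q_new = [a + d for a, d in zip(q, dq)]

# Square-root ("unitary") coordinates before and after the change.
u = [math.sqrt(a) for a in q]
u_new = [math.sqrt(a) for a in q_new]

# Euclidean length of the coordinate vector, which equals one.
length = math.sqrt(sum(x * x for x in u))

# Squared distance in the new coordinates, scaled by four.
dist_u = 4 * sum((b - a) ** 2 for a, b in zip(u, u_new))
# The corresponding expression in the original frequency coordinates.
dist_q = sum(d * d / a for a, d in zip(q, dq))

print(length, dist_u, dist_q)
```

The two distance expressions agree to first order in $\Delta q_i/q_i$; the relative gap in this sketch is well under one percent and shrinks as the changes become smaller.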
10. Geometry
This section briefly reviews the geometry of frequency change dynamics that follows from two assumptions. The first assumption is that direct force, $m_i$, causes exponential growth
$$q'_i = q_i e^{m_i}.$$
This growth expression establishes a natural logarithmic scaling for comparing frequencies, because
$$m_i = \log\frac{q'_i}{q_i} = \Delta\log q_i.$$
When changes are small, $\Delta\log q_i \approx \Delta q_i/q_i$. We could interpret those changes with respect to $\log q_i$ as entropy or information. But the geometry of force and growth may be a better way to think about the fundamental nature of these expressions.
The second assumption is that total frequency or probability is conserved, $\sum_i q_i = 1$. That conservation imposes a constraint on paths of change. The constraint may be expressed by the geometry of the unitary coordinates, $u_i = \sqrt{q_i}$, which yields a conserved length $\lVert\mathbf{u}\rVert = 1$. The path lengths for virtual displacements times direct or inertial forces are $\sum_i (\Delta q_i)^2/q_i = 4\sum_i (\Delta u_i)^2$. The essential geometry arising from growth and from conservation of total probability sets the form of the distances.
11. Canonical Coordinates and Conservation in Each Dimension
Hamiltonian expressions in canonical coordinates often provide the deepest insight into the symmetries of a system [13]. To obtain the Hamiltonian, the use of $u_i = \sqrt{q_i}$ coordinates was a first step, because we can rewrite d’Alembert’s principle in Equation (7) as
$$4\sum_i (\Delta u_i)^2 - 4\sum_i (\Delta u_i)^2 = 0.$$
However, the net balance only applies to the total system rather than separately in each dimension. If we can find the proper canonical coordinates, then the forces of constraint will appear independently in each dimension, and the balance of direct and inertial forces will also appear independently in each dimension.
In a Hamiltonian formulation, we assign two values to each component, usually considered as position and momentum [13]. In our nondimensional system, our primary factor is the conservation of total probability, which we express through the unitary coordinates $u_i = \sqrt{q_i}$, such that the length of $\mathbf{u}$ is always one
$$\lVert\mathbf{u}\rVert^2 = \sum_i u_i^2 = \sum_i q_i = 1.$$
If, for each point, we take
for position and
for momentum, then
, and the conserved Hamiltonian is
This expression satisfies the requirements for Hamiltonian canonical coordinates of position and momentum, which are that
and
. The differential of the Hamiltonian often provides a useful expression
which, in each separate dimension, is zero
because
, and
thus we can write the Hamiltonian in each dimension as
Here, the curvature from the force of constraint is divided into equal and opposite contributions in the direct and inertial force components, recovering a Newtonian
perspective independently in each dimension.
We can rewrite Equation (
8) as a d’Alembert’s principle expression
for virtual displacement
, direct force
, and inertial force
. The symbol “⊙” denotes element-wise multiplication of vectors, and dot products distribute over parentheses. Thus,
, with the Newtonian equality
satisfied in each dimension.
12. Coordinates for Quantities Correlated with Force
We can analyze any quantitative system property by transforming coordinates. We start with the general results for the conservation of total probability and information momentum, $\Delta\bar{m} = \Delta\mathbf{q}\cdot\mathbf{m} + \mathbf{q}\cdot\Delta\mathbf{m} = 0$. We then obtain an expression for the change in the system quantity, $\Delta\bar{z}$, by the change in coordinates $\mathbf{m} \to \mathbf{z}$, in which the different coordinates now have an arbitrary relation rather than the earlier equivalence. That change in coordinates generalizes the $\Delta\bar{m}$ form of the Price equation (Equation (2)), to give the change in the average value of $z$ as
$$\Delta\bar{z} = \Delta\mathbf{q}\cdot\mathbf{z} + \mathbf{q}\cdot\Delta\mathbf{z}.$$
The $z_i$ values are the averages of $z$ in each dimension, $i$. Because $z$ can be any quantity, calculated in any way, this equation gives the most general expression for $\Delta\bar{z}$, the change in the average of $z$. One can think of $\bar{z}$ as a functional of the arbitrary function, $z$, that maps $i \to z_i$. The only restriction on the expression for $\Delta\bar{z}$ shown here is that changes be small. For large changes, the exact form of the Price equation in Equation (1) should be used.
We can relate
to
by writing the change in coordinates,
and
, as the regression equations
in which the regression coefficients, β, are obtained by minimizing the length of the “error” vector. To analyze the length of the error vector, we can use standard identities from the theory of least squares for regression [14].
In particular, the first regression equation follows from choosing
to minimize
, in which
denotes a
weighted vector. Choosing
to minimize the length of
leads to
, because the minimum length of
occurs when that vector is orthogonal to
. Note that
, thus
In the equation for
, minimizing
sets
. We also have, by standard theory,
.
Using these identities,
from which we obtain the change
in terms of the original coordinates for
as
the right expression arising from the fact that
. The total change,
, is split into the virtual work term,
, and the inertial force term,
. The regression coefficients rescale coordinates
.
If
is a conserved quantity, or the system is at an equilibrium with respect to
, then
. We can write a d’Alembert form
which, when
, implies
, and the d’Alembert equality holds separately in each dimension. In this case, the dynamics of
are influenced by both the conservation of probability and by additional constraints set by the conservation of
. We may, of course, choose the changing reference frame,
, such that
, in which case the direct and inertial forces do not completely balance.
13. The Fundamental Theorem
We may set $\mathbf{q}\cdot\Delta\mathbf{z} = 0$, either because the changing value of $z$ is unaffected by the changing reference frame, or because the effects of the changing reference frame are ignored by assumption. We then have an expression for the partial change caused by the direct forces, holding constant the frame of reference
$$\Delta_s\bar{z} = \Delta\mathbf{q}\cdot\mathbf{z} = \sum_i q_i m_i z_i = \mathrm{Cov}(m, z),$$
in which the $s$ subscript emphasizes that this is a partial change ascribed to the direct forces, or the forces of selection, and the covariance form follows from $\Delta q_i = q_i m_i$ and $\bar{m} = 0$. This form includes, as special cases, Fisher’s fundamental theorem of natural selection, the breeder’s equation of genetics, and other common expressions for the change in populations caused by natural selection.
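Note that $\Delta\mathbf{q}\cdot\mathbf{m} = \mathrm{Var}(m)$, the variance of $m$, because
$$\Delta\mathbf{q}\cdot\mathbf{m} = \sum_i q_i m_i^2,$$
which is the variance of $m$, because $\bar{m} = 0$.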
If we take $\mathbf{z} \equiv \mathbf{m}$ in order to study the change in fitness caused by the direct forces, then the change in mean fitness caused by selection, $\Delta_s\bar{m} = \Delta\mathbf{q}\cdot\mathbf{m}$, is the variance in fitness, $\mathrm{Var}(m)$. Fisher was interested in the transmissible change in $m$ associated with genetic factors, $g$, thus he partitioned fitness as $m = g + \delta$. Here, the genetic factors are partial regressions associated with particular genes, such that $g$ is chosen to maximize the amount of the total variance in fitness, $\mathrm{Var}(m)$, associated with the transmissible genes [4,9,15,16]. The $\delta$ terms are residuals in the regression, such that one gets the additive partition of total variance from classical regression theory as $\mathrm{Var}(m) = \mathrm{Var}(g) + \mathrm{Var}(\delta)$.
The change in fitness caused by the direct forces can now be written as
$$\Delta_s\bar{m} = \mathrm{Var}(m) = \mathrm{Var}(g) + \mathrm{Var}(\delta),$$
and thus the transmissible change in fitness caused by natural selection and associated with genetic factors is
$$\Delta_g\bar{m} = \mathrm{Var}(g),$$
in which $\mathrm{Var}(g)$ is the variance in the transmissible effects of the genetic factors on fitness, or the genetic variance in fitness. That partial change in fitness caused by direct forces and associated with transmissible factors is what Fisher emphasized in his fundamental theorem of natural selection. By defining the genetic factors, $g$, as the only direct forces of interest, the residual forces of selection, $\delta$, are added to the other inertial forces that define the changing frame of reference.
In models of evolutionary change, Fisher chose to ascribe the direct force of change associated with $g$ to natural selection, and all other forces to the inertial frame that he called environmental causes. That d’Alembert interpretation of the split between direct and inertial forces provides a clear way in which to understand Fisher’s fundamental theorem of natural selection. There is, of course, an arbitrary aspect to such a partition, because the split between direct and inertial forces depends entirely on how one chooses to define the frames of reference. For example, a change in how one defines the set of potentially transmissible factors, $g$, alters how one splits forces between direct and inertial components [15].
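The variance partition behind the fundamental theorem can be illustrated with a small numerical sketch. The example below assumes hypothetical frequencies and frequency changes for three types, and a single hypothetical predictor `x` (for example, the number of copies of an allele) standing in for the genetic factors. Fisher’s analysis uses partial regressions on many genes; the single-predictor weighted regression here is a simplifying assumption, and all names and numbers are illustrative.

```python
def wmean(w, v):
    """Frequency-weighted mean of v."""
    return sum(wi * vi for wi, vi in zip(w, v))

def wvar(w, v):
    """Frequency-weighted variance of v."""
    mu = wmean(w, v)
    return sum(wi * (vi - mu) ** 2 for wi, vi in zip(w, v))

# Hypothetical illustrative values (assumptions, not from the article).
q = [0.25, 0.5, 0.25]        # frequencies of three types
dq = [-0.01, 0.002, 0.008]   # frequency changes, summing to zero
x = [0.0, 1.0, 2.0]          # hypothetical predictor, e.g. allele copies

# Fitness as the relative change in frequency.
m = [d / a for a, d in zip(q, dq)]

# Weighted least-squares regression of fitness on the predictor.
m_bar, x_bar = wmean(q, m), wmean(q, x)
cov_mx = sum(qi * (mi - m_bar) * (xi - x_bar) for qi, mi, xi in zip(q, m, x))
beta = cov_mx / wvar(q, x)

g = [m_bar + beta * (xi - x_bar) for xi in x]   # predicted, "transmissible" part
delta = [mi - gi for mi, gi in zip(m, g)]       # residual part

# Partial change in mean fitness ascribed to the direct forces: Delta q . m.
selection_change = sum(d * mi for d, mi in zip(dq, m))
var_m, var_g, var_delta = wvar(q, m), wvar(q, g), wvar(q, delta)

print(selection_change, var_m, var_g, var_delta)
```

In this sketch the partial change ascribed to the direct forces equals the total variance in fitness, which in turn splits additively into the variance of the predicted part and the variance of the residuals. Changing the predictor `x` changes that split, which mirrors the point in the text that the partition depends on how the transmissible factors are defined.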
14. Conclusions
The fundamental equations for change are identical between many laws of physics and evolutionary change by natural selection. However, the different histories of those subjects and the long and confused debates in biology about Fisher’s fundamental theorem have obscured the simple, common basis of the underlying theory.
I unified different theories by combining d’Alembert’s conceptual frame with the abstract expressions of the Price equation. That combination led to a simple and very general basis for understanding populations or aggregations, in which one can interpret total frequency or total probability as a conserved quantity. By combining conservation of total frequency with a notion of change based on exponential growth, I showed the geometric and algebraic forms of change that arise from d’Alembert’s partition of direct and inertial forces. I also provided an elegant Hamiltonian expression in canonical coordinates, which recovers the Newtonian balance of force and acceleration independently in each dimension for the corresponding direct and inertial forces of d’Alembert.
Finally, I showed that arbitrary system quantities, such as biological traits, or any total system quantity such as energy, can be interpreted through two steps. First, begin with the universal results that arise from conservation of total probability and the notion of change as exponential growth. Second, apply a simple coordinate transformation between frequency change and system quantities to obtain general expressions for the change in system quantities.