Thermodynamic and Differential Entropy under a Change of Variables

The differential Shannon entropy of information theory can change under a change of variables (coordinates), but the thermodynamic entropy of a physical system must be invariant under such a change. This difference is puzzling, because the Shannon and Gibbs entropies have the same functional form. We show that a canonical change of variables can, indeed, alter the spatial component of the thermodynamic entropy just as it alters the differential Shannon entropy. However, there is also a momentum part of the entropy, which turns out to undergo an equal and opposite change when the coordinates are transformed, so that the total thermodynamic entropy remains invariant. We furthermore show how one may correctly write the change in total entropy for an isothermal physical process in any set of spatial coordinates.


Introduction
The Gibbs entropy of classical statistical thermodynamics is, apart from some non-essential constants, the differential Shannon entropy [2] of the probability density function (pdf) in the phase space of the system under consideration. However, whereas the thermodynamic entropy is not expected to depend upon the choice of variables, the differential entropy can be changed by a transformation of variables. In particular, the differential entropy of a spatial pdf depends on the choice of the coordinates used to describe the spatial configuration of the system. Moreover, a change of variables can change not only the absolute differential entropy, but also its change on a change in the pdf, as shown below. This sensitivity to coordinates appears paradoxical, since a physically meaningful quantity ought to be independent of the choice of spatial coordinates. A similar concern was previously expressed in a critique of the concept of the differential entropy itself [3].
Here, we demonstrate that, for the thermodynamic entropy, a transformation of the spatial coordinates is accompanied by a compensating change of the entropy of the canonically conjugate momenta so that the full thermodynamic entropy remains invariant. The invariance of the full entropy stems from the fact that the Jacobian of a canonical transformation equals unity, and explicit demonstration of this invariance yields a simple formula for correcting a spatial entropy computed with transformed coordinates to yield correct full entropy changes. These results have application in calculations of spatial entropy from molecular simulations when Cartesian coordinates are transformed, for example to bond-angle-torsion [4] coordinates.
The paper is structured as follows. Section 2 reviews the formalism of entropy in classical statistical thermodynamics, defines a splitting of the full thermodynamic entropy into momentum and spatial parts for the case in which the spatial coordinates used are Cartesian, and shows that the change in the spatial entropy then equals the change in the total entropy for an isothermal process. (Note that the spatial entropy of the solute part of a solute-solvent system is often termed the solute's configurational entropy.) Sections 3 and 4 investigate the effects on the spatial and momentum entropy, respectively, of a transformation of the Cartesian coordinates to general spatial coordinates. Section 5 discusses how one may evaluate the change in the full thermodynamic entropy due to an isothermal process in terms of the change in the spatial entropy evaluated in non-Cartesian coordinates. Finally, Section 6 draws conclusions.

Spatial Entropy in Cartesian Coordinates
In classical statistical thermodynamics, the entropy $S$ of a system described by coordinates $q_1, \ldots, q_s$ and the canonically conjugate momenta $p_1, \ldots, p_s$, briefly $q$ and $p$, respectively, is given in terms of the system's pdf $\rho(p, q)$ in the phase space $(p, q)$ as [5]
$$S = -k_B \int dp\, dq\, \rho(p, q) \ln\left[h^s \rho(p, q)\right]. \tag{1}$$
Here, $k_B$ is Boltzmann's constant, and the factor $h^s$, where $h = 2\pi\hbar$ is Planck's constant (quasi-classically, the number of states in a volume element $\Delta p\, \Delta q$ of the phase space $(p, q)$ is $\Delta p\, \Delta q / h^s$; see, e.g., [6]), ensures that the argument of the logarithm is dimensionless. The probability density function $\rho(p, q)$ is given by the Boltzmann-Gibbs distribution
$$\rho(p, q) = \frac{e^{-\beta E(p, q)}}{h^s Z}, \tag{2}$$
where $\beta = 1/(k_B T)$, with $T$ being the absolute temperature, $E(p, q)$ is the system's energy, and
$$Z = \frac{1}{h^s} \int dp\, dq\, e^{-\beta E(p, q)} \tag{3}$$
is the partition function, which is the distribution's normalization constant divided by $h^s$ to make it dimensionless. The entropy $S$ can be written in terms of the partition function $Z$ as
$$S = k_B \ln Z + k_B \beta \langle E \rangle, \tag{4}$$
where
$$\langle E \rangle = \int dp\, dq\, \rho(p, q)\, E(p, q) \tag{5}$$
is the mean (expectation) value of the energy $E(p, q)$.
Let us assume that the coordinates $q$ are Cartesian. Then the energy $E(p, q)$ is the sum of a kinetic energy, $K$, which is a function of the momenta $p$ only, $K(p) = \sum_{i=1}^{s} p_i^2 / (2 m_i)$, where $m_i$ is the mass associated with degree of freedom $i$, and a potential energy $U = U(q)$, which depends only on the spatial coordinates. The probability distribution (2) and the partition function (3) now factorize as
$$\rho(p, q) = \rho_m(p)\, \rho_s(q), \tag{6}$$
$$Z = Z_m Z_s, \tag{7}$$
where
$$\rho_m(p) = \frac{e^{-\beta K(p)}}{h^s Z_m}, \tag{8}$$
$$\rho_s(q) = \frac{e^{-\beta U(q)}}{Z_s} \tag{9}$$
are the momentum and spatial probability distributions, with normalization constants
$$Z_m = \frac{1}{h^s} \int dp\, e^{-\beta K(p)}, \tag{10}$$
$$Z_s = \int dq\, e^{-\beta U(q)}, \tag{11}$$
which can be termed, respectively, the momentum and spatial partition functions. It should be noted that neither of these partition functions is dimensionless; the treatment of the factor $1/h^s$ is discussed later in this section.
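Accordingly, the full entropy (1) can now be separated as
$$S = S_m + S_s, \tag{12}$$
where $S_m$ is a momentum entropy, which can be evaluated in closed form,
$$S_m = -k_B \int dp\, \rho_m(p) \ln\left[h^s \rho_m(p)\right] = \frac{k_B}{2} \sum_{i=1}^{s} \left[1 + \ln \frac{2\pi m_i k_B T}{h^2}\right], \tag{13}$$
and $S_s$ is a spatial entropy,
$$S_s = -k_B \int dq\, \rho_s(q) \ln \rho_s(q). \tag{14}$$
Similarly to (4), the spatial entropy $S_s$ can be written as
$$S_s = k_B \ln Z_s + k_B \beta \langle U \rangle, \tag{15}$$
where
$$\langle U \rangle = \int dq\, \rho_s(q)\, U(q) \tag{16}$$
is the mean value of the potential energy $U(q)$.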
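As an illustration of the equivalence between the integral form (14) and the partition-function form (15) of the spatial entropy, the following minimal Python sketch evaluates both on a grid for a one-dimensional potential. The harmonic test potential, the grid limits, the choice of units with $k_B = 1$, and the helper names `trapezoid` and `spatial_entropy_both_ways` are assumptions made only for this example.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def spatial_entropy_both_ways(U, q, beta, kB=1.0):
    """Return S_s computed from Eq. (14) and from Eq. (15) for a 1D potential U(q)."""
    boltz = np.exp(-beta * U(q))
    Zs = trapezoid(boltz, q)                      # spatial partition function, Eq. (11)
    rho = boltz / Zs                              # spatial pdf, Eq. (9)
    # rho * ln(rho) is taken as 0 wherever rho has underflowed to 0
    safe_rho = np.where(rho > 0, rho, 1.0)
    S_direct = -kB * trapezoid(rho * np.log(safe_rho), q)   # Eq. (14)
    U_mean = trapezoid(rho * U(q), q)             # mean potential energy, Eq. (16)
    S_from_Z = kB * (np.log(Zs) + beta * U_mean)  # Eq. (15)
    return S_direct, S_from_Z

# Hypothetical test case: harmonic well U = k q^2 / 2 (arbitrary units)
k_spring, beta = 4.0, 1.0
q = np.linspace(-10.0, 10.0, 20001)
S14, S15 = spatial_entropy_both_ways(lambda x: 0.5 * k_spring * x**2, q, beta)

# Closed-form differential entropy of a Gaussian with variance 1/(beta*k)
S_exact = 0.5 * np.log(2.0 * np.pi * np.e / (beta * k_spring))
```

For this Gaussian case both numerical values should agree with each other and with the closed-form Gaussian entropy to within the quadrature error.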
A factor of $h^s$ is included in the definition of the momentum entropy (13) so that the full entropy $S_m + S_s$ has the correct physical dimensions of energy/temperature. The association of this factor with the momentum entropy is arbitrary (as was its inclusion in the momentum partition function (10)); it could instead have been included in the spatial entropy. Unfortunately, the factor cannot be split so that the arguments of the logarithms in both parts of the full entropy are dimensionless. Thus, neither of the "partial" entropies $S_m$ and $S_s$ is correctly dimensioned. Nonetheless, the troubling dimensions cancel for differences in $S_m$ and $S_s$ arising from changes in the pdf, since the relevant terms appear in the arguments of logarithms, and so such differences are physically meaningful.
It is evident from Equation (13) that the momentum entropy does not depend upon the spatial pdf $\rho_s(q)$, but only upon the momentum pdf $\rho_m(p)$, which in turn depends only on the atomic masses and the temperature. As a consequence, for an isothermal physical process with a fixed set of atoms, the change in total entropy equals the change in spatial entropy:
$$\Delta S = \Delta S_s. \tag{17}$$
More generally, this equation holds in any coordinate system for which the kinetic energy $K$ is independent of the spatial coordinates, $K = K(p)$.
Note that the spatial entropy defined here is akin to the configurational entropy. However, the latter usually refers specifically to the entropy associated with the conformational fluctuations of a molecule in solution, and is therefore exclusive of the solvent entropy [9]. The spatial entropy is more general, as it may refer to the whole system or any of its parts.

Spatial Entropy under a Coordinate Transformation
It is sometimes of interest to compute the change in total entropy, $\Delta S$, associated with an isothermal molecular process, such as protein-ligand binding or protein folding. Equation (17) shows that the change in total entropy can be obtained by computing the change in the spatial entropy in Cartesian coordinates. However, Cartesian coordinates are not always optimal for this purpose, because the pdf in Cartesian coordinates includes many coordinate dependencies that are rather easily removed by transforming to more natural coordinates, such as bond-angle-torsion (BAT) coordinates [4]. For example, even the Cartesian coordinates of a single atom, $(x_i, y_i, z_i)$, can be strongly correlated with each other, due to the natural tendency of each atom to move in a circular trajectory corresponding to a bond rotation. This motion is readily captured by a single torsional variable. For this reason, a transformation from Cartesian coordinates to suitably defined internal coordinates of the molecular system under consideration, plus the coordinates of the translation and rotation of the system as a whole, is often performed.
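A transformation from Cartesian coordinates $q$ to new coordinates $Q(q)$ (with an inverse $q = q(Q)$) transforms the probability density function (pdf) $\rho_s(q)$ into a pdf $\tilde{\rho}_s(Q)$ of the new coordinates $Q$ according to a rule of general probability theory as
$$\tilde{\rho}_s(Q) = J(Q)\, \rho_s(q(Q)), \tag{18}$$
where $J(Q) = |\det(\partial q / \partial Q)|$ is the Jacobian of the transformation. The differential Shannon entropy $H_q$ of information theory, defined as [2]
$$H_q = -\int dq\, \rho_s(q) \ln \rho_s(q), \tag{19}$$
can be written in terms of the new coordinates as
$$H_q = H_Q + \langle \ln J(Q) \rangle, \tag{20}$$
where
$$H_Q = -\int dQ\, \tilde{\rho}_s(Q) \ln \tilde{\rho}_s(Q) \tag{21}$$
is the Shannon entropy of the transformed pdf $\tilde{\rho}_s(Q)$ and
$$\langle \ln J(Q) \rangle = \int dQ\, \tilde{\rho}_s(Q) \ln J(Q) \tag{22}$$
is the expectation value of the logarithm of the Jacobian $J(Q)$.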
Based upon Equation (20), the spatial part of the thermodynamic entropy (14) in Cartesian coordinates $q$, $S_s = k_B H_q$, transforms to a different value on changing to internal coordinates $Q$:
$$S_s = \tilde{S}_s + k_B \langle \ln J(Q) \rangle, \tag{23}$$
where $\tilde{S}_s = k_B H_Q$. This result may seem troubling, since the thermodynamic entropy of a physical system should not depend on the choice of the spatial coordinates used. Nonetheless, this Jacobian correction is real and is not even guaranteed to cancel when one computes the difference between two entropies associated with a change in the potential function and a consequent change in the physical pdf from $\rho$ to some $\rho'$. That is, the change $\Delta S_s$ in the Cartesian spatial entropy $S_s$ arising from a change in the pdf of $q$ does not necessarily equal the change $\Delta \tilde{S}_s$ in the transformed spatial entropy $\tilde{S}_s$ arising from the corresponding change in the transformed pdf of $Q$. This is because the Jacobian can vary with $Q$, so its contribution to the entropy for $\rho$, given by (22), can differ from that for $\rho'$,
$$\langle \ln J(Q) \rangle' = \int dQ\, \tilde{\rho}'_s(Q) \ln J(Q). \tag{24}$$
Hence, in general,
$$\Delta S_s \neq \Delta \tilde{S}_s. \tag{25}$$
We note that a similar treatment of the transformation of spatial entropy under a change of coordinates has been given in [7].
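Relations (20) and (25) can be checked numerically on a toy case. The sketch below is an illustration under stated assumptions, not part of the original analysis: it takes an isotropic two-dimensional Gaussian pdf of width `sigma` in Cartesian coordinates, transforms it to polar coordinates $(r, \phi)$ with Jacobian $J = r$, and evaluates $H_Q$ and $\langle \ln J \rangle$ by quadrature. The function names and grid parameters are hypothetical.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def polar_entropy_terms(sigma, r_max_factor=12.0, n=200001):
    """H_Q and <ln J> for an isotropic 2D Gaussian expressed in polar (r, phi)."""
    r = np.linspace(1e-12, r_max_factor * sigma, n)
    # Transformed pdf, Eq. (18): rho_tilde(r, phi) = J * rho_s with J = r
    rho_t = r * np.exp(-r**2 / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    # phi is uniform on [0, 2*pi), so the phi integral gives a factor 2*pi
    H_Q = -2 * np.pi * trapezoid(rho_t * np.log(rho_t), r)    # Eq. (21)
    mean_lnJ = 2 * np.pi * trapezoid(rho_t * np.log(r), r)    # Eq. (22)
    return H_Q, mean_lnJ

def cartesian_entropy(sigma):
    """Closed-form H_q of an isotropic 2D Gaussian, Eq. (19)."""
    return float(np.log(2 * np.pi * np.e * sigma**2))

sigma1, sigma2 = 1.0, 2.0          # pdf before and after a hypothetical "process"
H_Q1, lnJ1 = polar_entropy_terms(sigma1)
H_Q2, lnJ2 = polar_entropy_terms(sigma2)

dH_q = cartesian_entropy(sigma2) - cartesian_entropy(sigma1)  # Cartesian change
dH_Q = H_Q2 - H_Q1                                            # polar change
d_lnJ = lnJ2 - lnJ1                                           # Jacobian correction
```

In this example the Cartesian entropy change is $2 \ln 2$, whereas the polar-coordinate change is only $\ln 2$; the missing $\ln 2$ is supplied by the change in $\langle \ln J \rangle$, so the Jacobian term does not cancel in the difference.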

Momentum Entropy under a Coordinate Transformation
The coordinate transformation $q \to Q = Q(q)$ is called a point transformation because the new coordinates $Q$ are functions only of the old coordinates $q$; that is, they do not involve the old conjugate momenta $p$. (A general point transformation may also involve an explicit dependence on time.) A point transformation of the spatial coordinates is associated with a canonical transformation of the full phase-space coordinate system, i.e., of both the spatial and momentum coordinates, such that $p, q \to P = P(p, q),\ Q = Q(q)$, with the inverses $p = p(P, Q)$, $q = q(Q)$. In a canonical transformation, the new variables $P, Q$ remain canonically conjugate, which means that
$$P_i = \frac{\partial L}{\partial \dot{Q}_i}, \quad i = 1, \ldots, s, \tag{26}$$
where $L = L(Q, \dot{Q}, t)$, with $\dot{Q} = dQ/dt$, is the system's Lagrangian expressed in terms of $Q$. Classical statistical thermodynamics is based on the Hamiltonian formalism of mechanics, and property (26) ensures that the Hamilton equations in terms of $P, Q$ retain their canonical form. Here, an important property of a canonical transformation $p, q \to P, Q$ is that its Jacobian equals unity [8]. This means that the phase-space volume element of an integration in the full phase space is invariant:
$$dp\, dq = dP\, dQ. \tag{27}$$
A point transformation $q \to Q = Q(q)$ itself has in general a Jacobian $J(Q) \neq 1$, and so $dq = J(Q)\, dQ$. Thus, for invariance (27) to hold, the momentum volume element $dp$ must transform in a canonical transformation as
$$dp = \frac{dP}{J(Q)}. \tag{28}$$
Using (27), we can write the full entropy (1) in terms of an integral in the new phase-space variables $P, Q$ simply as
$$S = -k_B \int dP\, dQ\, \tilde{\rho}(P, Q) \ln\left[h^s \tilde{\rho}(P, Q)\right], \tag{29}$$
where
$$\tilde{\rho}(P, Q) = \rho(p(P, Q), q(Q)) = \frac{e^{-\beta [K(P, Q) + U(Q)]}}{h^s Z} \tag{30}$$
is the pdf of the new canonical variables $P, Q$. Note that because in general $p = p(P, Q)$, the kinetic energy $K$ is now a function of not only the momenta $P$, but also of the non-Cartesian coordinates $Q$: $K = K(P, Q)$. Like any joint pdf, (30) may be factorized by means of the product rule as
$$\tilde{\rho}(P, Q) = \tilde{\rho}_m(P|Q)\, \tilde{\rho}_s(Q), \tag{31}$$
where
$$\tilde{\rho}_s(Q) = \int dP\, \tilde{\rho}(P, Q) \tag{32}$$
is the marginal pdf of the coordinates $Q$, and
$$\tilde{\rho}_m(P|Q) = \frac{\tilde{\rho}(P, Q)}{\tilde{\rho}_s(Q)} \tag{33}$$
is the conditional pdf of $P$ given $Q$.
Using (30), (28) and (10), it can be verified that the marginal pdf (32) equals the spatial probability density (18) that was introduced in Section 3 under the same symbol:
$$\begin{aligned} \tilde{\rho}_s(Q) &= \frac{e^{-\beta U(Q)}}{h^s Z} \int dP\, e^{-\beta K(P, Q)} \\ &= \frac{J(Q)\, e^{-\beta U(Q)}}{h^s Z} \int dp\, e^{-\beta K(p)} \\ &= \frac{Z_m}{Z}\, J(Q)\, e^{-\beta U(Q)}. \end{aligned} \tag{34}$$
Here, the transformation $dP = J(Q)\, dp$ was performed, which reverts the kinetic energy $K(P, Q)$ to being a function of the Cartesian momenta $p$ only, $K = K(p)$, since this transformation is the inverse of that which made the kinetic energy $K(p)$ a function of both $P$ and $Q$. (The transformation $P \to p = p(P, Q)$ has an inverse $P = P(p, Q)$ that is such that $K(P(p, Q), Q)$ is in fact a function of $p$ only.) Since $Z_m / Z = 1/Z_s$ [see Equation (7)], the last line of (34) indeed yields the spatial pdf (18).
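Using (31) and the fact that, like any conditional pdf, the distribution (33) is properly normalized (i.e., $\int dP\, \tilde{\rho}_m(P|Q) = 1$ at any $Q$), the full entropy (29) can be written as
$$S = k_B \langle H_{P|Q} \rangle + k_B H_Q, \tag{35}$$
where
$$\langle H_{P|Q} \rangle = \int dQ\, \tilde{\rho}_s(Q)\, H_{P|Q}(Q) \tag{36}$$
is the mean (expectation) value with respect to $Q$ of the Shannon entropy of the conditional pdf $\tilde{\rho}_m(P|Q)$,
$$H_{P|Q}(Q) = -\int dP\, \tilde{\rho}_m(P|Q) \ln\left[h^s \tilde{\rho}_m(P|Q)\right], \tag{37}$$
and $H_Q$ is the Shannon entropy (21) of $\tilde{\rho}_s(Q)$. In analogy with (12), Equation (35) in turn can be written as
$$S = \tilde{S}_m + \tilde{S}_s, \tag{38}$$
where
$$\tilde{S}_m = k_B \langle H_{P|Q} \rangle \tag{39}$$
may be termed the momentum entropy associated with the new momenta $P$, and
$$\tilde{S}_s = k_B H_Q \tag{40}$$
is the spatial entropy in the new spatial coordinates $Q$. Equations (12) and (38), together with (23), straightforwardly yield the transformation of the momentum entropy:
$$\tilde{S}_m = S_m + k_B \langle \ln J(Q) \rangle. \tag{41}$$
This relation complements the transformation (23) of the spatial entropy so that, on a point transformation $q \to Q = Q(q)$, a change in the spatial entropy is compensated by a change in the momentum entropy and the full entropy remains invariant: $S_m + S_s = \tilde{S}_m + \tilde{S}_s$.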
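The compensation expressed by (41) can be made concrete for a single particle in a plane described by polar coordinates, where $K(P, Q) = P_r^2/(2m) + P_\phi^2/(2 m r^2)$ and $J = r$. The following Python sketch is a toy check under these assumptions, in units with $h = 1$ and $k_B = 1$; the mass, inverse temperature, radii, and function names are hypothetical. It evaluates the conditional momentum entropy (37) by quadrature at several radii and compares it with the Cartesian momentum entropy.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def gaussian_momentum_entropy(variance, h=1.0, n=40001):
    """-int dP rho ln(h rho) for a 1D Gaussian momentum pdf, by quadrature."""
    s = np.sqrt(variance)
    P = np.linspace(-12 * s, 12 * s, n)
    rho = np.exp(-P**2 / (2 * variance)) / np.sqrt(2 * np.pi * variance)
    return -trapezoid(rho * np.log(h * rho), P)

m, beta = 1.5, 0.7                      # hypothetical mass and inverse temperature

# Cartesian momenta (p_x, p_y): two independent Gaussians of variance m/beta
H_cart = 2 * gaussian_momentum_entropy(m / beta)          # S_m / k_B

# Polar momenta at fixed r: P_r has variance m/beta, P_phi has variance m r^2/beta
r_values = np.array([0.5, 1.0, 3.0])
H_polar = np.array([
    gaussian_momentum_entropy(m / beta)
    + gaussian_momentum_entropy(m * r**2 / beta)
    for r in r_values
])                                                        # H_{P|Q}(Q), Eq. (37)

shift = H_polar - H_cart     # expected to equal ln J(Q) = ln r at each r
```

Because the shift equals $\ln J(Q)$ at every $Q$, averaging it over $\tilde{\rho}_s(Q)$ reproduces the term $k_B \langle \ln J(Q) \rangle$ in (41) for this example.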

Entropy Changes in Non-Cartesian Coordinates
It is now easy to show how the change in the spatial entropy of a molecular system, calculated using a non-Cartesian system of coordinates, can be corrected to provide the corresponding change in the full thermodynamic entropy of the system, which is the quantity of physical interest. We consider the entropy change arising from some physicochemical process which changes the phase-space pdf of the system from $\rho$ to $\rho'$. The molecular species of interest is considered to be present at a standard concentration $C^\circ$, which corresponds to an isolated molecule in a container of volume $V^\circ = 1/C^\circ$ [9]. For an isothermal process, and with Cartesian coordinates, the momentum pdf is unchanged, so that there is no change in the momentum entropy, $S_m$. The change $\Delta S$ of the full thermodynamic entropy therefore equals the change in the Cartesian spatial entropy $S_s$:
$$\Delta S = \Delta S_s. \tag{42}$$
If non-Cartesian coordinates $Q$ are used, as in many methods for calculating the spatial entropy, a change $\Delta \tilde{S}_s$ in the entropy of the coordinates $Q$ obtained in this way can be corrected to the change $\Delta S_s$ in the Cartesian spatial entropy $S_s$ using (23):
$$\Delta S = \Delta S_s = \Delta \tilde{S}_s + k_B\, \Delta \langle \ln J(Q) \rangle. \tag{43}$$
Here, the correction term
$$\Delta \langle \ln J(Q) \rangle = \langle \ln J(Q) \rangle' - \langle \ln J(Q) \rangle \tag{44}$$
is the difference between the means, evaluated with the changed and original spatial distributions $\tilde{\rho}'_s(Q)$ and $\tilde{\rho}_s(Q)$, respectively, of the logarithm of the Jacobian of the transformation $q \to Q = Q(q)$. If molecular simulations are used for the evaluation of spatial entropy, then $\langle \ln J(Q) \rangle$ is evaluated easily as a simple arithmetic mean,
$$\langle \ln J(Q) \rangle \approx \frac{1}{n} \sum_{i=1}^{n} \ln J(Q_i), \tag{45}$$
where $Q_i$, $i = 1, \ldots, n$, is a sample of the coordinates obtained from snapshots of the simulation trajectory.
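The sample-mean estimate (45) and the correction (43) translate directly into code. The sketch below is a minimal illustration in units with $k_B = 1$; the function names are hypothetical, and the "snapshots" are synthetic draws from two-dimensional Gaussians viewed in polar coordinates ($J = r$) rather than output of a real simulation. For this toy case the polar-coordinate entropy change is known analytically to be $\ln 2$ when the Gaussian width doubles, and that value is used as the assumed input `dS_tilde`.

```python
import numpy as np

def mean_log_jacobian(log_J_samples):
    """Eq. (45): arithmetic mean of ln J(Q_i) over trajectory snapshots."""
    return float(np.mean(np.asarray(log_J_samples, dtype=float)))

def corrected_entropy_change(dS_tilde, log_J_before, log_J_after, kB=1.0):
    """Eq. (43): Delta S = Delta S_tilde + kB * (<ln J>' - <ln J>)."""
    d_lnJ = mean_log_jacobian(log_J_after) - mean_log_jacobian(log_J_before)
    return dS_tilde + kB * d_lnJ

# Synthetic stand-in for simulation snapshots (assumption, not real data)
rng = np.random.default_rng(0)
xy_before = rng.normal(scale=1.0, size=(200_000, 2))   # original pdf, sigma = 1
xy_after = rng.normal(scale=2.0, size=(200_000, 2))    # changed pdf, sigma = 2

# For polar coordinates J = r, so ln J is the log of the radial distance
lnJ_before = np.log(np.linalg.norm(xy_before, axis=1))
lnJ_after = np.log(np.linalg.norm(xy_after, axis=1))

dS_tilde = np.log(2.0)   # analytic polar-coordinate entropy change for this toy case
dS = corrected_entropy_change(dS_tilde, lnJ_before, lnJ_after)
```

The corrected value `dS` should approach the Cartesian result $2 \ln 2$ as the number of snapshots grows; in practice the statistical error of the sample mean of $\ln J$ should be assessed alongside that of the entropy estimate itself.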
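We now consider the specific case where $Q$ represents bond-angle-torsion (BAT) coordinates. For $N$ atoms, $Q$ comprises 3 external translational coordinates $r_{ex} = (x_{ex}, y_{ex}, z_{ex})$, 3 external rotational coordinates $\theta_{ex}, \phi_{ex}, \psi_{ex}$, and $3N - 6$ internal coordinates, consisting of $N - 1$ bond lengths $b = (b_2, \ldots, b_N)$, $N - 2$ bond angles $\theta = (\theta_3, \ldots, \theta_N)$, and $N - 3$ torsional angles $\phi = (\phi_4, \ldots, \phi_N)$, where the subscripts indicate the atoms to which the internal coordinates correspond. The Jacobian for this transformation is given as [4]
$$J_{BAT} = \sin\theta_{ex}\, J(b, \theta), \qquad J(b, \theta) = b_2^2 \prod_{i=3}^{N} b_i^2 \sin\theta_i, \tag{46}$$
and the spatial pdf in Cartesian coordinates $q = x$, $\rho_s(x)$, is thus transformed into the following pdf in BAT coordinates:
$$\tilde{\rho}_s(r_{ex}, \theta_{ex}, \phi_{ex}, \psi_{ex}, b, \theta, \phi) = \frac{\sin\theta_{ex}\, J(b, \theta)\, e^{-\beta U(b, \theta, \phi)}}{Z_s}, \tag{47}$$
with
$$Z_s = 8\pi^2 V^\circ \int db\, d\theta\, d\phi\, J(b, \theta)\, e^{-\beta U(b, \theta, \phi)}, \tag{48}$$
where we have used expression (9) for $\rho_s$, along with the fact that the potential energy $U(x)$ of a molecule of $N$ atoms that is not located in an external field depends only on its $3N - 6$ internal BAT coordinates $(b, \theta, \phi)$. The right-hand side of (47) may be written as a product of two factors, one depending on only the external coordinate $\theta_{ex}$ and the other on only the internal coordinates $(b, \theta, \phi)$. Therefore, the joint pdf (47) can be factorized as
$$\tilde{\rho}_s(r_{ex}, \theta_{ex}, \phi_{ex}, \psi_{ex}, b, \theta, \phi) = \tilde{\rho}_{ex}(r_{ex}, \theta_{ex}, \phi_{ex}, \psi_{ex})\, \tilde{\rho}_{in}(b, \theta, \phi), \tag{49}$$
where
$$\tilde{\rho}_{ex}(r_{ex}, \theta_{ex}, \phi_{ex}, \psi_{ex}) = \frac{\sin\theta_{ex}}{8\pi^2 V^\circ} \tag{50}$$
and
$$\tilde{\rho}_{in}(b, \theta, \phi) = \frac{8\pi^2 V^\circ\, J(b, \theta)\, e^{-\beta U(b, \theta, \phi)}}{Z_s}. \tag{51}$$
Here, $\tilde{\rho}_{ex}$, the marginal pdf of the external coordinates, is clearly normalized because
$$\int_{V^\circ} dr_{ex} \int_0^{\pi} d\theta_{ex} \sin\theta_{ex} \int_0^{2\pi} d\phi_{ex} \int_0^{2\pi} d\psi_{ex} = 8\pi^2 V^\circ, \tag{52}$$
and $\tilde{\rho}_{in}(b, \theta, \phi)$, the marginal pdf of the internal BAT coordinates, is normalized based on the definition of $Z_s$ [Equation (48)].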
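For use with the sample mean of $\ln J$ over simulation snapshots, the logarithm of the internal part of the BAT Jacobian can be computed per snapshot. The sketch below assumes the standard BAT form $J(b, \theta) = b_2^2 \prod_{i=3}^{N} b_i^2 \sin\theta_i$, with bond angles in radians; the function name and the input layout ($N - 1$ bond lengths followed by $N - 2$ bond angles) are hypothetical conventions chosen for this example.

```python
import numpy as np

def log_jacobian_internal(bonds, angles):
    """ln J(b, theta) = 2 ln b_2 + sum_{i>=3} (2 ln b_i + ln sin theta_i).

    bonds  : sequence of N-1 bond lengths (b_2, ..., b_N)
    angles : sequence of N-2 bond angles (theta_3, ..., theta_N), in radians
    """
    b = np.asarray(bonds, dtype=float)
    th = np.asarray(angles, dtype=float)
    if b.size != th.size + 1:
        raise ValueError("expected N-1 bond lengths and N-2 bond angles")
    if np.any(b <= 0) or np.any(np.sin(th) <= 0):
        raise ValueError("bond lengths must be positive and angles in (0, pi)")
    # Torsions do not enter the Jacobian, so they are not needed here
    return float(2.0 * np.sum(np.log(b)) + np.sum(np.log(np.sin(th))))

# Hypothetical three-atom example: b_2 = 2, b_3 = 1, theta_3 = 30 degrees
example = log_jacobian_internal([2.0, 1.0], [np.pi / 6])
```

Working with $\ln J$ directly, rather than forming the product and then taking the logarithm, avoids overflow or underflow for large molecules.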
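The entropy of the joint pdf (49) now separates as
$$\tilde{S}_s = \tilde{S}_{ex} + \tilde{S}_{in}, \tag{53}$$
where $\tilde{S}_{ex}$ and $\tilde{S}_{in}$ are the entropies of the marginal pdfs $\tilde{\rho}_{ex}$ and $\tilde{\rho}_{in}$, respectively. Using expression (50) for $\tilde{\rho}_{ex}$, the external entropy $\tilde{S}_{ex}$ can be written as
$$\tilde{S}_{ex} = k_B \ln(8\pi^2 V^\circ) - k_B \langle \ln(\sin\theta_{ex}) \rangle, \tag{54}$$
where $\langle \ln(\sin\theta_{ex}) \rangle$ is the mean of $\ln(\sin\theta_{ex})$ for a uniform distribution of molecular orientations. Finally, using Equations (23), (46), (53) and (54), we have for the spatial entropy in Cartesian coordinates, $S_s$:
$$S_s = k_B \ln(8\pi^2 V^\circ) + \tilde{S}_{in} + k_B \langle \ln J(b, \theta) \rangle. \tag{55}$$
In previous work [10], $S_s$ has been written in terms of the BAT coordinates $(b, \theta, \phi)$ as
$$S_s = k_B \ln(8\pi^2 V^\circ) - k_B \int db\, d\theta\, d\phi\, J(b, \theta)\, \rho(b, \theta, \phi) \ln \rho(b, \theta, \phi), \tag{56}$$
where $\rho(b, \theta, \phi)$ is a distribution function of the internal BAT coordinates that becomes the normalized marginal pdf $\tilde{\rho}_{in}(b, \theta, \phi)$ on multiplication by $J(b, \theta)$,
$$\tilde{\rho}_{in}(b, \theta, \phi) = J(b, \theta)\, \rho(b, \theta, \phi). \tag{57}$$
Expression (56) is entirely consistent with the presented formalism, as it can be rewritten in terms of the properly normalized marginal pdf $\tilde{\rho}_{in}(b, \theta, \phi)$ and $J(b, \theta)$ to give
$$S_s = k_B \ln(8\pi^2 V^\circ) - k_B \int db\, d\theta\, d\phi\, \tilde{\rho}_{in}(b, \theta, \phi) \ln \tilde{\rho}_{in}(b, \theta, \phi) + k_B \int db\, d\theta\, d\phi\, \tilde{\rho}_{in}(b, \theta, \phi) \ln J(b, \theta), \tag{58}$$
which is identical with (55). The results of [10] and other work that adopted a similar evaluation of $S_s$ as that in (56) are thus not affected by our findings.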
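The orientational average $\langle \ln(\sin\theta_{ex}) \rangle$ that appears in the external entropy has the closed-form value $\ln 2 - 1$ for uniformly distributed orientations, and it drops out once the Jacobian term is added back. The short Python check below illustrates this in units with $k_B = 1$; the numerical value used for $V^\circ$ (about 1661 Å$^3$, corresponding to a 1 mol/L standard concentration) is an assumption made only for the example.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# Marginal pdf of theta_ex for uniform orientations: sin(theta)/2 on (0, pi)
theta = np.linspace(1e-12, np.pi - 1e-12, 400001)
p_theta = 0.5 * np.sin(theta)
mean_ln_sin = trapezoid(p_theta * np.log(np.sin(theta)), theta)

V0 = 1661.0  # assumed standard volume in cubic angstroms (illustrative only)

# External entropy in BAT coordinates, in units of kB
S_ex = np.log(8 * np.pi**2 * V0) - mean_ln_sin

# Adding back the external Jacobian term <ln sin(theta_ex)> removes the
# orientational average, leaving ln(8 pi^2 V0) as the external contribution
external_total = S_ex + mean_ln_sin
```

The numerical value of `S_ex` depends on the length unit chosen for $V^\circ$, in line with the dimensional caveat about partial entropies; only entropy differences are free of this arbitrariness.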
The approaches outlined above for evaluating the Cartesian spatial entropy by means of BAT coordinates avoid potential shortcomings that may arise from approximating $J(b, \theta)$ as a constant equal to its value at the equilibrium values $b = b_0$ and $\theta = \theta_0$ of the internal BAT coordinates. In particular, although most of these coordinates are "hard" and therefore make nearly constant contributions to the Jacobian, this is by no means the case for the pseudo-bonds and pseudo-angles often used to define the position and orientation of one molecule relative to another in a noncovalent complex [11].
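One circumstance in which the Jacobian can be approximated as constant is that in which the molecule occupies only a single, reasonably narrow energy well with its local minimum at internal coordinates $b_0, \theta_0, \phi_0$. One may then make the approximation $J(b, \theta) \approx J(b_0, \theta_0)$, in which case the pdf of Equation (51) simplifies to
$$\tilde{\rho}_{in}(b, \theta, \phi) \approx \frac{e^{-\beta U(b, \theta, \phi)}}{\int db\, d\theta\, d\phi\, e^{-\beta U(b, \theta, \phi)}}. \tag{59}$$
If, as a further approximation, the harmonic approximation is used for the potential energy $U(b, \theta, \phi)$, this pdf becomes a multivariate normal (Gaussian) distribution, the entropy of which can be evaluated in closed form, yielding for the entropy of internal coordinates the following estimate:
$$\tilde{S}_{in} \approx \frac{k_B}{2} \ln \frac{(2\pi e)^{3N-6}}{\det[\beta F(b_0, \theta_0, \phi_0)]}, \tag{60}$$
where $F(b_0, \theta_0, \phi_0)$ is the Hessian matrix of the harmonic potential energy at the energy minimum. The widely used quasiharmonic approximation for estimation of the configurational entropy from molecular simulations was based in its original formulation [12] on the assumption that $F \approx \Sigma^{-1}/\beta$, where $\Sigma$ is the covariance matrix of a simulation sample of internal coordinates. Using now Equations (55) and (60), the spatial entropy in Cartesian coordinates is obtained as
$$S_s \approx k_B \ln(8\pi^2 V^\circ) + k_B \ln J(b_0, \theta_0) + \frac{k_B}{2} \ln \frac{(2\pi e)^{3N-6}}{\det[\beta F(b_0, \theta_0, \phi_0)]}. \tag{61}$$
With these approximations, an isothermal process that changes the conformation of the system produces a change $\Delta S_s$ in the spatial entropy $S_s$ given by
$$\Delta S_s \approx k_B \ln \frac{J(b'_0, \theta'_0)}{J(b_0, \theta_0)} - \frac{k_B}{2} \ln \frac{\det F'(b'_0, \theta'_0, \phi'_0)}{\det F(b_0, \theta_0, \phi_0)}, \tag{62}$$
where the primed quantities pertain to the changed system. Clearly, the Jacobian-dependent term may be neglected only when the equilibrium Jacobians of the two conformations are approximately the same. As noted above, this approximation holds to good accuracy when $b'_0 \approx b_0$ and $\theta'_0 \approx \theta_0$. Another perspective on this condition is also of interest. We first note that $J_{BAT} = (\det \mathcal{G})^{-1/2} \prod_{i=1}^{N} m_i^{-3/2}$, where $\mathcal{G}$ is the total kinetic-energy matrix in the BAT coordinates, and $m_i$ are the masses of the atoms; $\mathcal{G}$, like $J_{BAT}$, is a function of the BAT coordinates.
This expression follows from the fact that $\mathcal{G} = \mathcal{B} \mathcal{M}^{-1} \mathcal{B}^T$, where $\mathcal{M} = \mathrm{diag}(m_1, m_1, m_1, \ldots, m_N, m_N, m_N)$ and $\mathcal{B}$ is a $3N \times 3N$ matrix whose elements are the partial derivatives $\partial Q_i / \partial x_j$ of all the BAT coordinates with respect to the Cartesian coordinates, so that $\det \mathcal{B}^{-1} = J_{BAT}$ (see, e.g., [14]). Using this identity, along with, e.g., Equations (2.34) and (3.5) of [13], one may show that
$$J(b, \theta) = C \left[\frac{\det I(b, \theta, \phi)}{\det G(b, \theta, \phi)}\right]^{1/2}, \tag{63}$$
where $C$ is a constant that depends only on the atomic masses, $I(b, \theta, \phi)$ is the matrix of the instantaneous inertia tensor and $G(b, \theta, \phi)$ is the kinetic-energy matrix of the $3N - 6$ internal degrees of freedom. Note that $\det I = I_1 I_2 I_3$, where $I_i$ are the principal moments of inertia. The Jacobian term in (62) therefore may be written as
$$k_B \ln \frac{J(b'_0, \theta'_0)}{J(b_0, \theta_0)} = \frac{k_B}{2} \ln \frac{\det I'}{\det I} - \frac{k_B}{2} \ln \frac{\det G'}{\det G}, \tag{64}$$
with all determinants evaluated at the respective energy minima.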
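The quasiharmonic estimate of the internal entropy and the Jacobian-dependent term of the entropy change can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of [12]: it assumes units with $k_B = 1$, replaces $\beta F$ by the inverse of the sample covariance matrix $\Sigma$, and uses synthetic Gaussian samples in place of internal coordinates from a simulation. The function names are hypothetical.

```python
import numpy as np

def quasiharmonic_internal_entropy(samples, kB=1.0):
    """Gaussian entropy (kB/2) ln[(2 pi e)^n det(Sigma)], i.e. Eq. (60)
    with beta*F replaced by the inverse sample covariance Sigma^{-1}."""
    X = np.asarray(samples, dtype=float)        # shape (n_snapshots, n_coords)
    n = X.shape[1]
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    sign, logdet = np.linalg.slogdet(cov)       # stable log-determinant
    if sign <= 0:
        raise ValueError("covariance matrix is not positive definite")
    return 0.5 * kB * (n * np.log(2 * np.pi * np.e) + logdet)

def jacobian_term(log_J0_before, log_J0_after, kB=1.0):
    """Jacobian-dependent part of Eq. (62): kB ln[J(b0', theta0') / J(b0, theta0)]."""
    return kB * (log_J0_after - log_J0_before)

# Synthetic two-coordinate "trajectory" with a known covariance (assumption)
rng = np.random.default_rng(1)
cov_true = np.array([[1.0, 0.3], [0.3, 0.5]])
samples = rng.multivariate_normal(np.zeros(2), cov_true, size=200_000)

S_qh = quasiharmonic_internal_entropy(samples)
S_gauss_exact = 0.5 * (2 * np.log(2 * np.pi * np.e) + np.log(np.linalg.det(cov_true)))
```

Using `slogdet` rather than `det` keeps the estimate numerically stable when the number of internal coordinates is large and the determinant itself would overflow or underflow.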

Conclusions
We have addressed the situation in which one wishes to compute $\Delta S$ for a classically treated, isothermal physical process where the phase-space probability distribution $\rho(p, q)$ goes to $\rho'(p, q)$. In the biophysical context, this might be a binding or folding process. If $(p, q)$ are Cartesian, then we can factorize $\rho(p, q)$ as $\rho_m(p)\rho_s(q)$ and accordingly decompose the entropy $S$ into $S_s + S_m$. The momentum entropy, $S_m$, is not affected by the physical process, so $\Delta S = \Delta S_s$. (If the temperature $T$ changes, then there is a contribution from $S_m$ as well, which can be computed analytically.) In practical applications, it is often preferable to compute spatial entropy in non-Cartesian coordinates, $Q$, but questions arise regarding the correct way to treat the entropy under a coordinate transformation because the differential Shannon entropy of information theory, which has the same mathematical form as the thermodynamic entropy, is not invariant under a change of coordinates. This lack of invariance appears problematic, because a simple change of coordinates must not affect the change in the entropy computed for a physical process.
This paradox is resolved when one recognizes that the thermodynamic spatial entropy does in fact transform in the same manner as the differential Shannon entropy, but that the change in the transformed spatial entropy, $\Delta \tilde{S}_s$, is not in general equal to the change in total entropy, $\Delta S$. The reason is that, in the new coordinates, unlike in Cartesian coordinates, the physical process also produces a change $\Delta \tilde{S}_m$ in the entropy associated with the conjugate momenta, where $\tilde{S}_m$ is defined as an average of the momentum entropy over all values of the spatial coordinates. This change in the transformed momentum entropy precisely cancels the change in the spatial entropy associated with the transformation of coordinates, so that the change in total entropy due to the physical process is invariant under the transformation of coordinates.
The present analysis furthermore has provided useful expressions for the total entropy change for an isothermal physical process in terms of the spatial entropy in any set of spatial coordinates.