Holonomic Constraints : A Case for Statistical Mechanics of Non-Hamiltonian Systems

A dynamical system submitted to holonomic constraints is Hamiltonian only if considered in the reduced phase space of its generalized coordinates and momenta, which need to be defined ad hoc in each particular case. However, specially in molecular simulations, where the number of degrees of freedom is exceedingly high, the representation in generalized coordinates is completely unsuitable, although conceptually unavoidable, to provide a rigorous description of its evolution and statistical properties. In this paper, we first review the state of the art of the numerical approach that defines the way to conserve exactly the constraint conditions (by an algorithm universally known as SHAKE) and permits integrating the equations of motion directly in the phase space of the natural Cartesian coordinates and momenta of the system. We then discuss in detail SHAKE numerical implementations in the notable cases of Verlet and velocity-Verlet algorithms. After discussing in the same framework how constraints modify the properties of the equilibrium ensemble, we show how, at the price of moving to a dynamical system no more (directly) Hamiltonian, it is possible to provide a direct interpretation of the dynamical system and so derive its Statistical Mechanics both at equilibrium and in non-equilibrium conditions. To achieve that, we generalize the statistical treatment to systems no longer conserving the phase space volume (equivalently, we introduce a non-Euclidean invariant measure in phase space) and derive a generalized Liouville equation describing the ensemble even out of equilibrium. As a result, we can extend the response theory of Kubo (linear and nonlinear) to systems subjected to constraints.


Introduction
The dynamical and statistical behavior of a mechanical system of many degrees of freedom subjected to holonomic constraints presents specific features that seem worth presenting and discussing in a unified framework.A mechanical Hamiltonian system is a system whose evolution is derivable from a standard Hamiltonian H(r, p) = K(p) + V (r), where is the kinetic energy expressed in Cartesian coordinates as a Euclidean quadratic form of the momenta p := {p i = m i ṙi , i = 1, . . ., N} , and V = V (r) is a function of the 3N Cartesian coordinates r := {r i , i = 1, . . ., N} .
N is the number of point particles in the system, and we have put ourselves in dimension 3. The space of the coordinates is called configuration space, while the phase space {r, p} gives the space of the mechanical states of the system.To say that the system is subjected to f holonomic constraints is equivalent to saying that the motion has to evolve on a (3N − f )-dimensional configuration space, which results from imposing f geometrical conditions σ α (r) = 0 , α = 1, . . ., f at all times.These constraints can connect all the coordinates of the configuration space, in which case we call them global (Blue Moon [1,2]), or connect disjoint subgroups of the coordinates, which is, for example, the way in which molecular systems can sometimes be described [3], or else can be the conditions for orthonormality of single electron orbitals, as in the Car-Parrinello approach to ab initio molecular dynamics [4].
Global constraints can be used to bring the system in situations normally difficult to visit.In these cases, the constraints can act as a kind of Maxwell daemon.In the solution of the classical Statistical Mechanics of dynamical systems, the constraints confront two major problems.As we have seen before, the constraints are an essential ingredient in the definition of the dynamical system, therefore any acceptable algorithm introduced to solve the dynamics of such a system cannot propagate any error, as otherwise the statistical behavior of the ensemble in the presence of the constraints cannot be properly formulated.This last problem in principle is automatically solved for Hamiltonian systems by using generalized coordinates.However, especially for systems with many degrees of freedom, generalized coordinates are completely intractable, and one should be able to formulate properly the problem by using Cartesian coordinates in a standard way.As we will see, a family of algorithms avoiding the propagation of the errors have been introduced [3,5], while the proper formulation of the statistical ensemble is straightforward for the equilibrium case [6] but requires some more work for non-equilibrium, where the missing ingredient is the correct Liouville equation to use [7].To get the proper Liouville equation one has to abandon the traditional Hamiltonian description, in which one had the constraint forces by using a Lagrange multiplier, and go straight to the non-Hamiltonian behavior of the equations of motion of the system in which the constraint forces are explicitly (analytically) solved.For this non-Hamiltonian equations the Euclidean nature of phase space is no more an invariant, therefore one has to find an invariant, non-Euclidean, measure to be associated with the phase space so that the statistical behavior of the system can be properly described by generalizing the Liouville equation [7].As we will see, the non-equilibrium response of our constrained system to external perturbations can be derived directly [8], while the best known results of linear response theory, the so-called fluctuation-dissipation theorem, can be derived but requires some extra work [9].The advised reader should be warned that in this review we have excluded the treatment of non-holonomic constraints, a very large family difficult to unify and in any event requiring special treatments [10].
In Section 2, we summarize the formalism needed to describe a system with constraints.In Section 3, we write down the general (SHAKE) equations to be solved to derive whatever family of numerical algorithms and briefly describe the two best known formulations: the one, adopted with the Verlet algorithm, usually referred to as SHAKE [11], not to be confused with the reference to the general equation to be solved, and the one modified to work with the velocity-Verlet algorithm, usually referred to as RATTLE [12].To these two, we will briefly add a more recent alternative devised to give a parallel implementation of SHAKE [13].In Section 4, we derive the equilibrium ensemble of Hamiltonian systems subjected to holonomic constraints [6].Section 5 presents an effective approach to compute conditional averages by the use of holonomic constraints (Blue Moon) [1,2,14,15].
The possibility to compute conditional averages can be used in conjunction with non-equilibrium molecular dynamics techniques [16][17][18][19][20] to compute rate constants, hydrodynamical phenomena and, in general, relaxations from large fluctuations statistically produced by introducing suitable constraints.In Section 6, we formulate the non-Hamiltonian equations of motion for a constrained system, we derive from them an invariant measure for the phase space, the correct generalized Liouville equation and, again, as a way to see how all that works, the expressions for the equilibrium ensembles.Then, we start from the generalized Liouville equation to give a rigorous expression for the response to external perturbations of a constrained system [8] and we prove with some rigorous arguments that also the classical results of linear response theory can be shown to hold [9].The paper is concluded by a short outlook in which we try to assess the state of the art in the treatment of the computational classical Statistical Mechanics for systems subjected to holonomic constraints.

Dynamics with Holonomic Constraints
Given the Lagrangian L(r, ṙ) = K(ṙ) − V(r) of a dynamical system with N particles in dimension 3, subjected to f holonomic constraints the equations of motion (Lagrange equation of I type) are d dt where the λ α (t), α = 1, . . ., f are the unknown Lagrangian multipliers to be determined by imposing that Note that the multipliers λ α (t) together with all their derivatives of any order can be determined by taking successive time derivatives of the expressions σ α (r(t)) = 0 .In particular, are evident conditions to be satisfied since holonomic constraints do not perform any mechanical work and therefore permitted velocities and constrained forces have to be orthogonal.Moreover, Substituting in Equation ( 9) the equations of motion, Equation ( 6), and solving the resulting linear system for the λs, we get where • ∂σ β ∂r i , α = 1, . . ., f , β = 1, . . ., f .( The expressions resulting from introducing Equation (10) in Equation ( 6) could possibly provide fully explicit, no longer Hamiltonian, dynamics.For the moment, we will not be interested in such formulation because any approximate algorithm that make use of Equation (10) will necessarily propagate the errors in the constraint relations, Equation ( 5), with dramatic consequences on the stability of the model.Further derivatives with respect to time of the σs will provide linear relationships for the higher order derivatives of the λs, which could be needed in higher order algorithms.
By taking the standard Legendre transform on the Lagrangian L, the same dynamics can be straightforwardly formulated in Hamiltonian terms, involving the same treatment for the constraint forces.In the following, it will be useful for theoretical purposes to consider an equivalent representation of the Hamiltonian expressed in terms of (i) the f constraint relationships σ := {σ α (r), α = 1, . . ., f }, and (ii) the remaining (3N − f ) generalized coordinates q := {q ν (r), ν = 1, . . ., 3N − f }.
This change of coordinates is a point transformation of the configuration space and therefore generates a canonical transformation [10].With this change of variables, the Lagrangian L of our system generates the Lagrangian in the new coordinates given by L * (q, q, σ, σ) = L (r(q, σ), ṙ(q, σ, q, σ)) . (15) Sometimes, it is useful to call collectively the variable {q, σ} = {u}.From the Lagrangian L * , we get and where the kinetic term is now and ∂u α is the metric matrix associated with the new variables.Note that it is almost immediate to find for the inverse matrix M −1 the explicit expression • ∂u β ∂r i , α = 1, . . ., 3N , β = 1, . . ., 3N .
There is an intimate connection between the matrix M, the Jacobian matrix J = ∂r ∂u and the Jacobian determinant J = ∂(u) ∂(r) = |J| of the {r} −→ {u} point transformation.Introducing the mass tensor (µ) i,k = m i δ i,k for i, k = 1, . . ., 3N by means of the Kronecker delta, one can rewrite M in Equation (19) and, of course, its determinant |M|, as In this representation, the constrained motion is generated by the Lagrangian L * (q, q, σ = 0, σ = 0) = L c (q, q) in the (3N − f )-dimensional space Note, however, that these equations are no longer in normal form.
The Hamiltonian formulation helps us to get back to an evolution expressed in normal form.Given that we have To proceed, it is useful to write the matrices M and M −1 in block form where ∂q ν ∂q ν ∂σ α • ∂q η ∂r i , with Z already defined in Equation (11), to derive a number of results, which we will use in the following.In particular, note that the block matrices defined above are not independent from each other.A first set of useful relations can be derived by expanding the expressions for the identity Another useful relation is the one that, in a different language, is known as Fixman's Theorem [21].It relates the determinants of the matrices A and Z Equation ( 33) can be derived directly observing that (use Equation ( 32) ) and that By putting Equation ( 33) together with Equation ( 21), we obtain for the determinant of the block matrix A, the interesting expression (36) that will be useful later on.
Going back to the constraint relations, the conditions σ = 0 can now be written as σ = E T p q + Z p σ = 0, (37) giving the non-zero values of the conjugated momenta pσ when the constraints are imposed, where Z and Ẽ are implicitly defined in Equation (38).In these conditions, the Hamiltonian of the constrained motion can be evaluated explicitly on the hypersurface σ = 0, p σ = pσ to obtain where we first used pσ = − Z−1 ẼT p q to go from Equation (40) to Equation (41) and, then, from Equation (41) to Equation (42), we have used the relations from Equation (32),

SHAKE, Integrating the Equations of Motion
The numerical integration of the equations of motion ( 6) requires discretizing the time and to provide a suitable algorithm of given precision.Generally, for evident reasons, one avoids the use of algorithms requiring more than the computation at each step of the forces avoiding successive derivatives, e.g., Ḟi = − ∑ j ṙj • ∂ ∂r j ∂V ∂r i , etc.For illustrative purposes, we will limit ourselves to write down the integration of the Lagrangian equations of motion using the Verlet algorithm and of the Hamiltonian equations using the velocity-Verlet algorithm.

Verlet Algorithm
The celebrated (1967) Verlet algorithm is easily obtained by writing down and summing up the forward and backward Taylor expansions of each coordinate truncated to the fourth order.Calling x a generic variable in the configuration space set r (and where t is the running time and h is the integration step resulting from time discretization.The velocity with this algorithm is computed by subtracting the same forward and backward Taylor expansions.
We get, with one timestep of delay, where we have written explicitly the error of order O(h 2 ) for further use.Notice that the velocities, which carry a larger error, do not enter in the computation of the trajectory, which remains precise to the order three, and, as it has been shown, has many other remarkable features that can be summarized by saying that this algorithm is simplectic [22,23].In presence of holonomic constraints, the acceleration of any coordinate x can be decomposed in the two contributions, F = −∂ x V coming from the interaction potential of the model and where Λ is a set of parameters to be determined, so that Substituting Equation (47) in the algorithm Equation (45), we have where x(t + h), the provisional value of the coordinate at time t + h, is the position the coordinate would take in the absence of constraints.Now, and this is the essential conceptual content of the whole family of SHAKE algorithms, we determine the set of the Λ parameters by imposing and solving the set of algebraic equations Since we have f Λ values and f constraint relationships, the system of algebraic, generally not linear, equations is well posed.The values of the Λs solving these equations are in general different from the values of the λ(t) obtainable from Equations (10).However, the difference cannot be of greater order than the one involved in the algorithmic error.Therefore, the values of the coordinates at time t + h will entail an error equivalent to the one produced by the blind application of the Verlet algorithm.However, now, the constraint relationships will be satisfied exactly at every timestep and the dynamics of the system will not disrupt the model.
Many different ways have been proposed to solve the system of Equations ( 46), see e.g., [13,[24][25][26][27].The original, and still commonly used, goes back to Berendsen [3], who called it, again, SHAKE.It proceeds by satisfying one constraint at a time, iterating constraint relationship by constraint relationship until convergence.The technical details have been worked out, apart from the original paper, more pedagogically in [11].Leimkuhler [28] has demonstrated that the resulting numerical procedure maintains the time reversal invariance and the simplectic character of the algorithm.In the referred to, original, implementation, the algorithm is inherently serial and cannot be easily parallelized.Practical parallelizations are either approximate or the algorithms are specifically tailored to the problems at hand.An interesting general parallel solution has been worked out by Weinbach and Elber [13].They take advantage of the fact that the essential step in solving the SHAKE equation can be recast as the solution of a sparse linear problem of the type Ay = b with y the vector of unknowns.Constructing a suitable positive definite matrix, they solve the SHAKE equation using (parallel) conjugate gradient minimization of the quadratic form 1 2 Λ T AΛ − Λ T σ in place of the standard iterative process (inherently serial).

Velocity-Verlet Algorithm
The Verlet algorithm can be easily recast in an algebraically equivalent form that, when applied in the correct order, produces both the positions and the velocities at the same time.Using the same symbols, let us first rewrite Equation (45) by replacing −x(t − h) with its value extracted from Equation (46) where we have written explicitly for further use the expression for the error O(h 3 ), to be normally rejected, in order to obtain the first equation of the velocity-Verlet algorithm, which expresses the position x at time (t + h) with an error of order O(h 3 ), here retained in its explicit form.Next, we write the velocity ẋ(t + h) from Verlet (46) taken at time (t + h) and eliminate the position x(t + 2h) using, again, Verlet (45) taken to go from time (t + h) to time (t + 2h), where we made again use of Verlet (45) to expand one of the two x(t + h) contributions.
Regrouping terms and simplifying, we can write where we got rid of the fraction using Verlet (46).Finally, by expanding with Taylor the third derivative term at time (t + h), ...
, we observe that the two terms ∝ h 2 cancel each other leaving a term in h 3 .Finally, we arrive at the second equation of the velocity-Verlet algorithm which expresses the velocity ẋ at time (t + h), again with an error of order O(h 3 ).
Substituting Equation (47) into Equation (52), we have, analogously to the previous section, where x(t + h), the provisional value of the coordinate at time t + h, is the position that the coordinate would take in the absence of constraints using Equation (52), and G = −Λ • (∂ x σ) is the constraint force and the parameters Λ are determined by imposing and solving the set of equations Again, we have an algebraic system with f constraint relationships σ and f unknown values Λ .The problem is well posed and the solution can be retrieved exactly along the same lines as before, by using the iterative SHAKE algorithm.Of course, we will have a different set of parameters, Λ = λ, but the new positions x(t + h) will satisfy the constraints (5) exactly at the time t + h.To calculate the new velocities, the above procedure must be repeated by substituting Equation (47) in Equation ( 59), now at time t + h, where the provisional velocity at time t + h, i.e., the value the velocity would take in the absence of constraints at time (t + h).Note that, at this stage, the Λ and, therefore, the constraint force G (t) at time t are already computed and therefore included in x(t + h), while one needs to determine the yet unknown parameters Λ by imposing and solving the set of equations Once more, we have an algebraic system with f constraint relationships, σ, and f unknown values, Λ .The problem is well posed and the solution can be retrieved by an iterative SHAKE-like procedure, i.e., proceeding by satisfying one constraint relation at a time.The whole procedure of imposing constraints within the velocity-Verlet scheme is known by a different name, the RATTLE algorithm [12], although it is indeed nothing else than the same SHAKE procedure applied twice, once for positions and once for velocities, to two different sets of equations.The main difference with the original SHAKE algorithm [3] lies in the fact that the velocities calculated using Equation (62) are at each time exactly tangent to the constraint hypersurface σ = 0, while the velocities calculated, usually, simply using Equation (46) in the original SHAKE algorithm are tangent only within the algorithm accuracy (O(h 2 )).This extra precision does not come for free, but at the cost of doubling the effort in calculating the unknown Λ parameters.As long as the velocities in the Verlet algorithm do not enter directly into the numerical integration of the positions, such difference can be safely ignored; however, if desired, nothing would impede applying SHAKE in the same spirit, and with similar costs, to correct the velocities from Equation (46).

Equilibrium Statistical Mechanics in the Hamiltonian Formulation
The expression of the statistical equilibrium ensemble in Cartesian coordinates of a system subjected to holonomic constraints is not smooth but singular since the probability density defined in a 6N-dimensional phase space is associated with a mechanical system whose motion takes place in a (6N − 2 f )-dimensional subspace, i.e., the intersection of the 2 f hypersurfaces σ(r) = 0 and σ(r, p) = 0.
On the contrary, it is immediate to write down the (microcanonical) probability density in the reduced phase space of the 2(3N − f ) generalized coordinates q, p q using the Hamiltonian H c in Equation (42).
Let Ô(r, p) be a dynamical variable defined using Cartesian coordinates and Ôc (q, p q ) = Ô(r(q, σ = 0), p(q, σ = 0, p q , p σ = pσ )) the equivalent variable expressed using generalized coordinates, restricted to the constrained hypersurface.The familiar microcanonical average in generalized coordinates reads where We will now transform it into the equivalent integral in Cartesian coordinates by making use of the canonical transformation that connect the "generalized" phase space variables (u, p u ) introduced in Section 2 to the "Cartesian" phase space variables (r, p).We first remark that, on the (6N − 2 f ) phase space hypersurface, one has where, for the product of the first f delta functions, we have used the shortcut notation δ(σ) = ∏ α δ(σ α ), and, equivalently, for the last term δ(p σ − pσ ) = ∏ α δ((p σ − pσ ) α ).
A more convenient expression for this product of delta functions can be derived by nothing that by multiplying Equation (37) by Z −1 , one obtains: Finally, using the facts that the Jacobian associated with a canonical transformation generated by a point transformation in the coordinates preserves the phase space volume, i.e., and Equation (42), we can write for the microcanonical average (65) where now, with |Z| the modulus of the determinant of the matrix Z. From Equation (69), it follows directly the expression for the probability density in the microcanonical equilibrium ensemble in Cartesian coordinates and similar for other equilibrium ensembles.In particular for the Canonical ensemble, where classically momenta and coordinates are explicitly independent, the probability density will result in being For theoretical purposes, as we will see in the following, it is very useful to write the ensemble in terms of the marginal configurational probability density P M (r) and the conditional probability density of the momenta P C (p|r): where P M (r) dr = dp P(r, p) dr = dp e −βH(r,p) δ(σ(r))δ(Z −1 σ(r, p)) Explicitly, P M can be computed by first integrating out the delta functions for σ in Equation (75), which amounts, after a change of variables, to the substitutions p σ = pσ , and then by executing the Gaussian integrals involved in the canonical ensemble.Assuming that the standard result of the integration of a unidimensional Gaussian integral is known, and, for a multidimensional Gaussian integral, diagonalizing the quadratic form and using the invariance of the determinant under unitary transformations, one derives immediately for the integral that, in n dimensions, . . .
Focusing on the numerator in Equation ( 75), making a change of variables in the integral by using dp = J −1 dp u = J −1 dp q dp σ , one has To obtain Equation (79), we have proceeded as follows.First, we substitute the kinetic Hamiltonian term in the exponential with its "unconstrained" expression in Equation ( 40) and integrate over dp σ using Equation (42).Now, we perform the remaining multidimensional Gaussian integral over the remaining momenta p q by using Equation (77) and use Equation ( 36) to arrive at the result in Equation (79).Following the same procedure for the denominator and simplifying the constants in Equation (79), we finally gets for the normalized marginal probability density Equation (80) tells us that the marginal probability density in configuration space, in the presence of constraints, is not simply ∝ exp (−βV)δ(σ) but contains the biasing term |Z(r)| 1 2 coming from the limitations in momentum space induced by the constraints.
The conditional probability density in momentum space is given by where to get the first equality we referred to Equation (80); for the next step, the result implicit in Equation (79); and, finally, then Equation (83).The configuration dependent factor |Z(r)| 1 2 in Equation (83) indicates that, when there are constraints, positions and momenta are no longer independent.In particular, the distribution of momenta becomes no more simply Maxwellian.

Rare Events and Blue Moon Ensemble
In the statistical mechanical treatment of macroscopic phenomena, one is interested in computing the properties of interest by identifying suitable observables, i.e., function of phase space, ξ(r, p), although here, and for a while, we will focus on observables depending only on the configuration space and obtaining their macroscopic counterpart by taking an ensemble average (to be definite, let us choose to work with the canonical ensemble) of it, More generally, given one observable, it can be instructive to compute the marginal probability density associated with it in the ensemble Macroscopically speaking, this probability density has a profound meaning since it can be associated, via the definition of the (Landau) free energy to the reversible work needed to bring the physical system from a reference state to the value ξ = ξ .This fact can be easily seen by taking the derivative with respect to ξ of Equation ( 86) In the same spirit of Section 2, we introduce the canonical transformation from the Cartesian coordinates {r, p} to the generalized coordinates {u, p u }, where u = (q, ξ) with the set q suitably chosen.Now, using that d dξ = − ∂ ∂ξ , integrating by parts, we arrive at where H * is the Hamiltonian expressed in the generalized u coordinates, see Equation (17).
In generalized coordinates, the kinetic term K * in the Hamiltonian (see Equation ( 18)) gives a non-zero contribution to the derivative, which is nothing but a geometrical correction that ultimately involves (see Appendix A) the Jacobian J = ∂(u) ∂(r) of the coordinate transformation, From Equation (90), we see that the derivative of W ξ is a conditional average, at a given value of the observable, of the generalized force acting on the system, i.e., typically a thermodynamic force.The evaluation of this expression as given in Equation (90) requires constructing explicitly the set of generalized coordinate u, something usually very cumbersome, and needs an ad hoc derivation in each particular case.As a matter of fact, it is possible to circumvent this technical difficulty [15] and derive expressions directly in terms of the function ξ(r) and its derivatives with respect to the Cartesian coordinates r.The "work" associated with this force is what we identify with reversible work.By thermodynamic integration, we can get the reversible work relative to a reference state and by exponentiation the probability density associated with the random variable ξ.
In standard conditions, when ξ is a unimodal random variable, the sampling of its probability density is an easy matter that can be computed directly in any straightforward Monte Carlo or Molecular Dynamics simulation by simply recording the histogram of visited ξ values.Things become less evident when the probability distribution of the random variable is not only multimodal, but the regions in between the maxima are characterized by very low probabilities so that whatever simulation gets stuck in one of the highly probable regions and, physically speaking, we are in the presence of a metastability.When this is the case, a brute force sampling of the histogram is no longer possible and one has to find, by cunning, alternative ways to proceed.As we will see in the following, the concept of conditioned or constrained probability will take us out of the difficulty.
To see that, we consider the condition ξ(r) = ξ as the constraint and we compare the conditional probability with Equation (80) for the marginal probability density in the presence of the constraint σ.We find i.e., a way to sample the conditional probability of r given ξ by unbiasing a constrained probability density.Now, even regions of the configurational space associated with very low probabilities can be efficiently sampled and the metastability problem is taken out.In particular as a kind of corollary to Equation (93), we have for any configurational observable Ô(r), with the rhs that, at variance with its left counterpart, can be efficiently sampled even for values of ξ corresponding to metastabilities.
The next problem arises when we need to take conditional averages for observables depending on the whole phase space, i.e., r and p.This case, apparently not so common, is instead general if one considers conditional dynamic properties (time correlation functions) even of configurational properties.Indeed, as it is immediately evident, Ô(r(t)) is nothing else than a function of the initial condition (r, p) parametrically dependent on the time t.Therefore, to be able to sample an unbiased conditional ensemble, with σ = 0, we need to have unbiased P(r, p|ξ ).We know that the momenta in the constrained ensemble, Equation (83), are irreversibly biased and thus unusable.However, we can unbias the configurations taken along a constrained trajectory and associate with them momenta sampled from an unbiased probability distribution.Knowing that, in the original ensemble, positions and momenta are independent and moreover the distribution of momenta P p (p) is just a product of Maxwellians, we can easily get such a sample P(r, p|ξ ) = P p (p)P r (r|ξ ) from which directly a computable expression for a time correlation function at given ξ where the time evolution now has to be intended to be fully unconstrained.The ensemble so constructed is the Blue Moon Ensemble and the problem of this particular metastability is now solved.In particular, if we are interested in the calculation of an unconditioned time correlation function in a system where a brute force calculation (due to metastability) is not possible, we can compute it by thermodynamic integration using the predetermined marginal probability of ξ, P ξ (ξ ).We get To simplify the algebra and the notation, we have developed our argument only in the case of one scalar condition, ξ.The generalization to a vectorial condition is straightforward but cumbersome, and it can also be found explicitly derived in the literature [29].The case in which the mechanical system contains constraints other than the ones representing physical conditions, typically molecular constraints, is formally more involved but conceptually identical.The interested reader can find all needed details in Ciccotti et al. [2].

Liouville Equation in the Presence of Constraints
The careful reader will have noticed at this point that we have properly solved the dynamics of a Hamiltonian system subjected to holonomic constraints and also formulated, in the Cartesian space, its statistical behavior at equilibrium.Instead, we have been unable to formulate in general the Statistical Mechanics of the system, including the evolution of the non equilibrium ensemble.The reason is that we miss the Liouville equation for this family of dynamical systems.We will see in the following that a generalized Liouville equation, always in the Cartesian reference description, can be derived at the price of abandoning the formulation of the dynamical evolution of the system by Lagrange multipliers and deriving, instead, the statistical behavior of a many-body, non-Hamiltonian system still satisfying the (assumed) conditions needed to justify a statistical treatment (e.g., chaotic behavior of the constituents, etc.).In these conditions, we will be able (i) to get a correct generalized Liouville equation; (ii) to find the results already obtained for the equilibrium ensemble; (iii) to generalize to these systems the theory of the response to external perturbations (in particular Kubo linear response theory [30]) well known and of widespread use for Hamiltonian systems, not only theoretically but also in molecular dynamics simulations [19,[31][32][33].
A general, non-Hamiltonian, dynamical autonomous system is defined, in the set of variables {x}(≡ {r, p = mṙ}), by with the single component G i (x) not derivable as ± ∂H ∂x i . The first and most important difference with a Hamiltonian system, especially in view of the derivation of the statistical properties of such a system, is that the phase space volume can be no more an invariant of the motion.If that happens, the standard approach of Statistical Mechanics is doomed to fail.However, it is easy to see that an invariant measure for the systems given in Equation ( 98) is easily found.Indeed, now where x t = x t (t; x 0 ) is the solution of Equation (98), x 0 the initial condition and J t = J(x t , x 0 ) the determinant of the Jacobian matrix of the time-generated change of variables.In Appendix B, it is shown that where κ, the divergence of the flow in phase space κ(x t ) = ∇ T x t ẋt , is known as the phase space compressibility of the dynamical system, and we have introduced the shorthand notation ∇ x t for the gradient ∇ x = ∂ ∂x with respect to coordinates x t at time t.The solution of Equation (100) with the initial condition J(x t=0 , x 0 ) = 1 is where w(x t , t) is the primitive function associated with the indefinite time integral of κ(x t ), which exists with certainty given that κ(x t ) = d ln J t dt .Substituting this results in Equation (99), we find exp{−w(x t , t)}dx t = exp{−w(x 0 , 0)}dx 0 (102) i.e., the conservation in time of the measure e −w(x t ,t) dx t .The factor exp{−w(x t , t)}, let us call it γ(x t , t), is the metric factor associated with the coordinate transformation x 0 → x t ,.It tells us that the statistical space of the variables x is no more Euclidean but has a non trivial metric structure.
Remembering that the statistical ensemble is described by a probability density P(x, t), we have for the normalization condition, from which we get d dt ( dxγP ) = 0 and, therefore, the continuity equation which represents the new, valid form for the Liouville equation for our more general dynamical systems.The solutions of Equation (104) will give us the evolution in the time of the ensemble associated with our non-Hamiltonian systems in non-equilibrium conditions while their stationary, asymptotic solutions can represent the equilibrium ensemble.We will now proceed in two steps, in order to derive the consequences of this more general approach to Statistical Mechanics.First, we will discuss how to obtain, in these conditions, not just a stationary solution but the correct equilibrium ensemble corresponding to the microcanonical ensemble of the standard case and show, just for illustration, how by this procedure we can re-derive the equilibrium ensemble for systems subjected to holonomic constraints.Then, second, using the general form of the solution of Equation (104), we will show the validity of the response approach developed by Onsager and Kubo (at least) for the Hamiltonian case.

Generalized Distribution Function
Assuming that the system (98) possesses n c conserved quantities Ĉk (x), k = 1, . . ., n c , the space sampled by its trajectories will be the subspace intersection of the hypersurfaces Ĉk (x) = c k , where the values c k are determined by the initial conditions.The "microcanonical" distribution function generated in these conditions is The solution (106) satisfies Equation (104) since its total time derivative is evidently zero and, moreover, is microcanonical since all accessible configurations are equiprobable.Other solutions exist, for example products of delta functions for subsets of the full set of conservation laws, but they do not correspond to physical ensembles since they will represent hypersurfaces containing states that will never be visited.In other words, physical ensembles cannot be obtained by using only the solutions of the Liouville equation (104).To satisfy the stationary Liouville equation is a necessary but not sufficient condition.From the previous observations, it is possible to derive the rules to be followed to construct the proper equilibrium ensemble and the correct invariant measure.
The ensemble: 1. Construct the distribution function by Equation (106) using all the independent conservation laws implicit in the equations of motion; 2. Eliminate from the statistical space all variables that result uncoupled to the bulk of the system or driven by it.By driven, we mean variables (i) whose evolution follows that of the other variables without influencing those ones and (ii) that do not appear in the phase space expression of any of the n c conserved quantities Ĉk .
A (not so) typical example could be that of particles of zero mass interacting with the system only via the holonomic constraints defining their own values (see Appendix C).
The measure: 3. Once the essential, reduced, set of variables, let us call them x , has been selected, calculate the phase space compressibility κ(x ) = ∇ T x ẋ of the reduced dynamical system and use κ(x ) to determine γ(x ).
The results are where, via the normalization factor, is implicitly defined the new partition function We now turn, for illustrative purposes, to apply the formalism just developed to an originally Hamiltonian dynamical system subjected to holonomic constraints.As we have seen before, the non-Hamiltonian equations of motion are obtained inserting directly Equation (10) into Equation ( 14).The result is ṙi Now, σ α = σα = 0 , α = 1, . . ., f are 2 f conservation laws to be added to the Hamiltonian.The compressibility factor κ is (easily) computed as giving from which we recover the ensemble already derived, Equation (73).

Response Theory
We address now the central question of dynamical non-equilibrium Statistical Mechanics for systems subjected to holonomic constraints: how to get statistical averages when the evolution of the system is no more stationary be it due to time-dependent perturbations or to the study of relaxation processes [8].These problems are already solved in the Hamiltonian case (even with non-Hamiltonian perturbations but conserving the phase space volume [19,[31][32][33][34][35]); here, we extend that solution to our present case.Let us start from the simpler case of the study of relaxation.Here, we have the system prepared in a non-equilibrium condition and we intend to study the macroscopic relaxation of an observable: where dω(x) = γ(x)dx is the invariant measure already derived and P(x, t) is the ensemble at time t obtained by evolving with the generalized Liouville Equation (104) the initial non-stationary ensemble P(x, 0) ≡ P(x).Let us define the Liouville operator It follows immediately that with formal solution As for the evolution of the ensemble, we can start from the Liouville equation ( 104) Moreover, as in the standard case P(x, t) = e −ı L0 t P(x) .( 125) By remembering that we are working with an invariant measure, we find, again, Ô t = dω Ô(x)P (x, t) = dω e ı L0 t Ô(x) P(x) = dω Ô(x(t))P(x), a relation easy to implement in molecular dynamics simulation of a relaxation process if we can prepare a sample of the non-equilibrium initial ensemble P(x).
Let us now move to the case in which we are interested to compute the response of the system to an external time-independent field.The equations of motion become ẋ = G(x) + D(x)F (t), or not from a Hamiltonian perturbation term (H p ).In any event, to simplify the formalism (and on the basis of what is usually done in transport studies [36]), let us assume that ∇ T x D(x) = 0, i.e., that the perturbation satisfies the incompressibility condition.This condition guarantees that, even in the presence of the perturbation, the non-zero compressibility arises only from the constraints and it is given by Equation (114).The Liouville operator is now time-dependent is the initial condition an equilibrium distribution or a general one, in any event a relation easy to implement in molecular dynamics simulations.Equation ( 132) is valid in general both in the linear response regime and beyond it.In the case of small perturbations, it is possible to show, after some algebra, that, in the presence of constraints, the classical linear response result of Green [37] and Kubo [30] is recovered and holds without any alteration from the uncostrained case [9].

Conclusions
The dynamics and Statistical Mechanics of a many-body system subjected to holonomic constraints have been discussed both following, as for the equilibrium case, the classical historical Lagrange (Hamilton) approach, using Lagrangian multipliers, and, more generally, from the newer perspective, encompassing also non-equilibrium, of non-Hamiltonian flows in phase space.One section has been dedicated to review in depth the most relevant numerical implementations, while, for the sake of readability, the reader has been addressed to the relevant literature for the technically most involved cases.A quite peculiar application, the zero-mass particle case, has been discussed to show how constraints can be creatively used to extend the description of the system opening a different, efficient, way of incorporating, for example, new features in force field models.Let us, finally, remark that developing the statistical theory of dynamical systems subjected to holonomic constraints permits to cover both equilibrium and non-equilibrium simulations of molecular systems but also to explore the domain of rare events, including the computing of complex free energy landscapes, the probing of the dynamics of rare events and even performing non-equilibrium hydrodynamical simulations by properly sampling initial conditions assigning the proper weight to the ensemble of non-equilibrium trajectories that gives the correct (linear and nonlinear) response.
Another, possibly more interesting, case can arise with a force-field model containing extra-centers of force, whose positions do not coincide with the atomic positions but follow adiabatically the motion of the atoms, taking positions that satisfy the condition of zero force on them.The dynamics of these extra zero-mass points are again inherently driven by that of the material points and so does not intervene in the statistical behavior of the material system.