Symmetries and Geometrical Properties of Dynamical Fluctuations in Molecular Dynamics

We describe some general results that constrain the dynamical fluctuations that can occur in non-equilibrium steady states, with a focus on molecular dynamics. That is, we consider Hamiltonian systems, coupled to external heat baths, and driven out of equilibrium by non-conservative forces. We focus on the probabilities of rare events (large deviations). First, we discuss a PT (parity-time) symmetry that appears in ensembles of trajectories where a current is constrained to have a large (non-typical) value. We analyse the heat flow in such ensembles, and compare it with non-equilibrium steady states. Second, we consider pathwise large deviations that are defined by considering many copies of a system. We show how the probability currents in such systems can be decomposed into orthogonal contributions, that are related to convergence to equilibrium and to dissipation. We discuss the implications of these results for modelling non-equilibrium steady states.


I. INTRODUCTION
This article studies dynamical fluctuations in stochastic processes of relevance for molecular dynamics. More precisely, we consider stochastic systems described by underdamped Langevin equations. We focus on large deviation principles, which encode the probability of rare dynamical events [1] and discuss the physical principles and symmetries that govern the probabilities of such events. The applications we have in mind are physical systems of interacting atoms and molecules, which are usually thought of as evolving by deterministic (Hamiltonian) dynamics. However, it is now standard to add stochastic terms to these equations of motion to describe the coupling of these systems to their environments. This coupling is especially important if we aim to describe non-equilibrium steady states, in which the work done by external forces must be dissipated in the environment. For this reason, a clear understanding of the interplay between molecular dynamics and stochastic forces is vital in order to build accurate models of molecular systems away from equilibrium.

A. Motivation
Molecular dynamics [2] is now established as a standard tool for computational studies of a variety of systems, including a wide range of biomolecules and physical materials. For a system that is completely isolated from its environment, the prescription for computation of dynamical trajectories is extremely simple: one identifies a set of co-ordinates q, their canonical momenta p, and a Hamiltonian H. The equations of motion are simply
$$\dot q_i = \frac{\partial H}{\partial p_i}, \qquad \dot p_i = -\frac{\partial H}{\partial q_i}. \tag{1}$$
Moreover, there are efficient computational methods for obtaining accurate solutions to these equations, which perform well on modern high-performance computing platforms. However, many physical systems are not completely isolated from their environments. In particular, they often exchange energy with some kind of thermal bath, so that they equilibrate at some temperature T. The volume of some systems may also fluctuate, so that their pressure remains constant. In such cases, several different methods are available for modelling the coupling of the system to its environment. For example, a range of different thermostats may be used.
For systems at thermal equilibrium, the results of molecular dynamics simulations depend on the choice of thermostat. However, there is an established knowledge base as to which aspects of the systems are independent of this choice (for example, free energies), and under what circumstances other aspects should be affected only mildly (for example, dynamical correlation functions depend weakly on the choice of thermostat if the coupling to the environment is weak [3] (Section 7.4.1)). This knowledge is based on theoretical insights: for example, one typically uses thermostats that do not affect the invariant measure (Boltzmann distribution) of the system and (if possible) also preserve the microscopic time-reversal symmetry of the equilibrium state.
For systems that are far from thermal equilibrium, the situation is more complicated. Such systems are important and are increasingly being modelled by molecular dynamics: see for example [4][5][6][7]. To study general features of such states, one might consider non-equilibrium steady states in which a material is simultaneously coupled to two heat baths with different temperatures; or systems that are relaxing slowly towards an equilibrium; or systems in which some variables are conditioned to take a non-typical value. In all these cases, the choice of thermostat (or barostat) can significantly affect the dynamical behaviour, and it is not clear what choice is appropriate when modelling any specific system. In particular, the invariant measure is (in general) no longer a Boltzmann distribution, and time-reversal symmetry is broken, so there are fewer principles available to constrain the design of suitable molecular dynamics models.
The same issues arise, even more noticeably, when proposing highly-simplified models of molecular dynamics systems. For example, in Markov State Models (MSMs) of biomolecules [8,9], one represents a large molecule by a number of discrete states, with Markovian transitions between them. In equilibrium, the relevant transition rates are constrained by the principle of detailed balance (at least as long as the states depend only on a system's configuration and not on its momenta). Out of equilibrium, there are fewer general rules, although the modern theory of stochastic thermodynamics [10] does address how physically-observable quantities like heat and work can be related to transition rates in simplified (coarse-grained) stochastic models. As non-equilibrium systems are studied increasingly widely, we argue that general principles for the design and interpretation of model systems are becoming increasingly important.

B. Outline
In this article, we analyse dynamical fluctuations in stochastic models of molecular systems. The stochastic element of these models represents the coupling of our system to its environment, and the states in the models represent the coordinates and momenta of the molecular system. We review and extend recent work which showed how general symmetries and geometrical properties govern dynamical fluctuations in these models, concentrating particularly on rare events (large deviations from the typical behaviour). We propose that these general principles can be useful when building models of non-equilibrium states, since they constrain the range of possible behaviour for different kinds of systems.
Our results are based on two recent developments, both of which focus on the key role of dissipation and the breaking of time-reversal symmetry (a key concept in stochastic thermodynamics).
We first consider rare events in which an equilibrium (time-reversal symmetric) system spontaneously maintains a large current, over a long period. It was argued in [11] that these events are free from dissipation, in contrast to typical non-equilibrium steady states. In Section III, we review this argument and present some new examples that illustrate the operation of the general principle. In particular, we focus on the role and definition of dissipation and entropy production in these rare events. (Our results also have implications for the Maximum Caliber hypothesis [12] for building models of non-equilibrium systems.) The second part of this paper concerns fluctuations in irreversible Markov processes, and their analysis in terms of forces and currents in the space of probability distributions. These currents and forces can be decomposed into reversible (equilibrium-like) and "non-equilibrium" (irreversible) parts. Moreover, these two contributions to the force obey a kind of orthogonality relation, which has consequences for the non-equilibrium fluctuations. In Section IV, we review recent results in this direction, and we present a new application to systems described by a Hamiltonian evolution, coupled to a thermostat. In contrast to the diffusive (overdamped) systems discussed previously, we argue that the different terms in the theory have slightly different physical interpretations. We discuss the role of dissipation in that case.

II. DEFINITIONS AND PRELIMINARIES
In this section, we collect several theoretical results needed in the following. They are primarily based on the theory of stochastic thermodynamics, as reviewed in [10].

A. Model: Conservative Forces
We consider a Hamiltonian system coupled to a thermostat at temperature T.
There are N particles moving in d dimensions: we denote their co-ordinates by $q = (q_i)_{i=1}^n$, with $n = Nd$. Each co-ordinate takes values on a circle of perimeter L: we take $q_i \in \Lambda$ with $\Lambda := [-L/2, L/2]$; the points $q_i = \pm L/2$ are identified with each other. The conjugate momenta are $p = (p_i)_{i=1}^n$, so that $p \in \mathbb{R}^n$. Define $\Omega := \Lambda^n \times \mathbb{R}^n$ as the phase space. All particles have the same mass m = 1 (cases where not all masses are equal can be analysed similarly, but we concentrate here on the simplest case). The Hamiltonian is
$$H(q,p) = \sum_{i=1}^{n} \frac{p_i^2}{2} + V(q), \tag{2}$$
where V is the potential energy that depends only on the co-ordinates q.
The system is coupled to a heat bath at temperature T (we set Boltzmann's constant $k_B = 1$). We assume that the coupling of particle i is independent of the co-ordinates, so the (stochastic) equations of motion are
$$dq^i_t = p^i_t\,dt, \qquad dp^i_t = -\frac{\partial V}{\partial q_i}(q_t)\,dt + db^i_t. \tag{3}$$
The coupling of particle i to the heat bath appears through the stochastic force
$$db^i_t = -\gamma p^i_t\,dt + \sqrt{2\gamma T}\,dW^i_t, \tag{4}$$
where γ is a friction constant and $dW^i_t$ is a standard Brownian noise. (The Brownian noises $dW^i_t$, $dW^j_t$ etc. are all independent.) The generalisation to the case where the friction depends on the co-ordinates is straightforward but requires some heavier notation. These stochastic differential equations are equivalent to Langevin equations in physics: see [11] (Equations (2) and (3)), and replace (dW/dt) by η. With this choice the invariant measure (steady state probability distribution) π for the phase space point (q, p) satisfies [13]
$$d\pi(q,p) = \frac{1}{Z}\,e^{-H(q,p)/T}\,d(q,p), \qquad Z = \int_\Omega e^{-H(q,p)/T}\,d(q,p), \tag{5}$$
where d(q, p) = dq dp, so the integral runs over all of phase space. The notation dπ(q, p) indicates the (infinitesimal) probability that the system is at the phase space point (q, p); the associated probability density is $e^{-H/T}/Z$.
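As an illustration of how these equations of motion may be integrated numerically, the following is a minimal sketch for a single particle on a circle. It assumes the standard underdamped Langevin form (dq = p dt, dp = -V'(q) dt - γ p dt + sqrt(2γT) dW) and a simple first-order update; the function name `simulate_langevin`, the cosine potential and all parameter names are illustrative choices, not taken from the text. Production molecular dynamics codes use more accurate splitting schemes.

```python
import math
import random


def simulate_langevin(n_steps, dt, gamma, temp, v0=0.0, length=1.0,
                      q0=0.0, p0=0.0, seed=0):
    """First-order sketch for one particle on a circle of perimeter `length`.

    Assumed model: dq = p dt, dp = -V'(q) dt + db,
    with db = -gamma p dt + sqrt(2 gamma T) dW and the illustrative
    potential V(q) = -v0 cos(2 pi q / length).
    """
    rng = random.Random(seed)
    k = 2.0 * math.pi / length
    noise_amp = math.sqrt(2.0 * gamma * temp * dt)
    q, p = q0, p0
    path = [(q, p)]
    for _ in range(n_steps):
        force = -v0 * k * math.sin(k * q)            # conservative force -V'(q)
        db = -gamma * p * dt + noise_amp * rng.gauss(0.0, 1.0)
        p = p + force * dt + db                      # noise acts on p only
        q_next = q + p * dt                          # q follows the updated momentum
        # Identify q = +length/2 with q = -length/2 (periodic co-ordinate).
        q = (q_next + 0.5 * length) % length - 0.5 * length
        path.append((q, p))
    return path
```

With `gamma = 0` and `temp = 0` the update reduces to deterministic Hamiltonian motion; with `v0 = 0` the momentum is an Ornstein-Uhlenbeck process whose stationary variance should be close to T for small `dt`.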

B. Energy Flow into the Heat Bath
It is useful to also consider the flow of energy from the system to the heat bath. The energy E in the heat bath obeys the equation of motion
$$dE_t = -\sum_i p^i_t \circ db^i_t, \tag{6}$$
where the circle indicates a Stratonovich product. This product of a force and a velocity is the rate at which the particles do work on their surrounding environment, which therefore corresponds with the heat transfer. Combining with (3) one has
$$dE_t = -\sum_i \Big[ p^i_t \circ dp^i_t + \frac{\partial V}{\partial q_i}(q_t)\,dq^i_t \Big]. \tag{7}$$
By the Stratonovich chain rule, the right-hand side is $-dH_t$. Hence, d(H + E) = 0: the total energy H + E is (strictly) conserved. Note that the internal co-ordinates (q, p) evolve independently of E: this energy is useful as a book-keeping tool, but it does not affect the system's dynamics. Hence, the heat flow into the bath over the time interval $[t', t]$ can be recovered as
$$Q(t',t) = \int_{t'}^{t} dE_s. \tag{8}$$
(Absolute values of the bath energy E are not well-defined within this theory, but the heat transfer may be computed for any trajectory.) Since d(H + E) = 0, it follows that $Q(t',t) = H(q_{t'},p_{t'}) - H(q_t,p_t)$.
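The energy book-keeping described above can be checked along a discretised trajectory. The sketch below is an illustration only: it assumes the bath energy changes by minus the momentum times the bath force increment, evaluated with a midpoint (Stratonovich-like) rule, and it uses a hypothetical function name `heat_and_energy_drop` and a cosine potential. For a free particle (`v0 = 0`) the accumulated heat equals the drop in H up to rounding; with a potential the two agree up to a discretisation error that shrinks with `dt`.

```python
import math
import random


def heat_and_energy_drop(n_steps, dt, gamma, temp, v0=0.0, p0=1.0, seed=0):
    """Return (heat into bath, H(start) - H(end)) for one sample path.

    Assumed model: dq = p dt, dp = -V'(q) dt + db, dE = -p o db,
    with V(q) = -v0 cos(2 pi q) on a circle of perimeter 1.
    """
    rng = random.Random(seed)
    two_pi = 2.0 * math.pi
    noise_amp = math.sqrt(2.0 * gamma * temp * dt)

    def potential(x):
        return -v0 * math.cos(two_pi * x)

    def force(x):
        return -v0 * two_pi * math.sin(two_pi * x)

    q, p = 0.0, p0                      # q is left unwrapped: V is periodic
    h_start = 0.5 * p * p + potential(q)
    heat = 0.0
    for _ in range(n_steps):
        db = -gamma * p * dt + noise_amp * rng.gauss(0.0, 1.0)
        p_new = p + force(q) * dt + db
        p_mid = 0.5 * (p + p_new)       # midpoint momentum (Stratonovich rule)
        heat += -p_mid * db             # work done by the particle on the bath
        q += p_mid * dt
        p = p_new
    h_end = 0.5 * p * p + potential(q)
    return heat, h_start - h_end
```

The midpoint rule is used because the product of momentum and noise increment is sensitive to the stochastic-calculus convention; an endpoint rule would give a different, biased heat.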

C. Model: Non-Conservative Forces
To include non-equilibrium steady states in our general setting, we replace the gradient force $-(\partial V/\partial q_i)$ in (3) by a general force f(q) that depends only on the co-ordinates q, but is not necessarily the gradient of a potential. That is, we consider
$$dq^i_t = p^i_t\,dt, \qquad dp^i_t = f_i(q_t)\,dt + db^i_t. \tag{9}$$
The coupling to the heat bath is still given by (4) and the energy of the heat bath obeys (6); the heat flow is still given by (8). However, (7) becomes
$$dE_t = -\sum_i \Big[ p^i_t \circ dp^i_t - f_i(q_t)\,dq^i_t \Big]. \tag{10}$$
The invariant measure of this system is not known in general: we denote the steady state distribution by π but there is no analogue of (5). In addition, in the steady states of conservative systems, one expects the average of E to be independent of time: $E[Q(t',t)] = 0$ in steady state. For the non-conservative forces considered here, one expects $E[Q(t',t)] > 0$ for $t > t'$ (unless f is the gradient of a potential): see also Section II E below.
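A simple numerical illustration of positive steady-state heat flow is a free particle driven by a constant force. In that case friction balances the drive at mean momentum f/γ, so the mean heat rate should approach f²/γ. The sketch below assumes the same Langevin form and midpoint heat rule as above; `mean_heat_rate` and its parameters are hypothetical names chosen for this example.

```python
import math
import random


def mean_heat_rate(f_ext, gamma, temp, dt=0.01, n_steps=200_000, seed=1):
    """Estimate (1/tau) Q(0, tau) for dp = f_ext dt + db (no potential).

    Assumed bath force: db = -gamma p dt + sqrt(2 gamma T) dW,
    with heat increments dE = -p o db (midpoint rule).
    """
    rng = random.Random(seed)
    noise_amp = math.sqrt(2.0 * gamma * temp * dt)
    p = f_ext / gamma                   # start at the steady-state mean momentum
    heat = 0.0
    for _ in range(n_steps):
        db = -gamma * p * dt + noise_amp * rng.gauss(0.0, 1.0)
        p_new = p + f_ext * dt + db
        heat += -0.5 * (p + p_new) * db
        p = p_new
    return heat / (n_steps * dt)
```

For `f_ext = 0` the estimate is close to zero, as expected for an equilibrium steady state; for `f_ext > 0` it fluctuates around `f_ext**2 / gamma`, with a statistical error that decreases as the trajectory length grows.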

D. Path Measures
We consider trajectories of these models, over a fixed time interval $[0, \tau]$. Define $X_t = (q_t, p_t)$ as the state of the system at time t and let $X = (q_t, p_t)_{t\in[0,\tau]}$ be a sample path (trajectory). Note that the equation of motion for the co-ordinates q has no stochastic part, so all possible trajectories of this system have $\partial_t q_i = p_i$. In general, we use P to indicate a path measure for such a system, with initial conditions sampled from the invariant measure π. In addition, let $P_{X_0}$ be the path measure with fixed initial condition $X_0$. Hence
$$dP(X) = dP_{X_0}(X)\cdot d\pi(X_0). \tag{11}$$
To obtain an explicit representation of the path measure, we rewrite (9) as
$$dq^i_t = p^i_t\,dt, \qquad dp^i_t = \gamma w_i(q_t,p_t)\,dt + \sqrt{2\gamma T}\,dW^i_t \tag{12}$$
and denote the invariant measure of this system by π. (Comparing with (4) and (9), one has $\gamma w_i = f_i(q) - \gamma p_i$.) In this case, the (infinitesimal) probability of trajectory X is given (in the Stratonovich convention) by (13), where $w_t$ indicates $w(q_t, p_t)$, and $P^{\rm ref}_{X_0}$ is a reference measure (corresponding to $p_t$ being a random walk with diffusion constant γT, and dq = p dt as an equality). Expectation values with respect to such path measures can be obtained as $E_{X_0}[G] = \int G(X)\,dP_{X_0}(X)$. In physics one might equivalently write a path integral (14) (again in Stratonovich convention) [10,14], which has exactly the same meaning (with $\delta[\partial_t q - p]$ encapsulating the constraint that $\partial_t q_t = p_t$ for all times t).

E. Time-Reversal Symmetry and Relation to Heat Flow
Let $\mathbb{T}$ be a time-reversal operator acting on paths, which reverses the arrow of time and the direction of all momenta. That is, $(\mathbb{T}X)_t := (q_{\tau-t}, -p_{\tau-t})$. For a single phase space point (q, p) let $\mathbb{T}(q,p) := (q,-p)$. The time-reversibility of Hamiltonian evolution combined with the appropriate combination of forces in (4) means that for conservative systems described by (3), the steady state has a time-reversal symmetry
$$dP(X) = dP(\mathbb{T}X). \tag{15}$$
Moreover, using (13), as it applies to systems described by (3) or (9), it may be verified that
$$\frac{1}{T}\,Q(0,\tau) = \log\frac{dP_{X_0}(X)}{dP_{(\mathbb{T}X)_0}(\mathbb{T}X)}, \tag{16}$$
which relates the heat flow into the bath to the breaking of time-reversal symmetry, for these systems. For conservative systems, combining (5), (11) and (15) recovers $Q(0,\tau) = H(X_0) - H(X_\tau)$, as required since $dQ_t = -dH_t$ in that case. In general (for both conservative and non-conservative systems), averaging (16) with respect to dP(X), which corresponds to initial conditions taken from the invariant measure, one sees also that $E_P[Q(0,\tau)] \geq 0$: the average steady-state energy flow into the bath is non-negative, and vanishes only if the system is time-reversal symmetric.
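The link between heat flow and the ratio of forward and backward path probabilities can be made explicit with a short calculation. The following is a sketch under stated assumptions, not a reproduction of the original derivation: it assumes that the path weight for an SDE of the form dp = γw dt + sqrt(2γT) dW takes the standard Girsanov form in the Stratonovich convention, that γw = f(q) - γp, and that the reference measure is invariant under time reversal.

```latex
% Assumption: standard Girsanov weight (Stratonovich convention) for
% dp = \gamma w\,dt + \sqrt{2\gamma T}\,dW, relative to the driftless reference.
\begin{align*}
\log\frac{dP_{X_0}}{dP^{\rm ref}_{X_0}}(X)
  &= \frac{1}{2T}\int_0^\tau w_t\circ dp_t
   - \frac{\gamma}{4T}\int_0^\tau |w_t|^2\,dt
   - \frac{\gamma}{2}\int_0^\tau \nabla_p\!\cdot w_t\,dt .
\end{align*}
% Along the reversed path the momenta change sign, so w = f/\gamma - p is
% replaced by f/\gamma + p, while \circ\,dp and \nabla_p\cdot w are unchanged.
\begin{align*}
\log\frac{dP_{X_0}(X)}{dP_{(\mathbb{T}X)_0}(\mathbb{T}X)}
  &= \frac{1}{2T}\int_0^\tau
     \Big[\big(\tfrac{f}{\gamma}-p\big)-\big(\tfrac{f}{\gamma}+p\big)\Big]\circ dp
   - \frac{\gamma}{4T}\int_0^\tau
     \Big[\big|\tfrac{f}{\gamma}-p\big|^2-\big|\tfrac{f}{\gamma}+p\big|^2\Big]dt \\
  &= \frac{1}{T}\int_0^\tau \big[-p\circ dp + f\cdot p\,dt\big]
   = \frac{1}{T}\int_0^\tau dE_t
   = \frac{Q(0,\tau)}{T}.
\end{align*}
```

The last line uses dE = -p ∘ db together with db = dp - f dt and dq = p dt. The divergence term cancels because it is the same for the forward and reversed paths.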
For the connection to fluctuation theorems, see [10].

III. ABSENCE OF DISSIPATION IN CONDITIONED ENSEMBLES OF TRAJECTORIES
This section builds on recent work by Jack and Evans [11], concerning dissipation in certain trajectory ensembles. The motivation for that work was a hypothesis [15] that properties of non-equilibrium steady states can be inferred by analysing a particular class of rare fluctuations that occur at equilibrium. These are rare events in which time-averaged currents have non-typical values. This hypothesis can be motivated as a far-from-equilibrium generalisation of linear-response theory, with its associated fluctuation-dissipation theorems and Onsager reciprocity relations. Following [11], we show that this hypothesis fails qualitatively, in that it does not correctly account for dissipation in the non-equilibrium steady states.

A. Parity Symmetry
The results of this section apply to conservative systems whose Hamiltonian has a parity symmetry.
The idea is that on inverting some subset of the coordinates S ⊆ {1, 2, . . . , n} and their conjugate momenta, the Hamiltonian is unchanged.
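Hence, define a parity operator $\mathbb{P}$ that acts on paths as $(\mathbb{P}X)_t := (\tilde q_t, \tilde p_t)$, where
$$(\tilde q_i, \tilde p_i) := \begin{cases} (-q_i, -p_i), & i \in S, \\ (q_i, p_i), & \text{otherwise}. \end{cases}$$
In addition, $\mathbb{P}$ acts on single phase space points as $\mathbb{P}(q,p) = (\tilde q, \tilde p)$. We restrict in the following to parity-symmetric Hamiltonians H for which $H(\tilde q, \tilde p) = H(q,p)$.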

B. Examples
To illustrate our general arguments, we focus on a very simple example: consider a single particle moving on a circle. The system is described by (3) with a single co-ordinate q and conjugate momentum p. We take L = 1 and choose $V(q) = -V_0\cos(2\pi q)$ for some constant $V_0 \geq 0$. Hence
$$dq_t = p_t\,dt, \qquad dp_t = -2\pi V_0 \sin(2\pi q_t)\,dt + db_t. \tag{19}$$
The parity symmetry in this case is simply $\mathbb{P}(q,p) = (-q,-p)$. We also define a corresponding non-equilibrium system (of the form (9)) where a (constant) force $f_{\rm ext}$ drives the system around the circle. In this case
$$dq_t = p_t\,dt, \qquad dp_t = \big[f_{\rm ext} - 2\pi V_0 \sin(2\pi q_t)\big]\,dt + db_t. \tag{20}$$
We analyse this example in Section III G below. Figure 1 illustrates the system, and some of its properties.
For an application of these ideas in a more complicated example-fluid motion under shear-see [11]. There, the particles move in two dimensions: they are confined in the vertical direction, but there are periodic boundaries in the horizontal direction. Shear flow occurs when the upper and lower boundaries move (horizontally) with relative velocity v. In that case, the parity P corresponds to a plane reflection, which reverses the direction of the shear flow, but leaves the orthogonal directions unchanged (see Figure 1 in [11]).
Figure 1. (a) Illustration of the example system described in Section III B. A particle moves on a circle, with co-ordinate $q = \theta/\pi$. There is a conservative force $-V'(q)$ and (possibly) an external force $f_{\rm ext}$ that drives the system around the circle. (b) In the conditioned steady state, the symmetry (27) requires that $P(\theta)$ remains symmetric under $\theta \to -\theta$, see Equation (35). However, to support the finite current J, the momentum $p = \dot q = \dot\theta/\pi$ must have a distribution that favours p > 0 (not shown).

C. Currents and Fluxes
Define $u(q,p) := \sum_i a_i(q)\,p_i$ as a general momentum, that changes its sign under time-reversal (here the $a_i$ are a set of weight functions). We consider the time-integral of one such momentum, which we identify as an (average) flux
$$J_\tau(X) := \frac{1}{\tau}\int_0^\tau \sum_i a_i(q_t)\,p^i_t\,dt. \tag{21}$$
We further assume that $a_i(\tilde q) = a_i(q)$: with this choice $J_\tau$ changes its sign under parity-reversal. Hence
$$J_\tau(\mathbb{T}X) = -J_\tau(X) = J_\tau(\mathbb{P}X). \tag{22}$$
As a shorthand for such an equation, we say that $J_\tau$ is "odd" in both $\mathbb{T}$ and $\mathbb{P}$. However, J is even in the combined symmetry operation $\mathbb{PT}$, that is,
$$J_\tau(\mathbb{PT}X) = J_\tau(X). \tag{23}$$
In general, we identify time-integrated quantities that are odd in $\mathbb{T}$ as "fluxes". An important example is $Q(0,\tau) = \int_0^\tau dE_s$ (recall (8)), which is odd in $\mathbb{T}$ but even in $\mathbb{P}$. Hence Q is odd under $\mathbb{PT}$: we write Q(X) for the heat transfer associated to path X so that
$$Q(\mathbb{PT}X) = -Q(X). \tag{24}$$
We refer to fluxes that are odd in $\mathbb{P}$ (and hence even in $\mathbb{PT}$) as "transport fluxes": we imagine that some quantity is being transported through the system in a particular direction that is odd under parity. On the other hand, we refer to fluxes that are even in $\mathbb{P}$ (and hence odd in $\mathbb{PT}$) as "dissipative fluxes": they are independent of the direction of transport.
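The parity classification of fluxes can be checked directly on a discretised path. The sketch below is illustrative only: it takes a single co-ordinate with parity acting as (q, p) → (-q, -p), a weight a(q) = 1 so that the flux is the time-averaged momentum, and the conservative expression Q = H(start) - H(end) for the heat with a cosine potential. All function names are hypothetical. The relations tested are purely kinematic, so the sample path need not solve the equations of motion.

```python
import math


def hamiltonian(q, p, v0=1.0):
    """H = p^2/2 - v0 cos(2 pi q): symmetric under (q, p) -> (-q, -p)."""
    return 0.5 * p * p - v0 * math.cos(2.0 * math.pi * q)


def time_reverse(path):
    """Reverse the order of the path and flip all momenta."""
    return [(q, -p) for (q, p) in reversed(path)]


def parity(path):
    """Invert the co-ordinate and its conjugate momentum at every time."""
    return [(-q, -p) for (q, p) in path]


def flux(path):
    """Discrete analogue of the time-averaged momentum (weight a(q) = 1)."""
    return sum(p for _, p in path) / len(path)


def heat_conservative(path, v0=1.0):
    """Heat into the bath for a conservative system: H(start) - H(end)."""
    return hamiltonian(*path[0], v0) - hamiltonian(*path[-1], v0)


sample = [(0.0, 1.0), (0.1, 1.5), (0.3, 0.5), (0.35, -0.2)]
pt_image = parity(time_reverse(sample))
print(flux(sample), flux(pt_image))                            # equal: transport flux
print(heat_conservative(sample), heat_conservative(pt_image))  # opposite signs
```

The flux changes sign under either operation alone and is restored by the combination, while the heat is unchanged by parity and changes sign under time reversal.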

D. Ensembles and PT-Symmetry
We consider a set of rare sample paths for which $J_\tau$ has a non-typical value J. In later sections, we consider the limit of large τ, but for the moment, τ can take any value. Define a path distribution $P_J$ that is conditioned on this value of $J_\tau$ as
$$dP_J(X) := dP(X \mid J_\tau(X) = J). \tag{25}$$
This definition means that $dP_J(X) = 0$ if $J_\tau(X) \neq J$. In addition, if two trajectories $X, X'$ satisfy $J_\tau(X) = J_\tau(X') = J$, then
$$\frac{dP_J(X)}{dP_J(X')} = \frac{dP(X)}{dP(X')}. \tag{26}$$
Now, fix some $J \neq 0$ and consider a trajectory X with $J_\tau(X) = J$. Then, trajectories $\mathbb{P}X$ and $\mathbb{T}X$ both have $J_\tau = -J$, so $dP_J(\mathbb{P}X) = 0 = dP_J(\mathbb{T}X)$. On the other hand, (23) means that trajectory $\mathbb{PT}X$ has $J_\tau(\mathbb{PT}X) = J$ and (18) implies that $dP(\mathbb{PT}X) = dP(X)$. Hence, using (26), one has
$$dP_J(\mathbb{PT}X) = dP_J(X). \tag{27}$$
This symmetry of conditioned path ensembles mirrors the main result of [11] (which applies to a related set of "biased" path ensembles). (We derived this symmetry for systems described by (3) but the discussion of this section generalises immediately to any system as long as the path measure satisfies $P(\mathbb{T}X) = P(X) = P(\mathbb{P}X)$ and the current $J_\tau$ satisfies $J_\tau(\mathbb{T}X) = -J_\tau(X) = J_\tau(\mathbb{P}X)$.)

E. Observable Consequences
Recall that the heat flow $Q(0,\tau)$ is odd under $\mathbb{PT}$. Hence, the average value of Q for the conditioned ensemble is
$$E_J[Q] = \int Q(X)\,dP_J(X) = \int Q(\mathbb{PT}X)\,dP_J(\mathbb{PT}X) = -\int Q(X)\,dP_J(X), \tag{28}$$
where the second equality is obtained by a change of integration variable and the third uses (24) and (27). It follows that $E_J[Q] = 0$. Hence, on average, no energy flows into the heat bath in the steady state of the conditioned ensemble, even though a finite current J is flowing.
In the nomenclature of Section III C, this argument can be used to show that all dissipative fluxes vanish in the conditioned ensemble. Hence we say that the conditioned ensemble is free from dissipation. On the other hand, note that all co-ordinates q i are even under T. In this case, the derivation (28) may be used to show that the averages of all co-ordinates that are odd in P must vanish in the conditioned ensemble, see [11] for specific examples.

F. Large Deviation Principle and Auxiliary Process
We now consider the limit of large time τ, in which case the probability that a sample path has $J_\tau(X) = J$ is governed by a large deviation principle at "level-1" (in the nomenclature of Donsker and Varadhan). We write this as [16]
$$\mathrm{Prob}\big(J_\tau(X) \approx J\big) \asymp \exp[-\tau I(J)], \tag{29}$$
where I is the rate function. One has $I(J) \geq 0$; also, if $I(J) > 0$, then the probability of current J is suppressed exponentially as $\tau \to \infty$. These events are clearly very rare. The conditioned distribution $P_J$ is not easy to analyse. To make further progress, it is useful to define an "auxiliary" Markov process [16-21] whose steady state path measure is close to $P_J$: see [16] for a comprehensive discussion. To do so, we first define a scaled cumulant generating function
$$\psi(s) := \lim_{\tau\to\infty} \frac{1}{\tau} \log E_P\big[\exp(s\tau J_\tau)\big] \tag{30}$$
and define $j(s) = \psi'(s)$, such that j is a monotonically increasing function of s, with inverse $s^* = j^{-1}$. Assuming that I is convex (that is, there are no dynamical phase transitions [1]), then
$$I(J) = \sup_s \big[sJ - \psi(s)\big], \tag{31}$$
and the value that achieves the supremum is $s^*(J)$. Assuming that the original process of interest is given by (3), define an s-dependent auxiliary process (32), for some function $G_s : \Omega \to \mathbb{R}$ to be specified below. The force $\tilde b^i_t$ has the same statistical properties as $b^i_t$ in (9), but we use a different notation because we are going to make a mapping between sample paths of this auxiliary process and sample paths of the original process (3). At the level of sample paths, $b \neq \tilde b$. We identify $\gamma(\partial G_s/\partial p_i)$ as a "control force" [21,22] that realises the required flux J (see also [20]).
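The Legendre transform that links the rate function to the scaled cumulant generating function can be illustrated numerically. The sketch below assumes the convention I(J) = sup over s of [sJ - ψ(s)] and uses ψ(s) = T s²/γ, the free-particle result quoted in Section III G, as input. The function names and the brute-force grid search are illustrative choices; for this quadratic ψ the supremum can be done by hand, giving I(J) = γJ²/(4T) at s = γJ/(2T).

```python
def scgf_free_particle(s, gamma, temp):
    """psi(s) = T s^2 / gamma, as quoted for the case with no potential."""
    return temp * s * s / gamma


def rate_function(j, gamma, temp, s_max=20.0, n_grid=40001):
    """Approximate I(J) = sup_s [s J - psi(s)] by a search over a grid in s.

    The grid range and resolution are arbitrary choices for this sketch;
    the maximiser must lie inside [-s_max, s_max] for the result to be valid.
    """
    best = float("-inf")
    for k in range(n_grid):
        s = -s_max + 2.0 * s_max * k / (n_grid - 1)
        best = max(best, s * j - scgf_free_particle(s, gamma, temp))
    return best


print(rate_function(1.0, 1.0, 1.0))   # compare with gamma * J**2 / (4 * T)
```

The rate function vanishes at J = 0, the typical value of the current at equilibrium, and grows quadratically away from it in this Gaussian example.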
Since the sample paths of the auxiliary process should be as close as possible to those of the original process, we use the fact that the heat current in the original process is a deterministic function of (p, q) and satisfies (7). Hence, that equation also is used to compute the energy E t for the auxiliary process.
The determination of a suitable function $G_s$ is described in [16]. Briefly, let $\mathcal{L}$ be the generator of the process (3). Then $\exp(-G_s)$ solves the eigenvalue equation
$$\Big(\mathcal{L} + s\sum_i a_i(q)\,p_i\Big)\,e^{-G_s} = \psi(s)\,e^{-G_s}, \tag{33}$$
where the coefficients $a_i$ are those appearing in (21) and the eigenvalue ψ(s) coincides with (30). Under these conditions, let $P^{\rm aux}_s$ be the path measure of this s-dependent process. Then, for $s = s^*(J)$,
$$P^{\rm aux}_s \approx P_J, \tag{34}$$
in the sense that the relative entropy between these two distributions is o(τ) [16]. For our purposes, this result has two important implications. First, the analysis of the (intractable) probability measure $P_J$ has been reduced to analysis of the auxiliary model, which is often easier. Second, it means that the physical behaviour associated with the conditioned ensemble $P_J$ can be reproduced by the stochastic process (32). In particular, this auxiliary process achieves current flow without dissipation. This unusual situation can be achieved only with the aid of a "control potential" $G_s$, which in general has a complex dependence on q and p: see Section III G. In addition, comparing (3) and (32), one sees that $db^i_t$ corresponds to $d\tilde b^i_t - \gamma T(\partial G_s/\partial p_i)\,dt$. The interpretation of this fact is that when one considers the conditioned process, the stochastic noises that appear in the definition of the original process do not have a mean value of zero any more: in fact the $dW^i_t$ that appears in the definition of $db^i_t$ has a mean value proportional to $\partial G_s/\partial p_i$ once the conditioning is applied. That is, one can think of the control force as a bias on the noise that is induced by the conditioning.

G. Example System: Comparisons between the Auxiliary Process and Other Physical Ensembles
In this section, we compare the auxiliary process defined above with two other physical processes, in order to explore in more detail the nature of dissipation. We use the example system (19) to illustrate the relevant ideas. We take $a(q) = 1$, so that $u(q,p) = p$. In the special case where there is no potential ($V_0 = 0$), the auxiliary process can be obtained exactly. The generator $\mathcal{L}$ acts on functions $g : \Omega \to \mathbb{R}$ as $\mathcal{L}g = (p\cdot\nabla_q - \gamma p\cdot\nabla_p + \gamma T\nabla_p^2)g$. One finds $G_s(p,q) = -sp/\gamma$ and $\psi(s) = Ts^2/\gamma$. In that case, the equations of motion (32) for the auxiliary process coincide with the non-equilibrium system (20), with $f_{\rm ext} = sT$. However, as noted above, the heat flow in the auxiliary process is given by (7). On the other hand, the heat flow in the non-equilibrium system is given by (10), with $f = f_{\rm ext}$, a constant. In this case, it is easily verified that for long trajectories ($\tau \to \infty$), the auxiliary process (and hence the conditioned process) have $\frac{1}{\tau}E[Q(0,\tau)] = 0$ while the non-equilibrium process has $\frac{1}{\tau}E[Q(0,\tau)] = (f_{\rm ext})^2/\gamma$. We emphasise that this case $V_0 = 0$ is a special one: if one inspects the statistics of the particle trajectories (that is, $(q_t,p_t)_{t\in[0,\tau]}$) then it is not possible to determine whether one is observing the non-equilibrium system (20) or the conditioned equilibrium system based on (19). On the other hand, if one observes (by some physical measurement) the heat flow into the reservoir for these two cases, then one sees that the conditioned process has no dissipation (no net heat flow), but the non-equilibrium process does have a finite rate of heat flow into the environment.
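The free-particle solution quoted above can be verified with a few lines of code. The sketch below assumes that the tilted eigenvalue problem has the standard form (L + s u) exp(-G_s) = ψ(s) exp(-G_s) with u = p, and that the trial function exp(sp/γ) does not depend on q, so only the momentum derivatives of the generator contribute. Derivatives are approximated by central finite differences; the helper name `tilted_generator` and the step `h` are choices made for this example.

```python
import math


def tilted_generator(g, p, s, gamma, temp, h=1e-3):
    """Apply (L + s p) to a q-independent function g(p).

    For such functions L g = -gamma p g'(p) + gamma T g''(p);
    both derivatives are estimated by central finite differences.
    """
    d1 = (g(p + h) - g(p - h)) / (2.0 * h)
    d2 = (g(p + h) - 2.0 * g(p) + g(p - h)) / (h * h)
    return -gamma * p * d1 + gamma * temp * d2 + s * p * g(p)


s, gamma, temp = 0.7, 1.3, 0.9


def trial(p):
    """exp(-G_s) with G_s = -s p / gamma."""
    return math.exp(s * p / gamma)


for p in (-1.0, 0.0, 0.5, 2.0):
    ratio = tilted_generator(trial, p, s, gamma, temp) / trial(p)
    print(p, ratio, temp * s * s / gamma)   # ratio should match T s^2 / gamma
```

The ratio is independent of p, which is what identifies the trial function as an eigenfunction and T s²/γ as the corresponding eigenvalue.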
In the general case V 0 > 0, the two ensembles are more easily distinguished. For example, one may show (see (35) below) that E J (q) = 0 but the steady state of the non-equilibrium process has E(q) > 0 if f ext > 0. In that case, the physical situation is that the external force f ext drives the system away from the potential minimum (at q = 0); once the particle has reached the maximum then it wraps around the circle and falls back to the minimum, and the work that was done by the external force is dissipated as heat in the bath. On the other hand, the physical interpretation of the conditioned process is that the particle borrows energy from the heat bath in order to overcome the barrier, before returning that energy to the heat bath as it falls back down again. For explicit computations on a similar system in the overdamped limit, see [23].
The physical interpretation of the auxiliary process in this case is that the control force −γT (∂G/∂p) does work to push the system away from the minimum, but this work is not dissipated as heat in the bath: instead the control force acts to slow down the particle as it falls back to the minimum, in such a way as to avoid any dissipation. We expect that this requires complex velocity-dependent forces that are not expected in typical equilibrium systems. One may imagine that the control potential is applied by a kind of Maxwellian demon, that has full control over all aspects of the particle motion, and hence can avoid the usual expectations of thermodynamics, that persistent particle currents should be accompanied by dissipation.
Based on the numerical results of [11] and the symmetries of the problem, we illustrate in Figure 1b how the parity-time ($\mathbb{PT}$) symmetry affects the conditioned steady state of the example problem discussed in Section III B. One observes a qualitative difference between the conditioned steady state and the non-equilibrium steady state that is observed when $f_{\rm ext} > 0$. To see this, note that if a co-ordinate $q_i$ is odd under the parity operation $\mathbb{P}$ then its marginal distribution (probability density) $P^J_i$ is necessarily symmetric, in the conditioned steady state. This steady-state distribution is evaluated at some appropriate time t: for example, consider the limit $\tau \to \infty$ with $t = \alpha\tau$ for some $\alpha \in (0,1)$. Then
$$\begin{aligned} P^J_i(q) &= \int \delta\big(q - q_i(X_t)\big)\,dP_J(X) \\ &= \int \delta\big(q - q_i((\mathbb{PT}X)_t)\big)\,dP_J(\mathbb{PT}X) \\ &= \int \delta\big(q + q_i(X_{\tau-t})\big)\,dP_J(X) \\ &= P^J_i(-q), \end{aligned} \tag{35}$$
where $q_i(X_t)$ is the value of co-ordinate $q_i$ at time t in trajectory X, the second line is a change of integration variable $X \to \mathbb{PT}X$, the third uses (27) and that $q_i$ is odd under $\mathbb{P}$. The last equality uses that $P^J_i(q)$ is independent of the parameter α. Hence one has also $E_J(q_i) = 0$.

We imagine that the noise force db in (19) comes from friction between the particle and a surrounding solvent, but now imagine that the solvent is moving with constant velocity v.
We refer to this as a system with advection (of the particle, by the solvent). In this case the equations of motion are obtained by applying a Galilean transformation to (19), which yields (36) and (37). In (37), the first term on the right-hand side comes from friction with the moving solvent. In this case one may verify $dE_t = -dH_t$ with $H = (p_t - v)^2/2$. The steady state has $(d/dt)E(H_t) = 0$ and so there is no heat flow into the bath: $\frac{1}{\tau}E[Q(0,\tau)] = 0$.

H. Formulae for Heat Flow in Terms of Path Probabilities
Recall (16), which connects the heat flow in a trajectory with the ratio of probabilities of forward and backward paths, for the systems described by (3)-(9). It follows from that equation that $\frac{1}{\tau}E[Q(0,\tau)] > 0$ if, for typical trajectories X, $dP_{X_0}(X)$ differs significantly from $dP_{(\mathbb{T}X)_0}(\mathbb{T}X)$. However, the results of the previous section show clearly that systems with advection and conditioned ensembles (and auxiliary process) violate (16), in that there is breaking of time-reversal symmetry, but no heat flow.
For the case with advection, the solution to this apparent paradox [24] is that one should replace (16) by the alternative formula
$$\frac{1}{T}\,Q(0,\tau) = \log\frac{dP^{v}_{X_0}(X)}{dP^{-v}_{(\mathbb{T}X)_0}(\mathbb{T}X)}, \tag{38}$$
where $P^{v}$ is the path probability distribution for the system with solvent velocity v, and similarly $P^{-v}$ has solvent velocity −v. It may be checked directly from the path probabilities (13) that this gives the correct heat transfer in our case. Our inference from [24] is that one should not regard (16) as a fundamental formula for heat flow: one should instead compute the heat transfer to the bath directly using (36) and then derive the corresponding formula in terms of path probabilities. Based on that assumption, it is easily verified that for the conditioned ensembles as defined here, one should take (39) or, equivalently, (40), as proposed in [11].

I. Outlook
We summarise the outcomes of this analysis as follows. First, conditioning on fluxes that are odd in P leads to dissipation-free ensembles, in the sense that no heat flows from the system into its environment. Second, the behaviour observed in these ensembles can be reproduced by auxiliary models, but this requires an "optimal control" potential $G_s$ that (typically) depends in a complicated way on all co-ordinates in the system, and does not correspond to a simple physical driving force. The fact that such forces tend to have a complex dependence on the system's state has been remarked before [18,20,25,26]. In the present context, our results help to rationalise this fact: these forces act to drive currents without inducing dissipation, so they must inevitably be very different from driving forces that appear in typical physical systems. Third, entropy production (in the environment) can (in these situations) be directly computed in terms of an energy flow, which helps to clarify what is the appropriate formula for obtaining Q in terms of path probabilities.
To see the consequences of these results, we focus on the comparison between non-equilibrium steady states (e.g., in the example (20)), and conditioned ensembles of trajectories. In both cases, currents flow through the system, but only the conditioned ensemble respects the PT symmetry (27). This property of the conditioned ensemble has its origin in the symmetries of the underlying dynamics, which still have implications for rare fluctuations in which large currents are sustained over long time periods. In response theory for equilibrium states, it is familiar that the same symmetries place strong constraints on linear responses, leading (for example) to Onsager reciprocity and fluctuation-dissipation theorems [10]. However, the far-from-equilibrium steady states considered here do not retain any such symmetries-the connection between spontaneous fluctuations and responses to perturbations has broken down, as do the usual fluctuation-dissipation theorems. As discussed in [11], this difference between responses and spontaneous fluctuations leads to the failure of maximum entropy (or maximum caliber) approaches such as that of [15].

IV. ORTHOGONALITY OF FORCES AND CURRENTS IN NON-EQUILIBRIUM SYSTEMS
In this section, we discuss a different set of symmetry properties of dynamical fluctuations in systems with non-conservative dynamics, coupled to a heat bath. The idea is to decompose forces in the system into two pieces, according to their behaviour under time-reversal. This leads to a decomposition of the heat flow into two contributions: housekeeping heat and excess heat. It also leads to a decomposition of probability currents which has a geometrical interpretation: the current has two orthogonal components, one of which can be attributed to a free energy gradient. For overdamped systems, these results are familiar from the theory of stochastic thermodynamics [10] and from the Macroscopic Fluctuation Theory [27]. We will show that for systems with momenta, the construction is slightly more complicated, and we discuss the resulting decompositions and their geometrical interpretations.

A. Overdamped Diffusions
We first review the situation in overdamped systems described by stochastic differential equations (SDEs) or first-order Langevin equations. We summarise relevant results from stochastic thermodynamics [10] and from Macroscopic Fluctuation Theory [27]. The physical significance of these results is summarised in Section IV A 5.

Model
We consider a system with state $x_t = (x^i_t)_{i=1}^n$, which takes values in a space $\Gamma \subseteq \mathbb{R}^n$. It evolves in time according to the SDE (41), where we introduced a set of noise intensities $(\gamma_i)_{i=1}^n$, one for each coordinate. Assuming that all the $\gamma_i$ are finite, we identify the forces $f_i$ that drive the particle motion as in (42). Comparing with (9), one sees that $\gamma$ plays the role of a noise intensity in both systems. One way to arrive at (41) is to consider the overdamped limit of (9); note however that on taking this limit, the noise intensity $\gamma_i$ in (41) does not correspond to the friction constant $\gamma$ in (9).
The heat transfer to the environment is given by (43) [10] (Equation 16). The path measure for this system is given by an analogue of (13), namely (44), where we write $f_t = f(x_t)$ for compactness of notation, $\hat{\gamma}$ is a diagonal matrix with elements $(\gamma_i)$, and $P_{\rm ref}$ corresponds to a random walk for $x_t$ with "diffusion matrix" $\hat{\gamma} T$. The SDE (41) is associated with a Fokker-Planck equation [13], given in (45), that describes the evolution of a probability density $\rho$ on $\Gamma$. Finally, for a general current $j : \Gamma \to \mathbb{R}^n$ and a vector field $F : \Gamma \to \mathbb{R}^n$, it is useful to define the pairing $\langle j, F \rangle$ as in (46).
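As a numerical illustration of this model, the following minimal sketch integrates an overdamped SDE with the Euler-Maruyama scheme and accumulates the heat as a Stratonovich (midpoint) sum. It rests on two assumptions that are not spelled out in the surviving text: that the SDE can be written as $dx^i_t = \gamma_i f_i(x_t)\,dt + \sqrt{2\gamma_i T}\,dW^i_t$, and that the heat is $Q = \int f(x_t) \circ dx_t$. The function name `simulate_overdamped` and its arguments are hypothetical.

```python
import numpy as np

def simulate_overdamped(force, x0, gamma, T, dt, n_steps, rng):
    """Euler-Maruyama integration of an overdamped diffusion (sketch).

    ASSUMED form of the SDE: dx_i = gamma_i f_i(x) dt + sqrt(2 gamma_i T) dW_i.
    ASSUMED heat: Q = int f(x_t) o dx_t, discretised with the midpoint rule.
    Returns the final state and the accumulated heat.
    """
    x = np.array(x0, dtype=float)
    gamma = np.broadcast_to(np.asarray(gamma, dtype=float), x.shape)
    heat = 0.0
    f_old = force(x)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x_new = x + gamma * f_old * dt + np.sqrt(2.0 * gamma * T * dt) * noise
        f_new = force(x_new)
        # Stratonovich (midpoint) increment of the heat, f o dx
        heat += float(np.dot(0.5 * (f_old + f_new), x_new - x))
        x, f_old = x_new, f_new
    return x, heat
```

For a conservative linear force $f = -\nabla V$ with $V = |x|^2/2$, the midpoint sum telescopes, so the accumulated heat equals $V(x_0) - V(x_\tau)$ up to rounding: all heat is then associated with relaxation in the potential.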

Time Reversal and Heat Transfer
Define a time-reversal operation $\mathbb{T}_0$ which reverses time but does not change any coordinates or momenta, as is appropriate for overdamped dynamics. That is, for paths $X$ on the time interval $[0, \tau]$, we take $(\mathbb{T}_0 X)_t = X_{\tau - t}$. Now define an adjoint dynamics [27] for which the path measure is $P^*$, with $dP^*(X) = dP(\mathbb{T}_0 X)$. (47)
That is, the steady-state probability of a particular path under the adjoint dynamics is equal to the probability of the time-reversed path under the original dynamics. By considering paths with $\tau \to 0$, one sees that the invariant measure associated with the adjoint process is the same as that of the original process, $\pi^* = \pi$. The equations of motion of the adjoint process may be derived either directly from (44) or using the Fokker-Planck equation (45). This latter approach is outlined in Appendix A 1. We summarise the result: let the invariant measure of the process be $\pi$, with $d\pi(x) = Z_0^{-1} e^{-U(x)} dx$ (48) and $Z_0 = \int_\Gamma e^{-U(x)} dx$ for normalisation. The "potential" $U$ can be obtained by solving a partial differential equation: see (A2). Then the adjoint process has equation of motion (41), with $f_i$ of (42) replaced by the adjoint force $f^*_i$ of (49). If $f_i = -(\partial V/\partial x_i)$ for some potential $V$ then (41) corresponds to the overdamped limit of a conservative system, the invariant measure is $d\pi(x) \propto e^{-V(x)/T} dx$, and $f = f^*$. Hence the original and adjoint processes coincide, and the system is time-reversal symmetric: $dP(\mathbb{T}_0 X) = dP^*(X) = dP(X)$. Defining $dP^*_{X_0}(X) := dP^*(X)/d\pi^*(X_0)$ by analogy with (11), and recalling that $\pi^* = \pi$, the analogue of (16) in this system is (50), which may be verified directly from (43) and (44). Now define $Q_{\rm hk}(0, \tau) = T \log[dP_{X_0}(X)/dP^*_{X_0}(X)]$, (51) which is known as the housekeeping heat [10]. Note that since $\pi = \pi^*$ one could equivalently define $Q_{\rm hk}(0, \tau) = T \log[dP(X)/dP^*(X)]$, but we choose to use path probabilities conditioned on their initial states, for later convenience. Using (11), (48) as well as $\pi^* = \pi$, one sees that the total heat obeys (52). That is, the total heat has two components: the final term on the right-hand side is related to the difference in probability between initial and final states and says that heat is transferred to the bath as the system relaxes towards more likely configurations. The other contribution $Q_{\rm hk}(0, \tau)$ is the additional heat flow that is not associated with relaxation towards more likely states.
This is a dissipative heat flow and represents energy input from external forces that is not available for doing work, but must be expended in order "to do the housekeeping". In steady states $E(Q) = E(Q_{\rm hk})$: the only contribution to the (average) heat flow is the housekeeping heat.

Splitting of the Force According to Time-Reversal
Define $f^S = (f + f^*)/2$ and $f^A = (f - f^*)/2$. (53) Since the adjoint process corresponds to a time-reversed dynamics, one sees that the force $f^S$ is even (symmetric) under time-reversal, while $f^A$ is odd (anti-symmetric). From (43) and (52), one then sees that the relation (54) holds. That is, the housekeeping heat is associated with the anti-symmetric force, while the remaining (excess) heat is associated with the symmetric force. We note that for consistency of (54) with (44) and (51) one must also have the condition (55). This may be verified using (53) together with (A2). It is also equivalent to $\mathrm{div}(f^A e^{-U}) = 0$, which means that if $\rho \propto e^{-U}$ is the invariant density, then the corresponding probability current $\rho f^A$ is divergence free, and therefore does not transport any density: see for example [28]. It follows that for systems of the form (41) and (42), one may replace the force $f$ by $f^\lambda = f^S + \lambda f^A$ and the invariant measure is independent of $\lambda$.
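The link between the splitting of the force and the splitting of the heat can be made explicit in a short worked derivation. This is a sketch under stated assumptions rather than a transcription of the missing displayed equations: it takes the heat to be the Stratonovich integral $Q = \int f \circ dx$, and it takes the adjoint force to be $f^* = -f - 2T\nabla U$, which is consistent with the statement above that $f = f^*$ when $f = -\nabla V$ and $U = V/T$.

```latex
% Assumptions (not verbatim from the source):
%   Q(0,\tau) = \int_0^\tau f(x_t) \circ dx_t   (Stratonovich heat)
%   f^* = -f - 2T\nabla U, hence f^S = (f+f^*)/2 = -T\nabla U, f^A = f - f^S
\begin{align*}
Q(0,\tau)
 &= \int_0^\tau f^A(x_t)\circ dx_t + \int_0^\tau f^S(x_t)\circ dx_t \\
 &= \int_0^\tau f^A(x_t)\circ dx_t - T\int_0^\tau \nabla U(x_t)\circ dx_t \\
 &= \underbrace{\int_0^\tau f^A(x_t)\circ dx_t}_{Q_{\rm hk}(0,\tau)}
    \; + \; T\,\bigl[\,U(x_0) - U(x_\tau)\,\bigr] .
\end{align*}
% Conservative check: f = -\nabla V, U = V/T gives f^A = 0 and
% Q = V(x_0) - V(x_\tau), i.e. no housekeeping heat.
```

The last step uses the ordinary chain rule, which holds for Stratonovich integrals. The boundary term is the excess heat, and it is positive when the path ends in a more likely state than the one in which it started.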

Large Deviation Principle for Many Copies of the System
So far we have considered dynamical fluctuations at the level of individual sample paths. To gain further insight, it is useful to consider a large deviation principle (LDP) that appears when we consider $M$ independent copies of our system, with a limit $M \to \infty$. The resulting LDP is of the same form as those considered in Macroscopic Fluctuation Theory (MFT). This allows us to identify an orthogonality relation between two contributions to the probability current $J$ that appears in (45): these two contributions originate from the splitting $f = f^S + f^A$.
To this end, define the empirical density $\rho^M$ such that $\int_V \rho^M dx$ is the number of copies of the system whose positions $x$ are inside any volume $V \subset \Gamma$. Similarly let $j^M$ be the empirical current, defined as in [27]. Then, as $M \to \infty$, one has the LDP (56). The rate function $I_{[0,T]}$ is finite only if $\partial_t \rho = -\nabla \cdot j$, in which case it takes the form (57), where $\mathcal{V}(\rho) = T \int_\Gamma \rho(x)[\log \rho(x) + U(x) + \log Z] dx$ is the quasipotential (a kind of non-equilibrium free energy) and the current $J(\rho) = \chi(\rho) F(\rho)$ is specified in (58). Physically, $\chi$ is a mobility and $F$ is a force that acts in the space of densities (distinct from the physical force $f$). The adjoint process obeys an LDP that is analogous to (56) and (57), with $J(\rho)$ replaced by $J^*(\rho) = \chi(\rho) F^*(\rho)$, where the adjoint force $F^*$ can be obtained from the formulae (59), which mirror (53). The resulting theory has several interesting features. First, within this general framework [27], the force $F^S$ is a free energy gradient, and is orthogonal to $F^A$ in the sense of (60). This also implies that $\langle J^A, F^S \rangle = 0 = \langle J^S, F^A \rangle$. Second, we have an LDP analogue of (51) and (54), which follows directly from (56) and is given in (61). Note that $Q_{\rm hk}$ in (51) is the heat transfer for a given sample path: here we are defining $Q_{\rm hk}$ as the average heat transfer for a family of paths, as specified by $\rho$ and $j$. The antisymmetric force $F^A$ is responsible for the housekeeping heat. In the steady state one has $\phi = \phi_U := (e^{-U}/Z)$ and the associated empirical current is $j_U = J^A(\phi_U)$; in this case $Q_{\rm hk} = \tau \langle J^A(\phi_U), F^A \rangle$, which depends only on the anti-symmetric force and current.
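The quasipotential quoted above, $T\int \rho\,[\log\rho + U + \log Z]\,dx$, is $T$ times the relative entropy of $\rho$ with respect to the invariant density $e^{-U}/Z$. It is therefore non-negative and vanishes only at the steady state. The sketch below evaluates it on a uniform one-dimensional grid; the function name `quasipotential`, the grid and the Riemann-sum quadrature are illustrative choices.

```python
import numpy as np

def quasipotential(rho, U, dx, T=1.0):
    """Grid approximation of T * int rho [log rho + U + log Z] dx (sketch).

    rho and U are sampled on the same uniform grid with spacing dx; Z is
    computed with the same Riemann sum so that rho = exp(-U)/Z gives zero.
    """
    rho = np.asarray(rho, dtype=float)
    U = np.asarray(U, dtype=float)
    Z = np.sum(np.exp(-U)) * dx
    integrand = np.zeros_like(rho)
    mask = rho > 0.0  # use the convention 0 * log 0 = 0
    integrand[mask] = rho[mask] * (np.log(rho[mask]) + U[mask] + np.log(Z))
    return T * np.sum(integrand) * dx
```

For example, with $U = x^2/2$ and a unit-variance Gaussian density shifted to mean 1, the integral can be done by hand and gives $T/2$.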

Physical Significance and Relation to Molecular Dynamics
The key results from this section are (i) that splitting the physical force $f = f^S + f^A$ establishes a connection between $f^A$ and the housekeeping heat (which determines the steady-state dissipation) [10]; (ii) that splitting the probability current $J = J^S + J^A$ shows that $J^S$ corresponds to a gradient flow for the quasipotential $\mathcal{V}$, within an appropriate metric [27,29,30]; and (iii) that the currents $J^S$ and $J^A$ are orthogonal, which allows the characterisation of the quasipotential as the solution of a Hamilton-Jacobi equation [27] and also has consequences for the rate of convergence of such systems to their steady states [31]. We recently showed in [32] that these structures are also present in (irreversible) Markov chains, although the notion of orthogonality needs to be generalised, and the rate function analogous to (57) is not a quadratic function of the current in that case.
From a physical point of view, the decomposition of the force as $f^S + f^A$ means that for any (irreversible, non-equilibrium) diffusion process, one can define a reversible process in which the force $f^S$ acts alone, and this process has the same invariant measure as the original one (where $f = f^S + f^A$). If one considers many copies of this system as in Section IV A 4 then the reversible process evolves by steepest descent of the free energy, while the non-conservative component of the dynamics ($f^A$) gives rise to a probability current $J^A$ that flows in a direction orthogonal to the free-energy gradient. The reversible sector of the theory includes all information about the invariant measure [via (60)], while the irreversible sector describes the entropy production, as shown by (54) and (61). The orthogonality of the forces in (60) ensures that the decomposition of the force is unique, although obtaining explicit formulae for $f^S$ and $f^A$ requires that the invariant measure of the system is known, which is not typically the case for irreversible processes. As an analogy for the splitting, one may think in terms of a Helmholtz decomposition of the force into a gradient ($f^S$) and a circulation ($f^A$), or perhaps as a functional Hodge decomposition of the probability current into three pieces, as in [33] (see also [31]). Regardless of the specific mathematical structure, the key point is that we obtain a decomposition of the forces and currents into two parts, with distinct geometrical properties, and different physical interpretations.
If we return briefly to the example of Section III B and Figure 1 and consider the overdamped limit (with $f_{\rm ext} > 0$), one expects the following properties. The qualitative features of the potential $U$ will be given by the negative of the logarithm of the "non-equilibrium" distribution shown in Figure 1b: it will have a single minimum at some $q > 0$. The force $f^S$ is simply the gradient of this potential, and the reversible process in which $f^S$ acts alone is simply a diffusion in this potential. The non-reversible force is not the gradient of a potential: it is positive on average, so that it drives the system around the circle. However, it is not a constant force like $f_{\rm ext}$: it has a non-trivial dependence on the co-ordinate $q$, so that the physical force $f = f_{\rm ext} - \nabla V$ is recovered as $f^S + f^A$.
We emphasise, however, that the results presented so far in this section are restricted to overdamped dynamics, and follow directly from macroscopic fluctuation theory [27]. Our aim now is to extend them to molecular dynamics, as given by (3) and (9). We will show that there are two possible extensions of the overdamped case, which correspond to two different splittings of the current $J$. One of the choices yields a geometrical structure analogous to (60), which is related to the GENERIC (General Equation for Non-Equilibrium Reversible-Irreversible Coupling) formalism [34,35]: see [36,37]. However, there is no connection between this splitting and the housekeeping heat. The second splitting makes the connection to the housekeeping heat, similar to (61), but there is no gradient structure analogous to (60). We briefly discuss the advantages and disadvantages of the two approaches: their physical consequences are addressed in Section IV D.
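The qualitative picture for a driven particle on a ring can be checked numerically. The sketch below makes several assumptions that are not taken from the text: the ring has length $2\pi$, the overdamped dynamics has a single noise intensity, the invariant density is obtained from the standard closed-form quadrature for a one-dimensional periodic diffusion, and the potential passed in is a hypothetical stand-in for the one of Section III B. The splitting used is $f^S = -T\,\partial_q U$ and $f^A = f - f^S$ with $U = -\log\pi$.

```python
import numpy as np

def ring_splitting(V, dV, f_ext, T=1.0, n=400):
    """Split f = f_ext - V'(q) on a ring of length 2*pi into f_S + f_A (sketch).

    V and dV are vectorised callables for a 2*pi-periodic potential and its
    derivative (hypothetical inputs). Returns grid, invariant density, f_S, f_A.
    """
    L = 2.0 * np.pi
    h = L / n
    x = h * np.arange(n)
    s = h * np.arange(n + 1)
    # Unnormalised density: pi(x) = int_0^L exp{[V(x+s) - V(x) - f_ext*s]/T} ds
    g = np.exp((V(x[:, None] + s[None, :]) - V(x)[:, None]
                - f_ext * s[None, :]) / T)
    w = np.full(n + 1, h)
    w[0] = w[-1] = 0.5 * h          # trapezoid weights
    pi = g @ w
    pi /= pi.sum() * h              # normalise on the grid
    U = -np.log(pi)
    dU = (np.roll(U, -1) - np.roll(U, 1)) / (2.0 * h)  # periodic central difference
    f = f_ext - dV(x)
    f_S = -T * dU                   # reversible (gradient) part
    f_A = f - f_S                   # irreversible part
    return x, pi, f_S, f_A
```

In one dimension the condition $\mathrm{div}(f^A e^{-U}) = 0$ reduces to the product $f^A \pi$ being constant around the ring, which the sketch reproduces up to discretisation error. For $f_{\rm ext} > 0$ this constant is positive, so $f^A$ is positive everywhere but not uniform; for $f_{\rm ext} = 0$ it vanishes.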

B. Extension to Systems with Finite Damping: (pre)-GENERIC Splitting
We consider the model of (9), which we write as $dq_t = p_t\,dt$, $dp_t = \gamma F_p\,dt + \sqrt{2\gamma T}\,dW_t$, (62) with a (rescaled) force $F_p = (f/\gamma) - p$. The analysis of this section follows closely that of Section IV A, with the state point $x$ replaced by the phase space point $(q, p)$. Note, however, that there is no noise in the equation of motion for the co-ordinates $q_i$, so some of the friction constants $\gamma_i$ in (41) must be set to zero. The implications of this will be discussed below. The Fokker-Planck equation for this system involves a phase space density $\phi$ defined on the space $\Omega$: we write $\partial_t \phi = -\nabla \cdot J(\phi)$, (63) with $J = (J_q, J_p)$ and $\nabla = (\nabla_q, \nabla_p)$ having components for both co-ordinates and momenta. Specifically, $J_q(\phi) = p\phi$ and $J_p(\phi) = \gamma \phi F_p - \gamma T \nabla_p \phi$. (64) The invariant measure is $\pi$ and we write $d\pi(q, p) = Z^{-1} e^{-U(q,p)}\,d(q, p)$, (65) by analogy with (48). In this case, $U$ may be obtained by solving (A8).

Adjoint Process
We define the adjoint process exactly as in (47). Note that this definition does not involve the reversal of any momenta. The construction of the adjoint is given in Appendix A 2. Its equation of motion involves the adjoint force $F^*_p$: $dq_t = -p_t\,dt$, $dp_t = \gamma F^*_p(q_t, p_t)\,dt + \sqrt{2\gamma T}\,dW_t$. (66)
Note that the rate of change of $q$ is now in the opposite direction to $p$, because the operator $\mathbb{T}_0$ reverses time without flipping the momenta. The analogue of (53) is $F^S_p = (F_p + F^*_p)/2$, $F^A_p = (F_p - F^*_p)/2$. (67) For the case of conservative forces as in (3), one has $f = -\nabla V$ for some potential $V$, so that $U = (p^2/2 + V)/T$. In this case, the antisymmetric force $F^A_p$ contains the Hamiltonian evolution, while the symmetric force contains the coupling to the thermostat. That is the essence of the GENERIC formalism [34,35]: see also [36,37]. This connection is clearer at the level of probability currents, as we discuss in the next section. However, in contrast to the overdamped case, we note that splitting the force as $F_p = F^S_p + F^A_p$ does not provide a general connection to dissipation: for these systems, the heat flow is given by (16), which breaks the analogy with the overdamped case, where the formula (50) applies. It is not possible to apply (50) in systems with finite damping, because the adjoint process has $\partial_t q = -p$, so $dP(X) > 0$ implies $dP^*(X) = 0$ (unless by some chance $p_t = 0$ for all $t$).

Large Deviation Principle
We now analyse large deviations in these systems, following a method that is parallel to Section IV A 4. We consider $M$ copies of our system, and let $(\phi^M, j^M)$ be the empirical density and current, defined on the phase space $\Omega$. The analogue of (56) is the LDP (68). We write $j = (j_q, j_p)$, and the rate function is finite only if $j_q = J_q = p\phi$ [recall (64)] and $\partial_t \rho = -\nabla \cdot j$, in which case it is given by (69), with $\mathcal{V}(\phi) = T \int_\Omega \phi(q, p)[\log \phi(q, p) + U(q, p) + \log Z]\,d(q, p)$, also $\chi_p(\phi) = \gamma\phi$, and $J_p$ was defined in (64). For a complete analogy with Section IV A 4, one should take a mobility matrix $\chi = \mathrm{diag}(\chi_q, \chi_p)$ with $\chi_q = 0$: however, the fact that $\chi$ is singular means that not all results from the overdamped case can be applied in this setting: see below.
We seek an analogue of (60). That is, our aim is to split the current $J$ into two orthogonal components, one of which is a free-energy gradient. To this end, we consider an LDP for the adjoint process, which is analogous to (68) and (69), but with $J_p$ replaced by $J^*_p(\phi) = \chi_p F^*_p - \gamma T \nabla_p \phi$ and with the modified constraint that $j_q = J^*_q = -p\phi$ (instead of $+p\phi$). Hence defining $J^S = (J + J^*)/2$ and $J^A = (J - J^*)/2$, one has the expressions (70). Since $\chi$ is a singular matrix, it is not possible to define forces $F$ such that $J(\phi) = \chi(\phi) F(\phi)$, in contrast to (58) for the overdamped case. However, since $J^S_q = 0$, it is possible to write $J^S$ in the form (71). This allows us to identify the symmetric part of the dynamics as a gradient flow. Moreover, by direct analogy with (60), one may verify the relation (72), which says that the antisymmetric current is orthogonal to the gradient of the quasipotential. Using (71) and integrating once by parts, it follows that the quasipotential is constant under the antisymmetric part of the time evolution, see also [31]. As noted above, in the conservative case, where $f = -\nabla_q V$, one has $U = H/T$ with $H = (p^2/2) + V$, and so $T \nabla_p U = p$. In that case the anti-symmetric current contains the terms coming from the Hamiltonian evolution: $J^A = (\nabla_p H, -\nabla_q H)\phi$ and $J^S$ contains the terms proportional to $\gamma$, which come from the coupling to the heat bath. This is the setting that has been named pre-GENERIC [37]. However, we emphasise that (71) and (72) apply also in the non-conservative setting. We also note that the absence of noise in the equation of motion for $q$ makes $\chi$ singular: it is possible to regularise this system by adding an independent noise that acts on $q$, which does not change any of the conclusions of this section.
The geometrical structure that is apparent from (71) and (72) makes the construction of this section attractive. One can view a general time-evolution as a superposition of a gradient flow towards the non-equilibrium steady state, together with an orthogonal drift that breaks time-reversal. However, the overdamped case also includes formulae such as (51) and (61), which relate the antisymmetric forces and currents to dissipation. As noted above, these results have no analogues in this setting: the connection between the splitting of $J$ and the dissipation has been lost in the passage from overdamped systems to those considered here. For this reason, we now consider an alternative splitting of the current $J(\phi)$ that appears in (68). This alternative splitting loses the gradient structure encoded in (71), but recovers the connection to the heat flow.
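The statement that the quasipotential is constant under the antisymmetric part of the evolution can be checked by hand in the conservative case. The following worked derivation is a sketch that uses only quantities quoted in the text ($J^A = (\nabla_p H, -\nabla_q H)\phi$, $U = H/T$, and the quasipotential $T\int\phi[\log\phi + U + \log Z]$); it assumes that boundary terms vanish on integration by parts.

```latex
% Conservative case: T\nabla_q U = \nabla_q H, T\nabla_p U = \nabla_p H.
% Evolve \phi with the antisymmetric current only: \partial_t\phi = -\nabla\cdot J^A.
\begin{align*}
\frac{d}{dt}\mathcal{V}(\phi)
 &= \int_\Omega \frac{\delta\mathcal{V}}{\delta\phi}\,\partial_t\phi \; d(q,p)
  = \int_\Omega \nabla\!\Bigl(\frac{\delta\mathcal{V}}{\delta\phi}\Bigr)\cdot J^A \; d(q,p),
 \qquad
 \frac{\delta\mathcal{V}}{\delta\phi} = T\,[\log\phi + U + \log Z + 1], \\
 &= T\int_\Omega \bigl(\nabla_q\phi\cdot\nabla_p H - \nabla_p\phi\cdot\nabla_q H\bigr)\, d(q,p)
  + \int_\Omega \phi\,\bigl(\nabla_q H\cdot\nabla_p H - \nabla_p H\cdot\nabla_q H\bigr)\, d(q,p) \\
 &= -T\int_\Omega \phi\,\bigl(\nabla_q\cdot\nabla_p H - \nabla_p\cdot\nabla_q H\bigr)\, d(q,p) + 0
  \;=\; 0 .
\end{align*}
```

The first integral vanishes because Hamiltonian flow is divergence-free (Liouville's theorem), and the second because $H$ is conserved along that flow. Together they express the orthogonality of $J^A$ to the gradient of the quasipotential in this special case.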
C. Splitting the Currents and Forces into Equilibrium and non-Equilibrium Components

Dual Process
We introduce a dual process, which differs from the adjoint process defined above. The nomenclature "dual process" is discussed in Appendix A 3. (The idea of comparing path measures for different processes in order to make connections to heat flow is discussed in [10].) The path measure for the dual process is $\bar{P}$. It obeys $d\bar{P}(X) = dP(\mathbb{T}X)$, (73) which differs from (47) since the operator $\mathbb{T}$ reverses all momenta (recall Section II E). Applying (73) for paths with $\tau \to 0$, one sees that the invariant measure $\bar{\pi}$ of the dual process satisfies $d\bar{\pi}(q, p) = d\pi(q, -p)$, so the analogue of (65) is (74). The dual process may be constructed explicitly, including the equations of motion for its coordinates and momenta: see Appendix A 3. As above, we write the deterministic term in the equation of motion for $p$ (in the original process) as $\gamma F_p\,dt$ with $F_p = (f/\gamma) - p$. The corresponding quantity in the dual process is denoted $\bar{F}_p$. In the conservative case, $U = H/T$, so $T \nabla_p U = p$ and the dual process coincides with the original process. Since the conservative case corresponds to a model with an equilibrium steady state, we define $F^E_p = (F_p + \bar{F}_p)/2$ and $F^N_p = (F_p - \bar{F}_p)/2$, where the superscripts E and N indicate equilibrium and non-equilibrium contributions. These are the analogues of the symmetric and antisymmetric forces discussed above.

Formulae for Heat Currents Based on Sample Paths
The housekeeping heat for these systems is defined by analogy with (51), as in (78). Note that the path probabilities in (78) are conditioned on their initial states as in (51). This is essential, so as to ensure that $Q_{\rm hk}(0, \tau) \to 0$ as $\tau \to 0$: as the trajectory length goes to zero, so does the heat flow. Recalling (16), the analogue of (52) is the relation (79), which holds for any path $X$ and whose second equality uses (65), (74) and $\bar{U}(q, -p) = U(q, p)$. Using (10) to substitute for $Q$, one arrives at (80). The analogy with (54) motivates us to define a "force" $F^N$ acting in the phase space, as in (81). This is a non-equilibrium force, in the sense that it vanishes in conservative systems. (Recall that the conservative case has $U = (p^2/2 + V)/T$ and $f = -\nabla_q V$.) Hence, in terms of dissipation, $F^N$ is analogous to the force $f^A$ in the overdamped case, and one has the relation (82), which shows that the "non-equilibrium" force $F^N$ does indeed determine the steady-state dissipation.

Large Deviation Principles
These results also have implications for large deviations. For the original process one still has (68). One splits $J_p = J^E_p + J^N_p$ such that the corresponding LDP for the dual process is similar, but now with $\bar{J}_p = J^E_p - J^N_p$. In this case the two contributions are given by (83). One has $\chi_p = \phi\gamma$ and $\chi_q = 0$, as in Section IV B. Since $\chi$ is singular, it is not possible to write $J(\phi) = \chi(\phi) F(\phi)$, but one does have the representation (84). Moreover, there is an orthogonality relation analogous to (72). This is derived in Appendix A 3, and the result is given in (85). It is clear that $J^E$ does not correspond to a gradient flow, so there is no analogue of (71). However, there is an analogous statement to (61), which is that the housekeeping heat flow associated with the path $(\rho, j)$ is given by (86). The steady state probability density is $\phi = \phi_U$ and the associated empirical current is $j_U = J^E(\phi_U) + J^N(\phi_U)$, with both "equilibrium" and "non-equilibrium" currents contributing, contrary to the overdamped case. However, from (85) one has $Q_{\rm hk} = \tau \langle J^N(\phi_U), F^N \rangle$, which depends only on the non-equilibrium force and current. Thus the non-equilibrium part of the theory is intrinsically linked to the housekeeping heat, as one might expect from the definitions (73) and (78).

D. Discussion
In Section IV A, we reviewed some results that show how dynamical fluctuations in overdamped systems are accompanied by underlying geometrical structures related to gradient flows, orthogonalities and dissipation. In Sections IV B and IV C, we showed how these structures generalise to systems described by Hamiltonian dynamics, including non-equilibrium driving forces, and coupling to a heat bath. The resulting structures are more complex, since there are two alternative time-reversal operations, depending on whether one chooses to reverse the momenta or not.
To summarise the key results: one may split the probability current either as $J = J^S + J^A$ (corresponding to simple time-reversal) or as $J = J^E + J^N$ (corresponding to time- and momentum-reversal). In both cases, the resulting currents (and their conjugate forces) are orthogonal, in the sense of (72) and (85). It is likely that these orthogonalities can be used to derive bounds on the rates with which non-equilibrium systems converge to their steady states, by generalising the analysis of [28,31,32].
The splitting $J = J^S + J^A$ recovers the recently proposed (pre-)GENERIC splitting of [37], at least for conservative systems. In this case, the two currents $J^S$ and $J^A$ can be identified straightforwardly, since $J^A$ corresponds to the Hamiltonian evolution and $J^S$ to the action of the thermostat. This decomposition is very natural in that context, and is exploited (for example) in integration schemes for molecular dynamics [38] (in the "BAOAB" notation of that work, $J^A$ encapsulates the parts of the evolution denoted by A and B, and $J^S$ is responsible for the part denoted by O). The fact that this same decomposition of the current can be applied in non-conservative systems is not so well known. This case resembles the overdamped situation of Section IV A, where the two parts of the current have the same geometrical properties, as steepest descent of the free energy ($J^S$), and an orthogonal drift ($J^A$). The properties of $J^S$ connect this splitting to recent studies that represent convergence to a steady state as a gradient flow for the free energy [29,39]. However, as in the overdamped case, explicit formulae for $J^S$ and $J^A$ are not available, and computing these quantities is only possible if the invariant measure (or quasipotential) of the system is known. In addition, the current $J^A$ is not connected to the entropy production in this case. In this sense, the splitting does not separate the different aspects of the system as cleanly as was the case for overdamped systems.
On the other hand, if one considers the splitting $J = J^E + J^N$ then there is no sense of steepest descent of the free energy ($J^E$ is not a gradient), but this splitting does provide a connection to the steady-state dissipation (via (86)). In the absence of a gradient structure, the splitting does not provide as simple a physical picture as in the overdamped case, but it is interesting to note that one may represent the time evolution of such a system as a combination of a non-dissipative process (described by $J^E$) and a dissipating one (described by $J^N$).
To illustrate these points, we return to the example of Section III B and Figure 1: if $f_{\rm ext} = 0$ then the potential $U(q, p) = [p^2/2 + V(q)]/T$ is symmetric in both its arguments, with a single minimum at $(q, p) = 0$. For $f_{\rm ext} > 0$, $U$ cannot be separated as a sum of terms depending on $q$ and $p$ alone, so the co-ordinates and momenta are not independent. Moreover, $U$ does not in general have any symmetry. One does expect a single minimum for some $q, p > 0$. For the splitting $J = J^S + J^A$, we note from (70) that the phase space current $J^S$ acts only on the momentum co-ordinates: it represents the action of a thermostat that applies damping and noise, and drives the momentum distribution $P(p) = \phi(q, p)\,dq$ towards its ($q$-dependent) "local equilibrium" form $P_{\rm eq}(p|q) = (1/Z)\,e^{-U(q,p)}\,dq$. The antisymmetric part of the dynamics, described by $J^A$, includes an (irreversible) advection of the co-ordinates in accordance with the current value of the local momentum, as well as the effect of the non-equilibrium forces $f$ on the momenta. In summary, one can think of $J^S$ as the result of a "non-equilibrium thermostat" in which the damping force $F^S_p = -T \nabla_p U$ depends on the co-ordinates $q$ as well as the momenta $p$, and which drives the system towards a state with finite average momentum. The current $J^A$ accounts for the Hamiltonian parts of the time evolution, and the non-conservative forces.
For the splitting $J = J^E + J^N$, the potential that appears is $\bar{U}(q, p) = U(q, -p)$: one expects that this function has a minimum for some $q > 0$ and $p < 0$. The non-equilibrium current $J^N$ acts only on the momentum and can be interpreted in terms of a coordinate-dependent damping force, $F^N_p(q, p) = -T \nabla_p [p^2/(2T) - U(q, p)]$, which we expect (on average) to drive the system towards positive momenta. The current $J^E$ includes the advection of the co-ordinates by the momenta, as well as the action of the force $f$. For this current acting alone, one arrives at a process whose steady state is time-reversal-symmetric (in the sense that the right hand side of (78) vanishes), but whose invariant measure is not provided by the above analysis (and is neither $e^{-U}$ nor $e^{-\bar{U}}$). The nature of the process described by $J^E$ acting alone seems to deserve further investigation (both in this specific case and more generally).
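The correspondence with the BAOAB scheme mentioned above can be illustrated with a minimal sketch of one integration step. It assumes unit masses and the underdamped Langevin dynamics $dq = p\,dt$, $dp = (f - \gamma p)\,dt + \sqrt{2\gamma T}\,dW$; the function name `baoab_step` and its signature are hypothetical. The B (kick) and A (drift) sub-steps carry the Hamiltonian part of the evolution, associated here with $J^A$, while the O sub-step is the exact Ornstein-Uhlenbeck update of the momenta by the thermostat, associated with $J^S$.

```python
import numpy as np

def baoab_step(q, p, force, dt, gamma, T, rng):
    """One BAOAB step for unit-mass underdamped Langevin dynamics (sketch)."""
    p = p + 0.5 * dt * force(q)            # B: half kick from the force
    q = q + 0.5 * dt * p                   # A: half drift of the co-ordinates
    c = np.exp(-gamma * dt)                # O: exact thermostat (OU) update
    p = c * p + np.sqrt(T * (1.0 - c * c)) * rng.standard_normal(np.shape(p))
    q = q + 0.5 * dt * p                   # A: half drift
    p = p + 0.5 * dt * force(q)            # B: half kick
    return q, p
```

Setting `gamma = 0` switches off the O sub-step, and the update reduces to the deterministic velocity-Verlet scheme, i.e. to the Hamiltonian part alone. For a harmonic force and `gamma > 0`, an ensemble of independent copies relaxes to variances of $q$ and $p$ close to $T$.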
As a final point, we note that the rate functions for path probabilities (57) and (69) were derived by considering many copies of our system, but the same formulae also govern large deviations at level 2.5 [17,22,32,40,41]. These LDPs involve rare events where an unusual density or current is sustained over a long time period (in a single system). Such LDPs are closely related to level-1 LDPs such as (29). Moreover, orthogonality formulae such as (60) allow rate functions at level 2.5 to be decomposed into contributions that come from currents that are symmetric and antisymmetric under time-reversal, with implications for the rate of convergence of non-equilibrium systems to their steady states [28,31]. Such decompositions have also been connected to recent results related to bounds on dissipation in non-equilibrium steady states [32,42,43].
For extensions to this work, it is possible to combine the two splittings presented here, in order to split the current $J$ into four pieces, which are separated according to their behaviour under the two operations $\mathbb{T}$ and $\mathbb{T}_0$. It is also of interest to relax the restriction that there is no noise in the equation of motion for $q$. We hope to return to the resulting geometrical structures in a later work.

V. CONCLUSIONS
We end with a few remarks as to the relevance of these results for modelling systems by molecular dynamics. Throughout this article, we have focussed on general results such as symmetries and geometrical structures. For example, we showed in Section III that ensembles conditioned on atypical currents retain a PT symmetry that is not present in typical non-equilibrium states [11]. Hence the conditioned ensembles seem to be in a different class from non-equilibrium steady states.
In the analysis of Section IV, we showed how currents and forces in molecular systems can be split in different ways, based on the theories of stochastic thermodynamics [10] and MFT [27]. We believe that these results are relevant for two reasons. First, the existence of gradient structures such as (71) has potential mathematical applications, in rigorous derivations of effective theories that apply on large length and time scales [44]. The idea is that if a system evolves by steepest descent of some free energy, then any coarse-grained description of that system should also be represented as a steepest descent (of the coarse-grained free energy). Second, the use of orthogonality relationships to decompose currents (and their corresponding rate functions [32]) has the potential to establish new constraints on fluctuations in non-equilibrium systems. We also note in passing that by identifying currents and their conjugate forces, one may also decompose rate functions using a canonical structure [17,45], which makes explicit the connections between antisymmetry under time-reversal and fluctuation theorems [46]. We look forward to more work in these directions, and their application in practical contexts.