Langevin dynamics with variable coefficients and nonconservative forces: from stationary states to numerical methods

: Langevin dynamics is a versatile stochastic model used in biology, chemistry, engineering, physics and computer science. Traditionally, in thermal equilibrium, one assumes (i) the forces 2 are given as the gradient of a potential and (ii) a ﬂuctuation-dissipation relation holds between 3 stochastic and dissipative forces; these assumptions ensure that the system samples a prescribed 4 invariant Gibbs-Boltzmann distribution for a speciﬁed target temperature. In this article, we 5 relax these assumptions, incorporating variable friction and temperature parameters and allowing 6 nonconservative force ﬁelds, for which the form of the stationary state is typically not known a priori. 7 We examine theoretical issues such as stability of the steady state and ergodic properties, as well as 8 practical aspects such as the design of numerical methods for stochastic particle models. Applications 9 to nonequilibrium systems with thermal gradients and active particles are discussed. results for a system with T ⊥ = 1 / 2 and T || = 5. We see that the cumulative 291 average of the respective estimates converges, after a short equilibration period, to the target values.


13
Langevin dynamics is a system of stochastic differential equations of the form Here q, p represent vectors of the positions and momenta of the particles comprising together 2n ∈ N degrees of freedom. The mass matrix M is symmetric positive definite. The force F is usually taken to be conservative, i.e. F (q) = −∇ q U (q) for some potential energy function U . Langevin dynamics describes a physical system of particles moving under prescribed interaction forces and subject to collisions with particles of a "heat bath." The friction matrix Γ, which is here allowed to vary with position, typically models drag on a set of distinguished particles due to the interactions with the surrounding environment, whereas the matrix Σ, also potentially varying with position, characterizes the stochastic effects of collisions. W represents a vector of independent and uncorrelated Wiener processes in R m , W = [W 1 , . . . W m ] T , where the components of its formal time derivativeẆ are independent white noise processes such that and In case Γ(q) ≡ γI where γ is a nonnegative scalar and constant, one may take Σ(q) ≡ σI, where 15 σ = √ 2γk B T , T is the temperature and k B is Boltzmann's constant. The resulting system can then 16 be shown, under conditions reviewed in Section 2, to have a unique invariant distribution with density 17 ρ β (q, p) ∝ exp(−βH(q, p)), where H(q, p) = p T M −1 p/2 + U (q) represents the total energy of the 18 isolated deterministic system and β −1 = k B T .

19
The emphasis in this article is on the general form (1)-(2), for which we will relax several typical 20 assumptions. First, the system is not assumed to admit a universal fluctuation-dissipation relation, 21 instead we assume only certain nondegeneracy conditions. Second, we do not assume a conservative 22 force field. Such generalized forms of Langevin dynamics can be used to model diffusion (or thermal) 23 gradients by particle simulation [1][2][3], as well as a variety of models for flocking [4][5][6], protein convergence of observable averages.

31
Our approach to proving ergodicity is dependent on the fact that the friction and noise terms 32 in the equations depend only on positions. On the other hand, within this class, and subject to 33 the nondegeneracy condition on the friction and noise tensors, our treatment is general and has a 34 considerable range of applications. Formally (1)-(2) includes dissipative particle dynamics (DPD), 35 which is a momentum-conserving, 2nd order, gradient-type system [9,10], but we specifically exclude in 36 this article cases for which Γ is not positive definite, which occur as a direct consequence of momentum 37 conservation in DPD; ergodic properties for DPD systems in one dimension were discussed in previous 38 work of Shardlow and Yan [11]. See also [12][13][14] for some special cases of Langevin-type systems with 39 velocity dependent coefficients to which our formulation is not applicable.

40
To perform numerical discretization of the system (1)-(2), we proceed by splitting. This involves (thus one obtains second order accuracy for the same computational work as a first order scheme).

52
In the case of the more general systems considered in this article, the calculation of the 53 Ornstein-Uhlenbeck solution, which is required at each step of our splitting-based numerical methods, 54 becomes potentially demanding from a computational standpoint. In the methods of [15,16] this is 55 done exactly, however in the current setting the solution of a Lyapunov equation would need to be 56 obtained at each step in time, a calculation that would dominate the computational load in a large 57 scale simulation. Therefore, we offer an efficient numerical procedure based on multiple timestepping

58
[17], relying on a further splitting of the OU equation at each interior timestep.

59
In the last section of this article, we illustrate the theory in three numerical examples. The first is 60 a particle system with an imposed diffusion gradient, as in [7,[18][19][20][21]. Here the issue is to capture the 61 correct density fluctuations as a function of position. In the second model, we incorporate a stirring 62 force which alters the equilibrium state of a physical model. In the last example, we discuss the 63 use of a blended Langevin model which draws on ideas from the literature of flocking, in particular 64 the Cucker-Smale system. We use our theory to infer attractive steady states for the system, and Let (x(t)) t≥0 denote the solution of an Itô diffusion process of the forṁ taking values in a suitable domain. In this general presentation, x(t) ∈ Ω x may represent either a position vector or else the combined vector of positions and momenta, i.e. x(t) = (q(t), p(t)) ∈ Ω q × R n . In the latter case, we would typically either assume a compact configurational domain, as for example when periodic boundary conditions are used in the position space, i.e., Ω q = LT n , where L > 0 and T = R/Z denotes the 1-torus, or else an unbounded domain, e.g. Ω q = R n . (W (t)) t≥0 represents a standard Wiener process in R m and the coefficients a, B are assumed to be smooth, i.e., a ∈ C ∞ (Ω x , R 2n ), and B ∈ C ∞ (Ω x , R 2n×m ). The initial state of x is specified by the probability measure µ 0 , which throughout this article we assume to be such that x(0) has finite mean and variance, i.e., Ωx x 2 2 µ 0 (dx) < ∞.

The associated semigroup of evolution operators and their adjoints 75
The time evolution of the expectations of observables under the dynamics of the SDE is described by the semigroup of evolution operators (P t ) t≥0 for ϕ ∈ S, x ∈ Ω x , where the expectation is taken with respect to the Wiener measure associated with the driving noise process (W (t)) t≥0 and S ⊆ M(Ω x , R) denotes a set of test functions or observables, which is contained in M(Ω x , R), the set of all real valued measurable functions defined on the domain Ω x . If the test function set S is chosen appropiately, (e.g. (Ω x , R) denotes the set of all smooth bounded real valued functions defined on Ω x ) the action of P t corresponds to the solution of an initial value problem, i.e., The operator L, defined such that for all ϕ ∈ S, is referred to as the infinitesimal generator. As the action of the operators (P t ) t≥0 is given as the solution of the differential equation (6), it is common to use the notation P t := e tL . In the case of an Itô diffusion process (5), and with the regularity assumptions on the coefficients a, B stated above, it can be shown that the infinitesimal generator takes the form where " : " denotes the Frobenius product, i.e., See, e.g., [24] for the case S = C ∞ b (Ω x , R), and [25] for extensions from that core to more general test 76 function sets such as the ones considered below.

77
It is often convenient to adopt a dual perspective by considering the evolution of the density of the law µ t of x(t) in time. Let µ t (dx) = ρ(x, t)dx, for t ≥ 0, in the sense of distributions. The corresponding semigroup (P † t ) t≥0 of transfer operators is then naturally defined as the formal adjoint operators of P t , i.e., Similarly as for its adjoint, the action of P † t corresponds to the solution of an initial value problem, which in this case is known as the Fokker-Planck equation. More specifically, where ρ( · , t) denotes the probability density of µ t and L † -the Fokker-Planck operator-can be shown to correspond to the L 2 -adjoint of the infinitesimal generator L, i.e.,

79
Note that (7) is in general to be interpreted in a weak sense as, up to this point, we have not made any assumptions on the regularity of ρ. However, within the scope of this article it is sufficient to consider the case where the differential operator ∂ t − L † is hypoelliptic. A differential operator A is said to be hypoelliptic, if for any g solving the differential equation Ag = f , it follows that g is of higher regularity than f in the sense that with > 0, where H loc s denotes the local Sobolev space of order s ∈ N. This means that if ∂ t − L † 80 is hypoelliptic, then the solution of (7) is smooth in the sense that ρ ∈ C ∞ (Ω x , (0, ∞)) irrespective 81 of the regularity of ρ 0 . A common way to establish hypoellipticity of a differential operator is via where a i , 0 ≤ i ≤ M are C ∞ vector fields in R n and † indicates the formal L 2 adjoint. Iteratively define a collection of vector fields by denotes the commutator of vector fields X, Y ∈ C ∞ (Ω x , R n ) and (∇X), (∇Y ) their Jacobian matrices. If then the operator A is hypoelliptic.

84
The condition (9) is commonly referred to as Hörmander's condition. In the particular case of A = ∂ t − L † one can easily verify that (9) is exactly satisfied if with B 0 = a and B i refers to the i-th column of the diffusion tensor B in (5). This particular version 85 of Hörmander's condition adapted to the parabolic PDE of the form (7) is referred to as the parabolic 86 Hörmander condition. 87 The direct consequence of the parabolic Hörmander condition is the smoothness of the underlying 88 transition kernel describing the evolution of the probability measure associated to the SDE.  Let now µ(dx) = ρ(x)dx be a probability measure with density ρ ∈ C 2 (Ω x , [0, ∞)). When we say that µ is an invariant measure of the SDE (5) or that the latter preserves the probability measure µ, we mean that the probability density ρ is a stationary solution of the Fokker-Planck equation associated to the SDE, i.e., L † ρ = 0.

Ergodicity and convergence in law
Note that for (10) to be well posed in a strong sense it is sufficient that L is hypoelliptic, i.e., the 91 Fokker-Planck operator L † satisfies the Hörmander condition. It is also straightforward to see that 92 every convex combination of two invariant measures is again an invariant measure of the respective 93 SDE which by definition means that the set of invariant measures of a particular SDE is convex.

94
Let for t > 0 and some suitable observable ϕ, for the µ-weighted spatial average of ϕ. The process x is said to be ergodic with respect to the invariant probability measure µ if for all ϕ ∈ L 1 (µ) and for almost all realizations of the Wiener process W , and µ 0 -almost all initial values x(0), trajectory averages coincide with expectations with respect to the measure µ in the asymptotic limit t → ∞, i.e., It can be shown (see [27]) that ergodicity follows if (i) there exists an invariant measure with positive 95 smooth density and (ii) ∂ t − L † is hypoelliptic.

97
Assume that x(t) converges in law towards a unique invariant measure µ, i.e., for all ϕ ∈ C ∞ b (Ω x , R), µ 0 -almost all x ∈ Ω x . A common way to characterize the convergence of the expectation E[ϕ(x(t)) | x(0) = x], or, in semi group notation the convergence of e tL ϕ(x) to the µ-weighted average µ [ϕ], is via functional decay estimates of the semi-group operators e tL . For this purpose a set of test functions S is fixed and equipped with a norm · S such that E := (S, · S ) forms a Banach-space. Of particular interest in this context is exponential convergence of e tL ϕ towards µ [ϕ] in the respective norm, i.e., where C, κ are positive constants, the latter corresponding to the spectral gap of the generator L in the functional space E 0 = (S 0 , · S ), where S 0 ⊆ S denotes the subset of test functions with vanishing mean, i.e., Let the operator Π denote the orthogonal projection from S onto S 0 , i.e., Denote further by A B(E) the operator norm of an operator A : E → E. (13) implies that e tL Π when considered as an operator on E is bounded in the operator norm B(E) as 2.4. Finite-time averages and the central limit theorem 98 Ergodicity of an SDE ensures that time averages of infinitely long trajectories almost surely coincide with spatial averages with respect to the target measure. However, ergodicity per se does not address the statistical properties of finite-time averages apart from the convergence of the time average ϕ t as t → ∞. For practical applications, where for a given unique invariant measure, the aim is to approximate the µ-weighted average µ[ϕ] of a test function ϕ ∈ S, it is important that fluctuations of finite-time averages ϕ t (i.e. the Monte Carlo error in the finite-time approximations) around the infinite-time value lim t→∞ ϕ t = µ[ϕ] can be quantified. For this purpose it typically regarded as necessary that a central limit theorem holds, i.e., where σ 2 ϕ is commonly referred to as the asymptotic variance of the observable ϕ. A sufficient condition for a central limit theorem of the form (15) to hold for the solution x of (5) and the observable ϕ is that the Poisson equation has a solution which belongs to L 2 (µ) (see [28]). The asymptotic variance then takes the form Let L 2 0 (µ) denote the subspace of L 2 (µ) consisting of functions with vanishing mean. Note that for Φ ∈ L 2 (µ) it is a priori not clear how to interpret (16) since only under additional regularity assumptions (16) can be interpreted in a weak sense. A common way to makes sense of (16) is by deriving bounds for the operator (16) for ϕ ∈ E 0 . The relationship between the spectral properties of the operator L −1 Π and the convergence properties of the solution process x, or more specifically the decay properties of the semi-group operator e tL Π as t → ∞ can be made more precise via the formal identity which follows from for ϕ ∈ {φ ∈ S 0 : Lφ ∈ S 0 }. Using the identity (17), one directly finds that (14) is a sufficient condition for L −1 Π to be bounded since We conclude that the exponential decay (13) implies the central limit theorem (15) for ϕ in the 99 corresponding function space E 0 . Estimates for E 0 = H 1 (µ) ∩ L 2 0 (µ) can be obtained using the 100 framework of hypocoercivity as presented in [29]. In [30] techniques are introduced to show the decay 101 estimate (13) for E 0 = L 2 0 (µ). In this article we will use Lyapunov function based techniques which 102 allow to show exponential convergence as in (13) in some weighted L ∞ spaces, which we specify in the 103 next section. Another way of deriving exponential decay estimates for the semi-group (e tL ) t≥0 , which are sufficient to establish a central limit theorem for certain observables, is by means of well established Lyapunov techniques. These techniques have been formulated originally for discrete-time Markov processes/Markov chains [31-33] and have been subsequently extended to continuous time solutions processes of SDEs [22,23,34,35]. The function space on which decay estimates are shown in these references are Banach spaces of the form ( where K ∈ C 2 (Ω x , [1, ∞)) is a positive function, and Exponential convergence in the sense of is typically shown provided that the following two assumptions are satisfied.

108
As outlined in [23,35], Assumption 2 follows if L satisfies the parabolic Hörmander condition and the SDE (5) is controllable in the sense that there is a t > 0 such that for any pair Moreover, there exist C > 0 and κ > 0 such that (22) holds for all ϕ ∈ L ∞ K (Ω x ) and any t ∈ [0, ∞).

111
Finally, let us demonstrate how decay estimates in spaces L ∞ K can be used to derive a central limit theorem for certain functions. Let V be a Lyapunov function such that the conditions for Theorem 2.2 are satisfied for K = V. Note that if the conditions of Theorem 2.2 are also satisfied for V 2 , then this implies that a central limit theorem holds for all observables ϕ ∈ L ∞ V , since (24) being valid for for ϕ ∈ L ∞ V , so that by [28] indeed a central limit theorem of the form (15) holds for ϕ ∈ L ∞ V . This   117 We now return to the examples from the introduction, specifically Langevin dynamics with 118 space-dependent friction (1)-(2), and show that in case the variable friction tensor Γ is positive definite 119 and the diffusion tensor Σ has full rank, at all points of the phase space, then the system satisfies 120 the conditions described in the previous section for ergodicity and exponential decay estimates. As illustrations, we consider three applications: (i) a particle model with a temperature gradient, (ii) 122 a simple 2-dimensional Langevin diffusion with a non-conservative force term, (iii) an illustrative 123 Langevin dynamics model for of flocking/swarming, as commonly arises in studies of active particle 124 systems. Let, for a square matrix A ∈ R n×n ,

Langevin dynamics with configuration-dependent diffusion
denote its spectrum. The following assumption on the spectrum of Γ and ΣΣ T ensures the existence 140 of a suitable Lyapunov function in the case these matrices are not constant in q.
(ii) The diffusion matrix has full rank, i.e., rank(Σ(q)) = n, for all q ∈ Ω q , and the spectrum of ΣΣ T is bounded in the sense that Obviously, Assumption 3 is automatically satisfied for Ω q = LT n as long as the coefficients Γ, Σ 142 are smooth and Γ(q) is positive definite and Σ(q) has full rank at every point q ∈ Ω q . The next 143 assumption ensures the existence of a suitable Lyapunov function in the case of a non-conservative 144 force. Again, it is trivially satisfied for Ω q = LT n and F ∈ C ∞ (T n , R).

145
Assumption 4. There exists a potential function U ∈ C 2 (Ω q , R) with the following properties for all q ∈ Ω q .

147
(ii) the potential function is bounded from below, i.e., there exists u min > −∞ such that We point out that if F is a conservative force, then Assumption 4 reduces to the same asymptotic In (1)-(2), let the force, the friction and diffusion tensors be smooth functions, i.e., such that Assumption 4 and Assumption 3 hold. There is a unique invariant probability measurẽ in the case of Ω q = LT n and with suitably chosen positive constants a, b, c > 0 in the case of Ω q = R n . Furthermore, the probability measureμ is such, that Proof. Lyapunov condition 152 We first show that Assumption 1 holds for K l as defined in (27) and (28) in the respective setups. Note that We first show the existence of constants a l , b l such that Assumption 1 is satisfied for a = a l , b = b l in 153 the case l = 1. For l ≥ 2, the existence of suitable constants a l , b l follows inductively.
By Assumption 4, (i), it follows that Similarly, by Assumption 3, (ii), we find Putting both inequalities together thus yields where we used Assumption 4, (iii) in the inequality. The existence of suitable constants a and b such that Assumption 1 is satisfied for the case K = K 1 , follows directly if the matrix M is positive definite. The positive definiteness of the block matrix M is implied if (See e.g. [37]), the matrices −2bI n + cΓ(q) and the Schur complement are both positive definite. Indeed, since the spectrum of Γ is uniformly bounded on Ω q = R n according 155 to Assumption 3, (i) the positive definiteness of both these matrices can be ensured by choosing a, b 156 sufficiently small and c sufficiently large.
Rewrite (29) as a second order differential equation: Since rank(Σ(q)) = n for all q ∈ Ω q , there exists Σ g ∈ C ∞ (Ω q , R m×n ) such that for all q, p ∈ R n , thus, is a solution of (29).

159
Note that the case where Γ and Σ are constant was already studied in Mattingly et al [35] and 160 the proof of the above theorem resembles the structure of the proof therein.

162
In what follows we provide three examples of models whose ergodic properties can be characterised 163 by the above Theorem 3.1. 3.2. Single-particle system with non-conservative force 165 As an example of a system with a non-conservative force term, which satisfies the condition of the above theorem we consider a Langevin equation of the form (1)-(2) where the force term F is of the form where J = 0 is a skew-symmetric matrix and α ∈ R. For α = 0 the additional term α J q obviously does not correspond to the negative gradient of a smooth potential energy function, thus in this case the force (31) is indeed non-conservative. It is easy to verify that U satisfying the growth conditions (ii)-(iii) in Assumption 4 and with > 0 as q → ∞ implies (i) in Assumption 4. Therefore, as long as the remaining conditions 166 on the coefficients Γ and Σ in Theorem 3.1 are satisfied, it follows from the same theorem that the 167 respective (non-equilibrium) dynamics possesses an invariant measure, µ α (dq, dp) = ρ α (q, p)dq dp,

168
to which it converges exponentially fast in L ∞ K as specified above. In the remainder of this section we consider the application of Theorem 3.1 to two different types of particle systems, which can be seen as instances of the underdamped Langevin equation with non-constant coefficients. With some abuse of notation whenever a particle model is considered, let q i ∈ LT d , i = 1, . . . , N where L > 0 and p i ∈ R d , i = 1, . . . , N denote the position and momentum vectors of the particles, respectively. If, on the other hand, we want to refer to the i-th entry of the denotes the operator which selects the i-th cartesian coordinate, i.e., Π i x = e i · x. Furthermore, if not stated otherwise, we will assume, that the force is conservative and corresponds to the gradient of a potential function U (q), which is composed as the sum of smooth pair potentialsŨ i,j ∈ C ∞ (R, R), 1 ≤ j < i ≤ N ., i.e., 3.4. Particle system with temperature gradient 171 As a first application we consider an N -particle system with periodic boundary condition where the particles are coupled to a heat bath whose temperature varies depending on the position of the particle within the periodic simulation box. The system we consider in the following is of the forṁ for i = 1, . . . , N , where W i , i = 1, . . . , N , are independent white noise processes in R d . We further assume that the friction coefficient γ > 0 is a positive constant and β a smooth positive function on LT d , i.e., β ∈ C ∞ (LT d , R + ). In light of (29) this corresponds to a constant diagonal friction matrix Γ(q) = γI n , and diffusion tensor of the form The matrices Γ and Σ are clearly positive definite and invertible, hence by Theorem 3.1, there exists a 172 unique invariant measure µ γ,β to which the solution of (33) converges (exponentially fast) in law. Note bounded from above and away from 0, and the potential U is modified so that Assumption 4 is satisfied,

177
(e.g. by adding a confining potential to U ,) it is easy to see that Theorem 3.1 applies also for the case 178 Ω q = R n . noise. An SDE model was presented there without analysis. A subsequent paper [5] provided numerical 185 evidence for flocking states. In these papers, the SDE approach consists of a Langevin-type model with 186 coordinate-dependent friction and additive (typically uniform) white noise, that is they take the form For a system of particles moving in one dimension, one has a friction matrix Γ and for some given scalar kernel function ψ(r) ≥ 0. That is, the matrix Γ(q) is a (weighted) graph Laplacian 188 which reflects the interaction structure of the problem. If ψ(r) > 0, this interaction graph is complete.

189
The inclusion of conservative forces derived from a potential energy function U is an addition to the models mentioned above and allows to incorporate direct attraction and repulsion effects. Note that since [γ ij ] 1≤i,j≤N is of the form of a graph Laplacian, it is positive semi-definite with at least one eigenvalue being singular. If we assume that ψ(r) > 0, then it follows that [γ ij ] 1≤i,j≤N possesses exactly one singular eigenvector, 1 N ∈ R N , i.e., the vector whose entries all are equal to 1. Consequently, Γ(q), has exactly d singular eigenvalues, each of them being of the form u l = 1 N ⊗ e l , where e l denotes the l-th canonical vector in R d . These singular eigenvectors can be associated with the mean momentum of the collection of particles via the relation for l ∈ {1, . . . , d}. There are several issues regarding the model (34)-(35). Most importantly, while the diffusion matrix in (35) has full rank, the friction tensor Γ(q) is only of rank (N − 1)d, which means that in directions of the singular vectors u l , 1 ≤ l ≤ d there is no dissipation and therefore the kinetic energy of the system will be unbounded as t → ∞, and hence one cannot expect (34)-(35) to possess an invariant measure. More specifically, if U is composed of pair potentials one can show that the mean momentum p is a Brownian motion,ṗ = [u l ] T 1≤l≤dẆ .

190
A simple fix to the model (34) − (35), which ensures ergodicity of the dynamics, is to add additional dissipation, which is uniform in each component of p. This can be achieved by replacing the friction tensor in (35) by Γ (q), which for > 0 is defined such that Γ (q) = Γ(q) + I N d .
It follows directly from the Gershgorin circle theorem that Γ (q) is positive definite for all q ∈ Ω q and 191 any choice of > 0. While the above described regularized stochastic Cucker-Smale dynamics is a valid extension of the original model which ensures that the conditions of Theorem 3.1 are satisfied and hence geometric ergodicity for (34)-(35) holds, the form of the corresponding invariant measure does depend in a non-trivial way on Γ, σ and and unless Γ is constant in q, one can not easily find a closed form of the invariant measure. We therefore propose another novel modification of (34)-(35), which is geometrically ergodic with an invariant measure of closed form. We construct this model as a superposition, i.e., x = x ⊥ + x || , of two independent stochastic processes, x ⊥ = (q ⊥ , p ⊥ ) and x || = (q || , p || ) taking both values in Ω x × R n , respectively. We construct these processes such that the parametrization of the former process determines the statistical properties of the stochastic inter-particle interactions and the parametrization of the latter process the collective motion of the flock, i.e., the diffusive properties of the centre of mass. More specifically, we construct x ⊥ as the solution of the SDE and x || as the solution of dq || dt = p || , where γ ⊥ , T ⊥ , γ || , T || ≥ 0 and W as specified in (3) and (4). We refer to (38) where γ ij are defined as above in (36) and (37). By choosing the diffusion matrix Σ ⊥ such that we ensure that the total momentum in each dimension of the physical domain remains constant, i.e., d dt This follows directly from the fact, that as well as u l · ∇ q U (q ⊥ ) = 0, (since U is composed of pair potentials) and therefore by (38), Consensus dynamics: 198 We construct the matrix Γ || such that the difference in the momenta of all particle pairs remains constant under the dynamics (39), i.e., for all 1 ≤ i, j ≤ N . We achieve that by choosing Γ || as and Σ || such that i.e., Σ || = N −1/2 I N ⊗ I d .
Combined dynamics: 199 We first observe that although the processes x ⊥ and x || are driven by the same Wiener process W they are indeed independent. This follows since the column vectors of Γ ⊥ (q) are orthogonal to the column vectors of Γ || in the sense that for all q ∈ Ω q , and hence also so that Σ ⊥ (q)W and Σ || W are independent processes, which implies that also the solution processes of the respective SDEs are independent. Moreover, since U is composed of pair-potentials, we have Using (45)-(47) it directly follows that x = x ⊥ + x || can be identified with the solution of (1)- (2) with with and or more explicitly with the solution of dq dt = p, Remark 3.1. We note that the above choice of Γ ⊥ (q) and Σ ⊥ (q) is very similar to the friction tensor and diffusion tensor in dissipative particle dynamics. In fact (38) would exactly correspond to a DPD system, if instead of (40) one constructs the friction tensor such that dissipation is aligned with the relative orientation of particle pairs Proposition 3.1. Let Γ(q) and Σ(q) defined as above in (49) and (50), with T ⊥ > 0, T || > 0. Let further Ω q = T N ×d and U be of the form (32), then the SDE (51) possesses an invariant measure µ T ⊥ ,T || (dq dp) = ρ T ⊥ ,T || (q, p)dq dp of the form with covariance matrix is the all-ones matrix, i.e., every entry of 1 N is equal to 1. The matrix L ∈ R N ×N with denotes the graph Laplacian of a fully connected graph. In particular, where 1 ≤ i < j ≤ N , and for p : where 1 ≤ j ≤ d.

200
Proof. We show that (52) is a stationary solution of the Fokker-Planck equation associated with the SDE (51), i.e., with where the action of each of these operators applied to ρ ∈ C 2 (Ω q , R) is given as Before we show (55), we first note that since ker (Γ ⊥ (q)) = span Γ || , for all q ∈ Ω q , we have Furthermore, Finally, since U (q) is composed of pair potential functions, we have for all c ∈ R, hence in particular Given the above identities, we conclude (57), and L † O⊥ ρ T ⊥ ,T || = 0, L † O|| ρ T ⊥ ,T || = 0, due to (ii) and (i), respectively.

206
The invariant measure µ T ⊥ ,T || specified in Proposition 3.1 is unique and the law of (51) converges 207 exponentially fast towards µ T ⊥ ,T || in the sense of (26) with K l , l ∈ N as constructed in the proof of 208 Theorem 3.1.  210 We next describe the construction of numerical methods for the Langevin system. The 211 discretization schemes described here are based on a general splitting framework as explained in the 212 introduction, adapted for the variable coefficient structure.

213
Following [15], we break the Langevin system (1)-(2) into three parts: A and B corresponding to the deterministic flow: and O, which is associated to the isolated momenta diffusion process defined by an Ornstein-Uhlenbeck type equation, i.e. the stochastic systeṁ In the case of non-constant coefficients Γ(q), Σ(q), exact solution of the O-step can be computationally costly. More specifically, bearing in mind that q(t) ≡ q is fixed during the isolated Ornstein Uhlenbeck process (58), a time-step ∆t > 0 can be written: where R ∼ N (0, I N d ), and G t = e −∆tΓ(q) .
The matrix S t ∈ R N d×N d is related to G t as where C t is the solution of the Lyapunov equation This means that in order to compute an exact solution of the O-part as an (59), first the Lyapunov equation (62) has to solved, followed by the computation of the Cholesky decomposition (61) and matrix exponential (60). Each of these operations is without any additional assumptions on the structure of the matrices Γ(q) and Σ(q) of computational complexity O (N 3 d 3 ). We circumvent these computations by instead integrating the O-part using a numerical method. We construct a symmetric splitting method based on the decompositionṗ = −Γ(q)p =:D + Σ(q)Ẇ =:F , and use the integration sequence DFD, which results in an update of the form with G = exp(− ∆t 2 Γ(q)).
In order to avoid large numerical errors induced by this approximation we employ a multiple 223 timestepping approach [17], meaning that instead of executing a single integration step of (63) for 224 a full time step, we repeat for K ∈ N, the integrations step (63) K-times using a step size ∆t/K.  , p), Γ, Σ, ∆t, K 2: q ← q + (∆t/2)p 3: p ← p + (∆t/2)F (q) 4: G ← exp(− ∆t 2K Γ(q)) 5: for i = 1 to K do 6: p ← G 2 p + G √ ∆t/K Σ(q) R i , R i ∼ N (0, I n ) 7: end for 8: p ← p + (∆t/2)F (q) 9: q ← q + (∆t/2)p 10: return (q, p) 228 as there is a trade off between computational costs and accuracy of the approximation of the solution 229 of the Ornstein-Uhlenbeck process, which in turn affects the accuracy of the invariant measure of the 230 numerical method. The parameter K in the m-BAOAB algorithm allows control of the accuracy with 231 which the OU process is resolved. The method was tested in the example of Subsection 5.3, below, but 232 the detailed numerical analysis of this method will be taken up in a forthcoming article.

Numerical experiments 234
In this section we consider three systems that fit the form of (1)-(2). In the first, we consider a We first study a simple particle system with a position dependent temperature parameter as described in Section 3.4, comprising N = 64 particles on a two dimension torus, i.e., q i ∈ LT 2 with L = 5, which are heated by a source at the center of the simulation box. The heat source is modeled by choosing the position-dependent heat bath temperature as where c L = 1 2 (L, L) is the center point of the simulation box and Ψ is a smooth bump function of the form The constant T max > 0 corresponds to the maximum heat bath temperature at the center point c L of the simulation box and T min > 0 describes the heat bath temperature outside of the disk The potential function U is modeled as the sum of pairwise potentials, i.e., with r i,j := |q j − q i | and ω is a simple repulsive soft potential of the form where k and c r are positive constants. The pair interaction described by ω corresponds to a harmonic spring of stiffness k and rest length c r . Particle systems involving this type of pair potential are commonly used as benchmark systems in the context of dissipative particle dynamics [39,40]. Due to the isolated jump discontinuity in the second derivative of the potential Theorem 3.1 does not strictly speaking, apply in this case. Although it would be easy to modify the potential to have any desired level of smoothness, we expect, based on our experience with molecular dynamics problems where similar such issues arise, that the results will be very similar to those obtained with the potential given here.
The simulation results reported in the remainder of this section are obtained for a parameterization of the model where k = 25, c r = 1 and r max = 1 with T min = 1/10 and T max = 4. The particle positions and momenta were initialized on an equidistant grid such that q i = L N ( i N , i mod N ) and p i = (0, 0), respectively. N t = 10 5 timesteps of stepsize h = 2 · 10 −2 were simulated with varying values γ ∈ {10 i , i = −1, 0, 1, 2} of the friction coefficient. Define by the effective temperature of the system at distance r ∈ [0, L/2) from the center c L of the simulation box. Figure 1, B, shows estimates of this function for different values of γ calculated at points with A larger value of γ leads to a tighter local coupling of the particle system with the heat bath and 251 hence for large friction γ = 10 2 , the estimates of the effective temperature coincide very well with the 252 heat bath temperature given by Ψ at the respective distances from the center of the simulation box. Due to the relatively small size of the heat source area the number of particles in this area is small in comparison to the number of particles outside of the heat source area and hence the shape of the radial density is mainly determined by the statistical properties of the particles outside the heat source area. The empirical particle densities shown in Figure 2 provide further insight into how the value of the friction coefficient affects the properties of µ γ,β . For a small friction coefficient, i.e., γ = 10 −2 the particle density is very close to uniform on LT 2 . If the heat bath temperature was constant in LT 2 one would expect a uniform density as the potential function U is solely composed of pair potentials and thus translation-invariant in the coordinates of the particle density. More precisely, let q i,1 and q i,2 denote the first and second coordinate value of the i-th particle, then for all a 1 , a 2 ∈ R. Therefore, the observed uniform particle density for γ = 10 −2 is consistent with the 262 observation that the position dependency of the effective temperature vanishes for this friction value.

263
While for the considered trajectory length/sample size the plot of the invariant for a friction coefficient 264 γ = 10 2 is very noisy, it still strongly suggests that the translation invariance of the particle density, 265 i.e., the uniformity of the particle density, is broken in this regime. Within the interior of the heat 266 source region particles are distributed approximately uniformly whereas close to the boundary of the 267 heat source region the particle density is increased. Outside the heat source region the particle density Effective temperature B γ = 10 −2 γ = 10 −1 γ = 10 0 γ = 10 1 γ = 10 2 Figure 1. Panel A: Radial density function for the particle system described in Section 5.1 for different values of the friction coefficient γ. Panle B: Distance r from heat source center c L vs. effective temperature estimated according to (65). The black curve corresponds to the heat bath temperature Ψ(r), with Ψ as defined in (64).  272 We next consider an instance of a non-equilibrium system with a non-conservative force of a form as outlined in Section 3.2. More specifically, we let Ω q = R n with n = 2 and let the force F be of the form (31), i.e.,

System with non-conservative force
We furthermore assume that the system is driven by a standard Langevin diffusion, i.e., Γ = γI n , and where we choose γ = 1, and β = 1. This means, that without the non-conservative force part, i.e., in the case α = 0, the system considered resembles to a particle moving in a 4-well potential driven by a standard Langevin equation at equilibrium at unit temperature. The non-conservative force part α J q, corresponds to a stirring force, which pushes the system radially and in clockwise direction around the origin. The effect of the stirring force can be seen in Figure 3. In the absence of the stirring force (see Figure 3, panel A) the invariant distribution is exactly the canonical distribution, i.e., The modes of the marginal density in q are exactly positioned at the energy minima of U at (±1, ±1).
Estimates of the mean momentum vectorfield p(q) := Ωq pρ α (q, p)dp, vanish at each point in the configurational phase space. In the presence of the stirring force (see 273 Figure 3, panel B), the invariant density is rotated slightly and smeared out over the energy barriers.

274
The mean momentum resembles a vector field spiralling clockwise around the origin. If not stated otherwise the results presented in this section were obtained from a system of 64 particles in a periodic simulation box of edge length 5 and dimension 2, i.e., Ω q = LT N d with Figure 3. Marginal of q of the invariant density of the system described in Section 5.2 in the absence (α = 0) of the non-conservative force (A), and in the presence (α = 1) of the non-conservative force (B). The black arrows correspond to estimates of the mean momentum vector field (66) in the invariant density at the respective points in configurational space. d = 2, L = 5 and N = 64. We assume the particle pair-potentialsŨ i,j in the definition of U to be identical Morse potentials, i.e., for i = j U i,j (r) = D(1 − e −a(r−cr ) ) 2 , r > 0, with D = 1, a = 1, c r = 1/2. Following [41] we choose the functions ψ in the definition of γ ij in (36) to be of the form ψ(r) = K 1 + r α , with K = 1/10, and α = 6. 289 5.3.2. Independent control of peculiar and consensus temperature 290 We first demonstrate that as stated in Proposition 3.1, we can indeed control the peculiar and the consensus temperature independently, such that these quantities coincide with the values of the model parameters T || and T ⊥ , respectively, i.e., we show that the identities (53) and (54) are reproduced in simulations. For this purpose we compute an estimate of the peculiar temperature at time step n ∈ N as and an estimate of the consensus temperature at time step k ∈ N as In Figure 4 we show the results for a system with T ⊥ = 1/2 and T || = 5. We see that the cumulative 291 average of the respective estimates converges, after a short equilibration period, to the target values. . The blue trajectory shows the estimates ϕ T ⊥ (p (k) ) and ϕ T || (p (k) ) at timestep k for the peculiar and the consensus temperature, respectively. The respective cumulative averages are shown in red.

Properties of the flock 294
We first demonstrate the effect of the parameterization of the peculiar dynamics (the values of 295 the peculiar friction γ ⊥ and the peculiar temperature T ⊥ ) on the flock size, flock formation, and 296 inter-flock diffusion. While we vary the values of the peculiar temperature parameter and peculiar 297 friction parameter in order to demonstrate the effect of these parameters on the above quantities, we 298 leave the parameter values of the consensus dynamics unchanged in all simulations as γ || = 1 and 299 T || = 1. In the first series of simulations we vary the value of T ⊥ , while fixing the value of the peculiar 300 friction parameter to γ ⊥ = 1. Figure 6 shows the radial density for different values of T ⊥ . As one would 301 expect, the probability mass is concentrated around the mean rest-length of the Morse-potential for 302 low values of T ⊥ and the distribution spreads out for higher values of T ⊥ . If we measure flock size in 303 terms of the mean distance between particles in the flock, this implies that the flock size grows for 304 increased temperature. 305 We next explore the effects of the peculiar friction parameter γ ⊥ . We study the effect of the value of γ ⊥ on the flock formation using the mean distance between particles, ϕ md (q) = 2 N (N − 1) as a measure of the progression of the flock formation. We initialize the particle position out of 306 equilibrium on a equidistant square grid covering a square with side length 5/2. Figure 5, A, shows 307 the time evolution of observed mean distance ϕ md (q) in simulations using different values of γ ⊥ . We 308 find that for small value of γ ⊥ flock formation comes with strong oscillation in the flock size (measured 309 in terms of the mean particle distance), which only slowly decays. With increased values of γ ⊥ these 310 oscillations are more strongly damped so that the flock size quickly approaches its equilibrium value. 311 We next explore the effect of the value of the model parameter γ ⊥ on the mobility of particles within the flock. In order to measure the strength of diffusion within the flock we consider the mean distance of the particle i to the center of mass, i.e., The autocorrelation of this observable can be used as a measure of mobility within the flock.
312 Figure Figure 5, panel B, shows the autocorrelation function for ϕ i mc calculated from a trajectory in 313 equilibrium, i.e., after the initial transitional flock-formation phase described in the previous paragraph. 314 We can see that the decay of the autocorrelation function becomes slower for increasing values of γ ⊥ , 315 which indicates that for an increased value of γ ⊥ the mobility of particles within the flock is reduced.  We next explore the effects of the values of γ || and T || on the collective motion of the flock, i.e., the diffusive behaviour of the center of mass. Since the motion of the center of mass is described by the consensus dynamics, which is driven by an Ornstein-Uhlenbeck process, we find the diffusion constant D of the center of mass (viewed as a free particle in space) to be   Table Table 1 shows the estimated diffusion coefficients for various values of γ || and T || . We find a good 319 match between theoretically predicted values and observed values. 320

321
In this article we have provided a general treatment of the convergence of Langevin dynamics 322 to a stationary state, including for systems with configuration-dependent friction and noise, as 323 well as nonconservative forces. We have demonstrated the concepts in applications to systems with 324 temperature gradients and stochastic models of active particle systems. Our approach does not assume 325 the usual fluctuation-dissipation rule, so it can be applied to a wide range of nonequilibrium molecular