Reliable Approximation of Long Relaxation Timescales in Molecular Dynamics

Many interesting rare events in molecular systems, such as ligand association, protein folding, or conformational changes, occur on timescales that are often not accessible by direct numerical simulation. Therefore, rare event approximation approaches such as interface sampling, Markov state model building, or advanced reaction coordinate-based free energy estimation have attracted considerable attention recently. In this article we analyze the reliability of such approaches: how precise is an estimate of the long relaxation timescales of a molecular system that results from the various forms of rare event approximation methods? Our results give a theoretical answer to this question by relating it to the transfer operator approach to molecular dynamics. In doing so, we also expose deep connections between the different approaches.


Introduction
The problem of accurately estimating the long relaxation timescales associated with rare events in molecular dynamics, such as ligand association, protein folding, or conformational changes, has attracted a lot of attention recently. Often, these timescales are not accessible by direct numerical simulation. Therefore, different discrete coarse graining approaches for their approximation, like Markov state model (MSM) building [1,2] or time-lagged independent component analysis (TiCA) [3,4], have been introduced and successfully applied to various molecular systems [5,6]. These approaches are based on finite-dimensional Galerkin discretization [1] or variational approximation [7,8] of the transfer operator of the molecular dynamics process [9]. In several theoretical studies, the approximation error of these numerical techniques regarding the longest relaxation timescales has been analyzed, resulting in error estimates in terms of the dominant eigenvalues of the transfer operator [3,9]. In this article we first show how to obtain similar error estimates when replacing the transfer operator by the infinitesimal generator [10] associated with it. Furthermore, the analysis shows that the different approaches are deeply connected, that is, in the end they lead to an identical numerical problem.
In addition to the different discrete coarse graining approaches, the literature contains various alternative reaction coordinate sampling approaches aiming at the approximation of very long relaxation processes. In these sampling approaches, one assumes that the effective dynamical behavior of the system on long timescales can be described by a relatively low-dimensional object given by some reaction coordinates. Examples include advanced methods such as umbrella sampling [11,12], metadynamics [13,14], blue moon sampling [15], the adaptive biasing force method [16], and temperature-accelerated molecular dynamics (TAMD) [17], as well as trajectory-based techniques like milestoning [18], transition interface sampling [19], and forward flux sampling [20]. These methods result in free energy barriers, transition rates, or mean first passage times for the rare events of interest; they are complemented by several approaches to the effective dynamics of the reaction coordinate space [21-23] that allow for significantly faster simulation of these rare events [24-26], including details of the underlying molecular mechanisms. Surprisingly, our analytic tools, originally developed for discrete coarse graining approaches, can also be utilized for evaluating the approximation quality of reaction coordinate sampling approaches to the effective dynamics. We derive an explicit error estimate for the longest timescale resulting from the choice of specific reaction coordinates.
However, estimating the approximation quality is not the only way of utilizing the analytical insights presented in this article. We also demonstrate how the new techniques for simulation of the effective dynamics can be used for efficient MSM building or TiCA applications.
Mathematically, the article is based on the analysis of the dominant timescales of reversible and ergodic diffusion processes in energy landscapes. The leading eigenvalues of the transfer operator (or, equivalently, the infinitesimal generator) and the corresponding eigenfunctions characterize the dynamical behavior of the process on long timescales [9,27]. Firstly, the approximation error with respect to these leading eigenvalues under discretization of the transfer operator has been discussed in several articles, cf. [3,7,8,28-30]. Following this work, we characterize the approximation quality for the (low-lying) eigenvalues of the infinitesimal generator. This permits us to study the connection between the effective dynamics considered in [23] and Galerkin discretization schemes for the transfer operator. Secondly, following the work [7,8], we study the variational approach for the infinitesimal generator. In fact, we will see that this approach leads to the same generalized matrix eigenproblem as the one resulting from Galerkin discretization. Thirdly, numerical issues related to the estimation of the coefficient matrices by means of the effective dynamics are discussed.
The paper is organized as follows. In Section 2, we introduce the various operators associated with reversible diffusion processes and discuss the relation between eigenvalues and relaxation timescales. Next, in Section 3, we study the Galerkin discretization of generators/transfer operators for solving the eigenproblem and show that previous results can be extended to reaction coordinate subspaces. In Section 4, the variational approach to the approximation of the eigenproblem is considered and its relations to the Galerkin approach are worked out in detail. Then, in Section 5, we discuss numerical issues related to estimating the discretization matrices by means of simulating the effective dynamics for given reaction coordinates; the performance of this approach is studied numerically in Section 6. Finally, conclusions and some further remarks are given in Section 7. Once familiar with the facts in Section 2, readers who are more interested in numerical and algorithmic aspects than in detailed mathematical analysis can skip Sections 3 and 4 and refer to Sections 5 and 6 on first reading.

Diffusion Process and the Associated Operators
We consider a diffusion process given by the stochastic differential equation (SDE)
$$dx_s = -\nabla V(x_s)\,ds + \sqrt{2\beta^{-1}}\,dw_s, \tag{1}$$
where $x_s \in \mathbb{R}^n$, the parameter $\beta > 0$ is related to the inverse of the system's temperature, and $w_s$ is an $n$-dimensional Brownian motion. The potential function $V : \mathbb{R}^n \to \mathbb{R}$ is assumed to be smooth and bounded from below. The results presented subsequently can be extended to more general reversible diffusion processes with a state-dependent noise intensity matrix, cf. [23]. However, for the sake of simplicity of presentation, we restrict our considerations to the specific case (1) typically studied in molecular dynamics.
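The infinitesimal generator of the dynamics (1) is given by
$$L = -\nabla V \cdot \nabla + \frac{1}{\beta}\Delta. \tag{2}$$
It is known that, under mild conditions on $V$, the solution process $(x_s)_{s \ge 0}$ of (1) is ergodic [31], and its unique invariant measure $\pi$ is given by $\pi(dx) = \rho(x)\,dx$, where
$$\rho(x) = \frac{1}{Z}\,e^{-\beta V(x)}, \qquad Z = \int_{\mathbb{R}^n} e^{-\beta V(x)}\,dx. \tag{3}$$
We introduce the Hilbert space $H = L^2(\mathbb{R}^n, \pi)$, which is endowed with the inner product
$$\langle f, g\rangle_\pi = \int_{\mathbb{R}^n} f(x)\,g(x)\,\rho(x)\,dx, \tag{4}$$
and the norm
$$\|f\|_\pi = \langle f, f\rangle_\pi^{1/2}. \tag{5}$$
The domain of the operator $L$ will be denoted as $D(L) \subset H$. It is also known that the process $(x_s)_{s \ge 0}$ is reversible and that $L$ is a self-adjoint operator with respect to the inner product (4). Whenever the potential $V$ grows to infinity fast enough at infinity, the spectrum of $L$ is discrete [9]. Let $\lambda_i \in \mathbb{C}$ and $\phi_i \in D(L)$ be the eigenvalues and the corresponding (normalized) eigenfunctions of $-L$, that is, the solutions of the eigenproblem
$$-L\phi = \lambda\phi. \tag{6}$$
Due to the self-adjointness of $L$ and the fact that
$$\langle -Lf, f\rangle_\pi = \frac{1}{\beta}\int_{\mathbb{R}^n} |\nabla f|^2\,\rho\,dx \ge 0, \qquad \forall f \in D(L), \tag{7}$$
we can assume that $\lambda_i \in \mathbb{R}$ with
$$0 = \lambda_0 < \lambda_1 \le \lambda_2 \le \cdots, \tag{8}$$
and $\phi_0 \equiv 1$. Given $s \ge 0$, we define the operator $T_s : H \to H$ by
$$T_s f(x) = \mathbf{E}\big(f(x_s) \mid x_0 = x\big), \qquad f \in H, \tag{9}$$
where $\mathbf{E}$ denotes the expectation taken with respect to the paths of (1) under the initial condition $x_0 = x$. It is well known that $u(s, x) = T_s f(x)$ is the solution of the Kolmogorov backward equation
$$\frac{\partial u}{\partial s} = Lu, \qquad u(0, \cdot) = f, \tag{10}$$
that is, the operators $T_s$, $s \ge 0$, form a one-parameter semigroup whose infinitesimal generator is $L$, and therefore they are self-adjoint in $H$ as well. Because of Equation (10), the formal expression $T_s = e^{sL}$ is often used in the literature. Similarly to (8), we also know that, for $s > 0$, the eigenvalues of $T_s$ are given by
$$1 = e^{-\lambda_0 s} > e^{-\lambda_1 s} \ge e^{-\lambda_2 s} \ge \cdots, \tag{11}$$
with the same eigenfunctions $\phi_i$, $i = 0, 1, \dots$.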
In the following we introduce another operator, called the transfer operator, which has been extensively considered in the literature to investigate the metastability of molecular systems and to build Markov state models (MSMs) [1,6,9]. A lag time $\tau > 0$ is fixed, with $p(x, \cdot\,; \tau)$ being the transition density function of the process (1) starting from $x \in \mathbb{R}^n$, i.e., $p(x, y; \tau)$ describes the probability density of starting from state $x$ at time $s = 0$ and arriving at $y \in \mathbb{R}^n$ after time $\tau$. For a bounded and continuous function $u \in H$, the transfer operator $\mathcal{T}_\tau : H \to H$ is defined by [1,27,32]
$$\mathcal{T}_\tau u(x) = \frac{1}{\rho(x)}\int_{\mathbb{R}^n} p(y, x; \tau)\,u(y)\,\rho(y)\,dy. \tag{12}$$
From (12) and the detailed balance relation $\rho(y)\,p(y, x; \tau) = \rho(x)\,p(x, y; \tau)$, which holds because the process is reversible, it follows immediately that
$$\mathcal{T}_\tau u(x) = \int_{\mathbb{R}^n} p(x, y; \tau)\,u(y)\,dy = \mathbf{E}\big(u(x_\tau) \mid x_0 = x\big),$$
which then implies $\mathcal{T}_\tau = T_\tau$, i.e., the transfer operator $\mathcal{T}_\tau$ coincides with the operator $T_\tau$, a member of the semigroup $(T_s)_{s \ge 0}$. Denote the eigenvalues of $\mathcal{T}_\tau$ as $\mu_i$, $i \ge 0$, such that
$$1 = \mu_0 > \mu_1 \ge \mu_2 \ge \cdots.$$
Then, from the discussion above and the eigenvalues of $T_s$ in (11), we can conclude that $\mu_i = e^{-\lambda_i \tau}$ and that the corresponding eigenfunctions are the same as the eigenfunctions $\phi_i$ of the infinitesimal generator $L$. These eigenvalues and eigenfunctions encode crucial timescale information of the dynamical system. Specifically, the relaxation timescales $t_i$ of the dynamics (1) are given by [10]
$$t_i = -\frac{\tau}{\ln \mu_i} = \frac{1}{\lambda_i}, \qquad i = 1, 2, \dots.$$
This means that the dominant relaxation timescales of the dynamics (1) can be obtained by computing the dominant eigenvalues of $\mathcal{T}_\tau$ (or, equivalently, of $T_\tau$ or $L$), cf. [10,27].
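The relation between generator eigenvalues, transfer operator eigenvalues, and relaxation timescales can be made concrete with a few lines of Python. This is a minimal sketch: the function names and the sample eigenvalues are hypothetical and chosen only for illustration.

```python
import math

def transfer_eigenvalue(lam, tau):
    """Transfer operator eigenvalue mu = exp(-lam * tau) for a generator eigenvalue lam."""
    return math.exp(-lam * tau)

def implied_timescale(mu, tau):
    """Relaxation timescale t = -tau / ln(mu) for an eigenvalue 0 < mu < 1."""
    if not 0.0 < mu < 1.0:
        raise ValueError("mu must lie strictly between 0 and 1")
    return -tau / math.log(mu)

# Hypothetical generator eigenvalues; lambda_0 = 0 is skipped because its
# timescale is infinite (it corresponds to the invariant measure).
generator_eigenvalues = [0.01, 0.05, 2.0]
tau = 0.5
timescales = [implied_timescale(transfer_eigenvalue(lam, tau), tau)
              for lam in generator_eigenvalues]
```

Each recovered timescale equals the reciprocal of the corresponding generator eigenvalue, independently of the chosen lag time.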

Galerkin Approximation of the Eigenvalues of the Generator
In this section, we study the Galerkin method for computing the eigenvalues of the infinitesimal generator L. While Galerkin discretization of the transfer operator has been studied to some extent [9], results on the associated infinitesimal generator are rather sparse.

Some General Results
To introduce the Galerkin method, let $H_0$ be a Hilbert subspace of $H$ containing the constant function, and let $P$ denote the orthogonal projection operator from $H$ to $H_0$, which satisfies $P^2 = P$ and
$$\langle Pf, g\rangle_\pi = \langle f, Pg\rangle_\pi, \qquad \forall f, g \in H. \tag{14}$$
The Galerkin method aims at approximating the solution of (6) in the subspace $H_0$. Specifically, we want to find $f \in H_0$ such that
$$\langle -Lf, g\rangle_\pi = \kappa\,\langle f, g\rangle_\pi, \qquad \forall g \in H_0, \tag{15}$$
for some constant $\kappa \ge 0$. Using property (14), we know that problem (15) is equivalent to the eigenproblem for the operator $-PL$ on the subspace $H_0$, i.e.,
$$-PLf = \kappa f, \qquad f \in H_0. \tag{16}$$
It is straightforward to verify that $-PL$ is a self-adjoint operator on $H_0$. Similarly to (8), let $\zeta_i \in H_0$ be the orthonormal eigenfunctions of the operator $-PL$ corresponding to the eigenvalues $\kappa_i$, where
$$0 = \kappa_0 < \kappa_1 \le \kappa_2 \le \cdots, \tag{17}$$
and $\zeta_0 \equiv 1$. When $H_0$ is an infinite-dimensional subspace, we assume $\kappa_i \to +\infty$ as $i \to +\infty$.
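In the following, we study the condition under which the eigenvalues of the projected generator $PL$ are reliable approximations of the eigenvalues of the full generator $L$. The following approximation result was obtained in [23]; we include its proof for completeness.
Theorem 1. For $i \ge 0$, let $\phi_i$ and $\zeta_i$ be the orthonormal eigenfunctions of the operators $-L$ and $-PL$ corresponding to the eigenvalues $\lambda_i$ and $\kappa_i$, respectively. We have
$$\lambda_i \le \kappa_i \le \lambda_i + \frac{1}{\beta}\int_{\mathbb{R}^n} |\nabla(\zeta_i - \phi_i)|^2\,\rho\,dx.$$
Proof. From (15), we have
$$\langle -L\zeta_j, \zeta_l\rangle_\pi = \kappa_j\,\delta_{jl}, \qquad 0 \le j, l \le i.$$
Let $E_{i+1} = \operatorname{span}\{\zeta_0, \zeta_1, \dots, \zeta_i\}$. It follows from the orthogonality of the functions $\zeta_j$ that $E_{i+1}$ is an $(i+1)$-dimensional subspace of $H$. Using (17), it is direct to verify that
$$\max_{f \in E_{i+1},\ \|f\|_\pi = 1} \langle -Lf, f\rangle_\pi = \kappa_i.$$
Applying the min-max theorem to the eigenvalues of the operator $-L$, we conclude
$$\lambda_i = \min_{E}\ \max_{f \in E,\ \|f\|_\pi = 1} \langle -Lf, f\rangle_\pi \le \kappa_i,$$
where $E$ ranges over all $(i+1)$-dimensional subspaces of $H$. For the upper bound, we can compute that
$$\langle -L(\zeta_i - \phi_i), \zeta_i - \phi_i\rangle_\pi = \kappa_i + \lambda_i - 2\lambda_i\langle \phi_i, \zeta_i\rangle_\pi = \kappa_i - \lambda_i + \lambda_i\,\|\zeta_i - \phi_i\|_\pi^2 \ge \kappa_i - \lambda_i,$$
where we have used the fact that $\langle \phi_i, \phi_i\rangle_\pi = \langle \zeta_i, \zeta_i\rangle_\pi = 1$. The conclusion follows from (7).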
Previous studies on the Galerkin approximation of the dominant eigenvalues of the transfer operator have shown that the approximation error of the eigenvalues can be reliably bounded by means of the projection errors of the corresponding eigenfunctions [28-30]. Next we derive a similar result for the generator $L$. To this end, we introduce the orthogonal projection $P^\perp$ from $H$ to the complement subspace $H_0^\perp$ of $H_0$, that is, $P^\perp = I - P$. We have
Theorem 2. Let $\phi$ be a normalized eigenfunction of the operator $-L$ corresponding to the eigenvalue $\lambda$. Define constants, and suppose that $0 < \delta_2 < 1$. Then there is an eigenvalue $\kappa_i$ of the operator $-PL$, such that, where and the summation consists of finitely many terms when $H_0$ is a finite-dimensional subspace. For all $g \in H_0$, we can compute, Therefore we have,
Remark 1. Notice that our error bound above relies on both constants $\delta_1$ and $\delta_2$, while the error bound in [30] for the transfer operator depends on only one constant, the projection error $\delta_2$. This difference is due to the fact that the generator $L$ is an unbounded operator, while the transfer operator is bounded.

Finite Dimensional Subspaces
In applications, it is often assumed that $H_0$ is spanned by finitely many basis functions. In particular, this is the situation when constructing MSMs based on indicator functions of partition sets [30] or based on core sets [10].
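Let $H_0$ be the finite-dimensional space $H_0 = \operatorname{span}\{\psi_1, \psi_2, \dots, \psi_N\}$, where $\psi_i \in H$ are the basis functions, and consider the eigenproblem (15). As a direct application of Theorem 1 and Theorem 2, we have
Corollary 1. For the Galerkin approximation of the eigenproblem (15) using the finite-dimensional ansatz space $H_0$, the following three statements are valid:
1. The function $f = \sum_{l=1}^N x_l \psi_l$ solves (15) if and only if the vector $X = (x_1, \dots, x_N)^T$ solves the generalized matrix eigenproblem
$$CX = \kappa\,SX, \tag{24}$$
where $C$, $S$ are $N \times N$ matrices whose entries are given by
$$C_{ll'} = \langle -L\psi_l, \psi_{l'}\rangle_\pi = \frac{1}{\beta}\int_{\mathbb{R}^n} \nabla\psi_l \cdot \nabla\psi_{l'}\,\rho\,dx, \qquad S_{ll'} = \langle \psi_l, \psi_{l'}\rangle_\pi. \tag{25}$$
2. Let $\kappa_i$ be the $(i+1)$th smallest eigenvalue of problem (24) and $\zeta_i = \sum_{l=1}^N x_{il}\psi_l$ the function built from a corresponding eigenvector $X_i = (x_{i1}, \dots, x_{iN})^T$, normalized such that $X_i^T S X_j = \delta_{ij}$. Then we have
$$\lambda_i \le \kappa_i \le \lambda_i + \frac{1}{\beta}\int_{\mathbb{R}^n} |\nabla(\zeta_i - \phi_i)|^2\,\rho\,dx,$$
where $\lambda_i$, $\phi_i$ are the eigenvalues and the eigenfunctions of the operator $-L$, respectively.
3. Let $P$ be the orthogonal projection operator from $H$ to $H_0$, and $\phi$ be an eigenfunction of the operator $-L$ corresponding to the eigenvalue $\lambda$. Define constants, and suppose that $\delta_2 < 1$. Then there is an eigenvalue $\kappa_i$ of problem (24) such that,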
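To make the finite-dimensional Galerkin problem concrete, the following sketch solves a generalized eigenproblem of the form $CX = \kappa SX$ with NumPy. The toy setting is an assumption chosen because it can be checked by hand and is not taken from the paper: $V(x) = x^2/2$ in one dimension with $\beta = 1$, so that $\rho$ is the standard Gaussian density, and basis functions $1, x, x^2$. The matrix entries are taken to be $S_{ll'} = \langle\psi_l, \psi_{l'}\rangle_\pi$ and $C_{ll'} = \beta^{-1}\int \psi_l'\,\psi_{l'}'\,\rho\,dx$, which is what the Galerkin condition (15) together with integration by parts yields. The helper name `solve_generalized_eigenproblem` is hypothetical.

```python
import numpy as np

def solve_generalized_eigenproblem(C, S):
    """Solve C X = kappa S X for symmetric C and symmetric positive definite S.

    Returns the eigenvalues in ascending order and eigenvectors (as columns)
    normalized such that X.T @ S @ X is the identity matrix.
    """
    R = np.linalg.cholesky(S)          # S = R R^T with R lower triangular
    R_inv = np.linalg.inv(R)
    A = R_inv @ C @ R_inv.T            # equivalent symmetric standard problem
    kappa, Y = np.linalg.eigh(A)
    X = R_inv.T @ Y                    # back-transform: X = R^{-T} Y
    return kappa, X

# Gaussian moments: E[x] = 0, E[x^2] = 1, E[x^3] = 0, E[x^4] = 3.
S = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 3.0]])
# Derivatives of the basis are (0, 1, 2x), hence C = E[psi_l' psi_l'']:
C = np.array([[0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 4.0]])

kappa, X = solve_generalized_eigenproblem(C, S)
```

In this toy case the span of the basis contains the first three Hermite polynomials, which are exact eigenfunctions of the generator, so the Galerkin eigenvalues coincide with the exact values 0, 1, 2. For a general basis, only the two-sided bound of the corollary can be expected.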

Infinite Dimensional Subspace: Effective Dynamics
In this subsection, we discuss Galerkin approximations based on infinite-dimensional ansatz spaces; these cases appear when studying the effective dynamics given by a so-called reaction coordinate, cf. [23]. In order to explain the relation between Galerkin approximation and effective dynamics, let us first recall some definitions and results regarding the effective dynamics. For more details, readers are referred to [21,23,33].
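Let $\xi : \mathbb{R}^n \to \mathbb{R}^m$ be a reaction coordinate function, $m \ge 1$. For any function $f \in H$ and $x \in \mathbb{R}^n$, we define
$$Pf(x) = \frac{1}{Q(z)}\int_{\mathbb{R}^n} f(x')\,\rho(x')\,\delta\big(\xi(x') - z\big)\,dx', \tag{28}$$
where $z = \xi(x) \in \mathbb{R}^m$, $\delta(\cdot)$ denotes the delta function, and
$$Q(z) = \int_{\mathbb{R}^n} \rho(x')\,\delta\big(\xi(x') - z\big)\,dx'.$$
Define the probability measure $\nu$ on $\mathbb{R}^m$ given by $\nu(dz) = Q(z)\,dz$ for $z \in \mathbb{R}^m$ and consider the Hilbert space $\widetilde{H} = L^2(\mathbb{R}^m, \nu)$. $\widetilde{H}$ induces an (infinite-dimensional) linear subspace of $H$, namely,
$$H_0 = \big\{ f \in H \;\big|\; f = \tilde{f} \circ \xi,\ \tilde{f} \in \widetilde{H} \big\}, \tag{29}$$
and (28) clearly implies that $Pf \in H_0$ for all $f \in H$. Then, using (28), we can verify that $P^2 = P$ and
$$\langle Pf, g\rangle_\pi = \langle f, Pg\rangle_\pi = \int_{\mathbb{R}^m} Pf(z)\,Pg(z)\,\nu(dz), \qquad \forall f, g \in H. \tag{30}$$
Therefore, the mapping $P : H \to H_0$ actually is the orthogonal projection operator from $H$ to the subspace $H_0$. For $f \in H$ and $z \in \mathbb{R}^m$, in the following we will also write $Pf(z)$ instead of $\tilde{f}(z)$, where $\tilde{f} \in \widetilde{H}$ is such that $Pf = \tilde{f} \circ \xi$. The effective dynamics of the dynamics (1) for the reaction coordinate $\xi$ is defined on $\mathbb{R}^m$ and satisfies the SDE
$$dz_s = \tilde{b}(z_s)\,ds + \sqrt{2\beta^{-1}}\,\tilde{\sigma}(z_s)\,d\tilde{w}_s, \tag{31}$$
where $z_s \in \mathbb{R}^m$, $\tilde{w}_s$ is a Brownian motion on $\mathbb{R}^m$, and the coefficients $\tilde{b} : \mathbb{R}^m \to \mathbb{R}^m$, $\tilde{\sigma} : \mathbb{R}^m \to \mathbb{R}^{m \times m}$ are given by
$$\tilde{b}_l(z) = P(L\xi_l)(z), \qquad \tilde{a}_{ll'}(z) = (\tilde{\sigma}\tilde{\sigma}^T)_{ll'}(z) = P(\nabla\xi_l \cdot \nabla\xi_{l'})(z), \tag{32}$$
for all $z \in \mathbb{R}^m$, $1 \le l, l' \le m$. The infinitesimal generator of the process governed by (31) is given by
$$\widetilde{L} = \sum_{l=1}^m \tilde{b}_l\,\frac{\partial}{\partial z_l} + \frac{1}{\beta}\sum_{l,l'=1}^m \tilde{a}_{ll'}\,\frac{\partial^2}{\partial z_l\,\partial z_{l'}}, \tag{33}$$
which is a self-adjoint operator on the space $\widetilde{H}$ with discrete spectrum under appropriate conditions on $\xi$. We consider the eigenproblem
$$-\widetilde{L}\tilde{\phi} = \tilde{\lambda}\tilde{\phi}, \tag{34}$$
and let $\tilde{\phi}_i \in \widetilde{H}$ be the orthonormal eigenfunctions of the operator $-\widetilde{L}$ corresponding to the eigenvalues $\tilde{\lambda}_i$, where
$$0 = \tilde{\lambda}_0 < \tilde{\lambda}_1 \le \tilde{\lambda}_2 \le \cdots.$$
Applying Theorems 1 and 2, we have the following result.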
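Corollary 2. For the eigenproblem (34) associated with the effective dynamics, the following three statements are valid:
1. For every sufficiently smooth $\tilde{f} \in \widetilde{H}$, we have
$$PL(\tilde{f} \circ \xi) = (\widetilde{L}\tilde{f}) \circ \xi. \tag{36}$$
2. Let $\phi_i$ and $\tilde{\phi}_i$ be the normalized eigenfunctions of the operators $-L$ and $-\widetilde{L}$ corresponding to the eigenvalues $\lambda_i$ and $\tilde{\lambda}_i$, respectively. We have
$$\lambda_i \le \tilde{\lambda}_i \le \lambda_i + \frac{1}{\beta}\int_{\mathbb{R}^n} |\nabla(\tilde{\phi}_i \circ \xi - \phi_i)|^2\,\rho\,dx.$$
3. Let $\phi$ be the normalized eigenfunction of the operator $-L$ corresponding to the eigenvalue $\lambda$. Define constants, and suppose $\delta_2 < 1$. Then there is an eigenvalue $\tilde{\lambda}_i$ of the problem (34), such that,
Proof. The proof of the first assertion can be found in [23]. Using (30) and (36), we can derive
$$-PL(\tilde{\phi}_i \circ \xi) = (-\widetilde{L}\tilde{\phi}_i) \circ \xi = \tilde{\lambda}_i\,\tilde{\phi}_i \circ \xi,$$
i.e., $\tilde{\lambda}_i$ and $\tilde{\phi}_i \circ \xi$ are the eigenvalues and eigenfunctions of the projected operator $-PL$ on the subspace $H_0$, respectively. Therefore, the second assertion is implied by Theorem 1. The third assertion follows from Theorem 2 in the same way.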
Remark 2. As an interesting consequence of the first assertion, on the infinite-dimensional subspace $H_0$ defined in (29), the projected operator $-PL$ is essentially described by another differential operator $-\widetilde{L}$, which is defined on the Hilbert space $\widetilde{H}$ and where $\widetilde{L}$ coincides with the infinitesimal generator of the effective dynamics on $\mathbb{R}^m$.

Variational Approach to Generator Eigenproblem
In this section, we study the variational approach to approximating the eigenvalues and eigenfunctions of the operator $-L$. This approach has been considered in [4,7,8] to study the related eigenproblem of the transfer operator. Its main idea is to approximate the dominant eigenvalues of a self-adjoint transfer operator via an appropriate form of the Rayleigh variational principle instead of via Galerkin discretization [7]. Herein, we present a similar approach to the low-lying generator eigenvalues.

Variational Principle
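The main object of the variational approach is the following functional $F : D(L)^{k+1} \to \mathbb{R}$, which acts on $k + 1$ functions from $D(L)$. Given arbitrary constants $\omega_i > 0$, $0 \le i \le k$, we define the functional
$$F(f_0, f_1, \dots, f_k) = \sum_{i=0}^k \omega_i\,\langle -Lf_i, f_i\rangle_\pi = \frac{1}{\beta}\sum_{i=0}^k \omega_i \int_{\mathbb{R}^n} |\nabla f_i|^2\,\rho\,dx.$$
Clearly, for the (normalized) leading eigenfunctions $\phi_i$ of $-L$, we have
$$F(\phi_0, \phi_1, \dots, \phi_k) = \sum_{i=0}^k \omega_i\lambda_i,$$
where $\lambda_i$ are the corresponding eigenvalues. The main workhorse of the variational principle is the following lower and upper bound:
Theorem 3 (Variational principle). Let $\omega_i$, $i = 0, 1, \dots, k$, be a decreasing sequence of positive real numbers, i.e., $\omega_0 > \omega_1 > \cdots > \omega_k > 0$. For any orthonormal family of functions $f_i \in D(L)$, $i = 0, 1, \dots, k$, we have
$$\sum_{i=0}^k \omega_i\lambda_i \le F(f_0, f_1, \dots, f_k) \le \sum_{i=0}^k \omega_i\lambda_i + \frac{1}{\beta}\sum_{i=0}^k \omega_i \int_{\mathbb{R}^n} |\nabla(f_i - \phi_i)|^2\,\rho\,dx.$$
In order to prove this variational principle, we need the following simple lemma:
Lemma 1. Suppose $k > 0$, and let $(\alpha_i)_{i=0,1,\dots,k}$ and $(\omega_i)_{i=0,1,\dots,k}$ be two ordered sequences of real numbers such that
$$\alpha_0 \le \alpha_1 \le \cdots \le \alpha_k, \qquad \omega_0 \ge \omega_1 \ge \cdots \ge \omega_k.$$
Then, for any permutation $(\tilde{\omega}_i)_{i=0,1,\dots,k}$ of the sequence $(\omega_i)_{i=0,1,\dots,k}$, we have
$$\sum_{i=0}^k \omega_i\alpha_i \le \sum_{i=0}^k \tilde{\omega}_i\alpha_i.$$
Proof. The proof of Theorem 3 is given in two steps:
1. For the lower bound, we consider the optimization problem
$$\min_{f_0, \dots, f_k} F(f_0, f_1, \dots, f_k) \quad \text{subject to} \quad \langle f_i, f_j\rangle_\pi = \delta_{ij}, \quad 0 \le i \le j \le k. \tag{44}$$
Next, we introduce the Lagrange multipliers $\lambda_{ij}$ for $0 \le i \le j \le k$, and consider the auxiliary functional
$$F(f_0, f_1, \dots, f_k) - \sum_{0 \le i \le j \le k} \lambda_{ij}\big(\langle f_i, f_j\rangle_\pi - \delta_{ij}\big).$$
Applying calculus of variations, we conclude that the minimizer of (44) satisfies
$$-2\omega_i L f_i - 2\lambda_{ii} f_i - \sum_{j > i} \lambda_{ij} f_j - \sum_{j < i} \lambda_{ji} f_j = 0, \qquad \langle f_i, f_j\rangle_\pi = \delta_{ij}, \qquad 0 \le i \le j \le k. \tag{46}$$
Multiplying the first equation of (46) by $f_j$ for some $i < j \le k$ and integrating, we obtain $\lambda_{ij} = -2\omega_i\langle Lf_i, f_j\rangle_\pi$. In the same way, we also obtain $\lambda_{ij} = -2\omega_j\langle Lf_j, f_i\rangle_\pi$. Using the fact that $L$ is self-adjoint and $\omega_i > \omega_j$ for $i < j$, we conclude that
$$\lambda_{ij} = 0, \qquad \langle Lf_i, f_j\rangle_\pi = 0, \qquad 0 \le i < j \le k,$$
and (46) reduces to an eigenproblem,
$$-Lf_i = \frac{\lambda_{ii}}{\omega_i}\,f_i, \qquad 0 \le i \le k.$$
Therefore, the minimizer of (44) is given by orthonormal eigenfunctions. Applying Lemma 1, we can further conclude that the lower bound is attained when $f_i = \phi_i$, $0 \le i \le k$.
2. For the upper bound, similarly to the proof of Theorem 1, direct computation gives
$$\sum_{i=0}^k \omega_i\,\langle -L(f_i - \phi_i), f_i - \phi_i\rangle_\pi = F(f_0, \dots, f_k) - \sum_{i=0}^k \omega_i\lambda_i + \sum_{i=0}^k \omega_i\lambda_i\,\|f_i - \phi_i\|_\pi^2 \ge F(f_0, \dots, f_k) - \sum_{i=0}^k \omega_i\lambda_i,$$
where we have used the fact that $-L\phi_i = \lambda_i\phi_i$ and $\langle f_i, f_i\rangle_\pi = \langle \phi_i, \phi_i\rangle_\pi = 1$. The upper bound then follows from (7).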
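Lemma 1 can be read as the classical rearrangement inequality: pairing the largest weight with the smallest value minimizes the weighted sum. That reading is an assumption about the statement of the lemma, and the brute-force check below, with hypothetical sample values, only illustrates it for one small instance.

```python
from itertools import permutations

def weighted_sum(weights, values):
    """Return sum_i weights[i] * values[i]."""
    return sum(w * a for w, a in zip(weights, values))

alpha = [0.0, 0.3, 1.1, 2.5]   # increasing sequence (e.g., eigenvalues)
omega = [4.0, 3.0, 2.0, 1.0]   # decreasing sequence of weights

reference = weighted_sum(omega, alpha)                 # oppositely ordered pairing
all_sums = [weighted_sum(p, alpha) for p in permutations(omega)]
smallest = min(all_sums)                               # minimum over all 24 permutations
```

The oppositely ordered pairing attains the minimum over all permutations, which is the property the proof of the lower bound relies on when it assigns the largest weight to the smallest eigenvalue.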

Optimization Problem
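The variational principle of Theorem 3 allows for approximation of the low-lying eigenvalues of the generator. In order to turn it into an algorithm, we again introduce $N$ basis functions $\psi_1, \dots, \psi_N \in D(L)$. We want to approximate the first $k + 1$ eigenvalues $\lambda_i$, as well as the eigenfunctions $\phi_i$, $0 \le i \le k$, by approximating the eigenfunctions using linear combinations of the basis functions. That is, we consider the functions
$$f_i = \sum_{l=1}^N x_{il}\,\psi_l, \qquad 0 \le i \le k, \tag{50}$$
where $x_{il}$ are real-valued coefficients to be determined, $0 \le i \le k$, $1 \le l \le N$. Inspired by Theorem 3, we wish to determine the coefficients $x_{il}$ by solving the optimization problem
$$\min_{x_{il}} F(f_0, f_1, \dots, f_k) \quad \text{subject to} \quad \langle f_i, f_j\rangle_\pi = \delta_{ij}, \quad 0 \le i \le j \le k. \tag{51}$$
Recalling the matrices $C$, $S$ defined in (25) and defining the vectors $X_i = (x_{i1}, x_{i2}, \dots, x_{iN})^T$, problem (51) can be written as
$$\min_{X_0, \dots, X_k} \sum_{i=0}^k \omega_i\,X_i^T C X_i \quad \text{subject to} \quad X_i^T S X_j = \delta_{ij},$$
or, equivalently, in matrix form,
$$\min_{X} \operatorname{tr}\big(\Omega\,X^T C X\big) \quad \text{subject to} \quad X^T S X = I_{k+1},$$
where $X = (X_0, X_1, \dots, X_k) \in \mathbb{R}^{N \times (k+1)}$ and $\Omega = \operatorname{diag}(\omega_0, \dots, \omega_k)$. Using a similar argument as in the proof of Theorem 3, we can obtain
Theorem 4. The minimum of the optimization problem (51) is achieved by the functions $f_i$ as of (50) with the coefficients from the first $k + 1$ eigenvectors $X_i$ of the generalized matrix eigenproblem
$$CX = \kappa\,SX. \tag{54}$$
It is supposed that the eigenvectors $X_i$ of (54) are chosen such that $X_i^T S X_j = \delta_{ij}$ and the corresponding eigenvalues are $\kappa_i$ for $0 \le i \le k$, where $\kappa_0 \le \kappa_1 \le \cdots \le \kappa_k$.
Remark 3. Combining the above result with Subsection 3.2, we see that both the Galerkin method and the variational approach lead to the same generalized matrix eigenproblem with an identical estimate for the eigenvalue error.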
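The statement of Theorem 4 can be probed numerically in a small case. The sketch below evaluates the weighted functional $\sum_i \omega_i X_i^T C X_i$ for random families that are orthonormal with respect to $S$ and compares it with the value attained by the leading generalized eigenvectors. The matrices are the same hand-computable toy example used for the Galerkin problem (an assumption: $V(x) = x^2/2$, $\beta = 1$, basis $1, x, x^2$), and the helper names are hypothetical.

```python
import numpy as np

def weighted_functional(X, C, omega):
    """Return sum_i omega_i * X_i^T C X_i, where X_i are the columns of X."""
    quad = np.einsum("li,lm,mi->i", X, C, X)
    return float(omega @ quad)

def random_s_orthonormal(S, n_vectors, rng):
    """Random matrix X with X^T S X = I, via Gram-Schmidt in the S-inner product."""
    X = np.zeros((S.shape[0], n_vectors))
    for i in range(n_vectors):
        v = rng.standard_normal(S.shape[0])
        for j in range(i):
            v -= (X[:, j] @ S @ v) * X[:, j]
        X[:, i] = v / np.sqrt(v @ S @ v)
    return X

S = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 3.0]])
C = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 4.0]])
omega = np.array([3.0, 2.0])            # decreasing positive weights, k = 1

# Leading generalized eigenvectors via Cholesky reduction to a standard problem.
R_inv = np.linalg.inv(np.linalg.cholesky(S))
kappa, Y = np.linalg.eigh(R_inv @ C @ R_inv.T)
X_opt = (R_inv.T @ Y)[:, :2]
optimum = weighted_functional(X_opt, C, omega)   # omega_0*kappa_0 + omega_1*kappa_1

rng = np.random.default_rng(0)
trials = [weighted_functional(random_s_orthonormal(S, 2, rng), C, omega)
          for _ in range(200)]
```

None of the random trial families falls below the value attained by the eigenvectors. This is a sanity check on one example, not a proof.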

Numerical Algorithms
In this section, we consider how the matrices $C$, $S$ defined in (25), that is,
$$C_{ll'} = \langle -L\psi_l, \psi_{l'}\rangle_\pi, \qquad S_{ll'} = \langle \psi_l, \psi_{l'}\rangle_\pi, \qquad 1 \le l, l' \le N,$$
can be approximated from trajectories of the diffusion process. For the transfer operator, this problem has been studied in [4,7,8] using trajectories of the original diffusion process given by (1). In contrast, we herein consider trajectories of the effective dynamics (31) instead of the original diffusion process.

Computing Coefficient Matrices Using Effective Dynamics
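Similar to the setup in Subsection 3.3, we assume that a reaction coordinate function $\xi : \mathbb{R}^n \to \mathbb{R}^m$, as well as $N$ basis functions $\psi_l$, $1 \le l \le N$, are given. Furthermore, we suppose that the basis functions $\psi_l$ can be written as $\psi_l = \tilde{\psi}_l \circ \xi$ for some functions $\tilde{\psi}_l \in \widetilde{H}$, i.e., $\psi_l \in H_0$. In this case, it follows from the first assertion of Corollary 2 and the relation (30) that
$$C_{ll'} = \langle -\widetilde{L}\tilde{\psi}_l, \tilde{\psi}_{l'}\rangle_\nu, \qquad S_{ll'} = \langle \tilde{\psi}_l, \tilde{\psi}_{l'}\rangle_\nu. \tag{57}$$
These equalities, though simple, are quite interesting, because they relate the entries of the coefficient matrices $C$, $S$ to the infinitesimal generator $\widetilde{L}$ of the effective dynamics in (33). Since $\nu$ is the unique invariant measure of the effective dynamics [23], we can apply the ergodic theorem and get
$$S_{ll'} = \lim_{T \to +\infty} \frac{1}{T}\int_0^T \tilde{\psi}_l(z_s)\,\tilde{\psi}_{l'}(z_s)\,ds \approx \frac{1}{M - M_0}\sum_{k=M_0+1}^{M} \tilde{\psi}_l(z_{k\Delta t})\,\tilde{\psi}_{l'}(z_{k\Delta t}), \tag{58}$$
where $z_s$ denotes a realization of the effective dynamics (31), $\Delta t > 0$ is the step size, $M \in \mathbb{N}$ is a large integer, and only the parts of the trajectory after time $M_0\Delta t$ are used for estimation.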
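For the matrix $C$, using (57), the definition of the infinitesimal generator $\widetilde{L}$, as well as the ergodic theorem, we can derive
$$C_{ll'} = -\lim_{s \to 0+} \frac{1}{s}\int_{\mathbb{R}^m} \mathbf{E}\big(\tilde{\psi}_l(z_s) - \tilde{\psi}_l(z) \mid z_0 = z\big)\,\tilde{\psi}_{l'}(z)\,\nu(dz) = -\lim_{s \to 0+}\lim_{T \to +\infty} \frac{1}{sT}\int_0^T \big(\tilde{\psi}_l(z_{t+s}) - \tilde{\psi}_l(z_t)\big)\,\tilde{\psi}_{l'}(z_t)\,dt$$
$$= -\lim_{s \to 0+}\lim_{T \to +\infty} \frac{1}{2sT}\int_0^T \Big[\big(\tilde{\psi}_l(z_{t+s}) - \tilde{\psi}_l(z_t)\big)\,\tilde{\psi}_{l'}(z_t) + \big(\tilde{\psi}_{l'}(z_{t+s}) - \tilde{\psi}_{l'}(z_t)\big)\,\tilde{\psi}_l(z_t)\Big]\,dt. \tag{59}$$
In the above, $\mathbf{E}$ denotes the mathematical expectation with respect to the effective dynamics $z_s$, and the last equality follows from the symmetry of the matrix $C$.
To compute $C_{ll'}$ numerically, we further introduce a parameter $\tau \ll 1$, and approximate (59) by
$$C_{ll'} \approx -\frac{1}{2\tau(M - M_0)}\sum_{k=M_0+1}^{M} \Big[\big(\tilde{\psi}_l(z_{k\Delta t+\tau}) - \tilde{\psi}_l(z_{k\Delta t})\big)\,\tilde{\psi}_{l'}(z_{k\Delta t}) + \big(\tilde{\psi}_{l'}(z_{k\Delta t+\tau}) - \tilde{\psi}_{l'}(z_{k\Delta t})\big)\,\tilde{\psi}_l(z_{k\Delta t})\Big]. \tag{60}$$
Formulas (58) and (60) can be used to estimate the coefficient matrices $C$, $S$, provided that we can obtain a long trajectory of the effective dynamics (31).
Remark 4. From the discussion in Section 2, we know that the eigenvalues of the transfer operator $\mathcal{T}_\tau$ and those of the operator $-L$ satisfy the relation $\mu_i = e^{-\lambda_i\tau}$, $i \ge 0$. When the lag time $\tau$ is small, the approximation $\mu_i \approx 1 - \lambda_i\tau$ holds for the leading eigenvalues, since $\lambda_i\tau$ is small. In fact, estimating the matrix $C$ using the last expression in (60), we will have $C = (S - \widetilde{C})/\tau$, where the matrix $\widetilde{C}$ is given by
$$\widetilde{C}_{ll'} = \frac{1}{2(M - M_0)}\sum_{k=M_0+1}^{M} \Big[\tilde{\psi}_l(z_{k\Delta t+\tau})\,\tilde{\psi}_{l'}(z_{k\Delta t}) + \tilde{\psi}_{l'}(z_{k\Delta t+\tau})\,\tilde{\psi}_l(z_{k\Delta t})\Big]. \tag{61}$$
It is easy to observe that an eigenvalue $\lambda$ resulting from problem (54) is related to an eigenvalue $\mu$ of the problem $\widetilde{C}X = \mu SX$ by $\mu = 1 - \lambda\tau$. Note that (61) is very similar to the estimator derived in [3], except for the fact that here we use trajectories of the effective dynamics instead of the original dynamics. To summarize, when the lag time $\tau$ is small, the above discussion implies that after solving the problem (54) we can approximate the leading eigenvalues of the transfer operator by $\mu_i = 1 - \lambda_i\tau$.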
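The trajectory-based estimation of the coefficient matrices can be sketched as follows. The code forms the sample Gram matrix, a symmetrized time-lagged correlation matrix, and their scaled difference, which is the structure described in Remark 4. Everything else is an assumption made for illustration: the stand-in "effective dynamics" is a one-dimensional Ornstein-Uhlenbeck process with generator eigenvalues 0, 1, 2, sampled with its exact one-step update, and the function names, basis, and parameter values are hypothetical.

```python
import numpy as np

def estimate_matrices(traj, basis, lag_steps, dt, burn_in=0):
    """Estimate S and C = (S - C_lagged) / tau from one long trajectory.

    traj      : 1D array of states z_{k dt}
    basis     : maps an array of n states to an (n, N) array of basis values
    lag_steps : lag time in units of steps, tau = lag_steps * dt
    burn_in   : number of initial steps discarded before estimation
    """
    tau = lag_steps * dt
    Psi = basis(np.asarray(traj)[burn_in:])
    Psi0, Psi1 = Psi[:-lag_steps], Psi[lag_steps:]   # values at time k*dt and k*dt + tau
    n = Psi0.shape[0]
    S = Psi0.T @ Psi0 / n
    cross = Psi0.T @ Psi1 / n
    C_lagged = 0.5 * (cross + cross.T)               # symmetrized time-lagged correlations
    C = (S - C_lagged) / tau
    return C, S

def generalized_eigenvalues(C, S):
    """Eigenvalues of C X = kappa S X in ascending order (S positive definite)."""
    R_inv = np.linalg.inv(np.linalg.cholesky(S))
    return np.linalg.eigvalsh(R_inv @ C @ R_inv.T)

# Stand-in trajectory: dz = -z ds + sqrt(2) dw, sampled exactly with step dt.
rng = np.random.default_rng(0)
dt, n_steps = 0.01, 200_000
decay = np.exp(-dt)
noise = np.sqrt(1.0 - decay**2) * rng.standard_normal(n_steps)
z = np.empty(n_steps)
z[0] = 0.0
for k in range(1, n_steps):
    z[k] = decay * z[k - 1] + noise[k]

basis = lambda s: np.column_stack([np.ones_like(s), s, s**2])
C, S = estimate_matrices(z, basis, lag_steps=5, dt=dt, burn_in=1000)
kappa = generalized_eigenvalues(C, S)
```

With a finite lag time the estimated eigenvalues are biased towards $(1 - e^{-\lambda\tau})/\tau$ rather than $\lambda$, in line with the small-$\tau$ approximation discussed in Remark 4, and they carry statistical error from the finite trajectory length.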

Algorithms for Simulating the Effective Dynamics
In order to utilize the above results, we have to be able to efficiently compute (long) realizations of the effective dynamics (31). In this subsection, we discuss two numerical algorithms for realizing this.

Algorithm 1
The first algorithm is based on the following formula for the coefficients $\tilde{b}$, $\tilde{a}$ given in (32):
$$\tilde{b}_l(z) = \lim_{s \to 0+} \mathbf{E}\Big(\frac{\xi_l(x_s) - z_l}{s} \,\Big|\, x_0 \sim \mu_z\Big), \qquad \tilde{a}_{ll'}(z) = \frac{\beta}{2}\lim_{s \to 0+} \mathbf{E}\Big(\frac{(\xi_l(x_s) - z_l)(\xi_{l'}(x_s) - z_{l'})}{s} \,\Big|\, x_0 \sim \mu_z\Big),$$
where $x_s$ is a realization of the original diffusive dynamics (1) and $\mu_z$ is the restriction of the invariant measure $\pi$ to the submanifold $\xi^{-1}(z) = \{x \in \mathbb{R}^n \mid \xi(x) = z\}$. We refer readers to [23] for more details.
In order to utilize this for simulation, we fix two parameters $0 < \Delta s \ll \Delta t$ and proceed as follows: ∆s of length $\Delta s$ of the (unconstrained) full dynamics $x_s$ by discretizing (1). Compute the coefficients $\tilde{b}$, $\tilde{a}$ by, where $1 \le l, l' \le m$.
In the above, $z_{k\Delta t, l}$ denotes the $l$th component of $z_{k\Delta t} \in \mathbb{R}^m$. The initial states $x_0$ are sampled from the probability measure $\mu_z$; this can be achieved by using the numerical schemes proposed in [15,34,35], which simulate the original dynamics (1) and then project the state onto the submanifold $\xi^{-1}(z)$.

Algorithm 2
The second algorithm is inspired by the TAMD method proposed in [17]. In the following we provide a slightly different argument which motivates the method. The main idea is to consider the extended dynamics, where $\kappa$ is a large constant, $w_s$, $\bar{w}_s$ are independent Brownian motions on $\mathbb{R}^n$, and $x_{s,i}$ denotes the $i$th component of the state $x_s$ (similar notations for $z_s$, $w_s$, $\bar{w}_s$). Note that the invariant measure of the dynamics (65) has a probability density, with respect to the Lebesgue measure on the extended space $\mathbb{R}^{n+m}$. If we choose $(x, z) \mapsto z$ as the reaction coordinate function of (65) and derive the effective dynamics following [21,23], we can obtain, where $w_s$ is a Brownian motion on $\mathbb{R}^m$, and, for $z \in \mathbb{R}^m$, $1 \le l, l' \le m$. Note that in (68), $L$ is the generator given in (2) and integration by parts has been used to derive the second expression for $b^{(\kappa)}$. It is not difficult to show that $b^{(\kappa)} \to \tilde{b}$ and $a^{(\kappa)} \to \tilde{a}$ when $\kappa \to +\infty$. Therefore, (67) is an approximation of the effective dynamics (31) when $\kappa \gg 1$. For numerical simulations, we can express (68) as time averages, where $x_s$ satisfies the SDE (65) with fixed $z_s = z$, i.e., The main steps of the algorithm can be summarized as follows:
1. Denote $z = z_{k\Delta t}$ at step $k \ge 0$. Simulate dynamics (70) for $M$ steps with time step size $\Delta s$. Compute the coefficients,
2. Compute $\sigma$ from $a = \sigma\sigma^T$ by matrix decomposition. Update the state $z_{(k+1)\Delta t}$ according to, where the $m$ noise components are independent standard Gaussian variables.
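The last step of both algorithms, computing a matrix square root of the estimated diffusion matrix and advancing the reaction coordinate by one time step, can be sketched as follows. The sketch assumes that the effective dynamics has the form $dz = b\,ds + \sqrt{2\beta^{-1}}\,\sigma\,dw$ with $a = \sigma\sigma^T$ and uses an Euler-Maruyama update with a Cholesky factor as the square root. The function name and all numerical values are hypothetical, and the coefficients `b` and `a` would in practice come from the time averages computed in step 1.

```python
import numpy as np

def effective_dynamics_step(z, b, a, beta, dt, rng):
    """One Euler-Maruyama step z -> z + b*dt + sqrt(2*dt/beta) * sigma @ eta.

    z, b : arrays of shape (m,), current state and estimated drift
    a    : symmetric positive definite (m, m) matrix, a = sigma sigma^T
    """
    sigma = np.linalg.cholesky(a)            # one valid choice of square root of a
    eta = rng.standard_normal(z.shape[0])    # independent standard Gaussian variables
    return z + b * dt + np.sqrt(2.0 * dt / beta) * (sigma @ eta)

# Hypothetical coefficient values for a two-dimensional reaction coordinate.
rng = np.random.default_rng(1)
z = np.zeros(2)
b = np.array([1.0, -1.0])
a = np.array([[2.0, 0.5],
              [0.5, 1.0]])
z_next = effective_dynamics_step(z, b, a, beta=2.0, dt=0.01, rng=rng)
```

Any factor $\sigma$ with $\sigma\sigma^T = a$ yields the same law for the increment, so the Cholesky factor is a convenient but not a unique choice. It requires $a$ to be positive definite, which an estimated matrix may violate when too few samples are used.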

Illustrative Example
In order to illustrate the analysis and the performance of the numerical methods presented in the previous sections, we study the simple two-dimensional dynamics
$$dx_{s,1} = -\frac{\partial V}{\partial x_1}(x_s)\,ds + \sqrt{2\beta^{-1}}\,dw_{s,1}, \qquad dx_{s,2} = -\frac{\partial V}{\partial x_2}(x_s)\,ds + \sqrt{2\beta^{-1}}\,dw_{s,2}, \tag{73}$$
where $\beta > 0$, $x_s = (x_{s,1}, x_{s,2}) \in \mathbb{R}^2$, and $w_{s,1}$, $w_{s,2}$ are two independent one-dimensional Brownian motions. The potential $V$ in dynamics (73) is defined as, where $\epsilon > 0$, and $(r, \theta)$ is the polar coordinate of the state $x = (x_1, x_2)$ satisfying
$$x_1 = r\cos\theta, \qquad x_2 = r\sin\theta, \qquad r \ge 0, \quad \theta \in [-\pi, \pi].$$
In polar coordinates, it is easy to see that the potential $V$ has three local minima at $\theta = 0, \pm\frac{2\pi}{3}$, where the radius is determined by the relation $r^2 = 1 + \frac{1}{1 + 4r\theta^2}$. Furthermore, when the parameter $\epsilon$ is small, one can expect that the dynamics (73) will be mainly confined to the neighbourhood of the curve defined by the relation $r^2 = 1 + \frac{1}{1 + 4r\theta^2}$, where the potential is relatively flat. Profiles of the potentials $V_1$ and $V$ are displayed in Figure 1. The main purpose of this numerical experiment is to demonstrate that the leading eigenvalues of the operator $-L$ corresponding to dynamics (73) can be approximated with the help of its effective dynamics, provided that the reaction coordinate function as well as the basis functions are chosen appropriately.
We choose the parameters $\beta = 4.0$ and $\epsilon = 0.05$ in the following numerical experiment. In fact, for this two-dimensional problem, it is possible to directly solve the eigenproblem (6) by discretizing the operator $L$. First of all, we note that the generator can be written as $L = \frac{e^{\beta V}}{\beta}\nabla \cdot (e^{-\beta V}\nabla)$. Defining the operator $D$ such that $Df = e^{-\frac{\beta}{2}V}f$ for a function $f$, it is straightforward to see that the operator $-L_D = -DLD^{-1}$ has the same eigenvalues $\lambda_i$ as $-L$ and that the corresponding eigenfunctions are given by $\phi_i^D = D\phi_i = e^{-\frac{\beta}{2}V}\phi_i$, where $\phi_i$ are the eigenfunctions of $-L$. Furthermore, $L_D$ is a self-adjoint operator under the standard $L^2$ inner product. Instead of $-L$, we will work with $-L_D$ and solve the eigenproblem $-L_D f = \lambda f$, because the discretized matrix will be symmetric and the corresponding eigenfunctions $\phi_i^D$ decay rapidly. Taking into account the profile of the potential $V$ in Figure 1b, we truncate the whole space $\mathbb{R}^2$ to the finite domain $[-2, 2] \times [-2, 2]$, which is then discretized using a $500 \times 500$ uniform mesh, leading to the cell resolution $\Delta x_1 = \Delta x_2 = \frac{4}{500} = 0.008$. For $1 \le i, j \le 500$, let $f_{i,j}$, $V_{i,j}$ denote the values of the functions $f$, $V$ evaluated at the state $\big({-2.0} + (i - \frac{1}{2})\Delta x_1,\ {-2.0} + (j - \frac{1}{2})\Delta x_2\big)$, respectively. Other notations such as $V_{i \pm \frac{1}{2}, j}$ are defined in a similar way. Approximating $-L_D f$ by the centered finite difference scheme, we obtain, for $1 < i, j < 500$. For boundary cells, the Neumann condition is applied when the neighboring cells lie outside of the truncated domain. From (76), it can be observed that the resulting discretization matrix is both symmetric and sparse. Computing the eigenvalues of this matrix (of order 250,000) using the Krylov-Schur method through the numerical package SLEPc [36], we obtain the first four eigenvalues, with relative residual errors smaller than $1.1 \times 10^{-6}$. The corresponding eigenvectors are shown in Figure 2.
With the above reference result at hand, we continue to study the approximation quality of the effective dynamics with respect to the leading eigenvalues. For this purpose, we choose the reaction coordinate function as $\xi(x) = \theta(x) \in [-\pi, \pi]$, i.e., our reaction coordinate is the angle of the polar coordinate representation. Direct calculation shows that the coefficients $\tilde{b}$, $\tilde{\sigma}$ in (32) reduce to, Discretizing the interval $[-\pi, \pi]$ into 1000 subintervals and applying the projection scheme proposed in [34] for each fixed $z = -\pi + \frac{2\pi j}{1000}$, $0 \le j \le 1000$, we can compute the coefficients of the effective dynamics; the resulting profiles are shown in Figure 3a,b. After these preparations, we can generate trajectories of the effective dynamics by simulating the SDE (31) using standard time stepping schemes. As shown in Figure 3c, the effective dynamics spends long times around the values $-\frac{2\pi}{3}$, $0$, and $\frac{2\pi}{3}$, which is in accordance with the behavior of dynamics (73) as well as with the profile of the potential $V$ in Figure 1b. Since the effective dynamics is one-dimensional, we can also discretize its infinitesimal generator $\widetilde{L}$ in (33) and compute the eigenvalues of $-\widetilde{L}$, which gives $\tilde{\lambda}_0 = 0.000$, $\tilde{\lambda}_1 = 0.012$, $\tilde{\lambda}_2 = 0.044$, $\tilde{\lambda}_3 = 2.068$.
Comparing with (77), we conclude that the eigenvalues $\lambda_0$, $\lambda_1$, $\lambda_2$ of the original dynamics (73) are quite well approximated by those of the effective dynamics.
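The idea of discretizing the symmetrized operator $-L_D = -DLD^{-1}$ on a uniform cell-centred grid can be illustrated in one dimension, where the result is easy to verify. The sketch below is not the scheme (76) of this article: it assumes a flux-form centered difference with the potential evaluated at cell centres and cell faces, zero-flux (Neumann) conditions at the two outer faces, and a hypothetical test potential $V(x) = x^2/2$ with $\beta = 1$, for which the exact eigenvalues are $0, 1, 2, \dots$. The function name is hypothetical.

```python
import numpy as np

def symmetrized_generator_matrix(V, beta, x_min, x_max, n_cells):
    """Symmetric matrix approximating -L_D on a uniform 1D cell-centred grid.

    The flux through the two outer cell faces is set to zero (Neumann condition).
    """
    h = (x_max - x_min) / n_cells
    centers = x_min + (np.arange(n_cells) + 0.5) * h
    faces = x_min + np.arange(1, n_cells) * h              # interior faces only
    Vc, Vf = V(centers), V(faces)
    scale = 1.0 / (beta * h**2)
    # Coupling between neighbouring cells i and i+1 (symmetric by construction).
    w = scale * np.exp(0.5 * beta * (Vc[:-1] + Vc[1:]) - beta * Vf)
    # Contribution of each interior face to the diagonal of its two adjacent cells.
    diag = np.zeros(n_cells)
    diag[:-1] += scale * np.exp(beta * (Vc[:-1] - Vf))     # face to the right of cell i
    diag[1:] += scale * np.exp(beta * (Vc[1:] - Vf))       # face to the left of cell i+1
    A = np.diag(diag) - np.diag(w, 1) - np.diag(w, -1)
    return A, centers

V = lambda x: 0.5 * x**2
A, centers = symmetrized_generator_matrix(V, beta=1.0, x_min=-8.0, x_max=8.0, n_cells=500)
leading = np.linalg.eigvalsh(A)[:3]      # three smallest eigenvalues
```

By construction, the vector with entries $e^{-\beta V/2}$ at the cell centres lies in the null space of the matrix, which mirrors the fact that $\phi_0^D = e^{-\frac{\beta}{2}V}$ is the eigenfunction of $-L_D$ for the eigenvalue zero. A dense eigensolver suffices at this size, whereas the two-dimensional problem described above calls for a sparse iterative method.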
As the final step of our experiment, we test the trajectory-based method proposed in Subsection 5.1.
As before, we conclude that the eigenvalues $\lambda_0$, $\lambda_1$, $\lambda_2$ of the original dynamics are relatively well approximated.
Figure 3. (a,b) Profiles of the coefficients $\tilde{b}(z)$, $\tilde{\sigma}(z)$ of the effective dynamics. For each $z = -\pi + \frac{2\pi j}{1000}$, $0 \le j \le 1000$, the coefficients $\tilde{b}(z)$, $\tilde{\sigma}(z)$ are estimated by generating a trajectory of the constrained version of dynamics (73) using the projection scheme proposed in [34] with the time step size $2 \times 10^{-5}$, and $3 \times 10^6$ steps are simulated; (c) A typical sample trajectory of the effective dynamics for dynamics (73) with reaction coordinate function $\xi(x) = \theta(x)$.

Conclusions
In this work we have studied the approximation of eigenvalues and eigenfunctions of the infinitesimal generator associated with the longest relaxation processes of diffusive processes in energy landscapes. Following the previous studies on transfer operators, we consider the Galerkin discretization method, the variational approach, and the effective dynamics given by a low-dimensional reaction coordinate for solving the eigenvalue problem in application to the generator. It turns out that: (1) there are rather similar results for the approximation error of the three methods; and (2) the first two methods lead to the same generalized matrix eigenproblem, while the third can be used for efficient estimation of the associated coefficient matrices.
Before we conclude, it is worth mentioning several issues which go beyond the scope of our current work. Firstly, while we have assumed that the dynamics are driven by the gradient of a potential function, we emphasize that the analysis in the current work can be directly applied to more general reversible processes (see [23] for details). Secondly, for non-reversible dynamics, for example Langevin dynamics, it is not immediately clear how the results in the current work can be applied. However, the approach in [9] (Section 5.3) shows that the extended reversibility of Langevin dynamics may well allow for a generalization of our results. Thirdly, for the numerical algorithms which are briefly outlined in Section 5, both the numerical analysis and their applications to more complicated systems need to be further investigated. Lastly, both the analysis and the algorithms in our current work depend on the choice of the reaction coordinate function. Different choices will lead to different approximation qualities of the eigenvalues/eigenfunctions of the system [21,23,37]. Algorithmic identification of reaction coordinate functions for high-dimensional systems is a challenging problem and has attracted considerable attention; most approaches rely on machine learning [38], while the relation between identification and effective dynamics has only been explored recently [39]. All of these issues are topics of ongoing research.

Figure 2. Eigenfunctions $\phi_i^D$ of the operator $-L_D$ corresponding to the first four eigenvalues in (77).