Spectral Properties of Effective Dynamics from Conditional Expectations

Nüske, Feliks; Koltai, Péter; Boninsegna, Lorenzo; Clementi, Cecilia

doi:10.3390/e23020134

Open AccessEditor’s ChoiceArticle

Spectral Properties of Effective Dynamics from Conditional Expectations

¹

Center for Theoretical Biological Physics and Department of Chemistry, Rice University, Houston, TX 77005, USA

²

Institute of Mathematics, Universität Paderborn, 33098 Paderborn, Germany

³

Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany

⁴

Institute for Quantitative and Computational Biosciences and Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, CA 90095, USA

⁵

Department of Physics, Freie Universität Berlin, 14195 Berlin, Germany

^*

Author to whom correspondence should be addressed.

Entropy 2021, 23(2), 134; https://doi.org/10.3390/e23020134

Submission received: 15 December 2020 / Accepted: 18 January 2021 / Published: 21 January 2021

(This article belongs to the Special Issue Entropy Methods for Stochastic Dynamical Systems and Evolution Equations)

Download

Browse Figures

Review Reports Versions Notes

Abstract

The reduction of high-dimensional systems to effective models on a smaller set of variables is an essential task in many areas of science. For stochastic dynamics governed by diffusion processes, a general procedure to find effective equations is the conditioning approach. In this paper, we are interested in the spectrum of the generator of the resulting effective dynamics, and how it compares to the spectrum of the full generator. We prove a new relative error bound in terms of the eigenfunction approximation error for reversible systems. We also present numerical examples indicating that, if Kramers–Moyal (KM) type approximations are used to compute the spectrum of the reduced generator, it seems largely insensitive to the time window used for the KM estimators. We analyze the implications of these observations for systems driven by underdamped Langevin dynamics, and show how meaningful effective dynamics can be defined in this setting.

Keywords:

stochastic differential equations; coarse graining; infinitesimal generator; spectral analysis; extended dynamic mode decomposition; Kramers–Moyal formulae; Langevin dynamics

1. Introduction

The description of high-dimensional dynamical systems by a reduced set of variables, usually referred to as coarse graining or model reduction, is of tremendous importance across many different fields of research. Examples range from finance to climate modeling to molecular biology. From the huge body of literature on the subject, we mention in particular the Mori–Zwanzig formalism [1,2,3,4,5], as well as the framework of averaging and homogenization for systems with explicit multiscale structure [6,7]. Within the field of molecular physics, references [8,9,10,11,12,13] present important contributions to this line of research. Here, we focus on model reduction for stochastic differential equations (SDEs), and follow another standard approach, which is based on conditioning along level sets of the coarse graining map [14,15,16]. For a detailed theoretical analysis of the method in the context of SDEs, please see [15,17,18,19,20].

For a given coarse grained description of a system, a fundamental question to address is the quality of approximation of the full system by means of the reduced system, as measured by a suitable metric (which usually depends on the problem at hand). In many cases, the approximation of spectral properties of the system’s generator is useful in this context. The generator and its associated semigroup, also called Koopman semigroup, are used to describe the time evolution of expectation values of observable functions. For metastable systems, the leading generator eigenpairs provide information on slow modes in the dynamical system. Spectral approximation results for the conditioning approach have been obtained in [17,18].

Another important problem to consider is the analysis and parameter estimation of coarse grained models based on simulation data of the full system. In recent years, a variety of methods has been developed to learn models for the Koopman semigroup off simulation data, see [21,22,23,24,25,26,27] and the references therein. In Ref. [28], some of the authors of the present study presented a conceptually simple framework for the data-driven approximation of the Koopman generator, called gEDMD. This framework can also be used to identify and analyze coarse grained models within the context of the conditioning approach. The gEDMD method requires knowledge of the full system parameters. If these parameters are unknown, they need to be replaced by a suitable approximation, such as Kramers–Moyal (KM) formulae. These are based on averages of finite differences at a finite offset (time window). The quality of this approximation as a function of the offset will be addressed in this paper. Even though we focus on KM estimators here, let us mention that a multitude of more advanced methods for parameter estimation of stochastic dynamics are available, please see Ref. [29] for an overview. Spectral methods have been considered in [30,31], while particular attention to the choice of time window has been paid in [7,32]. The Kramers–Moyal formulae being among the simplest estimators, we take them as the starting point for our study.

The third focus of this study is model reduction for systems driven by underdamped Langevin dynamics, which is a widely used model, especially in molecular and biological physics. As the momentum variables of these dynamics often play just an auxiliary role, an interesting question to address is how to define a reduced dynamics that only involves the position state variables. As the conditioning approach does not provide meaningful answers in this case, finding meaningful effective equations remains an open problem in this setting [33,34,35].

In this paper, we report theoretical and numerical results on the issues raised above. The contributions of this study are as follows:

Concerning the first problem, we prove a new relative error bound for the approximation of generator eigenvalues by the coarse grained generator, if the dynamics is reversible (Proposition 2). This bound shows that a small projection error of the full eigenfunctions with respect to the energy norm is required for a small eigenvalue error. We also derive conditions to ensure that the spectrum of the reduced generator is discrete in the first place (Proposition 1).
Concerning the second issue, we present numerical examples indicating that, if KM estimators are used within the gEDMD algorithm for reversible systems, on a good set of reaction coordinates, then the resulting eigenvalue estimates seem to be fairly insensitive to the offset used for the KM estimators (Section 4.2 and Section 4.3, Conjecture 1).
Thirdly, we suggest that, if the observations of the second part can be confirmed theoretically, it is possible to use KM estimators at large offsets to define meaningful effective equations for underdamped dynamics (Corollary 1). The reason is that the statistics of the underdamped process approach those of an overdamped process after a suitable re-scaling of time. We provide successful illustrations of this idea using a toy example and molecular dynamics simulation data of the alanine dipeptide (Section 5.2 and Section 5.3).

The rest of this paper is organized as follows: in Section 2, we recap what is needed of the theory of stochastic differential equations, their generators, the conditioning approach, and the data-driven approximation of Koopman generators. In Section 3, we present and illustrate our spectral approximation result. The technical details of the proofs are deferred to Section 6. In Section 4, we present numerical results on the spectrum of gEDMD models based on KM estimators for reversible systems, and conjecture that the observed behaviour can be expected in general. We analyze the implications of this hypothesis for systems driven by underdamped Langevin dynamics in Section 5, and provide additional numerical results for this setting. Conclusions and the outlook follow in Section 7.

2. Concepts

2.1. SDEs and Generators

In this paper, we consider a reversible Markov process

X_{t}

attaining values in a domain

Ω \subset R^{d}

. The process is governed by the stochastic differential equation

\begin{matrix} d X_{t} & = b (X_{t}) d t + σ (X_{t}) d B_{t} . \end{matrix}

(1)

Here,

B_{t}

denotes d-dimensional Brownian motion, the function

b : R^{d} \mapsto R^{d}

is called the drift, and

σ : R^{d} \to R^{d \times d}

is called the diffusion. We use the notation

a (x) = σ (x) σ {(x)}^{T}

for the covariance matrix of the diffusion. A standard example for dynamics of type Equation (1) are the overdamped Langevin dynamics

\begin{matrix} d X_{t} = - \frac{1}{γ} \nabla V (X_{t}) d t + \sqrt{2 β^{- 1} γ^{- 1}} d B_{t}, \end{matrix}

(2)

where

V : Ω \to R

is a scalar function called the potential, while

β, γ

are constants corresponding to the inverse temperature and the friction in physics applications.

We assume that

X_{t}

is ergodic with respect to a unique invariant measure

μ

with Boltzmann density

ρ \propto exp (- F (x))

, where F is called a generalized potential. In addition, we assume the diffusion to satisfy a so-called uniform ellipticity condition

0 < η_{1} {∥ v ∥}^{2} \leq v^{T} a (x) v \leq η_{2} {∥ v ∥}^{2},

(3)

for constants

η_{1}, η_{2} > 0

. The invariant measure

μ

gives rise to the Hilbert space

L_{μ}^{2}

of all square integrable functions with respect to that measure. We can think of these functions as physical observables. The inner product on

L_{μ}^{2}

is given by

{〈 ψ, \tilde{ψ} 〉}_{μ} = \int_{Ω} ψ (x) \tilde{ψ} (x) d μ (x) = \int_{Ω} ψ (x) \tilde{ψ} (x) ρ (x) d x .

(4)

For a fixed time window

t \geq 0

, and an observable function

ψ \in L_{μ}^{2}

, the Koopman operator

K_{t}

describes the evolution of the expectation value of

ψ

by means of the dynamics (1):

K_{t} ψ (x) = E^{x} [ψ (X_{t})],

where

E^{x} [\cdot]

denotes expectation given that the dynamics starts deterministically at x. The infinitesimal generator

L

of the Markov process

X_{t}

is then defined as a formal time-derivative of this expectation value:

L ψ (x) = \frac{d}{d t} E^{x} [ψ (X_{t})] |_{t = 0} .

(5)

It follows that, by describing the system in terms of the expectations

K_{t} ψ

, the nonlinear dynamical system (1) turns into a linear, but infinite-dimensional system with differential equation

\frac{d}{d t} K_{t} ψ = L K_{t} ψ .

For this reason, the Koopman operators

K_{t}

and their generator

L

have been studied extensively in past decades. A technical subtlety arising from this infinite-dimensional description is that the time-derivative (5) is not well-defined for all

ψ \in L_{μ}^{2}

. For smooth and compactly supported functions, however, stochastic calculus shows that

L

is well-defined, and acts as a second order differential operator:

\begin{matrix} L ψ (x) & = \sum_{i = 1}^{d} b_{i} (x) \frac{\partial ψ (x)}{\partial x_{i}} + \frac{1}{2} \sum_{i, j = 1}^{d} a_{i j} (x) \frac{\partial^{2} ψ (x)}{\partial x_{i} \partial x_{j}} \end{matrix}

(6)

\begin{matrix} = b (x) \cdot \nabla ψ (x) + \frac{1}{2} a (x) : \nabla^{2} ψ (x) . \end{matrix}

(7)

Here,

\nabla^{2} ψ

is the Hessian matrix of the function

ψ

, and the colon: denotes the Frobenius inner product between matrices, i.e., A:B =

\sum_{i, j} A_{i j} B_{i j}

. For the same class of functions, the generator is symmetric with respect to the inner product (4), and satisfies the important equality

\begin{matrix} {〈 L ψ, \tilde{ψ} 〉}_{μ} & = & - \frac{1}{2} \int_{Ω} a (x) \nabla ψ (x) \cdot \nabla \tilde{ψ} (x) d μ (x), \end{matrix}

(8)

which requires only first order derivatives. The negative of the right-hand side of Equation (8) is called the quadratic form

Q (ψ, \tilde{ψ}) : = \frac{1}{2} \int_{Ω} a (x) \nabla ψ (x) \cdot \nabla \tilde{ψ} (x) d μ (x) .

(9)

Tools from functional analysis [36,37] can be used to define (9) on a larger set of functions

V_{μ}

, usually called the form domain. Because of (3), Q defines an inner product on the form domain, and

V_{μ}

in fact turns into a Hilbert space with energy norm

{∥ \cdot ∥}_{Q} = Q {(\cdot, \cdot)}^{1 / 2} .

The reason to introduce all these concepts is that the energy norm will serve as error measure for the main results of this study.

2.2. Spectral Decomposition

We are particularly interested in eigenvalues and eigenfunctions of the (negative) generator

- L

. Because of (8), the spectrum of

- L

must be part of the non-negative real axis. We will further assume that there is a complete set of eigenfunctions (i.e., they form a basis of

L_{μ}^{2}

) corresponding to discrete eigenvalues. In other words, there are functions

ψ_{0}, ψ_{1}, ψ_{2}, \dots

and non-negative numbers

κ_{0} < κ_{1} < κ_{2} < \dots

such that

\begin{matrix} - L ψ_{i} = κ_{i} ψ_{i} . \end{matrix}

(10)

Conditions for the existence of a completely discrete spectrum are discussed in Section 6. The assumption that all eigenvalues

κ_{i}

are distinct is for simplicity only, especially with regard to the analysis in Section 6. It follows again from (8) that

κ_{0} = 0

, and

ψ_{0} \equiv 1

is the constant function.

The physical significance of these eigenpairs is that, by [38] (Ch 2, Thm 2.4.), the eigenfunctions

ψ_{i}

are also eigenfunctions of the Koopman operators

K_{t}

for all

t \geq 0

, corresponding to eigenvalues

λ_{i} (t) = e^{- κ_{i} t} .

(11)

Due to the exponential decay of all

λ_{i} (t)

, it is common to refer to the

κ_{i}

as rates, and to their reciprocals as implied timescales

\begin{matrix} t_{i} = \frac{1}{κ_{i}} . \end{matrix}

(12)

In many applications, including molecular dynamics, we expect to find a number K of dominant rates

0 < κ_{1} < \dots < κ_{K} ≪ κ_{K + 1}

separated from all others. These dominant spectral components are of particular interest as they are related to metastability, that is, the existence of long-lived macrostates such that transitions between those states are rare events [22,39,40].

2.3. Dimensionality Reduction

The main topic of this study is the effect of dimensionality reduction on the generator eigenvalues

κ_{i}

introduced above. Following the notation of Refs. [15,17,18], we consider a smooth coarse graining function

ξ

, which maps the state space

Ω \subset R^{d}

onto a lower-dimensional space

\hat{Ω} \subset R^{m}

, where

m \leq d

. For a position

z \in \hat{Ω}

in reduced space, we denote the marginal probability distribution of the invariant measure

μ

by

ν

, and assume it possesses a corresponding density function

ϑ (z)

. The conditional expectation operator

P ψ (z) = E^{μ} [ψ (x) ∣ ξ (x) = z]

computes the stationary average of a function

ψ

defined on

Ω

, conditional to

ξ

attaining a fixed value

z \in \hat{Ω}

. Using the Dirac

δ

-function, we can informally write the above expression as

P ψ (z) = \frac{1}{ϑ (z)} \int_{Ω} ψ (x) δ (ξ (x) - z) d μ (x) .

Consider the space

L_{ν}^{2}

of physical observables on reduced space

\hat{Ω}

:

L_{ν}^{2} = {φ : \hat{Ω} \to R, \int_{\hat{Ω}} φ^{2} (z) d ν (z) < \infty} .

By the concatenation

φ \circ ξ

, every such function can be viewed as a function on full state space

Ω

. In fact,

L_{ν}^{2}

can be exactly identified as the subspace of functions in

L_{μ}^{2}

which depend only on the value of

z = ξ (x)

, with the conditional expectation operator acting as orthogonal projection onto this subspace [17]. Note that, unless

ξ

is constant,

L_{ν}^{2}

is an infinite-dimensional subspace.

This motivates consideration of the projected generator

\begin{matrix} L^{ξ} & = PLP . \end{matrix}

(13)

As discussed comprehensively in [17], this operator retains the shape of the generator of a reversible Markov process

Z_{t}

on

\hat{Ω}

, as for smooth compactly supported functions

φ \in L_{ν}^{2}

, we have that

L^{ξ} φ = P [L ξ] \cdot \nabla_{z} φ + \frac{1}{2} P [\nabla ξ^{T} a \nabla ξ] : \nabla_{z}^{2} φ .

(14)

In the above equation,

L ξ

is an m-dimensional vector, each entry containing the application of

L

to each component of

ξ

, and

\nabla ξ

is the

d \times m

Jacobian matrix of

ξ

. The coefficients

\begin{matrix} b^{ξ} (z) = P (L ξ) (z), & a^{ξ} (z) = P (\nabla ξ^{T} a \nabla ξ) (z) \end{matrix}

(15)

serve as effective drift and effective diffusion for the process

Z_{t}

, respectively. It can also be shown that

Z_{t}

is ergodic with respect to

ν

[17], and we can associate with

L^{ξ}

a form domain

V_{ν}

with an effective quadratic form

Q^{ξ} (φ, \tilde{φ}) = \frac{1}{2} \int_{\hat{Ω}} a^{ξ} (z) \nabla_{z} φ (z) \cdot \nabla_{z} \tilde{φ} (z) d ν (z) .

(16)

If the projected generator also possesses a discrete spectrum with eigenvalues

ω_{i}, i = 0, 1, \dots

, comparison of those eigenvalues with the original ones

κ_{i}

provides information about how well the effective dynamics

Z_{t}

retain the relaxation processes of the original process. Our results on this topic are presented in Section 3 and Section 6.

2.4. Galerkin Approximation

The numerical approximation of the eigenfunctions

ψ_{i}, i = 1, \dots, K

is often achieved by Galerkin projection, i.e., orthogonal linear projection of the generator to a finite-dimensional subspace. After choosing such a space

W \subset V_{μ}

, with a basis set

{ϕ_{i}}_{i = 1}^{N}

, the Galerkin approach consists of finding

{\hat{ψ}}_{i} \in W

such that

\begin{matrix} - {〈L {\hat{ψ}}_{i}, ϕ_{j}〉}_{μ} = Q ({\hat{ψ}}_{i}, ϕ_{j}) & = {\hat{ω}}_{i} {〈{\hat{ψ}}_{i}, ϕ_{j}〉}_{μ} \forall 1 \leq j \leq N . \end{matrix}

(17)

Equation (17) is a generalized matrix eigenvalue problem. The approximate eigenvalues

{\hat{ω}}_{i}

are also called Ritz values associated with

W

. If

W

is chosen as a subspace of

V_{ν}

(given a set of reduced variables as described in the previous section), it was shown in [17] that (17) serves both as a weak form for

L

in

V_{μ}

and for

L^{ξ}

in

V_{ν}

. Importantly, the min-max-principle implies that for any such subspace

κ_{i} \leq ω_{i} \leq {\hat{ω}}_{i} .

(18)

Moreover, if simulation data

{x_{m}}_{m = 1}^{M}

of the full process (1), approximately sampling the invariant measure

μ

, is available, the following empirical estimators

\begin{matrix} Q (ϕ_{i}, ϕ_{j}) & \approx \frac{1}{2 M} \sum_{m = 1}^{M} \nabla ϕ_{i} {(x_{m})}^{T} a (x_{m}) \nabla ϕ_{j} (x_{m}), & {〈ϕ_{i}, ϕ_{j}〉}_{μ} & \approx \frac{1}{M} \sum_{m = 1}^{M} ϕ_{i} (x_{m}) ϕ_{j} (x_{m}) \end{matrix}

(19)

will converge to the terms in Equation (17) in the limit of infinite data. This method has been called generator Extended Dynamic Mode Decomposition (gEDMD) [28]. Consequently, for a subspace

W \subset V_{ν}

comprised of functions on reduced space, gEDMD will simultaneously approximate the eigenvalues of

L

and

L^{ξ}

by (18). Note that, in this setting, simulation data of the full dynamics (1) can still be used in Equation (19); it is not necessary to simulate the effective dynamics defined by

L^{ξ}

first. As the gradients in (19) are taken with respect to the full state variables x, we have to apply the chain rule when evaluating (19) for basis functions

ϕ_{i} \in W \subset V_{ν}

:

\nabla ϕ_{i} (ξ (x)) = \nabla_{z} ϕ_{i} {(z)}^{T} \nabla ξ^{T} (x)

.

Note that the estimator for

Q

in (19) exploits reversibility of the process. An alternative estimator, which can also be applied to non-reversible processes, is

\begin{matrix} - {〈L ϕ_{i}, ϕ_{j}〉}_{μ} & \approx - \frac{1}{M} \sum_{m = 1}^{M} [b (x_{m}) \nabla ϕ_{i} (x_{m}) + \frac{1}{2} a (x_{m}) : \nabla^{2} ϕ_{i} (x_{m})] ϕ_{j} (x_{m}) . \end{matrix}

(20)

We will require this last equation in the next section.

2.5. Kramers–Moyal Estimators

The parameters

b^{ξ}

and

a^{ξ}

in (15) involve integrals over nonlinear manifolds in high-dimensional space, and they are rarely used in practice for this reason. For a process

{X_{t}}_{t \geq 0}

and a positive offset

s > 0

, define the projected first order finite difference at time t by

d_{s}^{ξ} (X_{t}) = ξ (X_{t + s}) - ξ (X_{t}) .

With this notation, the basic Kramers–Moyal (KM) formulae [41] for the approximation of

b^{ξ}

and

a^{ξ}

are given by:

\begin{matrix} b^{ξ} (z) & = lim_{s \to 0} E^{μ} [\frac{1}{s} d_{s}^{ξ} (X_{0}) | ξ (X_{0}) = z], & a^{ξ} (z) & = lim_{s \to 0} E^{μ} [\frac{1}{s} d_{s}^{ξ} (X_{0}) \otimes d_{s}^{ξ} (X_{0}) | ξ (X_{0}) = z] . \end{matrix}

(21)

We note that many more sophisticated approximations can be found in the literature.

KM estimators can also be used in the context of gEDMD. If the parameters b and a of the original process (1) are unknown, gEDMD can be re-formulated upon replacing b and a by the first order finite differences at all data points in (20). If

s > 0

corresponds to a multiple of the integration time step in a discrete trajectory

{x_{m}}_{m = 1}^{M}

, then (20) can be converted to:

- {〈L ϕ_{i}, ϕ_{j}〉}_{μ} \approx - \frac{1}{M} \sum_{m = 1}^{M} [\frac{1}{s} d_{s}^{ξ} (x_{m}) \nabla_{z} ϕ_{i} (x_{m}) + \frac{1}{2 s} (d_{s}^{ξ} (x_{m}) \otimes d_{s}^{ξ} (x_{m})) : \nabla_{z}^{2} ϕ_{i} (x_{m})] ϕ_{j} (x_{m}) .

(22)

This estimator is consistent for

s \to 0

and

M \to \infty

. There is a dual meaning to eigenpairs derived from this approximation: on one hand, they serve as an approximation to the eigenpairs of

L^{ξ}

, at least for small offsets s. On the other hand, these are also approximations to the spectrum of the generator of a non-reversible coarse grained SDE with drift and diffusion given by (21), see again [28] for a detailed exposition. In Section 4 and Section 5, we present a numerical study on the effect of the offset s on the spectrum of this generator. We note that similar approximations can certainly be built on more advanced estimators than (21), but the KM formulae will suffice for this study.

3. Spectral Properties of the Projected Generator

3.1. Summary of Spectral Properties

The first major result of this study concerns the approximation error for the dominant eigenvalues

κ_{1}, κ_{2}, \dots

by corresponding eigenvalues of

L^{ξ}

. We only provide a high-level summary of these results here, while the technically more involved statements and their proofs can be found in Section 6.

First, we introduce conditions to ensure the spectrum of the effective generator

L^{ξ}

is also discrete, see Proposition 1. We then show in Proposition 2 and Corollary 2 that the relative eigenvalue error

E_{i} = \frac{ω_{i} - κ_{i}}{ω_{i}}

can be bounded in terms of the energy norm of the projection residual

P_{Q}^{⊥} ψ_{i} = (I - P_{Q}) ψ_{i}

of the corresponding eigenfunctions, with

P_{Q}

denoting the

Q

-orthogonal projection onto

V_{ν}

:

E_{i} = \frac{ω_{i} - κ_{i}}{ω_{i}} \leq C {∥ P_{Q}^{⊥} ψ_{i} ∥}_{Q}^{2} .

(23)

In other words, if the eigenfunction

ψ_{i}

can be written as a function of the reduced variables z, up to a small error, then we can expect the eigenvalue

κ_{i}

to be reproduced well by the effective dynamics on z. However, as the error is measured by the energy norm, Proposition 2 shows that not only the eigenfunction

ψ_{i}

, but also its first order derivatives must be approximated well by functions of z alone. In the next Section 3.2, we show that this is not merely an academic condition, but indeed necessary.

Our result improves on existing ones in two ways. First, in Ref. [18] (Theorem 2), it was shown that the absolute eigenvalue error of the projected generator is small if

∥ {LP}^{⊥} ψ_{i} ∥_{L_{μ}^{2}}

and

∥ P^{⊥} ψ_{i} ∥_{L_{μ}^{2}}

are small. Proposition 2 (Corollary 2) complements these results in the sense that it bounds the relative error of eigenvalues (timescales), which is a more practical error measure for eigenvalues close to zero, i.e., large timescales. For a more detailed elaboration on the relationship of the projection error and the relative error of timescales, please refer to the text after Corollary 2. Second, our bound (37) is less restrictive than the conditions in [18], as it uses the energy norm involving only first order derivatives, while the term

∥ {LP}^{⊥} ψ_{i} ∥_{L_{μ}^{2}}

necessarily requires second derivatives. In fact, the bound assumed in [18] (Theorem 2) implies our bound up to another multiplicative constant (thus, it is more restrictive), see Lemma 1.

3.2. Illustration of the Error Bound

In order to provide an illustration of the error bound, we consider a two-dimensional Ornstein–Uhlenbeck process, that is, overdamped Langevin dynamics (2) with potential (see Figure 1A):

\begin{matrix} V (x, y) & = \frac{1}{2} (α_{x} x^{2} + α_{y} y^{2}) . \end{matrix}

(24)

We set

α_{x} = 1, α_{y} = 5, β = γ = 1.0

. The eigenvalues and eigenfunctions of this system are known analytically. For all integers

r, s \geq 0

, we have eigenvalues

κ_{r, s} = r α_{x} + s α_{y}

, with eigenfunctions

ψ_{r, s} = \frac{1}{\sqrt{r! s!}} H_{r} (x) H_{s} (y)

, using one-dimensional Hermite polynomials

H_{i}

. The first four non-zero eigenvalues correspond to eigenfunctions which are constant in y-direction. The first of these eigenfunctions,

ψ_{1, 0} = x

, is shown in Figure 1C. We now consider a family of reaction coordinates

ξ^{m} (x, y) = x + 0.1 sin (m y) .

For

m = 0

, the reaction coordinate

ξ^{0} (x, y) = x

perfectly captures the first eigenfunction

ψ_{1, 0}

. For positive m, however, the level sets

ξ^{- 1} (z)

of the coarse graining map oscillate within a vertical strip of width 0.1 around

x = z

, see Figure 1A for a comparison of the level sets at

m = 0

and

m = 10

. Due to these small scale oscillations, we still expect to find a relatively small projection error of the eigenfunction

ψ_{1, 0}

, if measured by the

L_{μ}^{2}

-norm, but an increasingly larger error if the energy norm (involving derivatives) is employed.

In order to estimate these projection errors, we use finite-dimensional subspaces

W^{m} \subset V_{ν}^{m}

, where

V_{ν}^{m}

is the form domain corresponding to reaction coordinate

ξ^{m}

. The subspaces are spanned by the first ten Hermite polynomials

H_{i} (z)

, which exactly capture the slowest eigenfunction for

m = 0

. For each subspace, we calculate the Galerkin matrices in Equation (17) by numerical integration, and then extract the first non-trivial eigenvalues

{\hat{ω}}_{1}^{m}

and eigenfunctions

{\hat{ψ}}_{1}^{m}

. We calculate the relative errors

E_{1}^{m} = \frac{{\hat{ω}}_{1}^{m} - κ_{1, 0}}{{\hat{ω}}_{1}^{m}}

and the eigenfunction approximation errors

∥ ψ_{1, 0} - {\hat{ψ}}_{1}^{m} ∥^{2}

, measured using both the norms on

L_{μ}^{2}

and the energy norm. The results are shown in Figure 1B. We observe that the

L_{μ}^{2}

-error remains almost constant as m increases, while the energy norm error increases steadily, reflecting the increasingly oscillating shape of the approximate eigenfunctions

{\hat{ψ}}_{1}^{m}

(see Figure 1D). In agreement with our error estimate, the energy norm approximation error provides a fairly tight bound for the relative eigenvalue error

E_{1}^{m}

. It should be noted that the quantities shown here only provide upper bounds for the relative error

E_{1}

and the projection error

δ_{1}^{2}

, but they suffice for the purpose of illustration.

In summary, this example confirms the eigenvalue error bound provided in Proposition 2, and it also highlights the importance of capturing the derivatives of generator eigenfunctions of interest when selecting a reaction coordinate for coarse graining.

4. Spectral Properties and Kramers–Moyal Estimators

The second part of this work is a numerical study on the effect of employing Kramers–Moyal type approximations when estimating spectral properties of projected generators. As explained in Section 2.5, KM estimators can be incorporated into the gEDMD method by means of Equation (22) if the full system parameters b and a are not known. Again, we stress that using gEDMD to calculate eigenvalues in this way possesses a dual meaning: it is an approximation to the eigenvalues of

L^{ξ}

, but also an approximation to the eigenvalues of a non-reversible dynamics with parameters given by (21). We find that for projections onto a good set of reaction coordinates

ξ

, i.e., coordinates which are known to capture the slow dynamics of a system well, it is possible to use a surprisingly large offset parameter s in Equation (22).

4.1. Methods

In all of the following examples, we generate long realizations of the system under investigation, by employing the Euler–Maruyama method with discrete integration time step

Δ_{t}

, for M steps, such that the total simulation time equals

M Δ_{t}

. The only exception is the molecular example in Section 5.3, where a molecular dynamics code was used to generate the data, see the references given there. For the application of gEDMD on one-dimensional domains, we either use Gaussian basis functions

ϕ_{i}

, or periodic Gaussians

ϕ_{i}^{p}

if the domain is periodic:

\begin{matrix} ϕ_{i} (z) & = exp [- \frac{1}{2 ρ} {(z - z_{i})}^{2}], & ϕ_{i}^{p} (z) & = exp [- \frac{1}{2 ρ} {sin}^{2} (\frac{1}{2} (z - z_{i}))], \end{matrix}

(25)

with bandwidth

ρ

and centers

z_{i} \in \hat{Ω}

. On two-dimensional domains, we use products of the univariate functions defined above, centered on a regular grid.

With these basis functions, the Galerkin matrices for gEDMD are calculated in different ways. As a reference, we use estimators (19), which require knowledge of the full system parameters. In addition, we use the estimator (22) with a series of offsets

s > 0

. We then solve the generalized eigenvalue problem (17) for each of these cases. Eigenvalue estimates thus obtained are denoted by

{\hat{ω}}_{i}^{0}

and

{\hat{ω}}_{i}^{s}

, respectively. We keep track of the relative error

E_{i}^{s} = \frac{| {\hat{ω}}_{i}^{s} - {\hat{ω}}_{i}^{0} |}{{\hat{ω}}_{i}^{s}},

(26)

and also monitor the reciprocals

{\hat{t}}_{i}^{0} = {({\hat{ω}}_{i}^{0})}^{- 1}, {\hat{t}}_{i}^{s} = {({\hat{ω}}_{i}^{s})}^{- 1}

, which serve as estimates of the implied timescales (12). Additionally, we also extract estimates of the K dominant eigenfunctions from each of these eigenvalue problems, and apply the PCCA method [42] to determine metastable decompositions of the domain based on these eigenfunctions. PCCA returns K membership functions

χ_{j} (z), j = 1, \dots, K

, such that

\sum_{j = 1}^{K} χ_{j} (z) = 1

at all points z, and

χ_{j} (z)

indicates the degree of membership of each point z to metastable state j.

4.2. Lemon Slice Potential

We consider overdamped Langevin dynamics Equation (2) in the “lemon slice” potential

V (r, φ) = cos (4 φ) + \frac{1}{cos (0.5 φ)} + 10 {(r - 1)}^{2} + \frac{1}{r},

(27)

where

r, φ

are two-dimensional polar coordinates, at inverse temperature

β = 1.0

and friction

γ = 1.0

. A contour of the potential is shown in Figure 2. Note that the second and the last term in (27) impose an infinite barrier along the negative x-axis and at the origin. The slow dynamics of this system correspond to transitions between the four main minima of the potential V, we therefore find three dominant timescales

t_{1} \approx 2.6, t_{2} \approx 0.95, t_{3} \approx 0.75

, see [28] for a previous analysis of the same example. Hence, we can select the polar angle

φ

as a suitable reaction coordinate

ξ (x, y) = φ (x, y)

. In this case, the effective drift

b^{ξ}

and diffusion

a^{ξ}

can even be calculated analytically, see Appendix A, they are indicated by the black lines in Figure 3A,B. The data set we use comprises

M = 5 \cdot 10^{6}

data points at integration time step

Δ_{t} = 10^{- 3}

.

Since applying gEDMD with a positive offset in (22) corresponds to approximating the generator of an SDE with coefficients (21), and we also have analytical expressions for the exact effective parameters

b^{ξ}

and

a^{ξ}

, we first provide a comparison between these analytical parameters and histogrammed estimates of Equation (21). We find in Figure 3A,B that the exact drift and diffusion are recovered well for small s, as expected, while very different results are obtained as s increases. In particular, the effective diffusion is no longer constant as a function of

φ

for large s.

Next, we compare the dominant spectra of the generators corresponding to these different dynamics. To this end, we employ gEDMD with fifteen Gaussian basis functions (25), centered at equal distance between

z_{i} = - 2.8

and

z_{i} = 2.8

, each of width

ρ = 0.1

, as described in Section 4.1. In Figure 3C, we show the first three timescales

{\hat{t}}_{i}^{s}

compared to their reference values, and also the relative errors

E_{i}^{s}

, see (26). As s increases by about two orders of magnitude, the relative errors remain within a ten to twenty percent margin around the references, which is generally acceptable from a practical point of view. As shown in Figure 3D, the membership functions

χ_{j}, j = 1, \dots, 4

generated by the PCCA method for

s = 0.1

, barely differ from the ones computed using full system parameters. We therefore conclude that the dominant spectrum is approximately retained by all models up to

s = 0.1

. In other words, the structure of the spectrum is unchanged as long as the offset is clearly smaller than the slow timescales, even though the KM estimates for drift and diffusion are very different from the analytical coefficients

b^{ξ}

and

a^{ξ}

.

4.3. Prototypical Molecular Potential

In order to confirm the observations made in the previous section, we study a more complex example. The system is designed to mimic a small molecule consisting of five atoms. The three-dimensional Euclidean coordinates of all five atoms thus define the system’s fifteen-dimensional state space. We imagine these five atoms to be linked by bonds like a chain. The dynamical model is again the overdamped Langevin dynamics (2), with a potential energy comprised of typical molecular interactions, namely harmonic bond, bond angle, and dihedral potentials. The precise parameters can be found in Appendix B. Note, however, that the system is only qualitatively similar to a small molecule, since parameter values and the units of time and energy do not correspond to physical values. We also do not include any solvent molecules or velocities. The data set comprises

M = 2 \cdot 10^{6}

data points at integration time step

Δ_{t} = 5 \cdot 10^{- 3}

.

The system parameters are tuned in such a way that the two dihedral angles

ϕ_{1}, ϕ_{2}

, spanned by those five atoms, capture the slow dynamics of the system, corresponding to transitions between six symmetrically arranged minima of the effective free energy in the

ϕ_{1}

-

ϕ_{2}

-plane, see Figure 4A. We run the reference gEDMD analysis using 100 two-dimensional Gaussian basis functions centered on a ten-by-ten grid, with

ρ = 0.05

. Five dominant implied timescales between

t_{1} \approx 25

and

t_{5} \approx 10

(black lines in Figure 4C) are determined, and the six energy minima are recovered as metastable states by a PCCA analysis, as shown in Figure 4B.

We repeat the same experiment as for the previous example: using the same basis set of 100 Gaussians, we apply gEDMD with KM estimators (22), for a series of offsets between

s = 5 \cdot 10^{- 3}

and

s = 1.0

. The corresponding timescale estimates

{\hat{t}}_{i}^{s}

and the relative errors

E_{i}^{s}

are also shown in Figure 4C. The relative errors are generally larger than for the previous example, but still within an acceptable range of ten to forty percent. Applying the PCCA analysis to the model extracted at

s = 1.0

, we are still able to reproduce the correct metastable decomposition of reaction coordinate space. In summary, we can still conclude that the dominant spectrum is retained as long as the offset is at least an order of magnitude smaller than the fastest interesting timescale.

4.4. Summary of Observations

The previous two examples have indicated that, at least for reversible systems, and for a good selection of reaction coordinates, it is possible to use the KM estimators within the framework of gEDMD without disrupting the overall structure of the dominant spectrum. We state this observation as a conjecture, which will be investigated and made quantitative in future work:

Conjecture 1.

Let ξ be a good set of reaction coordinates for a reversible process

X_{t}

, which capture the slow part of the dynamics well (e.g., small projection errors

δ_{i}

as in Proposition 2). Let s be an offset such that

0 < s ≪ t_{K}

. Then, it is possible to recover the dominant spectrum by applying gEDMD with KM estimators, as in (22), at least in a qualitative sense.

Note that Conjecture 1 is equivalent to saying that the effective dynamics with parameters (21) will retain the dominant spectrum well in the setting described above. In addition, the quality of a gEDMD approximation depends on the basis set. Conjecture 1 should be understood in the sense of using gEDMD with a powerful basis set, such that the basis set error does not play a major role.

5. Underdamped Langevin Dynamics

The third topic of this paper is to study the implications of the first two results for systems driven by underdamped Langevin dynamics. Let us recall that the theoretical analysis presented in Section 3 and Section 6 hinges on the reversible setting. In addition, the numerical results on KM estimators shown in Section 4 were using the reversible overdamped Langevin process (2) (OL process from now on). However, a popular dynamical model, especially in molecular physics, are underdamped Langevin dynamics (UL process) in position and momentum space

(q, p)

, where

q \in Ω

is the position of the system and

p \in R^{d}

is its momentum. The equations of motion are

\begin{matrix} d q_{t} & = & p_{t} d t, \end{matrix}

(28)

\begin{matrix} d p_{t} & = & - \nabla V (q_{t}) d t - γ p_{t} d t + \sqrt{2 γ β^{- 1}} d B_{t} . \end{matrix}

(29)

Here,

V, γ, β

have the same physical meaning of energy, friction, and inverse temperature, as in (2) above. The invariant density of the process (28) and (29) factors as

μ (q, p) = μ_{P} (p) μ_{Q} (q) \propto exp (- \frac{β}{2} p^{T} p) exp (- β V (q)) .

(30)

The underdamped process is not reversible, so the error theory developed in this paper does not apply. However, it is well known that, for large enough friction, if we observe the position coordinate

q_{t}

only at every s-th step, for s large, this time-rescaled process behaves like an OL process. This phenomenon is called the overdamped limit, and gives rise to the following idea: if we select a reaction coordinate

ξ = ξ (q)

, which does not depend on momentum, then

ξ (q_{k s}), k = 1, 2, \dots

, observed at large offset s, behaves like a projected OL process. If we employ KM estimators or related approaches at the same offset within the context of gEDMD, we effectively compute models for the OL dynamics. Our error theory does apply to the reversible OL process, and, if our Conjecture 1 is correct, we still get the dominant spectrum of these OL dynamics right by using KM formulae at a large offset. This idea will be illustrated by numerical examples in the following subsections.

5.1. Projection and Re-Scaling of the Underdamped Process

We start by discussing the projection of the underdamped Langevin process by a map

ξ = ξ (q)

depending only on the position coordinates. Oftentimes, one is not interested in quantities that explicitly depend on the momenta, which renders this a realistic setting (see [35] for an approach to model reduction which includes the momenta). Unfortunately, the coefficients of the effective dynamics (13) are identically zero in this case, see also [34,43]. To see this, it is readily checked using the definition of the generator that

L ξ = p \cdot \nabla_{q} ξ

and

\nabla ξ^{T} a \nabla ξ = 0

. Using the definition of the parameters (15), the factorization (30) of the invariant measure, and the fact that

μ_{P} (p)

is the density of a multivariate normal distribution with mean zero, we find that

\begin{matrix} b^{ξ} (z) & = P (L ξ) (z) \\ = \int_{Ω \times R^{d}} L ξ (q, p) μ (q, p) δ (ξ (q, p) - z) d q d p \\ \propto \int_{Ω \times R^{d}} p \cdot \nabla_{q} ξ (q) exp (- β V (q)) exp (- \frac{β}{2} p^{T} p) δ (ξ (q) - z) d q d p \\ = [\int_{Ω} exp (- β V (q)) \nabla_{q}^{T} ξ (q) δ (ξ (q) - z) d q] \cdot [\int_{R^{d}} p exp (- \frac{β}{2} p^{T} p) d p] \\ = 0, \\ a^{ξ} (z) & = P (\nabla ξ^{T} a \nabla ξ) = 0 . \end{matrix}

How can one define a suitable effective dynamics in this case? As already mentioned above, if the friction

γ

is sufficiently large, the positional component

q_{t}

behaves like a reversible overdamped dynamics (2) on long time scales. More precisely, if we observe the positions

q_{k s}, k = 1, 2, \dots

for an offset

s ≫ \frac{1}{γ}

, then the statistics of this process will be approximately the same as those of an OL process

X_{t}

in position space, observed at the same offset. A particular pair of statistics to observe is given by the KM formulae (21). As a consequence, if the reaction coordinate

ξ

captures the slow dynamics of the OL process

X_{t}

well, and if in addition our Conjecture 1 is correct, then we can use the KM estimators on the underdamped data to build a good model (or a suitable effective dynamics) for the overdamped process

X_{t}

.

The following corollary provides a formal derivation of this argument, again in a qualitative sense, thus connecting the results of Section 3 and Section 4.

Corollary 1.

Let

X_{t} \in Ω \subset R^{d}

denote the OL process (2). Moreover, assume that ξ is good reaction coordinate for

X_{t}

(e.g., small projection errors

δ_{i}

in Proposition 2). Let

(q_{t}, p_{t})

denote the UL process on

Ω \times R^{d}

with the same parameters as

X_{t}

.

(i) For γ sufficiently large, the statistics of the positional component

q_{t}

of the underdamped process, and of the overdamped process

X_{t}

, approximately agree at offsets much larger than

\frac{1}{γ}

.

(ii) If in addition Conjecture 1 is true, then application of the Kramers–Moyal estimator (22) at offsets

t_{K} > s > \frac{1}{γ}

to

q_{t}

, allows for recovering the dominant spectrum by means of gEDMD.

Proof.

(i) Setting

ϵ = \frac{1}{γ}

, and re-scaling the UL process (28) and (29) by

(q_{t}^{c}, p_{t}^{c}) = (q_{c γ t}, p_{c γ t}) = (q_{c t / ϵ}, p_{c t / ϵ})

, the re-scaled equations of motion are

\begin{matrix} d q_{t}^{c} & = \frac{c}{ϵ} p_{t}^{c} d t, \\ d p_{t}^{c} & = - \frac{c}{ϵ} \nabla V (q_{t}^{c}) d t - \frac{c}{ϵ^{2}} p_{t}^{c} d t + \frac{1}{ϵ} \sqrt{2 β^{- 1} c} d B_{t} . \end{matrix}

For sufficiently large

γ

and

c \geq 1

, Theorem 18.1 of [6] implies that the law of

q_{t}^{c}

is close to that of

d Q_{t}^{c} = - c \nabla V (Q_{t}^{c}) d t + \sqrt{2 β^{- 1} c} d B_{t} .

We note that

Q_{t}^{c}

is a time re-scaling of

X_{t}

via

Q_{t}^{c} = X_{c γ t}

, hence we have for functions

f = f (q) \in L_{μ_{Q}}^{2} (Ω)

:

E^{μ_{Q}} [f (q_{c γ t})] \approx E^{μ_{Q}} [f (X_{c γ t})] .

(31)

(ii) In particular, the approximate equality (31) applies directly to estimator (22) for

s > \frac{1}{γ}

. If Conjecture 1 is true, the resulting gEDMD models will qualitatively retain the leading timescales of the process

X_{t}

. □

5.2. Langevin Toy Model

To illustrate the ideas outlined in the previous section, we first consider another two-dimensional toy potential

V (x, y) = 3 x^{4} - 5 x^{2} + 1.5 x + 3 y^{2},

(32)

shown in Figure 5. For

γ = 10, β = 0.4

, we generate data of both the UL and OL processes for this potential, each data set comprising

M = 10^{7}

points at integration time step

Δ_{t} = 10^{- 3}

. For both dynamical models, the slowest transition in this energy landscape is the crossing of the barrier around

x = 0

, thus

ξ (x, y) = x

is a suitable reaction coordinate in both instances. The associated implied timescale in the OL model is

t_{1} \approx 7.3

. The effective drift in the overdamped case simply corresponds to the x-derivative of the x-dependent part the potential, while the effective diffusion remains constant, see the black dashed lines in Figure 6A,B.

We first illustrate the relationship between the overdamped limit and estimates of drift and diffusion by means of KM formulae. In Figure 6A,B, we show histogrammed estimates of the KM expressions (21), using both the OL and UL simulation data. We see that, for a small offset

s = 10^{- 3}

, UL estimates are almost zero, while the OL data lead to estimates close to the analytical values. Both findings are as expected. For a much larger offset

s = 0.5

, however, both estimates are significantly different from what we find for small offsets, and, most importantly, the overdamped and underdamped estimates agree well. This in line with our qualitative argument in Corollary 1(i).

We apply gEDMD to both data sets, using the KM estimators as in (22), for a series of offsets between

s = 10^{- 3}

and

s = 1.0

. The basis set is comprised of fifteen Gaussian basis functions (25), centered uniformly between

x = - 2.5

and

x = 2.5

, each of width

ρ = 0.1

. From all of these models, we extract the slowest implied timescale

{\hat{t}}_{1}^{s}

and the relative errors

E_{1}^{s}

, and present the results in Figure 6C. As expected, OL estimates are highly accurate at small offsets. For large offsets, these estimates remain within the same error margin that we have seen before. Estimates based on the UL data, however, are far off for small offsets (as the coefficients of the corresponding dynamics are essentially zero), but once the offset exceeds the critical value

s > \frac{1}{γ} = 0.1

, they are about as accurate as those based on OL data. We also verify in Figure 6D that, for both data sets, the two metastable states of the effective dynamics can be correctly identified by applying PCCA to the eigenvectors of the gEDMD model at a large offset

s = 0.5

.

This example shows that, upon increasing the offset, it is possible to find a sweet spot where s is larger than the critical relaxation time

\frac{1}{γ}

, but smaller than the slowest timescale

t_{1}

, such that meaningful effective dynamics for the UL process along x can be defined in this regime.

5.3. Alanine Dipeptide

Our final numerical example is a more complex data set, namely molecular dynamics simulations of the alanine dipeptide. The data we use is the same as in Refs. [44,45], consisting of

M = 10^{6}

points at

1 ps

time spacing. The dynamical model is the UL process as in Section 5.2, using the AMBER 99 molecular force field [46] as potential V. Friction is set to

γ = 0.1

{ps}^{- 1}

, and

β

is derived from the temperature

T = 300

K

via

β^{- 1} = k_{B} T

, using the Boltzmann constant

k_{B}

. As is well-known from numerous previous studies, the slow dynamics of alanine dipeptide can be represented well in the space of backbone dihedral angles

ϕ, ψ

. Figure 7 A shows the effective free energy landscape in the space of these reaction coordinates. Three major minima can be identified, which correspond to three metastable states of the full dynamics. The corresponding transition timescales

t_{1} \approx 1 ns

and

t_{2} \approx 0.1 ns

are indicated by the black lines in Figure 7B.

For gEDMD, we employ a similar basis set as in Section 4.3, comprised of 100 periodic Gaussians, centered on a ten-by-ten grid between

- 2.5

and

2.5

, with bandwidth

ρ = 0.05

. We compute gEDMD models for a range of offsets between

s = 1

ps

and

s = 50

ps

. The resulting first two implied timescales

{\hat{t}}_{1}^{s}, {\hat{t}}_{2}^{s}

and the corresponding relative errors

E_{i}^{s}

are shown in Figure 7B. These results confirm the findings of the previous examples, as both leading timescales are roughly reproduced for all offsets considered. The relative error

E_{2}^{s}

for the second timescale is generally larger than what we observed in the previous examples, but it is still acceptable in the context of this example. An interesting observation is that, for small offsets, we are able to fully recover both slow processes by a PCCA analysis. The corresponding state decomposition is shown in Figure 7C. For a larger offset which is comparable to the timescale

t_{2}

, the faster transition within the left part of the plane appears to be blurred out. However, applying PCCA with only two metastable states still recovers the slowest process, as indicated by the decomposition in Figure 7D.

6. Precise Statements on Spectral Properties and Their Proofs

In this section, we provide detailed proofs of the spectral approximation results outlined in Section 3.

6.1. Form Domain

We consider an open domain

Ω \subset R^{d}

, and make the assumption of uniform ellipticity (3). The generator

L

in (7) can be defined initially on the set of smooth functions or smooth and compactly supported functions. The form domain

V_{μ}

can then be obtained as the closure of this initial domain with respect to the Dirichlet norm

{∥ ψ ∥}_{1}^{2} = {∥ ψ ∥}_{L_{μ}^{2}}^{2} + Q (ψ, ψ) .

(33)

We note that, on domains with a boundary, the choice of initial domain has an impact on the boundary conditions. In addition, we can restrict all function spaces to the orthogonal complement of the constant one function without explicitly changing the notation.

The assumption of uniform ellipticity implies that

V_{μ}

is also a Hilbert space if equipped with the energy norm

{∥ \cdot ∥}_{Q} = Q {(\cdot, \cdot)}^{1 / 2} .

(34)

In many cases of practical interest, the energy and Dirichlet norms are equivalent [36]. We state this as an assumption and generally use the energy norm on

V_{μ}

in what follows:

Assumption 1:

The Dirichlet norm (33) and energy norm (34) are equivalent.

6.2. Solution Operator and Discrete Spectrum

The solution operator

T : V_{μ} \to V_{μ}

associated with

Q

is defined by

Q (T ψ, \tilde{ψ}) = {〈ψ, \tilde{ψ}〉}_{μ} \forall \tilde{ψ} \in V_{μ} .

(35)

Assumption 2:

The solution operator is compact on

V_{μ}

with norm

{∥ \cdot ∥}_{Q}

.

As a consequence of this assumption, the generator possesses a complete set of eigenfunctions

ψ_{i} \in V_{μ}

with eigenvalues

κ_{i}, i = 1, 2, \dots

, which are given as reciprocals of the eigenvalues of

T

. There is a number of well-known settings where Assumption 2 can be shown to hold. These include bounded Lipschitz domains with Dirichlet, Neumann, or periodic boundary conditions, as well as overdamped Langevin dynamics with a potential satisfying suitable growth conditions on the potential V [36].

6.3. Coarse Grained Generator and Its Spectrum

Analogously to Section 6.1, the projected generator

L^{ξ}

can be defined initially on the set of smooth or smooth and compactly supported functions in

L_{ν}^{2}

, which is an infinite-dimensional subspace of

L_{μ}^{2}

[17]. Using the effective quadratic form

Q^{ξ}

(16), the effective form domain

V_{ν}

can be defined again by completion of the initial domain with respect to the corresponding Dirichlet norm

{∥ ϕ ∥}_{1, ξ}^{2} = {∥ ϕ ∥}_{L_{ν}^{2}}^{2} + Q^{ξ} (ϕ, ϕ)

. Due to the relations [17]

\begin{matrix} {〈ϕ, \tilde{ϕ}〉}_{ν} & = {〈ϕ \circ ξ, \tilde{ϕ} \circ ξ〉}_{μ}, & Q^{ξ} (ϕ, \tilde{ϕ}) & = Q (ϕ \circ ξ, \tilde{ϕ} \circ ξ), \end{matrix}

(36)

Assumption 1 carries over to the effective Dirichlet norm and the effective energy norm. In order to proceed from this point, some care needs to be taken with regard to the coarse graining map:

Assumption 3:

The coarse graining map

ξ

is such that the effective form domain

V_{ν}

is a subspace of

V_{μ}

.

Remark 1.

Assumption 3 will hold in many cases of practical interest, which include the projection of a periodic domain onto a lower-dimensional periodic domain by a function ξ which respects the periodic boundary conditions. As a negative example, however, consider an SDE on a rectangle in two-dimensional space, with absorbing boundary conditions. Choose the coarse graining function as the projection onto the first coordinate axis. The form domain and the effective form domain will be given as first order Sobolev spaces with zero boundary conditions, but the effective form domain is not contained in the full form domain as its elements do not vanish on parts of the full boundary which are parallel to the first coordinate axis.

If Assumption 3 holds, it makes sense to define the Q-orthogonal projection from

V_{μ}

onto

V_{ν}

, which we denote by

P_{Q}

. As a first main result, we show that assumptions 1, 2, and 3 are sufficient to ensure that the spectrum of

L^{ξ}

is also discrete:

Proposition 1.

If assumptions 1, 2, and 3 hold, the spectrum of

L^{ξ}

is discrete.

Proof.

By assumption 1, the effective solution operator

T^{ξ}

is uniquely defined on

V_{ν}

by

Q^{ξ} (T^{ξ} ϕ, \tilde{ϕ}) = {〈ϕ, \tilde{ϕ}〉}_{ν} \forall \tilde{ϕ} \in V_{ν} .

It is readily checked that

T^{ξ} = P_{Q} T P_{Q}

. Since

P_{Q}

is bounded and

T

is compact by assumption 2, so is

T^{ξ}

. Hence, its spectrum (and correspondingly that of

L^{ξ}

) is discrete. □

6.4. Approximation Result

Next, we derive a bound on the eigenvalue error of

L^{ξ}

in terms of the energy norm approximation error of the dominant eigenfunctions of

L

. The idea is to apply classical Galerkin error estimates to a sequence of finite-dimensional subspaces

W^{h}

in

V_{ν}

, and exploit that these provide approximations to the (nonzero) eigenvalues of both

L

and

L^{ξ}

.

Proposition 2.

Denote the projection error of eigenfunction

ψ_{i}

with respect to the energy norm by

δ_{i} = {∥ P_{Q}^{⊥} ψ_{i} ∥}_{Q}

. The relative error between the i-th eigenvalues of

L^{ξ}

and

L

is bounded by

\frac{ω_{i} - κ_{i}}{ω_{i}} \leq [1 + max_{j = 1, \dots, i - 1} \frac{ω_{j}^{2} κ_{i}^{2}}{{(ω_{j} - κ_{i})}^{2}} {∥ (I - P_{Q}) T ∥}_{Q}^{2}] δ_{i}^{2} .

(37)

Proof.

We consider a sequence

W^{h}

of finite-dimensional subspaces in the reduced space

V_{ν}

. For every

W^{h}

, the

Q

-orthogonal projection from

V_{μ}

onto

W^{h}

is labeled by

P_{Q}^{h}

, and the corresponding

Q^{ξ}

-orthogonal projection in

V_{ν}

is called

P_{Q, ξ}^{h}

. We assume that all

W^{h}

contain the projections

P_{Q} ψ_{j}

for

j = 1 \dots, i

, and satisfy the following approximability condition in

V_{ν}

(which holds in any separable Hilbert space):

lim_{h \to 0} {∥ (I - P_{Q, ξ}^{h}) ϕ ∥}_{Q^{ξ}} = 0 \forall ϕ \in V_{ν} .

(38)

Note that the spaces

W^{h}

are just of auxiliary nature, their sole purpose being to reconcile the theory of [47]—that uses finite-dimensional approximation spaces—with our infinite-dimensional

V_{ν}

. There is no need to specify them in detail, and h only serves as a formal parameter here. Using (18), and by applying [47] (Theorem 3.2) to the approximation of

L

, we find that for any of the approximation spaces

W^{h}

:

\begin{matrix} \frac{ω_{i} - κ_{i}}{ω_{i}} \leq \frac{{\hat{ω}}_{i} - κ_{i}}{{\hat{ω}}_{i}} & \leq [1 + max_{j = 1, \dots, i - 1} \frac{{\hat{ω}}_{j}^{2} κ_{i}^{2}}{{({\hat{ω}}_{j} - κ_{i})}^{2}} {∥ (I - P_{Q}^{h}) T ∥}_{Q}^{2}] {∥ (I - P_{Q}^{h}) ψ_{i} ∥}_{Q}^{2} \\ = [1 + max_{j = 1, \dots, i - 1} \frac{{\hat{ω}}_{j}^{2} κ_{i}^{2}}{{({\hat{ω}}_{j} - κ_{i})}^{2}} {∥ (I - P_{Q}^{h}) T ∥}_{Q}^{2}] δ_{i}^{2}, \end{matrix}

where

{\hat{ω}}_{i}

are the Ritz values associated with

W^{h}

, see Section 2.4. The last equality holds because

P_{Q} ψ_{i} \in W^{h}

by assumption. It remains to study the pre-factor in the limit of

h \to 0

. We conclude from (38) that

{\hat{ω}}_{i} \to ω_{i}

as

h \to 0

, which already yields the first term in the pre-factor. Regarding the second term, we first observe that

P_{Q}^{h} = P_{Q, ξ}^{h} P_{Q}

, since for any

ψ \in V_{μ}

and

ϕ \in W^{h}

:

\begin{matrix} {〈ψ - P_{Q, ξ}^{h} P_{Q} ψ, ϕ〉}_{Q} & = {〈(P_{Q} + P_{Q}^{⊥}) ψ - P_{Q, ξ}^{h} P_{Q} ψ, ϕ〉}_{Q} \\ = {〈P_{Q} ψ - P_{Q, ξ}^{h} P_{Q} ψ, ϕ〉}_{Q} = {〈(I - P_{Q, ξ}^{h}) P_{Q} ψ, ϕ〉}_{Q^{ξ}} = 0 . \end{matrix}

The third equality is due to (36), while the last equality is due to the definition of

P_{Q, ξ}^{h}

. With this result, and using pointwise convergence (38) of

P_{Q, ξ}^{h}

to the identity in

V_{ν}

, we conclude for any

ψ \in V_{μ}

:

\begin{matrix} lim_{h \to 0} {∥ (I - P_{Q}^{h}) ψ - P_{Q}^{⊥} ψ ∥}_{Q} & = lim_{h \to 0} {∥ (P_{Q} - P_{Q}^{h}) ψ ∥}_{Q} \\ = lim_{h \to 0} {∥ (I - P_{Q, ξ}^{h}) P_{Q} ψ ∥}_{Q^{ξ}} = 0 . \end{matrix}

Combing this pointwise convergence of

I - P_{Q}^{h}

towards

P_{Q}^{⊥}

in

V_{μ}

, and compactness of the solution operator, we have

lim_{h \to 0} (I - P_{Q}^{h}) T = (I - P_{Q}) T

with respect to the operator norm. This establishes the second term in the pre-factor and hence the proposition. □

To make the bound from the previous result more accessible, we will bound the term involving the operator

T

.

Corollary 2.

In the setting of Proposition 2, the following error bound holds:

\frac{ω_{i} - κ_{i}}{ω_{i}} \leq [1 + \frac{κ_{i}^{2}}{κ_{1}^{2}} max_{j = 1, \dots, i - 1} \frac{ω_{j}^{2}}{{(ω_{j} - κ_{i})}^{2}}] δ_{i}^{2} .

(39)

Proof.

As already mentioned, it follows from (35) that

T ψ_{i} = \frac{1}{κ_{i}} ψ_{i} \forall i \geq 1 .

Since the

ψ_{i}

are

Q

-orthogonal, it follows that

{∥ T ∥}_{Q} = \frac{1}{κ_{1}}

. Furthermore, since

P_{Q}

is a

Q

-orthogonal projection, so is

I - P_{Q}

, and thus both have

Q

-norm at most one. It follows that

∥ (I - P_{Q}) {T ∥}_{Q} \leq \frac{1}{κ_{1}}

, and hence the claim. □

Remark 2.

Due to the equivalence of the norms

{∥ \cdot ∥}_{Q}

and

{∥ \cdot ∥}_{1}

, the bound in Proposition 2 also applies to the latter norm. Let

P_{1}

denote the orthogonal projection onto

V_{ν}

with respect to the Dirichlet norm. Then,

∥ P_{Q}^{⊥} ψ_{i} ∥_{Q} \leq ∥ P_{1}^{⊥} ψ_{i} ∥_{Q} \leq C {∥ P_{1}^{⊥} ψ_{i} ∥}_{1} .

6.5. Comments

The bound from Corollary 2 allows some interpretation. Note that, in (39), the ratio of the eigenvalues

κ_{1}, κ_{i}

of

L

and the relative difference of eigenvalues

κ_{i}

and

ω_{j}

of

L

and

L^{ξ}

(respectively) play a role. Let us fix i and assume that the first i timescales are comparable, meaning that

κ_{i} / κ_{1}

is a moderate number. Let us further assume that the squared projection errors

δ_{ℓ}

are all much smaller than the relative eigenvalue gaps, i.e.,

δ_{ℓ}^{2} ≪ min \{1, \frac{κ_{j} - κ_{j - 1}}{κ_{j}}\} \forall j, ℓ = 1, \dots, i .

Then, inductively from

j = 1

to

j = i

, it follows that

(ω_{j} - κ_{j}) / ω_{j} \approx O (δ_{j}^{2})

, since we find that

ω_{j} \approx κ_{j}

, thus the relative differences between

κ_{ℓ}

and

ω_{j}

are large for

j \neq ℓ

, which makes the term in the square brackets on the right-hand side of (39) moderate. Hence, in this situation, the relative error of the i-th timescale’s approximation is effectively governed by the projection error

δ_{i}^{2}

.

We also note that the bound assumed in [18] (Theorem 2) implies our bound up to another multiplicative constant:

Lemma 1.

Let us consider the diffusion (1), satisfying (3), on a bounded domain with smooth boundary and reflecting boundary conditions. If

∥ L P^{⊥} ψ_{i} ∥_{L_{μ}^{2}} \leq δ_{i}

, then there is a

C > 0

such that the assumptions of Proposition 2 are satisfied with

\begin{matrix} ∥ P_{Q}^{⊥} ψ_{i} ∥_{Q} & \leq & C δ_{i} . \end{matrix}

(40)

Proof.

We use the weighted Sobolev spaces

H_{μ}^{k}

[48]. Since the spatial domain is bounded and

μ

is smooth, it is a weight in Muckenhoupt class, cf. [48] (Equation (1.2)). The regularity conditions [48] (Equations (2.2)–(2.4)) are satisfied by assumption. Now, the result follows directly from the weighted Agmon–Douglis–Nierenberg estimate [48] (Theorem 2.4) giving

{∥ u ∥}_{H_{μ}^{2}} \leq C {∥ L u ∥}_{L_{μ}^{2}}

for some

C > 0

independent of u. Using assumption 1 and (3), we have

\begin{matrix} ∥ P_{Q}^{⊥} ψ_{i} ∥_{Q} & \leq ∥ P^{⊥} ψ_{i} ∥_{Q} \leq C ∥ P^{⊥} ψ_{i} ∥_{H_{μ}^{1}} \leq C {∥ P^{⊥} ψ_{i} ∥}_{H_{μ}^{2}} \\ \leq C ∥ L P^{⊥} ψ_{i} ∥_{L_{μ}^{2}} \leq C δ_{i} . \end{matrix}

concluding the proof. □

7. Conclusions

We have investigated the approximation of high-dimensional diffusion processes by effective dynamics defined on the lower-dimensional space of reduced variables. For reversible diffusions, a new relative error bound for the approximation of eigenvalues of the infinitesimal generator by the eigenvalues of a reduced generator was proved. Our bound shows that a small projection error of the corresponding generator eigenfunctions, measured by the energy norm, is sufficient for small eigenvalue errors.

In addition, we have presented numerical examples regarding the data-driven estimation of the eigenvalues of projected generators by means of the gEDMD method. If the full system parameters are unknown, they need to be approximated, for example using Kramers–Moyal formulae. We have presented numerical examples that, for reversible systems, and good reaction coordinates, the resulting spectral estimates seem to remain stable across a long range of time windows in the KM-estimators.

Finally, we have suggested a strategy to define meaningful effective equations for underdamped Langevin dynamics on a subset of its position space. Exploiting the overdamped limit, and using KM estimators at large time windows, we can effectively model a projected overdamped process on the same domain. Numerical examples have confirmed the feasibility of this approach.

Future work will focus on providing a theoretical foundation for the observations stated in Conjecture 1. In addition, the relation between the positional coordinate of an (underdamped) Langevin process on long timescales, and the corresponding overdamped Langevin equation (cf. Corollary 1(i)), needs to be analyzed in more detail.

Author Contributions

Conceptualization, F.N., P.K., L.B. and C.C.; methodology, F.N., P.K., L.B. and C.C.; software, F.N. and L.B.; validation, F.N. and L.B.; formal analysis, F.N. and P.K.; funding acquisition; P.K. and C.C.; writing—original draft preparation, F.N., P.K., L.B. and C.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Rice University Academy of Fellows (F.N.), by Deutsche Forschungsgemeinschaft (DFG) through CRC 1114 “Scaling Cascades in Complex Systems”, Project Number 235221301, project A01 (P.K.), by the Einstein Foundation Berlin (C.C.), by the National Science Foundation (Nos. CHE-1265929, CHE-1738990, and PHY-1427654) and the Welch Foundation (No. C-1570) (F.N., L.B., C.C.).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

We would like to thank Ralf Banisch, Stefan Klus, Thomas Swinburne, Tony Lelièvre, and Frank Noé. We are also indebted to the Institute for Pure and Applied Mathematics (IPAM) for hosting the long program “Complex High-Dimensional Energy Landscapes”.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A. Lemon Slice Potential: Effective Drift and Diffusion Coefficients

Here, we calculate the effective drift and diffusion if the lemon slice potential Equation (27) is projected onto the polar angle

ξ (x, y) = φ (x, y)

. We start by expressing the generator

L

in polar coordinates. The Laplacian operator in polar coordinates is

Δ f = \frac{\partial^{2} f}{\partial r^{2}} + \frac{1}{r^{2}} \frac{\partial^{2} f}{\partial φ^{2}} + \frac{1}{r} \frac{\partial f}{\partial r} .

Moreover, it follows from the chain rule and the definition of

ξ

that for any function f,

\begin{matrix} \frac{\partial f}{\partial x} & = cos φ \frac{\partial f}{\partial r} - \frac{sin φ}{r} \frac{\partial f}{\partial φ}, \\ \frac{\partial f}{\partial y} & = sin φ \frac{\partial f}{\partial r} + \frac{cos φ}{r} \frac{\partial f}{\partial φ} . \end{matrix}

The generator in polar coordinates then becomes

L = - [\frac{\partial V}{\partial r} \frac{\partial}{\partial r} + \frac{1}{r^{2}} \frac{\partial V}{\partial φ} \frac{\partial}{\partial φ}] + \frac{1}{β} [\frac{\partial^{2}}{\partial r^{2}} + \frac{1}{r^{2}} \frac{\partial^{2}}{\partial φ^{2}} + \frac{1}{r} \frac{\partial}{\partial r}] .

Applying the generator to the reaction coordinate

ξ

, only one of the terms above is non-zero, resulting in

L ξ = - \frac{1}{r^{2}} \frac{\partial V}{\partial φ} = \frac{1}{r^{2}} [4 sin (4 φ) - \frac{sin (0.5 φ)}{2 {cos}^{2} (0.5 φ)}] .

In order to evaluate the first term in Equation (15), we note that the Jacobian determinant of

ξ

is

J (x, y) = \frac{1}{r^{2}}

, and that the stationary distribution factors, canceling the

φ

-dependent term. We are left with

\begin{matrix} b^{ξ} (φ) & = \frac{1}{ϑ (φ)} \int_{0}^{\infty} \frac{1}{r^{2}} [4 sin (4 φ) - \frac{sin (0.5 φ)}{2 {cos}^{2} (0.5 φ)}] r μ (r, φ) d r \\ = \frac{1}{C_{2}} [4 sin (4 φ) - \frac{sin (0.5 φ)}{2 {cos}^{2} (0.5 φ)}] \int_{0}^{\infty} \frac{1}{r} exp (- 10 {(r - 1)}^{2} - \frac{1}{r}) d r \\ = \frac{C_{1}}{C_{2}} [4 sin (4 φ) - \frac{sin (0.5 φ)}{2 {cos}^{2} (0.5 φ)}], \end{matrix}

with the definitions

C_{1} : = \int_{0}^{\infty} \frac{1}{r} exp (- 10 {(r - 1)}^{2} - \frac{1}{r}) d r

and

C_{2} : = \int_{0}^{\infty} r exp (- 10 {(r - 1)}^{2} - \frac{1}{r}) d r

. For the diffusion, we obtain

\begin{matrix} \nabla ξ^{T} a \nabla ξ & = 2 [{(\frac{\partial ξ}{\partial x})}^{2} + {(\frac{\partial ξ}{\partial y})}^{2}] \\ = 2 J (x, y) = \frac{2}{r^{2}} . \end{matrix}

Inserting this into the second term in (15), and using the same arguments as before, we end up with

a^{ξ} (φ) = \frac{2 C_{1}}{C_{2}} .

Appendix B. Parameters of Prototypical Molecular Example

The system we considered in Section 4.3 is defined as follows: denote the three-dimensional position vectors of atoms one through five by

r_{i}

,

1 \leq i \leq 5

. We denote the pairwise distances between atoms

i, j

by

d_{i, j}

, the bond angle formed by atoms

i, j, k

by

θ_{i, j, k}

, and the dihedral angle formed by atoms

i, j, k, l

by

ϕ_{i, j, k, l}

. The potential energy is then defined as a superposition of harmonic bond terms, harmonic bond angle terms, and dihedral angle terms:

\begin{matrix} V (r_{1}, \dots, r_{5}) & = \sum_{i = 1}^{4} \frac{1}{2} k_{b} {(d_{i, i + 1} - d^{0})}^{2} + \\ + \sum_{i = 1}^{3} \frac{1}{2} k_{θ} {(θ_{i, i + 1, i + 2} - θ^{0})}^{2} \\ + \sum_{i = 1}^{2} k_{ϕ} (1 - cos (n_{ϕ}^{i} ϕ_{i, i + 1, i + 2, i + 3})) . \end{matrix}

(A1)

The constants appearing above are set to

\begin{matrix} k_{b} & = 1, & d^{0} & = 2, & k_{θ} & = 1, & θ^{0} & = \frac{π}{2}, & k_{ϕ} & = 0.02, & n_{ϕ}^{1} & = 3, & n_{ϕ}^{2} & = 2 . \end{matrix}

The overdamped Langevin dynamics (2) with potential (A1) are simulated at inverse temperature

β = 15

, for

10^{7}

discrete steps at integration time step

Δ_{t} = 10^{- 3}

. For the purpose of our analysis, we retain only every fifth step of this trajectory.

References

Mori, H. Transport, collective motion, and Brownian motion. Prog. Theor. Phys. 1965, 33, 423–455. [Google Scholar] [CrossRef]
Zwanzig, R. Nonlinear generalized Langevin equations. J. Stat. Phys. 1973, 9, 215–220. [Google Scholar] [CrossRef]
Chorin, A.J.; Hald, O.H.; Kupferman, R. Optimal prediction and the Mori–Zwanzig representation of irreversible processes. Proc. Natl. Acad. Sci. USA 2000, 97, 2968–2973. [Google Scholar] [CrossRef] [PubMed]
Chorin, A.J.; Hald, O.H.; Kupferman, R. Optimal prediction with memory. Physica D 2002, 166, 239–257. [Google Scholar] [CrossRef]
Hijón, C.; Español, P.; Vanden-Eijnden, E.; Delgado-Buscalioni, R. Mori–Zwanzig formalism as a practical computational tool. Faraday Discuss. 2009, 144, 301–322. [Google Scholar] [CrossRef]
Pavliotis, G.; Stuart, A. Multiscale Methods: Averaging and Homogenization; Springer Science & Business Media: Berlin, Germany, 2008. [Google Scholar]
Pavliotis, G.A.; Stuart, A.M. Parameter estimation for multiscale diffusions. J. Stat. Phys. 2007, 127, 741–781. [Google Scholar] [CrossRef]
Clementi, C. Coarse-grained models of protein folding: Tol-models or predictive tools? Curr. Opin. Struct. Biol. 2008, 18, 10–15. [Google Scholar] [CrossRef]
Noé, F.; Clementi, C. Collective variables for the study of long-time kinetics from molecular trajectories: Theory and methods. Curr. Opin. Struct. Biol. 2017, 43, 141–147. [Google Scholar] [CrossRef]
Noid, W. Perspective: Coarse-grained models for biomolecular systems. J. Phys. Chem. 2013, 139, 090901. [Google Scholar] [CrossRef]
Prinz, J.H.; Wu, H.; Sarich, M.; Keller, B.; Senne, M.; Held, M.; Chodera, J.D.; Schütte, C.; Noé, F. Markov models of molecular kinetics: Generation and Validation. J. Chem. Phys. 2011, 134, 174105. [Google Scholar] [CrossRef]
Rohrdanz, M.A.; Zheng, W.; Maggioni, M.; Clementi, C. Determination of reaction coordinates via locally scaled diffusion map. J. Chem. Phys. 2011, 134, 124116. [Google Scholar] [CrossRef] [PubMed]
Saunders, M.G.; Voth, G.A. Coarse-Graining Methods for Computational Biology. Annu. Rev. Biophys. 2013, 42, 73–93. [Google Scholar] [CrossRef] [PubMed]
Weinan, E.; Vanden-Eijnden, E. Metastability, conformation dynamics, and transition pathways in complex systems. In Multiscale Modelling and Simulation; Springer: Berlin, Germany, 2004; pp. 35–68. [Google Scholar]
Legoll, F.; Lelièvre, T. Effective dynamics using conditional expectations. Nonlinearity 2010, 23, 2131. [Google Scholar] [CrossRef]
Froyland, G.; Gottwald, G.A.; Hammerlindl, A. A trajectory-free framework for analysing multiscale systems. Phys. D 2016, 328, 34–43. [Google Scholar] [CrossRef]
Zhang, W.; Hartmann, C.; Schutte, C. Effective dynamics along given reaction coordinates, and reaction rate theory. Faraday Discuss. 2016, 195, 365–394. [Google Scholar] [CrossRef]
Zhang, W.; Schütte, C. Reliable Approximation of Long Relaxation Timescales in Molecular Dynamics. Entropy 2017, 19, 367. [Google Scholar] [CrossRef]
Legoll, F.; Lelièvre, T.; Olla, S. Pathwise estimates for an effective dynamics. Stoch. Process. Appl. 2017, 127, 2841–2863. [Google Scholar] [CrossRef]
Lelièvre, T.; Zhang, W. Pathwise estimates for effective dynamics: The case of nonlinear vectorial reaction coordinates. Multiscale Model. Simul. 2019, 17, 1019–1051. [Google Scholar] [CrossRef]
Schütte, C.; Fischer, A.; Huisinga, W.; Deuflhard, P. A Direct Approach to Conformational Dynamics Based on Hybrid Monte Carlo. J. Comput. Phys. 1999, 151, 146–168. [Google Scholar] [CrossRef]
Dellnitz, M.; Junge, O. On the Approximation of Complicated Dynamical Behavior. SIAM J. Numer. Anal. 1999, 36, 491–515. [Google Scholar] [CrossRef]
Noé, F.; Nüske, F. A variational approach to modeling slow processes in stochastic dynamical systems. Multiscale Model. Simul. 2013, 11, 635–655. [Google Scholar] [CrossRef]
Williams, M.O.; Kevrekidis, I.G.; Rowley, C.W. A Data-Driven Approximation of the Koopman Operator: Extending Dynamic Mode Decomposition. J. Nonlinear Sci. 2015, 25, 1307–1346. [Google Scholar] [CrossRef]
Mardt, A.; Pasquali, L.; Wu, H.; Noé, F. VAMPnets for deep learning of molecular kinetics. Nat. Commun. 2018, 9, 1–11. [Google Scholar] [CrossRef] [PubMed]
Klus, S.; Nüske, F.; Koltai, P.; Wu, H.; Kevrekidis, I.; Schütte, C.; Noé, F. Data-Driven Model Reduction and Transfer Operator Approximation. J. Nonlinear Sci. 2018, 28, 985–1010. [Google Scholar] [CrossRef]
Wu, H.; Noé, F. Variational approach for learning Markov processes from time series data. J. Nonlinear Sci. 2020, 30, 23–66. [Google Scholar] [CrossRef]
Klus, S.; Nüske, F.; Peitz, S.; Niemann, J.H.; Clementi, C.; Schütte, C. Data-driven approximation of the Koopman generator: Model reduction, system identification, and control. Phys. D Nonlinear Phenom. 2020, 406, 132416. [Google Scholar] [CrossRef]
Kessler, M.; Lindner, A.; Sorensen, M. Statistical Methods for Stochastic Differential Equations; CRC Press: Boca Raton, FL, USA, 2012. [Google Scholar]
Gobet, E.; Hoffmann, M.; Reiß, M. Nonparametric estimation of scalar diffusions based on low frequency data. Ann. Stat. 2004, 32, 2223–2253. [Google Scholar]
Crommelin, D.; Vanden-Eijnden, E. Diffusion Estimation from Multiscale Data by Operator Eigenpairs. Multiscale Model. Simul. 2011, 9, 1588–1623. [Google Scholar] [CrossRef]
Zhang, L.; Mykland, P.A.; Aït-Sahalia, Y. A tale of two time scales: Determining integrated volatility with noisy high-frequency data. J. Am. Stat. Assoc. 2005, 100, 1394–1411. [Google Scholar] [CrossRef]
Bittracher, A.; Hartmann, C.; Junge, O.; Koltai, P. Pseudo generators for under-resolved molecular dynamics. Eur. Phys. J. Spec. Top. 2015, 224, 2463–2490. [Google Scholar] [CrossRef][Green Version]
Bittracher, A.; Koltai, P.; Junge, O. Pseudogenerators of spatial transfer operators. SIAM J. Appl. Dyn. Syst. 2015, 14, 1478–1517. [Google Scholar] [CrossRef][Green Version]
Duong, M.H.; Lamacz, A.; Peletier, M.A.; Schlichting, A.; Sharma, U. Quantification of coarse-graining error in Langevin and overdamped Langevin dynamics. Nonlinearity 2018, 31, 4517. [Google Scholar] [CrossRef]
Bakry, D.; Gentil, I.; Ledoux, M. Analysis and Geometry of Markov Diffusion Operators; Springer Science & Business Media: Berlin, Germany, 2013; Volume 348. [Google Scholar]
Davies, E.B. Metastable states of symmetric Markov semigroups II. J. Lond. Math. Soc. 1982, 2, 541–556. [Google Scholar] [CrossRef]
Pazy, A. Semigroups of Linear Operators and Applications to Partial Differential Equations; Springer: New York, NY, USA; Berlin, Germany, 1983. [Google Scholar]
Davies, E.B. Metastable states of symmetric Markov semigroups I. Proc. Lond. Math. Soc. 1982, 45, 133–150. [Google Scholar] [CrossRef]
Deuflhard, P.; Huisinga, W.; Fischer, A.; Schütte, C. Identification of almost invariant aggregates in reversible nearly uncoupled Markov chains. Linear Algebra Appl. 2000, 315, 39–59. [Google Scholar] [CrossRef]
Risken, H.; Haken, H. The Fokker–Planck Equation: Methods of Solution and Applications, 2nd ed.; Springer: Berlin, Germany, 1989. [Google Scholar]
Deuflhard, P.; Weber, M. Robust Perron cluster analysis in conformation dynamics. Linear Algebra Appl. 2005, 398, 161–184. [Google Scholar] [CrossRef]
Schütte, C. Conformational Dynamics: Modelling, Theory, Algorithm, and Application to Biomolecules. Available online: https://opus4.kobv.de/opus4-zib/frontdoor/index/index/docId/406 (accessed on 20 January 2021).
Nüske, F.; Wu, H.; Prinz, J.H.; Wehmeyer, C.; Clementi, C.; Noé, F. Markov state models from short non-equilibrium simulations—Analysis and correction of estimation bias. J. Chem. Phys. 2017, 146, 094104. [Google Scholar] [CrossRef]
Wang, J.; Olsson, S.; Wehmeyer, C.; Pérez, A.; Charron, N.E.; De Fabritiis, G.; Noé, F.; Clementi, C. Machine learning of coarse-grained molecular dynamics force fields. ACS Cent. Sci. 2019, 5, 755–767. [Google Scholar] [CrossRef]
Lindorff-Larsen, K.; Piana, S.; Palmo, K.; Maragakis, P.; Klepeis, J.L.; Dror, R.O.; Shaw, D.E. Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins Struct. Funct. Bioinform. 2010, 78, 1950–1958. [Google Scholar] [CrossRef]
Knyazev, A.V.; Osborn, J.E. New a priori FEM error estimates for eigenvalues. SIAM J. Numer. Anal. 2006, 43, 2647–2667. [Google Scholar] [CrossRef][Green Version]
Cejas, M.E.; Durán, R.G. Weighted a priori estimates for elliptic equations. arXiv 2017, arXiv:1711.00879. [Google Scholar] [CrossRef]

Figure 1. Illustration of Proposition 2 by means of a two-dimensional Ornstein–Uhlenbeck process, and one-dimensional reaction coordinates

ξ^{m} (x, y) = x + 0.1 sin (m y)

. (A) selected level sets

ξ^{- 1} (z)

for

m = 0

(blue) and

m = 10

(green), with a contour of the potential in the background; (B) red:

L_{μ}^{2}

-error between exact first eigenfunction

ψ_{1, 0} = x

, and the approximate slowest eigenfunction

{\hat{ψ}}_{1}^{m}

, computed by Galerkin projection onto the space of the first ten Hermite polynomials

ψ_{i} (z)

, where z is the reaction coordinate

ξ^{m}

. Green: the same error, but measured using the energy norm. Black: Relative eigenvalue error

E_{1}^{m}

corresponding to the same approximation. The vertical axis is labeled by the decadic logarithm; (C) contour of the exact slowest eigenfunction

ψ_{1, 0} = x

; (D) contour of the approximate slowest eigenfunction

{\hat{ψ}}_{1}^{m}

for

m = 10

.

Figure 1. Illustration of Proposition 2 by means of a two-dimensional Ornstein–Uhlenbeck process, and one-dimensional reaction coordinates

ξ^{m} (x, y) = x + 0.1 sin (m y)

. (A) selected level sets

ξ^{- 1} (z)

for

m = 0

(blue) and

m = 10

(green), with a contour of the potential in the background; (B) red:

L_{μ}^{2}

-error between exact first eigenfunction

ψ_{1, 0} = x

, and the approximate slowest eigenfunction

{\hat{ψ}}_{1}^{m}

, computed by Galerkin projection onto the space of the first ten Hermite polynomials

ψ_{i} (z)

, where z is the reaction coordinate

ξ^{m}

. Green: the same error, but measured using the energy norm. Black: Relative eigenvalue error

E_{1}^{m}

corresponding to the same approximation. The vertical axis is labeled by the decadic logarithm; (C) contour of the exact slowest eigenfunction

ψ_{1, 0} = x

; (D) contour of the approximate slowest eigenfunction

{\hat{ψ}}_{1}^{m}

for

m = 10

.

Figure 2. Contour plot of the lemon slice potential Equation (27).

Figure 3. Analysis of effective dynamics on the polar angle for the lemon slice potential. (A) numerical estimates of effective drift for different values of the offset s, compared to the reference in black; (B) the same for the effective diffusion; (C) implied timescales

{\hat{t}}_{1}^{s}, {\hat{t}}_{2}^{s}, {\hat{t}}_{3}^{s}

extracted from gEDMD models using KM formulae at various offsets s (solid lines), compared to the results of applying gEDMD with full system parameters (dashed black lines). We also show the relative errors

E_{i}^{s}

(26) for all three timescales (thin dashed lines, scale on the right, labeled by decadic logarithm); (D) four metastable membership functions generated by the PCCA method, extracted from a gEDMD model at offset

s = 0.1

(green) and using exact system parameters (black).

Figure 3. Analysis of effective dynamics on the polar angle for the lemon slice potential. (A) numerical estimates of effective drift for different values of the offset s, compared to the reference in black; (B) the same for the effective diffusion; (C) implied timescales

{\hat{t}}_{1}^{s}, {\hat{t}}_{2}^{s}, {\hat{t}}_{3}^{s}

extracted from gEDMD models using KM formulae at various offsets s (solid lines), compared to the results of applying gEDMD with full system parameters (dashed black lines). We also show the relative errors

E_{i}^{s}

(26) for all three timescales (thin dashed lines, scale on the right, labeled by decadic logarithm); (D) four metastable membership functions generated by the PCCA method, extracted from a gEDMD model at offset

s = 0.1

(green) and using exact system parameters (black).

Figure 4. Analysis of the effective dynamics of a prototypical five atom molecular system in the space of its dihedral angles

ϕ_{1}, ϕ_{2}

. (A) effective free energy in dihedral angle plane; (B) decomposition into six metastable states based on PCCA analysis of a gEDMD model with exact system parameters. Gray dots represent transition states where none of the memberships

χ_{j}

exceeds

0.6

. (C) first five implied timescales

{\hat{t}}_{i}^{s}

extracted from gEDMD models with KM estimators at various offsets s (solid lines), compared to the gEDMD model with exact system parameters (dashed black lines). Error bars were computed by bootstrapping. We also show the mean relative error

E_{i}^{s}

given in (26) (thin dashed lines, scale on the right, labeled by decadic logarithm); (D) decomposition into six metastable states based on PCCA analysis of a gEDMD model at offset

s = 1.0

.

Figure 4. Analysis of the effective dynamics of a prototypical five atom molecular system in the space of its dihedral angles

ϕ_{1}, ϕ_{2}

. (A) effective free energy in dihedral angle plane; (B) decomposition into six metastable states based on PCCA analysis of a gEDMD model with exact system parameters. Gray dots represent transition states where none of the memberships

χ_{j}

exceeds

0.6

. (C) first five implied timescales

{\hat{t}}_{i}^{s}

extracted from gEDMD models with KM estimators at various offsets s (solid lines), compared to the gEDMD model with exact system parameters (dashed black lines). Error bars were computed by bootstrapping. We also show the mean relative error

E_{i}^{s}

given in (26) (thin dashed lines, scale on the right, labeled by decadic logarithm); (D) decomposition into six metastable states based on PCCA analysis of a gEDMD model at offset

s = 1.0

.

Figure 5. Two-dimensional model potential (32).

Figure 6. Analysis of effective dynamics on the x-coordinate of the two-dimensional toy potential Equation (32). (A) numerical estimates of effective drift for different values of the offset s using both OL data (dots) and UL data (crosses); (B) the same for the effective diffusion; (C) leading implied timescale

{\hat{t}}_{1}^{s}

obtained from gEDMD models of the projected OL data (dots) and UL data (crosses), as a function of s, compared to the reference value in black. The reference was extracted from a gEDMD model using exact system parameters. We also show the relative error

E_{i}^{s}

for both data sets (thin dashed lines, scale on the right, labeled by decadic logarithm). The vertical gray line indicates the critical relaxation time

\frac{1}{γ}

; (D) PCCA memberships extracted from gEDMD models at offset

s = 0.5

for both data sets, compared to the reference gEDMD model in black.

Figure 6. Analysis of effective dynamics on the x-coordinate of the two-dimensional toy potential Equation (32). (A) numerical estimates of effective drift for different values of the offset s using both OL data (dots) and UL data (crosses); (B) the same for the effective diffusion; (C) leading implied timescale

{\hat{t}}_{1}^{s}

obtained from gEDMD models of the projected OL data (dots) and UL data (crosses), as a function of s, compared to the reference value in black. The reference was extracted from a gEDMD model using exact system parameters. We also show the relative error

E_{i}^{s}

for both data sets (thin dashed lines, scale on the right, labeled by decadic logarithm). The vertical gray line indicates the critical relaxation time

\frac{1}{γ}

; (D) PCCA memberships extracted from gEDMD models at offset

s = 0.5

for both data sets, compared to the reference gEDMD model in black.

Figure 7. Analysis of effective dynamics for alanine dipeptide in the space of its backbone dihedral angles

ϕ, ψ

. (A) effective free energy of the original simulation data in the

ϕ - ψ

-plane. Metastable states correspond to the two deep minima on the left, and the shallow minimum on the right; (B) slowest two timescales

{\hat{t}}_{1}^{s}, {\hat{t}}_{2}^{s}

computed by gEDMD models at various offsets s, compared to the reference values in black. Error bars were estimated by bootstrapping. We also show the mean relative error

E_{i}^{s}

(26) (thin dashed lines, scale on the right, labeled by decadic logarithm); (C) metastable decomposition into three states determined by applying PCCA to the eigenfunctions of the gEDMD model at

s = 5 ps

. Gray dots represent transition states where none of the memberships

χ_{j}

exceeds 0.6. (D) The same for

s = 50 ps

, but using only two states.

Figure 7. Analysis of effective dynamics for alanine dipeptide in the space of its backbone dihedral angles

ϕ, ψ

. (A) effective free energy of the original simulation data in the

ϕ - ψ

-plane. Metastable states correspond to the two deep minima on the left, and the shallow minimum on the right; (B) slowest two timescales

{\hat{t}}_{1}^{s}, {\hat{t}}_{2}^{s}

computed by gEDMD models at various offsets s, compared to the reference values in black. Error bars were estimated by bootstrapping. We also show the mean relative error

E_{i}^{s}

(26) (thin dashed lines, scale on the right, labeled by decadic logarithm); (C) metastable decomposition into three states determined by applying PCCA to the eigenfunctions of the gEDMD model at

s = 5 ps

. Gray dots represent transition states where none of the memberships

χ_{j}

exceeds 0.6. (D) The same for

s = 50 ps

, but using only two states.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Nüske, F.; Koltai, P.; Boninsegna, L.; Clementi, C. Spectral Properties of Effective Dynamics from Conditional Expectations. Entropy 2021, 23, 134. https://doi.org/10.3390/e23020134

AMA Style

Nüske F, Koltai P, Boninsegna L, Clementi C. Spectral Properties of Effective Dynamics from Conditional Expectations. Entropy. 2021; 23(2):134. https://doi.org/10.3390/e23020134

Chicago/Turabian Style

Nüske, Feliks, Péter Koltai, Lorenzo Boninsegna, and Cecilia Clementi. 2021. "Spectral Properties of Effective Dynamics from Conditional Expectations" Entropy 23, no. 2: 134. https://doi.org/10.3390/e23020134

APA Style

Nüske, F., Koltai, P., Boninsegna, L., & Clementi, C. (2021). Spectral Properties of Effective Dynamics from Conditional Expectations. Entropy, 23(2), 134. https://doi.org/10.3390/e23020134

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Spectral Properties of Effective Dynamics from Conditional Expectations

Abstract

1. Introduction

2. Concepts

2.1. SDEs and Generators

2.2. Spectral Decomposition

2.3. Dimensionality Reduction

2.4. Galerkin Approximation

2.5. Kramers–Moyal Estimators

3. Spectral Properties of the Projected Generator

3.1. Summary of Spectral Properties

3.2. Illustration of the Error Bound

4. Spectral Properties and Kramers–Moyal Estimators

4.1. Methods

4.2. Lemon Slice Potential

4.3. Prototypical Molecular Potential

4.4. Summary of Observations

5. Underdamped Langevin Dynamics

5.1. Projection and Re-Scaling of the Underdamped Process

5.2. Langevin Toy Model

5.3. Alanine Dipeptide

6. Precise Statements on Spectral Properties and Their Proofs

6.1. Form Domain

6.2. Solution Operator and Discrete Spectrum

6.3. Coarse Grained Generator and Its Spectrum

6.4. Approximation Result

6.5. Comments

7. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

Appendix A. Lemon Slice Potential: Effective Drift and Diffusion Coefficients

Appendix B. Parameters of Prototypical Molecular Example

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI