# Comparing Markov Chain Samplers for Molecular Simulation


## Abstract


## 1. Summary

## 2. Preliminaries

- B: ${\mathbf{P}}_{n}^{\prime}={\mathbf{P}}_{n-1}+\frac{1}{2}\mathsf{\Delta}t\mathbf{F}\left({\mathbf{Q}}_{n-1}\right)$,
- A: ${\mathbf{Q}}_{n}^{\prime}={\mathbf{Q}}_{n-1}+\frac{1}{2}\mathsf{\Delta}t{M}^{-1}{\mathbf{P}}_{n}^{\prime}$,
- O: ${\mathbf{P}}_{n}^{\prime\prime}=\exp(-\gamma \mathsf{\Delta}t){\mathbf{P}}_{n}^{\prime}+\sqrt{1-\exp(-2\gamma \mathsf{\Delta}t)}\,{\beta}^{-1/2}{M}_{1/2}{\mathbf{R}}_{n}$,
- A: ${\mathbf{Q}}_{n}={\mathbf{Q}}_{n}^{\prime}+\frac{1}{2}\mathsf{\Delta}t{M}^{-1}{\mathbf{P}}_{n}^{\prime\prime}$,
- B: ${\mathbf{P}}_{n}={\mathbf{P}}_{n}^{\prime\prime}+\frac{1}{2}\mathsf{\Delta}t\mathbf{F}\left({\mathbf{Q}}_{n}\right)$,
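The five substeps listed above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes a diagonal mass matrix stored as a 1-D array (so that ${M}_{1/2}$ reduces to an elementwise square root), and the function name `baoab_step` and the harmonic test problem are hypothetical choices made for the example.

```python
import numpy as np


def baoab_step(q, p, force, dt, gamma, beta, mass, rng):
    """Advance (q, p) by one BAOAB step.

    Assumption: the mass matrix is diagonal and passed as a 1-D array,
    so M^{-1} p is p / mass and M_{1/2} R is sqrt(mass) * R.
    The force is evaluated twice per step here for clarity only.
    """
    p = p + 0.5 * dt * force(q)              # B: half kick
    q = q + 0.5 * dt * p / mass              # A: half drift
    c = np.exp(-gamma * dt)                  # O: damping factor exp(-gamma*dt)
    r = rng.standard_normal(p.shape)         # R_n, standard Gaussian vector
    p = c * p + np.sqrt((1.0 - c * c) / beta) * np.sqrt(mass) * r
    q = q + 0.5 * dt * p / mass              # A: half drift
    p = p + 0.5 * dt * force(q)              # B: half kick
    return q, p


# Hypothetical test problem: 1-D harmonic oscillator, unit mass and frequency.
rng = np.random.default_rng(0)
mass = np.array([1.0])
q, p = np.array([1.0]), np.array([0.0])
samples = []
for _ in range(20000):
    q, p = baoab_step(q, p, lambda x: -x, dt=0.1, gamma=1.0, beta=1.0,
                      mass=mass, rng=rng)
    samples.append(q[0])
print("sample variance of q:", np.var(samples))  # should be near 1/beta
```

Setting `gamma=0.0` switches off the O substep (the damping factor becomes 1 and the noise amplitude 0), so the two A half-drifts merge and the step reduces to velocity Verlet, which gives a convenient deterministic check.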

- ${\mathbf{Q}}_{n}={\mathbf{Q}}_{n}^{\prime}+\frac{1}{2}\sqrt{2\delta t}{\beta}^{-1/2}{M}_{1/2}^{-\mathsf{T}}{\mathbf{R}}_{n}$,
- ${\mathbf{Q}}_{n+1}^{\prime}={\mathbf{Q}}_{n}+\delta t{M}^{-1}\mathbf{F}\left({\mathbf{Q}}_{n}\right)+\frac{1}{2}\sqrt{2\delta t}{\beta}^{-1/2}{M}_{1/2}^{-\mathsf{T}}{\mathbf{R}}_{n}$,
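The two-line update above reuses the same random vector ${\mathbf{R}}_{n}$ in both lines, which is easy to overlook. The sketch below makes that explicit. It assumes a diagonal mass matrix (so ${M}_{1/2}^{-\mathsf{T}}$ becomes division by the elementwise square root of the masses); the name `brownian_step` and the harmonic test problem are illustrative assumptions, not part of the source.

```python
import numpy as np


def brownian_step(q_prime, force, dt, beta, mass, rng):
    """One step of the two-line Brownian update.

    Assumption: diagonal mass matrix given as a 1-D array.
    Returns the sample point Q_n and the intermediate value Q'_{n+1}.
    """
    r = rng.standard_normal(q_prime.shape)                 # R_n
    half_noise = 0.5 * np.sqrt(2.0 * dt / beta) / np.sqrt(mass) * r
    q = q_prime + half_noise                               # Q_n
    q_prime_next = q + dt * force(q) / mass + half_noise   # Q'_{n+1}
    return q, q_prime_next


# Hypothetical test problem: 1-D harmonic potential with unit mass.
rng = np.random.default_rng(0)
mass = np.array([1.0])
q_prime = np.array([0.0])
samples = []
for _ in range(20000):
    q, q_prime = brownian_step(q_prime, lambda x: -x, dt=0.1, beta=1.0,
                               mass=mass, rng=rng)
    samples.append(q[0])
print("sample variance of Q_n:", np.var(samples))  # should be near 1/beta
```

Only the values `q` (the ${\mathbf{Q}}_{n}$) are collected as samples; `q_prime` is carried between steps as internal state.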

#### 2.1. Estimating Integrated Autocorrelation Time

The `acor` software for estimating the IAcT is available on the web [10]. Estimating the IAcT can be quite difficult, and `acor` can give unsatisfactory results. An attempt to improve it [1] is at best marginally successful. For reversible methods, there are properties of the autocorrelation function that may be useful for improving such estimates [8].
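To make the quantity concrete, here is a minimal sketch of a windowed IAcT estimator. It is not a reimplementation of `acor` and not the method of [1]; it uses the self-consistent window rule commonly attributed to Sokal, and it assumes the convention $\tau = 1 + 2\sum_{k\ge 1}\rho(k)$, where $\rho(k)$ is the normalized autocorrelation at lag $k$. The function names and the window constant `c=5.0` are choices made for this example.

```python
import numpy as np


def autocorrelation(x):
    """Normalized sample autocorrelation of a 1-D series via FFT."""
    x = np.asarray(x, dtype=float)
    n = x.size
    x = x - x.mean()
    # Zero-pad to length 2n so the circular correlation does not wrap around.
    f = np.fft.rfft(x, n=2 * n)
    acov = np.fft.irfft(f * np.conjugate(f), n=2 * n)[:n] / n
    return acov / acov[0]


def iact(x, c=5.0):
    """Estimate tau = 1 + 2 * sum_k rho(k), truncating the sum at the
    first lag m that satisfies m >= c * tau (self-consistent window)."""
    rho = autocorrelation(x)
    tau = 1.0
    for m in range(1, rho.size):
        tau += 2.0 * rho[m]
        if m >= c * tau:
            break
    return tau


# Hypothetical check on an AR(1) series x_i = phi * x_{i-1} + noise_i,
# whose exact IAcT under this convention is (1 + phi) / (1 - phi).
rng = np.random.default_rng(0)
phi = 0.9
noise = rng.standard_normal(200_000)
x = np.empty_like(noise)
x[0] = noise[0]
for i in range(1, x.size):
    x[i] = phi * x[i - 1] + noise[i]
print("estimated IAcT:", iact(x), "exact:", (1 + phi) / (1 - phi))
```

The window rule trades bias against variance: a larger `c` includes more of the autocorrelation tail but also more noise, which is one reason such estimates are hard to make reliable.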

## 3. Quasi-Reliable Estimates of Sampling Thoroughness

#### 3.1. Quasi-Reversible Propagators

- BR: ${\mathbf{P}}_{n}=-({\mathbf{P}}_{n}^{\prime\prime}+\frac{1}{2}\mathsf{\Delta}t\mathbf{F}\left({\mathbf{Q}}_{n}\right))$.
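Read against the final B substep of Section 2, the BR substep is the same half kick followed by a sign reversal of the momentum. A minimal sketch follows; the function name `br_substep` is hypothetical, and the arguments stand for ${\mathbf{P}}_{n}^{\prime\prime}$, ${\mathbf{Q}}_{n}$, the force function and the step size.

```python
import numpy as np


def br_substep(p_dd, q, force, dt):
    """Half kick with the force at q, then reverse the momentum."""
    return -(p_dd + 0.5 * dt * force(q))


# Hypothetical usage with a harmonic force F(q) = -q.
p_new = br_substep(np.array([1.0]), np.array([2.0]), lambda x: -x, dt=0.1)
print(p_new)  # the kicked momentum 0.9 with its sign flipped
```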

## 4. Irreversible Samplers and Their Superiority

#### 4.1. A Very Simple Example

#### 4.2. A Very Simple Example with a Barrier

## 5. Optimal Langevin Damping for a Model Problem

- Reaching precise conclusions is difficult for most of the analysis unless $\omega \mathsf{\Delta}t$ is bounded above by some constant of order 1, which seems to be satisfied in practice. This assumption underlies the statements that follow.
- The spectral gap is an increasing function of $\omega $, so for a multidimensional quadratic potential energy, the value of $\gamma $ that maximizes the spectral gap depends on the lowest frequency ${\omega}_{1}$.
- The spectral gap is maximized for $\gamma \le 2\omega $, corresponding to an underdamped system, for which the spectral gap is $\omega \mathsf{\Delta}t+\mathcal{O}\left(\mathsf{\Delta}{t}^{2}\right)$.
- The eigenfunctions of the operator $\mathcal{G}$ can be partitioned into eigenspaces ${\mathbb{P}}_{k}^{\prime}$, $k=0,1,\dots $, where ${\mathbb{P}}_{k}^{\prime}$ is a linear combination of $k+1$ specific polynomials of degree $k$ in $\omega q$ and $p$. The greatest eigenvalue of $\mathcal{G}$ is ${\tau}_{max}=\max_{k}{\tau}_{max}^{\left(k\right)}$ where ${\tau}_{max}^{\left(k\right)}$ is the maximum IAcT over ${\mathbb{P}}_{k}^{\prime}$.
- Figure 1 shows that, for fixed $\mathsf{\Delta}t$ and $\gamma $, the value ${\tau}_{max}^{\left(k\right)}$ is an increasing function of $\omega $, at least for $k=1,2,3,4$. Hence, as for the spectral gap, it is the lowest frequency ${\omega}_{1}$ that dictates the maximum $\tau $.
- Figure 2 indicates that, for fixed $\mathsf{\Delta}t$ and $\omega $, the maximizing $\tau $ is either ${\tau}_{max}^{\left(1\right)}$ or ${\tau}_{max}^{\left(2\right)}$, depending on the value of $\gamma $. The optimal damping coefficient is $\gamma =(\sqrt{6}/2)\omega $, for which ${\tau}_{max}^{\left(1\right)}={\tau}_{max}^{\left(2\right)}=\sqrt{6}/\left(\omega \mathsf{\Delta}t\right)$.
- For the preobservable $\omega q$, and, indeed, for any odd polynomial, the IAcT $\tau $ becomes arbitrarily small as $\gamma $ goes to zero. This does not mean, however, that the variance goes to 0, because Equation (1) is an asymptotic result, and the IAcT is a prefactor for the leading order $1/N$ term in the expression for the variance. The next order $1/{N}^{2}$ term would dominate if $\gamma $ were very small. An order $1/{N}^{2}$ error is characteristic of uniform sampling, which would be the consequence of nearly ballistic movement. In addition, odd polynomials are special in that their expectation is independent of total energy, so it does not matter that barely diffusive movement samples different values of total energy only at a very slow rate.
- The eigenfunction for ${\tau}_{max}^{\left(2\right)}$ is a specific linear combination of ${\omega}^{2}{q}^{2}-1$ and ${p}^{2}-1$. For quadratic polynomials, the total energy does affect their expected values, which is why it is necessary that $\gamma $ be large enough to sample different energies on a reasonable time scale.
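As a worked illustration of the formulas in the list above, the following sketch evaluates the optimal damping coefficient $\gamma =(\sqrt{6}/2)\omega $ and the corresponding maximum IAcT $\sqrt{6}/(\omega \mathsf{\Delta}t)$ for a given lowest frequency and step size. The helper name `optimal_langevin_damping` and the numerical inputs are assumptions made for the example; the formulas hold only under the stated condition that $\omega \mathsf{\Delta}t$ is bounded by a constant of order 1.

```python
import math


def optimal_langevin_damping(omega_min, dt):
    """Return (gamma_opt, tau_max) for the quadratic model problem.

    omega_min: lowest frequency omega_1 of the quadratic potential.
    dt: integrator step size (omega_min * dt assumed to be of order 1
        or smaller, as required by the analysis).
    """
    if omega_min <= 0.0 or dt <= 0.0:
        raise ValueError("omega_min and dt must be positive")
    gamma_opt = 0.5 * math.sqrt(6.0) * omega_min   # gamma = (sqrt(6)/2) * omega
    tau_max = math.sqrt(6.0) / (omega_min * dt)    # tau = sqrt(6) / (omega * dt)
    return gamma_opt, tau_max


# Hypothetical inputs: lowest frequency 2.0 and step size 0.05.
gamma_opt, tau_max = optimal_langevin_damping(omega_min=2.0, dt=0.05)
print(f"gamma_opt = {gamma_opt:.4f}, tau_max = {tau_max:.4f}")
# gamma_opt = 2.4495, tau_max = 24.4949
```

Because $\sqrt{6}/2\approx 1.22$ is below 2, this optimum lies in the underdamped regime $\gamma \le 2\omega $ identified for the spectral gap.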

#### 5.1. The Forward Transfer Operator

#### 5.2. Gamma for Maximum Spectral Gap

#### 5.3. Gamma for Maximum IAcT

## 6. Discussion and Conclusions

## Author Contributions

## Conflicts of Interest

## References

1. Fang, Y.; Cao, Y.; Skeel, R. Quasi-Reliable Estimates of Effective Sample Size. arXiv, 2017; arXiv:1705.03831v2.
2. Schütte, C.; Huisinga, W. Biomolecular Conformations as Metastable Sets of Markov Chains. In Proceedings of the 38th Annual Allerton Conference on Communication, Control, and Computing, Monticello, IL, USA, 4–6 October 2000; pp. 1106–1115.
3. Schütte, C.; Sarich, M. Metastability and Markov State Models in Molecular Dynamics; American Mathematical Society: Providence, RI, USA, 2013.
4. Lyman, E.; Zuckerman, D.M. On the Structural Convergence of Biomolecular Simulations by Determination of the Effective Sample Size. J. Phys. Chem. B **2007**, 111, 12876–12882.
5. Rey-Bellet, L.; Spiliopoulos, K. Irreversible Langevin Samplers and Variance Reduction: A Large Deviations Approach. Nonlinearity **2015**, 28, 2081–2104.
6. Leimkuhler, B.; Matthews, C. Rational Construction of Stochastic Numerical Methods for Molecular Sampling. Appl. Math. Res. Express **2013**, 2013, 34–56.
7. Leimkuhler, B.; Matthews, C.; Stoltz, G. The Computation of Averages from Equilibrium and Nonequilibrium Langevin Molecular Dynamics. IMA J. Numer. Anal. **2016**, 36, 13–79.
8. Geyer, C.J. Practical Markov Chain Monte Carlo. Stat. Sci. **1992**, 7, 473–511.
9. Priestly, M.B. Spectral Analysis and Time Series; Academic Press: Cambridge, MA, USA, 1981.
10. Goodman, J. Acor, Statistical Analysis of a Time Series. 2009. Available online: http://www.math.nyu.edu/faculty/goodman/software/acor/ (accessed on 4 August 2014).
11. Zhang, X.; Bhatt, D.; Zuckerman, D.M. Automated Sampling Assessment for Molecular Simulations Using the Effective Sample Size. J. Chem. Theory Comput. **2010**, 6, 3048–3057.
12. Cancés, E.; Legoll, F.; Stoltz, G. Special Issue on Molecular Modelling: Theoretical and Numerical Comparison of Some Sampling Methods for Molecular Dynamics. ESAIM Math. Model. Numer. Anal. **2007**, 41, 351–389.
13. Diaconis, P.; Holmes, S.; Neal, R.M. Analysis of a Nonreversible Markov Chain Sampler. Ann. Appl. Probab. **2000**, 10, 726–752.
14. Risken, H. The Fokker-Planck Equation. Methods of Solution and Applications; Springer Series in Synergetics; Springer: New York, NY, USA, 1984; Volume 18.
15. Kozlov, S. Effective diffusion for the Fokker-Planck equation. Math. Notes **1989**, 45, 360–368.

**Figure 1.** $\gamma \mathsf{\Delta}t\,{\tau}_{max}^{\left(k\right)}$ vs. $\omega /\gamma $ for $k=1,2,3,4$.

**Figure 2.** $\omega \mathsf{\Delta}t\,{\tau}_{max}^{\left(k\right)}$ vs. $\gamma /\omega $ for $k=1,2,3,4$.

**Table 1.** ${\tau}_{max}$ is the IAcT for $\mathcal{F}$ and ${\tau}_{max}^{\mathrm{rv}}$ is that for ${\mathcal{F}}^{\mathrm{rv}}=\frac{1}{2}(\mathcal{F}+{\mathcal{F}}^{\mathsf{T}})$.

(a) ${\tau}_{max}$

| $n\setminus \epsilon$ | 0.1 | 0.01 | 0.001 |
|---|---|---|---|
| 10 | 90 | 990 | 9990 |
| 100 | 900 | 9900 | 99,900 |
| 1000 | 9000 | 99,000 | 999,000 |

(b) ${\tau}_{max}^{\mathrm{rv}}-{\tau}_{max}$

| $n\setminus \epsilon$ | 0.1 | 0.01 | 0.001 |
|---|---|---|---|
| 10 | 35 | 33 | 33 |
| 100 | 3910 | 3511 | 3355 |
| 1000 | 403,612 | 389,921 | 351,047 |

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Skeel, R.D.; Fang, Y. Comparing Markov Chain Samplers for Molecular Simulation. *Entropy* **2017**, *19*, 561.
https://doi.org/10.3390/e19100561
