# Quantifying Configuration-Sampling Error in Langevin Simulations of Complex Molecular Systems


## Abstract


## 1. Introduction

#### KL Divergence as a Natural Measure of Sampling Bias

## 2. Numerical Discretization Methods and Timestep-Dependent Bias

#### Langevin Integrators Introduce Sampling Bias That Grows with the Size of the Timestep

## 3. Estimators for KL Divergence and the Configurational KL Divergence

#### 3.1. Near-Equilibrium Estimators for KL Divergence

#### 3.2. A Simple Modification to the Near-Equilibrium Estimator Can Compute KL Divergence in Configuration Space

#### 3.3. Comparison of Phase-Space Error for Different Integrators

#### 3.4. Comparison of Configurational KL Divergence for Different Integrators

#### 3.5. Influence of the Collision Rate

#### 3.6. Comparison with Reference Methods Validates the Near-Equilibrium Estimate

#### 3.6.1. Practical Lower Bound from Nested Monte Carlo

#### 3.6.2. Practical Upper Bound from Jensen’s Inequality

#### 3.6.3. Sandwiching the KL Divergence to Validate the Near-Equilibrium Estimate

## 4. Relation to GHMC Acceptance Rates

## 5. Discussion

#### Future Directions

## 6. Detailed Methods

`openmmtools` library: https://github.com/choderalab/openmmtools. This implementation allows the user to specify a Langevin integrator using a splitting string (such as `OVRVO`) and can automatically compute shadow work for each splitting.
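To make the splitting-string idea concrete, the following is a minimal NumPy sketch for a 1D particle, not the `openmmtools` implementation. It assumes the usual reading of the letters (R: position update, V: velocity update from the force, O: exact Ornstein-Uhlenbeck velocity update), divides the timestep evenly among repeated letters, and counts the energy change over the deterministic R and V substeps as shadow work. The function names, the quartic potential, and all parameter defaults are illustrative assumptions.

```python
import numpy as np


def potential(x):
    # Illustrative quartic potential U(x) = x^4 (an assumption for this sketch).
    return x ** 4


def force(x):
    return -4.0 * x ** 3


def langevin_step(x, v, splitting, dt, gamma, rng, mass=1.0, kT=1.0):
    """Apply one step of the given splitting; return (x, v, reduced shadow work)."""
    if not splitting or any(op not in "RVO" for op in splitting):
        raise ValueError("splitting must be a non-empty string over R, V, O")
    counts = {op: splitting.count(op) for op in "RVO"}
    shadow_work = 0.0
    for op in splitting:
        e_old = potential(x) + 0.5 * mass * v ** 2
        if op == "R":
            x = x + (dt / counts["R"]) * v
        elif op == "V":
            v = v + (dt / counts["V"]) * force(x) / mass
        else:
            # O substep: the energy change here is heat, not shadow work.
            a = np.exp(-gamma * dt / counts["O"])
            v = a * v + np.sqrt((1.0 - a ** 2) * kT / mass) * rng.standard_normal()
            continue
        shadow_work += (potential(x) + 0.5 * mass * v ** 2 - e_old) / kT
    return x, v, shadow_work


def mean_shadow_work_per_step(splitting, dt, n_steps=20000, gamma=1.0, seed=0):
    """Average reduced shadow work per step along one trajectory."""
    rng = np.random.default_rng(seed)
    x, v, total = 0.5, 0.0, 0.0
    for _ in range(n_steps):
        x, v, w = langevin_step(x, v, splitting, dt, gamma, rng)
        total += w
    return total / n_steps


if __name__ == "__main__":
    for scheme in ("OVRVO", "ORVRO", "RVOVR", "VRORV"):
        print(scheme, mean_shadow_work_per_step(scheme, dt=0.2))
```

Because the O substeps are excluded from the accumulator, a splitting with no O at all (for example `VRV`, plain velocity Verlet) reports a shadow work equal to the total energy change of the step.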

#### 6.1. One-Dimensional Model System: Double Well

`numpy.trapz`). The KL divergence between a given $\rho$ and $\pi$ was then computed using `scipy.stats.entropy` on the histogram representation.

#### 6.2. Model Molecular Mechanics System: A Harmonically Restrained Water Cluster

- The test system must have interactions typical of solvated molecular mechanics models, so that we would have some justification for generalizing from the results. This rules out 1D systems, for example, and prompted us to search for systems that were not alanine dipeptide in vacuum.
- The test system must have sufficiently few degrees of freedom that the nested Monte Carlo estimator remains feasible. Because the nested estimator requires converging many exponential averages, the cost of achieving a fixed level of precision grows dramatically with the standard deviation of the steady-state shadow work distribution. The width of this distribution is extensive in system size. Empirically, this ruled out using the first water box we had tried (with approximately 500 rigid TIP3P waters [21], with 3000 degrees of freedom). Practically, there was also a limit to how small it is possible to make a water system with periodic boundary conditions in OpenMM (about 100 waters, or 600 degrees of freedom), which was also infeasible.
- The test system must have enough disordered degrees of freedom that the behavior of work averages is typical of larger systems. This was motivated by our observation that it was paradoxically much easier to converge estimates for large disordered systems than it was to converge estimates for the 1D toy system.

`WaterCluster` test system, which comprises 20 rigid TIP3P waters weakly confined in a central harmonic restraining potential with force constant $K = 1\ \mathrm{kJ}/\mathrm{mol}/\mathrm{nm}^{2}$ applied to all atoms. This test system is available in version 0.14.0 of the `openmmtools` package [43]. Simulations were performed in double-precision using the `Reference` platform in OpenMM 7.2 [41] to minimize the potential for introducing significant round-off error due to finite floating point precision.

#### 6.3. Caching Equilibrium Samples

`CustomIntegrator` in OpenMM [41]. In brief, XC-HMC is a strategy to reduce the adverse effects of momentum flipping on sampling autocorrelation times in GHMC. GHMC uses Langevin integration of Hamiltonian dynamics as a proposal mechanism, and accepts or rejects each proposal according to a Metropolis criterion. Whenever the Metropolis test fails, the proposal is rejected and the momentum must be reversed; this is necessary to maintain detailed balance [2,44], but can lead to extremely large autocorrelation times when the acceptance rate is not sufficiently close to 100% (see [39] for empirical examples and further discussion). In “extra-chance” HMC [37,39], rather than immediately flipping the momentum whenever the Metropolis criterion fails, the proposal trajectory is instead extended to generate a new proposal, and another (suitably modified) Metropolis criterion is checked. For suitable choices of parameters (length of trajectory proposal, timestep, number of extra chances, length of “extra-chance” trajectories), this strategy can virtually eliminate the effect of momentum flipping, at the cost of increasing the average length of proposal trajectories.
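The accept/reject-and-flip logic of plain GHMC (without the extra-chance extension) can be sketched for a 1D particle as below. This is a toy illustration under stated assumptions, not the `CustomIntegrator` used in the paper: it uses one common GHMC form, a partial momentum refresh followed by a single velocity Verlet proposal step, and the quartic potential, parameter values, and function names are all hypothetical.

```python
import numpy as np


def potential(x):
    # Illustrative quartic potential (an assumption for this sketch).
    return x ** 4


def force(x):
    return -4.0 * x ** 3


def ghmc_step(x, v, dt, gamma, rng, mass=1.0, kT=1.0):
    """One GHMC iteration; returns (x, v, accepted)."""
    # Partial momentum refresh: exact Ornstein-Uhlenbeck update of the velocity.
    a = np.exp(-gamma * dt)
    v = a * v + np.sqrt((1.0 - a ** 2) * kT / mass) * rng.standard_normal()

    # Proposal: one velocity Verlet step of Hamiltonian dynamics.
    e_old = potential(x) + 0.5 * mass * v ** 2
    v_half = v + 0.5 * dt * force(x) / mass
    x_new = x + dt * v_half
    v_new = v_half + 0.5 * dt * force(x_new) / mass
    e_new = potential(x_new) + 0.5 * mass * v_new ** 2

    # Metropolis test on the reduced energy change.
    delta = (e_new - e_old) / kT
    if delta <= 0.0 or rng.random() < np.exp(-delta):
        return x_new, v_new, True
    # Rejection: keep the old position and reverse the momentum.
    return x, -v, False


def rejection_rate(dt, n_steps=5000, gamma=1.0, seed=0):
    """Fraction of rejected proposals along one GHMC trajectory."""
    rng = np.random.default_rng(seed)
    x, v, rejected = 0.5, 0.0, 0
    for _ in range(n_steps):
        x, v, accepted = ghmc_step(x, v, dt, gamma, rng)
        rejected += not accepted
    return rejected / n_steps


if __name__ == "__main__":
    for dt in (0.05, 0.2, 0.4, 0.7):
        print(dt, rejection_rate(dt))
```

The `return x, -v, False` branch is the momentum flip discussed above: every rejection sends the trajectory back the way it came, which is why low acceptance rates inflate autocorrelation times.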

#### 6.4. Computing Shadow Work for Symmetric Strang Splittings

#### 6.5. Computation of Shadow Work for OVRVO

#### 6.6. Variance-Controlled Adaptive Estimator for KL Divergence

## Author Contributions

## Acknowledgments

`CustomIntegrator` facility within OpenMM that greatly simplifies implementation of the integrators considered here, and for helpful feedback on the manuscript. Finally, we thank the three anonymous reviewers for their many thoughtful comments that greatly improved the manuscript, for identifying relevant prior work we had missed, and for suggesting additional experiments to isolate the effect of the collision rate.

## Conflicts of Interest

## Appendix A. Statistics of Shadow Work Distributions

**Figure A1.** Shadow work distributions for the water cluster are approximately Gaussian for all integrators examined. In all panels, solid lines and shaded regions denote Gaussian fits, while dots denote histogram estimates. The **top row** depicts work distributions where initial conditions are sampled from the nonequilibrium steady-state induced by the corresponding integrator and timestep $\left[(\mathbf{x},\mathbf{v}) \sim \rho\right]$; these shadow work values are used to measure phase-space error in the near-equilibrium estimates of $\mathcal{D}_{\mathrm{KL}}$. The **bottom row** depicts work distributions where initial conditions are sampled from the $\omega$ ensemble $\left[\mathbf{x} \sim \rho_{\mathbf{x}},\ \mathbf{v} \sim \pi(\mathbf{v}|\mathbf{x})\right]$; these work values are used to estimate configuration-space error in the near-equilibrium estimates of $\mathcal{D}_{\mathrm{KL}}$.

**Figure A2.** Standard deviation of water cluster shadow work distribution grows with timestep. The left panel summarizes the top row of Figure A1 (shadow work distributions for trajectories initialized in $\rho$), and the right panel summarizes the bottom row of Figure A1 (trajectories initialized in $\omega$).

**Figure A3.** Exponential averages with respect to the shadow work distribution become increasingly difficult with increasing timestep. It becomes increasingly difficult to estimate the expectation of $e^{-w}$ with respect to Gaussian fits to work distributions $p(w)$ for the water cluster, as the timestep $\Delta t$ increases. The four panels increase in $\Delta t$ from left to right: note the changing x-axis scales. The **solid line** is the shadow work distribution $p(w)$ measured at each timestep. The **dashed line** is $e^{-w}$. The **dash-dotted line** is $e^{-w} \cdot p(w)$.
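The difficulty illustrated in Figure A3 can be quantified for an idealized Gaussian work distribution. If $w \sim \mathcal{N}(\mu, \sigma^{2})$, then $e^{-w}$ is log-normal, with $\langle e^{-w} \rangle = e^{-\mu + \sigma^{2}/2}$ and relative variance $\mathrm{Var}[e^{-w}] / \langle e^{-w} \rangle^{2} = e^{\sigma^{2}} - 1$, so the number of independent samples needed for a fixed relative error grows exponentially in $\sigma^{2}$. The sketch below evaluates this; the widths and the 10% target are hypothetical values chosen for illustration, not measurements from the water cluster.

```python
import numpy as np


def relative_variance_of_exp(sigma):
    """Var[e^{-w}] / E[e^{-w}]^2 for Gaussian w with standard deviation sigma."""
    return np.expm1(sigma ** 2)


def samples_needed(sigma, rel_error=0.1):
    """Independent samples needed so the standard error of the sample mean
    of e^{-w} is rel_error times its expectation."""
    return int(np.ceil(relative_variance_of_exp(sigma) / rel_error ** 2))


def exponential_average(work_samples):
    """Plain sample-mean estimate of <e^{-w}>."""
    return float(np.mean(np.exp(-np.asarray(work_samples, dtype=float))))


if __name__ == "__main__":
    for sigma in (0.5, 1.0, 2.0, 3.0):  # hypothetical work-distribution widths
        print(f"sigma = {sigma}: about {samples_needed(sigma)} samples")
```

The required sample count does not depend on the mean $\mu$, only on the width $\sigma$, which is consistent with the standard deviation of the shadow work distribution being the quantity tracked in Figure A2.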

**Figure A4.** The computational effort required to reach a fixed uncertainty threshold depends sharply on $\Delta t$. The total computational effort is determined by the total number of trajectories sampled in that condition (i.e., $\sum_{i=1}^{N} M_{i}$), multiplied by the length of each trajectory. Note that the curves are not monotonic, since the number of required trajectories increases superlinearly with $\Delta t$, but the number of timesteps in each trajectory decreases linearly with $\Delta t$.

## Appendix B. Log-Scale Plots

**Figure A5.** Agreement between near-equilibrium and reference methods may be better visualized at small timesteps on a log-scale. This is a repeat of Figure 8, but with a log-scale y-axis. Vertical artifacts are caused by the estimate becoming transiently negative, which cannot be represented on the logarithmic scale.

## Appendix C. Further Comments on the Collision Rate

**Figure A6.** Integrator accuracy can be compared at different collision rates. Here we have plotted the same results as in Figure 7 but grouped by the collision rate $\gamma$ rather than by scheme. Note that the ordering over schemes induced by configuration-space error can vary as a function of collision rate: at the lowest measured $\gamma$, the ordering from lowest to highest error is (1) VRORV, (2) OVRVO, (3) RVOVR, (4) ORVRO, but at the highest measured $\gamma$, the ordering is different.

**Figure A7.** Similar results are obtained for 1000-step protocols as for 2000-step protocols. For the same wide array of conditions reported in Figure A6, near-equilibrium estimates generated using 5× fewer samples of 2× shorter protocols are broadly consistent with the results obtained by more exhaustive sampling.

## References

1. Lemons, D.S.; Gythiel, A. Paul Langevin’s 1908 paper “On the Theory of Brownian Motion” [“Sur la théorie du mouvement brownien,” C. R. Acad. Sci. (Paris) 146, 530–533 (1908)]. Am. J. Phys. **1997**, 65, 1079–1081.
2. Lelièvre, T.; Stoltz, G.; Rousset, M. Free Energy Computations: A Mathematical Perspective; Imperial College Press: London, UK; Hackensack, NJ, USA, 2010.
3. Leimkuhler, B.; Matthews, C. Molecular Dynamics: With Deterministic and Stochastic Numerical Methods; Springer: Cham, Switzerland, 2015.
4. Frenkel, D.; Smit, B. Understanding Molecular Simulation, 2nd ed.; Academic Press, Inc.: Orlando, FL, USA, 2001.
5. Maruyama, G. Continuous Markov Processes and Stochastic Equations. Rend. Circolo Mat. Palermo **1955**, 4, 48–90.
6. Ermak, D.L.; Yeh, Y. Equilibrium Electrostatic Effects on the Behavior of Polyions in Solution: Polyion-Mobile Ion Interaction. Chem. Phys. Lett. **1974**, 24, 243–248.
7. Brünger, A.; Brooks, C.L.; Karplus, M. Stochastic Boundary Conditions for Molecular Dynamics Simulations of ST2 Water. Chem. Phys. Lett. **1984**, 105, 495–500.
8. Pastor, R.W.; Brooks, B.R.; Szabo, A. An Analysis of the Accuracy of Langevin and Molecular Dynamics Algorithms. Mol. Phys. **1988**, 65, 1409–1419.
9. Athènes, M. A Path-Sampling Scheme for Computing Thermodynamic Properties of a Many-Body System in a Generalized Ensemble. Eur. Phys. J. B Condens. Matter Complex Syst. **2004**, 38, 651–663.
10. Adjanor, G.; Athènes, M.; Calvo, F. Free Energy Landscape from Path-Sampling: Application to the Structural Transition in LJ38. Eur. Phys. J. B Condens. Matter Complex Syst. **2006**, 53, 47–60.
11. Bussi, G.; Parrinello, M. Accurate Sampling Using Langevin Dynamics. Phys. Rev. E **2007**, 75, 056707.
12. Izaguirre, J.A.; Sweet, C.R.; Pande, V.S. Multiscale Dynamics of Macromolecules Using Normal Mode Langevin. In Biocomputing 2010; World Scientific: Singapore, 2009; pp. 240–251.
13. Leimkuhler, B.; Matthews, C. Robust and Efficient Configurational Molecular Sampling via Langevin Dynamics. J. Chem. Phys. **2013**, 138, 174102.
14. Leimkuhler, B.; Matthews, C. Efficient Molecular Dynamics Using Geodesic Integration and Solvent–solute Splitting. Proc. R. Soc. A **2016**, 472, 20160138.
15. Grubmüller, H.; Tavan, P. Multiple Time Step Algorithms for Molecular Dynamics Simulations of Proteins: How Good Are They? J. Comput. Chem. **1998**, 19, 1534–1552.
16. Hopkins, C.W.; Le Grand, S.; Walker, R.C.; Roitberg, A.E. Long-Time-Step Molecular Dynamics through Hydrogen Mass Repartitioning. J. Chem. Theory Comput. **2015**, 11, 1864–1874.
17. Butler, B.D.; Ayton, G.; Jepps, O.G.; Evans, D.J. Configurational Temperature: Verification of Monte Carlo Simulations. J. Chem. Phys. **1998**, 109, 6519–6522.
18. Leimkuhler, B.; Matthews, C. Rational Construction of Stochastic Numerical Methods for Molecular Sampling. Appl. Math. Res. eXpress **2013**, 2013, 34–56.
19. Sweet, C.R.; Hampton, S.S.; Skeel, R.D.; Izaguirre, J.A. A Separable Shadow Hamiltonian Hybrid Monte Carlo Method. J. Chem. Phys. **2009**, 131, 174106.
20. Sivak, D.A.; Chodera, J.D.; Crooks, G.E. Using nonequilibrium fluctuation theorems to understand and correct errors in equilibrium and nonequilibrium simulations of discrete Langevin dynamics. Phys. Rev. X **2013**, 3, 011007.
21. Jorgensen, W.L.; Chandrasekhar, J.; Madura, J.D.; Impey, R.W.; Klein, M.L. Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem. Phys. **1983**, 79, 926–935.
22. Sivak, D.A.; Chodera, J.D.; Crooks, G.E. Time Step Rescaling Recovers Continuous-Time Dynamical Properties for Discrete-Time Langevin Integration of Nonequilibrium Systems. J. Phys. Chem. B **2014**, 118, 6466–6474.
23. Melchionna, S. Design of quasisymplectic propagators for Langevin dynamics. J. Chem. Phys. **2007**, 127, 044108.
24. Skeel, R.D.; Izaguirre, J.A. An impulse integrator for Langevin dynamics. Mol. Phys. **2002**, 100, 3885–3891.
25. Serrano, M.; De Fabritiis, G.; Espanol, P.; Coveney, P. A stochastic Trotter integration scheme for dissipative particle dynamics. J. Math. Comput. Simul. **2006**, 72, 190–194.
26. Thalmann, F.; Farago, J. Trotter derivation of algorithms for Brownian and dissipative particle dynamics. J. Chem. Phys. **2007**, 127, 124109.
27. Leimkuhler, B.; Matthews, C.; Stoltz, G. The Computation of Averages from Equilibrium and Nonequilibrium Langevin Molecular Dynamics. IMA J. Numer. Anal. **2016**, 36, 13–79.
28. Swope, W.C.; Andersen, H.C.; Berens, P.H.; Wilson, K.R. A Computer Simulation Method for the Calculation of Equilibrium Constants for the Formation of Physical Clusters of Molecules: Application to Small Water Clusters. J. Chem. Phys. **1982**, 76, 637–649.
29. Bennett, C.H. Mass Tensor Molecular Dynamics. J. Comput. Phys. **1975**, 19, 267–279.
30. Pomès, R.; McCammon, J.A. Mass and Step Length Optimization for the Calculation of Equilibrium Properties by Molecular Dynamics Simulation. Chem. Phys. Lett. **1990**, 166, 425–428.
31. Plecháč, P.; Rousset, M. Implicit Mass-Matrix Penalization of Hamiltonian Dynamics with Application to Exact Sampling of Stiff Systems. Multiscale Model. Simul. **2010**, 8, 498–539.
32. Sivak, D.A.; Crooks, G.E. Near-Equilibrium Measurements of Nonequilibrium Free Energy. Phys. Rev. Lett. **2012**, 108.
33. Perez-Cruz, F. Kullback-Leibler Divergence Estimation of Continuous Distributions. In Proceedings of the 2008 IEEE International Symposium on Information Theory, New Orleans, LA, USA, 30 November–4 December 2008; pp. 1666–1670.
34. Dhabal, D.; Nguyen, A.H.; Singh, M.; Khatua, P.; Molinero, V.; Bandyopadhyay, S.; Chakravarty, C. Excess Entropy and Crystallization in Stillinger-Weber and Lennard-Jones Fluids. J. Chem. Phys. **2015**, 143, 164512.
35. Athènes, M.; Adjanor, G. Measurement of Nonequilibrium Entropy from Space-Time Thermodynamic Integration. J. Chem. Phys. **2008**, 129, 024116.
36. Shirts, M.R.; Pande, V.S. Comparison of Efficiency and Bias of Free Energies Computed by Exponential Averaging, the Bennett Acceptance Ratio, and Thermodynamic Integration. J. Chem. Phys. **2005**, 122, 144107.
37. Campos, C.M.; Sanz-Serna, J.M. Extra Chance Generalized Hybrid Monte Carlo. J. Comput. Phys. **2015**, 281, 365–374.
38. Athènes, M.; Marinica, M.C. Free Energy Reconstruction from Steered Dynamics without Post-Processing. J. Comput. Phys. **2010**, 229, 7129–7146.
39. Wagoner, J.A.; Pande, V.S. Reducing the Effect of Metropolization on Mixing Times in Molecular Dynamics Simulations. J. Chem. Phys. **2012**, 137, 214105.
40. Nilmeier, J.P.; Crooks, G.E.; Minh, D.D.L.; Chodera, J.D. Nonequilibrium Candidate Monte Carlo Is an Efficient Tool for Equilibrium Simulation. Proc. Natl. Acad. Sci. USA **2011**, 108, E1009–E1018.
41. Eastman, P.; Swails, J.; Chodera, J.D.; McGibbon, R.T.; Zhao, Y.; Beauchamp, K.A.; Wang, L.P.; Simmonett, A.C.; Harrigan, M.P.; Stern, C.D.; et al. OpenMM 7: Rapid Development of High Performance Algorithms for Molecular Dynamics. PLOS Comput. Biol. **2017**, 13, e1005659.
42. Lam, S.K.; Pitrou, A.; Seibert, S. Numba: A LLVM-Based Python JIT Compiler. In Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, Austin, TX, USA, 15 November 2015; ACM: New York, NY, USA, 2015; pp. 1–6.
43. Chodera, J.; Rizzi, A.; Naden, L.; Beauchamp, K.; Grinaway, P.; Fass, J.; Rustenburg, B.; Ross, G.A.; Simmonett, A.; Swenson, D.W. Openmmtools: 0.14.0 - Exact Treatment of Alchemical PME Electrostatics, Water Cluster Test System, Optimizations; GitHub: San Francisco, CA, USA, 2018.
44. Akhmatskaya, E.; Bou-Rabee, N.; Reich, S. Erratum to “A Comparison of Generalized Hybrid Monte Carlo Methods with and without Momentum Flip” [J. Comput. Phys. 228 (2009) 2256–2265]. J. Comput. Phys. **2009**, 228, 7492–7496.
45. Adjanor, G.; Athènes, M. Gibbs Free-Energy Estimates from Direct Path-Sampling Computations. J. Chem. Phys. **2005**, 123, 234104.
46. Adib, A.B. Comment on “On the Crooks fluctuation theorem and the Jarzynski equality” [J. Chem. Phys. 129, 091101 (2008)]. J. Chem. Phys. **2009**, 130, 247101.
47. Shirts, M.R.; Chodera, J.D. Statistically Optimal Analysis of Samples from Multiple Equilibrium States. J. Chem. Phys. **2008**, 129, 124105.

**Figure 1.** Different numerical integrators introduce different error structure in phase space, illustrated in a double-well system. Here, we illustrate the timestep-dependent discretization error introduced by four integrators on a 1D double-well potential [$U(\mathbf{x}) \equiv \mathbf{x}^{6} + 2\cos\left(5(\mathbf{x}+1)\right)$]. The **top row** of 2D contour plots illustrates the difference between the phase-space density $\rho(\mathbf{x},\mathbf{v})$ sampled at the maximum timestep considered ($\Delta t = 0.7$, close to the stability limit) and the equilibrium density $\pi(\mathbf{x},\mathbf{v})$; solid lines indicate positive contours, while dashed lines indicate negative contours. The **bottom row** of 1D density plots shows timestep-dependent perturbation in the sampled marginal distribution in configuration space, $\rho_{\mathbf{x}}$, with the equilibrium distribution $\pi_{\mathbf{x}}$ depicted as a solid black line. The sampled marginal distributions $\rho_{\mathbf{x}}$ are shown for increasingly large timestep, denoted $\rho_{\Delta t}$, depicted by increasingly light dotted lines, for $\Delta t = 0.3, 0.5, 0.7$ (arbitrary units). Inspecting the contour plots suggests that some integrator splittings (especially VRORV) induce error that fortuitously “cancels out” when the density is marginalized by integrating over $\mathbf{v}$, while the error in other integrator splittings (ORVRO, OVRVO) constructively sums to amplify the error in configuration space.

**Figure 2.** A simple nonequilibrium protocol allows measurement of the KL divergence in phase and configuration space close to equilibrium. Simple nonequilibrium protocols can be used in complex molecular systems to rapidly estimate, utilizing the Crooks fluctuation theorem, the KL divergence of sampled Langevin densities from equilibrium. In both panels, the x-axis is the number of steps taken so far in the length-$2T$ protocol, and $\langle w_{\mathrm{shad}} \rangle_{\pi}$ indicates the average (reduced, unitless) shadow work accumulated over $T$ steps of Langevin dynamics, initialized from equilibrium $\left((\mathbf{x}_{0},\mathbf{v}_{0}) \sim \pi\right)$. (**a**) The original scheme described in Sivak et al. [20] to measure the KL divergence between the sampled phase-space density $\rho$ and the equilibrium phase-space density $\pi$. $\langle w_{\mathrm{shad}} \rangle_{\rho}$ is the average shadow work accumulated over $T$ steps of Langevin dynamics, initialized from the integrator’s steady state $\left((\mathbf{x}_{0},\mathbf{v}_{0}) \sim \rho\right)$. (**b**) The modified scheme introduced here to measure the KL divergence in the configuration-space marginal density between the marginal sampled configuration-space density $\rho_{\mathbf{x}}$ and marginal equilibrium density $\pi_{\mathbf{x}}$. $\langle w_{\mathrm{shad}} \rangle_{\omega}$ is the average shadow work accumulated over $T$ steps of Langevin dynamics, where the initial configuration is drawn from the integrator’s steady state, and the initial velocities are drawn from equilibrium $\left(\mathbf{x}_{0} \sim \rho_{\mathbf{x}},\ \mathbf{v}_{0} \sim \pi(\mathbf{v}|\mathbf{x}_{0})\right)$. We denote this distribution $\omega(\mathbf{x},\mathbf{v}) \equiv \rho_{\mathbf{x}}(\mathbf{x})\,\pi(\mathbf{v}|\mathbf{x})$. The **top row** schematically illustrates “distance from equilibrium”, with y-axis ticks for $\mathcal{D}_{\mathrm{KL}}(\pi \parallel \pi) = 0$, $\mathcal{D}_{\mathrm{KL}}(\omega \parallel \pi) \le \mathcal{D}_{\mathrm{KL}}(\rho \parallel \pi)$. The **bottom row** illustrates the average work (here, just shadow work) accumulated throughout each protocol.
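Given arrays of shadow work values collected under the two initial conditions of Figure 2, a near-equilibrium estimate can be sketched as below. The estimator's form is taken here as an assumption: the linear-response expression $\mathcal{D}_{\mathrm{KL}} \approx \frac{1}{2}\left(\langle w_{\mathrm{shad}} \rangle_{\pi} - \langle w_{\mathrm{shad}} \rangle_{\rho}\right)$ associated with Sivak et al. [20], with $\langle w_{\mathrm{shad}} \rangle_{\omega}$ substituted for $\langle w_{\mathrm{shad}} \rangle_{\rho}$ in the configuration-space case. The function name, the standard-error formula (which treats the two sets of samples as independent), and the synthetic inputs are illustrative.

```python
import numpy as np


def near_equilibrium_kl(w_from_pi, w_from_start):
    """Assumed near-equilibrium estimate 0.5 * (<w>_pi - <w>_start).

    w_from_pi    : shadow work over T steps, trajectories started from pi
    w_from_start : shadow work over T steps, trajectories started from rho
                   (phase space) or omega (configuration space)
    Returns (estimate, standard error), treating the two sets as independent.
    """
    w_pi = np.asarray(w_from_pi, dtype=float)
    w_start = np.asarray(w_from_start, dtype=float)
    estimate = 0.5 * (w_pi.mean() - w_start.mean())
    stderr = 0.5 * np.sqrt(
        w_pi.var(ddof=1) / w_pi.size + w_start.var(ddof=1) / w_start.size
    )
    return estimate, stderr


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic work values standing in for measured protocol samples.
    w_pi = rng.normal(loc=0.30, scale=0.5, size=5000)
    w_rho = rng.normal(loc=0.10, scale=0.5, size=5000)
    est, err = near_equilibrium_kl(w_pi, w_rho)
    print(f"D_KL estimate: {est:.4f} +/- {1.96 * err:.4f} (95% CI)")
```

Because the estimate is a difference of two sample means, it can become transiently negative when the true divergence is small relative to the sampling noise.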

**Figure 3.** Comparison of Langevin integrators in terms of phase-space and marginal distributions. For a simple 1D system with the quartic potential $U(\mathbf{x}) = \mathbf{x}^{4}$, the error in sampled phase-space density $\rho$ and its marginal density $\rho_{\mathbf{x}}$ grows as a function of timestep $\Delta t$. However, different Langevin integrators (OVRVO and VRORV shown here) derived from symmetric Strang splittings can lead to drastically different error structures in phase space, which can induce fortuitous cancellation of error in the marginal distribution under certain circumstances (VRORV); see [18]. In the **top row**, we illustrate the definition of the 1D system (**left**: the potential energy function, $U(\mathbf{x}) = \mathbf{x}^{4}$; **middle**: the equilibrium marginal density over configuration space, $\pi_{\mathbf{x}}(\mathbf{x}) \propto e^{-\beta U(\mathbf{x})}$; **right**: the equilibrium joint distribution over phase space $\pi(\mathbf{x},\mathbf{v})$). In the **middle row**, we illustrate the increasing discrepancy between the sampled distribution $\rho$ and the equilibrium distribution $\pi$, for both the full phase-space and the marginal configuration space, as a function of timestep $\Delta t$, for the given model problem and the particular choice of the Bussi–Parrinello Langevin integrator OVRVO (7). Here the differences between exact and discrete configurational measures are plotted above the contours of the phase space density, for four values of the stepsize $\Delta t =$ [0.43, 0.66, 0.88, 1.1]. In the **bottom row**, we illustrate the timestep-dependent error in a similar way for another integrator VRORV (12).

**Figure 4.** Using the near-equilibrium approximation, some numerical methods introduce far less configuration-space bias in molecular mechanics models than others. The results here are reported for a small cluster of rigid TIP3P waters, described in more detail in the Detailed Methods section, and illustrated in the **leftmost panel**. On the x-axis is the timestep $\Delta t$, measured in femtoseconds (fs). On the y-axis is the estimated KL divergence $\mathcal{D}_{\mathrm{KL}}$. (**a**) The error over the joint distribution, $\mathcal{D}_{\mathrm{KL}}(\rho \parallel \pi)$. (**b**) The error over the configuration-space marginal, $\mathcal{D}_{\mathrm{KL}}(\rho_{\mathbf{x}} \parallel \pi_{\mathbf{x}})$. Each colored curve corresponds to a numerical scheme for Langevin dynamics. The shaded region is the mean ± 95% confidence interval.

**Figure 5.** No consistent relationship between the GHMC acceptance rate and the steady-state bias is apparent for a 1D system. Since the GHMC rejection rate grows similarly with $\Delta t$ across all four schemes, but the configuration-space KL divergence does not, the GHMC rejection rate can be overly “conservative” for some splittings. Panel (**a**) shows the growth in the GHMC rejection rate as a function of timestep $\Delta t$, for the 1D double-well model considered in Figure 1 and Figure 6. On the x-axis is an evenly spaced grid of 50 timesteps between 0.1 and 0.7. On the y-axis is the estimated rejection rate, which is based on a sample average of the GHMC acceptance criterion. The shaded region is the mean ± 95% confidence interval. Panel (**b**) compares the GHMC rejection rate vs. the phase-space bias at steady state, over the range of timesteps plotted in panel (**a**). The y-axis is KL divergence between the phase-space histograms, plotted on a log-scale. Panel (**c**) compares the GHMC rejection rate vs. the configuration-space bias at steady state, over the range of timesteps plotted in panels (**a**,**b**). The y-axis is the KL divergence between the configuration-space histograms, plotted on a log-scale. Note that in panel (**c**), we have truncated the leftmost parts of the curves for RVOVR and VRORV rejection rates less than 0.05 and 0.1, respectively, due to noise in histogram estimates of very small $\mathcal{D}_{\mathrm{KL}}(\rho_{\mathbf{x}} \parallel \pi_{\mathbf{x}})$.

**Figure 6.** KL divergence is a natural measure of sampling error, although system-specific observables display different sensitivities to sampling error. Even for the simple double-well potential considered in Figure 1, configuration-space properties display different sensitivities to sampling error, motivating the use of a “universal” error measure, such as the KL divergence. The **top left** panel illustrates the double-well potential energy function from Figure 1, and the **top center** panel shows the resulting marginal equilibrium density, $\pi_{\mathbf{x}}$, at $\beta = 1$. The **bottom left** panel shows, as a function of $\Delta t$, growth in the magnitude of the error in average potential energy, $|\langle U \rangle_{\rho} - \langle U \rangle_{\pi}|$, which has been used previously as a sensitive measure of sampling error [13]. The **bottom center** panel shows the error in the apparent free energy difference between the two wells as a function of $\Delta t$. Note that the timestep-dependent behavior of these two observables implies different rankings of integrator fidelity that may mislead one into believing error in all observables remains low with increasing timestep. However, as is clear here, just because an integrator introduces low timestep-dependent error in one observable does not mean that the method will introduce low error in another observable: for example, OVRVO preserves the well populations as accurately as VRORV, but introduces much larger errors in the average potential energy. The **right column** summarizes the growth in timestep-dependent error, as measured by the KL divergence. While all four integrators introduce comparable levels of $\Delta t$-dependent error in the phase-space distribution, they induce dramatically different magnitudes of error in the configuration-space marginal.

**Figure 7.** The choice of collision rate influences sampling bias. As we vary the collision rate $\gamma$ over a few orders of magnitude, the resulting measured KL divergence responds in different ways for the different schemes. The phase-space bias appears to increase with increasing collision rate for all schemes. The configuration-space bias for OVRVO and ORVRO appears to increase with increasing collision rate, but the configuration-space bias for RVOVR appears to decrease with increasing collision rate. The anomalously low configuration-space error for VRORV is observed across all collision rates tested. The non-monotonic curves in the $\gamma = 100\ \mathrm{ps}^{-1}$ condition are expected to be due to finite-sampling error, and are expected to be attenuated at a larger number of protocol samples. (Note that one condition is omitted from these plots for clarity: estimates of $\mathcal{D}_{\mathrm{KL}}$ for OVRVO at $\Delta t = 8$ fs. At that timestep, the variance of the resulting $\mathcal{D}_{\mathrm{KL}}$ estimates for this scheme was much larger than for the other schemes.) See Figure A6 for a comparison grouped by collision rate, rather than by integrator.

**Figure 8.** The near-equilibrium estimator is consistent with reference estimators for a practical range of $\Delta t$. We compared the near-equilibrium estimates reported in Figure 4 for the water cluster against a likely under-estimate and a likely over-estimate of the $\mathcal{D}_{\mathrm{KL}}$. In the **top row**, we validate near-equilibrium estimates of the KL divergence on the full state space $(\mathbf{x},\mathbf{v})$. In the **bottom row**, we validate near-equilibrium estimates of the KL divergence on configuration space $(\mathbf{x})$ alone. Each column corresponds to a numerical method for Langevin dynamics. The **darker band** in each plot corresponds to the near-equilibrium estimate ± 95% confidence intervals from the asymptotic uncertainty estimate (details in Section 3.1). The **lighter band with a solid line** corresponds to the nested Monte Carlo estimate ± 95% confidence intervals from bootstrapping (details in Section 6.6). The **lighter band with a dotted line** corresponds to the exponential average estimate ± 95% confidence intervals from bootstrapping (details in Section 6.6). Log-scale versions of these plots are also provided in Appendix B (Figure A5). In the **lower two panels**, we summarize these results by plotting all near-equilibrium estimates vs. all exponential-average estimates (**left**) and all near-equilibrium estimates vs. all nested Monte Carlo estimates (**right**). The colored dots and bars correspond to the means ± uncertainties used in the earlier panels. The dashed diagonal line shows parity. Grey dots and error bars correspond to conditions where the nested Monte Carlo estimate reached the computational budget ($5 \times 10^{4}$ inner-loop samples) but failed to reach the inner-loop uncertainty threshold, and is thus more biased. See Section 6.6 for additional details.

**Figure 9.** Near-equilibrium measurements recapitulate the relationship between steady-state $\mathcal{D}_{\mathrm{KL}}$ and GHMC acceptance rate for the water cluster test system. Panel (**a**) shows the growth in the GHMC rejection rate (1 minus the acceptance rate) as a function of timestep $\Delta t$ (in femtoseconds), for the water cluster test system illustrated in Figure 4. On the x-axis are timesteps [0.1 fs, 0.5 fs, 1.0 fs, ..., 7.5 fs, 8 fs]. On the y-axis is the estimated rejection rate, which is based on a sample average of the GHMC acceptance criterion, over 10,000 proposals per condition. The shaded region is the mean ± 95% confidence interval. Panel (**b**) compares the GHMC rejection rate vs. the phase-space bias at steady state, over the range of timesteps plotted in panel (**a**). The y-axis is the KL divergence between the phase-space distributions as measured by the near-equilibrium estimate, plotted on a log-scale. Panel (**c**) compares the GHMC rejection rate vs. the configuration-space bias at steady state, over the range of timesteps plotted in panels (**a**,**b**). Note that in panels (**b**,**c**), we have truncated at $\mathcal{D}_{\mathrm{KL}} \le 10^{-4}$, due to noise in near-equilibrium estimates of very small $\mathcal{D}_{\mathrm{KL}}$.

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Fass, J.; Sivak, D.A.; Crooks, G.E.; Beauchamp, K.A.; Leimkuhler, B.; Chodera, J.D.
Quantifying Configuration-Sampling Error in Langevin Simulations of Complex Molecular Systems. *Entropy* **2018**, *20*, 318.
https://doi.org/10.3390/e20050318
