1. Introduction
The basic operation of a molecular dynamics (MD) simulation is numerical integration of equations of motion of a set of interacting atoms to generate discrete trajectories of atomic positions and velocities (“configuration”). Time-averaging properties of configuration from such trajectories is ideally identical to ensemble-averaging at equilibrium [
1], provided the ergodic hypothesis holds at finite times. Unfortunately, most MD trajectories are not ergodic, leaving sometimes important regions of configuration space unexplored. This lack of ergodicity formally results from the fact that we cannot run infinitely long MD simulations, but it practically results from the fact that high-probability regions of configuration space are often separated by regions of low probability.
Methods of “ergodicitizing” MD simulations have a long history [
2], and include “collective-variable biasing” methods such as umbrella sampling [
3], thermodynamic integration [
4], and much more, as well as “tempering” approaches such as simulated annealing [
5] and replica-exchange [
6,
7]. Because these methods and their rapidly growing set of descendants all aim in one way or another to generate the correct equilibrium statistics of one or more variables of interest, they are often referred to generically as “free-energy” calculations. However, relative to the number of methodological developments in free-energy calculations, detailed comparative analysis of these methods remains underrepresented.
Collective-variable approaches are judged by how computationally efficiently they can produce an accurate free-energy as a function of the collective variable. This efficiency is most sensitively determined by how well they can handle naturally occurring restrictions to sampling in the unbiased coordinates, or so-called “hidden” barriers. Strictly speaking, unbiased coordinates must be ergodically sampled in order for the statistics along the CV to be accurate, but how close to this condition we must come for a tolerable level of accuracy, for a given choice of CV, and for a given method are not always clear.
To make matters worse, the very fact that these sampling restrictions cause inaccuracies is often difficult if not impossible to detect in production all-atom simulations. As an example, we recently showed in a study of the prion protein conformational thermodynamics that the popular CV-biasing approach Well-Tempered Metadynamics (WTMD) [
8,
9] can suffer a severe pathology in computing a one-dimensional free-energy profile that evidently stems from the method’s history-dependent bias [
10]. We presented an approach for calculating the same FEP using enhanced sampling without a history-dependent bias, termed “on-the-fly parameterization” (OTFP) [
11] that showed better convergence properties than did WTMD.
Spurred by this previous work, we take the opportunity here to investigate the convergence behavior of the history-dependent biasing approaches of WTMD and adaptive-biasing forces (ABF) [
12], and the history-independent approach OTFP, in very simple, analytically tractable low-dimensional systems. Our aim is to partially elucidate properties of configuration space that most likely lead to pathologies, or at least compromised convergence, in these methods in systems where we have full
a priori knowledge of all sampling restrictions.
2. Methods
We have run WTMD [
8], ABF [
12] and OTFP [
11] methods to recover a 1D free-energy profile (FEP) from a simple 2D potential energy surface (PES). We use NAMD 2.12 [
13], and the native ABF [
12,
14] and WTMD [
8] implementations therein, and refer interested readers to those publications for detailed quantitative descriptions of those methods. The OTFP method used here is exactly that detailed in our previous work [
11]. We recognize that these methods are under continuous improvement (see, e.g., [
9,
15,
16]), but here we intend to understand fundamental differences among these approaches.
For the one-dimensional case, the FEP is defined as (up to a constant):
where
is the Boltzmann constant,
T the temperature and 〈〉 means the average over the equilibrium distribution of configuration. The Dirac delta collects those configurations of the ensemble that yield the CV’s
equal to
z. ABF reconstructs this by an algorithm that converges upon the mean forces,
, by applying a scaled, accumulated negative estimate of those forces. WTMD reconstructs the FEP by adaptively converging to an energy bias (rather than a force bias) that can be processed to provide the FEP. OTFP uses temperature-accelerated MD (TAMD) [
17] to enhance sampling along a CV, and accumulates mean-force estimates on-the-fly that can at any point be used to parameterize an analytical expansion of the free energy via gradient optimization [
11].
The goal of the present work is to study the impact of “hidden barriers” on the convergence behavior of different free-energy methods. In order to do that we construct a simple model described by the following PES:
The gaussian terms generates the different basins in the PES and the last three terms serves as a boundary to confine the system dynamics around them. In particular, we used
,
and
, with the center of the gaussian functions at
,
,
and
. With this parameters, the PES basins are distributed in a kink shape, as can be seen in the contour plot of
Figure 1. We will use the
A,
B,
C and
D labels to refer to the corresponding minima as it is shown in the figure. With this model, we can select the
x direction as the CV for the free-energy reconstruction while keeping slow “forgotten” modes along the
y dimension. Thus, the use of a CV-biasing method will allow the acceleration of the transition
through the energy barrier at
, but the transition
and
will not be accelerated and their barriers at
and
become “hidden barriers”.
From the point of view of ABF, WTMD or OTFP, x is a bad CV because it does not include all the slow modes of the system, a condition that must be fulfilled to achieve system acceleration and avoid sampling issues. However, as we explained in the introduction, for real-life systems it is reasonable to consider that the presence of hidden barriers will be the most general case since there are several non-accelerated dimensions (, being M the number of CVs and N the number of atoms) that could hide unknown metastable states. In our simple model, the hidden barriers generated by the existence of the A and D states can be considered the most simple case of hidden barriers since their corresponding transitions are completely orthogonal to the chosen CV. On the other hand, the metastable states A and D are aligned in x with the minima of the middle channel. Therefore, the presence of hidden barriers will occur at the same CV values of the FEP minima. The hidden barriers can occur in any region of the CV domain, and it would be of interest to study the influence of them at different positions of the middle channel. We turn to this in the second half of the results below.
For all the simulations presented in this work, the system evolves following a Langevin dynamics using a mass of
, a time step of
, and a coupling coefficient of
. In the case of OTFP/TAMD method, the auxiliary particle evolves following a Brownian dynamics with the same time step
. For OTFP it is require to assume a functional form of the free-energy to be parametrized. As in our previous works [
10,
11,
16] we used a linear decomposition
where
are the basis functions and
are the coefficient of the expansion and the parameters to be adjusted by OTFP. The basis functions are constructed using simple
chapeau functions (see [
10,
11] for more details).
3. Results
Before conducting our free-energy calculations, the first important variable to fix is the system temperature
T. Since we want to accelerate the
x direction we need first to emulate the behavior of a “rare-event” evolution on this dimension. Thus, we need a temperature that allows very few transition events along
x during an MD simulation. At the same time, we want to operate in a regime where there is
some activity along the orthogonal direction as well, even if that direction is not ergodically sampled, because hidden barriers that are
never crossed by definition will not lead to problems in reproducibility of results. Thus, we need to explore the behavior of the MD simulations at different temperatures.
Figure 2 shows the position traces of an MD simulation at
, the temperature that was finally chosen. The traces for other temperatures can be found in
Figure S2 of the Supplementary Materials. Using
and
x as CV, we numerically integrate Equation (
1) to obtain the double-well free-energy profile (FEP) shown on top of
Figure 1.
To help the reader to understand the transitions events between the various basins, we include together with the
x and
y traces the information of the trapping basis as a function of time (shown in red in
Figure 2). To assign a trapping basin label to an instantaneous
configuration we begin by associating a set of consecutive integer numbers (e.g., 4, 3, 2 and 1) to the A, B, C and D minima. Then, we perform a Voronoi tessellation from the PES minima to decide which integer number will be associated with each
configuration. Finally, we perform a running average of the resulting numeric trace, in order to avoid the apparent basin changes at very short simulation times. That is, since the acceleration method increases the likelihood of observing high-energy configurations, we frequently observe a rapid backward and forward crossing across the boundaries of a tessellation cell, a feature that could be easily be confused with a change in the sampling state. Thus, we found that a running average can considerably improve the assignment of the trapping state. We have taken an interval of
time units to perform this operation (note the trace has
time units). The basin trace information is included any time we show a simulation trace.
Besides the system temperature, another temperature that has to be chosen is the bias sampling temperature of the WTMD and OTFP methods. In the case of WTMD this is the temperature associated with the target distribution, while in the case of OTFP, this is the temperature associated with the distribution of the auxiliary variables of TAMD. In any case, these two distributions are exactly the same:
where
T is the system temperature and
is called the bias temperature in the WTMD framework. While WTMD converges to this accelerated distribution at long simulation times, the OTFP/TAMD method starts sampling it from the beginning of the simulation. Given that both methods use the same accelerated distribution, we have chosen the same
value in all the OTFP and WTMD simulations. Note that ABF does not have a bias temperature or an analogous parameter because at long time it converges to the uniform distribution, because the ABF force converges to the negative mean force.
Each free-energy method used in this work has its own set of parameters and some of these parameters have to be chosen carefully to ensure optimum convergence. The key parameters of WTMD are the deposition rate of the bias potential and the width of the Gaussian hills deposited. The rate is the product of the Gaussian height and the deposition frequency. We ran some test simulations to explore the convergence of WTMD for different parameter values (see
Figures S3 and S4 in the Supplementary Materials). By inspecting these plots we have selected a hill deposition every 100 time steps, a Gaussian width of 1 and a height of
. For the OTFP/TAMD simulations, once the auxiliary temperature is fixed, we need to choose the friction of the auxiliary particle and the spring constant between it and the configuration. The latter is easy to fix, since it is only required to be strong enough to keep the system in a small region of the CV space around the auxiliary particle. We select a spring constant of 1500, which was found to be an adequate value (see
Figures S5 and S6 in the Supplementary Materials). In this way the averaged forces that the system exerts on the auxiliary particle will be a good approximation of the FEP gradient in the auxiliary particle position. The OTFP/TAMD auxiliary friction is a more important parameter in the method convergence. It has to be high enough to ensure there adiabatic decoupling of the system and the auxiliary particle dynamics. At the same time, the higher the friction is, the slower will be the sampling of the auxiliary particle across the CV space. We have chosen a friction value of 5000 times the system friction
. Finally, to avoid non-equilibrium effects in ABF, we have chosen to apply the bias in each bin after the gradient as been at least measured 100 times.
The average convergence is determined by running 10 independent simulations for each system and each method and computing the resulting average FEPs for each case. These average FEPs are shown in the
Figures S7–S9 of the Supplementary Materials. The average results show that the statistics of 10 independent simulations are enough to obtain a good sampling of the hidden barriers and recover the true FEP for all the cases studied in this work. The convergence to the true FEP allows us to ensure that the method parameters were properly chosen, especially parameters such as the Gaussian width in WTMD and the auxiliary friction in TAMD. If these parameters are badly chosen, the methods would have serious issues, such as convergence to an incorrect FEP. Finally, we show here the results of each individual simulation, since it is from the individual simulations where the statistics are not enough to achieve a good sampling of the hidden barrier that we can study its harmful effects, which is the goal of the paper.
In
Figure 3 we show each of the 10 independent FEPs reconstructed via ABF, WTMD and OTFP along the
x CV together with the “true” FEP computed via a direct (numerical) integration of Equation (
1). The reconstructed and the true FEPs are mutually aligned via least squares. In the same figure we also show the average convergence profiles for each method with its standard deviation. Note that these profiles result from averaging the individual convergence profiles, so each represents the expected convergence behavior of one individual simulation. These individual convergence profiles are obtained by measuring the root mean square deviation (RMSD) between the reconstructed profiles and the true FEP at each time, after their alignment via least square. In
Figure 3 it is possible to see that the three methods give an accurate approximation of the FEP and all of them reach a good convergence below an RMSD of 0.5 after a simulation time of 20
steps. However, it is notable that ABF and WTMD obtain a better convergence than OTFP.
In
Figure 4 we show the sampling trace of a representative simulation for each method. In a first analysis of this figure, it is possible to see that the three methods show a faster sampling of the
x CV than the one observed in the MD trace of
Figure 2. Furthermore, there is an interesting phenomenon that can be observed by a further comparison between
Figure 4 and the MD trace. Despite the fact that the acceleration is only focused on the
x coordinate, all three methods are evidently able to accelerate the orthogonal
y coordinate. An explanation for this acceleration arises when considering that the transitions
and
have a higher rate when the reactant density along
x increases at high-energy states, as occurs during the CV acceleration. For example, a particular
transition path that follows the MEP at
will have associated a higher barrier than a transition path that start from
in the A basin and goes through the transition state. The latter transition increases its probability when the probability to be in
increases, as in the accelerated simulations. Thus, although the MEP is orthogonal to the CV variable, the hidden barrier of our example maintain a certain degree of coupling with the CV. This coupling helps to reduce or nullify the negative effect of the hidden barrier’s presence. Finally,
Figure 4 shows that the sampling of OTFP is much slower than the one shown by ABF and WTMD. The origin of this difference resides in their different acceleration strategies. The acceleration sampling of the CV in the OTFP/TAMD approach is made via a high-temperature dynamics of the high-friction auxiliary particle. So this implies a slower motion across
x but, as it is demonstrated in the TAMD original paper [
17], a higher probability to overcome the associated barriers. On the contrary, the acceleration of ABF and WTMD does not require a higher-friction on
x, resulting in the faster sampling shown in the figure.
To further investigate the hidden barrier effect on the acceleration methods we change the CV definition by introducing a small rotation of
degrees to obtain a coordinate system,
, and using
as a CV. In
Figure 5 it is possible to see the impact of this slight change on the FEP. It is remarkable the significant change generated in the FEP while only this slight CV rotation has been introduced: now the
CV shows three FEP minima. However more remarkable is the effect that this small CV change has in the sampling of the free-energy methods.
Figure 6 shows the free-energy and convergence profiles for the ABF, OTFP and WTMD simulations. It is possible to see in these plots a performance drop relative to the previous results for the
x CV, which is more significant for the ABF and WTMD methods. In
Figure 7 we show the trace of some representative simulations. In this figure we can see that ABF gets stuck in an
region almost the entire simulation time. On the other hand, WTMD sampling is also affected, showing some periods of being stuck, although less significant than in ABF. Finally, OTFP does not seem to be affected by the CV change and no “sticking” behavior is observed.
The convergence behavior of WTMD and ABF has been formally analyzed in [
18,
19,
20] respectively. In particular, Lelievre et al. [
20] addressed the so-called “bi-channel” case, where there are two different free-energy channels along the CV separated by hidden barriers. They provide a formal treatment of this case and shows that the ABF convergence for this case will depend on the Boltzmann sampling trough the hidden barriers. This is the best scenario that we can expect for any of the methods treated in this work: since the hidden barriers are not being accelerated, their associated transitions will be the lowest modes needed to sample in order to reconstruct the correct FES with the contributions of all the PES metastable states. However, the conclusion achieved by Lelievre et al. is tied to the hypothesis that the (negative) free-energy is a “good bias” in each free-energy channel [
20].
We can consider the PES used in this work to be composed of three channels: the region of the
transition, with the other two channels leading to the
A and
D minima. In this scenario, the convergence issue observed for the
CV in respect to the
x CV can be understood following Lelievre et al. [
20]. While the FEP associated with
x CV can be considered a good bias for each channel, the FEP associated with
is not. The free-energy minima at
will result in a high-energy bias that will overlap the energy barrier of the
channel. Thus, the ABF and WTMD bias contribute to the barrier at
rather than improving the sampling along it, as it is shown in
Figure 8. As with many other adaptive importance-sampling techniques, we can expect the same issue to be present in WTMD simulations (see for instance remark 6 of the reference [
19] by Lelievre et al.). However, since OTFP is not an adaptive technique, we should not expect to be
a priori any further sampling issue, besides the associated with the natural slow sampling of the hidden barriers.
In order to further explore the arguments given in the paragraphs above, we modify the PES by deepening the
A and
D minima via increasing the
and
coefficients of Equation (
2) from 15 to
. Following the above discussion, we expect that this change will have a larger impact in the adaptive methods, since the self-constructed barrier shown in
Figure 8 will be increased. The resulting FEP and convergence profiles for are shown in
Figure 9. In these plots it is possible to see that the most affected method by this change is WTMD while ABF and OTFP retain the previous convergence behavior. The robust behavior of OTFP is expected from its non-adaptive nature. However, to give an explanation for the impact difference between WTMD and ABF we look to the sampling traces shown in
Figure 10. By comparing this figure with
Figure 7 it is possible to see that the trapping periods in WTMD trace significantly increase and start to dominate the entire simulation time. For ABF the trapping periods also increase but it already shows a dominance of the trapping periods over the entire simulation time in the previous system. Thus, the deepening of the
A and
D minima results in breaking the WTMD resistance to the trapping phenomena expected in the adaptive methods. To confirm that the sampling problems in WTMD and ABF arise from the choice of
as the CV and not for the solely deepening of
A and
D minima, we show in
Figures S10 and S11 of the Supplementary Materials the results for the
x CV. As expected, the three methods are able to recover the corresponding FES for the
x CV.
Using the basin trace of all the system studied in this work (see
Figure 4,
Figure 7,
Figure 10 and
Figure S11) we have constructed a histogram of the “trapping time” that the system spend in each minimum of the PES and for each method used. This histograms are included in
Figures S12–S14 and serves as statistical support to the conclusions drawn in the paragraphs above.