Automatic Determination of Stepsize Parameters in Monte Carlo Simulation Tested on a Bromodomain-Binding Octapeptide

The proportional integral controller, commonly used in engineering applications for process control, has been implemented for the tuning of the stepsizes in Metropolis Monte Carlo simulations. Similarly to the recent application for tuning the chemical potential parameter in grand-canonical ensemble simulation, the process-control approach was found to work well for the problem of selecting the stepsize for each torsion angle that results in a targeted acceptance rate during the simulation of an octapeptide in aqueous environment.


Introduction
The Metropolis Monte Carlo method [1] is the first computational procedure that was shown capable of sampling an ensemble of configurations suitable for statistical-mechanical treatment. While current simulation work in macromolecular systems generally employs the molecular dynamics approach with significant success, there are applications where Monte Carlo was found to be more efficient [2]. Also, the ease of restricting the sampling to a limited number of degrees of freedom makes Monte Carlo an attractive choice [3]. However, it is important to recognize that the Metropolis method actually encompasses a large family of methods [4] and thus there is still room for significant improvements and for 'niche' applications [5].
One of the factors determining the computational efficiency of Monte Carlo (MC) simulations is the proper selection of the range of changes from which trial moves will be sampled, commonly represented by the so-called stepsize parameters, e.g., the edge of the cube around a molecule within which the trial position must fall or the width of a torsion angle interval within which the trial angle should fall. Generally, the larger these parameters the larger conformational change is made in each step, but the smaller the acceptance probability will be. However, selection of stepsizes resulting in a targeted acceptance rate is a nontrivial process for system having several degrees of freedom of different types.
The present paper proposes an automated method that employs the proportional-integral controller (PIC) technique, well-known in engineering application. It has recently been shown to work extremely well in controlling the density during grand-canonical ensemble simulations [6].
In general, it is difficult to establish the optimal acceptance rate since the sampling efficiency is only known with any certainty after several long simulations. For systems with smooth free energy landscape, the accumulated experience in the field leads to target acceptance rates around 30% instead of the assumption of the early practitioners that 50% is the best [7,8]. For systems with a rugged free energy landscape, however, the optimum shifts toward smaller acceptance rates since larger stepsizes may cross easier barriers separating separate free-energy basins [9].
The present paper leaves the question of establishing the optimal acceptance rates for further study. Instead, it provides an automated solution of the problem of determining the stepsize parameters that produce a targeted acceptance rate, e.g., 30%. The availability of such technique, however, could help in establishing the optimum acceptance rate in a different study.
It should also be emphasized that while optimizing the stepsize selection is likely to produce a significant improvement in the ease and efficiency of simulating biochemical systems with the Monte Carlo approach, other developments, such as the generalized ensemble approach (e.g., Ref. [10]) or the development of more complex moves may prove more important. However, such developments can also employ the stepsize tuning technique described in the present paper.

Background
The selection of stepsize parameters in MC simulations is generally performed by trial and error, based on shorter runs. Given that the simulation efficiency is not sensitive to small changes in these parameters, such a procedure is adequate for systems with a few types of degrees of freedom sampled, e.g., for neat liquids or solutions involving small solutes. For systems containing a large number of degrees of freedom whose stepsize parameters have to be tuned separately (e.g., a polypeptide with a large number of different types of torsion angles), the trial and error approach breaks down since it requires much longer trial runs and it is practically impossible to consider the effect of the change in the stepsize parameter of one type of degrees of freedom on the acceptance rate of a different type. As a result, such simulations are generally run with poorly tuned stepsize parameters.
A procedure for automated tuning of stepsize parameters has already been introduced [11]. In that work the simulation was partitioned into blocks where acceptance rate statistics are gathered on the basis of which the stepsize parameters are modified to move toward the target acceptance rate. That work also emphasized the important point that the ensemble generated during tuning will deviate from the Boltzmann distribution. Thus, unless strict adherence to the Boltzmann distribution is not important (e.g., for simulated annealing), tuning should precede data gathering, even though continuing the tuning may improve the sampling efficiency as the stepsize parameters adapt to new conformations of the system. This tuning algorithm has been implemented into the CHARMM program for simulation of macromolecules [12] and tested in detail on the alanine dipeptide in vacuum [8].
Our recent experience with the automatic tuning of the chemical potential parameter called B in grand-canonical ensemble simulations to a target density gave us the motivation to develop the tuning technique presented here. For the tuning of B, three algorithms were tested: (1) change B based on the average densities over two successive blocks; (2) change B based on the fluctuation of the number of molecules N calculated in successive blocks; and (3) change B continually as dictated by the PIC technique. While the first two techniques even failed to converge on larger systems the PIC technique always worked and outperformed the other two by several orders of magnitude. The reason for this is that in order for the block-average based methods to work the block averages should be converged enough (otherwise the change made moved B in the wrong direction), but that unduly lengthens the calculation. This result thus suggests that the tuning based on simulation blocks is significantly less efficient than tuning by the PIC technique.

Methods
The process control technique identifies a process variable whose value has to be kept around its target and a manipulated variable M V whose value the controller changes continually to stabilize the process variable. In our case the process variable is the acceptance rate a and the manipulated variable is the stepsize parameter ∆. The canonical proportional-integral derivative control equation (PID) is written as follows [13]: where is the current deviation of the process variable from its target, t is the time, and the controller parameters K C , τ I , τ D , c s , are the proportional gain, integral time, derivative time, and controller bias, respectively. It has been shown that the dynamic stability of a controlled system can be sensitive to the selection of the derivative time. Therefore, the derivative term is frequently omitted unless empirically shown to be necessary [14]. Furthermore, we implemented the differential form of the control equation: This transformation solves the so-called 'integral windup' problem, whereby an historic period of large deviation dominates the integral term [14]. It has the added benefit of eliminating the controller bias as an additional tuning parameter. In general, the controller tuning parameters K C and τ I are chosen to make the controller most sensitive to changes in the process variable without introducing undue fluctuations in it as a result of controllerinduced changes. The general method of arriving at such controller tuning parameters K C and τ I is the open-loop protocol of Ziegler and Nichols [15]. Note, however, that the controller behavior is much less sensitive to the precise values of K C and τ I when compared to the sensitivity of the acceptance rate to the stepsize.
The process control equation has an implicit time dependence that is not present in MC simulations. To emphasize this, we rewrite the control equation once more in finite difference form in terms of the simulation step number, i: The calculation of i , however, requires some additional thought. Since any single step is either accepted or rejected, the acceptance rate at step i has to represent an average over a certain stretch of the simulation. To avoid having to either limit the controller action to simulation blocks or to having to store the acceptance information for a large number of steps, we chose to define the (average) acceptance rate a i at step i as a weighted average of the previous acceptance rates, with successively smaller weight given to more distant contributions. This can be achieved using where M acc/rej i is unity if the move was accepted at step i and zero if not. The second equation in Eq. (4) converts the recursion formula into an explicit one. Each application of the recursion introduces higher powers of w a , resulting in the sum over the number of steps involving different powers of w a . In order to include contributions from a long enough stretch of earlier moves, w a has to be close to one. As a result, this formula would give disproportionate weight to the first contribution unless it is zero (since everything else is multiplied by (1 − w a )) and thus the deviation from the target acceptance rate would be underestimated for a significant portion of the run. For this reason, the recursion is always started from zero acceptance rate, even though using the target acceptance rate is usually considered the better choice.
For simulations requiring strictly Boltzmann-weighted ensembles the tuning phase will be followed by simulation with fixed stepsizes. Since the stepsizes fluctuate continually during tuning phase -both due to the controller action and, more importantly, to the changes in the conformation -the current value of each stepsize will have random deviation from the value that would, on the average, result in acceptance rate closest to the target. Therefore, instead of using the value at the time the tuning is stopped, it is better to choose a value that is averaged over the last stretch of the tuning run. We tested both using simple averages over a stretch that already reached the targeted acceptance rates and using a weighted average, analogous to Eq. (4). Using w ∆ as the weight the weighted average of the stepsizes ∆ j over i moves, ∆ i is The normalization factor before the summation (based on the summation formula of the geometric series) is introduced to correct for the fact that the summation omits the term without a (1−w ∆ ) factor. Denoting the weighted sum of ∆'s with S i , we have the following recursion to calculate it: Note that the weight factor w a used for the stepsize averaging can be different from the w ∆ used for the acceptance rate calculation. Note again, that the closer w ∆ is to one, the longer stretch of the simulation will contribute significantly to the calculated stepsize average. The effect of the weighting schemes described by Eqs. (4)(5)(6) can be quantified if we write w as w = 1 − 1/n. Since (1 − 1/n) n e −1 , w N e −N/n = 10 −N/(n ln 10) .

Calculations
The stepsize tuning was implemented into the MC program MMC [16] for the translation and rotation of non-solvent molecules and for the torsion angle sampling, both for regular torsion moves and for local torsion moves. Regular torsion changes a single torsion angle. In local torsion moves [17][18][19] a backbone angle change (called the 'driver torsion') is combined with complementary changes in the six successive torsion angles to keep the remaining part of the backbone unchanged. The choice of the stepsize only affects the range of the driver torsion -the rest of the torsion angles are determined by the geometrical constraints imposed by the fixed bond lengths and bond angles. Note, that there is no guarantee to find a solution after a driver torsion change (in that case, of course, the move attempt is rejected). The larger the driver torsion change, the higher is the probability that no solution will be found.
Tuning of the torsion stepsizes poses by far the most serious problem since the acceptance of a given torsion change depends strongly on the environment and relative topology of each torsion. Thus the test of the tuning method introduced will focus on tuning torsion angle stepsizes.
For each degree of freedom a separate controller was implemented. However, we used the same controller tuning parameters K C and τ I (see Eq. 3) and same averaging weight w a (see Eq. 4) for all simple torsions, and a different set for all local torsions.
The system chosen for the test was the octapeptide (RHK(Ac-K)LMFK), the segment of the protein p53 that binds to PCAF Bromodomain. It was simulated with a ca 6Å layer of water, stabilized with the Primary Hydration Shell technique [20], the MC variant of the original version developed for MD simulations [21]. The water potential used was TIP3P [22] and both the solute-solute and the solutesolvent interactions were described by the CHARMM-27 force field [23]. The φ and ψ backbone torsion angles were sampled also with local moves [17][18][19] enhanced with the reverse proximity criterion [24]. The peptide bond ω was kept fixed and the iterative algorithm described in Ref. [24] was used to find the proximal solution to the chain closure problem.
This system has a total of 52 torsion angles and 294 water molecules. Eight of the 52 angles were sampled by local moves (resulting in changing six additional backbone angles). Since every 20th step involved the attempted change of a torsion angle every 1040 steps attempted to move each torsion angle once.

Results
As discussed above, the general method of arriving at the values of K C and τ I is the open-loop protocol of Ziegler and Nichols [15]. From our earlier experience [6], however, we had already reasonable values. Accordingly, the final values of K C and τ I and the weight factor w a of Eq. (4) were selected by runs using combinations of K C = 1 and K C = 10, τ I = 2000 and τ I = 20, 000, w a = 0.99, w a = 0.999, and w a = 0.9999 and monitoring the fluctuation of the acceptance rates and of the tuned stepsizes. In Table 1. Mean fluctuation of the tuned parameters during tuning.  engineering control scenarios, PIC performance is typically robust over a wide range of operating conditions [14]. By analogy, the parameters determined here are expected to be effective in a variety of simulation systems without modification. Three runs were performed with each parameter combination whose initial conformations were extracted from preliminary runs, at least 5 * 10 8 MC steps apart. The fluctuations were averaged separately for torsions that were sampled with local moves and those sampled as simple torsions. Table 1 shows the parameter combinations used, and the calculated standard deviations, of the acceptance rates and stepsizes after 25M (1M=10 6 ) and 50M MC steps. Standard deviations were calculated from 5M long block averages. Based on the calculated fluctuations we selected K C = 10, τ I = 2000 for the controller parameters. As for the weight to be used for the calculation of the 'current' acceptance rate, w a = 0.999 appeared to work best for simple torsions and w a = 0.9999 for the local torsional moves.
The evolution of the acceptance rates and step sizes in one of the three runs during tuning is given in Tables 2 and 3, resp., for the φ, ψ and χ 1 angles of the two middle residues (LYS 3 and Ac-LYS 4 ). The backbone torsion angles (φ and ψ) were sampled with local moves while the side chain torsion angles (χ 1 ) were sampled as simple torsion. Note that the convergence characteristics of the side chains are expected to be similar to those of the terminal residue(s). The results show that the targeted acceptance rate (30%) is reached reasonably quickly and stays steady. Table 2. Evolution of the acceptance rates for selected torsion angles during tuning. a t a a t a a t a a t a a t a a  Legend: (a) N MC is the number of Monte Carlo steps/10 6 . (b) φ, ψ, and χ are the torsion angles over the N-C α , C α -C, and C α -C β bonds, respectively; (c) the subscripts 3 and 4 refer to the residue numbers.
For the choice w ∆ of Eq. (5), tuning runs of length 25M, 50M, and 75M were followed by 50M long runs with fixed stepsizes. The stepsizes were obtained using Eq.(5-6) with w ∆ = 0.999, 0.9999, and 0.99999, after tuning runs of 25M, 50M and 75M long and also by averaging the tuned stepsizes over the last 10M or 25M of the run. The resulting acceptance rate ranges over the three independent runs and the average fluctuation of the acceptance rates in each run are shown in Table 4. For simple torsions, adequately tuned stepsizes were obtained in the 25M long run, using w ∆ = 0.999. However, the backbone torsions require both longer tuning runs and longer stretch to average. Using Eq. (6), at least 50M long tuning run is necessary with w ∆ = 0.99999. Similarly, using simple averaging, averaging over the shorter length used (10M) produced by far the largest acceptance rate spread (after the 50M tuning runs). The tuned stepsizes obtained with simple averaging were somewhat less reliable than the set obtained with Eq. (6). Table 3. Evolution of the stepsize parameters for selected torsion angles during tuning. In engineering applications the controller tuning parameters K C and τ I are selected to minimize the fluctations in the process variable that are due to the changes in the manipulated variable [15]. However, when the evolution of the system results in a change in the relation between the manipulated variable and the process variable then a well controlled system should respond by introducing changes in the manipulated variable. In the previous PIC application [6] the process variable -the solvent densityhad an essentially isotropic conformational space and thus the fluctuations in the manipulated variable -the chemical potential parameter B -were small. In the current application, however, there is a significant conformation dependence of the relation between the acceptance rate and the stepsize. This explains the relatively large acceptance rate fluctuation and the lack of success to find K C and τ I values that eliminate the fluctuation. Note, however, that the fluctuations in the acceptance rates after tuning are actually larger than the fluctuations during tuning, demonstrating that the tuning process is able to respond to the changing relation between stepsizes and acceptance rates. The magnitude of these fluctuations also shows the importance of selecting the right method for the averaging to determine the fixed stepsize to be used after the tuning. Table 4. Acceptance rates and their fluctuations using tuned (fixed) stepsizes determined with different averaging strategies. Legend: (a) N MC is the number of Monte Carlo steps/10 6 . (b) N t MC is the number of tuning steps. (c) w ∆ is the stepsize averaging weight factor in Eq. (6). N t MC MC steps of the tuning run. (d)N a MC is the length of the run the stepsizes were simply averaged over (when Eq. (6) was not used). (e) − and + refer to the smallest and largest value from the three independent runs. (f) represent averaging the averages over three independent runs.

Discussion
We presented an efficient and robust way to automatically tune stepsize parameters in a Monte Carlo simulation. While for systems involving a few degrees of freedom trial and error approach is usually adequate, for more complex systems it soon becomes an exercise in frustration as the acceptance rates converge much slower and the change of one stepsize may affect more than one acceptance rate.
Automatic tuning results in several distinct benefits: First, by eliminating the ad-hoc trials the overall computer time use is reduced. Second, by minimizing the need for human interventions, the overall timeframe of the project is reduced. One result of the relentless increase in processor speeds is that the human time spent with the setup of the project becomes the dominant part of the effort, thus savings at this step will have large impact. Last, but not least, automating the tuning ensures the uniform quality of the tuned parameters. Since the efficiency of sampling is the result of a complex interplay among the various parameters, lower-quality tuning of just a few simulation parameters can have a significant impact on the overall sampling efficiency.
An alternative algorithm for tuning stepsizes [11], based on block averages, has been tested on the alanine dipeptide [8]. Extensive testing resulted in the choice of blocks that included 200 consecutive move attempt of a torsion angle. In our system that would correspond to 2 * 10 5 (0.2M) MC steps. However, the acceptance rates for systems with as many torsional degrees of freedom as we have are far form converged during runs of this length. Therefore, the update of the stepsize could easily make a change in the wrong direction, necessitating many more iterations until convergence (such occurrences have indeed been observed in our earlier work tuning the chemical potential parameter). In contrast, the method introduced in this paper robustly approaches the target acceptance rate in a few million MC steps.
There are two important questions related to the stepsize tuning that the present paper did not address. First of all, it is important to point out that the conformational distribution sampled during the tuning phase of a simulation contains an unspecified bias with respect to the Boltzmann distribution [11]. This is true for any algorithm that adjusts the stepsize during sampling. This is not a problem for simulations that are performed as part of simulated annealing. For simulations that aim to obtain Boltzmann averages, however, the tuning step has to be performed separately from the production run.
The other question is the selection of optimal target acceptance rate. While in general the sampling efficiency is not too sensitive to the target acceptance rate -this happens to be the case for the test presented here -there are examples where the optimized setup resulted in different optimized acceptance rates for different sampling algorithms [24]. The availability of a robust automatic stepsize tuning algorithm, however, is expected to help establishing the optimization protocol for the selection of target acceptance rate.