Computation 2018, 6(1), 21; doi:10.3390/computation6010021
The Role of Conformational Entropy in the Determination of Structural-Kinetic Relationships for Helix-Coil Transitions
Max Planck Institute for Polymer Research, Mainz 55128, Germany
Received: 21 December 2017 / Accepted: 16 February 2018 / Published: 26 February 2018
Coarse-grained molecular simulation models can provide significant insight into the complex behavior of protein systems, but suffer from an inherently distorted description of dynamical properties. We recently demonstrated that, for a heptapeptide of alanine residues, the structural and kinetic properties of a simulation model are linked in a rather simple way, given a certain level of physics present in the model. In this work, we extend these findings to a longer peptide, for which the representation of configuration space in terms of a full enumeration of sequences of helical/coil states along the peptide backbone is impractical. We verify the structural-kinetic relationships by scanning the parameter space of a simple native-biased model and then employ a distinct transferable model to validate and generalize the conclusions. Our results further demonstrate the validity of the previous findings, while clarifying the role of conformational entropy in the determination of the structural-kinetic relationships. More specifically, while the global, long timescale kinetic properties of a particular class of models with varying energetic parameters but approximately fixed conformational entropy are determined by the overarching structural features of the ensemble, a shift in these kinetic observables occurs for models with a distinct representation of steric interactions. At the same time, the relationship between structure and more local, faster kinetic properties is not affected by varying the conformational entropy of the model.
Keywords: helix-coil transition; structural-kinetic relationships; coarse-grained dynamics; Markov state models
1. Introduction
Coarse-grained (CG) molecular simulation models have played a key role in laying the foundation for modern theories of protein folding, and continue to provide significant insight into the complex dynamical processes sampled by biological macromolecules [1,2]. These models are useful for providing microscopic interpretations that complement experimental findings, especially for systems and processes that are computationally out of reach for atomically-detailed models. For example, single-molecule experiments probe a large range of complex kinetic processes sampled by biomolecular systems, but require an underlying molecular model for accurate structural interpretations. The computational effort required to investigate a system depends not only on the sheer number of particles but also on the range of relevant timescales, the thermodynamic and chemical conditions, and the system variations considered (e.g., mutations). Therefore, CG models will continue to be essential for providing consistent and exhaustive interpretations of experimental observations.
Despite significant advances in the development of chemically-specific CG models for proteins [4,5,6,7], a fundamental challenge severely limits the predictive capabilities of CG models: the interpretation of CG dynamical properties. The process of removing degrees of freedom from a system typically results in decreased molecular friction and softer interaction potentials. This effect is a double-edged sword: it speeds up the sampling of configuration space while obscuring the connection to the true dynamics of the system. These “lost” dynamics not only prevent quantitative prediction of kinetic properties, but may also lead to qualitatively incorrect interpretations generated from CG simulations [8,9]. Unlike many polymer systems, where a homogeneous dynamical rescaling factor is capable of recovering the correct dynamics of the underlying system [10,11], the rescaling associated with the complex hierarchy of dynamical processes generated by biological molecules is likely a complex function of the system’s configuration. Consequently, the application of bottom-up approaches that aim to re-insert the appropriate friction via a generalized Langevin equation remains conceptually and computationally challenging.
We recently demonstrated that, if a CG model incorporates certain essential physics, simple relationships between structural and kinetic properties may emerge. These structural-kinetic relationships represent powerful tools that can be employed to ensure a CG protein model generates consistent kinetic information, in terms of both relative timescales of dynamical processes and the dynamical pathways sampled during a particular process. To identify these relationships we considered a model system for helix formation, a heptapeptide of alanine residues, and employed a flavored-Gō model [14,15,16], whose parameters are easily tuned to generate particular structural features. To ensure accurate modeling of the structural ensemble, i.e., to avoid sterically forbidden conformations, we incorporated a detailed representation of steric interactions into the model. We then performed a systematic search through parameter space, afforded by the simplicity of the model, and analyzed correlations between the emergent structural and kinetic properties. To validate the generality of our conclusions, we also considered a transferable model with more complex interaction potentials but a slightly simpler representation of steric interactions.
In this study, we investigate the robustness of our previous conclusions by considering a longer peptide with experimental reference data—the capped, helix forming peptide AC-(AAQAA)3-NH2. We follow the strategy of our previous study, while expanding upon the models employed, which clarifies the impact of model representation or, more precisely, conformational entropy on the resulting structural-kinetic relationships. More specifically, by varying the energetic parameters of a given model type, we keep the conformational entropy approximately fixed and demonstrate that the global, long timescale kinetic properties (i.e., ratio of folding to unfolding timescales) are determined precisely by the average helical content of the ensemble. Comparison between two distinct model types demonstrates a shift in the timescale ratios due to the change in model representation. Furthermore, by adjusting the steric interactions of one model type, we provide clear evidence that the conformational entropy is the dominant contributor to this shift. In contrast, we find that more local, faster kinetic processes are consistently determined by structural features of the ensemble, regardless of the precise model representation.
2. Computational Methods
2.1. Coarse-Grained (CG) Models
2.1.1. Hybrid Gō (Hy-Gō)
To investigate the relationship between structural and kinetic properties generated from CG simulation models, we employ a Gō-type model, which defines attractive interactions based on the location of atoms in the native structure. More specifically, we use a flavored-Gō model with three simple, Gō-type parameters [14,15,16,17]: (i) a native contact (nc) attraction, employed between pairs of atoms which lie within a certain distance in the native structure, i.e., the α-helix, of the peptide, (ii) a desolvation barrier (db) interaction, also employed between native contacts, and (iii) a hydrophobic (hp) attraction, employed between all pairs of atoms of the amino acid side chains. The same functional forms are employed as in many previous studies, with a tunable prefactor for each interaction as described below. The forms of the interactions are illustrated in the top two panels of Figure 1.
In addition to the three Gō-type interactions, we also partially employed a standard all-atom (AA) force field, AMBER99sb, to model both the steric interactions between all non-hydrogen atoms and the specific local conformational preferences along the chain. More specifically, the bond, angle, dihedral, and 1–4 interactions of the AA force field are employed without adjustment. To incorporate generic steric effects, without including specific attractive interactions, we constructed Weeks-Chandler-Andersen potentials (i.e., purely repulsive potentials) directly from the Lennard-Jones parameters of each pair of atom types in the AA model (bottom panel, Figure 1). For simplicity of implementation, we then fit each of these potentials to a simple functional form. The van der Waals attractions and all electrostatic interactions in the AA force field were not included and water molecules were not explicitly represented. The total interaction potential for the model may be written as a weighted sum of the nc, db, hp, and backbone (bb) terms, where the backbone interaction includes both the intramolecular and steric interactions determined from the AA force field. The coefficients of the three Gō-type terms represent the only free parameters of the model, while the coefficient of the bb term is held fixed.
The philosophy of this model is that the three Gō-type interactions will roughly sample the correct conformational ensemble for short peptides, while atomically-detailed local sterics restrict the model to sample a physically-realistic ensemble of structures. More specifically, the steric interactions ensure that (i) conformations that are sterically forbidden in the all-atom (AA) model are not sampled and (ii) the relevant regions of the Ramachandran plot are sampled, while retaining barriers between metastable states. We have previously determined that these characteristics are essential for constructing models with reasonable kinetic properties [9,20]. Although the local sterics of the backbone are modeled with near-atomistic resolution, the Hy-Gō model substitutes the full atomistic description of the peptide with a highly coarse-grained representation by modeling the complex combination of dispersion and electrostatic peptide-peptide and peptide-solvent interactions with a limited set of simple pairwise interactions. Note that this is quite distinct from all-atom Gō models, which employ atomically-detailed energetic parameters based on the positions of each atom in the native structure.
As an alternative CG model, we considered PLUM, which also describes the protein backbone with near-atomistic resolution, while representing each amino acid side chain with a single CG site, within an implicit water environment. In PLUM, the parametrization of local interactions (e.g., sterics) aimed at a qualitative description of Ramachandran maps, while longer-range interactions (hydrogen bond and hydrophobic) aimed at reproducing the folding of a three-helix bundle, without explicit bias toward the native structure. The model is transferable in that it aims at describing the essential features of a variety of amino-acid sequences, rather than an accurate reproduction of any specific one. After parametrization, it was demonstrated that the PLUM model folds several helical peptides [6,22,23,24,25], stabilizes β-sheet structures [6,26,27,28,29], and is useful for probing the conformational variability of intrinsically disordered proteins. We also considered four minor reparametrizations of the PLUM model:
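The construction of a purely repulsive potential from Lennard-Jones parameters can be sketched as follows. This is a minimal illustration of the standard Weeks-Chandler-Andersen form (the Lennard-Jones potential truncated at its minimum and shifted to zero there), not the implementation used in this work; the function name and argument names are hypothetical, and the subsequent fit to a simpler functional form is not reproduced.

```python
def wca_potential(r, epsilon, sigma):
    """Purely repulsive WCA potential built from Lennard-Jones parameters.

    r       : pair distance (same length unit as sigma)
    epsilon : Lennard-Jones well depth
    sigma   : Lennard-Jones diameter
    (Hypothetical helper for illustration only.)
    """
    # The Lennard-Jones minimum sits at r = 2^(1/6) * sigma.
    r_cut = 2.0 ** (1.0 / 6.0) * sigma
    if r >= r_cut:
        return 0.0
    sr6 = (sigma / r) ** 6
    # Lennard-Jones energy shifted up by epsilon so that it vanishes at r_cut.
    return 4.0 * epsilon * (sr6 * sr6 - sr6) + epsilon
```

Because the potential is zero beyond the Lennard-Jones minimum, it excludes overlapping conformations without adding any attraction between atoms.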
1. The side chain van der Waals radius is decreased to 90% of its original value.
2. The hydrogen-bonding interaction strength is decreased to 94.5% of its original value.
3. The hydrogen-bonding interaction strength is decreased to 90% of its original value.
4. The side chain interaction strength is decreased to 95% of its original value.
Although PLUM provides a near-atomistic representation of the peptide backbone, the representation is coarsened with respect to the atomically-detailed backbone and side chain sterics of the Hy-Gō model. It is important to note that while parametrizations 2–4 change only energetic parameters (i.e., approximately fixed conformational entropy), parametrization 1 significantly changes the conformational entropy of the system by adjusting the steric interactions of the amino acid side chains. In the following, we refer to the latter as the PLUM-ent model.
2.2. Simulation Details
In this study, we consider the capped, helix-forming peptide AC-(AAQAA)3-NH2, which has been extensively characterized both computationally and experimentally and is often employed as a reference system for force field optimization. Throughout the manuscript, we refer to the system simply as .
CG molecular dynamics simulations of the peptide with the Hy-Gō model were performed with the Gromacs 4.5.3 simulation suite in the constant NVT ensemble, while employing the stochastic dynamics algorithm with a friction coefficient and a time step of . For each model and for each temperature considered, 40 independent simulations were performed with starting conformations varying from full helix to full coil. Each simulation was performed for , recording the system every . The CG unit of time can be determined from the fundamental units of length, mass, and energy of the simulation model, but does not provide any meaningful description of the dynamical processes generated by the model. In this case, ps.
CG simulations of the peptide with the PLUM force field [6,24,33] were run using the ESPResSo simulation package. For details of the force field, implementation, and simulation parameters, see Bereau and Deserno. For each temperature considered, a single canonical simulation was performed for with a time step of , recording the system every , where ps. Temperature control was ensured by means of a Langevin thermostat with friction coefficient .
2.3. Lifson-Roig Models
According to the Lifson-Roig (LR) formulation [35,36], a peptide system is treated as a 1D-Ising model, which represents the state of each residue as being either helical, h, or coil, c. These simple equilibrium models employ two parameters, w and v, which are related to the free energy of helix propagation and nucleation, respectively. These parameters may be determined directly from simulation data using a Bayesian approach, and describe the overarching structural characteristics of the underlying ensemble. The average fraction of helical segments, i.e., the propensity of sequential triplets of h states along the peptide chain, is directly determined from w and v. Although residue- or sequence-specific LR parameters may be determined in order to more faithfully reproduce the helix-coil properties generated from a simulation of a particular peptide system, in this work we determine a single set of {w, v} for each simulation model. These simple models are incapable of describing certain features, e.g., end effects, of the underlying systems; however, we utilize the LR parameters only as a characterization tool. For consistent comparison with the experimentally-determined w parameter, we determine w for each model while holding v fixed.
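To make the connection between the two LR parameters and the helical content concrete, the sketch below evaluates the average fraction of helical segments by brute-force enumeration of all h/c sequences. It assumes the textbook LR weighting (a helical residue contributes w when both neighbors are helical and v otherwise, a coil residue contributes 1, and positions beyond the chain ends count as coil) and normalizes by the number of possible triplets; the function names are hypothetical, and this is not the Bayesian fitting procedure used in this work.

```python
from itertools import product


def lr_weight(states, w, v):
    """Statistical weight of one h/c sequence (chain ends treated as coil)."""
    padded = ("c",) + tuple(states) + ("c",)
    weight = 1.0
    for i in range(1, len(padded) - 1):
        if padded[i] == "h":
            both_helical = padded[i - 1] == "h" and padded[i + 1] == "h"
            weight *= w if both_helical else v
    return weight


def count_helical_segments(states):
    """Number of consecutive hhh triplets in the sequence."""
    return sum(
        1 for i in range(len(states) - 2)
        if states[i:i + 3] == ("h", "h", "h")
    )


def mean_helical_segment_fraction(n_residues, w, v):
    """Ensemble-averaged fraction of hhh triplets over all 2^n sequences."""
    partition = 0.0
    weighted_count = 0.0
    for states in product("hc", repeat=n_residues):
        weight = lr_weight(states, w, v)
        partition += weight
        weighted_count += weight * count_helical_segments(states)
    return weighted_count / (partition * (n_residues - 2))
```

Enumeration is only practical for short chains (15 sites already give 32,768 sequences); a transfer-matrix evaluation would be the usual choice for longer ones.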
2.4. Markov State Models
Given a trajectory generated from molecular dynamics simulations, Markov state models (MSMs) attempt to approximate the slow modes of the exact dynamical propagator with a finite transition probability matrix [37,38,39]. This requires a discretization of configuration space, which groups all possible configurations into a manageable set of microstates. Once the microstates are chosen, the number of observed transitions from microstate i to microstate j at a fixed time separation (the lag time) is determined. The matrix of transition counts embodies the dynamics of the simulation trajectory. An estimator for the transition probability matrix is then constructed such that the simulation data is optimally described, while ensuring normalization and detailed balance constraints. The latter constraint applies to any system at equilibrium and alleviates finite-sampling issues. The transition probability matrix is constructed by maximizing the posterior probability of the matrix given the observed counts, where we applied Bayes’ theorem and a uniform prior distribution.
In the present work, MSMs are built from the helix-coil trajectories generated by each CG model. To determine the MSM microstate representation, time-lagged independent component analysis was performed on the configurational space characterized by the dihedral angles of each residue along the peptide backbone. A density clustering algorithm, developed by Sittel et al., was then applied to the five “most significant” dimensions in order to determine the number and placement of microstates. We employed the same parameters as used in the original publication for clustering along an all-atom trajectory. In particular, we used a radius of 0.1, an energy spacing of 0.1 for the free energy screening, and an energy spacing of 0.2 for constructing the network of clusters. Population minima of 400 and 200 samples per cluster were employed for the Hy-Gō and PLUM models, respectively.
This semi-automated procedure for determining the relevant microstates yields a different number of microstates for each system and for each temperature considered. To estimate the average structural characteristics for a given microstate, we first performed a simple regular space, two-state clustering along the dihedral angle for each peptide bond. This resulted in a dividing surface approximately at the barrier between the helical and non-helical metastable regions on the Ramachandran plot. We denoted states in the helical region as h and all other states as c, and determined an approximate trajectory of the sequence of h/c states. We then compared this trajectory with the trajectory generated from the density clustering and determined the average h/c content for each microstate. These quantities were used to determine the Lifson-Roig parameters, but do not inform the construction of the MSM in any way.
Following the determination of microstates, the simulation trajectories were “cored” using the most probable path analysis to avoid imperfect definitions of dividing surfaces between microstates. A constant waiting time was used for all microstates and was determined for each system individually in order to ensure that the resulting implied timescales of the model were approximately constant with increasing lag time. This resulted in waiting times between 20 and 100 depending on the particular parametrization of the Hy-Gō (PLUM) model and the simulation temperature. Using the cored microstate trajectories, MSMs were generated via a standard maximum-likelihood technique. For each MSM, the lag time was chosen in the normal way by constructing MSMs for increasing lag times to determine when the resulting set of timescales was converged. Typically, the lag time closely corresponded to the waiting time used in the coring analysis and did not exceed three times this number. Figure 2a presents a representative implied timescale plot. To validate the resulting models, we performed the standard Chapman-Kolmogorov tests in order to compare the probability decay from various metastable states estimated from the simulation and predicted from the MSM. Figure 2b presents a representative Chapman-Kolmogorov result for the decay of probability from the helix state as a function of time. MSM construction and analysis were performed using the PyEMMA package.
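The counting and normalization steps can be illustrated with a small, dependency-free sketch. Note that it enforces detailed balance by simply symmetrizing the count matrix before row normalization, which is a crude stand-in chosen for brevity and not the posterior-maximizing reversible estimator described above; the function names and the sliding-window counting convention are assumptions.

```python
def count_matrix(dtraj, n_states, lag):
    """Count transitions i -> j separated by `lag` frames (sliding window)."""
    counts = [[0.0] * n_states for _ in range(n_states)]
    for t in range(len(dtraj) - lag):
        counts[dtraj[t]][dtraj[t + lag]] += 1.0
    return counts


def reversible_transition_matrix(counts):
    """Row-normalize the symmetrized counts.

    Symmetrization guarantees detailed balance with a stationary
    distribution proportional to the row sums of the symmetrized counts.
    This is a simplification, not a maximum-posterior estimator.
    """
    n = len(counts)
    sym = [
        [0.5 * (counts[i][j] + counts[j][i]) for j in range(n)]
        for i in range(n)
    ]
    tmat = []
    for row in sym:
        total = sum(row)
        tmat.append([c / total if total > 0 else 0.0 for c in row])
    return tmat
```

For example, the two-state trajectory `[0, 0, 0, 1, 0, 0, 1, 1, 0]` at a lag of one frame gives counts of 3, 2, 2, and 1 for the 0→0, 0→1, 1→0, and 1→1 transitions, and each row of the resulting matrix sums to one.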
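The two validation steps mentioned above rest on simple relations that can be sketched for a two-state transition matrix. The implied timescale follows from the non-unit eigenvalue as t = −τ / ln λ, and the Chapman-Kolmogorov test compares the k-th power of the matrix estimated at lag τ with a matrix estimated directly at lag kτ. The functions below are hypothetical helpers written in plain Python for illustration; they are not the PyEMMA routines used for the analysis.

```python
import math


def implied_timescale_two_state(tmat, lag):
    """Implied timescale of a 2x2 row-stochastic matrix.

    The second eigenvalue of a 2x2 stochastic matrix equals trace - 1;
    it must lie strictly between 0 and 1 for the logarithm to be valid.
    """
    lam = tmat[0][0] + tmat[1][1] - 1.0
    return -lag / math.log(lam)


def mat_mul(a, b):
    """Plain matrix product of two square lists-of-lists."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def mat_power(tmat, k):
    """k-th power of a square matrix (k >= 0)."""
    n = len(tmat)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, tmat)
    return result


def chapman_kolmogorov_error(tmat_lag, tmat_klag, k):
    """Largest element-wise deviation between T(lag)^k and T(k * lag)."""
    predicted = mat_power(tmat_lag, k)
    n = len(predicted)
    return max(
        abs(predicted[i][j] - tmat_klag[i][j])
        for i in range(n) for j in range(n)
    )
```

A Markovian model gives implied timescales that are flat in the lag time and a small Chapman-Kolmogorov deviation; growing deviations signal that the lag time or the state definition needs revisiting.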
We note that in the following analysis the uncertainty of observables estimated from MSMs or from the LR models is not explicitly determined. This uncertainty is difficult to accurately estimate because we are determining these properties via a simplified description of the simulation, which results in compounding sources of errors. In particular, there are errors due to both the approximate representation of configuration space and also sampling errors from the simulation. We have minimized the latter by running many independent simulations for each model as described above, while the former is assessed via the implied timescale and Chapman-Kolmogorov tests discussed in this section. While there are methods for estimating the uncertainty in, e.g., the transition probability elements, this becomes prohibitively expensive for the large number of distinct MSMs considered in this work. Since the correlations between observables determined in the work demonstrate very small variations (as indicated by the Pearson correlation coefficients), we assert that the uncertainty in the observables is rather small, resulting in robust conclusions extracted from the analysis.
3. Results and Discussion
In this study, we investigate the relationship between structural and kinetic properties of helix-coil transition networks generated by microscopic simulation models. As a structural-characterization tool, we construct Lifson-Roig (LR) models from each simulation trajectory. Although the kinetics of helix-coil transitions are often interpreted in terms of a kinetic extension of the Ising model [45,46], the precise impact of the model’s assumptions on the fine details of the resulting kinetic network is not well understood. For example, we recently demonstrated that the topological details of the kinetic networks generated by various simulation models may vary drastically, while generating nearly identical structural properties. Here, we construct Markov state models (MSMs) directly from the simulation trajectories, allowing a more complex relationship between the LR parameters and the kinetic properties of the system.
As a model system, we investigate the AC-(AAQAA)3-NH2 peptide, which contains 15 peptide bonds. From the LR point of view, there are 2^15 = 32,768 states, determined by enumerating the various sequences of helical (h) and coil (c) states along the peptide backbone. However, this representation is inappropriate for constructing MSMs from the simulation trajectories. Instead, we systematically determine coarser, representative microstates, which correspond to collections of the more detailed configurations of the system. See the Methods section for a detailed description of microstate determination. For reference, we employ data from previous analysis of the system. In particular, we characterize the ensembles generated by CG simulation models with respect to those obtained from NMR experiments and from simulations of the ff03* all-atom (AA) model.
We consider two distinct types of CG models in this study, as described in greater detail in the Methods section. First, we employ a relatively simple, native-biased CG model. The hybrid Gō (Hy-Gō) model is a flavored-Gō model [14,15,16], with three Gō-type interactions and also physics-based interactions in the form of sterics and torsional preferences along the backbone. We constructed 15 distinct Hy-Gō parametrizations by varying the Gō-type interactions while keeping the steric interactions fixed (i.e., the conformational entropy of the models remains approximately unchanged). We also considered the transferable PLUM model along with four minor reparametrizations. Three of the reparametrizations also corresponded to changing only the energetic parameters, while one reparametrization changed the side chain bead size and, thus, the conformational entropy with respect to the other PLUM models. We simulated each model over a range of temperatures to assess their thermodynamic properties (i.e., the temperature dependence of structural quantities). The energy scales of the models were aligned by shifting the temperature scale such that the average fraction of helical segments (i.e., three consecutive helical states) at the reference temperature matched the experimental value at 300 K. For each model and each temperature considered, we determined the two LR parameters, {w, v}, and constructed an MSM directly from the simulation data.
3.1. Properties of the Helix-Coil Transition
We characterized the overall structure of each ensemble in terms of the LR free energy of helix extension, w, and the average fraction of helical segments. Figure 3a,b present these two quantities, respectively, as a function of temperature for the 15 Hy-Gō models (colored curves with circle markers) and five PLUM models (gray-scale curves). The original PLUM model and three energetic reparametrizations are denoted with square markers, while the steric reparametrization, named “PLUM-ent”, is denoted with an X marker. From the temperature dependence of w, we fit a thermodynamic model to quantify the T-independent enthalpy and entropy of helix extension. In particular, we assume that the free energy of helix extension is the sum of a temperature-independent enthalpy and an entropic term linear in T, neglecting the heat capacity contributions to the free energy and considering a relatively small range in T. The enthalpy corresponds to the slope of the curves in Figure 3a and is a simple measure of the cooperativity of the transition. For consistency with the CG models, we fit the experimental w values from a similarly small temperature range, resulting in a slightly inflated experimental enthalpy compared with that reported in previous work.
Figure 3c presents the slope of the temperature dependence of the average fraction of helical segments versus the enthalpy of helix extension for each model. Recall that within the LR formulation, both the helix extension parameter, w, and the helix nucleation parameter, v, contribute to determining the average fraction of helical segments. Since it is typically assumed that v is dominated by entropic contributions, one may expect a linear correlation between this slope and the enthalpy for models with similar conformational entropy (i.e., models that differ only by their energetic parameters). (Note that although we determine LR models with fixed v, the corresponding entropic effects are effectively folded into w to generate the appropriate fraction of helical segments.) Indeed, Figure 3c demonstrates that both the Hy-Gō and PLUM models (except the PLUM-ent model, gray X marker) generate individual linear correlations between the two quantities. This is consistent with our previous results for the alanine heptapeptide. For models with differing conformational entropy, we expect, in general, distinct trends between the slope and the enthalpy, as long as entropy plays a significant role in determining the fraction of helical segments. This is clearly the case for the Hy-Gō and PLUM models for AC-(AAQAA)3-NH2 (Figure 3c). In our previous investigation of the heptapeptide, we found a consistent linear correlation for both model types. This could be because either (i) entropy does not play a significant role in determining the fraction of helical segments for the shorter peptide or (ii) the PLUM models considered happened to fall on the intersection of the two linear trends. The implication of conformational entropy as the dominant factor in the difference between the two linear correlation trends in Figure 3c is strongly supported by the PLUM-ent model, which represents a change in the conformational entropy via adjustment of steric interactions. For this model, the van der Waals radius of the side chain was adjusted to better represent the Ramachandran regions sampled by a higher-resolution model. This adjustment results in a trend in closer agreement with the Hy-Gō models, which employ detailed steric interactions that faithfully describe the conformational entropy of the peptide backbone. Figure 3c demonstrates that the slope of the correlation for the various parametrizations of the Hy-Gō model is consistent with both experimental and AA results.
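A fit of this kind can be sketched with an ordinary least-squares line. The example below assumes that w is a Boltzmann-type weight, so that ln w = −ΔH/(k_B T) + ΔS/k_B with temperature-independent ΔH and ΔS (a van 't Hoff form); this functional form, the unit choice for k_B, and the function name are assumptions made for illustration and may differ from the convention used for Figure 3a.

```python
import math

# Boltzmann constant in kcal/(mol K); the unit choice is an assumption.
K_B = 0.0019872041


def fit_enthalpy_entropy(temperatures, w_values):
    """Least-squares fit of ln(w) against 1/T.

    Assumes ln w = -dH / (k_B T) + dS / k_B with T-independent dH and dS.
    Returns (dH, dS) in kcal/mol and kcal/(mol K).
    """
    xs = [1.0 / t for t in temperatures]
    ys = [math.log(w) for w in w_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    delta_h = -K_B * slope      # slope of ln w vs 1/T equals -dH / k_B
    delta_s = K_B * intercept   # intercept equals dS / k_B
    return delta_h, delta_s
```

Restricting the fit to a narrow temperature window, as done in the text, is what justifies treating ΔH and ΔS as constants.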
3.2. Validation of Structural-Kinetic Relationships for AC-(AAQAA)3-NH2
Our previous work identified a robust relationship between structural and kinetic properties generated by distinct models for the alanine heptapeptide. In particular, it was demonstrated that the average fraction of helical residues and the average fraction of helical segments determined the ratio of rates for characteristic nucleation and elongation processes. The calculation of nucleation and elongation rates for the heptapeptide depended upon the identification of particular microstates representing nucleation states of the peptide. Here, the situation is complicated by our heterogeneous microstate representation for the ensembles. For this reason, we first considered coarser processes: global folding and unfolding of the peptide. We define the timescales of folding and unfolding as the mean first passage times from the coil ensemble to the full helix state and vice versa, respectively. The coil ensemble comprises all microstates whose average structure contains no consecutive set of three residues with greater than 50% helicity (according to a naive dividing surface definition of the helical state of a single residue, see Methods section for details). Subsequently, we also considered faster processes corresponding to the waiting time that each residue, i, spends in a helical state before transitioning to the coil state, and vice versa. These waiting times were calculated using the two-state dividing surface along the Ramachandran plot, as described in the Methods section. Note that since the CG unit of time does not correspond to a physical time, there is always an arbitrary speed-up associated with dynamical properties generated by CG models. For this reason, we only consider ratios of timescales, to effectively account for this speed-up.
Figure 4ai presents the temperature dependence of the ratio of folding to unfolding timescales. Remarkably, the Hy-Gō models generate nearly identical folding versus unfolding timescales at the reference temperature, regardless of the particular parametrization. The PLUM models with similar conformational entropies also demonstrate similar folding to unfolding ratios, although the variance is somewhat higher than for the Hy-Gō models. Recall that the energy scales of the models are aligned by shifting the temperature such that all models have the same average fraction of helical segments at the reference temperature. Thus, similar to our previous results for nucleation/elongation timescales, the folding/unfolding timescales are largely determined by the average fraction of helical segments in a consistent manner over all parametrizations of a given model type. However, in this case, there is a shift in the timescale ratio at the reference temperature depending on the details of the model representation. This is supported by the fact that the PLUM-ent model (Figure 3c) displays kinetics at the reference temperature which are closer in line to the Hy-Gō models. The validity of the structural-kinetic relationship is further demonstrated in Figure 4aii, which presents the slope of the curves in panel (ai) versus the enthalpy of helix elongation. Interestingly, the temperature dependence of the timescale ratio is dictated by this enthalpy, determined from the LR model, rather than by the slope of the actual helical content, although these two quantities largely coincide within a model class. It is noteworthy that this particular structural-kinetic relationship is well known: the ratio of folding to unfolding timescales quantifies the free energy difference between the folded and the unfolded state for a two-state folder.
Panel (b) of Figure 4 presents similar results to panel (a), but for the ratio of waiting time in the helical (h) state to waiting time in the coil (c) state for a particular residue i. Here, i represents a position on the helix and an average is performed over residues in that position from either end of the helix. In particular, panel (b) presents results from the terminal ends of the helix (excluding capping groups), which correspond to the fastest h to c transitions along the peptide chain. However, equivalent results were obtained for each position along the peptide. Unlike the folding to unfolding ratios, the various models generate distinct waiting time ratios at the reference temperature. However, panel (bi) demonstrates that the differences in the ratios are determined by structural features of the ensemble at the reference temperature, namely, the average fraction of helical residues and the average fraction of lone helices. In other words, while the global folding to unfolding ratio is determined only by the average fraction of helical segments, the consistency of local kinetics requires further alignment of the structural ensemble. Correspondingly, panel (bii) demonstrates that the temperature dependence of the ratio of waiting times depends not only on the enthalpy of helix extension but also on the temperature dependence of these two structural quantities.
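The per-residue waiting times and their ratio can be illustrated with a short sketch operating on a sequence of h/c labels for a single residue. It averages the lengths of all uninterrupted runs of each state, including the truncated first and last runs, which is a simplification; the function names are hypothetical and the input is assumed to contain both states. Because only the ratio is reported, the arbitrary CG time unit cancels.

```python
from itertools import groupby


def mean_dwell_times(states, dt=1.0):
    """Mean duration of uninterrupted runs of each state label.

    All runs are included, even those cut off by the trajectory ends
    (a simplification; censored runs bias the estimate for short data).
    """
    runs = {}
    for label, group in groupby(states):
        runs.setdefault(label, []).append(sum(1 for _ in group) * dt)
    return {
        label: sum(lengths) / len(lengths)
        for label, lengths in runs.items()
    }


def waiting_time_ratio(states):
    """Ratio of the mean helical to the mean coil waiting time."""
    dwell = mean_dwell_times(states)
    return dwell["h"] / dwell["c"]
```

For the label sequence `hhhchhcc`, the helical runs last 3 and 2 frames and the coil runs 1 and 2 frames, giving mean waiting times of 2.5 and 1.5 frames, respectively.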
3.3. Thermodynamics and Transition Network Topology
Our previous analysis of suggested that topological features of the transition network at a single reference temperature may be capable of determining the thermodynamics (i.e., temperature dependence) of the model, without explicitly taking into account simulations at various temperatures . We found that the distribution of paths from full helix to full coil states in the network provided a good characterization of the topology of the network. To quantify this distribution, we employed the conditional path entropy , , which characterizes the average degree of randomness for paths from s to d passing through a particular intermediate state, u. Our hypothesis was that models with more directed transitions to the helix state, i.e., lower average conditional path entropy in the folding direction, would undergo a faster change in helical content with decreasing temperature (i.e., higher cooperativity). Our results demonstrated that this naive picture was partially valid, although a more complex relationship between the topology and cooperativity was necessary for a quantitative description .
Figure 5 presents the conditional path entropy in the folding direction, averaged over all intermediate states, versus the enthalpy of helix extension for each model. The PLUM-ent model was left out because its microstate representation contained too few states to produce meaningful results with this analysis. It is also important to note that the models were not simulated exactly at the reference temperature, which introduces error into any underlying trend. For numerical convenience, we weighted each of the features by the fractional flux passing through each state. We also chose to normalize the path entropies by the total number of paths for each network. Similar to our previous results, the conditional path entropy provides a rough description of the cooperativity of the model. Extrapolating to the experimental value of the enthalpy of helix extension (∼1.9), Figure 5 implies that the experimental helix-coil transition would have a very low conditional path entropy and would thus be a highly directed transition. This is in line with the conventional view of a two-state transition, where there is an exponential suppression of intermediate states.
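The flux weighting and normalization mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the per-state entropies, fluxes, and path count are hypothetical inputs with illustrative values, and applying the path-count normalization as a single division of the weighted mean is an assumption, not a detail given in the text.

```python
import numpy as np

def flux_weighted_mean(values, flux, n_paths=1):
    """Average per-state values, weighting each state by its share of the total flux.

    `n_paths` optionally rescales the result by the number of paths in the
    network (assumed normalization; the default of 1 leaves the mean unchanged).
    """
    values = np.asarray(values, dtype=float)
    flux = np.asarray(flux, dtype=float)
    weights = flux / flux.sum()  # fractional flux through each state
    return float(np.dot(weights, values)) / n_paths

# Hypothetical conditional path entropies and fluxes for three intermediate states
H_u = [2.0, 4.0, 6.0]
flux_u = [1.0, 1.0, 2.0]
avg = flux_weighted_mean(H_u, flux_u)  # weights 0.25, 0.25, 0.5
```

States that carry little reactive flux then contribute little to the average, which is the intent of weighting by fractional flux.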
4. Conclusions
We recently demonstrated structural-kinetic relationships for peptide secondary structure formation that emerge if a CG model incorporates certain essential physics. In this work, we further validate these structural-kinetic relationships for a longer peptide, where the representation of configuration space in terms of the full enumeration of sequences of helical/coil states along the peptide backbone is impractical. Furthermore, we clarify the role of conformational entropy in the determination of these relationships by distinguishing between models with fixed or varying steric interactions. More specifically, a change in the conformational entropy, achieved either by changing the model representation or by explicitly adjusting the steric interactions, results in a shift in the global, long timescale kinetic observables at the reference temperature. However, this does not affect the temperature dependence of the kinetic properties, which vary consistently with the enthalpy of helix extension. Additionally, local kinetic properties are determined by structural features of the ensemble, without dependence on the model representation. The semi-automated construction of Markov state models provides a network picture of dynamics, allowing at least the partial characterization of temperature-dependent quantities from a single reference temperature. In combination with our previous work [9,13,20], the results presented here provide a potential strategy for ensuring kinetic accuracy in CG models through the matching of particular structural quantities. This approach requires further identification and validation of structural-kinetic relationships for tertiary structure formation, and may be especially useful for providing structural interpretations for kinetic protein experiments.
Acknowledgments
We thank Marius Bause and Alessia Centi for critical reading of the manuscript. This work was funded through a postdoctoral fellowship from the Alexander von Humboldt Foundation (Joseph F. Rudzinski) and an Emmy Noether fellowship of the Deutsche Forschungsgemeinschaft (Tristan Bereau).
Author Contributions
Joseph F. Rudzinski and Tristan Bereau conceived and designed the experiments, performed the experiments, analyzed the data, and wrote the paper. Both authors have read and approved the final manuscript.
Conflicts of Interest
The authors declare no conflict of interest.
The following abbreviations are used in this manuscript:
MSM: Markov state model
1. Dill, K.A.; Chan, H.S. From Levinthal to Pathways to Funnels. Nat. Struct. Biol. 1997, 4, 10–19.
2. Kmiecik, S.; Gront, D.; Kolinski, M.; Wieteska, L.; Dawid, A.E.; Kolinski, A. Coarse-Grained Protein Models and Their Applications. Chem. Rev. 2016, 116, 7898–7936.
3. Schuler, B.; Soranno, A.; Hofmann, H.; Nettels, D. Single-Molecule FRET Spectroscopy and the Polymer Physics of Unfolded and Intrinsically Disordered Proteins. Annu. Rev. Biophys. 2016, 45, 207–231.
4. Liwo, A.; Oldziej, S.; Czaplewski, C.; Kozlowska, U.; Scheraga, H.A. Parametrization of Backbone-Electrostatic and Multibody Contributions to the UNRES Force Field for Protein-Structure Prediction from Ab Initio Energy Surfaces of Model Systems. J. Phys. Chem. B 2004, 108, 9421–9438.
5. Maupetit, J.; Tuffery, P.; Derreumaux, P. A Coarse-Grained Protein Force Field for Folding and Structure Prediction. Proteins Struct. Funct. Bioinf. 2007, 69, 394–408.
6. Bereau, T.; Deserno, M. Generic Coarse-Grained Model for Protein Folding and Aggregation. J. Chem. Phys. 2009, 130, 235106.
7. Davtyan, A.; Schafer, N.P.; Zheng, W.; Clementi, C.; Wolynes, P.G.; Papoian, G.A. AWSEM-MD: Protein Structure Prediction Using Coarse-Grained Physical Potentials and Bioinformatically Based Local Structure Biasing. J. Phys. Chem. B 2012, 116, 8494–8503.
8. Habibi, M.; Rottler, J.; Plotkin, S.S. As Simple as Possible, but not Simpler: Exploring the Fidelity of Coarse-Grained Protein Models for Simulated Force Spectroscopy. PLoS Comput. Biol. 2016, 12.
9. Rudzinski, J.F.; Kremer, K.; Bereau, T. Communication: Consistent Interpretation of Molecular Simulation Kinetics Using Markov State Models Biased with External Information. J. Chem. Phys. 2016, 144, 051102.
10. Harmandaris, V.A.; Kremer, K. Predicting Polymer Dynamics at Multiple Length and Time Scales. Soft Matter 2009, 5, 3920–3926.
11. Salerno, K.; Agrawal, A.; Perahia, D.; Grest, G. Resolving Dynamic Properties of Polymers through Coarse-Grained Computational Studies. Phys. Rev. Lett. 2016, 116, 058302.
12. Hijón, C.; Español, P.; Vanden-Eijnden, E.; Delgado-Buscalioni, R. Mori–Zwanzig Formalism as a Practical Computational Tool. Faraday Discuss. 2010, 144, 301–322.
13. Rudzinski, J.F.; Bereau, T. Structural-Kinetic-Thermodynamic Relationships for Peptide Secondary Structure Formation Identified from Transition Network Properties. bioRxiv 2017. Available online: https://www.biorxiv.org/content/early/2017/12/18/183053 (accessed on 31 August 2017).
14. Cheung, M.; Garcia, A.; Onuchic, J. Protein Folding Mediated by Solvation: Water Expulsion and Formation of the Hydrophobic Core Occur after the Structural Collapse. Proc. Natl. Acad. Sci. USA 2002, 99, 685–690.
15. Clementi, C.; Plotkin, S.S. The Effects of Nonnative Interactions on Protein Folding Rates: Theory and Simulation. Protein Sci. 2004, 13, 1750–1766.
16. Chan, H.S.; Zhang, Z.; Wallin, S.; Liu, Z. Cooperativity, Local-Nonlocal Coupling, and Nonnative Interactions: Principles of Protein Folding from Coarse-Grained Models. Annu. Rev. Phys. Chem. 2011, 62, 301–326.
17. Taketomi, H.; Ueda, Y.; Gō, N. Studies on Protein Folding, Unfolding and Fluctuations by Computer-Simulation. I. Effect of Specific Amino-Acid Sequence Represented by Specific Inter-Unit Interactions. Int. J. Pept. Protein Res. 1975, 7, 445–459.
18. Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Comparison of Multiple Amber Force Fields and Development of Improved Protein Backbone Parameters. Proteins Struct. Funct. Bioinf. 2006, 65, 712–725.
19. Weeks, J.; Chandler, D.; Andersen, H. Role of Repulsive Forces in Determining Equilibrium Structure of Simple Liquids. J. Chem. Phys. 1971, 54, 5237.
20. Rudzinski, J.F.; Bereau, T. Concurrent Parametrization Against Static and Kinetic Information Leads to More Robust Coarse-Grained Force Fields. Eur. Phys. J. Spec. Top. 2016, 225, 1373–1389.
21. Whitford, P.C.; Noel, J.K.; Gosavi, S.; Schug, A.; Sanbonmatsu, K.Y.; Onuchic, J.N. An All-Atom Structure-Based Potential for Proteins: Bridging Minimal Models with All-Atom Empirical Forcefields. Proteins Struct. Funct. Bioinf. 2009, 75, 430–441.
22. Bereau, T.; Bachmann, M.; Deserno, M. Interplay Between Secondary and Tertiary Structure Formation in Protein Folding Cooperativity. J. Am. Chem. Soc. 2010, 132, 13129–13131.
23. Bereau, T.; Deserno, M.; Bachmann, M. Structural Basis of Folding Cooperativity in Model Proteins: Insights from a Microcanonical Perspective. Biophys. J. 2011, 100, 2764–2772.
24. Bereau, T.; Wang, Z.J.; Deserno, M. More than the Sum of its Parts: Coarse-Grained Peptide-Lipid Interactions from a Simple Cross-Parametrization. J. Chem. Phys. 2014, 140, 115101.
25. Bereau, T.; Bennett, W.F.D.; Pfaendtner, J.; Deserno, M.; Karttunen, M. Folding and Insertion Thermodynamics of the Transmembrane WALP Peptide. J. Chem. Phys. 2015, 143, 1.
26. Bereau, T.; Globisch, C.; Deserno, M.; Peter, C. Coarse-Grained and Atomistic Simulations of the Salt-Stable Cowpea Chlorotic Mottle Virus (SS-CCMV) Subunit 26-49: Beta-Barrel Stability of the Hexamer and Pentamer Geometries. J. Chem. Theory Comput. 2012, 8, 3750–3758.
27. Osborne, K.L.; Bachmann, M.; Strodel, B. From Computational Biophysics to Systems Biology. In Proceedings of the CBSB11, Julich, Germany, 20–22 July 2011; p. 151.
28. Osborne, K.L.; Bachmann, M.; Strodel, B. Thermodynamic Analysis of Structural Transitions during GNNQQNY Aggregation. Proteins Struct. Funct. Bioinf. 2013, 81, 1141–1155.
29. Osborne, K.L.; Barz, B.; Bachmann, M.; Strodel, B. Thermodynamics of Protein Aggregation. Phys. Procedia 2014, 53, 90.
30. Rutter, G.O.; Brown, A.H.; Quigley, D.; Walsh, T.R.; Allen, M.P. Testing the Transferability of a Coarse-Grained Model to Intrinsically Disordered Proteins. Phys. Chem. Chem. Phys. 2015, 17, 31741–31749.
31. Best, R.B.; Hummer, G. Optimized Molecular Dynamics Force Fields Applied to the Helix-Coil Transition of Polypeptides. J. Phys. Chem. B 2009, 113, 9004–9015.
32. Hess, B.; Kutzner, C.; van der Spoel, D.; Lindahl, E. GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation. J. Chem. Theory Comput. 2008, 4, 435–447.
33. Wang, Z.J.; Deserno, M. A Systematically Coarse-Grained Solvent-Free Model for Quantitative Phospholipid Bilayer Simulations. J. Phys. Chem. B 2010, 114, 11207–11220.
34. Limbach, H.; Arnold, A.; Mann, B.; Holm, C. ESPResSo - An Extensible Simulation Package for Research on Soft Matter Systems. Comput. Phys. Commun. 2006, 174, 704–727.
35. Lifson, S.; Roig, A. On the Theory of Helix-Coil Transition in Polypeptides. J. Chem. Phys. 1961, 34, 1963–1974.
36. Vitalis, A.; Caflisch, A. 50 Years of Lifson-Roig Models: Application to Molecular Simulation Data. J. Chem. Theory Comput. 2012, 8, 363–373.
37. Chodera, J.D.; Swope, W.C.; Pitera, J.W.; Dill, K.A. Long-Time Protein Folding Dynamics from Short-Time Molecular Dynamics Simulations. Multiscale Model. Simul. 2006, 5, 1214–1226.
38. Noé, F. Probability Distributions of Molecular Observables Computed from Markov Models. J. Chem. Phys. 2008, 128, 244103.
39. Bowman, G.R.; Pande, V.S.; Noé, F. An Introduction to Markov State Models and Their Application to Long Timescale Molecular Simulation; Springer Science and Business Media: Dordrecht, The Netherlands, 2014.
40. Prinz, J.H.; Wu, H.; Sarich, M.; Keller, B.; Senne, M.; Held, M.; Chodera, J.D.; Schütte, C.; Noé, F. Markov Models of Molecular Kinetics: Generation and Validation. J. Chem. Phys. 2011, 134, 174105.
41. Perez-Hernandez, G.; Paul, F.; Giorgino, T.; De Fabritiis, G.; Noé, F. Identification of Slow Molecular Order Parameters for Markov Model Construction. J. Chem. Phys. 2013, 139, 015102.
42. Sittel, F.; Stock, G. Robust Density-Based Clustering to Identify Metastable Conformational States of Proteins. J. Chem. Theory Comput. 2016, 12, 2426–2435.
43. Jain, A.; Stock, G. Identifying Metastable States of Folding Proteins. J. Chem. Theory Comput. 2012, 8, 3810–3819.
44. Scherer, M.K.; Trendelkamp-Schroer, B.; Paul, F.; Perez-Hernandez, G.; Hoffmann, M.; Plattner, N.; Wehmeyer, C.; Prinz, J.H.; Noé, F. PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov Models. J. Chem. Theory Comput. 2015, 11, 5525–5542.
45. Schwarz, G. On Kinetics of Helix-Coil Transition of Polypeptides in Solution. J. Mol. Biol. 1965, 11, 64–77.
46. Thompson, P.; Eaton, W.; Hofrichter, J. Laser Temperature Jump Study of the Helix ⇌ Coil Kinetics of an Alanine Peptide Interpreted with a ‘Kinetic Zipper’ Model. Biochemistry 1997, 36, 9200–9210.
47. Frauenfelder, H. The Physics of Proteins: An Introduction to Biological Physics and Molecular Biophysics; Springer Science and Business Media: New York, NY, USA, 2010.
48. Kafsi, M.; Grossglauser, M.; Thiran, P. The Entropy of Conditional Markov Trajectories. IEEE Trans. Inf. Theory 2013, 59, 5577–5583.
Figure 1. A visualization of the Hy-Gō model representation and interactions for . (Left) Illustration of a native contact between atoms and a generic contact between atoms, along with the corresponding parameters, , associated with these interactions. (Right) The top two panels present the interaction potentials for the Gō-type interactions as a function of the model parameters. In the top panel, . The bottom panel presents the Weeks-Chandler-Andersen-like potentials employed to model sterics along the peptide backbone.
Figure 2. (a) Representative implied timescale test. Each line represents a different characteristic timescale of the Markov state model as a function of lag time. The vertical dashed line denotes the lag time chosen in this case. (b) Representative Chapman-Kolmogorov test. The “helix” state corresponds to the microstate for each model with the highest helical content (see Methods section for details). The blue solid curve presents the probability decay of the helix state, as determined directly from the simulation trajectory, while the red dashed curve presents the same quantity determined from the MSM. The transparent cyan and orange regions denote the error bars for each quantity.
Figure 3. Temperature dependence of the Lifson-Roig helix propagation parameter, w (a), and the average fraction of helical segments, (b). The curves presented are linear fits of the raw data. (c) The slope of , denoted , versus the enthalpy of helix extension (determined as the slope of ). Data from the Hy-Gō models are denoted with circle markers and are colored according to their relative cooperativity (as determined by ). The results from the PLUM models are colored in gray-scale according to . The energetic reparametrizations of PLUM are indicated with square markers, while the PLUM-ent model is indicated with X markers. The experimental and all-atom results taken from Best and Hummer [31] are denoted with magenta and dark blue triangle markers, respectively. Note that the models are aligned by shifting the temperature such that all models achieve at .
Figure 4. (ai) Temperature dependence of the ratio of folding to unfolding timescales, . (aii) Slope of the linear fit of as a function of inverse temperature versus the enthalpy of helix extension, (determined as the slope of ). (bi) Ratio of waiting time in the helix state, , to waiting time in the coil state, , for residue position i as a function of the average fraction of helical residues, , and the average fraction of lone helices, , at . The free parameter, , was determined by a fit over all models to maximize the Pearson correlation coefficient, R, between the plotted quantities. Here, i corresponds to the terminal ends of the peptide (not including capping groups). (bii) Slope of the linear fit of as a function of inverse temperature versus , the slope of the linear fit of as a function of inverse temperature, and the slope of the linear fit of as a function of inverse temperature. The free parameters, and , were determined by a fit over all models to maximize the Pearson correlation coefficient, R, between the plotted quantities. Data from the Hy-Gō models are denoted with circle markers and are colored according to their relative cooperativity (as determined by ). The results from the PLUM models are colored in gray-scale according to . The energetic reparametrizations of PLUM are indicated with square markers, while the PLUM-ent model is indicated with X markers.
Figure 5. Average conditional path entropy in the folding direction versus the enthalpy of helix extension, (determined as the slope of ).
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).