Exploring Nearest Neighbor Interactions and Their Influence on the Gibbs Energy Landscape of Unfolded Proteins and Peptides

The Flory isolated pair hypothesis (IPH) is one of the cornerstones of the random coil model, which is generally invoked to describe the conformational dynamics of unfolded and intrinsically disordered proteins (IDPs). It stipulates that individual residues sample the entire sterically allowed space of the Ramachandran plot without exhibiting any correlations with the conformational dynamics of their neighbors. However, multiple lines of computational, bioinformatic and experimental evidence suggest that nearest neighbors have a significant influence on the conformational sampling of amino acid residues. This implies that the conformational entropy of unfolded polypeptides and proteins is much lower than one would expect based on the Ramachandran plots of individual residues. A further implication is that the Gibbs energies of residues in unfolded proteins or polypeptides are not additive. This review provides an overview of what is currently known and what has yet to be explored regarding nearest neighbor interactions in unfolded proteins.


Introduction
For a long time, the unfolded states of proteins did not attract considerable interest in the protein community because their basic properties seemed to be obvious and generic. Following early models of Flory, it was assumed that unfolded peptide chains are describable by the random coil model irrespective of their amino acid sequence [1-3]. Locally, it was assumed that the residues sample the entire sterically allowed region of the Ramachandran plot, which is rather similar for different amino acid residues, glycine and proline being the exceptions (cf. Figure 1). Hence, the unfolded state has been thought to be just a reservoir of conformational entropy that has to be overcome for a protein to fold. This view has changed over the last 25 years for a variety of reasons. First, the discovery of intrinsically disordered proteins (IDPs) that perform biological functions revealed that the nature of their amino acid residues must play a role, since the results of bioinformatic analyses suggested that polar and charged amino acid residues occur more frequently in IDPs than they do in folded proteins [4][5][6][7]. IDPs are of particular relevance for cell signaling processes that frequently involve a transition between disordered and folded states. Intrinsically disordered segments of otherwise folded proteins are involved in pivotal protein-DNA and ligand-receptor interactions [5,8]. Some IDPs are of great biomedical relevance in that they are prone to self-assembly into amyloid fibrils [9][10][11]. Canonical examples are the amyloid β-peptides Aβ1-40 and Aβ1-42, α-synuclein and the tau protein [12][13][14].
The second reason for the growing interest in unfolded proteins and peptides is the discovery that, contrary to Flory's assumption and a belief cultivated over decades, individual amino acid residues sample a conformational space that is smaller than the one allowed by steric constraints. Moreover, experimental data and coil library analyses revealed that the Ramachandran distribution depends very much on the nature of the amino acid side chain.

Figure 1. Sterically allowed φ, ψ space proposed by Ramachandran. Solid lines enclose the region allowed by hard-sphere bumps at standard radii; dashed lines show the region allowed with reduced radii; dotted lines add regions allowed if τ (N-Cα-C′) is relaxed slightly. ψ and φ values run from −180° to 180°. Taken from [29].
The third reason for the increasing interest in unfolded proteins is that they do not always behave like an ideal random coil. While the scaling laws obtained for the radii of gyration and end-to-end distances of a large number of foldable proteins subjected to denaturing conditions (i.e., high urea concentration) indicate a nearly perfect self-avoiding random walk (i.e., an exponent close to ν = 0.59) [30], IDPs or disordered protein segments can exhibit scaling laws with exponents spreading over a large range, depending on the apparent net charge of the investigated sequence [31][32][33][34][35]. Exponents below 0.5 indicate a compact structure, while values above 0.6 reflect extended coils that preferably interact with water [34]. It should be noted in this context that it has been known for quite some time that proteins such as cytochrome c prefer some type of compact molten globule state under certain denaturing conditions (low or high pH, high temperature, on membrane surfaces, even in the presence of urea and guanidine hydrochloride at neutral pH) [36,37]. Interestingly, even scaling exponents between 0.53 and 0.6, indicating ensembles somewhere between an ideal and a self-avoiding random coil, do not exclude the possibility that the unfolded proteins exhibit local transient structures [38]. Calculations reported by Fitzkee and Rose revealed that an unfolded protein can behave like a self-avoiding random coil while still showing considerable local order [39].
Irrespective of the complexities indicated above, one might arrive at the conclusion that if the intrinsic structural propensities of the amino acid residues are known, then the conformational sampling of an unfolded protein could be predicted, and its energetics and entropic content assessed. Different propensity scales are available in the literature that could be used to this end [23,40-45]. However, such a modeling of unfolded states requires the absence of non-local contacts between the amino acid residues and the validity of Flory's isolated pair hypothesis (IPH), which states that the conformational dynamics of neighboring residues do not correlate. If both requirements were met, residue energetics and entropies would be additive. Unfortunately, neither of these conditions is generally met. While non-local contacts might be negligible in extended IDPs (where solvation energy dominates over intrapeptide interactions), several lines of evidence suggest that the isolated pair hypothesis is not valid irrespective of the specific coil state of the protein.
While the influence of non-local contacts on the compactness and local order of IDPs and unfolded proteins has been explored in some detail [14,46-49], work on nearest neighbor interactions (NNIs) appears somewhat fragmented. It is the goal of this review to provide an overview of the rather diverse work performed over the last 25 years that was aimed at exploring how NNIs affect the conformational propensities of amino acid residues. We will compare the results of these efforts and shed some light on their relevance for a thorough understanding of unfolded and disordered proteins.
Even though the term random coil is clearly defined in the polymer literature (as nicely delineated in [50]), the term is often used as a label for unfolded and disordered proteins in an indiscriminate way. Strictly speaking, the term can only be used for sufficiently long polymers comprised of rigid monomers (peptide groups) and freely rotatable linkers, the length dependence of which can be described by a power law for the mean radius of hydration, i.e., $R_h \sim N^{\nu}$, where N is the number of peptide units. As indicated above, the exponent is 0.59 in a good solvent. Locally, the random coil model is based on the assumption that amino acid residue conformations are restricted solely by steric hindrances between the side chains as well as between side chains and backbone groups and by electrostatic interactions [51]. However, as indicated above, peptide/protein-solvent interactions reduce the available conformational space further and increase the anisotropy of the residue orientations [24,25,27,28,52-55]. Moreover, unfolded and disordered proteins frequently contain segments that can transiently adopt regular secondary structures [14,38,49,56]. Hence, real unfolded polypeptides and proteins do not meet the requirements for a random coil on a local level, while they still might do so on a more global level [30,39,57]. Therefore, we follow the late Harold Scheraga in that we use the term statistical coil instead, which correctly reflects the fact that different chain conformations differ in terms of their Gibbs free energies [56,57].
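The power law quoted above can be illustrated with a short numerical sketch. The example below fits the exponent ν by linear regression of log R against log N. The radii are synthetic and noise-free, generated with ν = 0.59 and a hypothetical prefactor of 2.0 in arbitrary length units; the function name `fit_scaling_exponent` is likewise an assumption made for illustration and does not reproduce the analysis of any cited study.

```python
import math

def fit_scaling_exponent(chain_lengths, radii):
    """Least-squares fit of log R = log R0 + nu * log N; returns (nu, R0)."""
    xs = [math.log(n) for n in chain_lengths]
    ys = [math.log(r) for r in radii]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    nu = sxy / sxx                       # slope of the log-log plot
    r0 = math.exp(mean_y - nu * mean_x)  # prefactor from the intercept
    return nu, r0

# Synthetic, noise-free radii (hypothetical prefactor 2.0, nu = 0.59);
# these numbers are NOT experimental data.
chain_lengths = [20, 50, 100, 200, 400]
radii = [2.0 * n ** 0.59 for n in chain_lengths]

nu, r0 = fit_scaling_exponent(chain_lengths, radii)
print(f"fitted exponent nu = {nu:.3f}, prefactor R0 = {r0:.3f}")
```

With real data, the fitted exponent would have to be compared with the reference values discussed in the text (about 0.5 for an ideal coil and 0.59 for a self-avoiding coil), keeping in mind that a global exponent does not by itself exclude local order.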
This review is organized as follows. The main part of the review is divided into four parts. Part 1 delineates some basic aspects of the statistical thermodynamics of NNIs. This is followed by Part 2, in which we review the coil library analyses that extracted the conformational propensities and NNI effects from the structure of the denatured proteins. In Part 3, we provide an overview of the computational studies that explored the underlying physical mechanism of NNIs. Part 4 describes the experimental studies, mostly on short peptides, that were specifically aimed at quantifying the NNIs between different types of residues. A summary and outlook conclude the review.

Thermodynamic Aspects of Nearest Neighbor Interactions
It is now well established that the Ramachandran plot of amino acid residues in unfolded peptides and proteins contains several basins associated with different secondary structures. The most prominent ones are shown in Figure 1. Generally, based on experimental studies with blocked dipeptides and unblocked GxG guest-host tripeptides (x: guest amino acid residue), all residues predominantly populate two basins in the upper left part of the Ramachandran plot, which are assignable to polyproline II (pPII) and β-strand (β) (combined populations between 70 and 100%) [23,40,42,43]. The basin of the former can be found in a region between φ-values of −90° and −60°, while the ψ-values might vary between 130° and 180°. The center of β-strand basins can vary over a range between φ-values of −100° and −140°. Corresponding ψ-values vary again between 130° and 180°. This area encompasses backbone structures associated with different types of β-sheet structures. The sampling of other basins by residues in the above peptides is somewhat more limited. The most prominent one is assignable to right-handed helical structures (−70° < φ < −30°, −60° < ψ < −20°), which encompasses canonical α-helical and 3₁₀-like conformations. Quite a few residues sample the region of γ-turns (30° < φ < 70°, −80° < ψ < −50°) and inverse γ-turns (φ, ψ with opposite sign). In specific cases, residues were found to sample left-handed helical structures (30° < φ < 70°, 20° < ψ < 60°) and conformations in the bridge region around φ = −60° and ψ = 0°, which are generally found at the i + 2 position of type I and II' β-turns [43,45,58,59]. In addition, aspartic acid and asparagine residues can form asx-turns, which support conformations in the region 50° < φ < 80°, 120° < ψ < 180°. Asx-turns consist of a sequence of three residues, the first of which is either aspartic acid or asparagine [60].
The population $\chi_{i,k}$ of these conformations can be calculated as

$$\chi_{i,k} = \frac{\exp\left(-G_{i,k}/RT\right)}{\sum_{k'} \exp\left(-G_{i,k'}/RT\right)} \qquad (1)$$

where $G_{i,k}$ is the Gibbs free energy of the kth conformation of the ith residue, R is the gas constant and T the temperature in Kelvin. It can be decomposed as follows:

$$G_{i,k} = G^0_{i,k} + \sum_{j=i\pm 1} \delta G^0_{ij,k} + \sum_{j=i\pm 1} \delta G_{ij,kl} \qquad (2)$$

where $G^0_{i,k}$ is the Gibbs energy of the kth conformation of the ith residue in the absence of nearest neighbor interactions, $\delta G^0_{ij,k}$ is the Gibbs energy contribution of the jth neighbor that reflects the steric and physico-chemical properties of the neighbor irrespective of its adopted conformation, and $\delta G_{ij,kl}$ is the conformation-dependent contribution of the jth neighbor in its lth conformation.
Distinguishing between these two contributions to the NNI Gibbs energy is technically difficult (vide infra) but nevertheless of conceptual significance. If the NNI comprised only $\delta G^0_{ij,k}$ contributions, there would be no cooperativity between the interacting residues. In such a case, which is comparable to the first-order contribution to the eigenenergies of quantum mechanical systems, the internal energies, entropies and thus also the Gibbs energies of residues in an unfolded polypeptide would still be additive, and the isolated pair hypothesis would still be valid. Therefore, only conformation-dependent NNIs are a real game changer, in that the population of a certain conformation of a considered residue becomes dependent on the conformation adopted by its neighbors. To illuminate the significance of conformation-dependent NNIs, let us consider an oligopeptide with only two interacting residues. In the absence of any cooperativity, the mole fraction of a conformation pair k, l is calculated as

$$\chi_{kl} = \frac{\exp\left(-\left(G_{1,k} + G_{2,l}\right)/RT\right)}{\sum_{k'} \exp\left(-G_{1,k'}/RT\right)\,\sum_{l'} \exp\left(-G_{2,l'}/RT\right)} \qquad (3)$$

where the Gibbs energy terms $G_{1,k}$ and $G_{2,l}$ contain the first two terms of Equation (2). In the presence of cooperative NNIs, the corresponding equation for $\chi_{kl}$ looks rather different:

$$\chi_{kl} = \frac{\exp\left(-\left(G_{1,k} + G_{2,l} + \delta G_{12,kl} + \delta G_{21,lk}\right)/RT\right)}{\sum_{k',l'} \exp\left(-\left(G_{1,k'} + G_{2,l'} + \delta G_{12,k'l'} + \delta G_{21,l'k'}\right)/RT\right)} \qquad (4)$$

where we consider the mutual influence of residue 2 on 1 and of residue 1 on 2 in the numerator. The partition sum in the denominator runs over all peptide conformations, which is different from the product of partition sums of individual residues in Equation (3). Obviously, the corresponding conformational Gibbs entropies of the two systems, namely in the absence and in the presence of cooperativity, are also different.
The theory introduced thus far solely considers the possibility that NNIs affect the Gibbs energies of residue conformations. However, it is equally likely that NNIs change the equilibrium position of a basin. This can be seen from a simple example.
Assume that the potential function associated with a basin can be approximated by a harmonic potential of the type $V(q) = \tfrac{1}{2}k(q - q_0)^2$, where q represents one of the two dihedral coordinates. If the corresponding residue is subjected to a perturbing potential $V'(q) = aq + b$, the equilibrium position shifts from $q = q_0$ to $q_0' = q_0 - a/k$. If the q-dependence of the perturbing potential is more complicated, the expression for $q_0'$ becomes more complicated as well.
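For the harmonic basin with a linear perturbation, the position of the shifted minimum follows from setting the first derivative of the total potential to zero. The short derivation below uses the symbols of the text; $V_{\mathrm{tot}}$ and $q_0'$ are labels introduced here for the total potential and the perturbed minimum.

```latex
\begin{align}
V_{\mathrm{tot}}(q) &= \tfrac{1}{2}\,k\,(q - q_0)^2 + a\,q + b \\
\left.\frac{dV_{\mathrm{tot}}}{dq}\right|_{q = q_0'} &= k\,(q_0' - q_0) + a = 0
  \quad\Longrightarrow\quad q_0' = q_0 - \frac{a}{k} \\
V_{\mathrm{tot}}(q_0') &= a\,q_0 + b - \frac{a^2}{2k}
\end{align}
```

A linear perturbation therefore leaves the curvature k of the basin unchanged; it displaces the minimum by −a/k and changes the energy at the minimum, so a neighbor can shift a basin and alter its population at the same time.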
In what follows, we will have to keep the theoretical thoughts of this section in mind when we go over the NNI literature.
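The difference between the non-cooperative and the cooperative description of a residue pair can be made concrete with a small numerical sketch. The code below evaluates pair populations from a Boltzmann distribution over all pair conformations. All Gibbs energies are hypothetical values in kJ/mol chosen for illustration, the three basins stand for pPII, β-strand and right-handed helical, and `coupling[k][l]` is assumed to hold the sum of the two conformation-dependent terms for the pair k, l.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)
T = 298.15          # temperature in K (assumed)

def pair_populations(g1, g2, coupling=None):
    """Mole fractions chi[k][l] of a two-residue peptide.

    g1, g2   : Gibbs energies of the basins of residue 1 and 2 (kJ/mol)
    coupling : optional matrix of conformation-dependent pair terms (kJ/mol)
    """
    n1, n2 = len(g1), len(g2)
    if coupling is None:
        coupling = [[0.0] * n2 for _ in range(n1)]
    weights = [[math.exp(-(g1[k] + g2[l] + coupling[k][l]) / (R * T))
                for l in range(n2)] for k in range(n1)]
    z = sum(sum(row) for row in weights)  # partition sum over all pairs
    return [[w / z for w in row] for row in weights]

def marginals(chi):
    """Populations of the individual residues obtained by summing chi."""
    first = [sum(row) for row in chi]
    second = [sum(chi[k][l] for k in range(len(chi))) for l in range(len(chi[0]))]
    return first, second

def conformational_entropy(chi):
    """Gibbs entropy -R * sum(p ln p) in kJ/(mol K)."""
    return -R * sum(p * math.log(p) for row in chi for p in row if p > 0.0)

# Hypothetical basin energies (pPII, beta, right-handed helical), kJ/mol.
g1 = [0.0, 1.0, 2.0]
g2 = [0.0, 1.0, 2.0]
# Hypothetical cooperative stabilization of the pPII-pPII pair by 2 kJ/mol.
coupling = [[-2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

chi_free = pair_populations(g1, g2)
chi_coop = pair_populations(g1, g2, coupling)

print("pPII-pPII population without coupling:", round(chi_free[0][0], 3))
print("pPII-pPII population with coupling:   ", round(chi_coop[0][0], 3))
print("entropy without coupling:", conformational_entropy(chi_free))
print("entropy with coupling:   ", conformational_entropy(chi_coop))
```

Without the coupling term the pair populations factorize into the product of the single-residue populations, which is the additive case. With the coupling term they no longer factorize, and for this particular parameter choice the conformational entropy of the pair is lower than in the uncoupled case.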

Coil Library Studies
In 1995, two important articles reported what the authors called conformational propensities of amino acid residues in coil regions of proteins. Swindells et al. determined the Ramachandran distributions of amino acid residues in coil regions of 85 structures obtained from the Protein Data Bank [15]. They excluded all residues incorporated in right-handed helices and β-sheets to eliminate local and non-local interactions associated with the stabilization of these secondary structures. Moreover, they assumed that the Ramachandran distributions thus obtained reflect intrinsic structural propensities and are therefore representative of the respective conformational distributions in unfolded/denatured proteins. This conceptual model is based on the assumption that non-local interactions in these coils are random in nature and average out if the data set is sufficiently large. Contrary to one of the basic assumptions of Flory's random coil model, they found significant differences between the population distributions of different amino acid residues. They reported integrated propensities for four regions of the Ramachandran plot shown in Figure 2. If one converts their propensity values into fractions by dividing each reported value by their sum, one obtains 0.35, 0.21 and 0.33 for the pPII, β-strand and right-handed helical regions, respectively, of alanine. For valine, however, the corresponding fractions are 0.24, 0.45 and 0.22. Swindells et al. found that the obtained propensity values correlated with secondary structure propensities for the β-strand, while the correlation with right-handed helical propensities was weak [15].
In earlier days, Ramachandran plots were indiscriminately derived from protein data sets because of the belief that all non-intrinsic influences could be averaged out by a sufficiently large data set, irrespective of its content of regular secondary structures. That this is not the case was demonstrated by Serrano [16]. Figure 3 shows the distribution of alanine obtained from an unrestricted data set (Figure 3A) and from a restricted coil set from which helices and sheets were omitted (Figure 3B). The difference is striking. While the former is dominated by an intense peak in the right-handed helical region, the latter becomes maximal in the pPII region. Hence, Serrano's data clearly revealed the propensity of alanine for polyproline II six years before experimental results obtained with a heptaalanine peptide suggested the same and triggered a highly controversial debate. Details of this debate are summarized in an earlier review by Toal and Schweitzer-Stenner [22]. Serrano's analysis, which was augmented by a comparison of the chemical shifts and $^3J(\mathrm{H^N H^{C\alpha}})$ coupling constants observed for short model peptides and calculated for coil library distributions, clearly corroborated the notion that different amino acid residues have different intrinsic propensities for specific conformations.
Nearest neighbor effects were first considered in a third paper by Penkett et al. that appeared in 1997 [61]. Keeping in mind the well-established fact that the helical and β-sheet propensities of amino acid residues are context dependent, they wondered whether such context dependencies would also be observable in unfolded proteins, for which they assumed minimal non-local interactions. To this end, they used ¹H NMR to determine the $^3J(\mathrm{H^N H^{C\alpha}})$ coupling constants for a 130-residue fragment of the fibronectin-binding protein from Staphylococcus aureus. They observed that 9 of 16 glutamic acid residues, those preceded by residues with either branched or aromatic side chains, exhibited coupling constants between 6.2 and 7.0 Hz, while the coupling constants of the other 7, with asparagine or glutamate as upstream neighbors, lie between 5.7 and 6.3 Hz.
Further insight into the underlying NNIs came from the average $^3J(\mathrm{H^N H^{C\alpha}})$ values calculated for the coil library distributions of Swindells et al. [15]. These calculations were performed with a Karplus equation that relates the coupling constant to a backbone dihedral angle as follows [58]:

$$^3J(x,y) = A\cos^2\left(\eta + \theta_1\right) + B\cos\left(\eta + \theta_2\right) + C$$

where x and y denote the interacting nuclei, η = φ, ψ, and $\theta_i$ (i = 1, 2) are phase angles.
A, B and C are empirical Karplus parameters obtained by fitting the Karplus equation to the J-coupling data sets obtained for proteins with a known crystal structure [59,62-65]. Alternatively, these parameters could be obtained with density functional theory calculations, but this has been accomplished thus far only for alanine [66]. It is likely that the exact parameters are side-chain dependent. Empirical values should therefore be considered as an average. In the case of unfolded and disordered proteins and peptides, the measured coupling constants represent a conformational average:

$$\left\langle {}^3J(x,y) \right\rangle = \sum_i P(\eta_i)\; {}^3J(\eta_i)$$

where $P(\eta_i)$ is the probability for the residue to adopt a dihedral angle $\varphi_i$ or $\psi_i$.
The results of the J-coupling analysis of Penkett et al. are shown in Figure 4 [61]. The average $^3J(\mathrm{H^N H^{C\alpha}})$ values of 18 amino acid residues can apparently be subdivided into two groups. Residues preceded by F, H, I, T, V, W and Y (L-group) exhibit systematically higher $^3J(\mathrm{H^N H^{C\alpha}})$ values than corresponding residues preceded by representatives of the complementary group (G excluded). These results seem to indicate that sterically demanding aliphatic and aromatic residues move the overall distribution to lower (more negative) φ values. While important as a first data-based insight regarding the occurrence of NNIs, the results can be interpreted either as reflecting shifts of basins associated with different secondary structures or as a redistribution between different basins. Experimental results to be discussed below reveal that NNIs can indeed cause both.
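The conformational averaging of a Karplus relation described above can be sketched in a few lines. The parameter values used here (A = 6.51 Hz, B = −1.76 Hz, C = 1.60 Hz, one common phase angle of −60°) are a widely used empirical parametrization for the H^N-H^α coupling (Vuister and Bax) and are an assumption of this sketch; they are not taken from this review. The three-basin distribution is hypothetical, with each basin represented by a single φ value.

```python
import math

# Assumed Karplus parameters in Hz and phase angles in degrees
# (Vuister-Bax-type parametrization; not taken from this review).
A, B, C = 6.51, -1.76, 1.60
THETA_1 = THETA_2 = -60.0

def karplus(phi_deg, a=A, b=B, c=C, theta1=THETA_1, theta2=THETA_2):
    """3J coupling constant (Hz) for a backbone dihedral angle phi in degrees."""
    x1 = math.radians(phi_deg + theta1)
    x2 = math.radians(phi_deg + theta2)
    return a * math.cos(x1) ** 2 + b * math.cos(x2) + c

def average_coupling(distribution):
    """Population-weighted average of 3J over (phi, probability) pairs."""
    total = sum(p for _, p in distribution)
    return sum(p * karplus(phi) for phi, p in distribution) / total

# Hypothetical distribution: (representative phi in degrees, population).
distribution = [(-65.0, 0.5),   # pPII
                (-120.0, 0.3),  # beta-strand
                (-60.0, 0.2)]   # right-handed helical

for phi, p in distribution:
    print(f"phi = {phi:7.1f} deg  population = {p:.2f}  3J = {karplus(phi):.2f} Hz")
print(f"average 3J = {average_coupling(distribution):.2f} Hz")
```

Because the β-strand region yields a considerably larger coupling constant than pPII or helical conformations with this parametrization, a higher average value can reflect either a larger β-strand population or a shift of the basins toward more negative φ, which is the ambiguity noted in the text.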
Figure 4. Results of the J-coupling analysis of Penkett et al. Taken from [61] with permission, 1997, Academic Press.
In what follows, we will focus on two different coil library analyses reported by the Sosnick and Dunbrack groups, which both deal with NNIs in explicit terms [19,67]. Other reported libraries put a focus on individual propensities and conformational sampling, which is of lesser interest in this context [15,16,68,69]. Jha et al. conducted a very comprehensive coil library analysis of 2020 chains with more than twenty residues [18,67]. The authors produced Ramachandran plots for four different data sets: one set with no restrictions (all secondary structure sequences included), a second one from which helices and sheets were taken out, a third one from which turns were taken out as well, and a fourth one from which flanking residues were also eliminated. The authors' analysis clearly showed that taking helices and sheets out of the data set produces rather different Ramachandran distributions. Moreover, their analysis yielded rather different Ramachandran distributions for individual residues and revealed significant nearest neighbor influences.
The number of residues in the coil library of Jha et al. is too small to yield adequately populated Ramachandran plots of individual tripeptide sequences for most of the twenty amino acid residues. This is not surprising because a coil analysis excludes at least all regular secondary structures. Figure 5 shows Ramachandran plots for the central alanine residue of the following tripeptide sequences: GAG, with A taken from a disordered-like subset (labeled as 'unfolded' on the Sosnick group web site) and the two glycines from segments outside of helical and sheet structures; GAG, with G and A data taken from segments outside of helical and sheet segments; and GAX and XAG, where X indicates an integration over all neighbors.
A comparison of these plots is quite revealing. The distribution of glycine-flanked alanine residues located outside the regular secondary structures seems to have comparable propensities for pPII and right-handed helical structures. The population of β-sheets is negligible. If one restricts the selection of alanine to coil-like segments, then pPII seems to be preferred over helical structures (Figure 5, upper panel, right). However, the number of counts might be too low to tell. If one integrates either over all upstream or all downstream neighbors, then pPII becomes clearly dominant (Figure 5, lower panel). Regarding the latter, the β-strand slightly gains at the expense of right-handed conformations. These data show that in this coil library nearest neighbors predominantly increase the pPII propensity of alanine.
The Sosnick group utilized their coil library distributions to explore the chemically denatured states of apomyoglobin, ubiquitin, the SNase fragment Δ131Δ and eglin C, for which they tried to reproduce experimentally determined NMR-based residual dipolar coupling constants [18]. Jha et al. did so with coil library-based Ramachandran distributions with and without specific nearest neighbor interactions. Residues were taken from regions that contain neither helices nor sheets nor turn conformations. NNIs were considered solely for residue dimers, for which the total internal energy was written as

$$U(a_i b_i, a_{i+1} b_{i+1}) = U(a_i, b_i) + U(a_{i+1}, b_{i+1}) + \delta U(a_i b_i, a_{i+1} b_{i+1})$$

where $a_i$ labels the identity of the ith residue, which samples the basin $b_i$. The interaction energy δU accounts for the cooperativity or anti-cooperativity between the conformations $b_i$ and $b_{i+1}$ of the two residues. It is related to the conditional probability of finding one residue in a given basin when its neighbor samples a specified basin. The coil database of the authors was not large enough to allow for an empirical determination of NNIs. To gain information about the latter, they performed Monte Carlo simulations with an energy functional for each basin, with and without NNIs. Energy minimization was constrained by the utilized coil library distributions for individual residues. The energy functional did not contain any protein-solvent interaction; basically, the authors employed only excluded volume effects. As one can infer from the results obtained for apomyoglobin, shown in Figure 6, calculations with NNIs achieved a much better reproduction of the experimental residual dipolar coupling constants.
Equally interesting is the fact that the experimental values do not at all follow predictions based on an idealized random coil model (Figure 6B). The latter used three major isoenergetic basins for each residue of an A₅₀ polypeptide (pPII, β-strand, right-handed helical), each with a population of 0.33. The obtained V-shape reflects the decreased correlation with the molecular axis for residues closer to the termini. An analysis of the conformational ensemble of the investigated unfolded/denatured proteins reveals a dominance of what Jha et al. called stretched conformations, in which individual residues sample predominantly pPII and β-basins. For the set of coil library residues used for the residual dipolar coupling analysis, they obtained a mole fraction ratio of <pPII>:<β>:<right-handed helical> = 0.33:0.36:0.27. While these numbers seem to be reminiscent of a distribution supporting a random coil (i.e., sampling of all sterically allowed regions), the very existence of distinguishable pPII and β-basins is not. Despite the deviation from ideal random coil behavior on a local level, the radii of gyration calculated for the above and six additional unfolded/denatured proteins indicate self-avoiding random coils. This result is in line with computational results of Fitzkee et al., who showed that an ensemble of rods connected by flexible linkers can still reproduce the scaling law for self-avoiding random coils [39]. All these results show that it is necessary to distinguish between local and global aspects of random coils, as emphasized very early on by de Gennes [71] for polymers and by Toal and Schweitzer-Stenner for peptides and proteins [22].
While insightful regarding the relevance of NNIs, the above studies do not provide much specific information about how NNIs depend on the physico-chemical properties of the involved residues. The coil library analysis of Jha et al. suggests a strong anti-cooperativity between pPII and right-handed helical conformations of alanine and alanine-like as well as β-branched upstream neighbors, respectively. Aromatic residues positioned downstream seem to cause anti-cooperative interactions between the pPII states of the interacting residues [67].
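The notions of cooperativity and anti-cooperativity between basins of neighboring residues can be illustrated with a simple Boltzmann inversion of pair statistics. This is a generic sketch and not the Monte Carlo procedure of Jha et al.: the joint probabilities are hypothetical, and the relation δU = −RT ln[P(b_i, b_{i+1}) / (P(b_i) P(b_{i+1}))] is an assumption that treats the deviation of the joint probability from the product of the marginals as an effective interaction energy.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)

def coupling_energies(joint, temperature=298.15):
    """Effective pair energies (kJ/mol) from joint basin probabilities.

    joint maps (basin of residue i, basin of residue i+1) -> probability.
    Negative values indicate cooperative, positive values anti-cooperative pairs.
    """
    p_first, p_second = {}, {}
    for (b1, b2), p in joint.items():
        p_first[b1] = p_first.get(b1, 0.0) + p    # marginal of residue i
        p_second[b2] = p_second.get(b2, 0.0) + p  # marginal of residue i+1
    return {
        (b1, b2): -R * temperature * math.log(p / (p_first[b1] * p_second[b2]))
        for (b1, b2), p in joint.items()
        if p > 0.0
    }

# Hypothetical joint probabilities for two basins per residue.
joint = {
    ("pPII", "pPII"): 0.40,
    ("pPII", "helix"): 0.10,
    ("helix", "pPII"): 0.20,
    ("helix", "helix"): 0.30,
}

for pair, energy in coupling_energies(joint).items():
    label = "cooperative" if energy < 0 else "anti-cooperative"
    print(pair, f"{energy:+.2f} kJ/mol", label)
```

In this hypothetical table the pPII-pPII pair occurs more often than expected from the marginal populations and therefore obtains a negative effective energy, whereas the pPII-helix pair is underrepresented and obtains a positive one.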
An even more thorough and residue-specific analysis of coil libraries has been undertaken by Ting et al. [19]. Their data set contained 3038 proteins from the Uppsala Electron Density server. In line with the protocol of the Sosnick group, they obtained different coil libraries by employing different restrictions regarding the selection of residues. The largest set contained loop residues (no regular secondary structure elements) for which all backbone atoms appear in the data set. Each residue is at least three residues away from the regular structures. This set was termed TCBIG, which contains the single-letter designations of turn, coil, β-bridge, π-helix and 3₁₀-helix. The second, reduced data set did not contain π- and 3₁₀-helices (TCB). The authors classified their Ramachandran distributions in terms of the following five conformations (Table 1): α-helical, β-strand, polyproline II, left-handed helical and extended. Since considering all combinations of a given residue with its neighbors is an impossible task (20³ combinations), the authors confined themselves to selected 'dimers', for which they changed the neighbor either upstream or downstream of the considered residue and averaged over all neighbors on the other side. If the influence of the two neighbors is not additive (a notion for which experimental evidence exists in the literature, vide infra), the information obtained from Ramachandran plots of different pairs might not necessarily represent the NNIs between the pairs. In order to permit a thorough mathematical analysis of the obtained Ramachandran plots, the authors represented the latter by a continuous functional that can be described as a combination of two-dimensional Gaussian distributions located in close proximity to basins associated with secondary structures. While this modeling bears some similarity with the Gaussian model of Schweitzer-Stenner [72], differences should be emphasized. While the latter works with 1:1 assignments of Gaussians to the assumed basins of the Ramachandran plot (which would be five for the above set of conformations assumed by Ting et al.), the former functional is entirely based on the distributions of data points inferred from the coil library sets. In both cases, the functionals facilitate the mathematical analysis of distributions.
Table 1. List of Hellinger distances between the indicated amino acid residues independent of neighbors (left value) and in the presence of a glutamine residue at the upstream position (right value). Hellinger distance values indicating at least moderately different distributions are typed in bold. All values were taken from Ting et al. [19] and divided by 100.
A closer look at the residue dimer distributions of Ting et al. reveals that some neighbors have a significant influence on specific residues. Let us again focus on alanine. Compared with X-AG, F, V and Q as downstream neighbors substantially increase the right-handed helical population of alanine at the expense of pPII (Figure 7). In contrast, proline as a downstream neighbor stabilizes pPII. The nearest downstream neighbors of valine (including valine itself) increase the right-handed helical populations as well. While the latter is also significant in the coil library of the Sosnick group, the neighbor-induced helical population seems to be more pronounced in the Ting et al. library. This is a very surprising result. We consider the implications of these results below in the context of our discussion of NNIs in model peptides.

Ting et al. used the Hellinger distance to quantify the differences between Ramachandran plots. The latter is calculated from P_R and P_R′, the two Ramachandran distributions to be compared with each other. An H value of zero means that the two distributions are identical, whereas a value of 1 indicates that they are orthogonal. However, even very dissimilar Ramachandran distributions will not be able to produce values close to 1, because they cover only a limited fraction of the Ramachandran space. Schweitzer-Stenner and Toal employed the following criteria: H values between 0 and 0.1 indicate that two distributions are similar [73]. Values between 0.1 and 0.25 indicate that they are modestly similar, and values between 0.25 and 0.4 that they are modestly dissimilar. Values above 0.4 reflect very dissimilar distributions (note that Ting et al. multiplied their H values by 100). Since the Hellinger distance is practically a measure of orthogonality, it is more sensitive to changes of basin position than it is to the redistribution of populations [73]. Table 1 lists the Hellinger distances for pairs of eight amino acid residues. The left values represent the H-distances for pairs irrespective of their neighbors (Ramachandran plots for different neighbors were added up), whereas the right values represent H-distances if Q is present as an upstream neighbor. Only a few residue pairs fall in the category 'modestly dissimilar'. They mostly contain valine and asparagine. If only Q is considered as the upstream neighbor, then all H values are in the similar or modestly similar range. The significance of these values is not entirely clear. The integration over all neighbors might hide a strong influence of a few residue types. Valine and asparagine seem to be good candidates, as is, most likely, proline (values for P are reported by Ting et al.). We will return to the use of Hellinger distances below when we discuss investigations of short peptides in water.
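The behavior of the Hellinger distance described above can be illustrated with a minimal numerical sketch. It assumes the standard discrete definition, H = sqrt(1 − Σ sqrt(P_R · P_R′)) for two normalized distributions on the same (ϕ, ψ) grid; whether this is exactly the expression used by Ting et al. is an assumption. The basin centers, widths, weights and all function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two histograms on the same (phi, psi) grid."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    overlap = np.sum(np.sqrt(p * q))  # Bhattacharyya coefficient
    return float(np.sqrt(max(0.0, 1.0 - overlap)))

def basin(phi0, psi0, sigma=12.0, n_bins=72):
    """Normalized, periodic 2D Gaussian basin (illustrative model only)."""
    centers = -180.0 + (np.arange(n_bins) + 0.5) * 360.0 / n_bins
    phi, psi = np.meshgrid(centers, centers, indexing="ij")
    dphi = (phi - phi0 + 180.0) % 360.0 - 180.0  # wrap angle differences
    dpsi = (psi - psi0 + 180.0) % 360.0 - 180.0
    g = np.exp(-(dphi ** 2 + dpsi ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()

def mixture(weights, centers):
    """Weighted superposition of basins."""
    return sum(w * basin(phi0, psi0) for w, (phi0, psi0) in zip(weights, centers))

# Hypothetical basin centers: pPII, beta-strand, right-handed helical
centers = [(-70.0, 150.0), (-140.0, 130.0), (-65.0, -40.0)]
reference = mixture((0.6, 0.3, 0.1), centers)

# Case 1: same basin positions, populations redistributed
redistributed = mixture((0.4, 0.4, 0.2), centers)

# Case 2: same populations, pPII basin shifted by 20 degrees in phi
shifted = mixture((0.6, 0.3, 0.1), [(-50.0, 150.0)] + centers[1:])

h_same = hellinger(reference, reference)
h_redistributed = hellinger(reference, redistributed)
h_shifted = hellinger(reference, shifted)
```

In this toy comparison the shifted basin produces the larger H value, which mirrors the statement that the Hellinger distance responds more strongly to basin positions than to a redistribution of populations.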

Simulations
The first thorough computational investigation of NNIs was carried out by Pappu et al. [68]. The authors confined themselves to exploring the interactions between alanine residues in a blocked oligo-alanine peptide. They sub-divided the Ramachandran plot into 6 × 6 equally sized mesostates (60° × 60°). Their Monte Carlo calculations were performed with a rather simple hard-sphere model, by means of which they just explored the sterically available space, very much in line with the classical Ramachandran approach [69]. They identified clashes between nearest neighbors sampling mesostates in the right-handed helical region, while nearest neighbors sampling mesostates in the pPII and β-strand regions do not interfere with each other. Their results led the authors to conclude that an increasing chain length (of their oligo-alanine peptide) leads to a reduction in conformational space in the unfolded state, which reduces the conformational entropy and thus facilitates the folding into an overall right-handed α-helical conformation.
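The mesostate partition mentioned above can be sketched as a simple binning of the dihedral angles. The indexing convention (row for ϕ, column for ψ, counted from −180°) and the function name are assumptions made for illustration; they are not taken from Pappu et al.

```python
def mesostate(phi, psi, width=60.0):
    """Map (phi, psi) in degrees to a (row, col) cell of the 6 x 6 grid.

    Angles are wrapped into [-180, 180) first; the cell numbering is a
    hypothetical convention, not the labeling of the original paper.
    """
    row = int(((phi + 180.0) % 360.0) // width)
    col = int(((psi + 180.0) % 360.0) // width)
    return row, col

# Representative (hypothetical) angles for pPII and right-handed helix
ppii_cell = mesostate(-65.0, 150.0)
helix_cell = mesostate(-65.0, -40.0)

# Every point of a coarse scan falls into one of 36 cells
all_cells = {mesostate(phi, psi)
             for phi in range(-180, 180, 5)
             for psi in range(-180, 180, 5)}
```

With such a mapping, a chain conformation becomes a sequence of cell labels, and steric clashes can be tabulated per pair of neighboring labels.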
In a subsequent paper, Tran et al. investigated how steric-based NNIs depend on different types of neighbors [74]. They explored the conformational propensities of 22 amino acid residues (norvaline and norleucine in addition to the natural ones) in N-acetyl-(host)_L-x-(host)_L-N-methylacetamide (L: number of host residues) host-guest blocked tetrapeptides. Glycine, alanine, valine, phenylalanine and proline were selected as hosts. The result of their analysis is shown in Figure 8. While the influence of glycine on the guest residue is negligible (as one would expect), all other hosts (including proline) shift conformational sampling from the lower (all types of right-handed helical conformations) to the upper left quadrant (pPII and all types of β-strand). Interestingly, the underlying NNIs seem to be more pronounced for L = 2 (influence of nearest and second nearest neighbor) for A, F and, in part, V hosts. Altogether, the NNIs identified by Tran et al. produce more stretched peptide and protein conformations in the unfolded state than expected based on Ramachandran-type distributions. In that regard, their results are at variance with the nearest neighbor effects inferred from the coil library of Ting et al. [19]. However, for an increasing chain length, conformational entropy causes the chain to depart from a rod-like structure. Consequently, longer polypeptides and denatured proteins still obey the scaling law for a self-avoiding random coil.
NNIs also played a role in the MD simulations of Gnanakaran and Garcia on oligo-alanine peptides of different lengths [75]. These authors used a modified Amber force field (A94 mod) for which they eliminated the force constants for the two dihedral backbone angles. They found that the pPII conformation of the residues is stabilized by NNIs that involve the optimal packing of water molecules in a groove formed by the peptide backbone of at least four residues [27]. However, if the number of alanine residues exceeds ten [27], helical conformations stabilized by intrapeptide hydrogen bonding become more likely, in agreement with experimental results. While the results of this work are important owing to their emphasis on the role of the solvent, the elimination of the intrinsic force constants for the backbone dihedral angles seems to be somewhat heuristic.
Solvent effects also play a prominent role in the work of Avbelj and Baldwin [76]. These authors explored the electrostatic interactions between atoms within a residue and found them to stabilize the extended β-strand structures. pPII is stabilized by water in that the latter substantially shields this interaction. Such shielding effects become involved in NNIs, in that the solvation of residues depends on the conformation of neighbors. This is illustrated in Figure 9, which displays the change in the electrostatic solvation free energy caused by replacing an alanine at position 5 of a nine-residue oligo-alanine peptide by a valine. The change is more pronounced if the valine residue adopts pPII than if it adopts the β-strand conformation. Moreover, the graphs in Figure 9 reveal a concomitant reduction in the electrostatic solvation energy of the neighbors, which particularly affects the downstream neighbor and is stronger for the pPII than for the β-strand conformation of valine. This important result suggests a cooperative interaction between the pPII state of valine and the β-strand conformation of the neighbor. Besides valine, Avbelj and Baldwin investigated the influence of the remaining 18 amino acid residues. They found that the decrease in the electrostatic solvation free energy of the guest residue (compared with alanine) is particularly pronounced (>6.2 kJ/mol for pPII) for aromatic and aliphatic/β-branched residues (V, I, W, Y, F, H and T) and always larger if the guest residue adopts pPII. This work reveals that NNIs between residues adopting conformations in the upper left quadrant of the Ramachandran space are mostly solvent mediated.
The group of Sosnick has substantially contributed to our current understanding of NNIs. Besides their work on coil libraries [18,77], they conducted a thorough MD study on xAA and AAx tripeptides in water, where x denotes the guest residue. To this end they employed three force fields in implicit water: Amber 94, the modified Amber force field of Garcia (G-S-94), and OPLS-AA-2001 [78]. Simulations with these three force fields yield rather different propensities for the central alanine of AAA. Amber 94 produces the well-known preference for right-handed helical structures, while the other force fields yield a more balanced distribution. The authors could not reproduce the high pPII propensity for alanine with the G-S-94 force field, which can certainly be attributed to their use of an implicit solvent model. They also observed substantial differences between the Ramachandran distributions in AAA and in the alanine dipeptide, in that the residue of the latter spends more time in pPII and β. These results are at variance with experimental results that show a higher pPII preference for A in AAA than in the alanine dipeptide, in qualitative agreement with Garcia's work [79]. Again, this discrepancy points to the different solvent models used by Garcia and Zaman et al. Figure 10 displays the propensities for four different representative neighbors obtained with the G-S-94 and OPLS-AA-2001 force fields. The predicted changes are considerable but very much force-field dependent. For GAA, which one could use as a reference system, the G-S-94 force field produces a Ramachandran plot for the central alanine that is dominated by right-handed helical conformations. In contrast, the OPLS force field produces a dominance of pPII and β-strand. With G-S-94, replacing G by L, N or D keeps the high helical propensity while causing some redistribution between pPII and β-strand. The OPLS force field yields an increased sampling of the right-handed helical and bridge region for G→L and an overall increase in the helical population for G→D. The distributions obtained with Amber 94 are not indicative of a massive NNI influence, as the α-helix population is dominant for all the guest residues. There is no doubt that the results of these calculations are important in that they suggest that NNIs can produce substantial changes in the Gibbs energy landscape of residues and their conformational entropy. However, without experimental validation, it is problematic to employ the obtained population changes for a quantitative assessment of the influence of NNIs on the energetics of unfolded polypeptides and proteins.

NMR on Denatured Proteins
As indicated above, the first experimental results indicating that NNIs affect the structural distributions of denatured proteins came from NMR studies. They rely to a significant extent on the use of J-coupling constants, which reflect the degree of through-bond interaction between two nuclear spins that are generally one, two or three bonds apart. Their general dependence on dihedral backbone angles is described by Equation (6) (vide supra).
Penkett et al. used ³J(HNHCα) coupling constants of a denatured fibronectin binding protein to conclude that β-branched and aromatic neighbors shift these values up [61]. The authors interpreted this observation as indicating a shift to more negative average ϕ-values of the respective conformational ensemble. An even more thorough study was conducted by Peti et al., who analyzed ³J(HNHCα) coupling constants and chemical shifts of three denatured proteins, namely, ubiquitin, disulfide-reduced and carboxymethylated lysozyme, and a so-called all-A-α-lactalbumin (all-A means that all cysteines were replaced by alanines) [80]. ¹H,¹⁵N-HSQC spectra were interpreted as indicative of a random coil conformation in which right-handed helical and pPII/β-regions are populated. This notion was further supported by their observation that the average ¹⁵N chemical shifts of the amino acid residues (taken over all residues of a given type in the investigated proteins) correlate with the corresponding chemical shifts derived from the (restricted) coil library of Smith et al. [81]. However, if these proteins were really sampling a random coil type ensemble, there should be no NNIs of significance. This does not seem to be the case. Peti et al. showed that the nearest neighbor-induced chemical shift changes reported by Braun et al. [82], based on ¹⁵N measurements of unblocked GGxA peptides (x represents all natural amino acid residues), correlate with the nearest neighbor effects on leucine residues in the set of unfolded proteins [83]. They attributed these changes to conformational changes. Correlations between ¹⁵N chemical shift and ³J(HNHCα) coupling constant changes support this notion. These results seem to confirm the observation of Penkett et al. that branched and aromatic neighbors produce more negative ϕ-values [61]. If Peti et al. interpreted the neighbor dependence of the ¹⁵N chemical shifts correctly, their results invalidate the isolated pair hypothesis, which implies that the conformational ensemble of the investigated proteins cannot be an ideal random coil.
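The link between larger ³J(HNHCα) values and more negative ϕ-values can be made concrete with a minimal sketch. It assumes that the coupling constant follows a Karplus relation, J(ϕ) = A cos²(ϕ − 60°) + B cos(ϕ − 60°) + C, with the coefficients of the widely used Vuister and Bax parametrization; both the functional form and the coefficients are assumptions here and need not coincide with Equation (6). The representative ϕ-values and the function names are hypothetical.

```python
import math

# Assumed Karplus coefficients in Hz (Vuister-Bax-type parametrization);
# not taken from the text above.
A, B, C = 6.51, -1.76, 1.60

def j_hnha(phi_deg):
    """3J(HN,Halpha) in Hz for a single backbone angle phi (degrees)."""
    theta = math.radians(phi_deg - 60.0)
    return A * math.cos(theta) ** 2 + B * math.cos(theta) + C

def mean_j(ensemble):
    """Population-weighted coupling for a list of (weight, phi_deg) pairs."""
    total = sum(weight for weight, _ in ensemble)
    return sum(weight * j_hnha(phi) for weight, phi in ensemble) / total

# Hypothetical representative angles: pPII near -65 deg, beta-strand near -120 deg
j_ppii = j_hnha(-65.0)
j_beta = j_hnha(-120.0)

# Shifting population from pPII towards beta-strand raises the averaged coupling
j_mostly_ppii = mean_j([(0.8, -65.0), (0.2, -120.0)])
j_mostly_beta = mean_j([(0.2, -65.0), (0.8, -120.0)])
```

Under these assumptions the β-strand-like angle gives the larger coupling, so a neighbor that raises the measured average is consistent with a shift of the ensemble towards more negative ϕ. Because only an average is measured, different ensembles can yield the same value.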

Structural Analysis of Homopeptides
The above NMR-based analyses rely very much on averages over many amino acid residues in the considered denatured proteins and in the utilized coil libraries. We wonder whether such a procedure could obfuscate information about the conformational propensities of residues and their dependence on nearest neighbors. Moreover, averaging over different ensembles might lead to very similar coupling constants, so that changes in the latter are difficult to interpret, particularly if one relies only on a single type of coupling parameter. An alternative approach in this regard utilizes short peptides, which, owing to their limited length, cannot adopt any regular secondary structure. For a long period of time, blocked dipeptides were considered suitable model systems to explore intrinsic conformational propensities of amino acid residues. Ramachandran and Flory used them to explore the sterically allowed region of the Ramachandran plot [3,69]. The alanine dipeptide has been the system of choice for multiple MD simulations [84-89]. More recently, blocked dipeptides have been used to determine conformational preferences in water experimentally, and related blocked tripeptides have even been used to investigate nearest neighbor interactions [17,23,90-94]. The preference for blocked dipeptides over, e.g., unblocked tripeptides was generally based on the assumption that the charged terminal groups of the latter could affect conformational propensities [95,96]. However, experimental evidence reported by Toal et al. revealed that this is not the case for trialanine (A₃) and trivaline (V₃) [79]. Kallenbach and coworkers chose AcG₂xG₂NH₂ host-guest peptides [97,98]. Our research group has embarked on a thorough investigation of unblocked tri-, tetra-, and pentapeptides to determine the intrinsic conformational propensities and NNI effects [24,42-45,99-101]. In contrast to blocked dipeptides, this choice provides a somewhat more natural context for the investigated amino acid residue. In what follows, we will review these works with an emphasis on NNIs.
We start this discussion with a focus on alanine. The respective amino acid residue has long served as a model system for the exploration of the Ramachandran space. The Ramachandran plot for the alanine dipeptide that solely reflects steric exclusion and electrostatic interactions looks very much like Figure 1. Thus, it fully represents the local aspect of ideal random coil behavior. Very similar plots were obtained with more sophisticated molecular dynamics simulations in explicit water [91,92,102,103]. Hence, it came as a surprise when Shi et al., based on ¹H NMR and UVCD data for a hepta-alanine peptide (XAO-peptide, Ac-X₂A₇O₂-NH₂, X represents aminobutyric acid) in water, arrived at the conclusion that the peptide predominantly samples a basin assignable to the pPII conformation [104]. Up to this point this conformation had been associated with the trans conformation of proline in oligo- and poly-proline peptides, even though some early UVCD studies of Tiffany and Krimm had indicated that poly-L-lysine and poly-L-glutamic acid could adopt this conformation [105]. Their results were later corroborated by vibrational circular dichroism studies [106].
Since there was no obvious reason for alanine to prefer pPII, the results of Shi et al. became highly controversial in the field. Scheraga, Liwo and coworkers re-analyzed the data of Shi et al. based on the results of MD simulations and arrived at the conclusion that there is no specific preference of alanine for pPII [102,103,107]. Small-angle X-ray scattering data were found to be inconsistent with a conformational ensemble dominated by pPII [103]. This discussion overlooked the fact that early coil library studies had already indicated the very high pPII propensity of alanine (vide supra), thus lending credibility to the results of Shi et al. [40].
After the paper of Shi et al. was published, their results were sometimes interpreted as indicating that alanine sequences could adopt a stable pPII helix [108][109][110]. Some wording chosen by the authors certainly facilitated this reading, but in a follow-up paper they actually found no evidence for any cooperative nearest neighbor interactions between alanines in GGAAGG and GGAAAGG peptides [98], which would be needed for the formation of a stable pPII helix. Nevertheless, their work triggered a discussion of the so-called reconciliation problem, namely, the apparent contradiction between the occurrence of pPII helices in denatured proteins and their well-established behavior as a self-avoiding random coil [30,74].
Spectroscopic studies on alanine-based oligopeptides suggest that some cooperativity between the pPII states of alanine residues actually exists, in line with the MD results of Garcia (vide supra). At an early stage of the debate about the alleged pPII propensity of alanine, Schweitzer-Stenner et al. combined IR, polarized Raman and vibrational circular dichroism (VCD) data to show that the unblocked tetra-alanine (A₄) has a higher pPII propensity than tri-alanine (A₃) [111]. This result was corroborated by a Raman optical activity study of McColl et al. [112]. However, both studies were qualitative in nature in that they did not report any numbers reflecting conformational propensities. This gap was filled later by more quantitative studies that utilized NMR J-coupling constants in addition to the amide I' band profiles in IR, Raman and VCD spectra. The obtained results suggest that the central alanine residue in A₃ has a slightly higher pPII propensity than the one in GAG, namely, 0.84-0.9 for the former and 0.72-0.8 for the latter [76,84,113]. This difference looks small, but it is indicative of a Gibbs free energy change of ca. 3 kJ/mol in favor of pPII (for A₃).
Graf et al. measured the ³J(HNHCα) and ²J(NCα) coupling constants for a series of Aₙ peptides (n = 3-7) [114]. While the former reflects average ϕ-values, the latter depends on the ψ-value of the residue that precedes the utilized amide nitrogen. The authors used the measured coupling constants in constrained MD simulations, from which they obtained a slight stabilization of β-strand over pPII with increasing residue number. However, the obtained changes might be within the error limits of their analysis. Two other spectroscopic studies suggest that an increasing length of alanine sequences stabilizes non-extended structures over pPII, in line with predictions [75]. Verbaro et al. combined vibrational spectroscopy and fluorescence energy transfer experiments with the coupling constants of Graf et al. to show that the individual alanine residues of the unblocked A₅W peptide exhibit pPII fractions between 0.65 and 0.75 [115]. These values are lower than those observed for trialanine, but they are still far higher than any predictions obtained with steric exclusion and MD calculations. The slight destabilization of pPII benefits right-handed helical conformations for residues 2-4, in line with predictions of Gnanakaran and Garcia. The fifth alanine residue behaves differently in that it exhibits a more pronounced β-strand population. The latter could reflect interactions with the aromatic W-residue at the C-terminus, which exhibits a mixture of pPII and β-strand.
Shi et al. had initiated the debate about the pPII propensity of alanine with a spectroscopic analysis of the XAO peptide (vide supra). They deduced a rather high propensity (mole fraction) value of ~0.9 from their data. While later studies on shorter peptides yielded similar values, it is higher than the values Verbaro et al. reported for the alanine residues in A₅W. Moreover, a conformational ensemble of XAO totally dominated by pPII sampling would be inconsistent with the radius of gyration obtained from the SAXS experiments [103]. A more realistic picture emerged from the study of Schweitzer-Stenner and Measey, who combined IR, Raman and VCD band profiles with the ³J(HNHCα) values of Shi et al. and with the results of MD simulations of Scheraga, Liwo and collaborators [116]. The authors obtained pPII propensities in the 0.55-0.7 range for the three central alanine residues of the peptide, while the values are much lower for the residues in the proximity of X and O. They were found to sample an exceptionally large number of different turn-supporting structures and to exhibit larger β-strand propensities than alanine in short peptides. The results of this study are in line with those of Verbaro et al. [115], but they additionally suggest that alanine can be heavily influenced by the conformational distribution of neighbors. This was later confirmed by Toal et al. [99] (vide infra). Regarding the debate about the pPII propensity of alanine, the results of Schweitzer-Stenner and Measey hit a middle ground between two extreme views, namely, that of a nearly all-pPII XAO peptide [104] and the results of MD simulations and small-angle X-ray studies that suggested that there is no exceptional pPII propensity of alanine at all [103,107].
We now move to a discussion of other homopeptide sequences. Early work of Eker et al. suggested that the central valine in unblocked V₃ has a very high β-propensity [117,118]. The latter was later confirmed by Graf et al., who reported a value of 0.52 for β, 0.19 for pPII and 0.29 for right-handed helical conformations [114]. Schweitzer-Stenner, who combined the J-coupling values of Graf et al. with amide I profiles, reported an even higher β-strand value (0.68), which comes at the expense of pPII and right-handed helical sampling (0.16 for both) [72]. These values seem to be more in line with the absence of a significant signal in the UVCD spectrum of V₃ [118]. They should be compared with the conformational distributions of valine in GVG, for which Hagarman et al. reported a β-strand propensity of just 0.38 [42]. The remaining population of V is distributed over pPII and several turn-forming conformations, including γ-turn. The Ramachandran plots of GVG and VVV are shown in Figure 11. They illustrate the large influence that the two terminal valine residues have on the central valine residue, which involves changes in the population and basin position. Taken together, these results are indicative of strong NNIs that cause a predominance of the β-strand conformation, which, with respect to pPII, is stabilized by ca. 3.7 kJ/mol in VVV.
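The conversion of a population ratio into a Gibbs energy difference, as used for the stabilization energy quoted above, can be sketched as follows. The sketch assumes a simple Boltzmann relation between two basins, ΔG = −RT ln(χ_β/χ_pPII), and room temperature (298.15 K); the temperature and the function name are assumptions, since neither is specified in the text.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1

def delta_g(x_a, x_b, temperature=298.15):
    """Gibbs energy of state a relative to state b (kJ/mol) from mole fractions.

    Negative values mean that state a is the more stable one. The default
    temperature is an assumption, not a value taken from the cited studies.
    """
    return -R_KJ * temperature * math.log(x_a / x_b)

# Central valine of VVV: beta-strand 0.68 versus pPII 0.16 [72]
dg_vvv = delta_g(0.68, 0.16)
```

With these inputs the sketch returns about −3.6 kJ/mol, i.e., close to the ca. 3.7 kJ/mol stated above; the exact number depends on the temperature assumed.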
Results obtained for other homo-tripeptides and related GxG peptides are noteworthy. For GKG, Hagarman et al. reported a rather balanced distribution in the upper left quadrant of the Ramachandran plot (mole fractions of 0.5 and 0.41 for pPII and β-strand, respectively) [42]. The distribution of the central residue of KKK reported by Verbaro et al. is significantly different [113]. The usual two-basin distribution comprising pPII and β is merged into one broad basin centered at ϕ,ψ = −95°, 170°. The authors termed this a distorted pPII conformation. This notion seems to be justified by a UVCD spectrum that is still very much pPII-like, though with a more symmetric couplet. The result is at least qualitatively consistent with earlier findings for poly-L-lysine and a hepta-lysine peptide [105,119]. Apparently, the changes caused by NNIs in this peptide are quantitative and qualitative in that they can change the populations as well as the positions of the basins.
Another GxG-x₃ comparison has been carried out for aspartic acid. The Ramachandran plot of GDG is rather peculiar. The pPII population is low (0.2), whereas the β-strand is comparatively highly populated (0.48). In addition, aspartic acid was found to sample type II' β-turn-supporting conformations and, to a significant degree, asx-turn conformations, which lie in the upper right quadrant of the Ramachandran plot (Figure 11) [43,45,59]. The latter do not appear in coil library Ramachandran plots for aspartic acid residues, but they occur frequently in proteins [60]. In fully protonated DDD, the β-strand population of the central residue is nearly identical with the one of GDG, but the shape of the distribution was found to be different [120]. However, NNIs populate right-handed helical (3₁₀)-supporting conformations at the expense of pPII and asx. The former lies slightly below the type II β-turn-supporting conformation populated in GDG [45]. Interestingly, upon deprotonation of the D-residues, the distribution looks very much like the one observed for KKK, but with a less negative (more pPII-like) ϕ-angle. A similar result was obtained for ionized GDG by Rybka et al., though with more separated pPII and β-basins [45].
Recently, our research group has explored the conformational landscape of unblocked GxxG and GxxxG peptides. In contrast to the above tripeptides, the presence of terminal glycine residues allowed us to determine the Ramachandran plots of all x-residues. Here, we start with x = D (protonated). The Ramachandran plots are shown in Figure 12. There are numerous noteworthy observations. First, the Ramachandran plots of corresponding residues are rather different, which already indicates that NNIs are operative. For all D-residues, we obtained a comparatively high population of type I/II (i + 2)-turn-forming structures. The asx-turn basin is still populated for residue D1 of GDDG and for D1 and D3 of GDDDG. pPII and β-strand populations are comparable, with the exception of the second D residue of GDDG, where β-strand is even more populated than it is in GDG. An investigation into the aspartic acid dipeptide at acidic and neutral pH led to the conclusion that the above differences between ionized and protonated DDD reflect interactions between the terminal carboxylate group and the aspartate side chains, which are naturally absent in denatured proteins. Hence, it is not surprising that, e.g., the coil library distribution of aspartic acid in DDD segments more closely resembles the one observed for protonated DDD [120]. The results obtained with fully protonated D-containing peptides should therefore be considered as representing the properties of D-containing homo-segments in disordered segments or proteins. A detailed discussion can be found in Milorey et al. [101].
What do the results of the studies on the above homopeptide sequences have in common? For the D- and R-sequences, NNIs seem to produce a distribution with a nearly equal partition between pPII and β. The respective mole fractions of, e.g., GRRG and GRRRG, suggest that while the N-terminal R resembles to some extent the central residue of GRG (i.e., pPII dominance over β), the distributions of the other R-residues are much more balanced [100]. A similar equipartition effect seems to be operative in trilysine, where it is accompanied by a merger of the pPII and β-basins. For D, this conformational balancing appears only in GDDDG. NNIs cause a reorganization of the population of turn-forming conformations for both R and D.
The central R residue of GRRRG features some measurable populations of right-handed helical conformations, while the central residue of GDDDG exhibits an increased population of type II'/I β-turn-forming structures. A Hellinger distance analysis of Milorey et al. suggests that the distributions of the tetra- and pentapeptides are distinct from that of the respective GxG, while they show some similarity with respect to each other [100,101]. This analysis reveals that the basin positions in the Ramachandran plots of the investigated GxxG and GxxxG peptides are not very different. For trivaline and oligo-alanines, the NNIs seem to stabilize the already dominant conformer, but the length-dependent propensity of alanine for right-handed helices leads to stabilization of the helical conformations [27,75].
Recently, Schweitzer-Stenner et al. used NMR spectroscopy to investigate the cationic state of GKKG. They found that the downstream lysine residue is more affected by NNIs than the upstream one. The Ramachandran plot for the former looks very much like the one obtained for KKK, but with pPII and β-strand basins clearly separated [121].

Structural Analysis of Heteropeptides
In this paragraph we discuss two experimental approaches aimed at exploring NNIs in short peptides. We start with the work of Cho and colleagues. Oh et al. used UV CD and 3 J(H N H Cα ) values of blocked tripeptides to assess the nearest neighbor interactions between all 20 natural amino acid residues (the authors used the term dipeptides, but that is not in line with literature terminology for blocked peptides) [93,122]. Since they did not assign individual amide proton signals, they used an average of the two 3 J(H N H Cα ) coupling values. They compared the average pPII populations of the corresponding xy and yx pairs of residues, which they deduced from the average 3 J(H N H Cα ) coupling by applying a simple two-state (pPII-β strand) model. The large ratio between the standard deviations along the diagonal (0.169) and the antidiagonal (0.03) in Figure 13 was interpreted as indicating that the effective propensities of the corresponding xy and yx pairs are very similar. In other words, the amino acid residue type rather than the sequence matters.
In a follow-up paper, Jung et al. investigated the same large set of peptides, but this time they assigned chemical shifts and thus J-coupling constants to the N- and C-terminal residues [94]. Here we focus on their 3 J(H N H Cα ) data because this parameter has a clearly established structure dependence. Figure 14 (upper panel) shows the average 3 J(H N H Cα ) of their x and y residues for the 19 investigated amino acid residues (proline omitted). The averaging was done over all neighbors. Several aspects of the plots are noteworthy. First, based solely on their 3 J(H N H Cα ) values, there seem to be three classes of residues. The first one contains residues with values significantly below the respective average (6.7 Hz for x and 7.4 Hz for y). It solely contains alanine and glycine. The second one contains values significantly above the averages. Its members are N, I, V, H and T for x and N, Y, F, I, V, H and T for y. If changes in 3 J(H N H Cα ) solely reflected pPII/β ratios, class 1 residues would have a high pPII propensity, while class 2 members would prefer β-strand. The remaining residues (class 3) fluctuate around the respective average values. The second observation is that the standard deviations of the average values are small. This seems to suggest that the nearest neighbor dependence of 3 J(H N H Cα ) is weak. Third, the averages over all J-coupling constants are different (larger for y than for x). This seems to reflect some end effects, which is astonishing because one of the frequently stated arguments for the use of blocked peptides is the proposed absence of end effects.
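The two-state reading of 3 J(H N H Cα ) invoked above can be sketched as follows. In such a model the observed coupling is a population-weighted average of the couplings of pure pPII and pure β-strand, so that the pPII fraction follows from a linear interpolation. The two reference couplings used here (5.0 Hz and 9.5 Hz) are hypothetical placeholders chosen only for illustration; they are not the values used by Oh et al. or Jung et al., and the resulting fractions should not be read as results of those studies.

```python
# Hypothetical reference couplings (Hz) for the two pure states. These are
# illustrative placeholders, NOT values taken from the cited studies.
J_PPII = 5.0
J_BETA = 9.5


def ppii_fraction(j_observed, j_ppii=J_PPII, j_beta=J_BETA):
    """pPII mole fraction from a two-state (pPII / beta-strand) model.

    Assumes J_obs = x_pPII * J_pPII + (1 - x_pPII) * J_beta and clips the
    result to the physically meaningful interval [0, 1].
    """
    if j_beta <= j_ppii:
        raise ValueError("the beta reference coupling must exceed the pPII one")
    fraction = (j_beta - j_observed) / (j_beta - j_ppii)
    return min(1.0, max(0.0, fraction))


# Average couplings quoted in the text for the x and y positions
for position, j_average in (("x", 6.7), ("y", 7.4)):
    print(f"{position}: J = {j_average} Hz -> x_pPII = {ppii_fraction(j_average):.2f}")
```

The sketch also makes the limitation of this approach visible: any change in the basin positions alters the reference couplings themselves, so that a single coupling constant cannot distinguish a population shift from a shift of the basin coordinates.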
Figure caption (fragment): x 1 and x 2 correspond to x and y of the notation used in this article. The character C in the upper left corner indicates that these figures are part of a figure set in [94], from which they were taken and modified.
The plots in the lower panel of Figure 14 convey a slightly different message. They depict the average change in 3 J(H N H Cα ) caused by the respective residues that constitute the abscissa. The data suggest that, if positioned at x, most of the residues exert nearly the same rather moderate influence, with the aromatic residues Y, F and W as exceptions. The latter all increase the 3 J(H N H Cα ) values of their neighbors. More variations were observed for the y position, where K, R, Y, H and N cause a reduction in their neighbors' 3 J(H N H Cα ), while Y, F and W at position x again cause a substantial increase in this coupling constant.
Now, we turn to the work on unblocked peptides carried out in our research group and in the Schwalbe laboratory in Frankfurt. Toal et al. conducted a large number of investigations into GxyG-type tetrapeptides by using vibrational and NMR spectroscopy. Here, we focus on the most important aspects of their results. Figure 15 shows the mole fractions for a series of GxAG, GAyG, GxDG and GDyG peptides. The Ramachandran plots for the alanine-containing series are depicted in Figure 16. In all these cases, the influence of the x- and y-neighbors is obvious. For alanine they reduce the pPII fraction quite substantially. The effect is most pronounced for valine. In the case of aspartic acid, the nearest neighbors increase the pPII propensity of D at the expense of turn conformations. In both cases, NNIs move the system towards equipartition, though to a different degree.
Toal et al. found that propensity-related NNI effects are much less pronounced for pairs of lysine, leucine and valine [99]. Interestingly, however, NNIs were found to cause, in part, substantial changes in the basin coordinates. These results thus underscore the notion that changes in individual coupling constants can be very misleading in that they could reflect changes in populations as well as in basin coordinates. Only the use of multiple coupling constants and vibrational spectroscopy data leads to a meaningful result.
In addition to exploring the conformational distributions of GxyG peptides at room temperature, Toal et al. measured the temperature dependence of the 3 J(H N H Cα ) constants. Earlier 1 H NMR experiments on GxG peptides had shown that these data sets can be analyzed rather accurately in terms of a two-state model that describes the equilibrium between pPII and β-strand [45]. This yields very informative values for ∆H and T∆S. Figure 17 compares the thermodynamic parameters of L and V residues in the presence of different neighbors. For L the ∆G-values of pPII/β-strand equilibria at room temperature vary between 0.5 and −0.5 kJ/mol (~0.2*RT), in line with the reported very limited influence of NNIs on L at room temperature. However, except for GDLG, the corresponding enthalpic and entropic differences of the investigated GxLG and GLyG peptides are considerable. This notion particularly applies to GSLG and GVLG. For V-containing tetrapeptides, the thermodynamic parameters convey a different message. While ∆H and T∆S values are large for GVG, corresponding values of V-containing tetrapeptides are small, with the notable exception of GVLG. If both enthalpic and entropic values are high, then the entropy will win at temperatures at which proteins thermally unfold. This leads to a stabilization of β-strand conformations. The rather small Gibbs energy differences between pPII and β-strand at room temperature obtained for all residue pairs, depicted in Figure 17, are due to enthalpy-entropy compensation and the close proximity of the isoequilibrium point to room temperature. Toal et al. showed that the Gibbs energy differences of a rather large number of GxG peptides (x = L, V, I, S, K, Y, W and F) become practically identical at a temperature of 302 K. The isoequilibrium of another group (x = E, R, M and N) was observed at 312 K [24]. The occurrence of isoequilibrium points can be related to an enthalpy-entropy compensation and a common origin of enthalpic and entropic differences, namely, solute-solvent interactions [123][124][125], which in the case of GxyG peptides can obscure NNI effects. What this implies for our understanding of the thermal unfolding of proteins remains to be understood.
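The interplay between enthalpy-entropy compensation and an isoequilibrium point can be illustrated with a minimal numerical sketch. It assumes temperature-independent ∆H and ∆S, the sign convention ∆G = G(β-strand) − G(pPII), and two entirely hypothetical parameter sets (labelled A and B) chosen so that their ∆G(T) lines cross close to 302 K. None of these numbers are taken from Toal et al.

```python
R = 8.314462618e-3  # gas constant in kJ/(mol K)


def delta_g(delta_h, delta_s, temperature):
    """Two-state Gibbs energy difference, dG = dH - T*dS, in kJ/mol.

    Assumed convention: dG = G(beta-strand) - G(pPII), so a negative value
    means that beta-strand is favored.
    """
    return delta_h - temperature * delta_s


def crossing_temperature(h1, s1, h2, s2):
    """Temperature at which two residues have identical dG (isoequilibrium)."""
    if s1 == s2:
        raise ValueError("parallel dG(T) lines never cross")
    return (h1 - h2) / (s1 - s2)


# Hypothetical parameters: dH in kJ/mol, dS in kJ/(mol K)
residue_a = (10.0, 0.0330)
residue_b = (25.0, 0.0827)

t_iso = crossing_temperature(*residue_a, *residue_b)
print(f"isoequilibrium temperature: {t_iso:.1f} K")

for temperature in (298.15, 350.0):
    dg_a = delta_g(*residue_a, temperature)
    dg_b = delta_g(*residue_b, temperature)
    print(f"T = {temperature} K: dG_A = {dg_a:+.2f}, dG_B = {dg_b:+.2f} kJ/mol")

# A Gibbs energy difference of 0.5 kJ/mol expressed in units of RT at 298.15 K
print(f"0.5 kJ/mol = {0.5 / (R * 298.15):.2f} RT")
```

With these made-up parameters, both residues show Gibbs energy differences of only a fraction of RT near room temperature even though their enthalpic and entropic terms differ substantially, while at a higher temperature the residue with the larger ∆H and ∆S is shifted more strongly towards β-strand.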
Figure caption (fragment): and aspartic acid (lower panel) in the indicated tri- and tetrapeptides. The turn fraction was calculated as the sum over the occupation of all non-pPII/β-strand conformations. Note that the color codes for the two panels are different. Taken from Toal [126].
In a follow-up study, Schweitzer-Stenner and Toal compared Ramachandran plots of GxG and GxyG by means of Hellinger distance calculations [73]. We reiterate that Hellinger distances are more sensitive to changes in basin coordinates than to redistributions of populations. Their values suggest that, e.g., distributions of GAG are only modestly dissimilar from those of the alanine residues in the above tetrapeptides. Compared with GDG, however, K, L and V as neighbors produce very dissimilar distributions of D. Interestingly, a significant dissimilarity was also obtained for valine if flanked by D, S and L, which predominantly reflects basin rather than population changes. Moreover, the authors calculated the Hellinger distances for corresponding pairs in the coil library of Ting et al. They obtained much lower values, which suggests that for the investigated residue pairs the averaging over either the upstream or downstream residues obfuscates to some extent the NNIs between residues.
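The statement that Hellinger distances respond more strongly to basin shifts than to population redistributions can be checked with a toy calculation. The sketch below uses the standard definition for discrete distributions, H = sqrt(1 − Σ sqrt(p_i q_i)), and builds artificial two-basin Ramachandran distributions from isotropic Gaussians. The basin centers, widths, weights and grid are hypothetical and serve only to illustrate the trend; they do not reproduce the analysis of Schweitzer-Stenner and Toal.

```python
import math


def gaussian_basins(basins, sigma=10.0, step=5):
    """Toy Ramachandran distribution: sum of isotropic Gaussians on a grid.

    `basins` is a list of (weight, phi0, psi0) tuples with angles in degrees.
    The grid covers phi from -180 to -5 and psi from 90 to 180 degrees.
    """
    grid = []
    for phi in range(-180, 0, step):
        row = []
        for psi in range(90, 185, step):
            value = 0.0
            for weight, phi0, psi0 in basins:
                d2 = (phi - phi0) ** 2 + (psi - psi0) ** 2
                value += weight * math.exp(-d2 / (2.0 * sigma ** 2))
            row.append(value)
        grid.append(row)
    return grid


def hellinger(p, q):
    """Hellinger distance between two distributions sampled on the same grid."""
    flat_p = [v for row in p for v in row]
    flat_q = [v for row in q for v in row]
    if len(flat_p) != len(flat_q):
        raise ValueError("distributions must share the same grid")
    norm_p, norm_q = sum(flat_p), sum(flat_q)
    overlap = sum(math.sqrt((a / norm_p) * (b / norm_q))
                  for a, b in zip(flat_p, flat_q))
    return math.sqrt(max(0.0, 1.0 - overlap))


# Hypothetical reference: pPII-like basin (60 %) and beta-like basin (40 %)
reference = gaussian_basins([(0.6, -70, 150), (0.4, -120, 150)])
# Same basin positions, populations redistributed to 50:50
repopulated = gaussian_basins([(0.5, -70, 150), (0.5, -120, 150)])
# Same populations, beta-like basin shifted by 20 degrees in phi
shifted = gaussian_basins([(0.6, -70, 150), (0.4, -140, 150)])

h_population = hellinger(reference, repopulated)
h_shift = hellinger(reference, shifted)
print(f"population change: H = {h_population:.2f}")
print(f"basin shift:       H = {h_shift:.2f}")
```

In this toy setting, shifting one basin by twice its width yields a considerably larger distance than moving a tenth of the total population from one basin to the other, in line with the qualitative statement made in the text.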
While the experimental data presented in this and the preceding paragraphs reveal substantial NNI effects between amino acid residues in short peptides and in coil libraries, they do not per se imply a violation of the isolated pair hypothesis. In principle, it is conceivable that just the different nature of the neighbors causes the observed changes in basin populations and coordinates. A breakdown of the isolated pair hypothesis requires that the population of a conformation of a residue depends on the conformation adopted by its neighbors. Recently, Schweitzer-Stenner and Toal showed that the available GxyG data set is diagnostic of an anticooperative pPII-β-strand interaction where the pPII of one residue stabilizes the β-strand of neighbors, thus destabilizing the respective pPII conformation [127]. For some residue pairs, this interaction becomes significant only at high protein-melting temperatures, because of the large enthalpic and entropic differences between pPII and β-strand (vide infra). The authors utilized the derived temperature dependence of the NNIs between K and V in GKVG to predict the temperature dependence of the UVCD spectrum of the unfolded Max3 peptide (VK) 4
The work of Toal et al. as well as the homopeptide studies discussed in the preceding paragraph revealed the necessity to use a set of J-coupling constants in conjunction with vibrational spectroscopy data to arrive at a quantitative assessment of NNIs. Just probing 3 J(H N H Cα ) and its changes in the presence of different neighbors does not allow for a valid structural analysis. In order to substantiate this notion, a closer look at J-coupling constants is helpful. Let us start with alanine. If one averages over all alanine neighbors, the obtained value for 3 J(H N H Cα ) is 6.36 Hz and carries a standard deviation of 0.4 Hz, which might point to moderate NNIs (data were taken from Toal et al. [99]). For 3 J(H N C ) the average value is 1.23 Hz with a standard deviation of 0.08 Hz. These values seem to suggest weak NNIs. However, a closer look shows that these averages have very limited meaning. Valine as a neighbor increases 3 J(H N H Cα ) from 6.1 to 6.6 Hz, which is a significant increase. Concomitantly, 3 J(H N C ) increases from 1.18 to 1.27 Hz. These correlated changes, combined with an even more significant increase in 3 J(H Cα C ) from 1.02 to 2.56 Hz, reflect a drastic decrease in the pPII propensity from 0.8 to 0.38. Again, if one were only concerned with the average 3 J(H Cα C ) value (2.15 ± 0.32 Hz), one would not expect such a significant change in the propensities caused by any of the investigated neighbors. The J-coupling plots for D-containing peptides underscore this conclusion. If one looks solely at the 3 J(H N H Cα ) values, a standard deviation of 0.47 Hz (relative value = 0.06) for an average of 7.42 Hz suggests that the nearest neighbor-induced changes are significantly smaller than, e.g., the differences between the average values obtained for A and D. However, the respective 3 J(H N C ) and 3 J(H Cα C ) variations are far more pronounced, thus indicating significant NNI-induced changes, in line with the analysis of Toal et al. [99]. For the series of arginine homopeptides investigated by Milorey et al., the standard deviation for 3 J(H N H Cα ) is very small (the relative value is just 0.04) [100], which would lead to the conclusion that the mutual influence of arginines on each other is weak. However, a look at the large standard deviation of the average 3 J(H N C ) reveals that such a conclusion would be incorrect. It should be noted in this context that the changes in the VCD strength of the excitonically coupled amide I modes of GxyG peptides, as reported by Toal et al., are further indicators of substantial NNI-induced structural changes [99].
Taken together, the studies described in this paragraph corroborate the occurrence of residue-specific NNIs. The observed correlation between the pPII and β-strand propensities of the neighbors indicates that NNIs are conformation dependent, which implies a breakdown of the isolated pair hypothesis.

Summary and Outlook
Nearest neighbor interactions between amino acid residues in unfolded and denatured proteins started to attract attention nearly 25 years ago. Their relevance stems from the fact that their occurrence could indicate a violation of the isolated pair hypothesis of Flory, on which his random coil model for unfolded polypeptides and proteins was built. Initially, information about nearest neighbor interactions came from coil library analyses, MD simulations and, to a limited extent, from NMR data on denatured proteins. The results of coil library analyses provided the clearest evidence for a substantial influence of NNIs on conformational propensities of amino acid residues in coil-like structures. Only recently have experiments on model peptides provided a more quantitative assessment of NNIs in terms of propensity changes and thermodynamic interaction parameters. Unfortunately, a unifying picture has not yet emerged from the available data. Early coil library and NMR data indicate that the upstream presence of aliphatic and aromatic residues might cause a conformational redistribution to more extended structures. The extensive set of coil library-based Ramachandran plots of Ting et al. suggests that NNIs mostly affect the population of the basin associated with right-handed helical conformations. The few MD-simulation studies aimed at exploring NNI effects reveal that the emerging results depend on the choice of the force field. Experimental studies on tetra- and a few pentapeptides reveal a complicated picture in that they show that NNI effects are highly residue and sequence dependent. While they mostly cause redistributions between pPII and β-strand conformations, modest populations of turn-forming conformations occur in homopeptide sequences.
In view of the complexity indicated above one might wonder whether studying NNIs is worth the effort. While there was considerable initial interest in the subject over a period of ca. 15 years after the emergence of first evidence for detectable NNIs in denatured proteins, it seems that it has been put on the backburner, particularly by the computational community. Recent force-field developments were guided by conformational analyses on blocked dipeptides, which thus explicitly exclude NNIs [128]. Conformational studies on polypeptide sequences with different net charges were predominantly aimed at exploring the classical polymer physics parameters, such as the radius of gyration, average interresidue distance and scaling factor, rather than conformational propensities and nearest neighbor interactions. We think that an explicit consideration of NNIs is necessary for a variety of reasons. First, even though our experimental studies suggest that NNIs frequently lead to a randomization of individual Ramachandran distributions, the correlation effects between pPII and β-strand conformations cause a reduction in conformational entropies. This notion is in line with computational studies on unfolded proteins [18,77]. In view of the relevance of the conformational entropy for protein-protein and protein-DNA interactions associated with disorder-to-order or order-to-disorder transitions, a correct assessment of the conformational entropies of the involved IDPs or disordered segments seems to be pivotal. Second, since solvation plays a major role in determining conformational propensities and NNIs, the notion that the solvation Gibbs energy of an unfolded protein is just the sum of the residue solvation energies can no longer be maintained [129]. Third, given the relevance of MD simulations for the study of IDPs and their biological functions, the utilized force field should be able to account for the influence of NNIs.
In view of the fact that most of the current force fields cannot even properly reproduce experimental data of model peptides [25,26,130], the field is not even close to achieving this goal. Fourth, it is very likely that residual structures of protein segments are relevant for the initial phase of protein folding and peptide/protein self-assembly [131][132][133].
When it comes to the study of NNIs one might wonder whether a further exploration of coil libraries or studies of short peptides can fill the void. The advantage of the former is the large data sets; its disadvantage is the fact that coil distributions might still not be representative of residues in unfolded proteins, because the protein context and different degrees of solvent accessibility cannot be ignored [45]. The latter issues are addressed by studies on short model peptides, but performing an extensive analysis, such as the one by Toal et al., for all combinations of amino acid residues is out of the question. Moreover, one has to take into account that studies on short peptides may not provide the full picture, even for longer polypeptides that are incapable of folding. The reason for this deficiency is that cooperative effects that support the population of right-handed helical structures are difficult to obtain from an analysis of tetra- and pentapeptides. According to the Zimm-Bragg or Lifson-Roig model, cooperative NNIs between residues in helical conformations can become more relevant with increasing length of the oligo- or polypeptide [134,135]. This could actually explain the observation that coil library distributions indicate more right-handed helical content than the Ramachandran plots of short peptides [44]. To maneuver in this rather complex landscape, a reduction strategy is called for that would identify the NNIs between a limited number of residues, representing groups with aromatic, polar, charged and aliphatic side chains. Possible candidates are F, S, R and V. One might add L to the aliphatic group since β- and γ-branched aliphatic residues behave differently [99].
Available data sets for A, L and V are already considerable but might need to be complemented by studies on pentapeptides, since the influences of up- and downstream neighbors are apparently not additive. D plays a special role because its side chain can interact with the backbone. Once the NNIs for a complete set of sequences with these residues are determined, one could study longer peptides composed of residues for which high (Zimm-Bragg) s-parameters have been reported to elucidate the interplay between the NNIs that support helical structures and those that support extended structures. Such a data set could provide a sufficient basis for the development of suitable MD force fields and water models and substantially increase our understanding of unfolded and intrinsically disordered proteins.
Funding: Part of the research described in this article was funded by the National Science Foundation (CHE 0804492 and MCB-1817650).