A Rationalization of the Effect That TMAO, Glycine, and Betaine Exert on the Collapse of Elastin-like Polypeptides

Elastin-like polypeptides (ELPs) are soluble in water at low temperature, but, on increasing the temperature, they undergo a reversible and cooperative, coil-to-globule collapse transition. It has been shown that the addition to water of either trimethylamine N-oxide (TMAO), glycine, or betaine causes a significant decrease of T(collapse) in the case of a specific ELP. Traditional rationalizations of these phenomena do not work in the present case. We show that an alternative approach, grounded in the magnitude of the solvent-excluded volume effect and its temperature dependence (strictly linked to the translational entropy of solvent and co-solute molecules), is able to rationalize the occurrence of ELP collapse in water on raising the temperature, as well as the T(collapse) lowering caused by the addition to water of either TMAO, glycine, or betaine.


Introduction
It is well-established that elastin-like polypeptides, ELPs, are soluble in water at low temperature and undergo a temperature-induced, reversible, and cooperative collapse transition, passing from extended, coil conformations to compact, globular ones [1-3]. Soon after collapse, aggregation occurs, and T(collapse) practically corresponds to the lower critical solution temperature. In a recent and very interesting study, Cremer and co-workers tried to shed light on the effect that the addition to water of three co-solutes, namely trimethylamine N-oxide (TMAO), glycine, and betaine, has on the collapse temperature of a specific ELP [4]. The latter consists of 120 repeat units of the sequence Val-Pro-Gly-Val-Gly, for a total of 600 residues. Experimental measurements showed that T(collapse) = 28.5 °C in water, and that it decreases significantly on raising the concentration of the three co-solutes. In particular, T(collapse) is 10 °C in 1 M glycine, 12.5 °C in 1 M TMAO, and 18.5 °C in 1 M betaine [4]. In other words, the addition to water of either TMAO, glycine, or betaine stabilizes the globule state of ELP. This result can be considered as "expected" because all three co-solutes are stabilizing agents of the native state of globular proteins [5,6], and the globule state of ELP should resemble the native state of globular proteins.
To clarify the mechanism of action of such co-solutes, Cremer and co-workers performed both experimental measurements and computer simulations, obtaining the following results: (1) the surface tension of the aqueous solutions increases with respect to that of water on adding glycine and betaine, but it decreases upon TMAO addition (see Figure 3B in [4]); (2) both glycine and betaine molecules prefer to interact with water and are depleted at the ELP surface, whereas TMAO molecules prefer to interact with ELP and are enriched at its surface (see Figure 4 in [4]); (3) FTIR spectra in the OH stretching region indicate that the addition of TMAO and glycine causes a substantial red-shift effect (which should be indicative of stronger intermolecular H-bonds), whereas betaine addition causes essentially no effect (see Figures 5 and 6 in [4]); (4) the tetrahedral order parameter values, determined The collapse transition is cooperative, endothermic, and entropy-driven [25], even though polymer chains pass from extended to compact conformations (i.e., a coil-to-globule collapse). Indeed, the entropy increase comes from the gain in translational entropy of water molecules caused by the WASA decrease associated with polymer collapse. Such a theoretical approach has been extended to rationalize the effect of different co-solutes and co-solvents on PNIPAM T(collapse). For instance, the addition of sodium salts to water causes, in general, a density increase that leads to a rise in the magnitude of the solvent-excluded volume effect (the density becomes relevant as a measure of number density) [10]. 
The expectation would be a general lowering of PNIPAM T(collapse), but the situation is slightly trickier, depending on the strength of anion energetic attractions for the PNIPAM surface with respect to those for water molecules and on the geometric accessibility of the polymer surface (recognizing that the globule state is characterized by chain fluctuations and not solid-like interior packing [10,23]). In general, anions preferring water stabilize the globule state, lowering T(collapse), whereas anions preferring the PNIPAM surface stabilize the coil state, raising T(collapse). In the present study, we would like to apply the same theoretical approach to the collapse transition of ELP to try to provide a coherent rationalization of the effect the addition of either TMAO, glycine, or betaine has on T(collapse).

Theory Section
The collapse of some ELPs was investigated by means of DSC measurements, showing that the process is reversible, cooperative, and endothermic [26,27]. The average enthalpy change is ΔH(collapse) = 1.6 kJ molres⁻¹, and, assuming T(collapse) = 28.5 °C, ΔS(collapse) = 5.3 J K⁻¹ molres⁻¹ (note that ELP collapse can be described as a phase transition between two macro-states, the coil one, C-state, and the globule one, G-state, so that ΔG(collapse) = 0 at T(collapse); indeed, a pressure-temperature phase diagram has been obtained [26]). These experimental data, despite their relevance, do not provide clues on the molecular origin of the entropy gain driving ELP collapse. The devised statistical thermodynamic approach leads to the following relationships [19]:

$$\Delta H(\mathrm{collapse}) = -\Delta H_{tr} = -[\Delta E_a + \Delta H_{reorg}] \qquad (1)$$

$$\Delta S(\mathrm{collapse}) = -\Delta S_{tr} = -[\Delta\Delta S_x + \Delta S_{conf} + \Delta S_{reorg}] \qquad (2)$$

where the two minus signs are a consequence of our original choice to describe the swelling process, to be in line with the description of globular protein unfolding; ΔEa = [Ea(C-state) − Ea(G-state) + ΔE(intra)], where Ea(C-state) and Ea(G-state) measure the energetic interactions (i.e., both van der Waals attractions and H-bonds) between the C-state or the G-state, respectively, of ELP and the surrounding water and co-solute molecules; ΔE(intra) is the difference in intra-chain energetic interactions between the C-state and the G-state; ΔHreorg is the enthalpy change due to the structural reorganization of water-water H-bonds upon collapse (i.e., many water molecules pass from the hydration shell of ELP to bulk water); and ΔSreorg is the corresponding entropy change.
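The per-residue entropy quoted above follows from the phase-transition condition ΔG(collapse) = 0 at T(collapse), so that ΔS(collapse) = ΔH(collapse)/T(collapse). The short sketch below simply redoes that arithmetic with the values given in the text; the function and variable names are illustrative and not taken from the original work.

```python
# Calorimetric values quoted in the text (per mole of residues).
DELTA_H_COLLAPSE = 1.6e3   # J molres^-1
T_COLLAPSE_C = 28.5        # °C, in water


def celsius_to_kelvin(t_celsius):
    """Convert a temperature from °C to K."""
    return t_celsius + 273.15


def transition_entropy(delta_h, t_kelvin):
    """Entropy change at a transition where ΔG = 0, i.e. ΔS = ΔH / T."""
    return delta_h / t_kelvin


t_collapse_k = celsius_to_kelvin(T_COLLAPSE_C)
delta_s_collapse = transition_entropy(DELTA_H_COLLAPSE, t_collapse_k)
print(f"ΔS(collapse) = {delta_s_collapse:.1f} J K^-1 molres^-1")
```

The result rounds to the 5.3 J K⁻¹ molres⁻¹ reported in the text.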
It has been shown by different authors using different theoretical arguments [21,28-31] that the structural reorganization of water-water H-bonds produces enthalpy and entropy changes that almost exactly compensate each other:

$$\Delta H_{reorg} = T\cdot\Delta S_{reorg} \qquad (3)$$

This is in line with the experimental finding that there is no relationship between the effect of a co-solute on water structure and its stabilizing or destabilizing action on the native state of globular proteins [32]. It is important to underscore that: (1) ΔHreorg and ΔSreorg are not small quantities, but they do not affect the overall Gibbs free energy change due to enthalpy-entropy compensation; and (2) ΔHreorg and ΔSreorg depend strongly on temperature because a large positive heat capacity change is associated with the structural reorganization of water-water H-bonds [33]. ΔΔSx is the entropy contribution provided by the difference in solvent-excluded volume between the two states, and ΔSconf represents the gain in conformational entropy of the polypeptide chain upon swelling (for more, see below). On these grounds, the transition Gibbs free energy change ΔGtr = −ΔG(collapse) is:

$$\Delta G_{tr} = [\Delta G_c(\mathrm{C}) - \Delta G_c(\mathrm{G})] - T\cdot\Delta S_{conf} + \Delta E_a = \Delta\Delta G_c - T\cdot\Delta S_{conf} + \Delta E_a \qquad (4)$$

where [ΔGc(C) − ΔGc(G)] = −TΔΔSx, and ΔGc(C) and ΔGc(G) represent the reversible work to create, in water or aqueous solutions, a cavity suitable to host the C-state and the G-state, respectively. The ΔΔGc contribution is calculated by means of a simple geometric model: the G-state is a sphere, and the C-state is a prolate spherocylinder having the same VvdW as the sphere and a larger WASA [10,19]. These geometric assumptions are supported by available data. It has been shown that very high hydrostatic pressures (above 2000 atm) favor the G-state, lowering T(collapse) [26,27]. This datum means that there is a difference in volume between the two ELP macro-states, but it is very small and can safely be neglected when performing model calculations at 1 atm.
In addition, MD simulations showed that a marked WASA decrease occurs upon collapse of a 90-residue (VPGVG)18 chain, and that both swollen and compact conformations are highly hydrated, with almost all the peptide groups involved in H-bonds with water molecules, regardless of ELP conformation [34].
In the present study, an ELP chain of 601 residues in the G-state is modelled as a sphere of radius a = 24.5 Å, VvdW = 61,601 Å³, and WASA = 8430 Å², whereas the C-state is modelled as a prolate spherocylinder of radius a = 12.25 Å, cylindrical length l = 114.33 Å, VvdW = 61,601 Å³, and WASA = 12,147 Å² (note that, on average, the residue volume in proteins amounts to 102.5 Å³ [35]). The G-state and C-state geometric models are representative of the huge number of conformations belonging to the two macro-states and, for this reason, can be considered to be independent of co-solute addition to water. It is important to underscore that the ΔΔGc contribution: (a) is always positive because ΔGc increases with cavity WASA, even though the cavity VvdW is kept fixed [36,37]; and (b) is calculated by means of the analytic formulas provided by classic scaled particle theory (SPT) for spherical and prolate spherocylindrical cavities in a hard sphere fluid mixture (the pressure-volume term is neglected for its smallness at P = 1 atm) [38,39]. A critical role is played by the volume packing density of the hard sphere fluid mixture (i.e., aqueous solutions), $\xi_3 = (\pi/6)\sum_j \rho_j \sigma_j^3$, where ρj is the number density, in molecules per Å³, of species j and σj is the corresponding hard sphere diameter; ξ3 represents the fraction of the total liquid volume occupied by water and co-solute molecules. The physical reliability of classic SPT formulas is well established [21,39-41]. Experimental values of the density of water and the considered aqueous solutions of TMAO, glycine, and betaine were used to perform calculations over the 5-35 °C temperature range [42]. Experimental density values need to be used in order to account for the real attractions that exist among solvent and co-solute molecules and that determine the solution density [43,44].
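The geometric parameters quoted above can be cross-checked with elementary formulas. The sketch below assumes that WASA is the surface traced by the centre of a spherical water probe of radius 1.4 Å rolling over the body; this probe radius is an assumption, not stated in the text, although it reproduces the quoted WASA values. All function names are illustrative.

```python
import math

PROBE_RADIUS = 1.4  # Å; assumed water probe radius (not given in the text)


def sphere_volume(a):
    """Volume of a sphere of radius a (Å^3)."""
    return 4.0 / 3.0 * math.pi * a ** 3


def sphere_wasa(a, probe=PROBE_RADIUS):
    """Water accessible surface area of a sphere of radius a (Å^2)."""
    return 4.0 * math.pi * (a + probe) ** 2


def spherocylinder_volume(a, length):
    """Volume of a prolate spherocylinder: cylinder plus two hemispherical caps."""
    return math.pi * a ** 2 * length + 4.0 / 3.0 * math.pi * a ** 3


def spherocylinder_wasa(a, length, probe=PROBE_RADIUS):
    """Water accessible surface area of a prolate spherocylinder (Å^2)."""
    r = a + probe
    return 2.0 * math.pi * r * length + 4.0 * math.pi * r ** 2


# G-state: sphere; C-state: prolate spherocylinder (values from the text).
g_volume, g_wasa = sphere_volume(24.5), sphere_wasa(24.5)
c_volume = spherocylinder_volume(12.25, 114.33)
c_wasa = spherocylinder_wasa(12.25, 114.33)

print(f"G-state: V = {g_volume:.0f} Å^3, WASA = {g_wasa:.0f} Å^2")
print(f"C-state: V = {c_volume:.0f} Å^3, WASA = {c_wasa:.0f} Å^2")
```

Under these assumptions the two bodies have essentially the same volume (to within a few Å³ of 61,601 Å³), while the C-state WASA is larger by roughly 3700 Å², which is the geometric origin of the positive ΔΔGc term.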
The following effective hard sphere diameters were used and considered to be temperature-independent: (a) σ(H2O) = 2.80 Å [45], corresponding to the position of the first maximum in the oxygen-oxygen radial distribution function of water, at room temperature and 1 atm [46]; (b) σ(glycine) = 5.15 Å, which corresponds to the diameter of the sphere having the experimental partial molar volume of glycine in water [47]; (c) σ(TMAO) = 5.40 Å and σ(betaine) = 6.20 Å, which correspond to the diameters of the two spheres possessing the WASA calculated for the two molecules [48]. Even though different criteria were applied to select the effective hard sphere diameters of the three co-solutes, their relative size is correct in view of the molecular structures.
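As a rough illustration of how the volume packing density ξ3 follows from a solution density and the hard sphere diameters listed above, the sketch below evaluates ξ3 = (π/6)·Σ ρj·σj³. It is a minimal sketch, not the procedure of the original work: the function names are hypothetical, the molar masses are standard values not given in the text, and only pure water at 25 °C (density 0.99705 g mL⁻¹) is evaluated because the experimental solution densities of [42] are not reproduced here.

```python
import math

AVOGADRO = 6.02214076e23   # molecules per mole
A3_PER_LITER = 1.0e27      # 1 L = 1e-3 m^3 = 1e27 Å^3
M_WATER = 18.015           # g/mol

# Effective hard sphere diameters from the text (Å).
SIGMA = {"water": 2.80, "glycine": 5.15, "TMAO": 5.40, "betaine": 6.20}
# Standard molar masses (g/mol); not given in the text.
MOLAR_MASS = {"glycine": 75.07, "TMAO": 75.11, "betaine": 117.15}


def number_density(molarity):
    """Convert mol/L into molecules per Å^3."""
    return molarity * AVOGADRO / A3_PER_LITER


def packing_density(density_g_per_ml, cosolute=None, molarity=0.0):
    """Volume packing density xi3 of water plus (optionally) one co-solute.

    density_g_per_ml must be the experimental density of the solution;
    the water content is obtained by subtracting the co-solute mass.
    """
    cosolute_mass = molarity * MOLAR_MASS[cosolute] if cosolute else 0.0
    water_molarity = (1000.0 * density_g_per_ml - cosolute_mass) / M_WATER
    xi3 = (math.pi / 6.0) * number_density(water_molarity) * SIGMA["water"] ** 3
    if cosolute:
        xi3 += (math.pi / 6.0) * number_density(molarity) * SIGMA[cosolute] ** 3
    return xi3


xi3_water = packing_density(0.99705)
print(f"xi3(water, 25 °C) = {xi3_water:.3f}")
```

To treat a co-solute solution, one would call, for example, `packing_density(d, "betaine", 1.0)` with `d` set to the measured density of the 1 M solution at the temperature of interest.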
The T·ΔSconf contribution is estimated by considering that each monomer gains a temperature-independent conformational entropy upon swelling:

$$T\cdot\Delta S_{conf} = T\cdot N_{res}\cdot\Delta S_{conf}(\mathrm{res}) \qquad (5)$$

where Nres = 601 and ΔSconf(res) = 4 J K⁻¹ molres⁻¹, the same value used in all our previous applications of this approach to thermo-responsive polymers (such as PNIPAM) [10-14]. Even though Equation (5) may appear a rough approximation, its validity is supported by the finding that the denaturation entropy change (of which ΔSconf constitutes a large portion) scales linearly with the number of residues in a large set of globular proteins [49,50]. This term is assumed to be independent of co-solute addition to water (i.e., the conformational entropy is an intrinsic property of polymer chains, largely dictated by steric constraints [51]). According to theoretical approaches and computer simulations [52-54], an average value for ΔSconf(res) of globular proteins would be around 19 J K⁻¹ molres⁻¹.
The marked difference between the two numbers is due to the large conformational entropy characterizing the G-state of ELP in comparison to the unique 3D structure of the native state of globular proteins.
Since ΔGtr[T(collapse)] = 0, T(collapse) = 28.5 °C in water, and ΔΔGc(water) = 1203.4 kJ mol⁻¹ at 28.5 °C, it is possible to take advantage of this constraint via Equation (4) and of the T·ΔSconf estimate reported above, fixing:

$$\Delta E_a(\mathrm{water}) = T\cdot\Delta S_{conf} - \Delta\Delta G_c(\mathrm{water}) \approx 725.2 - 1203.4 = -478.2\ \mathrm{kJ\ mol^{-1}} \qquad (6)$$

The finding that ΔEa(water) is a negative and not-small quantity should not come as a surprise considering that the C-state has a larger WASA than the G-state, and considering the chemical features of the ELP surface (i.e., the possibility to make H-bonds with water molecules). In addition, since ELP collapse is endothermic and Equation (1) is valid, ΔEa(water) is expected to be negative. Using the average per residue contribution reported at the beginning of the Theory section, for a 600-residue ELP, ΔH(collapse) ≈ 960 kJ mol⁻¹ and, so, ΔHreorg ≈ 480 kJ mol⁻¹. The latter large positive number needs an explanation. A marked WASA decrease is associated with ELP collapse [34]; in other words, a marked decrease in hydration shell size occurs and many water molecules return to the bulk (for a 600-residue ELP, the number can be as large as 800-900 water molecules [34]). This is the structural reorganization of water-water H-bonds, and the finding that ΔHreorg ≈ 480 kJ mol⁻¹ means that the difference in strength between H-bonds in the hydration shell and those in bulk water amounts to a fraction of 1 kJ mol⁻¹. The ΔEa(water) estimate is considered to be temperature independent in view of the limited temperature range considered in this study (5-35 °C), which is sufficient to analyze ELP collapse in water and aqueous solutions [4]. Note that it is the ΔHreorg term that is expected to be strongly temperature dependent [33,55]. The ΔEa quantity is expected to be larger in magnitude in aqueous solutions containing TMAO, glycine, and betaine, due to their attractive interactions with the ELP surface.
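The numerical estimate discussed above can be retraced in a few lines. The sketch assumes that Equation (4) has the form ΔGtr = ΔΔGc − T·ΔSconf + ΔEa, which is what the later statement that the T·ΔSconf − ΔEa lines cross the ΔΔGc functions at T(collapse) implies; all variable names are illustrative, and the inputs are the values quoted in the text.

```python
# Inputs quoted in the text.
N_RES = 601                 # residues used in the conformational entropy term
DS_CONF_RES = 4.0           # J K^-1 molres^-1
T_COLLAPSE_K = 28.5 + 273.15
DDG_C_WATER = 1203.4        # kJ mol^-1 at 28.5 °C
DH_COLLAPSE_PER_RES = 1.6   # kJ molres^-1
N_RES_ENTHALPY = 600        # chain length used in the text for ΔH(collapse)

# T·ΔSconf at T(collapse), converted from J to kJ.
t_ds_conf = T_COLLAPSE_K * N_RES * DS_CONF_RES / 1000.0

# ΔGtr = 0 at T(collapse)  =>  ΔEa = T·ΔSconf − ΔΔGc (assumed form of Eq. 4).
delta_e_a_water = t_ds_conf - DDG_C_WATER

# Whole-chain collapse enthalpy and the part not accounted for by |ΔEa|,
# which the text rounds to about 480 kJ mol^-1 and assigns to ΔHreorg.
dh_collapse = DH_COLLAPSE_PER_RES * N_RES_ENTHALPY
reorg_magnitude = dh_collapse - abs(delta_e_a_water)

print(f"T·ΔSconf      = {t_ds_conf:.1f} kJ mol^-1")
print(f"ΔEa(water)    = {delta_e_a_water:.1f} kJ mol^-1")
print(f"ΔH(collapse)  = {dh_collapse:.0f} kJ mol^-1")
print(f"|ΔHreorg|     = {reorg_magnitude:.0f} kJ mol^-1")
```

The same two-line calculation, repeated with the T(collapse) and ΔΔGc values of each co-solute solution, would give the corresponding ΔEa estimates.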
Since the ΔSconf contribution is assumed to be independent of the co-solute presence, and knowing the different T(collapse) values determined by Cremer and co-workers at different co-solute concentrations [4], the above procedure allows us to also obtain reliable ΔEa estimates in aqueous solutions containing TMAO, glycine, and betaine.
An important question is related to the sensitivity of the results to the values assigned to the various parameters of the model. The results are very sensitive to the sizes of the sphere and prolate spherocylinder, and to the value assigned to ΔSconf(res) that is multiplied by Nres in Equation (5). To highlight such sensitivity, the ΔΔGc functions obtained in water by slightly modifying the radius and length of the C-state prolate spherocylinder (and keeping fixed the radius of the G-state sphere) and the T·ΔSconf − ΔEa straight lines obtained by considering ΔSconf(res) = 4.00 ± 0.05 J K⁻¹ molres⁻¹ (and keeping ΔEa fixed) are shown in Figure 1. The plot emphasizes the sensitivity and shows that the theoretical approach works well in reproducing the occurrence of ELP collapse around 28 °C, assigning reliable values to the various parameters.
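The sensitivity to ΔSconf(res) arises because any per-residue uncertainty is multiplied by Nres and by T. The sketch below, with illustrative names, estimates how far the T·ΔSconf term moves near T(collapse) in water for the ±0.05 J K⁻¹ molres⁻¹ range mentioned above.

```python
N_RES = 601
T_COLLAPSE_K = 28.5 + 273.15


def conf_term_shift(delta_ds_conf_res, t_kelvin=T_COLLAPSE_K, n_res=N_RES):
    """Change in T·ΔSconf (kJ mol^-1) caused by a change in ΔSconf(res)
    given in J K^-1 molres^-1."""
    return t_kelvin * n_res * delta_ds_conf_res / 1000.0


shift_005 = conf_term_shift(0.05)
print(f"±0.05 J K^-1 molres^-1 -> ±{shift_005:.1f} kJ mol^-1 in T·ΔSconf")
```

A shift of about 9 kJ mol⁻¹ is small compared with the individual terms (hundreds to more than a thousand kJ mol⁻¹), yet the collapse temperature is fixed by the near-cancellation of those terms, which is why the crossing point is so sensitive to this parameter.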

Results and Discussion
Experimental data show that the addition to water of either TMAO, glycine, or betaine causes a density increase that translates into an increase of the volume packing density of the solutions. This is shown in Figures 2 and 3 for the aqueous solutions of the three co-solutes at 0.5 M and 1 M concentrations, in the 5-35 °C temperature range. It is worth noting that, despite the 1 M glycine aqueous solution having the largest density, the 1 M betaine aqueous solution has the largest volume packing density, highlighting the important role of the diameter of co-solute molecules. The corresponding ΔΔGc functions are shown in Figure 4. It is evident that in all the considered aqueous solutions, the ΔΔGc magnitude is larger than that in water (i.e., there is coherence in the effect of the three co-solutes). In all cases, the ΔΔGc function increases with temperature and co-solute concentration, and this occurs to a larger extent in the case of glycine, even though the volume packing density of betaine aqueous solutions is larger. Such a result comes from the basic fact that the diameter of solvent and co-solute molecules has a prevailing role (as already discussed in depth to rationalize the larger ΔGc magnitude in water with respect to that in other liquids [19,22,41]), and glycine molecules are smaller than betaine ones (i.e., the molecular diameter is 5.15 Å versus 6.20 Å, respectively). In general, the ΔΔGc contribution tends to stabilize the G-state, all the more so upon concentration increase of the three co-solutes. It is interesting to note that TMAO, also in the present approach, appears to be special because even though the ΔΔGc magnitude in 0.5 and 1 M TMAO solutions is only slightly larger than that in water, the T(collapse) values are markedly smaller than that in water (see Table 1); this point merits further investigation.
In any case, the solvent-excluded volume argument is able to rationalize, in a coherent though qualitative manner, the experimental finding that the addition to water of either TMAO, glycine, or betaine lowers the T(collapse) value of ELP [4].

To reach a quantitative agreement, it is important to recognize that the stabilizing effect of ΔΔGc is counterbalanced by the destabilizing effect of the ΔEa contribution; this is a large and negative quantity in water, the magnitude of which should rise on adding the three co-solutes because the molecules of the latter can be involved in attractive interactions with the ELP surface [12,13,23]. Robust estimates of the ΔEa contribution are very difficult to obtain using theoretical relationships and/or computational procedures because one would need: (a) reliable ensembles for both the G-state and the C-state of ELP, which is a chain of 600 residues; and (b) good force-fields to describe the interactions of the three co-solutes with both the ELP surface and the water molecules. In contrast, the simple approach outlined to arrive at an estimate of ΔEa at T(collapse) in water (see Equation (6)) is feasible and should produce values with internal consistency (any possible error should be of roughly the same size in all three cases). These ΔEa estimates are listed in the last column of Table 1.
Moreover, they are assumed to be temperature independent in view of the small temperature range over which ELP collapse occurs in the considered aqueous solutions [4] (recall that it is the ΔHreorg term that is expected to be strongly temperature dependent [33,55]). This assumption allows the drawing of the T·ΔSconf − ΔEa straight lines that cross the ΔΔGc functions at T(collapse); see Figure 5, panel (a) for betaine aqueous solutions, panel (b) for TMAO aqueous solutions, and panel (c) for glycine aqueous solutions. Actually, the straight lines drawn in Figure 5 also account for a very small uncertainty of 0.01 J K⁻¹ molres⁻¹, associated with ΔSconf(res), to further emphasize the sensitivity of the model results to this parameter. The numbers in the last column of Table 1 indicate that the ΔEa quantity increases in magnitude with the addition of the considered co-solutes to water. This finding makes sense because the molecules of TMAO, glycine, and betaine can all be involved in attractive interactions (i.e., both dispersion interactions and H-bonds) at the ELP surface, and the latter should markedly increase upon swelling of the polypeptide chain. There are quantitative differences among the three co-solutes, but they should not be over-interpreted in view of the simplicity and roughness of the procedure used to arrive at the ΔEa estimates. However, it is important to underscore that preferential interaction (i.e., enrichment) and preferential exclusion (i.e., depletion) are expressions used to describe thermodynamic data referring to differences between conformations belonging to two huge ensembles (i.e., the two macro-states) and cannot be taken literally [56-58]. The expectation is that polymer chains possessing both polar and nonpolar moieties, such as PNIPAM and ELP, are attractive for water molecules (indeed, they are soluble in water at low temperature) and for the molecules of TMAO, glycine, and betaine.
In fact, MD simulations by Berne and co-workers demonstrated that TMAO molecules, similarly to urea molecules, are enriched at the surface of hydrophobic polymers [59,60]. This reasoning implies that the surface of ELP chains is covered by water and co-solute molecules. Note that the MD results by Cremer and co-workers were obtained not for an ELP chain but for a single Val-Pro-Gly-Val-Gly peptide [4]. To address these matters, it is mandatory to perform MD simulations on polymer chains since the surface area magnitude is a critical factor [61], and additivity might not hold in these cases. Nevertheless, polymer collapse does occur when the translational entropy gain of water and co-solute molecules, associated with the decrease in solvent-excluded volume, overwhelms the other contributions in the Gibbs free energy balance of Equation (4).
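The graphical construction described in this section, in which T(collapse) is the temperature where the ΔΔGc function crosses the T·ΔSconf − ΔEa straight line, can be mimicked numerically. The sketch below is only an illustration of the procedure: `toy_ddg_c` is a hypothetical linear stand-in for the SPT-derived ΔΔGc(T) curve, anchored to the quoted water value (1203.4 kJ mol⁻¹ at 28.5 °C) with an arbitrary slope, and the ΔEa value is the one implied by the numbers quoted for water. It is not the calculation performed in the original work.

```python
N_RES = 601
DS_CONF_RES = 4.0      # J K^-1 molres^-1
DELTA_E_A = -478.23    # kJ mol^-1; implied by the values quoted for water


def entropy_energy_line(t_kelvin):
    """T·ΔSconf − ΔEa in kJ mol^-1."""
    return t_kelvin * N_RES * DS_CONF_RES / 1000.0 - DELTA_E_A


def toy_ddg_c(t_kelvin, slope=5.0):
    """HYPOTHETICAL ΔΔGc(T): linear, through 1203.4 kJ mol^-1 at 301.65 K.
    The slope (kJ mol^-1 K^-1) is arbitrary and for illustration only."""
    return 1203.4 + slope * (t_kelvin - 301.65)


def find_t_collapse(ddg_c, t_low=278.15, t_high=308.15, tol=1e-6):
    """Bisection for the temperature where ΔΔGc(T) = T·ΔSconf − ΔEa."""
    def balance(t):
        return ddg_c(t) - entropy_energy_line(t)

    if balance(t_low) * balance(t_high) > 0:
        raise ValueError("no crossing inside the temperature window")
    while t_high - t_low > tol:
        t_mid = 0.5 * (t_low + t_high)
        if balance(t_low) * balance(t_mid) <= 0:
            t_high = t_mid
        else:
            t_low = t_mid
    return 0.5 * (t_low + t_high)


t_cross = find_t_collapse(toy_ddg_c)
print(f"crossing at {t_cross - 273.15:.1f} °C")
```

With a real ΔΔGc(T) function from SPT and the ΔEa value of a given solution, the same root search would return the model T(collapse) for that solution; above the crossing the excluded-volume term dominates and the G-state is favored.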
Life 2022, 12, x FOR PEER REVIEW
In conclusion, the present analysis confirms that the magnitude of the solvent-excluded volume effect and its temperature dependence (strictly linked to the translational entropy of solvent and co-solute molecules) are able to rationalize, in a more than qualitative manner, the occurrence of ELP collapse in water upon raising the temperature. Via the same approach, we rationalize the T(collapse) lowering caused by the addition to water of either TMAO, glycine, or betaine. Approaches grounded in the solvent-excluded volume idea also work well in situations where other approaches fail, and this is something that we would like to highlight.
Funding: This research was funded by Università degli Studi del Sannio, FRA 2020.