## 1. Introduction

From the point of view of polymer physics, the folding of a protein is similar to the coil-globule transition of a short polypeptide chain [1]. The coil-globule transition is known to be a phase transition of first order (in rigid chains) or second order (in flexible chains) [1,2]. The phenomenon can be described by following the behavior of the order parameter (degree of “nativeness”) ${f}_{N}\left(T\right)\in [0,1]$ or its counterpart (degree of “denaturation”) ${f}_{D}\left(T\right)=1-{f}_{N}\left(T\right)$; the condition:

$${f}_{N}\left({T}_{D}\right)={f}_{D}\left({T}_{D}\right)=\frac{1}{2} \tag{1}$$

defines the folding temperature ${T}_{D}$. If there are no finite size effects or heterogeneity (the account of heteropolymeric effects in the coil-globule transition is outside the scope of the current study), the order parameter at the transition point undergoes an abrupt all-or-none transformation. This coil-globule phase transition is driven by strong correlations between repeat unit conformations, which arise from the van der Waals interactions between remote repeat units [1]. Changes in external conditions (temperature, pressure, pH, solution composition, etc.) shift the balance of these effective interactions from repulsion (good solvent regime) via neutral conditions (ideal or theta conditions) to attraction (poor solvent regime), which forces the protein to fold. The hydrogen bonds, which are responsible for the formation of secondary structures, have a shorter span and influence the conformations locally. In $\alpha $-helices, one hydrogen bond fixes the conformations of three consecutive residues [3]. Although the hydrogen bond loops in $\beta $-hairpins are roughly twice as long and typically span five to seven residues [4], the interaction is still local. Therefore, according to the Landau–Peierls theorem [5], such hydrogen bonds cannot per se lead to the coil-globule (phase) transition [1,6]. However, if long-range interactions are present in the system, the formation of secondary structures can change the effective stiffness of the polypeptide chain, increase stability, and thus promote the coil-globule transition under otherwise equal external conditions. Indirect support for such a mechanism comes from the fact that both the coil-helix transition and protein folding occur within the same interval of external parameters [7].

Thermodynamic cooperativity as a concept is often attributed to the sharpness of the phase transition, which results from the spatially correlated behavior of the particles (in this case, repeating units). The idealized first order phase transition, with correlations that extend throughout the system and lead to a discontinuity of the order parameter, corresponds to infinite cooperativity and a zero transition interval. In the folding of single domain globular proteins only $N<100$ repeating units long, the limited system size constrains the otherwise infinite correlations at the transition point. Consequently, the folding happens over some small temperature interval $\Delta T\ne 0$, which needs to be estimated. Using the Taylor expansion truncated at first order, it is possible to approximate the order parameter as ${f}_{D}^{appr}$ with the help of the tangent at the transition point:

$${f}_{D}^{appr}\left(T\right)={f}_{D}\left({T}_{D}\right)+{\left.{f}_{D}^{\prime}\left(T\right)\right|}_{{T}_{D}}\left(T-{T}_{D}\right) \tag{2}$$

From the definitions of the initial and final temperatures as ${f}_{D}^{appr}\left({T}_{1}\right)=0$ and ${f}_{D}^{appr}\left({T}_{2}\right)=1$, one can define the transition interval (see, e.g., [8,9]) as:

$$\Delta T={T}_{2}-{T}_{1}=\frac{1}{{\left.{f}_{D}^{\prime}\left(T\right)\right|}_{{T}_{D}}} \tag{3}$$

The derivative of the order parameter at the transition point is an experimentally measurable quantity that provides access to information on the system’s cooperativity. Temperature is not the only external parameter that can induce the transition. Experiments are often performed by varying the concentration of a denaturant such as urea or guanidinium chloride (GdmCl). After repeating the steps behind Equation (2), the resulting expression for the change in the number of bound denaturant molecules during the transition is:

so that the thermodynamic cooperativity of the transition can still be estimated from the measured slope of the transition curve at its midpoint.
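The tangent construction behind Equations (2) and (3) can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical logistic shape for the denaturation curve; the function `f_d`, its midpoint `t_d`, and its `width` parameter are illustrative choices that do not come from the text. The slope at the midpoint is estimated by a central finite difference, and the temperatures at which the tangent reaches 0 and 1 are computed from it.

```python
import math


def f_d(t, t_d=1.0, width=0.05):
    """Hypothetical sigmoidal denaturation curve (illustration only)."""
    return 1.0 / (1.0 + math.exp(-(t - t_d) / width))


def slope(f, t, h=1e-6):
    """Central finite-difference estimate of df/dT at temperature t."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


def transition_interval(f, t_d):
    """Tangent-based transition interval.

    The tangent at t_d is f(t_d) + s * (T - t_d); it reaches 0 at T1
    and 1 at T2, so that T2 - T1 = 1 / s.
    """
    s = slope(f, t_d)
    t1 = t_d - f(t_d) / s
    t2 = t_d + (1.0 - f(t_d)) / s
    return t1, t2, t2 - t1


t1, t2, delta_t = transition_interval(f_d, 1.0)
print(f"T1 = {t1:.4f}, T2 = {t2:.4f}, Delta T = {delta_t:.4f}")
```

For the assumed logistic curve, the slope at the midpoint is $1/(4\,\mathrm{width})$, so the sketch should return an interval close to four times the chosen width.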

In this paper, we introduce the protein chain length as a parameter into the two-state model, perform the finite-size scaling of protein folding, and compare the two famous criteria of cooperativity.

## 2. Materials and Methods

The two-state model is the simplest of the folding models, yet it is very general and fruitful and therefore deserves a detailed, even pedantic, derivation of its well-known formulas, which enables us to trace their origins and limitations. Within the two-state paradigm, just two possible macroscopic states are assumed: the native globular state with energy ${E}_{N}$ and the denatured coil state with energy ${E}_{D}$. To reflect the uniqueness of the native state, a degeneracy ${g}_{N}=1$ is attributed to it; a degeneracy ${g}_{D}\gg 1$ is set for the denatured state to reflect its greater conformational entropy. Without loss of generality, one can assume ${E}_{N}=0$, ${E}_{D}\ne 0$ and write down the density of states for the two-state model:

$$g\left(E\right)=\delta \left(E\right)+{g}_{D}\delta \left(E-{E}_{D}\right) \tag{5}$$

where $\delta \left(x\right)$ is the Dirac delta function, resulting in the partition function:

$$Z=\int g\left(E\right){e}^{-\beta E}dE=1+{g}_{D}{e}^{-\beta {E}_{D}}=\left[N\right]+\left[D\right] \tag{6}$$

where $[\dots ]$ is the number of repeat units in the native or denatured state and $\beta =1/T$ is the inverse temperature. The average energy is just the internal energy of the system and follows directly as:

$$\langle E\rangle =-\frac{\partial \ln Z}{\partial \beta}=\frac{{E}_{D}{g}_{D}{e}^{-\beta {E}_{D}}}{1+{g}_{D}{e}^{-\beta {E}_{D}}} \tag{7}$$

leading to the heat capacity:

$$C\left(T\right)=\frac{\partial \langle E\rangle}{\partial T}=\frac{{E}_{D}^{2}}{{T}^{2}}\frac{{g}_{D}{e}^{-\beta {E}_{D}}}{{\left(1+{g}_{D}{e}^{-\beta {E}_{D}}\right)}^{2}} \tag{8}$$

The denaturation degree reads:

$${f}_{D}=\frac{\left[D\right]}{\left[N\right]+\left[D\right]}=\frac{{g}_{D}{e}^{-\beta {E}_{D}}}{1+{g}_{D}{e}^{-\beta {E}_{D}}} \tag{9}$$

and the equilibrium constant:

$$K=\frac{\left[D\right]}{\left[N\right]}={g}_{D}{e}^{-\beta {E}_{D}} \tag{10}$$

At the transition point, the numbers of repeat units in the native $N$ and the denatured $D$ states are equal, and with the help of Equation (9), we can express the transition temperature Equation (1) and interval Equation (3) in terms of the two-state model parameters as:

$${T}_{D}=\frac{{E}_{D}}{\ln {g}_{D}};\qquad \Delta T=\frac{4{T}_{D}^{2}}{{E}_{D}} \tag{11}$$

The last expression can be rewritten as:

$${E}_{D}=\frac{4{T}_{D}^{2}}{\Delta T} \tag{12}$$

resulting in the famous expression for the energetic price of the transition between the two states. Privalov and Khechinashvili [10] referred to Equation (12) as an approximation, but as shown above, it is exact within the two-state picture. Since all the above formulae are derived under the assumption that strictly two states exist, the results can only be attributed to one cooperative unit, i.e., a part of a molecule that undergoes the transition from $N$ to $D$ as a whole. Microcalorimetry allows the simultaneous measurement of the transition enthalpies for the whole protein molecule and for the cooperative unit [11]. Potentiometric titration also allows the difference in the degree of ionization to be measured for the entire molecule and compared with the value for the cooperative unit [12].
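The two-state relations can be cross-checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: temperature is measured in energy units ($\beta =1/T$, as in the text), the native state has weight one, the denatured state has weight ${g}_{D}{e}^{-{E}_{D}/T}$, and the values of `E_D` and `G_D` are arbitrary illustrative numbers. It compares the interval obtained from a finite-difference slope at the midpoint with the closed form $4{T}_{D}^{2}/{E}_{D}$ that follows from the two-state denaturation degree.

```python
import math


def denaturation_degree(t, e_d, g_d):
    """Two-state f_D = w / (1 + w), with w = g_D * exp(-E_D / T)."""
    w = g_d * math.exp(-e_d / t)
    return w / (1.0 + w)


def heat_capacity(t, e_d, g_d):
    """Two-state heat capacity, d<E>/dT = (E_D / T)^2 * w / (1 + w)^2."""
    w = g_d * math.exp(-e_d / t)
    return (e_d / t) ** 2 * w / (1.0 + w) ** 2


# Illustrative parameters (assumptions, not taken from the text).
E_D, G_D = 50.0, 1.0e20

# Midpoint condition f_D = 1/2 gives T_D = E_D / ln(g_D).
t_d = E_D / math.log(G_D)

# Closed form for the interval within the two-state model.
delta_t_analytic = 4.0 * t_d ** 2 / E_D

# The same interval from a central finite difference of f_D at T_D.
h = 1e-6
slope = (denaturation_degree(t_d + h, E_D, G_D)
         - denaturation_degree(t_d - h, E_D, G_D)) / (2.0 * h)
delta_t_numeric = 1.0 / slope

print(f"T_D = {t_d:.5f}")
print(f"Delta T (analytic) = {delta_t_analytic:.6f}")
print(f"Delta T (numeric)  = {delta_t_numeric:.6f}")
print(f"C(T_D) = {heat_capacity(t_d, E_D, G_D):.3f}")
```

The two estimates of the interval should agree to within the accuracy of the finite difference, which is the sense in which the relation between ${E}_{D}$, ${T}_{D}$, and $\Delta T$ is exact rather than approximate in this model.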

The order of a conformational transition can be evaluated by analyzing the dependence of the slope of the transition on the molecular weight of the protein ($M$), which is linearly proportional to the degree of polymerization $N$. The slope of a phase transition in a small system depends on the size of that system [1,13]. In the case of a first order phase transition, the slope increases proportionally to the number of units in the system [13], while the slope for a second order phase transition is proportional to the square root of this number [1].

The system size can be introduced through the reasonable assumption that each repeating unit of the polypeptide chain can be found in one of $Q>2$ rotational isomeric states, only one of which corresponds to the native state. Since there are $N$ such repeating units, the number of possible states in the denatured conformation for the whole macromolecule and the additive energy of the system read:

In view of Equation (11), this means:

This is a very interesting result, which shows that within the two-state paradigm, the denaturation temperature does not depend on the system size. The transition interval, by contrast, is inversely proportional to $N$, which naturally leads to a zero interval at $N\to \infty $, just as it should in the case of a phase transition.
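The size dependence stated above can be checked with a short numerical experiment. The sketch below is only an illustration and rests on an explicit assumption about the forms that are not reproduced here: the denatured-state degeneracy is taken as ${g}_{D}={Q}^{N}$ and the energy as ${E}_{D}=N\epsilon $, where the per-unit energy `eps` and the value `q = 3` are hypothetical choices. Under this assumption, the midpoint should sit at $\epsilon /\ln Q$ for every $N$, and the product $N\Delta T$ should stay constant.

```python
import math


def f_d(t, n, q=3.0, eps=1.0):
    """Two-state f_D, ASSUMING g_D = q**n and E_D = n * eps.

    The log-weight x = n * (ln q - eps / t) is used to avoid overflow.
    """
    x = n * (math.log(q) - eps / t)
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def midpoint(n, lo=0.5, hi=2.0):
    """Bisection for the temperature at which f_D = 1/2."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if f_d(mid, n) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def interval(n, h=1e-6):
    """Return (T_D, Delta T), with Delta T = 1 / slope of f_D at T_D."""
    t_d = midpoint(n)
    s = (f_d(t_d + h, n) - f_d(t_d - h, n)) / (2.0 * h)
    return t_d, 1.0 / s


results = {n: interval(n) for n in (25, 50, 100, 200)}
for n, (t_d, delta_t) in results.items():
    print(f"N = {n:3d}  T_D = {t_d:.6f}  Delta T = {delta_t:.6f}  "
          f"N * Delta T = {n * delta_t:.6f}")
```

Working with the logarithm of the statistical weight keeps the sketch stable for longer chains, where ${Q}^{N}$ itself would overflow a floating-point number.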

## 3. Results and Discussion

The criterion of two-state cooperativity ${k}_{2}$ of protein folding has already been discussed in detail (see, e.g., [14,15] and the references therein). It is defined as the ratio of the van ’t Hoff and calorimetric enthalpies (energies):

where the van ’t Hoff energy is:

and the amount of heat exchanged during the transition is calculated as the integral under the heat capacity curve:

According to Equations (13), (16), and (17), the resulting:

is an expression that asymptotically tends to one (from above) for large $N$. It can be concluded that the two-state ansatz, expressed in Equation (5), results in ${k}_{2}=1$, making it a necessary condition for the transition to be classified as a two-state one. Note, however, that strictly speaking it does not follow that ${k}_{2}=1$ means that the transition is a two-state one. In a certain sense, the condition is negative: if ${k}_{2}$ differs from unity, the transition cannot be a two-state one, while if it is close to unity, this is not enough to conclude two-state behavior. The folding cooperativity measure:

$${\Omega}_{c}=\frac{{T}_{D}^{2}}{\Delta T}{\left|{f}_{N}^{\prime}\left(T\right)\right|}_{{T}_{D}} \tag{19}$$

was proposed by Klimov and Thirumalai [16] to compare the cooperativities of different proteins. Based on their collection of experimental and simulation data on protein folding, the same authors later suggested a size scaling law for the folding cooperativity measure ${\Omega}_{c}\propto {N}^{\zeta}$ [17], where $\zeta =1+\gamma $ and $\gamma $ is a susceptibility exponent.
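The behavior of the two-state criterion ${k}_{2}$ for short chains can be explored numerically. The sketch below is hedged in several ways: it assumes ${g}_{D}={Q}^{N}$ and ${E}_{D}=N\epsilon $ (with hypothetical values `q = 3` and `eps = 1`), it takes the van ’t Hoff energy to be $4{T}_{D}^{2}/\Delta T$ as in Equation (12), and it evaluates the calorimetric heat as the integral of the two-state heat capacity over all temperatures by the trapezoid rule after the substitution $x={E}_{D}/T$. These choices are assumptions made for illustration and may differ in detail from the definitions used in the text.

```python
import math


def calorimetric_heat(e_d, log_g_d, steps=20000):
    """Trapezoid estimate of the integral of C(T) dT over all T > 0.

    With x = E_D / T the integral becomes
    E_D * int_0^inf w / (1 + w)^2 dx, where w = g_D * exp(-x).
    """
    x_max = log_g_d + 50.0  # integrand is negligible beyond this point
    h = x_max / steps

    def integrand(x):
        # w / (1 + w)^2 is symmetric in ln(w); use the stable branch.
        e = math.exp(-abs(log_g_d - x))
        return e / (1.0 + e) ** 2

    total = 0.5 * (integrand(0.0) + integrand(x_max))
    total += sum(integrand(i * h) for i in range(1, steps))
    return e_d * h * total


def van_t_hoff_energy(e_d, log_g_d, h=1e-6):
    """4 * T_D^2 / Delta T, with Delta T = 1 / slope of f_D at T_D."""
    t_d = e_d / log_g_d

    def f_d(t):
        return 1.0 / (1.0 + math.exp(-(log_g_d - e_d / t)))

    slope = (f_d(t_d + h) - f_d(t_d - h)) / (2.0 * h)
    return 4.0 * t_d ** 2 * slope


def k2(n, q=3.0, eps=1.0):
    """Ratio of van 't Hoff to calorimetric energy, ASSUMING g_D = q**n."""
    e_d, log_g_d = n * eps, n * math.log(q)
    return van_t_hoff_energy(e_d, log_g_d) / calorimetric_heat(e_d, log_g_d)


ratios = {n: k2(n) for n in (1, 2, 4, 8)}
for n, value in ratios.items():
    print(f"N = {n}  k2 = {value:.6f}")
```

Under these assumptions the ratio works out to $1+1/{g}_{D}$, so the printed values should decrease toward one as $N$ grows, in line with the statement that ${k}_{2}$ approaches unity from above.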

Using our Equation (3), we can significantly simplify the expression for ${\Omega}_{c}$. Li et al. defined the interval $\Delta {T}^{*}={T}_{2}^{*}-{T}_{1}^{*}$ as the width at half-height of the differential curve [17]. One can approximate the peaked curve by a rectangle with sides at ${T}_{1}^{*}$ and ${T}_{2}^{*}$ and height ${\left|{f}_{D}^{\prime}\left(T\right)\right|}_{{T}_{D}}$, such that $1={\int}_{0}^{\infty}{f}_{D}^{\prime}\left(T\right)dT\approx {\left|{f}_{D}^{\prime}\left(T\right)\right|}_{{T}_{D}}({T}_{2}^{*}-{T}_{1}^{*})$. Taking Equation (3) into account, this leads to $\Delta {T}^{*}=\Delta T$, showing that the two definitions of the transition interval are equivalent, at least in the sense of asymptotic size scaling relations. The same Equation (3), when inserted into the cooperativity measure Equation (19), simply results in:

$${\Omega}_{c}={\left(\frac{{T}_{D}}{\Delta T}\right)}^{2} \tag{20}$$

The result is not surprising, since the $\frac{{T}_{D}}{\Delta T}$ ratio is common in studies of finite size effects at phase transitions [18,19,20]. In view of Equation (14), valid for the two-state model, it simply means that:

$${\Omega}_{c}\propto {N}^{2} \tag{21}$$

However, if one is not bound to the two-state paradigm, the more general and model-independent Equation (20) allows direct links to be established between the well-known size scaling relations and the cooperativity measure ${\Omega}_{c}$. To take into account the possibility of both first and second order mechanisms of the phase transition, the scaling $\frac{{T}_{D}}{\Delta T}\propto {N}^{1/d\nu}$ should be considered [18,19,20,21] (instead of ${N}^{1}$, used by Li et al.), where $d\nu $ is a critical exponent of the correlation length or radius of gyration; the values $d\nu =1$ and $d\nu =2$ correspond to first and second order phase transitions, respectively. From Equation (20), it immediately follows that:

$${\Omega}_{c}\propto {N}^{2/d\nu} \tag{22}$$

with $d\nu =1$ for our case of exactly two-state folders. Klimov, Thirumalai, and Li [16,17] justified the need for the new critical exponent $\zeta =1+\gamma $ by fitting the points from their dataset reported in [17] to their expression Equation (19). However, our Equation (22) can be used instead, without invoking the new critical exponent $\zeta $. To compare the two approaches, in Figure 1 we re-plot the data from [17] and compare them with our Equation (22). The data points for $\ln {\Omega}_{c}$ and $2\ln \frac{{T}_{D}}{\Delta T}$ vs. $\ln N$ are superimposed, and the corresponding fitted straight lines are indistinguishable, thus validating Equation (22) over the set of data from [17]. The fit results in $d{\nu}_{exp}=0.92$, which is close to, but not equal to, one. The scaling based on Equation (22) fits the experimental trends well and thus allows us to treat protein folding as a true phase transition in a finite system in the sense of Lifshitz–Grosberg–Khokhlov [2]. The fact that the transition interval has the same size scaling exponent as the correlation length is a nice example of the contribution of correlations in protein conformations to folding cooperativity.
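The way an exponent such as $d\nu $ is extracted from a log-log plot can be sketched in a few lines. The example below does not use the experimental data of [17]; it generates synthetic points from the two-state model under the assumption ${g}_{D}={Q}^{N}$, ${E}_{D}=N\epsilon $ (hypothetical `q` and `eps`), evaluates ${\Omega}_{c}$ as ${\left({T}_{D}/\Delta T\right)}^{2}$, and fits a straight line to $\ln {\Omega}_{c}$ vs. $\ln N$ by ordinary least squares. The fitted slope plays the role of $2/d\nu $.

```python
import math


def omega_c(n, q=3.0, eps=1.0):
    """(T_D / Delta T)^2 for the two-state model, ASSUMING
    g_D = q**n and E_D = n * eps (synthetic, not experimental, data)."""
    t_d = eps / math.log(q)
    delta_t = 4.0 * t_d ** 2 / (n * eps)
    return (t_d / delta_t) ** 2


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


sizes = [20, 40, 60, 80, 100]
log_n = [math.log(n) for n in sizes]
log_omega = [math.log(omega_c(n)) for n in sizes]

zeta, intercept = fit_line(log_n, log_omega)
d_nu = 2.0 / zeta  # the slope is interpreted as 2 / (d nu)

print(f"fitted slope = {zeta:.6f}, d_nu = {d_nu:.6f}")
```

For these synthetic two-state points the fit should recover a slope of two, i.e., $d\nu =1$; applied to measured values of ${T}_{D}$ and $\Delta T$, the same procedure would give an empirical estimate of the exponent.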

There is further experimental evidence that supports our view. In a number of publications, Ptitsyn and Uversky proposed the molten globule as the third thermodynamic state of protein molecules [22,23]. Based on a systematic analysis of data on urea- and guanidinium chloride-induced transitions of globular proteins from the native to the unfolded state ($N\to U$), from the native to the molten globule state ($N\to MG$), and from the molten globule to the unfolded state ($MG\to U$), it has been shown that in all these cases the cooperativity of unfolding increases linearly with the molecular weight of the protein up to 25–30 kDa [22,23]. In fact, the cooperativity of all three transitions, measured in terms of $\Delta n$ (see Equation (4)), follows $\log \Delta n=d\nu \log M-b$, with $d{\nu}_{N-U}=0.97$, $d{\nu}_{N-MG}=1.02$, and $d{\nu}_{MG-U}=0.89$, all close to the $d\nu =0.92$ value estimated from the temperature-induced set of data from [17]. Such a dependence of the cooperativity of urea-induced and guanidinium chloride-induced transitions in small proteins on their molecular weight suggests that all three types of transitions are all-or-none, indicating that the molten globule state is separated from the native and unfolded states by all-or-none transitions [22,23]. Thus, the experimental data on denaturant-induced unfolding of small globular proteins are consistent with the linear $\log {\Omega}_{c}$ vs. $\log N$ dependence described in [17].

The comparison of the cooperativity measures shows that each of them has advantages and drawbacks. The strict two-state assumption, expressed in Equation (5), allows the derivation of ${k}_{2}\approx 1$ at large $N$, which is therefore a necessary condition for two-state folding. Independent of the chain length, ${k}_{2}$ indicates which of the proteins under consideration comes closer to ideal two-state behavior. By contrast, in the same $N\to \infty $ limit, ${\Omega}_{c}$ tends to infinity, which means that, all other conditions being equal, longer chains have higher values of the cooperativity measure ${\Omega}_{c}$. On the other hand, ${k}_{2}$, as defined by Equation (15), contains both equilibrium and kinetic quantities, which are equal only when the system has reached equilibrium, and the deviation from unity can be attributed to kinetic traps (see also [15] for the definition and discussion of kinetic cooperativity). As for ${\Omega}_{c}$, once expressed through $\frac{{T}_{D}}{\Delta T}$, it becomes a criterion similar to those introduced in other areas of physics to deal with finite size effects at phase transitions. This fact puts it on very solid grounds.