Detecting molecular folding from noise measurements

Detecting conformational transitions in molecular systems is key to understanding biological processes. Here we investigate the force variance in single-molecule pulling experiments as an indicator of molecular folding transitions. We consider cases where Brownian force fluctuations are large, masking the force rips and jumps characteristic of conformational transitions. We compare unfolding and folding data for DNA hairpin systems of loop sizes 4, 8, and 20 and the 110-amino-acid protein barnase, finding conditions that facilitate the detection of folding events at low forces where the signal-to-noise ratio is low. In particular, we discuss the role of temperature as a useful parameter to improve the detection of folding transitions in entropically driven processes where folding forces are temperature independent. The force variance approach might be extended to detect the elusive intermediate states in RNA and protein folding.


Introduction
Protein folding remains a challenging topic in biophysics. In 1968 Levinthal argued that stochastic diffusive motion alone could not account for the short timescales of protein folding [1]. Folding a protein into its native structure can be likened to finding a needle in a haystack. Assuming that the backbone dihedral angles of the amino-acid chain are divided into three distinct regions of the Ramachandran plot, the typical folding time grows like $3^N \times \tau_d$, with $N$ the number of amino acids and $\tau_d$ the diffusive time in such regions. The latter can be expressed as $\tau_d = l^2/6D$, where $l$ is the region size and $D$ is the diffusion constant. Taking $l \sim 3$ Å, the inter-amino-acid distance, and using the Stokes formula $D = k_B T/\gamma$ with $\gamma = 6\pi\eta l$ and $\eta \sim 0.001$ Pa·s the shear viscosity of water, we obtain $\tau_d \approx 2 \times 10^{-11}$ s. Thus, a protein consisting of $N = 20$ residues would fold in approximately one second, while for $N = 60$, the folding time would be the universe's age. This rough estimation emphasizes natural evolution's role in speeding up protein folding.
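The order-of-magnitude estimate above can be reproduced with a few lines of Python. This is only a sketch: the temperature (298.15 K) is an assumption, since the text does not state the value used, and the function names are illustrative. With these inputs the diffusive time comes out close to $2 \times 10^{-11}$ s, the $N = 20$ case gives a sub-second folding time, and the $N = 60$ case gives a time of the order of $10^{17}$ to $10^{18}$ s.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def diffusive_time(l=3e-10, eta=1e-3, temperature=298.15):
    """tau_d = l^2 / (6 D) with the Stokes estimate D = k_B T / (6 pi eta l).

    l is in metres, eta in Pa*s; the temperature is an assumed value.
    """
    gamma = 6.0 * math.pi * eta * l          # Stokes friction coefficient
    diffusion = K_B * temperature / gamma    # diffusion constant in m^2/s
    return l**2 / (6.0 * diffusion)


def levinthal_time(n_residues, tau_d, regions_per_residue=3):
    """Exhaustive-search folding time, regions^N * tau_d, in seconds."""
    return regions_per_residue**n_residues * tau_d


tau_d = diffusive_time()
print(f"tau_d = {tau_d:.2e} s")
for n in (20, 60):
    print(f"N = {n}: folding time ~ {levinthal_time(n, tau_d):.2e} s")
```

The exponential factor $3^N$ dominates the result, so modest changes in $l$, $\eta$, or the temperature do not alter the conclusion.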
To solve Levinthal's paradox, the molten globule hypothesis was proposed by Ptitsyn in the 1970s: native folding is guided by the accumulation of native-like interactions and the sequential formation of intermediates. In small globular proteins, the molten globule is an intermediate between the unfolded and native states, where the polypeptide chain pre-forms a scaffold of the native structure. Experimental measurements suggest a dry molten globule with the outer layer of the protein hydrated and the core dehydrated. The latter has a native-like expanded structure with the backbone formed but with side chains loosely packed [2,3,4]. The evidence in favor of molten globule intermediates has always been indirect [5,6,7].
The study of protein folding has traditionally relied on bulk experiments such as calorimetry, hydrogen exchange, NMR, and fluorescence spectroscopy. However, these methods have limitations in detecting short-lived intermediates, whose presence is masked by the averaging effect of bulk assays. Single-molecule force spectroscopy experiments have revolutionized the study of protein folding thanks to their unprecedented spatial and temporal resolution, allowing us to detect previously undetectable short-lived intermediates. Recently, single-molecule experiments have demonstrated that the rupture force variance of the ligand-protein complex biotin-streptavidin increases close to the transition state [8]. Optical tweezers have proven especially adept at detecting these intermediates [9,10,11,12], including in co-translational folding assays of polypeptides exiting the ribosome [13]. A major step forward has recently been achieved with calorimetric force spectroscopy [14,15] by measuring the folding enthalpy, entropy, and heat capacity change of the small globular protein barnase [16]. Barnase is a 110-amino-acid bacterial ribonuclease secreted by the bacterium Bacillus amyloliquefaciens and the focus of many protein folding studies [17,18,19,20]. In reference [21] we found that barnase folds in a two-state manner without observable intermediates at kHz sampling rates. In a subsequent study [22], we demonstrated that the transition state has the thermodynamic properties of a dry molten globule: a native-like structure of high energy and low configurational entropy relative to the native state. This study also provided a thermodynamic basis for the energy landscape hypothesis (ELH) proposed by Wolynes and collaborators in the 1980s. In the ELH, proteins fold along a funnel-shaped energy landscape with multiple productive folding trajectories [23,24].
Despite the many studies on barnase, direct observation of the hypothesized molten globule intermediate has not been possible. A major question is what experimental limitations prevent the detection of hidden short-lived states using force noise measurements. Here we address noise measurements of the unfolding and folding dynamics of barnase measured in pulling experiments at different temperatures (7-37 °C) [22]. We compare such measurements with those obtained in DNA hairpins of varying loop sizes, where the entropic barrier to folding is large, as it is for proteins. To this end, we measured the force variance in pulling experiments at loading rates of 4-7 pN/s and a sampling rate of 1 kHz. We ask whether folding events can be detected in an entropy-driven process where folding forces are low and the folding rip is indistinguishable from the noise. We also analyze the effect of decreasing the temperature to reduce thermal fluctuations and increase the signal-to-noise ratio of the folding events. Detecting folding events is a prerequisite for identifying folding intermediates, which require additional experimental resolution. Here we focus on detecting folding events in DNA hairpins and barnase, setting the basis for future studies aimed at detecting the often elusive folding intermediates.

Materials and methods
In pulling experiments with optical tweezers, the molecule under study (a DNA hairpin or barnase) is tethered between two beads. Double-stranded DNA (dsDNA) handles are attached to the ends of the molecule to prevent nonspecific interactions between molecules and beads. For barnase, the handles are ligated to the N- and C-termini via cysteine-thiol chemical reduction (details in Ref. [21]). For the DNA hairpins, designed oligos are hybridized and ligated to build a DNA construct consisting of the hairpin and two flanking short handles of 29 bp (details in Ref. [25]). The 5'-end of the molecular construct is attached to one bead via anti-digoxigenin-digoxigenin bonds (3.0 to 3.4 µm diameter beads; Spherotech, Libertyville, IL), while the other end is attached to a micron-sized polystyrene microsphere using streptavidin-biotin bonds (2.0 to 2.9 µm diameter bead; G. Kisker Biotech, Steinfurt, Germany). The first bead is captured in the optical trap to measure the force, while the other is immobilized at the tip of a micro-pipette by air suction (Fig. 1a). In a pulling experiment, the optical trap is moved between a minimum force where the molecule is folded and a maximum force where it is unfolded. In a pulling cycle, the force applied to the system increases (decreases) when moving the optical trap away from (towards) the pipette. To change the temperature, we use the temperature-jump optical trap described in Ref. [14], where an extra collimated laser is used to heat the medium surrounding the optical trap uniformly. For low-temperature measurements, the instrument is placed inside an icebox kept at 4 °C, permitting measurements in the range of 4-40 °C.

Results and discussion
In pulling experiments, the molecule is repeatedly stretched and released while recording the force versus trap-position curves, known as force-distance curves (FDCs). In the unfolding process, a force rip is observed at high forces (> 15 pN), indicating the transition from the native ($N$) to the unfolded ($U$) state (dark color trajectories in Fig. 1b). Furthermore, the force at which the transition is observed varies from one pull to another, indicating that the unfolding events are thermally activated. In the refolding process, the force is reduced until a folding event is observed as a sudden force rise. The size of the force jump is proportional to the difference in molecular extension between $N$ and $U$. However, as can be seen in Fig. 1b (light color trajectories), a rise in the force cannot be discerned in the folding FDCs of barnase because the folding event takes place at low forces, < 5 pN. At such low forces, the magnitude of the force jump is expected to be comparable to the noise.
To detect the folding transition, we measured the variance of the force signal in the unfolding and folding trajectories separately. The analysis of the force variance considers the effects due to the bead, handles, and molecule under study, which are modeled as three serially connected springs. The optical trap is modeled using Hooke's law,
$$f = k_b\,x_b, \tag{1}$$
where $f$ is the force, $k_b$ is the stiffness of the optical trap, and $x_b$ is the displacement of the bead from the trap's center. The dsDNA handles and the unfolded state of the DNA hairpin and barnase are modeled with the Worm-Like Chain (WLC) model [26],
$$f = \frac{k_B T}{4 l_p}\left[\left(1 - \frac{x}{L_c}\right)^{-2} - 1 + 4\frac{x}{L_c}\right]. \tag{2}$$
In Eq. (2), $k_B$ is the Boltzmann constant, $T$ is the temperature, $l_p$ is the persistence length, $x$ is the extension of the molecule, and $L_c$ is the contour length of the handles or the unfolded molecule. For the short dsDNA handles of the DNA hairpins, extensibility is taken into account by correcting the extension $x$ with the term $(1 + f/Y)$, where $Y = 16$ pN for the 29 bp dsDNA handles [25]. Finally, the elastic response of the folded molecule is modeled as a dipole oriented under an applied force. Its extension is modeled with the Freely-Jointed Chain (FJC) model,
$$x = d_0\left[\coth\left(\frac{f d_0}{k_B T}\right) - \frac{k_B T}{f d_0}\right]. \tag{3}$$
In Eq. (3), $x$ is the dipole extension at force $f$, and $d_0$ is the dipole contour length, which is equal to 2 nm for the DNA hairpin and 3 nm for barnase.
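The three elastic elements described above can be sketched as small Python functions. This is a minimal illustration, not the analysis code used in this work: it assumes the Marko-Siggia interpolation form of the WLC, $f = (k_B T/4 l_p)[(1 - x/L_c)^{-2} - 1 + 4x/L_c]$, and the single-segment FJC (oriented dipole) expression $x = d_0[\coth(f d_0/k_B T) - k_B T/(f d_0)]$. The function names, the value of $k_B T$, and the numbers in the usage lines are placeholders.

```python
import math

KBT = 4.114  # k_B*T in pN*nm near 25 C (assumed value)


def trap_force(x_b, k_b):
    """Hooke's law for the optical trap: f = k_b * x_b (pN, with nm and pN/nm)."""
    return k_b * x_b


def wlc_force(x, contour_length, persistence_length, kbt=KBT):
    """Inextensible WLC force (Marko-Siggia interpolation) at extension x."""
    z = x / contour_length
    if not 0.0 <= z < 1.0:
        raise ValueError("extension must satisfy 0 <= x < L_c")
    return kbt / (4.0 * persistence_length) * ((1.0 - z) ** -2 - 1.0 + 4.0 * z)


def fjc_dipole_extension(force, d0, kbt=KBT):
    """Mean projected extension of a dipole of length d0 oriented by a force."""
    a = force * d0 / kbt
    if abs(a) < 1e-4:
        # Small-force expansion, coth(a) - 1/a ~ a/3, avoids cancellation errors.
        return d0 * a / 3.0
    return d0 * (1.0 / math.tanh(a) - 1.0 / a)


# Illustrative usage with placeholder parameters (not fitted values).
print(trap_force(100.0, 0.07))           # force for a 100 nm bead displacement
print(wlc_force(15.0, 20.0, 1.0))        # WLC force at 75% relative extension
print(fjc_dipole_extension(15.0, 2.0))   # 2 nm dipole (hairpin value) at 15 pN
```

The extensibility correction $(1 + f/Y)$ for the short handles is omitted here to keep the sketch short.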

Force variance in a two-branches model
In our pulling experiments, the control parameter is the trap position $\lambda$, and the measured force is a fluctuating quantity. To detect the folding transitions we compute the force variance ($\sigma_f^2$) in a statistical model with two branches, folded and unfolded, describing the experimental FDCs shown in Fig. 1. The upper and lower branches in the FDCs of Fig. 1b are the folded and unfolded branches, along which the molecule is in the native ($N$) or unfolded ($U$) state, respectively. In what follows, force branches and states are used interchangeably: folded branch ↔ $N$ and unfolded branch ↔ $U$. In equilibrium, the probability of observing the molecule in state $N$ or $U$ ($P_N$ and $P_U$) is given by the Boltzmann-Gibbs factor:
$$P_{N(U)} = \frac{1}{Z_\lambda}\exp\left(-\frac{\Delta G_{N(U)}}{k_B T}\right), \tag{4}$$
where $\Delta G_{N(U)}$ is the partial free energy of $N$ ($U$) at a given trap position and $Z_\lambda = \exp(-\Delta G_N/k_B T) + \exp(-\Delta G_U/k_B T)$ is the partition function of the system (molecule, handles, and bead). The partial free energy of the system when the molecule (DNA hairpin or barnase) is in $N$ and $U$ is calculated as:
$$\Delta G_N(\lambda) = \int_0^{x_d(f_N)} f_d(x)\,dx + \int_0^{x_h(f_N)} f_h(x)\,dx + \int_0^{x_b(f_N)} f_b(x)\,dx, \tag{5a}$$
$$\Delta G_U(\lambda) = \Delta G_0 + \int_0^{x_U(f_U)} f_U(x)\,dx + \int_0^{x_h(f_U)} f_h(x)\,dx + \int_0^{x_b(f_U)} f_b(x)\,dx, \tag{5b}$$
where $\Delta G_0$ is the folding free energy at zero force, $x_d$ denotes the projected extension of the dipole, $x_U$ is the extension of the unfolded molecule, $x_h$ is the extension of the handles, and $x_b$ is the bead displacement, all quantities evaluated at the force when the molecule is in $N$ or $U$ (i.e., $f_N$ or $f_U$). The forces acting on each element are denoted $f_d$ (dipole), $f_h$ (handles), $f_b$ (bead), and $f_U$ (unfolded polymer); their different elastic responses result in the different force branches observed in Fig. 1b. These relations have been defined in Eqs. (1), (2), and (3). Note that $f_d$, $f_h$, $f_b$, and $f_U$ are equal at the upper integration limits in (5a) and in (5b), corresponding to serially connected springs.
In the absence of force jumps between the two branches, the force variance is given by
$$\sigma_f^2 = P_N\,\sigma_f^2(N) + P_U\,\sigma_f^2(U). \tag{6}$$
The force variances in each branch, $\sigma_f^2(N)$ and $\sigma_f^2(U)$, are determined by the elastic properties of the molecular construct in that branch, $k_m(N)$ and $k_m(U)$,
$$\sigma_f^2(N(U)) = \frac{k_B T\,k_b^2}{k_b + k_m(N(U))}, \tag{7}$$
where $1/k_m(N) = 1/k_h + 1/k_d$ and $1/k_m(U) = 1/k_h + 1/k_U$ define the stiffness of the molecular construct, resulting from two serially connected springs of stiffnesses $k_h$ (handles) and $k_d$ (dipole, for the folded state) or $k_U$ (unfolded polymer). $k_U$ and $k_d$ are derived from Eq. (2) and Eq. (3), respectively. At a given trap position $\lambda$, the equilibrium force $\langle f \rangle$ and its second moment $\langle f^2 \rangle$ are defined as:
$$\langle f \rangle = P_N f_N + P_U f_U, \tag{8a}$$
$$\langle f^2 \rangle = P_N f_N^2 + P_U f_U^2, \tag{8b}$$
where $f_{N(U)}$ denotes the average force when the molecule is in $N$ ($U$), i.e., $f_{N(U)} = \partial_\lambda \Delta G_{N(U)}$ with $\Delta G_{N(U)}$ given in Eqs. (5a) and (5b).
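As a rough illustration of how the stiffness of the tether sets the force noise within a single branch, the sketch below combines springs in series and evaluates the force variance of a trapped bead. It assumes the standard harmonic result for a bead in a trap of stiffness $k_b$ connected to a tether of stiffness $k_m$, $\sigma_f^2 = k_B T\,k_b^2/(k_b + k_m)$. All numerical values are placeholders chosen for illustration, not values measured in this work.

```python
KBT = 4.114  # k_B*T in pN*nm near 25 C (assumed value)


def series_stiffness(*stiffnesses):
    """Stiffness of serially connected springs: 1/k = sum_i 1/k_i."""
    return 1.0 / sum(1.0 / k for k in stiffnesses)


def branch_force_variance(k_b, k_m, kbt=KBT):
    """Force variance (pN^2) of a bead in a trap k_b tethered by a spring k_m."""
    return kbt * k_b**2 / (k_b + k_m)


# Placeholder stiffnesses in pN/nm.
k_b = 0.07                 # optical trap
k_h = 1.0                  # handles
k_d = 5.0                  # oriented dipole (folded molecule)
k_u = 0.3                  # unfolded polymer

k_m_native = series_stiffness(k_h, k_d)
k_m_unfolded = series_stiffness(k_h, k_u)

print("sigma_f^2 (N) =", branch_force_variance(k_b, k_m_native))
print("sigma_f^2 (U) =", branch_force_variance(k_b, k_m_unfolded))
```

A softer tether gives a larger force variance for the same trap, which is why both branches become quieter as the force, and hence the tether stiffness, increases.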
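To determine the variance of the force, we calculated the second derivative of the thermodynamic potential $\Delta G(\lambda)$:
$$\partial_\lambda^2 \Delta G(\lambda) = \partial_\lambda \langle f \rangle = k_{eff}, \tag{9}$$
where $k_{eff}$ is the effective stiffness of the system along the equilibrium FDC. Using the definition of $\langle f \rangle$ in Eq. (8a) we compute $\partial_\lambda \langle f \rangle$:
$$\partial_\lambda \langle f \rangle = -\frac{A(\lambda)}{Z_\lambda^2}\,\partial_\lambda Z_\lambda + \frac{1}{Z_\lambda}\,\partial_\lambda A(\lambda), \tag{10}$$
with
$$A(\lambda) = f_N\,e^{-\Delta G_N/k_B T} + f_U\,e^{-\Delta G_U/k_B T},$$
where we used $\langle f \rangle = A(\lambda)/Z_\lambda$. The first term follows from $\partial_\lambda Z_\lambda = -A(\lambda)/k_B T$:
$$-\frac{A(\lambda)}{Z_\lambda^2}\,\partial_\lambda Z_\lambda = \frac{\langle f \rangle^2}{k_B T}. \tag{11}$$
The second term, $(1/Z_\lambda)\,\partial_\lambda A(\lambda)$, is obtained by taking the $\lambda$-derivative of the above definition for $A(\lambda)$, and using $\langle f^2 \rangle$ in Eq. (8b):
$$\frac{1}{Z_\lambda}\,\partial_\lambda A(\lambda) = \bar{k} - \frac{\langle f^2 \rangle}{k_B T}, \tag{12}$$
where $\bar{k} = P_N k_N + P_U k_U$ is the equilibrium stiffness and $k_{N(U)} = \partial_\lambda f_{N(U)}$ are the stiffnesses of each branch, equal to the slope of the corresponding force branch ($N$ or $U$). Introducing Eqs. (11) and (12) into Eq. (10) and using Eqs. (4) and (9), we get:
$$k_{eff} = \bar{k} - \frac{\langle f^2 \rangle - \langle f \rangle^2}{k_B T}, \tag{13}$$
with
$$\sigma_f^2 = P_N\,\sigma_f^2(N) + P_U\,\sigma_f^2(U) + k_B T\,(\bar{k} - k_{eff}). \tag{14}$$
For one branch only, e.g. $P_N = 1$, $P_U = 0$, we get $\bar{k} = k_{eff} = k_N$ and $\sigma_f^2 = \sigma_f^2(N)$. In general, for systems with two branches, the slope of the FDC decreases in the region where the two branches coexist, $P_N \sim P_U \sim 1/2$, and $k_{eff}$ can become negative (black line connecting the two branches in Fig. 2a). Figure 2a shows an experimental unfolding (red curve) and folding (blue curve) trajectory measured for DNA hairpin L4. Notice that at low (high) force values, $f < 13$ ($f > 19$) pN, the unfolding and folding trajectories overlap onto the folded (unfolded) branch (dashed lines). In between, unfolding and folding transitions are observed as red force rips and blue force jumps in Fig. 2a. To construct the equilibrium FDC (black line in Fig. 2a), we define the native and unfolded force branches at low and high forces outside the region limited by the force rips and jumps (red and blue dashed lines). The force branches have been calculated by fitting the elastic properties of the optical trap ($k_b$) and by imposing the previously determined elastic properties of handles and unfolded polymers [25,22,27] and their folding free energies [28,22]. This permits us to determine $\sigma_f^2(N)$ and $\sigma_f^2(U)$ from Eq. (7) and $P_N$ and $P_U$ from Eq. (4).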
Equilibrium probabilities for each branch (red, folded; blue, unfolded) are shown in Fig. 2b. We derive $\sigma_f^2$ in (14) by computing $k_{eff}$ from the equilibrium FDC, together with the effective stiffness of each force branch, $k_N$ and $k_U$. Figure 2c shows the estimated $\sigma_f^2$ for the DNA hairpin L4 at 25 °C as a function of the trap position (bottom axis) and the force in the unfolded branch (top axis). As expected, $\sigma_f^2$ decreases with force at low forces ($N$ branch) and high forces ($U$ branch) but shows a peak at the transition region, $f_U \sim 15$ pN, due to the contribution of the term $k_{eff}$ in (14). The above calculations can be extended to systems with more than two branches. The average stiffness is given by:
$$\bar{k} = \sum_{i=1}^{M} P_i\,k_i = \frac{1}{Z_\lambda}\sum_{i=1}^{M} k_i\,e^{-\Delta G_i/k_B T}, \tag{15}$$
with $M$ the total number of branches and $Z_\lambda = \sum_{i=1}^{M} e^{-\Delta G_i/k_B T}$.
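The relation between the slope of the equilibrium FDC and the force fluctuations between branches can be checked numerically on a toy model. The sketch below assumes harmonic branches, $\Delta G_i(\lambda) = G_{0,i} + k_i(\lambda - \lambda_{0,i})^2/2$, so that $f_i = \partial_\lambda \Delta G_i = k_i(\lambda - \lambda_{0,i})$, and compares $k_{eff}$ obtained from a finite-difference derivative of $\langle f \rangle$ with $\bar{k} - (\langle f^2 \rangle - \langle f \rangle^2)/k_B T$. The branch parameters are hypothetical, chosen only so that the two branches coexist at $\lambda = 172$ nm; the code works for any number of branches.

```python
import math

KBT = 4.114  # k_B*T in pN*nm near 25 C (assumed value)

# Hypothetical harmonic branches: (G0 in pN*nm, stiffness in pN/nm, offset in nm).
BRANCHES = [(0.0, 0.065, 0.0),      # native branch
            (250.0, 0.060, 18.0)]   # unfolded branch


def free_energy(lam, g0, k, lam0):
    return g0 + 0.5 * k * (lam - lam0) ** 2


def force(lam, g0, k, lam0):
    return k * (lam - lam0)


def probabilities(lam, branches=BRANCHES, kbt=KBT):
    """Boltzmann weights, shifted by the minimum energy to avoid underflow."""
    energies = [free_energy(lam, *b) for b in branches]
    g_min = min(energies)
    weights = [math.exp(-(g - g_min) / kbt) for g in energies]
    z = sum(weights)
    return [w / z for w in weights]


def mean_force(lam, branches=BRANCHES, kbt=KBT):
    p = probabilities(lam, branches, kbt)
    return sum(pi * force(lam, *b) for pi, b in zip(p, branches))


def between_branch_variance(lam, branches=BRANCHES, kbt=KBT):
    """<f^2> - <f>^2 computed from the branch-averaged forces."""
    p = probabilities(lam, branches, kbt)
    f = [force(lam, *b) for b in branches]
    f1 = sum(pi * fi for pi, fi in zip(p, f))
    f2 = sum(pi * fi * fi for pi, fi in zip(p, f))
    return f2 - f1 * f1


def k_eff_from_fluctuations(lam, branches=BRANCHES, kbt=KBT):
    p = probabilities(lam, branches, kbt)
    k_bar = sum(pi * b[1] for pi, b in zip(p, branches))
    return k_bar - between_branch_variance(lam, branches, kbt) / kbt


def k_eff_numeric(lam, h=1e-3):
    """Central finite difference of the equilibrium force."""
    return (mean_force(lam + h) - mean_force(lam - h)) / (2.0 * h)


for lam in (50.0, 172.0, 300.0):
    print(lam, probabilities(lam), k_eff_from_fluctuations(lam), k_eff_numeric(lam))
```

In this toy model $k_{eff}$ is negative at coexistence and reduces to the stiffness of the populated branch far from it. By the law of total variance, adding the population-weighted variances within each branch to `between_branch_variance` gives the full equilibrium force variance.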

DNA hairpins
The experimental values of $\sigma_f^2$ for DNA hairpins were extracted from the experimental FDCs measured at loading rates of 4-6 pN/s by averaging the force signal in $\lambda$-windows of 10 nm, meaning that the force increases/decreases by ∼ 0.5 pN inside each window. Figure 3a shows the measured $\sigma_f^2$ along the unfolding (red) and folding (blue) process for the DNA hairpin L4 at 7 °C (top), 25 °C (center), and 37 °C (bottom) as a function of the force along the unfolded force branch, $f_U$. We highlight four features of Figure 3a. First, the $\sigma_f^2$ values overlap at high and low forces, as expected because the molecular state is the same (folded or unfolded). Second, the forces at which $\sigma_f^2$ is maximum (unfolding, red; folding, blue) shift to lower values as the temperature increases. Third, the hysteresis of $\sigma_f^2$ between unfolding (red) and folding (blue) decreases with temperature. Fourth, equilibrium transitions are expected to populate forces between the two maxima. In fact, at 39 °C the measured unfolding (red) and folding (blue) $\sigma_f^2$ [...] were carried out under quasi-static conditions (see Fig. 1b top, right).
Figure 3: Force variance $\sigma_f^2$ for DNA hairpins L4 (panel a), L8 (panel b), and L20 (panel c) measured at 7 °C (top), 25 °C (center), and 39 °C (bottom) as a function of the measured force along the unfolded force branch. Notice that the unfolding peak of these hairpins takes place at lower forces as we increase the temperature, while the folding peak remains independent of the temperature for L8 and L20.
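The windowed estimate of the force variance described above can be sketched as follows. This is an illustrative implementation, not the analysis code used for the figures: the function names are hypothetical, and the optional linear detrending inside each window is an assumption added here, motivated by the fact that the ∼ 0.5 pN force ramp across a window would otherwise add to the measured variance. The synthetic trace at the end (a linear force ramp plus Gaussian noise) is only there to exercise the function.

```python
import random


def windowed_force_variance(trap_positions, forces, window_nm=10.0, detrend=True):
    """Return (window center, mean force, force variance) per lambda-window.

    With detrend=True (an assumption of this sketch) a straight line is
    fitted to the force inside each window and the variance is taken
    around that line instead of around the window mean.
    """
    lam_min = min(trap_positions)
    windows = {}
    for lam, f in zip(trap_positions, forces):
        index = int((lam - lam_min) // window_nm)
        windows.setdefault(index, []).append((lam, f))

    results = []
    for index in sorted(windows):
        points = windows[index]
        n = len(points)
        if n < 3:
            continue  # too few samples for a meaningful variance
        mean_lam = sum(p[0] for p in points) / n
        mean_f = sum(p[1] for p in points) / n
        s_ll = sum((p[0] - mean_lam) ** 2 for p in points)
        slope, dof = 0.0, n - 1
        if detrend and s_ll > 0.0:
            s_lf = sum((p[0] - mean_lam) * (p[1] - mean_f) for p in points)
            slope, dof = s_lf / s_ll, n - 2
        residual_sq = sum(
            (p[1] - mean_f - slope * (p[0] - mean_lam)) ** 2 for p in points
        )
        center = lam_min + (index + 0.5) * window_nm
        results.append((center, mean_f, residual_sq / dof))
    return results


def synthetic_ramp(n_points=5000, span_nm=100.0, stiffness=0.06,
                   noise_pn=0.2, seed=0):
    """Synthetic single-branch trace: linear force ramp plus Gaussian noise."""
    rng = random.Random(seed)
    lams = [span_nm * i / n_points for i in range(n_points)]
    forces = [stiffness * lam + rng.gauss(0.0, noise_pn) for lam in lams]
    return lams, forces


lams, forces = synthetic_ramp()
for center, mean_f, var_f in windowed_force_variance(lams, forces):
    print(f"lambda = {center:6.1f} nm  <f> = {mean_f:5.2f} pN  var = {var_f:.4f} pN^2")
```

Calling the function with `detrend=False` reproduces a plain variance around the window mean, which for the synthetic ramp is visibly larger because it includes the contribution of the force slope.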
Regarding the DNA hairpins with loop sizes 8 and 20, we note that $\sigma_f^2$ during the unfolding process (red dots in Fig. 3b,c) shifts with temperature, whereas the same quantity during refolding (blue dots in Fig. 3b,c) changes much less with temperature. This is an indication that folding is entropically driven. Notice also that the unfolding forces where $\sigma_f^2$ is maximum (red symbols in Fig. 3) are similar for L4, L8, and L20, in agreement with the fact that the transition state of unfolding is located within the hairpin's stem and is independent of the loop size.

Barnase
For barnase, $\sigma_f^2$ was calculated by averaging the force over $\lambda$-windows of 8 nm in the FDCs. As for DNA hairpins, $\sigma_f^2$ during the folding process (blue points in Fig. 4) changes much less with temperature than during the unfolding process (red points in Fig. 4). Figure 4 shows that barnase folds around 4 pN at the three temperatures, while the unfolding events and the maximum of $\sigma_f^2$ occur at 30 pN at 7 °C, 26 pN at 25 °C, and 22 pN at 37 °C.

Conclusions
We studied the variance of the force signal, $\sigma_f^2$, in single-molecule pulling experiments. Our aim is to detect entropically driven folding at low forces, where the magnitude of force fluctuations is high and the signal-to-noise ratio of the folding events is low. Moreover, we computed the equilibrium force variance and compared it with the force variance measured in non-equilibrium conditions. First, we studied three DNA hairpins as toy models to test the method's validity. The studied hairpins have a stem formed by 20 base pairs and four (L4), eight (L8), and twenty (L20) bases in the loop. The first studied hairpin, L4, has a small entropic barrier to folding, showing folding and unfolding transitions at sufficiently high forces (Fig. 3a). For L4, the force variance $\sigma_f^2$ detects the forces at which folding and unfolding transitions occur. We have also observed that the unfolding and folding transitions for L4 are temperature-dependent, while the folding transitions for L8 and L20 are roughly temperature-independent, indicating that the folding process is entropically driven (Figs. 3b,c).
Next, we studied the folding process of protein barnase. This transition is challenging to detect in the FDCs (zoom in Fig. 1b), but it is observed as a gentle bump around 4 pN in the force variance $\sigma_f^2$ (blue squares in Fig. 4). In this case, the transition is not observed as a clear maximum as in the case of DNA hairpins L4 and L8 (blue squares in Fig. 3a,b), because folding occurs far from equilibrium. In fact, the gentle bump observed for both L20 (blue squares in Fig. 3c) and barnase (blue squares in Fig. 4) should become a peak in equilibrium conditions (black lines), demonstrating that folding in these two molecules is highly irreversible. Indeed, hopping transitions between these molecules' folded and unfolded states cannot be observed within the experimentally accessible timescales.
Figure 4: Force variance $\sigma_f^2$ for barnase measured at 7 °C (panel a), 25 °C (panel b), and 37 °C (panel c) as a function of the force along the U branch. Notice that the unfolding transition (peak in the red symbols) appears at higher forces as we decrease the temperature, while the folding event (peak in the blue symbols) does not move with temperature.
Future work should consider molecular intermediates and the usefulness of measuring the force variance $\sigma_f^2$ to detect them. Our approach might be extended by developing a theory for $\sigma_f^2$ in out-of-equilibrium conditions, where detecting structural transitions is challenging.