Stochastic Modelling of 13C NMR Spin Relaxation Experiments in Oligosaccharides

A framework for the stochastic description of relaxation processes in flexible macromolecules including dissipative effects has been recently introduced, starting from an atomistic view, describing the joint relaxation of internal coordinates and global degrees of freedom, and depending on parameters recoverable from classic force fields (energetics) and medium modelling at the continuum level (friction tensors). The new approach provides a rational context for the interpretation of magnetic resonance relaxation experiments. In its simplest formulation, the semi-flexible Brownian (SFB) model has been until now shown to reproduce correctly correlation functions and spectral densities related to orientational properties obtained by direct molecular dynamics simulations of peptides. Here, for the first time, we applied directly the SFB approach to the practical evaluation of high-quality 13C nuclear magnetic resonance relaxation parameters, T1 and T2, and the heteronuclear NOE of several oligosaccharides, which were previously interpreted on the basis of refined ad hoc modelling. The calculated NMR relaxation parameters were in agreement with the experimental data, showing that this general approach can be applied to diverse classes of molecular systems, with the minimal usage of adjustable parameters.


Introduction
Monitoring and describing molecular dynamics are an important area of investigation in modern physical chemistry. Internal and global motions in solution affect directly or indirectly most spectroscopic methods aimed at the characterization of non-rigid molecules such as Nuclear Magnetic Resonance (NMR) relaxation [1][2][3], fluorescence anisotropy decay [4], timeresolved X-ray [5] and in single-molecule experiments such as site-directed spin-labelled electron spin resonance [6,7], Förster fluorescence resonance energy transfer [8] and atomic force microscopy [9].
In particular, Nuclear Magnetic Resonance (NMR) spectroscopy is known to be an important and powerful experimental technique for the observation of the dynamic properties of macromolecules. Some of the macroscopic physical observables are the relaxation times T 1 , T 2 and the heteronuclear Overhauser Effect (NOE) of 15 N, 2 H and 13 C nuclei, which are extremely sensitive to molecular motions, leading to the possibility to understand localized dynamics (e.g., studying conformational motions specifically in the active site of a protein) and to build a spatially distributed map of the macromolecule flexibility. However, interpretative tools can be complex due to several factors such as (i) the necessity to take into account diverse kinds of interactions, e.g., dipolar 15 N and 13 C and quadrupolar 2 H interactions, (ii) the coupling between global reorientation and large amplitude motions of entire domains, as well as limited local readjustments and restricted single-residue motions. In general, different spectroscopic techniques probe different physical observables, which, in addition, provide information on motions taking place at different time-scales. It seems therefore particularly important to introduce relevant sets of coordinates that are adapted to the observable involved in a particular experimental approach. This consideration is especially relevant in the case of NMR relaxation [1,10], for which interpretative methods for internal relaxation processes were introduced early on in the form of adaptable simple spectral densities, as in the Lipari-Szabo (LS) approach [11,12], or later, in the form of explicit dynamic models, as for instance in the Slowly Relaxing Local Structure (SRLS) model [13,14].
A rational approach to the in silico interpretation of the relaxation data of flexible molecules needs two distinct elements. First, a precise geometrical analysis is needed in order to relate properly the dynamic model to a set of experimentally observable quantities. In particular, care should be taken to account for the tensorial nature of the spectroscopic interactions, by defining proper local frames of reference. Next, the relaxation times or other observables are linked to time correlation functions/spectral densities of a specific nature that can be evaluated on the basis of the dynamic model itself. The latter can range from a full atomistic molecular dynamics (MD) simulation-based approach to simplified semianalytical expressions for the correlation functions. At an intermediate level of complexity, several approaches have been devised based on various approximations. For instance, one can make the simplifying assumption that local motions are due, at least for semi-rigid systems, to a network of dynamically coupled neighbours (network model) [15,16] or caused by partial diffusive reorientation within a local potential (SRLS) [14]. One can also assume specific statistical characteristics (diffusive or Brownian dynamics, fractional Brownian dynamics [17], etc.).
Recently, a systematic approach [18,19] has been proposed that tries to combine a detailed definition of the molecular geometry and a correct description of the associated dynamical features. The method attempts to include the information on the molecule geometry, topology and interactions into a general stochastic model, which can be tailored at different levels of accuracy introducing specific approximations based on time-scale separation arguments. A master equation can be obtained and, with suitable approximations, numerically solved. In particular, a basic implementation, named the Semi-Flexible Brownian (SFB) model, has been developed for the description of partially flexible macromolecules in solution. Until now, the SFB model has been applied only to model cases, and no examples have been shown of calculations of directly measurable observables. In this paper, we present a full investigation of the SFB performance for the evaluation of 13 C nuclear magnetic resonance relaxation parameters, T 1 and T 2 , and the heteronuclear NOE of several oligosaccharides, which were previously interpreted on the basis of ad hoc stochastic modelling. In particular, we discuss the computational strategy and implementation of the method and detailed results, which confirm how the calculated NMR relaxation parameters are in satisfactory agreement with the experimental data, and we suggest that this general approach can be safely applied to diverse classes of molecular systems, with a minimal usage of adjustable parameters.
The paper is organized as follows. Section 2 summarizes the basic features of the SFB model and its implementation. The main results are shown in Section 3. A discussion is provided in Section 4.

Observable and Geometric Setup
A spectroscopic observable is written usually in terms of suitable time correlation functions or spectral densities, and their fast and accurate evaluation are the main objective of a dynamic modelling approach. The distinction between the description of the dynamics of the molecular system and the definition of the physico-chemical observable is often skipped over. We review here some salient points useful for the general comprehension of the topic.
Magnetic relaxation times T 1 , T 2 and the NOE of 15 N, 13 C and 2 H nuclei depend on dipolar ( 15 N and 13 C) and quadrupolar ( 2 H) interactions, on chemical shift anisotropy and cross-correlation effects. In general, we define the following set of reference frames: (i) a Laboratory Frame (LF), i.e., a fixed external frame; (ii) an "Attached" Frame (AF), i.e., a frame attached to the molecule, where the exact way of defining the AF is actually model-dependent and is left temporarily undefined, noting that the choice of the AF while straightforward for a rigid molecule is not trivial for a flexible system; (iii) an interaction frame (µF), i.e., a local frame linked to the AF where some specific second-rank tensor spectroscopic property µ is well represented. Depending on the problem at hand, this could be for instance the frame where the 13 C-1 H dipolar or 13 C chemical shift (CSA) tensors are diagonal [20,21]. In Figure 1, an example is shown of the frame choice to compute the NMR relaxation data of a 13 C-1 H probe. The set of Euler angles Ω or other orientational coordinates transforming from the Laboratory Frame (LF) to the AF is time dependent and linked to the local restricted motions, large amplitude conformational motion and global orientation of the molecule. The dipolar and CSA frames are usually supposed to be rigidly attached to the AF. Here, only the Dipolar Frame (DF) is shown as an example. Let us briefly describe the evaluation of the dipolar contribution, as an example, to clarify the relation between the dynamical model and the physical observable. As is shown in Appendix A, the NMR relaxation observables are functions of the spectral densities of the correlation functions: Here, C is a constant depending on the interaction type and geometry (cf. Appendix A). Using the properties of Wigner rotation matrices [22], the correlation functions can be rewritten as: Here, Ω AD = (0, π/2, 0). Thus, the expression for the correlation functions simplifies to: with d 2 0,0 (π/2) = −1/2, d 2 ±1,0 (π/2) = 0 and d 2 ±2,0 (π/2) = The spectral densities for the dipolar interaction are thus calculated as: The spectral densities j (k,k ) m,m (ω) are directly recoverable from the description of the molecular motion, which can be now discussed independently of the specific interaction, described at the desired level of accuracy (from an all-atom molecular dynamics treatment at one extreme to a description based on a rigid diffusive model at the opposite extreme).

Dynamic Model
In this work, we used the semi-flexible Brownian (SFB) model that was developed for the description of partially flexible macromolecules in solution, introduced in [18,19]. The SFB approach describes, in essence, the case of a flexible rotating molecule, assuming a generic energy function defined by harmonic coordinates and their conjugate momenta. The model neglects large-amplitude activated torsional kinetics and/or crankshaft motions [23,24], as well as second-order precession effects. Internal motions are described as a harmonic or boson bath, which retains full coupling with external motion and includes dissipative/stochastic effects. The SFB has been derived as a simple tool to describe molecular relaxation processes based on the Fokker-Planck (FP) [25] equations, which is suitable and computationally efficient even in problems of large dimensions and is solidly founded on structural information comparable to a standard molecular dynamics simulation, but amenable to a semi-analytical solution. Details of the model and the proposed methods for the numerical solution were presented in [19], and we limited ourselves here to a brief summary. The SFB model is based on the Fokker-Planck equation for the probability density ρ(Q, t): Here, Q = (Ω, x), where Ω denotes the set of Euler angles describing the orientation of the molecular frame AF (vide supra) with respect to the LF and x = (x 1 , . . . , x i , . . . , x N ) is a set of dimensionless harmonic degrees of freedom, obtained as linear combinations of internal coordinates, their conjugate momenta and external (angular) momentum components.M p , p = 1, 2, 3, are the components of the infinitesimal rotation operator in the AF. As discussed in [18], this is the simplest description, recoverable from an initially atomistic model, of the Brownian dynamics of a non-rigid body, accounting for inertial effects and coupling between rotation and change of shape. Indeed, it describes the semi-rigid macromolecule of n atoms (or extended atoms, when a coarse-grained representation of the molecule is used), as a rotator coupled to N = 6n − 9 (i.e., 3n − 6 internal coordinates, 3n − 6 internal momenta and 3 components of the angular momentum L vector) harmonic degrees of freedom, in a fashion quite similar to standard spin-boson quantum mechanical approaches. Here, p(x) = exp(−x 2 /2)/(2π) N/2 is the Gaussian distribution of the N modes, and the equilibrium distribution is ρ(Q) = p(x)/8π 2 ; ω io ij is a (non-symmetric) matrix describing the relaxation of the internal coordinates, while ω int ip dictates the coupling between rotation and internal coordinates. Both ω io ij and ω int ip can be derived from the molecular geometry and the generalized friction tensor obtained from a hydrodynamic description, which defines the dissipative forces acting on the molecule [18]. Approximate, but accurate semi-analytical solutions for correlation functions/spectral densities of inter-est, e.g., as given by Equation (4), have been presented based on extended perturbation treatments, which take advantage of the spin-boson structure of the time evolution operator to tackle the system dimensionality and devise a fast way of evaluating spectral densities of interest for the interpretation of nuclear magnetic resonance relaxation experiments [19].

Parameterization and Implementation
Matrix elements ω io ij and ω int ip are directly recoverable from an internal generic potential function defined with respect to natural coordinates, referred to as a local-minimum (reference) structure, and the related Hessian matrix, i.e., the Force Field (FF), as well as the friction tensor obtained for the local-minimum structure using a generalized hydrodynamic model [18,26]. Notice that more sophisticate choices are possible: a collection of reference molecular structures can be used, with or without the possibility of dynamical interconversion, the evaluation of internal energy directly from a short molecular dynamics simulation via a variance-covariance matrix evaluation for all the system internal coordinates and more refined approaches to evaluate dissipative properties, beyond the hydrodynamic limit. We focused on the most convenient choices from the computational point of view; the method was implemented in the form of an integrated package, the Stochastic Augmented Liouville Equation Method (SALEM), which operates from scratch reading the single reference structure, evaluating all the internal parameters based on some assumed FF and the basic macroscopic properties of the medium (e.g., viscosity) and estimating the relaxation parameters, based on the procedure summarized in Appendix A, through the calculation of the spectral densities defined in Equation (4). Following previous work [26], we allow as the only free parameter of the model the hydrodynamic average radius of atoms R eff , which is necessary for the calculation of the friction tensor.
Despite the high complexity of the basic theory, its application is relatively straightforward. As mentioned above, we planed to make SALEM made available soon as an open-access tool to the scientific community. Presently, the code is in its beta test phase of development, but it is already capable of (i) reading a PDB file together with very few other data, part of which comes from the experimental setup (temperature, viscosity, type of probe and its geometry, spectrometer frequency) and part from the physico-chemical properties (FF, hydrodynamic boundary conditions and effective radius), (ii) performing, currently invoking Tinker [27], a simple energy minimization, (iii) evaluating the friction tensor, (iv) defining the parameters contained in Equation (5), (v) evaluating the spectral densities defined in Equation (4), and finally, (vi) evaluating the relaxation times and NOEs. The computational times for the systems considered here went from a fraction of a second to a few minutes on a standard Nvidia GPU for gaming (larger molecules would require GPUs equipped with additional RAM memory and CUDA processes). Before fully publishing the code, at least in a pre-release phase useful for interested researchers, we intend to streamline the various operations under a general intuitive GUI and simplify the compilation procedures under common operating systems.

Results
We tested our method on a collection of oligosaccharides, for which previous analyses based on specific stochastic models were presented. The following systems are considered: Estimates of 13 C-NMR parameters T 1 , T 2 and the NOE for selected CH and CH 2 probes were calculated by SALEM starting from a local-minimum structure (Figures 2-7) and the related Hessian matrix. These were obtained through the Tinker 8 program package [27] using the popular MM3 FF [33][34][35] (see References [36,37] for a review of several FFs for carbohydrates). Minimization was performed through the minimize tool of Tinker with an "RMS gradient per atom criterion" of 0.01 Å. The Hessian matrix was calculated through the Tinker utility testhess. In the hydrodynamic model used to evaluate the friction tensor by SALEM, four quantities were required, namely: the temperature, the local viscosity, the hydrodynamic boundary conditions and the effective radius of the atoms, R eff . Temperature and viscosity were set by the experimental conditions. For sugars in water or polar solvents, stick boundary conditions can be considered appropriate. The only true free parameter was the effective radius.      Local-minimum structure of the GCY system. Atoms of the active probe are highlighted as spheres. The torsional angle θ for the rotation of the hydroxymethyl group with respect to the sugar ring is also highlighted.
For each system, the optimal R eff parameter was determined as the one providing the lowest sum of squared percentage deviations for T 1 , T 2 and the NOE over the entire ensemble of experimental data available for that system. In Tables 1-6, for each set of available experimental data (typeset in boldface in the tables), the optimal theoretical estimates are reported for each system together with the value of the associated optimal R eff and with the percentage deviations from the experimental T 1 , T 2 , and the NOE (e(T 1 ), e(T 2 ), e(NOE), respectively). Table 1. Experimental and calculated relaxation parameters for system R2R. The optimal R eff and the percentage deviations from the experimental T 1 , T 2 , and the NOE (e (T 1 ), e (T 2 ), e (NOE), respectively) are also reported.
Probe: 13  The analysis of the various oligosaccharides studied in this work allowed us to attempt a comparative analysis of the model performance and sensitivity to some experimental parameters, with the caveat that a systematic investigation, which goes beyond the preliminary nature of this work, should be performed. The most important factor is temperature. Increasing the temperature implies lowering the viscosity, and thus increasing the principal values of the diffusion tensor. Assuming that in the range of temperatures at which the experiments were carried out, the conformational free energy of the molecules is not changed, increasing the diffusion tensor caused a decrease of the characteristic correlation times, which became closer to the extreme narrowing limit condition. This, in turn, had the effect of increasing T 1 and T 2 , as well as making them much closer to each other (since the rotational anisotropy is low in small molecules). Reaching the extreme narrowing limit, the spectral densities tended to become proportional to the overall tumbling correlation time, thus losing sensitivity on the conformational dynamics. Therefore, a worse performance of the model was observed and should not be unexpected at lower temperatures/larger viscosities. In the case of BGL at 253 K (Table 2), which is the case of a lower temperature and a solvent with higher viscosity, we found an agreement with experimental data within 20% of the relative error, which was the worst scenario in all the test calculations presented here. Simulations around 298 K and in solvents with a similar viscosity showed an agreement within 10% with the experiments. Table 3. Experimental and calculated relaxation parameters for system GGM. The optimal R eff and the percentage deviations from the experimental T 1 , T 2 and the NOE (e (T 1 ), e (T 2 ), e (NOE), respectively) are also reported.   Table 4. Experimental and calculated relaxation parameters for system TRI. The optimal R eff and the percentage deviations from the experimental T 1 , T 2 and the NOE (e (T 1 ), e (T 2 ), e (NOE), respectively) are also reported. The relaxation data of the penta-saccharide LNF were calculated on five probes located on the five different sugar rings. At fixed temperature and viscosity conditions, NMR relaxation depends on local geometry (i.e., how the C-H probes are oriented with respect to the diffusion tensor principal axes) and on the conformational free energy. In [38], the authors studied the effect of the shape of the R2R Potential Energy Surface (PES) along the Ψ angle on T 1 , T 2 , and the NOE. From MD simulations, it was found that the PES was bistable. The calculation of the NMR relaxation data with the harmonic approximation around the most important minimum led to a 10% error with respect to the calculation done with the bistable potential. The reader is encouraged to inspect Figure 5 of [31]. It reported the 2D PESs along the four (Φ, Ψ) couples of dihedral angles connecting the different sugar units. The PES along the angles connecting rings C and D was bistable, while the other three surfaces could be considered, as first approximation, harmonic. A one-to-one mapping between the probe position and the conformational free energy was not possible.
Globally, the approximations done in the SFB model implied that the estimation of a few NMR relaxation data was outside the experimental error (e.g., T 1 for the A and the E rings), and the overall agreement was 5-10% worse than the agreement found with the diffusive chain stochastic model applied in the past [31]. Table 5. Experimental and calculated relaxation parameters for system LNF. The optimal R eff and the percentage deviations from the experimental T 1 , T 2 and the NOE (e (T 1 ), e (T 2 ), e (NOE), respectively) are also reported.  The sensitivity of the method at different spectrometer frequencies can be also discussed. Even if, again, a trend cannot be strictly highlighted (since there were many factors playing at the same time: geometry, PES, temperature/viscosity, experimental setup), the simulations at lower frequencies tended to be more accurate than those at higher frequencies. This observation can be rationalized, partially, as follows: by increasing the Larmor frequency, NMR relaxation data were more affected by the tails of the spectral densities, which were in turn more sensitive to the fast-relaxing processes, i.e., internal motions described by the simplified harmonic PES in the SFB model. Table 6. Experimental and calculated relaxation parameters for system GCY. The optimal R eff and the percentage deviations from the experimental T 1 , T 2 and the NOE (e (T 1 ), e (T 2 ), e (NOE), respectively) are also reported.
Probe: 13  Overall, the SFB model tended to perform better at higher temperatures, lower viscosities and lower spectrometer frequencies and for molecules with a limited internal flexibility. However, our purpose here was to propose the SFB approach as a general tool, with minimal parametrization, capable of interpreting diverse experimental observations without resorting to ad hoc hypotheses concerning the molecular geometry, internal PES, dissipative properties and so on. The cases presented in this study showed that in the best conditions, relative errors within 5% were found, which usually were compatible with experimental errors. The average percentage deviation over all calculations was 8.5, 5.8 and 7.7 for T 1 , T 2 and the NOE, respectively. The maximum percentage deviation was 22.4, 13.6 and 18.5, respectively. Such results are significant, if one considers the relative drastic approximations included in the SFB model, confirming-at least for this class of systems-its performative capabilities, which are amenable to be considerably improved by lifting some approximations, like the harmonic nature of the internal PES and the upgrade of the estimates of the friction tensor to include hydrophilicity/hydrophobicity effects. The latter notation was based on the observed estimates of the only free parameter of the model, the effective radius R eff . The optimal R eff value was found in the range 1.6-2.2 Å for all systems except LNF, for which a higher value of 3.2 Å was obtained, which could be due to the molecule being particularly hydrophilic. This, in turn, implies that the molecular dimensions should be increased to take into account a layer of water surrounding the penta-saccharide.

Discussion
The results reported in the previous section showed that the SFB model reproduced the observed relaxation times and the NOEs of the set of oligosaccharides reported here, with an average accuracy of 5-10%, thus proving the ability of capturing the long-range dynamics of these systems. The study took into account molecules of increasing size, and there was no significant drift in the performance of the model with increasing molecular dimensions. Such an observation is promising, suggesting that the model could profitably be employed for large macromolecules, provided they can be still described as semi-flexible objects, i.e., molecules that mainly fluctuate about a minimum free energy structure (e.g., globular proteins).
The agreement with experimental data was in most cases within 10% relative error. In some cases, higher errors up to 22% were observed. Such discrepancies were expected since free energy profiles along sugars' Φ, Ψ tetrahedral angles exhibited bistable energy profiles, while only harmonic energy profiles were used in SALEM. In the last case analysed here, GCY, as discussed in [32], the rotation of the hydroxymethyl group with respect to the sugar ring described by the torsional angle θ-see Figure 7-featured two energy minima at θ ≈ −70 and θ ≈ +50, with the former conformation being the predominant one. The Tinker energy minimization led to a structure with six over eight units in the predominant conformation, and the results shown above were those obtained running SALEM on one of these six probes.
Still, the present general purpose SFB model showed a good performance if compared with straightforward results obtained via molecular dynamics simulations, especially considering that O(µs) trajectories were required in the latter case [28,39]. As mentioned before, the only free parameter here was R e f f , which was adapted case by case, but as a rule of thumb, a starting value of 2 Å usually provided good agreement with the experiment.

Perspectives
The main purpose of this paper was to validate, for a class of similar molecular systems for which we have directly controllable published NMR relaxation data, a general stochastic model based on a minimal amount of assumed phenomenological information. In our previous work [18,19], we formulated a systematic approach to describe the dynamics of a non-rigid molecule, based on elaborations from fundamental classical and statistical mechanics, in the form of a family of multidimensional Fokker-Planck operators for the probability density of internal and external degrees of freedom, retaining inertial effects and dissipation. The SFB model is the simplest implementation of this general methodology [19]. This approach seemed particularly relevant for the description of large molecular objects, such as proteins, which represent the main domain of application in the authors' perspective. The method provides a physically sound framework and is amenable to an efficient treatment at a modest computational price. We are currently working along three possible lines of development: (i) first of all, we are applying the present implementation of the SFB approach for interpreting NMR relaxation data of medium-sized folded proteins in solution, without additional improvement, barring a more accurate evaluation of the internal PES; (ii) at the same time, we are also exploring the effects of including large amplitude motions in the SFB model to account for locally more mobile regions; finally, (iii) we are streamlining SALEM, to make its usage as clear as possible with intuitive interfaces and instructions, to release it as a free tool for the general community.

Conflicts of Interest:
The authors declare no conflict of interest.