Characterizing Aciniform Silk Repetitive Domain Backbone Dynamics and Hydrodynamic Modularity

Spider aciniform (wrapping) silk is a remarkable fibrillar biomaterial with outstanding mechanical properties. It is a modular protein consisting, in Argiope trifasciata, of a core repetitive domain of 200 amino acid units (W units). In solution, the W units comprise a globular folded core, with five α-helices, and disordered tails that are linked to form a ~63-residue intrinsically disordered linker in concatemers. Herein, we present nuclear magnetic resonance (NMR) spectroscopy-based 15N spin relaxation analysis, allowing characterization of backbone dynamics as a function of residue on the ps–ns timescale in the context of the single W unit (W1) and the two unit concatemer (W2). Unambiguous mapping of backbone dynamics throughout W2 was made possible by segmental NMR active isotope-enrichment through split intein-mediated trans-splicing. Spectral density mapping for W1 and W2 reveals a striking disparity in dynamics between the folded core and the disordered linker and tail regions. These data are also consistent with rotational diffusion behaviour where each globular domain tumbles almost independently of its neighbour. At a localized level, helix 5 exhibits elevated high frequency dynamics relative to the proximal helix 4, supporting a model of fibrillogenesis where this helix unfolds as part of the transition to a mixed α-helix/β-sheet fibre.


Introduction
Spider aciniform (or wrapping) silk is the toughest type of silk and is a remarkable biomaterial with outstanding mechanical properties [1]. Spider silk proteins (spidroins) and silkworm silk proteins (fibroins) share a general architecture of a relatively long repetitive domain, comprising a concatenated series of repetitive units or sequence motifs, flanked by much shorter non-repetitive N- and C-terminal domains [2,3]. Aciniform spidroin (AcSp1) is the primary constituent of wrapping silk. In Argiope trifasciata, it is a modular protein containing at least 14 identical concatenated repeats of a 200 amino acid unit (termed here "W" units, from wrapping) [4].
Modular protein architecture, in which discrete structured modules are connected together by linkers that range from rigid to highly flexible, is common in nature [5][6][7]. The structures of individual domains are frequently studied in isolation by nuclear magnetic resonance (NMR) spectroscopy and/or X-ray diffraction, then placed in a multi-domain context through NMR spectroscopy [8,9], small angle X-ray scattering [10], or cryo-electron microscopy [11], allowing delineation of structure and dynamics in the context of the larger assembly. The orientation of domains relative to one another, their dynamics, and the relation between domains are crucial for expanding our understanding of their function [9,12,13].
Many multi-domain proteins comprise discrete, differing units (e.g., scaffolding units such as the SH2, SH3, PDZ, or PTB domains) connected by linkers with both widespread pathophysiological consequences [14] and potential for recombination for synthetic biology purposes [15]. From an NMR spectroscopy standpoint, di-ubiquitin [16,17] and GB1 [18] have been extensively studied as model multi-domain proteins. In contrast to these proteins, where discrete modules impart individual function, many fibrous proteins including spider silks employ repetitive modules [2]. This adds unique difficulties for structural biology, where the repetitive nature of these proteins leads to challenges in unambiguously tracking individual modules.
We recently employed NMR spectroscopy to determine the solution-state structure of the recombinant W unit of A. trifasciata AcSp1 in the context of both the single unit (W 1 ) and the two-unit concatemer (W 2 ) [19]. W 1 consists of a predominantly helical globular domain comprising five defined α-helices, with an unstructured ~12-residue N-terminal tail and a ~50-residue C-terminal tail. In W 2 , the tails of neighbouring units become a linker that retains intrinsically disordered behaviour while the globular domains are identically structured, giving rise to a beads-on-a-string type conformation.
Fibres cannot be formed from solutions of W 1 , but manual drawing of fibres is readily possible from solutions containing W 2 , W 3 , or W 4 concatemers [20], including from solution-state NMR samples of W 2 [19]. During fibre formation, AcSp1 undergoes a partial conversion from α-helical to β-sheet structuring [21], putatively seeded at helix 5 in the W unit [19,22]. This transition is recapitulated in recombinant W 2 between the soluble and fibrous forms [19]. The 200 residue W unit from A. trifasciata differs significantly from other spider silks, such as the extensively studied major and minor ampullate silks, where short repetitive motifs such as A n , (GA) n , GGX or GPGXX dominate the protein sequence [1,2]. Silkworm fibroin is also dominated by short motifs (e.g., GAGAGS and GAGAGY) in its repetitive domain [3,23]. Echoing these differences in primary structuring, the retention of α-helical character in aciniform silk fibre is distinct from both ampullate silks and silkworm silk, where fibres are completely depleted of α-helical character [1,2,21]. Hence, although recombinant W proteins are much shorter than native aciniform silk proteins and lack non-repetitive N-and C-terminal domains, with reduced strength and extensibility relative to native silk that scale approximately with the number of W units [24], the structural behaviour of these proteins is consistent with the native protein.
Characterization of dynamics within proteins at the atomic-level is possible at a variety of time-scales using NMR spectroscopy [25,26]. Measurement of 15 N nuclear spin relaxation properties, namely the longitudinal and transverse relaxation times (T 1 and T 2 , respectively) and cross-relaxation through the heteronuclear 1 H-15 N nuclear Overhauser effect ([ 1 H]-15 N NOE) in particular, allows for characterization of small-amplitude, high-frequency motions as a function of position along the polypeptide backbone together with delineation of regions experiencing slower dynamic fluctuations [27]. Relation of these spin relaxation parameters to global and local motion is often carried out through the model-free [28] or extended model-free [29] approaches. In instances where a single global correlation time is not suitable, such as for proteins containing large unstructured regions, the reduced spectral density mapping approach provides residue-by-residue characterization of dynamics without a reliance on global rotational diffusion parameters [30].
These dynamics characterization methods rely upon unambiguous distinction of 1 H-15 N cross-peaks in 2D spectra, a situation impossible in concatemeric AcSp1 repeat units without some means to distinguish one W unit from the other. The technique of intein-mediated trans-splicing provides such a means, where individual W 1 units can be selectively labelled with NMR active isotopes and investigated, in the present work, in the context of the larger fibre-forming W 2 protein.
Inteins are naturally-occurring protein segments that excise themselves from a polypeptide and ligate the flanking protein segments together with a native peptide bond. This reaction will occur provided that a nucleophilic Ser or Cys is present immediately C-terminal to the intein and that the intein-protein fragment pair are stable in solution and allow the ligation to occur [31,32].
There have now been many demonstrated applications of inteins in structural biology and biotechnology [32,33], including protein cyclization [34][35][36]; protein switches [37][38][39][40]; in vivo protein engineering and probe attachment for biophysical studies [41][42][43]; and, importantly for the present work, segmental isotope enrichment [19,31,44-48]. In our previous structural studies [19], we performed segmental-labelling using split intein trans-splicing [49,50], whereby either the first (W 2-1 ) or second (W 2-2 ) W unit in W 2 was enriched with NMR-active 13 C and/or 15 N nuclei while the other W unit was at natural abundance. Although we demonstrated differential dynamics between the globular core and linker/tail regions through variation in the observed heteronuclear [ 1 H]-15 N NOE, a more in-depth analysis of backbone dynamics is necessary to compare and contrast the behaviour of isolated W units vs. concatemers and to provide insight into more subtle variations in dynamics within the W unit.
Herein, new insight has been gained into the modularity, global conformation, and localized backbone dynamics behaviour of spider wrapping silk concatemers through characterization of ps-ns timescale NMR relaxation behaviour of each W unit in W 2 relative to one another and to W 1 . In W 1 and in each W unit of W 2 , reduced spectral density mapping is consistent with a structured five α-helix globular core having elevated dynamics in helix 5 with intrinsically disordered N- and C-terminal tails and, in W 2 , linker. Nuclear spin relaxation data are consistent with rotational diffusion by a compact globular core in W 1 and with modular tumbling of each W unit in W 2 almost independently of the remainder of the protein. Beyond ramifications for AcSp1 behaviour, the methods presented herein will serve as a useful atomic-level model for further characterization of modular proteins in solution.

Nuclear Spin Relaxation Parameters
Longitudinal (T 1 ) and transverse (T 2 ) relaxation time constants were measured at 16.4 T on a residue-by-residue basis [27] for both the monomeric (W 1 ) and concatemeric (W 2 ) states of recombinant AcSp1 (Figure 1). To facilitate direct comparison, and because these data are integral for the subsequent analysis, the [ 1 H]-15 N data that we previously reported [19] are also plotted in Figure 1. A segregation in spin relaxation behaviour is clear between (i) the folded domain (residues 12-149 for a given W unit, with secondary structure elements shown in linear form in Figure 1) and (ii) the N- and C-terminal tails of W 1 and W 2 and the linker spanning W 2-1 to W 2-2 in W 2 . Namely, T 1 and the [ 1 H]-15 N NOE are elevated through the folded core of a given W unit and decrease in the linker and tail regions while T 2 exhibits the opposite trend.
For direct overall comparison, mean values of T 1 and T 2 were determined for four subdivided regions of the W unit chosen based upon our previous structural and 19 F-NMR studies: the globular core (residues 12-149 and, in W 2 , 212-349), helix 5 within the core (residues 135-149 and, in W 2 , 335-349), tails (W 1 : residues 1-11 and 150-199; W 2 : residues 1-11 and 350-400), and the linker (residues 150-211 in W 2 only) ( Figure 2). Qualitatively, T 1 is larger while T 2 is smaller in the globular core for each W 2 subunit relative to W 1 . The observed behaviour is consistent with the 15 N relaxation behaviour to be expected on the basis of more rapidly (W 1 ) vs. slowly (W 2 ) tumbling molecules [27]. Notably, T 1 values for W 1 , W 2-1 , and W 2-2 are significantly different for the core (p-value < 0.0001) and helix 5 (p-value < 0.01), while the tails are relatively similar (albeit with large variance) regardless of protein size.
In examining overall T 2 behaviour (Figure 2), the most striking feature is the large difference in mean values between the globular core and the disordered tails/linker, with a significant elevation in T 2 for the tails or linker in all W units. As with T 1 , W 1 exhibits a significant difference in behaviour from both W 2-1 and W 2-2 (p-value < 0.0001), with an elevated T 2 relative to either unit in W 2 . Although the significance is low (p-value < 0.1), helix 5 follows the same qualitative trend. Unlike with T 1 , however, W 2-1 and W 2-2 do not exhibit significant differences in T 2 relative to each other. Although the tails would be expected to be less encumbered and more dynamic overall, there is no difference between the mean values observed for the tails and the linker.
Analysis and error propagation were carried out using Mathematica notebooks from Leo Spyracopoulos [51]. The secondary structuring of the W unit is depicted on the basis of PDB entry 2MU3 [19], with grey shading for each helical segment.

Reduced Spectral Density Mapping
The values of the reduced spectral density at J(0), J(ωN), and J(0.87ωH) were calculated independently for each W unit. All residues with T1 and T2 fits that met the goodness-of-fit criterion (χ 2 < critical χ 2 [51]) for a given dataset at 16.4 T were employed for spectral density determination, giving 145, 153, and 131 residues, respectively, for W1, W2-1, and W2-2. J(0.87ωH) and J(0) both strongly demonstrate disparate dynamics between the folded core and the tails/linker, mirroring the behaviour of the individual T1, T2, and [ 1 H]-15 N NOE parameters (Figures 3 and 4). J(ωN), conversely, remains relatively uniform throughout all regions of a given unit of W1 or W2, without significant differences between globular and linker domains. Reflecting the relatively strong dependence of J(ωN) on T1, an expected decrease in J(ωN) is observed from W1, W2-1, to W2-2 for the globular core (p-value < 0.0001), with helix 5 following suit (p-value < 0.01), while those for the tails and linker are not significantly different between the W units.
J(0.87ωH), which is very sensitive to differences in motion in the high frequency regime [52], exhibits localized increases in W1 and in each W unit of W2 around residues 36, 63, 80, 121, and 132 ( Figure 3), correlating directly to the locations of loops or turns within the W unit [19]. It should be noted, though, that the increases in J(0.87ωH) observed at these locations are not of the same magnitude as those seen for the linker or tails ( Figure 3). These variations are, therefore, likely reflective of regions of the protein experiencing increased dynamics but tumbling with the core of the folded domain rather than behaving as intrinsically-disordered segments.

A statistically significant increase in J(0.87ω H ) is also observed for helix 5 (residues 135-149) relative to helix 4 (residues 101-127) (p-values of 0.0060, 0.0001, and 0.0025 for W 1 , W 2-1 , and W 2-2 , respectively) or to the core (helices 1-4) (p-values of 0.019, 0.0011, and 0.0085 for W 1 , W 2-1 , and W 2-2 , respectively). Helix 1, like helix 5, is directly connected to a disordered tail or linker segment [19]; therefore, helix 1 would be expected to exhibit similarly elevated dynamics if proximity to a tail or linker were the only factor at play. Instead, helix 1 behaves more like a core helix, as demonstrated by a lack of significant differences in J(0.87ω H ) for helix 1 vs. helix 4 (p-values of 0.11, 0.046, and 0.126 for W 1 , W 2-1 , and W 2-2 , respectively). There is also a qualitative difference between J(0.87ω H ) of helix 5 in the W units, with W 1 > W 2-1 > W 2-2 ( Figure 4) following the same qualitative trend as the core as a whole.
Like J(ω N ), the behaviour at J(0) differs between W 1 , W 2-1 , and W 2-2 in the globular core (p-value < 0.0001) and helix 5 (p-value < 0.001), increasing from W 1 to W 2-1 to W 2-2 and opposite in trend to the observed decrease in J(ω N ). W 1 has the lowest mean value of J(0) over the folded core (4.26 ± 0.07), followed by W 2-1 (5.02 ± 0.11), and then W 2-2 (5.67 ± 0.37) (Figure 4). On average, the tail/linker regions do differ between W 1 , W 2-1 , and W 2-2 but care must be taken during interpretation given that the tail group for W 2-1 has seven values and the W 2-2 linker has six values. When the tails and linker are grouped together, there is no statistical difference between the W units.
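The mapping from measured relaxation parameters to the three reduced spectral density values can be sketched as follows. This is an illustrative sketch only, not the analysis code used for this work: it assumes the commonly used reduced spectral density expressions, an N-H bond length of 1.02 Å, and a 15 N chemical shift anisotropy of -160 ppm. The function name and all constants are assumptions introduced for illustration.

```python
import math

# Physical constants (SI units)
MU0 = 4e-7 * math.pi          # vacuum permeability
H_PLANCK = 6.62607015e-34     # Planck constant
GAMMA_H = 2.6752218744e8      # 1H gyromagnetic ratio, rad s^-1 T^-1
GAMMA_N = -2.7126e7           # 15N gyromagnetic ratio, rad s^-1 T^-1

# Assumed geometry and CSA (not taken from the source text)
R_NH = 1.02e-10               # N-H bond length in metres
CSA_N = -160e-6               # 15N chemical shift anisotropy


def reduced_spectral_density(t1, t2, noe, b0=16.4):
    """Return (J(0), J(wN), J(0.87wH)) in s/rad from T1 (s), T2 (s), and NOE."""
    r1, r2 = 1.0 / t1, 1.0 / t2
    # Dipolar and CSA coupling constants
    d = MU0 * H_PLANCK * GAMMA_N * GAMMA_H / (8 * math.pi**2 * R_NH**3)
    c = GAMMA_N * b0 * CSA_N / math.sqrt(3)
    # Heteronuclear cross-relaxation rate from the steady-state NOE
    sigma = (noe - 1.0) * r1 * GAMMA_N / GAMMA_H
    denom = 3 * d**2 + 4 * c**2
    j_wh = 4 * sigma / (5 * d**2)
    j_wn = (4 * r1 - 5 * sigma) / denom
    j_0 = (6 * r2 - 3 * r1 - 2.72 * sigma) / denom
    return j_0, j_wn, j_wh
```

Because no global correlation time enters these expressions, the same function can be applied residue by residue to both folded and disordered segments.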

Analysis of Rotational Diffusion
Spin relaxation data were used to compare the suitability of isotropic, axially symmetric, or fully anisotropic rotational diffusion tensors for W 1 , W 2-1 , W 2-2 , and W 2 based upon all residues with [ 1 H]-15 N NOE > 0.65 [53] using the software ROTDIF [54]. In each instance, an axially symmetric rotational diffusion tensor provided the best fit to the data (Table 1); notably, the degree of anisotropy observed was minimal given that ~90% of a dataset of 878 monomeric proteins exhibited an anisotropy of 1.17 or higher [54]. W 1 spin relaxation data were best fit with a prolate rotational diffusion tensor for all 20 members of the NMR structural ensemble (PDB entry 2MU3 [19]), with an anisotropy range of 1.09-1.18. Conversely, W 2-2 was uniformly oblate (anisotropy 0.90-0.95 with W 1 ensemble) while W 2-1 varied depending upon the ensemble member employed, with anisotropy of 1.05-1.16 (16 members) or 0.88-0.95 (four members) observed. The fitting behaviour for W 2-1 is consistent with the small degree of anisotropy observed, with minimal deviation from an isotropic fit. Additionally, diffusion tensors were modelled for a combined W 2-1 and W 2-2 spin relaxation data set using the ensemble of 20 inferred W 2 structures [19]. In this instance, anisotropy remained minimal and most ensemble members led to oblate fits (anisotropy 0.8-0.92 for 18 members; 1.12-1.13 for two members). Goodness-of-fit, as judged by χ 2 per degrees of freedom, was comparable in all instances (Table 1). Ensemble-averaged values of τ c based upon anisotropic diffusion tensors demonstrated only modest increases from 7.9 ns for W 1 to 9.0 ns for W 2-1 to 9.6 ns for W 2-2 . To place this behaviour in context, a variety of rotational correlation time (τ c ) estimations were compared (Table 2).
Using Stokes' law (Equation (5)), crude values of τ c were estimated both according to an assumption of spherical shape (Equation (6)) and according to our previously reported hydrodynamic radii determined by diffusion ordered NMR spectroscopy (DOSY) and verified by dynamic light scattering [19]. Our NMR-derived W 1 structural ensemble and the inferred W 2 structural ensemble were also used for detailed hydrodynamics calculations in HYDROPRO [55] to estimate τ c . To test the effect of a more compact globular tumbling unit, τ c values were also determined for the W 1 globular core (i.e., excluding the N- and C-terminal tails) with and without the inclusion of the more dynamic (Figure 4) and less stable helix 5 [19,22]. It should also be noted that the viscosities employed for the Stokes' law and HYDROPRO hydrodynamics calculations (Table 2) were experimentally determined, rather than estimated. This determination was carried out through DOSY experiments with use of an internal dioxane standard [56]. Estimated τ c values for W 1 were most consistent with the experimentally-observed behaviour for the most compact estimates of its conformation. W 2 , conversely, was estimated, regardless of the hydrodynamic model employed, to tumble much more slowly than was experimentally observed for each globular unit in the concatemer.
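For readers less familiar with axially symmetric diffusion tensors, the relation between the tensor components, the anisotropy, and the overall correlation time can be sketched as below. This is a hedged illustration assuming the conventional definitions (anisotropy as the ratio of parallel to perpendicular components, and τ c as the inverse of twice the tensor trace); the function name and the example values are hypothetical and are not output from ROTDIF.

```python
def axial_tensor_summary(d_par, d_perp):
    """Summarize an axially symmetric rotational diffusion tensor.

    d_par, d_perp: diffusion coefficients (s^-1) parallel and
    perpendicular to the unique axis.
    Returns (anisotropy, tau_c in seconds, shape label).
    """
    anisotropy = d_par / d_perp
    # Trace of the tensor is d_par + 2*d_perp; tau_c = 1 / (2 * trace)
    tau_c = 1.0 / (2.0 * (d_par + 2.0 * d_perp))
    if anisotropy > 1.0:
        shape = "prolate"
    elif anisotropy < 1.0:
        shape = "oblate"
    else:
        shape = "isotropic"
    return anisotropy, tau_c, shape


# Hypothetical tensor with mild anisotropy, for illustration only
print(axial_tensor_summary(2.2e7, 2.0e7))
```

Under these definitions, anisotropy values slightly above or below 1 correspond to the prolate and oblate fits described above.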
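The crude spherical estimate can be illustrated with a minimal sketch. Equations (5) and (6) are not reproduced in this section, so the code assumes the usual Stokes-Einstein-Debye form for a sphere and a radius computed from molecular mass with a partial specific volume of 0.73 cm^3/g plus a hydration shell; the partial specific volume and the function names are assumptions, not values from the text.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
N_A = 6.02214076e23     # Avogadro constant, 1/mol


def sphere_radius_m(mass_da, hydration_shell_angstrom=1.6,
                    partial_specific_volume=0.73):
    """Radius (m) of an equivalent sphere for a protein of given mass (Da).

    partial_specific_volume is in cm^3/g (assumed typical protein value).
    """
    volume_m3 = partial_specific_volume * mass_da / N_A * 1e-6  # cm^3 -> m^3
    radius = (3.0 * volume_m3 / (4.0 * math.pi)) ** (1.0 / 3.0)
    return radius + hydration_shell_angstrom * 1e-10


def stokes_tau_c(radius_m, viscosity_pa_s, temperature_k=303.15):
    """Rotational correlation time (s) of a rigid sphere from Stokes' law."""
    return (4.0 * math.pi * viscosity_pa_s * radius_m**3
            / (3.0 * K_B * temperature_k))
```

Viscosities quoted in cP convert to Pa·s by multiplying by 1e-3. Because τ c scales with the cube of the radius, the choice of hydration shell thickness has a large effect on the estimate.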
The ratio of T 1 to T 2 can also be related to τ c for the tumbling of a macromolecule in solution under the qualification that the 1 H-15 N spin pair in question does not experience significant rapid internal motion [27]. Extending this treatment, measured backbone relaxation time constants may be modified (giving T 1 ' and T 2 ', Equations (1) and (2), respectively) to remove high frequency spectral density contributions [17]. The ratio of these modified relaxation time constants, (Equation (3)), is relatively insensitive to localized variations in dipolar coupling and 15 N chemical shift anisotropy and, for a protein core, primarily dependent on overall tumbling. Direct comparison of the estimated τ c obtained from average values of T 1 /T 2 or T 1 '/T 2 ' on the basis of the 15 N relaxation analysis of Kay et al. [27] neglecting high-frequency terms (Equation (7)) demonstrates excellent agreement with the far more rigorous ROTDIF calculation. At a more global level, discrete differences in 1/ are apparent between the globular core vs. tail and linker regions of W 1 , W 2-1 , and W 2-2 (Figure 5), with the core of W 1 being significantly decreased in 1/ (p-value < 0.0001) relative to W 2-1 or W 2-2 and mirroring of this behaviour by helix 5 (p-value < 0.01) (Figure 5B). Following the same trend as J(0) (Figure 4), W 1 has a statistically significantly lower mean (p-value < 0.0001) 1/ in the core compared to W 2-1 ; W 2-1 is again significantly lower than W 2-2 (p-value < 0.0001), reflecting differences in rotational diffusion between W 1 and each of the W units in W 2 (Figure 5; Table 1). Additionally, regardless of being in a tail or linker, the corresponding residues in a given W unit in W 2 (i.e., residues 1-11 relative to 201-211 and 150-200 relative to 350-400) demonstrate very similar mean 1/ values. The variance accompanying 1/ is, however, too great to draw significance.
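The estimation of τ c from a relaxation time ratio can be sketched as follows. Equation (7) is not reproduced in this section; the code assumes the widely used approximation that follows from neglecting high-frequency spectral density terms for a rigid, isotropically tumbling molecule, τ c ≈ sqrt(6 T 1 /T 2 - 7) / (2 ω N ). The function name and the default field are illustrative assumptions.

```python
import math

GAMMA_N_ABS = 2.7126e7  # magnitude of the 15N gyromagnetic ratio, rad s^-1 T^-1


def tau_c_from_t1_t2(t1, t2, b0_tesla=16.4):
    """Estimate tau_c (s) from T1 and T2 (same units), neglecting
    high-frequency spectral density contributions."""
    omega_n = GAMMA_N_ABS * b0_tesla   # 15N Larmor frequency, rad/s
    ratio = t1 / t2
    if 6.0 * ratio <= 7.0:
        raise ValueError("T1/T2 too small for this approximation")
    return math.sqrt(6.0 * ratio - 7.0) / (2.0 * omega_n)
```

The same function applies to the modified time constants T 1 ' and T 2 ', since only their ratio enters. The approximation is appropriate only for residues without significant rapid internal motion, which is why it is applied to the folded core rather than to the tails or linker.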
(4)); 2 Calculated using Stokes' law (Equation (5)) for a 100% 15 N/ 13 C-enriched protein mass using a hydration shell of either 1.6 Å (lower estimate) or 3.2 Å (upper estimate), assuming spherical shape (Equation (6)); 3 Calculated using Stokes' law (Equation (5)) based upon hydrodynamic radii determined by DOSY [19]; 4 Average ± average deviation of HYDROPRO [55] predicted τc over 20-member ensembles of structures of W1 or W2 [19], or over globular core of W1; 5 Determined based upon indicated relaxation time constant ratio using Equation (7); 6 Average ± average deviation over all 20 structural ensemble members determined through axially-symmetric diffusion tensor ROTDIF in identical manner to Table 1; 7 Entire W1 structure; 8 Globular core (residues 12-149); and, 9 Globular core excluding helix 5 (residues 12-128).

Discussion
AcSp1 from A. trifasciata is a modular protein composed primarily of a repetitive domain of concatenated 200 amino acid W units [4]. We recently demonstrated that the W unit is composed of a well-folded globular domain of ~138 residues connected to adjacent globular domains by intrinsically-disordered linkers ~62 residues in length [19]. The functional necessity of this folded domain is further implied by the fact that it appears to be highly conserved (albeit considering a limited number of sequenced species), while the linker may vary both in length and sequence [57].
The modularity of AcSp1 was established through direct backbone chemical shift comparison between the monomeric (W 1 ) and concatemeric (W 2 ) states of AcSp1. Specifically, the chemical shifts of W 2 are remarkably similar to those of W 1 with the exception of those in the linker immediately proximal to the covalent W unit linkage [19]. Beyond conservation of chemical shifts, heteronuclear [ 1 H]-15 N NOE data recorded at 16.4 T (Figure 1) also uphold the conformational independence of W units, given that W 1 and each of the W units in W 2 exhibit very similar NOE enhancement factor patterns as a function of position within the W unit [19]. In each case, higher NOE enhancement factors are exhibited in the folded domain (residues 12-149, numbering relative to each W unit) and lower or negative enhancements in the disordered terminal or linker regions (residues 1-11, 150-200) (Figure 1). The effect of concatemeric linking of W units is observed in the vicinity of the covalent linkage of the W units (residues ~190 to 210 of W 2 ) through a less negative NOE enhancement relative to the free N- and C-terminal tails of W 1 and of W 2-1 and W 2-2 , respectively.
Our previous studies showed clear modularity in the W unit both in terms of structuring of the globular domain and the intrinsic disorder of the linker. The 15 N spin relaxation measurements and reduced spectral density mapping detailed herein demonstrate that this modularity clearly extends beyond structuring and into the dynamic behaviour along the polypeptide backbone. Segmental isotope-labelling mediated by split intein trans-splicing allowed us to track this behaviour unambiguously along the length of W 2 . It should be noted that the relaxation analysis methods employed herein are limited to probing motions in the ps-ns regime [26], but that the trans-splicing methodology, itself, can be directly applied to other NMR-based methods allowing characterization of longer time-scale motions.
Direct comparison of the N-and C-terminal W units in the concatemer was, thus, possible alongside a comparison of W 1 to W 2 . T 1 , T 2 , [ 1 H]-15 N NOE, and 1/ , in each case, explicitly delineate the globular vs. tail or linker domains (Figures 1, 2 and 5). For the most part, the globular domain exhibits uniform spin relaxation behaviour, with slight localized decreases in the [ 1 H]-15 N NOE delineating the secondary structure elements centred at residue 35-37 (between helix 1 and the converged, predominantly helical region over residues , residue 61 (between residues 40-60 and helix 2), residues 79-80 (between helices 2 and 3), residues 91-93 (near the helix 3 C-terminus), and residues 128-132 (between helices 4 and 5).
Given the existence of discrete disordered and globular domains in the W unit, we employed the 15 N reduced spectral density mapping approach [30] to analyze backbone dynamics, rather than the model-free [28] or extended model-free [29] formalism. This approach alleviates the requirement for defining specific motional modes and their independence or lack thereof, with demonstrated suitability for proteins of mixed ordered and disordered segments [52]. The resulting values of the spectral density function at three frequencies, J(0), J(ω N ), and J(0.87ω H ), derived from relaxation parameters T 1 , T 2 , and [ 1 H]-15 N NOE provide complementary information along the peptide backbone (Figures 3 and 4) and unequivocally support our original model of W 2 , based on W 1 restraints [19], with the tail/linker regions and the globular domain being noticeably segregated.
J(ω N ) is much less sensitive to internal motions at the ps-ns time scale in comparison to J(0) and J(0.87ω H ) and displays very little variability between the folded domain and the linker. The residues in the folded domain have large J(0) and small J(0.87ω H ), while the linker and tail regions display the opposite trend (Figures 3 and 4). This is consistent with a situation where the linker and tails experience motion over a wider range of frequencies relative to the globular domain, as would be expected for an intrinsically disordered domain. Noticeably, the mean J(0) and J(ω N ) values over the globular domain differ significantly between W 1 , W 2-1 , and W 2-2 , increasing or decreasing, respectively, from W 1 to W 2-1 to W 2-2 . This behaviour is consistent with progressively slower tumbling (i.e., an increase in rotational correlation time) from W 1 to W 2-1 to W 2-2 .
Based upon both heteronuclear NMR [19] and 19 F-NMR [22], helix 5 (residues 135-149) and the portion of the globular domain in contact with it (falling in proximity to residue 36) are less stable than the remainder of the protein. Chaotropic denaturation or treatment with the detergent dodecylphosphocholine leads to helix 5 destabilization and a concomitant structural rearrangement in the globular core of the W unit [19,22]. Notably, therefore, the spectral density in helix 5 deviates from the remainder of the globular domain. Direct comparison to the proximal helix 4 shows elevated spectral density at J(0.87ω H ). J(0) is also qualitatively lower for helix 5 than for the remainder of the globular core (helices 1-4) (Figure 3). This behaviour, as a whole, is consistent with a greater sampling of high-frequency motion in this region of the W unit regardless of whether it is in an isolated W unit or a concatemeric construct.
Helix 5 falls immediately N-terminal to the intrinsically disordered linker. Our working hypothesis is that decompaction of the W unit occurs through loss of interaction of helix 5 with the core [22] followed by denaturation [19]. This would, in turn, greatly favour protein-protein entangling and interaction, inducing subsequent β-sheet formation during fibrillogenesis. Backbone-level dynamics are consistent with the distinct behaviour of helix 5 relative to the remainder of the globular domain and unambiguously demonstrate a propensity for increased high frequency motion.
Before considering rotational tumbling behaviour in more detail, it should be noted that the viscosities measured for the W 1 and W 2 samples in NMR buffer (Table 2) are significantly higher than the ~0.82 cP that is derived on the basis of a linear combination of the expected [58] H 2 O and D 2 O viscosities at 30 °C. The source of this elevated viscosity is not fully clear. Given that W 1 will, for example, spontaneously form nanoparticle (or micellar) structures in aqueous solution [59], supramolecular assembly was certainly a distinct possibility. Neither spin relaxation behaviour (Table 2) nor translational diffusion observed by DOSY [19] is consistent with long-lived entanglement of the proteins or with stable oligomer formation. Were entanglement, oligomerization, or nanoparticle/micelle formation happening in the bulk of the sample, substantially slower tumbling and diffusion than observed would be expected. The fact that the vast majority of protein in solution is still fully observable on the basis of both spin relaxation behaviour and signal intensity by heteronuclear ([19] and herein) and 19 F [22] NMR implies that, if intermolecular entangling and/or supramolecular assembly are occurring and increasing solution viscosity, this involves only a small fraction of the total protein.
Tumbling of W 1 , reflected in the observed τ c of 7.9 ns, is more rapid than would be anticipated strictly on the basis of the W 1 hydrodynamic radius previously determined through DOSY [19] or through hydrodynamics calculations using the W 1 structural ensemble (Table 2). W 1 , instead, exhibits tumbling consistent with a compact spherical particle of the same molecular weight with a half-shell of water. The overestimates in τ c on the basis of overall W 1 shape and dimensions are not surprising, given that the presence of intrinsically disordered domains in a protein leads to a general overestimation of τ c by methods (such as HYDROPRO) that employ an assumption of rigid behaviour [60,61]. Truncation of the W 1 structure either to the globular core or to the core without helix 5 leads to improved agreement between the inferred and observed τ c values, with the helix 1-4 globular core leading to a predicted τ c of ~8.4 ns. Rotational diffusion is, therefore, most consistent with a compact globular core where helix 5 is not always attached. Translational diffusion, conversely, agrees well with the overall shape of W 1 [19].
Modest increases in τ c , to 9.0 ns for W 2-1 and 9.6 ns for W 2-2 , are observed relative to 7.9 ns for W 1 . These values are ~2/3 of those predicted for a compact sphere, ~1/2 of those predicted on the basis of the DOSY-determined W 2 hydrodynamic radius, and <1/3 of that predicted by HYDROPRO on the basis of the W 2 structural ensemble. This behaviour is also directly reflected in the magnitude of the observed increases in 1/ from W 1 to W 2-1 to W 2-2 (Figure 5). Namely, W 2 does not exhibit anywhere near the expected [17] ~doubling of 1/ relative to W 1 that would be observed if the two W units in W 2 were rigidly tumbling together as a species of double the molecular weight. Following studies using ensemble methods to accurately predict rotational diffusion for molecules containing intrinsically-disordered linkers [61,62], this is instead consistent with mostly decoupled tumbling of each globular domain. The increased τ c of W 2-2 relative to W 2-1 is consistent with greater hydrodynamic friction experienced from the asymmetric nature of the tails, with an ~11 residue disordered N-terminal tail for W 2-1 vs. an ~50 residue disordered C-terminal tail for W 2-2 .

Sample Preparation
Protein samples were prepared by recombinantly expressing W1 and W2 in Escherichia coli BL21(DE3), following previously-described protocols [19,63]. It should be noted that W1 consists of residues 1-199 of the AcSp1 repeat unit from A. trifasciata while W2-1 and W2-2 each comprise the full 200 amino acid repeat unit, concatenated to form a 400 residue protein. An N-terminal Met is also present in W2 from the initiation codon; for simplicity of comparison between W1 and each unit in W2, the Met is not included in residue numbering. Uniformly 15N-enriched W1 (~0.2 mM), and selectively 15N-enriched W2-1 and W2-2 (~0.2 mM) NMR samples were prepared in sodium acetate buffer (20 mM

Spin-Relaxation NMR Experiments
NMR spin relaxation experiments were carried out at 30 °C on an Avance III NMR spectrometer operating at 16.4 T (Bruker Canada, Milton, ON, Canada) and equipped with a triple-resonance 5 mm indirect detect TCI cryoprobe. Two-dimensional phase-sensitive 1H-15N HSQC experiments were used to measure longitudinal relaxation times (T1; hsqct1etf3gpsi pulse program, Bruker library) and transverse relaxation times (T2; hsqct2etf3gpsi pulse program, Bruker library). All experiments were performed using 16 scans, a recycle delay of 1.5 s for W1 and 1.75 s for W2-1 and W2-2, and spectral widths of 23 and 16 ppm with offsets of 115.5 ppm and at the water frequency (4.705 ppm), respectively, for 15N and 1H. W1 spectra contained 192 × 2048 complex points and W2-1 and W2-2 spectra contained 128 × 2048 complex points for 15N and 1H, respectively. The T1 data were collected using relaxation delays of 50, 100, 250, 500, 750, 1000, 1300, and 1700 ms and the T2 data were collected using relaxation delays of 17, 34, 51, 85, 119, 152, 187, and 238 ms, with a Carr-Purcell-Meiboom-Gill pulse train applied as appropriate for a given relaxation delay during the recycle delay to compensate for heating effects. Heteronuclear [1H]-15N NOEs for backbone 15N nuclei were measured in an interleaved manner as described previously [19]. Briefly, the [1H]-15N NOE measurements were performed using a total of 356 × 4096 complex points with 32 transients for W1 and 256 × 4096 complex points with 32 transients for both W2 domains.

Determination of Spin Relaxation Parameters and Reduced Spectral Density Mapping
Backbone 15N T1, T2, and [1H]-15N NOE as a function of 1H-15N cross-peak position were determined and correlated to our previously assigned chemical shifts (deposited in the Biological Magnetic Resonance Data Bank for W1 (BMRB entry 17899) and W2 (BMRB entry 25197) [19,63]). The 15N T1 and T2 values with associated errors were determined using the Mathematica version 8.0.4 (Wolfram, Champaign, IL, USA) notebook Relaxation Decay, freely available from Leo Spyracopoulos [51]. R1 (R1 (s^-1) = 1/T1) and R2 (R2 (s^-1) = 1/T2) relaxation rates were determined from nonlinear least-squares fits to a two-parameter monoexponential decay. Errors were estimated based on the average spectral noise. The [1H]-15N heteronuclear NOE was measured as the ratio Isat/Iref, where Isat and Iref are the intensities of the peaks in the 1H-15N HSQC spectra with and without proton saturation during the recycle delay, respectively. Non-linear fits were used to minimize the statistical value of χ2. The χ2 goodness-of-fit test per residue was used and compared to the exact critical χ2 determined from 100 Monte Carlo simulations (9.146) for a single residue at a 95% confidence interval. Modified relaxation rates, R1′ and R2′, from which the contributions of high-frequency motions are subtracted, were calculated as:

$$R_1' = R_1\left[1 - 1.249\left|\frac{\gamma_N}{\gamma_H}\right|\left(1 - \mathrm{NOE}\right)\right] \quad (1)$$

and:

$$R_2' = R_2 - 1.079\left|\frac{\gamma_N}{\gamma_H}\right| R_1 \left(1 - \mathrm{NOE}\right) \quad (2)$$

where γN and γH are the gyromagnetic ratios of 15N and 1H, respectively. The ratio of these modified rates, ρ, was calculated as:

$$\rho = \left(\frac{2R_2'}{R_1'} - 1\right)^{-1} \quad (3)$$

Finally, per-residue values of J(0), J(ωN), and J(0.87ωH) were determined through 15N reduced spectral density mapping [30] using the Spectral Density Mathematica notebook [51].
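As an illustration of the reduced spectral density mapping step, the following is a minimal Python sketch of the commonly used closed-form expressions for J(0), J(ωN), and J(0.87ωH) in terms of R1, R2, and the NOE. It is not the Mathematica notebook used in this work. The N-H bond length (1.02 Å) and 15N chemical shift anisotropy (-160 ppm) are assumed typical values, and the function name reduced_spectral_density is hypothetical.

```python
import math

# Physical constants (SI units)
MU0_OVER_4PI = 1e-7        # vacuum permeability / 4*pi (T m / A)
HBAR = 1.054571817e-34     # reduced Planck constant (J s)
GAMMA_H = 2.6752218744e8   # 1H gyromagnetic ratio (rad s^-1 T^-1)
GAMMA_N = -2.7126e7        # 15N gyromagnetic ratio (rad s^-1 T^-1)


def reduced_spectral_density(r1, r2, noe, b0=16.4, r_nh=1.02e-10, csa=-160e-6):
    """Return (J(0), J(wN), J(0.87wH)) in s/rad.

    r1, r2 : relaxation rates in s^-1
    noe    : steady-state [1H]-15N NOE (Isat/Iref)
    b0     : static field in tesla (16.4 T as in the experiments)
    r_nh   : assumed N-H bond length in metres
    csa    : assumed 15N chemical shift anisotropy (dimensionless, ppm * 1e-6)
    """
    omega_n = GAMMA_N * b0
    # Dipolar (d) and CSA (c) coupling constants
    d = MU0_OVER_4PI * HBAR * GAMMA_H * GAMMA_N / r_nh ** 3
    c = omega_n * csa / math.sqrt(3.0)
    d2, c2 = d * d, c * c
    # Cross-relaxation rate derived from the NOE
    sigma = r1 * (noe - 1.0) * GAMMA_N / GAMMA_H
    j_high = 4.0 * sigma / (5.0 * d2)
    j_n = (4.0 * r1 - 5.0 * sigma) / (3.0 * d2 + 4.0 * c2)
    j_0 = (6.0 * r2 - 3.0 * r1 - 2.72 * sigma) / (3.0 * d2 + 4.0 * c2)
    return j_0, j_n, j_high
```

For a rigid, isotropically tumbling N-H vector, J(0) returned by this sketch should approach (2/5)τc, which provides a convenient sanity check before the function is applied to per-residue data.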

Viscosity Determination
The viscosity (η) of each NMR sample was calculated using a dioxane internal standard [56]. DOSY experiments acquired and processed as detailed previously for W1 and W2 [19] were analyzed to directly determine the translational diffusion coefficient (DC) for dioxane in a given W sample. Coupling each measured DC with the known hydrodynamic diameter (dH) of dioxane (0.424 nm [56]), η may be determined through the Stokes-Einstein equation [64]:

$$\eta = \frac{k_B T}{3\pi d_H D_C} \quad (4)$$

where kB is the Boltzmann constant and T the absolute temperature (303 K).
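The viscosity calculation described above can be sketched in a few lines of Python. This is a minimal illustration of the Stokes-Einstein relation written in terms of the hydrodynamic diameter, not the analysis script used in this work; the function name viscosity_cp and the example diffusion coefficient are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant (J/K)


def viscosity_cp(d_c, d_h=0.424e-9, temperature=303.0):
    """Solution viscosity in cP from the diffusion coefficient of a probe.

    d_c         : translational diffusion coefficient of dioxane (m^2/s)
    d_h         : hydrodynamic diameter of dioxane (m), 0.424 nm by default
    temperature : absolute temperature (K)
    """
    eta_pa_s = K_B * temperature / (3.0 * math.pi * d_h * d_c)
    return eta_pa_s * 1e3  # 1 Pa s = 1000 cP


# Hypothetical dioxane diffusion coefficient, for illustration only
print(viscosity_cp(1.0e-9))
```

Because η is inversely proportional to DC, slower dioxane diffusion in a given sample maps directly onto a higher inferred viscosity.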

Analysis of Rotational Diffusion
To analyze rotational diffusion behaviour with respect to the 15N spin relaxation data, isotropic, axially symmetric, and anisotropic diffusion tensor models were applied to W1, W2-1, and W2-2 using ROTDIF 3.1 [54]. Only residues with [1H]-15N NOE > 0.65 and not likely to be involved in conformational exchange were used for the analysis. For W1, W2-1, and W2-2, the 20-member W1 structural ensemble (PDB ID 2MU3) was iteratively analyzed through ROTDIF using robust least-squares fitting to obtain global information, with coordinates from the lowest energy member used to model the diffusion tensor frame (D∥ and D⊥ tensor axes for an axially symmetric system, or Dxx, Dyy, and Dzz tensor axes for a fully anisotropic system) and Euler angles (α, β, and/or γ). In addition, the W2 ensemble member with the calculated Rg closest to the experimental Rg was deemed the representative model for the reference frame of the diffusion tensor for W2 (merged W2-1 and W2-2 relaxation data). The robust least-squares optimization method was employed during fitting and full statistical analysis was employed to determine the most statistically upheld diffusion tensor model for a given ensemble member.

Estimations of Rotational Correlation Time
The rotational correlation time (τc), assuming a hydrated sphere, may be estimated through Stokes' law [64]:

$$\tau_c = \frac{4\pi\eta r_H^3}{3 k_B T} \quad (5)$$

where rH is the radius of hydration. For a hydrated protein, rH may be roughly estimated on the basis of the specific volume (υ = 0.73 cm3/g) as [65]:

$$r_H = \left[\frac{3\upsilon M_r}{4\pi N_A}\right]^{1/3} + r_w \quad (6)$$

where Mr is the molecular weight, NA is Avogadro's number, and rw is the radius of the hydration layer surrounding the protein (1.6-3.2 Å for one-half to one hydration shell [66]). For direct comparison, the average ratios of T1/T2 (or T1′/T2′) for all residues with an NOE > 0.65 in a given protein were employed to estimate τc. By neglecting the high-frequency terms of the spectral density, the analysis of Kay et al. [27] may be simplified to:

$$\tau_c = \frac{1}{4\pi\nu_N}\left(\frac{6T_1}{T_2} - 7\right)^{1/2} \quad (7)$$

where νN is the resonance frequency of 15N (in Hz). HYDROPRO [55] was also used to estimate ensemble-averaged τc values based upon the NMR-derived structural ensemble for W1 (PDB entry 2MU3) and the W2 ensemble inferred on the basis of concatenated NMR-derived restraints for W1 [19]. The resulting output was parsed for τc (harmonic mean (correlation) time) as calculated on the basis of the combined input of temperature, solvent viscosity, molecular weight, solute partial specific volume, solution density, and PDB structural coordinates. For comparison, calculations were carried out for W1 structural ensembles truncated using an in-house Tcl/Tk script to the globular core (residues 12-149) or the globular core excluding both the turn between helices 4 and 5 and helix 5 (residues 12-128).
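The two simple τc estimates described above (the hydrated-sphere estimate from Stokes' law with Equation (6), and the T1/T2-based estimate of Equation (7)) can be sketched in Python as follows. This is an illustrative sketch only; the function names are hypothetical, and the 20 kDa molecular weight and 1 cP viscosity in the usage lines are placeholder inputs rather than values from this study.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant (J/K)
N_A = 6.02214076e23  # Avogadro's number (1/mol)


def hydrated_radius(m_r, v_bar=0.73, r_w=1.6e-10):
    """Radius of hydration (m) from molecular weight (g/mol), Equation (6).

    v_bar : partial specific volume (cm^3/g)
    r_w   : thickness of the hydration layer (m); 1.6e-10 m is a half shell
    """
    volume_cm3 = v_bar * m_r / N_A  # volume of one molecule (cm^3)
    radius_cm = (3.0 * volume_cm3 / (4.0 * math.pi)) ** (1.0 / 3.0)
    return radius_cm * 1e-2 + r_w   # convert cm to m, add hydration layer


def tau_c_sphere(r_h, eta, temperature=303.0):
    """Correlation time (s) of a sphere of radius r_h (m) in a solvent of
    viscosity eta (Pa s), from Stokes' law."""
    return 4.0 * math.pi * eta * r_h ** 3 / (3.0 * K_B * temperature)


def tau_c_from_t1_t2(t1, t2, nu_n):
    """Correlation time (s) from the T1/T2 ratio, Equation (7).

    nu_n : 15N resonance frequency in Hz
    """
    return math.sqrt(6.0 * t1 / t2 - 7.0) / (4.0 * math.pi * nu_n)


# Placeholder inputs: a 20 kDa protein in a 1 cP (1e-3 Pa s) solution
r_h = hydrated_radius(20000.0)
print(r_h, tau_c_sphere(r_h, 1.0e-3))
```

Note that the sphere estimate scales with the cube of rH, so the choice of hydration layer thickness (half versus full shell) has a substantial effect on the predicted τc.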

Statistical Tests
Statistical analyses were performed between units and protein regions, as described above, to evaluate the significance of differences between means. An ordinary one-way ANOVA test was used when comparing three or more means, and an unpaired two-tailed t-test with Welch's correction for unequal variances was used when comparing two means, within the Prism 6 or InStat software packages (both from GraphPad Software Inc., La Jolla, CA, USA). All distributions were assumed to be Gaussian. Unless otherwise noted, significance was determined at an α of 0.05.
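For readers wishing to reproduce the two-mean comparison outside of Prism or InStat, the Welch-corrected t statistic and its Welch-Satterthwaite degrees of freedom may be computed as in the following minimal Python sketch. The function name welch_t is hypothetical and the input lists are arbitrary illustrative numbers, not data from this study; the two-tailed p-value would then be obtained from the Student t distribution with the returned degrees of freedom.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)
    for two samples with possibly unequal variances."""
    # Squared standard errors of each sample mean (sample variance / n)
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Arbitrary illustrative samples
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(t_stat, dof)
```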

Conclusions
The core repetitive domain of AcSp1 is composed of concatenated 200 amino acid units, identical in sequence and very similar in tertiary structuring and internal motions. Through split intein-mediated trans-splicing, individual repeat units were selectively isotope-enriched and investigated in the context of the W2 protein capable of fibrillogenesis. Intein-mediated segmental-labelling is also highly promising for future studies of other modular proteins, whereby spectral complexity can be reduced without compromising the functional state of the protein. Backbone-level dynamics very clearly demonstrate the beads-on-a-string conformation of the AcSp1 repetitive domain, with structured globular domains linked by lengthy intrinsically-disordered segments forming a relatively viscous solubilized state. Although our previous translational diffusion studies imply that the linker is not highly extended, with W2 and W3 exhibiting relatively compact conformations, the 15N spin relaxation behaviour detailed herein demonstrates that each globular domain in W2 tumbles nearly independently of its neighbour. Regardless of the construct examined, helix 5 also exhibited elevated high-frequency dynamics relative to the remainder of the globular core. Rotational diffusion behaviour of W1 is also most consistent with a W unit globular core where helix 5 is not stably attached. Unambiguous measurement of backbone dynamics, therefore, both improves our understanding of AcSp1 repetitive domain modularity and allows direct demonstration of variations in localized stability that were implied by titration with chaotropes and detergent.
to Jan K. Rainey and RGPIN/41823-2015 to Xiang-Qin Liu); key infrastructure was provided through NSERC Research Tools and Instruments Grants and a Leaders Opportunity Fund award from the Canada Foundation for Innovation (to Jan K. Rainey); and, a Dalhousie Medical Research Foundation Capital Equipment Grant (to Jan K. Rainey and Xiang-Qin Liu). The TCI probe for the 16.4 T NMR spectrometer at the NRC-BMRF was provided by Dalhousie University through an Atlantic Canada Opportunities Agency Grant. Jan K. Rainey is supported by a Canadian Institutes of Health Research New Investigator Award and Marie-Laurence Tremblay was supported by an NSERC Doctoral Postgraduate Scholarship.