Stretching of Bombyx mori Silk Protein in Flow

The flow-induced self-assembly of entangled Bombyx mori silk proteins is hypothesised to be aided by the ‘registration’ of aligned protein chains using intermolecularly interacting ‘sticky’ patches. This suggests that upon chain alignment, a hierarchical network forms that collectively stretches and induces nucleation in a precisely controlled way. Through the lens of polymer physics, we argue that if all chains would stretch to a similar extent, a clear correlation length of the stickers in the direction of the flow emerges, which may indeed favour such a registration effect. Through simulations in both extensional flow and shear, we show that there is, on the other hand, a very broad distribution of protein–chain stretch, which suggests the registration of proteins is not directly coupled to the applied strain, but may be a slow statistical process. This qualitative prediction seems to be consistent with the large strains (i.e., at long time scales) required to induce gelation in our rheological measurements under constant shear. We discuss our perspective of how the flow-induced self-assembly of silk may be addressed by new experiments and model development.


Introduction
Protein-based natural materials persistently inspire the development of novel humanmade materials, owing to their biocompatibility, unique combinations of strength and toughness [1][2][3][4], low-energy processing [5] and efficient solvent recycling [6]. While the industrial production of polymer-based fibres is challenged by a highly non-trivial interdependence between the molecular level of bond-orientation-dependent nucleation, and the macroscopic level, where the temperature-dependent rheology generates stretch of entire chain segments [7][8][9][10][11], silk is processed in semi-dilute aqueous conditions [5], where nucleation can be induced through the stretch-induced disruption of the solvation layer [12]. In order to generate sufficient stretch at modest flow rates, the silk protein has evolved to contain 'sticky' patches (which are assumed to be consisting of ionic calcium bridges between the carboxylated side groups of aspartic and glutamic acids) that significantly slow down stretch relaxation in flow [6,13]. Additionally, it is hypothesised that the sticky patches may serve as registration points for the proteins to correlate their positions and control the nucleation of crystallites [14,15], see Figure 1. This suggests that nucleation takes place shortly after chain alignment. In the present work, we present gelation data under constant shear that reveal that the timescale required for gelation to set is much larger than the time required to align the chains. Using our recently developed tube model for associating polymers [16], we reveal broad distributions in chain stretch that render intermolecular correlation of chains a relatively slow process.
Our starting point is using the insights from several NMR studies [17][18][19][20], and the linear viscoelasticity of silk [21,22] to interpret silk in the gland as a 'sticky' intrinsically disordered protein above the overlap concentration [13], akin to entangled associating polymers [23][24][25][26]. The silk protein consists of N = 5524 amino acids [22], each with a typical step length of b ≈ 0.4 nm [27,28] (Table 1). In dilute conditions, the radius of gyration, R g , is expected to scale as N ν , with ν = 1/3 for a poor solvent, 1/2 for a θ solvent and 0.585 for a good solvent. We expect this exponent, which is a measure for the solubility, to depend on the temperature, on the pH and on the cation concentration [13,29,30]. Nevertheless, it was found that the prediction for R g under θ conditions, is in decent agreement with small angle X-ray scattering [22] and small angle neutron scattering [31]. Under these conditions, full chain extension (giving an end-to-end distance of 2210 nm) requires an increase of the end-to-end distance R e = √ 6R g ≈ 29.7 nm between the chain ends by a factor of 75. Figure 1. Schematic representation of silk processing. The polymer is entangled and contains stickers (orange) that are connected by linker chains (green). For flow rates that exceed the relaxation rate by reptation, the entanglement blobs of the polymer align in the direction of the flow field. If the chains remain unstretched, the stickers are mobile and the sticker concentration inside the tube is almost homogeneous. Upon modest stretching of the polymer, the distance between the stickers increases and the sticker concentration becomes inhomogeneous, which gives rise to 'registration points' that enable the correlation of nearby chains.
A polymer stretches in flow when it is unable to sufficiently relax the perturbations imposed by the flow. In dilute conditions, the relaxation is facilitated by long-ranged hydrodynamic interactions with a corresponding Zimm time τ Zimm = τ 0 N 3ν . For a typical monomeric relaxation time of a picosecond, this would imply a Zimm time of just 0.4 µs, and chain stretching would require enormous flow rates of 2.5 × 10 6 s −1 . The first step to improve this to conditions that are biologically achievable by the animal, by concentrating the protein.
Concentrating of the protein has two consequences. First, the chains are overlapping and topological entanglements emerge that restrict to global motion of the chain to curvilinear diffusion within an effective 'entanglement tube' with a diameter a (which is the statistical length scale below which a substrand appears unentangled and relaxes freely) and a length aZ e , with Z e ≈ 10 the number of topological entanglements per chain [13] ( Table 1). The timescale for the protein to escape the tube sets the strain rate required to align a chain. The second consequence of concentrating the chains, which is of importance to stretch an aligned chain (and takes place at higher strain rates), is that the hydrodynamic relaxation mode is done away with, and stretch relaxation is now characterised by the Rouse time, τ Rouse = τ 0 N 2 = τ e Z 2 e ≈ ×10 −5 s, with τ e ≈ 10 −7 s the relaxation time of a strand between entanglements (Table 1).
Although this is two orders of magnitude larger than the stretch-relaxation time in dilute conditions, this is not nearly enough: the animal spins silk at a deformation rate of 1 s −1 . The crucial strategy of the silkworm is to incorporate sticky patches in the protein that further slow down the stretch relaxation time to a 'sticky Rouse' time [6,23,32], τ Rouse ≈ τ s Z 2 s , with τ s ≈ 10 − 100 ms the lifetime of a sticker (which depends on the metal cation composition) and Z s ≈ 5 the number of stickers per chain [13] (Table 1). In previous work [13,21,22], we have determined the values of the physical parameters, Z e , Z s , τ s using the linear viscoelastic response of the silk feedstock. In this work, we will use these values to computationally investigate the stretching of the chains under non-linear flow conditions using our recently developed tube model for association polymers [16].
To model the chain stretching and assess the potential for registration effects, we here extend our tube model with a description for the finite extensibility of the silk protein, as well as a description for the influence of the tension in the stretched chain on the lifetime of the stickers. In the following, we will review earlier work in extension of simple Gaussian chains in relation to the registration effect. Subsequently, we will discuss our methods, including the extension of our tube model and a discussion on the importance to distinguish between an association-dissociation and a bond-swap mechanism for sticker opening and closing under non-linear flow conditions. We then discuss our experimental and simulation results.

Theory for Registration of Stretched Chains
The registration effect intrinsically relies on the aligned conformation of the polymer chain; in particular, the lengthscale and strength of correlations between the stickers in the flow direction. In order to calculate a correlation function, we will invoke the Gaussian chain approximation (in the next section we will discuss how the presence of stickers changes this idealised, but usually very successful, representation). Our starting point is the equilibrium distribution that describes the equilibrium statistics of the contour-length fluctuations of a (sub)chain with a mean length R and standard deviation σ R . We illustrate this distribution in Figure 2, where we consider a chain with Z e entanglements and a tube diameter a and no stickers. The tube contour has a mean length R = aλZ e and a standard deviation σ R = R 2 /3λZ e , where λ is the mean stretch ratio in extensional flow. (We remark that the tube diameter, the number of entanglements and the tube length in principle all depend on the concentration and the solubility of the protein; we assume both properties to be fixed in this work (the tube diameter is a = bN ν e (φ) and the number of entanglements is Z e = N/N e (φ), where ν is 1/3 for a poor solvent, 1/2 for a θ solvent and 0.585 for a good solvent. We expect ν to be affected by the temperature, pH and cation concentration [13,29,30]), and N e (φ) ∝ φ −1/(3ν−1) the number of monomers per entanglement strand [33]. Under θ conditions (ν = 1/2) the contour length increases linearly with an increasing concentration.).) Hence, the normalised standard deviation is which shows that the importance of contour-length fluctuations vanishes with an increasing molecular weight or with an increasing stretch ratio, λ (this is largely the reason of the success of pre-averaged models for the non-linear viscoelastic response of regular (nonsticky) polymers [34]). We now use this description for a substrand between two neighbouring stickers (n ≡ 1), which represents a portion f = n/(Z s + 1) of the chain. The length of this tube segment has a mean R = a f λZ e and a standard deviation σ = R 2 /3 f λZ e . The next sticker on the chain (n ≡ 2) is separated by a substrand twice this length (in general n indicates which sticker in the chain is considered). Ignoring the finite size of the chain, we can define a correlation function as being proportional to the probability of finding a sticker at a distance r, as with M a normalisation factor such that C(r) tends to unity for large r, and with R n ≡ naλZ e /(Z s + 1) and σ n ≡ a nλZ e /3(Z s + 1). We show the correlation function in Figure 3 for various values of the fraction of stickers and chain extensions.     Figure 3 shows the correlation function as a function of the distance from the sticker. In each panel the stretch ratio is varied, while the only difference between the panels is the number of stickers per entanglement. In panel (a) the number of stickers is 5 per chain of 10 entanglements, corresponding to the silk protein (both values may slightly vary in biological feedstock samples for variations in the concentration and ionic contents [13]). In the absence of stretch (λ = 1), the correlation function is rather flat, which implies that there is no long-ranged correlation between the stickers. By modest increases of λ (note that λ max = 23.6), large oscillations emerge, which point at the correlation of sticker positions in the direction of the flow field. Hence, there is a sensitive rheological switch to create a strongly inhomogeneous sticker concentration in the direction of the flow field. Given that this phenomenon vanishes for an increasing number of stickers per chain (panels (b) and (c)), we speculate that the silk protein has undergone selection to contain slightly fewer than one sticker per entanglement to generate 'registration points' at modest flow rates, as advertised in the cartoon of Figure 1. However, an important caveat is the fact that we have assumed a relatively narrow Gaussian distribution of chain stretches, while we have recently found that associating polymers have rather broad stretch distribution [16]. We will calculate and discuss the impact of these large fluctuations in the following sections.

Experimental Methods
Native silk feedstock (NSF) was obtained from the middle-posterior gland section of commercially bred silkworms, by dissecting 5th instar larvae, as described previously [6,21,22,35]. Generally, a fresh dissection was performed for each rheology experiment and no additional reagents or further chemical treatments were applied to the NSF specimens. Natural variations in the viscosity between different samples can be attributed to variations in ion concentrations (potassium, in particular) that affect the sticker lifetime [6,13]. (These variations in the viscosity can be reproduced by adding ions to diluted fibroin solutions followed by re-concentrating to physiological conditions [36].) A sample of the NSF (ca. 0.1 mL, from the middle-posterior gland section) was carefully transferred to a rheometer (DHR2, TA Instruments) fitted with a CP1/20 geometry (20 mm diameter cone, with 1 • opening angle and 27 µm truncation) and a Peltier temperature controlled sample stage. The small amount of excess sample was not trimmed, as the associated stress could initiate gelation. A few droplets of water were placed around (but not touching) the specimen and the sample area was enclosed with a loosely fitting, custom-built cover, to minimise water evaporation from the specimen.
The flow behaviour was measured in several sequentially programmed stages. After a pause of 1 min (to allow thermal equilibration and relaxation of loading flow stress), a constant shear flow rate ofγ = 1 s −1 at 25 • C was applied over 100 s, to ensure the specimen was uniformly distributed. The low shear rate viscosity (η 1 ) was obtained by averaging data from the final 30 s. To further reduce the risk of evaporation during subsequent stages of the experiment, the temperature was reduced (at 10 • C/min), while maintaining the shear flow. This also allowed a flow activation energy to be calculated. Next, the linear viscoelastic behaviour was measured by a logarithmic oscillatory frequency sweep (15 steps, ω = 126 to 6.3 rad s −1 , at 1% strain followed by 25 steps, 31.4 to 0.13 rad s −1 at 5% strain, with the lower frequency stage repeated to ensure reproducibility). Then, gelation was initiated by applying a faster shear flow (5 s −1 ); the incipient phase change was observed as a pronounced increase in the shear stress (or viscosity) and first normal stress (i.e., axial force trying to separate the cone from the sample stage). Finally, the progress of gelation was observed using an oscillatory time sweep (at 1 rad s −1 and 5 % strain, over 5 min). In previous work, we identified these structural changes within the sample as the alignment of proteins (consistent with the formation of an aligned, anisotropic gel) through birefringence measurements [5,36], confocal microscopy [37] and rheo-IR [38].
The linear viscoelastic rheology, represented by the elastic modulus G (ω) and the viscous modulus G (ω), was characterised in two stages. By curve-fitting four Maxwellian modes to the data (Equations (18) and (19)), characteristic time scales and corresponding moduli were extracted [21,35,39]. While this is a convenient method to obtain rapid insight into possible (precursors of) gelation in the sample, the obtained information is phenomenological. For the molecular simulations (see next section), however, we require information about the physical time scales (the reptation time and the sticker lifetime) as well as the molecular (the number of entanglements and stickers per protein). We extract this information from the linear rheology by curve-fitting the sticky-reptation model [13,23,40] to this data, see Figure 4. This latter approach is significantly slower than curve-fitting Maxwellian modes, because both the parameter optimisation and the statistical analysis of the parameter values requires extensive sampling of the parameter space. Ref. [13] extensively discusses this method, and also provides information about the availability of our software.

Tube Model: Brownian Dynamics
Fully atomistic simulations, and even coarse-grained one-bead-per-amino-acid simulations, are computationally unfeasible for the problem at hand. Here, we adopt the central idea by de Gennes and Edwards of replacing the many-chain problem with a single chain in a tube-like confinement imposed by its environment of entanglements [41,42], and solve the Brownian dynamics of the aligned chain in the flow direction (effectively in 1D) [16,43]. This approach is simple yet powerful, and has led to the development of widely applied finite-element solvers [44][45][46][47], a physical explanation for the (apparent) 3.4 power dependence of the relaxation time of polymer melts on the molecular-weight [48], and a comprehensive understanding of the rich non-linear rheology of (bimodal) polymer blends [34,49]. In the spirit of other theory and modelling work on associating polymers [46], we recently added a description for the stochastic attachment and detachment of associating monomers to the tubular environment developed for full non-linear flows [16]. Here, we will add a description for the finite extensibility of the silk chain.
The starting point is to consider a chain consisting of N Kuhn segments with length b, and Z e entanglements (hence, with tube diameter a = b(N/Z e ) 1/2 ≈ 9.4 nm). The configuration of the chain is given by the spatial coordinates R i of monomers i = 1, . . . , N along the curvilinear direction along the tube, which evolve with time according to the Langevin Equation [34,43,48] with ζ is the monomeric friction. The last term in the right-hand side of the equation represents the force exerted by the flow field, withε the strain rate. The term F thermal,i is a stochastic force given by the equipartition theorem with k B T is the thermal energy, and F conf,i represents the intramolecular forces related to the chain conformation, which behaves as an harmonic bead-spring model if k s,i = 1, but using includes (approximately [50]) the finite extensibility of the chain up to a maximum stretch ratio λ max = Nb/Z e a ≈ 23.6 (Nb is the length of the fully extended chain and Z e a is the tube length). The boundary conditions are ∂R/∂i = aZ e /N both at i = 1 and at i = N.
The opening and closing of the stickers are described using the stochastic variable p i (t), which is zero for open stickers (as well as for for non-sticky monomers) and unity for closed ones. The former type of monomers are unbound and can freely diffuse and respond to the drag exerted by the flow field, as well as relax stress in adjoining segments. A closed sticker, however, is kinetically trapped by its environment and is unable to diffuse or to respond to local stress in the polymer. Consequently, the closed sticker advects with the background flow.
The underlying physical chemistry of the opening and closing rates is described in the following section.

Physical Chemistry of the Stickers
The rheology of sticky polymers is largely determined by the sticker lifetime. In the case of silk [6,13], the stickers in silk fibroin appear to consist predominantly of calcium bridges between carboxylate-substituted amino acids. This does not preclude contributions from other polyvalent cations, however, such as the smaller amounts of magnesium found in the feedstocks. The lifetime of these ionic bridges depends on their ionic environment and appears to shorten with increasing monovalent cation (mainly potassium) concentration. Usually, the sticker lifetime is described using an empirical Arrhenius form τ s = τ 0 s exp(E act /k B T), where the activation energy is determined from the linear rheology using Van 't Hoff plots [13,23,26,40,[51][52][53]. In this section, we argue that this activation energy might not be the one that is of importance to the non-linear rheology, because a different mechanism of sticker opening takes over.
Indeed, in most molecular dynamics studies of associating polymers (typically unentangled vitrimers), it is argued that most stickers are bound into closed sticker pairs 'S 2 ', so that it is unlikely for two free stickers 'S 1 ' to meet and associate: it is more likely that a free sticker finds a closed sticker pair and closes by bondswapping (Figure 5a illustrates both mechanisms). That is, sticker opening and closing takes place according to the chemical reaction equation with k bs the bondswapping rate. However, under conditions of strong flow the tension in the polymer chains is likely to dissociate the closed sticker pair without the need for a open sticker nearby. Therefore, in the non-linear rheology it is crucial to also include an association-dissociation mechanism, with an association rate k a and dissociation rate k d . In order to determine whether or not strong flow alters the mechanism of sticker opening, it is important to first understand which mechanism dominates under quiescent conditions. Hence, we first write down the effective sticker closing and opening rates as and  Table 1) in all of our simulations. The value of p is controlled entirely by the association and dissociation rate coefficients and by the total sticker concentration through The effective closing and opening rates, on the other hand, are affected by the respective contributions of the bondswapping and the association-dissociation mechanisms, see Figure 5b. This figure plots the opening and closing rates at various fixed p values against the total sticker concentration in units of k bs /k d . For small sticker concentrations or, equivalently, small bondswapping rates, i.e., for k bs [S] k d , sticker opening predominantly takes place by sticker dissociation (i.e., k open /k d ≈ 1) and sticker closing takes place by the association to an open sticker, so k close /k d ≈ 2p/(1 − p). In the other limit where the sticker concentration is high, i.e., for k bs [S] k d , both sticker opening and closing take place predominantly by bondswapping with rates k open /k d ∝ (1 − p) and k close /k d ∝ (p), respectively.
While in the linear rheology it suffices to know the fraction of closed stickers p and the effective sticker lifetime, τ s = k −1 open , in the non-linear rheology the dependence of k open on the tension within the chain needs to be taken into account. We model this by assuming the chain tension to predominantly affect the dissociation rate where ν is an attempt frequency (which may be much smaller than the Rouse frequency of a monomer [13]). In particular, we assume that the force decreases the activation energy for sticker dissociation. In general, the molecular potential is smooth and non-linear; however, to qualitatively capture the force-induced sticker dissociation it suffices to truncate the potential beyond the linear term, so the activation energy may be approximated by [54], with F conf the conformational force in Equation (7) and nm the typical length scale for bond dissociation (which may depend on the ionic strength [13]). In our simulations, we vary from 0.1 to 100 nm and set k bs = 0 and E 0 act = 8k B T (Table 1). This latter value is obtained from the linear rheology of silk [13], see Figure 4.
For these values, the dissociation of the stickers occurs mainly due the finite extensibility of the chain: A sticker near the chain end experiences a force F conf ≈ 3k B Tk s (λ; λ max )λ/R s , and the stretch-dependent sticker lifetime is approximately within the harmonic-spring approximation, sticker dissociation is accelerated when the stretch ratio is λ ≈ a/3 ≈ 89, while sticker dissociation is actually accelerated by the finite extensibility of the chain when λ approaches λ max = 23.6 due to the divergence of k s (Table 1). Hence, to achieve accelerated sticker dissociation for modest chain stretches the dissociation length should be increased.

Simulation Method
We solve Equation (5) by representing the continuous chain by a chain with a number of 'beads' [13,43]: more beads implies a finer discretisation; at rest, the bond length between beads is dR = a/(N beads + 1). The dynamics is simulated by taking fixed-sized explicit Euler time steps. In the absence of stickers and flow, this implies numerical stability for time steps ∆t < 1 2 τ e /[π(N beads + 1)] 2 . In the presence of stickers, every time step should also be smaller than k −1 open , which depends on the forces that act on the stickers as discussed above. Table 1. Physical parameter of the silk feedstock within the coarse-grained tube model. Each parameter value is motivated at the point where its symbol is introduced throughout Sections 1-3 of the main text. † In our simulations, the bond-swap rate k bs is set to zero, so the association and dissociation rates, k a and k d , are set by the values of p and τ 0 s . ‡ The maximum stretch ratio is defined with respect to the stretch ratio of a chain of which the entanglement blobs are aligned in the flow field (which is therefore pre-stretched to some degree).

Silk Gelation under Constant Shear
Initial steady shear rate measurements (over 100 s at = 1 s −1 and 25 • C) from all the NSF specimens used in this study gave essentially constant viscosities, indicating good quality solutions without gelation. Values of η 1 ranged from 187 to 2165 Pa s, which encompassed the majority of the stages of fibre spinning during cocoon construction [6]. As noted previously [35], viscosity measurements during the temperature ramps followed Arrhenius-type relationships: where E act,η is the flow activation energy and R is the gas constant. This is demonstrated in Figure 6a, where plots of ln η(T) against 1000/T gave essentially straight lines, from which it was possible to calculate the flow activation energy E act,η . It was found that the offsets of the lines varied considerably, reflecting the differences in viscosities between specimens. Similar variations were found previously [21], and were explained by variations in the concentration of ions in the samples [6,13,36]. Nevertheless, the slopes were similar, giving vales of E act,η from 40.6 to 51.6 kJ mol −1 . We interpret the activation energy for flow as an apparent one, which emerges due to the temperaturedependence of both the sticker lifetime and the entanglement density. Both properties depend on the composition of the solution (e.g., the concentration of potassium [6,13]). The variations of the flow activation energy for different batches of silk worms (we previously reported 31.8 kJ mol −1 using the same sample preparation technique and rheometry methods [35]) will be subject to future studies.
All the NSF specimens used exhibited viscoelastic behaviour. Exemplar results from an oscillary sweep (at 15 • C) are shown in Figure 6b, for a specimen of NSF with its shear viscosity close to the middle of the range observed (η 1 = 613 Pa s at 25 • C). Elastic behaviour (G > G ) dominated at high frequencies, but changed over to viscous behaviour (G < G ) at lower frequencies. The dynamic modulus data was reproducible, as demonstrated by measurements from the overlapped high and low frequency steps and the repeated low frequency steps, confirming that the solution was stable.
Following our previous work [13], we extract information about the physical properties of the silk feedstock (elastic modulus G e , number of entanglements Z e and stickers per chain Z s , and the sticker lifetime τ s ) by curve-fitting the sticky-reptation model to the oscillatory results (Ref. [13] provides the model and directions to our curve-fitting software). All extracted physical parameter values are given in Table 2, and lie within the range of values that we obtained from 15 other specimens [13]. We remark that the shortest relaxation time extracted, i.e., the sticker lifetime, is τ s ≈ 5 − 31 ms (log τ s = −1.9 ± 0.4), while the longest relaxation extracted, the reptation time is τ rep ≈ τ s Z 2 s Z e /α = 0.26 s, with α = 10 a parameter that approximately describes relaxation by contour-length fluctuations [13]. Table 2. Physical parameter values obtained by fitting the sticky-reptation model to the elastic and viscous modulus for ω ≥ 0.4 rad s −1 shown in Figure 6b [13].    (18) and (19) and Table 3 We remark that obtaining a decent fit quality required us to truncate our data for frequencies ω < 0.4 rad s −1 . Indeed, the oscillatory measurements generally revealed that the NSF specimens were not measured into their terminal zone (i.e., where only the slowest mode contributes to the dynamic moduli), and the expected slopes of 1 and 2 for log G and log G in the terminal zone were not observed. In order to quantify these deviations, we also empirically fitted our data using four Maxwellian modes [21,35,39],

Physical Property Value
where τ i is a relaxation time constant and g i is a 'modulus' contribution', which combines the actual modulus and the relative abundance of that mode. The values of these parameters are given in Table 3. Table 3. Parameter values obtained by fitting the four-mode Maxwell model in Equations (18) and (19) to the elastic and viscous modulus shown in Figure 6b. We find that the shortest timescale τ 4 = 13 ms lies within the confidence interval (defined by the standard deviation) of the sticker lifetime, τ s = 5 − 32 ms and the modes with time scales τ 3 = 0.092 s τ 2 = 0.531 s seem to capture the shape of the curve for all frequencies near and above the reptation time τ rep ≈ 0.26 s, while the first mode with τ 1 = 7.87 s seems to capture the deviations from the expected slopes of 1 and 2 for log G and log G in the terminal zone. Previously [39], we found that the slow-mode contribution builds up throughout pseudo-static shear stress relaxation measurements, and may indicate the formation of precursors to gelation. This slow-mode contribution is relatively small in the samples studied here, and we interpreted the protein to be in a sufficiently good state of solution.

Mode
Following these oscillatory shear measurements, we subjected the samples to a steady shear flow at a higher rate. Exemplar data are shown in Figure 6c,d, for the specimen previously characterised in Figure 6b It was found that both shear (σ) and first normal stress (N 1 ) increased gradually, up to around 250 s corresponding to a strain of 1250; subsequently both parameters increased more strongly, which was indicative of gel formation. The presence of gelation inferred from the rheometry was corroborated by the observation of enhanced scattering of light from the sample, indicative of inhomogeneities. Finally, gelation was confirmed by observing progressive increases in dynamic moduli (particularly G ) over time. Clearly, once initiated by flow, gel strength can increase further under essentially quiescent conditions. This also concurs with previously reported observations [39]. The data shown in Figure 6e is typical for a specimen in which gelation has been initiated; however, the elastic modulus remained below the loss modulus, where the shear flow had been stopped prior to the strong increase in shear and normal stress.

Modelling of Dispersed Chain Stretching in Extensional Flow
We present our simulations of Equation (5) with the parameters from Table 1 in Figure 7. In this Figure, every panel shows stretch distributions at various extension ratesε, while in each panel the simulations are carried out with a different dissociation length in the range 0.1-100 nm. In the absence of flow, the distributions are Gaussian, as expected from Section 2. Upon increasing the extensional flow rates, the high-stretch tail of the distribution broadens, in agreement with our early predictions of 'power-law stretching' in Reference [16]. At high rates, the stretch further increases. For small values of , i.e., for panels (a-c), the stretch distribution is truncated at a stretch of approximately 2 µm. In that regime, the finite extensibility of the chain leads to large forces acting upon the stickers and the stickers dissociate. At a value of = 100 nm, the harmonic forces (without finite extensibility effects) are large enough to dissociate the stickers at more modest stretches, and the truncation of the stretch distribution shifts to smaller stretch values.
We emphasise the qualitative difference of these findings compared to what is expected from the usual polymer physics in Figure 2. While the stretch distribution of non-associating polymers narrows down at high flow rates (which suggests the emergence of clearly distinguishable registration points), the stretch distribution of associating polymers widens with an increasing flow rate. At sufficiently high rates, we observe that the stretch-distribution converges to a steady distribution that is independent of the flow rate, provided that the sticky Weissenberg number,ετ SR , remains much smaller than the bare Weissenburg number,ετ R , with τ SR ≈ τ s Z 2 s the sticky Rouse time and τ R ≈ τ e Z 2 e the bare Rouse time. This shows that an appropriately designed chain may stretch in flow to an extent that is rather insensivite to the flow rate, while the chemistry of the stickers (which may be altered using cation concentrations and pH and is physically described by the activation energy and dissociation length) seems to be the crucial control parameter.   Table 1.

Toy-Model Simulations of Associating Polymers in Shear Flow
In order to model stresses at small strains due to chain-alignment in extensional flow, and to model stress in shear experiments it will be necessary to model the polymer chains in three spatial dimensions. Nevertheless, with a small adaptation of our 1D model we can already show that the broad distributions of stretch will be preserved, although they will emerge at higher strain ratesγ in shear than the strain ratesε that are needed in extensional flow. The flow field in shear has a x and y component. We realise that the component in parallel to the flow acts on a length scale comparable to the tube diameter, a. Therefore, in Equation (5) we replace the extensional flow fieldγζR i byγζ(R i − R C.M. ) for −a/2 < R i − R C.M. < a/2 andγζ(a/2)sign(R i − R C.M. ) otherwise, with R C.M. the center of mass. This implies the extension rate acts only near the center of the chain. Next, in line with our earlier work [16], we consider a sticky dumbbell chain with only two stickers near the ends of the chain, which are synchronised (i.e., either both are open or both are closed; opening and closing of the stickers occurs simultaneously), and we switch off the finite extensibility effects and the effect of stretch on the sticker dissociation time. All remaining parameters are the same as in the previous section, and are summarised in Table 1. Figure 8a shows the stretch distribution of the dumbbell for strain rates ranging from 0 toγτ R = 20. In the absence of flow (γ = 0), we retrieve a narrow Gaussian distribution. For increasing flow rates, the distribution broadens up and just beyond the 'sticky' Weissenberg numberγτ R /(1 − p) = 1 (for a dumbbell, free diffusion is only possible exactly a fraction (1 − p) of the time). Beyond the sticky Weissenberg number, we see a broadening of the distribution while a peak emerges at a stretch ratio λ = 1, representing the presence of fully relaxed (yet, within our model, aligned) chains. When the bare Weissenberg number of unity (γτ R = 1) is exceeded, the distribution shifts to higher stretch values. Hence, in analogy with our simulations of associating polymers in extensional flow, we find a broadening of the stretch distribution in shear flow. In contrast to the case of extensional flow, there is, as expected, no runaway stretch.
Filling up the probability distribution of Figure 8a requires finite time. Figures 8a,b show how the average stretch and stress reach the steady state value as a function of the strainγt, which is the time in units of the strain rate. Panel (c) shows the stress for the same simulations, which approximately equals the square of the stretch, as expected. While at small strain rates the steady state is reached at a strain of the order unity, for larger stretches much larger strains are required. This is in qualitative agreement with the larger strains required to induce gelation in shear experiments (see Figure 6).

Discussion
In the silk literature, the mechanism of gelation and flow-induced crystallisation of silk is hypothesised to be preceded by the alignment and registration of protein chains. From the view of simple polymer physics, registration points only become discernible when the chains are not merely aligned but also stretched to a small degree. Indeed, silk has approximately 1 sticker per entanglement, and due to the rather large fluctuation of an entangled substrand of the chain, the sticker is able to fluctuate far from its meanequilibrium position; the sticker concentration is effectively smeared out. By stretching the chain to a small extent in the direction of the flow, the stickers separate to larger distances and a crystal-like correlation function with discernible length scales emerges, see Figure 1.
This view of silk registration is highly idealised, in particular because associating polymers exhibit a broad stretch distribution in steady-state flow. Indeed, akin to ring polymers [55,56], rare events of large chain stretches emerge below the stretch transition. We speculate that the width of the distribution determines how likely it is for equally stretched polymers to associate with each other, such that a correlated 'registered' network may form. Recent molecular dynamics simulations of associating polymers in extensional flow do show the influence of flow on the reversible network [57], and indicate that addressing the hypothesis of registration is in reach for molecular simulations.
For such future studies, we suggests to investigate the interplay between chain stretching and registration through the following two possible mechanisms:

1.
The broad stretch distribution may set a probability by which proteins with a similar stretch may 'register' or associate and eventually stretch further or give rise to nucleation events; in this case, correctly parameterised single-chain simulations will enable to calculate nucleation rates. Within this mechanism, the finite extensibility of the chains and the tension-dependence of the sticker stability compress the high end of the stretch distribution, and are expected to be important control parameters of the nucleation rate.

2.
At small strains, registration may develop swiftly, so that the stretch of a protein correlates with the stretch of its surrounding proteins, which in turn leads to collective stretching of the entire network.
In order to investigate these hypotheses it will be important gain more insight both experimentally and theoretically into the registration effect, e.g., by mixing in barium (divalent cations) that may bind to the stickers, and performing rheo-SAXS experiments., as well as theoretically using full 3D simulations of the stretching of associating polymers. As we discussed in Section 3.3, currently sticker opening in molecular dynamics simulations is almost exclusively modeled using a bondswapping mechanism; and the activation energy of sticker opening is usually determined from the linear rheology (see Figure 4). Under flow conditions sticker opening will be governed by sticker dissociation, which may have a very different activation energy.
In conclusion, we have applied our recent developments of the established tube model for entangled polymer dynamics, to the more complex case of associating polymers, including the flow-induced stretching of silk. The results indicate that, in spite of the much greater complexity of biomolecules such as the silk protein, over synthetic polymers, that the polymer physics developed over the last 20 years for polymer viscoelasticity can be effective in interpreting the evolved rheological behaviour of silk solutions. It demonstrates, for example, the remarkable effects that well-tailored temporary intermolecular associations can have on the distribution of molecular stretch, in both extension and shear flows. The study also suggests a future direction by which to investigate a potential coupling between the registration effect and broad stretch distributions. We have discussed which new steps, experimental and theoretical, are required to investigate this coupling, itself a promising route to the flow-induced liquid-solid transition that silk must undergo. An understanding of the alignment and stretching of silk at the molecular and microstructural length-scales, such as exemplified in this work, may in future assist the development of novel fibres with similar properties as silk, both in performance and in controlled manufacturing.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.