Probing the Occurrence of Soluble Oligomers through Amyloid Aggregation Scaling Laws

Drug discovery frequently relies on the kinetic analysis of physicochemical reactions that are at the origin of the disease state. Amyloid fibril formation has been extensively investigated in relation to prevalent and rare neurodegenerative diseases, but thus far no therapeutic solution has directly arisen from this knowledge. Other aggregation pathways producing smaller, hard-to-detect soluble oligomers are increasingly appointed as the main reason for cell toxicity and cell-to-cell transmissibility. Here we show that amyloid fibrillation kinetics can be used to unveil the protein oligomerization state. This is illustrated for human insulin and ataxin-3, two model proteins for which the amyloidogenic and oligomeric pathways are well characterized. Aggregation curves measured by the standard thioflavin-T (ThT) fluorescence assay are shown to reflect the relative composition of protein monomers and soluble oligomers measured by nuclear magnetic resonance (NMR) for human insulin, and by dynamic light scattering (DLS) for ataxin-3. Unconventional scaling laws of kinetic measurables were explained using a single set of model parameters consisting of two rate constants, and in the case of ataxin-3, an additional order-of-reaction. The same fitted parameters were used in a discretized population balance that adequately describes time-course measurements of fibril size distributions. Our results provide the opportunity to study oligomeric targets using simple, high-throughput compatible, biophysical assays.


Introduction
The deposition of amyloid fibrils in the brain is a pathological hallmark of several different neurodegenerative disorders, yet the pathogenic role of these insoluble aggregates is not fully understood [1]. On the other hand, there is now substantial in vivo evidence of amyloidogenic proteins also forming small soluble oligomers that spread to neighboring cells and induce downstream processes associated with neurodegeneration [1,2]. Chemical kinetics, a classical cornerstone for drug discovery [3], is hardly applicable to the study of this new and pre-eminent target [4][5][6], in part due to the lack of straightforward methods to monitor the formation of a highly heterogeneous group of species ranging from protein dimers to complex n-mers [7,8]. In contrast, extensive research has been devoted to protein aggregation kinetics based on the characteristic tinctorial properties of amyloid fibrils [9].
An important step towards the kinetic quantification of off-pathway aggregation was taken after the observation of protein precipitation occurring in parallel with the formation of amyloid fibrils of lysozyme [10]. Earlier, kinetic analysis of the amyloid aggregation of the islet amyloid polypeptide (IAPP) had suggested the formation of intermediate on- and off-pathway phases during IAPP fibrillogenesis [11]. The presence of non-amyloidogenic species produces perceptible deviations from the time evolution of the amyloid signal expected for the generic nucleation and growth processes of the phase transition, given by Equation (1), where α is the normalized amyloid conversion, and k_a and k_b are combinations of elementary rate constants [12]. One of the kinetic signatures found to be associated with off-pathway aggregation was the unusually weak dependence of the lag phase duration on the initial concentration of lysozyme [10]. Similar behaviors observed with other protein models have provided the basis for varied interpretations of the amyloid aggregation mechanism encompassing, for example, Michaelis-Menten-like saturation of the elongation step [13], complex sub-steps of nucleation and growth [14], stochastic fluctuations in the nucleation time [15], and the suppression of fibril fragmentation at high fibril concentrations [16]. Unlike these possible explanations for the weaker-than-expected scaling laws, off-pathway aggregation can be directly investigated by analytical and microscopic techniques such as those used to identify insoluble aggregates of lysozyme [10], and later on, soluble oligomers of ataxin-3 [17], and metastable oligomers of Aβ40 and Aβ42 peptides [18,19]. Because the formation of soluble and insoluble assemblies is fed by a common pool of protein monomers, we propose that amyloid fibrillation kinetics can be used to reveal the presence of the parallel oligomeric pathway.
To test this hypothesis, we chose two systems, human insulin and ataxin-3, for which the fibrillation kinetics have been measured under conditions of known oligomeric composition. Insulin is a protein hormone existing in solution in a thermodynamic equilibrium of monomers, dimers, tetramers, hexamers and higher-order oligomers [20,21]. Changes in the protein molecular structure induced by low pH, high temperature or the presence of organic solvents lead to the formation of amyloid fibrils through the direct association of insulin monomers [20], or by the assembly of intermediate on-pathway oligomers [22]. Ataxin-3 is a multi-domain protein with a globular Josephin domain and a C-terminal flexible tail containing a polyglutamine (polyQ) repeat whose expansion ultimately causes Machado-Joseph disease. Ataxin-3 aggregation involves an initial step mediated by the Josephin domain, and a second step dependent on the expanded polyQ tract that accelerates protein aggregation and promotes the formation of mature amyloid fibers [23,24]. The analysis of the thioflavin-T (ThT) binding assay run at different concentrations of human insulin and ataxin-3 uncovers mechanistic aspects of the oligomeric and fibrillar pathways. The distinct aggregation mechanisms predicted for each protein are experimentally validated by time-course dynamic light scattering (DLS) measurements.

Dynamic Light Scattering
Dynamic light scattering measurements were performed using an ALV/DLS/SLS-5000F, SP-86 goniometer system (ALV-GmbH, Langen, Germany) equipped with a CW diode-pumped Nd:YAG solid-state Compass-DPSS laser with a symmetrizer (Coherent Inc., Santa Clara, CA, USA). The laser operates at 488 nm with an output power of 400 mW. The intensity scale was calibrated against scattering from toluene. Samples (700 µL) of 5 mg/mL insulin were incubated in glass cuvettes at 45 °C without mechanical shaking and periodically analyzed at a scattering angle of 90° to the incident beam. Hydrodynamic radii of the particles in solution were estimated from the diffusion coefficient(s) obtained by CONTIN analysis [25]. Discontinuous auto-correlation functions were not considered for CONTIN analysis.
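The conversion from diffusion coefficient to hydrodynamic radius is not spelled out above. A minimal sketch, assuming the usual Stokes-Einstein relation for spheres, R_h = k_B T / (6 π η D), is given below. The viscosity default is an assumption (pure water at 45 °C, about 0.596 mPa·s), not a value reported in the text, and the function name is illustrative only.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def hydrodynamic_radius(diff_coeff_m2_s, temp_k=318.15, viscosity_pa_s=5.96e-4):
    """Stokes-Einstein radius (in m) of a sphere with the given diffusion coefficient.

    temp_k defaults to 45 degrees C (the incubation temperature in the text);
    viscosity_pa_s defaults to an ASSUMED value for pure water at that temperature.
    """
    if diff_coeff_m2_s <= 0.0:
        raise ValueError("diffusion coefficient must be positive")
    return K_B * temp_k / (6.0 * math.pi * viscosity_pa_s * diff_coeff_m2_s)


if __name__ == "__main__":
    # Illustrative diffusion coefficients, not measured data.
    for d in (1e-10, 1e-12):
        radius_nm = hydrodynamic_radius(d) * 1e9
        print(f"D = {d:.1e} m^2/s -> R_h = {radius_nm:.1f} nm")
```

Because R_h scales as 1/D, a hundredfold drop in the diffusion coefficient corresponds to a hundredfold larger apparent radius, which is the order of the size change discussed for insulin fibrils later in the text.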

Results and Discussion
For the experimental conditions adopted for each model protein, human insulin and ataxin-3 produce fibrillar species with distinct morphologies (Figure 1A,B) and at markedly different aggregation rates (Figure 1C). Long, straight filaments of human insulin are formed much faster than the small, worm-like fibrils of ataxin-3, suggesting that the phase transition mechanisms are differently affected by the fibril elongation step. Chemical kinetic analysis pinpoints these differences and reveals how the presence of soluble oligomers influences each type of protein aggregation curve. The quantitative methods proposed here are expected to contribute to the identification of mechanistic changes provoked, e.g., by the presence of aggregation modulators or by different conditions of temperature, pH, ionic strength, etc.

Figure 1. Case study examples of human insulin and ataxin-3 aggregation. Transmission electron microscopy (TEM) micrographs of negatively stained fibrils of (A) 5 mg/mL human insulin and (B) 5 µM (0.218 mg/mL) ataxin-3 captured after 6 h and 65 h incubation, respectively (scale bars, 100 nm). (C) Schematic amyloid fibrillation curves representing the progress of normalized thioflavin-T (ThT) fluorescence (F/F_F) during the aggregation of human insulin and ataxin-3 in the range of protein concentrations studied by Foderà et al. [26] and Silva et al. [17], respectively. The half-life coordinates t_50 and v_50 are indicated by the arrows and by the slopes of dashed lines, respectively.

Mechanistic Analysis of Insulin Aggregation
In the simplified mechanism represented in Figure 2A, the elementary intermediate steps participating in the primary nucleation, secondary nucleation and elongation of insulin fibrils are summed up into the overall rate constants k n , k 2 , and k + , respectively [17]. The sigmoidal (rather than hyperbolic) progress curve of insulin fibrillation ( Figure 1C) points to low values of the parameter k b = k n /k a , which gives the relative weight of primary nucleation over the autocatalytic steps of secondary nucleation and elongation (k a = k 2 + k + ) [10]. The fast elongation rates suggested by the morphology of insulin fibrils ( Figure 1A) are confirmed by the high value of k a associated to the steep burst phase (and high v 50 value) in Figure 1C. The specific weight of secondary nucleation and elongation in determining the value of k a cannot be distinguished from single progress curve analysis because these steps follow similar rate laws [17]. Moreover, since fibril breakage (rate constant k − ) does not change the total mass of ThT-positive filaments but only their number [17], complementary measurements of fibril size distributions are required to directly assess the role of the breakage step.
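The link between k_a, k_b and the half-life coordinates can be sketched numerically. The snippet below assumes that Equation (1) has the crystallization-like form α(t) = 1 − 1/(k_b (e^{k_a t} − 1) + 1); this reading is an assumption of the sketch, as are the function names. Under that form, t_50 = ln(1 + 1/k_b)/k_a and v_50 = k_a (1 + k_b)/4 follow by setting α = 0.5 and differentiating. The numerical values are the insulin parameters quoted in this article (k_a = 4.13 h^−1, k_b = 6.90 × 10^−9).

```python
import math


def alpha(t, k_a, k_b):
    """Normalized amyloid conversion for the ASSUMED form of Equation (1)."""
    return 1.0 - 1.0 / (k_b * math.expm1(k_a * t) + 1.0)


def t50(k_a, k_b):
    """Time at which alpha reaches 0.5 (same time units as 1/k_a)."""
    return math.log1p(1.0 / k_b) / k_a


def v50(k_a, k_b):
    """Slope d(alpha)/dt evaluated at t50."""
    return k_a * (1.0 + k_b) / 4.0


if __name__ == "__main__":
    k_a, k_b = 4.13, 6.90e-9  # insulin values quoted in the text (h^-1, dimensionless)
    print(f"t50 = {t50(k_a, k_b):.2f} h, v50 = {v50(k_a, k_b):.3f} 1/h")
    # Small k_b gives a sigmoidal curve (long lag); large k_b gives a hyperbolic one.
    for k_b_test in (1e-9, 1.0):
        early = alpha(0.1 * t50(k_a, k_b_test), k_a, k_b_test)
        print(f"k_b = {k_b_test:g}: alpha at 10% of t50 = {early:.3g}")
```

With the insulin parameters, the computed half-life falls close to the ∼4.5 h at which the DLS data reported later show the aggregate size levelling off, which is a useful consistency check on the assumed form.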
Prior knowledge of the protein oligomerization state is required before we can move into the deeper levels of the different aggregation pathways [27]. The oligomerization equilibrium of human insulin ( Figure 2B) has been characterized by Bocian et al. [21] using 2D and pulsed field gradient spin echo (PFGSE) nuclear magnetic resonance (NMR). It is, therefore, possible to estimate the availability of insulin monomers under conditions of total protein concentration, presence of zinc, and acidic pH that are similar to those adopted by Foderà et al. [26] while measuring amyloid fibrillation kinetics. Based on the knowledge of the values of the monomer concentration C 1−mer ( Figure 2C), peculiar scaling laws of equilibrium ( Figure 2D) and kinetic ( Figure 2E,F) parameters can be explained using a number of fitted parameters commensurate with the number of independent observations. As an indicator of the amount of amyloid fibrils produced, the final ThT fluorescence intensity (F F ) (pink line in Figure 2D) is not directly determined by the total protein available (closed circles in Figure 2D) or even by the monomer concentration alone. Since protein aggregation takes place until the monomer concentration C 1−mer equals the thermodynamic solubility C * , the value of F F reflects the difference (C 1−mer − C * ) otherwise known as supersaturation (∆C) [12]. This is illustrated in Figure 2D (blue line) with no other fitting parameters than the fluorescence proportionality constant (in arbitrary units) and the insulin solubility, which is a measurable quantity. Besides confirming amyloid fibrillation as a phase transition process driven by supersaturation, the F F scaling law is consistent with a mechanism of monomer addition admitting no supplementary contribution from pre-existing soluble oligomers to the final ThT fluorescence signal. 
Consequently, the amyloid pathway ( Figure 2A) and the oligomeric equilibrium ( Figure 2B) are found to take place over distinct timescales, with insulin monomers being consumed by the first process at much faster rates than they are produced by the second. The separation of timescales simplifies the application of analytic model equations that were originally derived by assuming the soluble protein fully dissociated [12]. Theoretical curves of t 50 and v 50 vs. protein concentration can be computed using the equations in Figure 2E,F (see Appendix A for details), after expressing k a and k b as a function of ∆C (and of C 1−mer ). If, as it seems to be the case of insulin, fibril elongation predominates over secondary nucleation (i.e., k a ≈ k + and k b ≈ k n /k + ), then both k a and k b are proportional to the initial supersaturation (∝ ∆C) considering that [10,12]: These simple premises and two model parameters are sufficient to elucidate the unconventionally weak C T -dependence of t 50 ( Figure 2E) as being the result of the lower molar fractions of insulin monomer observed for higher protein concentrations ( Figure 2C). If the associated states of soluble insulin were ignored, the lower limits usually admitted for the absolute scaling factor |γ| would be too high to reproduce the measured trend in Figure 2E (red lines). Remarkably, the set of parameters, k a and k b fitted to the lag-time scaling data in Figure 2E are the same as those that describe the aggregation rate data in Figure 2F (Appendix B). In both cases, the used value of C * is the one resulting from the interpretation of Figure 2C. Far from being redundant, the confirmation of kinetic predictions by different and independent measurements provides unequivocal evidence that the present theoretical framework, with only two model parameters, is indeed valid.
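The effect of monomer depletion on the t_50 scaling can be illustrated with a toy calculation. The sketch below is not the fitted insulin model: it assumes a textbook isodesmic equilibrium with mass balance C_T = C_1/(1 − K C_1)^2, the crystallization-like form of Equation (1) (so that t_50 = ln(1 + 1/k_b)/k_a), and k_a and k_b both proportional to ∆C = C_1 − C*, as stated above. The association constant K, the solubility C* and the proportionality coefficients are hypothetical placeholders.

```python
import math


def isodesmic_monomer(c_total, k_assoc):
    """Free monomer concentration for C_T = C_1 / (1 - K*C_1)**2 (root with K*C_1 < 1)."""
    y = k_assoc * c_total
    # Algebraically stable form of the quadratic root; reduces to c_total when K = 0.
    return 2.0 * c_total / (2.0 * y + 1.0 + math.sqrt(4.0 * y + 1.0))


def t50_from_total(c_total, k_assoc, c_star, a_coeff, b_coeff):
    """t50 when k_a = a_coeff*dC and k_b = b_coeff*dC, with dC = C_1 - C*."""
    delta_c = isodesmic_monomer(c_total, k_assoc) - c_star
    if delta_c <= 0.0:
        raise ValueError("solution is not supersaturated")
    return math.log1p(1.0 / (b_coeff * delta_c)) / (a_coeff * delta_c)


def local_exponent(func, c_total, rel_step=1e-4):
    """Numerical d ln(func) / d ln(C_T), i.e. the local scaling exponent gamma."""
    hi, lo = c_total * (1.0 + rel_step), c_total * (1.0 - rel_step)
    return (math.log(func(hi)) - math.log(func(lo))) / (math.log(hi) - math.log(lo))


if __name__ == "__main__":
    # HYPOTHETICAL parameters: C* = 0.05 mg/mL, a = 1 mL/(mg h), b = 1e-9 mL/mg.
    params = dict(c_star=0.05, a_coeff=1.0, b_coeff=1e-9)
    for k_assoc in (0.0, 1.0):  # 0.0 means no oligomers; 1.0 mL/mg is illustrative
        gamma = local_exponent(lambda c: t50_from_total(c, k_assoc, **params), 5.0)
        print(f"K = {k_assoc}: gamma at C_T = 5 mg/mL is {gamma:.2f}")
```

In this toy setting the exponent is close to −1 when all protein is monomeric and much closer to 0 when a large fraction is locked in oligomers, which is the qualitative behavior described for Figure 2E.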

Mechanistic Analysis of Ataxin-3 Aggregation
The study of ataxin-3 aggregation follows the same underlying principle that was adopted for human insulin, and has a similar purpose: to show how traditional kinetics can be markedly distorted by the presence of soluble oligomers. As in the case of other polyQ-repeat proteins [28], the formation of ataxin-3 fibrils and the dissociation of ataxin-3 oligomers occur simultaneously (Figure 3A), and thus timescale separation cannot be assumed as a simplifying hypothesis. Supported by DLS, size-exclusion chromatography and TEM data, a detailed account of the different steps shown in Figure 3A was recently provided [17], including quantitative estimations of the rate constants κ_1+, κ_1−, κ_n+, κ_n− characterizing the elementary steps of ataxin-3 oligomerization. The worm-like fibrils shown in Figure 1B are predominantly formed by secondary nucleation (k_a ≈ k_2) and primary nucleation (k_b ≈ k_n/k_2), with minor contributions from the fibril elongation (k_+ ≈ 0) and fibril breakage (k_− ≈ 0) steps [17]. The measured effect of protein concentration on the ThT fluorescence progress curves (Figure 3B, symbols) cannot be fully assessed unless the oligomeric pathway is taken into account: on the whole, the black lines in Figure 3B are indicative of good numerical fits, yet they are based on Equation (1), which ignores the occurrence of the parallel reactions of soluble oligomer formation/dissociation. However elaborate the theoretical model may be, the fitted parameters are, in this limited scenario, comparable to semi-empirical coefficients with no evident fundamental meaning. In the illustrative case of Figure 3B, the empirically determined values of k_a and k_b would follow a proportional relationship with protein concentration, which is not reconcilable with established theories (Figure S1).

Figure 3 (caption partially recovered). (A) [...] [17]. The steps of amyloid fibril formation are the same as in Figure 2A. The mass of amyloid fibrils is a function of only k_a and k_b, whereas the number of filaments is also influenced by fibril breakage and by the critical size of fibrils formed by primary and secondary nucleation (R* and R*_2, respectively). (B) Symbols: ThT fluorescence increase measured for ataxin-3 concentrations of (from top to bottom) C_T = 10 µM, 7 µM, 5 µM, 4 µM and 2 µM [17]. Lines: individual (black) and global (blue) fittings of the experimental data by Equations (1) [...]. (D) Red-shadowed area: typically, v_50 is positively correlated with C_T (and with 1/t_50) [29]. Measured data were adapted with permission from Silva et al. [17]. Copyright 2018 John Wiley and Sons.
Instead of using the amyloid fibrillation model in its closed form solution, the original differential equation, was solved simultaneously with the oligomerization rate equilibrium, Equations (A6) and (A7) (Appendix A), and then fitted to the ThT fluorescence progress curves ( Figure 3B, blue lines). Although computationally more demanding than the approach followed with human insulin, the number of degrees of freedom remains unusually low as regards to complex biophysical problems: three independent scaling laws of t 50 ( Figure 3C), v 50 ( Figure 3D) and F F ( Figure 3E) are used to estimate no other unknowns but the scaling constants associated to k a and k b . Unlike the case of insulin, the value of ataxin-3 solubility is known beforehand to be very low (C * ≈ 0) as evidenced by values of monomer concentration lower than the detection limits under equilibrium conditions [17]. In contrast, since the autocatalytic rate constant of ataxin-3 is determined by the secondary nucleation step (k a ≈ k 2 ), a scaling exponent n 2 is now introduced to account for the poorly understood k 2 vs. ∆C relationship: In practice, different fitted parameters are provided by the individualized analysis of each progress curve in Figure 3B (black lines), whereas the global fit (blue lines) requires a single set of rate constants k a and k b to model both the aggregation assay and its scaling laws ( Figure 3C-E). The better goodness-of-fit statistics of the former procedure ( Figure S1) is not surprising since, as in the case of overparameterized problems, the individual numerical analysis is not cross-validated and tends to overfit the experimental error, therefore, compromising the model's predictive power [17,[30][31][32]. 
The global fitting confirms that pre-determined oligomerization constants can be integrated in aggregation reaction networks to explain highly peculiar kinetics, such as the very weak C T -dependence of t 50 (Figure 3C), and notably, the negative C T -dependence of v 50 ( Figure 3D). Although a more conventional result in the absence of quenching phenomena [33,34], the linear scaling law of the end-point ThT fluorescence ( Figure 3E) is explained by the dissociation of ataxin-3 oligomers occurring in the same time scale as amyloid fibrillation. The observed straight line crossing the origin also indicates that the soluble protein was converted into amyloid-like fibrils without the occurrence of significant monomer degradation during incubation [17]. The complex, yet self-consistent behaviors of half-life and end-point readings cross-validate the molecular-level implications arising from the definition of the secondary nucleation rate constant (k 2 ), and particularly, from the obtained value of the scaling exponent n 2 close to 0. A direct comparison with the fibril elongation step would suggest a first-order dependence of k 2 on the initial supersaturation ∆C since both rates linearly increase with the instantaneous values of supersaturation and fibril mass [10,17]. However, more than just a collisional rate coefficient, k 2 is an overall rate constant accounting for the rate-limiting steps leading to the formation of secondary nuclei [35]. According to classical nucleation theory [36], the nucleation promoting effect elicited by higher supersaturation levels (and lower energetic barriers for phase transition) can be, in part, counteracted by the concomitant decrease in the critical sizes of the primary (n * 1 ) and secondary (n * 2 ) nucleus. This extra contribution, which is not evident for primary nucleation of amyloid fibrils [12], seems relevant for the secondary nucleation of ataxin-3. 
Somewhat undervalued in regard to induction time measurements, half-life aggregation rates v 50 (or, equivalently, maximum aggregation rates) offer the opportunity to identify the predominant autocatalytic process. Whilst the scaling of t 50 is greatly influenced by primary nucleation, the scaling of v 50 is determined by the balance between elongation and secondary nucleation rates, with the effect of C T getting weaker as secondary nucleation becomes more important. Therefore, and similarly to what was concluded for insulin, the measured scaling laws of ataxin-3 aggregation are determined by the fibrillation mechanism itself and by the presence of thermodynamically stable, soluble aggregates that further deplete the concentration of free monomer in solution.

Model Predictions Are Further Confirmed by Size Distribution Analysis of Insulin Aggregation
The previous models present a detailed picture of the different steps affecting the formation of the insoluble filaments that can be further tested using DLS measurements of particle size distributions (PSDs). Contrary to what was observed for ataxin-3 [17], the size of insulin fibrils tends to increase over time until reaching hydrodynamic radii (R h ) above the micrometer scale- Figure 4A-D (insulin) and Figure 4E (ataxin-3). This is not surprising taking into account the TEM images obtained at the end of each aggregation assay ( Figure 1A,B), and the negligible role of the elongation step during ataxin-3 fibrillation. Another obvious difference to ataxin-3 is the persisting dominance of the left-side peak (R h < 10 nm) up to the end of the aggregation reaction ( Figure 4A-C). To a certain extent, this is explained by the value of protein solubility (C * ), which as already discussed, is much higher in the case of human insulin. While the final concentration of soluble ataxin-3 was too low to be detected by DLS [17], the C * value of insulin is responsible for the population of soluble protein to continue predominating, even after large insoluble aggregates are formed ( Figure 4C). Another reason explaining the modest increase in the intensity of scattered light of larger particles is associated with the dispersion of sizes and consequential broadening of PSDs provoked by the continuous elongation of old and newly-formed insulin fibrils, as opposed to the formation of ataxin-3 filaments with the constant dimension characteristic of the ataxin-3 secondary nucleus. In common with ataxin-3, fibril breakage has a minor role in determining the time variation of the PSD in quiescent insulin solutions: in the case of ataxin-3, the shape of these distributions did not change significantly during the burst and plateau phases of aggregation despite the increased relative importance of the population of ataxin-3 fibrils [17]. 
In the case of insulin, the elongation-dominated mechanism can be discerned from the expected fibril size increase during the burst phase (Figure 4A,B,D), whereas, after ∼4.5 h incubation, the mean aggregate size stabilizes at a constant value of R_h ≈ 1100 nm without any visible signs of fibril fragmentation (Figure 4C,D).

Figure 4 (caption partially recovered). [...] Dashed lines: representations of Equation (4) using values of k_a = 4.13 h^−1 and k_b = 6.90 × 10^−9 fitted beforehand to amyloid aggregation scaling laws (Figure 2), and R* = 5.7 nm, k_+ ≈ k_a and k_2 ≈ 0; lines from top to bottom: k_a = 4.13 × 1.2 h^−1, k_a = 4.13 × 1.1 h^−1 and k_a = 4.13 h^−1. Solid line: solution of the discretized population balance taking into account the presence of pre-assembled clusters (Appendix B, Section B.2). (E) Measured (symbols) and simulated (lines) evolution of R_h during ataxin-3 aggregation (adapted from Silva et al. [17]). Lines: representations of Equation (4) using previously fitted values of k_a and k_b, and R* = 91 nm, R*_2 = 15 nm, k_2 ≈ k_a, k_+ ≈ 0 and (from top to bottom) k_a/8, k_a/4 and k_a [17].
The absence of significant fibril breakage reinforces the thesis that the oligomerization pathway is the main reason for the weak concentration dependence of the lag phase duration. Therefore, the alternative suggestion put forward by Knowles et al. [29] ascribing the less-than-linear scaling laws to predominant fibril breakage could not be confirmed in the cases of insulin and ataxin-3 aggregation. Owing to the negligible influx of new filaments created by fibril fragmentation, the discretized population balance adopted by the crystallization-like model (CLM) can be simplified to the following closed-form solution [17]: with α given by Equation (1) and R * /R * 2 representing the ratio of hydrodynamic radii of primary and secondary nuclei. Interestingly, when the dominant autocatalytic process is fibril elongation (k a ≈ k + k 2 ), the value of R * can be estimated from the limiting case of Equation (4) for long reaction times using the values of k b and final fibril size (R ∞ ) as the only inputs. After replacing the values of k b = 6.90 × 10 −9 (fitted to the ThT aggregation data for C T = 5 mg/mL) and of R ∞ = 1100 nm (estimated by DLS) in Equation (5), the result of R * = 5.7 nm is obtained, which is a dimension slightly larger than the size of the insulin monomer. The contrast between this result and the critical size of 91 nm (corresponding to ∼ 1.5 × 10 5 monomers) found for the initial ataxin-3 cluster ( Figure 4E) indicates that the differences in the aggregation mechanism of the two proteins are already evident from the initial nucleation events. The higher entropic barrier that has to be overcome to generate the primary nucleus of ataxin-3 helps to explain why this phase transition process is so much slower than that of insulin. Although our estimations are not sufficiently accurate to describe the exact aggregation state of the primary nucleus of insulin, it seems clear that only a few monomers are required to originate the fibrillar aggregates. 
Such predictions of the critical amyloid size can be affected by the existence of large contaminant particles interfering with the final size estimation used in Equation (5). In the present case, a well-defined distribution of particles possibly consisting of disordered protein clusters with R h between ∼ 100 nm and > 1000 nm is identified right from the beginning of the DLS measurements ( Figure 4A,D). Next, we will show that its occurrence should not have affected the final PSDs. Differently from the emerging peak observed since the beginning of ataxin-3 aggregation [17], the initial size distributions shown for insulin in Figure 4A do not evolve in a clearly defined way until close to the burst phase of fibril elongation shown in Figure 4B. Pre-filtration of the insulin solution using 0.22 µm syringe filters efficiently removed these particles ( Figure S2), but it also delayed the onset of the fibrillation process until a point where the kinetic measurements of Foderà et al. [26] could not be reproduced anymore. Therefore, pre-assembled protein clusters act as important heterogeneous nucleation centers without which the rapid formation of ordered aggregates is compromised [37][38][39]. The low concentration of the insulin clusters (fraction of total protein < 10 −10 estimated from the initial PSDs) is high enough to conceal the initial progress of fibril sizes expected to start at R * = 5.7 nm and not from R h values greater than 100 nm (compare dashed lines and experimental values in Figure 4D). In order to include the contribution of pre-existing assemblies in the predicted PSDs, numerical simulations were carried out as previously described for ataxin-3 [17], with the additional introduction of a simple mechanism of cluster-fibril adhesion described in detail in Section B.2 of Appendix B and in Figure S3. 
The challenge was to reproduce the experimental results in Figure 4A-D, namely, the initial presence of pre-assembled clusters, the gradual vanishing of this population as new insulin fibrils are formed, and the final emergence of a differentiated population of large aggregates. This was achieved using the values of k a and k b fitted to ThT aggregation data and one additional fitting parameter establishing the physical limit of particle detection ( Figure 4D, solid line, and Figure S3A-C). The good agreement between theoretical and measured PSDs does not necessarily mean that cluster-fibril adhesion is the only mechanism capable of describing the size evolution of the initial clusters. In fact, since the numerical simulations assuming no pre-existing aggregates are still able to describe the later phase of fibril aggregation and the steady-state size distributions ( Figure S3D-F), it is conceivable that the scarce population of clusters could have declined by means of other mechanisms, involving, for example, dissociation processes elicited by the decreasing concentration of dissolved protein.
Although these hypotheses would imply the introduction of new model parameters such as cluster dissociation rate constants, the bottom line conclusion would remain that the final PSDs are negligibly affected by the presence of pre-assembled clusters.
To sum up, NMR, DLS and ThT aggregation data were used to conclude that soluble, partially oligomerized insulin gives rise to fibrillar aggregates by the processes of primary nucleation and subsequent fibril elongation with minor contributions from secondary nucleation and fibril breakage. A critical amyloid size of R * = 5.7 nm could be calculated for insulin using Equation (5) and the value of R ∞ estimated from the final PSD.

Systematization of Concepts
The conclusions drawn for insulin and ataxin-3 are expected to generalize well, not only because they are supported by a combination of complementary results (obtained, in the case of insulin, by two other research teams besides our own), but also as a consequence of the wide spectrum of behaviors covered by the two systems: from nearly irreversible (insulin) to fully reversible (ataxin-3) oligomerization, and from dominant elongation (insulin) to dominant secondary nucleation (ataxin-3). A linkage between the occurrence of soluble oligomers and amyloid fibrillation kinetics can be established from the analysis of equilibrium and kinetic proportionality relations, as summarized in Figures 5 and 6, respectively. The direct proportion of the end-point amyloid signal and protein concentration predicted in the absence of the oligomerization pathway ( Figure 5A), will not be observed in the cases of refractory or slow dissociating oligomers ( Figure 5B). If the formation of amyloid fibrils is capable of totally reversing the oligomerization equilibrium ( Figure 5C), the F F signal would not differ substantially from that of fully dissociated protein.
The building evidence associating soluble oligomers to the pathogenesis of neurodegenerative diseases allows us to anticipate a new interest in chemical kinetic analysis as a tool to identify potential modulators of off-pathway oligomerization. In this respect, the final fluorescence value is a direct measurement of the extent of the amyloidogenic reaction but it can also reveal whether parallel aggregation pathways are inhibited or promoted by test compounds. For example, IAPP mimics synthesized with N-methylated amide bonds inhibit the aggregation of IAPP and Aβ40 by stabilizing protein monomers and nontoxic oligomers, thereby shifting the equilibrium towards the production of less amyloid fibrils and eliciting lower F F values [40]. Although complementary measurements are required in order to validate oligomer modulation effects, final fluorescence analysis is well suited for primary screenings of large libraries of chemical compounds. The oligomerization pathway can be further probed by the analysis of kinetic scaling laws, which have different interpretations according to whether the autocatalytic step is fibril elongation ( Figure 6A-D) or secondary nucleation ( Figure 6E-H). In both cases, however, marked deviations from linearity are obtained in the presence of slowly dissociating oligomers. Absolute values of the scaling factor |γ| lower than 1 are admissible independently of the dominant secondary step ( Figure 6B,F). If a high degree of oligomerization persists during amyloid fibril formation, positive t 50 vs. C T dependences are also possible, especially when secondary nucleation is a predominant step ( Figure 6F). The aggregation rate v 50 is a useful comparator to gauge the kinetic impact of soluble oligomers based on marked deviations from the straight-line relationships ( Figure 6C,G), but also to identify cases of dominant fibril elongation (positive C T -dependence) and dominant secondary nucleation (neutral or negative C T -dependence).
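In practice, the scaling factor γ is read from the slope of a log-log plot of t_50 (or v_50) against C_T. A minimal sketch of that estimate using an ordinary least-squares slope is shown below. The function name is illustrative, the concentrations reuse the ataxin-3 series quoted earlier (2 to 10 µM), and the t_50 values are synthetic, generated from a power law purely to exercise the code.

```python
import math


def scaling_exponent(concentrations, values):
    """Least-squares slope of ln(values) versus ln(concentrations)."""
    if len(concentrations) != len(values) or len(values) < 2:
        raise ValueError("need at least two paired observations")
    xs = [math.log(c) for c in concentrations]
    ys = [math.log(v) for v in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return s_xy / s_xx


if __name__ == "__main__":
    c_total = [2.0, 4.0, 5.0, 7.0, 10.0]        # µM, concentrations used for ataxin-3
    t50_synth = [30.0 * c ** -0.3 for c in c_total]  # SYNTHETIC values, not measurements
    gamma = scaling_exponent(c_total, t50_synth)
    print(f"gamma = {gamma:.2f}")
    if abs(gamma) < 1.0:
        print("|gamma| < 1: weaker-than-linear dependence of t50 on C_T")
```

The same function applied to v_50 readings gives the sign of the C_T-dependence that the text uses to distinguish dominant elongation from dominant secondary nucleation; confirming either mechanism still requires the complementary measurements discussed above.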

Conclusions
In conclusion, the sigmoidal shapes of the ThT fluorescence aggregation curves of insulin and ataxin-3 indicated that primary nucleation is the rate-limiting step of amyloid fibril formation in both model proteins. Unconventionally weak $t_{50}$ scaling with protein concentration was explained by different aggregation mechanisms, involving, in one case (ataxin-3), dissociable soluble oligomers and rapid secondary nucleation, and in the other (insulin), refractory soluble oligomers and rapid fibril elongation. This was inferred from the analysis of the often disregarded measurables of end-point fluorescence and half-life aggregation rate, and could be confirmed by DLS and NMR results without overparameterization issues. Over and above the importance of reaction scaling laws to discriminate the mechanisms of protein aggregation, the rationale presented here is originally oriented to the discovery of new drugs targeting soluble oligomers. This compelling therapeutic target in neurodegenerative diseases [1,2,7], as well as in type 2 diabetes [41,42], remains largely unexplored except for very recent and encouraging candidate antibody therapies [43,44]. With the new chemical kinetic toolbox, amyloid binding assays can be utilized in either high-throughput screenings or drug repurposing strategies in the quest for disease-modifying, anti-oligomerization compounds.

Equation (A3), while the analysis of kinetic scaling laws requires supersaturation to be expressed in terms of monomer concentration ($C_1$). To estimate the relationship between the initial value of $C_1$ and the total insulin concentration ($C_T$), the isodesmic-type oligomerization equilibrium of insulin determined by Bocian et al. [21] (Figure 2B) was followed:
The oligomerization reaction scheme of ataxin-3 involves successive reversible steps of monomer addition (Figure 2A), each $n$-step being characterized by the rate constants of monomer aggregation ($\kappa_{n+}$) and dissociation ($\kappa_{n-}$). Except for the initial dimerization reaction (rate constants $\kappa_{1+}$ and $\kappa_{1-}$), all the subsequent steps are well characterized by the same fixed values of $\kappa_{n+}$ and $\kappa_{n-}$. Since the soluble oligomers do not participate in the formation of amyloid fibrils, the concentration of the $n$-mer ($C_n$) is solely determined by the linear equilibrium balance, whereas the concentration of monomer is also dictated by the rate of amyloid fibrillation ($dM/dt$):
with the sum term accounting for the capture and release of one protein molecule per elementary step. After expressing supersaturation in terms of monomer concentration, the CLM Equation (A2) is reformulated as [17]:
where $M_F = C_T - C^*$ corresponds to the final concentration of monomers present in the insoluble phase if all soluble $n$-mers become dissociated during the process of amyloid fibrillation.

Appendix B.1. Scaling Laws
The kinetic scaling laws of insulin fibrillation were analyzed as follows: Equation (A5) was numerically solved for different values of total insulin concentration ($C_T$) using previously determined equilibrium constants ($K_{12} = 4.9 \times 10^5$, $K_{24} = 5.0 \times 10^4$, $K_{46} = 2.7 \times 10^3$ and $K_{\mathrm{iso}} = 1.35 \times 10^4$) [21]. The obtained values of monomer concentration (Figure 2C) were used to estimate the amyloid solubility of insulin ($C^* = 0.029$ mg/mL) from the scaling law of the end-point ThT fluorescence with $C_T$ (Figure 2D). After recognizing fibril elongation as the predominant autocatalytic step during insulin fibrillation ($k_a \approx k_+$ and $k_b \approx k_n/k_+$), the dependence of the two CLM parameters on supersaturation was expressed as $k_a = k'_a \Delta C$ and $k_b = k'_b \Delta C$, with $\Delta C = C_1 - C^*$. These definitions were substituted into Equations (A4a) and (A4b), which were then solved using the measured values of $t_{50}$ and $v_{50}$ (Figure 2E,F, respectively). Finally, the values of $k'_a$ and $k'_b$ estimated for each insulin concentration were averaged to obtain $k'_a = 1.34 \times 10^2$ mL/mg/h and $k'_b = 2.41 \times 10^{-7}$ mL/mg.
In the case of ataxin-3, the set of differential equations comprising Equations (A6) and (A7) was numerically solved using MathWorks® MATLAB 2016b (Natick, MA, USA) to obtain the concentration of polymerized monomers ($M$) as a function of time for the cases of $C_T$ = 10 µM, 7 µM, 5 µM, 4 µM and 2 µM. In order to keep Equation (A6) manageable for numerical computation, a cut-off size of $n_\infty = 9 \times 10^6$ monomers was adopted as the maximum dimension of soluble ataxin-3 oligomers, and the condition $dC_n/dt = 0$ was imposed for $n \geq n_\infty$. The initial conditions were set assuming that no fibrillar aggregates are present in solution ($M(0) = 0$) and that the fractional compositions of monomers and $n$-mers correspond to those extracted from DLS measurements [17]. Since the predominant autocatalytic step during ataxin-3 fibrillation is secondary nucleation, $k_a \approx k_2$ and $k_b \approx k_n/k_2$. Owing to the low solubility of ataxin-3 ($C^* \approx 0$), the initial supersaturation is directly proportional to $C_T$ and the CLM parameters are given as $k_a = k'_a C_T^{n_2}$ and $k_b = k'_b C_T^{2-n_2}$. The proportionality constants $k_2$ and $k_n$, and the order-of-reaction $n_2$, were estimated by minimizing the absolute error between predicted and measured ThT fluorescence ($F$) progress curves using the experimentally determined calibration curve $F$ (a.u.) = 0.70 × $M$ (µM) and the known oligomerization rate constants $\kappa_{1+} = 7.99 \times 10^{-4}$ µM$^{-1}$ h$^{-1}$, $\kappa_{1-} = 9.73$ h$^{-1}$, $\kappa_{n+} = 0.167$ µM$^{-1}$ h$^{-1}$, and $\kappa_{n-} = 0.775$ h$^{-1}$ [17].
To simulate the $t_{50}$ and $v_{50}$ scaling laws in Figure 6, Equations (A6) and (A7) were numerically solved as described for ataxin-3 using illustrative values of $C^* = 0$, $k_a = 0.4\,C_T^{n_2}$ h$^{-1}$ and $k_b = 5 \times 10^{-4}\,(C_T/5)^{2-n_2}$. Two limiting situations, corresponding to a predominant elongation step ($n_2 = 1$) or predominant secondary nucleation ($n_2 = 0$), were considered. Soluble protein was assumed to occur either as a monomer or as a dimer ($\kappa_{n+} = 0$ and $\kappa_{n-} = 0$), with the distribution of initial species given by the exemplar function $C_1 = 10 \times (1 - e^{-C_T/10})$ for $C_T$ values between 0 and 20 in arbitrary concentration units. The relative weight of the oligomer dissociation rate was investigated by changing the value of $\kappa_{1-}$ between 0 and 5 h$^{-1}$ while keeping a fixed value of $\kappa_{1+} = 0$. The normalized amyloid signal ($M/M_F$) was computed over time and the corresponding half-life coordinates, $t_{50}$ and $v_{50}$, were represented as a function of $C_T$ (Figure 6).
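The two ingredients of this simulation that do not depend on the omitted model equations can be sketched as follows: the exemplar initial monomer concentration, and the extraction of the half-life coordinates from a normalized progress curve. This is a minimal sketch under stated assumptions. The function names are hypothetical, $t_{50}$ is taken as the time at which $M/M_F = 0.5$ and $v_{50}$ as the slope of the normalized curve at that time, and a logistic curve is used as a stand-in for the solution of Equations (A6) and (A7), which is not reproduced here.

```python
import numpy as np

def initial_monomer(c_total):
    """Exemplar initial monomer concentration, C1 = 10 * (1 - exp(-C_T / 10))."""
    return 10.0 * (1.0 - np.exp(-np.asarray(c_total, dtype=float) / 10.0))

def half_life_coordinates(t, y):
    """Return (t50, v50) for a monotonically increasing normalized curve y(t)."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    t50 = np.interp(0.5, y, t)                   # time at half completion (needs increasing y)
    v50 = np.interp(t50, t, np.gradient(y, t))   # numerical slope evaluated at t50
    return t50, v50

# Initial monomer content over the concentration range used in the text (arbitrary units).
c_total = np.linspace(0.0, 20.0, 5)
c1 = initial_monomer(c_total)

# Stand-in progress curve (assumption): a logistic function, NOT the CLM solution.
rate, midpoint = 1.2, 8.0
t = np.linspace(0.0, 20.0, 2001)
y = 1.0 / (1.0 + np.exp(-rate * (t - midpoint)))

t50, v50 = half_life_coordinates(t, y)
print(f"t50 = {t50:.3f}, v50 = {v50:.3f}")
```

For a logistic curve the half-life coordinates are known analytically ($t_{50}$ equals the midpoint and $v_{50}$ equals one quarter of the rate parameter), which makes the stand-in a convenient check of the extraction step before it is applied to simulated or measured curves.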

Appendix B.2. Discretized Population Balance
The time evolution of the size distribution of insulin fibrils was simulated using the discrete population balance derived for general phase transition processes comprising the steps of primary nucleation, secondary nucleation, growth/elongation and breakage [12,17]. As previously described for ataxin-3, the concentration of filaments composed of $j$ monomers ($f_j$) is given as [17]:
The Kronecker delta functions in Equation (A8) set the sizes of the primary and secondary nuclei to fixed values of $n^*$ and $n_2^*$, respectively, with the latter being adopted as the smallest possible filament size ($j \geq n_2^*$). The Heaviside function establishes a minimum fibril size of $2n^* + 1$ molecules above which fragmentation starts to occur [46]. In the case of insulin, this equation is simplified since secondary nucleation and fibril breakage take place to a negligible extent ($k_2 \approx 0$ and $k_- \approx 0$). The previous Equation (A2) can be obtained from Equation (A8) by extending the sum of $j \times df_j/dt$ to all filaments [17]:
A population of pre-existing clusters with hydrodynamic radii $R_h > 100$ nm was identified in the analyzed insulin solutions (Figure S2). The DLS intensity peak corresponding to this population gradually vanished from the measured size distributions as fibril elongation took place (Figure 4A-D). To simulate this behavior, a simple particle adhesion mechanism is proposed in which pre-existing clusters composed of $k$ monomers are considered to join the available $j$-mer fibrils and form heterogeneous agglomerates composed of $l = k + j$ monomers. During the initial phases of the reaction, a number of pre-existing clusters remain isolated because there are fewer fibrils than clusters.
distributions following the Rayleigh law of light scattering [17]. Importantly, the final size distributions were not significantly affected by the presence of initial clusters, as indicated by the simulated results for $P_c(0) = 0$ (Figure S3D-F).
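The step linking the population balance to the mass-based rate equation, namely summing $j \times df_j/dt$ over all filaments, amounts to taking the first moment of the size distribution. The sketch below illustrates only this bookkeeping. The function name, the filament sizes and the concentration values are hypothetical, and the right-hand side of Equation (A8) is not reproduced.

```python
import numpy as np

def first_moment(sizes, values):
    """Sum of j * values_j over all filament sizes j."""
    return float(np.dot(np.asarray(sizes, dtype=float),
                        np.asarray(values, dtype=float)))

# Hypothetical discrete distribution: filament sizes j and their concentrations f_j.
sizes = np.arange(2, 7)                          # j = 2, 3, 4, 5, 6
f = np.array([0.5, 0.3, 0.1, 0.05, 0.05])        # arbitrary concentration units

# Hypothetical rates of change df_j/dt for the same sizes.
df_dt = np.array([-0.02, 0.01, 0.01, 0.005, 0.005])

mass = first_moment(sizes, f)           # polymerized monomer concentration, M = sum_j j * f_j
mass_rate = first_moment(sizes, df_dt)  # dM/dt = sum_j j * df_j/dt
print(f"M = {mass:.3f}, dM/dt = {mass_rate:.3f}")
```

Because the sum is linear, the time derivative of the first moment equals the first moment of the individual rates, which is why a mass-based expression can be recovered from the filament-level balance.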