This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

Protein aggregation is an important field of investigation because it is closely related to neurodegenerative diseases, to the development of biomaterials, and to the growth of cellular structures such as the cytoskeleton. Self-aggregation of protein amyloids, for example, is a complicated process involving many species and levels of structure. This complexity, however, can be dealt with using statistical mechanical tools, such as free energies, partition functions, and transfer matrices. In this article, we review general strategies for studying protein aggregation using statistical mechanical approaches and show that both canonical and grand canonical ensembles can be used in such approaches. The grand canonical approach is particularly convenient since competing pathways of assembly and disassembly can be considered simultaneously. Another advantage of using statistical mechanics is that numerically exact solutions can be obtained for all of the thermodynamic properties of fibrils, such as the amount of fibrils formed, as a function of initial protein concentration. Furthermore, statistical mechanical models can be used to fit experimental data when they are available for comparison.

Protein aggregation is an active, multidisciplinary field, with researchers and practitioners working in a broad range of disciplines, including biophysics, medicine, biomaterials, and pharmaceuticals. Given such diverse perspectives, it is not surprising that papers on protein aggregation differ widely in their emphasis and methodologies: from fundamental research on molecular mechanisms and aggregation pathways, to the search for biomarkers and drug targets, and even to the imaging of plaques in the brain and the dissolution of fibrils and amyloids.

Early applications of statistical mechanical methods to the studies of protein problems can best be represented by the treatment of helix-coil transitions in proteins by Zimm and Bragg in the 1950s [

Analogous to a one-dimensional Ising model [

It is in a similar spirit that our statistical mechanical treatment of protein aggregation has been developed [

To see the complication of protein aggregation processes, we use

Protofibrillar intermediates are heterogeneous, metastable aggregates already containing

One of the earlier applications of statistical mechanical methods to protein systems is the Zimm-Bragg model [

Although protein aggregation is the subject of the article, we use the simplest model, _{s}_{nuc}_{nuc}_{h}_{s}_{h}

In direct combinatorial approaches to solving the partition function, _{N}_{N}

where the term (

The Zimm-Bragg (ZB) model for helix-coil transitions in proteins assumes that any number of helical stretches may form along the chain, where residues could be involved in short-ranged interactions with other residues. Each residue could assume either a coil or a helical conformation, thus for nearest-neighbor interactions between residues, there are four possible combinations of a pair of residues. That is, two residues at positions

The partition function for the ZB model, _{N}

where the vectors 〈_{i}_{λ}_{i}_{1,}_{2,}_{N}_{i}

The transfer matrix elements are the statistical weights for a residue to occupy a given state, conditional on the state of its neighbor; they are relative weights rather than normalized probabilities. Thus, the transfer matrix used in the ZB model for helix-coil transitions in proteins has the following form:

where the column (_{i}

where “_{1} is the largest eigenvalue of
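The transfer-matrix evaluation of the ZB partition function can be sketched numerically. The snippet below is a minimal illustration only: it assumes the textbook Zimm-Bragg weights (a propagation parameter, called `s` here, and a nucleation parameter, called `sigma` here) and a chain whose first residue is preceded by a virtual coil residue. The function names are hypothetical and are not taken from the article. The code applies the 2 × 2 transfer matrix N times, provides a brute-force enumeration for checking small chains, and gives the largest eigenvalue in closed form, whose logarithm should approach ln Z_N / N for long chains.

```python
import itertools
import math


def zb_partition_function(n, s, sigma):
    """Z_N from n applications of the assumed 2x2 ZB transfer matrix.

    (v_h, v_c) holds the summed weights of all partial chains whose last
    residue is helix or coil. Residue 0 is a virtual coil (an assumption).
    """
    v_h, v_c = 0.0, 1.0
    for _ in range(n):
        # helix after helix: s; helix after coil: sigma*s; coil after anything: 1
        v_h, v_c = v_h * s + v_c * sigma * s, v_h + v_c
    return v_h + v_c


def zb_partition_brute_force(n, s, sigma):
    """Same Z_N by explicit enumeration of all 2**n conformations."""
    total = 0.0
    for config in itertools.product("hc", repeat=n):
        weight, prev = 1.0, "c"
        for state in config:
            if state == "h":
                weight *= s if prev == "h" else sigma * s
            prev = state
        total += weight
    return total


def largest_eigenvalue(s, sigma):
    """Largest root of lambda**2 - (1 + s)*lambda + s*(1 - sigma) = 0."""
    return 0.5 * ((1.0 + s) + math.sqrt((1.0 - s) ** 2 + 4.0 * sigma * s))


if __name__ == "__main__":
    s, sigma = 1.2, 1e-2   # illustrative values, not fitted parameters
    print(zb_partition_function(10, s, sigma), zb_partition_brute_force(10, s, sigma))
    print(math.log(zb_partition_function(1500, s, sigma)) / 1500,
          math.log(largest_eigenvalue(s, sigma)))
```

The enumeration scales as 2^N and is only practical for short chains, whereas the transfer-matrix loop scales linearly in N, which is the practical reason for using the matrix formulation.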

Once the partition function for the protein is known explicitly using the ZB theory, some average quantities can be defined and compared with experiments. The average fraction of residues in a chain of length

where _{H}
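As a hedged numerical illustration of how an average helical fraction follows from the partition function, the sketch below assumes the textbook Zimm-Bragg weights (propagation parameter `s`, nucleation parameter `sigma`; both names are chosen here) and uses the fact that every helical residue contributes exactly one factor of `s`, so that the helical fraction equals (1/N) ∂ ln Z_N / ∂ ln s. The derivative is taken by a central finite difference, and the partition function is accumulated in logarithmic form to avoid overflow for long chains. All function names are hypothetical.

```python
import math


def zb_log_partition(n, s, sigma):
    """ln Z_N for the assumed ZB weights, with per-step renormalization."""
    v_h, v_c = 0.0, 1.0          # virtual residue 0 is coil (an assumption)
    log_z = 0.0
    for _ in range(n):
        v_h, v_c = v_h * s + v_c * sigma * s, v_h + v_c
        norm = v_h + v_c
        log_z += math.log(norm)  # accumulate the factor removed below
        v_h, v_c = v_h / norm, v_c / norm
    return log_z


def helix_fraction(n, s, sigma, step=1e-5):
    """theta = (1/N) d ln Z_N / d ln s, by central difference in ln s."""
    up = zb_log_partition(n, s * math.exp(step), sigma)
    down = zb_log_partition(n, s * math.exp(-step), sigma)
    return (up - down) / (2.0 * step * n)


if __name__ == "__main__":
    # Scan s through the transition for a weakly and a strongly cooperative chain.
    for s in (0.8, 0.9, 1.0, 1.1, 1.2):
        print(s, helix_fraction(200, s, 1.0), helix_fraction(200, s, 1e-3))
```

With `sigma = 1` the residues are independent and the helical fraction reduces to s/(1 + s), which provides a simple check; smaller values of `sigma` make the transition around s = 1 sharper.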

Similar averages are calculated for the sheet-coil and helix-sheet-coil systems. In general for a chain of length _{j}

where

and a similar result can be derived for

In 1961, Oosawa and Kasai constructed a model for equilibrium protein aggregation using ideas from the helix-coil theory that were being developed at the same time. First, in the model the total number of proteins in the system is fixed and is denoted by _{tot}_{eq}

where _{k}

The nucleation equilibrium constant is often denoted _{eq}_{1}, and dimers, _{2}, can also be written as:

Once a dimer is formed, then trimer, . . .,

where

Since the total protein mass in the system is conserved, _{tot}

Therefore, in the thermodynamic limit

where the sum converges when _{1}_{tot}_{1}.
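The mass-conservation condition and the convergence requirement can be made concrete with a short numerical sketch. It assumes a nucleation-elongation scheme in which dimers form with an equilibrium constant `k_nuc` and every later monomer addition has an equilibrium constant `k_el`, so that c_2 = k_nuc c_1² and c_i = k_el c_1 c_{i−1} for i ≥ 3. These parameter names, the function names, and the choice of a dimer nucleus are assumptions made for illustration. Under them the sum over aggregates is a geometric-type series that converges only for k_el c_1 < 1, and the free monomer concentration follows from a one-dimensional root search.

```python
def total_concentration(c1, k_nuc, k_el):
    """Total protein concentration as a function of free monomer c1.

    Uses sum_{i>=2} i*c_i = k_nuc*c1**2*(2 - x)/(1 - x)**2 with x = k_el*c1,
    which is valid only for x < 1.
    """
    x = k_el * c1
    if not 0.0 <= x < 1.0:
        raise ValueError("the sum over aggregates only converges for k_el*c1 < 1")
    return c1 + k_nuc * c1 ** 2 * (2.0 - x) / (1.0 - x) ** 2


def free_monomer(c_tot, k_nuc, k_el, iterations=200):
    """Solve total_concentration(c1) = c_tot for c1 by bisection (k_nuc > 0)."""
    lo = 0.0
    hi = min(c_tot, (1.0 - 1e-12) / k_el)   # stay just below the divergence
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if total_concentration(mid, k_nuc, k_el) < c_tot:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    k_nuc, k_el = 1e-3, 1.0   # illustrative values in reduced units
    for c_tot in (0.1, 0.5, 1.0, 2.0, 10.0):
        c1 = free_monomer(c_tot, k_nuc, k_el)
        print(c_tot, c1, 1.0 - c1 / c_tot)   # last column: aggregated fraction
```

When `k_nuc` is much smaller than `k_el`, the free monomer concentration saturates just below 1/`k_el` as the total concentration grows, which is the critical-concentration behavior usually associated with this type of model.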

A more recent approach to equilibrium peptide assembly introduced by van Gestel and van der Schoot relates the concentrations of protein aggregates to equilibrium partition functions [

This modeling approach is consistent with a recent set of experiments [

The isolated monomer in the ZB model for aggregation is assumed to be a natively unstructured protein. A “helix” protein is defined if _{helix}_{sheet} and _{helix}_{coil}, where _{helix} can be defined by using

All of the aggregate species in a system of volume

where _{k}_{k}_{k}_{k}

which contains an entropy of mixing term as well as the free energy of the aggregate of size _{T}

where

A generalized ZB model for protein aggregation can now be defined by an effective Hamiltonian. The effective Hamiltonian is used to find a transfer matrix by assuming the interactions between aggregates are described by a nearest-neighbor, Ising-like model, in which each protein can be in either of two states: a sheet (or helix) conformation or a coil conformation. The interactions include the free energy

where _{1,}_{2,}_{N}_{i}

where ^{β}^{(}^{N}^{−1)}^{K}^{N}^{−1} with _{B}_{B}_{B}_{B}

where _{1} = exp(−2_{1} = exp(

Amyloid formation is generally believed to be dominated by 1D or quasi-1D chains of proteins, which may then bundle into protofibrils and fibrils. For this reason, the transfer matrix formulation of statistical mechanics, suitably extended, is a powerful technique for studying amyloid formation. We focus on this extension in this and the following sections.

To include helix, sheet, and coil conformations in a single model, a Potts model [

where the Kronecker delta _{j}_{j}_{+1}) ≡ exp(−2_{j}_{j}_{+1})) and _{j}_{j}_{+1}) _{1}; _{2}; and _{3}. The notation can be simplified by letting _{1}, _{2}, and finally _{3}.

The propagation parameters _{1} and _{2} are associated with the free energies _{1} and _{2} that refer to the interaction between the

and the partition function for

The Ising-like ZB model for sheet-coil (or helix-coil) transitions in aggregates, _{i}_{i}

To describe the fibrils, which may contain several filaments that can be described by

where the free energy _{y}_{x}_{1} was free energy of an interaction between a sheet protein and one of its neighbors. Other effective Hamiltonians could also be written to describe the case where the filaments are not aligned in register with each other, as well as cases where the protein conformation may also play a role in the assembly of the filaments into full fibrils [

In addition to conformational changes in filaments and fibrils, another complication is that the kinetic pathways to the formation of fibrils seem to be sequence-dependent [

We can model the equilibrium aggregates of A

The position of a vertex within the strip is specified by coordinates (_{x}_{y}_{T}_{x}_{y}_{y}_{y}

The interactions between the proteins in these aggregates are modeled similarly to the 2-helix chain model for proteins proposed by Skolnick [

As mentioned, the nucleus is the smallest equilibrium aggregate in our formulation. The next smallest aggregate occupies the first two columns of the strip lattice and contains 2_{y}_{y}_{y}_{1}_{2}

where _{fil}_{y}

where 2

where _{1} → 1) yields independent strips parallel to the ^{Ly}^{Ly}_{1}, _{2} and _{3}. The partition function for aggregates on the strip lattice can be calculated by plugging the eigenvalues, _{2}_{D,i}

where, again, _{i}

Using the ZB formalism, we can also model fibrils in the

where we assume that the interaction between any two sheet proteins from adjacent aggregates is also quantified by the free energy

In general, the transfer matrix for the 2D model has dimension ^{Ly}^{Ly}^{2}^{Ly}^{2}^{Ly}_{y}
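The growth of the transfer matrix with the strip width can be illustrated with a small sketch. It assumes a simplified two-state (coil = 0, sheet = 1) contact model in which each sheet protein carries a weight `w_sheet`, each sheet-sheet contact along the fibril axis a weight `w_x`, and each sheet-sheet contact between neighboring filaments a weight `w_y`. These weights, the periodic boundary along the fibril axis, and all function names are assumptions made for illustration and do not reproduce the article's effective Hamiltonian. With two states per protein, a column of `ly` proteins has 2^ly configurations, so the column-to-column transfer matrix is 2^ly × 2^ly.

```python
import itertools


def column_weight(prev, cur, w_sheet, w_x, w_y):
    """Weight of column `cur` given the preceding column `prev`."""
    n_sheet = sum(cur)
    n_y = sum(cur[k] * cur[k + 1] for k in range(len(cur) - 1))  # within column
    n_x = sum(p * c for p, c in zip(prev, cur))                  # between columns
    return w_sheet ** n_sheet * w_y ** n_y * w_x ** n_x


def strip_transfer_matrix(ly, w_sheet, w_x, w_y):
    """2**ly x 2**ly matrix indexed by the configurations of adjacent columns."""
    states = list(itertools.product((0, 1), repeat=ly))
    return [[column_weight(prev, cur, w_sheet, w_x, w_y) for cur in states]
            for prev in states]


def matmul(a, b):
    """Plain-Python matrix product for small square matrices."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def partition_function_ring(ly, lx, w_sheet, w_x, w_y):
    """Z = Tr(T**lx) for lx columns, periodic along the fibril axis."""
    t = strip_transfer_matrix(ly, w_sheet, w_x, w_y)
    power = t
    for _ in range(lx - 1):
        power = matmul(power, t)
    return sum(power[i][i] for i in range(len(power)))


def partition_function_brute_force(ly, lx, w_sheet, w_x, w_y):
    """Same Z by enumerating all 2**(lx*ly) lattice configurations."""
    total = 0.0
    for flat in itertools.product((0, 1), repeat=lx * ly):
        cols = [flat[c * ly:(c + 1) * ly] for c in range(lx)]
        weight = 1.0
        for c in range(lx):
            # cols[c - 1] wraps around for c = 0, giving the periodic boundary
            weight *= column_weight(cols[c - 1], cols[c], w_sheet, w_x, w_y)
        total += weight
    return total
```

The exponential growth of the matrix dimension with `ly` is what limits exact transfer-matrix calculations to narrow strips; the brute-force routine is included only as a consistency check on very small lattices.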

To compare with experiments, some average properties of the dilute system of monomers and aggregates can be defined. The total number density for the strip or cube model for fibrils can be written as:

where _{x}_{y}

where the _{i}

and is directly related to the length of the fibrils. The expressions for 〈_{1}〉 and 〈_{y}_{=4} strip lattice, as illustrated in

which can be used to fit the AFM measurements of the average lengths of the fibrils. This relation also holds for the average length of a fibril described by the cube lattice model as depicted in _{y}

where the fugacity _{k}_{k}_{j}_{tot}

Now the average lengths of the aggregates can be computed. Next, the average fraction of the aggregates that are sheet, 〈_{2}〉, is calculated for the A_{2}〉 can be written as:

where _{2} is the sheet propagation parameter. The procedure for finding 〈_{2}〉 is quite general, and works for all of the transfer matrices that we have considered in this model.

In this subsection, our ZB-like model predictions are compared to the experimental results for the CD spectra of A_{2}〉_{Aβ}_{2}, the sheet interaction free energy, _{2}, the sheet-coil interfacial free energy, and

For the _{y}_{y}

where each contribution in _{y}_{2}〉_{αS}

The fit of 〈_{2}〉_{Aβ}_{1}, and to a lesser extent on the binding between filaments as quantified by _{2}, indicates that the interfacial tension between sheet and coil regions in the aggregates is modest. However, the fibril concentrations do not really increase from zero until nearly 100 μM according to the model predictions, but

The model predictions for the strip model of _{ƒib}/_{2}〉_{Ly}_{=4} give nearly the same result for the concentrations used in the AFM experiments.

The fits of the AFM data done by van Raaij _{2}, whereas in van Raaij’s model prediction, the free energy barrier between adjacent sheet and coil proteins in the aggregates does not seem to be present. A finite contribution from _{2} means the fibrils will have longer stretches of sheet content when compared to cases when _{2} is closer to zero when there is little or no penalty between coil to sheet regions in the fibrils. Additionally, our model predicts that the inter-filament interactions,

When fitting the A

As mentioned, the data sets were fit using various boundary conditions, and the open boundary case (proteins could adopt either conformation at the boundaries) was found to be the best choice for fitting for the average lengths of the

The models for fibrils discussed so far do not take into account interactions between protein and solvent, or the free energy associated with nucleus formation. The ZB model for aggregation can be extended to include these effects by using a grand canonical model. The main differences between the canonical and grand canonical approaches are: (1) in the grand canonical model, aggregates of all sizes are included; (2) an aggregate phase and a solution phase are in equilibrium; (3) chemical potentials can be used to relate the solution phase to the aggregate phase [

The solution phase is defined by specifying the chemical potential for protein monomers in the solution, which can be written as [

where the subscript “_{ST}_{SR}_{agg}_{agg}

where “_{PC}_{PV}

With the simple statistical mechanical model summarized in the sections to follow, we can relate the chemical potential contribution from the protein interactions in aggregates, _{PC}

As a first step in generalizing the canonical effective-Hamiltonian models presented earlier, the free energy

To write down an effective Hamiltonian that can include the free energy

where the lattice-gas variable _{i}_{i}_{pp}_{i}_{i}_{+}_{nc}) = 1 − _{i}_{i}_{+}_{n}_{c}) ensuring that there is solvent at site _{c}

Since the number of proteins on the lattice can fluctuate, this description of protein aggregation uses the grand canonical ensemble. The lattice-gas formalism,

where _{PC}_{c}_{i}_{i}
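To illustrate how a grand canonical lattice-gas description yields an average protein occupancy, the sketch below treats a deliberately stripped-down case: a 1D lattice whose sites are either empty (solvent) or occupied by a protein, with a nearest-neighbor protein-protein attraction and a chemical potential. It omits the conformational states and solvent clusters of the full model, and the dimensionless parameters `beta_mu` (chemical potential over k_B T) and `beta_eps` (attraction strength over k_B T), as well as the function names, are assumptions introduced here. In the long-lattice limit the grand partition function is dominated by the largest eigenvalue of a 2 × 2 transfer matrix, and the mean occupancy is its logarithmic derivative with respect to `beta_mu`.

```python
import math


def largest_eigenvalue(beta_mu, beta_eps):
    """Largest eigenvalue of T[n][m] = exp(beta_mu*(n + m)/2 + beta_eps*n*m).

    n, m in {0, 1} are the occupancies of neighboring sites, so that
    T = [[1, sqrt(z)], [sqrt(z), z*exp(beta_eps)]] with fugacity z.
    """
    z = math.exp(beta_mu)
    t11 = z * math.exp(beta_eps)
    return 0.5 * ((1.0 + t11) + math.sqrt((1.0 - t11) ** 2 + 4.0 * z))


def mean_occupancy(beta_mu, beta_eps, step=1e-6):
    """<n> = d ln(lambda_1) / d(beta_mu), by central finite difference."""
    up = math.log(largest_eigenvalue(beta_mu + step, beta_eps))
    down = math.log(largest_eigenvalue(beta_mu - step, beta_eps))
    return (up - down) / (2.0 * step)


if __name__ == "__main__":
    # Occupancy versus chemical potential, without and with attraction.
    for beta_mu in (-4.0, -2.0, 0.0, 2.0):
        print(beta_mu, mean_occupancy(beta_mu, 0.0), mean_occupancy(beta_mu, 2.0))
```

Without attraction the result reduces to the ideal lattice-gas occupancy z/(1 + z); with attraction the lattice fills at a lower chemical potential, and half filling occurs at `beta_mu = -beta_eps` in this 1D model.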

where _{1}, _{1}, and _{PC}

The inter-filament interactions between two 1D filaments are treated using the same methodology introduced in earlier sections. In general, the Hamiltonian for an _{x}_{y}

where ℋ_{fil}_{y}

Since nucleation cannot in reality occur in 1D, we consider a similar model for aggregates that positions the nucleus along the

where −_{pp}_{y}

For either description of fibrils (model A or B), the total number of proteins on a strip lattice is then

where the sums over {_{T}

where
^{n}^{c}^{L}^{y} × (^{n}^{c}^{L}^{y} and has (^{n}^{c}^{L}^{y} number of eigenvalues, whereas the transfer matrix
^{L}^{y} × (^{L}^{y} and has (^{L}^{y} number of eigenvalues.

To compare with experiments, we can define quantities similar to _{p}

respectively. Other quantities may also be defined including the average lengths of filaments, and the average length of sheet stretches in aggregates [

We solve for _{PC}_{ST}_{SR}_{PV}_{PC}_{ST}_{SR}_{PV}_{PV}_{p}_{1} ≈ _{1} = 0.35 kcal/mol, and _{1} = 7.26 kcal/mol, _{1} ≈ 0 kcal/mol, and _{1} in the A

Statistical mechanical approaches to protein aggregation have been developed with a focus on the assembly of proteins into oligomers, protofibrils, and fibrils, and on the relation of these aggregates to neurodegenerative diseases. We have given a general summary of the field, presenting recent formulations of the ZB model based on both the canonical and the grand canonical approaches to amyloid formation. Some results are presented to show that these models can be used to interpret experimental observations as well as to provide phase diagrams [

More experimental data like the CD results of Terzi

Of course, a statistical mechanical approach to protein aggregation has serious limitations. It can only be used to study the equilibrium properties of the systems, not the rates of the processes involved nor the transient behaviors, such as quasi-equilibrium or kinetic trapping. Furthermore, the available experimental data that we can compare our theories with are so far extremely limited. Therefore, statistical mechanical models are not a tool for predicting assembly pathways. However, statistical mechanics can be used to show that experimental observations are consistent with the predictions of a certain route of aggregation, that is, given a route of aggregation and the associated effective free energy, statistical mechanical models can predict equilibrium distributions of oligomers, protofibrils, fibrils,

Pathway prediction is better achieved with kinetic models [

We would like to thank Frank Ferrone, Brigita Urbanc, and J. van Gestel for stimulating discussions.

The authors declare no conflicts of interest.


Cartoon illustrations of an A

A ZB model for protein aggregation is illustrated, where the proteins (circles) in aggregates can be coil (white), sheet (black), or helix (red marked with X) in conformation. The free energies associated with each conformation are listed, as well as the interfacial free energies _{1}, _{2}, and _{3} between helix-coil, sheet-coil, or helix-sheet regions, respectively.


Partial lists of chemical species that may exist in dynamic equilibrium with fibrils for A_{y}_{y}_{y}_{c}

(_{2}〉 for 1D, 2D, and 3D structures in the A_{y}_{2}〉 for _{fib}_{1} = 7.41_{2} = −2.47^{2} + ^{4})_{1} = _{2} = −1.64

Summary of protein conformation energies. A site could be occupied with a solvent cluster, denoted by _{c}_{1},

Proteins or solvent clusters may occupy lattice sites, where the front-view (_{c}

(_{p}_{1} ≈ _{1} = 0.35 kcal/mol, and _{1} = 7.26 kcal/mol, _{1} ≈ 0 kcal/mol, and _{c}_{c}

Summary of the ZB weights for two residues that are adjacent to each other in a protein.

Residue i − 1 | Residue i | Weight |
---|---|---|
c | c | 1 |
h | c | 1 |
c | h | σs |
h | h | s |