Filamentous Aggregates of Tau Proteins Fulfil Standard Amyloid Criteria Provided by the Fuzzy Oil Drop (FOD) Model

Abnormal filamentous aggregates that are formed by tangled tau protein turn out to be classic amyloid fibrils, meeting all the criteria defined under the fuzzy oil drop model in the context of amyloid characterization. The model recognizes amyloids as linear structures where local hydrophobicity minima and maxima propagate in an alternating manner along the fibril’s long axis. This distribution of hydrophobicity differs greatly from the classic monocentric hydrophobic core observed in globular proteins. Rather than becoming a globule, the amyloid instead forms a ribbonlike (or cylindrical) structure.


Introduction
The origin of amyloid transformation has attracted scientific attention for more than 35 years, at least since amyloids were acknowledged as the cause of various neurodegenerative disorders [1]. The coexistence of, and mutual relations between, Aβ amyloids and tau tangles, which result in the damage and destruction of synapses, are believed to provoke behavioral changes that are associated with cognitive impairment [2,3]. The appearance of amyloid fibrils is a consequence of the plasticity of proteins, which can adopt different conformational states [4]. Proteins with a high content of intrinsically disordered structural forms seem to be candidates for a partially folded state, which may transform into disordered aggregates with low packing [5,6].
Emergence of fibrillary structures is also thought to result from the involvement of intrinsically disordered proteins, especially at early phases of the folding process [7].
Reaching the form of highly packed, structured aggregates that are based mainly on β-structural forms opens the possibility of unlimited elongation of the highly packed, ordered amyloid form [8]. The presence of the β-structural form (cross-β) allows for propagation because the H-bond system can be organised on both sides. […] factors internal to the polypeptide itself. One should mention mutation-related amyloidosis [23]; however, prion-based amyloid transformation does not require the presence of a mutation, as in the case discussed here.
Shaking is known to promote amyloidogenesis, and it can hardly be called a chemical factor. Many factors, including chemical ones, have been identified as supporting amyloid transformation [24]. However, factors other than environmental ones (shaking in particular) are not the object of our analysis. One should also take into account that folding as well as misfolding takes place under macromolecular crowding conditions; nevertheless, the immanent presence of water makes the aqueous environment highly important [25].
Perhaps shaking disrupts the structure of the solvent in such a way as to prevent it from guiding "natural" conformational changes within the protein chain. Alternatively, shaking is notable for aerating the solvent. The resulting increase in the area of the liquid/gas interface may produce structural changes within the solvent itself.
In addition to analysis of the tau amyloid, as listed in Protein Data Bank [26], this work proposes an in silico experiment, which involves determining alternative structures that the tau amyloid sequence may attain (using specialized protein folding software, such as Robetta [27,28] and I-Tasser [29,30]), and performing folding simulations based on the FOD model. It turns out that the sequence is indeed capable of producing a globular form with a single, monocentric hydrophobic core. Subjecting globular structures to FOD characterization enables us to track changes that result in amyloidogenesis. The work focuses on three distinct structures: (1) the superfibril (seeking the causes behind its structural variability); (2) the protofibril (identifying the characteristic properties of amyloid structures); and (3) a single chain participating in the fibril. Our research is based on observations rooted in the FOD model, specifically, the linear propagation of hydrophobicity in amyloids (which prevents a shared hydrophobic core from forming). As discussed in [14,15], the presence of alternating bands of high and low hydrophobicity can be regarded as one of the principal indicators of amyloid transformation.

Abbreviations Used
FOD: Fuzzy Oil Drop model
RD: Relative Distance. The divergence entropy introduced by Kullback and Leibler (described in Methods), used to express the distance between two compared profiles, is an entropy-type quantity and therefore requires a reference distribution. This is why the T-O distance (T, theoretical idealized distribution; O, observed distribution), measured by divergence entropy, is compared with the O-R distance (R, uniform distribution deprived of any form of hydrophobicity concentration), also measured by divergence entropy. The ratio O|T / (O|T + O|R) measures the closeness of the O distribution to the T distribution relative to its closeness to the R distribution. The RD parameter is independent of polypeptide chain length, which makes it possible to compare different proteins.
RD (T-O-R): RD parameter calculated for two reference distributions, T (theoretical) and R (uniform)
RD (T-O-H): RD parameter calculated with respect to the reference distribution called the H distribution, based on the intrinsic hydrophobicity of the amino acids present in a particular polypeptide chain fragment
HvT: correlation coefficient expressing the relation between H (intrinsic hydrophobicity of amino acids) and T (theoretical, expected hydrophobicity for the idealized status of the residue)
HvO: correlation coefficient expressing the relation between H (intrinsic hydrophobicity of an amino acid) and its status as observed in a particular protein
TvO: correlation coefficient expressing the relation between T (idealized hydrophobicity) and O (observed in the protein under consideration)
phf-tau: paired helical filament-tau
phf-tauO: paired helical filament-tau as available in 5O3O (phf-tau in the symmetrical form of the superfibril)
phf-tauT: paired helical filament-tau as available in 5O3T (phf-tau in the asymmetrical form of the superfibril)
phf-tauL: paired helical filament-tau as available in 5O3L (phf-tau in a form of superfibril similar to phf-tauO)
IT-#: identification of a model constructed using the I-Tasser program, numbered 1-5 since 5 models were constructed using this program
ROB-#: identification of a model constructed using the Robetta program, numbered 1-5 since 5 models were constructed using this program
FOD-#: identification of a model constructed using the FOD model, numbered 1-5 since 5 models were constructed using this program
Tau (267-312): fragment of tau peptide; protein under PDB ID 2MZ7
Tpp: tau phosphothreonine peptide; protein under PDB ID 1I8H
Tau (306-311A): fragment of tau identifying the structure available in PDB as 2ON9
Tau (306-310): fragment of tau identifying the structure available in PDB as 3Q9G
Tau (306-311B): fragment of tau identifying the structure available in PDB as 3OVL
Tau (305-311): fragment of tau identifying the structure available in PDB as 4E0M
Tau (623-628): fragment of tau identifying the structure available in PDB as 4NP8
Tau (306-311C): fragment of tau identifying the structure available in PDB as 5K7N
F-actin: actin, alpha skeletal muscle, as available in PDB as 3J8I
PDB: Protein Data Bank
CASP: Critical Assessment Protein Structure Prediction
BLAST: Basic Local Alignment Search Tool
PSI-BLAST: Position-Specific Iterated BLAST
MSA: Multiple Sequence Alignment
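The RD definition above can be illustrated with a minimal sketch. It assumes that each profile is a list of strictly positive per-residue values, that profiles are normalized to sum to 1 before comparison, and that base-2 logarithms are used (the ratio itself does not depend on the base). The function names are hypothetical and are not taken from the authors' software.

```python
import math


def normalize(profile):
    """Scale a profile so that its values sum to 1."""
    total = sum(profile)
    return [value / total for value in profile]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for two normalized profiles.

    Assumption: q is strictly positive wherever p is positive.
    Terms with p_i == 0 contribute nothing to the sum.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def relative_distance(observed, theoretical, reference=None):
    """RD = O|T / (O|T + O|R).

    With reference=None the uniform R distribution is used (T-O-R variant);
    passing an intrinsic-hydrophobicity profile gives the T-O-H variant.
    Undefined (division by zero) if O coincides with both references.
    """
    o = normalize(observed)
    t = normalize(theoretical)
    n = len(o)
    r = normalize(reference) if reference is not None else [1.0 / n] * n
    d_ot = kl_divergence(o, t)
    d_or = kl_divergence(o, r)
    return d_ot / (d_ot + d_or)


# Toy five-residue profiles (illustrative values only, not data from the paper).
t_profile = [1.0, 2.0, 4.0, 2.0, 1.0]   # idealized, centrally peaked
o_profile = [2.0, 1.0, 2.0, 1.0, 2.0]   # alternating pattern
h_profile = [2.0, 1.0, 3.0, 1.0, 2.0]   # hypothetical intrinsic hydrophobicity

rd_tor = relative_distance(o_profile, t_profile)
rd_toh = relative_distance(o_profile, t_profile, reference=h_profile)
print(f"RD (T-O-R) = {rd_tor:.3f}, RD (T-O-H) = {rd_toh:.3f}")
```

Under this definition, RD approaches 0 when O reproduces T and approaches 1 when O reproduces the reference distribution; the text later treats RD < 0.5 as the condition for the presence of a hydrophobic core.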

Superfibril
This analysis concerns the amyloid form that is listed in PDB as 5O3O, 5O3L, and 5O3T (pronase-treated paired helical filament in Alzheimer's disease brain neurofibrillary tangle protein, paired helical filament-tau, phf-tau, Homo sapiens). Fragment: residues 623-695 of tau protein (306-378 according to PDB numbering). Chains A, C, E, G, and I, along with their counterparts (B, D, F, H, J), make up the proto-fibrils [11]. In order to characterize individual chains in the context of the superfibril and proto-fibrils, we have singled out chains E and F. These two chains are located in the central part of the fibril and can be regarded as representative of an arbitrarily long structure. This selection also minimizes edge effects caused by the finite width of the complex.
Properties of Superfibrils and Interfaces: What Is the Source of Different Isoforms of Tau Filaments?
Table 1 presents the status of tau amyloid structures in terms of RD values, revealing large discordances between T and O profiles in both models (T-O-R and T-O-H). This means that the distribution of hydrophobicity does not involve a central hydrophobic core. Further analysis will reveal that the amyloids are dominated by a pattern that consists of alternating bands of high and low hydrophobicity. High values of RD further indicate that the folding process is driven by the intrinsic properties of each residue rather than by a global force field; this is also typical for amyloids [14,15]. Regarding the hydrophobicity profile correlation coefficients, HvT and TvO lag behind HvO. This is also due to the absence of a central hydrophobic core, which is replaced by linear propagation of narrow "bands". In further sections we will specifically describe locations that exhibit these properties.
Visual comparison of T and O (Figure 1) highlights the major differences between these distributions. It should be noted that the chart consists of many overlapping profiles, which means that the distribution of local minima and maxima is replicated in each adjacent chain, resulting in a set of narrow bands, as suggested above.
The FOD model may also be used to predict the properties of shared hydrophobic cores in protein complexes [31]. In order to properly characterize a given complex, it is important to assess the status of its interface. With regard to proto-fibrils, the distribution that was observed in phf-tauT differs from those exhibited by the remaining structures. However the difference is limited only to the structure of interface, which is discussed in this paper. In phf-tauO and phf-tauL the status is similar and it suggests that the superfibril emerges as a result of factors consistent with the FOD model, i.e., under the influence of the aqueous solvent. This interpretation is supported by the high values of all correlation coefficients. We may conclude that the interface is shaped by all factors which determine the structure of the complex itself, with major involvement of water. The picture changes, however, when dealing with phf-tauT. Its high value of RD (T-O-R), coupled with negative values of HvT and TvO coefficients and a high value of the HvO coefficient, suggest that, in this case, the solvent does not play a significant role in complexation.
It should be noted that the status of the interface is computed by taking into account all interface residues in the entire fibril (following protein-protein contact distance criteria of PDBsum [32]). When all three correlation coefficients adopt strongly positive values, we may assume that the structure of the interface represents a compromise between all three hydrophobicity profiles (observed, intrinsic and theoretical). In contrast, negative values of HvT and TvO are understood to mean that the interface folds "in spite of" the FOD model and in consequence in spite of environmental effects that act upon the protofibril complex. The characterization concerns chains E and F, which are located centrally and therefore representative of an unrestricted fibril. The calculated values are typical for amyloid forms, and include high values of RD and HvO, along with very low (sometimes even negative) values of HvT and TvO. In phf-tauT, both individual chains, as well as the interface fragment, are shaped by intrinsic hydrophobicity rather than by the external environment, which would favor the formation of a monocentric hydrophobic core.
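The interpretation of HvT, TvO, and HvO used above can be made concrete with a short sketch. It assumes that the coefficients are ordinary Pearson correlation coefficients computed over per-residue profiles; the source says only "correlation coefficient", so this choice, like the function names and the toy data, is an assumption.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles.

    Assumption: neither profile is constant (non-zero variance).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def fod_correlations(h, t, o):
    """Return the three pairwise coefficients discussed in the text."""
    return {
        "HvT": pearson(h, t),
        "TvO": pearson(t, o),
        "HvO": pearson(h, o),
    }


# Toy profiles (illustrative only): O follows the intrinsic profile H,
# while the idealized profile T is centrally peaked.
h = [1.0, 3.0, 1.0, 3.0, 1.0, 3.0]
t = [1.0, 2.0, 3.0, 3.0, 2.0, 1.0]
o = [1.1, 2.9, 1.2, 3.1, 0.9, 3.0]

coeffs = fod_correlations(h, t, o)
for name, value in coeffs.items():
    print(f"{name}: {value:+.3f}")
```

In this toy case HvO is high while HvT and TvO stay near zero, which is the pattern the text associates with a structure shaped by intrinsic hydrophobicity rather than by the solvent.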
As the status of individual chains (in the context of the superfibril) is largely similar in all structures, we limit their presentation to the phf-tau forms observed in phf-tauO and phf-tauT (Table 2). When analyzing individual chains as components of the superfibril, we arrive at similar RD values. In contrast, when the same chains are analyzed as components of individual proto-fibrils (phf-tauT), their values differ due to differences in the orientation of each proto-fibril. Negative values of the HvT and TvO correlation coefficients, together with a high value of the HvO coefficient, suggest that phf-tauT is a typical amyloid form. Figure 1 provides a graphical representation of the superfibril and both chains (E and F) treated as components of the superfibril.
Table 2. Status of individual chains treated as components of the superfibril. The presentation of phf-tauL is omitted due to its similarity to phf-tauO.
The status of the phf-tauO superfibril is visualized in Figure 1A, which shows that the theoretical distribution involves two local maxima, along with hydrophilic fragments that are exposed on the surface. Neither maximum is evident in the observed distribution; however, O includes other local maxima, located in areas where low hydrophobicity is expected. It should be noted that each of these local maxima (as well as minima) represents an entire band stretching along the fibril's long axis. The overlap is due to the repeating pattern that is present in each individual chain, with only the outlying chains exhibiting slightly lower hydrophobicity. On the other hand, the differences between the theoretical distributions are readily apparent since this distribution predicts that hydrophobicity should decrease along with distance from the center. The degree of discordance between T and O can be analyzed by comparing theoretical charts with the observed distributions for chains E and F (which are centrally located and therefore representative of the entire fibril; see Figure 1B,C).

The interface fragment appears to be consistent with the FOD model. Given the central location of the interface, a hydrophobicity peak is expected and, to a certain extent, present in the actual complex. Comparing O with T reveals that two outlying residues exhibit relatively low hydrophobicity, while the central section corresponds to a major spike. Consequently, we rate the interface fragment as being accordant with the model.
A characteristic feature of amyloids is the presence of numerous local maxima in areas where the theoretical model predicts low hydrophobicity (and vice versa). However, it is important to remember that, unlike in globular proteins (which may also exhibit this phenomenon), the complexed chains here form bands that stretch along the entire long axis of the fibril. These observations are confirmed by analysis of T and O for chains E and F (treated as components of the superfibril). In most proteins, the discordance between the T and O distributions is of a local character.

Properties of Proto-Fibrils
Table 3 presents the hydrophobicity parameters for each proto-fibril. In this case, each proto-fibril is treated as a distinct structural unit. This means that a separate Gaussian is constructed for each proto-fibril (in the preceding section, a shared Gaussian form was computed for the entire superfibril). Results are indicative of an amyloid form: high values of RD and HvO along with very low (even negative) values of HvT and TvO. The correlation coefficients reveal that the structure is dominated by the intrinsic properties of its component residues, an observation that is supported by the high values of RD observed in both models (T-O-R and T-O-H). Thus, the observed distribution is more closely aligned with R (or H) than with T.
The status of chains E and F, treated as components of their respective proto-fibrils, confirms that they adopt amyloid-like forms, although this effect is less pronounced than in the case of the superfibril (lower values of T-O-H RD and HvO; see Table 3). Figure 2 provides a visual representation of these results, showing T and O distributions for one proto-fibril (chains A, C, E, G and I) and the status of the E chain within this structural unit.
An asymmetrical distribution of local maxima is observed in the proto-fibril as a result of significant displacement of the system's central point as compared to the superfibril. Numerous local maxima are present in areas where low hydrophobicity is expected. The involvement of a local maximum in the interface fragment indicates that complexation of protofibrils is driven by the influence of the environment (in accordance with the FOD model).
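The effect of choosing the structural unit (a shared Gaussian for the whole complex versus a separate Gaussian for one component) can be sketched as follows. This is a simplified illustration, not the procedure from the Methods section: it assumes one representative point per residue, places the Gaussian centre at the mean position of those points, sets each sigma to one third of the largest extent along the corresponding axis, and skips any reorientation of the molecule. All names are hypothetical.

```python
import math


def gaussian_profile(points):
    """Idealized (T) hydrophobicity for a set of residue positions.

    points: list of (x, y, z) tuples, one per residue (assumed
    representative positions). Returns values normalized to sum to 1.
    """
    n = len(points)
    center = [sum(p[k] for p in points) / n for k in range(3)]
    sigma = []
    for k in range(3):
        extent = max(abs(p[k] - center[k]) for p in points)
        # Fall back to 1.0 for a flat axis to avoid division by zero.
        sigma.append(extent / 3.0 if extent > 0 else 1.0)
    raw = [
        math.exp(-sum((p[k] - center[k]) ** 2 / (2.0 * sigma[k] ** 2)
                      for k in range(3)))
        for p in points
    ]
    total = sum(raw)
    return [value / total for value in raw]


# Two toy "chains" laid end to end along the x axis (illustrative only).
chain_a = [(float(i), 0.0, 0.0) for i in range(5)]
chain_b = [(float(i), 0.0, 0.0) for i in range(5, 10)]

shared = gaussian_profile(chain_a + chain_b)  # one Gaussian for the complex
separate = gaussian_profile(chain_a)          # Gaussian for chain A alone

peak_in_complex = max(range(5), key=lambda i: shared[i])
peak_alone = max(range(5), key=lambda i: separate[i])
print("chain A peak (shared Gaussian):  ", peak_in_complex)
print("chain A peak (separate Gaussian):", peak_alone)
```

In the toy example the expected maximum for chain A moves from its middle residue (separate Gaussian) to the residue nearest the neighbouring chain (shared Gaussian), which is the kind of displacement of the central point described in the text.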

Properties of Individual Chains Treated as Distinct Structural Units
Our analysis also covers individual chains that are treated as distinct structural units, with a separate Gaussian being plotted for each chain (under the assumption that each chain folds in isolation from other chains). To determine the causes of the discordance between the observed and theoretical distributions, we have singled out fragments for which this discordance is particularly evident. Note that we are not dealing with isolated deviations: in many areas, both distributions strongly oppose each other, indicating that the chain does not produce a globule and is likely insoluble due to the lack of a polar surface. Figure 3 illustrates the status of chains E and F treated as individual structural units.

It is clear that even when analyzed as distinct units, the discussed chains still diverge from the theoretical distribution of hydrophobicity (Figure 3). No C-terminal maxima (predicted by T) are present in the observed distributions.
Regarding phf-tauL, both of the distributions are similar to those calculated for phf-tauO, with only the interface being somewhat different. The observed distribution, while discordant, does not resemble an amyloid (which would appear as a sinusoidal pattern consisting of similar local maxima).
Taking into account the discussed distributions, it is easy to pinpoint fragments where O deviates from T (see Figure 4). Figure 4 provides a sample set of distributions (T and O) for a single chain, E from phf-tauO; the highlighted fragments (listed in Table 4) are colored to match the three-dimensional (3D) presentations in Figure 5. As already noted, the chains differ in detail, while the overall pattern remains largely identical, regardless of the structural unit in question (superfibril, protofibril, or individual chain). The highlighted fragments have been singled out on the basis of visual inspection, supplemented with correlation coefficients computed for successive five-residue segments. Fragments for which HvT and TvO are negative while HvO assumes a large value are subjected to further analysis. Table 4 summarizes the results obtained for all structural units in phf-tauO. As shown in Table 4, the status of the selected fragments is quite similar regardless of the structural unit in question: in all cases, these fragments are strongly discordant with the theoretical distribution.
Table 4. RD values in both models (T-O-R and T-O-H), along with HvT, TvO, and HvO correlation coefficients for chains E and F analyzed as part of the superfibril, as part of a protofibril, and on their own. Figure 4 illustrates the division of the chain into individual fragments.
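The segment-wise screening described above can be sketched as a sliding-window scan. The sketch assumes Pearson correlation, a window that advances one residue at a time, and a cut-off of 0.5 for a "large" HvO value; the source specifies neither the step nor a threshold, so both are hypothetical, as are the function names and the toy data.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns NaN if either segment is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return math.nan
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cov / math.sqrt(var_x * var_y)


def discordant_windows(h, t, o, window=5, hvo_min=0.5):
    """Return start indices of windows with HvT < 0, TvO < 0 and HvO > hvo_min.

    hvo_min is a hypothetical cut-off chosen only for this illustration.
    Windows with an undefined coefficient (NaN) are never flagged, because
    every comparison with NaN evaluates to False.
    """
    flagged = []
    for start in range(len(o) - window + 1):
        end = start + window
        hvt = pearson(h[start:end], t[start:end])
        tvo = pearson(t[start:end], o[start:end])
        hvo = pearson(h[start:end], o[start:end])
        if hvt < 0 and tvo < 0 and hvo > hvo_min:
            flagged.append(start)
    return flagged


# Toy ten-residue chain (illustrative only): the first half follows T,
# the second half follows H and opposes T.
t = [1.0, 2.0, 3.0, 4.0, 5.0, 5.0, 4.0, 3.0, 2.0, 1.0]
h = [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0]
o = [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0]

flagged_starts = discordant_windows(h, t, o)
print("discordant window starts:", flagged_starts)
```

A scan of this kind only proposes candidate fragments; in the text the final selection also relies on visual inspection of the T and O charts.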

Comparative Analysis Involving Theoretical Models
As previously noted, we have carried out an in silico experiment that consisted of predicting the conformation of a tau protein whose sequence matches the discussed amyloids. Our analysis concerned the entire molecule as well as the fragments that are highlighted in Figure 4. Figure 5 presents 3D models of tau polypeptides obtained using the software packages described in the Materials and Methods section. Visual inspection reveals the possible emergence of globular forms: I-Tasser produces four such structures (out of five input cases), while Robetta produces one (out of five). The tendency of the FOD model to produce globular forms should come as no surprise given the model's propensity to direct hydrophobic residues towards the center of the molecule (due to interactions with the aqueous solvent). One of the models that was produced by I-Tasser appears to involve a hydrophobic core (in the sense of the FOD model; cf. underlined structures in Figure 5). None of the models produced by Robetta satisfies this criterion. Regarding the FOD model, despite its natural tendency to generate hydrophobic cores, only two among 500 structures analyzed in the course of the study contain a hydrophobic core (i.e., satisfy the RD < 0.5 condition).
Analysis of the numerical values listed in Table 5, along with visual inspection of 3D forms, reveals that some of these fragments adopt structures consistent with the Gaussian distribution. Of particular note is the fragment marked in red in Figure 5 (residues 50-61, numbered 355-366 according to PDB), which does not conform to theoretical predictions in any model. In interpreting this fact, we may refer to the dominant role of the fragment whose sequence does not adapt to the centralized distribution of hydrophobicity. We can speculate that this fragment (GSLDNITHVPGG) is therefore the most amyloidogenic sequence in the set under consideration. This suggestion is supported by the data shown in Table 5, particularly RD values that are either highest or second highest in the entire set.

Other Fragments of Tau Proteins
In order for our comparative analysis to be as comprehensive as possible, we also include tau proteins (or fragments of tau amyloids) that PDB lists as being capable of adopting non-amyloid conformations. The Tpp (1I8H) represents the 541-553 fragment of the previously described tau protein, in complex with a microtubule; specifically, the ww domain complexed with human tau phosphothreonine peptide (microtubule-associated protein tau). In this complex, the A chain comprises residues 541-553 (so-called phf-tau), while chain B represents the ww domain (6-44) [33].
The Tpp sequence does not fully match phf-tau (and the others), however we have included it in our analysis due to functional similarities.
Results shown in Table 6 and in Figure 6 reveal that the Tpp complex does not conform to the FOD model. The status of chain A (tau), when analyzed on its own, is also discordant. On the other hand, the same chain conforms to the model when analyzed as part of the complex. This means that chain B creates suitable conditions for chain A to produce a shared hydrophobic core that is consistent with the 3D Gaussian.
Table 6. Parameters describing the 541-553 fragment (chain A) of the tau protein Tpp in complex with the ww domain (residues 6-44).
Considering the individual fragments of Tpp, it turns out that the fragment 27-29 of chain B is the most discordant, and that eliminating this fragment from calculations lowers the RD value of the complex. This may indicate conformational alignment between chain B and chain A (given that the presence of chain A disrupts the distribution of hydrophobicity in chain B).
Analysis of our results indicates the need for a "chaperone", which chain A requires in order to reach a conformation consistent with the FOD model.
Tau (267-312) is another protein related to the discussed tau structure. Its sequence matches the short fragment at 306-312 in tau as it is present in phf-tauO. According to [34], this fragment (267-312) of tau protein is bound to microtubules.
The status of the tau chain in tau (267-312) reveals strong discordance with regard to the theoretical model (see Figure 7), with RD = 0.680 (T-O-R) and 0.527 (T-O-H). Correlation coefficients are −0.049, −0.042, and 0.673 for HvT, TvO, and HvO, respectively. These values indicate that the structure of the chain is dominated by the conformational tendencies of individual residues rather than by the external hydrophobic force field. Interaction with microtubules is likely to be the driving force behind conformational adaptation. The structure of the entire complex is not known; however, information regarding the interaction of individual residues with the microtubule might explain the discordance that was observed throughout the chain. Only the helical fragment at 295-299 appears accordant with the FOD model (with RD values and correlation coefficients of 0.264, 0.170, 0.876, 0.921, and 0.965, respectively, showing good alignment between the theoretical and observed distribution).

Peptides
To complete our study of tau-derived structures that are listed in PDB we also need to consider peptides capable of amyloid transformation. The possible mechanism driving this process, discussed in [14,15], remains applicable in the case under consideration.
Peptides that match the tau protein sequence are mostly related to the fragment at 306-311, the short N-terminal fragment of the tau protein present in phf-tauO, phf-tauL, and phf-tauT. A short peptide which does not produce a globular form should not, in principle, be analyzed using the FOD model. Nevertheless, for the sake of completeness, we present RD (T-O-R) values calculated for such peptides (see Table 7). The values shown in Table 7 only reveal the type of hydrophobicity distribution, with no assessment of the hydrophobic core structure. Low values indicate that hydrophobic residues are located in the central part of the chain, surrounded by N- and C-terminal hydrophilic residues. As shown, the short VQIVYK fragment, despite including an outlying Val residue, is a good match for the centralized hydrophobic core structure. When additional neighboring residues (adjacent to the 306-311 fragment) are included in the analysis, the value of RD increases significantly.
Peptides that are identified as capable of amyloidogenesis appear to adopt amyloid-like conformations themselves. As discussed in [14,15], the distribution of hydrophobicity in a peptide may, regardless of its accordance with the theoretical distribution, give rise to amyloid formation, as long as the environment favors linear complexation of additional peptides, with alternating bands of high and low hydrophobicity emerging along the axis of the fibril. This is visualized in Figure 8, which compares two fringe cases (in terms of RD values).
Figure 8 shows the appearance of structures that are characterized by linear propagation of hydrophobicity peaks and troughs, which is a precondition of amyloid formation. As highlighted by observations and interpretations to date, the environment must "support" the creation of such forms. It is thought that under natural conditions the structure of water does not favor the formation of amyloid fibrils.
The value of RD computed for a short peptide implies how many local maxima are present. A low value indicates that hydrophobicity is concentrated in the central part of the peptide (e.g., Tau (306-311B)), while a high value suggests the presence of numerous local maxima (e.g., Tau (305-311)).

Is It Possible to Differentiate between the Amyloid Fibril and the Fibrillary Structure Present in the Microfilament?
The defining property of amyloids is their fibrillary nature. This phenomenon, however, is not restricted to amyloids. Many biologically active fibrillary proteins exist, often serving as biological scaffolds; these include polymer microfilaments, such as F-actin (filamentous actin). An example structure of this protein is listed in PDB under ID 3J8I [35].
Analysis of protein 3J8I reveals an alternative route to fibrillary structure formation, one that relies on mechanisms different from those that drive amyloidogenesis. The PDB structure comprises five monomeric units arranged into a linear complex. Each monomer is a single-domain chain, 375 aa in length, with varying secondary folds: three beta sheets (11 beta strands in total) and 17 helices. The monomers are spatially arranged in the shape of a helix, forming an elongated fibril. Our analysis focuses on the F chain, which is placed in the center of this fibril. We believe that this chain best represents any inner subunit of a long fibril, with adjacent neighbors on either side.
In analyzing the F chain we apply a twofold approach: first, we treat the chain as an independent structure (constructing a 3D Gaussian capsule calculated specifically for that chain), and subsequently we analyze it as part of the complex (with a broader Gaussian encapsulating the entire complex). The former approach enables us to determine the status of the chain itself, while the latter provides clues regarding its role in the formation of a fibril.
The same observations that we relied on when analyzing the tau amyloid (high values of RD in both configurations; negative HvT and TvO correlation coefficients and strongly positive HvO) will be sought in our study of F-actin to determine whether the conditions that give rise to amyloid fibrils also apply in the presented case.
The above-mentioned set of parameters shows that T not only deviates from O, but can, in some respects, be viewed as its polar opposite. This is taken as evidence that the given conformation is driven by the intrinsic hydrophobicity of individual residues. Table 8 presents FOD parameters that describe F-actin (as listed in PDB under ID 3J8I), which were derived from the T and O distributions plotted in Figure 9. It appears that the entire complex, as well as both versions of the F chain, deviate from the theoretical distribution, with no monocentric hydrophobic core being observed in either case. Note that Table 8 only lists the status of selected fragments: those whose amyloid-like conformation may be important in light of the current discussion regarding identification of amyloid forms.
Results show that local amyloid-like properties may be attributed (in varying degrees) to the beta sheet. Such localized amyloid-like folds can indeed be found in many biologically active proteins (e.g., antifreeze proteins that contain solenoid fragments) [36]. Likewise, the beta sheet found in lysozyme may also be regarded as amyloid-like [37]. (Note that this particular beta sheet plays an important role given its proximity to the active site; it even contributes one of the catalytic residues of lysozyme.) It seems that the presence of a similar structure in actin is not a unique phenomenon, especially given the structure of its immediate neighborhood. As it turns out, local amyloid-like folds in biologically active proteins are typically bracketed by "stop" signals (or "caps"), which prevent unchecked linear propagation. They do so by ensuring that the structure, as a whole, conforms to the theoretical distribution of hydrophobicity and by mediating entropically advantageous contact with water. This is highlighted in Table 8 with "stop" annotations.
In the scope of our analysis we also computed FOD correlation coefficients for successive fragments of the input chain using a 5 aa moving frame. This reveals the exact placement of residues which exhibit amyloid-like characteristics. Eliminating such residues lowers the value of RD (although not below 0.5; see the "No neg CC" annotations in Table 8). Visual inspection of both profiles (theoretical, T, and observed, O) reveals residues that contribute to the discordance (these are highlighted in Figure 9 and marked in Figure 10, which presents the protein's 3D structure). It is worth noting that these residues together comprise only 10% of the chain. Eliminating the residues selected by visual inspection brings RD below 0.5, which means that the remainder of the chain conforms to the monocentric distribution of hydrophobicity.

Table 8. Parameters describing the microfilament structure as present in F-actin. β-sheet-1: sheet with starting numbers 8-11; β-sheet-2: sheet with starting numbers 150-154; β-sheet-3: sheet with starting numbers 70-72. Stop sign: fragment "stopping" linear propagation (*: fragment 351-374; **: fragment 325-331; ***: fragment 173-176). No neg CC: status of the chain after eliminating residues with negative correlation coefficients (CC) for HvT and TvO and high HvO. No selected: status of the chain after eliminating residues identified as discordant by visual analysis of the T and O profiles (shown in Figures 9 and 10).

T-O-R T-O-H
Figure 9. T (blue) and O (red) profiles for the F chain from F-actin (PDB ID 3J8I), divided into two parts for visibility: 1-200 (A) and 201-375 (B). Highlighted positions mark residues that cause discordance between those distributions (on the basis of visual inspection). The remainder of the chain is regarded as accordant with the theoretical distribution. It likely contributes to the protein's structural stability, under the assumption that a well-ordered hydrophobic core and the presence of disulfide bonds both play a role in stabilizing tertiary conformations.

Figure 10. 3D presentation of the F chain from F-actin (PDB ID 3J8I). Beta sheets are displayed in different shades of yellow. Red fragments distinguish residues that cause discordance between T and O distributions. These fragments correspond to the highlighted parts of the hydrophobicity profiles presented in Figure 9.
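The 5 aa moving-frame scan of correlation coefficients described above can be illustrated with a minimal sketch. It assumes that the three profiles (intrinsic H, theoretical T, observed O) are already available as equal-length lists of per-residue values. The function names and the `hvo_threshold` cutoff for "strongly positive" HvO are hypothetical choices made for illustration; they are not taken from the original work.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient; returns NaN if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def windowed_coefficients(h, t, o, window=5):
    """HvT, TvO and HvO for every window of `window` consecutive residues."""
    rows = []
    for start in range(len(o) - window + 1):
        s = slice(start, start + window)
        rows.append({
            "start": start,  # 0-based index of the first residue in the frame
            "HvT": pearson(h[s], t[s]),
            "TvO": pearson(t[s], o[s]),
            "HvO": pearson(h[s], o[s]),
        })
    return rows

def amyloid_like_windows(rows, hvo_threshold=0.5):
    """Frames with negative HvT and TvO and high HvO (threshold is an assumption)."""
    return [r["start"] for r in rows
            if r["HvT"] < 0 and r["TvO"] < 0 and r["HvO"] > hvo_threshold]
```

Frames in which a coefficient is undefined (a constant profile) yield NaN and are never flagged, because every comparison with NaN is false.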
The amyloid-like beta sheet-1 is characterized by the linear propagation of alternating bands that differ in terms of hydrophobicity. This effect manifests itself as a strong discordance between T and O profiles, where, in some cases, the observed distribution appears to be a polar opposite of the theoretical distribution.
Linear propagation can be observed by studying the status of successive fragments that comprise the beta sheet. It is therefore interesting to speculate about the participation of such beta sheets in the formation of a complex with a clearly fibrillary nature. The sheets in question are dispersed and do not form a continuous band of alternating hydrophobicity. Consequently, they cannot be regarded as a structural scaffold for the complex. What is more, the beta sheets that are contributed by different chains are not in contact (as shown in Figure 11). The presence of "stop" fragments, also shown in Figure 11, which arrest linear propagation, suggests that amyloid-like conditions are intended to remain local and not dominate the structure. Similar "caps" can be found in many other proteins that include amyloid-like fragments [38]; they prevent the unrestricted elongation of such structures.
In summarizing our comparative analysis of fibrillary structures, it should be noted that these structures owe their existence to different mechanisms. An amyloid emerges as a consequence of linear propagation of alternating bands of high and low hydrophobicity, whereas globular proteins form complexes via nonbonding interactions (including salt bridges and hydrogen bonds). In the latter case, even when amyloid-like fragments can be found in the proteins' structures, they are dispersed and protected by "stoppers", which prevent them from interacting with one another to form complexes.
In effect, we can state that the structure referred to as a "fibril" might be produced in various ways. Linear propagation of hydrophobicity bands is the prerequisite of amyloid formation (as well as a useful criterion for identifying amyloids), whereas other fibrillary structures (such as F-actin) are formed through nonbonding interactions (including salt bridges and hydrogen bonds). Thus, even though the end result (elongated fibril) is similar, the underlying mechanisms differ. Against this background we propose that the criteria listed in this book differentiate amyloids and enable their identification. We also show that the presence of beta folds is not required (as evidenced, e.g., by the tau amyloid). Instead, amyloids may form whenever the folding process is driven by the intrinsic properties of individual residues, as confirmed by the parameters that are studied in this work.

Discussion
The comparative analysis of proteins associated with amyloid tau confirms the previously stated hypothesis concerning the structural properties of the amyloid. According to this hypothesis, the amyloid is characterized by the presence of alternating bands of variable hydrophobicity. It seems that linear propagation, which can be regarded as contrary to the emergence of a centralized hydrophobic core (as seen in globular proteins), is a characteristic property of amyloids. A similar phenomenon can be observed in Aβ amyloids [39,40]. The network of hydrogen bonds that is discussed in numerous studies [11,41] favors this type of conformation and is thought to be associated with the linear properties of beta folds. In the tau amyloid, however, β-strands play a much smaller role than in other known amyloids. This suggests that while hydrogen bonds are important, their role is not necessarily linked to β-structures.
Hydrophobicity is capable of binding together proximate charged residues, even though electrostatic interactions should, in principle, prevent such clustering. Under such conditions only hydrophobic forces can result in the observed arrangement. Thus, a conformation that is driven by intrinsic hydrophobicity (and does not generate a central hydrophobic core) may be regarded as both the cause and the mechanism of amyloid transformation.
The FOD model recognizes several possible forms for the tau superfibril. This diversity is likely caused by interactions between the solvent and the emerging amyloid. We suggest that, while phf-tauO and phf-tauL emerge as the effect of the influence of surrounding water, in phf-tauT, the structure is driven by the specific band-like arrangement of hydrophobicity in the amyloid itself.
The tau protein, whose task is to mediate interaction with microtubules, must align itself to the complexed object. When the protein is subjected to folding on its own, in an independent manner, it may adopt a globular conformation and remain soluble. An open question is why the same protein undergoes complexation in a form which does not resemble a globule. As shown, a chain that is sequentially identical to the amyloid fragment of the tau chain cannot produce a globular structure. In this context, microtubules may be viewed as a "chaperone", which ensures that the protein adopts its intended conformation, required for biological activity.
Conclusions related to the process of amyloidogenesis, and to the role of the FOD model in explaining this process, all point to the need for further research into the properties of the aqueous solvent. While we possess good knowledge of the properties of ice, the corresponding "normal" (or physiological) condition of liquid water is poorly understood; for example, we are still unsure why the density of water peaks at 4 °C. This may explain the recent uptick in investigations that aim to explain such phenomena [41][42][43][44][45][46]. We believe that these studies may also cast new light on the process of amyloidogenesis, which, in all likelihood, is associated with the (heretofore unknown) influence of the force field exerted by the surrounding water. This field should be modeled as a continuum rather than (as is common practice in modern molecular dynamics packages) as a collection of distinct molecules. The FOD model provides a good baseline for such research.
The analysis makes it possible to distinguish critical short sequences that are especially resistant to adopting a conformation accordant with the expected unicentric distribution of hydrophobicity. This phenomenon is also observed in other amyloids, especially the Aβ(1-42) amyloid [39].

Data
The analysis concerns tau protein amyloids listed in PDB as capable of forming highly ordered superfibrils. In addition, we also consider selected fragments of the tau protein, including short peptides. Table 9 gives the full list of structures subjected to analysis. It includes tau superfibrils (phf-tauL, phf-tauO, phf-tauT), smaller structural units (including individual chains such as tau (267-312)), as well as complexes with other proteins (Tpp). We also consider individual peptides that are widely characterized as capable of forming amyloid structures.

Table 9. Set of proteins subjected to analysis, along with an indication of chain length and complexation capabilities. The rightmost column provides references.

PDB ID Characteristics Length Complex Reference
All of the above structures are subjected to FOD characterization in the context of the superfibril, the protofibril and the individual chain. Our analysis further extends to peptides whose composition is similar or identical to PDB sequences. The status of such molecules is determined by computing their RD coefficients. It should be noted that seeking proper hydrophobic cores in very short peptides (<15 aa) makes little sense; such peptides are characterized using FOD criteria only in order to provide a coherent platform for comparative studies. The FOD model provides useful information regarding the relationship between each residue's intrinsic hydrophobicity and its placement in a fully folded chain.

Folding of Peptides-Components of Amyloid Structures
Peptide sequences which form parts of the tau amyloid (e.g., 306-378, as listed under phf-tauO) have been subjected to folding simulations using Robetta [26,28] and I-Tasser [29,30], as well as to simulations based on the FOD model [53]. This operation can be regarded as an in silico experiment whose aim is to provide alternatives to structures generated by specialized 3D structure prediction software. Our goal is to identify theoretical opportunities for alternative folds (unlike those listed under phf-tauO and similar entries). The globular forms that are generated by the FOD model may provide clues regarding the discordance between the theoretical distribution of hydrophobicity and the actual location of hydrophobicity maxima/minima. A ranking list of the resulting structures may be composed in order to identify factors that increase similarities between the theorized conformation and the corresponding amyloid form.
Robetta is a software package that is aimed at the modeling and analysis of protein structures [27,28]. It is a strong performer in successive editions of the CASP challenge, which focuses on predicting the 3D conformations of input residue sequences [54]. Robetta works in the following manner: the user is asked to input a sequence of amino acids comprising a given protein chain. This sequence is then subdivided into fragments (called domains) while using the "Ginzu" hierarchical scanning algorithm. The algorithm recognises fragments homologous to sequences for which the preferred secondary conformation has been established on the basis of experimental studies. Such homologous areas are detected by (in the order of accuracy) BLAST, PSI-BLAST [55], FFAS03 [56], and 3D-Jury [57] taking as input the sequences produced in the preceding step. The identified domains are modeled by applying a comparative modeling protocol, while all other chain fragments are treated as linkers (if they consist of fewer than 50 residues) or are assigned to structural families as defined in the Pfam-A database [58] using HMMER [59]. Fragments and sequences that have not been recognized as putative domains are analyzed via MSA of the full-length target derived from a PSI-BLAST search against the NCBI non-redundant (NR) protein sequence database [60]. Putative domains identified through Pfam-A and MSA are modeled using de novo structure prediction. Finally, following assembly, side chains are modeled by applying Monte Carlo algorithms [61]. The description is based on [60].
I-Tasser (Iterative Threading ASSEmbly Refinement) is a software package which can predict the structure of a protein given its sequence. In this application, prediction is based on querying PDB for templates using the multiple threading approach. I-Tasser is a strong contender in CASP challenges, topping the ranking in editions 7 through 12 [62][63][64].
The user submits a sequence of amino acids, which is then compared (by LOMETS [65]) to template proteins with similar structural characteristics. An optimal template is then selected and overlapping fragments are assembled into an output model using replica-exchange Monte Carlo simulations [66], while differing fragments are modeled ab initio. If LOMETS is unable to identify a suitable template, the entire structure is subjected to ab initio modeling. The next step involves a search for low energy states in the resulting chain by clustering the simulation decoys (using SPICKER [67]). This is followed by the reassembly of the template protein starting with the SPICKER cluster centroids; this time, however, the simulation is guided by spatial constraints that are provided by TM-align on the basis of LOMETS templates and PDB data. The purpose of the second iteration is to remove steric clashes as well as to refine the global topology of the cluster centroids. The decoys generated in the second simulation are clustered and the lowest energy structures are selected [68]. The final step involves the construction of a detailed model from the available structures via optimization of the hydrogen bond network using REMO. Further information can be found in [68].
Robetta computations were carried out using the publicly available service [27]. I-Tasser computations were carried out using [68]. FOD model computations were carried out on the PL-Grid platform, using the infrastructure of the "Cyfronet" Computer Center, AGH Kraków [69], a detailed description of which can be found in [70].
The FOD model involves two intermediate folding stages: the early-stage intermediate [71][72][73][74] and the late-stage intermediate [72][73][74]. The initial step, which is meant to generate a starting structure for further optimization, is omitted since the conformation listed under 5O3O is taken as the starting structure (treated as early-stage in this case). This chain is then immersed in an aqueous solvent, whose effects are modeled using a 3D Gaussian (as an external force field). In line with the FOD model, hydrophobic residues tend to congregate at the center of the protein body while hydrophilic residues are exposed on its surface. The process produces a prominent hydrophobic core that is encapsulated by a hydrophilic "shell" (with near-zero values of hydrophobicity on the surface). Optimization of hydrophobic interactions and optimization of nonbonding internal interactions are carried out in an alternating fashion, with each step being repeated several times.
Nonbonding interactions are optimized using Gromacs 4.6.5 software suite (Groningen, The Netherlands) [75], available on the PL-Grid infrastructure at ACK Cyfronet AGH Kraków [69]. FOD-based optimization aims to minimize differences between the idealized (3D Gauss function) and observed distribution of hydrophobicity in the target protein. The workflow interleaves both procedures in order to converge on the final conformation.
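As an illustration of the idealized distribution mentioned above, the following minimal sketch evaluates a 3D Gaussian at a set of per-residue positions and normalizes the result so that it sums to 1. It rests on several assumptions that are not stated in this text: each residue is represented by a single point (for example, an effective atom), the Gaussian is centered on the geometric center of those points, the molecule is already oriented along the coordinate axes, and the three `sigmas` are supplied by the caller. The function name is hypothetical.

```python
import math

def gaussian_theoretical(coords, sigmas):
    """Normalized 3D Gaussian hydrophobicity profile (T) for per-residue positions.

    coords -- list of (x, y, z) tuples, one point per residue (assumption)
    sigmas -- (sigma_x, sigma_y, sigma_z), chosen by the caller (assumption)
    """
    n = len(coords)
    # Center of the Gaussian: geometric center of the supplied points.
    center = [sum(c[k] for c in coords) / n for k in range(3)]
    raw = []
    for c in coords:
        exponent = sum((c[k] - center[k]) ** 2 / (2.0 * sigmas[k] ** 2)
                       for k in range(3))
        raw.append(math.exp(-exponent))
    total = sum(raw)
    # Normalize so that the profile can be compared with other distributions.
    return [v / total for v in raw]
```

In this sketch, residues near the center receive the highest theoretical hydrophobicity and the values decay toward the surface, mirroring the core-and-shell picture described in the text.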
Folding simulations that rely on the FOD model are relevant since they acknowledge the effects exerted by the aqueous solvent, and treat them as a global phenomenon (i.e., external force field producing a molecule-wide hydrophobic core).

Comparative Analysis
All tertiary conformations that were produced by the modeling algorithms, as well as structures that are listed in PDB, were analyzed with regard to the status of their hydrophobic cores, which is described by the RD (relative distance) coefficient. Comparing RD values provides information about the degree of disorder with respect to the ideal distribution, which in turn makes it possible to assess how closely a structure approaches the amyloid form. RD expresses the degree of order present in the protein's hydrophobic core and indirectly indicates whether the protein is globular or not. Generating a globular structure with a prominent hydrophobic core (hydrophobicity peaking at the center of the molecule and decreasing with distance from the center, becoming very low on the surface) suggests that the given amyloid peptide may, under certain circumstances, adopt a globular conformation. The ranking of protein structures, sorted in order of decreasing globularity, reveals changes which cause proteins to forfeit their centralized hydrophobic cores and which may, in extreme cases, produce amyloid forms.
The RD coefficient can be computed for two independent cases: T-O-R and T-O-H. The former expresses the relative distance between the observed distribution (O) and two boundary distributions: theoretical (T), which is given by the 3D Gaussian, and uniform (R, random), where each residue is ascribed a hydrophobicity value of 1/N (N being the number of residues in the input chain). The R distribution represents the uniform case (absence of any local concentration of hydrophobicity), which is the opposite of the centralized distribution. In the latter case the uniform distribution is replaced by a distribution corresponding to the intrinsic hydrophobicity of each residue in the input chain (H). Comparing both values reveals factors that guide the folding process (this is particularly true in the T-O-H case). A high value of RD (T-O-H) indicates that folding is dominated by the intrinsic properties of each residue with no regard to cooperative generation of a shared hydrophobic core. When this type of distribution is repeated in successive fragments of the polypeptide, the result is a linear sequence of alternating bands of high and low hydrophobicity. This, in turn, enables the unrestricted elongation of the fibril. An interpretation of this phenomenon (referred to as "ladders") can also be found in [76].
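The relative distance described above can be sketched in a few lines. This text does not spell out the distance measure, so the sketch assumes the Kullback-Leibler divergence commonly used in descriptions of the FOD model, with RD = D(O|T) / (D(O|T) + D(O|R)), and with H taking the place of R in the T-O-H variant. All function names are hypothetical, and the inputs are assumed to be strictly positive per-residue hydrophobicity values.

```python
import math

def normalize(values):
    """Scale a list of positive values so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p|q) in bits; q must be positive where p is."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def relative_distance(observed, theoretical, reference=None):
    """RD of O between T and a reference distribution.

    reference=None gives the T-O-R variant (uniform 1/N reference);
    passing the intrinsic hydrophobicity profile H gives T-O-H.
    """
    o = normalize(observed)
    t = normalize(theoretical)
    n = len(o)
    r = [1.0 / n] * n if reference is None else normalize(reference)
    d_ot = kl_divergence(o, t)
    d_or = kl_divergence(o, r)
    if d_ot + d_or == 0:
        raise ValueError("O, T and the reference coincide; RD is undefined")
    return d_ot / (d_ot + d_or)
```

Under this definition, RD below 0.5 means that O lies closer to the theoretical Gaussian than to the reference, which matches the 0.5 threshold used throughout the text.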
A comparative assessment of T-O-R and T-O-H coefficients in fibrillary/amyloid structures as well as the structures that are produced by various folding algorithms may enable us to identify the "seeds" of linear propagation. FOD criteria have previously been used to assess the distribution of hydrophobicity in structures published by the CASP project [77].
It is hard to compare the interpretation based on FOD with other methods because hydrophobic interaction is underestimated in the discussion concerning amyloid transformation. However, some aspects of intrinsically disordered proteins that were extensively investigated [78] remain in agreement with the results of the analysis of these proteins with respect to the FOD model [79]. The development of techniques such as cryo-electron microscopy [80] and solid state NMR [81] has made amyloid structures available. The structuralization of water has recently come into focus, especially the ordering of water on surfaces [82], which is closely related to our model.