Symmetry and Dissymmetry in Protein Structure: System-Coding Its Biological Specificity

Abstract: The solenoid is a highly ordered structure observed in proteins, characterized by a set of symmetries. A group of enzymes (lyases containing solenoid fragments) was analyzed with a focus on the distribution of hydrophobicity and hydrophilicity, applying the fuzzy oil drop model. The model differentiates between a monocentric hydrophobic core (spherical symmetry, mathematically modeled by a 3D Gaussian) and linear propagation of hydrophobicity (symmetry based on translation of structural units, i.e., chains, as is evident in amyloids). The linearly ordered solenoid carries information that affects the structure of the aqueous solvent in its neighborhood. Progressive disruption of its symmetry (via incorporation of asymmetrical fragments of varying size) appears to facilitate selective interaction with the intended substrate during enzymatic catalysis.


Introduction
The structure referred to as a "solenoid" was, on initial recognition, assigned to the "new fold" category in the Protein Data Bank (PDB). The solenoid is defined as a structure constructed from tandem repeats of units with similar secondary structure. The specific solenoid form selected for analysis in this paper is built from short beta-fragments that generate elongated tubes. Its assignment to the "new fold" category was due to the presence of a previously unknown beta-helix conformation. The name pertains to a sequence of short beta-strands arranged in a tubular frame that superficially resembles a helix. This arrangement is so peculiar that pectate lyase (2PEC) was even described as a "surprising new fold" [1]. In a solenoid, a short structural motif composed of beta-strands is cyclically replicated along an axis perpendicular to its surface, producing a tubule [2-5]. The cross-section of this tubule may be circular, flattened (oval), square-like, or rectangular, possibly with deformed edges. Regardless, the structure adopts an elongated, regular form which may be described as a cylindrical micelle [6]. Solenoid-like fragments have been identified in antifreeze proteins [7], which counteract the formation of ice crystals, and in lyases, a group of enzymes that cleave chemical bonds. This can be readily observed by searching PDB for occurrences of the "solenoid" keyword [8] (although note that some lyases lack solenoid fragments and are not included in the presented analysis).
Our study of antifreeze proteins [7] and lyases is based on the fuzzy oil drop model, which categorizes the distribution of hydrophobicity in protein molecules [9,10]. Its predecessor, the classic oil drop model [11], asserts the presence of a centrally located hydrophobic core, which is sheathed by hydrophilic residues facing the water environment. In the fuzzy oil drop model this discrete (two-layer) distribution is replaced with a continuous hydrophobicity gradient, stretching from the center all the way to the molecular surface and mathematically modeled by a 3D Gaussian. The Gaussian form is then assumed to represent the theoretical (or "idealized") distribution of hydrophobicity for a given protein. Its values peak at the center and decrease with distance from the center, reaching almost 0 on the surface (at a distance of 3σ in each principal direction). Of course, the actual (observed) distribution of hydrophobicity does not always correspond to this model with perfect accuracy. Comparing both distributions (theoretical vs. observed) provides clues regarding the structural order (or disorder) present in the protein. Discordances may be either global or local. In the former case, the protein is said to lack a hydrophobic core in the sense of the fuzzy oil drop model. In the latter case, areas where the two distributions diverge often coincide with sites of biological activity. Excess hydrophobicity on the protein surface marks complexation sites [12,13], while local hydrophobicity deficiencies are often found in enzymatic active sites [14]. An entirely different distribution of hydrophobicity, lacking a monocentric core, is present in amyloids [6,15], where local minima and maxima propagate along the axis of the fibril. This paper continues the discussion of the linear distribution (in contrast to the centric one) observed in amyloids [6,15].
The linear propagation observed in amyloids lacks a natural "stop" signal and may continue indefinitely, which explains the unrestricted length of amyloids. A detailed discussion focused on amyloids can be found in [16]. This paper can also be treated as supplementary to the discussion of the 3D structure of antifreeze proteins, in which a solenoid is also present [7]. The common background for these groups of proteins is the linear propagation of bands of different hydrophobicity.
As it turns out, linear propagation of hydrophobicity can also be observed in biologically active proteins, and specifically in various solenoid fragments, where consistent "bands" of hydrophobicity propagate along a given axis. Given the lack of a natural mechanism to arrest this propagation, a solenoid must be equipped with a specific "cap" [17]. Such fragments are indeed observed in proteins, and their role is to adapt the emerging structure to the 3D Gaussian by exposing hydrophilic residues on its water-facing side, while maintaining compatibility with the solenoid on the underside. Antifreeze proteins which contain solenoids provide examples of such structures [18].
The analysis presented in this work focuses on lyases which contain solenoid fragments. Unlike antifreeze proteins, lyases also include enzymatic active sites. In terms of the fuzzy oil drop model, antifreeze proteins, by virtue of their presence, disrupt the structural ordering of the aqueous medium which would be required for an ice crystal to emerge. It can be said that an antifreeze protein generates a distinct "field" which affects the structuralization of water. By analogy, the solenoid fragment present in lyases may be treated as the source of a "field" which promotes enzymatic catalysis. Our work provides a description of this field and its relationship with the active site of solenoid-containing lyases.

Materials and Methods
The fuzzy oil drop model was applied in the analysis of proteins belonging to the lyase class, whose structure includes solenoid fragments. Table 1 lists proteins which have been selected for analysis. From among solenoid-containing lyases, we have selected those for which PDB provides a description of catalytic residues [8]. Lyases have been divided into groups depending on their substrate.

Model Description
The fuzzy oil drop model has been thoroughly described in numerous publications [9,10]. Consequently, we will limit our description to a brief recapitulation of its core aspects, relevant from the point of view of the presented study.
The target protein molecule is encapsulated by a 3D Gaussian form, whose dimensions (σ coefficients) are each equal to 1/3 of the distance from the molecule's center to the most distal atom along each principal axis. The center of the coordinate system coincides with the geometric center of the molecule. Under such conditions, values of the 3D Gaussian are assumed to reflect theoretical hydrophobicity at any point within the protein body. This distribution is referred to as the theoretical distribution (denoted T). Actual proteins do not always conform to the theoretical distribution. The observed distribution (denoted O) is produced by inter-residual interactions, which, in turn, depend on the distance between residues and on their intrinsic hydrophobicity [32]. Each amino acid is represented by its so-called effective atom (averaged-out position of all atoms comprising the given residue). Computing values of the aforementioned Gaussian for all effective atoms enables us to directly compare both distributions (T and O). The distance between these distributions can be quantitatively expressed using the concept of divergence entropy (distance entropy) proposed by Kullback and Leibler [33]. The value of D_KL represents the distance between T and O (denoted O/T). Like any other measure of entropy, this value cannot be interpreted on its own: it requires a reference, which in our case is the so-called unified distribution (R). This distribution assigns a value of 1/N to each residue in the protein chain (N being the total number of residues), indicating that no concentration of hydrophobicity exists at any point in the protein body. Comparing O/T with O/R tells us whether the observed distribution more closely approximates the theoretical distribution or the unified distribution.
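As a rough illustration of the theoretical distribution, the following Python sketch evaluates a 3D Gaussian at each effective atom. It assumes the coordinates are already centered on the geometric center and aligned with the principal axes. The function name `theoretical_distribution` and the sample coordinates are hypothetical. σ is set to 1/3 of the largest absolute coordinate along each axis, as described in the text, and the values are normalized to sum to 1 so that they can be compared with O.

```python
import numpy as np

def theoretical_distribution(coords):
    """Return normalized 3D Gaussian values T_i for centered effective atoms.

    coords: (N, 3) array, assumed already translated so that the geometric
    center is at the origin and rotated onto the chosen principal axes.
    """
    coords = np.asarray(coords, dtype=float)
    # sigma along each axis = 1/3 of the distance to the most distal atom
    sigma = np.abs(coords).max(axis=0) / 3.0
    # Unnormalized Gaussian centered at the origin
    t = np.exp(-0.5 * np.sum((coords / sigma) ** 2, axis=1))
    # Normalize so the profile sums to 1 (needed for a divergence measure)
    return t / t.sum()

# Hypothetical effective-atom coordinates (in angstroms), already centered
coords = np.array([
    [0.0, 0.0, 0.0],
    [3.0, 0.0, 0.0], [-3.0, 0.0, 0.0],
    [0.0, 6.0, 0.0], [0.0, -6.0, 0.0],
    [0.0, 0.0, 9.0], [0.0, 0.0, -9.0],
])
T = theoretical_distribution(coords)
```

In this toy case the central atom receives the highest value, and the six atoms lying at 3σ along each axis share the same small value, matching the statement that the Gaussian is close to 0 at the surface.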
To avoid dealing with two separate values, we introduce the following relative distance (RD) coefficient:

$$\mathrm{RD}_{TR} = \frac{O/T}{O/T + O/R}$$

where O/T and O/R denote the corresponding D_KL values. Another potential reference (other than R) is a distribution which corresponds to the intrinsic hydrophobicity of each amino acid. This distribution, denoted H, is used to determine whether the observed distribution is dominated by the intrinsic properties of each participating residue. Since relative distance may be measured in two different systems (T-O-R and T-O-H), we further distinguish two distinct values of RD: RD_TR and RD_TH, the latter given by

$$\mathrm{RD}_{TH} = \frac{O/T}{O/T + O/H}$$

A visual depiction of the former case is presented in Figure 1 (reduced to a single dimension for the sake of clarity).
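The relative distance can be illustrated with a short Python sketch. It assumes that RD_TR has the same form as the RD_TH expression, with R in place of H, and that D_KL is computed with a base-2 logarithm (the base cancels in the RD ratio). The function names and the five-residue profiles are hypothetical.

```python
import numpy as np

def kl_divergence(p, q):
    """Kullback-Leibler divergence D_KL(p || q), log base 2."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p_i = 0 contribute nothing
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def relative_distance(o, t, ref):
    """RD = D_KL(O||T) / (D_KL(O||T) + D_KL(O||ref)); ref is R or H."""
    d_ot = kl_divergence(o, t)
    d_oref = kl_divergence(o, ref)
    return d_ot / (d_ot + d_oref)

# Hypothetical normalized profiles for a five-residue toy chain
T = np.array([0.10, 0.20, 0.40, 0.20, 0.10])  # theoretical (Gaussian-like)
O = np.array([0.12, 0.18, 0.38, 0.22, 0.10])  # observed, close to T
R = np.full(5, 1 / 5)                         # unified reference, 1/N each

rd_tr = relative_distance(O, T, R)
```

Because O was chosen close to T, `rd_tr` falls below 0.5, which the text interprets as the presence of a monocentric hydrophobic core. Passing an intrinsic profile H as `ref` would give the T-O-H variant.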
D_KL and RD may be calculated for any arbitrarily selected structural unit: protein complex, individual chain, or selected domain. In each case, a suitable 3D Gaussian is plotted to encapsulate the target structure. It is also possible to analyze the status of chain fragments (e.g., a single beta-sheet or an individual secondary structure element), following prior normalization of T_i, O_i, R_i, and H_i values for all residues comprising the target fragment. When studying individual fragments, no distinct ellipsoid capsule is formed. The resulting value of RD then tells us how the given fragment affects the larger structure for which T_i, O_i, R_i, and H_i had previously been computed.
In our work we characterize complete structural units as well as selected fragments of such units. Additionally, we compute correlation coefficients for pairs of values: observed vs. theoretical hydrophobicity, observed vs. intrinsic hydrophobicity, and theoretical vs. intrinsic hydrophobicity. The resulting values express the extent to which intrinsic hydrophobicity determines the theoretical hydrophobicity for the target segment, as well as the involvement of each residue in generating a monocentric hydrophobic core. High values of the observed vs. theoretical coefficient indicate that the given residues align themselves to the hydrophobic core, while high values of the observed vs. intrinsic coefficient suggest that intrinsic hydrophobicity dominates the conformation of the target segment.
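The fragment-level calculation can be sketched as follows: profiles computed for the whole unit are sliced to the fragment and renormalized before RD is evaluated, with no new Gaussian being fitted. This is a minimal sketch under stated assumptions. The helper names and the eight-residue profiles are hypothetical, and a base-2 logarithm is assumed for D_KL.

```python
import numpy as np

def normalize_fragment(profile, start, end):
    """Return residues start..end (1-based, inclusive) rescaled to sum to 1."""
    frag = np.asarray(profile, dtype=float)[start - 1:end]
    return frag / frag.sum()

def kl(p, q):
    """D_KL(p || q), log base 2; assumes all entries are positive."""
    return float(np.sum(p * np.log2(p / q)))

# Hypothetical whole-chain profiles (each sums to 1 over eight residues)
T = np.array([0.05, 0.10, 0.20, 0.30, 0.20, 0.10, 0.03, 0.02])
O = np.array([0.05, 0.10, 0.25, 0.10, 0.25, 0.10, 0.10, 0.05])
R = np.full(8, 1 / 8)

# Fragment 3-5: slice first, then renormalize each profile within the fragment
t_f, o_f, r_f = (normalize_fragment(x, 3, 5) for x in (T, O, R))
rd_fragment = kl(o_f, t_f) / (kl(o_f, t_f) + kl(o_f, r_f))
```

In this toy example the fragment has a local hydrophobicity minimum where T expects a peak, so `rd_fragment` exceeds 0.5 and the fragment would be classed as discordant.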
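The three correlation coefficients can be computed as in the sketch below. The text does not state which coefficient is used, so the Pearson coefficient is an assumption here. The function name and the toy profiles are hypothetical.

```python
import numpy as np

def profile_correlations(t, o, h):
    """Pearson correlation coefficients for the three profile pairs."""
    c = np.corrcoef(np.vstack([t, o, h]))  # rows are variables: 0=T, 1=O, 2=H
    return {"TvO": c[0, 1], "HvO": c[2, 1], "HvT": c[2, 0]}

# Hypothetical profiles: O is built as a linear function of H,
# mimicking a segment dominated by intrinsic hydrophobicity
T = np.array([0.10, 0.20, 0.40, 0.20, 0.10])
H = np.array([0.30, 0.10, 0.20, 0.10, 0.30])
O = 0.8 * H + 0.04

coeffs = profile_correlations(T, O, H)
```

Here the observed vs. intrinsic coefficient is essentially 1, while the observed vs. theoretical coefficient is negative, which is the pattern the text associates with a segment governed by intrinsic hydrophobicity rather than by a shared core.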

Calculation Procedure
A detailed description of the fuzzy oil drop model is available in [10]. A short outline of the procedure used to calculate RD is given below:
a. The representation of the molecule with X, Y, Z coordinates (as available in PDB) given for each atom is changed to the effective atom representation. An effective atom is the averaged position of the atoms belonging to a given amino acid.
b. The molecule under consideration is oriented in the coordinate system as follows: (A) the geometric center is placed at the origin of the coordinate system; (B) the X-axis is oriented along the line joining the two most distant effective atoms in the molecule; (C) the Y-axis is oriented along the line joining the most distant effective atoms, with the distance limited to the Y-Z plane.
c. The sigma parameters of the T distribution are calculated as 1/6 of the largest distance along each axis independently (according to the 3-sigma rule for the normal distribution).
d. The second parameter (position of the maximum) is calculated as the mean value of the effective atom positions.
e. The T_i values are calculated as the values of the 3D Gauss function at the position of each effective atom.
f. The O_i values express the hydrophobic interaction between effective atoms; each value therefore depends on the distance between two interacting residues and on their intrinsic hydrophobicity, taken according to the selected scale. The scale presented in [10] was used in our calculations. The function expressing this interaction was taken according to the Levitt function [32].
The Kullback-Leibler divergence entropy (D_KL) is applied to measure the difference between the two profiles T and O [33]. Since a D_KL value cannot be interpreted on its own (being an entropy), a second reference distribution is defined (Figure 1): the unified one, with R_i values for all residues equal to 1/N. Comparing the O distribution with the two reference distributions T and R may reveal its similarity to one of them. The closeness to one of them is expressed by the RD parameter, the calculation of which is described above.
The procedure for calculating D_KL (and, in consequence, RD) may be applied to structural units: protein complexes, single-chain proteins, and domains. An independent 3D Gauss function is generated for each structural unit. D_KL (and, in consequence, RD) may also be calculated for selected fragments of the polypeptide chain after prior normalization of the distributions for the selected fragment. This makes it possible to determine the status of individual units as well as the status of a selected chain fragment.
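Steps a, b, and f of the procedure can be sketched in Python as follows. Steps c to e (the Gaussian itself) are omitted for brevity. This is an illustration under stated assumptions, not the authors' implementation. The function names and the helix-like toy coordinates are hypothetical. The polynomial form of the distance function and the 9 Å cutoff are the ones commonly quoted for the fuzzy oil drop model and should be checked against [32]. Excluding self-interaction is a further assumption of this sketch.

```python
import numpy as np

def effective_atoms(residues):
    """Step a: average the atom coordinates of each residue."""
    return np.array([np.mean(np.asarray(a, dtype=float), axis=0) for a in residues])

def farthest_pair_direction(points):
    """Unit vector joining the two most distant points (O(N^2) search)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    return diff[i, j] / dist[i, j]

def orient(coords):
    """Step b: center the molecule, then align X and Y with the farthest pairs."""
    centered = coords - coords.mean(axis=0)
    x = farthest_pair_direction(centered)
    # Project onto the plane perpendicular to X and repeat the search there
    projected = centered - np.outer(centered @ x, x)
    y = farthest_pair_direction(projected)  # fails for a perfectly collinear chain
    z = np.cross(x, y)
    return centered @ np.column_stack([x, y, z])

def observed_distribution(coords, hydrophobicity, cutoff=9.0):
    """Step f: O_i from pairwise interactions weighted by a Levitt-type function."""
    h = np.asarray(hydrophobicity, dtype=float)
    r = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    q = (r / cutoff) ** 2
    # Weight falls smoothly from 1 at r = 0 to 0 at r = cutoff (assumed form)
    weight = np.where(r <= cutoff,
                      1 - 0.5 * (7 * q - 9 * q**2 + 5 * q**3 - q**4), 0.0)
    np.fill_diagonal(weight, 0.0)  # assumption: no self-interaction term
    o = np.sum((h[:, None] + h[None, :]) * weight, axis=1)
    return o / o.sum()

# Hypothetical input: 12 residues of two atoms each, placed along a helix-like curve
t = np.arange(12)
centers = np.column_stack([1.5 * t, 3.0 * np.cos(t), 3.0 * np.sin(t)])
offset = np.array([0.0, 0.0, 0.5])
residues = [np.array([c + offset, c - offset]) for c in centers]
h = np.linspace(0.2, 0.9, 12)  # hypothetical intrinsic hydrophobicity values

eff = effective_atoms(residues)
oriented = orient(eff)
O = observed_distribution(oriented, h)
```

After orientation the extent of the molecule is largest along X and smallest along Z, so the per-axis sigma values of step c can be read directly from the coordinate ranges.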

Antifreeze and Amyloids Versus Lyases
Three groups of proteins, namely antifreeze type III [7], amyloids [16], and lyases (discussed in this paper), share a common structural characteristic: the presence of structural forms with linear propagation of bands of different hydrophobicity levels. This similarity is discussed in Section 4 of this paper.

Results
For detailed analysis, we have selected proteins representative of various enzyme classes within EC 4.2.2. These proteins, in addition to solenoid fragments, also include disordered fragments in their N- and C-terminal sections, helical "stoppers" preventing further propagation of solenoids, and loops which appear to form part of their active sites. Some of these proteins (EC 4.2.1.1) are further characterized by the presence of a helix which runs parallel to the solenoid (replacing the disordered N- and C-terminal fragments). Proteins which belong to the Lyase II family contain solenoids which do not form a full twist of the beta-helix, but merely half of it. These proteins are also included in our analysis since they include repeating, coherent beta-strands.
Recurring beta-strands form a beta-sheet which constitutes one of the side walls of the solenoid tubule. As shown in [15], local maxima and minima of hydrophobicity propagate linearly along the axis of the solenoid (or amyloid fibril) [6,15]. In accordance with PDBSum data [34], we define the solenoid as a collection of beta sheets which represent its walls.
In principle, such linear propagation may continue indefinitely and therefore must be arrested by a custom "stopper". In most of the presented proteins, this role falls to a short helical fragment positioned athwart the solenoid [17]. This helix is amphipathic, with its hydrophilic residues facing the aqueous environment and the "underside" directly adjacent to the solenoid. Comprehensive results reveal the status of the solenoid itself as well as of the surrounding "caps". In proteins which include a parallel helix we also provide a characterization of this helix, or of any corresponding external fragments parallel to the solenoid.
The solenoid is surrounded by helices, loops, or unstructured fragments. We jointly refer to such fragments as the "envelope" and theorize that they act to increase the solubility of the protein under consideration. Proteins selected for our study exhibit a gradual increase in the size and sophistication of the envelope structure.
The catalytic center is defined as the immediate neighborhood of catalytic residues (if a catalytic residue is present at position i, we define the "immediate neighborhood" as the fragment between i − 10 and i + 10). Additionally, extended loops above the catalytic center are also considered to belong to the catalytic center due to their close proximity and stepwise development. Table 2 provides a summary of all analyzed enzymes.
Table 2. RD values computed for complexes, proteins, and solenoid fragments. RD values are also listed for terminal sections (caps) bracketing the solenoid, as well as for any other notable structural features (e.g., parallel helix). The description refers to monomers of each listed protein. Underlined items correspond to proteins selected for detailed analysis in this paper. Symbols (including *) were used to distinguish the fragments included in or excluded from calculations in the ALL position.
[Table 2 body not recovered; first column header: PDB ID.]
Enzymes which comprise the EC 4.2.1.1 class are all similar in status, despite being derived from various organisms. The whole molecule is characterized by RD < 0.5, interpreted as the presence of a monocentric hydrophobic core, even though the protein also contains a solenoid fragment. When considering the solenoid by itself, we arrive at RD > 0.5, as expected given that within this fragment the monocentric core is replaced by linear propagation of local maxima and minima. In contrast, the "stop" sequence and the parallel helix both comply with the model (RD < 0.5), indicating that they mediate contact with the aqueous environment, and that the molecule as a whole contains a prominent hydrophobic core even though a solenoid fragment is also present.
Lyases which belong to class EC 4.2.2.2 diverge from the theoretical model. In nearly all these molecules the solenoid does not conform to the theoretical distribution (1BN8 being the sole exception). "Stop" sequences generally provide hydrophobic closure for the solenoid, except in 1PXZ where their RD values slightly exceed 0.5. Parallel fragments, corresponding to the location of the parallel helix in EC 4.2.1.1, adopt variable conformations, with a short helix and several disordered fragments in evidence. The RD values listed in Table 2 are computed for all these fragments treated as a single unit.
The representative of class EC 4.2.2.10 exhibits accordance with the model only within its "stopper" helix.
Proteins which belong to class EC 4.2.2.59 are derived from the same source organism, but in their crystal structure they interact with various ligands acting as inhibitors [29,30]. From the point of view of FOD analysis, these proteins are not structurally different. Both remain consistent with the model; additionally, they are the only two proteins in the study set where the solenoid section itself has been identified as consistent with the model. As expected, the "stopper" sequences are also accordant. The reason behind the unusual status of the solenoid section (compared to other proteins discussed in this paper) may be related to the fact that the repetitive beta-sheet forms only one half of the "tubule" which sheathes a centrally located helix. Additionally, unlike in the other presented proteins, the catalytic residue forms part of the central helix.
Two proteins classified under the "Lyase II" header differ only with respect to their "stopper" sequences. Surprisingly, in 2QXZ this sequence diverges from the theoretical distribution, which may be due to the presence of an adjacent chain (the crystal structure is made up of dimers). Both helices contain residues which mediate complexation, and this may affect their FOD status.
The following sections characterize the status of proteins representing each class of enzymes.
2FKO, a lyase and a member of the carbonic anhydrase family, contains a solenoid section with a characteristic parallel helix, very similar to the one observed in antifreeze proteins. By analyzing the parameters listed in Table 3, we may conclude that the helix introduces a distribution of hydrophobicity which favors contact with the aqueous environment, thereby ensuring solubility. On the other hand, each beta-sheet as well as the entire solenoid (i.e., both sheets taken together) remain divergent from the model. Helical "caps" at either end of the solenoid are consistent with the theoretical distribution of hydrophobicity, which can be explained by their amphipathic properties (shown in Figure 2C, pink fragment, where the red line is closely aligned with the blue line, as well as in Table 3).
Table 3. FOD characterization of pectate lyase (2FKO). RD coefficients determine the relative distance between the observed distribution (O) and two pairs of boundary cases (theoretical vs. unified, T-R, and theoretical vs. intrinsic, T-H). Fragments which diverge from the monocentric theoretical model are listed in boldface. To identify beta sheets, the table lists the earliest fragments belonging to each sheet. Identification of secondary structure follows PDBSum criteria [34]. * Only the residue numbers of the first beta-strand in each beta-sheet are given to identify that beta-sheet.

[Table 3 body not recovered; column headers: Fragment, Structural Properties, T-O-R, T-O-H, HvT.]
Figure 2. 2FKO (see Table 3). (A, B) 3D presentation: the solenoid fragment is yellow in A, and the blue fragment in B represents the catalytic residue environment. Pink: C-terminal fragment, identified as accordant with the fuzzy oil drop model. Green: "stopper" fragments, also accordant with the fuzzy oil drop model. Catalytic residues are shown as orange spheres; turquoise marks the positions of Zn ions, shown in C as blue dots. (C) T (blue), O (red), and H (green) profiles in 2FKO; the star marks the catalytic residue and blue circles mark the positions of Zn2+ complexation. Colors distinguishing fragments correspond to those shown in C.
When interpreting the distributions shown in Figure 2 (fragments with green background in C and green fragments in A and B) we can observe accordance in the N- and C-terminal sections. Within the solenoid itself the observed distribution follows a sinusoid pattern, which is due to cyclic repetition of a specific unit sequence. The beginning and end of the solenoid diverge significantly from the theoretical distribution. The placement of Zn2+ ions, as shown in Table 3, likely affects the distribution of hydrophobicity, producing locally accordant conditions.
The neighborhood of the enzymatically active residue conforms to the aforementioned sinusoid pattern and does not form a central hydrophobic core. This distribution corresponds to linear propagation of alternating minima and maxima.
In this particular case, linear distribution is not in conflict with the theoretical distribution due to the hydrophilic nature of exposed loops, along with the central (within each beta-strand) location of hydrophobic residues. The effect is further reinforced by the amphipathic properties of the adjacent helix, producing a molecule which is largely consistent with the fuzzy oil drop model (Table 3), even though the solenoid itself does not contain a centralized hydrophobic core. When considered on its own, the solenoid diverges from the model in favor of a linear distribution of hydrophobicity, as shown in Figure 2C (fragment 10-145) and in Figure 2A,B (yellow fragments).
The 60-80 fragment, which comprises two beta-strands participating in two distinct beta sheets, along with the catalytic residue, remains locally accordant.

1PLU-Representative of EC 4.2.2.2
The pectate lyase (EC 4.2.2.2) group is represented by 1PLU. As a whole, this protein lacks a central hydrophobic core (Table 4); however, its helical "stopper" remains accordant, likely due to its role in preventing unchecked propagation of local hydrophobicity maxima.
Table 4. FOD characterization of pectate lyase (1PLU). RD coefficients determine the relative distance between the observed distribution (O) and two pairs of boundary cases (theoretical vs. unified, T-R, and theoretical vs. intrinsic, T-H). Fragments which diverge from the monocentric theoretical model are listed in boldface. To identify beta sheets (*), the table lists the earliest fragments belonging to each sheet. Identification of secondary structure follows PDBSum criteria [34]. Envelope positions marked by double asterisks represent fragments with residues engaged in SS-bonds eliminated, or the N-terminal fragment (usually very flexible). The * and ** symbols were used to distinguish the fragments included in or excluded from calculations in the ALL position.
[Table 4 body not recovered; column headers: Fragment, Structural Properties, T-O-R, T-O-H, H/T, T/O, H/O.]
Compared to the previously discussed protein, in this case the long, parallel helix is replaced by disordered N- and C-terminal sections, which contain only a short helical fragment. Additionally, the active site is not exposed and appears shielded by two protruding loops. These loops are referred to as "catalytic" due to the presence of catalytic residues (Table 4). One of them diverges significantly from the theoretical model, while the other is locally accordant. Each contains one catalytic residue. Taken as a whole, the section which includes both catalytic residues along with a fragment of the solenoid is highly discordant from the model.
From among four beta sheets (parts of solenoid), one is consistent with the model while the remaining ones exhibit rather high RD values. The solenoid as a whole is also discordant.
The "stop" fragment, which is helical, follows the theoretical distribution, as expected given its role (exposure of hydrophilic residues on the outside, and internalization of hydrophobic residues to ensure compatibility with the solenoid). The set of loose fragments surrounding the solenoid is also discordant; however, certain elements of the solenoid's "sheath" exhibit local accordance, as demonstrated by their high correlation coefficients.
Catalytic residues are found in the beta-strands which comprise the solenoid. Their status varies. The loops which bracket the catalytic center are characterized by high RD values, except for the one at 195-202. The entire neighborhood of the catalytic center, treated as a single unit (58-80 and 117-224), is highly disordered (in the sense of the fuzzy oil drop model). Figure 3 and Figure 5 provide a visual depiction of the catalytic loops and the so-called outlying loops.
In summary, it is evident that the solenoid does not follow the distribution of hydrophobicity stipulated by the fuzzy oil drop model: instead of a monocentric core we are faced with a linear distribution. The catalytic active site is also discordant, whereas the collection of outlying loops which "sheathe" the solenoid remains consistent with the model and may be regarded as an "envelope", ensuring solubility of the protein. Their role therefore corresponds to that of the parallel helix in 2FKO.
Theoretical and observed hydrophobicity distribution profiles (Figure 3) reveal that the N- and C-terminal fragments remain accordant. In contrast, the fragments directly adjacent to the active site diverge from the model (Figure 3C, green background, and green fragments in Figure 3A,B).
The solenoid fragment (Figure 3A, yellow fragments) corresponds to a sinusoid distribution of hydrophobicity, lacking a monocentric core. Both distributions diverge greatly in fragments which correspond to catalytic loops (differences between blue and red lines in Figure 3C).
The linear distribution of hydrophobicity in the beta-sheet which begins with the fragment at 19-23 is the source of high values of RD given in Table 4. Clear differences between O and T can be observed, with the observed distribution approximating intrinsic hydrophobicity values. Additionally, local minima are arranged in a periodic pattern, reflecting the conformation of successive beta-strands, with each local minimum located in an area where the model predicts a hydrophobicity peak instead.
It is interesting to note the structure of the active site. We have singled out fragments which contain catalytic residues (beta-strands which form part of the solenoid) as well as loose coils bracketing the active site. The entire system, composed of catalytic beta-strands and the outlying loops, is significantly discordant and dominated by the intrinsic properties of each residue (Figure 3).

Figure 3. Characteristics of 1PLU. A and B: 3D presentation of 1PLU. Residues distinguished as red reveal linear propagation of a highly hydrophobic band, which strongly diverges from the expected hydrophobicity. This is further visualized in the profiles shown below (T: blue, O: red, and H: green hydrophobicity distributions). Hydrophobicity distributions are plotted for beta-strands which comprise the beta-sheet of 1PLU (the first such fold is located at 19-23). Notably, a local minimum is present where a local maximum would be expected. Good alignment between O and H is also evident for the beta portion of the solenoid, highlighting the discordance of residues distinguished as red dots. Green fragments: stoppers preventing infinite elongation of the solenoid.

Pectin Lyase (1IDJ)-Representative of EC 4.2.2.10
Table 5 lists RD values for 1IDJ (pectin lyase), along with the corresponding correlation coefficients. This protein contains four disulfide bonds. As expected, the solenoid exhibits a linear distribution of hydrophobicity (the observed distribution has a sinusoidal form which differs markedly from the theoretical distribution and instead follows the H distribution). The structure is dominated by the intrinsic hydrophobicity of each participating residue rather than by the tendency to form a shared core. The solenoid terminates in a helical "stopper", which is consistent with the model and exposes its hydrophilic residues towards the aqueous environment (profile fragments distinguished by a green background in Figure 4C and green fragments in Figure 4A,B).
The "envelope" (equivalent to the parallel helix in 2FKO; pink fragments in Figure 4A, with the profiles of these fragments distinguished in Figure 4C by the green background) diverges from the model; however, its constituent fragments exhibit high accordance, which can be interpreted as energetically favorable contact with the environment.
The neighborhood of the catalytic active site, including the outlying loops as well as beta-strands which form part of the solenoid and contain catalytic residues, exhibits significant deviations from the model, with only the local neighborhood of residue D154 (145-152) seen as accordant. The catalytic center is indicated in Figure 4C. Figure 7 provides a depiction of the presented fragments. Correlation coefficients calculated for fragments directly adjacent to the active site exhibit negative values. This means that the given fragment not only diverges from the theoretical distribution, but in fact adopts a distribution which can be regarded as opposite to it. This, in turn, suggests a high degree of structurally encoded functionality, since (given the variable nature of the peptide bond) a structure which opposes the natural alignment of T and O would be unlikely to emerge randomly.
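The meaning of a negative correlation coefficient can be illustrated with a short sketch. It assumes that the H/T, T/O and H/O coefficients are plain Pearson correlations computed over the residues of a fragment (the defining formula is not given in this section), and the fragment values below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("constant profile: correlation is undefined")
    return cov / (spread_x * spread_y)


# Hypothetical fragment: T peaks in the middle, while O (and H) dip there.
t_frag = [0.01, 0.03, 0.05, 0.03, 0.01]
o_frag = [0.04, 0.02, 0.01, 0.02, 0.04]
h_frag = [0.05, 0.02, 0.01, 0.03, 0.04]

coeffs = {
    "H/T": pearson(h_frag, t_frag),
    "T/O": pearson(t_frag, o_frag),
    "H/O": pearson(h_frag, o_frag),
}
```

Here T/O is negative (the observed profile runs opposite to the theoretical one), while H/O is positive, which corresponds to a fragment dominated by intrinsic hydrophobicity.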
As seen in Figure 4, the observed distribution in the neighborhood of the active site is dominated by a sinusoid pattern, suggesting linear arrangement of identical beta-strands which comprise the solenoid (compare Figure 4 and Table 5). The loose loops which contain catalytic residues are also highly discordant.
Comparing 1PLU with 1IDJ reveals greater structural complexity of the latter protein (in the sense of the fuzzy oil drop model).

Table 5. FOD characterization of pectin lyase (1IDJ). RD coefficients determine the relative distance between the observed distribution (O) and two pairs of boundary cases (theoretical vs. unified, T-R, and theoretical vs. intrinsic, T-H). Fragments which diverge from the monocentric theoretical model are listed in boldface. To identify beta sheets, the table lists the earliest fragments belonging to each sheet. Identification of secondary structures follows PDBSum criteria [34]. Positions marked with double asterisks correspond to SS-bonds which disrupt hydrophobicity-based ordering. Only the residue numbers of the starting beta-strand in each beta-sheet are used for identification.

Table 5 columns: Fragment; Structural Properties; T-R; T-H; H/T; T/O; H/O.

Figure 4 (1IDJ). T, O, and H (green) hydrophobicity distribution profiles. Orange circles mark catalytic residues, whose position is represented by orange stars in C. Fragments distinguished in C as pink correspond to the envelope shown in A. Green fragments in C correspond to "stoppers" shown in A and B (also in green). The blue fragment in C represents the catalytic environment shown in B (also in blue). Orange stars: catalytic residues; gray: envelope; green: "stop" fragments; turquoise: catalytic center together with its neighborhood. Yellow fragments in A comprise the solenoid section. The yellow stars in C: catalytic residues.
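The RD coefficients referred to in Table 5 can be sketched as follows. This is a minimal illustration, assuming the form commonly used for the fuzzy oil drop model (a Kullback-Leibler divergence of O from T, divided by the sum of the divergences of O from T and from the second reference, R or H); the defining equation does not appear in this section, so the formula, function names and sample values should be treated as assumptions. All profile values are assumed to be positive.

```python
import math


def normalize(profile):
    """Scale a profile so that its values sum to 1."""
    total = sum(profile)
    if total <= 0:
        raise ValueError("profile must have a positive sum")
    return [v / total for v in profile]


def divergence(p, q):
    """Kullback-Leibler divergence sum(p_i * log2(p_i / q_i)); zero p_i terms are skipped."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def relative_distance(observed, theoretical, reference):
    """RD of O between T (RD -> 0) and a second reference, R or H (RD -> 1)."""
    o, t, ref = normalize(observed), normalize(theoretical), normalize(reference)
    d_t = divergence(o, t)
    d_ref = divergence(o, ref)
    if d_t + d_ref == 0:
        raise ValueError("O, T and the reference coincide: RD is undefined")
    return d_t / (d_t + d_ref)


# Hypothetical five-residue profiles.
o_profile = [0.10, 0.20, 0.40, 0.20, 0.10]
t_profile = [0.05, 0.20, 0.50, 0.20, 0.05]
r_profile = [1.0] * 5  # unified distribution: every residue equally hydrophobic

rd_t_r = relative_distance(o_profile, t_profile, r_profile)
```

Under this convention a value below 0.5 places O closer to the monocentric distribution T, and a value above 0.5 places it closer to the second reference; passing the intrinsic profile H instead of `r_profile` gives the T-H variant.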

3DP3-Representative of EC 4.2.2.59
EC 4.2.2.59 is represented by 3DP3, a protein which differs structurally from the previously discussed examples due to its hexameric crystal structure. In the literature, the functional unit is characterized as a dimer [29]; however, our analysis includes monomer, dimer, and hexamer.

3DP3 contains a somewhat different type of solenoid, where the beta-sheet forms only part of the "tubule" and runs parallel to a central helix. Nevertheless, the specific ordering present in the beta-sheet suggests solenoid-like properties. Due to the biological activity of this protein (a lyase) we decided to include it in our study set.
Since the complex listed in PDB is tightly packed (large intermolecular contact area), our analysis also considers the effect of inter-chain interactions upon individual fragments of the chain. A marked decrease in RD is observed following elimination of residues which mediate such interaction.
Our analysis of the monomer indicates a closer match between T and O than in the previously discussed proteins (Figure 5C). No characteristic linear propagation (sinusoid pattern) is discernible; however, the catalytic residue diverges from the model, along with its whole neighborhood, in favor of intrinsic hydrophobicity (refer to correlation coefficients in Table 6). This means that the structure of the catalytic site is dominated by the intrinsic properties of each residue rather than by broad hydrophobic forces (blue background in Figure 5C and blue fragments in Figure 5A,B). Localization of fragments described in Table 6 can be seen in Figure 5.

Table 6. FOD characterization of tryptophanase (3DP3). RD coefficients determine the relative distance between the observed distribution (O) and two pairs of boundary cases (theoretical vs. unified, T-R, and theoretical vs. intrinsic, T-H). Fragments which diverge from the monocentric theoretical model are listed in boldface. To identify beta sheets, the table lists the earliest fragments belonging to each sheet. Identification of secondary structure follows PDBSum criteria [34]. "E" corresponds to the presence of a catalytic residue, with the preceding number listing the number of such residues. Here, the reference to a solenoid (*) extends the traditional definition of solenoid fragments. The protein in question contains a repeating beta structure; however, this structure forms only half of the characteristic solenoid "tubule".

The detailed status of the catalytic residue and its close neighborhood is presented in Figure 6 in the form of profiles (Figure 6A: status in monomer; Figure 6B: status in dimer).

Pectate Lyase-2QY1-Lyase II
The activity of this enzyme (as given in PDB) is as follows: eliminative cleavage of (1→4)-α-D-galact-4-enuronosyl groups at their non-reducing ends. This enzyme is classified as Lyase II.
2QY1 is a representative of the fairly numerous pectate lyase class. It crystallizes as a dimer; however, our analysis focuses on its monomeric form. The protein itself is a mutant (R236F and A31G). The former mutation is near the catalytic residue (R230) and likely affects its enzymatic activity due to structural differences. Unlike R236F, the latter mutation (A31G) does not appear significant from the point of view of FOD analysis.
2QY1 is conformationally similar to EC 4.2.2.2 group proteins, with a solenoid fragment surrounded by loose fragments, a "stopper" helix and exposed loops surrounding the active site, although these loops are shorter than in EC 4.2.2.2 enzymes. Table 7 summarizes the properties of the presented fragments of 2QY1 (monomer). The RD value (Table 7) for the complete 2QY1 molecule suggests that no molecule-wide hydrophobic core is present. Results listed in Table 7 indicate that the monocentric distribution is heavily disrupted by the presence of a solenoid fragment whose own hydrophobicity profile is determined by the intrinsic properties of each residue. Beta sheets are similarly discordant versus the theoretical distributions (as seen in Figure 7). Linear arrangement of local hydrophobicity peaks can be observed along the axis of the solenoid. The N- and C-terminal sections remain accordant, likely due to the need for solubility in an aqueous environment.

Figure 7 (2QY1). Orange spheres: catalytic residues in A and B (orange stars in C); green: "stop" fragments; blue: catalytic center together with its neighborhood; pink: envelope.

Table 7. FOD characterization of pectate lyase (2QY1). RD coefficients determine the relative distance between the observed distribution (O) and two pairs of boundary cases (theoretical vs. unified, T-R, and theoretical vs. intrinsic, T-H). Fragments which diverge from the monocentric theoretical model are listed in boldface. To identify beta sheets, the table lists the earliest fragments belonging to each sheet. Identification of secondary structure follows PDBSum criteria [34]. As with other tables, the residue numbers of the starting beta-strand in each beta-sheet are used for identification.

Table 7 columns: Fragment; Structural Properties; RD.

The solenoid also contains "stop" fragments which conform to the model by exposing their hydrophilic residues towards the environment while internalizing hydrophobic residues.
As already remarked, the N- and C-terminal sections remain accordant and can be found on the surface of the protein alongside the solenoid. Their role is to ensure solubility, while the function of "stoppers" (helix at 45-55) is to counteract unchecked propagation of the solenoid fragment itself. This observation is supported by high values of all three correlation coefficients, indicating that the fragments in question are well adapted to their role as stoppers (i.e., they mediate contact with the aqueous environment) (Figure 7A,B). It is interesting to note the status of the catalytic active site, which is significantly discordant vs. the theoretical distribution. The specific role of R236F is difficult to determine; however, the neighborhood of the active site appears similar to the structure of EC 4.2.2.2 enzymes.

Discussion
Enzymatic activity requires a suitably structured active site, adapted to the target substrate which must be recognized by the enzyme. Analysis of hydrophobicity distribution in antifreeze proteins, with evidence of linear propagation (alternating local minima and maxima), may provide hints regarding the structuralization of water near the protein surface. This effect is believed to counteract ice crystal formation by altering the structural properties of the aqueous medium [7,17]. Similarly, in the case of lyases which include solenoid fragments, the structural ordering of water may play an important role in catalysis. This hypothesis rests on the presence of a similar ordering of hydrophobicity (linear propagation of bands of differing hydrophobicity) in antifreeze proteins, where this linearity appears to act as a signal which orders the surrounding water molecules in a way that opposes ice formation. Consequently, it may be assumed that the presence of a solenoid in proteins as complex as enzymes is likewise purposeful. If the influence of the solenoid structure on water is accepted for antifreeze proteins, the solenoid in the discussed lyases can be interpreted as the source of a similar force field acting on the aqueous environment.
Even without detailed knowledge regarding the specific structural properties of the medium, we note that the conditions encountered near hydrophilic surfaces differ markedly from those which can be observed near hydrophobic areas, particularly in light of reports of water levitation above hydrophobic surfaces [35]. The force field which the linear propagation of bands of differing hydrophobicity exerts on the aqueous environment is further modified by those parts of the polypeptide chain referred to here as the envelope. The gradual increase in complexity of the "envelope" in the discussed lyases may be interpreted as a gradual modification of enzyme specificity, i.e., of substrate recognition. Other interactions between enzyme and substrate (electrostatic, van der Waals, hydrogen bonds, etc.) also take part in the recognition process; this paper presents the approach based on hydrophobic interactions. Comparison of the 3D structures of lyases acting on various substrates indicates that the structuralization of water enforced by the protein's own hydrophobicity distribution may generate a force field which promotes enzymatic activity. One example is the environment of the catalytic active site in 2FKO, whose immediate neighborhood differs from other presented lyases. We suspect that the resulting force field may play a role in guiding the substrate to the active site and that furthermore it may benefit the catalysis process itself.
The structure of any field, including the hydrophobic force field, is a way to encode information. The more complex the field, the more information it carries, and this may prove important from the point of view of catalytic processes. The quantity of information carried by 2FKO seems significantly lower than, e.g., in 3DP3, which may be due to the specific properties of the latter protein's substrate, requiring a finely tuned force field. An additional information-carrying factor is the presence of ions (Table 1), which are localized in specific positions and deliver additional signals for water ordering in their neighborhood. One can speculate that they help direct the substrate to the catalytic center.
Reports regarding the possible involvement of amyloid structures at early stages of protein evolution underscore the importance of linear hydrophobic force fields [36]. The fact that similar conditions exist in enzymes-representing the apex of biological activity-seems consistent with the observations presented in [37][38][39][40][41]. A highly symmetrical system carries relatively little information due to its repeatability and symmetry (both of which render it structurally predictable). Here, the notion of symmetry refers not just to the geometric properties of the folded polypeptide chain, but also to the highly ordered placement of fragments which interact with the environment. Specifically, a clear balance emerges between the placement of polar and nonpolar residues. By exposing such structures on its surface, the protein effectively transmits a signal into its environment, causing the surrounding water molecules to align themselves with the protein body as discussed already above. This "signal" is likely recognized by other solvated molecules. Thus, the symmetry of the protein itself extends to its environment, ensuring the formation of a specific local force field.
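The statement that a highly symmetrical system carries relatively little information can be illustrated with a rough numerical analogy. The sketch below is not part of the original analysis: it assumes that the compressed length of a string is an acceptable crude proxy for its information content, and it encodes hypothetical surface patterns as strings of hydrophobic (H) and polar (P) positions.

```python
import random
import zlib


def compressed_size(pattern: str) -> int:
    """Length in bytes of the zlib-compressed pattern (a crude information proxy)."""
    return len(zlib.compress(pattern.encode("ascii"), 9))


# Hypothetical patterns of equal length: a strictly repeating band arrangement
# versus an irregular one drawn with a fixed seed for reproducibility.
periodic = "HHPP" * 100
rng = random.Random(0)
irregular = "".join(rng.choice("HP") for _ in range(len(periodic)))

sizes = {
    "periodic": compressed_size(periodic),
    "irregular": compressed_size(irregular),
}
```

The repeating pattern compresses to far fewer bytes than the irregular one of the same length, which mirrors the argument that a purely symmetrical arrangement is structurally predictable and that added dissymmetry increases the information it can carry.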
One example of this phenomenon is provided by the set of lyases in which a symmetrical structure (the solenoid) is offset by dissymmetrical fragments, producing a balance between symmetry and dissymmetry. This balance affects the structure of the aqueous solvent and likely plays a crucial role in attracting potential partner molecules for chemical reactions catalyzed by lyases.

Conclusions
In conclusion, we can state that protein folding represents the search for a balance between structural ordering consistent with the fuzzy oil drop model and local deviations from the model. Such deviations often correspond to the protein's biological activity [37]. Antifreeze proteins, whose activity does not require complexation of ligands or other proteins, appear to work by generating a force field which obstructs formation of ice crystals [7]. In the case of lyases, the same effect may be exploited locally to identify and attract a specific ligand (substrate), as demonstrated by the structure of active sites of the presented enzymes. Lysozyme, an enzyme which belongs to a different class (hydrolase), exhibits a markedly different secondary conformation; however, it appears to rely on the same general effect, i.e., balance between structural order (adherence to the 3D Gaussian) and local deviations from that order. It would be difficult to contradict the observation that a protein in perfect agreement with the 3D Gaussian would act upon its environment in a different manner, by exposing only hydrophilic residues and therefore affecting the arrangement of nearby water dipoles. In particular, local exposure of hydrophobic residues may play a critical role in the protein's biological profile. Linear propagation of hydrophobicity/hydrophilicity promotes unlimited elongation (e.g., in amyloids) [6,15], but it may also locally disrupt the aqueous environment in a way which attracts specific ligands and promotes catalysis (in proteins which contain solenoid fragments).
The set of proteins selected for analysis was intended to show the gradual increase in structural complexity and the increasingly specific influence on the surrounding water. This stepwise increase in complexity is interpreted as an increase in enzyme specificity, as illustrated by the different substrates of the enzymes under consideration.