Protein Helical Structures: Defining Handedness and Localization Features

The quantitative evaluation of the chirality of macromolecular structures remains one of the exciting issues in biophysics. In this paper, we propose methods for quantitative analysis of the chirality of protein helical and superhelical structures. The analysis of the chirality sign of the protein helical structures (α-helices and 3₁₀-helices) is based on determining the mixed product of every three consecutive vectors between neighboring reference points, the α-carbon atoms. The method for evaluating the chirality sign of coiled-coil structures is based on determining the direction and value of the angle between the coiled-coil axis and the α-helix axes. The chirality sign of the coiled coil is calculated by averaging the value of the cosine of the corresponding angle for all helices forming the superhelix. Chirality maps of helical and superhelical protein structures are presented. Furthermore, we propose an analysis of the distributions of helical and superhelical structures in polypeptide chains of several protein classes. The features common to all studied classes and typical for each protein class are revealed. The data obtained, in all likelihood, can reflect considerations about molecular machines as chiral formations.


Introduction
Since the time of Louis Pasteur, chirality has been considered one of the key categories in the triad of "symmetry-dissymmetry-asymmetry" in living systems. Chirality is the property of objects or systems to be non-superimposable with their mirror images under any combination of translations and rotations in three-dimensional space [1]. It is well known that proteins synthesized in ribosomes contain only L-amino acids, and nucleic acids contain D-ribose and D-deoxyribose. For more than 100 years, researchers have been drawn to the mystery of the origin and evolutionary fixation of chiral dissymmetry in living nature. There is no generally accepted answer to this question, although there are numerous hypotheses [2][3][4][5][6][7][8][9]. However, we are interested in what unique or simply beneficial properties living nature on Earth has acquired by taking the path of consistent symmetry breaking in its development.
In previously published works, relying on numerous sources, we developed and substantiated the tendency according to which the hierarchies of the structures of proteins and nucleic acids are characterized by a change in the type of symmetry and chirality sign. Starting from the level of asymmetric carbon in the amino acids of the primary structure of proteins, there is a tendency to alternate the chirality sign of the structural levels of proteins L-D-L-D [10][11][12].
If we talk about the general trend, then the primary structure (polypeptide chain) is characterized by completely left-handed chirality of amino acid residues. Higher levels are represented by helical structures of various types (in this work, we do not consider β-structures). Secondary structures are represented mainly by right-handed α-helices [13][14][15][16]. This conformation is energetically preferred [17,18]. The next most common helical secondary structure of proteins is the 3₁₀-helix; such helices account for up to 15-20% of all protein helices [19]. Although left-handed helices are rare, they also have structural or functional significance [18,20,21]. At the next level of the hierarchical organization of protein structures, there are superhelical structures: coiled coils composed of α-helices [22][23][24]. In addition to well-known coiled-coil structures with a heptad sequence repeat, there are also nontrivial ones. In the case of nonheptad sequence repeats, the heterochiral structure of L- and D-polypeptides contains large and likely more stable interfaces [25]. The authors of that paper expected that the heterochiral oligomers would be incompatible with α-helical supercoiling.
To date, a significant number of works offer various methods for assessing chirality [26][27][28][29][30][31][32][33][34][35]. The Ramachandran plot is a significant tool for describing a peptide backbone in terms of the backbone dihedral angles [36,37]. One of the concepts associated with the Ramachandran plot is a metric for backbone handedness [38]. It is based on interpreting a peptide backbone as a helix with axial and angular displacement. The handedness of every region of the Ramachandran plot can be characterized by this metric for cis and trans backbones. The most comprehensive theoretical approach to assessing the chirality of helical structures was developed by M. Petitjean. Following this concept, the measure of chirality should be a continuous characteristic and be determined for a space of any dimension, and the chirality index should not depend on the method of choosing the mirror image [39][40][41]. The above methods are either highly specialized or rather complicated to implement, so they are impractical for determining the sign and measure of chirality of helical protein structures in large data sets.
Since the discovery of coiled-coil structures, several approaches have been proposed to describe them as chiral objects. In [42], the authors proposed to relate the pitch and radius of the coiled coil to the change in the twist angle and the shift along the coiled-coil axis per amino acid. In another work [43], the pitch of the double-helical coiled coil was calculated, taking into account the Cartesian vector coordinates of the average atomic positions of the core atoms of the N- and C-termini. It should be noted that neither of these methods reveals the chirality of coiled coils. Furthermore, the possibility of the formation of a right-handed structure is not taken into account. In [44], it is proposed to determine the sign of the chirality (τ) of the surface curve for a classical coiled coil using the periodicity of the helix, the angle between adjacent residues, and the shift along the axis per amino acid: τ is positive for right-handed structures and negative for left-handed ones. However, the authors note that it is not obvious that the superhelix is left-handed at negative τ.
In this work, we present our methods for the quantitative evaluation of the protein helical structure chirality. A necessary and sufficient condition is the mutual arrangement of α-carbons, which allows a reduction in the amount of processed information and is a clear advantage when processing large data arrays. To determine the chirality sign of coiled-coil structures, the twisting direction of each individual α-helix relative to the entire superhelix axis is taken into account. Based on the methods, computer programs in Python 3.7 are implemented. Chirality maps of helical and coiled-coil structures of proteins are obtained. The accuracy of the maps is confirmed by the analysis of real structures.
Another aspect of this work is the identification of regularities in the "chiral" structure of proteins, using helical and superhelical structures and their functions as an example. The protein structure determines its function and role in the biological system. However, despite the abundant data on the functions of various proteins, the functions of some of them remain unclear. One way to relate the structure of a protein to its function and its role in metabolism is to compare it with already known proteins of similar structure. In the present work, we consider the regularities in the distribution of helices and coiled coils along the polypeptide chain, which form a dynamic backbone of protein molecules. Research on patterns and relationships between the chiral protein structure and its functions could be used to develop the concept of a protein as a molecular machine.

Computational Methodology
The objects of research are helical and coiled-coil structures from the PDB and CC+ databases [45,46].

Method for Quantitative Analysis of the Chirality of Protein Helical Structures
Previously, we proposed a method for quantitative analysis of the chirality of protein helical structures, based on the construction of consecutive vectors between neighboring α-carbons in the chain and determination of their vector product [47]. The method relies on the mutual arrangement of the α-carbon (Cα) atoms in the amino acid chain, which serve as the reference points of the model. The skeleton of α-carbons and the helix axis are directed from the C-terminus to the N-terminus. By constructing vectors between neighboring atoms (from the previous to the subsequent) for n atoms (Cα1 … Cαn), the direction vector d and the sum of the vector products s were determined. The sign of the cosine of the angle between these vectors determines the chirality sign: D for cos∠(d, s) < 0 and L for cos∠(d, s) > 0. The length of the product vector was used as an index of chirality, which in some cases led to an incorrect estimate of chirality. For example, for two successive right-handed helices connected in different ways, radically different results were obtained. We have since developed and improved this method.
Let us consider a model of a polypeptide chain consisting of n amino acid residues and having n Cα atoms as reference points. We construct a vector between every two neighboring reference points. For n reference points, there are (n-1) vectors (Figure 1). For every three consecutive vectors, their mixed product is calculated (taking into account their coordinates in three-dimensional space):

$$([\mathbf{v}_i, \mathbf{v}_{i+1}], \mathbf{v}_{i+2}) = \begin{vmatrix} x_i & y_i & z_i \\ x_{i+1} & y_{i+1} & z_{i+1} \\ x_{i+2} & y_{i+2} & z_{i+2} \end{vmatrix}, \qquad \mathbf{v}_i = (x_i, y_i, z_i).$$

The sum of all mixed products can be represented as a characteristic of the helical structure chirality, and the equation for its description can be presented as follows:

$$\sum_{i=1}^{n-3} ([\mathbf{v}_i, \mathbf{v}_{i+1}], \mathbf{v}_{i+2}).$$

The sign of the mixed product determines the chirality sign of the protein helical structure.
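The summation described above can be illustrated with a minimal Python sketch. It is not the authors' program: the function names `chirality_sum` and `ideal_helix`, and the synthetic helix parameters (radius 2.3 Å, rise 1.5 Å, 100° per residue), are assumptions chosen only to produce test coordinates. For such an ideal helix, a right-handed turn gives a positive sum and its mirror image gives the same value with the opposite sign.

```python
import numpy as np

def chirality_sum(ca_coords):
    """Sum of mixed (scalar triple) products over consecutive C-alpha vectors."""
    points = np.asarray(ca_coords, dtype=float)
    if points.shape[0] < 4:
        raise ValueError("at least four reference points are needed")
    v = np.diff(points, axis=0)  # (n-1) vectors between neighbouring points
    # (v_i x v_{i+1}) . v_{i+2} for each of the (n-3) consecutive triples
    triple = np.einsum("ij,ij->i", np.cross(v[:-2], v[1:-1]), v[2:])
    return float(triple.sum())

def ideal_helix(n, radius=2.3, rise=1.5, turn_deg=100.0):
    """Synthetic helix along +z (hypothetical test data, not a real protein).

    turn_deg > 0 gives a right-handed helix, turn_deg < 0 a left-handed one.
    """
    k = np.arange(n)
    phi = np.deg2rad(turn_deg) * k
    return np.column_stack((radius * np.cos(phi), radius * np.sin(phi), rise * k))

right = chirality_sum(ideal_helix(12))
left = chirality_sum(ideal_helix(12, turn_deg=-100.0))
print(right > 0, left < 0)
```

Because only differences between neighboring points enter the calculation, the result does not depend on where the structure is placed in space.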

Method for Quantitative Analysis of the Chirality of Protein Superhelical Structures
We propose a method for the quantitative analysis of coiled-coil chirality. As in the above method for helical structures, information on the mutual arrangement of the α-carbons of the amino acid residues is used as the initial data. The method is based on determining the twisting direction of each individual α-helix relative to the entire superhelix axis. The angle between the superhelix axis and the helix axes was chosen as the criterion for determining this direction. The clockwise or counterclockwise orientation of this angle relative to the superhelix axis serves as an indicator of the chirality sign of the superhelix.
As an example, a thermostable, heterotrimeric coiled coil (PDB ID: 1BB1 [48]) is considered (Figure 2a). Since the α-helix has an average of 3.6 amino acid residues per turn, the geometric center of four consecutive atoms (for the first turn, points C1, C2, C3, C4, and so on) is taken as the conditional center of the helix turn. These centers are the points A1, A2, …, An through which the axis of the α-helix is drawn (Figure 2b). At the next stage, the "middle helix" is constructed, which carries no physical meaning but is necessary for further constructions: the geometric centers of the equally numbered atoms of the α-helices (for the first atoms, C1,1, C2,1, C3,1, and so on) are the points of the "middle helix" (B1 … Bn in the inset in Figure 2c). For the "middle helix", the axis is constructed according to the above method. The axis of the middle helix is defined as the superhelix axis (Figure 2d). Both the axis of the middle helix and the axes of the α-helices are curved lines in three-dimensional space (inset in Figure 2d), but for the minimum number of atoms, the curve can be considered a straight line. For the left-handed superhelix, the helices forming it are deflected to the right relative to the superhelix axis. The angle between the vs and vci vectors will be counted counterclockwise (Figure 3). The angle between the direction of the superhelix axis (vector vs) and the directions of the constituent helix axes (vectors vc1, vc2, vc3) determines the chirality sign of the superhelix (Figure 4). Let us denote the vector from the axis of the superhelix to the axis of the i-th helix as vki and the vector product of the vci and vs vectors as vpi. If the vector vci deviates from the vector vs to the right (clockwise), the superhelix is left-handed, and the vki and vpi vectors are counter-directed. If the vci vector deviates from the vs vector to the left, the superhelix is right-handed, and the vki and vpi vectors are co-directed.
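The axis construction described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the names `axis_points`, `middle_helix`, and `superhelix_axis` are hypothetical, the four-atom windows are assumed to overlap (sliding by one residue), and helices of unequal length are assumed to be truncated to the shortest one before averaging.

```python
import numpy as np

def axis_points(ca_coords, window=4):
    """Centroids of every `window` consecutive reference points (A1, A2, ...)."""
    points = np.asarray(ca_coords, dtype=float)
    if points.shape[0] < window:
        raise ValueError("helix is shorter than the averaging window")
    # Sliding-window mean computed from cumulative sums.
    csum = np.cumsum(np.vstack((np.zeros((1, 3)), points)), axis=0)
    return (csum[window:] - csum[:-window]) / window

def middle_helix(helices):
    """Centroids of equally numbered atoms of all helices (B1 ... Bn).

    Assumption: helices are truncated to the length of the shortest one.
    """
    n = min(len(h) for h in helices)
    stacked = np.stack([np.asarray(h, dtype=float)[:n] for h in helices])
    return stacked.mean(axis=0)

def superhelix_axis(helices, window=4):
    """Axis of the 'middle helix', taken as the superhelix axis."""
    return axis_points(middle_helix(helices), window)

# Toy input: five collinear points give two axis points.
print(axis_points([[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0], [4, 0, 0]]))
```

A window of four residues follows the text's rounding of 3.6 residues per turn; the `window` parameter is exposed only so that other choices can be tried.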
The angle between vki and vpi is determined using the dot product:

$$\cos\beta_i = \frac{\mathbf{v}_{ki} \cdot \mathbf{v}_{pi}}{|\mathbf{v}_{ki}|\,|\mathbf{v}_{pi}|},$$

where vki is the vector from the superhelix axis to the axis of the i-th helix, and vpi is the cross product of the vci and vs vectors. The estimate of the chirality sign of the superhelix is calculated by averaging the value of the cosine of the corresponding angle for all helices forming the superhelix:

$$\overline{\cos\beta} = \frac{1}{q} \sum_{i=1}^{q} \cos\beta_i,$$

where βi is the angle between the direction of the i-th α-helix axis and the direction of the superhelix axis (that is, between the vectors vki and vpi), and q is the number of helices in the superhelix.

Based on these methods, computer programs were implemented in Python 3.7. The graphical interface is implemented using the tkinter library. The programs allow the user to load a model from a file, display a list of helical structures, determine the sign of their chirality, and display a three-dimensional image using the matplotlib library. Input data are files with the .pdb or .txt extension. The average time for calculating a helix is 50 ms. The average time for calculating a superhelix is 20 ms. The methods require only a small amount of initial data, which reduces the amount of processed information and is a significant advantage when processing large amounts of data.
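The cosine-averaging step can be sketched in Python as shown below. This is a hedged illustration, not the authors' code: the helper names are hypothetical, each axis is approximated by the straight line between its first and last axis points, and vki is taken between the mean points of the two axes with its component along vs removed. The coordinates are synthetic, not taken from a real structure. Following the sign rule stated in the text (counter-directed vki and vpi for a left-handed superhelix), a negative average indicates a left-handed coiled coil.

```python
import numpy as np

def unit(v):
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def coiled_coil_sign(super_axis, helix_axes):
    """Mean of cos(beta_i) over all helices; inputs are arrays of axis points.

    Note: a helix axis exactly parallel to the superhelix axis gives a zero
    cross product, and this sketch does not handle that degenerate case.
    """
    s_pts = np.asarray(super_axis, dtype=float)
    v_s = unit(s_pts[-1] - s_pts[0])            # superhelix axis direction
    s_mid = s_pts.mean(axis=0)
    cosines = []
    for axis in helix_axes:
        a = np.asarray(axis, dtype=float)
        v_c = unit(a[-1] - a[0])                # helix axis direction
        v_k = a.mean(axis=0) - s_mid            # from superhelix axis to helix axis
        v_k = v_k - np.dot(v_k, v_s) * v_s      # keep the part perpendicular to v_s
        v_p = np.cross(v_c, v_s)
        cosines.append(np.dot(unit(v_k), unit(v_p)))
    return float(np.mean(cosines))

def synthetic_coiled_coil(twist_deg, q=3, radius=5.0, rise=1.5, n=10):
    """Hypothetical test data: q helix axes winding around the z axis."""
    k = np.arange(n)
    axes = []
    for j in range(q):
        phi = np.deg2rad(twist_deg) * k + 2.0 * np.pi * j / q
        axes.append(np.column_stack((radius * np.cos(phi),
                                     radius * np.sin(phi), rise * k)))
    return np.mean(axes, axis=0), axes

left_handed = coiled_coil_sign(*synthetic_coiled_coil(twist_deg=-3.0))
right_handed = coiled_coil_sign(*synthetic_coiled_coil(twist_deg=3.0))
print(left_handed < 0, right_handed > 0)
```

Averaging over all q helices, rather than relying on a single helix, makes the sign estimate less sensitive to local distortions of any one chain.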

Characterization of the Chirality of Protein Helical Structures
Using the developed method for quantitative analysis of protein helical structure chirality, the following proteins were considered: enzymes of seven classes (oxidoreductases, 189; transferases, 86; hydrolases, 182; lyases, 131; isomerases, 113; ligases, 89; translocases, 130), chaperones (15), viral proteins (26), structural proteins (13), endo- and exocytosis proteins (7), and electron transport proteins (4) (Table 1). Chirality maps were obtained for the considered protein helical structures (Figure 5).

For all studied proteins, α-helices and 3₁₀-helices are mainly right-handed: left-handed α-helices make up 0.06% of the total number of α-helices, and left-handed 3₁₀-helices make up 4.6% of the total number of 3₁₀-helices. No left-handed helices were found in the structures of lyases, viral proteins, chaperones, and electron transport proteins. In hydrolases, translocases, and structural proteins, the percentage of left-handed 3₁₀-helices is 10.34%, 9.07%, and 10.52%, respectively, which differs significantly from the other studied protein classes. The highest percentage of left-handed α-helices was found in the structures of endo- and exocytosis proteins (0.4%). The data are presented in Table 2.

Characterization of the Chirality of Protein Superhelical Structures
The calculated parameters for estimating the chirality of coiled-coil structures are shown in Table A1 in Appendix A.
The coiled-coil chirality map is shown in Figure 6. The method was tested on 114 coiled-coil structures from the CC+ database [46]. All coiled-coil structures were identified as left-handed. Thus, the results obtained are consistent with the data presented in the scientific literature.

Distributions of Helical and Superhelical Structures in Polypeptide Chains
In addition to the previously obtained results [49], we analyzed the distribution of secondary helical structures and coiled coils in the polypeptide chains of proteins of eight functional classes: viral proteins, chaperones, oxidoreductases, hydrolases, isomerases, structural proteins, exocytosis/endocytosis proteins, and electron transport proteins. Proteins were selected using the PDB [45] and CC+ [46] databases. We considered proteins that contain coiled-coil structures according to the CC+ database. The number of structures considered in each class was determined by the number and availability of structures in the mentioned databases. For processing files in the .pdb format, a special algorithm was developed and implemented as a program in the C++ language. The results of the analysis are presented as charts of secondary structure and coiled-coil structure distribution along the protein chain.

In the class of oxidoreductases, no clear patterns were found in the distribution of helical secondary structures along the polypeptide chain. The α-helices are distributed evenly. The probability of observing these structures lies in the range from 0.52 to 0.79 (here and below, the values are given for the chains excluding the ends). Coiled-coil structures in this class are not numerous, and they are distributed evenly. This is probably due to the functional features of these proteins: perhaps they do not require the mechanical strength that coiled coils provide.
In the studied hydrolases, the probability of observing α-helices increases toward the C-terminus, where it reaches a value of 0.82. Coiled coils are more common in this class than in oxidoreductases. It is possible to distinguish sections of the chain where the probability of detecting superhelical structures is higher. The lack of clear patterns in the distribution of helices in this class may be due to the specificity of the interaction between the enzyme and the substrate. The presence of regions with a relatively high concentration of superhelical structures may reflect a local demand of the molecule for rigid structures.
In the considered isomerases, the probability of detecting α-helices also increases as the C-terminus of the polypeptide chain is approached. The percentage of α-helices in the studied isomerases is the highest among the considered enzyme classes. The probability of their observation lies in the range from 0.35 to 0.88. There are few coiled-coil structures in isomerases, and their distribution along the polypeptide chain shows extended minima. The maxima of the probability of the appearance of coiled coils coincide with the places of noticeable accumulation of α-helices, especially at the C-terminus of the chain. This distribution probably indicates their binding to the active center at one of the ends.
Thus, no clear patterns in the distribution of α-helices were found in enzymes, and there are few coiled-coil structures in these classes. This is probably because enzymes must ensure the specificity of their reactions and therefore do not share a single structure across the entire class.
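The positional probabilities quoted for each class can be illustrated with a short Python sketch. The article does not specify how chains of different lengths are aligned, so the following assumptions are made purely for illustration: each chain is given as a per-residue secondary-structure string, positions are rescaled to relative chain position and sampled at the centers of a fixed number of bins, and the DSSP-style codes "H" (α-helix) and "G" (3₁₀-helix) are used as labels. The function name `occupancy_profile` is hypothetical, and the sketch is unrelated to the C++ program mentioned in the text.

```python
import numpy as np

def occupancy_profile(ss_strings, code="H", bins=50):
    """Fraction of chains that show `code` at each relative chain position.

    ss_strings: non-empty per-residue secondary-structure strings, one per chain.
    """
    profile = np.zeros(bins)
    for ss in ss_strings:
        flags = np.array([c == code for c in ss], dtype=float)
        # Map each bin centre (relative position in 0..1) to a residue index.
        idx = ((np.arange(bins) + 0.5) / bins * len(flags)).astype(int)
        profile += flags[idx]
    return profile / len(ss_strings)

# Toy chains of different lengths (hypothetical data).
chains = ["HHHH----", "--HH"]
print(occupancy_profile(chains, code="H", bins=4))
```

With this normalization, a value of 1 at some relative position means that every chain in the set has the given structure there, which corresponds to the plateaus described for some classes.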

Viral Proteins
A total of 109 polypeptide chains of viral proteins were considered. The distribution of helical and superhelical structures along the polypeptide chain of viral proteins, as well as the distribution of the lengths of α-helices and 3₁₀-helices, are shown in Figure 8. In this class, a high percentage of α-helices was found throughout the entire primary protein chain. The probability of observing α-helical structures lies in the range from 0.46 to 0.81. The proteins studied are rod-shaped molecules. The abundance of α-helices distributed along the primary chain may explain the rod-like structure of many molecules, which determines their strength for rigid capsid formation. The viral proteins also showed a high content of coiled-coil structures along the entire chain, while the probability of detection increases closer to the center of the chain and reaches a maximum of 0.54. The formation of coiled coils often involves α-helices belonging to different polypeptide chains. Therefore, such coiled-coil structures can be classified as belonging to the protein quaternary structure. A large number of α-helices and coiled coils may reflect the peculiarities of interaction with an infected cell.

Chaperones
A total of 40 polypeptide chains of chaperones were considered. The distribution of helical and superhelical structures along the polypeptide chain of chaperones, as well as the distribution of the lengths of α-helices and 3₁₀-helices, are shown in Figure 9. The probability of observing α-helical structures in chaperones lies in the range from 0.47 to 0.87 (at the ends of the polypeptide chain, the minimum probability value is 0.2), and the median value is 0.73.
The probability of detecting coiled-coil structures increases closer to the C-terminus. In this class, coiled-coil structures are distributed by domains throughout the entire chain, which can provide fragmentary local rigidity of molecules that essentially have the functions and features of transformer constructs.

Structural Proteins
To identify patterns in the distribution of helices and coiled coils along the polypeptide chain, 29 polypeptide chains of structural proteins were considered. The distribution of helical and superhelical structures along the polypeptide chain of structural proteins, as well as the distribution of the lengths of α-helices and 3₁₀-helices, are shown in Figure 10. In structural proteins, a high percentage of α-helices was found throughout the entire primary protein chain. The probability of observing α-helical structures lies in the range from 0.46 to 1 (at the ends of the polypeptide chain, the minimum probability value is 0.23). The presence of rather long α-helices is consistent with the functional features of these proteins. A large number of helices provides a rigid structure for the entire molecule, which is important for structural proteins responsible for the mechanical strength of the cell. Several minima of the probability of detecting α-helices, coinciding with the maxima of the probability of detecting irregular structures, can provide mechanical mobility of the molecule in certain places. Structural proteins are characterized by a uniform distribution of coiled coils throughout the entire chain. However, the probability of the appearance of such a structure gradually decreases toward the C-terminus.

Exocytosis/Endocytosis Proteins
To identify patterns in the distribution of helices and coiled coils along the polypeptide chain, nine polypeptide chains of exocytosis/endocytosis proteins were considered. The distribution of helical and superhelical structures along the polypeptide chain of exocytosis/endocytosis proteins, as well as the distribution of the lengths of α-helices and 3₁₀-helices, are shown in Figure 11. In the considered exocytosis/endocytosis proteins, regions of the polypeptide chain were detected where the probability of detecting an α-helix is equal to 1. That is, α-helices are present in all studied proteins in this region of the polypeptide chain. The minimum value of the probability of observing α-helical structures is 0.57, and the median value is 0.86. The coiled-coil structures in exocytosis/endocytosis proteins are located in three large domains and are completely absent at both ends of the molecules. Such a distribution of structures can provide local rigidity of molecules as elements of mechanical structures, which is necessary for the formation of vesicles. Among the structures considered are t-SNAREs and v-SNAREs, in which helices play a crucial role in functioning.

Electron Transport Proteins
To identify patterns in the distribution of helices and coiled coils along the polypeptide chain, eight polypeptide chains of electron transport proteins were considered. The distribution of helical and superhelical structures along the polypeptide chain of electron transport proteins, as well as the distribution of the lengths of α-helices and 3₁₀-helices, are shown in Figure 12. As in exocytosis/endocytosis proteins, a plateau of α-helices can be observed in electron transport proteins, where the probability of detecting an α-helix is 1. The presence of regions rich in α-helices is probably associated with their location in the membrane. In this class, there are regions with a minimal probability of detecting coiled coils at the N-terminus, as well as in the middle of the polypeptide chain. Perhaps this feature allows the proteins to change conformation.
It should be noted that, in all studied classes, the maxima of the probability of appearance of 3₁₀-helices coincide with the minima of the probability of appearance of α-helices. In general, the percentage of 3₁₀-helices in the considered protein classes is low. Only in a few classes is the probability of their detection more than 0.2. In general, 3₁₀-helices are found in the form of rather short structures, but in some classes (isomerases, hydrolases, oxidoreductases), helices of more than 10 amino acid residues are found. Based on the presented charts, a common property of the proteins of the studied groups was revealed: the predominance of irregular regions at the beginning and at the end of polypeptide chains, which is consistent with the simplified concept of a protein globule as a hydrophobic core in a hydrophilic envelope.

Discussion
A topical issue in molecular biophysics is the quantitative evaluation of chiral transformations not only in the named helical and superhelical structures but also in the irregular structures connecting them, as well as the quantitative evaluation of the protein globule chirality [50][51][52][53][54]. The presented methods for quantitative analysis of the chirality of protein helical and superhelical structures are distinguished by the small amount of information required for their use, as well as by the relative simplicity of the calculations. For all proteins studied, α-helices and 3₁₀-helices are mainly represented by right-handed structures: left-handed α-helices make up 0.06% of the total number of α-helices, and left-handed 3₁₀-helices make up 4.6% of the total number of 3₁₀-helices. The results of the analysis of the axes of α-helices and 3₁₀-helices made it possible to confirm the general trend of the chirality sign of the right-handed and left-handed secondary structures of proteins. All studied coiled-coil structures were left-handed. The results obtained fully confirm the concept of a change in the symmetry types associated with the alternation of the chirality sign during the transition to the next hierarchical level, from helical to superhelical structures [11,12].
Also, in this work, we analyzed the location of secondary helical structures (α-helices and 3₁₀-helices) and coiled-coil structures along the polypeptide chains. The features common to all studied classes and typical for each protein class are revealed. A quantitative description of the spatial characteristics and locations of regular chiral intra- and supramolecular structures of proteins is significant for understanding the mechanisms of specific functioning of proteins of different classes and is also a necessary stage in the development of the general theory of molecular machines as chiral hierarchical structures.