# Towards Symmetry-Based Explanation of (Approximate) Shapes of Alpha-Helices and Beta-Sheets (and Beta-Barrels) in Protein Structure

## Abstract

## 1. Introduction

**Alpha-helices and beta-sheets: brief reminder.** Proteins are biological polymers that perform most of life’s functions. A single-chain polymer (protein) is folded in such a way that it forms local substructures called secondary structure elements. In order to study the structure and function of proteins, it is extremely important to have a good geometrical description of the protein’s structure. There are two important secondary structure elements: alpha-helices and beta-sheets. A part of the protein structure where a fragment of the polypeptide coils around an axis, forming a line-like feature, defines a secondary structure called an alpha-helix. A part of the protein structure where different fragments of the polypeptide align next to each other in extended conformation, forming a surface-like feature, defines a secondary structure called a beta-pleated sheet, or, for short, a beta-sheet; see, e.g., [1,2].

**Shapes of alpha-helices and beta-sheets: first approximation.** The actual shapes of alpha-helices and beta-sheets can be complicated. In the first approximation, alpha-helices are usually approximated by cylindrical spirals (also known as circular helices or (cylindrical) coils), i.e., curves which, in an appropriate coordinate system, have the form $x=a\cdot \cos(\omega \cdot t)$, $y=a\cdot \sin(\omega \cdot t)$, and $z=b\cdot t$. Similarly, in the first approximation, beta-sheets are usually approximated as planes. These are the shapes that we will try to explain in this paper.
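As a minimal sketch of the first-approximation curve, the following Python fragment samples points of a cylindrical spiral and checks the two properties that make it "cylindrical": constant distance from the axis and a constant rise per turn. The function name `cylindrical_spiral` and the numerical values of the parameters are illustrative assumptions, not values taken from any protein.

```python
import math

def cylindrical_spiral(a, omega, b, ts):
    """Return points (x, y, z) with x = a*cos(omega*t), y = a*sin(omega*t), z = b*t."""
    return [(a * math.cos(omega * t), a * math.sin(omega * t), b * t) for t in ts]

# Illustrative parameter values (assumptions, not fitted to any protein).
A, OMEGA, B = 2.0, 1.5, 0.5
ts = [0.1 * i for i in range(200)]
points = cylindrical_spiral(A, OMEGA, B, ts)

# Each point lies on the cylinder of radius a around the z-axis.
radii = [math.hypot(x, y) for x, y, _ in points]

# The rise along z per full turn (the pitch) is 2*pi*b/omega.
pitch = 2.0 * math.pi * B / OMEGA
```

Setting `B = 0` in this sketch gives a circle, and letting the radius tend to 0 gives a straight line, so both appear as limit cases of the same three-parameter family.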

**What we do in this paper: our main result.** In this paper, following the ideas of the renowned mathematician M. Gromov [3], we use symmetries to show that under reasonable assumptions, the empirically observed shapes of cylindrical spirals and planes are indeed the best families of simple approximating sets.

**Auxiliary result: we also explain the (approximate) shape of beta-barrels.** The actual shape of an alpha-helix or of a beta-sheet is somewhat different from these first-approximation shapes. In [4], we showed that symmetries can explain some resulting shapes of beta-sheets. In this paper, we will add, to the basic approximate shapes of a circular helix and a plane, one more shape. This shape is observed when, due to tertiary structure effects, a beta-sheet “folds” on itself, becoming what is called a beta-barrel. In the first approximation, beta-barrels are usually approximated by cylinders. So, in this paper, we will also explain cylinders.

**Possible future work: need for explaining shapes of combinations of alpha-helices and beta-sheets.** A protein usually consists of several alpha-helices and beta-sheets. In some cases, these combinations of basic secondary structure elements have their own interesting shapes: e.g., coils (alpha-helices) sometimes form a coiled coil. In this paper, we use symmetries to describe the basic geometric shape of secondary structure elements; we hope that similar symmetry ideas can be used to describe the shape of their combinations as well.

## 2. Symmetry Approach in Physics: Brief Reminder

**Symmetries are actively used in physics.** In our use of symmetries, we have been motivated by the successes of using symmetries in physics; see, e.g., [5]. So, in order to explain our approach, let us first briefly recall how symmetries are used in physics.

**Symmetries in physics: main idea.** In physics, we usually know the differential equations that describe the system’s dynamics. Once we know the initial conditions, we can then solve these equations and obtain the state of the system at any given moment of time.

**Symmetries in physics: examples.** Let us give two examples of the use of symmetries in physics:

- a simpler example in which we will be able to perform all the computations, and
- a more complex example in which we will skip all the computations and proofs—but which will be useful for our analysis of the shape of proteins.

**First example: pendulum.** As the first simple example, let us consider the problem of finding how the period $T$ of a pendulum depends on its length $L$ and on the free fall acceleration $g$ on the corresponding planet. We will denote the desired dependence by $T=f(L,g)$. This dependence was originally found by using Newton’s equations. We will show that (modulo a constant) the same dependence can be obtained without using any differential equations, only by taking the corresponding symmetries into account.
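A sketch of how such a symmetry-only derivation can go, assuming that the relevant symmetries are changes of the measuring units of length and of time (the scaling factors $\lambda$ and $\mu$ and the constant $c$ are notation introduced here for illustration). Changing the unit of length multiplies the numerical values of $L$ and $g$ by $\lambda$ and leaves $T$ unchanged; changing the unit of time multiplies $T$ by $\mu$ and $g$ by $\mu^{-2}$ and leaves $L$ unchanged. Requiring that the dependence $T=f(L,g)$ keeps its form under both changes gives:

$$
\begin{aligned}
f(\lambda L,\lambda g) &= f(L,g), \\
f(L,\mu^{-2} g) &= \mu \cdot f(L,g), \\
f(L,g) &= f(1,g/L) && (\lambda = 1/L), \\
f(1,g/L) &= \mu^{-1}\cdot f(1,1) && (\mu = \sqrt{g/L}), \\
T &= c\cdot \sqrt{L/g}, && c = f(1,1).
\end{aligned}
$$

The constant $c$ is not determined by the symmetries; for small oscillations, Newton’s equations give $c=2\pi$.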

**What is the advantage of using symmetries?** At first glance, the above derivation of the pendulum formula is somewhat useless: we did not invent any new mathematics, the above mathematics is very simple, and we did not come up with any new physical conclusion, since the formula for the period of the pendulum is well known. Yes, we got a slightly simpler derivation, but once a result is proven, getting a new shorter proof is not very interesting. So what is new in this derivation?

- if we have an experimental confirmation of the pendulum formula,
- this does not necessarily mean that we have confirmed Newton’s equations—all we confirmed are the symmetries.

**General comment about physical problems and fundamental physical equations.** The fact that we could derive this formula so easily shows that maybe in more complex situations, when solving the corresponding differential equation is not as easy, we would still be able to find an explicit solution by using appropriate symmetries. This is indeed the case in many complex problems; see, e.g., [5].

**Second example: shapes of celestial objects.** Another example where symmetries are helpful is the description of observed geometric shapes of celestial bodies. Many galaxies have the shape of planar logarithmic spirals; other clusters, galaxies, and galaxy clusters have the shapes of cones, conic spirals, cylindrical spirals, straight lines, spheres, etc. For several centuries, physicists have been interested in explaining these shapes. For example, there exist several dozen different physical theories that explain the observed logarithmic spiral shape of many galaxies. These theories differ in their physics and in the resulting differential equations, but they all lead to exactly the same shape: the logarithmic spiral.

## 3. From Physics to Analyzing Shapes of Proteins: Towards the Formulation of the Problem

**Reasonable symmetries.** It is reasonable to assume that the underlying chemical and physical laws do not change under shifts and rotations. Thus, as a group of symmetries, we take the group of all “solid motions”, i.e., of all transformations which are composed of shifts and rotations.
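To make the notion of a solid motion concrete, here is a minimal sketch that applies one rotation followed by one shift to two points and compares the distance between them before and after. For brevity it uses only rotations around the z-axis; the helper names `rotation_z` and `apply_motion` and all numerical values are illustrative assumptions.

```python
import math

def rotation_z(angle):
    """3x3 matrix of the rotation by `angle` radians around the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_motion(rotation, shift, point):
    """Apply the solid motion p -> R p + s to a 3-D point."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + shift[i]
        for i in range(3)
    )

# Two arbitrary points and one arbitrary motion (values are illustrative).
p, q = (1.0, 2.0, 3.0), (-0.5, 4.0, 0.25)
R, s = rotation_z(0.7), (3.0, -1.0, 2.0)

# Solid motions preserve distances between points.
d_before = math.dist(p, q)
d_after = math.dist(apply_motion(R, s, p), apply_motion(R, s, q))
```

A general rotation can be composed from rotations around the coordinate axes, so the same check carries over to arbitrary solid motions.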

**Reasonable shapes.** In chemistry, different shapes are possible. For example, bounded shapes like a point, a circle, or a sphere do occur in chemistry, but, due to their boundedness, they usually (approximately) describe the shapes of relatively small molecules like benzenes, fullerenes, etc.

**Reasonable families of shapes.** We do not want to find just one single shape; we want to find families of shapes that approximate the actual shapes of proteins. These families contain several parameters, so that by selecting values of all these parameters, we get a shape.

**We want to select the best approximating family.** In principle, we can have many different approximating families. Out of all these families, we want to select the one which is the best in some reasonable sense, e.g., the one that, on average, provides the most accurate approximation to the actual shape, or the one which is the fastest to compute, etc.

**What does the “best” mean?** There are many possible criteria for selecting the “best” family. It is not easy even to enumerate all of them, while our objective is to find the families which are the best according to each of these criteria. To overcome this difficulty, we formulate a general description of the optimality criteria and provide a general description of all the families which are optimal with respect to different criteria.

- If none of the families is the best, then this criterion is of no use, so there should be at least one optimal family.
- If several different families are equally best, then we can use this ambiguity to optimize something else: e.g., if we have two families with the same approximating quality, then we choose the one which is easier to compute. As a result, the original criterion was not final: we get a new criterion ($A \succeq_{\mathrm{new}} B$ if either A gives a better approximation, or if $A \sim_{\mathrm{old}} B$ and A is easier to compute), for which the class of optimal families is narrower. We can repeat this procedure until we get a final criterion for which there is only one optimal family.
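The tie-breaking step described above can be sketched on a finite toy example. The three candidate "families" and their scores below are made up purely for illustration, and the helper names `refine` and `optimal` are assumptions; higher scores are taken to be better.

```python
def refine(primary, secondary):
    """Build a new 'at least as good as' relation from two scores (higher is better).

    A is at least as good as B if A is strictly better on the primary score,
    or if they tie on the primary score and A is at least as good on the secondary one.
    """
    def at_least_as_good(a, b):
        if primary(a) != primary(b):
            return primary(a) > primary(b)
        return secondary(a) >= secondary(b)
    return at_least_as_good

def optimal(candidates, at_least_as_good):
    """All candidates that are at least as good as every candidate."""
    return [a for a in candidates if all(at_least_as_good(a, b) for b in candidates)]

# Hypothetical families: (name, approximation quality, ease of computation).
families = [("F1", 0.9, 2), ("F2", 0.9, 5), ("F3", 0.7, 9)]
quality = lambda f: f[1]
ease = lambda f: f[2]

old_criterion = lambda a, b: quality(a) >= quality(b)
new_criterion = refine(quality, ease)

old_best = [f[0] for f in optimal(families, old_criterion)]  # two equally good families
new_best = [f[0] for f in optimal(families, new_criterion)]  # the tie is broken
```

In this toy example the old criterion leaves two optimal candidates, while the refined one leaves exactly one, which is what being final requires.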

## 4. Definitions and the Main Result

**Definition 1.**

**Definition 2.**

**Definition 3.**

- By a multi-valued function $F:M\to N$ we mean a function that maps each $m\in M$ into a discrete set $F\left(m\right)\subseteq N$.
- We say that a multi-valued function is smooth if for every point ${m}_{0}\in M$ and for every value ${f}_{0}\in F\left({m}_{0}\right)$, there exists an open neighborhood U of ${m}_{0}$ and a smooth function $f:U\to N$ for which $f\left({m}_{0}\right)={f}_{0}$ and for every $m\in U$, $f\left(m\right)\in F\left(m\right)$.

**Definition 4.**

- We say that a class A of closed subsets of M is G-invariant if for every set $X\in A$, and for every transformation $g\in G$, the set $g\left(X\right)$ also belongs to the class.
- If A is a G-invariant class, then we say that A is a finitely parametric family of sets if there exist:
  - a (finite-dimensional) smooth manifold V;
  - a mapping s that maps each element $v\in V$ into a set $s\left(v\right)\subseteq M$; and
  - a smooth multi-valued function $\mathsf{\Pi}:G\times V\to V$

  such that:
  - the class of all sets $s\left(v\right)$ that correspond to different $v\in V$ coincides with A, and
  - for every $v\in V$, for every transformation $g\in G$, and for every $\pi \in \mathsf{\Pi}(g,v)$, the set $s\left(\pi \right)$ (that corresponds to π) is equal to the result $g\left(s\left(v\right)\right)$ of applying the transformation g to the set $s\left(v\right)$ (that corresponds to v).

- Let $r>0$ be an integer. We say that a class of sets B is an r-parametric class of sets if there exists a finitely parametric family of sets A defined by a triple $(V,s,\mathsf{\Pi})$ for which B consists of all the sets $s\left(v\right)$ with v from some r-dimensional sub-manifold $W\subseteq V$.
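As a toy illustration of the triple $(V,s,\mathsf{\Pi})$, consider the family of all planes, with each plane described by a pair v = (n, c) of a unit normal n and an offset c, so that s(v) is the set of points x with n·x = c. The sketch below is an assumption-laden example, not the construction used in the proofs: the helper names are hypothetical, and only rotations around the z-axis are used. It also shows why a multi-valued Π is natural: the pairs (n, c) and (−n, −c) describe the same plane.

```python
import math

def rotation_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def mat_vec(R, x):
    return tuple(dot(row, x) for row in R)

def motion(R, s, x):
    """The solid motion g: x -> R x + s."""
    return tuple(yi + si for yi, si in zip(mat_vec(R, x), s))

def s_of(v, x, tol=1e-9):
    """Membership test for the set s(v) = {x : n.x = c}, where v = (n, c)."""
    n, c = v
    return abs(dot(n, x) - c) < tol

def pi_values(R, s, v):
    """Parameters of the moved plane g(s(v)); both (n', c') and (-n', -c') describe it."""
    n, c = v
    n_new = mat_vec(R, n)
    c_new = c + dot(n_new, s)
    return [(n_new, c_new), (tuple(-a for a in n_new), -c_new)]

v = ((0.6, 0.0, 0.8), 2.0)      # the plane 0.6*x + 0.8*z = 2
x_on_plane = (1.2, 3.0, 1.6)    # 0.6*1.2 + 0.8*1.6 = 2
R, shift = rotation_z(0.9), (1.0, -2.0, 0.5)

# The moved point must belong to s(pi) for every value pi of the multi-valued function.
moved_point = motion(R, shift, x_on_plane)
checks = [s_of(pv, moved_point) for pv in pi_values(R, shift, v)]
```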

**Definition 5.**

- By an optimality criterion, we mean a pre-ordering (i.e., a transitive reflexive relation) $\preceq$ on the set $\mathcal{A}$.
- An optimality criterion is called G-invariant if for all $g\in G$, and for all $A,B\in \mathcal{A}$, $A\preceq B$ implies $g\left(A\right)\preceq g\left(B\right)$.
- An optimality criterion is called final if there exists one and only one element $A\in \mathcal{A}$ that is preferable to all the others, i.e., for which $B\preceq A$ for all $B\ne A$.

**Lemma.**

- the optimal family ${A}_{\mathrm{opt}}$ is G-invariant; and
- each set X from the optimal family is a union of orbits of $\ge (d-r)$-dimensional subgroups of the group G.

**Theorem.**

**Conclusion.** These shapes correspond exactly to alpha-helices, beta-sheets (and beta-barrels) that we observe in proteins. Thus, the symmetries indeed explain the observed protein shapes.

## 5. Proofs

**Proof of the Lemma.**

- 1°. Let us first show that this family ${A}_{\mathrm{opt}}$ is indeed G-invariant, i.e., that $g\left({A}_{\mathrm{opt}}\right)={A}_{\mathrm{opt}}$ for every transformation $g\in G$. Indeed, let $g\in G$. From the optimality of ${A}_{\mathrm{opt}}$, we conclude that for every $B\in \mathcal{A}$, ${g}^{-1}\left(B\right)\preceq {A}_{\mathrm{opt}}$. From the G-invariance of the optimality criterion, we can now conclude that $B\preceq g\left({A}_{\mathrm{opt}}\right)$. This is true for all $B\in \mathcal{A}$ and therefore, the family $g\left({A}_{\mathrm{opt}}\right)$ is optimal. But since the criterion is final, there is only one optimal family; hence, $g\left({A}_{\mathrm{opt}}\right)={A}_{\mathrm{opt}}$. So, ${A}_{\mathrm{opt}}$ is indeed invariant.
- 2°. Let us now show that an arbitrary set ${X}_{0}$ from the optimal family ${A}_{\mathrm{opt}}$ consists of orbits of $\ge (d-r)$-dimensional subgroups of the group G. Indeed, the fact that ${A}_{\mathrm{opt}}$ is G-invariant means, in particular, that for every $g\in G$, the set $g\left({X}_{0}\right)$ also belongs to ${A}_{\mathrm{opt}}$. Thus, we have a (smooth) mapping $g\to g\left({X}_{0}\right)$ from the d-dimensional manifold G into the $\le r$-dimensional set $G\left({X}_{0}\right)=\{g\left({X}_{0}\right)\mid g\in G\}\subseteq {A}_{\mathrm{opt}}$. In the following, we will denote this mapping by ${g}_{0}$.
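The dimension count behind step 2° can be sketched as follows. This is an outline under the standard orbit-stabilizer relation for smooth group actions, not a reproduction of the full argument; the symbol ${H}_{0}$ for the subgroup of transformations that leave ${X}_{0}$ in place is notation introduced here.

$$
\begin{aligned}
{H}_{0} &= \{g\in G \mid g\left({X}_{0}\right)={X}_{0}\}, \\
\dim G\left({X}_{0}\right) &= \dim G-\dim {H}_{0} = d-\dim {H}_{0}, \\
\dim G\left({X}_{0}\right) \le r \quad &\Longrightarrow \quad \dim {H}_{0}\ge d-r.
\end{aligned}
$$

Since every $h\in {H}_{0}$ maps ${X}_{0}$ onto itself, for each point $x\in {X}_{0}$ the whole orbit $\{h\left(x\right)\mid h\in {H}_{0}\}$ lies in ${X}_{0}$, so ${X}_{0}$ is a union of orbits of the $\ge (d-r)$-dimensional subgroup ${H}_{0}$.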

**Proof of the Theorem.**

- 0: The only 0-dimensional orbit is a point.
- 1: A generic 1-dimensional orbit is a cylindrical spiral, which is described (in appropriate coordinates) by the equations $z=k\cdot \varphi $, $\rho ={R}_{0}$. Its limit cases are:
  - a circle ($z=0$, $\rho ={R}_{0}$);
  - a semi-line (ray);
  - a straight line.
- 2: Possible 2-D orbits include:
  - a plane;
  - a semi-plane;
  - a sphere; and
  - a circular cylinder.

- a cylindrical spiral (with a straight line as its limit case);
- a plane (or a part of the plane), and
- a cylinder.
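The statement that a generic 1-dimensional orbit is a cylindrical spiral can be checked numerically. The sketch below takes the one-parameter family of screw motions (rotation by an angle φ around the z-axis combined with a shift by k·φ along it), applies it to one starting point, and verifies that the resulting points satisfy $z=k\cdot \varphi $, $\rho ={R}_{0}$. The function name `screw_motion` and the numerical values are illustrative assumptions.

```python
import math

def screw_motion(phi, k, point):
    """Rotate `point` by angle phi around the z-axis and shift it by k*phi along z."""
    x, y, z = point
    c, s = math.cos(phi), math.sin(phi)
    return (c * x - s * y, s * x + c * y, z + k * phi)

# Assumed illustrative values for the radius R0 and the slope k.
R0, K = 1.5, 0.4
start = (R0, 0.0, 0.0)

# Orbit of the starting point: one image per value of the group parameter phi.
phis = [0.05 * i for i in range(-100, 101)]
orbit = [(phi, screw_motion(phi, K, start)) for phi in phis]

# Group property: applying phi = 0.5 and then phi = 0.3 equals applying phi = 0.8.
composed = screw_motion(0.3, K, screw_motion(0.5, K, start))
direct = screw_motion(0.8, K, start)
```

With `K = 0` the same code produces the circle listed among the limit cases.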

## 6. Symmetry-Related Speculations on Possible Physical Origin of the Observed Shapes

## Acknowledgements

## References

1. Branden, C.I.; Tooze, J. Introduction to Protein Structure; Garland Publisher: New York, NY, USA, 1999.
2. Lesk, A.M. Introduction to Protein Science: Architecture, Function, and Genomics; Oxford University Press: New York, NY, USA, 2010.
3. Gromov, M. Crystals, proteins and isoperimetry. Bull. Am. Math. Soc. **2011**, 48, 229–257.
4. Stec, B.; Kreinovich, V. Geometry of protein structures. I. Why hyperbolic surfaces are a good approximation for beta-sheets. Geombinatorics **2005**, 15, 18–27.
5. Feynman, R.P.; Leighton, R.B.; Sands, M. Feynman Lectures on Physics; Addison-Wesley: Boston, MA, USA, 2005.
6. Finkelstein, A.M.; Kreinovich, V. Derivation of Einstein’s, Brans-Dicke and Other Equations From Group Considerations. In Proceedings of the Sir Arthur Eddington Centenary Symposium on Relativity Theory; Choque-Bruhat, Y., Karade, T.M., Eds.; World Scientific: Singapore, 1985; Volume 2, pp. 138–146.
7. Finkelstein, A.M.; Kreinovich, V.; Zapatrin, R.R. Fundamental Physical Equations Uniquely Determined by Their Symmetry Groups. In Global Analysis—Studies and Applications II; Springer: Berlin/Heidelberg, Germany, 1986; Volume 1214, pp. 159–170.
8. Finkelstein, A.; Kosheleva, O.; Kreinovich, V. Astrogeometry, error estimation, and other applications of set-valued analysis. ACM SIGNUM Newsl. **1996**, 31, 3–25.
9. Finkelstein, A.; Kosheleva, O.; Kreinovich, V. Astrogeometry: Towards mathematical foundations. Int. J. Theor. Phys. **1997**, 36, 1009–1020.
10. Finkelstein, A.; Kosheleva, O.; Kreinovich, V. Astrogeometry: Geometry explains shapes of celestial bodies. Geombinatorics **1997**, VI, 125–139.
11. Li, S.; Ogura, Y.; Kreinovich, V. Limit Theorems and Applications of Set Valued and Fuzzy Valued Random Variables; Kluwer Academic Publishers: Dordrecht, The Netherlands, 2002.
12. Novotny, J.; Bruccoleri, R.E.; Newell, J. Twisted hyperboloid (Strophoid) as a model of beta-barrels in proteins. J. Mol. Biol. **1984**, 177, 567–573.

© 2012 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

## Share and Cite

**MDPI and ACS Style**

Nava, J.; Kreinovich, V.
Towards Symmetry-Based Explanation of (Approximate) Shapes of Alpha-Helices and Beta-Sheets (and Beta-Barrels) in Protein Structure. *Symmetry* **2012**, *4*, 15-25.
https://doi.org/10.3390/sym4010015
