# Unveiling the Hidden Rules of Spherical Viruses Using Point Arrays

## Abstract


## 1. Introduction

## 2. Materials and Methods

#### 2.1. Icosahedral Symmetry

#### 2.2. Affine Extensions

#### 2.3. Major Features of Point Arrays

#### 2.3.1. 55 Unique Single Point Arrays

#### 2.3.2. Gauge Points

#### 2.3.3. Sister Point Arrays

#### 2.3.4. Double Base Point Arrays

#### 2.3.5. Single Free Parameter

#### 2.4. Radially Ordered Single Base Point Arrays

#### 2.5. Point Array Fitness Algorithm

#### 2.5.1. Identify Protruding Features of a Virus

#### 2.5.2. Determine Gauge Point Scaling

#### 2.5.3. Scale and Truncate Point Arrays

#### 2.5.4. Compute RMSD from Truncated Point Arrays to the Viral Capsid Proteins

#### 2.5.5. Determine Best Fit Point Arrays

- Have an RMSD score that is lower by $0.5$ Å or more.
- Have at least one element near each protein.
- Encase the protein capsid with points above and below.
- Have better agreement with the gauge point fits, as seen in Figure 11.
- Have more points of contact with capsid proteins, e.g., each point on a five-fold axis has at least 5 points of contact with protein surfaces. We consider this criterion only after checking the gauge point fits (the fourth criterion, d), because the number of contacts can be quite large for point arrays with $\mathrm{IDD}$ bases or ${\overrightarrow{T}}_{2}$ extensions, which can considerably lower the RMSD score.
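The criteria above can be read as an ordered sequence of tie-breakers. The following is a minimal sketch of that reading, not the authors' implementation: the `Fit` record, its field names, the use of a single gauge offset distance to express "agreement with the gauge point fits", and the strictly sequential application of the criteria are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Fit:
    """Hypothetical summary of one point array fitted to a capsid."""
    name: str
    rmsd: float               # RMSD to the capsid proteins, in angstroms
    near_every_protein: bool  # at least one element near each protein
    encases_capsid: bool      # points both above and below the protein shell
    gauge_offset: float       # assumed measure: gauge point to protrusion distance
    contacts: int             # points of contact with capsid proteins


RMSD_MARGIN = 0.5  # angstroms, taken from the first criterion


def preferred(a, b):
    """Return the preferred fit, or None if all five criteria leave a tie."""
    if abs(a.rmsd - b.rmsd) >= RMSD_MARGIN:
        return a if a.rmsd < b.rmsd else b
    # Remaining criteria, applied in order; larger score means better.
    scores = (
        lambda f: f.near_every_protein,
        lambda f: f.encases_capsid,
        lambda f: -f.gauge_offset,
        lambda f: f.contacts,
    )
    for score in scores:
        if score(a) != score(b):
            return a if score(a) > score(b) else b
    return None


# Illustrative values only, not results from the paper.
array_a = Fit("array_a", 1.0, True, True, 2.0, 10)
array_b = Fit("array_b", 1.2, True, True, 1.0, 12)
print(preferred(array_a, array_b).name)
```

In this made-up example the two RMSD scores differ by less than the margin, both arrays pass the second and third criteria, and the smaller gauge offset decides in favor of `array_b` before the contact count is consulted.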

#### 2.6. Comparison with Previous Measure

#### 2.6.1. Gauge Point Agreement

#### 2.6.2. Simplified RMSD Measure

#### 2.6.3. Gauge Fixing of Truncated Point Arrays

#### 2.6.4. Recognition of Sister Point Arrays

#### 2.6.5. Tie Breaking Criteria

## 3. Results and Discussion

#### 3.1. Virus Point Array Classification

#### 3.2. Advantage of Sister Point Arrays

#### 3.3. Penton Base of Adenovirus Ad3 Dodecahedron (Ad3, T = 1, 4aqq)

#### 3.4. Hepatitis E VLP (HEV, T = 1, 3hag)

#### 3.5. Bacteriophage MS2 (MS2, T = 3, 2ms2)

#### 3.6. Hepatitis B (HBV, T = 4, 1qgt)

#### 3.7. Cowpea Chlorotic Mottle Virus Maturation (CCMV, T = 3, 1cwp)

#### 3.8. Cowpea Mosaic Virus (CPMV, pT3, 1ny7) Lysine Analysis

#### 3.9. Bacteriophage HK97 Prohead II (HK97, T = $7l$, 3e8k)

#### 3.10. Limitations of Point Arrays

## 4. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Appendix A

#### Appendix A.1. Icosahedral Rotation Matrices

#### Appendix A.2. Vertices of the Icosahedral Polyhedra

#### Appendix A.3. Worked Example of an Affine Extension

## References

1. Keef, T.; Twarock, R. Affine Extensions of the Icosahedral Group with Applications to the Three-dimensional Organisation of Simple Viruses. J. Math. Biol. **2009**, 59, 287–313.
2. Keef, T.; Wardman, J.P.; Ranson, N.A.; Stockley, P.G.; Twarock, R. Structural Constraints on the Three-dimensional Geometry of Simple Viruses: Case Studies of a New Predictive Tool. Acta Crystallogr. Sect. A Found. Crystallogr. **2013**, 69, 140–150.
3. Caspar, D.L.; Klug, A. Physical Principles in the Construction of Regular Viruses. Cold Spring Harbor Symp. Quant. Biol. **1962**, 27, 1–24.
4. Wilson, D.P. Protruding Features of Viral Capsids are Clustered on Icosahedral Great Circles. PLoS ONE **2016**, 11, 1–22.
5. Janner, A. Form, Symmetry and Packing of Biomacromolecules. I. Concepts and Tutorial Examples. Acta Crystallogr. Sect. A Found. Crystallogr. **2010**, 66, 301–311.
6. Janner, A. Form, Symmetry and Packing of Biomacromolecules. II. Serotypes of Human Rhinovirus. Acta Crystallogr. Sect. A Found. Crystallogr. **2010**, 66, 312–326.
7. Janner, A. Form, Symmetry and Packing of Biomacromolecules. III. Antigenic, Receptor and Contact Binding Sites in Picornaviruses. Acta Crystallogr. Sect. A Found. Crystallogr. **2011**, 67, 174–189.
8. Janner, A. Form, Symmetry and Packing of Biomacromolecules. IV. Filled Capsids of Cowpea, Tobacco, MS2 and Pariacoto RNA Viruses. Acta Crystallogr. Sect. A Found. Crystallogr. **2011**, 67, 517–520.
9. Janner, A. Form, Symmetry and Packing of Biomacromolecules. V. Shells with Boundaries at Anti-nodes of Resonant Vibrations in Icosahedral RNA Viruses. Acta Crystallogr. Sect. A Found. Crystallogr. **2011**, 67, 521–532.
10. Janner, A. From an Affine Extended Icosahedral Group Towards a Toolkit for Viral Architecture. Acta Crystallogr. Sect. A Found. Crystallogr. **2013**, 69, 151–163.
11. Zappa, E.; Dykeman, E.C.; Twarock, R. On the Subgroup Structure of the Hyperoctahedral Group in Six Dimensions. Acta Crystallogr. Sect. A Found. Adv. **2014**, 70, 417–428.
12. Zappa, E.; Dykeman, E.C.; Geraets, J.A.; Twarock, R. A Group Theoretical Approach to Structural Transitions of Icosahedral Quasicrystals and Point Arrays. J. Phys. A Math. Theor. **2016**, 49, 175203.
13. Rochal, S.B.; Konevtsova, O.V.; Myasnikova, A.E.; Lorman, V.L. Hidden Symmetry of Small Spherical Viruses and Organization Principles in “anomalous” and Double-Shelled Capsid Nanoassemblies. Nanoscale **2016**, 8, 16976–16988.
14. Twarock, R.; Luque, A. Structural Puzzles in Virology Solved with an Overarching Icosahedral Design Principle. Nat. Commun. **2019**, 10, 4414.
15. Wardman, J.P. A Symmetry Approach to Virus Architecture. Ph.D. Thesis, University of York, York, UK, 2012.
16. Wang, Q.; Kaltgrad, E.; Lin, T.; Johnson, J.E.; Finn, M.G. Natural Supramolecular Building Blocks: Wild-type Cowpea Mosaic Virus. Chem. Biol. **2002**, 9, 805–811.
17. Chatterji, A.; Ochoa, W.F.; Paine, M.; Ratna, B.R.; Johnson, J.E.; Lin, T. New Addresses on an Addressable Virus Nanoblock: Uniquely Reactive Lys Residues on Cowpea Mosaic Virus. Chem. Biol. **2004**, 11, 855–863.
18. Garriga, D.; Querol-Audí, J.; Abaitua, F.; Saugar, I.; Pous, J.; Verdaguer, N.; Castón, J.R.; Rodriguez, J.F. The 2.6-Angstrom Structure of Infectious Bursal Disease Virus-Derived T=1 Particles Reveals New Stabilizing Elements of the Virus Capsid. J. Virol. **2006**, 80, 6895–6905.
19. Guu, T.S.Y.; Liu, Z.; Ye, Q.; Mata, D.A.; Li, K.; Yin, C.; Zhang, J.; Tao, Y.J. Structure of the Hepatitis E Virus-like Particle Suggests Mechanisms for Virus Assembly and Receptor Binding. Proc. Natl. Acad. Sci. USA **2009**, 106, 12992–12997.
20. Szolajska, E.; Burmeister, W.P.; Zochowska, M.; Nerlo, B.; Andreev, I.; Schoehn, G.; Andrieu, J.P.P.; Fender, P.; Naskalska, A.; Zubieta, C.; et al. The Structural Basis for the Integrity of Adenovirus Ad3 Dodecahedron. PLoS ONE **2012**, 7, 1–11.
21. Carrillo-Tripp, M.; Shepherd, C.M.; Borelli, I.A.; Venkataraman, S.; Lander, G.; Natarajan, P.; Johnson, J.E.; Brooks, C.L., III; Reddy, V.S. VIPERdb2: An Enhanced and Web API Enabled Relational Database for Structural Virology. Nucleic Acids Res. **2009**, 37, D436–D442.
22. Brooks, B.; Brooks, C.L., III; Mackerell, A.D.; Nilsson, L.; Petrella, R.J.; Roux, B.; Won, Y.; Archontis, G.; Bartels, C.; Boresch, S.; et al. CHARMM: The Biomolecular Simulation Program. J. Comput. Chem. **2009**, 30, 1545–1614.
23. Hartman, E.C.; Jakobson, C.M.; Favor, A.H.; Lobba, M.J.; Álvarez-Benedicto, E.; Francis, M.B.; Tullman-Ercek, D. Quantitative Characterization of All Single Amino Acid Variants of a Viral Capsid-Based Drug Delivery Vehicle. Nat. Commun. **2018**, 9, 1–11.
24. Hartman, E.C.; Lobba, M.J.; Favor, A.H.; Robinson, S.A.; Francis, M.B.; Tullman-Ercek, D. Experimental Evaluation of Coevolution in a Self-Assembling Particle. Biochemistry **2019**, 58, 1527–1538.
25. Liu, H.; Qu, C.; Johnson, J.E.; Case, D.A. Pseudo-Atomic Models of Swollen CCMV from Cryo-Electron Microscopy Data. J. Struct. Biol. **2003**, 142, 356–363.
26. Larson, S.B.; Day, J.; Greenwood, A.; McPherson, A. Refined Structure of Satellite Tobacco Mosaic Virus at 1.8 Å Resolution. J. Mol. Biol. **1998**, 277, 37–59.
27. Naitow, H.; Tang, J.; Canady, M.; Wickner, R.B.; Johnson, J.E. L-A Virus at 3.4 Å Resolution Reveals Particle Architecture and mRNA Decapping Mechanism. Nat. Struct. Biol. **2002**, 9, 725–728.
28. Tars, K.; Bundule, M.; Fridborg, K.; Liljas, L. The Crystal Structure of Bacteriophage GA and a Comparison of Bacteriophages Belonging to the Major Groups of Escherichia coli Leviviruses. J. Mol. Biol. **1997**, 271, 759–773.
29. Golmohammadi, R.; Valegård, K.; Fridborg, K.; Liljas, L. The Refined Structure of Bacteriophage MS2 at 2.8 Å Resolution. J. Mol. Biol. **1993**, 234, 620–639.
30. Speir, J.A.; Munshi, S.; Wang, G.; Baker, T.S.; Johnson, J.E. Structures of the Native and Swollen Forms of Cowpea Chlorotic Mottle Virus Determined by X-ray Crystallography and Cryo-Electron Microscopy. Structure **1995**, 3, 63–78.
31. Oda, Y.; Saeki, K.; Takahashi, Y.; Maeda, T.; Naitow, H.; Tsukihara, T.; Fukuyama, K. Crystal Structure of Tobacco Necrosis Virus at 2.25 Å Resolution. J. Mol. Biol. **2000**, 300, 153–169.
32. Lin, T.; Chen, Z.; Usha, R.; Stauffacher, C.V.; Dai, J.B.; Schmidt, T.; Johnson, J.E. The Refined Crystal Structure of Cowpea Mosaic Virus at 2.8 Å Resolution. Virology **1999**, 265, 20–34.
33. Speir, J.; Natarajan, P.; Taylor, D.; Chen, Z.; Johnson, J.E. Crystal Structure of Helicoverpa Armigera Stunt Virus. 2011; To be published.
34. Wynne, S.A.; Crowther, R.A.; Leslie, A.G.W. The Crystal Structure of the Human Hepatitis B Virus Capsid. Mol. Cell **1999**, 3, 771–780.
35. Munshi, S.; Liljas, L.; Cavarelli, J.; Bomu, W.; McKinney, B.; Reddy, V.S.; Johnson, J.E. The 2.8 Å Structure of a T = 4 Animal Virus and its Implications for Membrane Translocation of RNA. J. Mol. Biol. **1996**, 261, 1–10.
36. Hryc, C.F.; Chen, D.H.; Afonine, P.V.; Jakana, J.; Wang, Z.; Haase-Pettingell, C.; Jiang, W.; Adams, P.D.; King, J.A.; Schmid, M.F.; et al. Accurate Model Annotation of a Near-Atomic Resolution Cryo-EM Map. Proc. Natl. Acad. Sci. USA **2017**, 114, 3103–3108.
37. Gertsman, I.; Gan, L.; Guttman, M.; Lee, K.; Speir, J.A.; Duda, R.L.; Hendrix, R.W.; Komives, E.A.; Johnson, J.E. An Unexpected Twist in Viral Capsid Maturation. Nature **2009**, 458, 646–650.
38. Zubieta, C.; Schoehn, G.; Chroboczek, J.; Cusack, S. The Structure of the Human Adenovirus 2 Penton. Mol. Cell **2005**, 17, 121–135.
39. Hadden, J.A.; Perilla, J.R.; Schlicksup, C.J.; Venkatakrishnan, B.; Zlotnick, A.; Schulten, K. All-atom Molecular Dynamics of the HBV Capsid Reveals Insights into Biological Function and Cryo-EM Resolution Limits. eLife **2018**, 7, 1–27.
40. Wang, J.C.Y.; Mukhopadhyay, S.; Zlotnick, A. Geometric Defects and Icosahedral Viruses. Viruses **2018**, 10, 25.
41. Indelicato, G.; Cermelli, P.; Salthouse, D.G.; Racca, S.; Zanzotto, G.; Twarock, R. A Crystallographic Approach to Structural Transitions in Icosahedral Viruses. J. Math. Biol. **2012**, 64, 745–773.
42. Tama, F.; Brooks, C.L., III. Diversity and Identity of Mechanical Properties of Icosahedral Viral Capsids Studied with Elastic Network Normal Mode Analysis. J. Mol. Biol. **2005**, 345, 299–314.
43. Wikoff, W.R.; Liljas, L.; Duda, R.L.; Tsuruta, H.; Hendrix, R.W.; Johnson, J.E. Topologically Linked Protein Rings in the Bacteriophage HK97 Capsid. Science **2000**, 289, 2129–2133.
44. Lošdorfer Božič, A.; Šiber, A.; Podgornik, R. Statistical Analysis of Sizes and Shapes of Virus Capsids and Their Resulting Elastic Properties. J. Biol. Phys. **2013**, 39, 215–228.
45. Rossmann, M.G. Structure of Viruses: A Short History. Q. Rev. Biophys. **2013**, 46, 133–180.

**Figure 1.** (**a**) The spatial distribution of gauge points has icosahedral symmetry, i.e., the points are arranged with 2-, 3- and 5-fold symmetry as shown. The gauge points all lie on the 15 icosahedral great circles which connect nearest-neighbor symmetry axes [4] and are colored as in Table 1. We refer to the arcs of the circles joining the 2-, 3- and 5-fold axes as the 5-2 GC, 5-3 GC and 2-3 GC. The volume bounded by these arcs is known as the asymmetric unit (AU) and is a representative one-sixtieth section of the entire capsid (yellow). (**b**) The gauge points have been placed on a radially colored HBV (1qgt) capsid at their nearest possible distance from the protein surface. We will later see that the only admissible gauge points are the purple and orange elements sitting atop the protruding dimers. (**c**) There are 21 unique gauge points in the AU, see Table 2. The gauge points are formed from 2-, 3- and 5-fold translations of the icosahedron (ICO), dodecahedron (DOD) and icosidodecahedron (IDD). The gauge points along the 2-, 3- and 5-fold axes result solely from ${\mathrm{IDD}}_{2}$, ${\mathrm{DOD}}_{3}$ and ${\mathrm{ICO}}_{5}$ point arrays, respectively. Each off-axis gauge point, e.g., GP:2 to GP:5, results only from a pair of nearly identical point arrays known as sister point arrays. For example, Gauge Point 4 (GP:4) is only created from ${\varphi}^{\prime}{\mathrm{ICO}}_{3}$ and $\varphi {\mathrm{DOD}}_{5}$. In total, 36 of the 55 point arrays have off-axis gauge points. There are no gauge points in the bulk regions away from the great circles; however, interior points of the point arrays can be located there.

**Figure 2.** A structural comparison showing the wide diversity in viral capsid shapes of even the simplest $T=1$ viruses. Each virus capsid has been radially colored, with red being the most interior and blue the most exterior. These three, (**a**) Infectious Bursal Disease Virus (Avibirnavirus, 2gsy [18]), (**b**) Hepatitis E Virus-like particle (Hepevirus, 3hag [19]), and (**c**) Penton Base of Adenovirus A (Mastadenovirus, 4aqq [20]), were chosen as they have almost no overlap in their protruding features. The best fit point array for each capsid is listed. We will see that these point arrays describe unique features and complement the T-number classification. The asymmetric unit of each capsid is contained by the triangular region of gauge points (Figure 1).

**Figure 3.** The three standard polyhedra with icosahedral symmetry and the affine extension translation vectors. (**a**) The 12 vertices of the icosahedron are on the six 5-fold axes, and the structure is generated by applying all 60 icosahedral rotations to a single point $[0,1,\varphi ]$, which also serves as the translation vector ${\overrightarrow{T}}_{5}$. (**b**) The 20 vertices of the dodecahedron are on the ten 3-fold axes, and the structure is generated by applying the full icosahedral group to the point $[-{\varphi}^{\prime},0,\varphi ]$, which also serves as the ${\overrightarrow{T}}_{3}$ translation vector. (**c**) The 30 vertices of the icosidodecahedron are on the fifteen 2-fold axes, and the structure is generated by applying the full icosahedral group to the point $\frac{1}{2}[-1,{\varphi}^{2},\varphi ]$, which serves as the ${\overrightarrow{T}}_{2}$ translation vector.
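The construction described in the Figure 3 caption can be checked numerically. The sketch below is not the authors' code: it assumes the 60 rotations can be obtained by closing two generators (a 5-fold rotation about $[0,1,\varphi ]$ and a 2-fold rotation about the z axis) under multiplication, and all function names are hypothetical. It then counts the distinct images of each of the three seed points.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def rotation(axis, angle):
    """Rodrigues rotation matrix (rows as tuples) about `axis` by `angle` radians."""
    n = math.sqrt(sum(a * a for a in axis))
    x, y, z = (a / n for a in axis)
    c, s, t = math.cos(angle), math.sin(angle), 1 - math.cos(angle)
    return ((c + x * x * t, x * y * t - z * s, x * z * t + y * s),
            (y * x * t + z * s, c + y * y * t, y * z * t - x * s),
            (z * x * t - y * s, z * y * t + x * s, c + z * z * t))


def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
                 for i in range(3))


def apply(m, v):
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(3))


def key(values, digits=6):
    """Rounded tuple used to identify numerically equal matrices or points."""
    return tuple(round(x, digits) + 0.0 for x in values)


def icosahedral_rotations():
    """Close two assumed generators under multiplication to obtain the group."""
    gens = (rotation((0, 1, PHI), 2 * math.pi / 5),  # 5-fold about an ICO vertex
            rotation((0, 0, 1), math.pi))            # 2-fold about the z axis
    identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    seen = {key(sum(identity, ())): identity}
    frontier = [identity]
    while frontier:
        new = []
        for g in frontier:
            for h in gens:
                m = matmul(h, g)
                k = key(sum(m, ()))
                if k not in seen:
                    seen[k] = m
                    new.append(m)
        frontier = new
    return list(seen.values())


def orbit(group, point):
    """Distinct images of `point` under every rotation in `group`."""
    return list({key(apply(g, point)): None for g in group})


GROUP = icosahedral_rotations()
T5 = (0.0, 1.0, PHI)                # icosahedron seed, [0, 1, phi]
T3 = (1 / PHI, 0.0, PHI)            # dodecahedron seed, [-phi', 0, phi] with phi' = -1/phi
T2 = (-0.5, PHI ** 2 / 2, PHI / 2)  # icosidodecahedron seed, (1/2)[-1, phi^2, phi]

print(len(GROUP), len(orbit(GROUP, T5)), len(orbit(GROUP, T3)), len(orbit(GROUP, T2)))
# 60 12 20 30
```

Points are compared after rounding to six decimals, which is adequate here because every coordinate is a small combination of $1$ and $\varphi$; a production implementation would probably prefer exact arithmetic.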

**Figure 4.** Formation of the $\varphi {\mathrm{ICO}}_{5}$ point array [1]. (**a**) The base of this point array is an icosahedron scaled up by multiplying by $\varphi $. (**b**) The base point array is then translated by ${\overrightarrow{T}}_{5}$. Under this ${\overrightarrow{T}}_{5}$ extension, the base point array consists of 4 separate levels. The lowest level is a single vertex which remains on the 5-fold axis; the next level consists of 5 vertices which all lie on 3-fold axes (all with the same radius). The next level is located on the 5-2 great circle of the AU, resulting in 60 points. Finally, the gauge point of the point array is created by the largest radius point (Figure 5). (**c**) Icosahedral symmetry is now applied, creating the full point array (base and cloud), where the points are colored as in Figure 1. As the translation was ${\overrightarrow{T}}_{5}$, the resulting gauge point remains on the 5-fold axis (Figure 1). (**d**) A look down one of the twelve 5-fold axes shows that the point array has icosahedral symmetry. The entire point array is enveloped by a now larger icosahedron, with the purple points sitting on its edges. The formation of point arrays can equivalently be described as placing an icosahedron centered at each of the original base vertices to create the point cloud.

**Figure 5.** A histogram of the point array elements by radii in arbitrary units. The base array $\varphi \mathrm{ICO}$ has a bold outline around the 12 points at $r\sim 0.64$. The gauge points are on the 5-fold axes (Gauge Point 1). In total there are now 3 icosahedra and one dodecahedron, all with different radii. There is also a layer of 60 points along the 5-2 great circle of the AU; due to the 60 rotations of the icosahedral group, any point not located on a symmetry axis will produce 60 points after the symmetry is applied (Figure 4). The cardinality or size of this point array is $116=12\left(\mathrm{ICO}\right)+20\left(\mathrm{DOD}\right)+12\left(\mathrm{ICO}\right)+60\left(5-2\right)+12\left(\mathrm{ICO}\right)$.
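The radial decomposition in the Figure 5 caption can be reproduced with a short sketch. It is not the authors' code: it assumes the equivalent description from Figure 4(d), in which a unit icosahedron is centered on every vertex of the scaled base, and it builds the icosahedron from sign changes and cyclic permutations of $[0,1,\varphi ]$. The helper names are hypothetical.

```python
import math
from collections import Counter
from itertools import product

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def key(p, digits=6):
    """Rounded tuple used to identify numerically equal points."""
    return tuple(round(c, digits) + 0.0 for c in p)


def vertices(seed):
    """All sign changes of the cyclic permutations of `seed`, without duplicates."""
    x, y, z = seed
    found = {}
    for p in ((x, y, z), (y, z, x), (z, x, y)):
        for signs in product((1, -1), repeat=3):
            q = tuple(s * c for s, c in zip(signs, p))
            found[key(q)] = q
    return list(found.values())


def point_array(base, scale, extension):
    """Scaled base plus a copy of `extension` centered on every scaled base vertex."""
    points = {}
    for b in base:
        sb = tuple(scale * c for c in b)
        points[key(sb)] = sb
        for w in extension:
            q = tuple(u + v for u, v in zip(sb, w))
            points[key(q)] = q
    return list(points.values())


def radial_levels(points, digits=3):
    """Sorted (radius, count) pairs, one per distinct radial level."""
    counts = Counter(round(math.sqrt(sum(c * c for c in p)), digits) for p in points)
    return sorted(counts.items())


ICO = vertices((0.0, 1.0, PHI))          # the 12 icosahedron vertices
phi_ico_5 = point_array(ICO, PHI, ICO)   # base phi*ICO, 5-fold extension
levels = radial_levels(phi_ico_5)

print(len(phi_ico_5))             # 116
print([n for _, n in levels])     # [12, 20, 12, 60, 12]
```

Ordered by increasing radius, the level sizes match the terms of the sum in the caption, with the base $\varphi \mathrm{ICO}$ appearing as the third level.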

**Figure 6.** Formation of a double base point array. (**a**) Point array $\varphi {\mathrm{ICO}}_{5}$ and (**b**) point array $\varphi {\mathrm{DOD}}_{5}$ have the same affine extension ${\overrightarrow{T}}_{5}$ and may therefore be combined to form (**c**) ${\left(\mathrm{ICO}\cup \mathrm{DOD}\right)}_{5}$, a double base point array. (**d**) A view down a 5-fold axis after applying icosahedral symmetry, where the points originating from $\varphi {\mathrm{DOD}}_{5}$ are shown as asterisks (✳). All that is required to combine these two point arrays is that they have the same translation vector ${\overrightarrow{T}}_{5}$; it is coincidental that both $\varphi {\mathrm{ICO}}_{5}$ and $\varphi {\mathrm{DOD}}_{5}$ intersect at a 3-fold axis during the affine extension. The full radial distribution of points is shown in Figure 7a.

**Figure 7.** Here we examine two ways to form double point arrays with $\varphi {\mathrm{ICO}}_{5}$. As before, the base polyhedra are outlined in bold. In (**a**) we examine the formation of $\varphi {\mathrm{ICO}}_{5}\cup \varphi {\mathrm{DOD}}_{5}$, which adds points above the icosahedral envelope of $\varphi {\mathrm{ICO}}_{5}$ and could be used to locate a site for surface modification of the capsid. The cardinalities of $\varphi {\mathrm{ICO}}_{5}$ and $\varphi {\mathrm{DOD}}_{5}$ are 116 and 172, respectively (Table 2); however, each of these point arrays generates the base of the other array, reducing the cardinality from 288 points to 256 points, with 8 radial levels rather than 10. We also see the formation of (**b**) $\varphi {\mathrm{ICO}}_{5}\cup 2{\mathrm{IDD}}_{5}$, which adds surface points near the same radial level as the original gauge points of $\varphi {\mathrm{ICO}}_{5}$. The cardinality of $2{\mathrm{IDD}}_{5}$ is 242 and none of its points overlap $\varphi {\mathrm{ICO}}_{5}$. Therefore, the total cardinality of the double point array is 358 and the number of radial levels is 11. Double base point arrays can be used in a variety of applications, e.g., they can provide a more complete radial description of a capsid, indicating geometric constraints on all proteins, or they could provide locations where ligands or decoration proteins could be added to meet these new conditions, or even suggest precisely where internal drugs should be placed to not disrupt the stability of the capsid.
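The cardinalities and level counts quoted in the Figure 7 caption can be checked with a minimal sketch. As an assumption for illustration, each point array is built by centering a unit icosahedron on every vertex of the scaled base (the equivalent description given in Figure 4), and a double base array is taken to be the set union of two such arrays. The vertex coordinates follow the seed points of Figure 3; the helper names are hypothetical and this is not the authors' code.

```python
import math
from itertools import product

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def key(p, digits=6):
    """Rounded tuple used to identify numerically equal points."""
    return tuple(round(c, digits) + 0.0 for c in p)


def vertices(seed):
    """All sign changes of the cyclic permutations of `seed`, without duplicates."""
    x, y, z = seed
    found = {}
    for p in ((x, y, z), (y, z, x), (z, x, y)):
        for signs in product((1, -1), repeat=3):
            q = tuple(s * c for s, c in zip(signs, p))
            found[key(q)] = q
    return list(found.values())


def point_array(base, scale, extension):
    """Map rounded key -> point for the scaled base and its translated copies."""
    points = {}
    for b in base:
        sb = tuple(scale * c for c in b)
        points[key(sb)] = sb
        for w in extension:
            q = tuple(u + v for u, v in zip(sb, w))
            points[key(q)] = q
    return points


def n_levels(points, digits=3):
    """Number of distinct radii among the points."""
    return len({round(math.sqrt(sum(c * c for c in p)), digits)
                for p in points.values()})


ICO = vertices((0.0, 1.0, PHI))
DOD = vertices((1.0, 1.0, 1.0)) + vertices((1 / PHI, 0.0, PHI))
IDD = vertices((PHI, 0.0, 0.0)) + vertices((0.5, PHI ** 2 / 2, PHI / 2))

phi_ico_5 = point_array(ICO, PHI, ICO)
phi_dod_5 = point_array(DOD, PHI, ICO)
two_idd_5 = point_array(IDD, 2.0, ICO)

ico_dod = {**phi_ico_5, **phi_dod_5}   # double base array of panel (a)
ico_idd = {**phi_ico_5, **two_idd_5}   # double base array of panel (b)

print(len(phi_ico_5), len(phi_dod_5), len(ico_dod), n_levels(ico_dod))  # 116 172 256 8
print(len(two_idd_5), len(ico_idd), n_levels(ico_idd))                  # 242 358 11
```

The union in panel (a) loses 32 points because the 12 base vertices of $\varphi \mathrm{ICO}$ and the 20 base vertices of $\varphi \mathrm{DOD}$ each reappear in the cloud of the other array, whereas the two arrays in panel (b) share no points.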

**Figure 8.** The radial distributions of the point clouds of sister point arrays are identical, except for the base arrays (shown in bold outline). In (**a**) ${\varphi}^{\prime}{\mathrm{ICO}}_{3}$ is identical to $\varphi {\mathrm{DOD}}_{5}$ except for their two bases ${\varphi}^{\prime}\mathrm{ICO}$ and $\varphi \mathrm{DOD}$. Please note that these point arrays have different affine extensions, so they cannot be combined. In (**b**) ${\mathrm{IDD}}_{5}$ is identical to ${\mathrm{ICO}}_{2}$ except for $\mathrm{IDD}$ and $\mathrm{ICO}$. The bulk points at $r\sim 0.75$ form two sets of 60 points, one on the left and one on the right side of the 5-2 great circle in the AU. While these points have the same radius, they are not equivalent locations.
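The sister relationship in panel (b) can be illustrated with a minimal sketch, again assuming (as in Figure 4) that a point array is the base together with a copy of the extension polyhedron centered on every base vertex. Under that assumption the cloud of ${\mathrm{IDD}}_{5}$ (icosahedra placed on IDD vertices) and the cloud of ${\mathrm{ICO}}_{2}$ (icosidodecahedra placed on ICO vertices) are the same set of pairwise sums, so only the bases can differ. Helper names are hypothetical and this is not the authors' code.

```python
import math
from itertools import product

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def key(p, digits=6):
    """Rounded tuple used to identify numerically equal points."""
    return tuple(round(c, digits) + 0.0 for c in p)


def vertices(seed):
    """All sign changes of the cyclic permutations of `seed`, without duplicates."""
    x, y, z = seed
    found = {}
    for p in ((x, y, z), (y, z, x), (z, x, y)):
        for signs in product((1, -1), repeat=3):
            q = tuple(s * c for s, c in zip(signs, p))
            found[key(q)] = q
    return list(found.values())


def point_array(base, extension):
    """Rounded points of an unscaled base plus its translated copies."""
    points = set()
    for b in base:
        points.add(key(b))
        for w in extension:
            points.add(key(tuple(u + v for u, v in zip(b, w))))
    return points


ICO = vertices((0.0, 1.0, PHI))
IDD = vertices((PHI, 0.0, 0.0)) + vertices((0.5, PHI ** 2 / 2, PHI / 2))

idd_5 = point_array(IDD, ICO)  # IDD base with the 5-fold extension
ico_2 = point_array(ICO, IDD)  # ICO base with the 2-fold extension

only_in_idd_5 = idd_5 - ico_2
only_in_ico_2 = ico_2 - idd_5

print(only_in_idd_5 == {key(v) for v in IDD})  # True
print(only_in_ico_2 == {key(v) for v in ICO})  # True
```

The two arrays therefore differ only by the 30 base vertices of $\mathrm{IDD}$ and the 12 base vertices of $\mathrm{ICO}$, which is why a fit to a capsid can separate sisters only through contacts at those base points.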

**Figure 9.** Schematic representation of the fitness algorithm for point array matching to viral capsids. Most viruses only have a few point arrays that have a low RMSD. Many ties are due to sister point arrays.

**Figure 10.** Features of the point array fitting algorithm. Here we have docked the gauge points on HBV, a T = 4 virus with dimeric protrusions. We only consider the point arrays with gauge points nearest the most radially distal protruding features, e.g., the nearest gauge points to the protruding dimers are GP: 17, 18 and 19 (yellow circles) and 4 and 5 (blue circles), see Table 2. In this example, there are 3 gauge points that would have fallen through the capsid surface when docked, which were instead stopped at their minimum distance to the surface. We see that most points sit on the surface of the proteins, and some on the symmetry axes in between proteins. Points which are near more than one protein are weighted more in the RMSD, through the protein multiplicity ${p}_{i}$, see Equation (2). High multiplicity always occurs on symmetry axes, e.g., Gauge Point 1 (red) on the 5-fold axis would count 5 times. There are several points which would lead to a high prevalence, i.e., they do not intersect proteins over a large range of radial scaling, including Gauge Point 15 (blue), which is on a 2-fold axis, and Gauge Point 3 (orange). In this example, these points would lead to a poor RMSD fit, as they are each several angstroms away from the surface.

**Figure 11.** A section of the swollen CCMV capsid, radially colored and fitted with $\varphi {\mathrm{ICO}}_{2}$. In our fitness measure we consider all gauge points which are near protrusions. In this example, we see how a point array can have a low RMSD value but not match the protruding features well, and therefore be excluded later. In this case Gauge Point 21 is 10 Å below the most external atoms, and the other gauge points which sit atop the blue sections are the better fit.

**Figure 12.** (**a**) Adenovirus Ad3 Dodecahedron is composed of a penton base (Ad3, T = 1, 4aqq [20]) and is best fitted by ${\mathrm{IDD}}_{5}$ with an RMSD of 3.7 Å and Gauge Point 19. Ad3 is not equally co-fit by the sister point array ${\mathrm{ICO}}_{2}$. The radial histograms of these sister point arrays appear in Figure 8b. It is noteworthy that the entire point array was used in the fitness algorithm, without any radial cutoff. While the gauge points and interior points agree well with the protein surface, there are several points (orange and black) that float a few angstroms off the surface. (**b**) The interior view of the capsid. Ad3 is stabilized by strand-swapping between neighboring pentamers, which occurs across the 2-fold axes between the IDD base (blue points encircled in yellow) of ${\mathrm{IDD}}_{5}$ and the innermost (lowest radial level) points (blue) which sit on the underside of the strand-swapping. (**c**) A side view of one of the pentamer subunits with a section of protein removed to show the interior 5-fold point (red) contained by ${\mathrm{ICO}}_{2}$ only and the interior 2-fold points (blue) contained by ${\mathrm{IDD}}_{5}$ only. These 2-fold points (blue), provided by ${\mathrm{IDD}}_{5}$ only, are situated at the major point of contact between the adjacent pentameric units, above the strand-swapping. As the two sister point arrays are identical except for these points (Figure 8), they are the deciding factors on which RMSD is lower. Here the ${\mathrm{ICO}}_{2}$ point (red) is sitting in an empty pocket, making its RMSD 4.9 Å, which is 1.2 Å higher than the RMSD of 3.7 Å for ${\mathrm{IDD}}_{5}$.

**Figure 13.** A demonstration of how point array fits may be improved using double base point arrays. In (**a**) we see the Hepatitis E virus-like particle (T = 1, 3hag [19]) fitted only by $2\varphi {\mathrm{IDD}}_{2}$ with an RMSD of 3.2 Å and Gauge Point 15. This point array, however, does not have any points on the interior surface of the capsid. In (**b**) we combine $2\varphi {\mathrm{IDD}}_{2}$ with the smaller radius point array ${\varphi}^{2}{\mathrm{IDD}}_{2}$, shown here as magenta points irrespective of their geometric locations, deviating from the color scheme of Table 1. These new points sit perfectly on the surface, add interior surface locations and lower the RMSD to 2.8 Å. In (**c**) we see the interior of our capsid, with ${\varphi}^{2}{\mathrm{IDD}}_{2}$ shown in orange along the 5-2 great circle. The point array fitting of the capsid interior can also be seen in Figure 14.

**Figure 14.** The point array description of the Hepatitis E virus-like particle (T = 1, 3hag [19]), shown in a ribbon view with each protein colored separately. In (**a**) we are viewing down a 2-fold axis; the gauge point is Gauge Point 15 (blue) on the 2-fold axis. We can see the many point array elements nestled in and around the protein surface. In (**b**) we are looking at the same section from the side, showing how the proteins fit neatly between the point array elements. The lowest radial points (orange) were added by ${\varphi}^{2}{\mathrm{IDD}}_{2}$.

**Figure 15.** (**a**) Bacteriophage MS2 (T = 3, 2ms2 [29]), viewed down the 5-fold axis, is fitted best by $\varphi {\mathrm{ICO}}_{3}\sim {\varphi}^{\prime}{\mathrm{DOD}}_{5}$ with an RMSD of 0.7 Å and Gauge Point 2. Equivalent capsid proteins are colored blue, gray and cyan. Though this fit uses only 2 of the 5 radial levels of the point arrays, its agreement with experiments is remarkable, as it is nearly impossible to substitute different amino acids near the gauge points (orange) [23,24]. (**b**) A perspective view of the gauge points sitting right next to the pentamer protrusions, which were found to be nearly immutable, suggesting this geometric location is critical to the stability of MS2. As MS2 is composed of 180 chemically identical proteins, the hexamer protrusions are similarly immutable, at least pre-assembly of the capsid. (**c**) It was shown that the f-g loop, which forms the inner surface of the hexamer hole, was very mutable, which is also in agreement with the point array fit, as the restricted points are several angstroms away (orange). Directly above these points are the $\beta $-sheets.

**Figure 16.** (**a**) A view down the 2-fold axis of Hepatitis B (T = 4, 1qgt [34]), which is composed of 120 homodimers. The AB dimers (A is pink and B is white) form the 12 pentamers (pink), and the CD dimers (C is blue and D is gray) along with the B proteins form the 30 hexamer subunits (blue, gray and white). The capsid is best fitted by the double base point array ${\mathrm{ICO}}_{2}\cup 0.5{\mathrm{ICO}}_{2}$ with an RMSD of 1.3 Å and Gauge Point 19 (purple) located on the AB pentamer dimers. There are also points (orange) located at the internal interface of the CD dimer (the blue proteins are shown semi-translucent for clarity). (**b**) The AB dimer is bounded above (purple) and below (black) by point array elements. (**c**) The CD dimer of the hexamers has an orange point on the line of contact between the two $\alpha $-helices of the CD dimer and rests atop two bulk points (black) on the bottom of each protein of the dimer. The CD dimer is more flexible than the AB dimer [39]. (**d**) Looking out from the interior of the capsid along a 2-fold axis, we see a slightly squashed hexamer (white, gray and blue) centered on a 2-fold axis. Each hexamer is in contact with four interior bulk points (black) and two interior 5-2 GC points (purple). As such, the hexamers do not have local 3-fold symmetry, which is consistent with the point array constraints. For comparison, a squashed hexagon is shown as a solid line (pink) and an ideal hexagon is shown as a dotted line (green). Note that even though this point array was generated from two base icosahedra $(\mathrm{ICO}\cup 0.5\mathrm{ICO})$, all of the 5-fold points have been truncated as they are more than 4 Å below the interior capsid surface.

**Figure 17.** (**a**) A look down the 2-fold axis of CCMV (T = 3, 1cwp [30]), which is best fitted by ${\varphi}^{\prime 2}{\mathrm{ICO}}_{3}$ with an RMSD of 0.7 Å and Gauge Point 5. The pentamers are shown in pink and the hexamers are gray and blue. (**b**) Interior view of the capsid showing the points that coordinate the junction between pentamer and hexamer. (**c**) Outside view of the hexamers. The pentamer proteins only make contact with one point, whereas the hexamer unit makes contact with 3 points: above, below and at their junction.

**Figure 18.** A comparison, using point arrays, of the structural changes of CCMV $(T=3)$ which occur during maturation. Each capsid is radially colored. (**a**) Exterior view of the CCMV native state, showing the prominent role that hexameric features play in determining the point array fit. (**b**) The interior of the CCMV native state. There are point array locations (orange) in contact with the interior surface of the pentamer, as was seen in Figure 17. (**c**) The swollen CCMV capsid is approximately a uniformly scaled native state, though the hexamer chains rotate slightly and the center of the hexamer opens up at the top and closes at the bottom, allowing the sister point array ${\varphi}^{2}{\mathrm{DOD}}_{5}$ to fit the capsid. The overall effect of this reorganization is to create holes in the protein capsid. (**d**) The interior view of the mature capsid, showing how the hexamer unit changes to accommodate a new point (yellow) after maturation.

**Figure 19.** CPMV (pT3, 1ny7 [32]) has five solvent-exposed exterior lysine residues found to be reactive, which are labeled here by the protein domain they occupy [17]. There are two point arrays, ${\varphi}^{\prime}{\mathrm{ICO}}_{5}\cup 2{\varphi}^{\prime 2}{\mathrm{IDD}}_{5}$ and $\varphi {\mathrm{ICO}}_{5}\cup \varphi {\mathrm{DOD}}_{5}$, that fit equally well with an RMSD of 1.5 Å and Gauge Point 1. This agreement is not surprising, as their truncated point arrays are identical except for a single point (purple) along the 5-2 GC. The geometric restrictions of the point arrays are in remarkable agreement with the reactivity data [17], explaining at least 4 of the 5 reactivity sites (Table 5).

**Figure 20.** Point arrays may predict a loss of local symmetry of hexameric subunits for Bacteriophage HK97 Prohead II ($T=7l$, 3e8k [37]), as the hexamer subunits have different geometric restrictions at non-equivalent locations. (**a**) This capsid is best fitted by ${\varphi}^{\prime}{(\mathrm{ICO}\cup \mathrm{IDD})}_{5}$ with an RMSD of 1.8 Å. The pentamer protrudes at Gauge Point 1, as in many T = 7 viruses, and then the surface bows inward to fit beneath the 2nd radial level (purple). (**b**) There are small pockets within the hexamers, where the 3rd radial level along the 5-3 GC sits just below the surface of the hexamer (purple); the atoms above the pocket have been made translucent. (**c**) Inside view: each hexamer is bounded on the inside by the 4th radial level along the 5-3 GC point (orange). The pentamers and hexamers are each bounded by two different points within the AU.

**Table 1.** The color scheme used for point array elements throughout this paper. Each color specifies the type of geometric location, e.g., 5-fold points are red, 2-fold points are blue, and the points on the great circle between the two are purple (Figure 1).

| Location | Color |
|---|---|
| 5-fold | Red |
| 5-3 GC | Orange |
| 3-fold | Yellow |
| 3-2 GC | Green |
| 2-fold | Blue |
| 2-5 GC | Purple |
| Bulk | Black |
| Origin | Teal |

**Table 2.** The 55 admissible point arrays [1,2], grouped by extension vectors ${\overrightarrow{T}}_{5},{\overrightarrow{T}}_{3}$ and ${\overrightarrow{T}}_{2}$, ordered from largest to smallest relative radii. The initial scaling of the base point arrays before affine extension is as shown. The gauge points and their location on the great circles are listed (Figure 1). The number of distinct radial levels, the size of the full point array (cardinality) and the sister point arrays are also given (three arrays do not have sister arrays). While all sister arrays have identical point clouds, the two point arrays which are identical are indicated with †. The 14 point arrays with only a single radial level located on the icosahedral symmetry axes are indicated with an asterisk *. Three point arrays, ${\mathrm{ICO}}_{5},{\mathrm{DOD}}_{3}$ and ${\mathrm{IDD}}_{2}$, have an element at the origin, which is degenerate. Note that ${\varphi}^{\prime}=-1/\varphi $; however, all of the base polyhedra are invariant under multiplication by −1, so we report only the absolute value of the scaling.
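For reading the scalings that appear in the tables, it may help to recall the standard golden-ratio identities behind the relation ${\varphi}^{\prime}=-1/\varphi $ quoted in the caption above. These are general facts about $\varphi $, written out here as a reading aid rather than taken from the paper, and the decimal values are rounded to three places.

$$
\begin{aligned}
\varphi &= \frac{1+\sqrt{5}}{2} \approx 1.618, &
{\varphi}^{2} &= \varphi + 1 \approx 2.618,\\
{\varphi}^{\prime} &= -\frac{1}{\varphi} = \frac{1-\sqrt{5}}{2} = 1-\varphi \approx -0.618, &
{\varphi}^{\prime 2} &= \frac{1}{{\varphi}^{2}} = 2-\varphi \approx 0.382 .
\end{aligned}
$$

Since only the absolute value of the scaling is reported, a factor written as ${\varphi}^{\prime}$ corresponds to a scaling of about 0.618 and ${\varphi}^{\prime 2}$ to about 0.382.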

**Table 3.** Here we present the results of our fitness algorithm for 16 viruses, considering RMSD scores, gauge points (GP), and points of contact (NAU). All fits are decided as in Figure 9. For most viruses, the RMSD of the best fit is at least 0.5 Å lower than that of the next best array. Several viruses are equally well fit by a point array and its sister array. When both sister point arrays are equally valid, e.g., ${\mathrm{ICO}}_{3}$ or ${\mathrm{DOD}}_{5}$ of STMV, both are listed. If two point arrays are tied for best, that is stated explicitly. There are two cryo-EM fit solutions for CCMV (swln1) in the swollen state; we analyzed ccmvswln1 [25]. All structures analyzed in this paper were obtained from the Viper Database [21].

**Best Fit Point Arrays with RMSD Values**

| Name | T | PA | RMSD (Å) | GP | NAU | PDBID |
|---|---|---|---|---|---|---|
| Adenovirus Ad3 Dodecahedron | 1 | ${\mathrm{IDD}}_{5}$ | 3.7 | 19 | 7 | 4aqq [20] |
| Hepatitis E VLP | 1 | $2\varphi {\mathrm{IDD}}_{2}\cup {\varphi}^{2}{\mathrm{IDD}}_{2}$ | 2.8 | 15 | 26 | 3hag [19] |
| Infectious Bursal Virus | 1 | $0.5{\varphi}^{3}{\mathrm{DOD}}_{2}$ | 4.5 | 8 | 9 | 2gsy [18] |
| STMV | 1 | ${\mathrm{ICO}}_{3}\sim {\mathrm{DOD}}_{5}$ | 1.2 | 3 | 3 | 1a34 [26] |
| L-A Virus | 2 | $\varphi {\mathrm{DOD}}_{5}\cup {\mathrm{DOD}}_{5}$ | 1.4 | 4 | 3 | 1m1c [27] |
| Bacteriophage GA | 3 | $\varphi {\mathrm{ICO}}_{3}\sim {\varphi}^{\prime}{\mathrm{DOD}}_{5}$ | 0.2 | 2 | 2 | 1gav [28] |
| Bacteriophage MS2 | 3 | $\varphi {\mathrm{ICO}}_{3}\sim {\varphi}^{\prime}{\mathrm{DOD}}_{5}$ | 0.7 | 2 | 2 | 2ms2 [29] |
| CCMV Native | 3 | ${\varphi}^{\prime 2}{\mathrm{ICO}}_{3}$ | 0.7 | 5 | 3 | 1cwp [30] |
| CCMV Swollen | 3 | ${\varphi}^{2}{\mathrm{DOD}}_{5}$ | 2.7 | 5 | 4 | swln1 [25] |
| Tobacco Necrosis Virus | 3 | ${\varphi}^{\prime}{\mathrm{ICO}}_{5}\cup 2{\varphi}^{\prime 2}{\mathrm{IDD}}_{5}$ | 0.9 | 1 | 5 | 1c8n [31] |
| Cowpea Mosaic Virus (CPMV) | $pT3$ | ${\varphi}^{\prime}{\mathrm{ICO}}_{5}\cup 2{\varphi}^{\prime 2}{\mathrm{IDD}}_{5}$ | 1.5 | 1 | 5 | 1ny7 [32] |
| Helicoverpa (HASV) | 4 | ${\mathrm{IDD}}_{5}\cup {\varphi}^{\prime}{\mathrm{ICO}}_{5}$ | 1.1 | 19 | 6 | 3s6p [33] |
| Hepatitis B | 4 | ${\mathrm{ICO}}_{2}\cup 0.5{\mathrm{ICO}}_{2}$ | 1.3 | 19 | 5 | 1qgt [34] |
| Nudaurelia Capensis $\omega $ Virus | 4 | ${\mathrm{DOD}}_{5}\cup {\varphi}^{\prime}{\mathrm{ICO}}_{5}$ | 1.5 | 3 | 5 | 1ohf [35] |
| Bacteriophage P22 Mature | $7l$ | $\varphi {\mathrm{ICO}}_{5}\cup \varphi {\mathrm{DOD}}_{5}$ | 0.8 | 1 | 3 | 5uu5 [36] |
| HK97 Prohead II | $7l$ | ${\varphi}^{\prime}{\mathrm{ICO}}_{5}\cup {\varphi}^{\prime}{\mathrm{IDD}}_{5}$ | 1.8 | 1 | 4 | 3e8k [37] |

**Table 4.** At first glance, there appear to be several point arrays tied for the best fit of the swollen CCMV particle; however, three of these were excluded based on poor agreement with the gauge points (Figure 11).

**CCMV Swollen ($T=3$)**

| PA | RMSD (Å) | GP | NAU | Notes |
|---|---|---|---|---|
| ${\varphi}^{\prime 2}{\mathrm{ICO}}_{3}$ | 1.9 | 5 | 3 | |
| $\varphi {\mathrm{ICO}}_{2}\cup 0.5{\varphi}^{2}{\mathrm{ICO}}_{2}$ | 2.2 | 21 | 6 | excluded |
| $\varphi {\mathrm{ICO}}_{2}\cup 2{\varphi}^{\prime}{\mathrm{IDD}}_{2}$ | 2.3 | 21 | 7 | excluded |
| $\varphi {\mathrm{ICO}}_{2}\cup \varphi {\mathrm{IDD}}_{2}$ | 2.6 | 21 | 8 | excluded |
| ${\varphi}^{2}{\mathrm{DOD}}_{5}$ | 2.7 | 5 | 4 | |
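The first criterion of Section 2.5.5 (a point array is preferred on RMSD alone only if its score is lower by 0.5 Å or more) can be illustrated on the Table 4 values. The following is a minimal sketch, not the paper's implementation: the function name `rmsd_ties`, the plain-text labels, and the data layout are hypothetical. Only the RMSD values, gauge points, and "excluded" notes are copied from the table.

```python
# Illustrative sketch only; names and data layout are assumptions.
# RMSD values (Å), gauge points and exclusion notes are copied from Table 4.
CANDIDATES = [
    # (point array label, RMSD in Å, gauge point, excluded per the Notes column)
    ("phi'^2 ICO_3", 1.9, 5, False),
    ("phi ICO_2 U 0.5 phi^2 ICO_2", 2.2, 21, True),
    ("phi ICO_2 U 2 phi' IDD_2", 2.3, 21, True),
    ("phi ICO_2 U phi IDD_2", 2.6, 21, True),
    ("phi^2 DOD_5", 2.7, 5, False),
]

MARGIN_TENTHS = 5  # the 0.5 Å margin, expressed in tenths of an ångström


def rmsd_ties(candidates):
    """Return labels whose RMSD is less than 0.5 Å above the lowest RMSD.

    RMSD values are converted to integer tenths so that the comparison
    is not affected by floating-point rounding.
    """
    tenths = [(label, round(rmsd * 10)) for label, rmsd, _, _ in candidates]
    best = min(value for _, value in tenths)
    return [label for label, value in tenths if value - best < MARGIN_TENTHS]


# Arrays that RMSD alone cannot separate from the lowest-scoring one.
tied = rmsd_ties(CANDIDATES)

# Arrays that survive the gauge-point exclusion recorded in the Notes column.
kept = [label for label, _, _, excluded in CANDIDATES if not excluded]

print("Not separable by RMSD alone:", tied)
print("Not excluded by gauge points:", kept)
```

On these values, two of the excluded arrays lie within 0.5 Å of the lowest RMSD, so RMSD alone does not settle the fit and the gauge-point agreement has to be consulted. The sketch does not model the gauge-point check itself; it only replays the exclusions listed in the table.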

**Table 5.** A comparison of the reactivity of the 5 solvent-exposed lysine residues of CPMV [17] and the geometric restrictions imposed by the point arrays. An X indicates that the point array is unlikely to permit changes at this location and a + indicates no specific geometric restriction. We also indicate our naive expectation of capsid reactivity based on good solvent accessibility (+) and lack of steric hindrance (−). Both point arrays indicate that A LYS 82 and B LYS 199 should have low reactivity, as there is a point array element nearly atop each, implying that the site is a key structural feature, similar to the loop protrusions of MS2. Both point arrays also agree that C LYS 99 should have no restrictions on reactivity. While ${\varphi}^{\prime}{\mathrm{ICO}}_{5}\cup 2{\varphi}^{\prime 2}{\mathrm{IDD}}_{5}$ is the best fit point array, it indicates that the most reactive site, A LYS 38, should not be reactive; however, there is a loophole here, as $\varphi {\mathrm{ICO}}_{5}\cup \varphi {\mathrm{DOD}}_{5}$ has the same truncated points except for this single geometric restriction, so CPMV can be reactive here because this location is not critical to the stability of the other point array. The reactive site A LYS 38 is somewhat paradoxical, as it is slightly hidden and less solvent exposed, yet very reactive. Lastly, neither point array implies any restrictions for C LYS 34, although it has low reactivity despite good solvent accessibility and low steric hindrance. It is possible that including the genomic material in the point array fitness would clarify this case [17].

**CPMV (pT = 3) LYS Reactivity Comparison**

| Residue | Reactivity | ${\varphi}^{\prime}{\mathrm{ICO}}_{5}\cup 2{\varphi}^{\prime 2}{\mathrm{IDD}}_{5}$ or $\varphi {\mathrm{ICO}}_{5}\cup \varphi {\mathrm{DOD}}_{5}$ | Accessibility | Naive Prediction |
|---|---|---|---|---|
| A LYS 82 | Low | X/X | Solvent: +, Sterics: − | Good |
| A LYS 38 | Highest | X/+ | Solvent: −, Sterics: + | Poor |
| B LYS 199 | Low | X/X | Solvent: +, Sterics: − | Good |
| C LYS 34 | Low | +/+ | Solvent: +, Sterics: − | Good |
| C LYS 99 | Second | +/+ | Solvent: −, Sterics: + | Poor |

© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Wilson, D.P.
Unveiling the Hidden Rules of Spherical Viruses Using Point Arrays. *Viruses* **2020**, *12*, 467.
https://doi.org/10.3390/v12040467
