Adenovirus Structure: What Is New?

Adenoviruses are large (~950 Å) and complex non-enveloped, dsDNA icosahedral viruses. They have a pseudo-T = 25 triangulation number with at least 12 different proteins composing the virion. These include the major and minor capsid proteins, core proteins, maturation protease, terminal protein, and packaging machinery. Although adenoviruses have been studied for more than 60 years, deciphering their architecture has presented a challenge for structural biology techniques. An outstanding event was the first near-atomic resolution structure of human adenovirus type 5 (HAdV-C5), solved by cryo-electron microscopy (cryo-EM) in 2010. Discovery of new adenovirus types, together with methodological advances in structural biology techniques, in particular cryo-EM, has lately produced a considerable amount of new, high-resolution data on the organization of adenoviruses belonging to different species. In spite of these advances, the organization of the non-icosahedral core is still largely unknown. Nevertheless, alternative techniques such as atomic force microscopy (AFM) are providing interesting glimpses into the role of the core proteins in genome condensation and virion stability. Here we summarize the current knowledge on adenovirus structure, with an emphasis on high-resolution structures obtained since 2010.


Introduction
Adenoviruses were discovered close to 70 years ago [1], and since then they have been found in all types of vertebrates [2]. As basic science tools, they have provided fundamental insights into biological functions such as splicing [3]. Currently, they are best known for their potential as therapeutic vectors, culminating in their use as SARS-CoV-2 vaccine vehicles during the COVID-19 pandemic [4-6]. Adenoviruses soon became an object of interest for structural biologists: they were among the first viruses to be imaged by electron microscopy (EM); their major coat protein was one of the first animal virus proteins to be crystallized; and they were used to demonstrate the possibility of imaging frozen-hydrated biological material in the electron microscope, in the early days of cryo-EM. However, although the general capsid organization was soon unveiled, reaching the finest details of adenovirus architecture required many years of studies, and could only be attained when cryo-EM realized its potential to provide near-atomic resolution structural data [7]. The human adenovirus type 5 (HAdV-C5) virion was by then the largest biological object ever solved at high resolution by any structural biology technique. In fact, a few more years of analyses were required to reconcile the virion models based on cryo-EM and X-ray crystallography data [8]. These historical aspects have previously been reviewed [9]. Here, we summarize the knowledge available after publication of the first high-resolution map [7], and discuss new adenovirus-related structures reported since 2010. These include fiber heads and complete virions from previously uncharacterized species and genera, providing new insights into the structural diversity and receptor binding modes within the Adenoviridae family (Table 1).

Components and Organization of the Adenovirus Virion
The International Committee on Taxonomy of Viruses (ICTV) currently recognizes 86 different adenovirus species, grouped in six genera (https://talk.ictvonline.org/ictv-reports/ictv_9th_report/dsdna-viruses-2011/w/dsdna_viruses/93/adenoviridae, accessed on 13 May 2021). Their linear dsDNA genome varies in length between 26 kbp in the frog adenovirus (FrAdV-1, a Siadenovirus) and 48 kbp in the only fish adenovirus isolated so far (WSAdV-1, an Ichtadenovirus) [33,34]. This genome is packed inside a pseudo-T = 25 icosahedral capsid with a diameter of approximately 95 nm vertex to vertex (Figure 1a). The major coat protein, hexon, forms the icosahedron facets. Capsomers at the vertices are formed by penton base and fibers. These two proteins are key players in the initial stages of infection, as they are in charge of cell receptor interaction. A series of minor coat proteins help to assemble and maintain the shell, and have been termed "glue" or "cementing" proteins. In the Mastadenovirus genus, which includes the human adenoviruses, minor coat proteins IIIa, VI, and VIII are located on the inner capsid surface, and protein IX on the outside (Figure 1a,b). The set of minor coat proteins and their organization changes between genera and species. Three of them (IIIa, VI, and VIII) are conserved throughout the Adenoviridae family and therefore would be expected to play crucial roles during assembly [35]. For example, protein VI is a key factor for AdV entry into the host cell (see Sections 3.5 and 4.4). The capsid geometry can be represented as a set of two different kinds of tiles. A flat tile corresponding to most of the facet is formed by the nine central hexon capsomers, and is often referred to as "group-of-nine" (GON). The other kind of tile is formed by the penton and its five surrounding hexon capsomers (peripentonal hexons). This tile has been termed "group-of-six" (GOS) [7,9]. The icosahedral asymmetric unit is composed of four hexon trimers, one penton base monomer, one protein IIIa, two copies of VIII, and four copies of protein IX (Figure 1b).
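As a quick consistency check on the asymmetric unit composition given above, the following minimal Python sketch multiplies each entry by the 60 asymmetric units of an icosahedron to obtain whole-capsid copy numbers. The dictionary keys and function name are illustrative choices, not terms from the cited works; fibers and protein VI are left out because the text does not list them as icosahedrally ordered members of the asymmetric unit.

```python
# Copies of each protein in one icosahedral asymmetric unit (AU), as listed in the text.
# Hexon is counted in trimers; all other entries are monomers.
ASYMMETRIC_UNIT = {
    "hexon_trimer": 4,
    "penton_base": 1,
    "IIIa": 1,
    "VIII": 2,
    "IX": 4,
}

AUS_PER_CAPSID = 60  # an icosahedron contains 60 asymmetric units
VERTICES = 12        # an icosahedron has 12 five-fold vertices


def capsid_copies(au=ASYMMETRIC_UNIT, n_au=AUS_PER_CAPSID):
    """Return the number of copies of each AU component in the whole capsid."""
    return {name: count * n_au for name, count in au.items()}


copies = capsid_copies()
hexon_monomers = copies["hexon_trimer"] * 3       # three monomers per hexon trimer
penton_pentamers = copies["penton_base"] // 5     # five monomers per penton base
iiia_per_vertex = copies["IIIa"] // VERTICES      # IIIa copies beneath each vertex

for name, n in copies.items():
    print(f"{name}: {n}")
print("hexon monomers:", hexon_monomers)
print("penton base pentamers:", penton_pentamers)
print("IIIa per vertex:", iiia_per_vertex)
```

The totals agree with figures quoted elsewhere in the text: 240 hexon trimers (720 monomers), one penton base pentamer per vertex, and five copies of IIIa beneath each vertex.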
Figure 1. (a) [...] [28]. White symbols indicate the position of the icosahedral symmetry axes. The white rectangle highlights one icosahedral asymmetric unit. The fibers are not represented here, as they cannot be traced in studies using icosahedral symmetry. Structure rendering with ChimeraX [36]. (b) Zoom in on the asymmetric unit and its closest neighbors. Two views are provided: as seen from outside the capsid (left) and from inside (right). The four unique hexon trimers are labeled H1 to H4, and the different proteins are colored according to the color key at the left. One of the hexons is further magnified to show the pVIN and pVIIN2 peptides. Additionally highlighted is the GOS (penton plus peripentonal hexons), depicted as seen from inside the capsid. Structures rendered with Chimera [37].
Inside the capsid, a large number of virus-encoded DNA-binding proteins accompany the adenovirus genome, forming a non-icosahedral core. Some of them (the core proteins) are considered DNA-condensing agents due to their positive charge [38,39]. Others are involved in genome replication (terminal protein), genome packaging (IVa2), or maturation (adenovirus protease, AVP) [40-42].

Hexon
Hexon is the main building block of the adenovirus protein shell, accounting for approximately 60% of the total virion mass. In the capsid there are 720 hexon monomers, organized in 240 trimers, with 12 trimers per facet (Figure 1a,b). Hexon is a large polypeptide, more than 900 amino acids long in all known adenovirus types. The monomer folds as two eight-stranded β barrels, or jellyroll domains, held apart by a small β-sheet [43,44]. The double jelly rolls form the pseudo-hexagonal base of the trimeric hexon capsomer. Long loops intercalated between the β-strands form the hexon towers, and contain the hypervariable regions (HVRs) (Figure 2a) [45]. The N-termini and C-termini adopt different conformations depending on their location in the capsid, to establish interactions between hexons and minor coat proteins IIIa and VIII [7]. The adenovirus capsid is described as pseudo-T = 25 because of the oligomeric arrangement of the hexon. Since hexons are trimers and not hexamers, the icosahedral asymmetric unit is composed of 4 × 3 (hexon molecules) + 1 (penton molecule) = 13 independent polypeptides, instead of the 25 predicted by the Caspar and Klug quasi-equivalence theory [9,46].
Figure 2. [...] (b) Penton base pentamer. The location of the untraced RGD loop in one of the monomers is indicated. In (a,b), the inner side of the particle would be at the bottom. (c) Localized reconstruction, without symmetry enforcement, of the HAdV-D26 fiber bound to the penton base. The view is from outside the capsid. An atomic model of the knob [19] is fitted into the density. Notice that the knob density appears clearly trimeric. The schematic diagram illustrates how the three-fold symmetric fiber (triangle) is shifted relative to the center (red dot) of the five-fold symmetric penton base (pentagon). Adapted from [47]. (d) Comparison between the LAdV-2 (yellow) and HAdV-C5 (gray) protein IIIa structures. Notice that the GOS-glue domains and part of the connecting helix overlap, but the VIII-binding domain in the LAdV-2 protein swings away from its position in the human virus. The black pentagon indicates the position of the 5-fold symmetry axis. Modified from [31]. (e) Schematics showing the organization of protein IX in HAdV-C5, BAdV-3, and HAdV-F41, and LH3 in LAdV-2. Molecules forming triskelions located at the center of the facet (I3 symmetry axis) are in cyan, and those located at the L3 axes in several shades of pink. The rope domains in HAdV-C5, and the rope and C-terminal domains in HAdV-F41, are depicted as dashed lines, indicating non-modeled residues. Monomers of protein IX/LH3 in the asymmetric unit are depicted on top of each schematic, with the N- and C-termini indicated. For HAdV-C5 and HAdV-F41, the four monomers in the AU are overlapped according to their triskelion region, to highlight the different conformations of the rope domain. Adapted from [30].
The hexon architecture is highly conserved throughout the Adenoviridae family. The main differences reside in the HVRs, which present a varying degree of flexibility in the different virus types. In the HAdV-C5 cryo-EM model, only four of the seven loops at the top of each hexon monomer could be traced [28]. In contrast, a recent study of the enteric HAdV-F41 solved all the loops except HVR4 [30]. In human adenoviruses of species C, HVR1 presents a unique, 32 residue-long acidic loop that confers a large negative charge to the outer capsid surface [48]. In HAdV-F41 and HAdV-D26, HVR1 is shorter than in HAdV-C5 and could be fully traced in cryo-EM maps [29,30]. The acidic HVR1 in HAdV-C5 seems to be involved in electrostatic interactions with neutralizing defensins, and its absence in species D and F may play a role in determining the enteric or ocular tropism of these viruses, although this aspect is not yet well understood [30,49]. In lizard adenovirus type 2 (LAdV-2), an Atadenovirus, the loops are shorter and could be modeled without gaps. It has been proposed that simpler loops could correlate with lower evolutionary pressure induced by the immune system in reptiles [31]. In HAdV-C5, the valley formed by the three hexon towers is involved in interactions with coagulation factors [50]. Most recently, this region has also been shown to bind the cell receptor CD46, revealing a new entry mechanism for a large group of viruses in HAdV species D [12].
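The pseudo-T = 25 bookkeeping described earlier in this section can be checked with a short Python sketch. It uses the standard Caspar and Klug relation T = h² + hk + k²; the lattice indices h = 5, k = 0 are not stated in the text and are assumed here as the pair that yields T = 25. Variable names are illustrative.

```python
def triangulation_number(h, k):
    """Caspar-Klug triangulation number for lattice indices (h, k)."""
    return h * h + h * k + k * k


# Assumption: h = 5, k = 0 is the index pair giving T = 25.
T = triangulation_number(5, 0)

# In an ideal quasi-equivalent capsid, each asymmetric unit holds T chemically
# identical subunits, and the capsid has 10T + 2 capsomers (12 of them pentamers).
ideal_subunits_per_au = T
capsomers = 10 * T + 2
hexavalent_capsomers = capsomers - 12

# In adenovirus, each hexavalent position is occupied by a hexon trimer rather than
# a hexamer, so the asymmetric unit holds 4 trimers plus 1 penton base monomer.
adenovirus_polypeptides_per_au = 4 * 3 + 1

print("T =", T)
print("capsomers:", capsomers, "of which hexavalent:", hexavalent_capsomers)
print("polypeptides per AU: ideal", ideal_subunits_per_au,
      "vs adenovirus", adenovirus_polypeptides_per_au)
```

The capsomer count (252, of which 240 are hexavalent) matches the 240 hexon trimers plus 12 pentons of the virion, while the polypeptide count per asymmetric unit (13 rather than 25) is what motivates the "pseudo" qualifier.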

Penton Base
Pentamers of penton base fill the gap left by the five peripentonal hexons. The penton base protein folds into two domains: a single jellyroll and an upper insertion facing the solvent-exposed exterior (Figure 2b) [44,51]. This upper domain contains the hypervariable loop, which, owing to its flexibility, has not been traced in any of the available HAdV structures [7,27,29,30,51]. In most human adenoviruses, the hypervariable loop bears the RGD sequence, an αv integrin-binding motif. Enteric HAdV-F40 and HAdV-F41 lack the RGD motif, having instead RGAD and IGDD [52]. Nevertheless, it has been recently shown that HAdV-F41 binds laminin-binding integrins [53]. Since the RGD loop is also involved in neutralization by the enteric defensin HD5, it has been proposed that lack of both this sequence motif and the acidic HVR1 in hexon may play a role in facilitating infection of intestinal cells by HAdV-F40 and HAdV-F41 [30,49]. Another surface-exposed variable loop presents sequence divergence and different conformations in the human adenoviruses [7,29,30,51]. This loop, whose role is unknown, has been proposed as a site to be engineered for gene therapy, along with the hypervariable loop [51,54]. In human adenoviruses, a long (~50 residues) N-terminal arm extends away from the main body of the protein towards the interior of the virion. Only those arm residues closest to the jelly roll domain (residues 37-51 in HAdV-C5) are ordered, and interact with two monomers of the inner coat protein IIIa [7,29]. The rest are disordered, and seemingly plunge into the non-icosahedrally ordered core [7]. This disordered part is absent in Atadenoviruses, which have a shorter penton base protein and also lack the mobile, variable loops on the outer surface, as observed for the hexon HVRs [31].
Comparison of a high-resolution structure of recombinant HAdV-F41 penton base with the same protein in the context of the virion has shown that regions of the protein involved in interactions with the peripentonal hexons, fiber, and protein IIIa are disordered in solution, but become ordered upon capsid assembly [27]. Virus-like particles can be formed by penton base pentamers of certain HAdV species assembling in dodecahedra, with uses in receptor identification, gene delivery, and vaccine development [55].

Fibers
Trimeric fibers are attached to the outer surface of the penton base pentamer, forming a non-covalent complex. Each fiber is composed of three domains: head, shaft, and N-terminal tail. The C-terminal globular head (also named knob) folds as an anti-parallel β-sandwich and is responsible for fiber trimerization and attachment to the receptor at the host cell membrane. Fibers can bind receptors in a non-stoichiometric way, as exemplified by a recent cryo-EM study on the HAdV-B3 fiber head bound to desmoglein-2 [20], where the trimeric head was found to bind either one or two copies of the receptor, but not three. The structural aspects of receptor binding by human adenovirus fiber heads have recently been reviewed [56]. Recent years have also provided extensive information on the structures of non-human adenovirus fiber heads, including those for genera not previously analyzed, such as Siadenoviruses and Atadenoviruses (Table 1). Although the general β-sandwich architecture is conserved, these structures show variability in the loops connecting the β-strands. The loops are very short in Atadenoviruses, producing the smallest fiber head known so far [17]. The Siadenovirus fiber heads are more similar to those found in reovirus than to other adenoviruses, and the monomer has a unique β-hairpin that embraces the neighboring subunit [14,44].
The central domain of the fiber protein folds as a trimeric β-spiral and forms a shaft of variable length, depending on the virus type [57,58]. Shaft length and flexibility also play a role in AdV entry into the cell, by facilitating virion interaction with both the primary receptor via fiber and integrins via penton base [59]. Recombinant fibers of a minimum shaft length can assemble into stable trimers in the absence of the head [60]. Finally, the extended N-terminal tails, with a FNPVYPY sequence conserved in all human adenoviruses, bind at the cleft formed by two adjacent penton base monomers to attach the fiber to the rest of the capsid. Details on the architecture of the proximal part of the fiber and its attachment to the capsid have been obscured by their lack of compliance with the icosahedral symmetry that cryo-EM studies usually exploit to reach high resolution. The 330 Å long HAdV-C5 fiber is flexible, bending near the surface of the penton base and therefore being blurred out when projections of thousands of particles are averaged. In the cryo-EM structure of HAdV-C5, only density for the lower part of the fiber shaft was observed, suggesting its interaction with a hydrophobic ring at the center of the penton base pentamer [61]. The shorter fibers in species B (130 Å) and D (150 Å) are more rigid, facilitating the visualization of the complete fiber in cryo-EM maps of HAdV-D26 and HAdV-C5 pseudotyped with the HAdV-B35 fiber [29,62]. However, due to the symmetry mismatch between the trimeric fiber and the imposed icosahedral symmetry, in all cases the fiber density displayed an artefactual 5-fold symmetry, with apparently five N-terminal fiber tails bound to the penton surface, when only three binding sites should be occupied [29,61,62]. Fortunately, recent advances in cryo-EM image processing have started providing the means to extricate non-icosahedral details from icosahedral virus capsids [63]. Although at moderate resolution (~7 Å), application of these methods to an adenovirus has shown the disposition of the three N-terminal fiber tails bound to the HAdV-D26 penton base, and the clearly trimeric fiber head (Figure 2c). It has also revealed that the HAdV-D26 fiber shaft is slightly tilted with respect to the penton base [47]. Cryo-EM studies also show how the fiber N-terminal peptides extend further than previously observed by X-ray crystallography [51], wrapping around the RGD loop in the penton base [47,61].
Enteric human adenoviruses HAdV-F40 and HAdV-F41 have two fiber genes of different length, but only incorporate one fiber per vertex, either a long or a short one [64]. All known Aviadenoviruses have two fibers per vertex [65]. Exceptionally, in the lizard Atadenovirus LAdV-2 two fiber genes were found, assembled as either one short or three long fiber projections per vertex [66]. It is to be hoped that the new cryo-EM methods will also provide information on these more complex architectures in a not so distant future.

Protein IIIa
Witness to the complexity of the adenovirus capsid, and the challenges it has posed for structural biology, are the numerous changes in the position assigned to protein IIIa (reviewed in [9]). This protein went from spanning the capsid shell at the edges near the 2-fold icosahedral symmetry [67], to occupying the inner vertex region [68]. When near atomic resolution was achieved, a crystallographic study returned IIIa to an external position at the edges [69], while the cryo-EM analysis kept it located underneath the penton [7]. As more structural data have become available, the internal position of protein IIIa has been confirmed and is no longer a source of debate [8,28,29].
There are five copies of IIIa underneath each vertex (Figure 1b). Protein IIIa has 585 amino acids in HAdV-C5, but only residues 7 to 300 have been traced in the high-resolution map [28]. The rest of the polypeptide chain is not icosahedrally ordered and remains undefined. The traced part of IIIa has a predominantly α-helical fold and consists of two globular domains connected by a long α-helix. The N-terminal domain was termed the GOS-glue domain, as it connects the penton and peripentonal hexons, keeping the structure of the GOS together (Figure 1b) [7]. In the human adenovirus structures, the C-terminal domain binds protein VIII, which helps join the GOS to the GON hexons.
Although protein IIIa is encoded by one of the core genes conserved throughout the Adenoviridae family [35], recently solved structures indicate that there is conformational variability in this protein associated with the different species and genera. The variability is subtle among the human adenoviruses: in HAdV-D26, an extra domain of protein IIIa was found to be ordered, formed by amino acids 314 to 390 [29]. In HAdV-F41, the helix connecting the GOS-glue and VIII-interacting domains is slightly kinked, and it has been proposed that this kink facilitates additional contacts between IIIa and the N-terminal arm of penton base, stabilizing the penton of the enteric virus [30]. A much more drastic conformational change has been found in the first high-resolution structure of an adenovirus not belonging to the Mastadenovirus genus. In the Atadenovirus LAdV-2, the C-terminal domain of IIIa is rotated around the axis of the connecting helix by more than 200 degrees with respect to its counterpart in HAdV-C5 (Figure 2d). Although the domain fold is very similar to the human adenovirus proteins, this large movement completely changes its interactions with the surrounding molecules, removing contacts with protein VIII beneath the GOS. This large change in Atadenovirus protein IIIa seems to be induced by the presence of an unidentified genus-specific peptide beneath the vertex [31].

Protein VI
Polypeptide VI (250 residues in HAdV-C5) is cleaved at two positions (after residues 33 and 239 in HAdV-C5) by the adenovirus maturation protease AVP [41]. The C-terminal peptide pVIC acts as a cofactor required for AVP activation, in a remarkable one-dimensional chemistry mechanism: pVIC slides on the dsDNA molecule and binds covalently to AVP, which then uses the viral genome as a track to reach all its substrates in the core and the inner capsid surface [41]. Beyond the N-terminal cleavage, a region containing a predicted amphipathic α-helix (residues 34-54) interacts with lipid bilayers altering their curvature, therefore conferring membrane-lytic activity to protein VI. This activity is required for the virion to escape the endosome during entry (reviewed in [70]).
High-resolution structural data on protein VI are scarce. Weak density inside a hexon cavity opening towards the virus core has been assigned to the cleaved N-terminal peptide (pVIN, residues 5-33) in HAdV-C5 [28] and HAdV-D26 [29] (Figure 1b). While in HAdV-D26 the pVIN peptide was traced in such a way that the cleavage site was located at the rim of the hexon cavity, accessible to the protease as it slides on the dsDNA, in HAdV-C5 the chain was traced in the opposite direction, in such a way that the cleavage site is hidden inside the hexon cavity and oriented away from the core. However, the available evidence indicates that the correct direction for pVIN is the one modeled in HAdV-C5 [28]. First, the HAdV-C5 model is based on a map with higher resolution than that of HAdV-D26, facilitating the identification of landmark sidechains; second, in a variant where protein VI is not cleaved at the N-terminal site, pVIN as traced in [28] is connected with extra density attributable to the amphipathic, membrane-lytic peptide [71].
In HAdV-C5, residues 109-143 of protein VI were also modeled, closing the cavity of one of the four hexons in the asymmetric unit (Figure 1b) [28]. The fact that density for all the traced pVI fragments is weak, and found only near a few of the hexons, indicates that the protein is not icosahedrally ordered, and that it does not fill all its possible binding sites in hexon. Partial occupancy is expected, as there are 720 hexon monomers in the capsid and approximately 360 copies of protein VI [72], which are too few for a 1:1 pVI:hexon stoichiometry, but too many to have one copy of VI per hexon trimer. It has recently been found that both the unusual pVI:hexon stoichiometry, and the odd location of the pVIN cleavage site in its recessed position inside the hexon cavity, can be understood by considering an unexpected interplay between protein VI and core protein VII during assembly (see Section 4.4) [71].
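The stoichiometry argument above reduces to two ratios, sketched here in Python. The copy number of protein VI is the approximate value quoted in the text [72]; the variable names are illustrative.

```python
from fractions import Fraction

HEXON_MONOMERS = 720                  # from the text
HEXON_TRIMERS = HEXON_MONOMERS // 3   # 240 trimers
PROTEIN_VI_COPIES = 360               # approximate value quoted in the text [72]

# Average number of protein VI copies available per hexon monomer and per trimer.
vi_per_monomer = Fraction(PROTEIN_VI_COPIES, HEXON_MONOMERS)
vi_per_trimer = Fraction(PROTEIN_VI_COPIES, HEXON_TRIMERS)

print("VI per hexon monomer:", vi_per_monomer)
print("VI per hexon trimer:", vi_per_trimer)
```

With roughly one copy of VI for every two hexon monomers (three copies for every two trimers), neither a 1:1 pVI:hexon monomer arrangement nor one copy per trimer can be satisfied, which is consistent with the partial occupancy seen in the maps.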

Protein VIII
Protein VIII (227 residues in HAdV-C5) is cleaved by the maturation protease at three sites [41]. The two larger fragments (residues 2-112 and 157-227) stay together inside the capsid, stabilizing hexon unions on the inner surface of the icosahedral shell. There are two independent copies of protein VIII per asymmetric unit. One of them is wedged between protein IIIa and the peripentonal hexons, collaborating in the GOS-GON union. The second copy is located beneath the GON (Figure 1b). Each copy of protein VIII interacts with four hexon trimers. Some of these interactions consist of a so-called β-sheet augmentation, in which a β-strand in protein VIII is incorporated into one of the jelly rolls in the neighboring hexon trimer [7,31].
The two larger products of protein VIII maturation have been modeled in all available high-resolution structures of adenovirus virions [7,27-31]. The excised central peptides have lower sequence conservation, vary in size among different adenoviruses, and do not seem to follow the icosahedral symmetry. They may play a role in stabilization of the Atadenovirus and enteric human adenovirus capsids, but this proposal was based on poorly defined densities in which no sequence could be unequivocally assigned [30,31]. Additional evidence is required to assess this point.

Protein IX
Protein IX (140 residues in HAdV-C5) is the only minor coat protein located on the outer capsid surface. This protein adopts an elongated conformation composed of three domains: N-terminal, linker (or rope), and C-terminal. The N-terminal domains of three IX molecules associate in a triskelion-shaped feature located in the valleys between hexons at the icosahedral 3-fold axis (I3) and at the local 3-fold symmetry axis formed by hexons 2, 3, and 4 in the asymmetric unit (L3) (Figure 1a,b and Figure 2e). The rope domain is highly flexible, and crawls around the hexons on the capsid surface, forming a sort of hairnet, until it reaches the facet edge [73]. At the facet edges, the C-terminal domains of three IX molecules join a fourth one coming from the neighboring facet, to form a coiled coil with three parallel and one anti-parallel α-helices, in the case of HAdV-C5 and HAdV-D26 [7,28,29]. There are four triskelions, but only three helix bundles, per icosahedron facet (Figure 2e). Protein IX is dispensable for AdV assembly, but IX-deletion mutants assemble viral particles with low thermostability. Importantly, the triskelion domain is enough to confer capsid thermostability [74].
In the non-human Mastadenoviruses canine (CAdV-1) [75], bovine (BAdV-3) [32], and bat (BtAdV-250A) [76], the rope domain is shorter, and the C-terminal domains of IX form coiled coils with only three parallel α-helices located directly on top of their N-terminal triskelion at both the I3 and L3 axes, protruding in a radial orientation between the towers of the hexons [77]. In these viruses, there are four triskelions and four helix bundles per icosahedron facet.
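The two protein IX arrangements described so far can be checked against the same total copy number with a small Python sketch. It assumes that every IX molecule contributes one N-terminal domain to a triskelion and one C-terminal helix to a bundle, and that the helix bundles at facet edges are apportioned as three per facet, as stated in the text. The dictionary layout and function names are illustrative.

```python
FACETS = 20                  # faces of an icosahedron
TRISKELIONS_PER_FACET = 4    # one at the I3 axis plus three at L3 axes
IX_PER_TRISKELION = 3

# C-terminal helix bundle organization per facet, as described in the text.
ARRANGEMENTS = {
    "HAdV-C5 / HAdV-D26": {"bundles_per_facet": 3, "helices_per_bundle": 4},
    "non-human Mastadenoviruses": {"bundles_per_facet": 4, "helices_per_bundle": 3},
}


def ix_copies_from_triskelions():
    """Total IX copies counted through their N-terminal triskelion domains."""
    return FACETS * TRISKELIONS_PER_FACET * IX_PER_TRISKELION


def ix_copies_from_bundles(arrangement):
    """Total IX copies counted through their C-terminal helices."""
    return FACETS * arrangement["bundles_per_facet"] * arrangement["helices_per_bundle"]


n_terminal_total = ix_copies_from_triskelions()
c_terminal_totals = {name: ix_copies_from_bundles(a) for name, a in ARRANGEMENTS.items()}

print("IX copies from triskelions:", n_terminal_total)
for name, total in c_terminal_totals.items():
    print(f"IX copies from helix bundles ({name}):", total)
```

Both bundle arrangements account for all twelve IX molecules per facet, so the two architectures differ in where the C-terminal helices meet rather than in how many copies of IX are present.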
Surprisingly, in the enteric HAdV-F41 capsid there is no density corresponding to the C-terminal 4-helix bundle at the icosahedron edges as in HAdV-C5 and HAdV-D26 [27,30]. Blurry density on top of the triskelions at the L3 axis suggests a conformation similar to the non-human Mastadenoviruses, but there is not any density, even weak, which could account for a helix bundle on top of the I3 triskelion. Therefore, the arrangement of the C-terminal domain of HAdV-F41 protein IX does not seem to follow any of the architectures previously observed in human or non-human adenoviruses. A model has been proposed where protein IX in HAdV-F41 would form four triskelions and three mobile helix bundles, located on top of the L3 triskelions, per facet (Figure 2e) [30].
Protein IX is unique to Mastadenoviruses, but the Atadenovirus-specific protein LH3 plays a similar architectural role in the capsid [78]. LH3 forms prominent knobs protruding over the towers of hexons at the I3 and L3 axes. Notably, these knobs have a trimeric β-helix fold typical of receptor-binding bacteriophage tailspikes [26] (Figure 2e). The LH3 trimer is highly stable and has extensive contacts with the surrounding hexons, likely contributing to the high stability of the Atadenovirus capsids [26,31]. The C-termini of LH3, of IX in non-human Mastadenoviruses, and presumably of IX in HAdV-F41 sit above the hexon towers, where they are more exposed than the C-terminus of IX in HAdV-C5. This suggests that they may be advantageous sites for exogenous peptide fusion in vector design [30,31,77].
Beneath the knobs, LH3 contacts the capsid surface via a triskelion arrangement which is structurally identical to that of protein IX, indicating a capsid-binding element conserved between Atadenoviruses and Mastadenoviruses [31]. This means that on the one hand, LH3 has negligible sequence similarity with IX but a structurally similar N-terminal domain; on the other, LH3 has high sequence similarity to the C-terminal domain of human adenovirus E1B 55K, a non-structural protein whose gene is directly upstream of the gene coding for IX [79]. The combination of structural and sequence analyses indicates that, in the course of adenovirus evolution, a common ancestor of Atadenoviruses and Mastadenoviruses acquired an LH3-like gene from a bacterium or bacteriophage, perhaps by sharing the same environment (e.g., the host gut). In Mastadenoviruses, this gene was duplicated, and each copy evolved independently. One copy replaced the triskelion with a large N-terminal extension, losing the capacity to bind to the capsid and acquiring new functions in the infectious cycle (E1B-55K). The other copy conserved the triskelion and a structural role, but changed its C-terminus drastically, giving rise to protein IX [31].

Core Proteins
A distinctive feature of adenoviruses is the incorporation of a large amount (more than 20 MDa) of virus-encoded, DNA-binding proteins packed inside the virion together with the genome (Figure 3a). Core proteins V, VII, and µ are highly basic proteins, expected to act as DNA condensing agents [38,39]. There is no structural information for any of them in solution, and the core organization is not revealed by cryo-EM studies that rely on icosahedral symmetry and single particle averaging. Small fragments of proteins VII and V have been modeled in recent cryo-EM studies of HAdV-C5 and HAdV-F41 virions, interacting with the inner surfaces of hexons [27,28].

Protein V
Protein V has 368 amino acids and a predicted isoelectric point (pI) of 10 in HAdV-C5, and is present in approximately 150 copies per virion [72,80]. A cross-linking study identified interactions among the core proteins, indicating the formation of dimers of proteins V and VII. The same study suggested that protein V exists in a complex with VII and µ, but never found VII and µ interacting without V, indicating that VII and µ are in close contact with V [81]. More recent studies have shown that recombinant protein V exists in a dimer-monomer equilibrium, and that there is a direct association between proteins V and VI in solution, supporting a model where protein V bridges the core with the capsid [82]. The majority of protein V is released at the beginning of uncoating [83]. A recent cryo-EM study of HAdV-F41 assigned density located in a pocket formed by hexons 2, 3, and 4 to a central region of protein V (residues 170-194) [27].
Protein V is only present in Mastadenoviruses, and is not essential for adenovirus assembly [84]. Two genus-specific, positively charged proteins, P32k and LH2, have been found in Atadenovirus virions, and proposed to be located on the inner capsid surface, substituting for protein V. However, their position in the viral particle has not been unequivocally identified [26,31,78].

Proteins VII and µ
Protein VII is the most abundant protein in the adenovirus core. Its estimated copy number in the virion varies between 500 and 800 [72,80]. It is rich in arginine residues, which confer on it a high positive charge (predicted pI 12.35 in HAdV-C5). AVP cleaves the precursor pVII at residues 13 and 24 (from a total of 198 in HAdV-C5) [41]. In the most recent structural analysis of HAdV-C5, it was found that the second cleaved N-terminal peptide (pVII N2, residues 14-24) occupies a position equivalent to that of the protein VI N-terminal peptide pVI N, lining the hexon cavity wall (Figure 1b) [28]. This observation was unexpected because, as a core protein, VII was not thought to have any icosahedrally ordered part, and there were no previous indications that pVII interacted with hexon. Residues Ser31 in pVI N and Phe22 in pVII N2 interact with the same binding pocket in the hexon wall, raising the possibility that pVI and pVII may compete for hexon binding during assembly.
Protein µ (also called protein X) is the smallest core protein, with 80 amino acids in HAdV-C5 and a predicted pI of 12.88. Estimates of its copy number vary between 100 and 300 [72,85]. In HAdV-C5, AVP cleaves the precursor form of µ at residues 32 and 51 [41]. Both VII and µ condense dsDNA in solution [86,87].

Organization of the Adenovirus Core
Early studies showed that cores released from HAdV-C5 particles under mild disruption conditions contain proteins VII, V, and µ, and may form 150-300 Å-thick fibers. After high ionic strength treatment of these cores, only polypeptide VII remains associated with the viral DNA, forming a "beads-on-a-string" filament reminiscent of cell chromatin, as observed by electron microscopy of metal shadowed specimens [88,89]. A ~120 Å-thick fiber has also been observed in disrupted immature adenovirus particles, which contain the precursor versions of proteins VII and µ [90]. Based on these observations, it has been proposed that protein VII may condense dsDNA by wrapping it around the protein, while µ may exert a bridging action between different regions of the dsDNA molecule [39]. Discerning the organization of the adenovirus core in its physiological context, i.e., inside the capsid, is not straightforward. Cryo-EM high-resolution studies rely on averaging thousands of copies of the same macromolecular object, and it is not yet clear whether the core architecture is the same in each viral particle. Cryo-electron tomography yields three-dimensional maps of unique objects, but averaging is still required to achieve a certain level of detail.

Figure 3 (caption only partially recovered). [...] Reproduced from [91]. (c) Competition between proteins VI and VII for hexon binding impinges on AdV maturation and entry. Reproduced from [71].
A first glimpse of the adenovirus core organization has been provided by a cryo-electron tomography study combined with molecular dynamics simulations. This study indicated that the HAdV-C5 genome and core proteins can be modeled as an ensemble of soft particles related to each other by a soft electrostatic repulsion. This soft repulsion would be generated by an excess of DNA negative charges not screened by the positive charges of the core proteins, and generates a certain degree of internal pressure in the adenovirus particle [39]. Atomic force microscopy (AFM) studies comparing the mechanical properties of HAdV-C5 virions and particles lacking core protein VII have recently confirmed that the major core protein plays a role in decreasing the electrostatic repulsion created by confinement of the dsDNA genome, thereby modulating the internal pressure in the capsid [91]. The AFM images also showed the appearance of the core contents released upon mechanical breakage of the particles in physiological conditions, i.e., in liquid buffer (as opposed to previous studies on dehydrated, Pt-C shadowed specimens) (Figure 3b). In these conditions, it was observed that different regions of the dsDNA molecule associate in bundles interspersed with clusters presumably formed by DNA wrapping around protein VII. It was concluded that protein VII condenses the adenovirus genome by combining direct clustering and promotion of bridging by other core proteins, i.e., protein µ [91].

Role of the Core Proteins in Adenovirus Assembly, Maturation and Entry
Core proteins have long been considered condensing agents required to tightly pack the adenovirus genome within the capsid. However, the volume relation between the interior of the adenovirus particle and the genome is much less tight than in other dsDNA viruses, for example herpesvirus [38], and HAdV-C5 particles lacking protein VII (Ad5-VII-) can be assembled, showing that the protein condensing action is not required for genome packaging [92]. Structural and biophysical studies are revealing additional roles for core proteins in the adenovirus infectious cycle. Immature adenovirus virions produced by the HAdV-C2 ts1 mutant, which contain the precursor versions of all AVP targets including core proteins VII and µ, do not release protein VI in the early endosome, become trapped in the endocytic pathway, and are eventually destroyed in lysosomes [70]. Comparison of the structure, stability, and mechanical properties of mature and immature particles indicates that maturation of the core proteins causes an increase in the internal pressure of adenovirus virions, presumably due to a change in their interactions with the dsDNA molecule upon proteolytic cleavage. The higher pressure renders the mature particle metastable, facilitating penton loss in the initial stages of uncoating, which opens the way for protein VI to be exposed and carry out its membrane lytic function in the endosome [90,93-95]. Thus, maturation of core proteins plays a crucial role in ensuring correct uncoating, and therefore infectivity.
When protein VII is absent, non-infectious particles are assembled. Ad5-VII- particles do not expose protein VI and become trapped in the endosome [71,92]. Unlike immature ts1 particles, Ad5-VII- particles have higher internal pressure than the mature virion [91] and have no trouble releasing pentons, as shown by thermal and mechanical stability assays [71]. Instead, these particles fail to expose the lytic protein because its N-terminal region, which includes the pVI N peptide and the amphipathic α-helix, remains hidden inside the hexon cavity oriented towards the virus core, unavailable for proteolytic cleavage and interaction with the endosome membrane [28,71,92]. These observations support a model where the N-terminal regions of proteins VI (360 copies) and VII (500-800 copies) compete for hexon binding sites (720, one per hexon monomer) during adenovirus assembly. The precursor proteins pVI or pVII would continually be pushed out from the hexon cavity by their competitors, putting them in the path of AVP sliding on the DNA. In the absence of protein VII, the competition does not exist, and all pVI copies remain secured inside the hexon cavity, with their pVI N cleavage site hidden away from the protease and the lytic peptide shielded by hexon, preventing virion escape from the endosome (Figure 3c) [71].
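The counting argument behind this competition model can be made explicit with the copy numbers quoted above. The following sketch is only illustrative: the function name is hypothetical, and it assumes that each pVI or pVII N-terminal region would occupy exactly one hexon site, ignoring binding affinities and kinetics. It simply compares the number of candidate N-termini with the 720 available sites.

```python
# Counting sketch (illustrative only): candidate N-termini versus hexon
# binding sites, using the copy numbers quoted in the text.

HEXON_SITES = 720        # one binding site per hexon monomer
PVI_COPIES = 360         # copies of protein VI per virion
PVII_RANGE = (500, 800)  # estimated copies of protein VII per virion


def site_demand(pvi: int, pvii: int, sites: int = HEXON_SITES) -> tuple[int, int]:
    """Return (total N-termini, excess over available sites).

    A positive excess means that not every N-terminus can be bound at once;
    a negative excess means that some sites remain free.
    """
    total = pvi + pvii
    return total, total - sites


# Wild-type range: pVI and pVII together outnumber the hexon sites.
low = site_demand(PVI_COPIES, PVII_RANGE[0])
high = site_demand(PVI_COPIES, PVII_RANGE[1])

# Ad5-VII- particles: only pVI is present, so every copy can stay bound.
without_vii = site_demand(PVI_COPIES, 0)

print(low, high, without_vii)
```

With protein VII present, the candidates exceed the sites across the whole estimated copy-number range, which is consistent with the idea that some N-termini are displaced from the hexon cavity. Without protein VII, the 360 pVI copies fit within the 720 sites.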

Conclusions
Advances in structural biology methods (notably cryo-EM), availability of novel techniques (such as AFM), and discovery of new viruses have resulted in notable progress in our understanding of the adenovirus particle organization and its variations throughout the different species and genera. However, many questions remain open. We now know that core proteins are not just passive players helping squeeze a stiff dsDNA molecule inside a small space, but that they play active roles in assembly, maturation and uncoating, defining the capacity of adenovirus virions to deliver their genome in the cell [71,90,91,93-96]. Yet, there are no structural data on any of the core proteins, and only initial glimpses of the core architecture have been obtained by cryo-electron tomography and atomic force microscopy imaging [39,91]. We still do not understand how the genome, heavily bound by core proteins, associates with the capsid shell, and there are only limited data on the organization of the packaging machinery [42,97,98]. We have many more structures of fiber heads, alone or receptor bound (Table 1), and data indicating new receptor binding strategies [12,20,53,99]. Virion structures show that the external decorating proteins IX and LH3 mark adenovirus evolution; these proteins may play a role in host specificity, and constitute interesting locations for exogenous peptide display or retargeting [7,27-31,77]. Yet, more information is required to understand the relation between adenovirus types and their related pathologies. In spite of their being old friends (or foes) [100], adenoviruses keep a large number of secrets that scientists will need to unveil. This is necessary if we want to efficiently fight adenovirus-caused infections, or tailor the virus to suit our needs in the battle against other diseases.