Structural Insights into APOBEC3-Mediated Lentiviral Restriction

Mammals have developed clever adaptive and innate immune defense mechanisms to protect against invading bacterial and viral pathogens. Human innate immunity is continuously evolving to expand the repertoire of restriction factors and one such family of intrinsic restriction factors is the APOBEC3 (A3) family of cytidine deaminases. The coordinated expression of seven members of the A3 family of cytidine deaminases provides intrinsic immunity against numerous foreign infectious agents and protects the host from exogenous retroviruses and endogenous retroelements. Four members of the A3 proteins—A3G, A3F, A3H, and A3D—restrict HIV-1 in the absence of virion infectivity factor (Vif); their incorporation into progeny virions is a prerequisite for cytidine deaminase-dependent and -independent activities that inhibit viral replication in the host target cell. HIV-1 encodes Vif, an accessory protein that antagonizes A3 proteins by targeting them for polyubiquitination and subsequent proteasomal degradation in the virus producing cells. In this review, we summarize our current understanding of the role of human A3 proteins as barriers against HIV-1 infection, how Vif overcomes their antiviral activity, and highlight recent structural and functional insights into A3-mediated restriction of lentiviruses.


Introduction
To fight against invading bacterial and viral pathogens, eukaryotic organisms have developed not only effective cellular innate and adaptive immunity such as humoral and T cell-mediated immune responses, but also intrinsic immunity, whereby expression of endogenous host restriction factors provides a first line of cellular defense against infections. One important family of mammalian restriction factors that inhibit a variety of viral infections is the apolipoprotein B messenger RNA (mRNA)-editing enzyme catalytic polypeptide (APOBEC) family of proteins [1][2][3]. APOBECs are polynucleotide cytidine deaminases that convert deoxycytidines in foreign DNA substrates to deoxyuridines (dC-to-dU) and/or cytidines in RNA to uridines (C-to-U) [1].
There are 11 members of the human APOBEC family of genes, which arose by gene duplication, including activation-induced cytidine deaminase (AID), APOBEC1 (A1), APOBEC2 (A2), seven APOBEC3s (hereafter referred to as A3 proteins or A3A, A3B, A3C, A3D, A3F, A3G, A3H), and APOBEC4 (A4) [4]. AID plays a significant role in somatic hypermutation and antibody diversification by class-switch recombination of immunoglobulin genes [5,6]. A1 is responsible for C-to-U editing of the apolipoprotein B transcript at position C6666, which changes a glutamine codon (CAA) to a translational stop codon (UAA). This cytidine deamination results in the synthesis of the truncated apoB-48 protein, which is important for absorption and transportation of dietary lipid from the small intestine to the liver [7,8]. The functions of A2 and A4 are still unknown [9,10]. The seven homologous members of the A3 family of proteins all encode a conserved zinc-dependent cytidine deaminase domain (CD); A3A, A3C, and A3H contain a single CD, whereas A3B, A3D, A3F, and A3G contain a catalytically inactive (CD1) and a catalytically active (CD2) domain [4] (Figure 1a). The CDs are organized into three distinct groups based on their homologies, named zinc-coordinating domains Z1, Z2, and Z3 [11,12]. The N-terminal CD1 domains of A3B, A3D, A3F, and A3G are catalytically inactive and function to bind RNA and ssDNA [13][14][15]; they also play a role in the oligomerization of A3 proteins and facilitate processive movement of the A3 proteins on the template to increase the efficiency of cytidine deamination [16][17][18][19]. In addition to sharing their evolutionary origins, the CDs of the A3 family of proteins share fundamental structural homology (Figure 1b). A3 CDs are composed of five β strands and six α helices centered around a deaminase core with a coordinated zinc ion.
The catalytic mechanism of cytidine deamination is believed to be similar to that of an E. coli cytidine deaminase, which is homologous to AID and APOBEC1 [20][21][22][23][24][25]. Each CD contains an H-X-E-X(23-28)-P-C-X(2-4)-C motif, where "X" is any amino acid and the numbers in parentheses give the number of such residues (Figure 1a). During the cytidine deamination reaction, the cysteines and histidine coordinate a Zn2+ ion, a zinc-bound water molecule donates a proton to the catalytic-site glutamate to form a hydroxide, and the glutamate shuttles the proton during the hydrolytic deamination reaction that converts cytidine to uridine [4] (Figure 1c).
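The consensus motif described above can be written as a regular expression, which is one simple way to locate candidate deaminase domains in a protein sequence. The following is a minimal sketch, not a validated annotation tool: the function name and the toy sequence are hypothetical, and the toy sequence is synthetic rather than a real A3 sequence.

```python
import re

# Consensus from the text: H-X-E-X(23-28)-P-C-X(2-4)-C, where X is any residue.
ZDD_MOTIF = re.compile(r"H.E.{23,28}PC.{2,4}C")


def find_deaminase_motifs(protein_seq):
    """Return (start, end, match) for each non-overlapping hit.

    Coordinates are 1-based and inclusive, as is usual for protein residues.
    """
    seq = protein_seq.upper()
    return [(m.start() + 1, m.end(), m.group()) for m in ZDD_MOTIF.finditer(seq)]


# Synthetic toy sequence (assumption for illustration only, not a real A3 protein):
# 2 leading residues, H-A-E, a 25-residue spacer, P-C, a 3-residue spacer, C, 2 trailing residues.
toy_sequence = "MK" + "HAE" + "A" * 25 + "PC" + "AAA" + "C" + "GG"
hits = find_deaminase_motifs(toy_sequence)

# A spacer that is too short (10 residues) should not match.
too_short = "MK" + "HAE" + "A" * 10 + "PC" + "AAA" + "C" + "GG"
no_hits = find_deaminase_motifs(too_short)
```

Under this pattern a single-domain protein would be expected to yield one hit and a double-domain protein two, although a sequence match alone says nothing about whether the domain is catalytically active.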
A3G is the most potent inhibitor of HIV-1∆vif; however, A3F, haplotypes II, V, and VII of A3H, and, to a lesser extent, A3D can also hypermutate HIV-1∆vif and exhibit less potent but significant antiviral activity [31,57,60,61]. A3G, A3F, A3D, and A3H (haplotype II) are predominantly cytoplasmic proteins (compared to A3A, A3B, and A3H haplotype I, which are predominantly nuclear) that are incorporated into newly formed HIV-1∆vif virions through non-specific interactions with viral genomic RNA or non-viral RNAs such as 7SL RNA [14,62-67]. It is estimated that on average, about 7 ± 4 molecules of A3G are incorporated per virion produced from primary CD4+ T cells [68]. It is only during reverse transcription in the newly infected target cells that the A3 proteins deaminate (dC-to-dU) the viral minus-strand DNA, leading to lethal G-to-A hypermutation in the complementary plus strand [30,31,69-72].

Figure caption fragment: His-X-Glu-X(23-28)-Pro-Cys-X(2-4)-Cys consensus zinc-dependent cytidine deaminase motif.

Figure caption fragment: (a) In the absence of a functional Vif protein (HIV-1∆vif), A3G is packaged into HIV-1 virions in the producer cells, exerting its antiviral activity in the target cells by inducing lethal hypermutation, inhibiting reverse transcription and blocking integration. Reverse transcription and pre-integration complex (PIC) formation occurs within mostly intact viral cores in the cytoplasm and the nucleus [48]. (b) Vif induces degradation of A3 proteins. In the presence of wild-type HIV-1, expression of Vif in the producer cells targets A3 proteins for proteasomal degradation, leading to productive HIV-1 infection of target cells. Created with BioRender.com.
It has been proposed that hypermutation by A3 proteins can increase genetic variation in HIV-1 populations and can promote viral evolution [73][74][75][76][77]. In support of this notion, several studies have suggested that resistance to antiviral drugs is increased when A3 proteins are present and can induce G-to-A mutations in the viral genome [78][79][80][81][82]. However, other studies have suggested that A3G-induced hypermutation is an "all or nothing" phenomenon; virion incorporation of A3G and A3F leads to hypermutation and lethal mutational loads which inactivate the provirus and hence do not significantly contribute to viral genetic diversification [80,83-85]. This is primarily because the high levels of G-to-A mutations ensure that the proportion of viral genomes that escape termination codon mutations is vanishingly small (4 × 10^-21 genomes for A3G-induced hypermutation and 1 × 10^-11 genomes for A3F-induced hypermutation) [84]. In addition, selection against deleterious mutations results in a gradient of hypermutations [86], whereby viral DNAs contain the highest frequency of mutations, mRNAs contain an intermediate frequency of mutations, and viral RNAs contain the lowest frequency of mutations. This gradient of hypermutations is the result of the G-to-A mutations having deleterious effects on multiple stages of viral replication, including virus production, transcription, mRNA stability, nuclear-cytoplasmic transport, translation, and virion assembly, which ensures that only those viral genomes with low levels of G-to-A mutations are packaged into virions. Modeling of recombination between a hypermutated genome and a wild-type genome indicated that very little or no hypermutated portions of the genome without lethal mutations can be rescued by recombination to contribute to the genetic diversity of the replicating viral population [84].
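The link between minus-strand dC-to-dU deamination, plus-strand G-to-A mutation, and termination codons can be illustrated with a short simulation. This is a toy sketch under stated assumptions: the nine-nucleotide sequence and the function names are hypothetical, the deaminated position is chosen by hand, and no sequence-context preference of any A3 protein is modeled.

```python
# Base pairing used when copying one strand from the other; U pairs with A.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def pair_strand(strand):
    """Return the paired strand, written so that index i pairs with index i."""
    return "".join(COMPLEMENT[base] for base in strand)


def deaminate(minus_strand, positions):
    """Convert C to U at the given minus-strand positions (dC-to-dU)."""
    bases = list(minus_strand)
    for i in positions:
        if bases[i] != "C":
            raise ValueError(f"position {i} is {bases[i]}, not C")
        bases[i] = "U"
    return "".join(bases)


def stop_codon_starts(plus_strand, frame=0):
    """Return start indices of in-frame stop codons in the plus strand."""
    return [i for i in range(frame, len(plus_strand) - 2, 3)
            if plus_strand[i:i + 3] in STOP_CODONS]


# Hypothetical plus-strand fragment encoding Met-Trp-Lys (ATG TGG AAA).
plus = "ATGTGGAAA"
minus = pair_strand(plus)                # minus-strand DNA made by reverse transcription
edited_minus = deaminate(minus, [4])     # deaminate the C opposite the middle G of TGG
new_plus = pair_strand(edited_minus)     # plus strand copied from the edited minus strand
```

In this example the tryptophan codon TGG becomes the stop codon TAG, which is the kind of termination codon mutation that makes heavily hypermutated genomes non-viable.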
In addition to G-to-A hypermutation, A3 proteins can also block viral replication through deaminase-independent mechanisms [87][88][89][90][91][92][93][94][95][96]. A3 proteins can inhibit reverse transcription and reduce viral DNA synthesis indirectly by binding viral RNA and blocking the processivity of reverse transcriptase [87][88][89][90][91][92] or directly by binding to reverse transcriptase (RT) [95]. Furthermore, A3 proteins can also block 3′-processing of the viral DNA and integration into the human genome; whether this is due to an indirect mechanism that involves A3F binding to viral DNA ends and blocking access to integrase, or to direct interactions with integrase, requires further study [93,94,96]. Virion incorporation of A3 proteins in experimental settings can greatly exceed the amounts of A3 proteins incorporated in virions produced from primary activated CD4+ T cells [68], and it is important to evaluate antiviral activities under physiologically relevant levels of A3 protein packaging. Interestingly, different A3 proteins exhibit different degrees of deaminase-independent antiviral activities. Introducing a catalytic-site mutation in A3G (E259Q) severely reduces its antiviral activity [93]; in contrast, a similar catalytic-site mutation in A3F (E251Q) has only a minimal effect on its antiviral activity [94]. This suggests that A3G largely induces viral inhibition through deaminase-dependent mechanisms, whereas A3F exerts its antiviral effects mainly through deaminase-independent mechanisms (affecting reverse transcription, 3′-processing and/or integration) [93,94]. A3H has also been reported to mainly inhibit viral replication through a cytidine deamination-independent mechanism, as little to no G-to-A hypermutation was observed, even with reductions in viral infectivity [97].
In support of this conclusion, it was observed that when A3F and A3H hap II inhibit virus infectivity to a similar extent, a smaller proportion of viral genomes are hypermutated in the presence of A3H compared to A3F, and the viral genomes hypermutated by A3H exhibit 3-fold fewer G-to-A mutations per clone [31].
Human A3 proteins are among the most potent inhibitors of HIV-1 replication in the absence of Vif. Understanding their structure and molecular interactions with Vif proteins may provide molecular targets for the development of novel classes of antiviral drugs that can harness the innate immune defenses to inhibit the replication and spread of HIV-1. Determining the molecular details of how A3 proteins interact with their ssDNA substrates and identify the target cytidine for deamination can also provide valuable insights into how these enzymes can be used to generate more effective and nucleotide-specific gene editing tools [98][99][100]. Here, we summarize the current understanding of, and recent structural insights into, how human A3 proteins mediate restriction of lentiviruses and how Vif overcomes their antiviral activity.

Overview of A3 Protein and Vif Structures
In addition to sharing their evolutionary origins, the CDs of the A3 family of proteins share fundamental structural homology (Figure 1b). The numerous structures of A3 proteins that have been determined so far by X-ray crystallography or nuclear magnetic resonance (NMR) [18,101-113] are listed in Table 1. Structures of full-length single-deaminase-domain A3 proteins A3A, A3C, and A3H alone or in complex with single-stranded nucleic acids have been determined. In addition, there are many structures of the C-terminal domains (CTDs) containing the catalytically active domains of double-deaminase-domain A3 proteins: A3B CTD, A3F CTD, and A3G CTD. Some recent studies have determined the structures of the catalytic domain of the A3G CTD in complex with its ssDNA substrate. In addition, there are a few structures of the RNA-binding N-terminal domains (NTDs), A3B NTD and A3G NTD. Most recently, a double-domain structure of a full-length rhesus A3G chimera was solved, which suggests that the CD1 and CD2 domains can form different packing orientations to support RNA binding and dimerization/multimerization [131]. Two different full-length structures showed a 29° rotation of CD2 relative to CD1, suggesting that their relative orientations might be flexible. A potential role for a positively charged surface that forms through CD1-CD1 dimerization and could be involved in RNA binding is also evident. However, mutating residues in this positively charged interface did not abolish virion packaging, suggesting that the 126 FW 127 residues play a more important role in A3G virion incorporation. All of these structures show that the CDs of A3 proteins share a basic structure exhibiting a canonical CD fold that encompasses a 5 β-strand core surrounded by 6 α-helices; the zinc-dependent catalytic site is formed by the α2- and α3-helices and the β3-strand (Figure 1b) [112,138].
It is essential to elucidate how Vif binds to the various A3 proteins and targets them for proteasomal degradation. Furthermore, to achieve a more comprehensive understanding of the antiviral activities of the A3 proteins at the molecular level, determination of the structures of A3 proteins in complex with their ssDNA substrates is critical. Structural and biochemical studies have led to the precise assignments of some of the residues that are involved in ssDNA binding in the different A3 structures obtained in the presence of nucleic acids [109,110,112,125]. These accumulating structural, functional, and biochemical features provide important insights into A3 protein cytidine deaminase activities and substrate specificities as well as their recognition by HIV-1 Vif-containing CRL5 E3 ligase complex [134]. In the next sections, we review what is currently known about interactions of human A3 proteins with HIV-1 Vif and ssDNA.

Identification of A3 Residues Critical for A3-Vif Interactions through Mutational Analyses
Until very recently, a structure of any A3 protein in complex with Vif had not been available [123]. Consequently, most of our understanding of structural determinants that are critical for A3-Vif interactions was obtained through mutational analyses of Vif and A3 proteins. These studies identified three putative Vif interaction surfaces that are distinct and are defined as A3G-like, A3F-like, and A3H-like [139]. The A3F-like interaction is very similar to the Vif interaction with A3C or A3D CTD. Elucidation of the structures of A3G CTD, A3G NTD, A3F CTD, A3C, and A3H (see references in Table 1) has led to the mapping of existing mutational analysis data onto the solvent-exposed surfaces of the different A3 proteins (Figure 3).

Figure 3 (caption; beginning missing): [...] and A3D CTD (a predicted model generated through homology modeling based on the A3F CTD structure (PDB: 6NIL) by using the Phyre2 web portal for protein modeling, prediction and analysis [140]), and A3H hap II (PDB: 6BBO). The known Vif-mediated degradation determinants are labeled in red (for A3G and A3H), the critical residues for Vif binding in magenta (for A3C, A3D, and A3F), and the CBFβ-binding residues in cyan (for A3C, A3D, A3F). (b) Multiple sequence alignment of A3 domains (A3F CTD, A3C, A3D CTD); the critical Vif- and CBFβ-binding residues are highlighted in yellow bars. A3F was used as a reference and hence the numbering is based on A3F's residue numbers. The determinants for Vif and CBFβ binding are in non-overlapping regions, which are indicated above the alignment. (c,d) Similar primary sequences of residues in A3G NTD and A3H hap II are shown, respectively. The residues highlighted in yellow bars were identified as critical for Vif-mediated degradation of the respective A3 proteins. The sequence alignment was performed by a Clustal W multiple alignment in BioEdit.

FIGURE #3
It is important to note that analysis of A3-Vif interactions by mutagenesis and co-immunoprecipitation or A3 degradation assays can only identify the determinants that are functionally important for binding or degradation. Mutational analyses cannot determine whether the amino acids identified as critical residues interact with each other directly or whether they influence binding and A3 degradation through indirect interactions. Indeed, as discussed later, the recently determined structure of an A3F CTD -Vif-CBFβ complex clearly showed that some A3F determinants that were predicted to be important for interaction with Vif do not interact with Vif, but instead interact with CBFβ [123].

A3G Determinants That Interact with Vif
Early mutational analyses of human A3G (A3G), guided by amino acid differences with Vif-resistant simian A3Gs, identified A3G-D128 as an essential determinant for Vif-induced degradation and substitution of A3G-D128 with K rendered A3G completely resistant to Vif-mediated degradation [141][142][143][144]. Subsequent studies showed that the 128 DPD 130 motif in A3G NTD was critical for Vif-mediated degradation [145]. Letko and colleagues observed that A3G-Y125R substitution mutant was resistant to Vif-mediated degradation, indicating that it was involved in the interaction with Vif [146]. A structure of the A3G NTD showed that the 128 DPD 130 residues are exposed on the surface and available to interact with Vif. The residues involved in the Vif interaction form a distinct surface composed of poorly conserved regions in α2, α3, and α4 helices in the A3G NTD (Figure 3a,c). The D128 and D130 residues of the DPD motif and other regions of A3G are under positive selection, which results from host-pathogen genetic conflicts that lead to rapid fixation of mutations altering amino acids at protein-protein interfaces, indicating that the genetic conflict between A3G and Vif has been going on over evolutionary time [147,148].

A3F, A3C, and A3D Determinants That Interact with Vif
Studies of A3G/A3F chimeras showed that in contrast to the interaction of A3G NTD with Vif, the interaction of A3F CTD with Vif is critical for inducing human A3F degradation [149]. Additional mutational analysis of A3F identified 289 EFLARH 294 motif as a structural determinant that is essential for Vif binding and A3F degradation [150,151]. Within the 289 EFLARH 294 motif, E289 was identified as a single amino acid that is essential for Vif-mediated A3F degradation and an E289K substitution rendered A3F resistant to Vif [150,151]. Another mutational study, guided by sequence differences between Vif-sensitive A3F and Vif-resistant rhesus macaque A3F, identified E324 as a critical amino acid for Vif-mediated A3F degradation [152]. Subsequent determination of the structure of A3F CTD showed that the 289 EFLARH 294 motif and E324 are near each other (Figure 3a,b).
The EFLARH motif was conserved in human A3C (A3C; 106 EFLARH 111 ) and human A3D ( 302 EFLARH 307 ) and substitution of the glutamate in these motifs conferred resistance to HIV-1 Vif-mediated degradation [151]. These results suggested that a similar structural interface in A3F, A3C, and A3D is critical for interaction with HIV-1 Vif.
Determination of a crystal structure of A3C and extensive mutational analyses revealed that a shallow cavity composed of hydrophobic and negatively charged residues in the α2 and α3 helices forms a Vif-interacting surface [106] (Figure 3a,b). Amino acids E106, F107, R110, and H111 are within this interaction surface, confirming a role for the "EF" and "RH" residues of the EFLARH motif in Vif binding. In addition, E141 of A3C, which is equivalent to the E324 residue in A3F, is also part of the interaction interface. Importantly, several additional residues in the α2 helix (L72, F75, C76, I79, L80, S81, Y86) were shown to be critical for Vif binding, and mutagenesis of equivalent residues in A3F (L255, F258, C259, I262, L263, S264, Y269) and A3D (L268, F271, C272, I275, L276, S277, Y282) confirmed the importance of these amino acids in A3C, A3D, and A3F for Vif binding [106]. Subsequent biochemical studies have suggested that the L255A, F258A, and L263A substitutions destabilized the A3F CTD structure, which perhaps contributed to their relative resistance to Vif-mediated degradation [120].
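The equivalent residues quoted above differ from the A3C numbering by a constant shift (183 for A3F and 196 for A3D), so the correspondence can be reproduced with simple arithmetic. The sketch below assumes that this constant offset, which is inferred here from the listed residue pairs, holds only within the stretch covered by those pairs; a real mapping should come from a sequence alignment. The function and variable names are hypothetical.

```python
# Offsets inferred from the residue pairs listed in the text (assumption:
# valid only for the aligned region spanning the residues quoted there).
OFFSET_FROM_A3C = {"A3F": 183, "A3D": 196}

# A3C alpha2-helix residues reported as critical for Vif binding.
A3C_ALPHA2_RESIDUES = ["L72", "F75", "C76", "I79", "L80", "S81", "Y86"]


def equivalent_residue(a3c_residue, target):
    """Map an A3C residue label such as 'L72' to the target protein numbering."""
    amino_acid, position = a3c_residue[0], int(a3c_residue[1:])
    return f"{amino_acid}{position + OFFSET_FROM_A3C[target]}"


a3f_equivalents = [equivalent_residue(r, "A3F") for r in A3C_ALPHA2_RESIDUES]
a3d_equivalents = [equivalent_residue(r, "A3D") for r in A3C_ALPHA2_RESIDUES]
```

The same shift also relates the first glutamate of the EFLARH motif (A3C E106, A3F E289, A3D E302) and A3C E141 to A3F E324, which is a quick consistency check on the numbering used in the text.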

A3H Determinants That Interact with Vif
A3H, the only Z3-type cytidine deaminase, has a similar core structure [107,108,133], and its single CD is responsible for cytidine deamination, binding to Vif, and binding to RNA. The A3H gene is polymorphic in humans and there are at least seven distinct A3H haplotypes (hap I to hap VII); the mRNAs transcribed from the A3H haplotypes have varying degrees of intracellular stability, resulting in different steady-state levels of A3H proteins and thus different antiviral activity [153][154][155][156][157]. Of the seven haplotypes, hap II, V, and VII express higher steady-state levels of mRNAs and therefore of proteins, which exhibit potent restriction against HIV-1 and LINE-1 retrotransposons. Zhen et al. found that the D121 amino acid in A3H hap II was critical for Vif binding and Vif-induced A3H degradation [158]. Subsequently, extensive mutational analysis by Nakashima and others identified several A3H amino acids (S86, W90, V93, D94, I96, K97, D100, D121, L125, S129) that cluster on the surface of helices α3 and α4 and were critical for Vif binding [159,160]. Recently, a 2.49 Å crystal structure of A3H hap II revealed a uniquely long C-terminal helix 6 and a disrupted β5 strand of the canonical five-stranded β-sheet core [132] (Figure 3a,d). Furthermore, this study showed that A3H has a highly positively charged surface, which facilitates the formation of RNA-mediated dimers, inhibits its deaminase activity, and modulates its subcellular localization between the nucleus and cytosol. Future structural studies are warranted to determine how the residues identified by mutational analysis dictate the physical interactions between Vif and A3H.

HIV-1 Vif Determinants That Interact with A3 Proteins
Alanine-scanning mutational analysis of the first 60 amino acids of Vif identified two distinct regions that were critical for Vif-mediated degradation of A3G and A3F [169]. A Vif mutant in which the 40 YRHHY 44 region is replaced with alanines is fully capable of degrading A3F but not A3G (Figure 4). Conversely, a Vif mutant in which the 14 DRMR 17 residues are substituted with alanines is fully capable of degrading A3G but not A3F (Figure 4). The specificity of these mutants towards A3G and A3F was confirmed in vivo by infection of humanized mice and comparing the patterns of G-to-A hypermutations [170]. In view of these results, it is very interesting that when the 14 DRMR 17 is substituted with SERQ or SEMQ (equivalent residues in SIV African green monkey (agm) Vif), it confers the ability to induce degradation of agmA3G as well as the otherwise resistant human A3G-D128K [169], an observation that has been independently confirmed [169]. These observed differences, which were thought to indicate species-specific interactions, were attributed to the two positively charged amino acids (R15 and R17) in HIV-1 Vif and their interactions with the negatively charged A3G-D128. One hypothesis is that the substitution of DRMR with SEMQ creates a new motif that interacts with A3G-D128 in a manner that is distinct from the interaction of wild-type Vif with A3G. However, a structure of a Vif-A3G complex is needed to resolve this mystery.

Figure 4 (caption fragment): Residues involved in interactions with A3 proteins are shown using Rasmol software (www.rasmol.org). Residues involved in interaction with A3F, A3G, and A3H hap II are shown in purple, green, and red, respectively. Residues that are involved in interaction with both A3G and A3F are shown in gold. Residues involved in interactions with CBFβ, EloC, and Cul5 are shown in magenta, brown, and black, respectively.
In addition to the 14 DRMR 17 motif, some amino acids in the 74 TGERDW 79 motif and the 171 EDRW 174 motif were also found to be essential for Vif-mediated degradation of A3F but not A3G [119,171] (Figure 4). Specifically, E76 and W79 in the 74 TGERDW 79 motif and all four amino acids in the 171 EDRW 174 motif were essential for Vif-mediated degradation of A3F [119]. Finally, the 23 SLVK 26 region was shown to be critical for degradation of both A3G and A3F [172,173].
As discussed earlier, the conserved EFLARH motif is essential for degradation of A3F, A3C, and A3D [151]. Thus, as expected, similar Vif determinants are involved in degradation of both A3F and A3C, but there were some differences in the Vif-A3F and Vif-A3C interactions [119]. Vif amino acids R17, E171, and R173 were much more critical for the Vif-A3F interaction than the Vif-A3C interaction.
HIV-1 Vif uses a distinct motif to counteract A3H hap II compared to the motifs involved in degradation of A3G, A3F, A3C, and A3D. Ooms et al. [174] found that the amino acid variation at position 48 in Vif dictates the differential ability to induce A3H hap II degradation (Figure 4). Approximately 24.7% of Vifs, including NL4-3 Vif, contain N48 and cannot neutralize the antiviral activity of A3H hap II, whereas 74.1% of Vifs, including the LAI Vif, contain H48 and can efficiently induce A3H hap II degradation (https://hivmut.org). Subsequently, the same group found that the presence of F39 and H48 in HIV-1 Vif affected its activity against A3H hap II but not against A3G and A3F [175].
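The Vif determinants described in this section can be collected into a small lookup table, which makes it easier to see which Vif regions are reported as required for degradation of each A3 protein. This is only a bookkeeping sketch of the statements above: the dictionary keys and the function name are hypothetical labels, and the table is not a complete list of Vif-A3 contacts.

```python
# Vif regions and the A3 proteins whose degradation they are reported to be
# required for, as summarized in the text above (labels are illustrative).
VIF_DETERMINANTS = {
    "14-DRMR-17": {"A3F"},
    "23-SLVK-26": {"A3G", "A3F"},
    "40-YRHHY-44": {"A3G"},
    "74-TGERDW-79": {"A3F"},
    "171-EDRW-174": {"A3F"},
    "F39/H48": {"A3H hap II"},
}


def determinants_for(target):
    """Return the set of Vif regions reported as required for a given A3 target."""
    return {region for region, targets in VIF_DETERMINANTS.items()
            if target in targets}


a3g_regions = determinants_for("A3G")
a3f_regions = determinants_for("A3F")
a3h_regions = determinants_for("A3H hap II")
```

Read this way, an alanine substitution of 40 YRHHY 44 removes a determinant for A3G only, which matches the observation that such a mutant still degrades A3F.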

HIV-1 Vif Determinants That Interact with CBFβ
In 2011, Jager et al. and Zhang et al. independently identified core binding factor β (CBFβ) as a host factor that specifically binds to HIV-1 Vif and facilitates efficient degradation of A3 proteins [176,177]. CBFβ is a non-DNA binding subunit of the RUNX1, RUNX2, and RUNX3 family of heterodimeric transcription factors that regulates their ability to promote transcription of immunity related genes [178][179][180][181]. Interestingly, the Vif-CBFβ interaction overlaps with the CBFβ-RUNX interaction, which indicates that Vif and RUNX1 are mutually exclusive for CBFβ-binding. The Vif-CBFβ interaction increases the steady-state levels of Vif by preventing its degradation [176,177,182,183] as well as by increasing its biosynthesis [184]. In addition to increasing the stability of Vif, the Vif-CBFβ interaction was shown to sequester CBFβ in the cytoplasm and reduce its ability to promote transcription of genes controlled by the RUNX complexes, which includes A3 genes [185]. Thus, the Vif-CBFβ interaction inhibits A3 protein expression.
Guo et al. [134] determined the structure of a pentameric complex composed of Cul5, EloB/C, Vif, and CBFβ (discussed below), indicating that Vif is tightly associated with CBFβ. Mutational analyses identified several residues of Vif that are important for its in vivo interaction with CBFβ [186][187][188]. Double-alanine scanning mutagenesis of the first 60 amino acids of Vif provided a comprehensive view of Vif determinants essential for in vivo interaction with CBFβ and identified the N-terminal 5 WQVMIVW 11 as the major interaction determinant [189] (Figure 4). Furthermore, in agreement with a previous study [190], CBFβ amino acid F68 played a key role in forming a tripartite hydrophobic interaction with CBFβ I55 and Vif W5 to maintain a stable and functional Vif-CBFβ complex. The pentameric structure [134] showed the extensive hydrophobic interactions between the N-terminal anti-parallel β-strands of Vif and CBFβ, which include the 5 WQVMIVW 11 region of Vif, and I55 and F68 of CBFβ.

Structure of the Pentameric Complex of Vif, Cul5, EloB/C, and CBFβ
In a landmark study, Guo et al. [134] determined the structure of a Vif-Cul5-EloB/C-CBFβ pentamer, providing the first structure of full-length Vif (amino acids [...]) as well as new insights into its interactions with the E3 ubiquitin ligase complex substrate adaptors EloB (residues 1-102) and EloC (residues 1-102), scaffold protein Cul5 (residues 12-386), and CBFβ (residues 17-112). The Vif BC-box motif, which contains the conserved 144 SLQYLA 149 region that is homologous to the suppressor of cytokine signaling (SOCS)-box motif in SOCS-box proteins, interacts with EloC [191]. The H108-C114-C133-H139 zinc-binding motif, which is essential for binding to Cul5, is bound to a Zn2+ ion that is solvent inaccessible. Vif residues 116-131 within the H108-C114-C133-H139 motif act as a cullin box, and the Vif mutations I120S and L124S impaired the Vif-Cul5 interaction (Figure 4). Cul5 residues L52 and W53 play a critical role in binding to Vif.
The Vif-CBFβ complex plays a key role in organizing the pentameric complex and in its absence, interactions between Cul5 and EloB/C were reduced. The importance of Vif residues 5 WQVMIVW 11 and CBFβ amino acids I55 and F68 is consistent with the close interactions of these amino acids in the structure. Furthermore, the C-terminal CBFβ helix 5 residues (e.g., F143) also form critical interactions with Vif residues W89, T96, and L106. Some of these residues were previously identified as being critical for interaction with Vif [192] and others have been shown to be critical for the Vif-CBFβ interaction [187,193]. Importantly, the larger buried surface area of the Vif-CBFβ interaction compared to the CBFβ-RUNX1 interaction (4797 Å² vs. 3941 Å²) suggests that CBFβ has a higher affinity for Vif than for RUNX1. The suggestion that Vif evolved a stronger affinity for CBFβ than its cellular partner RUNX1 underscores the importance of the Vif-CBFβ interaction for its role in overcoming restriction by A3 proteins.
Overall, the pentameric structure indicated that the Vif interacting proteins Cul5, EloC, and CBFβ interact with hydrophobic portions of Vif and leave the positively charged surfaces free to interact with A3 proteins. Most of the Vif amino acids previously identified to be critical for interactions with A3 proteins are exposed on the Vif surface and available to interact with A3 proteins (see Figure 4).

Structure of a Vif-CBFβ-A3F CTD Ternary Complex
A cryogenic electron microscopy (cryo-EM) structure of a ternary complex consisting of Vif, CBFβ, and A3F CTD at 3.9 Å resolution was recently reported [123], providing the first structure of a Vif-A3 complex. As discussed earlier, the Vif-CBFβ interaction was thought to facilitate Vif-mediated degradation of A3 proteins by primarily inhibiting degradation of Vif and increasing its steady-state levels. Furthermore, it was thought that Vif binding to CBFβ partially blocks A3 binding and that CBFβ must be displaced before Vif can bind to A3 proteins [139,186]. It was therefore surprising that binding to A3F CTD did not significantly alter the conformation of the Vif-CBFβ complex. Additionally, it was found that CBFβ directly interacts with A3F CTD and participates in the recruitment of A3F CTD to the Cul5 E3 ubiquitin ligase complex (Figure 5a). The structure shows that Vif and CBFβ form a stable platform to which A3F CTD binds. A3F CTD binding does not induce significant conformational changes in the Vif-CBFβ complex or in the A3F CTD. Local conformational changes in the Vif loop between β4 and β5, which includes the 23 SLVK 26 region previously shown to be important for A3G and A3F binding [172,173], bring this loop closer to the Vif C-terminal residues 173-176, which overlap the 171 EDRW 174 region previously shown to be important for A3F binding [119,192].

FIGURE #5
The structure revealed the CBFβ-A3F CTD interface and the Vif-A3F CTD interface (Figure 5b,c). The importance of the CBFβ-A3F CTD interface in vivo was established through mutational analysis of critical CBFβ and A3F CTD residues at the interface. CBFβ residues R35 and R43 form a positively charged surface that interacts with a negatively charged surface on A3F CTD composed of E324 and several main chain carbonyls. The negatively charged CBFβ residue E54 interacts with the positively charged A3F residue R293, which is stabilized by Vif H73. A3F R293D mutant was resistant to degradation by Vif and CBFβ E54K mutant interfered with WT Vif's ability to degrade A3F. Remarkably, the A3F mutant R293D became sensitive to Vif-mediated degradation in the presence of the CBFβ mutant E54K. These charge-swapped pairs of mutants reversed the Vif-resistance phenotype in vivo, indicating that A3F R293 and CBFβ E54 physically interact.
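The logic of the charge-swap experiment described above can be captured in a toy model in which a residue pair supports the interaction only when the two side chains carry opposite charges. This is a deliberately simplified sketch: the charge table, the function names, and the rule that opposite charges are sufficient are illustrative assumptions, not a description of how the interface was analyzed in the cited study.

```python
# Formal side-chain charges at neutral pH for the residues involved (toy model).
CHARGE = {"R": 1, "K": 1, "D": -1, "E": -1}


def mutate(wild_type, mutation):
    """Return the new residue for a mutation label such as 'R293D'."""
    if mutation[0] != wild_type:
        raise ValueError(f"{mutation} does not start with wild-type residue {wild_type}")
    return mutation[-1]


def is_complementary(residue_a, residue_b):
    """True when the two residues carry opposite charges."""
    return CHARGE.get(residue_a, 0) * CHARGE.get(residue_b, 0) < 0


a3f_wt, cbfb_wt = "R", "E"              # A3F R293 and CBFβ E54
a3f_mut = mutate(a3f_wt, "R293D")       # charge-reversed A3F
cbfb_mut = mutate(cbfb_wt, "E54K")      # charge-reversed CBFβ

outcomes = {
    ("A3F WT", "CBFβ WT"): is_complementary(a3f_wt, cbfb_wt),
    ("A3F R293D", "CBFβ WT"): is_complementary(a3f_mut, cbfb_wt),
    ("A3F WT", "CBFβ E54K"): is_complementary(a3f_wt, cbfb_mut),
    ("A3F R293D", "CBFβ E54K"): is_complementary(a3f_mut, cbfb_mut),
}
```

The model predicts that either single mutant disrupts the pair and that combining both restores it, which mirrors the reported pattern of resistance and restored sensitivity to Vif-mediated degradation.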
The physiological relevance of the Vif-A3F interface was also determined by mutational analyses of Vif and A3F CTD residues predicted to be critical for the interaction. Three major Vif-A3F CTD interactions were identified in the structure. First, Vif W79 formed a stacking interaction with A3F P265. The Vif double-mutant W79A-H80A was unable to induce degradation of WT A3F and the A3F mutant P265A was partially resistant to WT Vif, indicating the importance of these amino acids for Vif-mediated A3F degradation. Second, Vif R15 formed a strong electrostatic interaction with the negatively charged main chain carbonyls of A3F residues 260-263. As predicted from previous studies indicating the importance of the ¹⁴DRMR¹⁷ motif in Vif [169], the Vif mutants R15D and R15E failed to induce degradation of WT A3F or rescue infectivity in single replication cycle assays. Third, the Vif K50 residue, together with CBFβ E54, formed electrostatic interactions with A3F residues E289 and R293. The Vif mutant K50E was severely defective in inducing A3F degradation and in rescuing virus infectivity in single replication cycle assays and, as shown previously [151], the A3F mutant E289K was highly resistant to degradation by WT Vif.
Overall, the A3F CTD-Vif-CBFβ structure provides a comprehensive model of how A3 proteins bind to the Vif-CBFβ dimer and are recruited to the Cul5-EloB/C complex for polyubiquitination and proteasomal degradation. The structure clarifies much of the mutagenesis data; some of the A3F-Vif interactions were confirmed, while other interactions thought to be between Vif and A3F are actually A3F-CBFβ interactions. For example, A3F residues R293 [119,150,151] and E324 [152] were thought to interact with Vif, but actually interact with CBFβ [123]. Other residues that were thought to be involved in Vif-A3F interaction are located in the interior of the structure and indirectly influence Vif-mediated degradation of A3F. These insights reinforce the notion that both structural studies and in vivo mutational analyses are needed to attain a full understanding of the Vif-A3 interactions.
It is also important to note that the interactions of the Vif-CBFβ complex with A3G are likely to be quite different from the interactions with A3F, since the CBFβ mutations that inhibit degradation of A3F (R35E, R43E, E54K) do not inhibit degradation of A3G. Based on the conservation of the EFLARH motif and its importance in Vif-mediated degradation of A3F, A3D, and A3C, the Vif-CBFβ interactions with A3D CTD and A3C are likely to be similar but may well exhibit differences. Thus, it will be important to obtain structures of the other A3 proteins in complex with the Vif-CBFβ complex.

Insights into Substrate Selection, A3 Deamination, and Editing-Site Selection
Cytosolic A3D/F/G/H proteins are incorporated into newly budding virions through interactions of their N-terminal RNA binding domain with HIV-1 viral RNA, leading to HIV-1 restriction through both editing and non-editing mechanisms. The CTD of double-domain A3s retains the deaminase and hypermutation activity, while the NTD, which is catalytically inactive, retains important functions for nucleic acid binding, oligomerization, packaging, and processivity [13,113,149,194-197]. Extensive G-to-A hypermutation [32,52,70,83,84] of viral genomes in patients results in an average of 20% of G residues mutated to A residues during a single round of reverse transcription, leading to lethal mutagenesis and viral inactivation [84]. During reverse transcription, as minus-strand synthesis progresses, RNase H degradation of the viral RNA template allows A3G to access the newly synthesized ssDNA (minus-strand DNA) and deaminate dC to dU. A3G lacks deaminase activity on dsDNA or RNA templates [71,149,194,198]. The dC-to-dU deaminations on the minus-strand DNA then template G-to-A mutations in the plus-strand DNA as RT completes plus-strand DNA synthesis, forming a dsDNA viral genome.
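The strand logic described above can be made concrete with a short sketch. It is illustrative only: the toy sequence and the function names (`reverse_complement`, `deaminate_minus_strand`) are hypothetical and not taken from the cited studies, and the model ignores site preferences, uracil repair, and every other aspect of reverse transcription.

```python
# Illustrative sketch only: a toy model of how dC-to-dU deamination on
# minus-strand DNA is read out as a G-to-A mutation on the plus strand.
# The sequence and function names are hypothetical.

# dU pairs like dT, so it templates dA during plus-strand synthesis.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def deaminate_minus_strand(minus, positions):
    """Convert dC to dU at the given 0-based positions of the minus strand."""
    edited = []
    for i, base in enumerate(minus):
        if i in positions:
            if base != "C":
                raise ValueError(f"position {i} is {base}, not C")
            edited.append("U")
        else:
            edited.append(base)
    return "".join(edited)


plus_original = "ATGGGACCTGGA"                  # toy plus-strand sequence
minus = reverse_complement(plus_original)       # minus-strand ssDNA
minus_edited = deaminate_minus_strand(minus, {9})  # 3' C of a 5'-CCC-3' run
plus_mutated = reverse_complement(minus_edited)    # plus-strand synthesis

# List every position where the plus strand changed.
changes = [
    (i, ref, obs)
    for i, (ref, obs) in enumerate(zip(plus_original, plus_mutated))
    if ref != obs
]
print(minus, minus_edited, plus_mutated, changes)
```

In this toy case, a single dC-to-dU change in the minus strand produces a single G-to-A change at the complementary plus-strand position.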

FIGURE #6
Structures of the A3 NTD or CTD domains in the presence or absence of nucleic acid have broadened our understanding of A3-ssDNA interactions. Low-resolution structures of high-molecular-mass and low-molecular-mass forms of full-length A3G, which provided the overall shape of A3G, were determined by small-angle X-ray scattering [204]. Structures of the catalytically active domains of A3A, A3B, A3C, A3F, A3G, and A3H were determined in the absence of ssDNA, leading to three historical models of how ssDNA is coordinated in the catalytic active site of the A3 proteins [18,101,103-106,109,110,113,120,126,128-130,205,206]. The "brim" model proposed that a ring of positively charged amino acids surrounding the concave catalytic active site aids in positioning the target cytidine (dC₀) for deamination [101]. Additionally, the "kinked" [18,126] and "straight" [129,206] models proposed that the ssDNA substrate is in a bent or straight configuration, respectively, in the CTD active site. The 3.1 Å crystal structure of A3F CTD predicted a single ssDNA binding groove leading to the catalytic active site, supporting the "straight" model [120], while an NMR structure of A3A modeled the ssDNA substrate with a "kink" or bend so that the reactive cytidine is positioned in the active site [103].
More recently, co-crystal structures of A3A and A3G complexed with ssDNA have been solved, providing valuable new insights into A3 substrate binding and specificity. The 2.2 Å A3A-ssDNA co-crystal structure shows a DNA oligonucleotide (5′-TTTTTTTCTTTTTTT-3′) bound in the active site of A3A, poised for catalysis and centered on the deamination target 5′-TC-3′, with the ssDNA in a U-shaped conformation [110]. Contacts are made with three nucleotides, the target deoxycytidine and two flanking nucleotides, and the A3A amino acid H29 is thought to "latch" and stabilize the substrate ssDNA at the target cytidine (dC₀). This unique U-shaped DNA conformation is also confirmed by a 3.1 Å co-crystal structure of A3A-ssDNA (5′-AAAAAATCGGGAAA) and a 1.7 Å co-crystal structure of an A3B/A3A-ssDNA chimera, with Y315 playing a role in forming an open or closed catalytic cleft for ssDNA binding [109]. Further modeling studies of A3B CTD bound to ssDNA support this U-shaped conformation [207].
A recent study determined the co-crystal structure of A3G CTD with ssDNA at 1.86 Å resolution; the ssDNA contained the relevant substrate 5′-AATCCCAAA-3′ and adopted a curved shape in the active site, with W211 playing a critical role in substrate recognition (Figure 6a) [112]. In this structure, the ssDNA has a more extended conformation, with five nucleotides contacting the CTD rather than the three observed in the A3A-ssDNA co-crystal. Overall, the ssDNA substrate binding conformation in complex with A3 suggests that this U-bent region of nucleic acid is a hotspot for A3 cytidine deamination and may be common to all A3 proteins. Indeed, the co-crystal structure of the tRNA adenosine deaminase TadA from Staphylococcus aureus in complex with tRNA also shares structural similarity with A3A-ssDNA, suggesting evolutionary conservation [208].
APOBEC proteins have evolved to select different ssDNA substrates, reflecting a diversity of their function. Unlike A3 proteins, AID has evolved to target dsDNA for immunoglobulin class-switch recombination and V(D)J hypermutation. The first co-crystal structure of a maltose-binding protein (MBP)-AID fusion protein and dCMP was published showing potential surface grooves that favored G-quadruplexed DNA substrates over linear ssDNA that likely guide AID substrate recognition and facilitate double-strand breaks for class switch recombination [205]. Here, no evidence of a U-shaped substrate channel was observed, likely reflecting a more rigid nature of the dsDNA substrate. Thus, although all APOBECs have a common cytidine deamination function, the overall protein structure at the catalytic core influences substrate specificity.
Interactions of the ssDNA substrate outside of the core catalytic active site have also been reported to influence substrate selection and cytidine deaminase activity. A co-crystal structure of A3F CTD bound to poly-dT(10) shows a new ssDNA-binding surface distal to the zinc active center mediated by hydrophobic and electrostatic interactions between the ssDNA substrate, tyrosine (Y333), and lysine (K352/K355/K358) residues in A3F CTD [121]. These amino acids were also shown to play a role in RNA binding as well as the catalytic activity of A3F. A 1.9 Å crystal structure of a soluble A3B NTD mutant showed two positively charged amino acid patches around the NTD that likely bind nucleic acid through electrostatic interactions and facilitate cytidine deamination [117]. In addition, the positively charged CD1 of A3G has been shown by mass spectrometry to crosslink to ssDNA, and the mutations Y181A/Y182A reduce deaminase activity [209]. The 2.0 Å co-crystal structure of rhesus A3G-CD1 bound to poly-dT ssDNA shows strong ssDNA-binding affinity, which is likely due to the positively charged surface of rhesus A3G [113]. Indeed, the A3G CTD constructs used for crystal and solution structures [101,126], which lack the NTD, have severely reduced deaminase activity, implicating regions outside the catalytic domain in substrate selection and deamination. Overall, the negatively charged phosphate backbone of the ssDNA substrate requires stabilization via positively charged patches and grooves along the surface of the A3 protein to mediate deamination [104,128].
Recently, a 3.28 Å crystal structure of human A3H showed that A3H is bound to a short RNA duplex [108] (Figure 6b). A3H proteins purified from E. coli were devoid of cytidine deaminase activity unless they were treated with RNase A, suggesting that RNA binding inhibits cytidine deaminase activity. Mutagenesis of residues in loop 7 and α6 helix increased cytidine deamination activity, supporting the notion that RNA binding and deamination activities are separated. A model was proposed in which A3H dimerization is promoted by RNA binding, which facilitates cytoplasmic localization, as previously shown for A3G [210], promotes virion incorporation, and increases deaminase-independent inhibition by binding to the template RNA and indirectly inhibiting reverse transcription. On the other hand, RNA binding and dimerization inhibited the cytidine deamination activity. Interestingly, Bohn et al. reported a 2.24 Å crystal structure of A3H from pig-tailed macaques (pgtA3H), which showed that the pgtA3H also forms a dimer around a short double-stranded RNA [107]. However, in contrast to the human A3H, the pgtA3H was reported to retain potent cytidine deaminase activity while retaining binding to short RNA duplexes in the viral genome. These studies have raised interesting questions regarding the nature of RNA sequences that promote A3H binding, dimerization, virion incorporation, and regulation of cytidine deaminase activity.

Deamination and Editing-Site Selection
The sequence context and secondary structure of the ssDNA as well as A3 protein secondary structure can influence site preferences that determine patterns of cytidine deamination. Studies show that the local nucleotide sequence context surrounding the dC₀ site of deamination (−1, +1, and +2 nucleotides) influences cytidine deamination frequencies for each of the A3 proteins (Figure 6c) [31,125,198,211-216]. Detailed hypermutation studies of the −2 position showed that A3G preferred to mutate 5′-TC or 5′-CC dinucleotides when the −2 nucleotide was a C (5′-CCC-3′; the 3′-most C is deaminated) [31]. A3F and A3H preferred to mutate TC dinucleotide sites when a T was at the −2 position (5′-TTC-3′) and A3D preferred to mutate TC sites when an A was at the −2 position (5′-ATC-3′). Interestingly, all A3 proteins strongly disfavored a G at the −2 position [5′-G(T/C)C-3′]. The most distantly related APOBEC, AID, exhibits a different pattern of hypermutation, favoring 5′-(A/T)(A/G)C-3′ and disfavoring the A3G hotspot 5′-CCC-3′ [213]. Knowing the patterns of editing-site selection, proviral genomes from infected patients can be analyzed for hypermutation patterns; sequence analysis provides evidence that A3 proteins can copackage and comutate the viral genome [31] to block HIV-1 infection. In this study, co-packaging and the increase in G-to-A hypermutation were additive when A3G and A3F were co-expressed, and there was evidence of a synergistic increase when A3G and A3H hap II were co-expressed. However, another study used a single plasmid expression system to express both A3G and A3F at low levels and observed a modest enhancement of A3G deamination activity by A3F (~1.5 fold) compared to the expected additive effect [217]. This result suggests that at low levels of A3 protein expression, it is possible to observe synergistic cooperativity between A3G and A3F.
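As a rough illustration of how such editing-site patterns can be tallied, the sketch below counts the minus-strand trinucleotide context of each G-to-A difference between a reference plus-strand sequence and an observed proviral sequence. The sequences are toy data, the function names are hypothetical, and the preference table only restates the trinucleotides given in the text; real analyses would need aligned sequences, background composition, and statistical testing, none of which is modeled here.

```python
# Illustrative sketch: tally the minus-strand 5'-N(-2) N(-1) C(0)-3' context
# of each plus-strand G-to-A change. Toy data; function names are hypothetical.
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Preferred minus-strand contexts as stated in the text (preferences only,
# not exclusive rules).
PREFERRED_CONTEXT = {"CCC": "A3G", "TTC": "A3F/A3H", "ATC": "A3D"}


def minus_strand_context(plus_ref, i):
    """Minus-strand trinucleotide whose 3' C pairs with plus-strand G at i.

    The -1 and -2 minus-strand bases pair with the two plus-strand bases
    3' of the G. Returns None if the G is too close to the 3' end.
    """
    window = plus_ref[i:i + 3]
    if len(window) < 3:
        return None
    return "".join(COMPLEMENT[base] for base in reversed(window))


def tally_g_to_a_contexts(plus_ref, plus_obs):
    """Count minus-strand contexts over all G-to-A differences."""
    if len(plus_ref) != len(plus_obs):
        raise ValueError("sequences must be aligned and of equal length")
    counts = Counter()
    for i, (ref, obs) in enumerate(zip(plus_ref, plus_obs)):
        if ref == "G" and obs == "A":
            context = minus_strand_context(plus_ref, i)
            if context is not None:
                counts[context] += 1
    return counts


reference = "ATGGGACTGAATGATCC"  # toy plus-strand reference
observed = "ATAGGACTAAATAATCC"   # toy sequence with three G-to-A changes

contexts = tally_g_to_a_contexts(reference, observed)
for context, n in sorted(contexts.items()):
    print(context, n, PREFERRED_CONTEXT.get(context, "no stated preference"))
```

The context is read from the reference rather than the observed sequence so that neighboring G-to-A changes do not alter the context assigned to each site.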
Structural and biochemical studies have provided insight into how A3G selects the substrate site for deamination. Similarities exist amongst the A3 proteins, but differences in amino acid sequence, overall size of the protein, as well as sequence and length variation in loops L1, L3, and L7 contribute to the differences in substrate recognition and cytidine deamination patterns [18,103-105,218,219]. Loop 7 residues, in particular amino acids R313, Y315, D316, and D317, have been implicated in structural models in ssDNA binding to the −1 nucleotide and in proper positioning for deamination, and notably have been proposed to form a "nucleotide specificity box" [201,218,220,221]. Interestingly, swapping the A3G loop 7 into AID at the complementary position changed the site-selection preference of AID to that of A3G in in vitro assays on ssDNA substrates [220]. A recent 2.9 Å co-crystal structure of an A3G CTD fusion protein with the substrate hot spot sequence 3′-CCCA-5′ captured the substrate bound to A3G with the non-preferred "A" in the −1 nucleotide position in the binding pocket of A3G [125]. Structural analysis showed that rearrangements in the pocket were not conducive to deamination (no rearrangement of A3G D316 flipping to interact with the −1 nucleotide or DNA binding at dC₀), showing that the precise conformation of the catalytic binding pocket is needed for selective deamination. Indeed, A3B crystal structures in the presence or absence of ssDNA show that in the absence of ssDNA substrate, loops 1 and 7 formed a closed structure with R211 from loop 1 and Y315 from loop 7 stacked on each other, blocking access to the active site; yet in the presence of ssDNA, the loops underwent a conformational switch to accommodate the ssDNA substrate [109,115,116]. Further biochemical and structural studies have also confirmed the importance of loop 7 in editing site selection [20,145,150,201,222].
Mutation D317Y in A3G loop 7 changed the deamination profile of A3G from 5′-CC to 5′-TC and swapping A3G loop 7 with that of A3A switches A3G deamination preference to 5′-TC [218]. Mutations in loop 7 of A3F decreased ssDNA substrate binding and deamination activity [223] and swapping ³¹⁵YDDQ³¹⁸ of A3G to ³⁰⁷YYFW³¹⁰ of A3F switched the substrate preference of A3G from 5′-CC to 5′-TC or 5′-GC [120]. This was also true for AID when loop regions of AID were swapped with corresponding amino acids of A3G or A3F, which resulted in the swapping of substrate preferences [220,221].
The co-crystal structure of A3G CTD with its preferred ssDNA substrate 5′-AATCCCAAA-3′ (favored sequence shown in bold) at high resolution has provided detailed insights into the interactions of A3G residues and the ssDNA [112] (Figure 6d). The target cytidine (C₀) base interacts with the ²⁵⁷HAE²⁵⁹ sequence that defines the zinc-binding cytidine deaminase domain and Y315 (loop 7). The cytidine base at the −1 position (C₋₁) interacts with the ³¹⁶DDQ³¹⁸ sequence in loop 7. The cytidine at the −2 position (C₋₂) interacts with D316 (loop 7) and R374 (α-helix 6). The W211 residue interacts with both the cytidine at the −2 position (C₋₂) and the thymidine at the −3 position (T₋₃). The H216 residue interacts with the adenine at the +1 position (A₊₁). In addition to the interactions with the bases, residues R213, R215, H216, and N244 interact with the phosphate backbone to neutralize the negative charges. A variety of interactions that include pi-pi interactions, hydrogen bonds, water-mediated hydrogen bonds, and hydrophobic interactions serve to stabilize the binding and position the target cytidine for deamination. Thus, the specificity of editing site selection is driven by many structural factors and interactions.

Summary
The A3 family of proteins provides potent protection against a wide variety of viruses and intrinsic immunity to the host through deaminase-dependent and deaminase-independent mechanisms of inhibition. Cumulative biochemical and structural knowledge has improved our understanding of how A3 proteins bind to Vif and the recent discovery of A3F binding to CBFβ suggests that CBFβ has a direct role in recruiting A3F, and likely other A3 proteins, to the CRL5 E3 ubiquitin ligases. We are also beginning to gain a structural understanding of how A3 proteins bind to the ssDNA substrate and select the sites of cytidine deamination. Importantly, A3 residues distal to the catalytic site, including residues in the non-catalytic N-terminal domain, can influence the sites of cytidine deamination.
Even with the remarkable advances in our understanding of the structure and function of A3 proteins, important questions remain unanswered. Although the structure of a full-length rhA3G protein was recently determined [131], there is so far no structure of a full-length human double-deaminase domain A3 protein, and we do not know how the two CD domains in human A3 proteins are oriented with respect to each other. Resolving the structure of a full-length A3G and/or A3F could provide vital insights into how these proteins bind to their long ssDNA substrates. We need to obtain structures of other A3 proteins bound to the Vif-CBFβ-CRL5 E3 ubiquitin ligase complex and compare these structures with the Vif-CBFβ-A3F CTD complex to determine how their Vif-A3 interactions differ. Importantly, a structural understanding of how A3 proteins inhibit viral replication through deaminase-independent mechanisms is needed. A thorough structural understanding of the Vif-A3 interactions could facilitate the rational design and development of novel therapeutics in the fight against AIDS.