HIV-1 Tat: Its Dependence on Host Factors is Crystal Clear

HIV-1 transcription is regulated at the level of elongation by the viral Tat protein together with the cellular elongation factor P-TEFb, which is composed of cyclin T1 and Cdk9 subunits. The crystal structure of a Tat:P-TEFb complex (Tahirov, T.H.; Babayeva, N.D.; Varzavand, K.; Cooper, J.J.; Sedore, S.C.; and Price, D.H. Crystal structure of HIV-1 Tat complexed with human P-TEFb. Nature 2010, 465, 747–751.) reveals molecular details of Tat and its interactions that have eluded investigators for more than two decades and provides provocative insights into the mechanism of Tat activation.

residues in the C-terminal domain (CTD) repeats of RNAP II to generate the processive IIo form of the enzyme [2]. P-TEFb also exists in a catalytically inactive form bound to the 7SK snRNP, which is composed of 7SK snRNA and the Larp7, Mepce, and Hexim1 or Hexim2 proteins [2,8,9,10]. Tat promotes dissociation of the 7SK snRNP to activate Cdk9 [11][12][13], and recent evidence suggests that the inhibited complex, together with Tat, is recruited to the HIV-1 promoter during initiation and that Tat and P-TEFb are later transferred to TAR on the nascent RNA, thereby releasing the inhibitory 7SK snRNP [13]. Despite this progress on the molecular mechanism of Tat activation, the structures of Tat and of its complexes with interacting partners have remained elusive. The NMR structure of TAR RNA bound to a small RNA-binding peptide from Tat has been known for some time [14][15][16], but no structural details about the Tat activation domain or its relationship to the transcriptional machinery had been revealed. That is, until the recent high-resolution crystal structure of the Tat:P-TEFb complex from Tahirov, Price, and co-workers [1]. This tour-de-force study, which relied on painstaking purification of the complex co-expressed from baculovirus vectors, provides deep insights into the assembly of this viral-host protein complex and helps explain how Tat subverts P-TEFb to regulate HIV-1 transcription. This work follows on the heels of a previous breakthrough crystal structure of P-TEFb, which showed plasticity of the cyclin T1-Cdk9 interface and the importance of Cdk9 phosphorylation for activation of the kinase and substrate recognition [17].
Tat is encoded by two exons. The first exon is sufficient for viral transcription and contains an activation domain (AD; residues 1-48), an RNA-binding domain (RBD; residues 49-57), and a C-terminal extension (residues 58-72) (Figure 1). The second exon (residues 73-86 or 73-101, depending on the isolate) does not have a primary role in viral transcription but may have other functions during viral replication [18]. In the 2.1 Å crystal structure reported by Tahirov et al., the Tat AD is found to adopt a relatively extended conformation upon interaction with P-TEFb (Figure 1), forming an extensive interface with both CycT1 (88% of the covered Tat surface) and Cdk9 (12% of the covered surface) (Figure 2), with a total buried surface area twice the average for a stable protein-protein interaction [19]. The cysteine-rich portion of the Tat AD is folded into a compact structure with two α-helices that coordinate two Zn ions. It is clear from the structure that the Tat AD by itself does not adopt a stable fold, and its extended conformation in some ways resembles ribosomal proteins that fold only in the context of the rRNA scaffold [20]. The Tat RBD and C-terminal extension (residues 50-86) are not observed in the structure, probably because the complex lacks TAR, which is believed to help define an as yet unknown conformation of the arginine-rich RBD [14][15][16].
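For readers who want to map individual positions onto the exon 1 modules described above, the boundaries can be written down as a small lookup. This is only an illustrative sketch: the residue ranges are those quoted in the text, while the names `TAT_EXON1_MODULES` and `module_for_residue` are hypothetical and not part of any published tool.

```python
# Residue ranges (1-based, inclusive) for the Tat exon 1 modules as quoted
# in the text. The dictionary and function names are illustrative assumptions.
TAT_EXON1_MODULES = {
    "AD": (1, 48),                     # activation domain
    "RBD": (49, 57),                   # RNA-binding domain
    "C-terminal extension": (58, 72),
}


def module_for_residue(position):
    """Return the exon 1 module containing `position`, or None if outside exon 1."""
    for name, (start, end) in TAT_EXON1_MODULES.items():
        if start <= position <= end:
            return name
    return None


if __name__ == "__main__":
    # Positions discussed in the text: the Zn-coordinating Cys22 and the
    # conserved Lys28 both fall in the activation domain.
    for pos in (2, 22, 28, 52, 60, 80):
        print(pos, module_for_residue(pos))
```

Positions beyond residue 72 return `None` here because they belong to the second exon, whose length depends on the isolate.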
Earlier biochemical data demonstrated that the Tat AD coordinates two Zn ions through cysteine-rich zinc finger (ZnF)-like motifs and that Cys261 in CycT1 forms a Zn-mediated bridge to Tat (Figure 2) [7,21]. The precise type of metal coordination remained unclear because the ZnF motifs did not correspond to any particular consensus sequence, but the structure now shows that one Zn ion is tetrahedrally coordinated by Cys22, His33, Cys34, and Cys37 of Tat and the second by Cys25, Cys27, and Cys30 of Tat and most likely Cys261 of CycT1 (Figure 1). No density was observed for residues 253-266 of CycT1, which correspond to the previously defined Tat:TAR recognition motif (TRM) important for RNA binding [7] (dashed line, Figure 2), and a broadened density was observed for Cys261, consistent with biochemical data suggesting this region has high mobility or intrinsic disorder [22]. It remains possible that a different residue from a disordered part of Tat occupies the position ascribed to Cys261. Nonetheless, the observed coordination patterns neatly correspond to earlier mutagenesis and binding studies of Tat indicating the importance of these particular residues in Zn-dependent TAR binding and Tat activation [7,23]. Many human transcription factors contain cysteine- or histidine-containing ZnFs that typically fold into modular domains that recognize DNA or RNA or mediate protein-protein interactions [24,25]. The Zn domains in Tat bear no structural similarity to the known classes of ZnFs or other metalloproteins, and the reliance on CycT1 to complete the metal coordination indicates how dependent Tat structure is on its host partner. Moreover, the complex displays a large number of intermolecular hydrogen bonds between Tat and P-TEFb compared to few intramolecular contacts within Tat alone, further implying that Tat folding is primarily determined by the P-TEFb interface.
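The coordination scheme just described can be tabulated to make explicit that each Zn ion has four ligands and that only one site depends on a host residue. The following is a minimal sketch under stated assumptions: the ligand lists come from the text, the site labels ZnF1 and ZnF2 follow the Figure 1 caption, and the data layout and function names are hypothetical. Cys261 is flagged as tentative because the text describes its assignment as "most likely".

```python
# Zn ligands as listed in the text: (protein, residue type, position).
# Site labels follow the Figure 1 caption; the layout is an illustrative assumption.
ZINC_SITES = {
    "ZnF1": [("Tat", "Cys", 22), ("Tat", "His", 33),
             ("Tat", "Cys", 34), ("Tat", "Cys", 37)],
    "ZnF2": [("Tat", "Cys", 25), ("Tat", "Cys", 27),
             ("Tat", "Cys", 30), ("CycT1", "Cys", 261)],
}

# Ligands whose assignment the text describes as likely rather than certain.
TENTATIVE_LIGANDS = {("CycT1", "Cys", 261)}


def ligand_count(site):
    """Number of ligands listed for a site (four is expected for tetrahedral Zn)."""
    return len(ZINC_SITES[site])


def host_ligands(site):
    """Ligands contributed by a protein other than Tat."""
    return [lig for lig in ZINC_SITES[site] if lig[0] != "Tat"]


if __name__ == "__main__":
    for site in ZINC_SITES:
        tentative = [lig for lig in ZINC_SITES[site] if lig in TENTATIVE_LIGANDS]
        print(site, ligand_count(site), host_ligands(site), tentative)
```

Laid out this way, the first site is satisfied by Tat alone, whereas the second needs a fourth ligand from outside the Tat AD, which is the point the structure makes about host dependence.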
Figure 1.
The sequence of Tat exon 1 (residues 1-72) is shown along with its different modules: AD (residues 1-48), RBD (residues 49-57) and C-terminus (residues 58-72). Within the AD, the two zinc-finger motifs, ZnF1 and ZnF2, are shown in red and orange. The metal coordination of ZnF2 is completed by C261 of CycT1 (cyan). Only the Tat AD is observed in the structure and becomes folded upon the P-TEFb interaction (chain C in PDB 3MI9). The structure representations shown in Figures 1 and 2 were generated using Chimera software (UCSF).

Figure 2.
Structure of the Tat:P-TEFb complex. The Tat AD (gray) is observed to interact primarily with CycT1 (cyan) but also with the Cdk9 subunit (blue). The dashed line represents the distance between the last residue in CycT1 that is ordered in the structure (L252) and C261 (PDB 3MI9). TRM denotes the Tat-TAR Recognition Motif in CycT1 [7] (residues 250-262) that is disordered in the Tat:P-TEFb complex.
One can envisage at least two practical advantages for a protein, particularly from a small virus, to evolve such a strong dependence on interacting partners to adopt structure: First, an intrinsically disordered or structurally flexible protein may adapt to more than one binding partner and facilitate competing interactions that may be required in a temporal manner for function [26,27]. For example, Tat may engage proteins other than P-TEFb during transcription initiation and the transition into elongation, including other host elongation factors or 7SK snRNP components [13,28,29]. Second, a virus with a very limited genome, such as HIV-1, can optimize its coding capacity by evolving small proteins that do not possess extensive polypeptide chains needed to stabilize a protein fold. Rather, they can rely on existing structural scaffolds provided by the host, provided that they evolve a sufficiently tight and specific protein-protein interface. The use of Zn domains to help fold portions of Tat also is economical, as exogenous metals provide a good way to stabilize small protein modules [30]. HIV-1 also appears to have evolved other strategies to make economical use of its coding capacity. For example, the small viral protein Rev uses an adaptable binding surface to recognize multiple sites on the RRE RNA and form a large homo-oligomeric assembly that exports viral RNAs from the nucleus to the cytoplasm [31].
Another possible advantage to structural flexibility is the ability of a virus to populate large areas of sequence space that include functionally permissive mutations. Indeed, Tat tolerates up to 40% sequence variation without noticeable loss of transcriptional activity [32]. For the conserved residues, the Tahirov et al. structure is satisfyingly consistent with many observed evolutionary constraints.
Most obviously, the amino acids that confer tetrahedral geometry for metal coordination are fully conserved, as are amino acids in the protein-protein interface where mutations would cause clashes with P-TEFb. The need for other conserved positions is not yet explained. For example, position 2 is always a negatively charged amino acid (Asp or Glu), position 26 is a highly conserved aromatic residue (Tyr or Phe), and position 28 is a highly conserved Lys. Lys28 is acetylated (although not in the structure) and modulates the formation of Tat:P-TEFb complexes on TAR [33]. Acetylation of Lys28, located on the second ZnF, might help stabilize the Tat:P-TEFb protein-protein interface or participate in remodeling when bound to TAR. Indeed, it has been proposed that recognition of the apical loop of TAR, which requires both Tat and CycT1, may involve conformational changes to one or the other protein that consequently facilitate RNA recognition [6,7]. Such questions, and also the requirements of other conserved residues, highlight the importance of visualizing Tat:P-TEFb complexes with RNA and possibly other Tat-host protein complexes. One previous study determined the structure of a related equine infectious anemia virus (EIAV) Tat-CycT1 fusion protein bound to EIAV TAR [34] and found that both Tat and CycT1 contact the RNA hairpin, with the EIAV Tat RBD adopting a helical conformation and flanking regions contributing to assembly of the ternary complex. Strong similarities between the ZnFs and core regions of Tat suggest that the CycT1 interaction surface is the most conserved structural feature, whereas differences between the TARs and RBDs of HIV-1 and EIAV do not allow the HIV-1 interaction to be modeled. It will be interesting to determine if HIV-1 Tat utilizes similar principles of RNA recognition, where residues outside the RBD also coordinate the assembly of the RNA-protein complex.
Tahirov et al. reported the structures of not just one Tat:P-TEFb complex but two, with and without a bound ATP analog [1]. The ATP-bound structure permits comparison with a previously reported P-TEFb structure [17] and suggests that Tat may remodel the complex, which possibly explains how it regulates Cdk9 activity. The interface between CycT1 and Cdk9 involves a relatively small surface [1,17]. Tat inserts itself into a groove at the heterodimer interface, augmenting the surface and possibly creating a more stable and active P-TEFb complex. This insertion leads to some interesting conformational rearrangements. First, Tat unfolds and disorders a small α-helix in CycT1 that is part of the TRM [7], thus exposing a buried surface of CycT1 that allows recognition of the ZnF and adjacent regions of Tat. The TRM is required for TAR loop recognition, raising the possibility that the additional exposed surfaces also might be positioned to contact the RNA as the nascent transcript emerges from RNAP II. Second, the conformational changes in CycT1 cause an 8.5° rotation in the position of the Cdk9 subunit and shift the positions of a phosphorylated Thr loop (T-loop) in Cdk9 (Figure 2), an autophosphorylation site that promotes kinase activity [17], and two other loops near the ATP-binding site. These concerted rearrangements may help activate P-TEFb and modify the substrate specificity of Cdk9, possibly staging phosphorylation so that Ser5 of the RNAP II CTD becomes modified first in paused transcription complexes and Ser2 becomes modified later as TAR emerges and the transition into productive elongation complexes ensues [35,36].
One caveat to comparing Tat-bound P-TEFb to unbound P-TEFb is that CycT1 in the unbound structure [17] contained three mutations (Q77R, E96G, and F241L). E96 is located close to the Tat and Cdk9 interfaces, so loss of its contacts to neighboring arginines potentially could explain some of the conformational differences ascribed to Tat binding. This caveat notwithstanding, it is tempting to speculate that some of the proposed Cdk9 rearrangements induced by Tat, or changes in kinase activity, also may lead to release of the 7SK snRNP, based on how Hexim1 of the 7SK snRNP is believed to inhibit P-TEFb and by analogy to how p27Kip1 inhibits Cdk2:cyclin A [1]. Further structural and functional studies will be needed to illuminate how the inhibitory 7SK snRNP complex is assembled into Tat complexes [13,29], disassembled by Tat [11,12], or ejected upon TAR binding [13,37].
The structure by Tahirov et al. represents a major milestone in Tat and HIV-1 biology and, like all landmark papers, raises a host of interesting new questions. Some of these will be addressed by additional structural studies, most immediately of complexes with TAR. While some is known about how Tat interacts with the bulge region of TAR, the structural basis for TAR recognition by the full Tat:P-TEFb complex, including the TAR loop, remains incomplete. It is possible that the interactions observed or conformational transitions implied by the present structure will differ in the context of RNA or other factors present in transcription initiation or elongation complexes. Some of these may be important for Cdk9 activation or substrate recognition. How the Tat:P-TEFb complex interacts with or is mutually exclusive with the inhibitory 7SK snRNP complex remains to be explored. Novel discoveries are still emerging that suggest molecular mimicry between viral and host components, such as TAR and 7SK snRNA [38], and finding differences between these host and viral-host complexes may reveal unique surfaces for targeted drug design. The structural basis for how posttranslational modifications of Tat, including Lys acetylation and Arg methylation, affect complex formation and activity is not understood. And the broader implications of the structure for Tat inhibitor design remain for the future. Obvious targets seem to be regions of P-TEFb that interact with Tat but are not used for its normal cellular function, which still must be defined. The Tat:P-TEFb structure provides a wonderful starting point for deeper studies of the mechanism of Tat activation, inhibition, and control of cellular transcription elongation. Once again, viral structural biology provides a window into host cell biology.