Structural and Functional Insights into Foamy Viral Integrase

Successful integration of retroviral DNA into the host chromosome is an essential step for viral replication. The process is mediated by virally encoded integrase (IN) and proceeds through 3'-end processing and the strand transfer reaction. In vitro reaction conditions for the three distinct domains of prototype foamy viral integrase (PFV-IN), such as substrate specificity, cofactor usage, and cellular binding partners, have been described well in several reports. Recent studies on the three-dimensional structure of the complexes between PFV-IN and DNA, cofactors, binding partners, or inhibitors have explored the mechanistic details of such interactions and shown the value of PFV-IN as an important target for developing anti-retroviral drugs. The presence of a potent, non-transferable nuclear localization signal in the PFV-IN C-terminal domain extends its use as a model for investigating cellular trafficking of large molecular complexes through the nuclear pore complex and for identifying novel cellular targets for such trafficking. This review focuses on recent advancements in the structural analysis and in vitro functional aspects of PFV-IN.


Introduction
Integration of the linear viral cDNA into the host cell chromosome occurs through two sequential catalytic events orchestrated by the virally encoded enzyme integrase (IN) and is considered a key step in the retroviral lifecycle [1,2]. IN recognizes attachment sites at long terminal repeats on both termini production of non-infectious viruses [31][32][33]. Experimental evidence suggests two alternative models for Pol encapsidation into virions; however, the exact mechanism remains elusive [34][35][36].
Autocatalytic cleavage of the Pol precursor protein of other retroviruses leads to the production of protease (PR), reverse transcriptase (RT), RNase H (RN), and IN, whereas a single cleavage event in FVs splits the 127-kDa Pol protein into two nuclear proteins: an approximately 85-kDa protein containing the PR, RT, and RN domains and a 40-kDa protein consisting of only the IN domain [37,38].
Among the three domains of retroviral IN proteins, the structure of the NTD in solution reveals a compact three-helical bundle in which Zn²⁺ coordination occurs through His and Cys residues of the invariant HHCC motif [39,40]. Similar to retrotransposons and some bacterial transposases, retroviral IN CCDs harbor a highly conserved D,D-35-E amino acid sequence motif [8,9], and the invariant D and E residues (D64, D116, and E152 for HIV-1) are responsible for catalysis of all enzymatic activities, such as 3' processing, DNA strand transfer [8,41], and disintegration [9,10]. The CTD is rich in K and R residues and has non-specific DNA-binding activity [42][43][44].
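The spacing encoded in the motif name can be checked with a few lines of Python. The sketch below is illustrative only: the helper name `find_dd35e` and the poly-alanine toy sequence are hypothetical, and only the 35-residue gap between the second Asp and the Glu is enforced, because the text does not fix the spacing between the two Asp residues. For the HIV-1 numbering quoted above, D116 and E152 are separated by 152 - 116 - 1 = 35 residues.

```python
def find_dd35e(seq, spacer=35):
    """Return 1-based (D, D, E) position triples consistent with D,D-35-E.

    The second D and the E must be separated by exactly `spacer` residues;
    any D upstream of the second D is accepted as the first D (an
    illustrative simplification, not a rule taken from the review).
    """
    hits = []
    for j, residue in enumerate(seq):        # j: 0-based index of the second D
        k = j + spacer + 1                   # 0-based index of the candidate E
        if residue == "D" and k < len(seq) and seq[k] == "E":
            for i in range(j):               # scan upstream for a first D
                if seq[i] == "D":
                    hits.append((i + 1, j + 1, k + 1))
    return hits


# Toy sequence: poly-alanine with acidic residues placed at the HIV-1
# positions named in the text (D64, D116, E152). Not a real IN sequence.
toy = ["A"] * 160
for position, amino_acid in ((64, "D"), (116, "D"), (152, "E")):
    toy[position - 1] = amino_acid
toy = "".join(toy)

print(find_dd35e(toy))  # [(64, 116, 152)]
```

Applied to a real IN sequence, such a scan would return every candidate triple, so the result would still need to be compared against the annotated catalytic residues.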
Protein mixing experiments have revealed similar IN domain organizations in Gammaretrovirus (murine leukemia virus and feline leukemia virus) [45] and Spumavirus [46]. PFV IN is enzymatically active in vitro and is homologous to other retroviral INs [47,48]. Although the three domains of retroviral INs typically comprise ~290 amino acid residues joined by linkers, PFV IN is significantly longer and comprises about 392 residues [26,49]. This raised the possibility that an additional domain of approximately 50 residues is present at the N-terminus, preceding the NTD. Crystallographic analysis of PFV IN and DNA complexes has revealed an extra domain, termed the N-terminal extension domain (NED) [25], which has no role in in vitro activity [24] but is essential for complexing with viral DNA (vDNA) [6].
Although numerous efforts have focused on structural analysis of HIV IN to develop new antiviral compounds and to unveil the enzymatic mechanisms, the exact three-dimensional structure of the full protein and its functional mechanism are still unknown [50,51]. The requirement for high concentrations [52] of long viral DNA substrates [53][54][55] and for LEDGF (lens epithelium-derived growth factor)-complexed IN has made X-ray crystallographic analysis of the HIV-1 intasome a difficult task. In contrast, its remarkable solubility and its ability to use short viral DNA substrates [24] in concerted integration assays have made the IN from PFV an excellent ortholog of HIV IN for in vitro experimentation and crystallographic determination of the retroviral intasome. The crystal structure of the PFV intasome has provided insight into the HIV-1 drug resistance mechanism, as PFV IN shares 22% amino acid sequence similarity with HIV-1 IN within the active sites of the CCDs [25,56].
Symmetric intasome complexes assembled with full-length PFV IN, Zn²⁺, and pre-processed U5 viral DNA substrate retained concerted integration activity [25]. The PFV system has yielded intasome crystals that share the basic Zn²⁺-IN-vDNA assembly and differ in the presence of ligands: Mn²⁺ or Mg²⁺, target DNA, or inhibitors [25]. The intasome structure is composed of a dimer of IN dimers and a pair of synapsed vDNA ends. The two pairs of IN subunits are functionally and structurally distinct, as predicted from partial structures [57,58]. The inner subunits adopt an extended conformation and are responsible for all interactions with viral DNA and for catalysis. The outer IN subunits do not interact with each other or with the viral DNA and seem to play a supporting role. Interaction between molecules within each dimer occurs through an extensive CCD-CCD interface (reviewed in [59]). However, the NTDs and CTDs of the outer subunits are disordered in the crystals, and their functions are currently unknown [6,60]. Soaking PFV intasome crystals in MnCl₂ revealed metal-bound intasomal active sites [25], supporting the previous prediction that two metal ions are present at each active site [61].

Biochemical and Enzymatic Properties of FV Integrase
The function of IN is to insert viral DNA into the host genome, and the reaction is accomplished by 3'-processing and DNA strand transfer steps, which are bimolecular nucleophilic substitution (SN2) reactions. Both steps are accomplished by metal-dependent nuclease and nucleotidyl transferase action, as in transposases and RNase H enzymes, and both are catalyzed by a triad of acidic residues in the characteristic D,D-35-E motif found in all retroviral INs [62][63][64]. The catalytic site residues coordinate two Mg²⁺ ions, which activate the attacking nucleophile (the oxygen atom of a water molecule for 3'-processing and the 3'-hydroxyl of viral DNA for strand transfer) and destabilize the scissile phosphodiester bonds [65][66][67]. During the processing step, IN removes the terminal dinucleotides (GT in HIV-1, TT in avian sarcoma leukosis virus, ASLV) from each 3'-end of the double-stranded viral DNA. However, FV integration does not involve the removal of two terminal nucleotides from the U3 region. Instead, two terminal nucleotides (AT) are removed from the U5 region in order to expose the subterminal CA for the joining reaction to host cell DNA [68]. In the second step, the free 3'-hydroxyl of the viral DNA acts as the nucleophile and attacks the target chromosomal DNA, resulting in covalent joining of the two molecules. The subsequent removal of the two unpaired nucleotides at each 5'-overhanging end of the viral DNA and filling of the gaps are most likely performed by host enzymes. Although HIV-1 IN has long been studied to develop inhibitors, the three-dimensional structure of intact HIV-1 IN is unknown because of its poor solubility. Attempts have been made to overcome this problem by studying individual domains and by introducing a mutation into the catalytic domain [69]. Studies over the past decade have determined the structures of all three HIV-1 IN domains [12,39,40,44].
Structural characterization of the intasome, which comprises the IN, viral DNA, and associated host cellular proteins, will help gain insight into inhibitor function. The first retroviral intasome that has been successfully characterized is the PFV intasome [25].
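The 3'-processing step described in this section, in which a terminal dinucleotide is removed to expose the subterminal CA, can be pictured as a simple string operation. The following is a minimal sketch under stated assumptions: the function name `three_prime_process` is hypothetical, and the toy strands are invented except for their final CA plus dinucleotide (AT for the PFV U5 end, GT for HIV-1). It models only the sequence outcome, not the chemistry.

```python
def three_prime_process(strand, n_removed=2, subterminal="CA"):
    """Trim n_removed nucleotides from the 3' end of a strand written 5'->3'.

    Returns (processed_strand, released_nucleotides). Raises ValueError if
    the processed strand does not end in the expected subterminal
    dinucleotide, which carries the 3'-hydroxyl used in strand transfer.
    """
    strand = strand.upper()
    if len(strand) < n_removed + len(subterminal):
        raise ValueError("strand too short to process")
    processed, released = strand[:-n_removed], strand[-n_removed:]
    if not processed.endswith(subterminal):
        raise ValueError("no subterminal %s before the removed nucleotides" % subterminal)
    return processed, released


# Toy 3' ends: only the terminal ...CA + dinucleotide reflects the text;
# the upstream bases are made up for illustration.
toy_ends = {"PFV U5": "ACGTTGCAAT", "HIV-1": "ACGTTGCAGT"}
for name, strand in toy_ends.items():
    processed, released = three_prime_process(strand)
    print(name, processed, released)
```

In both toy cases the processed strand ends in CA, whose 3'-hydroxyl is the nucleophile for the subsequent joining reaction.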
Several detergents are generally used to improve the solubility of recombinant IN, and these might affect the in vitro enzymatic activities of INs. We investigated the effect of various chemicals on feline foamy viral integrase (FFV IN) enzymatic activities [26], evaluating the potential effects of CHAPS, glycerol, Tween 20, and Triton X-100 on 3'-processing activity. Although glycerol and Triton X-100 had no inhibitory activity up to a 50% final concentration, the detergent CHAPS showed inhibitory effects at 10 mM. However, no effect of Tween 20 was observed at high concentrations, owing to its viscous nature.
The isolated CCDs from HIV-1 [70] and ASLV [71] IN proteins failed to execute 3' processing and DNA strand transfer activities. Evidence from several studies indicates that a tetrameric form of the retroviral IN is required for concerted in vitro integration [72,73]. Mixtures of HIV-1 IN NTD and CTD deletion mutants, each defective on its own, support 3' processing and DNA strand transfer activities, suggesting that these enzymes function as multimers [74,75]. HIV-1 IN exists in a tetramer-dimer equilibrium, with the tetrameric form predominating at concentrations as low as 0.2 µM, whereas PFV IN mostly exists as a monomer at concentrations <30 µM [4,76]. Size exclusion chromatography has failed to reveal multimeric forms of PFV IN [24].
PFV IN shows efficient catalytic properties that make it a more convenient IN than HIV IN for studying different enzymatic activities [77]. A very large excess of enzyme over DNA (usually >30:1) is required for 3'-processing by HIV-1 IN, which hampers study of its catalytic properties. PFV IN forms a relatively stronger complex with synthetic DNA duplexes that imitate the terminal sequence of the U5 region of the viral DNA long terminal repeat [78]. Fluorescence polarization measurements of the DNA-binding kinetics at 25 °C revealed that the DNA-HIV IN complex forms in 3-4 minutes, which is approximately five times longer than the time required for DNA association with PFV IN [78]. This observation indicates that PFV IN binds DNA more favorably [77].
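The figures quoted above can be turned into rough working numbers. The short calculation below is only a back-of-the-envelope sketch: the 10 nM DNA concentration is an assumed example value that does not come from the cited studies, and the variable names are illustrative.

```python
# Figures quoted in the text.
hiv_assoc_min = (3.0, 4.0)   # DNA-HIV IN complex formation time, in minutes
fold_slower = 5.0            # "approximately five times longer" than for PFV IN
min_enzyme_ratio = 30        # HIV-1 IN:DNA excess usually needed for 3'-processing

# Implied DNA association time for PFV IN, in seconds.
pfv_assoc_s = tuple(t * 60.0 / fold_slower for t in hiv_assoc_min)

# Hypothetical assay with 10 nM DNA substrate (assumed value, not from the text).
dna_nM = 10.0
hiv_in_needed_nM = dna_nM * min_enzyme_ratio

print(pfv_assoc_s)        # (36.0, 48.0)
print(hiv_in_needed_nM)   # 300.0
```

Under these assumptions, DNA association with PFV IN would take well under a minute, while an HIV-1 IN 3'-processing assay at a 30:1 ratio would need roughly 300 nM enzyme for 10 nM substrate.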

Functional Aspects: In Vivo and in Vitro
Retroviral INs belong to the retroviral IN superfamily (RISF), which comprises nucleic acid-metabolizing enzymes such as RNase H, RuvC (an endonuclease encoded by the Escherichia coli ruvC gene), bacteriophage MuA transposase, and Argonaute, the nuclease component of the RISC complex [62]. These enzymes are characterized by electronegative D and E side chains in their catalytic domains [12,64]. Observations from solution biochemistry [73,79], chemical cross-linking [72], and structure-based approaches [25,80] have revealed that an IN tetramer is required for DNA strand transfer activity, whereas an IN dimer as well as higher multimers can catalyze 3' processing in vitro [57,72]. However, the PFV IN tetramer efficiently processes a pair of LTR ends in crystallo [65].
Recombinant PFV IN is capable of executing 3'-processing and half-site strand transfer in vitro [81,82]. One study found that the enzyme also catalyzes concerted integration in vitro under stringent conditions [24]. In the case of HIV-1 IN, unprocessed DNA molecules of several hundred base pairs and an optimal concentration of DMSO and/or PEG are required for concerted integration in vitro [53,54]. In contrast, PFV IN can use preprocessed oligonucleotide donor DNA substrates as short as 16 bp to carry out almost exclusively concerted strand transfer in vitro under physiological conditions [24]. Unlike lentiviral INs, the Spumavirus IN processes only one end (U5) of the cDNA [83].
PFV IN is highly proficient at concerted integration in vitro compared with HIV-1 IN, which generates a considerable amount of half-site products. A higher concentration of HIV-1 IN reduces the efficiency of concerted integration, and the optimum concentration used is 5-15 nM [79]. These observations indicate that a lower multimeric form is required before HIV-1 IN interacts with its donor DNA substrate. PFV IN, however, is less prone to form multimers, which allows the concerted retroviral integration process to be studied at higher IN and donor DNA inputs and the products to be visualized [24]. These properties of PFV IN are extremely helpful in structural studies of the retroviral synaptic complex (IN bound to long terminal repeat DNA ends).
The LTR sequence of viral DNA is the only sequence that can be recognized by retroviral IN and is necessary for the enzymatic activities to proceed successfully [84,85]. We investigated the substrate specificity of three retroviral INs by incubating them with LTR substrates derived from HIV-1, PFV, and FFV. HIV-1 IN and PFV IN were highly specific and cleaved only their own substrates, whereas FFV IN was non-specific and efficiently cleaved its own substrate as well as the HIV-1 and PFV LTR substrates [26].
Retroviral integration is not entirely random and shows genus-specific biases in target DNA sequence selection at the genomic scale [86]. Members of the Lentivirus genus, including HIV-1, integrate along the bodies of active genes, although integration negatively correlates with the uppermost levels of transcriptional activity [87], whereas members of the Gammaretrovirus genus favor integration at promoter sites and CpG islands [88][89][90]. PFV, however, shows a reduced frequency of integration within transcription units and therefore has fewer effects on local gene expression [91,92].

Metal Use Is Distinct among Retroviruses
Retroviral IN has two metal-binding domains: the NTD, which contains a zinc finger motif that binds Zn²⁺, and the CCD, which contains the conserved D,D-35-E motif and tends to bind Mn²⁺ or other metal cofactors [8,93]. Mg²⁺ is a natural cofactor because of its abundance in vivo; however, Mn²⁺ also acts as a cofactor in vitro [76,94].
Similar to other members of the nucleotidyl transferase protein family, the IN active site is composed of acidic residues and performs its enzymatic activities through a two-metal-cation mechanism, and the nature and number of divalent metal cations required for catalysis have already been identified [95,96]. Processing and joining are the two IN enzymatic activities performed by the CCD, with Mg²⁺ or Mn²⁺ cations acting as cofactors in vitro. Some studies have investigated active site complex formation using physiologically irrelevant but more strongly binding metals, such as Zn²⁺, Cd²⁺, or Ca²⁺ with ASV IN [95] and Cd²⁺ with HIV-1 IN [96]; however, a recent study reported a PFV IN intasome assembly with two Mg²⁺ or Mn²⁺ cations at the active site [25]. Zinc is coordinated because of its ability to accept four tetrahedrally arranged ligands. Although it supports endonucleolytic activity in vitro, it cannot be used as a cofactor for IN catalysis in vivo [97].
We investigated the metal requirements for FV IN enzymatic activities by using FFV IN. Our results demonstrated that, similar to other retroviral INs, FFV IN successfully utilizes Mn²⁺ or Mg²⁺ as a cofactor in three different enzymatic reactions [26]. We also screened the capacity of FFV IN to utilize other metal ions as cofactors. Besides Mn²⁺ and Mg²⁺, Co²⁺ and Zn²⁺ ions act as cofactors for the 3'-processing activity of FFV IN, as they induce enzymatic activity in the absence of the Mn²⁺ ion and show concentration-dependent induction [26].

Structural Basis of IN Inhibitor Action
Zidovudine (AZT) was the first antiretroviral agent, introduced in 1986 [98], and has been utilized as a therapeutic agent in antiretroviral therapy [99]. A total of 26 antiretroviral compounds classified into six classes (nucleoside/nucleotide reverse transcriptase inhibitors, N(t)RTIs; non-nucleoside reverse transcriptase inhibitors, NNRTIs; protease inhibitors, PIs; fusion inhibitors; co-receptor blockers; and integrase strand transfer inhibitors, INSTIs) are currently approved for clinical use [100]. Raltegravir is an INSTI that strongly inhibits integration and reduces viral load [101]. Elvitegravir, an INSTI evaluated in Phase III trials, has been approved by the FDA for use in HAART [102][103][104]. Infectivity of PFV in cell culture and the in vitro enzymatic activities of PFV IN are highly sensitive to raltegravir and elvitegravir [24]. The high degree of amino acid sequence similarity within the active sites of HIV-1 and PFV INs and the superposition of active site residues in crystals [24] imply that HIV-1 IN strand transfer inhibitors may be equally active against PFV IN [24,105,106]. Soaking PFV intasome crystals in raltegravir or other INSTIs has revealed that these compounds bind the active site and interact through their oxygen atoms with the metal ions bound there [25]. Crystallographic analysis of the interactions between the PFV intasome and INSTIs has explained the divergent binding mechanisms within the active site and has thus opened the possibility of characterizing raltegravir-resistance mutations [25].

NLS of FV Integrase
Successful infection and replication of retroviruses in their host depend largely on the reverse transcription of their RNA genome to cDNA and its integration into the host cell chromosome (reviewed in [81]). Integration occurs in the nucleus; therefore, viral cDNA complexed with other viral and cellular proteins forms a large nucleoprotein complex termed the preintegration complex (PIC), which translocates from the cytoplasm to the nucleus [107]. Various cellular nuclear-import receptors as well as viral karyophilic proteins have been suggested to be involved in the successful translocation of the PIC into the nucleus. Among the cellular nuclear-import receptors, the involvement of the importin α/β heterodimer [108][109][110], importin 7 [111,112], and transportin 3 (TNPO3/Transportin-SR2) [113][114][115] has been extensively studied. Another cellular protein, lens epithelium-derived growth factor/p75 (LEDGF/p75), plays a vital role in HIV-1 PIC translocation into the nuclei of infected cells [116][117][118]. TNPO3/Transportin-SR2 is a member of the importin β family that recognizes serine-arginine-rich repeats within precursor-mRNA splicing factors and subsequently transports these factors into the nucleus [119,120], suggesting a possible role for TNPO3/Transportin-SR2 in transporting the HIV-1 PIC into the nucleus. Moreover, RNAi-mediated knockdown of TNPO3/Transportin-SR2 inhibits HIV-1 replication after reverse transcription and before the covalent attachment of viral cDNA to host chromosomal DNA [113]. These findings imply that TNPO3/Transportin-SR2 is necessary either for viral cDNA transport or for optimal integration activity [114]. Among the suggested karyophilic viral proteins, such as Matrix (MA), Vpr, and IN, integrase remains a reasonable candidate for a functionally relevant target of nuclear import factors, as it plays an essential role in the nucleus during the early steps of infection [114].
Screening of cellular proteins for possible binding partners of IN revealed TNPO3/Transportin-SR2 as an integrase-binding protein. Moreover, a recent study based on co-immunoprecipitation (co-IP) and immunostaining experiments and the use of cell-permeable functional peptides suggested that, in addition to importin α, TNPO3 is involved in nuclear import of IN in virus-infected cells [121]. Retroviral INs have nuclear localization signals (NLS) [122] and therefore perform two important roles in the viral life cycle: they transport viral cDNA from the cytoplasm to the nucleus and integrate viral cDNA into the cellular genomic DNA [62,84]. The position of the NLS within IN differs among retroviruses. HIV-1 IN possesses a basic bipartite-type NLS at residues 186-188 and 211-219 [123]. An additional NLS in the central domain of the HIV-1 IN has been reported at residues 161-173 [124]. In the feline immunodeficiency virus IN, the karyophilic determinant has been mapped to the highly conserved N-terminal zinc-binding HHCC motif. The karyophilic determinant for ASLV IN has been proposed to be a noncanonical NLS that maps to six basic amino acids located within residues 207-235 of the protein [125]. Besides IN, other viral proteins with an NLS sequence include the matrix (MA) domain of Gag and Vpr of lentiviruses, which are thought to be important for importing the PIC into the nuclei of growth-arrested cells [126][127][128].
Several studies have indicated that FV IN contains NLS sequences [129,130]. Experiments with a monoclonal antibody targeted to FV IN suggest that the IN domain of the Pol protein has an NLS responsible for import into the nucleus [35]. It has been suggested that, as a part of the PFV PIC, the Gag protein translocates to the nucleus and facilitates integration by tethering the viral DNA genome to the host cell chromatin [131], but a separate study showed that translocation of Gag and the viral genome to the nucleus is dependent on IN in both cycling and growth-arrested cells [132]. It has also been shown that the FV genome and Gag can enter the nuclei of growth-arrested cells, indicating that the FV PIC, like that of lentiviruses, can cross an intact nuclear membrane [132]. However, cells transduced with IN-deficient vectors reveal no Gag or viral genome within nuclei, suggesting that Gag alone is not critical for PIC transport to the nucleus, whereas IN is an absolute requirement [132]. This study did not find any dependence of Gag nuclear localization on IN in cycling cells. Therefore, in cycling cells, transport of Gag to the nucleus appears to be independent of IN [132].
We investigated the karyophilic determinant of PFV IN, and our results demonstrated that PFV IN possesses a potent NLS in its C-terminal domain spanning residues 289-371 [129]. Studies on seven point mutants created in that region by changing the basic amino acids lysine or arginine to threonine or proline revealed that mutations at positions 305 and 315 did not affect nuclear localization of PFV-IN, whereas mutations at positions 308, 313, 318, 324, and 329 significantly altered nuclear localization. Further observations indicate that the PFV-IN NLS, which has five discontinuous basic amino acids, is non-transferable and different from classical NLSs.
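The mutagenesis result can be summarized as a small lookup table, which makes the discontinuous spacing of the critical basic residues explicit. This is only a bookkeeping sketch: the variable names are illustrative, and the table records positions and outcomes exactly as stated above without specifying which residue (lysine or arginine) occupies each position.

```python
# Residue range of the C-terminal region carrying the NLS (289-371 inclusive).
nls_region = range(289, 372)

# Position of each point mutant -> did the substitution alter nuclear
# localization of PFV-IN? Values follow the description in the text.
mutant_alters_localization = {
    305: False, 308: True, 313: True, 315: False,
    318: True, 324: True, 329: True,
}

# Positions whose mutation altered localization, and the spacing between them.
critical = sorted(p for p, altered in mutant_alters_localization.items() if altered)
gaps = [b - a for a, b in zip(critical, critical[1:])]

print(critical)  # [308, 313, 318, 324, 329]
print(gaps)      # [5, 5, 6, 5]
print(all(p in nls_region for p in mutant_alters_localization))  # True
```

The five critical positions are separated by five or six residues each, which illustrates why the signal is described as discontinuous rather than a contiguous basic cluster.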

Conclusion
Although much effort has long been devoted to using HIV-1 IN as a target for anti-retroviral drug development, the exact structure of the full-length protein remains unknown. The rapid emergence of HIV-1 strains resistant to IN inhibitors necessitates knowledge of the structural details of active site geometry and IN domain interactions in order to develop more effective drugs. PFV IN is equally sensitive to HIV-1 IN inhibitors; therefore, the crystal structures of the PFV intasome, the only retroviral intasome determined so far, will help to elucidate the structural basis of HIV-1 resistance to clinical INSTIs. This review provides insights into the biochemical and enzymatic activities of FV IN and the reasons for using FV IN as a model system to investigate the structural basis of anti-retroviral therapy as well as to improve the safety of FV vector systems.