Determinants of Genomic RNA Encapsidation in the Saccharomyces cerevisiae Long Terminal Repeat Retrotransposons Ty1 and Ty3

Long-terminal repeat (LTR) retrotransposons are transposable genetic elements that replicate intracellularly, and can be considered progenitors of retroviruses. Ty1 and Ty3 are the most extensively characterized LTR retrotransposons whose RNA genomes provide the template for both protein translation and genomic RNA that is packaged into virus-like particles (VLPs) and reverse transcribed. Genomic RNAs are not divided into separate pools of translated and packaged RNAs; therefore, their trafficking and packaging into VLPs requires an equilibrium between competing events. In this review, we focus on Ty1 and Ty3 genomic RNA trafficking and packaging as essential steps of retrotransposon propagation. We summarize the existing knowledge on genomic RNA sequences and structures essential to these processes, and on the role of Gag proteins in repression of genomic RNA translation, delivery to VLP assembly sites, and encapsidation.


Introduction
Long-terminal repeat (LTR) retrotransposons are transposable genetic elements that comprise a significant fraction of many eukaryotic genomes [1,2]. They are progenitors of retroviruses, differing in that they replicate intracellularly [3][4][5]. Like retroviruses, LTR-retrotransposons replicate via an RNA intermediate and insert their double-stranded DNA into the host genome [6][7][8]. Retrotransposons play beneficial roles in host genome remodeling and evolution, but also cause mutations and insertional gene inactivation that can lead to diverse genetic diseases [9,10]. Our knowledge of LTR-retrotransposons is mostly based on studies of the Ty elements of Saccharomyces cerevisiae. Ty1 and Ty3 are the most extensively characterized LTR-retrotransposons (see recent reviews [11,12]). Ty1 represents the Pseudoviridae family and is the most abundant mobile genetic element in the genome of S. cerevisiae [13,14]. Ty3 belongs to the Metaviridae family of LTR-retrotransposons, whose members are more closely related to retroviruses than are the Pseudoviridae in genome organization and in the sequences of their encoded proteins [1,3,15]. In haploid yeast cells, Ty3 RNA is present at low levels and its expression is induced by pheromone stimulation in mating yeast cells [16][17][18]. Full-length Ty1 or Ty3 genomic RNA (gRNA) plays a dual role in replication: it serves as a template for translation [19] and as the retrotransposon genome that is packaged into virus-like particles (VLPs) and reverse transcribed [6,8]. Therefore, a fine balance between gRNA translation and packaging is required for productive retrotransposition.
Retrotransposons exploit cellular machinery to transcribe their genomic copy, and both Ty1 and Ty3 transcripts synthesized by RNA polymerase II are capped and polyadenylated before nuclear export. Despite the fact that Ty transposition is a rare event, Ty1 RNA comprises up to ~10% of polyadenylated mRNA in haploid S. cerevisiae cells [6,25,26]. This probably results from the significantly longer half-life of Ty1 RNA compared to yeast mRNAs [27], and recent data indicate that Ty1 Gag is required for gRNA stability [28]. Ty1 and Ty3 genomic transcripts (5.7 kb and 5.2 kb, respectively) are similarly organized and each contains two partially overlapping open reading frames, GAG and POL, flanked by untranslated regions (UTRs) at the 5′ and 3′ termini [7,17,20,24] (Figure 1). The 5′ UTR comprises the unique U5 sequence and the R region, the latter of which is repeated in the 3′ UTR. In addition to R, the 3′ UTR contains a unique U3 sequence. Splicing of Ty1 and Ty3 transcripts has not been detected, but the presence of shorter transcripts is documented [29][30][31][32]. After export to the cytoplasm, full-length RNAs provide the template for both protein translation and gRNA that is ultimately packaged into VLPs. The primary translation products are Gag and Gag-Pol precursors, the latter resulting from a +1 ribosomal frameshifting event. Gag-Pol is produced at approximately 5% of the level of Gag [33][34][35][36]. Ty1 and Ty3 frameshifting results from translational pausing at codons decoded by rare tRNAs [35,37,38,39].
The Ty1 and Ty3 gRNA, Gag, and Gag-Pol precursors colocalize in specific cytoplasmic foci termed retrosomes, where they assemble into VLPs [40][41][42][43] (Figure 2). VLPs contain a Gag structural protein, the POL-encoded enzymes, and two copies of Ty gRNA organized in a dimeric form [44,45]. Once VLPs undergo maturation through specific proteolysis of Gag and Gag-Pol, dimeric gRNA is reverse transcribed into (−) DNA, utilizing cellular tRNAiMet as a primer [46][47][48]. tRNAiMet is packaged into VLPs during assembly. Both elements contain a bipartite primer binding site (PBS), but the Ty1 PBS is localized in the GAG ORF while the 5′ and 3′ portions of the Ty3 PBS are located at opposite ends of the genome [47,48] (Figure 1). After (+) strand DNA synthesis, nuclear import occurs, and a nuclear localization signal (NLS) is present on integrase (IN) [49,50]. The replication cycle is completed by integration into regions of the genome that are associated with transcription by RNA polymerase III [51][52][53][54] (Figure 2). Several host factors required for Pol III activity are important determinants of target sequence specificity [54][55][56][57][58], and recent data indicate that an interaction between Ty1 IN and the AC40 subunit of Pol III plays the predominant role in targeting Ty1 integration upstream of genes transcribed by RNA Pol III [59].
Viruses 2016, 8, 193

Cis-Acting Sequences in Retrotransposon gRNA
Retroelement gRNAs contain internal structures fundamental to propagation. Prominent among these motifs are cis-acting sequences required for gRNA dimerization, packaging, and priming of reverse transcription. The cis-acting sequences were first delimited for Ty1 using a mini-Ty1 donor-helper system [60]. Mini-Ty1s are deletion mutants of the Ty1-H3 element (a His+ revertant; [6]) lacking functional gene products. Helper elements are incapable of retrotransposition but supply Ty1 proteins in trans, enabling mini-Ty1s containing cis sequences to transpose. These studies demonstrated that regions located within or adjacent to the LTRs are important, while up to 5 kb of Ty1 internal sequences can be deleted without any significant effect on transposition. The minimal Ty1 element capable of retrotransposition when proteins are supplied in trans contains 380 nt of the 5′-end of the (+) RNA genome and 357 nt from its 3′-end. Using a similar assay, cis-acting sequences in Ty3 RNA were delimited at the 5′- and 3′-ends, and nt 429 to 4979 were demonstrated to be dispensable for retrotransposition when proteins were supplied in trans [12,36].
Figure 3. … [61]. Nucleotide positions at which SHAPE reactivities increased when proteins were gently removed from VLP-associated Ty1 genomic RNA are marked with blue diamonds. Those positions most likely correspond to Gag binding sites within the dimeric Ty1 genomic RNA in VLPs. Regions in monomeric Ty1 RNA mapped in vitro as binding sites for the Gag C-terminal region are represented in red [62]. Palindromic (PAL) sequences, including reciprocal interstrand interactions, are annotated in green [61], and cyclization mediated by CYC5 and CYC3 in violet [63].
tRNAiMet is essential for Ty1 and Ty3 retrotransposition. Using mutational analyses, Ty1 gRNA sequences required for primer binding were located between nt 94 and 104 [46] based on complementarity to the tRNAiMet 3′ acceptor stem. However, this 10-nt Ty1 PBS sequence was not required for tRNAiMet packaging into VLPs [46]. Further studies presented additional evidence for sequences located 3′ to the PBS that were necessary for tRNAiMet encapsidation [64,65]. Those regions of complementarity to the tRNAiMet TΨC and DHU arms not only enable primer packaging, but also play a role in the initiation of reverse transcription in vivo [64,66]. Similar molecular determinants for retrotransposition were found in the tRNAiMet primer for Ty1 and Ty3 [48]. Although both elements contain a bipartite PBS formed from three segments, they are differently organized. The Ty1 PBS is localized in the GAG ORF and the largest region separating its PBS sequences is 28 nt, while parts of the Ty3 PBS are located at opposite ends of the genome, adjacent to the 5′ UTR (one segment, nt 121-128) and in the 3′ UTR (two segments) [47,48]. In Ty3, tRNAiMet annealed to opposite ends of gRNA is proposed to mediate its cyclization [47]. For both Ty1 and Ty3, an interaction between gRNA ends is required for efficient initiation of reverse transcription [47,63]. Ty1 cyclization is not mediated by the tRNAiMet bridge, but occurs via direct interaction of complementary sequences within gRNA, namely CYC5 at the 5′-end and CYC3 at the 3′-end [63] (Figure 3). The CYC5 sequence is located downstream of the PBS and, interestingly, is also complementary to the tRNAiMet DHU stem, raising the possibility of additional interactions [64]. However, defects in initiation of reverse transcription caused by mutations compromising CYC pairing could be complemented by mutations that restored CYC5:CYC3 pairing, demonstrating their direct interaction [63].
In studies using the modular mini-Ty1 system (pJEF1254; [60]), Bolton et al. demonstrated that another long-range intramolecular interaction was critical for Ty1 propagation [67]. In particular, base-pairing between the 1-GAGGAGA-7 sequence within the 5′ repeat (R) region and the 264-UCUCCUC-270 sequence downstream of the PBS was required for efficient initiation of reverse transcription. We further defined this region as constituting part of an intramolecular RNA pseudoknot by combining chemical probing with mutational analysis [68] (Figure 3). The structure of Ty3 RNA remains uncharacterized, whereas a comprehensive model of Ty1 secondary structure in VLPs and in vitro has been determined [61]. The 5′-end of Ty1 gRNA is highly structured and compactly folded, analogous to retroviral 5′ UTRs [69,70]. Interestingly, UTRs of yeast RNAs are less structured than the corresponding coding regions [71].
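The complementarity between the two heptamers quoted above can be checked directly. The following is a minimal Python sketch, not a method from the cited studies: the helper names `reverse_complement` and `can_pair` are illustrative, only canonical Watson-Crick pairs are considered (G-U wobble pairs are ignored), and both sequences are written 5′ to 3′ as given in the text.

```python
# Canonical Watson-Crick partners for RNA; G-U wobble pairs are deliberately ignored.
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the 5'->3' sequence that pairs antiparallel with `rna`."""
    return "".join(PAIRS[base] for base in reversed(rna.upper()))


def can_pair(strand_a: str, strand_b: str) -> bool:
    """True if two 5'->3' strands form a perfect antiparallel duplex."""
    return reverse_complement(strand_a) == strand_b.upper()


r_region = "GAGGAGA"    # nt 1-7 of Ty1 gRNA, as quoted in the text
downstream = "UCUCCUC"  # nt 264-270 of Ty1 gRNA, as quoted in the text

print(reverse_complement(r_region))    # UCUCCUC
print(can_pair(r_region, downstream))  # True
```

The check only shows that the two segments are perfectly complementary in sequence; whether they pair in the folded RNA is an experimental question addressed by the probing and mutational data cited above.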

Retrotransposon Gag Proteins
Gag and its mature products are the major VLP structural components, serving as multifunctional regulators that orchestrate retrotransposon replication. The organization of the 290-amino acid Ty3 Gag3 is related to that of counterpart proteins from simple retroviruses, encoding capsid (CA) and nucleocapsid (NC) domains separated by a short spacer (SP) (Figure 1). CA is critical for Ty3 Gag3 multimerization, whereas NC is required for all nucleoprotein interactions mediated by Ty3 Gag3 [36,72,73,74]. Mature Ty3 NC resembles retroviral NC proteins and is a small (57-aa), basic (pI = 11.15) protein containing one CCHC zinc-finger (ZF) motif [43,75,76]. Some members of the Orthoretrovirinae subfamily also have a nucleocapsid protein with only one ZF (Gamma- and Epsilonretroviruses), while others possess two motifs. Spumaretrovirinae NC-like proteins lack ZF motifs [77]. Ty3 NC displays nucleic acid chaperone activity and promotes nucleic acid aggregation, tRNAiMet annealing, and Ty3 RNA dimerization in vitro [47,78]. During retrovirus replication, Gag or NC, via chaperone activity, facilitates genome dimerization and packaging, annealing of the tRNA primer, and the strand-transfer events associated with reverse transcription [79]. Based on those observations, the role of Gag3 and NC in Ty3 replication can be considered analogous to that of retroviral Gag and NC proteins [79][80][81][82]. Mutational studies of the Ty3 NC domain indicate that both the N-terminal basic region (NTD) and the zinc finger play important roles in association of Ty3 Gag3 with gRNA [72,78]. The NC domain controls Ty3 Gag3 and gRNA co-localization and trafficking prior to assembly, and is required for gRNA packaging into VLPs. Mature NC chaperones nucleic acid interactions during reverse transcription, and mutations that abolish Ty3 Gag3 processing into mature NC block cDNA synthesis [75].
Although functionally related, Ty1 Gag lacks sequence and structural homology to Ty3 or retroviral Gag proteins [83]. The 441-aa Ty1 Gag precursor undergoes only one C-terminal cleavage by PR, yielding the mature 401-aa Gag and a 40-aa acidic peptide [84][85][86][87] (Figure 1). Ty1 Gag can interact with RNA in vitro [19,88] but does not have a canonical NC domain with a zinc-finger motif. Based on in vitro studies, the RNA binding and chaperone activity of Ty1 Gag has been mapped to C-terminal residues Asn299-His401, a region containing three clusters of basic amino acids [89]. The importance of this C-terminal region was further demonstrated using a Ty1 mutant from which it had been deleted: this mutant fails to interact with RNA in vitro. In the case of retroviral NC proteins, basic residues were likewise demonstrated to be important for chaperone activity, as deleting the N-terminal basic domain significantly reduced this activity [90][91][92][93][94]. A synthetic peptide (TYA1-D) corresponding to the C-terminal 103-aa region of Ty1 Gag binds RNA and promotes annealing of tRNAiMet, Ty1 RNA dimerization, and initiation of reverse transcription in vitro [89]. Recently, chaperone activity has been further characterized using recombinant proteins corresponding to diverse regions of Ty1 Gag. The C-terminal region (CTR) protein, encompassing the C-terminal 228 residues of Ty1 Gag and containing three basic clusters, displayed robust chaperone activity comparable to that of the TYA1-D peptide [62], while a truncated form of CTR (sCTR), lacking 47 C-terminal residues and containing only the first and second basic clusters, lost activity. This observation supports a critical function of the third basic cluster in the chaperone activity of Ty1 Gag. Alternatively, all three basic clusters may be required to promote Ty1 Gag/RNA interactions.

Trafficking of Ty gRNA and Gag to Retrosomes
Ty1 and Ty3 gRNAs are not divided into separate pools of translated and packaged RNA; thus, gRNA trafficking and packaging into VLPs require an equilibrium between the competing events of translation and packaging. VLP assembly starts at specific, microscopically distinct cytoplasmic foci known as retrosomes, where gRNA and Gag colocalize [40][41][42]. Both gRNA and Gag are required for retrosome nucleation: when Gag is not translated or gRNA is absent from the cytoplasm, retrosomes do not form [28,42]. Retrosomes are still observed in Ty1 strains with mutations in, or deletion of, PR, IN, and RT coding sequences, indicating that retrosome formation is independent of enzyme activity [28]. A direct gRNA interaction with Gag is required for retrosome nucleation, and mutations in the C-terminal RNA binding domain of Ty1 Gag [28,42] or the NC domain of Ty3 Gag3 [72] disrupt retrosome formation.
Little is known about Ty gRNA trafficking to retrosomes, and it remains unclear where the first interaction with Gag occurs. Recent findings indicate that Ty1 gRNA is translated in association with the signal recognition particle (SRP), which interacts with the nascent Gag polypeptide for transport into the endoplasmic reticulum (ER) [95]. The SRP pathway is universally conserved and is utilized for co-translational targeting of mRNAs encoding secretory and membrane proteins to the ER [96]. As Ty1 gRNA is translated, SRP interacts with the ribosome and specific hydrophobic sequences in the nascent Gag polypeptide to target the translating complex to the ER, whereupon Gag enters the ER lumen and assumes a stable conformation. When the SRP is rendered genetically defective by an srp68-DAmP mutation, Gag is present but rapidly turned over, and retrosomes do not form. Stable Ty1 Gag is retro-translocated from the ER lumen to the cytoplasm and binds translating Ty1 gRNA. Multimerization of Gag bound to Ty1 gRNA may repress gRNA translation and induce a shift from translation to packaging into VLPs [95].
Current data raise the interesting possibility that Ty1 and Ty3 Gag proteins play a role in nuclear export of gRNAs [72,97]. Using a two-plasmid system, in which Ty1 Gag and gRNA are expressed independently, Checkley et al. demonstrated that Gag enhances gRNA export from the nucleus, stability, and co-localization into retrosomes [28]. In the absence of Gag, Ty1 gRNA accumulates in the nucleus and becomes unstable when the Ty1 element contains a chain-terminating mutation (Ty1fs) adjacent to the Gag initiation codon. When Gag is expressed independently, Ty1fs RNA regains its stability and is present in retrosomes. A nuclear localization signal (NLS) has not been identified in Ty1 Gag, and Ty1 Gag has not been shown to enter the nucleus. However, Ty1 Gag nuclear localization may be only transient and difficult to detect. Nevertheless, Ty1 Gag is proposed to accumulate at the nuclear periphery and enhance gRNA nuclear export via the Mex67p pathway [28,41]. This scenario does not exclude the possibility that, during retrotransposition, Ty1 gRNA may be exported from the nucleus independently of Gag via the Mex67p pathway. Translation of Ty1 gRNA results in accumulation of Gag and Gag-Pol in the cytoplasm, and Gag may capture newly exported Ty1 transcripts at the nuclear periphery. Unlike Ty1 Gag, Ty3 Gag3 is suggested to enter the nucleus to recruit newly transcribed gRNA for packaging. Although wild-type Ty3 Gag3 is not found in the nucleus, mutating conserved residues within its NC domain in a way that disrupts RNA binding results in Gag3 accumulation in the nucleus, a decrease in Ty3 gRNA concentration in retrosomes, and reduced packaging [72]. The effects of these mutations indicate that the Gag3 NC domain is required for both packaging and delivery of gRNA to VLP assembly sites. Consequently, early Gag binding to gRNA, either in the nucleus or immediately after nuclear export, may sequester gRNA from the translation machinery and facilitate its trafficking to retrosomes for VLP formation.
Retroviral Gag proteins, including those of HIV-1, murine leukemia virus, human and simian foamy viruses, and Rous sarcoma virus (RSV), may enter the nucleus [98][99][100][101]. Moreover, nuclear trafficking of RSV Gag is required for efficient packaging of viral gRNA into assembling virus particles [102]. However, it remains unknown whether this requirement is shared by other retroviral Gag proteins.

P-Bodies and Retrosome Formation
Many lines of evidence indicate that both Ty1 and Ty3 require mRNA processing body (P-body) proteins for effective retrotransposition and VLP assembly. Eukaryotic P-bodies are cytoplasmic ribonucleoprotein granules in which deadenylation-dependent and nonsense-mediated mRNA decay factors are concentrated along with their mRNA substrates, and where mRNA decay can occur [103][104][105]. Ty3 retrosomes co-localize with P-bodies [40], and the Gag3 NC domain is required for this localization [72]. Mutational studies suggest that both the N-terminal basic tail and the zinc finger are engaged in association with P-body proteins [72]. P-bodies are not only sites of mRNA degradation but also play an important role in translation repression and in mRNA segregation for storage or decay [106,107]. Ty3 VLP assembly in P-bodies might be beneficial, and existing results support the hypothesis that P-body factors may serve to separate the translation and assembly functions of Ty3 gRNA [40,108,109]. Association with P-body proteins may sequester Ty3 gRNA from the cellular translational machinery, thus increasing the pool that is not actively translated and can be packaged. A significant fraction of Ty1 Gag foci localize in P-bodies [110], but different conditions promote Ty1 retrosome and P-body formation [41,97,110]. Nevertheless, P-body components are important cofactors of both Ty1 and Ty3 retrotransposition, and their deficiency negatively influences the appearance of retrosomes, the level of retrotransposition-competent VLPs, protein maturation, and gRNA packaging [40,97,109,110,111,112].

RNA Packaging
Although Ty gRNA is selected from a pool of excess cellular RNAs and selectively packaged into VLPs [8,113], sequences important for packaging have not been precisely defined. It was demonstrated that the R region of the 5′ UTR, the entire POL, and the 3′ UTR are dispensable for Ty1 RNA localization to retrosomes [42], while Ty3 RNA localization to retrosomes and packaging into VLPs was dependent on the presence of UTRs or POL sequences [108]. For efficient VLP formation in a heterologous host, the Ty1 [114] or Ty3 [73] GAG sequence was sufficient. Based on the mini-Ty1 donor-helper system, Xu and Boeke [60] demonstrated that cis-acting sequences required for gRNA packaging reside within 580 nt at the RNA 5′ terminus. When additional nucleotides, up to position 380, were deleted from mini-Ty1 RNAs, packaging was reduced to 80%. Further deletion of nucleotides 237-380 reduced packaging to 15% and significantly reduced the amount of RNA co-purified with VLPs. Collectively, these observations argue that either the major cis-acting sequences required for Ty1 RNA packaging reside within the 237-380 region, or its presence facilitates proper exposure of the packaging element. The 237-380 fragment encompasses part of the sequence essential for folding of the Ty1 RNA pseudoknot (nt 256-270) [68] (Figure 3). RNA kissing loops, which can be described as an "intermolecular pseudoknot", mediate HIV genome dimerization and are therefore determinants of retrovirus packaging (for review, see [115]). Interestingly, the Ty1 RNA pseudoknot may provide a binding site for proteins within VLPs [61]. In principle, this parallels retroviral propagation, where Gag binding is fundamental for RNA encapsidation [116][117][118]. Therefore, the Ty1 RNA pseudoknot was a plausible mediator of gRNA packaging. However, using a combination of structural and functional analyses, a role for the Ty1 pseudoknot in RNA encapsidation could not be demonstrated [68].
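The deletion-mapping logic behind this conclusion can be made explicit with a little arithmetic. The Python sketch below is illustrative only: the function name `packaging_drops` is hypothetical, and the 100% value assigned to the 580-nt construct is an assumed normalization (the text reports only the 80% and 15% levels).

```python
# Each entry: (last 5' nucleotide retained in the mini-Ty1 RNA, packaging in % of reference).
# ASSUMPTION: the 580-nt construct is taken as the 100% reference; the text gives
# only the 80% (1-380 retained) and 15% (1-236 retained) values.
constructs = [(580, 100), (380, 80), (236, 15)]


def packaging_drops(data):
    """For each successive truncation, return
    (first_nt_removed, last_nt_removed, drop_in_percentage_points)."""
    drops = []
    for (longer, pct_longer), (shorter, pct_shorter) in zip(data, data[1:]):
        drops.append((shorter + 1, longer, pct_longer - pct_shorter))
    return drops


drops = packaging_drops(constructs)
largest = max(drops, key=lambda d: d[2])

print(drops)    # [(381, 580, 20), (237, 380, 65)]
print(largest)  # (237, 380, 65)
```

Under that assumed baseline, removing nt 381-580 costs about 20 percentage points, whereas removing nt 237-380 costs about 65, which is why the 237-380 interval stands out as the candidate location of the major packaging determinant (or of sequences needed to expose it).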
Both genetic studies and limited in vitro data imply that Ty encapsidation relies on recognition of cis-acting sequences in gRNA by Gag. However, it is not known whether the same nucleotide sequences are recognized by Gag during nuclear export, trafficking to retrosomes, and gRNA packaging [28,42,60]. The sites of Gag binding within Ty1 or Ty3 gRNA are not precisely defined. Chemoenzymatic analyses of Ty1 gRNA inside mature VLPs and after gentle deproteinization identified sites occupied by proteins, but could not define the identity of the bound protein. However, since Gag is the most abundant protein in VLPs, we hypothesized that the majority of those sites were occupied by Gag (Figure 3). The most prominent changes in SHAPE-determined nucleotide reactivity after extraction of proteins were observed within ~500 nt adjacent to the gRNA 5′-end, indicating that disrupting RNA-protein interactions impacts mainly the Ty1 gRNA region containing the major cis-acting signals for dimerization, packaging, and initiation of reverse transcription [61]. A recent study confirmed and extended those findings via in vitro hydroxyl radical footprinting of the Gag CTR/Ty1 RNA complex [62] (Figure 3). A major Gag CTR binding site was detected within the pseudoknot present at the 5′-end of Ty1 RNA. Interestingly, mutations disrupting pseudoknot formation interfere with retrotransposition [68,119]. Additional Gag CTR binding sites are adjacent to the Ty1 RNA sequences that are important for tRNAiMet annealing, cyclization, and possibly dimerization [61,62].
VLP-associated Ty1 and Ty3 gRNAs are dimeric [44,45,47]. Early models suggested that dimerization was mediated by non-covalent interaction of two tRNAiMet molecules hybridized to gRNA [47,89]. tRNAiMet contains a 12-nt palindromic sequence at the 5′-end which is not paired with Ty1 gRNA when tRNAiMet is annealed. This strand of tRNAiMet was proposed to mediate genome dimerization [47]. It was also demonstrated for Ty1 that dimerization of short RNA transcripts induced by the chaperone synthetic peptide (TYA1-D) was inefficient when the PBS was mutated and hybridization of tRNAiMet was compromised [89]. However, further studies [45] demonstrated that gRNA dimerization was not decreased in Ty3 IN mutants that failed to package tRNAiMet. Moreover, in a recent study [108] the authors monitored Ty3 packaging using a benzonase assay [74,120] and showed that deletion of the 5′-3′ bipartite PBS did not prevent gRNA packaging. Therefore, either tRNAiMet is not a prerequisite for packaging or, unlike in retroviruses, dimerization is not essential. Arguing against the notion that dimerization and packaging of retroviruses and retrotransposons differ significantly in this regard, Feng et al. demonstrated that Ty1 dimers resemble those of retroviruses in that they undergo stabilization during proteolytic maturation of the VLP [44,121]. By analogy to retroviruses, and in agreement with the SHAPE reactivity profiles obtained from Ty1 gRNA in different biological states, palindromic (PAL) sequences were suggested to mediate formation of Ty1 dimers [61,119]. Altered SHAPE reactivity patterns observed in native VLPs supported involvement of PAL residues in intermolecular interactions. Moreover, PAL sequences provide protein binding sites [61,62] (Figure 3). Ty1 Gag derivatives containing the chaperone domain (TYA1-D and CTR), in addition to Ty3 NC, promote dimerization of their respective RNAs in vitro [47,62,89].
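A palindromic RNA segment is self-complementary, so two identical copies can pair with each other in antiparallel orientation; this is why palindromes are natural candidates for dimer contacts. The Python sketch below illustrates the idea under simplifying assumptions: only canonical Watson-Crick pairs are scored, the function names are illustrative, and the sequence `toy` is a hypothetical example, not a Ty1, Ty3, or tRNAiMet sequence.

```python
# Canonical Watson-Crick partners for RNA; G-U wobble pairs are ignored in this sketch.
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def is_palindromic(rna: str) -> bool:
    """True if `rna` equals its own reverse complement (self-complementary)."""
    rna = rna.upper()
    return all(PAIRS[a] == b for a, b in zip(rna, reversed(rna)))


def palindromic_windows(rna: str, length: int) -> list[tuple[int, str]]:
    """Return (1-based start, sequence) for every self-complementary window."""
    rna = rna.upper()
    hits = []
    for start in range(len(rna) - length + 1):
        window = rna[start:start + length]
        if is_palindromic(window):
            hits.append((start + 1, window))
    return hits


toy = "AAGGAUCCAA"  # HYPOTHETICAL sequence, for illustration only

print(palindromic_windows(toy, 6))  # [(3, 'GGAUCC')]
```

A scan of this kind can only nominate candidate dimerization sites; assigning a real role to a given palindrome requires structural and functional evidence such as the SHAPE data discussed above.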

Gag Assembly into VLPs
Ty VLPs are functionally analogous to the non-enveloped core particles of retroviruses. VLPs are stable, specific nucleoprotein structures that protect gRNA from cellular nucleases and allow its reverse transcription. Immature VLPs comprise Gag and Gag-Pol precursor proteins, two copies of gRNA, and tRNAiMet, in addition to cellular RNAs and proteins (reviewed in [11,12]). Clusters of VLPs at different stages of maturation are observed in retrosomes of cells expressing a Ty1 or Ty3 element [8,23,40]. VLP maturation starts with autocatalytic cleavage of PR, which directs further processing of the Pol domain, resulting in release of RT and IN. Ty3 Gag3 is processed to mature CA, SP, and NC proteins [122] (Figure 1). Ty1 Gag processing is relatively simple: PR cleavage of Gag-p49 results in mature Gag-p45 and a 40-aa acidic peptide (p4). This peptide has not been detected in VLP preparations or cell extracts [84][85][86][87]. Both immature and mature Ty1 Gag proteins can assemble into VLPs, but they display different properties, and only the mature Gag form trans-activates transposition [88]. Although Ty VLPs are formed independently of Gag and Gag-Pol processing [123,124], this processing is required for VLP maturation, reverse transcription, and integration [19,84,85,122]. The external structures of Ty1 and Ty3 VLPs are similar and do not undergo major rearrangement during maturation. However, Ty1 assembly seems to be more flexible and results in accumulation of VLPs of variable sizes, ranging from 30 to 80 nm in diameter [85,125,126]. Ty3 VLP populations are more homogeneous, with diameters ranging from 25 to 52 nm, as shown using atomic force microscopy [123]. Yeast factors are essential for formation of active VLPs, but Gag can assemble into spherical particles even when expressed in a heterologous host [73,114,127].
Based on mutational analyses, Gag residues important for Ty1 VLP assembly and structure have been mapped within aa 62-114 and 340-363 [128]. In another study, the functional importance of three regions, namely aa 36-50, 239-287, and 330-350, was proposed, as they were predicted to assume an α-helical configuration [129]. Recent predictions of the Ty1 Gag structure suggest that its N-terminal residues 1-172 and C-terminal residues 355-401 are disordered, while the region encompassing residues 173-354 forms α-helices [62]. The C-terminal 355-401 region was demonstrated to be necessary for interactions with Ty1 RNA [62], consistent with indirect studies in yeast [88] and results obtained with the Ty1 Gag-derived peptide TYA1-D [89]. Immunological analyses suggest that the N-terminal region of Ty1 Gag forms the outer shell of VLPs, while the C-terminus, containing the RNA binding domain, is internal [130].
Similar to retroviruses [131], the CA domain of Ty3 Gag3 is a major determinant of VLP assembly [123]. Residues 86-100 of Ty3 CA are consistent with the major homology region (MHR) characteristic of retroviral CA proteins, and altering MHR composition induced similar defects in retrotransposon and retroviral particle formation [132]. Structure predictions suggest that Ty3 CA comprises a smaller C-terminal domain (CTD; residues 148-207) and a large α-helical N-terminal domain (NTD), which contributes to the outer VLP shell [74]. The NTD of CA is critical for Ty3 Gag3 multimerization, and mutations of the first 100 residues inhibited VLP assembly [74,133]. The NTD of CA is able to interact both with the NTD of other CA proteins and with its own CTD.
The NC domain of uncleaved Ty3 Gag3, but not mature NC, is required for gRNA packaging into VLPs, since mutations disrupting mature NC production did not inhibit gRNA encapsidation [75]. The conserved zinc-finger motif is, however, critical [76]. Nevertheless, disrupting the zinc-finger motif or deleting the NC domain did not abolish Gag3 and Gag3-Pol3 multimerization, and their aggregates were observed in the nucleus of cells expressing Ty3 [72]. This indicates that the NC domain of Ty3 Gag3 is critical for both gRNA trafficking and packaging into VLPs, while the CA domain may be more essential for structural stability of particles than observed for retroviruses [72,134].
The spacer domain (SP), located between the CA and NC domains of Ty3 Gag3, is not critical for VLP assembly. However, deleting the SP domain results in formation of more compact VLPs that are defective for retrotransposition. The Ty3 SP domain is more acidic than known retroviral SPs, and substitution of acidic residues with alanine led to VLP disruption and inhibition of retrotransposition [75]. Based on those observations, a model in which SP acts as a molecular "spring" during Ty3 VLP assembly has been proposed. In this model, during Ty3 retrosome formation, intramolecular interactions between the acidic SP domain and the basic region of NC limit premature Gag3 multimerization. Once the NC domain binds to gRNA, intermolecular interactions of NC and SP can occur to facilitate Gag3 multimerization on gRNA and CA-CA interactions. After Ty3 Gag3 processing, mature NC remains bound to gRNA, while negatively charged SP in the CA-SP intermediate, which constitutes a significant fraction of processing products, could act as a molecular "spring" and destabilize the CA-SP interactions, thereby fostering disassembly and release of cDNA.

Gag/RNA Interactions Important for Copy Number Control (CNC)
There are many mechanisms that may limit the effects of retrotransposition on the host genome. With regard to Ty RNA packaging, one such mechanism is of particular interest. Garfinkel et al. discovered Ty1 copy number control (CNC) [135], which is characterized by decreased retrotransposition when additional elements are present in the genome [136]. Structural analyses via SHAPE suggested that altered Ty1 gRNA/Gag interactions may participate in conferring CNC [61]. Recently, a protein factor (p22) that is necessary and sufficient for CNC was identified [30], and its mechanism of action has been proposed. p22 is encoded by an internally initiated Ty1 mRNA and is an N-terminally truncated form of Gag that is cleaved by Ty1 PR at the same site as Gag to form p18. A multifaceted approach illuminated how this restriction factor interacts with Gag to inhibit retrotransposition [62]. Although Ty1 Gag and p22 play opposing roles in retrotransposition, they share a nucleic acid chaperone domain. p18 exhibits lower chaperone activity than Gag in tRNAiMet annealing and dimerization in vitro but binds Ty1 RNA with similarly high affinity. p18 and Gag bind within the same sites on Ty1 RNA, suggesting that they might compete for gRNA binding. Moreover, nuclease protection assays revealed that p22/p18 prevents stable packaging of Ty1 RNA [62]. By analyzing missense mutations in Ty1 that confer partial resistance to p22, a further study found that p22/p18 disturbs the central function of Gag during VLP assembly [137]. Therefore, Ty1 RNA packaging constitutes an important step that helps keep retrotransposition in check [136].

Conclusions
gRNA plays two distinct roles in the retrotransposon life cycle, serving as a template for both translation and reverse transcription (Figure 2). gRNAs are not divided into separate pools of translated RNA and packaged RNA; therefore, their trafficking and packaging into VLPs requires an equilibrium between the competing events of translation and packaging. Perhaps Gag binding to RNA excludes newly assembling ribosomes and, as a consequence, leads to packaging and reverse transcription. Cis-acting sequences and structures important for packaging are not clearly defined, and it is likely that multiple RNA sites scattered outside major packaging regions contribute to the process. Moreover, it is not known whether the same nucleotide sequences are recognized by Gag during gRNA nuclear export, trafficking to retrosomes, and packaging. Many questions remain open, but it is clear that RNA packaging and VLP formation are critical for Ty propagation. The VLP protects retrotransposon gRNA and concentrates all factors required for reverse transcription. VLP formation may also protect the host from high levels of free reverse transcriptase that could be potentially harmful.