New Insights into the Genome Organization of Yeast Double-Stranded RNA LBC Viruses

The yeasts Torulaspora delbrueckii (Td) and Saccharomyces cerevisiae (Sc) may show a killer phenotype that is encoded in dsRNA M viruses (V-M), which require the helper activity of another dsRNA virus (V-LA or V-LBC) for replication. Recently, two TdV-LBCbarr genomes, which share sequence identity with ScV-LBC counterparts, were characterized by high-throughput sequencing (HTS). They also share some similar characteristics with Sc-LA viruses. This may explain why TdV-LBCbarr has helper capability to maintain M viruses, whereas ScV-LBC does not. We here analyze two stretches with low sequence identity (LIS I and LIS II) that were found in TdV-LBCbarr Gag-Pol proteins when comparing with the homologous regions of ScV-LBC. These stretches may result from successive nucleotide insertions or deletions (indels) that allow compensatory frameshift events required to maintain specific functions of the RNA-polymerase, while modifying other functions such as the ability to bind V-M (+)RNA for packaging. The presence of an additional frameshifting site in LIS I may ensure the synthesis of a certain amount of RNA-polymerase until the new compensatory indel appears. Additional 5′- and 3′-extra sequences were found beyond V-LBC canonical genomes. Most extra sequences showed high identity to some stretches of the canonical genomes and can form stem-loop structures. Further, the 3′-extra sequence of two ScV-LBC genomes contains rRNA stretches. The origin and possible functions of these extra sequences are here discussed.


Introduction
Killer yeasts produce protein toxins that are lethal to sensitive yeasts. The synthesis and secretion of killer toxins by Torulaspora delbrueckii (Td) and Saccharomyces cerevisiae (Sc) requires the presence of at least two cytoplasmic dsRNA viruses that are members of the family Totiviridae. One is a satellite virus with a medium-size genome (V-M) that encodes the toxin, and the other is a helper virus with a large-size genome (V-LA) that provides the capsid and polymerase required for maintenance and replication of both viruses. Additionally, the role of ScV-M1 dsRNA in the maintenance of ScV-LA1 by a yet-unknown mechanism has been recently suggested [1].
A specific LA virus may support different types of satellite M viruses but usually only one type in each killer yeast strain [2,3]. LBC viruses are another type of large-size dsRNA virus that may coexist with V-LA and V-M in the cytoplasm of S. cerevisiae, although no helper activity is known for V-LBC in this yeast species [4-8]. However, a new LBC virus has recently been found in T. delbrueckii (TdV-LBCbarr2) that may act as a helper for two M viruses in the same yeast strain: TdV-Mbarr1 and ScV-M1 [9]. These viruses are inherited in the cytoplasm from mother yeast to daughter bud and transferred horizontally between different yeasts by mating or heterokaryon formation [10]. However, the Sc-M1 virus has ScV-LA [5,7,13-15,18]: (i) a stem-loop for frameshifting located downstream from the slippery site; (ii) a stem-loop for (+)ssRNA packaging located downstream from the RdRp domain; and (iii) a stem-loop for RNA replication located upstream from the 3′ end [9]. These motifs are found in equivalent positions with respect to the ScV-LA genomes. Similar stem-loops also seem to be present in ScV-LBC genomes but located in different positions with respect to ScV-LA genomes. The similarity of these motifs and features of LBCbarr viruses and LA viruses could explain the helper capability of TdV-LBCbarr2 to maintain M viruses [9]. The genomes of Td-LBCbarr viruses also contain the 5′ AU-rich region present in L and M viruses. This motif seems to facilitate the "melting" of the template (−)RNA strand and the access of the RNA polymerase for conservative transcription [4,6,19,20]. This motif is 5′GAAATT in TdV-LBCbarr, which is similar to that of ScV-LBC (5′GAATTT), and different from that of ScV-LA (5′GAAAAA) [9,14]. Although both TdV-LBCbarr Gag-Pol amino acid sequences showed modest global identity with that of Sc-LBC viruses (about 44%), most of the motifs required for Gag and Pol functions in Sc-LBC viruses were also found in LBCbarr viruses [9].
This study analyzes in depth the genome sequences of LBC viruses to determine the presence of: (i) low-identity stretches located in the viral canonical genomes, (ii) sequence stretches with relevant identity to genomic sequences that might have been horizontally transferred from cellular organisms (xenologs), (iii) 5′- and 3′-extra sequences, and (iv) possible interactions between these extra sequences and proximal canonical sequences to form stem-loops and intramolecular kissing complexes. All these features may help to elucidate the phylogenetic origin of these viruses.

Yeast Strains and Culture Media
The yeasts used in this study are shown in Table 1. The killer phenotype and presence of L and M genomes (dsRNA) in these yeast strains were previously analyzed [4,14,17]. Standard culture media were used for yeast growth [21].
Table 1 notes: ∆G for the (+)ssRNA is in kJ/mol and was obtained with the program MFOLD. Nucleotide numbers refer to the RNA (+) strand of the viral genome. Sc, Saccharomyces cerevisiae; Td, Torulaspora delbrueckii.

Purification of V-LBC dsRNA from Killer Yeasts
Nucleic acid samples from killer yeasts were obtained as previously described [14,22]. The dsRNA from each yeast culture was obtained by CF-11 cellulose chromatography [23]. L and M dsRNA were separated by gel electrophoresis in 1% agarose. The 4.6 kb bands were cut from the gel and purified using the RNaid® Kit (MP Biomedicals, LLC, Illkirch, France).

cDNA Library Preparation and DNA Sequencing
The preparation of cDNA libraries and HTS (high-throughput sequencing) were performed at Unidad de Genómica Cantoblanco (Fundación Parque Científico de Madrid, Spain) as described elsewhere [14]. Random primers dTVN and dABN (Isogen Life Science, De Meern, The Netherlands) and SuperScriptIII reverse transcriptase (Thermo Fisher Scientific, Waltham, MA, USA) were used for cDNA first-strand synthesis. Subsequently, the synthesis of the second cDNA strand, end repair, adenylation of the 3′-end, and ligation of TruSeq adaptors were performed (Illumina, San Diego, CA, USA). The adaptor oligonucleotides contained signals for DNA amplification and sequencing, as well as short sequences (indices) for multiplexing in the sequencing run. Each library was amplified using a PCR enrichment procedure, ensuring that all cDNA molecules of the library contained the adaptors at both ends. The resulting libraries were denatured prior to seeding on a flow cell for sequencing on a MiSeq system by using 2 × 80-2 × 150 sequencing runs.

Assembly of Virus Genome Sequences
The cDNA sequences were assembled by the company Biotechvana (Technological Park of Valencia, Spain) as described elsewhere [14]. A de novo assembly was done using the SOAPdenovo2 method [24] and two Illumina libraries for each virus. Multiple assembly attempts were made with scaffolding and an insert size of 200. The most effective Kmer value was 47. Contigs shorter than 300 nt were removed from the contig file. The selected contigs were used as queries against the NR database of the NCBI using the BLASTX program [25] implemented in GPRO 1.1 software [26]. High identity was found between several contigs/scaffolds and some previously known viral RNA sequences (such as V-LA and LBC dsRNA) or host transcripts. Contaminating sequences that were not homologous to known V-LBC genomes were filtered out. Each virus genome was sequenced at least three times. Different samples and dates were used for each virus. Full coverage of the canonical sequence of each virus was obtained at least twice. 100% identity was obtained for all sequences from the same killer yeast strain. Viral genome comparisons were done using only the full-coverage sequences.

Sequence Analysis Tools
The sequence identity among nucleotide sequences of L genomes was obtained by using the ClustalW (2.1) program [27], and MUSCLE (3.8) software was used for amino acid sequence comparison [28]. Global alignment for identification of low identity stretches (below 50% identity in windows of 50 nucleotides or 20 amino acids in length) was done using Clone Manager 7.11 (Sci Ed Software LLC, Westminster, CO, USA), with a linear scoring matrix (Mismatch 2, OpenGap 4, ExtGap 1) for cDNA and BLOSUM 62 for protein. BLAST software and a data bank of nucleic acids were used to search for identities between viral genomes. Only BLAST hits with identity of at least 80% and length of at least 30 nucleotides were considered as xenologs. The MFOLD program (http://unafold.rna.albany.edu/?q=mfold/RNA-Folding-Form, (accessed on 21 December 2021)) was used to predict the folding of ssRNA [29]. The parameters used were: 37 °C as folding temperature; ionic conditions of 1 M NaCl and no divalent ions; 5 as percent suboptimality number; 50 as upper bound on the number of computed foldings; 30 as maximum interior/bulge loop size; 30 as maximum asymmetry of an interior/bulge loop; and no limit for maximum distance between paired bases.
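The window criterion for low identity stretches can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Clone Manager procedure used in the study: the function names are hypothetical, the input is assumed to be two rows of an existing pairwise global alignment (equal length, "-" for gaps), and gap columns are counted as mismatches.

```python
def low_identity_windows(aln_a, aln_b, window=50, threshold=0.5):
    """Return (start, end, identity) for every window whose identity is below threshold.

    aln_a and aln_b are two rows of a pairwise global alignment (equal length,
    '-' for gaps). Coordinates are 0-based alignment columns, end exclusive.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    # 1 where both rows carry the same residue; gap columns never count as matches
    matches = [int(a == b and a != "-") for a, b in zip(aln_a, aln_b)]
    hits = []
    if len(matches) < window:
        return hits
    current = sum(matches[:window])
    for start in range(len(matches) - window + 1):
        if start > 0:
            # slide the window one column to the right
            current += matches[start + window - 1] - matches[start - 1]
        identity = current / window
        if identity < threshold:
            hits.append((start, start + window, identity))
    return hits


def merge_windows(hits):
    """Merge overlapping or adjacent low-identity windows into contiguous stretches."""
    stretches = []
    for start, end, _ in hits:
        if stretches and start <= stretches[-1][1]:
            stretches[-1][1] = max(stretches[-1][1], end)
        else:
            stretches.append([start, end])
    return [tuple(s) for s in stretches]
```

For protein alignments the same functions would be called with window=20, matching the 20-amino-acid window stated above.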

Comparison of Nucleotide Canonical Sequences from TdV-LBC and ScV-LBC Genomes
In addition to the conserved slippery site found in all known LBC viruses upstream from the Gag ORF stop codon, TdV-LBCbarr genomes have a second putative in-frame translation re-initiation codon ("2151GGGGAGATGA2160") located downstream from the Gag ORF and upstream from the Pol domain. An identical second putative in-frame start codon is also present in all ScV-LBC genomes, but in a different location ("2324GGGGAGATGA2333"). These ATG in-frame codons are preceded by a possible slippery site, "GGGGAG", in all cases (Figures 1 and S1).
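Locating such a motif in a (+) strand cDNA sequence is a simple string search. The sketch below is illustrative only: the function names are hypothetical, positions are reported as 1-based inclusive coordinates to match the "2151GGGGAGATGA2160" notation, and any sequence passed to it in an example is a toy string, not a viral genome.

```python
import re


def find_motif(seq, motif="GGGGAGATGA"):
    """Return 1-based inclusive (start, end) positions of every motif occurrence.

    A lookahead pattern is used so that overlapping occurrences are reported too.
    RNA input is accepted by converting U to T.
    """
    seq = seq.upper().replace("U", "T")
    pattern = re.compile("(?=" + re.escape(motif) + ")")
    return [(m.start() + 1, m.start() + len(motif)) for m in pattern.finditer(seq)]


def reading_frame(position, orf_start):
    """Frame (0, 1 or 2) of a 1-based position relative to a 1-based ORF start."""
    return (position - orf_start) % 3
```

In the 10-nt motif, the putative slippery site "GGGGAG" occupies the first six positions and the ATG the next three, so the ATG starts at the motif start plus 6; reading_frame can then be used to check whether it is in frame with a chosen ORF start.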
The identity between the highly conserved RdRp-domain sequences of T. delbrueckii and S. cerevisiae LBC viruses (64-66%) was greater than that found for full Gag-Pol sequences (44%), and much greater than that found for Gag sequences (37%). In most cases, the percentage of identity between the different LBC viruses was higher when comparing the amino acid sequences of Gag-Pol than when comparing the nucleotide sequences of the genomes. The opposite occurred when comparing the sequences of TdV-LBC and ScV-LBC, which showed 44% and 64-66% identity for amino acid Gag-Pol and canonical nucleotide sequences, respectively. This exception was mainly explained by the presence of two stretches in TdV-LBCbarr Gag-Pol that show very low identity with the homologous region from Sc-LBC viruses: LIS I, from amino acid A558 to K828, and LIS II, from S1340 to I1443, which showed 24% and 26% identity with ScV-LBC1-original, respectively. Interestingly, the two ssRNA-binding domains of ScV-LA1-original Gag-Pol, which are necessary for viral propagation [31], are located in the homologous sequences of these two LIS. Additionally, the stem-loop for packaging and a region with low RNA-sequence identity are also located inside LIS II. Similar low identity stretches were not found when comparing TdV-LA and ScV-LA Gag-Pol sequences [14]. However, the 42-aa variable region that separates the Gag and Pol domains of Sc-LA viruses [14,15] is fully coincident with part of TdV-LBCbarr LIS I (Figure 1).
High local identity (80-95% for stretches of at least 30 nt) was found between the nucleotide sequences of TdV-LBCbarr genomes and some xenolog mitochondrial or genomic sequences of organisms such as Cherax tenuimanus (93% identity, 30 nt in mitochondrial cytochrome oxidase I gene), Xenopus parasitic worm Protopolystoma xenopodis (95%, 37 nt in contig 0228346), Homo sapiens (93%, 30 nt in chromosome 18), Vitis vinifera (80%, 54 nt in contig VV78X249912.5), Marinilactibacillus sp. 15R (93%, 30 nt in sequence CP017761.1), Saccharomycopsis fibuligera (89%, 37 nt in chromosome B6), and Sus scrofa (85%, 40 nt in chromosome 7). Most of these putative xenolog stretches are located upstream from the highly conserved RdRp domain, except the last two that are located in this domain. High local identity was also found between a stretch of ScV-LBC2-EX1125 (94%, 66 nt) and the 5′-extra sequence previously found in ScV-M1-EX231 [4], the only case among Sc-LBC viruses. All of these stretches were found along the LBC genome but none was coincident with any of the two LIS regions (Figure 1 and Supplementary Figure S1).
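The identity and length cut-offs for putative xenologs could be applied to BLAST results along the following lines. This is a hedged sketch, not the procedure used in the study: it assumes the hits were exported in the standard 12-column tabular BLAST format, the function name is hypothetical, and both thresholds are treated as inclusive because hits of exactly 80% identity and exactly 30 nt are reported above.

```python
import csv
import io

# Column order of the standard 12-column tabular BLAST output
BLAST_COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                 "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def candidate_xenologs(tabular_text, min_identity=80.0, min_length=30):
    """Keep hits with percent identity >= min_identity and length >= min_length.

    Returns a list of (query id, subject id, percent identity, alignment length).
    Thresholds are inclusive by assumption.
    """
    reader = csv.DictReader(io.StringIO(tabular_text),
                            fieldnames=BLAST_COLUMNS, delimiter="\t")
    kept = []
    for row in reader:
        pident = float(row["pident"])
        length = int(row["length"])
        if pident >= min_identity and length >= min_length:
            kept.append((row["qseqid"], row["sseqid"], pident, length))
    return kept
```

Short, high-identity matches of this size can also arise by chance in large databases, so a filter like this only nominates candidates for closer inspection.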

Analysis of 5′- and 3′-Extra Sequences of TdV-LBCbarr and ScV-LBC Genomes
The genome sequence obtained from the two Td-LBCbarr and the four Sc-LBC viruses was longer than the estimated canonical sequence (Table 1). Extra nucleotides were found beyond the 5′ and 3′ ends of the canonical genomes, in all cases except in the 3′-end of ScV-LBClusA. For sequence descriptions, nucleotides were numbered from the 5′GAATTT conserved motif in ScV-LBC viruses, which was considered as the 5′-end in the canonical genomes of S. cerevisiae LBC viruses [5,8,9]. The homologous motif 5′GAAATT was considered for TdV-LBCbarr genomes. The 5′-terminal G was considered as number 1. Additional nucleotides located upstream from the 5′GAATTT or 5′GAAATT motif were numbered with a negative symbol starting at (−)1 from the first nucleotide upstream from 5′G (Figures 2 and 3). Similarly, additional nucleotides located downstream from the 3′-end of ScV-LBC (CTACGCG3′) [5] or TdV-LBCbarr1 (CCATAAGC3′) [9] genomes were numbered with a positive symbol starting at (+)1 from the first nucleotide located downstream from C3′ (Figures 4 and 5).
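The numbering convention can be made concrete with a short Python sketch. The function names are hypothetical and the sequence used to exercise them should be a toy string; in particular, taking the first occurrence of the 5′ motif and the last occurrence of the 3′ terminal sequence as the canonical boundaries is a simplifying assumption that would need manual checking on a real genome.

```python
def canonical_bounds(full_seq, five_prime_motif, three_prime_end):
    """Return (start, end) of the canonical genome as 0-based, end-exclusive indices.

    Assumes the first occurrence of the 5' motif and the last occurrence of the
    3' terminal sequence mark the canonical ends.
    """
    start = full_seq.find(five_prime_motif)
    end_hit = full_seq.rfind(three_prime_end)
    if start == -1 or end_hit == -1 or end_hit < start:
        raise ValueError("canonical boundaries not found")
    return start, end_hit + len(three_prime_end)


def label_positions(full_seq, canonical_start, canonical_end):
    """Label every nucleotide following the convention described in the text.

    Canonical nucleotides are numbered 1..N, 5'-extra nucleotides (-)1, (-)2, ...
    moving upstream, and 3'-extra nucleotides (+)1, (+)2, ... moving downstream.
    """
    labels = []
    for i in range(len(full_seq)):
        if i < canonical_start:
            labels.append(f"(-){canonical_start - i}")
        elif i < canonical_end:
            labels.append(str(i - canonical_start + 1))
        else:
            labels.append(f"(+){i - canonical_end + 1}")
    return labels
```

With this scheme there is no position 0: the first canonical nucleotide is 1 and its upstream neighbor is (-)1.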


Canonical Nucleotide and Amino Acid Sequences of Td-LBC and Sc-LBC Viruses
Contrary to that found for comparison of most L viruses, amino acid sequence identity was lower than nucleotide sequence identity when comparing TdV-LBC and ScV-LBC. This is mainly because TdV-LBCbarr Gag-Pol contains two stretches that share very low sequence identity with the homologous region of ScV-LBC: LIS I and LIS II. It has been described in various viral genomes that frameshifting due to nucleotide insertions or deletions (indels), which may cause the premature termination of protein synthesis, can be restored to produce functional proteins by a secondary indel near the primary indel site. This phenomenon is known as "compensatory frameshift" [32], and may explain the low amino acid identity in LIS I and II. These LIS regions could be the result of successive indels followed by compensatory frameshift events. In this way, changes in amino acid sequence would be more pronounced than changes in nucleotide sequence. As a consequence, these indels will decrease Gag-Pol identity while largely maintaining the nucleotide sequence identity between the RNA regions that belong to the LIS of these viruses. This would allow LBC viruses to maintain specific functions of the (+)RNA while changing specific functions of the variable stretches of Gag-Pol, such as gaining the ability to bind newly emerged versions of M (+)RNA for packaging into the TdV-LBCbarr2 virion, or losing the ability to bind M (+)RNA as may have occurred in ScV-LBC. This will always require restoring the correct translation frame of the RdRp domain by "compensatory frameshift". However, the correct translation frame may also be temporarily restored by the involvement of the second putative slippery site "GGGGAG", which is followed by a putative in-frame translation re-initiation codon and is located upstream from Pol domains in all LBC genomes. This strategy would temporarily ensure the availability of active RNA polymerase with the correct amino acid sequence of the RdRp domain.
Meanwhile, a subsequent indel may occur to get the definitive "compensatory frameshift". This strategy may not be required to temporarily compensate indels in LIS II, since there are no Pol essential domains downstream from this stretch. Alternatively, the presence of a second putative in-frame translation re-initiation codon downstream from the Gag stop codon raises the possibility that some amount of free (unattached to Gag protein) RdRp could be synthesized as a functional enzyme.
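The effect of a compensatory frameshift on a protein can be illustrated with a toy example. The sketch below is not derived from any viral sequence: the function names are hypothetical, the test sequence used with it is invented, and only the standard genetic code is assumed. It deletes one nucleotide and inserts one further downstream, so that only the codons between the two indels are read in a shifted frame.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(seq):
    """Translate a DNA (cDNA) sequence in frame 0; trailing partial codons are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def compensatory_indel(seq, del_pos, ins_pos, ins_base="A"):
    """Delete the base at 0-based del_pos and insert ins_base before original index ins_pos.

    Requires del_pos < ins_pos. The result has the same length as seq, and the
    reading frame is restored from ins_pos onward.
    """
    if not 0 <= del_pos < ins_pos <= len(seq):
        raise ValueError("expected 0 <= del_pos < ins_pos <= len(seq)")
    return seq[:del_pos] + seq[del_pos + 1:ins_pos] + ins_base + seq[ins_pos:]


def differing_positions(a, b):
    """0-based positions at which two equal-length strings differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
```

Two single-nucleotide events leave the flanking protein sequence untouched while potentially changing every residue between them, which is the pattern of low amino acid identity with comparatively conserved nucleotide sequence discussed for LIS I and LIS II.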
Something similar may occur for the 42-aa variable region found in the Gag-Pol encoded by LA viruses. When comparing only this region of TdV-LAbarr1 and ScV-LA1-original, the amino acid identity (20%) was much lower than nucleotide identity (44%) [14], which is similar to that found between TdV-LBCbarr LIS and the homologous stretch of ScV-LBC. Although it has been suggested that this 42-aa variable region merely separates the two domains of LA Gag-Pol and does not interact tightly with other amino acids of the Gag and Pol domains [15], it could be involved in a more relevant function than previously thought. Since this region contains many hydrophobic amino acids, it could be involved in shaping the Gag-Pol domain responsible for the specific recognition of the (+)RNA to be packaged in the virion. In contrast, the second putative slippery site, which could allow temporary restoration of the correct translation frame, has not been found in the LA virus. Nevertheless, a second putative in-frame translation re-initiation codon located downstream from the Gag stop codon was also found in all LA genomes, which also raises the possibility that some amount of free RdRp could be synthesized [14].
Some stretches of TdV-LBCbarr genomes showed relevant identity with the mitochondrial chromosome of C. tenuimanus, and some genomic sequences of P. xenopodis, H. sapiens, V. vinifera, Marinilactibacillus sp., S. fibuligera, and S. scrofa. All these nucleotide stretches coincide with rather conserved amino acid sequences of Gag-Pol (Figure 1). These findings suggest the transfer of xenolog RNA stretches from different organisms to the LBC virus genome, maybe by recombination with the viral mRNAs, as previously suggested [17]. This process could have occurred during the phylogenetic appearance of L viruses. A different evolution of S. cerevisiae L viruses with respect to T. delbrueckii may explain why the relevant identity of the xenolog sequences with the S. cerevisiae genome was not detected. Only a similar stretch was found in an Sc-LBC virus (ScV-LBC2-EX1125) that showed relevant identity with a 5′-extra sequence found in ScV-M1-EX231 [4]. This result suggests that V-L and V-M RNA could recombine if they coincide in the same yeast strain, as previously suggested [4]. However, the coincidence of ScV-LBC2 and ScV-M1 in the same yeast strain has not yet been found. Despite this, ScV-LBC2 and ScV-LBC1 share 96% identity [9]; which suggests that either of these two helper viruses could maintain ScV-M1 or ScV-M2 in a K1 or in a K2 killer yeast, respectively.

Features Found in the 5′- and 3′-Extra Sequences from LBC Genomes
The additional sequences that we have found in V-LBC genomes may be only a part of the complete extra sequences of each virus. It cannot be ruled out that the shorter extra sequences might just reflect some lack of robustness during the contig assembly of some samples. We do not know whether these extra sequences are present in the dsRNA located inside the virion or whether they are only a part of a viral RNA intermediate. Similar results were found for yeast V-M and V-LA genomes. However, identity between extra and canonical sequences of the same genome was only found in V-LBC and LA viruses, but not in M viruses [4,14]. The matching base pairs of these sequence stretches allow the formation of double-strand stem-loops at the ssRNA ends of these viruses, which may protect viral ssRNA from single-strand exonucleases. These stem-loops may also provide a free 3′-end that could be used as a primer by RNA-dependent RNA polymerases for dsRNA synthesis or for mRNA transcription. As previously suggested [14], these stem-loops could even have more than one function. In addition, intramolecular interaction between extra sequences and proximal canonical sequences may also play a yet unknown role in the biology of these L viruses [33]. Although LBC 5′ stem-loops resemble that previously found in TdV-LAbarr1, no kissing-loop interaction involving the 5′-end of LBC genomes was detected, as was previously found in LAbarr1 and ScV-LA1 genomes. Therefore, unlike what was previously proposed for LA viruses [14], we cannot suggest a possible role of kissing-loop interactions in favoring RNA polymerase access to the (−)RNA strand for mRNA transcription in LBC viruses.
Ribosomal RNA stretches were detected in the 3′-extra sequences of ScV-LBC1 and ScV-LBClus4, but not in TdV-LBCbarr extra sequences. Similarly, the 3′-extra sequences of several M viruses from S. cerevisiae and T. delbrueckii also contain rRNA sequences [4], as do the 5′- and 3′-extra sequences of several ScV-LA genomes [14]. Strikingly, all these sequences so far found in L and M viruses belong to different regions of the same rRNA or to different types of rRNA. Accordingly, these rRNA stretches do not share a relevant identity. It has been suggested that ScV-M RNA could bind to other RNAs from the host or from other viruses [4], similarly to that suggested for poliovirus RNA [34] and plant viruses [35]. It is thus possible that M and L viruses integrate into cellular RNA such as rRNA, as retroviruses and retrotransposons do in chromosomal DNA. This strategy may protect these viruses from disappearance if some copies of their genome remain attached to a more persistent RNA. It has even been suggested that yeast viruses could recombine with rRNA and form a ribonucleoprotein resembling the yeast ribosome. The formation of these ribosome-like complexes may ensure that the yeast virus remains in the cell [14]. This could be similar to the endogenization of some ant RNA virus genomes involving nuclear chromosomes [36], but using a different strategy that involves rRNA in yeasts. Untangling the roles of the rRNA sequences located in 5′- and 3′-extra sequences of V-L and V-M genomes could reveal complex biological functions. Indeed, some rRNA-containing mRNA sequences have been described in mammalian cells. Some of these rRNA sequences appear to function as cis-regulatory elements involved in translation efficiency, while other sequences seem to be involved in some neurodegenerative diseases [37-39].
The finding of identical sequence stretches in the 5′-extra sequences of all LBC genomes, the 3′-extra sequences of LBCbarr genomes, and the 5′-extra sequences of some ScV-LA genomes [14] indicates that these extra sequences could have a common origin. As these extra sequences often show relevant local identity with some stretches of the canonical sequences, it has been suggested that they may originate from an imprecise molecular mechanism involved in viral replication [14], such as cap-snatching [40].

Conclusions
The two LIS found in TdV-LBCbarr Gag-Pol may have originated from successive indels that allow virus speciation while maintaining the fundamental functions of (+)RNA, and Gag and Pol domains. The existence of a second in-frame translation re-initiation codon, preceded by a possible slippery site, may facilitate a required "compensatory frameshift". This strategy would allow LBC viruses to change their ability to bind newly arisen versions of M (+)RNA for packaging and replication. The transfer of xenolog RNA sequence stretches from different organisms to the canonical sequence of V-L genomes could be at the inception of these viruses. The extra sequences located at both sides of V-LBC canonical genomes may form RNA secondary structures involved in avoiding ssRNA degradation and facilitating dsRNA synthesis, or in a still unknown biological function related to virus replication. The finding of rRNA stretches in the 3′-extra sequences of ScV-LBC genomes may be a consequence of recombination of virus RNA with yeast rRNA. This could form a kind of ribonucleoprotein that somehow resembles the yeast ribosome and ensures the permanence of these viruses in the yeast cell.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/microorganisms10010173/s1, Figure S1: Multiple sequence alignment between ScV-LBC1-original, ScV-LBClus4, and TdV-LBCbarr2 (+) strand nucleotide sequences (cDNA). 5′GAA(A/T)TT conserved motif (5′ conserved), translation initiation (start of Gag and Gag-Pol, or internal possible start ATG in Pol ORF of LBC1-original, LBClus4 and LBCbarr2), termination codons (stop of Gag and stop of Gag-Pol), ribosome frameshifting site (−1 frameshift site), frameshifting associated sequence (stem loop for frameshift), packaging signal (stem loop for packaging), and replication signal (stem loop for replication) are indicated, shaded and/or underlined in the nucleotide sequence. The highly conserved RdRp domain located in the central third of Pol is also underlined.