Characterization of Divergent Grapevine Badnavirus 1 Isolates Found on Different Fig Species (Ficus spp.)

Fig mosaic disease is spread worldwide and is believed to have a viral etiology. Divergent isolates of grapevine badnavirus 1 (GBV1), named fGBV1, were discovered on Ficus carica, F. palmata, F. virgata, and F. afghanistanica in the fig germplasm collection of the Nikita Botanical Gardens, Russia, expanding the list of viruses infecting this crop. The complete genomes of five fGBV1 isolates from F. carica and F. palmata trees were determined using high-throughput and Sanger sequencing. The genomes comprised 7283 base pairs, contained four overlapping open reading frames, were 99.7 to 99.9% identical to each other, and were related to GBV1 (83.2% identity). The reverse transcriptase RNase H genome regions of fGBV1 and GBV1 share 84.6% identity, indicating that fGBV1 is a divergent isolate of GBV1 that was found on new natural hosts from a different family (Moraceae). Further, fGBV1-specific primers were developed to detect the virus using RT-PCR. A survey of 47 trees, belonging to four fig species and 14 local and introduced F. carica cultivars, showed a high fGBV1 prevalence in the collection (93.6%), including trees with no obvious symptoms of fig mosaic disease.


Introduction
Fig mosaic disease (FMD) is common in many regions of fig cultivation. The symptoms of the disease are diverse and typically appear on leaves and fruits as mottling, mosaics, ring spots, discoloration, or deformation [2,3]. It is believed that the disease has a viral etiology. Fifteen viruses from different taxonomic groups and three viroids have been detected on fig. The symptoms of FMD are mainly attributed to fig mosaic virus (FMV, genus Emaravirus, family Fimoviridae), and their diversity is attributed to the influence of other viruses in mixed infections [4][5][6]. Among them, fig badnavirus 1 (FBV1, genus Badnavirus) was identified [7] and was, until recently, the only representative of the family Caulimoviridae found on figs.
Badnaviruses are widely distributed on fruit and ornamental plants. Their genomes are represented by a single molecule of non-covalently closed circular double-stranded DNA of 7-9 kbp. The genome is transcribed to produce a greater-than-genome-length, terminally redundant pregenomic (pg) RNA, which is either translated or serves as a template for replication of the viral genome through reverse transcription. The genome usually contains three or four open reading frames (ORFs). The large intergenic region (LIGR), which is enclosed between the end of ORF3 or ORF4 and the beginning of ORF1, contains transcription regulatory motifs. By convention, the beginning of the genome is considered to be the first nucleotide of the tRNAMet binding site, which serves as a primer for the DNA minus strand synthesis. Badnaviruses are naturally transmitted by mealybugs and aphids in a semi-persistent manner [8][9][10].

Results and Discussion
Total RNAs from the F. carica cultivars Temri, Kraps di Hersh, Bleuet, and Smena, and from an F. palmata tree, all displaying typical FMD symptoms [13], were used for metagenomic sequencing. On average, 695,000 quality-filtered 150-bp paired-end reads per library were generated by high-throughput sequencing (HTS). The assembled contigs were aligned to the nucleotide (nt) sequences of badnaviruses (taxid:10652) available in GenBank. In total, the BLASTn search found 12 contigs across the five samples, ranging from 122 to 7188 nt, which were 81.0-84.8% identical to the full-length genome of GBV1 [15] (query coverage 94-100%). At the same time, these contigs were only 66.9-72.5% identical to the FBV1 complete genome sequences and thus seem to be derived from another badnavirus, which was designated fGBV1 for convenience and to discriminate between the GBV1 isolates from grape and fig.
The complete genomes of five fGBV1 isolates were assembled from the contigs. The isolates from the F. carica cultivars and the F. palmata tree were 99.7-99.9% identical to each other, indicating both a low level of genetic diversity of the virus and a high reliability of the HTS results. Joints and gaps between the contigs and nearly the entire LIGR were re-sequenced and fully confirmed by the Sanger method using custom primers flanking the regions in question (Table S1). In addition, the Sanger sequencing of the LIGR confirmed the circular nature of the fGBV1 genome.
The fGBV1 genome of the isolate Tem64 from the cultivar Temri comprises 7283 bp and includes four overlapping ORFs. The genome starts with the tRNAMet binding site (TGGTATCAGATAGTTT, positions 1-18). ORF1 (positions 287-718) and ORF2 (positions 715-1122) encode proteins of 143 and 135 amino acid (aa) residues, respectively. ORF3 (positions 1119-6719) encodes a putative polyprotein of 1866 aa. Using the CD Search Service, zinc finger (aa 810-827), peptidase A3 (aa 1085-1285), reverse transcriptase (RT, aa 1311-1497), and RNase H (aa 1593-1721) motifs were identified in the polyprotein. The additional ORF4 is located at positions 6488-6748, overlapping the 3'-end of ORF3. The LIGR spans positions 6749 to 286 and comprises 821 nt. A TATA-box (TATTTAA, positions 7107-7113) was similar to that of commelina yellow mottle virus (X52938) [16], the type species of the genus Badnavirus. A putative polyadenylation signal (AATAAA) was identified at positions 7224-7229. In the remaining four isolates of the virus, all the functional elements of the genome mentioned above were at the same positions. Thus, the organization of the fGBV1 genome is typical of badnaviruses. Twenty-eight nt substitutions were randomly dispersed along the five genomes. In the coding regions, most of them (19 out of 23) turned out to be non-synonymous.
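The coordinates reported above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis; it assumes that the positions are 1-based and inclusive and that each ORF span includes its stop codon (an assumption that reproduces the reported protein lengths). The function names are illustrative only.

```python
# Sanity check of the Tem64 genome coordinates quoted in the text.
# Assumptions (not stated in the source): coordinates are 1-based and
# inclusive, and each ORF span includes the stop codon.

GENOME_LENGTH = 7283  # bp, isolate Tem64

ORFS = {
    "ORF1": (287, 718),
    "ORF2": (715, 1122),
    "ORF3": (1119, 6719),
    "ORF4": (6488, 6748),
}


def span_length(start, end, genome_length=GENOME_LENGTH):
    """Length of a 1-based inclusive span on a circular genome.

    If start > end, the span wraps around the genome origin.
    """
    if start <= end:
        return end - start + 1
    return (genome_length - start + 1) + end


def protein_length(start, end):
    """Number of aa residues encoded by an ORF span that includes a stop codon."""
    nt = span_length(start, end)
    if nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return nt // 3 - 1


def overlap(a, b):
    """Number of nt shared by two non-wrapping spans."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


if __name__ == "__main__":
    for name, (start, end) in ORFS.items():
        print(name, span_length(start, end), "nt,", protein_length(start, end), "aa")
    # The LIGR runs from the end of ORF4 across the origin to the start of ORF1.
    print("LIGR", span_length(6749, 286), "nt")
    print("ORF1/ORF2 overlap", overlap(ORFS["ORF1"], ORFS["ORF2"]), "nt")
    print("ORF3/ORF4 overlap", overlap(ORFS["ORF3"], ORFS["ORF4"]), "nt")
```

Under these assumptions the script reproduces the 143, 135, and 1866 aa products and the 821 nt LIGR given in the text. The ORF4 product length it prints is derived here and is not reported in the source.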
Twenty-four genome sequences of approved and tentative badnaviruses, available in GenBank, were selected for phylogenetic analysis. These included badnaviruses close to GBV1 [15] and a number of virus species from more distant groups [9]. Since the five fGBV1 genomes were almost identical, only the isolate Tem64 was employed for phylogeny. Four phylogenetic trees based on the full-length genomes, ORF3 nt and aa sequences, and aa sequences of the RT/RNaseH domains (410 aa residues) were reconstructed.
When complete genomes were analyzed, fGBV1 was shown to cluster with GBV1 (Figure 1). The sister clade was formed by FBV1 (JF411989) and grapevine Roditis leaf discoloration-associated virus (GRLDaV) (HG940503) [17]. Both clades and the whole cluster, which included these four viruses, were supported by 100% bootstrap values. The composition of this phylogroup and its position on the three other trees were similar (Figure S1A-C). The results of phylogenetic analysis also show that FBV1, GRLDaV, GBV1, and fGBV1 had a common ancestor, suggesting the possibility of vector transmission of GBV1 isolates from fig to grapes or vice versa ([17]; this work). This assumption requires experimental verification.
Based on the phylogeny and complete genome identity data, fGBV1 is most closely related to GBV1. A detailed comparison of their genomes is presented in Table 1. It should be stressed that the direct comparison of the fGBV1 and GBV1 genomes was difficult because the latter was deposited in GenBank in a form not quite common for badnaviruses. For comparison, the GBV1 genome was presented here in the traditional manner, i.e., starting from the tRNAMet sequence. Due to the high similarity of the five fGBV1 genomes, only one isolate (Tem64) was used for the comparison. The fGBV1 genome is 138 nt longer, mainly due to the larger sizes of the LIGR and ORF3. In contrast, fGBV1 ORF4 is noticeably shorter because its start AUG codon is much closer to the LIGR than in GBV1. The RT/RNase H genome regions of these two viruses shared 84.6% identity at the nt level, i.e., they differ by 15.4%. According to the demarcation criteria of the genus Badnavirus, the differences between species in this region should be more than 20% [8,9]. Thus, fGBV1 should be regarded as a divergent isolate of GBV1. The differences between fGBV1 and GBV1 may be due to their adaptation to different hosts.
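The demarcation reasoning above reduces to a percent-identity calculation followed by a threshold test. The sketch below illustrates it under stated assumptions: identity is counted over aligned columns, a gap opposite a base counts as a difference, and columns gapped in both sequences are skipped. Alignment tools differ in how they treat gaps, so this is one possible convention and not necessarily the one used in this work. The toy sequences are invented for illustration.

```python
# Percent identity over a pairwise alignment and the >20% nt difference
# criterion for the RT/RNase H region quoted in the text.
# The gap-handling convention here is an assumption, not the authors' method.

SPECIES_THRESHOLD = 20.0  # percent nt difference (from the text)


def percent_identity(aln_a, aln_b):
    """Percent identity between two aligned sequences of equal length."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aln_a.upper(), aln_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def is_distinct_species(identity_pct, threshold=SPECIES_THRESHOLD):
    """True if the nt difference exceeds the demarcation threshold."""
    return (100.0 - identity_pct) > threshold


if __name__ == "__main__":
    # Hypothetical toy alignment, not real viral sequence.
    print(percent_identity("ATGCATGCAT", "ATGAATG-AT"))
    # Reported RT/RNase H identity between fGBV1 and GBV1.
    print(is_distinct_species(84.6))
```

With the reported 84.6% identity the difference is below the threshold, so the function returns False, in line with treating fGBV1 as a divergent isolate and not as a new species.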
According to the ICTV, there are three criteria for assigning a badnavirus to a new species: host ranges, differences in polymerase (RT + RNase H) nt sequences of more than 20%, and vector specificities [18]. GBV1 was found on grapevine (Vitis vinifera, Vitaceae), while fGBV1 was revealed on figs (Ficus spp., family Moraceae). Detection of fGBV1 on host plants from a different family could potentially be a reason to consider it a new virus species. However, although most badnaviruses have narrow natural host ranges, there are a few exceptions. For example, banana (family Musaceae) is a common host for banana streak viruses. At the same time, the banana streak CA virus isolate Hainan (OL803889) was detected on sugarcane from the family Poaceae. Canna yellow mottle virus was first detected and is ubiquitous in cannas (Cannaceae). On the other hand, this virus was also found on betel nut (Piper betel, Palmaceae; e.g., KJ825824, KM373210, KM403570), ornamental ginger (Alpinia purpurata, Zingiberaceae; KU168312), and carnation (Dianthus spp., Caryophyllaceae; KP836342). Thus, the second criterion also does not allow fGBV1 to be considered a distinct species.
The prevalence of fGBV1 in the collection was surveyed using RT-PCR. The NucleoSpin RNA Plant Kit was chosen to isolate total RNA because the standard protocol involves on-column processing of the sample with recombinant DNase supplied with the kit.
As primers for the GBV1 detection [15] did not recognize the fig isolates of the virus (data not shown), the fGBV1-F1/R1 primers were designed for the specific fGBV1 detection.
During a small-scale survey, amplicons of the expected size (633 bp) were generated in each of 31 F. carica samples, including local and introduced cultivars, as well as in five F. palmata, one F. afghanistanica, and in seven of ten F. virgata specimens (Table 2). In total, 93.6% of the trees tested positive for fGBV1, indicating that it is widespread in the collection.
Although the vast majority of the tested plants were shown to be fGBV1-infected, the virus was not detected in three F. virgata trees (Table 2, Figure 2, upper gel). These plants can be considered negative controls, indicating the specificity of the PCR assay. Another badnavirus, FBV1, is also known to be widespread on F. carica worldwide [4,7]. Using RT-PCR with 1094F/1567R primers [7], FBV1 was shown to be widely distributed in the NBG fig collection. Some representative results of the FBV1 testing are shown in Figure 2 (lower gel). Comparison of the upper and lower gels showed that fGBV1-specific primers did not recognize FBV1 in the F. virgata tree II/2/68, further supporting the specificity of the fGBV1 analysis. In addition, eight 633 bp amplicons were sequenced (Table 2). All of them were fGBV1, confirming the virus-specific detection. Thus, the fGBV1-F1/R1 primers can apparently be used for specific detection of the virus.
Figure 2. RT-PCR detection of fGBV1 (upper gel) and FBV1 (lower gel) using the fGBV1-F1/R1 and 1094F/1567R [7] primers, respectively. The tree numbers (see Table 2 for details) are shown above the picture. Arrows to the right of the pictures indicate PCR products of the corresponding size, bp. M, GeneRuler 100 bp DNA Ladder Plus (Thermo Scientific).
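The reported prevalence follows directly from the per-species counts given above. A minimal sketch of the tally, with counts taken from the text and illustrative names:

```python
# Tally of the fGBV1 survey: species -> (positive trees, tested trees).
# Counts are taken from the text; the layout and names are illustrative.

SURVEY = {
    "F. carica": (31, 31),
    "F. palmata": (5, 5),
    "F. afghanistanica": (1, 1),
    "F. virgata": (7, 10),
}


def prevalence(survey):
    """Return (positive, tested, percent positive) over all species."""
    positive = sum(pos for pos, _ in survey.values())
    tested = sum(total for _, total in survey.values())
    return positive, tested, 100.0 * positive / tested


if __name__ == "__main__":
    positive, tested, pct = prevalence(SURVEY)
    print(f"{positive}/{tested} trees positive ({pct:.1f}%)")
```

This gives 44 positive trees out of 47, which rounds to the 93.6% stated in the text.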
Badnavirus nucleic acid can be present in episomal and/or integrated forms in infected plants [9]. DNase-treated total RNA was used to detect fGBV1 by RT-PCR, resulting in the generation of the specific PCR product (Figure 3, lane 1). No amplification was observed when the RT step was omitted, indicating that DNA contamination was efficiently removed (Figure 3, lane 2). Since fGBV1 was detected after the DNase treatment, viral pgRNA seemed to be the template for RT-PCR, suggesting active virus replication, at least in the fig plants studied in this work. On the other hand, if the DNase treatment was omitted, a strong PCR product of the expected size was obtained by direct PCR (Figure 3, lane 3). This result shows that total RNA contained residual DNA in an amount sufficient to detect the virus by direct PCR. Thus, both PCR and RT-PCR assays can be used to detect fGBV1 successfully. Whether the virus DNA is also integrated in the fig genome has yet to be studied.
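The logic of the three reactions described above can be written down as a small decision rule. This is only a sketch of the reasoning in the text; the lane assignments follow Figure 3 as described, while the class, function, and message strings are hypothetical.

```python
# Interpretation of the three control reactions described for Figure 3.
# Names and messages are illustrative, not part of the published protocol.

from typing import NamedTuple


class Lanes(NamedTuple):
    rt_pcr_dnase: bool          # lane 1: RT-PCR on DNase-treated RNA
    no_rt_dnase: bool           # lane 2: same sample, RT step omitted
    direct_pcr_untreated: bool  # lane 3: direct PCR, no DNase treatment


def interpret(lanes):
    """Return a list of conclusions supported by the band pattern."""
    notes = []
    if lanes.no_rt_dnase:
        # A band without RT means DNA survived the DNase treatment,
        # so a lane 1 signal cannot be attributed to RNA.
        notes.append("DNA carry-over: RT-PCR signal cannot be assigned to RNA")
    elif lanes.rt_pcr_dnase:
        notes.append("viral RNA detected")
    else:
        notes.append("no viral RNA detected")
    if lanes.direct_pcr_untreated:
        notes.append("viral DNA detected")
    return notes


if __name__ == "__main__":
    # Band pattern reported in the text: lane 1 positive, lane 2 negative,
    # lane 3 positive.
    print(interpret(Lanes(True, False, True)))
```

For the reported pattern the rule yields both conclusions drawn in the text: viral RNA is present in the DNase-treated sample, and viral DNA is present in the untreated preparation.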
The sequences of the 633 bp PCR products were shown to be 99.8 to 100% identical at the nt and aa levels, irrespective of the fig species, F. carica cultivar, or the terrace on which the tested tree grows. Since the five fGBV1 full-length genomes were also almost identical, all the tested fig trees were apparently infected with the same fGBV1 isolate, which could have been disseminated throughout the collection by some invertebrate vector.
Most fGBV1 isolates were found on trees with typical FMD symptoms, which could be induced by either fGBV1 or other viruses (Table 2). However, fGBV1 was also detected in several samples with no obvious symptoms of the disease, including two trees of the local F. carica cultivar Sabrutsiya Rosovaya, as well as some F. afghanistanica, F. virgata, and F. palmata trees. This suggests either that fGBV1 does not cause symptoms on infected plants by itself or that these trees are at an early stage of infection when the viral titer is still low.
The observed FMD symptoms were likely due to FMV, which was detected in all the symptomatic trees listed in Table 2 using RT-PCR as described previously [13].
Thus, metagenomic analysis revealed a previously uncharacterized fig badnavirus that expands the list of viruses infecting this crop. fGBV1 is a divergent isolate of GBV1 from grapevine and is clearly different from FBV1. Nevertheless, the two fig badnaviruses, FBV1 [7] and fGBV1 (this work), have much in common. Both viruses are widely distributed in surveyed plantings. FBV1 was detected in the vast majority of fig samples of different origins [7] and is currently the most widespread fig virus in the world [4]. Likewise, fGBV1 was revealed in 93.6% of the samples tested in Russia (Table 2). Genomes of known FBV1 isolates are 99-100% identical [7]; the genomes of the Russian fGBV1 isolates are just as conserved. Moreover, both viruses seem to be symptomless in figs. However, more fGBV1 isolates from other geographical regions should be studied to better understand the genetic diversity and incidence of the virus. Since fGBV1 was shown to be widespread in introduced cultivars, it is very likely that this virus can be detected in other regions of fig cultivation using the RT-PCR assay developed in the present study.

Materials and Methods
Plant material was gathered in the fig germplasm collection of the NBG (44.51N; 34.23E) from 2018 to 2020. The collection is located on a terraced slope and occupies three terraces elongated in the latitudinal direction. Self-rooted trees of various fig species and F. carica cultivars are randomly distributed within rows. Most trees tested for the viruses were about 30 years old. Leaf samples from each tree to be analyzed were tested individually.
Total RNA for HTS was extracted from the F. carica cultivars Temri, Kraps di Hersh, Bleuet, Smena, and from an F. palmata tree with typical FMD symptoms using RNeasy Plant Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol. DNA libraries were prepared using TruSeq Stranded Total RNA Library Prep Plant kit (Illumina, San Diego, CA, USA) and sequenced on MiSeq Illumina platform as described