High-Throughput Sequencing Indicates a Novel Marafivirus in Grapevine Showing Vein-Clearing Symptoms

A putative new marafivirus, tentatively named grapevine-associated marafivirus (GaMV), was identified by high-throughput sequencing in a ‘Jumeigui’ grapevine exhibiting obvious vein-clearing symptoms. The nearly complete genomic sequence of GaMV was amplified by reverse transcription PCR, and the terminal sequences were determined using the rapid amplification of cDNA ends method. The nearly complete genome of GaMV is 6346 bp long, excluding the poly(A) tail, and shows 51.2–62.3% nucleotide identity with other members of the genera Marafivirus, Maculavirus and Tymovirus in the family Tymoviridae. Additionally, it includes five functional domains homologous to those found in members of these genera. A phylogenetic analysis showed that GaMV clustered with other species-related marafiviruses. These data support GaMV being a representative member of a novel species in the genus Marafivirus. Furthermore, GaMV was graft-transmissible, and 26 of 516 (5.04%) grapevine samples from five provinces in China tested positive by reverse transcription PCR. The coat protein sequences of GaMV isolates shared 91.7–100% and 96.7–100% identities at the nt and aa levels, respectively. The coat protein-based phylogenetic tree revealed three well-defined clusters.


Introduction
The International Committee on the Taxonomy of Viruses divides the family Tymoviridae into three genera, Maculavirus, Marafivirus and Tymovirus [1]. These positive-sense single-stranded RNA viruses contain genomes of approximately 6.0-7.5 kb. Five viruses belonging to the family Tymoviridae have been reported to infect grapevine: grapevine fleck virus (GFkV), grapevine red globe virus (GRGV), grapevine Syrah virus 1 (GSyV-1), grapevine asteroid mosaic-associated virus (GAMaV) and grapevine rupestris vein feathering virus (GRVFV). Among them, GFkV and GRGV belong to the genus Maculavirus, whereas GSyV-1, GAMaV and GRVFV belong to the genus Marafivirus [2]. GFkV, GRGV, GAMaV and GRVFV are associated with the grapevine fleck complex, which is a major grapevine viral disease distributed worldwide [3]. This complex consists of several diseases, including grapevine fleck (FK), grapevine asteroid mosaic (AM), grapevine rupestris necrosis and grapevine rupestris vein feathering. The first recognized disease of this complex in California was AM [4]. It has been successfully transmitted by grafting and is characterized by translucent/chlorotic star-like spots on the foliage of several Vitis vinifera cultivars, as well as the clearing of primary and secondary veins of Vitis rupestris [5]. An isometric virus with morphological traits resembling those of GFkV was purified from AM-affected V. rupestris [5] and named GAMaV. GAMaV was further characterized after its genome was sequenced [6-8]. The FK disease has also been reported in California, and it induces symptoms on V. rupestris that are distinct from those of AM. It is characterized by the clearing of the third- and fourth-order veins [9,10]. In Italy, GFkV is associated with FK-affected V. rupestris [11]. GRVFV is associated with vein feathering on V. rupestris, which is characterized by the transient mild chlorotic discoloration of the primary and secondary leaf veins [3,5].
In addition to the viruses causing these diseases, the grapevine vein-clearing virus is associated with grapevine vein-clearing symptoms, similar to those caused by AM [12].
During a field investigation in 2014, a disease causing obvious vein-clearing symptoms was observed on a 'Jumeigui' grapevine in Dalian City, Liaoning Province, China. The symptoms were similar to those of AM caused by the grapevine fleck complex. The possible agent of this disease was thought to be GAMaV or grapevine vein-clearing virus (GVCV), but neither was detected by reverse transcription (RT)-PCR in this sample. In 2017, small RNA sequencing (sRNA-seq) and RNA sequencing (RNA-seq) were employed to identify possible viral infections in the diseased grapevine sample.

Analyses of the High-Throughput Sequencing (HTS) Data
Sequencing of sRNAs and RNAs from symptomatic leaves of the 'Jumeigui' grapevine resulted in 18,599,455 and 602,231,929 clean reads, respectively. The clean data were analyzed to identify viral sequences using VirusDetect software. The sRNA-seq data revealed 17 contigs of 55-286 nt that were homologous to polyproteins of GAMaV (AOX24075), representing 26.6% coverage (Figure S1), and the RNA-seq data revealed 14 contigs of 206-4866 nt that were homologous to polyproteins of oat blue dwarf virus (AAC57874), representing 94.6% coverage (Figure S1). These results indicated the presence of a candidate marafivirus in the 'Jumeigui' grapevine sample, which we tentatively named grapevine-associated marafivirus (GaMV). Mapping showed that 50,789 sRNA reads and 12,937 RNA reads from the sRNA-seq and RNA-seq clean data, respectively, were derived from GaMV (Figure 1). Additionally, both sRNA-seq and RNA-seq identified contigs in the sample that were homologous to sequences of grapevine geminivirus A, grapevine leafroll-associated virus 3 (GLRaV-3), grapevine rupestris stem pitting-associated virus, GFkV, grapevine Pinot gris virus, grapevine virus E, grapevine yellow speckle viroid 1, hop stunt viroid and citrus exocortis viroid.
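The coverage values reported in this section (26.6% and 94.6%) express how much of a reference sequence is spanned by the aligned contigs. The following is a minimal sketch of such a calculation, assuming that contig alignments are available as 1-based, inclusive (start, end) positions on the reference. The function name and the example intervals are hypothetical and are not data from this study; VirusDetect may compute coverage differently.

```python
def reference_coverage(intervals, ref_length):
    """Return the percentage of a reference covered by aligned contigs.

    intervals: iterable of (start, end) pairs, 1-based and inclusive.
    Overlapping or adjacent intervals are merged so that no reference
    position is counted twice.
    """
    if ref_length <= 0:
        raise ValueError("ref_length must be positive")
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end + 1:
            # A gap precedes this interval: close the previous merged block.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return 100.0 * covered / ref_length


# Hypothetical contig positions on a 1000-residue reference (illustration only).
example_intervals = [(1, 100), (51, 150), (301, 400)]
print(round(reference_coverage(example_intervals, 1000), 1))
```

Merging intervals before summing matters here because contigs from de novo assembly frequently overlap, and a plain sum of contig lengths would overestimate coverage.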

Sequence and Phylogenetic Analyses of the GaMV Genome
Five overlapping fragments of GaMV isolate JMG were amplified by RT-PCR with specific primers and assembled into a contiguous sequence through the regions shared by adjacent amplicons (generally about 100 bp). The intact sequence of the 3′-untranslated region (UTR) was obtained using 3′-rapid amplification of cDNA ends (RACE), whereas only a partial sequence of the 5′ end was obtained using 5′-RACE, despite several attempts to produce a longer fragment. The resulting nearly complete genome of GaMV (GenBank accession no. MZ422607) was 6346 bp long, excluding the poly(A) tail. It showed 51.2-62.3% nucleotide identity with members of the genera Marafivirus, Maculavirus and Tymovirus in the family Tymoviridae (Table 1). Five domains were identified in the polyprotein using the CD search tool at the National Center for Biotechnology Information database.
A phylogenetic tree was constructed to establish the relationships between GaMV and other members of the family Tymoviridae (Figure 2). The nucleotide sequences of the complete genomes of all the approved marafiviruses, and of several members of the genera Maculavirus and Tymovirus, were used in the analyses. GaMV clustered together with other marafiviruses.
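The assembly of the five amplicons through their shared end regions, as described in this section, can be illustrated with a short sketch. It assumes that the fragments are supplied in 5′ to 3′ order and that the shared regions match exactly; the function name, the minimum overlap parameter and the toy sequences are hypothetical. Real assembly software tolerates mismatches and was not necessarily used in this way by the authors.

```python
def merge_overlapping(fragments, min_overlap=20):
    """Join ordered fragments (5' to 3') through their shared end regions.

    The 3' end of the growing sequence must match the 5' start of the next
    fragment exactly over at least min_overlap bases. The longest such match
    is used, and the shared region is kept only once.
    """
    if not fragments:
        return ""
    assembled = fragments[0]
    for nxt in fragments[1:]:
        max_len = min(len(assembled), len(nxt))
        for k in range(max_len, min_overlap - 1, -1):
            if assembled.endswith(nxt[:k]):
                assembled += nxt[k:]
                break
        else:
            # No candidate overlap length matched.
            raise ValueError("no overlap of the required length found")
    return assembled


# Toy fragments with short overlaps (illustration only, not GaMV sequence).
toy = ["ACGTACGGTTCA", "GTTCAGGCTA", "GCTAAAT"]
print(merge_overlapping(toy, min_overlap=4))
```

With overlaps of about 100 bp, as reported for the GaMV amplicons, a larger `min_overlap` would reduce the chance of joining fragments through a spurious short match.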

Graft-Transmission of GaMV
All the grafted 'Beta' grapevines showed obvious symptoms of vein-clearing and chlorotic mottling (Figure 3a-c). At 12 months after grafting, the five grafted 'Beta' grapevines tested positive for GaMV using the two specific PCR primer pairs (Figure 3d). In contrast, non-grafted plants tested negative for GaMV and did not show symptoms. These data suggested that GaMV was transmissible by grafting.

Sequence Identities of CP Genes and Phylogenetic Relationships between Different GaMV Isolates
The CP sequences of 20 GaMV isolates were obtained to analyze the genetic diversity among GaMV isolates. The sequences of these isolates have been deposited in GenBank (Table 2). For all these isolates except ML2, the three initially sequenced clones from a single grapevine showed >99.0% nt identity. Therefore, more clones of isolate ML2 were sequenced, and two variant types, represented by ML2 clone 3 and ML2 clone 5, were identified. The CP sequences of GaMV isolates showed 91.7-100% and 96.7-100% identities at the nucleotide and amino acid levels, respectively (Table S2). The CP-based phylogenetic tree revealed the existence of three well-defined clusters. Most isolates belonged to group I, whereas isolates ML2 and LF belonged to groups II and III, respectively (Figure 4).
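Identity values such as those reported for the CP sequences are derived from pairwise comparisons of aligned sequences. The sketch below shows one common way to compute percent identity, assuming that the two sequences have already been aligned to equal length and that columns containing a gap are excluded. The function name and toy sequences are hypothetical, and the gap handling is an assumption; the definition used for Table S2 is not stated in this section and may differ.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # Assumption: gapped columns are ignored.
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy aligned sequences differing at one of ten positions (illustration only).
print(percent_identity("ATGGCTAAGT", "ATGGCAAAGT"))
```

The same function works for amino acid alignments, since it only compares characters column by column.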

Discussion
More than 80 species of grapevine viruses have been reported to date [15], and new grapevine viruses continue to be identified, largely owing to HTS technology [16]. Most grapevine viruses are associated with the three major viral disease complexes in grapevine: infectious degeneration, leafroll and rugose wood. The next most important disease is the grapevine fleck complex, which occurs worldwide [3]. Among the viruses associated with this complex, GFkV, GRGV, GSyV-1 and GRVFV have been found in China, but their pathogenicity levels remain poorly understood. Here, we demonstrated infection by a new marafivirus, GaMV, in a grapevine sample showing obvious vein-clearing symptoms. Our results provide new knowledge on the fleck complex caused by viruses in the genus Marafivirus.
The genomic sequences of GaMV in the diseased 'Jumeigui' grapevine were confirmed by sRNA-seq, RNA-seq and traditional Sanger sequencing of PCR products. The nearly complete genome of GaMV was also determined by multiple 5′-RACE amplifications, which produced the longest obtainable genomic fragments. The genomic sequence included domains homologous to all five common domains found in other marafiviruses, and it also contained a marafibox-like sequence similar to that conserved in viruses belonging to Marafivirus [14]. Furthermore, the complete genome-based phylogenetic tree revealed that GaMV clustered with other species-related marafiviruses. Additionally, this study demonstrated the graft transmission of GaMV. These characteristics strongly support GaMV being a novel member of the genus Marafivirus.
Vein clearing is a typical symptom of the grapevine fleck complex. Although the symptoms observed on the 'Jumeigui' samples were very similar to those of AM, we could not confirm a relationship between GaMV and the disease because the sample contained multiple viruses, including GFkV and grapevine Pinot gris virus (GPGV). GFkV has been associated with the grapevine fleck complex, whereas GPGV has been putatively associated with a novel grapevine disease known as grapevine leaf mottling and deformation [17]. Therefore, the pathogenicity of GaMV alone still needs to be determined. The field survey showed that GaMV was present in the field at a moderate rate (5.04%), and analyses of CP genes from different samples indicated that variants exist. Therefore, the potential harm of GaMV to grapevines is a cause for concern because of its graft transmissibility and occurrence status.

Plant Material for HTS
A 'Jumeigui' grapevine showing vein-clearing symptoms (Figure 5a,b) was collected from a vineyard of the Dalian Academy of Agricultural Sciences (Dalian City, Liaoning Province, China) in 2014. Cuttings propagated from the infected grapevine also showed vein-clearing symptoms (Figure 5c-e). In spring 2017, diseased leaves were collected and frozen rapidly in liquid nitrogen before being packed in dry ice (solid carbon dioxide) and shipped to Biomarker Biology Technology (Beijing, China). Transport took 2-3 days.

HTS and Bioinformatics Analyses
Total RNA was extracted from leaf samples and used to generate a cDNA library of sRNAs. sRNA-seq was carried out using an Illumina HiSeq™ 2000 system (SinoGenoMax, Beijing, China), as reported previously [18]. Clean data were obtained by removing sequences <18 nt or >30 nt, low-quality tags, poly-A tags and N-tags from raw reads. Sequences of potential viruses were identified by analyzing the clean data using VirusDetect (http://virusdetect.feilab.net/cgi-bin/virusdetect/index.cgi/) [19]. For RNA-seq, the Epicentre Ribo-Zero rRNA Removal Kit (Epicentre, Madison, WI, USA) was used to remove ribosomal RNA from extracts of total RNA. The ribosomal RNA-depleted RNA sample was then used to construct a cDNA library using a TruSeq RNA Sample Prep Kit (Illumina, San Diego, CA, USA), which was sequenced on an Illumina HiSeq 4000 platform with a paired-end 150-bp format (Biomarker Biology Technology). Reads mapping to the grapevine genome (PN40024 assembly 12X) were removed using the hierarchical-indexing aligner HISAT [20]. Unmapped reads were used for de novo assembly and BLAST analyses embedded in VirusDetect.
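The sRNA clean-read criteria described in this section (length of 18-30 nt, no poly-A tags, no N-containing tags) can be expressed as a simple filter. The sketch below is illustrative only: the function name is hypothetical, the poly-A threshold (at least 90% adenine) is an assumption because the source does not define it, and quality-score filtering is omitted because it requires per-base quality values.

```python
def is_clean_srna(seq, min_len=18, max_len=30, max_a_fraction=0.9):
    """Return True if a small RNA read passes the length, N and poly-A filters.

    max_a_fraction is an assumed cut-off: reads whose adenine content is at
    or above this fraction are treated as poly-A tags.
    """
    seq = seq.upper()
    if not seq:
        return False
    if not (min_len <= len(seq) <= max_len):
        return False  # Outside the 18-30 nt size range.
    if "N" in seq:
        return False  # Contains an ambiguous base call.
    if seq.count("A") / len(seq) >= max_a_fraction:
        return False  # Treated as a poly-A tag.
    return True


# Toy reads (illustration only).
reads = [
    "ACGTACGTACGTACGTACGT",   # 20 nt, passes
    "ACGT",                   # too short
    "A" * 22,                 # poly-A
    "ACGTNACGTACGTACGTACG",   # contains N
]
clean = [r for r in reads if is_clean_srna(r)]
print(len(clean))
```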

Amplification and Analyses of the GaMV Genome
Five primer pairs were designed on the basis of contig sequences and were used to amplify GaMV genomic sequences (Table 3). PCR fragments were recovered, purified and then cloned into the Zero Background pTOPO-Blunt vector (Aidlab, Beijing, China). At least three positive clones of each PCR product were sequenced at Shanghai Sangon Biological Engineering & Technology (Shanghai, China). The 5′- and 3′-UTRs were amplified by the RACE strategy using a SMARTer® RACE 5′/3′ Kit (TaKaRa) in accordance with the manufacturer's instructions.

Sequence Analyses
The tool ORF Finder at the National Center for Biotechnology Information was used to search for potential open reading frames in the genomic RNA of GaMV. The CD search