Zanthoxylum is a genus of approximately 225 species of deciduous and evergreen shrubs, trees and woody climbers in the family Rutaceae
]. It is native primarily to tropical and subtropical regions, and some of its species extend into warm temperate zones [2]. The genus is economically important and has been used in pharmaceutical, cosmetic and culinary applications [2]. Leaves, bark, fruits and fruit oil are used in traditional medicine to treat asthma, the common cold, gout, jaundice, diarrhea, pain and stomach upsets [4]. Fruits and fruit oil are also extensively used as a spice in eastern Asian countries [3]. In China, where Zanthoxylum has a long history of cultivation, its production is now worth billions of dollars. In particular, green Sichuan pepper (Z. armatum) is one of the most economically important commercial crops in Chongqing Municipality, which produces 70% of the dry fruits of this variety nationwide. For instance, in 2018 approximately 530,000 acres of this variety were planted in Jiangjin District alone, with a record sales value of over half a billion dollars. Several pathogens, including fungi and phytoplasmas, cause serious diseases in cultivated Zanthoxylum
]. However, no virus or viroid has previously been reported to be associated with this crop. Recently, a virus-like disease, flower yellowing disease (FYD), was found on green Sichuan pepper trees in Chongqing. Typical symptoms of FYD include internode shortening, leaf curl, pistil abortion, stamen yellowing and intumescence, and eventually fruit drop. After an irreversible, degenerative progression over a few years, the diseased trees decline and die, causing severe yield losses. Initial symptoms on the blossoms usually appear on only a few branches of a diseased tree, and more branches show symptoms as the disease progresses. In fields of green Sichuan pepper propagated by seed and by grafting, symptomatic trees were randomly distributed (Figure 1g). It appears that FYD is naturally transmitted in some way from source trees to healthy trees. However, the etiology of the disease is unknown.
Advances in high-throughput sequencing (HTS) have led to the development of metagenomic analyses of genomes from a variety of life forms [12]. This technology has also been used to study populations of many organisms. One example is viral metagenomics, or the generation of the virome from a single species instead of an entire microbial community [13]. The first study of the human virome was initiated as a systematic exploration of viruses infecting humans [14]. This systematic approach was then extended to animals and plants [15]. Studies of plant viromes, such as those of grapevine and pepper, have provided significant information on viral populations and viral quasispecies [17]. In principle, HTS techniques such as RNA sequencing (RNA-seq) and small RNA sequencing (sRNA-seq) yield short reads of viral RNA or sRNA extracted from plants, which are assembled in silico into longer sequences (contigs) that facilitate BLAST-based or BLAST-independent annotation [19].
In this study, HTS by both RNA-seq and sRNA-seq, coupled with bioinformatic analysis, allowed the discovery of four new RNA viruses, tentative members of the genera Nepovirus (subgroup C), Idaeovirus, Enamovirus and Nucleorhabdovirus, in diseased green Sichuan pepper plants. To understand the cause of FYD, we then performed a large-scale field survey, during which many indications pointed to the nepovirus. The prevalence of the viruses and of FYD, the associated symptoms, and the demonstrated ability of the viruses to spread through farm operations (grafting) or by as yet unspecified natural transmission in the open field cannot be neglected, which makes further work on these viruses important.
2. Materials and Methods
2.1. Plant Materials
In June 2017, leaves and inflorescences were collected from green Sichuan pepper trees with or without symptoms and stored at −80 °C. One tree showing foliar yellowing and dwarfing as well as flower yellowing and deformation was used for high-throughput sequencing (HTS) analysis (Figure 1a,b,e). An asymptomatic tree was used as a negative control (Figure 1
2.2. RNA Preparation and HTS
To identify potential viral and viroidal pathogens, total RNA was extracted from both samples using the EASYspin Plus Complex Plant RNA Kit (Aidlab, China) for RNA sequencing (RNA-seq), and total sRNA was extracted from the same samples using the EASYspin Plant microRNA Extract Kit (Aidlab) for small RNA sequencing (sRNA-seq). For RNA-seq, ribosomal RNA was depleted, and libraries were then constructed with a TruSeq RNA Sample Prep Kit (Illumina, USA) and sequenced on the Illumina HiSeq X Ten platform (paired-end reads of 150 bp). The sRNA libraries were built using a TruSeq Small RNA Sample Prep Kit (Illumina) and sequenced on the Illumina HiSeq 2500 platform (1 × 50 bp reads).
2.3. HTS Data Analysis and Virus Identification
Reads obtained from RNA-seq and sRNA-seq were analyzed using CLC Genomics Workbench 9.5 (Qiagen, Germany). After adaptor trimming and filtering of raw reads, de novo assembly of clean reads was carried out using Trinity (RNA-seq) or Velvet (sRNA-seq) [20]. The clean reads from RNA-seq were first mapped against the Citrus sinensis reference genomes (downloaded from https://www.citrusgenomedb.org) to remove host reads and then assembled, whereas the trimmed sRNA-seq clean reads were assembled directly. Assembled contigs were subjected to BLASTx and BLASTn searches (https://blast.ncbi.nlm.nih.gov/Blast.cgi) against viral (taxid:10239) and viroidal (taxid:12884) sequences in the GenBank databases, respectively.
2.4. Determination of Virus Genomes
To determine the genome sequence of each novel virus, specific primers (Table S1) were designed from the contig sequences obtained in this study and used in RT-PCR to amplify overlapping viral fragments. One-step RT-PCR was carried out using the PrimeScript One Step RT-PCR Kit (Takara, Japan). RACE of the 5′ and 3′ termini was performed using the GeneRacer Core Kit (Invitrogen, USA). PCR amplicons were purified using the Gel Extraction Kit (Biomega, USA) and then cloned using the pEASY-T1 Vector System (TransGen, China). At least five clones of each PCR amplicon were fully sequenced (Sanger), and the sequences of all amplicons were assembled to generate complete genome sequences.
2.5. HTS Data Reanalysis
The resulting full-length viral genome sequences were used as references for mapping both RNA-seq and sRNA-seq clean reads in CLC Genomics Workbench. Average coverage was calculated by multiplying the number of mapped reads by the average read length (149.74 nt for RNA-seq; 21.24 nt for sRNA-seq) and dividing the product by the length of the genome. In addition, the small RNAs of each virus were categorized by genome polarity, size and 5′-terminal nucleotide preference, then analyzed statistically and visualized.
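The coverage formula and the sRNA categories described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CLC Genomics Workbench workflow used in the study: the function names, the `(sequence, strand)` input format and the example read count and genome length are hypothetical, and only the two mean read lengths come from the text.

```python
from collections import Counter

# Mean read lengths reported in the text (nt).
RNA_SEQ_MEAN_READ_LENGTH = 149.74
SRNA_SEQ_MEAN_READ_LENGTH = 21.24


def average_coverage(mapped_reads, mean_read_length, genome_length):
    """Average coverage = mapped reads x mean read length / genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return mapped_reads * mean_read_length / genome_length


def summarize_vsirnas(reads):
    """Tally virus-derived sRNAs by polarity, size and 5'-terminal nucleotide.

    `reads` is an iterable of (sequence, strand) pairs, where strand is '+'
    or '-' relative to the viral genome (hypothetical input format).
    """
    by_polarity, by_size, by_first_nt = Counter(), Counter(), Counter()
    for sequence, strand in reads:
        if not sequence:
            continue  # skip empty records
        by_polarity[strand] += 1
        by_size[len(sequence)] += 1
        by_first_nt[sequence[0].upper()] += 1
    return by_polarity, by_size, by_first_nt


# Hypothetical numbers: 10,000 RNA-seq reads mapped to a 7,500-nt segment.
depth = average_coverage(10_000, RNA_SEQ_MEAN_READ_LENGTH, 7_500)
print(f"average coverage: {depth:.1f}x")

# Hypothetical vsiRNA reads, for illustration only.
polarity, sizes, first_nt = summarize_vsirnas(
    [("UGGAAUCGAUCGAUCGAUCGA", "+"), ("AGCUAGCUAGCUAGCUAGCUAG", "-")]
)
print(polarity, sizes, first_nt)
```

Counting per category keeps the summary independent of any plotting library; the resulting counters can be passed to whatever visualization tool is at hand.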
2.6. Sequence and Phylogenetic Analysis
The standard procedure for analyzing the genome sequence of a new virus was as follows: (I) open reading frames (ORFs) of the viral genome or genome segment were identified with NCBI ORFfinder (https://www.ncbi.nlm.nih.gov/orffinder); (II) closely related viruses were found by BLASTp searches of the databases using the amino acid (aa) sequences of all ORF-encoded proteins; (III) protein molecular weights were estimated using CLC Genomics Workbench; (IV) conserved domains of putative proteins were detected with the NCBI Conserved Domain Database (CDD) (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi); (V) conserved motifs of putative proteins were found by multiple sequence alignment of the new virus and related viruses using ClustalW [22].
Phylogenetic relationships between each novel virus and a range of related viruses were analyzed with MEGA 7 [23], using amino acid sequences of entire or partial proteins that are conserved and commonly selected for phylogenetic reconstruction of these viruses. All maximum-likelihood trees were generated from MUSCLE alignments using the Jones-Taylor-Thornton (JTT) model, complete deletion of gaps and 1000 bootstrap replicates.
2.7. RT-PCR Protocols and Field Surveys
One-step RT-PCR assays based on the PrimeScript One Step RT-PCR Kit (Takara, Japan) were developed for detection of the four viruses, one assay per virus, and used to test the field samples collected earlier. Virus-specific primer sequences, reaction systems and thermocycling conditions for each virus are listed in Tables S1 and S2.
To investigate the distribution and infection rates of the four viruses in the main green Sichuan pepper growing areas of Chongqing, a total of 217 leaf samples were tested. Of these, 186 leaf samples were collected from FYD-affected (41) and asymptomatic (145) trees in the fields. The sampled areas included Fulu, Guangpu and Sanhe of Bishan County, Xianfeng and Ciyun of Jiangjin District, Shichuan of Yubei District and Dandu of Changshou District. Most of these areas are adjacent to Jiangjin District, where the sequencing samples were initially collected.
To look for evidence of natural virus transmission, leaf samples from 31 seedlings grown in an open field at the Ciyun breeding base in Chongqing, where FYD is common, were also tested.
Furthermore, to analyze the presence of GSPNeV in symptomatic and asymptomatic branches of the same tree, both types of branches from four GSPNeV-positive plants were tested separately by RT-PCR.
2.8. Graft Transmission
ZP-9, a tree with flower yellowing symptoms, was confirmed by RT-PCR to be naturally infected with all four viruses. Mature branches were collected from ZP-9 and from a healthy tree. To show that these viruses are active entities rather than fragments integrated into the host genome, green bark chips from each source were separately grafted onto five virus-free green Sichuan pepper seedlings used as stocks, with three bark inoculations per stock. Additionally, to support that GSPNeV is transmissible, buds from a symptomatic tree infected only with this virus were grafted onto five stocks in the same way. The grafted seedlings were kept in a growth chamber under a 12-h light/12-h dark cycle at 25 °C and 75% relative humidity.
HTS, also known as next-generation sequencing or deep sequencing, is a sensitive and reliable technique for discovering previously unknown viruses and studying the virome of agricultural crops [46]. Several options for enriching viral nucleic acids in a sample are available for viral diagnostics, including virion extracts, total nucleic acids, dsRNA, poly(A) RNA and small RNAs, among others. Sequencing small RNAs together with total RNA is convenient and encompasses all the sequence information of viral pathogens, including viroids [19]. For RNA-seq, the RNAs in the library, including those of viral origin, are randomly fragmented before cDNA synthesis and adaptor ligation, and then sequenced. By contrast, the sRNAs of different sizes generated during plant RNA silencing against virus infection are produced by the endogenous silencing machinery [48]. The characteristics of virus-derived small interfering RNAs (vsiRNAs) provide insight into the interaction between a virus and its host plant. RNA-seq is more appropriate than sRNA-seq for detecting some viruses present at low titer in a plant, because the RNA reads assemble into longer contigs [51]. By combining RNA-seq, sRNA-seq, RT-PCR cloning and Sanger sequencing, we discovered four novel RNA viruses belonging to distinct genera in an FYD-affected green Sichuan pepper tree. However, the presence of other viral pathogens lacking sequence similarity to any known virus cannot be entirely excluded.
Mixed infection by four different RNA viruses in a single plant species supports the view that horizontal virus transfer is common. All four viruses are novel, although each is distantly related to a group of known viruses in the established taxonomy. Members of Nepovirus
) are bipartite viruses with a positive-sense, single-stranded (+ss) RNA genome. Both RNA1 (~7.5 kb) and RNA2 (~3.9 kb) have a VPg linked to their 5′ end and a poly(A) tail at their 3′ end, and each contains an ORF encoding a polyprotein [24
]. Viruses of the genus Idaeovirus
are composed of linear +ss RNA1 (5.4 kb), RNA2 (2.2 kb) and sgRNA3 (1.0 kb), which express the replicase, movement protein (MP) and coat protein (CP), respectively. They may have a stem-loop structure at the 3′ end of the genome, which is not polyadenylated [29
]. The genus Enamovirus
) is yet to be assigned to an order. The genome of these monopartite viruses is a linear +ss RNA of 5.7 kb that contains five ORFs and has a 5′-terminal VPg but no 3′ poly(A) tail [53
]. The genus Nucleorhabdovirus
) is characterized by a linear negative-sense (−) ssRNA genome (11–15 kb) encoding five to seven proteins in the order 3′l-N-P-P3-M-G-L-5′t [54
]. The results of the comprehensive sequence and phylogenetic analyses of the genomes of the four novel viruses determined in this study are consistent with the salient features described for known viruses of these genera. Therefore, they are provisionally named GSPNeV, GSPIV, GSPEV and GSPNuV. The genomes and encoded proteins of these viruses are divergent from those of known viruses infecting other plant species, suggesting that they are distinct members of the genera Nepovirus
Virome analysis of the diseased tree revealed the viral population profile and the antiviral RNA silencing response of the host to the infections. High viral read numbers in RNA-seq correlate with high levels of viral genomic RNA, replication intermediates or mRNA present in the diseased tissues during infection, replication or transcription. From this perspective, the high abundance of GSPNeV and GSPIV may suggest that these viruses replicate to higher titers than GSPEV and GSPNuV, or alternatively that sampling occurred while these viruses were actively replicating. The RNA silencing response of green Sichuan pepper to virus infection resembles that previously observed in other plants [55], as the vsiRNAs show similar characteristics. Although all four viruses are potentially affected by the same silencing system, GSPNeV appears to be targeted more readily because it gave rise to many more vsiRNAs than the other viruses. Intriguingly, the ratio of the proportion of transcriptome reads to the proportion of sRNA reads is 6.09 (5.3%/0.87%) for GSPIV, much higher than the 0.68 (7.15%/10.42%) for GSPNeV, 0.53 (0.39%/0.73%) for GSPEV and 0.45 (0.05%/0.11%) for GSPNuV. For unknown reasons, the other three viruses thus appear to share a general pattern in the plant (ratio < 1), whereas GSPIV does not (ratio > 6). All these indications suggest that the interactions between the plant and the viruses are complex.
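The ratio comparison above amounts to a simple division per virus, which the following minimal Python sketch reproduces from the percentages quoted in the text. The dictionary layout and function name are illustrative assumptions; because the quoted percentages are already rounded, a recomputed ratio may differ from the published value in the last digit.

```python
# Percentages of total reads quoted in the text: (RNA-seq %, sRNA-seq %).
READ_PROPORTIONS = {
    "GSPIV": (5.3, 0.87),
    "GSPNeV": (7.15, 10.42),
    "GSPEV": (0.39, 0.73),
    "GSPNuV": (0.05, 0.11),
}


def rna_to_srna_ratio(rna_pct, srna_pct):
    """Ratio of the transcriptome read proportion to the sRNA read proportion."""
    if srna_pct <= 0:
        raise ValueError("srna_pct must be positive")
    return rna_pct / srna_pct


ratios = {
    virus: rna_to_srna_ratio(rna_pct, srna_pct)
    for virus, (rna_pct, srna_pct) in READ_PROPORTIONS.items()
}

# List the viruses from the highest to the lowest ratio.
for virus, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    flag = "above 1" if ratio > 1 else "below 1"
    print(f"{virus}: {ratio:.2f} ({flag})")
```

A ratio above 1 means the virus is relatively more abundant in the transcriptome than in the sRNA pool; of the four viruses, only GSPIV falls in that category.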
The virus sequences obtained by HTS form the basis for unraveling the etiology of viral diseases [56]. Specific, sensitive and rapid RT-PCR assays were developed and used to detect the viruses. The prevalence of GSPNeV and GSPEV is probably due to their effective transmission by vectors in the fields. Interestingly, GSPIV (RNA2) apparently has the highest number of viral copies in the sequenced plant. A similar case has been reported in which a synergistic effect probably occurs when an idaeovirus coinfects a plant with other viruses [59]. However, the distribution of GSPIV in the field is very limited. Most of the diseased trees in the surveyed growing regions were co-infected by GSPNeV and GSPEV. It is not clear whether one or both of them are causal agents of FYD. However, many observations, irrespective of whether all FYD-affected trees were positive for GSPNeV by RT-PCR, suggest that GSPNeV might be the pathogen causing FYD. Many asymptomatic samples were also infected by GSPNeV, although this could be explained by latent infection. If a pathogenic virus accumulates to only a low level in a plant, the infected plant is likely to be asymptomatic [60]. Taking this into consideration, we speculate that the GSPNeV population in asymptomatic trees is small and that symptoms become visible once the virus titer reaches a high level over the course of a few years. It should also be noted that asymptomatic trees infected with GSPNeV might belong to cultivars that are to some extent resistant to the virus, which is the rule rather than the exception for plant viruses [61]. Moreover, the viruses in asymptomatic trees may represent virulence-deficient variants that do not induce symptoms. The development of FYD, within a single tree or among trees, is visibly a gradual process, and the disease may initially show no symptoms. Therefore, we have inoculated virus-free trees of a disease-sensitive cultivar with GSPNeV-infected material in order to fulfill Koch’s postulates. Observation of symptom expression is ongoing.
The transmission of these viruses by grafting was demonstrated in this study, but we have not yet observed symptoms on the grafted trees similar to those on the source tree. Indeed, the development of FYD in the field has been empirically observed to be slow but suddenly destructive, and whether it has a latent period in asymptomatic trees is unclear [62]. Grafting, which is easy to implement, has become a common practice for propagating green Sichuan pepper trees and will inevitably accelerate the spread of the viral pathogens if the source trees are not certified. In addition, the wide distribution of the viruses in the fields and the breeding base suggests some natural dispersal of these viruses beyond that linked to production practices such as cutting and grafting. As previously reported, nepoviruses can be vectored by nematodes, pollen and seeds and, in some cases, by mites and thrips [24]. Transmission of idaeoviruses by pollen and seeds can be rapid, but no biological vector has been reported [63]. Enamoviruses are transmitted by aphids [64], and nucleorhabdoviruses by insects such as leafhoppers, planthoppers and aphids [54]. However, we did not address these transmission routes in our study.
Farmers have recently reported that the incidence of FYD has increased rapidly. As more and more trees are planted, the viruses will severely jeopardize the sustainability of green Sichuan pepper production. We are currently investigating the role of each of the four viruses in FYD. Preliminary observations noted various symptoms that might be associated with different viruses in the plants, but further tests are necessary to confirm these associations. Considering the natural infection of seedlings by these viruses, future work will focus on elucidating the mechanism of viral transmission and developing a virus-free certification program for distributing seedlings to farmers.
In summary, we identified four new graft- and naturally-transmissible RNA viruses by HTS using RNA-seq and sRNA-seq, obtained their complete genomes by cloning and Sanger sequencing, and investigated their occurrence in several major producing regions in Chongqing using RT-PCR. The field survey allowed us to find an association of GSPNeV with FYD, and the virome analysis also suggests such an association. Further characterization of the viral sRNA populations recalls a typical virus-plant interaction model with respect to RNA silencing, an important mechanism that could be utilized in combating viral infections in plants [66]. Urgently needed future work was discussed. Altogether, these data provide significant information for virus control and disease management, contributing to the sustainability of green Sichuan pepper production.