Recent Advances on Detection and Characterization of Fruit Tree Viruses Using High-Throughput Sequencing Technologies

Perennial crops, such as fruit trees, are infected by many viruses, which are transmitted through vegetative propagation and grafting of infected plant material. Some of these pathogens cause severe crop losses and often reduce the productive life of orchards. Detection and characterization of these agents in fruit trees is challenging; however, in recent years, the wide application of high-throughput sequencing (HTS) technologies has significantly facilitated this task. In this review, we present recent advances in the discovery, detection, and characterization of fruit tree viruses and virus-like agents accomplished by HTS approaches. A large number of new viruses have been described in the last 5 years, some of them exhibiting novel genomic features that have led to proposals for the creation of new genera and the revision of current virus taxonomy. Interestingly, several of the newly identified viruses belong to virus genera previously unknown to infect fruit tree species (e.g., Fabavirus, Luteovirus), a fact that challenges our perspective of plant viruses in general. Finally, applied methodologies, including the use of different molecules as templates, as well as the advantages, disadvantages, and future directions of HTS in fruit tree virology, are discussed.


Introduction
Fruit trees host a wide range of viral pathogens, mainly as a result of their vegetative mode of propagation and perennial nature [1]. Notably, among fruit trees, 44 viruses have been identified in the major cultivated Prunus species [2]. Some of these pathogens cause diseases that reduce yield and/or fruit quality [1]. The identification, detection, and characterization of the causal agents of such diseases are challenging, due to the low titer and irregular within-plant distribution of viruses in perennial plants, the occurrence of both inter- and intra-species mixed infections in single trees, the frequent symptomless infections or fluctuation of symptom intensity during the season, and the complex and heterogeneous nature of viral populations [3][4][5][6][7][8][9].
Several biological, serological, and molecular assays have been applied, either alone or in combination, for the detection of fruit tree viruses. Biological assays, also known as biological indexing, were, in the past, an essential tool for the evaluation of the health status of individual plants and for the detection and/or characterization of a particular virus. In the case of fruit tree viruses, bioassays took advantage of the increased susceptibility of some fruit tree genotypes to viral infection, resulting in pronounced and easily identifiable symptoms expression. A number of fruit tree genotypes have been used for this purpose, e.g., Malus platycarpa, Malus micromalus, Malus pumila "Virginia Crab" or "Spy227", for apple and pear virus pathogens [10,11]; and Prunus persica GF305, Prunus tomentosa, Prunus serrulata "Kwanzan" or "Shirofugen" [12][13][14], for stone fruit viral pathogens. Besides woody indicators, several herbaceous plant species, showing either local or systemic symptoms, have also been used [3,10,15], although no universal indicators for all fruit viruses are available.
The effectiveness of bioassays varies considerably and can be hampered by several factors. Firstly, controlled greenhouse facilities are needed for plant maintenance, due to the vector-mediated transmission of several viruses, which limits large-scale and easy-to-use evaluation. Although symptoms on woody indicators can appear a few weeks post-inoculation, they may take from several months to years in the case of some viruses (e.g., Plum pox virus, PPV). In addition, the high intra-species genetic diversity of fruit tree viruses may result not only in variable symptomatology [16,17], but also in symptomless reactions (e.g., PPV-Rec isolates in GF305 peach seedlings) [18]. Finally, a question arises concerning the health status of the indicators used, if they are not derived from in vitro meristem culture or thermotherapy sanitation, because hidden infections may influence the interpretation of results.
Nowadays, although still necessary in fruit tree virus research (assessment of biological properties, pathotyping, etc.) and certification schemes, the labor- and time-consuming biological indexing, with its problematic reliability and subjective evaluation, has less practical application in virus diagnosis.
Serological methods, i.e., ELISA (enzyme-linked immunosorbent assay) and its modifications, although limited in sensitivity, enable cheap and fast parallel diagnosis of many samples, using either polyclonal, monoclonal, or recombinant antibodies [19][20][21]. However, the immunogenicity of some fruit tree viruses is low, yielding antibodies that are not reliable enough for mass-scale diagnosis. Therefore, specific and sensitive detection of most fruit tree viruses and virus-like pathogens is mainly based on molecular diagnosis.
Generally, detection of viral pathogens by genome amplification techniques (Polymerase chain reaction: PCR, Reverse transcription polymerase chain reaction: RT-PCR and Loop-mediated isothermal amplification: LAMP) is more sensitive than ELISA, while offering the possibility of multiple detection [22][23][24][25]. However, the development of specific molecular tests is strongly dependent on the knowledge of the pathogen genome and its molecular diversity, making primer design highly dependent on (or limited by) available sequence datasets. Especially in the case of recently discovered or poorly studied viruses, the available sequences might not represent the real pathogen diversity present in the field, resulting in the development of insufficiently polyvalent tests, which might provide false negative results. Similar situations also occurred with well-known and widely spread viruses, such as Prune dwarf virus (PDV) [26] and Apple chlorotic leaf spot virus (ACLSV) [27], highlighting the need for continuous assessment of the viral molecular variability as a prerequisite to develop polyvalent and efficient detection tools.
Such problems can be overcome by the use of modern next generation sequencing (NGS) or high-throughput sequencing (HTS) technologies, which, theoretically, can produce sequence data for every putative viral agent present in a sample without the need for any prior knowledge of their genomes [28]. This fact, in addition to the relatively short time needed to complete a sequencing run and analyze the resulting data, makes HTS the undisputed leading technology in diagnostics today.

Principles and Technologies of High-Throughput Sequencing
HTS technologies were introduced in 2005 with the commercialization of FLX Genome Sequencer from 454 Life Sciences using real-time sequencing-by-synthesis pyrosequencing technology [29]. The 454 Life Sciences platform was discontinued by Roche in 2016, as it was eclipsed by newer and more competitive platforms. However, this approach established the basis for current PCR-based HTS technologies. Applied Biosystems SOLiD platform was commercialized in 2006, and it was based on a different technology: Sequencing by oligo ligation. These sequencers were also discontinued in 2016. Currently, Illumina and Ion Torrent platforms are the only technologies available in the category of PCR-based HTS.
Nowadays, Illumina offers nine different sequencers (Table 1) that can produce from 4 million to 20 billion short reads of 50 to 300 nucleotides. It is the most used technology in fruit tree virus research, and performs sequencing-by-synthesis on the surface of a flow cell divided into lanes. Oligonucleotides complementary to the specific adapters ligated to the library DNA fragments are covalently attached to the surface. PCR occurs in the flow cell through a process that includes heating-cooling and incubation with PCR reagents, thus producing millions of clusters, each generated from a single nucleic acid fragment. Sequencing is performed using the four nucleotides labelled with different fluorescent dyes and chemically blocked at their 3′-end, so that only one nucleotide is incorporated per cycle. After excitation and image capture, the 3′-end is chemically unblocked and the fluorescent dye is removed, allowing the incorporation of the next nucleotide in the next cycle. Subsequently, low quality reads are filtered out [30]. Illumina sequencers can exploit different strategies for HTS analysis of plant viruses: a shotgun approach from total DNA or RNA (RNASeq), nucleic acid sequencing from partially or totally purified viral particles, and sequencing of small interfering RNAs (siRNAs).
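The filtering of low quality reads mentioned above can be illustrated with a minimal sketch. It assumes that reads are already parsed into (name, sequence, quality string) tuples and that quality strings use the Phred+33 encoding; the function names and the mean-quality threshold of 20 are illustrative assumptions, not the manufacturer's procedure.

```python
def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Decode a quality string into per-base Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def mean_quality(quality: str, offset: int = 33) -> float:
    """Average Phred score of a read; an empty read scores 0."""
    scores = phred_scores(quality, offset)
    return sum(scores) / len(scores) if scores else 0.0


def filter_reads(records, min_mean_q: float = 20.0):
    """Keep (name, sequence, quality) records whose mean quality is high enough.

    The threshold is a hypothetical default chosen for illustration only.
    """
    return [
        (name, seq, qual)
        for name, seq, qual in records
        if mean_quality(qual) >= min_mean_q
    ]


if __name__ == "__main__":
    reads = [("r1", "ACGT", "IIII"), ("r2", "ACGT", "####")]
    for name, seq, qual in filter_reads(reads):
        print(name, seq, mean_quality(qual))
```

In practice, dedicated trimming and filtering tools are used for this step; the sketch only shows the principle of converting encoded qualities to scores and applying a threshold.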
The Ion Torrent platform was introduced in 2010 by Life Technologies. The first commercialized sequencer was the Ion Personal Genome Machine (PGM), which introduced a revolutionary semiconductor sequencing technology. Ion Torrent™ Technology directly translates chemical output into digital information on a semiconductor chip. The technology exploits the biochemical process that takes place when the polymerase incorporates a nucleotide into the growing DNA strand, releasing a hydrogen ion (H+) as a byproduct. Thus, the technology is based on the real-time detection of hydrogen ion (H+) concentration, that is, of the change in pH caused by the ion release when a nucleotide is incorporated into a DNA strand. The principal components of the Ion Torrent sequencers are the sequencing chips. These microprocessor chips incorporate extremely dense arrays of millions of micromachined wells coupled to a proprietary ion sensor. Each well is able to hold a single DNA molecule from the DNA library. Currently, Ion Torrent commercializes three different sequencers that can hold different types of chips (Table 1). Ion Torrent sequencers can produce from 550,000 to 100 million short reads of 200 to 400 nucleotides. As in the case of Illumina, the Ion Torrent platform can exploit RNASeq, nucleic acid sequencing from partially or totally purified viral particles, and siRNA sequencing.
Other novel technologies, based on single molecule, real-time (SMRT) sequencing (Pacific Biosciences), in which no amplification step is required for sample preparation, are in development. Pacific Biosciences offers two different sequencers that can generate from 500,000 to 10 million reads, with an average length of 1000 nucleotides (Table 1). Recently, Oxford Nanopore has developed the MinION technology, based on the use of nanopores, a platform smaller than a smartphone that runs when plugged into a laptop or computer through a USB connection. Currently, there are three sequencers from Oxford Nanopore (Table 1). SMRT sequencing has been applied for the detection and identification of human pathogens, viruses [31], and bacteria [32]. In plant pathology, this technology is still in development, and its application to fruit tree virus detection remains to be studied.
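The principle of detecting hydrogen ions released on nucleotide incorporation can be illustrated with a highly idealized sketch. It assumes that the four nucleotides are supplied one at a time in a fixed, repeating flow order, and that the signal in each flow is exactly equal to the number of nucleotides incorporated (so a homopolymer gives a proportionally larger signal). The flow order "TACG", the function names, and the noise-free signal are simplifying assumptions made for illustration, not a description of the instrument software.

```python
def flowgram(synthesized: str, flow_order: str = "TACG", max_flows: int = 400):
    """Return idealized (nucleotide, signal) pairs for the strand being synthesized.

    Each flow offers one nucleotide; the signal is the number of consecutive
    matching bases incorporated, i.e. the number of H+ ions released.
    """
    pos = 0
    signals = []
    for i in range(max_flows):
        if pos >= len(synthesized):
            break
        base = flow_order[i % len(flow_order)]
        count = 0
        while pos < len(synthesized) and synthesized[pos] == base:
            count += 1
            pos += 1
        signals.append((base, count))
    return signals


def call_bases(signals) -> str:
    """Translate idealized flow signals back into a base sequence."""
    return "".join(base * count for base, count in signals)


if __name__ == "__main__":
    flows = flowgram("TTAG")
    print(flows)
    print(call_bases(flows))
```

A flow with signal 0 means that the offered nucleotide did not match the next template position, so no ion was released in that well.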

Sample Preparation
Sample preparation for HTS usually includes a set of common steps, regardless of the HTS platform: Nucleic acid extraction, fragmentation, and/or size-selection; enrichment of RNA viral sequences; preparation of libraries; and bioinformatic analysis. Although these procedures require significant manipulation, time, and skills on genomic library preparation, currently, there are commercially available automated systems able to produce sequence-ready nucleic acids libraries. Platform manufacturers and third-party companies provide specific kits simplifying and standardizing the construction of libraries that, in addition, can be prepared for multiplexing if necessary. Optimal results are obtained using highly pure starting nucleic acids preparation. Contamination, inhibitors, or impurities in the starting material can produce false negative results and artifacts. Although nucleic acid purification, even from woody plants, is a routine task in many laboratories, the preparation of high quality nucleic acids for HTS is not trivial. Different types of targets have been used for fruit tree virus detection, as follows.
(a) Virus-derived small interfering RNAs
RNA silencing constitutes a major antiviral mechanism in plants, in which host enzymes called dicers cut the viral RNA into small interfering RNAs (siRNAs) 21-24 nt long. These molecules can be isolated from infected plants, subjected to HTS, and used to reconstruct viral genomes. Virus-derived siRNAs have been used to detect known and unknown plant viruses in fruit trees by HTS [33][34][35][36][37][38][39][40][41][42]. The advantage of this approach is that siRNAs can be used to detect RNA and DNA viruses, as well as integrated endogenous viral elements (EVEs), as long as they are transcribed. Moreover, it has been successfully used to recover a persistent virus in persimmon [37]. The disadvantage of this approach is the small size of the viral reads, which may complicate genome assembly, especially for novel viruses.
(b) Double-stranded RNA
dsRNA has been exploited to detect fruit tree viruses by HTS, and also in the study of diseases of unknown etiology [40,43][44][45][46][47][48][49][50]. The advantage of using this target for HTS is the enrichment of viral nucleic acid sequences. Current protocols rely on phenol/chloroform extraction to obtain a mix of total DNA and RNA, from which the dsRNA fraction is enriched by chromatography on a long polymer cellulose matrix [51]. Although the detection of DNA viruses may be limited, dsRNA has facilitated the discovery of many new RNA viruses, such as Apricot vein clearing associated virus (AVCaV) [44], Prunus virus T (PrVT) [45], a new luteovirus species provisionally named Cherry associated luteovirus (ChALV) [48], and a new fabavirus named Cherry virus F (CVF) [40].
(c) Total DNA or RNA
Total DNA sequencing has rarely been used in the detection of viruses from woody plants. It has been successfully applied in citrus to identify a new DNA virus [52] using genomic DNA as the target, which was subsequently fragmented to prepare Illumina sequencing-compatible libraries.
Total RNA-sequencing has been used to detect and characterize viruses in several woody crops, such as grapevine [53][54][55][56], apple, pear [57], and cherry [9]. It has also been successfully applied for the discovery of two new fabaviruses infecting Prunus species [40,49]. Although traditional total RNA purification can be performed, there are commercially available kits that can be used to substitute in-house purifications. A potential drawback of the method is that viral titer can be low within the background of plant RNA, a limitation that can be overcome by depletion of highly abundant plant ribosomal RNA from the total RNA pool.
(d) Virion-associated nucleic acids purified from virus-like particles
Virion-associated nucleic acids (VANA) are used to enrich viral nucleic acids by purifying viral particles from the plant material [58]; however, to date, there is no report of the application of this method to fruit tree viruses. The disadvantage of this method is the requirement for complex sample processing. In addition, it is limited to the detection of encapsidated viruses.
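As an illustration of how the 21-24 nt size range of virus-derived siRNAs described under target (a) can be used computationally, the following minimal sketch selects reads in that range from an adapter-trimmed small RNA library and tabulates their length distribution. Reads are assumed to be plain sequence strings already stripped of adapters; the function names are hypothetical.

```python
from collections import Counter


def size_select(reads, min_len: int = 21, max_len: int = 24):
    """Keep reads whose length falls in the siRNA size range (21-24 nt by default)."""
    return [read for read in reads if min_len <= len(read) <= max_len]


def length_profile(reads) -> dict[int, int]:
    """Count reads per length, sorted by length."""
    return dict(sorted(Counter(len(read) for read in reads).items()))


if __name__ == "__main__":
    library = [
        "ACGTACGTACGTACGTACGTA",      # 21 nt
        "ACGTACGTACGTACGTACGTAC",     # 22 nt
        "ACGTACGTACGTACGTACGTACGT",   # 24 nt
        "ACGTACGTACGT",               # 12 nt, outside the range
        "ACGTACGTACGTACGTACGTACGTACGTAC",  # 30 nt, outside the range
    ]
    selected = size_select(library)
    print(length_profile(selected))
```

The selected reads would then be passed to an assembler or mapped to reference genomes; that step is outside the scope of this sketch.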

Bioinformatics
The bioinformatic analysis of HTS data can be performed with biologist-friendly, commercially available packages or with open platforms. Quality control, sequence assembly, contig annotation, and identification of variations between samples are required steps in the analysis. Quality control depends on the technology and on standard parameters whose thresholds are usually provided by the manufacturer. The assembly of reads to recover a viral genome can rely on two different approaches: (i) de novo assembly, where the viral genomes are partially or fully reconstructed by annotation of the generated contigs, or (ii) mapping of reads against a reference sequence. For low-titer fruit tree viruses, a combination of both approaches might be required, along with higher sequencing depth. Annotation of contigs is performed using BLASTn, BLASTx, BLASTp, and tBLASTx (https://blast.ncbi.nlm.nih.gov/Blast.cgi). De novo assembly has limitations, especially when faced with the recovery of several variants or strains of the same virus species from a single plant sample. This occurs particularly when siRNAs are used as the template for HTS, because the short reads can produce false positive results or incorrect assemblies. In that case, a strategy that uses longer reads (e.g., 100 nucleotides or more) can help solve the problem, as stringent assembly parameters facilitate the recovery of each of the isolates infecting the sample. Subsequently, nucleotide differences among variants, such as single nucleotide polymorphisms (SNPs) or insertions and deletions (indels), can be analyzed by PCR amplification and restriction fragment length polymorphism (RFLP) analysis. The increasing number of available host genomes allows for the bioinformatic depletion of host sequences that, along with the deduplication of reads, can facilitate analysis at high sequencing depth.
There is continuous development of suitable pipelines for plant virus discovery and detection, in an attempt to reconcile computing needs with the pace of improvement and development of new HTS platforms. The first automatic pipeline for the identification of viruses in RNA-sequenced samples, VirusFinder, was developed by Wang et al. [59]. A detailed comparison between this and other available automated bioinformatics tools (VirusHunter, VirFind, ezVIR, ViromeScan, Taxonomer, VIP, VirusDetect, VSD toolkit, and Metavisitor) has been reported by Jones et al. [60].
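The deduplication of reads and the bioinformatic depletion of host sequences can be sketched as follows. This is a toy illustration under stated assumptions, not the method of any pipeline cited here: duplicates are defined as exactly identical sequences, and a read is treated as host-derived if it shares at least one exact k-mer with a host reference on the forward strand. The function names, the k-mer size of 15, and the threshold are hypothetical; production workflows typically map reads to the host genome with a dedicated aligner instead.

```python
def kmers(seq: str, k: int) -> set[str]:
    """All k-mers of a sequence (empty set if the sequence is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def deduplicate(reads):
    """Remove exact duplicate reads while preserving first-seen order."""
    return list(dict.fromkeys(reads))


def deplete_host(reads, host_seqs, k: int = 15, min_shared: int = 1):
    """Discard reads sharing at least `min_shared` k-mers with the host reference.

    Only the forward strand is indexed; reverse complements are ignored
    in this simplified sketch.
    """
    host_index = set()
    for seq in host_seqs:
        host_index |= kmers(seq, k)
    kept = []
    for read in reads:
        shared = sum(1 for km in kmers(read, k) if km in host_index)
        if shared < min_shared:
            kept.append(read)
    return kept


if __name__ == "__main__":
    host = "ACGTTGCAAGGCTTAACCGGATCGATTACGGA"  # made-up host fragment
    host_read = host[5:25]
    viral_read = "TTTTTGGGGGCCCCCAAAAATTTTT"   # made-up non-host read
    reads = [host_read, host_read, viral_read]
    unique = deduplicate(reads)
    print(deplete_host(unique, [host]))
```

The remaining reads are the candidates for de novo assembly or for mapping against viral reference sequences.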

Application of High-Throughput Sequencing Technology to Detection and Discovery of Fruit Tree Viruses
HTS is now entering the routine detection of plant viruses, either in cases of unknown etiology or in the quarantine analysis of traded plant material. Facilitated by decreasing costs, HTS is going to become the method of choice for virus discovery and identification in fruit trees. Scientific discussions are now dealing with the challenges of the technique and with the definition of laboratory standards for its application (see EPPO-COST FA1407 workshop, 2017; https://www.eppo.int/MEETINGS/2017_meetings/wk_ngs_diagnostics).

Identification of New Viruses in Fruit Tree Species with the Aid of High-Throughput Sequencing
The worldwide application of HTS to the analysis of plant viruses has expanded the potential of the technology, as revealed by the number of reports describing novel viruses in many economically relevant crops, as well as in eco-genomics studies. Here, we report an up-to-date list of them (Table 2), with the awareness that the almost daily increase in the viruses and virus-like pathogens listed here is going to deeply modify our knowledge of the "virus world" and its association with plant diseases.

(a) Citrus
The list of citrus virus-like diseases with unknown etiology has been updated since 2012, with two publications on citrus yellow vein clearing of lemon trees [61] and citrus chlorotic dwarf [52]. In the first case, a new alphaflexivirus (Citrus yellow vein clearing virus, CYVCV) was discovered by siRNA analysis in Turkish accessions; it was related to, but sufficiently divergent from, Indian citrus ringspot virus in the genus Mandarivirus. In the latter disease, a new viral DNA sequence was associated with the symptoms through siRNA and total DNA sequencing. This new, highly divergent monopartite geminivirus was named Citrus chlorotic dwarf associated virus (CCDaV).
In order to identify the causal agent of citrus vein enation disease, Vives et al. [62] examined the siRNA fraction from infected and healthy Etrog citron plants. Contigs assembled from viral siRNAs showed similarity with luteovirus sequences and allowed determination of the genome of Citrus vein enation virus (CVEV), which is related to the genus Enamovirus.
A series of studies on citrus leprosis-associated viruses was described by Roy and co-authors. This disease, present in South and Central America and transmitted by mites, is caused by two differently shaped and taxonomically allocated viruses. Roy et al. [63,64] described, in Mexican leprosis-affected citrus plants, the sequence of two genomic components of a new rhabdo-like virus, similar to Orchid fleck virus, and attributed it to the proposed and still unassigned genus Dichorhavirus. From Citrus leprosis symptomatic leaves from Colombia, Roy et al. [65] produced an siRNA library and found a novel cytoplasmic Citrus leprosis virus atypical strain (CiLV-C2; Cilevirus), plus Citrus tristeza virus (CTV) and citrus viroid sequences. This new virus produced symptoms typical of CiLV, but was not detected with either serological or PCR-based assays targeting the previously described cytoplasmic virus. Finally, the same group detected in Florida, on Carrizo citrange analyzed for citrus decline disease/citrus blight, three different endogenous pararetroviruses significantly matching the Petunia vein clearing virus genome [66], while in herbarium specimens from Florida that were more than 50 years old, near full genome sequences of CiLV-N and -C species were recovered from the siRNA fraction [67].
An siRNA library was prepared in Italy from orange trees affected by concave gum, a severe citrus disease reported more than 80 years ago and still of unknown etiology. Navarro et al. [68] identified a new virus (with two genomic components of negative stranded RNA) similar to tenuiviruses and phleboviruses. This novel virus, tentatively named Citrus concave gum-associated virus, is flexuous and non-enveloped. The viral agent associated with citrus sudden death disease, which appeared in Brazil in 1999, has already been characterized as a tymovirus-like species and named Citrus sudden death-associated virus (CSDaV); an effort to further clarify the virome linked to disease expression was made by Matsumura et al. [69]. The sequencing analysis of total RNA and siRNAs from CSD-symptomatic and -asymptomatic plants allowed the identification of both CSDaV and CTV in multiple virus infections, and the differentiation of viral genotypes. The still complicated virome picture identified in those plants consists of CTV as the predominant virus (in an area in which the virus is endemic), followed by CSDaV, Citrus endogenous pararetrovirus (CitPRV), and two putative novel viruses tentatively named Citrus jingmen-like flavivirus and Citrus virga-like virus. The latter two viruses seem to be more associated with asymptomatic plants.

(b) Stone fruits
Pioneering work on the application of HTS to Prunus for de novo discovery and detection of viruses was started by the INRA-Bordeaux team in 2013. Candresse et al. [43] analyzed a dsRNA library from sour cherry showing symptoms of Shirofugen stunt disease (SSD). The failure to detect any viral agent other than a divergent isolate of Little cherry virus-1 (LChV1) suggested that LChV1 could be responsible for the SSD syndrome.
From a cherry tree collected in Italy, and a plum tree collected in Azerbaijan, both with mixed-infections and no symptoms, Marais et al. [45] reconstructed contigs having weak but significant identity with various members of the family Betaflexiviridae. The identified viral species showed sequence homology with the sole member of the genus Tepovirus, Potato virus T (PVT). The name Prunus virus T was proposed for this new virus, which has 1% prevalence in the screening of a large Prunus collection.
Subsequently, during a field survey in apricot orchards, vein clearing and mottling symptoms were observed by Elbeaino et al. [44] on young leaves of a 3-year-old apricot tree. None of the main Prunus viruses were detected, while sequencing of dsRNA allowed the identification of a novel virus showing the highest nucleotide identity (ca. 40%) with Citrus leaf blotch virus (CLBV). The borderline phylogenetic allocation of this virus, namely Apricot vein clearing associated virus (AVCaV), between citri- and vitiviruses (presence of four distinct encoded proteins) led to a proposal for a new genus in Betaflexiviridae. Indeed, a further report from Marais et al. [70] identified two more isolates of AVCaV from almond and Japanese plums, and an additional new virus sequence in an Azerbaijani almond tree (Aze204) that was considered a new distinct species (named Caucasus prunus virus, CPrV). The authors suggested that AVCaV and CPrV, having close genetic and structural similarity, could be integrated in the newly created genus Prunevirus. However, while Elbeaino et al. [44] identified AVCaV in mixed infections, the failure to detect any other viral agent in the Aze204 source indicated the putative involvement of CPrV in the etiology of the mild symptoms observed (chlorotic spots along the veins and weak reddening of young leaves).
Additional work was presented on the full genome sequencing of the Asian prunus viruses. Marais et al. [46] reconstructed the complete genomes of Asian prunus virus 1, 2, and 3 (APV1, APV2, and APV3). A relevant phylogenetic assessment was that APV2 should be considered a distinct viral species in the genus Foveavirus, while the taxonomy of APV1 and APV3 was not yet defined. In the most recent report from this group [71], the new capillovirus Mume virus A was described in a Japanese apricot (Prunus mume) showing diffuse chlorotic spots on leaves. Lenz et al. [48], in the Czech Republic, described a new virus sequence, named cherry-associated luteovirus (ChALV), in several mixed-infected cherry accessions (having no particular symptoms). This is the fourth member of the family Luteoviridae reported to naturally infect woody plants. After this discovery, sequence homologies to ChALV were identified by Wu et al. [72] in the United States in peach accessions from Spain and Georgia. A clade of Rosaceae-associated luteoviruses (rose spring dwarf, nectarine stem pitting, etc.) was therefore conceived and proposed.
A sequencing project performed in California [73] on dsRNA extracts from imported nectarine accessions that showed stunting and pitting on the woody cylinder in a propagation block revealed the presence of a novel luteovirus (nectarine stem pitting-associated virus, NSPaV). Once again, this underlines the importance of metagenomic analysis in post-entry quarantine to assess plant health status. Villamor et al. [74], using total RNA instead of dsRNA in the same format (50 nucleotide single read Illumina run), sequenced, besides the NSPaV found in California, a new virus genome resembling a typical marafivirus (Nectarine virus M, NeVM) in symptomatic P. persica trees; the same marafivirus and luteovirus were found in symptomless nectarine and peach selections, which had tested negative by indexing or were free from known viruses according to serological and molecular tests. In an attempt to clarify the complex association and genomic relationship of several virus-like diseases and the related agents in cherry, Villamor et al. [75] carried out a comprehensive analysis of previously reported full genomic sequences (Cherry green ring mottle virus and Cherry necrotic rusty mottle virus) and determined new sequences from isolates of Cherry twisted leaf-associated virus (CTLaV) and Cherry rusty mottle-associated virus (CRMaV). Finally, the segregation of these viral sequences into four clades corresponding to distinct virus species was assessed. The establishment of a new genus (Robigovirus), gathering these viruses, was then suggested within the family Betaflexiviridae.
Villamor et al. [49] revealed a new virus resembling fabaviruses when they analyzed rRNA-depleted total RNA extracted from a sweet cherry tree showing angular necrotic leaf spot symptoms, a general decline, and loss of fruit-bearing shoots. This virus (with a bipartite genome) was named Prunus virus F (PrVF), and it was found in the sample along with other common Prunus-infecting viruses.
More new Prunus viruses were also identified during recent HTS screening and surveys in Asia. He et al. [38] sequenced the siRNAs from peach plants affected by a disease causing smaller and cracked fruit and searched for viral pathogens. Besides already known viruses, a novel fabavirus, namely Peach leaf pitting-associated virus (PLPaV), was identified, whose genome features differ in some respects from those of other members of the genus. More recently, one more Fabavirus species, Cherry virus F (CVF), was identified by Koloniuk et al. [40] infecting sweet and sour cherry trees in Greece and the Czech Republic. CVF is closely related to PrVF, although the two viruses may differ in the expression strategy of their RNA2 genomic segments.
Igori and co-workers described the complete sequence of a highly divergent South Korean isolate of ChALV from peach [76]. Later on, from leaves exhibiting symptoms of yellowing and slight mottling collected from a yard-grown peach tree, they assembled the complete genome of the novel Peach virus D (PeVD) [77]. Sequence comparisons and phylogenetic analysis revealed that PeVD is most closely related to viruses in the genus Marafivirus, with the closest relationship (56-61% amino acid identity) to Nectarine virus M (NeVM), recently reported to infect nectarine. Similarly, Jo et al. [78], describing the virome of six peach trees, again in South Korea, by metatranscriptome analysis, identified five known betaflexiviruses and a novel virus belonging to the family Tymoviridae that is very similar to PeVD.

(c) Pome fruits
HTS analysis was applied to study the occurrence of symptoms of small leaves and growth retardation in field-grown apple trees in Korea [79]. The study showed that plants were multiply infected by at least two of the following viruses: ACLSV, Apple stem grooving virus (ASGV), Apple stem pitting virus (ASPV), Apple green crinkle associated virus (AGCaV), and Apricot latent virus (ApLV). HTS analysis of an siRNA library allowed for the discovery of an isolate of AGCaV in field-grown quince plants showing symptoms of fruit deformation and bud failure [80]. In a subsequent survey, the virus was found consistently associated with symptoms.
An apple geminivirus was discovered in apple by sequencing an siRNA library [34]. The virus had a limited prevalence in Chinese cultivars, without association with specific symptoms. Moreover, a comprehensive HTS study of the virome of apple trees with symptoms of mosaic not associated with Apple mosaic virus (ApMV) led to the discovery of a novel ilarvirus, closely related to ApMV, which was named Apple necrotic mosaic virus (ApNMV) [81].
A couple of publications deal with the study of Apple rubbery wood disease by HTS analysis. While Jakovljevic et al. [82] failed to identify putatively associated virus(es) by sequencing ribosomal RNA (rRNA)-depleted total RNA from infected plants, other authors [83] enlarged the previous dataset with the analysis of additional dsRNA libraries. This revealed a complex virome containing two new viral species, Apple rubbery wood virus 1 and 2 (ARWV 1 and ARWV 2), and established a strong link between Apple rubbery wood disease and the ARWVs. ARWVs have a multi-segmented negative sense RNA genome and represent unique members of the Bunyavirales. The authors suggest placing them in a new genus within the family Phenuiviridae. Finally, Apple-associated luteovirus (AaLV) is a new species in the genus Luteovirus discovered by HTS of a library of rRNA-depleted total RNAs [84]. AaLV was found in asymptomatic apple plants of different cultivars in China and, as such, is considered a latent virus.

(d) Examples of High-Throughput Sequencing based virus discovery on minor fruit trees (persimmon, mulberry, actinidia)
A short survey on the literature regarding the minor fruit tree species can be presented to further support the wide potential of HTS discovery when applied to some "neglected" virus-like plant disorders.
Ito et al. [85] detected two genomes of graft transmissible viruses in total RNA extracts (treated with S1 nuclease to enrich for dsRNA) of Japanese persimmon trees exhibiting fruit apex disorder.
One of the complete genomes consisted of 13,467 nucleotides and encoded genes similar to those of plant cytorhabdoviruses (named Persimmon virus A, PeVA). The other genome shared an organization similar to those of some insect and fungal viruses having dsRNA genomes. The virus, named Persimmon latent virus (PeLV), formed a possible new genus clustering with two insect viruses and one plant virus, Cucurbit yellows-associated virus. Later on, two dsRNA molecules of about 1.5 kb in size were sequenced by Morelli et al. [86] from leaf tissue showing veinlet necrosis of a Japanese persimmon from Southern Italy. The identified virus, Persimmon cryptic virus (PeCV), which is close to deltapartitiviruses, was, however, also found in symptomless persimmon trees and seedlings.
In mulberry, a library from siRNAs allowed Ma et al. [87] to identify a novel DNA virus (Mulberry mosaic dwarf associated virus) in a Chinese accession affected by mosaic and dwarfing symptoms. This virus has a small monopartite circular DNA genome (2.95 kb) containing ORFs (open reading frames) in both polarity strands, resembling CCDaV, a divergent geminivirus recently identified in citrus (see Loconsole et al. [52]). The Mulberry badnavirus 1 (MBV1) has been characterized as the etiological agent of a disease observed on a mulberry tree in Lebanon by Chiumenti et al. [35]. An siRNA library was analyzed, and these data, together with genome walking experiments, generated the full-length virus sequence. Uniquely among badnaviruses, the MBV1 sequence encodes a single ORF containing all the conserved pararetrovirus motifs, while a defective genome was found to replicate and be encapsidated in infected plants.
Wang et al. [88] sequenced siRNAs from kiwi fruit (Actinidia spp.) in China, and provided the first complete genome sequence of an ASGV isolate (ASGV-Ac) infecting a kiwi fruit plant. This sequence is phylogenetically distant from ASGV isolates from all other hosts, sharing about 80% genome sequence identity with them, and likely represents a novel variant. Zheng et al. [89] characterized a new virus infecting kiwi fruit vines affected by chlorotic ringspots. The virus (Actinidia chlorotic ringspot-associated virus, AcCRaV) presents double-membrane bodies in infected tissues and has a genome composed of RNA segments in negative polarity, thus showing the typical features of members of the genus Emaravirus. A final report from New Zealand [90] described the complete genome sequence of a novel virus, tentatively named Actinidia seed-borne latent virus (ASbLV). The virus was identified from Actinidia chinensis (a seedling in a home garden); its genome contains four open reading frames, and it is most closely related to CPrV (56% nt identity), a member of the genus Prunevirus in Betaflexiviridae. Plants were affected by vein chlorosis and mottling, but were found to be co-infected with Actinidia virus A and/or Actinidia virus 1, although ASbLV was also detected in asymptomatic vines as a single infection.

Application of High-Throughput Sequencing to Detection and Virus Genetic Variability Studies
The increasing use of HTS techniques for rapid and sensitive detection of the virome in particular plant material (virus source collections, plant clinics, mother plants for certified stocks, etc.) is essentially aimed at obtaining a complete picture of the systemic agents (as far as they are known in that host). This application is generally described as "targeted" when generic or even specific primers are used to amplify all the variability in a particular target sequence and to compare, simultaneously, up to several hundreds of samples in a single run. Alternatively, the unbiased, or "untargeted", sequencing of many samples can also aim to describe the fine-scale nucleotide variability within the quasi-species framework. A by-product of this analysis is the refinement of conventional molecular diagnostic tools through the additional genome variability information gained.
At least two applications for a massive genotyping and virus characterization through HTS can be reported. The first one is the screening for dominant strains or isolates in endemic areas where cross-protection against dangerous pathogens is performed. In Brazil or South Africa, mild CTV strains are used to control the emergence of severe strains. In these cases, accurate monitoring of newly emergent strains, which can arise by introduction or recombination, can be used for timely containment actions. A second application is an eco-genomic survey in certain crops or for certain virus groups, necessarily run as a pilot study in defined geographic areas (see the example of Australia) [91], to check the virus status already present and avoid the introduction of unwanted agents. Pursuing the former objective, Zablocki & Pietersen [92] and Read & Pietersen [93] provided evidence through HTS analysis regarding the isolates associated with the breaking of CTV cross-protection in the grapefruit industry, characterizing recombination events and fine population mapping, even at a single gene level. An example of the study of fine molecular virus-host interactions is the report of Visser et al. [94], who sequenced siRNAs and transcripts on grapefruit co-infected by CTV and Citrus dwarfing viroid. Analyzed siRNAs showed the pathogen regions associated with increased silencing and an altered plant hormone regulatory pattern. The same group developed and evaluated e-probes to be applied on citrus massive sequencing for rapid, simultaneous identification of 11 recognized viruses and viroids [95]. This e-probe-based approach was applied to screen different samples and RNA templates, and mainly for CTV genotyping.
In the frame of CTV research, Yokomi et al. [17] sequenced siRNAs of Californian CTV isolates with divergent serological and molecular profiles and trifoliate resistance-breaking isolates were identified, while Licciardello et al. [96] analyzed siRNAs from two CTV isolates (aggressive and mild), representing the population structure of the virus in Sicily. Their molecular data needed, however, to be integrated with bio-indexing, in order to obtain adequate information for risk assessment.
The genome of a century-old lemon-infecting CTV isolate was reconstructed in Greece through siRNA sequencing on the Ion Torrent platform (Table 3). The isolate, even though it belongs to a severe strain, shows mild symptoms and bears signals of recombination events.
Other studies on the variability of citrus viruses were carried out in Japan [97], where five isolates of CVEV were determined by total RNA sequencing and compared, and in China [98], with the siRNA detection of a Chinese isolate of CYVCV.
On the side of stone fruit, more attention was paid to date to some less characterized viruses such as LChV1, Plum bark necrosis stem pitting-associated virus (PBNSPaV) or Cherry virus A (CVA), as well as on newly identified fabaviruses. Studies on ilarviruses in Australian stone fruits focused on genetic variability of a number of isolates obtained by generic amplicon sequencing [99].
More specifically, several variants of PBNSPaV were retrieved from a project of sequencing four Prunus samples of unclear or unknown etiology [100], while a first isolate from sweet cherry was sequenced in China [101]. Recently, isolates of LChV1 were sequenced from peach in South Korea [102], apricot in Hungary [41], and sweet cherry in China [103]. Full genome sequences from isolates of CVA were obtained from apricot in Japan [104] and Hungary [41], and from sweet cherry in China [105]. In a wild cherry tree from Slovakia, together with a multiple viral infection, Glasa et al. [9] found two divergent isolates of CVA (differing by 14% at the nucleotide level) and gave some evidence on the recombinant nature of these isolates and for the presence of "non-cherry" group CVA isolates in cherry host plants. CVA variability was studied also in Canada by Kesanakurti et al. [50] in 39 stone fruit tree specimens infected with CVA, in which 75 full and 16 partial-length CVA genomes were assembled. No phylogenetic relationship was found between the host and the assembled genome variants.
Using sequencing of dsRNA on PCR-positive samples, Špak et al. [106] obtained nearly complete genomes of one CNRMV and three CGRMV isolates, concluding, from their genetic relationship with foreign isolates and the negative results of screening a plant collection for these viruses, that they originated from imported accessions.
Three publications from a project on Ilarvirus species in Australia interestingly applied a strategy of sequencing of double-indexed amplicons. In a first study [107], the variability of Prunus necrotic ringspot virus (PNRSV) was investigated on 53 PNRSV-infected trees in conserved gene regions of each of PNRSV RNA1, RNA2, and RNA3. Some plant samples had sequence variants occurring in multiple PNRSV phylo-groups, and a nucleotide identity of 97% was determined as demarcation threshold for each of the PNRSV phylo-groups, each one composed of distinct clades of polymorphic variants. A second study [99] addressed, through generic amplicon sequencing (in RNA2), the distribution of Ilarvirus species populations amongst 61 Australian Prunus trees from a wide survey in stone fruit orchards. Mixed-infections of ilarviruses were detected, and also two different RNA2 sequence variants most likely belonging to unknown ilarviruses, from which no other genomic RNAs were identified. In a survey of 127 Prunus tree samples collected from five states in Australia, Kinoti et al. [108] found ApMV and PDV to occur in 3% and 10% of the trees, respectively. HTS of amplicons from partial conserved regions of RNA1, RNA2, and RNA3 of ApMV and PDV was used to determine the genetic diversity of the Australian isolates of each virus.
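To illustrate how a nucleotide identity threshold such as the 97% value reported for PNRSV phylo-groups can be applied to amplicon variants, a minimal Python sketch is given below. It assumes that the variants have already been aligned to equal length, and it uses simple single-linkage grouping; this grouping rule, the function names, and the toy sequences are assumptions made for illustration and are not the procedure used in the cited studies.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity of two aligned sequences, skipping gap columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # ignore alignment columns containing a gap
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


def group_by_identity(seqs, threshold=97.0):
    """Single-linkage grouping: two variants end up in the same group
    if they are connected by a chain of pairs with identity >= threshold."""
    parent = {name: name for name in seqs}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(seqs, 2):
        if percent_identity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Toy aligned variants (hypothetical, 100 nt each)
v1 = "ACGT" * 25
v2 = "TT" + v1[2:]              # 2 mismatches to v1 (98% identity)
v3 = v1[:90] + "CAGTCAGTCA"     # 10 mismatches to v1 (90% identity)
variants = {"v1": v1, "v2": v2, "v3": v3}
groups = group_by_identity(variants, threshold=97.0)
```

With these toy data, v1 and v2 fall in one group and v3 forms a group of its own. In practice, phylo-groups are defined on phylogenetic trees, so a threshold-based grouping of this kind is only a first approximation.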
Furthermore, by using HTS, Koloniuk et al. [40] described the co-infection of sweet and sour cherry trees with diverse genomic variants of two closely related fabaviruses, namely PrVF and CVF. A similar observation was made by Villamor et al. [49] during the initial identification of PrVF. It is particularly interesting that arbitrary mixtures of phylogenetically distinct genotypes of the two genomic segments from both viruses exist in the sampled trees, while the occurrence of single genotype infections is rare.
Finally, in HTS research of virus variability in apple, Visser et al. [33] analyzed siRNA libraries from plants infected by ASGV. The analysis of the data demonstrates that, in apple, anti-viral RNA silencing operates similarly to other plant species by producing predominantly 21-22 nucleotide-long viral siRNAs, and that these originate mainly from the 3′ end of the viral genome, where subgenomic RNAs are synthesized. In addition, publicly available mRNA and siRNA libraries from apple were mined by Jo et al. [57] to survey for the presence of ASGV and obtain information on genome recombination, single nucleotide variants, and phylogenetic relationships. The work illustrates the suitability of already available datasets to be used in successive investigations and for multiple purposes. Table 3 summarizes detection and fruit tree virus genetic variability studies applying HTS.

Advantages and Disadvantages of High-Throughput Sequencing Compared to Classical Diagnostic Approaches
The application of HTS technologies has several advantages over the classical serological, molecular, and biological approaches both for disease diagnosis and for specific virus detection in fruit trees. Among the most important advantages of HTS is that a priori knowledge on the viral pathogens infecting a plant sample is not required. This means that there is no need to know the sequence of a targeted virus species, as in PCR, in order to design primers for its molecular detection or, like ELISA, to have antibodies available for serological detection. The classical detection assays, with the exception of biological indexing, are highly specific and cannot identify unknown viruses that might be involved in the etiology of a plant disease. The use of HTS reduces the time needed, and facilitates the necessary procedures in order to identify the etiological agent of a disease, since the method provides the identification of the plant sample's virome, often unveiling the presence of novel viral pathogens [38,49,53,74,111,112].
As far as virus detection is concerned, the application of deep sequencing approaches alleviates the problem of false negatives due to high genetic variability or low titer, which are frequently encountered in viruses infecting fruit trees [113,114]. Moreover, the high number of different reads obtained from the HTS run makes possible the reconstruction of the full genome sequences of a virus species, which can be subsequently used to perform genetic variability studies. Complete viral genomes or large fragments of them have been obtained using either siRNAs [33][34][35][36][37][38][39][40][41], dsRNA [40,43,44,45,46,48,49,50,71,106], or total RNA [9,40,49,57,76] templates from a multitude of different fruit tree plants. Where HTS excels, however, is the identification of genetic variants of a virus species that often co-exist in the same perennial host. The ability to produce full genomes from de novo assembly, in combination with sequence depths which can reach up to a few thousand reads per site, can reveal the presence of variants at very low frequencies, as well as defective RNAs or mixed-infections of closely related strains/isolates [9,40,49]. It should be noted, however, that validation of the sequences obtained by HTS may be necessary, for which conventional Sanger sequencing is applied. This verification usually concerns poorly covered genome portions, the 5′ and 3′ genome extremities, or the integrity of unusual recombination junction site(s).
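The way in which sequencing depth translates into the detection of low-frequency variants can be sketched in a few lines of Python. The example below takes the bases observed at one genome position across all mapped reads and reports minor variants; the cut-off values (`min_freq`, `min_reads`) and the function name are hypothetical choices for illustration, since no standard thresholds are given in the literature reviewed here.

```python
from collections import Counter


def site_variants(column_bases, min_freq=0.01, min_reads=5):
    """Return {base: frequency} for minor bases at one aligned site.

    A minor base is reported only if it is supported by at least
    `min_reads` reads and reaches a frequency of at least `min_freq`.
    Both cut-offs are illustrative assumptions.
    """
    counts = Counter(b for b in column_bases.upper() if b in "ACGT")
    depth = sum(counts.values())
    if depth == 0:
        return {}
    major_base, _ = counts.most_common(1)[0]
    return {
        base: n / depth
        for base, n in counts.items()
        if base != major_base and n >= min_reads and n / depth >= min_freq
    }


# Toy site covered by 2000 reads: 1960 A, 38 G, 2 T
column = "A" * 1960 + "G" * 38 + "T" * 2
minor = site_variants(column)
```

In this toy case, G is reported at a frequency of 1.9%, whereas T, supported by only two reads, is discarded as indistinguishable from sequencing error under the chosen cut-offs. At a depth of 50 reads the same G variant would be expected to appear in about one read, which illustrates why deep coverage is needed to resolve minor variants.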
In woody perennial crops, mixed virus infection is the most common feature, so unravelling diseases of unknown etiology is complicated by the simultaneous presence of different viruses (potentially acting in synergism/antagonism), and the presence/absence of a novel key virus should be further confirmed by conventional detection in wider surveys. Symptoms can be linked to peculiar, severe isolates of a virus species, thus the recovery of specific sequences may not be enough to establish causality. In the citrus sudden death disease [69], a specific strain of CSDaV seems to be associated with the expression of symptoms, and several other viruses with higher abundance of reads are present in symptomless plants, along with different strains of CSDaV. Another case is the discovery of an atypical cytoplasmic citrus leprosis virus strain (CiLV-C2) [63], found by HTS and not detected by molecular tools specific for CiLV-C. On the other hand, HTS often produces sequences belonging to non-plant pathogenic agents, such as mycoviruses, which were identified in grapevine [53], or pararetrovirus sequences that possibly represent genetic fossils integrated in the plant genome [115]. In addition, HTS has revealed, over the last few years, the presence of a large number of new virus species, the biological significance of which is unknown. For some of those viruses, a number of biological traits can be inferred based on the homology they share with already characterized species; for many, however, it is still debatable whether they represent plant-infecting viruses of economic importance [85,116,117]. The identification of a new plant virus species could affect trading of plant material across countries, and therefore, biological data should be provided soon after the initial identification of the pathogen, in order to better assess its significance [118].
Despite the advances in the field of HTS technology and the ability to sequence several samples simultaneously as a pool of extracted nucleic acid, testing the large number of samples necessary to study the incidence or molecular epidemiology of a plant virus species is still costly. In addition, there are no established thresholds of read numbers to identify the presence of a virus species in a sample, which are needed to assess the sanitary status of a sample. The detection of a low number of reads from a viral genome could indicate either the presence of a virus in low titer, or the contamination of the sample during handling. Therefore, the additional application of species-specific molecular/serological techniques is still needed in order to confirm such findings. Another drawback of high-throughput technologies is the impact of the selected template on the kind of viral pathogens detected. This template-dependency might be observed for DNA viruses, which are excluded from the analysis of dsRNA-based libraries and, likewise, after poly(A) enrichment, where viruses lacking a poly(A) tail are not detected. On the other hand, VANA has been reported to successfully detect both RNA and DNA viruses [119,120], but it is not able to detect agents that are not encapsidated, such as viroids and endornaviruses. Given each approach's benefits and limitations, serious consideration should be given to the selection of the template used. The majority of tree-infecting viruses have RNA genomes, have a dsRNA stage during their replication, and might have a poly(A) tail and get encapsidated for transmission and movement. However, the recent rise in identification of tree-infecting DNA viruses [52,121,122], viruses lacking a poly(A) tail (e.g., luteoviruses) [48,72,74,76], and the importance of viroid-related diseases possibly make siRNAs or total RNA the templates of choice for a thorough identification of the viruses present in a sample.
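The template-dependency described above can be summarized as a small set of rules. The Python sketch below encodes only the three exclusions named in the text (DNA viruses in dsRNA libraries, agents lacking a poly(A) tail after poly(A) enrichment, and non-encapsidated agents in VANA); the class, field names, and example agents are hypothetical, and the trait values assigned to the examples are illustrative inputs rather than curated biological data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    genome: str         # "RNA" or "DNA"
    has_poly_a: bool    # illustrative trait value supplied by the user
    encapsidated: bool


def missed_by(agent):
    """List the templates expected to miss this agent, using only the
    three exclusion rules stated in the text (a deliberate simplification)."""
    missed = []
    if agent.genome == "DNA":
        missed.append("dsRNA")
    if not agent.has_poly_a:
        missed.append("poly(A)-enriched RNA")
    if not agent.encapsidated:
        missed.append("VANA")
    return missed


# Hypothetical example agents
rna_no_poly_a = Agent("RNA virus lacking poly(A)", "RNA", False, True)
dna_virus = Agent("DNA virus", "DNA", True, True)
viroid = Agent("viroid", "RNA", False, False)

report = {a.name: missed_by(a) for a in (rna_no_poly_a, dna_virus, viroid)}
```

siRNA and total RNA templates are absent from the rules because the text names no exclusion for them, which is consistent with their suggested role as templates of choice; this absence should not be read as a guarantee of detection.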
An important drawback of HTS is the fact that, contrary to the classical detection methods, the analysis of HTS data requires the use of sophisticated bioinformatics tools by specialized scientific personnel. Depending on the template analyzed, different obstacles might be encountered during the bioinformatic analysis. It is probably necessary to further optimize the pipeline used for the analysis; for example, in RNAseq analysis, a step of subtracting the plant genome should be included in order to easily identify new (and known) viruses, since the majority of the resulting contigs are plant-derived. Otherwise, when the plant species genome is not available, it is a highly laborious task to identify the contigs related to new viruses from the thousands of contigs produced from the de novo assembly.
Finally, it should be highlighted that diagnostic interpretation of the bioinformatic analysis of HTS data is of paramount importance compared to interpretation of the results of classical detection assays. Whichever template is used for the analysis, it is highly probable that a multitude of different viruses will be found in a sample. Perennial host trees accumulate viruses over the years and those might, or might not, play a role in the development of a disease. The presence of a virus in a diseased sample is not proof that it induces the observed symptomatology, and caution should even be taken when naming a new virus, as the inclusion of the disease in the virus name could bias the research on this virus, and might have implications for the movement of propagative material across borders. Characterization of a newly identified virus and its role in disease etiology would probably require the use of infectious complementary DNA clones or other biological techniques, enabling establishment of a single agent infection of a plant [118].
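The host-subtraction step mentioned above can be illustrated with a toy Python sketch. Here, contigs are discarded when a large fraction of their k-mers also occurs in the host reference; this k-mer comparison is a simplified stand-in for the read or contig mapping against a host genome used in real pipelines, and the function names, the parameter values, and the sequences are assumptions for illustration. Reverse-complement matches are not considered.

```python
def kmers(seq, k):
    """Set of all k-mers in a sequence (forward strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def subtract_host(contigs, host_seqs, k=21, max_host_fraction=0.5):
    """Keep contigs whose fraction of host-shared k-mers is at most
    `max_host_fraction`. Contigs shorter than k are dropped."""
    host_index = set()
    for host in host_seqs:
        host_index |= kmers(host, k)

    kept = {}
    for name, seq in contigs.items():
        contig_kmers = kmers(seq, k)
        if not contig_kmers:
            continue  # contig too short to evaluate
        shared = len(contig_kmers & host_index) / len(contig_kmers)
        if shared <= max_host_fraction:
            kept[name] = seq
    return kept


# Toy data (hypothetical sequences; a small k is used only because
# the example sequences are short)
host = "ATGGCGTACGTTAGCCTAGGATCCGATTACA"
contigs = {
    "contig_1": host[3:25],                    # plant-derived
    "contig_2": "TTTTGGGGCCCCAAAATTTTGGGG",    # no similarity to host
}
candidates = subtract_host(contigs, [host], k=5)
```

Only contig_2 remains as a candidate for further annotation. When no host genome is available, this filter cannot be applied, and every contig must be screened against sequence databases, which is the laborious situation described in the text.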

Future Directions of High-Throughput Sequencing in Fruit Tree Virology
HTS analysis is undoubtedly a technological breakthrough that still has much to offer in plant virology. However, advances in the field of fruit-tree virology using HTS have to mitigate the weaknesses of the other assays, mainly those related to pathogen diagnosis and the identification of the sanitary status of the trees. Those two aspects of research are directly involved in the production of virus-tested high-quality propagative material, and the prevention of trading of diseased plants, which are the main ways of containment for serious fruit tree viral diseases. The perennial life cycle of tree crops places virus infections into a different perspective than those of the annual herbaceous plants. In the latter, the diseases and viral loads might change each year, and annual fluctuations are observed even in areas where specific viruses are endemic. In fruit trees, however, viruses accumulate over the years, giving rise to mixed-infections, which become the normal status of a tree. Moreover, the initial sanitary status of each plant from the nursery might play a major role in the development of a disease through the interactions of the viruses present. Therefore, it will be highly important to further improve current HTS technologies and subsequent bioinformatic analysis procedures, so that they can be efficiently applied in the fields of certification of fruit tree propagation material and phytosanitary control.
HTS will eventually lead to the identification of most, if not all, of the viruses and viroids present in each host species, updating their virome data. This extensive database of different isolates of the viruses infecting trees will make it possible to optimize the existing detection assays by addressing one of the major shortcomings of the traditional methodologies, which is the high sequence variability present in fruit tree-infecting viruses. The optimization of the existing primers based on sequences from different isolates, and the variants of these isolates present in a sample, will minimize false-negative results, thus leading to more accurate diagnostic assays. Moreover, the assessment of the new viruses identified will provide a thorough list of pathogens posing potential threats, and quarantine lists can be updated.
HTS has proven a valuable tool in the identification of genotype mixtures in the form of virus variants, and different isolates inside a single plant. With the aid of this technology, the study of interactions between different viruses or virus genotypes will be facilitated, unveiling mechanisms of pathogenesis and disease development. Importantly, the method could be used for the study of plant-virus-vector interactions through transcriptomic analyses or studies of resistance mechanisms, such as RNA silencing, acting against viruses in different fruit tree species, which in the end, could lead to the development of novel control tools.
Author Contributions: All authors contributed to writing and revising the review paper.
Funding: This work and costs to publish in open access were supported by funds from VirFree. This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 734736. This publication reflects only the authors' view. The Agency is not responsible for any use that may be made of the information it contains. M.G. acknowledges support from the Slovak Research and Development Agency (APVV-15-0232). A.O. acknowledges also the support from Instituto Nacional de Investigaciones Agrarias (RTA-2014-00061, E-RTA2017-00009).