Assessment of Metagenomic Sequencing and qPCR for Detection of Influenza D Virus in Bovine Respiratory Tract Samples

High throughput sequencing is currently revolutionizing the genomics field and providing new approaches to the detection and characterization of microorganisms. The objective of this study was to assess the detection of influenza D virus (IDV) in bovine respiratory tract samples using two sequencing platforms (MiSeq and Nanopore (GridION)), and species-specific qPCR. An IDV-specific qPCR was performed on 232 samples (116 nasal swabs and 116 tracheal washes) that had been previously subject to virome sequencing using MiSeq. Nanopore sequencing was performed on 19 samples positive for IDV by either MiSeq or qPCR. Nanopore sequence data was analyzed by two bioinformatics methods: What’s In My Pot (WIMP, on the EPI2ME platform), and an in-house developed analysis pipeline. The agreement of IDV detection between qPCR and MiSeq was 82.3%, between qPCR and Nanopore was 57.9% (in-house) and 84.2% (WIMP), and between MiSeq and Nanopore was 89.5% (in-house) and 73.7% (WIMP). IDV was detected by MiSeq in 14 of 17 IDV qPCR-positive samples with Cq (cycle quantification) values below 31, despite multiplexing 50 samples for sequencing. When qPCR was regarded as the gold standard, the sensitivity and specificity of MiSeq sequence detection were 28.3% and 98.9%, respectively. We conclude that both MiSeq and Nanopore sequencing are capable of detecting IDV in clinical specimens with a range of Cq values. Sensitivity may be further improved by optimizing sequence data analysis, improving virus enrichment, or reducing the degree of multiplexing.


Introduction
High throughput sequencing is currently revolutionizing the genomics field and providing new approaches to the detection and characterization of viruses. The utilization of metagenomic sequencing to elucidate genome sequences of viruses, particularly RNA viruses, directly from clinical samples offers several benefits. First, metagenomics enables identification and genomic characterization of unexpected viruses or even novel viruses either as primary pathogens or as co-infectants, without prior knowledge of their clinical significance [1][2][3][4]. Second, it eliminates the need for ongoing optimization of primers and/or probes for rapidly evolving or highly diverse RNA viruses [5]. Third, it facilitates routine surveillance and early detection of outbreaks of novel virus strains that are distinct from currently circulating strains. Finally, the development of portable sequencing devices creates the potential for timely identification of routine cases or outbreaks in the field [6,7]. Sequencing technology continues to evolve rapidly. With the capability of generating long reads, relatively lower set-up cost and portability, Oxford Nanopore sequencing has attracted increasing attention for its potential advantages in some circumstances over short-read sequencing technologies [8,9].
Metagenomic sequencing has been widely used for non-targeted detection of viruses and has been applied to identify several "new" viruses associated with bovine respiratory disease (BRD) [10][11][12][13]. In dairy cattle, bovine adenovirus 3 (BAdV3), bovine rhinitis A virus (BRAV), and influenza D virus (IDV) showed significant association with BRD [12]. In beef cattle, bovine rhinitis B virus (BRBV), BRAV, and IDV showed statistical association with BRD [11,13]. Among all of the viruses detected by sequencing of the bovine respiratory tract metagenome, influenza D virus (IDV) has been identified as a common virus associated with BRD in both beef and dairy cattle, suggesting the potential contribution of IDV to BRD [11][12][13]. Influenza D virus (IDV) belongs to the Orthomyxoviridae, and is a single-stranded, enveloped, segmented and negative-sense RNA virus [14]. Since its discovery in swine in USA in 2011, IDV has been reported all over the world, and cattle are thought to be the natural host reservoir [13,[15][16][17][18]. In addition to cattle and swine, IDV has been reported in sheep, goats, laboratory animals (ferrets and guinea pigs), and seropositivity has been detected in humans [17,[19][20][21]. Concerns about interspecies transmission and potential zoonosis have been raised due to the high seroprevalence of IDV antibodies in people exposed to cattle [19].
Given the diversity of viruses now known to be associated with BRD and the potential for discovery of novel viruses like IDV, there is increasing interest in application of metagenomic sequencing for diagnostics, since screening for many individual viruses using targeted PCR assays quickly becomes logistically complex, expensive, and time-consuming. The relative performance of metagenomic sequencing compared to PCR in terms of analytical sensitivity, however, has not been widely explored. At the time of writing, there is only one published study focused on the assessment of performance of MiSeq and Nanopore sequencing relative to gold standard qPCR, which included a limited number of samples and a narrow range of viral loads [1].
In this study, IDV was used as a representative BRD-associated virus to examine the feasibility of using metagenomic sequencing for detection of viruses in clinical bovine respiratory samples. We compared results of long-read sequencing on the Oxford Nanopore GridION platform and previously generated Illumina MiSeq data [13] to the results of an IDV-specific qPCR. The objective was to assess IDV detection using all three approaches applied to a set of bovine respiratory tract samples containing a wide range of viral loads.

Ethics Statement
The samples used in this study were collected as part of a previous study [10]. Collection of the samples was approved by the University of Calgary Veterinary Sciences Animal Care Committee (AC15-0109).

Sample Preparation
The overall sample preparation workflow is shown in Figure 1a. Sample collection and preparation were described previously [13,22]. Briefly, paired nasal swabs (n = 116) and tracheal washes (n = 116) were collected from cattle with BRD and healthy controls from four different feedlots in Alberta, Canada between November 2015 and January 2016 [22]. The samples were centrifuged and supernatants were treated with DNase (Life Technologies, Carlsbad, CA, USA) and RNase (Promega, Madison, WI, USA), followed by extraction of viral nucleic acids using a commercial kit (QIAamp MinElute virus spin kit, Qiagen, Venlo, Netherlands). A portion (2.5 µL) of extracted total nucleic acids was used directly as a template for IDV qPCR and another portion (7 µL) used to generate cDNA for sequencing. The first strand was reverse transcribed with primer FR26RV-N (5 -GCC GGA GCT CTG CAG ATA  TCN NNN NN-3 ) using Superscript III enzyme (Life Technologies, Carlsbad, CA, USA), followed by complementary strand synthesis using Sequenase polymerase (Affymetrix, Santa Clara, CA, USA) as per manufacturer's instructions [23]. Double-stranded DNA was purified using NucleoMag beads (Macherey-Nagel Inc., Bethlehem, PA, USA) and subsequently subjected to random amplification with primer FR20RV (5 -GCC GGA GCT CTG CAG ATA TC-3 ) prior to sequencing library preparation [23].

qPCR Confirmation and Quantification
Quantitative real-time PCR for IDV was performed on the extracted total nucleic acids for each sample (total number of samples = 232) using previously described primers and probe specific for IDV [24]. The qPCR was carried out using AgPath-ID One-Step RT-PCR reagents (ThermoFisher Scientific, Waltham, USA) in a total volume of 12.5 µL, which included 2.5 µL template, 4 pM forward/reverse primers, 2 pM probe and 0.5 µL AmpliTaq Gold DNA polymerase in a Bio-Rad CFX 96 Real-Time Detection System (Bio-Rad, Hercules, CA). The following cycling conditions were used: reverse transcription phase at 48 • C for 30 min; initial activation phase at 95 • C for 10 min; 40 two-step cycles of denaturation at 95 • C for 15 s; and annealing and extension at 60 • C for 1 min. To obtain a positive control template, a DNA fragment corresponding to a 170 bp portion of the PB1 gene of IDV (accession number: JQ922306) was synthesized and inserted into pUC57-Amp vector (Bio Basic, Markham, ON, Canada). A 10-fold dilution series of the positive control plasmid was used to construct a standard curve to determine the efficiency of the PCR. All samples were tested in duplicate along with the standard curve and no template controls. Samples, for which both of the duplicates gave a sigmoid amplification curve with a Cq (cycle quantification) value, were considered positive.

GridION Library Preparation and Sequencing
Nineteen samples that were IDV positive by either MiSeq virome sequencing or qPCR were selected for Nanopore sequencing. Samples were selected to represent the full range of Cq values observed in the qPCR results. Three batches of six samples were multiplexed and run on individual flow cells, while one sample (sample 129) was run individually. The DNA used for GridION Nanopore library preparation was from the same randomly amplified DNA that was used for MiSeq sequencing ( Figure 1a). Ligation 1D sequencing kit SQK-LSK108 was used for library preparation. End-repair and dA-tailing were performed on randomly amplified DNA for each sample using NEBNext FFPE DNA repair mix and Ultra II End-prep enzyme mix (New England Biolabs, Ipswich, MA, USA). After purification with AMPure XP beads (Beckman Coulter, Brea, CA, USA), native barcode ligation using the EXP-NBD103 barcode kit (Oxford Nanopore Technologies, Oxford, UK) and Blunt/TA Ligase master mix (New England Biolabs, Ipswich, MA, USA) was performed as per manufacturer's instructions. The concentration of barcoded libraries was determined using a Qubit fluorometer (ThermoFisher Scientific, Waltham, MA, USA) and subsequently equimolar amounts of each barcoded library (total amount = 1 µg) were pooled, and adaptors were added using Quick T4 DNA Ligase (New England Biolabs, Ipswich, MA, USA). Each pooled library (14.5 µL) was mixed with 35 µL Priming Buffer and 25.5 µL Loading Beads and loaded dropwise through the sample port into the flow cell (FLO-MIN106) as per manufacturer's instructions. The MinKNOW platform QC check confirmed at least 800 available pores, and the High Accuracy Basecalling (HAC) Flip-flop model was applied.

Bioinformatic Analysis
The workflow of bioinformatic analysis is illustrated in Figure 1b. Once Nanopore raw data were demultiplexed and trimmed using Porechop (https://github.com/rrwick/Porechop) and passed the quality score (Qscore) 7, high quality reads were aligned to the bovine genome (BioProject Accessions PRJNA33843, PRJNA32899) using Minimap2 [25], and unmapped reads (i.e., non-host derived reads) from each sample were de novo assembled using Trinity [26,27]. Assembled contigs were mapped to the virus Reference Sequence (RefSeq) database using BLASTn and virus-like contigs with a minimum alignment length of 100 bp and an expectation (e) value < 10 −3 were further examined by BLASTx alignment to the GenBank non-redundant protein sequence database to confirm the nucleotide sequence-based identification and to remove any spurious matches [28]. The total number of viral reads was determined as previously described [13].
Quality filtered reads from the Nanopore sequencing were also uploaded to the EPI2ME (https: //epi2me.nanoporetech.com/) platform for analysis with the WIMP (What's in My Pot, version 2.3.7) application for taxonomic classification of reads.
Viruses 2020, 12, x FOR PEER REVIEW 4 of 12 from each sample were de novo assembled using Trinity [26,27]. Assembled contigs were mapped to the virus Reference Sequence (RefSeq) database using BLASTn and virus-like contigs with a minimum alignment length of 100 bp and an expectation (e) value < 10 −3 were further examined by BLASTx alignment to the GenBank non-redundant protein sequence database to confirm the nucleotide sequence-based identification and to remove any spurious matches [28]. The total number of viral reads was determined as previously described [13]. Quality filtered reads from the Nanopore sequencing were also uploaded to the EPI2ME (https://epi2me.nanoporetech.com/) platform for analysis with the WIMP (What's in My Pot, version 2.3.7) application for taxonomic classification of reads.  Extracted RNA was used directly for qPCR, while DNA randomly amplified from the same extracts was used for MiSeq sequencing and GridION Nanopore sequencing; (b) Bioinformatic workflow to identify viruses in BRD samples. WIMP (What's In My Pot) analysis was used only for Nanopore data. The remaining analysis was the same for data from both MiSeq and GridION Nanopore sequencing except Minimap2 was used instead of Bowtie2 for host sequence subtraction.

Comparison of IDV Detection by MiSeq and qPCR
Samples totaling 232 (116 nasal swabs and 116 tracheal washes) that had been sequenced using MiSeq previously (500 cycle V2 chemistry, libraries of 50 multiplexed samples) were tested by an IDV-specific qPCR (Figure 1a) [10]. The detection limit of the PCR was demonstrated to be 62.5 copies per reaction (data not shown). There were 53 IDV positive samples based on the qPCR and the range of Cq values was from 16.99 (6.25 × 10 7 copies per reaction) to 39.46 (6.88 copies per reaction) with median Cq value being 34.07 (2.91 × 10 2 copies per reaction). The agreement of IDV detection between qPCR and MiSeq was 82.8%. When qPCR was regarded as the gold standard, the sensitivity and specificity of MiSeq detection were 28.3% and 98.9%, respectively. IDV was detected by MiSeq in 14 of 17 IDV qPCR-positive samples with Cq values below 31 (Figure 2), when multiplexing 50 samples in the MiSeq flow cell. Only 1 of 36 IDV qPCR-positive samples with a Cq value above 31 was detected by MiSeq ( Figure 2). Nineteen samples that were positive for IDV by qPCR or MiSeq, and that represented the full range of qPCR Cq values, were selected for further analysis by Nanopore sequencing.
Extracted RNA was used directly for qPCR, while DNA randomly amplified from the same extracts was used for MiSeq sequencing and GridION Nanopore sequencing; (b) Bioinformatic workflow to identify viruses in BRD samples. WIMP (What's In My Pot) analysis was used only for Nanopore data. The remaining analysis was the same for data from both MiSeq and GridION Nanopore sequencing except Minimap2 was used instead of Bowtie2 for host sequence subtraction.

Comparison of IDV Detection by MiSeq and qPCR
Samples totaling 232 (116 nasal swabs and 116 tracheal washes) that had been sequenced using MiSeq previously (500 cycle V2 chemistry, libraries of 50 multiplexed samples) were tested by an IDV-specific qPCR (Figure 1a) [10]. The detection limit of the PCR was demonstrated to be 62.5 copies per reaction (data not shown). There were 53 IDV positive samples based on the qPCR and the range of Cq values was from 16.99 (6.25 × 10 7 copies per reaction) to 39.46 (6.88 copies per reaction) with median Cq value being 34.07 (2.91 × 10 2 copies per reaction). The agreement of IDV detection between qPCR and MiSeq was 82.8%. When qPCR was regarded as the gold standard, the sensitivity and specificity of MiSeq detection were 28.3% and 98.9%, respectively. IDV was detected by MiSeq in 14 of 17 IDV qPCR-positive samples with Cq values below 31 (Figure 2), when multiplexing 50 samples in the MiSeq flow cell. Only 1 of 36 IDV qPCR-positive samples with a Cq value above 31 was detected by MiSeq ( Figure 2). Nineteen samples that were positive for IDV by qPCR or MiSeq, and that represented the full range of qPCR Cq values, were selected for further analysis by Nanopore sequencing.

Comparison of Nanopore Sequencing Results with Previously Determined MiSeq Data
A total of 82.7 million reads were obtained from MiSeq. After removing low-quality reads and host-derived reads, 33.6 million reads remained. A total of 1.8 million high-quality viral reads was generated, accounting for 2.19% of the total reads obtained from MiSeq [13]. A total of 5.9 million raw Nanopore reads was generated, 69.5% of which passed the quality filter (Qscore 7) and were classified taxonomically when the data was uploaded to the EPI2ME platform. These classified reads included a total of 0.41 million viral reads, accounting for 6.9% of filter passed, classified reads ( Figure 3). The proportion of viral reads per sample was 0.1% to 18.4% (Nanopore, WIMP analysis) compared to

Comparison of Nanopore Sequencing Results with Previously Determined MiSeq Data
A total of 82.7 million reads were obtained from MiSeq. After removing low-quality reads and host-derived reads, 33.6 million reads remained. A total of 1.8 million high-quality viral reads was generated, accounting for 2.19% of the total reads obtained from MiSeq [13]. A total of 5.9 million raw Nanopore reads was generated, 69.5% of which passed the quality filter (Qscore 7) and were classified taxonomically when the data was uploaded to the EPI2ME platform. These classified reads included a total of 0.41 million viral reads, accounting for 6.9% of filter passed, classified reads ( Figure 3). The proportion of viral reads per sample was 0.1% to 18.4% (Nanopore, WIMP analysis) compared to 0.03% to 3.1% for the previously generated MiSeq data; however, with both sequencing approaches, the majority of reads obtained were identified as host-derived or other (bacteria, fungi, unclassified) ( Figure 3).

Comparison of IDV Detection by qPCR, MiSeq and Nanopore Sequencing
The 19 samples selected for sequencing on the Nanopore GridION platform represented a range of IDV concentrations based on qPCR of 6.88 to 6.25 × 10 7 genome copies per reaction, corresponding to Cq values ranging from 39.46 to as low as 16.99 (Table 1). The agreement of IDV detection between qPCR and Nanopore was 57.9% (in-house) and 84.2% (WIMP), and that between MiSeq and Nanopore was 89.5% (in-house) and 73.7% (WIMP). IDV was detected in the Nanopore data from all but one (18/19) of the IDV-positive samples when reads were classified using the WIMP application, but this proportion dropped to 11/19 when the in-house read assembly workflow was used. For most (7/8) of the samples with disparate results, 10 or fewer IDV reads were identified in the WIMP analysis. The exception was sample T10 with 29 IDV reads.
In order to explore qualitatively whether detection of other viruses in addition to IDV was comparable between the two metagenomic sequencing platforms, we compared the complete lists of viruses detected by MiSeq or Nanopore in the 19 IDV-positive samples. The number of different viruses detected in each sample varied from none to a maximum of four. The proportion of samples with perfect agreement between MiSeq and Nanopore (in-house) was 52.6%, by MiSeq and Nanopore (WIMP) it was 47.4%; and by Nanopore (in-house) and Nanopore (WIMP) it was 36.8% (Supplementary Table S1).

Discussion
Metagenomic sequencing is transforming routine detection of viruses from traditional cell culture, antibody-antigen techniques and qPCR to detection of viruses in a target-independent manner. Sequencing approaches have now been widely applied for detection of known and novel agents in various types of clinical specimens in both human and veterinary medicine [3,29,30]. The potential usefulness of viral metagenomics for virus surveillance and diagnostics is still in debate due to its performance relative to the gold-standard method of real-time qPCR routinely employed in diagnostic laboratories [6]. A recent assessment of the performance of Nanopore, MiSeq and qPCR for detection of chikungunya and dengue viruses in serum or plasma samples with relatively high viral loads (Cq values from 14 to 32) demonstrated 100% agreement between these methods [1]. In this investigation, however, a maximum of 16 samples was multiplexed and sequenced using MiSeq, and each sample was sequenced individually on Nanopore [1]. This low degree of multiplexing translates to high analytical sensitivity, but correspondingly makes these technologies relatively more expensive per sample and decreases the potential application for routine diagnostics. In our current In addition to WIMP classification of quality-filtered Nanopore reads, we also performed a de novo assembly of the Nanopore reads. The largest IDV contigs assembled for each sample from Nanopore data (using the in-house bioinformatics workflow, Figure 1b) were generally longer than those from MiSeq data and ranged from 626 to 2308 bp (Nanopore), and 249 to 1584 bp (MiSeq) ( Table 1). The genome segment coverage of each largest contig from each sample was from 10.5% to 95.3%. The proportion of Nanopore reads mapped to IDV for each sample by in-house analysis was higher than that from MiSeq except for sample T10 and T30. The proportion of IDV reads identified in the WIMP analysis of the Nanopore data, however, was generally comparable to that from the Nanopore (in-house) workflow (Table 1). The proportion of reads identified as IDV in Nanopore (WIMP), Nanopore (in-house) and MiSeq sequencing was generally extremely low (average 2.51%, 17.03%, and 0.46%, respectively). As expected, approximately six times more reads were obtained for the individually sequenced sample 129 than for those from the multiplexed samples (Table 1). Sample 129 also had the lowest Cq value in the IDV qPCR (16.99, corresponding to 6.25 × 10 7 copies per reaction) and the highest proportion of IDV reads in the metagenomic sequencing results (Nanopore-WIMP 14.69%, Nanopore-in-house 27.72%, MiSeq 1.48%) ( Table 1).

Comparison of IDV Detection by qPCR, MiSeq and Nanopore Sequencing
The 19 samples selected for sequencing on the Nanopore GridION platform represented a range of IDV concentrations based on qPCR of 6.88 to 6.25 × 10 7 genome copies per reaction, corresponding to Cq values ranging from 39.46 to as low as 16.99 (Table 1). The agreement of IDV detection between qPCR and Nanopore was 57.9% (in-house) and 84.2% (WIMP), and that between MiSeq and Nanopore was 89.5% (in-house) and 73.7% (WIMP). IDV was detected in the Nanopore data from all but one (18/19) of the IDV-positive samples when reads were classified using the WIMP application, but this proportion dropped to 11/19 when the in-house read assembly workflow was used. For most (7/8) of the samples with disparate results, 10 or fewer IDV reads were identified in the WIMP analysis. The exception was sample T10 with 29 IDV reads.
In order to explore qualitatively whether detection of other viruses in addition to IDV was comparable between the two metagenomic sequencing platforms, we compared the complete lists of viruses detected by MiSeq or Nanopore in the 19 IDV-positive samples. The number of different viruses detected in each sample varied from none to a maximum of four. The proportion of samples with perfect agreement between MiSeq and Nanopore (in-house) was 52.6%, by MiSeq and Nanopore (WIMP) it was 47.4%; and by Nanopore (in-house) and Nanopore (WIMP) it was 36.8% (Supplementary  Table S1).

Discussion
Metagenomic sequencing is transforming routine detection of viruses from traditional cell culture, antibody-antigen techniques and qPCR to detection of viruses in a target-independent manner. Sequencing approaches have now been widely applied for detection of known and novel agents in various types of clinical specimens in both human and veterinary medicine [3,29,30]. The potential usefulness of viral metagenomics for virus surveillance and diagnostics is still in debate due to its performance relative to the gold-standard method of real-time qPCR routinely employed in diagnostic laboratories [6]. A recent assessment of the performance of Nanopore, MiSeq and qPCR for detection of chikungunya and dengue viruses in serum or plasma samples with relatively high viral loads (Cq values from 14 to 32) demonstrated 100% agreement between these methods [1]. In this investigation, however, a maximum of 16 samples was multiplexed and sequenced using MiSeq, and each sample was sequenced individually on Nanopore [1]. This low degree of multiplexing translates to high analytical sensitivity, but correspondingly makes these technologies relatively more expensive per sample and decreases the potential application for routine diagnostics. In our current study, we performed further exploration to assess the performance of metagenomic sequencing approaches with a higher degree of multiplexing of clinical samples in both MiSeq and Nanopore sequencing. IDV presented an excellent target for this comparison given its association with BRD in beef and dairy cattle, and the availability of specimens with a wider range of viral loads than has been included in previous investigations [19,20,31].
Different viral extraction kits have been demonstrated to have variable extraction efficiencies for different viruses in respiratory clinical samples [32]. The QIAamp MinElute Virus Spin Kit (MVSK) has been found to be generally applicable for isolating nucleic acid for qPCR or metagenomic virus identification of adenovirus, influenza virus A, human parainfluenza virus 3, human coronavirus OC43, and human metapneumovirus in respiratory clinical samples [31]. In our current study, the original nucleic acids extracted with the MVSK were used for both qPCR and metagenomic sequencing, eliminating the influence of different extraction methods and kits on our results (Figure 1a).
The IDV-specific qPCR assay detected its target in 22.8% (53/232) of specimens. While the majority of IDV positive samples with Cq value below 31 were detected by MiSeq, only 1/36 samples with a Cq above this threshold were positive by sequencing (Figure 2). These results demonstrate that for samples where the viral load exceeds 6.25 × 10 2 per reaction even a relatively modest MiSeq sequencing effort (50 samples multiplexed in a single flow cell) is sufficient to detect the virus. The agreement between qPCR and Nanopore of 57.9% (in-house) and 84.2% (WIMP) demonstrated that relatively modest Nanopore sequencing effort (six samples multiplexed) is also sufficient to detect the virus. The results from Nanopore sequencing (in-house), however, showed no consistent relationship between viral load and detection by sequencing; furthermore, no consistent relationship between viral load and proportion of viral reads was observed in either MiSeq or Nanopore sequencing (Table 1). For example, the two IDV positive samples 199 and 10 had Cq values of 26.01 and 24.09, respectively; however, sample 199 had a higher proportion of IDV sequence reads in Nanopore and MiSeq than sample 10 ( Table 1).
There are several possible explanations for differences in both the proportion of IDV reads and total viral reads detected in each sample by MiSeq and Nanopore. First, variation in the amounts of DNA used for sequencing library preparation for the two sequencing platforms may play an important role. Second, the abundance of virus relative to host or bacterial genetic material is a critical determinant of the detection threshold of metagenomic sequencing. A greater proportional abundance of a virus increases the chance that it will be detected by sequencing and improves the genome coverage obtained. Therefore, virus enrichment is commonly applied to clinical samples and enrichment methods such as those used in this study (a combination of centrifugation and nuclease-treatment) should lead to removal of bacteria and host cells, thus improving virus detection [32]. Virus propagation in cell culture is a less appealing method for virus enrichment since it is time-consuming, requires specific expertise and creates the potential for introduction of mutations [33]. Reduction of the degree of multiplexing of samples is an alternative way to improve virus detection, but there is a corresponding increase in cost per sample and a corresponding reduction in throughput that are undesirable in research or clinical diagnostic settings. Reduction of the degree of multiplexing of samples also reduces the chances of cross-barcode contamination because barcode reagents are susceptible to cross-contamination [34].
Bioinformatic analysis in metagenomic sequencing remains challenging but is crucial for accurate identification of diagnostic targets. We used the comparable pipeline to analyze both data from MiSeq and Nanopore sequencing (in-house), which showed the feasibility of metagenomic viral whole-genome-sequencing using both Nanopore and MiSeq technology with the assembled contigs covering from 10 to 95% of each IDV genome segment. Although de novo assembly was performed on Nanopore sequencing data, for the majority of the samples, the length of the largest contig was that of one single read. Skipping the assembly step in bioinformatic analysis of Nanopore data could provide an advantage for timely identification of potential pathogens. The long reads of Nanopore sequencing are thought to provide good confidence for species level identification, but the low coverage combined with the error rates of this platform preclude its use for strain-level resolution [4].
The detection rates of IDV, the number or proportion of IDV reads (Table 1), and other viruses detected (Supplementary Table S1) from Nanopore (WIMP) and Nanopore (in-house) were different, which demonstrates that bioinformatic analysis affects the results of virus detection. Taxonomic classification in WIMP is based on Centrifuge [35], which compares query sequences to a (undescribed) reference database with a high speed and space-optimized k-mer-based algorithm. For Nanopore (in-house) and MiSeq analysis, BLAST [36] was used to compare assembled contig sequences to the NCBI RefSeq virus database, which is a more computationally intense process that produces more detailed results. The identification of a match using Centrifuge is based on probabilities of particular k-mer combinations occurring in the query and reference and not a consideration of the entire query sequence, thus increasing the possibility of false positives [35]. In contrast, Trinity assembly and then BLAST search against a reference database could lead to false negatives if the particular target sequence is very rare [37]. If there are very few reads derived from some component of the metagenome, these reads may not be included in the assembly since there is insufficient "evidence" to support building contigs from them [36,37]. The current lack of definition of the reference database or the ability to use custom databases with WIMP make this approach inappropriate for clinical diagnostic applications due to the difficulty of validating such approaches. Our results provide an illustration of the profound effects that post-sequencing analysis can have on results, and the trade-offs associated with each choice. Selection of the most appropriate analysis pipeline must consider the sequencing platform, as well as tolerance for false negatives and false positives, logistical considerations, and the required taxonomic resolution.
Analytical sensitivity is currently one of the main limitations of metagenomics. In this study, IDV was detected by MiSeq sequencing in specimens with qPCR Cq value as high as 35.62 when 50 samples were multiplexed in comparison to a maximum Cq value of 39.46 using Nanopore with multiplexing of six samples. For the IDV positive samples with low virus loads (e.g., sample 32), targeted qPCR may be preferable given its higher analytical sensitivity. Interestingly, we observed two samples that were IDV positive by both Nanopore (WIMP) and MiSeq but negative by qPCR (Samples T30 and T52, Table 1). These cases illustrate a potential advantage of metagenomic sequencing compared to qPCR since a likely explanation for this observation is that these specimens contained strain variants of IDV that were not detected by the qPCR assay. We were unable to determine if this was the case since the IDV sequence reads did not cover the region of the genome targeted by the species-specific qPCR assay. Targeted PCR assays for rapidly evolving RNA viruses require ongoing performance monitoring, and optimization of primers and probes [5]. No single method is suitable for application for all pathogens or specimen types, and each one has advantages in different circumstances.

Conclusions
Taken together our results demonstrate the potential of metagenomic sequencing on the Illumina MiSeq and Oxford Nanopore platforms for detection of viruses, including IDV, in clinical samples from naturally infected animals with a wide range of viral loads. While application of these approaches to screening animal populations or infectious disease research is feasible, their deployment for routine virology diagnostics in clinical settings will require additional research, laboratory and bioinformatic method development, and performance evaluation. Our exploration of two sequencing platforms, different degrees of multiplexing, including samples containing a wide range of virus loads, and comparing to a current diagnostic gold standard is an important step toward achieving these goals. Selection of appropriate methods will continue to require careful consideration of the numerous trade-offs that confront practitioners at each step of the investigation.