Comparison of Nanopore and Synthesis-Based Next-Generation Sequencing Platforms for SARS-CoV-2 Variant Monitoring in Wastewater

Shortly after the beginning of the SARS-CoV-2 pandemic, many countries implemented sewage sentinel systems to monitor the circulation of the virus in the population. A fundamental part of these surveillance programs is the variant tracking through sequencing approaches to monitor and identify new variants or mutations that may be of importance. Two of the main sequencing platforms are Illumina and Oxford Nanopore Technologies. Here, we compare the performance of MiSeq (Illumina) and MinION (Oxford Nanopore Technologies), as well as two different data processing pipelines, to determine the effect they may have on the results. MiSeq showed higher sequencing coverage, lower error rate, and better capacity to detect and accurately estimate variant abundances than MinION R9.4.1 flow cell data. The use of different variant callers (LoFreq and iVar) and approaches to calculate the variant proportions had a remarkable impact on the results generated from wastewater samples. Freyja, coupled with iVar, may be more sensitive and accurate than LoFreq, especially with MinION data, but it comes at the cost of having a higher error rate. The analysis of MinION R10.4.1 flow cell data using Freyja combined with iVar narrows the gap with MiSeq performance in terms of read quality, accuracy, sensitivity, and number of detected mutations. Although MiSeq should still be considered as the standard method for SARS-CoV-2 variant tracking, MinION’s versatility and rapid turnaround time may represent a clear advantage during the ongoing pandemic.


Introduction
Since the very beginning of the pandemic caused by SARS-CoV-2, efforts have been made to monitor both the circulation and evolution of the virus at the community level.
Wastewater-based epidemiology (WBE) has proven to be a cost-effective early warning tool to monitor the spread of the virus and the upsurge of new variants and their circulation [1]. For this purpose, next-generation sequencing (NGS) information becomes essential, and many countries have implemented wastewater surveillance systems that have become informative tools to support public health decision-making processes regarding the SARS-CoV-2 pandemic [2,3].
Despite sewage being a very complex matrix containing mixtures of variants, NGS may provide reliable qualitative and relative quantitative data. Qualitative data give information on which variants are present in a given sample based on the signature mutations defining each variant, whereas quantitative data show the proportions of these variants. Nevertheless, some studies warn about the reliability of using sequencing results to quantify the abundance of variants in wastewater due to its lower sensitivity in comparison with real-time PCR-based assays [4]. In the context of a pandemic where results are required to take action, fast sequencing techniques such as NGS are of critical importance.
Nanopore and synthesis-based sequencing techniques are the most commonly used NGS methods. Nanopores are the basis of Oxford Nanopore Technologies (ONT) sequencing platforms, whereas synthesis-based methods are the basis of both Illumina and PacBio sequencing platforms. The Nanopore platform allows real-time sequencing of both short and long reads as well as detection of modified bases (e.g., methylation) in both DNA and RNA [5]. The Illumina sequencing platform can only sequence short reads but provides a very high accuracy. The PacBio sequencing platform, even though not as widely used as Nanopore and Illumina for SARS-CoV-2 variant monitoring, allows long-read sequencing and modified base detection with a similar accuracy as Nanopore [6].
In this work, the main objective was to compare MiSeq (Illumina) with MinION (ONT) NGS approaches to evaluate the pros and cons of each method. With this aim, five synthetic RNAs (Twist Bioscience, San Francisco, CA, USA) corresponding to SARS-CoV-2 variants of concern (VOC) Alpha, Beta, Gamma, Delta, and Omicron BA.1 were mixed at different proportions. These mixtures, as well as a sample containing only a synthetic RNA for the Wuhan variant, were sequenced with both approaches with the aim to compare their performance in terms of (i) per position error rate (PPER), (ii) output of reads per position (i.e., read depth), (iii) S gene coverage, (iv) detected mutations, and (v) accuracy of the estimation of the proportion of each VOC. Two variant callers, LoFreq [7] and iVar (embedded in Freyja) [8,9], were also compared. On the one hand, LoFreq was designed to detect variants at very low frequencies and to be able to distinguish them from sequencing errors. It has been successfully tested in quasi-species analysis [7,10], making it a suitable tool to identify SARS-CoV-2 variants. On the other hand, a recently published bioinformatic tool, Freyja, has been created to determine the abundance of SARS-CoV-2 lineages in mixed samples, especially focusing on wastewater samples [9]. For variant calling, Freyja relies on iVar, which was built for viral amplicon-based sequencing [8]. Additionally, six SARS-CoV-2 naturally contaminated wastewater samples were sequenced to evaluate the performance of these NGS/pipelines in real case scenarios.

Per Position Error Rate (PPER) Calculation
The quality of the generated reads is a critical issue when making decisions based on sequencing data. By sequencing a Twist synthetic RNA corresponding to the original Wuhan SARS-CoV-2 strain, a comparison of the error rate could be carried out. PPER was defined as the total number of detected substitutions per nucleotide as compared to the expected sequence. Although the occurrence of mutations during the RNA synthesis reaction used to prepare Twist synthetic controls could not be completely ruled out, we considered that any unexpected mutation found in the Wuhan Twist RNA control would be a potential sequencing artifact. In our approach, we considered not only the errors that may occur in the sequencing reactions and the basecalling step but also those that may happen during the aligning and variant calling steps [11,12].
In R10.4.1 duplex data, a substantial number of substitutions corresponding to the five VOCs were detected in the Wuhan Twist RNA, suggesting errors in the read-pairing process, probably due to the high similarity of the amplicons sequenced. For the PPER estimation on the duplex data, we performed two different analyses: (i) including all detected sequencing artifacts and (ii) discarding those mutations that corresponded to VOCs. PPER was evaluated using the output of LoFreq and iVar variant callers. With both variant callers, MinION generated a higher PPER than MiSeq, except for duplex data analyzed with LoFreq after removing the detected VOC mutations (Table 1).
With iVar, the differences between the two sequencing platforms were clearly larger, with MinION's R9.4.1 flow cell having an approximately 25 times higher PPER than MiSeq (0.236% vs. 0.009%). R10.4.1 flow cell singleplex data reduced the difference with MiSeq, having only an approximately 5 times higher PPER than the Illumina platform (0.050% vs. 0.009%). While PPER was generally lower for MiSeq than MinION, the total number of sequencing artifacts and the frequency range of these errors varied greatly for each sequencing platform/variant caller combination. Of note, MinION consistently reported a higher frequency range of sequencing artifacts, which could be as high as 71.0% in R9.4.1 flow cell and 48.6% and 33.76% in R10.4.1 singleplex and duplex data, respectively, while sequencing artifacts reported by MiSeq only reached frequencies up to 5.3%. These results indicate a lower sequencing accuracy for MinION as compared to MiSeq, although R10.4.1 singleplex data significantly reduced the number and frequency of sequencing artifacts with both LoFreq and iVar.
In an attempt to reduce the high number of sequencing artifacts detected by iVar in MinION data, a 1% mutation frequency cut-off was included. In the R9.4.1 flow cell, even though the number of sequencing artifacts was still high, we were able to discard 95 of these errors (i.e., a reduction of 28%), whereas the R10.4.1 flow cell showed a greater reduction of 53% in singleplex data and over 80% in duplex data. In both MinION flow cell types, the PPER showed a lower decrease than in MiSeq since mutations with frequencies <1% had a very low sequencing depth. In the case of MiSeq data, there was an 84% reduction in the number of sequencing artifacts as well as a three times lower PPER because most of the detected mutations were at frequencies <1%. The cut-off was applied to both Twist mixtures and wastewater samples (see below).
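As a rough illustration of how a PPER and the 1% frequency cut-off could be computed from variant-caller output, the following Python sketch may help. It assumes a depth-weighted reading of the definition above (reads carrying an unexpected substitution divided by all sequenced bases, as a percentage); that interpretation, the `Call` record, and the toy numbers are assumptions for illustration and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """One unexpected substitution reported in the Wuhan control (hypothetical record)."""
    position: int
    depth: int      # reads covering the position
    alt_reads: int  # reads carrying the unexpected base

    @property
    def frequency(self) -> float:
        return self.alt_reads / self.depth if self.depth else 0.0


def pper(calls, total_bases, min_frequency=0.0):
    """Return (PPER in percent, number of artifacts kept).

    Assumption: PPER = erroneous bases / all sequenced bases * 100.
    Calls below min_frequency are discarded before counting.
    """
    kept = [c for c in calls if c.frequency >= min_frequency]
    erroneous_bases = sum(c.alt_reads for c in kept)
    return 100.0 * erroneous_bases / total_bases, len(kept)


# Toy data: two low-frequency artifacts and one high-frequency artifact.
calls = [Call(22000, 1000, 5), Call(23000, 1000, 200), Call(24000, 2000, 10)]
total_bases = 100_000  # sum of read depth over all positions (made-up value)

unfiltered = pper(calls, total_bases)
filtered = pper(calls, total_bases, min_frequency=0.01)  # 1% cut-off
print(unfiltered, filtered)
```

In this toy case the cut-off removes two of the three artifacts but barely changes the PPER, because the discarded calls contribute few reads; this mirrors the behavior described for the MinION data.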

Coverage and Depth Comparison
An amplicon-based sub-ARTIC V3 approach targeting only the spike gene was used to amplify 10 different mixtures of Twist synthetic RNA controls corresponding to five different VOCs (Alpha, Beta, Gamma, Delta, and Omicron BA.1), as well as six naturally contaminated wastewater samples, in addition to the synthetic RNA corresponding to SARS-CoV-2 Wuhan. From mixes 1 to 6 (see Section 4.1), synthetic RNA controls corresponding to Omicron BA.1 and Delta variants were mixed at different proportions to emulate what happened with these two variants in a real scenario during the 6th wave in Spain, when the Delta variant was displaced by Omicron BA.1 [16]. From mixes 7 to 10, the five different VOCs were mixed at different proportions to cover multiple possible scenarios.
Using the coverage data obtained from the sequencing of the 10 mixtures and the Wuhan Twist synthetic RNA, the mean depth per amplicon was calculated (Figure 1A). Both platforms showed the same trend throughout the spike gene, but MiSeq consistently outputted more reads per amplicon than both MinION flow cells. The lower read depth of amplicon A82, seen with both sequencing methodologies, may be due to the overlapping of this amplicon with two Twist fragments.
The same analysis was carried out for the six wastewater samples (Figure 1B). As seen with the Twist mixtures, more reads per amplicon were outputted by the Illumina platform. The almost complete absence of read depth of amplicon A73 in these samples may be due to the use of ARTIC V3, which was not yet optimized for the Omicron variant.
In both Twist RNA mixtures and wastewater samples, the R10.4.1 duplex data showed the lowest number of reads per amplicon.
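The per-amplicon depth summary described in this section can be sketched in a few lines of Python. This is only an illustration: the amplicon names and coordinates below are invented placeholders (the real ARTIC V3 coordinates are not given in the text), and the 100-read threshold is borrowed from the coverage criterion mentioned in the Discussion.

```python
def mean_depth_per_amplicon(depth_by_position, amplicons):
    """Average read depth over each amplicon's inclusive [start, end] interval."""
    result = {}
    for name, (start, end) in amplicons.items():
        depths = [depth_by_position.get(pos, 0) for pos in range(start, end + 1)]
        result[name] = sum(depths) / len(depths)
    return result


def covered_fraction(depth_by_position, start, end, min_depth=100):
    """Fraction of positions in [start, end] with more than min_depth reads."""
    positions = range(start, end + 1)
    covered = sum(1 for pos in positions if depth_by_position.get(pos, 0) > min_depth)
    return covered / len(positions)


# Toy depth profile over six positions and two overlapping placeholder amplicons.
depth = {1: 10, 2: 10, 3: 300, 4: 300, 5: 200, 6: 200}
amplicons = {"demo_A": (1, 4), "demo_B": (3, 6)}

means = mean_depth_per_amplicon(depth, amplicons)
fraction = covered_fraction(depth, 1, 6)
print(means, fraction)
```

In practice the per-position depths would come from the aligned reads (for example, a depth table exported from the BAM file) rather than from a hand-written dictionary.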

Comparing the Accuracy of VOC Abundance Estimation
MiSeq and MinION platforms were tested to compare their ability to estimate the abundance of different VOCs present in mixed samples (synthetic RNA mixtures and naturally contaminated samples). The data obtained were analyzed with LoFreq and Freyja and compared to the expected values.
Relative VOC abundances were estimated with eight different approaches: MiSeq, MinION R9.4.1, MinION R10.4.1 singleplex, and MinION R10.4.1 duplex data, each analyzed with LoFreq and with Freyja. Significant positive correlations, determined by the Spearman rank correlation coefficient, between observed and expected results were ascertained for all tested pipelines (Table 2). Of note, with MinION R9.4.1 data, Freyja showed enhanced sensitivity over LoFreq, enabling the detection of all low-frequency variants present at 5%. The MinION R10.4.1 flow cell detected all variants present in the mixtures, except for duplex data analyzed with LoFreq, which missed the Delta VOC present at 5% in mix 10. Freyja (iVar) showed a worse correlation than LoFreq with R10.4.1 duplex data due to the detection of non-present VOCs.
Mean deviations between observed and expected percentages were calculated (Table 2). Freyja performed better than LoFreq with MiSeq and MinION R9.4.1 flow cell data and equally to LoFreq with MinION R10.4.1 singleplex and duplex data. Highly biased deviations were obtained for the Beta variant in mixtures 7 to 10 with both sequencing techniques. Consequently, this variant was not considered in the calculations.
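To make the two summary statistics above concrete, here is a minimal, dependency-free Python sketch of a Spearman rank correlation (Pearson correlation of average ranks, so ties are handled) and of a mean deviation between observed and expected percentages. Treating the mean deviation as a mean absolute difference is an assumption, and the percentages are invented for illustration.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def mean_absolute_deviation(observed, expected):
    return sum(abs(o - e) for o, e in zip(observed, expected)) / len(observed)


expected = [5, 25, 50, 75, 95]   # made-up expected VOC percentages
observed = [7, 22, 51, 70, 96]   # made-up estimates from one pipeline

rho = spearman(observed, expected)
deviation = mean_absolute_deviation(observed, expected)
print(rho, deviation)
```

A rank correlation only checks that the ordering of abundances is preserved, which is why it is reported alongside the deviation: a pipeline can rank variants perfectly and still be several percentage points off.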
In naturally contaminated samples, the percentage of Delta and Omicron BA.1 VOCs was determined by RT-qPCR and compared with the NGS proportion determination. Both sequencing methods were able to successfully detect Delta and BA.1 VOCs present in the samples, even though MinION was not always sensitive enough to detect low-prevalence variants (Figure 2B). In most samples, NGS revealed the presence of only Omicron BA.1 and Delta VOCs; however, in samples 2, 3, and 6, a few signature mutations of Omicron BA.2 were also detected. Consequently, in these samples the common mutations between BA.1 and BA.2 were not included in the abundance estimation. A higher degree of deviation from RT-qPCR values was observed compared to the artificial Twist RNA mixtures; however, significant positive correlations were also observed (Table 2).
Despite the good Spearman rank correlation coefficients of the MinION R10.4.1 duplex data, the five VOCs used in this study were found in all analyzed samples likely due to erroneous read pairing in the duplex basecalling function.

Detected Mutations
An important factor to consider when determining the presence and abundance of a certain variant is the number of detected signature mutations that define this variant. For instance, the recommendation published by the EU in 2021 indicates that at least three genetic markers per variant should be reported for variant identification [17], and Twist RNAs of Alpha, Beta, Gamma, Delta, and Omicron BA.1 VOCs have 4, 3, 7, 4, and 26 signature mutations (SNV), respectively (Figure S1). In the case of the Twist RNA mixtures (Figure S1A), both sequencing platforms gave a similar number of detected mutations when Freyja was used to analyze sequencing data. Between 21 and 29 out of 30 signature mutations were detected in mixes 1-6, which contained only two VOCs, and between 32 and 36 out of 40 signature mutations were detected in mixes 7-10, which contained five different VOCs. However, with LoFreq, fewer signature mutations were detected with MinION R9.4.1 flow cell as compared to MinION R10.4.1 flow cell and MiSeq: only 8-26 out of 30 signature mutations were detected in mixes 1-6 and 11-30 out of the 40 expected in mixes 7-10. A similar trend was observed in wastewater samples (Figure S1B), where MinION R9.4.1 flow cell data analyzed with LoFreq consistently detected fewer mutations than Freyja, whereas LoFreq with the R10.4.1 flow cell data was more in line with MiSeq results. Despite the intrinsic variability of these types of samples, MiSeq tended to output more signature mutations than MinION, adding robustness to the results.
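The three-marker rule mentioned above lends itself to a simple check. The Python sketch below counts how many signature mutations of each variant appear among the detected mutations and flags variants that reach the threshold. The signature table is deliberately partial and partly fictional: the four Omicron BA.1 mutations are ones named in this section, while `demo_variant` and its markers are placeholders; the function and variable names are hypothetical as well.

```python
MIN_MARKERS = 3  # minimum genetic markers per variant, per the 2021 EU recommendation


def variants_supported(detected, signatures, min_markers=MIN_MARKERS):
    """Map each variant to (markers found, markers defined, meets threshold)."""
    detected = set(detected)
    report = {}
    for variant, markers in signatures.items():
        found = detected & set(markers)
        report[variant] = (len(found), len(markers), len(found) >= min_markers)
    return report


# Partial, illustrative signature table (not the full Twist control definitions).
signatures = {
    "Omicron BA.1 (subset)": ["S371L", "S373P", "S375F", "K417N"],
    "demo_variant": ["demo1", "demo2", "demo3"],
}
detected = ["S371L", "S373P", "K417N", "demo2"]

report = variants_supported(detected, signatures)
print(report)
```

Here the Omicron subset passes with three of four markers, whereas the placeholder variant, supported by a single marker, would not be reported.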
MinION R10.4.1 duplex data cannot be directly compared with the other approaches as read-pairing errors are likely to occur, which are a confounding factor when counting the number of detected mutations.
Independently of the sequencing platform, variant caller, and sequencing depth, mutations S371L, S373P, S375F, and K417N of Omicron BA.1 were systematically detected in lower abundances, thus being the first ones to be lost when the concentration of the Omicron BA.1 variant decreased. MiSeq with LoFreq was able to detect these mutations in more samples than the other approaches, but with reported frequencies as low as 0.13% (K417N in mix 4), which consequently affected the variant abundance value.
In Figure S1A,B, those mutations corresponding to VOCs that were not present in the sample are also shown. In wastewater samples from late 2021 to early 2022, the number of non-present VOCs (Alpha, Beta, and Gamma) that were detected was higher than in Twist RNA mixtures, probably due to the lower amount of viral genetic material present in the samples compared to that in the Twist RNA mixtures. This makes the samples more prone to cross-contamination and index hopping (MiSeq) or barcode bleeding (MinION).

Discussion
NGS technologies are of utmost importance when searching for new variants or mutations that may have an impact on population health. The Illumina platform is generally considered the gold standard [18-20], although ONT is becoming more popular due to its lower cost, shorter turnaround time, and portability [21-23], and PacBio also offers SARS-CoV-2-directed resources [24]. Here, we present a comparison between the first two platforms, specifically MiSeq (Illumina) and MinION (ONT). Mixtures of known proportions of different Twist synthetic RNAs corresponding to five VOCs were sequenced along with six wastewater samples to assess the differences in terms of error rate, coverage, sequencing depth, estimation accuracy, and sensitivity. Additionally, all these parameters were compared and evaluated using two different bioinformatic pipelines, LoFreq and Freyja, with the aim to determine the effect that different data processing may have on the results.
It is widely accepted that MinION does not achieve the MiSeq basecall quality [25,26]. The mean Phred score of Twist RNA mixtures and wastewater samples was 36.0 (~99.97% basecall accuracy), 13.0 (~94.99% basecall accuracy), 14.4 (~96.37% basecall accuracy), and 22.9 (~99.49% basecall accuracy) for MiSeq, MinION R9.4.1, and MinION R10.4.1 singleplex and duplex data, respectively. Despite the possible pairing issues that happened with the R10.4.1 duplex read data, there is a remarkable increase in read quality, reaching values closer to MiSeq. In our study, we estimated the final PPER of each sequencing platform, i.e., the number of detected errors after variant calling, by sequencing a Twist RNA corresponding to the original SARS-CoV-2 sequence. Our results show that MiSeq has a lower PPER than MinION, independently of the variant caller used, which is in line with previous findings [26]. The largest difference between the two sequencing platforms was found when using iVar with the 1% frequency filter, where MiSeq and MinION R9.4.1 flow cells had a PPER of 0.003% and 0.202%, respectively. Nevertheless, there was a remarkable improvement in MinION's PPER when using the new R10.4.1 flow cell, having a PPER of 0.036% with LoFreq and of 0.049% with iVar with the 1% cut-off. In contrast, R10.4.1 duplex data increased the PPER, likely due to both read-pairing errors caused by the highly similar amplicons produced when amplifying the five different Twist synthetic RNAs used in the study and the lower sequencing depth of the duplex data.
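The basecall accuracies quoted alongside the mean Phred scores follow from the standard Phred definition, Q = -10 log10(P), where P is the probability of a wrong base call. A short Python sketch of the conversion (the function name is ours, chosen for illustration):

```python
def phred_to_accuracy(q):
    """Convert a Phred quality score to basecall accuracy in percent."""
    error_probability = 10 ** (-q / 10.0)
    return 100.0 * (1.0 - error_probability)


# Mean Phred scores reported for the four data sets.
scores = {
    "MiSeq": 36.0,
    "MinION R9.4.1": 13.0,
    "MinION R10.4.1 singleplex": 14.4,
    "MinION R10.4.1 duplex": 22.9,
}
accuracies = {name: round(phred_to_accuracy(q), 2) for name, q in scores.items()}
print(accuracies)
```

Because the scale is logarithmic, the step from Q13 to Q23 removes roughly 90% of the expected basecall errors, which is why the duplex reads approach MiSeq quality despite a seemingly modest difference in accuracy percentage.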
Two key points to explain the differences found in the PPER between LoFreq and iVar are the number of sequencing artifacts and the frequency of these errors, i.e., the number of reads containing the error. With LoFreq, a high number of sequencing artifacts were reported from MiSeq data, whereas only a few were found in MinION data. With iVar, the opposite behavior was observed, particularly with MinION R9.4.1 flow cell and R10.4.1 duplex data. Regarding the frequency of the sequencing artifacts, in MiSeq these were found in low frequencies (up to 5.3%), whereas frequency values went up to 71.0%, 48.6%, and 33.8% in MinION R9.4.1, R10.4.1 singleplex, and R10.4.1 duplex data, respectively. LoFreq was designed under the premise that it is difficult to differentiate between sequencing errors and actual mutations that are present at a very low frequency (<0.05%). To discriminate between sequencing errors and low-frequency mutations, LoFreq estimates the run-specific sequencing error rates based on mapping, base, and alignment qualities [7]. Consequently, on the one hand, the high read quality of MiSeq enabled LoFreq to report a high number of sequencing artifacts, which were mostly found at very low frequencies (<1%), suggesting that some are likely to be Twist synthesis or PCR-induced errors. On the other hand, due to the lower read quality of MinION, LoFreq could only report a few sequencing artifacts, which were found at higher frequencies (>5.9% and >1.6% in MinION R9.4.1 and R10.4.1 flow cells, respectively). In contrast, iVar (as part of Freyja) does not consider the run-specific sequencing error rate [8]. Our data showed that adding a frequency filter of 1% may help to reduce the background noise; hence, in both sequencing platforms only mutations with frequencies ≥1% are reported. The application of the filter confirmed that in MiSeq most of the detected sequencing artifacts are present at very low frequencies (<1%), whereas in MinION, except for the R10.4.1 duplex data, the frequencies are higher. As expected, MinION R10.4.1 singleplex data fall between MiSeq and R9.4.1 flow cell data, showing a remarkable reduction of the sequencing artifacts when applying the 1% frequency filter, albeit not as high as MiSeq. The differential behavior of these variant callers may explain why, when analyzing MinION data from both Twist mixtures and wastewater samples, iVar outputted more signature mutations than LoFreq, and why with MinION R9.4.1 flow cell data and LoFreq, some variants present at frequencies <5% were lost.
The percentage of the gene of interest covered with enough reads (in our case, >100 reads) is also a parameter of importance when analyzing sequencing data. It has been previously reported that MiSeq provides a better coverage than MinION as well as higher mutation detection [18]. Our results for both Twist RNA mixtures and wastewater samples are in line with these findings. Particularly, when using LoFreq, MiSeq was able to detect more signature mutations than MinION, which, together with the higher coverage, adds robustness to MiSeq results.
With MiSeq data and MinION R10.4.1 flow cell singleplex data, all variants present in the Twist RNA mixtures could be detected even when present at a very low proportion (5%), regardless of the variant caller used. However, with MinION R9.4.1 flow cell data and the LoFreq variant caller, the Alpha, Gamma, and Omicron BA.1 VOCs present at 5% in mixtures 7, 9, and 10, respectively, could not be detected. MinION R10.4.1 duplex data showed good estimations even though the detection of mutations corresponding to VOCs that were not present in mixtures 1-6 hampers the comparisons with the other approaches. Although not significantly different, correlations between observed and expected VOC frequencies and average deviation values were better when using Freyja than LoFreq for both sequencing methods. The use of the Freyja algorithm has been widely implemented in SARS-CoV-2 wastewater-monitoring studies using Illumina sequencing platforms [9,27,28], but not on ONT sequencing data. To our knowledge, Freyja has only been used with GridION on clinical specimens [29].
In actual wastewater samples, significant positive correlations between measured VOC frequencies and values estimated by duplex RT-qPCR assays were observed with both platforms. MiSeq data performed similarly with both LoFreq and Freyja but showed the weakest correlations (ρ = 0.75 and ρ = 0.74, respectively) despite outputting more signature mutations than MinION. In contrast, MinION data with both types of flow cells, R9.4.1 and R10.4.1, even though missing some variants present at low frequencies, showed higher correlations with RT-qPCR when Freyja was employed to analyze the sequencing data (ρ = 0.81 and ρ = 0.82, respectively). It is worth noting that RT-qPCR abundances may not be as accurate a reference as known Twist RNA concentrations, since many factors (e.g., inhibition) may affect the RT-qPCR results.
While MiSeq typically outperforms MinION based on the studied parameters, it is a more expensive platform that requires a fixed space, and both library preparation and run time are generally longer. For instance, the cost of reagents (RT and library preparation) and flow cells required to analyze a single sample in our laboratory, a university-based Spanish research laboratory, was approximately 30% higher for MiSeq as compared to MinION. MinION technology offers an affordable, versatile, portable, and real-time sequencing alternative.
MinION can be of special interest when using wastewater samples as an early warning tool to detect the emergence of new potential variants, where sequencing turnaround time is crucial. However, it comes at the cost of losing accuracy and sensitivity when compared to MiSeq [18,30]. It is worth mentioning that the improvement of both flow cell pores and reagent chemistry (R10.4.1 flow cells and kit V14), as well as the improvements in basecalling procedures (duplex reads), narrows the gap with the Illumina platform. Some read-pairing issues seem to happen when generating duplex read data, probably due to the highly similar amplicons produced when using the ARTIC panel to detect SARS-CoV-2 variants. Duplex data errors may lead to the detection of non-present variants in the samples, suggesting that this approach should be taken with caution when analyzing wastewater samples where a mixture of variants is likely to be present.
It has also been demonstrated that data processing by either LoFreq or Freyja plays a key role in results interpretation, affecting the PPER as well as the accuracy and sensitivity in both sequencing platforms.

Standard RNA Control Mixtures
Solutions of commercially available SARS-CoV-2 synthetic RNA controls (Twist Bioscience, San Francisco, CA, USA) corresponding to Alpha (Control 14, EPI_ISL_710528), Beta (Control 16, EPI_ISL_678597), Gamma (Control 17, EPI_ISL_792683), Delta (Control 23, EPI_ISL_1544014), and Omicron BA.1 (Control 48, EPI_ISL_6841980) were prepared in nuclease-free water, each containing 10^4 genome copies (gc)/µL. Then, these working solutions were mixed at different proportions as shown in Table 3. Additionally, a solution containing Twist RNA control corresponding to the original Wuhan SARS-CoV-2 sequence (Control 2, MN908947.3) was also prepared at the same concentration. cDNA was synthesized using random hexamers, after a 5 min thermal shock at 65 °C, using SuperScript III enzyme (Thermo Fisher Scientific, Waltham, MA, USA), in the presence of RNaseOUT (Thermo Fisher Scientific, Waltham, MA, USA). Subsequently, a sub-ARTIC V3 protocol was followed to amplify the spike gene. Briefly, 2 sets of 14 primers corresponding to amplicons 71 to 84 covering the entire S gene were separated into 2 pools (odd and even). Then, 5 µL of the previously obtained cDNA was added to the odd and even pools and a 40-cycle PCR amplification was carried out using the Q5® High-Fidelity DNA Polymerase (New England Biolabs, Ipswich, MA, USA).

Wastewater Samples
Wastewater samples collected from 6 wastewater treatment plants (WWTPs) located in Spain between 20 December 2021 and 3 January 2022 were chosen for the evaluation of the sequencing methods. With the aim to avoid biases regarding sampling procedures, three 24 h composite samples and three grab samples were used. Twenty-four-hour composite samples were collected within the framework of the Catalan Surveillance Network of SARS-CoV-2 in Sewage [31-33]. Grab samples were collected within the framework of the VATar COVID-19 Spanish wastewater surveillance project [34,35]. Samples were transported refrigerated (0-4 °C) in a portable icebox, concentrated upon arrival, and analyzed the next day. Viral particles were concentrated from 200 mL of wastewater following an aluminum hydroxide adsorption-precipitation method [36], and RNA extraction was performed using the Maxwell RSC PureFood GMO and Authentication Kit (Promega Corporation, Madison, WI, USA), following the manufacturer's instructions. The six samples were then confirmed to be positive for SARS-CoV-2 employing both the N1 (CDC) [37] and IP4 (Institut Pasteur) [38] targets.

Library Preparation
Prior to the beginning of the library preparation, PCR products of both standard RNA control mixtures and wastewater samples were purified.The RNA controls were purified with magnetic Kapa Pure Beads (Roche, Basel, Switzerland) and the wastewater samples with the QIAquick Gel Extraction Kit (Qiagen, Hilden, Germany) following the manufacturers' instructions.
Illumina libraries were prepared using the KAPA HyperPrep Kit (Roche, Basel, Switzerland).Briefly, purified DNA of each sample was (i) quantified using a Qubit ® dsDNA HS Assay Kit (Thermo Fisher Scientific, Waltham, MA, USA), (ii) normalized to 1.5 ng/µL with Tris-HCl 10 mM, (iii) end-repaired and A-tailed, (iv) the adaptors were ligated using the Illumina Kapa Single-Indexed Adapter Kit (Set A + B) and purified with Kapa Pure Beads, (v) libraries were amplified by an 8-cycle PCR, purified with Kapa Pure Beads, and quantified with Qubit.Finally, all libraries were pooled and normalized to 4 nM with Tris-HCl 10 mM.MiSeq ® Reagent Kit 600v3 cartridges were used to sequence the prepared libraries on the MiSeq ® platform (Illumina, San Diego, CA, USA) for 58 h.
MinION libraries to load in the R9.4.1 flow cell were prepared following the Nanopore Classic PCR tiling of SARS-CoV-2 virus protocol [39] with minor changes.Briefly, purified DNA of each sample was (i) quantified using a Qubit ® dsDNA HS Assay Kit, (ii) normalized to at least 4 ng/µL with Tris-HCl 10 mM, (iii) end-repaired and A-tailed, (iv) the barcodes were ligated using the EXP-NBD104 and EXP-NBD114 kits, (v) the barcoded samples were pooled and purified with the AMPure XP magnetic beads (Beckman Coulter, Brea, CA, USA), and (vi) the adapters were ligated before the final clean-up with AMPure XP magnetic beads.The obtained library was loaded in a R9.4.1 flow cell (FLO-MIN106D) and sequenced for 72 h using a MinION Mk1C (ONT, Oxford, UK).
At the time of writing, ONT released the new R10.4.1 flow cells and V14 kit chemistry.As stated by ONT, the new R10.4.1 flow cells, in combination with the V14 reagents, enable higher read accuracy (Q20+).Moreover, thanks to a duplex read system in which the complementary strand is read immediately after the template strand and a consensus basecall is performed, qualities ~Q30 or above are claimed to be achieved.We used R9.4.1 and R10.4.1 flow cells and reagents to determine the advantages of these new products when compared to their predecessors and MiSeq.
To prepare the libraries to load into the R10.4.1 flow cell, the ligation sequencing amplicons-Native Barcoding Kit 24 V14 (SQK-NBD114.24)protocol [40] was followed.This protocol was chosen because at the time of writing the Nanopore Classic PCR tiling of SARS-CoV-2 virus protocol used for the R9.4. 1 was not yet adapted to the new R10.4.1 flow cells and kit V14 chemistry.However, the steps are very similar to the ones described for R9.4.1 flow cells.The main changes are a clean-up step with beads between the end-repair and the barcode ligation, the use of EDTA instead of heat incubation to stop the barcode ligation, and the addition of BSA to the library prior to flow cell loading.The obtained library was then loaded in an R10.4.1 flow cell (FLO-MIN114).R10.4.1 flow cells have two modes of sequencing, the accurate mode that works at 260 bases per second (bps) and the default mode, at 400 bps.The accurate mode reduces the data yield but favors the quality of the obtained reads, and the default mode prioritizes the data yield at the cost of losing accuracy.The accurate mode was selected for the present study.
Despite the possibility of cleaning and reloading the MinION flow cells, these processes were not carried out in order to compare the performance of MinION with a single library load, as was carried out with MiSeq. Had the flow cells been reloaded with additional library material, a higher output would have been expected.

MiSeq Run Settings and Data Processing
Default settings were used to perform the sequencing run. Paired-end FASTQ files were generated from the Illumina MiSeq for each of the synthetic RNA mixtures and sewage samples. The bioinformatic analysis included the following steps: (i) MiSeq R1 and R2 paired ends were used to reconstruct each amplicon using the FLASH v1.2.11 program [41], setting a minimum of 20 overlapping bases and a maximum of 10% mismatches [42]; (ii) Phred scores were checked using FastQC v0.11.9 and no reads <Q20 were found; (iii) primers were trimmed and low-quality bases (<Q15) were removed from the ends of the sequences using Cutadapt v.3.5 [43]; and (iv) clean reads were aligned against the region corresponding to the S gene (21563-25384) of the reference genome NC_045512.2 [44] using minimap2 v.2.23 [45]. The resulting BAM file was used for variant calling with both LoFreq v2.1.5 and iVar v1.3.1 (embedded within Freyja v1.3.6) [7,8]. In order to process MiSeq and MinION data equally, single-nucleotide insertions, which are most likely to be sequencing artifacts (mainly present in MinION data), were filtered out and frequencies were adjusted using the BAM alignment positions.
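As an illustration of the merging criteria in step (i), the following Python sketch joins a read pair only when the overlap spans at least 20 bases with at most 10% mismatches. It is a simplified stand-in and not the algorithm implemented in FLASH: the function names are hypothetical, base qualities are ignored, and mismatches inside the overlap are resolved in favor of R1.

```python
from typing import Optional

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def merge_pair(r1: str, r2: str, min_overlap: int = 20,
               max_mismatch_ratio: float = 0.10) -> Optional[str]:
    """Merge R1 with the reverse complement of R2 (illustrative only).

    Every overlap length from min_overlap upwards is tested; the overlap
    with the lowest mismatch ratio is kept (ties go to the longest one).
    Returns None when no overlap satisfies both thresholds.
    """
    r2_rc = reverse_complement(r2)
    best_key = None
    best_overlap = None
    for overlap in range(min_overlap, min(len(r1), len(r2_rc)) + 1):
        mismatches = sum(a != b for a, b in zip(r1[-overlap:], r2_rc[:overlap]))
        ratio = mismatches / overlap
        if ratio > max_mismatch_ratio:
            continue
        key = (ratio, -overlap)
        if best_key is None or key < best_key:
            best_key, best_overlap = key, overlap
    if best_overlap is None:
        return None
    # Assumption: bases in the overlap are taken from R1.
    return r1 + r2_rc[best_overlap:]
```

With the thresholds used here, a pair whose reads share fewer than 20 bases is left unmerged (the function returns None) rather than being joined on a short, possibly spurious overlap.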

MinION Run Settings and Data Processing
A run length of 72 h with active channel selection activated, a pore scan frequency of 1.5 h, and a minimum read length of 20 bp were set for both R9.4.1 and R10.4.1 flow cells. Real-time basecalling and de-multiplexing were disabled.
For R9.4.1 flow cell data, raw FAST5 files were basecalled and de-multiplexed after the run using Guppy v.6.0.6 into FASTQ files. The basecalling was carried out with the superaccurate (SUP) mode and with a minimum Q-score of 8. The de-multiplexing step was carried out with the options "require_barcodes_both_ends" and "detect_mid_strand_barcodes".
For R10.4.1 flow cell data, raw FAST5 files were first basecalled with Guppy, selecting the specific SUP mode for 260 bps and a minimum Q-score of 8. Using Duplex Tools v0.3.1 [46], the "sequencing_summary.txt" file generated in the basecalling step was processed first by "pairs_from_summary" to identify duplex pair candidates, followed by "filter_pairs" to filter the selected read candidates. After these steps, two files were generated, "pair_ids_filtered.txt" and "pair_ids.txt", which were copied into the folder containing the FAST5 files. Finally, using Guppy, the duplex basecalling was performed over the raw FAST5 files using the 260 bps SUP mode and the options "duplex_pairing_mode" and "duplex_pairing_file".
The resulting FASTQ files were quality checked using Nanoplot v1.40.2 [14], and a mean quality of Q13 was obtained for Twist mixtures and wastewater samples when using the R9.4.1 flow cell. With the R10.4.1 flow cell, a mean quality of Q14.4 was obtained for singleplex data and of Q22.9 for duplex data. After QC, reads underwent the same steps described for MiSeq, from step (iii) onwards.
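For orientation, a Phred quality score Q corresponds to a per-base error probability p = 10^(-Q/10). The short Python sketch below applies this standard conversion to the mean qualities reported above; the function names are hypothetical, and reading a mean Q-score as a single error rate is only a rough approximation.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score into a per-base error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert a per-base error probability into a Phred quality score."""
    return -10 * math.log10(p)


# Mean qualities reported in this section for each MinION data set.
for label, q in [("R9.4.1", 13.0),
                 ("R10.4.1 singleplex", 14.4),
                 ("R10.4.1 duplex", 22.9)]:
    print(f"{label}: Q{q} ~ {phred_to_error_prob(q):.2%} expected error per base")
```

Under this conversion, Q13, Q14.4, and Q22.9 correspond to roughly 5%, 3.6%, and 0.5% expected errors per base, respectively, while the Q20 and Q30 marks quoted by ONT correspond to 1% and 0.1%.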
LoFreq was run with the "call-parallel" and "no-default-filter" options. Despite using the "no-default-filter" option, alignment qualities, multiple testing correction, and p-value threshold filtering on variant quality were still active. This option was used to increase the sensitivity of the variant calling in MinION data. iVar was run through the "freyja variants" command, so default values were used and a variant call format (VCF) file was generated. To calculate the relative abundances of the different variants, the "freyja demix" command was used and a TSV file was generated with the corresponding variant percentages [9]. Freyja was adapted to work only with the spike gene instead of the whole genome.

Per Position Error Rate (PPER) Calculation
The PPER was calculated using the following formula:
PPER = (No. of reads with sequencing errors)/(Total S gene reads)
If a single read contains several different sequencing errors, each of them adds to the PPER value.
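The formula above can be sketched in Python as follows. This is one possible reading of the definition, not the authors' code: the function name and the input layout (one read count per distinct sequencing error) are assumptions, and the numbers in the usage note are invented for illustration.

```python
from typing import Mapping, Tuple


def pper(error_read_counts: Mapping[Tuple[int, str], int],
         total_reads: int) -> float:
    """Per position error rate under the reading assumed here.

    error_read_counts maps each distinct sequencing error, keyed as
    (genome position, erroneous base), to the number of reads carrying it.
    A read with two different errors is counted once under each key, so
    both errors add to the numerator.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be a positive integer")
    return sum(error_read_counts.values()) / total_reads
```

For instance, two errors supported by 3 and 2 reads in a data set of 1000 S gene reads give `pper({(21600, "T"): 3, (21750, "A"): 2}, 1000)`, that is, 5/1000.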

VOC Abundance Estimation
Two different approaches were used to calculate the abundance of the VOCs present. When using LoFreq, VOC abundances were calculated as the mean percentage of the signature mutations corresponding to each VOC, applying the following criteria: (i) only mutations with a number of reads ≥100 were included; (ii) positions with coverage ≥100 where a mutation was expected but not detected were not considered; and (iii) only single-nucleotide variants (SNVs) were used for the calculations. In contrast, when using Freyja (v1.3.6), the relative abundance of the detected variants was automatically calculated and collected in the summarized.tsv file. Of note, as Freyja is based on the UShER [47] phylogenetic tree, deletions are not considered to assign variants.
Two duplex RT-qPCR assays were developed to determine the proportion of Delta- and Omicron BA.1-specific signature mutations: S:Del157/158 (22029_22034DelAGTTCA) and S:Ins214 (22121InsGAGCCAGAA), respectively, as previously performed for other specific mutations [35]. RT-qPCR assays were performed using 400 nM of each primer and 200 nM of each of the two probes targeting genomes with and without the specific signature mutation (Table 4). RT-qPCR mastermixes were prepared using the PrimeScript One-Step RT-PCR Kit (Takara Bio, San Jose, CA, USA), and the temperature program was 10 min at 50 °C, 3 min at 95 °C, and 45 cycles of 3 s at 95 °C and 30 s at 60 °C. Twist synthetic SARS-CoV-2 RNA controls (Control 2, Control 23, and Control 48) were used to prepare standard curves for genome quantification. The limit of detection (LOD) and limit of quantification (LOQ) were determined for each specific target by running a series of dilutions of the target with 4-10 replicates per dilution. Parameters of all standard curves and estimated LOD and LOQ for the four targets are summarized in Table S1. Percentages of SARS-CoV-2 genomes containing each specific signature mutation were calculated as previously described [35].
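The LoFreq-based averaging rule described at the start of this section can be sketched in Python as shown below. The sketch is only an interpretation of criteria (i) to (iii): the class and function names are hypothetical, "number of reads" is assumed to mean read depth at the mutated position, and undetected signature mutations are skipped rather than counted as 0%.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable, Optional


@dataclass
class SignatureCall:
    name: str                       # label of the signature mutation
    is_snv: bool                    # False for insertions and deletions
    depth: int                      # read depth at the position (assumed meaning)
    frequency_pct: Optional[float]  # None when the mutation was not called


def voc_abundance(calls: Iterable[SignatureCall],
                  min_depth: int = 100) -> Optional[float]:
    """Mean frequency (%) of the signature SNVs that pass the criteria.

    (i)   positions with depth < min_depth are dropped;
    (ii)  expected but undetected mutations are skipped, not set to 0;
    (iii) only SNVs contribute.
    Returns None when no signature mutation passes the filters.
    """
    kept = [c.frequency_pct for c in calls
            if c.is_snv and c.depth >= min_depth and c.frequency_pct is not None]
    return mean(kept) if kept else None
```

Skipping undetected positions, rather than averaging them in as zeros, keeps the estimate from being pulled down by amplicon dropout, at the cost of possibly overestimating variants for which only a few signature mutations are called.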

Conclusions
In summary, our study, which compared the performance of the MiSeq and MinION sequencing platforms coupled to different bioinformatic pipelines, reinforces the idea that Nanopore technology coupled with improved bioinformatic pipelines based on Freyja could serve as a fast tool to generate systematic information for tracking known SARS-CoV-2 variants. However, the Illumina technology should still be considered the gold standard for the identification and tracking of novel mutations and for confirming the circulation of minority variants.
Nevertheless, the recent improvements made by ONT, despite the need for polishing, offer promising perspectives.

Figure 2. Relative abundance estimations of specific VOCs detected in (A) 10 different mixtures of Twist synthetic RNA controls and (B) SARS-CoV-2-positive samples collected from 6 different wastewater treatment plants (WWTPs).


Author Contributions:
Conceptualization: R.M.P., S.G. and A.B.; Methodology: J.G., D.G.-C., A.C., D.G.-P., L.G. and A.C.-M.; Investigation: A.C. and D.G.-P.; Formal analysis: J.G., J.Q., M.I.C. and I.A.; Funding acquisition: R.M.P., S.G., A.B. and J.Q.; Supervising: M.I.C., S.G. and R.M.P.; Writing original draft: A.C., D.G.-P. and S.G.; Writing-review and editing: all authors. All authors have read and agreed to the published version of the manuscript.
Funding: This research was partially supported by the Catalan Agency for Water (ACA), the Catalan Public Health Agency (ASPCAT) from the Department of Health, the Health Innovation Program from the General Research Directorate (DGRIS) of the Generalitat de Catalunya, the COVID-19 wastewater surveillance project (VATar COVID19), funded by the Spanish Ministry for Ecological Transition and Demographic Challenge and the Spanish Ministry of Health, the Catalan Agency for the Management of Grants for Universities (AGAUR, Grant Number 2020PANDE00055), and the Spanish Ministry of Science and Innovation (PCI2021-121928), and the Project PI22/00258, funded by Instituto de Salud Carlos III (ISCIII) and co-funded by the European Union. A.C. holds a predoctoral fellowship FI_SDUR from Generalitat de Catalunya. The authors would like to thank the EU and the Spanish State Research Agency for funding project PCI2021-121928, in the frame of the collaborative international consortium SARA financed under the ERA-NET Aquatic Pollutants Joint Transnational Call (GA No. 869178). This ERA-NET is an integral part of the activities developed by the Water, Oceans and AMR Joint Programming Initiatives.
Informed Consent Statement: Not applicable.

Table 1. Accuracy of MiSeq and MinION platforms with LoFreq and iVar using Wuhan SARS-CoV-2 RNA Control.

Table 2. Comparison between observed and expected percentages of the VOCs for Twist mixtures and wastewater samples.

Table 3. Proportion of synthetic RNA controls included in each of the 10 RNA mixtures.

Table 4. Primers and probes used for Omicron BA.1 and Delta duplex RT-qPCR assays.