Mutations in Animal SARS-CoV-2 Induce Mismatches with the Diagnostic PCR Assays

Recently, the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) was detected in several animal species. After transmission to animals, the virus accumulates mutations in its genome as adaptation to the new animal host progresses. Therefore, we investigated whether these mutations result in mismatches with the diagnostic PCR assays and suggested proper modifications to the oligo sequences accordingly. A comprehensive bioinformatic analysis was conducted using 28 diagnostic PCR assays and 793 publicly available SARS-CoV-2 genomes isolated from animals. Sixteen out of the investigated 28 PCR assays displayed at least one mismatch with their targets at the 0.5% threshold. Mismatches were detected in seven, two, one, and six assays targeting the ORF1ab, spike, envelope, and nucleocapsid genes, respectively. Several of these mismatches, such as the deletions and mismatches at the 3' end of the primer or probe, are expected to negatively affect the diagnostic PCR assays, resulting in false-negative results. The modifications to the oligo sequences should result in stronger template binding by the oligos, better sensitivity of the assays, and higher confidence in the results. It is necessary to monitor the targets of diagnostic PCR assays for any future mutations that may occur as the virus continues to evolve in animals.


Introduction
The global outbreak of coronavirus disease-2019 (COVID-19) caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) was first reported in Wuhan city, Hubei province, China in December 2019 [1,2]. The World Health Organization (WHO) declared it a public health emergency of international concern and then characterized it as a pandemic on 11 March 2020. The number of confirmed cases has risen dramatically; as of 16 March 2021, the virus had spread to 219 countries and territories with around 120 million confirmed cases and 2.6 million deaths. The pandemic spread of the virus is related to its transmission by symptomatic and asymptomatic carriers and to the presence of animal reservoirs [3,4].
The rapid diagnosis of SARS-CoV-2 infection is the cornerstone for policymakers to control the outbreak. The scheme of COVID-19 diagnosis depends on epidemiological history, laboratory diagnosis, virus isolation, serological identification, molecular confirmation, and radiological diagnosis [6]. SARS-CoV-2 nucleic acid detection is the main tool for diagnosing the infection and the most specific, sensitive, and rapid one [7]. Therefore, the WHO recommended the reverse transcription-quantitative polymerase chain reaction

Selection of Diagnostic PCR Assays
A total of 28 primer-probe set binding sites were investigated in the current study (Table 2). They included primer-probe sets from assays listed on the World Health Organization (WHO) website [15] and developed by the Chinese Center for Disease Control and Prevention (China CDC), China; the Centers for Disease Control and Prevention, Atlanta, GA, United States (US CDC); the Institute of Virology-Charité-Universitätsmedizin Berlin, Germany; the National Institute of Infectious Diseases (NIID), Japan; Institut Pasteur, Paris, France; The University of Hong Kong (HKU), Hong Kong; and the National Institute of Health of Thailand (THAI NIH), Thailand; in addition to several other assays developed by researchers [1,9-14,16-18].
The distribution of the 28 PCR assays along the SARS-CoV-2 genome was as follows: 10 in the ORF1ab gene, four in the S gene, three in the E gene, and 11 in the N gene (Figure 1). The assays were named in the current study according to the developing organization or researcher, following [19]. For example, CN-CDC-ORF1ab was developed by the Chinese Center for Disease Control and Prevention for the ORF1ab gene. Similarly, the Young-S assay was developed by Young et al. [18] for the S gene.

Multiple-Sequence Alignment
All animal sequences and the reference sequence were aligned using Multiple Sequence Comparison by Log-Expectation (MUSCLE) v3.8.31 [50]. The quality of the multiple sequence alignment (MSA) results was checked in AliView [51]. Edits to the alignment were manually introduced when necessary to obtain the best alignment. The MSA length was 29,903 (the same length as the reference genome), and the nucleotide positions in all genomes were called based on the positions in the reference genome. The MSA was exported in FASTA format.
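Because the alignment has the same length as the reference genome, an alignment column can be treated as a reference position. The following is a minimal sketch of how a binding site could be sliced out of the exported FASTA alignment under that assumption. The function names `parse_fasta` and `binding_site` and the toy sequences are illustrative only; they are not part of the study's workflow, which used AliView.

```python
from typing import Dict


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into {header: sequence} (upper-cased)."""
    records: Dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def binding_site(aligned: Dict[str, str], start: int, end: int) -> Dict[str, str]:
    """Slice columns start..end (1-based, inclusive) from every aligned sequence.

    Assumes the alignment length equals the reference length, so that
    alignment columns coincide with reference genome positions.
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("sequences are not aligned to a common length")
    return {name: seq[start - 1:end] for name, seq in aligned.items()}


# Toy alignment (hypothetical data): one reference and one isolate with a gap.
toy = ">ref\nACGTACGTAC\n>mink1\nACGTAC-TAC\n"
site = binding_site(parse_fasta(toy), 5, 8)
```

In the toy example, `site` maps `ref` to `ACGT` and `mink1` to `AC-T`, so the deletion in the isolate stays visible at the binding site.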

Identification of Nucleotide Changes at the Primer-Probe Binding Sites
In all the analyses, reverse primers were reverse complemented, and the mutations were investigated at the binding sites in the MSA. The same was performed for the probe designed by HKU for the N gene (HKU-N), as it was an antisense probe. Nucleotide variations at the primer/probe sequences or binding sites were investigated in AliView. Sequences with at least one ambiguous nucleotide (N) at any binding site were excluded for that binding site. The analysis results are reported in Table S2. To exclude sequencing errors and infrequent mutations, a threshold of 0.5% [19] was applied in reporting the nucleotide variation. With 793 genomes, variations present in fewer than four genomes were considered below the 0.5% threshold and are therefore not reported in the main Tables or Figures (they are reported only in Table S2). When the variations were above the threshold, the sequences at the binding site of a primer/probe were exported in FASTA format and stratified using the Sequence Tracer module of the Alignment Explorer (available online at http://entropy.szu.cz:8080/EntropyCalcWeb/sequences (accessed on 15 January 2021)). This module sorts the identical sequence variants into discrete groups and calculates their frequencies. The results of the Sequence Tracer module are presented in Figures 2-4 for the ORF1ab, S and E, and N genes, respectively.
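The steps in this section (reverse complementing, excluding sequences with ambiguous bases, grouping identical variants, and applying the 0.5% threshold) can be sketched as follows. This is a stand-in for illustration, not the Sequence Tracer module itself. The function names are hypothetical, and the choice to compute the cutoff from the total number of genomes rather than from the informative sequences is an assumption based on the statement that four genomes correspond to 0.5%.

```python
import math
from collections import Counter
from typing import List, Tuple

# IUPAC-aware complement table; gaps are kept as gaps.
_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN-", "TGCAYRSWMKVHDBN-")


def reverse_complement(oligo: str) -> str:
    """Return the reverse complement of an oligo (IUPAC codes supported)."""
    return oligo.upper().translate(_COMPLEMENT)[::-1]


def min_genomes(total: int, threshold: float = 0.005) -> int:
    """Smallest number of genomes that reaches the reporting threshold."""
    return math.ceil(total * threshold)


def stratify(sites: List[str], total_genomes: int,
             threshold: float = 0.005) -> List[Tuple[str, int, float]]:
    """Group identical binding-site variants and keep those above threshold.

    Sequences containing an ambiguous N are excluded first. Each result is
    (variant, count, percentage of informative sequences).
    """
    informative = [s.upper() for s in sites if "N" not in s.upper()]
    counts = Counter(informative)
    cutoff = min_genomes(total_genomes, threshold)
    return [(variant, n, 100.0 * n / len(informative))
            for variant, n in counts.most_common() if n >= cutoff]


# Toy data (hypothetical): 12 sites, one containing an ambiguous base.
toy_sites = ["ACGT"] * 6 + ["ACTT"] * 4 + ["ACGA"] + ["ANGT"]
groups = stratify(toy_sites, total_genomes=12, threshold=0.25)
```

With 793 genomes, `min_genomes(793)` evaluates to 4, which matches the cutoff of four genomes stated above. In the toy call, the variant seen once falls below the cutoff of three and is dropped.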

Results
A total of 793 SARS-CoV-2 genomes isolated from animal hosts (cat, dog, golden hamster, lion, mink, tiger, and mouse) were used in this study. Twelve out of the investigated 28 PCR assays displayed a perfect match with their targets at the determined threshold. Detailed information on assay names, countries, animal species, primer/probe sequences, positions, and the number of matching and mismatching nucleotides is available in Table S2.

Mismatches in Diagnostic PCR Assays Targeting the ORF1ab Gene
Out of the 10 assays targeting the ORF1ab gene, three showed a perfect match with animal isolates at the defined threshold: Pasteur-ORF1ab-2, Young-ORF1ab, and Won-ORF1ab. The NIID-JP-ORF1ab assay had mismatches for the two sequencing primers (forward and reverse). Mismatches for the forward sequencing primer occurred at a total frequency of 57.2%, including three nucleotide deletions (51.97%), one nucleotide mismatch (5.08%), and two nucleotide mismatches (0.13%). The reverse sequencing primer displayed a single mismatch with 0.51% of animal sequences, as shown in Figure 2A. The reverse primer of Yip-ORF1ab displayed a single-nucleotide substitution with 0.77% of analyzed sequences (Figure 2B). In Pasteur-ORF1ab-1, the forward primer and the probe displayed a perfect match with all the studied genomes (100%), while only 582 of 792 informative sequences (73.48%) had a perfect match with the reverse primer. The remaining 210 sequences exhibited two types of single mismatches (Figure 2C). In Corman-ORF1ab, probe2 displayed two nucleotide substitutions (C-R and A-M) with all sequences, while the reverse primer showed one mismatch (T-S) with all tested animal sequences (Figure 2D). The forward primer and probe1 of Corman-ORF1ab perfectly matched all the studied informative sequences. The CN-CDC-ORF1ab forward primer displayed a single mismatch with 0.5% of the sequences, as illustrated in Figure 2E. One mismatch (T-C) was also observed with all tested animal sequences for the reverse primer of the Chan-ORF1ab assay (Figure 2F). The HKU-ORF1ab probe showed a single mismatch with 1.51% of sequences (Figure 2G).

Mismatches in Diagnostic PCR Assays Targeting the S Gene
Out of the four investigated PCR assays for the S gene, Chan-S and Won-S perfectly matched the studied genomes at the 0.5% threshold. Mismatches were observed for the forward and reverse primers of the Young-S assay and the sequencing forward primer of the NIID-JP-S assay. The forward primer of the Young-S assay perfectly matched 374 sequences (47.40%), while mismatches occurred in 415 sequences (52.60%) due to a deletion of six nucleotides (TACATG). The reverse primer of Young-S showed one nucleotide mismatch with 1.27% of sequences, as shown in Figure 3A. The sequencing forward primer of the NIID-JP-S assay showed a perfect match with 99.49% of sequences and two types of single-nucleotide mismatches with 0.51% of animal sequences (Figure 3B).

Mismatches in Diagnostic PCR Assays Targeting the E Gene
Two out of the three tested PCR assays targeting the E gene perfectly matched the studied genomes at the defined threshold. The reverse primer of the Won-E assay exhibited a single-nucleotide substitution (A-T) with all tested viral sequences as shown in Figure 3C.

Mismatches in Diagnostic PCR Assays Targeting the N Gene
Out of the eleven investigated assays targeting the N gene, five (US-CDC-N-2, US-CDC-N-3, Corman-N, Won-N, and HKU-N) displayed a perfect match with the studied genomes at the determined threshold. The US-CDC-N-1 probe and reverse primer showed single-nucleotide mismatches with 0.89% and 12.12% of animal sequences, respectively, as demonstrated in Figure 4A. The reverse primer of the NIH-TH-N assay matched 697 sequences (88.01%) and mismatched 95 sequences (11.99%) (Figure 4B). One mismatch (C-G) was observed with all animal sequences for the Young-N probe (Figure 4C). The forward primer of the CN-CDC-N assay displayed three and four nucleotide mismatches with 56.82% and 1.65% of sequences, respectively (Figure 4D). In addition, the NIID-JP-N reverse primer showed a single-nucleotide mismatch (G-C) with all tested sequences (Figure 4E). The reverse primer of Chan-N showed two single mismatches with 1.64% of sequences, as observed in Figure 4F.

Suggested Modifications of Primer-Probe Sets
Based on the reported variations at the primer-probe binding sites, we suggested some adjustments to the primer-probe sequences using the International Union of Pure and Applied Chemistry (IUPAC) nucleotide codes (Table 3). These adjustments were performed for the mismatches above the threshold.
Table 3. Summary of mismatches and suggested modifications to the oligos targeting animal SARS-CoV-2. Modifications to the oligo sequences (blue underlined) were performed only for mutations above the 0.5% threshold (present in four or more of the total genomes, red underlined). No modifications were suggested for mutations below the threshold (red). Deletions in the oligo targets are represented by underlined dashes, and each dash corresponds to a nucleotide that has been deleted.
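The IUPAC-based adjustment described above can be sketched as a column-wise merge of the binding-site variants that are above the threshold: each position receives the degenerate code that covers every nucleotide observed there. The function names are hypothetical, and the sketch deliberately rejects gapped variants because the source does not specify how deletions were encoded in the modified oligos. The output is in the sense of the input variants, so a reverse primer would still need to be reverse complemented.

```python
from typing import Iterable, List

# Standard IUPAC nucleotide codes keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def degenerate_base(observed: Iterable[str]) -> str:
    """Return the IUPAC code covering all observed nucleotides."""
    bases = frozenset(b.upper() for b in observed)
    if bases not in IUPAC:
        raise ValueError(f"unsupported symbols at this position: {sorted(bases)}")
    return IUPAC[bases]


def degenerate_oligo(variants: List[str]) -> str:
    """Merge equal-length binding-site variants into one degenerate oligo."""
    if len({len(v) for v in variants}) != 1:
        raise ValueError("variants must have the same length")
    return "".join(degenerate_base(column) for column in zip(*variants))


# Toy variants (hypothetical): G/T at position 3 becomes K.
merged = degenerate_oligo(["ACGT", "ACTT"])
```

For the toy input, `merged` is `ACKT`. A design choice worth noting is that every added degenerate position widens what the oligo can bind, so only variants above the reporting threshold would be passed in.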

Discussion
Our study aimed to evaluate the currently available diagnostic PCR primers and probes, either recommended by the WHO or published in the latest literature, for the detection of SARS-CoV-2 in animal hosts. We identified potential mutations at the primer/probe binding sites in SARS-CoV-2 isolated from animals and suggested several modifications to the primer and probe sequences to perfectly match their targets. A perfect match between PCR oligos and their targets will increase the confidence in the results and help veterinarians, technicians, laboratory professionals, clinicians, and policymakers control the disease in animals and humans. To this end, 28 diagnostic PCR assays were evaluated in silico using 793 SARS-CoV-2 genomes isolated from animal hosts (cat, dog, golden hamster, lion, mink, tiger, and mouse). To prevent any bias in methodology, several points were considered.
(1) All animal SARS-CoV-2 genomes available from the GISAID and NCBI databases from various geographical regions (Asia, Europe, North America, and South America) were selected for reassessment of the assays.
(2) The MSA length was 29,903, which is the same length as the reference genome, and the short sequences were not included in our analysis.
(3) Sequences with at least one ambiguous nucleotide (N) at any binding site were omitted.
(4) In the reporting of nucleotide variation, a threshold of 0.5% was applied to remove sequencing errors and infrequent mutations.
(5) The Sequence Tracer module allowed incomplete or short sequences to be filtered out, identical sequence variants to be sorted into different classes, and their frequencies to be determined.
In this study, sixteen out of the investigated 28 PCR assays displayed at least one mismatch with their templates. This number is higher than that obtained by Khan and Cheung [19], who reported mismatches in seven out of 27 assays. This result may be due to the ongoing adaptation of SARS-CoV-2 in animal hosts, resulting in higher variation in animal isolates compared with human isolates [26]. These variations highlight the need for frequent evaluation of currently available diagnostic PCR assays to successfully control the SARS-CoV-2 pandemic. On the other hand, twelve out of the 28 PCR assays showed a perfect match with their targets at the determined threshold. These findings may be explained by the lower mutation rates in coronaviruses compared with other RNA viruses due to the RNA proofreading activity of nsp14-exoribonuclease [25,52]. In the case of SARS-CoV-2, the virus acquires two mutations in its genome per month, with an estimated evolutionary rate of 1.15 × 10⁻³ substitutions/site/year [53,54].
Several mismatches with the investigated PCR assays were reported. These mismatches do not necessarily produce false-negative results, as the effect of a mismatch varies according to the number, position, and target (probe, forward, or reverse primer). The negative effect of a single-nucleotide mismatch on target annealing is lower than that of deletions or multiple-nucleotide mismatches. Mismatches near the 3' end can affect the target's amplification and detection, while a single mismatch located near the 5' end or more than five bases from the 3' end can affect only the first few PCR cycles with no noticeable impact on the amplification process [55][56][57]. Single mismatches in the reverse or forward primers may not have a significant impact on target detection. However, a single mismatch in the probe may result in a false-negative, as it prevents probe binding and fluorescence emission [32,36,58,59].
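The positional reasoning above can be illustrated with a small helper that lists mismatches between an oligo and its target and flags those close to the 3' end. This is only a sketch: the names are hypothetical, the five-base window is taken from the "more than five bases from the 3' end" statement and treated as an adjustable assumption, and both sequences are assumed to be given 5' to 3' in the same sense and with the same length (a reverse primer would be compared with the reverse complement of its binding site).

```python
from typing import List, NamedTuple


class Mismatch(NamedTuple):
    position_from_5p: int  # 1-based position counted from the 5' end
    position_from_3p: int  # 1-based position counted from the 3' end
    oligo_base: str
    target_base: str
    near_3p: bool          # True if within `window` bases of the 3' end


def find_mismatches(oligo: str, target: str, window: int = 5) -> List[Mismatch]:
    """List positions where oligo and target differ, flagging 3'-proximal ones."""
    if len(oligo) != len(target):
        raise ValueError("oligo and target must have the same length")
    n = len(oligo)
    found = []
    for i, (o, t) in enumerate(zip(oligo.upper(), target.upper())):
        if o != t:
            from_3p = n - i
            found.append(Mismatch(i + 1, from_3p, o, t, from_3p <= window))
    return found


# Toy oligo/target pairs (hypothetical sequences).
terminal = find_mismatches("ACGTACGTAC", "ACGTACGTAT")  # last base differs
distal = find_mismatches("ACGTACGTAC", "TCGTACGTAC")    # first base differs
```

In the toy pairs, the mismatch in `terminal` sits on the 3'-terminal base and is flagged, whereas the one in `distal` lies ten bases from the 3' end and is not. The helper compares bases literally, so degenerate IUPAC codes in the oligo would be reported as mismatches.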
In our study, (1) single-nucleotide mismatches were reported near the 3' end in the NIID-JP-ORF1ab sequencing forward primer, Pasteur-ORF1ab-1 reverse primer, Chan-ORF1ab reverse primer, NIID-JP-S sequencing forward primer, and US-CDC-N-1 reverse primer; (2) fatal deletions were detected in two assays: the NIID-JP-ORF1ab sequencing forward primer and the Young-S forward primer; (3) multiple-nucleotide mismatches were observed in the NIID-JP-ORF1ab sequencing forward primer, Corman-ORF1ab probe2, and CN-CDC-N forward primer; (4) mismatches in probes that may result in false-negatives were detected in four assays: Corman-ORF1ab, HKU-ORF1ab, US-CDC-N-1, and Young-N; and (5) mismatches with all animal sequences were observed in Corman-ORF1ab probe2, Corman-ORF1ab reverse primer, Chan-ORF1ab reverse primer, Won-E reverse primer, Young-N probe, and NIID-JP-N reverse primer. Shirato and colleagues later updated the NIID-JP-N reverse primer to correct this mismatch in another report [14]. Mismatches in the Corman-ORF1ab probe2 were introduced by the authors so that probe2 detects SARS-CoV-2, SARS-CoV, and bat-SARS-related CoVs [12]. Amplification might not be influenced by a single mismatch near the 5' end; however, correction of such mismatches would ensure stronger template binding, better sensitivity, and higher confidence in the results. Therefore, we suggested several modifications to the oligos that did not perfectly match SARS-CoV-2 genomes from animals (Table 3). However, the proposed modifications may require experimental testing using COVID-19 confirmed clinical samples, considering the low sensitivity of certain diagnostic PCR assays in some cases [60,61].
Three (Pasteur-ORF1ab-2, Young-ORF1ab, and Won-ORF1ab) of the 10 assays targeting the ORF1ab gene showed a perfect match with animal isolates at the specified threshold. These findings are in agreement with Khan and Cheung [19], who used 17,175 human SARS-CoV-2 sequences to test the three assays. Seven of the nine assays targeting the ORF1ab gene showed a perfect match with human SARS-CoV-2 isolates in the study conducted by Khan and Cheung [19], and only two (Chan-ORF1ab probe and Charite-ORF1b reverse primer) showed a mismatch at the same threshold. Compared to the previous study [19], the higher number of assays with mismatches in our study (seven) may be attributable to the mutations reported in ORF1ab of SARS-CoV-2 animal genomes [26]. Positive selection has also been demonstrated for specific residues of the non-structural proteins of ORF1ab and the accessory proteins ORF3a and ORF8. These sites of the SARS-CoV-2 genome may be significant in generating variants adapted to humans or animals. Such findings can affect the production of diagnostic tests, therapeutics, and preventive instruments, such as vaccines and antivirals [54].
In our study, we reported mismatches in one (Won-E) of the three assays targeting the E gene compared to none reported by Khan and Cheung [19]. The N and E genes encode essential coronavirus capsid structural proteins, while other proteins regulate a range of molecular processes during viral replication [62]. The E gene is highly conserved with no mutations [26]. The single-nucleotide mismatch observed here in the Won-E reverse primer is likely due to the primer design, not the evolution of animal SARS-CoV-2 at this site, because this mismatch is present in all the studied genomes including the reference sequence (Wuhan-Hu-1). The N gene may be under positive selective pressure, as it is accumulating a significant number of mutations in human and animal isolates [26,63].
At the 0.5% threshold, two of the four investigated PCR assays targeting the S gene (Young-S forward and reverse primers, and NIID-JP-S sequencing forward primer) displayed mismatches with the studied genomes. In contrast, Khan and Cheung [19] did not find any nucleotide mismatches in the assays targeting the S gene. The SARS-CoV-2 spike protein plays a major role in host cell receptor attachment, neutralizing antibody production, and host tropism [2]. SARS-CoV-2, like SARS-CoV and HCoV-NL63, uses the angiotensin-converting enzyme 2 (ACE2) receptor for host cell entry [2,64,65]. As the virus infects animals and evolves during the outbreak, nucleotide substitutions may emerge in the primer/probe binding regions, including the S gene [54,66,67]. The current SARS-CoV-2 genomes were isolated from animals whose ACE2 receptors differ considerably from those of humans. Therefore, adaptation of the virus to animals will likely differ from its adaptation to humans, resulting in the accumulation of different mutations in the S gene due to the differences in ACE2 [26,68,69]. The S gene was reported to be under persistent positive selection [66], which may result in additional mutations accumulating in the S gene in the future.

Conclusions
We evaluated 28 diagnostic PCR assays that were initially developed to detect SARS-CoV-2 in humans, for the detection of SARS-CoV-2 in animals. Sixteen out of the investigated 28 PCR assays displayed at least one mismatch with their targets at the 0.5% threshold. These mismatches were attributed to the continuous evolution occurring in SARS-CoV-2 in animals. Several of these mismatches are expected to negatively affect the diagnostic PCR assays. Therefore, we suggested some modifications to the oligo sequences accordingly. These suggestions should result in stronger template binding by the oligos, better sensitivity of the assays, and higher confidence in the results. As the virus continues to evolve in animals and accumulates mutations in its genome, it is crucial to frequently monitor the effects of these mutations on the diagnostic PCR assays and modify them accordingly. This should reduce the probability of false-negative results and help control the COVID-19 pandemic in animals and humans.
Supplementary Materials: The following are available online at https://www.mdpi.com/2076-0817/10/3/371/s1, Table S1: Information on SARS-CoV-2 genomes used in the current study including the virus isolate, accession number, host, geographic region or country, genome length, collection date, database from which they were downloaded, and the percentage of ambiguous bases (%N), Table S2: Results of the bioinformatic analysis of 28 diagnostic PCR targets using 793 animal SARS-CoV-2 genomes.