Best Molecular Tools to Investigate Coronavirus Diversity in Mammals: A Comparison

Coronaviruses (CoVs) are widespread and highly diversified in wildlife and domestic mammals and can emerge as zoonotic or epizootic pathogens and consequently host shift from these reservoirs, highlighting the importance of veterinary surveillance. All genera can be found in mammals, with α and β showing the highest frequency and diversification. The aims of this study were to review the literature for features of CoV surveillance in animals, to test widely used molecular protocols, and to identify the most effective one in terms of spectrum and sensitivity. We combined a literature review with analyses in silico and in vitro using viral strains and archive field samples. We found that most protocols defined as pan-coronavirus are strongly biased towards α- and β-CoVs and show medium-low sensitivity. The best results were observed using our new protocol, showing LoD 100 PFU/mL for SARS-CoV-2, 50 TCID50/mL for CaCoV, 0.39 TCID50/mL for BoCoV, and 9 ± 1 log2 ×10−5 HA for IBV. The protocol successfully confirmed the positivity for a broad range of CoVs in 30/30 field samples. Our study points out that pan-CoV surveillance in mammals could be strongly improved in sensitivity and spectrum, and proposes the application of a new RT-PCR assay, which is able to detect CoVs from all four genera, with optimal sensitivity for α-, β-, and γ-CoVs.


How Diagnostic Failure Can Affect Animal Surveillance
In recent times, coronaviruses (CoVs) have proved to be a major issue for both public and animal health. Indeed, their large genome size, mutation rate, and frequency of recombination seem to make these viruses more susceptible to cross-species transmission and to the subsequent adaptation to new hosts and ecological niches [1][2][3][4][5]. In addition, the shedding of these viruses via fecal and respiratory routes permits easier transmission both between and within host species compared to other agents that require contact with body fluids, such as Ebola, resulting in easier spillover and higher contagiousness.
Coronaviruses are enveloped, positive-stranded RNA viruses that infect mammals and birds. Currently, the subfamily Orthocoronavirinae includes four genera, namely Alpha-, Beta-, Gamma-, and Delta-coronavirus (α-, β-, γ-, δ-CoV). There are seven CoVs that are known to infect (or have infected) humans. Of these, three emerged through large epidemics climaxing in the ongoing pandemic of COVID-19, caused by the β-CoV severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [6][7][8]. Interestingly, human CoVs are all phylogenetically related to viruses found in livestock, especially bovines and camelids, and in wildlife, especially in bats [9][10][11][12][13][14]. This further underscores the need for surveillance in animals.
In this study, we critically reviewed the literature on CoV animal surveillance based on molecular tests, to identify the assays currently available for the identification of the four coronavirus genera. Metadata on the target animal species, sampling strategies, and sample matrices adopted were collected and systematically analyzed to select candidate assays for in silico and in vitro analysis, and ultimately to identify the most effective one in terms of analytical sensitivity, specificity, and applicability under field conditions. The selection process of the most suitable protocol for CoV surveillance is shown in Figure 1.

Literature Review
Google Scholar and PubMed were searched to retrieve a large dataset of scientific literature describing the surveillance of animals for coronaviruses using a broad-spectrum RT-PCR method, including both wildlife and domestic species. In total, 100 papers published between 2003 and 2021 were retrieved. As mentioned, pre-SARS papers (published before 2003) mostly concerned the investigation of specific CoVs infecting livestock, and no attention was paid to the spectrum of the protocols used. For all the papers, we recorded the target species, which were assigned to the following categories: "domestic animals", "bats", "rodents", "other wildlife", and "birds". In addition, we distinguished between "environmental sampling", "live sampling", "passive surveillance", and "active euthanasia of animals for diagnostic purposes". We also recorded which samples had been analyzed and their preservation method upon sampling (i.e., the use of lysis buffers or viral transport medium). Results are summarized in Table 1, showing different sampling efforts depending on the animal type, with 71% of publications reporting the screening of bats. The category of domestic animals was the second most frequent (13%), while fewer studies included the screening of birds (7%). Studies on bat CoVs have been published every year since 2005, while works investigating other animals are sporadic. Studies on domestic animals mostly target single species of interest, especially swine and dromedary camels; on the other hand, studies on wildlife, including bats, rodents, birds, and other mammals, mostly involved different species. Most studies rely on the live sampling of animals. Around 30% of studies also include the screening of carcasses obtained through passive surveillance or from animals euthanized for diagnostic purposes. In particular, 45% and 21% of studies regarding rodents and bats, respectively, adopted euthanasia.
For bats and rodents only, approximately 9% of studies were performed on environmental samples. Most studies preferred the use of gastrointestinal samples (either swabs or feces) for virological screening regardless of the target species. Respiratory samples (mostly oral swabs) were also analyzed in 41% of papers (ranging from 36% to 58% depending on the species), while 28% of the studies included the analysis of organs (ranging from 14% to 64% depending on the species). Nineteen papers reported that when both samples were used, only gastrointestinal matrixes provided positive results; one, referring to bovines, reported positive findings in the respiratory tract only, while two reported concordance between respiratory and gastrointestinal samples. Most studies reported the use of field stabilizers: most authors used different viral transport media, but RNAlater™ (Invitrogen, Massachusetts, USA) or lysis buffers were also employed.
Table 1 notes: N = number of papers. In the section "targeted animal species", it is indicated whether the study describes the targeted sampling of a single species or whether more than one species is tested due to opportunistic sampling (such as trapping or netting). * Acronyms for the sampling strategies refer to live sampling (L), environmental sampling (E), passive surveillance (P), and sacrifice of animals (S).
Among the 100 studies analyzed, 52 publications used broad-spectrum primers published in Woo et al./Poon et al. [34,35] (26%) or in De Souza Luna et al. [36] (26%). Other frequently used primers included the ones developed by Chu et al. [37] and Quan et al. [38], referenced in 8% and 6% of papers, respectively (Table 2). Another 48 protocols were each used in fewer than three studies (< 3%). Most of the protocols were successful in the amplification of CoVs belonging to the genera α-CoV and β-CoV, while the genera δ-CoV and γ-CoV were identified only using primers from Chu et al. [37] and a few other methods that were specifically designed for the surveillance of birds [30,39,40,41,42] (Table 2).

In Silico Evaluation
We selected seven pan-coronavirus protocols and aligned their primers to compare their nucleotide sequences and to map their position in the CoV genome. Primer sets selected for analyses in silico included the three most widely used protocols in surveillance studies [34][35][36][37], primers previously published by Chu et al. [43], as well as three updated protocols developed on the basis of recent CoV sequences [32,44,45]. Preliminary analyses showed that most primers map within the same regions of RdRp and show redundancy in their nucleotide sequences (Figure 2).
For in silico evaluation, we retrieved 69 sequences from the National Center for Biotechnology Information (NCBI) GenBank database, including the reference sequences of each species recognized by the International Committee on Taxonomy of Viruses (ICTV) [46] as well as additional strains of CoV species reported from different hosts. For each of the four genera, we built a nucleotide alignment using Clustal Omega implemented in Geneious Prime ® 2020.1.2 (Biomatters, Auckland, New Zealand) and we assessed the primer-template complementarity, which is crucial for specific amplification. All assays were analyzed for the number and the position of mismatches and tested using Geneious Prime ® 2020.1.2, allowing up to 4 mismatches in the binding region of each primer and no mismatches within the last 3 bp of the primer 3′ end, where mismatches are assumed to have a significantly larger impact on priming efficiency [47]. Primers that did not meet these criteria were underlined (≥ 5 mismatches) and/or marked (mismatches within 3 bp of the primer 3′ end) (Tables S1-S3). Many mismatches (often towards the 3′ end of the primers) were observed between several tested primers and δ- and γ-CoV sequences (Table 3 and Table S3). Interestingly, the primer sets designed by De Souza Luna et al. [36] had a generally low primer-template complementarity (up to 9 mismatches) despite their extensive use in the literature. On the other hand, in silico analyses revealed the best primer-template complementarity between CoV sequences from all four genera and the primer sets Hu-F2/Hu-R1 and Chu11-F1/Chu11-R1 [32,37]. Detailed results of the complementarity between primers and CoV sequences are presented in Tables S1-S3. Moreover, based on the in silico results, we set up a novel pan-CoV assay developed by a combination of existing primers from different studies [32,34,35,37]. The complete list of primers tested in silico is presented in Table 3.
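The acceptance rule described above (at most 4 mismatches in the binding region and none within the last 3 bases of the primer 3′ end) can be illustrated with a minimal Python sketch. This is not the Geneious Prime workflow used in the study; the function names and the example sequences are hypothetical, and the sketch assumes that the binding site has already been extracted from the alignment, is gap-free, and is written 5′→3′ in the same sense as the primer (for a reverse primer, the reverse complement of the genomic site). Degenerate primer positions are treated as matching any base they encode, following the standard IUPAC nucleotide codes.

```python
# IUPAC nucleotide codes mapped to the bases each code can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatch_positions(primer, site):
    """Return 0-based primer positions that do not match the aligned site.

    A position matches when the base sets encoded by the two symbols overlap,
    so a degenerate primer base (e.g. Y) matches any base it encodes (C or T).
    """
    if len(primer) != len(site):
        raise ValueError("primer and binding site must have equal length")
    positions = []
    for i, (p, s) in enumerate(zip(primer.upper(), site.upper())):
        if not set(IUPAC[p]) & set(IUPAC[s]):
            positions.append(i)
    return positions


def evaluate_primer(primer, site, max_mismatches=4, three_prime_window=3):
    """Apply the two criteria: total mismatches and 3' end mismatches."""
    mismatches = mismatch_positions(primer, site)
    # Positions falling inside the last `three_prime_window` bases of the primer.
    at_3prime = [i for i in mismatches if i >= len(primer) - three_prime_window]
    return {
        "mismatches": len(mismatches),
        "three_prime_mismatches": len(at_3prime),
        "passes": len(mismatches) <= max_mismatches and not at_3prime,
    }


# Hypothetical degenerate primer and binding sites (not taken from the study).
primer = "ATGGGWTAYCCNAARTGTGA"
print(evaluate_primer(primer, "ATGGGATATCCTAAGTGTGA"))  # perfect match: passes
print(evaluate_primer(primer, "ATGGGATATCCTAAGTGTCA"))  # 1 mismatch near the 3' end: fails
print(evaluate_primer(primer, "TACCCATATCCTAAGTGTGA"))  # 5 mismatches at the 5' end: fails
```

In this sketch a single mismatch near the 3′ end is enough to reject a primer-template pair, whereas up to four mismatches elsewhere are tolerated, which mirrors the weighting of the criteria stated in the text.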

In Vitro Evaluation
For in vitro comparison, we selected the two most promising primer sets, based on their low number of mismatches (i.e., Hu et al. and Chu et al. [32,37]), as well as the oligonucleotide set from De Souza Luna et al. [36] because of its wide application despite its unsatisfactory results in silico (shown in bold red in Table 3). In addition, we tested a new oligonucleotide combination using Hu-F2/Hu-R1 for the first round of one-step RT-PCR (668 bp), followed by nested PCR with the primers Poon-F and Chu06-R1 (440 bp), to increase assay sensitivity [32,35,43].
We tested all four selected protocols both as first-round amplification and as intended in the original studies in the case of nested approaches. We used the QIAGEN OneStep RT-PCR kit (QIAGEN, Hilden, Germany) for the first round of all RT-PCR assays, and the Platinum™ Taq DNA Polymerase (Invitrogen, MA, USA) for the second round of the assays, as indicated in the original studies [32,36]. Primer concentration and thermal cycles were adopted as indicated in the original studies [32,36]. Whenever exact protocols were not available, we followed the recommendations provided by the manufacturers in the manual of each amplification kit [37]. Further details are described in the Supplementary Materials Figure S1.
The analytical sensitivity of single assays was evaluated for three CoV genera using cell-adapted viral strains, namely one α-CoV (canine coronavirus, CaCoV), two β-CoVs (bovine coronavirus, BoCoV and SARS-CoV-2) and one γ-CoV (infectious bronchitis virus, IBV) (Table 4). Unfortunately, no isolates of δ-CoVs were available for this test. For each virus, we obtained 10-fold serial dilutions and extracted the RNA using the QIAamp ® Viral RNA Mini kit (QIAGEN, Hilden, Germany) according to the manufacturer's instructions. All analyses were run in triplicate for each dilution. For SARS-CoV-2 only, we compared the sensitivity of each method with the specific real-time RT-PCR targeting E gene widely used for the diagnosis of infection in humans and animals [48].
All protocols revealed unsatisfactory results in the first round, failing to detect SARS-CoV-2 and CaCoV even at high concentrations (1 × 10^4 PFU/mL and 5 × 10^4 TCID50/mL, respectively). Among the nested PCRs, the highest sensitivity for α- and β-CoVs was observed with the newly designed protocol, showing a limit of detection (LoD) of 100 PFU/mL SARS-CoV-2, 50 TCID50/mL CaCoV, and 0.39 TCID50/mL BoCoV. For IBV, the highest sensitivity (5th 10-fold dilution of 9 ± 1 log2 HA) was observed employing the primer set published by Chu et al. [37]. As expected, species-specific real-time RT-PCR was the most sensitive assay for SARS-CoV-2 detection. Further details are shown in Figure 3.
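As an illustration of how a limit of detection can be read from a 10-fold dilution series tested in triplicate, the following Python sketch may be useful. It rests on an assumption that is not stated in this form in the study: the LoD is taken as the lowest concentration at which all replicates are positive, provided that every higher concentration was also fully positive. The function names and the readout are hypothetical and do not reproduce the raw data of the experiments.

```python
def ten_fold_series(stock, steps):
    """Concentrations of a 10-fold serial dilution, starting from the stock."""
    return [stock / 10 ** i for i in range(steps)]


def limit_of_detection(results):
    """Return the LoD from a mapping {concentration: [replicate outcomes]}.

    Assumed rule (hypothetical, for illustration): walk from the highest to
    the lowest concentration and keep the last one for which every replicate
    is positive; stop at the first dilution with any negative replicate.
    Returns None when even the highest concentration is not fully positive.
    """
    lod = None
    for concentration in sorted(results, reverse=True):
        replicates = results[concentration]
        if replicates and all(replicates):
            lod = concentration
        else:
            break
    return lod


# Hypothetical triplicate readout for a stock of 1 x 10^4 units/mL.
concentrations = ten_fold_series(10_000, 5)
readout = {
    concentrations[0]: [True, True, True],
    concentrations[1]: [True, True, True],
    concentrations[2]: [True, True, True],
    concentrations[3]: [True, False, True],   # inconsistent detection
    concentrations[4]: [False, False, False],
}
print(limit_of_detection(readout))  # 100.0
```

Under this rule, a dilution that is detected in only some replicates does not lower the LoD, even though the target was occasionally amplified at that concentration.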

Field Evaluation
In order to test the performance of the new nested assay optimized in this study for animal surveillance, we analyzed 30 archived field samples representative of different field conditions and all four genera of coronavirus. In particular, we selected different samples originating from a wide variety of wild and domestic animals that were previously confirmed as positive for CoVs using different approaches, including other pan-coronavirus methods [36,37], species-specific PCRs, and NGS. We tested different kinds of sample matrixes among the ones that are mostly used in animal surveillance, including feces, anal swabs, saliva, salivary swabs, and organs (pools and intestines). Samples were collected up to 8 years ago and stored either dry or using different field stabilizers (Table S4). For RNA extraction, we used either the QIAamp ® Viral RNA Mini kit (QIAGEN, Hilden, Germany), NucleoSpin RNA Mini kit (MACHEREY-NAGEL, Düren, Germany), or MagMAX Pathogen™ RNA/DNA Nucleic Acid Kit (Applied Biosystems, MA, USA).
The new protocol was able to confirm the presence of CoV RNA in all the archived samples, including the different matrixes that are commonly used for CoV detection and regardless of the collection strategies and the age of samples (Table S4). Positive samples were confirmed through Sanger sequencing, which provided clean sequences of about 440 base pairs. Nucleotide sequences were analyzed through the BLAST online tool (Rockville Pike, USA). For primary identification purposes, we aligned the nucleotide sequences of all tested strains together with the reference sequences retrieved from GenBank and representative of the CoV diversity found in animals and built a maximum likelihood (ML) phylogenetic tree. Detected strains belonged to all CoV genera and were isolated from several hosts, including swab/salivary samples positive for SARS-CoV-2 showing Ct values up to 30.76 (Figure 4, Table S4). For all classified CoVs, this analysis allowed correct identification, confirming the specificity of the newly developed test. In addition, phylogenetic analysis of the sequences obtained was sufficiently informative to allow classification within known subgenera of all CoV strains that do not meet parameters for official classification (Figure 4) [49].
Figure 4. … [50]. ML phylogenetic trees were inferred using PhyML (version 3.0) implemented in Seaview (Lyon, France), employing the GTR+G4 substitution model, a heuristic SPR branch-swapping algorithm, and SH-like branch supports [51]; the trees obtained were edited online for graphical display using iTOL on server: https://itol.embl.de/ [52]. CoV species investigated in the study are indicated in red, with branches of tested strains shown as red boxes.

A Look into the Future of Coronavirus Surveillance
The ongoing pandemic of COVID-19 shows the dramatic consequences that emergent coronaviruses may have on a naïve population. Similar to what has been seen in humans, novel CoVs can infect livestock, causing epidemics that may evolve into large epizootics with severe economic consequences, as shown by the worldwide spread of porcine epidemic diarrhea virus (PEDV) [53]. In both humans and livestock, coronaviruses emerge after spilling over from a reservoir host that maintains coronaviruses in nature [4]. In this view, the screening of wildlife is providing increasing information about the large diversity of this viral family in a wide variety of animals, with the highest frequency and diversification in the order Chiroptera, namely in bats. However, we found that the literature is largely skewed towards the investigation of these animals, where SARS-like viruses were first found [54], and this may generate confounding data. Indeed, it is increasingly clear that we could find a similar diversity of CoVs in other animals, such as rodents and birds, if we searched more robustly [9,10,31,55,56]. In this context, the small number of CoV species described in rodents, which make up approximately 40% of all mammalian species, stands in sharp contrast with the large diversity of CoVs found in bats, and this might be explained by the lower sampling effort and the limited number of target species, as shown in our selected literature review.
Another interesting point that emerged from the analyzed literature that refers to pan-CoV surveillance is the fact that fewer than 41% of the studies tested the respiratory tract of animals, compared to almost 80% of papers describing the testing of feces and/or rectal swabs. This approach is due to the higher probability for feces to test positive [9]. However, we note that this consideration may differ in different species. Indeed, while CoVs in bats seem to have mostly a gastroenteric tropism, most human viruses and some of the CoV species described in companion and domestic animals are associated with the respiratory tract. Fortunately, several studies have confirmed that these viruses can also be found in feces if a molecular approach is applied, and they also encourage the use of this matrix in case of limited resources or if other sample types cannot be processed or collected [9]. However, it is likely that the parallel use of oral/nasal swabs would increase the sensitivity of unbiased surveillance.
Currently, all data suggest that alpha- and beta-coronaviruses have evolved in mammalian hosts, while birds are the evolutionary reservoir for gamma- and delta-coronaviruses [40]. However, new coronaviruses that have been recently described in mammals include a divergent gamma-coronavirus in a captive beluga whale (Delphinapterus leucas) [30] and new delta-coronaviruses in Asian leopard cats (Prionailurus bengalensis) and in Chinese ferret badgers (Melogale moschata) found in wet markets. In addition, PDCoV is rapidly emerging in the swine industry in both the USA and China, leading to a severe disease with consequent economic losses [57]. Overall, these data suggest that the circulation of delta- and gamma-coronaviruses in mammals might be underestimated [31]. Indeed, our results emphasize how most of the protocols widely used in mammals actually fail to detect γ- and δ-CoVs, suggesting that data might be confounded by methodological constraints, leading to a substantial under-sampling of mammals for these viruses [31]. In particular, most of the protocols reported as pan-coronavirus were not actually tested for γ- and δ-CoVs [36,38], and often included primers with low complementarity against their target regions and several mismatches located within their 3′ end. For γ-CoVs, these data were confirmed in vitro, with most protocols showing low sensitivity. While it was not possible to perform similar analyses for δ-CoVs, the consistency obtained between the analyses performed in silico and in vitro for the other three genera confirms how strongly primer-template complementarity influences the success of PCRs and suggests that analyses in silico can be used as a good proxy when isolates are unavailable for actual testing.
We observed the worst results using the primer sets published by De Souza Luna et al. [36], which showed a high number of mismatches with γ- and δ-CoVs and very low performance in the amplification of IBV (γ-CoV), failing to detect all three replicates even at the highest concentration. It is worth noting that this protocol is still one of the most widely used in the surveillance for CoVs despite being one of the first to be developed, using the alignment of the few viral sequences available at that time. We obtained similarly unsatisfactory results for the most widespread protocols, with the exception of the one designed by Chu et al. [37], which was actually intended for surveillance in birds. In addition, as our results demonstrate, the sensitivity of these protocols was low even for α- and β-CoVs.
Technically speaking, all protocols considered in this study included primers overlapping with each other in the same portions of the RNA-dependent RNA polymerase (RdRp) gene. This choice is related to the fact that this gene is highly conserved among different coronaviruses and provides sequences of sufficient length for phylogenetic studies and a preliminary classification [27]. In addition, the amplification of the same fragment allows an easier comparison among CoVs detected using different assays. However, the protocols analyzed in this study differ in the number and position of degenerate bases, thus affecting the range of CoV species with sufficient complementarity. For example, primers by De Souza Luna et al. [36] and by Poon et al. [35] contain no degenerate bases and show very low primer-template complementarity with CoVs divergent from the ones used for the development of the assay, especially for the reverse primer. Since the start of the COVID-19 pandemic, a few novel pan-CoV assays have been published [44,45], but the complementarity of their primer sets with most CoVs is only moderate. On the other hand, primers used by Chu et al. in the first round include relevant degenerate bases that improved the performance of this assay both in silico and in vitro [37]. A nested approach using non-degenerate primers for the second amplification round was implemented. Indeed, the use of nested or hemi-nested protocols is well established and adopted by most authors. This choice is well supported by our data, which show that all the protocols provide highly unsatisfactory results for all the analyzed viruses in the first round. On the other hand, we showed how nested PCRs significantly increased the sensitivity of all the assays, in some cases by up to five logarithms.
This is in agreement with results from another study, where the authors observed a difference of five to eight logarithms between the first and second step of the pan-CoV assay carried out on serial dilutions of SARS-CoV-2 and MERS-CoV [45]. Nested PCR may be omitted only in case of a high viral load in the samples, although field samples (especially swabs) are likely to have low concentrations of viral RNA. This was demonstrated in a study analyzing anal swabs of bats, where 389 copies of SARS-CoV RNA/mL [54] were detected. Similarly, in another study, samples of different origins (rectal, nasal, and ocular swabs) tested for the presence of BoCoV RNA showed concentrations varying from 8.0 × 10^8 to 2.2 × 10^1 RNA/µL [58]. Therefore, we suggest that using only the first step of RT-PCRs for field surveillance may easily lead to false negative results. We confirmed this hypothesis on field samples analyzed using our novel approach, which turned out positive only after the second step of amplification. Among the different protocols described in the literature, the one developed by Hu et al. is one of the few relying on a single step of amplification [32]. As expected, this assay showed the lowest sensitivity overall for all CoVs under investigation. Nevertheless, its degenerate primers showed an excellent complementarity with coronaviruses, revealing zero (FW: n = 62; REV: n = 41) or at most one (FW: n = 7; REV: n = 27) mismatch with most CoVs analyzed (n = 69), with the only exception being Beluga whale coronavirus, accounting for two mismatches only, one of which is located toward the 3′ end of the primer. This is consistent with our results from in vitro analyses, which showed a higher sensitivity of this protocol compared to the first rounds of all other protocols tested.
Thus, we decided to combine these primers with nested primers from Chu et al. with the aim of developing a new approach for the detection of CoV RNA from all four genera (α-, β-, γ-, and δ-CoV) while securing higher sensitivity compared to the published assays [37].
Such results confirmed that our new assay showed an increased sensitivity for all α- and β-CoVs in vitro, but in the case of IBV (γ-CoV), better sensitivity was observed with primers from Chu et al., whose aim was to identify CoVs in bird samples [37]. Notably, we were able to detect as little as 10 PFU/mL of SARS-CoV-2, even if 100 PFU/mL was established as the limit of detection because some of the replicates of the highest dilution were weak and sequencing was unsuccessful. Compared to the broadly used species-specific rRT-PCR targeting the E gene [48], our protocol resulted in the loss of approximately two logarithms and false negatives for dilutions showing 36 Ct through rRT-PCR. However, our data from field samples suggest this test could be sensitive enough for application in the field, as we were successful in confirming positivity for all the archived samples belonging to all four CoV genera. Indeed, our protocol successfully identified all positive field samples, which included different matrixes, hosts, and viruses, showing that its application in surveillance programs is promising for both wildlife and domestic animals. This evidence is crucial because the use of broad-spectrum approaches is frequently avoided in veterinary surveillance, especially in case of disease outbreaks, in favor of probe-based protocols. Indeed, our study demonstrated how unbiased surveillance of coronaviruses is still very low in domestic animals. This is mostly due to the fact that probe-based molecular methods are faster, more sensitive, and more scalable than broad-spectrum protocols, which often require a nested or hemi-nested approach to secure acceptable sensitivity. However, targeted analyses might confound diagnosis during epidemics, especially because coronaviruses often cause similar symptoms. In addition, we recently found that even epidemic viruses, such as PEDV, could circulate endemically in pig herds in the absence of symptoms [24].
This implies that specific PCRs might turn out to be positive even when the targeted pathogen is not responsible for the clinical disease, potentially delaying the detection of novel viruses, as seen during the emergence of SADS-CoV [22]. Of note, the case of MERS shows how the emergence of coronaviruses in the human population might be favored by a previous passage in domestic animals, which can both amplify or modify the viruses, increasing either infectivity, transmissibility, or pathogenicity for humans. In this context, broad-spectrum surveillance in livestock is of utmost importance not only in terms of animal health per se, but also in terms of public health and conservation, as it allows early detection of a spill-over event. The use of our pan-coronavirus approach might combine the chances of detecting known and unknown coronaviruses from all matrixes that are commonly used in animal surveillance with an acceptable sensitivity, thus overcoming costly and time-consuming approaches, such as metagenomics or the application of several targeted protocols. In addition, our method allows a preliminary classification of the infectious agent by sequencing and phylogenetic analysis, leading to the identification of circulating or emerging coronaviruses. As demonstrated by Chan et al., pan-CoV amplification can be combined with the MinION sequencer [59]. Therefore, further research should focus on the implementation of portable next-generation sequencers to provide rapid and cost-effective preliminary classification.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/v13101975/s1. Methodological details of in vitro evaluation of selected protocols. Figure S1: Analytical sensitivity of the novel assay developed in this study. Table S1: Complementarity of primers with Alphacoronavirus sequences. Table S2: Complementarity of primers with Betacoronavirus sequences. Table S3: Complementarity of primers with Gammacoronavirus and Deltacoronavirus sequences. Table S4: Field samples and viral strains used to test the inclusivity of the novel assay developed in this study.
Institutional Review Board Statement: The present study used mainly fecal samples collected from animals' enclosures or from the ground as environmental samples. This did not imply any handling and disturbance of animals. Intestines were collected from animals found dead and submitted for virological investigations. This procedure does not require any specific ethical approval and the sampling procedures were performed in compliance with the country's own legislation and the recommendations of international institutions. According to the national legislation regulating animal experimentation, no ethical approval or permit was required for collecting and processing the samples tested in this study, with the following exceptions. The capture and handling of bats (Miniopterus schreibersii) to collect oral swabs was disclosed by the Italian Ministry of the Environment, Land and Sea notwithstanding the DPR 357/97 following the opinion given by ISPRA (38025 dated August 13, 2020). Bat handling was performed only by trained personnel vaccinated against rabies. The handling of pigs to collect rectal swabs was disclosed by the Italian Ministry of Health (Decree 208/2021-PR dated April 16, 2021) and was performed in strict accordance with the relevant national and local animal welfare bodies (Legislative Decree 26/2014).
Cat and dog swabs were submitted to the IZSVe under the framework of SARS-CoV-2 surveillance in pets (Parere CE IZSVe 04/2021 dated May 14, 2021).
Informed Consent Statement: Not applicable.