Evaluation of “Caterina assay”: An Alternative Tool to the Commercialized Kits Used for Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) Identification

Here we describe the first molecular test developed in the early stage of the pandemic to diagnose the first cases of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection in Sardinian patients in February–March 2020, when a certified diagnostic methodology had not yet been adopted by clinical microbiology laboratories. The “Caterina assay” is a SYBR® Green real-time reverse-transcription polymerase chain reaction (rRT-PCR) designed to detect the nucleocapsid phosphoprotein (N) gene, whose RNA sequence shows highly discriminative variation among bat and human coronaviruses. The method was applied to detect SARS-CoV-2 in nasal swabs collected from 2110 suspected cases. The assay showed high specificity and sensitivity (with a detection limit of 50 viral genomes/μL). No false positives were detected, as confirmed by comparison with two certified commercial kits. Although other validated molecular methods are currently in use, the Caterina assay still represents a valid and low-cost detection procedure that could be applied in countries with limited economic resources.


Introduction
The current pandemic of coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1], poses serious problems in different fields of medical practice [2,3]. COVID-19 respiratory infection may vary from very mild to severe symptoms, even death, and presents clinical features that make it difficult to distinguish from other viral respiratory diseases [4,5]. SARS-CoV-2 has a single-stranded RNA genome (~30 kb) and is a zoonotic agent that likely originated from bats. It is genetically related to other human beta-coronaviruses, SARS-CoV and Middle East respiratory syndrome-related coronavirus (MERS-CoV), albeit with substantial differences in virulence and pathogenicity profile [6].
The mFold program (http://www.unafold.org/DNA_form.php accessed on 20 January 2020) was used to predict D-loops within the N gene target, as already published [19,20] (Figure 2). Briefly, for the selected target sequence, the folding conditions were Na+ 0.05 mol/L and Mg++ 0.002 mol/L. The hybridization temperature corresponded to the Sanger PCR annealing temperature (Ta), as indicated in Figure 2.

RNA Extraction
RNA was extracted using a QIAamp® Viral RNA Mini Kit (QIAGEN GmbH, Hilden, Germany) following the manufacturer's instructions. The same extraction procedure was used for the SARS-CoV-2 cell suspension (positive control) and for the clinical samples (swabs) processed in routine work. Before PCR analysis, all extracts were analyzed for total RNA concentration and purity using a NanoDrop™ OneC Spectrophotometer (Fisher Scientific Pte Ltd, 8 Pandan Crescent LL4, Singapore).

RNA Extraction Internal Controls
A set of oligonucleotides, designed on the human β-actin gene (GenBank accession N. M10277), was used as an internal control to assess the correct RNA extraction procedure. The oligos β-forward 5′-GGCGTGATGGTGGGC-3′ and β-reverse 5′-GTCATCTTCTCGCGGTTG-3′ were used in the real-time PCR reaction under the same conditions as the SARS-CoV-2 PCR, and the amplicon length was 236 bp.

Real-Time rt-PCR Quantitative Conditions
The SYBR® Green real-time PCR reaction was carried out with TaqPath™   RNA extract and 7 µL of DNase/RNase-free water. Additionally, heat-labile uracil-DNA glycosylase (UDG, Roche Molecular Biochemicals) and 2 µL of RNA extract were used. The PCR profile, run on a CFX 96 apparatus (Bio-Rad Laboratories, USA), was as follows: (i) an initial uracil-DNA glycosylase (UNG) incubation at 25 °C for 2 min, (ii) RT incubation at 50 °C for 15 min, (iii) 40 cycles of 3 s at 95 °C, 30 s at 60 °C and 2 s at 83 °C, and (iv) a final melting curve from 50 to 95 °C with a transition rate of 5 °C/s. Fluorescence was detected at the end of the 83 °C segment (avoiding nonspecific fluorescence due to primer dimers) and in continuous mode during the melting curve. To ensure the absence of potential secondary PCR products, amplicons were analyzed by 0.8% agarose gel electrophoresis.
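As a rough cross-check of the thermal profile described above, the hold steps can be written down as data and summed. This is only a sketch: the step labels are descriptive guesses, and the result ignores ramp times and the final melting curve, so the real run time on the instrument will be longer.

```python
# Hold steps of the profile as (label, temperature in °C, duration in s, repeats).
# Temperatures, durations and cycle count come from the text; the labels are
# assumptions added for readability.
PROFILE = [
    ("UNG incubation", 25, 120, 1),
    ("reverse transcription", 50, 900, 1),
    ("denaturation", 95, 3, 40),
    ("annealing/extension", 60, 30, 40),
    ("fluorescence read", 83, 2, 40),
]


def total_hold_seconds(profile):
    """Sum of all programmed hold times, excluding ramps and the melt curve."""
    return sum(duration * repeats for _, _, duration, repeats in profile)


if __name__ == "__main__":
    seconds = total_hold_seconds(PROFILE)
    print(f"Programmed hold time: {seconds} s ({seconds / 60:.1f} min)")
```

Keeping the profile as a list of tuples makes it easy to compare against the protocol actually programmed on the thermocycler.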

cDNA Sequencing
To confirm that the obtained PCR product corresponded to the expected N gene region of SARS-CoV-2, a sequencing reaction (Sanger reaction) was performed using the BigDye chemistry kit (Perkin-Elmer Applied Biosystems, Foster City, CA, USA) and the same oligos used in the rRT-PCR as sequencing primers. The results were edited and evaluated with the Chromas chromatogram file editor (Technelysium, Queensland, Australia) and analyzed with the Basic Local Alignment Search Tool program (Blastn, http://www.ncbi.nlm.nih.gov/BLAST accessed on 20 March 2020).

Clinical Samples
A total of 2110 nasopharyngeal swabs were tested from patients recorded as suspected cases (with clinical signs of COVID-19) or contacts (close contact with an individual with COVID-19). The tested group comprised 58% women (aged 20–99, median 59) and 42% men (aged 20–80, median 55). All samples were collected from January to March 2020. Nasal swab sampling was performed using FLOQSwabs® (Copan Italia S.p.A., Brescia, Italy); swabs were collected into a tube containing MEM transport medium and stored at 4 °C until RNA extraction, which was performed on the same day as sampling. In each run, a no-template control, a negative extraction control and a SARS-CoV-2-positive control were included as previously described. Samples were considered negative when the SARS-CoV-2 target had a cycle threshold (Ct) >40.

Statistical Analysis
The experiment was executed in triplicate; the standard deviation (SD) of the threshold cycle (Ct) for each sample was within ±0.8 Ct. The correlation coefficient (R²) for the standard curve ranged from 0.95 to 0.97, and the PCR efficiency ranged between 97% and 98%. The accuracy of this analytical procedure was measured by Cohen's kappa coefficient (k) [18] using the "Kappa as a Measure of Concordance in Categorical Sorting/VassarStats" program, available online at http://vassarstats.net/kappa.html (accessed on 21 December 2020), with 95% confidence intervals.
AY463059. Among them, the N gene of SARS-CoV-2 (28274-29533 nt) showed the most discriminative region between the aligned sequences (data not shown). In detail, a multiple alignment of the N1 gene revealed high sequence variability between the analyzed beta-coronaviruses (Figure 1B). To avoid steric hindrance between the PCR primers and the gene target, an accurate cDNA fold evaluation was performed with the mFold web program as already published [19] (Figure 2). The selected oligos were designed in a region without a displacement loop (D-loop) and characterized by a low Gibbs free energy (∆G°) value [21] (Figure 2).

SYBR ® Green rRT-PCR Melting Curve Analysis
The melting curve analysis of the N gene amplicons revealed two different melting peaks spaced by approximately 10 °C. The first peak, with a Tm of 77 ± 1 °C, represented possible primer-dimers (around 40 bp) that were detectable in the negative control/sample. The second peak, with a Tm of 87 ± 1 °C, corresponded to the SARS-CoV-2 positive sample (451 bp). The same result was verified by agarose gel electrophoresis, in which the two fluorescence peaks corresponded to two PCR bands of different sizes (Figure 3A). rRT-PCR melting curve analysis was also performed for the human β-actin gene, which was used as the molecular target control for RNA extraction (Figure 4). A positive reaction was confirmed by a melting peak with a Tm of 85 °C and an rRT-PCR cycle threshold (Ct) limit ≤31.
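The melting-peak interpretation above can be sketched as a small classifier. The Tm windows (77 ± 1 °C and 87 ± 1 °C) come from the text; the function names, the "unclassified" label and the way the Ct cut-off of 40 from the Clinical Samples section is combined with the peak check are assumptions made for illustration, not the authors' published decision rule.

```python
from typing import Iterable, Optional

# Tm windows reported in the text.
PRIMER_DIMER_TM = 77.0   # °C, primer-dimers (~40 bp)
N_GENE_TM = 87.0         # °C, specific N gene amplicon (451 bp)
TOLERANCE = 1.0          # °C
CT_CUTOFF = 40.0         # samples with Ct > 40 are considered negative


def classify_melt_peak(tm_celsius: float) -> str:
    """Assign a melting peak to one of the two windows described in the text."""
    if abs(tm_celsius - N_GENE_TM) <= TOLERANCE:
        return "N gene amplicon"
    if abs(tm_celsius - PRIMER_DIMER_TM) <= TOLERANCE:
        return "primer-dimer"
    return "unclassified"


def call_sample(ct: Optional[float], peak_tms: Iterable[float]) -> str:
    """Hypothetical call: positive needs Ct <= 40 and a specific melting peak."""
    has_specific_peak = any(
        classify_melt_peak(tm) == "N gene amplicon" for tm in peak_tms
    )
    if ct is not None and ct <= CT_CUTOFF and has_specific_peak:
        return "positive"
    return "negative"


if __name__ == "__main__":
    print(classify_melt_peak(87.3))
    print(call_sample(35.75, [77.2, 86.8]))
    print(call_sample(None, [76.9]))
```

Requiring both a valid Ct and a specific peak is a conservative choice for an intercalating dye, since SYBR Green fluorescence alone cannot tell primer-dimers from the target amplicon.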

Sensitivity of the Method
Inactivated Opitrol NAT SARS-CoV-2 was used to evaluate the sensitivity of the method. Opitrol NAT SARS-CoV-2 was provided with a viral titer of approximately 1 × 10^7, corresponding to 1 × 10^7 viral genomes/mL. The calibration curve of genomic copy number versus Ct value was obtained from 10-fold serial dilutions of the viral suspension, ranging from 1 × 10^6 to 50 viral genomes/μL extract (with Ct = 35.75), and served as a standard quantification curve to evaluate the sensitivity of the rRT-PCR reaction.
A wide linear range of quantitation was obtained (from 10^6 to 10^2 viral genomes/μL, 7 ten-fold dilutions). The standard curves showed a regression coefficient R² = 0.99 (Figure 3B).
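A standard curve of this kind is usually evaluated by regressing Ct on log10 of the copy number and deriving the amplification efficiency from the slope, E = 10^(-1/slope) - 1. The sketch below shows that calculation in plain Python. The dilution series and Ct values are illustrative placeholders generated from an ideal slope, not measurements from this study.

```python
import math


def fit_standard_curve(copies_per_ul, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept.

    Returns (slope, intercept, r_squared, efficiency), where efficiency is
    10 ** (-1 / slope) - 1 (1.0 means perfect doubling per cycle).
    """
    xs = [math.log10(c) for c in copies_per_ul]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ct_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, intercept, r_squared, efficiency


# Hypothetical example data (NOT from the paper): Ct = 40 - 3.32 * log10(copies).
example_copies = [1e6, 1e5, 1e4, 1e3, 1e2]
example_ct = [20.08, 23.40, 26.72, 30.04, 33.36]

if __name__ == "__main__":
    slope, intercept, r2, eff = fit_standard_curve(example_copies, example_ct)
    print(f"slope={slope:.2f} intercept={intercept:.2f} "
          f"R2={r2:.3f} efficiency={eff:.1%}")
```

A slope near -3.32 corresponds to roughly 100% efficiency, which is why the slope is often reported alongside R².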

SARS-CoV-2 N Protein Identification by Sequencing
To confirm the specificity of the "Caterina assay" as a diagnostic tool for SARS-CoV-2 detection, 10 qPCR amplicons from clinical specimens were sequenced by the Sanger method [19]. Both rRT-PCR primers were used in the Sanger sequencing reaction to generate a sequence of 406 bp (Figure 5). The nucleotide sequence showed 100% identity with the N gene of the reference genome SARS-CoV-2 Wuhan-Hu-1, as previously evaluated by in silico analysis. In addition, we verified by sequencing the amplicons of 50 samples (from February to April 2020). The first sequence of the N1 gene was deposited in the GenBank nucleotide database under accession MT187977.1.

Detection of SARS-CoV-2 on Clinical Samples from a Cohort of Sardinian Patients
The recruitment of clinical samples was organized within a COVID-19 monitoring program established by the Sardinian regional government through the "Azienda Tutela della Salute" (ATS).
In total, 2110 oropharyngeal swabs were collected from public hospital patients with suspected COVID-19 infection, comprising 1230 females and 880 males. Swabs were collected over a period of 60 days starting from 3 February 2020. The clinical samples came from patients suspected on the basis of clinical signs (n = 648) and from asymptomatic patients (n = 1462). Among the 2110 samples analyzed with the "Caterina assay", 181 samples were SARS-CoV-2 positive (8.5%) and 1900 negative (91.5%). Fifty positive samples were further sequenced to confirm the expected sequence of the SARS-CoV-2 N1 gene (data not shown). No mutation was detected in the sequenced qPCR amplicons, confirming the relative stability of the N1 gene of the SARS-CoV-2 strains isolated from Sardinian patients. Figure 6 shows the total number of swabs analyzed and the corresponding positive samples per day observed during the initial pandemic period in Sardinia. The first positive sample was detected on 29 February 2020 (30 days after the "Caterina assay" was set up). Subsequently, the number of positive samples increased over two months, more as a consequence of the increase in the number of swabs per day than of the number of cases itself, with a maximum number of positive swabs (n = 63) detected. The percentage of positive samples was 9.1% in females and 7.9% in males.
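The overall positivity figure can be recomputed directly from the counts given above (181 positives among 2110 swabs). The sketch below does this and adds a Wilson 95% confidence interval. The interval is an addition for illustration and is not reported in the study. Computed without rounding, 181/2110 is about 8.6%, which the text reports as 8.5%.

```python
import math


def positivity_rate(positives: int, tested: int) -> float:
    """Percentage of tested samples that were positive."""
    return 100.0 * positives / tested


def wilson_interval(positives: int, tested: int, z: float = 1.96):
    """Wilson score interval for a proportion, returned in percent."""
    p = positives / tested
    denom = 1.0 + z ** 2 / tested
    centre = (p + z ** 2 / (2 * tested)) / denom
    half = z * math.sqrt(
        p * (1 - p) / tested + z ** 2 / (4 * tested ** 2)
    ) / denom
    return 100.0 * (centre - half), 100.0 * (centre + half)


if __name__ == "__main__":
    rate = positivity_rate(181, 2110)
    low, high = wilson_interval(181, 2110)
    print(f"Positivity: {rate:.2f}% (95% CI {low:.2f}-{high:.2f}%)")
```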

PCR Validation Method: Comparison of Commercial Kit and the Homemade Procedure
After the first months of the epidemic, as soon as validated commercial kits were available, SARS-CoV-2 diagnosis was carried out using the Caterina assay in conjunction with two commercial kits based on Taq There were 110 samples recruited from hospitalized patients with a clinical and laboratory diagnosis of COVID-19, and 100 samples from asymptomatic subjects, which were used to calculate the concordance between molecular tests. Inter-rater reliability was calculated by Cohen's kappa coefficient using the GraphPad online calculator (https://www.graphpad.com/quickcalcs/kappa1/ accessed on 5 January 2021). The Caterina assay showed optimal inter-rater reliability with both diagnostic tests, which were used as the gold standard in this experiment. The K coefficient ranged from 0.97 (95% CI, 0.939-1.0) to 0.99 (95% CI, 0.972-1.0) for the COVID-19 Genesig Real-Time PCR assay™ and the DA An Gene (2019-nCoV) RNA™, respectively (Table 1).
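Cohen's kappa for two binary tests can be computed from a 2×2 agreement table, as the online calculators cited above do. The sketch below shows the formula κ = (p_o - p_e) / (1 - p_e). The counts in the example are toy numbers chosen for easy arithmetic and are not the Table 1 data, which are not reproduced in the text.

```python
def cohen_kappa(both_pos: int, a_pos_b_neg: int,
                a_neg_b_pos: int, both_neg: int) -> float:
    """Cohen's kappa for two raters (tests A and B) with binary outcomes.

    Undefined (division by zero) when the expected agreement equals 1,
    i.e. when both tests give the same single result for every sample.
    """
    n = both_pos + a_pos_b_neg + a_neg_b_pos + both_neg
    observed = (both_pos + both_neg) / n           # p_o
    a_pos = (both_pos + a_pos_b_neg) / n           # share of positives, test A
    b_pos = (both_pos + a_neg_b_pos) / n           # share of positives, test B
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)  # p_e
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Toy counts (hypothetical): 40 concordant positives, 45 concordant
    # negatives, 15 discordant results out of 100 samples.
    print(f"kappa = {cohen_kappa(40, 5, 10, 45):.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is preferred over simple percent concordance when comparing an in-house assay with reference kits.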

Table 1. Clinical performance comparison between Genesis-coronavirus/DA An Gene Co., Ltd. kit and Caterina assay (n = 210).

Discussion
With the WHO declaration of the COVID-19 pandemic, countries were faced with this novel public health emergency [22]. More accurate and sensitive diagnostic methods are needed to monitor new cases and the progress of the pandemic.
Our study describes a low-cost and simple diagnostic method used in Sardinia (Italy) to report cases and assess the declared pandemic when commercial kits and information on the genome sequences of the new SARS-CoV-2 were not yet available. The "Caterina assay" analytical procedure was designed according to the first SARS-CoV-2 genome sequenced by Zhang et al. [26] and the available genomes of other SARS-associated beta-coronaviruses in human and other animal reservoirs (Figure 1). It was in use in our medical service laboratory from January 2020 to the beginning of April 2020 and enabled a fast and reliable diagnosis of the first COVID-19 cases (Figure 6). A careful study of the stem-loop architecture in the N1 gene amplicon made it possible to identify a looped-out region at the primer binding site where the oligos could anneal perfectly to the template, increasing the sensitivity of the detection method (Figure 2) [21,27]. Indeed, the reaction showed high sensitivity (detection limit of 50 viral genomes/µL, Ct = 35.75) (Figure 3), which was higher than that of other SYBR® Green-based RT-PCR methods for SARS-CoV-2 detection (10^3 copies/µL) [28]. The specificity of SARS-CoV-2 detection was finally confirmed by sequencing 50 RT-PCR amplicons from clinical specimens (Figure 5).
Among the "gold standards" developed for SARS-CoV-2 detection, the genes for the structural viral proteins E and N [12,13,17] and for the non-structural protein RdRp [13,17] were the most often chosen. The N gene in particular was selected as the first choice in this study and by other authors because of its high sequence variation in comparison with the sequences of other related human and bat beta-coronaviruses (Figure 1) [12,13].
On the other hand, identifying the most specific and conserved target gene for diagnostics is often difficult, as RT-PCR performance depends on the region of the gene in which the primers are designed [27,29]. For instance, in an analytical comparison between primer/probe sets, RdRp and E gene assays were more sensitive than N gene assays [13,27]. However, Chu et al. found that the N gene assay was approximately 10 times more sensitive than the RdRp (ORF-1b) gene assay in detecting positive clinical specimens from two patients via rRT-PCR [30]. It has been reported that the N gene encodes an internal nucleoprotein that is one of the most abundant viral proteins, is highly immunogenic and is less prone to genetic changes [31,32]. However, a recent study showed that some structural proteins, like the E protein, have low mutation rates across the residue sequence, while other viral components, such as the Spike (S) [33] or the N protein, show higher degrees of variability [34].
Mutations close to the 3′ end of an oligo could affect primer sensitivity by interfering with DNA elongation mediated by the 5′-3′ exonuclease activity of the DNA polymerase. For this reason, we used the COVID-19 CG browser interface (https://covidcg.org accessed on 18 February 2021) to evaluate the frequency of mutations (single-nucleotide variations, SNVs) present in the primer sequences of the N1 gene [35]. The forward primer S-CoV2-F contained one SNV, with the highest frequency of 0.4%, at position 28,300 (5′-ATGGACCCCAAAATCAGCGA-3′; G28300T, present in 1542 of the 443,126 sequences selected from isolates collected globally) (Figure 7). This mutation could potentially affect the sensitivity of the primer. To overcome this potential issue, the use of a degenerate forward primer might be suggested. We also performed a multiple alignment of SARS-CoV-2 isolates considered emerging variants with higher transmissibility and infectivity [36]. As shown in Figure 8, the primers of our diagnostic method were also able to detect the new variants, and no mismatches were found in the annealing sequence of the primers [31,32].
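The idea of a degenerate forward primer can be illustrated with IUPAC ambiguity codes. The sketch below checks a primer against a template position by position and shows how replacing one G with K (G or T) would tolerate a G-to-T substitution. The forward primer sequence is taken from the text, but the substituted position (0-based index 16 of the primer) and the degenerate design are assumptions made for illustration, not a validated primer.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatch_positions(primer: str, target: str) -> list:
    """0-based positions where the target base is not allowed by the primer."""
    if len(primer) != len(target):
        raise ValueError("primer and target must have the same length")
    return [
        i for i, (p, base) in enumerate(zip(primer, target))
        if base not in IUPAC[p]
    ]


FORWARD_PRIMER = "ATGGACCCCAAAATCAGCGA"   # S-CoV2-F, from the text
SNV_INDEX = 16                            # assumed position of the G>T change

# Hypothetical variant template carrying the G>T substitution.
variant_site = FORWARD_PRIMER[:SNV_INDEX] + "T" + FORWARD_PRIMER[SNV_INDEX + 1:]
# Hypothetical degenerate primer: K matches both G (reference) and T (variant).
degenerate_primer = FORWARD_PRIMER[:SNV_INDEX] + "K" + FORWARD_PRIMER[SNV_INDEX + 1:]

if __name__ == "__main__":
    print(mismatch_positions(FORWARD_PRIMER, variant_site))
    print(mismatch_positions(degenerate_primer, variant_site))
    print(mismatch_positions(degenerate_primer, FORWARD_PRIMER))
```

The check compares same-strand sequences only; a real design would also need to confirm melting temperature and secondary structure of the degenerate oligo.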
Recently, a few papers have described alternative low-cost diagnostic methods for SARS-CoV-2 based on SYBR® Green chemistry. For instance, Topan et al. selected another conserved gene, encoding the viral Membrane protein (M), to be detected by TaqMan probe or SYBR® Green. However, their SYBR® Green qPCR was an adaptation of a pre-existing TaqMan qPCR protocol, which could result in a lower amplification efficiency and a loss of analytical sensitivity.
The diagnostic performance of the Caterina assay has been benchmarked against official gold standards, which are now available for routine diagnostics but were not in the early epidemic phase. We assessed the performance of the rRT-PCR reaction by retesting patient samples previously analyzed with the Caterina assay using two gold standard kits (Table 1). Our analytical procedure still represents a robust and low-cost methodology that could be applied in particular settings. For example, a calculation of the cost burden of coronavirus detection is informative: the average cost of a commercial diagnostic PCR kit (without extraction) is approximately 17 euros/sample, whereas the Caterina assay costs about 3 euros/sample. The use of the Caterina assay may be valuable in low-resource countries, where limited health funding could be a critical constraint in managing the COVID-19 pandemic. For this reason, continuous global improvement in diagnostic tests remains crucial for faster diagnosis, to treat possible positive patients and implement containment measures, in both industrialized countries and low-resource settings [18,37].
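The cost argument can be made concrete with simple arithmetic on the approximate per-sample figures quoted above (17 versus 3 euros, extraction excluded). The sketch below applies them to the 2110 swabs of this study. Both prices are approximations from the text, so the totals are indicative only.

```python
COMMERCIAL_KIT_EUR = 17.0   # approximate cost per sample, from the text
CATERINA_EUR = 3.0          # approximate cost per sample, from the text


def reagent_cost(n_samples: int, cost_per_sample: float) -> float:
    """Total PCR reagent cost for a batch of samples (extraction excluded)."""
    return n_samples * cost_per_sample


def savings(n_samples: int) -> float:
    """Difference between commercial-kit and Caterina assay costs."""
    return (reagent_cost(n_samples, COMMERCIAL_KIT_EUR)
            - reagent_cost(n_samples, CATERINA_EUR))


if __name__ == "__main__":
    n = 2110  # number of swabs analyzed in the study
    print(f"Commercial kit: {reagent_cost(n, COMMERCIAL_KIT_EUR):.0f} EUR")
    print(f"Caterina assay: {reagent_cost(n, CATERINA_EUR):.0f} EUR")
    print(f"Difference:     {savings(n):.0f} EUR")
```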

Conclusions
A "homemade" SYBR® Green-based rRT-PCR, designed to detect the N gene of SARS-CoV-2, proved to be a sensitive and reliable detection method. This highlights the importance of developing front-line and low-cost analytical procedures even during a health emergency. In this case, territorial screening made it possible to link suspected clinical cases to SARS-CoV-2 infection at an early stage. This allowed targeted epidemiological choices during the initial phase of the COVID-19 pandemic on the island of Sardinia.

Recommendation
While more research is needed to elucidate the true nature of the severe acute respiratory syndrome, the Caterina assay should be used rather than imported commercialized kits because it is a cheaper and reliable way to detect COVID-19.

Informed Consent Statement: Not applicable.

Data Availability Statement:
The raw data supporting the conclusions of this manuscript will be made available by the authors. However, data sharing is not applicable to patient data.