In Silico Evaluation of CRISPR-Based Assays for Effective Detection of SARS-CoV-2

Coronavirus disease 2019 (COVID-19), caused by SARS-CoV-2, has been an ongoing outbreak since late 2019. The pandemic has driven rapid development of molecular detection technologies that diagnose viral infection for epidemic prevention. In addition to antigen test kits (ATK) and polymerase chain reaction (PCR), CRISPR-based assays for detection of SARS-CoV-2 have gained attention because they have a simple setup yet maintain high specificity and sensitivity. However, SARS-CoV-2 has continued to mutate over the past few years. Thus, molecular tools that rely on matching at the nucleotide level need to be reevaluated to preserve their specificity and sensitivity. Here, we analyzed how mutations in different variants of concern (VOC), including the Alpha, Beta, Gamma, Delta, and Omicron strains, could introduce mismatches to the previously reported primers and crRNAs used in the CRISPR-Cas system. Over 40% of the primer sets and 15% of the crRNAs contain mismatches. Hence, primers and crRNAs in nucleic acid-based assays must be chosen carefully to pair with SARS-CoV-2 variants. In conclusion, the data obtained from this study could be useful in selecting conserved primers and crRNAs for effective detection of the VOC of SARS-CoV-2.


Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has spread worldwide and led to an outbreak. As of 18 July 2022, more than 567 million people had been infected, and 6 million people had died from viral infection or its complications [1]. SARS-CoV-2 is an RNA virus that relies on its own RNA polymerase to replicate its RNA in the human host. This process is error-prone, leading to the high mutation rates seen in RNA viruses. In September 2020, the Alpha variant (B.1.1.7) emerged, followed by the spread of the Beta variant. In late 2020, the Gamma variant (P.1) spread, and the Delta variant (B.1.617.2) took over afterward. In 2022, Omicron variants (both BA.4 and BA.5) are dominant worldwide [2]. Mutations in the spike protein of the different variants have helped the virus evade host immunity. For example, the D614G mutation observed in all strains enhances viral replication, infectivity, and virion stability [3]. Other mutations occur at different frequencies and are summarized in Figure 1.
To prevent the spread of SARS-CoV-2, molecular diagnostic techniques have been used to detect infected cases, which are subsequently quarantined. The standard detection technique is reverse transcription-quantitative polymerase chain reaction (RT-qPCR). Briefly, RT-qPCR works by converting SARS-CoV-2 RNA to cDNA with reverse transcriptase and then amplifying the cDNA with DNA polymerase. This technique has high sensitivity.
Several CRISPR-based assays for SARS-CoV-2 detection have been developed and published during the past few years. Nonetheless, SARS-CoV-2 continuously evolves into several variants of concern (VOC), including Alpha, Beta, Gamma, Delta, and Omicron. Thus, the diagnostic performance of those CRISPR-based assays could be affected by the high mutation rate of SARS-CoV-2. In this systematic review, we collected the primers and crRNAs from previous works that describe the CRISPR-Cas system for SARS-CoV-2 detection up to December 2021 and then analyzed how mutations in different variants of SARS-CoV-2 could affect each assay.

Mutations in SARS-CoV-2 Variants
SARS-CoV-2 is a positive-sense single-stranded RNA virus. Its genome is about 30,000 nucleotides long, comprising 15 open reading frames (ORFs). The major ORFs encode non-structural proteins (proteases and RNA polymerase) and structural proteins, including the spike (S), envelope (E), matrix (M), and nucleocapsid (N) proteins. These proteins are important for viral entry, fusion, and replication in the host cells. The S protein interacts with the angiotensin-converting enzyme 2 (ACE2) receptor and transmembrane protease serine 2 (TMPRSS2) on the host cell membrane, leading to viral entry [6,7]. Since the S protein is on the viral surface membrane, it is one of the target proteins for the immune response. Data from the SARS-CoV-2 genome database show that the S gene has the highest mutation rate, which allows the virus to escape the host immune system (Figure 1). D614G was the first mutation identified to increase viral replication, infectivity, and virion stability [3]. Interestingly, this mutation is found in all strains. The Alpha strain was first reported in the United Kingdom and contains three major mutations, del69-70, N501Y, and P681H, which enhance ACE2 receptor affinity, increase infectivity, and promote viral entry into respiratory epithelial cells, respectively [8][9][10]. The Beta strain was identified in South Africa and carries E484K, K417N, and N501Y. The E484 mutation was shown to reduce neutralization by polyclonal human plasma antibodies [11]. Similarly, the Gamma strain, which was found in Brazil, contains E484K, K417T, and N501Y. The Delta strain was discovered in India and comprises G142D, L452R, E484Q, and P681R. The L452R mutation was shown to help the virus escape the host defense mechanism [12]. Lastly, the Omicron strain was identified in South Africa. It contains many mutations, including del69-70, T95I, G142D, del143-145, K417N, T478K, N501Y, N679K, and P681H [13].
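The variant signatures listed above can be compared programmatically. The following is a minimal sketch in Python that transcribes only the spike mutations named in this section; the dictionary name and helper functions are illustrative assumptions, and D614G is added to every variant because the text states that it is found in all strains. It is not a complete catalogue of each lineage's mutations.

```python
# Spike mutations per variant, transcribed from the paragraph above.
# D614G is included everywhere because the text reports it in all strains.
# This is an illustrative subset, not a full lineage definition.
SPIKE_MUTATIONS = {
    "Alpha": {"D614G", "del69-70", "N501Y", "P681H"},
    "Beta": {"D614G", "E484K", "K417N", "N501Y"},
    "Gamma": {"D614G", "E484K", "K417T", "N501Y"},
    "Delta": {"D614G", "G142D", "L452R", "E484Q", "P681R"},
    "Omicron": {"D614G", "del69-70", "T95I", "G142D", "del143-145",
                "K417N", "T478K", "N501Y", "N679K", "P681H"},
}


def variants_with(mutation):
    """Return the sorted names of the variants that carry `mutation`."""
    return sorted(name for name, muts in SPIKE_MUTATIONS.items()
                  if mutation in muts)


def shared_by_all():
    """Return the mutations present in every listed variant."""
    return set.intersection(*SPIKE_MUTATIONS.values())


def unique_to(variant):
    """Return the mutations listed for `variant` and for no other variant."""
    others = set()
    for name, muts in SPIKE_MUTATIONS.items():
        if name != variant:
            others |= muts
    return SPIKE_MUTATIONS[variant] - others
```

For instance, `variants_with("N501Y")` lists the four variants that share this substitution, while `unique_to("Delta")` returns the mutations that, within this list, could distinguish Delta from the others.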
Because of this variation in the spike protein, mutations in the S gene are used to classify the variants. However, because SARS-CoV-2 has a high mutation rate, especially in the S gene, detection assays for SARS-CoV-2 RNA need to be designed carefully to prevent false-negative results.

CRISPR-Cas Detection System
The CRISPR-Cas system is a bacterial defense mechanism against phages. A Cas nuclease can recognize and hydrolyze DNA or RNA targets using a CRISPR RNA (crRNA), which has a sequence complementary to the target. This system has been used in many applications, including gene knock-in and knock-out [14]. In addition, some Cas proteins, such as Cas12a, Cas12b, and Cas13, have collateral nuclease activity, meaning that once these Cas proteins recognize the target, they cleave nearby nucleic acids non-specifically. For Cas12a and Cas12b, the Cas protein together with its crRNA recognizes a double-stranded DNA (dsDNA) target and has trans-cleavage activity on single-stranded DNA (ssDNA) [15,16]. For Cas13, the Cas protein, along with its crRNA, hydrolyzes the RNA target and then cleaves single-stranded RNA (ssRNA) non-specifically [4]. This collateral activity has been utilized for nucleic acid detection (Figure 2).
For CRISPR detection, the major difference between Cas12 and Cas13 is that Cas12 only detects DNA targets that carry a protospacer adjacent motif (PAM) sequence, while Cas13 can detect RNA targets without PAM restriction. To detect SARS-CoV-2, the CRISPR-Cas12-based system needs to convert viral RNA to cDNA, followed by DNA amplification and detection by the Cas12 protein. In contrast, the CRISPR-Cas13-based system needs to convert viral RNA into cDNA for amplification with a T7 promoter. The resulting DNA is then transcribed in vitro to generate RNA for Cas13 detection. This additional in vitro transcription (IVT) step could introduce further complications, such as misincorporation by T7 RNA polymerase (about 0.005% error rate) [17].
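To put the quoted T7 error rate in perspective, the short calculation below estimates how often a transcript would carry at least one misincorporation. It is a rough sketch under two assumptions that are not stated in the source: the 0.005% figure is read as a per-nucleotide rate, and errors are treated as independent between positions. The transcript lengths are hypothetical examples.

```python
# "About 0.005%" from the text, interpreted here as a per-nucleotide
# misincorporation rate (an assumption; the source does not give the unit).
T7_ERROR_RATE = 0.005 / 100


def expected_errors(length_nt, rate=T7_ERROR_RATE):
    """Expected number of misincorporations in a transcript of length_nt."""
    return length_nt * rate


def prob_error_free(length_nt, rate=T7_ERROR_RATE):
    """Probability that a transcript has no misincorporation at all,
    assuming independent errors at each position."""
    return (1.0 - rate) ** length_nt


if __name__ == "__main__":
    # Hypothetical lengths: a crRNA-sized target window and two amplicons.
    for length in (28, 100, 300):
        p_any = 1.0 - prob_error_free(length)
        print(f"{length} nt: expected errors = {expected_errors(length):.4f}, "
              f"P(at least one error) = {p_any:.4%}")
```

Under these assumptions, fewer than 1 in 500 transcripts covering a 28-nucleotide target window would contain an error, so the IVT step adds some noise but is unlikely to dominate over genuine viral mutations.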
In general, the limit of detection for the CRISPR-Cas system alone is in the picomolar to femtomolar range, which is not sensitive enough to detect a few copies of targets in the attomolar range [4,18,19]. Hence, CRISPR-Cas has been combined with nucleic acid amplification techniques, including PCR, RPA, and LAMP, lowering the detection limit to ~10 copies per µL.
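The relationship between molar concentration and copy number can be made explicit with a unit conversion. The sketch below uses the Avogadro constant to convert between copies per µL and molarity; the function names are illustrative assumptions.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)
MICROLITRES_PER_LITRE = 1e6


def copies_per_ul_to_molar(copies_per_ul):
    """Convert a concentration in copies per microlitre to mol/L."""
    copies_per_litre = copies_per_ul * MICROLITRES_PER_LITRE
    return copies_per_litre / AVOGADRO


def molar_to_copies_per_ul(molar):
    """Convert a concentration in mol/L to copies per microlitre."""
    return molar * AVOGADRO / MICROLITRES_PER_LITRE


if __name__ == "__main__":
    attomolar = copies_per_ul_to_molar(10) / 1e-18
    print(f"10 copies/uL = {attomolar:.1f} aM")
    print(f"1 fM = {molar_to_copies_per_ul(1e-15):.0f} copies/uL")
```

Ten copies per µL corresponds to roughly 17 aM, whereas 1 fM is about 600 copies per µL. This gap of nearly two orders of magnitude is why an amplification step precedes CRISPR-Cas detection in most assays.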

Mismatches in the Amplification Step and CRISPR-Cas Detection Step
Three major DNA amplification techniques have been used in combination with CRISPR-based assays. The first is PCR. This conventional approach consists of three main steps: denaturation at 95 °C, annealing of primers at 50-60 °C, and extension at 68-72 °C. The second is LAMP. LAMP uses 4-8 primers that bind to distinct regions. Owing to the special design of the assay, the DNA polymerase generates DNA products with a loop-like structure, which promotes further amplification into numerous repeats of the target sequence. This reaction occurs at a single temperature of around 60-65 °C. The third is the recombinase-based assay, which is commercialized as recombinase polymerase amplification (RPA), recombinase-aided amplification (RAA), or enzymatic recombinase amplification (ERA). In this technique, the recombinase pairs the primers with the complementary sequence in the DNA target. The resulting displaced strand is bound by single-stranded DNA-binding protein (SSB), followed by primer extension and DNA amplification by a strand-displacing DNA polymerase. This process occurs isothermally at 37-42 °C.
Since SARS-CoV-2 carries different mutations across the variants, the primers must be designed carefully to avoid mismatches. PCR has higher primer specificity than LAMP or RPA because of the higher temperature during amplification [20]. However, the position of a mismatch on the primer also affects DNA amplification. Mismatches near the 3′ ends of primers are typically more detrimental than internal mismatches [21,22].
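The distinction between 3′-end and internal mismatches can be sketched as a simple classification. In the Python example below, the primer and its binding site are written 5′ to 3′ on the same strand, so a perfect match means identical strings. The window of five nucleotides at the 3′ end is an illustrative assumption and is not a threshold taken from the cited studies.

```python
def primer_mismatches(primer, site):
    """Return 1-based positions (counted from the 5' end) where the primer
    differs from its binding site. Both sequences are written 5'->3' on the
    same strand, so a perfect match is an identical string."""
    if len(primer) != len(site):
        raise ValueError("primer and binding site must have the same length")
    return [i + 1
            for i, (p, s) in enumerate(zip(primer.upper(), site.upper()))
            if p != s]


def classify_mismatches(primer, site, three_prime_window=5):
    """Split mismatch positions into those within the last
    `three_prime_window` nucleotides (3' end) and the rest (internal).
    The default window of 5 nt is an assumption for illustration."""
    positions = primer_mismatches(primer, site)
    cutoff = len(primer) - three_prime_window
    return {
        "three_prime": [p for p in positions if p > cutoff],
        "internal": [p for p in positions if p <= cutoff],
    }
```

A primer with any entry under `three_prime` would be a candidate for redesign first, since those mismatches are typically the most detrimental to extension.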
In the CRISPR-Cas detection step, mutations of the virus can affect the binding of the crRNA and thereby compromise collateral activity. However, there is no generalizable rule for predicting the effect of mismatch position on trans-cleavage activity. Ooi et al. showed that Cas12a from Acidaminococcus sp. carrying E174R/S542R/K548R (enAsCas12a) tolerates mutations better than wild-type AsCas12a and Lachnospiraceae bacterium Cas12a (LbCas12a) [23]. In addition, the study showed that mismatches at positions 7-9 and 19 of the crRNA have the most detrimental effect, while those at positions 1 and 17 have less effect. For Cas13, there is no systematic study of how mismatches affect its collateral activity. However, Abudayyeh et al. reported that a single mismatch in the gRNA only minimally affects the knockdown ability of Leptotrichia shahii Cas13a, while double mismatches at any positions in the gRNA reduce knockdown efficiency dramatically [24]. For Cas13d, Wessels et al. demonstrated that knockdown is most affected by mismatches in the seed region of the gRNA, between nucleotides 15 and 21 [25].
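The position effects reported for Cas12a crRNAs can be turned into a simple annotation of mismatches. The sketch below labels each mismatch using the positions quoted above (7-9 and 19 as most detrimental, 1 and 17 as less detrimental). It assumes that the spacer and the target are written in the same orientation and alphabet and that position 1 is the first spacer nucleotide; the numbering convention of the cited study should be checked before reuse. Positions not mentioned in the text are left unclassified, in keeping with the absence of a generalizable rule.

```python
# Positions taken from the text for Cas12a crRNAs; everything else is
# left unclassified because no general rule is reported.
HIGH_IMPACT_POSITIONS = {7, 8, 9, 19}
LOW_IMPACT_POSITIONS = {1, 17}


def crrna_mismatch_report(spacer, target):
    """Return a list of (position, label) for each mismatch between a
    crRNA spacer and its target site. Both are given in the same
    orientation and alphabet; position 1 is the first spacer nucleotide
    (an assumed convention)."""
    if len(spacer) != len(target):
        raise ValueError("spacer and target must have the same length")
    report = []
    for pos, (s, t) in enumerate(zip(spacer.upper(), target.upper()), start=1):
        if s == t:
            continue
        if pos in HIGH_IMPACT_POSITIONS:
            label = "high"
        elif pos in LOW_IMPACT_POSITIONS:
            label = "low"
        else:
            label = "unclassified"
        report.append((pos, label))
    return report
```

An empty report indicates a perfect match, and any `high` entry marks a crRNA whose trans-cleavage activity is most likely to be compromised by that variant.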

Mismatches in Published CRISPR-Cas Detection System
First, an electronic search of PubMed was conducted for the period from January 2020 to December 2021. The search strategy used the following keywords within titles or abstracts: "((SARS-CoV-2) OR (severe acute respiratory syndrome coronavirus 2) OR (COVID-19) AND (CRISPR))". Search results were imported into Covidence (https://www.covidence.org/ (accessed on 20 December 2021)) for systematic review management. The abstracts of 258 candidate publications were evaluated by three reviewers, focusing on articles about molecular techniques (amplification-free, PCR, LAMP, or RPA) combined with a CRISPR-based assay for SARS-CoV-2 detection. Review articles and irrelevant studies were excluded during the evaluation process. Finally, the primer and crRNA sequences were retrieved from the full texts of 53 candidate articles (Table S1). Genome sequences of SARS-CoV-2, including the Alpha, Beta, Gamma, Delta, and Omicron variants, were aligned by multiple sequence alignment and then analyzed for primer and crRNA binding sites. The mismatches or deletions within the primer or crRNA binding sites are illustrated in Figure S1.
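The screening step can be approximated without a multiple sequence alignment. The sketch below is a simplified stand-in for the analysis described above, not the procedure used in this study: it slides each oligonucleotide along a genome on both strands and reports the ungapped site with the fewest mismatches. Because it is ungapped, deletions in the binding site appear as several substitutions rather than as a gap. All function names are illustrative, and sequences are assumed to be in the DNA alphabet (A, C, G, T).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def best_site(oligo, genome):
    """Return (mismatches, start, strand) for the closest ungapped match.
    Strand '-' means the reverse complement of the oligo matches the
    given genome strand. Returns None if the genome is shorter than
    the oligo."""
    genome = genome.upper()
    best = None
    queries = (("+", oligo.upper()), ("-", reverse_complement(oligo)))
    for strand, query in queries:
        for start in range(len(genome) - len(query) + 1):
            distance = hamming(query, genome[start:start + len(query)])
            if best is None or distance < best[0]:
                best = (distance, start, strand)
    return best


def screen(oligo, genomes):
    """Map each genome name to the mismatch count at its best site."""
    return {name: best_site(oligo, seq)[0] for name, seq in genomes.items()}
```

Applied to a dictionary of variant genomes, `screen` returns zero for a conserved primer or crRNA and a positive count where a variant introduces mismatches. For full-length genomes, an alignment-based approach is faster and handles deletions properly.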
In our compilation, 33% (4/12) of PCR primer pairs have mismatches with different variants, while 21% (7/33) of LAMP and 54% (26/48) of RPA/RAA primer sets contain mismatches (Table 1). However, in the PCR-based assays, only one of the crRNAs (1/12) has mismatches, and these are with the Omicron variant. For LAMP-based assays, 6% (2/34) of the crRNAs have mismatches with the Alpha, Beta, and Omicron variants. For RPA/RAA-based assays, 23% (11/48) of the crRNAs have mismatches with the Alpha, Gamma, Delta, and Omicron variants. In the amplification-free assay that combines CRISPR-Cas13a with mobile phone microscopy, all three crRNAs are perfect matches [34].
Considering which viral genes have the most mismatches in amplification primers, we found that the S gene has the most mismatches (82%), while the E and N genes have 50% and 39% mismatches, respectively. This result agrees with the observation that the S gene has the highest variation among the genes in the SARS-CoV-2 genome. Strain-wise, the primer pairs have the fewest mismatches with the Gamma strain (11%), while mismatches with the other strains (Alpha, Beta, Delta, and Omicron) are almost double (18%, 21%, 19%, and 22%, respectively).
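The percentages for the amplification-based assays follow directly from the reported counts, as the short calculation below shows. The dictionaries transcribe the counts given in the text. The pooled figures are derived here for illustration and cover only PCR, LAMP, and RPA/RAA assays, so they need not coincide exactly with overall figures that also include amplification-free assays.

```python
# (assays with mismatches, assays examined), transcribed from the text.
PRIMER_MISMATCHES = {"PCR": (4, 12), "LAMP": (7, 33), "RPA/RAA": (26, 48)}
CRRNA_MISMATCHES = {"PCR": (1, 12), "LAMP": (2, 34), "RPA/RAA": (11, 48)}


def percent(hits, total):
    """Percentage rounded to the nearest integer."""
    return round(100 * hits / total)


def summarize(counts):
    """Return per-method percentages and the pooled (hits, total, percent)."""
    per_method = {method: percent(h, t) for method, (h, t) in counts.items()}
    hits = sum(h for h, _ in counts.values())
    total = sum(t for _, t in counts.values())
    return per_method, (hits, total, percent(hits, total))


if __name__ == "__main__":
    print(summarize(PRIMER_MISMATCHES))
    # ({'PCR': 33, 'LAMP': 21, 'RPA/RAA': 54}, (37, 93, 40))
    print(summarize(CRRNA_MISMATCHES))
    # ({'PCR': 8, 'LAMP': 6, 'RPA/RAA': 23}, (14, 94, 15))
```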
Next, we examined the CRISPR-Cas reactions and found that 15% of them have mismatches between the viral genome and the crRNA. Divided by type of Cas protein, crRNAs for Cas12 have a slightly lower mismatch percentage than crRNAs for Cas13 (14% versus 20%, respectively). Strain-wise, crRNAs have the most mismatches with Delta and Omicron (7%), while Beta, Gamma, and Alpha have 2%, 2%, and 4% mismatches, respectively. Considering the target genes, the S gene has the most mismatches with crRNAs (25%), while the E, ORF1ab, and N genes have mismatches at 0%, 15%, and 17%, respectively. This suggests that crRNAs targeting the E gene are robust for SARS-CoV-2 detection. Hence, the E gene could serve as a universal detection target, while the S gene could be useful for designing assays that differentiate viral strains.

Conclusions
During the SARS-CoV-2 pandemic, early diagnosis and screening are crucial to prevent the spread of the virus. However, because of the rapid mutation rate of the viral genome, molecular detection assays that rely on nucleotide sequence, such as RT-qPCR and CRISPR-Cas systems, need to be implemented cautiously. For the CRISPR-Cas system, we found that one-third of the primer sets and 15% of the crRNAs in the reported studies have mismatches with the variant genomes. Thus, for the current and upcoming variants of SARS-CoV-2, the primer and crRNA sequences need to be aligned with the currently circulating viral genomes to ensure the accuracy and specificity of the result.