Identifying Structural Features of Nucleotide Analogues to Overcome SARS-CoV-2 Exonuclease Activity

With the recent global spread of new SARS-CoV-2 variants, there remains an urgent need to develop effective and variant-resistant oral drugs. Recently, we reported in vitro results validating the use of combination drugs targeting both the SARS-CoV-2 RNA-dependent RNA polymerase (RdRp) and proofreading exonuclease (ExoN) as potential COVID-19 therapeutics. For the nucleotide analogues to be efficient SARS-CoV-2 inhibitors, two properties are required: efficient incorporation by RdRp and substantial resistance to excision by ExoN. Here, we have selected and evaluated nucleotide analogues with a variety of structural features for resistance to ExoN removal when they are attached at the 3′ RNA terminus. We found that dideoxynucleotides and other nucleotides lacking both 2′- and 3′-OH groups were most resistant to ExoN excision, whereas those possessing both 2′- and 3′-OH groups were efficiently removed. We also found that the 3′-OH group in the nucleotide analogues was more critical than the 2′-OH for excision by ExoN. Since the functionally important sequences in Nsp14/10 are highly conserved among all SARS-CoV-2 variants, these identified structural features of nucleotide analogues offer invaluable insights for designing effective RdRp inhibitors that can be simultaneously efficiently incorporated by the RdRp and substantially resist ExoN excision. Such newly developed RdRp terminators would be good candidates to evaluate their ability to inhibit SARS-CoV-2 in cell culture and animal models, perhaps combined with additional exonuclease inhibitors to increase their overall effectiveness.


Introduction
SARS-CoV-2, the causative agent of COVID-19, is a member of the Nidovirales order of positive-strand RNA viruses [1]. Members of this coronavirus family include those responsible for SARS, MERS and assorted mild respiratory infections in humans and animals [2]. The coronaviruses fall into four major groups, designated alpha, beta, gamma and delta [3]; SARS-CoV, MERS-CoV and SARS-CoV-2 are in the beta lineage and are closely related to one another [4]. Like other coronaviruses, SARS-CoV-2 has a large RNA genome encoding more than 25 proteins. There are 16 nonstructural proteins (Nsp1-16), many of which form the replication-transcription complex (RTC). The functions of these proteins have been extensively reviewed [5,6], and several of them have been selected as targets for drug development.
Because of the large genome size of the coronaviruses (>30 kb), their relatively low fidelity RNA-dependent RNA polymerase (RdRp) would tend to produce a high number of errors during RNA replication and transcription [7]; the resulting mutations could RNA substrates. We identified several hepatitis C virus (HCV) NS5A inhibitors [52] that also inhibited the SARS-CoV-2 exonuclease [41,53,54]. Several of these ExoN inhibitors acted synergistically with RdRp inhibitors in blocking viral replication in Calu-3 cells [41]; recently, other investigators have proposed similar combination drug approaches [55,56]. Interestingly, our studies indicated that the HCV NS5A inhibitors Velpatasvir and Daclatasvir not only inhibited the exonuclease but inhibited RdRp activity as well [53,54]. We also noticed that Tenofovir, once incorporated into RNA by RdRp, was largely resistant to removal by ExoN [41]. This led us to examine further the properties of nucleotide analogues that might lead to such ExoN resistance.
In this paper, based on published sequence information, we first assess the conservation of protein sequences within the exonuclease functional sites of Nsp14 and Nsp10 in SARS-CoV-2 variants (Figure 1 and Figure S1). The high conservation observed for these proteins indicates that inhibitors of ExoN function would be effective against current and future variants of SARS-CoV-2. Second, we present the results of enzymatic assays evaluating the ability of nucleotide analogues with various structural features to be excised from the 3′ end of RNA by ExoN. We focus on nucleotides and nucleotide analogues with sugar modifications, such as lack of the 2′-OH, lack of the 3′-OH, or lack of both 2′- and 3′-OH, as well as base-modified nucleotides. Among the molecules we tested, nucleotide analogues lacking both the 2′- and 3′-OH moieties were most resistant to ExoN activity. Given the sequence conservation of Nsp14/10 among SARS-CoV-2 variants, the structural features of nucleotide analogues identified above can guide the design and synthesis of novel RdRp inhibitors that can both be efficiently incorporated by the RdRp and substantially resist ExoN excision, perhaps combined with additional exonuclease inhibitors to increase their overall effectiveness in inhibiting SARS-CoV-2.

Sequence Conservation in Nsp14 and Nsp10
With more than 5 million genomes of SARS-CoV-2 sequenced, naturally occurring mutations have been analyzed in a number of strains, including the most recent variants of concern to public health [57-59]. Overall, SARS-CoV-2 is a relatively conservative virus compared to other RNA viruses [60]. This low genomic variability means that coronaviral genomes contain lower numbers of mutations, and analysis of those mutations that become widespread may help in our understanding of viral functions. In the coronaviruses, low genome variability is ensured by the activity of ExoN during RNA replication [7,9,11,12,14,15].
The majority of genomic sequence variations observed in SARS-CoV-2 variants are located within genes encoding Spike, structural and accessory proteins, while the members of the ExoN complex, Nsp14 and Nsp10 genes, accumulate fewer nucleotide changes, most of which are synonymous (not causing changes in the encoded amino acids). Figure 1a shows amino acid diversity profiles for an early 2019 SARS-CoV-2 strain and the recent Omicron BA.5 variant. Nsp10 displays high sequence conservation. Moving from early variants to more recent ones, the overall amino acid diversity in Nsp14 becomes lower, in high contrast with the increasing sequence diversity in the Spike protein, for example. Within the ExoN-related portion of Nsp14, only one amino acid position retains diversity, displaying a conservative substitution I42V. Such sequence conservation in Nsp14 and Nsp10 may reflect a negative evolutionary pressure on the structure of the protein components of the ExoN complex, preserving its function, along with a bottleneck effect accounting for the widespread distribution of the neutral I42V. Figure 1b shows the primary structure of the exonuclease-related domains of the Nsp14 protein, in four SARS-CoV-2 strains (the original 2019 isolate Wuhan-Hu-1, Delta, Omicron BA.1 and Omicron BA.2), along with SARS-CoV-1, MERS-CoV and OC43-CoV, a coronavirus causing the common cold. Positions marked with asterisks, triangles and diamonds in Figure 1b correspond to amino acid residues that are critical for ExoN function. Artificially introduced mutations at these positions have been shown to hinder the ExoN proofreading activity [12,14,36]. The catalytic residues at positions marked with asterisks comprise the active center of the ExoN domain of Nsp14.
When recent variants of SARS-CoV-2 are compared to the original 2019 strain, very few mutations in the replication component, including Nsp14 and Nsp10 proteins, became fixed in the population. Within the portion of Nsp14 that spans the Nsp10-binding site and ExoN domain (Figure 1b), the only widespread naturally occurring mutation is I42V, a conservative substitution in the Nsp10-binding site, as evidenced by analysis of more than 80,000 SARS-CoV-2 genomes [59]. This is the only mutation that is ubiquitous in the exonuclease-related portion of Nsp14 in Omicron sub-lineages [60]. Another amino acid change in Nsp14, a D to N substitution at amino acid position 212 in the Figure 1b alignment (mutation not shown), present in some BA.2 sequences (e.g., GenBank accession ON117774), is located in the first zinc finger region of the ExoN domain. This substitution is between similar polar amino acids, aspartic acid and asparagine, and it does not hit any position that has been shown to be important for zinc finger formation. Eskier et al. [34] reported that a non-synonymous substitution F234L (position according to Figure 1b, mutation not shown), identified based on analysis of ~30,000 SARS-CoV-2 genomes and located within the Nsp14 zinc finger 1, was associated with an increased genome-wide mutational load, but so were the other two identified nucleotide substitutions that were synonymous and therefore not affecting the corresponding protein sequence.
Figure 1b legend: Nsp14 of the SARS-CoV-2 Wuhan-Hu-1 variant (highlighted in blue) was used as a reference; amino acid differences in other variants compared to the reference sequence are shown in red. Functional motifs comprising the ExoN active site are underlined, and the zinc finger regions are boxed [12,14,36]. ExoN catalytic residues are indicated with asterisks; residues interacting with Nsp10 are marked by blue triangles; and positions critical for zinc fingers, protein solubility or ExoN activity are shown as red diamonds [12,14]. The sequences were retrieved from GenBank: OC43-CoV (common cold; accession number YP_009924328), MERS-CoV (YP_009047225), SARS-CoV-1 (JX163928), and variants of SARS-CoV-2: Wuhan-Hu-1 (YP_009725309, original strain isolated in 2019), Delta (OM990852), and Omicron BA.1 (ON141240) and BA.2 (ON553707). Protein alignment was built using Clustal Omega [62,63], visualized by MView [63,64], and then annotated.
In the relatively short and conservative Nsp10 protein of SARS-CoV-2, naturally occurring mutations are rare (Figure S1). Artificially introduced mutations in many of the evolutionarily highly conserved amino acid positions in Nsp10 have been shown to lead to reduced fidelity [65] or even be lethal to coronaviruses by interfering with viral replication [8,37]. A naturally occurring amino acid substitution R134N, detected by analysis of ~1000 genomes of SARS-CoV-2 B.1.617, is neutral and not under selective pressure [58].
Based on the conservative nature of components of the ExoN complex, we predict that any inhibitor for this enzymatic function should have broad-spectrum inhibitory potential for most current and possibly future strains of SARS-CoV-2.
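The idea of a per-position amino acid diversity profile can be illustrated with a minimal sketch. It assumes Shannon entropy as the diversity measure, which is an assumption made here for illustration; the measure used for Figure 1a is not stated in this text. The sequences below are a hypothetical toy alignment, not real Nsp14 or Nsp10 sequences, and the function names are illustrative only.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def diversity_profile(aligned_seqs):
    """Per-position entropy for a list of equal-length aligned sequences."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [column_entropy(col) for col in zip(*aligned_seqs)]


# Hypothetical toy alignment (NOT real viral sequence): only the third
# column carries an I/V polymorphism; the other columns are invariant.
toy_alignment = ["MAIK", "MAIK", "MAVK", "MAVK"]
profile = diversity_profile(toy_alignment)

# 1-based positions whose entropy is above zero, i.e. that retain diversity.
variable_positions = [i + 1 for i, h in enumerate(profile) if h > 0]
print(profile, variable_positions)
```

A fully conserved column scores zero, so a protein such as Nsp10 with rare natural mutations would give a nearly flat profile under this measure, whereas a single polymorphic site stands out as an isolated non-zero position.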

Excision of Nucleotide Analogues from RNA by SARS-CoV-2 ExoN
In our previous study, we reported that Tenofovir-terminated RNA showed high resistance toward excision by the SARS-CoV-2 ExoN compared to RNA terminated with Remdesivir, Molnupiravir, Sofosbuvir or Favipiravir at the 3′ end [41]. We also demonstrated that the triphosphate forms of numerous nucleotide analogues with different structural features can be incorporated into RNA by the SARS-CoV-2 RdRp complex and terminate RNA extension with varying efficiency in either immediate or delayed fashion [46,47]. Here, with the goal of providing insights into the design of viral polymerase inhibitors that could evade the exonuclease proofreading function, we systematically investigated the chemical or structural properties of selected nucleotide analogues for exonuclease excision.
As shown in Figure 2, adenosine- (a), Cordycepin- (d), dideoxyadenosine- (g) or Tenofovir-terminated RNA (j) were separately incubated with the SARS-CoV-2 pre-assembled exonuclease complex (Nsp14/Nsp10) at 37 °C for 0 and 10 min, and the results were analyzed by MALDI-TOF mass spectrometry. The spectra in the middle (Figure 2b,e,h,k) reflected the molecular weights of the corresponding intact RNAs. After 10 min of exonuclease treatment, the RNA products were re-analyzed by MS, and the results are shown in Figure 2c,f,i,l. As an example, the peak at 8168 Da corresponds to the Cordycepin-terminated RNA before exonuclease treatment (Figure 2e). Exonuclease activity caused nucleotide cleavage from the 3′-end of the Cordycepin-terminated RNA, as shown by the eight lower molecular weight fragments corresponding to cleavage of 5-13 nucleotides (Figure 2f), with about 10% intact RNA remaining (peak at 8175 Da), indicating some level of ExoN resistance. Similarly, ddA- and Tenofovir-terminated RNAs were analyzed before (Figure 2h,k) and after ExoN treatment (Figure 2i,l). Approximately 60% and 55% of intact RNAs were observed, respectively (Figure 2i,l), indicating that both nucleotide analogues have substantial ExoN resistance. As a control, the intact adenosine-terminated RNA peak (around 8183 Da, Figure 2b) was not observed after treatment (Figure 2c), indicating that the natural nucleotide A is completely removed by the Nsp14/10 complex. By comparing spectra (Figure 2c,f,i,l), we concluded that ddA- and Tenofovir-terminated RNAs exhibit high resistance toward ExoN cleavage while Cordycepin-terminated RNA displays moderate resistance.
Figure 2 legend: [...] (sequences shown in b,e,h,k) and SARS-CoV-2 pre-assembled exonuclease complex (Nsp14/Nsp10) was incubated at 37 °C for 10 min. These intact RNAs (b,e,h,k) and their respective exonuclease reaction products (c,f,i,l) were analyzed by MALDI-TOF MS. The signal intensity was normalized to the highest peak. The accuracy for m/z determination is approximately ±10 Da.
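The reading of such spectra can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the quantification procedure used in this work: it assumes that the fraction of intact RNA is estimated as the intact-peak intensity divided by the summed intensity of all listed peaks, and that a peak is assigned to a species when it lies within the stated ±10 Da accuracy. The 8168 Da and 8175 Da values come from the text above; all intensities and the fragment masses in the example list are hypothetical.

```python
def matches(observed_da, expected_da, tol_da=10.0):
    """True if an observed peak lies within the mass tolerance of the expected mass."""
    return abs(observed_da - expected_da) <= tol_da


def intact_fraction(peaks, intact_mass_da, tol_da=10.0):
    """Share of total peak intensity assigned to the intact RNA.

    peaks: list of (mass_da, intensity) tuples from one spectrum.
    Assumption: intensity ratios approximate relative abundance.
    """
    total = sum(intensity for _, intensity in peaks)
    if total <= 0:
        raise ValueError("spectrum has no signal")
    intact = sum(i for m, i in peaks if matches(m, intact_mass_da, tol_da))
    return intact / total


# The 8175 Da peak seen after treatment is assigned to the 8168 Da
# Cordycepin-terminated RNA because it falls within the 10 Da tolerance.
same_species = matches(8175.0, 8168.0)

# Hypothetical post-treatment peak list: (mass in Da, arbitrary intensity).
example_peaks = [(8175.0, 10.0), (4700.0, 30.0), (4390.0, 60.0)]
fraction = intact_fraction(example_peaks, intact_mass_da=8168.0)
print(same_species, fraction)
```

With the same tolerance, a peak at 8183 Da would not be assigned to the 8168 Da species, which is consistent with the adenosine- and Cordycepin-terminated RNAs being distinguishable in these spectra.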
A previous study suggested that 3′-deoxynucleotide analogues could potentially resist exonuclease excision [11]. Moreover, 3′-deoxyadenosine-5′-triphosphate (Cordycepin TP) has been demonstrated to be incorporated efficiently and terminate RNA synthesis by the SARS-CoV-2 RdRp [66]. To confirm whether 3′-dA can evade exonuclease activity, the exonuclease resistance of Cordycepin-terminated RNA was further evaluated at different incubation times. Adenosine- and Cordycepin-terminated RNAs were treated with exonuclease for 0, 5, 10 or 15 min (Figure S2). As incubation time increases, a reduced amount of the Cordycepin-extended intact RNA peak (at ~8168 Da) is observed, while the 14 nucleotide-long RNA fragment peak (at ~4388 Da) becomes increasingly dominant (Figure S2g-j). In contrast, adenosine-terminated RNA was rapidly degraded by exonuclease (Figure S2c-e). Thus, this result indicates that 3′-dA has only some resistance to ExoN.
Figures 3 and S3 present the SARS-CoV-2 exonuclease results for uridine- and uridine analogue-terminated RNAs. The MALDI-TOF MS spectra of RNAs extended with 2′-NH2-2′-dUTP, Zidovudine-TP, Stavudine-TP, Biotin-16-UTP and Biotin-16-dUTP are shown in Figures 3e,h,k and S3b,e. After incubation with the SARS-CoV-2 Nsp14/10 complex, only Zidovudine- and Stavudine-terminated RNAs retain ~40% and 50% of their respective intact RNA peaks (Figure 3i,l). These results indicate that Zidovudine- and Stavudine-terminated RNAs have substantial exonuclease resistance. In the control spectrum of exonuclease-treated uridine-extended RNA (Figure 3c), no intact RNA was observed, with the majority of fragments being 14-15 nucleotides long, indicated by the major peaks at 4411 Da and 4718 Da. After treatment with ExoN, the intact peak for the 2′-NH2-2′-dU-terminated RNA is eliminated, with mainly fragments of 18-21 nucleotides remaining (Figure 3f), indicating very low resistance to ExoN. The Biotin-16-U- and Biotin-16-dU-terminated RNAs (Figure S3c,f) also displayed very low resistance toward ExoN activity, with the intact RNA peak completely eliminated and predominantly 18-21 nucleotide-long fragments remaining.
Figure 3 legend: A mixture of 500 nM template-loop-primer terminated at its 3′ end with either U (a), 2′-NH2-U (d), Zidovudine (AZT) (g) or Stavudine (Sta) (j) (sequences shown in b,e,h,k) and SARS-CoV-2 pre-assembled exonuclease complex (Nsp14/Nsp10) was incubated at 37 °C for 10 min. These intact RNAs (b,e,h,k) and their respective exonuclease reaction products (c,f,i,l) were analyzed by MALDI-TOF MS. The signal intensity was normalized to the highest peak.
The results for guanosine-, ddG-, Cidofovir- and ddC-terminated RNAs are shown in Figure 4. In the presence of SARS-CoV-2 ExoN, ddG-terminated RNA shows the most ExoN resistance with predominantly the intact RNA peak (8167 Da) remaining (Figure 4f). The ddC-terminated RNA displayed a substantial level of ExoN resistance but less than ddG, as indicated by the amount of intact RNA (8147 Da) remaining (Figure 4l). Cidofovir-terminated RNA had moderate resistance to ExoN excision, with even less of the intact RNA peak (8141 Da) remaining (Figure 4i). As expected, the control G-terminated RNA was completely digested by ExoN (Figure 4c).
Figure 4 legend: A mixture of 500 nM of template-loop-primer terminated at its 3′ end with either G (a), ddG (d), Cidofovir (Cid) (g) or ddC (j) (sequences shown in b,e,h,k) and SARS-CoV-2 pre-assembled exonuclease complex (Nsp14/Nsp10) was incubated at 37 °C for 10 min. These intact RNAs (b,e,h,k) and their respective exonuclease reaction products (c,f,i,l) were analyzed by MALDI-TOF MS. The signal intensity was normalized to the highest peak.
Synthetic RNAs terminated with cytosine or one of three cytosine analogues were treated with SARS-CoV-2 exonuclease, and the results are depicted in Figure S4. After treatment with ExoN for 10 min, the intact RNA peaks measured at time 0 (Figure S4e,h,k) were no longer observed for 2′-dC-, 2′-F-2′-dC- and 2′-OMe-C-terminated RNAs (Figure S4f,i,l). Compared to the control result for cytosine-terminated RNA (Figure S4c), these cytosine analogue-terminated RNAs did not demonstrate any notable resistance to exonuclease activity (Figure S4f,i,l).
The above results indicate that the structure of the nucleotide analogue at the 3′ terminus of RNA plays an essential role in its excision by the SARS-CoV-2 exonuclease complex. In Figure 5, the nucleotide analogues analyzed in this study are ranked in ascending order based on their resistance to exonuclease cleavage in our molecular assay. The nucleotide analogues such as 2′-dN, 2′-OMe-N and 2′-F-2′-dN, where N represents any of the nucleobases tested here, as well as the natural ribonucleotides, when present at the 3′ end of RNA, are most easily excised by ExoN. While 2′-NH2-2′-dN and analogues with modifications on the base (Biotin-N and Biotin-2′-dN) provide low resistance to ExoN, 3′-deoxynucleotide analogues (3′-dN and Cidofovir) at the 3′ position of the RNA display moderate ExoN resistance. Most importantly, appreciable resistance to ExoN was observed for ddN-, 3′-N3-2′-dN-, Tenofovir- and Stavudine-terminated RNAs. These results also indicate that the polarity of the modification group on the sugar ring of the nucleotide analogue attached at the 3′ end of the RNA may have an effect on exonuclease cleavage. Nucleotide analogues containing the 2′- and 3′-OH (polar groups) show no resistance towards excision. However, increasing the hydrophobicity at both the 2′ and 3′ positions of the sugar ring (e.g., ddG, Stavudine and Tenofovir) results in the highest resistance to ExoN excision from RNA. The nucleotide analogues displaying higher resistance to exonuclease excision may help guide the further development of nucleotide analogues with particular structural features for potential coronavirus therapeutics. The ATP analogue Tenofovir-DP and the UTP analogues Stavudine-TP and AZT-TP are obligate terminators of the RdRp extension reaction, though they are less efficiently incorporated than their corresponding natural nucleotides [46,47]. As demonstrated in this work, all three molecules resist excision by the SARS-CoV-2 ExoN to a substantial extent.
The dideoxynucleotides also have significant resistance to excision by ExoN. Docking studies performed by Wu et al. indicated that ddA (Didanosine), ddC (Zalcitabine) and Stavudine had comparable binding energies for the active site of SARS-CoV-2 RdRp [68]. The structural modifications identified here will assist in the design of potentially efficient terminators of the RdRp that are also resistant to ExoN. Among the nucleotide analogues we analyzed, even those with substantial resistance to ExoN still showed some excision at longer incubation times. One approach to efficiently inhibit SARS-CoV-2 is the use of combination drug regimens involving distinct RdRp and ExoN inhibitors, as demonstrated in vitro by Wang et al. [41]. Another more challenging option is the development of a single nucleotide analogue that is relatively well incorporated by RdRp and sufficiently resistant to ExoN excision.
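The structural trend described in this section can be summarized as a simple rule of thumb, sketched below in Python. This is a coarse illustration rather than a predictive model: the three-tier rule, the tier labels and the class and function names are assumptions introduced here, the "low" and "no resistance" categories of the text are merged into a single "excised" tier, and the observed tiers are paraphrased from the results reported above. Tenofovir is acyclic and is listed as lacking both hydroxyl groups.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Terminator:
    name: str
    has_2_oh: bool  # 2'-OH present on the sugar
    has_3_oh: bool  # 3'-OH present on the sugar
    observed: str   # coarse tier paraphrased from the text


def predicted_tier(t: Terminator) -> str:
    """Rule of thumb: the 3'-OH matters most, then the 2'-OH."""
    if t.has_3_oh:
        return "excised"   # 3'-OH present: readily removed by ExoN
    if t.has_2_oh:
        return "moderate"  # only the 3'-OH is missing
    return "high"          # both 2'- and 3'-OH are missing


PANEL = [
    Terminator("adenosine", True, True, "excised"),
    Terminator("2'-dC", False, True, "excised"),
    Terminator("Cordycepin (3'-dA)", True, False, "moderate"),
    Terminator("ddA", False, False, "high"),
    Terminator("ddG", False, False, "high"),
    Terminator("Stavudine", False, False, "high"),
    Terminator("Zidovudine (AZT)", False, False, "high"),
    Terminator("Tenofovir", False, False, "high"),
]

# Analogues whose reported tier disagrees with the rule of thumb.
mismatches = [t.name for t in PANEL if predicted_tier(t) != t.observed]
print(mismatches)
```

For this subset of the tested analogues the rule reproduces the reported tiers, but it says nothing about incorporation efficiency by the RdRp, which is the second property an effective inhibitor needs.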

Conflicts of Interest:
The authors declare no conflict of interest.