Article

Bio-Chemoinformatics-Driven Analysis of nsp7 and nsp8 Mutations and Their Effects on Viral Replication Protein Complex Stability

by Bryan John J. Subong and Takeaki Ozawa *
Department of Chemistry, School of Science, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8654, Japan
* Author to whom correspondence should be addressed.
Curr. Issues Mol. Biol. 2024, 46(3), 2598-2619; https://doi.org/10.3390/cimb46030165
Submission received: 20 February 2024 / Revised: 12 March 2024 / Accepted: 14 March 2024 / Published: 18 March 2024
(This article belongs to the Special Issue Predicting Drug Targets Using Bioinformatics Methods)

Abstract

The nonstructural proteins 7 and 8 (nsp7 and nsp8) of SARS-CoV-2 are essential components of the RNA-dependent RNA polymerase (RdRp) protein replication complex. In this study, we surveyed the global mutations of nsp7 and nsp8 in 2022 and 2023 and analyzed their effects on the viral replication protein complex using bio-chemoinformatics. The frequently occurring variants of both nsp7 and nsp8 are single amino acid mutations. The most frequently occurring mutations of nsp7, which include L56F, L71F, S25L, M3I, D77N, V33I and T83I, are predicted to cause destabilizing effects, whereas those of nsp8 are predicted to cause stabilizing effects, with threonine to isoleucine mutations (T89I, T145I, T123I, T148I, T187I) being frequent. A conserved domain database analysis identified critical interaction residues for nsp7 (Lys-7, His-36 and Asn-37) and nsp8 (Lys-58, Pro-183 and Arg-190), which, according to thermodynamic calculations, are prone to destabilization. Based on a computational alanine scan, substitution of Trp-29 and Phe-49 of nsp7 and of Trp-154, Tyr-135 and Phe-15 of nsp8 causes the greatest destabilizing effects on the protein complex, suggesting these residues as possible new target sites. This study provides an intensive analysis of the mutations of nsp7 and nsp8 and their possible implications for viral complex stability.

1. Introduction

SARS-CoV-2, or severe acute respiratory syndrome coronavirus 2, is the causative agent of the COVID-19 pandemic and is the seventh coronavirus known to infect humans [1]. As of 21 March 2023, the World Health Organization has reported 761,071,826 confirmed cases of COVID-19, with 6,879,677 deaths recorded [2,3]. The increasing number of viral infections has become a global problem that has caused tremendous harm to health [4] and to the economy [5,6].
The viral replication complex of SARS-CoV-2 is primarily composed of three nonstructural proteins: nonstructural protein 12 (nsp12), which comprises the A-chain; nonstructural protein 7 (nsp7), which comprises the C-chain; and nonstructural protein 8 (nsp8), which comprises the B- and D-chains [7] (Figure 1). Cryo-EM studies have shown that the SARS-CoV-2 polymerase complex comprises an nsp12 core subunit bound to an nsp7-8 heterodimer and another nsp8 monomer bound to the complex at a site different from that of nsp7-8 [7,8]. These three proteins form the nsp12–nsp7–nsp8 supercomplex of the viral replication protein complex. This viral replication protein complex represents the minimal machinery of the virus that can perform nucleotide polymerization [9]. The viral replication protein complex regulates the replication of the SARS-CoV-2 genome, highlighting its significance in the viral life cycle [10]. This has been further studied as a hotspot for drug targeting, emphasizing the importance of understanding the structure and mechanism of this dynamic assembly [3,11].
Nsp12, a 106-kDa protein, is the main catalytic subunit of the protein complex [12,13]. Its structure comprises an N-terminal nidovirus RNA-dependent RNA polymerase (RdRp)-associated nucleotidyltransferase and a C-terminal right-hand RdRp domain [13]. Owing to its essential role in viral replication, nsp12 has been a primary target for antiviral drug development, such as remdesivir, which is an inhibitor of RdRp polymerases [14], and biologics, such as monoclonal antibodies [15].
The other two proteins, nsp7 and nsp8, have recently been gaining attention owing to their essential role in the viral replication process, since nsp12 alone possesses little activity and requires nsp7 and nsp8 for RNA synthesis activity [16].
Nsp7, a 9-kDa protein that is primarily alpha-helical in structure, functions as a cofactor that binds to nsp12, allowing stabilization of the polymerase domain [17]. Nsp8, a 24-kDa protein comprising both alpha-helical and beta-strands, functions as a co-factor for nsp12 binding and is critically important in extending the template RNA-binding surface. The N-terminal regions are hypothesized to serve as molecular handles during the recruitment of additional viral factors and organization of the viral replication complex [18].
The nsp7–nsp8 complex is responsible for the binding of RNA. It also gives RNA binding capabilities to nsp12 [19]. The activity of nsp12 has also been demonstrated to be regulated by nsp7–nsp8. Nsp7 mutations such as F49A, M52A, L56A, triple mutation of F49A, M52A and L56A, C8G and V11A and nsp8 mutations such as F92A, M90A, M94A have been shown to decrease RdRp activity [10], highlighting the essential roles of these two proteins in the viral replication process. Moreover, nsp7–nsp8 is mostly conserved among different coronaviruses [20] making them potential targets for antiviral drug development owing to their important functions and high conservation.
Several studies have demonstrated that in vitro mutation of viral replication complex proteins in other coronaviruses can sometimes lead to folding defects that affect their function [18] and can sometimes lead to delayed virus growth [21]. These earlier studies collectively underscore the critical importance of nsp7 and nsp8 in viral replication and the impact of other mutations on the viral replication protein complex functionality among coronaviruses, particularly SARS-CoV-2.
Bio-chemoinformatics combines computational tools from bioinformatics and chemoinformatics [22]. The bioinformatics techniques include sequence assembly and multiple sequence alignment [23] and genomics and proteomics annotation [24], whereas the chemoinformatics techniques include in silico methods such as mutation analysis and protein stability analysis [25,26]. Using these computational tools, researchers can gain insights and pivotal knowledge to understand and explain various biological phenomena on a global scale.
In the present study, we analyzed the global mutations of nsp7 and nsp8 from protein sequence data retrieved in May 2022 and April 2023. We further explored the effects of these mutations on viral replication protein complex stability using computational bio-chemoinformatic analysis. To investigate these mutational effects in more depth, we identified critical interaction residues, compared the bio-chemoinformatic predictions with wet lab experimental results and performed a computational alanine scan (Supplementary Figure S1). This study is of great importance in understanding the current state of viral protein evolution and how this might affect viral replication mechanisms. Additionally, studying mutations in these proteins can provide new insights into amino acid residues that could be targeted to destabilize the viral replication machinery.

2. Materials and Methods

2.1. Sequence Mining of Human Isolates and Sequence Alignment

Global protein sequences of SARS-CoV-2 isolated from humans (Homo sapiens) were retrieved from NCBI Beta (https://www.ncbi.nlm.nih.gov/datasets/taxonomy/2697049/) on 27 May 2022 and 17 April 2023. The 2022 dataset had 1,783,299 nsp7 sequences and 1,783,229 nsp8 sequences, and the 2023 dataset had 6,895,947 nsp7 sequences and 6,895,889 nsp8 sequences. The reference sequence for the native nsp7 protein sequence is NCBI Reference Sequence: YP_009725303.1, and the reference sequence for the native nsp8 protein sequence is NCBI Reference Sequence: YP_009725304.1.
The protein sequences were then extracted based on unique sequences and aligned using the default settings of the Geneious alignment method using Geneious Prime software (version 2022.2, Biomatters Ltd., Auckland, New Zealand). Samples identified with unique sequences other than the native protein sequence were referred to as variants. Mutations in the protein sequence were then analyzed further for protein structure analysis.
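The variant-calling step described above can be illustrated with a minimal Python sketch. It is not the Geneious Prime workflow used in this study: it assumes that sequences are already aligned to the reference without insertions or deletions, and the function names and the toy six-residue sequences are hypothetical.

```python
from collections import Counter


def call_substitutions(reference: str, sequence: str) -> list[str]:
    """Return labels such as 'L3F' for every position that differs from the reference.

    Assumes both sequences are pre-aligned and of equal length (no indels).
    """
    if len(reference) != len(sequence):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{ref}{pos}{alt}"
        for pos, (ref, alt) in enumerate(zip(reference, sequence), start=1)
        if ref != alt
    ]


def variant_frequencies(reference: str, sequences: list[str]) -> dict[str, float]:
    """Map each unique non-native sequence (by mutation label) to its percentage frequency."""
    counts = Counter(sequences)          # collapse identical sequences
    total = sum(counts.values())
    frequencies = {}
    for seq, n in counts.items():
        mutations = call_substitutions(reference, seq)
        if mutations:                    # skip the native sequence
            frequencies[",".join(mutations)] = 100.0 * n / total
    return frequencies


# Hypothetical toy data, not real nsp7/nsp8 sequences.
reference = "MKLSDV"
sequences = ["MKLSDV"] * 7 + ["MKFSDV"] * 2 + ["MKLSDX"]
print(variant_frequencies(reference, sequences))
```

An ambiguous residue reported as X is treated here like any other mismatch, which mirrors how variants such as S76X are labeled in the Results.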

2.2. Conserved Domain Analysis

NCBI Conserved Domain Database (CDD) analysis [27] was performed using the native sequences of nsp7 and nsp8. The nsp7 protein sequence was queried against CDD v3.20–59,693 PSSMs with an expect value threshold of 0.010000, composition-based statistics adjustment, a concise result mode and a maximum number of hits of 500. The nsp8 protein sequence was queried against CDD v3.20–59,693 PSSMs with an expect value threshold of 0.010000, composition-based statistics adjustment, a standard result mode and a maximum number of hits of 500. Critical interaction residues were then generated for nsp7 and nsp8 (Supplementary Dataset S1).

2.3. Protein Structural Modeling and Stability Analysis

The RNA-dependent RNA polymerase (RdRp) protein replication complex or viral replication protein complex model containing native sequences of nsp12 (NCBI Reference Sequence: YP_009725307.1), nsp7 and nsp8 (Supplementary Dataset S2) was built using the Robetta server, http://robetta.bakerlab.org (accessed on 1 November 2023), comparative modeling [28] using the reference modeling structure, PDB: 8GWE [29] as a template. The protein sequence query versus PDB template generated a very high sequence identity of 0.98. The comparative model was built from a template structure and aligned using HHSEARCH, SPARKS and Raptor. Loop regions were assembled from fragments and optimized to fit the aligned template structure. The structures generated were superimposed into partial threads before hybrid sampling. The modeled structure was then prepared (e.g., assigning bonds and protonation, fixing structural defects) [9] and refined in ChimeraX [30,31] before protein stability analysis.
Amino acid mutations in nsp7 and nsp8 were then subjected to computational analysis to predict their effects on the stability of the viral replication complex. Alanine scanning was performed by substituting each amino acid residue with alanine. Gibbs free energy changes (ΔΔG) were estimated using DDMut, https://biosig.lab.uq.edu.au/ddmut (last accessed on 7 March 2024), a deep learning model that captures relationships through its neural network architecture based on published experimental ΔΔG values [32]. The ΔΔG values allow mutation effects on the protein structure to be classified as either thermodynamically stabilizing or destabilizing. The ΔΔG change of protein stability is usually defined as follows:
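$$\Delta\Delta G = \Delta G_{\text{mutant}} - \Delta G_{\text{wild-type}}$$
where
$$\Delta G_{\text{mutant}} = \left(G_{\text{unfolded}} - G_{\text{folded}}\right)_{\text{mutant}}$$
$$\Delta G_{\text{wild-type}} = \left(G_{\text{unfolded}} - G_{\text{folded}}\right)_{\text{wild-type}}.$$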
In such computational analysis, a ΔΔG < 0 kcal/mol is described as a destabilizing mutation, while a ΔΔG > 0 kcal/mol is a stabilizing mutation.
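The sign convention above can be made explicit with a short Python sketch. The function names and the optional tolerance parameter are hypothetical and are not part of DDMut; the example values are ΔΔG predictions reported in the Results (L56F and S26F of nsp7, and K7M at a critical interaction residue).

```python
def ddg(dg_mutant: float, dg_wild_type: float) -> float:
    """Return the stability change: ΔΔG = ΔG(mutant) − ΔG(wild-type), in kcal/mol."""
    return dg_mutant - dg_wild_type


def classify_ddg(value: float, tolerance: float = 0.0) -> str:
    """Classify a ΔΔG value using the convention of this study.

    ΔΔG > 0 is stabilizing, ΔΔG < 0 is destabilizing, and exactly zero
    (or within the hypothetical tolerance) is treated as neutral.
    """
    if value > tolerance:
        return "stabilizing"
    if value < -tolerance:
        return "destabilizing"
    return "neutral"


for label, value in [("L56F", -1.16), ("S26F", 0.39), ("K7M", 0.00)]:
    print(label, value, classify_ddg(value))
```

With the default tolerance of zero, only a prediction of exactly 0.00 kcal/mol is labeled neutral; a nonzero tolerance would be a modeling choice that this study does not specify.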

2.4. Data Visualization and Statistical Analysis

Statistical analyses (e.g., mean, standard error, 95% confidence interval, violin plot) were performed using DataTab 2024 (e.U. Graz, Austria), https://datatab.net (last accessed on 1 January 2024) [33].
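For readers who wish to reproduce the summary statistics without DataTab, the following Python sketch computes a mean, standard error and confidence interval using only the standard library. It assumes a normal approximation for the interval, which may differ from the method DataTab applies, and the function name is hypothetical. The three example values are the predicted ΔΔG values of the nsp7 mutations F49A, M52A and L56A reported in the Results; pooling them into one sample is purely illustrative.

```python
import math
import statistics


def summarize(values: list[float], confidence: float = 0.95) -> dict:
    """Return the mean, standard error and a normal-approximation confidence interval."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed to estimate the standard error")
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(n)   # sample SD divided by sqrt(n)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return {"mean": mean, "se": se, "ci": (mean - z * se, mean + z * se)}


summary = summarize([-2.99, -2.12, -3.09])
print(summary)
```

For small samples such as this one, a t-distribution would give a wider interval than the normal approximation used here.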

3. Results

3.1. Global Variation in nsp7 and nsp8 Protein Sequences

The 2022 samples (1,783,299 nsp7 sequences and 1,783,229 nsp8 sequences) generated 506 variants for nsp7 and 1582 variants for nsp8. The 2023 samples (6,895,947 nsp7 sequences and 6,895,889 nsp8 sequences) generated 4537 variants for nsp7 and 14,992 variants for nsp8 (Supplementary Dataset S3).
The protein sequence distribution for nsp7 showed that 98% of the nsp7 sequences had the native protein sequence and 2% had a variant sequence in 2022, whereas in 2023, 97% had the native protein sequence and 3% had a variant sequence (Figure 2).
A single amino acid mutation was the dominant type of mutation among the ten most frequently occurring variants for both 2022 and 2023 (Figure 2). In 2022, the frequency of the protein variants was as follows: L71F (0.19%), S25L (0.17%), D77N (0.1%), M75I (0.09%), T81I (0.08%), M3I (0.07%), Q63R (0.07%), V33I (0.06%), S26F (0.06%), and L56F (0.04%). In 2023, the frequency of the protein variants was as follows: D77N (1.43%), L71F (1.05%), shorter amino acid sequence containing several Xs amino acids (denoted as * in Figure 2) (0.56%), protein sequence with multiple ambiguous protein sequence (Xs, denoted as ** in Figure 2) (0.54%), S25L (0.34%), S26F (0.32%), Q63R (0.25%), M75I (0.15%), T81I (0.12%) and another protein sequence with multiple ambiguous protein sequences (Xs, denoted as *** in Figure 2) (0.10%).
The protein sequence distribution for nsp8 showed that 93% of the nsp8 sequences have the native protein sequence and the remaining 7% as variants in 2022 (Figure 3), whereas in 2023, 91% have the native protein sequence and 9% have the variant sequence (Figure 3).
Similar to the nsp7 protein sequence analysis, a single amino acid mutation was the dominant mutation among the ten most frequently occurring protein sequence variants for both 2022 and 2023 for nsp8 (Figure 3). In 2022, the frequency of variants was as follows: Q24R (2.46%), T145I (1.17%), T141M (0.26%), T148I (0.17%), S76X (0.16%), T89I (0.14%), T123I (0.12%), P133S (0.09%), T187I (0.08%) and Q24H (0.07%). In 2023, the frequency of protein variants was as follows: S76X (1.43%), Q24R (1.05%), T145I (0.56%), P121X and L122X (0.54%), N118S (0.34%), T141M (0.32%), L122X (0.25%), T148I (0.15%), T187I (0.12%) and T123I (0.10%). The mutation of the amino acid threonine into isoleucine was the most frequently occurring amino acid mutation for both 2022 and 2023 (Figure 3).

3.2. Predicted Protein Stability of the Most Frequently Occurring Protein Variants

To determine the effects of the mutations on the viral replication protein complex, calculation of the ΔΔG of the mutant protein was performed. The effects of nsp7 mutation on the C-chain of the viral replication complex were modeled (Table 1). Among the most frequently occurring nsp7 mutations, seven were found to cause destabilizing effects, whereas three were found to cause stabilizing effects. L56F and L71F had the greatest destabilizing effect (−1.16 kcal/mol; −1.13 kcal/mol), whereas S26F and M75I (0.39 and 0.35 kcal/mol) had the greatest stabilizing effect on the viral replication complex.
For the effects of nsp8, we explored the effects of the mutation on each of the B and D chains and its overall effect (B, D mutation) (Table 2). Among the most frequently occurring single amino acid nsp8 mutations (excluding those containing ambiguous amino acid), nine mutations were found to cause overall stabilizing effects on the viral replication complex, with one mutation (P133S) causing destabilizing effects (−3.34 kcal/mol).
In the case of the P133S variant, the native Pro-133 residue stabilized the structure by forming several inter-chain and intra-chain interactions (Figure 4). Pro-133 (B-chain) forms an inter-chain H-bond with Lys-391 of the A-chain (nsp12). Several hydrophobic interactions occur between Pro-133 (B-chain) and residues such as Trp-182 (B-chain), Arg-392 (A-chain) and Lys-391 (A-chain). However, in the variant sequence, Ser-133 forms only one intra-chain H-bond with Trp-182 (B-chain). Unlike Pro-133, Ser-133 does not form hydrophobic interactions; rather, Ser-133 forms a weaker van der Waals interaction with Trp-182 (B-chain) and Ser-133 (B-chain). In the D-chain, Pro-133 forms three intra-chain H-bonds with residues Gly-113, Trp-182 and Val-131. However, the variant Ser-133 (D-chain) only forms one H-bond with the residue Trp-182. Pro-133 was more stable than Ser-133 because of the multiple H-bonds and hydrophobic interactions it formed with its neighboring atomic environment.
A single amino acid substitution of threonine to isoleucine is a common mutation in nsp8. Analysis showed that these mutations stabilize the viral replication protein complex. We observed that during this mutation, an increased number of non-covalent interactions, in particular hydrophobic interactions, occurred. In T145I, the native sequence Thr-145 can form a polar interaction with Ile-156 in the B-chain. With a mutation to Ile-145, two polar interactions occur between Ile-145 and Asp-143. In the case of the T148I variant, additional hydrophobic interactions occur between the variant, Ile-148, and Leu-153 in the D-chain. In Thr-148 (D-chain), no hydrophobic interactions were observed, whereas the mutant Ile-148 formed a hydrophobic interaction with Leu-76 of nsp7 (C-chain). In T123I (B-chain), Thr-123 formed only one hydrophobic interaction with Ile-270 (A-chain, nsp12). Upon mutation to Ile-123, inter- and intra-chain hydrophobic interactions occur with Leu-270 (A-chain), Ile-119 (B-chain) and Ile-106 (B-chain). In the T187I (B-chain), Thr-187 and Lys-127 form hydrophobic interactions. The variant Ile-187 forms hydrophobic interactions with Lys-127, Met-137 and Ile-185 (B-chain) (Supplementary Dataset S4).
Because other frequently occurring mutations such as S76X and L122X contained ambiguous amino acid sequences, we simulated all possible 19 amino acid mutations that might occur for the variants. Among the possible mutations at the 76th amino acid residue position of nsp8, an S76P mutation would render the most destabilizing effect (−1.39 kcal/mol), whereas an S76Y mutation would cause a stabilizing effect (2.27 kcal/mol) (Table 3). At the 122nd position of nsp8, an L122G mutation would render the most destabilizing effect (−2.42 kcal/mol), whereas an L122W mutation (0.95 kcal/mol) would cause a stabilizing effect (Table 4).
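The enumeration of candidate substitutions at an ambiguous position can be sketched in Python as follows. The helper names are hypothetical; the predicted ΔΔG values would come from a stability predictor such as DDMut, and only the two S76 values reported above are used as example input.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def possible_substitutions(wild_type: str, position: int) -> list[str]:
    """List the 19 single substitutions that an ambiguous residue (X) could represent."""
    return [f"{wild_type}{position}{aa}" for aa in AMINO_ACIDS if aa != wild_type]


def extremes(scores: dict[str, float]) -> tuple[str, str]:
    """Return the (most destabilizing, most stabilizing) mutation labels by ΔΔG."""
    return min(scores, key=scores.get), max(scores, key=scores.get)


candidates = possible_substitutions("S", 76)
print(len(candidates), candidates[:3])

# Only the two S76 values reported in the text; a full analysis would score all 19.
print(extremes({"S76P": -1.39, "S76Y": 2.27}))
```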
One of the most frequently occurring potential variants of nsp8 is a potential double mutation at the two amino acid positions, 121st and 122nd. To predict the potential effect of double amino acid substitutions on these sites, we performed various possible amino acid substitutions via permutations with repetition (Supplementary Dataset S5). Table 5 shows that a double substitution to glycine (P121G,L122G) causes the greatest destabilization (−4.04 kcal/mol), followed by P121D,L122G; P121T,L122G; and P121S,L122G (−3.98, −3.91 and −3.82 kcal/mol).
On the other hand, a mutation at only the 121st amino acid position from proline (P) to glutamic acid (E), P121E, will cause the greatest stabilizing effect (1.65 kcal/mol). This is followed by a mutation to glutamine (Q), P121Q (1.52 kcal/mol), and a double mutation of P121E and L122F (1.19 kcal/mol).
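Permutations with repetition over the two sites can be generated with `itertools.product`, as in the minimal Python sketch below. The function name is hypothetical, and the choice to drop only the wild-type pair (so that single-site changes such as P121E are retained) is an assumption about how the combinations in Supplementary Dataset S5 were enumerated.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def multi_site_variants(sites: list[tuple[str, int]]) -> list[str]:
    """Enumerate every residue combination at the given (wild_type, position) sites.

    The unchanged wild-type combination is skipped; positions that keep their
    wild-type residue are omitted from the label.
    """
    wild_type = tuple(residue for residue, _ in sites)
    variants = []
    for combo in product(AMINO_ACIDS, repeat=len(sites)):
        if combo == wild_type:
            continue
        label = ",".join(
            f"{residue}{position}{new}"
            for (residue, position), new in zip(sites, combo)
            if new != residue
        )
        variants.append(label)
    return variants


variants = multi_site_variants([("P", 121), ("L", 122)])
print(len(variants))
```

Each label can then be submitted to a stability predictor and ranked by its ΔΔG value.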

3.3. Mutation Effects on Critical Amino Acid Positions of nsp7 and nsp8

To further explore amino acid residues that might be critical in nsp7 and nsp8 for protein interactions, a conserved domain database (CDD) analysis was performed. CDD analysis allows identification and characterization of amino acid residues within a protein sequence that are structurally and evolutionarily conserved across different virus species. Protein homologues across different species of related viruses were used for protein alignment (Supplementary Figure S2).
The alignment for nsp7 used 27 protein sequences, comprising nsp7 and its homologs across different species, whereas the alignment for nsp8 comprised 30 protein sequences across different species. CDD analysis revealed conservation of three amino acid residues each for nsp7 and nsp8. The critical interaction residues for nsp7 are Lys-7, His-36 and Asn-37, and the critical interaction residues for nsp8 are Lys-58, Pro-183 and Arg-190.
To further understand the potential stability changes upon mutation of these critical interaction residues, we examined the changes in the ΔΔG energy upon substitution to any of the remaining 19 amino acids. In the case of nsp7 (Figure 5, Supplementary Dataset S6), the mutation of Lys (K) at the 7th amino acid position to D (−2.55 kcal/mol), P (−2.4 kcal/mol) and G (−1.87 kcal/mol) amino acids would cause the most destabilizing effect, whereas the mutation to any of the three amino acids: L (0.03 kcal/mol), F (0.02 kcal/mol) and I (0.01 kcal/mol) would cause a slightly stabilizing effect, with a mutation to M (0.00 kcal/mol) causing a neutral mutation. The mutation at the 36th position from amino acid H to R (−1.69 kcal/mol), G (−1.59 kcal/mol) and Q (−1.18 kcal/mol) would cause the greatest destabilizing effects. Mutation to any of the seven amino acids, namely Y (0.83 kcal/mol), L (0.62 kcal/mol), F (0.39 kcal/mol), C (0.24 kcal/mol), E (0.1 kcal/mol), V (0.04 kcal/mol) and I (0.01 kcal/mol), would cause stabilizing effects. The mutation at the 37th position from amino acid N to P (−2.36 kcal/mol), G (−1.59 kcal/mol) and K (−1.49 kcal/mol) would cause the most destabilizing effects, whereas mutation to any of the six amino acids Y (0.91 kcal/mol), I (0.33 kcal/mol), L (0.26 kcal/mol), C (0.19 kcal/mol), F (0.12 kcal/mol) and M (0.01 kcal/mol) could cause stabilizing effects. Our analysis showed that thermodynamically, most amino acid substitutions of these three critical interaction residues render destabilizing effects.
In the case of nsp8 (Figure 6, Supplementary Dataset S7), mutation of Lys-58 would render a mostly destabilizing effect especially with amino acids H (−1.55 kcal/mol), Q (−1.49 kcal/mol) and G (−1.2 kcal/mol). Stabilizing effects were rendered with mutations to I (0.27 kcal/mol), L (0.18 kcal/mol), C (0.17 kcal/mol), A (0.12 kcal/mol), V (0.08 kcal/mol) and R (0.05 kcal/mol).
A mutation at Pro-183 (P) would mostly cause a destabilizing effect, especially with the mutations to Y (−1.35 kcal/mol), F (−1.32 kcal/mol), D (−1.16 kcal/mol), E (−1.15 kcal/mol) and M (−1.09 kcal/mol). On the other hand, stabilizing effects were observed with mutations to C (0.93 kcal/mol), V (0.49 kcal/mol) and I (0.17 kcal/mol). Mutation of Arg-190 to any amino acid other than L (0.80 kcal/mol) would have a destabilizing effect on the viral replication protein complex.

3.4. Comparison of Bio-Chemoinformatic Calculations and Predictions with Wet Lab Experimental Results

To confirm the reliability of bio-chemoinformatic calculations and predictions and to gain understanding of their biological significance, we simulated known mutations of nsp7 and nsp8 based on wet lab experiments reported by Biswal, 2021 [10].
Reported mutations of nsp7 which have been shown to decrease RdRp efficiency include F49A, M52A, L56A, the triple mutation of F49A, M52A and L56A, C8G and V11A. In the case of F49A, M52A and L56A, experimental evidence has shown that the triple mutation disrupted RdRp efficiency more strongly than the individual mutations.
Table 6 shows that the destabilizing mutations for nsp7 based on the wet lab experimental results are in agreement with our bio-chemoinformatics analysis. The triple mutation of F49A, M52A and L56A (−3.46 kcal/mol) was predicted to be more destabilizing than the individual mutations: F49A (−2.99 kcal/mol), M52A (−2.12 kcal/mol) and L56A (−3.09 kcal/mol).
The nsp7 mutation N37V was reported to have no detrimental effect on the nsp7–nsp8 complex but to decrease RdRp activity when it was part of the viral replication protein complex. In this regard, we modeled three situations: (1) mutation of nsp7 N37V in the nsp7–nsp8 dimer complex (PDB: 6YHU), (2) mutation of nsp7 N37V in the nsp7–nsp8 heterotetrameric complex using the X-ray crystal structure from the wet lab experiments (PDB: 7JLT) and (3) mutation of nsp7 N37V as part of the viral replication protein complex.
Bio-chemoinformatic analysis showed that N37V has no detrimental effect (a stabilizing or neutral effect) on either the nsp7–nsp8 dimer complex (0.13 kcal/mol) or the nsp7–nsp8 heterotetramer complex (0.22 kcal/mol) but has a destabilizing effect (−0.15 kcal/mol) when introduced into the viral replication complex, consistent with the reported reduction in RdRp efficiency.
To further confirm biological significance of the bio-chemoinformatic calculations and predictions (Table 7), we simulated the nsp8 mutations based on reported wet lab experiments. Experimental evidence has shown that F92A, M90A and M94A have destabilizing effects on the RdRp efficiency [10].
Table 7 shows that our bio-chemoinformatic analyses are in agreement with the observed experimental results in which destabilizing effects were observed. The destabilizing effects were: F92A (−3.06 kcal/mol), M90A (−1.39 kcal/mol) and M94A (−1.94 kcal/mol).
Overall, our bio-chemoinformatic analyses are in good agreement with the wet lab experimental results. Moreover, mutations predicted to be stabilizing correspond to a neutral effect, no detrimental effect or possibly some improvement in RdRp efficiency, whereas mutations predicted to be destabilizing correspond to a decrease in RdRp efficiency.

3.5. Individual Amino Acid Residue Contributions to Protein Complex Stability

To further explore the contributions of each amino acid residue to the stability of the viral replication complex, we conducted an alanine scan on each of the protein chains. Each of the non-alanine amino acid residues was mutated to alanine, and our analysis showed that most of the amino acid residue sites of the viral replication complex are prone to destabilization or are thermodynamic hotspots, whereas some portions are neutral or stabilizing sites. In total, 84.1% of the nsp12 protein, 76.5% of the nsp8: B-chain, 82.4% of the nsp8: D-chain, and 80.8% of the C-chain are prone to destabilization upon alanine mutation. A simultaneous alanine scan of nsp8 at both the B-chain and D-chain revealed that 48% of the total amino acid residues were prone to destabilization (Supplementary Dataset S8).
On the other hand, the percentages of amino acid residues that would render stabilization upon alanine mutation are: 15.9% of nsp12 (A-chain), 23.5% of nsp8 (B-chain), 17.7% of nsp8 (D-chain) and 19.2% of nsp7 (C-chain). Simultaneous alanine scans at both the B-chain and D-chain showed that 42% of the total amino acid residues were neutral sites (Supplementary Dataset S8).
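The bookkeeping behind an alanine scan can be sketched in Python as follows. The function names and the toy sequence and ΔΔG values are hypothetical; in this study the ΔΔG of each substitution was predicted with DDMut, which the sketch does not call.

```python
def alanine_scan_mutations(sequence: str, start: int = 1) -> list[str]:
    """List one X-to-Ala substitution for every non-alanine residue in the chain."""
    return [
        f"{residue}{position}A"
        for position, residue in enumerate(sequence, start=start)
        if residue != "A"
    ]


def percent_by_class(ddg_values: list[float]) -> dict[str, float]:
    """Percentage of scanned sites that are destabilizing (<0), stabilizing (>0) or neutral (0)."""
    n = len(ddg_values)
    if n == 0:
        raise ValueError("no ΔΔG values supplied")
    return {
        "destabilizing": 100.0 * sum(v < 0 for v in ddg_values) / n,
        "stabilizing": 100.0 * sum(v > 0 for v in ddg_values) / n,
        "neutral": 100.0 * sum(v == 0 for v in ddg_values) / n,
    }


# Hypothetical toy chain and predictions, not taken from the supplementary data.
print(alanine_scan_mutations("MAKW"))
print(percent_by_class([-1.0, -0.5, 0.2, 0.0]))
```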
We further examined the overall contribution of the amino acids of nsp7 and nsp8 to the overall stability of the protein complex. In the case of nsp7 (C-chain), the amino acids W (−3.21 kcal/mol), F (−2.93 kcal/mol), L (−2.56 ± 0.58 kcal/mol) and I (−2.22 ± 0.21 kcal/mol) demonstrated the greatest destabilizing effect (<−2.0 kcal/mol average) during alanine substitution (Figure 7A; Supplementary Dataset S9). Nsp7 has only one tryptophan and one phenylalanine, Trp-29 and Phe-49. Trp-29 contributes to the stability of the protein complex through interchain interactions between the A-chain and C-chains. An important interaction is H-bonding with Gln-444 of nsp12 (A-chain) and Val-410 of nsp12 (A-chain). Furthermore, Phe-49 of nsp7 forms several H-bonds with neighboring amino acids in the C-chain, which helps stabilize the complex. H-bonding occurs between Phe-49 and amino acid residues such as Met-52, Val-53, Thr-45 and Thr-46 within the C-chain (Figure 8).
Examining nsp8 (B-chain), the amino acids L (−2.33 ± 0.41 kcal/mol), Y (−2.18 ± 0.39 kcal/mol), W (−2.12 ± 0.45 kcal/mol) and I (−2.08 ± 0.76 kcal/mol) exhibited the greatest destabilizing effect when substituted with alanine (Figure 7B; Supplementary Dataset S9). An alanine scan of nsp8 at the D-chain showed the greatest average destabilizing effect when the amino acids Y (−2.79 ± 0.38 kcal/mol), I (−2.43 ± 0.5 kcal/mol), L (−2.26 ± 0.7 kcal/mol), F (−2.16 ± 0.83 kcal/mol) and W (−2.03 ± 0.54 kcal/mol) were substituted with alanine (Figure 7C).
Simultaneous alanine scan of nsp8 at both the B-chain and D-chain showed Y (−3.21 ± 0.63 kcal/mol), F (−3.08 ± 0.64 kcal/mol) and W (−3.01 ± 2.05 kcal/mol) with the greatest average destabilizing effect (Figure 7D). The average value for both chains was higher than the average values for each of the individual chains. Moreover, average values for the mutational effect for both chains showed a stabilizing effect for amino acids such as G (0 ± 0.11 kcal/mol), Q (0.12 ± 0.25 kcal/mol), D (0.13 ± 0.52 kcal/mol), N (0.39 ± 0.43 kcal/mol), K (0.47 ± 0.34 kcal/mol) and S (0.52 ± 0.5 kcal/mol). These positive average stabilizing effect values were not observed for these amino acids for each of the individual B and D chains (Supplementary Dataset S9).
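Per-amino-acid averages such as those above can be obtained by grouping alanine-scan results by the wild-type residue, as in the Python sketch below. The function name and toy values are hypothetical, and the sketch reports the sample standard deviation as the dispersion measure, which is an assumption because the text does not state what the ± values denote.

```python
import statistics
from collections import defaultdict


def mean_sd_by_residue(scan: dict[str, float]) -> dict[str, tuple[float, float]]:
    """Group ΔΔG values by wild-type residue (first letter of labels such as 'W154A').

    Returns (mean, sample SD) per residue type; the SD is 0.0 when only one
    site of that type exists.
    """
    grouped = defaultdict(list)
    for label, value in scan.items():
        grouped[label[0]].append(value)
    return {
        residue: (
            statistics.fmean(values),
            statistics.stdev(values) if len(values) > 1 else 0.0,
        )
        for residue, values in grouped.items()
    }


# Hypothetical toy scan results in kcal/mol.
result = mean_sd_by_residue({"W10A": -3.0, "W20A": -1.0, "G5A": 0.5})
print(result)
```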
Three amino acid residues of nsp8 showed the greatest destabilizing effect (<−4.0 kcal/mol) in both the B-chain and D-chain. These are Trp-154 (−4.46 kcal/mol), Tyr-135 (−4.17 kcal/mol) and Phe-15 (−4.1 kcal/mol) (Supplementary Dataset S9).
Destabilizing effects were mostly caused by the disruption of the H-bonding that forms in their respective atomic environments. Trp-154 forms four intra-chain H-bonds with Phe-147, Leu-189, Tyr-149 and Ala-126. In the D-chain, it forms two H-bonds with Phe-147 and Tyr-149. Tyr-135 forms three H-bonds in the B-chain, namely with Lys-139, Tyr-138 and Ile-172, and two H-bonds in the D-chain, with Tyr-138 and Lys-139. Aside from several intra-chain H-bonds in the B-chain with amino acid residues Ala-18, Gln-19, Ser-11 and Tyr-12, the native amino acid residue Phe-15 forms an aromatic interaction with a neighboring Phe-49. The same aromatic interaction with Phe-49 also occurs in the D-chain of the viral complex. Phe-15 also forms two H-bonds with Tyr-12 and Ser-11 (Supplementary Dataset S10).

4. Discussion

A global study of mutations in viral replication is important to understand viral evolution and drug resistance [34], disease pathogenesis [35] and the development of antiviral strategies [36]. These studies provide insights and knowledge into the molecular mechanisms underlying viral replication at the population level and offer a foundation for the development of targeted therapeutic interventions and the design of novel antiviral agents [36,37].
In this regard, we analyzed the nsp7 and nsp8 protein sequences available from 2022 to 2023 at NCBI. Our analysis of global mutations for 2022 and 2023 showed that more than 90% of the global protein sequences conserve the native protein sequences. A prior study in 2021–2022 also reported similar findings on the percentage of native protein sequences for nsp7 and nsp8 [38], although the study did not further investigate the effect of these mutations on the viral replication protein complex.
In 2021, only S25L (1.70%) and S26F (0.28%) have percentage frequencies of occurrence greater than 0.10% for nsp7. The remaining mutations were at 0.01–0.02%. For nsp8, only M129I (0.35%) and I156V (0.33%) have frequencies greater than 0.10%, with the remaining variants in the frequency range of 0.01–0.06% [9]. Our recent data for 2022 and 2023 show that the percentage frequency distribution of variants for nsp7 did not exceed 0.20% (Figure 2). S25L and S26F, along with D77N and L71F, are the most frequently occurring variants of nsp7 for 2022 and 2023. For nsp8, M129I and I156V are not in the 10 most frequently occurring variants for 2022–2023. Meanwhile, S76X and Q24R are the two most frequently occurring variants, with percentage frequencies greater than 1% for 2023. Out of the ten most frequently occurring variants (Table 2), only the P133S mutation rendered a stabilizing effect, whereas the remaining variants rendered a stabilizing effect. We also simulated possible mutations for the most frequently occurring mutations in nsp8, which contains one and two ambiguous amino acid sequences. Ambiguous amino acid sequences often arise due to low quality or poor sequencing data [39], degenerate genetic codes in which multiple codons may code for the same amino acid [40,41], and genetic variations such as insertions, deletions or mutations [42,43]. In S76X, mutation to P, G, N and D amino acids would have destabilizing effects, whereas any other amino acids would have neutral or stabilizing effects. In L122X, a mutation to glycine causes the greatest destabilizing effect. The same effect of glycine substitution was observed in the two amino acid substitutions, P121X and L122X. 
In the P121X, L122X variant, mutation of residue 122 to glycine combined with mutation of Pro-121 to G, D, T, S or N would cause the greatest destabilization, whereas stabilizing effects occurred when residue 122 was unchanged and residue 121 was mutated to E or Q. For the double mutation of P121X and L122X, DDMut has been tested for high accuracy for two simultaneous mutations. As a future study, we recommend comparing the P121X, L122X predictions with other in silico analyses for two simultaneous mutations. In summary, most of the frequently occurring mutations for nsp7 are predicted to cause a destabilizing effect, whereas mutations for nsp8 would render a stabilizing effect on the viral replication protein complex.
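Resolving an ambiguous residue (X) into concrete candidates amounts to enumerating the 20 standard amino acids at each ambiguous position. The sketch below illustrates that enumeration for S76X and for the P121X, L122X pair; the function name and the mutation-label format are assumptions for illustration and do not reproduce the input format of DDMut or of any other predictor.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def expand_ambiguous(sites: dict[int, str]) -> list[tuple[str, ...]]:
    """Enumerate substitution sets for ambiguous positions.

    `sites` maps a 1-based position to its native residue. A position that
    keeps its native residue contributes no label, and the all-native
    combination is skipped because it is not a mutation.
    """
    positions = sorted(sites)
    combos = []
    for choice in product(AMINO_ACIDS, repeat=len(positions)):
        muts = tuple(
            f"{sites[pos]}{pos}{aa}"
            for pos, aa in zip(positions, choice)
            if aa != sites[pos]
        )
        if muts:
            combos.append(muts)
    return combos


single = expand_ambiguous({76: "S"})             # candidates for S76X
double = expand_ambiguous({121: "P", 122: "L"})  # candidates for P121X, L122X
```

Each resulting tuple can then be passed to a stability predictor of choice. The double-site list also contains the cases where only one of the two positions changes, which matches the scenario above in which residue 122 is unchanged.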
Mutation of threonine to isoleucine at different positions was notable in the most frequently occurring variants of nsp8 (Table 2). Our analysis showed that mutations from threonine to isoleucine would have an overall stabilizing effect on the viral replication complex. The substitution of threonine with isoleucine can often influence protein stability through hydrophobic interactions, hydrogen bonding and side chain packing [44,45]. Hydrophobic isoleucine can enhance hydrophobic interactions within the protein core, which contribute to stability [44]. The substitution can increase thermal stability and hydrophobicity through improved internal packing and additional hydrophobic interactions [46]. The role of hydrophobic interactions was noted in our analysis of nsp8, in which an increased number of hydrophobic interactions was observed for the mutants when threonine was substituted with isoleucine. In proteins, such as the villin headpiece subdomain, conformation is mainly stabilized through hydrophobic interactions [47].
In the context of viral protein mutations, threonine to isoleucine mutations have been associated with functional changes, altering viral infectivity and interactions with host cellular processes. A threonine to isoleucine mutation has also been reported in different proteins, such as the polymerase protein of murine leukemia viruses [48], capsid of RNA viruses [49] and the P7 protein of hepatitis C virus [50]. In terms of functionality, a threonine to isoleucine mutation at position 544 of the spike glycoprotein of Zaire ebolavirus has been frequently observed in past outbreaks and has been shown to have a potential role in infection efficiency [51]. In human immunodeficiency virus type-1 (HIV-1), a T24I mutation of the nucleocapsid protein has been reported as a second-site suppressor that rescues replication and RNA packaging [52]. Hence, thermodynamically, a threonine to isoleucine mutation can increase protein stability and can have biological effects that favor the virus, such as an increase in infection rate and replication rescue.
We further analyzed mutational effects on the critical interaction residues that were identified using conserved domain database analysis. Our analysis revealed that the critical interaction residues for nsp7 are Lys-7, His-36 and Asn-37. These three amino acids are congruent with experimental studies proposing their potential critical role in the interaction of the nsp7/nsp8/nsp12 polymerase complex with RNA [21]. For nsp8, the three potential critical interaction residues are Lys-58, Pro-183 and Arg-190. These three amino acids are also congruent with studies that have proposed their potential critical roles, with Pro-183 and Arg-190 postulated to be involved in nsp12 binding and Lys-58 possibly critical for nsp8–RNA interactions [21,53].
Our findings showed that most amino acid substitutions on these sites for nsp7 and nsp8 would render an overall destabilizing effect on the viral replication complex. This was particularly evident for Lys-7 of nsp7, for which substitution with any other amino acid would mostly cause destabilization. In the case of Arg-190 of nsp8, mutation to any other amino acid except for leucine will cause destabilization. In our 2023 global analysis, we noted that certain mutations at these critical amino acid residues have been sequenced. For Lys-7 of nsp7, K7R (n = 444) is the most frequently occurring variant in the dataset, followed by K7Q (n = 14) and K7N (n = 6). For His-36 and Asn-37, some mutations were observed but at a low frequency. These include H36T (n = 4), H36P (n = 2), H36Q (n = 2), N37S (n = 6), N37N (n = 6), N37D (n = 2) and N37K (n = 2). For nsp8, no mutation so far has been sequenced for Lys-58, whereas variants for Pro-183 and Arg-190 have been sequenced at low frequencies. These variants are P183S (n = 10), P183L (n = 4), R190A (n = 4), R190P (n = 8) and R190P (n = 2). Based on our computational analysis, we predict that these mutations might cause destabilizing effects on the viral replication complex. Overall, our computational thermodynamic data are in agreement with an earlier hypothesis that these three respective amino acid residues of nsp7 and nsp8 are critical interaction residues conserved across different non-human viral isolates. Disruption of these amino acid sites may be explored in further studies as a strategy for targeting the complex.
We also simulated wet lab experiments using bio-chemoinformatic calculations to confirm the reliability of our methods and to gain biological significance. Our results showed good agreement with previously reported effects of mutations of nsp7 and nsp8 on the viral replication protein complex [10]. The nsp7 triple mutation of F49A, M52A and L56A demonstrated the greatest destabilizing effect compared to the individual mutations (Table 6). This was consistent with the wet lab experiments, in which the triple mutation caused a greater decrease in RdRp efficiency. Our results also showed destabilizing effects with other mutations such as C8G and V11A, which were also reported to decrease the RdRp efficiency. Moreover, our analysis showed that the nsp7 N37V mutation caused stabilizing or neutral effects when expressed as part of the nsp7–nsp8 dimer and nsp7–nsp8 heterotetramer complex. Destabilizing effects were predicted when it is expressed as part of the viral replication protein complex or the nsp12–nsp7–nsp8 supercomplex. These results were in agreement with the wet lab experiments, which reported no detrimental effect to the nsp7–nsp8 complex but notably decreased RdRp activity when expressed as part of the viral replication protein complex. Furthermore, mutations of nsp8 such as F92A, M90A and M94A, which we predicted to be destabilizing, have been shown in the wet lab experiments to have decreased RdRp efficiency. This suggests that predicted stabilizing or neutral effects correspond to no detrimental effect on, or possibly some increase in, RdRp efficiency, whereas predicted destabilizing effects correspond to decreased RdRp efficiency. Reduced RdRp activity has been shown to substantially slow down viral replication in RNA viruses such as in tick-borne flavivirus [54] and can alter the RNA synthesis process in tomato mosaic virus [55].
Also, inhibitors of RdRp of SARS-CoV-2 such as remdesivir slow down viral replication by reducing and inhibiting the viral RdRp efficiency [56]. This highlights the biological significance of the stabilizing and destabilizing effects of mutations on the viral replication protein complex in the context of viral replication fitness.
Another aspect that we examined in this study is the alanine scan of the amino acid residues comprising the viral replication complex. Our analysis has shown that most regions of the viral replication complex are potential hotspot residues or thermodynamically destabilizing sites, whereas a few are neutral or stabilizing sites. Simultaneous alanine scans of the B-chain and D-chain showed that 48% of the amino acid residues were potential hotspots and 42% were neutral sites. In contrast, individual alanine scans of the B-chain and D-chain showed that 82.4% were potential hotspots for the B-chain and 80.8% were potential hotspots for the D-chain. The difference in the number of potential hotspots when both the B- and D-chains are present can be attributed to possible interchain and intrachain interactions within the protein complex [57], allosteric effects [58] and conformational changes [59,60]. The destabilizing effect of alanine mutations in individual chains often arises from the disturbance of critical interactions within each chain, which leads to decreased stability. When both chains are mutated simultaneously to alanine, favorable interactions can often form at the interface between the chains, resulting in a stabilizing effect on the overall complex [61]. In the case of proteins such as nsp8, which form two chains in a complex, a simultaneous poly-alanine scan would be a better technique to determine the effect of each amino acid residue on overall protein stability.
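The difference between an individual and a simultaneous alanine scan, and the way per-residue predictions are summarized into hotspot percentages, can be sketched as follows. Everything here is an assumption for illustration: the `chain:mutation` label format, the sign convention (negative ΔΔG taken as destabilizing), the 0.5 kcal/mol cutoff and the toy ΔΔG values are not taken from the study.

```python
from collections import Counter


def alanine_scan(sequence: str, chains: tuple[str, ...]) -> list[tuple[str, ...]]:
    """One entry per non-alanine position; each entry mutates that position in every listed chain."""
    return [
        tuple(f"{chain}:{res}{pos}A" for chain in chains)
        for pos, res in enumerate(sequence, start=1)
        if res != "A"
    ]


def classify(ddg: float, cutoff: float = 0.5) -> str:
    """Assumed convention: negative ddG is destabilizing; the cutoff is hypothetical."""
    if ddg <= -cutoff:
        return "hotspot"
    if ddg >= cutoff:
        return "stabilizing"
    return "neutral"


def summarize(ddg_values: list[float], cutoff: float = 0.5) -> dict[str, float]:
    """Percentage of scanned positions falling in each class."""
    counts = Counter(classify(v, cutoff) for v in ddg_values)
    return {
        label: 100.0 * counts[label] / len(ddg_values)
        for label in ("hotspot", "neutral", "stabilizing")
    }


toy_sequence = "MAWFK"  # made-up fragment, not real nsp8 sequence
individual = alanine_scan(toy_sequence, ("B",))        # B-chain alone
simultaneous = alanine_scan(toy_sequence, ("B", "D"))  # both chains together
toy_ddg = [-0.2, -1.8, -1.1, 0.7]  # invented predictions, one per scanned position
summary = summarize(toy_ddg)
```

Running the same summary on the individual and the simultaneous scan results is what allows hotspot percentages such as those quoted above to be compared on a like-for-like basis.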
Amino acids such as leucine, tryptophan, phenylalanine and isoleucine in nsp7 and tryptophan, tyrosine and phenylalanine in nsp8 are prone to destabilization when substituted with alanine. Our results agree with those of a previous study that used energy per residue decomposition to predict amino acid hotspots, in which tyrosine, phenylalanine and leucine were among the predicted hotspot candidates [11]. Hotspot amino acid residues have been found to be enriched in forming H-bonds [11,62], such as in the case of Trp-29 and Phe-49 of nsp7 and Trp-154, Tyr-135 and Phe-15 of nsp8 in our analysis. These amino acid residues exhibited the greatest destabilizing effect owing to the disruption of the hydrogen bonds that they form within the viral replication protein complex. Further work is needed to confirm these potential hotspot residues as target candidates.
The present study extensively examined the temporal mutation frequencies of nsp7 and nsp8, identified critical interaction residues, confirmed previously reported wet lab results and identified new amino acid residue targets for possible drug development. As we utilized the native sequence of nsp12 in our models, a future direction is to combine mutations of nsp12 with mutations of nsp7 and nsp8 to study multi-chain mutations. With data on mutations of spike proteins being richly available in the literature [63,64], transmission and spread models based on mutations of infectivity-related and replication-related proteins of SARS-CoV-2 [65,66,67] can be of great interest for assimilating relevant data to track the molecular evolution and distribution of the virus and their implications for its global epidemiological trend. This would allow the development of robust methods to mitigate the spread of the virus and to develop high-efficacy and high-specificity drugs.

5. Conclusions

The present study analyzed the global mutation of nsp7 and nsp8 in 2022 and 2023, in which certain mutations have significant effects on the stability of the viral replication complex. Most of the frequently occurring mutations in nsp7 were predicted to destabilize, whereas mutations in nsp8 were predicted to cause stabilization. The substitution of threonine with isoleucine in nsp8 was found to occur frequently in the global population. This mutation can lead to increased stability and may cause potential functional changes. Moreover, critical interaction residues for nsp7 and nsp8 have been identified, and mutations at these sites were predicted to cause destabilization. Bio-chemoinformatic predictions were in good agreement with previously reported wet lab experimental results. Furthermore, potential hotspot residues for nsp7 and nsp8 have been predicted, including amino acids such as tryptophan, phenylalanine and tyrosine, which are proposed as possible residues for targeting. The present study provided an intensive analysis of the mutations of nsp7 and nsp8 and their effects on the stability of the viral replication protein complex. This has allowed a better understanding of the current state of viral protein evolution, the possible effect on viral replication mechanisms and insights into new possible protein target sites.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/cimb46030165/s1, Supplementary Figure S1. Experimental workflow of the study. Supplementary Figure S2. Multiple sequence alignment for nsp7 (A) and nsp8 (B) during conserved domain database (CDD) analysis. Supplementary Dataset S1. Reference sequences of nsp7 and nsp8 for Conserved Domain Database (CDD) analysis. Supplementary Dataset S2. Native protein sequences of nsp12, nsp7 and nsp8. Supplementary Dataset S3. Temporal mutation of nsp7 and nsp8 (2022–2023). Supplementary Dataset S4. Hydrophobic interactions in threonine to isoleucine mutation of nsp8. Supplementary Dataset S5. Double mutation analysis for P121X, L122X. Supplementary Dataset S6. Non-covalent interactions for critical interaction residues of nsp7. Supplementary Dataset S7. Non-covalent interactions for critical interaction residues of nsp8. Supplementary Dataset S8. Computational alanine scan of the viral replication complex. Supplementary Dataset S9. Statistics of amino acid contributions to viral replication protein complex stability. Supplementary Dataset S10. Hydrogen bonding interactions in the potential hotspot residues in nsp8.

Author Contributions

Conceptualization: B.J.J.S. and T.O.; analysis: B.J.J.S.; writing, original draft preparation: B.J.J.S. and T.O.; writing, review and editing: B.J.J.S. and T.O.; supervision: T.O. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Japan Society for the Promotion of Science (JSPS) KAKENHI (Grants-in-Aid for Scientific Research (A) 22H00322 (T.O.)).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article and Supplementary Materials.

Acknowledgments

B.J.J.S. would like to acknowledge the support of the Ministry of Education, Culture, Sports, Science and Technology (MEXT) scholarship and the Asian Chemical Biology Initiative (ACBI). This work was inspired by the international and interdisciplinary environments of the JSPS Core-to-Core Program, “Asian Chemical Biology Initiative”.

Conflicts of Interest

All authors declare no conflicts of interest.

References

  1. Shang, J.; Wan, Y.; Luo, C.; Ye, G.; Geng, Q.; Auerbach, A.; Li, F. Cell Entry Mechanisms of SARS-CoV-2. Proc. Natl. Acad. Sci. USA 2020, 117, 11727–11734.
  2. Academy of Hirudotherapy St.-Petersburg Russia; Ai, K. Pandemic “COVID-19—Postcovid Syndrome”: A System Method of Leeching Is a New and Effective Treatment. J. Virol. Res. Rep. 2023, 4, 1–12.
  3. Bisen, A.C.; Agrawal, S.; Sanap, S.N.; Ravi Kumar, H.G.; Kumar, N.; Gupta, R.; Bhatta, R.S. COVID-19 Retreats and World Recovers: A Silver Lining in the Dark Cloud. Health Care Sci. 2023, 2, 264–285.
  4. Pollard, C.A.; Morran, M.P.; Nestor-Kalinoski, A.L. The COVID-19 Pandemic: A Global Health Crisis. Physiol. Genom. 2020, 52, 549–557.
  5. Veljanoska, F.; Mazahrih, B. The Impact of COVID-19 on FDI. Int. J. Bus. Perform. Manag. 2023, 24, 315–343.
  6. Schotte, S.; Zizzamia, R. The Livelihood Impacts of COVID-19 in Urban South Africa: A View from below. Soc. Indic. Res. 2023, 165, 1–30.
  7. Peng, Q.; Peng, R.; Yuan, B.; Zhao, J.; Wang, M.; Wang, X.; Wang, Q.; Sun, Y.; Fan, Z.; Qi, J.; et al. Structural and Biochemical Characterization of the Nsp12-Nsp7-Nsp8 Core Polymerase Complex from SARS-CoV-2. Cell Rep. 2020, 31, 107774.
  8. Faisal, H.M.N.; Katti, K.S.; Katti, D.R. Differences in Interactions Within Viral Replication Complexes of SARS-CoV-2 (COVID-19) and SARS-CoV Coronaviruses Control RNA Replication Ability. JOM 2021, 73, 1684–1695.
  9. Reshamwala, S.M.S.; Likhite, V.; Degani, M.S.; Deb, S.S.; Noronha, S.B. Mutations in SARS-CoV-2 Nsp7 and Nsp8 Proteins and Their Predicted Impact on Replication/Transcription Complex Structure. J. Med. Virol. 2021, 93, 4616–4619.
  10. Biswal, M.; Diggs, S.; Xu, D.; Khudaverdyan, N.; Lu, J.; Fang, J.; Blaha, G.; Hai, R.; Song, J. Two Conserved Oligomer Interfaces of NSP7 and NSP8 Underpin the Dynamic Assembly of SARS-CoV-2 RdRP. Nucleic Acids Res. 2021, 49, 5956–5966.
  11. Sarma, H.; Jamir, E.; Sastry, G.N. Protein-Protein Interaction of RdRp with Its Co-Factor NSP8 and NSP7 to Decipher the Interface Hotspot Residues for Drug Targeting: A Comparison between SARS-CoV-2 and SARS-CoV. J. Mol. Struct. 2022, 1257, 132602.
  12. Te Velthuis, A.J.W.; Arnold, J.J.; Cameron, C.E.; Van Den Worm, S.H.E.; Snijder, E.J. The RNA Polymerase Activity of SARS-Coronavirus Nsp12 Is Primer Dependent. Nucleic Acids Res. 2010, 38, 203–214.
  13. Gao, Y.; Yan, L.; Huang, Y.; Liu, F.; Zhao, Y.; Cao, L.; Wang, T.; Sun, Q.; Ming, Z.; Zhang, L.; et al. Structure of the RNA-Dependent RNA Polymerase from COVID-19 Virus. Science 2020, 368, 779–782.
  14. Bravo, J.P.K.; Dangerfield, T.L.; Taylor, D.W.; Johnson, K.A. Remdesivir Is a Delayed Translocation Inhibitor of SARS-CoV-2 Replication. Mol. Cell 2021, 81, 1548–1552.e4.
  15. Machitani, M.; Takei, J.; Kaneko, M.K.; Ueki, S.; Ohashi, H.; Watashi, K.; Kato, Y.; Masutomi, K. Development of Novel Monoclonal Antibodies against Nsp12 of SARS-CoV-2. Virol. J. 2022, 19, 213.
  16. Hartenian, E.; Nandakumar, D.; Lari, A.; Ly, M.; Tucker, J.M.; Glaunsinger, B.A. The Molecular Virology of Coronaviruses. J. Biol. Chem. 2020, 295, 12910–12934.
  17. Yoshimoto, F.K. The Proteins of Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2 or n-COV-19), the Cause of COVID-19. Protein J. 2020, 39, 198–216.
  18. Kirchdoerfer, R.N.; Ward, A.B. Structure of the SARS-CoV Nsp12 Polymerase Bound to Nsp7 and Nsp8 Co-Factors. Nat. Commun. 2019, 10, 2342.
  19. Zhang, C.; Li, L.; He, J.; Chen, C.; Su, D. Nonstructural Protein 7 and 8 Complexes of SARS-CoV-2. Protein Sci. 2021, 30, 873–881.
  20. Anderson, T.K.; Hoferle, P.J.; Chojnacki, K.J.; Lee, K.W.; Coon, J.J.; Kirchdoerfer, R.N. An Alphacoronavirus Polymerase Structure Reveals Conserved Replication Factor Functions. Nucleic Acids Res. 2024, gkae153.
  21. Subissi, L.; Posthuma, C.C.; Collet, A.; Zevenhoven-Dobbe, J.C.; Gorbalenya, A.E.; Decroly, E.; Snijder, E.J.; Canard, B.; Imbert, I. One Severe Acute Respiratory Syndrome Coronavirus Protein Complex Integrates Processive RNA Polymerase and Exonuclease Activities. Proc. Natl. Acad. Sci. USA 2014, 111, E3900–E3909.
  22. Ansari, M.H.R.; Saher, S.; Parveen, R.; Khan, W.; Khan, I.A.; Ahmad, S. Role of Gut Microbiota Metabolism and Biotransformation on Dietary Natural Products to Human Health Implications with Special Reference to Biochemoinformatics Approach. J. Tradit. Complement. Med. 2023, 13, 150–160.
  23. Orton, R.J.; Gu, Q.; Hughes, J.; Maabar, M.; Modha, S.; Vattipally, S.B.; Wilkie, G.S. Bioinformatics Tools for Analysing Viral Genomic Data. Rev. Sci. Tech. OIE 2016, 35, 271–285.
  24. Murillo, J.; Villegas, L.M.; Ulloa-Murillo, L.M.; Rodríguez, A.R. Recent Trends on Omics and Bioinformatics Approaches to Study SARS-CoV-2: A Bibliometric Analysis and Mini-Review. Comput. Biol. Med. 2021, 128, 104162.
  25. Lo, Y.-C.; Rensi, S.E.; Torng, W.; Altman, R.B. Machine Learning in Chemoinformatics and Drug Discovery. Drug Discov. Today 2018, 23, 1538–1546.
  26. Raslan, M.A.; Raslan, S.A.; Shehata, E.M.; Mahmoud, A.S.; Sabri, N.A. Advances in the Applications of Bioinformatics and Chemoinformatics. Pharmaceuticals 2023, 16, 1050.
  27. Wang, J.; Chitsaz, F.; Derbyshire, M.K.; Gonzales, N.R.; Gwadz, M.; Lu, S.; Marchler, G.H.; Song, J.S.; Thanki, N.; Yamashita, R.A.; et al. The Conserved Domain Database in 2023. Nucleic Acids Res. 2023, 51, D384–D388.
  28. Song, Y.; DiMaio, F.; Wang, R.Y.-R.; Kim, D.; Miles, C.; Brunette, T.; Thompson, J.; Baker, D. High-Resolution Comparative Modeling with RosettaCM. Structure 2013, 21, 1735–1742.
  29. Yan, L.; Huang, Y.; Ge, J.; Liu, Z.; Lu, P.; Huang, B.; Gao, S.; Wang, J.; Tan, L.; Ye, S.; et al. A Mechanism for SARS-CoV-2 RNA Capping and Its Inhibition by Nucleotide Analog Inhibitors. Cell 2022, 185, 4347–4360.e17.
  30. Meng, E.C.; Goddard, T.D.; Pettersen, E.F.; Couch, G.S.; Pearson, Z.J.; Morris, J.H.; Ferrin, T.E. UCSF ChimeraX: Tools for Structure Building and Analysis. Protein Sci. 2023, 32, e4792.
  31. Pettersen, E.F.; Goddard, T.D.; Huang, C.C.; Meng, E.C.; Couch, G.S.; Croll, T.I.; Morris, J.H.; Ferrin, T.E. UCSF ChimeraX: Structure Visualization for Researchers, Educators, and Developers. Protein Sci. 2021, 30, 70–82.
  32. Zhou, Y.; Pan, Q.; Pires, D.E.V.; Rodrigues, C.H.M.; Ascher, D.B. DDMut: Predicting Effects of Mutations on Protein Stability Using Deep Learning. Nucleic Acids Res. 2023, 51, W122–W128.
  33. Edmund, E.; Kamuzora, M.; Muhogora, W.; Ngoya, P.; Muhulo, A.; Amirali, A.; Makoba, A.; Ngoye, W.; Ngaile, J.; Majatta, S.; et al. Radiation Dose to Breast during Digital Mammography in Tanzania. Radiat. Prot. Dosimetry 2024, ncad316.
  34. Cui, Y.; Li, Y.; Li, M.; Zhao, L.; Wang, D.; Tian, J.; Bai, X.; Ci, Y.; Wu, S.; Wang, F.; et al. Evolution and Extensive Reassortment of H5 Influenza Viruses Isolated from Wild Birds in China over the Past Decade. Emerg. Microbes Infect. 2020, 9, 1793–1803.
  35. Neumann-Haefelin, C. HLA-B27-Mediated Protection in HIV and Hepatitis C Virus Infection and Pathogenesis in Spondyloarthritis: Two Sides of the Same Coin? Curr. Opin. Rheumatol. 2013, 25, 426–433.
  36. Chen, X.; Liu, J.; Li, Y.; Zeng, Y.; Wang, F.; Cheng, Z.; Duan, H.; Pan, G.; Yang, S.; Chen, Y.; et al. IDH1 Mutation Impairs Antiviral Response and Potentiates Oncolytic Virotherapy in Glioma. Nat. Commun. 2023, 14, 6781.
  37. Bailey, A.C.; Fisher, M. Current Use of Antiretroviral Treatment. Br. Med. Bull. 2008, 87, 175–192.
  38. Abbasian, M.H.; Mahmanzar, M.; Rahimian, K.; Mahdavi, B.; Tokhanbigli, S.; Moradi, B.; Sisakht, M.M.; Deng, Y. Global Landscape of SARS-CoV-2 Mutations and Conserved Regions. J. Transl. Med. 2023, 21, 152.
  39. Singer, J.; Thomson, E.; Hughes, J.; Aranday-Cortes, E.; McLauchlan, J.; Da Silva Filipe, A.; Tong, L.; Manso, C.; Gifford, R.; Robertson, D.; et al. Interpreting Viral Deep Sequencing Data with GLUE. Viruses 2019, 11, 323.
  40. Tian, L.; Shen, X.; Murphy, R.W.; Shen, Y. The Adaptation of Codon Usage of +ssRNA Viruses to Their Hosts. Infect. Genet. Evol. 2018, 63, 175–179.
  41. Cristina, J.; Fajardo, A.; Soñora, M.; Moratorio, G.; Musto, H. A Detailed Comparative Analysis of Codon Usage Bias in Zika Virus. Virus Res. 2016, 223, 147–152.
  42. Aksamentov, I.; Roemer, C.; Hodcroft, E.; Neher, R. Nextclade: Clade Assignment, Mutation Calling and Quality Control for Viral Genomes. J. Open Source Softw. 2021, 6, 3773.
  43. Park, D.; Hahn, Y. Rapid Protein Sequence Evolution via Compensatory Frameshift Is Widespread in RNA Virus Genomes. BMC Bioinformatics 2021, 22, 251.
  44. Kathuria, S.V.; Chan, Y.H.; Nobrega, R.P.; Özen, A.; Matthews, C.R. Clusters of Isoleucine, Leucine, and Valine Side Chains Define Cores of Stability in High-energy States of Globular Proteins: Sequence Determinants of Structure and Stability. Protein Sci. 2016, 25, 662–675.
  45. Holder, J.B.; Bennett, A.F.; Chen, J.; Spencer, D.S.; Byrne, M.P.; Stites, W.E. Energetics of Side Chain Packing in Staphylococcal Nuclease Assessed by Exchange of Valines, Isoleucines, and Leucines. Biochemistry 2001, 40, 13998–14003.
  46. Cano-Muñoz, M.; Cesaro, S.; Morel, B.; Lucas, J.; Moog, C.; Conejero-Lara, F. Extremely Thermostabilizing Core Mutations in Coiled-Coil Mimetic Proteins of HIV-1 Gp41 Produce Diverse Effects on Target Binding but Do Not Affect Their Inhibitory Activity. Biomolecules 2021, 11, 566.
  47. Pace, C.N.; Fu, H.; Fryar, K.L.; Landua, J.; Trevino, S.R.; Shirley, B.A.; Hendricks, M.M.; Iimura, S.; Gajiwala, K.; Scholtz, J.M.; et al. Contribution of Hydrophobic Interactions to Protein Stability. J. Mol. Biol. 2011, 408, 514–528.
  48. Ou, C.Y.; Boone, L.R.; Koh, C.K.; Tennant, R.W.; Yang, W.K. Nucleotide Sequences of Gag-Pol Regions That Determine the Fv-1 Host Range Property of BALB/c N-Tropic and B-Tropic Murine Leukemia Viruses. J. Virol. 1983, 48, 779–784.
  49. Urbanowski, M.D.; Ilkow, C.S.; Hobman, T.C. Modulation of Signaling Pathways by RNA Virus Capsid Proteins. Cell. Signal. 2008, 20, 1227–1236.
  50. Steinmann, E.; Pietschmann, T. Hepatitis C Virus P7—A Viroporin Crucial for Virus Assembly and an Emerging Target for Antiviral Therapy. Viruses 2010, 2, 2078–2095.
  51. Ueda, M.T.; Kurosaki, Y.; Izumi, T.; Nakano, Y.; Oloniniyi, O.K.; Yasuda, J.; Koyanagi, Y.; Sato, K.; Nakagawa, S. Functional Mutations in Spike Glycoprotein of Zaire Ebolavirus Associated with an Increase in Infection Efficiency. Genes Cells 2017, 22, 148–159.
  52. Cimarelli, A.; Luban, J. Context-Dependent Phenotype of a Human Immunodeficiency Virus Type 1 Nucleocapsid Mutation. J. Virol. 2001, 75, 7193–7197.
  53. Snijder, E.J.; Decroly, E.; Ziebuhr, J. The Nonstructural Proteins Directing Coronavirus RNA Synthesis and Processing. In Advances in Virus Research; Elsevier: Amsterdam, The Netherlands, 2016; Volume 96, pp. 59–126. ISBN 978-0-12-804736-1.
  54. Yang, J.; Jing, X.; Yi, W.; Li, X.-D.; Yao, C.; Zhang, B.; Zheng, Z.; Wang, H.; Gong, P. Crystal Structure of a Tick-Borne Flavivirus RNA-Dependent RNA Polymerase Suggests a Host Adaptation Hotspot in RNA Viruses. Nucleic Acids Res. 2021, 49, 1567–1580.
  55. Nishikiori, M.; Dohi, K.; Mori, M.; Meshi, T.; Naito, S.; Ishikawa, M. Membrane-Bound Tomato Mosaic Virus Replication Proteins Participate in RNA Synthesis and Are Associated with Host Proteins in a Pattern Distinct from Those That Are Not Membrane Bound. J. Virol. 2006, 80, 8459–8468.
  56. Uppal, T.; Tuffo, K.; Khaiboullina, S.; Reganti, S.; Pandori, M.; Verma, S.C. Screening of SARS-CoV-2 Antivirals through a Cell-Based RNA-Dependent RNA Polymerase (RdRp) Reporter Assay. Cell Insight 2022, 1, 100046.
  57. Tuncbag, N.; Keskin, O.; Gursoy, A. HotPoint: Hot Spot Prediction Server for Protein Interfaces. Nucleic Acids Res. 2010, 38, W402–W406.
  58. Tang, Q.; Alontaga, A.Y.; Holyoak, T.; Fenton, A.W. Exploring the Limits of the Usefulness of Mutagenesis in Studies of Allosteric Mechanisms. Hum. Mutat. 2017, 38, 1144–1154.
  59. Moreira, I.S.; Fernandes, P.A.; Ramos, M.J. Computational Alanine Scanning Mutagenesis—An Improved Methodological Approach. J. Comput. Chem. 2007, 28, 644–654.
  60. Moreira, I.S.; Fernandes, P.A.; Ramos, M.J. Hot Spots—A Review of the Protein–Protein Interface Determinant Amino-acid Residues. Proteins Struct. Funct. Bioinforma. 2007, 68, 803–812.
  61. Ye, X.; Lee, Y.-C.; Gates, Z.P.; Ling, Y.; Mortensen, J.C.; Yang, F.-S.; Lin, Y.-S.; Pentelute, B.L. Binary Combinatorial Scanning Reveals Potent Poly-Alanine-Substituted Inhibitors of Protein-Protein Interactions. Commun. Chem. 2022, 5, 128.
  62. Keskin, O.; Ma, B.; Nussinov, R. Hot Regions in Protein–Protein Interactions: The Organization and Contribution of Structurally Conserved Hot Spot Residues. J. Mol. Biol. 2005, 345, 1281–1294.
  63. Harvey, W.T.; Carabelli, A.M.; Jackson, B.; Gupta, R.K.; Thomson, E.C.; Harrison, E.M.; Ludden, C.; Reeve, R.; Rambaut, A.; COVID-19 Genomics UK (COG-UK) Consortium; et al. SARS-CoV-2 Variants, Spike Mutations and Immune Escape. Nat. Rev. Microbiol. 2021, 19, 409–424.
  64. Magazine, N.; Zhang, T.; Wu, Y.; McGee, M.C.; Veggiani, G.; Huang, W. Mutations and Evolution of the SARS-CoV-2 Spike Protein. Viruses 2022, 14, 640.
  65. Geoghegan, J.L.; Senior, A.M.; Di Giallonardo, F.; Holmes, E.C. Virological Factors That Increase the Transmissibility of Emerging Human Viruses. Proc. Natl. Acad. Sci. USA 2016, 113, 4170–4175.
  66. Kumberger, P.; Frey, F.; Schwarz, U.S.; Graw, F. Multiscale Modeling of Virus Replication and Spread. FEBS Lett. 2016, 590, 1972–1986.
  67. Laha, S.; Chakraborty, J.; Das, S.; Manna, S.K.; Biswas, S.; Chatterjee, R. Characterizations of SARS-CoV-2 Mutational Profile, Spike Protein Stability and Viral Transmission. Infect. Genet. Evol. 2020, 85, 104445.
Figure 1. SARS-CoV-2 viral replication protein complex. The viral replication protein complex is primarily comprised of the nsp12–nsp7–nsp8 supercomplex. The nsp12 (A-chain, shown in red) is the main catalytic subunit of the protein complex. The nsp7 (C-chain, shown in green) functions as a cofactor that binds to nsp12. The nsp8 (B-chain, shown in cyan; D-chain, shown in purple) functions as a cofactor and as a helper in extending the template RNA-binding surface. This viral replication protein complex represents the minimal machinery of the virus that can perform nucleotide polymerization. The viral replication protein complex shown was modelled using Robetta comparative modelling using PDB: 8GWE as template.
Figure 2. Nsp7 protein sequence distribution. Based on May 2022 data (A), 98% of nsp7 sequences contain the native protein sequence, while 2% are variants. Single amino acid mutations are dominant among the ten most frequently occurring variants. Based on April 2023 data (B), the native protein remains the dominant sequence, accounting for 97% of sequences, with variants making up 3%. Seven of the ten most frequently occurring variants are single amino acid mutations; the remaining three are an nsp7 with a shorter amino acid sequence (*) and two nsp7 sequences containing multiple ambiguous residues (Xs) (** and ***).
Figure 3. Nsp8 protein sequence distribution. Based on May 2022 data (A), 93% of nsp8 sequences contain the native protein sequence, while 7% are variants with mutated sequences. Single amino acid mutations are dominant among the ten most frequently occurring variants. Based on April 2023 data (B), the native protein remains the dominant sequence for nsp8 with 91% occurrence, whereas variants occur at 9%. Single amino acid mutations are the dominant type of mutation, with the exception of a potential double mutation at amino acid positions 121 and 122, where ambiguous amino acid residues (X) have been reported.
Figure 4. Non-covalent interactions in the P133S variant. In the native protein, Pro-133 (PRO1065; B-chain) forms an H-bond with Lys-391 (A-chain; nsp12) and hydrophobic interactions with Trp-182 (TRP1114; B-chain), Arg-392 (A-chain) and Lys-391 (A-chain) (A). In the variant sequence, Ser-133 (SER1065; B-chain) forms an H-bond and a van der Waals interaction with Trp-182 (TRP1114; B-chain) (B). Pro-133 (PRO1346; D-chain) forms H-bonds with Gly-113 (GLY1346; D-chain), Trp-182 (TRP1395; D-chain) and Val-131 (VAL1344; D-chain) (C). Ser-133 (D-chain) forms one H-bond with Trp-182 (TRP1395; D-chain) (D). Red-dotted lines represent H-bonds, green-dotted lines represent hydrophobic interactions and cyan-dotted lines represent van der Waals interactions.
Figure 5. ΔΔG change upon mutation of the three critical interaction residues in nsp7. Mutation of Lys-7 is mostly destabilizing, with only the I, F and L substitutions showing minimal stabilizing effects (A). Mutation of His-36 is mostly destabilizing, with only the F, C, L, I and Y substitutions having stabilizing effects (B). Similarly, mutation of Asn-37 is destabilizing except for the Y, I, L, C, F and M substitutions, which are stabilizing (C). The x-axis shows the ΔΔG (kcal/mol) and the y-axis shows the amino acid substitution.
Figure 6. ΔΔG change upon mutation of the three critical interaction residues in nsp8. Mutation of Lys-58 is mostly destabilizing, with the I, L, C, A, V and R substitutions showing stabilizing effects (A). Mutation of Pro-183 is mostly destabilizing, with only the C, V and I substitutions causing stabilizing effects (B). Mutation of Arg-190 is largely destabilizing, with only the L substitution being stabilizing (C). The x-axis shows the ΔΔG (kcal/mol) and the y-axis shows the amino acid substitution.
Figure 7. Violin plot of amino acid residue contributions to viral replication complex stability. The violin plot shows the distribution of the destabilizing/stabilizing effects of each amino acid residue when substituted with alanine. Substituting some residues with alanine has greater destabilizing effects than substituting others. The residues that contribute most to the stability of the viral replication complex include W, F, L and I in nsp7 (A); L, Y, F, W and I in nsp8 (B-chain) (B); Y, I, L, F and W in nsp8 (D-chain) (C); and Y, F and W in nsp8 (combined B- and D-chain) (D). The x-axis shows the amino acid, whereas the y-axis shows the average ΔΔG for the amino acids analyzed.
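Figure 7 summarises the alanine scan by averaging ΔΔG per wild-type amino acid. As a minimal sketch of that aggregation step (not the authors' pipeline), the Python snippet below groups the three nsp8 B-chain alanine substitutions listed in Table 7 by their wild-type residue and averages them. The function name `mean_ddg_by_residue` and the dictionary layout are illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean

# B-chain ΔΔG values (kcal/mol) for nsp8 alanine substitutions, copied from Table 7.
ALANINE_SCAN_B_CHAIN = {"F92A": -2.14, "M90A": -1.92, "M94A": -1.18}


def mean_ddg_by_residue(scan):
    """Group alanine-scan ΔΔG values by wild-type residue and average them.

    Assumes mutation labels of the form <wild type><position>A, e.g. "F92A".
    """
    grouped = defaultdict(list)
    for mutation, ddg in scan.items():
        wild_type = mutation[0]  # first letter is the original residue
        grouped[wild_type].append(ddg)
    return {residue: mean(values) for residue, values in grouped.items()}


averages = mean_ddg_by_residue(ALANINE_SCAN_B_CHAIN)

# List residues from most to least destabilizing when replaced by alanine.
for residue, value in sorted(averages.items(), key=lambda item: item[1]):
    print(f"{residue}: {value:.2f} kcal/mol")
```

With only three substitutions this is a toy input; a full scan would supply one entry per residue position, and the per-residue lists (rather than only their means) would feed the violin plot.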
Figure 8. Hydrogen bonding sites for Trp-29 and Phe-49 in nsp7. Nsp7 has only one tryptophan, Trp-29, which can form H-bonds with Gln-444 and Val-410 of the A-chain (nsp12) (A). Nsp7's only phenylalanine, Phe-49, can form multiple H-bonds within the C-chain, including with Met-52 (MET1182; C-chain), Thr-45 (THR1175; C-chain) and Thr-46 (THR1176; C-chain) (B). Red-dotted lines represent H-bonds.
Table 1. Most frequently occurring mutations in nsp7 and their mutational effects based on ΔΔG values.

| Mutation | Chain | ΔΔG Stability Prediction (kcal/mol) | Effect |
|---|---|---|---|
| L56F | C | −1.16 | Destabilizing |
| L71F | C | −1.13 | Destabilizing |
| S25L | C | −0.68 | Destabilizing |
| M3I | C | −0.54 | Destabilizing |
| D77N | C | −0.38 | Destabilizing |
| V33I | C | −0.24 | Destabilizing |
| T81I | C | −0.01 | Destabilizing |
| Q63R | C | 0.05 | Stabilizing |
| M75I | C | 0.35 | Stabilizing |
| S26F | C | 0.39 | Stabilizing |
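The Effect column in Table 1 follows the sign of the predicted ΔΔG: negative values are labelled destabilizing and positive values stabilizing. The short Python sketch below reproduces that labelling for the Table 1 values. The `classify` helper is hypothetical, and treating a ΔΔG of exactly zero as neutral is an assumption taken from the S76M row of Table 3 rather than from a stated threshold.

```python
# Predicted ΔΔG values (kcal/mol) for the nsp7 variants, copied from Table 1.
TABLE_1_DDG = {
    "L56F": -1.16, "L71F": -1.13, "S25L": -0.68, "M3I": -0.54, "D77N": -0.38,
    "V33I": -0.24, "T81I": -0.01, "Q63R": 0.05, "M75I": 0.35, "S26F": 0.39,
}


def classify(ddg):
    """Label a ΔΔG value using the sign convention seen in the tables.

    Assumption: negative = destabilizing, positive = stabilizing,
    exactly zero = neutral (as for S76M in Table 3).
    """
    if ddg < 0:
        return "Destabilizing"
    if ddg > 0:
        return "Stabilizing"
    return "Neutral"


labels = {mutation: classify(ddg) for mutation, ddg in TABLE_1_DDG.items()}

for mutation, label in labels.items():
    print(f"{mutation}: {TABLE_1_DDG[mutation]:+.2f} kcal/mol -> {label}")
```

A purely sign-based rule treats values very close to zero, such as T81I at −0.01 kcal/mol, the same as strongly negative ones; whether such small magnitudes are meaningful depends on the accuracy of the underlying predictor.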
Table 2. Most frequently occurring mutations in nsp8 and their mutational effects based on ΔΔG values.

| Mutation | Chain | Individual ΔΔG Stability Prediction (kcal/mol), B; D Chains | Overall ΔΔG Stability Prediction (kcal/mol) | Effect |
|---|---|---|---|---|
| P133S | B, D | −1.22; −0.62 | −3.34 | Destabilizing |
| Q24H | B, D | −0.02; −0.03 | 0.17 | Stabilizing |
| T89I | B, D | 0.07; 0.11 | 0.89 | Stabilizing |
| T141M | B, D | −0.13; 0.6 | 1.02 | Stabilizing |
| T145I | B, D | 0.07; 0.08 | 1.18 | Stabilizing |
| Q24R | B, D | −0.04; −0.12 | 1.19 | Stabilizing |
| T123I | B, D | 0.05; 0.0 | 1.27 | Stabilizing |
| N118S | B, D | −0.8; −0.46 | 1.96 | Stabilizing |
| T148I | B, D | 0.18; 0.14 | 2.23 | Stabilizing |
| T187I | B, D | 0.54; 0.12 | 3.64 | Stabilizing |
Table 3. Effect of mutation on the 76th amino acid position of nsp8 on the viral replication complex.

| Mutation | Chain | Individual ΔΔG Stability Prediction (kcal/mol), B; D Chains | Overall ΔΔG Stability Prediction (kcal/mol) | Effect |
|---|---|---|---|---|
| S76P | B, D | −1.83; −1.79 | −1.39 | Destabilizing |
| S76G | B, D | −1.21; −1.27 | −0.66 | Destabilizing |
| S76N | B, D | −0.59; −0.57 | −0.23 | Destabilizing |
| S76D | B, D | −1.45; −1.01 | −0.14 | Destabilizing |
| S76M | B, D | 0.06; 0.63 | 0 | Neutral |
| S76Q | B, D | −0.3; −0.05 | 0.11 | Stabilizing |
| S76K | B, D | −0.32; −0.03 | 0.16 | Stabilizing |
| S76H | B, D | −0.6; −0.02 | 0.23 | Stabilizing |
| S76T | B, D | −0.5; −0.09 | 0.32 | Stabilizing |
| S76V | B, D | −0.06; 0.64 | 0.33 | Stabilizing |
| S76E | B, D | −0.08; 0.19 | 0.37 | Stabilizing |
| S76L | B, D | −0.02; 0.5 | 0.43 | Stabilizing |
| S76F | B, D | −0.02; 0.22 | 0.44 | Stabilizing |
| S76A | B, D | −0.39; −0.01 | 0.48 | Stabilizing |
| S76W | B, D | −0.13; −0.04 | 0.55 | Stabilizing |
| S76R | B, D | −0.54; 0.55 | 0.87 | Stabilizing |
| S76I | B, D | 0.09; 1.15 | 0.99 | Stabilizing |
| S76C | B, D | 0.04; 1.0 | 1.68 | Stabilizing |
| S76Y | B, D | 0.64; 1.03 | 2.27 | Stabilizing |
Table 4. Effect of mutation on the 122nd amino acid position of nsp8 on the viral replication complex.

| Mutation | Chain | Individual ΔΔG Stability Prediction (kcal/mol), B; D Chains | Overall ΔΔG Stability Prediction (kcal/mol) | Effect |
|---|---|---|---|---|
| L122G | B, D | −3.0; −2.09 | −2.42 | Destabilizing |
| L122H | B, D | −1.58; −0.74 | −1.56 | Destabilizing |
| L122D | B, D | −2.35; −1.35 | −1.41 | Destabilizing |
| L122A | B, D | −2.42; −1.58 | −1.25 | Destabilizing |
| L122E | B, D | −1.78; −1.08 | −1.14 | Destabilizing |
| L122N | B, D | −1.66; −0.97 | −0.89 | Destabilizing |
| L122C | B, D | −1.82; −1.32 | −0.69 | Destabilizing |
| L122T | B, D | −1.52; −1.36 | −0.66 | Destabilizing |
| L122P | B, D | −2.42; −1.34 | −0.64 | Destabilizing |
| L122S | B, D | −1.59; −1.08 | −0.63 | Destabilizing |
| L122K | B, D | 1.11; −0.32 | −0.14 | Destabilizing |
| L122V | B, D | −0.02; 0.5 | 0.06 | Stabilizing |
| L122M | B, D | −0.45; −0.17 | 0.14 | Stabilizing |
| L122I | B, D | −0.53; −0.27 | 0.2 | Stabilizing |
| L122Q | B, D | −0.87; −0.41 | 0.32 | Stabilizing |
| L122F | B, D | −0.33; −0.19 | 0.66 | Stabilizing |
| L122R | B, D | −0.58; −0.04 | 0.73 | Stabilizing |
| L122Y | B, D | 0.15; 0.18 | 0.92 | Stabilizing |
| L122W | B, D | −0.35; 0.11 | 0.95 | Stabilizing |
Table 5. Effect of double mutation on the 121st and 122nd amino acid positions of nsp8 on the viral replication complex.

| Mutation | Chain | Individual ΔΔG Stability Prediction (kcal/mol), B; D Chains | Overall ΔΔG Stability Prediction (kcal/mol) | Effect |
|---|---|---|---|---|
| P121G;L122G | B, D | −2.99; −2.09; −2.75; −0.72 | −4.04 | Destabilizing |
| P121D;L122G | B, D | −3.0; −2.16; −2.7; −0.92 | −3.98 | Destabilizing |
| P121T;L122G | B, D | −2.99; −2.18; −2.26; −0.7 | −3.91 | Destabilizing |
| P121S;L122G | B, D | −2.99; −2.18; −1.66; −0.4 | −3.82 | Destabilizing |
| P121N;L122G | B, D | −2.99; −2.16; −2.39; −0.71 | −3.69 | Destabilizing |
| P121R | B, D | −1.35; 0.39 | 1.03 | Stabilizing |
| P121C | B, D | −1.58; 0.1 | 1.08 | Stabilizing |
| P121E;L122F | B, D | −0.39; 0.89; −1.73; 1.02 | 1.19 | Stabilizing |
| P121Q | B, D | −1.53; 1.24 | 1.52 | Stabilizing |
| P121E | B, D | −1.73; 1.18 | 1.65 | Stabilizing |
Table 6. Comparison of nsp7 mutational effects based on bio-chemoinformatic calculations and wet lab experimental results.

| Mutation | Chain | ΔΔG Stability Prediction (kcal/mol) | Effect (This Study) | Wet Lab Results Based on Literature [10] |
|---|---|---|---|---|
| F49A | C | −2.99 | Destabilizing | Decreased RdRp efficiency |
| M52A | C | −2.12 | Destabilizing | Decreased RdRp efficiency |
| L56A | C | −3.09 | Destabilizing | Decreased RdRp efficiency |
| F49A, M52A, L56A | C | −3.46 | Destabilizing | Greater decrease in RdRp efficiency |
| C8G | C | −1.97 | Destabilizing | Decreased RdRp efficiency |
| V11A | C | −2.12 | Destabilizing | Decreased RdRp efficiency |
| N37V * | A | 0.13 | Stabilizing | Not applicable |
| N37V ** | A, C | 0.22 | Stabilizing | No detrimental effect to nsp7–nsp8 complex |
| N37V *** | C | −0.15 | Destabilizing | Decreased RdRp efficiency |

* Nsp7 N37V mutation was introduced into the nsp7–nsp8 dimer complex (PDB: 6YHU). ** Nsp7 N37V mutation was introduced into the nsp7–nsp8 heterotetrameric complex (PDB: 7JLT) of the original wet lab experimental data. *** Nsp7 N37V mutation was introduced into the viral replication protein complex.
Table 7. Comparison of nsp8 mutational effects based on bio-chemoinformatic calculations and wet lab experimental results.

| Mutation | Chain | Individual ΔΔG Stability Prediction (kcal/mol), B; D Chains | Overall ΔΔG Stability Prediction (kcal/mol) | Effect (This Study) | Wet Lab Results Based on Literature [10] |
|---|---|---|---|---|---|
| F92A | B, D | −2.14; −3.09 | −3.06 | Destabilizing | Decreased RdRp efficiency |
| M90A | B, D | −1.92; −2.48 | −1.39 | Destabilizing | Decreased RdRp efficiency |
| M94A | B, D | −1.18; −2.84 | −1.94 | Destabilizing | Decreased RdRp efficiency |