Computational Analysis of SAM Analogs as Methyltransferase Inhibitors of nsp16/nsp10 Complex from SARS-CoV-2

Methyltransferases (MTases) enzymes, responsible for RNA capping into severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), are emerging important targets for the design of new anti-SARS-CoV-2 agents. Here, analogs of S-adenosylmethionine (SAM), obtained from the bioisosteric substitution of the sulfonium and amino acid groups, were evaluated by rigorous computational modeling techniques such as molecular dynamics (MD) simulations followed by relative binding free analysis against nsp16/nsp10 complex from SARS-CoV-2. The most potent inhibitor (2a) shows the lowest binding free energy (–58.75 Kcal/mol) and more potency than Sinefungin (SFG) (–39.8 Kcal/mol), a pan-MTase inhibitor, which agrees with experimental observations. Besides, our results suggest that the total binding free energy of each evaluated SAM analog is driven by van der Waals interactions which can explain their poor cell permeability, as observed in experimental essays. Overall, we provide a structural and energetic analysis for the inhibition of the nsp16/nsp10 complex involving the evaluated SAM analogs as potential inhibitors.


Introduction
In October 2022, the World Health Organization (WHO) reported over 600 million cases and over six million deaths since the beginning of the COVID-19 pandemic [1]. This disease is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), an enveloped β-coronavirus formed by a large and complex positive-sense single-stranded RNA genome [2]. Particularly, these viruses have one of the largest genomes of all RNA viruses. For SARS-CoV-2, its genome has~29,800 bases, which are responsible encodes a total of 4 structural and 16 nonstructural proteins (named nsp1-nsp16), which are crucial for the virus survival [3,4]. Among these potential targets, some nonstructural proteins (nsps) which play key roles in RNA capping in coronavirus [5][6][7] have been used for drug design of anti-SARS-CoV-2 agents [8][9][10][11], once that RNA cap modification contributes to host cell defense as viral RNA lacking 2 -O methylation which is sensed and inhibited by the interferon-stimulated IFIT-1 protein [12].
Previous studies involving human and animal coronaviruses have demonstrated that the nsp16/nsp10 complex is responsible for the Cap-0 binding of mRNA due to its (Nucleoside-2 O)-methyltransferase activity [13]. Particularly, the activity of nsp16 is improved by the presence of nsp10, which plays a cofactor rule [8,14]. In the Cap-0 reaction mechanism, the nsp16 methylates the mRNA by transferring its methyl group from S-adenosylmethionine (SAM) donor to the unmethylated ribose 2 -O, obtaining RNA-2 -O-methylated and S-adenosyl homocysteine (SAH) [15,16]. 2 of 12 Recently, Bobileva and co-workers [11] synthesized analogs of SAM as inhibitors of viral mRNA cap methyltransferases (MTases). They evaluated these new inhibitors into nsp14 and nsp16/nsp10 from SARS-CoV-2 and human glycine N-methyltransferase (GNMT), where five compounds show nanomolar to submicromolar IC 50 values. However, these compounds did not show selectivity concerning human GNMT and have poor cell permeability. Therefore, understanding the inhibition mechanism of the nsp16/nsp10 complex is important for elucidating molecular details which can explain the source of no selectivity and poor cell permeability. Some computational studies have indicated that the origin of the catalytic effect of methyltransferases is mainly due to electrostatic preorganization [17][18][19][20][21]. Recently, our group has described the catalytic mechanism of the 2 -O methylation of the viral mRNA cap by applying Quantum Mechanics/Molecular Mechanics (QM/MM) with MD simulations [9]. Besides, recent computational studies have been published involving inhibitors of 2 -O-Methyltransferase from SARS-CoV-2 [22][23][24]. Overall, we aim to understand the determinants of nsp16/nsp10 inhibition by SAM analogs and provide insights into their poor cell permeability by performing powerful computational analysis.

Structural Analysis of MD Simulations and PCA Analysis
The structural stabilization for all nsp16/nsp10 systems is evaluated by using RMSD and RMSF plots computed by the CPPTRAJ program [25] ( Figure 1A,B). In Figure 1A, the simulated nsp16/nsp10 systems show suitable convergence considering their respective average structures. These observations corroborate the computed RMSDs values: 1.4 ± 0.3, 1.7 ± 0.2, 1.7 ± 0.1, 2.6 ± 0.1, 1.6 ± 0.1 and 2.2 ± 0.1 Å for 1a, 2a, 2b, 4c, SAM and SFG systems, respectively. These results suggest that the SAM and its analogs systems were well equilibrated during MD simulation. Interestingly, 1a, 2a and 2b systems show RMSD values close to the SAM system, while SFG and 4c systems show the highest and most similar RMSD values (2.2 ± 0.1 and 2.6 ± 0.1 Å, respectively). From the RMSF analysis ( Figure 1B), the values were: 0.7 ± 0.6, 0.8 ± 0.8, 0.7 ± 0.7, 0.7 ± 0.6, 0.7 ± 0.6 and 0.7 ± 0.8 Å for 1a, 2a, 2b, 4c, SAM and SFG systems, respectively. Although their respective standard deviations are on the same scale, mainly due to changes in the terminal region, the protein section which comprises amino acid residues from Asp125 to Phe150 shows a significant difference, which is more evident for Thr140 (2.4 and 2.5 Å for SAM and 2a, respectively). These structural analyses suggest that even small differences can be determinants for the binding of all evaluated compounds.
To better understand the collective and individual movements of the nsp16 structure, we applied PCA analysis for all simulated systems. It is also important to highlight that the nsp10 protein (cofactor) was not included in the PCA analysis, the discussion focus is on the nsp16 protein. All 200 ns of MD simulations were considered to obtain the protein movement, all analyzed systems were plotted using three combinations of the principal components (PCs): PC1 vs. PC2 (Figure 2), PC1 vs. PC3, and PC2 vs. PC3. PC plots for 1a, 2b, and 4c systems are provided in the Supplementary Materials file ( Figure S1).
As can be observed in Figure 2, where the progress of the trajectory for all nsp16 protein systems is shown, each point of the plot means movement direction for the nsp16-SAM, nsp16-SFG, and nsp16-2a during the 200 ns of MD simulation. The clustering of the conformers in the PC1 vs. PC2 plot in SAM and SFG systems have a similar profile, which means similar conformational behavior during MD simulations, which could be associated with a small motion of the nsp16 structure. However, the 2a system shows a different profile for conformational changes during MD simulations, which can suggest induction caused by the 2a inhibitor into the catalytic site of nsp16.
As can be seen in Figure 3 and Table 1, for SAM and SFG systems residues Asp99, Asp114 and Asp130 maintain relevant hydrogen bonds with donor atoms from SAM and SFG structures. Particularly, Asp99 shows an occupancy percentage greater than 100%, it occurs due to the carboxylic group of Asp99 interacting with both hydroxyl groups of the furan ring present in SAM and SFG structure, which allow a suitable orientation for the catalytic reaction of nsp16/nsp10 complex [9]. Interestingly, the Asp130 residue does not show any hydrogen interaction with synthesized SAM analogs due to the absence of the aminobutanoate group in their structures. These results can provide a clue about the structural effects caused by modifications to SAM analogs, which will corroborate our energetic analysis. As can be observed in Figure 2, where the progress of the trajectory for all nsp16 protein systems is shown, each point of the plot means movement direction for the nsp16-SAM, nsp16-SFG, and nsp16-2a during the 200 ns of MD simulation. The clustering of the conformers in the PC1 vs. PC2 plot in SAM and SFG systems have a similar profile, which means similar conformational behavior during MD simulations, which could be associated with a small motion of the nsp16 structure. However, the 2a system shows a different profile for conformational changes during MD simulations, which can suggest induction caused by the 2a inhibitor into the catalytic site of nsp16.  (Table S1).  (Table S1). As can be seen in Figure 3 and Table 1, for SAM and SFG systems residues Asp99, Asp114 and Asp130 maintain relevant hydrogen bonds with donor atoms from SAM and SFG structures. Particularly, Asp99 shows an occupancy percentage greater than 100%, it occurs due to the carboxylic group of Asp99 interacting with both hydroxyl groups of the furan ring present in SAM and SFG structure, which allow a suitable orientation for the catalytic reaction of nsp16/nsp10 complex [9]. Interestingly, the Asp130 residue does not show any hydrogen interaction with synthesized SAM analogs due to the absence of the aminobutanoate group in their structures. These results can provide a clue about the structural effects caused by modifications to SAM analogs, which will corroborate our energetic analysis.

Binding Free Energy and Residual Decomposition Analysis
As described in the Material and Methods section, relative binding free energies for all nsp16/nsp10 systems were performed using the MM/GBSA method [26][27][28] as implemented in MMPBSA.py [29] and are presented in Table 2. It should be highlighted that a single MD trajectory of the bound complexes was considered to compute the relative binding free energy (∆G bind ), where the ∆E int term (in Equation (3)) is canceled due the energy differences are from the same MD ensemble [27,28]. The thermodynamic terms related to the binding free energy are listed in Table 2. As can be observed, van der Waals (∆E vdW ) and non-polar (∆G SA ) terms are the most consistent components for binding free energy, ranging from −51.2 to −41.8 Kcal/mol for ∆E vdW and from −6.8 to −5.9 Kcal/mol for ∆G SA . The calculated binding free energy (∆G bind ) values for SAM and SFG systems are −70.0 and −39.8 Kcal/mol, respectively, suggesting a most favorable bind for SAM, the natural substrate of the nsp16/nsp10 complex. As can be observed for the selected SAM analogs (1a, 2a and 2b), they have a lower ∆G bind (−46.8, −58.7 and −49.6 Kcal/mol, respectively) than the pan-MTase inhibitor (SFG, −39.8 Kcal/mol) which agrees with experimental evidence found by Bobileva et al. [11]. According to Bobileva et al. [11] compound 4c showed minimal activity (IC 50 = 223 µM), the ∆G bind found was −30.1 Kcal/mol, the highest value among all simulated compounds. On the other hand, compound 2a has the lowest ∆G bind (−58.7 Kcal/mol) which agrees with the results found by Bobileva et al. [11]. Therefore, our applied computational strategy described the same binding tendency observed by Bobileva et al. [11] for the most potent compounds (1a, 2a and 2b) as well as for the inactive compound (4c).
Interestingly, the electrostatic (∆E ele ) and polar (∆G GB ) term showed significant differences between SAM and SFG related to other compounds. For Kcal/mol, respectively). Our results suggest that ∆E ele and ∆G GB terms could be related to the cell permeability presented by SAM analogs in A549 cell line essays performed by Bobileva et al. [11].
A plot of a residual decomposition analysis of ∆G bind (Figure 4) was included to improve the energetic description of the features that contributed to the recognition and binding in all the nsp16 systems. In Table 3 any residue with values below −1.20 Kcal/mol was included as an important residue in the binding process. ∆ terms could be related to the cell permeability presented by SAM analogs in A549 cell line essays performed by Bobileva et al. [11].
A plot of a residual decomposition analysis of ∆ ( Figure 4) was included to improve the energetic description of the features that contributed to the recognition and binding in all the nsp16 systems. In Table 3 any residue with values below −1.20 Kcal/mol was included as an important residue in the binding process.   As can be observed in Table 3, some important residues for the SAM binding as Tyr47, Gly73, Asp99 and Asp130 (−2.2, −1.8, −12.7 and −4.4 Kcal/mol, respectively) decrease significantly for SFG (0.0, −1.2, −6.1 and 0.3 Kcal/mol, respectively) and in others SAM analogs. Particularly, the carboxylic group of Asp99 maintains strong H bonds with both hydroxyl groups of the furan ring of SAM, which explain the high occupancy values shown in Table 3. By another hand, some residues, such as Asn43, Ser74, Lys76 and Met131 appear to be important to the binding of SFG and synthesized SAM analogs. Interestingly, for the 2a compound, Lys76 has the lowest interaction value (−10.4 Kcal/mol) among all computed systems, compensating for the increased interaction with Asp130, which indicates its great importance for the binding of the most active synthesized SAM analog, in agreement with Bobileva et al. results [11]. Our computational results suggest that modifications performed by Bobileva et al. [11] to the SAM analogs, although, have allowed new interactions with the catalytic site of nsp16, important interactions (e.g., Asp99 and Asp130) were drastically decreased according to MD and free energy analysis.

System Setup and MD Simulations
Initially, we have chosen the nsp16/nsp10-SAM complex interacting with RNA from PDB code 6WKS [8] and well equilibrated from our previous nsp16/nsp10 study [9]. The 3D structures SAM and its analogs (SFG, 1a, 2a, 2b and 4c) ( Figure 5) were in silico build and structurally minimized at the quantum mechanics (QM) level by applying the Hartree-Fock (HF) method with a 6-31G ** basis set and the RESP method [30] was used for the partial charges calculations carried out in the Gaussian09 package [31]. To estimate the protonation states of the titratable amino acid residues, a pKa calculation was performed using the PROPKA method [32] at pH 7. The ff14SB [33] was applied for the protein (nsp10 and nsp16) while GAFF [34] was used for the RNA part and SAM and its analogs. It is important to highlight that nsp10 is a zinc-binding protein, then Zinc AMBER force field (ZAFF) [35] parameters were used for the description of its metal center containing Zn 2+ ions. Then, tleap module of the Amber20 program [36] was used to add protons of the protein (nsp10 and nsp16). Each system (nsp16-nsp10-X-m7GpppA-RNA; "X" means ligand) was immersed in a truncated octahedral cell of water molecules described by TIP3P [37] model, extending 8 Å away from the solute part. All technical procedures were detailed previously [9]. In all simulation stages, the particle mesh Ewald (PME) method was used to calculate the long-range electrostatic forces employing a nonbonded cutoff of 10 Å, and H-bonds were constrained by using the SHAKE method [38]. The PMEMD module of the Amber20 program [36] was used for all MM simulations.

Structural and Thermodynamic Analysis
The root-mean-square deviation (RMSD) and root-mean-square fluctuation (RMSF) of the backbone atoms (Cα, N, O, C) plots were computed to avail structural stabilization of all simulated systems, where the trajectories ensembles were fitted to the average structures from production stages by using the CPPTRAJ module [25]. Besides, the principal component analysis (PCA) approach [39,40] was applied to explore the local and collective movements of nsp16/nsp10 systems that occurred during the MD simulations. means ligand) was immersed in a truncated octahedral cell of water molecules described by TIP3P [37] model, extending 8 Å away from the solute part. All technical procedures were detailed previously [9]. In all simulation stages, the particle mesh Ewald (PME) method was used to calculate the long-range electrostatic forces employing a nonbonded cutoff of 10 Å, and H-bonds were constrained by using the SHAKE method [38]. The PMEMD module of the Amber20 program [36] was used for all MM simulations.  Thereafter, the CPPTRAJ module [25] was used to select 20 ns (a total of 2000 representative snapshots) from the production stage of the MD simulations for the binding free energy (∆G bind ) calculations using the MM/GBSA approach [26][27][28], as implemented into the MMPBSA.py module [29] of AmberTools20. The main equations of the ∆G bind by MM/GBSA are computed as follows: (1) where ∆G bind is computed from the gas-phase MM energy (∆E MM ), solvation energy (∆G SOLV ) and the entropic term (−T∆S) (Equation (1)). The ∆E MM is the sum of the changes in the internal (bond, angles, and dihedral energies) (∆E int ), electrostatic (∆E ele ) and van der Waals (∆E vdw ) interactions (Equation (2)). As a single-trajectory scheme is applied for the ∆G bind calculations, the ∆E int is equal to zero. The ∆G SOLV includes the polar (∆G GB ) and non-polar (∆G SA ) energies for ∆G bind (Equation (3)). Due to the computational cost, the entropic term (−T∆S) was not included in the ∆G bind calculations [27,28]. Furthermore, a per-residual decomposition analysis was computed to provide insights into the relative contribution of the amino acid residues [26]. This method has been successfully used in SARS-CoV-2 drug design studies [9,[41][42][43][44][45][46][47].

Conclusions
In this study, we have used MD simulations followed by structural analysis and binding free energy calculations to evaluate the key features of the inhibition of nsp16/nsp10 complex from SARS-CoV-2 by the SFG and new SAM analogs. Our results are in good agreement with the experimental evidence proposed by Bobileva et al. [11]. The most active compound (2a) shows the lowest binding free energy value (−58.7 Kcal/mol) among all SAM analogs, including the reference inhibitor, SFG. Interestingly, the significant interaction of 2a with Lys76 (−10.4 Kcal/mol) can suggest it is a key residue for the binding of this SAM analog and its good activity. Regarding the poor cell permeability results found for these new SAM analogs, our results suggest that it could be related to the positive values of electrostatic interactions computed by the MM/GBSA calculations. Finally, the insights provided by applied computational techniques here may be used as leads for further drug development based on SAM analogs as inhibitors of MTases from SARS-CoV-2.