Multi-Step In Silico Discovery of Natural Drugs against COVID-19 Targeting Main Protease

In continuation of our antecedent work against COVID-19, three natural compounds, namely, Luteoside C (130), Kahalalide E (184), and Streptovaricin B (278) were determined as the most promising SARS-CoV-2 main protease (Mpro) inhibitors among 310 naturally originated antiviral compounds. This was performed via a multi-step in silico method. At first, a molecular structure similarity study was done with PRD_002214, the co-crystallized ligand of Mpro (PDB ID: 6LU7), and favored thirty compounds. Subsequently, the fingerprint study performed with respect to PRD_002214 resulted in the election of sixteen compounds (7, 128, 130, 156, 157, 158, 180, 184, 203, 204, 210, 237, 264, 276, 277, and 278). Then, results of molecular docking versus Mpro PDB ID: 6LU7 favored eight compounds (128, 130, 156, 180, 184, 203, 204, and 278) based on their binding affinities. Then, in silico toxicity studies were performed for the promising compounds and revealed that all of them have good toxicity profiles. Finally, molecular dynamic (MD) simulation experiments were carried out for compounds 130, 184, and 278, which exhibited the best binding modes against Mpro. MD tests revealed that luteoside C (130) has the greatest potential to inhibit SARS-CoV-2 main protease.


Introduction
The WHO mentioned on 8 February 2022 that SARS-CoV-2 caused confirmed infections estimated by 396,558,014 people all over the globe and resulted in the death of an additional 5,745,032 [1]. The spreading of this notorious virus dramatically in addition to the shortage of effective treatment, mandates the utilization of new fast and efficient drug design strategies [2]. Computer-aided (computer-based, computational, or in silico) strategies in drug discovery represent quick and reliable approaches that could predict the bioactivity of any compound reducing the waste of effort, [3,4] time, and money. Computeraided drug discovery approaches include molecular docking [5][6][7], molecular dynamic simulations [8], QSAR [9], pharmacophore modeling [10,11], ADMET [12,13], DFT [14], drug molecular design [15,16], and toxicity prediction [17][18][19]. These approaches target the enhancement of drug activity besides the discovery of new ligands [20].
Viral proteases are successfully utilized as promising antiviral targets. For instance, the aspartyl protease and the serine proteases were effective targets for antivirals against human immunodeficiency virus and hepatitis C virus, respectively [38]. The vital role of M pro during the replication of SARS-CoV-2 is to activate a group of sixteen functional and non-structural proteins through the separation of the two overlying polyproteins (pp1a and pp1ab). Consequently, the inhibition of M pro will cause definite damage to the virus [39]. Additionally, the structure and the sequence of viral main protease (M pro ) and human proteases are quietly different [40]. These properties suggest M pro as a target for anti-COVID-19 drug discovery [41,42].
In this work, a collection of 310 naturally originated antiviral compounds has been screened using different computational methods to detect the most potent naturally derived M pro inhibitor. The utilized methods included molecular structures similarity study with PRD_002214, a fingerprint study against PRD_002214, the molecular docking against M pro PDB ID: 6LU7, in silico toxicity studies, and molecular dynamic (MD) simulation experiments ( Figure 1).
The degree of molecular similarity between two molecules depends on the similarity coefficient (metric) which is used to compute a quantitative score for the degree of similarity based on the weighted values of structural descriptors. The similarity between two molecules is the inverse function of the distance between them in descriptor space [57]. When there are two or more reference ligands, the shortest distance to a reference ligand is used. In this work, Euclidean distances between the rank-ordering of different descriptors are calculated to determine descriptor similarity where Euclidean distances represent the shortest distance between two points [58]. Structural similarity studies between the 310 antiviral compounds (Supplementary Materials Figure S1) and the co-crystallized ligand PRD_002214 ( Figure 2) of M pro PDB ID: 6LU7 have been applied by the software Discovery Studio depending on the previous descriptors. The antiviral compounds were examined in six groups ( Figure 3) and a similarity check was performed for each group separately using PRD_002214 as a reference. The distance between PRD_002214 and the tested compounds is illustrated in Figure 3. The results favored thirty compounds that have good structural similarity with PRD_002214 ( Figure 4). The values of molecular properties for compounds are listed in Table 1.

Docking Studies
Docking studies were proceeded to inspect the binding free energies (∆G) and the binding modes [111][112][113][114][115][116] of the antiviral compounds against M pro PDB ID: 6LU7 (Table 4) with PRD_002214 as a reference. Eight compounds (128, 130, 156, 180, 184, 203, 204, and  278) exhibited the most a-like binding mode and the highest binding energies. Starting with PRD_002214, it showed eight hydrogen bonds in addition to four hydrophobic reactions. In detail, the first pocket of M pro was occupied by the 2-oxopyrrolidin-3-yl moiety that was involved in two hydrogen-bonding interactions with His163 and Glu166. The 2-acetamido-3-methylbutanamido)-N-ethyl-4-methyl pentanamide moiety was buried in the second pocket with four hydrogen-bonding interactions with Gln189, Glu166, and Thr190 together with three hydrophobic interactions with Met165 and His41. The benzyl acetate moiety was suited in the third pocket of the receptor engaging in two hydrogen bonds with His164 and His41. Moreover, the 5-methylisoxazole-3-carboxamide moiety occupied the fourth pocket forming a hydrophobic interaction with Ala191 ( Figure 5) [117]. The antiviral compound, 130 was engaged in six hydrogen bonds, three hydrophobic interactions, and one electrostatic attraction. Firstly, the methyl-3-(4-hydroxy-3-methoxyph enyl)acrylate moiety was fitted in the first pocket making two hydrogen bonds with His 163 and Thr26 and one electrostatic interaction with Cys145. Furthermore, the 6methyltetrahydro-2H-pyran-3,4,5-triol moiety was engaged in two hydrogen-bonding interactions inside the second pocket with His 164 and Met165 in addition to hydrophobic interactions (three) with Met49 and His41. Additionally, compound 130 was involved in two hydrogen bonds with Glu166 in the third pocket. Finally, the (3R,4R)-3-(hydroxymethyl)tetrahydrofuran-3,4-diol was buried in the fourth pocket ( Figure 6).  The antiviral compound 278 exhibited a ∆G of −29.00 kcal. mol −1 . It was combined with receptor protein through five hydrogen bonds and four hydrophobic interactions as shown in Figure 8. Basically, the 7-hydroxy-5,9-dimethyl-6H-naphtho [2,1-d][1,3]dioxin -6one occupied the first pocket of COVID-19 main protease with two hydrophobic interactions with Pro168 and Ala191. The macrocyclic structure of the tested compounds occupied the other three pockets of the target receptor engaging in five hydrogen bonds with Glu166, His164, Cys145, Gly143, and Gln189. Moreover, it formed two hydrophobic interactions with Met165 and His41.  Figures S2-S6, respectively).

Toxicity Models
In this experiment, the toxicity profiles of the favored eight antiviral compounds (128,  130, 156, 180, 184, 203, 204, and 278) were examined by seven toxicity models (illustrated in Table 5) in the Discovery Studio software version 4.0 [118,119]. All the examined antiviral compounds were estimated as non-carcinogenic in the FDA rodent carcinogenicity model. Additionally, all antiviral compounds except 130 showed TD 50 values more than simeprevir where the values were ranging from 2.871 to 12.946 g/kg body. Furthermore, all antiviral compounds except 180 showed rat maximum tolerated dose values higher than that of simeprevir, the values were ranging from 0.018 to 2.382 g/kg body weight . Compounds 128, 130, 156, 180, 184, and 204 revealed oral LD 50 values in a range of 0.274 to 10.020 g/kg body weight, higher than that of simeprevir (0.209 g/kg body weight). On the other hand, compounds 203 and 278 showed oral LD 50 values of 0.141 and 0.166 g/kg body weight, respectively which were lower than that of simeprevir. Compounds 128, 130, 156, and 184 showed LOAEL values ranging from 0.012 to 0.040 g/kg body weight while simeprevir exhibited 0.002 g/kg body weight. Finally, all the antiviral compounds showed mild to moderate irritancy except 184 which showed no irritancy in both models (Table 5).

Molecular Dynamics
Molecular dynamics (MD) simulation has provided many valuable insights into the binding of drugs to their targets. This includes accurate evaluation of the binding strength between a ligand and its target, studying the nature of macromolecules, and characterizing the effect of certain mutations on the resistance profile of many drugs [120,121]. In this test, three compounds (130, 184, and 278) that exhibited good binding mode against M pro were nominated for MD simulation studies.

RMSD, RMSF, and RDF Analysis
To endorse our virtual screening approach so far, five MD simulation experiments were conducted on the free M pro , co-crystallized ligand-M pro , 130-M pro , 184-M pro , and 278-M pro . Interestingly, the calculated RMSD for the free M pro exceeded 4.8 Å while the RMSD of the co-crystalized ligand-M pro reached nearly 2.2 Å. As expected, the RMSD of 130-M pro and 184-M pro reached only 1.7 and 2.1 Å, respectively. In contrast, the 278-M pro complex had the highest RMSD value among the four ligands, reaching about 2.7 Å (Figure 9). The complex 130-M pro showed the least RMSD value which is a good indicator of its ability to further restrict the flexibility of M pro compared to the co-crystalized ligand. Furthermore, its ability to stabilize the M pro is attributed to the strong binding mode between the M pro and compound 130. Similar results were obtained when calculating the RMSF values for all the residues of the five systems where the native enzyme showed a significantly higher level of residues fluctuation during the simulation compared to the co-crystalized ligand and the three lead compounds ( Figure 10). It is worth mentioning that the high flexibility of M pro as shown by higher values for RMSD and RMSF is consistent with its intended function to process the resulting polyprotein from the replication cycle of the virus. Accordingly, the ability of compound 130 to produce lower values for RMSD and RMSF highlights its potentiality as a potent M pro inhibitor. In addition to RMSD and RMSF calculations, the radial distribution function (RDF) was also computed to provide extra insights into the binding of the M pro and the three selected compounds. RDF can explore the distance relationship (atom to atom) between two types of molecules (ligand, and receptor). The average density of ligand to protein (M pro ) surface in the stable period was computed by analyzing the RDF of each ligand to M pro surface [122]. The gmx rdf program was utilized to explore the RDF of the three ligands to the surface of the M pro , and the results are demonstrated in Figure 11. All of the RDF for the three ligands with the M pro showed distinct peaks at 0.22~0.34 nm, a distance that enabled the three hits to form strong hydrogen bonds and hydrophobic interactions. Furthermore, among the three complexes, the 130-M pro complex reached the highest peak nearly at 0.25 nm, followed by 184-M pro and then 278-M pro which reached their maximum peaks at 0.23 nm and 0.34 nm, respectively. To this end, the RDF also proves the ability of the three compounds to predominantly distribute at a distance that allows strong interactions with the M pro and also highlights the superiority of compound 130 over compounds 184 and 278. In conclusion, the RDF calculations are highly matched with RMSD and RMSF calculations, endorsing the obtained results from the docking step.

Binding Free Energy Calculations Using MM-PBSA Approach
The binding free energies between the four ligands (co-crystallized reference and the three retrieved lead compounds) with M pro were computed from all the conformations in the saved trajectories utilizing the MM-PBSA approach. The g_mmpbsa package generated by Kumari et al. [123] was utilized to compute all the MM-PBSA binding free energy types (van der Waals, electrostatic, polar solvation, and SASA energies) for the four complexes of M pro with the four ligands ( Table 6). The calculated binding energy showed a significant higher binding affinity for 130 and 184 compared to the co-crystalized ligand and 278 which is consistent with the obtained results of both docking and MD simulations.

Molecular Similarity Detection
Molecular Similarity detection was performed for the antiviral compounds using Discovery Studio software as described in the Supplementary Materials.

Molecular Fingerprint Detection
Molecular fingerprint detection was performed for the antiviral compounds using Discovery Studio software as described in the Supplementary Materials.

Docking Studies
Docking studies of the antiviral compounds were carried out for the antiviral compounds against SARS-CoV-2 main protease PDB ID: 6LU7 using MOE.14 software [124][125][126][127] as shown in the Supplementary Materials.

Toxicity Studies
In silico toxicity profiles were calculated for the antiviral compounds using Discovery Studio 4.0 [128][129][130] as shown in the Supplementary Materials.

MDS
All molecular dynamics (MD) simulations were performed for the antiviral compounds using the GROningen MAchine as shown in the Supplementary Materials.

Conclusions
Herein, it was concluded that luteoside C (130) was found to be the most potent inhibitor of M pro among a collection of 310 natural antiviral compounds depending on a multiphase in silico approach. The molecular structures similarity study against PRD_002214, the ligand of the target enzyme, favored thirty compounds. Then, the fingerprint study against PRD_002214 elected the most similar sixteen compounds. The molecular docking against M pro PDB ID: 6LU7 and toxicity studies favored eight compounds. The MD simulations experiments were carried out and revealed the superiority of 130 as the most potent inhibitor of M pro . Although the in vitro and in vivo examinations against COVID-19 are not accessible for our team, we depended on extensive well-structured in silico studies to offer all scientists who have the facilities the chance of strongly potential SARS-CoV-2 inhibitors. Our research is an important initial step that could be very helpful in the journey of finding a cure.

Supplementary Materials:
The chemical structures of the tested compounds, detailed method in addition to the toxicity report can be downloaded at: https://www.mdpi.com/article/10.3390/ijms2 3136912/s1.