Discovery of Potential SARS-CoV-2 Papain-like Protease Natural Inhibitors Employing a Multi-Phase In Silico Approach

As an extension of our research against COVID-19, a multiphase in silico approach was applied in the selection of the three most common inhibitors (Glycyrrhizoflavone (76), Arctigenin (94), and Thiangazole (298)) against papain-like protease, PLpro (PDB ID: 4OW0), among 310 metabolites of natural origin. All compounds of the exam set were reported as antivirals. The structural similarity between the examined compound set and S88, the co-crystallized ligand of PLpro, was examined through structural similarity and fingerprint studies. The two experiments pointed to Brevicollin (28), Cryptopleurine (41), Columbamine (46), Palmatine (47), Glycyrrhizoflavone (76), Licochalcone A (87), Arctigenin (94), Termilignan (98), Anolignan B (99), 4,5-dihydroxy-6″-deoxybromotopsentin (192), Dercitin (193), Tryptanthrin (200), 6-Cyano-5-methoxy-12-methylindolo [2, 3A] carbazole (211), Thiangazole (298), and Phenoxan (300). The binding ability against PLpro was screened through molecular docking, disclosing the favorable binding modes of six metabolites. ADMET studies expected molecules 28, 76, 94, 200, and 298 as the most favorable metabolites. Then, molecules 76, 94, and 298 were chosen through in silico toxicity studies. Finally, DFT studies were carried out on glycyrrhizoflavone (76) and indicated a high level of similarity in the molecular orbital analysis. The obtained data can be used in further in vitro and in vivo studies to examine and confirm the inhibitory effect of the filtered metabolites against PLpro and SARS-CoV-2.

Human interest in the use of natural products has been back-traced for thousands years [15,16]. The power of natural products as antiviral medicines has been confirmed several scientific reports [17][18][19][20].
PLpro is a crucial protein in the coronavirus that has an essential role in the p cessing mechanism of viral polyproteins. This step results in the generation of an effici replicase complex [21]. PLpro has another essential role against human immunity throu post-translational modifications on human proteins [22].
In the current study, we report the use of several computational filtration metho on 310 metabolites of natural origin that belong to diverse chemical classes and are ported as antivirals ( Figure S1 and Table S1). Our experiments revealed the most expec inhibitors of human coronavirus PLpro among them. We depended on the reported si ilarities between the PLpro of SARS-CoV-1 and SARS-CoV-2 ( Figure 1).

Figure 1.
In silico protocol to select the most promising candidate against PLpro.

Molecular Similarity
It is worth mentioning that S88 was used as a positive control (lead molecule) in t work as S88 is the co-crystallized ligand of our target protein and has a reported bind mode. Additionally, currently, there are no FDA-approved drugs for the treatment coronavirus targeting PLpro. Accordingly, it was found that S88 may serve as a good c didate to check the similarity of our compounds against it.

Molecular Similarity
It is worth mentioning that S88 was used as a positive control (lead molecule) in this work as S88 is the co-crystallized ligand of our target protein and has a reported binding mode. Additionally, currently, there are no FDA-approved drugs for the treatment of coronavirus targeting PLpro. Accordingly, it was found that S88 may serve as a good candidate to check the similarity of our compounds against it.
The following descriptors (H-bond donor (HBA) [31], H-bond acceptor (HBD) [32], partition coefficient (ALog p) [33], molecular weight (M. Wt) [34], rotatable bonds [35], rings, and aromatic rings [36] besides molecular fractional polar surface area (MFPSA) [37]) were examined between the 310 metabolites ( Figure S1, Supplementary data) and S88 using Discovery Studio software (Vélizy-Villacoublay, France). The degree of likeness was calculated through the computation of minimum distances. The minimum distances were computed based on the variations in the aforementioned parameters and represent the computed quantitative difference in the structure between S88 and the examined compounds and are inversely proportional to the similarity degree. The 310 molecules were spit into five equal groups of 50 molecules each, and one (last group) that contained 60 molecules . The study determined the 30 most suitable metabolites  (Figures 2 and 3, and Table 1). [37]) were examined between the 310 metabolites ( Figure S1, Supplementary data) and S88 using Discovery Studio software (Vélizy-Villacoublay, France). The degree of likeness was calculated through the computation of minimum distances. The minimum distances were computed based on the variations in the aforementioned parameters and represent the computed quantitative difference in the structure between S88 and the examined compounds and are inversely proportional to the similarity degree.
The 310 molecules were spit into five equal groups of 50 molecules each, and one (last group) that contained 60 molecules . The study determined the 30 most suitable metabolites (Figures 2 and 3, and Table 1).

Filter Using Fingerprints
Various computational methods that describe the similarities between different molecules have gained more interest in drug discovery [38]. One of the most helpful techniques in this approach is fingerprints [39]. The fingerprint study includes binary strings that compute the existence or absence of vital sub-structural fragments to calculate the structural similarity between molecules. This technique is currently utilized in virtual screening and detection of similarities between hit compounds and the lead one. The main difference between the fingerprints and molecular similarity studies is that the first individually calculates the presence and or absence of certain descriptors in S88 and the examined compounds, while molecular similarity calculates the degree of similarity between them as a whole structure.
The fingerprints technique was carried out using Discovery Studio software and examined the following parameters: HBA, HBD [40], charge [41], hybridization [42], positive and negative ionizable [43], halogen, aromatic, or none of them besides the ALogP category of atoms. All the mentioned parameters were converted to pits by the computer. Then, Life 2022, 12, 1407 6 of 27 the computer calculated the bits in both S88 and the target compounds (SA), in the target compounds only (SB), or S88 only (SA). The identification of the most similar (that have the most identical molecular fingerprints) compounds to S88 is important to pick compounds with a higher degree of similarities. The most similar compounds are expected to exert greater protein binding and activity.
The study (Table 2)

Docking Studies
The docking analysis of 28, 41, 46, 47, 76, 87, 94, 98, 99, 192, 193, 200, 211, 298, and 300 was carried out against the coronavirus PLpro enzyme's binding site (PDB ID: 4OW0). The crystallized ligand (S88) was used as a reference compound. For each compound, 30 run poses were carried out. The applied procedure of molecular docking was verified through the there-docking of S88 against the PLpro active site for another time. The small value of the RMSD (0.98 Å) between the two poses indicated the applicability of the applied protocol ( Figure 4).
Differentiation between the tested compounds for their binding affinity was dependent on certain factors. (i) The first factor is the correct binding mode of a tested compound. The compound that exerted a binding mode very close to S88 was expected to have a good affinity against PLpro. The correct binding modes were determined according to the nature of the interactions (hydrogen or hydrophobic bonds) with the specific amino acid residues in the active pocket of PLpro. This factor is critical as a compound with the correct binding mode is expected to have a higher affinity than a compound with high binding energy having an incorrect binding mode. Therefore, the incorrect binding mode, resulting in incorrect affinity predictions, decreases the compound's rate of virtual screening [44,45]. (ii) Gibbs free energy (∆G binding) indicates the stability of the obtained conformation between the tested compound and PLpro (Table 3). According to the thermodynamic balance law, the value of ∆G is inversely proportional to the stability of the examined molecule and indicates that binding with PLpro will occur spontaneously. In other words, the increase in the negative free energy of a compound (reactant) will increase the reaction spontaneously and result in more stable conformations [46,47]. Differentiation between the tested compounds for their binding affinity was depend ent on certain factors. (i) The first factor is the correct binding mode of a tested compound The compound that exerted a binding mode very close to S88 was expected to have a good affinity against PLpro. The correct binding modes were determined according to the na ture of the interactions (hydrogen or hydrophobic bonds) with the specific amino acid residues in the active pocket of PLpro. This factor is critical as a compound with the correc binding mode is expected to have a higher affinity than a compound with high binding energy having an incorrect binding mode. Therefore, the incorrect binding mode, result ing in incorrect affinity predictions, decreases the compound's rate of virtual screening [44,45]. (ii) Gibbs free energy (ΔG binding) indicates the stability of the obtained confor mation between the tested compound and PLpro (Table 3). According to the thermody namic balance law, the value of ΔG is inversely proportional to the stability of the exam ined molecule and indicates that binding with PLpro will occur spontaneously. In other words, the increase in the negative free energy of a compound (reactant) will increase the reaction spontaneously and result in more stable conformations [46,47].     The proposed binding mode of S88 expressed a ∆G of −59.13 kcal/mol. S88 made one HB between its amide moiety and Tyr269. Additionally, the naphthyl moiety made eight hydrophobic interactions (HI) withAsp165, Met209, Arg167, Ala247, Thr302, Pro248, and Pro249. The ethyl bridge was included in two hydrophobic interactions with Pro249 and Tyr265. The piperidine moiety formed two hydrophobic bonds with Tyr265 and Tyr269. (Figure 5). binding modes. For this reason, such compounds were selected for further investigation.
The proposed binding mode of S88 expressed a ΔG of −59.13 kcal/mol. S88 made one HB between its amide moiety and Tyr269. Additionally, the naphthyl moiety made eight hydrophobic interactions (HI) withAsp165, Met209, Arg167, Ala247, Thr302, Pro248, and Pro249. The ethyl bridge was included in two hydrophobic interactions with Pro249 and Tyr265. The piperidine moiety formed two hydrophobic bonds with Tyr265 and Tyr269. (Figure 5).   Compound 94 showed good binding energy (ΔG = −50.82) against the PLpro active site. It formed four HBs with Lys158, Tyr274, and Arg167. Additionally, the phenyl rings were involved in five HIs with Leu163, Tyr269, Tyr265, and Asp165 (Figure 7). Compound 94 showed good binding energy (∆G = −50.82) against the PLpro active site. It formed four HBs with Lys158, Tyr274, and Arg167. Additionally, the phenyl rings were involved in five HIs with Leu163, Tyr269, Tyr265, and Asp165 (Figure 7).  Compound 98 revealed good fitting with a docking score of −52.21 kcal/mol. The OH group formed one HB with Asp303, and the methoxy group formed another HB with Lys158. Many HIs were observed between the tested compound and Asp165, Arg167, Pro249, Tyr269, Tyr265, Leu163, and Tyr274 ( Figure 8). Compound 98 revealed good fitting with a docking score of −52.21 kcal/mol. The OH group formed one HB with Asp303, and the methoxy group formed another HB with Lys158. Many HIs were observed between the tested compound and Asp165, Arg167, Pro249, Tyr269, Tyr265, Leu163, and Tyr274 ( Figure 8). Compound 98 revealed good fitting with a docking score of −52.21 kcal/mol. The OH group formed one HB with Asp303, and the methoxy group formed another HB with Lys158. Many HIs were observed between the tested compound and Asp165, Arg167, Pro249, Tyr269, Tyr265, Leu163, and Tyr274 ( Figure 8).   Figure 9). The compound demonstrated two HBs with Tyr274. In addition, it formed 12 HIs, as shown in Figure 10.    Compound 298 showed a binding mode against the PLpro active site with a binding affinity of −48.46 kcal/mol. It was incorporated in eight HIs with Pro248, Tyr265, Leu163, Tyr269, and Pro249 ( Figure 11). Compound 298 showed a binding mode against the PLpro active site with a binding affinity of −48.46 kcal/mol. It was incorporated in eight HIs with Pro248, Tyr265, Leu163, Tyr269, and Pro249 ( Figure 11).

ADMET
ADMET studies were achieved using Discovery Studio 4.0, with remdesivir as a reference. The following descriptors were examined. (i) The ability to penetrate the bloodbrain barrier [48] (BBB), intestinal absorption [49] (HIA), aqueous solubility [50] (S), CYP2D6 binding [51], hepatotoxicity, and plasma protein binding [52] (PPB). The calculated properties are listed in (Table 4). All compounds showed high levels of BBB penetration except molecules 28, 76, 94, 200, and 298, which displayed medium to very low BBB levels. All the tested molecules showed good absorption characteristics comparable to remdesivir, which exhibited a very low level of absorption. Moreover, the solubility of the tested molecules was projected to be between low and good levels except for molecule 211, which showed a very low level. All molecules in addition to remdesivir were calculated to be inhibitors against CYP2D6 except molecules 28, 87, 94, 98, 99, 192, 200, 298 and 300. All the tested molecules were expected to have unfavorable hepatotoxic effects except molecules 28, 41, and 192, which were predicted to be non-toxic. All tested molecules and remdesivir were expected to bind to the plasma protein with a percentage of >90%, except molecule 46, which demonstrated plasma protein binding <90%. (Figure 12).

ADMET
ADMET studies were achieved using Discovery Studio 4.0, with remdesivir as a reference. The following descriptors were examined. (i) The ability to penetrate the blood-brain barrier [48] (BBB), intestinal absorption [49] (HIA), aqueous solubility [50] (S), CYP2D6 binding [51], hepatotoxicity, and plasma protein binding [52] (PPB). The calculated properties are listed in (Table 4). All compounds showed high levels of BBB penetration except molecules 28, 76, 94, 200, and 298, which displayed medium to very low BBB levels. All the tested molecules showed good absorption characteristics comparable to remdesivir, which exhibited a very low level of absorption. Moreover, the solubility of the tested molecules was projected to be between low and good levels except for molecule 211, which showed a very low level. All molecules in addition to remdesivir were calculated to be inhibitors against CYP2D6 except molecules 28, 87, 94, 98, 99, 192, 200, 298 and 300. All the tested molecules were expected to have unfavorable hepatotoxic effects except molecules 28, 41, and 192, which were predicted to be non-toxic. All tested molecules and remdesivir were expected to bind to the plasma protein with a percentage of >90%, except molecule 46, which demonstrated plasma protein binding <90%. (Figure 12).  poor. c Aq. solubility level, a is extremely low, b is very low, c is low, d is good, e is optimal. d CYP2D6, n is a non-inhibitor, i is an inhibitor. e Hepatotoxicity, if >0.5 is toxic, if <0.5 is non-toxic. f PPBb is >90%, c is >95%.  Figure 12. The expected ADMET characters. Figure 12. The expected ADMET characters.
In silico testing revealed that the majority of molecules had expected low levels of toxicity (Table 5).

DFT Studies
DFT parameters including binding energy [63], HOMO [64], LUMO [64], gap energy [65], and dipole moment [66,67] were studied for the most promising molecules, 76, 94, and 298, using Discovery Studio software. S88 was used as a reference. The results of the DFT studies are summarized in Table 6 and Figures 13 and 14.

Frontier Molecular Orbitals Analysis
Frontier molecular orbitals analysis can efficiently demonstrate active sites in addition to determining the kinetic stability and the chemical reactivity of a molecule [68]. The EHOMO and ELUMO of the tested molecules were computed using DMol3 implemented in Discovery Studio software [69]. The LUMO may be engaged in a nucleophilic attack, while the HOMO refers to the most probable site of an electrophilic attack. The HOMO energy represents the ionization potential of a drug, while that of the LUMO describes the electron affinity.
For gap energy, it was reported that a molecule is thought to be softer and more chemically reactive when its energy gap is small. In addition, a molecule was assumed to have greater chemical hardness and to be more stable when it had a large energy gap [70]. In this study, molecule 76 was found to have a low level of gap energy of 0.096 Ha, while molecules 94 and 298 were found to have high gap energy of 0.141 and 0.131, respectively. These findings indicate that compound 76 has higher reactivity than compounds 94 and 298. On the contrary, compounds 94 and 298 may possess higher stability than compound 76.
For the dipole moment values, compound 94 had a dipole moment value of 3.582. This value is nearly equal to that of S88 (3.621). The elevated dipole moment was expected to increase HBing, and non-bonded interactions in the compound-protein complexes were predicted to increase the binding affinity during SARS-CoV-2 inhibition. Compounds 76 and 298 had fewer values of the dipole moment of 1.700 and 1.094, respectively. From these findings, it can be concluded that compounds 76 and 94 have a higher chance of interacting with the target protein than compound 298 (Table 6 and Figure 13).
As shown in Figure 13B, the HOMO spatial distributions of molecule 76 were mainly distributed on the 3-(3,4-dihydroxyphenyl) -7-hydroxy-5-methoxy-4H-chromen-4-one moiety, while those of LUMO were located on the 7-hydroxy-5-methoxy-4H-chromen-4one moiety (the electron acceptor zones).  Frontier molecular orbitals analysis can efficiently demonstrate active sites in addition to determining the kinetic stability and the chemical reactivity of a molecule [68]. The EHOMO and ELUMO of the tested molecules were computed using DMol3 implemented in Discovery Studio software [69]. The LUMO may be engaged in a nucleophilic attack, while the HOMO refers to the most probable site of an electrophilic attack. The HOMO energy represents the ionization potential of a drug, while that of the LUMO describes the electron affinity.
For gap energy, it was reported that a molecule is thought to be softer and more chemically reactive when its energy gap is small. In addition, a molecule was assumed to have greater chemical hardness and to be more stable when it had a large energy gap [70]. In this study, molecule 76 was found to have a low level of gap energy of 0.096 Ha, while molecules 94 and 298 were found to have high gap energy of 0.141 and 0.131, respectively. These findings indicate that compound 76 has higher reactivity than compounds 94 and 298. On the contrary, compounds 94 and 298 may possess higher stability than compound 76.
For the dipole moment values, compound 94 had a dipole moment value of 3.582. This value is nearly equal to that of S88 (3.621). The elevated dipole moment was expected to increase HBing, and non-bonded interactions in the compound-protein complexes were predicted to increase the binding affinity during SARS-CoV-2 inhibition. Compounds 76 and 298 had fewer values of the dipole moment of 1.700 and 1.094, respectively. From these findings, it can be concluded that compounds 76 and 94 have a higher chance of interacting with the target protein than compound 298 (Table 6 and Figure 13).
The specific role of the HOMO center (3-(3,4-dihydroxyphenyl) -7-hydroxy-5-methoxy-4H-chromen-4-one moiety) in the binding of the receptor was previously confirmed by our docking experiments. As we noticed in Figure 13, the carbonyl group at position-4 of 4H-chromen-4-one (HOMO center) formed an H-bond acceptor with the phenolic OH group (LUMO center) of Tyr229. Furthermore, the LUMO of the accepting species (the two phenolic OH groups of catechol moiety) formed two H-bond donors with the HOMO of the donating species (OH group of Thr302 and OH group of Tyr274).

Molecular Electrostatic Potential Maps (MEP)
MEP is a very helpful technique for understanding the 3D charge distributions over a molecule.
In MEP, the electronegative atoms are highlighted with red and can be acceptors in H-bonding interactions. On the other hand, the electron-poor atoms are highlighted in blue and are incorporated into H-bonds as donors. Finally, the neutral atoms are highlighted from green to yellow and incorporated in HIs [71].
The MEP map of molecule 76 shows that the negative potential sites are on oxygen atoms (seven red patches) and the positive potential sites are around the hydrogen atoms (six blue patches). This indicates that molecule 76 has seven positions available for Hbonding acceptors and six positions suitable for H-bond donors. This map defines the region in which the molecule can have non-covalent interactions ( Figure 14).
The presented study preferred glycyrrhizoflavone (76) as the most relevant inhibitor of human coronavirus PLpro. Glycyrrhizoflavone is a flavonoid that has been isolated from licorice and Glycyrrhiza glabra roots [72]. Glycyrrhisoflavone exhibited potent antiviral activity against the human immunodeficiency virus by inhibiting giant cell formation in the infected cells and inhibiting viral transcription [73,74].

Conclusions
Several computational filtration methods (similarity assessment, fingerprints check, docking, ADMET, toxicity, and DFT) were carried out on 310 metabolites of natural origin that were reported as antivirals against PLpro, (PDB ID: 4OW0) and its co-crystallized ligand S88. The experiments predicted a high degree of binding between glycyrrhizoflavone (76) and PLpro. Accordingly, the potential of glycyrrhizoflavone to be an inhibitor against human coronavirus PLpro inhibitor is highly expected. More studies must be carried out on such a promising drug to affirm its inhibitory potential against PLpro.

Molecular Similarity Detection
Was applied using Discovery Studio 4.0 software. Details have been discussed in detail in the Supplementary data.

Fingerprint Studies
Were applied using Discovery Studio 4.0 software. Details have been discussed in detail in the Supplementary data.