Finding the First Potential Inhibitors of Shikimate Kinase from Methicillin Resistant Staphylococcus aureus through Computer-Assisted Drug Design

Methicillin-resistant Staphylococcus aureus (MRSA) is an important threat as it causes serious hospital and community acquired infections with deathly outcomes oftentimes, therefore, development of new treatments against this bacterium is a priority. Shikimate kinase, an enzyme in the shikimate pathway, is considered a good target for developing antimicrobial drugs; this is given because of its pathway, which is essential in bacteria whereas it is absent in mammals. In this work, a computer-assisted drug design strategy was used to report the first potentials inhibitors for Shikimate kinase from methicillin-resistant Staphylococcus aureus (SaSK), employing approximately 5 million compounds from ZINC15 database. Diverse filtering criteria, related to druglike characteristics and virtual docking screening in the shikimate binding site, were performed to select structurally diverse potential inhibitors from SaSK. Molecular dynamics simulations were performed to elucidate the dynamic behavior of each SaSK–ligand complex. The potential inhibitors formed important interactions with residues that are crucial for enzyme catalysis, such as Asp37, Arg61, Gly82, and Arg138. Therefore, the compounds reported provide valuable information and can be seen as the first step toward developing SaSK inhibitors in the search of new drugs against MRSA.


Introduction
Today, antimicrobial resistance in bacteria has become a serious healthcare concern [1]; the World Health Organization (WHO) published a list of 12 bacteria in 2017 [2], their level of resistance to antibiotics has become so grave, that they represent an important threat. Therefore, these organisms have been declared priority pathogens to encourage antibiotic drug design by pharmaceutical companies [3]. From the Gram-positive bacteria shown on this list, drug-resistant Staphylococcus aureus has become one of the most problematic pathogens worldwide.
Methicillin-resistant Staphylococcus aureus (MRSA) has been widely known as a major cause of nosocomial and community-acquired infections that range from mild cases, such as skin and soft tissue infections, to more serious and deadlier, like bacteremia, osteomyelitis, and infective endocarditis [4]. In 2017, an estimated of 12,000 cases of MRSA infections Figure 1. Workflow to select the potential SaSK inhibitors by computer-assisted drug design. 94 95 Currently, virtual screening has become essential for the drug design process, it per-96 mits an accurate prediction of the position and conformation of a ligand in the binding 97 site of a target protein through an established scoring function [20]. After filtering, 52 com-98 pounds were docked into the SaSK active site using the protocol described in materials 99 and methods; the five molecules with the highest docking score and structural diversity 100 were selected in this study ( Figure 1 and Table 1

Virtual Screening
Currently, virtual screening has become essential for the drug design process, it permits an accurate prediction of the position and conformation of a ligand in the binding site of a target protein through an established scoring function [20]. After filtering, 52 compounds were docked into the SaSK active site using the protocol described in materials and methods; the five molecules with the highest docking score and structural diversity were selected in this study ( Figure 1 and Table 1). Docking results show that compounds interact with residues important for substrate binding or enzyme catalysis, such as Gly82 and Arg138, both are in charge of the stabilization and orientation of shikimate; moreover, Gly83 also participates in substrate stabilization [21]. Conversely, Arg120 is a residue present in the LID domain known for its function as a substrate stabilizing residue in the ATP-shikimate complex [22]. Finally, Asp37 participates in the interaction with the hydroxyl groups of shikimate molecule [23] while Asp35 is an important residue in the nucleotide binding domain that forms hydrogen bonds with other residues that facilitate the approximation of shikimate and ATP [21] ( Table 1 and Figure 2).

103
Docking results show that compounds interact with residues important for substrate 104 binding or enzyme catalysis, such as Gly82 and Arg138, both are in charge of the stabili-105 zation and orientation of shikimate; moreover, Gly83 also participates in substrate stabi-106 lization [21]. Conversely, Arg120 is a residue present in the LID domain known for its 107 function as a substrate stabilizing residue in the ATP-shikimate complex [22]. Finally, 108 Asp37 participates in the interaction with the hydroxyl groups of shikimate molecule [23] 109 while Asp35 is an important residue in the nucleotide binding domain that forms hydro-110 gen bonds with other residues that facilitate the approximation of shikimate and ATP [21]  Docking results show that compounds interact with residues important for substrate 1 binding or enzyme catalysis, such as Gly82 and Arg138, both are in charge of the stabili-1 zation and orientation of shikimate; moreover, Gly83 also participates in substrate stabi-1 lization [21]. Conversely, Arg120 is a residue present in the LID domain known for its 1 function as a substrate stabilizing residue in the ATP-shikimate complex [22]. Finally, 1 Asp37 participates in the interaction with the hydroxyl groups of shikimate molecule [23] 1 while Asp35 is an important residue in the nucleotide binding domain that forms hydro-1 gen bonds with other residues that facilitate the approximation of shikimate and ATP [ Docking results show that compounds interact with residues important for substrate binding or enzyme catalysis, such as Gly82 and Arg138, both are in charge of the stabilization and orientation of shikimate; moreover, Gly83 also participates in substrate stabilization [21]. Conversely, Arg120 is a residue present in the LID domain known for its function as a substrate stabilizing residue in the ATP-shikimate complex [22]. Finally, Asp37 participates in the interaction with the hydroxyl groups of shikimate molecule [23] while Asp35 is an important residue in the nucleotide binding domain that forms hydrogen bonds with other residues that facilitate the approximation of shikimate and ATP [21] ( Table 1 and Figure 2).  Docking results show that compounds interact with residues important for substrate 104 binding or enzyme catalysis, such as Gly82 and Arg138, both are in charge of the stabili-105 zation and orientation of shikimate; moreover, Gly83 also participates in substrate stabi-106 lization [21]. Conversely, Arg120 is a residue present in the LID domain known for its 107 function as a substrate stabilizing residue in the ATP-shikimate complex [22]. Finally, 108 Asp37 participates in the interaction with the hydroxyl groups of shikimate molecule [23]  Docking results show that compounds interact with residues important for substrate binding or enzyme catalysis, such as Gly82 and Arg138, both are in charge of the stabilization and orientation of shikimate; moreover, Gly83 also participates in substrate stabilization [21]. Conversely, Arg120 is a residue present in the LID domain known for its function as a substrate stabilizing residue in the ATP-shikimate complex [22]. Finally, Asp37 participates in the interaction with the hydroxyl groups of shikimate molecule [23] while Asp35 is an important residue in the nucleotide binding domain that forms hydrogen bonds with other residues that facilitate the approximation of shikimate and ATP [21] (Table 1 and Figure 2).
Although, there are no reports in literature of inhibitors for SaSK, other shikimate kinases, in particular for M. tuberculosis and H. pylori, have been shown [22,24]. These molecules comprise several scaffolds, such as pyrazolone derivatives, manzamines, 2aminobenzothiazoles, and substrate analogues, that particularly target the shikimate binding site [25][26][27].
Docking results in this study were able to select compounds that possess a particular orientation that allows them to perform important interactions with residues that are vital for enzyme catalysis, such as Arg61, Arg138, and Gly82, which are reported to be critical for ligand stabilization and orientation in studies performed in M. tuberculosis shikimate kinase [21][22][23]28].

Molecular Dynamics Studies
For purposes of gathering added information about SaSK-potential inhibitor complex, molecular dynamics simulations of 100 ns were performed. First, protein-ligand complex stability was evaluated analyzing the root mean square deviation (RMSD) of the Cα protein atoms. The data show that after 20 ns, four of the complexes reach stability, however, C5 leaves the binding site after 6 ns of simulation, therefore, no analysis was performed for this compound. Average RMSD values obtained are 0.416, 0.397, 0.397, and 0.459 nm for SaSK-C1, SaSK-C2, SaSK-C3, and SaSK-C4 complexes, respectively, indicating that all remain stable during simulation time ( Figure 3).
Furthermore, root mean square fluctuations analysis (RMSF) was performed, the results show that in all four complexes, the highest fluctuation observed corresponds to the LID domain (residues 120-135 in SaSK), a region that it is known for its great flexibility [29], in this case, this movement can be attributed to an essential movement to accommodate the compound within the binding site. These fluctuations are more notorious in SaSK-C1, SaSK-C3, and SaSK-C4 complexes (Figure 4). Average RMSF values obtained are 0.172, 0.191, 0.184, and 0.186 nm for SaSK-C1, SaSK-C2, SaSK-C3, and SaSK-C4, respectively. Furthermore, to assess the effect of compound binding in protein tertiary structure, the radius of gyration (Rg) analysis was realized. As it can be seen, the value of Rg in each complex keeps constant during entire simulation time, suggesting that none of the potential inhibitors alters the structure of the protein ( Figure 5). plex, molecular dynamics simulations of 100 ns were performed. First, protein-ligand 129 complex stability was evaluated analyzing the root mean square deviation (RMSD) of the 130 Cα protein atoms. The data show that after 20 ns, four of the complexes reach stability, 131 however, C5 leaves the binding site after 6 ns of simulation, therefore, no analysis was 132 performed for this compound. Average RMSD values obtained are 0.416, 0.397, 0.397, and 133 0.459 nm for SaSK-C1, SaSK-C2, SaSK-C3, and SaSK-C4 complexes, respectively, indicat-134 ing that all remain stable during simulation time (Figure 3). 135 136 Figure 3. Root mean square deviation analysis for free enzyme and the different protein-ligand 137 complexes. 138 Furthermore, root mean square fluctuations analysis (RMSF) was performed, the re-139 sults show that in all four complexes, the highest fluctuation observed corresponds to the 140 LID domain (residues 120-135 in SaSK), a region that it is known for its great flexibility 141 [29], in this case, this movement can be attributed to an essential movement to accommo-142 date the compound within the binding site. 149 Furthermore, to assess the effect of compound binding in protein tertiary structure, 150 the radius of gyration (Rg) analysis was realized. As it can be seen, the value of Rg in each 151 complex keeps constant during entire simulation time, suggesting that none of the poten-152 tial inhibitors alters the structure of the protein ( Figure 5).  149 Furthermore, to assess the effect of compound binding in protein tertiary structure, 150 the radius of gyration (Rg) analysis was realized. As it can be seen, the value of Rg in each 151 complex keeps constant during entire simulation time, suggesting that none of the poten-152 tial inhibitors alters the structure of the protein ( Figure 5). Finally, hydrogen bond analysis was performed. The data show that the number of 157 H-bonds in the complexes vary during simulation time with an average of 7, 4, 3, and 1 158 for SaSK-C1, SaSK-C2, SaSK-C3, and SaSK-C4, respectively, suggesting a better binding 159 mode for compound C1 (Figure 6 and Table 2). Finally, hydrogen bond analysis was performed. The data show that the number of H-bonds in the complexes vary during simulation time with an average of 7, 4, 3, and 1 for SaSK-C1, SaSK-C2, SaSK-C3, and SaSK-C4, respectively, suggesting a better binding mode for compound C1 (Figure 6 and Table 2).

Linear Interaction Energy
The linear interaction energy (LIE) [30] approach is an extensively described method to compute binding affinities. It permits to combine explicit conformational sampling (of the protein-bound and unbound-ligand states) with efficiency to be able to calculate quantitative values for the protein-ligand binding free energy ∆G bind .
In this study, the LIE method was employed to calculate the binding affinities of the four complexes evaluated by molecular dynamics. Results show that C1 and C4 obtained a negative value for ∆G bind because of a combination of Van der Waals and electrostatic interaction energies, while C2 and C3 obtained a positive one, indicating that the former make a more stable complex with SaSK than the latter (Table 3).

ADME-Tox Evaluation
In the final analysis to complete the characterization of these potential inhibitors, an important issue during first steps of drug design process is the prediction of the ADME-Tox properties of the molecules. In this context, a detailed study was performed for the four compounds using the SwissADME online tool [31] and PreADMET server [32]. The data show that, in general, all compounds obtained evaluations in the permitted range of each characteristic, which indicates the drugability potential of these compounds (Tables 4 and  5). It is important to note that this type of characterization has not been reported for inhibitors against other SKs from bacteria [22,24].

Small Molecules Chemical Library
In this study, A 3D small molecule database was retrieved from ZINC15 database Tranches (http://zinc15.docking.org/, accessed on 9 January 2021) [33]. First, based on substrate structure, a Log p value of −1 and a molecular weight ≤ 350 Da, were used as selection criteria. By the time this work was finished, the number of compounds was neighboring 5 million.
Prior to virtual screening by docking, compounds were filtered according to Lipinski's Rule of Five' [34] to select those that possess physicochemical properties present in potential drug candidates only. Additionally, given that unfavorable structural alerts that can produce toxicity may lead to a compound being rejected in further studies [19,35], an in silico toxicity risk assessment for Mutagenicity, Tumorigenic, Irritant, and Reproductive effects was performed using Osiris Data Warrior software [36]. The presence of a single toxic parameter was enough to eliminate a given compound. Furthermore, the topological surface area (TPSA) was also calculated where a value ranging from >75 to <140 was necessary to be included, along with a number of rotatable bonds of less than 5. Finally, compounds were clustered according to their structure and similarity using as criteria that the highest similarity value fell below 0.8 according to Data Warrior Software.

Docking Studies
The previously reported SaSK 3D homology model [18] was employed for docking studies, the protein structure was prepared with the Protein Preparation Wizard in Maestro (Schrödinger Suite Release 2019-4) [37]. Bonds order was assigned, hydrogen atoms were added, and formal charges were treated, protein minimization was applied with the OPLS3e forcefield. The grid box was generated with default settings, with a 10Å × 10Å × 10Å size, using the center between amino acids Met14, Asp37, Ile48, Phe60, Arg61, Gly81, Gly82, Gly83 Pro118, and Arg138, which correspond to amino acids forming the shikimate binding site of the enzyme [26,29]. After filtering criteria (Figure 1), the 3D ligand structures of compounds selected were prepared using Ligprep, at a selected pH range of 7 ± 2, where ionization states were generated; the energy was minimized using the OPLS3e force field. Docking studies were carried out with Glide [38] implemented in the Maestro software, using the extra precision (XP) mode [39] that provides further elimination of false positives by applying extensive sampling and a more stringent scoring function, the best five poses of each compound were retained as output.

Molecular Dynamics Studies
Molecular dynamics simulations were performed using GROMACS version 2019.3 [40] and CHARMM 36 forcefield [41]. Before MD simulation, compounds were parametrized using SwissParam Server (http://swissparam.ch/, accessed 17 May 2021) [42]. SaSK coordinates and topology were constructed using GROMACS, then each ligand was merged into a complex with SaSK and the system was immersed into the center of a dodecahedral box, the solute box distance was set at 1.0 nm. The system was solvated by the addition of TIP3P waters [43] and counterions were added to reach a salt concentration of 0.15 M.
MD simulations began with an energy minimization (EM) simulation as the first step, which was performed during 100 ps to reach a local minimum employing the steepest descent algorithm. Afterwards, the system was submitted to temperature and pressure equilibration by performing two 100 ps equilibration steps namely, an isothermal-isochoric (NVT) ensemble followed by an isothermal-isobaric (NPT) ensemble under no position restraint, thus bringing the system to a 310 K temperature and 1 bar pressure. Temperature and pressure were maintained by employing the velocity-rescale thermostat [44] and the Parrinello-Rahman pressure coupling methods [45]. Finally, a 100 ns timescale MD was carried out, employing a 1.2 nm for short-range interactions and the leap-frog integrator algorithm. MD simulations were then analyzed using Visual Molecular Dynamics (VMD) software [46]. Furthermore, to explore the structural and dynamic behavior of the proteinligand complex, analysis of the MD data involved root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg), and hydrogen bond analysis (H-bonding).

Linear Interaction Energy Calculations
Binding free energies were obtained for each of the complexes based on the linear interaction energy (LIE) method calculated by the Equation (1): where (V LJ ) bound indicates the average Lennard-Jones energy for ligand-protein interaction; (V LJ ) free is the average Lennard-Jones energy for ligand-solvent interaction; (V CL ) bound is the average electrostatic energy for ligand-protein interaction; (V CL ) free is the average electrostatic energy for ligand-solvent interaction; the LIE coefficients are given by α, β, and γ which for small drug-like ligands correspond to α = 0.18, β = 0.50, and γ = 0.00 [30,47,48].

Conclusions
The study of an around 5 million small molecules database through a computerassisted drug design strategy, permitted to find the first set of potential inhibitors of SaSK. According to the structural analysis, these compounds formed interactions with residues important for enzyme catalysis. Furthermore, they demonstrated good ADME-Tox and druglike characteristics, which make these molecules an attractive starting point for the development of new drugs against MRSA.

Supplementary Materials:
The following is available online. Table S1: Structures of the compounds selected for virtual screening.