In Silico Identification of Possible Inhibitors for Protein Kinase B (PknB) of Mycobacterium tuberculosis

With tuberculosis still being one of leading causes of death in the world and the emergence of drug-resistant strains of Mycobacterium tuberculosis (Mtb), researchers have been seeking to find further therapeutic strategies or more specific molecular targets. PknB is one of the 11 Ser/Thr protein kinases of Mtb and is responsible for phosphorylation-mediated signaling, mainly involved in cell wall synthesis, cell division and metabolism. With the amount of structural information available and the great interest in protein kinases, PknB has become an attractive target for drug development. This work describes the optimization and application of an in silico computational protocol to find new PknB inhibitors. This multi-level computational approach combines protein–ligand docking, structure-based virtual screening, molecular dynamics simulations and free energy calculations. The optimized protocol was applied to screen a large dataset containing 129,650 molecules, obtained from the ZINC/FDA-Approved database, Mu.Ta.Lig Virtual Chemotheca and Chimiothèque Nationale. It was observed that the most promising compounds selected occupy the adenine-binding pocket in PknB, and the main interacting residues are Leu17, Val26, Tyr94 and Met155. Only one of the compounds was able to move the active site residues into an open conformation. It was also observed that the P-loop and magnesium position loops change according to the characteristics of the ligand. This protocol led to the identification of six compounds for further experimental testing while also providing additional structural information for the design of more specific and more effective derivatives.


Introduction
Mycobacterium tuberculosis (Mtb) is the pathogen responsible for the development of tuberculosis (TB). TB is, to this day, one of the most common causes of mortality, by a single pathogen, around the world. According to the World Health Organization (WHO), in 2019, Mtb was responsible for 10 million new infections globally, with an estimated 1.2 million deaths [1]. The treatment for TB is rather extensive and involves a combination of different antibiotics. If not performed correctly, it may lead to the development of drug-resistant strains such as the multi-drug-resistant and extensively drug-resistant strains (MDR and XDR-TB), which in turn, may cause higher transmissibility rates and larger outbreaks [2,3]. Finding additional or new strategies to control infection is of the utmost importance [4].
PknB is one of the 11 "eukaryotic-like" Serine/Threonine protein kinase (STPK) responsible for phosphorylation-mediated signaling in Mtb. It is a transmembrane protein with four extracellular PASTA (Penicillin binding proteins and Serine Threonine kinases-Associated) motifs and an intracellular catalytic kinase domain [5]. PknB is involved in many key bacterial processes such as cell wall synthesis, cell division and metabolism. Associated) motifs and an intracellular catalytic kinase domain [5]. PknB is involved in many key bacterial processes such as cell wall synthesis, cell division and metabolism. Several studies have shown that its expression affects cell morphology and survival, with a depletion of PknB leading to the loss of bacterial viability [6][7][8]. In addition, it is essential for the reactivation of the cells from a hypoxic state, being an important "switch" between dormancy and resuscitation [9,10]. These aspects not only reinforce the significance of the role of PknB in Mtb, but also highlight its importance as a potential drug target.
Structurally, PknB is divided into two lobes, an N-lobe and a C-terminal lobe. Between them, is the ATP pocket. As with most STPKs, it has five main groups that are indispensable for the catalytic process: helix C (residues 51-65); the P-loop (the phosphatebinding or glycine-rich loop) where the amides of the glycines coordinate the phosphates of ATP, functioning as a clamp (residues [18][19][20][21][22][23]; the magnesium positioning loop (residues 156-158); the catalytic loop (residues 135-143); and the activation loop (residues 164-177) (Figure 1). PknB can be found in two conformations, open and closed, depending on the position of the helix C and the P-loop region. The open conformation corresponds to an inactive state of PknB, whereas the closed conformation corresponds to the active state. To be in a closed conformation and in an active state, a glutamic acid from the helix C must be close contact with a lysine near the P-loop, forming a salt bridge. This lysine is a fundamental piece for the catalysis as it orients the position of the phosphates of ATP [11,12]. Mycobacterial STPK only presents 30% sequence identity to eukaryotic STPK; however, the ATP-binding and catalytic mechanism are highly conserved [13]. There is a great interest in finding kinase inhibitors, mainly in cancer research, where it is important in signal regulation. In fact, the first crystal structure of PknB bound to a non-ATP analog was the result of an in silico screening of 40,000 compounds that concluded that mitoxantrone, a compound used in cancer treatment, could inhibit bacterial growth in culture. Since then, other inhibitors were found but showed limited activity on mycobacteria mainly due to poor permeability through the cell envelope [13,14]. Mycobacteria has a unique, thick lipid-rich envelope composed three layers: the capsule, the cell wall and the cell membrane. This highly hydrophobic cell envelope acts as a protective armor and hinders the diffusion of many compounds, including more hydrophobic molecules such as Mycobacterial STPK only presents 30% sequence identity to eukaryotic STPK; however, the ATP-binding and catalytic mechanism are highly conserved [13]. There is a great interest in finding kinase inhibitors, mainly in cancer research, where it is important in signal regulation. In fact, the first crystal structure of PknB bound to a non-ATP analog was the result of an in silico screening of 40,000 compounds that concluded that mitoxantrone, a compound used in cancer treatment, could inhibit bacterial growth in culture. Since then, other inhibitors were found but showed limited activity on mycobacteria mainly due to poor permeability through the cell envelope [13,14]. Mycobacteria has a unique, thick lipid-rich envelope composed three layers: the capsule, the cell wall and the cell membrane. This highly hydrophobic cell envelope acts as a protective armor and hinders the diffusion of many compounds, including more hydrophobic molecules such as antibiotics [3,15,16]. Moreover, to reach the target, the drug must be able to cross the host cell membranes, as Mtb can be present in multiple intra-and extracellular entities, such as macrophages and necrotic granulomas [17].
Computer-aided drug discovery has been gaining more popularity due to the gains in computational power, the development of new programs and the exponential raise in the number of 3D protein targets available and deposited in the Protein Data Bank [18][19][20]. The usage of these computational tools in the early stages of drug design minimizes failures in later stages and makes the entire process more cost-efficient [21][22][23]. There are multiple published works in which the usage of these tools has led to the discovery of multiple promising compounds against PknB [24][25][26][27][28]. Consequently, this work aims to use computational tools to understand the binding pocket of PknB and to identify new promising inhibitors. To accomplish this goal, a combination of multiple computational methods was employed, including virtual screening, molecular dynamics simulations and free energy calculations [29]. An overview of the workflow is depicted in Figure 2. antibiotics [3,15,16]. Moreover, to reach the target, the drug must be able to cross the host cell membranes, as Mtb can be present in multiple intra-and extracellular entities, such as macrophages and necrotic granulomas [17].
Computer-aided drug discovery has been gaining more popularity due to the gains in computational power, the development of new programs and the exponential raise in the number of 3D protein targets available and deposited in the Protein Data Bank [18][19][20]. The usage of these computational tools in the early stages of drug design minimizes failures in later stages and makes the entire process more cost-efficient [21][22][23]. There are multiple published works in which the usage of these tools has led to the discovery of multiple promising compounds against PknB [24][25][26][27][28]. Consequently, this work aims to use computational tools to understand the binding pocket of PknB and to identify new promising inhibitors. To accomplish this goal, a combination of multiple computational methods was employed, including virtual screening, molecular dynamics simulations and free energy calculations [29]. An overview of the workflow is depicted in Figure 2.

Structure Identification
Structural information regarding the serine/threonine protein kinase (PknB) from Mycobacterium tuberculosis was obtained from the Protein Data Bank [30]. The X-ray structures and corresponding information of the targets selected are detailed in Table 1. A total of 15 protein structures and 8 ligands were used in the protein-ligand docking protocol validation stage. Only the catalytic domain of each structure (i.e., chain A) was used in this study.

Structure Identification
Structural information regarding the serine/threonine protein kinase (PknB) from Mycobacterium tuberculosis was obtained from the Protein Data Bank [30]. The X-ray structures and corresponding information of the targets selected are detailed in Table 1. A total of 15 protein structures and 8 ligands were used in the protein-ligand docking protocol validation stage. Only the catalytic domain of each structure (i.e., chain A) was used in this study. Most of the molecular targets studied present magnesium (Mg 2+ ) or manganese ions (Mn 2+ ) in the active site. It has been demonstrated that magnesium ions have a stabilizing effect on ATP in order to allow for the kinase phosphorylation process to occur, even though its precise role is complex and not fully understood [36,37]. The mutations in the structures 3F61, 3F69, 3ORI, 3ORK, 3ORL, 3ORM, 3ORO, 3ORP and 3ORT were evaluated to further understand the role of specific residues in the dimerization and kinase regulation [31,32]. The manganese ions were typically added in such studies as substitutes for Mg 2+ .
Nearly all of the selected structures correspond only to the catalytic domain of PknB bound to an ATP analog. As mentioned previously, 2FUM was the first crystallized structure of PknB in complex with an inhibitor, a non-ATP analog, called mitoxantrone [18]. The structures containing antagonists (2FUM, 5U94 and 6B2P) are in the closed conformation, with Lys40 and Glu59 facing each other, except 3F69, in which the Glu59 is facing the outside of the pocket. The main different in these four structures, compared to the ones containing ATP analogs, is a small shift of the P-loop in the direction of the C-terminal lobe. All the antagonist molecules are in the nucleotide binding gap of ATP. The most important residues for the interactions with the antagonist molecules seem to be Leu17, Gly18, Val25, Ala38, Met 92, Glu93, Tyr94 and Val95 in the N-terminal lobe, and Met145 and Met155 in the C-terminal lobe [13,26,33].

Protein-Ligand Docking Protocol Validation
To evaluate and optimize the docking protocol and to select the best structure to represent this target in the virtual screening campaign, re-docking and cross-docking studies were performed. The goal of the re-docking consists in providing an assessment of the quality of the docking protocol in reproducing the pose of the crystallographic ligands in their specific targets [38]. In fact, the accuracy of different docking methods/scoring algorithms can vary significantly with the characteristics of the protein targets [38,39]. Several aspects of the protocol, such as the size and coordinates of the binding area, were then optimized toward minimizing the RMSD between the re-docked and crystallographic poses. Cross-docking consists in docking each ligand into all of the studied structures. It is an important method to evaluate the ability of each structure to correctly accommodate ligands from other structures and is a measure of its general usefulness [24]. Different scoring functions evaluate the target and the ligands in different ways [38][39][40][41]. Additionally, the type of target and characteristics of the ligand molecules can cause variation in the virtual screening (VS) results; therefore, it is important to consider as many scoring function (SF) alternatives as possible [39]. For this reason, five of the most widely used scoring functions in docking were used in this study: AutoDock Vina [42] and GOLD [43] CHEMPLP, ASP, ChemScore and GoldScore. The RMSD calculations were performed using DockRMSD [44]. The best performing SFs alongside the best protein structure, in terms of the RMSD (Å) comparison and docking score, were the ones selected to move to the next stage.

Virtual Screening Protocol Optimization
At this stage, the molecular targets that did not contain magnesium were removed, as these ions are an important part of the catalytic site, and the re-docking and cross-docking scores were decreased in the absence of these cations, confirming their importance for proper ligand placement. Similarly, the structures that contained manganese ions were also removed from the data set because its addition was only to evaluate their role in the stability of the protein [32] and their score in the re-docking and cross docking was lower when compared with the structures containing Mg 2+ cations (Supplementary Materials- Table S1). From the five structures containing Mg 2+ ions, only the two that presented the best results in the re-docking and cross-docking studies moved on to this stage.
To evaluate the quality of the VS protocol, the ability to discriminate between real binders and non-binders was evaluated by applying an active vs. decoys protocol. Actives are compounds that possess known experimental activity against PknB. Decoys are compounds randomly generated from the active molecule to have similar 1-D physico-chemical properties but different 2-D topologies to make them likely non-binders [37]. The ideal scoring function would rank the real binders better than decoys, but due to imperfections in the performance of the actual SF, that is not always the case [23]. Hence, the chosen SF to move on to the screening of large databases of compounds is the one that provides the best results in terms of early recognition metrics.
The active compounds were extracted from CHEMBL [45] and BindingDB [46] with a previous filtration of the compounds with an experimental IC 50 or K d under 5500 nm. For each active molecule, 50 decoys were randomly generated using the Database of Useful Decoys: Enhanced (DUD-E) [47]. The test set was composed of a total of 68 active compounds and 3400 decoys. Several metrics were used to evaluate the quality of the VS protocol. Early recognition metrics, such as the Receiver Operating Characteristic curve (ROC), the Area under the Curve (AUC) and the Enrichment Factor (EF), were calculated using Microsoft excel. The Total Gain (TG), Robust Initial Enhancement (RIE), Boltzmann enhanced discrimination of ROC (BEDROC) and Hit Rate were calculated using the online "Screening Explorer" server [48].

Virtual Screening for the Identification of Potential PknB Binders
The optimized VS protocol obtained in the previous step and the SF GOLD/ASP was applied to three large databases of compounds in search of promising molecules to be tested experimentally as possible PknB inhibitors. The virtual databases selected were: the FDA (U.S. Food and Drug Administration)-approved database, a subset of the ZINC database [49], the Mu.Ta.Lig. Virtual Chemotheca [50] and the French National chemical library (Chimiothèque Nationale) [51].
The ZINC database is a free archive of commercially available compounds, provided by the Irwin and Shoichet Laboratories in the Department of Pharmaceutical Chemistry at the University of California (UCSF), that contains over 230 million compounds. For this study, a set of 3206 FDA-approved compounds was tested. Drug repurposing is a powerful tool in finding new uses to already known and well-characterized drugs as antimicrobial agents, whether used alone or in synergy with other antimicrobials [52]. The Mu.Ta.Lig. Virtual Chemotheca was created from COST Action CA15135 and is composed of molecules already synthetized/isolated and tested during the medicinal chemistry research for lead compounds [50]. A total of 64,804 compounds were screened against PknB. The Chimiothèque Nationale offers a diverse collection of more than 70,000 compounds and 15,000 original and little-tested natural extracts. In this study, a total of 61,640 compounds from the Chimiothèque were screened. The top compounds of each database were evaluated using pharmacokinetics and pharmacodynamics predictors such as SwissADME [53] and pkCSM [54]. In the top results, the toxic and carcinogenic compounds were filtered out. Compounds with a more positive LogP values were also favored due to their higher lipophilicity, and hence, higher probability of crossing the mycobacterial envelope. A selection of molecules of each database was performed and the selected molecules were moved on to the molecular dynamics and free energy calculation stages.

Molecular Dynamics (MD) Simulations
The Molecular Dynamics (MD) simulations performed in this work were conducted using the Amber18 software package [55]. To assign the molecular mechanics parameters, ANTECHAMBER was used, taking into account the general amber force field [56], with RESP charges calculated using HF/6-31G(d) with Gaussian16 [57]. The protein was described using the Amber14sb force field [A]. The protein-ligand complexes were contained in a box of TIP3P water molecules. The edges of the water box were placed at a distance of at least 12 Å from each atom of the complex. Periodic boundary conditions were used on all MD simulations. Long-range electrostatic interactions were calculated using the particlemesh Ewald summation method. For the electrostatic and Lennard-Jones interactions, a cut-off value of 10.0 Å was used. All bonds involving hydrogen atoms were constrained using the SHAKE algorithm. This allowed the application of a 2-fs time step.
All solvated ligand-PknB complexes went through a phased minimization procedure with four energy-minimization stages to remove bad contacts and optimize the system prior to the MD simulation. In the first stage (2500 steps), only the positions of the water molecules were optimized, with the remaining parts of the system kept constrained to their initial positions through the application of a harmonic potential (50 kcal mol −1 Å −2 ). In the second stage (2500 steps), all hydrogen atoms present in the system were optimized. In the third stage (2500 steps), only the positions of the protein backbone atoms were kept constrained, enabling the optimization of the amino-acid side chains. Finally in the fourth stage (10,000 steps), all constrains were removed, enabling a full optimization of the model. After the energy minimization steps, all complexes were heated from 0 to 310.15 K over 50 ps, using a Langevin thermostat at constant volume (NVT ensemble). They were further equilibrated at 310.15 K during 50 ps to stabilize the density. The production phase was run for 100 ns using an NPT ensemble, with a pressure of 1 bar (Berendsen barostat) and a temperature of 310.15 K (Langevin thermostat). The trajectory analysis was performed using the cpptraj tool [58] and VMD [59]. To produce the high-quality protein figures, PyMOL [60] was used.

MM/GBSA Free Energy Calculations
The MM/PBSA.py [61] script from Amber was used to perform the MM/GBSA calculations. The calculations used the last 40 ns of each MD simulation, with an interval of 100 ps between each structure (a total of 800 structures per complex). A salt concentration of 0.100 mol dm −3 was used. To gather information about the contribution of each active-site amino acid residue to the overall free energy, the per-residue free energy decomposition option was used.

Data Analysis
All the data from the docking, re-docking, VS studies and MD simulations results were treated using the average and standard deviation tools from Microsoft Excel. To submit the VS results to the online "Screening Explorer" server, a csv file composed of 3 columns with the Identification of the compounds, the score (ranked from the best to the worst), and the numerical indication of Active (1) or Decoy (0) was created for each SF. Table 2 compares the RMSD (Å) values for the re-docking of all the PknB protein structures studied obtained with all the SFs used: Vina, CHEMPLP, ASP, ChemScore and GoldScore. Molecular docking studies were performed to evaluate the effect of the metal ions (Mg 2+ and Mn 2+ ) on the re-docking scores. All scores improved in the presence of magnesium ions (data not shown), indicating that they are indeed crucial for the stabilization of the ATP analogs and other binders.

Results and Discussion
Generally, the SFs that better reproduced the crystallographic pose were Vina, with an average value of 1.09 Å, followed by ASP, with a value of 1.70 Å. On the other hand, all the other GOLD SFs did not show a good performance (CHEMPLP-2.03 Å, ChemScore 3.22 Å and GoldScore-2.15 Å). ChemScore was the SF that had the most difficulty in the re-docking, which can be explained by the fact that bigger, more flexible ligands, such as ATP analogs, can be more challenging to place correctly [43]. Overall, the RMSD values are below 2 Å across most targets, indicating that there is a general good performance with the different docking programs/SFs. According to the cross-docking results (Supplementary Materials-Tables S2-S6) the target 1O6Y and 3F61 were the best in the cross-docking of all the ligands, across most of the SFs tested.
The structures 3ORI and 3ORL also presented very good scores for all the SFs; however, these structures did not perform as effectively in the re-docking stage (with RMSD values of 2.14 Å and 2.47 Å, respectively). Furthermore, the presence of manganese cations and mutations made them less suitable to move on to the VS of large databases of compounds. Based on these results, the choice of the structures to continue the protocol were 1O6Y and 3F61, which had the best scores in 3 out of the 5 different SFs. The GOLD SFs were chosen to progress to the next stage, due to their observed higher computational efficiencies achieved with similar accuracy levels. Table 3 compares the ability of all tested GOLD SFs (CHEMPLP, ASP, ChemScore and GoldScore) in discriminating between binders and decoys for both studied structures, using several metrics. The enrichment factor is described as the ratio between the percentage of active compounds in the selected subset and the percentage in the entire database [62]. A high EF in the top 1% means that the scoring function can find more true positives early on, and it means that the scoring function is adequate or better for a specific target. However, this metric is highly dependent on the number of actives and is unable to discriminate between SFs that present actives in the top of the list and SFs that present actives just before the threshold [63]. For these reasons, the robust initial enhancement (RIE) was also considered since it does not depend on the number of actives used and is less susceptible to variations when a small number of actives are used. Comparing the structures studied, the higher EF1% was obtained for structure 1O6Y across all the SFs. However, the ASP scoring function stands out with an EF1% of 3.76. A ROC curve is obtained by plotting the true positive rate (TPR) against the false positive rate (FPR). In other words, if only true positives are found at the top of the database, it leads to a higher ROC curve and an AUC of 100%. BEDROC is a normalization of the RIE bounded by 0 and 1. TG quantifies the discrimination of actives over decoys attributable to score variations. TG values over 0.25 combined with an AUC over 0.5 indicate a good performance and reproducibility from the VS protocol [48]. In Figure 3, it is possible to visualize the difference in ROC curves between targets and SFs. ASP stands out as the SF that can find more true positives early on (3.76) and as the one that shows a higher AUC (80.9%). ChemScore was the SF that performed the worst in both targets. Since the best performing SF, across all the metrics and for both targets, was ASP, it was this SF that was selected in the virtual screening of large databases of compounds. and GoldScore) in discriminating between binders and decoys for both studied structures, using several metrics. The enrichment factor is described as the ratio between the percentage of active compounds in the selected subset and the percentage in the entire database [62]. A high EF in the top 1% means that the scoring function can find more true positives early on, and it means that the scoring function is adequate or better for a specific target. However, this metric is highly dependent on the number of actives and is unable to discriminate between SFs that present actives in the top of the list and SFs that present actives just before the threshold [63]. For these reasons, the robust initial enhancement (RIE) was also considered since it does not depend on the number of actives used and is less susceptible to variations when a small number of actives are used. Comparing the structures studied, the higher EF1% was obtained for structure 1O6Y across all the SFs. However, the ASP scoring function stands out with an EF1% of 3.76. A ROC curve is obtained by plotting the true positive rate (TPR) against the false positive rate (FPR). In other words, if only true positives are found at the top of the database, it leads to a higher ROC curve and an AUC of 100%. BEDROC is a normalization of the RIE bounded by 0 and 1. TG quantifies the discrimination of actives over decoys attributable to score variations. TG values over 0.25 combined with an AUC over 0.5 indicate a good performance and reproducibility from the VS protocol [48]. In Figure 3, it is possible to visualize the difference in ROC curves between targets and SFs. ASP stands out as the SF that can find more true positives early on (3.76) and as the one that shows a higher AUC (80.9%). ChemScore was the SF that performed the worst in both targets. Since the best performing SF, across all the metrics and for both targets, was ASP, it was this SF that was selected in the virtual screening of large databases of compounds.  An explanation for the discrepancy of results between the two structures might be the position of the residue Glu59 and the position of the magnesium positioning loop (Figure 4). Some authors defend that the position of this loop is important to bring the helix C to its activated form [64], that is, promoting the contact between Lys40 and Glu59. For this reason, the structure that was chosen to move on to the VS stage was 1O6Y.
An explanation for the discrepancy of results between the two structures might be the position of the residue Glu59 and the position of the magnesium positioning loop (Figure 4). Some authors defend that the position of this loop is important to bring the helix C to its activated form [64], that is, promoting the contact between Lys40 and Glu59. For this reason, the structure that was chosen to move on to the VS stage was 1O6Y. From the virtual screening results, a selection of molecules was used to perform MD simulations. These molecules were chosen based on their ranking during the virtual screening procedure and their molecular diversity. For the molecules from the Chemotheca and from Chimiothèque Nationale databases, Lipinski's rule of five [65,66] was also considered.
The chosen molecules from ZINC/FDA were Inositol Niacinate, Riboflavin Monophosphate, Fosaprepitant, Nilotinib, Gadofosveset, Tedizolid Phosphate, Cobicistat and Cangrelor. From Chemotheca, the chosen molecules were CMLDID11504, CMLDID1335, CMLDID24682, CMLDID25037, CMLDID35281, CMLDID39270, CMLDID42750, CMLDID46926, CMLDID49099 and CMLDID57593. Finally, from Chimiothèque Nationale, the chosen molecules were AB-00011214, AB-00011297, AB-00014565, AB-00019576, AB-00028661, AB-00057453, AB-00063630, AB-00064179, AB-00070072 and AB-00074812. All selected molecules, their scores, and some of their physical-chemical properties can be seen in Figure 5. These properties and figures were obtained using the data From the virtual screening results, a selection of molecules was used to perform MD simulations. These molecules were chosen based on their ranking during the virtual screening procedure and their molecular diversity. For the molecules from the Chemotheca and from Chimiothèque Nationale databases, Lipinski's rule of five [65,66] was also considered.
The chosen molecules from ZINC/FDA were Inositol Niacinate, Riboflavin Monophosphate, Fosaprepitant, Nilotinib, Gadofosveset, Tedizolid Phosphate, Cobicistat and Cangrelor. From Chemotheca, the chosen molecules were CMLDID11504, CMLDID1335, CMLDID24682, CMLDID25037, CMLDID35281, CMLDID39270, CMLDID42750, CML-DID46926, CMLDID49099 and CMLDID57593. Finally, from Chimiothèque Nationale, the chosen molecules were AB-00011214, AB-00011297, AB-00014565, AB-00019576, AB-00028661, AB-00057453, AB-00063630, AB-00064179, AB-00070072 and AB-00074812. All selected molecules, their scores, and some of their physical-chemical properties can be seen in Figure 5. These properties and figures were obtained using the data visualization program DataWarrior [67]. As a reference molecule, Mitoxantrone was used. Two of the chosen molecules have been reported as active against M. tuberculosis or other members of the Mycobacteriaceae family. Nilotinib was shown to regulate protective innate immune responses against intracellular Mycobacterium bovis and Mycobacterium avium [68]. Tedizolid Phosphate was reported to have powerful bactericidal capabilities against M. avium [69] and showed promising capabilities against the Mycobacterium abscessus com-plex [70]. Furthermore, Tedizolid Phosphate has also been reported to be active against M. tuberculosis [71,72]. visualization program DataWarrior [67]. As a reference molecule, Mitoxantrone was used. Two of the chosen molecules have been reported as active against M. tuberculosis or other members of the Mycobacteriaceae family. Nilotinib was shown to regulate protective innate immune responses against intracellular Mycobacterium bovis and Mycobacterium avium [68]. Tedizolid Phosphate was reported to have powerful bactericidal capabilities against M. avium [69] and showed promising capabilities against the Mycobacterium abscessus complex [70]. Furthermore, Tedizolid Phosphate has also been reported to be active against M. tuberculosis [71,72]. To validate the molecular docking results, the structural stability of the protein-ligand complex was evaluated, and MM/GBSA calculations and molecular dynamics simulations were carried out. For each molecule, in complex with 1O6Y, 100 ns of simulation was performed, starting from the posed predicted from docking. To evaluate the structural stability, multiple properties were calculated for the last 40 ns of simulation for each ligand-protein complex, and these are presented in Table 4. To validate the molecular docking results, the structural stability of the protein-ligand complex was evaluated, and MM/GBSA calculations and molecular dynamics simulations were carried out. For each molecule, in complex with 1O6Y, 100 ns of simulation was performed, starting from the posed predicted from docking. To evaluate the structural stability, multiple properties were calculated for the last 40 ns of simulation for each ligand-protein complex, and these are presented in Table 4.
When comparing to the initial docking pose, 13 of the 28 ligands had an RMSD value lower than 2 Å (for further information, see Supplementary Materials- Figures S1-S3). Even in the cases where the RMSD value was higher, the standard deviation remained low. This suggests an induced-fit adjustment to the binding pocket of PknB. All molecules remained bound to the protein in the predicted binding location. The stability of the complexes is further supported by the stable SASA values observed during the last 40 ns of simulation. Considering that all protein-ligand complexes were stable, MM/GBSA calculations could be performed. The results are presented in Table 5 and a comparison of the MM/GBSA results with the virtual screening scores is displayed in Figure 6.   A total of 15 of 28 ligands performed equal or better than the reference ligand. From these, two were from Chemotheca, six were from Chimiothèque Nationale and seven were from ZINC/FDA. Between these ligands, these MM/GBSA values varied between −23.0 ± 0.1 and −59.8 ± 0.3 kcal/mol, compared to Mitoxantrone's −23.0 ± 0.2 kcal/mol. The best ligand from Chemotheca was CMLDID24682, with a MM/GBSA value of −31.5 ± 0.1 kcal/mol. The molecule from Chimiothèque Nationale with the best performance was AB- A total of 15 of 28 ligands performed equal or better than the reference ligand. From these, two were from Chemotheca, six were from Chimiothèque Nationale and seven were from ZINC/FDA. Between these ligands, these MM/GBSA values varied between −23.0 ± 0.1 and −59.8 ± 0.3 kcal/mol, compared to Mitoxantrone's −23.0 ± 0.2 kcal/mol. The best ligand from Chemotheca was CMLDID24682, with a MM/GBSA value of −31.5 ± 0.1 kcal/mol. The molecule from Chimiothèque Nationale with the best performance was AB-00014564, with a result of −37.0 ± 0.1 kcal/mol. Lastly, the ligand from ZINC/FDA with the best results was Riboflavin Monophosphate with an MM/GBSA value of −59.8 ± 0.3 kcal/mol.
To further analyze the affinity between the best performing ligands and the receptor, the overall binding free energy was decomposed into the contribution of each residue. The residues that had, in general, a bigger contribution are represented in Figure 7. The six top performing ligands were chosen for this analysis, one from Chemotheca (CML-DID24682), two from Chimiothèque Nationale (AB-00014564 and AB-00011214) and three from ZINC/FDA (Table 6). From ZINC/FDA, the molecules chosen were Riboflavin Monophosphate, Tedizolid phosphate and Gadofosveset trisodium. Overall, the PknB residues that had higher contributions for binding free energy were Leu17, Val26, Tyr94 and Met155. For CMLDID24682, Val 25 and Met155 had the greater contributions, with ∆G values of −2.3 ± 0.3 and −2.6 ± 0.4 kcal/mol, respectively. The residues with the higher contributions were maintained for AB-00014564, with a ∆G value of −2.9 ± 0.4 kcal/mol for Val 25 and of −3.7 ± 0.8 kcal/mol for Met155. In the case of AB-00011214, Met155 remained one of the more important residues, having a ∆G value of −2.0 ± 0.7 kcal/mol, and was joined by Tyr94, which had a ∆G value of −2.2 ± 0.5 kcal/mol. For the binding of the first of the FDA molecules, Riboflavin Monophosphate, the greater contribution came from Leu17 and Met155, presenting ∆G values of −2.4 ± 0.5 and −2.3 ± 0.4 kcal/mol, respectively. As for the binding of Tedizolid Phosphate, the greater contribution came from Val25, with a ∆G value of −1.6 ± 0.5 kcal/mol. The same was observed for the final ligand, Gadofosveset, in which Val25 presented the higher contribution to the binding free energy with a decomposed ∆G value of 1.8 ± 0.3 kcal/mol. Overall, the PknB residues that had higher contributions for binding free energy were Leu17, Val26, Tyr94 and Met155. For CMLDID24682, Val 25 and Met155 had the greater contributions, with ∆G values of −2.3 ± 0.3 and −2.6 ± 0.4 kcal/mol, respectively. The residues with the higher contributions were maintained for AB-00014564, with a ∆G value of −2.9 ± 0.4 kcal/mol for Val 25 and of −3.7 ± 0.8 kcal/mol for Met155. In the case of AB-00011214, Met155 remained one of the more important residues, having a ∆G value of −2.0 ± 0.7 kcal/mol, and was joined by Tyr94, which had a ∆G value of −2.2 ± 0.5 kcal/mol. For the binding of the first of the FDA molecules, Riboflavin Monophosphate, the greater contribution came from Leu17 and Met155, presenting ∆G values of −2.4 ± 0.5 and −2.3 ± 0.4 kcal/mol, respectively. As for the binding of Tedizolid Phosphate, the greater contribution came from Val25, with a ∆G value of −1.6 ± 0.5 kcal/mol. The same was observed for the final ligand, Gadofosveset, in which Val25 presented the higher contribution to the binding free energy with a decomposed ∆G value of 1.8 ± 0.3 kcal/mol. Table 6. Best performing molecules from the ZINC/FDA database.

Riboflavin Monophosphate
Riboflavin Monophosphate is a form of vitamin B2 used to restore riboflavin in anaemia, migraine, alcoholism, and hyperhomocysteinemia [73].

Drug Name Description Structure
Riboflavin Monophosphate Riboflavin Monophosphate is a form of vitamin B2 used to restore riboflavin in anaemia, migraine, alcoholism, and hyperhomocysteinemia [73].

Tedizolid phosphate
Tedizolid phosphate is a member of the oxazolidinone class of antibiotics. It is effective in the treatment of certain Gram-positive bacterial infections. It works by inhibiting protein synthesis by bacterial ribosomes [74].

Gadofosveset
Gadofosveset trisodium is an intravenous contrast agent used with magnetic resonance imaging [75].
To extract the most representative structure from the MD simulations, the cluster command from cpptraj was used [59]. The six best performing molecules and their interactions are represented in Figures 8 and 9. The represented interactions were obtained by using the Protein-Ligand Interaction Profiler [76].
During the interaction of CMLDID2468 with PknB, multiple hydrogen bonds were established. Lys40 interacted with the Triazine group, and Val95 interacted with the nitrogen from the terminal aniline group. Lastly, Ala144 interacted with the nitrogen in the other terminal group. Throughout the simulation, there was a shift in the position of the magnesium ions and the carboxamide region of the molecule was stabilized by one of them. The P-loop accompanied the shift of the cations and moved toward the front, and Glu59 was oriented toward the second Mg 2+ .
AB-00011214 only formed one hydrogen bond with PknB. Asp96 interacted with one of the secondary amine groups. Throughout the simulation, there was a shift in the position of the P-loop of PknB opening the catalytic site, increasing the distance between Lys40 and Glu59. This occurred due to the structure of the ligand. It was a large and flexible molecule and as it unfolded in the catalytic site, it triggered a shift in the position of the residues of the nucleotide-binding area as well as the Mg cations.
As for AB-00014564, this ligand interacted with PknB with three hydrogen bonds. Lys40 interacted with the phenol group. Glu93 and Val95 interacted with the Pyrrolidine-2,5-diol group, with the nitrogen and one of the oxygens, respectively. The molecule occupied a higher position in nucleotide-binding area of the PknB binding pocket, causing a shift in the P-loop region, the Mg2 + ions and magnesium position loop, toward the Clobe. The pocket, however, maintained its closed configuration.

Tedizolid phosphate
Tedizolid phosphate is a member of the oxazolidinone class of antibiotics. It is effective in the treatment of certain Gram-positive bacterial infections. It works by inhibiting protein synthesis by bacterial ribosomes [74].

Drug Name Description Structure
Riboflavin Monophosphate Riboflavin Monophosphate is a form of vitamin B2 used to restore riboflavin in anaemia, migraine, alcoholism, and hyperhomocysteinemia [73].

Tedizolid phosphate
Tedizolid phosphate is a member of the oxazolidinone class of antibiotics. It is effective in the treatment of certain Gram-positive bacterial infections. It works by inhibiting protein synthesis by bacterial ribosomes [74].

Gadofosveset
Gadofosveset trisodium is an intravenous contrast agent used with magnetic resonance imaging [75].
To extract the most representative structure from the MD simulations, the cluster command from cpptraj was used [59]. The six best performing molecules and their interactions are represented in Figures 8 and 9. The represented interactions were obtained by using the Protein-Ligand Interaction Profiler [76].
During the interaction of CMLDID2468 with PknB, multiple hydrogen bonds were established. Lys40 interacted with the Triazine group, and Val95 interacted with the nitrogen from the terminal aniline group. Lastly, Ala144 interacted with the nitrogen in the other terminal group. Throughout the simulation, there was a shift in the position of the magnesium ions and the carboxamide region of the molecule was stabilized by one of them. The P-loop accompanied the shift of the cations and moved toward the front, and Glu59 was oriented toward the second Mg 2+ .
AB-00011214 only formed one hydrogen bond with PknB. Asp96 interacted with one of the secondary amine groups. Throughout the simulation, there was a shift in the position of the P-loop of PknB opening the catalytic site, increasing the distance between Lys40 and Glu59. This occurred due to the structure of the ligand. It was a large and flexible molecule and as it unfolded in the catalytic site, it triggered a shift in the position of the residues of the nucleotide-binding area as well as the Mg cations.
As for AB-00014564, this ligand interacted with PknB with three hydrogen bonds. Lys40 interacted with the phenol group. Glu93 and Val95 interacted with the Pyrrolidine-2,5-diol group, with the nitrogen and one of the oxygens, respectively. The molecule occupied a higher position in nucleotide-binding area of the PknB binding pocket, causing a shift in the P-loop region, the Mg2 + ions and magnesium position loop, toward the Clobe. The pocket, however, maintained its closed configuration.

Gadofosveset
Gadofosveset trisodium is an intravenous contrast agent used with magnetic resonance imaging [75].

Drug Name Description Structure
Riboflavin Monophosphate Riboflavin Monophosphate is a form of vitamin B2 used to restore riboflavin in anaemia, migraine, alcoholism, and hyperhomocysteinemia [73].

Tedizolid phosphate
Tedizolid phosphate is a member of the oxazolidinone class of antibiotics. It is effective in the treatment of certain Gram-positive bacterial infections. It works by inhibiting protein synthesis by bacterial ribosomes [74].

Gadofosveset
Gadofosveset trisodium is an intravenous contrast agent used with magnetic resonance imaging [75].
To extract the most representative structure from the MD simulations, the cluster command from cpptraj was used [59]. The six best performing molecules and their interactions are represented in Figures 8 and 9. The represented interactions were obtained by using the Protein-Ligand Interaction Profiler [76].
During the interaction of CMLDID2468 with PknB, multiple hydrogen bonds were established. Lys40 interacted with the Triazine group, and Val95 interacted with the nitrogen from the terminal aniline group. Lastly, Ala144 interacted with the nitrogen in the other terminal group. Throughout the simulation, there was a shift in the position of the magnesium ions and the carboxamide region of the molecule was stabilized by one of them. The P-loop accompanied the shift of the cations and moved toward the front, and Glu59 was oriented toward the second Mg 2+ .
AB-00011214 only formed one hydrogen bond with PknB. Asp96 interacted with one of the secondary amine groups. Throughout the simulation, there was a shift in the position of the P-loop of PknB opening the catalytic site, increasing the distance between Lys40 and Glu59. This occurred due to the structure of the ligand. It was a large and flexible molecule and as it unfolded in the catalytic site, it triggered a shift in the position of the residues of the nucleotide-binding area as well as the Mg cations.
As for AB-00014564, this ligand interacted with PknB with three hydrogen bonds. Lys40 interacted with the phenol group. Glu93 and Val95 interacted with the Pyrrolidine-2,5-diol group, with the nitrogen and one of the oxygens, respectively. The molecule occupied a higher position in nucleotide-binding area of the PknB binding pocket, causing To extract the most representative structure from the MD simulations, the cluster command from cpptraj was used [59]. The six best performing molecules and their interactions are represented in Figures 8 and 9. The represented interactions were obtained by using the Protein-Ligand Interaction Profiler [76].
During the interaction of CMLDID2468 with PknB, multiple hydrogen bonds were established. Lys40 interacted with the Triazine group, and Val95 interacted with the nitrogen from the terminal aniline group. Lastly, Ala144 interacted with the nitrogen in the other terminal group. Throughout the simulation, there was a shift in the position of the magnesium ions and the carboxamide region of the molecule was stabilized by one of them. The P-loop accompanied the shift of the cations and moved toward the front, and Glu59 was oriented toward the second Mg 2+ .
AB-00011214 only formed one hydrogen bond with PknB. Asp96 interacted with one of the secondary amine groups. Throughout the simulation, there was a shift in the position of the P-loop of PknB opening the catalytic site, increasing the distance between Lys40 and Glu59. This occurred due to the structure of the ligand. It was a large and flexible molecule and as it unfolded in the catalytic site, it triggered a shift in the position of the residues of the nucleotide-binding area as well as the Mg cations.   As for AB-00014564, this ligand interacted with PknB with three hydrogen bonds. Lys40 interacted with the phenol group. Glu93 and Val95 interacted with the Pyrrolidine-2,5-diol group, with the nitrogen and one of the oxygens, respectively. The molecule occupied a higher position in nucleotide-binding area of the PknB binding pocket, causing a shift in the P-loop region, the Mg2 + ions and magnesium position loop, toward the C-lobe. The pocket, however, maintained its closed configuration.
Riboflavin Monophosphate established multiple interactions. Gly21 formed a hydrogen bond with the phosphate group. Lys40 formed a salt bridge with the same phosphate group. Asp102 established a hydrogen bond with one of the nitrogens from the Imidazolidine-2,4-diol group. Finally, Ala142 and Asp156 formed hydrogen bonds with the three oxygens from the main chain. The phosphate groups of this compound interacted with the magnesium cations in the same way as the ATP analogs, and hence, there was no significative difference in the positions of the P-loop and magnesium positioning loop.
The Tedizolid Phosphate bond to PknB formed a hydrogen bond between Ser23 and the oxygen from the Hydroxyoxazolidin group. Lys40 established a salt bridge with the phosphate group. The phosphate group of Tedizolid Phosphate also interacted with the Mg 2+ in the same way as the ATP analogs; however, the ring portion of compound was not in the same place as the adenine moiety of ATP would be, leading to a shift of the activation and catalytic loops.
Finally, Gadofosveset established two hydrogen bonds with PknB. Phe19 and Gly21 interacted with two of the ligand's carboxylic acids. In addition, Arg101 and Lys140 formed salt bridges with two other carboxylic acids. Finally, Lys40 formed a salt bridge with the phosphate group. Throughout the simulation, there were no significant differences in the position of the P-loop. However, the phosphate group, which was present in the middle of the ligand, was stabilized by the Mg cations, so it was oriented in a way that was closer to the cations than to where the adenine moiety of ATP would have been. This also led to a shift in the position of the activation and catalytic loops.

Conclusions
In this article, an in silico methodology is described for the discovery of new mycobacterial serine/threonine-protein kinase, PknB, inhibitors. From the six compounds identified as potential inhibitors, only one (AB-00011214) was able to turn the active site from a closed to an open conformation, with a shift in the position of the residue Glu59. In the simulations of the other five compounds, there was a slight shift of the Glu59, but only toward one of the Mg cations; this was not enough to lead to an open conformation. Two of the FDA compounds (Tedizolid Phosphate and Gadofosveset) caused significative changes in the position of the activation and catalytic loops, with Tedizolid showing intracellular antimicrobial activity against Mtb. Of course, the precise kinetic effect of these structural modifications is still not clear; however, these results can provide clues toward a better understanding of the inhibitory mechanism. Further in vitro and in vivo experimental studies on the drugs described should be considered to evaluate their ability to cross the bacterial envelope and confirm their actual potential as anti-tuberculosis agents, not only against Mtb but also toward resistant strains.
In conclusion, the data presented here demonstrate that this multi-level computational approach accurately predicted the binding position of several crystallographic compounds and identified six potential strong candidates for PknB inhibition. It also provided solid structural information on the protein-ligand interactions that can be used to design more specific and effective derivatives.
Supplementary Materials: The following are available online, Figure S1: Root mean square deviation plots for the selected ligands from the ZINC/FDA database, Figure S2: Root mean square deviation plots for the selected ligands from the Mu.Ta.Lig. Virtual Chemotheca, Figure S3: Root mean square deviation plots for the selected ligands from the Chimiothèque Nationale, Table S1: GOLD Re-docking results showing the Impact of the presence of Mg2+ and Mn2+ on the score, Table S2: CHEMPLP Cross-docking score results for all the molecular targets studied. A higher score corresponds to a better affinity, Table S3: ASP score results for all the molecular targets studied. A higher score corresponds to a better affinity, Table S4: ChemScore score results for all the molecular targets studied. A higher score corresponds to a better affinity, Table S5: GoldScore score results for all the molecular targets studied. A higher score corresponds to a better affinity, Table S6: Vina score results for all the molecular targets studied. A more negative score corresponds to a better affinity.