A Molecular Modeling Approach to Identify Potential Antileishmanial Compounds Against the Cell Division Cycle (cdc)-2-Related Kinase 12 (CRK12) Receptor of Leishmania donovani

The huge burden of leishmaniasis caused by the trypanosomatid protozoan parasite Leishmania is well known. This illness was included in the list of neglected tropical diseases targeted for elimination by the World Health Organization. However, the increasing evidence of resistance to existing antimonial drugs has made the eradication of the disease difficult to achieve, thus warranting the search for new drug targets. We report here studies that used computational methods to identify inhibitors of receptors from natural products. The cell division cycle-2-related kinase 12 (CRK12) receptor is a plausible drug target against Leishmania donovani. This study modelled the 3D molecular structure of the L. donovani CRK12 (LdCRK12) and screened for small molecules with potential inhibitory activity from African flora. An integrated library of 7722 African natural product-derived compounds and known inhibitors were screened against the LdCRK12 using AutoDock Vina after performing energy minimization with GROMACS 2018. Four natural products, namely sesamin (NANPDB1649), methyl ellagic acid (NANPDB1406), stylopine (NANPDB2581), and sennecicannabine (NANPDB6446) were found to be potential LdCRK12 inhibitory molecules. The molecular docking studies revealed two compounds NANPDB1406 and NANPDB2581 with binding affinities of −9.5 and −9.2 kcal/mol, respectively, against LdCRK12 which were higher than those of the known inhibitors and drugs, including GSK3186899, amphotericin B, miltefosine, and paromomycin. All the four compounds were predicted to have inhibitory constant (Ki) values ranging from 0.108 to 0.587 μM. NANPDB2581, NANPDB1649 and NANPDB1406 were also predicted as antileishmanial with Pa and Pi values of 0.415 and 0.043, 0.391 and 0.052, and 0.351 and 0.071, respectively. Molecular dynamics simulations coupled with molecular mechanics Poisson–Boltzmann surface area (MM/PBSA) computations reinforced their good binding mechanisms. Most compounds were observed to bind in the ATP binding pocket of the kinase domain. Lys488 was predicted as a key residue critical for ligand binding in the ATP binding pocket of the LdCRK12. The molecules were pharmacologically profiled as druglike with inconsequential toxicity. The identified molecules have scaffolds that could form the backbone for fragment-based drug design of novel leishmanicides but warrant further studies to evaluate their therapeutic potential.


Introduction
Leishmaniasis is a worldwide menace that exists in all continents except Oceania and it is endemic in the tropical and subtropical areas in Eastern Africa, Southern Europe, the Middle East, South-eastern Mexico, and Central and South America [1]. Approximately one million new cases and between 26,000 to 65,000 deaths occur annually [2]. Leishmaniasis is a neglected tropical disease caused by the trypanosomatid protozoan Leishmania parasites transmitted to humans through the bites of infected phlebotomine sand flies [3][4][5][6][7]. The disease manifests in three major forms, namely cutaneous leishmaniasis (CL), mucocutaneous leishmaniasis (MCL), and visceral leishmaniasis (VL) [8,9]. During the last decade, leishmaniasis has been observed with cases of co-infections in areas including the Mediterranean region, France, Italy, Portugal, Spain, Thailand, and Brazil [10][11][12]. Moreover, VL co-infection with HIV-infected patients living in Asia (especially India) and some African countries have been reported [13].
Leishmaniasis mostly affects people living in poor areas and places further economic stress on scanty financial resources [14][15][16]. The savings of most households are depleted to get treatment, while the few others incur debt. Leishmaniasis impacts negatively on the psychological and social status of infected persons. The disfiguring scars lead to various forms of social stigmatization and exclusion from community activities [17].
Currently, the dearth of effective and affordable drugs is a major problem hindering the eradication of leishmaniasis. Existing drugs are expensive, ranging from USD 30 to 1500 [17]. Paromomycin (PM) is the cheapest option in India, while liposomal amphotericin B (AmBisome) and miltefosine (Milt) costs USD 162-229 and USD 119 per patient, respectively [18].
Drug resistance is also a major issue facing the existing therapeutic options, hence the need to identify new drug targets. The cell division cycle (CDC)-2-related kinases CRK3, CRK6, and CRK12, which are cyclin-dependent kinases (CDKs) have recently been identified as plausible targets [19,20]. The overexpression of both CRK12 and the cyclin protein CYC9 have been reported to increase the resistance of L. donovani to pyrazolopyrimidines [20]. However, CRK12 has been reported to exist in a complex with CYC9 [19][20][21]. In bloodstream trypanosomes, both CRK12 and CYC9 are critical for proliferation in vitro [21]. Computational modelling studies showed that the most promising compound (GSK3186899), which inhibited the L. donovani parasites in a mouse model, binds to the CRK12 in the ATP binding pocket [19,20]. Mutation studies also suggested that GSK3186899 binds to CRK12 and not CYC9 since the effectiveness of GSK3186899 was reduced in a mutant version of the CRK12 [19,20]. The CRK12 is an essential gene for L. donovani and Leishmania mexicana promastigotes [20,22] and critical in the bloodstream stage of Trypanosoma brucei [21]. It also plays an essential role in the survival of trypanosomatids of Trypanosoma brucei [21], which corroborates CRK12 as a drug target for parasitic kinetoplastids belonging to the Trypanosoma genus [20,22]. In addition, the depletion of CRK12 results in the expansion of the flagellar pocket and impairment of endocytosis [21,23].
Computer-aided drug design (CADD) has become more advantageous than the traditional approach of high-throughput screening (HTS) as it has helped reduce the wastage of resources in terms of cost, effort, and time by significantly decreasing the number of compounds and filtering out only hits for further HTS. Natural products remain an untapped reservoir of new drug candidates for combating various kinds of diseases. The African flora is rich in biodiversity [24] and can be exploited to produce novel drug candidates from their natural sources. Therefore, the identification of new bioactive compounds via in silico drug design is vital in unravelling novel leads that have the potential to inhibit the activity of L. donovani by targeting the Leishmania donovani cell division cycle (CDC)-2-related kinase 12 (LdCRK12).
This study seeks to model a reasonably accurate 3D structure of the LdCRK12 and identify potential natural product-derived LdCRK12 inhibitory compounds through virtual screening. It also sought to characterize the mechanisms of binding between the LdCRK12 and potential inhibitory molecules using molecular dynamics (MDs) simulations integrated with molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) [25,26]. In addition, it undertakes predictive pharmacokinetic and physicochemical profiling as well as the biological activity of compounds to identify potential novel drug-like leads.

Materials and Methods
A schematic pipeline detailing the step-by-step techniques employed in the study is described in Figure 1. After modelling and validating the 3D structure of the LdCRK12, structure-based virtual screening (SBVS) was performed to identify compounds with high binding affinity to the LdCRK12 protein. Additionally, the selected hits were docked against the human cyclin-dependent kinase 9 (CDK9) since it is a homologue of the kinase domain of the LdCRK12. Molecular interactions between the receptors and the compounds were investigated using MD studies including MM/PBSA. Chemical absorption, distribution, metabolism, excretion, and toxicity (ADMET) predictions were performed to evaluate the toxicity of the compounds. Thereafter, the biological activity of identified biomolecules was predicted using machine learning-based Open Bayesian techniques [27,28]. Methodology schema employed in this study for predicting potential antileishmanial compounds. Three modelling techniques comprising Modeller [29,30], I-TASSER [31][32][33][34] and Robetta [35][36][37] were used to predict potential LdCRK12 structures. Evaluation of the predicted protein structures revealed the reasonably best model. Natural compounds from the African Natural Product Database (AfroDB), as well as the North African Natural Product Database (NANPDB) and known antileishmanial compounds, were docked against LdCRK12 and the human CDK9 receptors. The potential lead compounds were subjected to absorption, distribution, metabolism, excretion, and toxicity (ADMET), biological activity predictions, and molecular dynamics (MDs) computations.

Sequence Retrieval
Since the experimental 3D structure of the LdCRK12 does not exist, there was the need to employ modelling techniques to predict a reasonably accurate structure. The protein sequence of the LdCRK12 with UniProtKB ID: A0A3S7WQK2 was retrieved from UniProtKB, a repository for amino acid sequences of proteins [38][39][40].

Obtaining the Structure of LdCRK12 and Human CDK9
Three different modelling approaches were employed in this study, comprising I-TASSER Suite [31][32][33][34], Robetta [35][36][37] and Modeller 9.20 [29,30] to predict the 3D structures of the LdCRK12 protein. The structure of the human CDK9 was retrieved from the protein data bank (PDB) with PDB ID 4BCF. The details used to generate a reasonably valid structure of the LdCRK12 via Modeller 9.20, I-TASSER, and Robetta are described.

Template Search and Selection
The sequence of the LdCRK12 was uploaded into SWISS-MODEL which performed a basic local alignment search tool (BLAST) search to obtain suitable templates that were identical to the target sequence [41]. A BLAST search was also conducted on the kinase domain using the BLAST option in UniProtKB. The most suitable template was then selected for modelling.

Structure Prediction Using Modeller
EasyModeller 4.0 [42], a graphical user interface (GUI) for Modeller was used to model the structure of LdCRK12. The sequence and the selected template (PDB ID 4BCF) were imported into EasyModeller 4.0. Sequence alignment was performed to predict the secondary structure of the protein by using the selected template and the sequence of the LdCRK12. Modeller then used the outcome to generate five models from which the best is selected based on their discrete optimized protein energy (DOPE) scores. DOPE is a statistical potential score used to evaluate homology models in protein structure prediction. For the same target, the model with the lowest DOPE score was chosen as the best [30,43].

Structure Prediction Using I-TASSER
I-TASSER (https://zhanglab.ccmb.med.umich.edu/I-TASSER/; accessed on 22 October 2019) was employed to predict the structure of LdCRK12. The amino acid sequence of the LdCRK12 protein was uploaded into the I-TASSER platform and 5 protein structures were predicted using default parameters.

Structure Prediction Using Robetta
The LdCRK12 amino acid sequence was also uploaded into Robetta (https://robetta. bakerlab.org; accessed on 27 February 2020) and the "comparative modelling (CM) only" option was selected. Robetta then parsed the sequence into putative domains and built models for the domains which are homologues to solved protein structures using comparative modelling [37]. Five protein structures were predicted using default parameters.

Structural Validation
The quality of the generated models was assessed via SAVESv5.0 (http://servicesn. mbi.ucla.edu/SAVES/; accessed on 9 March 2020) along with Ramachandra plots from PROCHECK (https://www.ebi.ac.uk/thornton-srv/databases/pdbsum/Generate.html; accessed on 6 February 2021) [44]. The z-score obtained from ProSA-web [45,46], an indication of the overall model quality of the structures, was also determined. The z-score determines whether the input model is of X-ray or NMR quality. The local model quality of the structures was also determined by plotting the energies as a function of protein residue position. The positive values signify problematic or erroneous parts of the input structure. The reasonably best structure was selected based on the quality assessments performed.

Prediction of Binding Sites
Computed Atlas of Surface Topography of proteins (CASTp) [47,48] was used to predict potential binding sites of the LdCRK12 protein. Chimera and PyMOL were used to assess the features of the predicted binding sites [49][50][51].

Preparation of Proteins and Ligand Libraries
The ligands were obtained from the African Natural Product Database (AfroDB) and the North African Natural Product Database (NANPDB) [52,53]. A total of 6842 compounds were obtained in 2D spatial data file (sdf) format from the NANPDB and were converted to 3D structures using Open Babel's "gen3d" option. Additionally, 880 compounds were retrieved from the AfroDB in 3D sdf format. A total of 7722 compounds were obtained for this study by combining the two databases and removing duplicates. Additionally, the compound libraries were filtered based on Lipinski's rule of five.
Compounds labelled 5, 7, and 8 which showed very good half-maximum effective concentration (EC 50 ) values in a mouse model of visceral leishmaniasis by inhibiting CRK12 were used in this study [20]. Amphotericin B, miltefosine, and paromomycin were also included in the study. GSK3186899 (also known as compound 7 or DDD85365), amphotericin B, miltefosine, and paromomycin were retrieved from PubChem with compound identifiers (CIDs) 122429808, 5280965, 3599, and 165580, respectively. MarvinSketch 17.17.0 was used to generate the 3D sdf of compounds 5 and 8. Additionally, a 2-amino-4-heteroarylpyrimidine inhibitor (Code: T6Q), an inhibitor of the human CDK9 was extracted from the complex and saved in sdf format. All ligand structures were then energy minimized using the universal force field (UFF) under the Conjugate Gradient algorithm in 200 steps before being converted to the partial charge and atom type (pdbqt) file format of the Protein Data Bank using Open Babel.
Both LdCRK12 and the human CDK9 were energy minimized using the Optimized Potentials for Liquid Simulations (OPLS)/All Atom (AA) force field in GROMACS 2018. PyMOL (PyMOL Molecular Graphics System, Version 1.5.0.4, Schrödinger, LLC) was used to visualize the energy minimized structures and to remove the water molecules surrounding the protein. The protein structures were then saved in the Protein Data Bank format (pdb) using PyMOL. The protein structures were then converted to AutoDock Vina's compatible pdbqt format using the "make macromolecule" option in PyRx.

Virtual Screening
Autodock Vina was employed for the virtual screening process [54]. The pre-filtered library and the known drugs were screened against the LdCRK12 using a grid box dimension of 91.21 × 93.45 × 78.24 Å 3 and centered at (74.47, 128.44, 81.76) Å to cover the kinase domain. Compounds that possessed binding energies higher than −8.5 kcal/mol were not selected. A more stringent threshold was used herein since a previous study showed that −7.0 kcal/mol which was defined for AutoDock users can significantly distinguish between putative specific and non-specific protein-ligand bonds [55]. The result was then inspected visually using PyMOL to select the best docked ligands.
The known ligands and the selected compounds were re-docked to the human CDK9 using AutoDock Vina. The CDK9 protein was remodelled using the existing CDK9 structure (PDB ID: 4BCF) as a template via Modeller before molecular docking studies due to missing residues. A grid box with the dimension of 80.86 × 62.73 × 91.07 Å 3 and center (81.89, 80.83, 70.34) Å was specified for the CDK9. Compounds that demonstrated a higher binding affinity to the human CDK9 than 2-amino-4-heteroaryl-pyrimidine were not considered for downstream analysis.

Characterisation of Mechanism of Binding
The interactions between LdCRK12 and the ligands were determined and analyzed via LigPlot + v1.4.5 using default parameters [56]. Additionally, the human CDK9-ligand interactions were investigated.

Pharmacological Profiling
Selected compounds with high binding affinities with the LdCRK12 protein and low binding affinities to the human CDK9 were subjected to absorption, distribution, metabolism, and excretion (ADME) evaluation using SwissADME [57]. The toxicity profiles of the selected compounds were evaluated using OSIRIS Property Explorer in DataWarrior 5.0.0 [58]. DataWarrior uses features of chemical structures to predict physicochemical properties. The algorithm in the OSIRIS Property Explorer predicts the likelihood of a drug being a mutagenic, tumorigenic, irritant, and possessing a reproductive effect. Prediction of activity spectra for substances (PASS) was used to predict the biological activity of the compounds. PASS predicts the biological activity spectra of compounds using the simplified molecular input line entry system (SMILES) files of the structures based on the Bayesian approach [27,28].

Quality Evaluation of Shortlisted Molecules
The inhibitory constant (Ki) was calculated using the binding energies of the selected compounds along with other metrics consisting of ligand efficiency (LE), LE scale (LE_Scale), fit quality (FQ), and LE-dependent lipophilicity (LELP). The abovementioned metrics were determined using the method described previously [59,60].

MD Simulations of Proteins and Protein-Ligand Complexes
A 10 ns MD simulation was performed for LdCRK12 and protein-ligand complexes using GROMACS 2018 [61,62]. Xmgrace [63] was used to plot the graphs generated from the MD simulations. The binding free energies of the complexes were calculated using the MM/PBSA method [25]. MM/PBSA calculations of the complexes were carried out using g_MM/PBSA, which calculates binding energy components and the individual energy contributions of the residues [25]. The graphs from the MM/PBSA computations were generated using the R programming package [64].

Results and Discussion
The results of the molecular modelling, molecular docking, ADMET evaluation, prediction of antileishmanial activity and MD simulations are presented.

Modelling the Structure of LdCRK12
There was the need to model the structure of the LdCRK12 since there is no available structure in the protein data bank. An earlier study modelled the structure of the LdCRK12 using Molecular Operating Environment (MOE version 2014.09; Chemical Computing Group, Inc.) [20]. MOE-Homology combines segment-matching and methods of inserting or deleting regions to model protein structures. Advanced knowledge-based loop searching and sidechain rotamer selection methods are then employed to build models by default. An average model is then generated by MOE for a user-controlled energy minimization [65].
Studies have compared the quality of protein structures generated using different modelling techniques [65][66][67]. No technique has been found to be superior in every aspect to the others [65,66]. The protein family and the sequence identity between the query and template structures influence the quality of a model built using a homology modelling technique [66]. A comparison study of various homology modelling algorithms including MOE, I-TASSER, Rosetta, PRIME, SWISS-MODEL, Composer, and ORCHESTRAR reported that all the techniques produced high quality models when the sequence identity between the query and the template is greater than 35% [66,67]. However, for low sequence identities, it becomes difficult for the modelling algorithms to produce high-quality structures [66]. It is therefore imperative that different modelling techniques are used to build protein structures that have relatively low sequence identities to their templates. The quality of the modelled structures must be assessed to select the reasonably best model.
Herein, three freely accessible and widely used techniques comprising Modeller, I-TASSER and Robetta were employed to predict structures of the LdCRK12. The present study compares the structures from these three techniques to select the reasonably best model, as carried out previously [68][69][70].

Template Search
A BLAST search was performed to retrieve identical structures as suitable templates for modelling the LdCRK12 structure. The BLAST search via SWISS-MODEL revealed 5449 templates with a sequence identity lower than 30%. A further BLAST search was conducted on the kinase domain (amino acid residues 459-833) using the BLAST option (BLASTP 2.9.0+) by selecting BLOSUM62, the most commonly used scoring matrix in BLAST [71]. The search revealed six reviewed protein structures that were identical to the kinase domain of the LdCRK12 ( Table 1). One of the most widely used template selection criteria is to select the model with the highest sequence identity to the protein sequence. The quality of the experimentally determined structure is also an important factor to consider in the template selection. The reasonably best template was selected based on the E-value, sequence identity, query coverage, and the availability of a 3D structure. The human CDK9 was thus selected as the template for modelling the LdCRK12 via Modeller 9.2 as described previously [20]. Although, sequences O14098 and Q9TVL3-2 had sequence identities of 36% and 35% and BLAST scores of 356 and 348, respectively, they were not selected due to their relatively low coverage to the LdCRK12 (Table 1). Cyclin-dependent kinase 9 (CDK9) of humans, rats, and mice had the same E-value, BLAST score, and sequence identity of 7.4 × 10 −34 , 345, and 31.3%, respectively. The three proteins also had better coverage of the LdCRK12. However, the human CDK9 was the only protein with a solved 3D structure.  [20]. The human cyclin-dependent kinase 9 (CDK9) is a cdc2-like serine/threonine kinase whose related pathways have been associated with various human malignancies and cardiomyocyte hypertrophy. The sequence of the LdCDK12 was aligned to the template sequence and five structures were modelled using Modeller 9.2.
The qualities of the five generated models were evaluated using the DOPE and genetic algorithm 341 (GA341) scores. The GA341 score, which is derived from statistical potential, assesses the reliability of a model [72]. A model can be said to be reliable when the GA341 score is higher than the determined threshold of 0.7. The five generated models using Modeller 9.2 had a GA341 score lower than the 0.7 cut-off, thus the DOPE score was used to select the most suitable model. The DOPE score is also a statistical potential score used to assess predicted models. The reasonably best model is selected by choosing the structure with the least DOPE value [30,43]. The DOPE and GA341 scores of the five predicted models from Modeller 9.2 are shown ( Table 2). For the Modeller generated structures, model MOD5 was selected as the most suitable structure of the LdCRK12 due to its very low DOPE score of −50486.88281 (Table 2 and Supplementary file 1). I-TASSER was used to generate five protein structures of the LdCRK12. Based on the magnitude regarding the threading template alignments and the convergence parameters of the structure assembly simulations, I-TASSER computed a confidence rating for each model, which is known as the C-score. A higher C-score value represents a model with higher confidence and is usually in the range of (−5, 2) [31][32][33][34]. Out of the five generated I-TASSER structures, model ITAS5 was selected as the most suitable model due to its high C-score of −2.66 (Table 3 and Supplementary file 2). Table 3. Predicted I-TASSER models and C-scores.

Structure Prediction Using Robetta
Robetta was also employed to model five structures of the LdCRK12. Robetta uses the ROSETTA to model protein structures either by comparative modelling or ab initio. For the LdCRK12, Robetta used comparative modelling to predict plausible structures (Table 4). ROB1 was considered as the reasonably best model since the predicted models are ranked based on the model quality assessment method available in ProQ2 after clustering. The predicted b-factors by color representation of the models were also visualized in Pymol. The b-factor, which influences the local quality of a model, shows the parts of the structure that were remodelled and not covered by a template. These regions are the least accurate and have the most variation between models. All five predicted structures showed similar b-factor coloration. Therefore, the five models were further evaluated using SAVES v5.0 (Table 4). ROB1 had a VERIFY score of 82.97%, which was the highest; ERRAT quality factor of 88.0579; PROVE score of 0.0% and four PROCHECK errors, three warnings, and two passes (Table 4). ROB1 was thus selected as the most acceptable structure from Robetta (Supplementary file 3).

Quality Assessment of Selected Models
The quality of the best models from each of the three techniques was assessed using SAVES v5.0. Modelled protein structure MOD5 had poor values for all the quality metrics (Table 5). MOD5 had VERIFY, ERRAT, and PROVE scores of 41.20%, 10.0536, and 16.1%, respectively. MOD5 was also predicted by PROCHECK to have five errors, two warnings, and one pass (Table 5). ITAS5 had very good VERIFY and ERRAT scores of 85.36% and 80.2158, respectively. Although ITAS5 had the highest VERIFY score, it was predicted using PROVE to be 9.5% erroneous ( Table 5). PROCHECK also predicted ITAS5 to have six errors, two warnings, and one pass. ROB1 had the highest ERRAT quality factor of 88.0579 and 0.0% erroneous parts, as predicted by PROVE ( Table 5). The ERRAT error plots for MOD5, ITAS5, and ROB1 were generated ( Figure S1). MOD5 had the most erroneous or misfolded regions (Figures 2a and S1A), while ROB1 had the lowest error rate for protein folding (Figures 2c and S1C). Furthermore, the kinase domain of the LdCRK12 (residues 459-833) in the ROB1 structure was not predicted to have any misfolded or erroneous regions ( Figure  S1C). ITAS5 was also observed to have few misfolded portions (Figures 2b and S1B).  The Ramachandran plots of all the three shortlisted models were obtained using PROCHECK which evaluates the stereochemistry of protein models by determining residue-by-residue geometry and overall structure geometry [44]. A protein structure is considered as quality based on the percentage of residues in the most favored (core), additionally allowed, generously allowed, and disallowed regions [73]. Protein structure MOD5 had 79.7%, 15.5%, 3.0%, and 1.8% of residues in the most favored, additionally allowed, generously allowed, and disallowed regions, respectively (Table 6 and Figure S2A). ITAS5 was also predicted to have 61.0%, 29.8%, 5.9%, and 3.3% of residues in the most favored, additionally allowed, generously allowed, and disallowed regions, respectively (Table 6 and Figure S2B). For the ROB1 structure, 82% of the amino acid residues were present in the most favored region, 17.1% residues were found in the additionally allowed regions, 0.4% of residues were in the generously allowed regions, and 0.4% in the disallowed region (Table 6 and Figure 3). The Ramachandran plots revealed that the model ROB1 had the most reasonably good structure (Table 6, and Figures 3 and S2A,B). Table 6. Ramachandran plot statistics for the best models from the 3 modelling techniques. For all 3 models, the number of end residues (excluding Gly and Pro) = 2, Glycine residues = 65, Proline residues = 85, and the total number of residues = 881. The quality of the overall best model (ROB1) was evaluated using the z-score from ProSA-web [45,46]. The overall best model was predicted to be of X-ray quality and had a zscore of −9.7 (Figure 4a). The local model quality of the chosen model was also determined by plotting the energies as a function of amino acid residue position. Most of the residues were predicted to have negative energy values, signifying a very good model ( Figure 4b). Generally, positive values signify problematic or erroneous parts of the input structure.

Binding Site Characterization
A binding site is a region on a protein that binds to a ligand or another macromolecule with specificity [74]. CASTp was employed to predict the binding sites of the LdCRK12. At the active site, a ligand or a substrate binds to an enzyme to induce a chemical reaction [75]. CASTp uses the Delaunay triangulation, alpha shape, and discrete flow methods to identify topographic features, measure areas and volumes [76,77].
CASTp predicted 127 binding sites for the chosen LdCRK12 protein model. The predicted binding cavities with no openings and with relatively small volumes and areas such that no ligand could fit were ignored [59,78]. Since the modelled structure had many disordered regions from residues Met1 to about Ala400, only binding cavities predicted to border the kinase domain (459-833) were considered. A total of 14 binding sites were selected after visualization in Chimera 1.12 and Pymol. The residues lining each of the 14 binding sites are shown (Table 7). Pocket 7 was observed to overlap with pocket 3 (Table 7). Aligning the 3D structures of the LdCRK12 and the human CDK9 in complex with T6Q revealed that pocket 1 is the ATP binding site of the kinase domain ( Figure 5 and Table 7).

Preparation of Screening Library
A total of 7722 African natural compounds were used as the screening library [52,53]. Additionally, Lipinski's rule of five was used to filter the library to obtain 4409 compounds comprising 3872 and 537 ligands from the NANPDB and AfroDB, respectively.
Three known antileishmanial drugs, namely amphotericin B, miltefosine, and paromomycin, were also retrieved from PubChem with CIDs 5,280,965, 3599, and 165,580, respectively. Three inhibitors of LdCRK12 comprising compounds 5, 7, and 8, which had very good EC 50 values in mouse models of visceral leishmaniasis ranging from 0.005 µM to 2 µM, were also used. The 3D structure of GSK3186899 was downloaded from Pub-Chem with CID 122,429,808 whereas those of compounds 5 and 8 were generated using MarvinSketch 17.17.0. Additionally, a 2-amino-4-heteroaryl-pyrimidine inhibitor (Code: T6Q), complexed with the human CDK9 (PDB ID: 4BCF) was extracted from the complex and saved in sdf format. All ligand structures were energy minimized using the universal force field (UFF) under the Conjugate Gradient algorithm in 200 steps and converted to the partial charge and atom type (pdbqt) format using Open Babel before the virtual screening.

Virtual Screening of Compounds
Autodock Vina was used for the virtual screening process [54]. The compounds were first screened against the LdCRK12. Compounds with good pose and low binding energies against the LdCRK12 were re-docked against the human CDK9 to select compounds that are less likely to interact with critical residues of the human CDK9.

Screening the Library against LdCRK12
The pre-filtered library comprising a total of 4409 compounds and the known inhibitors were screened against the energy minimized LdCRK12 using a grid box dimension of 91.21 × 93.45 × 78.24 Å 3 and centered at (74.47, 128.44, 81.76) Å to cover the kinase domain of the protein. A total of 4369 compounds were successfully screened against the LdCRK12. A stringent threshold of −8.5 kcal/mol was used to select the compounds after the virtual screening process. This threshold was used since it has been shown that an AutoDock score of −7.0 kcal/mol differentiates well between certain and uncertain protein-ligand interactions [55]. A total of 290 compounds had binding energies less than or equal to −8.5 kcal/mol. AutoDock Vina uses a negative function to rank the output in the order of decreasing binding affinity, thus, the higher the negativity, the more plausible the candidate as a potential lead compound.
The protein-ligand complexes were then inspected visually using PyMOL to select the best docked ligands. A total of 17 compounds were eliminated since they did not dock deep into the LdCRK12. Additionally, based on the generated protein-ligand interaction profiles, 27 compounds that did not exhibit any hydrogen bonding with LdCRK12 were excluded. A total of 246 compounds were thus selected from the virtual screening output. Of the 246 compounds, ZINC000095485940 demonstrated the least binding energy to the LdCRK12 with a value of −10.1 kcal/mol. NANPDB1406, NANPDB2581 and NANPDB6446 also demonstrated low binding energies of −9.5, −9.2 and −9.1 kcal/mol, respectively. ZINC000095485940, NANPDB1406, NANPDB2581, and NANPDB6446 demonstrated a higher binding affinity to the LdCRK12 than all the known inhibitors used in this study (Table 8). Among the known inhibitors, the compound 8 and TQ6 demonstrated the least binding energy of −9.1 kcal/mol to the LdCRK12. Compound 8 was reported to inhibit the Leishmania parasite with EC 50 values of 0.025 µM and 0.075 µM in the axenic and intramacrophage assays, respectively. GSK3186899, paromomycin, and compound 5 also had binding energies of −8.5, −7.9, and −7.2 kcal/mol, respectively (Table 8). These three compounds demonstrated binding energies lower than the −7.0 kcal/mol threshold defined for AutoDock users [55]. This implies that these compounds have the potential to demonstrate significant inhibitory activities against the parasite as exhibited by compounds 5 and 7 previously [20]. Miltefosine demonstrated the highest binding energy of −5.0 kcal/mol to the LdCRK12 (Table S1).

Re-Docking Compounds against the CDK9
Since the kinase domain is conserved and the human CDK9 is homologous to the LdCRK12, there was the need to screen the shortlisted compounds against the CDK9. A total of 246 were re-docked against the human CDK9 to select compounds with a relatively low binding affinity to the CDK9, which were less likely to interact with critical residues of the human CDK9. Before the virtual screening, the CDK9 was remodelled with PDB structure 4BCF as a template using Modeller 9.2 due to missing residues. Residues 1-5, 89-96, 177-181, and 327-330 were missing in the human CDK9 structure. The complete sequence of the human CDK9 was retrieved from UniProt with ID P50750 [38][39][40]. The sequence was aligned to the 4BCF structure and five models were generated using Modeller 9.2. The qualities of the five models were assessed using the DOPE and GA341 scores. All the modelled structures had a GA341 score of 1, thus the structure with the lowest DOPE (−38809.11328) score was chosen.
Ligands that docked into the ATP binding site of the human CDK9 were not considered for downstream analysis. Additionally, compounds with similar binding energy against the CDK9 as T6Q were eliminated to prevent the likelihood of drug off-target binding. T6Q had a binding energy of −8.6 kcal/mol when docked into the ATP binding site of the human CDK9 (Table 8).

Characterization of Human CDK9-Ligand Interactions
The human CDK9-ligand interactions were also investigated ( Table 8 and Table S1). Ligands which interacted with the critical residues of the human CDK9 (Ile25, Ala46, Lys48, Phe103, Glu107, Asp109, Asp145, Leu156, and Asp167) were not considered for downstream analysis due to the possibility of drug off-target activity [80,81]. Previous studies have reported on the crystal structures of analogues of 4-(thiazol-5-yl)-2-(phenylamino)pyrimidine-5-carbonitrile bound to CDK9/cyclin T [82,83]. The compounds demonstrated Ki values ranging from 6-43 nM with an increase in the thermal stability of CDK9/cyclin T [82]. It was reported that the thiazole, pyrimidine, and aniline moieties docked into the ATP binding site and formed a hydrogen bond with the hinge region of the kinase [82]. The pyrimidine ring was observed to lie between Ala46 and Leu156 while the C5-carbonitrile was reported to form a lone pair−π interaction with an average distance of 3.7 Å with the gatekeeper residue Phe103 [82]. Hydrogen-bonds were also formed between the compounds and residues Ile25, Lys48, Asp145, with Glu107 and Asp167 [82]. Other studies have corroborated the above listed residues as being critical to CDK9-ligand binding [84,85]. A molecular docking study involving CDK9 and BAY-958 also reported BAY-958 to form a hydrogen-bond with Asp109 [84].
A total of 94 compounds that interacted with the critical residues of the human CDK9 were eliminated from this study. A total of 19 compounds with a relatively high binding affinity to the LdCRK12 and did not interact with the critical residues of the human CDK9 were obtained.

ADMET Prediction
Though the screening library was pre-filtered using Lipinski's rule, Veber's rule was further applied to the 19 identified compounds, of which two failed. NANPDB4609 and NANPDB3239 violated Veber's rule due to their high total polar surface area (TPSA) values of 151.96 and 145.91, respectively. Veber's rule requires a TPSA of no more than 140 Å 2 [86]. Compounds with a TPSA not more than 140 Å 2 are considered to have good oral bioavailability [86]. TPSA values are considered as good indicators of excellent human intestinal absorption (HIA) and Caco-2 permeability [87]. The calculated logP (cLogP) values were also determined using the OSIRIS DataWarrior 5.0.0 (Table S2).
Most of the compounds were predicted to be moderately soluble, including compounds 5, 7, and 8 (Table S2). Compound 5 was shown experimentally to have poor solubility and is metabolically unstable although it was the most potent against LdCRK12 with an EC 50 value of 0.014 µM in the intra-macrophage assay [20]. NANPDB6446 was predicted to be very soluble while NANPDB1406 was predicted to be moderately soluble. ZINC95485940, NANPDB1406, and NANPDB1649 were also predicted to be soluble (Table S2).
The potential of a drug to move across the blood-brain barrier to the brain is referred to as BBB permeation. Only NANPDB2581, NANPDB2582, NANPDB3614, NANPDB1649 and ZINC000095485880 were predicted to have permeation into the brain-blood barrier (BBB) [ Table S2]. In the brain, the drug binds to specific receptors to activate certain signaling pathways. Additionally, for a drug to exhibit the desired pharmacological activities with the brain parenchyma, it needs to be able to permeate the BBB [88]. T6Q, compound GSK3186899, compound 5, NANPDB4609, NANPDB3239, amphotericin B, paromomycin, and miltefosine were predicted to have low gastrointestinal (GI) absorption, which suggests a low probability of successful absorption into the bloodstream (Table S2). Another factor considered was the likelihood of the compounds to be non-Pglycoprotein (P-gp) substrates. P-gp aids in the removal of drugs or xenobiotics from the central nervous system (CNS) by functioning as a biological barrier by removing toxins and xenobiotics from cells. It is also crucial in the absorption and distribution of drugs [89]. All the inhibitors or drugs used in this study were predicted to be P-gp substrates (Table S2). Of the top 19 hits, 10 compounds were predicted to be non-P-gp substrates (Table S2) and may likely have desirable distribution in the circulatory system upon administration.

Toxicity Prediction with OSIRIS Property Explorer
The toxicity profiles of the 17 hits and the known drugs were determined using OSIRIS DataWarrior 5.0.0 (Table S3). Of the 17 hits, 13 compounds were predicted to be non-tumorigenic, non-mutagenic and non-irritant, and to have no reproductive effects (Table S3). NANPDB6446 was predicted to be highly mutagenic, tumorigenic, and irritant. NANPDB6446 can serve as a scaffold for fragment-based drug design due to its relatively low molecular weight of 365.381 g/mol.
NANPDB3614 and ZINC000000828203 were also predicted to be highly tumorigenic while NANPDB3284 was predicted to have reproductive effects (Table S3). Compounds 5, 8, amphotericin B, miltefosine, paromomycin, and T6Q were also predicted to have no mutagenicity, tumorigenicity, irritancy, and reproductive effect (Table S3). GSK3186899 was predicted to possess low tumorigenicity though it was non-mutagenic, non-irritant, and had no reproductive effect (Table S3). GSK3186899 was selected as the preclinical candidate due to its effectiveness, efficacy, pharmacokinetics, and safety profile [20]. GSK3186899 was reported to possess L. donovani inhibitory activity in cidal axenic amastigote and intra-macrophage assays with EC 50 values of 0.1 and 1.4 µM, respectively [20].

Biological Activities of Hits
The biological activities of the 17 identified hits were determined using PASS, an Open Bayesian machine learning technique. Structure descriptors, which are also referred to as multilevel neighborhoods of atoms (MNAs) descriptors, were generated as inputs [27].
A total of 13 compounds were predicted to possess antiprotozoal activity, of which 10 were predicted to be antileishmanial (Table S4). NANPDB1406, NANPDB2521, NAN-PDB3435, NANPDB3284 and ZINC000095486260 were predicted as kinase inhibitors (Table S4). Since the LdCRK12 has a kinase domain, these predictions necessitate the in vitro testing of these compounds to validate their anti-LdCRK12 and antileishmanial properties. Fifteen of the hits were predicted to possess antineoplastic (anticancer) activity (Table S4). A review on the in vitro leishmanicidal potential of anticancer compounds suggested the use of antineoplastic compounds for the treatment of leishmaniasis [90]. Protein kinase inhibitors such as sunitinib, sorafenib, and lapatinib which are used for treating cancers were reported to be active against Leishmania donovani amastigotes in murine macrophages with IC 50 values of 1.1, 3.7, and 2.5 µM, respectively, showing similar efficacy to that of miltefosine (IC 50 = 1.0 µM) [91]. Sunitinib, sorafenib, and lapatinib were also reported to be non-toxic to mammalian cells [91]. NANPDB1011, NANPDB3949, ZINC000095486260, NANPDB3435, NANPDB3284 and NANPDB2521 were predicted to possess dermatological activities. These compounds may be beneficial in treating post kala-azar dermal leishmaniasis (PKADL). NANPDB1649 (sesamin) has been reported to be active against Leishmania amazonensis with an IC 50 value of 15.8 µg/mL and was not cytotoxic to macrophage cells with CC 50 values greater than 100 µg/mL [92]. Additionally, ZINC000000828203 (diphyllin) isolated from Haplophyllum bucharicum (Rutaceae) has been reported to demonstrate antileishmanial activity against Leishmania infantum promastigotes and intracellular amastigotes with IC 50 values of 14.4 µM and 0.2 µM, respectively [93]. NANPDB3614 (justicidin B) has also been shown to be a potential antiprotozoal agent by showing antitrypanosomal activities against Trypanosoma brucei rhodesiense and Trypanosoma cruzi with IC 50 values of 0.2 and 2.6 µg/mL, respectively [94]. Since Leishmania and Trypanosoma are trypanosomatids, repurposing NANPDB3614 for the development of therapeutic agents for leishmaniasis can be explored.

Compound
Binding    The ligand efficiency (LE) of the selected compounds ranged from 0.327 to 0.413 (Table 9) which are very close to the average ligand efficiency values reported for fragmentlike compounds (0.38). LE is used to assess the binding affinity, taking into account the number of heavy atoms (NHA) of a molecule [96,97]. Herein, NANPDB1649 demonstrated the lowest LE value of 0.327. ZINC000095485940, NANPDB6446, NANPDB2581 and NANPDB1406 had LE values of 0.347, 0.380, 0.404 and 0.416, respectively (Table 9;  Table 10). Similarly, these LE values are close to the average LE values of fragment-like molecules (0.38) [97].
The LE_Scale takes into consideration size dependency, which is a limitation of the LE metric. The computed LE_Scale values ranged from 0.347 to 0.416 (Table 9), in concordance with the LE_Scale values of similar active compounds with the same number of heavy atoms [98,99]. ZINC000095485940 had the lowest LE_Scale value of 0.347, while NANPDB1406 had the highest value of 0.416. NANPDB2581, NANPDB6446 and NANPDB1649 also had LE_Scale values of 0.404, 0.380 and 0.380, respectively ( Table 9).
The fit quality (FQ), which is a more accurate metric used to assess ligand efficiency, is determined as a ratio of the observed LE to the LE_Scale of a compound. The closer the FQ to 1, the more ideal the ligand. The calculated FQ values ranged from 0.861 to 1.003 (Table 9), suggestive that the selected molecules have plausible binding to the LdCRK12 receptor [97].
Another important metric, ligand-efficiency-dependent lipophilicity (LELP) was also computed for the selected molecules. For a promising compound, the recommended LELP should be between 0 and 7.5, although molecules that satisfy Lipinski's rule are reported to have LELP values less than 16.5 [100]. The LELP values of all proposed molecules ranged between 0.521 and 9.861, which suggests that the selected molecules have a good affinity to LdCRK12, considering lipophilicity. ZINC000095485940, NANPDB1406 and The ligand efficiency (LE) of the selected compounds ranged from 0.327 to 0.413 (Table 9) which are very close to the average ligand efficiency values reported for fragment-like compounds (0.38). LE is used to assess the binding affinity, taking into account the number of heavy atoms (NHA) of a molecule [96,97]. Herein, NANPDB1649 demonstrated the lowest LE value of 0.327. ZINC000095485940, NANPDB6446, NANPDB2581 and NANPDB1406 had LE values of 0.347, 0.380, 0.404 and 0.416, respectively (Table 9; Table 10). Similarly, these LE values are close to the average LE values of fragment-like molecules (0.38) [97].
The LE_Scale takes into consideration size dependency, which is a limitation of the LE metric. The computed LE_Scale values ranged from 0.347 to 0.416 (Table 9), in concordance with the LE_Scale values of similar active compounds with the same number of heavy atoms [98,99]. ZINC000095485940 had the lowest LE_Scale value of 0.347, while NAN-PDB1406 had the highest value of 0.416. NANPDB2581, NANPDB6446 and NANPDB1649 also had LE_Scale values of 0.404, 0.380 and 0.380, respectively ( Table 9).
The fit quality (FQ), which is a more accurate metric used to assess ligand efficiency, is determined as a ratio of the observed LE to the LE_Scale of a compound. The closer the FQ to 1, the more ideal the ligand. The calculated FQ values ranged from 0.861 to 1.003 (Table 9), suggestive that the selected molecules have plausible binding to the LdCRK12 receptor [97].
Another important metric, ligand-efficiency-dependent lipophilicity (LELP) was also computed for the selected molecules. For a promising compound, the recommended LELP should be between 0 and 7.5, although molecules that satisfy Lipinski's rule are reported to have LELP values less than 16.5 [100]. The LELP values of all proposed molecules ranged between 0.521 and 9.861, which suggests that the selected molecules have a good affinity to LdCRK12, considering lipophilicity. ZINC000095485940, NANPDB1406 and NANPDB6446 had LELP values of 0.521, 3.761 and 2.370, respectively (Table 9).

Molecular Dynamics Simulations
Molecular dynamics studies the motion of atoms along the course of time by the integration of Newton's equations of motions [101]. Molecular dynamics simulations were performed using GROMACS 2018 to elucidate the dynamic behavior of selected compounds within the active sites of the LdCRK12 protein. The root mean square deviation (RMSD), the radius of gyration (Rg), and root mean square fluctuation (RMSF) were analyzed for the unbound protein and the protein-ligand complexes (Figure 7a-c).

The Root Mean Square Deviation (RMSD) of the Complexes
To evaluate the stability of the LdCRK12-ligand complexes, the RMSD plots of the unbound protein and the LdCRK12-ligand complexes were analyzed (Figure 7a). The RMSD is a frequently used measure of the differences between the structures sampled during the simulation and the reference structure [102]. MD simulations require systems to be close to their native conformation. The time trajectory of RMSD shows the deviation of a protein structure from a reference structure as a function of time [102].
The RMSD values of all nine structures experienced a gradual rise from 0 to 3 ns.  (Figure 7a). The LdCRK12-NANPDB2581 complex experienced the longest stability with an average RMSD value of 0.9 nm until about 7 ns where a gradual rise was observed (Figure 7a). LdCRK12-ZINC000095485940, LdCRK12-NANPDB1649, and LdCRK12-NANPDB6446 complexes were unstable from 0 to about 6 ns where they maintained stable RMSDs with averages of 1.5, 1.55, and 1.5 nm, respectively, until the end of the 10 ns simulation period (Figure 7a).

The Radius of Gyration (Rg) of Complexes
This study analyzed the compactness and folding of the unbound protein and the protein-ligand complexes by plotting the radius of gyration over simulation time. The loss of compactness affects the stability of the complex by introducing weak intermolecular bonds. When the Rg of a complex is higher, the compactness of the protein-ligand complex is lower, causing the interactions between ligand and protein to be weaker [103]. A stably folded protein will maintain a relatively steady Rg while the Rg value is likely to change over time if the protein unfolds [104]. The RMSF trajectories of the unbound LdCRK12 structure and LdCRK12-ligand complexes were also investigated. The RMSF reveals the flexibility of different regions of a protein, which can be related to crystallographic B-factors [102]. Residues contributing to the complex structural fluctuation can be assessed by this stability profile analysis. Higher RMSF values imply greater fluctuations. Protein regions involved in ligand binding and catalysis are known to demonstrate greater fluctuations [105]. Adaptive variation in flexibility lies principally in these regions of the protein sequence that affect the conformational stabilities of the protein-ligand complex [105].
The RMSF plots revealed that all eight compounds caused some degree of fluctuations in similar regions of the LdCRK12 (Figure 7c). Fluctuations were observed at regions from residue index 30-50, 60-150, 280-350, and 700-800. The highest fluctuation was observed between residues 60-150 followed by residues 280-350, implying they could be involved in ligand binding.

Contributing Energy Terms
The molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) computation was employed to determine the binding free energies of the LdCRK12-ligand complexes. At a quantitative level, simulation-based methods provide substantially more accurate estimates of ligand binding free energies than other computational approaches such as docking [106]. The calculation of the binding free energy ∆G bind , which is the free energy difference between the ligand-bound state and the corresponding unbound states of protein and ligand, is used to quantify the affinity of a ligand to its target. Assessing the ∆G bind of a series of ligands against a particular target can reveal those ligands with higher binding affinities to the target. Thus, the ∆G bind calculations are important to gain in-depth knowledge about the binding modes of the hits in drug design [107].
The MM/PBSA calculations showed that compound 8 had the lowest binding free energy of −68.609 kJ/mol (Table 11). Compound 5 was also observed to have a binding free energy of −54.023 kJ/mol while GSK3186899 had −27.382 kJ/mol (Table 11). Compounds 5 and 8 demonstrated better inhibitory activities against L. donovani than GSK3186899, although GSK3186899 was selected as the preclinical candidate due to pharmacokinetics and safety concerns [20]. NANPDB1649 had the lowest binding free energy of −50.434 kJ/mol among the five selected hits (Table 11). NANPDB2581, NANPDB6446 and NANPDB1406 also demonstrated low binding free energies of −49.374, −37.179 and −24.518 kJ/mol, respectively. These compounds exhibited binding affinities similar or better than that of the preclinical candidate (GSK3186899), thus are worthy of further experimental validation.
Even though ZINC000095485940 was predicted to have the lowest binding energy to the LdCRK12 (−10.1 kcal/mol) by Autodock Vina, it had the highest binding free energy of 0.593 kJ/mol from the MM/PBSA computations (Table 11), thereby potentially limiting its lead-likeness. In a previous study, compounds with high binding free energies have been shown to demonstrate inhibitory activity against receptors due to their very low electrostatic energies and very high polar energies [108]. ZINC000095485940 demonstrated high polar solvation energy of 136.331 kJ/mol and electrostatic energy of −29.485 kJ/mol (Table 11). Previous studies have reported that electrostatic and van der Waals forces contribute predominantly and continuously to the binding energy along with simulations that favored the binding of complexes [26,109]. All compounds demonstrated very low van der Waal's energies, ranging from −84.419 kJ/mol to −138.191 kJ/mol (Table 11).

Per-Residue Energy Decomposition
The MM/PBSA method can be used to calculate free binding energies by per-residue decomposition. This involves the decomposition of each residue by including the interactions in which each residue is involved. These provide useful insight into important interactions of key residues in free energy contribution. Residues contributing binding free energy greater than 5 kJ/mol or less than −5 kJ/mol are worth considering as key residues for the binding of a ligand to a protein [110]. The per-residue energy decomposition computations for each complex were performed (Figures 8 and S4A-G). From the protein-ligand interactions, residues Leu465, Ser466, Thr469, Ala486, Lys488, Ala568, Ser569, Asp612, Asn613, Leu615, and Asp626 were considered as key residues for ligand binding in the ATP binding site (Section 3.6). From the MM/PBSA per residue decomposition computations for the LdCRK12-GSK3186899 complex, it was observed that only Lys488 and Arg575 contributed individual energies beyond the ±5 kJ/mol threshold with energy values of 10.1287 and 5.8145 kJ/mol, respectively ( Figure S4B). For the LdCRK12-NANPDB1406 complex, Val473, Lys488 and Leu615 contributed energies of −5.0135, 14.2430, and −7.2060 kJ/mol, respectively (Figure 8). Only Lys488 was observed to contribute individual energy above the ±5 kJ/mol threshold with values of 7.8042 and 13.3733 kJ/mol in the LdCRK12-NANPDB1649 and LdCRK12-NANPDB2581 complexes, respectively ( Figure S4D,E). Additionally, Asp612 was the only residue that contributed individual energy beyond the ±5 kJ/mol with an energy value of 5.4536 kJ/mol in the LdCRK12-NANPDB6446 complex ( Figure S4F). For the LdCRK12-ZINC000095485940 complex, Lys488 and Asp626 contributed 17.8578 and 9.9136 kJ/mol, respectively ( Figure S4G). From the per-residue energy decomposition computations, it is suggested that Lys488 is a very crucial residue for ligand binding in the ATP binding site, which warrants further experimental validation to determine its role.

Future Outlook and Implication of the Study
This study modelled a reasonable structure of LdCRK12 with good quality parameters which has been made available to the scientific community to enrich work on structurebased drug discovery. Additionally, small molecules with the potential to inhibit the activity of LdCRK12 were identified, which could serve as the building blocks for the design of novel biotherapeutics. The study further proposed suitable molecules with negligible toxicity. Since the study is entirely computational, making available structures and compounds enable synthesis and screening to ascertain their potency as antileishmanial molecules. These predicted compounds can help stimulate the pace of searching for effective antileishmanial drugs globally.
In order to identify polypharmacological agents against leishmaniasis, it warrants investigating the inhibitory potential of the identified biomolecules against other CDC-2-related kinases of Leishmania, especially CRK3 and CRK6 [111]. CRK3 is essential for cell cycle progression and growth in Leishmania mexicana [112,113], while the role of CRK6 remains unclear [113,114], it has accessory functions in the cell cycle in T. brucei [114].

Conclusions
Natural products have shown the potential to be repurposed as effective L. donovani CRK12 inhibitors. This study sought to identify potential Leishmania inhibitors from the African flora by targeting the LdCRK12. The study identified four potential bioactive compounds comprising NANPDB1406, NANPDB2581, NANPDB6446 and NANPDB1649 with binding affinities of −9.5, −9.2, −9.1 and −8.5 kcal/mol, respectively. NANPDB1406, NANPDB2581 and NANPDB6446 demonstrated higher binding affinities than the preclinical compound (GSK3186899) which had the binding energy of −8.5 kcal/mol [20]. This study suggests Lys488 as a very crucial residue for ligand binding in the ATP binding site. MD simulations, including MM/PBSA, corroborated the potential inhibition of LdCRK12 by the compounds. Physiochemical and toxicological profiling predicted the compounds to be drug-like and have insignificant toxicity concerns. Ligand quality metrics comprising inhibitory constant (Ki), ligand efficiency (LE), fit quality (FQ), LE scale (LE_scale), and LE-dependent lipophilicity (LELP) also indicated that the potential antileishmanial compounds could serve as templates for fragment-based drug design for Leishmania inhibitors. The predicted Ki values of the potential drug candidates ranged from 0.108 to 0.587 µM. Furthermore, the molecules were predicted as antileishmanial molecules, necessitating experimental evaluation to corroborate their bioactivity.
Supplementary Materials: The following are available online at https://www.mdpi.com/2218-2 73X/11/3/458/s1, Figure S1: ERRAT error plots of the selected models: (A) ERRAT error plot for MOD5, (B) ITAS5, and (C) ROB1. Red bars represent the misfolded regions, yellow bars demonstrate the error region between 95% and 99%, and green bars indicate the region with a lower error rate for protein folding, Figure S2 Table S1: The binding energies and intermolecular bonds between LdCRK12 and compounds, Table S2: ADME Prediction of top 19 hits and known drugs for Gastrointestinal (GI); Blood Brain Barrier (BBB); Estimated Solubility (ESOL) class, P-glycoprotein (Pgp) and TPSA, Table S3: Toxicological profiles of the 17 hits and the known drugs, Table S4: Predicted biology activity of the lead compounds using Prediction of Activity Spectra for Substances (PASS). Pa and Pi represent probable activity and probable inactivity, respectively. Supplementary_file_1: 3D model of the LdCRK12 generated using Modeller (MOD5). Supplementary_file_2: The 3D model of the LdCRK12 generated using I-TASSER (ITAS5). Supplementary_file_3: The 3D model of the LdCRK12 generated using Robetta (ROB1).