Identification of Kaurane-Type Diterpenes as Inhibitors of Leishmania Pteridine Reductase I

The current treatments against Leishmania parasites present high toxicity and multiple side effects, which makes the control and elimination of leishmaniasis challenging. Natural products constitute an interesting and diverse chemical space for the identification of new antileishmanial drugs. To identify new drug options, an in-house database of 360 kauranes (tetracyclic diterpenes) was generated, and a combined ligand- and structure-based virtual screening (VS) approach was performed to select potential inhibitors of Leishmania major (Lm) pteridine reductase I (PTR1). The best-ranked kauranes were employed to verify the validity of the VS approach through LmPTR1 enzyme inhibition assay. The half-maximal inhibitory concentration (IC50) values of selected bioactive compounds were examined using the random forest (RF) model (i.e., 2β-hydroxy-menth-6-en-5β-yl ent-kaurenoate (135) and 3α-cinnamoyloxy-ent-kaur-16-en-19-oic acid (302)) were below 10 μM. A compound similar to 302, 3α-p-coumaroyloxy-ent-kaur-16-en-19-oic acid (302a), was also synthesized and showed the highest activity against LmPTR1. Finally, molecular docking calculations and molecular dynamics simulations were performed for the VS-selected, most-active kauranes within the active sites of PTR1 hybrid models, generated from three Leishmania species that are known to cause cutaneous leishmaniasis in the new world (i.e., L. braziliensis, L. panamensis, and L. amazonensis) to explore the targeting potential of these kauranes to other species-dependent variants of this enzyme.


Introduction
Leishmaniasis refers to a group of anthroponotic and zoonotic diseases that affect between 700,000 and 1 million people worldwide, causing between 20,000 and 30,000 deaths each year, primarily among populations found in tropical and subtropical areas. Leishmaniasis has been classified as a neglected tropical disease (NTD) due to lack of research and the poor development of new drugs over many decades [1][2][3]. Leishmaniasis is caused by approximately 20 protozoan parasite species of the genus Leishmania, which are transmitted to humans by more than 30 different species of phlebotomine sandflies [4]. The distinct species of Leishmania cause at least four separate syndromes: visceral leishmaniasis (VL, also known as kala-azar), post-kala-azar dermal leishmaniasis (PKDL), cutaneous leishmaniasis (CL), and mucocutaneous leishmaniasis (MCL) [5].
The CL subtype is typically characterized by localized, diffuse, or disseminated skin lesion [6]. In the old world (southern Europe, the Middle East, southwest Asia, and 2. Results and Discussion 2.1. A Combined Ligand-/Structure-Based Virtual Screening Approach Using LmPTR1 2.1.1. Ligand-Based VS The ChEMBL dataset (https://www.ebi.ac.uk/chembl/, accessed on 10 January 2021) was classified as either active or inactive (binary classification), using a cutoff value of pIC 50 ≥ 5.0 (pIC 50 = log IC 50 ). This value was selected according to the range of pIC 50 values observed for the entire dataset (657 structures) to obtain the maximum representation of the chemical space for each class of structure (active and inactive). Structures with pIC 50 values between 4.9 and 5.0 (range of 0.1 units) were excluded to avoid edge effects and improve the predictive capacity of the models by minimizing the potential activity differences associated with errors and different experimental protocols. IC 50 values describe the concentration of a given substance required to inhibit 50% of parasite growth [20].
Subsequently, VolSurf+ descriptors (128) were calculated for the remaining molecules, including 298 inactive (46.9%) and 338 active (53.1%) molecules. All VolSurf+ descriptors [30,31] together with their respective binary classifications were used to build a random forest (RF) model in Knime software (KNIME 3.1.0 the Konstanz Information Miner Copyright, 2003-2014, www.knime.org) [32]. A model with 200 trees was selected, and the Gini Index was used as a split criterion, which has the lowest false-positive rate. A five-fold cross-validation procedure was performed, splitting the dataset five times into a modeling set (80%/20%). Only the modeling set, which was additionally divided into multiple training and test sets (80%/20%), was used to build and validate the models [33].
For the training set used in the RF model, the match percentage values approached 100%. Sensitivity (true-positive rate) values of 78.1% and 82.6% and specificity (truenegative rate) values of 72.7% and 73.7%, were obtained for the cross-validation and test sets, respectively. Two parameters were calculated to evaluate the quality of the RF model: the receiver operating characteristic (ROC) curve and Matthews's correlation coefficient (MCC). The area under the ROC curve (AUC) plots the true-positive rate (sensitivity) against the false-positive rate (1−specificity), and the MCC correlates all values in the confusion matrix [34,35].
For the L. major RF model, AUC values of 0.85 and 0.87 were obtained for the five-fold cross-validation and external validation datasets, respectively. When calculating the MCC parameter, a value of 1 represents a perfect prediction, a value of 0 represents a random prediction, and a value of −1 represents total disagreement between the prediction and the observation. Our L. major RF model returned values of 0.51 (five-fold cross-validation) and 0.57 (external validation) [35]. The slightly higher MCC value obtained for the external validation (0.57) demonstrates a high degree of differentiation between the active and inactive compounds identified in the ChEMBL dataset.
The applicability domain (APD) was used to assess the reliability of the predictions for the samples in the test and SL sets, and the calculation of the APD is based on the molecular interactions determined by the VolSurf+ descriptors [14,20]. For the L. major RF model set, more than 98.4% of molecules were classified as reliable, with only 8 molecules classified as unreliable. When the RF models were applied to the kaurane dataset, more than 94.2% of molecules were classified as reliable in each model, with only 20 molecules classified as unreliable. Unreliable molecules were removed.
A ligand-based VS was performed on the remaining 340 kauranes. Only 7 of the 340 structures were classified as active (ligand-based probability value [LB] ≥ 0.5), with structures 134 and 135 representing the two best-ranked kauranes, with LB values of 0.57 and 0.55, respectively (Figure 1). These two diterpenoids are found in Wedelia chinensis, a species of Asteraceae [36]. Structurally, these two kauranes are characterized by the presence of (1S,4R,5R)-2-methyl-5-propan-2-ylcyclohex-2-ene-1,4-diol, linked through an ester bond to the kaurenoic acid. The LB values for these two kauranes are almost identical, indicating that the activity of these two compounds is likely associated with the presence of this monoterpenoid and the pi-bond in the structure between C9 and C11.
Molecules 2021, 26, x 4 of 23 bond to the kaurenoic acid. The LB values for these two kauranes are almost identical, indicating that the activity of these two compounds is likely associated with the presence of this monoterpenoid and the pi-bond in the structure between C9 and C11. Additionally, two additional kauranes isolated from Wedelia trilobata, structures 298 and 302, also presented LB values greater than 0.5. Cinnamoyl (302) and 2-phenylacetic (298) esters are established with the carboxyl group of the kaurenoic acid ( Figure 1) [37]. ]. In these two structures, the functional groups present in R3 were also found to play key roles, as structure 301, which also includes a cinnamoyl ester, was classified as inactive (LB = 0.48). The presence of a hydroxyl moiety in R3 represents the unique structural difference between structures 301 and the active structure 302.

Structure-based VS
A structure-based VS (molecular docking) was performed to explore the mechanism of action of the kauranes dataset against the crystal structure of PTR1 (E.C. 1.5.1.33), an NADPH-dependent short-chain reductase that is responsible for the unusual salvage of pterin in Leishmania and acts as a metabolic bypass for drugs that target dihydrofolate reductase [29]. The docking scores and the respective deviation values for the best-ranked structures are reported in Table 1 (all binding energy values can be found in Supplementary Material, Table S3). All tested molecules were ranked using the following probability calculation (Equation (1)), as previously reported by Herrera-Acevedo et al. [14,20]. Those kauranes that presented structure-based probability values (SB) above 0.5 were classified as active. SB = (Ei/Emin) > 0. 5

and Ei < Eligand
(1) where SB is the structure-based probability; Ei is the docking energy of compound i, where i ranges from 1 to 360 (Kauranes dataset); Emin is the lowest energy value of the dataset; and Eligand is the ligand energy from protein crystallography. Additionally, two additional kauranes isolated from Wedelia trilobata, structures 298 and 302, also presented LB values greater than 0.5. Cinnamoyl (302) and 2-phenylacetic (298) esters are established with the carboxyl group of the kaurenoic acid ( Figure 1) [37]. In these two structures, the functional groups present in R3 were also found to play key roles, as structure 301, which also includes a cinnamoyl ester, was classified as inactive (LB = 0.48). The presence of a hydroxyl moiety in R3 represents the unique structural difference between structures 301 and the active structure 302.

Structure-Based VS
A structure-based VS (molecular docking) was performed to explore the mechanism of action of the kauranes dataset against the crystal structure of PTR1 (E.C. 1.5.1.33), an NADPH-dependent short-chain reductase that is responsible for the unusual salvage of pterin in Leishmania and acts as a metabolic bypass for drugs that target dihydrofolate reductase [29]. The docking scores and the respective deviation values for the best-ranked structures are reported in Table 1 (all binding energy values can be found in Supplementary  Material, Table S3). All tested molecules were ranked using the following probability calculation (Equation (1)), as previously reported by Herrera-Acevedo et al. [14,20]. Those kauranes that presented structure-based probability values (SB) above 0.5 were classified as active.
where SB is the structure-based probability; E i is the docking energy of compound i, where i ranges from 1 to 360 (Kauranes dataset); E min is the lowest energy value of the dataset; and E ligand is the ligand energy from protein crystallography.
The docking results showed that all 360 compounds obtained SB values above 0.5; however, relative to the PTR1 inhibitors that were used as controls, 252 structures and 359 structures had lower docking scores than 7,8-dihydro-L-biopterin (DHB) and pyrimethamine (PMA), respectively. The Protein Data Bank (PDB) ligand for LmPTR1, methotrexate (PDB ID: MTX) [38], has a calculated docking score of −560.4 kJ/mol.
Structures 135 and 302 (Figure 1), which were predicted to have high LB probability values based on the RF model, also showed high SB values and were two of the ten best-ranked kauranes identified, with SB values of −423.0 kJ/mol and −416.7 kJ/mol, respectively. Spatially, in the active site of LmPTR1, structures 135 and 302 adopted an L-shaped conformation, similar to that observed for the ligand methotrexate ( Figure 2a). Based on the two-dimensional analysis, common interactions were identified for these two kauranes compared with methotrexate, highlighting the π-alkyl interaction with M233 and the van der Waals interactions with S112, Y191, K198, and G225 ( Figure 2).  [38], has a calculated docking score of −560.4 kJ/mol.
Structures 135 and 302 (Figure 1), which were predicted to have high LB probability values based on the RF model, also showed high SB values and were two of the ten bestranked kauranes identified, with SB values of −423.0 kJ/mol and −416.7 kJ/mol, respectively. Spatially, in the active site of LmPTR1, structures 135 and 302 adopted an L-shaped conformation, similar to that observed for the ligand methotrexate ( Figure 2a). Based on the two-dimensional analysis, common interactions were identified for these two kauranes compared with methotrexate, highlighting the π-alkyl interaction with M233 and the van der Waals interactions with S112, Y191, K198, and G225 ( Figure 2). Interacting residues are shown as colored circles depending on the interactions (as colored dashed lines): H-bond (lime), Van der Waals (green), π-π (purple) and πalkyl (pink), unfavorable (red), and carbon H-bond (teal) interactions.
Methotrexate achieved a docking score of −560.4 kJ/mol in the active site of LmPTR1, and the formation of two H-bond interactions with S111 and N118 were observed ( Figure   Figure 2. (a) Docking conformations of structure 135 (green), 302 (red) and MTX (yellow) in the active site of LmPTR1; 2D-residual interaction diagrams of (b) methotrexate (MTX), (c) structure 135, and (d) structure 302. Interacting residues are shown as colored circles depending on the interactions (as colored dashed lines): H-bond (lime), Van der Waals (green), π-π (purple) and π-alkyl (pink), unfavorable (red), and carbon H-bond (teal) interactions.
Methotrexate achieved a docking score of −560.4 kJ/mol in the active site of LmPTR1, and the formation of two H-bond interactions with S111 and N118 were observed (Figure 2b). Structure 302 also formed two H-bonds between S227 and the carboxylic group of C-19. Additionally, the aromatic ring of F113 interacted with both 135 and 302, in addition to methotrexate, through π-π and π-alkyl interactions. Two steric interactions that unfavorably influenced the molecular binding energy were identified for the structures 135 (R17 and D232) and 302 (S111 and L226), as shown in Figure 2c,d, respectively.

Consensus Analysis of the Two VS Approaches
To verify the potentially active kauranes and their possible mechanisms of action, a combined approach using both structure-and ligand-based VS was performed. An equation was used to combine the probability scores of both VS approaches with the true-negative rate from the RF model to minimize the probability of selecting false-positive compounds (Equation (2)) [14,20].
where CA Lm = combined-approach probability, SB = structure-based probability, TN = true-negative rate, and LB = ligand-based probability. Equation (2) is based on the fact that the ligand-based VS uses pIC 50 experimental values; thus, the LB score has a high weight with respect to the SB score, which only relates interactions between the protein and ligand. The ligand-based VS seeks to reduce the probability of selecting inactive molecules as active compounds (false positive); therefore, in Equation (2), the LB is associated with an increment of TN. Table 2 shows the results for the five kauranes that were classified as active using the combined approach and the two VS methodologies. Four of the five structures that previously displayed a high active probability value in the ligand-based VS ( Figure 1) emerged as interesting potential anti-Leishmania structures that might act on the LmPTR1 protein. In addition, fischericin F (structure 101), extracted from Ligularia fischeri, a species of the Ligularia genus (Asteraceae) [39], was also classified as potentially active in the combined approach (CA Lm = 0.69). Although this kaurane did not present the highest scores from the ligand-based VS, it emerged as the best-ranked structure from the structurebased VS approach. Structurally, 101 has ferulic acid as the main feature, bound to the ent-kaurane skeleton through an ester bond at C14.
Through this combined approach, based on two different VS methodologies, five kauranes from various Asteraceae species were identified as having promising antileishmanial activity against LmPTR1 from a dataset of 360 kauranes, with structures 302 and 135 indicated as having high probability values based on both the ligand-based and structure-based VS approaches.

In Vitro Enzymatic Activity Inhibition for VS-Selected Kauranes against Lmptr1
To verify the results obtained from the combined approach using the two VS methodologies, the in vitro enzymatic inhibitory activities of structures 135 and 302 (actives) and structure 301 (inactive) were examined. In addition, two kauranes, structures 301a and 302a (in which the cinnamoyloxy group was replaced by a coumaroyloxy group), were also tested against LmPTR1. The diterpenes 135, 301, and 302 were synthesized for use in an in vitro enzymatic activity inhibition assay. This aim was oriented to identify appropriate precursors, as such compounds are not commercially available. Therefore, a phytochemical study was initially performed focusing on the fruits of Xylopia frutescens, an annonaceous plant that is rich in kaurane-type diterpenes [40].
This procedure led to the isolation of various diterpenes, but our interest was particularly focused on ent-kaurenoic acid (A), 3α-hydroxy-ent-kaur-16-en-19-oic acid (B), and 3α,9β-dihydroxy-ent-kaur-16-en-19-oic acid (C). The structures of these compounds were fully elucidated by detailed spectroscopic data interpretation and comparison with data available in the literature for compounds A and B [41,42]. Compound C has not been yet reported, but it was found to have very similar 1 H and 13 C NMR data to that of the previously reported pterokaurene L3 (PL3) [43]. This fact suggested that C and PL3 are structural analogues, whose spectral difference was solely found to be on the chemical shift at C-3 (δ C 78.6 in C vs. 38.6 in PL3), confirming the presence of a carbinol carbon having an α-oriented hydroxyl group. This α-orientation was deduced by the 1 H NMR data of H-3 (δ H 4.63, dd, 1H), specifically the consistent coupling constants between H-3β and H-2α (J = 12 Hz) and H-2β (J = 5 Hz) [37]. These kaurane-type diterpenes A-C were therefore considered suitable precursors to initiate the synthesis of target compounds. Thus, compound 135 was obtained from the commercially available (R)-(−)-carvone (D) (Figure 3), which was first transformed into 5β-hydroxy-(R)-carvone (E) by chemoselective monohydroxylation and subsequently reduced to 2-oxo-menth-6-en-5β-ol (F) by selective hydrogenation using the Wilkinson's catalyst [44].
appropriate precursors, as such compounds are not commercially available. Therefore, a phytochemical study was initially performed focusing on the fruits of Xylopia frutescens, an annonaceous plant that is rich in kaurane-type diterpenes [40].
This procedure led to the isolation of various diterpenes, but our interest was particularly focused on ent-kaurenoic acid (A), 3α-hydroxy-ent-kaur-16-en-19-oic acid (B), and 3α,9β-dihydroxy-ent-kaur-16-en-19-oic acid (C). The structures of these compounds were fully elucidated by detailed spectroscopic data interpretation and comparison with data available in the literature for compounds A and B [41,42]. Compound C has not been yet reported, but it was found to have very similar 1 H and 13 C NMR data to that of the previously reported pterokaurene L3 (PL3) [43]. This fact suggested that C and PL3 are structural analogues, whose spectral difference was solely found to be on the chemical shift at C-3 (δC 78.6 in C vs. 38.6 in PL3), confirming the presence of a carbinol carbon having an α-oriented hydroxyl group. This α-orientation was deduced by the 1 H NMR data of H-3 (δH 4.63, dd, 1H), specifically the consistent coupling constants between H-3β and H-2α (J = 12 Hz) and H-2β (J = 5 Hz) [37]. These kaurane-type diterpenes A-C were therefore considered suitable precursors to initiate the synthesis of target compounds. Thus, compound 135 was obtained from the commercially available (R)-(−)-carvone (D) (Figure 3), which was first transformed into 5β-hydroxy-(R)-carvone (E) by chemoselective monohydroxylation and subsequently reduced to 2-oxo-menth-6-en-5β-ol (F) by selective hydrogenation using the Wilkinson's catalyst [44].
Steglich esterification was also exploited to obtain the other two selected diterpenes ( Figure 4). Isolated compounds B and C were separately esterified with cinnamic acid (H), yielding the phenylpropanoid/diterpene ester adducts 301 and 302, respectively, with good yields (78%-79%). Additionally, the scope of this reaction was expanded to produce compounds 301a and 302a, using the same diterpene precursors (B and C) condensed with p-coumaric acid (I) to observe the influence of the p-hydroxyl group at the phenylpropanoid moiety in the subsequent enzymatic study.  The selected synthetic diterpene esters 135, 301, 302, 301a, and 302a were tested in vitro to experimentally determine their abilities to inhibit the enzymatic activity of LmPTR1 as an extension of the results provided by the in silico screening. Recombinant LmPTR1 was kinetically assessed, as previously reported [40], to ensure the appropriate enzymatic features, resulting in a consistent substrate Km of 5.6 μM. After testing LmPTR1 inhibition, the selected diterpenes exhibited inhibitory properties at different levels, following a concentration-response behavior within the 0.1-128 μM range. The IC50 was then calculated for the tested diterpenes, and these values were used to calculate the apparent inhibitory constant (Ki app ) ( Table 3) using the Cheng-Prusoff equation, assuming reversible competitive inhibition and 1:1 stoichiometry [47]. PMA, a known PTR1 inhibitor, was used as a positive control. Among the three VS-selected diterpenes, 135 was found to be the most potent inhibitor, whereas 301 exhibited the lowest Ki app . Remarkably, the inhibitory activity was improved by approximately 60% if a 3α-p-coumaroyloxy group was present in 302 instead of a 3α-cinnamoyloxy substituent, as 302a exhibited a lower Ki app value than 302. No similar effect was observed for 301, as 301a showed a slightly lower inhibitory activity than 301. Therefore, a reasonable inference based on this small set of compounds is that the presence of a p-hydroxyl group at the phenylpropanoid moiety might favor inhibitory activity, whereas a 9β-hydroxyl group at the diterpene moiety has a negative influence on LmPTR1 inhibition. 24 Finally, although the test diterpenes were found to be less active than the positive control, the concentration-response behavior and the consequently calculated Ki app (≤5 μM) of the selected diterpenes demonstrated the validity of the designed VS approach for the selection of bioactive compounds against PTR1 and the computationally studied binding modes of these selected compounds within the active site of LmPTR1, which is associated with the development of CL. These selected compounds can be considered important, showing that they can be used to obtained additional active PTR1 inhibitors.

Molecular Docking Calculations for The Kaurane Dataset Using Hybrid Models of La, Lb, and Lpptr1
The structures 135, 302, and 302a displayed in vitro activity against L. major, which is one of the species responsible for most CL cases in the Mediterranean littoral, the Middle The selected synthetic diterpene esters 135, 301, 302, 301a, and 302a were tested in vitro to experimentally determine their abilities to inhibit the enzymatic activity of LmPTR1 as an extension of the results provided by the in silico screening. Recombinant LmPTR1 was kinetically assessed, as previously reported [40], to ensure the appropriate enzymatic features, resulting in a consistent substrate Km of 5.6 µM. After testing LmPTR1 inhibition, the selected diterpenes exhibited inhibitory properties at different levels, following a concentration-response behavior within the 0.1-128 µM range. The IC 50 was then calculated for the tested diterpenes, and these values were used to calculate the apparent inhibitory constant (Ki app ) ( Table 3) using the Cheng-Prusoff equation, assuming reversible competitive inhibition and 1:1 stoichiometry [47]. PMA, a known PTR1 inhibitor, was used as a positive control. Among the three VS-selected diterpenes, 135 was found to be the most potent inhibitor, whereas 301 exhibited the lowest Ki app . Remarkably, the inhibitory activity was improved by approximately 60% if a 3α-p-coumaroyloxy group was present in 302 instead of a 3α-cinnamoyloxy substituent, as 302a exhibited a lower Ki app value than 302. No similar effect was observed for 301, as 301a showed a slightly lower inhibitory activity than 301. Therefore, a reasonable inference based on this small set of compounds is that the presence of a p-hydroxyl group at the phenylpropanoid moiety might favor inhibitory activity, whereas a 9β-hydroxyl group at the diterpene moiety has a negative influence on LmPTR1 inhibition. Finally, although the test diterpenes were found to be less active than the positive control, the concentration-response behavior and the consequently calculated Ki app (≤5 µM) of the selected diterpenes demonstrated the validity of the designed VS approach for the selection of bioactive compounds against PTR1 and the computationally studied binding modes of these selected compounds within the active site of LmPTR1, which is associated with the development of CL. These selected compounds can be considered important, showing that they can be used to obtained additional active PTR1 inhibitors.

Molecular Docking Calculations for the Kaurane Dataset Using Hybrid Models of La, Lb, and Lpptr1
The structures 135, 302, and 302a displayed in vitro activity against L. major, which is one of the species responsible for most CL cases in the Mediterranean littoral, the Middle East, the Indian subcontinent, and central Asia [48]. However, in the American continent, other Leishmania species, such as L. amazonensis (La), L. braziliensis (Lb), Molecules 2021, 26, 3076 9 of 23 and L. panamensis (Lp), are associated with great clinical diversity, associated particularly with CL and MCL [49]. Therefore, the potential activity of kauranes against PTR1 from these three species must also be examined, despite the absence of crystal structure for these species.

Hybrid Models of La, Lb, and LpPTR1
Hybrid models were built in YASARA software (YASARA (18.4.24) Vienna, Austria: YASARA Biosciences GmbH; 2018) [50] from sequences of three Leishmania species, Lp, La, and Lb. To verify and validate the reliability and stereochemical qualities of the modeled protein, data from Ramachandran, WHAT IF, and VERIFY 3D plots and the quality Zscores of dihedrals were determined for the built models, which describes how many standard deviations separate the model quality from the average high-resolution X-ray structure [51]. Higher values are better, and negative values indicate that the homology model looks worse than a high-resolution X-ray structure [52,53]. The Ramachandran plot showed that the main possible chain conformations included more than 88.7% of residues in the most favored regions for the three hybrid models, with close to 10.0% of residues in allowed regions. Only the Lp model showed two residues (0.8%) in disallowed regions (outliers; Supplementary Material, Figure S2). Because the percentage of residues found in the outlier region was low or absent, the generated models were considered satisfactory. Eleven residues in the active site were analyzed against the template sequence and were found to be conserved [38].
According to the VERIFY 3D results (https://services.mbi.ucla.edu/SAVES/, accessed on 13 February 2021), 87.1% (Lp), 86.1% (Lb), and 80.0% (La) of residues had mean 3D/1D scores ≥ 0.2, which indicated a reliable model because more than 80% of amino acids had values of 0.2 in the 3D/1D profile. The verification of dihedral quality was classified as optimal for the three models, with values above 0.913. The quality of atomic contacts between the atoms of each residue was analyzed using the Coarse Packing Quality Control module of WHAT IF (https://swift.cmbi.ru.nl/servers/html/index.html, accessed on 15 February 2021), which compares the distribution of atom positions around each residue. The mean scores of all wastes were −0.334, −0.488, and −0.667, for Lb, La, and Lp, respectively. A score of less than −5.0 for a residue indicates poor or unusual atomic contacts.

Molecular Docking Calculations for Kauranes Dataset
Molecular docking calculations for the 360 kaurane dataset plus the two derivative compounds, 301a and 302a, were obtained using the Autodock/Vina algorithm for the three generated Leishmania hybrid models (Lp, Lb, and La) to evaluate whether the kauranes that showed in vitro activity against L. major have the potential to display multispecies activity. Equation (3) combines the SB probability scores obtained from the docking calculations of all three models, and DHB and PMA were used as references.
where LbSB is the structure-based probability score for L. braziliensis, LpSB is the structurebased probability score for L. panamensis, and LaSB is the structure-based probability score for L. amazonensis. CA is the consensus analysis for all three species. Therefore, a CA value equal to or greater than 0.5 is classified as active. Among the 362 structures tested, only 274 were classified as active, and 301a and 302a were the best-ranked molecules, with CA values of 0.96 and 0.94, respectively. The kauranes (135 and 302) that demonstrated in vitro activity against L. major also showed high CA values (above 0.86) and were among the ten best-ranked molecules (Table 4). DHB showed more affinity for PTR1 from the three Leishmania species than PMA. Lower docking scores than the two control structures were obtained for 100%, 81%, and 99% of the tested kauranes for Lb, Lp, and La, respectively.
Docking poses for structure 302 in the active site of the three Leishmania PTR1 models and the interacting residues for 302, DHB, and PMA are displayed in Figure 5 and Table 5, respectively. A Vina score of −9.73 kcal/mol was calculated for Lp, predominantly due to van der Waals interactions, with five common interactions identified between DHB and PMA (L19, S112, Y194, L226, and S227). A critical common π-anion interaction was observed between D181 and the aromatic ring of the cinnamoyl group. No H-bond interactions are observed for this kaurane in the active site of LpPTR1. Docking poses for structure 302 in the active site of the three Leishmania PTR1 models and the interacting residues for 302, DHB, and PMA are displayed in Figure 5 and Table  5, respectively. A Vina score of −9.73 kcal/mol was calculated for Lp, predominantly due to van der Waals interactions, with five common interactions identified between DHB and PMA (L19, S112, Y194, L226, and S227). A critical common π-anion interaction was observed between D181 and the aromatic ring of the cinnamoyl group. No H-bond interactions are observed for this kaurane in the active site of LpPTR1.   Similarly, the structure of 302 achieved a Vina score of −11.1 kcal/mol in the active site of LbPTR1, exhibiting some common van der Waals interactions with DHB and PMA (S112, S227, and L228). An H-bond interaction was established between G225 and the carboxylic group of C-19. G225 did not interact with DHB and PMA, which establish three H-bonds (L19 and N110 were common between these two molecules). Interestingly, an alkyl interaction with L19 was observed for the structures 302 and 135, which was the best-ranked molecule for LbPTR1 (Vina score of −13.07 kcal/mol).
For structure 302, in the active site of LaPTR1, two H-bond interactions were observed with A15 and K16, and K16 was also observed in the complex between LaPTR1 and DHB, identified as a critical amino acid for the binding. For both the kaurene 302 and the two controls (PMA and DHB), a higher number of van der Waals interactions were exhibited than any other type of intermolecular interaction, although only the interaction with Y193 was common among all three of these structures. Finally, an alkyl interaction with P223 was identified for the structures 302 and PMA.

Molecular Dynamics Simulations
L. braziliensis is the causative agent of human CL and MCL in various countries of the American continent, including Colombia, Brazil, Nicaragua, and Ecuador, among others [49,54,55]. Thus, to validate the hybrid model constructed for LbPTR1 and to evaluate the protein-ligand stabilities of the structures 135, 302, and 302a, molecular dynamics (MD) studies were performed using DHB and PMA as reference ligands.
Root-mean-square deviation (RMSD) values were used to evaluate the structural stability of the receptor frame by measuring the distance between different positions (in nm) that a set of atoms exhibited over time [56]. In the first half of the simulation time (0-25 ns), the structures 135, 302, 302a, DHB, PMA, and the apoenzyme of LbPTR1 (apoLbPTR1, protein without ligand) showed a similar grade of perturbation, with RMSD values ranging from approximately 0.35 to 0.65 nm. After 25 ns, all ligands exhibited reduced perturbations relative to that observed for apoLbPTR1, which suggests increased stability exerted by the inhibitors on the complex with LbPTR1. RMSD values for the protein-kaurane complexes of approximately 0.5 nm were observed, except for the reference ligand, DHB, which showed a slightly higher RMSD value (approximately 0.55 nm). In contrast, apoLbPTR1 exhibited values approaching 0.7 nm (Figure 6a). protein without ligand) showed a similar grade of perturbation, with RMSD values ranging from approximately 0.35 to 0.65 nm. After 25 ns, all ligands exhibited reduced perturbations relative to that observed for apoLbPTR1, which suggests increased stability exerted by the inhibitors on the complex with LbPTR1. RMSD values for the protein-kaurane complexes of approximately 0.5 nm were observed, except for the reference ligand, DHB, which showed a slightly higher RMSD value (approximately 0.55 nm). In contrast, apoLbPTR1 exhibited values approaching 0.7 nm (Figure 6a). The fluctuations for each LbPTR1 residue were analyzed by calculating the rootmean-square fluctuation (RMSF) values. Kauranes, DHB, and PMA in complex with LbPTR1 presented lower values than the apoenzyme, and the LbPTR1 loops were identified as the most variable regions. In the sections of LbPTR1 with defined tertiary structures (helical or β-sheets), the fluctuation of residues for both the apoenzyme and the complexes The fluctuations for each LbPTR1 residue were analyzed by calculating the root-meansquare fluctuation (RMSF) values. Kauranes, DHB, and PMA in complex with LbPTR1 presented lower values than the apoenzyme, and the LbPTR1 loops were identified as the most variable regions. In the sections of LbPTR1 with defined tertiary structures (helical or β-sheets), the fluctuation of residues for both the apoenzyme and the complexes formed with DHB, PMA, and kauranes (135, 302, and 302a) was less than 0.25 nm. For most of the residues in the active site, the RMSF values decreased when LbPTR1 was in complex with structure 302.
In the loop region, from A65 to S85, structure 302a showed the highest RMSF value (approximately 0.9 nm) compared with structures 135 and 302a, which reached RMSF values lower than 0.6 nm. This behavior might be due to differences in the spatial conformation of 302a within the active site of LbPTR1 compared with those for structures 135 and 302; consequently, the molecular docking values are justified. The analysis of the loop section between N110 and T135 showed that inhibitors (structures 135, 302, and 302a, DHB, and PMA) in complex with LbPTR1 reached RMSF values approaching 1.0 nm; in contrast, the apoenzyme exhibited a value above 1.65 nm (Figure 6b), indicating that these structures stabilized the protein following the formation of an LbPTR1-kaurane complex. The diterpenes showed similar RMSF values as DHB; however, in this loop region, PMA has a lower RMSF value (approximately 0.8 nm).
The evolution of the LbPTR1 packing level was observed through the radius of gyration (RoG) values. The diterpene-LbPTR1 complexes showed no differences in RoG values compared with the complex formed between DHB and LbPTR1 (RoG of approximately 2.00 nm), with fluctuations in the tertiary structure lower than 0.10 nm. The RoG values for PMA were slightly different (approximately 2.05), demonstrating a different behavior throughout the 50 ns test period than the other structures analyzed (Figure 6c). ApoLbPTR1, during the initial 25 ns, showed a higher RoG than the complexes, with a RoG value approaching 2.05 nm. However, in the second half of the simulation, a decrease in the RoG value was observed, reaching a value similar to those observed for the complexes formed with diterpenes and DHB (RoG of approximately 1.95 nm). Based on these results, the structures 135, 302, and 302a appeared to be stably folded after the MD simulation.
According to the MD simulations, the binding free energies for structures 135, 302, 302a, PMA, and DHB were calculated through the MM/PBSA method. The diterpenes 135, 302, and 302a presented binding free energy values of −132.7 kJ/mol, −121.4 kJ/mol, and −138.3 kJ/mol, respectively, which were all lower energy values than those measured for DHB and PMA in complex with LbPTR1, which presented free energy values of −107.4 kJ/mol and −110.0 kJ/mol, respectively. These differences in energetic contributions were associated with structural differences (Table 6). In the three kauranes, van der Waals interactions showed the highest negative contributions to the binding free energy, which supported the previously observed docking results. The solvent-accessible surface area (SASA) and electrostatic parameters contributed negatively, but to a lesser degree, to the binding free energies in similar proportions (except for the electrostatic parameter of 302, which displayed a higher contribution to total binding energy). The reference inhibitor (PMA) and the native ligand (DHB) demonstrated different behaviors from those observed for the three diterpenes, with a higher contribution of electrostatic interactions to the total binding free energy value, which represented the energy parameter with the highest negative energetic contribution. For all molecules, the polar solvation had a positive contribution to the total energy value, with larger contributions to the complexes DHB-LbPTR1 and PMA-LbPTR1.

Prediction of ADMET Properties
The pharmacokinetic properties such as absorption, distribution, metabolism, excretion, and toxicity (ADMET) of the structures 135, 302, and 302a were predicted by using Volsurf+ [31,57], ADMETlab 2.0 [58], and OSIRIS Data Warrior v.5.2.1 [59]. The results show that all three structures did not present any violation to the Lipinski's "rule of five", implying that these kauranes are probably orally bioavailable. These compounds also showed a high gastrointestinal absorption (Table S6).
Cytochrome P450 (CYP) enzymes are a family of heme proteins involved in the metabolism of frequent pharmacologically active compounds and can cause drug to drug interactions with co-administered drugs as well as unwanted adverse side effects [60]. The results obtained in ADMETlab 2.0 [58] showed that structures 135, 302, and 302a are substrates of CYP2C19. Additionally, structures 302 and 302a presented probabilities above 0.53 of being substrate of CYP2C9 and the structure 135 is substrate of CYP2D6 and CYP3A2. All selected kauranes are non-inhibitors of the studied CYP 450 isoenzymes.
That being said, these molecules do not affect significantly the metabolism of the drugs that work as substrates of the mentioned isoenzymes, therefore being safe to use concomitantly with additional pharmacotherapy (Table S6).
Structures 135 and 302a had no predicted risk for the development of mutagenicity, tumorigenesis, negative effects on the reproductive system, or irritability; only structure 302 was predicted as potentially irritable. The identification of potential hERG channel blockers at the early stage of drug discovery process will decrease the risk of cardiotoxicity-related attritions in the later and more expensive development stage [61]. The three evaluated kauranes (structures 135, 302, and 302a) showed minimal probabilities of being hERG blockers with values lower than 0.022 (Table S6). Finally, using the webtools CLC-Pred [62] and eMolTox [63], cytotoxic properties were predicted against no tumor cell lines; for the three structures from data-driven models no toxic action was found.

Database
From the ChEMBL database (https://www.ebi.ac.uk/chembl/, accessed on 10 January 2021), we selected a diverse set of 1085 structures that were initially classified according to their predicted activity against L. major. These compounds were classified according to their pIC 50 values [−logIC 50 (mol/L)]; therefore, we stratified them into active (pIC 50 ≥ 5.0) and inactive (pIC 50 < 5.0) structures.
The APD, based on Euclidean distances, was used to identify those compounds in the test set for which predictions may be unreliable; compounds were considered unreliable if they had APD values higher than d + Zσ, where d is the average Euclidian distance, and σ is the standard deviation of the set of samples used as the training set, with lower-thanaverage Euclidian distance values relative to all samples in the training set. The parameter Z is an empirical cutoff value, and 0.5 was used as the default value [64].
Structures with pIC 50 values ranging from 4.9 to 5.0 (range of 0.1 units) were excluded to avoid edge effects and improve the predictive capacity of the models. Excluding these structures minimized the differences in activity values resulting from errors and differences in experimental protocols [65]. Data curation was performed for the datasets according to procedures suggested in the literature [66][67][68]. Standardizer software [JChem, version 16.11.28 (2016), calculation module developed by ChemAxon, https://www.chemaxon. com/, accessed on 20 January 2021] was used to canonize all simplified molecular-input line-entry system (SMILES) codes. After duplicate structures were removed, those with higher pIC 50 values were eliminated, facilitating the generation of more restrictive models. Finally, 638 structures for L. major (338 active and 300 inactive structures) were included in the analysis.
The kaurane dataset was built in-house, and a total of 360 molecules from this dataset were used in this study. For all structures, SMILES codes were used as the input data in Marvin [ChemAxon, version 16.11.28 (2016), calculation module developed by ChemAxon, https://www.chemaxon.com/, accessed on 20 January 2021]. We used standardizer software [JChem, version 16.11.28 (2016), calculation module developed by ChemAxon, https://www.chemaxon.com/, accessed on 20 January 2021]. ChemAxon was used to canonize the structures, add hydrogens, perform aromatic form conversions, and clean molecular graphs in three dimensions.

Volsurf+ Descriptors
The three-dimensional (3D) structures of the identified molecules, in special data file (SDF) format, were used as the input data for VolSurf+, v. 1.0.7 [30] and were subjected to molecular interaction fields (MIFs) to generate descriptors using the following probes: N1 (amide nitrogen-hydrogen bond donor probe), O (carbonyl oxygen-hydrogen bond acceptor probe), OH2 (water probe), and DRY (hydrophobic probe). Additional non-MIF-derived descriptors were generated, resulting in a total of 128 descriptors [30]. One of the main advantages of using VolSurf descriptors is the relatively low influence of conformational sampling and averaging on these descriptors [31].

RF Models
Knime 3.1.0 software (KNIME 3.1.0 the Konstanz Information Miner Copyright, 2003-2014, www.knime.org) [32] was used to perform all of the following analyses. Initially, the descriptors calculated in the VolSurf+ program were imported in comma-separated value (CSV) format, and the "Partitioning" node in the stratified sampling option was used to classify 80% of the initial dataset as the training set and the remaining 20% as the test set. The model was generated by employing the modeling set and the RF algorithm, with a five-fold cross-validation procedure, using WEKA nodes. In the five-fold cross-validation procedure, the dataset is divided five times into a modeling set (80%/20%). The modeling set (which was used to build and validate the models) was further divided into training (80%) and test sets (20%) [33,66]. The parameters selected for the RF models included the following: number of trees to build = 200; seed for random number generator = 1; and Gini Index, as a split criterion, for both the training and internal cross-validation sets. From the confusion matrix, the internal and external performances of the selected models were analyzed, using the following parameters: sensitivity (true-positive rate), specificity (true-negative rate), and accuracy (overall predictability). In addition, to describe the true performance of the model with more clarity than can be obtained from accuracy alone, the ROC curve was employed, using a "ROC curve" node, which uses the sensitivity and specificity parameters. The plotted ROC curve shows the true-positive (active) rate versus the false-positive rate (1 − specificity) [34]. In this representation, when a variable of interest cannot be distinguished between the two groups, the AUC value is 0.5, whereas a perfect separation between the values of the two groups, with no distribution overlap, results in an AUC value of 1. The MCC was also calculated, for which a value of 1 represents a perfect prediction, a value of 0 represents a random prediction, and a value of −1 represents a total disagreement between the prediction and the observation [35].

Materials and Reagents
Optical rotations and UV data were recorded using a Jasco P-2000ST digital polarimeter (Jasco, Tokyo, Japan) and a Thermo Fisher Scientific Genesys 10S spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA), respectively. 1 H and 13 C Nuclear magnetic resonance experiments were recorded in a Bruker Avance400 spectrometer (Bruker Corporation, Billerica, MA, USA) using CDCl 3 as solvent. All shifts are given in δ (ppm) using the signal of TMS as reference. All coupling constants (J) are given in Hz. HRESIMS data were obtained on a Bruker micro-QToF II spectrometer (Bruker, Billerica, MA, USA), respectively. Thin-layer chromatography (TLC) using silica gel 60 F254 TLC plates (Merck, Darmstadt, Germany) and mobile phases comprising solvent mixtures of n-hexane, EtOAc, and MeOH were used. Plates after TLC development were observed under UV light (254 and 365 nm) and derivatized using I 2 vapor and Hannessian's reagent (aqueous solution of ammonium molybdate, cerium sulphate and H 2 SO 4 ). Silica gel (SiO2) 60 (0.04-0.063 mm) (Merck) was used for flash chromatography (flash CC). Cinnamic acid, p-coumaric acid, (R)-(−)-carvone, and other reagents and solvents for synthesis and enzyme inhibition assay were acquired from Sigma-Aldrich (St. Louis, MO, USA).

LmPTR1 Enzyme Inhibition Assay
Recombinant LmPTR1 enzyme was obtained, purified, and kinetically characterized, as reported previously [71]. The in vitro assessment of selected diterpenes (i. e., 135, 301,  302, 301a, and 302a) for LmPTR1 inhibitory activity was performed through the spectrophotometric monitoring of the enzymatic activity under balanced conditions: LmPTR1 (30 µg), 7,8-dihydro-L-biopterin (DHB, 20 µM), sodium citrate buffer (20 mM, pH 6.0), 30 • C, and a final assay volume of 600 µL. Each reaction was started by the addition of 250 µM NADPH. Absorbance was monitored at 340 nm (i.e., oxidation of NADPH to NADP+) for 240 s, and the resulting profile was used to measure the initial reaction rate (IRR) through the respective slope by linear regression. All recordings were performed in triplicate. PMA was used as the positive control. The resulting IRR values were used to calculate the % inhibition, as 100 − (Ri/Rc × 100), where Ri is the IRR in the presence of the inhibitor and Rc is the IRR in the absence of inhibitors (1% DMSO v/v final concentration). The % inhibition for at least five concentrations (range: 0.1-128 µM) for each test compound (diterpenes and PMA) were calculated, and concentration-response curves (% inhibition vs. Log[inhibitor]) were obtained by non-linear regression to determine the IC 50 using Graph-Pad Prism 5.0 (GraphPad, San Diego, CA, USA). Finally, Ki app values were calculated using the Cheng-Prusoff equation for competitive inhibition, assuming a 1:1 stoichiometry and that the inhibitor-binding reactions are reversible [47]: Ki app = IC 50  , and LaPTR1 (O09352), which were obtained from the UniProt database (https://www.uniprot.org/, accessed on 9 February 2021). The stereochemical qualities of the models were evaluated with PROCHECK [72], in which molecular diversity evaluated several stereochemical parameters, such as the torsional angles of the main chain, the torsional angles of the side chain, bad contacts or steric impediments, and planarity. PROCHECK generated a Ramachandran graph [51], which verified the allowed and unallowed regions of the main amino acid chain. The structural quality was evaluated in VERIFY 3D software (https://services.mbi.ucla.edu/ SAVES/, accessed on 13 February 2021), which analyzes the compatibility of the protein sequence with its 3D structure, according to the chemical environment, and WHAT IF (https://swift.cmbi.ru.nl/servers/html/index.html, accessed on 15 February 2021), which analyzes various structural parameters, such as the atomic contacts between residues. The software Discovery Studio Visualizer was used to visualize the modeled protein [73].

Molecular Docking Calculations
The LmPTR1 crystal structure (PDB ID: 1E7W), in complex with its respective inhibitor, methotrexate (PDB ID: MTX), was downloaded from PDB [38]. Using Molegro 6.0.1 software, all water compounds were deleted from the enzyme structures, and the enzyme/compound structures were prepared using the same default parameter settings, in the same software package (Score function: MolDock Score; Ligand evaluation: Internal ES, Internal H-Bond, Sp2-Sp2 Torsions, all checked; Number of runs: 10 runs; Algorithm: MolDock SE; Maximum Interactions: 1500; Max. population size: 50; Max. steps: 300; Neighbor distance factor: 1.00; Max. number of poses returned: 5). The docking procedure was performed using a grid with a 15-Å radius and a 0.30-Å resolution to cover the ligand-binding site for the four enzyme structures [14,20].  19.6, 7.8), which was identified through cavities analysis in Molegro 6.0.1. Flexible residues in the binding site were selected for each model. Lb: L19, H39, R40, N110, S112, D181, and S227; Lp: K17, L19, S112, M179, and I180; and La: R18, L19, H38, L188, M233, K244, and Y283. Docking poses were classified according to their docking scores (such as the free energy or affinity). Each calculation was performed in three replicates. Two known PTR1 ligands (DHB and PMA) were used as controls. The two-dimensional (2D)-residual interaction diagrams were visualized on Discovery Studio 2016 Visualizer Client (Biovia, San Diego, CA, USA) [73].

Molecular Dynamics Simulations
MD simulations were run in the Gromacs 5.0.5 on Ubuntu 12.04 server [74,75]. Structures 135, 302, and 302a displayed the best poses from docking, and the DHB and PMA structures, as well as the hybrid model of LbPTR1, were employed as the inputs for the MD simulations. The five ligands were prepared by adding hydrogen atoms and the corresponding charges using the AM1-BCC charge scheme in UCSF Chimera. Subsequently, ligand topologies were generated automatically with ACPYPE script. Protein topologies were obtained in Gromacs using the Amber 99SB force field, and the TIP3P water model was implemented. Solvation was performed in a triclinic box using a margin distance of 1.0 nm. The addition of 0.1 M NaCl to complexes and proteins was performed by randomly replacing water molecules until neutrality was achieved [20,56].
The systems were energy-minimized by 2000 steps using the steepest descent method. Systems were subjected to NVT equilibration performed at 310 K for 50 ps, followed by NPT equilibration for 500 ps, using the Parrinello−Rahman method at 1 bar as a reference, using position restraints. Finally, the solute position restraints were released, and a production run for 5 ns was performed. The temperature and pressure were maintained constant at 310 K and 1 bar, respectively. Coordinates were recorded in a 1 fs time step. Electrostatic forces were calculated using the particle-mesh Ewald method. Periodic boundary conditions were used in all simulations, and covalent bond lengths were constrained by the LINCS algorithm. The molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) method was used to calculate binding free energies, using the trajectories calculated by the MD simulations [20,56].

Prediction of ADMET Properties
For the structures 135, 301, and 302a, the ADMET parameters were calculated using the ADMETlab 2.0, an integrated online platform for predictions of ADMET properties [58]. Drug toxicity prediction was performed using OSIRIS Data Warrior v.5.2.1, based on the following parameters: mutagenicity, tumorigenicity, reproductive effect, and irritability [59]. For in silico prediction of human cell line cytotoxicity, two web-tools were used: CLC-Pred, a freely available web-service [62] and eMolTox, a web server for the prediction of potential toxicity associated with a given molecule [63].

Conclusions
Structures 135 and 302 are two kauranes that were identified as hits for anti-leishmanicidal activity, with IC 50 values against L. major below 10 µM. These two structures were selected from an in-house database comprising 360 kauranes through an in silico approach combining machine learning and molecular docking methodologies. Only five structures from Asteraceae were classified as active by both methodologies. The in vitro results allowed the successful verification of the RF classification model, which predicted that structures 135 and 302 would be active (pIC 50 > 5.0) and that structure 301 would be inactive (pIC 50 < 5.0), which was observed experimentally.
Additionally, the inhibitory activity was improved by approximately 60% when a 3α-p-coumaroyloxy group was used in 302 in place of the 3α-cinnamoyloxy substituent, with 302a exhibiting a lower Ki app value. Although the tested diterpenes were found to be less active than the positive control, the validity of the designed VS approach for the selection of bioactive molecules against PTR1 was demonstrated, and the computationally studied binding mode of these selected compounds within the active site of LmPTR1, which causes CL, was explored. These selected compounds can be considered important leads that can be used to obtain more active PTR1 inhibitors.
Finally, because throughout the American continent, other Leishmania species are responsible for the clinical diversity of CL and MCL, including L. amazonensis (La), L. braziliensis (Lb), and L. panamensis (Lp), molecular docking calculations and MD simulations were performed for the entire set of kauranes (including 301a and 302a), and the compounds 135, 302, and 302a were identified as potential multispecies agents. Therefore, this study describes a valuable screening approach for the identification of lead compounds in natural products, which can contribute to the further development of alternative chemotherapies against this group of diseases.

Supplementary Materials:
The following are available online. Figure S1: Ramachandran plots for hybrid models of L. panamensis, L. braziliensis, and L. amazonensis. Table S1: ChEMBL structures with pIC 50 values used for random forest model construction. Table S2: LB probability for the entire diterpenes dataset obtained from random forest model. Table S3: Docking energies for 360 diterpenes from the structure-based VS for LmPTR1. Table S4: Kauranes CA scores using an approach combining ligand-based and structure-based VS.