Author Contributions
Conceptualization, K.W.L.; Data curation, S.R., C.P., A.B., and S.P.; Formal analysis, A.Z., S.P., and S.J.P.; Investigation, R.M.R. and S.J.P.; Methodology, R.M.R., S.R., M.S., C.P., G.L., A.B., and S.Y.; Project administration, K.W.L.; Supervision, K.W.L.; Validation, A.Z. and M.S.; Visualization, R.M.R.; Writing and editing, A.Z.
Figure 1.
2D structure representation of the 27 chemically diverse training set compounds used for pharmacophore generation. The experimental IC50 values (nM/L) are shown in parentheses.
Figure 2.
(A) The 3D spatial arrangement and distance constraints of the chemical features of Hypo1. Hypo1 consists of one hydrogen bond acceptor (green), two hydrogen bond donors (magenta), and one hydrophobic feature (cyan). (B) Alignment of the reference compound, the most active compound of the training set (IC50 = 0.19 nM/L), which mapped all the features of Hypo1. (C) Alignment of the least active compound of the training set (IC50 = 10,000 nM/L), which mapped only the hydrogen bond donor features and missed the hydrogen bond acceptor and hydrophobic features of Hypo1.
Figure 3.
A graphical representation of the total cost values of Hypo1 and each of the 10 hypotheses generated from 19 random spreadsheets during Fischer’s randomization run. A confidence level of 95% was used. Hypo1 obtained the lowest total cost value.
Figure 4.
Correlations between the experimental and predicted activities obtained with Hypo1 for the training set and test set compounds. Hypo1 showed a strong correlation between experimental and predicted activities for both sets.
Figure 5.
Schematic flow of the virtual screening. Hypo1 was employed as a 3D query for virtual screening. ADMET assessment and Lipinski’s rule of five were used as filters to identify drug-like compounds.
Figure 6.
Alignment of Hypo1 and the final hit compounds. Hit1 (A), Hit2 (B), and Hit3 (C) mapped all the features of Hypo1. Hit compounds are represented as stick models.
Figure 7.
Molecular dynamics simulation analyses. (A) The root mean square deviation (RMSD) profile of the backbone atoms of human dihydrofolate reductase (hDHFR). Each system showed a stable RMSD throughout the simulation run. hDHFR in complex with the reference compound, Hit1, Hit2, and Hit3 is shown in red, magenta, green, and orange, respectively. (B) The number of intermolecular hydrogen bonds between protein and compound during the 50 ns MD simulations. The reference compound, Hit1, Hit2, and Hit3 are displayed in red, magenta, green, and orange, respectively.
Figure 8.
The reference compound and the three hit compounds occupied the active site of hDHFR. All the compounds in their representative structures were superimposed (left) and enlarged (right). 3D structure of hDHFR is shown in light grey color. Red, magenta, green, and orange colors represent the reference compound, Hit1, Hit2, and Hit3, respectively.
Figure 9.
Molecular interaction analyses. The reference compound and the three hit compounds interacted with essential residues in the active site of hDHFR. The reference compound (A), Hit1 (B), Hit2 (C), and Hit3 (D) are depicted as red, magenta, green, and orange sticks, respectively. The H-bond forming residues of hDHFR are displayed as cyan sticks. Hydrogen bonds are shown as green dashed lines, with bond distances given in angstroms (Å).
Figure 10.
2D structures of the final hit compounds. The 2D structures of the reference compound (A), Hit1 (B), Hit2 (C), and Hit3 (D) are depicted.
Figure 11.
Binding free energy analyses. (A) Graphical representation of MM/PBSA estimated binding free energy of hDHFR in complex with reference compound, Hit1, Hit2, and Hit3 throughout the simulation time. The reference compound, Hit1, Hit2, and Hit3 are depicted as red, magenta, green, and orange, respectively. (B) The binding free energy decomposition analysis of the final hits in the active site of hDHFR.
Table 1.
Statistical data of ten pharmacophore hypotheses generated by HypoGen.
Hypo. No. | Total Cost | Cost Difference a | Root Mean Square Deviation (RMSD) b | Correlation (R2) | Max Fit | Features c |
---|---|---|---|---|---|---|
Hypo1 | 125.276 | 78.308 | 0.99 | 0.94 | 11.15 | HBA, HBD, HBD, HYP |
Hypo2 | 125.362 | 78.304 | 1.05 | 0.93 | 9.94 | HBA, HBD, HBD, HYP |
Hypo3 | 125.509 | 78.157 | 1.05 | 0.93 | 10.04 | HBA, HBD, HBD, HYP |
Hypo4 | 125.358 | 77.701 | 1.05 | 0.93 | 10.04 | HBA, HBD, HBD, HYP |
Hypo5 | 125.867 | 77.799 | 1.06 | 0.93 | 10.21 | HBA, HBD, HBD, HYP |
Hypo6 | 125.965 | 77.946 | 1.03 | 0.93 | 10.74 | HBA, HBD, HBD, HYP |
Hypo7 | 127.533 | 76.133 | 1.13 | 0.92 | 9.97 | HBA, HBD, HBD, HYP |
Hypo8 | 125.781 | 75.885 | 1.14 | 0.92 | 9.75 | HBA, HBD, HBD, HYP |
Hypo9 | 128.568 | 75.098 | 1.15 | 0.92 | 10.24 | HBA, HBD, HBD, HYP |
Hypo10 | 128.568 | 75.098 | 1.15 | 0.92 | 10.26 | HBA, HBD, HBD, HYP |
Table 2.
Experimental and predicted activity of training set compounds based on Hypo1.
Compound No. | Fit Value | Experimental IC50 (nM/L) | Predicted IC50 (nM/L) | Error a | Exp. Scale b | Pred. Scale b |
---|---|---|---|---|---|---|
1 | 9.63 | 0.19 | 0.21 | 1.10 | +++ | +++ |
2 | 7.42 | 0.73 | 7.55 | 10.34 | +++ | +++ |
3 | 6.87 | 1.8 | 27.09 | 15.05 | +++ | +++ |
4 | 8.24 | 2.7 | 1.15 | −2.36 | +++ | +++ |
5 | 7.33 | 3.5 | 9.40 | 2.69 | +++ | +++ |
6 | 7.66 | 6 | 4.41 | −1.36 | +++ | +++ |
7 | 7.32 | 10 | 9.68 | −1.03 | +++ | +++ |
8 | 7.36 | 17.5 | 8.69 | −2.01 | +++ | +++ |
9 | 6.26 | 89 | 110.31 | 1.24 | +++ | ++ |
10 | 5.74 | 137 | 364.60 | 2.66 | ++ | ++ |
11 | 5.75 | 155 | 357.03 | 2.30 | ++ | ++ |
12 | 5.77 | 160 | 339.73 | 2.12 | ++ | ++ |
13 | 5.52 | 190 | 606.05 | 3.19 | ++ | + |
14 | 5.78 | 210 | 331.91 | 1.58 | ++ | ++ |
15 | 5.62 | 260 | 482.36 | 1.86 | ++ | ++ |
16 | 5.38 | 290 | 827.48 | 2.85 | ++ | + |
17 | 5.75 | 400 | 352.12 | −1.14 | ++ | ++ |
18 | 5.80 | 440 | 315.88 | −1.39 | ++ | ++ |
19 | 5.11 | 2630 | 1566.48 | −1.68 | + | + |
20 | 4.84 | 3530 | 2915.70 | −1.21 | + | + |
21 | 5.18 | 4000 | 1327.90 | −3.01 | + | + |
22 | 4.78 | 6000 | 3296.55 | −1.82 | + | + |
23 | 5.01 | 7700 | 1946.96 | −3.95 | + | + |
24 | 5.06 | 7800 | 1749.88 | −4.46 | + | + |
25 | 5.53 | 8600 | 597.43 | −14.40 | + | + |
26 | 4.52 | 9332 | 6062.97 | −1.54 | + | + |
27 | 4.34 | 10,000 | 9094.55 | −1.10 | + | + |
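The Error column in Table 2 can be reproduced from the two IC50 columns. The sketch below assumes the usual signed fold-error convention, which is inferred from the tabulated rows because the table footnote is not reproduced here: the ratio predicted/experimental when the prediction is higher, and the negative ratio experimental/predicted when it is lower. The function name `fold_error` is illustrative only.

```python
def fold_error(experimental: float, predicted: float) -> float:
    """Signed fold error between predicted and experimental IC50.

    Assumed convention (inferred from Table 2, not stated in the text):
    positive ratio when the prediction overestimates the IC50,
    negative inverse ratio when it underestimates it.
    """
    if predicted >= experimental:
        return predicted / experimental
    return -experimental / predicted


# (compound number, experimental IC50, predicted IC50) taken from Table 2
rows = [(1, 0.19, 0.21), (2, 0.73, 7.55), (25, 8600.0, 597.43), (27, 10000.0, 9094.55)]

for compound, exp_ic50, pred_ic50 in rows:
    print(f"Compound {compound}: error = {fold_error(exp_ic50, pred_ic50):.2f}")
```

Because the tabulated IC50 values are themselves rounded, the recomputed errors can differ from the Error column in the last digit.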
Table 3.
Decoy set validation. Hypo1 obtained a goodness of fit (GF) score of 0.86, suggesting its suitability for virtual screening.
Parameters | Values |
---|---|
Total no. of molecules in database (D) | 75 |
Total no. of actives in database (A) | 8 |
Total no. of hit molecules from the database (Ht) | 8 |
Total no. of active molecules in hit list (Ha) | 7 |
Percentage Yield of actives ((Ha/Ht) × 100) | 87.5% |
Percentage Ratio of actives ((Ha/A) × 100) | 88% |
Enrichment Factor (EF = (Ha/Ht)/(A/D)) | 8.23 |
False negatives (A − Ha) | 1 |
False positives (Ht − Ha) | 1 |
Goodness of fit score (GF) | 0.86 |
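The quantities in Table 3 follow from the four counts D, A, Ht, and Ha. The sketch below recomputes them. The yield, ratio, and enrichment factor use the formulas given in the table; the GF expression is an assumption, namely the commonly used Güner-Henry form, since the table does not state it. The function and key names are illustrative.

```python
def decoy_set_metrics(D: int, A: int, Ht: int, Ha: int) -> dict:
    """Decoy set validation metrics.

    D  : total molecules in the database
    A  : total actives in the database
    Ht : total hits retrieved
    Ha : actives among the hits
    """
    metrics = {
        "yield_pct": Ha / Ht * 100,          # (Ha/Ht) x 100
        "ratio_pct": Ha / A * 100,           # (Ha/A) x 100
        "enrichment": (Ha / Ht) / (A / D),   # EF = (Ha/Ht)/(A/D)
        "false_negatives": A - Ha,
        "false_positives": Ht - Ha,
    }
    # Assumed Guner-Henry goodness-of-fit expression (not given in Table 3).
    metrics["gf"] = (Ha / (4 * Ht * A)) * (3 * A + Ht) * (1 - (Ht - Ha) / (D - A))
    return metrics


m = decoy_set_metrics(D=75, A=8, Ht=8, Ha=7)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```

With the tabulated counts, this GF expression evaluates to about 0.86, in line with the table. The enrichment factor evaluates to about 8.20, slightly below the 8.23 listed.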
Table 4.
Comparison of Goldscore and Chemscore of dihydrofolate reductase (DHFR) in complex with the reference, Hit1, Hit2, and Hit3.
System | Goldfitness Score | Chemscore | Estimated IC50 (nM/L) |
---|---|---|---|
DHFR + a Reference | 44.67 | −23.35 | 0.21 |
DHFR + Hit1 | 70.05 | −34.51 | 0.12 |
DHFR + Hit2 | 58.95 | −33.66 | 0.043 |
DHFR + Hit3 | 57.31 | −37.27 | 0.17 |
Table 5.
The specifications of four systems used for molecular dynamics simulations.
System | No. of TIP3P Water Molecules | No. of Na+ Ions | System Size (nm) |
---|---|---|---|
DHFR+NADPH + a Reference | 11,306 | 4 | 7.193 × 7.193 × 7.193 |
DHFR+NADPH + Hit1 | 11,306 | 4 | 7.193 × 7.193 × 7.193 |
DHFR+NADPH + Hit2 | 11,306 | 4 | 7.193 × 7.193 × 7.193 |
DHFR+NADPH + Hit3 | 11,306 | 4 | 7.193 × 7.193 × 7.193 |
Table 6.
Molecular interactions between the ligands (the reference and hit compounds) and the active site residues of hDHFR.
Compound | Hydrogen Bond (<3 Å) | Van der Waals Interactions and Carbon Hydrogen Bond | π-Interaction |
---|---|---|---|
a Reference | Ile7, Glu30, Val115 | Val8, Gly31, Asn64, Gly116, Tyr121, NADPH | Ile7, Phe34, Ile60, Pro61, Leu67, Val115 |
Hit1 | Leu27, Glu30, Ser59 | Val8, Ile16, Gly17, Asp21, Asp21, Pro26, Gly31, Gln35, Thr136, Pro61, Ile60 | Ala9, Leu22, Pro23, Phe34, Val115, NADPH |
Hit2 | Trp24, Glu30 (2), Thr56, NADPH | Ala9, Gly31, Gln35, Ser59, Ile60, Pro61 | Phe34, Met52, Ile60, Leu67, Val115 |
Hit3 | Glu30, Asn64 (2) | Val8, Leu22, Gly31, Gln35, Met52, Thr56, Pro61, Leu67, Val115, Tyr121, Thr136, NADPH | Ile7, Ala9, Ile60, Phe34 |
Table 7.
Decomposition of binding free energy.
Compounds | Van der Waals Energy (kJ/mol) | Electrostatic Energy (kJ/mol) | Polar Solvation Energy (kJ/mol) | SASA b Energy (kJ/mol) | Binding Energy (kJ/mol) |
---|---|---|---|---|---|
a Reference | −131.78 | −92.84 | 111.73 | −13.9 | −127.05 |
Hit1 | −154.58 | −92.63 | 93.27 | −17.218 | −171.12 |
Hit2 | −154.75 | −127.45 | 120.79 | −17.31 | −178.47 |
Hit3 | −140.36 | −86.23 | 109.75 | −16.47 | −133.16 |
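As a consistency check on Table 7, the sketch below assumes that the binding energy is the sum of the four tabulated components (van der Waals, electrostatic, polar solvation, and SASA energies). This additive form is an assumption about how the last column was obtained; the values are copied from the table, and the variable names are illustrative.

```python
# (van der Waals, electrostatic, polar solvation, SASA) in kJ/mol, from Table 7
components = {
    "Reference": (-131.78, -92.84, 111.73, -13.9),
    "Hit1": (-154.58, -92.63, 93.27, -17.218),
    "Hit2": (-154.75, -127.45, 120.79, -17.31),
    "Hit3": (-140.36, -86.23, 109.75, -16.47),
}

# Binding energies as reported in the last column of Table 7 (kJ/mol)
reported = {"Reference": -127.05, "Hit1": -171.12, "Hit2": -178.47, "Hit3": -133.16}


def binding_energy(vdw: float, elec: float, polar: float, sasa: float) -> float:
    """Sum of the four energy terms (assumed additive decomposition)."""
    return vdw + elec + polar + sasa


for name, terms in components.items():
    total = binding_energy(*terms)
    print(f"{name}: summed = {total:.2f}, reported = {reported[name]:.2f}")
```

Under this assumption the summed values agree with the reported binding energies to within about 0.3 kJ/mol. The residual differences are presumably due to averaging and rounding of the individual terms.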
Table 8.
Comparison of absorption, distribution, metabolism, excretion, and toxicity (ADMET), Lipinski’s rule of five, and toxicity prediction by komputer assisted technology (TOPKAT) predictions for Food and Drug Administration (FDA)-approved drugs and hit compounds.
Name | ADMET: Solubility | ADMET: BBB a | ADMET: Hepato-Toxicity | ADMET: Absorption | Lipinski: HBA b | Lipinski: HBD c | Lipinski: M.W (Da) d | TOPKAT: Rat Carcinogenicity (Female) | TOPKAT: Rat Carcinogenicity (Male) | TOPKAT: Skin Irritancy |
---|---|---|---|---|---|---|---|---|---|---|
MTX | 2 | 4 | TRUE | 3 | 12 | 5 | 454.447 | Single | Non | Non |
Pralatrexate | 2 | 4 | TRUE | 3 | 11 | 5 | 477.481 | Non | Non | Non |
Pemetrexed | 3 | 4 | TRUE | 3 | 6 | 6 | 427.417 | Single | Non | Non |
Hit1 | 3 | 3 | FALSE | 0 | 6 | 3 | 450.714 | Non | Non | Non |
Hit2 | 3 | 3 | FALSE | 0 | 5 | 3 | 420.833 | Non | Non | Non |
Hit3 | 3 | 3 | FALSE | 0 | 6 | 2 | 368.385 | Non | Non | Non |
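The Lipinski columns of Table 8 can be checked against the standard rule-of-five thresholds. The sketch below tests only the three criteria tabulated here (at most 10 hydrogen bond acceptors, at most 5 hydrogen bond donors, molecular weight of at most 500 Da). The fourth criterion, logP of at most 5, is omitted because Table 8 does not list logP. The class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    hba: int      # hydrogen bond acceptors
    hbd: int      # hydrogen bond donors
    mw: float     # molecular weight in Da


def lipinski_violations(c: Compound) -> list[str]:
    """Return the violated criteria among the three tabulated in Table 8."""
    violations = []
    if c.hba > 10:
        violations.append("HBA > 10")
    if c.hbd > 5:
        violations.append("HBD > 5")
    if c.mw > 500:
        violations.append("MW > 500 Da")
    return violations


# Values copied from Table 8
compounds = [
    Compound("MTX", 12, 5, 454.447),
    Compound("Pralatrexate", 11, 5, 477.481),
    Compound("Pemetrexed", 6, 6, 427.417),
    Compound("Hit1", 6, 3, 450.714),
    Compound("Hit2", 5, 3, 420.833),
    Compound("Hit3", 6, 2, 368.385),
]

for c in compounds:
    print(c.name, lipinski_violations(c) or "no violations")
```

On these three criteria, each approved drug shows one violation, while the three hit compounds show none.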