In Silico Study Probes Potential Inhibitors of Human Dihydrofolate Reductase for Cancer Therapeutics

Dihydrofolate reductase (DHFR) is an essential cellular enzyme and thereby catalyzes the reduction of dihydrofolate to tetrahydrofolate (THF). In cancer medication, inhibition of human DHFR (hDHFR) remains a promising strategy, as it depletes THF and slows DNA synthesis and cell proliferation. In the current study, ligand-based pharmacophore modeling identified and evaluated the critical chemical features of hDHFR inhibitors. A pharmacophore model (Hypo1) was generated from known inhibitors of DHFR with a correlation coefficient (0.94), root mean square (RMS) deviation (0.99), and total cost value (125.28). Hypo1 was comprised of four chemical features, including two hydrogen bond donors (HDB), one hydrogen bond acceptor (HBA), and one hydrophobic (HYP). Hypo1 was validated using Fischer’s randomization, test set, and decoy set validations, employed as a 3D query in a virtual screening at Maybridge, Chembridge, Asinex, National Cancer Institute (NCI), and Zinc databases. Hypo1-retrieved compounds were filtered by an absorption, distribution, metabolism, excretion, and toxicity (ADMET) assessment test and Lipinski’s rule of five, where the drug-like hit compounds were identified. The hit compounds were docked in the active site of hDHFR and compounds with Goldfitness score was greater than 44.67 (docking score for the reference compound), clustering analysis, and hydrogen bond interactions were identified. Furthermore, molecular dynamics (MD) simulation identified three compounds as the best inhibitors of hDHFR with the lowest root mean square deviation (1.2 Å to 1.8 Å), hydrogen bond interactions with hDHFR, and low binding free energy (−127 kJ/mol to −178 kJ/mol). Finally, the toxicity prediction by computer (TOPKAT) affirmed the safety of the novel inhibitors of hDHFR in human body. Overall, we recommend novel hit compounds of hDHFR for cancer and rheumatoid arthritis chemotherapeutics.


Introduction
Dihydrofolate reductase (DHFR) is a ubiquitous enzyme and exists in a wide range of organisms [1]. DHFR is crucial for proper cellular growth and proliferation, where it regulates deviation (RMSD). Each hit molecule established strong hydrogen bond(s) with the catalytic active residues of the hDHFR. The binding free energy calculations suggested that the final hit molecules strongly bind hDHFR. Overall, we recommend that the newly identified candidate hit molecules of hDHFR may serve as the fundamental approaches to design more efficient and safe inhibitors of the hDHFR. Such inhibitors may have potential applications in cancer chemotherapy.

Collection of Dataset
A dataset of 67 known inhibitors of hDHFR was accumulated from literature mining (https://www.bindingdb.org/bind/index.jsp) and categorized into two distinctive data sets: (i) a training set and (ii) a test set. The training set compounds were sub-categorized into active (IC 50 < 100 nM/L), moderate active (100 nM/L < IC 50 < 500 nM/L), and inactive (IC 50 ≥ 500 nM/L) compounds on the basis of structural diversity and experimentally determined IC 50 values. The training set compounds were used for pharmacophore generation. The test set compounds were used to validate the hypotheses. The 2D structures of all the compounds were drawn by Biovia Draw and subsequently converted to their 3D structures in Discovery Studio v4.5 (DS).

Generation of Pharmacophore Model
In this study, the 3D QSAR Pharmacophore Generation module of DS was used to generate ligand-based pharmacophoric hypotheses. In brief, the possible pharmacophoric features of the training set compounds were identified with Feature Mapping protocol of DS. The training set was used for hypotheses generation through the HypoGen algorithm of 3D QSAR Pharmacophore Generation module, which was implanted in DS. During the hypotheses generation, the minimum and maximum numbers of features were set to 0 and 5, respectively. BEST algorithm of DS was employed to generate low energy conformations of the training set compounds. Uncertainty value was set to 3 while other parameters had default values. Hypotheses were generated with their resultant statistical parameters such as cost values (fixed cost and null cost), correlation (R 2 ), root mean square deviation (RMSD), and fit values. Cost values were analyzed as per Debnath's method [26].

Pharmacophore Validation
Hypothesis validation is carrying out to evaluate the quality of pharmacophore in computational drug designing procedures. The top-ranked hypothesis was validated by Fisher's randomization, test set method, and decoy test method. The statistical significance of the model was computed by employing Fischer's randomization method. The confidence level was set to 95% and 19 random spreadsheets were generated by arbitrarily reassigning the experimental activity values to each compound in the training set. The significance of hypothesis is calculated as: where X denotes the total number of hypotheses with a total cost lower than the original hypothesis and Y represents the total number of HypoGen runs (initial + random runs). Here, X = 0 and Y = (1 + 19), hence 95% = {1 − [(1 + 0)/(19 + 1)]} × 100. In test set validation, the selected pharmacophore hypothesis was used to predict the activity values of the test set compounds. The predicted and experimental activity values were plotted to observe the range of correlation between them. Additionally, the decoy set method was recruited to further affirm the robustness of the model and was estimated as: where D represents the total number of molecules in the data set, A indicates the total number of active molecules (hDHFR inhibitors) in the data set, Ht denotes the total number of hits retrieved, and Ha refers to the number of active molecules present in the retrieved hits. Furthermore, the efficacy of the Hypo1 was determined by the enrichment factor (EF) score and the goodness of fit (GF) score.

Virtual Screening and Drug-Likeness Prediction
The validated hypothesis was used as a 3D structural query in the virtual screening of chemical databases, including Chembridge, NCI, Asinex, Maybridge, and ZINC database. For virtual screening, the Ligand Pharmacophore Mapping tool of DS was employed. During the screening, Conformation Generation and Fitting Methods were kept to BEST and Flexible modes, respectively. The Maximum Omitted Feature was set to 0.
Chemical compounds should have acceptable physiochemical, pharmacokinetic, and pharmacodynamics properties to be used as drug molecules. The drug-like properties of the screened compounds were evaluated with ADMET Descriptors and Lipinski's rule of five modules of DS. During the ADMET filtration, estimated values of blood-brain barrier (BBB), optimal solubility, and absorption levels were set to 3, 3, and 0, respectively. Other parameters such as CYP2D6 prediction and hepatotoxicity prediction were assigned as false. In Lipinski's rule of five, the protocol was optimized as the number of hydrogen bond donors was less than 5, number of hydrogen bond acceptor was less than 10, molecular weight was equal or less than 500 Da, and AlogP was selected to equal or less than 5. The successful molecules were assigned as the hit compounds and subjected to further analyses.

Molecular Docking Simulation
Docking studies were performed using Genetic Optimization of Ligand Docking (GOLD) package v5.2.2 (The Cambridge Crystallographic Data Centre) [27]. For molecular docking, the crystal structure of hDHFR in complex with inhibitor N6-(2,5-Dimethoxy-Benzyl)-N6-Methyl-Pyrido[2,3-D]Pyrimidine-2,4,6-Triamine (PRD) was taken from protein data bank (PDB ID: 1BOZ). During the selection of hDHFR structure, it was confirmed that when the structure was completed, no mutations were observed, having high resolution and the bound inhibitor. The structure of hDHFR was prepared for docking by removing water molecules and hetero atoms in DS. Chemistry at Harvard macromolecular mechanisms (CHARMm) force field was used to add hydrogen atoms to the structure of hDHFR. The binding site of hDHFR was identified from the bound inhibitor (PRD) using the Define and Edit Binding Site module, seeded in DS. During docking, the hit compounds along with the training set compounds were treated as ligand molecules. The Goldfitness score and Chemscore functions of GOLD were assigned as the default scoring and rescoring functions, respectively. For each ligand, a total of 250 conformers were generated using Genetic Algorithm (GA) of the GOLD package. Docking results were analyzed with DS.

Molecular Dynamics (MD) Simulation
MD simulations were carried out with a CHARMm27 all-atom force field [28] in Groningen Machine for Chemical Simulation (GROMACS) v5.0.6 package [29]. For each ligand, an independent simulation system was generated. The topology and coordinates files for the co-factor (NADPH) of hDHFR and hit molecules were generated using SwissParam. Each system was solvated with transferable intermolecular potential with three points (TIP3P) water model in a octahedral box. Solvent molecules were replaced with sodium ions (Na + ) to neutralize the simulation systems. The energy minimization of the simulation systems were carried out using the steepest descent algorithm at a maximum force less than 10 kJ/mol to avoid any possible bad contacts during the production phase. Equilibration of simulation systems was performed in two phases. First, number of particles at constant volume and temperature (NVT) equilibration was conducted for 300 ps at 300 K. A V-rescale thermostat was used to maintain constant temperature [30]. Secondly, number of particles at constant pressure and temperature (NPT) equilibration was performed for 300 ps at constant pressure of 1 bar [31]. Finally, each system was simulated while following the protocol previously described. In brief, the Linear constraint solver (LINCS) algorithm was used to preserve the bond length of heavy atoms. The long range electrostatic interactions were computed by employing the mesh ewald (PME) method. The threshold value of short-range interactions was kept to 14 Å. All simulations were executed under the periodic boundary conditions to make infinite systems. The coordinate data were saved at 1 ps time interval. The results were visualized using GROMACS and DS.

Binding Free Energy Calculations
The Molecular Mechanics/Poisson-Boltzmann Surface Area (MM/PBSA) method was used for binding energy calculations [32]. Equidistant snapshots of hDHFR-ligand complex were extracted and MM/PBSA protocol was followed for binding energy calculations.
The binding free energy of protein-ligand complex is stated as: where, G complex refers to the sum of the free energy of the complex, and G protein and G ligand indicate the free energies of portion and ligand in their unbound state. The free energy can be expressed as: where X can be a protein, ligand, or their complex. E MM represents the average molecular mechanics potential energy in vacuum, while G solvation interprets the free energy of solvation. Additionally, molecular mechanics potential energy in vacuum can be evaluated by adapting the equation: E bonded represents the bonded interactions, while E non-bonded depicts the non-bonded interactions. The value of ∆E bonded is mostly treated as zero.
The combined energetic terms of electrostatic (G polar ) and apolar (G non-polar ) solvation free energies forms the solvation free energy and is measured as: The Poisson-Boltzmann (PB) equation is employed to compute G polar , while the G non-polar is calculated from the solvent accessible surface area (SASA).
Furthermore, the G non-polar calculated as: where, γ represents the coefficient of solvent surface tension, while b is its fitting parameter, whose values are 0.02267 kJ/mol/ Å2 and 3.849 kJ/mol, respectively.

Generation of Pharmacophore Model
The training set was designed by selecting compounds from the literature mined dataset. The selection was made on the basis of structural diversity and variation in IC 50 values of the hDHFR inhibitors. The training set was comprised of 27 inhibitors of the hDHFR and their experimental IC 50 values were determined using the same biological assays. The 2D structures and anti-DHFR activities (IC 50 values) ranged from 0.19 to 10,000 nM/L of the training set compounds (Figure 1). The hDHFR inhibitors of the training set were sub-divided into active (IC 50 < 100 nM/L, +++), moderately active (100 nM/L < IC 50 < 500 nM/L, ++), and inactive (IC 50 ≥ 500 nM/L, +) compounds. A total number of hypotheses were generated using a 3D QSAR Pharmacophore Generation module of DS, using the training set compounds (Figure 1). Our results observed that the statistical parameters of Hypo1 included configuration cost (17.06), total cost (125.276), close to the fixed cost (109.01), and away from the null cost (203.67). Futher, a high correlation coefficient of 0.94 went along with a large cost difference (78.308) and a low RMS value (0.99) (     The comparative analysis suggested that Hypo1 is the best representative hypothesis, presenting a good geometric spatial agreement with the active site of hDHFR and comprised of four chemical features including 1 hydrogen bond acceptor (HBA), 2 hydrogen bond donors (HBD), and 1 hydrophobic (HYP) (Figure 2A). Hypo1 was selected as the representative hypothesis for further analysis. It is assumed that a best hypothesis (pharmacophore) estimates either the same or closed activity scale (pIC 50 value) to experimentally determine the IC 50 value of the tested compound; it is characterized as a predictive accuracy. To elucidate the predictive accuracy of Hypo1, we estimated the activity values of the training set compounds. Our results demonstrated that Hypo1 estimated the same range of inhibitory activity for all the training set compounds except for one active and two moderately active compounds, which were underestimated as moderately active and inactive compounds, respectively ( Table 2). The fit value shows the strength of mapping of the correspondent compound onto Hypo1 (Table 2). Error measures the difference between the experimentally determined IC 50 and the predicted IC 50 values of the tested compound. The activity scale is represented as active (+++), moderately active (++), and inactive (+) ( The comparative analysis suggested that Hypo1 is the best representative hypothesis, presenting a good geometric spatial agreement with the active site of hDHFR and comprised of four chemical features including 1 hydrogen bond acceptor (HBA), 2 hydrogen bond donors (HBD), and 1 hydrophobic (HYP) (Figure 2A). Hypo1 was selected as the representative hypothesis for further analysis. It is assumed that a best hypothesis (pharmacophore) estimates either the same or closed activity scale (pIC50 value) to experimentally determine the IC50 value of the tested compound; it is characterized as a predictive accuracy. To elucidate the predictive accuracy of Hypo1, we estimated the activity values of the training set compounds. Our results demonstrated that Hypo1 estimated the same range of inhibitory activity for all the training set compounds except for one active and two moderately active compounds, which were underestimated as moderately active and inactive compounds, respectively ( Table 2). The fit value shows the strength of mapping of the correspondent compound onto Hypo1 (Table 2). Error measures the difference between the experimentally determined IC50 and the predicted IC50 values of the tested compound. The activity scale is represented as active (+++), moderately active (++), and inactive (+) ( Table 2).  inverse if the ratio is <1. b Activity scale: IC 50 < 100 nM/L = +++ (active), 100 nM/L < IC 50 < 500 nM/L = ++ (moderate active), IC 50 ≥ 500 nM/L = + (inactive).
Additionally, mapping of the training set by Hypo1 showed that all the hypothetical features of Hypo1 were perfectly mapped by the most active (IC 50 = 0.19 nM/L) compound ( Figure 2B), whereas the inactive (IC 50 = 10,000 nM/L) compound failed to fit on one HBA and one HYP features ( Figure 2C). This analysis advocates that Hypo1 has high accuracy to effectively differentiate the chemical compounds on the basis of activity values.

Fischer's Randomization Test
Fischer's test was carried out to estimate the statistical significance of Hypo1. At 95% confidence level, 19 random spreadsheets were estimated. The estimated random spreadsheets indicate total cost values of 10 generated hypotheses ( Figure 3). The comparative analysis showed that the total cost value of Hypo1 was significantly lower than the generated random spreadsheets. Therefore, we argue that Hypo1 is far superior and was not generated by chance.
Fischer's test was carried out to estimate the statistical significance of Hypo1. At 95% confidence level, 19 random spreadsheets were estimated. The estimated random spreadsheets indicate total cost values of 10 generated hypotheses (Figure 3). The comparative analysis showed that the total cost value of Hypo1 was significantly lower than the generated random spreadsheets. Therefore, we argue that Hypo1 is far superior and was not generated by chance.

Test Set Validation
Hypo1 was validated after evaluating its ability to measure the activity range of the test set compounds. The test set was comprised of 40 chemically diverse compounds and sub-divided into active, moderately active, and inactive compounds ( Figure S1, Table S1). Our findings observed that Hypo1 estimated the acceptable activity range of the entire test set compounds except for three molecules (Table S1). Furthermore, a high correlation coefficient (R 2 ) value of 0.92 was obtained by linear regression between the experimental inhibitory activities and predicted inhibitory activities of the test set compounds (Figure 4). Our results affirmed that Hypo1 has a strong capacity to differentiate active compounds from moderately active and inactive compounds.

Test Set Validation
Hypo1 was validated after evaluating its ability to measure the activity range of the test set compounds. The test set was comprised of 40 chemically diverse compounds and sub-divided into active, moderately active, and inactive compounds ( Figure S1, Table S1). Our findings observed that Hypo1 estimated the acceptable activity range of the entire test set compounds except for three molecules (Table S1). Furthermore, a high correlation coefficient (R 2 ) value of 0.92 was obtained by linear regression between the experimental inhibitory activities and predicted inhibitory activities of the test set compounds (Figure 4). Our results affirmed that Hypo1 has a strong capacity to differentiate active compounds from moderately active and inactive compounds.

Decoy Set Validation
Decoy set validation was performed using a Ligand Pharmacophore Mapping module in DS. The strength of Hypo1 was determined by four parameters such as false positive, false negative, enrichment factor (EF), and goodness of fit (GF). EF and GF were calculated from the values of different parameters given in table 3. Other characteristics of Hypo1 including percentage of number of active yields (%Y), percent ratio of actives in the hit list (%A), false negatives, and false positives were also measured ( Table 3).

Decoy Set Validation
Decoy set validation was performed using a Ligand Pharmacophore Mapping module in DS. The strength of Hypo1 was determined by four parameters such as false positive, false negative, enrichment factor (EF), and goodness of fit (GF). EF and GF were calculated from the values of different parameters given in Table 3. Other characteristics of Hypo1 including percentage of number of active yields (%Y), percent ratio of actives in the hit list (%A), false negatives, and false positives were also measured ( Table 3). Hypo1 retrieved 87% active compounds from the decoy set. Hypo1 predicted 1 active compound as inactive (false negative) and 1 inactive compound as active (false positive) compound. Hypo1 achieved a high GF score of 0.86. Similarly, Hypo1 obtained enrichment factor (EF) score of 8.23 (Table 3). These observations suggested that Hypo1 had high potential to show true positives.

Virtual Screening of Chemical Databases
Chemical features of pharmacophore play an essential role in the screening of chemical databases and identify the best-fitted compounds from a database. Herein, Hypo1 was employed to screen NCI, Maybridge, Chembridge, Asinex, and ZINC databases, which contains 238,819, 59,652, 50,000, 213,262, and 137,077 compounds, respectively ( Figure 5). Hypo1 retrieved a total of 32,617 compounds, which were subsequently subjected to the ADMET assessment test and Lipinski's Rule of five. The ADMET assessment test evaluated the pharmacokinetic properties of Hypo1-retrieved compounds. During the ADMET assessment test, compounds were checked for non-inhibition to CYP2D6 and non-hepatotoxicity. Other pharmacokinetic properties included blood brain barrier (BBB), optimal solubility, and good intestinal absorption, evaluated by setting their values to 3, 3, and 0, respectively. The ADMET assessment test was followed by Lipinski's rule of five. During Lipinski's rule of five filtration, compounds with AlogP value less than 5, number of HBD < 5, number of HBA < 10, molecular weight < 500 Da, and fewer than ten rotatable bonds were identified [33,34]. Finally, a total of 479 molecules satisfied the drug-like properties ( Figure 5). 50,000, 213,262, and 137,077 compounds, respectively ( Figure 5). Hypo1 retrieved a total of 32,617 compounds, which were subsequently subjected to the ADMET assessment test and Lipinski's Rule of five. The ADMET assessment test evaluated the pharmacokinetic properties of Hypo1-retrieved compounds. During the ADMET assessment test, compounds were checked for non-inhibition to CYP2D6 and non-hepatotoxicity. Other pharmacokinetic properties included blood brain barrier (BBB), optimal solubility, and good intestinal absorption, evaluated by setting their values to 3, 3, and 0, respectively. The ADMET assessment test was followed by Lipinski's rule of five. During Lipinski's rule of five filtration, compounds with AlogP value less than 5, number of HBD < 5, number of HBA < 10, molecular weight < 500 Da, and fewer than ten rotatable bonds were identified [33] [34]. Finally, a total of 479 molecules satisfied the drug-like properties ( Figure 5).

Molecular Docking Simulation
To unveil the binding mode of true positive candidate hits of hDHFR, the Hpo1-retrieved druglike hits (479) were subjected to molecular docking. The 3D structure of hDHFR in complex with inhibitor PRD was taken from protein databank (PDB ID: 1BOZ) [35]. The resolution of the selected structure is 2.1 Å. The docking protocol was optimized by docking the co-crystal bound inhibitor (PRD) in the active site of hDHFR. The docked pose of the PRD obtained an acceptable RMSD value

Molecular Docking Simulation
To unveil the binding mode of true positive candidate hits of hDHFR, the Hpo1-retrieved drug-like hits (479) were subjected to molecular docking. The 3D structure of hDHFR in complex with inhibitor PRD was taken from protein databank (PDB ID: 1BOZ) [35]. The resolution of the selected structure is 2.1 Å. The docking protocol was optimized by docking the co-crystal bound inhibitor (PRD) in the active site of hDHFR. The docked pose of the PRD obtained an acceptable RMSD value of 0.92 Å with the crystallographic pose of PRD in the active site of the hDHFR ( Figure S2A). Furthermore, the potentiality of the generated pharmacophore (Hypo1) model was assessed by its comparative analysis with the docked pose of the reference compound (the most active compound of the training set). Herein, the superimposition of Hypo1 and the docked pose of the reference compound observed an acceptable RMSD value of 1.16 Å ( Figure S2B). Therefore, the Hypo1-retrieved drug-like (candidate) compounds were docked while employing the same optimized protocol. The docking results showed that the Goldfitness score and Chemscore of the reference compound in the training set were 44.67 and −23.35, respectively, and used as cut-off values during the docking results analysis (Table 4). The candidate compounds were selected based on a Goldfitness score greater than 44.67, Chemscore lower than −23.35, and the ligand conformations satisfying the necessary interactions with essential residues in the active site of the hDHFR. During virtual screening, hit molecules with estimated IC 50 values less than the estimated IC 50 value of the highly active compound were considered. Our results also showed that the selected hit compounds had a lower estimated IC 50 values than the estimated IC 50 value (0.19 nM/L) of the highly active compound of the training set (Table 4). Finally, three compounds were selected as the best hit compounds against the hDHFR, which also mapped well onto the pharmacophoric features of Hypo1 ( Figure 6). The candidate compounds were selected based on a Goldfitness score greater than 44.67, Chemscore lower than −23.35, and the ligand conformations satisfying the necessary interactions with essential residues in the active site of the hDHFR. During virtual screening, hit molecules with estimated IC50 values less than the estimated IC50 value of the highly active compound were considered. Our results also showed that the selected hit compounds had a lower estimated IC50 values than the estimated IC50 value (0.19 nM/L) of the highly active compound of the training set (Table 4). Finally, three compounds were selected as the best hit compounds against the hDHFR, which also mapped well onto the pharmacophoric features of Hypo1 ( Figure 6).

Molecular Dynamics Simulation
MD simulations were performed to evaluate the binding stability of the novel hits in the active site of the hDHFR. A total of four MD simulation systems were prepared as one for each hit compound and one for the reference compound. The preliminary details of each system are provided in Table 5. Table 5. The specifications of four systems used for molecular dynamics simulations.

Molecular Dynamics Simulation
MD simulations were performed to evaluate the binding stability of the novel hits in the active site of the hDHFR. A total of four MD simulation systems were prepared as one for each hit compound and one for the reference compound. The preliminary details of each system are provided in Table 5. The stability of simulation system was assessed by measuring the root mean square deviation (RMSD) of the backbone atoms of the hDHFR. The simulation results observed that the root mean square deviation values of backbone atoms of hDHFR ranged from 1.2 Å to 1.8 Å throughout the simulation; each system was well converged ( Figure 7A). For further analyses, the representative structure of each system was taken from the last 5 ns trajectory. The superimposition of representative structures observed that the binding pattern and conformational orientation of hit compounds in the active site of hDHFR were similar to the reference compound ( Figure 8). The substrate binding site of DHFR is comprised of Ile7, Glu30, Phe34, Ser59, Pro61, Asn64, and Val115 [36]. Our results suggested that the substrate binding residues of hDHFR could bind the reference compound and the hit molecules. The reference compound formed hydrogen bonds with Ile7, Glu30, and Val115 of hDHFR ( Figure 9A, Table 6).
Furthermore, the reference compound established π-cation interactions with Ile7, Phe34, Ile60, Pro61, Leu67, and Val115 and showed van der Waals interactions with Val8, Gly31, Asn64, Gly116, Tyr121, and NADPH (Table 6). Hit1 formed hydrogen bond interactions with Leu27, Glu30, Ser59 and van der Waals interaction with Val8, Ile16, Gly17, Asp21, Asp21, Pro26, Gly31, Gln35, and Thr136 ( Figure 9B, Table 6). Our findings also suggested that Hit1 formed carbon-hydrogen bonds with Pro61, Ile60, and π-cation interactions with Ala9, Leu22, Pro23, Phe34, Val115, and NADPH. Hit2 formed hydrogen bonds with Trp24, Glu30, Thr56, and NADPH as well as carbon-hydrogen bonds with Gly31, Ser59, Ile60, and Pro61 ( Figure 9C, Table 6). Additionally, Hit2 showed interactions with the hydrophobic pocket residues of hDHFR such as Ala9 and Gln351, as well as π-π interactions with Phe34, Met52, Ile60, Leu67, and Val115 (Table 6). Hit3 established hydrogen bonds with Glu30 and Asn64 ( Figure 9D, Table 6). Hit3 showed hydrophobic interactions with Val259, Ala271, Glu288, Met292, Ile314, Phe318, Asn369, Leu371, Ala381, and Phe383 residues of the hDHFR. The benzene ring of Hit3 participated in π-cation interactions with Ile7, Ala9, Ile60, and Phe34. Hit3 exhibited van der Waals interactions with the residues forming the hydrophobic core of hDHFR and NADPH, while one carbon-hydrogen bond with Glu30 (Table 6). During the entire simulation period, the total number of intermolecular hydrogen bonds of hDHFR with each hit compound were measured. Our results observed that the hit compounds formed comparatively high number of H-bonds with hDHFR than the reference compound ( Figure 7B). To assist the novelty of hit compounds against the hDHFR, PubChem structure was used [34]. Our search confirmed that the newly identified hit compounds have not been previously tested to inhibit hDHFR. Therefore, we recommend these hit compounds as potential candidate inhibitors of hDHFR. The 2D structures of hit compounds are shown in Figure 10. structure of each system was taken from the last 5 ns trajectory. The superimposition of representative structures observed that the binding pattern and conformational orientation of hit compounds in the active site of hDHFR were similar to the reference compound ( Figure 8). The substrate binding site of DHFR is comprised of Ile7, Glu30, Phe34, Ser59, Pro61, Asn64, and Val115 [36]. Our results suggested that the substrate binding residues of hDHFR could bind the reference compound and the hit molecules. The reference compound formed hydrogen bonds with Ile7, Glu30, and Val115 of hDHFR ( Figure 9A, Table 6).    All the compounds in their representative structures were superimposed (left) and enlarged (right). 3D structure of hDHFR is shown in light grey color. Red, magenta, green, and orange colors represent the reference compound, Hit1, Hit2, and Hit3, respectively.

Binding Free Energy Analysis
The binding free energy of hDHFR in complex with hit molecules ranged from −127.05 kJ/mol to −178.47 kJ/mol ( Figure 11A). Our findings observed that the average binding free energy for hDHFR in complexes with reference compound, Hit1, Hit2, and Hit3 were −127.05 kJ/mol, −178.47 kJ/mol, −171.12 kJ/mol, and −133.16 kJ/mol, respectively. The decomposition analysis of the binding

Binding Free Energy Analysis
The binding free energy of hDHFR in complex with hit molecules ranged from −127.05 kJ/mol to −178.47 kJ/mol ( Figure 11A). Our findings observed that the average binding free energy for hDHFR in complexes with reference compound, Hit1, Hit2, and Hit3 were −127.05 kJ/mol, −178.47 kJ/mol, −171.12 kJ/mol, and −133.16 kJ/mol, respectively. The decomposition analysis of the binding free energy suggested that electrostatic and van der Waals forces are pre-dominant factors in DHFR inhibition ( Figure 11B, Table 7).    Overall, our results demonstrated that the newly identified hits preferentially accommodated the active site of hDHFR and established polar as well as non-polar interactions with the catalytic active residues.

Toxicity Evaluation by TOPKAT
Toxicity prediction by komputer assisted technology (TOPKAT) is a toxicity prediction tool implanted in DS. TOPKAT is used by universities, private companies, and government agencies such as Amgen, Pfizer, and US CDC to evaluate the toxicity of target molecules. Our results suggested that the newly identified hit compounds of hDHFR are non-carcinogenic and do not pose any skin irritancy. The safety of novel hits of hDHFR was evaluated using several parameters in Table 8. Table 8. Comparison of absorption, distribution, metabolism, excretion, and toxicity (ADMET), Lipinski's rule of five, and toxicity prediction by komputer assisted technology (TOPKAT) predictions for Federal Drug Agency (FDA) approved drugs and hit compounds. that the generated hypothesis is able to differentiate active compounds from the inactive compounds. The least cost value for Hypo1 in Fischer's randomization validation and high GF score in the decoy test validation also suggested its suitability to use for virtual screening of chemical databases. The validated pharmacophore was employed as 3D query for virtual screening of chemical databases. This implies that the Hypo1-retrieved hits might retain the same or improved therapeutic ability corresponding to the known inhibitors of hDHFR [43]. Molecular docking approach predicts the most compatible binding conformation of ligand molecule in the binding site of the receptor protein. Accordingly, the best poses from docking on the basis of scoring functions and essential interactions with the active site residues of hDHFR were subjected to MD simulations for stability analysis [44]. RMSD plots implied that the hits showed similar mode of interaction as the reference compound. Specifically, the average RMSD profiles (<0.25 nm) obtained for the backbone atoms of DHFR suggested that all the systems were uniform because the stability of the system can be inferred by its RMSD value less than 0.3 nm [45]. Our results showed that all the hit compounds established stable H-bonds with the active site residues of hDHFR. Intriguingly, Hit2 formed highest number of hydrogen bonds with hDHFR and NADPH. Such pattern of high number of H-bonding of inhibitors was previously observed in the development of small molecule inhibitors of DHFR of Mycobacterium tuberculosis [46].
In human dihydrofolate reductase, Glu30 is an essential residue in the active site and forms hydrogen bond with inhibitor(s) [18,46]. We observed that Glu30 demonstrated consistent H-bond interaction with the reference and hit compounds. It is investigated that the carboxylate oxygen of Glu30 makes H-bond interaction with the pyridopyrimidine ring of inhibitor. Likewise, N1 nitrogen and 2-NH2 of the same inhibitor form H-bonds with OE2 and OE1 of Glu30, respectively [35]. Our results also demonstrated that N21 nitrogen of Hit1 formed hydrogen bond with OE1 of Glu30 with bond distance of 1.91 Å. The interaction analysis displayed a H-bond between the oxygen atom (O27) of Hit2 and oxygen atom (OE1) of Glu30 with the bond distance of 2.03 Å. Additionally, another H-bond was observed between the O25 of Hit2 and OE2 of Glu30 with the bond distance of 1.03 Å. Our results also observed H-bond between the O11 atom of Hit3 and OE2 of Glu30. Afterward, the hydrogen bond distance in the reference and the co-crystal was observed as 2.64 Å and 2.71 Å, respectively. Therefore, we have argued that the novel hits may have a higher affinity towards DHFR than the reference compound. Furthermore, binding free energy calculations using MM/PBSA also suggested that the complexes of hDHFR with the hit compounds were more stable than the reference [47].
Therefore, the known inhibitors of hDHFR are not safe and showed severe side effects and as a result we carried out safety measures for the newly identified inhibitors. The comparative analyses of the Federal Drug Agency's (FDA) approved drugs and the novel hits predicted that the hit compounds of hDHFR were safe and do not pose severe side effects [46,47]. The QSAR-based system generates and validates accurate and rapid assessments of chemical toxicity solely from a molecular structure. This assessment module combined with ADMET and Lipinski's rule of five prediction modules exhibited that the hit compounds have acceptable pharmacokinetic properties. Finally, we recommend the newly identified hits of hDHFR as fundamental platforms to develop effective chemotherapeutics in cancer and rheumatoid medications.