Finding a Potential Dipeptidyl Peptidase-4 (DPP-4) Inhibitor for Type-2 Diabetes Treatment Based on Molecular Docking, Pharmacophore Generation, and Molecular Dynamics Simulation

Dipeptidyl peptidase-4 (DPP-4) is the vital enzyme that is responsible for inactivating intestinal peptides glucagon like peptide-1 (GLP-1) and Gastric inhibitory polypeptide (GIP), which stimulates a decline in blood glucose levels. The aim of this study was to explore the inhibition activity of small molecule inhibitors to DPP-4 following a computational strategy based on docking studies and molecular dynamics simulations. The thorough docking protocol we applied allowed us to derive good correlation parameters between the predicted binding affinities (pKi) of the DPP-4 inhibitors and the experimental activity values (pIC50). Based on molecular docking receptor-ligand interactions, pharmacophore generation was carried out in order to identify the binding modes of structurally diverse compounds in the receptor active site. Consideration of the permanence and flexibility of DPP-4 inhibitor complexes by means of molecular dynamics (MD) simulation specified that the inhibitors maintained the binding mode observed in the docking study. The present study helps generate new information for further structural optimization and can influence the development of new DPP-4 inhibitors discoveries in the treatment of type-2 diabetes.


Introduction
Worldwide 314 million individuals are suffering from (metabolic disorder) type-2 diabetes mellitus, which has been classified as a disease of glucose overproduction by tissues lacking enough insulin production [1]. Insulin resistance is one of the major associated conditions for most patients with the disorder, and is generally present in people suffering with obesity [2]. The infirmity of insulin secretion and insulin action on glucose uptake in muscles leads to type-2 diabetes mellitus [3]. Type-2 diabetes is radically increasing throughout the world and, according to the estimates of the International Diabetes Federation (IDF), by the year 2030 the total number of people with diabetes is estimated to reach 552 million [4].
The function of glucagon like peptide-1 (GLP-1) and its potential role in treating type-2 diabetes was first reported in late 1990s [5]. Glucagon-like peptide-1 (GLP-1) and glucose-dependent insulinotropic peptide (GIP) are intestinally derived incretin hormones, which physiologically arouse insulin emissions from β-cells postprandially (Figure 1) [6]. GLP-1 has multiple advantageous effects on beta-cell chore and mass by stimulating beta-cell development from precursor cells and inhibiting After the ingestion of a meal, ileal L-cells produce active GLP-1 and influence its action through: (1) stimulation of insulin liberation; (2) glucagon release inhibition; and (3) slowing down the gastric emptying process [9]. GLP-1 analogues and DPP-4 inhibitors belong to one of the incretin-based therapies for type-2 diabetes [10,11]. GLP-1 analogues (liraglutide and exenatide) are not suitable for oral delivery because of their biochemical nature, and must be injected subcutaneously on a daily basis. The usage of GLP-1 analogues was limited to agents because of its requirement for injection, whereas competitive DPP-4 inhibitors, which are chemically plagiaristic, are administered orally [9,10,12].
Human DPP-4 is signified as a single polypeptide chain of 766 amino acids, which consists of a N-terminal cytoplasmic domain, a trans-membrane domain, a β-propeller domain, and a α/β-hydrolase domain [11,13]. DPP-4 has different types of known substrates and contains growth factors like chemokines, neuropeptides, and vasoactive peptides [14]. Reports show that DPP-4 has a significant function in glucose metabolism; furthermore, it is responsible for degrading incretins like GLP-1 and GIP through cleaving their amino acids which have proline or alanine in the second position ( Figure 2) [15,16]. To overcome this predicament, the inhibition of the enzyme DPP-4, which prevents the inactivation of GLP-1, thus intensifies and prolongs the action of the incretin hormone.
A computational approach involving molecular docking refined with molecular dynamics simulations as well as a pharmacophore analysis could pave the way for new insights on potential DPP-4 inhibitors for type-2 diabetes treatments, as successfully described in the literature about other case studies [17][18][19][20]. Finding small molecular inhibitors through computer-aided technologies has been successful for disease diagnosis and therapeutic interventions in previous research [21][22][23][24][25]. With this in mind, in silico studies have been deployed for the identification of DPP-4 inhibitors.
In the present study, we have reported molecular interaction studies including molecular docking and molecular dynamics to order to understand the stability of the complex. Furthermore, pharmacophore generation was used in order to recognize how structurally diverse compounds bind in the specific receptor site. The, integration of these methods might help to develop a potential DPP-4 inhibitor for treating type-2 diabetes. After the ingestion of a meal, ileal L-cells produce active GLP-1 and influence its action through: (1) stimulation of insulin liberation; (2) glucagon release inhibition; and (3) slowing down the gastric emptying process [9]. GLP-1 analogues and DPP-4 inhibitors belong to one of the incretin-based therapies for type-2 diabetes [10,11]. GLP-1 analogues (liraglutide and exenatide) are not suitable for oral delivery because of their biochemical nature, and must be injected subcutaneously on a daily basis. The usage of GLP-1 analogues was limited to agents because of its requirement for injection, whereas competitive DPP-4 inhibitors, which are chemically plagiaristic, are administered orally [9,10,12].
Human DPP-4 is signified as a single polypeptide chain of 766 amino acids, which consists of a N-terminal cytoplasmic domain, a trans-membrane domain, a β-propeller domain, and a α/β-hydrolase domain [11,13]. DPP-4 has different types of known substrates and contains growth factors like chemokines, neuropeptides, and vasoactive peptides [14]. Reports show that DPP-4 has a significant function in glucose metabolism; furthermore, it is responsible for degrading incretins like GLP-1 and GIP through cleaving their amino acids which have proline or alanine in the second position ( Figure 2) [15,16]. To overcome this predicament, the inhibition of the enzyme DPP-4, which prevents the inactivation of GLP-1, thus intensifies and prolongs the action of the incretin hormone.
A computational approach involving molecular docking refined with molecular dynamics simulations as well as a pharmacophore analysis could pave the way for new insights on potential DPP-4 inhibitors for type-2 diabetes treatments, as successfully described in the literature about other case studies [17][18][19][20]. Finding small molecular inhibitors through computer-aided technologies has been successful for disease diagnosis and therapeutic interventions in previous research [21][22][23][24][25]. With this in mind, in silico studies have been deployed for the identification of DPP-4 inhibitors. In the present study, we have reported molecular interaction studies including molecular docking and molecular dynamics to order to understand the stability of the complex. Furthermore, pharmacophore generation was used in order to recognize how structurally diverse compounds bind in the specific receptor site. The, integration of these methods might help to develop a potential DPP-4 inhibitor for treating type-2 diabetes.

Molecular Docking
In the present study, we collected structurally diverse small molecules, such as aminopiperidine fused imidazoles, thiazolopyrimidine derivatives, and quinolin-fused imidazoles ( Figure S1), from the literature [26][27][28]. Molecular docking was performed on these inhibitors in the effort to study the binding modes and to reveal the most essential residues involved in binding interactions. The following amino acids were involved in H-bond interactions: Arg125, Glu205, Glu206, Ser209, Arg358, Tyr547, Tyr631, Tyr662, Tyr666, and Asn710, (these amino acids are active site pocket residues of 2P8S) [29].
A linear equation was developed for the predicted binding affinities (pKi) decision by using CDOCKER and experimental activity values (pIC50) (Figure 3). Between pIC50 and the pKi of 31 diverse inhibitors, a linear correlation was obtained that yielded a good correlation coefficient (R 2 = 0.72). Moreover, it can be observed that the 31 compounds are well strewn around the fitting line and without significant outliers.

Molecular Docking
In the present study, we collected structurally diverse small molecules, such as aminopiperidine fused imidazoles, thiazolopyrimidine derivatives, and quinolin-fused imidazoles ( Figure S1), from the literature [26][27][28]. Molecular docking was performed on these inhibitors in the effort to study the binding modes and to reveal the most essential residues involved in binding interactions. The following amino acids were involved in H-bond interactions: Arg125, Glu205, Glu206, Ser209, Arg358, Tyr547, Tyr631, Tyr662, Tyr666, and Asn710, (these amino acids are active site pocket residues of 2P8S) [29].
A linear equation was developed for the predicted binding affinities (pK i ) decision by using CDOCKER and experimental activity values (pIC 50 ) ( Figure 3). Between pIC 50 and the pK i of 31 diverse inhibitors, a linear correlation was obtained that yielded a good correlation coefficient (R 2 = 0.72). Moreover, it can be observed that the 31 compounds are well strewn around the fitting line and without significant outliers.

Molecular Docking
In the present study, we collected structurally diverse small molecules, such as aminopiperidine fused imidazoles, thiazolopyrimidine derivatives, and quinolin-fused imidazoles ( Figure S1), from the literature [26][27][28]. Molecular docking was performed on these inhibitors in the effort to study the binding modes and to reveal the most essential residues involved in binding interactions. The following amino acids were involved in H-bond interactions: Arg125, Glu205, Glu206, Ser209, Arg358, Tyr547, Tyr631, Tyr662, Tyr666, and Asn710, (these amino acids are active site pocket residues of 2P8S) [29].
A linear equation was developed for the predicted binding affinities (pKi) decision by using CDOCKER and experimental activity values (pIC50) (Figure 3). Between pIC50 and the pKi of 31 diverse inhibitors, a linear correlation was obtained that yielded a good correlation coefficient (R 2 = 0.72). Moreover, it can be observed that the 31 compounds are well strewn around the fitting line and without significant outliers.  The top 10 compounds were selected from the 31 compounds for the present study by CDOCKER energy scores (Table 1) and whether they could bind with a 2P8S receptor to form more stable complexes than three existing drugs: sitagliptin (´39.43 kcal/mol), alogliptin (´25.64 kcal/mol), and vildagliptin (´5.64 kcal/mol). In this study Comp71 has the lowest CDOCKER energy score (´47.22 kcal/mol) with seven H-bonds (Table 1, Figure 4). The top 10 compounds were selected from the 31 compounds for the present study by CDOCKER energy scores (Table 1) and whether they could bind with a 2P8S receptor to form more stable complexes than three existing drugs: sitagliptin (−39.43 kcal/mol), alogliptin (−25.64 kcal/mol), and vildagliptin (−5.64 kcal/mol). In this study Comp71 has the lowest CDOCKER energy score (−47.22 kcal/mol) with seven H-bonds (Table 1, Figure 4).

Pharmacophore Generation
In a receptor site, a set of structural features in a molecule is recognized and is responsible for that molecule's biological activity-this set of structural features is called a pharmacophore. The

Pharmacophore Generation
In a receptor site, a set of structural features in a molecule is recognized and is responsible for that molecule's biological activity-this set of structural features is called a pharmacophore. The generated pharmacophore models based on receptor-ligand interactions have confirmed all substantial interactions in the compound-receptor interaction modes. The number of features, feature set, and selectivity score from pharmacophore generation is reported in Table 2. The ranking of pharmacophores are based on selectivity (arbitrary) scores-the higher the better. The top ten compounds with the highest arbitrary scores were chosen out of 31 compounds, and seven of them are all better than the existing drugs sitagliptin (5.63), vildagliptin (7.07), and alogliptin (8.19). In fact, Comp73 has highest selectivity score 11.72 and possesses five features: HB_ACCEPTOR (Hydrogen bond acceptor), HB_DONAR (Hydrogen bond donor), NEG_IONIZABLE (Negative ionizable feature), POS_IONIZABLE (Positive ionizable feature), and RING_AROMATIC (Aromatic ring). Our analysis identified oxygen, as specified as HB_ACCEPTORS (green spheres), are common pharmacophore features for all the compounds. Interactions between Comp73 and 2P8S can be explained by the recognition of POS_IONIZABLE features (red sphere), RING_AROMATIC features (orange sphere), NEG_IONIZABLE feature (blue sphere), and HB_DONOR (magenta sphere) features at nitrogen positions N12, the ring aromatic region, and N5 and N28 of Comp73, respectively ( Figure 5). generated pharmacophore models based on receptor-ligand interactions have confirmed all substantial interactions in the compound-receptor interaction modes. The number of features, feature set, and selectivity score from pharmacophore generation is reported in Table 2. The ranking of pharmacophores are based on selectivity (arbitrary) scores-the higher the better. The top ten compounds with the highest arbitrary scores were chosen out of 31 compounds, and seven of them are all better than the existing drugs sitagliptin (5.63), vildagliptin (7.07), and alogliptin (8.19). In fact, Comp73 has highest selectivity score 11.72 and possesses five features: HB_ACCEPTOR (Hydrogen bond acceptor), HB_DONAR (Hydrogen bond donor), NEG_IONIZABLE (Negative ionizable feature), POS_IONIZABLE (Positive ionizable feature), and RING_AROMATIC (Aromatic ring). Our analysis identified oxygen, as specified as HB_ACCEPTORS (green spheres), are common pharmacophore features for all the compounds. Interactions between Comp73 and 2P8S can be explained by the recognition of POS_IONIZABLE features (red sphere), RING_AROMATIC features (orange sphere), NEG_IONIZABLE feature (blue sphere), and HB_DONOR (magenta sphere) features at nitrogen positions N12, the ring aromatic region, and N5 and N28 of Comp73, respectively ( Figure 5).

Molecular Dynamics Simulation
The top 10 high binding compounds from each CDOCKER and pharmacophore simulations were considered, out of which five compounds (Comp65, Comp71, Comp72, Comp73, and Comp74) were the same in both simulations (Tables 1 and 2). As such, a total of 15 compounds and three existing inhibitors were further evaluated through MD simulations. Analysis of MD simulation results at different time points over the simulation trajectory confirm that H-bonds are formed and interrupt with protein side chains. The total number of intermolecular H-bonds at different time points for each ligand are specified in Table 3. For Comp70, the minimum intermolecular H-bonds with 2P8S at the active site region were seven to a maximum of 12 during the simulation trajectory. H-bonds were not constant and were disrupted during the simulation period; however, three stable H-bonds in Comp70 were formed with Glu205, Glu206, and Arg358. At the end of 1000 ps simulation, Ser209 and Ser630 were also significant residues in contributing to the 12 H-bonds. However, the H-bonds formed between residues Tyr631 and Asn710 with Comp70 were only at 200 ps in the simulation trajectory. The number of H-bonds varies from the CDOCKER and MD simulation because of diverse algorithms in each simulation. Protein-ligand complexes and individual root mean square deviations (RMSDs) of ligands were obtained through MD simulation and are shown in Figures 6 and 7. For each ligand the moderate RMSD value was calculated over the simulation trajectory. The average RMSD values of protein-ligand complexes and individual ligands are mentioned in Table 4. Similar RMSD values for individual ligands were observed and receptor-ligand complexes suggest the ligands can have stable interactions with their specific receptors.  The total energies of 18 protein-ligand complexes through MD simulation listed in Table 3 with H-bonding and energies are also shown in Figure 8    The total energies of 18 protein-ligand complexes through MD simulation listed in Table 3 with H-bonding and energies are also shown in Figure 8    The total energies of 18 protein-ligand complexes through MD simulation listed in Table 3 with H-bonding and energies are also shown in Figure 8 at different time levels over 1000 ps simulation. In molecular dynamics simulation, Comp70 (´157,714.28 kcal/mol), Comp54 (´156,891.72 kcal/mol), and Comp63 (´156,672.61 kcal/mol) have the lowest total energies compared to the existing drugs vildagliptin (´156,620.70 kcal/mol), alogliptin (´156,613.52 kcal/mol), and sitagliptin (´156,192.52 kcal/mol). Among these three compounds in this study molecular dynamics results show that Comp70 with 12 H-bonding interaction during the MD simulation trajectory is an aminopiperidine fused imidazole and has the lowest total energy. Furthermore, Arg125, Glu205, Ser209, Arg358, Tyr547, Ser630, Tyr666, and His740 are the crucial amino acids (Figure 9) in formation of 12 H-bonding interactions with Comp70 during MD simulation. Amino acids Arg125, Arg358, Tyr547, and Ser630 are participating in both CDOCKER and MD simulations during the H-bond formation. Therefore, these significant residues might give guidance to develop more potent DPP-4 inhibitors. Comp70 has showed better simulation results than the existing drugs vildagliptin, alogliptin, and sitagliptin.   Furthermore, Arg125, Glu205, Ser209, Arg358, Tyr547, Ser630, Tyr666, and His740 are the crucial amino acids (Figure 9) in formation of 12 H-bonding interactions with Comp70 during MD simulation. Amino acids Arg125, Arg358, Tyr547, and Ser630 are participating in both CDOCKER and MD simulations during the H-bond formation. Therefore, these significant residues might give guidance to develop more potent DPP-4 inhibitors. Comp70 has showed better simulation results than the existing drugs vildagliptin, alogliptin, and sitagliptin. Furthermore, Arg125, Glu205, Ser209, Arg358, Tyr547, Ser630, Tyr666, and His740 are the crucial amino acids (Figure 9) in formation of 12 H-bonding interactions with Comp70 during MD simulation. Amino acids Arg125, Arg358, Tyr547, and Ser630 are participating in both CDOCKER and MD simulations during the H-bond formation. Therefore, these significant residues might give guidance to develop more potent DPP-4 inhibitors. Comp70 has showed better simulation results than the existing drugs vildagliptin, alogliptin, and sitagliptin.

Small-Molecules Preparation
We collected 82 small molecule compounds to inhibit DPP-4 enzyme from the literature [26][27][28]. By using the equation ∆G exp =´RTln(IC 50 ), IC 50 values of inhibitors were converted to binding free energies (∆G exp ). Structures of inhibitors and biological activities (IC 50 ) were shown in Figure S1. The compounds were drawn through Accelrys Draw v4.1 [30,31] to get SMILE formats, and simultaneously through Accelrys Discovery Studio v4.1, the SMILE formats were converted to .sdf files. Compounds were prepared using the "Prepare Ligand" protocol in Discovery Studio v4.1 with default parameters towards performing the CDOCKER simulation [32].

DPP-4 Model Preparation
In the present study DPP-4 was retrieved from RCSB PDB (entry code: 2P8S) with the cyclohexalamine inhibitor. 2P8S is a dimer of two chains, only the monomeric unit was used in the docking studies. The binding site (β-amino butyl amide moiety) for 2P8S was experimentally verified and the coordinates of the validated cyclohexalamine complex with 2P8S are available online [29]. The bounded cyclohexalamine inhibitor from the 2P8S structure was removed for docking experiments. Prepare Protein protocol performed tasks such as inserting missing atoms in incomplete residues, modeling missing loop regions, and removing waters from protein. The default parameter values were mostly kept the same in the "Prepare Protein" protocol from Discovery Studio v4.1 for DPP-4 model preparation.

Molecular Docking
Molecular docking of 82 selected DPP-4 inhibitors (from literature) to the 2P8S receptor was carried out in the present study by using CDOCKER with Discovery Studio v4.1. To perform flexible docking, for the small-molecules all torsion angles were set to be free. CDOCKER is a powerful CHARMm-based docking method that has been used to generate highly accurate docked poses. In this refinement application, the ligands were conceded to tilt around the rigid receptor [32]. In the CDOCKER simulation the following parameters were used: Top Hits-10; RandomConformations-10; Orientations to Refine-10; Force field-CHARMm; and Use Full Potential-False. For the ascertainment of potential correlations between experimental activities and predicted K i values the best docked conformations of small-molecule inhibitors were selected as preliminary binding conformations.

Pharmacophore Generation
Receptor-ligand interactions are understood in detail with the advancement of pharmacophore generation. Through the pharmacophore model, the binding of structurally diverse ligands to common receptors active sites can be visualized [33]. In the present study we performed Receptor-Ligand Pharmacophore Generation with Discovery Studio v4.1 to know the interactions between protein and ligand. The Receptor-Ligand Pharmacophore Generation generates a set of selective pharmacophore models from a receptor-ligand complex. The model is generated from the features that correspond to the receptor-ligand interactions (CDOCKER interactions). The following ligand features types were considered: Hydrogen bond acceptor (HB_ACCEPTOR), Hydrogen bond donor (HB_DONOR), Hydrophobic feature (HYDROPHOBIC), Negative ionizable feature (NEG_IONIZABLE), Positive ionizable feature (POS_IONIZABLE), and Aromatic ring (RING_AROMATIC) [34]. The ranking of the top pharmacophore models was based on sensitivity and selectivity measurements. The selectivity of enumerated pharmacophore models is assessed with the Genetic Function Approximation (GFA) model. In Discovery Studio, the GFA model is implanted by the default settings of the GFA algorithm [35,36]. From a training set of 1544 pharmacophore models, the GFA model for selectivity of a pharmacophore is constructed. This set is used for searching a database of 5384 diverse, drug-like molecules and the logarithmic values of the number of database search hits are used as the targets (GFAscore). According to the following Equation (1), selectivity is derived from the GFAscore: Selectivity " 11´GFAscore (1) 11 is a constant which ensures the selectivity score to result in positive values.

Molecular Dynamics Simulations
To know the interaction strength and stability of the receptor-ligand complex molecular dynamics (MD) simulation studies were conceded. Discovery Studio v4.1 was utilized as a platform to perform MD simulations by means of Standard Dynamics Cascade. Prior to performing MD simulations, CHARMm polar hydrogen force field was applied to each of the protein-ligand complexes and solvation was set to explicit periodic boundary. The parameters for MD simulations were set with following conditions: Both steepest descent of energy minimization and steps of conjugate gradient minimization were kept to 300 in order to obtain constant and reasonable conformation of biomolecules. The system was heated from an initial temperature of 50 K to the target temperature of 300 K within 4 ps, and the equilibration steps were set to 1000. Moreover, the parameters of electrostatics were chosen as Particle Mesh Ewald (PME) for long-range electrostatic constrains and apply SHAKE Constraint was chosen as true, respectively. The total production time of 1000 ps simulations were performed with a type of run set as NVE (dynamics without temperature/pressure control). Default setting values were adopted for other parameters [37]. To analyze root mean square deviations (RMSDs) of protein-ligand complexes and ligands, we utilized the functions of Analyze Trajectory. After MD simulation, analyze trajectory functions were used to get RMSDs of protein-ligand complexes and ligands, total energies, as well as H-bonding interactions. All the computational procedures were performed on a dual-processor workstation (Intel(R)Xeon, 2.40 GHz) with a memory size of 16 GB.

Conclusions
In this study, we have explored structurally diverse compounds from the literature and performed molecular docking (CDOCKER) which revealed the importance of DPP-4 central β-amino butyl amide moiety for binding modes. Following this procedure, several selected compounds were submitted to pharmacophore generation and MD simulations. Notably, this analysis allowed us to identify DPP-4 inhibitors Comp68, Comp70, and Comp71 with effective selectivity scores and MD total energies. Moreover MD results of Comp70 showed the lowest total energy value with 12 H-bonding interactions and the ligand was stable according to its RMSD during MD simulation. Based on these computational methods, we have recognized that Comp70 might have the possibility to become an effective inhibitor for DPP-4, which belongs to the class of aminopiperidine-fused imidazoles. Computational analysis such as quantitative structure activity relationship (QSAR) and biological investigations are necessary in the further study.