Model Optimization and In Silico Analysis of Potential Dipeptidyl Peptidase IV Antagonists from GC-MS Identified Compounds in Nauclea latifolia Leaf Extracts

Dipeptidyl peptidase IV (DPP-IV) is a pharmacotherapeutic target in type 2 diabetes, and inhibitors of this enzyme constitute a new class of drugs used in the treatment and management of the disease. In this study, phytocompounds in Nauclea latifolia (NL) aqueous (NLA) and ethanol (NLE) leaf extracts were identified using gas chromatography–mass spectrometry (GC-MS) and screened as potential antagonists of DPP-IV via in silico techniques. DPP-IV model optimization and molecular docking of the identified compounds and standard inhibitors in the binding pocket were simulated. Drug-likeness, pharmacokinetic and pharmacodynamic properties of promising docked leads were also predicted. Results showed the presence of 50 phytocompounds in NL extracts, of which only 2-O-p-methylphenyl-1-thio-β-d-glucoside, 3-tosylsedoheptulose, 4-benzyloxy-6-hydroxymethyl-tetrahydropyran-2,3,5-triol and vitamin E exhibited comparable or better iGEMDOCK and AutoDock Vina binding scores than the clinically prescribed standards. These four compounds exhibited promising drug-likeness as well as absorption, distribution, metabolism, excretion and toxicity (ADMET) properties, suggesting their candidature as novel leads for developing DPP-IV inhibitors.


Protein Sequence Retrieval and Model Optimization Results
The DPP-IV sequence obtained from NCBI (reference sequence identification number NP_001926.2) was made up of 766 amino acid residues. Querying the sequence returned other proteins with a similar sequence, and SWISS-MODEL corroborated the result. Eleven Homo sapiens DPP-IV templates (1wcy, 3qbj, 2qt9, 2bgr, 5lls, 2gbg, 2jid, 1orv, 3f8s, 5vta and 4ffv) were selected, but 1wcy was chosen as the homology modeling template because of its superiority over the other templates in sequence identity and similarity, global model quality estimate (GMQE), template resolution, quaternary structure quality estimate (QSQE), oligomeric state, local quality estimate and experimental comparison plot (Table 3). Chain A of the modeled DPP-IV protein was selected despite the structural similarity between chains A and B. The modeled protein had GMQE and QSQE scores of 0.99 and 1 respectively. The protein was a homo-dimer with a 2.2 Å resolution and a sequence similarity of 0.62 (Table 3). The model also had a Z-score of less than 1 (Z-score < 1) when compared with other PDB structures and a QMEAN of −0.56. The local quality estimate ranged from 0.7 to 0.9, with a few outliers lower than 0.6 (Figure 3).

Energy Minimization, Physicochemical Analysis and Model Evaluation Results
3Drefine generated five energy-minimized models from the modeled DPP-IV, of which model 5 was selected for further analysis based on its 3Drefine score, RWplus and MolProbity values (Table 4). When this energy-minimized model was superimposed on the modeled DPP-IV and on 1wcy, the root mean square deviation (RMSD) was 0.262 and 0.267 Å respectively (Figure 4).
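The RMSD values quoted above measure the average distance between paired atoms of two structures once they have been superimposed. A minimal Python sketch of that calculation follows; it assumes the two coordinate lists are already superimposed and atom-paired (the superposition step itself is not shown), and the function name `rmsd` and the toy coordinates are illustrative rather than taken from the study.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as the input, e.g. Å)
    between two equally long lists of (x, y, z) coordinates.

    Assumes the structures are already superimposed and that the i-th
    atom in coords_a corresponds to the i-th atom in coords_b.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))

# Toy example: two atoms, each displaced by 1 Å along a different axis.
print(rmsd([(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
           [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]))
```

In this reading, sub-ångström values such as 0.262 and 0.267 Å indicate that energy minimization moved the backbone only slightly away from the starting model and the template.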

From the physicochemical analysis of the selected model, the protein has an atomic composition of 3832 carbons, 5721 hydrogens, 983 nitrogens, 1132 oxygens and 26 sulphurs (Figure 5). This atomic composition results from the amino acid composition, in which serine (8.6%), tyrosine (7.7%) and leucine (7.5%) were the most abundant residues with 63, 56 and 55 residues respectively, while cysteine (1.6%), methionine (1.9%) and histidine (2.6%) were the least abundant with 12, 14 and 19 residues respectively (Figure 6). The protein was acidic, with a molecular weight of 84,506.05 Da and an isoelectric point of 5.61. Additionally, it had an extinction coefficient of 194,190 M⁻¹ cm⁻¹ and an estimated half-life of 1.1 h, 3 min and over 10 h in mammals, yeast and Escherichia coli respectively. The instability index and aliphatic index were 44.19 and 76.87 respectively, with a grand average of hydropathicity of −0.407.
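Two of the indices reported above can be reproduced directly from a sequence. The sketch below assumes the standard definitions: the aliphatic index as X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), with X in mole percent, and the grand average of hydropathicity (GRAVY) as the mean Kyte-Doolittle hydropathy per residue. The function names and the four-residue toy sequence are illustrative; the modeled DPP-IV sequence is not reproduced here.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mole_percent(sequence, residue):
    """Percentage of positions in `sequence` occupied by `residue`."""
    return 100.0 * sequence.count(residue) / len(sequence)

def aliphatic_index(sequence):
    """Aliphatic index: relative volume occupied by Ala, Val, Ile and Leu side chains."""
    return (mole_percent(sequence, "A")
            + 2.9 * mole_percent(sequence, "V")
            + 3.9 * (mole_percent(sequence, "I") + mole_percent(sequence, "L")))

def gravy(sequence):
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)

# Toy sequence (hypothetical, for illustration only).
toy = "AVIL"
print(aliphatic_index(toy), gravy(toy))
```

A negative GRAVY, such as the −0.407 reported for the model, corresponds to a protein that is hydrophilic on average.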
The Ramachandran plot of the minimized modeled DPP-IV showed that 88.8%, 10.6%, 0.3% and 0.3% of the amino acid residues were in the most favored, additional allowed, generously allowed and disallowed regions respectively, compared with 89.6%, 10.1%, 0.2% and 0.2% for the modeled protein and 86.7%, 12.9%, 0.3% and 0.2% for the template (1wcy). The torsion angle (−0.02), covalent geometry (−1.07) and overall average (−0.52) G-factors of the energy-minimized model were lower than those of the modeled DPP-IV (−0.20, 0.11 and −0.07 respectively) and the template 1wcy (0.13, 0.54 and 0.30 respectively; Table 5). The ERRAT quality factor of the minimized modeled DPP-IV was 92.93% (Figure 7), while the Verify 3D score was 95.34% (Figure 8).

Pocket Identification and Molecular Docking Simulation Results
The potential binding pocket of the minimized modeled DPP-IV 3D structure (Figure 9) was identified based on pocket volume and druggability, as simulated using DoGSiteScorer and PockDrug prior to the docking simulations.

Molecular Docking Simulation Results
The iGEMDOCK docking scores ranged from −50 to −90.22 kcal/mol for the GC-MS identified phytocompounds, while −82.55 and −76.74 kcal/mol were recorded for alogliptin and saxagliptin respectively, which are clinically prescribed inhibitors of DPP-IV (Table 6). For AutoDock Vina, nine conformations were obtained for each phytocompound and standard, and the binding energy of the most stable conformation is recorded in Table 6. The binding energy of the phytocompounds ranged from −4.3 to −7.8 kcal/mol, while alogliptin and saxagliptin both had a score of −6.7 kcal/mol. Table 6 further shows that 2-O-p-methylphenyl-1-thio-β-d-glucoside, 3-tosylsedoheptulose, 4-benzyloxy-6-hydroxymethyl-tetrahydropyran-2,3,5-triol and vitamin E had comparable or better iGEMDOCK (−76.67, −79.45, −90.22 and −77.10 kcal/mol respectively) and AutoDock Vina (−6.9, −7, −6.8 and −6.7 kcal/mol respectively) docking scores than either standard, and they were processed further as hits. 3-O-methyl-d-glucose had an iGEMDOCK score (−79.56 kcal/mol) comparable with that of the standard saxagliptin, but its AutoDock Vina score was poor. The inverse was the case for androstan-17-one,16,16-dimethyl-(5-alpha)- and γ-sitosterol, whose AutoDock Vina scores (−7.3 and −7.8 kcal/mol respectively) were considerably better than those of both standards but whose iGEMDOCK scores were poorer.
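The consensus selection described above can be sketched in Python as follows. The scores are the ones quoted in the text; the selection rule (a compound must be comparable to or better than the weaker of the two standards in both programs) and the 0.1 kcal/mol tolerance used to express "comparable" for iGEMDOCK are assumptions made for illustration, not criteria stated by the authors.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DockingScore:
    name: str
    igemdock: float  # kcal/mol, more negative is better
    vina: float      # kcal/mol, more negative is better

# Scores quoted in the text for the two standards and the four hits.
STANDARDS = [
    DockingScore("alogliptin", -82.55, -6.7),
    DockingScore("saxagliptin", -76.74, -6.7),
]
CANDIDATES = [
    DockingScore("2-O-p-methylphenyl-1-thio-beta-d-glucoside", -76.67, -6.9),
    DockingScore("3-tosylsedoheptulose", -79.45, -7.0),
    DockingScore("4-benzyloxy-6-hydroxymethyl-tetrahydropyran-2,3,5-triol", -90.22, -6.8),
    DockingScore("vitamin E", -77.10, -6.7),
]

def is_consensus_hit(candidate, standards, tol_igemdock=0.1, tol_vina=0.0):
    """True if the candidate matches or beats the weakest standard in BOTH programs.

    The tolerances are hypothetical and only encode 'comparable' scores.
    """
    weakest_igemdock = max(s.igemdock for s in standards)
    weakest_vina = max(s.vina for s in standards)
    return (candidate.igemdock <= weakest_igemdock + tol_igemdock
            and candidate.vina <= weakest_vina + tol_vina)

hits = [c.name for c in CANDIDATES if is_consensus_hit(c, STANDARDS)]
print(hits)
```

Requiring agreement between two scoring functions is what excludes compounds that score well in only one program, as reported for 3-O-methyl-d-glucose and γ-sitosterol.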

Druglikeness, Pharmacokinetic and Toxicity Prediction
From the physicochemical properties of the docked ligands identified as potential leads (Table 9), none violated the Lipinski RO5 for drug-likeness except vitamin E, which had a high octanol-water partition coefficient (LogP) of 8.84. All other parameters (molecular weight, LogP, hydrogen bond acceptors and donors) were within the specified Lipinski limits.
Table 9. Physicochemical parameters of potential DPP-IVi hits identified from Nauclea latifolia extracts and their comparison with Lipinski rule details.
From the predicted absorption properties in Table 10, none of the selected leads except vitamin E showed human intestinal absorption or Caco-2 permeability. Nonetheless, all except 3-tosylsedoheptulose were predicted to permeate the blood-brain barrier. The compounds were non-inhibitors of P-glycoprotein and the renal organic cation transporter, and non-substrates of P-glycoprotein. They were also not orally bioavailable at either 20% or 30%, except 2-O-p-methylphenyl-1-thio-β-d-glucoside, which was orally bioavailable at 20%. Regarding distribution, all compounds are likely to localize in the mitochondria except 3-tosylsedoheptulose, whose predicted localization is the lysosome. Predicted plasma protein binding of 55.94%, 49.26%, 43.47% and 84.65% was recorded for 2-O-p-methylphenyl-1-thio-β-d-glucoside, 3-tosylsedoheptulose, 4-benzyloxy-6-hydroxymethyl-tetrahydropyran-2,3,5-triol and vitamin E respectively, while their respective predicted volumes of distribution were −0.496, −1.017, −0.278 and 0.444 L/kg (Table 10). None of the four compounds was a predicted inhibitor of the 1A2, 3A4, 2C9, 2C19 and 2D6 CYP 450 isoforms. Two of them were predicted substrates: vitamin E for the 2C9, 2C19 and 3A4 isoforms, and 3-tosylsedoheptulose for the 2C9 isoform (Table 10).
Clearance rates of 1.55, 0.724, 1.732 and 1.581 mL/min/kg were predicted for 2-O-p-methylphenyl-1-thio-β-d-glucoside, 3-tosylsedoheptulose, 4-benzyloxy-6-hydroxymethyl-tetrahydropyran-2,3,5-triol and vitamin E respectively, while their respective predicted half-lives were 0.98, 1.21, 0.93 and 1.95 h (Table 10). The predicted toxicity profile in Table 10 indicated that the four compounds are non-mutagenic, non-carcinogenic, non-skin sensitizers and non-inhibitors of the human ether-a-go-go-related gene, and are classified as class III acute oral toxicity compounds. They have LD50 values of 736.03, 1146.95, 1967.05 and 1161.96 mg/kg respectively and meet the Food and Drug Administration (FDA) maximum recommended daily dose requirements.
Table 10. Predicted pharmacokinetic and toxicity properties of potential DPP-IVi hits identified from N. latifolia extracts.

Discussion
This study was carried out to identify potent DPP-IV antagonists among the GC-MS identified phytocompounds in N. latifolia aqueous and ethanol leaf extracts. Development of DPP-IVi from plant-based sources is on the increase because of the side effects of clinically approved DPP-IV inhibitors, such as pancreatitis, cardiovascular challenges, and renal and hepatic toxicity [20]. In silico approaches are cheap, proven and energy-saving techniques that have been successfully applied to discover leads from phytocompounds identified in different plant parts or large databases, improving their chances of succeeding in the drug development process [32]. The phytoconstituents found in the ethanol and aqueous extracts of N. latifolia belonged to various phytochemical classes such as alkaloids, phenolics, terpenes/terpenoids, fatty acids and others. The majority of these compounds were either phenolics or terpenes/terpenoids, and their synergistic action may be a major reason N. latifolia leaves have been verified to possess a wide range of pharmacological activities, such as hepatoprotective, antimalarial and antifungal activity, amongst others [31]. Notably, α-terpineol, γ-sitosterol, catechol, vitamin E, phytol, 9,12-octadecadienoic acid (Z,Z)- and dodecanoic acid are bioactives that have been isolated and identified in various plants and fungi and shown to elicit antioxidant, anti-inflammatory and antidiabetic activity by scavenging free radicals, preserving beta-cell function, ameliorating glucose-induced toxicity and attenuating oxidative and inflammatory stress [33][34][35][36][37][38].
SWISS-MODEL uses a modeling pipeline that relies on the OpenStructure comparative modeling engine to extract structural information from various templates and provide the complete stoichiometry and overall structure of the complex as inferred by homology modeling [39,40]. It generates a 3D protein model of the target sequence by extrapolating experimental information from an evolutionarily related protein structure that serves as a template [39]. The modeled DPP-IV protein was made up of two identical polypeptide chains with the same number and order of amino acid residues, signifying a possible homo-dimer protein.
The GMQE and QSQE scores were also very high, indicating structural similarity to the template as well as proper inter-chain interaction in the quaternary structure. The QMEAN and Z-score also show that the modeled protein behaves similarly to experimentally determined DPP-IV PDB structures. Over 99.6% of the modeled protein was deemed to have local similarity to the template, indicating an overall quality structure. The remaining 0.4% of the protein (Ser288, Leu765 and Pro766), which had low quality, played no part in the binding site; the latter two residues are located at the tail end of the protein. The modeled protein was adduced to be acidic because it contained more D and E residues than N and Q residues. Its secondary structure was mainly alpha-helical and beta-sheet, indicating structural orderliness, and its low RMSD signified a good model representation [14]. The residue distribution visualized by the Ramachandran plot shows that over 99% of the modeled 3D structure had high structural integrity, with usual features, since the overall G-factor did not reach the threshold of −0.5. The compatibility of the modeled protein with its amino acid sequence and the observed non-bonded interactions among its atoms signified that the model had good stereochemical quality [41]. This further corroborates the highly reliable quality of the modeled structure. Predicting a drug target's active site with high substrate affinity is an important phase in the discovery of therapeutic compounds, as it improves the clinical progression of compounds amidst the various uncertainties surrounding pocket estimation [42]. Most servers and software for predicting drug target active sites use a single pocket estimation method, so it is advisable to use two or more of them to validate the prediction.
In DoGSiteScorer, a grid-based method incorporating a difference of Gaussian filter is used to detect potential binding pockets, which are then split into subpockets based on the protein 3D structure. A support vector machine with an integrated subset of meaningful descriptors is used to predict druggability scores of the pockets and subpockets [43]. PockDrug detects druggable binding pockets through ligand proximity via several thresholds and the amino acid atoms present at the binding site surface [44]. Despite the differences in the mode of cavity prediction by these two servers, the binding site detected by DoGSiteScorer was validated by PockDrug with an identical cavity location. This step always precedes docking simulations, since docking predicts the interaction of therapeutic ligands or substrates with the binding pocket of their target macromolecule [45]. As with active site prediction, using two or more docking servers or software packages helps to eliminate false negatives and positives. iGEMDOCK and AutoDock Vina predict binding modes between ligands and targets through an evolutionary approach with an empirical scoring function [46] and a sophisticated gradient optimization method in the local optimization procedure [47] respectively. Comparing the respective iGEMDOCK and AutoDock Vina binding values, 2-O-p-methylphenyl-1-thio-β-d-glucoside, 3-tosylsedoheptulose, 4-benzyloxy-6-hydroxymethyl-tetrahydropyran-2,3,5-triol and vitamin E exhibited considerably good binding modes with DPP-IV compared with saxagliptin and alogliptin, which are United States Food and Drug Administration (USFDA) approved DPP-IVi. Saxagliptin and alogliptin belong to the peptide mimetic and non-peptide mimetic classes respectively based on their structure, and to class 1 and class 2 inhibitors respectively based on their inhibitory action [19]. Kalhotra et al. [32] reported the hydrophobic S1, S2, N-terminal recognition and catalytic triad clefts as the four important domains responsible for DPP-IV inhibitory activity. From this study, 2-O-p-methylphenyl-1-thio-β-d-glucoside could be inferred to have a class 1 binding mode because of its simulated interaction with hydrophobic S1 and S2 amino acid residues and its hydrogen bond with Ser630 [48]. The modeled binding mode was similar to that of saxagliptin, as the interaction with Glu205 and Glu206 ensures that both 2-O-p-methylphenyl-1-thio-β-d-glucoside and saxagliptin are aligned in the β-propeller region such that there is a covalent interaction with one or more amino acid residues (Ser630, His740 and Asp708) in the catalytic triad domain [49]. 3-Tosylsedoheptulose had some interactions similar to those of 2-O-p-methylphenyl-1-thio-β-d-glucoside, but its π-π interaction with Tyr547 may suggest a conformational change in the S1 domain, mimicking a class 2 binding mode [50,51]. 4-Benzyloxy-6-hydroxymethyl-tetrahydropyran-2,3,5-triol formed various interactions with the β-propeller, S1 and S2 domains, and its π-π interaction with Tyr547 likewise suggests a class 2 binding mode resulting from a possible S1 domain conformational change [50,51]. In the simulated binding of vitamin E, the common β-propeller, S1 and S2 domain interactions were present but, interestingly, π-π and van der Waals interactions with Phe357 and Ser209 were suggestive of binding with the S2 extensive or S3 domain. This domain has been reported to be larger in DPP-IV than in other DPP isoforms and is therefore able to accommodate the large hydrophobic phytyl side chain of vitamin E, suggesting class 3 binding [50]. Since these hit compounds exhibited lower binding energies and bound tightly to DPP-IV, they may play a role in preventing the rapid degradation of GLP1 and GIP, with a concomitant impact on glycemic homeostasis and ultimately diabetes [8].
To the best of our knowledge, the binding classification of these lead compounds has not been previously reported. It is worthy of note that a similar report was generated for oxidovanadium complexes in the binding pocket of DPP-IV [51]. Meduru et al. [52] previously reported a positive correlation between predicted binding score and experimental activity values, suggesting that the predicted low binding energies of 2-O-p-methylphenyl-1-thio-β-d-glucoside, 3-tosylsedoheptulose, 4-benzyloxy-6-hydroxymethyl-tetrahydropyran-2,3,5-triol and vitamin E on DPP-IV may translate to low inhibitory concentrations (IC50) when validated experimentally. Prediction of physicochemical and ADMET properties is a cheap and time-saving alternative to in vivo testing in developing therapeutic leads. This process helps eliminate compounds whose poor pharmacokinetic characteristics or high toxicity in biological systems would lead to failed drug development [53]. The Lipinski RO5 states that the molecular weight, octanol-water partition coefficient, hydrogen bond acceptors and hydrogen bond donors of a compound should be no more than 500 g/mol, 5, 10 and 5 respectively. However, a compound is still accepted if it violates only one parameter [27,54]. All the compounds passed the RO5, suggesting they will likely be orally active drugs [54,55]. TPSA and LogP are important parameters that determine how compounds are absorbed [56]. Vitamin E had the highest LogP and lowest TPSA, while 3-tosylsedoheptulose had the lowest LogP and highest TPSA. Owing to these physicochemical properties, all compounds except 3-tosylsedoheptulose were predicted to permeate the blood-brain barrier (BBB), with vitamin E being the most permeable. Thus, these compounds could also have therapeutic applications in neurodegenerative conditions. As regards Caco-2 permeability and human intestinal absorption, only vitamin E was positive, suggesting easy absorption of this compound at the brush border of the intestinal wall [57]. The other leads could, however, make use of carrier-mediated transport because they are less lipophilic than vitamin E [58]. Since the identified leads were neither substrates nor inhibitors of P-glycoprotein, did not inhibit the renal organic cation transporter, the human ether-a-go-go-related gene, CYP1A2 or CYP2D6, and showed low CYP inhibitory promiscuity, they are unlikely to cause multidrug resistance, impaired drug metabolism or elevated toxicity [45,59]. In addition, the risk of CYP drug interactions via the various isoforms found in different body tissues is greatly reduced, as non-inhibitors of CYP 450 isoforms do not impede the biotransformation of CYP 450 metabolized drugs [56,60]. Compounds that inhibit CYP 450 not only decrease enzymatic activity but also lead to the accumulation of drugs to toxic levels [61]. The leads were not predicted to show carcinogenicity, mutagenicity, skin sensitization or oral toxicity. They meet the USFDA maximum recommended daily dose and have a high LD50, suggesting that they have a low chance of causing toxicity.

Chemicals and Reagents
Ethanol used was of analytical grade and obtained from BDH chemicals, Poole, England.

GC-MS Analysis
NLA and NLE were subjected to GC-MS analysis using a GCMS-QP2010SE (Shimadzu, Japan) with a fused Optima-5MS capillary column of 30 m length, 0.25 mm diameter and 0.25 µm film thickness. GC conditions: pure helium (1.56 mL/min flow rate and 37 cm/s linear velocity); injector temperature (200 °C); column oven temperature (initially 60 °C, increased to 160 °C then 250 °C at 10 °C/min with 2 min/increment hold time); injection volume and split ratio (0.5 µL and 1:1 respectively). MS conditions: ion source and interface temperature (230 °C and 250 °C respectively), solvent delay (4.5 min), recorded in a scan range of 50-700 amu. Unknown constituents were identified by comparing the retention time, mass spectral data and fragmentation pattern of the extracts with established libraries (National Institute of Standards and Technology (NIST) and Wiley libraries) [63].

Protein Sequence Retrieval, Model Optimization and Energy Minimization
The DPP-IV sequence was obtained from NCBI (Available online: https://www.ncbi.nlm.nih.gov/protein) through its NCBI reference sequence identification number (NP_001926.2). The obtained sequence was queried using the basic local alignment search tool (BLASTp) against the PDB (Available online: http://www.pdb.org) to identify related protein structural templates [68]. Homology modeling was performed via SWISS-MODEL to generate an optimized model [39]. The 3D structure of DPP-IV was modeled on a deposited crystal structure of DPP-IV from Homo sapiens selected for its resolution, sequence identity, domain coverage and E-value after the BLAST search. The energy of the modeled DPP-IV structure was minimized using 3Drefine [69].

Model Evaluation, Physicochemical Analysis and Pocket Identification
The stereochemical stability of the minimized DPP-IV protein was evaluated using PROCHECK [70], ERRAT [71] and Verify_3D [72], while the physicochemical properties were analyzed using ProtParam [73]. DoGSiteScorer [43] was used to identify possible active druggable binding sites of the protein, with further verification using PockDrug [44].

Molecular Docking Simulation
Molecular docking of the modeled ligands in the binding pocket of DPP-IV was carried out using iGEMDOCK v2.1 [46]. A population size of 300, 80 generations and 10 solutions were the parameters assigned to screen the ligands in comparison with clinically approved DPP-IVi (Saxagliptin and Vildagliptin). To further validate the results and avoid false positives, the ligands were also docked using AutoDock Vina [47]. Prior to the AutoDock Vina docking simulations, AutoDock 4.2 [66,67] was used to compute Gasteiger charges, assign nonpolar hydrogens and set the grid map at 18 × 24 × 26 with a spacing of 1 Å.
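As an illustration of how such a search box is passed to AutoDock Vina, the Python sketch below writes a Vina configuration file using Vina's documented options (`receptor`, `ligand`, `center_x/y/z`, `size_x/y/z`, `num_modes`). It assumes that the 18 × 24 × 26 grid at 1 Å spacing corresponds to a box of 18 × 24 × 26 Å and that nine binding modes are requested. The box center is not reported in the text, so it must be supplied by the user; the file names and the zero center in the usage example are hypothetical placeholders.

```python
def vina_config(receptor, ligand, center, size=(18.0, 24.0, 26.0), num_modes=9):
    """Return the text of an AutoDock Vina configuration file.

    receptor, ligand: paths to PDBQT files (hypothetical names in the example).
    center: (x, y, z) of the search box in Å; NOT given in the source text.
    size: box edge lengths in Å (assumed equal to the 18 x 24 x 26 grid at 1 Å spacing).
    """
    center_x, center_y, center_z = center
    size_x, size_y, size_z = size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {center_x}",
        f"center_y = {center_y}",
        f"center_z = {center_z}",
        f"size_x = {size_x}",
        f"size_y = {size_y}",
        f"size_z = {size_z}",
        f"num_modes = {num_modes}",
    ]
    return "\n".join(lines) + "\n"

# Placeholder center and file names; replace with the pocket center found for the model.
config_text = vina_config("dpp4_minimized.pdbqt", "ligand.pdbqt", (0.0, 0.0, 0.0))
print(config_text)
```

The resulting text can be saved to a file and passed to Vina with its `--config` option.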

Druglikeness, Pharmacokinetic and Toxicity Prediction
Lipinski rule of five (RO5) [27], ADMETlab [74] and admetSAR [75] were used to predict various drug-likeness, pharmacokinetic and toxicity parameters respectively of the docked ligands that had a better binding fit than the clinically approved DPP-IVi.
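The rule-of-five screen can be sketched in Python as below, using the standard limits (molecular weight no more than 500 g/mol, LogP no more than 5, no more than 10 hydrogen bond acceptors and no more than 5 hydrogen bond donors) and tolerating a single violation. The class and function names are illustrative. In the example, the LogP of 8.84 for vitamin E is the value reported in this study, whereas the molecular weight and hydrogen bond counts are commonly cited values for α-tocopherol and are assumptions, since Table 9 is not reproduced here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Descriptors:
    mol_weight: float   # g/mol
    logp: float         # octanol-water partition coefficient
    h_acceptors: int    # hydrogen bond acceptors
    h_donors: int       # hydrogen bond donors

def ro5_violations(d):
    """List the Lipinski rule-of-five limits that the compound exceeds."""
    violations = []
    if d.mol_weight > 500:
        violations.append("molecular weight > 500 g/mol")
    if d.logp > 5:
        violations.append("LogP > 5")
    if d.h_acceptors > 10:
        violations.append("hydrogen bond acceptors > 10")
    if d.h_donors > 5:
        violations.append("hydrogen bond donors > 5")
    return violations

def passes_ro5(d, max_violations=1):
    """A compound passes if it violates at most `max_violations` limits."""
    return len(ro5_violations(d)) <= max_violations

# LogP from this study; other descriptors assumed for alpha-tocopherol.
vitamin_e = Descriptors(mol_weight=430.71, logp=8.84, h_acceptors=2, h_donors=1)
print(ro5_violations(vitamin_e), passes_ro5(vitamin_e))
```

Under these assumptions vitamin E exceeds only the LogP limit, so it still passes the screen with one tolerated violation.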

Conclusions
This study identified 50 phytoconstituents in N. latifolia leaf extracts using GC-MS, supporting the therapeutic potential of this plant against diabetes and other ailments. However, only four phytocompounds were found to have binding scores comparable with those of two clinically prescribed DPP-IVi as well as promising ADMET properties. To the best of our knowledge, this is the first time the binding classification of these four phytocompounds has been reported. Nonetheless, close attention should be given to vitamin E concentration during the lead optimization process, because of its high BBB permeability and its CYP substrate ability, to avoid eliciting adverse effects. Molecular mechanics energies combined with the Poisson-Boltzmann or generalized Born and surface area continuum solvation (MM-PBSA/GBSA) and molecular dynamics analyses are required to corroborate these results. Further in vitro and in vivo research can be done to confirm and validate the pharmacological significance of these lead compounds for further development as potent DPP-IV antagonistic drugs.