Structural Dynamics of DPP-4 and Its Influence on the Projection of Bioactive Ligands

Dipeptidyl peptidase-4 (DPP-4) is a therapeutic target for type II diabetes mellitus. It is therefore important to understand the structural aspects of this enzyme and its interactions with drug candidates. This study combined molecular dynamics simulations, normal mode analysis, binding site detection and analysis of molecular interactions to understand the protein dynamics. We identified functional motions of DPP-4 that contribute to the exposure of the binding sites, as well as twist movements that reveal how the two enzyme chains, defined as chains A and B (residues 40–767 each), are interconnected in the bioactive form. Understanding the enzyme structure, its motions and the regions of its binding sites can contribute to the design of new DPP-4 inhibitors as drug candidates to treat diabetes.


Introduction
Type II diabetes mellitus, a chronic metabolic disease related to hyperglycemia, is estimated to reach 642 million cases worldwide by the year 2040 [1]. Inhibition of the dipeptidyl peptidase-4 (DPP-4) enzyme is currently employed in the treatment of this disease. However, some of the drugs available on the market can cause anemia, neuropathic risk, pancreatitis and nausea [2]. To understand these side effects, an accurate analysis of the structural characteristics of this biological target is required, since its mechanism of action is not yet fully elucidated. Understanding the functional motions of the enzyme may therefore contribute to the design of new and effective drug candidates with fewer side effects.

Calculation of Normal Modes
The analysis of the normal modes revealed two important functional movements of the DPP-4 enzyme: a torsion (twist) and an opening motion that exposes the active site. Both movements are present in all systems (with or without an inhibitor). Residues located in two regions are involved in the movement that exposes the active site: the β chain region (Glu91, Asn92, Ser93, Phe95, Asp96 and Glu97) and the α-helix region (Ser745, Thr746, Ala747, His748, Gln749, His750, Ile751, Tyr752, Thr753, His754, ...).
[Figure caption, beginning lost] ... FTSite [18-20] and FTMap [18,20-22], where regions 1 and 2 (colored in salmon and green, respectively) correspond to the active sites described in the literature (sites 1 and 2) and region 3 (colored in blue) is an alternative binding site (site 3) not described in the literature. (C) Key residues of the active site of DPP-4. Region 3 is a possible candidate for an allosteric binding site.
By analyzing the overlaid graphs of Cα atom fluctuations of chains A and B (Figure 3), we detected slight differences in the flexibility profiles of the two DPP-4 chains, regardless of the presence of the N7F inhibitor. The magnitudes of the flexibility peaks differ between the two chains and, to our knowledge, this characteristic has not been reported previously. To understand this phenomenon, we used several tools, namely BINANA [17], FTSite [18-20] and FTMap [18,20-22].
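The text does not state how the Cα fluctuation profiles were computed. As a hedged illustration, one common way to obtain such a profile from normal modes is to sum the squared displacement of each atom over the modes, weighted by the inverse eigenvalue. The sketch below assumes Cα-only eigenvectors stored as a (3N × M) array, with chain A listed before chain B. The function names `mode_fluctuations` and `chain_difference` are hypothetical, and the fluctuations are in arbitrary units (the thermal prefactor is omitted).

```python
import numpy as np

def mode_fluctuations(eigvals, eigvecs):
    """Per-atom mean-square fluctuation (arbitrary units) from normal modes.

    eigvals: (M,) non-zero eigenvalues; eigvecs: (3N, M) mode vectors.
    """
    eigvals = np.asarray(eigvals, dtype=float)
    eigvecs = np.asarray(eigvecs, dtype=float)
    n_atoms = eigvecs.shape[0] // 3
    # Squared displacement of every atom in every mode: shape (N, M).
    sq = (eigvecs.reshape(n_atoms, 3, -1) ** 2).sum(axis=1)
    # Low-frequency (small eigenvalue) modes contribute the most.
    return (sq / eigvals).sum(axis=1)

def chain_difference(fluct, n_per_chain):
    """Residue-by-residue difference (chain A minus chain B)."""
    chain_a = fluct[:n_per_chain]
    chain_b = fluct[n_per_chain:2 * n_per_chain]
    return chain_a - chain_b
```

Positive entries of `chain_difference` mark positions where chain A fluctuates more than chain B, which is the kind of difference in peak magnitude discussed above.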

Interactions between DPP-4 and the Inhibitor N7F
We used BINANA [17] to analyze the main molecular interactions of the N7F inhibitor with the DPP-4 enzyme, comparing them when the ligand is in the A chain or in the B one. The results obtained are shown in Table 1.

[Table 1: molecular interactions of the N7F inhibitor with chains A and B of DPP-4; only one entry survives (Glu205, HB).]
In the A chain of DPP-4, we found more hydrogen bond (HB) interactions than in the B chain, and π-π T-shaped-like (PIT-like) interactions were present only in the A chain. In contrast, hydrophobic contacts, π-π stacking (PIS) interactions and salt bridges were more numerous in the B chain. We therefore speculate that new inhibitors should contain functional groups able to establish a similar interaction profile, since they are expected to bind in both DPP-4 active sites (chains A and B).
We also used BINANA [17] to analyze the molecular interactions in 84 crystallographic structures of the human DPP-4 enzyme with their respective ligands, in order to verify whether there were other cases in which a given ligand established different interactions in the A and B chains. We found that, in some cases, the type of target/ligand interaction differed between the DPP-4 chains (these results are given in Table S1, Supporting Information).
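A chain-by-chain comparison of this kind amounts to tallying interaction types per chain and reporting those whose counts differ. The sketch below assumes the interactions have already been extracted into (chain, type) pairs. That record format and the function names are hypothetical and do not reflect the BINANA output format. The labels HB, PIS and PIT follow the abbreviations used in the text.

```python
from collections import Counter

def tally_by_chain(interactions):
    """Count interaction types per chain from (chain, type) records."""
    counts = {}
    for chain, kind in interactions:
        counts.setdefault(chain, Counter())[kind] += 1
    return counts

def differing_types(counts, chain_a="A", chain_b="B"):
    """Return {type: (count in A, count in B)} for types whose counts differ."""
    a = counts.get(chain_a, Counter())
    b = counts.get(chain_b, Counter())
    return {k: (a[k], b[k]) for k in sorted(set(a) | set(b)) if a[k] != b[k]}

# Hypothetical records, for illustration only (not data from this study).
example = [("A", "HB"), ("A", "HB"), ("B", "HB"), ("A", "PIT"), ("B", "PIS")]
example_diff = differing_types(tally_by_chain(example))
```

Running the same tally over each of the crystallographic complexes would flag the ligands whose interaction profile is chain dependent.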
In chain B, three similar sites were predicted, with the addition of Gly549 at site 1. Residues such as Arg125, Phe357, Arg358, Tyr547, Ser630, Tyr631, and His740 were detected as part of more than one binding site. Most of the residues detected at sites 1 and 2 have already been described in the literature as belonging to the active site region [5-11]. The third site does not correspond to the non-catalytic sites of DPP-4 described in the literature that promote intermolecular interactions with adenosine deaminase (ADA) (composed of Asn281, Leu294, Leu340, Val341, Ala342, and Arg343) [12,13].
By analyzing the results obtained from FTSite and FTMap with the standard probe molecules (described in the methodology), it was possible to determine which probes show the greatest affinity for the different binding sites (Figure 4).

Protocol Overview-Calculation of Normal Modes (NM)
The calculation of normal modes makes it possible to investigate large-amplitude motions around an equilibrium structure [23-26]. Because DPP-4 is a large system (two chains with more than 700 residues each), several procedures were necessary prior to the calculation of the normal modes: (1) selection of the crystallographic structure: PDB code 4A5S [14,15], with a resolution of 1.62 Å (the same 3D structure used in previous studies [6]), which has a structurally interesting bound ligand (with simultaneous affinity towards the S1 and S2 sites); (2) preparation of four molecular systems: (a) a dimer containing an inhibitor only in chain A of the protein; (b) a dimer containing an inhibitor only in chain B; (c) a dimer in which both chains contain an inhibitor; (d) a dimer without an inhibitor; (3) use of the CHARMM-GUI server (Quick MD Simulator) [27,28] to generate the input files for minimization with GROMACS [29-32], specifying the disulfide bridges, an octahedral water box, and the addition of 0.15 M KCl, with the ion positions determined by a Monte Carlo method; (4) execution of the minimization with GROMACS (5.1.1) [29-32] using these input files; (5) minimization of the equilibrated structure obtained in the previous simulation using the CHARMM-GUI (PDB Reader) [27,33] server, in order to carry out the normal mode calculations.
First, the systems were minimized with the conjugate gradient (CG) method, followed by the Adopted Basis Newton-Raphson (ABNR) algorithm. Harmonic constraints were applied during the CG stages, progressively decreasing from 250 to 0 kcal mol⁻¹ Å⁻². The systems were then minimized without constraints using the ABNR algorithm, with an RMS (root mean square) energy gradient convergence criterion of 10⁻⁵ kcal mol⁻¹ Å⁻¹. We calculated 81 modes using the CHARMM 40b1 [34] program with the CHARMM36 [35] force field; the two lowest-frequency modes corresponded to the torsion between the chains and to the opening that exposes the active site, respectively.
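The principle behind a normal mode calculation is the diagonalization of the Hessian (second-derivative matrix) of the potential energy at a minimum. As a minimal sketch of that principle only, and not of the all-atom CHARMM calculation used here, the example below builds the Hessian of a simple anisotropic elastic network and diagonalizes it with NumPy. The elastic network model, the function names, the 15 Å cutoff, the unit spring constant and the random toy coordinates are all assumptions made for illustration.

```python
import numpy as np

def anm_hessian(coords, cutoff=15.0, gamma=1.0):
    """3N x 3N Hessian of an elastic network with springs within `cutoff`."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            r2 = float(d @ d)
            if r2 > cutoff ** 2:
                continue  # no spring between distant nodes
            block = -gamma * np.outer(d, d) / r2
            hessian[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
            hessian[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block
            # Diagonal blocks are minus the sum of the off-diagonal blocks.
            hessian[3 * i:3 * i + 3, 3 * i:3 * i + 3] -= block
            hessian[3 * j:3 * j + 3, 3 * j:3 * j + 3] -= block
    return hessian

def normal_modes(hessian, n_modes=10, zero_tol=1e-6):
    """Lowest non-zero modes; rigid-body (zero) modes are discarded."""
    eigvals, eigvecs = np.linalg.eigh(hessian)  # eigenvalues in ascending order
    keep = eigvals > zero_tol
    return eigvals[keep][:n_modes], eigvecs[:, keep][:, :n_modes]

# Toy coordinates (random points, not the DPP-4 structure).
rng = np.random.default_rng(0)
toy_coords = rng.uniform(0.0, 20.0, size=(30, 3))
toy_vals, toy_vecs = normal_modes(anm_hessian(toy_coords), n_modes=2)
```

The first columns of the returned eigenvector array are the lowest-frequency non-zero modes, which in a protein usually describe collective motions such as the twist and opening movements discussed in this work.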
The ligand and receptor files were prepared in PDBQT format by adding hydrogens and Gasteiger partial charges with AutoDockTools [36]; the molecular interactions were calculated using the structure equilibrated with GROMACS [29-32].

Protocol Overview-Search for Binding Sites FTSite and FTMap
In the search for possible binding sites of DPP-4, we used FTSite [18-20] and FTMap [18,20-22]. These tools map the protein surface with 16 small organic molecules called probes. The FTSite server identifies regions that are possible protein binding sites, whereas FTMap [18,20-22] characterizes the affinity profile of these regions according to the probe molecules used. In FTSite, each probe is placed on a dense grid around the protein. Each probe cluster is ranked based on its average energy, and the consensus regions are identified as sites where clusters of different probes overlap, suggesting a favorable region for binding molecules. The main stages involved in FTMap [18,20-22] include: (i) processing of the PDB file, where ligands and water molecules are excluded; (ii) pre-docking minimization: addition of polar hydrogen atoms; (iii) Poisson-Boltzmann (PB) potential calculation: uses the CHARMM23 program to calculate the PB potential around the protein; (iv) clustering of the probes and minimization, with generation of the consensus regions; (v) calculation of nonbonded interactions and hydrogen bonds between the probes and the protein. The main characteristics of the probe molecules employed in both tools are shown in Table 2.
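The idea of a consensus region, a place where clusters of different probes overlap, can be illustrated with a simplified greedy grouping of probe cluster centers. This is a sketch under stated assumptions and not the algorithm implemented in FTSite or FTMap. The function name `consensus_sites`, the 4 Å grouping radius and the toy coordinates are hypothetical.

```python
import numpy as np

def consensus_sites(centers, probe_ids, radius=4.0):
    """Group probe cluster centers into sites ranked by probe diversity.

    centers: (K, 3) cluster-center coordinates; probe_ids: K probe names.
    Repeatedly picks the unassigned center whose neighborhood (within
    `radius`) holds the most distinct probe types and removes its members.
    """
    centers = np.asarray(centers, dtype=float)
    unassigned = set(range(len(centers)))
    sites = []
    while unassigned:
        best_types, best_members = -1, []
        for i in sorted(unassigned):
            members = [j for j in unassigned
                       if np.linalg.norm(centers[j] - centers[i]) <= radius]
            n_types = len({probe_ids[j] for j in members})
            if n_types > best_types:
                best_types, best_members = n_types, members
        sites.append({
            "probes": sorted({probe_ids[j] for j in best_members}),
            "members": sorted(best_members),
        })
        unassigned -= set(best_members)
    return sites

# Toy input: three overlapping clusters of different probes, two distant ones.
toy_centers = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (20, 0, 0), (21, 0, 0)]
toy_probes = ["ethanol", "urea", "isopropanol", "ethanol", "ethanol"]
toy_sites = consensus_sites(toy_centers, toy_probes)
```

In this toy input, the first site returned contains three different probe types and would be ranked above the second, which contains only ethanol clusters.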

Conclusions
In this study, we observed a small difference in the flexibility profiles of the two DPP-4 chains, suggesting that chain A is more flexible and that there are fewer interactions between chain B and the inhibitors. The most significant movements observed involve the exposure of the active site and the twist between the two chains. For the N7F inhibitor, we detected more hydrogen-bonding and T-stacking interactions in the A chain of the DPP-4 enzyme.
Regarding the three sites identified here, with the exception of Phe357 and Arg358, the residues detected at the third binding site have not yet been exploited for the identification of new bioactive ligands. Moreover, the third site does not correspond to the non-catalytic binding sites of DPP-4 (residues Asn281, Leu294, Leu340, Val341, Ala342 and Arg343). It may therefore be possible to occupy the region of the third binding site without altering the non-catalytic functions of the enzyme.
In addition to providing structural information on the enzyme, our study contributes to understanding its dynamics, so that this enzyme can be used more efficiently in drug design. We propose that designing new compounds with physicochemical characteristics similar to those of the 16 probe molecules can be a good approach to planning efficient inhibitors of DPP-4. For site 2, compounds containing groups with a physicochemical profile similar to the probe molecules acetaldehyde, isopropanol and urea were not the best options for binding in this region. For site 3, molecules structurally similar to ethanol, isopropanol, methanamide and tert-butanol would not be suitable. Binding site 3 (see Figure 1) should be studied further as a new allosteric binding site. A better understanding of the functional motions of DPP-4, as well as of the characteristics of the residues that compose its binding sites, can contribute to the design of more effective DPP-4 inhibitors and to further studies on side effects.