Combined Structure-Based Pharmacophore and 3D-QSAR Studies on Phenylalanine Series Compounds as TPH1 Inhibitors

Tryptophan hydroxylase-1 (TPH1) is a key enzyme in the synthesis of serotonin. As a neurotransmitter, serotonin plays important physiological roles both peripherally and centrally. In this study, a combination of ligand-based and structure-based methods is used to clarify the essential quantitative structure-activity relationship (QSAR) of known TPH1 inhibitors. A multicomplex-based pharmacophore (MCBP) guided method has been suggested to generate a comprehensive pharmacophore of TPH1 kinase based on three crystal structures of TPH1-inhibitor complex. This model has been successfully used to identify the bioactive conformation and align 32 structurally diverse substituted phenylalanine derivatives. The QSAR analyses have been performed on these TPH1 inhibitors based on the MCBP guided alignment. These results may provide important information for further design and virtual screening of novel TPH1 inhibitors.


Introduction
Serotonin (5-hydroxytryptamine, 5-HT) is a monoamine neurotransmitter that modulates central and peripheral functions through action on platelets, smooth muscles, neurons, and other cell OPEN ACCESS types in the gastrointestinal (GI) tract or in the central nervous system (CNS). The biosynthesis of Serotonin is limited by the hydroxylation of tryptophan which is catalyzed by tryptophan hydroxylase (TPH). Two vertebrate isoforms of TPH, TPH1 and TPH2, have been described [1][2][3]. In the GI system, TPH1 is primarily expressed and dysregulation of the peripheral 5-HT signaling system is involved in the etiology of several conditions such as functional GI disorders, chemotherapy-induced emesis, and heart valve damage [4,5]. Therefore, inhibitors of TPH1 have proven effective in treating chemotherapy-induced emesis, as well as diarrhea, in carcinoid tumor patients. Some pharmaceutical companies are developing candidate drug molecules based on this target for treating dysregulation of the serotonergic system, such as irritable bowel syndrome [6].
Some research results focused on the physiological function of the gut-derived serotonin show that 5-HT is a powerful inhibitor of osteoblast proliferation and bone formation [7][8][9]. Yadav V.K. and co-workers reported that inhibitors of gut-derived serotonin synthesis have the potential to become a new class of bone anabolic drugs that can be added to the armamentarium to treat osteoporosis [10]. Small molecule inhibitors of TPH1 can be considered as a new target to treat osteoporosis and this mechanism is different from any known drugs (Estrogen or Bisphosphonates) [11]. Therefore, structure-activity relationship (SAR) studies on the known TPH1 inhibitors are very interesting and meaningful for discovering new anti-osteoporosis candidate compounds. Recently, a novel series of phenylalanine was reported as a kind of selective TPH1 inhibitor [12,13]. However, quantitative structure-activity relationship (QSAR) focusing on phenylalanine series compounds as TPH1 inhibitors have not been reported. In this study, combination of the ligand-based and structure-based methods is used to clarify the essential quantitative structure-activity relationship of the known TPH1 inhibitors. First of all, a multicomplex-based pharmacophore has been generated from a comprehensive pharmacophore map of TPH1 based on three crystal structures of TPH1-inhibitor complex. Secondly, a multicomplex-based pharmacophore guided alignment procedure is used in the data set, which is further exploited in the development of predictive 3D-QSAR models. Finally, reliable CoMFA models are developed based on these pharmacophore models [14][15][16].
As a part of the ongoing work in our research groups aimed at the search for selective TPH1 inhibitors, and our recent attempts to explore how to generate more accurate and reasonable structure-based pharmacophore models, the combined structure-based and ligand-based drug design strategy is useful to gain further insights into the molecular recognition patterns required for TPH1 protein binding, and for developing a multicomplex-based pharmacophore model that can be used for virtual screening to discover novel potential lead compounds. The multicomplex-based pharmacophore and 3D-QSAR models can help us to predict the biological activities of the series compounds with a change in the chemical substitutions and to provide some useful references for the design of new TPH1 inhibitors. The theoretical results can offer some useful references for the design of new TPH1 inhibitors as anti-osteoporosis drugs.

Generation of Multicomplex-Based Pharmacophore Models
A set of 3 crystal structures of TPH1 in complex with diverse ligands (Table 1) was obtained from the Protein Data Bank (PDB) [17]. Water molecules in ligand-binding sites have been reported to play a crucial role in mediating the interactions between TPH1 and its ligands, and they can provide useful information for the process of pharmacophore construction. Therefore, all the water molecules in the crystal structures were retained. The coordinates of 3 TPH1-ligand X-ray crystal structures were transformed into a common reference frame by using "Multiple Structure Alignment" module within Discovery Studio (DS). Arg257 -Tyr264 Tyr264 Pro267 -Pro268 Pro268 -His272 His272 The complex of TPH1 with the docked conformation of compound 12 was used as the starting structure for the generation of the pharmacophore model in the present study. The software Ligandscout 1.03 was applied for the detection and interpretation of crucial interaction patterns between TPH1 and compound 12 [18]. Ligandscout extracts and interprets the ligand and the macromolecular environment from the PDB file, then automatically creates and visualizes an advanced pharmacophore model. The pharmacophore model was exported as a hypoedit script and converted into the Discovery Studio format with the Hypoedit tool [19]. Subsequently, the pharmacophore model was used for mapping all of the molecules.

Data Set and Molecular Sketching
Except for some compounds with no activity or unclear activity, 26 phenylalanine compounds from references are selected as the training set [12,13], among which 6 compounds are randomly chosen as the testing set (the testing set is marked by *). According to research practice, all original IC 50 values (μmol/L) were converted to negative logarithm of IC 50 (pIC 50 ) and used as dependent variable in our 3D-QSAR study. The structure of these compounds and their activity values are listed in Table 2. Among all the compounds in the data set, compound 12 was selected as the template to construct other compounds because of its high biological activity and representative chemical structure, and the computation was completed by SYBYL 6.9 program package (Tripos) on a PC workstation [20]. Except for some special notes, default values were chosen. The calculation can be defined as follows: after the construction of molecules, hydrogen and Gasteiger-Hückel charges were added to the compounds. Then their geometries were optimized by the conjugate gradient method in TRIPOS force field. The energy convergence criterion is 0.001 kcal/mol.

Conformational Model Analysis and Alignment Rule
For the training and test sets molecules, conformational models representing their available conformational space were calculated. All molecules were subjected to Diverse Conformation Generation protocol to produce a maximum of 255 conformations within 20 kcal/mol in energy from the global minimization. All other parameters used were kept at their default settings.
In the 3D-QSAR studies, alignment rule and biological conformation selection are two important factors to construct reliable models. All the molecules in the training and test sets were mapped simultaneously onto the pharmacophore model using "flexible" fitting method and "best mapping only" option in the Ligand Pharmacophore Mapping protocol. The conformation with highest fit value (i.e., best fitting the pharmacophore) was assumed as the bioactive conformation for each compound. The final aligned molecules were exported to SYBYL for CoMFA analysis.

CoMFA Study
The 3D-QSAR studies (CoMFA and CoMSIA) were done on a PC workstation using SYBYL 6.9 software. The superimposed molecules were kept in a 3D grid (spacing set at 2 Å), then steric and electrostatic fields were calculated at various grid points using Lennard-Jonnes and Coulombic potentials, respectively, for CoMFA studies. A sp 3 carbon atom having a charge of +1 and a radius of 1.52 Å was used as a probe to calculate various steric and electrostatic fields for all three of the alignments. Various steric and electrostatic cutoffs and grid spacings were tried to investigate the influence of different parameter settings on CoMFA.

Partial Least Square Analysis (PLS) and Model Validation
Partial least squares (PLS) [21] methodology was used for all the 3D-QSAR analyses. The cross-validation [22] analysis was performed using leave-one-out (LOO) method in which one compound was removed from the dataset, and its activity was predicted using the model derived from the rest of the dataset. The cross-validated r 2 that resulted in the optimum number of components and lowest standard error of prediction were considered for further analysis. To speed up the analysis and reduce noise, a minimum filter value of 2.00 kcal/mol was used. Final analysis was performed to calculate conventional r 2 using the optimum number of components obtained from the cross-validation analysis. CoMFA standard scaling was applied to all of the CoMFA analysis.
The predictive power of the 3D-QSAR models were determined from external test sets that were excluded during model development. The inhibitors in the test sets were given exactly the same pretreatment as the inhibitors in the corresponding training sets. The correlation between the experimental and predicted activity for all models was calculated as a predictive r 2 value.

Generation and Validation of Multicomplex-Based Phamacophore
Three X-ray crystallography structures of TPH1 in complex with small molecular inhibitors were used to construct the pharmacophore. Results of molecular superposition from the result based on Modeller [23] are reported below (Figure 1). The detected pharmacophore features as well as their statistical frequency, which measures how many complexes a given pharmacophore feature can be found in, are showed in Table 1. One can see that there were 10 pharmacophore features, including 1 hydrogen bond acceptor (A1), 3 hydrogen bond donors (D1-D3) and 5 hydrophobic features (H1-H5) and 1 negative ionizable point. In the 10 detected pharmacophore features, 7 features (A1, D1, D2, H1, H2, H3 and Neg1) were found to be common in the 3 complexes. It is believed that the pharmacophore features, which present in the complexes with a high probability, were likely to be more important than features that exhibit a low probability. For a full pharmacophore map, it was also important to include excluded volume features, which reflected potential steric restriction and corresponded to the positions that were inaccessible to any potential ligand. The comprehensive pharmacophore map and the ligand binding conformation at the ATP site of TPH1 is shown in Figure 2. The comprehensive pharmacophore map initially obtained was too restrictive and not suitable for the virtual screening since it contained a large number of chemical features and the fit of a molecule to such a pharmacophore was still out of reach for today's state-of-the-art computational tools. A correctly reduced pharmacophore model would be much preferred in terms of practical application [24]. According to our experience, the top ranked seven features (A1, D1, D2, H1, H2, H3 and Neg1) would be more appropriate in practice, and consequently they were selected from the comprehensive pharmacophore map and were merged to generate a multicomplex-based phamacophore (Figures 3 and 4). The difference of the chemical feature in this position between the ligand-based pharmacophore model and multicomplex-based pharmacophore was mainly due to the distinct methodologies that have been employed. In LigandScout, the pharmacophore feature was added to the model only if a reasonable interaction pattern between the ligand and the receptor was found. In contrast, the pharmacophore hypothesis generated in Catalyst merely includes ligand information.    A reliable pharmacophore model may be used to determine the bioactive conformations of the ligands that share the same binding mode. The conformation selected for each compound, assumed as the bioactive conformation, corresponds to the conformation which best fits the pharmacophore. To verify whether the pharmacophore model finds the correct bioactive conformation, we applied the method to a substituted 3-(5-(pyrazine-2-yl)-phenyl)-2-aminopropanoic acid inhibitor (compound 12), whose bioactive conformation is known from co-crystal structure of TPH1 and the binding mode is similar to the other derivatives. Thus, the X-ray crystal structure of TPH1 kinase (PDB code: 3HFB) was selected from the Protein Data Bank. The bound conformation of this inhibitor was respectively mapped onto the pharmacophore model using "flexible" fitting method and "best mapping only" option in the Ligand Pharmacophore Mapping protocol and was meanwhile superimposed to the best mapping conformations (Figure 4). The Root-mean-square deviation (RMSD) value between the heavy atom positions of the bound and the best mapping conformation was 0.45 Å. The result showed that the pharmacophore model is capable of reproducing the bioactive conformation from the Protein Data Bank and supported our choice for the bioactive conformation obtained from the best mapping conformation of the calculated ensemble to the alignment in 3D-QSAR analysis, rather than the commonly used energy minimization method.

Alignment of Molecules in the Training and Test Sets
One of the most fundamental problems when trying to develop a good and predictive 3D-QSAR model, is how to align the investigated compounds. This becomes especially critical when one is dealing with a set of structurally flexible and diverse compounds [25]. A pharmacophore model also constitutes a useful tool to guide the alignment of compounds in 3D-QSAR study. Compared with the scaffold alignment based on the atom RMS fitting, which is commonly used in 3D-QSAR study, the pharmacophore-based alignment approach is more advantageous in aligning flexible and diverse molecules [26]. Figure 5 shows an alignment of all molecules in the training and test sets by the pharmacophore model. It seemed that the alignment was good when the bioactive conformations were automatically aligned to the pharmacophore model. The amino group and one of the nitrogen atoms in heterocyclic moiety superimpose or locate near hydrogen bonds features in hinge to force all compounds to take similar space orientations, which represents that the urea moiety could access the back hydrophobic pocket adjacent to the ATP binding site. For these compounds with good inhibitory activities, they can produce good fits with all features in the pharmacophore model. While for those compounds with poor inhibitory activity, they can only produce relatively good fits with one feature missed.

CoMFA Models
All of the CoMFA models were developed from the training set of 26 inhibitors and the test set of 6 inhibitors using MCBP alignments, and the results of the CoMFA analyses are presented in Table 2.
The statistical information and quality of 3D-QSAR models based on two different alignments have been compared, as the alignment of the molecules is the most crucial step in the development of the 3D-QSAR models using CoMFA. The 3D-QSAR models with a r cv 2 value >0.3 are considered significant, although a r cv 2 value >0.4 is preferred [27]. Among all models generated using two alignments, the best model was the CoMFA model with MCBP alignment, having a r cv 2 value of 0.57.
The CoMFA models were built after model development and validation based on the internal predictions of the training set and the external predictions of the test set. PLS analyses of the TPH1 inhibitor training sets showed a high cross-validated r cv 2 value of 0.57 using six principal components and non-cross-validated r 2 value of 0.986. All of the parameters of these CoMFA models showed certain reliability and feasible predictability to help us design new and high selectivity TPH1 inhibitors. From Table 3 we can see that almost all compounds in the test set yielded a good predicted pIC 50 within 1 log unit of the experimental value. The PLS analysis results obtained from structurally diverse TPH1 inhibitors were similar which further strengthens the robustness of the multicomplex-based phamacophore guided alignment. This might be due to the fact that the multicomplex-based phamacophore considered a large number of interactions between the TPH1 protein and the small molecular inhibitors at the ATP active sites. The Aurora-A inhibitory activity (pIC 50 ) and the residual values for the training set and the test set compounds used for the best CoMFA model are given in Table 3. The graphical plot of observed vs. calculated TPH1 inhibitory activity for both the training set as well as the test set is shown in Figure 6.

Graphical Interpretation of the CoMFA Results
The contour maps of CoMFA denoted the region in the space where the aligned molecules would favorably or unfavorably interact with the receptor where the presence of a group with a particular physicochemical activity bound to the receptor. The CoMFA results were graphically interpreted by field contribution maps using the "STDEV*COEFF" field type. Figure 7 showed the contour maps derived from the CoMFA model. The more potent analogue, compound 27 was embedded in the maps to demonstrate its affinity for the steric and electrostatic regions of inhibitors. The areas of yellow indicate regions of steric hindrance to activity, while green areas indicated a steric contribution to potency. The blue regions indicated positive electrostatic charge potential associated with increased activity, while regions of red show negative charge with increased activity. All of the contours represented the default 80 and 20% level contributions for favored and disfavored regions, respectively. Figure 7a-d showed that the different physicochemical fields properties contours were mainly distributed within the region surrounding the substitute aromatic ring unit and near the region enclosed by the carboxyl group of the 2-aminopropanoic acid group of the reference inhibitor. This suggested that these functional groups tuned the affinity of each ligand. To the electrostatic properties, the blue contour presented in the 1 and 4 position of the 1,2,4-triazin ring in the map suggested that positive electrostatic charge groups, e.g., the more positive charged nitrogen may favor enhanced affinity between TPH1 and its inhibitors (Figure 7a), for example, compounds 13 to 17 bearing a 4-amino-1,2,4-triazin ring have higher inhibitory activity than compounds 1 to 12 bearing a pyrazin ring. The green and yellow contour for the steric properties derived from the CoMFA studies indicated moderate steric interaction of the R group would benefit the inhibitor for increasing the activity with the TPH1 receptor (Figure 7b), for example, compound 7 with a naphthalene ring has higher inhibitory activity than compound 1 bearing a cyclo-hexane ring and 11 bearing a biphenyl group. The contributions from the steric and electrostatic fields for the present models were 0.829/0.171 (Table 2), respectively. Such contributions of field indicated that the variations in binding affinity among these inhibitors were dominated by steric interactions but distributed in different proportions across the binding sites of the TPH1 kinase. This factor could be applied to design highly potent and selective TPH1 inhibitors.

Conclusion
In conclusion, we utilized 3 crystal structures of human TPH1 bound to small molecular inhibitors to generate a multicomplex-based pharmacophore. The multicomplex-based pharmacophore was used to compare three previously reported ligand-based pharmacophore models. It has been validated that the multicomplex-based pharmacophore model was capable of predicting the bioactive conformations and molecular alignments of a wide variety of TPH1 inhibitors in the structurally diverse datasets.
The work conducted here has provided an approach to generate a multicomplex-based pharmacophore guided 3D-QSAR model based on a set of crystal structures of protein-ligand complexes and structurally diverse inhibitors. The multicomplex-based pharmacophore guided 3D-QSAR model can be used to further optimize and design more potent TPH1 inhibitors and to evaluate the newly engineered compounds in de novo design. The studies suggest that in the development of 3D-QSAR models, the multicomplex-based pharmacophore guided alignment could be useful for getting the robust predictive models which may provide useful information required for a proper understanding of the important structural and physicochemical features for designing novel selective kinase inhibitors comprising novel scaffolds leading to the candidate molecules, such as anti-osteoporosis agents for drug development. It is expected that the information provided here will be helpful for the study toward more accurate pharmacophore-based 3D-QSAR modeling.