Combined 3D-QSAR and Docking Modelling Study on Indolocarbazole Series Compounds as Tie-2 Inhibitors

Tie-2, a kind of endothelial cell tyrosine kinase receptor, is required for embryonic blood vessel development and tumor angiogenesis. Several compounds that showed potent activity toward this attractive anticancer drug target in the assay have been reported. In order to investigate the structure-activity correlation of indolocarbazole series compounds and modify them to improve their selectivity and activity, 3D-QSAR models were built using CoMFA and CoMSIA methods and molecular docking was used to check the results. Based on the common sketch align, two good QSAR models with high predictabilities (CoMFA model: q2 = 0.823, r2 = 0.979; CoMSIA model: q2 = 0.804, r2 = 0.967) were obtained and the contour maps obtained from both models were applied to identify the influence on the biological activity. Molecular docking was then used to confirm the results. Combined with the molecular docking results, the detail binding mode between the ligands and Tie-2 was elucidated, which enabled us to interpret the structure-activity relationship. These satisf actory results not only offered help to comprehend the action mechanism of indolocarbazole series compounds, but also provide new information for the design of new potent inhibitors.


Introduction
Tie-2, a kind of endothelial cell tyrosine kinase receptor, is expressed primarily by vascular endothelial cells and is required for embryonic blood vessel development and tumor angiogenesis [1]. Angiopoietin integrates with Tie-2 in the body to promote the Tie-2 receptor's autophosphorylation, while the endothelial cells which express Tie-2 attract the cells around the vein to help the endothelial cells construct complete vessel walls, promote vascular remodeling and maturing, maintain the integrity of blood vessels and regulate their functions [2]. More and more data suggest that the inhibition of Tie-2 plays a vital part in curing cancer, and Tie-2 represents an important candidate for targeted therapy in cancer [3][4][5].
Recently, a novel series of indolocarbazole was reported as a kind of receptor tyrosine kinase inhibitor targeted to Tie-2 [6][7][8][9]. However, the QSAR focuses on indolocarbazole series compounds as Tie-2 inhibitors have not been reported. In our study, the 3D-QSAR models are constructed by CoMFA and CoMSIA methods on a training set of 80 indolocarbazole compounds as Tie-2 inhibitors. Comparative Molecular Field Analysis (CoMFA) and Comparative Molecular Similarity Indices Analysis (CoMSIA) are two commonly used three-dimensional quantitative structure-activity relationship (3D-QSAR) methods. It is a useful technique for understanding the pharmacological properties of studied compounds, because not only are the models visualized, but also the relationships between physical/chemical properties and pharmacological activity are represented in contour maps [10,11]. Combining with the docking study, the protein-ligand interactions were implicated. The models can help us to predict the biological activities of the series compounds with a change in the chemical substitutions and to provide some useful references for the design of new Tie-2 inhibitors. The theoretical results can offer some useful references for the design of new Tie-2 inhibitors as anticancer drugs.

Data Set and Molecular Sketching
Except for some compounds with no activity or unclear activity, 80 indolocarbazole compounds from 4 references [6][7][8][9] are selected as the training set, among which 15 compounds are randomly chosen as the testing set (the testing set is marked by *). According to research practice, all original IC 50 values (nmol/L) were converted to negative logarithm of IC 50 (pIC 50 ) and used as dependent variable in our 3D-QSAR study. The structure of these compounds and their activity values are listed in Table 1.
Among all the compounds in the data set, compound 27 was selected as template to construct other compounds because of its highest biological activity, and the computation is completed by SYBYL7.3 program package (Tripos. Int.) on a personal computer. Except for some special notes, default values are chosen to finish this work. The calculation can be defined as follows: after the construction of molecules, hydrogen and Gasteiger-Hückel charges were added to the compounds. Then their geometries are optimized by the conjugate gradient method in TRIPOS force field. The energy convergence criterion is 0.001 kcal/mol. Structural alignment is considered as one of the most sensitive parameters in CoMFA and CoMSIA study. In our work, the most active compound 27, was used as a template for alignment, and the common fragment of 65 compounds is shown in Figure 1(A). The compound 27 was used as the template for alignment, and the rest of compounds were aligned on it [12]. This process was performed by Database Align of SYBYL 7.3. The aligned molecules were presented in Figure 1

CoMFA and CoMSIA Study
In the CoMFA procedure, a 3D cubic lattice with a grid spacing of 2Ǻ was created automatically by the program to encompass all the aligned ligands. A default sp 3 -carbon probe atom with a van der Waals radius of 1.52 Ǻ and a charge of +1.0 was used to generate steric (Lennard-Jones 6-12 potential) field energies and electrostatic (Coulombic potential) fields with a distance-dependent dielectric at each lattice point. The computed field energy was truncated to 30 kcal/mol for both steric and electrostatic fields [12,13].
The CoMSIA procedure is similar to the CoMFA procedure. In this approach, five different similarity fields are calculated: steric, electrostatic, hydrophobic, hydrogen bond donor and hydrogen bond acceptor. The CoMSIA similarity index descriptors were derived using the same lattice box as that used in CoMFA calculations. These fields were selected to cover the major contributions to ligand binding. In CoMSIA fields, singularities were avoided at atomic positions because a Gaussian type distance dependence of each physicochemical property was adopted and thus no arbitrary cutoffs were necessary [14]. The attenuation factor was set to the default value of 0.3.
The Partial Least Squares (PLS) regression technique was used to construct a linear correlation between the 3D-field (independent variables) and the biological activity values (dependent variables). Cross-validations were performed by the leave-one-out (LOO) procedure to in which one compound was removed from the data set and its activity is predicted using a model built from the rest of data set [14,15]. It results in the cross-validation square correlation coefficient q 2 and the optimum number of components (ONC). Then the optimum number of components was employed to construct 3D-QSAR models by non-cross-validations to obtain the conventional correlation coefficient r 2 , standard deviation SE and significant factor F. In order to speed up the analysis and reduce the noise, the column filter value of 2.0kcal/mol was set for non-cross-validations. The analysis procedure was performed by combing the bioactivity values (pIC 50 ) and the corresponding field descriptor variation.

Predictive Correlation Co-efficient (r 2 pred )
In order to assess the predictive abilities of the CoMFA and CoMSIA models, the pIC 50 values of an external test set composed of fifteen compounds were predicted. Furthermore, the predictive correlation coefficient r 2 pred was calculated by the formula shown below.
where SD is the sum of the squared deviations between the biological activities of the test set compounds and the mean activity of the training set compounds, and PRESS is the sum of squared deviations between experimental and predicted activity values of the test set compounds [16,17].

Molecular Docking
In order to determine the appropriate binding conformations of these compounds and check the main factors affecting the activity from the 3D-QSAR models, docking study for all compounds was performed with the Surflex-Dock program from Sybyl7.3. The Surflex-Dock uses an empirical scoring function and a patented search engine to dock ligands into a protein's binding site [18]. The scoring function considers the four terms, including the hydrophobic complementarity, polar complementarity, entropic terms and solvation terms. Protomol, an idealized representation of a ligand that makes every potential interaction with the binding site, was used to guide molecular docking. In our study, the protomol was established guiding by the ligand in the protein. The X-ray crystal structure of Tie-2 was retrieved from the RCSB Protein Data Bank (PDB entry code: 3l8p). At the beginning of the docking, all the water and ligands were removed and the random hydrogen atoms were added. Then the receptor structure was minimized in 10,000 cycles with Powell method in sybyl7.3 [19]. The surface of protein was calculated with a MOLCAD program. Other parameters used in docking were default, except for those explained.

3D-QSAR Models
Two QSAR models are established from CoMFA and CoMSIA analysis and the statistical parameters derived from the experiment are listed in Table 2. These parameters demonstrate that both QSAR models obtained are of high degree of confidence and strong predictive ability. The CoMFA model had a high cross-validated square correlation coefficient q 2 (0.823) with an optimized component of 8, which suggests that the model is reliable and predictive, for q 2 is greater than 0.5 indicating a model with good predictability [18][19][20]. The non-cross-validated square correlation coefficient r 2 is 0.979(r 2 is closer to 1 indicates better linear relationship) with a low standard error estimate (SEE) of 0.114, and F value of 319.429. Contributions of steric and electrostatic fields were 0.527 and 0.473, indicating that the steric interaction of the ligand with the receptor may be a more important influencing factor for the anticancer activity. Compared with the CoMFA model, the CoMSIA model is a bit poor, but it is also a good model with high predictability (q 2 value of 0.804 for eight components and a conventional r 2 value of 0.967 with a SEE of 0.141, F value of 207.935, shown in Table 2). The r 2 Pred value of 0.935 also shows that the predictive ability of the model are good, as does the CoMFA model with the r 2 Pred value of 0.948. The contribution of five fields: steric, electrostatic, hydrophobic, hydrogen-bond donor and hydrogen-bond acceptor, were 0.121, 0.236, 0.160, 0.225, and 0.259, respectively.
The actual and predicted pIC 50 values of the training set and the test set by two models are listed in Table 3, and the linear relationship for the CoMFA and CoMSIA analysis are shown in Figure 2 (a is the CoMFA model and b is the CoMSIA model), in which most points are evenly distributed along the line Y = X. It can clearly be seen that the predicted pIC 50 values obtained from CoMFA and CoMSIA models are in good agreement with the experimental data.

Contour Analysis
The contour maps were used to display the fields around the molecules, and to rationalize where changes in each field probably affect the activity of the molecule. The models from CoMFA and CoMSIA were graphically interpreted through the stdev*coeff contour maps, which are plotted as the percentages of the contribution of CoMFA or CoMSIA equation. They show regions where variations of steric, electrostatic, hydrophilic, hydrogen-bond donor or acceptor nature in the structural features of the different molecules lead to an increase or decrease in the activity [20][21][22]. The contour maps of CoMFA are displayed in Figure 3. The steric field (A) is characterized by green and yellow contours, in which green indicates that increased steric is associated with enhanced activity and yellow indicated reduced activity. Compound 16 was selected as a reference molecule.
There are green contours bellow the N-13 position, which suggested the suitable volume of alkyl at this position would increase the activity. The length of C3-C4 of N-alkyl substitution is probably suitable for improving the activity, shorter or longer lengths would decrease the activity. A bigger yellow contour beside the C-3 position and N-10 position shows that the more bulky substitutes in these areas will significantly decrease the biological activities. So, compared with the N-10 position alkynes substitutes (compound 77 and 78), the compounds with the methyl in the N-9 position (such as compounds 72 and 73) have bigger pIC 50 values. Compound 16 has more potential than 15 because the i-tu is more bulky than i-Pr in the yellow area. This is satisfactory in accordance with the contour maps. The steric field (B) is characterized by blue and red contours, which indicates that the positive-charge groups and negative-charge groups would be favorable to the activity, respectively. As an electron-donating group, the isopropyl can decrease the positive-charge of the blue areas and decrease the activity, so compound 6 has the largest pIC 50 value compared with compounds 1, 3 and 5. For another example, because the NHCO group is in the blue area, most of the compounds with phenyl urea have potential activity. Compared with the CoMFA model, the CoMSIA model provides more information. The CoMSIA contour maps involve three parts: the electrostatic and steric field contours, the hydrophobic field contours, and the hydrogen-bond donor and hydrogen-bond acceptor field contours. The CoMSIA steric and electrostatic contour plots shown in Figure 4(A,B) are consistent to those of CoMFA. The big or small ramificate alkyl substituent of N-13 position would decrease the activity. The CoMSIA hydrophobic contour plot is shown in Figure 4E using compound 72. The yellow regions indicate hydrophobic substitutions will increase the activity of the compounds, while the white areas show that hydrophilic substitutions will increase activity. The compounds with NHCO groups at C-3 position have potential activity probably because the NHCO groups as a hydrophilic in the white area. If the NHCO group was replaced by C=O such as the compound-58 analogues, they also shows the activity. However if the NHCO group was replaced by C=N-O groups such as the compounds-1-4 analogues, the activities would be decreased because of the C=N in white area with less hydrophilic. The hydrogen-bond donor and hydrogen-bond acceptor contour plots are shown in Figure 4 (C,D) using the compound 27. The blue regions highlight the hydrogen-bond donor's contribution to the activity. The purple areas show the hydrogen-bond donor's disadvantage in contributing to the activity. The hydrogen of the group NHCONH acted as hydrogen-bond donor would benefit the activity, however, if the hydrogen at the N-13 position is not replaced by alkyl, it would act as a hydrogen-bond donor in the purple areas, which can decrease the activity. This may explain why the compounds without N-alkylation at N-13 position have low pIC 50 value. In the hydrogen-bond acceptor contour plot ( Figure 4D), the red-purple and red regions represent those areas of favorable and unfavorable hydrogen-bond acceptor respectively. The C=O in the red regions shows it would decrease the activity. When the conformation was changed because of the substitution of phenyl (the C=O lie in the red-purple area), it could increase the activity, such as compounds 34 and 39.

Docking Analysis
In order to investigate the probable binding conformations between these indolocarbazole derivatives and the receptor, and to check the reliability of the established 3D-QSAR models, all studied inhibitors were docked into the active pocket of Tie-2. The crystal structure of Tie-2 was retrieved from the RCSB Protein Data Bank (PDB entry code: 3l8p). In order to validate the docking reliability, the ligand was removed from the active site and docked back into the binding pocket. The root mean square deviation (RMSD) between the predicted conformation and the actual conformation from the crystal structure of ligand was 0.609 Ǻ, which is smaller than the resolution of X-ray crystallography, 2.40 Ǻ. The results indicated that the parameter set for the Surflex-dock simulation was reasonable to reproduce the X-ray structure. Therefore the simulation method and the parameter set could be extended to study the binding conformations of the other inhibitors.
After all molecules were aligned according the method mentioned above, we found most of the molecules adopted a similar binding conformation to the potent inhibitor and had same orientation with each other. However, some inhibitor with low activity exhibited different poses and influenced the alignment based on docking, such as compound 19, 48, etc. The high quality 3D-QSAR models based on docking alignment could, unfortunately, not be constructed. A docking study showed that if the group of C-3 position is not enough to match to the sub-hydrophobic pocket composed of Asn887, Leu888, Leu876, Leu985, Asp982, Phe983, the molecule might be have much more orientation because of the large binding pocket of Tie-2. Most compounds had the same interaction with Tie-2, especially those with high activity, we still could acquire enough information to demonstrate our 3D-QSAR models. The following statements are the results of our docking studies, which are the complement of 3D-QSAR studies for drug design.
The compounds 27 and 73 were selected for further detailed analysis. The best possible interacting model of compound 73 with Tie-2 and the main residues involved in the interaction were generally depicted in Figure   The knowledge of hydrogen-bonding sites on a molecular surface is a powerful tool for docking studies. Ligands can be docked to proteins by matching the patterns displayed on the surface. The surface is divided into three kinds of regions: hydrogen-bonding donating, hydrogen acceptor and the rest of the molecule. Figure 6    The methyl at N-9 position which is matched to the hydrophilic area is favorable to the activity. If the substitutes of 3-position is the flexible hydrophilic group which linked the hydrophobic phenyl substitutes, it will enhance the activity. The flexible bond may rotate to fit the hydrophobic pocket. Therefore the compounds-20-38 analogues have bigger pIC 50 value. The conclusion from the lipophilic potential surface was satisfied according to the corresponding CoMSIA hydrophobic contour map. The compound 27 was docked into the ATP-binding site, the red color shows the electron-withdrawing zone and purple color shows electron-donating zone. In Figure 7(B), the R3 position was found in a yellow area. Figure 7. The MOLCAD lipophilic potential surface (A) and electrostatic potential surface (B) of ATP-binding site of Tie-2 within compound 27. The color ramp for LP ranges from brown (highest lipophilic area of the surface) to blue (highest hydrophilic area). The color ramp for EP ranges from red (most positive) to purple (most negative).

Design for New Inhibitors
Based on the analysis of the structure-activity relationship and the docking studies, a series of novel compounds were designed as the active tie-2 inhibitor. These compounds were aligned to the database using compound 27 as a template and the theoretical pIC 50 values were predicted by the CoMFA and CoMSIA models. They also were docked into the pocket of Tie-2. The chemical structure and the predicted pIC 50 value so as to the dock score were shown in Table 4. The high predicted activity and the better dock score showed that they would be a potent inhibitor to Tie-2 in the future. Table 4. Chemical structure of newly designed compounds and their predicted pIC 50 and the Surflex-dock total-scores.

Conclusion
In this study, the 3D-QSAR study and molecular docking were carried out not only to construct highly accurate and predictive 3D-QSAR models, including the CoMFA (q 2 = 0.823, r 2 = 0.979, SEE = 0.114) and CoMSIA (q 2 = 0.804, r 2 = 0.967, SEE = 0.141), but also to explore the interaction mechanism between the indolocarbazole compounds and Tie-2. The results from the combined 3D-QSAR and docking study are: (1) The length of alkyl chain of N-13 position up to C3-C4 is favorable to the anticancer activity as larger or smaller alky cannot match to the pocket of active site of Tie-2; (2) The C-3 position suggested hydrophobic flexible group linked with the hydrogen-donor substituent are favored to the binding, which fits to the hydrophobic pocket of active site. Moreover, the hydrogen-donor group may form the hydrogen bond with the residues from Tie-2 through the rotation of flexible bonds. If the group is electron-negative at this position, the activity would be enhanced; (3) The methyl at N-9 position is more favorable than at N-10 position, because the bulky group in N-10 position may decrease the activity. The 6-position N-H and the 7-position C=O are requirements for the biological activity because of the hydrogen bonds and hydrophilic.The results from VEGFR2 did not conform to the regular, expected statistic for a kind of dual VEGFR2 and Tie 2 inhibitor and unfortunately the 3D-QSAR models could not be obtained. We can modify the indolocarbazole structures which were potent to VEGFR2 in order to improve the activity to Tie-2 based on the information from the QSAR models and docking and design new potent dual inhibitors to Tie-2 and VEGFR2.