Article

QSAR Study of p56lck Protein Tyrosine Kinase Inhibitory Activity of Flavonoid Derivatives Using MLR and GA-PLS

Department of Medicinal Chemistry, Faculty of Pharmacy, Isfahan University of Medical Sciences and Health Services, 81746-73461, Isfahan, Iran
*
Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2008, 9(9), 1876-1892; https://doi.org/10.3390/ijms9091876
Submission received: 9 August 2008 / Revised: 2 September 2008 / Accepted: 13 September 2008 / Published: 22 September 2008
(This article belongs to the Section Physical Chemistry, Theoretical and Computational Chemistry)

Abstract

Quantitative relationships between molecular structure and p56lck protein tyrosine kinase inhibitory activity of 50 flavonoid derivatives were established by MLR and GA-PLS methods. Different QSAR models revealed that substituent electronic descriptors (SED) have a significant impact on the protein tyrosine kinase inhibitory activity of the compounds. Of the two statistical methods employed, GA-PLS gave superior results. The resulting GA-PLS model had a high statistical quality (R2 = 0.74 and Q2 = 0.61) for predicting the activity of the inhibitors. The models proposed in the present work are more useful in describing the QSAR of flavonoid derivatives as p56lck protein tyrosine kinase inhibitors than those provided previously.

1. Introduction

The quantitative structure-activity relationship (QSAR) research field provides medicinal chemists with the ability to predict drug activity by mathematical equations that relate chemical structure to biological activity [1, 2]. These equations take the form y = Xb + e, which relates a set of predictor variables (X) to a predicted variable (y) by means of a regression vector (b), with e the residual [3]. After the earlier QSAR studies by Hansch, who showed a correlation between biological activity and the octanol-water partition coefficient [2], it is now assumed that the sum of substituent effects on the steric, electronic and hydrophobic interactions of compounds with their receptor determines their biological activity [4–6]. The first step in constructing QSAR models is finding one or more molecular descriptors that represent variation in the structural properties of the molecules by a number [7]. Nowadays, a wide range of descriptors is used in QSAR studies; according to the Karelson approach they can be classified into different categories, including constitutional, geometrical, topological, quantum, chemical and so on [8]. Different variable selection and modeling methods are available, including multiple linear regression (MLR), genetic algorithm (GA), principal component or factor analysis (PCA/FA) and so on. The mathematical relationships between molecular descriptors and activity are used to find the parameters affecting the biological activity and/or to estimate the property of other molecules.
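As a brief illustration of the model form y = Xb + e, the following sketch assumes that b is estimated by ordinary least squares, which is the MLR case; the symbols n (number of molecules) and m (number of descriptors) are introduced here only for the illustration.

```latex
% y: n x 1 vector of activities, X: n x m descriptor matrix (plus a column of ones
% for the intercept), b: regression vector, e: residual vector.
\begin{aligned}
y &= Xb + e, \\
\hat{b} &= \arg\min_{b}\, \lVert y - Xb \rVert^{2} = (X^{\mathsf{T}}X)^{-1}X^{\mathsf{T}}y, \\
\hat{y} &= X\hat{b}.
\end{aligned}
```

The inverse of X^T X exists only when the descriptor columns are linearly independent, which is one reason strongly collinear descriptors are a problem for MLR.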
It is now well established that protein tyrosine kinases (PTKs) provide a central switching mechanism in cellular signal transduction pathways by catalyzing the transfer of the γ-phosphate of either ATP or GTP to specific tyrosine residues in certain protein substrates [9, 10]. This regulatory control plays a crucial role in signal transduction pathways that regulate several cellular functions under both normal and deregulated conditions [11–14]. PTKs are the intracellular effectors for many growth hormone receptors. The discovery of activated PTKs as the products of dominant viral transforming genes (oncogenes) provided the early hypothesis for a connection between protein tyrosine phosphorylation and cell transformation, and enough evidence is now available to suggest that inappropriate or elevated expression of PTKs contributes to the transformed state of cells in many human malignancies [15–19]. p56lck is a lymphoid-specific protein tyrosine kinase that is principally expressed in T lymphocytes [20]. Association of p56lck with the cytoplasmic tail of various cell surface receptors, as well as associations of p56lck with intracellular targets of phosphorylation, suggests that this tyrosine kinase plays a central role in coordinating early signal transduction events [21]. Based on this knowledge, substances that can modulate the activity of PTKs might be effective therapeutic agents. The key step in the kinase mechanism of all PTKs is the recognition and binding of a nucleoside triphosphate (usually ATP) and an appropriate tyrosyl-containing substrate to the enzyme. Direct transfer of phosphate between the two molecules is the next step in PTK function [22]. A variety of compounds can inhibit the function of PTKs in a manner that is competitive with respect to nucleotide binding.
Among such competitive inhibitors are flavonoids, a group of low molecular weight plant natural products that form one of the largest classes of naturally occurring polyphenolic compounds [23, 24]. This group of plant natural products is largely responsible for the colors of many fruits and flowers, and over 4,000 flavonoid pigments have been characterized and classified according to their chemical structure. Chemically they are C6-C3-C6 compounds in which the two C6 groups are substituted benzene rings, and the C3 group is an aliphatic chain which contains a pyran ring. Flavonoids occur as O- or C-glycosides or in the “free” state as aglycones with hydroxyl or methoxyl groups present on the aglycone. The flavonoids may be divided into seven types: flavones, flavonols, flavanones, chalcones, xanthones, isoflavones, and biflavones. Flavonoids have gained wide interest as potential pharmacological agents, since some of the best sources of flavonoids are foods: apples, blueberries, bilberries, onions, soy products and tea. Furthermore, numerous medicinal plants contain therapeutic amounts of flavonoids, which are used to treat a wide variety of disorders [25].
Here, we consider the inhibitory activity of flavonoids against the protein tyrosine kinase p56lck. Several QSAR studies have been reported on this class of molecules using different descriptors and different modeling methods. Thakur et al. described a QSAR study on p56lck protein tyrosine kinase inhibitor flavonoids using only hydration energy and hydrophobic parameters [26]. Nikolovska-Coleska et al. treated a set of 104 derivatives with a standard linear regression technique using classical/quantum descriptors [27]. The same dataset was treated by Novic et al. with a counter-propagation neural network using classical/quantum descriptors [28]. Oblak et al. applied a wide variety of descriptors with the CODESSA software to the above-mentioned dataset [29]. A quantum chemical/classical QSAR study on a set of 75 flavonoids and closely related compounds tested as p56lck protein tyrosine kinase and AR inhibitors was carried out by Stefanic et al., and the structure-activity relationships obtained for the two enzyme systems were compared [30]. A comprehensive ab initio study of the 3D structures of some flavonoids was reported by Meyer [31]. Deeb et al. calculated nodal orientation with the program NODANGLE [32].
In the present paper, the QSAR study for a series of 50 flavonoid analogues with the ability to inhibit protein tyrosine kinase has been considered [32]. In a comprehensive study of the PTK system we used a very large descriptor set (more than 600 topological, geometrical, constitutional, functional group, electrostatic, quantum and chemical descriptors) and different analyses: Hansch, Free-Wilson and substituent electronic descriptors (SED), in order to be able to compare the predictive ability of descriptors from different descriptor groups. Multiple linear regression (MLR) and genetic algorithm partial least squares (GA-PLS) methods were applied as methods for modeling.

2. Results and Discussion

The structural features and biological activity of the studied compounds are listed in Table 1. Calculated descriptors for each molecule are summarized in Table 2.

2.1. MLR analysis

In the first step, separate stepwise selection-based MLR analyses were performed using different types of descriptors, and then an MLR equation was obtained using the pool of all calculated descriptors. The results are summarized in Table 3. The correlation coefficient (r2) matrix for the descriptors used in the different MLR equations is shown in Table 4. Collinear descriptors degrade the performance of MLR equations, and such models have lower predictive ability.
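The collinearity check behind a matrix such as Table 4 can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the 0.9 cut-off and the toy descriptor values are all assumptions made for the example.

```python
from itertools import combinations


def pearson_r2(a, b):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    saa = sum((x - mean_a) ** 2 for x in a)
    sbb = sum((y - mean_b) ** 2 for y in b)
    sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    if saa == 0 or sbb == 0:
        return 0.0  # a constant descriptor carries no correlation information
    return sab * sab / (saa * sbb)


def collinear_pairs(descriptors, threshold=0.9):
    """Return (name1, name2, r2) for descriptor pairs with r2 above the threshold.

    `descriptors` maps a descriptor name to its values over all molecules.
    The default threshold is an arbitrary illustrative choice.
    """
    pairs = []
    for name1, name2 in combinations(sorted(descriptors), 2):
        r2 = pearson_r2(descriptors[name1], descriptors[name2])
        if r2 > threshold:
            pairs.append((name1, name2, r2))
    return pairs


# Made-up values for three hypothetical descriptors over four molecules.
toy = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [1, -1, 1, -1]}
flagged = collinear_pairs(toy)
```

Pairs flagged in this way would normally not be allowed to enter the same MLR equation together.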
Table 3 lists the QSAR equations derived for the studied compounds using different sets of molecular descriptors. The first equation (E1) was obtained using chemical descriptors. It indicates a negative effect of hydration energy and molecular weight (Mass) on protein tyrosine kinase inhibitory activity. Equation E2 shows that, among the quantum descriptors, the most positive charge (MPC) has a negative effect on protein tyrosine kinase inhibitory activity, which reveals the presence of coulombic interactions between the ligands and the receptor. The negative sign of the MPC coefficient demonstrates that ligands with the lowest MPC could interact with the receptor more efficiently. This indicates that there is probably a negative region in the receptor that produces coulombic interactions with the ligand. Equation E3 demonstrates the effect of constitutional descriptors. It includes the negative effects of average molecular weight (AMW), number of multiple bonds (nBM) and number of aromatic bonds (nAB) on protein tyrosine kinase inhibitory activity. Molecules with lower AMW show better protein tyrosine kinase inhibitory activity, and decreasing the number of multiple bonds results in activity enhancement. The MLR equation obtained from the pool of topological descriptors (E4) shows the positive effect of mean information content on the distance equality (ICR), path/walk 4 Randic shape index (PW4) and average connectivity index chi-4 (X4v), and the negative effect of mean information content on the vertex degree magnitude (IVDM) and average valence connectivity index chi-1 (X1v), on protein tyrosine kinase inhibitory activity. This equation describes the structure-activity relationship better than those obtained from the chemical, quantum and constitutional descriptors.
The effect of geometrical parameters on the protein tyrosine kinase inhibitory activity of the studied compounds is described by E5 of Table 3. It shows the positive effect of spherosity (SPH) and the negative effect of the sum of geometrical distances between N...O, i.e. G (N...O), on protein tyrosine kinase inhibitory activity. The effect of functional groups is described by equation E6 of Table 3. This three-parametric equation does not have a high statistical quality, which suggests that the protein tyrosine kinase inhibitory activity of the studied molecules does not depend strongly on the type of functional group itself, but rather on the structural changes induced by variations in the functional groups. The negative signs of nNO2 and nOHt indicate that molecules with fewer nitro groups (aliphatic) and tertiary alcohols (aliphatic) bind more strongly to the protein kinase. On the other hand, the number of hydroxyl groups (nOH) has a direct effect on the inhibitory activity of the compounds. The Hansch equation (E7) shows the importance of steric, electronic and lipophilic factors for protein tyrosine kinase inhibitory activity. These factors are described by L3 (length parameter of the C3 substituent), ℑR′3 and ℑR8 (Swain and Lupton field parameters of the C-R′3 and C-R8 substituents) and π5 (lipophilic parameter of the C5 substituent), respectively. The negative coefficient of π5 indicates that lipophilic substituents at R5 are not favorable for binding affinity. This equation shows the positive effect of ℑR′3 and the negative effect of ℑR8 on the inhibitory activity of the compounds. In addition, the negative effect of L3 indicates that bulky groups at C3 decrease activity, because they hinder strong interaction between the ligands and the enzyme. The SED equation (E8) shows the importance of SED factors for protein tyrosine kinase inhibitory activity. One of its parameters is the molecular orbital energy HOMOA3 (highest occupied molecular orbital parameter of the C3 substituent) and the other is SNQ8 (sum of negative charges parameter of the C8 substituent). The equation shows the positive effect of HOMOA3 and the negative effect of SNQ8 on protein tyrosine kinase inhibitory activity.
The last equation (E9) was obtained from all types of calculated descriptors. Stepwise selection and elimination of variables produced a four-parametric QSAR equation. This equation shows that geometrical (SPH), quantum (MPC), Hansch (L3) and SED (SNQ8) parameters are the major factors that affect the protein tyrosine kinase inhibitory activity of the compounds. Among these descriptors, MPC and L3 have negative effects and the others have positive effects on the protein tyrosine kinase inhibitory activity.

2.2. Free-Wilson analysis

The simple Free-Wilson analysis (FWA) was considered to indicate which substituents on ring B and chromone moiety contribute to protein tyrosine kinase inhibitory activity and which ones detract from activity [33]. As indicated in Table 1, the molecules used in this study have a phenyl ring (ring B) and chromone moiety with different types of substituents in different positions of the ring. Some important substituents such as methoxyl, hydroxyl and amine are used in calculations. Therefore, the descriptors data matrix built for the FWA has 44 rows (i.e., number of selected molecules for FWA) and 24 columns (i.e., three substituents at eight substitution positions on the flavonoid structure). The elements of the descriptor data matrix are 1 or 0, to indicate the presence or absence of a given substituent in a specified position in a molecule, respectively. The following two-parametric equation was found between the activity data (y) and the Free-Wilson type descriptors data matrix:
$$\mathrm{pIC}_{50} = 3.893\,(\pm 0.089) + 0.439\,(\pm 0.207)\,\mathrm{R'3\_Hydroxyl} - 1.103\,(\pm 0.534)\,\mathrm{R5\_Methoxyl} \tag{1}$$
$$R^{2} = 0.70,\quad N = 44,\quad F = 31.45,\quad SE = 0.30$$
Equation (1) indicates that the protein tyrosine kinase inhibitory activity of the studied compounds is directly affected by the presence of an electron-donating hydroxyl group in the meta position (R′3) of the phenyl ring, and most probably this part of the flavonoid molecule interacts with the catalytic domain of the enzyme. The same result was obtained by other researchers [27]. According to this equation, a methoxyl group on C-R5 detracts from the inhibitory activity.
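The Free-Wilson encoding and Equation (1) can be sketched in a few lines. The coefficients are copied from Equation (1); the dictionary layout, the function names and the way a compound is written down (position mapped to substituent) are assumptions made only for this illustration.

```python
# Intercept and coefficients taken from Equation (1).
# Each key is a (position, substituent) indicator column of the Free-Wilson matrix.
INTERCEPT = 3.893
COEFFS = {
    ("R'3", "hydroxyl"): 0.439,
    ("R5", "methoxyl"): -1.103,
}


def encode(compound, columns):
    """Return the 0/1 indicator row for one compound.

    `compound` maps a substitution position to the substituent found there;
    positions that are absent are treated as unsubstituted (hypothetical format).
    """
    return [1 if compound.get(pos) == sub else 0 for pos, sub in columns]


def predict_pic50(compound):
    """Evaluate Equation (1) for one compound."""
    columns = list(COEFFS)
    row = encode(compound, columns)
    return INTERCEPT + sum(COEFFS[col] * x for col, x in zip(columns, row))


# Hypothetical compounds, written only to exercise the two indicator columns.
with_meta_hydroxyl = {"R'3": "hydroxyl"}
with_r5_methoxyl = {"R5": "methoxyl"}
```

In the full analysis each molecule would be encoded over all 24 indicator columns; the two columns above are simply the ones that remained in the final equation.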

2.3. GA-PLS analysis

In PLS analysis, the descriptor data matrix is decomposed into orthogonal matrices with an inner relationship between the dependent and independent variables. Therefore, unlike MLR analysis, PLS analysis avoids the multicollinearity problem in the descriptors. Because a minimal number of latent variables is used for modeling, PLS copes with noisy data better than MLR. To find the most suitable set of descriptors for PLS modeling, a genetic algorithm was used. To do so, many different GA-PLS runs were conducted using different initial populations. The data set (n = 50) was divided into two groups: a calibration set (n = 40) and a prediction set (n = 10). For the 40 calibration samples, the leave-one-out cross-validation procedure was used to find the optimum number of latent variables for each PLS model. The GA-PLS model with the best fitness contained 14 indices, four of them being those obtained by MLR. The PLS estimates of the coefficients for these descriptors are given in Figure 1. As can be seen, a combination of quantum, topological, geometrical and Hansch descriptors was selected by GA-PLS to account for the protein tyrosine kinase inhibitory activity of the flavonoid derivatives. The majority of these descriptors are topological indices. The resulting GA-PLS model possessed a high statistical quality (R2 = 0.74 and Q2 = 0.61). The predictive ability of the model was measured by applying it to the 10 molecules of the external test set. The squared correlation coefficient for prediction was 0.82 and the standard error of prediction was 0.30. The pIC50 values calculated by the GA-PLS model (obtained from cross-validation or from the external prediction set), along with the corresponding relative errors of prediction (REP), are shown in Table 1. The small relative errors (within ± 0.40) support the accuracy of the proposed GA-PLS model for modeling the protein tyrosine kinase inhibitory activity of the studied flavonoid derivatives.
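The two statistics quoted above can be illustrated with a small sketch. It assumes the common definitions R2 = 1 - SS_res/SS_tot and Q2 = 1 - PRESS/SS_tot, with PRESS accumulated by leave-one-out cross-validation; a one-descriptor least-squares line stands in for the PLS model, and the data are made up.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for a single descriptor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_and_q2(xs, ys):
    """Return (R2, leave-one-out Q2) for the one-descriptor model."""
    n = len(ys)
    mean_y = sum(ys) / n
    ss_tot = sum((y - mean_y) ** 2 for y in ys)

    # R2: residuals of the model fitted on all samples.
    slope, intercept = fit_line(xs, ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

    # Q2: each sample is predicted by a model fitted without it.
    press = 0.0
    for i in range(n):
        s, c = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (s * xs[i] + c)) ** 2

    return 1 - ss_res / ss_tot, 1 - press / ss_tot


# Made-up descriptor values and activities, for illustration only.
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_y = [1.1, 1.9, 3.2, 3.8, 5.1]
r2, q2 = r2_and_q2(toy_x, toy_y)
```

For a least-squares model each left-out residual is at least as large as the fitted one, so Q2 never exceeds R2; a large gap between the two is a usual sign of over-fitting.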
Comparison between the results obtained by the GA-PLS and MLR methods indicates the higher accuracy of the GA-PLS method in describing the inhibitory activity of flavonoid derivatives toward the protein tyrosine kinase enzyme. The difference in accuracy of the two regression methods used in this study is visualized in Figure 2 by plotting the predicted activity (by cross-validation) against the experimental values. Both linear models show the data scattered around a straight line with a slope close to one. The plot for GA-PLS shows the least scatter, and the plot obtained by MLR analysis (from E9) ranks second in accuracy.
To measure the significance of the 14 selected PLS descriptors for the protein tyrosine kinase inhibitory activity, the variable importance in the projection (VIP) was calculated for each descriptor [34]. The VIP analysis of the PLS equation is shown in Figure 3. It shows that HNar and TI2, which are topological indices, and SPH, which is a geometrical parameter, are the most important indices in the QSAR equation derived by PLS analysis. In addition, the quantum parameter HOMO and the Hansch parameter ℑR′3 were found to be moderately influential.

3. Methodology

3.1. Software

The two-dimensional structures of the molecules were drawn using Hyperchem 7.0 software. The final geometries were obtained with the semi-empirical AM1 method in the Hyperchem program. The molecular structures were optimized using the Polak-Ribiere algorithm until the root mean square gradient was 0.01 kcal mol−1. The resulting geometry was transferred into the Dragon program package, which was developed by the Milano Chemometrics and QSAR Group [35]. The z-matrix of the structures was provided by the software and transferred to the Gaussian 98 program. Complete geometry optimization was performed taking the most extended conformation as the starting geometry. Semi-empirical molecular orbital calculations (AM1) of the structures were performed using the Gaussian 98 program [36].

3.2. Activity data & descriptor generation

The biological data used in this study are the protein tyrosine kinase inhibitory activities, −log (IC50), of a set of 50 flavonoid analogues [32]. The structural features and biological activity of these compounds are listed in Table 1, and the activities were used as dependent variables in the subsequent QSAR analysis. A large number of molecular descriptors were calculated using Hyperchem, the Dragon package and Gaussian 98. Some chemical parameters, including molecular volume (V), molecular surface area (SA), hydrophobicity (Log P), hydration energy (HE) and molecular polarizability (MP), were calculated using the Hyperchem software. The Dragon software calculated different functional group, topological, geometrical and constitutional descriptors for each molecule. Gaussian 98 was employed for the calculation of different quantum chemical descriptors, including dipole moment (DM), local charges, and HOMO and LUMO energies. Hardness (η), softness (S), electronegativity (χ) and electrophilicity (ω) were calculated according to the method proposed by Thanikaivelan et al. [37]. Classical substituent constants, including the hydrophobic constant (π), the Hammett electronic constants (σ), the Taft field effect (FI), resonance (R) substituent constants and steric constants (molar refractivity MR and STERIMOL), were also used as descriptors in this study [38]. The calculated descriptors for each molecule are summarized in Table 2.
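The four reactivity indices can be derived from the frontier orbital energies as sketched below. The expressions are the widely used Koopmans-type approximations; the text does not spell out the exact formulas, so treating them as the ones applied here is an assumption, as are the function name and the example energies.

```python
def reactivity_descriptors(e_homo, e_lumo):
    """Hardness, softness, electronegativity and electrophilicity from HOMO/LUMO.

    Assumed (commonly used) approximations, with both energies in the same unit:
        hardness          eta   = (E_LUMO - E_HOMO) / 2
        electronegativity chi   = -(E_HOMO + E_LUMO) / 2
        softness          S     = 1 / (2 * eta)
        electrophilicity  omega = chi**2 / (2 * eta)
    """
    eta = (e_lumo - e_homo) / 2.0
    chi = -(e_homo + e_lumo) / 2.0
    return {
        "hardness": eta,
        "softness": 1.0 / (2.0 * eta),
        "electronegativity": chi,
        "electrophilicity": chi ** 2 / (2.0 * eta),
    }


# Illustrative orbital energies (not taken from the paper).
example = reactivity_descriptors(e_homo=-9.0, e_lumo=-1.0)
```

A small HOMO-LUMO gap gives a low hardness and therefore a high softness and electrophilicity, which is why these indices are strongly intercorrelated and rarely enter the same equation.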

3.3. Data screening & model building

The selected descriptors from each class and the experimental data were analyzed by stepwise regression in SPSS (version 12.0). The calculated descriptors were collected in a data matrix whose rows and columns correspond to the molecules and descriptors, respectively. Multiple linear regression (MLR) and partial least squares (PLS) were used to derive the QSAR equations, and feature selection was performed with a genetic algorithm (GA). The resulting models were validated by the leave-one-out cross-validation procedure (using MATLAB software) to check their predictability and robustness. However, this procedure did not produce good results, and therefore we used the genetic algorithm (GA-PLS) to select the best variables.
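Genetic-algorithm feature selection can be sketched generically as below. This is not the authors' implementation: the population size, mutation rate, single-point crossover, elitism and the `fitness` callable are all assumptions. In a GA-PLS setting the fitness would typically be a cross-validated statistic of a PLS model built on the selected descriptor columns.

```python
import random


def ga_select(n_vars, fitness, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Search for a 0/1 mask over n_vars descriptors that maximizes `fitness`.

    `fitness` is any callable taking a mask (list of 0/1) and returning a number.
    Requires n_vars >= 2 and pop_size >= 4.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]      # truncation selection
        children = [ranked[0][:]]              # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_vars - 1)   # single-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation switches descriptors in or out of the model.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        candidate = max(pop, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate

    return best, fitness(best)
```

Running the search several times with different seeds mirrors the practice of repeating GA runs from different initial populations and keeping the descriptors that are selected consistently.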
Application of PLS allows the construction of larger QSAR equations while still avoiding over-fitting and eliminating most variables. PLS is normally used in combination with cross-validation to obtain the optimum number of components [39, 40]. The PLS regression method used in this study was the NIPALS-based algorithm available in the chemometrics toolbox of MATLAB software (version 7.1, MathWorks Inc.). The leave-one-out cross-validation procedure was used to obtain the optimum number of factors based on the Haaland and Thomas F-ratio criterion [41].
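A NIPALS-style PLS regression for a single response (PLS1) can be sketched in plain Python as follows. It is a minimal textbook variant with mean-centering only, not the MATLAB toolbox routine used in the study; the function names and the toy data are assumptions.

```python
import math


def pls1_fit(X, y, n_components):
    """Fit PLS1 by sequential deflation; returns (x_mean, y_mean, components)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    comps = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:          # nothing left in y to explain
            break
        w = [v / norm for v in w]                                   # weights
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y so the next latent variable is orthogonal to this one.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, p, q))
    return x_mean, y_mean, comps


def pls1_predict(model, x):
    """Predict y for one sample by repeating the deflation on its descriptor row."""
    x_mean, y_mean, comps = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in comps:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat


# Toy data generated from y = 2*x1 + 3*x2 + 1, for illustration only.
X_toy = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 7]]
y_toy = [2 * a + 3 * b + 1 for a, b in X_toy]
model = pls1_fit(X_toy, y_toy, n_components=2)
```

With as many latent variables as independent descriptors the model coincides with ordinary least squares; using fewer latent variables, chosen by cross-validation, is what gives PLS its robustness to collinear and noisy descriptors.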

3.4. Variable importance in the projection (VIP)

To investigate the relative importance of the variables that appear in the final model obtained by the GA-PLS method, variable importance in the projection (VIP) was employed [34]. VIP values reflect the importance of terms in the PLS model. According to Eriksson et al., X-variables (predictor variables) can be classified according to their relevance in explaining y (the predicted variable): VIP > 1.0 means highly influential, 0.8 < VIP < 1.0 moderately influential, and VIP < 0.8 less influential [8].
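For reference, a commonly used definition of VIP for a PLS model with a single response is given below. The text does not state the formula, so the expression and its symbols (p descriptors, A latent variables, weight vectors w_a, score vectors t_a and y-loadings q_a) are assumptions introduced for illustration.

```latex
% VIP of descriptor j in a PLS1 model with A latent variables and p descriptors.
% SS_a is the part of the y variance explained by latent variable a.
\mathrm{VIP}_{j} = \sqrt{\, p \;
  \frac{\sum_{a=1}^{A} SS_{a}\,\bigl(w_{ja} / \lVert w_{a} \rVert\bigr)^{2}}
       {\sum_{a=1}^{A} SS_{a}} \,},
\qquad SS_{a} = q_{a}^{2}\, t_{a}^{\mathsf{T}} t_{a}
```

With this definition the mean of the squared VIP values over all descriptors equals one, which is why 1.0 serves as the natural threshold for a highly influential variable.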

3.5. Substituent electronic descriptors (SED)

Electronic descriptors obtained from quantum chemical calculations have found major popularity, and selecting a quantum chemical calculation method (i.e., semi-empirical or ab initio) involves a trade-off between computational complexity and accuracy [42]. To simplify the quantum chemical calculations, Hemmateenejad et al. recently hypothesized that the calculations could be performed on the substituents instead of the whole molecular structures, and that the resulting electronic features can be used as electronic descriptors in QSAR/QSPR studies [43,44]. Hemmateenejad et al. proposed substituent electronic descriptors (SED) as an alternative to both substituent constants and molecular descriptors [43]. SED analysis for each substituent was used in our study, and the calculated descriptors are listed in Table 2. They can be classified into three electronic categories: local charges, dipoles and orbital energies. Since most of the substituents are open-shell species (radicals in a doublet state), the alpha (spin up) and beta (spin down) electron populations have different energies in the Gaussian 98 output. This provides additional descriptors, HOMOA, HOMOB, LUMOA, LUMOB, HDA, HDB, SOFA, SOFB, ENA, ENB, EPHA and EPHB, which stem from the two electron populations; the subscripts A and B stand for the alpha and beta populations, respectively. Therefore, a total of 26 electronic descriptors were calculated for each substituent.

4. Conclusions

Quantitative relationships between molecular structure and protein tyrosine kinase inhibitory activity of flavonoid derivatives were established by two chemometric methods: MLR and GA-PLS. Different QSAR models revealed that SED parameters have a significant impact on the protein tyrosine kinase inhibitory activity of the compounds. In this series, a significant role of topological and geometrical parameters in the inhibitory activity was observed. Using the pool of all types of calculated descriptors, a new QSAR model was derived for these compounds. This model indicated that quantum, geometrical, SED and Hansch parameters affect protein tyrosine kinase inhibitory activity. A comparison between the two statistical methods employed indicated that GA-PLS gave superior results. The resulting GA-PLS model possessed a high statistical quality (R2 = 0.74 and Q2 = 0.61) for predicting the activity of the inhibitors. The models proposed in the present work are more useful in describing the QSAR of flavonoid derivatives as p56lck protein tyrosine kinase inhibitors than those proposed previously.

Acknowledgments

This work was supported by Isfahan Pharmaceutical Sciences Research Center. The authors wish to thank Dr. Bahram Hemmateenejad for his advice on various aspects of this research.

References

  1. Hansch, C; Hoekman, D; Gao, H. Comparative QSAR: Toward a Deeper Understanding of Chemicobiological Interactions. Chem. Rev 1996, 96, 1045–1076.
  2. Hansch, C; Maloney, PP; Fujita, T; Muir, RM. Correlation of Biological Activity of Phenoxyacetic Acids with Hammett Substituent Constants and Partition Coefficients. Nature 1962, 194, 178–180.
  3. Hemmateenejad, B. Correlation Ranking Procedure for Factor Selection in PC-ANN Modeling and Application to ADMETox Evaluation. Chemom. Intell. Lab. Syst 2005, 75, 231–245.
  4. Fujita, T; Iwasa, J; Hansch, C. A New Substituent Constant, π, Derived from Partition Coefficients. J. Am. Chem. Soc 1964, 86, 5175–5180.
  5. Hansch, C. Quantitative Approach to Biochemical Structure-Activity Relationships. Acc. Chem. Res 1968, 2, 232–239.
  6. Hansch, C; Clayton, JM. Lipophilic Character and Biological Activity of Drugs II: The Parabolic Case. J. Pharm. Sci 1973, 62, 1–21.
  7. Agatonovic-Kustrin, S; Tucker, IG; Zecevic, M; Zivanovic, LJ. Prediction of Drug Transfer into Human Milk from Theoretically Derived Descriptors. Anal. Chim. Acta 2000, 418, 181–195.
  8. Mohajeri, A; Hemmateenejad, B; Mehdipour, A; Miri, R. Modeling Calcium Channel Antagonistic Activity of Dihydropyridine Derivatives Using QTMS Indices Analyzed by GA-PLS and PC-GA-PLS. J. Mol. Graph. Model 2008, 26, 1057–1065.
  9. Ullrich, A; Schlessinger, J. Signal Transduction by Receptors with Tyrosine Kinase Activity. Cell 1990, 61, 203–212.
  10. Bishop, JM. The Molecular Genetics of Cancer. Science 1987, 235, 305–311.
  11. Blume-Jensen, P; Hunter, T. Oncogenic Kinase Signalling. Nature 2001, 411, 355–365.
  12. Hunter, T. Signaling — 2000 and Beyond. Cell 2000, 100, 113–127.
  13. Schlessinger, J. Cell Signaling by Receptor Tyrosine Kinases. Cell 2000, 103, 211–225.
  14. Hanahan, D; Weinberg, RA. The Hallmarks of Cancer. Cell 2000, 100, 57–70.
  15. Cantley, LC; Auger, KR; Carpenter, C; Duckworth, B; Graziani, A; Kapeller, R; Soltoff, S. Oncogenes and Signal Transduction. Cell 1991, 64, 281–302.
  16. Groundwater, PW; Solomons, KRH; Drewe, JA; Munawar, MA. Progress in Medicinal Chemistry; Ellis, GP, Luscombe, DK, Eds.; Elsevier Science B.V: Amsterdam, 1996; pp. 233–329.
  17. Bolen, JB; Veillette, A; Schwartz, AM; DeSeau, V; Rosen, N. Activation of pp60c-src Protein Kinase Activity in Human Colon Carcinoma. Proc. Natl. Acad. Sci. USA 1987, 84, 2251–2255.
  18. Slamon, DJ; Clark, GM; Wong, SG; Levin, WJ; Ullrich, A; McGuire, WL. Human Breast Cancer: Correlation of Relapse and Survival with Amplification of the HER-2/neu Oncogene. Science 1987, 235, 177–182.
  19. Yamamoto, T; Kamata, N; Kawano, H; Shimizu, S; Kuroki, T; Toyoshima, K; Rikimaru, K; Nomura, N; Ishizaki, R; Pastan, I; Gamou, S; Shimizu, N. High Incidence of Amplification of the Epidermal Growth Factor Receptor Gene in Human Squamous Carcinoma Cell Lines. Cancer Res 1986, 46, 414–416.
  20. Weil, R; Veillette, A. Signal Transduction by the Lymphocyte-Specific Tyrosine Protein Kinase p56lck. Current Topics Micro. Immunol 1996, 205, 63–87.
  21. Anderson, SJ; Levin, SD; Perlmutter, RM. Involvement of the Protein Tyrosine Kinase p56lck in T Cell Signaling and Thymocyte Development. Adv. Immunol 1994, 56, 151–178.
  22. Bishop, JM. Cellular Oncogenes and Retroviruses. Annu. Rev. Biochem 1983, 52, 301–354.
  23. Cushman, M; Nagarathnam, D; Burg, DL; Geahlen, RL. Synthesis and Protein-Tyrosine Kinase Inhibitory Activities of Flavonoid Analogues. J. Med. Chem 1991, 34, 798–806.
  24. Cushman, M; Zhu, H; Geahlen, RL; Kraker, AJ. Synthesis and Biochemical Evaluation of a Series of Aminoflavones as Potential Inhibitors of Protein-Tyrosine Kinases p56lck, EGFr, and p60v-src. J. Med. Chem 1994, 37, 3353–3362.
  25. Bylka, W; Matlawska, I; Pilewski, NA. Natural Flavonoids as Antimicrobial Agents. JANA 2004, 7, 24–31.
  26. Thakur, A; Vishwakarma, S; Thakur, M. QSAR Study of Flavonoid Derivatives as p56lck Tyrosine Kinase Inhibitors. Bioorg. Med. Chem 2004, 12, 1209–1214.
  27. Nikolovska-Coleska, Ž; Suturkova, L; Dorevski, K; Krbavcic, A; Solmajer, T. Quantitative Structure-Activity Relationship of Flavonoid Inhibitors of p56lck Protein Tyrosine Kinase: A Classical/Quantum Chemical Approach. Quant. Struct.-Act. Relat 1998, 17, 7–13.
  28. Novic, M; Nikolovska-Coleska, Ž; Šolmajer, T. Quantitative Structure-Activity Relationship of Flavonoid p56lck Protein Tyrosine Kinase Inhibitors. A Neural Network Approach. J. Chem. Inf. Comput. Sci 1997, 37, 990–998.
  29. Oblak, M; Randic, M; Solmajer, T. Quantitative Structure-Activity Relationship of Flavonoid Analogues. 3. Inhibition of p56lck Protein Tyrosine Kinase. J. Chem. Inf. Comput. Sci 2000, 40, 994–1001.
  30. Stefanic-Petek, A; Krbavcic, A; Solmajer, T. QSAR of Flavonoids: 4. Differential Inhibition of Aldose Reductase and p56lck Protein Tyrosine Kinase. Croatica Chemica Acta 2002, 75, 517–529.
  31. Meyer, M. Ab initio Study of Flavonoid. Int. J. Quantum Chem 2000, 76, 724–732.
  32. Deeb, O; Clare, BW. QSAR of Aromatic Substances: Protein Tyrosin Kinase Inhibitory Activity of Flavonoid Analogues. Chem. Biol. Drug Des 2007, 70, 437–449.
  33. Free, SMJR; Wilson, JW. A Mathematical Contribution to Structure-Activity Studies. J. Med. Chem 1964, 7, 395–399.
  34. Olah, M; Bologa, C; Oprea, TI. An Automated PLS Search for Biologically Relevant QSAR Descriptors. J. Comput. Aided Mol. Des 2004, 18, 437–449.
  35. Todeschini, R. Milano Chemometrics and QSPR Group. http://michem.disat.unimib.it/, accessed 9 September, 2008.
  36. Frisch, MJ; Trucks, GW; Schlegel, HB; Scuseria, GE; Robb, MA; Cheeseman, JR; Zakrzewski, VG; Montgomery, JA; Stratmann, RE; Burant, JC; et al. Gaussian 98, Revision A.7. Gaussian, Inc: Pittsburgh, PA, 1998.
  37. Roy, K. QSAR of Adenosine Receptor Antagonists II: Exploring Physicochemical Requirements for Selective Binding of 2-arylpyrazolo [3,4-c]quinoline Derivatives with Adenosine A1 and A3 Receptor Subtypes. QSAR. Comb. Sci 2003, 22, 614–621.
  38. Hansch, C; Leo, A; Taft, RW. A Survey of Hammett Substituent Constants and Resonance and Field Parameters. Chem. Rev 1991, 91, 165–195.
  39. Bhattacharya, P; Roy, K. QSAR of Adenosine A3 Receptor Antagonist 1,2,4-triazolo[4,3-a]quinoxalin-1-one Derivatives Using Chemometric Tools. Bioorg. Med. Chem. Lett 2005, 15, 3737–3743.
  40. Leardi, R. Genetic Algorithms in Chemometrics and Chemistry: A Review. J. Chemometrics 2001, 15, 559–569.
  41. Hemmateenejad, B. Optimal QSAR Analysis of the Carcinogenic Activity of Drugs by Correlation Ranking and Genetic Algorithm-Based. J. Chemometrics 2004, 18, 475–485.
  42. Wang, J; Zhang, L; Yang, G; Zhan, CG. Quantitative Structure-Activity Relationship for Cyclic Imide Derivatives of Protoporphyrinogen Oxidase Inhibitors: A Study of Quantum Chemical Descriptors from Density Functional Theory. J. Chem. Inf. Comput. Sci 2004, 44, 2099–2105.
  43. Hemmateenejad, B; Sanchooli, M. Substituent Electronic Descriptors for Fast QSAR/QSPR. J. Chemometrics 2007, 21, 96–107.
  44. Smeyers, YG; Bouniam, L; Smeyers, NJ; Ezzamarty, A; Hernandez-Laguna, A; Sainz-Diaz, CI. Quantum Mechanical and QSAR Study of Some a-Arylpropionic Acids as Anti-Inflammatory Agents. Eur. J. Med. Chem 1998, 33, 103–112.
Figure 1. PLS regression coefficients for the variables used in GA-PLS model.
Figure 2. Plots of the cross-validated predicted activity against the experimental activity for the QSAR models obtained by the MLR and GA-PLS methods.
Figure 3. Plot of variables important in projection (VIP) for the descriptors used in GA-PLS model.
Table 1. Chemical structure of flavonoid derivatives used in this study and their experimental and predicted activity for protein kinase inhibition.

| Compound | R | Experimental pIC50 (a) | Predicted pIC50 | REP (b) |
|---|---|---|---|---|
| 1 | 5,7-OH,4′-NH2 | 5.13 | 4.7707 | −0.0753 |
| 2 | 3,5,7,3′,4′-OH | 4.88 | 4.9431 | 0.0128 |
| 3 | 3,7,3′,4′-OH | 4.86 | 4.7707 | −0.0187 |
| 4 | 5,7,4′-OH | 4.83 | 4.4356 | −0.0889 |
| 5 | 5,4′-OH | 4.80 | 4.2603 | −0.1267 |
| 6 | 6,3′-OH | 4.80 | 4.4242 | −0.0849 |
| 7 | 6-OH,5,7,4′-NH2 | 4.74 | 4.1061 | −0.1544 |
| 8 | 5,7-OH | 4.71 | 4.0895 | −0.1518 |
| 9 | 4′-OH,3′,5′-OCH3 | 4.57 | 4.2687 | −0.0706 |
| 10 | 5,7,3′,4′-OH | 4.46 | 4.4172 | −0.0097 |
| 11 | 7,3′-OH | 4.41 | 4.4358 | 0.0058 |
| 12 | 6-OH,5,7,3′-NH2 | 4.34 | 4.3681 | 0.0064 |
| 13 | 6-OMe,8,3′-NH2 | 4.25 | 4.1649 | −0.0204 |
| 14 | 6-OH,3′,4′,5′-OCH3 | 4.22 | 4.3591 | 0.0319 |
| 15 | 3,5,7,4′-OH,3′,5′-OCH3 | 4.16 | 4.1649 | 0.0012 |
| 16 | 3,5,7,3′,5′-OH | 4.00 | 3.9947 | −0.0013 |
| 17 | 6,4′-NH2 | 3.99 | 3.9613 | −0.0072 |
| 18 | 6,8,4′-NH2 | 3.97 | 3.9764 | 0.0016 |
| 19 | 6-OH,8,4′-NH2 | 3.93 | 3.9446 | 0.0037 |
| 20 | 6,4′-OH | 3.93 | 3.9247 | −0.0013 |
| 21 | 7,8,4′-OH,3′,5′-OCH3 | 3.92 | 3.8990 | −0.0054 |
| 22 | 8,4′-NH2 | 3.91 | 3.8994 | −0.0027 |
| 23 | 6,4′-OH,3′,5′-OCH3 | 3.89 | 3.9133 | 0.0060 |
| 24 | 7-OH,4′-NH2 | 3.86 | 3.8815 | 0.0056 |
| 25 | 7-OH,6,4′-NH2 | 3.85 | 3.8296 | −0.0053 |
| 26 | 7,4′-OH | 3.78 | 3.8621 | 0.0213 |
| 27 | 7,8,3′-OH | 3.75 | 3.6903 | −0.0162 |
| 28 | 6,3′-NH2 | 3.70 | 4.0228 | 0.0803 |
| 29 | 4′-NH2 | 3.68 | 4.1850 | 0.1207 |
| 30 | 5-OH,6,4′-NH2 | 3.65 | 3.9325 | 0.0718 |
| 31 | 3,5,7-OH | 3.53 | 3.9794 | 0.1129 |
| 32 | 5,4′-OH,7-OCH3 | 3.55 | 3.7315 | 0.0487 |
| 33 | 5,3′-OH | 3.50 | 4.1209 | 0.1507 |
| 34 | 7,8-OH | 3.50 | 3.4873 | −0.0036 |
| 35 | 5-OH,8,4′-NH2 | 3.49 | 3.6705 | 0.0492 |
| 36 | 7-OH,8,4′-NH2 | 3.48 | 3.6694 | 0.0516 |
| 37 | 7-OH | 3.47 | 3.8567 | 0.1003 |
| 38 | 6-OCH3,8,4′-NH2 | 3.43 | 3.6709 | 0.0683 |
| 39 | 7,8-OH,3′,4′,5′-OCH3 | 3.40 | 4.0058 | 0.1512 |
| 40 | 3-COOCH3,4′-OH | 3.36 | 3.7081 | 0.0939 |
| 41 | 4′-OH | 3.30 | 3.7081 | 0.1101 |
| 42 | 7-OH,6,3′-NH2 | 3.30 | 3.3419 | 0.0125 |
| 43 | 7-OH,6,8,4′-NH2 | 3.12 | 3.3419 | 0.0664 |
| 44 | 3-COOCH3,4′-NH2 | 3.09 | 3.3419 | 0.0754 |
| 45 | 3-COOH,7-OCH3,4′-OH | 2.99 | 3.3262 | 0.1011 |
| 46 | 7,4′-OH,3′,5′-OCH3 | 2.90 | 3.3262 | 0.1281 |
| 47 | 7-OH,6,8,4′-NO2 | 2.81 | 3.0674 | 0.0839 |
| 48 | 3-COOH,4′-OH | 2.80 | 3.0674 | 0.0872 |
| 49 | 5-OCH3,8,4′-NH2 | 2.79 | 3.0674 | 0.0904 |
| 50 | 7-OH,8,4′-NO2 | 2.73 | 3.3262 | 0.1793 |

(a) pIC50 = −log(IC50).
(b) REP = relative error of prediction.
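The two footnote quantities can be reproduced with a short Python sketch. The pIC50 definition comes from footnote a; the REP formula is an assumption, since the table does not state it. The tabulated values appear consistent with the difference between predicted and experimental pIC50 divided by the predicted value. The function names and the molar unit for IC50 are likewise illustrative assumptions.

```python
import math


def pic50(ic50_molar: float) -> float:
    """Return pIC50 = -log10(IC50). IC50 is assumed to be in mol/L."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_molar)


def rep(experimental: float, predicted: float) -> float:
    """Relative error of prediction.

    Assumed form: (predicted - experimental) / predicted. This is inferred
    from the tabulated values, not quoted from the source.
    """
    return (predicted - experimental) / predicted


# Compound 1 of Table 1: experimental 5.13, predicted 4.7707
print(round(rep(5.13, 4.7707), 4))  # -0.0753
```

Under this assumed definition, a negative REP means the model underestimates the experimental activity.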
Table 2. Brief description of some descriptors used in this study.

| Descriptor type | Molecular description |
|---|---|
| Constitutional | Molecular weight, no. of atoms, no. of non-H atoms, no. of bonds, no. of heteroatoms, no. of multiple bonds (nBM), no. of aromatic bonds, no. of functional groups (hydroxyl, amine, aldehyde, carbonyl, nitro, nitroso, etc.), no. of rings, no. of circuits, no. of H-bond donors, no. of H-bond acceptors, no. of nitrogen atoms (nN), chemical composition, sum of Kier-Hall electrotopological states (Ss), mean atomic polarizability (Mp), number of rotatable bonds (RBN), mean atomic Sanderson electronegativity (Me), etc. |
| Topological | Molecular size index, molecular connectivity indices (X1A, X4A, X2v, X1Av, X2Av, X3Av, X4Av), information content index (IC), Kier shape indices, total walk count, path/walk Randic shape indices (PW3, PW4), Zagreb indices, Schultz indices, Balaban J index (such as MSD), Wiener indices, topological charge indices, sum of topological distances between F..F (T(F..F)), ratio of multiple path count to path counts (PCR), mean information content vertex degree magnitude (IVDM), eigenvalue sum of Z weighted distance matrix (SEigZ), reciprocal hyper-detour index (Rww), eigenvalue coefficient sum from adjacency matrix (VEA1), radial centric information index, 2D Petitjean shape index (PJI2), etc. |
| Geometrical | 3D Petitjean shape index (PJI3), gravitational index, Balaban index, Wiener index, etc. |
| Quantum | Highest occupied molecular orbital energy (HOMO), lowest unoccupied molecular orbital energy (LUMO), most positive charge (MPC), least negative charge (LNC), sum of squares of charges (SSC), sum of squares of positive charges (SSPC), sum of squares of negative charges (SSNC), sum of positive charges (SUMPC), sum of negative charges (SUMNC), sum of absolute charges (SAC), total dipole moment (DMt), molecular dipole moment in the X direction (DMX), molecular dipole moment in the Y direction (DMY), molecular dipole moment in the Z direction (DMZ), electronegativity (χ = −0.5 (HOMO + LUMO)), electrophilicity (ω = χ²/2η), hardness (η = 0.5 (LUMO − HOMO)), softness (S = 1/η). |
| Functional group | Number of total tertiary carbons (nCt), number of H-bond acceptor atoms (nHAcc), number of total hydroxyl groups (nOH), number of unsubstituted aromatic C (nCaH), number of ethers (aromatic) (nRORPh), etc. |
| Chemical | LogP (octanol-water partition coefficient), hydration energy (HE), polarizability (Pol), molar refractivity (MR), molecular volume (V), molecular surface area (SA). |
| Substituent electronic descriptors | RMSQ (root mean square of charges), SPQ (sum of positive charges), SNQ (sum of negative charges), RMSDM (root mean square of dipole moments along any Cartesian coordinate direction), TDM (total dipole moment), FRMS (root mean square force that any atom in the constituent molecule sees right before the optimization), FMAX (maximum force on the molecule), HOMO (highest occupied molecular orbital), LUMO (lowest unoccupied molecular orbital), HD (hardness), SOF (softness), EPH (electrophilicity), EN (electronegativity). |
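The orbital-based reactivity indices listed among the quantum descriptors can be derived from the frontier orbital energies alone. The following Python sketch is illustrative only: it assumes the conventional finite-difference (Koopmans-type) definitions, χ = −(E_HOMO + E_LUMO)/2 and η = (E_LUMO − E_HOMO)/2, together with ω = χ²/2η and S = 1/η as given in the table. The function name and the example energies are hypothetical and are not values from this study.

```python
def reactivity_indices(e_homo: float, e_lumo: float) -> dict:
    """Global reactivity indices from frontier orbital energies.

    Assumes the conventional definitions (not quoted from the study's code):
      electronegativity  chi   = -(E_HOMO + E_LUMO) / 2
      hardness           eta   =  (E_LUMO - E_HOMO) / 2
      electrophilicity   omega =  chi**2 / (2 * eta)
      softness           S     =  1 / eta
    Both energies must be in the same unit (e.g. eV).
    """
    chi = -0.5 * (e_homo + e_lumo)
    eta = 0.5 * (e_lumo - e_homo)
    if eta <= 0:
        raise ValueError("LUMO energy must lie above HOMO energy")
    return {
        "electronegativity": chi,
        "hardness": eta,
        "electrophilicity": chi ** 2 / (2.0 * eta),
        "softness": 1.0 / eta,
    }


# Hypothetical orbital energies in eV, chosen only for illustration
indices = reactivity_indices(e_homo=-6.0, e_lumo=-2.0)
print(indices)
```

Because all four indices are functions of the same two energies, they are strongly interrelated and are usually not entered into the same regression model together.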
Table 3. The results of MLR analysis with different types of descriptors.

| No. | Descriptor source | MLR equation | N | R² | SE | RMScv | Q² | F |
|---|---|---|---|---|---|---|---|---|
| E1 | Chemical | pIC50 = 4.893 (± 0.735) − 0.056 (± 0.017) HE − 0.007 (± 0.003) Mass | 50 | 0.40 | 0.55 | 0.58 | 0.32 | 13.82 |
| E2 | Quantum | pIC50 = 6.362 (± 0.565) − 6.805 (± 1.505) MPC | 50 | 0.43 | 0.53 | 0.54 | 0.38 | 17.44 |
| E3 | Constitutional | pIC50 = 3.139 (± 1.250) − 0.438 (± 0.100) nBM − 0.506 (± 0.205) AMW − 0.584 (± 0.266) nAB | 50 | 0.49 | 0.49 | 0.51 | 0.42 | 19.65 |
| E4 | Topological | pIC50 = 17.242 (± 0.605) − 3.374 (± 0.545) IVDM − 53.95 (± 12.355) X1Av + 2.349 (± 0.696) ICR + 24.874 (± 9.569) PW4 + 73.575 (± 33.719) X4A | 50 | 0.72 | 0.38 | 0.48 | 0.58 | 30.13 |
| E5 | Geometrical | pIC50 = −15.093 (± 3.339) + 19.450 (± 3.406) SPH − 0.010 (± 0.002) G(N...O) | 50 | 0.60 | 0.43 | 0.47 | 0.49 | 17.23 |
| E6 | Functional group | pIC50 = 3.672 (± 0.123) − 0.414 (± 0.130) nNO2 − 1.098 (± 0.369) nOHt + 0.160 (± 0.058) nOH | 50 | 0.53 | 0.45 | 0.50 | 0.45 | 12.67 |
| E7 | Hansch | pIC50 = 4.219 (± 0.289) − 0.615 (± 0.202) π5 + 1.462 (± 0.555) ℑR′3 − 1.379 (± 0.490) ℑR8 − 0.249 (± 0.111) L3 | 50 | 0.53 | 0.45 | 0.50 | 0.45 | 12.67 |
| E8 | SED | pIC50 = −0.708 (± 1.228) − 9.570 (± 2.500) HOMOA3 + 1.092 (± 0.308) SNQ8 | 50 | 0.82 | 0.32 | 0.30 | 0.61 | 51.43 |
| E9 | Molecular descriptor | pIC50 = −19.763 (± 4.304) − 4.785 (± 1.275) MPC + 25.113 (± 4.142) SPH + 0.849 (± 0.264) SNQ8 − 0.357 (± 0.136) L3 | 50 | 0.83 | 0.31 | 0.28 | 0.62 | 52.43 |
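The statistics reported in Table 3 (R², SE, RMScv, Q², F) can be obtained from an ordinary least-squares fit combined with leave-one-out cross-validation. The Python sketch below is a minimal illustration under assumed textbook definitions of these quantities; the exact formulas and software used in the study are not given here, and the descriptor matrix is synthetic, not the flavonoid data.

```python
import numpy as np


def fit_mlr(X, y):
    """Least-squares coefficients; element 0 is the intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    return coef[0] + X @ coef[1:]


def mlr_statistics(X, y):
    """Fit statistics plus leave-one-out cross-validation (assumed definitions)."""
    n, p = X.shape
    coef = fit_mlr(X, y)
    ss_tot = np.sum((y - y.mean()) ** 2)
    ss_res = np.sum((y - predict(coef, X)) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    se = np.sqrt(ss_res / (n - p - 1))          # standard error of regression
    f = (r2 / p) / ((1.0 - r2) / (n - p - 1))   # F statistic of the model

    # Leave-one-out: refit without compound i, then predict compound i
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        c = fit_mlr(X[keep], y[keep])
        press += (y[i] - predict(c, X[i:i + 1])[0]) ** 2
    q2 = 1.0 - press / ss_tot
    rms_cv = np.sqrt(press / n)
    return {"coef": coef, "r2": r2, "se": se, "f": f, "q2": q2, "rms_cv": rms_cv}


# Synthetic data: 50 "compounds", 2 hypothetical descriptors
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 4.0 + 0.8 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.3, size=50)
stats = mlr_statistics(X, y)
print({k: np.round(v, 3) for k, v in stats.items()})
```

For a least-squares model, the leave-one-out Q² cannot exceed R², which is why every equation in Table 3 shows Q² below R²; a large gap between the two suggests overfitting.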
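Table 4. Correlation coefficient (r) matrix for the descriptors of flavone derivatives used in the MLR equations.

| | HE | Mass | MPC | nBM | AMW | nAB | ASP | G(N...O) | X1AV | ICR | PW4 | X4A | IVDM | nNO2 | nOHt | nOH | ℑR′3 | L3 | ℑR8 | π5 | pIC50 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HE | 1 | −0.234 | 0.192 | 0.124 | −0.327 | 0.236 | −0.006 | 0.00 | 0.651 | 0.075 | −0.012 | 0.316 | 0.065 | 0.069 | 0.047 | −0.745 | −0.394 | 0.067 | −0.005 | 0.485 | −0.347 |
| Mass | | 1 | 0.531 | 0.580 | 0.512 | 0.136 | −0.269 | 0.328 | −0.655 | 0.416 | 0.541 | −0.631 | 0.816 | 0.554 | 0.099 | 0.211 | 0.326 | 0.196 | 0.487 | 0.040 | −0.268 |
| MPC | | | 1 | 0.953 | 0.715 | 0.366 | −0.233 | 0.623 | −0.539 | 0.304 | 0.050 | −0.329 | 0.904 | 0.876 | 0.259 | −0.227 | −0.286 | 0.289 | 0.595 | 0.156 | −0.547 |
| nBM | | | | 1 | 0.778 | 0.165 | −0.094 | 0.725 | −0.624 | 0.390 | 0.016 | −0.325 | 0.937 | 0.972 | 0.114 | −0.196 | −0.211 | 0.125 | 0.687 | 0.193 | −0.498 |
| AMW | | | | | 1 | 0.050 | −0.200 | 0.356 | 0.897 | 0.037 | 0.116 | −0.206 | 0.718 | 0.775 | 0.116 | 0.434 | 0.136 | 0.125 | 0.620 | 0.065 | −0.191 |
| nAB | | | | | | 1 | −0.684 | −0.127 | 0.069 | −0.192 | 0.257 | −0.397 | 0.235 | −0.073 | 0.692 | −0.086 | −0.198 | 0.930 | −0.108 | 0.185 | −0.364 |
| ASP | | | | | | | 1 | 0.294 | 0.155 | 0.538 | 0.532 | 0.388 | −0.221 | 0.069 | 0.369 | −0.273 | −0.201 | −0.768 | −0.039 | −0.098 | 0.269 |
| G(N...O) | | | | | | | | 1 | −0.379 | 0.578 | 0.299 | 0.348 | 0.618 | 0.763 | −0.138 | −0.478 | −0.437 | −0.182 | 0.508 | 0.034 | −0.329 |
| X1AV | | | | | | | | | 1 | −0.130 | −0.171 | 0.413 | −0.651 | −0.647 | −0.052 | −0.572 | −0.270 | −0.056 | −0.542 | 0.229 | 0.058 |
| ICR | | | | | | | | | | 1 | −0.212 | −0.277 | 0.442 | 0.441 | −0.104 | −0.410 | −0.161 | −0.278 | 0.153 | 0.168 | −0.080 |
| PW4 | | | | | | | | | | | 1 | −0.157 | 0.261 | −0.045 | 0.158 | 0.336 | 0.413 | 0.356 | 0.029 | −0.249 | 0.002 |
| X4A | | | | | | | | | | | | 1 | −0.489 | −0.233 | −0.252 | −0.046 | −0.025 | −0.466 | −0.261 | −0.157 | 0.347 |
| IVDM | | | | | | | | | | | | | 1 | 0.891 | 0.155 | −0.100 | −0.030 | 0.218 | 0.663 | 0.192 | −0.494 |
| nNO2 | | | | | | | | | | | | | | 1 | −0.050 | −0.177 | −0.166 | −0.097 | 0.720 | 0.151 | −0.416 |
| nOHt | | | | | | | | | | | | | | | 1 | 0.061 | −0.137 | 0.513 | −0.075 | 0.128 | −0.306 |
| nOH | | | | | | | | | | | | | | | | 1 | 0.621 | 0.104 | −0.004 | −0.375 | 0.370 |
| ℑR′3 | | | | | | | | | | | | | | | | | 1 | −0.070 | 0.008 | −0.014 | 0.315 |
| L3 | | | | | | | | | | | | | | | | | | 1 | −0.143 | 0.085 | −0.259 |
| ℑR8 | | | | | | | | | | | | | | | | | | | 1 | 0.224 | −0.367 |
| π5 | | | | | | | | | | | | | | | | | | | | 1 | −0.451 |
| pIC50 | | | | | | | | | | | | | | | | | | | | | 1 |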
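A correlation matrix such as Table 4 is typically used to spot collinear descriptors before they enter the same regression equation; in the table, for example, nBM correlates strongly with both MPC (0.953) and nNO2 (0.972). The Python sketch below shows one way such a screen could be run. The 0.9 cutoff, the function name, and the descriptor names d1 to d3 are assumptions for illustration, and the data are synthetic.

```python
import numpy as np


def collinear_pairs(X, names, threshold=0.9):
    """List descriptor pairs whose absolute Pearson r meets the threshold.

    X has one row per compound and one column per descriptor.
    The default threshold of 0.9 is an illustrative choice, not a value
    taken from the study.
    """
    r = np.corrcoef(X, rowvar=False)  # descriptors are columns
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(r[i, j])))
    return pairs


# Synthetic example: d2 is nearly a copy of d1, d3 is independent
rng = np.random.default_rng(1)
d1 = rng.normal(size=50)
d2 = d1 + 0.05 * rng.normal(size=50)
d3 = rng.normal(size=50)
X = np.column_stack([d1, d2, d3])

pairs = collinear_pairs(X, ["d1", "d2", "d3"])
print(pairs)
```

From each flagged pair, one would normally keep only the descriptor that correlates better with pIC50.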

Fassihi, A.; Sabet, R. QSAR Study of p56lck Protein Tyrosine Kinase Inhibitory Activity of Flavonoid Derivatives Using MLR and GA-PLS. Int. J. Mol. Sci. 2008, 9, 1876-1892. https://doi.org/10.3390/ijms9091876
