Molecular Modeling Studies of 4,5-Dihydro-1H-pyrazolo[4,3-h] quinazoline Derivatives as Potent CDK2/Cyclin A Inhibitors Using 3D-QSAR and Docking

CDK2/cyclin A has appeared as an attractive drug targets over the years with diverse therapeutic potentials. A computational strategy based on comparative molecular fields analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) followed by molecular docking studies were performed on a series of 4,5-dihydro-1H-pyrazolo[4,3-h]quinazoline derivatives as potent CDK2/cyclin A inhibitors. The CoMFA and CoMSIA models, using 38 molecules in the training set, gave r2cv values of 0.747 and 0.518 and r2 values of 0.970 and 0.934, respectively. 3D contour maps generated by the CoMFA and CoMSIA models were used to identify the key structural requirements responsible for the biological activity. Molecular docking was applied to explore the binding mode between the ligands and the receptor. The information obtained from molecular modeling studies may be helpful to design novel inhibitors of CDK2/cyclin A with desired activity.


Introduction
Essentially all physiological processes and a majority of human diseases involve protein phosphorylation. Given the fact that protein phosphorylation is a primary post-translational mechanism applied by cells to regulate enzymes and other proteins in each of the cell cycle transitions, its deregulation has been regarded as the cause or consequence of many maladies [1][2][3]. CDKs/cyclins, a series of binary protein kinase, show genetic defects in many malignant diseases such as Alzheimer's [4], Parkinson's [5], Nieman-Pick's diseases [6], and ischemia [7] as well as traumatic brain injury [8], when deregulated. CDKs/cyclins exert their effects via activation of host proteins through phosphorylation of key serine or threonine residues by ATP. It was revealed in previous studies that the inhibitors of these CDKs/cyclins were down-regulated in most of the cancer cells [9,10]. A considerable amount of investigations have been carried out to develop inhibitors that target CDK2/cyclin A for treating cancer, and several CDK2/cyclin A inhibitors have been under clinical evaluation [10]. 3D-QSAR and docking approaches have emerged as one of the most powerful tools in ligand based drug design strategies [11,12]. They have been used to develop efficient models for identifying CDK2/cyclin A inhibitors [13,14].
Recently, a series of compounds containing 4,5-dihydro-1H-pyrazolo [4,3-h]quinazoline that have potent CDK2/cyclin A inhibitory activities were reported by literature [15]. In this paper, molecular modeling studies of these 4,5-dihydro-1H-pyrazolo [4,3-h]quinazoline derivatives were performed by using 3D-QSAR and docking approaches. 3D-QSAR including comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) methods were performed to predict the inhibitory activities of these inhibitors, and to provide the regions in space where interactive fields may influence the activity. Meanwhile, a docking study was employed to investigate the protein-ligand interactions. The constructed models can help not only in understanding the structure-activity relationship of these compounds but can also serve as a useful guide for the design of new inhibitors with desired potencies.

CoMFA Model
The statistical parameters corresponding to the CoMFA model are listed in Table 1. The CoMFA model of a series of 4,5-dihydro-1H-pyrazolo [4,3-h]quinazoline derivatives was generated using leave-one-out PLS analysis with an optimized component of 5 to give a good cross-validated correlation coefficient (r 2 cv ) of 0.747 (>0.5), which suggesting that the model should be a reasonable tool for predicting the IC 50 values. A high non-cross-validated correlation coefficient (r 2 ) of 0.970 with a low standard error estimate (SEE) of 0.225 was obtained as well as an F value of 206.080 and predictive correlation coefficient (r 2 pred ) of 0.942. Contributions of steric and electrostatic fields were 0.599 and 0.401, respectively. The actual and predicted pIC 50 values of the training set and test set by the model are listed in Table 2, and the graph of actual activity versus predicted pIC 50 of the training set and test set is illustrated in Figure 1.

CoMSIA Model
The statistical parameters corresponding to the CoMSIA model are listed in Table 1. The CoMSIA model, consisting of steric (S), electrostatic (E), hydrophobic (H), hydrogen bond donor (D) and acceptor (A) fields, can be generated using these fields in different combinations. The results of CoMSIA analysis with different combinations are summarized in Table 3. Among the combination models, steric, electrostatic and hydrogen bond acceptor fields played essential roles for the present series of compounds. To confirm whether the addition of hydrophobic, hydrogen bond donor and acceptor fields affect the model, each descriptor was considered along with steric and electrostatic descriptors for generating the model. Inclusion of the hydrophobic field descriptor caused a reduction in both r 2 cv and r 2 pred , which implied that the hydrophobic field descriptor may not be crucial for these molecules. Moreover, the removal of steric and electrostatic descriptors (H + D + A) resulted in significant reduction in r 2 cv , r 2 and r 2 pred . The S + E + A combination was better than the S + E + H + D + A combination in every statistical parameter, which suggested that the steric, electrostatic and hydrogen bond acceptor functional groups were of extreme significance for the inhibitory activity. In conclusion, the combination of steric, electrostatic and hydrogen bond acceptor fields was selected as the best model. The CoMSIA model with a combination of steric, electrostatic and hydrogen acceptor fields gave a good cross-validated correlation coefficient (r 2 cv ) of 0.518 (>0.5) with an optimized component of 6. A high non-cross-validated correlation coefficient (r 2 ) of 0.934 was attained, as well as a low standard error estimate (SEE) of 0.339, F value of 72.528 and predictive correlation coefficient (r 2 pred ) of 0.931. Contributions of steric, electrostatic and hydrogen bond acceptor fields were 0.373, 0.472 and 0.155, respectively. The actual and predicted pIC 50 values and residual values for the training set and test set compounds are listed in Table 2. The association between actual and predicted pIC 50 of the training set and test set compounds is illustrated in Figure 2.

CoMFA Contour Maps
To view the information of the resultant 3D-QSAR model, CoMFA contour maps were generated to rationalize the regions in 3D space around the molecules where changes in the steric and electrostatic fields were predicted to enhance or lessen the activity of the compound. The CoMFA steric and electrostatic contour maps are shown in Figure 3.
The steric field is characterized by green and yellow contours, in which yellow contours indicate regions where minor groups would be favorable, while the green contours represent regions where minor groups would decrease the activity. Compound 19 was selected as a reference structure. As shown in Figure 3A, the N-1 position (R 1 ) was surrounded by two small yellow contours, which suggested a minor group at this position would increase the inhibitory potency. This may explain why compounds 01, 02, 04 which possessed a minor group (e.g., Me, H) at R 1 showed significantly increased activities compared to those with a bulky substituent. For instance, compounds 1-8 had an order for the potency of 01 > 02 > 05 > 03 > 08 > 07, with the corresponding R 1 substituent Me, F 3 CCH 2 -, Cyclohexane, Phenyl, 1-piperidine-CH 2 -CH 2 -, 1-methyl-piperidine-, respectively. The presence of the yellow contour around the C-3 (R 2 ) position also suggested a bulky group at this region would be unfavorable. By checking up all the C-3 modified compounds, it was found that derivatives 1 and 9-14 have the activity order of 1 (R 2 = NH 2 ) > 10 (R 2 = OH) > 11 (R 2 = NHMe) > 9 (R 2 = OEt) > 12 (R 2 = NHcyclopropyl) > 13 (R 2 = NHcyclopentyl) > 14 (R 2 = NHPh). This is satisfactory in accordance with the contour map. The large yellow contour around the benzene at R 3 indicated that minor groups at this position may benefit potency. This may explain why compound 28 (R 3 = SMe) was more potential than 34 (R 3 = SO 2 NH 2 ), while compound 34 (R 3 = SO 2 NH 2 ) was more active than 40 (R 3 = SPh). Comparing compound 27 (R 3 = Me) with 31 (R 3 = i-Pr), as well as 18 (R 3 = Ac) with 32 (R 7 = CO 2 Me), it could be easily found that their activity discrepancies can also be explained by this yellow contour. The electrostatic field ( Figure 3B) is indicated by blue and red contours, which exhibit the regions where electron-donating groups and electron-withdrawing groups would be favorable, respectively. Compound 19 was selected as a reference molecule again. In the CoMFA electrostatic field, a strip blue contour around the N-1 (R 1 ) side chain revealed the electron-donating substituent was essential for the inhibitory activity. Take the compound 2 (R 1 = CF 3 CH 2 ) for an example: The strong electron-withdrawing -CF 3 group at the terminal of N-1 side chain in compound 2 resulted in significantly decreased activity compared to the compound 1 with the electron-donating substituent -CH 3 . The red contour near the C-3 (R 2 ) position demonstrated that the electronwithdrawing groups at this position would benefit potency, this may be the reason why compounds 9, and 11-13, which possessed electron-donating groups, had decreased potencies compared to the compounds with -OH group such as compound 10 (R 2 = OH). The three red contours around the benzene at R 3 revealed that the electron-withdrawing groups at this position may increase the potency. For instance, compounds 30, 29, 31 had an order for the activity of 30 > 29 > 31, with the corresponding R 3 substituent -F, -NHMe, i-Pr, respectively.

CoMSIA Contour Maps
The best combination model for CoMSIA were steric, electrostatic and hydrogen bond acceptor fields ( Table 3). The hydrophobic and hydrogen bond donor fields were not essential for the CoMSIA model, thus their contours were not generated. The CoMSIA steric and electrostatic field contour maps were approximately similar to the corresponding CoMFA contour maps, therefore the figures were not illustrated, either ( Figure 4A or 4B).
The hydrogen bond acceptor field contour map of CoMSIA is shown in Figure 4 using compound 19 as a reference molecule. The magenta and red contours represent favorable and unfavorable hydrogen bond acceptor groups. In the CoMSIA hydrogen bond fields, the magenta contour near the benzene (m,p-R 3 ) revealed that hydrogen bond acceptor groups may benefit the potency. The -F, -O, and -N atom at this position acted as hydrogen bond acceptor, this may explain why compounds 16-17, 19-20, 22-23 and 25-26 showed relatively better activities. One huge red contour around the benzene (o-R 3 ) revealed that hydrogen bond acceptor groups may decrease the inhibitory activity. For example, compounds 1, 15, 29, 30 had an order for the activity of 1 > 30 > 29 > 15, with the corresponding o-R 3 substituent -H, -F, -NHMe, -CF 3 , respectively.

Docking Analysis
Docking was implemented to find the probable binding conformations between these 4,5-dihydro-1H-pyrazolo [4,3-h]quinazoline derivatives and the receptor, furthermore, to check the reliability of the 3D-QSAR models established. Since the crystal structure of CDK2/cyclin A was known, we docked compound 19 into the allosteric site of CDK2/cyclin A (PDB code 2WXV), and the surfex-dock total score was 9.17.
As shown in Figure 5, the key residues and hydrogen bonds were labeled, namely: the O and N at the C-3 position of the pyrazolo ring in compound 19 served as hydrogen bond acceptor and donor by forming two H-bond with the -NH 2 , -OH group of LEU83 residue, respectively. The N atom of the pyrimidine ring and the O atom of m-Ac in compound 19 acted as the hydrogen bond acceptors by forming two H-bonds with -NH 2 group of LYS33 residue. The results confirmed the observation from the CoMSIA hydrogen bond acceptor contour map. In order to test and verify the use of docking, the MOLCAD surface with cavity depth potential was generated and is shown in Figure 6. The cavity depth measures how deep a surface point is located inside a cavity of a molecule. The cavity depth color ramp ranges from blue (low depth values represent outside of the molecule) to light red (high depth values represent cavities deep inside the molecule). It can be observed that the whole molecule was in the light red region ( Figure 6) which revealed that compound 19 was placed well in the allosteric site.  The MOLCAD surface of the allosteric site was developed and displayed with electrostatic potential to test and verify the CoMFA electrostatic contour map (Figure 7). The molecular electrostatic potential on a protein surface can be applied to find the sites that act attractively on ligands by matching opposite colors. The compound 19 was docked into the allosteric site; the red color shows the electron-withdrawing zone and purple color shows electron-donating zone. The observation seen in Figure 7 was satisfactory according to that of CoMFA electrostatic contour map. In detail, the R 2 region is in the red zone, which suggested that electron-withdrawing substituent would be favorable; the R 1 region is in a blue zone, which indicated that electron-donating groups may be favorable.
To better visualize the protein structure, in this paper, protein residues were explored using the ribbon program (Figure 8). The protein backbone is drawn as a ribbon or tube. Representations of proteins in Richardson style use arrows for beta strands, cylinders for alpha helices and tubes for coils and turns. As showed in Figure 8, the two protein residues involved-LYS33 and LEU83-lie within arrows designating beta strands.

Design of New Molecules Based on COMFA, CoMSIA and Docking Studies
The detailed contour map analysis of both COMFA and CoMSIA models and the docking analysis empowered us to identify structural requirements for the observed inhibitory activity (Figure 9). The molecules were modified to further improve the inhibition activity toward CDK2/CyclinA. Compound 19 were chosen as a reference structure to design new molecules to obtain a number of new potent molecules ( Figure 10). The newly designed molecules were docked into the protein active site. The COMFA and CoMSIA models established above were used to predict the activity by applying the 3D-QSAR model. The new molecules showed better dock score and predicted activity ( Table 4). The comparison of the predicted activity of the newly designed molecules between CoMFA and CoMSIA models are showed in Figure 11. The designed molecules showed better activity than the reference molecules, which indicates that the 3D-QSAR model has a good predictability and can be used to design new molecules with better activity.     Table 4. Surflex-Dock total-score and predicted activity of newly designed molecules.

Data Sets
The 47 compounds involved in this study were taken from the literature [15]. The inhibitory activities were reported as IC 50 against CDK2/cyclin A. The IC 50 values were converted into pIC 50 by taking Log (1/IC 50 ). The entire derivatives were divided into a training set of 38 compounds and a test set of nine compounds for model validation. The test set compounds were selected randomly. Chemical structures and associated inhibitory activities are shown in Table 5 and Table 1.

Molecular Modeling and Alignment
Molecular modeling and statistical analysis were performed using the molecular modeling package SYBYL 8.1 Tripos, Inc.
[16]. The three-dimensional structures of all compounds were constructed using the Sketch Molecule module. Energy minimization of each structure was performed using the SYBYL energy minimizer Tripos force field and Gasteiger-Hückel charge [17,18]. All of the compounds were aligned into a lattice box by fitting with common substructure ( Figure 12) using compound 19 as a template, which was one of the most active compounds. The conformation of the template was based on crystallographic ligand/receptor complex. (The aligned molecules in the training set are shown in Figure 13).

CoMFA and CoMSIA Modeling
The CoMFA descriptor fields including the steric fields and the electrostatic fields were calculated at each lattice with grid spacing of 1 Å and extending to 4 Å units in all three dimensions within defined region [17,18]. The Van Dar Waals potentials and Coulombic terms, which represented steric and electrostatic fields, respectively, were calculated by using the standard Tripos force field. In CoMFA method, a sp 3 hybridized carbon atom with a charge of 1e was used as a probe atom, the energy values of the steric and electrostatic fields were truncated at 30 kcal/mol [18][19][20].
The steric, electrostatic, hydrophobic, hydrogen bond donor and hydrogen bond acceptor CoMSIA potential fields were calculated at each lattice intersection of a regularly spaced grid of 1 Å and extending to 4 Å using a probe atom with radius 1.0 Å, +1.0 charge, and hydrophobic and hydrogen bond properties of +1. The attenuation factor was set to the default value of 0.3 [21].

Partial Least Squares (PLS) Analysis
The partial least-squares (PLS), an extension of multiple regression analysis, was used to linearly correlate the CoMFA and CoMSIA fields to the pIC 50 values. The cross-validation analysis was performed using the leave-one-out (LOO) method in which one molecule was removed from the data set and its activity was predicted using the model derived from the rest of the data set [20]. PLS was used in conjunction with the cross-validation option to determine the optimum number of components (ONC) which were then used in deriving the final CoMFA and CoMSIA model without cross-validation. The ONC was the number of components resulted in highest cross-validated correlated correlation coefficient (r 2 cv ) [20][21][22]. Column filtering was used at the default value of 2.0 kcal/mol in the cross-validation part. The final models were developed with ONC by using non-cross-validated analysis equal yielded the highest correlation coefficient (r 2 ) [23].

Predictive Correlation Co-Efficient (r 2 pred )
The predictive abilities of 3D-QSAR models were validated by predicting the activities of a test set of eight compounds that were not included in the training set. These molecules were aligned to the template and their pIC 50 values were predicted by the produced models which were obtained using the training set. The predictive correlation coefficient (r 2 pred ), based on the molecules of test set, was calculated using Equation (1): In this equation, SD is the sum of the squared deviations between the inhibitory activities of the test set and the mean activity of the training molecules and PRESS is the sum of squared deviations between predicted and actual activity values for each molecule in the test set [22][23][24][25][26].

Molecular Docking
To study the protein-ligand interactions, compound 19 with a high pIC 50 value was selected as a reference compound and docked into the ATP-binding site of CDK2/cyclin A. The ATP binding site was situated in a deep cleft between a amino-terminal lobe (residues 1-85) and a carboxy-terminal lobe which contains the catalytic residues conserved among eukaryotic protein kinases. The ATP pocket of CDK2/cyclin A has an impressive capacity to accommodate a variety of inhibitors containing flat heterocyclic structures [27]. The Surflex-Dock using an empirical scoring function and a patented search engine to dock ligands into a protein's binding site was used to investigate molecular docking [26]. The scoring function was tuned to predict the binding affinities of protein/ligand complexes, with its output being represented in units of −log(K d ) 2 . The terms, in rough order of significance, were hydrophobic complementarity, polar complementarity, entropic terms, and solvation terms. The full scoring function was the sum of each of these terms [27].
The crystal structure of CDK2/cyclin A was obtained from Protein Data Bank, having a PDB entry of 2WXV. The CDK2/cyclin A structure was exploited in subsequent docking experiments without energy minimization. The compound 19 was docked into corresponding protein's binding site by an empirical scoring function and a patented search engine in Surflex-Dock [16]. The automatic docking was applied. All the inhibitor and water molecules from crystal structure have been removed and the polar hydrogen atoms were added.
The MOLCAD (Molecular Computer Aided Design) program was applied to visualize the binding mode between the ligand and protein. MOLCAD calculates and displays the surfaces of channels, Ribbon, and cavities, as well as the separating surface between protein subunits [16]. MOLCAD program provides several types to create a molecular surface, the fast Connolly method which uses a marching cube algorithm to generate the surface was utilized. Other parameters were established by default in software.

Conclusion
We have employed 3D-QSAR and docking methods to explore the structure-activity relationship of a series of 4,5-dihydro-1H-pyrazolo[4,3-h]quinazoline derivatives as potent CDK2/cyclin A inhibitors. The CoMFA analysis was used to build statistically significant models with good correlative and predictive capability for the inhibition of CDK2/cyclin A by 47 4,5-dihydro-1H-pyrazolo [4,3-h] quinazoline derivatives. These models could be used to predict the inhibitory potencies of related structures. The analysis of contours for the CoMFA models has provided a clue about the structural requirement for the observed biological activity for the respective kinases: A more electron-withdrawing group and less bulky substitution on the pyrazolo ring are expected to improve the inhibitory potency. Furthermore, the CoMSIA contour maps along with the docking results offered enough information that more hydrogen bond acceptor groups on the benzene ring (m,p-R 3 ) and more hydrogen bond donor groups on the benzene ring (o-R 3 ) may benefit the potency. The clues obtained from 3D-QSAR and docking studies can be served as a useful guideline for the amplification of the known CDK2/cyclin A family of inhibitors. The designed molecules based on those parameters showed better activity than the reference molecules, which indicates that the 3D-QSAR model that was generated has a good predictability and can be used to design new molecules with better activity. These molecules can be synthesized to generate a greater number of molecules with required pharmacokinetics for further clinical studies.