Molecular Sciences Triazoloquinazolines as Human A3 Adenosine Receptor Antagonists: A QSAR Study

Multiple linear regression analysis was performed on the quantitative structure-activity relationships (QSAR) of the triazoloquinazoline adenosine antagonists for human A3 receptors. The data set used for the QSAR analysis encompassed the activities of 33 triazoloquinazoline derivatives and 72 physicochemical descriptors. A template molecule was derived using the known molecular structure for one of the compounds when bound to the human A2B receptor, in which the amide bond was in a cis-conformation. All the test compounds were aligned to the template molecule. In order to identify a reasonable QSAR equation to describe the data set, we developed a multiple linear regression program that examined every possible combination of descriptors. The QSAR equation derived from this analysis indicates that spatial and electronic effects are greater than hydrophobic effects in the binding of the antagonists to the human A3 receptor. It also predicts that a large sterimol length parameter is advantageous to activity, whereas large sterimol width parameters and fractional positive partial surface areas are disadvantageous.


Introduction
Adenosine receptors belong to the rhodopsin family of G-protein-coupled receptors (GPCRs). These are transmembrane (TM) receptors with seven helices that play key roles in signal transduction, including the phosphorylation cascade [1]. The adenosine receptors include the subtypes A1, A2A, A2B and A3 [2], of which the adenosine A3 receptor exerts cardioprotective or neuroprotective effects during cardiac or cerebral ischemia, respectively [3]. The A3 receptor is activated by adenosine or other agonists; by activating inhibitory G proteins, it inhibits the generation of cAMP by adenylate cyclase [4]. However, activation of the human A3 receptor results in hypotension and release of inflammatory mediators from mast cells [5][6]. At present, human A3 receptor antagonists are being investigated as potential anti-inflammatory, anti-asthmatic and anti-ischemic agents [7].
In order to develop new therapeutics and to define the A3 receptor mechanism, investigations have focused on the search for potent and selective A3 receptor antagonists, including flavonoids [8][9], pyridine derivatives [10], 1,4-dihydropyridines [11][12] and triazoloquinazolines [13][14]. Derivatives of the latter group are considered to be amongst the strongest antagonists of the human A3 receptor [15]. Moro et al. used their seven-helix model of the human A3 receptor to identify certain key amino acid residues that could participate in the binding of antagonists [16]. However, this research only considered one triazoloquinazoline derivative and did not provide a comprehensive, quantitative analysis of antagonist activity.
Quantitative structure-activity relationship (QSAR) studies represent a powerful tool for relating the biological activities of compounds to their physicochemical properties, which are referred to as descriptors [17]. As a data set for QSAR analysis of A3 receptor antagonists, we chose a series of 33 triazoloquinazoline derivatives that had been reported by Jacobson et al. [14]. QSAR analysis was performed on the binding affinities of these antagonists using descriptors that represent 72 physicochemical parameters. In order to derive the best QSAR equation, we developed a full search multiple linear regression method (FS-MLR). This technique examines all potential combinations of the 72 descriptors in order to identify the descriptor combination that correlates most closely with the biological data.

Data Set
The 3D-sketcher module of the Cerius2 program (version 3.5, Molecular Simulations Inc., San Diego, CA) was used to generate molecular models of 33 triazoloquinazoline derivatives. Since there was no quantitative binding affinity data for the compound containing the (o-iodophenyl)acetyl (-COCH2-(2-I-Ph)) group as substituent, we excluded it from the analysis. The γ-aminobutyryl-substituted derivative 26 was selected as a template molecule because its molecular model could be derived from the literature [14]. Upon binding to the human A2B receptor, the amide bond of compound 26 is in a cis-conformation [14]. As A3 receptors are thought to form a seven-helix structure similar to that of A2B receptors, we used a similar cis-conformation for the amide bond in our model of compound 26 binding to the human A3 receptor. In compound 26, the long chain of the substituent lies in the plane of the molecule.
Molecular conformations of the other compounds were fitted manually to the template molecule. They were then energy-minimized using the Merck Molecular Force Field (MMFF) [18][19] and the models were re-aligned to the template molecule using the rigid fit alignment module in the Cerius2 program (Fig. 1). The charge equilibration method was used to assign atomic partial charges to each of the compounds [20]. Activity values for the QSAR equation were obtained as the negative logarithm of the binding affinity (Ki), which had been determined by radio-ligand binding assays [14].
The physicochemical properties of each compound were specified using 72 descriptors, which delineate conformational, electronic, spatial, structural, thermodynamic and quantum mechanical information. The QSAR+ module of Cerius2 was used to calculate all the descriptors.
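The conversion from binding affinity to activity value can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the Ki values in the usage note are made up, and the concentration unit of Ki is not stated in this section, so the sketch only assumes that all Ki values share one unit.

```python
import math


def activity_from_ki(ki: float) -> float:
    """Return the activity value -log10(Ki).

    Assumption: every Ki is expressed in the same concentration unit.
    """
    if ki <= 0:
        raise ValueError("Ki must be positive")
    return -math.log10(ki)


def fold_difference(activity_a: float, activity_b: float) -> float:
    """Return Ki_b / Ki_a, i.e. how many times more potent compound a is than b."""
    return 10 ** (activity_a - activity_b)
```

For instance, with illustrative values, `activity_from_ki(100.0)` gives an activity of -2, and two compounds whose activities differ by 2 log units differ 100-fold in Ki. A lower Ki therefore corresponds to a higher activity value.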

Full Search Multiple Linear Regression Method
A relationship between independent and dependent variables (physicochemical descriptors and biological activities, respectively) can be determined statistically using regression analysis. Linear regression fits a straight line to the data by the least squares method. Descriptors that are included in a reasonable QSAR equation should exhibit low collinearity and thus behave as independent variables [21]. We calculated the collinearity between descriptors using equation (1), and the quality of fit of a regression equation was assessed by its correlation coefficient and standard deviation, using equations (2) and (3), respectively. The fit improves as the correlation coefficient approaches one. The F value represents the level of statistical significance of the regression (equation 4). The predictive quality of a regression model can be evaluated using the leave-one-out cross-validation procedure (r²cv).
Collinearity:
$$r_{AB} = \frac{\mathrm{cov}(A,B)}{\sigma_A \, \sigma_B} \tag{1}$$
where σA and σB are the standard deviations of A and B, respectively, and cov(A,B) is the covariance of A and B.
Correlation coefficient:
$$r^2 = 1 - \frac{\sum (Y_{obs} - Y_{calc})^2}{\sum (Y_{obs} - Y_{mean})^2} \tag{2}$$
where $\sum (Y_{obs} - Y_{calc})^2$ is the sum of the squares about the regression, and $\sum (Y_{obs} - Y_{mean})^2$ is the sum of the squares about the mean.
Standard deviation:
$$s = \sqrt{\frac{\sum (Y_{obs} - Y_{calc})^2}{n - k - 1}} \tag{3}$$
F value:
$$F = \frac{r^2 \, (n - k - 1)}{k \, (1 - r^2)} \tag{4}$$
where n is the number of observations, k is the number of variables, and r is the correlation coefficient.
For a regression model, r² was used to describe the goodness of fit, which improves as r² approaches 1. The sum of the squared deviations of the dependent variable from its mean (SD) is given by Σ(Yobs - Ymean)² and the predicted sum of squares (PRESS) by Σ(Ypred - Yobs)². The cross-validated correlation coefficient (rcv) is defined as (1 - PRESS/SD)^(1/2) and is used to evaluate the predictive power of the QSAR equations that were generated. Each molecule in turn was eliminated from the training set, the model was refitted, and the activity predicted for the omitted molecule was used to calculate rcv.
The FS-MLR was performed using the least squares method, the statistical definitions described above and a full search method. Given that the full search method performs an exhaustive examination of all possible descriptor combinations, there is little concern that important descriptors might be missed, and this method enables identification of the QSAR equation with the best correlation.
The program determines the collinearity between descriptors, and combinations containing high inter-descriptor collinearity are discarded. Multiple linear regressions are performed on the remaining descriptor combinations and, once the correlation coefficients have been calculated, QSAR equations whose correlation coefficients equal or exceed a preset value are reported. We specified 0.7 and 0.9 as the collinearity and correlation coefficient cutoff values, respectively.
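The fit and cross-validation statistics described above can be sketched with NumPy as follows. This is an illustrative implementation under stated assumptions rather than the program used in the study: the function names are hypothetical, an intercept is always fitted, s and F use the standard forms with n - k - 1 degrees of freedom, and the cross-validated value is returned as r² (that is, 1 - PRESS/SD) instead of its square root so that a negative value does not raise an error.

```python
import numpy as np


def fit_mlr(X, y):
    """Least-squares fit of y on X plus an intercept (intercept is the last coefficient)."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients to a descriptor matrix X of shape (m, k)."""
    return np.column_stack([X, np.ones(len(X))]) @ coef


def regression_stats(X, y):
    """Return r2, s, F and leave-one-out r2_cv for a multiple linear regression."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape

    coef = fit_mlr(X, y)
    ss_res = np.sum((y - predict(coef, X)) ** 2)   # squares about the regression
    ss_mean = np.sum((y - y.mean()) ** 2)          # squares about the mean (SD)

    r2 = 1.0 - ss_res / ss_mean
    s = np.sqrt(ss_res / (n - k - 1))
    f_value = r2 * (n - k - 1) / (k * (1.0 - r2))

    # Leave-one-out: refit without compound i, then predict compound i.
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        coef_i = fit_mlr(X[keep], y[keep])
        press += (predict(coef_i, X[i:i + 1])[0] - y[i]) ** 2
    r2_cv = 1.0 - press / ss_mean

    return {"r2": r2, "s": s, "F": f_value, "r2_cv": r2_cv}
```

Because each left-out prediction comes from a model that never saw that compound, PRESS is never smaller than the residual sum of squares, so the cross-validated r² cannot exceed the fitted r².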
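The search procedure described above (discard collinear combinations, regress on the rest, report those above a cutoff) can be sketched as follows. This is a minimal sketch, not the authors' program: the function name and signature are hypothetical, and it assumes that both cutoffs apply to the absolute correlation coefficient r rather than to r², which the text does not specify.

```python
from itertools import combinations
from math import comb

import numpy as np


def full_search_mlr(X, y, n_terms, collin_cut=0.7, r_cut=0.9):
    """Exhaustively test every n_terms-descriptor combination.

    Returns a list of (r, descriptor_indices) sorted from best to worst.
    Assumption: both cutoffs are applied to |r|, not r squared.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = X.shape

    corr = np.abs(np.corrcoef(X, rowvar=False))   # descriptor-descriptor |r|
    ss_mean = np.sum((y - y.mean()) ** 2)
    upper = np.triu_indices(n_terms, k=1)         # each descriptor pair once

    hits = []
    for combo in combinations(range(m), n_terms):
        idx = list(combo)
        # Discard combinations in which any descriptor pair is too collinear.
        if n_terms > 1 and np.max(corr[np.ix_(idx, idx)][upper]) > collin_cut:
            continue
        A = np.column_stack([X[:, idx], np.ones(n)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        ss_res = np.sum((y - A @ coef) ** 2)
        r = np.sqrt(max(0.0, 1.0 - ss_res / ss_mean))
        if r >= r_cut:
            hits.append((r, combo))
    return sorted(hits, reverse=True)
```

The cost of an exhaustive search grows quickly with the number of terms: `comb(72, 5)` evaluates to 13,991,544 five-descriptor combinations for 72 descriptors, which is why the collinearity filter is applied before each regression.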

Selection of Outliers
Data points that cannot be described using the QSAR equation are referred to as outliers. In order to investigate these systematically, we eliminated each compound individually from the data set, generating 33 reduced data sets. We then used the FS-MLR to determine which QSAR equations gave the best correlation coefficients for each reduced data set. The best QSAR equation was determined using statistical analyses of the correlation coefficient, standard deviation and F-value. Once an outlier was identified, the elimination process was repeated, in order to obtain the best QSAR equations containing between one and five terms.
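The elimination step can be sketched as a leave-one-compound-out scan. This sketch simplifies the procedure described above: it refits a single, fixed descriptor set for each reduced data set, whereas the study reran the full descriptor search each time. The function names are hypothetical.

```python
import numpy as np


def _fit(X, y):
    """Least-squares fit with intercept; returns residuals and r squared."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return resid, r2


def outlier_scan(X, y):
    """Return (residual / s for each compound, r2 after removing each compound, r2 of full set)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape

    resid, r2_all = _fit(X, y)
    s = np.sqrt(np.sum(resid ** 2) / (n - k - 1))

    r2_without = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i          # reduced data set without compound i
        _, r2_without[i] = _fit(X[keep], y[keep])

    return resid / s, r2_without, r2_all
```

A compound whose removal gives the largest gain in r², and whose residual is several standard deviations in magnitude, is a candidate outlier; the scan can then be repeated on the reduced data set.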

Derivation of QSAR
In order to derive a reasonable QSAR equation, we performed FS-MLR and evaluated QSAR equations that used between one and six terms. With large data sets (n > 30), the linear regression equation should contain at least six data points per variable in order to avoid chance correlations [21]. In Table 1, the statistical values of the best QSAR equations for the full data set are presented in the column headed 'none'. If the statistical values were unsatisfactory (for example, r² = 0.61 for a five-term equation), the QSAR equation was not considered to be reasonable. Removal of outliers improved the correlation coefficient of the QSAR equations. Outliers were therefore inspected systematically and are summarized in Table 2, together with the statistical values of the best QSAR equations and the corresponding number of terms. Although compound 14 was found to be an outlier in equations using between one and four terms, the statistical values indicated that equations with fewer than four terms were unreliable (r² < 0.64).
Following elimination of compound 15, the best five-term QSAR equation improved markedly (r² = 0.75). The statistical values of QSAR equations with one to five terms are shown in Table 1, in the column headed '15'. The residual of compound 15 was -2.98, a magnitude more than six times the standard deviation (0.44), and the compound was therefore considered to be an outlier. In order to improve the regression equation further, compound 5 was also removed, as its residual was about four times the standard deviation. In Table 1, the statistical values of QSAR equations following elimination of both compounds 5 and 15 are shown in the column headed '5, 15'. Increasing the number of terms from four to five improved the correlation coefficient, standard deviation and partial F-value by 33, 29, and 259 %, respectively. However, even with removal of compound 14, a six-term equation afforded little improvement in the statistical values. Thus, it was concluded that removal of two outliers (compounds 5 and 15) and adoption of a five-term equation represented the most reasonable steps for derivation of a statistically reliable QSAR equation. The final QSAR equation was as follows:
-log Ki = 0.387 L - 3.697 B1 - 0.331 B3 - 92.456 FPSA3 - 10.423 ρ + 26.354 (5)
n = 31, r² = 0.820, r²cv = 0.716, s = 0.440, F = 22.805
in which n represents the number of data points used for derivation of the equation, r is the correlation coefficient, s is the standard deviation from the regression, and F is the F value. QSAR equation (5) demonstrates a significant correlation and is able to explain 82 % (r²) of the observed variation in activities. The cross-validated r² (0.716) indicates that this QSAR equation has good predictive ability. Descriptor values, observed activities, and predicted activities are presented in Table 3. Predicted activity versus observed activity is plotted in Fig. 2, and the correlation matrix (Table 4) demonstrates low collinearity (r² < 0.5) between descriptors. Although each descriptor on its own correlates only weakly with the observed activity (r² < 0.33), the combination of the five descriptors provides a good fit (Table 4).
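As a worked check, equation (5) and its reported statistics can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, the F expression is the standard form for a regression with an intercept (assumed here to correspond to equation 4), and any descriptor values passed to `predict_activity` would have to come from Table 3, which is not reproduced in this text.

```python
# Coefficients and intercept as reported in equation (5).
COEFFS = {"L": 0.387, "B1": -3.697, "B3": -0.331, "FPSA3": -92.456, "rho": -10.423}
INTERCEPT = 26.354


def predict_activity(descriptors: dict) -> float:
    """Predicted activity from equation (5); expects keys L, B1, B3, FPSA3, rho."""
    return INTERCEPT + sum(COEFFS[name] * descriptors[name] for name in COEFFS)


def f_value(r2: float, n: int, k: int) -> float:
    """Standard F statistic for a k-variable regression with intercept (assumed form)."""
    return r2 * (n - k - 1) / (k * (1.0 - r2))


def residual(observed: float, predicted: float) -> float:
    """Residual defined as observed minus predicted activity."""
    return observed - predicted


# Consistency checks using only numbers quoted in the text.
f_check = f_value(0.820, n=31, k=5)           # about 22.78, close to the reported 22.805
points_per_term = 31 / 5                      # 6.2 data points per variable
resid_15 = residual(-1.86, 1.12)              # -2.98 for compound 15
ratio_15 = abs(resid_15) / 0.44               # roughly 6.8 standard deviations
```

The small gap between the recomputed F value and the reported 22.805 is consistent with r² having been rounded to three decimals, and 6.2 data points per variable satisfies the stated minimum of six.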

QSAR Analysis
Although QSAR equation (5) predicted that the outlier compound 15 would exhibit very high activity (1.12), its observed activity was much lower (-1.86) (Table 3). This compound has a double bond conjugated between the carbonyl and phenyl groups, and the reason for its poor fit remains unclear. However, it is possible that the rigidity of the conjugated double bond causes steric hindrance in the binding site of the human A3 receptor.
The outlier compound 5, which has a di-substituted (4-NH2-3-I-Ph) phenyl group, was predicted to exhibit a lower activity (-3.37) than was observed (-1.67). In contrast, compounds 6 (4-NH2-Ph) and 7 (3-I-Ph) fit well, and the former was 3150 times more active than the latter. It is possible that the adverse effect of the 3-I substituent on activity is ameliorated by the 4-NH2 substituent through interactions with residues in the human A3 receptor. However, QSAR equation (5) failed to explain this interaction effectively.
The sterimol parameter L is defined as the length of a substituent along the axis of the bond between the first atom of the substituent and the parent molecule [22][23]. Therefore, the positive coefficient of L in QSAR equation (5) suggests that substituents that are long along this bond axis are favored. The fractional charged partial surface area (FPSA3) is obtained by dividing the atomic-charge-weighted positive surface area by the total molecular solvent-accessible surface area [24], and equation (5) indicates that low FPSA3 values are favored. Moro et al. assumed that when compound 1 is bound to the human A3 receptor, it is surrounded by transmembrane (TM) helices 3, 5, 6 and 7 and that its furan ring points in an extracellular direction [16]. Owing to similarities in receptor composition, it has been suggested that in the human A2B receptor the substituent of compound 26 is located midway between TM6 and TM7 and extends in an extracellular direction from the plane of the membrane [14]. Accordingly, the bond axis would be expected to extend to midway between TM6 and TM7 with an upward tilt (~45°) from the plane of the membrane, providing sufficient space for molecules with long substituents to enter the binding site. The sterimol width parameters (B1 and B3) are orthogonal to the bond axis (Fig. 1). There is limited space in the TM structure of the human A3 receptor in the direction parallel to the membrane, and the negative coefficients of B1 and B3 reflect steric hindrance in this direction. The negative coefficient of FPSA3 indicates that densely positive surface areas contribute negatively to the binding of compounds by the human A3 receptor. Thus, we suggest that with respect to ligand binding, the positively charged residues of the receptor contribute more than the negatively charged residues. This hypothesis is supported by the findings of Moro et al. [16], which indicate that no negatively charged residues are present in the TM core near the putative binding site and that His272 (part of TM7) is located within 5 Å of bound compound 1.
Density (ρ), which is defined as the ratio of molecular weight to molecular volume inside the contact surface, can be used to reflect how tightly a molecule fits into the binding site, and its negative coefficient indicates that densely packed molecules have low activity.
In order to analyze QSAR equation (5) further, the data set was divided into five subgroups and a linear regression was performed on each subgroup using one or two of the descriptors from equation (5). The subgroup members, corresponding regression equations and correlation coefficients are presented in Table 5. The first subgroup comprised compounds containing a substituent on the phenyl ring of compound 1b. These compounds exhibit a wide range of molecular weights (403-529) and, therefore, density had a marked effect on their activities. For example, compounds 5, 7, and 8 were found to exhibit weak activities because of their relatively heavy iodine substituents. The second subgroup comprised molecules with substituents on the α-carbon of the acetyl group; bulkier substituents increased the B3 values and thereby had a negative influence on activity. For example, the highest B3 value (6.41) in this subgroup was exhibited by compound 13, which is 330-fold less potent than compound 12. The third subgroup included compounds that contained bulky tert-butoxycarbonyl substituents and therefore have a high sterimol width parameter (B3); the critical effect that B1 and B3 exert on the activities of this subgroup may be seen in Table 5. The fourth subgroup comprised compounds containing amine chains of various lengths, and the sterimol length parameter L was found to have a significant effect on this group's activities. The final subgroup contained esters and carboxylic acids, and the trend in activity could be explained by the electronic properties represented by FPSA3. In all the regression equations presented in Table 5, the signs of the descriptor coefficients were consistent with those of QSAR equation (5) and the correlation coefficients for the subgroups were high, indicating that the descriptors in QSAR equation (5) were well selected.

Conclusions
Figure 3. Model for the binding of triazoloquinazoline derivatives to the human A3 receptor. This model was generated using QSAR equation (5) and information from the docking study by Moro et al. [16]. The direction of substitution is represented by the wavy bond, and the arrow indicates the bond axis and its direction. The bond axis is tilted by about 45° from the plane of the membrane. The sterimol length parameter L is defined as the length of a substituent along the bond axis. Substituents with high L values were favored, whereas high values of the sterimol width parameters B1 and B3, which are orthogonal to L, were found to be unfavorable for binding of compounds to the human A3 receptor.
Using the FS-MLR method, we have derived the most statistically reasonable QSAR equation for the triazoloquinazoline derivatives. The physicochemical descriptors used in the QSAR equation reveal that the electronic (FPSA3) and spatial (L, B1, B3 and ρ) characteristics of substituents make critical contributions to ligand-receptor binding. Although the hydrophobic properties of the substituents in triazoloquinazoline derivatives might be expected to play a crucial role in the binding affinity of these compounds to the generally hydrophobic core of the TM helix [1], our QSAR analysis suggests that they exert a negligible effect on ligand binding to the human A3 receptor. Hydrophobic effects are only important for the chlorophenyl moiety of these compounds. We propose a model for the binding of triazoloquinazoline derivatives to the human A3 receptor that is based on our QSAR analysis and the docking study performed by Moro et al. [16] (Fig. 3). The QSAR equation is consistent with the findings of the docking study [16] and provides a quantitative explanation of the trends in binding affinity in relation to the physicochemical properties of the compounds. Although this QSAR equation is useful, the caveat remains that it was derived from a limited number of amide-containing compounds. Thus, it is only applicable to similar compounds. Future research will focus on modeling the human A3 receptor, including its loop sections, in order to aid in the design of potent compounds that bind selectively to the A3 receptor.

Figure 1. Front (a) and side (b) view of the aligned molecules on the membrane plane. The furan ring has been reported to point in an extracellular direction [16].

Figure 2. Predicted versus observed human A3 receptor binding affinities. Predicted values were determined using equation (5).

Table 1. Improvement of the best QSAR equation in relation to the number of terms and outliers removed.

Table 2. Elimination of the outlier from the best QSAR equation and the corresponding statistical values in relation to the number of terms.