Molecular Sciences Self-organizing Molecular Field Analysis on a New Series of COX-2 Selective Inhibitors: 1,5-Diarylimidazoles

Self-organizing molecular field analysis (SOMFA), a simple three-dimensional quantitative structure–activity relationship (3D-QSAR) method, is used to study the correlation between the molecular properties and the anti-inflammatory activities of a new series of 1,5-diarylimidazoles that act as selective COX-2 inhibitors. The statistical results, a cross-validated $r^2_{CV}$ of 0.507 and a non-cross-validated $r^2$ of 0.546, indicate satisfactory predictive ability.


Introduction
Nonsteroidal anti-inflammatory drugs (NSAIDs) [1] exert their anti-inflammatory action primarily by inhibiting cyclooxygenase (COX), which catalyzes the conversion of arachidonic acid to prostaglandin H$_2$ (PGH$_2$) [2], which is subsequently converted to a number of other prostaglandins that are potent mediators of inflammation. Cyclooxygenase exists in at least two isoforms, COX-1 and COX-2 [3]. COX-1 is a constitutive enzyme [4], whereas COX-2 is an inducible isoform associated with inflammation [5]. All classical NSAIDs, such as aspirin, ibuprofen, and indomethacin, inhibit both COX-1 and COX-2 but bind more tightly to COX-1 [6]. Selective COX-2 inhibitors are proving to have the same anti-inflammatory, antipyretic, and analgesic activities as nonselective NSAIDs, but with few or none of their gastrointestinal side effects [7]. The search for novel, selective COX-2 inhibitors has therefore intensified because of their therapeutic potential in the treatment of inflammation. Recently, a new series of 1,5-diarylimidazole compounds has been reported to inhibit COX-2 selectively [8].
Self-organizing molecular field analysis (SOMFA) [9] is a simple 3D-QSAR technique developed recently by Robinson et al. The method has similarities to both comparative molecular field analysis (CoMFA) [10] and molecular similarity studies. Like CoMFA, it uses a grid-based approach; however, no probe interaction energies need to be evaluated. As in the similarity methods, it is the intrinsic molecular properties, such as molecular shape and electrostatic potential, that are used to develop the QSAR models.
A SOMFA model could also suggest a way of tackling the all-important alignment problem that all 3D-QSAR methods face. The inherent simplicity of the method makes it possible to align the training compounds as an integral part of the model derivation process and to align prediction compounds so as to optimize their predicted activities.
The purpose of this paper is to describe the application of SOMFA to this set of 1,5-diarylimidazoles, a novel class of selective COX-2 inhibitors. Our main objective is to obtain information from the SOMFA analysis that is useful for designing new specific COX-2 inhibitors, in the hope that these molecules may be further explored as powerful non-ulcerogenic anti-inflammatory agents.

Data sets and biological activities
Twenty-nine 1,5-diarylimidazole compounds are divided into two sets. The structures of the 22 training-set molecules and their anti-inflammatory activities, expressed as $-\log(\mathrm{IC}_{50})$, are shown in Table 1. The predictive power of the models is evaluated using a test set of 7 molecules, whose structures and activities are also shown in Table 1. The two sets, 29 molecules in total, are selected in order to identify molecular descriptors and to derive models that discriminate predictively between the various activities. All compounds and their activities are processed as enantiomers in order to reduce the molecular alignment error arising from different configurations and to increase the correlation of the SOMFA models.
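As a small illustration of the activity scale used here, the sketch below converts an IC50 value to -log(IC50). The source does not state the concentration units, so the function assumes that the value is supplied in mol/L; the function name and the example value are hypothetical and are not taken from Table 1.

```python
import math


def neg_log_ic50(ic50_molar: float) -> float:
    """Return -log10(IC50) for an IC50 given in mol/L (units are an assumption)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_molar)


# Hypothetical example value (1 micromolar), not a measured activity.
example_activity = neg_log_ic50(1e-6)
```

On this scale a larger value corresponds to a more potent inhibitor, which is why the models are fitted to -log(IC50) rather than to IC50 itself.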

Molecular modeling and docking
The three-dimensional structures of the 1,5-diarylimidazoles are constructed with the CAChe Worksystem Pro evaluation version [11], running on an AMD Athlon XP 2400+ processor under Microsoft Windows XP.
Unless otherwise indicated, default parameters are used. Full geometry optimizations are performed with the PM5 [11] semi-empirical method in the CAChe software. The final search for active conformations is performed with the "dock into ActiveSite" method, also in the CAChe software. The PDB entry of the cyclooxygenase receptor used in the docking experiments is 6COX. As an example, the docked structure of ZA18 versus SC-558 in the active site of COX-2 is shown in Figure 1.
SOMFA analysis is then performed on these compounds, aligned either according to the docked structures or by superimposing the three rings in the optimized geometries of the 1,5-diarylimidazoles. The superposition of the 1,5-diarylimidazole structures after docking is shown in Figure 2; the superposition according to the optimized structures and the alignment of the three rings is shown in Figure 3. Using the VEGA software [12], the final overlaid geometries are converted into the CSSR file format, the only format that the SOMFA2 program accepts for a SOMFA analysis.

SOMFA 3D-QSAR models
In the SOMFA study, a 40 × 40 × 40 Å grid originating at (-20, -20, -20), with a resolution of either 0.5 or 1 Å, is generated around the aligned compounds. Table 2 lists the 12 models explored, which differ in alignment, charge, and grid resolution. For all of the studies, shape and electrostatic potential models are generated. To combine the predictive power of these two properties into one final model, we form a weighted average of the shape-based and electrostatic-potential-based QSAR predictions, using a mixing coefficient ($c_1$) as illustrated in eq. 1 [9]:

$$A = c_1 A_{\mathrm{shape}} + (1 - c_1) A_{\mathrm{ESP}} \qquad (1)$$

where $A_{\mathrm{shape}}$ and $A_{\mathrm{ESP}}$ are the activities predicted by the shape and electrostatic potential models, respectively.
Clearly, multiproperty predictions could also have been obtained through multiple linear regression. Using eq. 1 instead gives greater insight into the resultant model, because it allows the variation in predictive power to be studied for different values of $c_1$.
The SOMFA models with the highest $r^2$ values are then derived by partial least squares (PLS) analysis with cross-validation, as implemented in NoSA [13].
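The study of predictive power as a function of $c_1$ can be sketched as a simple scan. The code below is a minimal illustration, not the SOMFA2 implementation: it assumes that $c_1$ weights the shape prediction and $(1 - c_1)$ the electrostatic prediction, which is consistent with the later statement that the electrostatic contribution is slightly less important at $c_1$ = 0.512. All function names and the toy numbers are hypothetical.

```python
import numpy as np


def combine(pred_shape, pred_esp, c1):
    """Weighted average of the two single-property predictions (form of eq. 1)."""
    pred_shape = np.asarray(pred_shape, dtype=float)
    pred_esp = np.asarray(pred_esp, dtype=float)
    # Assumption: c1 weights shape, (1 - c1) weights electrostatic potential.
    return c1 * pred_shape + (1.0 - c1) * pred_esp


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    r = np.corrcoef(observed, predicted)[0, 1]
    return float(r * r)


def scan_c1(observed, pred_shape, pred_esp, steps=101):
    """Return (best c1, best r^2) over an even grid of c1 values in [0, 1]."""
    candidates = np.linspace(0.0, 1.0, steps)
    scores = [r_squared(observed, combine(pred_shape, pred_esp, c))
              for c in candidates]
    best = int(np.argmax(scores))
    return float(candidates[best]), scores[best]


# Hypothetical toy predictions, constructed so that an equal mix fits exactly.
shape = [1.0, 2.0, 3.0, 4.0]
esp = [2.0, 1.0, 4.0, 3.0]
observed = [1.5, 1.5, 3.5, 3.5]
best_c1, best_r2 = scan_c1(observed, shape, esp)
```

Plotting the scores against the candidate values of $c_1$ shows how quickly the fit degrades away from the optimum, which is the kind of insight that a multiple linear regression would not expose directly.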
The predictive ability of the model is quantified in terms of $r^2_{CV}$, which is defined in eq. 2:

$$r^2_{CV} = \frac{SD - PRESS}{SD} \qquad (2)$$

where $PRESS = \sum (Y_{pred} - Y_{actual})^2$ and $SD = \sum (Y_{actual} - Y_{mean})^2$. SD is the sum of squares of the deviations of the observed values from their mean, and PRESS is the prediction error sum of squares. The final models are constructed by a conventional regression analysis with the mixing coefficient ($c_1$) set to the value that yields the highest $r^2$ and $r^2_{CV}$ according to eq. 2.
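The definition of the cross-validated coefficient in eq. 2 can be checked numerically with a short sketch. Two points here are assumptions rather than details from the source: the cross-validation is taken to be leave-one-out, and a one-descriptor least-squares line stands in for the PLS model that the paper derives with external software. All function names and data are hypothetical.

```python
import numpy as np


def press(y_actual, y_pred):
    """Prediction error sum of squares."""
    y_actual = np.asarray(y_actual, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sum((y_pred - y_actual) ** 2))


def sd(y_actual):
    """Sum of squared deviations of the observed values from their mean."""
    y_actual = np.asarray(y_actual, dtype=float)
    return float(np.sum((y_actual - y_actual.mean()) ** 2))


def r2_cv(y_actual, y_pred):
    """Cross-validated r^2 as defined in eq. 2: (SD - PRESS) / SD."""
    total = sd(y_actual)
    return (total - press(y_actual, y_pred)) / total


def loo_predictions(x, y):
    """Leave-one-out predictions from a straight-line fit (stand-in for PLS)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    preds = np.empty_like(y)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i          # drop compound i
        slope, intercept = np.polyfit(x[keep], y[keep], 1)
        preds[i] = slope * x[i] + intercept    # predict the left-out compound
    return preds


# Hypothetical values: one prediction is off by 1, so PRESS = 1 and SD = 5.
toy_r2_cv = r2_cv([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

Because PRESS can exceed SD when the left-out predictions are poor, the coefficient defined in this way can become negative, unlike the conventional $r^2$.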

3. Results and Discussion
SOMFA, a novel 3D-QSAR methodology, is applied to a data set of 29 compounds with known biological activities, 22 of which form the training set. Statistical results for the 12 SOMFA models are summarized in Table 3.
The cross-validated $r^2_{CV}$ value obtained from the PLS analysis serves as a quantitative measure of the predictive ability of a SOMFA model. Table 3 shows that the results are relatively insensitive to the grid resolution and to the quantum-chemical charges, but that models overlaid using the original docked structures give higher $r^2_{CV}$ values than models aligned on the optimized 1,5-diarylimidazole structures. Among the twelve models tested, model 9 has the best cross-validated predictive power. Its cross-validated correlation coefficient $r^2_{CV}$ (0.507) and its moderate non-cross-validated correlation coefficient $r^2$ (0.546) indicate an acceptable conventional statistical correlation, and the resultant SOMFA model has satisfactory predictive ability.
During the SOMFA investigation, grid spacings of 1 and 0.5 Å were examined. The 1 Å grid spacing gives a correlation comparable to that of the 0.5 Å grid; the correlation improves only marginally at the 0.5 Å spacing used for the results presented here. Further increases in resolution produce further small increases in model quality, but not enough to warrant the extra computational time.
The observed and predicted activities of the training set are reported in Table 4. Figure 4 shows a satisfactory linear correlation, with moderate differences between the observed and predicted values for the molecules in the training set.
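The computational cost mentioned above follows directly from the number of grid points. The sketch below counts the points in the 40 Å cube originating at (-20, -20, -20) for a given spacing. It assumes that both end points of each axis are included, which the source does not state; the function names are hypothetical.

```python
import numpy as np


def grid_axis(origin=-20.0, length=40.0, spacing=1.0):
    """Coordinates along one axis, including both end points (assumption)."""
    n = int(round(length / spacing)) + 1
    return origin + spacing * np.arange(n)


def grid_point_count(length=40.0, spacing=1.0):
    """Total number of points in the cubic grid at the given spacing."""
    n = int(round(length / spacing)) + 1
    return n ** 3


coarse = grid_point_count(spacing=1.0)   # 41 points per axis
fine = grid_point_count(spacing=0.5)     # 81 points per axis
cost_ratio = fine / coarse               # roughly an eightfold increase
```

Halving the spacing multiplies the number of points, and hence the work per molecule, by nearly eight, so a marginal gain in correlation has to be weighed against a steep rise in computing time.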
It is well known that the best way to validate a 3D-QSAR model is to predict the biological activities of the compounds in a test set. The SOMFA analysis of the test set of 7 compounds is reported in Table 5. Most of the compounds in the test set show good agreement between observed and predicted values. SOMFA calculations are performed for both shape and electrostatic potential and then combined, giving an optimal coefficient $c_1$ = 0.512 according to eq. 1. The master grid maps derived from the best model are used to display the contributions of the electrostatic potential and shape molecular fields. The master grid maps give a direct visual indication of which parts of the compounds differentiate the activities of the compounds in the training set. They also suggest how to design and synthesize novel compounds with much higher activities. The electrostatic potential master grid and the shape master grid of the best SOMFA model are shown in Figures 5 and 6, respectively, with compound 18 as the reference.
Each master grid map is colored in two colors to indicate favorable and unfavorable effects. In the electrostatic map, red marks regions where more positive charge increases activity (or more negative charge decreases it), and blue marks regions where more negative charge increases activity (or more positive charge decreases it). In the shape map, red marks regions where more steric bulk increases activity, and blue marks regions where more steric bulk decreases activity. The SOMFA analysis indicates that the electrostatic contribution is slightly less important than the shape contribution ($c_1$ = 0.512). The SOMFA electrostatic potential master grid is presented in Figure 5. In this map there is a high density of blue points around the substituents $R_3$ and $R_4$ on the second phenyl ring, which means that electronegative groups are favorable there. In the shape master grid (Figure 6), there is a high density of red points around the substituents $R_3$ and $R_4$ on the second phenyl ring, indicating a favorable steric interaction; there is also a high density of blue points outside these red regions, where additional steric bulk is unfavorable and would be expected to lower activity. In general, medium-sized substituents $R_3$ and $R_4$ on the second phenyl ring increase the activity.
Taken together, the analysis of the SOMFA model may provide useful information for the design of new 1,5-diarylimidazole inhibitors.

4. Conclusion
We have developed predictive SOMFA 3D-QSAR models for 1,5-diarylimidazoles as anti-inflammatory agents. The master grids obtained for the electrostatic potential contributions of the various SOMFA models can be mapped back onto structural features related to the trends in activity of the molecules. On the basis of the spatial arrangement of the electrostatic potential contributions, novel molecules with improved activity are being designed.

Figure 4. Observed versus predicted activities in the training set.

Figure 5. The electrostatic potential master grid with compound 18. Red represents areas where positive potential is favorable or negative charge is unfavorable. Blue represents areas where negative potential is favorable or positive charge is unfavorable.

Figure 6. The shape master grid with compound 18. Red represents areas of favorable steric interaction. Blue represents areas of unfavorable steric interaction.

Table 2. Encoding of the 12 models for the training set used in the SOMFA investigations.

Table 3. Statistics of the various SOMFA models.

Table 4. Observed and predicted activities of the 22 compounds in the training set. a Residual = observed - predicted.

Table 5. Observed and predicted activities of the 7 compounds in the test set. a Residual = observed - predicted.