A DFT-Based QSARs Study of Acetazolamide/Sulfanilamide Derivatives with Carbonic Anhydrase (CA-II) Isozyme Inhibitory Activity

This study presents a quantitative structure-activity relationship (QSAR) analysis of a pool of 18 bioactive sulfonamide compounds comprising five acetazolamide derivatives, eight sulfanilamide derivatives and five sulfonamides in clinical use as drugs, namely acetazolamide, methazolamide, dichlorophenamide, ethoxolamide and dorzolamide. For all the compounds, initial geometry optimizations were carried out with a molecular mechanics (MM) method using the MM+ force field. The lowest-energy conformations obtained by the MM method were further optimized by the Density Functional Theory (DFT) method, employing Becke's three-parameter hybrid functional (B3LYP) and the 6-31G(d) basis set. The molecular descriptors obtained from the DFT calculations (dipole moment, electronegativity, total energy at 0 K, entropy at 298 K, and HOMO and LUMO energies) provide valuable information and play a significant role in the assessment of the carbonic anhydrase (CA-II) inhibitory activity of the compounds. Using the multiple linear regression technique, several QSAR models were drawn up from these calculated descriptors and the CA-II inhibitory data of the molecules. Among the QSAR models presented in the study, the statistically most significant one is a five-parameter linear equation with a squared correlation coefficient $R^2$ of ca. 0.94 and a squared cross-validated correlation coefficient $R^2_{CV}$ of ca. 0.85. The results are discussed in the light of the main factors that influence the inhibition of the CA-II isozyme.


Introduction
Acetazolamide, methazolamide, dichlorophenamide, ethoxolamide and dorzolamide are sulfonamide compounds that inhibit the carbonic anhydrase (CA-II) isozyme and are used clinically for the treatment of glaucoma [1]. CA-II reversibly catalyzes the reaction of $\mathrm{H_2O}$ and $\mathrm{CO_2}$ to form carbonic acid and subsequently the bicarbonate ion $\mathrm{HCO_3^-}$. The $\mathrm{HCO_3^-}$ ion is responsible for the movement of $\mathrm{Na^+}$ into the eye. Water follows $\mathrm{Na^+}$ to form the aqueous humor. CA-II inhibition by an agent such as one of the drugs mentioned above decreases the $\mathrm{HCO_3^-}$ concentration and therefore the flow of $\mathrm{Na^+}$ and $\mathrm{H_2O}$ into the posterior chamber, resulting in decreased production of aqueous humor and hence a lowering of intraocular pressure (IOP) [2]. Glaucoma, the leading cause of blindness worldwide, is the general term for a group of ophthalmic disorders characterized by an increase in IOP, which damages the optic disc and disturbs the visual field. IOP increases through an imbalance between the production and drainage of aqueous humor. Agents used to treat glaucoma, such as those mentioned above, are designed to decrease IOP [3].
All the drugs used for the treatment of glaucoma have some systemic side effects [4]. To reduce these side effects, it is of interest to develop new CA-II inhibitors for topical use in the long-term management of glaucoma. For researchers, the prospect of overcoming the systemic side effects of a drug while achieving an effect at a much lower dose is very attractive. Modification of the structure of a known drug is one way to develop new drugs. For this purpose, members of our group have synthesized and reported five new acetazolamide-like and eight new sulfanilamide-like derivatives, which are the subject of the present study. These new derivatives were obtained by modification of acetazolamide and sulfanilamide using the tail approach [5]. The inhibition constants ($K_I$) of these new molecules against the carbonic anhydrase enzyme CA-II, shown in Table 1, are much lower than those of their parent molecules acetazolamide and sulfanilamide. These derivatives therefore merit further investigation as potential drug candidates.
Quantitative structure-activity relationship (QSAR) studies are tools for predicting endpoints of interest in organic molecules acting as drugs [6]. Many physiological activities of molecules can be related to their composition and structure. Molecular descriptors, which are numerical representations of molecular structures, are used to perform QSAR analysis [7]. In the literature, the quantum mechanical molecular descriptors used in QSAR studies have mostly been calculated with semi-empirical methods such as AM1 and PM3 [8][9][10]. However, some recent QSAR studies [11][12][13] have shown that choosing DFT instead of AM1 [14] or PM3 [15,16] results in better correlation between calculated results and experimental data. The DFT method is therefore expected to lead to statistically more accurate QSAR models than the semi-empirical methods.
The aim of the present study is to build QSAR models using the multiple linear regression method in order to explore the correlations between the experimental inhibition constants ($K_I$) and the calculated molecular descriptors of 18 aromatic and heterocyclic sulfonamide compounds.

Theory and Computational details
For all the molecules, 3-D modeling and calculations were performed using the Gaussian 03 quantum chemistry package [17]. To save computational time, initial geometry optimizations were carried out with a molecular mechanics (MM) method using the MM+ force field. The lowest-energy conformations of the molecules obtained by the MM method were further optimized by the DFT method [18], employing Becke's three-parameter hybrid functional (B3LYP) [19] and the 6-31G(d) basis set. Their fundamental vibrations were also calculated at the same level of theory to confirm that the optimized structures were true minima. The program CODESSA (Comprehensive Descriptors for Structural and Statistical Analysis), Version 2.7.2 [20], was used to extract the calculated molecular descriptors from the Gaussian 03 output files. This code uses diverse statistical structure-property-activity correlation techniques for the analysis of experimental data in combination with calculated molecular descriptors. The heuristic method implemented in CODESSA PRO was employed for selecting the 'best' regression model.
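The DFT step described above can be sketched as a small helper that writes a Gaussian input and checks the resulting frequencies. This is an illustrative sketch only: the function names, the water geometry, and the choice to run optimization and frequencies in a single job are assumptions, not details given in the study.

```python
def gaussian_opt_freq_input(title, atoms, charge=0, multiplicity=1):
    """Build Gaussian input text for a B3LYP/6-31G(d) optimization plus frequencies.

    atoms is a list of (element, x, y, z) tuples with coordinates in angstroms.
    Running Opt and Freq in one job is an assumption made for this sketch.
    """
    lines = ["# B3LYP/6-31G(d) Opt Freq", "", title, "", f"{charge} {multiplicity}"]
    for element, x, y, z in atoms:
        lines.append(f"{element:<2s} {x:12.6f} {y:12.6f} {z:12.6f}")
    lines.append("")  # a blank line terminates the geometry section
    return "\n".join(lines) + "\n"


def is_true_minimum(frequencies_cm1):
    """Return True when no vibrational frequency is imaginary.

    Imaginary frequencies are conventionally printed as negative numbers.
    """
    return all(f > 0.0 for f in frequencies_cm1)


# Placeholder geometry (water), used only to show the call pattern.
example_input = gaussian_opt_freq_input(
    "water test job",
    [("O", 0.0, 0.0, 0.117), ("H", 0.0, 0.757, -0.469), ("H", 0.0, -0.757, -0.469)],
)
```

A structure whose frequency list contains a negative value would be a saddle point rather than a minimum and would need to be re-optimized before its descriptors are used.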
In recent years, the DFT method has been used increasingly for predicting the molecular properties of relatively large molecules. DFT allows molecular properties such as optimized geometries and energies to be calculated with an accuracy comparable to that of electron-correlated ab initio methods such as MP2, but requires much less computational time [21]. For an accurate calculation of molecular properties, the choice of basis set and method is important and depends on the type of molecule of interest.
Molecular descriptors calculated using quantum mechanical methods have been used in many QSAR studies [6,7]. They enable the determination of molecular quantities characterizing the reactivity, shape and binding properties of molecules. The values of the molecular descriptors derived from our calculations for the 18 sulfonamide compounds and their experimental inhibition constants ($K_I$) are presented in Table 2. Two of these descriptors, related to the thermochemistry of the molecules and obtained from the frequency calculation at the optimized geometry, are the total energy at 0 K (in a.u.) and the entropy at 298 K (in cal/mol K). The energies (in eV) of the HOMO (highest occupied molecular orbital) and LUMO (lowest unoccupied molecular orbital) are popular quantum mechanical descriptors which play a major role in governing many chemical reactions and determining electronic band gaps in solids [22][23][24]. The energy of the HOMO is directly related to the ionization potential and characterizes the susceptibility of a molecule towards attack by electrophiles. According to Koopmans' theorem, the ionization potential IP (eV) is defined as

$$\mathrm{IP} = -E_{\mathrm{HOMO}}$$

The same idea applies to the electron affinity calculation. The energy of the LUMO is directly related to the electron affinity and characterizes the susceptibility of the molecule towards attack by nucleophiles [25]. The electron affinity EA (eV) is obtained through Koopmans' theorem as

$$\mathrm{EA} = -E_{\mathrm{LUMO}}$$

The polarity of a molecule is well known to be important for various physicochemical properties. The dipole moment is the most obvious and most widely used quantity to describe the polarity of a molecule [26]. The remaining descriptor presented in Table 2, namely the electronegativity, is derived from the DFT framework [27]. The electronegativity $\chi$ is defined as the negative of the partial derivative of the energy $E$ of an atomic or molecular system with respect to the number of electrons $N$ at constant external potential $V(\mathbf{r})$ [28]:

$$\chi = -\left(\frac{\partial E}{\partial N}\right)_{V(\mathbf{r})} \qquad (1)$$
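By combining Eq. (1) with the earlier work of Iczkowski and Margrave [29], assuming a quadratic relationship between $E$ and $N$ and applying a finite difference approximation, Eq. (1) can be rewritten in terms of the ionization potential (IP) and electron affinity (EA), which take their Koopmans' theorem values, as

$$\chi = \frac{\mathrm{IP} + \mathrm{EA}}{2} = -\frac{E_{\mathrm{HOMO}} + E_{\mathrm{LUMO}}}{2}$$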
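As a minimal sketch of how these descriptors follow from the frontier orbital energies, the snippet below applies Koopmans' theorem (IP = -E_HOMO, EA = -E_LUMO) and the finite-difference electronegativity χ = (IP + EA)/2. The function name and the orbital energies in the example are hypothetical and are not values from Table 2.

```python
def koopmans_descriptors(e_homo_ev, e_lumo_ev):
    """Return IP, EA and electronegativity (all in eV) from orbital energies in eV.

    Assumes Koopmans' theorem for IP and EA and the finite-difference
    form chi = (IP + EA) / 2.
    """
    ip = -e_homo_ev          # ionization potential
    ea = -e_lumo_ev          # electron affinity
    chi = 0.5 * (ip + ea)    # electronegativity
    return {"IP": ip, "EA": ea, "chi": chi}


# Hypothetical orbital energies, chosen only to illustrate the arithmetic.
example = koopmans_descriptors(e_homo_ev=-6.5, e_lumo_ev=-1.5)
print(example)  # {'IP': 6.5, 'EA': 1.5, 'chi': 4.0}
```

Because χ is a linear combination of the HOMO and LUMO energies, it is perfectly collinear with them when both are used alongside it in a regression, which is worth keeping in mind when interpreting individual coefficients.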

Results and discussion
The chemical names and the decadic logarithms of the inhibition constants $K_I$ (in nM) of the 18 sulfonamide compounds, taken from the literature [5], are given in Table 1. Structural details of the compounds used in this study are presented in Figure 1. The molecular descriptors obtained from the DFT calculations (dipole moment, electronegativity, total energy at 0 K, entropy at 298 K, and HOMO and LUMO energies) are presented in Table 2. These descriptors were used to select the dominant parameters affecting the inhibitory activity of the compounds. The heuristic method implemented in CODESSA PRO was used to build the multiple linear regression QSAR models given in Table 3. The goodness of fit of the models was tested with the squared correlation coefficient ($R^2$), the F-test (F) and the standard deviation of the regression ($s^2$). The predictive performance of the models was tested with the squared cross-validated correlation coefficient $R^2_{CV}$ obtained by the 'leave one out' (LOO) method. The LOO approach consists of developing a number of models with one sample omitted at a time. After each model is developed, the omitted sample is predicted and the difference between the experimental and predicted activity values is calculated.

The best models produced are shown in Table 3. Among them, model 3 has the best goodness of fit, with $R^2$=0.943, F=33.2 and $s^2$=0.067. Interestingly, model 4, which has the best predictive power ($R^2_{CV}$=0.893), has a slightly lower goodness of fit ($R^2$=0.936) than model 3 ($R^2$=0.943). The inhibition constants predicted with model 3 are presented in Table 2. It should be noted that C3 and C17 are outliers in this model.

When all the compounds (N=18) are taken into account, the best model obtained is model 1, a tetra-parametric regression equation. This model has good statistical characteristics, as evident from its $R^2$=0.857, F=19.4 and $s^2$=0.137 values. It also has satisfactory predictive power, as evident from its $R^2_{CV}$=0.78 value. For N=18, the second-best model is model 2, a tri-parametric regression equation. Its statistical characteristics are slightly lower than those of model 1, but it still has a good statistical fit and satisfactory predictive power. The only difference between models 1 and 2 is the removal of the electronegativity χ.

When two compounds (C3 and C17) are treated as outliers, the best model obtained is model 3, a penta-parametric regression equation with very good statistical fit and good predictive power, as evident from its $R^2$=0.943, F=33.2, $s^2$=0.067 and $R^2_{CV}$=0.855 values. Models 4 and 5 have the same descriptors as models 1 and 2, respectively. Comparison of models 1 and 4 shows a marked improvement in the quality of the regression: $R^2$ rises from 0.857 to 0.936 and the standard deviation of the regression falls from 0.137 to 0.068 when C3 and C17 are treated as outliers. A similar degree of improvement is seen on comparing models 2 and 5. Models 4 and 5 also have very good predictive power, as evident from their $R^2_{CV}$=0.893 and $R^2_{CV}$=0.883 values, respectively. Figure 2 shows a plot of experimental $\log K_I$ versus $\log K_I$ predicted using model 3.

Table 3 shows that three factors, namely the total energy at 0 K, the dipole moment and the entropy at 298 K of the compounds, play a major role in the inhibitory activity against the CA-II isozyme. In all the models in Table 3, the regression coefficient of the total energy $T_e$ is positive, so $\log K_I$ increases with increasing $T_e$. The regression coefficient of the dipole moment µ is negative, which means that $\log K_I$ increases with decreasing µ. The contribution of the entropy S to the biological activity has the same sign as that of the total energy: $\log K_I$ increases with increasing S.
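The fit and cross-validation statistics quoted above can be illustrated with a short NumPy sketch. It is not the CODESSA implementation: ordinary least squares with an intercept, $s^2$ taken as the residual variance SS_res/(n-k-1), and $R^2_{CV}$ taken as 1 - PRESS/SS_tot are all assumptions, and the data used in the example are synthetic.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [intercept, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    return coef[0] + X @ coef[1:]


def regression_stats(X, y):
    """Return (R^2, F, s^2) for a k-descriptor linear model fitted to n samples."""
    n, k = X.shape
    resid = y - predict(fit_mlr(X, y), X)
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    s2 = ss_res / (n - k - 1)            # residual variance (assumed meaning of s^2)
    f = ((ss_tot - ss_res) / k) / s2     # F statistic of the regression
    return r2, f, s2


def loo_r2(X, y):
    """Leave-one-out R^2_CV: refit n times, each time predicting the omitted sample."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_mlr(X[keep], y[keep])
        press += float((y[i] - predict(coef, X[i:i + 1])[0]) ** 2)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - press / ss_tot


def f_from_r2(r2, n, k):
    """F statistic implied by R^2 for n samples and k descriptors."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Synthetic data: 18 "compounds", 3 descriptors, known coefficients plus noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(18, 3))
y_demo = 1.0 + X_demo @ np.array([0.5, -0.8, 0.3]) + 0.1 * rng.normal(size=18)
r2_demo, f_demo, s2_demo = regression_stats(X_demo, y_demo)
r2cv_demo = loo_r2(X_demo, y_demo)
```

As a consistency check under these assumptions, `f_from_r2(0.857, 18, 4)` gives about 19.5, in line with the F=19.4 reported for model 1. For a least-squares fit the LOO value can never exceed the fitted $R^2$, which is why $R^2_{CV}$ is the more conservative measure of predictive power.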
Earlier studies [30, 31, 33] have shown that sulfonamide compounds bind as anions to the Zn(II) ion within the CA-II active site. They concluded that the inhibition properties of these compounds can be accounted for by several factors. One is that the CA-II-sulfonamide complex is stabilized by a large favorable enthalpy change associated with the binding of the sulfonamide to CA-II. Another is that the weak coordination bond between the active-site Zn(II) ion and the sulfonamide nitrogen is strongly supplemented by the cooperative interaction of the organic moieties of the inhibitor with the amino acid side chains of the active site. The models we present in Table 3 are consistent with these reports. According to our models, the inhibitory activity of the compounds is affected mainly by thermodynamic properties such as total energy and entropy, by the polarity of the molecule (dipole moment), and by the reactivity of the molecule (electronegativity and LUMO energy).

Conclusions
The results given above indicate that the inhibition constants ($\log K_I$) of sulfonamide compounds against the CA-II isozyme can be modeled with DFT-based quantum mechanical molecular descriptors. The best model produced is a penta-parametric regression equation with very good statistical fit and good predictive power, as evident from its $R^2$=0.943, F=33.2, $s^2$=0.067 and $R^2_{CV}$=0.855 values. An analysis of the descriptors involved in the models indicates that inhibition of CA-II is influenced by the energy, entropy, polarity and reactivity indexes of the sulfonamide compounds.