QSAR Studies on N-aryl Derivative Activity Towards Alzheimer’s Disease

A Quantitative Structure Activity Relationship (QSAR) study has been attempted on a series of 88 N-aryl derivatives which display varied inhibitory activity towards both acetylcholinesterase (AChE) and butyrylcholinesterase (BChE), targets in Alzheimer’s drug discovery. QSAR models were derived for 53 and 61 compounds for the two targets, respectively, with the aid of the genetic function approximation (GFA) technique using topological, molecular shape, electronic and structural descriptors. The predictive ability of the QSAR models was evaluated using a test set of 26 compounds for AChE (r²pred = 0.857, q² = 0.803) and 20 compounds for BChE (r²pred = 0.882, q² = 0.857). The QSAR models indicate that AlogP98, Wiener, Kappa-1-AM, Dipole-Mag, and CHI-1 are the important descriptors effectively describing the bioactivity of the compounds.


Introduction
Alzheimer's disease (AD), the most common form of dementia among the aged, is a fatal neurodegenerative disease characterized by loss of mental ability, deterioration of cognition, progressive impairment of daily activities and a variety of neuropsychiatric symptoms and behavioral disturbances [1,2]. Both acetylcholinesterase and butyrylcholinesterase are enzymes whose vital function is the hydrolytic breakdown of acetylcholine (ACh), a neurotransmitter which plays a role in the modulation of memory function in normal and neurodegenerative conditions [3]. Cholinesterase is the only target that has resulted in the design of a few palliative drugs presently marketed for the treatment of Alzheimer's disease [4]. Tacrine, galantamine, rivastigmine, donepezil and huperzine are some of the cholinesterase inhibitors currently used as drugs for the treatment of Alzheimer's disease [5]. However, their clinical use is strictly limited because of several adverse effects, such as hepatotoxicity, and some pharmacokinetic disadvantages, so the study of new compounds as cholinesterase inhibitors is required to discover more effective and targeted drugs.
The Quantitative Structure Activity Relationship (QSAR) methodology is useful in predicting the activity of novel molecules by means of mathematical equations which describe the relationship(s) between a chemical structure and its biological activity [6][7][8][9][10][11][12]. QSAR models thus serve as pointers for the design of effective drugs. QSAR studies on Alzheimer's disease targets have been carried out on phenylpentenone derivatives [13], physostigmine analogues [14], indanone [15], and tacrine [16]. Recent studies indicate that N-aryl derivatives (amides and imides) are active-site-directed inhibitors of acetylcholinesterase and butyrylcholinesterase [17]. In the present study, a QSAR analysis of a series of 88 N-aryl compounds (whose inhibitory effect against AChE and BChE was reported in the above paper) was carried out using the Cerius2 package. The QSAR module in this package provides descriptors that are categorized into different types such as spatial, structural, electronic, conformational, thermodynamic and receptor. The QSAR models were generated using the Genetic Function Approximation (GFA), which has also been applied to the QSAR analysis of steroids, dopamine β-hydroxylase inhibitors [18] and anticancer agents [19]. An interesting application of GFA is in the QSAR studies on acetylcholinesterase inhibitors, which have already resulted in the discovery of a new molecule, E2020, for the treatment of Alzheimer's disease [20].

Results and Discussion
Out of 88 N-aryl derivatives, the QSAR models were generated using 53 and 59 compounds as training sets for acetylcholinesterase and butyrylcholinesterase, respectively. The physicochemical significance of each of the descriptors appearing in the QSAR equations is given in Table 1. Out of the total 88 N-aryl compounds, 79 were selected for QSAR analysis in Cerius2 using the GFA technique, the remainder being left out due to poor scalability. GFA statistical analysis was carried out on all 79 compounds for both acetylcholinesterase and butyrylcholinesterase.
To improve the predictive power of the QSAR models, outliers were removed from the sets. In the case of the AChE inhibitors, the log Ki values of 26 compounds in the test set (out of 88) were completely out of the scalable range of the remaining 60 compounds. Hence, for better scalability, they were treated as outliers and not included in the training set for QSAR calculations. Likewise, 20 BChE inhibitors were excluded for the same reason. Using the leave-one-out (LOO) method, final training sets of 53 and 61 compounds were selected, and the best equations were obtained with a combination of 2D topological, thermodynamic, structural, electronic and charge-dependent descriptors.
Out of the 43 descriptors calculated for each compound in the dataset, only a few, viz., CHI-V-1-3, CHI-1, WIENER, ALOGP98, KAPPA-1-AM, DIPOLE-MAG, PHI, LogZ and HBOND DONOR, were selected based on the correlation coefficient r (0.929, 0.942), the squared correlation coefficient r² (0.862, 0.887) and the cross-validated r², or q² (0.803, 0.857), of the models generated. A recent QSAR study on AChE inhibitors using the GFA technique [21] highlighted log P, NORB and WNSA1 as the determining descriptors for AChE activity. A similar trend was observed in this study, where ALOGP98 and WIENER appear in the QSAR equations for both AChE and BChE. Therefore, all the above-mentioned descriptors can be used as filters while exploring a database, or as important criteria while designing new chemical entities as leads for effective AChE/BChE inhibitors.
To validate the final QSAR models, test sets of 26 and 20 compounds were used for acetylcholinesterase and butyrylcholinesterase inhibition, respectively. The predictive power of the models was reasonably good, with predictive r² (r²pred) values of 0.857 and 0.882 and cross-validated r² values of 0.929 and 0.917, respectively, for the sets of compounds against AChE and BChE. The structures and statistics of the training sets are given in the supplementary data. The experimental and predicted values of the biological activity are given in Table 2. Plots of the experimental against the predicted activity for both AChE and BChE are given in Figure 1.

Data screening and Molecular Modeling
The data used in this study are the acetylcholinesterase and butyrylcholinesterase inhibitory activities of a set of 88 N-aryl derivatives obtained from the literature [17]. For better scalability, the inhibition constant (Ki) values of the N-aryl derivatives were converted into log (Ki) values, and these were used as the dependent variable in the study. All the N-aryl derivatives were built using the INSIGHT-II software and the structures were energy minimized using the cff91 force field. Seventy-nine compounds were selected for the QSAR studies in Cerius2 based on the logarithmic value of Ki. Molecular descriptors for each molecule were calculated in the study table. These descriptors include 2D topological, thermodynamic, structural and charge-dependent descriptors.
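The logarithmic transformation described above can be illustrated with a minimal Python sketch. The base of the logarithm is not stated in the text, so base 10 is assumed here; the function name `log_ki` and the Ki values are hypothetical and are not taken from the data set.

```python
import math

def log_ki(ki_values):
    """Return log10(Ki) for each inhibition constant (base 10 is an assumption)."""
    transformed = []
    for ki in ki_values:
        if ki <= 0:
            raise ValueError("Ki must be positive to take a logarithm")
        transformed.append(math.log10(ki))
    return transformed

# Hypothetical Ki values spanning six orders of magnitude (not from the data set)
example_ki = [0.001, 0.1, 10.0, 1000.0]
example_log_ki = log_ki(example_ki)
```

The transformation compresses values that span several orders of magnitude onto a scale of a few units, which is what makes them easier to handle as a dependent variable in a regression.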

Generation of QSAR models using GFA technique
QSAR models were generated using the Genetic Function Approximation (GFA). The GFA technique combines Holland's genetic algorithm with Friedman's multivariate adaptive regression splines (MARS) algorithm to evolve a population of equations that best fit the training set data. The GFA algorithm is a useful technique for searching a large probability space with a large number of descriptors for a small number of molecules. A distinctive feature of GFA is that it produces a population of models (e.g., 100) instead of generating a single model, as most other statistical methods do. The range of variation in this population gives added information on the quality of fit and the importance of the descriptors. GFA calculations are based on three operators: selection, crossover and mutation. An initial population of equations is generated by random choice of descriptors. The fitness of each equation is scored by the Lack-of-Fit (LOF) measure, $\mathrm{LOF} = \mathrm{LSE} / \left[1 - (c + d\,p)/m\right]^{2}$, where LSE is the least-squares error, c is the number of basis functions in the model, d is the smoothing parameter which controls the number of terms, p is the total number of features contained in all basis functions, and m is the number of samples in the training set. Crossovers are performed between pairs chosen at random from the population [22], and mutations are performed to add randomness to the search for good GFA equations. In this study, the crossover and mutation probabilities were both 0.5, the population size was 100 and the number of generations was fixed at 100. Several QSAR equations for the training sets of 53/61 compounds were obtained using GFA. From those equations, the best were selected by statistical measures such as the number of compounds in the regression, the correlation coefficient r, its square r², q², and the predictive r² [24]. Correlation coefficient values closer to 1.0 represent a better fit of the regression. The squared correlation coefficient (r²) is a relative measure of fit by the regression equation.
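The scoring and the three operators described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the Cerius2 implementation: models are restricted to a fixed number of linear terms (no splines), LSE is taken as the mean squared residual, parents are drawn from the better half of the population, and a child replaces the worst model only if it scores better. All function names and the synthetic data are hypothetical.

```python
import random

def lof(lse, c, d, p, m):
    """Lack-of-fit score: LSE / (1 - (c + d*p)/m)**2."""
    penalty = 1.0 - (c + d * p) / m
    if penalty <= 0:
        return float("inf")  # too many terms for the number of samples
    return lse / penalty ** 2

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[piv][col]) < 1e-12:
            return None  # singular system
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def fit_lse(rows, y, subset):
    """Fit y on the chosen descriptors plus an intercept; return the mean squared residual."""
    design = [[1.0] + [row[j] for j in subset] for row in rows]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    if beta is None:
        return float("inf")
    res = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(design, y)]
    return sum(e * e for e in res) / len(y)

def score(rows, y, subset, d=1.0):
    """LOF of a linear model; each linear term is one basis function with one feature."""
    terms = len(subset)
    return lof(fit_lse(rows, y, subset), c=terms, d=d, p=terms, m=len(y))

def gfa(rows, y, n_terms=3, pop_size=100, generations=100,
        p_cross=0.5, p_mut=0.5, seed=0):
    """Evolve a population of descriptor subsets; return it sorted by LOF (best first)."""
    rng = random.Random(seed)
    n_desc = len(rows[0])
    pop = [sorted(rng.sample(range(n_desc), n_terms)) for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=lambda s: score(rows, y, s))
        # Selection: two distinct parents from the better half (an assumption)
        pa, pb = rng.sample(scored[: pop_size // 2], 2)
        # Crossover: the child draws its descriptors from both parents
        if rng.random() < p_cross:
            child = rng.sample(sorted(set(pa) | set(pb)), n_terms)
        else:
            child = list(pa)
        # Mutation: swap one descriptor for one not yet in the child
        if rng.random() < p_mut:
            unused = [j for j in range(n_desc) if j not in child]
            if unused:
                child[rng.randrange(n_terms)] = rng.choice(unused)
        child = sorted(child)
        # Replacement: the child displaces the worst model only if it is better
        if score(rows, y, child) < score(rows, y, scored[-1]):
            scored[-1] = child
        pop = scored
    return sorted(pop, key=lambda s: score(rows, y, s))

# Synthetic demonstration data (hypothetical, not the N-aryl data set)
data_rng = random.Random(1)
rows = [[data_rng.gauss(0, 1) for _ in range(6)] for _ in range(30)]
y = [2.0 * r[0] - 3.0 * r[2] + data_rng.gauss(0, 0.1) for r in rows]
models = gfa(rows, y, n_terms=2, pop_size=20, generations=50)
best = models[0]
```

Because the whole population is returned rather than a single equation, the descriptors that recur among the top-ranked subsets can be inspected, which mirrors the point made above about the added information carried by a population of models.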
The predictive ability (r²pred) of the selected models was evaluated using test sets of 26 and 20 compounds for acetylcholinesterase and butyrylcholinesterase, respectively, which were not included in the training sets.
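The two validation statistics can be sketched as follows. The text does not spell out their formulas, so the commonly used definitions are assumed: q² = 1 − PRESS/SS, where PRESS is the sum of squared leave-one-out prediction errors and SS is the sum of squared deviations from the training mean, and r²pred = 1 − Σ(y_test − ŷ_test)² / Σ(y_test − ȳ_train)². For brevity the sketch uses a single-descriptor linear model; the function names and data values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def q2_loo(x, y):
    """Leave-one-out cross-validated r^2: 1 - PRESS / SS_total."""
    my = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        # Refit without compound i, then predict compound i
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a + b * x[i])) ** 2
    ss_total = sum((yi - my) ** 2 for yi in y)
    return 1.0 - press / ss_total

def r2_pred(x_train, y_train, x_test, y_test):
    """Predictive r^2 on an external test set, referenced to the training mean."""
    a, b = fit_line(x_train, y_train)
    my_train = sum(y_train) / len(y_train)
    ss_res = sum((yt - (a + b * xt)) ** 2 for xt, yt in zip(x_test, y_test))
    ss_ref = sum((yt - my_train) ** 2 for yt in y_test)
    return 1.0 - ss_res / ss_ref

# Hypothetical descriptor/activity values (not from the N-aryl data set)
x_train = [1.0, 2.0, 3.0, 4.0, 5.0]
y_train = [1.1, 1.9, 3.2, 3.9, 5.1]
x_test = [6.0, 7.0]
y_test = [6.0, 7.1]
q2_value = q2_loo(x_train, y_train)
r2_pred_value = r2_pred(x_train, y_train, x_test, y_test)
```

Both statistics approach 1.0 as the predictions approach the observed values; q² uses only the training set, whereas r²pred depends on compounds the model has never seen.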

Conclusions
Acetylcholinesterase and butyrylcholinesterase are targets for many of the currently available anti-Alzheimer drugs. In the present work, QSAR analysis of a series of N-aryl compounds against both cholinesterases was carried out using the GFA technique. The best models were selected based on the statistical parameters. The presented QSAR models have good predictive power (r²pred) and squared correlation coefficients (r²). The models derived independently for the two targets showed that ALOGP98 (a thermodynamic descriptor) and WIENER (a topological descriptor) increased the inhibitory potency of the N-aryl derivatives against AChE and BChE. Thus the descriptors CHI-V-1-3, CHI-1, WIENER, ALOGP98, KAPPA-1-AM, DIPOLE-MAG, PHI, LogZ and HBOND DONOR seem to determine the ability of the compounds to function as effective AChE and BChE inhibitors. This knowledge can be used for designing more effective chemical entities and may also provide important insights into structural variants leading to the development of novel ChE inhibitors.