A QSAR Toxicity Study of a Series of Alkaloids with the Lycoctonine Skeleton

A QSAR toxicity analysis has been performed for a series of 19 alkaloids with the lycoctonine skeleton. The GA-MLRA (Genetic Algorithm combined with Multiple Linear Regression Analysis) technique was applied to generate two types of QSARs: first, models containing exclusively 3D descriptors and, second, models consisting of physicochemical descriptors. As expected, the 3D-descriptor QSARs have better statistical fits. Models containing physicochemical descriptors that are in good agreement with the mode of toxic action exerted by the alkaloids studied have also been identified and discussed. In particular, the TPSA (Topological Polar Surface Area) and nC=O (number of ─C(O)─ fragments) parameters give the best statistically significant monodescriptor models and, when combined with lipophilicity (MlogP), the best bidescriptor models, confirming the importance of the H-bonding capability of the alkaloids for binding at the receptor site.


Introduction
Diterpene alkaloids isolated from Aconitum and Delphinium plant species have been used for many centuries, mainly for the preparation of a wide spectrum of poisons as well as medicines [1]. Owing to their structural diversity, these alkaloids can be divided into a number of subgroups represented by molecules with the lycoctonine, heteratisine and napelline diterpene skeletons [2]. Extensive biological and pharmacological studies have shown that Aconitum and Delphinium alkaloids substantially affect the cardiac and central nervous systems. At present, a number of diterpene alkaloids have been identified that exert analgesic, anti-inflammatory, antiepileptiform and cardiovascular effects [3,4].
Nevertheless, the practical application of Aconitum and Delphinium alkaloids in medicine is limited by their high toxicity. For the last few decades, these alkaloids have been the target of considerable interest among medicinal chemists, who have put great effort into identifying structure-activity and/or structure-toxicity relationships. Thus, in the paper of Ameri et al. [1], Aconitum alkaloids are subdivided into three main groups. Group one consists of highly toxic alkaloids with two ester bonds ─OC(O)─; monoester derivatives of lappaconitine-like alkaloids of lower toxicity represent the second group. The most striking fact about these two groups is that they are antagonists when their effect on the voltage-dependent sodium channel is considered: group one compounds have been found to activate sodium channels, whereas alkaloids of group two block passive diffusion of sodium ions. The third group comprises the Aconitum alkaloids exerting the lowest toxicity due to the absence of any ester side chains. Although they have no effect on the neuronal system, the alkaloids of this last group are still found to exhibit antiarrhythmic activity.
The high toxicity of various species of Delphinium plants is attributed to the norditerpenoid alkaloids present. These plants continue to be the main cause of extensive cattle poisoning, resulting in substantial losses for the cattle-breeding industry [5,6]. A number of investigations have been performed on the systematisation and taxonomic classification of the Delphinium alkaloids known at present. According to the classification reported by Panter et al. [5], Delphinium alkaloids can also be divided into three general types: (I) N-(methylsuccinyl)anthranoyllycoctonine (MSAL)-type alkaloids with the highest toxicity; (II) lycoctonine-type alkaloids of moderate toxicity; and (III) 7,8-methylenedioxylycoctonine (MDL)-type alkaloids with low toxicity.
A survey of the literature suggests that structure-activity investigations of Aconitum and Delphinium alkaloids have been carried out mainly using conventional approaches; likewise, the toxicity of the alkaloids has been investigated by modifying certain functional groups. A careful literature search has further revealed that the investigation of diterpene alkaloids by means of a QSAR approach has received very little attention, and only recently has an attempt been made to perform a QSAR analysis of the analgesic properties of twelve Aconitum alkaloids [7].
QSAR [8-12] is a powerful lead-compound optimisation technique that quantitatively relates variations in biological activity to changes in molecular properties (descriptors). In other words, it attempts to link activity data with the chosen descriptors by identifying "rules" that can then be used to guide chemical synthesis when new chemical entities are developed. Recently, new programs have emerged that allow more than a thousand descriptors to be generated for a single molecule. The search for successful combinations of mutually inter-related descriptors therefore becomes an issue even for small series of compounds. Over the last few years the Genetic Algorithm (GA) method, together with Multiple Linear Regression Analysis (MLRA), has become a valuable approach for deriving and validating QSARs. GA is an optimisation algorithm based on the mechanisms of Darwinian evolution that uses random mutation, crossover and selection procedures to breed better models or solutions from an originally random starting population or sample [13,14].
In this paper we present the results of a QSAR toxicity study carried out for 19 alkaloids with the lycoctonine skeleton. Structure-toxicity relationships are discussed in terms of the QSAR models built and the descriptors chosen.

Training Compounds
The structures of the 19 diterpenoids chosen for this work and their functional groups are shown in Figure 1.
Compounds are subdivided into two general groups according to the nature of the functionality at C4. The alkaloids of the first group contain ─CH3, ─CH2OCH3, and ─CH2OH residues, whereas the alkaloids of the second group have a benzoyl ester side chain. The toxicity data used in this study are taken from reference [15], which suggests that compounds of the second group are more toxic than those of the first group. This is in good agreement with the structure-toxicity investigations reported in two recent reviews [1,5]. All original LD50 toxicity data (mg/kg) have been converted to molar -log LD50 response variables.
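The unit conversion described above can be sketched as follows. This is a minimal illustration only: it assumes the molar dose is expressed in mol/kg, and the function name and the example values (LD50 = 5 mg/kg, molecular weight 500 g/mol) are hypothetical, not data from this study.

```python
import math

def neg_log_molar_ld50(ld50_mg_per_kg: float, mol_weight_g_per_mol: float) -> float:
    """Return -log10 of LD50 expressed in mol/kg, i.e. log10(1/LD50)."""
    if ld50_mg_per_kg <= 0 or mol_weight_g_per_mol <= 0:
        raise ValueError("LD50 and molecular weight must be positive")
    # mg/kg -> g/kg -> mol/kg (assumed molar unit)
    ld50_mol_per_kg = ld50_mg_per_kg / 1000.0 / mol_weight_g_per_mol
    return -math.log10(ld50_mol_per_kg)

# Hypothetical compound: 5 mg/kg at 500 g/mol corresponds to 1e-5 mol/kg.
print(round(neg_log_molar_ld50(5.0, 500.0), 3))
```

With this convention a larger response value means a lower lethal dose, that is, a more toxic compound.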

Molecular Modelling
The molecular modelling studies (molecular mechanics and semi-empirical calculations) were carried out using the HyperChem 6.01 software package [16]. The Molecular Mechanics (MM+) force field was applied for preliminary structure optimisation and for the study of the conformational behaviour of each alkaloid. Being highly parameterised, molecular mechanics has been shown to produce more realistic geometries for the majority of organic molecules [17]. The next step was a re-optimisation of the MM+ optimised structures with the AM1 semi-empirical method. The quantum-mechanical method was used in order to obtain an accurate charge distribution and quantum-chemical descriptors for each compound in the series.

Molecular Descriptors
Descriptors are normally calculated after a low-energy conformation of the molecule has been found and optimised using any standard optimisation technique, e.g. molecular mechanics, ab initio, DFT or semi-empirical methods. The molecular descriptors used in this study were calculated with the DRAGON program [18]. The program contains scripts for generating 1497 descriptors of different types, including constitutional, topological, RDF, GETAWAY, functional group, WHIM, Randic, 3D-Morse, etc. [19]. A set of additional quantum-chemical descriptors (heat of formation, dipole moment, atomic charges, energy of the highest occupied molecular orbital (HOMO), energy of the lowest unoccupied molecular orbital (LUMO), HOMO-LUMO energy gap, etc.) was also obtained for each molecule after the geometry optimisation procedure.

Statistical Methods
Preliminary model selection was performed by means of the GA-MLRA technique as implemented in the BuildQSAR program [20]. As mentioned before, this approach allows selection of models with the following characteristics: a high squared correlation coefficient R², a low standard deviation S and the smallest number of descriptors involved. Next, the NCSS98 professional software package [21] was applied for detailed statistical analysis of the models obtained. Here, a high Fisher coefficient F, non-collinear descriptors and the significance level P served as additional selection criteria. A final set of QSARs was identified by applying the "leave-one-out" technique, with predictive ability being evaluated and confirmed by the cross-validation coefficient Q², which is based on the predictive error sum of squares (SPRESS).
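To make the GA-MLRA idea concrete, the following is a minimal sketch of genetic-algorithm descriptor selection with a least-squares fitness. It is not the BuildQSAR implementation: the operators, population size, mutation rate, the use of plain R² as fitness and the synthetic data are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def r2_fit(X, y, cols):
    """R^2 of a least-squares fit of y on the chosen descriptor columns."""
    A = np.column_stack([np.ones(len(y)), X[:, list(cols)]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2))

def as_tuple(indices):
    """Canonical form of an individual: sorted tuple of Python ints."""
    return tuple(sorted(int(c) for c in indices))

def ga_select(X, y, k=2, pop_size=20, generations=30, p_mut=0.2, seed=0):
    rng = np.random.default_rng(seed)
    n_desc = X.shape[1]
    # Random starting population: each individual is k distinct descriptors.
    pop = [as_tuple(rng.choice(n_desc, size=k, replace=False))
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda c: r2_fit(X, y, c), reverse=True)
        history.append(r2_fit(X, y, pop[0]))
        parents = pop[: pop_size // 2]        # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.choice(len(parents), size=2, replace=False)
            pool = list(set(parents[a]) | set(parents[b]))
            # Crossover: the child draws its descriptors from both parents.
            child = list(rng.choice(pool, size=k, replace=False))
            if rng.random() < p_mut:
                # Mutation: replace one descriptor with a random unused one.
                unused = [j for j in range(n_desc) if j not in child]
                child[rng.integers(k)] = rng.choice(unused)
            children.append(as_tuple(child))
        pop = parents + children
    best = max(pop, key=lambda c: r2_fit(X, y, c))
    return best, r2_fit(X, y, best), history

# Synthetic data (not from this study): 19 "compounds", 8 "descriptors".
rng = np.random.default_rng(1)
X = rng.normal(size=(19, 8))
y = 2.0 * X[:, 2] - X[:, 5] + 0.1 * rng.normal(size=19)
best, r2, history = ga_select(X, y)
print(best, round(r2, 3))
```

Because the fitter half of each generation is carried over unchanged, the best R² in the population can never decrease from one generation to the next; a production tool would typically also penalise model size and collinearity.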

Results and Discussion
As mentioned above, a QSAR analysis has been performed for 19 alkaloids with the lycoctonine skeleton (Figure 1) with the aim of establishing a structure-toxicity relationship. For the sake of simplicity we have divided the models into two general groups according to the nature of the descriptors involved: group one comprises models built from 3D descriptors generated by DRAGON, and group two comprises models containing physicochemical descriptors only. While constructing the models, great care was taken to avoid the inclusion of highly collinear descriptors. The correlation matrix for the 3D and physicochemical descriptors used in this study is given in Table 1. The table includes only those variables that appear in the most populated models selected by the Genetic Algorithm variable selection method.
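A collinearity screen of the kind described above can be sketched with a Pearson correlation matrix. The cut-off of 0.8, the descriptor names d1 to d3 and the numerical values are assumptions for illustration; the source does not state which threshold was applied.

```python
import numpy as np

def collinear_pairs(X, names, threshold=0.8):
    """List descriptor pairs whose absolute Pearson correlation exceeds threshold."""
    corr = np.corrcoef(X, rowvar=False)       # columns of X are descriptors
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

# Hypothetical descriptor values for six compounds.
d1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
d2 = [2.1, 3.9, 6.2, 8.1, 9.8, 12.1]          # nearly proportional to d1
d3 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]        # unrelated to d1 and d2
X = np.column_stack([d1, d2, d3])

pairs = collinear_pairs(X, ["d1", "d2", "d3"])
print(pairs)
```

Any pair flagged in this way would not be allowed to appear together in one regression equation.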

3D Descriptor Containing Models
One of the main advantages of 3D descriptors is their unambiguity regarding the 3D arrangement of atoms. Other properties also make them flexible and therefore popular. One is their independence from molecular size, which makes them applicable to large datasets with great structural variance. Another important property is their invariance to translation and rotation of the molecule. Such atom-based descriptors are also recommended for collections of exotic chemicals, for which there is a greater chance of physicochemical descriptors giving misleading information [22].
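The invariance property can be checked numerically for an RDF-type code, one of the descriptor families used in this study. The sketch below uses the commonly cited form g(r) = Σ(i<j) w_i w_j exp(-β (r - r_ij)²), which depends only on interatomic distances r_ij. The smoothing parameter β, the unit weights, the coordinates and the function name are assumptions for illustration and are not DRAGON settings.

```python
import numpy as np

def rdf_code(coords, radii, weights=None, beta=10.0):
    """RDF-type descriptor g(r) = sum_{i<j} w_i w_j exp(-beta (r - r_ij)^2)."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    i, j = np.triu_indices(n, k=1)             # all atom pairs with i < j
    r_ij = np.linalg.norm(coords[i] - coords[j], axis=1)
    ww = w[i] * w[j]
    return np.array([np.sum(ww * np.exp(-beta * (r - r_ij) ** 2))
                     for r in np.asarray(radii, dtype=float)])

rng = np.random.default_rng(0)
xyz = rng.uniform(-3.0, 3.0, size=(10, 3))     # hypothetical coordinates
radii = np.arange(1.0, 6.0, 0.5)

# Random rigid-body motion: a rotation from a QR decomposition plus a shift.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(q) < 0:
    q[:, 0] *= -1.0                            # turn a reflection into a rotation
moved = xyz @ q.T + np.array([5.0, -2.0, 1.0])

same = np.allclose(rdf_code(xyz, radii), rdf_code(moved, radii))
print(same)
```

Since rotation and translation leave every interatomic distance unchanged, the two descriptor vectors agree to within floating-point error.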
For model generation we chose 3D descriptors of the following types available within DRAGON: RDF, 3D-Morse, WHIM, GETAWAY and Randic. Several runs of the GA-MLRA variable selection technique implemented in the BuildQSAR program resulted in models containing mainly RDF and 3D-Morse type descriptors. 3D-Morse descriptors are obtained on the basis of the molecular transform equation used in electron diffraction [23]. The RDF code is based on the radial distribution function of an ensemble of N atoms, i.e. the probability distribution of finding an atom on a sphere of radius r [24]. Values of the 3D descriptors together with the toxicity data are given in Table 2 (LD50 values were taken from reference [15]). The following best QSAR models have been identified:

3D monodescriptor model (A):
Log(
The statistical significance of each model is evaluated by the correlation coefficient r, the standard error s, the adjusted r-squared r²adj, the F-test value, the significance level of the model P, the leave-one-out cross-validation coefficient Q² and the predictive error sum of squares SPRESS. As can be seen, all three models are highly significant according to P, and the statistical fit improves as additional descriptors are included. Interestingly, these models cannot be further improved by deleting from the series the compound that deviates most from the average. This can be explained by the fact that 3D descriptors take into account the 3D structure of each molecule in the series. Although they are not mechanistically relevant, the models obtained show the high flexibility and predictive ability of 3D descriptors, which is confirmed by the close values of r²adj and Q².
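The statistics listed above can be computed for any linear model along the following lines. This is a sketch under stated assumptions: Q² is taken as 1 - PRESS/SStot and SPRESS as sqrt(PRESS/(n - k - 1)), definitions that the source does not spell out, and the data are synthetic rather than the values of Table 2. The function name is hypothetical.

```python
import numpy as np

def mlr_statistics(X, y):
    """Least-squares fit of y = b0 + X b with fit and leave-one-out statistics."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])

    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot

    # Leave-one-out: refit without compound i, then predict compound i.
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        c_i, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += float((y[i] - A[i] @ c_i) ** 2)

    return {
        "r": np.sqrt(r2),
        "r2_adj": 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1),
        "s": np.sqrt(ss_res / (n - k - 1)),
        "F": (r2 / k) / ((1.0 - r2) / (n - k - 1)),
        "Q2": 1.0 - press / ss_tot,
        "SPRESS": np.sqrt(press / (n - k - 1)),   # assumed definition
    }

# Synthetic one-descriptor data set with 19 points (not the study data).
rng = np.random.default_rng(0)
x = rng.uniform(60.0, 150.0, size=19)
y = 3.3 + 0.017 * x + rng.normal(0.0, 0.3, size=19)
stats = mlr_statistics(x, y)
for name, value in stats.items():
    print(name, round(float(value), 3))
```

For a least-squares model PRESS is never smaller than the residual sum of squares, so Q² cannot exceed r² and SPRESS cannot fall below s; a small gap between r²adj and Q² is what the text above takes as evidence of predictive ability.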

Physicochemical Descriptor Containing Models
A number of the physicochemical descriptors used in this study are shown in Table 1. Two descriptors, TPSA (Topological Polar Surface Area) [25] and nC=O (the number of ─C(O)─ fragments present in the molecule), have been identified by the GA-MLRA model search as the best correlated with the toxicity data (refer to Table 2 for their values). TPSA describes the polar part of the molecule associated with oxygen, nitrogen and sulfur atoms and with the hydrogens connected to these heteroatoms. It has been reported to be one of the most predictive descriptors for building QSAR models of drugs affecting the Central Nervous System (CNS) [26]. TPSA has demonstrated a strong correlation with drug absorption: a decrease in polar surface area results in increased blood-brain permeation of CNS drug molecules [27] and improved intestinal epithelial permeability of orally administered medicinal agents [28]. A larger topological surface area was also proposed as a feature favoring the binding of HERG K+ channel blockers [29]. The QSAR model containing TPSA generated in this study is as follows:

PhysChem monodescriptor model (A):
$$\log(\mathrm{LD}_{50}^{-1}) = 0.01709(\pm 0.00480)\,\mathrm{TPSA} + 3.26509(\pm 0.33908) \qquad (4)$$
n = 19; r = 0.88; r²adj = 0.75; s = 0.318; F = 56.00; Q² = 0.70; SPRESS = 0.36; P = 0.000001
The graphical view of the model is shown in Figure 2. As can be seen from this model, the toxicity of the alkaloids increases with the number of heteroatoms in the molecule. This contrasts with the models obtained for other CNS drugs and confirms the difference in mechanism of action between the two types of CNS-affecting drugs. Lycoctonine alkaloids have been reported to affect the CNS by binding to the sodium ion channel, i.e. a membrane protein, leading to conformational changes and subsequently to functional disruption of the channel, whereas low molecular weight CNS drugs induce toxicity at a certain concentration, so that their ability to pass through biological membranes (e.g. the blood-brain barrier) plays a key role in their mode of toxic action. Furthermore, as reported elsewhere [30], no correlation should be observed between the octanol-water partition coefficient and toxicity when the toxicant interacts specifically with the receptor site. This is in good agreement with our results, as no substantial correlation has been observed between LogP/MlogP and the toxicity data (for example, see Figure 3, in which MlogP is plotted against the toxicity data). Taking into account that TPSA describes the H-bonding capability of the molecules, one can conclude that the high TPSA values of the more toxic alkaloids reflect stronger binding at the receptor site.
According to Figure 4, the toxicity of the alkaloids decreases as the number of C=O fragments decreases: the least toxic alkaloids have no C=O fragments, and the most toxic one contains four such fragments.
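As a worked illustration of Eqn. 4, the point estimate of the response can be evaluated for a few TPSA values. The TPSA values used here are hypothetical and are not taken from Table 2; the coefficient uncertainties given in the equation are ignored, and the function name is an assumption.

```python
def predict_log_inv_ld50(tpsa: float) -> float:
    """Point estimate from Eqn. 4: log(1/LD50) = 0.01709 * TPSA + 3.26509."""
    return 0.01709 * tpsa + 3.26509

# Hypothetical TPSA values spanning a plausible range.
for tpsa in (60.0, 100.0, 140.0):
    print(tpsa, round(predict_log_inv_ld50(tpsa), 2))
```

The predicted response rises by about 0.17 log units for every additional 10 units of TPSA, which is the quantitative form of the statement that more polar, H-bond-capable alkaloids are more toxic in this series.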
Development of mechanistically based models requires an accurate determination of the most relevant descriptors controlling the endpoints of interest. Even at first glance, it is clear that the nC=O fragment descriptor reflects the strength of the receptor-ligand interaction. The spatial orientation of the ─C(O)─ groups makes them the most favored "targets": donors of electron lone pairs that promote attachment and fixation of the ligand molecule in the 3D receptor domain. It is worth noting that the MlogP descriptor has been identified as the best one to combine with both the TPSA and nC=O descriptors when generating two-descriptor QSAR models. MlogP (Moriguchi octanol-water partition coefficient) [19] is a popular and traditional descriptor used in QSAR model building. It describes one of the most important properties of any compound, its lipophilicity, which indicates the ability to penetrate lipid-rich zones from aqueous solutions. The latter is a very important feature of any drug that is administered orally and is supposed to pass the gastrointestinal epithelium. Thus, according to reference [28], medicinal agents with PSA (the general case of TPSA) in the range from 61 Å² to 140 Å² and with LogP (the general case of MlogP) less than 5 are proposed to be well absorbed. Although nHAcc (the number of acceptor atoms N, O, F) is collinear with TPSA, we have used the latter descriptor as it gives better models and describes both the electron-donor and electron-acceptor properties of the molecule. Next, as can be seen from Table 1, two lipophilicity descriptors have been calculated for the alkaloids: MlogP (DRAGON program code) and LogP (HyperChem program code). However, the second octanol-water partition coefficient has not been considered in this study for two reasons: first, it has higher collinearity with the descriptor MW (Molecular Weight) and should therefore be avoided; second, it results in less significant models when combined with either the TPSA or the nC=O descriptor.
The GA-MLRA QSAR generation procedure has resulted in a number of statistically equivalent combinations of TPSA and nC=O with the other physicochemical descriptors. However, these models are not presented in this paper, as the authors consider them unable to reflect a true relationship between toxicity and physicochemical properties. It is also interesting to note that both descriptors chosen (TPSA and nC=O) are highly collinear with the descriptor MW, which results in a statistically equivalent two-descriptor model when MW is combined with MlogP (Eqn. 8):

PhysChem bidescriptor model (E):
$$\log(\mathrm{LD}_{50}^{-1}) = 0.00447(\pm 0.00122)\,\mathrm{MW} + 0.52498(\pm 0.26125)\,\mathrm{MlogP} + 1.04611(\pm 0.75092) \qquad (8)$$
n = 19; r = 0.92; r²adj = 0.83; s = 0.27; F = 44.20; Q² = 0.78; SPRESS = 0.32; P < 0.000001
However, this QSAR model (Eqn. 8) has no real value or meaning, as it cannot explain the mechanism of toxic action of the alkaloids studied. As shown by this model, alkaloids of higher molecular weight are more toxic. Nevertheless, it is important to clarify the real cause of the phenomenon observed. According to classical theory, an increase in MW due to a bulky part of the molecule (alkyl groups or aromatic rings) should result in an increase in lipophilicity, which is not observed for the present series of compounds. Also, as can be seen from the descriptor correlation table (Table 1), the MW descriptor is collinear not with MlogP but with nHAcc, TPSA and nC=O. Therefore, it can be concluded that the major factor influencing the toxicity of the alkaloids is the number of electron-donor (H-bonding) atoms or substituents rather than lipophilicity.
In the recently reported QSAR analysis of 12 Aconitum alkaloids [7], the authors identified a number of structural characteristics whose chemical manipulation might result in both an increase in analgesic potency and a reduction in toxicity. Mono-parameter equations based on reactivity index, heat of formation, total, electronic and steric energies, molecular weight and core-core repulsion were selected as statistically significant. Two-descriptor models were not proposed because of the considerable interrelation of the variables studied. For the same reason, the authors of the present paper do not present QSAR models containing three descriptors, as there is a good chance of getting two highly collinear parameters in one equation.
Keeping in mind that realistic r² values may be as low as 0.6-0.7 [22], all the QSAR models obtained in this study show a satisfactory statistical fit, with acceptable correlation coefficients and a high significance level for each equation. The ability of each model to make predictions has been evaluated by means of cross-validation coefficients. The close values of the correlation coefficients adjusted for degrees of freedom and the cross-validation coefficients confirm the good predictivity of the QSARs, especially of those containing two descriptors.

Conclusions
To summarize, 19 Aconitum and Delphinium alkaloids with the lycoctonine skeleton have been analyzed using a QSAR approach. The best models selected and studied in this paper are those containing 3D descriptors (RDF020u, Mor28e, Mor07p) and physicochemical descriptors (TPSA, MlogP, nC=O). The major difference between the two types of models is the better statistical fit of the models containing 3D descriptors. It is worth mentioning that exclusion of the deviant compound does not lead to a notable improvement in either the 3D or the physicochemical descriptor models. QSARs consisting of physicochemical descriptors confirm the mode of toxic action of lycoctonine-type alkaloids, which interact with the sodium ion channel. Thus, the correlation observed between TPSA (defined as the van der Waals area of electron-donor and electron-acceptor atoms) and toxicity demonstrates the importance of the H-bonding capability of each alkaloid for binding to the bioreceptor. The descriptor most closely related to toxicity is the number of C=O fragments in each molecule (nC=O), which also describes the H-bond formation capacity of the molecule. It has also been shown that toxicity increases with the number of atoms participating in hydrogen bonding and not with the bulk of the molecule. This confirms experimental results suggesting that Aconitum and Delphinium alkaloids interact with specific receptor sites in the sodium ion channel and that hydrogen bonds therefore play an important role in their binding.