Quantitative Structure-Antiprotozoal Activity Relationships of Sesquiterpene Lactones

Prompted by our previous finding that some sesquiterpene lactones (STLs) are highly active against Trypanosoma brucei rhodesiense (the causative agent of East African sleeping sickness), we have now conducted a structure-(in-vitro)-activity study on a set of 40 STLs against T. brucei rhodesiense, T. cruzi, Leishmania donovani and Plasmodium falciparum. In addition, cytotoxic activity against L6 rat skeletal myoblast cells was assessed. Some of the compounds possess high activity, especially against T. brucei (e.g. helenalin and some of its esters, with IC50 values of 0.05-0.1 µM, about 10 times lower than their cytotoxic IC50 values). All investigated antiprotozoal activities were found to be significantly correlated with cytotoxicity, and the major determinants of activity are α,β-unsaturated structural elements, which are also known to be essential for other biological activities of STLs. It was observed, however, that certain compounds are considerably more toxic against protozoa than against mammalian cells, while others are more cytotoxic than active against the protozoa. A comparative QSAR analysis was therefore undertaken in order to discriminate between the antiparasitic activity of STLs against T. brucei and their cytotoxicity. Both activities were found to depend to a large extent on the same structural elements and molecular properties. The observed variance in the biological data can be explained in terms of subtle variations in the relative influences of various molecular descriptors.


Introduction
Protozoal infections such as malaria, trypanosomiases and leishmaniases represent major health risks in developing countries. It is estimated that world-wide these diseases are responsible for over one million deaths a year [1]. While relatively effective and safe therapies for malaria exist, African sleeping sickness and Chagas' disease caused by Trypanosoma species, as well as cutaneous and visceral Leishmaniasis (Kala-Azar) are currently classified as "neglected diseases" [2]. Only a few effective drugs exist for the treatment of these infections and therapy is often accompanied by severe adverse effects and high toxicity, so the search for new drugs or lead structures, especially against Trypanosoma and Leishmania infections, is an urgent task [3]. Natural products have in many instances been found to provide interesting leads for such diseases [3]. Among many other examples, it has been shown by our group that certain sesquiterpene lactones (STLs) possess considerable activity against Trypanosoma species [4,5]. The present study was conducted in order to obtain a more detailed insight into the structure-activity relationships governing antiprotozoal activity of STLs. To this end, 40 STLs including 16 pseudoguaianolides, four xanthanolides, four modified xanthanolides, eight eudesmanolides and eight germacranolides (see Figure 1) were tested in-vitro against four major protozoan pathogens, Trypanosoma brucei rhodesiense (Tbr), Trypanosoma cruzi (Tcr), Leishmania donovani (Ldon) as well as Plasmodium falciparum (Pfc). As a control system to assess cytotoxic activity, the rat skeletal myoblast cell line L6, also serving as host cell system in the Tcr assay, was used. The resulting data were subsequently investigated for quantitative structure-activity relationships (QSAR) using molecular modelling and multivariate data analysis tools.

Biological Activity Data and Activity-Activity Relationships
The bioactivity data of the 40 sesquiterpene lactones (for structures, see Figure 1) tested in-vitro for activity against T. brucei rhodesiense (Tbr), T. cruzi (Tcr), L. donovani (Ldon) and P. falciparum (Pfc), as well as for cytotoxicity against L6 rat skeletal myoblasts, are reported in Table 1. Generally, T. brucei rhodesiense was found to be the most sensitive to STLs among the tested parasites. In line with the previously reported significant activity of helenalin (1) [4], its ester derivatives 2-4 were found to be very active, with IC50 values in the range of 0.1 µM and below. Pseudoguaianolides of the helenalin series also showed the highest activity against the other parasites. Some of the helenalin congeners exhibited more pronounced bioactivity against Tcr than the positive control benznidazole. They also showed activity against Ldon and Pfc in a similar range as their respective controls.
On the other hand the xanthanolide 8-epixanthatin-1,5-epoxide 19, recently isolated as the most active compound from Xanthium brasilicum in the course of a bioactivity-guided isolation study [5], exhibited considerable activity against Tbr and its close relative Ldon. However, all of the most active STLs displayed significant toxicity against the rat skeletal myoblast cell line L6, used as control to assess cytotoxicity.

Pairwise correlation of the bioactivity data (Table 2) revealed that antiprotozoal activity is in all cases significantly correlated with cytotoxicity against the L6 cells. It can thus be expected that all investigated bioactivities are, at least to a significant degree, governed by similar structure-activity relationships as cytotoxicity. However, several of the tested compounds (e.g. 1-6, 18, 19) displayed significantly higher levels of antiparasitic activity, especially against Tbr, than cytotoxicity, whereas others, e.g. the eudesmanolides 25 and 26, are considerably more toxic against the mammalian cells than against the protozoa. In order to assess the degree of selectivity of each STL against a particular parasite, the ratio of the cytotoxic IC50 value against the mammalian control cell line L6 to the respective IC50 value for each of the four protozoan organisms was calculated. The most favourable selectivity indices (SI) with respect to activity against Tbr were found for compounds 19 and 24, which were 67 and 52 times more active than cytotoxic, respectively. These two compounds were, moreover, also the most selective against Ldon (SI = 36 and 15, respectively). Since the absolute activity of compound 24 is relatively low, however, it is compound 19 in particular that appears to have potential as a lead compound. Helenalin 1, followed by its acetate 2, both showing Tbr activity well below 0.1 µM, still possess SI values of 19 and 13, respectively, despite their somewhat higher cytotoxicity, which renders them interesting candidates for further studies as well.
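The selectivity index described above can be illustrated with a short Python sketch. The function name, the compound labels and the IC50 values are hypothetical placeholders chosen for illustration only; they are not data from Table 1.

```python
def selectivity_index(ic50_l6_um, ic50_parasite_um):
    """SI = IC50 against L6 host cells divided by IC50 against the parasite.

    Values above 1 mean the compound is more active against the parasite
    than it is cytotoxic. Both inputs must use the same unit (here: µM).
    """
    if ic50_l6_um <= 0 or ic50_parasite_um <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_l6_um / ic50_parasite_um


# Hypothetical (made-up) IC50 pairs: (L6 cytotoxicity, antiparasitic), in µM.
example_ic50 = {
    "compound A": (2.0, 0.05),
    "compound B": (1.5, 3.0),
}

# Rank the compounds from most to least selective.
ranked = sorted(
    ((name, selectivity_index(l6, par)) for name, (l6, par) in example_ic50.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, si in ranked:
    print(f"{name}: SI = {si:.1f}")
```

With these placeholder numbers, compound A is more active than cytotoxic (SI above 1) and compound B is more cytotoxic than active (SI below 1), mirroring the two situations described in the text.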

Structure-activity relationships, QSAR
A very simple but nevertheless essential structure-activity relationship (SAR) is already obvious when visually comparing the structures and their activity data. Compounds possessing at least one potentially reactive α,β-unsaturated carbonyl group as a pharmacophore usually show significant antiprotozoal as well as cytotoxic activities, while compounds lacking such structural elements show relatively insignificant activity. The presence of such potential Michael acceptors in the structure is thus a prerequisite for activity, in much the same way as reported in previous studies [6][7][8][9] and in full agreement with the frequent observation that various bioactivities of STLs are associated with their chemical reactivity, especially towards free thiol groups (e.g. cysteine residues in enzymes and transcription factors; for overviews see [8,9]).
Previous studies on quantitative SAR (QSAR) in our laboratory have concentrated on structure-cytotoxicity relationships among STLs, and the major structural determinants of this activity were reported for various data sets obtained with several human and murine cell lines [6,7].
Since the spread of activity data was largest in the case of Tbr activity (3.8 log units) and the absolute level of activity displayed by some of the compounds was highest against this parasite, we concentrated on this set of activity data. In our previous work [6,7] it was found that the cytotoxic activity of STLs correlates quite strongly with a very simple type of molecular descriptor, namely binary indicator variables which encode the presence or absence of a particular reactive structure element. The same was found in this study. When the activity of all 40 compounds against Tbr and their L6 cytotoxicity were analysed by multiple linear regression for correlation with such descriptors (ML: 1 if an α-methylene-γ-lactone is present, 0 if it is absent; ENONE: the same coding for an α,β-unsaturated ketone structure), squared correlation coefficients (R²) of 0.61 and 0.41, respectively, were found. Although these values are not very high, the significance of both descriptors' regression coefficients was confirmed for both sets of activity data by t- and F-tests. Quite interestingly, when only the subset of structurally closely related compounds 1-16 (pseudoguaianolides of the helenanolide series) was considered, the correlation coefficients were much higher (R² = 0.87 and 0.79 for Tbr and L6, respectively). This result is in line with our previous findings [6,7] and can be interpreted in a straightforward manner by assuming that, for compounds of very similar molecular structure (in fact the same carbon skeleton and thus similar size, shape and substitution), the modulating influence of other structural factors on bioactivity is small compared to the major impact of chemical reactivity (i.e. the presence of enone and methylene lactone groups). Both bioactivities in the set of closely related compounds are largely dominated by these factors, i.e. differences in antitrypanosomal as well as cytotoxic activity can easily be explained by differences in the potential to alkylate biomolecules. However, when STLs of greater structural diversity (i.e. the whole data set) are considered, the modulating influence of other structural features increases (reflected in the lower degree of direct correlation with the indicators ML and ENONE). Thus, in order to explain the above-mentioned differences in antiprotozoal and cytotoxic activity for the whole set of compounds, other molecular properties and structural features must be considered.
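The indicator-variable regression described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure or software used in the study: the eight (ML, ENONE, pIC50) rows are invented, and the helper names `solve_linear`, `fit_mlr` and `r_squared` are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_mlr(rows, y):
    """Ordinary least squares via the normal equations; returns [b0, b1, ...]."""
    X = [[1.0] + list(r) for r in rows]  # prepend intercept column
    p = len(X[0])
    xtx = [[sum(xi[i] * xi[j] for xi in X) for j in range(p)] for i in range(p)]
    xty = [sum(xi[i] * yi for xi, yi in zip(X, y)) for i in range(p)]
    return solve_linear(xtx, xty)


def r_squared(rows, y, coef):
    """Squared correlation coefficient R² = 1 - SS_res / SS_tot."""
    pred = [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Invented example: columns are (ML, ENONE); y is a made-up pIC50.
indicators = [(0, 0), (0, 0), (1, 0), (1, 0), (0, 1), (0, 1), (1, 1), (1, 1)]
pic50 = [4.0, 4.2, 5.0, 5.4, 4.9, 4.5, 6.1, 5.9]

coef = fit_mlr(indicators, pic50)
r2 = r_squared(indicators, pic50, coef)
print("intercept, b_ML, b_ENONE:", [round(c, 3) for c in coef])
print("R2:", round(r2, 3))
```

Each regression coefficient estimates the average gain in pIC50 associated with the presence of the corresponding reactive group, which is how such indicator variables are usually interpreted.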
To this end, a 3D model of each molecule was created using the molecular modelling package MOE [11] and a variety of molecular descriptors were calculated using the QSAR module of MOE (for a full list of the descriptors considered, see Experimental Section). The resulting data matrix (40 compounds × 44 descriptors) was analysed with the multivariate correlation method PLS2 as implemented in the statistics program The Unscrambler [12]. PLS2 served to correlate both sets of biological activity data simultaneously with the structural descriptors (for the resulting statistical parameters, see Table 3). It was found that compound 24 represented an outlier, leading to poor predictive quality of the correlation model (low cross-validated correlation coefficient Q², especially for the cytotoxicity data). After exclusion of this compound and subsequent elimination of variables not significantly contributing to the overall correlation (variable selection by Martens' uncertainty test [12]), the final model presented in Figures 2-4 resulted (for statistics, see Table 3). In this model, the information content of 20 descriptors is combined in three significant PLS components (PCs, latent variables). The correlation coefficients and leave-one-out cross-validated correlation coefficients for Tbr activity are 0.89 and 0.85, respectively; the corresponding values for L6 cytotoxicity are 0.90 and 0.84. As can be seen in the loadings plot in Figure 4, the first PLS component (PC1), explaining 45% of the overall variance of the biological data, is dominated by the descriptors AM1-LUMO (energy of the lowest unoccupied molecular orbital; most negative loading) and ENONS (most positive loading). Both descriptors are related to the molecules' reactivity/alkylating potency. According to frontier molecular orbital theory, electrophilic reactivity is inversely correlated with the energy of a molecule's lowest unoccupied molecular orbital (LUMO) [13]. 
Descriptor ENONS, on the other hand, represents the molecular surface area due to α,β-unsaturated carbonyl structures which, as expected, has a positive impact on activity.

Figure 2. PLS2 model (see Table 3). Blue: calibration data; red: predictions of leave-one-out cross validation.

Figure 3. Descriptors (see Table 4) in the PLS2 model [model 3 in Table 3] (left: L6 cytotoxicity, right: Tbr activity).

Figure 4. PLS2 model (see Table 3). Top: PC2 vs. PC1; bottom: PC3 vs. PC2. In the scores plots, compounds are coloured according to their activity against Tbr (pIC50).

The second latent variable PC2 (explaining a further 27% of the variance in the biological data) receives major influences from the descriptors ASAP4 [7] (positive coefficient), ASA, ASA+ and stdim1 (negative coefficients). The former, which also shows a positive influence in PC1, represents the molecular surface area attributable to hydrogen atoms attached to the double bond carbons of α,β-unsaturated ketone structures and is thus also related to reactivity and accessibility of the reactive partial structures. The latter three descriptors represent the total surface area, the surface area attributable to atoms with positive partial charge, and a measure of the molecule's spatial extent. Accordingly, this PC appears to be related mainly to molecular size. The negative coefficients of the latter descriptors indicate that a reduction in activity is associated with a large total size of the molecule coupled with a large overall positive surface area. Finally, PC3 (explaining a further 9% of the biological data variance) is clearly correlated with the molecules' polarity/hydrophilicity. This latent variable receives strong influence from the descriptors E_sol (calculated solvation energy, more positive in the case of hydrophobic molecules) and logS (logarithm of the calculated water solubility). The positive coefficient of the former and the negative coefficient of the latter descriptor clearly demonstrate an inverse correlation of the biological activities with polarity, i.e. active molecules should not be too polar but should rather have a certain degree of hydrophobicity.
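The relation between the calibration coefficient R² and the leave-one-out cross-validated coefficient Q² quoted for the model can be illustrated with a small sketch. For brevity it uses a one-descriptor least-squares line as a stand-in for the PLS2 model, so it demonstrates only the cross-validation bookkeeping and not PLS itself; the data points are invented and the common definition Q² = 1 - PRESS/SS_tot is assumed.

```python
def fit_line(x, y):
    """Least-squares line y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r2_calibration(x, y):
    """R² of the model fitted to all data (calibration)."""
    slope, intercept = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def q2_leave_one_out(x, y):
    """Q² = 1 - PRESS / SS_tot, each point predicted by a model fitted without it."""
    press = 0.0
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_tot


# Invented data: a made-up descriptor (e.g. a LUMO-like energy) vs. made-up pIC50.
descriptor = [-0.9, -0.7, -0.5, -0.3, -0.1, 0.1]
activity = [6.8, 6.1, 5.9, 5.0, 4.6, 4.1]

r2 = r2_calibration(descriptor, activity)
q2 = q2_leave_one_out(descriptor, activity)
print(f"R2 = {r2:.3f}, Q2 = {q2:.3f}")
```

Q² is lower than R² whenever the fit is not perfect, because every compound is predicted by a model that has never seen it; a large gap between the two is the usual sign of an overfitted or outlier-driven model, as was observed before compound 24 was excluded.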

Conclusions
In conclusion, the very similar PLS coefficients (Figure 3) in the two models for Tbr and L6 activity, which differ only slightly in magnitude for the individual descriptors, clearly show that no major differences exist in the general structure-activity relationships for cytotoxicity and antitrypanosomal activity of the 40 STLs included in the present study. It appears to be a difficult task, if possible at all, to exploit the relatively subtle structural differences responsible for differential activity for lead structure optimisation. While tests of further STLs and related compounds, as well as the application of further QSAR methods, might yet reveal clearer and possibly more detailed quantitative structure-activity relationships, different strategies may have to be applied in order to increase the selectivity and to exploit the interesting antiprotozoal potential of these natural products. In this respect, the design of parasite-targeted prodrugs [14,15] or the exploitation of a parasite transporter, for example, may be of interest. Studies in this direction have been initiated.

Test compounds
Compounds 1-17, 20 and 26-28 were isolated from Arnica species as reported previously [16]. Compounds 18, 19 and 21-24 were isolated from Xanthium brasilicum Vell. [5]. Compounds 25, 29 and 30 were isolated from roots of Inula helenium in our laboratory. They were identified by their NMR data, which were in accordance with published data [17,18]. Compounds 31-33 and 36-40, originating from various Asteraceae, were kindly provided by G. Willuhn, Düsseldorf, Germany. Compound 35 was kindly provided by N. H. Fischer, Denton, TX, U.S.A. Compound 34 (parthenolide) was obtained from Sigma-Aldrich (cat. No. P667). The purity of all compounds was assessed by 1H-NMR, HPLC and/or TLC analyses and found to be >80% in all cases.

In vitro assays and IC50 determination
Plasmodium falciparum. Antiplasmodial activity was determined using the K1 strain of P. falciparum (resistant to chloroquine and pyrimethamine). A modification of the [3H]-hypoxanthine incorporation assay was used [19]. Briefly, infected human red blood cells (final parasitaemia and haematocrit were 0.3% and 1.25%, respectively) in RPMI 1640 medium with 5% Albumax were exposed to serial drug dilutions in microtiter plates. After 48 hours of incubation at 37°C in a reduced oxygen atmosphere, 0.5 μCi [3H]-hypoxanthine was added to each well. Cultures were incubated for a further 24 h before they were harvested onto glass-fiber filters and washed with distilled water. The radioactivity was counted using a Betaplate™ liquid scintillation counter (Wallac, Zurich, Switzerland). The results were recorded as counts per minute (CPM) per well at each drug concentration and expressed as percentage of the untreated controls. IC50 values were calculated from the sigmoidal inhibition curves. Assays were run in duplicate and repeated once.
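As a rough illustration of how an IC50 value can be read from a dose-response series expressed as percentage of the untreated control, the following sketch interpolates linearly on the log-concentration scale between the two concentrations that bracket 50%. This is a deliberately simple stand-in and not the curve-fitting procedure used in the assays; the function name and the example readings are hypothetical.

```python
import math


def ic50_by_log_interpolation(concentrations, percent_of_control):
    """Estimate IC50 by log-linear interpolation around the 50% crossing.

    concentrations: drug concentrations (any consistent unit, all > 0).
    percent_of_control: signal at each concentration, in % of untreated control.
    Returns None if the response never crosses 50% within the tested range.
    """
    pairs = sorted(zip(concentrations, percent_of_control))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(pairs, pairs[1:]):
        crosses = (r_lo - 50.0) * (r_hi - 50.0) <= 0
        if crosses and r_lo != r_hi:
            # Fraction of the way from c_lo to c_hi (on the log scale) at 50%.
            frac = (r_lo - 50.0) / (r_lo - r_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Made-up readings: growth drops from 75% to 25% between 1 and 100 (arbitrary units).
estimate = ic50_by_log_interpolation([1.0, 100.0], [75.0, 25.0])
print(f"estimated IC50: {estimate:.2f}")
```

A full sigmoidal (e.g. four-parameter logistic) fit uses all data points and is less sensitive to noise in the two bracketing wells; the interpolation above is only meant to make the notion of the 50% crossing concrete.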
Trypanosoma brucei rhodesiense and cytotoxicity against L6 cells. Minimum Essential Medium with Earle's salts (50 µL) supplemented with 0.2 mM 2-mercaptoethanol, 1 mM Na-pyruvate and 15% heat-inactivated horse serum was added to each well of a 96-well microtiter plate. Serial drug dilutions were prepared by adding 25 μL complete medium containing 540 μg/mL (6x the starting concentration), thus covering a range from 90 to 0.123 µg/mL. Then 10^4 bloodstream forms of Trypanosoma brucei rhodesiense STIB 900 in 50 µL of medium were added to each well and the plate was incubated at 37°C under a 5% CO2 atmosphere for 72 hours. Alamar blue solution (10 µL, 12.5 mg resazurin dissolved in 100 mL distilled water) was then added to each well and incubation continued for a further 2-4 hours. The plate was then read in a Spectramax Gemini XS microplate fluorometer (Molecular Devices Corporation, Sunnyvale, CA, USA) using an excitation wavelength of 536 nm and an emission wavelength of 588 nm [20]. Fluorescence development was measured and expressed as percentage of the control. Data were transferred into the graphic programme Softmax Pro (Molecular Devices), which calculated IC50 values. Cytotoxicity was assessed using a similar protocol and rat skeletal myoblasts (L6 cells). L6 cells were seeded into RPMI 1640 medium supplemented with 2 mM L-glutamine, 5.95 g/L HEPES, 2 g/L NaHCO3 and 10% fetal bovine serum in 96-well microtiter plates (4,000 cells/well). All following steps were according to the T. b. rhodesiense protocol.
Trypanosoma cruzi. Rat skeletal myoblasts (L6 cells) were seeded in 96-well microtiter plates at 2,000 cells/well in 100 µL RPMI 1640 medium with 10% FBS (fetal bovine serum) and 2 mM L-glutamine. After 24 hours the medium was removed and replaced by 100 µL per well containing 5,000 trypomastigote forms of T. cruzi Tulahuen strain C2C4 containing the β-galactosidase (Lac Z) gene [21]. Forty-eight hours later the medium was removed from the wells and replaced by 100 μL fresh medium with or without a serial drug dilution. Seven 3-fold dilutions were used, covering a range from 90 μg/mL to 0.123 μg/mL. Each drug was tested in duplicate. After 96 hours of incubation the plates were inspected under an inverted microscope to assure growth of the controls and sterility. Then the substrate CPRG/Nonidet (50 µL) was added to all wells. A colour reaction developed within 2-6 hours and could be read photometrically at 540 nm. Data were transferred into the graphic programme Softmax Pro (Molecular Devices), which calculated IC50 values.
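The concentration ranges quoted for the assays follow directly from the dilution scheme: seven 3-fold steps starting at 90 µg/mL end at 90/3^6 ≈ 0.123 µg/mL. A minimal sketch of this arithmetic is given below; the function name is hypothetical, and applying the same seven-step scheme to the 30 to 0.041 µg/mL range of the L. donovani assay is an assumption that happens to reproduce the stated end point.

```python
def dilution_series(top_concentration, fold, n_points):
    """Concentrations of a serial dilution, starting at the highest one."""
    return [top_concentration / fold ** i for i in range(n_points)]


# Seven 3-fold dilutions from 90 µg/mL (as stated for the T. cruzi assay).
series_90 = dilution_series(90.0, 3, 7)
print([round(c, 3) for c in series_90])

# Assumed analogous scheme starting at 30 µg/mL.
series_30 = dilution_series(30.0, 3, 7)
print([round(c, 3) for c in series_30])
```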
Leishmania donovani (axenic amastigote assay). Fifty µL of SM medium [22] at pH 5.4, supplemented with 10% heat-inactivated FBS, was added to each well of a 96-well microtiter plate (Costar, USA). Serial drug dilutions in duplicate were prepared, covering a range from 30 to 0.041 µg/mL. Then 10^5 axenically grown Leishmania donovani amastigotes (strain MHOM/ET/67/L82) in 50 µL medium were added to each well and the plate was incubated at 37°C under a 5% CO2 atmosphere for 72 hours. Resazurin solution (10 µL, 12.5 mg resazurin dissolved in 100 mL distilled water) was then added to each well and incubation continued for a further 2-4 hours. The plate was then read in a Spectramax Gemini XS microplate fluorometer (Molecular Devices Corporation, Sunnyvale, CA, USA) using an excitation wavelength of 536 nm and an emission wavelength of 588 nm [20]. Fluorescence development was measured and expressed as percentage of the control. Data were transferred into the graphic programme Softmax Pro (Molecular Devices), which calculated IC50 values from the sigmoidal inhibition curves.
The compounds used as positive controls in the various bioassays (see Table 1) were of commercial origin, with the exception of melarsoprol, which was a gift from WHO. Their purity (generally > 95%) was specified by the manufacturers.

Computational Methods
3D-models of all compounds were generated with MOE [11] using the MMFF94x force field. A stochastic conformational search was performed for each compound (default settings of MOE) and the resulting conformers were energy minimised using AM1 (MOPAC module of MOE). The conformer with the lowest AM1 energy was used in the QSAR study.
QSAR descriptors were calculated for each of the compounds using the QSAR module of MOE. A full list of descriptors considered is reported in Table 4. For the QSAR analyses, the IC50 data for Tbr activity and L6 cytotoxicity, expressed on the molar scale, were converted to negative decadic logarithms (pIC50).
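The conversion to pIC50 can be written out as a short sketch. The function names are hypothetical, and the molecular weight used in the unit-conversion example is an arbitrary illustrative number rather than that of any compound in the data set.

```python
import math


def ug_per_ml_to_micromolar(ic50_ug_per_ml, molecular_weight):
    """Convert an IC50 from µg/mL to µM, given the molecular weight in g/mol."""
    # µg/mL equals mg/L; mg/L divided by g/mol gives mmol/L; times 1000 gives µmol/L.
    return ic50_ug_per_ml / molecular_weight * 1000.0


def pic50_from_micromolar(ic50_micromolar):
    """pIC50 = -log10(IC50 in mol/L)."""
    if ic50_micromolar <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_micromolar * 1e-6)


# IC50 values of 0.1 and 0.05 µM (the range quoted for the most active compounds).
print(round(pic50_from_micromolar(0.1), 2))
print(round(pic50_from_micromolar(0.05), 2))

# Illustrative unit conversion with an arbitrary molecular weight of 250 g/mol.
print(round(ug_per_ml_to_micromolar(1.0, 250.0), 2))
```

On this scale a tenfold increase in potency corresponds to an increase of one pIC50 unit, so the spread of 3.8 log units mentioned for the Tbr data spans a factor of roughly 6,000 in IC50.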
Multivariate analysis of the resulting data matrix was performed with the statistics program "The Unscrambler", v. 9.2 [12], using the PLS2 algorithm, which allows simultaneous construction of correlative models for both dependent variables (i.e. pIC50 data) using the same set of independent variables (i.e. descriptors). Refinement of the initial model containing all 44 variables was achieved by variable selection based on Martens' Uncertainty Test as implemented in the Unscrambler program.

Table 4. Descriptors considered in the QSAR analysis (entries 31-44).
No.  Descriptor  Definition
31   ASAN1       fractional accessible surface area due to atoms in partial charge interval 0 to -0.05 e [7]
32   ASAN2       fractional accessible surface area due to atoms in partial charge interval -0.05 to -0.1 e [7]
33   ASAN3       fractional accessible surface area due to atoms in partial charge interval -0.1 to -0.15 e [7]
34   ASAN4       fractional accessible surface area due to atoms in partial charge interval -0.15 to -0.2 e [7]
35   ASAN5       fractional accessible surface area due to atoms in partial charge interval -0.2 to -0.25 e [7]
36   ASAN6       fractional accessible surface area due to atoms in partial charge interval -0.25 to -0.3 e [7]
37   ASAN7       fractional accessible surface area due to atoms in partial charge interval <-0.30 e [7]
38   ASAP1       fractional accessible surface area due to atoms in partial charge interval 0 to 0.05 e [7]
39   ASAP2       fractional accessible surface area due to atoms in partial charge interval 0.05 to 0.1 e [7]
40   ASAP3       fractional accessible surface area due to atoms in partial charge interval 0.1 to 0.15 e [7]
41   ASAP4       fractional accessible surface area due to atoms in partial charge interval 0.15 to 0.2 e [7]
42   ASAP5       fractional accessible surface area due to atoms in partial charge interval 0.2 to 0.25 e [7]
43   ASAP6       fractional accessible surface area due to atoms in partial charge interval 0.25 to 0.3 e [7]
44   ASAP7       fractional accessible surface area due to atoms in partial charge interval >0.3 e [7]