Molecules 2014, 19(7), 10546-10562; doi:10.3390/molecules190710546

Hologram QSAR Studies of Antiprotozoal Activities of Sesquiterpene Lactones
Faculdade de Ciências Farmacêuticas, Universidade de São Paulo, Av. Lineu Prestes, 580, 05508-000 São Paulo, Brazil
Institute of Pharmaceutical Biology and Phytochemistry (IPBP), University of Münster, Pharma Campus, Corrensstraße 48, D-48149 Münster, Germany
Author to whom correspondence should be addressed.
Received: 4 April 2014; in revised form: 8 July 2014 / Accepted: 9 July 2014 / Published: 18 July 2014


Infectious diseases such as trypanosomiasis and leishmaniasis are considered neglected tropical diseases due to the long-standing lack of research and development into new drug treatments, despite their high mortality and the lack of safe and effective current drug therapies. Natural products such as sesquiterpene lactones have shown activity against T. brucei and L. donovani, the parasites responsible for these neglected diseases. To evaluate structure-activity relationships, HQSAR models were constructed to relate a series of 40 sesquiterpene lactones (STLs) with activity against T. brucei, T. cruzi, L. donovani and P. falciparum and also with their cytotoxicity. All constructed models showed good internal (leave-one-out q2 values ranging from 0.637 to 0.775) and external validation coefficients (r2test values ranging from 0.653 to 0.944). From HQSAR contribution maps, several differences between the most and least potent compounds were found. The fragment contributions of the PLS-generated models confirmed the results of previous QSAR studies that the presence of α,β-unsaturated carbonyl groups is fundamental to biological activity. QSAR models for the activity of these compounds against T. cruzi, L. donovani and P. falciparum are reported here for the first time. The constructed HQSAR models are suitable to predict the activity of untested STLs.
HQSAR; sesquiterpene lactones; Trypanosoma brucei; Trypanosoma cruzi; Leishmania donovani; Plasmodium falciparum; antiprotozoal activity; fragment-based drug design

1. Introduction

Several diseases caused by protozoan parasites, such as leishmaniases, trypanosomiases (Chagas Disease and African sleeping sickness) and malaria, represent major health risks in developing countries. Leishmaniases and trypanosomiases have few available drug therapies, and the development of anti-malarial compounds is also urgently needed due to rapidly emerging resistance of the parasites against existing drugs. It is estimated that infections by Trypanosoma and Leishmania are responsible for over a million deaths per year. Their treatment is complicated by severe side effects due to the high toxicity of available drugs. Due to the lack of research and development of new drugs over many decades, these diseases are considered “neglected tropical diseases” [1,2,3,4,5]. There is thus an urgent need for the development of new therapeutics against these diseases. Many classes of chemicals have been tested against these parasites. Among them, natural products and, particularly, sesquiterpene lactones (STLs) have shown interesting activities [6,7].
In a previous study [8], in vitro activity data for 40 sesquiterpene lactones (STLs) against Trypanosoma brucei rhodesiense (the etiologic agent of East African sleeping sickness; Tbr), T. cruzi (Chagas Disease; Tcr), Leishmania donovani (visceral leishmaniasis, Kala-Azar; Ldon) as well as Plasmodium falciparum (tropical malaria; Pfc) were reported. Quantitative structure-activity relationship (QSAR) models for the activity against T. brucei rhodesiense and for the cytotoxic activity of these compounds against L6 rat skeletal myoblast cells were presented. It was found that the biological effects against the protozoan parasites were all correlated significantly with cytotoxicity against the mammalian control cells. It was not possible at that time and with the methods used for QSAR modelling to clearly define a structural basis for selectivity against the parasites [8]. QSAR approaches are considered powerful tools in lead identification as well as optimization [9] in cases where the bioactivity of congeneric sets of compounds is known. Even though QSAR methods have been applied to STLs successfully for several bioactivities [10,11,12,13] it remained a challenge to construct validated models of anti-protozoal activity of STLs against T. cruzi, L. donovani and P. falciparum [8,14].
The main objective of our present work is therefore to apply the hologram quantitative structure-activity relationship (HQSAR) approach to construct comparable models for all four mentioned protozoa and cytotoxicity and to employ the molecular fragment information of the generated models to analyze the structural basis for the antiprotozoal activity and cytotoxicity of the compounds in this data set in order to find possible reasons for the selectivity observed with some of the STLs.

2. Results and Discussion

As the biological activities against T. brucei and L6 cover a range of at least three logarithmic units, as shown in Figure 1, and all data within each activity set were determined under identical experimental conditions [8], the dataset is deemed suitable for QSAR studies. The biological activities against T. cruzi, L. donovani and P. falciparum only cover 2.25, 1.90 and 1.92 logarithmic units, respectively. This is not the ideal scenario for constructing HQSAR models, but we decided to construct these three models in order to support the results generated by the Tbr and L6 models.
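The ranges quoted for T. cruzi, L. donovani and P. falciparum can be checked against the highest and lowest pIC50 values listed in Table 5. The following is a minimal sketch of that arithmetic; the dictionary and function names are illustrative and not part of the original workflow.

```python
# Highest and lowest measured pIC50 per parasite, read from Table 5.
EXTREMES = {
    "Tcr": (6.269, 4.020),
    "Ldon": (6.351, 4.449),
    "Pfc": (6.483, 4.561),
}


def activity_span(highest, lowest):
    """Return the pIC50 range in logarithmic units, rounded to two decimals."""
    return round(highest - lowest, 2)


spans = {name: activity_span(hi, lo) for name, (hi, lo) in EXTREMES.items()}
print(spans)
```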
Figure 1. pIC50 distribution of the dataset of 40 STLs over the five biological activities under study. Each graph represents the respective number of compounds with measured pIC50 (N) values in a particular concentration range against each tested parasite and cytotoxicity (L6).
Initially, the HQSAR models with 16 series of fragment distinction and fixed fragment size (4 to 7 atoms) were generated for each series of biological activity (T. brucei rhodesiense, T. cruzi, L. donovani, P. falciparum and L6 cytotoxicity). The five best models for each dependent variable are presented in Table 1.
The initial search for the fragment distinction that best represents each biological activity shows that the model employing fragments based on atoms, bonds and connections (A/B/C) provides the best description for anti-T. brucei activity (q2 = 0.637). The best models for T. cruzi and P. falciparum were obtained employing fragments based on atoms and connections (A/C) with cross-validated correlation coefficients (q2) equal to 0.721 and 0.703, respectively. Finally, best HQSAR models for L. donovani (q2 = 0.775) and cytotoxicity (q2 = 0.647) employed fragment distinction based on atoms, connections, chirality and H-bond donor/acceptor groups (A/C/Ch/DA). In general, these initial results indicate that both anti-Ldon activity and cytotoxicity could be influenced more strongly by H-bond interactions and stereoselectivity since the best Ldon and L6 models were the only ones constructed with Ch and DA flags in fragment distinction.
Table 1. The five best HQSAR models with a fragment size of 4 to 7 atoms.
Tbr HQSAR models
Tcr HQSAR models
Ldon HQSAR models
Pfc HQSAR models
L6 HQSAR models
Fdist: fragment distinction; HL: hologram length; PC: number of PLS principal components; SEV: standard error of validation; SEE: standard error of estimation.
After this step, the fragment distinction of the best models was fixed and then a variation of fragment size was employed in order to analyze the influence of this parameter on statistical results. For each model (Tbr, Tcr, Ldon, Pfc and L6 models), we tested the fragment sizes with: 1 to 4 atoms, 2 to 5 atoms, 3 to 6 atoms, 4 to 7 atoms (tested in first step), 5 to 8 atoms, 6 to 9 atoms, 7 to 10 atoms and 8 to 11 atoms. All results of this second step are shown in Table 2.
Table 2. HQSAR models with fragment size variations from 1–4 atoms to 8–11 atoms.
Tbr HQSAR models with Fdist = A/B/C
Fsize (atoms)   q2      SEV     r2      SEE     HL    PC
1 to 4          0.407   0.736   0.708   0.517   97    4
2 to 5          0.547   0.644   0.761   0.467   83    4
3 to 6          0.546   0.644   0.808   0.419   97    4
4 to 7          0.637   0.576   0.822   0.404   53    4
5 to 8          0.588   0.614   0.833   0.391   307   4
6 to 9          0.565   0.631   0.817   0.409   53    4
7 to 10         0.519   0.663   0.826   0.398   61    4
8 to 11         0.480   0.690   0.819   0.406   151   4

Tcr HQSAR models with Fdist = A/C
Fsize (atoms)   q2      SEV     r2      SEE     HL    PC
1 to 4          0.330   0.440   0.640   0.322   53    4
2 to 5          0.518   0.381   0.825   0.230   53    5
3 to 6          0.637   0.339   0.934   0.145   199   6
4 to 7          0.721   0.297   0.939   0.139   71    6
5 to 8          0.741   0.280   0.948   0.126   151   5
6 to 9          0.729   0.286   0.953   0.120   151   5
7 to 10         0.748   0.282   0.965   0.106   151   6
8 to 11         0.689   0.299   0.923   0.149   353   4

Ldon HQSAR models with Fdist = A/C/Ch/DA
Fsize (atoms)   q2      SEV     r2      SEE     HL    PC
1 to 4          0.542   0.398   0.881   0.203   151   6
2 to 5          0.618   0.364   0.904   0.182   83    6
3 to 6          0.706   0.319   0.966   0.108   151   6
4 to 7          0.775   0.279   0.972   0.098   83    6
5 to 8          0.770   0.282   0.982   0.080   199   6
6 to 9          0.747   0.296   0.976   0.091   199   6
7 to 10         0.733   0.296   0.968   0.103   307   5
8 to 11         0.716   0.305   0.965   0.106   257   5

Pfc HQSAR models with Fdist = A/C
Fsize (atoms)   q2      SEV     r2      SEE     HL    PC
1 to 4          0.458   0.333   0.809   0.197   59    5
2 to 5          0.615   0.280   0.875   0.160   61    5
3 to 6          0.730   0.242   0.944   0.110   71    6
4 to 7          0.703   0.254   0.950   0.104   83    6
5 to 8          0.706   0.253   0.961   0.092   307   6
6 to 9          0.683   0.262   0.965   0.087   353   6
7 to 10         0.736   0.232   0.960   0.090   97    5
8 to 11         0.732   0.241   0.982   0.063   353   6

L6 HQSAR models with Fdist = A/C/Ch/DA
Fsize (atoms)   q2      SEV     r2      SEE     HL    PC
1 to 4          0.204   0.504   0.713   0.303   61    4
2 to 5          0.465   0.422   0.856   0.219   71    5
3 to 6          0.568   0.371   0.843   0.224   307   4
4 to 7          0.647   0.343   0.893   0.189   61    5
5 to 8          0.673   0.337   0.952   0.129   71    6
6 to 9          0.619   0.363   0.966   0.109   59    6
7 to 10         0.549   0.379   0.889   0.188   151   4
8 to 11         0.583   0.373   0.948   0.131   151   5
Fdist: fragment distinction; Fsize: fragment size; HL: hologram length; PC: number of PLS principal components; SEV: standard error of validation; SEE: standard error of estimation.
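The fragment-size scan in Table 2 can be summarized by picking, for each endpoint, the size window with the highest leave-one-out q2. The sketch below uses the q2 values from Table 2; selecting on q2 alone is an assumption about the selection criterion, and the variable and function names are illustrative.

```python
# Leave-one-out q2 per fragment-size window, copied from Table 2.
SIZES = ["1-4", "2-5", "3-6", "4-7", "5-8", "6-9", "7-10", "8-11"]
Q2 = {
    "Tbr": [0.407, 0.547, 0.546, 0.637, 0.588, 0.565, 0.519, 0.480],
    "Tcr": [0.330, 0.518, 0.637, 0.721, 0.741, 0.729, 0.748, 0.689],
    "Ldon": [0.542, 0.618, 0.706, 0.775, 0.770, 0.747, 0.733, 0.716],
    "Pfc": [0.458, 0.615, 0.730, 0.703, 0.706, 0.683, 0.736, 0.732],
    "L6": [0.204, 0.465, 0.568, 0.647, 0.673, 0.619, 0.549, 0.583],
}


def best_fragment_size(q2_values):
    """Return the size window whose model has the highest q2."""
    best_index = max(range(len(q2_values)), key=lambda i: q2_values[i])
    return SIZES[best_index]


best = {endpoint: best_fragment_size(values) for endpoint, values in Q2.items()}
print(best)
```

Under this criterion the selected windows agree with the Fsize row reported in Table 3.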
After the analysis of the influence of fragment distinction and size, hologram length and number of PCs on the statistical parameters, we evaluated the quality of the constructed models by internal and external validations.
The robustness test (Figure 2) suggests that all constructed models have acceptable internal consistency, since all average q2 values for each number of cross-validation groups were higher than 0.6. To confirm that all models are fully validated, the r2test value was calculated for each model and the residuals of prediction were also considered in the external validation. Table 3 summarizes all parameters of the constructed HQSAR models as well as all statistical results of the internal and external validations. Figure 3 displays the experimental versus predicted pIC50 values of all HQSAR models.
It is important to note that compound 17 in the T. brucei model, compounds 19 and 34 in the L. donovani model, and compounds 24 and 26 in the cytotoxicity model are considered outliers due to their high values of both CV and external validation residuals (residuals > 1.50 log units).
Figure 2. Robustness test of the five constructed HQSAR models.
Table 3. Comparison of statistical results of all five constructed HQSAR models.
HQSAR Models    Tbr             Tcr             Ldon            Pfc             L6
Fsize           4–7 atoms       7–10 atoms      4–7 atoms       7–10 atoms      5–8 atoms
q2CV            0.623 ± 0.03    0.736 ± 0.03    0.753 ± 0.02    0.722 ± 0.02    0.656 ± 0.03
Fdist: fragment distinction; Fsize: fragment size; HL: hologram length; PC: number of PLS principal components; N: number of compounds of training set; SEV: standard error of validation; SEE: standard error of estimation.
Figure 3. Experimental versus predicted pIC50 values of training and test sets of all constructed HQSAR models.
The outlier compounds were removed from the respective data sets and the modelling was repeated without them, in order to avoid distortions in the models. Particular compounds may behave as outliers for many reasons [15]; speculating on each case in detail does not appear useful here. From the results of the external validations, we note that all constructed models have acceptable external validation correlation coefficients, and that the residuals of prediction for all test set compounds are lower than 1 logarithmic unit (Supplementary Table S6). All generated models, including the fragment distinction search and fragment size evaluation for the five sets of biological data, are available in Supplementary Tables S1–S5.
Therefore, both the LOO and CV internal validation methods as well as the external validation provide results which indicate that all constructed HQSAR models and their respective fragments information are suitable to explain the anti-protozoal and cytotoxic activities.
From the contribution maps of compound 2, one of the most potent compounds in each HQSAR model (Figure 4), it becomes clear that the 7-membered ring with one of the attached methyl groups is assigned a positive contribution to biological activity by each of the HQSAR models. Quite notably, the oxygen atom in the butyrolactone ring only shows a positive contribution to the cytotoxicity model, indicating that this atom (or the butyrolactone moiety) could be related to an important difference between anti-protozoal and toxic activities of the compounds in this data set. The lactone carbonyl oxygen atom contributes positively to Tbr and Tcr models.
Figure 4. HQSAR maps of positive contribution for all 5 constructed HQSAR models.
In the contribution maps of the five constructed models (Figure 5), the 6-membered ring contributes negatively to T. brucei, P. falciparum and cytotoxic activities. The butyrolactone moiety (except the oxygen atom of the carbonyl group) contributes negatively to anti-T. cruzi activity. The oxygen atom of the oxirane group contributes negatively to the L. donovani HQSAR model.
Against the background of previous QSAR analyses of this data set, it could be expected that all HQSAR models would be influenced by similar parameters and lead to similar contribution maps, since the pairwise correlation between the sets of biological activity values is quite high (higher than 69%, Supplementary Table S7) [8]. However, this is not the case, so the information provided by the contribution maps of the individual models could be useful to identify differences, especially between cytotoxicity and the anti-protozoal activities. Even though the differences between the models for the antiparasitic and cytotoxic activities may be subtle and difficult to interpret in detail due to the complexity of the applied descriptors, it is noteworthy that these differences exist and thus offer a possibility to rationalize the structural reasons for the selectivity of some compounds against the parasites.
Figure 5. HQSAR maps of negative contribution for all five constructed HQSAR models.
We calculated the maximum common structure (MCS) with the HQSAR module (Figure 6, MCS colored in cyan). This MCS comprises the butyrolactone moiety along with two carbon atoms of the attached ring system. The α-methylene group, although present in most compounds, is not part of the MCS since compounds 5, 6, 7 and 35 are 11,13-dihydro derivatives, i.e., they have a methyl group instead of the =CH2 group. In addition, compound 23 has a cyclic substituent at this position. Compounds 5, 6 and 7 are pseudoguaianolides bearing another α,β-unsaturated carbonyl system, i.e., a cyclopentenone moiety located on the opposite side of the molecule. Compounds 23 and 35 do not contain any α,β-unsaturated carbonyl system and both show very low activity against Tbr and also no significant cytotoxicity (pIC50 values equal to 3.79 and 4.31, respectively). Therefore, our HQSAR studies indicate that the α,β-unsaturated carbonyl system could be considered a common scaffold which is generally related to biological activity, while the fragments with positive and negative contributions (Figure 4 and Figure 5) could be related to the differences in pIC50 in each model.
Figure 6. Maximum common structure (cyan atoms of compound 01) of dataset calculated by all HQSAR models.
In order to perform an analysis of the anti-Tbr HQSAR model in terms of statistical influence of particular fragments on biological activity, we extracted the information about the fragments with highest positive and negative contributions to biological activity from this model (Table 4).
Table 4. List of fragments with highest positive and negative contribution to Tbr HQSAR model; X atoms are the connectivity flag and are not considered part of fragment.
Frag 01     Frag 02     Frag 03     Frag 04 *   Frag 05 *
[fragment structure images i009, i010, i001, i002, i003 not reproduced]
Frag 06     Frag 07     Frag 08     Frag 09 *   Frag 10 *
[fragment structure images i004, i005, i006, i007, i008 not reproduced]
* Fragments containing an explicit α,β-unsaturated carbonyl system with highest contribution values.
From the results obtained, it can be seen that two of the three fragments with the highest contribution to biological activity (fragments 01 and 03) have two sp2 carbon atoms bonded directly to each other, which would be characteristic of an α,β-unsaturated carbonyl system. Fragment 04 is the fragment with an explicit α,β-unsaturated carbonyl system that has the highest positive contribution to the model, and this fragment is exactly the substructure present in compounds 01–08, which have the highest anti-T. brucei activities. Fragment 05 is one example of a fragment encoding the butyrolactone moiety, indicating that this group also contributes positively to biological activity. There are also fragments containing an α,β-unsaturated carbonyl system that show a negative contribution to the model (fragments 09, 10) but, in general terms, their contributions to biological activity are smaller in magnitude than the positive ones, indicating that positive contributions have a higher statistical significance in this HQSAR model. From the fragments with negative contributions (fragments 07, 08), it can be seen that an epoxide group contributes negatively to anti-Tbr activity. As previously described, α,β-unsaturated carbonyl systems such as the α-methylene-γ-lactone and cyclopentenone moieties have a major influence on the biological activity of STLs, not only with respect to their antiprotozoal and cytotoxic activity [8,10,11,14,16,17,18].
In comparison to the descriptor-based QSAR models for T. brucei activity and cytotoxicity constructed by Schmidt et al. [8], the results obtained with HQSAR suggest similar physicochemical interpretations. The positive contribution of methylcycloheptane (as part of a pseudoguaianolide skeleton) to all models suggests a positive influence of this ring system on activity that may be due to steric or hydrophobic factors, since the cyclohexane system as present in the eudesmanolides showed a negative contribution to biological activity for both the Tbr and L6 models.
The two fragments with the highest contribution to the Tbr model represent alkene structures which are also hydrophobic groups. These results corroborate the positive contribution of hydrophobicity to anti-Tbr activity.
In summary, our HQSAR models showed once more that α,β-unsaturated groups are fundamental to the biological activity of STLs, in accordance with several previous works [7,8,9,10,11,12,13]. Furthermore, the methyl-cycloheptane ring as well as further hydrophobic groups appear to be responsible for higher levels of biological activity, indicating that the potency of the studied compounds could be related to cellular permeation mechanisms.
After the analyses of HQSAR maps and the influence of fragments for the most and least potent compounds, we also analyzed the HQSAR maps of the compounds with the highest selectivity indices (SI) for T. brucei (compounds 19, 24 and 32) and the lowest SI (compounds 26, 25 and 28). We generated these maps with the Tbr and L6 models in order to verify the influence of fragments on both biological activities as a strategy to study the selectivity. From the maps of compounds 24, 25, 26 and 28 we could not identify significant differences that could explain the selectivity or the lack of it (Supplementary Figure S1). The maps of compounds 19 and 32 are shown in Figure 7.
Figure 7. HQSAR maps of most selective compounds.
From Figure 7, we can note: (i) the contribution maps of compound 19, the most T. brucei-selective compound, indicate that the C-atoms of the α,β-unsaturated carbonyl system and the 7-membered ring contribute positively to the Tbr model but negatively to toxicity. Therefore, this compound could be considered a lead for the development of new chemical entities with antiprotozoal activity and low toxicity; (ii) the contribution map for compound 32 indicates that the 6-membered ring contributes positively to toxicity. This fragment is thus present in compounds with lower antiprotozoal activity and could also lead to increased toxicity; (iii) the O atom of the hydroxy group of the distal ring of compound 32 contributes positively to anti-T. brucei activity, indicating that compounds with an –OH group at this position could be tested due to the low influence of this fragment on toxicity.

3. Experimental Section

3.1. Data Set

The data set used for the HQSAR studies contains 40 sesquiterpene lactones with their antiprotozoal activity against Trypanosoma brucei rhodesiense (Tbr), Trypanosoma cruzi (Tcr), Leishmania donovani (Ldon) and Plasmodium falciparum (Pfc), as well as cytotoxicity against L6 rat skeletal myoblasts (Table 5) [8]. The biological activity data were reported as micromolar IC50 values which were converted to molar pIC50 (−logIC50) and used as dependent variables in the QSAR model development (Table 5). The chemical structures were drawn in the 2D format and converted to 3D, using the Sybyl X 2.0 package [19]. The studied compounds were divided into training and test sets containing 80% and 20%, respectively, of the total number of compounds of each dataset (i.e., the subset of compounds with a measured value for a given biological activity) in order to construct the HQSAR models and to perform external validations. The dataset split was performed in such a manner that the entire range of pIC50 values was covered by test set compounds, also taking into account the structural homogeneity of training and test sets. Thus, both training and test set compounds were inside the two-dimensional Y (biological activity) and X (fragment) spaces.
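The conversion from micromolar IC50 to molar pIC50 and an activity-ranked 80/20 split can be sketched as follows. This is a minimal illustration under stated assumptions: the rule of sending every fifth compound of the activity-sorted list to the test set is a stand-in for the manual selection described above, it ignores the structural-homogeneity criterion, and the function names are hypothetical.

```python
import math


def pic50_from_micromolar(ic50_um):
    """Convert an IC50 in micromolar to molar pIC50 = -log10(IC50 in mol/L)."""
    return 6.0 - math.log10(ic50_um)


def activity_ranked_split(pic50_values, every=5):
    """Split compound indices so the test set spans the activity range.

    Compounds are ranked by pIC50 and one compound out of each block of
    `every` ranks (the middle one) goes to the test set, giving roughly
    an 80/20 split for every=5. This rule is an illustrative assumption.
    """
    order = sorted(range(len(pic50_values)), key=lambda i: pic50_values[i])
    test = [idx for rank, idx in enumerate(order) if rank % every == every // 2]
    test_set = set(test)
    train = [idx for idx in order if idx not in test_set]
    return train, test


# Hypothetical activities, not taken from Table 5.
example = [4.0 + 0.3 * i for i in range(10)]
train_idx, test_idx = activity_ranked_split(example)
print(train_idx, test_idx)
```

Because the test compounds are taken from inside each block of ranked activities, their pIC50 values stay within the range covered by the training set.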
Table 5. Structures of dataset compounds and their pIC50 values. Selectivity indices (SI) are defined as SI = IC50(L6)/IC50(parasite) and are shown in parentheses.
Values are pIC50 (SI). Structure images are not reproduced; the image identifier is given where a new scaffold starts.

No.  Structure  Substituent(s)  Tbr            Tcr           Ldon           Pfc           L6
1    [i011]     H               7.284 (19.1)   6.158 (1.4)   n.a.           n.a.          6.003
2               ac              7.201 (12.9)   6.269 (1.5)   6.351 (1.8)    6.483 (2.5)   6.092
3               i-butyryl       6.979 (9.8)    5.805 (0.7)   6.077 (1.2)    6.155 (1.5)   5.987
4               i-valeryl       6.936 (11.2)   5.606 (0.5)   6.060 (1.5)    6.085 (1.6)   5.887
5    [i012]     H               6.164 (13.0)   4.668 (0.4)   5.415 (2.3)    5.516 (2.9)   5.051
6               Ac              5.849 (2.2)    5.159 (0.4)   n.a.           n.a.          5.515
7               i-valeryl       6.040 (5.0)    5.452 (1.3)   5.831 (3.1)    5.795 (2.9)   5.339
8    [i013]     -               6.496 (7.7)    5.728 (1.3)
9    [i014]     Ac, H, OH       5.033 (1.3)    4.339 (0.3)   n.a.           n.a.          4.911
10              Ac, H, H        5.174 (0.7)    4.690 (0.2)   4.912 (0.4)    5.190 (0.7)   5.357
11              H, H, H         4.736 (1.7)    4.278 (0.6)   4.686 (1.5)    4.916 (2.6)   4.496
12              H, tig, OH      4.961 (1.5)    4.308 (0.3)   4.746 (0.9)    4.975 (1.6)   4.778
13              H, H, Otig      4.716 (1.7)    4.221 (0.5)   4.838 (2.2)    4.561 (1.2)   4.498
14              Ac, ac, H       5.930 (7.2)    4.834 (0.6)   5.371 (2.0)    5.049 (0.9)   5.074
15              Ac, H, Oac      5.402 (1.8)    4.788 (0.4)   5.062 (0.8)    5.223 (1.2)   5.138
16   [i015]     -, -, -         4.866 (1.6)    4.348 (0.5)   4.927 (1.8)    4.972 (2.0)   4.666
17   [i016]                     4.207 (2.9)    n.a.          4.623 (7.5)    n.a.          3.748
18   [i017]                     5.535 (10.4)   4.581 (1.2)   4.753 (1.7)    5.107 (3.9)   4.520
19   [i018]                     6.481 (67.0)   4.948 (2.0)   6.223 (37.0)   5.186 (3.4)   4.655
20   [i019]                     4.799 (1.3)    4.409 (0.5)   4.836 (1.4)    4.896 (1.6)   4.702
Modified Xanthanolides
21   [i020]                     6.195 (14.8)   4.794 (0.6)   4.657 (0.4)    5.304 (1.9)   5.026
22   [i021]                     5.890 (10.3)   4.020 (0.1)   4.570 (0.5)    5.185 (2.0)   4.877
23   [i022]                     3.790          n.a.          4.812          n.a.          n.a.
24   [i023]                     5.086 (52.1)   n.a.          4.568 (15.8)   n.a.          3.369
25   [i024]     H               4.627 (0.2)    4.652 (0.2)   n.a.           n.a.          5.348
26              OH              5.108 (0.1)    4.606 (0.0)   n.a.           n.a.          6.023
27              Oac             4.970 (0.8)    4.614 (0.4)   4.930 (0.8)    5.036 (1.0)   5.050
28   [i025]     -               3.467 (0.5)    n.a.          n.a.           n.a.          3.728
29   [i026]     -               5.581 (2.3)    5.081 (0.7)   5.519 (2.0)    5.194 (0.9)   5.223
30   [i027]     -               5.795 (6.4)    4.763 (0.6)   5.061 (1.2)    4.971 (1.0)   4.992
31   [i028]     -               4.749 (1.9)    4.539 (1.1)   5.221 (5.5)    4.919 (2.8)   4.479
32   [i029]     -               5.956 (23.1)   4.582 (1.0)   5.134 (3.5)    4.866 (1.9)   4.592
33   [i030]     -               5.885 (7.8)    4.775 (0.6)   4.818 (0.7)    5.199 (1.6)   4.991
34   [i031]     -               6.411 (18.7)   4.972 (0.7)   5.449 (0.3)    4.925 (0.6)   5.140
35   [i032]     -               4.310 (2.7)    n.a.          4.449 (0.4)    n.a.          3.877
36   [i033]     H               4.845 (1.0)    4.592 (0.5)   5.564          5.025 (1.4)   4.866
37              ac              6.321 (16.2)   4.802 (0.5)   5.285          5.198 (1.2)   5.111
38   [i034]     H, H            6.026 (13.8)   4.783 (0.8)   4.740          5.055 (1.5)   4.887
39              ac, H           6.095 (12.6)   4.755 (0.6)   4.522          5.093 (1.3)   4.996
40              H, Oac          6.156 (9.5)    4.927 (0.6)   4.767          5.178 (1.0)   5.176
ac = acetyl; tig = tigloyl; n.a. = pIC50 not available.
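Because SI = IC50(L6)/IC50(parasite) and pIC50 = −log IC50, the selectivity index follows directly from the tabulated pIC50 values as SI = 10^(pIC50(parasite) − pIC50(L6)). A short sketch of this relation, with an illustrative function name, checked against the Tbr entries of compounds 19 and 24:

```python
def selectivity_index(pic50_parasite, pic50_l6):
    """SI = IC50(L6) / IC50(parasite), computed from pIC50 values."""
    return 10 ** (pic50_parasite - pic50_l6)


# Tbr and L6 pIC50 values of compounds 19 and 24 from Table 5.
si_19 = selectivity_index(6.481, 4.655)
si_24 = selectivity_index(5.086, 3.369)
print(round(si_19, 1), round(si_24, 1))
```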

3.2. Fragment-Based Strategy

The HQSAR technique was chosen as the fragment-based drug design strategy [20,21,22,23]. This technique has been successfully employed in drug design studies, giving good agreement with experimental data for several different compound datasets [24,25,26,27]. HQSAR decomposes each molecule in the dataset into linear, branched, and overlapping fragments, which are allocated to the bins of a fixed-length array (53 to 401 bins), the molecular hologram. The bin occupancies encode compositional and topological molecular information and are used as independent (X) variables in QSAR modeling. The hologram length, fragment size and fragment distinction (atoms (A), bonds (B), connections (C), hydrogen atoms (H), chirality (Ch), and H-bond donor/acceptor groups (DA)) are the parameters that affect hologram generation and consequently the statistical evaluation of the constructed HQSAR models. Initially, several models applying different combinations of fragment distinctions were generated using the default fragment size of 4–7 atoms over the 13 default series of hologram lengths. Next, the influence of fragment size was further investigated for the best model. All models in this study were generated using the Partial Least Squares (PLS) method. Each model was fully cross-validated by the Leave-One-Out (LOO) method.
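The idea of a molecular hologram can be illustrated with a toy implementation. The sketch below is not the Sybyl HQSAR module: it assumes a molecule given as a list of atom symbols plus a bond list, enumerates only linear fragments (no branched or cyclic ones), distinguishes fragments by atom type only, and uses `zlib.crc32` as a stand-in hash to assign fragments to bins. All function names are hypothetical.

```python
import zlib


def linear_fragments(atoms, bonds, min_size=4, max_size=7):
    """Enumerate linear atom paths with min_size..max_size atoms.

    atoms: list of element symbols; bonds: list of (i, j) index pairs.
    Each undirected path is reported once, as a canonical label string.
    """
    neighbours = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)

    fragments = []

    def extend(path):
        # Record the path once: only when the start index is below the end index.
        if len(path) >= min_size and path[0] < path[-1]:
            labels = [atoms[i] for i in path]
            fragments.append("-".join(min(labels, labels[::-1])))
        if len(path) < max_size:
            for nxt in neighbours[path[-1]]:
                if nxt not in path:
                    extend(path + [nxt])

    for start in range(len(atoms)):
        extend([start])
    return fragments


def hologram(fragments, length=53):
    """Count fragments per bin of a fixed-length array (the hologram)."""
    bins = [0] * length
    for fragment in fragments:
        bins[zlib.crc32(fragment.encode("ascii")) % length] += 1
    return bins


# Toy chain C-C-C-C-O (heavy atoms only), not a compound from the data set.
toy_atoms = ["C", "C", "C", "C", "O"]
toy_bonds = [(0, 1), (1, 2), (2, 3), (3, 4)]
toy_fragments = linear_fragments(toy_atoms, toy_bonds)
toy_hologram = hologram(toy_fragments, length=53)
print(sorted(toy_fragments), sum(toy_hologram))
```

Different fragments can fall into the same bin, which is why the hologram length is treated as a model parameter and scanned over several prime-number lengths.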

3.3. QSAR Model Validation

After obtaining an optimum HQSAR model for each biological activity, we carried out a robustness test and an external validation with a test set of compounds that were not considered during QSAR model development [28,29,30]. The robustness test was performed employing the cross-validation (CV) method with pre-determined groups of compounds (from 5 to 25 groups) to assess the internal capacity of biological activity prediction. All CV validations were carried out in triplicate, and the average q2 (coefficient of determination of the predicted vs. experimental values during cross-validation) and standard deviation for each number of CV groups were also calculated.
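The cross-validated q2 can be sketched as q2 = 1 − PRESS/SS, where PRESS is the sum of squared prediction errors for left-out compounds and SS is the sum of squared deviations of the observed values from their mean. The example below is an illustration under assumptions: a one-descriptor least-squares line stands in for the PLS model, groups are assigned deterministically by index rather than at random, and all names are hypothetical.

```python
from statistics import mean


def q2(observed, cv_predicted):
    """Cross-validated q2 = 1 - PRESS / SS_total."""
    y_bar = mean(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, cv_predicted))
    ss_total = sum((o - y_bar) ** 2 for o in observed)
    return 1.0 - press / ss_total


def fit_line(x, y):
    """Least-squares fit of y = a + b * x (stand-in for the PLS model)."""
    x_bar, y_bar = mean(x), mean(y)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    return y_bar - slope * x_bar, slope


def grouped_cv_predictions(x, y, n_groups):
    """Predict every compound from a model trained without its CV group."""
    predictions = [0.0] * len(y)
    for group in range(n_groups):
        train = [i for i in range(len(y)) if i % n_groups != group]
        intercept, slope = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in range(group, len(y), n_groups):
            predictions[i] = intercept + slope * x[i]
    return predictions


# Hypothetical, perfectly linear data: every CV prediction is exact.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_demo = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0]
q2_groups = q2(y_demo, grouped_cv_predictions(x_demo, y_demo, n_groups=3))
q2_loo = q2(y_demo, grouped_cv_predictions(x_demo, y_demo, n_groups=len(y_demo)))
print(q2_groups, q2_loo)
```

Setting `n_groups` to the number of compounds reproduces leave-one-out; repeating the grouped run with different random group assignments and averaging would correspond to the triplicate robustness test.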
Next, the models were submitted to an external validation to estimate the capacity of biological activity prediction for compounds that were not used in HQSAR model construction. Both the residuals of the predicted values and the external validation coefficient (r2test = coefficient of determination of predicted vs. experimental data of the test set) were analyzed in this step. As the pIC50 range of the five constructed models differs with the dependent variable employed (pIC50 values for four protozoa and cytotoxicity), the five test sets were selected according to the pIC50 distribution of each specific dataset in order to optimally cover the different ranges of biological activity in the dataset.
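A minimal sketch of the external validation step is given below. It assumes that r2test is computed as the squared Pearson correlation between experimental and predicted test-set pIC50 values (the exact formula is not stated in the text) and flags residuals above 1 logarithmic unit. The data and function names are hypothetical.

```python
def r2_test(observed, predicted):
    """Squared Pearson correlation of observed vs. predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


def large_residuals(observed, predicted, threshold=1.0):
    """Indices of compounds whose absolute residual exceeds the threshold."""
    return [
        i for i, (o, p) in enumerate(zip(observed, predicted)) if abs(o - p) > threshold
    ]


# Hypothetical test-set values, not taken from the article.
experimental = [4.5, 5.0, 5.5, 6.0, 6.5]
predicted = [4.7, 4.9, 5.6, 6.1, 6.3]
print(r2_test(experimental, predicted), large_residuals(experimental, predicted))
```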

4. Conclusions

All HQSAR models constructed in this study showed good internal consistency and external predictivity. The quality of all models with respect to internal and external predictiveness was evaluated by statistical parameters such as the leave-one-out cross-validation q2 (ranging from 0.637 to 0.775) and the quality of test set predictions r2test (ranging from 0.653 to 0.944), respectively. All of the obtained values were above those considered acceptable in the literature [31]. While it had not been possible so far to obtain reliable and statistically sound QSAR models for these STLs’ bioactivity against T. cruzi, L. donovani and P. falciparum with classical approaches, this task could now be achieved by using the HQSAR approach.
Apart from their explanatory value, these models can now be used for activity predictions with larger databases of hitherto untested STLs in order to select promising candidates for testing against the parasites under study. It should be mentioned that Schmidt et al. [14] recently reported a refined QSAR model for anti-Tbr activity, based on an extended data set of almost 70 compounds, which could successfully be used to predict very high in vitro activity for a group of hitherto untested STLs. By using a similar approach in further studies with the HQSAR models generated in the present work, we expect to find new and promising hits against T. cruzi, L. donovani and P. falciparum within the vast structural diversity of STLs.

Supplementary Materials

Supplementary materials can be accessed at:


The authors would like to thank the Brazilian agencies CAPES and FAPESP (Trossini, G.H.G. Project 11/11499-0 and 13/50677-7) for financial support.

Author Contributions

G. H. G. Trossini and T. J. Schmidt conceived and designed the computational experiments; V. G. Maltarollo and G. H. G. Trossini performed the computational experiments; G. H. G. Trossini, V. G. Maltarollo and T. J. Schmidt analyzed the data and wrote the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References


  1. Gilbert, I.H. Drug discovery for neglected diseases: Molecular target-based and phenotypic approaches. J. Med. Chem. 2013, 56, 7719–7726.
  2. Utzinger, J.; Becker, S.L.; Knopp, S.; Blum, J.; Neumayr, A.L.; Keiser, J.; Hatz, C.F. Neglected tropical diseases: Diagnosis, clinical management, treatment and control. Swiss Med. Wkly. 2012, 142, w13727.
  3. WHO. 10 Facts on Neglected Tropical Diseases Web Page. Available online: (accessed on 2 October 2014).
  4. World Health Organization. Working to Overcome the Global Impact of Neglected Tropical Diseases: First WHO Report on Neglected Tropical Diseases, 2010. Available online: (accessed on 3 April 2014).
  5. Biamonte, M.A.; Wanner, J.; Le Roch, K.G. Recent advances in malaria drug discovery. Bioorg. Med. Chem. Lett. 2013, 23, 2829–2843.
  6. Hoet, S.; Opperdoes, F.; Brun, R.; Quetin-Leclercq, J. Natural products active against African trypanosomes: A step towards new drugs. Nat. Prod. Rep. 2004, 21, 353–364.
  7. Newman, D.J.; Cragg, G.M.; Snader, K.M. Natural products as sources of new drugs over the period 1981−2002. J. Nat. Prod. 2003, 66, 1022–1037.
  8. Schmidt, T.; Nour, A.; Khalid, S.; Kaiser, M.; Brun, R. Quantitative structure-antiprotozoal activity relationships of sesquiterpene lactones. Molecules 2009, 14, 2062–2076.
  9. Langer, T.; Bryant, S.D. Chapter 10 - In silico screening: Hit finding from database mining. In The Practice of Medicinal Chemistry, 3rd ed.; Wermuth, C.G., Ed.; Academic Press: New York, NY, USA, 2008; pp. 210–227.
  10. Schmidt, T.J.; Heilmann, J. Quantitative structure-cytotoxicity relationships of sesquiterpene lactones derived from partial charge (q)-based fractional accessible surface area descriptors (q_frasas). Quant. Struct.-Act. Relat. 2002, 21, 276–287.
  11. Schmidt, T. Quantitative structure-cytotoxicity relationships within a series of helenanolide type sesquiterpene lactones. Pharm. Pharmacol. Lett. 1999, 9, 9–13.
  12. Scotti, M.T.; Fernandes, M.B.; Ferreira, M.J.; Emerenciano, V.P. Quantitative structure-activity relationship of sesquiterpene lactones with cytotoxic activity. Bioorg. Med. Chem. 2007, 15, 2927–2934.
  13. Wagner, S.; Hofmann, A.; Siedle, B.; Terfloth, L.; Merfort, I.; Gasteiger, J. Development of a structural model for NF-kappaB inhibition of sesquiterpene lactones using self-organizing neural networks. J. Med. Chem. 2006, 49, 2241–2252.
  14. Schmidt, T.J.; da Costa, F.B.; Lopes, N.P.; Kaiser, M.; Brun, R. In silico prediction and experimental evaluation of furanoheliangolide sesquiterpene lactones as potent agents against Trypanosoma brucei rhodesiense. Antimicrob. Agents Chemother. 2014, 58, 325–332.
  15. Cronin, M.T.D.; Schultz, T.W. Pitfalls in QSAR. Comp. Theor. Chem. 2003, 622, 39–51.
  16. Schmidt, T.J. Toxic activities of sesquiterpene lactones: Structural and biochemical aspects. Curr. Org. Chem. 1999, 3, 599–600.
  17. Schmidt, T.J. Structure-activity relationships of sesquiterpene lactones. Stud. Nat. Prod. Chem. 2006, 33, 309–392.
  18. Schomburg, C.; Schuehly, W.; da Costa, F.B.; Klempnauer, K.-H.; Schmidt, T.J. Natural sesquiterpene lactones as inhibitors of Myb-dependent gene expression: Structure–activity relationships. Eur. J. Med. Chem. 2013, 63, 313–320.
  19. Tripos. SYBYL-X 2.0; Tripos Inc.: St. Louis, MO, USA, 2010.
  20. Lowis, D.R. HQSAR: A new, highly predictive QSAR technique. Tripos Technical Notes 1997, 1, 1–10.
  21. Seel, M.; Turner, D.B.; Willett, P. Effect of parameter variations on the effectiveness of HQSAR analyses. Quant. Struct.-Act. Relat. 1999, 18, 245–252.
  22. Salum, L.B.; Andricopulo, A.D. Fragment-based QSAR: Perspectives in drug design. Mol. Divers. 2009, 13, 277–285.
  23. Salum, L.B.; Andricopulo, A.D. Fragment-based QSAR strategies in drug design. Expert Opin. Drug Discov. 2010, 5, 405–412.
  24. Garcia, T.S.; Honorio, K.M. Two-dimensional quantitative structure-activity relationship studies on bioactive ligands of peroxisome proliferator-activated receptor delta. J. Braz. Chem. Soc. 2011, 22, 65–72.
  25. Araujo, S.C.; Maltarollo, V.G.; Honorio, K.M. Computational studies of TGF-βRI (ALK-5) inhibitors: Analysis of the binding interactions between ligand–receptor using 2D and 3D techniques. Eur. J. Pharm. Sci. 2013, 49, 542–549.
  26. Guido, R.V.C.; Trossini, G.H.G.; Castilho, M.S.; Oliva, G.; Ferreira, E.I.; Andricopulo, A.D. Structure-activity relationships for a class of selective inhibitors of the major cysteine protease from Trypanosoma cruzi. J. Enzyme Inhib. Med. Chem. 2008, 23, 964–973.
  27. Moda, T.L.; Andricopulo, A.D. Consensus hologram QSAR modeling for the prediction of human intestinal absorption. Bioorg. Med. Chem. Lett. 2012, 22, 2889–2893.
  28. Ferreira, M.M.C. Multivariate QSAR. J. Braz. Chem. Soc. 2002, 13, 742–753.
  29. Gaudio, A.C.; Zandonade, E. Proposition, validation and analysis of QSAR models. Quím. Nova 2001, 24, 658–671.
  30. Gertrudes, J.C.; Maltarollo, V.G.; Silva, R.A.; Oliveira, P.R.; Honorio, K.M.; da Silva, A.B. Machine learning techniques and drug design. Curr. Med. Chem. 2012, 19, 4289–4297.
  31. Tropsha, A. Best practices for QSAR model development, validation, and exploitation. Mol. Inform. 2010, 29, 476–488.
  • Sample Availability: Samples of the compounds are available from T.J.S.
Molecules EISSN 1420-3049 Published by MDPI AG, Basel, Switzerland