Open Access Article

QSAR Modeling on Benzo[c]phenanthridine Analogues as Topoisomerase I Inhibitors and Anti-cancer Agents

Department of Medicinal Chemistry, School of Pharmacy, University of Medicine and Pharmacy at Ho Chi Minh City, 41 Dinh Tien Hoang St., Dist. 1, Ho Chi Minh City, Vietnam
*
Author to whom correspondence should be addressed.
Molecules 2012, 17(5), 5690-5712; https://doi.org/10.3390/molecules17055690
Received: 5 April 2012 / Revised: 25 April 2012 / Accepted: 4 May 2012 / Published: 11 May 2012
(This article belongs to the Special Issue QSAR and Its Applications)

Abstract

Benzo[c]phenanthridine (BCP) derivatives were identified as topoisomerase I (TOP-I) targeting agents with pronounced antitumor activity. In this study, hologram-QSAR, 2D-QSAR and 3D-QSAR models were developed for BCPs on topoisomerase I inhibitory activity and cytotoxicity against seven tumor cell lines: RPMI8402, CPT-K5, P388, CPT45, KB3-1, KBV-1 and KBH5.0. The hologram, 2D and 3D-QSAR models were obtained with a squared correlation coefficient R2 = 0.58–0.77, a squared cross-validation coefficient q2 = 0.41–0.60 and a squared predictive correlation coefficient for the external set r2 = 0.51–0.80. Moreover, an assessment method based on a reliability test at a confidence level of 95% was used to validate the predictive power of the QSAR models and to guard against the over-fitting of classical QSAR models. Our QSAR models could be applied to design new BCP analogues with higher antitumor and topoisomerase I inhibitory activity.
Keywords: QSAR; topoisomerase; benzo[c]phenanthridine; cytotoxicity; model assessment; confidence level

1. Introduction

The topoisomerases (TOP) are enzymes involved in DNA replication, repair, transcription, recombination and segregation. DNA topoisomerase I (TOP-I) is considered one of the most effective targets for developing anti-cancer agents, not only because of its abnormally high intracellular levels, but also because of the restricted corrective mechanisms for stabilized TOP-I-DNA cleavage complexes [1,2]. Among the groups of compounds that inhibit TOP-I activity, the benzo[c]phenanthridine-like substances synthesized by LaVoie and colleagues (Figure 1) have shown significant cytotoxicity [3,4,5,6,7,8,9,10,11]. Although over 130 compounds have been synthesized, QSAR studies on this group are still rare and their application is limited [12,13].
In this study, a dataset of 137 benzo[c]phenanthridine (BCP) analogues with TOP-I inhibitory activity and antitumor activity against seven cell lines (RPMI8402, CPT-K5, P388, CPT45, KB3-1, KBV-1 and KBH5.0) was chosen for hologram-QSAR (H-QSAR), 2D-QSAR and 3D-QSAR studies, the latter with CoMFA and CoMSIA analyses. By combining three QSAR methods, we expect that the theoretical results can decrease the prediction error and offer useful information for designing and screening more promising antitumor compounds with less time and cost.

2. Results and Discussion

2.1. The Benzo[c]phenanthridines and Their Biological Activity Data

The compounds studied in this work were BCP derivatives with a core similar to that of the two alkaloids nitidine and fagaronine shown in Figure 1 [1,14]. The in vitro TOP-I inhibition data (REC, the relative effective concentration for TOP-I relative to topotecan) and IC50 values (the concentration of compound causing 50% cell growth inhibition against tumor cell lines) on RPMI8402, CPT-K5, P388, CPT45, KB3-1, KBV-1, KBH5.0, U937 and U937rs for 137 chemical structures related to BCPs were collected from the literature [3,4,5,6,7,8,9,10,11]. However, bioactivity data are not available on every cell line for every compound. The numbers of compounds and the available bioactivity data are listed in Table 1. The U937 and U937rs cell lines, which have only a limited number of cytotoxicity data, were not used for developing the QSAR models. REC and IC50 values were converted to their negative logarithms (pREC, pIC50) for use in the QSAR studies. Chemically, the dataset can be divided into six groups of general skeletons presented in Figure 2, and the number of compounds in each group is shown in Table 1. The numbers of compounds in the training and external test sets are presented in Table 2. The detailed chemical structures and bioactivities of the BCP dataset are presented in the Supporting Information.
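The conversion of REC and IC50 values to pREC and pIC50 can be sketched as follows. This is a minimal illustration and not the authors' script: the function name `to_p_value` and the example values are hypothetical, a base-10 logarithm is assumed, and the logarithm is taken of the numerical value in whatever concentration unit the source tables use.

```python
import math

def to_p_value(value):
    """Return the negative base-10 logarithm of a positive REC or IC50 value.

    Assumption: base 10, applied to the number as tabulated (unit unchanged).
    """
    if value <= 0:
        raise ValueError("REC and IC50 values must be positive")
    return -math.log10(value)

# Hypothetical values for illustration only (not taken from the dataset).
example_values = [0.01, 1.0, 100.0]
example_p_values = [to_p_value(v) for v in example_values]
```

With this convention a more potent compound (smaller IC50) has a larger pIC50, and values above 1 in the chosen unit give negative pIC50, which is consistent with the negative lower limits of the prediction ranges reported in the tables.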

2.2. Over-fitting Problem

A well-accepted QSAR model should be able to accurately predict the activity of a new compound that is not included in the training set. Over-fitting or over-estimation occurs when the predictive ability on the external set is poor; some papers use r2 for this assessment [15,16]. However, the squared correlation coefficient is not exact in all cases and cannot fully express the meaning of the model predictions. In this study, the 3D QSAR model of the RPMI8402 cell line and the 2D QSAR model of the CPT45 cell line are given as examples. Judged by r2 alone, the RPMI-fs45 model (R2 = 0.812 for the training set; r2 = 0.701 for the test set) would be assessed as a good model able to predict accurately, but in fact its predictive power is poor (Figure 3A): the external set of model RPMI-fs45 (red triangles) tends to fall outside the two limit lines at the 95% confidence level of the training set (blue circles). In contrast, the CPT45-2D model gave a reasonable result under the 95% confidence level assessment method, as shown in Figure 3B. Hence, a QSAR model with high r2 values for the training and test sets is not necessarily a good predictive model.

2.3. Model Assessment Method

For QSAR validation, several parameters such as R2, q2, the standard error of the training and test sets, Y-scrambling analyses and confidence interval estimators have been used to judge QSAR models [12,13,16,17,18,19,20]. A confidence level is the result of statistical estimation based on observations of a population. Because this estimated level can hardly reach 100%, statisticians often use 90%, 95% or 99% confidence intervals [15,18]. In classical QSAR studies, the 95% confidence interval is commonly used as the parameter for validating QSAR models. The QSAR model evaluation method based on the confidence level used in this study is presented below.
At a confidence level of 95%, the limits are calculated so that 95% of the training set lies in the area bounded by the upper and lower bounds shown in Figure 3. The two bounds are almost straight lines parallel to the baseline y = x. If the two bounds meet the horizontal axis at the points x1 = −x2 = δ (δ > 0), they can be approximated by the two lines y = x − δ and y = x + δ. The δ value represents the desired predictability of the model, which depends on the squared correlation coefficient R2 and on the standard error of the predicted results compared with the experimental values of the training set. If the model gives a predicted value β, then 95% of the corresponding experimental values are expected to lie in the range β ± δ.
The model predictions are confirmed as reliable when the external evaluation set, with coordinates (xi = predicted pIC50, yi = experimental pIC50), lies within the boundaries of the two lines. The type I error probability is 5%: if the external set is large enough, about 5% of the compounds will have predicted values lying outside the confidence interval.
The assessment step in this study is done as follows:
- Determine δ from the training set.
- Use an external set to assess the reliability of δ. The reliability of δ is evaluated by the proportion of external-set compounds whose coordinates fall within the confidence limits. This proportion is not required to be greater than or equal to the confidence level (95%), but the difference between the two numbers must not be too large. The assessment of δ is reported as positive (+), negative (−) or unknown (+/−).
- If r2 is also greater than 0.5, the model can be proposed to predict beyond the range of values evaluated.
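The first two assessment steps can be sketched in code. This is a simplified illustration under stated assumptions rather than the authors' procedure: δ is taken here as the empirical half-width that contains 95% of the training-set residuals around the line y = x, whereas the bounds in Figure 3 may have been derived from a statistical confidence-limit formula. The function names and the numbers are hypothetical, and no cut-off is given for the +, − or +/− verdict because the text does not specify one.

```python
import math

def delta_from_training(observed, predicted, level=0.95):
    """Smallest delta such that at least `level` of the training points
    satisfy |observed - predicted| <= delta (empirical assumption)."""
    residuals = sorted(abs(o - p) for o, p in zip(observed, predicted))
    if not residuals:
        raise ValueError("training set is empty")
    needed = math.ceil(level * len(residuals))  # points that must lie inside the band
    return residuals[needed - 1]

def coverage(observed, predicted, delta):
    """Fraction of compounds lying between y = x - delta and y = x + delta."""
    inside = sum(1 for o, p in zip(observed, predicted) if abs(o - p) <= delta)
    return inside / len(observed)

# Hypothetical numbers for illustration only.
train_obs = [0.0] * 10
train_pred = [0.1, -0.2, 0.3, -0.4, 0.5, -0.6, 0.7, -0.8, 0.9, -1.0]
delta = delta_from_training(train_obs, train_pred)
external_coverage = coverage([0.0] * 4, [0.5, -0.5, 1.5, 0.2], delta)
```

The external coverage is then compared with the 95% level; how far below 95% it may fall before the assessment becomes negative is left to the analyst, as in the text.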
In addition, several new metrics, $r_m^2$, $\overline{r_m^2}$ and $\Delta r_m^2$, proposed by Roy’s research group, were also calculated for both the training and test sets to validate our QSAR models [21,22,23]. These additional validation parameters were used to assess the predictive quality of the QSAR models. For a good QSAR model, the value of $\overline{r_m^2}$ should be greater than 0.5 and the value of $\Delta r_m^2$ should preferably be lower than 0.2 for both the training and test sets. The equations for calculating the $r_m^2$, $\overline{r_m^2}$ and $\Delta r_m^2$ metrics can be found in the Supporting Information.
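As a rough guide to how these metrics are computed, the sketch below follows the commonly cited form of the definitions from refs. [21,22,23]: r_m^2 = r^2 (1 − sqrt(|r^2 − r_0^2|)), where r^2 is the squared correlation between observed and predicted values and r_0^2 is the determination coefficient of the regression forced through the origin; the primed metric uses interchanged axes, and the mean and the absolute difference of the two give the remaining metrics. The exact equations used by the authors are in the Supporting Information, so the axis assignment, the absolute value under the square root and all function names here are assumptions.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def _r2(x, y):
    """Squared Pearson correlation between x and y."""
    mx, my = _mean(x), _mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def _r0_2(y, x):
    """Determination coefficient of the regression of y on x through the origin."""
    slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    my = _mean(y)
    ss_res = sum((b - slope * a) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot

def rm2_metrics(observed, predicted):
    """Return rm2, its axis-interchanged counterpart, their mean and difference."""
    r2 = _r2(observed, predicted)
    r0_2 = _r0_2(observed, predicted)      # observed regressed on predicted
    r0_2_rev = _r0_2(predicted, observed)  # axes interchanged
    rm2 = r2 * (1.0 - math.sqrt(abs(r2 - r0_2)))
    rm2_rev = r2 * (1.0 - math.sqrt(abs(r2 - r0_2_rev)))
    return {
        "rm2": rm2,
        "rm2_rev": rm2_rev,
        "rm2_mean": (rm2 + rm2_rev) / 2.0,
        "delta_rm2": abs(rm2 - rm2_rev),
    }
```

A model would then be flagged as acceptable when `rm2_mean` exceeds 0.5 and `delta_rm2` is below 0.2, following the thresholds quoted above.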

2.4. Hologram, 2D and 3D QSAR Modeling

In this study, eight 2D QSAR models, eight hologram QSAR models and thirteen 3D QSAR models were developed for TOP-I inhibitory activity and cytotoxicity on the RPMI8402, CPT-K5, P388, CPT45, KB3-1, KBV-1 and KBH5.0 tumor cell lines. The results are presented in Table 3 (2D), Table 4 (hologram) and Table 5 (3D), together with the assessment of the corresponding models at a confidence level of 95%. Based on the 95% confidence interval, the δ value, the assessment and the range of prediction were calculated for all models obtained. Several models have good R2 and q2 values for the training and test sets but show no predictive power for the external test set when the confidence intervals are applied (Table 5).
The models for the RPMI8402 and KB3-1 cell lines and for TOP-I inhibitory activity are consistent, and the results of all three QSAR modeling methods are reasonable. The hologram, 2D and 3D-QSAR models built on pREC (topoisomerase inhibitory activity) and on pIC50 for the RPMI8402 and KB3-1 cell lines showed not only significant statistical quality but also predictive ability, with a squared correlation coefficient R2 = 0.584–0.768, a squared cross-validation coefficient q2 = 0.406–0.594 and a squared predictive correlation coefficient for the external set r2 = 0.514–0.795. For the RPMI8402 and KB3-1 cell lines, the largest ranges of prediction are [−1:3] from the hologram model and [−0.5:2.2] from the 2D QSAR model, respectively. The best range of prediction for anti-topoisomerase I activity, [−2.5:1], is obtained from the 3D QSAR model. Based on the calculated $r_m^2$, $\overline{r_m^2}$ and $\Delta r_m^2$ metrics, several good QSAR models are highlighted in bold in Table 3, Table 4 and Table 5. Details of the 29 QSAR models are available in the Supporting Information.
The QSAR models for topoisomerase inhibitory activity and cytotoxicity on the RPMI8402 and KB3-1 cell lines were used for further investigation on an application set. Prediction on the application set, which contains 1214 new virtually designed compounds, yielded a short list of 94 compounds with better predicted antitumor activity. Several selected compounds with their predicted bioactivity values are listed in the Supporting Information. Analysis of the results from our QSAR models reveals general features of the relationship between chemical structure and antitumor activity of BCP derivatives, which are summarized in Figure 4 and described as follows:
(1) Steric interaction plays an important role in determining the bioactivities of the BCPs against many tumor cell lines, including cytotoxicity and TOP-I inhibitory ability. The dimethoxy substituents at positions 8 and 9 of the skeleton are necessary for the biological effects. The results show that a methoxy group at position 2 is essential for bioactivity, while one at position 3 is not. The substituents at positions 11 and 12 affect the activity and should be straight chains of at most 4–5 carbons with bulky end groups.
(2) Reducing the number of nitrogen atoms in the ring system and increasing the number of nitrogen atoms in the substituents can improve the bioactivity. Nitrogen at position 6 gave a better effect than nitrogen at position 5.
(3) The substituents at positions 11 and 12 can have a positive effect on cytotoxicity and TOP-I inhibitory activity. A substituent at position 12 has a stronger effect on bioactivity than one at position 11.
Previous studies indicated that topotecan, a synthetic derivative of camptothecin, is one of the most potent anticancer drugs in clinical use [24]. Topotecan, ethoxidine, fagaronine and BCP-related compounds showed selectivity for TOP-I over TOP-II. These novel compounds act as DNA intercalators and have two mechanisms: (i) TOP-I poison activity, like fagaronine; and (ii) TOP-I suppressor activity, like ethoxidine [24,25]. Our preliminary results from in silico modeling indicate that BCP compounds may inhibit TOP-I activity via the suppression mechanism. The important role of the natural functional groups related to biological activity found in this QSAR study is indicated in Figure 4. Hence, combining our QSAR models with other classification and predictive models for TOP-I inhibition and cytotoxicity and with molecular docking studies [12,25,26] could provide insight into the molecular basis of the antitumor and TOP-I inhibitory activity of BCP derivatives.

3. Materials and Methods

3.1. QSAR Study Process

The QSAR study process is summarized in Figure 5.

3.2. Preparation of the Data Sets

A total of 137 chemical structures of benzo[c]phenanthridine analogues were collected from the literature (Table 6). The structures of the compounds were first drawn in the Molecular Operating Environment (MOE) software, named and put into a global dataset [27]. The published data were copied directly and converted from *.pdf into a table in *.csv format. The data were imported into the programs used with the commands “import”, “read” and “merge”, keyed on the name of each substance to ensure precision and convenience. The drawn structures were rechecked using the SAR report in MOE and Ligand Prepare in SYBYL-X 1.1 [28].
Using the SAR report generated by MOE, the dataset was divided into six groups according to structural skeleton similarity. The weight descriptors were then calculated and the compounds were sorted in order of molecular weight. The training and test sets were generated by random division using the original variable descriptors along with the cytotoxicity and TOP-I inhibitory activity values. The data set was split randomly five times into 80% for the training set and 20% for the test set, and the results are presented in Table 2.
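The repeated random division can be illustrated with a short sketch. It is an assumption-laden example and not the workflow used in MOE: the function name, the seed and the compound identifiers are hypothetical, and the simple rounding used here does not reproduce the exact set sizes reported in Table 2.

```python
import random

def random_splits(compound_ids, n_repeats=5, train_fraction=0.8, seed=0):
    """Return `n_repeats` independent (training, test) partitions of the compounds."""
    rng = random.Random(seed)  # fixed seed so the splits can be regenerated
    n_train = round(train_fraction * len(compound_ids))
    splits = []
    for _ in range(n_repeats):
        shuffled = list(compound_ids)
        rng.shuffle(shuffled)
        splits.append((shuffled[:n_train], shuffled[n_train:]))
    return splits

# Hypothetical identifiers for illustration only.
compounds = [f"cpd_{i:03d}" for i in range(1, 11)]
splits = random_splits(compounds)
```

Each partition keeps the test compounds completely out of model building, so that they can serve as the external set in the confidence-level assessment.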

3.3. Hologram-QSAR

3.3.1. Calculated Fragment Descriptors

All suggested fragment descriptors were used. To save computational time, the limits on the number of atoms per fragment were first set to 4–7 and then changed step by step within the range 1 to 10 [28].

3.3.2. Hologram-QSAR Process

The structures standardized for 2D-QSAR were used to build the hologram-QSAR models in the SYBYL software. The input files must be in *.sdf format, as exported from MOE. The fragment atom limits were varied and partial least squares (PLS) regression was performed with the default number of principal components.

3.4. 2D-QSAR

3.4.1. Calculated 2D-descriptors

The structures of the compounds were first standardized with the “Depict2D” command, and then 184 2D descriptors were calculated in the MOE software [27].

3.4.2. 2D-QSAR Process

The data were transferred to the RapidMiner software for the multiple linear 2D-QSAR process [29]. The process has eight main steps, which are shown in Figure S1 in the Supporting Information:
1st step: Transfer data from MOE.
2nd step: Remove uninformative descriptors, i.e., those with a value of 0 for more than 20% of the compounds.
3rd step: Remove the descriptors with intercorrelation greater than 0.8.
4th step: Optimize descriptor selection by modified forward selection using the multiple linear regression (MLR) algorithm. The modifications include limiting the number of descriptors, keeping more than one best subset of descriptors and validating by leave-one-out (L.O.O.) cross-validation.
5th step: Build the model using MLR and use L.O.O. to validate the predictive ability.
6th step: Give the parameters on the training set.
7th step: Give the parameters on the external set.
8th step: Give the predictive results on application.
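The two descriptor-filtering steps (2nd and 3rd) can be sketched outside RapidMiner as follows. This is only an illustration under assumptions: the descriptor table is a plain dictionary with hypothetical names and values, Pearson correlation is assumed as the intercorrelation measure, and of two intercorrelated descriptors the one encountered first is kept, which is a choice the text does not specify.

```python
import math

def drop_sparse(table, max_zero_fraction=0.2):
    """2nd step: drop descriptors that are 0 for more than 20% of the compounds."""
    kept = {}
    for name, values in table.items():
        zero_fraction = sum(1 for v in values if v == 0) / len(values)
        if zero_fraction <= max_zero_fraction:
            kept[name] = values
    return kept

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def drop_intercorrelated(table, threshold=0.8):
    """3rd step: keep a descriptor only if |r| <= threshold with all kept so far."""
    kept = {}
    for name, values in table.items():
        if all(abs(pearson(values, other)) <= threshold for other in kept.values()):
            kept[name] = values
    return kept

# Hypothetical descriptor table (names and values are illustrative only).
descriptors = {
    "desc_a": [1, 2, 3, 4, 5],
    "desc_b": [2, 4, 6, 8, 10.5],   # nearly collinear with desc_a
    "desc_c": [0, 0, 0, 1, 0],      # zero for 80% of the compounds
    "desc_d": [5, 1, 4, 2, 3],
}
filtered = drop_intercorrelated(drop_sparse(descriptors))
```

The surviving descriptors would then enter the forward selection of the 4th step.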
Several subsets of chemical descriptors affecting the prediction performance for anticancer and TOP-I inhibitory activity were selected and are shown in the Supporting Information for each QSAR model. These molecular descriptors are described in detail in Table 7.

3.5. 3D-QSAR

3.5.1. Calculated 3D-descriptors

A stable conformation of the 3D structure is very important for developing reliable and reproducible 3D-QSAR models. The search for the lowest-energy 3D conformations was conducted in MOE with the MMFF94 force field and an RMS gradient of 0.0001 kcal.mol−1. The results were transferred to SYBYL and MMFF94 charges were assigned to all the molecules [28].
Structural alignment is considered one of the most sensitive parameters in CoMFA and CoMSIA analysis. The accuracy of the prediction and the reliability of the contour maps depend directly on the structural alignment rule. The compound BMC_05_6782_9b was used as a template for superimposition, and the common fragment for each group was determined by comparison with this compound’s core structure. The aligned compounds are shown in Figure 6.
Steric (fa) and electrostatic (fe) fields for CoMFA, and steric (s), electrostatic (e), hydrogen bond donor (d), hydrogen bond acceptor (a) and hydrophobic (h) fields for CoMSIA, were calculated with the default settings of SYBYL-X 1.1, using an sp3 carbon probe atom with a van der Waals radius of 0.152 nm and a +1 charge, and a grid spacing of 0.2 nm. The energy cutoff values were set to 30 kcal.mol−1.

3.5.2. 3D-QSAR Process

For each training set, 32 models were built with different 3D descriptors and column filter values varying from 1 to 5. PLS analysis was used to construct a linear correlation between the subset of descriptors and the bioactivities. To select the best model, L.O.O. cross-validation was performed to obtain the squared cross-validation coefficient (q2) and the optimum number of principal components. The q2 results were recorded in a combination matrix table, such as the result for KB3-1 in Table 8. The detailed q2 matrices of the 3D models for the different cell lines can be found in the Supporting Information.
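The selection from a q2 combination matrix can be sketched as below. The sketch assumes the usual definition q2 = 1 − PRESS/SS, where PRESS is the sum of squared leave-one-out prediction errors and SS is the sum of squared deviations of the observed values from their mean; the PLS fitting itself (done in SYBYL) is not reproduced, and the matrix entries, field labels and function names are hypothetical.

```python
def q2(observed, loo_predicted):
    """Cross-validated q2 = 1 - PRESS / SS from leave-one-out predictions."""
    mean_obs = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, loo_predicted))
    ss = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / ss

def best_model(q2_matrix):
    """Return (field, column_filter, q2) for the highest q2 in the matrix."""
    cells = (
        (field, column_filter, value)
        for field, row in q2_matrix.items()
        for column_filter, value in row.items()
    )
    return max(cells, key=lambda cell: cell[2])

# Hypothetical q2 matrix: field combination -> {column filter: q2}.
example_matrix = {
    "s": {1: 0.40, 2: 0.55},
    "e": {1: 0.50, 2: 0.30},
}
selected = best_model(example_matrix)
```

Choosing the cell with the highest q2 is one plausible reading of "best model"; in practice the number of principal components and the external-set assessment would also be weighed.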

4. Conclusions

In this study, hologram, 2D- and 3D-QSAR analyses were used to build models for predicting the anti-topoisomerase I activity and the cytotoxicity on seven tumor cell lines of 137 BCP analogues. The best models were obtained for pREC (topoisomerase inhibitory activity) and for pIC50 on the RPMI8402 and KB3-1 cell lines. In addition, a reliability test with a 95% confidence interval was applied as a parameter for validating the QSAR models on internal and external datasets and for guarding against the over-fitting problem of classical QSAR models. With its high accuracy and fast prediction for the BCPs, our QSAR model could be applied to design new BCP analogues with higher antitumor and topoisomerase I inhibitory activity.

Supplementary Materials

The cytotoxicity on seven tumor cell lines and the topoisomerase I inhibitory activity of 137 benzo[c]phenanthridine analogues (eight tables) and the details of the 29 QSAR models, including hologram, 2D- and 3D-QSAR, are provided. Several newly designed BCP compounds with activity predicted by the QSAR models and the equations of the $r_m^2$, $\overline{r_m^2}$ and $\Delta r_m^2$ metrics are also available in detail. The detailed information can be found at: https://www.mdpi.com/1420-3049/17/5/5690/s1.

Acknowledgments

This work was supported by Vietnam’s National Foundation for Science and Technology Development - NAFOSTED (Grant # 104.01.21.09 to Khac-Minh Thai).

Conflict of Interest

The authors declare no conflict of interest.

References and Notes

  1. Li, T.-K.; Houghton, P.J.; Desai, S.D.; Daroui, P.; Liu, A.A.; Hars, E.S.; Ruchelman, A.L.; LaVoie, E.J.; Liu, L.-F. Characterization of ARC-111 as a novel topoisomerase I-targeting anticancer drug. Cancer Res. 2003, 63, 8400–8407.
  2. Pommier, Y. DNA topoisomerase I inhibitors: Chemistry, biology, and interfacial inhibition. Chem. Rev. 2009, 109, 2894–2902.
  3. Li, D.; Zhao, B.; Sim, S.-P.; Li, T.-K.; Liu, A.; Liu, L.-F.; LaVoie, E.J. 2,3-Dimethoxy-benzo[i]phenanthridines: Topoisomerase I-targeting anticancer agents. Bioorg. Med. Chem. 2003, 11, 521–528.
  4. Li, D.; Zhao, B.; Sim, S.-P.; Li, T.-K.; Liu, A.; Liu, L.-F.; LaVoie, E.J. 8,9-methylene-dioxybenzo[i]phenanthridines: topoisomerase I-targeting activity and cytotoxicity. Bioorg. Med. Chem. 2003, 11, 3795–3805.
  5. Makhey, D.; Li, D.; Zhao, B.; Sim, S.-P.; Li, T.-K.; Liu, A.; Liu, L.-F.; LaVoie, E.J. Substituted benzo(i)phenanthridines as mammalian topoisomerase I-targeting anticancer agents. Bioorg. Med. Chem. 2003, 11, 1809–1820.
  6. Ruchelman, A.L.; Kerrigan, J.E.; Li, T.-K.; Zhou, N.; Liu, A.; Liu, L.-F.; LaVoie, E.J. Nitro and amino substitution within the A-ring of 5H-8,9-dimethoxy-5-(2-N,N-dimethylamino-ethyl)dibenzo(c,h)(1,6)naphthyridin-6-ones: influence on topoisomerase I-targeting activity and cytotoxicity. Bioorg. Med. Chem. 2004, 12, 3731–3742.
  7. Zhu, S.; Ruchelman, A.L.; Zhou, N.; Liu, A.; Liu, L.F.; LaVoie, E.J. Esters and amides of 2,3-dimethoxy-8,9-methylene-dioxybenzo(i)phenanthridine-12-carboxylic acid: potent topoisomerase I-targeting agents. Bioorg. Med. Chem. 2005, 13, 6782–6794.
  8. Zhu, S.; Ruchelman, A.L.; Zhou, N.; Liu, A.; Liu, L.-F.; LaVoie, E.J. 6-Substituted 6H-dibenzo(c,h)(2,6)naphthyridin-5-ones: reversed lactam analogues of ARC-111 with potent topoisomerase I-targeting activity and cytotoxicity. Bioorg. Med. Chem. 2006, 14, 3131–3143.
  9. Feng, W.; Satyanarayana, M.; Tsai, Y.-C.; Liu, A.A.; Liu, L.-F.; LaVoie, E.J. 11-Substituted 2,3-dimethoxy-8,9-methylenedioxybenzo[i]phenanthridine derivatives as novel topoisomerase I-targeting agents. Bioorg. Med. Chem. 2008, 16, 8598–8606.
  10. Sharma, L.; Tsai, Y.-C.; Liu, A.-A.; Liu, L.-F.; LaVoie, E.J. Cytotoxicity and TOP1-targeting activity of 8- and 9-amino derivatives of 5-butyl- and 5-(2-N,N-dimethylamino)ethyl-5H-dibenzo[c,h][1,6]naphthyridin-6-ones. Eur. J. Med. Chem. 2009, 44, 1471–1476.
  11. Feng, W.; Satyanarayana, M.; Tsai, Y.-C.; Liu, A.-A.; Liu, L.-F.; LaVoie, E.J. 12-Substituted 2,3-dimethoxy-8,9-methylenedioxybenzo[i]phenanthridines as novel topoisomerase I-targeting antitumor agents. Bioorg. Med. Chem. 2009, 17, 2877–2885.
  12. Liao, S.-Y.; Qian, L.; Lu, H.-L.; Shen, Y.; Zheng, K.-C. A Combined 2D- and 3D-QSAR Study on Analogues of ARC-111 with Antitumor Activity. QSAR Comb. Sci. 2008, 27, 740–749.
  13. Verma, R.P. Understanding topoisomerase I and II in terms of QSAR. Bioorg. Med. Chem. 2005, 13, 1059–1067.
  14. Bai, L.-P.; Zhao, Z.-Z.; Cai, Z.; Jiang, Z.-H. DNA-binding affinities and sequence selectivity of quaternary benzophenanthridine alkaloids sanguinarine, chelerythrine, and nitidine. Bioorg. Med. Chem. 2006, 14, 5439–5445.
  15. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer-Verlag: New York, NY, USA, 2008.
  16. Gramatica, P. Principles of QSAR models validation: internal and external. QSAR Comb. Sci. 2007, 26, 694–701.
  17. Kubinyi, H. Validation and predictivity of QSAR models. In QSAR & Molecular Modelling in Rational Design of Bioactive Molecules, Proceedings of the 15th European Symposium on QSAR & Molecular Modelling, Istanbul, Turkey, 2004; Sener, E.A., Yalcin, I., Eds.; CADDD Society: Ankara, Turkey, 2006; pp. 30–33.
  18. Dietrich, S.W.; Dreyer, N.D.; Hansch, C.; Bentley, D.L. Confidence interval estimators for parameters associated with quantitative structure-activity relationships. J. Med. Chem. 1980, 23, 1201–1205.
  19. Thai, K.-M.; Ecker, G.F. Classification models for hERG inhibitors by counter-propagation neural networks. Chem. Biol. Drug Des. 2008, 72, 279–289.
  20. Thai, K.-M.; Ecker, G.F. Similarity-based SIBAR descriptors for classification of chemically diverse hERG blockers. Mol. Divers. 2009, 13, 321–336.
  21. Roy, K.; Mitra, I.; Kar, S.; Ojha, P.K.; Das, R.N.; Kabir, H. Comparative Studies on Some Metrics for External Validation of QSPR Models. J. Chem. Inf. Model. 2012, 52, 396–408.
  22. Ojha, P.K.; Mitra, I.; Das, R.N.; Roy, K. Further exploring rm2 metrics for validation of QSPR models. Chemometr. Intell. Lab. 2011, 107, 194–205.
  23. Pratim Roy, P.; Paul, S.; Mitra, I.; Roy, K. On Two Novel Parameters for Validation of Predictive QSAR Models. Molecules 2009, 14, 1660–1701.
  24. Khadka, D.B.; Cho, W.J. 3-Arylisoquinolines as novel topoisomerase I inhibitors. Bioorg. Med. Chem. 2011, 19, 724–734.
  25. Clark, R.L.; Deane, F.M.; Anthony, N.G.; Johnston, B.F.; McCarthy, F.O.; Mackay, S.P. Exploring DNA topoisomerase I inhibition by the benzo[c]phenanthridines fagaronine and ethoxidine using steered molecular dynamics. Bioorg. Med. Chem. 2007, 15, 4741–4752.
  26. Thai, K.-M.; Nguyen, T.-Q.; Ngo, T.-D.; Tran, T.-D.; Huynh, T.-N.-P. A Support Vector Machine Classification Model for Benzo[c]phenathridine Analogues with Topoisomerase-I Inhibitory Activity. Molecules 2012, 17, 4560–4582.
  27. MOE 2007.02. Chemical Computing Group Inc.: Montreal, Canada. Available online: http://www.chemcomp.com (accessed on 26 December 2011).
  28. Sybyl 2007: Computational Informatics Software for Molecular Modelers. Tripos L.P.: St. Louis, MO, USA. Available online: http://tripos.com/ (accessed on 26 December 2011).
  29. Rapidminer Home Page. Available online: http://www.rapidminer.com (accessed on 5 November 2011).
Sample Availability: Not available.
Figure 1. Structure of nitidine and fagaronine.
Figure 2. General structural skeletons of the BCPs dataset.
Figure 3. The relationship between observed and predicted data from the QSAR models and their 95% confidence intervals for (A) the RPMI8402 cell line from 3D QSAR with steric analysis fields and (B) the CPT45 cell line from 2D QSAR. Compounds of the training set are shown as blue circles and those of the test set as red triangles.
Figure 4. The summary for BCPs structures—antitumor activity relationship.
Figure 5. Process of combined QSAR studies.
Figure 6. Structure of BMC_05_6782_9b and 3D alignment of 137 BCPs chemical structures.
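Table 1. Dataset and biological activities used in this study. Columns G1 to G6 give the number of compounds on each skeleton.
Bioactivity | Number of compounds | G1 | G2 | G3 | G4 | G5 | G6
RPMI | 133 | 65 | 9 | 13 | 6 | 10 | 30
CPTk5 | 101 | 50 | 8 | 12 | 6 | 8 | 17
P388 | 82 | 53 | 1 | 4 | 5 | 9 | 10
CPT45 | 73 | 45 | 0 | 4 | 5 | 9 | 10
U937 | 39 | 23 | 9 | 0 | 6 | 0 | 1
U937rs | 33 | 20 | 7 | 0 | 5 | 0 | 1
KB3-1 | 83 | 53 | 10 | 13 | 6 | 0 | 1
KBV-1 | 81 | 52 | 9 | 13 | 6 | 0 | 1
KBH | 60 | 46 | 0 | 13 | 0 | 0 | 1
TOP-I | 94 | 52 | 8 | 12 | 5 | 9 | 8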
Table 2. Dataset division.
Bioactivity | RPMI 8402 | CPT-K5 | P388 | CPT45 | KB3-1 | KBV-1 | KBH5.0 | TOP-I
Number of compounds | 133 | 101 | 82 | 73 | 83 | 81 | 60 | 94
Training set | 105 | 80 | 66 | 58 | 68 | 60 | 48 | 74
External test set | 28 | 21 | 16 | 15 | 15 | 21 | 12 | 20
Table 3. Results of 2D-QSAR.
Model | RPMI | CPTk5 | P388 | CPT45 | KB3-1 | KBV | KBH | TOP-I
Number of compounds in training set | 105 | 80 | 66 | 58 | 68 | 60 | 48 | 74
Number of compounds in external test set | 28 | 21 | 16 | 15 | 15 | 21 | 12 | 20
R2 (Training set) | 0.584 | 0.452 | 0.655 | 0.472 | 0.627 | 0.632 | 0.536 | 0.602
Standard Error (Training set) | 0.543 | 0.400 | 0.271 | 0.338 | 0.414 | 0.248 | 0.218 | 0.355
q2 (L.O.O.) | 0.511 | 0.302 | 0.417 | 0.230 | 0.537 | 0.474 | 0.394 | 0.475
Standard Error (L.O.O.) | 0.641 | 0.522 | 0.489 | 0.527 | 0.520 | 0.364 | 0.29 | 0.477
rt2 (External set) | 0.514 | 0.248 | 0.334 | 0.043 | 0.514 | 0.314 | 0.053 | 0.657
Standard Error (External set) | 0.803 | 0.434 | 0.799 | 0.943 | 0.665 | 0.858 | 0.78 | 0.417
p-value of model | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
Number of 2D molecular descriptors | 7 | 9 | 10 | 9 | 6 | 8 | 6 | 10
Greatest p-value of descriptors used | 0.039 | 0.018 | 0.014 | 0.000 | 0.026 | 0.002 | 0.026 | 0.008
Model assessment
δ | 1.45 | 1.30 | 1.10 | 1.20 | 1.25 | 1.00 | 1.00 | 1.25
Assessment | + | + | + | + | + | + | + | +
Range of prediction | [−1.5:2.2] | [−1.2:1.4] | [0:2.3] | [−0.5:1.25] | [−0.5:2.2] | [−1.2:1.5] | [0.5:1.7] | [−2:0.5]
$r_m^2$ | 0.584 | 0.452 | 0.655 | 0.472 | 0.627 | 0.632 | 0.536 | 0.602
$r'^{2}_m$ | 0.399 | 0.374 | 0.424 | 0.284 | 0.433 | 0.506 | 0.264 | 0.451
$\overline{r_m^2}$ | 0.491 | 0.413 | 0.540 | 0.378 | 0.530 | 0.569 | 0.400 | 0.527
$\Delta r_m^2$ | 0.186 | 0.078 | 0.231 | 0.188 | 0.194 | 0.126 | 0.272 | 0.151
$r_{m(test)}^2$ | 0.514 | 0.248 | 0.256 | 0.018 | 0.591 | 0.278 | 0.035 | 0.588
$r'^{2}_{m(test)}$ | 0.324 | 0.169 | 0.187 | 0.012 | 0.549 | 0.180 | −0.007 | 0.413
$\overline{r_{m(test)}^2}$ | 0.419 | 0.208 | 0.222 | 0.015 | 0.570 | 0.229 | 0.014 | 0.501
$\Delta r_{m(test)}^2$ | 0.190 | 0.079 | 0.068 | 0.006 | 0.042 | 0.099 | 0.042 | 0.175
Table 4. Results of Hologram-QSAR.
Model | RPMI | CPTk5 | P388 | CPT45 | KB3-1 | KBV | KBH | TOP-I
Number of compounds in training set | 105 | 80 | 66 | 58 | 68 | 60 | 48 | 74
Number of compounds in external test set | 28 | 21 | 16 | 15 | 15 | 21 | 12 | 20
R2 (Training set) | 0.765 | 0.573 | 0.439 | 0.483 | 0.622 | 0.501 | 0.624 | 0.616
Standard Error (Training set) | 0.489 | 0.428 | 0.675 | 0.586 | 0.658 | 0.418 | 0.439 | 0.594
q2 (L.O.O.) | 0.568 | 0.320 | 0.182 | 0.123 | 0.514 | 0.328 | 0.482 | 0.406
Standard Error (L.O.O.) | 0.777 | 0.733 | 0.828 | 0.777 | 0.757 | 0.697 | 0.516 | 0.754
rt2 (External set) | 0.525 | 0.285 | 0.302 | 0.010 | 0.541 | 0.708 | 0.439 | 0.690
Standard Error (External set) | 0.567 | 0.382 | 0.941 | 0.635 | 0.752 | 0.306 | 0.339 | 0.433
Hologram lengths | 151 | 53 | 353 | 199 | 61 | 71 | 59 | 307
Principal components | 6 | 5 | 6 | 3 | 3 | 3 | 3 | 4
Limitation of atoms in each fragment | 5–10 | 5–10 | 5–8 | 5–6 | 2–8 | 5–6 | 5–7 | 1–7
p-value | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
Model assessment
δ | 1.10 | 1.20 | 1.30 | 1.20 | 1.35 | 1.2 | 0.80 | 1.20
Assessment | + | + | + | + | + | + | + | +
Range of prediction | [−1:3] | [−1.5:0.2] | [0:1.2] | [0.2:1] | [0:2.5] | [−0.5:1.7] | [0.5:2] | [−2:0.5]
$r_m^2$ | 0.765 | 0.572 | 0.439 | 0.482 | 0.621 | 0.500 | 0.624 | 0.616
$r'^{2}_m$ | 0.633 | 0.504 | 0.132 | 0.297 | 0.425 | 0.348 | 0.387 | 0.466
$\overline{r_m^2}$ | 0.699 | 0.538 | 0.286 | 0.389 | 0.523 | 0.424 | 0.506 | 0.541
$\Delta r_m^2$ | 0.132 | 0.068 | 0.307 | 0.185 | 0.196 | 0.152 | 0.237 | 0.150
$r_{m(test)}^2$ | 0.507 | 0.276 | 0.289 | 0.006 | 0.503 | 0.645 | 0.367 | 0.569
$r'^{2}_{m(test)}$ | 0.362 | 0.159 | −0.033 | 0.005 | 0.342 | 0.493 | −0.054 | 0.378
$\overline{r_{m(test)}^2}$ | 0.435 | 0.217 | 0.128 | 0.005 | 0.422 | 0.569 | 0.154 | 0.473
$\Delta r_{m(test)}^2$ | 0.145 | 0.117 | 0.322 | 0.001 | 0.161 | 0.151 | 0.415 | 0.190
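Table 5. Results of 3D-QSAR.
Model | RPMI-fs43 | RPMI-fs45 | CPTk5-sh44 | P388-s15 | CPT45-s53 | KB3-s34 | KB3-e12 | KB3-h34 | KB3-eh32 | KBV-s34 | KBH-fs43 | TOP-I-s34 | TOP-I-h54
Number of compounds in training set | 105 | 105 | 80 | 66 | 58 | 68 | 68 | 68 | 68 | 60 | 48 | 74 | 74
Number of compounds in test set | 28 | 28 | 21 | 16 | 15 | 15 | 15 | 15 | 15 | 21 | 12 | 20 | 20
R2 (Training set) | 0.734 | 0.812 | 0.650 | 0.667 | 0.496 | 0.721 | 0.698 | 0.768 | 0.731 | 0.629 | 0.696 | 0.701 | 0.700
Standard Error (Training set) | 0.601 | 0.510 | 0.522 | 0.537 | 0.589 | 0.597 | 0.593 | 0.528 | 0.559 | 0.523 | 0.394 | 0.535 | 0.536
q2 (L.O.O.) | 0.594 | 0.607 | 0.338 | 0.309 | 0.203 | 0.584 | 0.552 | 0.539 | 0.582 | 0.372 | 0.330 | 0.423 | 0.345
Standard Error (L.O.O.) | 0.742 | 0.737 | 0.718 | 0.774 | 0.741 | 0.707 | 0.722 | 0.744 | 0.697 | 0.680 | 0.586 | 0.743 | 0.792
rt2 (External set) | 0.685 | 0.701 | 0.471 | 0.570 | 0.202 | 0.661 | 0.496 | 0.636 | 0.620 | 0.436 | 0.282 | 0.795 | 0.836
Standard Error (External set) | 0.742 | 0.724 | 0.545 | 0.739 | 0.570 | 0.647 | 0.789 | 0.670 | 0.686 | 0.857 | 0.796 | 0.518 | 0.338
3D-descriptor | Fs | fs | sh | S | S | S | e | h | eh | s | Fs | S | H
Column filter | 4 | 4 | 4 | 1 | 5 | 3 | 1 | 3 | 3 | 3 | 4 | 3 | 5
Principal component | 3 | 5 | 4 | 5 | 3 | 4 | 2 | 4 | 2 | 4 | 3 | 4 | 4
p-value | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
Model assessment
δ | 1.10 | 1.00 | 1.05 | 1.05 | 1.20 | 1.15 | 1.20 | 1.10 | 1.15 | 1.00 | 0.75 | 1.10 | 1.2
Assessment | + | − | + | + | + | + | + | + | + | +/− | + | + | +
Range of prediction | [−1.5:2.5] |  | [−1.5:1] | [−0.2:2] | [−0.2:1] | [−0.5:2] | [0:2] | [−0.2:2.2] | [−0.2:2] | [0:1.2] | [−0.5:2] | [−2.5:0.5] | [−2.5:1]
$r_m^2$ | 0.571 | 0.812 | 0.650 | 0.667 | 0.496 | 0.721 | 0.697 | 0.767 | 0.730 | 0.628 | 0.696 | 0.701 | 0.700
$r'^{2}_m$ | 0.530 | 0.703 | 0.592 | 0.442 | 0.312 | 0.565 | 0.531 | 0.634 | 0.579 | 0.501 | 0.494 | 0.578 | 0.577
$\overline{r_m^2}$ | 0.550 | 0.758 | 0.621 | 0.554 | 0.404 | 0.643 | 0.614 | 0.701 | 0.654 | 0.564 | 0.595 | 0.639 | 0.638
$\Delta r_m^2$ | 0.041 | 0.109 | 0.058 | 0.225 | 0.183 | 0.156 | 0.166 | 0.133 | 0.151 | 0.127 | 0.202 | 0.123 | 0.123
$r_{m(test)}^2$ | 0.663 | 0.324 | 0.429 | 0.569 | 0.188 | 0.585 | 0.461 | 0.636 | 0.591 | 0.332 | 0.282 | 0.604 | 0.523
$r'^{2}_{m(test)}$ | 0.568 | −0.451 | 0.294 | 0.332 | 0.063 | 0.546 | 0.162 | 0.409 | 0.349 | −0.014 | −0.009 | 0.431 | 0.318
$\overline{r_{m(test)}^2}$ | 0.616 | 0.064 | 0.362 | 0.450 | 0.126 | 0.565 | 0.312 | 0.522 | 0.470 | 0.159 | 0.137 | 0.517 | 0.421
$\Delta r_{m(test)}^2$ | 0.095 | 0.775 | 0.135 | 0.237 | 0.125 | 0.039 | 0.299 | 0.278 | 0.242 | 0.346 | 0.291 | 0.173 | 0.206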
Table 6. Chemical structure of 137 benzo[c]phenanthridine analogues.
[Structure images: Molecules 17 05690 i001 to i007]
Table 7. Description of the 35 molecular descriptors used to create the 2D QSAR models.
No | Molecular descriptor | Description
1 | a_acc | Number of hydrogen bond acceptor atoms.
2 | a_aro | Number of aromatic atoms.
3 | a_ICM | Atom information content (mean). This is the entropy of the element distribution in the molecule (including implicit hydrogens but not lone pair pseudo-atoms).
4 | a_nN | Number of nitrogen atoms: #{Zi | Zi = 7}.
5 | a_nO | Number of oxygen atoms: #{Zi | Zi = 8}.
6 | b_1rotR | Fraction of rotatable single bonds: b_1rotN divided by b_heavy.
7 | BCUT_PEOE_0 | The BCUT descriptors [Pearlman 1998] are calculated from the eigenvalues of a modified adjacency matrix.
8 | BCUT_PEOE_1 | (same description as BCUT_PEOE_0)
9 | BCUT_PEOE_2 | (same description as BCUT_PEOE_0)
10 | chi1v_C | Carbon valence connectivity index (order 1).
11 | density | Molecular mass density: Weight divided by vdw_vol (amu/Å³).
12 | diameter | Largest value in the distance matrix.
13 | GCUT_PEOE_1 | The GCUT descriptors are calculated from the eigenvalues of a modified graph distance adjacency matrix.
14 | GCUT_SLOGP_0 | The GCUT descriptors using atomic contribution to logP instead of partial charge.
15 | GCUT_SLOGP_1 | (same description as GCUT_SLOGP_0)
16 | GCUT_SMR_0 | The GCUT descriptors using atomic contribution to molar refractivity instead of partial charge.
17 | opr_leadlike | Atom Counts and Bond Counts: One if and only if opr_violation < 2, otherwise zero.
18 | PEOE_VSA_FHYD | Fractional hydrophobic van der Waals surface area.
19 | PEOE_VSA_FNEG | Fractional negative van der Waals surface area.
20 | PEOE_VSA_NEG | Total negative van der Waals surface area.
21 | PEOE_VSA+0 | PEOE: Sum of vi where qi is in the range [0.00, 0.05).
22 | PEOE_VSA+1 | PEOE: Sum of vi where qi is in the range [0.05, 0.10).
23 | PEOE_VSA+2 | PEOE: Sum of vi where qi is in the range [0.10, 0.15).
24 | PEOE_VSA+3 | PEOE: Sum of vi where qi is in the range [0.15, 0.20).
25 | PEOE_VSA-0 | PEOE: Sum of vi where qi is in the range [−0.05, 0.00).
26 | PEOE_VSA-1 | PEOE: Sum of vi where qi is in the range [−0.10, −0.05).
27 | petitjean | Value of (diameter − radius) / diameter.
28 | SlogP | Log of the octanol/water partition coefficient (including implicit hydrogens).
29 | SlogP_VSA1 | Subdivided Surface Areas: Sum of vi such that Li is in (−0.4, −0.2].
30 | SlogP_VSA5 | Subdivided Surface Areas: Sum of vi such that Li is in (0.15, 0.20].
31 | SlogP_VSA9 | Subdivided Surface Areas: Sum of vi such that Li > 0.40.
32 | VDistMa | Adjacency and Distance Matrix Descriptors: If m is the sum of the distance matrix entries, then VDistMa is defined to be the sum of log2 m − Dij log2 Dij / m over all i and j.
33 | vsa_acc | Approximation to the sum of VDW surface areas (Å²) of pure hydrogen bond acceptors.
34 | vsa_other | Approximation to the sum of VDW surface areas (Å²) of atoms typed as “other”.
35 | vsa_pol | Approximation to the sum of VDW surface areas (Å²) of polar atoms (atoms that are both hydrogen bond donors and acceptors), such as -OH.
Table 8. Cross-validation results of 3D-QSAR with KB3-1 cells.
3D descriptor field | $q^2$ for each column filter value
 | 1 | 2 | 3 | 4 | 5
S | 0.583 | 0.583 | 0.584 | 0.579 | 0.575
E | 0.552 | 0.551 | 0.553 | 0.540 | 0.542
H | 0.542 | 0.541 | 0.539 | 0.509 | 0.485
D | 0.014 | 0.013 | 0.004 | 0.000 | 0.000
A | 0.414 | 0.415 | 0.417 | 0.423 | 0.429
s.e | 0.580 | 0.580 | 0.580 | 0.576 | 0.576
s.h | 0.551 | 0.555 | 0.555 | 0.545 | 0.536
s.d | 0.448 | 0.446 | 0.473 | 0.485 | 0.502
s.a | 0.498 | 0.507 | 0.501 | 0.503 | 0.505
e.h | 0.582 | 0.581 | 0.582 | 0.571 | 0.560
e.d | 0.560 | 0.562 | 0.568 | 0.545 | 0.534
e.a | 0.561 | 0.558 | 0.552 | 0.550 | 0.547
h.d | 0.437 | 0.438 | 0.438 | 0.411 | 0.406
h.a | 0.466 | 0.465 | 0.465 | 0.463 | 0.467
d.a | 0.409 | 0.412 | 0.417 | 0.423 | 0.427
s.e.h | 0.585 | 0.581 | 0.584 | 0.579 | 0.570
s.e.d | 0.578 | 0.578 | 0.586 | 0.584 | 0.575
s.e.a | 0.563 | 0.564 | 0.562 | 0.561 | 0.558
s.h.d | 0.514 | 0.527 | 0.533 | 0.528 | 0.516
s.h.a | 0.511 | 0.515 | 0.511 | 0.510 | 0.511
s.d.a | 0.499 | 0.507 | 0.514 | 0.530 | 0.509
e.h.d | 0.563 | 0.565 | 0.573 | 0.560 | 0.552
e.h.a | 0.560 | 0.557 | 0.551 | 0.546 | 0.540
e.d.a | 0.526 | 0.528 | 0.533 | 0.531 | 0.520
h.d.a | 0.444 | 0.446 | 0.445 | 0.443 | 0.449
s.e.h.d | 0.572 | 0.574 | 0.579 | 0.571 | 0.562
s.e.h.a | 0.569 | 0.569 | 0.565 | 0.561 | 0.555
s.e.d.a | 0.548 | 0.551 | 0.555 | 0.558 | 0.544
s.h.d.a | 0.493 | 0.507 | 0.515 | 0.518 | 0.501
e.h.d.a | 0.528 | 0.529 | 0.530 | 0.523 | 0.513
s.e.h.d.a | 0.545 | 0.547 | 0.543 | 0.540 | 0.537
CoMFA | 0.549 | 0.549 | 0.546 | 0.547 | 0.551
s: steric; e: electrostatic; h: hydrophobic; d: H-bond donor; a: H-bond acceptor.
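Table 8 can be read as a small grid search over descriptor fields and column filter values. The sketch below is only an illustration of that reading: it takes a subset of rows copied from Table 8 and, for each field, reports the column filter value with the highest q². The helper name `best_filter` is hypothetical, ties are resolved in favour of the lower filter value, and ranking by q² alone is an assumption that need not match the authors' model selection procedure.

```python
# q2 values copied from Table 8 (subset of rows); list positions map to
# column filter values 1..5.
q2 = {
    "S":     [0.583, 0.583, 0.584, 0.579, 0.575],
    "E":     [0.552, 0.551, 0.553, 0.540, 0.542],
    "H":     [0.542, 0.541, 0.539, 0.509, 0.485],
    "e.h":   [0.582, 0.581, 0.582, 0.571, 0.560],
    "s.e.d": [0.578, 0.578, 0.586, 0.584, 0.575],
    "CoMFA": [0.549, 0.549, 0.546, 0.547, 0.551],
}


def best_filter(values):
    """Return (column filter value, q2) for the highest q2 in one row."""
    best_index = max(range(len(values)), key=lambda i: values[i])
    return best_index + 1, values[best_index]


# Best column filter per field, then the field with the highest q2 overall.
best = {field: best_filter(values) for field, values in q2.items()}
top_field = max(best, key=lambda field: best[field][1])

for field, (column_filter, value) in best.items():
    print(f"{field}: column filter {column_filter}, q2 = {value:.3f}")
print("highest q2 in this subset:", top_field)
```

The differences between neighbouring column filter values are mostly in the third decimal place, so a ranking by q² alone is fragile, and external-set statistics such as those in Table 5 remain the more informative check.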