Article

Predictive Capability of QSAR Models Based on the CompTox Zebrafish Embryo Assays: An Imbalanced Classification Problem

1 Know-Center, Inffeldgasse 13, 8010 Graz, Austria
2 Ruđer Bošković Institute, P.O. Box 180, 10002 Zagreb, Croatia
3 Department of Biology, Faculty of Science, University of Zagreb, Rooseveltov Trg 6, 10000 Zagreb, Croatia
4 Institute of Interactive Systems and Data Science, TU Graz, Inffeldgasse 16c, 8010 Graz, Austria
5 Department of Chemical Engineering, Pukyong National University, Busan 608-739, Korea
* Authors to whom correspondence should be addressed.
Academic Editor: Alla P. Toropova
Molecules 2021, 26(6), 1617; https://doi.org/10.3390/molecules26061617
Received: 5 February 2021 / Revised: 3 March 2021 / Accepted: 11 March 2021 / Published: 15 March 2021
(This article belongs to the Special Issue QSAR and QSPR: Recent Developments and Applications II)
The CompTox Chemistry Dashboard (ToxCast) contains one of the largest public databases on zebrafish (Danio rerio) developmental toxicity. The data consist of 19 toxicological endpoints measured on 1018 unique compounds in relatively low concentration ranges. The endpoints relate to developmental effects occurring in dechorionated zebrafish embryos over 120 hours post fertilization, monitored via gross malformations and mortality. We report the predictive capability of 209 quantitative structure–activity relationship (QSAR) models developed by machine learning methods, using penalization techniques and diverse model quality metrics to cope with the imbalanced endpoints. These QSAR models were generated to test how well the imbalanced classification (toxic or non-toxic) endpoints could be predicted, regardless of which of three algorithms is used: logistic regression, multi-layer perceptron, or random forests. Additionally, the QSAR toxicity models were developed starting from sets of classical molecular descriptors, structural fingerprints, and their combinations. Only 8 of the 209 models exceeded a Matthews correlation coefficient of 0.20, the value defined a priori as the threshold for acceptable model quality on the test sets. The best models were obtained for the endpoints mortality (MORT), ActivityScore, and JAW (deformation). The low predictability of the QSAR models developed from the zebrafish embryotoxicity data in the database is mainly due to the higher sensitivity of the 19 endpoint measurements carried out on dechorionated embryos at low concentrations.
Keywords: predictive QSAR; toxicity; ToxCast; zebrafish embryo; rdkit; structural descriptors; structural fingerprints; machine learning; imbalanced classification; aquatic toxicology
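
The abstract uses a Matthews correlation coefficient (MCC) of 0.20 as the acceptance threshold on the test sets. The following is a minimal sketch of why MCC is a reasonable choice for imbalanced toxic/non-toxic endpoints where plain accuracy is not. The confusion-matrix counts, the 5% toxic fraction, and the helper names are hypothetical illustrations and are not taken from the article; returning 0.0 when the denominator vanishes is a common convention, assumed here.

```python
import math

MCC_THRESHOLD = 0.20  # acceptance threshold quoted in the abstract


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (for example, when a model
    never predicts the positive class); this is an assumed convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified compounds."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical endpoint: 1000 compounds, 50 of them toxic (5%).
# "trivial" always predicts non-toxic; "modest" recovers 20 of the
# 50 toxic compounds at the cost of 60 false alarms.
trivial = dict(tp=0, tn=950, fp=0, fn=50)
modest = dict(tp=20, tn=890, fp=60, fn=30)

for name, counts in (("trivial", trivial), ("modest", modest)):
    mcc = matthews_corrcoef(**counts)
    acc = accuracy(**counts)
    verdict = "passes" if mcc > MCC_THRESHOLD else "fails"
    print(f"{name}: accuracy={acc:.3f}, MCC={mcc:.3f}, {verdict} threshold")
```

In this made-up example the trivial classifier has the higher accuracy but an MCC of zero, while the modest classifier clears the 0.20 threshold despite lower accuracy, which is the behaviour that makes MCC informative when the toxic class is rare.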

MDPI and ACS Style

Lovrić, M.; Malev, O.; Klobučar, G.; Kern, R.; Liu, J.J.; Lučić, B. Predictive Capability of QSAR Models Based on the CompTox Zebrafish Embryo Assays: An Imbalanced Classification Problem. Molecules 2021, 26, 1617. https://doi.org/10.3390/molecules26061617

AMA Style

Lovrić M, Malev O, Klobučar G, Kern R, Liu JJ, Lučić B. Predictive Capability of QSAR Models Based on the CompTox Zebrafish Embryo Assays: An Imbalanced Classification Problem. Molecules. 2021; 26(6):1617. https://doi.org/10.3390/molecules26061617

Chicago/Turabian Style

Lovrić, Mario, Olga Malev, Göran Klobučar, Roman Kern, Jay J. Liu, and Bono Lučić. 2021. "Predictive Capability of QSAR Models Based on the CompTox Zebrafish Embryo Assays: An Imbalanced Classification Problem" Molecules 26, no. 6: 1617. https://doi.org/10.3390/molecules26061617
