NP-Scout: Machine Learning Approach for the Quantification and Visualization of the Natural Product-Likeness of Small Molecules

1 Center for Bioinformatics (ZBH), Department of Informatics, Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, 20146 Hamburg, Germany
2 Department of Chemistry, University of Bergen, 5007 Bergen, Norway
3 Computational Biology Unit (CBU), Department of Informatics, University of Bergen, 5008 Bergen, Norway
* Author to whom correspondence should be addressed.
Biomolecules 2019, 9(2), 43; https://doi.org/10.3390/biom9020043
Received: 4 December 2018 / Revised: 21 January 2019 / Accepted: 21 January 2019 / Published: 24 January 2019

Abstract

Natural products (NPs) remain the most prolific resource for the development of small-molecule drugs. Here we report a new machine learning approach that identifies natural products with high accuracy. The method also generates similarity maps, which highlight the atoms that contribute significantly to the classification of a small molecule as a natural product or a synthetic molecule. The method can hence be used to (i) identify natural products in large molecular libraries, (ii) quantify the natural product-likeness of small molecules, and (iii) visualize atoms in small molecules that are characteristic of natural products or synthetic molecules. The models are random forest classifiers trained on data sets comprising between 265,000 and 322,000 natural products and synthetic molecules. Three molecular representations were explored: two-dimensional molecular descriptors, MACCS keys, and Morgan2 fingerprints. On an independent test set, the models reached areas under the receiver operating characteristic curve (AUC) of 0.997 and Matthews correlation coefficients (MCCs) of 0.954 or higher. The method was further tested on data from the Dictionary of Natural Products, ChEMBL, and other resources. The best-performing models are accessible as a free web service at http://npscout.zbh.uni-hamburg.de/npscout.
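The two performance measures quoted in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the toy labels (1 = natural product, 0 = synthetic molecule), the scores, and the 0.5 decision threshold are all assumptions made up for illustration.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """MCC for binary labels (assumed here: 1 = natural product, 0 = synthetic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented toy data: true classes and hypothetical classifier scores.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

mcc = matthews_corrcoef(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"MCC = {mcc:.3f}, AUC = {auc:.3f}")
```

AUC depends only on how the scores rank the two classes, whereas MCC depends on the hard class assignments and therefore on the chosen threshold, which is why the two values are reported together.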
Keywords: natural products; natural product-likeness; machine learning; random forest; classification; similarity maps; visualization; molecular fingerprints; web service
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

MDPI and ACS Style

Chen, Y.; Stork, C.; Hirte, S.; Kirchmair, J. NP-Scout: Machine Learning Approach for the Quantification and Visualization of the Natural Product-Likeness of Small Molecules. Biomolecules 2019, 9, 43.

Biomolecules EISSN 2218-273X, published by MDPI AG, Basel, Switzerland