Open Access Article
Molecules 2016, 21(8), 983; doi:10.3390/molecules21080983

Bioactive Molecule Prediction Using Extreme Gradient Boosting

1 UTM Big Data Centre, Ibnu Sina Institute for Scientific and Industrial Research, Universiti Teknologi Malaysia, Skudai, Johor 81310, Malaysia
2 Information Systems Department, Faculty of Computing, Universiti Teknologi Malaysia, Skudai, Johor 81310, Malaysia
* Author to whom correspondence should be addressed.
Academic Editor: Leif A. Eriksson
Received: 1 May 2016 / Revised: 19 July 2016 / Accepted: 22 July 2016 / Published: 28 July 2016
(This article belongs to the Collection Molecular Docking)

Abstract

Following the explosive growth in chemical and biological data, the shift from traditional drug discovery methods to computer-aided ones has made data mining and machine learning integral parts of today’s drug discovery process. In this paper, extreme gradient boosting (Xgboost), an ensemble of Classification and Regression Trees (CART) and a variant of the Gradient Boosting Machine, was investigated for predicting biological activity from a quantitative description of a compound’s molecular structure. Seven datasets that are well known in the literature were used, and the experimental results show that Xgboost can outperform machine learning algorithms such as Random Forest (RF), Support Vector Machines (LSVM), Radial Basis Function Neural Network (RBFN) and Naïve Bayes (NB) in predicting biological activities. In addition to detecting minority activity classes in highly imbalanced datasets, it performed remarkably well on both high- and low-diversity datasets.
Keywords: biological data; drug discovery; virtual screening; prediction of biological activity
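
The boosting idea named in the abstract can be illustrated with a small sketch. It is not the authors' code and not the Xgboost library: it is a pure-Python toy that boosts depth-1 trees (stumps) on log loss, using the regularised Newton-step leaf weight w = -G / (H + lambda) and split gain that gradient tree boosting of this kind is built on. The binary "fingerprint" data, the activity rule (bits 2 and 5 both set) and all function names are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, grad, hess, lam=1.0):
    """Pick the binary feature whose split gives the largest regularised gain."""
    G, H = sum(grad), sum(hess)
    best = None
    for j in range(len(X[0])):
        # Gradient/Hessian sums over rows where bit j is set (right child).
        GR = sum(g for x, g in zip(X, grad) if x[j])
        HR = sum(h for x, h in zip(X, hess) if x[j])
        GL, HL = G - GR, H - HR
        gain = GL * GL / (HL + lam) + GR * GR / (HR + lam) - G * G / (H + lam)
        if best is None or gain > best[0]:
            # Leaf weights are Newton steps: w = -G / (H + lambda).
            best = (gain, j, -GL / (HL + lam), -GR / (HR + lam))
    return best[1:]


def fit_boosted_stumps(X, y, n_rounds=30, eta=0.3, lam=1.0):
    """Additively fit stumps to the gradient of the log loss."""
    scores = [0.0] * len(X)
    model = []
    for _ in range(n_rounds):
        p = [sigmoid(s) for s in scores]
        grad = [pi - yi for pi, yi in zip(p, y)]   # first derivative of log loss
        hess = [pi * (1.0 - pi) for pi in p]       # second derivative of log loss
        j, w_left, w_right = fit_stump(X, grad, hess, lam)
        w_left, w_right = eta * w_left, eta * w_right  # shrinkage
        model.append((j, w_left, w_right))
        scores = [s + (w_right if x[j] else w_left) for s, x in zip(scores, X)]
    return model


def predict_proba(model, x):
    return sigmoid(sum(wr if x[j] else wl for j, wl, wr in model))


def make_toy_data(n=300, n_bits=16, seed=0):
    """Hypothetical fingerprints: a compound is 'active' if bits 2 and 5 are set."""
    rng = random.Random(seed)
    X = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n)]
    y = [1 if x[2] and x[5] else 0 for x in X]
    return X, y


X, y = make_toy_data()
model = fit_boosted_stumps(X, y)
predictions = [1 if predict_proba(model, x) > 0.5 else 0 for x in X]
accuracy = sum(int(p == t) for p, t in zip(predictions, y)) / len(y)
print(f"training accuracy: {accuracy:.3f}")
```

A real study would use molecular descriptors or fingerprints computed from actual compounds, deeper trees, and held-out evaluation; the sketch only shows how each round fits a small tree to the current gradients and adds it to the ensemble.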

This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Cite This Article

MDPI and ACS Style

Babajide Mustapha, I.; Saeed, F. Bioactive Molecule Prediction Using Extreme Gradient Boosting. Molecules 2016, 21, 983.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Molecules, EISSN 1420-3049, published by MDPI AG, Basel, Switzerland.