Open Access Article

SXGBsite: Prediction of Protein–Ligand Binding Sites Using Sequence Information and Extreme Gradient Boosting

School of Electrical Engineering, Yanshan University, Qinhuangdao 066004, China
* Author to whom correspondence should be addressed.
Genes 2019, 10(12), 965; https://doi.org/10.3390/genes10120965 (registering DOI)
Received: 5 September 2019 / Revised: 19 October 2019 / Accepted: 19 November 2019 / Published: 22 November 2019
(This article belongs to the Section Technologies and Resources for Genetics)
The prediction of protein–ligand binding sites is important in drug discovery and drug design. Computational methods for predicting protein–ligand binding sites are inexpensive and fast compared with experimental methods. This paper proposes a new computational method, SXGBsite, which combines the synthetic minority over-sampling technique (SMOTE) with Extreme Gradient Boosting (XGBoost). SXGBsite uses the position-specific scoring matrix discrete cosine transform (PSSM-DCT) and predicted solvent accessibility (PSA) to extract features containing sequence information. A new balanced dataset was generated by SMOTE to improve classifier performance, and a prediction model was constructed using XGBoost. The parallel computing and regularization techniques of XGBoost enabled high-quality and fast predictions and mitigated the overfitting caused by SMOTE. An evaluation on independent test sets for 12 different types of ligand binding sites showed that SXGBsite performs similarly to existing methods on eight of the independent test sets, with a faster computation time. SXGBsite may be applied as a complement to biological experiments.
Keywords: protein–ligand binding site; SMOTE; Extreme Gradient Boosting; discrete cosine transform (DCT); discrete wavelet transform (DWT)
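
The two preprocessing ideas named in the abstract, DCT-based compression of PSSM features and SMOTE over-sampling of the minority (binding) class, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the per-column one-dimensional DCT, the number of retained coefficients (`keep`), and the neighbour count (`k`) are all assumptions chosen for illustration, and the XGBoost training step is omitted.

```python
import math
import random


def dct2(signal):
    """Unnormalised DCT-II of a 1-D sequence of numbers."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(signal))
        for k in range(n)
    ]


def pssm_dct_features(pssm_window, keep=3):
    """Compress a residue window of PSSM rows into a fixed-length vector.

    pssm_window: list of rows, one per residue, each with one score per
    amino-acid column. The DCT is applied down each column and only the
    first `keep` low-frequency coefficients are retained (illustrative
    choice, not taken from the article).
    """
    features = []
    for column in zip(*pssm_window):
        features.extend(dct2(list(column))[:keep])
    return features


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by SMOTE-style interpolation.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (m for m in minority if m is not base),
            key=lambda m: math.dist(base, m),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


if __name__ == "__main__":
    # Hypothetical 5-residue window with 20 PSSM columns of random scores.
    rng = random.Random(1)
    window = [[rng.uniform(-5, 5) for _ in range(20)] for _ in range(5)]
    print(len(pssm_dct_features(window)))  # 20 columns x 3 coefficients

    # Hypothetical minority-class feature vectors (binding residues).
    binding = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(smote(binding, n_new=3))
```

In a full pipeline, the balanced feature set would then be passed to a gradient boosting classifier such as `xgboost.XGBClassifier`; the hyperparameters the article used are not stated in the abstract and are not assumed here.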

MDPI and ACS Style

Zhao, Z.; Xu, Y.; Zhao, Y. SXGBsite: Prediction of Protein–Ligand Binding Sites Using Sequence Information and Extreme Gradient Boosting. Genes 2019, 10, 965.
