Open Access Article
Molecules 2016, 21(1), 95; doi:10.3390/molecules21010095

iPPBS-Opt: A Sequence-Based Ensemble Classifier for Identifying Protein-Protein Binding Sites by Optimizing Imbalanced Training Datasets

1 Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen 333403, China
2 Gordon Life Science Institute, Boston, MA 02478, USA
3 Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, Jeddah 21589, Saudi Arabia
* Authors to whom correspondence should be addressed.
Academic Editor: Derek J. McPhee
Received: 18 November 2015 / Revised: 18 December 2015 / Accepted: 7 January 2016 / Published: 19 January 2016
(This article belongs to the Special Issue Drug Design and Discovery: Principles and Applications)

Abstract

Knowledge of protein-protein interactions and their binding sites is indispensable for an in-depth understanding of the networks in living cells. With the avalanche of protein sequences generated in the postgenomic age, it is critical to develop computational methods that identify protein-protein binding sites (PPBSs) in a timely fashion from sequence information alone, because the information obtained in this way can be used for both biomedical research and drug development. To address this challenge, we propose a new predictor, called iPPBS-Opt, in which we use: (1) the K-Nearest Neighbors Cleaning (KNNC) and Inserting Hypothetical Training Samples (IHTS) treatments to optimize the training dataset; (2) an ensemble voting approach to select the most relevant features; and (3) the stationary wavelet transform to formulate the statistical samples. Cross-validation tests targeting experiment-confirmed results demonstrate that the new predictor is very promising, implying that these practices are indeed effective. In particular, using wavelets to express protein/peptide sequences might be the key to grasping the problem’s essence, which is fully consistent with findings that many important biological functions of proteins can be elucidated through their low-frequency internal motions. For the convenience of most experimental scientists, we provide a step-by-step guide on how to use the predictor’s web server (http://www.jci-bioinfo.cn/iPPBS-Opt) to get the desired results without needing to go through the complicated mathematical equations involved.
Keywords: protein-protein binding sites; physicochemical property; stationary wavelet transform; PseAAC; optimize training dataset; KNNC; IHTS; target cross-validation
This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
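
The abstract names the KNNC and IHTS treatments without defining them, so the following is only a minimal sketch of how an imbalanced training set might be rebalanced along those lines. It assumes that KNNC discards majority-class (non-binding-site) samples that have a minority-class sample among their k nearest neighbours, and that IHTS adds hypothetical minority samples by interpolating between pairs of real ones. Both assumptions, the value of k, and every function name (`knn_clean`, `insert_hypothetical`, `balance`) are illustrative and not taken from the article.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_clean(majority, minority, k=3):
    """Assumed KNNC step: keep a majority sample only if none of its
    k nearest neighbours (over all other samples) is a minority sample."""
    labelled = [(p, 0) for p in majority] + [(p, 1) for p in minority]
    kept = []
    for i, point in enumerate(majority):
        # Distances to every other sample; index i is the point itself.
        others = [
            (euclidean(point, q), label)
            for j, (q, label) in enumerate(labelled)
            if j != i
        ]
        others.sort(key=lambda pair: pair[0])
        if all(label == 0 for _, label in others[:k]):
            kept.append(point)
    return kept


def insert_hypothetical(minority, target_size, rng):
    """Assumed IHTS step: grow the minority set to target_size by adding
    points on the segment between two randomly chosen real minority samples.
    Requires at least two minority samples."""
    synthetic = []
    while len(minority) + len(synthetic) < target_size:
        a, b = rng.sample(minority, 2)
        t = rng.random()
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return minority + synthetic


def balance(majority, minority, k=3, seed=0):
    """Clean the majority class, then grow the minority class to match it."""
    rng = random.Random(seed)
    cleaned = knn_clean(majority, minority, k)
    grown = insert_hypothetical(list(minority), len(cleaned), rng)
    return cleaned, grown


if __name__ == "__main__":
    # Toy 2-D feature vectors; real samples would be sequence-derived features.
    majority = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (5.0, 5.0)]
    minority = [(5.0, 5.2), (5.2, 5.0)]
    cleaned, grown = balance(majority, minority, k=2)
    print(len(cleaned), len(grown))
```

In this toy example the majority sample sitting next to the two minority samples is removed by the cleaning step, and the minority set is then padded with interpolated points until both classes have the same size. Interpolating only between real minority samples keeps the hypothetical samples inside the region the minority class already occupies.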

Cite This Article

MDPI and ACS Style

Jia, J.; Liu, Z.; Xiao, X.; Liu, B.; Chou, K.-C. iPPBS-Opt: A Sequence-Based Ensemble Classifier for Identifying Protein-Protein Binding Sites by Optimizing Imbalanced Training Datasets. Molecules 2016, 21, 95.

Molecules, EISSN 1420-3049, published by MDPI AG, Basel, Switzerland