Open Access Article
Int. J. Mol. Sci. 2016, 17(10), 1623; doi:10.3390/ijms17101623

Identification of Protein–Protein Interactions via a Novel Matrix-Based Sequence Representation Model with Amino Acid Contact Information

Y. Ding 1, J. Tang 1,2 and F. Guo 1,*
1 School of Computer Science and Technology, Tianjin University, Tianjin 300350, China
2 Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, USA
* Author to whom correspondence should be addressed.
Academic Editor: Christo Z. Christov
Received: 15 July 2016 / Revised: 7 September 2016 / Accepted: 7 September 2016 / Published: 24 September 2016
(This article belongs to the Collection Proteins and Protein-Ligand Interactions)

Abstract

Identification of protein–protein interactions (PPIs) is a difficult and important problem in biology. Since experimental methods for identifying PPIs are both expensive and time-consuming, many computational methods have been developed to predict PPIs and interaction networks, which can be used to complement experimental approaches. However, these methods have limitations: they require a large number of homologous proteins or a large body of literature to be applicable. In this paper, we propose a novel matrix-based protein sequence representation approach to predict PPIs, using an ensemble learning method for classification. We construct the Amino Acid Contact (AAC) matrix from a statistical analysis of residue-pairing frequencies in a database of 6323 protein–protein complexes. We first represent the protein sequence as a Substitution Matrix Representation (SMR) matrix. Then, the feature vector is extracted by applying the Histogram of Oriented Gradient (HOG) and Singular Value Decomposition (SVD) algorithms to the SMR matrix. Finally, we feed the feature vector into a Random Forest (RF) classifier to distinguish interacting pairs from non-interacting pairs. Our method is applied to several PPI datasets to evaluate its performance. On the S. cerevisiae dataset, our method achieves 94.83% accuracy and 92.40% sensitivity; compared with existing methods, its accuracy is increased by 0.11 percentage points. On the H. pylori dataset, our method achieves 89.06% accuracy and 88.15% sensitivity, and its accuracy is increased by 0.76%. On the Human PPI dataset, our method achieves 97.60% accuracy and 96.37% sensitivity, and its accuracy is increased by 1.30%. In addition, we test our method on a very important PPI network, and it achieves 92.71% accuracy. In the Wnt-related network, the accuracy of our method is increased by 16.67%.
The source code and all datasets are available at https://figshare.com/s/580c11dce13e63cb9a53.
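As a rough illustration of the matrix-based representation and the reported metrics, the sketch below maps a sequence to an N x 20 matrix by looking up one matrix row per residue, and computes accuracy and sensitivity from confusion counts. The row-lookup rule is an assumption about how an SMR-style matrix is built, and the contact values are a made-up placeholder rather than the AAC matrix derived in the paper; the names `toy_contact_matrix`, `sequence_to_matrix` and `accuracy_and_sensitivity` are hypothetical. The HOG, SVD and Random Forest steps are not shown.

```python
# Illustrative sketch only: the matrix values are a made-up placeholder,
# NOT the AAC matrix from the paper, and the row-lookup rule is an assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def toy_contact_matrix():
    """Return a symmetric 20 x 20 placeholder matrix (hypothetical values)."""
    n = len(AMINO_ACIDS)
    return [[1.0 / (1 + abs(i - j)) for j in range(n)] for i in range(n)]


def sequence_to_matrix(sequence, matrix):
    """Map a length-N sequence to an N x 20 matrix.

    Row i is a copy of the matrix row belonging to the i-th residue
    (assumed SMR-style construction). Non-standard residues raise KeyError.
    """
    return [list(matrix[INDEX[aa]]) for aa in sequence]


def accuracy_and_sensitivity(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / total; sensitivity = TP / (TP + FN)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    return accuracy, sensitivity


# A short hypothetical sequence of five residues.
smr = sequence_to_matrix("MKVLA", toy_contact_matrix())
print(len(smr), len(smr[0]))  # 5 20
```

Each protein yields a matrix whose height depends on its sequence length, which is presumably why fixed-length descriptors such as HOG and SVD features are extracted before classification.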
Keywords: protein–protein interactions; protein sequence; feature extraction; amino acid contact; substitution matrix representation

This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Ding, Y.; Tang, J.; Guo, F. Identification of Protein–Protein Interactions via a Novel Matrix-Based Sequence Representation Model with Amino Acid Contact Information. Int. J. Mol. Sci. 2016, 17, 1623.

Int. J. Mol. Sci. EISSN 1422-0067. Published by MDPI AG, Basel, Switzerland.