Open Access Article

Residue Cluster Classes: A Unified Protein Representation for Efficient Structural and Functional Classification

1 C3 Consensus, Miguel Hidalgo, CDMX, Mexico City 11510, Mexico
2 Department of Biochemistry and Structural Biology, Instituto de Fisiología Celular, UNAM, Mexico City 04510, Mexico
* Author to whom correspondence should be addressed.
Entropy 2020, 22(4), 472; https://doi.org/10.3390/e22040472
Received: 1 March 2020 / Revised: 30 March 2020 / Accepted: 7 April 2020 / Published: 20 April 2020
(This article belongs to the Special Issue Statistical Inference from High Dimensional Data)
Proteins are characterized by their structures and functions, and these two fundamental aspects are assumed to be related. A single representation that captures both would make this relationship easier to model, yet so far the most effective models for protein structure classification and for protein function classification do not rely on the same representation. Here we provide a computationally efficient implementation, suitable for large datasets, that calculates residue cluster classes (RCCs) from protein three-dimensional structures, and we show that these representations enable a random forest algorithm to learn the structural and functional classifications of proteins effectively, according to the CATH and Gene Ontology criteria, respectively. RCCs are derived from residue contact maps built with different distance criteria, and we show that distance criteria of 7 or 8 Å, with or without amino acid side-chain atoms, yielded the best classification models. We discuss the potential uses of a unified protein representation and present possible areas for future improvement and exploration.
Keywords: residue cluster class; structural classification; functional classification
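
To make the contact-map step mentioned in the abstract more concrete, the following is a minimal sketch of how a residue contact map could be built from atom coordinates under a distance criterion. It is not the authors' implementation: the function name `contact_map`, the input layout (one list of atom coordinates per residue), and the rule that two residues are in contact when any pair of their atoms lies within the cut-off are all assumptions made for illustration. Including or excluding side-chain atoms corresponds here simply to which atoms are passed in for each residue.

```python
import math
from itertools import combinations


def contact_map(residues, cutoff=7.0):
    """Return a symmetric boolean matrix of residue contacts.

    residues: list with one entry per residue, each entry a list of
              (x, y, z) atom coordinates in angstroms (hypothetical layout).
    cutoff:   distance criterion in angstroms (the abstract reports 7 or 8).

    Assumption: two residues are in contact if any pair of their atoms
    is at most `cutoff` apart. The paper may use a different rule.
    """
    n = len(residues)
    contacts = [[False] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        in_contact = any(
            math.dist(a, b) <= cutoff
            for a in residues[i]
            for b in residues[j]
        )
        contacts[i][j] = contacts[j][i] = in_contact
    return contacts


# Toy example: three single-atom "residues" placed along the x axis.
toy = [[(0.0, 0.0, 0.0)], [(5.0, 0.0, 0.0)], [(20.0, 0.0, 0.0)]]
cm = contact_map(toy, cutoff=7.0)
# Residues 0 and 1 are 5 Å apart (contact); residue 2 is beyond 7 Å from both.
```

A pairwise loop like this is quadratic in the number of residues and atoms; for large datasets a spatial index or vectorized distance computation would likely be preferable, but the loop keeps the distance criterion explicit.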

MDPI and ACS Style

Fontove, F.; Del Rio, G. Residue Cluster Classes: A Unified Protein Representation for Efficient Structural and Functional Classification. Entropy 2020, 22, 472.
