Open Access Article

Automatic Classification of Protein Structure Using the Maximum Contact Map Overlap Metric

1 INRIA Rennes-Bretagne Atlantique and University of Rennes 1, Campus de Beaulieu, 35042 Rennes Cedex, France
2 Los Alamos National Laboratory, Los Alamos, NM 87544, USA
3 Life Sciences, CWI, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands
4 Genome Informatics, University of Duisburg-Essen, 45147 Essen, Germany
5 Platform for Genome Analytics, Institutes of Neurogenetics & for Integrative and Experimental Genomics, University of Lübeck, 23562 Lübeck, Germany
* Author to whom correspondence should be addressed.
This paper is an extended version of our paper published in Algorithms for Computational Biology: Wohlers, I.; Le Boudic-Jamin, M.; Djidjev, H.; Klau, G.W.; Andonov, R. Exact Protein Structure Classification Using the Maximum Contact Map Overlap Metric. In Proceedings of the First International Conference, AlCoB 2014, Tarragona, Spain, 1–3 July 2014; pp. 262–273.
Academic Editor: Giuseppe Lancia
Algorithms 2015, 8(4), 850-869; https://doi.org/10.3390/a8040850
Received: 27 June 2015 / Revised: 31 August 2015 / Accepted: 16 September 2015 / Published: 9 October 2015
(This article belongs to the Special Issue Algorithmic Themes in Bioinformatics)
In this work, we propose a new distance measure for comparing two protein structures based on their contact map representations. We show that this measure, which we refer to as the maximum contact map overlap (max-CMO) metric, satisfies all properties of a metric on the space of protein representations. Having a metric on that space makes it possible to avoid pairwise comparisons against the entire database and thus to explore the protein space significantly faster than in non-metric spaces. On a gold standard superfamily classification benchmark set of 6759 proteins, our exact k-nearest neighbor (k-NN) scheme classifies up to 224 out of 236 queries correctly; on a larger, extended version of the benchmark with 60,850 additional structures, it classifies up to 1361 out of 1369 queries correctly. Our k-NN classification thus provides a promising approach for the automatic classification of protein structures based on flexible contact map overlap alignments.
Keywords: maximum contact map overlap; protein space metric; k-nearest neighbor classification; superfamily classification; SCOP
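
The abstract's claim that a metric lets one skip comparisons against parts of the database can be illustrated with a small sketch. It is not the authors' algorithm and does not compute max-CMO: the Jaccard distance on sets is used here only as a stand-in metric, and the function names, the single-pivot scheme, and the toy data are all assumptions made for illustration. The idea is that, for any metric d and any fixed pivot p, the triangle inequality gives the lower bound |d(q, p) - d(x, p)| <= d(q, x), so a database entry x whose bound already exceeds the best distance found so far need not be compared with the query q at all.

```python
from typing import Callable, FrozenSet, Sequence, Tuple


def jaccard_distance(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Stand-in metric (NOT max-CMO): 1 - |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def nearest_with_pruning(
    query: FrozenSet[int],
    database: Sequence[FrozenSet[int]],
    pivot: FrozenSet[int],
    dist: Callable[[FrozenSet[int], FrozenSet[int]], float],
) -> Tuple[int, float, int]:
    """Return (index, distance, query-side distance evaluations).

    Hypothetical single-pivot scheme: distances from every database
    entry to the pivot are assumed to be precomputed once, offline.
    """
    pivot_dists = [dist(x, pivot) for x in database]  # offline step
    d_qp = dist(query, pivot)
    evaluations = 1  # the query-to-pivot comparison
    best_idx, best_d = -1, float("inf")
    for i, x in enumerate(database):
        # Triangle inequality: d(query, x) >= |d(query, pivot) - d(x, pivot)|
        lower_bound = abs(d_qp - pivot_dists[i])
        if lower_bound >= best_d:
            continue  # x cannot beat the current best, skip the comparison
        d = dist(query, x)
        evaluations += 1
        if d < best_d:
            best_idx, best_d = i, d
    return best_idx, best_d, evaluations


# Toy data: each set stands for the contacts of one structure.
database = [
    frozenset({1, 2, 3}),
    frozenset({1, 2, 4}),
    frozenset({7, 8, 9}),
    frozenset({7, 8, 10}),
]
query = frozenset({1, 2, 3, 5})
pivot = frozenset({7, 8, 9})

idx, d, n_eval = nearest_with_pruning(query, database, pivot, jaccard_distance)
print(idx, d, n_eval)
```

In this toy run the last two database entries are skipped because their lower bounds already exceed the best distance found. The pruning is only valid because the distance satisfies the triangle inequality, which is why establishing metric properties matters for this kind of search.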

MDPI and ACS Style

Andonov, R.; Djidjev, H.; Klau, G.W.; Boudic-Jamin, M.L.; Wohlers, I. Automatic Classification of Protein Structure Using the Maximum Contact Map Overlap Metric. Algorithms 2015, 8, 850-869.

