Algorithms 2009, 2(2), 692-709; doi:10.3390/a2020692
Article

Fast Structural Alignment of Biomolecules Using a Hash Table, N-Grams and String Descriptors

R.A. Bauer 1,2,*, K. Rother 3,4,*, P. Moor 1, K. Reinert 5, T. Steinke 6, J.M. Bujnicki 3,4 and R. Preissner 1,*
1 Charité Medical University, Structural Bioinformatics Group, Arnimallee 22, 14195 Berlin, Germany
2 Graduate School: Genomics and Systems Biology of Molecular Networks, Invalidenstrasse 43, 10115 Berlin, Germany
3 International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, 02-109 Warsaw, Poland
4 Laboratory of Bioinformatics, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, ul. Umultowska 89, 61-614 Poznan, Poland
5 Freie Universität Berlin, Algorithmische Bioinformatik, Institut für Informatik, Takustr. 9, 14195 Berlin, Germany
6 Zuse Institute Berlin, Dept. Computer Science, Takustrasse 7, 14195 Berlin, Germany
* Authors to whom correspondence should be addressed.
Received: 30 November 2008 / Revised: 8 April 2009 / Accepted: 9 April 2009 / Published: 21 April 2009
(This article belongs to the Special Issue Algorithms and Molecular Sciences)

Abstract

This work presents a generalized approach for the fast structural alignment of thousands of macromolecular structures. The method uses string representations of macromolecular structures and a hash table that stores n-grams of a given size for searching. To this end, structure-to-string translators were implemented for protein and RNA structures. A query against the index is performed in two hierarchical steps to combine speed and precision. In the first step, the query structure is translated into n-grams, and all target structures containing these n-grams are retrieved from the hash table. In the second step, the corresponding n-grams of the query and each target structure are aligned, and after each alignment a score is calculated based on the matching n-grams of query and target. The extensible framework enables the user to query and structurally align thousands of protein and RNA structures on a commodity machine and is available as open source from http://lajolla.sf.net.
Keywords: Structural alignment; protein; RNA; hash table; n-gram; torsion angles
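
The two-step query described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`ngrams`, `build_index`, `query`), the toy structure strings, and the choice of n = 3 are all hypothetical. In particular, the score used here is only the fraction of query n-grams found in a target, which stands in for the score the method computes after aligning the matching n-grams of query and target.

```python
from collections import defaultdict


def ngrams(s, n):
    """Return all overlapping substrings of length n, in order."""
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def build_index(targets, n):
    """Map each n-gram to the set of (target id, position) pairs where it occurs.

    `targets` is a dict of target id -> structure string (hypothetical input;
    a real translator would derive the string from 3D coordinates).
    """
    index = defaultdict(set)
    for target_id, s in targets.items():
        for pos, gram in enumerate(ngrams(s, n)):
            index[gram].add((target_id, pos))
    return index


def query(index, q, n):
    """Two-step lookup: retrieve candidate targets, then score each one."""
    q_grams = ngrams(q, n)
    if not q_grams:
        return []

    # Step 1: collect, per target, the (query pos, target pos) n-gram matches.
    hits = defaultdict(list)
    for q_pos, gram in enumerate(q_grams):
        for target_id, t_pos in index.get(gram, ()):
            hits[target_id].append((q_pos, t_pos))

    # Step 2 (simplified stand-in): score each candidate by the fraction of
    # query n-grams it matches. No 3D alignment is performed in this sketch.
    scores = {
        target_id: len({q_pos for q_pos, _ in pairs}) / len(q_grams)
        for target_id, pairs in hits.items()
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


targets = {"t1": "ABCDEF", "t2": "XXCDEX", "t3": "ZZZZZZ"}
index = build_index(targets, 3)
results = query(index, "BCDE", 3)
```

Because candidates are retrieved by hash lookup, targets that share no n-gram with the query (such as `t3` in the toy data) are never scored, which is what keeps the first step cheap.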
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

MDPI and ACS Style

Bauer, R.A.; Rother, K.; Moor, P.; Reinert, K.; Steinke, T.; Bujnicki, J.M.; Preissner, R. Fast Structural Alignment of Biomolecules Using a Hash Table, N-Grams and String Descriptors. Algorithms 2009, 2, 692-709.

Algorithms, EISSN 1999-4893. Published by MDPI AG, Basel, Switzerland.