Algorithms 2009, 2(2), 692-709; doi:10.3390/a2020692

Fast Structural Alignment of Biomolecules Using a Hash Table, N-Grams and String Descriptors

R.A. Bauer 1,2,*, K. Rother 3,4,*, P. Moor 1, K. Reinert 5, T. Steinke 6, J.M. Bujnicki 3,4 and R. Preissner 1,*
1 Charité Medical University, Structural Bioinformatics Group, Arnimallee 22, 14195 Berlin, Germany
2 Graduate School: Genomics and Systems Biology of Molecular Networks, Invalidenstrasse 43, 10115 Berlin, Germany
3 International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, 02-109 Warsaw, Poland
4 Laboratory of Bioinformatics, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, ul. Umultowska 89, 61-614 Poznan, Poland
5 Freie Universität Berlin, Algorithmische Bioinformatik, Institut für Informatik, Takustr. 9, 14195 Berlin, Germany
6 Zuse Institute Berlin, Dept. Computer Science, Takustrasse 7, 14195 Berlin, Germany
* Authors to whom correspondence should be addressed.
Received: 30 November 2008 / Revised: 8 April 2009 / Accepted: 9 April 2009 / Published: 21 April 2009
(This article belongs to the Special Issue Algorithms and Molecular Sciences)


This work presents a generalized approach for the fast structural alignment of thousands of macromolecular structures. The method uses string representations of a macromolecular structure and a hash table that stores n-grams of a certain size for searching. To this end, macromolecular structure-to-string translators were implemented for protein and RNA structures. A query against the index is performed in two hierarchical steps to unite speed and precision. In the first step the query structure is translated into n-grams, and all target structures containing these n-grams are retrieved from the hash table. In the second step all corresponding n-grams of the query and each target structure are subsequently aligned, and after each alignment a score is calculated based on the matching n-grams of query and target. The extendable framework enables the user to query and structurally align thousands of protein and RNA structures on a commodity machine and is available as open source from
Keywords: Structural alignment; protein; RNA; hash table; n-gram; torsion angles
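
The two-step query described in the abstract can be illustrated with a minimal sketch. It assumes each structure has already been translated into a string over some structural alphabet (the letters below are placeholders, not the alphabet used by the authors). The function names `ngrams`, `build_index` and `query` are hypothetical, and the score (the fraction of query n-gram positions found in a target) is a simplified stand-in for the alignment-based score of the second step, not the published scoring scheme.

```python
from collections import defaultdict


def ngrams(s, n):
    """Return (offset, n-gram) pairs for every window of length n in s."""
    return [(i, s[i:i + n]) for i in range(len(s) - n + 1)]


def build_index(targets, n):
    """Hash table mapping each n-gram to the (target id, offset) pairs where it occurs."""
    index = defaultdict(list)
    for tid, s in targets.items():
        for off, gram in ngrams(s, n):
            index[gram].append((tid, off))
    return index


def query(index, q, n):
    """Step 1: retrieve targets sharing n-grams with q. Step 2: score each target."""
    hits = defaultdict(list)
    for q_off, gram in ngrams(q, n):
        # .get avoids creating empty entries in the defaultdict for unseen n-grams
        for tid, t_off in index.get(gram, ()):
            hits[tid].append((q_off, t_off))

    scores = {}
    for tid, pairs in hits.items():
        # Toy score (assumption): fraction of query n-gram positions matched in the target
        matched = {q_off for q_off, _ in pairs}
        scores[tid] = len(matched) / (len(q) - n + 1)

    # Best score first; ties broken by target id for a stable order
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Placeholder structure strings; real ones would come from a structure-to-string translator
targets = {"t1": "ABCABD", "t2": "XYZXYZ", "t3": "ZABCA"}
index = build_index(targets, 3)
print(query(index, "ABCA", 3))
```

Because the index stores offsets along with target ids, the matched `(query offset, target offset)` pairs collected in step 1 are available as seeds for a proper alignment in step 2; the sketch only counts them.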
This is an open access article distributed under the Creative Commons Attribution License (CC BY 3.0).

MDPI and ACS Style

Bauer, R.A.; Rother, K.; Moor, P.; Reinert, K.; Steinke, T.; Bujnicki, J.M.; Preissner, R. Fast Structural Alignment of Biomolecules Using a Hash Table, N-Grams and String Descriptors. Algorithms 2009, 2, 692-709.

Algorithms EISSN 1999-4893 Published by MDPI AG, Basel, Switzerland