Abstract: This work presents a generalized approach for the fast structural alignment of thousands of macromolecular structures. The method represents each macromolecular structure as a string and stores n-grams of a fixed size from these strings in a hash table for searching. To this end, structure-to-string translators were implemented for protein and RNA structures. A query against the index is performed in two hierarchical steps to combine speed and precision. In the first step, the query structure is translated into n-grams, and all target structures containing these n-grams are retrieved from the hash table. In the second step, all corresponding n-grams of the query and each target structure are aligned, and after each alignment a score is calculated based on the matching n-grams of query and target. The extensible framework enables the user to query and structurally align thousands of protein and RNA structures on a commodity machine and is available as open source from http://lajolla.sf.net.
Keywords: Structural alignment; protein; RNA; hash table; n-gram; torsion angles
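The two-step query described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `NGramIndex`, the helper `ngrams`, the n-gram size of 3, and the letter strings used as structure descriptors are all hypothetical. The second step here does not perform a structural alignment; it only counts matched query n-gram positions as a stand-in score.

```python
from collections import defaultdict


def ngrams(s, n):
    """Return all overlapping substrings of length n with their start offsets."""
    return [(s[i:i + n], i) for i in range(len(s) - n + 1)]


class NGramIndex:
    """Hypothetical hash-table index over string-encoded structures."""

    def __init__(self, n=3):
        self.n = n
        # n-gram -> list of (target_id, offset in the target string)
        self.table = defaultdict(list)

    def add(self, target_id, structure_string):
        """Store every n-gram of a target's string descriptor in the hash table."""
        for gram, offset in ngrams(structure_string, self.n):
            self.table[gram].append((target_id, offset))

    def query(self, query_string):
        # Step 1: translate the query into n-grams and retrieve every
        # target that contains at least one of them.
        hits = defaultdict(list)
        for gram, q_off in ngrams(query_string, self.n):
            for target_id, t_off in self.table.get(gram, ()):
                hits[target_id].append((q_off, t_off))

        # Step 2 (placeholder): a real system would align the corresponding
        # n-grams here. This sketch only counts how many distinct query
        # positions found a match (an assumed stand-in score).
        scores = {tid: len({q for q, _ in pairs}) for tid, pairs in hits.items()}
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


index = NGramIndex(n=3)
index.add("a", "HHHEEELLL")
index.add("b", "LLLHHH")
index.add("c", "EEEEEE")
ranking = index.query("HHHEEE")
```

Keeping the offsets of each hit in the table, rather than only the target identifiers, is what would let a later alignment step know which positions of query and target correspond.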
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
MDPI and ACS Style
Bauer, R.A.; Rother, K.; Moor, P.; Reinert, K.; Steinke, T.; Bujnicki, J.M.; Preissner, R. Fast Structural Alignment of Biomolecules Using a Hash Table, N-Grams and String Descriptors. Algorithms 2009, 2, 692-709.