Algorithms 2011, 4(1), 61-74; doi:10.3390/a4010061

Compressed Matching in Dictionaries

Received: 30 January 2011; in revised form: 2 March 2011 / Accepted: 17 March 2011 / Published: 22 March 2011
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract: The problem of compressed pattern matching, which has recently been treated in many papers dealing with free text, is extended to structured files, specifically to dictionaries, which appear in any full-text retrieval system. The prefix-omission method is combined with Huffman coding, and a new variant based on Fibonacci codes is presented. Experimental results suggest that the new methods are often preferable to earlier ones, in particular for small files, which are typical of dictionaries, since these are usually kept in small chunks.
Keywords: dictionaries; IR systems; pattern matching; compressed matching; Huffman codes; Fibonacci codes
