Algorithms 2009, 2(4), 1429-1448; doi:10.3390/a2041429
Article

Linear-Time Text Compression by Longest-First Substitution

Ryosuke Nakamura 1, Shunsuke Inenaga 2,*, Hideo Bannai 1, Takashi Funamoto 1, Masayuki Takeda 1 and Ayumi Shinohara 3
Received: 30 September 2009; Accepted: 20 November 2009 / Published: 25 November 2009
(This article belongs to the Special Issue Data Compression)
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract: We consider grammar-based text compression with longest-first substitution (LFS), in which non-overlapping occurrences of a longest repeating factor of the input text are replaced by a new non-terminal symbol. We present the first linear-time algorithm for LFS. Our algorithm employs a new data structure called sparse lazy suffix trees. We also consider a more sophisticated version of LFS, called LFS2, that allows better compression, and present the first linear-time algorithm for LFS2.
Keywords: grammar-based text compression; suffix trees; linear-time algorithms
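
The substitution step described in the abstract can be illustrated with a small brute-force sketch. This is not the linear-time algorithm of the article and does not use sparse lazy suffix trees; it only shows what "replace non-overlapping occurrences of a longest repeating factor by a new non-terminal" means. The function names, the `<A1>`-style non-terminal names, the minimum factor length of 2, the left-to-right greedy choice of occurrences, and the tie-breaking among equally long factors are all assumptions made for illustration.

```python
def longest_repeating_factor(seq):
    """Return a longest factor (length >= 2) that has two non-overlapping
    occurrences in seq, or None if no such factor exists (brute force)."""
    n = len(seq)
    for length in range(n // 2, 1, -1):
        first_seen = {}  # factor -> leftmost start position
        for i in range(n - length + 1):
            factor = tuple(seq[i:i + length])
            start = first_seen.setdefault(factor, i)
            if i - start >= length:  # this occurrence does not overlap the leftmost one
                return factor
    return None


def replace_non_overlapping(seq, factor, symbol):
    """Replace occurrences of factor with symbol, scanning greedily left to right."""
    out, i, k = [], 0, len(factor)
    while i < len(seq):
        if tuple(seq[i:i + k]) == factor:
            out.append(symbol)
            i += k
        else:
            out.append(seq[i])
            i += 1
    return out


def lfs_compress(text):
    """Repeat the substitution until no factor of length >= 2 repeats.
    Returns the final symbol sequence and the production rules."""
    seq = list(text)
    rules = {}
    while True:
        factor = longest_repeating_factor(seq)
        if factor is None:
            break
        symbol = f"<A{len(rules) + 1}>"  # hypothetical naming scheme for non-terminals
        rules[symbol] = list(factor)
        seq = replace_non_overlapping(seq, factor, symbol)
    return seq, rules


def expand(seq, rules):
    """Decompress by recursively expanding non-terminals."""
    out = []
    for s in seq:
        if s in rules:
            out.append(expand(rules[s], rules))
        else:
            out.append(s)
    return "".join(out)
```

For the input `abcabcabc`, the longest factor with two non-overlapping occurrences is `abc`, so this sketch produces the sequence `<A1> <A1> <A1>` with the single rule `<A1> -> abc`. The sketch searches only the current text for repeats; it is not an implementation of the LFS2 variant mentioned in the abstract.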


MDPI and ACS Style

Nakamura, R.; Inenaga, S.; Bannai, H.; Funamoto, T.; Takeda, M.; Shinohara, A. Linear-Time Text Compression by Longest-First Substitution. Algorithms 2009, 2, 1429-1448.


Algorithms EISSN 1999-4893 Published by MDPI AG, Basel, Switzerland