Open Access: this article is freely available.
Linear-Time Text Compression by Longest-First Substitution
Department of Informatics, Kyushu University, 744 Motooka, Fukuoka 819-0395, Japan
Graduate School of Information Science and Electrical Engineering, Kyushu University, 744 Motooka, Fukuoka 819-0395, Japan
Graduate School of Information Sciences, Tohoku University, Aoba 6-6-05, Aramaki, Sendai 980-8579, Japan
* Author to whom correspondence should be addressed.
Received: 30 September 2009; Accepted: 20 November 2009 / Published: 25 November 2009
Abstract: We consider grammar-based text compression with longest-first substitution (LFS), where non-overlapping occurrences of a longest repeating factor of the input text are replaced by a new non-terminal symbol. We present the first linear-time algorithm for LFS. Our algorithm employs a new data structure called sparse lazy suffix trees. We also consider a more sophisticated version of LFS, called LFS2, that allows better compression, and present the first linear-time algorithm for LFS2 as well.
Keywords: grammar-based text compression; suffix trees; linear-time algorithms
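The substitution step described in the abstract can be illustrated with a deliberately naive sketch. This is not the linear-time algorithm of the article and does not use sparse lazy suffix trees; it only shows the idea of repeatedly replacing non-overlapping occurrences of a longest repeating factor with a new non-terminal. The function names, the `<k>` naming of non-terminals, the tie-breaking among equally long factors, and the left-to-right choice of occurrences are all assumptions made for illustration.

```python
def longest_repeating_factor(seq):
    """Return a longest factor (length >= 2) that has two non-overlapping
    occurrences in seq, or None if there is none. Naive: polynomial time."""
    n = len(seq)
    for length in range(n // 2, 1, -1):
        first_seen = {}  # factor -> leftmost start position
        for i in range(n - length + 1):
            key = tuple(seq[i:i + length])
            j = first_seen.setdefault(key, i)
            if i - j >= length:  # occurrence at i does not overlap the one at j
                return key
    return None


def replace_non_overlapping(seq, factor, symbol):
    """Replace occurrences of factor by symbol, scanning greedily left to right."""
    out, i, k = [], 0, len(factor)
    while i < len(seq):
        if tuple(seq[i:i + k]) == factor:
            out.append(symbol)
            i += k
        else:
            out.append(seq[i])
            i += 1
    return out


def lfs_compress(text):
    """Return (compressed sequence, rules), where rules maps each
    hypothetical non-terminal name to its right-hand side."""
    seq = list(text)  # terminals are single characters
    rules = {}
    while True:
        factor = longest_repeating_factor(seq)
        if factor is None:
            break
        symbol = f"<{len(rules)}>"  # multi-character name, so it cannot equal a terminal
        rules[symbol] = list(factor)
        seq = replace_non_overlapping(seq, factor, symbol)
    return seq, rules


def expand(seq, rules):
    """Recover the original text from the compressed sequence and rules."""
    return "".join(expand(rules[s], rules) if s in rules else s for s in seq)
```

For example, `lfs_compress("abcabcabc")` replaces the three occurrences of `abc` with one non-terminal, and `expand` applied to the result restores the input.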
Cite This Article
MDPI and ACS Style
Nakamura, R.; Inenaga, S.; Bannai, H.; Funamoto, T.; Takeda, M.; Shinohara, A. Linear-Time Text Compression by Longest-First Substitution. Algorithms 2009, 2, 1429-1448.
AMA Style
Nakamura R, Inenaga S, Bannai H, Funamoto T, Takeda M, Shinohara A. Linear-Time Text Compression by Longest-First Substitution. Algorithms. 2009; 2(4):1429-1448.
Chicago/Turabian Style
Nakamura, Ryosuke; Inenaga, Shunsuke; Bannai, Hideo; Funamoto, Takashi; Takeda, Masayuki; Shinohara, Ayumi. 2009. "Linear-Time Text Compression by Longest-First Substitution." Algorithms 2, no. 4: 1429-1448.