Algorithms 2009, 2(4), 1429-1448; doi:10.3390/a2041429
Article
Linear-Time Text Compression by Longest-First Substitution
1 Department of Informatics, Kyushu University, 744 Motooka, Fukuoka 819-0395, Japan
2 Graduate School of Information Science and Electrical Engineering, Kyushu University, 744 Motooka, Fukuoka 819-0395, Japan
3 Graduate School of Information Sciences, Tohoku University, Aoba 6-6-05, Aramaki, Sendai 980-8579, Japan
* Author to whom correspondence should be addressed.
Received: 30 September 2009 / Accepted: 20 November 2009 / Published: 25 November 2009
(This article belongs to the Special Issue Data Compression)
Abstract: We consider grammar-based text compression with longest-first substitution (LFS), where non-overlapping occurrences of a longest repeating factor of the input text are replaced by a new non-terminal symbol. We present the first linear-time algorithm for LFS. Our algorithm employs a new data structure called sparse lazy suffix trees. We also consider a more sophisticated version of LFS, called LFS2, that allows better compression, and present the first linear-time algorithm for LFS2 as well.
Keywords: grammar-based text compression; suffix trees; linear-time algorithms
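To make the substitution step in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the linear-time algorithm of the article and does not use sparse lazy suffix trees; it simply searches for a longest factor with two non-overlapping occurrences and replaces its occurrences greedily from left to right. The function names, the tuple encoding of non-terminals, the tie-breaking among equally long factors, and the choice to let later factors contain earlier non-terminals are all assumptions made for illustration.

```python
def longest_repeating_factor(seq):
    """Return a longest factor (length >= 2) with two non-overlapping
    occurrences in seq, or None if no such factor exists."""
    n = len(seq)
    for length in range(n // 2, 1, -1):          # try longest candidates first
        first_seen = {}                           # factor -> leftmost start index
        for i in range(n - length + 1):
            key = tuple(seq[i:i + length])
            j = first_seen.setdefault(key, i)
            if i - j >= length:                   # occurrences at j and i do not overlap
                return list(key)
    return None


def replace_non_overlapping(seq, factor, symbol):
    """Replace occurrences of factor by symbol, greedily from left to right."""
    out, i, k = [], 0, len(factor)
    while i < len(seq):
        if seq[i:i + k] == factor:
            out.append(symbol)
            i += k
        else:
            out.append(seq[i])
            i += 1
    return out


def lfs_naive(text):
    """Brute-force longest-first substitution (illustrative, not linear time)."""
    seq = list(text)
    rules = {}
    while True:
        factor = longest_repeating_factor(seq)
        if factor is None:
            break
        symbol = ("N", len(rules) + 1)            # hypothetical non-terminal encoding
        rules[symbol] = factor
        seq = replace_non_overlapping(seq, factor, symbol)
    return seq, rules


def expand(seq, rules):
    """Expand a compressed sequence back into the original string."""
    out = []
    for s in seq:
        if s in rules:
            out.append(expand(rules[s], rules))
        else:
            out.append(s)
    return "".join(out)


if __name__ == "__main__":
    compressed, grammar = lfs_naive("abcabcabc")
    print(compressed, grammar)
    print(expand(compressed, grammar))
```

Each round shortens the working sequence, so the loop terminates, but the repeated scans make this sketch polynomial rather than linear in the text length. That gap is what the article's algorithm is designed to close.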
Cite This Article
MDPI and ACS Style
Nakamura, R.; Inenaga, S.; Bannai, H.; Funamoto, T.; Takeda, M.; Shinohara, A. Linear-Time Text Compression by Longest-First Substitution. Algorithms 2009, 2, 1429-1448.
AMA Style
Nakamura R, Inenaga S, Bannai H, Funamoto T, Takeda M, Shinohara A. Linear-Time Text Compression by Longest-First Substitution. Algorithms. 2009; 2(4):1429-1448.
Chicago/Turabian Style
Nakamura, Ryosuke; Inenaga, Shunsuke; Bannai, Hideo; Funamoto, Takashi; Takeda, Masayuki; Shinohara, Ayumi. 2009. "Linear-Time Text Compression by Longest-First Substitution." Algorithms 2, no. 4: 1429-1448.
Algorithms
EISSN 1999-4893
Published by MDPI AG, Basel, Switzerland
