Algorithms 2012, 5(2), 214-235; doi:10.3390/a5020214
Article

An Online Algorithm for Lightweight Grammar-Based Compression

Received: 30 January 2012; in revised form: 26 March 2012 / Accepted: 28 March 2012 / Published: 10 April 2012
(This article belongs to the Special Issue Data Compression, Communication and Processing)
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract: Grammar-based compression is a well-studied technique for constructing a context-free grammar (CFG) that uniquely derives a given text. In this work, we propose an online algorithm for grammar-based compression. Our algorithm guarantees an O(log^2 n)-approximation ratio for the minimum grammar size, where n is the input size, and it runs in time linear in the input size and in space linear in the output size. In addition, we propose a practical encoding that transforms a restricted CFG into a more compact representation. Experimental comparisons with standard compressors demonstrate that our algorithm is especially effective for highly repetitive text.
Keywords: lossless compression; grammar-based compression; online algorithm; approximation algorithm
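
As a minimal illustration of the idea in the abstract, a CFG that derives exactly one text can be sketched as below. This is not the authors' online algorithm or their encoding. The toy grammar, the names `Rules`, `expand`, and `grammar_size`, and the convention that every nonterminal has a single rule with exactly two right-hand-side symbols are all assumptions made for the example.

```python
from typing import Dict, Tuple

# Hypothetical toy representation: each nonterminal maps to exactly one rule
# with two right-hand-side symbols, so the grammar derives a single string.
# Any symbol that is not a key of the dictionary is treated as a terminal.
Rules = Dict[str, Tuple[str, str]]


def expand(symbol: str, rules: Rules) -> str:
    """Return the unique string derived from `symbol`."""
    if symbol not in rules:  # terminal character
        return symbol
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)


def grammar_size(rules: Rules) -> int:
    """Count the symbols on all right-hand sides (one common size measure)."""
    return sum(len(rhs) for rhs in rules.values())


# A repetitive text of length 8 described by three rules.
rules: Rules = {
    "X1": ("a", "b"),    # X1 -> a b
    "X2": ("X1", "X1"),  # X2 -> X1 X1  (derives "abab")
    "X3": ("X2", "X2"),  # X3 -> X2 X2  (derives "abababab")
}

text = expand("X3", rules)
print(text, len(text), grammar_size(rules))
```

Each doubling of the text adds only one rule in this toy case, which gives some intuition for why grammar-based methods can do well on highly repetitive input.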


MDPI and ACS Style

Maruyama, S.; Sakamoto, H.; Takeda, M. An Online Algorithm for Lightweight Grammar-Based Compression. Algorithms 2012, 5, 214-235.

AMA Style

Maruyama S, Sakamoto H, Takeda M. An Online Algorithm for Lightweight Grammar-Based Compression. Algorithms. 2012; 5(2):214-235.

Chicago/Turabian Style

Maruyama, Shirou; Sakamoto, Hiroshi; Takeda, Masayuki. 2012. "An Online Algorithm for Lightweight Grammar-Based Compression." Algorithms 5, no. 2: 214-235.

Algorithms EISSN 1999-4893 Published by MDPI AG, Basel, Switzerland