Special Issue "Data Compression"

A special issue of Algorithms (ISSN 1999-4893).

Deadline for manuscript submissions: 30 September 2009

Special Issue Editors

Assistant Editor
Ms. Laura Simon
MDPI, Kandererstrasse 25, CH-4057 Basel, Switzerland

Guest Editor
Dr. David Salomon
Computer Science Dept. (Retired), California State University, Northridge, CA 91330-8281, USA
Website: http://www.davidsalomon.name/

Special Issue Information

Data compression is the operation of converting an input data file to a smaller file. This operation is important for the following reasons:

1. People like to accumulate data. Thus, no matter how big a storage device one has, sooner or later it is going to fill up.
2. People hate to wait for data transfers. We often upload and download files from our computers and we hate to wait for long, slow data transfers.

How can data be compressed? We can represent the same amount of information in fewer bits because the original data representation is not the shortest possible. It is intentionally long in order to simplify processing the data. We say that our data representations have redundancies. Compressing data is done by locating its redundancies and reducing or eliminating them. Thus, the field of data compression tries to understand the sources of redundancies in different types of data and find clever methods to eliminate them.

Today, after decades of research, there are hundreds of algorithms and dozens of implementations that can reduce the size of all types of digital data. It is my hope that this issue of Algorithms will make a significant contribution toward this goal.
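As a small illustration of locating and eliminating redundancy, the following sketch uses run-length encoding, one of the simplest compression techniques. It is only an example chosen for brevity, not a method proposed by this issue; the function names `rle_encode` and `rle_decode` are hypothetical.

```python
from itertools import groupby


def rle_encode(data: str) -> list[tuple[str, int]]:
    """Collapse each run of identical characters into a (char, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(data)]


def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Rebuild the original string by repeating each character count times."""
    return "".join(ch * count for ch, count in pairs)


text = "aaaabbbcca"
encoded = rle_encode(text)
# Lossless round trip: decoding the pairs restores the input exactly.
assert rle_decode(encoded) == text
```

The redundancy here is the repetition of a symbol within a run. Data without such runs gains nothing from this scheme, which is why different types of data call for different methods.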

Submission

All papers should be submitted to algorithms@mdpi.org. Papers will be published continuously until the deadline and will be listed together on the special issue website.

Submitted papers should neither have been published previously nor be under consideration for publication elsewhere. All papers are refereed through a peer-review process. A guide for authors is available on the Instructions for Authors page. Algorithms is an international peer-reviewed quarterly journal published by Molecular Diversity Preservation International.

Open Access publication fees are 300 CHF per paper. English correction and/or formatting fees (250 CHF) will be added in certain cases, giving 550 CHF per paper for those papers that require extensive additional formatting and/or English corrections.

Article Processing Charges (APC)

Article Processing Charges (APC) will be waived for well-prepared manuscripts of invited papers. For the first two volumes of this new journal the APC is 300 CHF (or 550 CHF per paper for those papers that require extensive additional formatting and/or English corrections).

Keywords

  • data compression
  • data coding
  • source coding
  • information theory
  • entropy
  • data redundancy
  • variable-length codes

Planned Papers

Title: Indexing Straight-Line Programs
Authors: Gonzalo Navarro; Francisco Claude
Abstract: Straight-line programs (SLPs) offer powerful text compression by representing a text T[1,u] in terms of a restricted context-free grammar of n rules, so that T can be recovered in O(u) time. However, the problem of operating the grammar in compressed form has not been studied much. We present a grammar representation whose size is of the same order as that of a plain SLP representation, and which can answer other queries apart from expanding nonterminals. This can be of independent interest. We then extend it to achieve the first grammar representation capable of extracting text substrings, and of searching the text for patterns, in time o(n). We also give byproducts on representing binary relations.
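To make the notion of a straight-line program concrete, here is a minimal sketch of plain SLP expansion. It is not the indexed representation announced in the abstract. It assumes a simplified encoding in which every rule maps a nonterminal to a pair of symbols and any symbol without a rule is a terminal; the names `expand` and `rules` are hypothetical.

```python
def expand(rules: dict[str, tuple[str, str]], symbol: str) -> str:
    """Expand a symbol of a straight-line program into the text it derives.

    Every symbol is visited once per occurrence in the parse tree, so the
    work is linear in the length of the produced text. An explicit stack
    is used instead of recursion so that deep grammars do not hit
    Python's recursion limit.
    """
    out = []
    stack = [symbol]
    while stack:
        current = stack.pop()
        if current in rules:
            left, right = rules[current]
            # Push right first so that left is expanded first.
            stack.append(right)
            stack.append(left)
        else:
            out.append(current)  # terminal symbol
    return "".join(out)


# Three rules derive a text of length five: X3 -> X2 a -> X1 X1 a -> ababa
rules = {"X1": ("a", "b"), "X2": ("X1", "X1"), "X3": ("X2", "a")}
text = expand(rules, "X3")
```

Expanding the whole text in this way is what the abstract calls recovering T in O(u) time. Extracting a substring or searching for a pattern without full expansion requires additional structure, which this sketch does not attempt.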

Title: Approximate String Matching over Compressed Self-Indexes
Authors: Gonzalo Navarro; Luis Russo; Arlindo Oliveira
Abstract: To be added soon

Title: To be added soon
Authors: Mahmoud El-Sakka; Amr Abdel-Dayem
Abstract: To be added soon

Title: Data Compression Algorithms and their Applications to Bioinformatics
Authors: Ö. U. Nalbantoglu; K. Sayood; D. Russell
Affiliation: Department of Electrical Engineering, University of Nebraska, Lincoln, NE 68588-0511, USA
Abstract: Data compression at its base is concerned with how information is organized in data. Understanding this organization can lead to efficient ways of representing the information and hence data compression. In this paper we review the ways in which ideas and approaches fundamental to the theory and practice of data compression have been used in the area of bioinformatics. We look at how some of the ideas behind the popular Lempel-Ziv compression algorithms have been used to infer phylogenetic relationships between organisms, develop multiple sequence alignments for very large data-sets, and infer relationships between partial genomic sequences. We will also examine how basic theoretical ideas from data compression, such as the notions of entropy and mutual information, have been used for analyzing biological sequences in order to discover hidden patterns and to develop techniques for segmenting large genomes in a biologically informative manner. Finally, we look at how the theoretical concepts which underlie many data compression algorithms can be used to advance our understanding of microbial communities that inhabit our bodies and our environment.

Title: Linear-time Off-line Text Compression by Longest First Substitution
Authors: Ryosuke Nakamura, Shunsuke Inenaga, Hideo Bannai, Takashi Funamoto, Masayuki Takeda, Ayumi Shinohara
Abstract: We consider grammar-based text compression with longest first substitution, where non-overlapping occurrences of a longest repeating substring of the input text are replaced by a new non-terminal symbol. We introduce a new data structure called the sparse lazy suffix tree (SLSTree), and present a novel linear-time algorithm for text compression with longest first substitution using SLSTree. We also present another type of longest first substitution strategy that allows a better compression ratio. We show results of preliminary experiments comparing grammar sizes of the two versions of the longest first strategy and the most frequent strategy.

Title: Interactive Compression of Digital Data
Author: Bruno Carpentieri
Abstract: Because of the fundamental source coding theorem, entropy is a limit to the length of the string that we can use to losslessly code a message. Today lossless compressors are very efficient. While it is not possible to prove that they always achieve the entropy limit, the performance of state-of-the-art compressors for specific types of data, like text or images, is certainly very close to this theoretical limit. Current research on lossless compression focuses on developing compressors for new types of digital data or on improving, often only by small amounts, existing compressors. Even a small improvement is an important achievement, given the economic impact it can have on data transmission or storage. If we can use previous knowledge of the source (or the knowledge of a source that is correlated to the one we want to compress) in the compression process, we can have significant gains in compression. By doing this, in the fundamental source coding theorem we can substitute entropy with conditional entropy, and therefore we have a new theoretical limit that allows for better compression. To do this we can assume some degree of interaction between the compressor and the decompressor that can allow a more efficient usage of the previous knowledge they both have of the source when data compression is used for data transmission. In this paper we review recent work that applies interactive approaches to data compression and discuss this possibility.
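The gain from substituting conditional entropy for entropy can be seen in a small numerical sketch. The code below is an illustration only, not taken from the paper: it computes empirical (frequency-based) estimates from short sample sequences, uses the identity H(X|Y) = H(X,Y) - H(Y), and the function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(samples) -> float:
    """Empirical Shannon entropy, in bits per symbol, of a sample sequence."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def conditional_entropy(xs, ys) -> float:
    """Empirical H(X|Y), computed as H(X,Y) - H(Y)."""
    return entropy(list(zip(xs, ys))) - entropy(ys)


xs = [0, 1, 1, 0]
correlated = [0, 1, 1, 0]   # side information identical to the source
independent = [0, 0, 1, 1]  # side information unrelated to the source

h_x = entropy(xs)
h_given_correlated = conditional_entropy(xs, correlated)
h_given_independent = conditional_entropy(xs, independent)
```

In this toy case the source costs one bit per symbol on its own, nothing when the decompressor already knows a perfectly correlated sequence, and still one bit when the side information is independent of the source.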

Published Papers

No papers have been published in this special issue yet.

Last update: 30 June 2009

Algorithms EISSN 1999-4893 Published by Molecular Diversity Preservation International (MDPI)