Algorithms: Data Compression
http://www.mdpi.com/journal/algorithms/special_issues/data_compression/

Data compression is the operation of converting an input data file to a smaller file. This operation is important for the following reasons:

1. People like to accumulate data. Thus, no matter how big a storage device one has, sooner or later it is going to fill up.
2. People hate to wait for data transfers. We often upload and download files from our computers, and we hate to wait for long, slow data transfers.

How can data be compressed? We can represent the same amount of information in fewer bits because the original data representation is not the shortest possible. It is intentionally long in order to simplify processing the data. We say that our data representations have redundancies. Compressing data is done by locating its redundancies and reducing or eliminating them. Thus, the field of data compression tries to understand the sources of redundancies in different types of data and find clever methods to eliminate them. Today, after decades of research, there are hundreds of algorithms and dozens of implementations that can reduce the size of all types of digital data. It is my hope that this issue of Algorithms will make a significant contribution toward this goal.

Submission

All papers should be submitted to algorithms@mdpi.org. Papers will be published continuously until the deadline and listed together on the special issue website. Submitted papers must not have been published previously or be under consideration for publication elsewhere. All papers are refereed through a peer-review process. A guide for authors is available on the Instructions for Authors page. Algorithms is an international peer-reviewed quarterly journal published by Molecular Diversity Preservation International. Article Processing Charges (APC) will be waived for well-prepared manuscripts of invited papers.
For the first three volumes of this new journal, the APC is 300 CHF per paper (or 550 CHF per paper for papers that require extensive additional formatting and/or English corrections) for papers submitted before 31 December 2010.

Algorithms, Vol. 3, Pages 63-75: Interactive Compression of Digital Data
http://www.mdpi.com/1999-4893/3/1/63/

If we can use previous knowledge of the source (or knowledge of a source that is correlated with the one we want to compress) in the compression process, then we can obtain significant gains in compression. In doing so, we can substitute conditional entropy for entropy in the fundamental source coding theorem, which gives a new theoretical limit that allows for better compression. To do this, when data compression is used for data transmission, we can assume some degree of interaction between the compressor and the decompressor that allows a more efficient use of the previous knowledge they both have of the source. In this paper we review previous work that applies interactive approaches to data compression and discuss this possibility.

Published: 29 January 2010. ISSN: 1999-4893. doi: 10.3390/a3010063. Author: Bruno Carpentieri.

Algorithms, Vol. 2, Pages 1429-1448: Linear-Time Text Compression by Longest-First Substitution
http://www.mdpi.com/1999-4893/2/4/1429/

We consider grammar-based text compression with longest-first substitution (LFS), where non-overlapping occurrences of a longest repeating factor of the input text are replaced by a new non-terminal symbol. We present the first linear-time algorithm for LFS. Our algorithm employs a new data structure called sparse lazy suffix trees. We also deal with a more sophisticated version of LFS, called LFS2, that allows better compression. The first linear-time algorithm for LFS2 is also presented.
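
The substitution step described in the abstract above can be illustrated with a deliberately naive sketch. It is not the paper's linear-time algorithm and does not use sparse lazy suffix trees; it searches for a longest repeating factor by brute force, and the names `longest_repeat`, `replace_all`, `lfs_compress`, and `expand` are hypothetical helpers chosen for this illustration. Terminals are single characters and non-terminals are integers, which is also an assumption of the sketch.

```python
from typing import Dict, List, Optional, Tuple, Union

Symbol = Union[str, int]  # str = terminal character, int = non-terminal (assumed encoding)


def longest_repeat(seq: List[Symbol]) -> Optional[Tuple[Symbol, ...]]:
    """Return a longest factor (length >= 2) with two non-overlapping occurrences."""
    n = len(seq)
    for length in range(n // 2, 1, -1):          # try longer factors first
        first_seen: Dict[Tuple[Symbol, ...], int] = {}
        for i in range(n - length + 1):
            factor = tuple(seq[i:i + length])
            j = first_seen.setdefault(factor, i)  # earliest start of this factor
            if i - j >= length:                   # occurrence at i does not overlap the one at j
                return factor
    return None


def replace_all(seq: List[Symbol], factor: Tuple[Symbol, ...], nt: int) -> List[Symbol]:
    """Replace non-overlapping occurrences of factor, scanning left to right."""
    out: List[Symbol] = []
    i, k = 0, len(factor)
    while i < len(seq):
        if tuple(seq[i:i + k]) == factor:
            out.append(nt)
            i += k
        else:
            out.append(seq[i])
            i += 1
    return out


def lfs_compress(text: str) -> Tuple[List[Symbol], Dict[int, Tuple[Symbol, ...]]]:
    """Naive longest-first substitution: returns (start sequence, grammar rules)."""
    seq: List[Symbol] = list(text)
    rules: Dict[int, Tuple[Symbol, ...]] = {}
    while True:
        factor = longest_repeat(seq)
        if factor is None:
            return seq, rules
        nt = len(rules)                           # fresh non-terminal id
        rules[nt] = factor
        seq = replace_all(seq, factor, nt)


def expand(seq, rules: Dict[int, Tuple[Symbol, ...]]) -> str:
    """Expand a sequence of symbols back into the original text."""
    return "".join(s if isinstance(s, str) else expand(rules[s], rules) for s in seq)


start, rules = lfs_compress("abcabcabc")
restored = expand(start, rules)
```

This brute-force search takes far more than linear time; it only makes the replacement rule concrete. It rewrites the main sequence only, so it should not be read as an implementation of LFS2.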

Published: 25 November 2009. ISSN: 1999-4893. doi: 10.3390/a2041429. Authors: Ryosuke Nakamura, Shunsuke Inenaga, Hideo Bannai, Takashi Funamoto, Masayuki Takeda, Ayumi Shinohara.

Algorithms, Vol. 2, Pages 1221-1231: Multiplication Symmetric Convolution Property for Discrete Trigonometric Transforms
http://www.mdpi.com/1999-4893/2/3/1221/

The symmetric-convolution multiplication (SCM) property of discrete trigonometric transforms (DTTs) based on unitary transform matrices is developed. Then, as the reciprocal of this property, the novel multiplication symmetric-convolution (MSC) property of discrete trigonometric transforms is developed.

Published: 22 September 2009. ISSN: 1999-4893. doi: 10.3390/a2031221. Authors: Do Nyeon Kim, K. R. Rao.

Algorithms, Vol. 2, Pages 1105-1136: Approximate String Matching with Compressed Indexes
http://www.mdpi.com/1999-4893/2/3/1105/

A compressed full-text self-index for a text T is a data structure that requires reduced space and can search for patterns P in T. It can also reproduce any substring of T, thus actually replacing T. Despite the recent explosion of interest in compressed indexes, there has not been much progress on functionalities beyond basic exact search. In this paper we focus on indexed approximate string matching (ASM), which is of great interest, for example, in bioinformatics. We study ASM algorithms for Lempel-Ziv compressed indexes and for compressed suffix trees/arrays. Most compressed self-indexes belong to one of these classes. We start by adapting the classical method of partitioning into exact search to self-indexes, and optimize it over a representative of either class of self-index.
Then, we show that a Lempel-Ziv index can be seen as an extension of the classical q-samples index. We give new insights on this type of index, which can be of independent interest, and then apply them to a Lempel-Ziv index. Finally, we improve hierarchical verification, a successful technique for sequential searching, so as to extend the matches of pattern pieces to the left or right. Most compressed suffix trees/arrays support the required bidirectionality, thus enabling the implementation of the improved technique. In turn, the improved verification greatly reduces the accesses to the text, which are expensive in self-indexes. We show experimentally that our algorithms are competitive and provide useful space-time tradeoffs compared to classical indexes.

Published: 10 September 2009. ISSN: 1999-4893. doi: 10.3390/a2031105. Authors: Luís M. S. Russo, Gonzalo Navarro, Arlindo L. Oliveira, Pedro Morales.

Algorithms, Vol. 2, Pages 1031-1044: Graph Compression by BFS
http://www.mdpi.com/1999-4893/2/3/1031/

The Web Graph is a large-scale graph that does not fit in main memory, so lossless compression methods have been proposed for it. This paper introduces a compression scheme that combines efficient storage with fast retrieval of the information in a node. The scheme exploits the properties of the Web Graph without assuming an ordering of the URLs, so it may be applied to more general graphs. Tests on some datasets in common use show space savings of about 10% over existing methods.

Published: 25 August 2009. ISSN: 1999-4893. doi: 10.3390/a2031031. Authors: Alberto Apostolico, Guido Drovandi.
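
The abstract of the Graph Compression by BFS entry does not spell out its encoding, so the following is only a loose illustration of why a breadth-first numbering might help, not the authors' scheme. It assumes that nodes are renumbered in BFS visit order and that each adjacency list is stored as gaps between sorted neighbor indices, which tend to be small when neighbors are visited close together. The functions `bfs_order`, `encode`, and `decode` are hypothetical names for this sketch.

```python
from collections import deque
from itertools import accumulate


def bfs_order(adj):
    """Map each node to its BFS visit index, restarting from unvisited nodes."""
    order = {}
    for root in adj:
        if root in order:
            continue
        order[root] = len(order)
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in order:
                    order[v] = len(order)
                    queue.append(v)
    return order


def encode(adj):
    """Return (nodes in BFS order, one gap list per node)."""
    order = bfs_order(adj)
    nodes = sorted(order, key=order.get)          # nodes[i] has BFS index i
    gap_lists = []
    for u in nodes:
        ids = sorted({order[v] for v in adj.get(u, ())})
        # First entry is the absolute index; the rest are differences.
        gap_lists.append([b - a for a, b in zip([0] + ids, ids)])
    return nodes, gap_lists


def decode(nodes, gap_lists):
    """Rebuild the adjacency lists from the BFS order and the gap lists."""
    return {u: [nodes[i] for i in accumulate(gaps)]
            for u, gaps in zip(nodes, gap_lists)}


graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []}
nodes, gap_lists = encode(graph)
restored = decode(nodes, gap_lists)
```

A practical scheme would go on to store the gaps with a variable-length integer code; that step is omitted here.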