Algorithms

Editorial

Jump to: Research

2 pages, 145 KiB

Open AccessEditorial

Editorial: Special Issue on Data Compression Algorithms and Their Applications

by Philip Bille

Algorithms 2020, 13(1), 28; https://doi.org/10.3390/a13010028 - 17 Jan 2020

Viewed by 3956

Abstract

This Special Issue of Algorithms is focused on data compression algorithms and their applications. Full article

(This article belongs to the Special Issue Data Compression Algorithms and their Applications)

Research

Jump to: Editorial

18 pages, 1969 KiB

Open AccessArticle

A New Lossless DNA Compression Algorithm Based on A Single-Block Encoding Scheme

by Deloula Mansouri, Xiaohui Yuan and Abdeldjalil Saidani

Algorithms 2020, 13(4), 99; https://doi.org/10.3390/a13040099 - 20 Apr 2020

Cited by 14 | Viewed by 6753

Abstract

With the emergent evolution in DNA sequencing technology, a massive amount of genomic data is produced every day, mainly DNA sequences, craving for more storage and bandwidth. Unfortunately, managing, analyzing and specifically storing these large amounts of data become a major scientific challenge [...] Read more.

With the emergent evolution in DNA sequencing technology, a massive amount of genomic data is produced every day, mainly DNA sequences, craving for more storage and bandwidth. Unfortunately, managing, analyzing and specifically storing these large amounts of data become a major scientific challenge for bioinformatics. Therefore, to overcome these challenges, compression has become necessary. In this paper, we describe a new reference-free DNA compressor abbreviated as DNAC-SBE. DNAC-SBE is a lossless hybrid compressor that consists of three phases. First, starting from the largest base (Bi), the positions of each Bi are replaced with ones and the positions of other bases that have smaller frequencies than Bi are replaced with zeros. Second, to encode the generated streams, we propose a new single-block encoding scheme (SEB) based on the exploitation of the position of neighboring bits within the block using two different techniques. Finally, the proposed algorithm dynamically assigns the shorter length code to each block. Results show that DNAC-SBE outperforms state-of-the-art compressors and proves its efficiency in terms of special conditions imposed on compressed data, storage space and data transfer rate regardless of the file format or the size of the data. Full article

(This article belongs to the Special Issue Data Compression Algorithms and their Applications)

► Show Figures

Figure 1

21 pages, 412 KiB

Open AccessArticle

Optimal Prefix Free Codes with Partial Sorting

by Jérémy Barbay

Algorithms 2020, 13(1), 12; https://doi.org/10.3390/a13010012 - 31 Dec 2019

Cited by 3 | Viewed by 6894

Abstract

We describe an algorithm computing an optimal prefix free code for n unsorted positive weights in time within

O (n (1 + lg α)) \subseteq O (n lg n)

, where the alternation [...] Read more.

We describe an algorithm computing an optimal prefix free code for n unsorted positive weights in time within

O (n (1 + lg α)) \subseteq O (n lg n)

, where the alternation

α \in [1 . . n - 1]

approximates the minimal amount of sorting required by the computation. This asymptotical complexity is within a constant factor of the optimal in the algebraic decision tree computational model, in the worst case over all instances of size n and alternation

α

. Such results refine the state of the art complexity of

Θ (n lg n)

in the worst case over instances of size n in the same computational model, a landmark in compression and coding since 1952. Beside the new analysis technique, such improvement is obtained by combining a new algorithm, inspired by van Leeuwen’s algorithm to compute optimal prefix free codes from sorted weights (known since 1976), with a relatively minor extension of Karp et al.’s deferred data structure to partially sort a multiset accordingly to the queries performed on it (known since 1988). Preliminary experimental results on text compression by words show

α

to be polynomially smaller than n, which suggests improvements by at most a constant multiplicative factor in the running time for such applications. Full article

(This article belongs to the Special Issue Data Compression Algorithms and their Applications)

► Show Figures

Figure 1

18 pages, 1556 KiB

Open AccessArticle

Finding Patterns in Signals Using Lossy Text Compression

by Liat Rozenberg, Sagi Lotan and Dan Feldman

Algorithms 2019, 12(12), 267; https://doi.org/10.3390/a12120267 - 11 Dec 2019

Cited by 2 | Viewed by 4252

Abstract

Whether the source is autonomous car, robotic vacuum cleaner, or a quadcopter, signals from sensors tend to have some hidden patterns that repeat themselves. For example, typical GPS traces from a smartphone contain periodic trajectories such as “home, work, home, work, ⋯”. Our [...] Read more.

Whether the source is autonomous car, robotic vacuum cleaner, or a quadcopter, signals from sensors tend to have some hidden patterns that repeat themselves. For example, typical GPS traces from a smartphone contain periodic trajectories such as “home, work, home, work, ⋯”. Our goal in this study was to automatically reverse engineer such signals, identify their periodicity, and then use it to compress and de-noise these signals. To do so, we present a novel method of using algorithms from the field of pattern matching and text compression to represent the “language” in such signals. Common text compression algorithms are less tailored to handle such strings. Moreover, they are lossless, and cannot be used to recover noisy signals. To this end, we define the recursive run-length encoding (RRLE) method, which is a generalization of the well known run-length encoding (RLE) method. Then, we suggest lossy and lossless algorithms to compress and de-noise such signals. Unlike previous results, running time and optimality guarantees are proved for each algorithm. Experimental results on synthetic and real data sets are provided. We demonstrate our system by showing how it can be used to turn commercial micro air-vehicles into autonomous robots. This is by reverse engineering their unpublished communication protocols and using a laptop or on-board micro-computer to control them. Our open source code may be useful for both the community of millions of toy robots users, as well as for researchers that may extend it for further protocols. Full article

(This article belongs to the Special Issue Data Compression Algorithms and their Applications)

► Show Figures

Figure 1

26 pages, 1370 KiB

Open AccessFeature PaperArticle

Compression Challenges in Large Scale Partial Differential Equation Solvers

by Sebastian Götschel and Martin Weiser

Algorithms 2019, 12(9), 197; https://doi.org/10.3390/a12090197 - 17 Sep 2019

Cited by 5 | Viewed by 5596

Abstract

Solvers for partial differential equations (PDEs) are one of the cornerstones of computational science. For large problems, they involve huge amounts of data that need to be stored and transmitted on all levels of the memory hierarchy. Often, bandwidth is the limiting factor [...] Read more.

Solvers for partial differential equations (PDEs) are one of the cornerstones of computational science. For large problems, they involve huge amounts of data that need to be stored and transmitted on all levels of the memory hierarchy. Often, bandwidth is the limiting factor due to the relatively small arithmetic intensity, and increasingly due to the growing disparity between computing power and bandwidth. Consequently, data compression techniques have been investigated and tailored towards the specific requirements of PDE solvers over the recent decades. This paper surveys data compression challenges and discusses examples of corresponding solution approaches for PDE problems, covering all levels of the memory hierarchy from mass storage up to the main memory. We illustrate concepts for particular methods, with examples, and give references to alternatives. Full article

(This article belongs to the Special Issue Data Compression Algorithms and their Applications)

► Show Figures

Figure 1

16 pages, 1875 KiB

Open AccessArticle

Nearest Embedded and Embedding Self-Nested Trees

by Romain Azaïs

Algorithms 2019, 12(9), 180; https://doi.org/10.3390/a12090180 - 29 Aug 2019

Cited by 2 | Viewed by 3829

Abstract

Self-nested trees present a systematic form of redundancy in their subtrees and thus achieve optimal compression rates by directed acrylic graph (DAG) compression. A method for quantifying the degree of self-similarity of plants through self-nested trees was introduced by Godin and Ferraro in [...] Read more.

Self-nested trees present a systematic form of redundancy in their subtrees and thus achieve optimal compression rates by directed acrylic graph (DAG) compression. A method for quantifying the degree of self-similarity of plants through self-nested trees was introduced by Godin and Ferraro in 2010. The procedure consists of computing a self-nested approximation, called the nearest embedding self-nested tree, that both embeds the plant and is the closest to it. In this paper, we propose a new algorithm that computes the nearest embedding self-nested tree with a smaller overall complexity, but also the nearest embedded self-nested tree. We show from simulations that the latter is mostly the closest to the initial data, which suggests that this better approximation should be used as a privileged measure of the degree of self-similarity of plants. Full article

(This article belongs to the Special Issue Data Compression Algorithms and their Applications)

► Show Figures

Figure 1

16 pages, 4444 KiB

Open AccessArticle

Compaction of Church Numerals

by Isamu Furuya and Takuya Kida

Algorithms 2019, 12(8), 159; https://doi.org/10.3390/a12080159 - 8 Aug 2019

Cited by 1 | Viewed by 3946

Abstract

In this study, we address the problem of compaction of Church numerals. Church numerals are unary representations of natural numbers on the scheme of lambda terms. We propose a novel decomposition scheme from a given natural number into an arithmetic expression using tetration, [...] Read more.

In this study, we address the problem of compaction of Church numerals. Church numerals are unary representations of natural numbers on the scheme of lambda terms. We propose a novel decomposition scheme from a given natural number into an arithmetic expression using tetration, which enables us to obtain a compact representation of lambda terms that leads to the Church numeral of the natural number. For natural number n, we prove that the size of the lambda term obtained by the proposed method is

O ({({slog}_{2} n)}^{(log n / log log n)})

. Moreover, we experimentally confirmed that the proposed method outperforms binary representation of Church numerals on average, when n is less than approximately 10,000. Full article

(This article belongs to the Special Issue Data Compression Algorithms and their Applications)

► Show Figures

Figure 1

11 pages, 1681 KiB

Open AccessArticle

A New Regularized Reconstruction Algorithm Based on Compressed Sensing for the Sparse Underdetermined Problem and Applications of One-Dimensional and Two-Dimensional Signal Recovery

by Bin Wang, Li Wang, Hao Yu and Fengming Xin

Algorithms 2019, 12(7), 126; https://doi.org/10.3390/a12070126 - 26 Jun 2019

Cited by 2 | Viewed by 3912

Abstract

The compressed sensing theory has been widely used in solving undetermined equations in various fields and has made remarkable achievements. The regularized smooth L0 (ReSL0) reconstruction algorithm adds an error regularization term to the smooth L0(SL0) algorithm, achieving the reconstruction of the signal [...] Read more.

The compressed sensing theory has been widely used in solving undetermined equations in various fields and has made remarkable achievements. The regularized smooth L0 (ReSL0) reconstruction algorithm adds an error regularization term to the smooth L0(SL0) algorithm, achieving the reconstruction of the signal well in the presence of noise. However, the ReSL0 reconstruction algorithm still has some flaws. It still chooses the original optimization method of SL0 and the Gauss approximation function, but this method has the problem of a sawtooth effect in the later optimization stage, and the convergence effect is not ideal. Therefore, we make two adjustments to the basis of the ReSL0 reconstruction algorithm: firstly, we introduce another CIPF function which has a better approximation effect than Gauss function; secondly, we combine the steepest descent method and Newton method in terms of the algorithm optimization. Then, a novel regularized recovery algorithm named combined regularized smooth L0 (CReSL0) is proposed. Under the same experimental conditions, the CReSL0 algorithm is compared with other popular reconstruction algorithms. Overall, the CReSL0 algorithm achieves excellent reconstruction performance in terms of the peak signal-to-noise ratio (PSNR) and run-time for both a one-dimensional Gauss signal and two-dimensional image reconstruction tasks. Full article

(This article belongs to the Special Issue Data Compression Algorithms and their Applications)

► Show Figures

Figure 1

10 pages, 742 KiB

Open AccessArticle

Time-Universal Data Compression

by Boris Ryabko

Algorithms 2019, 12(6), 116; https://doi.org/10.3390/a12060116 - 29 May 2019

Cited by 3 | Viewed by 4236

Abstract

Nowadays, a variety of data-compressors (or archivers) is available, each of which has its merits, and it is impossible to single out the best ones. Thus, one faces the problem of choosing the best method to compress a given file, and this problem [...] Read more.

Nowadays, a variety of data-compressors (or archivers) is available, each of which has its merits, and it is impossible to single out the best ones. Thus, one faces the problem of choosing the best method to compress a given file, and this problem is more important the larger is the file. It seems natural to try all the compressors and then choose the one that gives the shortest compressed file, then transfer (or store) the index number of the best compressor (it requires

log m

bits, if m is the number of compressors available) and the compressed file. The only problem is the time, which essentially increases due to the need to compress the file m times (in order to find the best compressor). We suggest a method of data compression whose performance is close to optimal, but for which the extra time needed is relatively small: the ratio of this extra time and the total time of calculation can be limited, in an asymptotic manner, by an arbitrary positive constant. In short, the main idea of the suggested approach is as follows: in order to find the best, try all the data compressors, but, when doing so, use for compression only a small part of the file. Then apply the best data compressors to the whole file. Note that there are many situations where it may be necessary to find the best data compressor out of a given set. In such a case, it is often done by comparing compressors empirically. One of the goals of this work is to turn such a selection process into a part of the data compression method, automating and optimizing it. Full article

(This article belongs to the Special Issue Data Compression Algorithms and their Applications)

Journal Menu

Journal Browser

Data Compression Algorithms and their Applications

Share This Special Issue

Special Issue Editor

Special Issue Information

Keywords

Benefits of Publishing in a Special Issue

Published Papers (9 papers)

Editorial

Research

Further Information

Guidelines

MDPI Initiatives

Follow MDPI