Entropy 2014, 16(7), 4168-4184; doi:10.3390/e16074168
Article

Characterizing the Asymptotic Per-Symbol Redundancy of Memoryless Sources over Countable Alphabets in Terms of Single-Letter Marginals

Maryam Hosseini and Narayana Santhanam *
Received: 27 May 2014; in revised form: 24 June 2014 / Accepted: 7 July 2014 / Published: 23 July 2014
(This article belongs to the Section Information Theory)
Abstract: The minimum expected number of bits needed to describe a random variable is its entropy, assuming knowledge of the distribution of the random variable. On the other hand, universal compression describes data supposing that the underlying distribution is unknown, but that it belongs to a known set P of distributions. However, since universal descriptions are not matched exactly to the underlying distribution, the number of bits they use on average is higher, and the excess over the entropy is the redundancy. In this paper, we study the redundancy incurred by the universal description of strings of positive integers (Z+), the strings being generated independently and identically distributed (i.i.d.) according to an unknown distribution over Z+ in a known collection P. We first show that if describing a single symbol incurs finite redundancy, then P is tight, but that the converse does not always hold. If a single symbol can be described with finite worst-case regret (a more stringent formulation than the redundancy above), then it is known that describing length-n i.i.d. strings incurs only vanishing redundancy per symbol as n increases. In contrast, we show it is possible that the description of a single symbol from an unknown distribution in P incurs finite redundancy, yet the description of length-n i.i.d. strings incurs a constant (> 0) redundancy per symbol encoded. We then show a sufficient condition on single-letter marginals such that length-n i.i.d. samples incur vanishing redundancy per symbol encoded.
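The quantities named in the abstract can be written out as follows. This is a sketch using the standard textbook formulations of redundancy, worst-case regret, and tightness; the symbols q, q_n, R, R_n, \hat{R}, \gamma, and m are notation assumed here for illustration and may differ from the notation used in the full text.

```latex
% Assumed notation: \mathcal{P} is the known collection of distributions over
% \mathbb{Z}_+; q ranges over all distributions on \mathbb{Z}_+ (the "universal"
% description); p^n is the i.i.d. product of p; q_n ranges over all
% distributions on \mathbb{Z}_+^n.

% Single-letter redundancy: worst-case excess over the entropy, in expectation.
R(\mathcal{P}) \;=\; \inf_{q} \sup_{p \in \mathcal{P}}
    \sum_{x \in \mathbb{Z}_+} p(x) \log \frac{p(x)}{q(x)}
  \;=\; \inf_{q} \sup_{p \in \mathcal{P}} D(p \,\|\, q)

% Redundancy of length-n i.i.d. strings, and the per-symbol redundancy.
R_n(\mathcal{P}) \;=\; \inf_{q_n} \sup_{p \in \mathcal{P}} D(p^n \,\|\, q_n),
\qquad \text{per-symbol redundancy} \;=\; \frac{1}{n} R_n(\mathcal{P})

% Single-letter worst-case regret: the excess is taken pointwise, not in expectation,
% so \hat{R}(\mathcal{P}) \ge R(\mathcal{P}).
\hat{R}(\mathcal{P}) \;=\; \inf_{q} \sup_{p \in \mathcal{P}} \sup_{x \in \mathbb{Z}_+}
    \log \frac{p(x)}{q(x)}

% Tightness of \mathcal{P}: tails are uniformly small across the collection.
\forall \gamma > 0 \;\; \exists\, m < \infty : \quad
    \sup_{p \in \mathcal{P}} \; p\bigl(\{x : x > m\}\bigr) \;<\; \gamma
```

In this notation, the abstract's claims read: R(P) finite implies P is tight, but not conversely; finite \hat{R}(P) implies R_n(P)/n tends to 0; and finite R(P) alone does not rule out R_n(P)/n staying bounded away from 0.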
Keywords: universal compression; redundancy; large alphabets; tightness; redundancy-capacity theorem
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.



MDPI and ACS Style

Hosseini, M.; Santhanam, N. Characterizing the Asymptotic Per-Symbol Redundancy of Memoryless Sources over Countable Alphabets in Terms of Single-Letter Marginals. Entropy 2014, 16, 4168-4184.

AMA Style

Hosseini M, Santhanam N. Characterizing the Asymptotic Per-Symbol Redundancy of Memoryless Sources over Countable Alphabets in Terms of Single-Letter Marginals. Entropy. 2014; 16(7):4168-4184.

Chicago/Turabian Style

Hosseini, Maryam; Santhanam, Narayana. 2014. "Characterizing the Asymptotic Per-Symbol Redundancy of Memoryless Sources over Countable Alphabets in Terms of Single-Letter Marginals." Entropy 16, no. 7: 4168-4184.


Entropy EISSN 1099-4300 Published by MDPI AG, Basel, Switzerland