Abstract
Two correlated sources emit a pair of sequences, each of which is observed by a different encoder. Each encoder produces a rate-limited description of the sequence it observes, and the two descriptions are presented to a guessing device that repeatedly produces sequence pairs until correct. The number of guesses until correct is random, and it is required that it have a moment (of some prespecified order) that tends to one as the length of the sequences tends to infinity. The description rate pairs that allow this are characterized in terms of the Rényi entropy and the Arimoto–Rényi conditional entropy of the joint law of the sources. This solves the guessing analog of the Slepian–Wolf distributed source-coding problem. The achievability is based on random binning, which is analyzed using a technique by Rosenthal.
1. Introduction
In the Massey–Arıkan guessing problem [1,2], a random variable X is drawn from a finite set $\mathcal{X}$ according to some probability mass function (PMF) $P_X$, and it has to be determined by making guesses of the form “Is X equal to x?” until the guess is correct. The guessing order is determined by a guessing function G, which is a bijective function from $\mathcal{X}$ to $\{1, \ldots, |\mathcal{X}|\}$. Guessing according to G proceeds as follows: the first guess is the element $x \in \mathcal{X}$ satisfying $G(x) = 1$; the second guess is the element satisfying $G(x) = 2$, and so on. Consequently, $G(X)$ is the number of guesses needed to guess X. Arıkan [2] showed that for any $\rho > 0$, the $\rho$th moment of the number of guesses required by an optimal guesser to guess X is bounded by:
where $\ln(\cdot)$ denotes the natural logarithm, and $H_{\frac{1}{1+\rho}}(X)$ denotes the Rényi entropy of order $\frac{1}{1+\rho}$, which is defined in Section 3 ahead (refinements of (1) were recently derived in [3]).
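To make the setup concrete, the following sketch builds the optimal guessing order for a small PMF (guess in decreasing order of probability) and compares the resulting $\rho$th moment of the number of guesses with the exponential benchmark $2^{\rho H_{1/(1+\rho)}(X)}$ that appears in Arıkan-type bounds. The PMF, the value of ρ, and all variable names are illustrative choices and not taken from the paper, and the comparison only illustrates the role of the Rényi entropy of order $\frac{1}{1+\rho}$ in (1), not its exact constants.

```python
import numpy as np

def renyi_entropy(p, alpha):
    """Rényi entropy of order alpha (base 2), for alpha > 0, alpha != 1."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return np.log2(np.sum(p ** alpha)) / (1.0 - alpha)

def guessing_moment(p, rho):
    """rho-th moment of the number of guesses when guessing in
    decreasing order of probability (the optimal guessing order)."""
    p_sorted = np.sort(np.asarray(p, dtype=float))[::-1]
    guesses = np.arange(1, len(p_sorted) + 1, dtype=float)
    return np.sum(p_sorted * guesses ** rho)

# Illustrative PMF and moment order (not from the paper).
p = np.array([0.5, 0.25, 0.125, 0.0625, 0.03125, 0.03125])
rho = 1.0
alpha = 1.0 / (1.0 + rho)

moment = guessing_moment(p, rho)
benchmark = 2.0 ** (rho * renyi_entropy(p, alpha))
print(f"E[G*(X)^rho]         = {moment:.4f}")
print(f"2^(rho * H_alpha(X)) = {benchmark:.4f}   (alpha = 1/(1+rho))")
```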
Guessing with an encoder is depicted in Figure 1. Here, prior to guessing X, the guesser is provided some side information about X in the form of $f(X)$, where $f$ is a function taking on at most $M$ different values (“labels”). Accordingly, a guessing function is a function $G$ from $\mathcal{X} \times \{1, \ldots, M\}$ to $\{1, \ldots, |\mathcal{X}|\}$ such that for every label $m$, $G(\cdot, m)$ is bijective. If, among all encoders, $f^\star$ minimizes the $\rho$th moment of the number of guesses required by an optimal guesser to guess X after observing $f^\star(X)$, then [] (Corollary 7):
Figure 1.
Guessing with an encoder f.
Thus, in guessing a sequence of independent and identically distributed (IID) random variables, a description rate of approximately $H_{\frac{1}{1+\rho}}(X)$ bits per symbol is needed to drive the $\rho$th moment of the number of guesses to one as the sequence length tends to infinity [,] (see Section 2 for more related work).
In this paper, we generalize the single-encoder setting from Figure 1 to the setting with distributed encoders depicted in Figure 2, which is the analog of Slepian–Wolf coding [] for guessing: A source generates a sequence $(X_1, Y_1), \ldots, (X_n, Y_n)$ of pairs over a finite alphabet $\mathcal{X} \times \mathcal{Y}$. The sequence $X^n$ is described by one of $M_X$ labels and the sequence $Y^n$ by one of $M_Y$ labels using functions:
where and . Based on $f(X^n)$ and $g(Y^n)$, a guesser repeatedly produces guesses of the form “Is $(X^n, Y^n)$ equal to $(x^n, y^n)$?” until the guess is correct.
Figure 2.
Guessing with distributed encoders and .
For a fixed $\rho > 0$, a rate pair $(R_X, R_Y)$ is called achievable if there exists a sequence of encoders and guessing functions such that the $\rho$th moment of the number of guesses tends to one as n tends to infinity, i.e.,
Our main contribution is Theorem 1, which characterizes the achievable rate pairs. For a fixed $\rho > 0$, let the region $\mathcal{R}_\rho$ comprise all rate pairs $(R_X, R_Y)$ satisfying the following inequalities simultaneously:
where the Rényi entropy $H_\alpha(\cdot)$ and the Arimoto–Rényi conditional entropy $H_\alpha(\cdot \mid \cdot)$ of order $\alpha$ are both defined in Section 3 ahead, and throughout the paper, $\alpha \triangleq \frac{1}{1+\rho}$.
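For a memoryless source, the three thresholds that delimit $\mathcal{R}_\rho$ can be computed directly from the per-letter joint PMF. The sketch below does so under the reading, suggested by the comparison with the Slepian–Wolf region further on, that the relevant quantities are $H_\alpha(X|Y)$, $H_\alpha(Y|X)$, and $H_\alpha(X,Y)$ with $\alpha = 1/(1+\rho)$; the joint PMF and ρ are illustrative, and the general (non-memoryless) statement involves the joint law of the whole sequences rather than a single letter.

```python
import numpy as np

def renyi_joint(p_xy, alpha):
    """Joint Rényi entropy H_alpha(X,Y) in bits."""
    p = p_xy[p_xy > 0]
    return np.log2(np.sum(p ** alpha)) / (1.0 - alpha)

def arimoto_conditional(p_xy, alpha):
    """Arimoto–Rényi conditional entropy H_alpha(X|Y) in bits:
    alpha/(1-alpha) * log2 sum_y (sum_x P(x,y)^alpha)^(1/alpha)."""
    inner = np.sum(p_xy ** alpha, axis=0) ** (1.0 / alpha)  # one term per y
    return (alpha / (1.0 - alpha)) * np.log2(np.sum(inner))

# Illustrative per-letter joint PMF P_{XY} (rows: x, columns: y) and moment order.
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])
rho = 2.0
alpha = 1.0 / (1.0 + rho)

r_x_min = arimoto_conditional(p_xy, alpha)      # constraint on R_X
r_y_min = arimoto_conditional(p_xy.T, alpha)    # constraint on R_Y
r_sum_min = renyi_joint(p_xy, alpha)            # constraint on R_X + R_Y

print(f"R_X       >= {r_x_min:.4f} bits/symbol")
print(f"R_Y       >= {r_y_min:.4f} bits/symbol")
print(f"R_X + R_Y >= {r_sum_min:.4f} bits/symbol")
```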
Theorem 1.
For any $\rho > 0$, rate pairs in the interior of $\mathcal{R}_\rho$ are achievable, and rate pairs outside $\mathcal{R}_\rho$ are not achievable.
Proof.
The converse part follows from Corollary 1 in Section 4, and the achievability part from Corollary 2 in Section 5. ☐
The rate region defined by (10)–(12) resembles the rate region of Slepian–Wolf coding [6] (Theorem 15.4.1); the difference is that the Shannon entropy and conditional entropy are replaced by their Rényi counterparts. The two rate regions are related as follows:
Remark 1.
For memoryless sources and $\rho > 0$, the region $\mathcal{R}_\rho$ is contained in the Slepian–Wolf region. Typically, the containment is strict.
Proof.
The containment follows from the monotonicity of the Arimoto–Rényi conditional entropy in its order: (9) implies that $\alpha < 1$, so, by [] (Proposition 5), $H_\alpha(X|Y) \geq H(X|Y)$, $H_\alpha(Y|X) \geq H(Y|X)$, and $H_\alpha(X,Y) \geq H(X,Y)$. As for the strict containment, first note that the Slepian–Wolf region contains at least one rate pair $(R_X, R_Y)$ satisfying $R_X + R_Y = H(X,Y)$. Consequently, if $H_\alpha(X,Y) > H(X,Y)$, then the containment is strict. Because $H_\alpha(X,Y) > H(X,Y)$ unless $(X,Y)$ is distributed uniformly over its support [], the containment is typically strict.
The claim can also be shown operationally: The probability of error is equal to the probability that more than one guess is needed, and for every ,
where (14) follows from Markov’s inequality. Thus, the probability of error tends to zero if the $\rho$th moment of the number of guesses tends to one. ☐
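The chain (13)–(14) is not reproduced above; one way to write the Markov step, reconstructed here from the surrounding text rather than copied from the original, is the following, where $G$ denotes the number of guesses:

```latex
\Pr[\text{error}]
  = \Pr[G \geq 2]
  = \Pr\bigl[G^{\rho} - 1 \geq 2^{\rho} - 1\bigr]
  \leq \frac{\mathbb{E}\bigl[G^{\rho}\bigr] - 1}{2^{\rho} - 1},
```

where the inequality is Markov's inequality applied to the nonnegative random variable $G^\rho - 1$. Hence the probability of error indeed vanishes whenever $\mathbb{E}[G^\rho]$ tends to one.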
Despite the resemblance between (10)–(12) and the Slepian–Wolf region, there is an important difference: while Slepian–Wolf coding allows separate encoding with the same sum rate as with joint encoding, this is not necessarily true in our setting:
Remark 2.
Although the sum rate constraint (12) is the same as in single-source guessing [], separate encoding of $X^n$ and $Y^n$ may require a larger sum rate than joint encoding of $X^n$ and $Y^n$.
Proof.
If $H_\alpha(X|Y) + H_\alpha(Y|X) > H_\alpha(X,Y)$, then (10) and (11) together impose a stronger constraint on the sum rate than (12). For example, if:
and , then bits, so separate (distributed) encoding requires a sum rate exceeding bits as opposed to joint encoding, which is possible with bits (in Slepian–Wolf coding, this cannot happen because $H(X|Y) + H(Y|X) \leq H(X,Y)$). ☐
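The specific PMF and numbers of the example above are not reproduced here. As an illustration of the same phenomenon, the sketch below uses a hypothetical joint PMF (uniform over three of the four pairs in $\{0,1\} \times \{0,1\}$) and ρ = 3, for which the two single-rate constraints together exceed the joint-entropy sum-rate constraint; the choice of PMF and ρ is ours, and $\alpha = 1/(1+\rho)$ is used as defined above.

```python
import numpy as np

def renyi_joint(p_xy, alpha):
    p = p_xy[p_xy > 0]
    return np.log2(np.sum(p ** alpha)) / (1.0 - alpha)

def arimoto_conditional(p_xy, alpha):
    inner = np.sum(p_xy ** alpha, axis=0) ** (1.0 / alpha)
    return (alpha / (1.0 - alpha)) * np.log2(np.sum(inner))

# Hypothetical example (not the one from the paper): P_{XY} uniform over
# {(0,0), (0,1), (1,0)}; the pair (1,1) has probability zero.
p_xy = np.array([[1/3, 1/3],
                 [1/3, 0.0]])
rho = 3.0
alpha = 1.0 / (1.0 + rho)

h_x_given_y = arimoto_conditional(p_xy, alpha)    # H_alpha(X|Y), about 0.834 bits
h_y_given_x = arimoto_conditional(p_xy.T, alpha)  # H_alpha(Y|X), about 0.834 bits
h_xy = renyi_joint(p_xy, alpha)                   # H_alpha(X,Y) = log2(3), about 1.585 bits

# The sum of the conditional entropies (about 1.67 bits) exceeds the joint
# Rényi entropy (about 1.58 bits), so the two single-rate constraints together
# force a larger sum rate than the sum-rate constraint alone.
print(h_x_given_y + h_y_given_x, ">", h_xy)
```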
The guessing problem is related to the task-encoding problem, where based on $f(X^n)$ and $g(Y^n)$, the decoder outputs a list that is guaranteed to contain $(X^n, Y^n)$, and the $\rho$th moment of the list size is required to tend to one as n tends to infinity. While, in the single-source setting, the guessing problem and the task-encoding problem have the same asymptotics [], this is not the case in the distributed setting:
Remark 3.
For memoryless sources, the task-encoding region from [9] is strictly smaller than the guessing region $\mathcal{R}_\rho$ unless X and Y are independent.
Proof.
In the IID case, the task-encoding region is the set of all rate pairs satisfying the following inequalities [] (Theorem 1):
where is a Rényi measure of dependence studied in [] (when is one, is the mutual information). The claim now follows from the following observations: By [] (Theorem 2), with equality if and only if X and Y are independent; similarly, with equality if and only if X and Y are independent; and by [] (Theorem 2), with equality if and only if X and Y are independent. ☐
The rest of this paper is structured as follows: in Section 2, we review other guessing settings; in Section 3, we recall the Rényi information measures and prove some auxiliary lemmas; in Section 4, we prove the converse theorem; and in Section 5, we prove the achievability theorem, whose proof is based on random binning and, in the case $\rho > 1$, is analyzed using a technique by Rosenthal [11].
2. Related Work
Tighter versions of (1) can be found in [,]. The large deviation behavior of guessing was studied in [,]. The relation between guessing and variable-length lossless source coding was explored in [,,].
Mismatched guessing, where the assumed distribution of X does not match its actual distribution, was studied in [], along with guessing under source uncertainty, where the PMF of X belongs to some known set, and a guesser was sought with good worst-case performance over that set. Guessing subject to distortion, where instead of guessing X, it suffices to guess an that is close to X according to some distortion measure, was treated in [].
If the guesser observes some side information Y, then the $\rho$th moment of the number of guesses required by an optimal guesser is bounded by []:
where $H_{\frac{1}{1+\rho}}(X|Y)$ denotes the Arimoto–Rényi conditional entropy of order $\frac{1}{1+\rho}$, which is defined in Section 3 ahead (refinements of (18) were recently derived in []). Guessing is related to the cutoff rate of a discrete memoryless channel, which is the supremum over all rates for which the $\rho$th moment of the number of guesses needed by the decoder to guess the message can be driven to one as the block length tends to infinity. In [,], the cutoff rate was expressed in terms of Gallager’s $E_0$ function [20]. Joint source-channel guessing was considered in [].
Guessing with an encoder, i.e., the situation where the side information can be chosen, was studied in [], where it was also shown that guessing and task encoding [] have the same asymptotics. With distributed encoders, however, task encoding [] and guessing no longer have the same asymptotics; see Remark 3. Lower and upper bounds for guessing with a helper, i.e., an encoder that does not observe X, but has access to a random variable that is correlated with X, can be found in [].
3. Preliminaries
Throughout the paper, $\log(\cdot)$ denotes the base-two logarithm. When clear from the context, we often omit sets and subscripts; for example, we write $\sum_x$ for $\sum_{x \in \mathcal{X}}$ and $P(x)$ for $P_X(x)$. The Rényi entropy [23] of order $\alpha$ is defined for positive $\alpha$ other than one as:
$$H_\alpha(X) \triangleq \frac{1}{1-\alpha} \log \sum_x P(x)^\alpha.$$
In the limit as $\alpha$ tends to one, the Shannon entropy is recovered, i.e., $\lim_{\alpha \to 1} H_\alpha(X) = H(X)$. The Arimoto–Rényi conditional entropy [24] of order $\alpha$ is defined for positive $\alpha$ other than one as:
$$H_\alpha(X|Y) \triangleq \frac{\alpha}{1-\alpha} \log \sum_y \Bigl[\sum_x P(x,y)^\alpha\Bigr]^{\frac{1}{\alpha}}.$$
In the limit as $\alpha$ tends to one, the Shannon conditional entropy is recovered, i.e., $\lim_{\alpha \to 1} H_\alpha(X|Y) = H(X|Y)$. The properties of the Arimoto–Rényi conditional entropy were studied in [,,].
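As a numerical sanity check on these definitions, the sketch below evaluates the (standard) Rényi and Arimoto–Rényi formulas for a small joint PMF and verifies that they approach the Shannon entropy and conditional entropy as $\alpha \to 1$; the PMF is an arbitrary illustrative choice.

```python
import numpy as np

def renyi_entropy(p, alpha):
    """H_alpha = log2(sum p^alpha) / (1 - alpha), base 2."""
    p = p[p > 0]
    return np.log2(np.sum(p ** alpha)) / (1.0 - alpha)

def arimoto_conditional(p_xy, alpha):
    """H_alpha(X|Y) = alpha/(1-alpha) * log2 sum_y (sum_x P(x,y)^alpha)^(1/alpha)."""
    inner = np.sum(p_xy ** alpha, axis=0) ** (1.0 / alpha)
    return (alpha / (1.0 - alpha)) * np.log2(np.sum(inner))

def shannon_entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Illustrative joint PMF (rows: x, columns: y).
p_xy = np.array([[0.3, 0.2],
                 [0.1, 0.4]])
p_y = p_xy.sum(axis=0)
h_x_given_y_shannon = shannon_entropy(p_xy.flatten()) - shannon_entropy(p_y)

for alpha in (0.9, 0.99, 0.999):
    print(alpha,
          renyi_entropy(p_xy.flatten(), alpha),  # approaches H(X,Y) as alpha -> 1
          arimoto_conditional(p_xy, alpha))      # approaches H(X|Y) as alpha -> 1

print("Shannon limits:", shannon_entropy(p_xy.flatten()), h_x_given_y_shannon)
```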
In the rest of this section, we recall some properties of the Arimoto–Rényi conditional entropy that will be used in Section 4 (Lemmas 1–3), and we prove auxiliary results for Section 5 (Lemmas 4–7).
Lemma 1
([], Theorem 2). Let , and let be a PMF over the finite set . Then,
with equality if and only if form a Markov chain.
Lemma 2
([], Proposition 4). Let , and let be a PMF over the finite set . Then,
with equality if and only if Y is uniquely determined by X and Z.
Lemma 3
([], Theorem 3). Let , and let be a PMF over the finite set . Then,
Lemma 4
([], Problem 4.15(f)). Let be a finite set, and let . Then, for all ,
Proof.
If , then (24) holds because the left-hand side (LHS) and the right-hand side (RHS) are both zero. If , then:
where (26) holds because and for every . ☐
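The statement of Lemma 4 is not fully reproduced above. Reading it as the standard subadditivity $\bigl(\sum_i a_i\bigr)^\rho \leq \sum_i a_i^\rho$ for nonnegative $a_i$ and $\rho \in [0,1]$, which is consistent with the proof steps just given, a quick numerical check looks as follows; the randomly drawn $a_i$ and the grid of exponents are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Check (sum_i a_i)^rho <= sum_i a_i^rho for nonnegative a_i and rho in [0, 1].
for _ in range(1000):
    a = rng.exponential(size=rng.integers(1, 10))  # nonnegative reals
    for rho in np.linspace(0.0, 1.0, 11):
        assert np.sum(a) ** rho <= np.sum(a ** rho) + 1e-12
print("subadditivity check passed")
```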
Lemma 5.
Let a, b, and c be nonnegative integers. Then, for all ,
(the restriction to integers cannot be omitted; for example, (28) does not hold if and ).
Proof.
Lemma 6.
Let a, b, c, and d be nonnegative real numbers. Then, for all ,
Proof.
If , then (33) follows from Lemma 4 because . If , then:
where (35) follows from Jensen’s inequality because is convex on since . ☐
Lemma 7
(Rosenthal). Let , and let be independent random variables that are either zero or one. Then, satisfies:
Proof.
This is a special case of [11] (Lemma 1). For convenience, we also provide a self-contained proof:
where (39) holds because each is either zero or one; (41) holds because are independent; (42) holds because is increasing on for ; (44) holds because for real numbers , , and , we have ; and (46) follows from Jensen’s inequality because is concave on for .
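The exact form of the bound (36) is not reproduced above. One natural reading of the proof steps just listed, for $\rho \in (1, 2]$, is the bound $\mathbb{E}[S^\rho] \leq \mathbb{E}[S] + (\mathbb{E}[S])^\rho$ for a sum $S$ of independent $\{0,1\}$-valued random variables. The Monte Carlo sketch below checks this particular reading; the Bernoulli parameters, ρ, and the sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
rho = 1.7            # any rho in (1, 2]; arbitrary choice
n_trials = 200_000

for _ in range(20):
    p = rng.uniform(0.0, 1.0, size=rng.integers(1, 8))      # Bernoulli parameters
    samples = (rng.uniform(size=(n_trials, len(p))) < p).sum(axis=1)
    vals = samples.astype(float) ** rho
    lhs = vals.mean()                                        # Monte Carlo E[S^rho]
    se = vals.std() / np.sqrt(n_trials)                      # Monte Carlo standard error
    mu = p.sum()                                             # E[S]
    rhs = mu + mu ** rho
    assert lhs <= rhs + 4 * se, (lhs, rhs)
print("Rosenthal-type bound check passed")
```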
4. Converse
In this section, we prove a nonasymptotic and an asymptotic converse result (Theorem 2 and Corollary 1, respectively).
Theorem 2.
Let form a Markov chain over the finite set , and let . Then, for every and for every guesser, the ρth moment of the number of guesses it takes to guess the pair based on the side information satisfies:
Proof.
We view (50) as three lower bounds corresponding to the three terms in the maximization on its RHS. The lower bound involving holds because:
where (51) follows from (18) and (52) follows from Lemma 3. The lower bound involving holds because:
where (53) follows from (18); (54) follows from Lemma 1; (55) follows from Lemma 2; (56) follows from Lemma 1 because form a Markov chain; and (57) follows from Lemma 3. The lower bound involving is analogous to the one with . ☐
Corollary 1.
For any $\rho > 0$, rate pairs outside $\mathcal{R}_\rho$ are not achievable.
5. Achievability
In this section, we prove a nonasymptotic and an asymptotic achievability result (Theorem 3 and Corollary 2, respectively).
Theorem 3.
Let , , , and be finite nonempty sets; let be a PMF; let ; and let be such that:
Then, there exist functions and and a guesser such that the ρth moment of the number of guesses needed to guess the pair based on the side information satisfies:
Proof.
Our achievability result relies on random binning: we map each $x \in \mathcal{X}$ uniformly at random to some label of f and each $y \in \mathcal{Y}$ uniformly at random to some label of g. We then show that the $\rho$th moment of the number of guesses averaged over all such mappings f and g is upper bounded by the RHS of (64). From this, we conclude that there exist f and g that satisfy (64).
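A small simulation helps to visualize the binning scheme. The sketch below draws the bin assignments f and g uniformly at random, lets the guesser go through the source pairs whose bins match the observed labels in decreasing order of probability, and estimates the resulting $\rho$th moment of the number of guesses; the joint PMF, ρ, and the numbers of labels are illustrative choices, and the code works on single letters rather than length-n blocks.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative joint PMF over a small alphabet (rows: x, columns: y).
p_xy = np.array([[0.30, 0.05, 0.05],
                 [0.05, 0.30, 0.05],
                 [0.05, 0.05, 0.10]])
rho, m_x, m_y = 1.0, 2, 2          # moment order and numbers of labels

def simulate(n_trials=20_000):
    nx, ny = p_xy.shape
    flat = p_xy.flatten()
    flat = flat / flat.sum()
    order = np.argsort(-flat)                   # guess in decreasing probability
    moments = []
    for _ in range(n_trials):
        f = rng.integers(m_x, size=nx)          # random binning of the x-alphabet
        g = rng.integers(m_y, size=ny)          # random binning of the y-alphabet
        idx = rng.choice(len(flat), p=flat)     # draw the source pair (X, Y)
        x, y = divmod(idx, ny)
        guesses = 0
        for k in order:                         # guesser sees only f(x) and g(y)
            gx, gy = divmod(k, ny)
            if f[gx] == f[x] and g[gy] == g[y]:  # only pairs matching both labels
                guesses += 1
                if (gx, gy) == (x, y):
                    break
        moments.append(guesses ** rho)
    # f and g are redrawn in every trial, so the estimate also averages over
    # the random binning, as in the proof.
    return np.mean(moments)

print("estimated E[G^rho] =", simulate())
```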
Let the guessing function G correspond to guessing in decreasing order of probability [] (ties can be resolved arbitrarily). Let f and g be distributed as described above, and denote by $\mathbb{E}_{f,g}[\cdot]$ the expectation with respect to f and g. Then,
with:
where is the indicator function that is one if the condition comprising its argument is true and zero otherwise; (65) holds because and are independent; (66) holds because the number of guesses is upper bounded by the number of that are at least as likely as and that are mapped to the same labels as ; (67) follows from splitting the sum depending on whether or not and whether or not and from the fact that ; and (68) follows from Lemma 5 because , , and are nonnegative integers. As indicated in (69)–(74), the dependence of , , , , , and on x, y, f, and g is implicit in our notation.
We first treat the case $\rho \leq 1$. We bound the terms on the RHS of (68) as follows:
where (75) follows from Jensen’s inequality because is concave on since ; (76) holds because the expectation operator is linear and because since ; in (77), we extended the inner summation and used that ; and (82) follows from (61). In the same way, we obtain:
Similarly,
From (68), (82), (83), and (90), we obtain:
and hence infer the existence of and satisfying (64).
We now consider (68) when $\rho > 1$. Unlike in the case $\rho \leq 1$, we cannot use Jensen’s inequality as we did in (75). Instead, for fixed x and y, we upper-bound the first expectation on the RHS of (68) by:
where (93) follows from Lemma 7 because and because is a sum of independent random variables taking values in . By the same steps as in (76)–(82),
As to the expectation of the other term on the RHS of (94),
where (96) follows from Jensen’s inequality because is concave on since , and (97) follows from (95). From (94), (95), and (97), we obtain:
where (99) holds because since and . In the same way, we obtain for the second expectation on the RHS of (68):
Bounding , i.e., the third expectation on the RHS of (68), is more involved because is not a sum of independent random variables. Our approach builds on the ideas used by Rosenthal [] (Proof of Lemma 1); compare (47) and (48) with (108) and (123) ahead. For fixed and ,
with:
where (102) follows from splitting the sum in braces depending on whether or not and whether or not and from assuming within the braces, which does not change the value of the expression because it is multiplied by ; (104) holds because and are independent since and ; (105) holds because , , , and ; (106) follows from Lemma 6; and (107) follows from identifying , , and because and are independent, , and . As indicated in (109)–(113), the dependence of , , , , and on x, y, , , f, and g is implicit in our notation.
To bound further, we study some of the terms on the RHS of (108) separately, starting with the second, which involves the sum over . For fixed , , and ,
where (114) follows from Jensen’s inequality because and are both concave on since , and (116) follows from Lemma 7 because and because is a sum of independent random variables taking values in . This implies that for fixed and ,
where (119) follows from the definitions of and . Similarly, for the third term on the RHS of (108),
With the help of (119) and (120), we now go back to (108) and argue that it implies that for fixed and ,
To prove this, we consider four cases depending on which term on the RHS of (108) achieves the maximum: If achieves the maximum, then (121) holds because . If the LHS of (118) achieves the maximum, then (121) follows from (119) because . If the LHS of (120) achieves the maximum, then (121) follows similarly. Finally, if achieves the maximum, then:
where (123) follows from Jensen’s inequality because is concave on for . Rearranging (123), we obtain:
so (121) holds also in this case.
Having established (121), we now take the expectation of its sides to obtain:
We now study the terms on the RHS of (125) separately, starting with the fourth (last). By (85)–(90), which hold also if ,
As for the first term on the RHS of (125),
which follows from (126) in the same way as (97) followed from (95). As for the second term on the RHS of (125),
where in (129), we extended the inner summations and used that ; (131) follows from Hölder’s inequality; and (133) follows from (89)–(90) and (81)–(82). In the same way, we obtain for the third term on the RHS of (125):
Corollary 2.
For any $\rho > 0$, rate pairs in the interior of $\mathcal{R}_\rho$ are achievable.
Proof.
Let be in the interior of . Then, (6)–(8) hold with strict inequalities, and there exists a such that for all sufficiently large n,
Using Theorem 3 with , , , , , and shows that, for all sufficiently large n, there exist encoders and and a guessing function satisfying:
Because tends to infinity as n tends to infinity, the RHS of (143) tends to one as n tends to infinity, which implies that the rate pair is achievable. ☐
Author Contributions
Writing—original draft preparation, A.B., A.L. and C.P.; writing—review and editing, A.B., A.L. and C.P.
Funding
This research received no external funding.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Massey, J.L. Guessing and entropy. In Proceedings of the 1994 IEEE International Symposium on Information Theory (ISIT), Trondheim, Norway, 27 June–1 July 1994; p. 204. [Google Scholar] [CrossRef]
- Arıkan, E. An inequality on guessing and its application to sequential decoding. IEEE Trans. Inf. Theory 1996, 42, 99–105. [Google Scholar] [CrossRef]
- Sason, I.; Verdú, S. Improved bounds on lossless source coding and guessing moments via Rényi measures. IEEE Trans. Inf. Theory 2018, 64, 4323–4346. [Google Scholar] [CrossRef]
- Bracher, A.; Hof, E.; Lapidoth, A. Guessing attacks on distributed-storage systems. arXiv, 2017; arXiv:1701.01981v1. [Google Scholar]
- Graczyk, R.; Lapidoth, A. Variations on the guessing problem. In Proceedings of the 2018 IEEE International Symposium on Information Theory (ISIT), Vail, CO, USA, 17–22 June 2018; pp. 231–235. [Google Scholar] [CrossRef]
- Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2006; ISBN 978-0-471-24195-9. [Google Scholar]
- Fehr, S.; Berens, S. On the conditional Rényi entropy. IEEE Trans. Inf. Theory 2014, 60, 6801–6810. [Google Scholar] [CrossRef]
- Csiszár, I. Generalized cutoff rates and Rényi’s information measures. IEEE Trans. Inf. Theory 1995, 41, 26–34. [Google Scholar] [CrossRef]
- Bracher, A.; Lapidoth, A.; Pfister, C. Distributed task encoding. In Proceedings of the 2017 IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, 25–30 June 2017; pp. 1993–1997. [Google Scholar] [CrossRef]
- Lapidoth, A.; Pfister, C. Two measures of dependence. In Proceedings of the 2016 IEEE International Conference on the Science of Electrical Engineering (ICSEE), Eilat, Israel, 16–18 November 2016; pp. 1–5. [Google Scholar] [CrossRef]
- Rosenthal, H.P. On the subspaces of Lp (p > 2) spanned by sequences of independent random variables. Isr. J. Math. 1970, 8, 273–303. [Google Scholar] [CrossRef]
- Boztaş, S. Comments on “An inequality on guessing and its application to sequential decoding”. IEEE Trans. Inf. Theory 1997, 43, 2062–2063. [Google Scholar] [CrossRef]
- Hanawal, M.K.; Sundaresan, R. Guessing revisited: A large deviations approach. IEEE Trans. Inf. Theory 2011, 57, 70–78. [Google Scholar] [CrossRef]
- Christiansen, M.M.; Duffy, K.R. Guesswork, large deviations, and Shannon entropy. IEEE Trans. Inf. Theory 2013, 59, 796–802. [Google Scholar] [CrossRef]
- Sundaresan, R. Guessing based on length functions. In Proceedings of the 2007 IEEE International Symposium on Information Theory (ISIT), Nice, France, 24–29 June 2007; pp. 716–719. [Google Scholar] [CrossRef]
- Sason, I. Tight bounds on the Rényi entropy via majorization with applications to guessing and compression. Entropy 2018, 20, 896. [Google Scholar] [CrossRef]
- Sundaresan, R. Guessing under source uncertainty. IEEE Trans. Inf. Theory 2007, 53, 269–287. [Google Scholar] [CrossRef]
- Arıkan, E.; Merhav, N. Guessing subject to distortion. IEEE Trans. Inf. Theory 1998, 44, 1041–1056. [Google Scholar] [CrossRef]
- Bunte, C.; Lapidoth, A. On the listsize capacity with feedback. IEEE Trans. Inf. Theory 2014, 60, 6733–6748. [Google Scholar] [CrossRef]
- Gallager, R.G. Information Theory and Reliable Communication; John Wiley & Sons: Hoboken, NJ, USA, 1968; ISBN 0-471-29048-3. [Google Scholar]
- Arıkan, E.; Merhav, N. Joint source-channel coding and guessing with application to sequential decoding. IEEE Trans. Inf. Theory 1998, 44, 1756–1769. [Google Scholar] [CrossRef]
- Bunte, C.; Lapidoth, A. Encoding tasks and Rényi entropy. IEEE Trans. Inf. Theory 2014, 60, 5065–5076. [Google Scholar] [CrossRef]
- Rényi, A. On measures of entropy and information. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA, 20 June–30 July 1960; Volume 1, pp. 547–561. [Google Scholar]
- Arimoto, S. Information measures and capacity of order α for discrete memoryless channels. In Topics in Information Theory; Csiszár, I., Elias, P., Eds.; North-Holland Publishing Company: Amsterdam, The Netherlands, 1977; pp. 41–52. ISBN 0-7204-0699-4. [Google Scholar]
- Sason, I.; Verdú, S. Arimoto–Rényi conditional entropy and Bayesian M-Ary hypothesis testing. IEEE Trans. Inf. Theory 2018, 64, 4–25. [Google Scholar] [CrossRef]
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).