Are Guessing, Source Coding and Tasks Partitioning Birds of a Feather?

This paper establishes a close relationship among four information-theoretic problems, namely Campbell's source coding, Arıkan's guessing, Huleihel et al.'s memoryless guessing, and Bunte–Lapidoth's tasks partitioning, in the IID lossless case. We first show that the aforementioned problems are mathematically related via a general moment minimization problem whose optimum solution is given in terms of Rényi entropy. We then propose a general framework for the mismatched version of these problems and establish all the asymptotic results using this framework. The unified framework further enables us to study a variant of Bunte–Lapidoth's tasks partitioning problem which is practically more appealing. In addition, this variant turns out to be a generalization of Arıkan's guessing problem. Finally, with the help of this general framework, we establish an equivalence among all these problems, in the sense that knowing an asymptotically optimal solution in one problem helps us find the same in all the others.


I. INTRODUCTION
The concept of entropy is central to information theory. For example, the number of bits required on average (per letter) to compress a source with (finite) alphabet set X and probability distribution P is the Shannon entropy H(P). If the compressor does not know the true distribution P, but assumes a distribution Q (mismatch), then the number of bits required for compression is H(P) + I(P, Q), where I(P, Q) is the entropy of P relative to Q (or the Kullback-Leibler divergence). In his seminal paper, Shannon [1] argued that H(P) can also be regarded as a measure of uncertainty. Subsequently, Rényi [2] introduced an alternative measure of uncertainty, now known as Rényi entropy of order α, defined as H_α(P) := (1/(1−α)) log Σ_{x∈X} P(x)^α, where α > 0 and α ≠ 1. Rényi entropy can also be regarded as a generalization of the Shannon entropy, since lim_{α→1} H_α(P) = H(P). See Aczél and Daróczy [3] and the references therein for an extensive study of various measures of uncertainty and their characterizations.
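As a quick numerical illustration of the definition above (a sketch, not part of the original development; the distribution P below is arbitrary), the following snippet computes H_α(P) in nats and checks the α → 1 limit against the Shannon entropy:

```python
import math

def renyi_entropy(P, alpha):
    """H_alpha(P) = (1/(1-alpha)) * log sum_x P(x)^alpha, for alpha > 0, alpha != 1."""
    return math.log(sum(p ** alpha for p in P)) / (1 - alpha)

def shannon_entropy(P):
    """H(P) = -sum_x P(x) log P(x), in nats."""
    return -sum(p * math.log(p) for p in P if p > 0)

P = [0.5, 0.25, 0.125, 0.125]
# As alpha -> 1, the Renyi entropy converges to the Shannon entropy.
for a in (0.5, 0.9, 0.999, 1.001, 2.0):
    print(a, renyi_entropy(P, a))
print("Shannon:", shannon_entropy(P))
```

For the uniform distribution on m letters, every H_α equals log m, independent of α, as expected of any reasonable uncertainty measure.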
In 1965, Campbell [4] gave an operational meaning to Rényi entropy. He showed that if one minimizes the cumulants of code lengths, then the optimal cumulant is the Rényi entropy H_α(P). He also showed that the optimal cumulant can be achieved by encoding sufficiently long sequences of symbols. Sundaresan [5, Th. 8] (cf. Blumer and McEliece [6]) showed that in the mismatched case, the optimal cumulant is H_α(P) + I_α(P, Q), where I_α(P, Q) is the α-entropy of P relative to Q, or Sundaresan's divergence [7]. I_α(P, Q) ≥ 0, with equality if and only if P = Q. Hence I_α(P, Q) can be interpreted as the penalty for not knowing the true distribution. The I_α-divergence can also be regarded as a generalization of the Kullback-Leibler divergence, since lim_{α→1} I_α(P, Q) = I(P, Q). See [5], [8] for detailed discussions on the properties of I_α. Lutwak et al. also independently identified I_α in the context of maximum Rényi entropy and called it the α-Rényi relative entropy [9, Eq. (4)]. I_α, for α > 1, also arises in robust inference problems (see [10] and the references therein).
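The displayed definition of I_α did not survive extraction here. The standard discrete form of the relative α-entropy (cf. [7], [9]) can be written and sanity-checked numerically as follows — a sketch, with the distributions chosen arbitrarily and the formula stated under the assumption that it matches the form used in [7]:

```python
import math

def I_alpha(P, Q, alpha):
    """Relative alpha-entropy (Sundaresan's divergence), alpha > 0, alpha != 1.
    Zero iff P == Q; reduces to the Kullback-Leibler divergence as alpha -> 1."""
    s_pq = sum(p * q ** (alpha - 1) for p, q in zip(P, Q))
    s_p = sum(p ** alpha for p in P)
    s_q = sum(q ** alpha for q in Q)
    return (alpha / (1 - alpha)) * math.log(s_pq) \
        - math.log(s_p) / (1 - alpha) + math.log(s_q)

def kl(P, Q):
    """Kullback-Leibler divergence I(P, Q) in nats."""
    return sum(p * math.log(p / q) for p, q in zip(P, Q) if p > 0)

P, Q = [0.7, 0.3], [0.5, 0.5]
print(I_alpha(P, P, 0.5))              # 0: no penalty when there is no mismatch
print(I_alpha(P, Q, 0.999), kl(P, Q))  # close: I_alpha -> KL as alpha -> 1
```

A one-line Taylor expansion of each logarithm around α = 1 confirms the KL limit analytically.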
In [11], Massey studied a guessing problem in which one is interested in the expected number of guesses required to guess a random variable X that assumes values from an infinite set, and found a lower bound in terms of Shannon entropy. Arıkan [12] studied a guessing problem on finite alphabet sets and showed that Rényi entropy arises as the optimal solution when minimizing moments of the number of guesses. Subsequently, Sundaresan [5] showed that the penalty in guessing according to a distribution Q, when the true distribution is P, is given by I_α(P, Q). Bunte and Lapidoth [13] studied a problem of partitioning tasks and showed that Rényi entropy and Sundaresan's divergence play a similar role in the optimal number of tasks performed. Quite recently, Huleihel et al. [14] studied the memoryless guessing problem, a variant of Arıkan's guessing problem with i.i.d. (independent and identically distributed) guesses, and showed that the minimum attainable factorial moment of the number of guesses is again characterized by Rényi entropy.

A. Our Contribution
On studying the aforementioned four problems, namely Campbell's source coding, Arıkan's guessing, Huleihel et al.'s memoryless guessing, and Bunte and Lapidoth's tasks partitioning, we observe a close relationship among them. In all these problems, the objective is to minimize moments or factorial moments of random variables, and Rényi entropy and Sundaresan's divergence arise in the optimal solutions. This motivated us to seek out the mathematical and operational commonalities that exist in these problems. The main contributions of this paper are as follows.
• A general framework for the problems on source coding, guessing and tasks partitioning.
• A unified approach to derive bounds for the mismatched version of these problems.
• A generalized tasks partitioning problem.
• Establishing operational commonality among the problems.
The remainder of the paper is organized as follows. In Section II, we present our unified framework, and find conditions under which lower and upper bounds are attained. In Section III, we present four well-known information-theoretic problems, namely, Campbell's source coding, Arıkan's guessing, Huleihel et al.'s memoryless guessing, and Bunte and Lapidoth's tasks partitioning, and re-establish and refine major results pertaining to these problems. In Section IV, we present and solve a generalized tasks partitioning problem. In Section V, we establish a connection among the aforementioned problems. Finally, we summarize and conclude the paper in Section VI.

December 29, 2020 DRAFT

II. A GENERAL MINIMIZATION PROBLEM
In this section, we present a general minimization problem whose optimum solution evaluates to Rényi entropy. We will later show that all the problems stated in Section III are particular instances of this general problem.
The left-hand side of (3) is called the normalized cumulant of ψ(X) of order ρ. The measure P^(α)(x) := P(x)^α / Z_{P,α} in (4) that attains the lower bound is called the α-scaled measure or escort measure of P. This measure also arises in robust inference [10, Eq. (7)] and statistical physics [16]. The above proposition can also be proved using a variational formula as follows.
By a version of the Donsker-Varadhan variational formula [17, Proposition 4.5.1], for any real-valued f on X, we have …, where the max is over all probability distributions Q on X. Taking ρ > 0 and f(x) = ρ log ψ(x) in (5), we have …, where (a) is by the log-sum inequality [15, Eq. (4.1)] and (b) is by applying the constraint Σ_x ψ(x)^{−1} ≤ k. For ρ ∈ (−1, 0), the inequalities in (a) and (b) are reversed and the last max is replaced by a min. Hence, (3) follows, as the last max is equal to H_α(P) by [18, Th. 1]. Equality in (a) and (b) holds if and only if ψ(x)^{−1} = k · Q(x). Also, the last max is attained when Q(x) = P(x)^α / Z_{P,α} for x ∈ X. This completes the proof.
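The escort measure P^(α)(x) = P(x)^α / Z_{P,α} is simple to compute. The following sketch (distribution chosen arbitrarily) shows how α < 1 flattens P while α > 1 sharpens it:

```python
def escort(P, alpha):
    """alpha-scaled (escort) measure: P_alpha(x) = P(x)^alpha / Z, Z = sum_x P(x)^alpha."""
    Z = sum(p ** alpha for p in P)
    return [p ** alpha / Z for p in P]

P = [0.5, 0.3, 0.2]
print(escort(P, 0.5))   # flatter than P: mass moves toward low-probability letters
print(escort(P, 2.0))   # more peaked than P: mass concentrates on the mode
```

At α = 1 the escort measure is P itself, consistent with the Shannon case of the proposition.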
The following is the analogous result for Shannon entropy.
The lower bound is achieved if and only if ψ(x)^{−1} = k · P(x) for all x ∈ X. Proof: …, where the penultimate inequality is due to the log-sum inequality. Equality holds in both inequalities if and only if ψ(x)^{−1} = k · P(x). Also, H_α(P) → H(P) as ρ → 0 in (3). We now extend Propositions 1 and 2 to sequences of random variables. Let X^n be the set of all n-length sequences of elements of X, and let P_n be the n-fold product distribution of P on X^n, that is, P_n(x^n) := Π_{i=1}^{n} P(x_i) for x^n = (x_1, …, x_n) ∈ X^n. Here E_{P_n}[·] denotes the expectation with respect to the probability distribution P_n on X^n.
Proof: It is easy to see that H_α(P_n) = nH_α(P) and H(P_n) = nH(P). Applying Propositions 1 and 2, dividing throughout by n, and taking lim inf as n → ∞, the results follow.
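The identity H_α(P_n) = nH_α(P) used in the proof can be checked directly on a small example (a sketch; the alphabet and n are arbitrary):

```python
import math
from itertools import product

def renyi(P, alpha):
    """Renyi entropy of order alpha, in nats."""
    return math.log(sum(p ** alpha for p in P)) / (1 - alpha)

def product_dist(P, n):
    """n-fold product distribution P_n on X^n, enumerated explicitly."""
    return [math.prod(t) for t in product(P, repeat=n)]

P, n = [0.6, 0.4], 3
print(renyi(product_dist(P, n), 0.5), n * renyi(P, 0.5))  # equal: additivity on i.i.d. products
```

Additivity follows because Σ_{x^n} P_n(x^n)^α factorizes as (Σ_x P(x)^α)^n.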

A. A General Framework for Mismatched Cases
In this subsection, we establish a unified approach for cases in which there is a mismatch between the assumed and true distributions.
Proposition 3: Let ρ > −1, α = 1/(1 + ρ), and let Q be a probability distribution on X. For n ≥ 1, let Q_n be the n-fold product distribution of Q on X^n. If … for some c_n > 0, then (a) for ρ ≠ 0, … Proof: Part (a): From (7), we have …, where the penultimate equality follows from the definition of I_α, and the last one holds because H_α(P_n) = nH_α(P). Part (b): Taking logarithms, dividing throughout by nρ, and then applying lim sup successively on both sides of (8), the result follows.
Part (c): When ρ = 0, we have α = 1 and (7) becomes …, where the last equality holds because H(P_n) = nH(P). Part (d): Dividing (9) throughout by n, and taking lim sup on both sides, the result follows.

III. PROBLEM STATEMENTS AND KNOWN RESULTS
In this section, we discuss Campbell's source coding problem, Arıkan's guessing problem, Huleihel et al.'s memoryless guessing problem, and Bunte and Lapidoth's tasks partitioning problem. Using the general framework presented in the previous section, we re-establish known results and present a few new results relating to these problems.

A. Source Coding Problem
Let X be a random variable that assumes values from a finite alphabet set X = {a_1, …, a_m} according to a probability distribution P. The tuple (X, P) is usually referred to as a source. A binary code C is a mapping from X to the set of finite-length binary strings. Let L(C(X)) be the length of the codeword C(X). The objective is to find a uniquely decodable code that minimizes the expected code length, that is, to minimize E_P[L(C(X))] over all uniquely decodable codes C. Kraft and McMillan independently proved the following relation between uniquely decodable codes and their code lengths [19]:

Kraft–McMillan Theorem
Conversely, given a length sequence that satisfies the above inequality, there exists a uniquely decodable code C with the given length sequence.
Thus, one can confine the search space for C to codes satisfying the Kraft-McMillan inequality (10).
Define ψ(x) := 2^{L(C(x))}, where L(C(x)) is the length of the codeword C(x) assigned to the letter x. Since C is uniquely decodable, from (10), we have Σ_{x∈X} ψ(x)^{−1} ≤ 1. Now, an application of Proposition 2 with k = 1 yields the desired result.
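To make the role of the Kraft–McMillan inequality concrete, here is a sketch (with an arbitrary source) using the classical length assignment L(x) = ⌈−log₂ P(x)⌉, which satisfies (10) and gets within one bit of H(P); this assignment is illustrative, not claimed optimal for the cumulant objective:

```python
import math

def shannon_lengths(P):
    """L(x) = ceil(-log2 P(x)); these lengths satisfy the Kraft-McMillan inequality."""
    return [math.ceil(-math.log2(p)) for p in P]

P = [0.5, 0.25, 0.15, 0.10]
L = shannon_lengths(P)
kraft = sum(2 ** -l for l in L)                  # <= 1, so a uniquely decodable code exists
avg = sum(p * l for p, l in zip(P, L))           # expected code length
H = -sum(p * math.log2(p) for p in P)            # Shannon entropy in bits
print(L, kraft, avg, H)                          # H <= avg < H + 1
```

The lower bound H ≤ avg is exactly the content of Proposition 2 with k = 1.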
Theorem 1: Let X^n := (X_1, …, X_n) be an i.i.d. sequence following the product distribution P_n on X^n. … we have … An application of Proposition 3 with c_n = 2 yields lim sup_{n→∞} E_{P_n}[L(C_n(X^n))]/n ≤ H(P) + I(P, Q). Further, we also have …

B. Campbell Coding Problem
In Campbell's coding problem, the setup is identical to that of Shannon's source coding problem. However, instead of minimizing the expected code length, we are interested in minimizing the normalized cumulant of code lengths, that is, … over all uniquely decodable codes C, where ρ > 0. A similar problem was considered by Humblet in [20] for minimizing the buffer overflow probability. A lower bound for the normalized cumulants in terms of Rényi entropy was provided by Campbell [4].
Notice that, if we ignore the integer constraint on the length function, then the length assignment L(x) = −log(P(x)^α / Z_{P,α}), with Z_{P,α} as in Proposition 1, satisfies (10) and achieves the lower bound in (11). Campbell also showed that the lower bound in (11) can be achieved by encoding long sequences of symbols with code lengths close to (12). Then, from (13), we have … The result follows by applying Proposition 3 and Proposition 4 with c_n = 2, a_n = 1 and Q = P.
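Ignoring the integer constraint, the escort-based lengths L(x) = −log₂(P(x)^α / Z_{P,α}) make the normalized cumulant equal to H_α(P) exactly. The following sketch (arbitrary source, ρ = 1, logarithms base 2) verifies this:

```python
import math

def campbell_ideal_lengths(P, rho):
    """Ideal (non-integer) lengths: L(x) = -log2 of the escort measure, with a = 1/(1+rho)."""
    a = 1 / (1 + rho)
    Z = sum(p ** a for p in P)
    return [-math.log2(p ** a / Z) for p in P]

def normalized_cumulant(P, L, rho):
    """(1/rho) * log2 E_P[2^{rho * L(X)}]."""
    return math.log2(sum(p * 2 ** (rho * l) for p, l in zip(P, L))) / rho

def renyi2(P, a):
    """Renyi entropy of order a, in bits."""
    return math.log2(sum(p ** a for p in P)) / (1 - a)

P, rho = [0.6, 0.3, 0.1], 1.0
L = campbell_ideal_lengths(P, rho)
print(normalized_cumulant(P, L, rho), renyi2(P, 1 / (1 + rho)))  # equal
```

A short calculation shows E_P[2^{ρL(X)}] = Z_{P,α}^{1+ρ}, from which equality with H_α(P) follows for every ρ > 0.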

Mismatch Case:
Redundancy in the mismatched case of Campbell's problem was studied in [5], [6]. Sundaresan showed that the gap between the normalized cumulant of an arbitrary uniquely decodable code and the minimum is measured by the I_α-divergence, up to an additive constant of at most 1 [5]. We generalize this idea as follows.
Proposition 5: Let X be a random variable that assumes values from set X according to a probability distribution P .
We note that the bound in (15) can be loose when η is small. For example, for a source with two symbols, say x and y, with code lengths L(x) = L(y) = 100, we have R_c(P, L, ρ) ≥ I_α(P, Q_L) + 98. However, if one imposes the …, then I_α(P, Q_L) is, in a sense, the penalty when Q_L does not match the true distribution P. In view of this, a result analogous to Proposition 5 also holds for the Shannon source coding problem.

C. Arıkan's Guessing Problem
Let X be a set of objects with |X| = m. Bob thinks of an object X (a random variable) from X according to a probability distribution P. Alice guesses it by asking questions of the form "is X = x?". The objective is to minimize the average number of guesses required to correctly guess X. By a guessing strategy (or guessing function), we mean a one-to-one map G : X → {1, …, m}, where G(x) is to be interpreted as the number of questions required to guess x correctly. Arıkan studied the ρ-th moment of the number of guesses and found upper and lower bounds in terms of Rényi entropy.
Theorem 1 of [12]: Let G be any guessing function. Then, for ρ ∈ (−1, 0) ∪ (0, ∞), … Proof: Let G be any guessing function and let ψ(x) = G(x). Then, we have … Proposition 4 of [12]: If G* is an optimal guessing function, then for ρ ∈ (−1, 0) ∪ (0, ∞), … Proof: Let us rearrange the probabilities {P(x), x ∈ X} in non-increasing order, say p_1 ≥ p_2 ≥ ⋯ ≥ p_m. Then, the optimal guessing function G* is given by G*(x) = i if P(x) = p_i. Let us index the elements of X as {x_1, x_2, …, x_m} according to the decreasing order of their probabilities. Then, for i ∈ {1, …, m}, we have … That is, G*(x) ≤ Z_{P,α} / P(x)^α for x ∈ X. Now, an application of Proposition 3 with n = 1, … yields the result. Arıkan also proved that the upper bound in terms of Rényi entropy can be achieved by guessing long sequences of letters in an i.i.d. fashion.
Henceforth, we shall denote the optimal guessing function corresponding to a probability distribution P by G_P.
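Arıkan's bounds are easy to probe numerically. The sketch below (arbitrary distribution, ρ = 1) computes E[G_P(X)^ρ] for the optimal guesser — guess in decreasing order of probability — and compares it with the Rényi-entropy bounds exp{ρH_α(P)} and (1 + ln m)^{−ρ} exp{ρH_α(P)}, written here directly via Z_{P,α}:

```python
import math

def optimal_guessing_moment(P, rho):
    """E[G*(X)^rho] for the optimal guesser: guess letters in decreasing order of probability."""
    ranked = sorted(P, reverse=True)
    return sum(p * (i + 1) ** rho for i, p in enumerate(ranked))

P, rho = [0.5, 0.2, 0.15, 0.1, 0.05], 1.0
a = 1 / (1 + rho)
Z = sum(p ** a for p in P)
upper = Z ** (1 + rho)                              # = exp(rho * H_a(P))
lower = upper / (1 + math.log(len(P))) ** rho       # Arikan's lower bound
m = optimal_guessing_moment(P, rho)
print(lower, m, upper)                              # lower <= m <= upper
```

The more skewed P is, the further the optimal moment falls below m, the worst case attained by the uniform distribution.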

Mismatch Case:
Suppose Alice does not know the true underlying probability distribution P, and guesses according to some guessing function G. The following proposition tells us that the penalty for deviating from the optimal guessing function can be measured by the I_α-divergence.
Proposition 6: Let G be an arbitrary guessing function. Then, for ρ ∈ (−1, 0) ∪ (0, ∞), there exists a probability distribution Q_G on X such that … Proof: Let G be a guessing function. Define a probability distribution Q_G on X as … Then, we have … Now, an application of Proposition 4 with n = 1, ψ_1(x) = G(x), and a_1 = 1/(1 + ln m) yields the desired result.
A converse result is the following.
Proof: Let us rearrange the probabilities (Q(x), x ∈ X) in non-increasing order, say … Observe that, given a guessing function G, if we apply the above proposition to Q = Q_G, where Q_G is as in (21), then we get … Thus, the above two propositions can be combined to state the following, which is analogous to Proposition 5 (see Section III-B).
Theorem 6 of [5]: Let G be an arbitrary guessing function and let G_P be the optimal guessing function for P. For …, there exists a probability distribution Q_G such that …

D. Memoryless Guessing
In memoryless guessing, the setup is similar to that of Arıkan's guessing problem, except that this time the guesser, Alice, comes up with guesses independent of her previous guesses. Let X̂_1, X̂_2, … be Alice's sequence of independent guesses drawn according to a distribution P̂. The guessing function in this problem is defined as …, that is, the number of guesses until a successful guess. Sundaresan [21], inspired by Arıkan's result, showed that the minimum expected number of guesses required is exp{H_{1/2}(P)}, and the distribution that achieves this is, surprisingly, not the underlying distribution P, but the "tilted distribution" P̂*(x) := √P(x) / Σ_y √P(y).
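The closed form above can be checked directly: if guesses are i.i.d. from P̂, the number of guesses until x is hit is geometric with mean 1/P̂(x), so the expected count is Σ_x P(x)/P̂(x), minimized by the tilted distribution. A sketch with an arbitrary P:

```python
import math

def expected_memoryless_guesses(P, Phat):
    """Guesses are i.i.d. from Phat; the count until X = x is geometric with mean 1/Phat(x)."""
    return sum(p / q for p, q in zip(P, Phat))

P = [0.7, 0.2, 0.1]
s = sum(math.sqrt(p) for p in P)
tilted = [math.sqrt(p) / s for p in P]           # P*(x) proportional to sqrt(P(x))
print(expected_memoryless_guesses(P, tilted))    # equals s**2 = exp(H_{1/2}(P))
print(expected_memoryless_guesses(P, P))         # guessing with P itself is strictly worse here
```

Plugging in the tilted distribution gives Σ_x P(x) · s/√P(x) = s Σ_x √P(x) = s², which is exp{H_{1/2}(P)} since H_{1/2}(P) = 2 log Σ_x √P(x).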
Unlike in Arıkan's guessing problem, Huleihel et al. [14] minimized what are called factorial moments of the number of guesses, defined as … In particular, Huleihel et al. [14] studied the following problem.
Minimize E_P[V_{P̂,ρ}(X)] over all P̂ ∈ P, where P is the probability simplex, that is, P = {(P̂(x))_{x∈X} : P̂(x) ≥ 0, Σ_x P̂(x) = 1}. Let P̂* be the optimal solution of the above problem.
For a sequence of random guesses, the above theorem can be stated in the following way. Let X̂^n = (X̂_1, …, X̂_n), where the X̂_i's are i.i.d. guesses drawn from X^n with distribution P̂_n, the n-fold product distribution of P̂ on X^n. If the true underlying distribution is P_n, then …, where P̂*_n(x) = P_n(x)^α / Z_{P_n,α}. For the mismatched case, we have the following result. Proposition 7: If the true underlying probability distribution is P, but Alice assumes it is Q and guesses according to the corresponding optimal distribution, namely … Proof: Due to (22), the result follows easily by taking n = 1 in Propositions 3 and 4.

E. Tasks Partitioning Problem
The encoding-of-tasks problem studied by Bunte and Lapidoth [13] can be phrased in the following way. Let X be a finite set of tasks. A task X is randomly drawn from X according to a probability distribution P, which may correspond to the frequency of occurrence of tasks. Suppose these tasks are associated with M keys. Typically, M < |X|. Due to the limited availability of keys, more than one task may be associated with a single key. When a task needs to be performed, the key associated with it is pressed. Consequently, all tasks associated with this key will be performed. The objective in this problem is to minimize the number of redundant tasks performed. Usual coding techniques suggest assigning high-probability tasks to individual keys and leaving the low-probability tasks unassigned. But for an individual, all tasks can be equally important; it may just be the case that some tasks have a higher frequency of occurrence than others. If M ≥ |X|, then one can perform the tasks without any redundancy. However, Bunte and Lapidoth [13] showed that, even when M < |X|, one can accomplish the tasks with much less redundancy on average, provided the underlying probability distribution is different from the uniform distribution.
Let A = {A_1, A_2, …, A_M} be a partition of X that corresponds to the assignment of tasks to M keys. Let A(x) be the cardinality of the cell of the partition containing x. We shall call A(·) the partition function associated with the partition A.
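A partition function and its moment E_P[A(X)^ρ] are easy to compute. The following sketch uses a hypothetical assignment of four tasks to three keys (tasks, probabilities, and key labels are made up for illustration):

```python
def partition_moment(P, labels, rho):
    """E_P[A(X)^rho], where A(x) is the size of the cell (key) that task x is assigned to."""
    sizes = {}
    for lab in labels:
        sizes[lab] = sizes.get(lab, 0) + 1
    return sum(p * sizes[lab] ** rho for p, lab in zip(P, labels))

# Four tasks, three keys: the two most likely tasks get their own keys,
# the two rare tasks share key 2 (a hypothetical assignment).
P = [0.5, 0.3, 0.1, 0.1]
labels = [0, 1, 2, 2]
print(partition_moment(P, labels, rho=1.0))
```

Here pressing key 2 performs two tasks, so the expected number of tasks performed is 0.5·1 + 0.3·1 + 0.2·2 = 1.2 rather than 1.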

Theorem I.1 of [13]:
The following results hold.
(a) Let ρ ∈ (−1, 0) ∪ (0, ∞). For any partition of X of size M with partition function A, we have …
(b) If …, then there exists a partition of X of size at most M with partition function A such that …, where …

Part (b):
For the proof of this part, we refer to [13].
Bunte and Lapidoth also proved the following limit results.
Theorem I.2 of [13]: Let ρ > 0. Then, for every n ≥ 1, there exists a partition A_n of X^n of size at most M^n with associated partition function A_n such that …, where X^n := (X_1, …, X_n).
It should be noted that, in a general setup of the tasks partitioning problem, it is not necessary that the partition size be of the form M^n; it can be some M_n (a function of n). Consequently, we have the following result.
Proof: Let … We first claim that lim_{n→∞} (log M_n)/n = γ. Indeed, since …, when γ = 0, we have lim_{n→∞} (log M_n)/n = 0. On the other hand, when γ > 0, we can find an n_γ such that M_n ≥ 2^{γn/2} for all n ≥ n_γ. Thus, we have … This proves the claim. From Theorem I.1 of [13], for any n ≥ 1 and M_n > n log |X| + 2, there exists a partition A_n of X^n of size at most M_n such that the associated partition function A_n satisfies … Thus, we have … Part (b): For any ε > 0, there exists an n_ε such that (log M_n)/n ≥ γ − ε for all n ≥ n_ε. Thus, we have … Hence, we have … Further, an invocation of Corollary 1 with … completes the proof.
Remark: It is interesting to note that, when γ < H_α(P), in addition to the fact that … for large values of n.

Mismatch Case:
Let us now suppose that one does not know the true underlying probability distribution P, but arbitrarily partitions X. Then, the penalty due to such a partition can be measured by the I_α-divergence, as stated in the following proposition.
Proposition 9: Let ρ ∈ (−1, 0) ∪ (0, ∞). Let A be a partition of X of size M with partition function A. Then, there exists a probability distribution Q_A on X such that … Proof: Define a probability distribution Q_A on X as …, where the last equality follows from [13, Prop. III.1]. Rearranging terms, we have …

Hence, an application of Propositions 3 and 4 with … yields the desired result.
A converse result is the following.
Proposition 10: Let X be a random task from X following distribution P, and let ρ ∈ (0, ∞). Let Q be another distribution on X. If M > log |X| + 2, then there exists a partition A_Q (with an associated partition function A_Q) of X of size at most M such that …, where M is as in (23).
Proof: Similar to the proof of Theorem I.1 of [13].

IV. ORDERED TASKS PARTITIONING PROBLEM
In Bunte and Lapidoth's tasks partitioning problem [13], one is interested in the average number of tasks associated with a key. However, in some scenarios, it might be more important to minimize the average number of redundant tasks performed before the intended task. To achieve this, the tasks associated with a key should be performed in decreasing order of their probabilities. With such a strategy in place, this problem draws a parallel with Arıkan's guessing problem [12].
Let A = {A_1, A_2, …, A_M} be a partition of X that corresponds to the assignment of tasks to M keys. Let N(x) be the number of tasks performed up to and including the intended task x. We refer to N(·) as the count function associated with the partition A; we suppress the dependence of N on A for notational convenience. If X denotes the intended task, then we are interested in the ρ-th moment of the number of tasks performed, that is, E_P[N(X)^ρ], where ρ ∈ (−1, 0) ∪ (0, ∞).
Lemma 1: For any count function associated with a partition of size M, we have … Proof: For a partition … Since … for any k ∈ {1, …, M}, we have …, where (a) follows from the AM-GM inequality.
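The count function N differs from the partition function A only in that tasks within a key are executed in decreasing order of probability, so N(x) ≤ A(x) pointwise. A sketch with a hypothetical assignment (probabilities and key labels made up for illustration):

```python
from collections import defaultdict

def count_moment(P, labels, rho):
    """E_P[N(X)^rho]; N(x) is the position of x when its cell is executed
    in decreasing order of probability."""
    cells = defaultdict(list)
    for i, lab in enumerate(labels):
        cells[lab].append(i)
    N = {}
    for members in cells.values():
        members.sort(key=lambda i: -P[i])
        for rank, i in enumerate(members, start=1):
            N[i] = rank
    return sum(P[i] * N[i] ** rho for i in range(len(P)))

P = [0.5, 0.3, 0.1, 0.1]
labels = [0, 1, 2, 2]      # two rare tasks share key 2
print(count_moment(P, labels, rho=1.0))   # at most the corresponding A-moment
```

With the trivial single-cell partition, N reduces to an optimal guessing function, which is the observation behind Remark a) below.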
(a) For any partition of X of size M, we have … (b) For ρ > 0, there exists a partition of X of size at most M with count function N such that …, where M is as in (23). … for x ∈ X. Once we observe this, the proof is the same as that of Theorem I.1(b) of [13].
Proposition 12: Let {M_n} be a sequence of positive integers such that M_n ≥ n log |X| + 3, and suppose γ := lim_{n→∞} (log M_n)/n exists. Then, there exists a sequence of partitions of X^n of size at most M_n with count functions N_n such that (a) … Proof: Similar to the proof of Proposition 8.

Remark: a) If we choose the trivial partition, namely A_n = {X^n}, then the ordered tasks partitioning problem simplifies to Arıkan's guessing problem; that is, we have … and (25) simplifies to … Hence, all results pertaining to Arıkan's guessing problem can be derived from the ordered tasks partitioning problem.
b) Structurally, the ordered tasks partitioning problem differs from Bunte and Lapidoth's problem only due to the factor in (27). While this factor matters for one-shot results, for a sequence of i.i.d. tasks the factor vanishes asymptotically.

Mismatch Case:
Let us now suppose that one does not know the true underlying probability distribution P, but arbitrarily partitions X and executes the tasks within each cell of this partition in an arbitrary order. Then, the penalty due to such a partition and ordering can be measured by the I_α-divergence, as stated in the following propositions.
Proposition 13: Let ρ ∈ (−1, 0) ∪ (0, ∞). Let A be a partition of X of size M with count function N. Then, there exists a probability distribution Q_N on X such that … Proof: Define a probability distribution Q_N on X as …

Now, an application of Proposition 4 with … yields the desired result.
A converse result is the following.
Proposition 14: Let X be a random task from X following distribution P. Let Q be another distribution on X. If M > log |X| + 2, then there exists a partition A_Q (with an associated count function N_Q) of X of size at most M such that …, where M is as in (23).
Proof: Identical to the proof of Proposition 11(b).

V. OPERATIONAL CONNECTION AMONG THE PROBLEMS
In this section, we establish an operational relationship among the five problems (see Fig. 1) that we studied in the previous sections. The relationship we are interested in is: "Does knowing an optimal or asymptotically optimal solution in one problem help us find the same in another?" In fact, we end up showing that, under suitable conditions, all five problems form an equivalence class with respect to the above-mentioned relation. Throughout this section, we assume ρ > 0.
First, we make the following observations:
• Among the five problems discussed in the previous sections, only Arıkan's guessing and Huleihel et al.'s memoryless guessing have a unique optimal solution; the others only have asymptotically optimal solutions.
• The optimal solution of Huleihel et al.'s memoryless guessing problem is the α-scaled measure of the underlying probability distribution P. Hence, knowledge of the optimal solution of this problem implies knowledge of the optimal (or asymptotically optimal) solutions of all the other problems.
• Between Bunte and Lapidoth's problem and the ordered tasks problem, an asymptotically optimal solution of one yields that of the other. The partitioning lemma (Prop. III.2 of [13]) is the key result in these two problems, as it guarantees the existence of asymptotically optimal partitions in both.

A. Campbell's Coding and Arıkan's Guessing
An attempt to find a close relationship between these two problems was made by Hanawal and Sundaresan [22, Sec. II]. Here, we show the equivalence between asymptotically optimal solutions of these two problems.
Proposition 15: An asymptotically optimal solution exists for Campbell's source coding problem if and only if an asymptotically optimal solution exists for Arıkan's guessing problem.
Proof: Let {G*_n} be an asymptotically optimal sequence of guessing functions, that is, …, where c_n is the normalization constant. Notice that … Let us now define … Then, by [22, Prop. 1], … Hence … Thus, we have … We observe that … Consequently, from (11), we have … Thus, {L_{G*_n}} is an asymptotically optimal sequence of length functions for Campbell's coding problem. Conversely, given an asymptotically optimal sequence of length functions {L*_n} for Campbell's coding problem, define …, and let G_n be the guessing function on X^n that guesses according to the decreasing order of Q_{L*_n}-probabilities. Then, by [22, Prop. 2], … Further, from Theorem 1 of [12], we have …

B. Arıkan's Guessing and Tasks Partitioning

… [23]. In this section, we establish a different relation between these problems.
Proposition 16: An asymptotically optimal solution of Arıkan's guessing problem gives rise to an asymptotically optimal solution of the tasks partitioning problem.
Proof: Let {G*_n} be an asymptotically optimal sequence of guessing functions. Define …, where …, where M_n is as in (24); inequality (c) follows from Proposition 4 of [12], proved in Section III. Thus, if M_n is such that M_n ≥ n log |X| + 3, and if γ := lim_{n→∞} (log M_n)/n exists and γ > H_α(P), then we have … When γ < H_α(P), arguing along the lines of the proof of Proposition 8(b), it can be shown that …
The reverse implication of the above result does not always hold, due to the additional parameter M_n in the tasks partitioning problem. For example, if M_n = |X|^n and A_n(x^n) = 1 for every x^n, the partition does not provide any information about the underlying distribution. As a consequence, we cannot conclude anything about the optimal (or asymptotically optimal) solutions of the other problems. However, if M_n is such that log M_n increases sub-linearly, then it does help us find asymptotically optimal solutions of the other problems.
Proposition 17: An asymptotically optimal sequence of partition functions {A_n} with partition sizes {M_n} for the tasks partitioning problem gives rise to an asymptotically optimal solution for the guessing problem, provided M_n ≥ n log |X| + 3 and lim_{n→∞} (log M_n)/n = 0.
Proof: By hypothesis, … For every A_n, define the probability distribution …, and let G_{A_n} be the guessing function that guesses according to the decreasing order of Q_{A_n}-probabilities. Then, by [22, Prop. 2], we have … Further, an application of Theorem 1 of [12] gives us … This completes the proof.

C. Huleihel et al.'s Memoryless Guessing and Campbell's Coding
We already know that if one knows the optimal solution of Huleihel et al.'s memoryless guessing problem, that is, the α-scaled measure of the underlying probability distribution P, then one knows the optimal (or asymptotically optimal) solution of Campbell's coding problem. In this section, we prove a converse statement. We first prove the following lemma.
Lemma 2: Let L*_n denote the length function corresponding to an optimal solution of Campbell's coding problem on the alphabet set X^n endowed with the product distribution P_n. Then, Σ_{x^n ∈ X^n} 2^{−L*_n(x^n)} ≥ 1/2.
Proof: Suppose Σ_{x^n ∈ X^n} 2^{−L*_n(x^n)} < 1/2. Then, we must have L*_n(x^n) ≥ 2 for every x^n ∈ X^n. Define L̃_n(x^n) := L*_n(x^n) − 1. We observe that Σ_{x^n ∈ X^n} 2^{−L̃_n(x^n)} < 1; that is, the length function L̃_n(·) satisfies (10). Hence, there exists a code C_n for X^n such that L(C_n(x^n)) = L̃_n(x^n). Then, for ρ > 0, we have log E_{P_n}[2^{ρL*_n(X^n)}] > log E_{P_n}[2^{ρL̃_n(X^n)}], a contradiction.
Proposition 18: An asymptotically optimal solution for Huleihel et al.'s memoryless guessing problem exists if an asymptotically optimal solution exists for Campbell's coding problem.
Proof: Let {L*_n, n ≥ 1} denote a sequence of asymptotically optimal length functions for Campbell's coding problem, that is, … Let us define the distribution proportional to 2^{−L*_n(·)/α}, with normalization Σ_{x^n ∈ X^n} 2^{−L*_n(x^n)/α}. Then, we have …

VI. SUMMARY AND CONCLUDING REMARKS
The main motivation of this paper was the need to unify the source coding, guessing, and tasks partitioning problems by investigating the mathematical and operational commonalities among them. To that end, we formulated a general moment minimization problem and observed that the optimal value of its objective function is bounded below by Rényi entropy. We then re-established all achievable lower bounds in each of the above-mentioned problems using the generalized framework. It was interesting to note that the optimal solution did not depend on the moment function ψ, but only on the underlying probability distribution P and the order of the moment ρ (see Proposition 1). We also presented a unified framework for the mismatched versions of the above-mentioned problems. This framework not only led to refinements of known theorems, but also helped us identify a few new results. We went on to extend the tasks partitioning problem by asking a more practical question, and solved it using the proposed unified theory. Finally, we established a close relationship among these problems.
The established unified framework has the capacity to act as a general toolset and provide insights for a variety of problems in Information Theory.For example, we could solve the ordered tasks partitioning problem using this framework.
The presented unified approach can also be extended and explored further in several ways. These include: (a) Extension to general state spaces: It would be interesting to see if the studied problems can be formulated and solved, for example, for countably infinite or continuous support sets. (b) Applications: Arıkan showed an application of the guessing problem to a sequential decoding problem [12]. Humblet showed that the cumulant of code lengths arises in minimizing the probability of buffer overflow in source coding problems [20]. Sundaresan [21] and Salamatian et al. [24] show applications of memoryless guessing to password attack problems. It would be interesting to see whether other potential applications emerge from this unified study.

Proposition 8: Let {M_n} be a sequence of positive integers such that M_n ≥ n log |X| + 3, and suppose γ := lim_{n→∞} (log M_n)/n exists. Then, for ρ > 0, there exists a sequence of partitions of X^n of size at most M_n with partition functions A_n such that (a) … Note that 1 + 2^{ρ(H_α(P_n) − log M_n)} = 1 + 2^{nρ(H_α(P) − (log M_n)/n)}. Part (a): When γ > H_α(P), let us choose ε = (γ − H_α(P))/2 > 0. Then, there exists an n_ε such that (log M_n)/n ≥ γ − ε for all n ≥ n_ε. …

Proof of Proposition 11: Part (a): Applying Proposition 1 with k = M[1 + ln(|X|/M)] and ψ(x) = N(x), we get the desired result. Part (b): If A and N are respectively the partition and count functions of a partition A, then we have 1 ≤ N(x) ≤ A(x) …

Fig. 1. Relationships established among the five problems. A directed arrow from problem A to problem B means that knowing an optimal or asymptotically optimal solution of A helps us find the same in B.