Variable-Length Resolvability for General Sources and Channels

Yagi, Hideki; Han, Te Sun

doi:10.3390/e25101466

Open AccessFeature PaperArticle

Variable-Length Resolvability for General Sources and Channels^†

by

Hideki Yagi

^1,*

and

Te Sun Han

²

¹

Department of Computer and Network Engineering, The University of Electro-Communications, Tokyo 182-8585, Japan

²

The National Institute of Information and Communications Technology (NICT), Tokyo 184-8795, Japan

^*

Author to whom correspondence should be addressed.

^†

This paper is an extended version of our paper published in the proceedings of 2017 IEEE International Symposium on Information Theory (ISIT2017), Aachen, Germany, titled “Variable-length resolvability for general sources”.

Entropy 2023, 25(10), 1466; https://doi.org/10.3390/e25101466

Submission received: 7 September 2023 / Revised: 30 September 2023 / Accepted: 3 October 2023 / Published: 19 October 2023

(This article belongs to the Special Issue Advances in Information and Coding Theory II)

Download

Browse Figure

Versions Notes

Abstract

:

We introduce the problem of variable-length (VL) source resolvability, in which a given target probability distribution is approximated by encoding a VL uniform random number, and the asymptotically minimum average length rate of the uniform random number, called the VL resolvability, is investigated. We first analyze the VL resolvability with the variational distance as an approximation measure. Next, we investigate the case under the divergence as an approximation measure. When the asymptotically exact approximation is required, it is shown that the resolvability under two kinds of approximation measures coincides. We then extend the analysis to the case of channel resolvability, where the target distribution is the output distribution via a general channel due to a fixed general source as an input. The obtained characterization of channel resolvability is fully general in the sense that, when the channel is just an identity mapping, it reduces to general formulas for source resolvability. We also analyze the second-order VL resolvability.

Keywords:

random number generation; source resolvability; channel resolvability; output approximation; variable-length resolvability

1. Introduction

Generating a random number subject to a given probability distribution has a number of applications, such as in information security, statistical machine learning, and computer science. From the viewpoint of information theory, random number generation may be considered to be a transformation (encoding) of sequences emitted from a given source with coin distribution into other sequences with target distribution via a deterministic mapping [1,2,3]. Among others, there have been two major types of problems of random number generation: intrinsic randomness [4,5] and (source) resolvability [6,7]. In the former case, a fixed-length (FL) uniform random number is extracted from an arbitrary coin distribution, and we want to find the maximum achievable rate of such uniform random numbers. In the latter case, in contrast, an FL uniform random number used as a coin distribution is encoded to approximate a given target distribution, and we want to find the minimum achievable rate of such uniform random numbers. Thus, there is a duality between these two problems.

The problem of intrinsic randomness has been extended to the case of variable-length (VL) uniform random numbers, for which the length of random numbers may vary. This problem, referred to as the VL intrinsic randomness, was first introduced by Vembu and Verdú [5] for a finite source alphabet and later extended by Han [4] to a countably infinite alphabet. This problem was actually motivated because, in many practical situations, it is indispensable to consider cases where FL uniform random numbers are not available, and, instead, VL uniform random numbers are available; typically, this is in cases where we work with Elias’ universal random numbers [1]. The use of such uniform random numbers is expected generally to increase the achievable average length rate for intrinsic randomness. Then, the following natural question may be raised: Can we indeed lower the average length rate needed in the “resolvability” problem by using VL random numbers? The answer is “yes”. Despite the duality between these two kinds of problems for random number generation, the VL counterpart in the resolvability problem has not been discussed, where we focus on this problem.

We introduce the problem of VL source/channel resolvability, where a given target probability distribution is to be approximated by encoding a VL uniform random number. Distance measures between the target distribution and an approximated distribution are used to measure the fineness of the approximation. We first analyze the fundamental limit on VL source resolvabilities with the variational distance as an approximation measure in Section 3. We use the smooth Shannon entropy, which is a version of smooth Rényi entropy [8], to characterize the

δ

-source resolvability, which is defined as the minimum achievable length rate of uniform random numbers with an asymptotic distance of less than or equal to

δ \in [0, 1)

. In the proof of the direct part, we will develop a simple version of information spectrum slicing [2], in which each “sliced” information density quantized to an integer is approximated by an FL uniform random number. Due to the simplicity of the method, the analysis with variational distance is first facilitated. As an important implication of general formulas for the

δ

-source resolvability, it is shown that the minimum resolvability rate of VL resolvability is equal to

(1 - δ)

times that of FL resolvability when the source is stationary and memoryless or is even with one-point spectrum (cf. Corollary 1). This result indicates an advantage of the use of a VL uniform random number when

δ > 0

because we can make the VL resolvability rate strictly smaller than an FV one. We then extend these analyses to the case under the (unnormalized) divergence as an approximation measure in Section 4. When

δ = 0

, that is, when the asymptotically exact approximation is required, it is shown that the 0-source resolvabilities under two kinds of approximation measures coincide with each other.

In Section 5, we then consider the problem of channel resolvability [6,9,10], in which not only a source but also a channel is fixed, and the output distribution via the channel is now the target of approximation. This problem, also referred to as the problem of output approximation, provides a powerful tool to analyze the fundamental limits of various problems in information theory and information security. Some such examples include identification codes [11,12,13], distributed hypothesis testing [14], message authentication [15], secret key generation [16], and coding for secure communication [17,18,19]. We consider two types of problems in which either a general source (mean-channel resolvability) or a VL uniform random number (VL channel resolvability) is used as a coin distribution. It is shown that the formulas established are equal for both coin distributions. In the special case that the channel is the identity mapping, the formulas established reduce to those in source resolvability as established in Section 3 and Section 4.

From Section 3, Section 4 and Section 5, the so-called first-order resolvability rates are analyzed, and the next important step may be the second-order analysis. Second-order analyses for various coding problems were initiated by Strassen [20] and have been studied in the past decade or so (cf. [21,22,23,24,25,26,27,28,29,30,31]). We also analyze the second-order fundamental limits of the VL channel/source resolvability in Section 6. In this paper, it is shown that the VL

δ

-source resolvability under the variational distance is equal to the minimum achievable rate of fixed-to-variable-length source codes with an error probability of less than or equal to

δ

. It is demonstrated that this close relationship provides a single-letter characterization for the first- and second-order source resolvability under the variational distance when the source is stationary and memoryless. It is worth noting that second-order analyses for the VL setting are relatively few, compared to those in the FL setting. The second-order formulas established in this paper are of importance from this perspective, too.

The remainder of the paper is organized as follows: Section 2 reviews the problem of FL source resolvability and the relations between the minimum resolvability rate and the minimum coding rate of FL source codes. Section 3 formally introduces the problem of VL source resolvability with variational distance as an approximation measure. Then, Section 4 discusses VL source resolvability with divergence as an approximation measure, and Section 5 generalizes the settings to channel resolvability. Section 6 investigates the second-order fundamental limits of the VL channel/source resolvability. Section 7 concludes the paper with a discussion of possible extensions.

2. FL Resolvability: Review

Let

U = {1, 2, \dots, K}

be a finite alphabet of size K, and let

X

be a finite or countably infinite alphabet. Let

X = {X^{n}}_{n = 1}^{\infty}

be a general source [2], where

P_{X^{n}}

is a probability distribution on

X^{n}

. We do not impose any assumptions, such as stationarity or ergodicity. In this paper, we identify

X^{n}

with its probability distribution

P_{X^{n}}

, and these symbols are used interchangeably.

We first review the problem of FL (source) resolvability [2] using the variational distance as an approximation measure. Let

U_{M_{n}}

denote the uniform random number, which is a random variable uniformly distributed over

U_{M_{n}} : = {1, \dots, M_{n}}

. Consider the problem of approximating the target distribution

P_{X^{n}}

by using

U_{M_{n}}

as a coin distribution via a deterministic mapping

φ_{n} : {1, \dots, M_{n}} \to X^{n}

. Denoting

{\tilde{X}}^{n} = φ_{n} (U_{M_{n}})

, we want to make

P_{{\tilde{X}}^{n}}

approximate

P_{X^{n}}

(cf. Figure 1). A standard choice of the performance measure for approximation is

\begin{matrix} d (P_{X^{n}}, P_{{\tilde{X}}^{n}}) : = \frac{1}{2} \sum_{x \in X^{n}} | P_{X^{n}} (x) - P_{{\tilde{X}}^{n}} (x) |, \end{matrix}

(1)

which is referred to as the variational distance between

P_{X^{n}}

and

P_{{\tilde{X}}^{n}}

. It is easily seen that

\begin{matrix} 0 & \leq d (P_{X^{n}}, P_{{\tilde{X}}^{n}}) \leq 1 . \end{matrix}

(2)

Let us now review the problem for source resolvability. Throughout this paper, logarithms are of the base K.

Definition 1

(FL resolvability). A resolution rate

R \geq 0

is said to be FL achievable or simply f-achievable (under the variational distance) if there exists a deterministic mapping

φ_{n} : {1, \dots, M_{n}} \to X^{n}

satisfying

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{n} log M_{n} & \leq R, \end{matrix}

(3)

\begin{matrix} lim_{n \to \infty} d (P_{X^{n}}, P_{{\tilde{X}}^{n}}) & = 0, \end{matrix}

(4)

where

{\tilde{X}}^{n} = φ_{n} (U_{M_{n}})

and

U_{M_{n}}

is the uniform random number over

U_{M_{n}}

. The infimum of f-achievable rates, i.e.,

\begin{matrix} S_{f} (X) : = inf {R : R i s f - a c h i e v a b l e} \end{matrix}

(5)

is called the FL resolvability or simply f-resolvability.

Then, we have the following theorem:

Theorem 1

(Han and Verdú [6]). For any general target source

X

,

\begin{matrix} S_{f} (X) = \bar{H} (X), \end{matrix}

(6)

where

\begin{matrix} \bar{H} (X) & : = inf \{a : lim_{n \to \infty} Pr \{\frac{1}{n} log \frac{1}{P_{X^{n}} (X^{n})} > a\} = 0\} . \end{matrix}

(7)

Remark 1.

As a dual counterpart of (7), we may define

\begin{matrix} \underset{̲}{H} (X) & : = sup \{b : lim_{n \to \infty} Pr \{\frac{1}{n} log \frac{1}{P_{X^{n}} (X^{n})} < b\} = 0\} . \end{matrix}

(8)

Sources such that

\bar{H} (X) = \underset{̲}{H} (X)

are called one-point spectrum sources (or equivalently, said to satisfy the strong converse property (cf. Han [2])), which includes stationary memoryless sources and stationary ergodic sources, etc. This class of sources is discussed later in Corollary 1.

The following problem is called the δ-resolvability problem [2,7], which relaxes the condition on the variational distance, compared to (4).

Definition 2

(FL

δ

-resolvability). For a fixed

δ \in [0, 1)

, a resolution rate

R \geq 0

is said to be FL δ-achievable or simply

f (δ)

-achievable (under the variational distance) if there exists a deterministic mapping

φ_{n} : {1, \dots, M_{n}} \to X^{n}

satisfying

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{n} log M_{n} & \leq R, \end{matrix}

(9)

\begin{matrix} \underset{n \to \infty}{lim sup} d (P_{X^{n}}, P_{{\tilde{X}}^{n}}) & \leq δ, \end{matrix}

(10)

where

{\tilde{X}}^{n} = φ_{n} (U_{M_{n}})

and

U_{M_{n}}

is the uniform random number over

U_{M_{n}}

. The infimum of all

f (δ)

-achievable rates, i.e.,

\begin{matrix} S_{f} (δ | X) : = inf {R : R i s f (δ) - a c h i e v a b l e} \end{matrix}

(11)

is referred to as the FL δ-resolvability or simply

f (δ)

-resolvability.

Then, a characterization of

S_{f} (δ | X)

is given by

Theorem 2

(Steinberg and Verdú [7]). For any general target source

X

,

\begin{matrix} S_{f} (δ | X) = {\bar{H}}_{δ} (X) (δ \in [0, 1)), \end{matrix}

(12)

where

\begin{matrix} {\bar{H}}_{δ} (X) & : = inf \{a : \underset{n \to \infty}{lim sup} Pr \{\frac{1}{n} log \frac{1}{P_{X^{n}} (X^{n})} > a\} \leq δ\} . \end{matrix}

(13)

Remark 2.

The FL resolvability problem is deeply related to the FL source coding problem allowing a probability of a decoding error up to ε. Denoting by

R_{f} (ε | X)

the minimum achievable rate for the source

X

, there is the relationship [7]:

\begin{matrix} R_{f} (ε | X) = {\bar{H}}_{ε} (X) (\forall ε \in [0, 1)) \end{matrix}

(14)

and, hence, by Theorem 2,

\begin{matrix} S_{f} (δ | X) = R_{f} (δ | X) (\forall δ \in [0, 1)) . \end{matrix}

(15)

Formula (14) can also be shown with a smooth Rényi entropy of order zero [32].

3. VL Resolvability: Variational Distance

In this section, we introduce the problem of variable-length (VL) resolvability, where the target probability distribution is approximated by encoding a VL uniform random number. As an initial step, we analyze the fundamental limit on the VL resolvability with the variational distance as an approximation measure.

3.1. Definitions

Let

U^{*}

denote the set of all sequences

u \in U^{m}

over

m = 0, 1, 2, \dots

, where

U^{0} = {λ}

(

λ

is the null string). Let

L_{n}

denote a random variable which takes a value in

{0, 1, 2, \dots}

. We define the VL uniform random number

U^{(L_{n})}

so that

U^{(m)}

is uniformly distributed over

U^{m}

given

L_{n} = m

. In other words,

\begin{matrix} P_{U^{(L_{n})}} (u, m) & : = Pr {U^{(L_{n})} = u, L_{n} = m} = \frac{Pr {L_{n} = m}}{K^{m}} (\forall u \in U^{m}), \end{matrix}

(16)

\begin{matrix} Pr {U^{(L_{n})} = u | L_{n} = m} & = \frac{P_{U^{(L_{n})}} (u, m)}{Pr {L_{n} = m}} = \frac{1}{K^{m}} (\forall u \in U^{m}), \end{matrix}

(17)

where

K = | U |

. It should be noticed that the VL sequence

u \in U^{m}

is generated with the joint probability

P_{U^{(L_{n})}} (u, m)

.

We formally define the

δ

-resolvability problem under the variational distance using the VL random number, called the VL δ-resolvability or simply

v (δ)

-resolvability.

Definition 3

(VL

δ

-resolvability: variational distance). A resolution rate

R \geq 0

is said to be VL δ-achievable (under the variational distance) with

δ \in [0, 1)

if there exists a VL uniform random number

U^{(L_{n})}

and a deterministic mapping

φ_{n} : U^{*} \to X^{n}

satisfying

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{n} E [L_{n}] & \leq R, \end{matrix}

(18)

\begin{matrix} \underset{n \to \infty}{lim sup} d (P_{X^{n}}, P_{{\tilde{X}}^{n}}) & \leq δ, \end{matrix}

(19)

where

E [\cdot]

denotes the expected value and

{\tilde{X}}^{n} = φ_{n} (U^{(L_{n})})

. The infimum of all

v (δ)

-achievable rates, i.e.,

\begin{matrix} S_{v} (δ | X) : = inf {R : R i s v (δ) - a c h i e v a b l e} \end{matrix}

(20)

is referred to as the VL δ-resolvability or simply

v (δ)

-resolvability.

If

δ = 0

,

v (0)

-achievable is said to be VL achievable or simply v-achievable (under the variational distance). The infimum of all v-achievable rates, i.e.,

\begin{matrix} S_{v} (X) : = inf {R : R i s v - a c h i e v a b l e} \end{matrix}

(21)

is called the VL resolvability or simply v-resolvability.

Remark 3.

One may think that condition (18) can be replaced with the condition on the sup-entropy rate:

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{n} H (U^{(L_{n})}) & \leq R \end{matrix}

(22)

as in [6], where

H (\cdot)

denotes the Shannon entropy. Indeed, both conditions yield the same resolvability result. To see this, let us denote by

{\tilde{S}}_{v} (δ | X)

the infimum of v-achievable rates R under constraints (19) and (22). It is easily checked that

\begin{matrix} E [L_{n}] & = \sum_{m = 1}^{\infty} \sum_{u \in U^{m}} P_{U^{(L_{n})}} (u, m) log K^{m} \\ = \sum_{m = 1}^{\infty} \sum_{u \in U^{m}} P_{U^{(L_{n})}} (u, m) log \frac{Pr {L_{n} = m}}{P_{U^{(L_{n})}} (u, m)} \\ = H (U^{(L_{n})}) - H (L_{n}) \leq H (U^{(L_{n})}) . \end{matrix}

(23)

This implies

S_{v} (δ | X) \leq {\tilde{S}}_{v} (δ | X)

. On the other hand, by invoking the well-known relation (cf. ([33], Corollary 3.12)), it holds that

\begin{matrix} H (L_{n}) \leq log (e \cdot E [L_{n}]) . \end{matrix}

(24)

Consider any resolution rate

R > S_{v} (δ | X)

. Then, (18) holds for some

U^{(L_{n})}

and

φ_{n}

and, hence, (24) leads to

\begin{matrix} lim_{n \to \infty} \frac{1}{n} H (L_{n}) = lim_{n \to \infty} \frac{1}{n} log (e \cdot E [L_{n}]) = 0 . \end{matrix}

(25)

From this equation, (23) yields that

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{n} H (U^{(L_{n})}) = \underset{n \to \infty}{lim sup} \frac{1}{n} E [L_{n}] \leq R \end{matrix}

(26)

to obtain

R \geq {\tilde{S}}_{v} (δ | X)

, implying that

S_{v} (δ | X) \geq {\tilde{S}}_{v} (δ | X)

. Thus,

S_{v} (δ | X) = {\tilde{S}}_{v} (δ | X)

.

3.2. Smooth Shannon Entropy

To establish a general formula for

S_{v} (δ | X)

, we introduce the following quantity for a general source

X

. Let

P (X^{n})

denote the set of all probability distributions on

X^{n}

. For

δ \in [0, 1)

, by defining the δ-ball using the variational distance

\begin{matrix} B_{δ} (X^{n}) = \{P_{V^{n}} \in P (X^{n}) : d (P_{X^{n}}, P_{V^{n}}) \leq δ\}, \end{matrix}

(27)

we introduce the smooth Shannon entropy:

\begin{matrix} H_{[δ]} (X^{n}) & : = inf_{P_{V^{n}} \in B_{δ} (X^{n})} \sum_{x \in X^{n}} P_{V^{n}} (x) log \frac{1}{P_{V^{n}} (x)} \\ = inf_{P_{V^{n}} \in B_{δ} (X^{n})} H (V^{n}), \end{matrix}

(28)

where

H (V^{n})

denotes the Shannon entropy of

P_{V^{n}}

. The

H_{[δ]} (X^{n})

is a nonincreasing function of

δ

. Based on this quantity for a general source

X = {X^{n}}_{n = 1}^{\infty}

, we define

\begin{matrix} H_{[δ]} (X) & = \underset{n \to \infty}{lim sup} \frac{1}{n} H_{[δ]} (X^{n}) . \end{matrix}

(29)

Remark 4.

Renner and Wolf [8] have defined the smooth Rény entropy of order

α \in (0, 1) \cup (1, \infty)

as

\begin{matrix} H_{[δ]}^{α} (X^{n}) = inf_{P_{V^{n}} \in B_{δ} (X^{n})} \frac{1}{1 - α} log \sum_{x \in X^{n}} P_{V^{n}} {(x)}^{α} . \end{matrix}

(30)

By letting

α ↑ 1

, we have

\begin{matrix} lim_{α ↑ 1} H_{[δ]}^{α} (X^{n}) = H_{[δ]} (X^{n}) . \end{matrix}

(31)

As for the proof, see Appendix A.

3.3. General Formula for General $δ \in [0, 1)$

The following main theorem indicates that the

v (δ)

-resolvability

S_{v} (δ | X)

can be characterized by the smooth Shannon entropy for

X

.

Theorem 3.

For any general target source

X

,

\begin{matrix} S_{v} (δ | X) = lim_{γ ↓ 0} H_{[δ + γ]} (X) (δ \in [0, 1)) . \end{matrix}

(32)

Remark 5.

In Formula (32), the limit

{lim}_{γ ↓ 0}

of the offset term

+ γ

appears in the characterization of

S_{v} (δ | X)

. This is because the smooth entropy

H_{[δ]} (X^{n})

for

X^{n}

involves the infimum over the non-asymptotic δ-ball

B_{δ} (X^{n})

for a given length n. Alternatively, we may consider the asymptotic δ-ball defined as

\begin{matrix} B_{δ} (X) = \{V = {V^{n}}_{n = 1}^{\infty} : \underset{n \to \infty}{lim sup} d (P_{X^{n}}, P_{V^{n}}) \leq δ\}, \end{matrix}

(33)

and then we obtain the alternative formula

\begin{matrix} S_{v} (δ | X) = inf_{V \in B_{δ} (X)} H (V) (δ \in [0, 1)) \end{matrix}

(34)

without an offset term, where

\begin{matrix} H (V) : = \underset{n \to \infty}{lim sup} \frac{1}{n} H (V^{n}) \end{matrix}

(35)

is the sup-entropy rate for

V

with the Shannon entropy

H (V^{n})

. The proof of (34) is given in Appendix B.

The same remark also applies to general formulas to be established in the subsequent sections.

Remark 6.

Independently of this work, Tomita, Uyematsu, and Matsumoto [34] have investigated the following problem: the coin distribution is given by fair coin-tossing and the average number of coin tosses should be asymptotically minimized as in [3] while the variational distance between the target and approximated distributions should satisfy (19). In this case, the asymptotically minimum average number of coin tosses is also characterized by the right-hand side (r.h.s.) of (32) (cf. [34]). Since the coin distribution is restricted to that given by fair coin-tossing with a stopping algorithm, realizations of

L_{n}

must satisfy the Kraft inequality (for prefix codes), whereas the problem addressed in this paper allows the probability distribution of

L_{n}

to be an arbitrary discrete one, not necessarily implying prefix codes. In this sense, our problem is more relaxed, while the coin is constrained to be conditionally independent given

L_{n}

. Theorem 3 indicates that the

v (δ)

-resolvability does not differ in both problems. Later, we shall show that, even in the case where the coin distribution may be any general source

X

, the δ-resolvability remains the same (cf. Theorem 5 and Remark 14).

On the other hand, we now define the following information quantity (to be used in Remark 7 below) to discuss the relationship with VL source coding: For

δ \in [0, 1)

we define

\begin{matrix} G_{[δ]} (X^{n}) = inf_{\begin{matrix} A_{n} \subseteq X^{n} : \\ Pr {X^{n} \in A_{n}} \geq 1 - δ \end{matrix}} \sum_{x \in A_{n}} P_{X^{n}} (x) log \frac{1}{P_{X^{n}} (x)} . \end{matrix}

(36)

The

G_{[δ]} (X^{n})

is a nonincreasing function of

δ

. Based on this quantity, we define

\begin{matrix} G_{[δ]} (X) & = \underset{n \to \infty}{lim sup} \frac{1}{n} G_{[δ]} (X^{n}) . \end{matrix}

(37)

Then, we have

Remark 7.

There is a deep relation between the δ-resolvability problem and VL δ-source coding with the error probability asymptotically not exceeding δ. Koga and Yamamoto [35] (also, cf. Han [36]) showed that the minimum average length rate

R_{v}^{*} (δ | X)

of VL δ-source codes is given by

\begin{matrix} R_{v}^{*} (δ | X) = lim_{γ ↓ 0} G_{[δ + γ]} (X) (\forall δ \in [0, 1)) . \end{matrix}

(38)

Theorem 3 and Proposition 1 (to be shown just below) reveal that

\begin{matrix} S_{v} (δ | X) = R_{v}^{*} (δ | X) (\forall δ \in [0, 1)) . \end{matrix}

(39)

The following proposition shows a general relationship between

G_{[δ]} (X)

and

H_{[δ]} (X)

.

Proposition 1.

For any general source

X

,

\begin{matrix} H_{[δ]} (X) = G_{[δ]} (X) \leq (1 - δ) {\bar{H}}_{δ - γ} (X) (\forall δ \in (0, 1), \forall γ \in (0, δ]) . \end{matrix}

(40)

In particular,

\begin{matrix} lim_{γ ↓ 0} H_{[δ + γ]} (X) = lim_{γ ↓ 0} G_{[δ + γ]} (X) \leq (1 - δ) {\bar{H}}_{δ} (X) (\forall δ \in [0, 1)) . \end{matrix}

(41)

(Proof) See Appendix C.

By plugging

γ = δ

into (40), a looser but sometimes useful bound

\begin{matrix} H_{[δ]} (X) = G_{[δ]} (X) \leq (1 - δ) \bar{H} (X) \end{matrix}

(42)

can be obtained. Equation (40) has been derived by [21], which improves a bound established in [35,37]. In view of Theorems 2 and 3, (41) in Proposition 1 implies

\begin{matrix} S_{v} (δ | X) \leq (1 - δ) S_{f} (δ | X) \end{matrix}

(43)

for all

δ \in [0, 1)

, where

S_{f} (δ | X)

denotes the

f (δ)

-resolvability. This general relationship elucidates the advantage of the use of VL uniform random numbers to make the average length rate lower. The proposition also claims that

G_{[δ]} (X)

coincides with

H_{[δ]} (X)

for all

δ \in [0, 1)

for any general source

X

.

A consequence of Theorem 3 is the following corollary:

Corollary 1.

Let

X = {X^{n}}_{n = 1}^{\infty}

be a one-point spectrum source (

\bar{H} (X) = \underset{̲}{H} (X)

) with

X^{n} = (X_{1}, X_{2}, \dots, X_{n})

, then we have

\begin{matrix} S_{f} (δ | X) = {\bar{H}}_{δ} (X) = H^{*} (X) (\forall δ \in [0, 1)), \end{matrix}

(44)

where

H^{*} (X) : = \bar{H} (X) = \underset{̲}{H} (X)

. Moreover, it holds that

\begin{matrix} S_{v} (δ | X) = H_{[δ]} (X) = G_{[δ]} (X) = (1 - δ) H^{*} (X) (\forall δ \in [0, 1)), \end{matrix}

(45)

where we notice that

H^{*} (X) = H (X_{1})

(entropy) for stationary memoryless sources.

(Proof) See Appendix D.

Now, we are ready to give the proof of Theorem 3:

Proof of Theorem 3.

(1): Converse Part:
Let R be $v (δ)$ -achievable. Then, there exists $U^{(L_{n})}$ and $φ_{n}$ satisfying (18) and

$\begin{matrix} \underset{n \to \infty}{lim sup} δ_{n} \leq δ, \end{matrix}$

(46)

where we define $δ_{n} = d (P_{X^{n}}, P_{{\tilde{X}}^{n}})$ with ${\tilde{X}}^{n} = φ_{n} (U^{(L_{n})})$ . Equation (46) implies that, for any given $γ > 0$ , it holds that $δ_{n} \leq δ + γ$ for all $n \geq n_{0}$ with some $n_{0} > 0$ , and thus we have

$\begin{matrix} H_{[δ + γ]} (X^{n}) \leq H_{[δ_{n}]} (X^{n}) (\forall n \geq n_{0}), \end{matrix}$

(47)

because $H_{[δ]} (X^{n})$ is a nonincreasing function of $δ$ . Since $P_{{\tilde{X}}^{n}} \in B_{δ_{n}} (X^{n})$ , we have

$\begin{matrix} H_{[δ_{n}]} (X^{n}) \leq H ({\tilde{X}}^{n}) . \end{matrix}$

(48)

On the other hand, it follows from (23) that

$\begin{matrix} H ({\tilde{X}}^{n}) & \leq H (U^{(L_{n})}) = E [L_{n}] + H (L_{n}), \end{matrix}$

(49)

where the inequality is due to the fact that $φ_{n}$ is a deterministic mapping and ${\tilde{X}}^{n} = φ_{n} (U^{(L_{n})})$ .
Combining (47)–(49) yields

$\begin{matrix} H_{[δ + γ]} (X) & = \underset{n \to \infty}{lim sup} \frac{1}{n} H_{[δ + γ]} (X^{n}) \\ \leq \underset{n \to \infty}{lim sup} \frac{1}{n} E [L_{n}] + \underset{n \to \infty}{lim sup} \frac{1}{n} H (L_{n}) \leq R, \end{matrix}$

(50)

where we have used (18) and (25) for the last inequality. Since $γ > 0$ is arbitrary, we obtain

$\begin{matrix} lim_{γ ↓ 0} H_{[δ + γ]} (X) \leq R . \end{matrix}$

(51)
(2): Direct Part:
Without loss of generality, we assume that $H^{+} : = {lim}_{γ ↓ 0} H_{[δ + γ]} (X)$ is finite ( $H^{+} < + \infty$ ). Letting $R = H^{+} + 3 γ$ , where $γ > 0$ is an arbitrary constant, we shall show that R is $v (δ)$ -achievable. In what follows, we use a simpler form of information spectrum slicing [2], where each piece of sliced information quantized to a positive integer ℓ is approximated by the uniform random number $U^{(ℓ)}$ of the length ℓ.
First, we note that

$\begin{matrix} H^{+} \geq H_{[δ + γ]} (X) \geq \frac{1}{n} H_{[δ + γ]} (X^{n}) - γ (\forall n > n_{0}) \end{matrix}$

(52)

because of the monotonicity of $H_{[δ]} (X)$ in $δ$ . Let $V^{n}$ be a random variable subject to $P_{V^{n}} \in B_{δ + γ} (X^{n})$ , which satisfies

$\begin{matrix} H_{[δ + γ]} (X^{n}) + γ \geq H (V^{n}) . \end{matrix}$

(53)

For $γ > 0$ , we can choose a $c_{n} > 0$ so large that

$\begin{matrix} Pr {V^{n} \notin T_{n}} \leq γ \end{matrix}$

(54)

where

$\begin{matrix} T_{n} : = \{x \in X^{n} : \frac{1}{n} log \frac{1}{P_{V^{n}} (x)} \leq c_{n}\} . \end{matrix}$

(55)

We also define

$\begin{matrix} ℓ (x) : = \{\begin{matrix} ⌈ log \frac{1}{P_{V^{n}} (x)} + n γ ⌉ & f o r x \in T_{n} \\ 0 & otherwise . \end{matrix} \end{matrix}$

(56)

For $m = 0, 1, \dots, β_{n} : = ⌈ n (c_{n} + γ) ⌉$ , set

$\begin{matrix} S_{n} (m) : = \{x \in X^{n} : ℓ (x) = m\}, \end{matrix}$

(57)

then these sets form a partition of $X^{n}$ , i.e.,

$\begin{matrix} ⋃_{m = 0}^{β_{n}} S_{n} (m) = X^{n} a n d ⋃_{m = 1}^{β_{n}} S_{n} (m) = T_{n} . \end{matrix}$

(58)

We set $L_{n}$ so that

$\begin{matrix} Pr {L_{n} = m} = Pr {V^{n} \in S_{n} (m)}, \end{matrix}$

(59)

where it is obvious that $\sum_{m = 0}^{β_{n}} Pr {L_{n} = m} = 1$ , and, hence, the probability distribution of the VL uniform random number $U^{(L_{n})}$ is given as

$\begin{matrix} P_{U^{(L_{n})}} (u, m) : = Pr {U^{(L_{n})} = u, L_{n} = m} = \frac{Pr {V^{n} \in S_{n} (m)}}{K^{m}} (\forall u \in U^{m}) . \end{matrix}$

(60)

(a): Construction of Mapping $φ_{n} : U^{*} \to X^{n}$ :
Index the elements in $S_{n} (m)$ as $x_{1}, x_{2}, \dots, x_{| S_{n} (m) |} (m = 1, 2, \dots)$ , where

$\begin{matrix} | S_{n} (m) | \leq K^{m - n γ} \end{matrix}$

(61)

since for $x \in S_{n} (m)$

$\begin{matrix} log \frac{1}{P_{V^{n}} (x)} \leq m - n γ ⟺ P_{V^{n}} (x) \geq K^{- (m - n γ)}, \end{matrix}$

(62)

and, therefore,

$\begin{matrix} 1 \geq \sum_{x \in S_{n} (m)} P_{V^{n}} (x) \geq \sum_{x \in S_{n} (m)} K^{- (m - n γ)} = | S_{n} (m) | K^{- (m - n γ)} . \end{matrix}$

(63)

For $i = 1, 2, \dots, | S_{n} (m) |$ , define ${\tilde{A}}_{i}^{(m)} \subset U^{m}$ as the set of sequences $u \in U^{m}$ so that

$\begin{matrix} \sum_{u \in {\tilde{A}}_{i}^{(m)}} P_{U^{(L_{n})}} (u, m) \leq P_{V^{n}} (x_{i}) < \sum_{u \in {\tilde{A}}_{i}^{(m)}} P_{U^{(L_{n})}} (u, m) + \frac{Pr {V^{n} \in S_{n} (m)}}{K^{m}} \end{matrix}$

(64)

and

$\begin{matrix} {\tilde{A}}_{i}^{(m)} \cap {\tilde{A}}_{j}^{(m)} = \emptyset (i \neq j) . \end{matrix}$

(65)

If

$\begin{matrix} \sum_{i = 1}^{| S_{n} (m) |} \sum_{u \in {\tilde{A}}_{i}^{(m)}} P_{U^{(L_{n})}} (u, m) < \sum_{i = 1}^{| S_{n} (m) |} P_{V^{n}} (x_{i}) = Pr {V^{n} \in S_{n} (m)}, \end{matrix}$

(66)

then add a $u_{i} \in U^{m} ∖ (\cup_{j} {\tilde{A}}_{j}^{(m)})$ to obtain

$\begin{matrix} A_{i}^{(m)} = {\tilde{A}}_{i}^{(m)} \cup {u_{i}} \end{matrix}$

(67)

for $i = 1, 2, \dots$ in order, until it holds that with some $1 \leq c \leq | S_{n} (m) |$

$\begin{matrix} ⋃_{i = 1}^{c} A_{i}^{(m)} \cup ⋃_{i = c + 1}^{| S_{n} (m) |} {\tilde{A}}_{i}^{(m)} = U^{m}, \end{matrix}$

(68)

where $u_{1}, u_{2}, \dots$ are selected to be all different. Since $| U^{m} | = K^{m}$ and

$\begin{matrix} \sum_{u \in U^{m}} P_{U^{(L_{n})}} (u, m) = \sum_{u \in U^{m}} \frac{Pr {V^{n} \in S_{n} (m)}}{K^{m}} = Pr {V^{n} \in S_{n} (m)}, \end{matrix}$

(69)

such a $1 \leq c \leq | S_{n} (m) |$ always exists. For simplicity, we set for $i = c + 1, c + 2, \dots, | S_{n} (m) |$

$\begin{matrix} A_{i}^{(m)} = {\tilde{A}}_{i}^{(m)} \end{matrix}$

(70)

and for $i = 1, 2, \dots, | S_{n} (m) |$

$\begin{matrix} φ_{n} (u) = x_{i} f o r u \in A_{i}^{(m)}, \end{matrix}$

(71)

which defines the random variable ${\tilde{X}}^{n}$ with values in $X^{n}$ such that

$\begin{matrix} P_{{\tilde{X}}^{n}} (x_{i}) = \sum_{u \in A_{i}^{(m)}} P_{U^{(L_{n})}} (u, m) (x_{i} \in S_{n} (m)), \end{matrix}$

(72)

that is, ${\tilde{X}}^{n} = φ_{n} (U^{(L_{n})})$ , where if $X^{n} ∖ T_{n} \neq \emptyset$ , we choose some $x_{0} \in X^{n} ∖ T_{n}$ and set

$\begin{matrix} P_{{\tilde{X}}^{n}} (x_{0}) = Pr {V^{n} \notin T_{n}} and φ_{n} (λ) = x_{0} . \end{matrix}$

(73)

Notice that, by this construction, we have

$\begin{matrix} | P_{{\tilde{X}}^{n}} (x_{i}) - P_{V^{n}} (x_{i}) | \leq \frac{Pr {V^{n} \in S_{n} (m)}}{K^{m}} \end{matrix}$

(74)

for $i = 1, 2, \dots, | S_{n} (m) |; m = 1, 2, \dots, β_{n}$ , and

$\begin{matrix} Pr {{\tilde{X}}^{n} \notin T_{n}} = Pr {V^{n} \notin T_{n}} \leq γ . \end{matrix}$

(75)
(b): Evaluation of Average Length:
Since $m = 0$ does not contribute to the average length $E [L_{n}]$ , it is evaluated as follows:

$\begin{matrix} E [L_{n}] & = \sum_{m = 1}^{β_{n}} \sum_{u \in U^{m}} P_{U^{(L_{n})}} (u, m) \cdot m \\ = \sum_{m = 1}^{β_{n}} \sum_{i = 1}^{| S_{n} (m) |} \sum_{u \in A_{i}^{(m)}} P_{U^{(L_{n})}} (u, m) \cdot m \\ = \sum_{m = 1}^{β_{n}} \sum_{x_{i} \in S_{n} (m)} P_{{\tilde{X}}^{n}} (x_{i}) \cdot m, \end{matrix}$

(76)

where we have used $U^{m} = ⋃_{i = 1}^{| S_{n} (m) |} A_{i}^{(m)}$ and (72). For $x_{i} \in S_{n} (m)$ , we obtain from (74)

$\begin{matrix} P_{{\tilde{X}}^{n}} (x_{i}) & \leq P_{V^{n}} (x_{i}) + \frac{Pr {V^{n} \in S_{n} (m)}}{K^{m}} \\ \leq P_{V^{n}} (x_{i}) (1 + \frac{1}{P_{V^{n}} (x_{i}) K^{m}}) \\ \leq P_{V^{n}} (x_{i}) (1 + \frac{1}{K^{n γ}}), \end{matrix}$

(77)

where, to derive the last inequality, we have used (62). Plugging the inequality

$\begin{matrix} m \leq log \frac{1}{P_{V^{n}} (x_{i})} + n γ + 1 (\forall x_{i} \in S_{n} (m)) \end{matrix}$

(78)

and (77) into (76), we obtain

$\begin{matrix} E [L_{n}] & \leq (1 + \frac{1}{K^{n γ}}) \sum_{m = 1}^{β_{n}} \sum_{x_{i} \in S_{n} (m)} P_{V^{n}} (x_{i}) (log \frac{1}{P_{V^{n}} (x_{i})} + n γ + 1) \\ \leq (1 + \frac{1}{K^{n γ}}) (H (V^{n}) + n γ + 1), \end{matrix}$

(79)

which yields

$\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{n} E [L_{n}] & \leq \underset{n \to \infty}{lim sup} \frac{1}{n} H (V^{n}) + 2 γ \\ \leq \underset{n \to \infty}{lim sup} \frac{1}{n} H_{[δ + γ]} (X^{n}) + 3 γ \\ = H_{[δ + γ]} (X) + 3 γ \\ \leq H^{+} + 3 γ = R, \end{matrix}$

(80)

where the second inequality follows from (53) and the last one is due to (52).
(c): Evaluation of Variational Distance:
From (61) and (74), we have

$\begin{matrix} \sum_{x \in S_{n} (m)} | P_{{\tilde{X}}^{n}} (x) - P_{V^{n}} (x) | \leq \frac{| S_{n} (m) | Pr {V^{n} \in S_{n} (m)}}{K^{m}} \leq \frac{Pr {V^{n} \in S_{n} (m)}}{K^{n γ}}, \end{matrix}$

(81)

which, in view of (58), leads to

$\begin{matrix} d (P_{{\tilde{X}}^{n}}, P_{V^{n}}) & = \frac{1}{2} \sum_{x \in T_{n}} | P_{{\tilde{X}}^{n}} (x) - P_{V^{n}} (x) | + \frac{1}{2} \sum_{x \notin T_{n}} | P_{{\tilde{X}}^{n}} (x) - P_{V^{n}} (x) | \\ \leq \frac{1}{2} \sum_{m = 1}^{β_{n}} \sum_{x \in S_{n} (m)} | P_{{\tilde{X}}^{n}} (x) - P_{V^{n}} (x) | \\ + \frac{1}{2} (Pr {{\tilde{X}}^{n} \notin T_{n}} + Pr {V^{n} \notin T_{n}}) \\ \leq \frac{1}{2} \sum_{m = 1}^{β_{n}} \frac{Pr {V^{n} \in S_{n} (m)}}{K^{n γ}} + γ \leq \frac{1}{2} K^{- n γ} + γ, \end{matrix}$

(82)

where we have used (75) to obtain the leftmost inequality in (82). By the triangle inequality, we obtain

$\begin{matrix} d (P_{X^{n}}, P_{{\tilde{X}}^{n}}) \leq d (P_{X^{n}}, P_{V^{n}}) + d (P_{{\tilde{X}}^{n}}, P_{V^{n}}) \leq δ + 2 γ + \frac{1}{2} K^{- n γ}, \end{matrix}$

(83)

where the last inequality follows because $P_{V^{n}} \in B_{δ + γ} (X^{n})$ . Thus, we obtain from (83)

$\begin{matrix} \underset{n \to \infty}{lim sup} d (P_{X^{n}}, P_{{\tilde{X}}^{n}}) & \leq δ + 2 γ . \end{matrix}$

(84)

Since $γ > 0$ is arbitrary and we have (80), we conclude that R is $v (δ)$ -achievable.

□

3.4. General Formula for $δ = 0$

In this subsection, we consider the special case with

δ = 0

. In this case, we can elucidate the relationship between the minimum achievable rates for VL source codes with an asymptotically vanishing decoding error probability and the FL source codes.

We obtain the following corollary from Theorem 3 and Proposition 1:

Corollary 2.

For any general target source

X

,

\begin{matrix} S_{v} (X) = lim_{γ ↓ 0} G_{[γ]} (X), \end{matrix}

(85)

where

G_{[γ]} (X)

is defined in (37).

It has been shown by Han [2] that any source

X = {X^{n}}_{n = 1}^{\infty}

satisfying the uniform integrability (cf. Han [2]) satisfies

\begin{matrix} lim_{γ ↓ 0} G_{[γ]} (X) = H (X) : = \underset{n \to \infty}{lim sup} \frac{1}{n} H (X^{n}), \end{matrix}

(86)

where

H (X)

is called the sup-entropy rate. Notice here, in particular, that the finiteness of an alphabet implies the uniform integrability [2]. Thus, we obtain the following corollary:

Corollary 3.

For any finite alphabet target source

X

,

\begin{matrix} S_{v} (X) = H (X) . \end{matrix}

(87)

Remark 8.

As in the case of FL resolvability and FL source coding problems,

S_{v} (X)

is tightly related to VL source codes with vanishing decoding error probabilities. Denoting by

R_{v}^{*} (X)

the minimum error-vanishing VL -achievable rate for a source

X

, Han [36] has shown that

\begin{matrix} R_{v}^{*} (X) = lim_{γ ↓ 0} G_{[γ]} (X), \end{matrix}

(88)

and, hence, from Corollary 3, it is concluded that

\begin{matrix} S_{v} (X) = R_{v}^{*} (X) . \end{matrix}

(89)

In addition, if a general source

X

satisfies the uniform integrability and the strong converse property (cf. Han [2]), then equation (86) holds and hence it follows from ([2], Theorem 1.7.1) that

\begin{matrix} S_{f} (X) = S_{v} (X) = R_{v}^{*} (X) = R_{v} (X) = R_{f} (X) = H (X), \end{matrix}

(90)

where

R_{f} (X) : = R_{f} (0 | X)

and

R_{v} (X)

denotes the minimum achievable rate of VL source codes with zero error probabilities for all

n = 1, 2, \dots

.

Remark 9.

Han and Verdú [6] have discussed the problem of mean-resolvability for the target distribution

P_{X^{n}}

. In this problem, the coin distribution may be a general source

\tilde{X} = {{\tilde{X}}^{n}}_{n = 1}^{\infty}

, where

{\tilde{X}}^{n}

is a random variable that takes values in

X^{n}

with the average length rate

\frac{1}{n} E [L_{n}]

in (18) replaced with the entropy rate

\frac{1}{n} H ({\tilde{X}}^{n})

. Denoting by

{\bar{S}}_{v} (X)

the mean-resolvability, which is defined as the infimum of v-achievable rates for a general source

X

(with the countably infinite alphabet), we can easily verify that any mean-resolution rate

R > {\bar{S}}_{v} (X)

must satisfy

R \geq {lim}_{γ ↓ 0} G_{[γ]} (X)

so that

S_{v} (X) \leq {\bar{S}}_{v} (X)

. On the other hand,

S_{v} (X) \geq {\bar{S}}_{v} (X)

by definition. Thus, in view of Corollary 2, we have

Corollary 4.

For any general target source

X

,

\begin{matrix} S_{v} (X) = {\bar{S}}_{v} (X) = lim_{γ ↓ 0} G_{[γ]} (X) . \end{matrix}

(91)

4. VL Resolvability: Divergence

So far, we have considered the problem of VL resolvability, in which the approximation level is measured by the variational distance between

X^{n}

and

{\tilde{X}}^{n}

. It is sometimes of use to deal with another quantity as an approximation measure. In this section, we use the (unnormalized) divergence as the approximation measure.

4.1. Definitions

In this subsection, we address the following problem.

Definition 4

(VL

δ

-resolvability: divergence). A resolution rate

R \geq 0

is said to be VL δ-achievable or simply

v_{D} (δ)

-achievable (under the divergence) with

δ \geq 0

, if there exists a VL uniform random number

U^{(L_{n})}

and a deterministic mapping

φ_{n} : U^{*} \to X^{n}

satisfying

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{n} E [L_{n}] & \leq R, \end{matrix}

(92)

\begin{matrix} \underset{n \to \infty}{lim sup} D ({\tilde{X}}^{n} | | X^{n}) & \leq δ, \end{matrix}

(93)

where

{\tilde{X}}^{n} = φ_{n} (U^{(L_{n})})

and

D ({\tilde{X}}^{n} | | X^{n})

denotes the divergence between

P_{{\tilde{X}}^{n}}

and

P_{X^{n}}

:

\begin{matrix} D ({\tilde{X}}^{n} | | X^{n}) = \sum_{x \in X^{n}} P_{{\tilde{X}}^{n}} (x) log \frac{P_{{\tilde{X}}^{n}} (x)}{P_{X^{n}} (x)} . \end{matrix}

(94)

The infimum of all

v_{D} (δ)

-achievable rates, i.e.,

\begin{matrix} S_{v}^{D} (δ | X) : = inf {R : R i s v_{D} (δ) - a c h i e v a b l e} \end{matrix}

(95)

is called the VL

δ

-resolvability or simply the

v_{D} (δ)

-resolvability.

Remark 10.

The measure of approximation is now the divergence

D ({\tilde{X}}^{n} | | X^{n})

but not its reversed version

D (X^{n} | | {\tilde{X}}^{n})

. In the context of resolvability, divergence

D ({\tilde{X}}^{n} | | X^{n})

(and its counterpart in the case of channel resolvability) is usually employed as in [6,7,9]. We also use this type of divergence in the subsequent sections.

To establish the general formula for

S_{v}^{D} (δ | X)

, we introduce the following quantity for a general source

X = {X^{n}}_{n = 1}^{\infty}

. Recall that

P (X^{n})

denotes the set of all probability distributions on

X^{n}

. For

δ \geq 0

, defining the δ-ball using the divergence as

\begin{matrix} B_{δ}^{D} (X^{n}) = \{P_{V^{n}} \in P (X^{n}) : D (V^{n} | | X^{n}) \leq δ\}, \end{matrix}

(96)

we introduce the following quantity, referred to as the smooth entropy using the divergence:

\begin{matrix} H_{[δ]}^{D} (X^{n}) & : = inf_{P_{V^{n}} \in B_{δ}^{D} (X^{n})} H (V^{n}), \end{matrix}

(97)

where

H (V^{n})

denotes the Shannon entropy of

P_{V^{n}}

. Obviously,

H_{[δ]}^{D} (X^{n})

is a nonincreasing function of

δ

. Now, we define

\begin{matrix} H_{[δ]}^{D} (X) & = \underset{n \to \infty}{lim sup} \frac{1}{n} H_{[δ]}^{D} (X^{n}) . \end{matrix}

(98)

The following lemma is used to derive Corollary 5 of Theorem 4 below in the next subsection.

Lemma 1.

For any general source

X

,

\begin{matrix} H_{[δ]} (X) \leq H_{[g (δ)]}^{D} (X) (δ \geq 0), \end{matrix}

(99)

where we define

g (δ) = 2 δ^{2} / ln K

, and

\begin{matrix} lim_{δ ↓ 0} G_{[δ]} (X) = lim_{δ ↓ 0} H_{[δ]} (X) = lim_{δ ↓ 0} H_{[δ]}^{D} (X) \leq \bar{H} (X) . \end{matrix}

(100)

(Proof) See Appendix E.

4.2. General Formula

Here, we establish another main theorem, which characterizes

S_{v}^{D} (δ | X)

for all

δ \geq 0

in terms of the smooth entropy using the divergence.

Theorem 4.

For any general target source

X

,

\begin{matrix} S_{v}^{D} (δ | X) = lim_{γ ↓ 0} H_{[δ + γ]}^{D} (X) (δ \geq 0) . \end{matrix}

(101)

Remark 11.

It should be noticed that the approximation measure considered here is not the normalized divergence

\begin{matrix} \frac{1}{n} D (φ_{n} (U^{(L_{n})}) | | X^{n}), \end{matrix}

(102)

which has been used in the problem of FL δ-resolvability [7]. The achievability scheme given in the proof of the direct part of Theorem 4 can also be used in the case of this relaxed measure. Indeed, denoting the VL δ-resolvability with the normalized divergence by

{\tilde{S}}_{v}^{D} (δ | X)

, the general formula for

{\tilde{S}}_{v}^{D} (δ | X)

is given in the same form as (101), if the radius of the δ-ball

B_{δ}^{D} (X^{n})

in the definition of

H_{[δ]}^{D} (X^{n})

is replaced with the normalized divergence. It generally holds that

S_{v}^{D} (δ | X) \geq {\tilde{S}}_{v}^{D} (δ | X)

for all

δ \geq 0

because the normalized divergence is smaller than the unnormalized divergence.

As we have seen in Lemma 1, we generally have

S_{v}^{D} (g (δ) | X) \geq S_{v} (δ | X)

for any

δ \in [0, 1)

with

g (δ) = 2 δ^{2} / ln K

. In particular, in the case that

δ = 0

, we obtain the following corollary of Theorems 3 and 4.

Corollary 5.

For any general target source

X

,

\begin{matrix} S_{v}^{D} (0 | X) & = S_{v} (X) . \end{matrix}

(103)

Corollary 5 indicates that the

v_{D} (0)

-resolvability

S_{v}^{D} (0 | X)

coincides with the

v

-resolvability

S_{v} (X)

and is also characterized by the r.h.s. of (85). By (88), it also implies that

S_{v}^{D} (0 | X) = R_{v}^{*} (X)

, where

R_{v}^{*} (X)

denotes the minimum error-vanishing achievable rate with VL source codes for

X

.

Proof of Theorem 4.

(1): Converse Part:
Let R be $v_{D} (δ)$ -achievable. Then, there exists $U^{(L_{n})}$ and $φ_{n}$ satisfying (92) and

$\begin{matrix} \underset{n \to \infty}{lim sup} δ_{n} \leq δ, \end{matrix}$

(104)

where we define $δ_{n} = D ({\tilde{X}}^{n} | | X^{n})$ with ${\tilde{X}}^{n} = φ_{n} (U^{(L_{n})})$ . Equation (104) implies that, for any given $γ > 0$ , it holds that $δ_{n} \leq δ + γ$ for all $n \geq n_{0}$ with some $n_{0} > 0$ , and therefore

$\begin{matrix} H_{[δ + γ]}^{D} (X^{n}) \leq H_{[δ_{n}]}^{D} (X^{n}) (\forall n \geq n_{0}) \end{matrix}$

(105)

since $H_{[δ]}^{D} (X^{n})$ is a nonincreasing function of $δ$ . Since $P_{{\tilde{X}}^{n}} \in B_{δ_{n}}^{D} (X^{n})$ , we have

$\begin{matrix} H_{[δ_{n}]}^{D} (X^{n}) \leq H ({\tilde{X}}^{n}) . \end{matrix}$

(106)

On the other hand, it follows from (23) that

$\begin{matrix} H ({\tilde{X}}^{n}) & \leq H (U^{(L_{n})}) = E [L_{n}] + H (L_{n}), \end{matrix}$

(107)

where the inequality is due to the fact that $φ_{n}$ is a deterministic mapping and ${\tilde{X}}^{n} = φ_{n} (U^{(L_{n})})$ .
Combining (105)–(107) yields

$\begin{matrix} H_{[δ + γ]}^{D} (X) & = \underset{n \to \infty}{lim sup} \frac{1}{n} H_{[δ + γ]}^{D} (X^{n}) \\ \leq \underset{n \to \infty}{lim sup} \frac{1}{n} E [L_{n}] + \underset{n \to \infty}{lim sup} \frac{1}{n} H (L_{n}) \leq R, \end{matrix}$

(108)

where we used (25) and (92) for the last inequality. Since $γ > 0$ is arbitrary, we have

$\begin{matrix} lim_{γ ↓ 0} H_{[δ + γ]}^{D} (X) \leq R . \end{matrix}$

(109)
(2): Direct Part:
We modify the achievability scheme in the proof of the direct part of Theorem 3. Although the proof of this part is quite similar to that of the direct part of Theorem 3, we give here the full proof in order to avoid subtle possible confusions. We may assume that $H^{+} : = {lim}_{γ ↓ 0} H_{[δ + γ]}^{D} (X)$ is finite ( $H^{+} < + \infty$ ). Letting $R = H^{+} + μ$ , where $μ > 0$ is an arbitrary constant, we shall show that R is $v_{D} (δ)$ -achievable.
Let $V^{n}$ be a random variable subject to $P_{V^{n}} \in B_{δ + γ}^{D} (X^{n})$ satisfying

$\begin{matrix} H_{[δ + γ]}^{D} (X^{n}) + γ \geq H (V^{n}) \end{matrix}$

(110)

for any fixed $γ \in (0, \frac{1}{2}]$ . We can choose a $c_{n} > 0$ so large that

$\begin{matrix} γ_{0} : = Pr {V^{n} \notin T_{n}} \leq γ \end{matrix}$

(111)

where

$\begin{matrix} T_{n} : = \{x \in X^{n} : \frac{1}{n} log \frac{1}{P_{V^{n}} (x)} \leq c_{n}\} . \end{matrix}$

(112)

We also define

$\begin{matrix} ℓ (x) : = ⌈log \frac{1}{P_{V^{n}} (x)} + n γ⌉ f o r x \in T_{n} . \end{matrix}$

(113)

Letting, for $m = 1, 2, \dots, β_{n} : = ⌈ n (c_{n} + γ) ⌉$ ,

$\begin{matrix} S_{n} (m) : = \{x \in X^{n} : ℓ (x) = m\}, \end{matrix}$

(114)

these sets form a partition of $T_{n}$ :

$\begin{matrix} ⋃_{m = 1}^{β_{n}} S_{n} (m) = T_{n} . \end{matrix}$

(115)

We set $L_{n}$ so that

$\begin{matrix} Pr {L_{n} = m} = \frac{Pr {V^{n} \in S_{n} (m)}}{Pr {V^{n} \in T_{n}}} = \frac{Pr {V^{n} \in S_{n} (m)}}{1 - γ_{0}}, \end{matrix}$

(116)

which satisfies

$\begin{matrix} \sum_{m = 1}^{β_{n}} Pr {L_{n} = m} = \frac{Pr {V^{n} \in T_{n}}}{1 - γ_{0}} = 1, \end{matrix}$

(117)

and, hence, the probability distribution of $U^{(L_{n})}$ is given as

$\begin{matrix} P_{U^{(L_{n})}} (u, m) : = Pr {U^{(L_{n})} = u, L_{n} = m} = \frac{Pr {V^{n} \in S_{n} (m)}}{(1 - γ_{0}) K^{m}} (\forall u \in U^{m}) . \end{matrix}$

(118)

(a): Construction of Mapping $φ_{n} : U^{*} \to X^{n}$ :
Index the elements in $S_{n} (m)$ as $x_{1}, x_{2}, \dots, x_{| S_{n} (m) |} (m = 1, 2, \dots, β_{n})$ , where it holds that

$\begin{matrix} | S_{n} (m) | \leq K^{m - n γ} \end{matrix}$

(119)

(cf. (61)–(63)). For $i = 1, 2, \dots, | S_{n} (m) |$ , define ${\tilde{A}}_{i}^{(m)} \subset U^{m}$ as the set of sequences $u \in U^{m}$ so that

$\begin{matrix} \sum_{u \in {\tilde{A}}_{i}^{(m)}} P_{U^{(L_{n})}} (u, m) \leq \frac{P_{V^{n}} (x_{i})}{1 - γ_{0}} < \sum_{u \in {\tilde{A}}_{i}^{(m)}} P_{U^{(L_{n})}} (u, m) + \frac{Pr {L_{n} = m}}{K^{m}} \end{matrix}$

(120)

and

$\begin{matrix} {\tilde{A}}_{i}^{(m)} \cap {\tilde{A}}_{j}^{(m)} = \emptyset (i \neq j) . \end{matrix}$

(121)

If

$\begin{matrix} \sum_{i = 1}^{| S_{n} (m) |} \sum_{u \in {\tilde{A}}_{i}^{(m)}} P_{U^{(L_{n})}} (u, m) < \frac{1}{1 - γ_{0}} \sum_{i = 1}^{| S_{n} (m) |} P_{V^{n}} (x_{i}) = Pr {L_{n} = m}, \end{matrix}$

(122)

then add a $u_{i} \in U^{m} ∖ (\cup_{j} {\tilde{A}}_{j}^{(m)})$ to obtain

$\begin{matrix} A_{i}^{(m)} = {\tilde{A}}_{i}^{(m)} \cup {u_{i}} \end{matrix}$

(123)

for $i = 1, 2, \dots$ in order, until it holds that with some $1 \leq c \leq | S_{n} (m) |$

$\begin{matrix} ⋃_{i = 1}^{c} A_{i}^{(m)} \cup ⋃_{i = c + 1}^{| S_{n} (m) |} {\tilde{A}}_{i}^{(m)} = U^{m}, \end{matrix}$

(124)

where $u_{1}, u_{2}, \dots$ are selected to be all distinct. Since $| U^{m} | = K^{m}$ and

$\begin{matrix} \sum_{u \in U^{m}} P_{U^{(L_{n})}} (u, m) = \sum_{u \in U^{m}} \frac{Pr {V^{n} \in S_{n} (m)}}{(1 - γ_{0}) K^{m}} = Pr {L_{n} = m}, \end{matrix}$

(125)

such a $1 \leq c \leq | S_{n} (m) |$ always exists. For simplicity, we set for $i = c + 1, c + 2, \dots, | S_{n} (m) |$

$\begin{matrix} A_{i}^{(m)} = {\tilde{A}}_{i}^{(m)} \end{matrix}$

(126)

and for $i = 1, 2, \dots, | S_{n} (m) |$

$\begin{matrix} φ_{n} (u) = x_{i} f o r u \in A_{i}^{(m)}, \end{matrix}$

(127)

which defines the random variable ${\tilde{X}}^{n}$ with values in $X^{n}$ such that

$\begin{matrix} P_{{\tilde{X}}^{n}} (x_{i}) = \sum_{u \in A_{i}^{(m)}} P_{U^{(L_{n})}} (u, m) (x_{i} \in S_{n} (m)), \end{matrix}$

(128)

that is, ${\tilde{X}}^{n} = φ_{n} (U^{(L_{n})})$ . Notice that, by this construction, we have

$\begin{matrix} |P_{{\tilde{X}}^{n}} (x_{i}) - \frac{P_{V^{n}} (x_{i})}{1 - γ_{0}}| \leq \frac{Pr {L_{n} = m}}{K^{m}} = \frac{Pr {V^{n} \in S_{n} (m)}}{(1 - γ_{0}) K^{m}} \end{matrix}$

(129)

for $i = 1, 2, \dots, | S_{n} (m) |; m = 1, 2, \dots, β_{n}$ , and

$\begin{matrix} Pr {{\tilde{X}}^{n} \notin T_{n}} = 0 and Pr {V^{n} \notin T_{n}} \leq γ . \end{matrix}$

(130)
(b): Evaluation of Average Length:
The average length $E [L_{n}]$ is evaluated as follows:

$\begin{matrix} E [L_{n}] & = \sum_{m = 1}^{β_{n}} \sum_{u \in U^{m}} P_{U^{(L_{n})}} (u, m) \cdot m \\ = \sum_{m = 1}^{β_{n}} \sum_{i = 1}^{| S_{n} (m) |} \sum_{u \in A_{i}^{(m)}} P_{U^{(L_{n})}} (u, m) \cdot m \\ = \sum_{m = 1}^{β_{n}} \sum_{x_{i} \in S_{n} (m)} P_{{\tilde{X}}^{n}} (x_{i}) \cdot m, \end{matrix}$

(131)

where we have used $U^{m} = ⋃_{i = 1}^{| S_{n} (m) |} A_{i}^{(m)}$ and (128). For $x_{i} \in S_{n} (m)$ we obtain from (129) and the right inequality of (130)

$\begin{matrix} P_{{\tilde{X}}^{n}} (x_{i}) & \leq \frac{P_{V^{n}} (x_{i})}{1 - γ_{0}} + \frac{Pr {V^{n} \in S_{n} (m)}}{(1 - γ_{0}) K^{m}} \\ = \frac{1}{1 - γ_{0}} (P_{V^{n}} (x_{i}) + \frac{Pr {V^{n} \in S_{n} (m)}}{K^{m}}) \\ \leq \frac{1}{1 - γ_{0}} (1 + \frac{1}{P_{V^{n}} (x_{i}) K^{m}}) P_{V^{n}} (x_{i}) \\ \leq (1 + 2 γ) (1 + \frac{1}{K^{n γ}}) P_{V^{n}} (x_{i}), \end{matrix}$

(132)

where, to derive the last inequality, we have used the fact $0 \leq γ_{0} \leq γ \leq \frac{1}{2}$ and

$\begin{matrix} P_{V^{n}} (x_{i}) \geq K^{- (m - n γ)} (\forall x_{i} \in S_{n} (m)) . \end{matrix}$

(133)

It should be noticed that (132) also implies that

$\begin{matrix} P_{{\tilde{X}}^{n}} (x) \leq (1 + 2 γ) (1 + \frac{1}{K^{n γ}}) P_{V^{n}} (x) (\forall x \in X^{n}) \end{matrix}$

(134)

since $P_{{\tilde{X}}^{n}} (x) = 0$ for $x \notin T_{n} = ⋃_{m = 1}^{β_{n}} S_{n} (m)$ . Plugging the inequality

$\begin{matrix} m \leq log \frac{1}{P_{V^{n}} (x_{i})} + n γ + 1 (\forall x_{i} \in S_{n} (m)) \end{matrix}$

(135)

and (132) into (131), we obtain

$\begin{matrix} E [L_{n}] & \leq (1 + 2 γ) (1 + \frac{1}{K^{n γ}}) \\ \cdot \sum_{m = 1}^{β_{n}} \sum_{x_{i} \in S_{n} (m)} P_{V^{n}} (x_{i}) (log \frac{1}{P_{V^{n}} (x_{i})} + n γ + 1) \\ \leq (1 + 2 γ) (1 + \frac{1}{K^{n γ}}) (H (V^{n}) + n γ + 1) . \end{matrix}$

(136)

Thus, we obtain from (136)

$\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{n} E [L_{n}] & \leq (1 + 2 γ) \{\underset{n \to \infty}{lim sup} \frac{1}{n} H (V^{n}) + γ\} \\ \leq (1 + 2 γ) \{\underset{n \to \infty}{lim sup} \frac{1}{n} H_{[δ + γ]}^{D} (X^{n}) + 2 γ\} \\ \leq (1 + 2 γ) (H^{+} + 2 γ), \end{matrix}$

(137)

where the second inequality follows from (110). Since we have assumed that $H^{+}$ is finite and $γ \in (0, \frac{1}{2}]$ is arbitrary, the r.h.s. of (137) can be made as close to $H^{+}$ as desired. Therefore, for all sufficiently small $γ > 0$ , we obtain

$\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{n} E [L_{n}] & \leq H^{+} + μ = R \end{matrix}$

(138)
(c): Evaluation of Divergence:
The divergence $D ({\tilde{X}}^{n} | | X^{n})$ can be rewritten as

$\begin{matrix} D ({\tilde{X}}^{n} | | X^{n}) = D ({\tilde{X}}^{n} | | V^{n}) + E [log \frac{P_{V^{n}} ({\tilde{X}}^{n})}{P_{X^{n}} ({\tilde{X}}^{n})}] . \end{matrix}$

(139)

In view of (132), we obtain

$\begin{matrix} D ({\tilde{X}}^{n} | | V^{n}) & = \sum_{m = 1}^{β_{n}} \sum_{x \in S_{n} (m)} P_{{\tilde{X}}^{n}} (x) log \frac{P_{{\tilde{X}}^{n}} (x)}{P_{V^{n}} (x)} \\ \leq \sum_{m = 1}^{β_{n}} \sum_{x \in S_{n} (m)} P_{{\tilde{X}}^{n}} (x) log \{(1 + 2 γ) (1 + \frac{1}{K^{n γ}})\} \\ \leq \frac{2 γ}{ln K} + log (1 + \frac{1}{K^{n γ}}) \end{matrix}$

(140)

and

$\begin{matrix} E [log \frac{P_{V^{n}} ({\tilde{X}}^{n})}{P_{X^{n}} ({\tilde{X}}^{n})}] & = \sum_{x \in X^{n}} P_{{\tilde{X}}^{n}} (x) log \frac{P_{V^{n}} (x)}{P_{X^{n}} (x)} \\ \leq (1 + 2 γ) (1 + \frac{1}{K^{n γ}}) D (V^{n} | | X^{n}) \\ \leq (1 + 2 γ) (δ + γ) (1 + \frac{1}{K^{n γ}}), \end{matrix}$

(141)

where to obtain the last inequality we used the fact that $P_{V^{n}} \in B_{δ + γ}^{D} (X^{n})$ . Plugging (140) and (141) into (139) yields

$\begin{matrix} \underset{n \to \infty}{lim sup} D ({\tilde{X}}^{n} | | X^{n}) & \leq \frac{2 γ}{ln K} + (1 + 2 γ) (δ + γ) \\ \leq δ + γ (2 δ + 5), \end{matrix}$

(142)

where we have used the fact that $\frac{2 γ}{ln K} \leq 3 γ$ for all $K \geq 2$ and the assumption $0 < γ \leq \frac{1}{2}$ to derive the last inequality. Since $γ \in (0, \frac{1}{2}]$ is arbitrary and we have (138), R is $v_{D} (δ)$ -achievable.

□

5. Mean and VL Channel Resolvability

So far, we have studied the problem of source resolvability, whereas the problem of channel resolvability has been introduced by Han and Verdú [6] to investigate the capacity of identification codes [11]. In a conventional problem of this kind, a target output distribution

P_{Y^{n}}

via a channel

W^{n}

due to an input

X^{n}

is approximated by encoding the FL uniform random number

U_{M_{n}}

as a channel input. In this section, we generalize the problem of such channel resolvability to that in the variable-length setting.

5.1. Definitions

Let

X

and

Y

be finite or countably infinite alphabets. Let

W = {W^{n}}_{n = 1}^{\infty}

be a general channel, where

W^{n} : X^{n} \to Y^{n}

denotes a stochastic mapping. We denote by

Y = {Y^{n}}_{n = 1}^{\infty}

the output process via

W

due to an input process

X = {X^{n}}_{n = 1}^{\infty}

, where

X^{n}

and

Y^{n}

take values in

X^{n}

and

Y^{n}

, respectively. Again, we do not impose any assumptions such as stationarity or ergodicity on either

X

or

W

. As in the previous sections, we will identify

X^{n}

and

Y^{n}

with their probability distributions

P_{X^{n}}

and

P_{Y^{n}}

, respectively, and these symbols are used interchangeably.

In this section, we consider several types of problems of approximating a target output distribution

P_{Y^{n}}

. The first one is the problem of mean-resolvability [6], in which the channel input is allowed to be an arbitrary general source.

Definition 5

(mean

δ

-channel resolvability: variational distance). Let

δ \in [0, 1)

be fixed arbitrarily. A resolution rate

R \geq 0

is said to be mean δ-achievable for

X

(under the variational distance) if there exists a general source

\tilde{X} = {{\tilde{X}}^{n}}_{n = 1}^{\infty}

satisfying

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{n} H ({\tilde{X}}^{n}) & \leq R, \end{matrix}

(143)

\begin{matrix} \underset{n \to \infty}{lim sup} d (P_{Y^{n}}, P_{{\tilde{Y}}^{n}}) & \leq δ, \end{matrix}

(144)

where

{\tilde{Y}}^{n}

denotes the output via

W^{n}

due to input

{\tilde{X}}^{n}

. The infimum of all mean δ-achievable rates for

X

, i.e.,

\begin{matrix} {\bar{S}}_{v} (δ | X, W) : = inf {R : R i s m e a n δ - a c h i e v a b l e f o r X} \end{matrix}

(145)

is referred to as the mean δ-resolvability for

X

. We also define the mean δ-resolvability for the worst input as

\begin{matrix} {\bar{S}}_{v} (δ | W) : = sup_{X} {\bar{S}}_{v} (δ | X, W) . \end{matrix}

(146)

On the other hand, we may also consider the problem of VL channel resolvability. Here, the VL uniform random number

U^{(L_{n})}

is defined as in the foregoing sections. Consider the problem of approximating the target output distribution

P_{Y^{n}}

via

W^{n}

due to

X^{n}

by using another input

{\tilde{X}}^{n} = φ_{n} (U^{(L_{n})})

with a deterministic mapping

φ_{n} : U^{*} \to X^{n}

.

Definition 6

(VL

δ

-channel resolvability: variational distance). Let

δ \in [0, 1)

be fixed arbitrarily. A resolution rate

R \geq 0

is said to be VL δ-achievable or simply

v (δ)

-achievable for

X

(under the variational distance) if there exists a VL uniform random number

U^{(L_{n})}

and a deterministic mapping

φ_{n} : U^{*} \to X^{n}

satisfying

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{n} E [L_{n}] & \leq R, \end{matrix}

(147)

\begin{matrix} \underset{n \to \infty}{lim sup} d (P_{Y^{n}}, P_{{\tilde{Y}}^{n}}) & \leq δ, \end{matrix}

(148)

where

E [\cdot]

denotes the expected value and

{\tilde{Y}}^{n}

denotes the output via

W^{n}

due to input

{\tilde{X}}^{n} = φ_{n} (U^{(L_{n})})

. The infimum of all

v (δ)

-achievable rates for

X

, i.e.,

\begin{matrix} S_{v} (δ | X, W) : = inf {R : R i s v (δ) - a c h i e v a b l e f o r X} \end{matrix}

(149)

is called the VL δ-channel resolvability or simply

v (δ)

-channel resolvability for

X

. We also define the VL δ-channel resolvability or simply

v (δ)

-channel resolvability for the worst input as

\begin{matrix} S_{v} (δ | W) : = sup_{X} S_{v} (δ | X, W) . \end{matrix}

(150)

When

W^{n}

is the identity mapping, the problem of channel resolvability reduces to that of source resolvability, which has been investigated in the foregoing sections. In this sense, the problem of channel resolvability is a generalization of the problem of source resolvability.

Similarly to the problem of source resolvability, we may also use the divergence between the target output distribution

P_{Y^{n}}

and the approximated output distribution

P_{{\tilde{Y}}^{n}}

as an approximation measure.

Definition 7

(mean

δ

-channel resolvability: divergence). Let

δ \geq 0

be fixed arbitrarily. A resolution rate

R \geq 0

is said to be mean δ-achievable for

X

(under the divergence) if there exists a general source

\tilde{X} = {{\tilde{X}}^{n}}_{n = 1}^{\infty}

satisfying

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{n} H ({\tilde{X}}^{n}) & \leq R, \end{matrix}

(151)

\begin{matrix} \underset{n \to \infty}{lim sup} D ({\tilde{Y}}^{n} | | Y^{n}) & \leq δ, \end{matrix}

(152)

where

{\tilde{Y}}^{n}

denotes the output via

W^{n}

due to input

{\tilde{X}}^{n}

. The infimum of all mean δ-achievable rates for

X

, i.e.,

\begin{matrix} {\bar{S}}_{v}^{D} (δ | X, W) : = inf {R : R i s m e a n δ - a c h i e v a b l e f o r X} \end{matrix}

(153)

is referred to as the mean δ-channel resolvability for

X

. We also define the mean δ-channel resolvability for the worst input as

\begin{matrix} {\bar{S}}_{v}^{D} (δ | W) : = sup_{X} {\bar{S}}_{v}^{D} (δ | X, W) . \end{matrix}

(154)

Definition 8

(VL

δ

-channel resolvability: divergence). Let

δ \geq 0

be fixed arbitrarily. A resolution rate

R \geq 0

is said to be VL δ-achievable or simply

v_{D} (δ)

-achievable for

X

(under the divergence) if there exists a VL uniform random number

U^{(L_{n})}

and a deterministic mapping

φ_{n} : U^{*} \to X^{n}

satisfying

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{n} E [L_{n}] & \leq R, \end{matrix}

(155)

\begin{matrix} \underset{n \to \infty}{lim sup} D ({\tilde{Y}}^{n} | | Y^{n}) & \leq δ, \end{matrix}

(156)

where

E [\cdot]

denotes the expected value and

{\tilde{Y}}^{n}

denotes the output via

W^{n}

due to input

{\tilde{X}}^{n} = φ_{n} (U^{(L_{n})})

. The infimum of all

v_{D} (δ)

-achievable rates for

X

, i.e.,

\begin{matrix} S_{v}^{D} (δ | X, W) : = inf {R : R i s v_{D} (δ) - a c h i e v a b l e f o r X} \end{matrix}

(157)

is called the VL δ-channel resolvability or simply

v_{D} (δ)

-channel resolvability for

X

. We also define the VL δ-channel resolvability or simply

v_{D} (δ)

-channel resolvability for the worst input as

\begin{matrix} S_{v}^{D} (δ | W) : = sup_{X} S_{v}^{D} (δ | X, W) . \end{matrix}

(158)

Remark 12.

Since the outputs of a deterministic mapping

{\tilde{X}}^{n} = φ_{n} (U^{(L_{n})})

form a general source

\tilde{X}

, it holds that

\begin{matrix} {\bar{S}}_{v} (δ | X, W) & \leq S_{v} (δ | X, W) (δ \in [0, 1)), \end{matrix}

(159)

\begin{matrix} {\bar{S}}_{v}^{D} (δ | X, W) & \leq S_{v}^{D} (δ | X, W) (δ \geq 0) \end{matrix}

(160)

for any general source

X

and general channel

W

. These relations lead to the analogous relation for the mean/VL δ-channel resolvability for the worst input:

\begin{matrix} {\bar{S}}_{v} (δ | W) & \leq S_{v} (δ | W) (δ \in [0, 1)), \end{matrix}

(161)

\begin{matrix} {\bar{S}}_{v}^{D} (δ | W) & \leq S_{v}^{D} (δ | W) (δ \geq 0) . \end{matrix}

(162)

5.2. General Formulas

For a given general source

X = {X^{n}}_{n = 1}^{\infty}

and a general channel

W = {W^{n}}_{n = 1}^{\infty}

, let

Y = {Y^{n}}_{n = 1}^{\infty}

be the channel output via

W

due to input

X

. We define

\begin{matrix} H_{[δ], W^{n}} (X^{n}) = inf_{P_{V^{n}} \in B_{δ} (X^{n}, W^{n})} H (V^{n}), \end{matrix}

(163)

\begin{matrix} H_{[δ], W^{n}}^{D} (X^{n}) = inf_{P_{V^{n}} \in B_{δ}^{D} (X^{n}, W^{n})} H (V^{n}) \end{matrix}

(164)

where

H (V^{n})

denotes the Shannon entropy of

V^{n}

and

B_{δ} (X^{n}, W^{n})

and

B_{δ}^{D} (X^{n}, W^{n})

are defined as

\begin{matrix} B_{δ} (X^{n}, W^{n}) & = \{P_{V^{n}} \in P (X^{n}) : d (P_{Y^{n}}, P_{Z^{n}}) \leq δ\}, \end{matrix}

(165)

\begin{matrix} B_{δ}^{D} (X^{n}, W^{n}) & = \{P_{V^{n}} \in P (X^{n}) : D (Z^{n} | | Y^{n}) \leq δ\}, \end{matrix}

(166)

respectively, with

Z^{n}

defined as the output via

W^{n}

due to input

V^{n}

. Both

H_{[δ], W^{n}} (X^{n})

and

H_{[δ], W^{n}}^{D} (X^{n})

are nonincreasing functions of

δ

. In addition, we define

\begin{matrix} H_{[δ], W} (X) & = \underset{n \to \infty}{lim sup} \frac{1}{n} H_{[δ], W^{n}} (X^{n}), \end{matrix}

(167)

\begin{matrix} H_{[δ], W}^{D} (X) & = \underset{n \to \infty}{lim sup} \frac{1}{n} H_{[δ], W^{n}}^{D} (X^{n}), \end{matrix}

(168)

which play an important role in characterizing the mean/VL

δ

-channel resolvability.

We show the general formulas for the mean/VL

δ

-channel resolvability.

Theorem 5

(with variational distance). For any input process

X

and any general channel

W

,

\begin{matrix} {\bar{S}}_{v} (δ | X, W) = S_{v} (δ | X, W) = lim_{γ ↓ 0} H_{[δ + γ], W} (X) (δ \in [0, 1)) . \end{matrix}

(169)

In particular,

\begin{matrix} {\bar{S}}_{v} (δ | W) = S_{v} (δ | W) & = sup_{X} lim_{γ ↓ 0} H_{[δ + γ], W} (X) (δ \in [0, 1)) . \end{matrix}

(170)

Theorem 6

(with divergence). For any input process

X

and any general channel

W

,

\begin{matrix} {\bar{S}}_{v}^{D} (δ | X, W) = S_{v}^{D} (δ | X, W) = lim_{γ ↓ 0} H_{[δ + γ], W}^{D} (X) (δ \geq 0) . \end{matrix}

(171)

In particular,

\begin{matrix} {\bar{S}}_{v}^{D} (δ | W) = S_{v}^{D} (δ | W) = sup_{X} lim_{γ ↓ 0} H_{[δ + γ], W}^{D} (X) (δ \geq 0) . \end{matrix}

(172)

Remark 13.

It can be easily verified that the variational distance satisfies

\begin{matrix} d (P_{Y^{n}}, P_{Z^{n}}) & \leq d (P_{X^{n}}, P_{V^{n}}), \end{matrix}

(173)

and, therefore, we have

B_{δ} (X^{n}) \subseteq B_{δ} (X^{n}, W^{n})

. This relation and formulas (32) and (169) indicate that

\begin{matrix} S_{v} (δ | X, W) \leq S_{v} (δ | X) (δ \in [0, 1)) \end{matrix}

(174)

for any given channel

W

. Likewise, it is well known that the divergence satisfies the data processing inequality

D ({\tilde{Y}}^{n} | | Y^{n}) \leq D ({\tilde{X}}^{n} | | X^{n})

[33], and formulas (101) and (171) lead to

\begin{matrix} S_{v}^{D} (δ | X, W) \leq S_{v}^{D} (δ | X) (δ \geq 0), \end{matrix}

(175)

regardless of channel

W

.

Remark 14.

It is obvious that Theorems 5 and 6 reduce to Theorems 3 and 4, respectively, when the channel

W

is the identity mapping. Precisely, for the identity mapping

W = I

, the mean δ-resolvability and the

v (δ)

-channel resolvability for

X

are given by

\begin{matrix} {\bar{S}}_{v} (δ | X) = S_{v} (δ | X) = lim_{γ ↓ 0} H_{[δ + γ]} (X), \end{matrix}

(176)

where

{\bar{S}}_{v} (δ | X)

denotes the mean δ-resolvability

{\bar{S}}_{v} (δ | X, W)

for the identity mapping

W

. The analogous relationship holds under the divergence:

\begin{matrix} {\bar{S}}_{v}^{D} (δ | X) = S_{v}^{D} (δ | X) = lim_{γ ↓ 0} H_{[δ + γ]}^{D} (X), \end{matrix}

(177)

where

{\bar{S}}_{v}^{D} (δ | X)

denotes the mean δ-resolvability

{\bar{S}}_{v}^{D} (δ | X, W)

for the identity mapping

W = I

. Thus, it turns out that Theorems 5 and 6 are indeed generalizations of Theorems 3 and 4.

Proof of Theorems 5 and 6.

(1): Converse Part:
Because of the general relationship (159), to prove the converse part of Theorem 5, it suffices to show that

$\begin{matrix} {\bar{S}}_{v} (δ | X, W) & \geq lim_{γ ↓ 0} H_{[δ + γ], W} (X) . \end{matrix}$

(178)

Let R be mean $δ$ -achievable for $X$ under the variational distance. Then, there exists a general source $\tilde{X} = {{\tilde{X}}^{n}}_{n = 1}^{\infty}$ satisfying (143) and

$\begin{matrix} \underset{n \to \infty}{lim sup} δ_{n} \leq δ, \end{matrix}$

(179)

where $δ_{n} : = d (P_{Y^{n}}, P_{{\tilde{Y}}^{n}})$ . Fixing $γ > 0$ arbitrarily, we have $δ_{n} \leq δ + γ$ for all $n \geq n_{0}$ with some $n_{0} > 0$ and then

$\begin{matrix} H_{[δ + γ], W^{n}} (X^{n}) \leq H_{[δ_{n}], W^{n}} (X^{n}) (n \geq n_{0}) \end{matrix}$

(180)

since $H_{[δ], W^{n}} (X^{n})$ is a nonincreasing function of $δ$ . Since $P_{{\tilde{X}}^{n}} \in B_{δ_{n}} (X^{n}, W^{n})$ , we have $H_{[δ_{n}], W^{n}} (X^{n}) \leq H ({\tilde{X}}^{n})$ . Thus, we obtain from (143)

$\begin{matrix} H_{[δ + γ], W} (X) \leq \underset{n \to \infty}{lim sup} \frac{1}{n} H ({\tilde{X}}^{n}) \leq R . \end{matrix}$

(181)

Since $γ > 0$ is an arbitrary constant, this implies (178).
The converse part of Theorem 6 can be proven in an analogous way with due modifications.
(2): Direct Part:
Because of the general relationship (159), to prove the direct part (achievability) of Theorem 5, it suffices to show that, for any fixed $γ > 0$ , the resolution rate

$\begin{matrix} R = lim_{γ ↓ 0} H_{[δ + γ], W} (X) + 3 γ \end{matrix}$

(182)

is $v (δ)$ -achievable for $X$ under the variational distance.
Let $P_{V^{n}} \in B_{δ + γ} (X^{n}, W^{n})$ be a source satisfying

$\begin{matrix} H (V^{n}) \leq H_{[δ + γ], W^{n}} (X^{n}) + γ . \end{matrix}$

(183)

Then, by the same argument to derive (80) and (82) as developed in the proof of the direct part of Theorem 3, we can construct a VL uniform random number $U^{(L_{n})}$ and a deterministic mapping $φ_{n} : U^{*} \to X^{n}$ satisfying

$\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{n} E [L_{n}] \leq lim_{γ ↓ 0} H_{[δ + γ], W} (X) + 3 γ = R \end{matrix}$

(184)

and

$\begin{matrix} d (P_{{\tilde{X}}^{n}}, P_{V^{n}}) \leq \frac{1}{2} K^{- n γ} + γ, \end{matrix}$

(185)

where ${\tilde{X}}^{n} = φ_{n} (U^{(L_{n})})$ . Let $Z^{n}$ denote the output random variable via $W^{n}$ due to input $V^{n}$ . Then, letting ${\tilde{Y}}^{n}$ be the output via channel $W^{n}$ due to input ${\tilde{X}}^{n}$ , we can evaluate $d (P_{{\tilde{Y}}^{n}}, P_{Z^{n}})$ as

$\begin{matrix} d (P_{{\tilde{Y}}^{n}}, P_{Z^{n}}) & = \frac{1}{2} \sum_{y \in Y^{n}} | P_{{\tilde{Y}}^{n}} (y) - P_{Z^{n}} (y) | \\ = \frac{1}{2} \sum_{y \in Y^{n}} |\sum_{x \in X^{n}} W (y | x) (P_{{\tilde{X}}^{n}} (x) - P_{V^{n}} (x))| \\ \leq \frac{1}{2} \sum_{y \in Y^{n}} \sum_{x \in X^{n}} W (y | x) |P_{{\tilde{X}}^{n}} (x) - P_{V^{n}} (x)| \\ = d (P_{{\tilde{X}}^{n}}, P_{V^{n}}) \leq \frac{1}{2} K^{- n γ} + γ . \end{matrix}$

(186)

Thus, we obtain

$\begin{matrix} \underset{n \to \infty}{lim sup} d (P_{Y^{n}}, P_{{\tilde{Y}}^{n}}) & \leq \underset{n \to \infty}{lim sup} d (P_{Y^{n}}, P_{Z^{n}}) + \underset{n \to \infty}{lim sup} d (P_{{\tilde{Y}}^{n}}, P_{Z^{n}}) \\ \leq δ + 2 γ, \end{matrix}$

(187)

where we have used the fact $P_{V^{n}} \in B_{δ + γ} (X^{n}, W^{n})$ to derive the last inequality. Since $γ > 0$ is an arbitrary constant, we can conclude that R is $v (δ)$ -achievable for $X$ .
The direct part of Theorem 6 can be proven in the same way as Theorem 4 with due modifications. Fixing $P_{V^{n}} \in B_{δ + γ}^{D} (X^{n}, W^{n})$ and using the encoding scheme as developed in the proof of Theorem 4, the evaluation of the average length rate is exactly the same, and we can obtain (138). A key step is to evaluate the divergence $D ({\tilde{Y}}^{n} | | Y^{n})$ , which can be rewritten as

$\begin{matrix} D ({\tilde{Y}}^{n} | | Y^{n}) = D ({\tilde{Y}}^{n} | | Z^{n}) + E [log \frac{P_{Z^{n}} ({\tilde{Y}}^{n})}{P_{Y^{n}} ({\tilde{Y}}^{n})}] . \end{matrix}$

(188)

The first term on the r.h.s. can be bounded as

$\begin{matrix} D ({\tilde{Y}}^{n} | | Z^{n}) & \leq D ({\tilde{X}}^{n} | | V^{n}) \leq \frac{2 γ}{ln K} + log (1 + \frac{1}{K^{n γ}}) \end{matrix}$

(189)

as in (140), where the left inequality is due to the data processing inequality. Similarly to the derivation of (141), the second term can be bounded as

$\begin{matrix} E [log \frac{P_{Z^{n}} ({\tilde{Y}}^{n})}{P_{Y^{n}} ({\tilde{Y}}^{n})}] & = \sum_{y \in Y^{n}} \sum_{x \in X^{n}} P_{{\tilde{X}}^{n}} (x) W^{n} (y | x) log \frac{P_{Z^{n}} (y)}{P_{Y^{n}} (y)} \\ \leq (1 + 2 γ) (1 + \frac{1}{K^{n γ}}) \sum_{y \in Y^{n}} \sum_{x \in X^{n}} P_{V^{n}} (x) W^{n} (y | x) log \frac{P_{Z^{n}} (y)}{P_{Y^{n}} (y)} \\ = (1 + 2 γ) (1 + \frac{1}{K^{n γ}}) D (Z^{n} | | Y^{n}), \end{matrix}$

(190)

where we have used (134). Here, $D (Z^{n} | | Y^{n}) \leq δ + γ$ because $Z^{n}$ is the output via $W^{n}$ due to input $V^{n} \in B_{δ + γ}^{D} (X^{n}, W^{n})$ . The rest of the steps are the same as in the proof of Theorem 4.

□

6. Second-Order VL Channel Resolvability

So far, we have analyzed the first-order VL resolvabilities and established various first-order resolvability theorems. One of the next important steps is the second-order analysis, and so, in this section, we generalize VL resolvabilities in the second-order setting.

6.1. Definitions

We now turn to considering the second-order resolution rates [26,27,29]. First, we consider the VL resolvability based on the variational distance.

Definition 9

(VL

(δ, R)

-channel resolvability: variational distance). A second-order resolution rate

L \in (- \infty, + \infty)

is said to be VL

(δ, R)

-achievable (under the variational distance) for

X

with

δ \in [0, 1)

if there exist a VL uniform random number

U^{(L_{n})}

and a deterministic mapping

φ_{n} : U^{*} \to X^{n}

satisfying

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} (E [L_{n}] - n R) & \leq L, \end{matrix}

(191)

\begin{matrix} \underset{n \to \infty}{lim sup} d (P_{Y^{n}}, P_{{\tilde{Y}}^{n}}) & \leq δ, \end{matrix}

(192)

where

{\tilde{Y}}^{n}

denotes the output via

W^{n}

due to input

{\tilde{X}}^{n} = φ_{n} (U^{(L_{n})})

. The infimum of all VL

(δ, R)

-achievable rates for

X

is denoted by

\begin{matrix} T_{v} (δ, R | X, W) : = inf {L : L i s V L (δ, R) - a c h i e v a b l e f o r X} . \end{matrix}

(193)

When

W

is the identity mapping

I

,

T_{v} (δ, R | X, W)

is simply denoted by

T_{v} (δ, R | X)

(source resolvability).

Next, we may consider the VL resolvability with the divergence instead of the variational distance.

Definition 10

(VL

(δ, R)

-channel resolvability: divergence). A second-order resolution rate

L \in (- \infty, + \infty)

is said to be VL

(δ, R)

-achievable for

X

(with the divergence) where

δ \geq 0

if there exists a VL uniform random number

U^{(L_{n})}

and a deterministic mapping

φ_{n} : U^{*} \to X^{n}

satisfying

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} (E [L_{n}] - n R) & \leq L, \end{matrix}

(194)

\begin{matrix} \underset{n \to \infty}{lim sup} D ({\tilde{Y}}^{n} | | Y^{n}) & \leq δ, \end{matrix}

(195)

where

{\tilde{Y}}^{n}

denotes the output via

W^{n}

due to input

{\tilde{X}}^{n} = φ_{n} (U^{(L_{n})})

. The infimum of all VL

(δ, R)

-achievable rates for

X

is denoted as

\begin{matrix} T_{v}^{D} (δ, R | X, W) : = inf {L : L i s V L (δ, R) - a c h i e v a b l e f o r X} . \end{matrix}

(196)

When

W

is the identity mapping

I

,

T_{v}^{D} (δ, R | X, W)

is simply denoted by

T_{v}^{D} (δ, R | X)

(source resolvability).

Remark 15.

It is easily verified that

\begin{matrix} T_{v} (δ, R | X, W) = \{\begin{matrix} + \infty & f o r R < S_{v} (δ | X, W) \\ - \infty & f o r R > S_{v} (δ | X, W) . \end{matrix} \end{matrix}

(197)

Hence, only the case

R = S_{v} (δ | X, W)

is of interest to us. The same remark also applies to

T_{v}^{D} (δ, R | X, W)

.

6.2. General Formulas

We establish general formulas for the second-order resolvability. The proofs of the following theorems are given below subsequently to Remark 17.

Theorem 7

(with variational distance). For any input process

X

and general channel

W

,

\begin{matrix} T_{v} (δ, R | X, W) = lim_{γ ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} (H_{[δ + γ], W^{n}} (X^{n}) - n R) (δ \in [0, 1), R \geq 0) . \end{matrix}

(198)

In particular, in the case where

W

is the identity mapping

I

,

\begin{matrix} T_{v} (δ, R | X) = lim_{γ ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} (H_{[δ + γ]} (X^{n}) - n R) (δ \in [0, 1), R \geq 0) . \end{matrix}

(199)

Theorem 8

(with divergence). For any input process

X

and general channel

W

,

\begin{matrix} T_{v}^{D} (δ, R | X, W) = lim_{γ ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} (H_{[δ + γ], W^{n}}^{D} (X^{n}) - n R) (δ \geq 0, R \geq 0) . \end{matrix}

(200)

In particular, in the case where

W

is the identity mapping

I

,

\begin{matrix} T_{v}^{D} (δ, R | X) = lim_{γ ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} (H_{[δ + γ]}^{D} (X^{n}) - n R) (δ \geq 0, R \geq 0) . \end{matrix}

(201)

Remark 16.

As discussed in Section 5, we may also consider using a general source

\tilde{X}

as an input to channel

W

, and we can define L to be a mean

(δ, R)

-achievable rate for

X

by replacing (191) and (194) with

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} (H ({\tilde{X}}^{n}) - n R) & \leq L . \end{matrix}

(202)

Let

{\bar{T}}_{v} (δ, R | X, W)

and

{\bar{T}}_{v}^{D} (δ, R | X, W)

denote the infimum of all mean

(δ, R)

-achievable rates for

X

under the variational distance and the divergence, respectively. Then, it is not difficult to verify that

\begin{matrix} {\bar{T}}_{v} (δ, R | X, W) & = T_{v} (δ, R | X, W) (δ \in [0, 1)), \end{matrix}

(203)

\begin{matrix} {\bar{T}}_{v}^{D} (δ, R | X, W) & = T_{v}^{D} (δ, R | X, W) (δ \geq 0) . \end{matrix}

(204)

Thus, there is no loss in the

(δ, R)

-achievable resolution rate even if the channel input

\tilde{X}

is restricted to be generated by the VL uniform random number

U^{(L_{n})}

.

Remark 17.

As in the first-order case, when the channel

W

is the identity mapping

I

,

T_{v} (δ, R | X)

coincides with the minimum second-order length rate of the VL source codes. More precisely, we denote by

R_{v}^{*} (δ, R | X)

the minimum second-order length rate of a sequence of VL source codes with the first-order average length rate R and an average error probability asymptotically not exceeding δ. Yagi and Nomura [31] have shown that

\begin{matrix} R_{v}^{*} (δ, R | X) = lim_{γ ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} (G_{[δ + γ]} (X^{n}) - n R) (δ \in [0, 1), R \geq 0) . \end{matrix}

(205)

Modifying the proof of Proposition 1 (cf. Appendix C), we can show that the r.h.s. of (199) coincides with that of (205), and, therefore, it generally holds that

\begin{matrix} T_{v} (δ, R | X) = R_{v}^{*} (δ, R | X) (δ \in [0, 1), R \geq 0) . \end{matrix}

(206)

As a special case, suppose that

X

is a stationary and memoryless source X with the finite third absolute moment of

log \frac{1}{P_{X} (X)}

. In this case, Kostina et al. [25] have recently given a single-letter characterization for

R_{v}^{*} (δ, R | X)

with

R = H_{[δ]} (X) = (1 - δ) H (X)

as

\begin{matrix} R_{v}^{*} (δ, R | X) = - \sqrt{\frac{V (X)}{2 π}} e^{- \frac{{(Q^{- 1} (δ))}^{2}}{2}}, \end{matrix}

(207)

where

V (X)

denotes the variance of

log \frac{1}{P_{X} (X)}

(varentropy) and

Q^{- 1}

is the inverse of the complementary cumulative distribution function of the standard Gaussian distribution. In view of the general relation (206), we can also obtain the single-letter characterization for

T_{v} (δ, R | X)

:

\begin{matrix} T_{v} (δ, R | X) = - \sqrt{\frac{V (X)}{2 π}} e^{- \frac{{(Q^{- 1} (δ))}^{2}}{2}} . \end{matrix}

(208)

It has not yet been made clear whether we can also have a single-letter formula for

T_{v} (δ, R | X, W)

when the channel

W

is memoryless but not necessarily the identity mapping.

Proof of Theorems 7 and 8.

(1): Converse Part:
We will show the converse part of Theorem 7. The converse part of Theorem 8 can be proved in an analogous way.
Let L be VL $(δ, R)$ -achievable for $X$ under the variational distance. Then, there exists $U^{(L_{n})}$ and $φ_{n}$ satisfying (191) and

$\begin{matrix} \underset{n \to \infty}{lim sup} δ_{n} \leq δ, \end{matrix}$

(209)

where we define $δ_{n} = d (P_{Y^{n}}, P_{{\tilde{Y}}^{n}}),$ and ${\tilde{Y}}^{n}$ is the output via $W^{n}$ due to input ${\tilde{X}}^{n} = φ_{n} (U^{(L_{n})})$ . Equation (209) implies that, for any given $γ > 0$ , it holds that $δ_{n} \leq δ + γ$ for all $n \geq n_{0}$ with some $n_{0} > 0$ , and, therefore,

$\begin{matrix} H_{[δ + γ], W^{n}} (X^{n}) \leq H_{[δ_{n}], W^{n}} (X^{n}) (\forall n \geq n_{0}) . \end{matrix}$

(210)

Since $P_{{\tilde{X}}^{n}} \in B_{δ_{n}} (X^{n}, W^{n})$ , we have

$\begin{matrix} H_{[δ_{n}], W^{n}} (X^{n}) \leq H ({\tilde{X}}^{n}) . \end{matrix}$

(211)

On the other hand, it follows from (23) that

$\begin{matrix} H ({\tilde{X}}^{n}) & \leq H (U^{(L_{n})}) = E [L_{n}] + H (L_{n}), \end{matrix}$

(212)

where the inequality is due to the fact that $φ_{n}$ is a deterministic mapping and ${\tilde{X}}^{n} = φ_{n} (U^{(L_{n})})$ .
Combining (210)–(212) yields

$\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} (H_{[δ + γ], W^{n}} (X^{n}) - n R) \\ \leq \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} (H ({\tilde{X}}^{n}) - n R) \\ \leq \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} (E [L_{n}] - n R) + \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} H (L_{n}) \leq L, \end{matrix}$

(213)

where we have used (24) and (191) for the last inequality. Since $γ > 0$ is arbitrary, we have

$\begin{matrix} lim_{γ ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} (H_{[δ + γ], W^{n}} (X^{n}) - n R) \leq L . \end{matrix}$

(214)
(2): Direct Part:
We will show the direct part (achievability) of Theorem 7 by modifying the argument of Theorems 3 and 5, whereas the direct part of Theorem 8 can be proved in a similar manner by modifying that of Theorem 4 instead of Theorem 3.
Letting

$\begin{matrix} L = lim_{γ ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} (H_{[δ + γ], W^{n}} (X^{n}) - n R) + 2 γ, \end{matrix}$

(215)

where $γ > 0$ is an arbitrary constant, we shall show that L is VL $(δ, R)$ -achievable for $X$ under the variational distance.
We use the same achievability scheme as in the proof of Theorem 3 with slightly different parameter settings. For $γ > 0$ , we choose a $c_{n} > 0$ so that

$\begin{matrix} Pr {V^{n} \notin T_{n}} \leq γ \end{matrix}$

(216)

where $P_{V^{n}} \in B_{δ + γ} (X^{n}, W^{n})$ with $H_{[δ + γ], W^{n}} (X^{n}) + γ \geq H (V^{n})$ and

$\begin{matrix} T_{n} : = \{x \in X^{n} : \frac{1}{n} log \frac{1}{P_{V^{n}} (x)} \leq c_{n}\} . \end{matrix}$

(217)

We here define

$\begin{matrix} ℓ (x) : = \{\begin{matrix} ⌈ log \frac{1}{P_{V^{n}} (x)} + \sqrt{n} γ ⌉ & f o r x \in T_{n} \\ 0 & otherwise \end{matrix} \end{matrix}$

(218)

and $β_{n} = ⌈ (n c_{n} + \sqrt{n} γ) ⌉$ . Arguing similarly to the proof of Theorems 3 and 5, we can show that there exist $φ_{n} : U^{*} \to X^{n}$ and $U^{(L_{n})}$ such that

$\begin{matrix} d (P_{Y^{n}}, P_{{\tilde{Y}}^{n}}) \leq δ + 2 γ + \frac{1}{2} K^{- \sqrt{n} γ} \end{matrix}$

(219)

and

$\begin{matrix} E [L_{n}] & \leq (1 + \frac{1}{K^{\sqrt{n} γ}}) (H_{[δ + γ], W^{n}} (X^{n}) + 2 \sqrt{n} γ + 1) . \end{matrix}$

(220)

Therefore, we obtain

$\begin{matrix} lim_{n \to \infty} d (P_{Y^{n}}, P_{{\tilde{Y}}^{n}}) \leq δ + 2 γ \end{matrix}$

(221)

and

$\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} (E [L_{n}] - n R) \\ \leq \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} (H_{[δ + γ], W^{n}} (X^{n}) - n R) + 2 γ \\ \leq lim_{γ ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} (H_{[δ + γ], W^{n}} (X^{n}) - n R) + 2 γ = L . \end{matrix}$

(222)

Since $γ > 0$ is arbitrary, L is VL $(δ, R)$ -achievable for $X$ .

□

7. Conclusions

We have investigated the problem of VL source/channel resolvability, in which a given target probability distribution is approximated by transforming VL uniform random numbers. Table 1 summarizes various first-order resolvabilities and their characterizations in terms of information quantities. In this table, the theorem numbers that contain the corresponding characterization are also indicated.

In this paper, we have first analyzed the fundamental limits on the VL

δ

-source resolvability with the variational distance in Theorem 3. The VL

δ

-source resolvability is essentially characterized in terms of smooth Shannon entropies. In the proof of the direct part, we have developed a simple method for information spectrum slicing, in which sliced information densities quantized to the same integer are approximated by an FL uniform random number of the same length. Next, we have extended the analysis to the

δ

-source resolvability under the unnormalized divergence in Theorem 4. The smoothed entropy with the divergence again plays an important role in characterizing the

δ

-source resolvability.

Then, we have addressed the problem of

δ

-channel resolvability. It has been revealed in Theorems 5 and 6 that using an arbitrary general source as a coin distribution (mean-resolvability problem) cannot go beyond the fundamental limits of the VL resolvability, in which only VL uniform random numbers are allowed to be a coin distribution. As in the case of source resolvability, we have discussed the

δ

-channel resolvability under the variational distance and the unnormalized divergence. The second-order channel resolvability has been characterized in Theorems 7 and 8 as well as the first-order case. We notice here that a counterpart of the VL uniform random number is the problem of VL source coding, for which the general treatment, focused on overflow/underflow probabilities, is found in [38]. Indeed, when the variational distance is used as an approximation measure, it turns out that the

δ

-source resolvability is equal to the minimum achievable rate of VL source codes with an error probability of less than or equal to

δ

. This is a parallel relationship between FL source resolvability and the minimum achievable rate of FL source codes [6,7]. It is of interest to investigate whether there is a coding problem to which the

δ

-channel resolvability is closely related.

When

δ = 0

, asymptotically exact approximation is required. In the case where the channel

W

is the identity mapping

I

, it turned out that the source resolvability under the variational distance and the unnormalized divergence coincides and is given by

{lim}_{γ ↓ 0} H_{[γ]} (X)

, where

X

is the general target source. This result is analogous to the dual problem of VL intrinsic randomness [5,36], in which the maximum achievable rates of VL uniform random numbers extracted from a given source

X

are the same under two kinds of approximation measures. It should be emphasized that in the case of VL intrinsic randomness, the use of normalized divergence as an approximation measure results in the same general formula as with the variational distance and the unnormalized divergence, which does not necessarily hold in the case of mean/VL resolvability (cf. Remark 11). It is also noteworthy that whereas only the case of

δ = 0

has been completely solved for the VL intrinsic randomness, we have also dealt with the case of a

δ > 0

for the VL source/channel resolvability.

When

X

is a stationary and memoryless source or is even with a one-point spectrum (cf. Corollary 1), the formulas established reduce to a single-letter characterization for the first- and second-order source resolvability under the variational distance. In the case where the divergence is an approximation measure and/or the channel

W

is a non-identity mapping, however, it has not yet been made clear whether we can derive a single-letter characterization for the

δ

-source/channel resolvability. This question remains to be studied.

As noted in Remark 10, the order of arguments in divergence

D ({\tilde{X}}^{n} | | X^{n})

is important, and it seems difficult to extend the analyses in this paper to the case with the reversed

D (X^{n} | | {\tilde{X}}^{n})

used as an approximation measure. In the context of intrinsic randomness, the reversed divergence is also discussed [23]. Investigating the problem of source/channel resolvability using such divergence is an interesting research topic.

Author Contributions

The author H.Y. contributed to the conceptualization of the research goals and aims, the visualization/presentation of the works, the formal analysis of the results, and the writing of the paper. The author T.S.H. contributed to the formal analysis of the results, the validation of obtained results, and the review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported in part by JSPS KAKENHI Grant Numbers JP16K06340, JP18H01438, and JP20K04462.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Equation (31)

(i): We first show ${lim}_{α ↑ 1} H_{[δ]}^{α} (X^{n}) \leq H_{[δ]} (X^{n})$ .

Fix

γ > 0

arbitrarily. We choose

P_{V^{n}} \in B_{δ} (X^{n})

satisfying

\begin{matrix} H_{[δ]} (X^{n}) + γ \geq H (V^{n}) . \end{matrix}

(A1)

It is well known that the Rényi entropy of order

α

defined as

\begin{matrix} H^{α} (V^{n}) = \frac{1}{1 - α} log \sum_{x \in X^{n}} P_{V^{n}} {(x)}^{α} (\forall α \in (0, 1) \cup (1, + \infty)) \end{matrix}

satisfies

\begin{matrix} H (V^{n}) = lim_{α \to 1} H^{α} (V^{n}) . \end{matrix}

(A2)

By definition, we have

\begin{matrix} H^{α} (V^{n}) \geq H_{[δ]}^{α} (X^{n}) (\forall α \in (0, 1) \cup (1, + \infty)), \end{matrix}

(A3)

leading to

\begin{matrix} H (V^{n}) = lim_{α ↑ 1} H^{α} (V^{n}) \geq lim_{α ↑ 1} H_{[δ]}^{α} (X^{n}) . \end{matrix}

(A4)

Combining (A1) and (A4) yields

\begin{matrix} H_{[δ]} (X^{n}) + γ \geq H (V^{n}) \geq lim_{α ↑ 1} H_{[δ]}^{α} (X^{n}) . \end{matrix}

(A5)

Since

γ > 0

is an arbitrary constant, we obtain

{lim}_{α ↑ 1} H_{[δ]}^{α} (X^{n}) \leq H_{[δ]} (X^{n})

.

(ii): Next, we shall show ${lim}_{α ↑ 1} H_{[δ]}^{α} (X^{n}) \geq H_{[δ]} (X^{n})$ .

Fix

γ > 0

arbitrarily. We choose some

α_{0} \in (0, 1)

satisfying

\begin{matrix} lim_{α ↑ 1} H_{[δ]}^{α} (X^{n}) + γ \geq H_{[δ]}^{α_{0}} (X^{n}) . \end{matrix}

(A6)

For this

α_{0}

we choose

P_{V^{n}} \in B_{δ} (X^{n})

satisfying

\begin{matrix} H_{[δ]}^{α_{0}} (X^{n}) + γ \geq H^{α_{0}} (V^{n}) . \end{matrix}

(A7)

Since

H^{α} (V^{n})

is a nonincreasing function of

α

, we have

\begin{matrix} H^{α_{0}} (V^{n}) \geq H (V^{n}), \end{matrix}

(A8)

and it follows from (A6)–(A8) that

\begin{matrix} lim_{α ↑ 1} H_{[δ]}^{α} (X^{n}) + 2 γ \geq H^{α_{0}} (V^{n}) \geq H (V^{n}) . \end{matrix}

(A9)

Since

H (V^{n}) \geq H_{[δ]} (X^{n})

, due to

P_{V^{n}} \in B_{δ} (X^{n})

, and

γ > 0

is arbitrarily fixed, we obtain the desired inequality.

Appendix B. Proof of Equation (34)

To prove the alternative formula (34) for the

v (δ)

-resolvability

S_{v} (δ | X)

, we shall show

\begin{matrix} lim_{γ ↓ 0} H_{[δ + γ]} (X) = inf_{V \in B_{δ} (X)} H (V) (δ \in [0, 1)) . \end{matrix}

(A10)

(i): We first show $lim_{γ ↓ 0} H_{[δ + γ]} (X) \leq inf_{V \in B_{δ} (X)} H (V)$ .

Fix

γ > 0

arbitrarily. We choose

\tilde{V} = {{\tilde{V}}^{n}}_{n = 1}^{\infty} \in B_{δ} (X)

satisfying

\begin{matrix} H (\tilde{V}) \leq inf_{V \in B_{δ} (X)} H (V) + γ . \end{matrix}

(A11)

For

\tilde{V} \in B_{δ} (X)

, we have

d (X^{n}, {\tilde{V}}^{n}) \leq δ + γ

for all

n \geq n_{0}

with some

n_{0} > 0

, yielding

\begin{matrix} H_{[δ + γ]} (X^{n}) \leq H ({\tilde{V}}^{n}) (\forall n \geq n_{0}) . \end{matrix}

(A12)

Thus, it follows from (A11) and (A12) that

\begin{matrix} H_{[δ + γ]} (X) \leq inf_{V \in B_{δ} (X)} H (V) + γ . \end{matrix}

(A13)

Since

γ > 0

is an arbitrary constant, letting

γ ↓ 0

on both sides yields the desired inequality.

(ii): Next, we shall show $lim_{γ ↓ 0} H_{[δ + γ]} (X) \geq inf_{V \in B_{δ} (X)} H (V)$ .

Fix

λ > 0

arbitrarily. We choose an arbitrary decreasing sequence of positive numbers

{γ_{i}}_{i = 1}^{\infty}

satisfying

γ_{1} > γ_{2} > \dots \to 0

. Then, we have

\begin{matrix} lim_{γ ↓ 0} H_{[δ + γ]} (X) = lim_{i \to \infty} H_{[δ + γ_{i}]} (X) . \end{matrix}

(A14)

Additionally, by the definition of the limit superior, for each

i = 1, 2, \dots

we have

\begin{matrix} \frac{1}{n} H_{[δ + γ_{i}]} (X^{n}) \leq H_{[δ + γ_{i}]} (X) + λ (\forall n \geq n_{i}) \end{matrix}

(A15)

with some

0 < n_{1} < n_{2} < \dots

. Now, for each

n = 1, 2, \dots

, we denote by

i_{n}

the index i satisfying

\begin{matrix} n_{i} \leq n < n_{i + 1} . \end{matrix}

(A16)

Then, from (A15), we obtain

\begin{matrix} \frac{1}{n} H_{[δ + γ_{i_{n}}]} (X^{n}) \leq H_{[δ + γ_{i_{n}}]} (X) + λ (\forall n \geq n_{1}) . \end{matrix}

(A17)

On the other hand, by the definition of

H_{[δ + γ_{i_{n}}]} (X^{n})

, for each

n = 1, 2, \dots

, we can choose some

V_{i_{n}}^{n} \in B_{δ + γ_{i_{n}}} (X^{n})

satisfying

\begin{matrix} \frac{1}{n} H (V_{i_{n}}^{n}) \leq \frac{1}{n} H_{[δ + γ_{i_{n}}]} (X^{n}) + λ . \end{matrix}

(A18)

We now construct the general source

\tilde{V} = {V_{i_{n}}^{n}}_{n = 1}^{\infty}

. Since

V_{i_{n}}^{n} \in B_{δ + γ_{i_{n}}} (X^{n})

for all

n \geq n_{1}

indicates that

\begin{matrix} \underset{n \to \infty}{lim sup} d (X^{n}, V_{i_{n}}^{n}) \leq δ + lim_{n \to \infty} γ_{i_{n}} = δ, \end{matrix}

(A19)

the general source satisfies

\tilde{V} \in B_{δ} (X)

.

From (A17) and (A18), we obtain

\begin{matrix} \frac{1}{n} H (V_{i_{n}}^{n}) \leq H_{[δ + γ_{i_{n}}]} (X) + 2 λ (\forall n \geq n_{1}) . \end{matrix}

(A20)

In view of (A14) and the fact

\tilde{V} \in B_{δ} (X)

, taking

\underset{n \to \infty}{lim sup}

on both sides yields

\begin{matrix} inf_{V \in B_{δ} (X)} H (V) \leq H (\tilde{V}) & \leq \underset{n \to \infty}{lim sup} H_{[δ + γ_{i_{n}}]} (X) + 2 λ \\ = lim_{γ ↓ 0} H_{[δ + γ]} (X) + 2 λ . \end{matrix}

(A21)

Since

λ > 0

is an arbitrary constant, letting

λ ↓ 0

yields the desired inequality.

Appendix C. Proof of Proposition 1

We shall prove the equality and inequality in (40). Equation (41) is an immediate consequence of (40) because

{\bar{H}}_{δ} (X)

is a right-continuous function [2].

(i): We first shall show the equality in (40): $H_{[δ]} (X) = G_{[δ]} (X)$ .

We show

H_{[δ]} (X) \leq G_{[δ]} (X)

. For any given

γ > 0

and

P_{X^{n}}

, let

A_{n}^{*} \subseteq X^{n}

be a subset of

X^{n}

which satisfies

\begin{matrix} Pr {X^{n} \in A_{n}^{*}} \geq 1 - δ \end{matrix}

(A22)

and

\begin{matrix} G_{[δ]} (X^{n}) + γ \geq \sum_{x \in A_{n}^{*}} P_{X^{n}} (x) log \frac{1}{P_{X^{n}} (x)} = : F (A_{n}^{*}) . \end{matrix}

(A23)

Choose

x_{0} \in X^{n} ∖ A_{n}^{*}

arbitrarily. Set

P_{{\tilde{X}}^{n}}

so that

\begin{matrix} P_{{\tilde{X}}^{n}} (x) = \{\begin{matrix} P_{X^{n}} (x) & f o r x \in A_{n}^{*} \\ α_{0} & f o r x = x_{0} \\ 0 & otherwise, \end{matrix} \end{matrix}

(A24)

where we define

α_{0} = Pr {X^{n} \notin A_{n}^{*}}

.

The variational distance between

P_{X^{n}}

and

P_{{\tilde{X}}^{n}}

satisfies

\begin{matrix} d (P_{X^{n}}, P_{{\tilde{X}}^{n}}) & = \frac{1}{2} \sum_{x \notin A_{n}^{*}} | P_{X^{n}} (x) - P_{{\tilde{X}}^{n}} (x) | \\ \leq \frac{1}{2} \sum_{x \notin A_{n}^{*}} (P_{X^{n}} (x) + P_{{\tilde{X}}^{n}} (x)) \leq α_{0} = 1 - Pr {X^{n} \in A_{n}^{*}} \leq δ, \end{matrix}

(A25)

where the last inequality is due to (A22). Therefore,

P_{{\tilde{X}}^{n}} \in B_{δ} (X^{n})

, and this implies

\begin{matrix} H_{[δ]} (X^{n}) \leq H ({\tilde{X}}^{n}) & = F (A_{n}^{*}) + α_{0} log \frac{1}{α_{0}} \\ \leq G_{[δ]} (X^{n}) + α_{0} log \frac{1}{α_{0}} + γ \\ \leq G_{[δ]} (X^{n}) + \frac{log e}{e} + γ, \end{matrix}

(A26)

where the first inequality is due to (A23) and the last inequality is due to the inequality

x log x \geq - \frac{log e}{e}

for all

x > 0

. Thus, we obtain the desired inequality:

H_{[δ]} (X) \leq G_{[δ]} (X)

.

Next, we shall show

H_{[δ]} (X) \geq G_{[δ]} (X)

. Assume, without loss of generality, that the elements of

X^{n}

are indexed as

x_{1}, x_{2}, \dots \in X^{n}

so that

\begin{matrix} P_{X^{n}} (x_{i}) \geq P_{X^{n}} (x_{i + 1}) (\forall i = 1, 2, \dots) . \end{matrix}

(A27)

For a given

δ \in [0, 1)

, let

j^{*}

denote the integer satisfying

\begin{matrix} \sum_{i = 1}^{j^{*} - 1} P_{X^{n}} (x_{i}) < 1 - δ, \sum_{i = 1}^{j^{*}} P_{X^{n}} (x_{i}) \geq 1 - δ . \end{matrix}

(A28)

Let

V_{δ}^{n}

be a random variable taking values in

X^{n}

whose probability distribution is given by

\begin{matrix} P_{V_{δ}^{n}} (x_{i}) = \{\begin{matrix} P_{X^{n}} (x_{i}) + δ & for i = 1 \\ P_{X^{n}} (x_{i}) & for i = 2, 3, \dots, j^{*} - 1 \\ P_{X^{n}} (x_{i}) - ε_{n} & for i = j^{*} \\ 0 & otherwise, \end{matrix} \end{matrix}

(A29)

where

ε_{n} : = δ - \sum_{i \geq j^{*} + 1} P_{X^{n}} (x_{i})

. It is easily checked that

0 \leq ε_{n} \leq P_{X^{n}} (x_{j^{*}})

and the probability distribution

P_{V_{δ}^{n}}

majorizes (for a sequence

u = (u_{1}, u_{2}, \dots, u_{L})

of length L, we denote by

\tilde{u} = ({\tilde{u}}_{1}, {\tilde{u}}_{2}, \dots, {\tilde{u}}_{L})

a permuted version of

u

satisfying

{\tilde{u}}_{i} \geq {\tilde{u}}_{i + 1}

for all

i = 1, 2, \dots, L - 1

, where ties are arbitrarily broken. We say

u = (u_{1}, u_{2}, \dots, u_{L})

majorizes

v = (v_{1}, v_{2}, \dots, v_{L})

if

\sum_{i = 1}^{j} {\tilde{u}}_{i} \geq \sum_{i = 1}^{j} {\tilde{v}}_{i}

for all

j = 1, 2, \dots, L

.) any

P_{V^{n}} \in B_{δ} (X^{n})

[39]. Since the Shannon entropy is a Schur concave function (The function

f (u)

is said to be Schur concave if

f (u) \leq f (v)

for any pair

(u, v)

, where

v

is majorized by

u

.) According to [40], we immediately obtain the following lemma, which is of use to compute

H_{[δ]} (X^{n})

.

Lemma A1

(Ho and Yeung [39]).

\begin{matrix} H_{[δ]} (X^{n}) = H (V_{δ}^{n}) (\forall δ \in [0, 1)) . \end{matrix}

(A30)

By the definition of

G_{[δ]} (X^{n})

, we obtain

\begin{matrix} G_{[δ]} (X^{n}) & \leq \sum_{i = 1}^{j^{*}} P_{X^{n}} (x_{i}) log \frac{1}{P_{X^{n}} (x_{i})} \end{matrix}

(A31)

\begin{matrix} \leq H (V_{δ}^{n}) + P_{X^{n}} (x_{1}) log \frac{1}{P_{X^{n}} (x_{1})} + P_{X^{n}} (x_{j^{*}}) log \frac{1}{P_{X^{n}} (x_{j^{*}})} \end{matrix}

(A32)

\begin{matrix} \leq H (V_{δ}^{n}) + \frac{2 log e}{e}, \end{matrix}

(A33)

where the last inequality is due to

x log x \geq - \frac{log e}{e}

for all

x > 0

. Thus, it follows from Lemma A1 that

\begin{matrix} G_{[δ]} (X) & \leq \underset{n \to \infty}{lim sup} \frac{1}{n} H (V_{δ}^{n}) = H_{[δ]} (X), \end{matrix}

(A34)

which is the desired inequality.

(ii): Next, we show the inequality in (40): $H_{[δ]} (X) \leq (1 - δ) {\bar{H}}_{δ - γ} (X)$ . By the definition of ${\bar{H}}_{δ - γ} (X)$ , for any $η > 0$ , there exists some $n_{0} > 0$ such that

\begin{matrix} Pr {X^{n} \in T_{n}} \geq 1 - δ (\forall n \geq n_{0}), \end{matrix}

(A35)

where we define

\begin{matrix} T_{n} = \{x \in X^{n} : \frac{1}{n} log \frac{1}{P_{X^{n}} (x)} \leq {\bar{H}}_{δ - γ} (X) + η\} . \end{matrix}

(A36)

Choose a sequence

x_{0} \in T_{n}

arbitrarily. Set

P_{V^{n}}

so that

\begin{matrix} P_{V^{n}} (x) = \{\begin{matrix} α_{n} P_{X^{n}} (x) + δ & f o r x = x_{0} \\ α_{n} P_{X^{n}} (x) & f o r x \neq x_{0}, x \in T_{n} \\ 0 & otherwise, \end{matrix} \end{matrix}

(A37)

where we define

α_{n} = (1 - δ) / Pr {X^{n} \in T_{n}}

. Then, the variational distance between

P_{X^{n}}

and

P_{V^{n}}

satisfies

\begin{matrix} d (P_{X^{n}}, P_{V^{n}}) & \leq \frac{δ}{2} + \frac{1}{2} \sum_{x \in T_{n}} P_{X^{n}} (x) | 1 - α_{n} | + \frac{1}{2} \sum_{x \in T_{n}^{c}} P_{X^{n}} (x) \\ = \frac{δ}{2} + \frac{1}{2} Pr {X^{n} \in T_{n}} (1 - α_{n}) + Pr {X^{n} \in T_{n}^{c}} \\ = δ, \end{matrix}

(A38)

which indicates that

P_{V^{n}} \in B_{δ} (X^{n})

. The normalized Shannon entropy

\frac{1}{n} H (V^{n})

can be upper bounded as

\begin{matrix} \frac{1}{n} H (V^{n}) & = \frac{1}{n} (α_{n} P_{X^{n}} (x_{0}) + δ) log \frac{1}{α_{n} P_{X^{n}} (x_{0}) + δ} \\ + \frac{1}{n} \sum_{x \in T_{n} ∖ {x_{0}}} P_{V^{n}} (x) log \frac{1}{P_{V^{n}} (x)} \\ \leq \frac{δ}{n} log \frac{1}{δ} + \frac{1}{n} \sum_{x \in T_{n}} α_{n} P_{X^{n}} (x) log \frac{1}{α_{n} P_{X^{n}} (x)} \\ = \frac{δ}{n} log \frac{1}{δ} + \frac{α_{n}}{n} \sum_{x \in T_{n}} P_{X^{n}} (x) (log \frac{1}{α_{n}} + log \frac{1}{P_{X^{n}} (x)}) \\ \leq \frac{δ}{n} log \frac{1}{δ} + \frac{(1 - δ)}{n} log \frac{1}{1 - δ} + α_{n} \sum_{x \in T_{n}} P_{X^{n}} (x) ({\bar{H}}_{δ - γ} (X) + η) \\ = \frac{1}{n} h_{2} (δ) + (1 - δ) ({\bar{H}}_{δ - γ} (X) + η), \end{matrix}

(A39)

where we have used the fact that

\frac{1}{n} log \frac{1}{P_{X^{n}} (x)} \leq {\bar{H}}_{δ - γ} (X) + η

for

x \in T_{n}

to obtain the second inequality, and

h_{2} (δ) : = - δ log δ - (1 - δ) log (1 - δ)

denotes the binary entropy function. On the other hand, because

P_{V^{n}} \in B_{δ} (X^{n})

, there exists some

n_{1} > 0

such that

\begin{matrix} H_{[δ]} (X) \leq \frac{1}{n} H (V^{n}) + η (\forall n > n_{1}) . \end{matrix}

(A40)

Combining (A39) and (A40) yields

\begin{matrix} H_{[δ]} (X) \leq (1 - δ) {\bar{H}}_{δ - γ} (X) + 3 η, \end{matrix}

(A41)

and since

η > 0

is arbitrarily fixed, (A41) means the desired inequality.

Appendix D. Proof of Corollary 1

We first show (44), i.e.,

\begin{matrix} S_{f} (δ | X) \overset{(a)}{=} {\bar{H}}_{δ} (X) \overset{(b)}{=} H^{*} (X) . \end{matrix}

(A42)

Equality (a) is due to Theorem 2. To prove equality (b), we notice that

\begin{matrix} \underset{̲}{H} (X) \leq {\bar{H}}_{δ} (X) \leq \bar{H} (X) (\forall δ \in [0, 1)) \end{matrix}

(A43)

for general source

X

by definition. If source

X

is with one-point spectrum, the left-hand side (l.h.s.) and the r.h.s. in (A43) match, i.e.,

H^{*} (X) : = \underset{̲}{H} (X) = \bar{H} (X)

, and the squeeze theorem indicates equality (b).

Next, we show (45), i.e.,

\begin{matrix} S_{v} (δ | X) \overset{(c)}{=} H_{[δ]} (X) \overset{(d)}{=} G_{[δ]} (X) \overset{(e)}{=} (1 - δ) H^{*} (X) . \end{matrix}

(A44)

Equality (d) is a direct consequence of Proposition 1. To prove equality (e), we notice a general relationship

\begin{matrix} (1 - δ) \underset{̲}{H} (X) \leq G_{[δ]} (X) \leq (1 - δ) \bar{H} (X) (\forall δ \in [0, 1)) \end{matrix}

(A45)

for general source

X

, where the first inequality is due to ([35], Theorem 4) and the second one is due to Proposition 1 (cf. Equation (42)). In view of

H^{*} (X) = \underset{̲}{H} (X) = \bar{H} (X)

for source

X

with one-point spectrum, the squeeze theorem for (A45) indicates equality (e), i.e.,

\begin{matrix} G_{[δ]} (X) = (1 - δ) H^{*} (X) (\forall δ \in [0, 1)) . \end{matrix}

(A46)

The r.h.s. of (A46) is obviously right-continuous in

δ \in [0, 1)

, and so is the l.h.s. side

G_{[δ]} (X) = H_{[δ]} (X)

if source

X

is with one-point spectrum. The right-continuity of

H_{[δ]} (X)

and Theorem 3 indicate equality (c).

Appendix E. Proof of Lemma 1

We first show (99). For two general sources

X = {X^{n}}_{n = 1}^{\infty}

and

\tilde{X} = {{\tilde{X}}^{n}}_{n = 1}^{\infty}

, the following well-known inequality (cf. ([33], Problem 3.18)) holds between the variational distance and the divergence:

\begin{matrix} \frac{2 {(d (P_{X^{n}}, P_{{\tilde{X}}^{n}}))}^{2}}{ln K} \leq D ({\tilde{X}}^{n} | | X^{n}) . \end{matrix}

(A47)

This inequality implies that any

P_{V^{n}} \in B_{g (δ)}^{D} (X^{n})

satisfies

P_{V^{n}} \in B_{δ} (X^{n})

. Thus, we have

\begin{matrix} H_{[δ]} (X^{n}) \leq H_{[g (δ)]}^{D} (X^{n}) . \end{matrix}

(A48)

Now, we shall show the rightmost equality of (100). It obviously follows from (99) that

\begin{matrix} lim_{δ ↓ 0} H_{[δ]} (X) \leq lim_{δ ↓ 0} H_{[δ]}^{D} (X) . \end{matrix}

(A49)

To show the opposite inequality, in view of (41), it suffices to show

\begin{matrix} lim_{δ ↓ 0} G_{[δ]} (X) \geq lim_{δ ↓ 0} H_{[δ]}^{D} (X) . \end{matrix}

(A50)

Fix

δ \in (0, 1)

and

γ > 0

arbitrarily. We choose

A_{n} \subseteq X^{n}

satisfying

\begin{matrix} G_{[δ]} (X^{n}) + γ & \geq \sum_{x \in A_{n}} P_{X^{n}} (x) log \frac{1}{P_{X^{n}} (x)}, \end{matrix}

(A51)

\begin{matrix} α_{0} & : = Pr {X^{n} \in A_{n}} \geq 1 - δ . \end{matrix}

(A52)

We arrange a new random variable

V^{n}

subject to

\begin{matrix} P_{V^{n}} (x) = \{\begin{matrix} \frac{P_{X^{n}} (x)}{α_{0}} & i f x \in A_{n} \\ 0 & otherwise . \end{matrix} \end{matrix}

(A53)

Then, we obtain

\begin{matrix} D (V^{n} | | X^{n}) = \sum_{x \in A_{n}} P_{V^{n}} (x) log \frac{P_{V^{n}} (x)}{P_{X^{n}} (x)} = log \frac{1}{α_{0}} \leq log \frac{1}{1 - δ}, \end{matrix}

(A54)

and, thus, letting

h (δ) = log \frac{1}{1 - δ}

, it holds that

P_{V^{n}} \in B_{h (δ)}^{D} (X^{n})

. We can expand (A51) as

\begin{matrix} G_{[δ]} (X^{n}) + γ & \geq α_{0} \sum_{x \in A_{n}} P_{V^{n}} (x) log \frac{1}{α_{0} P_{V^{n}} (x)} \\ \geq α_{0} H (V^{n}) \\ \geq (1 - δ) H_{[h (δ)]}^{D} (X^{n}), \end{matrix}

(A55)

where the last inequality is due to (A52) and

P_{V^{n}} \in B_{h (δ)}^{D} (X^{n})

. Thus, as

γ > 0

is arbitrary,

\begin{matrix} G_{[δ]} (X) \geq (1 - δ) H_{[h (δ)]}^{D} (X) . \end{matrix}

(A56)

Since

δ \in (0, 1)

is arbitrary, in view of

{lim}_{δ ↓ 0} h (δ) = 0

, we obtain (A50).

References

Elias, P. The efficient construction of an unbiased random sequence. Ann. Math. Statist. 1972, 43, 865–870. [Google Scholar] [CrossRef]
Han, T.S. Information-Spectrum Methods in Information Theory; Springer: Berlin/Heidelberg, Germany, 2003. [Google Scholar]
Knuth, D.E.; Yao, A.C.-C. The complexity of nonuniform random number generation. In Algorithms and Complexity: New Directions and Recent Results; Academic Press: New York, NY, USA, 1976; pp. 357–428. [Google Scholar]
Han, T.S. Theorems on variable-length intrinsic randomness. IEEE Trans. Inf. Theory 2000, 46, 2108–2116. [Google Scholar]
Vembu, S.; Verdú, S. Generating random bits from an arbitrary source: Fundamental limits. IEEE Trans. Inf. Theory 1995, 41, 1322–1332. [Google Scholar] [CrossRef]
Han, T.S.; Verdú, S. Approximation theory of output statistics. IEEE Trans. Inf. Theory 1993, 39, 752–771. [Google Scholar] [CrossRef]
Steinberg, Y.; Verdú, S. Simulation of random processes and rate-distortion theory. IEEE Trans. Inf. Theory 1996, 42, 63–86. [Google Scholar] [CrossRef]
Renner, R.; Wolf, S. Smooth Rényi entropy and applications. In Proceedings of the International Symposium onInformation Theory, ISIT 2004, Chicago, IL, USA, 27 June–2 July 2004; p. 232. [Google Scholar]
Hou, J.; Kramer, G. Informational divergence approximations to product distributions. In Proceedings of the 13th Canadian Workshop on Information Theory, Toronto, ON, Canada, 18–21 June 2013. [Google Scholar]
Yagi, H. Channel resolvability theorems for general sources and channels. In Proceedings of the 2017 IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, 25–30 June 2017. [Google Scholar]
Ahlswede, R.; Dueck, G. Identification via channels. IEEE Trans. Inf. Theory 1989, 35, 15–29. [Google Scholar] [CrossRef]
Oohama, Y. Converse coding theorems for identification via channels. IEEE Trans. Inf. Theory 2013, 59, 744–759. [Google Scholar] [CrossRef]
Steinberg, Y. New converses in the theory of identification via channels. IEEE Trans. Inf. Theory 1998, 44, 984–998. [Google Scholar] [CrossRef]
Sreekumar, S.; Cohen, A.; Gündüz, D. Privacy-aware distributed hypothesis testing. Entropy 2020, 22, 665. [Google Scholar] [CrossRef]
Zheng, F.; Xiao, Z.; Zhou, S.; Wang, J.; Huang, L. Message authentication over noisy channels. Entropy 2015, 17, 368–383. [Google Scholar] [CrossRef]
Lin, P.-H.; Janda, C.R.; Jorswieck, E.A.; Schaefer, R.F. Stealthy secret key generation. Entropy 2020, 22, 679. [Google Scholar] [CrossRef]
Hayashi, M. Nonasymptotic and asymptotic formulas in channel resolvability and identification capacity and their application to the wiretap channel. IEEE Trans. Inf. Theory 2006, 52, 1562–1575. [Google Scholar] [CrossRef]
Bloch, M.R.; Laneman, J.N. Strong secrecy from channel resolvability. IEEE Trans. Inf. Theory 2013, 59, 8077–8098. [Google Scholar] [CrossRef]
Frèche, G.; Bloch, M.R.; Barret, M. Polar codes for covert communications over asynchronous discrete memoryless channels. Entropy 2022, 22, 679. [Google Scholar]
Strassen, V. Asymptotische abschätzungen in Shannon’s informationstheorie. In Proceedings of the Transactions of the Third Prague Conference on Information Theory, Statistical Decision Functions, Random Processes, Prague, Czech Republic, 5–13 June 1962; pp. 689–723. [Google Scholar]
Han, T.S. Second-order information theory and hypothesis testing. In Proceedings of the 2015 IEEE Information Theory Workshop, Jeju, Republic of Korea, 11–15 October 2015. [Google Scholar]
Han, T.S.; Nomura, R. First- and second-order hypothesis testing for mixed memoryless sources. Entropy 2018, 20, 174. [Google Scholar] [CrossRef]
Hayashi, M. Second-order asymptotics in fixed-length source coding and intrinsic randomness. IEEE Trans. Inf. Theory 2008, 54, 4619–4637. [Google Scholar] [CrossRef]
Hayashi, M. Information spectrum approach to second-order coding rate in channel coding. IEEE Trans. Inf. Theory 2009, 55, 4947–4966. [Google Scholar] [CrossRef]
Kostina, V.; Polyanskiy, Y.; Verdú, S. Variable-length compression allowing errors. IEEE Trans. Inf. Theory 2015, 61, 4316–4330. [Google Scholar] [CrossRef]
Nomura, R. Source resolvability and intrinsic randomness: Two random number generation problems with respect to a subclass of f-divergences. IEEE Trans. Inf. Theory 2020, 66, 7588–7601. [Google Scholar] [CrossRef]
Nomura, R.; Han, T.S. Second-order resolvability, intrinsic randomness, and fixed-length source coding for mixed sources: Information spectrum approach. IEEE Trans. Inf. Theory 2013, 59, 1–16. [Google Scholar] [CrossRef]
Polyanskiy, Y.; Poor, H.V.; Verdú, S. Channel coding rate in the finite blocklength regime. IEEE Trans. Inf. Theory 2010, 59, 2307–2358. [Google Scholar] [CrossRef]
Watanabe, S.; Hayashi, M. Strong converse and second-order asymptotics of channel resolvability. In Proceedings of the 2014 IEEE International Symposium on Information Theory, Honolulu, HI, USA, 29 June–4 July 2014. [Google Scholar]
Yagi, H.; Han, T.S.; Nomura, R. First- and second-order coding theorems for mixed memoryless channels with general mixture. IEEE Trans. Inf. Theory 2016, 62, 4395–4412. [Google Scholar] [CrossRef]
Yagi, H.; Nomura, R. Variable-length coding with cost allowing non-vanishing error probability. IEICE Trans. Fundam. 2017, E100-A, 1683–1692. [Google Scholar] [CrossRef]
Uyematsu, R. A new unified method for fixed-length source coding problems of general sources. IEICE Trans. Fundam. 2010, E93-A, 1868–1877. [Google Scholar] [CrossRef]
Csiszár, I.; Körner, J. Information Theory: Coding Theorems for Discrete Memoryless Systems, 2nd ed.; Cambridge University Press: Cambridge, UK, 2011. [Google Scholar]
Tomita, T.; Uyematsu, T.; Matsumoto, R. Bounds on a variable-to-fixed type δ-resolvability problem. In Proceedings of the Poster Presentation at 2016 Symposium on Information Theory and its Applications, Gifu, Japan, 30 October–2 November 2016. [Google Scholar]
Koga, H.; Yamamoto, H. Asymptotic properties on codeword length of an optimal FV code for general sources. IEEE Trans. Inf. Theory 2005, 51, 1546–1555. [Google Scholar] [CrossRef]
Han, T.S. Weak variable-length source coding theorem. IEEE Trans. Inf. Theory 2000, 46, 1217–1226. [Google Scholar]
Kuzuoka, S.; Watanabe, S. An information-spectrum approach to weak variable-length source coding with side information. IEEE Trans. Inf. Theory 2015, 61, 3559–3573. [Google Scholar] [CrossRef]
Uchida, O.; Han, T.S. The optimal overflow and underflow probabilities of variable-length coding for general source. IEICE Trans. Fundam. 2001, E84-A, 2457–2465. [Google Scholar]
Ho, S.-W.; Yeung, R.W. The interplay between entropy and variational distance. IEEE Trans. Inf. Theory 2010, 56, 5906–5929. [Google Scholar] [CrossRef]
Marshall, A.W.; Olkin, I.; Arnold, B.C. Inequalities: Theory of Majorization and Its Applications, 2nd ed.; Springer: New York, NY, USA, 2011. [Google Scholar]

Figure 1. Illustration of the problem of FL resolvability.

Table 1. Summary of First-Order Resolvability and Information Quantities.

Approximation Measure	Resolvability	Characterization	Theorem #
Fixed-Length Resolvability
Variational Distance	$S_{f} (X)$	$\bar{H} (X)$	Theorem 1 [6]
Variational Distance	$S_{f} (δ X)$	${\bar{H}}_{δ} (X)$	Theorem 2 [7]
Variable-Length Resolvability
Variational Distance	$S_{v} (δ X)$	$lim_{γ ↓ 0} H_{[δ + γ]} (X)$	Theorem 3
Variational Distance	$S_{v} (δ X, W)$	$lim_{γ ↓ 0} H_{[δ + γ], W} (X)$	Theorem 5
Divergence	$S_{v}^{D} (δ X)$	$lim_{γ ↓ 0} H_{[δ + γ]}^{D} (X)$	Theorem 4
Divergence	$S_{v}^{D} (δ X, W)$	$lim_{γ ↓ 0} H_{[δ + γ], W}^{D} (X)$	Theorem 6

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Yagi, H.; Han, T.S. Variable-Length Resolvability for General Sources and Channels. Entropy 2023, 25, 1466. https://doi.org/10.3390/e25101466

AMA Style

Yagi H, Han TS. Variable-Length Resolvability for General Sources and Channels. Entropy. 2023; 25(10):1466. https://doi.org/10.3390/e25101466

Chicago/Turabian Style

Yagi, Hideki, and Te Sun Han. 2023. "Variable-Length Resolvability for General Sources and Channels" Entropy 25, no. 10: 1466. https://doi.org/10.3390/e25101466

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Variable-Length Resolvability for General Sources and Channels^†

Abstract

1. Introduction

2. FL Resolvability: Review

3. VL Resolvability: Variational Distance

3.1. Definitions

3.2. Smooth Shannon Entropy

3.3. General Formula for General $δ \in [0, 1)$

3.4. General Formula for $δ = 0$

4. VL Resolvability: Divergence

4.1. Definitions

4.2. General Formula

5. Mean and VL Channel Resolvability

5.1. Definitions

5.2. General Formulas

6. Second-Order VL Channel Resolvability

6.1. Definitions

6.2. General Formulas

7. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

Appendix A. Proof of Equation (31)

Appendix B. Proof of Equation (34)

Appendix C. Proof of Proposition 1

Appendix D. Proof of Corollary 1

Appendix E. Proof of Lemma 1

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Article Menu

Variable-Length Resolvability for General Sources and Channels †

Abstract

1. Introduction

2. FL Resolvability: Review

3. VL Resolvability: Variational Distance

3.1. Definitions

3.2. Smooth Shannon Entropy

3.3. General Formula for General δ ∈ [ 0 , 1 )

3.4. General Formula for δ = 0

4. VL Resolvability: Divergence

4.1. Definitions

4.2. General Formula

5. Mean and VL Channel Resolvability

5.1. Definitions

5.2. General Formulas

6. Second-Order VL Channel Resolvability

6.1. Definitions

6.2. General Formulas

7. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

Appendix A. Proof of Equation (31)

Appendix B. Proof of Equation (34)

Appendix C. Proof of Proposition 1

Appendix D. Proof of Corollary 1

Appendix E. Proof of Lemma 1

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Variable-Length Resolvability for General Sources and Channels^†

3.3. General Formula for General $δ \in [0, 1)$

3.4. General Formula for $δ = 0$