Exponential Strong Converse for One Helper Source Coding Problem

Oohama, Yasutada

doi:10.3390/e21060567

Exponential Strong Converse for One Helper Source Coding Problem^†

Department of Communication Engineering and Informatics, University of Electro-Communications, Tokyo 182-8585, Japan

^† This paper is an extended version of our paper published in the 2015 IEEE International Symposium on Information Theory (ISIT), Hong Kong, China, 14–19 June 2015.

Entropy 2019, 21(6), 567; https://doi.org/10.3390/e21060567

Submission received: 12 March 2019 / Revised: 29 May 2019 / Accepted: 31 May 2019 / Published: 5 June 2019

(This article belongs to the Special Issue Multiuser Information Theory II)

Abstract

We consider the one helper source coding problem posed and investigated by Ahlswede, Körner and Wyner. Two correlated sources are separately encoded and sent to a destination, where the decoder wishes to decode one of the two sources with an arbitrarily small error probability of decoding. In this system, at rates outside the rate region, the error probability of decoding goes to one as the source block length n goes to infinity. This implies that we have a strong converse theorem for the one helper source coding problem. In this paper, we provide a much stronger version of this strong converse theorem. We prove that the error probability of decoding tends to one exponentially and derive an explicit lower bound on this exponent function.

Keywords:

one helper source coding problem; strong converse theorem; exponent of correct probability of decoding

1. Introduction

For single or multi terminal source encoding systems, the converse coding theorems state that, at any data compression rate below the fundamental theoretical limit of the system, the error probability of decoding cannot go to zero when the block length n of the codes tends to infinity.

In this paper, we study the one helper source coding problem posed and investigated by Ahlswede, Körner [1] and Wyner [2]. We call this source coding system the AKW system. The AKW system is shown in Figure 1.

In this figure, the AKW system corresponds to the case where the switch is closed. In Figure 1, the sequence

(X^{n}, Y^{n})

represents independent copies of a pair of dependent random variables

(X, Y)

which take values in the finite sets

X, Y

, respectively. We assume that

(X, Y)

has a probability distribution denoted by

p_{X Y}

. For each

i = 1, 2

, the encoder

φ_{i}^{(n)}

outputs a binary sequence which appears at a rate

R_{i}

bits per input symbol. The decoder function

ψ^{(n)}

observes

φ_{1}^{(n)} (X^{n})

and

φ_{2}^{(n)} (Y^{n})

to output a sequence

{\hat{Y}}^{n} : = ψ^{(n)} (φ_{1}^{(n)} (X^{n}), φ_{2}^{(n)} (Y^{n}))

, which is an estimation of

Y^{n}

. When the switch is open, it is well known that the minimum transmission rate

R_{2}

such that the error probability

P_{e}^{(n)} : = \Pr \{Y^{n} \neq {\hat{Y}}^{n}\}

of decoding tends to zero as n tends to infinity is given by

H (Y)

. Csiszár and Longo [3] proved that, if

R_{2} < H (Y)

, then the correct probability

P_{c}^{(n)} : = \Pr \{Y^{n} = {\hat{Y}}^{n}\}

of decoding decays exponentially and derived the optimal exponent function. When the switch is closed and

R_{1} > H (X)

, Slepian and Wolf [4] proved that

H (Y | X)

is the minimum transmission rate

R_{2}

such that the error probability

\Pr \{Y^{n} \neq {\hat{Y}}^{n}\}

of decoding tends to zero as n tends to infinity. Oohama and Han [5] proved that, if

R_{2} < H (Y | X)

, then the correct probability

P_{c}^{(n)} : = \Pr \{Y^{n} = {\hat{Y}}^{n}\}

of decoding decays exponentially and derived the optimal exponent function.
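As a concrete illustration of the two thresholds discussed above, the following minimal Python sketch evaluates H(Y) and H(Y | X) in bits for a small joint distribution. The distribution `p_xy` (a binary symmetric pair with crossover probability 0.1) and all function names are assumptions chosen for illustration; they do not come from the paper.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def marginal_x(p_xy):
    """Marginal of X, where p_xy[x][y] is the joint probability."""
    return [sum(row) for row in p_xy]

def marginal_y(p_xy):
    """Marginal of Y, where p_xy[x][y] is the joint probability."""
    return [sum(row[y] for row in p_xy) for y in range(len(p_xy[0]))]

def cond_entropy_y_given_x(p_xy):
    """H(Y|X) = H(X,Y) - H(X), in bits."""
    joint = [p for row in p_xy for p in row]
    return entropy(joint) - entropy(marginal_x(p_xy))

# Hypothetical example source: binary symmetric pair with crossover 0.1.
p_xy = [[0.45, 0.05],
        [0.05, 0.45]]

h_y = entropy(marginal_y(p_xy))        # threshold on R_2 without any helper
h_y_x = cond_entropy_y_given_x(p_xy)   # threshold on R_2 when R_1 > H(X)
print(f"H(Y) = {h_y:.4f} bits, H(Y|X) = {h_y_x:.4f} bits")
```

For this assumed source, the helper can lower the required rate R_2 from H(Y) towards H(Y | X); the AKW problem studied here concerns the intermediate regime 0 < R_1 < H(X).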

In this paper, we consider the strong converse theorem in the case where the switch is closed and

0 < R_{1} < H (X)

. Let

R_{AKW} (p_{X Y})

be the rate region of the AKW system. This region consists of the rate pairs

(R_{1}, R_{2})

such that the error probability of decoding goes to zero as n tends to infinity. The rate region was determined by Ahlswede, Körner [1] and Wyner [2]. On the converse coding theorem, Ahlswede et al. [6] proved that, if

(R_{1}, R_{2})

is outside the rate region, then

P_{c}^{(n)}

must tend to zero as n tends to infinity. Gu and Effros [7] examined the speed of convergence for

P_{c}^{(n)}

to tend to zero as

n \to \infty

by carefully checking the proof of Ahlswede et al. [6]. However, they could not obtain a result on an explicit form of the exponent function with respect to the code length n.

Our main results on the strong converse theorem for the AKW system are as follows. For the AKW system, we prove that, if

(R_{1}, R_{2})

is outside the rate region

R_{AKW} (p_{X Y})

,

P_{c}^{(n)}

must go to zero exponentially, and we derive an explicit lower bound on this exponent. This result corresponds to Theorem 3. As a corollary of this theorem, we obtain the strong converse result, which is stated in Corollary 2. This result states that we have an outer bound with

O (1 / \sqrt{n})

gap from the rate region

R_{AKW} (p_{X Y})

.

To derive our result, we use a new method, introduced by the author, called the recursive method. This method includes a certain recursive algorithm for the single letterization of exponent functions. In the standard argument for proving converse coding theorems, single letterization methods based on the chain rule of the entropy functions are used. In general, the functions representing multi letter characterizations of exponent functions do not have the chain rule property. In such cases, the recursive method is quite useful for deriving single letterized bounds. The recursive method is a general and powerful tool for proving strong converse theorems for several coding problems in information theory. In fact, the recursive method plays an important role in deriving exponential strong converse theorems for the communication systems treated in [8,9,10,11,12].

On the strong converse theorem for the one helper source coding problem, there are two other recent works [13,14]. These two works proved the strong converse theorem using methods different from ours. In [13], Watanabe found a relationship between the AKW system and the Gray–Wyner network. Using this relationship and the second order rate region for the Gray–Wyner network that he obtained in [15], Watanabe established the strong converse theorem for the AKW system. In [14], Liu et al. introduced a new method to derive sharp strong converse bounds via reverse hypercontractivity. Using this method, they obtained an outer bound of the rate region for the AKW system with

O (1 / \sqrt{n})

gap from the rate region. Furthermore, in [14], an extension of the AKW system to the case of Gaussian source and quadratic distortion is investigated, obtaining an outer bound with

O (1 / \sqrt{n})

gap from the rate distortion region for the extended source coding system. In his recent paper [16], Liu showed a lower bound (converse) on the dispersion of the AKW system as the variance of the linear combination of information densities.

Strong converse theorems have tended to be regarded just as a mathematical problem and have been investigated mainly out of theoretical interest. Recently, Watanabe and Oohama [17] found an interesting security problem that has a close connection with the strong converse theorem for the AKW system. Furthermore, Oohama and Santoso [18] and Santoso and Oohama [19] clarified that the exponential strong converse theorem obtained in this paper plays an essential role in deriving a strong sufficient secure condition for privacy amplification in their new theoretical model of side channel attacks on Shannon cipher systems. From the above two cases, we expect that exponential strong converse theorems for multiterminal source networks will serve as a strong tool for several information theoretical security problems.

2. Problem Formulation

Let

X

and

Y

be finite sets and

{\{(X_{t}, Y_{t})\}}_{t = 1}^{\infty}

be a stationary discrete memoryless source. For each

t = 1, 2, \dots

, the random pair

(X_{t}, Y_{t})

takes values in

X \times Y

, and has a probability distribution

p_{X Y} = {\{p_{X Y} (x, y)\}}_{(x, y) \in X \times Y} .

We write n independent copies of

{\{X_{t}\}}_{t = 1}^{\infty}

and

{\{Y_{t}\}}_{t = 1}^{\infty}

, respectively as

X^{n} = X_{1}, X_{2}, \dots, X_{n} and Y^{n} = Y_{1}, Y_{2}, \dots, Y_{n} .

We consider a communication system depicted in Figure 2. This communication system corresponds to the case where the switch is closed in Figure 1. Data sequences

X^{n}

and

Y^{n}

are separately encoded to

φ_{1}^{(n)} (X^{n})

and

φ_{2}^{(n)} (Y^{n})

and those are sent to the information processing center. At the center, the decoder function

ψ^{(n)}

observes

(φ_{1}^{(n)} (X^{n}), φ_{2}^{(n)} (Y^{n}))

to output the estimation

{\hat{Y}}^{n}

of

Y^{n}

. The encoder functions

φ_{1}^{(n)}

and

φ_{2}^{(n)}

are defined by

\begin{matrix} φ_{1}^{(n)} : X^{n} \to \mathcal{M}_{1} = \{1, 2, \dots, M_{1}\}, \\ φ_{2}^{(n)} : Y^{n} \to \mathcal{M}_{2} = \{1, 2, \dots, M_{2}\}, \end{matrix}

(1)

where for each

i = 1, 2

,

∥ φ_{i}^{(n)} ∥

(= M_{i})

stands for the cardinality of the range of

φ_{i}^{(n)}

. The decoder function

ψ^{(n)}

is defined by

ψ^{(n)} : \mathcal{M}_{1} \times \mathcal{M}_{2} \to Y^{n} .

(2)

The error probability of decoding is

P_{e}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) = Pr \{{\hat{Y}}^{n} \neq Y^{n}\},

(3)

where

{\hat{Y}}^{n} = ψ^{(n)} (φ_{1}^{(n)} (X^{n}), φ_{2}^{(n)} (Y^{n}))

. A rate pair

(R_{1}, R_{2})

is

ε

-achievable if, for any

δ > 0

, there exists a positive integer

n_{0} = n_{0} (ε, δ)

and a sequence of triples

\{(φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)})\}_{n \geq n_{0}}

such that, for

n \geq n_{0}

,

\begin{matrix} \frac{1}{n} log ∥ φ_{i}^{(n)} ∥ \leq R_{i} + δ \text{ for } i = 1, 2, \\ P_{e}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) \leq ε . \end{matrix}

For

ε \in (0, 1)

, the rate region

R_{AKW} (ε | p_{X Y})

is defined by

\begin{matrix} R_{AKW} (ε | p_{X Y}) : = \{(R_{1}, R_{2}) : (R_{1}, R_{2}) \text{ is } ε\text{-achievable for } p_{X Y}\} . \end{matrix}

Furthermore, define

R_{AKW} (p_{X Y}) : = ⋂_{ε \in (0, 1)} R_{AKW} (ε | p_{X Y}) .

We can show that the two rate regions

R_{AKW} (ε | p_{X Y})

,

ε \in (0, 1)

and

R_{AKW} (p_{X Y})

satisfy the following property.

Property 1.

(a): The regions $R_{AKW} (ε | p_{X Y})$ , $ε \in (0, 1)$ , and $R_{AKW} (p_{X Y})$ are closed convex sets of $R_{+}^{2}$ , where

$\begin{matrix} R_{+}^{2} & : = \{(R_{1}, R_{2}) : R_{1} \geq 0, R_{2} \geq 0\} . \end{matrix}$
(b): $R_{AKW} (ε | p_{X Y})$ has another form using the $(n, ε)$ -rate region $R_{AKW} (n, ε | p_{X Y})$ , the definition of which is as follows. We set

$\begin{matrix} R_{AKW} (n, ε | p_{X Y}) = \{(R_{1}, R_{2}) : \text{there exists } (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) \text{ such that} \\ \frac{1}{n} log ∥ φ_{i}^{(n)} ∥ \leq R_{i}, i = 1, 2, P_{e}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) \leq ε\} . \end{matrix}$

Using $R_{AKW} (n, ε | p_{X Y})$ , $R_{AKW} (ε | p_{X Y})$ can be expressed as

$\begin{matrix} R_{AKW} (ε | p_{X Y}) & = cl (⋃_{m \geq 1} ⋂_{n \geq m} R_{AKW} (n, ε | p_{X Y})) . \end{matrix}$

Proof of this property is given in Appendix A. It is well known that

R_{AKW} (p_{X Y})

was determined by Ahlswede, Körner and Wyner. To describe their result, we introduce an auxiliary random variable U taking values in a finite set

U

. We assume that the joint distribution of

(U, X, Y)

is

p_{U X Y} (u, x, y) = p_{U} (u) p_{X | U} (x | u) p_{Y | X} (y | x) .

The above condition is equivalent to

U \leftrightarrow X \leftrightarrow Y

. Define the set of probability distributions

p = p_{U X Y}

by

\begin{matrix} P (p_{X Y}) : = \{p_{U X Y} : | U | \leq | X | + 1, U \leftrightarrow X \leftrightarrow Y\} . \end{matrix}

Set

\begin{matrix} R (p) & : = \{(R_{1}, R_{2}) : R_{1}, R_{2} \geq 0, R_{1} \geq I_{p} (X; U), R_{2} \geq H_{p} (Y | U)\}, \\ R (p_{X Y}) & : = ⋃_{p \in P (p_{X Y})} R (p) . \end{matrix}

We can show that the region

R (p_{X Y})

satisfies the following property.

Property 2.

(a): The region $R (p_{X Y})$ is a closed convex subset of $R_{+}^{2}$ .
(b): For any $p_{X Y}$ , we have

$min_{(R_{1}, R_{2}) \in R (p_{X Y})} (R_{1} + R_{2}) = H_{p} (Y) .$

(4)

The minimum is attained by $(R_{1}, R_{2}) = (0, H_{p} (Y))$ . This result implies that

$\begin{matrix} R (p_{X Y}) \subseteq \{(R_{1}, R_{2}) : R_{1} + R_{2} \geq H_{p} (Y)\} \cap R_{+}^{2} . \end{matrix}$

Furthermore, the point $(0, H_{p} (Y))$ always belongs to $R (p_{X Y})$ .

Property 2 part a is well known, and the proof of Property 2 part b is easy, so we omit both proofs. A typical shape of the rate region

R (p_{X Y})

is shown in Figure 3.
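To make the definition of R (p) concrete, the following Python sketch computes the corner point (I_p (X; U), H_p (Y | U)) in bits for a given test channel p_{U | X} under the Markov chain U ↔ X ↔ Y. The binary source, the family of binary symmetric test channels, and the names `corner_point` and `h2` are assumptions made for illustration only; the sketch lists points of R (p_{X Y}) and does not claim that this family traces its whole boundary.

```python
import math

def h2(a):
    """Binary entropy function in bits."""
    if a <= 0.0 or a >= 1.0:
        return 0.0
    return -a * math.log2(a) - (1 - a) * math.log2(1 - a)

def corner_point(p_x, p_y_x, p_u_x):
    """Return (I(X;U), H(Y|U)) in bits for p(u,x,y) = p(x) p(u|x) p(y|x)."""
    nx, ny, nu = len(p_x), len(p_y_x[0]), len(p_u_x[0])
    i_xu, h_yu = 0.0, 0.0
    for u in range(nu):
        p_u = sum(p_x[x] * p_u_x[x][u] for x in range(nx))
        if p_u <= 0.0:
            continue
        for x in range(nx):
            p_xu = p_x[x] * p_u_x[x][u]
            if p_xu > 0.0:
                i_xu += p_xu * math.log2(p_u_x[x][u] / p_u)
        for y in range(ny):
            p_yu = sum(p_x[x] * p_u_x[x][u] * p_y_x[x][y] for x in range(nx))
            if p_yu > 0.0:
                h_yu -= p_yu * math.log2(p_yu / p_u)
    return i_xu, h_yu

# Hypothetical example: uniform binary X, Y obtained from X through a
# binary symmetric channel with crossover probability 0.1.
p_x = [0.5, 0.5]
p_y_x = [[0.9, 0.1], [0.1, 0.9]]

for a in (0.0, 0.1, 0.2, 0.3, 0.5):
    # Assumed test channel: U is X passed through a BSC with crossover a.
    p_u_x = [[1 - a, a], [a, 1 - a]]
    r1, r2 = corner_point(p_x, p_y_x, p_u_x)
    print(f"a = {a:.1f}: R_1 >= {r1:.4f}, R_2 >= {r2:.4f}")
```

At a = 0 the helper describes X completely and the corner is (H (X), H (Y | X)); at a = 0.5 the helper is useless and the corner is (0, H (Y)), in agreement with Property 2 part b.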

The rate region

R_{AKW} (p_{X Y})

was determined by Ahlswede and Körner [1] and Wyner [2]. Their results are the following.

Theorem 1

(Ahlswede, Körner [1] and Wyner [2]).

\begin{matrix} R_{AKW} (p_{X Y}) = R (p_{X Y}) . \end{matrix}

On the converse coding theorem, Ahlswede et al. [6] obtained the following.

Theorem 2

(Ahlswede et al. [6]). For each fixed ε

\in (0, 1)

, we have

\begin{matrix} R_{AKW} (ε | p_{X Y}) = R (p_{X Y}) . \end{matrix}

Gu and Effros [7] examined the speed of convergence for

P_{e}^{(n)}

to tend to 1 as

n \to \infty

by carefully checking the proof of Ahlswede et al. [6]. However, they could not obtain a result on an explicit form of the exponent function with respect to the code length n.

Our aim is to find an explicit form of the exponent function for the error probability of decoding to tend to one as

n \to \infty

when

(R_{1}, R_{2}) \notin R_{AKW} (p_{X Y})

. To examine this exponent, we define the following quantities. Set

\begin{matrix} P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) : = 1 - P_{e}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}), \\ G^{(n)} (R_{1}, R_{2} | p_{X Y}) : = min_{\binom{(φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) :}{(1 / n) log ∥ φ_{i}^{(n)} ∥ \leq R_{i}, i = 1, 2}} (- \frac{1}{n}) log P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}), \\ G (R_{1}, R_{2} | p_{X Y}) : = lim_{n \to \infty} G^{(n)} (R_{1}, R_{2} | p_{X Y}), \\ G (p_{X Y}) : = \{(R_{1}, R_{2}, G) : G \geq G (R_{1}, R_{2} | p_{X Y})\} . \end{matrix}

By time sharing, we have that

\begin{matrix} G^{(n + m)} (\frac{n R_{1} + m R_{1}^{'}}{n + m}, \frac{n R_{2} + m R_{2}^{'}}{n + m}| p_{X Y}) \leq \frac{n G^{(n)} (R_{1}, R_{2} | p_{X Y}) + m G^{(m)} (R_{1}^{'}, R_{2}^{'} | p_{X Y})}{n + m} . \end{matrix}

(5)

Choosing

(R_{1}, R_{2}) = (R_{1}^{'}, R_{2}^{'})

in the inequality (5), we obtain the following subadditivity property on

\{G^{(n)} (R_{1}, R_{2} | p_{X Y})\}_{n \geq 1}

:

\begin{matrix} G^{(n + m)} (R_{1}, R_{2} | p_{X Y}) \leq \frac{n G^{(n)} (R_{1}, R_{2} | p_{X Y}) + m G^{(m)} (R_{1}, R_{2} | p_{X Y})}{n + m}, \end{matrix}

From this and Fekete’s subadditive lemma, we have that the limit of

G^{(n)} (R_{1}, R_{2} | p_{X Y})

as n tends to infinity exists and satisfies the following:

\begin{matrix} lim_{n \to \infty} G^{(n)} (R_{1}, R_{2} | p_{X Y}) = inf_{n \geq 1} G^{(n)} (R_{1}, R_{2} | p_{X Y}) . \end{matrix}
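The step from subadditivity to the existence of the limit can be spelled out as follows. This is only a sketch; the shorthand a_n is introduced here for convenience and is not notation from the paper.

```latex
% Fix (R_1, R_2) and set a_n := n G^{(n)}(R_1, R_2 | p_{XY}) (shorthand assumed here).
\begin{align*}
a_{n+m} &= (n+m)\, G^{(n+m)}(R_1, R_2 \mid p_{XY})
  \le n\, G^{(n)}(R_1, R_2 \mid p_{XY}) + m\, G^{(m)}(R_1, R_2 \mid p_{XY})
  = a_n + a_m, \\
\lim_{n \to \infty} \frac{a_n}{n} &= \inf_{n \ge 1} \frac{a_n}{n}
  \quad \text{(Fekete's lemma applied to the subadditive sequence } \{a_n\}_{n \ge 1}\text{)} .
\end{align*}
```

Since a_n / n = G^{(n)} (R_{1}, R_{2} | p_{X Y}), the second line is exactly the displayed identity.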

The exponent function

G (R_{1}, R_{2} | p_{X Y})

is a convex function of

(R_{1}, R_{2})

. Here and throughout, a bar over a parameter in [0, 1] denotes its complement, e.g., \bar{α} = 1 - α. From the inequality (5), we have that, for any

α \in [0, 1]

\begin{matrix} G (α R_{1} + \bar{α} R_{1}^{'}, α R_{2} + \bar{α} R_{2}^{'} | p_{X Y}) \leq α G (R_{1}, R_{2} | p_{X Y}) + \bar{α} G (R_{1}^{'}, R_{2}^{'} | p_{X Y}) . \end{matrix}

The region

G (p_{X Y})

is also a closed convex set. Our main aim is to find an explicit characterization of

G (p_{X Y})

. In this paper, we derive an explicit outer bound of

G (p_{X Y})

whose section by the plane

G = 0

coincides with

R_{AKW} (p_{X Y})

.

3. Main Results

In this section, we state our main result. We first explain that the region

R (p_{X Y})

can be expressed with a family of supporting hyperplanes. To describe this result, we define a set of probability distributions on

U \times X \times Y

by

\begin{matrix} P_{sh} (p_{X Y}) & : = \{p = p_{U X Y} : | U | \leq | X |, U \leftrightarrow X \leftrightarrow Y\} . \end{matrix}

For

μ \geq 0

, define

\begin{matrix} R^{(μ)} (p_{X Y}) : = min_{p \in P_{sh} (p_{X Y})} \{μ I_{p} (X; U) + \bar{μ} H_{p} (Y | U)\} . \end{matrix}

Furthermore, define

\begin{matrix} R_{sh} (p_{X Y}) : = ⋂_{μ \in [0, 1]} \{(R_{1}, R_{2}) : μ R_{1} + \bar{μ} R_{2} \geq R^{(μ)} (p_{X Y})\} . \end{matrix}

Then, we have the following property.

Property 3.

(a): The bound $| U | \leq | X |$ is sufficient to describe $R^{(μ)} (p_{X Y})$ .
(b): For every $μ \in [0, 1]$ , we have

$\begin{matrix} min_{(R_{1}, R_{2}) \in R (p_{X Y})} \{μ R_{1} + \bar{μ} R_{2}\} = R^{(μ)} (p_{X Y}) . \end{matrix}$

(6)
(c): For any $p_{X Y}$ , we have

$R_{sh} (p_{X Y}) = R (p_{X Y}) .$

(7)
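The supporting-hyperplane value R^{(μ)} (p_{X Y}) can be approximated numerically for small alphabets. The Python sketch below does a plain grid search over binary test channels p_{U | X} with | U | = | X | = 2, which is the cardinality allowed by Property 3 part a. The example source, the grid resolution, and the function names are assumptions for illustration; the grid only yields an upper bound on the true minimum.

```python
import math

def weighted_rate(mu, p_x, p_y_x, p_u_x):
    """mu * I(X;U) + (1 - mu) * H(Y|U) in bits for the chain U <-> X <-> Y."""
    nx, ny, nu = len(p_x), len(p_y_x[0]), len(p_u_x[0])
    total = 0.0
    for u in range(nu):
        p_u = sum(p_x[x] * p_u_x[x][u] for x in range(nx))
        if p_u <= 0.0:
            continue
        for x in range(nx):
            p_xu = p_x[x] * p_u_x[x][u]
            if p_xu > 0.0:
                total += mu * p_xu * math.log2(p_u_x[x][u] / p_u)
        for y in range(ny):
            p_yu = sum(p_x[x] * p_u_x[x][u] * p_y_x[x][y] for x in range(nx))
            if p_yu > 0.0:
                total -= (1.0 - mu) * p_yu * math.log2(p_yu / p_u)
    return total

def support_value(mu, p_x, p_y_x, steps=40):
    """Grid-search upper bound on R^(mu)(p_XY) for binary X and binary U."""
    best = float("inf")
    for i in range(steps + 1):
        for j in range(steps + 1):
            a, b = i / steps, j / steps   # a = P(U=1|X=0), b = P(U=1|X=1)
            p_u_x = [[1.0 - a, a], [1.0 - b, b]]
            best = min(best, weighted_rate(mu, p_x, p_y_x, p_u_x))
    return best

# Hypothetical example source: uniform X, BSC(0.1) from X to Y.
p_x = [0.5, 0.5]
p_y_x = [[0.9, 0.1], [0.1, 0.9]]
for mu in (0.0, 0.25, 0.5, 0.75, 1.0):
    value = support_value(mu, p_x, p_y_x)
    print(f"mu = {mu:.2f}: R^(mu) is approximately {value:.4f} bits")
```

For μ = 0 the search returns H (Y | X) (attained by U = X), and for μ = 1 it returns 0 (attained by a constant U), which are the two extreme supporting lines of the region.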

Property 3 part a is stated as Lemma A1 in Appendix B. Proof of this lemma is given in this appendix. Proofs of Property 3 parts b and c are given in Appendix C. Set

\begin{matrix} Q (p_{Y | X}) & : = \{q = q_{U X Y} : | U | \leq | X |, U \leftrightarrow X \leftrightarrow Y, p_{Y | X} = q_{Y | X}\} . \end{matrix}

For

(μ, α) \in {[0, 1]}^{2}

, and for

q = q_{U X Y} \in Q (p_{Y | X})

, define

\begin{matrix} ω_{q | p_{X}}^{(μ, α)} (x, y | u) : = \bar{α} log \frac{q_{X} (x)}{p_{X} (x)} + α [μ log \frac{q_{X | U} (x | u)}{p_{X} (x)} + \bar{μ} log \frac{1}{q_{Y | U} (y | u)}], \\ f_{q | p_{X}}^{(μ, α)} (x, y | u) : = exp \{- ω_{q | p_{X}}^{(μ, α)} (x, y | u)\}, \\ Ω^{(μ, α)} (q | p_{X}) : = - log E_{q} [exp \{- ω_{q | p_{X}}^{(μ, α)} (X, Y | U)\}], \\ Ω^{(μ, α)} (p_{X Y}) : = min_{q \in Q (p_{Y | X})} Ω^{(μ, α)} (q | p_{X}), \\ F^{(μ, α)} (μ R_{1} + \bar{μ} R_{2} | p_{X Y}) : = \frac{Ω^{(μ, α)} (p_{X Y}) - α (μ R_{1} + \bar{μ} R_{2})}{2 + α \bar{μ}}, \\ F (R_{1}, R_{2} | p_{X Y}) : = sup_{(μ, α) \in {[0, 1]}^{2}} F^{(μ, α)} (μ R_{1} + \bar{μ} R_{2} | p_{X Y}) . \end{matrix}

We next define a function serving as a lower bound of

F (R_{1}, R_{2} | p_{X Y})

. For

λ \geq 0

and for

p_{U X Y} \in P_{sh} (p_{X Y})

, define

\begin{matrix} {\tilde{ω}}_{p}^{(μ)} (x, y | u) : = μ log \frac{p_{X | U} (x | u)}{p_{X} (x)} + \bar{μ} log \frac{1}{p_{Y | U} (y | u)}, \\ {\tilde{Ω}}^{(μ, λ)} (p) : = - log E_{p} [exp \{- λ {\tilde{ω}}_{p}^{(μ)} (X, Y | U)\}], \\ {\tilde{Ω}}^{(μ, λ)} (p_{X Y}) : = min_{p \in P_{sh} (p_{X Y})} {\tilde{Ω}}^{(μ, λ)} (p) . \end{matrix}

Furthermore, set

\begin{matrix} {\underline{F}}^{(μ, λ)} (μ R_{1} + \bar{μ} R_{2} | p_{X Y}) : = \frac{{\tilde{Ω}}^{(μ, λ)} (p_{X Y}) - λ (μ R_{1} + \bar{μ} R_{2})}{2 + λ (5 - μ)}, \\ \underline{F} (R_{1}, R_{2} | p_{X Y}) : = sup_{λ \geq 0, μ \in [0, 1]} {\underline{F}}^{(μ, λ)} (μ R_{1} + \bar{μ} R_{2} | p_{X Y}) . \end{matrix}

We can show that the above functions satisfy the following property.

Property 4.

(a): The cardinality bound $| U | \leq | X |$ in $Q (p_{Y | X})$ is sufficient to describe the quantity $Ω^{(μ, α)} (p_{X Y})$ . Furthermore, the cardinality bound $| U | \leq | X |$ in $P_{sh} (p_{X Y})$ is sufficient to describe the quantity ${\tilde{Ω}}^{(μ, λ)} (p_{X Y})$ .
(b): For any $R_{1}, R_{2} \geq 0$ , we have

$\begin{matrix} F (R_{1}, R_{2} | p_{X Y}) \geq \underline{F} (R_{1}, R_{2} | p_{X Y}) . \end{matrix}$
(c): For any $p = p_{U X Y} \in P_{sh} (p_{X Y})$ and any $(μ, λ) \in {[0, 1]}^{2}$ , we have

$0 \leq {\tilde{Ω}}^{(μ, λ)} (p) \leq μ log | X | + \bar{μ} log | Y | .$

(8)
(d): Fix any $p = p_{U X Y} \in P_{sh} (p_{X Y})$ and $μ \in [0, 1]$ . For $λ \in [0, 1]$ , we define a probability distribution $p^{(λ)} = p_{U X Y}^{(λ)}$ by

$\begin{matrix} p^{(λ)} (u, x, y) : = \frac{p (u, x, y) exp \{- λ {\tilde{ω}}_{p}^{(μ)} (x, y | u)\}}{E_{p} [exp \{- λ {\tilde{ω}}_{p}^{(μ)} (X, Y | U)\}]} . \end{matrix}$

Then, for $λ \in [0, 1 / 2]$ , ${\tilde{Ω}}^{(μ, λ)} (p)$ is twice differentiable. Furthermore, for $λ \in [0, 1 / 2]$ , we have

$\begin{matrix} \frac{d}{d λ} {\tilde{Ω}}^{(μ, λ)} (p) = E_{p^{(λ)}} [{\tilde{ω}}_{p}^{(μ)} (X, Y | U)], \frac{d^{2}}{d λ^{2}} {\tilde{Ω}}^{(μ, λ)} (p) = - {Var}_{p^{(λ)}} [{\tilde{ω}}_{p}^{(μ)} (X, Y | U)] . \end{matrix}$

The second equality implies that ${\tilde{Ω}}^{(μ, λ)} (p)$ is a concave function of $λ \in [0, 1 / 2]$ .
(e): For every $(μ, λ) \in [0, 1] \times [0, 1 / 2]$ , define

$\begin{matrix} ρ^{(μ, λ)} (p_{X Y}) : = max_{\binom{(ν, p) \in [0, λ] \times P_{sh} (p_{X Y}) :}{{\tilde{Ω}}^{(μ, λ)} (p) = {\tilde{Ω}}^{(μ, λ)} (p_{X Y})}} {Var}_{p^{(ν)}} [{\tilde{ω}}_{p}^{(μ)} (X, Y | U)], \end{matrix}$

and set

$\begin{matrix} ρ = ρ (p_{X Y}) : = max_{(μ, λ) \in [0, 1] \times [0, 1 / 2]} ρ^{(μ, λ)} (p_{X Y}) . \end{matrix}$

Then, we have $ρ (p_{X Y}) < \infty .$ Furthermore, for any $(μ, λ) \in [0, 1] \times [0, 1 / 2]$ , we have

${\tilde{Ω}}^{(μ, λ)} (p_{X Y}) \geq λ R^{(μ)} (p_{X Y}) - \frac{λ^{2}}{2} ρ (p_{X Y}) .$

(9)
(f): For every $τ \in (0, (1 / 2) ρ (p_{X Y}))$ , the condition $(R_{1} + τ, R_{2} + τ) \notin R (p_{X Y})$ implies

$\underline{F} (R_{1}, R_{2} | p_{X Y}) > \frac{ρ (p_{X Y})}{4} \cdot g^{2} (\frac{τ}{ρ (p_{X Y})}) > 0,$

where g is the inverse function of $ϑ (a) : = a + (5 / 4) a^{2}, a \geq 0$ .

Property 4 part a is stated as Lemma A2 in Appendix B, where its proof is also given. The proof of Property 4 part b is given in Appendix D. Proofs of Property 4 parts c, d, e, and f are given in Appendix E.
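The derivative formulas in Property 4 part d can be checked numerically. The following Python sketch builds a random distribution p_{U X Y} satisfying U ↔ X ↔ Y on binary alphabets, evaluates \tilde{Ω}^{(μ, λ)} (p) with natural logarithms, and compares finite differences in λ with the mean and the negative variance of \tilde{ω}_{p}^{(μ)} under the tilted distribution p^{(λ)}. The random example, the step size, and all function names are assumptions made for illustration.

```python
import math
import random

def random_markov_p(seed=0):
    """Random p(u,x,y) = p(x) p(u|x) p(y|x) on binary alphabets (hypothetical)."""
    rng = random.Random(seed)
    def dist(k):
        w = [rng.random() + 0.1 for _ in range(k)]
        s = sum(w)
        return [v / s for v in w]
    p_x = dist(2)
    p_u_x = [dist(2) for _ in range(2)]
    p_y_x = [dist(2) for _ in range(2)]
    return {(u, x, y): p_x[x] * p_u_x[x][u] * p_y_x[x][y]
            for u in range(2) for x in range(2) for y in range(2)}

def marginal(p, keep):
    """Marginalize a dict-valued distribution onto the coordinates in keep."""
    out = {}
    for key, v in p.items():
        sub = tuple(key[i] for i in keep)
        out[sub] = out.get(sub, 0.0) + v
    return out

def omega_tilde(p, mu):
    """mu * log p(x|u)/p(x) + (1 - mu) * log 1/p(y|u), for every (u, x, y)."""
    p_u, p_x = marginal(p, (0,)), marginal(p, (1,))
    p_ux, p_uy = marginal(p, (0, 1)), marginal(p, (0, 2))
    return {(u, x, y): mu * math.log(p_ux[(u, x)] / (p_u[(u,)] * p_x[(x,)]))
                       + (1.0 - mu) * math.log(p_u[(u,)] / p_uy[(u, y)])
            for (u, x, y) in p}

def big_omega(p, w, lam):
    """-log E_p[exp(-lam * w)]."""
    return -math.log(sum(p[k] * math.exp(-lam * w[k]) for k in p))

def tilted_mean_var(p, w, lam):
    """Mean and variance of w under the tilted distribution p^(lam)."""
    z = sum(p[k] * math.exp(-lam * w[k]) for k in p)
    tilt = {k: p[k] * math.exp(-lam * w[k]) / z for k in p}
    mean = sum(tilt[k] * w[k] for k in p)
    var = sum(tilt[k] * (w[k] - mean) ** 2 for k in p)
    return mean, var

p = random_markov_p()
w = omega_tilde(p, mu=0.3)
lam, h = 0.25, 1e-4
d1 = (big_omega(p, w, lam + h) - big_omega(p, w, lam - h)) / (2 * h)
d2 = (big_omega(p, w, lam + h) - 2 * big_omega(p, w, lam)
      + big_omega(p, w, lam - h)) / h ** 2
mean, var = tilted_mean_var(p, w, lam)
print("first derivative :", d1, "tilted mean      :", mean)
print("second derivative:", d2, "minus tilted var :", -var)
```

The two printed pairs should agree up to finite-difference error, and the non-positive second derivative reflects the concavity in λ stated in part d.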

Our main result is the following.

Theorem 3.

For any

R_{1}, R_{2} \geq 0

, any

p_{X Y}

, and for any

(φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)})

satisfying

(1 / n) log | | φ_{i}^{(n)} | | \leq R_{i}, i = 1, 2,

we have

P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) \leq 5 exp \{- n F (R_{1}, R_{2} | p_{X Y})\} .

(10)

It can be seen from Property 4 parts b and f that

F (R_{1}, R_{2} | p_{X Y})

is strictly positive if

(R_{1}, R_{2})

is outside the rate region

R (p_{X Y})

. Hence, by Theorem 3, we have that, if

(R_{1}, R_{2})

is outside the rate region, then the error probability of decoding goes to one exponentially and its exponent is not below

F (R_{1}, R_{2} | p_{X Y})

. It immediately follows from Theorem 3 that we have the following corollary.

Corollary 1.

\begin{matrix} G (R_{1}, R_{2} | p_{X Y}) \geq F (R_{1}, R_{2} | p_{X Y}), \\ G (p_{X Y}) \subseteq \bar{G} (p_{X Y}) = \{(R_{1}, R_{2}, G) : G \geq F (R_{1}, R_{2} | p_{X Y})\} . \end{matrix}

The proof of Theorem 3 will be given in the next section. The exponent function at rates outside the rate region was derived by Oohama and Han [5] for the separate source coding problem for correlated sources [4]. The technique they used is the method of types [21], which is not useful for proving Theorem 3. Some novel techniques based on the information spectrum method introduced by Han [22] are necessary to prove this theorem.

From Theorem 3 and Property 4 part e, we can obtain an explicit outer bound of

R_{AKW} (ε | p_{X Y})

with an asymptotically vanishing deviation from

R_{AKW} (p_{X Y}) = R (p_{X Y})

. The strong converse theorem established by Ahlswede et al. [6] immediately follows from this outer bound. To describe this outer bound, for

κ > 0

, we set

R (p_{X Y}) - κ (1, 1) : = \{(R_{1} - κ, R_{2} - κ) : (R_{1}, R_{2}) \in R (p_{X Y})\},

which serves as an outer bound of

R (p_{X Y})

. For each fixed

ε \in (0, 1)

, we define

κ_{n} = κ_{n} (ε, ρ (p_{X Y}))

by

\begin{matrix} κ_{n} & : = ρ (p_{X Y}) ϑ (\sqrt{\frac{4}{n ρ (p_{X Y})} log (\frac{5}{1 - ε})}) \\ & \overset{(a)}{=} 2 \sqrt{\frac{ρ (p_{X Y})}{n} log (\frac{5}{1 - ε})} + \frac{5}{n} log (\frac{5}{1 - ε}) . \end{matrix}

(11)

Step (a) follows from

ϑ (a) = a + (5 / 4) a^{2}

. Since

κ_{n} \to 0

as

n \to \infty

, we have the smallest positive integer

n_{0} = n_{0} (ε, ρ (p_{X Y}))

such that

κ_{n} \leq (1 / 2) ρ (p_{X Y})

for

n \geq n_{0}

. From Theorem 3 and Property 4 part e, we have the following corollary.

Corollary 2.

For each fixed ε

\in (0, 1)

, we choose the above positive integer

n_{0} = n_{0} (ε, ρ (p_{X Y}))

. Then, for any

n \geq n_{0}

, we have

\begin{matrix} R_{AKW} (n, ε | p_{X Y}) \subseteq R (p_{X Y}) - κ_{n} (1, 1) . \end{matrix}

The above result together with

\begin{matrix} R_{AKW} (ε | p_{X Y}) & = cl (⋃_{m \geq 1} ⋂_{n \geq m} R_{AKW} (n, ε | p_{X Y})) \end{matrix}

yields that, for each fixed

ε \in (0, 1)

, we have

\begin{matrix} R_{AKW} (ε | p_{X Y}) = R_{AKW} (p_{X Y}) = R (p_{X Y}) . \end{matrix}

This recovers the strong converse theorem proved by Ahlswede et al. [6].

Proof of this corollary will be given in the next section.
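The behaviour of the gap κ_{n} in (11) is easy to explore numerically. The Python sketch below evaluates both forms of κ_{n}, confirms that they coincide, and searches for the smallest n_{0} with κ_{n} ≤ (1 / 2) ρ for n ≥ n_{0}. The value used for ρ (p_{X Y}) is a placeholder assumption (the paper defines ρ through a maximization that is not carried out here), and the function names are hypothetical.

```python
import math

def theta(a):
    """The function vartheta(a) = a + (5/4) a^2."""
    return a + 1.25 * a * a

def kappa_n(n, eps, rho):
    """First form of (11): rho * vartheta(sqrt(4 / (n rho) * log(5 / (1 - eps))))."""
    big_l = math.log(5.0 / (1.0 - eps))
    return rho * theta(math.sqrt(4.0 * big_l / (n * rho)))

def kappa_n_expanded(n, eps, rho):
    """Second form of (11): 2 sqrt(rho / n * L) + (5 / n) * L, L = log(5 / (1 - eps))."""
    big_l = math.log(5.0 / (1.0 - eps))
    return 2.0 * math.sqrt(rho * big_l / n) + 5.0 * big_l / n

def smallest_n0(eps, rho):
    """Smallest n with kappa_n <= rho / 2; kappa_n is decreasing in n."""
    n = 1
    while kappa_n(n, eps, rho) > 0.5 * rho:
        n += 1
    return n

eps, rho = 0.1, 1.0   # rho = 1.0 is a placeholder, not a value from the paper
n0 = smallest_n0(eps, rho)
print("n_0 =", n0)
for n in (n0, 10 * n0, 100 * n0):
    print(n, kappa_n(n, eps, rho), kappa_n(n, eps, rho) * math.sqrt(n))
```

The last printed column, κ_{n} \sqrt{n}, settles towards the constant 2 \sqrt{ρ log (5 / (1 - ε))}, which is the O (1 / \sqrt{n}) gap mentioned in the Introduction.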

4. Proof of the Main Result

Let

(X^{n}, Y^{n})

be a pair of random variables from the information source. We set

S = φ_{1}^{(n)} (X^{n})

. Joint distribution

p_{S X^{n} Y^{n}}

of

(S, X^{n}, Y^{n})

is given by

\begin{matrix} p_{S X^{n} Y^{n}} (s, x^{n}, y^{n}) = p_{S | X^{n}} (s | x^{n}) \prod_{t = 1}^{n} p_{X_{t} Y_{t}} (x_{t}, y_{t}) . \end{matrix}

It is obvious that

S \leftrightarrow X^{n} \leftrightarrow Y^{n}

. Then, we have the following lemma, which is well known as a single shot information spectrum bound.

Lemma 1.

For any

η > 0

and for any

(φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)})

satisfying

(1 / n) log | | φ_{i}^{(n)} | | \leq R_{i}, i = 1, 2,

we have

\begin{matrix} P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) \leq p_{S X^{n} Y^{n}} \{0 \geq \frac{1}{n} log \frac{{\hat{q}}_{S X^{n} Y^{n}} (S, X^{n}, Y^{n})}{p_{S X^{n} Y^{n}} (S, X^{n}, Y^{n})} - η, \end{matrix}

(12)

\begin{matrix} 0 \geq \frac{1}{n} log \frac{Q_{X^{n}} (X^{n})}{p_{X^{n}} (X^{n})} - η, \end{matrix}

(13)

\begin{matrix} R_{1} \geq \frac{1}{n} log \frac{{\tilde{Q}}_{X^{n} | S} (X^{n} | S)}{p_{X^{n}} (X^{n})} - η, \end{matrix}

(14)

\begin{matrix} R_{2} \geq \frac{1}{n} log \frac{1}{p_{Y^{n} | S} (Y^{n} | S)} - η\} + 4 e^{- n η} . \end{matrix}

(15)

The probability distributions appearing in the three inequalities (12), (13), and (14) in the right members of (15) can be chosen arbitrarily. In (12), we can choose any probability distribution

{\hat{q}}_{S X^{n} Y^{n}}

on

\mathcal{M}_{1} \times X^{n} \times Y^{n}

. In (13), we can choose any distribution

Q_{X^{n}}

on

X^{n}

. In (14), we can choose any stochastic matrix

{\tilde{Q}}_{X^{n} | S} : \mathcal{M}_{1} \to X^{n}

.

This lemma can be proved by a standard argument in the information spectrum method [22]. The details of the proof are given in Appendix F. Next, we single letterize the four information spectrum quantities inside the first term in the right members of (15) in Lemma 1 to obtain the following lemma.

Lemma 2.

For any

η > 0

and for any

(φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)})

satisfying

(1 / n) log | | φ_{i}^{(n)} | | \leq R_{i}, i = 1, 2,

we have

\begin{matrix} P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) \leq p_{S X^{n} Y^{n}} \{0 \geq \frac{1}{n} \sum_{t = 1}^{n} log \frac{Q_{X_{t}} (X_{t})}{p_{X_{t}} (X_{t})} - η, \end{matrix}

(16)

\begin{matrix} R_{1} \geq \frac{1}{n} \sum_{t = 1}^{n} log \frac{{\tilde{Q}}_{X_{t} | S X^{t - 1}} (X_{t} | S, X^{t - 1})}{p_{X_{t}} (X_{t})} - η, \end{matrix}

(17)

\begin{matrix} R_{2} \geq \frac{1}{n} \sum_{t = 1}^{n} log \frac{1}{p_{Y_{t} | S X^{t - 1} Y^{t - 1}} (Y_{t} | S, X^{t - 1}, Y^{t - 1})} - 2 η\} + 4 e^{- n η}, \end{matrix}

where for each

t = 1, 2, \dots, n

, the probability distribution

Q_{X_{t}}

on

X

appearing in (16) and the stochastic matrix

{\tilde{Q}}_{X_{t} | S X^{t - 1}} : \mathcal{M}_{1} \times X^{t - 1} \to X

appearing in (17) can be chosen arbitrarily.

Proof.

In (12) in Lemma 1, we choose

{\hat{q}}_{S X^{n} Y^{n}}

having the form

\begin{matrix} {\hat{q}}_{S X^{n} Y^{n}} (S, X^{n}, Y^{n}) = p_{S} (S) \prod_{t = 1}^{n} \{p_{X_{t} | S X^{t - 1} Y^{t}} (X_{t} | S, X^{t - 1}, Y^{t}) p_{Y_{t} | S Y^{t - 1}} (Y_{t} | S, Y^{t - 1})\} . \end{matrix}

In (13) in Lemma 1, we choose

Q_{X^{n}}

having the form

Q_{X^{n}} (X^{n}) = \prod_{t = 1}^{n} Q_{X_{t}} (X_{t}) .

We further note that

\begin{matrix} \frac{{\tilde{Q}}_{X^{n} | S} (X^{n} | S)}{p_{X^{n}} (X^{n})} = \prod_{t = 1}^{n} \frac{{\tilde{Q}}_{X_{t} | S X^{t - 1}} (X_{t} | S, X^{t - 1})}{p_{X_{t}} (X_{t})}, p_{Y^{n} | S} (Y^{n} | S) = \prod_{t = 1}^{n} p_{Y_{t} | S Y^{t - 1}} (Y_{t} | S, Y^{t - 1}) . \end{matrix}

Then, the bound (15) in Lemma 1 becomes

\begin{matrix} P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) \leq p_{S X^{n} Y^{n}} \{0 \geq \frac{1}{n} \sum_{t = 1}^{n} log \frac{p_{Y_{t} | S Y^{t - 1}} (Y_{t} | S, Y^{t - 1})}{p_{Y_{t} | S X^{t - 1} Y^{t - 1}} (Y_{t} | S, X^{t - 1}, Y^{t - 1})} - η, \\ 0 \geq \frac{1}{n} \sum_{t = 1}^{n} log \frac{Q_{X_{t}} (X_{t})}{p_{X_{t}} (X_{t})} - η, \\ R_{1} \geq \frac{1}{n} \sum_{t = 1}^{n} log \frac{{\tilde{Q}}_{X_{t} | S X^{t - 1}} (X_{t} | S, X^{t - 1})}{p_{X_{t}} (X_{t})} - η, \\ R_{2} \geq \frac{1}{n} \sum_{t = 1}^{n} log \frac{1}{p_{Y_{t} | S Y^{t - 1}} (Y_{t} | S, Y^{t - 1})} - η\} + 4 e^{- n η} \\ \leq p_{S X^{n} Y^{n}} \{0 \geq \frac{1}{n} \sum_{t = 1}^{n} log \frac{Q_{X_{t}} (X_{t})}{p_{X_{t}} (X_{t})} - η, \\ R_{1} \geq \frac{1}{n} \sum_{t = 1}^{n} log \frac{{\tilde{Q}}_{X_{t} | S X^{t - 1}} (X_{t} | S, X^{t - 1})}{p_{X_{t}} (X_{t})} - η, \\ R_{2} \geq \frac{1}{n} \sum_{t = 1}^{n} log \frac{1}{p_{Y_{t} | S X^{t - 1} Y^{t - 1}} (Y_{t} | S, X^{t - 1}, Y^{t - 1})} - 2 η\} + 4 e^{- n η}, \end{matrix}

completing the proof. □
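The last inequality in the proof combines the first and the fourth conditions inside the probability. The following sketch spells this out; the shorthands A_n and B_n are introduced here for readability and are not notation from the paper.

```latex
% Shorthand (assumed here):
%   A_n := (1/n) \sum_{t=1}^{n} \log \frac{1}{p_{Y_t | S Y^{t-1}}(Y_t | S, Y^{t-1})},
%   B_n := (1/n) \sum_{t=1}^{n} \log \frac{1}{p_{Y_t | S X^{t-1} Y^{t-1}}(Y_t | S, X^{t-1}, Y^{t-1})}.
% The first condition reads 0 \ge B_n - A_n - \eta and the fourth reads R_2 \ge A_n - \eta.
\begin{align*}
0 \ge B_n - A_n - \eta \ \text{ and } \ R_2 \ge A_n - \eta
\quad \Longrightarrow \quad
R_2 \ge A_n - \eta \ge (B_n - \eta) - \eta = B_n - 2 \eta .
\end{align*}
```

Replacing these two conditions by their consequence enlarges the event, so its probability can only increase, which gives the stated upper bound.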

As in the standard converse coding argument, we identify auxiliary random variables, based on the bound in Lemma 2. The following lemma is necessary for such identification.

Lemma 3.

Suppose that, for each

t = 1, 2, \dots, n

, the joint distribution

p_{S X^{t} Y^{t}}

of the random vector

S X^{t} Y^{t}

is a marginal distribution of

p_{S X^{n} Y^{n}}

. Then, we have the following Markov chain:

S X^{t - 1} \leftrightarrow X_{t} \leftrightarrow Y_{t}

(18)

or equivalently that

I (Y_{t}; S X^{t - 1} | X_{t}) = 0

. Furthermore, we have the following Markov chain:

Y^{t - 1} \leftrightarrow S X^{t - 1} \leftrightarrow (X_{t}, Y_{t})

(19)

or equivalently that

I (X_{t} Y_{t}; Y^{t - 1} | S X^{t - 1}) = 0

. The above two Markov chains are equivalent to the following one long Markov chain:

Y^{t - 1} \leftrightarrow S X^{t - 1} \leftrightarrow X_{t} \leftrightarrow Y_{t} .

(20)
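The two Markov chains of Lemma 3 can be verified numerically on a toy instance by checking that the corresponding conditional mutual informations vanish. The Python sketch below uses block length n = 2, binary alphabets, t = 2, and a deterministic encoder S = X_{1} AND X_{2}. The source distribution, the encoder, and the helper names are assumptions chosen for illustration.

```python
import math
from itertools import product

def marginal(p, keep):
    """Marginalize a dict-valued distribution onto the coordinates in keep."""
    out = {}
    for key, v in p.items():
        sub = tuple(key[i] for i in keep)
        out[sub] = out.get(sub, 0.0) + v
    return out

def cond_mutual_info(p, a_idx, b_idx, c_idx):
    """I(A; B | C) in nats, where A, B, C are given as tuples of coordinates."""
    p_abc = marginal(p, a_idx + b_idx + c_idx)
    p_ac = marginal(p, a_idx + c_idx)
    p_bc = marginal(p, b_idx + c_idx)
    p_c = marginal(p, c_idx)
    na, nb = len(a_idx), len(b_idx)
    total = 0.0
    for key, v in p_abc.items():
        if v <= 0.0:
            continue
        a, b, c = key[:na], key[na:na + nb], key[na + nb:]
        total += v * math.log(v * p_c[c] / (p_ac[a + c] * p_bc[b + c]))
    return total

# Hypothetical memoryless source p_XY[x][y] and hypothetical encoder phi_1.
p_xy = [[0.4, 0.1], [0.2, 0.3]]
encoder = lambda x1, x2: x1 & x2

# Joint distribution of (S, X_1, X_2, Y_1, Y_2); coordinates 0, 1, 2, 3, 4.
joint = {}
for x1, x2, y1, y2 in product(range(2), repeat=4):
    s = encoder(x1, x2)
    joint[(s, x1, x2, y1, y2)] = p_xy[x1][y1] * p_xy[x2][y2]

# Chain (18) at t = 2: I(Y_2; S X_1 | X_2) = 0.
chain_18 = cond_mutual_info(joint, (4,), (0, 1), (2,))
# Chain (19) at t = 2: I(X_2 Y_2; Y_1 | S X_1) = 0.
chain_19 = cond_mutual_info(joint, (2, 4), (3,), (0, 1))
print("I(Y_2; S X_1 | X_2)     =", chain_18)
print("I(X_2 Y_2; Y_1 | S X_1) =", chain_19)
```

Both values are zero up to floating-point rounding. Swapping in another deterministic encoder should leave them at zero, since the argument only uses that S is a function of X^{n} and that the source is memoryless.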

Proof of this lemma is given in Appendix G. For

t = 1, 2, \dots, n

, set

\mathcal{U}_{t} : = \mathcal{M}_{1} \times X^{t - 1}

. Define a random variable

U_{t} \in \mathcal{U}_{t}

by

U_{t} : = (S, X^{t - 1})

. From Lemmas 2 and 3, we identify auxiliary random variables to obtain the following lemma.

Lemma 4.

For any

η > 0

and for any

(φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)})

satisfying

(1 / n) log | | φ_{i}^{(n)} | | \leq R_{i}, i = 1, 2,

we have

\begin{matrix} P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) \leq p_{S X^{n} Y^{n}} \{0 \geq \frac{1}{n} \sum_{t = 1}^{n} log \frac{Q_{X_{t}} (X_{t})}{p_{X_{t}} (X_{t})} - η, \end{matrix}

(21)

\begin{matrix} R_{1} \geq \frac{1}{n} \sum_{t = 1}^{n} log \frac{{\tilde{Q}}_{X_{t} | U_{t}} (X_{t} | U_{t})}{p_{X_{t}} (X_{t})} - η, \end{matrix}

(22)

\begin{matrix} R_{2} \geq \frac{1}{n} \sum_{t = 1}^{n} log \frac{1}{p_{Y_{t} | U_{t}} (Y_{t} | U_{t})} - 2 η\} + 4 e^{- n η}, \end{matrix}

(23)

where, for each

t = 1, 2, \dots, n

, the probability distribution

Q_{X_{t}}

on

X

appearing in (21) and the stochastic matrix

{\tilde{Q}}_{X_{t} | U_{t}} : \mathcal{U}_{t} \to X

appearing in (22) can be chosen arbitrarily.

Now, the challenge is that, although the quantities inside the first term in the right members of (23) in Lemma 4 are sums of n information spectrum quantities, the measure

p_{S X^{n} Y^{n}}

does not have an i.i.d. structure in general. To resolve this, we first use large deviation theory to upper bound the first quantity in the right members of (23). For each

t = 1, 2, \dots, n

, set

{\underset{̲}{Q}}_{t} : = (Q_{X_{t}}, {\tilde{Q}}_{X_{t} | U_{t}}) .

Let

{\underset{̲}{Q}}_{t}

be a set of all

{\underset{̲}{Q}}_{t}

. We define a quantity which serves as an exponential upper bound of

P_{c}^{(n)} (φ_{1}^{(n)},

φ_{2}^{(n)}, ψ^{(n)})

. Let

P^{(n)} (p_{X Y})

be a set of all probability distributions

p_{S X^{n} Y^{n}}

on

M_{1}

\times X^{n}

\times Y^{n}

having a form:

\begin{matrix} p_{S X^{n} Y^{n}} (s, x^{n}, y^{n}) = p_{S | X^{n}} (s | x^{n}) \prod_{t = 1}^{n} p_{X Y} (x_{t}, y_{t}) \\ for (s, x^{n}, y^{n}) \in M_{1} \times X^{n} \times Y^{n} . \end{matrix}

For simplicity of notation, we use the notation

p^{(n)}

for

p_{S X^{n} Y^{n}}

\in P^{(n)}

(p_{X Y})

. For each

t = 1, 2, \dots, n

,

p_{U_{t} X_{t} Y_{t}} = p_{S X^{t} Y_{t}}

is a marginal distribution of

p^{(n)}

. For

t = 1, 2, \dots, n

, we simply write

p_{t} =

p_{U_{t} X_{t} Y_{t}}

. For

μ \in [0, 1]

,

α \in [0, 1)

,

p^{(n)}

\in P^{(n)} (p_{X Y})

, and

{\underset{̲}{Q}}^{n}

\in Q^{n}

, we define

\begin{matrix} Ω^{(μ, α)} (p^{(n)}, {\underset{̲}{Q}}^{n} | p_{X Y}) : = - log E_{p^{(n)}} [\prod_{t = 1}^{n} \frac{p_{X_{t}}^{\bar{α}} (X_{t})}{Q_{X_{t}}^{\bar{α}} (X_{t})} \frac{p_{X_{t}}^{μ α} (X_{t}) p_{Y_{t} | U_{t}}^{\bar{μ} α} (Y_{t} | U_{t})}{{\tilde{Q}}_{X_{t} | U_{t}}^{μ α} (X_{t} | U_{t})}], \end{matrix}

where for each

t = 1, 2, \dots, n

, the probability distribution

Q_{X_{t}}

and the conditional probability distribution

{\tilde{Q}}_{X_{t} | U_{t}}

appearing in the definition of

Ω^{(μ, α)} (p^{(n)}, {\underset{̲}{Q}}^{n} | p_{X Y})

can be chosen arbitrarily.

The following is well known as Cramér's bound in large deviation theory.

Lemma 5.

For any real-valued random variable Z, any real number a, and any

α \geq 0

, we have

Pr {Z \geq a} \leq exp [- (α a - log E [exp (α Z)])] .
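As a quick numerical illustration of Lemma 5, the following sketch compares the exact tail probability of a finite-valued random variable with the right-hand side of the bound for several values of α. The choice of Z (the number of heads in ten fair coin flips), the threshold a = 8, and the function names are assumptions made only for this example.

```python
import math

def cramer_bound(pmf, a, alpha):
    """Right-hand side exp[-(alpha*a - log E[exp(alpha*Z)])] of Lemma 5."""
    mgf = sum(p * math.exp(alpha * z) for z, p in pmf.items())
    return math.exp(-(alpha * a - math.log(mgf)))

def tail_probability(pmf, a):
    """Exact Pr{Z >= a} for a finite-valued random variable Z."""
    return sum(p for z, p in pmf.items() if z >= a)

# Hypothetical example: Z = number of heads in 10 fair coin flips.
n = 10
pmf = {k: math.comb(n, k) / 2**n for k in range(n + 1)}
a = 8

exact = tail_probability(pmf, a)
bounds = {alpha: cramer_bound(pmf, a, alpha) for alpha in (0.0, 0.5, 1.0, 1.5, 2.0)}

for alpha, value in bounds.items():
    print(f"alpha = {alpha:.1f}: Pr = {exact:.5f} <= bound = {value:.5f}")
```

At α = 0 the bound is trivial (equal to one); intermediate values of α give tighter bounds, which is why the proofs below optimize over the free parameters.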

By Lemmas 4 and 5, we have the following proposition.

Proposition 1.

For any

(μ, α) \in {[0, 1]}^{2}

, any

{\underset{̲}{Q}}^{n} \in {\underset{̲}{Q}}^{n}

, and any

(φ_{1}^{(n)},

φ_{2}^{(n)}, ψ^{(n)})

satisfying

(1 / n) log | | φ_{i}^{(n)} | | \leq R_{i}, i = 1, 2,

there exists

p^{(n)} \in P^{(n)} (p_{X Y})

such that

\begin{matrix} P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) \leq 5 exp \{- n {[2 + α \bar{μ}]}^{- 1} [\frac{1}{n} Ω^{(μ, α)} (p^{(n)}, {\underset{̲}{Q}}^{n} | p_{X Y}) - α (μ R_{1} + \bar{μ} R_{2})]\} . \end{matrix}

Proof.

By Lemma 4, for

(μ, α) \in {[0, 1]}^{2}

, we have the following chain of inequalities:

\begin{matrix} P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) \leq p_{S X^{n} Y^{n}} \{0 \geq [\frac{1}{n} \sum_{t = 1}^{n} log \frac{Q_{X_{t}}^{\bar{α}} (X_{t})}{p_{X_{t}}^{\bar{α}} (X_{t})} - \bar{α} η], \\ α μ R_{1} \geq \frac{1}{n} \sum_{t = 1}^{n} log \frac{{\tilde{Q}}_{X_{t} | U_{t}}^{α μ} (X_{t} | U_{t})}{p_{X_{t}}^{α μ} (X_{t})} - α μ η, \\ α \bar{μ} R_{2} \geq \frac{1}{n} \sum_{t = 1}^{n} log \frac{1}{p_{Y_{t} | U_{t}}^{α \bar{μ}} (Y_{t} | U_{t})} - 2 α \bar{μ} η\} + 4 e^{- n η} \\ \leq p_{S X^{n} Y^{n}} \{α (μ R_{1} + \bar{μ} R_{2}) + (1 + α \bar{μ}) η \geq - \frac{1}{n} \sum_{t = 1}^{n} log [\frac{p_{X_{t}}^{\bar{α}} (X_{t})}{Q_{X_{t}}^{\bar{α}} (X_{t})} \frac{p_{X_{t}}^{μ α} (X_{t}) p_{Y_{t} | U_{t}}^{\bar{μ} α} (Y_{t} | U_{t})}{{\tilde{Q}}_{X_{t} | U_{t}}^{μ α} (X_{t} | U_{t})}]\} + 4 e^{- n η} \\ = p_{S X^{n} Y^{n}} \{\frac{1}{n} \sum_{t = 1}^{n} log [\frac{p_{X_{t}}^{\bar{α}} (X_{t})}{Q_{X_{t}}^{\bar{α}} (X_{t})} \frac{p_{X_{t}}^{μ α} (X_{t}) p_{Y_{t} | U_{t}}^{\bar{μ} α} (Y_{t} | U_{t})}{{\tilde{Q}}_{X_{t} | U_{t}}^{μ α} (X_{t} | U_{t})}] \geq - [α (μ R_{1} + \bar{μ} R_{2}) + (1 + α \bar{μ}) η]\} + 4 e^{- n η} \\ \overset{(a)}{\leq} exp [n \{α (μ R_{1} + \bar{μ} R_{2}) + (1 + α \bar{μ}) η - \frac{1}{n} Ω^{(μ, α)} (p^{(n)}, {\underset{̲}{Q}}^{n} | p_{X Y})\}] + 4 e^{- n η} . \end{matrix}

(24)

Step (a) follows from Lemma 5. When

Ω^{(μ, α)} (p^{(n)}, {\underset{̲}{Q}}^{n} | p_{X Y}) \leq n α (μ R_{1} + \bar{μ} R_{2})

, the bound we wish to prove is obvious. In the following argument, we assume that

Ω^{(μ, α)} (p^{(n)}, {\underset{̲}{Q}}^{n} | p_{X Y}) >

n α (μ R_{1} + \bar{μ} R_{2})

. We choose

η

so that

\begin{matrix} - η & = & α (μ R_{1} + \bar{μ} R_{2}) + (1 + α \bar{μ}) η - \frac{1}{n} Ω^{(μ, α)} (p^{(n)}, {\underset{̲}{Q}}^{n} | p_{X Y}) . \end{matrix}

(25)

Solving (25) with respect to

η

, we have

\begin{matrix} η = \frac{(1 / n) Ω^{(μ, α)} (p^{(n)}, {\underset{̲}{Q}}^{n} | p_{X Y}) - α (μ R_{1} + \bar{μ} R_{2})}{2 + α \bar{μ}} . \end{matrix}

For this choice of

η

and (24), we have

\begin{matrix} P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) \leq 5 e^{- n η} = 5 exp \{- n {[2 + α \bar{μ}]}^{- 1} [\frac{1}{n} Ω^{(μ, α)} (p^{(n)}, {\underset{̲}{Q}}^{n} | p_{X Y}) - α (μ R_{1} + \bar{μ} R_{2})]\}, \end{matrix}

completing the proof. □
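The choice of η in (25) can be checked numerically. The sketch below, with hypothetical values for n, α, μ, R_1, R_2 and for (1/n)Ω, confirms that the solved η makes the exponent in the last line of (24) equal to −η, so that the two terms of (24) combine into 5e^{−nη}. All numerical values and the function name are assumptions for illustration only.

```python
import math

def eta_star(omega_over_n, alpha, mu, r1, r2):
    """Solution of (25): eta = [(1/n)Omega - alpha(mu R1 + mubar R2)] / (2 + alpha mubar)."""
    mubar = 1.0 - mu
    return (omega_over_n - alpha * (mu * r1 + mubar * r2)) / (2.0 + alpha * mubar)

# Hypothetical values; omega_over_n stands for (1/n) * Omega^{(mu, alpha)}.
n, alpha, mu = 50, 0.6, 0.3
r1, r2, omega_over_n = 0.2, 0.5, 0.9

eta = eta_star(omega_over_n, alpha, mu, r1, r2)
mubar = 1.0 - mu

# Exponent appearing in the last line of (24), divided by n.
exponent = alpha * (mu * r1 + mubar * r2) + (1.0 + alpha * mubar) * eta - omega_over_n

bound_from_24 = math.exp(n * exponent) + 4.0 * math.exp(-n * eta)
bound_claimed = 5.0 * math.exp(-n * eta)
print(eta, bound_from_24, bound_claimed)
```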

Set

\begin{matrix} {\underset{̲}{Ω}}^{(μ, α)} (p_{X Y}) : = inf_{n \geq 1} min_{p^{(n)} \in P^{(n)}} max_{{\underset{̲}{Q}}^{n} \in {\underset{̲}{Q}}^{n}} \frac{1}{n} Ω^{(μ, α)} (p^{(n)}, {\underset{̲}{Q}}^{n} | p_{X Y}) . \end{matrix}

By Proposition 1, we have the following corollary.

Corollary 3.

For any

(μ, α) \in {[0, 1]}^{2}

and any

(φ_{1}^{(n)},

φ_{2}^{(n)}, ψ^{(n)})

satisfying

(1 / n) log | | φ_{i}^{(n)} | | \leq R_{i}, i = 1, 2,

we have

\begin{matrix} P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) \leq 5 exp \{- n [\frac{{\underset{̲}{Ω}}^{(μ, α)} (p_{X Y}) - α (μ R_{1} + \bar{μ} R_{2})}{2 + α \bar{μ}}]\} . \end{matrix}

We shall call

{\underset{̲}{Ω}}^{(μ, α)} (p_{X Y})

the communication potential. The above corollary implies that the analysis of

{\underset{̲}{Ω}}^{(μ, α)} (

p_{X Y})

leads to an establishment of a strong converse theorem for the one helper source coding problem. Note here that

{\underset{̲}{Ω}}^{(μ, α)} (p_{X Y})

is still a multi-letter quantity. However, we succeed in single-letterizing this quantity. This result, stated later in Proposition 2, is the mathematical core of our main result.

In the following argument, we derive an explicit lower bound of

{\underset{̲}{Ω}}^{(μ, α)} (p_{X Y})

. For each

t = 1, 2, \dots, n

, set

u_{t} = (s, x^{t - 1})

\in U_{t}

and

F_{t} : = (p_{X_{t}}, p_{X_{t} Y_{t} | U_{t}}, {\underset{̲}{Q}}_{t}), F^{t} : = {F_{i}}_{i = 1}^{t} .

For

t = 1, 2, \dots, n

, define a function of

(u_{t}, x_{t}, y_{t})

\in U_{t}

\times X

\times Y

by

\begin{matrix} f_{F_{t}}^{(μ, α)} (x_{t}, y_{t} | u_{t}) : = \frac{p_{X_{t}}^{\bar{α}} (x_{t})}{Q_{X_{t}}^{\bar{α}} (x_{t})} \frac{p_{X_{t}}^{μ α} (x_{t}) p_{Y_{t} | U_{t}}^{\bar{μ} α} (y_{t} | u_{t})}{{\tilde{Q}}_{X_{t} | U_{t}}^{μ α} (x_{t} | u_{t})} . \end{matrix}

By definition, we have

\begin{matrix} exp \{- Ω^{(μ, α)} (p^{(n)}, {\underset{̲}{Q}}^{n} | p_{X Y})\} = \sum_{s, x^{n}, y^{n}} p_{S X^{n} Y^{n}} (s, x^{n}, y^{n}) \prod_{t = 1}^{n} f_{F_{t}}^{(μ, α)} (x_{t}, y_{t} | u_{t}) . \end{matrix}

For each

t = 1, 2, \dots, n

, we define the probability distribution

p_{S X^{t} Y^{t}; F^{t}}^{(μ, α)} : = {\{p_{S X^{t} Y^{t}; F^{t}}^{(μ, α)} (s, x^{t}, y^{t})\}}_{(s, x^{t}, y^{t}) \in M_{1} \times X^{t} \times Y^{t}}

by

\begin{matrix} p_{S X^{t} Y^{t}; F^{t}}^{(μ, α)} (s, x^{t}, y^{t}) : = C_{t}^{- 1} p_{S X^{t} Y^{t}} (s, x^{t}, y^{t}) \prod_{i = 1}^{t} f_{F_{i}}^{(μ, α)} (x_{i}, y_{i} | u_{i}), \end{matrix}

where

\begin{matrix} C_{t} & : = \sum_{s, x^{t}, y^{t}} p_{S X^{t} Y^{t}} (s, x^{t}, y^{t}) \prod_{i = 1}^{t} f_{F_{i}}^{(μ, α)} (x_{i}, y_{i} | u_{i}) \end{matrix}

are constants for normalization. For

t = 1, 2, \dots, n

, define

Φ_{t}^{(μ, α)} : = C_{t} C_{t - 1}^{- 1},

(26)

where we define

C_{0} = 1

. Then, we have the following lemma.

Lemma 6.

For each

t = 1, 2, \dots, n

, and for any

(s,

x^{t}, y^{t}) \in M_{1}

\times X^{t}

\times Y^{t}

, we have

\begin{matrix} p_{S X^{t} Y^{t}; F^{t}}^{(μ, α)} (s, x^{t}, y^{t}) \\ = {(Φ_{t}^{(μ, α)})}^{- 1} p_{S X^{t - 1} Y^{t - 1}; F^{t - 1}}^{(μ, α)} (s, x^{t - 1}, y^{t - 1}) p_{X_{t} Y_{t} | S X^{t - 1} Y^{t - 1}} (x_{t}, y_{t} | s, x^{t - 1}, y^{t - 1}) f_{F_{t}}^{(μ, α)} (x_{t}, y_{t} | u_{t}) . \end{matrix}

(27)

Furthermore, we have

\begin{matrix} Φ_{t}^{(μ, α)} = \sum_{s, x^{t}, y^{t}} p_{S X^{t - 1} Y^{t - 1}; F^{t - 1}}^{(μ, α)} (s, x^{t - 1}, y^{t - 1}) p_{X_{t} Y_{t} | S X^{t - 1} Y^{t - 1}} (x_{t}, y_{t} | s, x^{t - 1}, y^{t - 1}) f_{F_{t}}^{(μ, α)} (x_{t}, y_{t} | u_{t}) . \end{matrix}

(28)

Proof of this lemma is given in Appendix H. Define

\begin{matrix} p_{U_{t}; F^{t - 1}}^{(μ, α)} (u_{t}) = p_{S X^{t - 1}; F^{t - 1}}^{(μ, α)} (s, x^{t - 1}) : = \sum_{y^{t - 1}} p_{S X^{t - 1} Y^{t - 1}; F^{t - 1}}^{(μ, α)} (s, x^{t - 1}, y^{t - 1}) . \end{matrix}

Then, we have the following lemma, which is a key result to derive a single letterized lower bound of

{\underset{̲}{Ω}}^{(μ, α)} (p_{X Y})

.

Lemma 7.

For any

p^{(n)} \in P^{(n)} (p_{X Y})

and any

{\underset{̲}{Q}}^{n} \in {\underset{̲}{Q}}^{n}

, we have

\begin{matrix} Ω^{(μ, α)} (p^{(n)}, {\underset{̲}{Q}}^{n} | p_{X Y}) = (- 1) \sum_{t = 1}^{n} log Φ_{t}^{(μ, α)}, \end{matrix}

(29)

\begin{matrix} Φ_{t}^{(μ, α)} = \sum_{u_{t}, x_{t}, y_{t}} p_{U_{t}; F^{t - 1}}^{(μ, α)} (u_{t}) p_{X_{t} | U_{t}} (x_{t} | u_{t}) p_{Y_{t} | X_{t}} (y_{t} | x_{t}) f_{F_{t}}^{(μ, α)} (x_{t}, y_{t} | u_{t}) . \end{matrix}

(30)

Proof.

We first prove (29). From (26), we have

log Φ_{t}^{(μ, α)} = log C_{t} - log C_{t - 1} .

(31)

Furthermore, by definition, we have

Ω^{(μ, α)} (p^{(n)}, {\underset{̲}{Q}}^{n} | p_{X Y}) = - log C_{n}, C_{0} = 1 .

(32)

From (31) and (32), (29) is obvious. We next prove (30). We first observe that for

(s, x^{t}, y^{t})

\in M_{1} \times X^{t} \times Y^{t}

and for

t = 1, 2, \dots, n

,

\begin{matrix} p_{X_{t} Y_{t} | S X^{t - 1} Y^{t - 1}} (x_{t}, y_{t} | s, x^{t - 1}, y^{t - 1}) = p_{X_{t} | S X^{t - 1} Y^{t - 1}} (x_{t} | s, x^{t - 1}, y^{t - 1}) p_{Y_{t} | S X^{t} Y^{t - 1}} (y_{t} | s, x^{t}, y^{t - 1}) \\ \overset{(a)}{=} p_{X_{t} | S X^{t - 1}} (x_{t} | s, x^{t - 1}) p_{Y_{t} | X_{t}} (y_{t} | x_{t}) . \end{matrix}

Step (a) follows from Lemma 3. Then, by Lemma 6, we have

\begin{matrix} Φ_{t}^{(μ, α)} = \sum_{s, x^{t}, y^{t}} p_{S X^{t - 1} Y^{t - 1}; F^{t - 1}}^{(μ, α)} (s, x^{t - 1}, y^{t - 1}) p_{X_{t} Y_{t} | S X^{t - 1} Y^{t - 1}} (x_{t}, y_{t} | s, x^{t - 1}, y^{t - 1}) f_{F_{t}}^{(μ, α)} (x_{t}, y_{t} | u_{t}) \\ = \sum_{s, x^{t}, y_{t}} \{\sum_{y^{t - 1}} p_{S X^{t - 1} Y^{t - 1}; F^{t - 1}}^{(μ, α)} (s, x^{t - 1}, y^{t - 1})\} p_{X_{t} | S X^{t - 1}} (x_{t} | s, x^{t - 1}) p_{Y_{t} | X_{t}} (y_{t} | x_{t}) f_{F_{t}}^{(μ, α)} (x_{t}, y_{t} | u_{t}) \\ = \sum_{s, x^{t}, y_{t}} p_{S X^{t - 1}; F^{t - 1}}^{(μ, α)} (s, x^{t - 1}) p_{X_{t} | S X^{t - 1}} (x_{t} | s, x^{t - 1}) p_{Y_{t} | X_{t}} (y_{t} | x_{t}) f_{F_{t}}^{(μ, α)} (x_{t}, y_{t} | u_{t}), \end{matrix}

completing the proof. □
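The recursion in Lemma 7 can be verified numerically on a toy instance. The following sketch takes n = 2, binary alphabets, a randomly drawn encoder p_{S|X^2}, and arbitrary Q_{X_t} and Q̃_{X_t|U_t}; it computes C_1 and C_2 directly from their definitions and compares Φ_2 = C_2 / C_1 with the right-hand side of (30). The source distribution, the parameter values, and all variable names are assumptions chosen for illustration, and the exponent of p_{Y_t|U_t} is taken to be \bar{μ}α, in agreement with the weights used in the proof of Proposition 1.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, alpha = 0.4, 0.7                      # hypothetical (mu, alpha) in [0, 1]^2
mubar, abar = 1 - mu, 1 - alpha

def random_stochastic(shape, axis):
    """Random positive array normalized to sum to one along `axis`."""
    a = rng.random(shape) + 0.1
    return a / a.sum(axis=axis, keepdims=True)

# Source p_XY(x, y) on binary alphabets.
p_xy = np.array([[0.4, 0.1], [0.2, 0.3]])
p_x = p_xy.sum(axis=1)
p_y_given_x = p_xy / p_x[:, None]

# Encoder p_{S|X^2}(s | x1, x2); the joint has axes (s, x1, x2, y1, y2).
enc = random_stochastic((2, 2, 2), axis=0)
p = np.einsum('sab,ac,bd->sabcd', enc, p_xy, p_xy)

# Arbitrary Q_t and Q~_t, with U_1 = S and U_2 = (S, X_1).
q1, q2 = random_stochastic(2, 0), random_stochastic(2, 0)
qt1 = random_stochastic((2, 2), axis=1)       # Q~_1(x1 | s)
qt2 = random_stochastic((2, 2, 2), axis=2)    # Q~_2(x2 | s, x1)

# Conditionals p_{Y_t|U_t} obtained from the joint distribution.
p_s_x1_y1 = p.sum(axis=(2, 4))                # axes (s, x1, y1)
p_y1_u1 = p_s_x1_y1.sum(axis=1)
p_y1_u1 = p_y1_u1 / p_y1_u1.sum(axis=1, keepdims=True)        # (s, y1)
p_s_x1_y2 = p.sum(axis=(2, 3))                # axes (s, x1, y2)
p_y2_u2 = p_s_x1_y2 / p_s_x1_y2.sum(axis=2, keepdims=True)    # (s, x1, y2)

# f_1(x1, y1 | s) with axes (s, x1, y1); f_2(x2, y2 | s, x1) with axes (s, x1, x2, y2).
base1 = (p_x / q1) ** abar * p_x ** (mu * alpha)
base2 = (p_x / q2) ** abar * p_x ** (mu * alpha)
f1 = base1[None, :, None] * p_y1_u1[:, None, :] ** (mubar * alpha) / qt1[:, :, None] ** (mu * alpha)
f2 = (base2[None, None, :, None] * p_y2_u2[:, :, None, :] ** (mubar * alpha)
      / qt2[:, :, :, None] ** (mu * alpha))

# Normalization constants and Omega computed directly from the definitions.
c1 = np.sum(p_s_x1_y1 * f1)
c2 = np.einsum('sabcd,sac,sabd->', p, f1, f2)
omega = -np.log(c2)
phi1, phi2 = c1, c2 / c1                      # (26) with C_0 = 1

# Right-hand side of (30) for t = 2.
tilted_u2 = (p_s_x1_y1 * f1).sum(axis=2) / c1                 # p_{U_2; F^1}(s, x1)
p_s_x1_x2 = p.sum(axis=(3, 4))
p_x2_u2 = p_s_x1_x2 / p_s_x1_x2.sum(axis=2, keepdims=True)    # p_{X_2|U_2}
phi2_rec = np.einsum('sa,sab,bd,sabd->', tilted_u2, p_x2_u2, p_y_given_x, f2)

print(phi2, phi2_rec, omega)
```

The agreement of `phi2` and `phi2_rec` relies on the Markov structure of Lemma 3: without it, the conditional distribution of (X_2, Y_2) given (S, X_1, Y_1) would not factor as used in (30).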

The following proposition is the mathematical core of the proof of our main result.

Proposition 2.

For any

μ \in [0, 1]

and any

α \geq 0

, we have

{\underset{̲}{Ω}}^{(μ, α)} (p_{X Y}) \geq Ω^{(μ, α)} (p_{X Y}) .

Proof.

Set

\begin{matrix} Q_{n} (p_{Y | X}) : = & {q = q_{U X Y} : | U | \leq | M_{1} | | X^{n - 1} | | Y^{n - 1} |, q_{Y | X} = p_{Y | X}, U \leftrightarrow X \leftrightarrow Y}, \\ {\hat{Ω}}_{n}^{(μ, α)} (p_{X Y}) : = & min_{\binom{}{q \in Q_{n} (p_{Y | X})}} Ω^{(μ, α)} (q | p_{X Y}) . \end{matrix}

For each

t = 1, 2, \dots, n

, we define

q_{t} = q_{U_{t} X_{t} Y_{t}}

by

\begin{matrix} q_{U_{t}} (u_{t}) = p_{U_{t}; F^{t - 1}}^{(μ, α)} (u_{t}), q_{X_{t} Y_{t} | U_{t}} (x_{t}, y_{t} | u_{t}) = p_{X_{t} | U_{t}} (x_{t} | u_{t}) p_{Y | X} (y_{t} | x_{t}) . \end{matrix}

(33)

Equation (33) implies that

q_{t} = q_{U_{t} X_{t} Y_{t}} \in Q_{n} (p_{Y | X}) .

Furthermore, for each

t = 1, 2, \dots, n

, we choose

{\underset{̲}{Q}}_{t} = (Q_{X_{t}}, {\tilde{Q}}_{X_{t} | U_{t}})

appearing in

\begin{matrix} f_{F_{t}}^{(μ, α)} (x_{t}, y_{t} | u_{t}) = \frac{p_{X_{t}}^{\bar{α}} (x_{t})}{Q_{X_{t}}^{\bar{α}} (x_{t})} \frac{p_{X_{t}}^{μ α} (x_{t}) p_{Y_{t} | U_{t}}^{\bar{μ} α} (y_{t} | u_{t})}{{\tilde{Q}}_{X_{t} | U_{t}}^{μ α} (x_{t} | u_{t})} \end{matrix}

such that

{\underset{̲}{Q}}_{t} = (Q_{X_{t}}, {\tilde{Q}}_{X_{t} | U_{t}}) = (q_{X_{t}}, q_{X_{t} | U_{t}}) .

For this choice of

{\underset{̲}{Q}}_{t}

, we have the following chain of inequalities:

\begin{matrix} Φ_{t}^{(μ, α)} \overset{(a)}{=} E_{q_{t}} [f_{F_{t}}^{(μ, α)} (X_{t}, Y_{t} | U_{t})] \overset{(b)}{=} E_{q_{t}} [\frac{p_{X_{t}}^{\bar{α}} (X_{t})}{q_{X_{t}}^{\bar{α}} (X_{t})} \frac{p_{X_{t}}^{μ α} (X_{t}) p_{Y_{t} | U_{t}}^{\bar{μ} α} (Y_{t} | U_{t})}{q_{X_{t} | U_{t}}^{μ α} (X_{t} | U_{t})}] = E_{q_{t}} [f_{q_{t} | p_{X_{t}}}^{(μ, α)} (X_{t}, Y_{t} | U_{t})] \\ = exp \{- Ω^{(μ, α)} (q_{t} | p_{X_{t}})\} \overset{(c)}{=} exp \{- Ω^{(μ, α)} (q_{t} | p_{X})\} \\ \overset{(d)}{\leq} exp \{- {\hat{Ω}}_{n}^{(μ, α)} (p_{X Y})\} \overset{(e)}{=} exp \{- Ω^{(μ, α)} (p_{X Y})\} . \end{matrix}

(34)

Step (a) follows from Lemma 7 and (33). Step (b) follows from the choice

(Q_{X_{t}}, {\tilde{Q}}_{X_{t} | U_{t}}) =

(q_{X_{t}}, q_{X_{t} | U_{t}})

of

(Q_{X_{t}}, {\tilde{Q}}_{X_{t} | U_{t}})

for

t = 1, 2, \dots, n

. Step (c) follows from

p_{X_{t}} = p_{X}

for

t = 1, 2, \dots, n

. Step (d) follows from

q_{t} \in Q_{n} (p_{Y | X})

and the definition of

{\hat{Ω}}_{n}^{(μ, α)} (p_{X Y})

. Step (e) follows from Property 4 part a. Hence, we have the following:

\begin{matrix} max_{{\underset{̲}{Q}}^{n} \in {\underset{̲}{Q}}^{n}} \frac{1}{n} Ω^{(μ, α)} (p^{(n)}, {\underset{̲}{Q}}^{n} | p_{X Y}) \geq \frac{1}{n} Ω^{(μ, α)} (p^{(n)}, {\underset{̲}{Q}}^{n} | p_{X Y}) \overset{(a)}{=} - \frac{1}{n} \sum_{t = 1}^{n} log Φ_{t}^{(μ, α)} \overset{(b)}{\geq} Ω^{(μ, α)} (p_{X Y}) . \end{matrix}

(35)

Step (a) follows from Lemma 7. Step (b) follows from (34). Since (35) holds for any

n \geq 1

and any

p_{S X^{n} Y^{n}}

satisfying

S \leftrightarrow X^{n} \leftrightarrow Y^{n}

, we have that, for any

(μ, α) \in {[0, 1]}^{2}

,

{\underset{̲}{Ω}}^{(μ, α)} (p_{X Y}) \geq Ω^{(μ, α)} (p_{X Y}) .

Thus, Proposition 2 is proved. □

Proof of Theorem 3.

For any

(μ, α) \in {[0, 1]}^{2}

, for any

R_{1}, R_{2} \geq 0

and for any

(φ_{1}^{(n)},

φ_{2}^{(n)},

ψ^{(n)})

satisfying

(1 / n) log | | φ_{i}^{(n)} | | \leq R_{i}, i = 1, 2,

we have the following:

\begin{matrix} \frac{1}{n} log \{\frac{5}{P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)})}\} \overset{(a)}{\geq} \frac{{\underset{̲}{Ω}}^{(μ, α)} (p_{X Y}) - α (μ R_{1} + \bar{μ} R_{2})}{2 + α \bar{μ}} \overset{(b)}{\geq} \frac{Ω^{(μ, α)} (p_{X Y}) - α (μ R_{1} + \bar{μ} R_{2})}{2 + α \bar{μ}} \\ = F^{(μ, α)} (μ R_{1} + \bar{μ} R_{2} | p_{X Y}) . \end{matrix}

Step (a) follows from Corollary 3. Step (b) follows from Proposition 2. Since the above bound holds for any

μ \in [0, 1]

and any

α \geq 0

, we have

\frac{1}{n} log \{\frac{5}{P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)})}\} \geq F (R_{1}, R_{2} | p_{X Y}) .

Thus, (10) in Theorem 3 is proved. □

Proof of Corollary 2.

Since g is an inverse function of

ϑ

, the definition (11) of

κ_{n}

is equivalent to

g (\frac{κ_{n}}{ρ (p_{X Y})}) = \sqrt{\frac{4}{n ρ (p_{X Y})} log (\frac{5}{1 - ε})} .

(36)

By the definition of

n_{0} = n_{0} (ε, ρ (p_{X Y}))

, we have that

κ_{n} \leq (1 / 2) ρ (p_{X Y})

for

n \geq n_{0}

. We assume that, for

n \geq n_{0}

,

(R_{1}, R_{2}) \in R_{AKW} (n, ε | p_{X Y}) .

Then, there exists a sequence

{(φ_{1}^{(n)},

φ_{2}^{(n)},

ψ^{(n)})

}_{n \geq n_{0}}

such that, for

n \geq n_{0}

, we have

\begin{matrix} \frac{1}{n} log | | φ_{i}^{(n)} | | \leq R_{i}, i = 1, 2, 1 - ε \leq P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) . \end{matrix}

(37)

Then, by Theorem 3, we have

\begin{matrix} 1 - ε & \leq & P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) \leq 5 exp \{- n F (R_{1}, R_{2} | p_{X Y})\} \end{matrix}

(38)

for any

n \geq n_{0} (ε, ρ (p_{X Y}))

. From (38), we have that for

n \geq n_{0} (ε, ρ (

p_{X Y}))

,

\begin{matrix} F (R_{1}, R_{2} | p_{X Y}) \leq \frac{1}{n} log (\frac{5}{1 - ε}) \overset{(a)}{=} \frac{ρ (p_{X Y})}{4} \cdot g^{2} (\frac{κ_{n}}{ρ (p_{X Y})}) . \end{matrix}

(39)

Step (a) follows from (36). Hence, by Property 4 part e, we have that, under

κ_{n} \leq (1 / 2) ρ (p_{X Y})

, the inequality (39) implies

(R_{1}, R_{2}) \in R (p_{X Y}) - κ_{n} (1, 1) .

(40)

Since (40) holds for any

n \geq n_{0}

and

(R_{1}, R_{2}) \in R_{AKW} (

n, ε | p_{X Y})

, we have

R_{AKW} (n, ε | p_{X Y}) \subseteq R (p_{X Y}) - κ_{n} (1, 1) for n \geq n_{0},

completing the proof. □

5. One Helper Problem Studied by Wyner

We consider a communication system depicted in Figure 4. Data sequences

X^{n}

,

Y^{n}

, and

Z^{n}

, respectively are separately encoded to

φ_{1}^{(n)} (X^{n})

,

φ_{2}^{(n)} (Y^{n})

, and

φ_{3}^{(n)} (Z^{n})

. The encoded data

φ_{1}^{(n)} (X^{n})

and

φ_{2}^{(n)} (Y^{n})

are sent to the information processing center 1. The encoded data

φ_{1}^{(n)} (X^{n})

and

φ_{3}^{(n)} (Z^{n})

are sent to the information processing center 2. At center 1, the decoder function

ψ^{(n)}

observes

(φ_{1}^{(n)} (X^{n}),

φ_{2}^{(n)} (Y^{n}))

to output the estimate

{\hat{Y}}^{n}

of

Y^{n}

. At center 2, the decoder function

ϕ^{(n)}

observes

(φ_{1}^{(n)} (X^{n}),

φ_{3}^{(n)} (Z^{n}))

to output the estimate

{\hat{Z}}^{n}

of

Z^{n}

. The error probability of decoding is

\begin{matrix} P_{e}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, φ_{3}^{(n)}, ψ^{(n)}, ϕ^{(n)}) = Pr \{{\hat{Y}}^{n} \neq Y^{n} or {\hat{Z}}^{n} \neq Z^{n}\}, \end{matrix}

where

{\hat{Y}}^{n} = ψ^{(n)} (

φ_{1}^{(n)} (X^{n}), φ_{2}^{(n)} (Y^{n}))

and

{\hat{Z}}^{n} = ϕ^{(n)} (

φ_{1}^{(n)} (X^{n}), φ_{3}^{(n)} (Z^{n}))

.

A rate triple

(R_{1}, R_{2}, R_{3})

is

ε

-achievable if, for any

δ > 0

, there exist a positive integer

n_{0} = n_{0} (ε, δ)

and a sequence of three encoders and two decoder functions

{(φ_{1}^{(n)}, φ_{2}^{(n)},

φ_{3}^{(n)}, ψ^{(n)},

ϕ^{(n)} {)}}_{n \geq n_{0}}

such that, for

n \geq n_{0} (ε, δ)

,

\begin{matrix} \frac{1}{n} log | | φ_{i}^{(n)} | | \leq R_{i} + δ for i = 1, 2, 3, P_{e}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, φ_{3}^{(n)}, ψ^{(n)}, ϕ^{(n)}) \leq ε . \end{matrix}

The rate region

R_{W} (ε | p_{X Y Z})

is defined by

\begin{matrix} R_{W} (ε | p_{X Y Z}) : = {(R_{1}, R_{2}, R_{3}) : (R_{1}, R_{2}, R_{3}) is ε-achievable for p_{X Y Z}} . \end{matrix}

Furthermore, define

R_{W} (p_{X Y Z}) : = ⋂_{ε \in (0, 1)} R_{W} (ε | p_{X Y Z}) .

We can show that the two rate regions

R_{W} (ε |

p_{X Y Z})

,

ε \in (0, 1)

and

R_{W} (p_{X Y Z})

satisfy the following property.

Property 5.

(a): The regions $R_{W} (ε | p_{X Y Z})$ , $ε \in (0, 1)$ , and $R_{W} (p_{X Y Z})$ are closed convex sets of $R_{+}^{3}$ .
(b): We set

$\begin{matrix} R_{W} (n, ε | p_{X Y Z}) = {(R_{1}, R_{2}, R_{3}) : There exists (φ_{1}^{(n)}, φ_{2}^{(n)}, φ_{3}^{(n)}, ψ^{(n)}, ϕ^{(n)}) such that \\ \frac{1}{n} log | | φ_{i}^{(n)} | | \leq R_{i}, i = 1, 2, 3, P_{e}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, φ_{3}^{(n)}, ψ^{(n)}, ϕ^{(n)}) \leq ε}, \end{matrix}$

which is called the $(n, ε)$ -rate region. Using $R_{W} (n, ε | p_{X Y Z})$ , $R_{W} (ε | p_{X Y Z})$ can be expressed as

$\begin{matrix} R_{W} (ε | p_{X Y Z}) & = cl (⋃_{m \geq 1} ⋂_{n \geq m} R_{W} (n, ε | p_{X Y Z})) . \end{matrix}$

It is well known that

R_{W} (p_{X Y Z})

was determined by Wyner. To describe his result, we introduce an auxiliary random variable U taking values in a finite set

U

. We assume that the joint distribution of

(U, X, Y, Z)

is

p_{U X Y Z} (u, x, y, z) = p_{U} (u) p_{X | U} (x | u) p_{Y Z | X} (y, z | x) .

The above condition is equivalent to

U \leftrightarrow X \leftrightarrow Y Z

. Define the set of probability distributions on

U

\times X

\times Y

\times Z

by

\begin{matrix} P (p_{X Y Z}) : = & {p = p_{U X Y Z} : | U | \leq | X | + 2, U \leftrightarrow X \leftrightarrow Y Z} . \end{matrix}

Set

\begin{matrix} R (p) & : = \begin{matrix} {(R_{1}, R_{2}, R_{3}) : R_{1}, R_{2}, R_{3} \geq 0, \\ \begin{matrix} R_{1} & \geq & I_{p} (X; U), R_{2} \geq H_{p} (Y | U), R_{3} \geq H_{p} (Z | U)}, \end{matrix} \end{matrix} \\ R (p_{X Y Z}) & : = ⋃_{p \in P (p_{X Y Z})} R (p) . \end{matrix}

We can show that the region

R (p_{X Y Z})

satisfies the following property.

Property 6.

(a): The region $R (p_{X Y Z})$ is a closed convex subset of $R_{+}^{3}$ .
(b): For any $p_{X Y Z}$ , and any $γ \in [0, 1]$ , we have

$\begin{matrix} min_{(R_{1}, R_{2}, R_{3}) \in R (p_{X Y Z})} (R_{1} + \bar{γ} R_{2} + γ R_{3}) = \bar{γ} H_{p} (Y) + γ H_{p} (Z) . \end{matrix}$

(41)

The minimum is attained by $(R_{1}, R_{2}, R_{3}) = (0, H_{p} (Y), H_{p} (Z))$ . This result implies that

$\begin{matrix} R (p_{X Y Z}) \subseteq \begin{matrix} [⋂_{γ \in [0, 1]} {(R_{1}, R_{2}, R_{3}) : R_{1} + \bar{γ} R_{2} + γ R_{3} \geq \bar{γ} H_{p} (Y) + γ H_{p} (Z)}] \cap R_{+}^{3} . \end{matrix} \end{matrix}$

Furthermore, the point $(0, H_{p} (Y), H_{p} (Z))$ always belongs to $R (p_{X Y Z})$ .
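To make the region R(p) concrete, the following sketch computes the corner point (I_p(X;U), H_p(Y|U), H_p(Z|U)) for two choices of the auxiliary random variable U, using natural logarithms. The source used here (X uniform on {0, 1}, with Y and Z obtained from X through independent binary symmetric channels with crossover probabilities 0.1 and 0.2) and the function names are assumptions for illustration. A constant U reproduces the point (0, H_p(Y), H_p(Z)) of Property 6, while U = X gives (H_p(X), H_p(Y|X), H_p(Z|X)).

```python
import numpy as np

def entropy(p):
    """Shannon entropy in nats of a probability array of any shape."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def corner_point(p_u, p_x_given_u, p_yz_given_x):
    """Return (I(X;U), H(Y|U), H(Z|U)) for the Markov chain U - X - (Y, Z)."""
    joint = np.einsum('u,ux,xyz->uxyz', p_u, p_x_given_u, p_yz_given_x)
    h_u = entropy(joint.sum(axis=(1, 2, 3)))
    h_x = entropy(joint.sum(axis=(0, 2, 3)))
    i_xu = h_x + h_u - entropy(joint.sum(axis=(2, 3)))
    h_y_given_u = entropy(joint.sum(axis=(1, 3))) - h_u
    h_z_given_u = entropy(joint.sum(axis=(1, 2))) - h_u
    return i_xu, h_y_given_u, h_z_given_u

def bsc(delta):
    """Transition matrix of a binary symmetric channel with crossover delta."""
    return np.array([[1 - delta, delta], [delta, 1 - delta]])

# Hypothetical source: Y and Z are conditionally independent given X.
p_x = np.array([0.5, 0.5])
p_yz_given_x = np.einsum('xy,xz->xyz', bsc(0.1), bsc(0.2))

# Constant U (no help from encoder 1) and U = X (encoder 1 describes X fully).
trivial = corner_point(np.array([1.0]), p_x[None, :], p_yz_given_x)
full = corner_point(p_x, np.eye(2), p_yz_given_x)
print(trivial)
print(full)
```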

The rate region

R_{W} (p_{X Y Z})

was determined by Wyner [2]. His result is the following.

Theorem 4

(Wyner [2]).

\begin{matrix} R_{W} (p_{X Y Z}) = R (p_{X Y Z}) . \end{matrix}

On the strong converse theorem, Csiszár and Körner [21] obtained the following.

Theorem 5

(Csiszár and Körner [21]). For each fixed ε

\in (0, 1)

, we have

\begin{matrix} R_{W} (ε | p_{X Y Z}) = R (p_{X Y Z}) . \end{matrix}

To examine a rate of convergence for the error probability of decoding to tend to one as

n \to \infty

for

(R_{1}, R_{2}, R_{3})

\notin R_{W} (p_{X Y Z})

, we define the following quantity. Set

\begin{matrix} P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, φ_{3}^{(n)}, ψ^{(n)}, ϕ^{(n)}) : = 1 - P_{e}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, φ_{3}^{(n)}, ψ^{(n)}, ϕ^{(n)}), \\ G^{(n)} (R_{1}, R_{2}, R_{3} | p_{X Y Z}) : = min_{\binom{(φ_{1}^{(n)}, φ_{2}^{(n)}, φ_{3}^{(n)},}{\binom{ψ^{(n)}, ϕ^{(n)}) :}{\binom{(1 / n) log ∥ φ_{i}^{(n)} ∥}{\leq R_{i}, i = 1, 2, 3}}}} (- \frac{1}{n}) log P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, φ_{3}^{(n)}, ψ^{(n)}, ϕ^{(n)}), \\ G (R_{1}, R_{2}, R_{3} | p_{X Y Z}) : = lim_{n \to \infty} G^{(n)} (R_{1}, R_{2}, R_{3} | p_{X Y Z}), \\ G (p_{X Y Z}) : = {(R_{1}, R_{2}, R_{3}, G) : G \geq G (R_{1}, R_{2}, R_{3} | p_{X Y Z})} . \end{matrix}

By time sharing, we have that

\begin{matrix} G^{(n + m)} (\frac{n R_{1} + m R_{1}^{'}}{n + m}, \frac{n R_{2} + m R_{2}^{'}}{n + m}, \frac{n R_{3} + m R_{3}^{'}}{n + m}| p_{X Y Z}) \\ \leq \frac{n G^{(n)} (R_{1}, R_{2}, R_{3} | p_{X Y Z}) + m G^{(m)} (R_{1}^{'}, R_{2}^{'}, R_{3}^{'} | p_{X Y Z})}{n + m} . \end{matrix}

(42)

Choosing

R = R^{'}

in (42), we obtain the following subadditivity property on

{G^{(n)} (R_{1}, R_{2}, R_{3} | p_{X Y Z})

}_{n \geq 1}

:

\begin{matrix} G^{(n + m)} (R_{1}, R_{2}, R_{3} | p_{X Y Z}) \leq \frac{n G^{(n)} (R_{1}, R_{2}, R_{3} | p_{X Y Z}) + m G^{(m)} (R_{1}, R_{2}, R_{3} | p_{X Y Z})}{n + m}, \end{matrix}

from which we have that

G (R_{1}, R_{2}, R_{3} | p_{X Y Z})

exists and satisfies the following:

\begin{matrix} G (R_{1}, R_{2}, R_{3} | p_{X Y Z}) = inf_{n \geq 1} G^{(n)} (R_{1}, R_{2}, R_{3} | p_{X Y Z}) . \end{matrix}

The exponent function

G (R_{1}, R_{2}, R_{3} | p_{X Y Z})

is a convex function of

(R_{1}, R_{2}, R_{3})

. In fact, by time sharing, we have that

\begin{matrix} G^{(n + m)} (\frac{n R_{1} + m R_{1}^{'}}{n + m}, \frac{n R_{2} + m R_{2}^{'}}{n + m}, \frac{n R_{3} + m R_{3}^{'}}{n + m}| p_{X Y Z}) \\ \leq \frac{n G^{(n)} (R_{1}, R_{2}, R_{3} | p_{X Y Z}) + m G^{(m)} (R_{1}^{'}, R_{2}^{'}, R_{3}^{'} | p_{X Y Z})}{n + m}, \end{matrix}

from which we have that for any

α \in [0, 1]

\begin{matrix} G (α R_{1} + \bar{α} R_{1}^{'}, α R_{2} + \bar{α} R_{2}^{'}, α R_{3} + \bar{α} R_{3}^{'} | p_{X Y Z}) \leq α G (R_{1}, R_{2}, R_{3} | p_{X Y Z}) + \bar{α} G (R_{1}^{'}, R_{2}^{'}, R_{3}^{'} | p_{X Y Z}) . \end{matrix}

The region

G (p_{X Y Z})

is also a closed convex set. Our main aim is to find an explicit characterization of

G (p_{X Y Z})

. In this paper, we derive an explicit outer bound of

G

(p_{X Y Z})

whose section by the plane

G = 0

coincides with

R_{W} (p_{X Y Z})

. We first explain that the region

R (p_{X Y Z})

has another expression using the supporting hyperplane. We define two sets of probability distributions on

U

\times X

\times Y

\times Z

by

\begin{matrix} P_{sh} (p_{X Y Z}) : = {p = p_{U X Y Z} : | U | \leq | X |, U \leftrightarrow X \leftrightarrow Y Z}, \\ Q (p_{Y Z | X}) : = {q = q_{U X Y Z} : | U | \leq | X |, p_{Y Z | X} = q_{Y Z | X}, U \leftrightarrow X \leftrightarrow Y Z} . \end{matrix}

For

(μ, γ) \in {[0, 1]}^{2}

, set

\begin{matrix} R^{(μ, γ)} (p_{X Y Z}) : = min_{p \in P_{sh} (p_{X Y Z})} \{μ I_{p} (X; U) + \bar{μ} (\bar{γ} H_{p} (Y | U) + γ H_{p} (Z | U))\} . \end{matrix}

Furthermore, define

\begin{matrix} R_{sh} (p_{X Y Z}) = ⋂_{(μ, γ) \in {[0, 1]}^{2}} {(R_{1}, R_{2}, R_{3}) : μ R_{1} + \bar{μ} (\bar{γ} R_{2} + γ R_{3}) \geq R^{(μ, γ)} (p_{X Y Z})} . \end{matrix}

Then, we have the following property.

Property 7.

(a): The bound $| U | \leq | X |$ is sufficient to describe $R^{(μ, γ)} (p_{X Y Z})$ .
(b): For every $(μ, γ) \in {[0, 1]}^{2}$ , we have

$\begin{matrix} min_{(R_{1}, R_{2}, R_{3}) \in R (p_{X Y Z})} {μ R_{1} + \bar{μ} (\bar{γ} R_{2} + γ R_{3})} = R^{(μ, γ)} (p_{X Y Z}) . \end{matrix}$
(c): For any $p_{X Y Z},$ we have

$R_{sh} (p_{X Y Z}) = R (p_{X Y Z}) .$

(43)

For

(μ, γ, α) \in {[0, 1]}^{3}

, and for

q = q_{U X Y Z} \in Q (p_{Y Z | X})

, define

\begin{matrix} ω_{q | p_{X}}^{(μ, γ, α)} (x, y, z | u) : = \bar{α} log \frac{q_{X} (x)}{p_{X} (x)} + α [μ log \frac{q_{X | U} (x | u)}{p_{X} (x)} + \bar{μ} (\bar{γ} log \frac{1}{q_{Y | U} (y | u)} + γ log \frac{1}{q_{Z | U} (z | u)})], \\ f_{q | p_{X}}^{(μ, γ, α)} (x, y, z | u) : = exp \{- ω_{q | p_{X}}^{(μ, γ, α)} (x, y, z | u)\}, \\ Ω^{(μ, γ, α)} (q | p_{X}) : = - log E_{q} [f_{q | p_{X}}^{(μ, γ, α)} (X, Y, Z | U)], Ω^{(μ, γ, α)} (p_{X Y Z}) : = min_{\binom{}{q \in Q (p_{Y Z | X})}} Ω^{(μ, γ, α)} (q | p_{X}), \\ F^{(μ, γ, α)} (μ R_{1} + \bar{μ} (\bar{γ} R_{2} + γ R_{3}) | p_{X Y Z}) : = \frac{Ω^{(μ, γ, α)} (p_{X Y Z}) - α [μ R_{1} + \bar{μ} (\bar{γ} R_{2} + γ R_{3})]}{2 + α \bar{μ}}, \\ F (R_{1}, R_{2}, R_{3} | p_{X Y Z}) : = sup_{(μ, γ, α) \in {[0, 1]}^{3}} F^{(μ, γ, α)} (μ R_{1} + \bar{μ} (\bar{γ} R_{2} + γ R_{3}) | p_{X Y Z}) . \end{matrix}

We next define a function serving as a lower bound of

F (R_{1}, R_{2}, R_{3} | p_{X Y Z})

. For each

p = p_{U X Y Z} \in P_{sh} (p_{X Y Z})

, define

\begin{matrix} {\tilde{ω}}_{p}^{(μ, γ)} (x, y, z | u) : = μ log \frac{p_{X | U} (x | u)}{p_{X} (x)} + \bar{μ} (\bar{γ} log \frac{1}{p_{Y | U} (y | u)} + γ log \frac{1}{p_{Z | U} (z | u)}), \\ {\tilde{Ω}}^{(μ, γ, λ)} (p) : = - log E_{p} [exp \{- λ {\tilde{ω}}_{p}^{(μ, γ)} (X, Y, Z | U)\}], {\tilde{Ω}}^{(μ, γ, λ)} (p_{X Y Z}) : = min_{\binom{}{p \in P_{sh} (p_{X Y Z})}} {\tilde{Ω}}^{(μ, γ, λ)} (p) . \end{matrix}

Furthermore, set

\begin{matrix} {\underset{̲}{F}}^{(μ, γ, λ)} (μ R_{1} + \bar{μ} (\bar{γ} R_{2} + γ R_{3}) | p_{X Y Z}) : = \frac{{\tilde{Ω}}^{(μ, γ, λ)} (p_{X Y Z}) - λ [μ R_{1} + \bar{μ} (\bar{γ} R_{2} + γ R_{3})]}{2 + λ (5 - μ)}, \\ \underset{̲}{F} (R_{1}, R_{2}, R_{3} | p_{X Y Z}) : = sup_{\binom{(μ, γ) \in {[0, 1]}^{2},}{λ \geq 0}} {\underset{̲}{F}}^{(μ, γ, λ)} (μ R_{1} + \bar{μ} (\bar{γ} R_{2} + γ R_{3}) | p_{X Y Z}) . \end{matrix}

We can show that the above functions and sets satisfy the following property.

Property 8.

(a): The cardinality bound $| U | \leq | X |$ in $Q (p_{Y Z | X})$ is sufficient to describe the quantity $Ω^{(μ, γ, α)} (p_{X Y Z})$ . Furthermore, the cardinality bound $| U | \leq | X |$ in $P_{sh} (p_{X Y Z})$ is sufficient to describe the quantity ${\tilde{Ω}}^{(μ, γ, λ)} (p_{X Y Z})$ .
(b): For any $R_{1}, R_{2}, R_{3} \geq 0$ , we have

$\begin{matrix} F (R_{1}, R_{2}, R_{3} | p_{X Y Z}) \geq \underset{̲}{F} (R_{1}, R_{2}, R_{3} | p_{X Y Z}) . \end{matrix}$
(c): For any $p = p_{U X Y Z} \in P_{sh} (p_{X Y Z})$ and any $(μ, γ, λ) \in {[0, 1]}^{3}$ , we have

$0 \leq {\tilde{Ω}}^{(μ, γ, λ)} (p) \leq μ log | X | + \bar{μ} log ({| Y |}^{\bar{γ}} {| Z |}^{γ}) .$

(44)
(d): Fix any $p = p_{U X Y Z} \in P_{sh} (p_{X Y Z})$ and $(μ, γ) \in {[0, 1]}^{2}$ . We define a probability distribution $p^{(λ)} = p_{U X Y Z}^{(λ)}$ by

$\begin{matrix} p^{(λ)} (u, x, y, z) : = \frac{p (u, x, y, z) exp \{- λ {\tilde{ω}}_{p}^{(μ, γ)} (x, y, z | u)\}}{E_{p} [exp \{- λ {\tilde{ω}}_{p}^{(μ, γ)} (X, Y, Z | U)\}]} . \end{matrix}$

Then, for $λ \in [0, 1 / 2]$ , ${\tilde{Ω}}^{(μ, γ, λ)} (p)$ is twice differentiable. Furthermore, for $λ \in [0, 1 / 2]$ , we have

$\begin{matrix} \frac{d}{d λ} {\tilde{Ω}}^{(μ, γ, λ)} (p) = E_{p^{(λ)}} [{\tilde{ω}}_{p}^{(μ, γ)} (X, Y, Z | U)], \frac{d^{2}}{d λ^{2}} {\tilde{Ω}}^{(μ, γ, λ)} (p) = - {Var}_{p^{(λ)}} [{\tilde{ω}}_{p}^{(μ, γ)} (X, Y, Z | U)] . \end{matrix}$

The second equality implies that ${\tilde{Ω}}^{(μ, γ, λ)} (p)$ is a concave function of $λ \in [0, 1 / 2]$ .
(e): For $(μ, γ, λ) \in {[0, 1]}^{2} \times [0, 1 / 2]$ , define

$\begin{matrix} ρ^{(μ, γ, λ)} (p_{X Y Z}) & : = max_{\binom{(ν, p) \in [0, λ] \times P_{sh} (p_{X Y Z}) :}{{\tilde{Ω}}^{(μ, γ, λ)} (p) = {\tilde{Ω}}^{(μ, γ, λ)} (p_{X Y Z})}} {Var}_{p^{(ν)}} [{\tilde{ω}}_{p}^{(μ, γ)} (X, Y, Z | U)], \end{matrix}$

and set

$\begin{matrix} ρ = ρ (p_{X Y Z}) : = max_{(μ, γ, λ) \in {[0, 1]}^{2} \times [0, 1 / 2]} ρ^{(μ, γ, λ)} (p_{X Y Z}) . \end{matrix}$

Then, we have $ρ (p_{X Y Z}) < \infty$ . Furthermore, for any $(μ, γ, λ) \in {[0, 1]}^{2} \times [0, 1 / 2]$ , we have

${\tilde{Ω}}^{(μ, γ, λ)} (p_{X Y Z}) \geq λ R^{(μ, γ)} (p_{X Y Z}) - \frac{λ^{2}}{2} ρ (p_{X Y Z}) .$
(f): For every $τ \in (0, (1 / 2) ρ (p_{X Y Z}))$ , the condition $(R_{1} + τ, R_{2} + τ, R_{3} + τ) \notin R (p_{X Y Z})$ implies

$\begin{matrix} F (R_{1}, R_{2}, R_{3} | p_{X Y Z}) > \frac{ρ (p_{X Y Z})}{4} \cdot g^{2} (\frac{τ}{ρ (p_{X Y Z})}) > 0 . \end{matrix}$
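The derivative formulas in Property 8 part (d) are the standard identities for a log-moment generating function, and they can be checked by finite differences. In the sketch below, a generic finite distribution `p` and an array of values `w` stand in for p_{UXYZ} and for the values of ω̃_p^{(μ, γ)}(x, y, z | u); both are randomly generated assumptions, not quantities computed from an actual source.

```python
import numpy as np

def omega_tilde(p, w, lam):
    """-log E_p[exp(-lam * W)] for a finite-valued random variable W."""
    return -np.log(np.sum(p * np.exp(-lam * w)))

def tilted(p, w, lam):
    """Tilted distribution p^(lam), proportional to p * exp(-lam * w)."""
    q = p * np.exp(-lam * w)
    return q / q.sum()

# Hypothetical stand-ins for p_{UXYZ} and the values of omega~_p.
rng = np.random.default_rng(1)
p = rng.random(12)
p /= p.sum()
w = rng.normal(size=12)

lam, h = 0.3, 1e-3
q = tilted(p, w, lam)
mean_q = np.sum(q * w)
var_q = np.sum(q * w**2) - mean_q**2

# Central finite differences for the first and second derivatives in lam.
d1 = (omega_tilde(p, w, lam + h) - omega_tilde(p, w, lam - h)) / (2 * h)
d2 = (omega_tilde(p, w, lam + h) - 2 * omega_tilde(p, w, lam)
      + omega_tilde(p, w, lam - h)) / h**2

print(d1, mean_q)     # first derivative vs. tilted mean
print(d2, -var_q)     # second derivative vs. minus tilted variance
```

Because the second derivative is minus a variance, it is nonpositive, which is the concavity in λ asserted in part (d).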

Since proofs of the results stated in Property 8 are quite parallel with those of the results stated in Property 4, we omit them. Our main result is the following.

Theorem 6.

For any

R_{1}, R_{2},

R_{3} \geq 0

, any

p_{X Y Z}

, and for any

(φ_{1}^{(n)},

φ_{2}^{(n)},

φ_{3}^{(n)},

ψ^{(n)}, ϕ^{(n)})

satisfying

(1 / n) log | | φ_{i}^{(n)} | |

\leq R_{i}, i = 1, 2, 3,

we have

\begin{matrix} P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, φ_{3}^{(n)}, ψ^{(n)}, ϕ^{(n)}) \leq 7 exp \{- n F (R_{1}, R_{2}, R_{3} | p_{X Y Z})\} . \end{matrix}

It follows from Theorem 6 and Property 8 part (f) that, if

(R_{1}, R_{2}, R_{3})

is outside the rate region, then the error probability of decoding goes to one exponentially and its exponent is not below

F (R_{1}, R_{2}, R_{3} | p_{X Y Z})

. It immediately follows from Theorem 6 that we have the following corollary.

Corollary 4.

\begin{matrix} G (R_{1}, R_{2}, R_{3} | p_{X Y Z}) \geq F (R_{1}, R_{2}, R_{3} | p_{X Y Z}), \\ G (p_{X Y Z}) \subseteq \bar{G} (p_{X Y Z}) = \{(R_{1}, R_{2}, R_{3}, G) : G \geq F (R_{1}, R_{2}, R_{3} | p_{X Y Z})\} . \end{matrix}

The proof of Theorem 6 closely parallels that of Theorem 3, so we omit the details. From Theorem 6 and Property 8 part (f), we can obtain an explicit outer bound of

R_{W} (ε | p_{X Y Z})

with an asymptotically vanishing deviation from

R_{W} (p_{X Y Z})

= R (p_{X Y Z})

. The strong converse theorem established by Csiszár and Körner [21] immediately follows from this corollary. To describe this outer bound, for

κ > 0

, we set

\begin{matrix} R (p_{X Y Z}) - κ (1, 1, 1) : = {(R_{1} - κ, R_{2} - κ, R_{3} - κ) : (R_{1}, R_{2}, R_{3}) \in R (p_{X Y Z})}, \end{matrix}

which serves as an outer bound of

R (p_{X Y Z})

. For each fixed

ε \in (0, 1)

, we define

{\tilde{κ}}_{n}

= {\tilde{κ}}_{n} (ε, ρ (p_{X Y Z}))

by

\begin{matrix} {\tilde{κ}}_{n} & : = & ρ (p_{X Y Z}) ϑ (\sqrt{\frac{4}{n ρ (p_{X Y Z})} log (\frac{7}{1 - ε})}) \\ \overset{(a)}{=} & 2 \sqrt{\frac{ρ (p_{X Y Z})}{n} log (\frac{7}{1 - ε})} + \frac{5}{n} log (\frac{7}{1 - ε}) . \end{matrix}

(45)

Step (a) follows from

ϑ (a) = a + (5 / 4) a^{2}

. Since

{\tilde{κ}}_{n} \to 0

as

n \to \infty

, we have the smallest positive integer

n_{1} = n_{1} (ε, ρ (p_{X Y Z}))

such that

{\tilde{κ}}_{n} \leq (1 / 2) ρ (p_{X Y Z})

for

n \geq n_{1}

. From Theorem 6 and Property 8 part e, we have the following corollary.

Corollary 5.

For each fixed

ε \in (0, 1)

, we choose the above positive integer

n_{1} =

n_{1} (ε, ρ (p_{X Y Z}))

. Then, for any

n \geq n_{1}

, we have

\begin{matrix} R_{W} (ε | p_{X Y Z}) \subseteq R (p_{X Y Z}) - {\tilde{κ}}_{n} (1, 1, 1) . \end{matrix}

The above result together with

\begin{matrix} R_{W} (ε | p_{X Y Z}) & = cl (⋃_{m \geq 1} ⋂_{n \geq m} R_{W} (n, ε | p_{X Y Z})) \end{matrix}

yields that for each fixed

ε \in (0, 1)

, we have

\begin{matrix} R_{W} (ε | p_{X Y Z}) = R_{W} (p_{X Y Z}) = R (p_{X Y Z}) . \end{matrix}

This recovers the strong converse theorem proved by Csiszár and Körner [21].

The proof of this corollary closely parallels that of Corollary 2, so we omit the details.
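As a numerical illustration of (45), the following minimal sketch evaluates the deviation term in both of its forms and searches for the smallest block length n_1 at which it drops to half of ρ. It assumes natural logarithms, treats ρ (p_{X Y Z}) as a given positive number, and uses hypothetical function names and illustrative values of ε and ρ that do not come from the paper.

```python
import math

def theta(a):
    """The function theta(a) = a + (5/4) a^2 used in step (a) of (45)."""
    return a + 1.25 * a * a

def kappa_tilde(n, eps, rho):
    """First form in (45): rho * theta(sqrt(4 / (n rho) * log(7 / (1 - eps))))."""
    big_l = math.log(7.0 / (1.0 - eps))  # natural logarithm (assumption)
    return rho * theta(math.sqrt(4.0 * big_l / (n * rho)))

def kappa_tilde_closed(n, eps, rho):
    """Second form in (45): 2 sqrt(rho L / n) + 5 L / n with L = log(7 / (1 - eps))."""
    big_l = math.log(7.0 / (1.0 - eps))
    return 2.0 * math.sqrt(rho * big_l / n) + 5.0 * big_l / n

def smallest_n1(eps, rho):
    """Smallest n with kappa_tilde <= rho / 2; kappa_tilde is decreasing in n."""
    n = 1
    while kappa_tilde(n, eps, rho) > 0.5 * rho:
        n += 1
    return n

eps, rho = 0.1, 1.0  # illustrative values only, not taken from the paper
n1 = smallest_n1(eps, rho)
print(n1, kappa_tilde(n1, eps, rho))
```

Because the deviation term is monotonically decreasing in n, a linear search is enough here; the two forms agree up to floating-point rounding.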

6. Conclusions

For the AKW system, the one helper source coding system posed by Ahlswede, Körner [1] and Wyner [2], we have derived an explicit lower bound of the optimal exponent function

G (R_{1}, R_{2} | p_{X Y})

on the correct probability of decoding for

(R_{1}, R_{2}) \notin R_{AKW} (p_{X Y})

. We have described this result in Theorem 3. Furthermore, for the source coding system posed and investigated by Wyner [2], we have obtained an explicit lower bound of the optimal exponent function

G (R_{1}, R_{2}, R_{3} | p_{X Y Z})

on the correct probability of decoding for

(R_{1}, R_{2}, R_{3}) \notin R_{W} (p_{X Y Z})

. We have described this result in Theorem 6. The determination problems of

G (R_{1}, R_{2} | p_{X Y})

and

G (R_{1}, R_{2}, R_{3} | p_{X Y Z})

still remain to be resolved. These problems are left for future work.

Funding

This research was funded by JSPS Kiban (B) 18H01438.

Acknowledgments

The author is very grateful to Shun Watanabe and Shigeaki Kuzuoka for their helpful comments.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A. Properties of the Rate Regions

In this appendix, we prove Property 1. Property 1 part a follows easily from the definitions of the rate regions, so we omit its proof. In the following argument, we prove part b.

Proof of Property 1 part b:

We set

\begin{matrix} {\underset{̲}{R}}_{AKW} (m, ε | p_{X Y}) & = ⋂_{n \geq m} R_{AKW} (n, ε | p_{X Y}) . \end{matrix}

By the definitions of

{\underset{̲}{R}}_{AKW} (m, ε | p_{X Y})

and

R_{AKW} (ε | p_{X Y})

, we have that

{\underset{̲}{R}}_{AKW} (m, ε | p_{X Y})

\subseteq R_{AKW} (ε | p_{X Y})

for

m \geq 1

. Hence, we have that

⋃_{m \geq 1} {\underset{̲}{R}}_{AKW} (m, ε | p_{X Y}) \subseteq R_{AKW} (ε | p_{X Y}) .

(A1)

We next assume that

(R_{1}, R_{2}) \in R_{AKW} (ε | p_{X Y})

. Set

\begin{matrix} R_{AKW}^{(δ)} (ε | p_{X Y}) : = {(R_{1} + δ, R_{2} + δ) : (R_{1}, R_{2}) \in R_{AKW} (ε | p_{X Y})} . \end{matrix}

Then, by the definitions of

R_{AKW} (n, ε

| p_{X Y})

and

R_{AKW} (

ε | p_{X Y})

, we have that, for any

δ > 0

, there exists

n_{0} (ε, δ)

such that for any

n \geq n_{0} (ε, δ)

,

(R_{1} + δ, R_{2} + δ) \in R_{AKW} (n, ε | p_{X Y}),

which implies that

\begin{matrix} R_{AKW}^{(δ)} (ε | p_{X Y}) & \subseteq & ⋂_{n \geq n_{0} (ε, δ)} R_{AKW} (n, ε | p_{X Y}) = {\underset{̲}{R}}_{AKW} (n_{0} (ε, δ), ε | p_{X Y}) \\ \subseteq & cl (⋃_{m \geq 1} {\underset{̲}{R}}_{AKW} (m, ε | p_{X Y})) . \end{matrix}

(A2)

Here, we assume that there exists a pair

(R_{1}, R_{2})

belonging to

R_{AKW} (ε | p_{X Y})

such that

(R_{1}, R_{2}) \notin cl (⋃_{m \geq 1} {\underset{̲}{R}}_{AKW} (m, ε | p_{X Y})) .

(A3)

Since the set on the right-hand side of (A3) is a closed set, we have

(R_{1} + δ, R_{2} + δ) \notin cl (⋃_{m \geq 1} {\underset{̲}{R}}_{AKW} (m, ε | p_{X Y}))

(A4)

for some small

δ > 0

. On the other hand, we have

(R_{1} + δ, R_{2} + δ)

\in R_{AKW}^{(δ)} (ε | p_{X Y})

, which contradicts (A2). Thus, we have

\begin{matrix} ⋃_{m \geq 1} {\underset{̲}{R}}_{AKW} (m, ε | p_{X Y}) \subseteq R_{AKW} (ε | p_{X Y}) \subseteq cl (⋃_{m \geq 1} {\underset{̲}{R}}_{AKW} (m, ε | p_{X Y})) . \end{matrix}

(A5)

Note here that

R_{AKW}

(ε

| p_{X Y})

is a closed set. Then, from (A5), we conclude that

\begin{matrix} R_{AKW} (ε | p_{X Y}) & = cl (⋃_{m \geq 1} {\underset{̲}{R}}_{AKW} (m, ε | p_{X Y})) = cl (⋃_{m \geq 1} ⋂_{n \geq m} R_{AKW} (n, ε | p_{X Y})), \end{matrix}

completing the proof. □

Appendix B. Cardinality Bound on Auxiliary Random Variables

We first prove the following lemma.

Lemma A1.

\begin{matrix} {\underset{̲}{R}}^{(μ)} (p_{X Y}) : = min_{p \in P (p_{X Y})} \{μ I_{p} (X; U) + \bar{μ} H_{p} (Y | U)\} \\ = R^{(μ)} (p_{X Y}) : = min_{p \in P_{sh} (p_{X Y})} \{μ I_{p} (X; U) + \bar{μ} H_{p} (Y | U)\} . \end{matrix}

Proof.

We bound the cardinality

| U |

of U to show that the bound

| U | \leq | X |

is sufficient to describe

{\underset{̲}{R}}^{(μ)} (p_{X Y})

. Observe that

\begin{matrix} p_{X} (x) = \sum_{u \in U} p_{U} (u) p_{X | U} (x | u), \end{matrix}

(A6)

\begin{matrix} μ I_{p} (X; U) + \bar{μ} H_{p} (Y | U) = \sum_{u \in U} p_{U} (u) π (p_{X | U} (\cdot | u)), \end{matrix}

(A7)

where

\begin{matrix} π (p_{X | U} (\cdot | u)) : = \sum_{(x, y) \in X \times Y} p_{X | U} (x | u) p_{Y | X} (y | x) log \{\frac{p_{X | U}^{μ} (x | u)}{p_{X}^{μ} (x)} {[\sum_{\tilde{x} \in X} p_{Y | X} (y | \tilde{x}) p_{X | U} (\tilde{x} | u)]}^{- \bar{μ}}\} . \end{matrix}

For each

u \in U

,

π (p_{X | U} (\cdot | u))

is a continuous function of

p_{X | U} (\cdot | u)

. Then, by the support lemma,

| U | \leq | X | - 1 + 1 = | X |

is sufficient to express

| X | - 1

values of (A6) and one value of (A7). □

Next, we prove the following lemma.

Lemma A2.

The cardinality bound

| U | \leq | X |

in

Q (p_{Y | X})

is sufficient to describe the quantity

Ω^{(μ, α)} (p_{X Y})

. The cardinality bound

| U | \leq | X |

in

P_{sh} (p_{X Y})

is sufficient to describe the quantity

{\tilde{Ω}}^{(μ, λ)} (p_{X Y})

.

Proof.

We first bound the cardinality

| U |

of U in

Q (p_{Y | X})

to show that the bound

| U | \leq | X |

is sufficient to describe

Ω^{(μ, α)}

(p_{X Y})

. Observe that

\begin{matrix} q_{X} (x) = \sum_{u \in U} q_{U} (u) q_{X | U} (x | u), \end{matrix}

(A8)

\begin{matrix} exp \{- Ω^{(μ, α)} (q | p_{X})\} = \sum_{u \in U} q_{U} (u) Π^{(μ, α)} (q_{X}, q_{X Y | U} (\cdot, \cdot | u)), \end{matrix}

(A9)

where

\begin{matrix} Π^{(μ, α)} (q_{X}, q_{X Y | U} (\cdot, \cdot | u)) : = \sum_{\binom{(x, y)}{\in X \times Y}} q_{X Y | U} (x, y | u) exp \{- ω_{q | p_{X}}^{(μ, α)} (x, y | u)\} . \end{matrix}

The value of

q_{X}

included in

Π^{(μ, α)} (q_{X}, q_{X Y | U} (\cdot, \cdot | u))

must be preserved under the reduction of

U

. For each

u \in U

,

Π^{(μ, α)} (q_{X}, q_{X Y | U} (\cdot, \cdot | u))

is a continuous function of

q_{X Y | U} (

\cdot, \cdot | u)

. Then, by the support lemma,

| U | \leq | X | - 1 + 1 = | X |

is sufficient to express

| X | - 1

values of (A8) and one value of (A9). We next bound the cardinality

| U |

of U in

P_{sh} (p_{X Y})

to show that the bound

| U | \leq | X |

is sufficient to describe

{\tilde{Ω}}^{(μ, λ)}

(p_{X Y})

. Observe that

\begin{matrix} p_{X} (x) = \sum_{u \in U} p_{U} (u) p_{X | U} (x | u), \end{matrix}

(A10)

\begin{matrix} exp \{- {\tilde{Ω}}^{(μ, λ)} (p)\} = \sum_{u \in U} p_{U} (u) {\tilde{Π}}^{(μ, λ)} (p_{X}, p_{X Y | U} (\cdot, \cdot | u)), \end{matrix}

(A11)

where

\begin{matrix} {\tilde{Π}}^{(μ, λ)} (p_{X}, p_{X Y | U} (\cdot, \cdot | u)) : = \sum_{\binom{(x, y)}{\in X \times Y}} p_{X Y | U} (x, y | u) exp \{- λ {\tilde{ω}}_{p}^{(μ)} (x, y | u)\} . \end{matrix}

The value of

p_{X}

included in

{\tilde{Π}}^{(μ, λ)} (p_{X}, p_{X Y | U} (\cdot, \cdot | u))

must be preserved under the reduction of

U

. For each

u \in U

,

{\tilde{Π}}^{(μ, λ)} (p_{X}, p_{X Y | U} (\cdot, \cdot | u))

is a continuous function of

p_{X Y | U} (\cdot, \cdot | u)

. Then, by the support lemma,

| U | \leq | X | - 1 + 1 = | X |

is sufficient to express

| X | - 1

values of (A10) and one value of (A11). □

Appendix C. Supporting Hyperplane Expressions of R (p_{X Y})

In this appendix, we prove Property 3 parts b and c. We first prove part b.

Proof of Property 3 part b:

For any

μ \geq 0,

we have the following chain of equalities:

\begin{matrix} min_{(R_{1}, R_{2}) \in R (p_{X Y})} \{μ R_{1} + \bar{μ} R_{2}\} \\ = min_{p \in P (p_{X Y})} \{μ I_{p} (X; U) + \bar{μ} H_{p} (Y | U)\} \overset{(a)}{=} min_{p \in P_{sh} (p_{X Y})} \{μ I_{p} (X; U) + \bar{μ} H_{p} (Y | U)\} = R^{(μ)} (p_{X Y}) . \end{matrix}

Step (a) follows from Lemma A1 stating that the cardinality bound

| U | \leq | X | + 1

in

P (p_{X Y})

can be reduced to that

| U | \leq | X |

in

P_{sh} (p_{X Y})

. □

We next prove part c. We first prepare a lemma useful to prove this property. From the convex property of the region

R (p_{X Y})

, we have the following lemma.

Lemma A3.

Suppose that

({\hat{R}}_{1}, {\hat{R}}_{2})

does not belong to

R (p_{X Y})

. Then, there exist

ϵ > 0

and

μ_{0} \geq 0

such that for any

(R_{1}, R_{2}) \in R (p_{X Y})

we have

\begin{matrix} μ_{0} (R_{1} - {\hat{R}}_{1}) + \bar{μ_{0}} (R_{2} - {\hat{R}}_{2}) - ϵ \geq 0 . \end{matrix}

The proof of this lemma is omitted. Lemma A3 is equivalent to the fact that if the region

R (p_{X Y})

is a convex set, then for any point

({\hat{R}}_{1}, {\hat{R}}_{2})

outside the region

R (p_{X Y})

, there exists a line which separates the point

({\hat{R}}_{1}, {\hat{R}}_{2})

from the region

R (p_{X Y})

.

Proof of Property 3 part c:

We first prove

R_{sh} (p_{X Y})

\subseteq R (p_{X Y})

. We assume that

({\hat{R}}_{1}, {\hat{R}}_{2}) \notin R (p_{X Y})

. Then, by Lemma A3, there exist

ϵ > 0

and

μ_{0} \geq 0

such that for any

(R_{1}, R_{2}) \in R (p_{X Y})

, we have

\begin{matrix} μ_{0} {\hat{R}}_{1} + \bar{μ_{0}} {\hat{R}}_{2} \leq μ_{0} R_{1} + \bar{μ_{0}} R_{2} - ϵ . \end{matrix}

Then, we have

\begin{matrix} μ_{0} {\hat{R}}_{1} + \bar{μ_{0}} {\hat{R}}_{2} \leq min_{(R_{1}, R_{2}) \in R (p_{X Y})} \{μ_{0} R_{1} + \bar{μ_{0}} R_{2}\} - ϵ \overset{(a)}{=} min_{p \in P (p_{X Y})} \{μ_{0} I_{p} (U; X) + \bar{μ_{0}} H_{p} (Y | U)\} - ϵ \\ \leq min_{p \in P_{sh} (p_{X Y})} \{μ_{0} I_{p} (U; X) + \bar{μ_{0}} H_{p} (Y | U)\} - ϵ = R^{(μ_{0})} (p_{X Y}) - ϵ . \end{matrix}

(A12)

Step (a) follows from the definition of

R (p_{X Y})

. The inequality (A12) implies that

({\hat{R}}_{1}, {\hat{R}}_{2})

\notin R_{sh} (p_{X Y})

. Thus

R_{sh} (p_{X Y})

\subseteq R (p_{X Y})

is concluded. □

Appendix D. Proof of Property 4 Part b

In this appendix, we prove Property 4 part b. Fix

q = q_{U X Y} \in Q (p_{Y | X})

and

p = p_{U X Y} = (p_{U | X}, p_{X Y}) \in P_{sh} (

p_{X Y})

arbitrarily. For

β \geq 0

,

p \in P_{sh} (p_{X Y})

, and

q_{Y | U}

induced by q, define

\begin{matrix} {\hat{ω}}_{p, q_{Y | U}}^{(μ)} (x, y | u) : = μ log \frac{p_{X | U} (x | u)}{p_{X} (x)} + \bar{μ} log \frac{1}{q_{Y | U} (y | u)}, \\ {\hat{Ω}}^{(μ, β)} (p, q_{Y | U}) : = - log E_{p} [exp \{- β {\hat{ω}}_{p, q_{Y | U}}^{(μ)} (X, Y | U)\}] . \end{matrix}

Then, we have the following two lemmas.

Lemma A4.

For any μ

\in [0, 1]

,

α \in [0, 1)

, and any

q = q_{U X Y} \in Q (p_{Y | X})

, there exists

p = p_{U X Y} \in P_{sh} (p_{X Y})

such that

Ω^{(μ, α)} (q | p_{X}) \geq \bar{α} {\hat{Ω}}^{(μ, \frac{α}{\bar{α}})} (p, q_{Y | U}) .

(A13)

Lemma A5.

For any

μ, α

satisfying μ

\in [0, 1]

,

α \in [0, 1 / 2)

, any

p = p_{U X Y} \in P_{sh} (p_{X Y})

, and any stochastic matrix

q_{Y | U}

induced by

q_{U X Y} \in Q (p_{Y | X})

, we have

\begin{matrix} {\hat{Ω}}^{(μ, \frac{α}{\bar{α}})} (p, q_{Y | U}) \geq \frac{1 - 2 α}{\bar{α}} {\tilde{Ω}}^{(μ, \frac{α}{1 - 2 α})} (p) . \end{matrix}

(A14)

From Lemmas A4 and A5, we have the following corollary.

Corollary A1.

For any

μ, α

satisfying μ

\in [0, 1]

,

α \in [0, 1 / 2)

, and any

q = q_{U X Y} \in Q (p_{Y | X})

, there exists

p = p_{U X Y} \in P_{sh} (p_{X Y})

such that

\begin{matrix} Ω^{(μ, α)} (q | p_{X}) \geq (1 - 2 α) {\tilde{Ω}}^{(μ, \frac{α}{1 - 2 α})} (p) . \end{matrix}

(A15)

From (A15), we have that for any μ

\in [0, 1]

,

α \in [0, 1 / 2)

, we have

\begin{matrix} Ω^{(μ, α)} (p_{X Y}) \geq (1 - 2 α) {\tilde{Ω}}^{(μ, \frac{α}{1 - 2 α})} (p_{X Y}) . \end{matrix}

(A16)

Proof of Lemma A4:

We fix

(μ, α) \in [0, 1] \times [0, 1)

arbitrarily. For each

q = q_{U X Y} \in Q (p_{Y | X})

, we choose

p = p_{U X Y} \in P_{sh} (p_{X Y})

so that

p_{U | X} = q_{U | X}

. Then, we have the following:

\begin{matrix} exp \{- Ω^{(μ, α)} (q | p_{X})\} = E_{q} [\frac{p_{X}^{\bar{α}} (X)}{q_{X}^{\bar{α}} (X)} \{\frac{p_{X}^{μ α} (X) q_{Y | U}^{\bar{μ} α} (Y | U)}{q_{X | U}^{μ α} (X | U)}\}] \\ = E_{q} [{\{\frac{p_{U X} (U, X)}{q_{U X} (U, X)}\}}^{\bar{α}} {\{\frac{p_{X}^{μ \frac{α}{\bar{α}}} (X) q_{Y | U}^{\bar{μ} \frac{α}{\bar{α}}} (Y | U)}{p_{X | U}^{μ \frac{α}{\bar{α}}} (X | U)}\}}^{\bar{α}} {\{\frac{p_{X | U}^{μ} (X | U)}{q_{X | U}^{μ} (X | U)}\}}^{α}] \\ \overset{(a)}{\leq} {(E_{q} [\frac{p_{U X} (U, X)}{q_{U X} (U, X)} \frac{p_{X}^{μ \frac{α}{\bar{α}}} (X) q_{Y | U}^{\bar{μ} \frac{α}{\bar{α}}} (Y | U)}{p_{X | U}^{μ \frac{α}{\bar{α}}} (X | U)}])}^{\bar{α}} {(E_{q} [\frac{p_{X | U}^{μ} (X | U)}{q_{X | U}^{μ} (X | U)}])}^{α} \\ = exp \{- \bar{α} {\hat{Ω}}^{(μ, \frac{α}{\bar{α}})} (p, q_{Y | U})\} A^{α}, \end{matrix}

(A17)

where we set

A : = E_{q} [\frac{p_{X | U}^{μ} (X | U)}{q_{X | U}^{μ} (X | U)}] .

Step (a) follows from Hölder’s inequality. From (A17), we can see that it suffices to show

A \leq 1

to complete the proof. When

μ = 1

, we have

A = 1

. When

μ \in [0, 1)

, we apply Hölder’s inequality to A to obtain

\begin{matrix} A & = E_{q} [\frac{p_{X | U}^{μ} (X | U)}{q_{X | U}^{μ} (X | U)}] \leq {(E_{q} [\frac{p_{X | U} (X | U)}{q_{X | U} (X | U)}])}^{μ} = 1 . \end{matrix}

Hence, we have (A13) in Lemma A4. □
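The bound A \leq 1 used above can be checked numerically. The following minimal sketch draws random joint distributions q_{U X} with full support, builds p_{U X} from a random p_{X} together with p_{U | X} = q_{U | X} as in the proof, and evaluates A. The alphabet sizes, the random sampling, and all function names are illustrative assumptions, not part of the paper.

```python
import random

def random_dist(k, rng):
    """Random probability vector of length k with strictly positive entries."""
    w = [rng.random() + 1e-3 for _ in range(k)]
    s = sum(w)
    return [v / s for v in w]

def quantity_a(q_ux, p_x, mu):
    """A = E_q[(p_{X|U}(X|U) / q_{X|U}(X|U))^mu] when p_{U|X} = q_{U|X}."""
    nu, nx = len(q_ux), len(q_ux[0])
    q_x = [sum(q_ux[u][x] for u in range(nu)) for x in range(nx)]
    # p_{UX}(u, x) = p_X(x) q_{U|X}(u | x)
    p_ux = [[p_x[x] * q_ux[u][x] / q_x[x] for x in range(nx)] for u in range(nu)]
    total = 0.0
    for u in range(nu):
        q_u = sum(q_ux[u])
        p_u = sum(p_ux[u])
        for x in range(nx):
            ratio = (p_ux[u][x] / p_u) / (q_ux[u][x] / q_u)
            total += q_ux[u][x] * ratio ** mu
    return total

rng = random.Random(0)
nu, nx = 3, 4  # hypothetical alphabet sizes |U| and |X|
values = []
for _ in range(200):
    flat = random_dist(nu * nx, rng)
    q_ux = [flat[u * nx:(u + 1) * nx] for u in range(nu)]
    p_x = random_dist(nx, rng)
    mu = rng.random()
    values.append(quantity_a(q_ux, p_x, mu))
max_a = max(values)
print(max_a)
```

A random search of this kind does not replace the Hölder argument; it only illustrates that no sampled instance violates the bound, and that μ = 1 gives A = 1 up to rounding.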

Proof of Lemma A5:

We fix

μ

\in [0, 1]

,

α \in [0, 1 / 2)

arbitrarily. For any

p = p_{U X Y} \in P_{sh} (p_{X Y})

, and any

q = q_{U X Y} \in Q (p_{Y | X})

, we have the following chain of inequalities:

\begin{matrix} exp \{- {\hat{Ω}}^{(μ, \frac{α}{\bar{α}})} (p, q_{Y | U})\} = E_{p} [\begin{matrix}  \end{matrix} {\{\frac{p_{X}^{μ \frac{α}{1 - 2 α}} (X) p_{Y | U}^{\bar{μ} \frac{α}{1 - 2 α}} (Y | U)}{p_{X | U}^{μ \frac{α}{1 - 2 α}} (X | U)}\}}^{\frac{1 - 2 α}{\bar{α}}} {\{\frac{q_{Y | U}^{\bar{μ}} (Y | U)}{p_{Y | U}^{\bar{μ}} (Y | U)}\}}^{\frac{α}{\bar{α}}} \begin{matrix}  \end{matrix}] \\ \overset{(a)}{\leq} exp \{- \frac{1 - 2 α}{\bar{α}} {\tilde{Ω}}^{(μ, \frac{α}{1 - 2 α})} (p)\} {(E_{p} [\frac{q_{Y | U}^{\bar{μ}} (Y | U)}{p_{Y | U}^{\bar{μ}} (Y | U)}])}^{\frac{α}{\bar{α}}} = exp \{- \frac{1 - 2 α}{\bar{α}} {\tilde{Ω}}^{(μ, \frac{α}{1 - 2 α})} (p)\} B^{\frac{α}{\bar{α}}}, \end{matrix}

(A18)

where we set

B : = E_{p} [\frac{q_{Y | U}^{\bar{μ}} (Y | U)}{p_{Y | U}^{\bar{μ}} (Y | U)}] .

Step (a) follows from Hölder’s inequality. From (A18), we can see that it suffices to show

B \leq 1

to complete the proof. In a manner quite similar to the proof of

A \leq 1

in the proof of (A13) in Lemma A4, we can show that

B \leq 1

. Thus, we have (A14) in Lemma A5. □

Proof of Property 4 part b:

We evaluate lower bounds of

F (R_{1}, R_{2} | p_{X Y})

to obtain the following chain of inequalities:

\begin{matrix} F (R_{1}, R_{2} | p_{X Y}) \overset{(a)}{\geq} sup_{\binom{μ \in [0, 1],}{α \in [0, 1 / 2)}} \frac{(1 - 2 α) {\tilde{Ω}}^{(μ, \frac{α}{1 - 2 α})} (p_{X Y}) - α (μ R_{1} + \bar{μ} R_{2})}{2 + α \bar{μ}} \\ = sup_{\binom{μ \in [0, 1],}{\binom{α \in [0, 1 / 2),}{λ = \frac{α}{1 - 2 α}}}} \frac{(1 - 2 α) {\tilde{Ω}}^{(μ, λ)} (p_{X Y}) - α (μ R_{1} + \bar{μ} R_{2})}{2 + α \bar{μ}} \\ \overset{(b)}{=} sup_{\binom{μ \in [0, 1],}{α = \frac{λ}{1 + 2 λ}, λ \geq 0}} \frac{(1 - 2 α) {\tilde{Ω}}^{(μ, λ)} (p_{X Y}) - α (μ R_{1} + \bar{μ} R_{2})}{2 + α \bar{μ}} \\ \overset{(c)}{=} sup_{μ \in [0, 1], λ \geq 0} \frac{{\tilde{Ω}}^{(μ, λ)} (p_{X Y}) - λ (μ R_{1} + \bar{μ} R_{2})}{2 + λ (5 - μ)} = sup_{μ \in [0, 1], λ \geq 0} {\underset{̲}{F}}^{(μ, λ)} (μ R_{1} + \bar{μ} R_{2} | p_{X Y}) . \end{matrix}

(A19)

Step (a) follows from the definition of

F (R_{1}, R_{2} | p_{X Y})

and (A16) in Corollary A1. Steps (b) and (c) follow from that

α \in [0, 1 / 2), λ = \frac{α}{1 - 2 α} \Leftrightarrow λ \geq 0, α = \frac{λ}{1 + 2 λ} .

From (A19), we have

\begin{matrix} F (R_{1}, R_{2} | p_{X Y}) \geq sup_{μ \in [0, 1], λ \geq 0} {\underset{̲}{F}}^{(μ, λ)} (μ R_{1} + \bar{μ} R_{2} | p_{X Y}) = \underset{̲}{F} (R_{1}, R_{2} | p_{X Y}), \end{matrix}

completing the proof. □
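The change of variables behind steps (b) and (c) of (A19) can be verified with a short numerical check. The sketch below substitutes α = λ / (1 + 2 λ) and compares the two fractions for random inputs. Here omega stands for an arbitrary value of the exponent quantity and r for μ R_{1} + \bar{μ} R_{2}; both are placeholders chosen at random, and the function names are hypothetical.

```python
import random

def fraction_in_alpha(omega, r, mu, alpha):
    """((1 - 2 alpha) omega - alpha r) / (2 + alpha (1 - mu)), as before step (c)."""
    return ((1.0 - 2.0 * alpha) * omega - alpha * r) / (2.0 + alpha * (1.0 - mu))

def fraction_in_lambda(omega, r, mu, lam):
    """(omega - lam r) / (2 + lam (5 - mu)), as after step (c)."""
    return (omega - lam * r) / (2.0 + lam * (5.0 - mu))

rng = random.Random(1)
max_gap = 0.0
max_alpha = 0.0
for _ in range(1000):
    mu = rng.random()
    lam = 10.0 * rng.random()          # any lam >= 0
    omega = rng.random()               # placeholder value
    r = rng.random()                   # placeholder for mu R1 + (1 - mu) R2
    alpha = lam / (1.0 + 2.0 * lam)    # inverse of lam = alpha / (1 - 2 alpha)
    max_alpha = max(max_alpha, alpha)
    gap = abs(fraction_in_alpha(omega, r, mu, alpha)
              - fraction_in_lambda(omega, r, mu, lam))
    max_gap = max(max_gap, gap)
print(max_gap, max_alpha)
```

The agreement reflects the identity 2 + α \bar{μ} = (2 + λ (5 - μ)) / (1 + 2 λ) together with 1 - 2 α = 1 / (1 + 2 λ), so the common factor cancels.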

Appendix E. Proof of Property 4 Parts c, d, e, and f

In this appendix, we prove Property 4 parts c, d, e, and f. We first prove part c and then prove parts d and e. We finally prove part f.

Proof of Property 4 part c:

We first prove the second inequality in (8) in part c. We first observe that

\begin{matrix} exp [- {\tilde{Ω}}^{(μ, λ)} (p)] = E_{p} [\frac{p_{X}^{μ λ} (X) p_{Y | U}^{\bar{μ} λ} (Y | U)}{p_{X | U}^{μ λ} (X | U)}] . \end{matrix}

(A20)

Let

{\bar{p}}_{X}

be the uniform distribution on

X

and let

{\bar{p}}_{Y}

be the uniform distribution on

Y

. For a lower bound on

exp [- {\tilde{Ω}}^{(μ, λ)} (p)]

for

p \in P_{sh} (p_{X Y})

and

(μ, λ) \in {[0, 1]}^{2}

, we have the following chain of inequalities:

\begin{matrix} exp [- {\tilde{Ω}}^{(μ, λ)} (p)] = \frac{1}{{| X |}^{μ λ} {| Y |}^{\bar{μ} λ}} E_{p} [\begin{matrix}  \end{matrix} p_{X | U}^{- μ λ} (X | U) {\{\frac{p_{X} (X)}{{\bar{p}}_{X} (X)}\}}^{μ λ} {\{\frac{p_{Y | U} (Y | U)}{{\bar{p}}_{Y} (Y)}\}}^{\bar{μ} λ} \begin{matrix}  \end{matrix}] \\ \overset{(a)}{\geq} \frac{1}{{| X |}^{μ} {| Y |}^{\bar{μ}}} E_{p} [\begin{matrix}  \end{matrix} {\{\frac{{\bar{p}}_{X} (X)}{p_{X} (X)}\}}^{- μ λ} {\{\frac{{\bar{p}}_{Y} (Y)}{p_{Y | U} (Y | U)}\}}^{- \bar{μ} λ} \begin{matrix}  \end{matrix}] \\ \overset{(b)}{\geq} \frac{1}{{| X |}^{μ} {| Y |}^{\bar{μ}}} {(E_{p} [\begin{matrix}  \end{matrix} \frac{{\bar{p}}_{X} (X)}{p_{X} (X)} \begin{matrix}  \end{matrix}])}^{- μ λ} {(E_{p} [\begin{matrix}  \end{matrix} \frac{{\bar{p}}_{Y} (Y)}{p_{Y | U} (Y | U)} \begin{matrix}  \end{matrix}])}^{- \bar{μ} λ} = \frac{1}{{| X |}^{μ} {| Y |}^{\bar{μ}}} . \end{matrix}

(A21)

Step (a) follows from that

λ \in [0, 1]

and

p_{X | U} (x | u) \leq 1

for any

(u,

x) \in U \times X

. Step (b) follows from the reverse Hölder’s inequality. The bound (A21) implies the second inequality in (8). We next show that

{\tilde{Ω}}^{(μ, λ)} (p) \geq 0

for

λ \in [0, 1]

. On upper bounds of

exp [- {\tilde{Ω}}^{(μ, λ)} (p)]

for

p \in P_{sh} (p_{X Y})

and

λ \in [0, 1]

, we have the following chain of inequalities:

\begin{matrix} exp [- {\tilde{Ω}}^{(μ, λ)} (p)] \overset{(a)}{\leq} E_{p} [\begin{matrix}  \end{matrix} {\{\frac{p_{X} (X)}{p_{X | U} (X | U)}\}}^{μ λ} \begin{matrix}  \end{matrix}] \overset{(b)}{\leq} {\{E_{p} [\begin{matrix}  \end{matrix} \frac{p_{X} (X)}{p_{X | U} (X | U)} \begin{matrix}  \end{matrix}]\}}^{μ λ} = 1 . \end{matrix}

(A22)

Step (a) follows from (A20) and

p_{Y | U} (y | u) \leq 1

for any

(u, y) \in U \times Y

. Step (b) follows from

μ λ \in [0, 1]

and Hölder’s inequality. □

Proof of Property 4 parts d and e:

We first prove that, for each

p \in P_{sh} (p_{X Y})

and

μ \in [0, 1]

,

{\tilde{Ω}}^{(μ, λ)} (p)

is twice differentiable for

λ \in [0, 1 / 2]

. For simplicity of notations, set

\begin{matrix} \underset{̲}{a} : = (u, x, y), \underset{̲}{A} : = (U, X, Y), \underset{̲}{A} : = U \times X \times Y, \\ {\tilde{ω}}_{p}^{(μ)} (x, y | u) : = ς (\underset{̲}{a}), {\tilde{Ω}}^{(μ, λ)} (p) : = ξ (λ) . \end{matrix}

Then, we have

{\tilde{Ω}}^{(μ, λ)} (p) = ξ (λ) = - log [\sum_{\underset{̲}{a} \in \underset{̲}{A}} p_{\underset{̲}{A}} (\underset{̲}{a}) e^{- λ ς (\underset{̲}{a})}] .

(A23)

The quantity

p^{(λ)} (\underset{̲}{a}) = p_{\underset{̲}{A}}^{(λ)} (\underset{̲}{a}), \underset{̲}{a} \in A

has the following form:

p^{(λ)} (\underset{̲}{a}) = e^{ξ (λ)} p (\underset{̲}{a}) e^{- λ ς (\underset{̲}{a})} .

(A24)

By simple computations, we have

\begin{matrix} ξ^{'} (λ) = e^{ξ (λ)} [\sum_{\underset{̲}{a} \in \underset{̲}{A}} p (\underset{̲}{a}) ς (\underset{̲}{a}) e^{- λ ς (\underset{̲}{a})}] = \sum_{\underset{̲}{a} \in \underset{̲}{A}} p^{(λ)} (\underset{̲}{a}) ς (\underset{̲}{a}), \\ ξ^{″} (λ) = - e^{2 ξ (λ)} [\sum_{\underset{̲}{a}, \underset{̲}{b} \in \underset{̲}{A}} p (\underset{̲}{a}) p_{\underset{̲}{A}} (\underset{̲}{b}) \frac{{\{ς (\underset{̲}{a}) - ς (\underset{̲}{b})\}}^{2}}{2} e^{- λ \{ς (\underset{̲}{a}) + ς (\underset{̲}{b})\}}] \\ = - \sum_{\underset{̲}{a}, \underset{̲}{b} \in \underset{̲}{A}} p^{(λ)} (\underset{̲}{a}) p^{(λ)} (\underset{̲}{b}) \frac{{\{ς (\underset{̲}{a}) - ς (\underset{̲}{b})\}}^{2}}{2} = - \sum_{\underset{̲}{a} \in \underset{̲}{A}} p^{(λ)} (\underset{̲}{a}) ς^{2} (\underset{̲}{a}) + {[\sum_{\underset{̲}{a} \in \underset{̲}{A}} p^{(λ)} (\underset{̲}{a}) ς (\underset{̲}{a})]}^{2} \leq 0 . \end{matrix}

(A25)

On upper bound of

- ξ^{″} (λ) \geq 0

for

λ \in [0, 1 / 2]

, we have the following chain of inequalities:

\begin{matrix} - ξ^{″} (λ) \overset{(a)}{\leq} \sum_{\underset{̲}{a} \in \underset{̲}{A}} p^{(λ)} (\underset{̲}{a}) ς^{2} (\underset{̲}{a}) \overset{(b)}{=} \sum_{\underset{̲}{a} \in \underset{̲}{A}} p (\underset{̲}{a}) ς^{2} (\underset{̲}{a}) e^{- λ ς (\underset{̲}{a}) + ξ (λ)} = e^{ξ (λ)} \sum_{\underset{̲}{a} \in \underset{̲}{A}} p (\underset{̲}{a}) \sqrt{e^{- 2 λ ς (\underset{̲}{a})}} \sqrt{ς^{4} (\underset{̲}{a})} \\ \overset{(c)}{\leq} \sqrt{e^{2 ξ (λ) - ξ (2 λ)}} \sqrt{\sum_{\underset{̲}{a} \in \underset{̲}{A}} p (\underset{̲}{a}) ς^{4} (\underset{̲}{a})} \overset{(d)}{\leq} \sqrt{e^{2 ξ (λ)}} \sqrt{\sum_{\underset{̲}{a} \in \underset{̲}{A}} p (\underset{̲}{a}) ς^{4} (\underset{̲}{a})} . \end{matrix}

(A26)

Step (a) follows from (A25). Step (b) follows from (A24). Step (c) follows from Cauchy–Schwarz inequality and (A23). Step (d) follows from that

ξ (2 λ) \geq 0

for

2 λ \in [0, 1]

. Note that

ξ (λ)

exists for

λ \in [0, 1 / 2]

. Furthermore, we have the following:

\sum_{\underset{̲}{a} \in \underset{̲}{A}} p (\underset{̲}{a}) ς^{4} (\underset{̲}{a}) < \infty .

Hence, by (A26),

ξ^{″} (λ)

exists for

λ \in [0, 1 / 2]

. We next prove part e. We derive the lower bound (9) of

{\tilde{Ω}}^{(μ, λ)} (p_{X Y})

. Fix any

(μ, λ) \in [0, 1]

\times [0, 1 / 2]

and any

p \in P_{sh} (p_{X Y})

. By the Taylor expansion of

ξ (λ) = {\tilde{Ω}}^{(μ, λ)} (p)

with respect to

λ

around

λ = 0

, we have that for any

p \in P_{sh} (p_{X Y})

and for some

ν \in [0, λ]

\begin{matrix} {\tilde{Ω}}^{(μ, λ)} (p) = ξ (0) + ξ^{'} (0) λ + \frac{1}{2} ξ^{″} (ν) λ^{2} = λ E_{p} [{\tilde{ω}}_{p}^{(μ)} (X, Y | U)] - \frac{λ^{2}}{2} {Var}_{p^{(ν)}} [{\tilde{ω}}_{p}^{(μ)} (X, Y | U)] \\ \overset{(a)}{\geq} λ R^{(μ)} (p_{X Y}) - \frac{λ^{2}}{2} {Var}_{p^{(ν)}} [{\tilde{ω}}_{p}^{(μ)} (X, Y | U)] . \end{matrix}

(A27)

Step (a) follows from

p \in P_{sh} (p_{X Y})

,

E_{p} [{\tilde{ω}}_{p}^{(μ)} (X, Y | U)] = μ I_{p} (X; U) + \bar{μ} H_{p} (Y | U),

and the definition of

R^{(μ)} (p_{X Y})

. Let

(ν_{opt}, p_{opt})

\in [0, λ] \times

P_{sh} (p_{X Y})

be a pair which attains

ρ^{(μ, λ)} (p_{X Y})

. By this definition, we have that

\begin{matrix} {\tilde{Ω}}^{(μ, λ)} (p_{opt}) = {\tilde{Ω}}^{(μ, λ)} (p_{X Y}) \end{matrix}

(A28)

and that, for any

ν \in [0, λ],

\begin{matrix} {Var}_{p_{opt}^{(ν)}} [{\tilde{ω}}_{p_{opt}}^{(μ)} (X, Y | U)] \leq {Var}_{p_{opt}^{(ν_{opt})}} [{\tilde{ω}}_{p_{opt}}^{(μ)} (X, Y | U)] = ρ^{(μ, λ)} (p_{X Y}) . \end{matrix}

(A29)

For a lower bound on

{\tilde{Ω}}^{(μ, λ)} (p_{X Y})

, we have the following chain of inequalities:

\begin{matrix} {\tilde{Ω}}^{(μ, λ)} (p_{X Y}) \overset{(a)}{=} {\tilde{Ω}}^{(μ, λ)} (p_{opt}) \overset{(b)}{\geq} λ R^{(μ)} (p_{X Y}) - \frac{λ^{2}}{2} {Var}_{p_{opt}^{(ν)}} [{\tilde{ω}}_{p_{opt}}^{(μ)} (X, Y | U)] \\ \overset{(c)}{\geq} λ R^{(μ)} (p_{X Y}) - \frac{λ^{2}}{2} ρ^{(μ, λ)} (p_{X Y}) \overset{(d)}{\geq} λ R^{(μ)} (p_{X Y}) - \frac{λ^{2}}{2} ρ (p_{X Y}) . \end{matrix}

Step (a) follows from (A28). Step (b) follows from (A27). Step (c) follows from (A29). Step (d) follows from the definition of

ρ (p_{X Y})

. □
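The derivative formulas in (A25) say that the first derivative of ξ is the mean of ς under the tilted distribution (A24) and that the second derivative is minus its variance. The following minimal sketch compares both with finite differences on a small randomly generated example. The alphabet size, the values of ς, and the step size h are illustrative assumptions, and the function names are hypothetical.

```python
import math
import random

def xi(p, s, lam):
    """xi(lam) = -log sum_a p(a) exp(-lam * s(a)), as in (A23)."""
    return -math.log(sum(pa * math.exp(-lam * sa) for pa, sa in zip(p, s)))

def tilted(p, s, lam):
    """Tilted distribution p^(lam)(a) = e^{xi(lam)} p(a) e^{-lam s(a)}, as in (A24)."""
    c = math.exp(xi(p, s, lam))
    return [c * pa * math.exp(-lam * sa) for pa, sa in zip(p, s)]

def mean_var(p, s, lam):
    """Mean and variance of s under the tilted distribution."""
    pl = tilted(p, s, lam)
    m = sum(a * b for a, b in zip(pl, s))
    v = sum(a * b * b for a, b in zip(pl, s)) - m * m
    return m, v

rng = random.Random(2)
w = [rng.random() + 0.1 for _ in range(6)]
total = sum(w)
p = [x / total for x in w]               # hypothetical distribution
s = [3.0 * rng.random() for _ in range(6)]  # hypothetical values of the function
lam, h = 0.3, 1e-4
m, v = mean_var(p, s, lam)
d1 = (xi(p, s, lam + h) - xi(p, s, lam - h)) / (2.0 * h)
d2 = (xi(p, s, lam + h) - 2.0 * xi(p, s, lam) + xi(p, s, lam - h)) / (h * h)
print(d1 - m, d2 + v)
```

Since the second derivative is minus a variance, ξ is concave, which is what makes the second-order Taylor remainder in (A27) nonpositive.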

To prove part f, we use the following lemma.

Lemma A6.

When

τ \in (0, (1 / 2) ρ]

, the maximum of

\frac{1}{2 + 5 λ} \{- \frac{ρ}{2} λ^{2} + τ λ\}

for

λ \in (0, 1 / 2]

is attained by the positive

λ_{0}

satisfying

ϑ (λ_{0}) : = λ_{0} + \frac{5}{4} λ_{0}^{2} = \frac{τ}{ρ} .

(A30)

Let

g (a)

be the inverse function of

ϑ (a)

for

a \geq 0

. Then, the condition of (A30) is equivalent to

λ_{0} = g (\frac{τ}{ρ})

. The maximum is given by

\frac{1}{2 + 5 λ_{0}} \{- \frac{ρ}{2} λ_{0}^{2} + τ λ_{0}\} = \frac{ρ}{4} λ_{0}^{2} = \frac{ρ}{4} g^{2} (\frac{τ}{ρ}) .

By an elementary computation, we can prove this lemma. We omit the detail.
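The elementary computation can be illustrated numerically. Setting the derivative of the objective to zero gives λ + (5 / 4) λ^{2} = τ / ρ, whose positive root is g (τ / ρ) = (2 / 5) (\sqrt{1 + 5 τ / ρ} - 1). The sketch below compares the closed-form maximum with a grid search; the values of ρ and τ are illustrative assumptions satisfying τ \in (0, (1 / 2) ρ], and the function names are hypothetical.

```python
import math

def theta(a):
    """theta(a) = a + (5/4) a^2, as in (A30)."""
    return a + 1.25 * a * a

def g(b):
    """Inverse of theta on a >= 0: positive root of (5/4) a^2 + a - b = 0."""
    return 2.0 * (math.sqrt(1.0 + 5.0 * b) - 1.0) / 5.0

def objective(lam, rho, tau):
    """The function (1 / (2 + 5 lam)) * (-(rho / 2) lam^2 + tau lam) of Lemma A6."""
    return (-0.5 * rho * lam * lam + tau * lam) / (2.0 + 5.0 * lam)

rho, tau = 2.0, 0.7  # illustrative values with 0 < tau <= rho / 2
lam0 = g(tau / rho)
closed_form = 0.25 * rho * lam0 ** 2

# Grid search over (0, 1/2] as an independent check of the maximizer.
grid = [0.5 * k / 100000 for k in range(1, 100001)]
grid_max = max(objective(lam, rho, tau) for lam in grid)
print(lam0, closed_form, grid_max)
```

Because ϑ (1 / 2) = 13 / 16 exceeds 1 / 2, the condition τ \leq (1 / 2) ρ keeps the maximizer inside the interval (0, 1 / 2].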

Proof of Property 4 part f.

By the hyperplane expression

R_{sh} (p_{X Y})

of

R (p_{X Y})

stated in Property 3 part b, we have that, when

(R_{1} + τ, R_{2} + τ) \notin R (p_{X Y})

, we have

\begin{matrix} R^{(μ_{0})} (p_{X Y}) - (μ_{0} R_{1} + \bar{μ_{0}} R_{2}) > τ \end{matrix}

(A31)

for some

μ_{0} \in [0, 1]

. Then, for each positive

τ

, we have the following chain of inequalities:

\begin{matrix} \underset{̲}{F} (R_{1}, R_{2} | p_{X Y}) \geq sup_{λ \in (0, 1 / 2]} {\underset{̲}{F}}^{(μ_{0}, λ)} (μ_{0} R_{1} + \bar{μ_{0}} R_{2} | p_{X Y}) = sup_{λ \in (0, 1 / 2]} \frac{{\tilde{Ω}}^{(μ_{0}, λ)} (p_{X Y}) - λ (μ_{0} R_{1} + \bar{μ_{0}} R_{2})}{2 + λ (5 - μ_{0})} \\ \overset{(a)}{\geq} sup_{λ \in (0, 1 / 2]} \frac{1}{2 + 5 λ} \{- \frac{ρ}{2} λ^{2} + λ R^{(μ_{0})} (p_{X Y}) - λ (μ_{0} R_{1} + \bar{μ_{0}} R_{2})\} \\ \overset{(b)}{>} sup_{λ \in (0, 1 / 2]} \frac{1}{2 + 5 λ} \{- \frac{ρ}{2} λ^{2} + τ λ\} \overset{(c)}{=} \frac{ρ}{4} g^{2} (\frac{τ}{ρ}) . \end{matrix}

Step (a) follows from Property 4 part e. Step (b) follows from (A31). Step (c) follows from Lemma A6. □

Appendix F. Proof of Lemma 1

To prove Lemma 1, we prepare a lemma. Set

A_{n} : = \{(s, x^{n}, y^{n}) : \frac{1}{n} log \frac{p_{S X^{n} Y^{n}} (s, x^{n}, y^{n})}{{\hat{q}}_{S X^{n} Y^{n}} (s, x^{n}, y^{n})} \geq - η\} .

Furthermore, set

\begin{matrix} {\tilde{B}}_{n} : = \{x^{n} : \frac{1}{n} log \frac{p_{X^{n}} (x^{n})}{Q_{X^{n}} (x^{n})} \geq - η\}, B_{n} : = {\tilde{B}}_{n} \times M_{1} \times Y^{n}, B_{n}^{c} : = {\tilde{B}}_{n}^{c} \times M_{1} \times Y^{n}, \\ {\tilde{C}}_{n} : = \{(s, x^{n}) : s = φ_{1}^{(n)} (x^{n}), {\tilde{Q}}_{X^{n} | S} (x^{n} | s) \leq M_{1} e^{n η} p_{X^{n}} (x^{n})\}, C_{n} : = {\tilde{C}}_{n} \times Y^{n}, C_{n}^{c} : = {\tilde{C}}_{n}^{c} \times Y^{n}, \\ D_{n} : = \{(s, x^{n}, y^{n}) : s = φ_{1}^{(n)} (x^{n}), p_{Y^{n} | S} (y^{n} | s) \geq (1 / M_{2}) e^{- n η}\}, \\ E_{n} : = \{(s, x^{n}, y^{n}) : s = φ_{1}^{(n)} (x^{n}), ψ^{(n)} (φ_{1}^{(n)} (x^{n}), φ_{2}^{(n)} (y^{n})) = y^{n}\} . \end{matrix}

Then, we have the following lemma.

Lemma A7.

\begin{matrix} p_{S X^{n} Y^{n}} (A_{n}^{c}) \leq e^{- n η}, p_{S X^{n} Y^{n}} (B_{n}^{c}) \leq e^{- n η}, p_{S X^{n} Y^{n}} (C_{n}^{c}) \leq e^{- n η}, p_{S X^{n} Y^{n}} (D_{n}^{c} \cap E_{n}) \leq e^{- n η} . \end{matrix}

Proof.

We first prove the first inequality.

\begin{matrix} p_{S X^{n} Y^{n}} (A_{n}^{c}) = \sum_{(s, x^{n}, y^{n}) \in A_{n}^{c}} p_{S X^{n} Y^{n}} (s, x^{n}, y^{n}) \\ \overset{(a)}{\leq} \sum_{(s, x^{n}, y^{n}) \in A_{n}^{c}} e^{- n η} {\hat{q}}_{S X^{n} Y^{n}} (s, x^{n}, y^{n}) = e^{- n η} {\hat{q}}_{S X^{n} Y^{n}} (A_{n}^{c}) \leq e^{- n η} . \end{matrix}

Step (a) follows from the definition of

A_{n}

. For the second inequality, we have

\begin{matrix} p_{S X^{n} Y^{n}} (B_{n}^{c}) = p_{X^{n}} ({\tilde{B}}_{n}^{c}) = \sum_{x^{n} \in {\tilde{B}}_{n}^{c}} p_{X^{n}} (x^{n}) \overset{(a)}{\leq} \sum_{x^{n} \in {\tilde{B}}_{n}^{c}} e^{- n η} Q_{X^{n}} (x^{n}) = e^{- n η} Q_{X^{n}} ({\tilde{B}}_{n}^{c}) \leq e^{- n η} . \end{matrix}

Step (a) follows from the definition of

B_{n}

. We next prove the third inequality:

\begin{matrix} p_{S X^{n} Y^{n}} (C_{n}^{c}) = p_{S X^{n}} ({\tilde{C}}_{n}^{c}) = \sum_{s \in M_{1}} \sum_{\binom{x^{n} : φ_{1}^{(n)} (x^{n}) = s}{\binom{p_{X^{n}} (x^{n}) \leq (1 / M_{1}) e^{- n η}}{\times {\tilde{Q}}_{X^{n} | S} (x^{n} | s)}}} p_{X^{n}} (x^{n}) \\ \leq \frac{1}{M_{1}} e^{- n η} \sum_{s \in M_{1}} \sum_{\binom{x^{n} : φ_{1}^{(n)} (x^{n}) = s}{\binom{p_{X^{n}} (x^{n}) \leq (1 / M_{1}) e^{- n η}}{\times {\tilde{Q}}_{X^{n} | S} (x^{n} | s)}}} {\tilde{Q}}_{X^{n} | S} (x^{n} | s) \leq \frac{1}{M_{1}} e^{- n η} | M_{1} | = e^{- n η} . \end{matrix}

Finally, we prove the fourth inequality. We first observe that

\begin{matrix} p_{S} (s) = \sum_{x^{n} : φ_{1}^{(n)} (x^{n}) = s} p_{X^{n}} (x^{n}), p_{X^{n} | S} (x^{n} | s) = \frac{p_{X^{n}} (x^{n})}{p_{S} (s)} . \end{matrix}

We have the following chain of inequalities:

\begin{matrix} p_{S X^{n} Y^{n}} (D_{n}^{c} \cap E_{n}) = \sum_{s \in M_{1}} p_{S} (s) \sum_{x^{n} : φ_{1}^{(n)} (x^{n}) = s} p_{X^{n} | S} (x^{n} | s) \sum_{\binom{y^{n} : ψ^{(n)} (s, φ_{2}^{(n)} (y^{n})) = y^{n}}{p_{Y^{n} | S} (y^{n} | s) \leq (1 / M_{2}) e^{- n η}}} p_{Y^{n} | X^{n}} (y^{n} | x^{n}) \\ = \sum_{s \in M_{1}} p_{S} (s) \sum_{\binom{y^{n} : ψ^{(n)} (s, φ_{2}^{(n)} (y^{n})) = y^{n}}{p_{Y^{n} | S} (y^{n} | s) \leq (1 / M_{2}) e^{- n η}}} p_{Y^{n} | S} (y^{n} | s) \\ \leq \sum_{s \in M_{1}} p_{S} (s) \frac{1}{M_{2}} e^{- n η} |\{y^{n} : ψ^{(n)} (s, φ_{2}^{(n)} (y^{n})) = y^{n}\}| \overset{(a)}{\leq} \sum_{s \in M_{1}} p_{S} (s) \frac{1}{M_{2}} e^{- n η} M_{2} = e^{- n η} . \end{matrix}

Step (a) follows from that the number of

y^{n}

correctly decoded does not exceed

M_{2}

. □
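The first two inequalities of Lemma A7 rest on a generic change-of-measure bound: for any two distributions p and q on a finite set, the p-probability of the event (1 / n) log (p / q) < - η is at most e^{- n η}. The following minimal sketch checks this on random pairs of distributions. The alphabet size, n, η, and the function names are illustrative assumptions; the finite set here is abstract and does not model the coding scheme itself.

```python
import math
import random

def random_dist(k, rng):
    """Random probability vector of length k with strictly positive entries."""
    w = [rng.random() + 1e-6 for _ in range(k)]
    s = sum(w)
    return [v / s for v in w]

def prob_of_complement(p, q, n, eta):
    """p-mass of the set {a : (1 / n) log(p(a) / q(a)) < -eta}."""
    return sum(pa for pa, qa in zip(p, q) if math.log(pa / qa) < -n * eta)

rng = random.Random(3)
n, eta = 4, 0.2  # illustrative values only
bound = math.exp(-n * eta)
worst = 0.0
for _ in range(500):
    p = random_dist(50, rng)
    q = random_dist(50, rng)
    worst = max(worst, prob_of_complement(p, q, n, eta))
print(worst, bound)
```

On the complement set each point satisfies p (a) < e^{- n η} q (a), so summing over the set gives at most e^{- n η} times a probability, which is the same one-line argument as step (a) in the proofs above.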

Proof of Lemma 1:

By definition, we have

\begin{matrix} p_{S X^{n} Y^{n}} (A_{n} \cap B_{n} \cap C_{n} \cap D_{n}) = p_{S X^{n} Y^{n}} & \{\frac{1}{n} log \frac{p_{S X^{n} Y^{n}} (S, X^{n}, Y^{n})}{{\hat{q}}_{S X^{n} Y^{n}} (S, X^{n}, Y^{n})} \geq - η, \\ & 0 \geq \frac{1}{n} log \frac{Q_{X^{n}} (X^{n})}{p_{X^{n}} (X^{n})} - η, \\ & \frac{1}{n} log M_{1} \geq \frac{1}{n} log \frac{{\tilde{Q}}_{X^{n} | S} (X^{n} | S)}{p_{X^{n}} (X^{n})} - η, \\ & \frac{1}{n} log M_{2} \geq \frac{1}{n} log \frac{1}{p_{Y^{n} | S} (Y^{n} | S)} - η\} . \end{matrix}

Then, for any

(φ_{1}^{(n)}

,

φ_{2}^{(n)}, ψ^{(n)})

satisfying

(1 / n) log | | φ_{i}^{(n)} | | \leq R_{i}, i = 1, 2,

we have

\begin{matrix} p_{S X^{n} Y^{n}} (A_{n} \cap B_{n} \cap C_{n} \cap D_{n}) \leq p_{S X^{n} Y^{n}} & \{\frac{1}{n} log \frac{p_{S X^{n} Y^{n}} (S, X^{n}, Y^{n})}{{\hat{q}}_{S X^{n} Y^{n}} (S, X^{n}, Y^{n})} \geq - η, \\ & 0 \geq \frac{1}{n} log \frac{Q_{X^{n}} (X^{n})}{p_{X^{n}} (X^{n})} - η, \\ & R_{1} \geq \frac{1}{n} log \frac{{\tilde{Q}}_{X^{n} | S} (X^{n} | S)}{p_{X^{n}} (X^{n})} - η, \\ & R_{2} \geq \frac{1}{n} log \frac{1}{p_{Y^{n} | S} (Y^{n} | S)} - η\} . \end{matrix}

Hence, it suffices to show

\begin{matrix} P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) & \leq p_{S X^{n} Y^{n}} (A_{n} \cap B_{n} \cap C_{n} \cap D_{n}) + 4 e^{- n η} \end{matrix}

to prove Lemma 1. By definition, we have

P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) = p_{S X^{n} Y^{n}} (E_{n}) .

Then, we have the following.

\begin{matrix} P_{c}^{(n)} (φ_{1}^{(n)}, φ_{2}^{(n)}, ψ^{(n)}) = p_{S X^{n} Y^{n}} (E_{n}) \\ = p_{S X^{n} Y^{n}} (A_{n} \cap B_{n} \cap C_{n} \cap D_{n} \cap E_{n}) + p_{S X^{n} Y^{n}} ({[A_{n} \cap B_{n} \cap C_{n} \cap D_{n}]}^{c} \cap E_{n}) \\ \leq p_{S X^{n} Y^{n}} (A_{n} \cap B_{n} \cap C_{n} \cap D_{n}) + p_{S X^{n} Y^{n}} (A_{n}^{c}) + p_{S X^{n} Y^{n}} (B_{n}^{c}) + p_{S X^{n} Y^{n}} (C_{n}^{c}) + p_{S X^{n} Y^{n}} (D_{n}^{c} \cap E_{n}) \\ \overset{(a)}{\leq} p_{S X^{n} Y^{n}} (A_{n} \cap B_{n} \cap C_{n} \cap D_{n}) + 4 e^{- n η} . \end{matrix}

Step (a) follows from Lemma A7. □

Appendix G. Proof of Lemma 3

In this appendix, we prove Lemma 3.

Proof of Lemma 3.

We first prove the Markov chain $S X^{t - 1} \leftrightarrow X_{t} \leftrightarrow Y_{t}$ in (18) in Lemma 3. We have the following chain of inequalities:

\begin{matrix} I (Y_{t}; S X^{t - 1} | X_{t}) = H (Y_{t} | X_{t}) - H (Y_{t} | S X^{t - 1} X_{t}) \leq H (Y_{t} | X_{t}) - H (Y_{t} | S X^{n}) \\ \overset{(a)}{=} H (Y_{t} | X_{t}) - H (Y_{t} | X^{n}) \overset{(b)}{=} H (Y_{t} | X_{t}) - H (Y_{t} | X_{t}) = 0 . \end{matrix}

Step (a) follows from the fact that $S = φ_{1}^{(n)} (X^{n})$ is a function of $X^{n}$. Step (b) follows from the memoryless property of the information source $\{(X_{t}, Y_{t})\}_{t = 1}^{\infty}$. Since conditional mutual information is nonnegative, the chain forces $I (Y_{t}; S X^{t - 1} | X_{t}) = 0$, which is equivalent to the Markov chain in (18).

Next, we prove the Markov chain $Y^{t - 1} \leftrightarrow S X^{t - 1} \leftrightarrow (X_{t}, Y_{t})$ in (19) in Lemma 3. We have the following chain of inequalities:

\begin{matrix} I (X_{t} Y_{t}; Y^{t - 1} | S X^{t - 1}) = H (Y^{t - 1} | S X^{t - 1}) - H (Y^{t - 1} | S X^{t - 1} X_{t} Y_{t}) \leq H (Y^{t - 1} | X^{t - 1}) - H (Y^{t - 1} | X^{n} S Y_{t}) \\ \overset{(a)}{=} H (Y^{t - 1} | X^{t - 1}) - H (Y^{t - 1} | X^{n} Y_{t}) \overset{(b)}{=} H (Y^{t - 1} | X^{t - 1}) - H (Y^{t - 1} | X^{t - 1} Y_{t}) = 0 . \end{matrix}

Step (a) follows from the fact that $S = φ_{1}^{(n)} (X^{n})$ is a function of $X^{n}$. Step (b) follows from the memoryless property of the information source $\{(X_{t}, Y_{t})\}_{t = 1}^{\infty}$. □
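The two Markov chains can also be checked numerically on a toy instance. The sketch below builds the joint distribution of $(S, X^{n}, Y^{n})$ for a binary memoryless source with $n = 3$ and evaluates both conditional mutual informations for every $t$. The single-letter distribution `P_XY`, the encoder `encoder`, and all function names are hypothetical choices made for illustration; any function of $X^{n}$ could be used in place of `encoder`.

```python
import itertools
import math
from collections import defaultdict

# Assumed toy single-letter pmf p_{XY} on {0,1}^2 (not taken from the paper).
P_XY = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
N = 3  # block length of the toy example


def encoder(xs):
    # Hypothetical helper encoder phi_1; any function of x^n works here.
    return (xs[0] ^ xs[1], xs[1] & xs[2])


def joint():
    """Joint pmf of (S, X^n, Y^n) for an i.i.d. source with S = encoder(X^n)."""
    table = {}
    for pairs in itertools.product(P_XY, repeat=N):
        xs = tuple(x for x, _ in pairs)
        ys = tuple(y for _, y in pairs)
        table[(encoder(xs), xs, ys)] = math.prod(P_XY[pair] for pair in pairs)
    return table


def cond_mi(table, a, b, c):
    """I(A; B | C) in nats; a, b, c map an outcome (s, xs, ys) to a value."""
    pabc, pac, pbc, pc = (defaultdict(float) for _ in range(4))
    for w, p in table.items():
        va, vb, vc = a(w), b(w), c(w)
        pabc[(va, vb, vc)] += p
        pac[(va, vc)] += p
        pbc[(vb, vc)] += p
        pc[vc] += p
    return sum(p * math.log(p * pc[vc] / (pac[(va, vc)] * pbc[(vb, vc)]))
               for (va, vb, vc), p in pabc.items() if p > 0)


def markov_gaps():
    """Return [(I(Y_t; S X^{t-1} | X_t), I(X_t Y_t; Y^{t-1} | S X^{t-1}))]."""
    table = joint()
    gaps = []
    for t in range(1, N + 1):  # t is 1-based, as in the text
        i = t - 1
        gap18 = cond_mi(table,
                        lambda w: w[2][i],
                        lambda w: (w[0], w[1][:i]),
                        lambda w: w[1][i])
        gap19 = cond_mi(table,
                        lambda w: (w[1][i], w[2][i]),
                        lambda w: w[2][:i],
                        lambda w: (w[0], w[1][:i]))
        gaps.append((gap18, gap19))
    return gaps


if __name__ == "__main__":
    for t, (gap18, gap19) in enumerate(markov_gaps(), start=1):
        print(f"t={t}: (18) gap={gap18:.2e}, (19) gap={gap19:.2e}")
```

Both quantities should vanish up to floating-point rounding for every $t$. This is only a consistency check on one example and does not replace the proof.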

Appendix H. Proof of Lemma 6

In this appendix, we prove Lemma 6.

Proof of Lemma 6.

By the definition of $p_{S X^{t} Y^{t}; F^{t}}^{(μ, α)} (s, x^{t}, y^{t})$, for $t = 1, 2, \dots, n$, we have

\begin{matrix} p_{S X^{t} Y^{t}; F^{t}}^{(μ, α)} (s, x^{t}, y^{t}) = C_{t}^{- 1} p_{S X^{t} Y^{t}} (s, x^{t}, y^{t}) \prod_{i = 1}^{t} f_{F_{i}}^{(μ, α)} (x_{i}, y_{i} | u_{i}) . \end{matrix}

(A32)

Then, we have the following chain of equalities:

\begin{matrix} p_{S X^{t} Y^{t}; F^{t}}^{(μ, α)} (s, x^{t}, y^{t}) \overset{(a)}{=} C_{t}^{- 1} p_{S X^{t} Y^{t}} (s, x^{t}, y^{t}) \prod_{i = 1}^{t} f_{F_{i}}^{(μ, α)} (x_{i}, y_{i} | u_{i}) \\ = C_{t}^{- 1} p_{S X^{t - 1} Y^{t - 1}} (s, x^{t - 1}, y^{t - 1}) \prod_{i = 1}^{t - 1} f_{F_{i}}^{(μ, α)} (x_{i}, y_{i} | u_{i}) \\ \times p_{X_{t} Y_{t} | S X^{t - 1} Y^{t - 1}} (x_{t}, y_{t} | s, x^{t - 1}, y^{t - 1}) f_{F_{t}}^{(μ, α)} (x_{t}, y_{t} | u_{t}) \\ \overset{(b)}{=} C_{t}^{- 1} C_{t - 1} p_{S X^{t - 1} Y^{t - 1}; F^{t - 1}}^{(μ, α)} (s, x^{t - 1}, y^{t - 1}) p_{X_{t} Y_{t} | S X^{t - 1} Y^{t - 1}} (x_{t}, y_{t} | s, x^{t - 1}, y^{t - 1}) f_{F_{t}}^{(μ, α)} (x_{t}, y_{t} | u_{t}) \\ = {(Φ_{t}^{(μ, α)})}^{- 1} p_{S X^{t - 1} Y^{t - 1}; F^{t - 1}}^{(μ, α)} (s, x^{t - 1}, y^{t - 1}) p_{X_{t} Y_{t} | S X^{t - 1} Y^{t - 1}} (x_{t}, y_{t} | s, x^{t - 1}, y^{t - 1}) f_{F_{t}}^{(μ, α)} (x_{t}, y_{t} | u_{t}) . \end{matrix}

(A33)

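Steps (a) and (b) follow from (A32), where step (b) applies it with $t - 1$ in place of $t$. The last equality uses the relation $Φ_{t}^{(μ, α)} = C_{t} C_{t - 1}^{- 1}$. From (A33), we have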

\begin{matrix} Φ_{t}^{(μ, α)} p_{S X^{t} Y^{t}; F^{t}}^{(μ, α)} (s, x^{t}, y^{t}) \end{matrix}

(A34)

\begin{matrix} = p_{S X^{t - 1} Y^{t - 1}; F^{t - 1}}^{(μ, α)} (s, x^{t - 1}, y^{t - 1}) p_{X_{t} Y_{t} | S X^{t - 1} Y^{t - 1}} (x_{t}, y_{t} | s, x^{t - 1}, y^{t - 1}) f_{F_{t}}^{(μ, α)} (x_{t}, y_{t} | u_{t}) . \end{matrix}

(A35)

Taking summations of (A34) and (A35) with respect to $s, x^{t}, y^{t}$, we obtain

\begin{matrix} Φ_{t}^{(μ, α)} = \sum_{s, x^{t}, y^{t}} p_{S X^{t - 1} Y^{t - 1}; F^{t - 1}}^{(μ, α)} (s, x^{t - 1}, y^{t - 1}) p_{X_{t} Y_{t} | S X^{t - 1} Y^{t - 1}} (x_{t}, y_{t} | s, x^{t - 1}, y^{t - 1}) f_{F_{t}}^{(μ, α)} (x_{t}, y_{t} | u_{t}), \end{matrix}

completing the proof. □
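The recursion can be illustrated on a toy example. The sketch below draws an arbitrary joint distribution on $(S, X^{n}, Y^{n})$ with binary alphabets and $n = 3$, attaches arbitrary positive weights in place of $f_{F_{t}}^{(μ, α)}$, and compares $C_{t} / C_{t - 1}$ with the sum on the right-hand side above. Everything specific here is an assumption made for illustration: the weights are random numbers rather than the paper's functions, they are allowed to depend on $(s, x^{t}, y^{t})$, $C_{0}$ is taken to be 1, and $Φ_{t}^{(μ, α)}$ is read off from (A33) as $C_{t} C_{t - 1}^{- 1}$.

```python
import itertools
import math
import random
from collections import defaultdict

N = 3          # block length of the toy example
BITS = (0, 1)  # assumed binary alphabets for S, X and Y


def check_recursion(seed=0):
    """Return [(C_t / C_{t-1}, right-hand side of the recursion)] for t = 1..N."""
    rng = random.Random(seed)

    # Arbitrary strictly positive joint pmf p_{S X^n Y^n} (toy assumption).
    outcomes = [(s, xs, ys)
                for s in BITS
                for xs in itertools.product(BITS, repeat=N)
                for ys in itertools.product(BITS, repeat=N)]
    raw = [rng.random() + 1e-3 for _ in outcomes]
    total = sum(raw)
    p = {w: r / total for w, r in zip(outcomes, raw)}

    def marginal(t):
        """Marginal pmf of (S, X^t, Y^t)."""
        m = defaultdict(float)
        for (s, xs, ys), q in p.items():
            m[(s, xs[:t], ys[:t])] += q
        return m

    # Hypothetical positive weights standing in for f_{F_t}(x_t, y_t | u_t).
    weight = [{key: rng.uniform(0.5, 2.0) for key in marginal(t)}
              for t in range(1, N + 1)]

    def unnormalised(t):
        """p_{S X^t Y^t} times the product of the first t weights."""
        return {(s, xs, ys): q * math.prod(
                    weight[i][(s, xs[:i + 1], ys[:i + 1])] for i in range(t))
                for (s, xs, ys), q in marginal(t).items()}

    results = []
    for t in range(1, N + 1):
        prev, curr = unnormalised(t - 1), unnormalised(t)
        c_prev, c_curr = sum(prev.values()), sum(curr.values())
        phi = c_curr / c_prev  # Phi_t read as C_t / C_{t-1}
        m_prev, m_curr = marginal(t - 1), marginal(t)
        rhs = 0.0
        for (s, xs, ys), q in m_curr.items():
            head = (s, xs[:-1], ys[:-1])
            tilted_prev = prev[head] / c_prev  # tilted pmf at time t - 1
            conditional = q / m_prev[head]     # p(x_t, y_t | s, x^{t-1}, y^{t-1})
            rhs += tilted_prev * conditional * weight[t - 1][(s, xs, ys)]
        results.append((phi, rhs))
    return results


if __name__ == "__main__":
    for t, (phi, rhs) in enumerate(check_recursion(), start=1):
        print(f"t={t}: C_t/C_(t-1)={phi:.6f}, recursion sum={rhs:.6f}")
```

The two columns should agree up to rounding. The toy distribution is not memoryless, so the agreement reflects that the identity is purely algebraic: it uses only the product form (A32) and the normalization of the tilted distributions.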

References

1. Ahlswede, R.F.; Körner, J. Source coding with side information and a converse for degraded broadcast channels. IEEE Trans. Inf. Theory 1975, 21, 629–637.
2. Wyner, A.D. On source coding with side information at the decoder. IEEE Trans. Inf. Theory 1975, 21, 294–300.
3. Csiszár, I.; Longo, G. On the exponent function for source coding and for testing simple statistical hypotheses. Studia Sci. Math. Hungar 1971, 6, 181–191.
4. Slepian, D.; Wolf, J.K. Noiseless coding of correlated information sources. IEEE Trans. Inf. Theory 1973, 19, 471–480.
5. Oohama, Y.; Han, T.S. Universal coding for the Slepian-Wolf data compression system and the strong converse theorem. IEEE Trans. Inf. Theory 1994, 40, 1908–1919.
6. Ahlswede, R.; Gács, P.; Körner, J. Bounds on conditional probabilities with applications in multi-user communication. Probab. Theory Relat. Fields 1976, 34, 157–177.
7. Gu, W.; Effros, M. A strong converse for a collection of network source coding problems. In Proceedings of the IEEE International Symposium on Information Theory, Seoul, Korea, 28 June–3 July 2009; pp. 2316–2320.
8. Oohama, Y. Strong converse exponent for degraded broadcast channels at rates outside the capacity region. In Proceedings of the 2015 IEEE International Symposium on Information Theory, Hong Kong, China, 14–19 June 2015; pp. 939–943.
9. Oohama, Y. Strong converse theorems for degraded broadcast channels with feedback. In Proceedings of the 2015 IEEE International Symposium on Information Theory, Hong Kong, China, 14–19 June 2015; pp. 2510–2514.
10. Oohama, Y. Exponent function for asymmetric broadcast channels at rates outside the capacity region. In Proceedings of the 2016 IEEE International Symposium on Information Theory and its Applications, Monterey, CA, USA, 30 October–2 November 2016; pp. 568–572.
11. Oohama, Y. New Strong Converse for Asymmetric Broadcast Channels. Available online: https://arxiv.org/pdf/1604.02901.pdf (accessed on 31 May 2019).
12. Oohama, Y. Exponential strong converse for source coding with side information at the decoder. Entropy 2018, 20, 352.
13. Watanabe, S. A converse bound on Wyner-Ahlswede-Körner network via Gray–Wyner network. In Proceedings of the 2017 IEEE Information Theory Workshop (ITW), Kaohsiung, Taiwan, 6–10 November 2017; pp. 81–85.
14. Liu, J.; van Handel, R.; Verdú, S. Beyond the blowing-up lemma: Sharp converses via reverse hypercontractivity. In Proceedings of the 2017 IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, 25–30 June 2017; pp. 943–947.
15. Watanabe, S. Second-order region for Gray–Wyner network. IEEE Trans. Inf. Theory 2017, 63, 1006–1018.
16. Liu, J. Dispersion bound for the Wyner-Ahlswede-Körner network via reverse hypercontractivity on types. In Proceedings of the 2018 IEEE International Symposium on Information Theory (ISIT), Vail, CO, USA, 17–22 June 2018; pp. 1854–1858.
17. Watanabe, S.; Oohama, Y. Privacy amplification theorem for bounded storage eavesdropper. In Proceedings of the 2012 IEEE Information Theory Workshop (ITW), Lausanne, Switzerland, 3–7 September 2012; pp. 177–181.
18. Oohama, Y.; Santoso, B. Information Theoretic Security for Side-Channel Attacks to the Shannon Cipher System. Available online: https://arxiv.org/pdf/1801.02563v5.pdf (accessed on 31 May 2019).
19. Santoso, B.; Oohama, Y. Information Theoretic Security for Shannon Cipher System under Side-Channel Attacks. Entropy 2019, 21, 469.
20. Oohama, Y. Exponent Function for One Helper Source Coding Problem at Rates outside the Rate Region. arXiv 2015, arXiv:1504.05891.
21. Csiszár, I.; Körner, J. Information Theory: Coding Theorems for Discrete Memoryless Systems; Cambridge University Press: London, UK, 1981.
22. Han, T.S. Information-Spectrum Methods in Information Theory; Springer Nature Switzerland AG: Basel, Switzerland, 2002.

Figure 1. Source encoding with or without side information at the decoder.

Figure 2. One helper source coding system [20].

Figure 3. A typical shape of $R (p_{X Y})$.


Figure 4. One helper source coding system investigated by Wyner.

© 2019 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Oohama, Y. Exponential Strong Converse for One Helper Source Coding Problem. Entropy 2019, 21, 567. https://doi.org/10.3390/e21060567

