Next Article in Journal
Exploration vs. Data Refinement via Multiple Mobile Sensors
Next Article in Special Issue
Structural Characteristics of Two-Sender Index Coding
Previous Article in Journal
SIMIT: Subjectively Interesting Motifs in Time Series
Previous Article in Special Issue
Information Theoretic Security for Shannon Cipher System under Side-Channel Attacks
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Exponential Strong Converse for One Helper Source Coding Problem †

by
Yasutada Oohama
Department of Communication Engineering and Informatics, University of Electro-Communications, Tokyo 182-8585, Japan
This paper is an extended version of our paper published in 2015 IEEE International Symposium on Information Theory (ISIT), Hong Kong, China, 14–19 June 2015.
Entropy 2019, 21(6), 567; https://doi.org/10.3390/e21060567
Submission received: 12 March 2019 / Revised: 29 May 2019 / Accepted: 31 May 2019 / Published: 5 June 2019
(This article belongs to the Special Issue Multiuser Information Theory II)

Abstract

:
We consider the one helper source coding problem posed and investigated by Ahlswede, Körner and Wyner. Two correlated sources are separately encoded and are sent to a destination where the decoder wishes to decode one of the two sources with an arbitrary small error probability of decoding. In this system, the error probability of decoding goes to one as the source block length n goes to infinity. This implies that we have a strong converse theorem for the one helper source coding problem. In this paper, we provide the much stronger version of this strong converse theorem for the one helper source coding problem. We prove that the error probability of decoding tends to one exponentially and derive an explicit lower bound of this exponent function.

1. Introduction

For single or multi terminal source encoding systems, the converse coding theorems state that, at any data compression rates below the fundamental theoretical limit of the system, the error probability of decoding can not go to zero when the block length n of the codes tends to infinity.
In this paper, we study the one helper source coding problem posed and investigated by Ahlswede, Körner [1] and Wyner [2]. We call the above source coding system (the AKW system). The AKW system is shown in Figure 1.
In this figure, the AKW system corresponds to the case where the switch is closed. In Figure 1, the sequence ( X n , Y n ) represents independent copies of a pair of dependent random variables ( X , Y ) which take values in the finite sets X , Y , respectively. We assume that ( X , Y ) has a probability distribution denoted by p X Y . For each i = 1 , 2 , the encoder φ i ( n ) outputs a binary sequence which appears at a rate R i bits per input symbol. The decoder function ψ ( n ) observes φ 1 ( n ) ( X n ) and φ 2 ( n ) ( Y n ) to output a sequence Y ^ n : = ψ ( n ) ( φ 1 ( n ) ( X n ) , φ 2 ( n ) ( Y n ) ) , which is an estimation of Y n . When the switch is open, it is well known that the minimum transmission rate R 2 such that the error probability P e ( n ) : = Pr { Y n Y ^ n } of decoding tends to zero as n tends to infinity is given by H ( Y ) . Csiszár and Longo [3] proved that, if R 2 < H ( Y ) , then the correct probability P c ( n ) : = Pr { Y n = Y ^ n } of decoding decay exponentially and derived the optimal exponent function. When the switch is open and R 1 > H ( X ) , Slepian and Wolf [4] proved that H ( Y | X ) is the minimum transmission rate R 2 such that the error probability Pr { Y n Y ^ n } of decoding tends to zero as n tends to infinity. Oohama and Han [5] proved that, if R 2 < H ( Y | X ) , then the correct probability P c ( n ) : = Pr { Y n = Y ^ n } of decoding decay exponentially and derived the optimal exponent function.
In this paper, we consider the strong converse theorem in the case where the switch is closed and 0 < R 1 < H ( X ) . Let R AKW ( p X Y ) be the rate region of the AKW system. This region consists of the rate pair ( R 1 , R 2 ) such that the error provability of decoding goes to zero as n tends to infinity. The rate region was determined by Ahlswede, Körner [1] and Wyner [2]. On the converse coding theorem, Ahlswede et al. [6] proved that, if ( R 1 , R 2 ) is outside the rate region, then, P c ( n ) must tends to zero as n tends to infinity. Gu and Effors [7] examined a speed of convergence for P c ( n ) to tend to zero as n by carefully checking the proof of Ahlswede et al. [6]. However, they could not obtain a result on an explicit form of the exponent function with respect to the code length n.
Our main results on the strong converse theorem for the AKW system are as follows. For the AKW system, we prove that, if ( R 1 , R 2 ) is outside the rate region R AKW ( p X Y ) , P c ( n ) must go to zero exponentially and derive an explicit lower bound of this exponent. This result corresponds to Theorem 3. As a corollary from this theorem, we obtain the strong converse result, which is stated in Corollary 2. This result states that we have an outer bound with O ( 1 / n ) gap from the rate region R AKW ( p X Y ) .
To derive our result, we use a new method called the recursive method. This method, which is a new method introduced by the author, includes a certain recursive algorithm for a single letterization of exponent functions. In a standard argument of proving converse coding theorems, single letterization methods based on the chain rule of the entropy functions are used. In general, the functions representing multi letter characterizations of exponent functions do not have the chain rule property. In such cases, the recursive method is quite useful for deriving single letterized bounds. The recursive method is a general powerful tool to prove strong converse theorems for several coding problems in information theory. In fact, the recursive method plays important roles in deriving exponential strong converse exponent for communication systems treated in [8,9,10,11,12].
On the strong converse theorem for the one helper source coding problem, we have two recent other works [13,14]. The above two works proved the strong converse theorem using different methods from our method. In [13], Watanabe found a relationship between the AKW system and the Gray–Wyner network. Using this relationship and the second order rate region for the Gray–Wyner network obtained by him [15], Watanabe established the strong converse theorem for the AKW system. In [14], Liu et al. introduced a new method to derive sharp strong converse bounds via a reverse hypercontractivity. Using this method, they obtained an outer bound of the rate region for the AKW system with O ( 1 / n ) gap from the rate region. Furthermore, in [14], an extension of the AKW system to the case of Gaussian source and quadratic distortion is investigated, obtaining an outer bound with O ( 1 / n ) gap from the rate distortion region for the extended source coding system. In his resent paper [16], Liu showed a lower bound (converse) on the dispersion of AWK as the variance of the linear combination of information densities.
The strong converse theorems seem to be regarded just as a mathematical problem and have been investigated mainly from theoretical interest. Recently, Watanabe and Oohama [17] have found an interesting security problem, which has a close connection with the strong converse theorem for the AKW system. Furthermore, Oohama and Santoso [18] and Santoso and Oohama [19] clarify that the exponential strong converse theorem obtained by this paper plays an essential role in deriving a strong sufficient secure condition for the privacy amplification in their new theoritical model of side channel attacks to the Shannon chipher systems. From the above two cases, we expect that exponential strong converse theorems for multiterminal source networks will serve as a strong tool to several information theoretical security problems.

2. Problem Formulation

Let X and Y be finite sets and ( X t , Y t ) t = 1 be a stationary discrete memoryless source. For each t = 1 , 2 , , the random pair ( X t , Y t ) takes values in X × Y , and has a probability distribution
p X Y = p X Y ( x , y ) ( x , y ) X × Y .
We write n independent copies of X t t = 1 and Y t t = 1 , respectively as
X n = X 1 , X 2 , , X n   and   Y n = Y 1 , Y 2 , , Y n .
We consider a communication system depicted in Figure 2. This communication system corresponds to the case where the switch is closed in Figure 1. Data sequences X n and Y n are separately encoded to φ 1 ( n ) ( X n ) and φ 2 ( n ) ( Y n ) and those are sent to the information processing center. At the center, the decoder function ψ ( n ) observes ( φ 1 ( n ) ( X n ) , φ 2 ( n ) ( Y n ) ) to output the estimation Y ^ n of Y n . The encoder functions φ 1 ( n ) and φ 2 ( n ) are defined by
φ 1 ( n ) : X n M 1 = 1 , 2 , , M 1 φ 2 ( n ) : Y n M 2 = 1 , 2 , , M 2 ,
where for each i = 1 , 2 , φ i ( n ) ( = M i ) stands for the range of cardinality of φ i ( n ) . The decoder function ψ ( n ) is defined by
ψ ( n ) : M 1 × M 2 Y n .
The error probability of decoding is
P e ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) = Pr Y ^ n Y n ,
where Y ^ n = ψ ( n ) ( φ 1 ( n ) ( X n ) , φ 2 ( n ) ( Y n ) ) . A rate pair ( R 1 , R 2 ) is ε -achievable if, for any δ > 0 , there exists a positive integer n 0 = n 0 ( ε , δ ) and a sequence of triples { ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) } n n 0 such that, for n n 0 ,
1 n log φ i ( n ) R i + δ   for   i = 1 , 2 , P e ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) ε .
For ε ( 0 , 1 ) , the rate region R AKW ( ε | p X Y ) is defined by
R AKW ( ε | p X Y ) : = ( R 1 , R 2 ) : ( R 1 , R 2 ) is ε achievable   for p X Y .
Furthermore, define
R AKW ( p X Y ) : = ε ( 0 , 1 ) R AKW ( ε | p X Y ) .
We can show that the two rate regions R AKW ( ε | p X Y ) , ε ( 0 , 1 ) and R AKW ( p X Y ) satisfy the following property.
Property 1.
(a) 
The regions R AKW ( ε | p X Y ) , ε ( 0 , 1 ) , and R AKW ( p X Y ) are closed convex sets of R + 2 , where
R + 2 : = { ( R 1 , R 2 ) : R 1 0 , R 2 0 } .
(b) 
R AKW ( ε | p X Y ) has another form using ( n , ε ) -rate region R AKW ( n , ε | p X Y ) , the definition of which is as follows. We set
R AKW ( n , ε | p X Y ) = { ( R 1 , R 2 ) : T h e r e   e x i s t s ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) s u c h   t h a t 1 n log | | φ i ( n ) | | R i , i = 1 , 2 , P e ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) ε } .
Using R AKW ( n , ε | p X Y ) , R AKW ( ε | p X Y ) can be expressed as
R AKW ( ε | p X Y ) = cl m 1 n m R AKW ( n , ε | p X Y ) .
Proof of this property is given in Appendix A. It is well known that R AKW ( p X Y ) was determined by Ahlswede, Körner and Wyner. To describe their result, we introduce an auxiliary random variable U taking values in a finite set U . We assume that the joint distribution of ( U , X , Y ) is
p U X Y ( u , x , y ) = p U ( u ) p X | U ( x | u ) p Y | X ( y | x ) .
The above condition is equivalent to U X Y . Define the set of probability distribution p = p U X Y by
P ( p X Y ) : = { p U X Y : | U | | X | + 1 , U X Y } .
Set
R ( p ) : = { ( R 1 , R 2 ) : R 1 , R 2 0 R 1 I p ( X ; U ) , R 2 H p ( Y | U ) } , R ( p X Y ) : = p P ( p X Y ) R ( p ) .
We can show that the region R ( p X Y ) satisfies the following property.
Property 2.
(a) 
The region R ( p X Y ) is a closed convex subset of R + 2 .
(b) 
For any p X Y , we have
min ( R 1 , R 2 ) R ( p X Y ) ( R 1 + R 2 ) = H p ( Y ) .
The minimum is attained by ( R 1 , R 2 ) = ( 0 , H p ( Y ) ) . This result implies that
R ( p X Y ) { ( R 1 , R 2 ) : R 1 + R 2 H p ( Y ) } R + 2 .
Furthermore, the point ( 0 , H p ( Y ) ) always belongs to R ( p X Y ) .
Property 2 part a is a well known property. Proof of Property 2 part b is easy. Proofs of Property 2 parts a and b are omitted. A typical shape of the rate region R ( p X Y ) is shown in Figure 3.
The rate region R AKW ( p X Y ) was determined by Ahlswede and Körner [1] and Wyner [2]. Their results are the following.
Theorem 1
(Ahlswede, Körner [1] and Wyner [2]).
R AKW ( p X Y ) = R ( p X Y ) .
On the converse coding theorem, Ahlswede et al. [6] obtained the following.
Theorem 2
(Ahlswede et al. [6]). For each fixed ε ( 0 , 1 ) , we have
R AKW ( ε | p X Y ) = R ( p X Y ) .
Gu and Effors [7] examined a speed of convergence for P e ( n ) to tend to 1 as n by carefully checking the proof of Ahlswede et al. [6]. However, they could not obtain a result on an explicit form of the exponent function with respect to the code length n.
Our aim is to find an explicit form of the exponent function for the error probability of decoding to tend to one as n when ( R 1 , R 2 ) R AKW ( p X Y ) . To examine this quantity, we define the following quantity. Set
P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) : = 1 P e ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) , G ( n ) ( R 1 , R 2 | p X Y ) : = min ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) : ( 1 / n ) log φ i ( n ) R i , i = 1 , 2 1 n log P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) . G ( R 1 , R 2 | p X Y ) : = lim n G ( n ) ( R 1 , R 2 | p X Y ) , G ( p X Y ) : = { ( R 1 , R 2 , G ) : G G ( R 1 , R 2 | p X Y ) } .
By time sharing, we have that
G ( n + m ) n R 1 + m R 1 n + m , n R 2 + m R 2 n + m p X Y n G ( n ) ( R 1 , R 2 | p X Y ) + m G ( m ) ( R 1 , R 2 | p X Y ) n + m .
Choosing R = R in the inequality (5), we obtain the following subadditivity property on { G ( n ) ( R 1 , R 2 | p X Y ) } n 1 :
G ( n + m ) ( R 1 , R 2 | p X Y ) n G ( n ) ( R 1 , R 2 | p X Y ) + m G ( m ) ( R 1 , R 2 | p X Y ) n + m ,
from which this, and Fekete’s subadditive lemma, we have that G ( n ) ( R 1 , R 2 | p X Y ) exists and satisfies the following:
lim n G ( n ) ( R 1 , R 2 | p X Y ) = inf n 1 G ( n ) ( R 1 , R 2 | p X Y ) .
The exponent function G ( R 1 , R 2 | p X Y ) is a convex function of ( R 1 , R 2 ) . In fact, from the inequality (5), we have that for any α [ 0 , 1 ]
G ( α R 1 + α ¯ R 1 , α R 2 + α ¯ R 2 | p X Y ) α G ( R 1 , R 2 | p X Y ) + α ¯ G ( R 1 , R 2 | p X Y ) .
The region G ( p X Y ) is also a closed convex set. Our main aim is to find an explicit characterization of G ( p X Y ) . In this paper, we derive an explicit outer bound of G ( p X Y ) whose section by the plane G = 0 coincides with R AKW ( p X Y ) .

3. Main Results

In this section, we state our main result. We first explain that the region R ( p X Y ) can be expressed with a family of supporting hyperplanes. To describe this result, we define a set of probability distributions on U × X × Y by
P sh ( p X Y ) : = { p = p U X Y : | U | | X | , U X Y } .
For μ 0 , define
R ( μ ) ( p X Y ) : = min p P sh ( p X Y ) μ I p ( X ; U ) + μ ¯ H p ( Y | U ) .
Furthermore, define
R sh ( p X Y ) : = μ [ 0 , 1 ] { ( R 1 , R 2 ) : μ R 1 + μ ¯ R 2 R ( μ ) ( p X Y ) } .
Then, we have the following property.
Property 3.
(a) 
The bound | U | | X | is sufficient to describe R ( μ ) ( p X Y ) .
(b) 
For every μ [ 0 , 1 ] , we have
min ( R 1 , R 2 ) R ( p X Y ) { μ R 1 + μ ¯ R 2 } = R ( μ ) ( p X Y ) .
(c) 
For any p X Y , we have
R sh ( p X Y ) = R ( p X Y ) .
Property 3 part a is stated as Lemma A1 in Appendix B. Proof of this lemma is given in this appendix. Proofs of Property 3 parts b and c are given in Appendix C. Set
Q ( p Y | X ) : = { q = q U X Y : | U | | X | , U X Y , p Y | X = q Y | X } .
For ( μ , α ) [ 0 , 1 ] 2 , and for q = q U X Y Q ( p Y | X ) , define
ω q | p X ( μ , α ) ( x , y | u ) : = α ¯ log q X ( x ) p X ( x ) + α μ log q X | U ( x | u ) p X ( x ) + μ ¯ log 1 q Y | U ( y | u ) , f q | p X ( μ , α ) ( x , y | u ) : = exp ω q | p X ( μ , α ) ( x , y | u ) , Ω ( μ , α ) ( q | p X ) : = log E q exp ω q | p X ( μ , α ) ( X , Y | U ) , Ω ( μ , α ) ( p X Y ) : = min q Q ( p Y | X ) Ω ( μ , α ) ( q | p X ) , F ( μ , α ) ( μ R 1 + μ ¯ R 2 | p X Y ) : = Ω ( μ , α ) ( p X Y ) α ( μ R 1 + μ ¯ R 2 ) 2 + α μ ¯ , F ( R 1 , R 2 | p X Y ) : = sup ( μ , α ) [ 0 , 1 ] 2 F ( μ , α ) ( μ R 1 + μ ¯ R 2 | p X Y ) .
We next define a function serving as a lower bound of F ( R 1 , R 2 | p X Y ) . For λ 0 and for p U X Y P sh ( p X Y ) , define
ω ˜ p ( μ ) ( x , y | u ) : = μ log p X | U ( x | u ) p X ( x ) + μ ¯ log 1 p Y | U ( y | u ) , Ω ˜ ( μ , λ ) ( p ) : = log E p exp λ ω ˜ p ( μ ) ( X , Y | U ) , Ω ˜ ( μ , λ ) ( p X Y ) : = min p P sh ( p X Y ) Ω ˜ ( μ , λ ) ( p ) .
Furthermore, set
F ̲ ( μ , λ ) ( μ R 1 + μ ¯ R 2 | p X Y ) : = Ω ˜ ( μ , λ ) ( p X Y ) λ ( μ R 1 + μ ¯ R 2 ) 2 + λ ( 5 μ ) , F ̲ ( R 1 , R 2 | p X Y ) : = sup λ 0 , μ [ 0 , 1 ] F ̲ ( μ , λ ) ( μ R 1 + μ ¯ R 2 | p X Y ) .
We can show that the above functions satisfy the following property.
Property 4.
(a) 
The cardinality bound | U | | X | in Q ( p Y | X ) is sufficient to describe the quantity Ω ( μ , α ) ( p X Y ) . Furthermore, the cardinality bound | U | | X | in P sh ( p X Y ) is sufficient to describe the quantity Ω ˜ ( μ , λ ) ( p X Y ) .
(b) 
For any R 1 , R 2 0 , we have
F ( R 1 , R 2 | p X Y ) F ̲ ( R 1 , R 2 | p X Y ) .
(c) 
For any p = p U X Y P sh ( p X Y ) and any ( μ , λ ) [ 0 , 1 ] 2 , we have
0 Ω ˜ ( μ , λ ) ( p ) μ log | X | + μ ¯ log | Y | .
(d) 
Fix any p = p U X Y P sh ( p X Y ) and μ [ 0 , 1 ] . For λ [ 0 , 1 ] , we define a probability distribution p ( λ ) = p U X Y ( λ ) by
p ( λ ) ( u , x , y ) : = p ( u , x , y ) exp λ ω ˜ p ( μ ) ( x , y | u ) E p exp λ ω ˜ p ( μ ) ( X , Y | U ) .
Then, for λ [ 0 , 1 / 2 ] , Ω ˜ ( μ , λ ) ( p ) is twice differentiable. Furthermore, for λ [ 0 , 1 / 2 ] , we have
d d λ Ω ˜ ( μ , λ ) ( p ) = E p ( λ ) ω ˜ p ( μ ) ( X , Y | U ) , d 2 d λ 2 Ω ˜ ( μ , λ ) ( p ) = Var p ( λ ) ω ˜ p ( μ ) ( X , Y | U ) .
The second equality implies that Ω ˜ ( μ , λ ) ( p | p X Y ) is a concave function of λ [ 0 , 1 / 2 ] .
(e) 
For every ( μ , λ ) [ 0 , 1 ] × [ 0 , 1 / 2 ] , define
ρ ( μ , λ ) ( p X Y ) : = max ( ν , p ) [ 0 , λ ] × P sh ( p X Y ) : Ω ˜ ( μ , λ ) ( p ) = Ω ˜ ( μ , λ ) ( p X Y ) Var p ( ν ) ω ˜ p ( μ ) ( X , Y | U ) ,
and set
ρ = ρ ( p X Y ) : = max ( μ , λ ) [ 0 , 1 ] × [ 0 , 1 / 2 ] ρ ( μ , λ ) ( p X Y ) .
Then, we have ρ ( p X Y ) < . Furthermore, for any ( μ , λ ) [ 0 , 1 ] × [ 0 , 1 / 2 ] , we have
Ω ˜ ( μ , λ ) ( p X Y ) λ R ( μ ) ( p X Y ) λ 2 2 ρ ( p X Y ) .
(f) 
For every τ ( 0 , ( 1 / 2 ) ρ ( p X Y ) ) , the condition ( R 1 + τ , R 2 + τ ) R ( p X Y ) implies
F ̲ ( R 1 , R 2 | p X Y ) > ρ ( p X Y ) 4 · g 2 τ ρ ( p X Y ) > 0 ,
where g is the inverse function of ϑ ( a ) : = a + ( 5 / 4 ) a 2 , a 0 .
Property 3 part a is stated as Lemma A2 in Appendix B. Proof of this lemma is given in this appendix. Proof of Property 4 part b is given in Appendix D. Proofs of Property 4 parts c, d, e, and f are given in Appendix E.
Our main result is the following.
Theorem 3.
For any R 1 , R 2 0 , any p X Y , and for any ( φ 1 ( n ) , φ 1 ( n ) , ψ ( n ) ) satisfying ( 1 / n ) log | | φ i ( n ) | | R i , i = 1 , 2 , we have
P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) 5 exp n F ( R 1 , R 2 | p X Y ) .
It can be seen from Property 4 parts b and f that F ( R 1 , R 2 | p X Y ) is strictly positive if ( R 1 , R 2 ) is outside the rate region R ( p X Y ) . Hence, by Theorem 3, we have that, if ( R 1 , R 2 ) is outside the rate region, then the error probability of decoding goes to one exponentially and its exponent is not below F ( R 1 , R 2 | p X Y ) . It immediately follows from Theorem 3 that we have the following corollary.
Corollary 1.
G ( R 1 , R 2 | p X Y ) F ( R 1 , R 2 | p X Y ) , G ( p X Y ) G ¯ ( p X Y ) = ( R 1 , R 2 , G ) : G F ( R 1 , R 2 | p X Y ) .
Proof of Theorem 3 will be given in the next section. The exponent function at rates outside the rate region was derived by Oohama and Han [5] for the separate source coding problem for correlated sources [4]. The techniques used by them is a method of types [21], which is not useful to prove Theorem 3. Some novel techniques based on the information spectrum method introduced by Han [22] are necessary to prove this theorem.
From Theorem 3 and Property 4 part e, we can obtain an explicit outer bound of R AKW ( ε | p X Y ) with an asymptotically vanishing deviation from R AKW ( p X Y ) = R ( p X Y ) . The strong converse theorem established by Ahlswede et al. [6] immediately follows from this corollary. To describe this outer bound, for κ > 0 , we set
R ( p X Y ) κ ( 1 , 1 ) : = { ( R 1 κ , R 2 κ ) : ( R 1 , R 2 ) R ( p X Y ) } ,
which serves as an outer bound of R ( p X Y ) . For each fixed ε ( 0 , 1 ) , we define κ n = κ n ( ε , ρ ( p X Y ) ) by
κ n : = ρ ( p X Y ) ϑ 4 n ρ ( p X Y ) log 5 1 ε = ( a ) 2 ρ ( p X Y ) n log 5 1 ε + 5 n log 5 1 ε .
Step (a) follows from ϑ ( a ) = a + ( 5 / 4 ) a 2 . Since κ n 0 as n , we have the smallest positive integer n 0 = n 0 ( ε , ρ ( p X Y ) ) such that κ n ( 1 / 2 ) ρ ( p X Y ) for n n 0 . From Theorem 3 and Property 4 part e, we have the following corollary.
Corollary 2.
For each fixed ε ( 0 , 1 ) , we choose the above positive integer n 0 = n 0 ( ε , ρ ( p X Y ) ) . Then, for any n n 0 , we have
R AKW ( n , ε | p X Y ) R ( p X Y ) κ n ( 1 , 1 ) .
The above result together with
R AKW ( ε | p X Y ) = cl m 1 n m R AKW ( n , ε | p X Y )
yields that, for each fixed ε ( 0 , 1 ) , we have
R AKW ( ε | p X Y ) = R AKW ( p X Y ) = R ( p X Y ) .
This recovers the strong converse theorem proved by Ahlswede et al. [6].
Proof of this corollary will be given in the next section.

4. Proof of the Main Result

Let ( X n , Y n ) be a pair of random variables from the information source. We set S = φ 1 ( n ) ( X n ) . Joint distribution p S X n Y n of ( S , X n , Y n ) is given by
p S X n Y n ( s , x n , y n ) = p S | X n ( s | x n ) t = 1 n p X t Y t ( x t , y t ) .
It is obvious that S X n Y n . Then, we have the following lemma, which is well known as a single shot infomation spectrum bound.
Lemma 1.
For any η > 0 and for any ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) satisfying ( 1 / n ) log | | φ i ( n ) | | R i , i = 1 , 2 , we have
P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) p S X n Y n { 0 1 n log q ^ S X n Y n ( S , X n , Y n ) p S X n Y n ( S , X n , Y n ) η ,
0 1 n log Q X n ( X n ) p X n ( X n ) η ,
R 1 1 n log Q ˜ X n | S ( X n | S ) p X n ( X n ) η ,
R 2 1 n log 1 p Y n | S ( Y n | S ) η + 4 e n η .
The probability distributions appearing in the three inequalities (12), (13), and (14) in the right members of (15) have a property that we can select them as arbitrary. In (12), we can choose any probability distribution q ^ S X n Y n on S × X n × Y n . In (13), we can choose any distribution Q X n on X n . In (14), we can choose any stochastic matrix Q ˜ X n | U n : X n U n .
This lemma can be proved by a standard argument in the information spectrum method [22]. The detail of the proof is given in Appendix F. Next, we single letterize the four information spectrum quantities inside the first term in the right members of (15) in Lemma 1 to obtain the following lemma.
Lemma 2.
For any η > 0 and for any ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) satisfying ( 1 / n ) log | | φ i ( n ) | | R i , i = 1 , 2 , we have
P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) p S X n Y n 0 1 n t = 1 n log Q X t ( X t ) p X t ( X t ) η ,
R 1 1 n t = 1 n log Q ˜ X t | S X t 1 ( X t | S , X t 1 ) p X t ( X t ) η ,
R 2 1 n t = 1 n log 1 p Y t | S X t 1 Y t 1 ( Y t | S , X t 1 , Y t 1 ) 2 η + 4 e n η ,
where for each t = 1 , 2 , , n , the probability distribution Q X t on X appearing in (16) and the stochastic matrix Q ˜ X t | S X t 1 : M 1 × X t 1 X appearing in (17) have a property that we can choose their values arbitrary.
Proof. 
In (12) in Lemma 1, we choose q ^ S X n Y n having the form
q ^ S X n Y n ( S , X n , Y n ) = p S ( S ) t = 1 n p X t | S X t 1 Y t ( X t | S , X t 1 , Y t ) p Y t | S Y t 1 ( Y t | S , Y t 1 ) .
In (13) in Lemma 1, we choose Q X n having the form
Q X n ( X n ) = t = 1 n Q X t ( X t ) .
We further note that
Q ˜ X n | S ( X n | S ) p X n ( X n ) = t = 1 n Q ˜ X t | S X t 1 ( X t | S , X t 1 ) p X t ( X t ) , p Y n | S ( Y n | S ) = t = 1 n p Y t | S Y t 1 ( Y t | S , Y t 1 ) .
Then, the bound (15) in Lemma 1 becomes
P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) p S X n Y n { 0 1 n t = 1 n log p Y t | S Y t 1 ( Y t | S , Y t 1 ) p Y t | S X t 1 Y t 1 ( Y t | S , X t 1 , Y t 1 ) η , 0 1 n t = 1 n log Q X t ( X t ) p X t ( X t ) η , R 1 1 n t = 1 n log Q ˜ X t | S X t 1 ( X t | S , X t 1 ) p X t ( X t ) η , R 2 1 n t = 1 n 1 p Y t | S Y t 1 ( Y t | S , Y t 1 ) η + 4 e n η p S X n Y n 0 1 n t = 1 n log Q X t ( X t ) p X t ( X t ) η , R 1 1 n t = 1 n log Q ˜ X t | S X t 1 ( X t | S , X t 1 ) p X t ( X t ) η , R 2 1 n t = 1 n log 1 p Y t | S X t 1 Y t 1 ( Y t | S , X t 1 , Y t 1 ) 2 η + 4 e n η ,
completing the proof. □
As in the standard converse coding argument, we identify auxiliary random variables, based on the bound in Lemma 2. The following lemma is necessary for such identification.
Lemma 3.
Suppose that, for each t = 1 , 2 , , n , the joint distribution p S X t Y t of the random vector S X t Y t is a marginal distribution of p S X n Y n . Then, we have the following Markov chain:
S X t 1 X t Y t
or equivalently that I ( Y t ; S X t 1 | X t ) = 0 . Furthermore, we have the following Markov chain:
Y t 1 S X t 1 ( X t , Y t )
or equivalently that I ( X t Y t ; Y t 1 | S X t 1 ) = 0 . The above two Markov chains are equivalent to the following one long Markov chain:
Y t 1 S X t 1 X t Y t .
Proof of this lemma is given in Appendix G. For t = 1 , 2 , , n , set U t : = M 1 × X t 1 . Define a random variable U t U t by U t : = ( S , X t 1 ) . From Lemmas 2 and 3, we identify auxiliary random variables to obtain the following lemma.
Lemma 4.
For any η > 0 and for any ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) satisfying ( 1 / n ) log | | φ i ( n ) | | R i , i = 1 , 2 , we have
P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) p S X n Y n 0 1 n t = 1 n log Q X t ( X t ) p X t ( X t ) η ,
R 1 1 n t = 1 n log Q ˜ X t | U t ( X t | U t ) p X t ( X t ) η ,
R 2 1 n t = 1 n log 1 p Y t | U t ( Y t | U t ) 2 η + 4 e n η ,
where, for each t = 1 , 2 , , n , the probability distribution Q X t on X appearing in (21) and the stochastic matrix Q ˜ X t | U t : U t X appearing in (22) have a property that we can choose their values arbitrary.
Now, the challenge is that, although the quantities inside the first term in the right members of (23) in Lemma 4 have n sum of information spectrum quantities, the measure p S X n Y n does not have an i.i.d. structure in general. To resolve this, we first use the large deviation theory to upper bound the first quantity in the right members of (23). For each t = 1 , 2 , , n , set Q ̲ t : = ( Q X t , Q ˜ X t | U t ) . Let Q ̲ t be a set of all Q ̲ t . We define a quantity which serves as an exponential upper bound of P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) . Let P ( n ) ( p X Y ) be a set of all probability distributions p S X n Y n on M 1 × X n × Y n having a form:
p S X n Y n ( s , x n , y n ) = p S | X n ( s | x n ) t = 1 n p X Y ( x t , y t )   for   ( s , x n , y n ) M 1 × X n × Y n .
For simplicity of notation, we use the notation p ( n ) for p S X n Y n P ( n ) ( p X Y ) . For each t = 1 , 2 , , n , p U t X t Y t = p S X t Y t is a marginal distribution of p ( n ) . For t = 1 , 2 , , n , we simply write p t = p U t X t Y t . For μ [ 0 , 1 ] , α [ 0 , 1 ) , p ( n ) P ( n ) ( p X Y ) , and Q ̲ n Q n , we define
Ω ( μ , α ) ( p ( n ) , Q ̲ n | p X Y ) : = log E p ( n ) t = 1 n p X t α ¯ ( X t ) Q X t α ¯ ( X t ) p X t μ α ( X t ) p Y t | U t μ α ( Y t | U t ) Q ˜ X t | U t μ α ( X t | U t ) ,
where for each t = 1 , 2 , , n , the probability distribution Q X t and the conditional probability distribution Q ˜ X t | U t appearing in the definition of Ω ( μ , θ ) ( p ( n ) , Q ̲ n ) can be chosen as arbitrary.
The following is well known as the Cramèr’s bound in the large deviation principle.
Lemma 5.
For any real valued random variable Z and any α 0 , we have
Pr { Z a } exp α a log E [ exp ( α Z ) ] .
By Lemmas 4 and 5, we have the following proposition.
Proposition 1.
For any ( μ , α ) [ 0 , 1 ] 2 any Q ̲ n Q ̲ n , and any ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) satisfying ( 1 / n ) log | | φ i ( n ) | | R i , i = 1 , 2 , there exists p ( n ) P ( n ) ( W 1 , W 2 ) such that
P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) 5 exp n 2 + α μ ¯ 1 1 n Ω ( μ , α ) ( p ( n ) , Q ̲ n | p X Y ) α ( μ R 1 + μ ¯ R 2 ) .
Proof. 
By Lemma 4, for ( μ , α ) [ 0 , 1 ] 2 , we have the following chain of inequalities:
P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) p S X n Y n 0 1 n t = 1 n log Q X t α ¯ ( X t ) p X t α ¯ ( X t ) α ¯ η , α μ R 1 1 n t = 1 n log Q ˜ X t | U t α μ ( X t | U t ) p X t α μ ( X t ) α μ η , α μ ¯ R 2 1 n t = 1 n log 1 p Y t | U t α μ ¯ ( Y t | U t ) 2 α μ ¯ η + 4 e n η p S X n Y n α ( μ R 1 + μ ¯ R 2 ) + ( 1 + α μ ¯ ) η 1 n t = 1 n log p X t α ¯ ( X t ) Q X t α ¯ ( X t ) p X t μ α ( X t ) p Y t | U t μ ¯ α ( Y t | U t ) Q ˜ X t | U t μ α ( X t | U t ) + 4 e n η = p S X n Y n 1 n t = 1 n log p X t α ¯ ( X t ) Q X t α ¯ ( X t ) p X t μ α ( X t ) p Y t | U t α ( Y t | U t ) Q ˜ X t | U t μ α ( X t | U t ) α ( μ R 1 + μ ¯ R 2 ) + ( 1 + α μ ¯ ) η + 4 e n η ( a ) exp n α ( μ R 1 + μ ¯ R 2 ) + ( 1 + α μ ¯ ) η 1 n Ω ( μ , α ) ( p ( n ) , Q ̲ n | p X Y ) + 4 e n η .
Step (a) follows from Lemma 5. When Ω ( μ , α ) ( p ( n ) , Q ̲ n | p X Y ) n α ( μ R 1 + μ ¯ R 2 ) , the bound we wish to prove is obvious. In the following argument, we assume that Ω ( μ , α ) ( p ( n ) , Q ̲ n | p X Y ) > n α ( μ R 1 + μ ¯ R 2 ) . We choose η so that
η = α ( μ R 1 + μ ¯ R 2 ) + ( 1 + α μ ¯ ) η 1 n Ω ( μ , α ) ( p ( n ) , Q ̲ n | p X Y ) .
Solving (25) with respect to η , we have
η = ( 1 / n ) Ω ( μ , α ) ( p ( n ) , Q ̲ n | p X Y ) α ( μ R 1 + μ ¯ R 2 ) 2 + α μ ¯ .
For this choice of η and (24), we have
P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) 5 e n η = 5 exp n 2 + α μ ¯ 1 1 n Ω ( μ , α ) ( p ( n ) , Q ̲ n | p X Y ) α ( μ R 1 + μ ¯ R 2 ) ,
completing the proof. □
Set
Ω ̲ ( μ , α ) ( p X Y ) : = inf n 1 min p ( n ) P ( n ) max Q ̲ n Q ̲ n 1 n Ω ( μ , α ) ( p ( n ) , Q ̲ n | p X Y ) .
By Proposition 1, we have the following corollary.
Corollary 3.
For any ( μ , α ) [ 0 , 1 ] 2 and any ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) satisfying ( 1 / n ) log | | φ i ( n ) | | R i , i = 1 , 2 , we have
P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) 5 exp n Ω ̲ ( μ , α ) ( p X Y ) α ( μ R 1 + μ ¯ R 2 ) 2 + α μ ¯ .
We shall call Ω ̲ ( μ , α ) ( p X Y ) the communication potential. The above corollary implies that the analysis of Ω ̲ ( μ , α ) ( p X Y ) leads to an establishment of a strong converse theorem for the one helper source coding problem. Note here that Ω ̲ ( μ , α ) ( p X Y ) is still a multi letter quantity. However, we successfully single letterize this quantity. This result which will be stated later in Proposition 2 is a mathematical core of our main result.
In the following argument, we drive an explicit lower bound of Ω ̲ ( μ , α ) ( p X Y ) . For each t = 1 , 2 , , n , set u t = ( s , x t 1 ) U t and
F t : = ( p X t , p X t Y t | U t , Q ̲ t ) , F t : = { F i } i = 1 t .
For t = 1 , 2 , , n , define a function of ( u t , x t , y t ) U t × X × Y by
f F t ( μ , α ) ( x t , y t | u t ) : = p X t α ¯ ( x t ) Q X t α ¯ ( x t ) p X t μ α ( x t ) p Y t | U t α ( y t | u t ) Q ˜ X t | U t μ α ( x t | u t ) .
By definition, we have
exp Ω ( μ , α ) ( p ( n ) , Q ̲ n | p X Y ) = s , x n , y n p S X n Y n ( s , x n , y n ) t = 1 n f F t ( μ , α ) ( x t , y t | u t ) .
For each t = 1 , 2 , , n , we define the probability distribution
p S X t Y t ; F t ( μ , α ) : = p S X t Y t ; F t ( μ , α ) ( s , x t , y t ) ( s , x t , y t ) M 1 × X t × Y t
by
p S X t Y t ; F t ( μ , α ) ( s , x t , y t ) : = C t 1 p S X t Y t ( s , x t , y t ) i = 1 t f F i ( μ , α ) ( x i , y i | u i ) ,
where
C t : = s , x t , y t p S X t Y t ( s , x t , y t ) i = 1 t f F i ( μ , α ) ( x i , y i )
are constants for normalization. For t = 1 , 2 , , n , define
Φ t ( μ , α ) : = C t C t 1 1 ,
where we define C 0 = 1 . Then, we have the following lemma.
Lemma 6.
For each t = 1 , 2 , , n , and for any ( s , x t , y t ) M 1 × X t × Y t , we have
p S X t Y t ; F t ( μ , α ) ( s , x t , y t ) = ( Φ t ( μ , α ) ) 1 p S X t 1 Y t 1 ; F t 1 ( μ , α ) ( s , x t 1 , y t 1 ) p X t Y t | S X t 1 Y t 1 ( x t , y t | s , x t 1 , y t 1 ) f F t ( μ , α ) ( x t , y t | u t ) .
Furthermore, we have
Φ t ( μ , α ) = s , x t , y t p S X t 1 Y t 1 ; F t 1 ( μ , α ) ( s , x t 1 , y t 1 ) p X t Y t | S X t 1 Y t 1 ( x t , y t | s , x t 1 , y t 1 ) f F t ( μ , α ) ( x t , y t | u t ) .
Proof of this lemma is given in Appendix H. Define
p U t ; F t 1 ( μ , α ) ( u t ) = p S X t 1 ; F t 1 ( μ , α ) ( s , x t 1 ) : = y t 1 p S X t 1 Y t 1 ; F t 1 ( μ , α ) ( s , x t 1 , y t 1 ) .
Then, we have the following lemma, which is a key result to derive a single letterized lower bound of Ω ̲ ( μ , α ) ( p X Y ) .
Lemma 7.
For any p ( n ) P ( n ) ( p X Y ) and any Q ̲ n Q ̲ n , we have
Ω ( μ , α ) ( p ( n ) , Q ̲ n | p X Y ) = ( 1 ) t = 1 n log Φ t ( μ , α ) ,
Φ t ( μ , α ) = u t , x t , y t p U t ; F t 1 ( μ , α ) ( u t ) p X t | U t ( x t | u t ) p Y t | X t ( y t | x t ) f F t ( μ , α ) ( x t , y t | u t ) .
Proof. 
We first prove (29). From (26), we have
log Φ t ( μ , α ) = log C t + log C t 1 .
Furthermore, by definition, we have
Ω ( μ , α ) ( p ( n ) , Q ̲ n | p X Y ) = log C n , C 0 = 1 .
From (31) and (32), (29) is obvious. We next prove (30). We first observe that for ( s , x t , y t ) S × X t × Y t and for t = 1 , 2 , , n ,
p X t Y t | S X t 1 Y t 1 ( x t , y t | s , x t 1 , y t 1 ) = p X t | S X t 1 Y t 1 ( x t | s , x t 1 , y t 1 ) p Y t | S X t Y t 1 ( y t | s , x t , y t 1 ) = ( a ) p X t | S X t 1 ( x t | s , x t 1 ) p Y t | X t ( y t | x t ) .
Step (a) follows from Lemma 3. Then, by Lemma 6, we have
Φ t ( μ , α ) = s , x t , y t p S X t 1 Y t 1 ; F t 1 ( μ , α ) ( s , x t 1 , y t 1 ) p X t Y t | S X t 1 Y t 1 ( x t , y t | s , x t 1 , y t 1 ) f F t ( μ , α ) ( x t , y t | u t ) = s , x t , y t y t 1 p S X t 1 Y t 1 ; F t 1 ( μ , α ) ( s , x t 1 , y t 1 ) p X t | S X t 1 ( x t | s , x t 1 ) p Y t | X t ( y t | x t ) f F t ( μ , α ) ( x t , y t | u t ) = s , x t , y t p S X t 1 ; F t 1 ( μ , α ) ( s , x t 1 ) p X t | S X t 1 ( x t | s , x t 1 ) p Y t | X t ( y t | x t ) f F t ( μ , α ) ( x t , y t | u t ) ,
completing the proof. □
The following proposition is a mathematical core to prove our main result.
Proposition 2.
For any μ [ 0 , 1 ] and any α 0 , we have
Ω ̲ ( μ , α ) ( p X Y ) Ω ( μ , α ) ( p X Y ) .
Proof. 
Set
Q n ( p Y | X ) : = { q = q U X Y : | U | | M 1 | | X n 1 | | Y n 1 | , q Y | X = p Y | X , U X Y } , Ω ^ n ( μ , α ) ( p X Y ) : = min q Q n ( p Y | X ) Ω ( μ , α ) ( q | p X Y ) .
For each t = 1 , 2 , , n , we define q t = q U t X t Y t Z t by
q U t ( u t ) = p U t ; F t 1 ( μ , α ) ( u t ) , q X t Y t | U t ( x t , y t | u t ) = p X t | U t ( x t | u t ) p Y | X ( y t | x t ) .
Equation (33) implies that q t = q U t X t Y t Q n ( p Y | X ) . Furthermore, for each t = 1 , 2 , , n , we choose Q ̲ t = ( Q X t , Q ˜ X t | U t ) appearing in
f F t ( μ , α ) ( x t , y t | u t ) = p X t α ¯ ( x t ) Q X t α ¯ ( x t ) p X t μ α ( x t ) p Y t | U t α ( y t | u t ) Q ˜ X t | U t μ α ( x t | u t )
such that Q ̲ t = ( Q X t , Q ˜ X t | U t ) = ( q X t , q X t | U t ) . For this choice of Q ̲ t , we have the following chain of inequalities:
Φ t ( μ , α ) = ( a ) E q t f F t ( μ , θ ) ( X t , Y t | U t ) = ( b ) E q t p X t α ¯ ( X t ) q X t α ¯ ( X t ) p X t μ α ( X t ) p Y t | U t α ( Y t | U t ) q X t | U t μ α ( X t | U t ) = E q t f q t | p X t ( μ , α ) ( X t , Y t | U t ) = exp Ω ( μ , α ) ( q t | p X t ) = ( c ) exp Ω ( μ , α ) ( q t | p X ) ( d ) exp Ω ^ n ( μ , α ) ( p X Y ) = ( e ) exp Ω ( μ , α ) ( p X Y ) .
Step (a) follows from Lemma 7 and (33). Step (b) follows from the choice ( Q X t , Q ˜ X t | U t ) = ( q X t , q X t | U t ) of ( Q X t , Q ˜ X t | U t ) for t = 1 , 2 , , n . Step (c) follows from p X t = p X for t = 1 , 2 , , n . Step (d) follows from q t Q n ( p Y | X ) and the definition of Ω ^ n ( μ , α ) ( p X Y ) . Step (e) follows from Property 4 part a. Hence, we have the following:
max Q ̲ n Q ̲ n 1 n Ω ( μ , α ) ( p ( n ) , Q ̲ n | p X Y ) 1 n Ω ( μ , α ) ( p ( n ) , Q ̲ n | p X Y ) = ( a ) 1 n t = 1 n log Φ t ( μ , α ) ( b ) Ω ( μ , α ) ( p X Y ) .
Step (a) follows from Lemma 7. Step (b) follows from (34). Since (35) holds fo any n 1 and any p S X n Y n satisfying S X n Y n , we have that, for any ( μ , α ) [ 0 , 1 ] 2 ,
Ω ̲ ( μ , α ) ( p X Y ) Ω ( μ , α ) ( p X Y ) .
Thus, Proposition 2 is proved. □
Proof of Theorem 3.
For any ( μ , α ) [ 0 , 1 ] 2 , for any R 1 , R 2 0 and for any ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) satisfying ( 1 / n ) log | | φ i ( n ) | | R i , i = 1 , 2 , we have the following:
1 n log 5 P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) ( a ) Ω ̲ ( μ , α ) ( p X Y ) α ( μ R 1 + μ ¯ R 2 ) 2 + α μ ¯ ( b ) Ω ( μ , α ) ( p X Y ) α ( μ R 1 + μ ¯ R 2 ) 2 + α μ ¯ = F ( μ , α ) ( μ R 1 + μ ¯ R 2 | p X Y ) .
Step (a) follows from Corollary 3. Step (b) follows from Proposition 2. Since the above bound holds for any μ [ 0 , 1 ] and any α 0 , we have
1 n log 5 P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) F ( R 1 , R 2 | p X Y ) .
Thus, (10) in Theorem 3 is proved. □
Proof. of Corollary 2.
Since g is an inverse function of ϑ , the definition (11) of κ n is equivalent to
g κ n ρ ( p X Y ) = 4 n ρ ( p X Y ) log 5 1 ε .
By the definition of n 0 = n 0 ( ε , ρ ( p X Y ) ) , we have that κ n ( 1 / 2 ) ρ ( p X Y ) for n n 0 . We assume that, for n n 0 , ( R 1 , R 2 ) R AKW ( n , ε | p X Y ) . Then, there exists a sequence { ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) } n n 0 such that, for n n 0 , we have
1 n log | | φ i ( n ) | | R i , i = 1 , 2 , 1 ε P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) .
Then, by Theorem 3, we have
1 ε P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , ψ ( n ) ) 5 exp n F ( R 1 , R 2 | p X Y )
for any n n 0 ( ε , ρ ( p X Y ) ) . From (38), we have that for n n 0 ( ε , ρ ( p X Y ) ) ,
F ( R 1 , R 2 | p X Y ) 1 n log 5 1 ε = ( a ) ρ ( p X Y ) 4 · g 2 κ n ρ ( p X Y ) .
Step (a) follows from (36). Hence, by Property 4 part e, we have that, under κ n ( 1 / 2 ) ρ ( p X Y ) , the inequality (39) implies
( R 1 , R 2 ) R ( p X Y ) + κ n ( 1 , 1 ) .
Since (40) holds for any n n 0 and ( R 1 , R 2 ) R AKW ( n , ε | p X Y ) , we have
R AKW ( n , ε | p X Y ) R ( p X Y ) + κ n ( 1 , 1 )   for   n n 0 ,
completing the proof. □

5. One Helper Problem Studied by Wyner

We consider a communication system depicted in Figure 4. Data sequences X n , Y n , and Z n , respectively are separately encoded to φ 1 ( n ) ( X n ) , φ 2 ( n ) ( Y n ) , and φ 3 ( n ) ( Z n ) . The encoded data φ 1 ( n ) ( X n ) and φ 2 ( n ) ( Y n ) are sent to the information processing center 1. The encoded data φ 1 ( n ) ( X n ) and φ 3 ( n ) ( Z n ) are sent to the information processing center 2. At center 1, the decoder function ψ ( n ) observes ( φ 1 ( n ) ( X n ) , φ 2 ( n ) ( Y n ) ) to output the estimation Y ^ n of Y n . At center 2, the decoder function ϕ ( n ) observes ( φ 1 ( n ) ( X n ) , φ 3 ( n ) ( Z n ) ) to output the estimation Z ^ n of Z n . The error probability of decoding is
P e ( n ) ( φ 1 ( n ) , φ 2 ( n ) , φ 3 ( n ) , ψ ( n ) , ϕ ( n ) ) = Pr Y ^ n Y n o r Z ^ n Z n ,
where Y ^ n = ψ ( n ) ( φ 1 ( n ) ( X n ) , φ 2 ( n ) ( Y n ) ) and Z ^ n = ψ ( n ) ( φ 1 ( n ) ( X n ) , φ 3 ( n ) ( Z n ) ) .
A rate triple ( R 1 , R 2 , R 3 ) is ε -achievable if, for any δ > 0 , there exist a positive integer n 0 = n 0 ( ε , δ ) and a sequence of three encoders and two decoder functions { ( φ 1 ( n ) , φ 2 ( n ) , φ 3 ( n ) , ψ ( n ) , ϕ ( n ) ) } n n 0 such that, for n n 0 ( ε , δ ) ,
1 n log | | φ i ( n ) | | R i + δ f o r i = 1 , 2 , 3 , P e ( n ) ( φ 1 ( n ) , φ 2 ( n ) , φ 3 ( n ) , ψ ( n ) , ϕ ( n ) ) ε .
The rate region R W ( ε | p X Y Z ) is defined by
R W ( ε | p X Y Z ) : = { ( R 1 , R 2 , R 3 ) : ( R 1 , R 2 , R 3 ) i s ε achievable   for p X Y Z } .
Furthermore, define
R W ( p X Y Z ) : = ε ( 0 , 1 ) R W ( ε | p X Y Z ) .
We can show that the two rate regions R W ( ε | p X Y Z ) , ε ( 0 , 1 ) and R W ( p X Y Z ) satisfy the following property.
Property 5.
(a) 
The regions R W ( ε | p X Y Z ) , ε ( 0 , 1 ) , and R W ( p X Y Z ) are closed convex sets of R + 3 .
(b) 
We set
R W ( n , ε | p X Y Z ) = { ( R 1 , R 2 , R 3 ) : T h e r e   e x i s t s ( φ 1 ( n ) , φ 2 ( n ) , φ 3 ( n ) , ψ ( n ) ) s u c h   t h a t 1 n log | | φ i ( n ) | | R i , i = 1 , 2 , 3 , P e ( n ) ( φ 1 ( n ) , φ 2 ( n ) , φ 3 ( n ) , ψ ( n ) ) ε } ,
which is called the ( n , ε ) -rate region. Using R W ( n , ε | p X Y Z ) , R W ( ε | p X Y Z ) can be expressed as
R W ( ε | p X Y Z ) = cl m 1 n m R W ( n , ε | p X Y Z ) .
It is well known that R W ( p X Y Z ) was determined by Wyner. To describe his result, we introduce an auxiliary random variable U taking values in a finite set U . We assume that the joint distribution of ( U , X , Y , Z ) is
p U X Y ( u , x , y , z ) = p U ( u ) p X | U ( x | u ) p Y Z | X ( y , z | x ) .
The above condition is equivalent to U X Y Z . Define the set of probability distribution on U × X × Y × Z by
P ( p X Y Z ) : = { p = p U X Y Z : | U | | X | + 2 , U X Y Z } .
Set
R ( p ) : = { ( R 1 , R 2 , R 3 ) : R 1 , R 2 , R 3 0 , R 1 I p ( X ; U ) , R 2 H p ( Y | U ) , R 3 H p ( Z | U ) } , R ( p X Y Z ) : = p P ( p X Y Z ) R ( p ) .
We can show that the region R ( p X Y Z ) satisfies the following property.
Property 6.
(a) 
The region R ( p X Y Z ) is a closed convex subset of R + 3 .
(b) 
For any p X Y Z , and any γ [ 0 , 1 ] , we have
min ( R 1 , R 2 , R 3 ) R ( p X Y ) ( R 1 + γ ¯ R 2 + γ R 3 ) = γ ¯ H p ( Y ) + γ H p ( Z ) .
The minimun is attained by ( R 1 , R 2 , R 3 ) = ( 0 , H p ( Y ) , H p ( Z ) ) . This result implies that
R ( p X Y Z ) γ [ 0 , 1 ] { ( R 1 , R 2 , R 3 ) : R 1 + γ ¯ R 2 + γ R 3 γ ¯ H p ( Y ) + γ H p ( Z ) } R + 3 .
Furthermore, the point ( 0 , H p ( Y ) , H p ( Z ) ) always belongs to R ( p X Y Z ) .
The rate region R W ( p X Y Z ) was determined by Wyner [2]. His result is the following.
Theorem 4
(Wyner [2]).
R W ( p X Y Z ) = R ( p X Y Z ) .
On the strong converse theorem, Csiszár and Körner [21] obtained the following.
Theorem 5
(Csiszár and Körner [21]). For each fixed ε ( 0 , 1 ) , we have
R W ( ε | p X Y Z ) = R ( p X Y Z ) .
To examine a rate of convergence for the error probability of decoding to tend to one as n for ( R 1 , R 2 , R 3 ) R W ( p X Y Z ) , we define the following quantity. Set
P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , φ 3 ( n ) , ψ ( n ) , ϕ ( n ) ) : = 1 P e ( n ) ( φ 1 ( n ) , φ 2 ( n ) , φ 3 ( n ) , ψ ( n ) , ϕ ( n ) ) , G ( n ) ( R 1 , R 2 , R 3 | p X Y Z ) : = min ( φ 1 ( n ) , φ 2 ( n ) , φ 3 ( n ) , ψ ( n ) , ϕ ( n ) ) : ( 1 / n ) log φ i ( n ) R i , i = 1 , 2 , 3 1 n log P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , φ 3 ( n ) , ψ ( n ) , ϕ ( n ) ) , G ( R 1 , R 2 , R 3 | p X Y Z ) : = lim n G ( n ) ( R 1 , R 2 , R 3 | p X Y Z ) , G ( p X Y Z ) : = { ( R 1 , R 2 , R 3 , G ) : G G ( R 1 , R 2 , R 3 | p X Y Z ) } .
By time sharing, we have that
G ( n + m ) n R 1 + m R 1 n + m , n R 2 + m R 2 n + m , n R 2 + m R 2 n + m p X Y Z n G ( n ) ( R 1 , R 2 , R 3 | p X Y Z ) + m G ( m ) ( R 1 , R 2 , R 3 | p X Y Z ) n + m .
Choosing R = R in (42), we obtain the following subadditivity property on { G ( n ) ( R 1 , R 2 , R 3 | p X Y Z ) } n 1 :
G ( n + m ) ( R 1 , R 2 , R 3 | p X Y Z ) n G ( n ) ( R 1 , R 2 , R 3 | p X Y Z ) + m G ( m ) ( R 1 , R 2 , R 3 | p X Y Z ) n + m ,
from which we have that G ( R 1 , R 2 , R 3 | p X Y Z ) exists and satisfies the following:
G ( R 1 , R 2 , R 3 | p X Y Z ) = inf n 1 G ( n ) ( R 1 , R 2 , R 3 | p X Y Z ) .
The exponent function G ( R 1 , R 2 , R 3 | p X Y Z ) is a convex function of ( R 1 , R 2 , R 3 ) . In fact, by time sharing, we have that
G ( n + m ) n R 1 + m R 1 n + m , n R 2 + m R 2 n + m , n R 2 + m R 2 n + m p X Y Z n G ( n ) ( R 1 , R 2 , R 3 | p X Y Z ) + m G ( m ) ( R 1 , R 2 , R 3 | p X Y Z ) n + m ,
from which we have that for any α [ 0 , 1 ]
G ( α R 1 + α ¯ R 1 , α R 2 + α ¯ R 2 , α R 3 + α ¯ R 3 | p X Y Z ) α G ( R 1 , R 2 , R 3 | p X Y Z ) + α ¯ G ( R 1 , R 2 , R 3 | p X Y Z ) .
The region G ( p X Y Z ) is also a closed convex set. Our main aim is to find an explicit characterization of G ( p X Y Z ) . In this paper, we derive an explicit outer bound of G ( p X Y Z ) whose section by the plane G = 0 coincides with R W ( p X Y Z ) . We first explain that the region R ( p X Y Z ) has another expression using the supporting hyperplane. We define two sets of probability distributions on U × X × Y × Z by
P sh ( p X Y Z ) : = { p = p U X Y Z : | U | | X | , U X Y Z } , Q ( p Y Z | X ) : = { q = q U X Y Z : | U | | X | , p Y Z | X = q Y Z | X , U X Y Z } .
For ( μ , γ ) [ 0 , 1 ] 2 , set
R ( μ , γ ) ( p X Y Z ) : = max p P sh ( p X Y Z ) μ I p ( X ; U ) + μ ¯ ( γ ¯ H p ( Y | U ) + γ H p ( Z | U ) ) .
Furthermore, define
R sh ( p X Y Z ) = ( μ , γ ) [ 0 , 1 ] 2 { ( R 1 , R 2 , R 3 ) : μ R 1 + μ ¯ ( γ ¯ R 2 + γ R 3 ) R ( μ , γ ) ( p X Y Z ) } .
Then, we have the following property.
Property 7.
(a) 
The bound | U | | X | is sufficient to describe R ( μ ) ( p X Y Z ) .
(b) 
For every ( μ , γ ) [ 0 , 1 ] 2 , we have
min ( R 1 , R 2 , R 3 ) R ( p X Y Z ) { μ R 1 + μ ¯ ( γ ¯ R 2 + γ R 3 ) } = R ( μ , γ ) ( p X Y Z ) .
(c) 
For any p X Y Z , we have
R sh ( p X Y Z ) = R ( p X Y Z ) .
For ( μ , γ , α ) [ 0 , 1 ] 3 , and for q = q U X Y Z Q ( p Y Z | X ) , define
ω q | p X ( μ , γ , α ) ( x , y , z | u ) : = α ¯ log q X ( x ) p X ( x ) + α μ log q X | U ( x | u ) p X ( x ) + μ ¯ γ ¯ log 1 q Y | U ( y | u ) + γ log 1 q Z | U ( z | u ) , f q | p X ( μ , γ , α ) ( x , y , z | u ) : = exp ω q | p X ( μ , γ , α ) ( x , y , z | u ) , Ω ( μ , γ , α ) ( q | p X ) : = log E q f q | p X ( μ , γ , α ) ( X , Y , Z | U ) , Ω ( μ , γ , α ) ( p X Y Z ) : = min q Q ( p Y Z | X ) Ω ( μ , γ , α ) ( q | p X ) , F ( μ , γ , α ) ( μ R 1 + γ ¯ R 2 + γ R 3 ) : = Ω ( μ , γ , α ) ( p X Y Z ) α [ μ R 1 + μ ¯ ( γ ¯ R 2 + γ R 3 ) ] 2 + α μ ¯ , F ( R 1 , R 2 , R 3 | p X Y Z ) : = sup ( μ , γ , α ) [ 0 , 1 ] 3 , F ( μ , γ , α ) ( μ R 1 + μ ¯ ( γ ¯ R 2 + γ R 3 ) | p X Y Z ) .
We next define a function serving as a lower bound of F ( R 1 , R 2 , R 3 | p X Y Z ) . For each p = p U X Y Z P sh ( p X Y Z ) , define
ω ˜ p ( μ , γ ) ( x , y , z | u ) : = μ log p X | U ( x | u ) p X ( x ) + μ ¯ γ ¯ log 1 p Y | U ( y | u ) + γ log 1 p Z | U ( z | u ) , Ω ˜ ( μ , γ , λ ) ( p ) : = log E p exp λ ω p ( μ , γ ) ( X , Y , Z | U ) , Ω ˜ ( μ , γ , λ ) ( p X Y Z ) : = min p P sh ( p X Y Z ) Ω ˜ ( μ , γ , λ ) ( p ) .
Furthermore, set
F ̲ ( μ , γ , λ ) ( μ R 1 + γ ¯ R 2 + γ R 3 | p X Y Z ) : = Ω ˜ ( μ , γ , λ ) ( p X Y Z ) λ [ μ R 1 + μ ¯ ( γ ¯ R 2 + γ R 3 ) ] 2 + λ ( 5 μ ) , F ̲ ( R 1 , R 2 , R 3 | p X Y Z ) : = sup ( μ , γ ) [ 0 , 1 ] 2 , λ 0 F ̲ ( μ , γ , λ ) ( μ R 1 + μ ¯ γ ¯ R 2 + γ R 3 | p X Y Z ) .
We can show that the above functions and sets satisfy the following property.
Property 8.
(a) 
The cardinality bound | U | | X | in Q ( p Y | X ) is sufficient to describe the quantity Ω ( μ , α ) ( p X Y ) . Furthermore, the cardinality bound | U | | X | in Q ( p Y Z | X ) is sufficient to describe the quantity Ω ˜ ( μ , γ , λ ) ( p X Y Z ) .
(b) 
For any R 1 , R 2 , R 3 0 , we have
F ( R 1 , R 2 , R 3 | p X Y Z ) F ̲ ( R 1 , R 2 , R 3 | p X Y Z ) .
(c) 
For any p = p U X Y P sh ( p X Y ) and any ( μ , γ , λ ) [ 0 , 1 ] 3 , we have
0 Ω ˜ ( μ , γ , λ ) ( p ) μ log | X | + μ ¯ log ( | Y | γ ¯ | Z | γ ) .
(d) 
Fix any p = p U X Y Z P sh ( p X Y Z ) and ( μ , γ ) [ 0 , 1 ] 2 . We define a probability distribution p ( λ ) = p U X Y Z ( λ ) by
p ( λ ) ( u , x , y , z ) : = p ( u , x , y , z ) exp λ ω p ( μ , γ ) ( x , y , z | u ) E p exp λ ω p ( μ , γ ) ( X , Y , Z | U ) .
Then, for λ [ 0 , 1 / 2 ] , Ω ˜ ( μ , γ , λ ) ( p ) is twice differentiable. Furthermore, for λ [ 0 , 1 / 2 ] , we have
d d λ Ω ˜ ( μ , γ , λ ) ( p ) = E p ( λ ) ω p ( μ , γ ) ( X , Y , Z | U ) , d 2 d λ 2 Ω ˜ ( μ , γ , λ ) ( p ) = Var p ( λ ) ω p ( μ , γ ) ( X , Y , Z | U ) .
The second equality implies that Ω ˜ ( μ , γ , λ ) ( p ) is a concave function of λ [ 0 , 1 / 2 ] .
(e) 
For ( μ , γ , λ ) [ 0 , 1 ] 2 × [ 0 , 1 / 2 ] , define
ρ ( μ , γ , λ ) ( p X Y Z ) : = max ( ν , p ) [ 0 , λ ] × P sh ( p X Y Z ) : Ω ˜ ( μ , γ , λ ) ( p ) = Ω ˜ ( μ , γ , λ ) ( p X Y Z ) Var p ( ν ) ω ˜ p ( μ , γ ) ( X , Y , Z | U ) ,
and set
ρ = ρ ( p X Y Z ) : = max ( μ , γ , λ ) [ 0 , 1 ] 2 × [ 0 , 1 / 2 ] ρ ( μ , γ , λ ) ( p X Y Z ) .
Then, we have ρ ( p X Y Z ) < . Furthermore, for any ( μ , γ , λ ) [ 0 , 1 ] 2 × [ 0 , 1 / 2 ] , we have
Ω ˜ ( μ , γ , λ ) ( p X Y Z ) λ R ( μ , γ ) ( p X Y Z ) λ 2 2 ρ ( p X Y Z ) .
(f) 
For every τ ( 0 , ( 1 / 2 ) ρ ( p X Y Z ) ) , the condition ( R 1 + τ , R 2 + τ , R 3 + τ ) R ( p X Y Z ) implies
F ( R 1 , R 2 , R 3 | p X Y Z ) > ρ ( p X Y Z ) 4 · g 2 τ ρ ( p X Y Z ) > 0 .
Since proofs of the results stated in Property 8 are quite parallel with those of the results stated in Property 4, we omit them. Our main result is the following.
Theorem 6.
For any R 1 , R 2 , R 3 0 , any p X Y Z , and for any ( φ 1 ( n ) , φ 2 ( n ) , φ 3 ( n ) , ψ ( n ) , ϕ ( n ) ) satisfying ( 1 / n ) log | | φ i ( n ) | | R i , i = 1 , 2 , 3 , we have
P c ( n ) ( φ 1 ( n ) , φ 2 ( n ) , φ 3 ( n ) , ψ ( n ) , ϕ ( n ) ) 7 exp n F ( R 1 , R 2 , R 3 | p X Y Z ) .
It follows from Theorem 6 and Property 8 part d) that, if ( R 1 , R 2 , R 3 ) is outside the capacity region, then the error probability of decoding goes to one exponentially and its exponent is not below F ( R 1 , R 2 , R 3 | p X Y Z ) . It immediately follows from Theorem 3 that we have the following corollary.
Corollary 4.
G ( R 1 , R 2 , R 3 | p X Y Z ) F ( R 1 , R 2 , R 3 | p X Y Z ) , G ( p X Y Z ) G ¯ ( p X Y Z ) = ( R 1 , R 2 , R 3 , G ) : G F ( R 1 , R 2 , R 3 | p X Y Z ) .
Proof of Theorem 6 is quite parallel with that of Theorem 3. We omit the detail of the proof. From Theorem 6 and Property 8 part e, we can obtain an explicit outer bound of R W ( ε | p X Y Z ) with an asymptotically vanishing deviation from R W ( p X Y Z ) = R ( p X Y Z ) . The strong converse theorem established by Csiszár and Körner [21] immediately follows from this corollary. To describe this outer bound, for κ > 0 , we set
R ( p X Y Z ) κ ( 1 , 1 , 1 ) : = { ( R 1 κ , R 2 κ , R 3 κ ) : ( R 1 , R 2 , R 3 ) R ( p X Y Z ) } ,
which serves as an outer bound of R ( p X Y Z ) . For each fixed ε ( 0 , 1 ) , we define κ ˜ n = κ ˜ n ( ε , ρ ( p X Y Z ) ) by
κ ˜ n : = ρ ( p X Y ) ϑ 4 n ρ ( p X Y ) log 7 1 ε = ( a ) 2 ρ ( p X Y ) n log 7 1 ε + 5 n log 7 1 ε .
Step (a) follows from ϑ ( a ) = a + ( 5 / 4 ) a 2 . Since κ ˜ n 0 as n , we have the smallest positive integer n 1 = n 1 ( ε , ρ ( p X Y Z ) ) such that κ ˜ n ( 1 / 2 ) ρ ( p X Y Z ) for n n 1 . From Theorem 6 and Property 8 part e, we have the following corollary.
Corollary 5.
For each fixed ε ( 0 , 1 ) , we choose the above positive integer n 1 = n 1 ( ε , ρ ( p X Y Z ) ) . Then, for any n n 1 , we have
R W ( ε | p X Y Z ) R ( p X Y Z ) κ ˜ n ( 0 , 1 , 1 ) .
The above result together with
R W ( ε | p X Y Z ) = cl m 1 n m R W ( n , ε | p X Y Z )
yields that for each fixed ε ( 0 , 1 ) , we have
R W ( ε | p X Y Z ) = R W ( p X Y Z ) = R ( p X Y Z ) .
This recovers the strong converse theorem proved by Csiszár and Körner [21].
Proof of this corollary is quite parallel with that of Corollary 2. We omit the detail.

6. Conclusions

For the AWZ system, the one helper source coding system posed by Ahlswede, Körner [1] and Wyner [2], we have derived an explicit lower bound of the optimal exponent function G ( R 1 , R 2 | p X Y ) on the correct probability of decoding for ( R 1 , R 2 ) R WZ ( p X Y ) . We have described this result in Theorem 3. Furthermore, for the source coding system posed and investigated Wyner [2], we have obtained an explicit lower bound of the optimal exponent function G ( R 1 , R 2 , R 3 | p X Y Z ) on the correct probability of decoding for ( R 1 , R 2 , R 3 ) R W ( p X Y Z ) . We have described this result in Theorem 6. The determination problems of G ( R 1 , R 2 | p X Y ) and G ( R 1 , R 2 , R 3 | p X Y Z ) still remain to be resolved. Those problems are our future works.

Funding

This research was funded by JSPS Kiban (B) 18H01438.

Acknowledgments

The author is very grateful to Shun Watanabe and Shigeaki Kuzuoka for their helpful comments.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A. Properties of the Rate Regions

In this appendix, we prove Property 1. Property 1 part a can easily be proved by the definitions of the rate distortion regions. We omit the proofs of this part. In the following argument, we prove the part b.
Proof of Property 1 part b:
We set
R ̲ AKW ( m , ε | p X Y ) = n m R AKW ( n , ε | p X Y ) .
By the definitions of R ̲ AKW ( m , ε | p X Y ) and R AKW ( ε | p X Y ) , we have that R ̲ AKW ( m , ε | p X Y ) R AKW ( ε | p X Y ) for m 1 . Hence, we have that
m 1 R ̲ AKW ( m , ε | p X Y ) R AKW ( ε | p X Y ) .
We next assume that ( R 1 , R 2 ) R AKW ( ε | p X Y ) . Set
R AKW ( δ ) ( ε | p X Y ) : = { ( R 1 + δ , R 2 + δ ) : ( R 1 , R 2 ) R AKW ( ε | p X Y ) } .
Then, by the definitions of R AKW ( n , ε | p X Y ) and R AKW ( ε | p X Y ) , we have that, for any δ > 0 , there exists n 0 ( ε , δ ) such that for any n n 0 ( ε , δ ) , ( R 1 + δ , R 2 + δ ) R AKW ( n , ε | p X Y ) , which implies that
R AKW ( δ ) ( ε | p X Y ) n n 0 ( ε , δ ) R AKW ( n , ε | p X Y ) = R ̲ AKW ( n 0 ( ε , δ ) , ε | p X Y ) cl m 1 R ̲ AKW ( m , ε | p X Y ) .
Here, we assume that there exists a pair ( R 1 , R 2 ) belonging to R AKW ( ε | p X Y ) such that
( R 1 , R 2 ) cl m 1 R ̲ AKW ( m , ε | p X Y ) .
Since the set on the right-hand side of (A3) is a closed set, we have
( R 1 + δ , R 2 + δ ) cl m 1 R ̲ AKW ( m , ε | p X Y )
for some small δ > 0 . On the other hand, we have ( R 1 + δ , R 2 + δ ) R AKW ( δ ) ( ε | p X Y ) , which contradicts (A2). Thus, we have
m 1 R ̲ AKW ( m , ε | p X Y ) R AKW ( ε | p X Y ) cl m 1 R ̲ AKW ( m , ε | p X Y ) .
Note here that R AKW ( ε | p X Y ) is a closed set. Then, from (A5), we conclude that
R AKW ( ε | W ) = cl m 1 R ̲ AKW ( m , ε | p X Y ) = cl m 1 n m R AKW ( n , ε | p X Y ) ,
completing the proof. □

Appendix B. Cardinality Bound on Auxiliary Random Variables

We first prove the following lemma.
Lemma A1.
R ̲ ( μ ) ( p X Y ) : = min p P ( p X Y ) μ I p ( X ; U ) + μ ¯ H p ( Y | U ) = R ( μ ) ( p X Y ) : = min p P sh ( p X Y ) μ I p ( X ; U ) + μ ¯ H p ( Y | U ) .
Proof. 
We bound the cardinality | U | of U to show that the bound | U | | X | is sufficient to describe R ̲ ( μ ) ( p X Y ) . Observe that
p X ( x ) = u U p U ( u ) p X | U ( x | u ) ,
μ I p ( X ; U ) + μ ¯ H p ( Y | U ) = u U p U ( u ) π ( p X | U ( · | u ) ) ,
where
π ( p X | U ( · | u ) ) : = ( x , y ) X × Y p X | U ( x | u ) p Y | X ( y | x ) log p X | U μ ( x | u ) p X μ ( x ) x ˜ X p Y | X ( y | x ˜ ) p X | U ( x ˜ | u ) μ ¯ .
For each u U , π ( p X | U ( · | u ) ) is a continuous function of p X | U ( · | u ) . Then, by the support lemma, | U | | X | 1 + 1 = | X | is sufficient to express | X | 1 values of (A6) and one value of (A7). □
Next, we prove the following lemma.
Lemma A2.
The cardinality bound | U | | X | in Q ( p Y | X ) is sufficient to describe the quantity Ω ( μ , α ) ( p X Y ) . The cardinality bound | U | | X | in P sh ( p X Y ) is sufficient to describe the quantity Ω ˜ ( μ , λ ) ( p X Y ) .
Proof. 
We first bound the cardinality | U | of U in Q ( p Y | X ) to show that the bound | U | | X | is sufficient to describe Ω ( μ , α ) ( p X Y ) . Observe that
q X ( x ) = u U q U ( u ) q X | U ( x | u ) ,
exp Ω ( μ , α ) ( q | p X ) = u U q U ( u ) Π ( μ , α ) ( q X , q X Y | U ( · , · | u ) ) ,
where
Π ( μ , α ) ( q X , q X Y | U ( · , · | u ) ) : = ( x , y ) X × Y q X Y | U ( x , y | u ) exp ω q | p X ( μ , α ) ( x , y | u ) .
The value of q X included in Π ( μ , α ) ( q X , q X Y | U ( · , · | u ) ) must be preserved under the reduction of U . For each u U , Π ( μ , α ) ( q X , q X Y | U ( · , · | u ) ) is a continuous function of q X Y | U ( · , · | u ) . Then, by the support lemma, | U | | X | 1 + 1 = | X | is sufficient to express | X | 1 values of (A8) and one value of (A9). We next bound the cardinality | U | of U in P sh ( p X Y ) to show that the bound | U | | X | is sufficient to describe Ω ˜ ( μ , λ ) ( p X Y ) . Observe that
p X ( x ) = u U p U ( u ) p X | U ( x | u ) ,
exp Ω ˜ ( μ , λ ) ( p ) = u U p U ( u ) Π ˜ ( μ , λ ) ( p X , p X Y | U ( · , · | u ) ) ,
where
Π ˜ ( μ , λ ) ( p X , p X Y | U ( · , · | u ) ) : = ( x , y ) X × Y p X Y | U ( x , y | u ) exp λ ω ˜ p ( μ ) ( x , y | u ) .
The value of p X included in Π ˜ ( μ , λ ) ( p X , p X Y | U ( · , · | u ) ) must be preserved under the reduction of U . For each u U , Π ˜ ( μ , λ ) ( p X , p X Y | U ( · , · | u ) ) is a continuous function of p X Y | U ( · , · | u ) . Then, by the support lemma, | U | | X | 1 + 1 = | X | is sufficient to express | X | 1 values of (A10) and one value of (A11). □

Appendix C. Supporting Hyperplane Expressions of $\mathcal{R}(p_{XY})$

In this appendix, we prove Property 3 parts b and c. We first prove part b.
Proof of Property 3 part b:
For any $\mu\in[0,1]$, we have the following chain of equalities:
$$\min_{(R_1,R_2)\in\mathcal{R}(p_{XY})} \{\mu R_1 + \bar{\mu} R_2\} = \min_{p\in\mathcal{P}(p_{XY})} \bigl\{\mu I_p(X;U)+\bar{\mu}H_p(Y|U)\bigr\} \stackrel{\rm (a)}{=} \min_{p\in\mathcal{P}_{\rm sh}(p_{XY})} \bigl\{\mu I_p(X;U)+\bar{\mu}H_p(Y|U)\bigr\} = R^{(\mu)}(p_{XY}).$$
Step (a) follows from Lemma A1, which states that the cardinality bound $|\mathcal{U}|\le|\mathcal{X}|+1$ in $\mathcal{P}(p_{XY})$ can be reduced to $|\mathcal{U}|\le|\mathcal{X}|$ in $\mathcal{P}_{\rm sh}(p_{XY})$. □
We next prove part c. We first prepare a lemma useful for proving this part. From the convexity of the region $\mathcal{R}(p_{XY})$, we have the following lemma.
Lemma A3.
Suppose that $(\hat{R}_1,\hat{R}_2)$ does not belong to $\mathcal{R}(p_{XY})$. Then, there exist $\epsilon>0$ and $\mu_0\in[0,1]$ such that, for any $(R_1,R_2)\in\mathcal{R}(p_{XY})$, we have
$$\mu_0(R_1-\hat{R}_1) + \bar{\mu}_0(R_2-\hat{R}_2) \ge \epsilon > 0.$$
The proof of this lemma is omitted here. Lemma A3 expresses the fact that, since the region $\mathcal{R}(p_{XY})$ is a closed convex set, for any point $(\hat{R}_1,\hat{R}_2)$ outside the region $\mathcal{R}(p_{XY})$, there exists a line which strictly separates the point $(\hat{R}_1,\hat{R}_2)$ from the region $\mathcal{R}(p_{XY})$; the normal vector can be taken of the form $(\mu_0,\bar{\mu}_0)$ with $\mu_0\in[0,1]$ because $\mathcal{R}(p_{XY})$ is closed under coordinate-wise increase of the rates.
Proof of Property 3 part c:
We first prove $\mathcal{R}_{\rm sh}(p_{XY}) \subseteq \mathcal{R}(p_{XY})$ by proving the contrapositive. We assume that $(\hat{R}_1,\hat{R}_2)\notin\mathcal{R}(p_{XY})$. Then, by Lemma A3, there exist $\epsilon>0$ and $\mu_0\in[0,1]$ such that, for any $(R_1,R_2)\in\mathcal{R}(p_{XY})$, we have
$$\mu_0\hat{R}_1 + \bar{\mu}_0\hat{R}_2 \le \mu_0 R_1 + \bar{\mu}_0 R_2 - \epsilon.$$
Then, we have
$$\begin{aligned} \mu_0\hat{R}_1 + \bar{\mu}_0\hat{R}_2 &\le \min_{(R_1,R_2)\in\mathcal{R}(p_{XY})} \{\mu_0 R_1 + \bar{\mu}_0 R_2\} - \epsilon \stackrel{\rm (a)}{=} \min_{p\in\mathcal{P}(p_{XY})} \bigl\{\mu_0 I_p(X;U)+\bar{\mu}_0 H_p(Y|U)\bigr\} - \epsilon \\ &= \min_{p\in\mathcal{P}_{\rm sh}(p_{XY})} \bigl\{\mu_0 I_p(X;U)+\bar{\mu}_0 H_p(Y|U)\bigr\} - \epsilon = R^{(\mu_0)}(p_{XY}) - \epsilon. \end{aligned} \tag{A12}$$
Step (a) follows from the definition of $\mathcal{R}(p_{XY})$; the second equality follows from Lemma A1. The inequality (A12) implies that $(\hat{R}_1,\hat{R}_2)\notin\mathcal{R}_{\rm sh}(p_{XY})$. Thus, $\mathcal{R}_{\rm sh}(p_{XY})\subseteq\mathcal{R}(p_{XY})$ is concluded. The reverse inclusion $\mathcal{R}(p_{XY})\subseteq\mathcal{R}_{\rm sh}(p_{XY})$ follows directly from part b, completing the proof of part c. □

Appendix D. Proof of Property 4 Part b

In this appendix, we prove Property 4 part b. Fix $q=q_{UXY}\in\mathcal{Q}(p_{Y|X})$ and $p=p_{UXY}=(p_{U|X},p_{XY})\in\mathcal{P}_{\rm sh}(p_{XY})$ arbitrarily. For $\beta\ge 0$, $p\in\mathcal{P}_{\rm sh}(p_{XY})$, and the conditional distribution $q_{Y|U}$ induced by $q$, define
$$\hat{\omega}^{(\mu)}_{p,q_{Y|U}}(x,y|u) := \mu\log\frac{p_{X|U}(x|u)}{p_X(x)} + \bar{\mu}\log\frac{1}{q_{Y|U}(y|u)}, \qquad \hat{\Omega}^{(\mu,\beta)}(p,q_{Y|U}) := -\log \mathrm{E}_p\Bigl[\exp\Bigl\{-\beta\,\hat{\omega}^{(\mu)}_{p,q_{Y|U}}(X,Y|U)\Bigr\}\Bigr].$$
Then, we have the following two lemmas.
Lemma A4.
For any $\mu\in[0,1]$, $\alpha\in[0,1)$, and any $q=q_{UXY}\in\mathcal{Q}(p_{Y|X})$, there exists $p=p_{UXY}\in\mathcal{P}_{\rm sh}(p_{XY})$ such that
$$\Omega^{(\mu,\alpha)}(q|p_X) \ge \bar{\alpha}\,\hat{\Omega}^{(\mu,\frac{\alpha}{\bar{\alpha}})}(p, q_{Y|U}). \tag{A13}$$
Lemma A5.
For any $\mu,\alpha$ satisfying $\mu\in[0,1]$, $\alpha\in[0,1/2)$, any $p=p_{UXY}\in\mathcal{P}_{\rm sh}(p_{XY})$, and any stochastic matrix $q_{Y|U}$ induced by $q_{UXY}\in\mathcal{Q}(p_{Y|X})$, we have
$$\hat{\Omega}^{(\mu,\frac{\alpha}{\bar{\alpha}})}(p, q_{Y|U}) \ge \frac{1-2\alpha}{\bar{\alpha}}\,\tilde{\Omega}^{(\mu,\frac{\alpha}{1-2\alpha})}(p). \tag{A14}$$
From Lemmas A4 and A5, we have the following corollary.
Corollary A1.
For any $\mu,\alpha$ satisfying $\mu\in[0,1]$, $\alpha\in[0,1/2)$, and any $q=q_{UXY}\in\mathcal{Q}(p_{Y|X})$, there exists $p=p_{UXY}\in\mathcal{P}_{\rm sh}(p_{XY})$ such that
$$\Omega^{(\mu,\alpha)}(q|p_X) \ge (1-2\alpha)\,\tilde{\Omega}^{(\mu,\frac{\alpha}{1-2\alpha})}(p). \tag{A15}$$
From (A15), we have that, for any $\mu\in[0,1]$ and $\alpha\in[0,1/2)$,
$$\Omega^{(\mu,\alpha)}(p_{XY}) \ge (1-2\alpha)\,\tilde{\Omega}^{(\mu,\frac{\alpha}{1-2\alpha})}(p_{XY}). \tag{A16}$$
Proof of Lemma A4:
We fix $\mu\in[0,1]$ and $\alpha\in[0,1)$ arbitrarily. For each $q=q_{UXY}\in\mathcal{Q}(p_{Y|X})$, we choose $p=p_{UXY}\in\mathcal{P}_{\rm sh}(p_{XY})$ so that $p_{U|X}=q_{U|X}$. Then, we have the following:
$$\begin{aligned} \exp\bigl\{-\Omega^{(\mu,\alpha)}(q|p_X)\bigr\} &= \mathrm{E}_q\Biggl[ \frac{p_X^{\bar{\alpha}}(X)}{q_X^{\bar{\alpha}}(X)} \cdot \frac{p_X^{\mu\alpha}(X)\, q_{Y|U}^{\bar{\mu}\alpha}(Y|U)}{q_{X|U}^{\mu\alpha}(X|U)} \Biggr] \\ &= \mathrm{E}_q\Biggl[ \biggl\{ \frac{p_{UX}(U,X)}{q_{UX}(U,X)} \cdot \frac{p_X^{\frac{\mu\alpha}{\bar{\alpha}}}(X)\, q_{Y|U}^{\frac{\bar{\mu}\alpha}{\bar{\alpha}}}(Y|U)}{p_{X|U}^{\frac{\mu\alpha}{\bar{\alpha}}}(X|U)} \biggr\}^{\bar{\alpha}} \biggl\{ \frac{p_{X|U}^{\mu}(X|U)}{q_{X|U}^{\mu}(X|U)} \biggr\}^{\alpha} \Biggr] \\ &\stackrel{\rm (a)}{\le} \Biggl( \mathrm{E}_q\Biggl[ \frac{p_{UX}(U,X)}{q_{UX}(U,X)} \cdot \frac{p_X^{\frac{\mu\alpha}{\bar{\alpha}}}(X)\, q_{Y|U}^{\frac{\bar{\mu}\alpha}{\bar{\alpha}}}(Y|U)}{p_{X|U}^{\frac{\mu\alpha}{\bar{\alpha}}}(X|U)} \Biggr] \Biggr)^{\bar{\alpha}} \Biggl( \mathrm{E}_q\Biggl[ \frac{p_{X|U}^{\mu}(X|U)}{q_{X|U}^{\mu}(X|U)} \Biggr] \Biggr)^{\alpha} \\ &= \exp\Bigl\{-\bar{\alpha}\,\hat{\Omega}^{(\mu,\frac{\alpha}{\bar{\alpha}})}(p, q_{Y|U})\Bigr\} \cdot A^{\alpha}, \end{aligned} \tag{A17}$$
where we set
$$A := \mathrm{E}_q\Biggl[\frac{p_{X|U}^{\mu}(X|U)}{q_{X|U}^{\mu}(X|U)}\Biggr].$$
Step (a) follows from Hölder's inequality; for the first factor, we have also used the identity $\mathrm{E}_q[\{p_{UX}(U,X)/q_{UX}(U,X)\}\,g(U,X,Y)] = \mathrm{E}_p[g(U,X,Y)]$ for nonnegative $g$, which holds since $p_{U|X}=q_{U|X}$ and since both $p$ and $q$ satisfy the Markov chain $U\leftrightarrow X\leftrightarrow Y$ with the same conditional distribution $p_{Y|X}$. From (A17), we can see that it suffices to show $A\le 1$ to complete the proof. When $\mu=1$, we have $A=1$. When $\mu\in[0,1)$, we apply Hölder's inequality to $A$ to obtain
$$A = \mathrm{E}_q\Biggl[\frac{p_{X|U}^{\mu}(X|U)}{q_{X|U}^{\mu}(X|U)}\Biggr] \le \Biggl(\mathrm{E}_q\Biggl[\frac{p_{X|U}(X|U)}{q_{X|U}(X|U)}\Biggr]\Biggr)^{\mu} = 1.$$
Hence, we have (A13) in Lemma A4. □
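The Hölder step used twice above (and again for $B$ in the next proof) is the following special case, recorded here for completeness; the symbols $Z$ and $\theta$ are ours and are not used elsewhere in the paper: for a nonnegative random variable $Z$ and $\theta\in[0,1]$, applying Hölder's inequality with the conjugate exponents $1/\theta$ and $1/(1-\theta)$ to the product $Z^{\theta}\cdot 1$ gives
$$\mathrm{E}[Z^{\theta}] = \mathrm{E}[Z^{\theta}\cdot 1] \le \bigl(\mathrm{E}[Z]\bigr)^{\theta}\bigl(\mathrm{E}[1]\bigr)^{1-\theta} = \bigl(\mathrm{E}[Z]\bigr)^{\theta}.$$
In the display above, $Z = p_{X|U}(X|U)/q_{X|U}(X|U)$ and $\theta=\mu$, with $\mathrm{E}_q[Z] = \sum_{u}q_U(u)\sum_{x}p_{X|U}(x|u) = 1$.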
Proof of Lemma A5:
We fix $\mu\in[0,1]$ and $\alpha\in[0,1/2)$ arbitrarily. For any $p=p_{UXY}\in\mathcal{P}_{\rm sh}(p_{XY})$ and any $q=q_{UXY}\in\mathcal{Q}(p_{Y|X})$, we have the following chain of inequalities:
$$\begin{aligned} \exp\Bigl\{-\hat{\Omega}^{(\mu,\frac{\alpha}{\bar{\alpha}})}(p, q_{Y|U})\Bigr\} &= \mathrm{E}_p\Biggl[ \biggl\{ \frac{p_X^{\frac{\mu\alpha}{1-2\alpha}}(X)\, p_{Y|U}^{\frac{\bar{\mu}\alpha}{1-2\alpha}}(Y|U)}{p_{X|U}^{\frac{\mu\alpha}{1-2\alpha}}(X|U)} \biggr\}^{\frac{1-2\alpha}{\bar{\alpha}}} \biggl\{ \frac{q_{Y|U}^{\bar{\mu}}(Y|U)}{p_{Y|U}^{\bar{\mu}}(Y|U)} \biggr\}^{\frac{\alpha}{\bar{\alpha}}} \Biggr] \\ &\stackrel{\rm (a)}{\le} \exp\Bigl\{-\frac{1-2\alpha}{\bar{\alpha}}\,\tilde{\Omega}^{(\mu,\frac{\alpha}{1-2\alpha})}(p)\Bigr\} \cdot B^{\frac{\alpha}{\bar{\alpha}}}, \end{aligned} \tag{A18}$$
where we set
$$B := \mathrm{E}_p\Biggl[\frac{q_{Y|U}^{\bar{\mu}}(Y|U)}{p_{Y|U}^{\bar{\mu}}(Y|U)}\Biggr].$$
Step (a) follows from Hölder's inequality. From (A18), we can see that it suffices to show $B\le 1$ to complete the proof. In a manner quite similar to the proof of $A\le 1$ in the proof of (A13) in Lemma A4, we can show that $B\le 1$. Thus, we have (A14) in Lemma A5. □
Proof of Property 4 part b:
We evaluate lower bounds of $F(R_1,R_2|p_{XY})$ to obtain the following chain of inequalities:
$$\begin{aligned} F(R_1,R_2|p_{XY}) &\stackrel{\rm (a)}{\ge} \sup_{\mu\in[0,1],\,\alpha\in[0,1/2)} \frac{(1-2\alpha)\,\tilde{\Omega}^{(\mu,\frac{\alpha}{1-2\alpha})}(p_{XY}) - \alpha(\mu R_1+\bar{\mu}R_2)}{2+\alpha\bar{\mu}} \\ &= \sup_{\substack{\mu\in[0,1],\,\alpha\in[0,1/2),\\ \lambda=\frac{\alpha}{1-2\alpha}}} \frac{(1-2\alpha)\,\tilde{\Omega}^{(\mu,\lambda)}(p_{XY}) - \alpha(\mu R_1+\bar{\mu}R_2)}{2+\alpha\bar{\mu}} \\ &\stackrel{\rm (b)}{=} \sup_{\substack{\mu\in[0,1],\,\lambda\ge 0,\\ \alpha=\frac{\lambda}{1+2\lambda}}} \frac{(1-2\alpha)\,\tilde{\Omega}^{(\mu,\lambda)}(p_{XY}) - \alpha(\mu R_1+\bar{\mu}R_2)}{2+\alpha\bar{\mu}} \\ &\stackrel{\rm (c)}{=} \sup_{\mu\in[0,1],\,\lambda\ge 0} \frac{\tilde{\Omega}^{(\mu,\lambda)}(p_{XY}) - \lambda(\mu R_1+\bar{\mu}R_2)}{2+\lambda(5-\mu)} = \sup_{\mu\in[0,1],\,\lambda\ge 0} \underline{F}^{(\mu,\lambda)}(\mu R_1+\bar{\mu}R_2|p_{XY}). \end{aligned} \tag{A19}$$
Step (a) follows from the definition of $F(R_1,R_2|p_{XY})$ and (A16) in Corollary A1. Steps (b) and (c) follow from the fact that
$$\alpha\in[0,1/2),\ \lambda=\frac{\alpha}{1-2\alpha} \iff \lambda\ge 0,\ \alpha=\frac{\lambda}{1+2\lambda}.$$
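For completeness, the arithmetic behind step (c) is the following short computation; the shorthand $c:=\mu R_1+\bar{\mu}R_2$ is ours and is used only here. With $\alpha=\lambda/(1+2\lambda)$,
$$1-2\alpha = \frac{1}{1+2\lambda}, \qquad 2+\alpha\bar{\mu} = \frac{2(1+2\lambda)+\lambda(1-\mu)}{1+2\lambda} = \frac{2+\lambda(5-\mu)}{1+2\lambda},$$
so that
$$\frac{(1-2\alpha)\,\tilde{\Omega}^{(\mu,\lambda)}(p_{XY}) - \alpha c}{2+\alpha\bar{\mu}} = \frac{\tilde{\Omega}^{(\mu,\lambda)}(p_{XY}) - \lambda c}{2+\lambda(5-\mu)}.$$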
From (A19), we have
$$F(R_1,R_2|p_{XY}) \ge \sup_{\mu\in[0,1],\,\lambda\ge 0} \underline{F}^{(\mu,\lambda)}(\mu R_1+\bar{\mu}R_2|p_{XY}) = \underline{F}(R_1,R_2|p_{XY}),$$
completing the proof. □

Appendix E. Proof of Property 4 Parts c, d, e, and f

In this appendix, we prove Property 4 parts c, d, e, and f. We first prove part c and then prove parts d and e. We finally prove part f.
Proof of Property 4 part c:
We first prove the second inequality in (8) in part c. We begin by observing that
$$\exp\bigl\{-\tilde{\Omega}^{(\mu,\lambda)}(p)\bigr\} = \mathrm{E}_p\Biggl[\frac{p_X^{\mu\lambda}(X)\, p_{Y|U}^{\bar{\mu}\lambda}(Y|U)}{p_{X|U}^{\mu\lambda}(X|U)}\Biggr]. \tag{A20}$$
Let $\bar{p}_X$ be the uniform distribution on $\mathcal{X}$ and let $\bar{p}_Y$ be the uniform distribution on $\mathcal{Y}$. On the lower bound of $\exp\{-\tilde{\Omega}^{(\mu,\lambda)}(p)\}$ for $p\in\mathcal{P}_{\rm sh}(p_{XY})$ and $(\mu,\lambda)\in[0,1]^2$, we have the following chain of inequalities:
$$\begin{aligned} \exp\bigl\{-\tilde{\Omega}^{(\mu,\lambda)}(p)\bigr\} &= \frac{1}{|\mathcal{X}|^{\mu\lambda}|\mathcal{Y}|^{\bar{\mu}\lambda}}\, \mathrm{E}_p\Biggl[ \frac{1}{p_{X|U}^{\mu\lambda}(X|U)} \biggl\{\frac{\bar{p}_X(X)}{p_X(X)}\biggr\}^{-\mu\lambda} \biggl\{\frac{\bar{p}_Y(Y)}{p_{Y|U}(Y|U)}\biggr\}^{-\bar{\mu}\lambda} \Biggr] \\ &\stackrel{\rm (a)}{\ge} \frac{1}{|\mathcal{X}|^{\mu}|\mathcal{Y}|^{\bar{\mu}}}\, \mathrm{E}_p\Biggl[ \biggl\{\frac{\bar{p}_X(X)}{p_X(X)}\biggr\}^{-\mu\lambda} \biggl\{\frac{\bar{p}_Y(Y)}{p_{Y|U}(Y|U)}\biggr\}^{-\bar{\mu}\lambda} \Biggr] \\ &\stackrel{\rm (b)}{\ge} \frac{1}{|\mathcal{X}|^{\mu}|\mathcal{Y}|^{\bar{\mu}}} \Biggl(\mathrm{E}_p\biggl[\frac{\bar{p}_X(X)}{p_X(X)}\biggr]\Biggr)^{-\mu\lambda} \Biggl(\mathrm{E}_p\biggl[\frac{\bar{p}_Y(Y)}{p_{Y|U}(Y|U)}\biggr]\Biggr)^{-\bar{\mu}\lambda} = \frac{1}{|\mathcal{X}|^{\mu}|\mathcal{Y}|^{\bar{\mu}}}. \end{aligned} \tag{A21}$$
Step (a) follows from the fact that $\lambda\in[0,1]$ and $p_{X|U}(x|u)\le 1$ for any $(u,x)\in\mathcal{U}\times\mathcal{X}$. Step (b) follows from the reverse Hölder inequality. The bound (A21) implies the second inequality in (8). We next show that $\tilde{\Omega}^{(\mu,\lambda)}(p)\ge 0$ for $\lambda\in[0,1]$. On upper bounds of $\exp\{-\tilde{\Omega}^{(\mu,\lambda)}(p)\}$ for $p\in\mathcal{P}_{\rm sh}(p_{XY})$ and $\lambda\in[0,1]$, we have the following chain of inequalities:
$$\exp\bigl\{-\tilde{\Omega}^{(\mu,\lambda)}(p)\bigr\} \stackrel{\rm (a)}{\le} \mathrm{E}_p\Biggl[\biggl\{\frac{p_X(X)}{p_{X|U}(X|U)}\biggr\}^{\mu\lambda}\Biggr] \stackrel{\rm (b)}{\le} \Biggl(\mathrm{E}_p\biggl[\frac{p_X(X)}{p_{X|U}(X|U)}\biggr]\Biggr)^{\mu\lambda} = 1. \tag{A22}$$
Step (a) follows from (A20) and $p_{Y|U}(y|u)\le 1$ for any $(u,y)\in\mathcal{U}\times\mathcal{Y}$. Step (b) follows from $\mu\lambda\in[0,1]$ and Hölder's inequality. □
Proof of Property 4 parts d and e:
We first prove that, for each $p\in\mathcal{P}_{\rm sh}(p_{XY})$ and $\mu\in[0,1]$, $\tilde{\Omega}^{(\mu,\lambda)}(p)$ is twice differentiable for $\lambda\in[0,1/2]$. For simplicity of notation, set
$$\underline{a} := (u,x,y), \quad \underline{A} := (U,X,Y), \quad \underline{\mathcal{A}} := \mathcal{U}\times\mathcal{X}\times\mathcal{Y}, \quad \varsigma(\underline{a}) := \tilde{\omega}^{(\mu)}_p(x,y|u), \quad \xi(\lambda) := \tilde{\Omega}^{(\mu,\lambda)}(p).$$
Then, we have
$$\xi(\lambda) = \tilde{\Omega}^{(\mu,\lambda)}(p) = -\log\sum_{\underline{a}\in\underline{\mathcal{A}}} p_{\underline{A}}(\underline{a})\, \mathrm{e}^{-\lambda\varsigma(\underline{a})}. \tag{A23}$$
The tilted distribution $p^{(\lambda)}(\underline{a}) = p^{(\lambda)}_{\underline{A}}(\underline{a})$, $\underline{a}\in\underline{\mathcal{A}}$, has the following form:
$$p^{(\lambda)}(\underline{a}) = \mathrm{e}^{\xi(\lambda)}\, p(\underline{a})\, \mathrm{e}^{-\lambda\varsigma(\underline{a})}. \tag{A24}$$
By simple computations, we have
$$\begin{aligned} \xi'(\lambda) &= \mathrm{e}^{\xi(\lambda)} \sum_{\underline{a}\in\underline{\mathcal{A}}} p(\underline{a})\,\varsigma(\underline{a})\,\mathrm{e}^{-\lambda\varsigma(\underline{a})} = \sum_{\underline{a}\in\underline{\mathcal{A}}} p^{(\lambda)}(\underline{a})\,\varsigma(\underline{a}), \\ \xi''(\lambda) &= -\mathrm{e}^{2\xi(\lambda)} \sum_{\underline{a},\underline{b}\in\underline{\mathcal{A}}} p(\underline{a})\,p_{\underline{A}}(\underline{b})\, \frac{\bigl[\varsigma(\underline{a})-\varsigma(\underline{b})\bigr]^2}{2}\, \mathrm{e}^{-\lambda[\varsigma(\underline{a})+\varsigma(\underline{b})]} = -\sum_{\underline{a},\underline{b}\in\underline{\mathcal{A}}} p^{(\lambda)}(\underline{a})\,p^{(\lambda)}(\underline{b})\, \frac{\bigl[\varsigma(\underline{a})-\varsigma(\underline{b})\bigr]^2}{2} \\ &= -\sum_{\underline{a}\in\underline{\mathcal{A}}} p^{(\lambda)}(\underline{a})\,\varsigma^2(\underline{a}) + \Biggl(\sum_{\underline{a}\in\underline{\mathcal{A}}} p^{(\lambda)}(\underline{a})\,\varsigma(\underline{a})\Biggr)^{2} \le 0. \end{aligned} \tag{A25}$$
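Equivalently, (A25) can be summarized as $\xi'(\lambda) = \mathrm{E}_{p^{(\lambda)}}[\varsigma(\underline{A})]$ and
$$\xi''(\lambda) = -\mathrm{Var}_{p^{(\lambda)}}\bigl[\varsigma(\underline{A})\bigr],$$
that is, the second derivative is the negative variance of $\varsigma$ under the tilted distribution (A24); this is the form in which (A25) enters the Taylor expansion below.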
On the upper bound of $-\xi''(\lambda)\ge 0$ for $\lambda\in[0,1/2]$, we have the following chain of inequalities:
$$-\xi''(\lambda) \stackrel{\rm (a)}{\le} \sum_{\underline{a}\in\underline{\mathcal{A}}} p^{(\lambda)}(\underline{a})\,\varsigma^2(\underline{a}) \stackrel{\rm (b)}{=} \mathrm{e}^{\xi(\lambda)} \sum_{\underline{a}\in\underline{\mathcal{A}}} p(\underline{a})\,\varsigma^2(\underline{a})\,\mathrm{e}^{-\lambda\varsigma(\underline{a})} \stackrel{\rm (c)}{\le} \mathrm{e}^{\xi(\lambda)-\frac{1}{2}\xi(2\lambda)} \Biggl(\sum_{\underline{a}\in\underline{\mathcal{A}}} p(\underline{a})\,\varsigma^4(\underline{a})\Biggr)^{1/2} \stackrel{\rm (d)}{\le} \mathrm{e}^{\xi(\lambda)} \Biggl(\sum_{\underline{a}\in\underline{\mathcal{A}}} p(\underline{a})\,\varsigma^4(\underline{a})\Biggr)^{1/2}. \tag{A26}$$
Step (a) follows from (A25). Step (b) follows from (A24). Step (c) follows from the Cauchy–Schwarz inequality and (A23). Step (d) follows from the fact that $\xi(2\lambda)\ge 0$ for $2\lambda\in[0,1]$, which was shown in the proof of part c. Note that $\xi'(\lambda)$ exists for $\lambda\in[0,1/2]$. Furthermore, we have the following:
$$\sum_{\underline{a}\in\underline{\mathcal{A}}} p(\underline{a})\,\varsigma^4(\underline{a}) < \infty.$$
Hence, by (A26), $\xi''(\lambda)$ exists for $\lambda\in[0,1/2]$. We next prove part e by deriving the lower bound (9) of $\tilde{\Omega}^{(\mu,\lambda)}(p_{XY})$. Fix any $(\mu,\lambda)\in[0,1]\times[0,1/2]$ and any $p\in\mathcal{P}_{\rm sh}(p_{XY})$. By the Taylor expansion of $\xi(\lambda)=\tilde{\Omega}^{(\mu,\lambda)}(p)$ with respect to $\lambda$ around $\lambda=0$, we have, for some $\nu\in[0,\lambda]$,
$$\begin{aligned} \tilde{\Omega}^{(\mu,\lambda)}(p) &= \xi(0) + \xi'(0)\lambda + \frac{1}{2}\xi''(\nu)\lambda^2 = \lambda\,\mathrm{E}_p\bigl[\tilde{\omega}^{(\mu)}_p(X,Y|U)\bigr] - \frac{\lambda^2}{2}\mathrm{Var}_{p^{(\nu)}}\bigl[\tilde{\omega}^{(\mu)}_p(X,Y|U)\bigr] \\ &\stackrel{\rm (a)}{\ge} \lambda R^{(\mu)}(p_{XY}) - \frac{\lambda^2}{2}\mathrm{Var}_{p^{(\nu)}}\bigl[\tilde{\omega}^{(\mu)}_p(X,Y|U)\bigr]. \end{aligned} \tag{A27}$$
Step (a) follows from $p\in\mathcal{P}_{\rm sh}(p_{XY})$,
$$\mathrm{E}_p\bigl[\tilde{\omega}^{(\mu)}_p(X,Y|U)\bigr] = \mu I_p(X;U) + \bar{\mu}H_p(Y|U),$$
and the definition of $R^{(\mu)}(p_{XY})$. Let $(\nu_{\rm opt}, p_{\rm opt})\in[0,\lambda]\times\mathcal{P}_{\rm sh}(p_{XY})$ be a pair which attains $\rho^{(\mu,\lambda)}(p_{XY})$. By this definition, we have that
$$\tilde{\Omega}^{(\mu,\lambda)}(p_{\rm opt}) = \tilde{\Omega}^{(\mu,\lambda)}(p_{XY}) \tag{A28}$$
and that, for any $\nu\in[0,\lambda]$,
$$\mathrm{Var}_{p_{\rm opt}^{(\nu)}}\bigl[\tilde{\omega}^{(\mu)}_{p_{\rm opt}}(X,Y|U)\bigr] \le \mathrm{Var}_{p_{\rm opt}^{(\nu_{\rm opt})}}\bigl[\tilde{\omega}^{(\mu)}_{p_{\rm opt}}(X,Y|U)\bigr] = \rho^{(\mu,\lambda)}(p_{XY}). \tag{A29}$$
On lower bounds of $\tilde{\Omega}^{(\mu,\lambda)}(p_{XY})$, we have the following chain of inequalities:
$$\begin{aligned} \tilde{\Omega}^{(\mu,\lambda)}(p_{XY}) &\stackrel{\rm (a)}{=} \tilde{\Omega}^{(\mu,\lambda)}(p_{\rm opt}) \stackrel{\rm (b)}{\ge} \lambda R^{(\mu)}(p_{XY}) - \frac{\lambda^2}{2}\mathrm{Var}_{p_{\rm opt}^{(\nu)}}\bigl[\tilde{\omega}^{(\mu)}_{p_{\rm opt}}(X,Y|U)\bigr] \\ &\stackrel{\rm (c)}{\ge} \lambda R^{(\mu)}(p_{XY}) - \frac{\lambda^2}{2}\rho^{(\mu,\lambda)}(p_{XY}) \stackrel{\rm (d)}{\ge} \lambda R^{(\mu)}(p_{XY}) - \frac{\lambda^2}{2}\rho(p_{XY}). \end{aligned}$$
Step (a) follows from (A28). Step (b) follows from (A27). Step (c) follows from (A29). Step (d) follows from the definition of $\rho(p_{XY})$. □
To prove part f, we use the following lemma.
Lemma A6.
When $\tau\in(0,(1/2)\rho]$, the maximum of
$$\frac{1}{2+5\lambda}\Bigl(-\frac{\rho}{2}\lambda^2 + \tau\lambda\Bigr)$$
for $\lambda\in(0,1/2]$ is attained by the positive $\lambda_0$ satisfying
$$\vartheta(\lambda_0) := \lambda_0 + \frac{5}{4}\lambda_0^2 = \frac{\tau}{\rho}. \tag{A30}$$
Let $g(a)$ be the inverse function of $\vartheta(a)$ for $a\ge 0$. Then, the condition (A30) is equivalent to $\lambda_0 = g(\tau/\rho)$. The maximum is given by
$$\frac{1}{2+5\lambda_0}\Bigl(-\frac{\rho}{2}\lambda_0^2 + \tau\lambda_0\Bigr) = \frac{\rho}{4}\lambda_0^2 = \frac{\rho}{4}\,g^2\Bigl(\frac{\tau}{\rho}\Bigr).$$
By an elementary computation, we can prove this lemma. We omit the detail.
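For the reader's convenience, we record a sketch of the omitted computation; the closed-form expression for $g$ at the end is our addition and is not used elsewhere in the paper. Differentiating the objective gives
$$\frac{\mathrm{d}}{\mathrm{d}\lambda}\,\frac{\tau\lambda-\frac{\rho}{2}\lambda^2}{2+5\lambda} = \frac{2\tau - \rho\bigl(2\lambda+\frac{5}{2}\lambda^2\bigr)}{(2+5\lambda)^2},$$
which vanishes if and only if $\lambda+\frac{5}{4}\lambda^2 = \tau/\rho$, that is, $\lambda=\lambda_0=g(\tau/\rho)$. Since $\vartheta$ is strictly increasing on $[0,\infty)$ and $\vartheta(1/2)=13/16\ge 1/2\ge\tau/\rho$, we have $\lambda_0\in(0,1/2]$. Substituting $\tau=\rho\,\vartheta(\lambda_0)$ into the objective gives $\tau\lambda_0-\frac{\rho}{2}\lambda_0^2 = \frac{\rho}{4}\lambda_0^2(2+5\lambda_0)$, which yields the stated maximum value. Solving the quadratic $\vartheta(a)=x$ explicitly, one obtains
$$g(x) = \frac{2}{5}\bigl(\sqrt{1+5x}-1\bigr).$$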
Proof of Property 4 part f.
By the supporting hyperplane expression $\mathcal{R}_{\rm sh}(p_{XY})$ of $\mathcal{R}(p_{XY})$ stated in Property 3 part b, we have that, when $(R_1+\tau, R_2+\tau)\notin\mathcal{R}(p_{XY})$,
$$R^{(\mu_0)}(p_{XY}) - (\mu_0 R_1 + \bar{\mu}_0 R_2) > \tau \tag{A31}$$
for some $\mu_0\in[0,1]$. Then, for each positive $\tau$, we have the following chain of inequalities:
$$\begin{aligned} \underline{F}(R_1,R_2|p_{XY}) &\ge \sup_{\lambda\in(0,1/2]} \underline{F}^{(\mu_0,\lambda)}(\mu_0 R_1+\bar{\mu}_0 R_2|p_{XY}) = \sup_{\lambda\in(0,1/2]} \frac{\tilde{\Omega}^{(\mu_0,\lambda)}(p_{XY}) - \lambda(\mu_0 R_1+\bar{\mu}_0 R_2)}{2+\lambda(5-\mu_0)} \\ &\stackrel{\rm (a)}{\ge} \sup_{\lambda\in(0,1/2]} \frac{1}{2+5\lambda}\Bigl\{-\frac{\rho}{2}\lambda^2 + \lambda R^{(\mu_0)}(p_{XY}) - \lambda(\mu_0 R_1+\bar{\mu}_0 R_2)\Bigr\} \\ &\stackrel{\rm (b)}{>} \sup_{\lambda\in(0,1/2]} \frac{1}{2+5\lambda}\Bigl\{-\frac{\rho}{2}\lambda^2 + \tau\lambda\Bigr\} \stackrel{\rm (c)}{=} \frac{\rho}{4}\,g^2\Bigl(\frac{\tau}{\rho}\Bigr). \end{aligned}$$
Step (a) follows from Property 4 part d. Step (b) follows from (A31). Step (c) follows from Lemma A6. □

Appendix F. Proof of Lemma 1

To prove Lemma 1, we prepare a lemma. Set
$$\mathcal{A}_n := \biggl\{(s,x^n,y^n) : \frac{1}{n}\log\frac{p_{SX^nY^n}(s,x^n,y^n)}{\hat{q}_{SX^nY^n}(s,x^n,y^n)} \ge -\eta \biggr\}.$$
Furthermore, set
$$\begin{aligned} \tilde{\mathcal{B}}_n &:= \biggl\{x^n : \frac{1}{n}\log\frac{p_{X^n}(x^n)}{Q_{X^n}(x^n)} \ge -\eta \biggr\}, \quad \mathcal{B}_n := \tilde{\mathcal{B}}_n\times\mathcal{M}_1\times\mathcal{Y}^n, \quad \mathcal{B}_n^{\rm c} := \tilde{\mathcal{B}}_n^{\rm c}\times\mathcal{M}_1\times\mathcal{Y}^n, \\ \tilde{\mathcal{C}}_n &:= \Bigl\{(s,x^n) : s=\varphi_1^{(n)}(x^n),\ \tilde{Q}_{X^n|S}(x^n|s) \le M_1\,\mathrm{e}^{n\eta}\, p_{X^n}(x^n)\Bigr\}, \quad \mathcal{C}_n := \tilde{\mathcal{C}}_n\times\mathcal{Y}^n, \quad \mathcal{C}_n^{\rm c} := \tilde{\mathcal{C}}_n^{\rm c}\times\mathcal{Y}^n, \\ \mathcal{D}_n &:= \Bigl\{(s,x^n,y^n) : s=\varphi_1^{(n)}(x^n),\ p_{Y^n|S}(y^n|s) \ge (1/M_2)\,\mathrm{e}^{-n\eta}\Bigr\}, \\ \mathcal{E}_n &:= \Bigl\{(s,x^n,y^n) : s=\varphi_1^{(n)}(x^n),\ \psi^{(n)}\bigl(\varphi_1^{(n)}(x^n), \varphi_2^{(n)}(y^n)\bigr) = y^n \Bigr\}. \end{aligned}$$
Then, we have the following lemma.
Lemma A7.
$$p_{SX^nY^n}\bigl(\mathcal{A}_n^{\rm c}\bigr) \le \mathrm{e}^{-n\eta}, \quad p_{SX^nY^n}\bigl(\mathcal{B}_n^{\rm c}\bigr) \le \mathrm{e}^{-n\eta}, \quad p_{SX^nY^n}\bigl(\mathcal{C}_n^{\rm c}\bigr) \le \mathrm{e}^{-n\eta}, \quad p_{SX^nY^n}\bigl(\mathcal{D}_n^{\rm c}\cap\mathcal{E}_n\bigr) \le \mathrm{e}^{-n\eta}.$$
Proof. 
We first prove the first inequality.
$$p_{SX^nY^n}(\mathcal{A}_n^{\rm c}) = \sum_{(s,x^n,y^n)\in\mathcal{A}_n^{\rm c}} p_{SX^nY^n}(s,x^n,y^n) \stackrel{\rm (a)}{\le} \sum_{(s,x^n,y^n)\in\mathcal{A}_n^{\rm c}} \mathrm{e}^{-n\eta}\,\hat{q}_{SX^nY^n}(s,x^n,y^n) = \mathrm{e}^{-n\eta}\,\hat{q}_{SX^nY^n}(\mathcal{A}_n^{\rm c}) \le \mathrm{e}^{-n\eta}.$$
Step (a) follows from the definition of $\mathcal{A}_n$. We next prove the second inequality:
$$p_{SX^nY^n}(\mathcal{B}_n^{\rm c}) = p_{X^n}\bigl(\tilde{\mathcal{B}}_n^{\rm c}\bigr) = \sum_{x^n\in\tilde{\mathcal{B}}_n^{\rm c}} p_{X^n}(x^n) \stackrel{\rm (a)}{\le} \sum_{x^n\in\tilde{\mathcal{B}}_n^{\rm c}} \mathrm{e}^{-n\eta}\, Q_{X^n}(x^n) = \mathrm{e}^{-n\eta}\, Q_{X^n}\bigl(\tilde{\mathcal{B}}_n^{\rm c}\bigr) \le \mathrm{e}^{-n\eta}.$$
Step (a) follows from the definition of $\mathcal{B}_n$. We next prove the third inequality:
$$\begin{aligned} p_{SX^nY^n}(\mathcal{C}_n^{\rm c}) = p_{SX^n}\bigl(\tilde{\mathcal{C}}_n^{\rm c}\bigr) &= \sum_{s\in\mathcal{M}_1}\ \sum_{\substack{x^n:\, \varphi_1^{(n)}(x^n)=s, \\ p_{X^n}(x^n) < (1/M_1)\mathrm{e}^{-n\eta}\tilde{Q}_{X^n|S}(x^n|s)}} p_{X^n}(x^n) \\ &\le \frac{1}{M_1}\mathrm{e}^{-n\eta} \sum_{s\in\mathcal{M}_1}\ \sum_{x^n:\, \varphi_1^{(n)}(x^n)=s} \tilde{Q}_{X^n|S}(x^n|s) \le \frac{1}{M_1}\mathrm{e}^{-n\eta}\,|\mathcal{M}_1| = \mathrm{e}^{-n\eta}. \end{aligned}$$
Finally, we prove the fourth inequality. We first observe that
$$p_S(s) = \sum_{x^n:\, \varphi_1^{(n)}(x^n)=s} p_{X^n}(x^n), \qquad p_{X^n|S}(x^n|s) = \frac{p_{X^n}(x^n)}{p_S(s)} \quad \text{for } x^n \text{ with } \varphi_1^{(n)}(x^n)=s.$$
We have the following chain of inequalities:
$$\begin{aligned} p_{SX^nY^n}\bigl(\mathcal{D}_n^{\rm c}\cap\mathcal{E}_n\bigr) &= \sum_{s\in\mathcal{M}_1} p_S(s) \sum_{x^n:\, \varphi_1^{(n)}(x^n)=s} p_{X^n|S}(x^n|s) \sum_{\substack{y^n:\, \psi^{(n)}(s,\varphi_2^{(n)}(y^n))=y^n, \\ p_{Y^n|S}(y^n|s) < (1/M_2)\mathrm{e}^{-n\eta}}} p_{Y^n|X^n}(y^n|x^n) \\ &= \sum_{s\in\mathcal{M}_1} p_S(s) \sum_{\substack{y^n:\, \psi^{(n)}(s,\varphi_2^{(n)}(y^n))=y^n, \\ p_{Y^n|S}(y^n|s) < (1/M_2)\mathrm{e}^{-n\eta}}} p_{Y^n|S}(y^n|s) \\ &\le \sum_{s\in\mathcal{M}_1} p_S(s)\,\frac{1}{M_2}\mathrm{e}^{-n\eta}\, \bigl|\bigl\{y^n : \psi^{(n)}\bigl(s,\varphi_2^{(n)}(y^n)\bigr)=y^n\bigr\}\bigr| \stackrel{\rm (a)}{\le} \sum_{s\in\mathcal{M}_1} p_S(s)\,\frac{1}{M_2}\mathrm{e}^{-n\eta}\, M_2 = \mathrm{e}^{-n\eta}. \end{aligned}$$
Step (a) follows from the fact that, for each $s$, the number of $y^n$ that are correctly decoded does not exceed $M_2$. □
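All four bounds in Lemma A7 are instances of the same change-of-measure argument, which we record here in generic form; the measures $p$ and $q$ below are placeholders, not notation used elsewhere in the paper: for any two nonnegative measures $p$ and $q$ with $q$ of total mass at most one,
$$p\Bigl(\bigl\{\omega : p(\omega) \le \mathrm{e}^{-n\eta}\, q(\omega)\bigr\}\Bigr) = \sum_{\omega:\, p(\omega)\le \mathrm{e}^{-n\eta}q(\omega)} p(\omega) \le \mathrm{e}^{-n\eta}\sum_{\omega} q(\omega) \le \mathrm{e}^{-n\eta}.$$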
Proof of Lemma 1:
By definition, we have
$$\begin{aligned} &p_{SX^nY^n}\bigl(\mathcal{A}_n\cap\mathcal{B}_n\cap\mathcal{C}_n\cap\mathcal{D}_n\bigr) \\ &\quad= p_{SX^nY^n}\biggl\{ \frac{1}{n}\log\frac{p_{SX^nY^n}(S,X^n,Y^n)}{\hat{q}_{SX^nY^n}(S,X^n,Y^n)} \ge -\eta,\ 0 \ge \frac{1}{n}\log\frac{Q_{X^n}(X^n)}{p_{X^n}(X^n)} - \eta, \\ &\qquad\quad \frac{1}{n}\log M_1 \ge \frac{1}{n}\log\frac{\tilde{Q}_{X^n|S}(X^n|S)}{p_{X^n}(X^n)} - \eta,\ \frac{1}{n}\log M_2 \ge \frac{1}{n}\log\frac{1}{p_{Y^n|S}(Y^n|S)} - \eta \biggr\}.$$ \end{aligned}$$
Then, for any $(\varphi_1^{(n)},\varphi_2^{(n)},\psi^{(n)})$ satisfying $(1/n)\log\|\varphi_i^{(n)}\| \le R_i$, $i=1,2$, we have
$$\begin{aligned} &p_{SX^nY^n}\bigl(\mathcal{A}_n\cap\mathcal{B}_n\cap\mathcal{C}_n\cap\mathcal{D}_n\bigr) \\ &\quad\le p_{SX^nY^n}\biggl\{ \frac{1}{n}\log\frac{p_{SX^nY^n}(S,X^n,Y^n)}{\hat{q}_{SX^nY^n}(S,X^n,Y^n)} \ge -\eta,\ 0 \ge \frac{1}{n}\log\frac{Q_{X^n}(X^n)}{p_{X^n}(X^n)} - \eta, \\ &\qquad\quad R_1 \ge \frac{1}{n}\log\frac{\tilde{Q}_{X^n|S}(X^n|S)}{p_{X^n}(X^n)} - \eta,\ R_2 \ge \frac{1}{n}\log\frac{1}{p_{Y^n|S}(Y^n|S)} - \eta \biggr\}. \end{aligned}$$
Hence, it suffices to show
$$P_{\rm c}^{(n)}\bigl(\varphi_1^{(n)},\varphi_2^{(n)},\psi^{(n)}\bigr) \le p_{SX^nY^n}\bigl(\mathcal{A}_n\cap\mathcal{B}_n\cap\mathcal{C}_n\cap\mathcal{D}_n\bigr) + 4\mathrm{e}^{-n\eta}$$
to prove Lemma 1. By definition, we have $P_{\rm c}^{(n)}(\varphi_1^{(n)},\varphi_2^{(n)},\psi^{(n)}) = p_{SX^nY^n}(\mathcal{E}_n)$. Then, we have the following:
$$\begin{aligned} P_{\rm c}^{(n)}\bigl(\varphi_1^{(n)},\varphi_2^{(n)},\psi^{(n)}\bigr) = p_{SX^nY^n}(\mathcal{E}_n) &= p_{SX^nY^n}\bigl(\mathcal{A}_n\cap\mathcal{B}_n\cap\mathcal{C}_n\cap\mathcal{D}_n\cap\mathcal{E}_n\bigr) + p_{SX^nY^n}\bigl((\mathcal{A}_n\cap\mathcal{B}_n\cap\mathcal{C}_n\cap\mathcal{D}_n)^{\rm c}\cap\mathcal{E}_n\bigr) \\ &\le p_{SX^nY^n}\bigl(\mathcal{A}_n\cap\mathcal{B}_n\cap\mathcal{C}_n\cap\mathcal{D}_n\bigr) + p_{SX^nY^n}(\mathcal{A}_n^{\rm c}) + p_{SX^nY^n}(\mathcal{B}_n^{\rm c}) + p_{SX^nY^n}(\mathcal{C}_n^{\rm c}) + p_{SX^nY^n}(\mathcal{D}_n^{\rm c}\cap\mathcal{E}_n) \\ &\stackrel{\rm (a)}{\le} p_{SX^nY^n}\bigl(\mathcal{A}_n\cap\mathcal{B}_n\cap\mathcal{C}_n\cap\mathcal{D}_n\bigr) + 4\mathrm{e}^{-n\eta}. \end{aligned}$$
Step (a) follows from Lemma A7. □

Appendix G. Proof of Lemma 3

In this appendix, we prove Lemma 3.
Proof of Lemma 3:
We first prove the Markov chain $SX^{t-1} \leftrightarrow X_t \leftrightarrow Y_t$ in (18) in Lemma 3. We have the following chain of inequalities:
$$\begin{aligned} I(Y_t; SX^{t-1}|X_t) &= H(Y_t|X_t) - H(Y_t|SX^{t-1}X_t) \le H(Y_t|X_t) - H(Y_t|SX^n) \\ &\stackrel{\rm (a)}{=} H(Y_t|X_t) - H(Y_t|X^n) \stackrel{\rm (b)}{=} H(Y_t|X_t) - H(Y_t|X_t) = 0. \end{aligned}$$
Step (a) follows from the fact that $S = \varphi_1^{(n)}(X^n)$ is a function of $X^n$. Step (b) follows from the memoryless property of the information source $\{(X_t,Y_t)\}_{t=1}^{\infty}$. Next, we prove the Markov chain $Y^{t-1} \leftrightarrow SX^{t-1} \leftrightarrow (X_t,Y_t)$ in (19) in Lemma 3. We have the following chain of inequalities:
$$\begin{aligned} I(X_tY_t; Y^{t-1}|SX^{t-1}) &= H(Y^{t-1}|SX^{t-1}) - H(Y^{t-1}|SX^{t-1}X_tY_t) \le H(Y^{t-1}|X^{t-1}) - H(Y^{t-1}|SX^nY_t) \\ &\stackrel{\rm (a)}{=} H(Y^{t-1}|X^{t-1}) - H(Y^{t-1}|X^nY_t) \stackrel{\rm (b)}{=} H(Y^{t-1}|X^{t-1}) - H(Y^{t-1}|X^{t-1}Y_t) = 0. \end{aligned}$$
Step (a) follows from the fact that $S = \varphi_1^{(n)}(X^n)$ is a function of $X^n$. Step (b) follows from the memoryless property of the information source $\{(X_t,Y_t)\}_{t=1}^{\infty}$. □

Appendix H. Proof of Lemma 6

In this appendix, we prove Lemma 6.
Proof of Lemma 6.
By the definition of $p^{(\mu,\alpha)}_{SX^tY^t;F^t}(s,x^t,y^t)$, for $t=1,2,\ldots,n$, we have
$$p^{(\mu,\alpha)}_{SX^tY^t;F^t}(s,x^t,y^t) = C_t^{-1}\, p_{SX^tY^t}(s,x^t,y^t) \prod_{i=1}^{t} f^{(\mu,\alpha)}_{F_i}(x_i,y_i|u_i). \tag{A32}$$
Then, we have the following chain of equalities:
$$\begin{aligned} p^{(\mu,\alpha)}_{SX^tY^t;F^t}(s,x^t,y^t) &\stackrel{\rm (a)}{=} C_t^{-1}\, p_{SX^tY^t}(s,x^t,y^t) \prod_{i=1}^{t} f^{(\mu,\alpha)}_{F_i}(x_i,y_i|u_i) \\ &= C_t^{-1}\, p_{SX^{t-1}Y^{t-1}}(s,x^{t-1},y^{t-1}) \Biggl[\prod_{i=1}^{t-1} f^{(\mu,\alpha)}_{F_i}(x_i,y_i|u_i)\Biggr] p_{X_tY_t|SX^{t-1}Y^{t-1}}(x_t,y_t|s,x^{t-1},y^{t-1})\, f^{(\mu,\alpha)}_{F_t}(x_t,y_t|u_t) \\ &\stackrel{\rm (b)}{=} C_t^{-1} C_{t-1}\, p^{(\mu,\alpha)}_{SX^{t-1}Y^{t-1};F^{t-1}}(s,x^{t-1},y^{t-1})\, p_{X_tY_t|SX^{t-1}Y^{t-1}}(x_t,y_t|s,x^{t-1},y^{t-1})\, f^{(\mu,\alpha)}_{F_t}(x_t,y_t|u_t) \\ &= \bigl(\Phi_t^{(\mu,\alpha)}\bigr)^{-1} p^{(\mu,\alpha)}_{SX^{t-1}Y^{t-1};F^{t-1}}(s,x^{t-1},y^{t-1})\, p_{X_tY_t|SX^{t-1}Y^{t-1}}(x_t,y_t|s,x^{t-1},y^{t-1})\, f^{(\mu,\alpha)}_{F_t}(x_t,y_t|u_t). \end{aligned} \tag{A33}$$
Steps (a) and (b) follow from (A32). From (A33), we have
$$\begin{aligned} &\Phi_t^{(\mu,\alpha)}\, p^{(\mu,\alpha)}_{SX^tY^t;F^t}(s,x^t,y^t) &&\text{(A34)} \\ &\quad= p^{(\mu,\alpha)}_{SX^{t-1}Y^{t-1};F^{t-1}}(s,x^{t-1},y^{t-1})\, p_{X_tY_t|SX^{t-1}Y^{t-1}}(x_t,y_t|s,x^{t-1},y^{t-1})\, f^{(\mu,\alpha)}_{F_t}(x_t,y_t|u_t). &&\text{(A35)} \end{aligned}$$
Taking summations of (A34) and (A35) with respect to $s,x^t,y^t$ and noting that $p^{(\mu,\alpha)}_{SX^tY^t;F^t}$ is a probability distribution, we obtain
$$\Phi_t^{(\mu,\alpha)} = \sum_{s,x^t,y^t} p^{(\mu,\alpha)}_{SX^{t-1}Y^{t-1};F^{t-1}}(s,x^{t-1},y^{t-1})\, p_{X_tY_t|SX^{t-1}Y^{t-1}}(x_t,y_t|s,x^{t-1},y^{t-1})\, f^{(\mu,\alpha)}_{F_t}(x_t,y_t|u_t),$$
completing the proof. □
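We add one remark, which follows directly from step (b) in the chain above: since $\Phi_t^{(\mu,\alpha)} = C_t\, C_{t-1}^{-1}$, the normalization constants in (A32) telescope as
$$C_t = \prod_{i=1}^{t} \Phi_i^{(\mu,\alpha)}, \qquad C_0 := 1,$$
so the constant $C_n$ appearing in (A32) for $t=n$ factorizes into the per-letter quantities $\Phi_t^{(\mu,\alpha)}$ computed above.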

References

  1. Ahlswede, R.F.; Körner, J. Source coding with side information and a converse for degraded broadcast channels. IEEE Trans. Inf. Theory 1975, 21, 629–637.
  2. Wyner, A.D. On source coding with side information at the decoder. IEEE Trans. Inf. Theory 1975, 21, 294–300.
  3. Csiszár, I.; Longo, G. On the error exponent for source coding and for testing simple statistical hypotheses. Studia Sci. Math. Hungar. 1971, 6, 181–191.
  4. Slepian, D.; Wolf, J.K. Noiseless coding of correlated information sources. IEEE Trans. Inf. Theory 1973, 19, 471–480.
  5. Oohama, Y.; Han, T.S. Universal coding for the Slepian–Wolf data compression system and the strong converse theorem. IEEE Trans. Inf. Theory 1994, 40, 1908–1919.
  6. Ahlswede, R.; Gács, P.; Körner, J. Bounds on conditional probabilities with applications in multi-user communication. Probab. Theory Relat. Fields 1976, 34, 157–177.
  7. Gu, W.; Effros, M. A strong converse for a collection of network source coding problems. In Proceedings of the IEEE International Symposium on Information Theory, Seoul, Korea, 28 June–3 July 2009; pp. 2316–2320.
  8. Oohama, Y. Strong converse exponent for degraded broadcast channels at rates outside the capacity region. In Proceedings of the 2015 IEEE International Symposium on Information Theory, Hong Kong, China, 14–19 June 2015; pp. 939–943.
  9. Oohama, Y. Strong converse theorems for degraded broadcast channels with feedback. In Proceedings of the 2015 IEEE International Symposium on Information Theory, Hong Kong, China, 14–19 June 2015; pp. 2510–2514.
  10. Oohama, Y. Exponent function for asymmetric broadcast channels at rates outside the capacity region. In Proceedings of the 2016 IEEE International Symposium on Information Theory and Its Applications, Monterey, CA, USA, 30 October–2 November 2016; pp. 568–572.
  11. Oohama, Y. New strong converse for asymmetric broadcast channels. Available online: https://arxiv.org/pdf/1604.02901.pdf (accessed on 31 May 2019).
  12. Oohama, Y. Exponential strong converse for source coding with side information at the decoder. Entropy 2018, 20, 352.
  13. Watanabe, S. A converse bound on Wyner–Ahlswede–Körner network via Gray–Wyner network. In Proceedings of the 2017 IEEE Information Theory Workshop (ITW), Kaohsiung, Taiwan, 6–10 November 2017; pp. 81–85.
  14. Liu, J.; van Handel, R.; Verdú, S. Beyond the blowing-up lemma: Sharp converses via reverse hypercontractivity. In Proceedings of the 2017 IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, 25–30 June 2017; pp. 943–947.
  15. Watanabe, S. Second-order region for Gray–Wyner network. IEEE Trans. Inf. Theory 2017, 63, 1006–1018.
  16. Liu, J. Dispersion bound for the Wyner–Ahlswede–Körner network via reverse hypercontractivity on types. In Proceedings of the 2018 IEEE International Symposium on Information Theory (ISIT), Vail, CO, USA, 17–22 June 2018; pp. 1854–1858.
  17. Watanabe, S.; Oohama, Y. Privacy amplification theorem for bounded storage eavesdropper. In Proceedings of the 2012 IEEE Information Theory Workshop (ITW), Lausanne, Switzerland, 3–7 September 2012; pp. 177–181.
  18. Oohama, Y.; Santoso, B. Information theoretic security for side-channel attacks to the Shannon cipher system. Available online: https://arxiv.org/pdf/1801.02563v5.pdf (accessed on 31 May 2019).
  19. Santoso, B.; Oohama, Y. Information theoretic security for Shannon cipher system under side-channel attacks. Entropy 2019, 21, 469.
  20. Oohama, Y. Exponent function for one helper source coding problem at rates outside the rate region. arXiv 2015, arXiv:1504.05891.
  21. Csiszár, I.; Körner, J. Information Theory: Coding Theorems for Discrete Memoryless Systems; Cambridge University Press: London, UK, 1981.
  22. Han, T.S. Information-Spectrum Methods in Information Theory; Springer Nature Switzerland AG: Basel, Switzerland, 2002.
Figure 1. Source encoding with or without side information at the decoder.
Figure 2. One helper source coding system [20].
Figure 3. A typical shape of $\mathcal{R}(p_{XY})$.
Figure 4. One helper source coding system investigated by Wyner.
